Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Alfred Kobsa University of California, Irvine, CA, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen TU Dortmund University, Germany Madhu Sudan Microsoft Research, Cambridge, MA, USA Demetri Terzopoulos University of California, Los Angeles, CA, USA Doug Tygar University of California, Berkeley, CA, USA Gerhard Weikum Max Planck Institute for Informatics, Saarbruecken, Germany
6434
Isao Echizen Noboru Kunihiro Ryoichi Sasaki (Eds.)
Advances in Information and Computer Security 5th International Workshop on Security, IWSEC 2010 Kobe, Japan, November 22-24, 2010 Proceedings
Volume Editors Isao Echizen National Institute of Informatics 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan E-mail:
[email protected] Noboru Kunihiro University of Tokyo, School of Frontier Sciences Department of Complexity Science and Engineering 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba 277-8561, Japan E-mail:
[email protected] Ryoichi Sasaki Tokyo Denki University, School of Science and Technology for Future Life Department of Information Systems and Multi Media 2-2 Kanda-Nishiki-cho, Chiyoda-ku, Tokyo 101-8457, Japan E-mail:
[email protected] Library of Congress Control Number: 2010937763 CR Subject Classification (1998): E.3, G.2.1, D.4.6, K.6.5, K.4.4, F.2.1, C.2 LNCS Sublibrary: SL 4 – Security and Cryptology ISSN ISBN-10 ISBN-13
0302-9743 3-642-16824-8 Springer Berlin Heidelberg New York 978-3-642-16824-6 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. springer.com © Springer-Verlag Berlin Heidelberg 2010 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper 06/3180
Preface
The Fifth International Workshop on Security (IWSEC 2010) was held at Kobe International Conference Center, Kobe, Japan, November 22–24, 2010. The workshop was co-organized by CSEC, a special interest group of the Information Processing Society of Japan (IPSJ) concerned with computer security, and ISEC, a technical group of the Institute of Electronics, Information and Communication Engineers (IEICE) concerned with information security. The excellent Local Organizing Committee was led by the IWSEC 2010 General Co-chairs, Hiroaki Kikuchi and Toru Fujiwara. This year IWSEC 2010 had three tracks: the Foundations of Security (Track I), Security in Networks and Ubiquitous Computing Systems (Track II), and Security in Real Life Applications (Track III); the review and selection processes for these tracks were independent of each other. We received 75 paper submissions: 44 for Track I, 20 for Track II, and 11 for Track III. We would like to thank all the authors who submitted papers. Each paper was reviewed by at least three reviewers. In addition to the Program Committee members, many external reviewers joined the review process from their particular areas of expertise. We were fortunate to have this energetic team of experts, and are grateful to all of them for their hard work. This hard work included very active discussions; the discussion phase was almost as long as the initial individual reviewing. The reviews and discussions were supported by a very nice Web-based system, iChair. We would like to thank its developers. Following the review phases, 22 papers were accepted for publication in this volume of Advances in Information and Computer Security: 13 for Track I, 6 for Track II, and 3 for Track III. In addition to the contributed papers, the workshop featured two invited talks given by eminent researchers, Jaideep S. 
Vaidya (Rutgers University) and Rainer Böhme (Westfälische Wilhelms-Universität Münster, Germany). We deeply appreciate their contributions. Many people contributed to the success of IWSEC 2010. We wish to express our deepest appreciation for their contributions to information and computer security.
November 2010
Isao Echizen Noboru Kunihiro Ryoichi Sasaki
Organization
Co-organized by CSEC (Special Interest Group on Computer Security of the Information Processing Society of Japan) and ISEC (Technical Group on Information Security, Engineering Sciences Society, of the Institute of Electronics, Information and Communication Engineers, Japan).
General Co-chairs
Hiroaki Kikuchi (Tokai University, Japan)
Toru Fujiwara (Osaka University, Japan)
Advisory Committee
Hideki Imai (Chuo University, Japan)
Kwangjo Kim (Korea Advanced Institute of Science and Technology, South Korea)
Koji Nakao (National Institute of Information and Communications Technology, Japan)
Günter Müller (University of Freiburg, Germany)
Yuko Murayama (Iwate Prefectural University, Japan)
Eiji Okamoto (University of Tsukuba, Japan)
C. Pandu Rangan (Indian Institute of Technology, India)
Program Committee Co-chairs
Isao Echizen (National Institute of Informatics, Japan)
Noboru Kunihiro (The University of Tokyo, Japan)
Ryoichi Sasaki (Tokyo Denki University, Japan)
Local Organizing Committee
Local Organizing Committee Co-chairs
Koutarou Suzuki (NTT Corp., Japan) Maki Yoshida (Osaka University, Japan) Hiroyuki Inaba (Kyoto Institute of Technology, Japan) Toshihiro Ohigashi (Hiroshima University, Japan)
Award Chair: Mitsuru Tada (Chiba University, Japan)
Finance and Registration Co-chairs: Hisao Sakazaki (Hitachi Ltd., Japan), Shinichiro Matsuo (National Institute of Information and Communications Technology, Japan)
Liaison Co-chairs: Tetsutaro Uehara (Kyoto University, Japan), Hiroshi Sasaki (NEC Corporation, Japan)
Publicity Co-chairs: Tetsuya Izu (Fujitsu Laboratories Ltd., Japan), Koji Nuida (National Institute of Advanced Industrial Science and Technology, Japan)
System Co-chairs: Yasuharu Katsuno (IBM Research—Tokyo, Japan), Hiroki Takakura (Nagoya University, Japan)
Publication Co-chairs: Toru Nakanishi (Okayama University, Japan), Shoichi Hirose (Fukui University, Japan)
Program Committee

Track I: Foundations of Security (Track Chair: Noboru Kunihiro (The University of Tokyo, Japan))
Zhenfu Cao (Shanghai Jiao Tong University, China)
Eiichiro Fujisaki (NTT, Japan)
Tetsu Iwata (Nagoya University, Japan)
Aggelos Kiayias (University of Athens, Greece)
Alfred Menezes (University of Waterloo, Canada)
Phong Nguyen (INRIA and ENS, France)
Kazuo Ohta (The University of Electro-Communications, Japan)
Raphael Phan (Loughborough University, UK)
Bart Preneel (Katholieke Universiteit Leuven, Belgium)
Christian Rechberger (Katholieke Universiteit Leuven, Belgium)
Palash Sarkar (Indian Statistical Institute, India)
Willy Susilo (University of Wollongong, Australia)
Tsuyoshi Takagi (Future University of Hakodate, Japan)
Routo Terada (University of Sao Paulo, Brazil)
Sung-Ming Yen (National Central University, Taiwan)
Yuliang Zheng (University of North Carolina, USA)
Track II: Security in Networks and Ubiquitous Computing Systems (Track Chair: Isao Echizen (National Institute of Informatics, Japan))
Liqun Chen (HP Laboratories, UK)
Bart De Decker (Katholieke Universiteit Leuven, Belgium)
William Enck (Pennsylvania State University, USA)
Dieter Gollmann (Hamburg University of Technology, Germany)
Yoshiaki Hori (Kyushu University, Japan)
Angelos D. Keromytis (Columbia University, USA)
Seungjoo Kim (Sungkyunkwan University, South Korea)
Kwok-Yan Lam (Tsinghua University, China)
Joseph Liu (Institute for Infocomm Research, Singapore)
Javier Lopez (University of Malaga, Spain)
Kyung-Hyune Rhee (Pukyong National University, South Korea)
Ahmad-Reza Sadeghi (Ruhr-Universität Bochum, Germany)
Toshihiro Yamauchi (Okayama University, Japan)
Keisuke Takemori (KDDI Corporation, Japan)
Sven Wohlgemuth (National Institute of Informatics, Japan)
Hiroshi Yoshiura (University of Electro-Communications, Japan)
Alf Zugenmaier (DOCOMO Euro-Labs, Germany)
Track III: Security in Real Life Applications (Track Chair: Ryoichi Sasaki (Tokyo Denki University, Japan))
Rafael Accorsi (University of Freiburg, Germany)
Claudio Ardagna (Università degli Studi di Milano, Italy)
Kevin Butler (Pennsylvania State University, USA)
Pau-Chen Cheng (IBM Thomas J. Watson Research Center, USA)
Steven Furnell (University of Plymouth, UK)
Jongsung Kim (Kyungnam University, South Korea)
Tetsutaro Kobayashi (NTT, Japan)
Jigang Liu (Metropolitan State University, USA)
Masakatsu Nishigaki (Shizuoka University, Japan)
Hartmut Pohl (University of Applied Sciences Bonn-Rhein-Sieg, Germany)
Kai Rannenberg (Goethe University Frankfurt, Germany)
Sujeet Shenoi (University of Tulsa, USA)
Reima Suomi (Turku School of Economics, Finland)
Mikiya Tani (NEC, Japan)
Ryuya Uda (Tokyo University of Technology, Japan)
Sabrina De Capitani di Vimercati (University of Milan, Italy)
Guilin Wang (University of Birmingham, UK)
External Reviewers Mansoor Alicherry, Man Ho Au, Jean-Philippe Aumasson, Sanjit Chatterjee, Kuo-Zhe Chiou, Kim-Kwang Raymond Choo, Sherman Chow, M. Prem Laxman Das, Tsukasa Endo, Jia Fan, Jun Furukawa, Benedikt Gierlichs, Goichiro Hanaoka, Takuya Hayashi, Matt Henricksen, Jens Hermans, Mitsugu Iwamoto, Yuto Kawahara, Yutaka Kawai, Vasileios Kemerlis, Dmitry Khovratovich, Hyung Chan Kim, Yuichi Komano, Fagen Li, Yang Li, Wei-Chih Lien, Hsi-Chung Lin, Hans Loehr, Di Ma, Kazuya Matsuda, Daniele Micciancio, Marine Minier, Ryo Nishimaki, Natsuko Noda, Koji Nuida, Satoshi Obana, Vasilis Pappas, Michalis Polychronakis, George Portokalidis, Daniel Ribeiro, Yusuke Sakai, Kazuo Sakiyama, Malek Ben Salem, Subhabrata Samajder, Bagus Santoso, Santanu
Sarkar, Yu Sasaki, Thomas Schneider, Gautham Sekar, Wook Shin, Martijn Stam, Jaechul Sung, Koutarou Suzuki, Tomoyasu Suzaki, Junko Takahashi, Isamu Teranishi, Jeremie Tharaud, Pairat Thorncharoensri, Carmela Troncoso, Jheng-Hong Tu, Damien Vergnaud, Chi-Dian Wu, Hongjun Wu, Shota Yamada, Go Yamamoto, Tsz Hon Yuen, Masayuki Yoshino, Fangguo Zhang, Mingwu Zhang.
Table of Contents

Invited Talks

Automating Security Configuration and Administration: An Access Control Perspective . . . . . . . . . 1
Jaideep Vaidya

Security Metrics and Security Investment Models . . . . . . . . . 10
Rainer Böhme

Encryption

Publishing Upper Half of RSA Decryption Exponent . . . . . . . . . 25
Subhamoy Maitra, Santanu Sarkar, and Sourav Sen Gupta

PA1 and IND-CCA2 Do Not Guarantee PA2: Brief Examples . . . . . . . . . 40
Yamin Liu, Bao Li, Xianhui Lu, and Yazhe Zhang

A Generic Method for Reducing Ciphertext Length of Reproducible KEMs in the RO Model . . . . . . . . . 55
Yusuke Sakai, Goichiro Hanaoka, Kaoru Kurosawa, and Kazuo Ohta

An Improvement of Key Generation Algorithm for Gentry's Homomorphic Encryption Scheme . . . . . . . . . 70
Naoki Ogura, Go Yamamoto, Tetsutaro Kobayashi, and Shigenori Uchiyama

Data and Web Security

Practical Universal Random Sampling . . . . . . . . . 84
Marek Klonowski, Michal Przykucki, Tomasz Strumiński, and Malgorzata Sulkowska

Horizontal Fragmentation for Data Outsourcing with Formula-Based Confidentiality Constraints . . . . . . . . . 101
Lena Wiese

Experimental Assessment of Probabilistic Fingerprinting Codes over AWGN Channel . . . . . . . . . 117
Minoru Kuribayashi

Validating Security Policy Conformance with WS-Security Requirements . . . . . . . . . 133
Fumiko Satoh and Naohiko Uramoto

Protocols

Efficient Secure Auction Protocols Based on the Boneh-Goh-Nissim Encryption . . . . . . . . . 149
Takuho Mitsunaga, Yoshifumi Manabe, and Tatsuaki Okamoto

Hierarchical ID-Based Authenticated Key Exchange Resilient to Ephemeral Key Leakage . . . . . . . . . 164
Atsushi Fujioka, Koutarou Suzuki, and Kazuki Yoneyama

Group Signature Implies PKE with Non-interactive Opening and Threshold PKE . . . . . . . . . 181
Keita Emura, Goichiro Hanaoka, and Yusuke Sakai

Network Security

A Generic Binary Analysis Method for Malware . . . . . . . . . 199
Tomonori Izumida, Kokichi Futatsugi, and Akira Mori

A-HIP: A Solution Offering Secure and Anonymous Communications in MANETs . . . . . . . . . 217
Carlos T. Calafate, Javier Campos, Marga Nácher, Pietro Manzoni, and Juan-Carlos Cano

Securing MANET Multicast Using DIPLOMA . . . . . . . . . 232
Mansoor Alicherry and Angelos D. Keromytis

Block Cipher

Preimage Attacks against Variants of Very Smooth Hash . . . . . . . . . 251
Kimmo Halunen and Juha Röning

Matrix Representation of Conditions for the Collision Attack of SHA-1 and Its Application to the Message Modification . . . . . . . . . 267
Jun Yajima and Takeshi Shimoyama

Mutual Information Analysis under the View of Higher-Order Statistics . . . . . . . . . 285
Thanh-Ha Le and Mael Berthier

Known-Key Attacks on Rijndael with Large Blocks and Strengthening ShiftRow Parameter . . . . . . . . . 301
Yu Sasaki

Implementation and Real Life Security

Differential Addition in Generalized Edwards Coordinates . . . . . . . . . 316
Benjamin Justus and Daniel Loebenberger

Efficient Implementation of Pairing on BREW Mobile Phones . . . . . . . . . 326
Tadashi Iyama, Shinsaku Kiyomoto, Kazuhide Fukushima, Toshiaki Tanaka, and Tsuyoshi Takagi

Introducing Mitigation Use Cases to Enhance the Scope of Test Cases . . . . . . . . . 337
Lasse Harjumaa and Ilkka Tervonen

Optimal Adversary Behavior for the Serial Model of Financial Attack Trees . . . . . . . . . 354
Margus Niitsoo

Author Index . . . . . . . . . 371
Automating Security Configuration and Administration: An Access Control Perspective Jaideep Vaidya Rutgers University, Newark, NJ 07102, USA
[email protected] http://cimic.rutgers.edu/~jsvaidya
Abstract. Access control facilitates controlled sharing and protection of resources in an enterprise. When correctly implemented and administered, it is effective in providing security. However, in many cases, there is a belief on the part of consumers that security requirements can be met by simply acquiring and installing a product. Unfortunately, since the security requirements of each organization are different, there is no single tool (or even any meaningful set of tools) that can be readily employed. Independent of the specific policy adopted, such as discretionary access control or role-based access control, most organizations today assign permissions to their entities on a more or less ad-hoc basis. Permissions assigned to entities are poorly documented and not understood in their entirety. System administrators' lack of a comprehensive view of an entity's total permissions across all systems results in an ever-growing set of permissions, leading to misconfigurations such as under-privileges, violations of the least-privilege requirement (i.e., over-authorization), and expensive security administration. In this talk, we examine the problem of automated security configuration and administration. This is a tough area of research, since many of the underlying problems are NP-hard and it is difficult to find solutions that achieve reasonable performance without trading off accuracy. To address this, usable security mechanisms must be developed by employing novel methodologies and tools from other areas of research that have a strong theoretical basis. We discuss some of the existing work that addresses this and lay out future problems and challenges.
Access control is one of the most essential components of computer security. Access control systems, in their various forms, facilitate the controlled sharing and protection of resources in an enterprise. To do this, an access control system enforces a specific access control policy. Thus, there are two basic components to access control – a policy specification mechanism and an enforcement mechanism. Today, there exist a variety of formal models to meet the wide needs of organizations in specifying access control policies. These include, but are not limited to, Discretionary Access Control (DAC), Mandatory Access Control (MAC), and Role-Based Access Control (RBAC). Under the DAC policy, users at their discretion can specify to the system who can access the resources they own [1]. Under MAC, both users and resources have fixed security attributes (labels) assigned administratively [2]. The label associated with a user (clearance) determines whether he/she is allowed to access a resource with a certain label (classification). For most organizations, MAC is perceived as too stringent, and, at the same time, DAC is perceived as inadequate. RBAC has been proposed to deal with this. RBAC is policy-neutral and can capture features from both DAC and MAC, specifically the flexibility of DAC and, at the same time, the stringency of MAC. In the simplest case of RBAC, roles represent organizational agents that perform certain job functions within the organization [3,4]. The permissions associated with each role are administratively defined based on these job functions. Users, in turn, are assigned appropriate roles based on their qualifications. While RBAC concepts have been in practice for more than four decades, the model was formalized relatively recently (in the mid 90s) [3], and a standardized model emerged in the early 2000s [4]. Independent of whether DAC or RBAC is employed, most organizations today perform permission assignment on a more or less ad-hoc basis. This is true even in the case of network security, as in the case of firewalls. In the area of security, unfortunately, the availability of technology drives the security implementation. Moreover, in many cases, there is a belief on the part of consumers that security requirements can be met by simply acquiring and installing a product. However, the security requirements of each organization are different, and there is no single tool (or even any meaningful set of tools) that can be readily employed. Another primary reason for ad-hoc access control assignments is that, often, system administrators must also assume the role of security administrators. 

I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 1–9, 2010. © Springer-Verlag Berlin Heidelberg 2010
Even if security administrators are present, they probably do not have a complete understanding of the organizational processes, which are distributed among the different departments/units, and it is hard to gain such an understanding. Permissions assigned to users are poorly documented and not understood in their entirety. This lack of a comprehensive view of a user's total permissions across all systems results in an ever-growing set of permissions. Since keeping track of them is so daunting and resource-intensive, organizations do not even attempt to do so. This leads to undesirable effects, such as misconfigurations of access control policies resulting in under-privileges, violations of the least-privilege requirement (i.e., over-authorization), and expensive security administration. Since accurate configuration and administration of access control policy is time consuming and labor intensive, automated tools are essential to aid in this process. Such tools can help to minimize new errors and to identify and correct existing errors. This is a challenging area of research, since many of the underlying problems are NP-hard and it is difficult to find solutions that achieve reasonable performance without trading off accuracy. To address this, usable security mechanisms must be developed by employing novel methodologies and tools from other areas of research that have a strong theoretical basis. We discuss some of the existing work that addresses this and lay out future problems and challenges. Specifically, we look at the issues of Role Engineering and Role Mining, Firewall Configuration and Administration, and Misconfiguration Detection.
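The RBAC structure described above, with users assigned to roles and permissions assigned to roles, can be sketched as follows. This is only an illustrative sketch; the users, roles, and permissions are hypothetical examples, not part of any cited system:

```python
# Minimal RBAC sketch: users are assigned roles, and roles are assigned
# permissions.  All names and assignments are hypothetical examples.

user_roles = {
    "alice": {"clerk", "auditor"},
    "bob": {"clerk"},
}

role_permissions = {
    "clerk": {("accounts_db", "read")},
    "auditor": {("accounts_db", "read"), ("audit_log", "read")},
}

def authorized(user, obj, action):
    """A request is permitted if any of the user's roles grants the
    (object, action) permission."""
    return any(
        (obj, action) in role_permissions.get(role, set())
        for role in user_roles.get(user, set())
    )

print(authorized("alice", "audit_log", "read"))  # True, via the auditor role
print(authorized("bob", "audit_log", "read"))    # False, clerk does not grant it
```

The indirection through roles is what makes administration tractable: revoking the "auditor" role from a user removes all of its permissions in one step.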
1  Role Engineering and Role Mining
Today, role-based access control is well accepted as the standard best practice for access control within applications and organizations. Due to its flexibility, such as ease of administration and intuitiveness, RBAC has been successfully employed at different levels of computer security. It is now part of many commercially available systems, including operating systems, database management systems, and workflow management systems, as well as application software. As a result of its commercial success, it has become a standard way of implementing access control in many of today's organizations. Since the concept of a "role" is a commonly understood notion, RBAC has been easily adopted by organizations. A recent study by IBM [5] reports that RBAC creates a valid return on investment. Despite this, problems exist in administering such systems. Deploying RBAC requires one to first identify an accurate and complete set of roles, and then assign users to roles and permissions to roles. This process, known as role engineering [6], has been identified as one of the costliest components in realizing RBAC [7]. Coyne [8] notes that when roles are well engineered (which means that permissions are assigned to roles to provide exactly the access required by a holder of the role), the security principle of least privilege will be met. There are two basic approaches towards role engineering: top-down and bottom-up. Under the top-down approach [6], roles are defined by carefully analyzing and decomposing business processes into smaller units in a functionally independent manner. These functional units are then associated with permissions on information systems. In other words, this approach begins with defining a particular job function and then creating a role for this job function by associating the needed permissions. 
Often, this is a cooperative process where various authorities from different disciplines understand the semantics of one another's business processes and then incorporate them in the form of roles. Since there are often dozens of business processes, tens of thousands of users, and millions of authorizations, this is a rather difficult task. Several top-down approaches have been proposed in the literature [9,10,11,12,13,14,15,16] that mitigate some of the problems and use different solution techniques, with case studies [17] demonstrating some success (though at a high cost). However, due to the sheer scale of the problem, deploying RBAC is still considered a highly expensive, time-consuming and daunting task, and relying solely on a top-down approach is in most cases not viable. In contrast, the bottom-up approach utilizes the existing permission assignments to formulate roles. Starting from the existing permissions (i.e., prior to RBAC), the bottom-up approach aggregates these into roles. It may also be advantageous to use a mixture of the top-down and the bottom-up approaches to conduct role engineering. While the top-down model is likely to ignore the existing permissions, a bottom-up model may not consider the business functions of an organization [16]. However, this role discovery process, often known as role mining, has the advantage of automating the role engineering process. Kuhlmann, Shohat, and Schimpf [18] present a bottom-up approach, which employs a clustering technique similar to k-means clustering. As such, it is first necessary to define the number of clusters. In [19], Schlegelmilch and
Steffens propose an agglomerative clustering-based approach to role mining (called ORCA), which discovers roles by merging permissions appropriately. More recently, Vaidya et al. [20] propose an approach based on subset enumeration, called RoleMiner. An inherent problem with all of the above bottom-up approaches is that they have no formal notion of a good role; they simply present heuristic ways of finding a set of candidate roles. The essential question is how to devise a complete and correct set of roles – this depends on how one defines goodness/interestingness (when is a role good/interesting?). Recently, Vaidya et al. [21] defined the role mining problem (RMP) as the problem of discovering the minimal set of roles that can still describe the existing user permissions. This provides a formal way of measuring how good a discovered set of roles is. In addition to the basic RMP, [21] also introduces two variations of the RMP, called the delta-approx RMP and the Minimal Noise RMP, that have pragmatic implications. A key benefit of this formulation is that it places role mining in the framework of matrix decomposition, which is applicable to many other domains, including text mining. Following this, several other objectives have been defined, such as minimizing the number of user-role and role-permission assignments [22,23,24]. Since all of these problems are NP-hard, several heuristic solutions have been proposed [25,26,27,28,29,30]. The problem has also been cast in a very flexible integer programming model [31] to allow easy statement of constraints and to enable the use of techniques from the operations research community. Another avenue of research has been to integrate probabilistic models for role mining [32]. While the top-down approach to configuring RBAC is expensive and tedious, the bottom-up approach can automate the process, but lacks semantics. 
Ideally speaking, one should attempt to perform a hybrid of these two approaches to eliminate their drawbacks. Role mining can be used as a tool, in conjunction with a top-down approach, to identify potential or candidate roles, which can then be examined to determine whether they are appropriate given existing functions and business processes. There has been recent interest in further work exploring this (for example, [33]), but significantly more work needs to be carried out.
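The basic RMP above can be viewed as a Boolean matrix decomposition: given a user-permission matrix, find a small set of roles whose unions reproduce each user's permissions exactly. The greedy heuristic below is only an illustrative sketch under that framing; it is not one of the cited algorithms, and it makes no claim of minimality (finding the minimal role set is NP-hard):

```python
def mine_roles(upa):
    """Greedy role-mining sketch.  `upa` maps each user to a frozenset of
    permissions (one row of the user-permission matrix).  Candidate roles
    are the distinct user permission sets; smaller candidates are tried
    first, and a role is kept for a user only if it fits within the user's
    permissions and covers something not yet covered."""
    candidates = sorted(set(upa.values()), key=len)
    roles, ua = [], {}
    for user, perms in upa.items():
        assigned, covered = [], frozenset()
        for role in candidates:
            if role <= perms and not role <= covered:
                assigned.append(role)
                covered |= role
            if covered == perms:
                break
        ua[user] = assigned
        for role in assigned:
            if role not in roles:
                roles.append(role)
    return roles, ua

# Hypothetical example: three users are described exactly by two roles.
upa = {
    "u1": frozenset({"p1", "p2"}),
    "u2": frozenset({"p1", "p2", "p3"}),
    "u3": frozenset({"p3"}),
}
roles, ua = mine_roles(upa)
print(len(roles))  # 2 roles: {p1, p2} and {p3}
```

Because each user's own permission set is itself a candidate, coverage always succeeds; the quality of the decomposition (how few roles are produced) is where the cited heuristics differ.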
2  Firewall Configuration and Administration
A firewall is a system or group of systems that enforces an access control policy between two or more networks. As such, firewalls are simply an access enforcement mechanism for the network, and serve as the first line of defense against network attacks. While there are many types of firewalls, packet filters are the most common. The main task of packet filters in security policies is to categorize packets based on a set of rules representing the filtering policy. The information used for filtering packets is usually contained in distinct fields in the IPv4 packet header, namely the transport protocol, source IP, source port, destination IP and destination port. Each filtering rule R is an array of field values. A packet p is said to match a rule R if each header-field of p matches the corresponding rule-field of R. If the rule matches, the associated action (permit or deny) is carried out. In firewalls, each
rule R is associated with an action to be performed if a packet matches the rule. These actions indicate whether to block ("deny") or forward ("permit") the packet to a particular interface. The order of the rules makes a big difference in terms of processing efficiency. In general, having "busier" rules earlier significantly speeds up processing. It is a known fact that the rule ordering optimization problem with rule dependency constraints is NP-complete. Therefore, heuristics have been proposed to enhance filtering performance. However, the order of rules also makes a huge difference in terms of security. Since firewalls process the first rule that matches a packet, inappropriate ordering of the rules can let malevolent packets through or block safe packets. Over time, the list of rules also tends to grow due to evolving network behavior. Some of the rules may even be redundant. In recent years, there has been work dealing with the problem of efficiency and security, to an extent. It should be possible to define what an optimal rule set is and to develop ways of discovering an equivalent optimal rule set. While existing work on detecting conflicts and generalizing firewall policies serves as a useful starting point, there are many open avenues for future research.
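The first-match semantics just described can be sketched as follows. The rules and addresses are hypothetical, and real packet filters match on address prefixes and port ranges rather than exact values:

```python
# First-match packet filter sketch.  A rule field of None is a wildcard.
# Rules and addresses here are hypothetical examples.

RULES = [
    # (proto, src_ip, src_port, dst_ip, dst_port, action)
    ("tcp", None, None, "10.0.0.5", 80,   "permit"),
    ("tcp", None, None, "10.0.0.5", None, "deny"),
    (None,  None, None, None,       None, "deny"),   # default deny
]

def filter_packet(packet, rules=RULES):
    """Return the action of the first rule whose non-wildcard fields all
    match the packet; rule order therefore determines the decision."""
    for *fields, action in rules:
        if all(f is None or f == p for f, p in zip(fields, packet)):
            return action
    return "deny"  # implicit default

print(filter_packet(("tcp", "1.2.3.4", 5555, "10.0.0.5", 80)))  # permit
print(filter_packet(("tcp", "1.2.3.4", 5555, "10.0.0.5", 22)))  # deny
```

Swapping the first two rules would deny all TCP traffic to 10.0.0.5, including port 80, which is exactly the ordering sensitivity the text describes.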
3  Misconfiguration Detection
Along with the configuration of static policies, it is also necessary to consider the dynamic activation and use of resources. Since logs of access requests are typically kept in any system, these logs can be leveraged to identify the rationale for decisions as well as misconfigurations. For example, a denial of a request to access a resource may mean one of two things: (i) access should be denied, since the user should not be given access (based on the user's job position, role, projects, credentials, etc.); or (ii) access should have been permitted (based on the user's position, etc.). An organization may be more concerned about denials of the second kind, which are perhaps the result of poor (or inadequate) privilege assignment by an administrator. Since such denials result in an interruption of service to users, proper handling of these events will result in improved service. This is especially the case when one considers the dynamic state of the organization and the overall load on the security administrator. In addition, this process helps revise privilege assignments to users and ensure least privilege. Much of the work on providing tools to aid in the validation of rules is focused on firewalls [34,35,36,37]. Recently, data mining-based approaches have been proposed to identify router and firewall misconfigurations and anomalies [38,39]. It should be similarly possible to identify misconfigurations causing denials in DAC- and RBAC-based systems. One possibility is to further assess the situation by capturing all allowed access requests as well as access denials in a log and analyzing them. The goal is to come up with a basis for assessing why a user is denied a request and for reexamining the access control permissions, which could help in revisiting the access control permission assignments. Under the principle of least privilege, every user of the system should operate using the least set of privileges necessary to complete the job [40]. Essentially,
6
J. Vaidya
this principle should be in place to limit the damage that can result from an accident or an error, and to reduce the probability of unintentional, unwanted, or improper use of privilege. Unlike the above problem of under-privilege, when enforcing least privilege it is important to identify (or discover) patterns of absence of certain entities rather than their presence (i.e., why a certain privilege is never exercised). It may be possible to do this either by re-examining the user privileges or by looking at the user access patterns. This problem of over- and under-privilege actually occurs even at the level of file systems. The real reason why access control is not applied at a fine-grained level here is the extent of effort required to configure it appropriately. With automated tools, this may become feasible as well. One problem with this approach is the assumption that the original access-permission data is available. For file systems, however, this may be infeasible; instead, it may be necessary to focus on file hierarchies to identify subdirectories and files that may be misconfigured. This is another avenue for further research. An interesting line of research comes from the field of anomaly detection. Currently, much of the work on anomaly detection in the area of security is limited to network intrusion detection (e.g., [41,42]) and to detecting anomalous behavior (e.g., [43]). These approaches examine historical data with the goal of characterizing it, whether access patterns of users or network accesses. However, anomalous permissions cannot be discovered from historical behavior alone. Thus, existing anomaly detection techniques developed for intrusion detection and for detecting anomalies in user behavior are not applicable.
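The log-driven direction can be sketched as follows. The flat user-to-privileges model and all names are invented for illustration; a real system would add time windows, usage thresholds, and the file-hierarchy structure discussed above:

```python
def unexercised_privileges(assigned, log):
    """Least-privilege check: list (user, privilege) assignments that were
    never exercised in the observed log, as candidates for revocation.
    `assigned` maps user -> set of privileges; `log` holds (user, privilege)
    pairs for exercised privileges."""
    used = set(log)
    return sorted((u, p) for u, privs in assigned.items()
                  for p in privs if (u, p) not in used)

# Alice holds "write" but never used it over the observation period.
candidates = unexercised_privileges(
    {"alice": {"read", "write"}, "bob": {"read"}},
    [("alice", "read"), ("bob", "read")],
)
```

Each flagged pair would still need review by an administrator before revocation, since an unexercised privilege may simply be rarely needed rather than wrongly assigned.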
A permission assignment is anomalous if the permission (an object–privilege pair) is given to a user who is not similar to the other users to whom the same permission is given. The similarity or dissimilarity of a user with respect to a set of other users can be determined based on the characteristics of the users. Alternatively, a permission assignment can be anomalous if the permission itself is dissimilar to those assigned to users who are characteristically similar. To discover both types of anomalous permission assignments, one must exploit the semantic knowledge associated with the users and objects. This includes the values of the different credential attributes possessed by the users and the concepts associated with the objects. Unfortunately, traditional distance metrics may not work in this context due to the curse of dimensionality, thus requiring fresh work tuned to this setting. This is also a promising area of research.
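A minimal sketch of the first type of anomaly, using Jaccard similarity over credential attributes: the metric, threshold, and names are illustrative choices, and the curse-of-dimensionality caveat noted above applies once the attribute space grows:

```python
def jaccard(a, b):
    """Similarity of two attribute sets in [0, 1]."""
    return len(a & b) / len(a | b) if a | b else 1.0

def anomalous_assignments(holders, creds, threshold=0.2):
    """Flag (user, permission) pairs where the holder's mean similarity to
    the other holders of the same permission falls below `threshold`.
    `holders`: permission -> set of users; `creds`: user -> attribute set."""
    flagged = []
    for perm, users in holders.items():
        for u in users:
            others = users - {u}
            if not others:
                continue
            mean_sim = sum(jaccard(creds[u], creds[v]) for v in others) / len(others)
            if mean_sim < threshold:
                flagged.append((u, perm))
    return flagged
```

For instance, an IT user holding a payroll permission otherwise held only by senior HR staff would stand out with near-zero mean similarity.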
References

1. DoD Computer Security Center: Trusted Computer System Evaluation Criteria (December 1985)
2. Bell, D., LaPadula, L.: Secure computer systems: Unified exposition and Multics interpretation. Technical Report MTR-2997, The MITRE Corporation, Bedford, MA (March 1976)
Automating Security Configuration and Administration
7
3. Sandhu, R.S., et al.: Role-Based Access Control Models. IEEE Computer, 38–47 (February 1996)
4. Ferraiolo, D., Sandhu, R., Gavrila, S., Kuhn, D., Chandramouli, R.: Proposed NIST Standard for Role-Based Access Control. In: TISSEC (2001)
5. Identity Management Design Guide with IBM Tivoli Identity. Technical report, IBM (November 2005), http://www.redbooks.ibm.com/redbooks/pdfs/sg246996.pdf
6. Coyne, E.J.: Role-engineering. In: 1st ACM Workshop on Role-Based Access Control (1995)
7. Gallagher, M.P., O’Connor, A., Kropp, B.: The economic impact of role-based access control. Planning report 02-1, National Institute of Standards and Technology (March 2002)
8. Coyne, E., Davis, J.: Role Engineering for Enterprise Security Management. Artech House, Norwood (2007)
9. Fernandez, E.B., Hawkins, J.C.: Determining role rights from use cases. In: ACM Workshop on Role-Based Access Control, pp. 121–125 (1997)
10. Brooks, K.: Migrating to role-based access control. In: ACM Workshop on Role-Based Access Control, pp. 71–81 (1999)
11. Roeckle, H., Schimpf, G., Weidinger, R.: Process-oriented approach for role-finding to implement role-based security administration in a large industrial organization. In: ACM (ed.) RBAC (2000)
12. Shin, D., Ahn, G.J., Cho, S., Jin, S.: On modeling system-centric information for role engineering. In: 8th ACM Symposium on Access Control Models and Technologies (June 2003)
13. Thomsen, D., O’Brien, D., Bogle, J.: Role based access control framework for network enterprises. In: 14th Annual Computer Security Applications Conference, pp. 50–58 (December 1998)
14. Neumann, G., Strembeck, M.: A scenario-driven role engineering process for functional RBAC roles. In: 7th ACM Symposium on Access Control Models and Technologies (June 2002)
15. Epstein, P., Sandhu, R.: Engineering of role/permission assignment. In: 17th Annual Computer Security Applications Conference (December 2001)
16. Kern, A., Kuhlmann, M., Schaad, A., Moffett, J.: Observations on the role life-cycle in the context of enterprise security management. In: 7th ACM Symposium on Access Control Models and Technologies (June 2002)
17. Schaad, A., Moffett, J., Jacob, J.: The role-based access control system of a European bank: A case study and discussion. In: Proceedings of the ACM Symposium on Access Control Models and Technologies, pp. 3–9 (May 2001)
18. Kuhlmann, M., Shohat, D., Schimpf, G.: Role mining – revealing business roles for security administration using data mining technology. In: Symposium on Access Control Models and Technologies (SACMAT). ACM, New York (June 2003)
19. Schlegelmilch, J., Steffens, U.: Role mining with ORCA. In: Symposium on Access Control Models and Technologies (SACMAT). ACM, New York (June 2005)
20. Vaidya, J., Atluri, V., Warner, J.: RoleMiner: Mining roles using subset enumeration. In: CCS 2006: Proceedings of the 13th ACM Conference on Computer and Communications Security, pp. 144–153 (2006)
21. Vaidya, J., Atluri, V., Guo, Q.: The role mining problem: Finding a minimal descriptive set of roles. In: The Twelfth ACM Symposium on Access Control Models and Technologies, Sophia Antipolis, France, June 20–22, pp. 175–184 (2007)
22. Vaidya, J., Atluri, V., Guo, Q., Lu, H.: Edge-RMP: Minimizing administrative assignments for role-based access control. Journal of Computer Security (to appear)
23. Ene, A., Horne, W., Milosavljevic, N., Rao, P., Schreiber, R., Tarjan, R.: Fast exact and heuristic methods for role minimization problems. In: The ACM Symposium on Access Control Models and Technologies (June 2008)
24. Zhang, B., Al-Shaer, E., Jagadeesan, R., Riely, J., Pitcher, C.: Specifications of a high-level conflict-free firewall policy language for multi-domain networks. In: The Twelfth ACM Symposium on Access Control Models and Technologies, pp. 185–194 (2007)
25. Vaidya, J., Atluri, V., Guo, Q.: The role mining problem: A formal perspective. ACM Trans. Inf. Syst. Secur. 13(3), 1–31 (2010)
26. Molloy, I., Chen, H., Li, T., Wang, Q., Li, N., Bertino, E., Calo, S., Lobo, J.: Mining roles with semantic meanings. In: SACMAT 2008: Proceedings of the 13th ACM Symposium on Access Control Models and Technologies, pp. 21–30. ACM, New York (2008)
27. Colantonio, A., Di Pietro, R., Ocello, A.: Leveraging lattices to improve role mining. In: Proceedings of the IFIP TC-11 23rd International Information Security Conference (IFIP SEC 2008), pp. 333–347 (2008)
28. Colantonio, A., Di Pietro, R., Ocello, A.: A cost-driven approach to role engineering. In: SAC 2008: Proceedings of the 2008 ACM Symposium on Applied Computing, pp. 2129–2136. ACM, New York (2008)
29. Guo, Q., Vaidya, J., Atluri, V.: The role hierarchy mining problem: Discovery of optimal role hierarchies. In: Proceedings of the 24th Annual Computer Security Applications Conference (December 8–12, 2008)
30. Geerts, F., Goethals, B., Mielikainen, T.: Tiling databases. In: Suzuki, E., Arikawa, S. (eds.) DS 2004. LNCS (LNAI), vol. 3245, pp. 278–289. Springer, Heidelberg (2004)
31. Lu, H., Vaidya, J., Atluri, V.: Optimal Boolean matrix decomposition: Application to role engineering. In: IEEE International Conference on Data Engineering (April 2008)
32. Frank, M., Basin, D., Buhmann, J.M.: A class of probabilistic models for role engineering. In: CCS 2008: Proceedings of the 15th ACM Conference on Computer and Communications Security, pp. 299–310. ACM, New York (2008)
33. Fuchs, L., Pernul, G.: HyDRo – hybrid development of roles. In: Sekar, R., Pujari, A.K. (eds.) ICISS 2008. LNCS, vol. 5352, pp. 287–302. Springer, Heidelberg (2008)
34. Bartal, Y., Mayer, A., Nissim, K., Wool, A.: Firmato: A novel firewall management toolkit. In: IEEE Symposium on Security and Privacy, pp. 17–31 (1999)
35. Mayer, A.J., Wool, A., Ziskind, E.: Fang: A firewall analysis engine. In: IEEE Symposium on Security and Privacy, pp. 177–187 (2000)
36. Hazelhurst, S., Attar, A., Sinnappan, R.: Algorithms for improving the dependability of firewall and filter rule lists. In: International Conference on Dependable Systems and Networks, pp. 576–585 (2000)
37. Yuan, L., Mai, J., Su, Z., Chen, H., Chuah, C., Mohapatra, P.: FIREMAN: A toolkit for firewall modeling and analysis. In: IEEE Symposium on Security and Privacy, pp. 199–213 (2006)
38. Le, F., Lee, S., Wong, T., Kim, H.S., Newcomb, D.: Minerals: Using data mining to detect router misconfigurations. In: SIGCOMM Workshop on Mining Network Data (2006)
39. Al-Shaer, E.S., Hamed, H.H.: Discovery of policy anomalies in distributed firewalls. In: Annual Joint Conference of the IEEE Computer and Communications Societies (2004)
40. Saltzer, J.H., Schroeder, M.D.: The protection of information in computer systems. Proceedings of the IEEE 63(9), 1278–1308 (1975)
41. Allen, J., Christie, A., Fithen, W., McHugh, J., Pickel, J., Stoner, E.: State of the practice of intrusion detection technologies. Technical Report CMU/SEI-99-TR-028, Carnegie Mellon University (1999)
42. Lunt, T.F.: A survey of intrusion detection techniques. Computers and Security 12(4), 405–418 (1993)
43. Lane, T., Brodley, C.: Temporal sequence learning and data reduction for anomaly detection. ACM Transactions on Information and System Security 2(3), 295–331 (1999)
Security Metrics and Security Investment Models

Rainer Böhme

International Computer Science Institute, Berkeley, California, USA
[email protected]

Abstract. Planning information security investment is somewhere between art and science. This paper reviews and compares existing scientific approaches and discusses the relation between security investment models and security metrics. To structure the exposition, the high-level security production function is decomposed into two steps: the cost of security is mapped to a security level, which is then mapped to benefits. This decomposition allows one to structure data sources and metrics, to rethink the notion of security productivity, and to distinguish two sources of indeterminacy: measurement error and attacker behavior. It is further argued that recently proposed investment models, which try to capture more features specific to information security, should be used for all strategic security investment decisions below the level of defining the overall security budget.
1
Introduction
According to recent estimates, global enterprises spent about US$ 13 billion on information security in 2009, and this figure is projected to grow by 14% in 2010 [1]. This amount is substantial even when broken down to the level of the individual enterprise. For instance, one in three surveyed firms in the US spends 5% or more of its total IT budget on information security [2]. In Japan, one in five firms spent 10% or more in 2007, although the fraction of firms investing in security so intensively came down from one in three in 2006 [3]. This is not overly surprising, as money allocated to security is not available for other purposes. So the key question in the management of information security is whether this money is being spent well. This question has attracted the attention of researchers from accounting, business, economics, computer science, and related disciplines. This paper attempts to survey and systematize the literature, thereby extracting mature findings as insights for practitioners and distinguishing them from untested hypotheses and open research questions for academics interested in the field. In Section 2 we decompose the security investment process and discuss all key variables. Section 3 focuses on data sources and metrics for these variables. Section 4 gives an overview of recent directions in research deviating from the standard approach towards more domain-specific or empirically founded models. The paper concludes with a brief outlook.
I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 10–24, 2010. © Springer-Verlag Berlin Heidelberg 2010
[Figure: two-step mapping; the cost of security is mapped to the security level (annotated “security productivity”), and the security level is mapped to the benefit of security (annotated “risk mitigation”).]
Fig. 1. Decomposition of the security production function into two steps
2
What to Measure
The key quantity in investment theory is the ratio of cost to benefit or, in terms of a production function, the amount of output per unit of input. The purpose of a security investment model is to describe this relation formally for the domain of information security. Every security investment model builds on security metrics which define the model’s inputs, outputs, and parameters. If values are obtained from actual measurements, the model can predict whatever unknown variable it is solved for. Undoubtedly the most famous security investment model has been proposed by Gordon and Loeb [4]. Standing in the tradition of the accounting literature, this model defines a security breach probability function, which maps the monetary value of security investment to a probability of incurring a defined loss. Under the assumption of a class of security breach probability functions, the authors derive a rule of thumb for setting the security investment as a fraction of the expected loss without security investment.1 Several extensions of the Gordon–Loeb model criticize this conjecture [5], derive optimal investment rules for alternative forms of the security breach probability function [6], endogenize the probability of attack [7], or include timing decisions [8]. All variants have in common that security investment exhibits decreasing marginal returns: every additional dollar spent yields proportionally less benefit. This assumption can be justified intuitively [9], and it is also supported empirically by cross-sectional firm data [10]. From a measurement point of view, the high degree of abstraction of the Gordon–Loeb model and its variants can sometimes be inconvenient. This is so because the direct mapping of inputs (monetary amounts of security investment)
1
The precise conjecture states that for decreasing marginal returns, an upper bound for security investment is given by 1/e (or roughly 37%) of the expected loss without security investment [4].
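The bound can be illustrated numerically for one of the two classes of breach probability functions analyzed by Gordon and Loeb [4], S(z, v) = v/(αz + 1)^β, where z is the monetary investment, v the baseline breach probability, and L the loss. The parameter values below are purely illustrative:

```python
import math

def optimal_investment(v, L, alpha, beta):
    """Maximizer of ENBIS(z) = (v - S(z, v)) * L - z for the breach
    probability function S(z, v) = v / (alpha * z + 1) ** beta.
    Closed form from the first-order condition -dS/dz * L = 1."""
    z = ((alpha * beta * v * L) ** (1 / (beta + 1)) - 1) / alpha
    return max(z, 0.0)  # corner solution: invest nothing if z* < 0

# Illustrative parameters: 65% baseline breach probability, loss of 1M.
v, L, alpha, beta = 0.65, 1_000_000, 1e-5, 1.0
z_star = optimal_investment(v, L, alpha, beta)
# The conjecture: optimal investment never exceeds (1/e) * v * L.
assert z_star <= v * L / math.e
```

Here the optimal investment comes out near 155,000, well under the 1/e bound of roughly 239,000; a grid search over ENBIS(z) confirms the closed-form optimum.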
to outputs (probability of loss) neglects intermediate factors, notably the security level. In practice, intermediate factors are oftentimes better observable than the abstract parameters of the Gordon–Loeb model. Therefore we use an alternative structure for our discussion of the variables of interest. As depicted in Fig. 1, we decompose the security production function into two parts. First, the cost of security (in monetary terms) is mapped to the security level (solid lines in Fig. 1). Second, the security level stochastically determines the benefits of security (dashed lines and shaded area in Fig. 1). Indeterminacy is introduced to model attacker behavior. In the following we discuss each variable of interest and explain why this decomposition is useful.
2.1
Cost of Security
Cost of security seems to be the variable easiest to measure: one sums up the expenses for the acquisition, deployment, and maintenance of security technology. Yet this reflects only the direct cost. Some security measures have non-negligible indirect costs, such as time lost due to forgotten credentials, the inconvenience of transferring data between security zones, or incompatibilities of security mechanisms slowing down essential processes. If security measures foster privacy or secrecy by enforcing confidentiality, some business decisions might have to be taken less informed and reach suboptimal outcomes compared to the fully informed case. This opportunity cost adds to the indirect cost of security. It is sometimes useful to express the cost of security as a function of the economic activity in the core business: fixed costs are independent of this activity, whereas variable costs grow proportionally to it. It is often sufficient to assume fixed costs of security. However, the cost of distributing security tokens to customers, or indirect costs due to delayed business processes, are clearly variable and should be modeled as such. If the security investment model has a time horizon of multiple periods, one can further distinguish one-time and recurring (i.e., per-period) costs. While the acquisition and deployment of protection measures are naturally modeled as one-time costs, their maintenance and most indirect costs are recurring. In certain situations it is useful to consider sunk costs, which cannot be recovered when decommissioning protection measures [9]. Most security equipment (e.g., firewall devices) can be sold (at a discount) or repurposed (e.g., as routers), and staff can be transferred or fired [4]. But the expenses for training or for the distribution of security tokens to customers are irreversibly spent.
Whenever costs are distributed over several periods, effects of time-dependent discounting and non-linearities due to taxation can be considered [11]. This is common practice in general investment theory, but barely reflected in the specific literature on security investment so far. Given the pace of development and the short-term nature of most security investments, the errors introduced by ignoring these factors seem small compared to other sources of uncertainty and do not justify complicating the models excessively.
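When such multi-period effects are considered, the present-value cost of a measure combines a one-time component with discounted recurring components. A minimal sketch (function name, amounts, and the discount rate are illustrative):

```python
def present_value_cost(one_time, recurring, periods, rate):
    """Present-value cost of a protection measure: one-time acquisition
    and deployment cost now, plus recurring (per-period) maintenance and
    indirect costs discounted at `rate` per period."""
    return one_time + sum(recurring / (1 + rate) ** t
                          for t in range(1, periods + 1))

# e.g., a 10,000 appliance with 2,000/year upkeep over 3 years at 5%
cost = present_value_cost(10_000, 2_000, 3, 0.05)
```

With a zero discount rate this reduces to the plain sum of all payments, which is usually an adequate approximation for short-lived security investments, as argued above.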
Whatever breakdown is used to account for the cost of security, this variable should be considered deterministic up to measurement noise. That is, a true value exists in theory, although it might not always be easy to measure it exactly.
2.2
Security Level
The security level is the variable in the model that summarizes the quality of protection. Like the cost of security, it can be assumed to be embodied in a deterministic state, even though it is still more difficult to measure. The reason is that the quality of protection is not necessarily a scalar, but some discrete state which has to be mapped to (at least) an ordinal scale. Deterministic indicators include the patch level, the existence and configuration of intrusion detection systems, whether virus scanners are installed on end-user PCs, etc. [12]. Despite being often crude and noisy, these indicators convey some indication of the actual security level. This way, the various process models that evaluate security in organizations qualitatively (e.g., [13,14]) can be connected with quantitative security investment models. In addition, the security level can often be observed through stochastic indicators where, again, the indeterminacy reflects attacker behavior. Examples in this category are typical incident measures of intrusion detection systems and virus scanners, such as the actual false alarm and missed detection rates. Observe that our decomposition of the security production function is useful if indicators of the security level are (partly) observable. Since the variables on the benefit side of security investment models in particular are difficult to measure and error-prone, it can be of great help to include a supporting point by quantifying the security level. This way, the first and second steps of the security production function can be evaluated independently, checked for plausibility, and benchmarked against industry best practices. A related remark concerns the notion of security productivity.
While security productivity is defined for both steps jointly in the Gordon–Loeb framework [4,7] (in the absence of alternatives), we prefer to tie productivity more closely to the efficiency of the security technology and its ability to mitigate risk (as opposed to risk avoidance, transfer, and retention). As annotated in Fig. 1, security productivity is determined by the curvature of the function that maps the cost of security to the security level. It reflects the increase in security level per unit of security spending, possibly taking into account decreasing marginal returns.2 Since the second function, on the benefit side, is much more specific to the individual organization (e.g., due to differences in the assets at risk), our definition of security productivity has advantages when comparing the efficiency of security spending between organizations.
2
Intuitively, we expect that this characteristic applies to both mapping functions as depicted in Fig. 1. But this is not essential as long as the total effect prevails. There always exists a transformation of the security level so that only one function models the total effect of decreasing marginal returns.
2.3
Benefit of Security
The second step in the security production function involves the difficulty of mapping incidents to losses. More precisely, the security level is mapped to prevented incidents, which can then be translated into a benefit of security.3 Matsuura notes that fewer incidents can be due either to more attacks failing or to fewer attacks being launched. Most protection technology affects the first factor, but differences in security productivity could be used to balance investment along this dimension [7]. This is particularly relevant if the second factor (fewer attacks) is not specific to the organization, but affects others too (cf. Sect. 4.5). As mentioned above, the benefit of security largely depends on the value of the assets at risk. This opens up the can of worms of valuating intangible information assets. For the sake of brevity, we spare a survey of this topic and assume for now that the value of all assets affected by an incident is known. Then we can distinguish situations in which this value imposes an upper bound on the losses from situations in which the losses can exceed the asset value. Examples of the latter include cases of liability or secondary costs to recover the asset [15]. We use the broader term recovery cost to subsume all direct and indirect costs associated with a loss event. By their very nature, losses, and hence recovery costs, are random variables that take positive values and oftentimes concentrate probability mass at zero (the case when no incident happens). These random variables can be summarized in scalars (e.g., by their moments), though not without losing information. We follow the convention in decision theory and express the expected benefits after a transformation by a utility function, which takes the risk aversion of the decision maker as a parameter. If organizations are assumed to be risk neutral (which is justifiable for businesses), the utility function is the identity function.
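The random-variable view can be made concrete with a small Monte-Carlo sketch that estimates the expected benefit as prevented losses under an identity (risk-neutral) utility; the scenario probabilities and loss figures are invented for illustration:

```python
import random

def expected_benefit(scenarios, utility=lambda x: x, n=50_000, seed=7):
    """Monte-Carlo estimate of the expected (utility-transformed) benefit
    of security. Each scenario is a tuple (incident probability without
    protection, probability with protection, loss). The same uniform draw
    drives both cases, so a loss counts as prevented exactly when the
    incident would occur without protection but not with it."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        prevented = 0.0
        for p_without, p_with, loss in scenarios:
            u = rng.random()
            prevented += loss * ((u < p_without) - (u < p_with))
        total += utility(prevented)
    return total / n

# One scenario: incident probability drops from 50% to 10%, loss 100.
est = expected_benefit([(0.5, 0.1, 100.0)])  # true value: 0.4 * 100 = 40
```

A risk-averse decision maker would pass a concave `utility`, which weights the frequent zero-loss outcomes and the rare large losses differently than the mean does.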
It is needless to say that the random nature of losses complicates not only the ex-ante perspective of security investment (“What measures should we implement?”), but also ex-post evaluations (“Did we implement the right measures?”) [16]. What appears right or wrong in one state of the world (i.e., one realization of the random attack variable) is not necessarily the same in other states. One way or the other, a security manager’s standing within an organization will always depend on a combination of skill and luck. At least for the ex-ante perspective, very recent research points out that fuzzy logic might be a tool to deal with the large degree of uncertainty in security decision-making [17,18]. However, it is too early to tell whether these concepts are implementable in practice and whether they provide the right kind of signals that can be interpreted on technical and managerial levels alike.
3
How to Measure
With the three variables of interest defined, there remain open questions of how to measure or estimate their values (Sect. 3.1) and how to calculate meaningful decision criteria (Sect. 3.2) for a specific investment decision.
3
Benefit is expressed in the same monetary unit as cost to calculate ratios.
[Figure: three columns of indicators, each ordered from abstract to concrete. Cost of security: security spending, budget allocation, protection measures. Security level: qualitative evaluation and penetration testing (deterministic), incident counts (probabilistic). Benefit of security: expected benefit, (saved) recovery cost, (prevented) direct loss.]
Fig. 2. Security investment indicators structured by level of abstraction; arrowheads point towards increasing difficulty in measurement
3.1
Data Sources
Data sources can broadly be divided into internal sources of the investing organization and external sources. Figure 2 shows various security investment indicators from internal sources and their associated variables in the investment model. The indicators corresponding to cost and benefit of security are vertically ordered by their level of abstraction. Technical indicators of the security level, by their very nature, are concrete and specific to the technology in use [12]. Since the transition from in-principle deterministic states to probabilistic quantities takes place at this level, it is convenient to organize these indicators along this dimension horizontally. On the cost side, security spending means the total amount of the security budget of an organization. It is the indicator of interest when setting the budget (“How much is enough?” [19]). For a given budget, the next more concrete level is deciding the security strategy (“Where to invest?”). This involves the allocation of the budget to the typical alternatives in risk management (mitigation, avoidance, transfer, retention) and to types of security investment (proactive versus reactive, technical versus organizational, etc.). Even more concrete is the cost of individual protection measures. For many measures, this cost is easily observable (e.g., by the price tag). Measuring security costs at more abstract levels becomes increasingly difficult, as indirect costs emerging from certain measures and from the interaction between measures [9] have to be taken into account. The hierarchy on the benefit side is roughly symmetric to the cost side. The only difference is that saved recovery costs and prevented direct losses are random
variables (or realizations in the ex-post perspective, if observable at all), whereas the expected benefits reflect an annualized4 and risk-adjusted monetary value. External data sources include threat level indicators, such as the number of active phishing sites, malware variants in circulation, breach disclosure figures, or the number of vulnerability alerts awaiting patches [20]. More and more such indicators are collected and published on a regular basis by the security industry (mind potential biases [21]), research teams, not-for-profit organizations, and official authorities. These indicators alone are certainly too unspecific for most organizations, but they can be helpful for updating quantitative risk assessment models regularly and for adjusting defenses tactically, even if data from internal sources is only available at lower frequency or higher latency. By contrast, market-based indicators derived from price information in vulnerability markets have been proposed as alternatives to threat level indicators for their potential of being forward-looking [22,16]. In prior work, we have identified bug challenges, bug auctions, exploit derivatives, and premiums charged by cyber-insurers as potential data sources. However, the most dominant type of vulnerability market in practice are vulnerability brokers, which emit the fewest signals from which to construct telling indicators [23].
3.2
Choice of Metrics
The main purpose of metrics is to compare between alternatives. While comparisons over time or across organizational units can be carried out with concrete technical indicators of the security level, comparisons between protection measures or budget allocation options require the underlying metrics to be on the same scale. This explains why the most regarded metrics in security investment are calculated as cost–benefit ratios on a higher level of abstraction. Over the past decade, substantial work has been done in adapting principles and metrics of investment theory for security investment [19,15,16]. Most prominent is the notion of a return on (security) investment (ROSI/ROI). Among a handful of variants, we prefer the one normalized by the cost of security [24,9],

ROSI = (benefit of security − cost of security) / cost of security.   (1)
Higher values of ROSI denote more efficient security investment. Note that the notion of return in ROSI is broad, as prevented losses do not constitute returns in a narrow sense. Terminology feud aside, these metrics are also regarded with skepticism by practitioners who are familiar with the problems of statistical data collection for rare events. They see a main problem in obtaining annualized and risk-adjusted security benefit figures [12,25]. Nevertheless, these metrics seem to remain a necessary compromise to justify security expenses within organizations.5 It is
4 Or aggregated for any other fixed time horizon.
5 Another incontestable application of ROSI is result presentations for analytical models, e.g., [9].
common practice to make (or justify) budget decisions based on standard investment theory because it facilitates comparisons between investments in various domains. This has been noted so often that the largest annual survey among corporate information security managers in the US includes a specific question [2, Fig. 7]. According to it, ROSI is used by 44% of the responding organizations. The net present value (NPV) and the internal rate of return, two other standard investment indicators which allow for discounting but share the same caveats, follow with 26% and 23%, respectively. Apparently security managers have little choice but to adopt the terminology of corporate finance.
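Eq. (1) is straightforward to apply once annualized benefit and cost figures are available. A sketch comparing two hypothetical options (all names and numbers are invented for illustration):

```python
def rosi(benefit, cost):
    """Return on security investment, Eq. (1): (benefit - cost) / cost."""
    return (benefit - cost) / cost

# Candidate measures: (annualized, risk-adjusted benefit; total cost).
options = {"IDS rollout": (400_000, 250_000), "token scheme": (90_000, 80_000)}
ranked = sorted(options, key=lambda k: rosi(*options[k]), reverse=True)
```

The hard part, as noted above, is not the arithmetic but obtaining defensible benefit figures; the ranking is only as reliable as those estimates.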
4
Recent Research Directions
Independent of the adoption of security metrics and investment models in practice, academia contributes to the formation and development of a security investment theory. This theory is becoming increasingly detached from its roots in accounting. Recent security investment models have been enriched with domain knowledge reflecting specific technical or environmental factors. While in the early days security investment models were motivated by setting a security budget, newer models are devised to help set a security strategy. The question has changed from “How much is enough?” [19] to “Where to invest?”. In the following we briefly review interesting recent developments.
4.1
Timing
Security investment inherently involves decision-making under uncertainty: will this threat realize or not? This uncertainty is reduced over time as incidents can be observed. An elegant way to model this is offered by real options theory, a branch of financial investment theory which accounts for deferred investment (unlike, for instance, the NPV metric). Gordon, Loeb and Lucyshyn [26] first adapted this line of thought to information security and proclaimed a “wait-and-see” tactic. Instead of over-investing in defenses that will never become relevant, it can be rational to wait until the first (non-catastrophic) incident happens, and then react. Herath and Herath [27] follow up and provide a comparison between ROSI-based security investment and the real options approach. Tatsumi and Goto [8] extend the Gordon–Loeb model [4] by a timing dimension. Balancing proactive versus reactive security investment is also studied by Yue and Çakanyildirim [28] for the specific case of configuring an intrusion detection system (IDS), as well as in our “iterated weakest link” model [9]. This model combines several features specific to security investment, such as an attacker seeking to exploit the weakest link, in a repeated player-versus-nature game involving multiple threats over multiple rounds (unlike most real option models, which consider only two stages). The core idea is that the defender has some knowledge about the expected difficulty of pursuing several attack vectors, but
R. Böhme
remains uncertain about the true order. Accepting that some attacks may be successful enables more targeted security investment and thus achieves overall better outcomes than blind over-investment. Thus, in many cases ROSI increases even after accounting for the losses of successful attacks.

4.2 Information Gathering
There are other ways to reduce the uncertainty in making security decisions than waiting for attacks. Sharing information with other defenders promises several benefits:⁶

1. Early warning. New attacks might not always hit all organizations at once. The ones spared at the beginning do not need to wait until they get attacked, but can learn from their peers and upgrade just in time. On a technical level, this can be done by sharing IDS and anti-virus signatures.
2. Noise reduction through aggregation. Some types of incidents occur too rarely to estimate reliable probabilities of occurrence from internal observations only. By aggregating observations over many sites, even small probabilities can be determined more accurately.
3. Forensic discovery of structure. The nature of certain malicious activity online remains obscure to observers who see only a small fraction of the network. Sharing knowledge may give a 'bigger picture' and enable forensic investigations to find better defenses or prosecute perpetrators.

Gordon, Loeb and Lucyshyn [30] as well as Gal-Or and Ghose [31] proposed models to determine the optimal amount of information sharing between organizations. In their game-theoretic framework, security investment and information sharing turn out to be strategic complements. Another way to gather information is to analyze precursors of attacks from internal sources via intrusion detection [32,33] and prevention systems [28]. Since the deployment and maintenance of such systems constitutes an investment, it is quite natural to refine investment models to include this feature. A related option is professional services that test a system's resilience by exposing it to the latest attack techniques. Commissioning these so-called penetration tests can be seen as an investment in information acquisition. Hence it has its place in security investment models [34].
Note that the ROSI metric cannot be calculated separately for information gathering tasks, because the acquired information can make planned security investments obsolete. These savings sometimes exceed the cost of information gathering, thus leading to a negative denominator in Eq. (1). As a rule of thumb, ROSI is a metric for the joint efficiency of the entire security investment strategy.

4.3 Information Security Outsourcing
Once the security budget is defined, it is rational to consider security as a service that is subject to a make-or-buy decision similar to most other operations,

⁶ We list obvious benefits only. See for example [16, Table 1] for risks and [29] for ambivalent consequences of signaling information about the security level.
Security Metrics and Security Investment Models
though with specific risks and benefits [35]. Outsourcing in general is best approached as a principal–agent problem where the provider is susceptible to moral hazard [36]. Ding et al. adapted this theory to the special case of information security outsourcing and mention the providers' long-term interest in a good reputation as a limiting factor to moral hazard [37]. In a related analysis, the same team includes transaction costs in the investment model and warns that the decision to outsource security functions may bear hidden costs if principals find themselves locked into a relationship with their providers [38]. By contrast, Rowe [39] points to positive externalities of security outsourcing if multiple organizations share the same provider. These externalities arise from both economies of scale and improved information sharing. This is beneficial not only for the involved organizations but, depending on the model, also for others. Schneier specifies that it is important to differentiate between outsourced functions [40]: penetration and vulnerability testing (see Sect. 4.2), security auditing, system monitoring, general system management, forensics, and consulting all involve different risks and incentive structures. This is partly reflected in the security investment model by Cezar, Cavusoglu and Raghunathan [41], who analyze under which conditions it is optimal to outsource system management and system monitoring to a single provider or to multiple independent providers.

4.4 Cyber-Risk Transfer
Aside from risk mitigation and risk avoidance, the financial risk of security incidents can be transferred to third parties, notably cyber-insurers. If the premium is lower than the difference between benefit and cost of security, this is a viable investment option. Note that if the insurance market is in equilibrium, this is only true if organizations are either risk averse or better informed about their specific risk than the insurer. However, the market for cyber-insurance seems underdeveloped in practice, presumably due to three obstacles characterizing cyber-risk: interdependent security, correlated risk, and information asymmetries [42]. If this situation changes in the future, insurers will most likely require that protection measures against all known threats are in place. Therefore cyber-insurance should rather be seen as complementary to investing in protection measures or outsourced security operations, not as a substitute. To ensure that a defined security level is maintained, insurers might collaborate with security service providers and advise their clients to outsource security operations to them (see Fig. 3). Zhao, Xue and Whinston [43] study such a scenario and conclude that security outsourcing (which they see as a substitute for cyber-insurance) is preferable to cyber-insurance. However, in this model security service providers assume full liability for potential losses. We are not aware of a single provider who offers this in practice. So effectively, this result should be interpreted as a combination of security outsourcing and cyber-insurance. Such a combination in fact promises better outcomes than cyber-insurance alone [42].
[Fig. 3 shows a 2×2 matrix with the axes "risk" (accept vs. transfer) and "control" (maintain vs. delegate). The quadrants are: inhouse (accept risk, maintain control); cyber-insurance (transfer, maintain); security outsourcing (accept, delegate); and sec. outsourcing with unlimited liability (transfer, delegate).]

Fig. 3. Orthogonal relation of cyber-risk transfer and outsourcing of security operations
4.5 Private versus Public Benefit
So far, this paper has taken the dominant perspective in investment theory: organizations seeking to maximize their private profit. A separate stream of related work has studied security investment as a problem of provisioning a public good. Varian [44] adapted Hirshleifer's [45] theory of public goods with different aggregation functions to the domain of information security. In independent work, Kunreuther and Heal [46] study security investment when it generates positive externalities, i.e., an organization's expected loss decreases not only with its own increasing security level but also with the increasing (aggregate) security level of other organizations connected in a network. Grossklags et al. [47] extend this work by distinguishing between two types of protection measures, one which generates positive externalities and one which does not. They describe the existence of equilibria in a game-theoretic setting as a function of the cost of both types of security investment. Cremonini and Nizovtsev [48] modify the setting by considering the case where security investment generates negative externalities. In general, if security investment creates positive externalities, profit-maximizing security investors try to free-ride and under-invest. The opposite is true if security investment creates negative externalities.

4.6 Empirical Underpinning
The academic literature on security investment suffers from a deficit of empirical validation with cross-sectional or longitudinal data,⁷ which can be explained by the difficulty of obtaining such data. The most regarded annual survey among US enterprises includes a number of relevant indicators, but its data quality is often criticized for ambiguous category definitions and low response rates indicating

⁷ References to several case studies of single organizations can be found, e.g., in [49].
potential coverage error [2]. Moreover, its results have not been public since the 2008 edition, and the responses are not available in disaggregated form. The situation is better in Japan, where METI⁸ data is available on a micro level. This data has been used to validate models of the Gordon–Loeb type [10]. Liu, Tanaka and Matsuura [49] also report evidence for the decomposed form of security investment models as advocated in this paper. They observe a broad indicator of security investment (including protection technology, organizational measures, and employee awareness raising) over several periods and find that consistency in security investment is a significant predictor of fewer incidents. Eurostat has collected some indicators related to security in its annual ICT surveys of households and enterprises in Europe. However, the data is very fragmented and the indicators are not focused on security investment [21]. A special survey module tailored to security is being administered in 2010. We are not aware of any literature testing security investment models with Eurostat data. In [9], we present data from independent sources to support the basic assumptions of the iterated weakest link model. The model itself and its predictions, however, are not yet tested empirically.
5 Outlook
This paper has demonstrated that treating security investment as a science rather than an art is impeded by many factors, notably the difficulties of estimating probabilities for rare events and of quantifying losses in monetary metrics. Some authors have suggested abandoning ROSI altogether. But what are the alternatives? Not planning at all is not an option: it would be a miracle if about US$ 13 billion per year were spent effectively just by accident. So the medium-term outlook is to refine measurements and models (in this order!). If ROSI and derived metrics are deemed unreliable, they should not be used for anything but negotiating a security budget. More specific models that link cost to security level and security level to benefit are better suited for setting the security strategy or deciding about individual protection measures. They might help to spend smarter and therefore spend less for the same effect. As if managing information security investment in a scientific way were not already difficult enough, recent developments are likely to bring new challenges in the future. Ubiquitous network connectivity, novel architectures, and business models fostering massively distributed computing (aka cloud computing) are about to change the security landscape. On the cost side, this will make it more difficult to disentangle security investment from other expenses, e.g., for a redesign of the system architecture. Measures of the security level will become less reliable due to increasing interdependence between loosely connected and autonomous organizations. On the benefit side, detecting and measuring breaches in real time will require sophisticated monitoring and forensics efforts (which themselves come at a cost). In addition, novel valuation methods will be needed to account for the value of (protected/breached/lost) information assets over time [50].

⁸ The Japanese Ministry of Economy, Trade and Industry.
With the increasing dependence of organizations on information and information technology, the borderline between security investment and general risk management is about to blur. On the upside, this underlines the relevance of the subject. On the downside, it makes it even harder to keep an overview of the field and maintain a consistent terminology and conceptual framework.
Acknowledgements

Thanks are due to Kanta Matsuura and the organizers of IWSEC 2010 for the kind invitation to the conference. Kanta further gave helpful comments on an earlier draft and contributed the security investment statistics for Japan. The paper also benefited from additional comments by Márk Félegyházi. The author gratefully acknowledges a postdoctoral fellowship from the German Academic Exchange Service (DAAD).
References

1. Canalys Enterprise Security Analysis: Global enterprise security market to grow 13.8% in 2010 (2010), http://www.canalys.com/pr/2010/r2010072.html
2. Richardson, R.: CSI Computer Crime and Security Survey. Computer Security Institute (2008)
3. METI: Report on survey of actual condition of IT usage in FY 2009 (June 2009), http://www.meti.go.jp/statistics/zyo/zyouhou/result-1.html
4. Gordon, L.A., Loeb, M.P.: The economics of information security investment. ACM Transactions on Information and System Security 5(4), 438–457 (2002)
5. Willemson, J.: On the Gordon & Loeb model for information security investment. In: Workshop on the Economics of Information Security (WEIS), University of Cambridge, UK (2006)
6. Hausken, K.: Returns to information security investment: The effect of alternative information security breach functions on optimal investment and sensitivity to vulnerability. Information Systems Frontiers 8(5), 338–349 (2006)
7. Matsuura, K.: Productivity space of information security in an extension of the Gordon–Loeb investment model. In: Workshop on the Economics of Information Security (WEIS), Tuck School of Business, Dartmouth College, Hanover, NH (2008)
8. Tatsumi, K.-i., Goto, M.: Optimal timing of information security investment: A real options approach. In: Workshop on the Economics of Information Security (WEIS), University College London, UK (2009)
9. Böhme, R., Moore, T.W.: The iterated weakest link: A model of adaptive security investment. In: Workshop on the Economics of Information Security (WEIS), University College London, UK (2009)
10. Tanaka, H., Matsuura, K., Sudoh, O.: Vulnerability and information security investment: An empirical analysis of e-local government in Japan. Journal of Accounting and Public Policy 24, 37–59 (2005)
11. Brocke, J., Grob, H., Buddendick, C., Strauch, G.: Return on security investments. Towards a methodological foundation of measurement systems. In: Proc. of AMCIS (2007)
12. Jaquith, A.: Security Metrics: Replacing Fear, Uncertainty, and Doubt. Addison-Wesley, Reading (2007)
13. Alberts, C.J., Dorofee, A.J.: An introduction to the OCTAVE™ method (2001), http://www.cert.org/octave/methodintro.html
14. Bodin, L.D., Gordon, L.A., Loeb, M.P.: Evaluating information security investments using the analytic hierarchy process. Communications of the ACM 48(2), 79–83 (2005)
15. Su, X.: An overview of economic approaches to information security management. Technical Report TR-CTIT-06-30, University of Twente (2006)
16. Böhme, R., Nowey, T.: Economic security metrics. In: Eusgeld, I., Freiling, F.C., Reussner, R. (eds.) Dependability Metrics. LNCS, vol. 4909, pp. 176–187. Springer, Heidelberg (2008)
17. Sheen, J.: Fuzzy economic decision-models for information security investment. In: Proc. of IMCAS, Hangzhou, China, pp. 141–147 (2010)
18. Schryen, G.: A fuzzy model for IT security investments. In: Proc. of ISSE/GI Sicherheit, Berlin, Germany (to appear, 2010)
19. Soo Hoo, K.J.: How much is enough? A risk-management approach to computer security. In: Workshop on Economics and Information Security (WEIS), University of California, Berkeley, CA (2002)
20. Geer, D.E., Conway, D.G.: Hard data is good to find. IEEE Security & Privacy 10(2), 86–87 (2009)
21. Anderson, R., Böhme, R., Clayton, R., Moore, T.: Security Economics and the Internal Market. Study commissioned by ENISA (2008)
22. Matsuura, K.: Security tokens and their derivatives. Technical report, Centre for Communications Systems Research (CCSR), University of Cambridge, UK (2001)
23. Böhme, R.: A comparison of market approaches to software vulnerability disclosure. In: Müller, G. (ed.) ETRICS 2006. LNCS, vol. 3995, pp. 298–311. Springer, Heidelberg (2006)
24. Purser, S.A.: Improving the ROI of the security management process. Computers & Security 23, 542–546 (2004)
25. Schneier, B.: Security ROI: Fact or fiction? CSO Magazine (September 2008)
26. Gordon, L.A., Loeb, M.P., Lucyshyn, W.: Information security expenditures and real options: A wait-and-see approach. Computer Security Journal 14(2), 1–7 (2003)
27. Herath, H.S.B., Herath, T.C.: Investments in information security: A real options perspective with Bayesian postaudit. Journal of Management Information Systems 25(3), 337–375 (2008)
28. Yue, W.T., Çakanyildirim, M.: Intrusion prevention in information systems: Reactive and proactive responses. Journal of Management Information Systems 24(1), 329–353 (2007)
29. Grossklags, J., Johnson, B.: Uncertainty in the weakest-link security game. In: Proceedings of the International Conference on Game Theory for Networks (GameNets 2009), Istanbul, Turkey, pp. 673–682. IEEE Press, Los Alamitos (2009)
30. Gordon, L.A., Loeb, M.P., Lucyshyn, W.: Sharing information on computer systems security: An economic analysis. Journal of Accounting and Public Policy 22(6) (2003)
31. Gal-Or, E., Ghose, A.: The economic incentives for sharing security information. Information Systems Research 16(2), 186–208 (2005)
32. Lee, W., Fan, W., Miller, M., Stolfo, S.J., Zadok, E.: Toward cost-sensitive modeling for intrusion detection and response. Journal of Computer Security 10(1-2), 5–22 (2002)
33. Cavusoglu, H., Mishra, B., Raghunathan, S.: The value of intrusion detection systems in information technology security architecture. Information Systems Research 16(1), 28–46 (2005)
34. Böhme, R., Félegyházi, M.: Optimal information security investment with penetration testing. In: Decision and Game Theory for Security (GameSec), Berlin, Germany (to appear, 2010)
35. Allen, J., Gabbard, D., May, C.: Outsourcing Managed Security Services. Carnegie Mellon Software Engineering Institute, Pittsburgh (2003)
36. Jensen, M.C., Meckling, W.H.: Theory of the firm: Managerial behavior, agency costs and ownership structure. Journal of Financial Economics 3(4), 305–360 (1976)
37. Ding, W., Yurcik, W., Yin, X.: Outsourcing internet security: Economic analysis of incentives for managed security service providers. In: Deng, X., Ye, Y. (eds.) WINE 2005. LNCS, vol. 3828, pp. 947–958. Springer, Heidelberg (2005)
38. Ding, W., Yurcik, W.: Outsourcing internet security: The effect of transaction costs on managed service providers. In: Proc. of Intl. Conf. on Telecomm. Systems, pp. 947–958 (2005)
39. Rowe, B.R.: Will outsourcing IT security lead to a higher social level of security? In: Workshop on the Economics of Information Security (WEIS), Carnegie Mellon University, Pittsburgh, PA (2007)
40. Schneier, B.: Why Outsource? Counterpane Inc. (2006)
41. Cezar, A., Cavusoglu, H., Raghunathan, S.: Outsourcing information security: Contracting issues and security implications. In: Workshop on the Economics of Information Security (WEIS), Harvard University, Cambridge, MA (2010)
42. Böhme, R., Schwartz, G.: Modeling cyber-insurance: Towards a unifying framework. In: Workshop on the Economics of Information Security (WEIS), Harvard University, Cambridge, MA (2010)
43. Zhao, X., Xue, L., Whinston, A.B.: Managing interdependent information security risks: A study of cyberinsurance, managed security service and risk pooling. In: Proc. of ICIS (2009)
44. Varian, H.R.: System reliability and free riding. In: Workshop on the Economics of Information Security (WEIS), University of California, Berkeley (2002)
45. Hirshleifer, J.: From weakest-link to best-shot: The voluntary provision of public goods. Public Choice 41, 371–386 (1983)
46. Kunreuther, H., Heal, G.: Interdependent security. Journal of Risk and Uncertainty 26(2-3), 231–249 (2003)
47. Grossklags, J., Christin, N., Chuang, J.: Secure or insure? A game-theoretic analysis of information security games. In: Proceedings of the International Conference on World Wide Web (WWW), Beijing, China, pp. 209–218. ACM Press, New York (2008)
48. Cremonini, M., Nizovtsev, D.: Understanding and influencing attackers' decisions: Implications for security investment strategies. In: Workshop on the Economics of Information Security (WEIS), University of Cambridge, UK (2006)
49. Liu, W., Tanaka, H., Matsuura, K.: An empirical analysis of security investment in countermeasures based on an enterprise survey in Japan. In: Workshop on the Economics of Information Security (WEIS), University of Cambridge, UK (2006)
50. Berthold, S., Böhme, R.: Valuating privacy with option pricing theory. In: Workshop on the Economics of Information Security (WEIS), University College London, UK (2009)
Publishing Upper Half of RSA Decryption Exponent

Subhamoy Maitra, Santanu Sarkar, and Sourav Sen Gupta

Applied Statistics Unit, Indian Statistical Institute,
203 B T Road, Kolkata 700 108, India
{subho,santanu r}@isical.ac.in, [email protected]

Abstract. In the perspective of RSA, given a small encryption exponent e (e.g., e = 2^16 + 1), the top half of the decryption exponent d can be narrowed down to a small search space. This fact has previously been exploited in RSA cryptanalysis. On the contrary, here we propose certain schemes to exploit this fact towards efficient RSA decryption.

Keywords: Cryptology, Decryption Exponent, Efficient Decryption, Public Key Cryptography, RSA.
1 Introduction
The RSA cryptosystem, publicly proposed in 1978 and named after its inventors Ron Rivest, Adi Shamir and Len Adleman, is the most popular public key cryptosystem to date. Let us first briefly describe the RSA scheme [11,13].

Cryptosystem 1 (RSA). Let us define N = pq where p and q are primes. By definition of the Euler totient function, φ(N) = (p − 1)(q − 1).

– KeyGen: Choose e co-prime to φ(N). Find d such that ed ≡ 1 (mod φ(N)).
– KeyDist: Publish the public key (N, e) and keep the private key (N, d) secret.
– Encrypt: For plaintext M ∈ Z_N, ciphertext C = M^e mod N.
– Decrypt: For ciphertext C, plaintext M = C^d mod N.
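For concreteness, the four phases above can be traced with toy numbers (a minimal sketch; the tiny primes are our own illustrative choice and offer no security):

```python
# Toy walk-through of Cryptosystem 1 (RSA); insecure parameters, for illustration only.
p, q = 1009, 1013                # small primes
N = p * q                        # public modulus
phi = (p - 1) * (q - 1)          # Euler totient φ(N)

e = 65537                        # KeyGen: e co-prime to φ(N)
d = pow(e, -1, phi)              # ... and d with ed ≡ 1 (mod φ(N)); Python 3.8+
assert (e * d) % phi == 1

M = 123456                       # plaintext in Z_N
C = pow(M, e, N)                 # Encrypt: C = M^e mod N
assert pow(C, d, N) == M         # Decrypt: C^d mod N recovers M
```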
The efficiency of encryption and decryption in RSA depends on the bit-sizes of e and d respectively; further, both depend on the size of N as well, since all modular operations are done with respect to N. To improve the decryption efficiency of RSA, another variant was proposed that uses the Chinese Remainder Theorem (CRT). This is the most widely used variant of RSA in practice and is known as CRT-RSA [10,18].

Preliminaries. Before proceeding further, let us go through some preliminary discussion. For notational purposes, we denote the number of bits in an integer i by l_i; i.e., l_i = ⌈log₂ i⌉ when i is not a power of 2, and l_i = log₂ i + 1 when i is a power of 2. By Small e, we mean e = 2^16 + 1 or around that range, which is popularly used for fast RSA encryption.

I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 25–39, 2010.
© Springer-Verlag Berlin Heidelberg 2010
Fact 1. For Small e, the top half of d can be estimated efficiently.

Proof. The RSA equation ed = kφ(N) + 1 translates to ed = k(N + 1) − k(p + q) + 1, where l_k ≈ l_e and l_d ≈ l_N. In cases where e is Small, so is k, and hence k can be found using a brute force search. Thus one can estimate d as follows:

d = (k/e)(N + 1) + 1/e − (k/e)(p + q) ≈ (k/e)(N + 1) + 1/e.

The error in this approximation is (k/e)(p + q) < p + q, as 1 ≤ k < e. Thus, considering that the primes p and q are of the same bit-size, the error is O(√N); that is, one gets an approximation with error-size less than or equal to max(l_p, l_q) ≈ (1/2)l_N ≈ (1/2)l_d. If we write d = d_0 + d_1 with d_0 = (k/e)(N + 1) + 1/e, then |d − d_0| < 2^(l_N/2), which implies that d_0 estimates the top half of d correctly and d_1 ≡ d (mod 2^(l_N/2)). Thus, for the various values of k in the range 1 ≤ k < e, we get that many candidates for the upper half of d, allowing for an efficient estimate.

Our Motivation. The estimation of d stated in Fact 1 has been exploited in the literature to propose partial key exposure attacks on RSA [2]. Our motivation, though, is to use this estimation in a constructive way. As one can estimate the most significant half of the bits of d anyway in cases where e is Small, there is no harm in choosing that top half on our own and making it public. A few interesting questions come up in this direction.

– Can one choose the most significant half of the bits of d to make RSA decryption more efficient than in the case of general RSA?
– Can one choose the most significant half of the bits of d to personalize RSA in some way?
– Can one choose the least significant half of the bits of d in some way (with no constraint on the most significant half) so that a higher workload can be transferred to a server in case of server-aided decryption?

Our Contribution. In this paper, we answer these questions one by one. First, in Section 2, we propose a scheme for RSA where one can choose around half of the bits of d on the most significant side on his/her own, simply to make RSA decryption faster for Small e. It is important to note that our result does not compete with fast CRT-RSA decryption; it only highlights how simply general RSA decryption can be made more efficient through this idea.
Next, in Section 3, we answer the second question by proposing a personalized model of RSA, letting the user choose the most significant half of d on his own. We answer the third question in Section 4 and illustrate one of its potential applications in the form of a new RSA scheme for low-end devices. In modern cryptography, cryptographic implementation on low-end devices is an emerging field of research. The importance of efficiency comes from the computational constraints in low-end hand-held devices like smart cards, phones or PDAs. One knows that sometimes the low-end devices (M, say)
are connected to a server (S, say). In Section 4, we propose a scheme where S aids in the decryption process to decrease the workload of M without any significant loss of security. Note that there is a good amount of research in the field of Server-Aided RSA, initiated in [8] (one may refer to [9] for more recent trends and analysis in this area). However, the existing ideas use additional algorithms over basic RSA and the models are set in a more general framework. In this paper, we present a quite simple approach towards server-aided decryption that does not require any primitive other than RSA.
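Before moving on, the estimate of Fact 1 can be checked numerically. The following sketch uses toy 64-bit primes and our own helper names (`is_prime`, `gen_prime`); it is illustrative only and not part of the paper's schemes.

```python
import math
import random

def is_prime(n):
    """Deterministic Miller-Rabin, valid for the 64-bit range used here."""
    if n < 2:
        return False
    for sp in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % sp == 0:
            return n == sp
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for a in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

def gen_prime(bits, rng):
    while True:
        c = rng.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_prime(c):
            return c

rng = random.Random(1)
e = 2**16 + 1
while True:
    p, q = gen_prime(64, rng), gen_prime(64, rng)
    phi = (p - 1) * (q - 1)
    if p != q and math.gcd(e, phi) == 1:
        break
N = p * q
d = pow(e, -1, phi)              # ed ≡ 1 (mod φ(N))
k = (e * d - 1) // phi           # 1 <= k < e, since 1 <= d < φ(N)
d0 = (k * (N + 1) + 1) // e      # Fact 1's estimate of d, from N, e, k only
lN = N.bit_length()
# The error is below (k/e)(p+q) < p+q, i.e., it fits in the lower half of d.
assert 1 <= k < e
assert abs(d - d0) < 2**(lN // 2 + 2)
```

Enumerating the candidates k in [1, e) therefore yields at most e − 1 possibilities for the top half of d, as Fact 1 states.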
2 Efficient RSA for Small e
In this section, we propose a scheme for RSA where around half of the bits of d on the most significant side can be chosen at the will of the user. We consider e Small and hence it is logical to consider l_d = l_N. We write d = d_0 + d_1, where d_1 = d mod 2^((1/2)l_N + l_e). That is,

– d_0 is an l_N-bit integer where the (1/2)l_N − l_e many most significant bits of d_0 and d are the same, and the rest of the bits of d_0 on the least significant side are zero.
– d_1 is a ((1/2)l_N + l_e)-bit integer where the (1/2)l_N + l_e many bits of d on the least significant side constitute d_1.

We shall henceforth call d_0 the top half and d_1 the bottom half of d. According to Fact 1, the top portion d_0 of the decryption exponent d can be estimated efficiently, as we have only a small number of options in case e is Small. The scheme we propose exploits this fact to make RSA decryption faster.

Key Generation. The user is allowed to choose his/her favorite Small encryption exponent e, and d_0 as described above. Once he/she decides upon the size of the primes p, q to be used, l_N is fixed. Next, the user fixes a Small e (e.g., e = 2^16 + 1), and d_0 with l_d0 = l_N. Thereafter, the key generation follows Algorithm 1. This algorithm (and also Algorithm 2 presented in Section 4) is along similar lines to the key generation algorithms presented in [4,14].

Efficient-RSA. Once we have the power to choose the top half of the decryption exponent (d_0 shares the top half with d), one may choose it in a fashion that makes RSA decryption more efficient. A natural act is to choose the top half so that it has low Hamming weight, which helps in faster decryption. In this line, we present our Efficient-RSA scheme in Cryptosystem 2, and analyze the scheme thereafter.

Cryptosystem 2 (Efficient-RSA). Choose a Small integer e as encryption exponent and choose l_p, l_q, the bit-sizes of the RSA primes. Choose the top half of the decryption exponent as d_0 = 2^(l_p + l_q − 1).

– KeyGen: (p, q, N, d) ← KeyGenAlgoMSB(e, l_p, l_q, d_0).
– KeyDist: Publish the public key (N, e) and keep the private key (N, d) secret.
– Encrypt: For plaintext M ∈ Z_N, ciphertext C = M^e mod N.
– Decrypt: For ciphertext C, plaintext M = C^d mod N.
Input: Small encryption key e, bit-sizes of primes l_p, l_q, and d_0 with l_d0 = l_N
Output: RSA parameters p, q, N, d

 1: Pick a prime p at random with bit-size l_p such that gcd(p − 1, e) = 1;
 2: Pick a random number d_pad of length (1/2)l_d0 or less;
 3: Set d̃0 ← d_0 + d_pad;
 4: Pick a random number k of length l_e with gcd(k, e) = 1;
 5: Set x ← e − ((k(p − 1))^(−1) mod e);
 6: Set y ← ⌊(e·d̃0 − 1) / (k(p − 1))⌋;
 7: Set z ← ⌊(y − x) / e⌋;
 8: Set w ← x + ze;
 9: if w + 1 is prime and l_w = l_q then GOTO Step 12;
10: end
11: GOTO Step 2;
12: Set q ← w + 1;
13: Set N ← pq;
14: Set d̃1 ← −(1/e)((e·d̃0 − 1) − k(p − 1)w);
15: Set d ← d̃0 + d̃1;
16: RETURN p, q, N, d;

Algorithm 1. The Key Generation Algorithm (KeyGenAlgoMSB)
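Algorithm 1 can be exercised with a runnable sketch. The toy 64-bit prime sizes and the helper names (`is_prime`, `keygen_msb`) are our own assumptions for illustration; real deployments would use much larger parameters.

```python
import math
import random

def is_prime(n):  # deterministic Miller-Rabin for the 64-bit range used here
    if n < 2:
        return False
    for sp in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % sp == 0:
            return n == sp
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for a in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

def keygen_msb(e, lp, lq, d0, rng):
    """Sketch of KeyGenAlgoMSB; integer divisions realize the floors."""
    ld0 = lp + lq                                    # l_d0 = l_N
    while True:
        while True:                                  # Step 1: p with gcd(p-1, e) = 1
            p = rng.getrandbits(lp) | (1 << (lp - 1)) | 1
            if is_prime(p) and math.gcd(p - 1, e) == 1:
                break
        for _ in range(100000):                      # retry Steps 2-8 until Step 9 holds
            dpad = rng.getrandbits(ld0 // 2 - 1)     # Step 2
            dt0 = d0 + dpad                          # Step 3: d~0
            k = rng.getrandbits(e.bit_length())      # Step 4
            if k == 0 or math.gcd(k, e) != 1:
                continue
            x = e - pow(k * (p - 1) % e, -1, e)      # Step 5
            y = (e * dt0 - 1) // (k * (p - 1))       # Step 6
            z = (y - x) // e                         # Step 7
            w = x + z * e                            # Step 8
            if is_prime(w + 1) and w.bit_length() == lq:   # Step 9
                q = w + 1                            # Step 12
                N = p * q                            # Step 13
                num = (e * dt0 - 1) - k * (p - 1) * w
                assert num % e == 0                  # divisibility shown in Sect. 2.1
                d = dt0 + (-num // e)                # Steps 14-15: d = d~0 + d~1
                return p, q, N, d

rng = random.Random(7)
e, lp, lq = 2**16 + 1, 64, 64
d0 = 1 << (lp + lq - 1)          # Cryptosystem 2's choice: top half = 10...0
p, q, N, d = keygen_msb(e, lp, lq, d0, rng)
assert (e * d) % ((p - 1) * (q - 1)) == 1       # the RSA equation holds
M = 0xDEADBEEF
assert pow(pow(M, e, N), d, N) == M             # encryption/decryption round trip
```

The final assertions confirm that the generated key pair behaves as a regular RSA key while d stays close to the prescribed d_0.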
2.1 Correctness of Cryptosystem 2
The correctness of the scheme depends on the correctness of the key generation (Algorithm 1), as the other phases are similar to regular RSA. Note that

e·d̃0 − 1 − k(p − 1)w ≡ −1 − k(p − 1)x ≡ −1 + k(p − 1)·(k(p − 1))^(−1) ≡ −1 + 1 ≡ 0 (mod e).

Hence e·d̃0 − 1 − k(p − 1)w = −e·d̃1, which implies e(d̃0 + d̃1) − k(p − 1)w = 1. That is, ed − k(p − 1)(q − 1) = 1, which is the well-known RSA equation. Thus, Algorithm 1 generates the keys as per the RSA criteria.

Bitsize of Primes. We also need to verify that the bit-sizes of p and q are l_p and l_q respectively, as prescribed by the user. In Algorithm 1, note that p is already chosen to be of size l_p. Regarding q, note that we have chosen l_d0 = l_p + l_q and l_k = l_e. By construction, l_x ≈ l_e and l_y ≈ l_e + l_d0 − l_k − l_p = l_q. Thus we get l_w = max(l_x, l_ze) = l_ze ≈ l_(y−x) = l_y ≈ l_q, as required.

Choice of d_0. Another important issue that requires verification is that the d_0 supplied by the user actually shares the top half with the decryption exponent d, and does not get changed by the operations performed in Algorithm 1. We prove the following result in this direction.
Theorem 1. The output d generated by KeyGenAlgoMSB(e, l_p, l_q, d_0) shares the top (1/2)l_N − l_e bits with the input d_0.

Proof. Note that we can write d = d_0 + d_1 with d_1 = d_pad + d̃1, where we have chosen d_pad such that l_dpad < (1/2)l_d0 = (1/2)l_d. Again, we have

|e·d̃1| = |e·d̃0 − 1 − k(p − 1)w| = |e·d̃0 − 1 − k(p − 1)(x + ze)| = |e·d̃0 − 1 − k(p − 1)(y + ỹ)|,

where ỹ = ze + x − y = ⌊(1/e)(y − x)⌋·e − (y − x), so |ỹ| < e. Thus, we obtain the following:

|e·d̃1| ≤ |e·d̃0 − 1 − k(p − 1)y| + |k(p − 1)ỹ|
        < |e·d̃0 − 1 − k(p − 1)y| + |k(p − 1)e|
        = |k(p − 1)| · |(e·d̃0 − 1)/(k(p − 1)) − y| + |k(p − 1)e|
        < |k(p − 1)| + |k(p − 1)e| = (e + 1)|k(p − 1)|,

and hence l_d̃1 ≤ (l_e + l_k + l_p) − l_e = l_k + l_p = l_e + l_p ≈ l_e + (1/2)l_d, in cases where l_p ≈ l_q. Combining these, we get l_d1 = max(l_dpad, l_d̃1) ≤ l_e + (1/2)l_d. Thus, d_0 represents the (1/2)l_N − l_e many most significant bits of d correctly.

2.2 Efficiency of Cryptosystem 2
We have already mentioned that for all practical applications CRT-RSA is implemented, as it is more efficient than RSA. Also, we accept that our implementation does not compete with CRT-RSA in terms of efficiency. Thus our discussion of efficiency here is in comparison with standard RSA. The encryption phase is the same as that of regular RSA and hence the efficiency is identical to that of a regular RSA scheme using Small e. The main advantage comes in the case of decryption. As we have chosen d_0 = 2^(l_N − 1) in our key generation algorithm and as l_d1 ≈ l_N/2 + l_e, the top half of d is all 0's except for the 1 at the MSB. Also, in the lower half of length l_d1, we have about (1/2)l_d1 many 1's and the rest 0's on average. Now, we know that a 0 in d corresponds to just a squaring and a 1 corresponds to a squaring and a multiplication in the regular square-and-multiply algorithm used for modular exponentiation in RSA. Thus, the number of computations in the decryption phase will be as follows.

– Regular computation for the bottom half: (1/2)l_d1 multiplications and l_d1 squarings.
– Just squaring for the top half: l_N/2 − l_e − 1 squarings.
– Regular computation for the 1 at the MSB: 1 multiplication and 1 squaring.

Assume that the cost of one modular squaring is equivalent to μ times the cost of one modular multiplication. Hence, the total number of modular multiplications in the decryption phase is
$$\frac{1}{2}l_{d_1} + \mu l_{d_1} + \mu\left(\frac{l_N}{2} - l_e - 1\right) + (1+\mu) = \frac{l_N}{4} + \mu l_N + \frac{l_e}{2} + 1 \approx \left(\mu + \frac{1}{4}\right)l_N + \frac{l_e}{2},$$
S. Maitra, S. Sarkar, and S. Sen Gupta
whereas the same in the case of regular RSA (considering that half of the bits of d are 1 on average) is $(\mu + \frac{1}{2})l_d = (\mu + \frac{1}{2})l_N$, as $l_d = l_N$ in general for Small e. Thus, we obtain an advantage (in terms of a reduced number of operations) of
$$1 - \frac{(\mu + \frac{1}{4})l_N + \frac{l_e}{2}}{(\mu + \frac{1}{2})l_N} = \frac{1 - \frac{2l_e}{l_N}}{2(2\mu + 1)}.$$
Asymptotically, one can neglect $\frac{2l_e}{l_N}$, and hence we get a speed-up of the order of $\frac{1}{2(2\mu+1)}$. When $\mu = 1$, we get an advantage of 16.67% in the decryption phase, and the advantage increases if $\mu < 1$ in practice. Considering a practical scenario with $l_N = 1024$ and $e = 2^{16}+1$, the advantage is 16.11% for $\mu = 1$. Our result provides a similar kind of improvement as that of [7, Section 2.1]. Moreover, the algorithm in [7] could not achieve the asymptotic improvement of 16.67% in practice, whereas the algorithm proposed here reaches it. Hence, in the sense of practical implementation, our algorithm betters that of [7]. Since all the exponentiation operations are modular, it is also important to see how the mod N part in the calculation of $v^2 \bmod N$ or $uv \bmod N$ can be efficiently executed for $u, v \in \mathbb{Z}_N$. It has been pointed out by Lenstra [5] that the operations become efficient when N is of the form $N = 2^{l_N - 1} + t$ for some positive integer t which is significantly smaller than N. Around 30% improvement may be achieved for encryption and decryption with such 1024-bit RSA moduli. During the setup of RSA in this case, one of the primes p is chosen at random and the other one is constructed cleverly so that N gains its special form. Since our method chooses both primes p, q at random depending on the choice of $d_0$, our result does not impose any constraints on N, and hence improvement along the lines of [5] may not be immediately incorporated in our scheme.
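The square-and-multiply cost accounting used throughout this section can be checked with a small sketch (the function below is ours, for illustration only; it counts one squaring per exponent bit and one extra multiplication per 1-bit):

```python
def modexp_count(base, exp, mod):
    """Left-to-right square-and-multiply; returns the result together with
    the number of modular squarings and multiplications performed."""
    result, squares, mults = 1, 0, 0
    for bit in bin(exp)[2:]:                   # scan exponent bits, MSB first
        result = (result * result) % mod       # every bit costs one squaring
        squares += 1
        if bit == '1':                         # a 1-bit also costs one multiplication
            result = (result * base) % mod
            mults += 1
    return result, squares, mults

# an exponent whose top half is 0...0 below a single leading 1, as in
# Cryptosystem 2, needs only squarings over that region
r, s, m = modexp_count(7, 0b1000000001011, 2**16 + 1)
```

Here the 13-bit exponent costs 13 squarings but only 4 multiplications (one per 1-bit), which is exactly the saving exploited by choosing $d_0 = 2^{l_N-1}$.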
2.3   Security of Cryptosystem 2
We have already discussed in Section 1 (Fact 1) that in the case of RSA with Small e, the top half of the decryption exponent, that is $d_0$, can be obtained without much effort. Hence, choosing a specific $d_0$ in our scheme does not leak any extra information about the system. Thus, it is quite reasonable to claim that the security of our Efficient-RSA cryptosystem is equivalent to the security of a regular RSA system having a Small encryption exponent e. Another observation is that we are constructing one of the primes (q) in the algorithm based on the chosen $d_0, d_{pad}, p, k$ and e. A natural question is whether this construction makes the prime q special instead of a random prime, as is expected in RSA. We claim that the prime q constructed in Algorithm 1 is a random prime of length $l_q$. The following arguments support our claim.

– One may notice that $d_0$ is chosen to be of a specific form, whereas $d_{pad}$ is a random integer of length $\frac{1}{2}l_{d_0}$ or less. This makes the lower half of $\tilde{d}_0$ random, but the top half shares the same structure as that of $d_0$.
Publishing Upper Half of RSA Decryption Exponent
– Next, we choose p to be a random prime of length $l_p$ and k to be a random integer co-prime to e. Hence, the inverse $[k(p-1)]^{-1} \bmod e$ is random in the range $[1, e-1]$, and so is x.
– Let us now assume that $e\tilde{d}_0 - 1$ is of a specific structure. The reader may note that actually the lower half of this quantity is random, but the assumption of non-randomness just poses a stronger constraint on our argument. In this case, as $k(p-1)$ is totally random, we obtain y, and hence z, to be random numbers as well.
– The argument above justifies the randomness of w, by construction, and hence the randomness of q.

One may also wonder whether p and q are mutually independent random primes, or whether they possess any kind of interdependency due to the choices we made. The following arguments justify the mutual independence of p and q.

– Note that in Algorithm 1, we have the following two approximate relations: $q - 1 = w \approx x + \lfloor\tfrac{1}{e}(y-x)\rfloor \cdot e = y$, and $y \approx \frac{1}{k(p-1)}(e\tilde{d}_0 - 1)$.
– Hence, we have another approximate relation $k(p-1)(q-1) \approx e\tilde{d}_0$, where p is a random prime of size $\frac{1}{2}l_N$, the parameter k is random with $l_k = l_e$, and the bottom half of $e\tilde{d}_0$ is random ($ed_{pad}$) of size approximately $\frac{1}{2}l_N$.
– Now, notice that the relation $k(p-1)(q-1) \approx e\tilde{d}_0$, i.e., $(p-1)(q-1) \approx \frac{e}{k}\tilde{d}_0$, as discussed above, is analogous to the relation $pq = N$ where the top half of N is predetermined and p is chosen at random. This setup is precisely the one proposed by Lenstra [5] to make RSA modular operations faster by fixing a special structure of N. In the case of [5], the primes p and q are not related in any sense, and following a similar logic, the primes in our setup are mutually independent random primes as well.

2.4   Runtime of Setup in Cryptosystem 2
The runtime to set up the proposed RSA scheme is dominated by the runtime of key generation (Algorithm 1), which in turn depends on the probabilistic analysis of its success. If we take a look at the if condition in Step 9 of Algorithm 1, the probability of meeting the condition is heuristically of the order of $\frac{1}{\log N}$. This is due to the fact that w, of size $N^{0.5}$, is constructed to be almost random, and the density of primes of that size is of the order of $\frac{1}{\log \sqrt{N}}$, that is, $\Omega(\frac{1}{\log N})$. Thus, the expected number of iterations of the algorithm is $O(\log N)$. What we have discussed in this section gives us the power to choose the top half of d, that is $d_0$, on our own. One may use this idea to implement a personalized RSA system, as described in the following section.
3   Personalized RSA Scheme
Here we explore the freedom of choosing $d_0$ to obtain a personalized RSA implementation. Let us first discuss the motivation for this scheme.

Motivation. We are all acquainted with the idea of the Domain Name System (DNS) in the context of the Internet. According to Wikipedia, "The Domain Name System (DNS) is a hierarchical naming system for computers, services, or any resource connected to the Internet or a private network. It associates various information with domain names assigned to each of the participants." In this context, our motivation is to propose a personalized scheme for a public key cryptosystem which, in some sense, is similar to the structure of DNS. This scheme will associate, with the RSA keys of a participant, various information about the participant himself. Moreover, it will translate some identification information of a participant meaningful to humans into a binary string and embed this into the RSA keys of the participant. The only question is how to achieve this task. We describe the scheme in detail as follows.

The Idea. Let us choose to use a Small encryption exponent e for our scheme, say $e = 2^{16}+1$, as it is the most widely used value in practice. In such a case, as we have discussed earlier in Section 1 (Fact 1), one can easily estimate the top half of the decryption exponent d accurately. If that is so, why not let the top half of d be published, without affecting the security of RSA? Moreover, if one chooses the top half of d on one's own, embeds personal (not private) information, and publishes it along with the public key, then we can implement a personalized notion of associating public keys with users.

Personalized-RSA. Here the top half of the decryption exponent is chosen by the user to make the RSA keys personal. This applies only to the case with Small e. The user can fix Small e, the sizes of the primes p, q, and the personal top half $d_0$ of the decryption exponent d to be used.
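The estimate invoked from Fact 1 can be checked on toy parameters (the primes and exponent below are illustrative and insecure): since $ed = 1 + k\varphi(N)$ for some $k < e$, and $\varphi(N) = N - (p+q-1)$, each candidate $\lfloor (kN+1)/e \rfloor$ differs from d by at most about $k(p+q)/e < p+q \approx \sqrt{N}$, so one of the $e-1$ candidates agrees with d on its upper half:

```python
# Toy illustration of Fact 1: for Small e, the top half of d is easy to guess.
p, q, e = 1009, 1013, 17          # toy, insecure parameters
N, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)               # decryption exponent (Python 3.8+)

# ed = 1 + k*phi for some k < e, and phi = N - (p+q-1), so d ~ (k*N + 1)/e.
candidates = [(k * N + 1) // e for k in range(1, e)]
best = min(abs(d - c) for c in candidates)
assert best < p + q               # error is only of the order of sqrt(N)
```

For 1024-bit N the same reasoning exposes roughly the top 512 bits of d for anyone who tries all $e-1$ values of k.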
The RSA keys for our scheme are $\langle N, e, d_0 \rangle$ and $\langle N, d \rangle$, obtained from the output of KeyGenAlgoMSB, and the encryption and decryption are similar to regular RSA. A formal description of our proposed scheme is given in Cryptosystem 3.

Cryptosystem 3 (Personalized-RSA). Choose a Small integer e as encryption exponent and choose $l_p, l_q$, the bit-sizes of the RSA primes. Choose a personal $d_0$ and embed user information (nothing secret) within $d_0$.

– KeyGen: $(p, q, N, d) \leftarrow$ KeyGenAlgoMSB$(e, l_p, l_q, d_0)$.
– KeyDist: Publish public key $\langle N, e, d_0 \rangle$ and keep private key $\langle N, d \rangle$ secret.
– Encrypt: For plaintext $M \in \mathbb{Z}_N$, ciphertext $C = M^e \bmod N$.
– Decrypt: For ciphertext C, plaintext $M = C^d \bmod N$.
The correctness, key-size and runtime analyses of Cryptosystem 3 go along the same lines as for Cryptosystem 2, and hence are omitted to avoid duplication.
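As a concrete (hypothetical) encoding for the personal $d_0$, one could place an identity string just below the forced MSB; the packing below is our own simple big-endian sketch, not part of the scheme's specification:

```python
# Hypothetical packing of a public identity string into the top half of d0.
l_N = 1024
info = b"alice@example.org"               # public, non-sensitive identity
bits = 8 * len(info)
assert bits <= l_N // 2 - 17              # stay shorter than min(lp,lq) - le
shift = l_N - 1 - bits                    # place info right below the MSB
d0 = (1 << (l_N - 1)) | (int.from_bytes(info, "big") << shift)

# anyone can read the identity back from the published d0
recovered = ((d0 >> shift) & ((1 << bits) - 1)).to_bytes(len(info), "big")
assert recovered == info
```

KeyGenAlgoMSB then preserves these top bits of $d_0$ in the actual decryption exponent, by Theorem 1.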
Points to Note. There is no guaranteed efficiency improvement in this case, as the structure of $d_0$ may be arbitrary as per the choice of the user. We do not compromise the security of the RSA system by publishing $d_0$ because, in the case with Small e, the top half $d_0$ can very easily be estimated anyway. While choosing a personal $d_0$, the user must keep the following in mind.

– It is not necessary to choose $d_0$ of length exactly $l_p + l_q$, as this will be corrected in the key generation algorithm anyway.
– It is important to keep the embedded information shorter than $\frac{1}{2}(l_p + l_q)$, because the lower half of d will be modified by KeyGenAlgoMSB. The suggested length of the embedded information is $\min(l_p, l_q) - l_e$ or shorter.

We would also like to clarify that this personalized scheme does not offer any cryptographic identification or verification facility for the users. As the encryption exponent e is Small, anyone can obtain the top half $d_0$ of the decryption exponent and run KeyGenAlgoMSB using it. Thus, there is obviously a risk that Oscar can generate and use an RSA system where his $d_0$ is identical to the $d_0$ of Alice or Bob. But this is similar to faking the Domain Name to IP Address correspondence in the case of DNS, and cannot be prevented right away. And of course, for obvious reasons of privacy, one should not embed sensitive information into $d_0$, as it is published in our scheme.

Potential Benefits. One may ask what benefits this Personalized-RSA scheme has to offer over the general RSA scheme used in practice. We would like to propose the following two ideas as potential benefits of Personalized-RSA.

– In the RSA cryptosystem, the public key is bound to the implementor using certificates issued by trusted authorities. In general, the public key is embedded within the certificate along with other authentication parameters.
This application provides a good motivation to reduce the size of the RSA public key, and a lot of research has been undertaken in this direction. One may refer to [15] for an idea proposed by Vanstone and Zuccherato, which was later broken using the idea of Coppersmith [3]. In the case of Personalized-RSA, the public key $\langle N, e, d_0 \rangle$ need not be completely embedded in the certificate, as $d_0$ is based on some well-known public identity of the user, say the email id or name. This reduces the size of the RSA public key and certificate.
– More importantly, one may note that the user creating the RSA framework must store the decryption key $\langle N, d \rangle$ as a secret parameter. In the proposed Personalized-RSA scheme, one just needs to store $\langle N, d_1 \rangle$, as the other part $d_0$ becomes a part of the public key. This considerably reduces the storage cost for the RSA secret key. Another idea for generating small RSA keys may be found in [12].

In the next section, we present an alternative application of our method of choosing portions of d, towards efficient RSA decryption in low-end devices with the help of a server.
4   Server Aided Scheme: Choosing the Bottom Half of d
As mentioned earlier, there are many existing results on Server-Aided RSA (one may refer, e.g., to [8,9]), and we are not presenting a competing scheme in that respect. The advantage of our simple approach is that it uses only the RSA primitive and nothing else, unlike the existing schemes [8,9]. Consider a situation where Alice is using a hand-held device or a smart-card with low computing power. The term $d_0$ is not kept secret for Small e, and hence some third party may share part of the decryption load.

The Scheme. Alice chooses Small e and fixes the lengths of the primes, $l_p$ and $l_q$. She also chooses $d_1$ cleverly so that the Hamming weight of $d_1$ is small, but still large enough to prevent exhaustive search. The Hamming weight of $d_1$ should be considerably lower than that in the random case, so as to make decryption faster. Now, the participants Alice (low-end device), Paulo (server) and Bob (sender) behave as follows.

– KeyGen: Alice creates the keys using Algorithm 2 with input $(e, l_p, l_q, d_1)$.
– KeyDist: Define $d_0 = d - d_1$. Alice publishes the public key $\langle N, e \rangle$, gives Paulo the information $\langle N, d_0 \rangle$, and keeps her decryption key $\langle N, d_1 \rangle$ secret.
– Encryption: Bob encrypts plaintext $M \in \mathbb{Z}_N$ as $C = M^e \bmod N$.
– Server: Paulo computes $V = C^{d_0} \bmod N$, and sends $(V, C)$ to Alice.
– Decryption: Alice receives $(V, C)$ and computes $M = V \cdot C^{d_1} \bmod N$.

Let us present the key generation algorithm for the proposed scheme. Since the server will execute $V = C^{d_0} \bmod N$, we do not need any restriction on $d_0$, but we need to choose $d_1$ for efficiency. Once Alice fixes e and $d_1$, and decides upon the sizes of the primes p, q to be used, the key generation algorithm described in Algorithm 2 provides the solution.

4.1   Correctness of Algorithm 2
The correctness of the proposed scheme relies on the correctness of the key generation algorithm (Algorithm 2). One may note that
$$ed = ew \cdot 2^{l_q - l_e + 1} + ed_1 = (-ed_1 + 1 + k(p-1)(q-1)) + ed_1 = 1 + k(p-1)(q-1),$$
which gives the RSA equation $ed \equiv 1 \bmod \varphi(N)$. This proves the correctness of Algorithm 2, and hence of the proposed scheme.

Bitsize of Primes. Note that we obtain $l_q = l_z = l_y - 1$ from Algorithm 2. Also, $l_y$ is bounded by the size of $e \cdot 2^{l_q - l_e + 1}$ due to the modular operation while constructing y. Thus, $l_q = l_y - 1 = l_e + (l_q - l_e + 1) - 1 = l_q$, as expected.

Choice of $d_1$. Along similar lines to the analysis performed for Algorithm 1, one may also verify the following result. The proof is omitted to avoid duplication.

Theorem 2. The output d generated by KeyGenAlgoLSB$(e, l_p, l_q, d_1)$ shares the bottom $\frac{1}{2}l_N - l_e$ bits with the input $d_1$.
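The split decryption $M = V \cdot C^{d_1} \bmod N$ can be checked end-to-end on toy parameters (illustrative only; a real $d_1$ would be chosen with low Hamming weight as described above, and the split below is an arbitrary stand-in):

```python
# Toy check of the server-aided split: Paulo holds d0, Alice keeps d1 = d - d0.
p, q, e = 1009, 1013, 17                  # toy, insecure parameters
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))
d1 = d % 2**10                            # stand-in for Alice's small share
d0 = d - d1                               # published to the server

M = 123456
C = pow(M, e, N)                          # Bob encrypts
V = pow(C, d0, N)                         # Paulo's (heavy) share of the work
assert (V * pow(C, d1, N)) % N == M       # Alice finishes the decryption
```

Correctness follows immediately from $d_0 + d_1 = d$, so $C^{d_0} \cdot C^{d_1} = C^d \bmod N$.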
Input: Encryption exponent e, sizes of primes $l_p, l_q$, and $d_1$ with $l_{d_1} = \frac{1}{2}l_N$
Output: RSA parameters N, p, q, d
1: Choose a random prime p with $l_p = \frac{1}{2}l_N$, $\gcd(p-1, e) = 1$ and $\frac{1}{2}(p-1)$ odd;
2: Choose a random integer k with $l_k = l_e$ such that $\gcd(k, 2e) = 1$;
3: Set $x \leftarrow \left[\frac{1}{2}k(p-1)\right]^{-1} \bmod (e \cdot 2^{l_q - l_e + 1})$;
4: Set $y \leftarrow [(ed_1 - 1) \cdot x] \bmod (e \cdot 2^{l_q - l_e + 1})$;
5: Set $z \leftarrow \frac{1}{2}y + 1$;
6: if z is prime and $l_z = l_q$ then GOTO Step 9;
7: end
8: GOTO Step 2;
9: Set $q \leftarrow z$;
10: Set $N = pq$;
11: Set $w \leftarrow \frac{1}{e \cdot 2^{l_q - l_e + 1}}\left(-ed_1 + 1 + k(p-1)(q-1)\right)$;
12: Set $d \leftarrow w \cdot 2^{l_q - l_e + 1} + d_1$;
13: RETURN N, p, q, d;

Algorithm 2. The Key Generation Algorithm (KeyGenAlgoLSB)
4.2   Efficiency of the Protocol
The encryption phase is the same as that of regular RSA, and hence the efficiency is identical to that of a regular RSA scheme using Small e. However, a substantial advantage can be achieved in decryption. As we have already noted in Section 2.2, the decryption cost for regular RSA is $(\mu + \frac{1}{2})l_N$ modular multiplications, assuming that the cost of one modular squaring is equivalent to $\mu$ times the cost of one modular multiplication. So the total number of bit operations is $(\mu + \frac{1}{2})l_N^3$. In this case, we are not keen to reduce the load of the server, but that of the hand-held device. Thus, let us check the decryption cost for Alice. The computation performed by Alice during decryption consists of $\frac{1}{2}l_N - l_e$ modular squarings (the length of $d_1$) and $w_1$ modular multiplications (the number of 1's in $d_1$). Hence, the total number of modular multiplications (counting squarings with weight $\mu$) in the decryption phase for Alice is $\mu(\frac{1}{2}l_N - l_e) + w_1$. Thus, Alice obtains an advantage (in terms of a reduced number of operations) of
$$1 - \frac{\mu\left(\frac{1}{2} - \frac{l_e}{l_N}\right) + \frac{w_1}{l_N}}{\mu + \frac{1}{2}} = \frac{\mu + 1 + \frac{2\mu l_e}{l_N} - \frac{2w_1}{l_N}}{2\mu + 1}.$$
Considering a practical scenario with $l_N = 1024$, $e = 2^{16}+1$ and $w_1 = 40$, the advantage is 65.17% for $\mu = 1$. Asymptotically, one can neglect $\frac{l_e}{l_N}$ and $\frac{w_1}{l_N}$, and hence we get a speed-up of the order of $\frac{\mu+1}{2\mu+1}$. When $\mu = 1$, we get an advantage of 66.67% in the decryption phase, and the advantage increases when
$\mu < 1$ (in practice, it is sometimes considered that squaring requires less effort than multiplication).

Along the lines of CRT-RSA [10,18], Alice can calculate $C^{d_1} \bmod N$ using the Chinese Remainder Theorem (CRT). She first calculates $C_p \equiv C^{d_1} \bmod p$ and $C_q \equiv C^{d_1} \bmod q$. From $C_p$ and $C_q$, one can easily obtain $C^{d_1} \bmod N$ using CRT. Since $l_p = l_q = \frac{1}{2}l_N$, in this situation Alice needs to perform
$$2\left[\mu\left(\frac{1}{2}l_N - l_e\right) + w_1\right]\left(\frac{l_N}{2}\right)^2 = \left(\frac{\mu}{4} - \frac{\mu l_e}{2l_N} + \frac{w_1}{2l_N}\right)l_N^3$$
bit operations. Considering a practical scenario with $l_N = 1024$, $e = 2^{16}+1$ and $w_1 = 40$, the advantage is 82.58% for $\mu = 1$. Neglecting $\frac{l_e}{l_N}$ and $\frac{w_1}{l_N}$, one can achieve an advantage of
$$1 - \frac{\mu/4}{\mu + 1/2} = \frac{3\mu + 2}{4\mu + 2}.$$
When $\mu = 1$, an asymptotic advantage of 83.33% can be obtained during decryption. In the next subsection we perform the security analysis of the proposed scheme and argue why the aforementioned choice of parameters is secure.
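Alice's CRT computation of $C^{d_1} \bmod N$ can be sketched as follows (toy parameters; Garner-style recombination; Alice knows p and q since she ran the key generation):

```python
# Toy sketch of computing C^{d1} mod N via CRT on Alice's side.
p, q, e = 1009, 1013, 17                  # toy, insecure parameters
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))
d1 = d % 2**10                            # stand-in for Alice's share
C = pow(123456, e, N)

Cp = pow(C % p, d1 % (p - 1), p)          # half-size exponent and modulus
Cq = pow(C % q, d1 % (q - 1), q)
h = ((Cq - Cp) * pow(p, -1, q)) % q       # Garner recombination
assert Cp + p * h == pow(C, d1, N)
```

Each of the two exponentiations works with a half-size modulus, which is the source of the quadratic saving per multiplication counted above.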
4.3   Security of the Protocol
Similar to the case of Cryptosystem 2, the prime q generated here is a random prime of size $l_q$, and does not seem to possess any special form. Notice that the randomness of q in Algorithm 1 was generated from a random choice of $d_{pad}$, carrying more information than the choice of $d_1$ in Algorithm 2. This apparently hints that the randomness of q in Algorithm 2 is lower than that in Algorithm 1. But a closer observation reveals that the randomness of q in Algorithm 2 also depends on the randomness of x, which in turn depends on the randomness of p and k, carrying similar information as $d_{pad}$ in the earlier case. Thus the randomness of q is comparable in both algorithms. Moreover, the random choice of k and $d_1$ guarantees the independence of p and q in Algorithm 2, which is much desired in this kind of setup. Also, as d is of the order of N, many known attacks [18,16,1,17] will not work in our case.

Now assume $d_1' \equiv d \bmod 2^{\frac{1}{2}l_N}$, i.e., $d_1'$ represents the lower half of the bits of d. One may verify that the $\frac{1}{2}l_N - l_e$ least significant bits of $d_1$ and $d_1'$ are the same. When e is Small and $d_1$ is known, the number of possible options for $d_1'$ is very small. So we may assume that knowing $d_1$ is almost equivalent (up to a little extra search) to knowing $d_1'$. One must note that while choosing $d_1$ for Small e, there is a risk of a brute force attack on this portion, and as the top half is known by default (Fact 1), this may make the system vulnerable. We refer to the following results by Boneh et al. [2, Theorems 3.1, 3.3] for the security analysis of this proposal.

Fact 2. Suppose $N \equiv 3 \bmod 4$ and $e < \frac{1}{8}N^{0.25}$. Then given $\frac{1}{4}\log_2 N$ least significant bits of d, one can factor N in time polynomial in $\log N$ and e.
Fact 3. Suppose $|p - q| > \sqrt[4]{N}$ and $e < \frac{1}{8}N^{0.25}$. Then given $\frac{1}{4}\log_2 N$ bits of d in the positions (from the least significant side) $\frac{1}{4}\log_2 N$ to $\frac{1}{2}\log_2 N$, one can factor N in time polynomial in $\log N$ and e.

Since in our case e is Small, we should choose $d_1$ in such a way that it is computationally infeasible to find the lower half of $d_1$ (due to Fact 2). We should also choose $d_1$ such that it is computationally infeasible to find the upper half of $d_1$ (due to Fact 3). Thus one should choose $d_1$ with weight $w_1$ so that neither half of the bit pattern of $d_1$ can be searched exhaustively.

Let us illustrate the situation with a practical example for 1024-bit RSA. In the case of our scheme, half of $d_1$ is of size $\frac{1}{2}(\frac{1024}{2} - 16) = 248$, and we know that the LSB must be a 1. Let us choose $w_1 \approx 40$ and assume that the 1's are distributed uniformly at random over the length of $d_1$. So, there are 19 possible places for 1's out of 247 in the lower half of $d_1$, and the rest are 0's. The same holds for the top half of $d_1$. A brute force search for these bit positions will result in a computational search complexity of $\binom{247}{19} \approx 2^{93}$. For comparison, the Number Field Sieve (NFS) [6] is the fastest known factorization algorithm, and requires around $2^{86}$ time to factor a 1024-bit RSA modulus. Hence, choosing $w_1 \approx 40$ in the case of 1024-bit RSA with $e = 2^{16}+1$ suffices for security, provided the 1's are more or less uniformly distributed over the length of $d_1$.

In this direction, we would like to mention that the security of this scheme is comparatively weaker than the security of Efficient-RSA. If one knows t LSBs of d, then t LSBs of the primes p, q are compromised as well. Knowing these t LSBs is easier in this scheme, as $d_1$ is constructed with low Hamming weight, whereas $d_{pad}$ in the Efficient-RSA scheme was chosen at random.
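The quoted search complexity follows directly from the binomial count:

```python
from math import comb, log2

# 19 one-positions among the 247 free bits of the lower half of d1
space = comb(247, 19)
assert round(log2(space)) == 93           # ~2^93, above the ~2^86 NFS estimate
```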
4.4   Runtime of Setup
The runtime to set up the proposed Server-Aided scheme is dominated by the runtime of key generation (Algorithm 2). The if condition in Step 6 of Algorithm 2 is satisfied with probability of the order of $\frac{1}{\log N}$. This is due to the fact that z, of size $N^{0.5}$, is constructed to be almost random, and the density of primes of that size is of the order of $\frac{1}{\log \sqrt{N}}$, that is, $\Omega(\frac{1}{\log N})$. Thus, the expected number of iterations of Algorithm 2 during the setup is $O(\log N)$.
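The $\frac{1}{\log N}$ heuristic can be sanity-checked empirically: the fraction of primes among random odd integers near $2^{30}$ should be close to $2/\ln(2^{30}) \approx 0.096$ (Miller-Rabin sketch with illustrative parameters of our own choosing):

```python
import random
from math import log

random.seed(1)

def is_probable_prime(n, trials=20):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for sp in (2, 3, 5, 7, 11, 13):       # small-prime shortcut
        if n % sp == 0:
            return n == sp
    d, s = n - 1, 0
    while d % 2 == 0:                     # write n-1 = d * 2^s with d odd
        d //= 2
        s += 1
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False                  # a is a witness: n is composite
    return True

samples = [random.randrange(2**29, 2**30) | 1 for _ in range(2000)]
rate = sum(map(is_probable_prime, samples)) / len(samples)
# expected density among odds near 2^30: 2/ln(2^30) ≈ 0.096
assert abs(rate - 2 / log(2**30)) < 0.04
```

The observed rate tracks $2/\ln N$ closely, consistent with an expected $O(\log N)$ number of iterations of the loop in Algorithm 2.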
5   Conclusion
In this paper, we have taken up a known but highly understated fact that half of the decryption exponent d can be obtained in case of RSA implementations with Small encryption exponent e. We have deviated from using this in cryptanalysis of RSA, and have tried to exploit this in a constructive fashion. In this direction, we proposed a couple of key generation algorithms, and illustrated their implications through a few proposed RSA schemes, as follows. – Efficient-RSA: One can choose the upper half of the decryption exponent d, and obtain certain advantages in the decryption phase over natural RSA.
– Personalized-RSA: One can personalize the upper half of the decryption exponent d and publish it when e is Small, to create a DNS-like RSA convention.
– Server Aided Scheme: One can choose the lower half of d and publish the upper half of d (for Small e) so that a server can help decrease the computation cost during RSA decryption.

The uniqueness of our approach lies in the novelty of the motivation as well as the simplicity of the proposed schemes, which, in most cases, use just the basic RSA primitive and nothing else.

Acknowledgments. The authors are grateful to the anonymous reviewers for their invaluable comments and suggestions. The third author would like to acknowledge the Department of Information Technology (DIT), India, for supporting his research at the Indian Statistical Institute.
References

1. Boneh, D., Durfee, G.: Cryptanalysis of RSA with Private Key d Less Than N^0.292. IEEE Transactions on Information Theory 46(4), 1339–1349 (2000)
2. Boneh, D., Durfee, G., Frankel, Y.: Exposing an RSA Private Key Given a Small Fraction of its Bits, http://crypto.stanford.edu/~dabo/abstracts/bits_of_d.html
3. Coppersmith, D.: Small Solutions to Polynomial Equations, and Low Exponent RSA Vulnerabilities. Journal of Cryptology 10(4), 233–260 (1997)
4. Galbraith, S., Heneghan, C., McKee, J.: Tunable Balancing RSA, http://www.isg.rhul.ac.uk/~sdg/full-tunable-rsa.pdf
5. Lenstra, A.: Generating RSA Moduli with a Predetermined Portion. In: Ohta, K., Pei, D. (eds.) ASIACRYPT 1998. LNCS, vol. 1514, pp. 1–10. Springer, Heidelberg (1998)
6. Lenstra, A.K., Lenstra Jr., H.W.: The Development of the Number Field Sieve. Springer, Heidelberg (1993)
7. Maitra, S., Sarkar, S.: Efficient CRT-RSA Decryption for Small Encryption Exponents. In: Pieprzyk, J. (ed.) CT-RSA 2010. LNCS, vol. 5985, pp. 26–40. Springer, Heidelberg (2010)
8. Matsumoto, T., Kato, K., Imai, H.: Speeding up Secret Computations with Insecure Auxiliary Devices. In: Goldwasser, S. (ed.) CRYPTO 1988. LNCS, vol. 403, pp. 497–506. Springer, Heidelberg (1990)
9. Nguyen, P.Q., Shparlinski, I.: On the Insecurity of a Server-Aided RSA Protocol. In: Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248, pp. 21–25. Springer, Heidelberg (2001)
10. Quisquater, J.-J., Couvreur, C.: Fast Decipherment Algorithm for RSA Public-Key Cryptosystem. Electronics Letters 18, 905–907 (1982)
11. Rivest, R.L., Shamir, A., Adleman, L.: A Method for Obtaining Digital Signatures and Public Key Cryptosystems. Communications of the ACM 21(2), 158–164 (1978)
12. Sakai, R., Morii, M., Kasahara, M.: New Key Generation Algorithm for RSA Cryptosystem. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 77(1), 89–97 (1994)
13. Stinson, D.R.: Cryptography: Theory and Practice, 2nd edn. Chapman & Hall/CRC (2002)
14. Sun, H.-M., Hinek, M.J., Wu, M.-E.: On the Design of Rebalanced RSA-CRT, http://www.cacr.math.uwaterloo.ca/techreports/2005/cacr2005-35.pdf
15. Vanstone, S.A., Zuccherato, R.J.: Short RSA Keys and Their Generation. Journal of Cryptology 8(2), 101–114 (1995)
16. Verheul, E., van Tilborg, H.: Cryptanalysis of Less Short RSA Secret Exponents. Applicable Algebra in Engineering, Communication and Computing 18, 425–435 (1997)
17. de Weger, B.: Cryptanalysis of RSA with Small Prime Difference. Applicable Algebra in Engineering, Communication and Computing 13, 17–28 (2002)
18. Wiener, M.: Cryptanalysis of Short RSA Secret Exponents. IEEE Transactions on Information Theory 36(3), 553–558 (1990)
PA1 and IND-CCA2 Do Not Guarantee PA2: Brief Examples Yamin Liu, Bao Li, Xianhui Lu, and Yazhe Zhang State Key Laboratory of Information Security, Graduate University of Chinese Academy of Sciences, Beijing, China {ymliu,lb,xhlu,yzzhang}@is.ac.cn
Abstract. We give several examples to show that PA1 and IND-CCA2 together do not guarantee PA2 in the absence of random oracles, for both statistical and computational PA. In the statistical case, we use the Desmedt-Phan hybrid encryption scheme as the first example. If the DEM of the Desmedt-Phan hybrid encryption is an IND-CCA2 symmetric encryption without MAC, then the Desmedt-Phan hybrid is IND-CCA2 and statistical PA1 but not statistical PA2. Extending the result to the Cramer-Shoup hybrid encryption scheme, we find that even statistical PA1+ and IND-CCA2 together cannot reach statistical PA2. In the computational case, we give an artificial example to show that neither statistical nor computational PA1 together with IND-CCA2 can guarantee computational PA2.

Keywords: Provable Security, Asymmetric Encryption, Plaintext Awareness, IND-CCA2.
1   Introduction
In this paper we give several examples to show the gap between PA2 plaintext awareness and CCA2 security. We start by reviewing existing work and giving the motivation for our work.

1.1   Background
The notion of plaintext awareness (PA) for asymmetric encryption schemes was introduced by Bellare and Rogaway under the random oracle model [4] and then refined by Bellare et al. in [2]; PA without random oracles was defined by Bellare and Palacio in [3]. A similar but weaker notion called plaintext simulatability was proposed by Fujisaki in [12]. Besides, Herzog et al. also tried to define PA without random oracles [13].
Supported by the National Natural Science Foundation of China (No.60673073), the National High-Tech Research and Development Plan of China (863 project) (No.2006AA01Z427) and the National Basic Research Program of China (973 project) (No.2007CB311201).
I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 40–54, 2010. c Springer-Verlag Berlin Heidelberg 2010
At the beginning, PA was proposed to help prove the indistinguishability against chosen ciphertext attack (IND-CCA) of asymmetric encryption schemes. In [3], three hierarchical definitions of PA were provided: PA0, PA1, and PA2, and the relationship between PA and IND-CCA was proved, namely, PA1 + IND-CPA ⇒ IND-CCA1, and PA2 + IND-CPA ⇒ IND-CCA2. Later, Teranishi and Ogata proved that PA2 + OW ⇒ IND-CCA2 [17]. Informally, an asymmetric encryption scheme is plaintext aware if the creator of a valid ciphertext already knows the corresponding plaintext, or in other words, if there is a plaintext extractor which can extract the plaintext from a ciphertext given only the public key and the internal coin-flips of the ciphertext creator. If a scheme fulfils PA, then a decryption oracle in the IND-CCA model would be useless to the adversary. The definition of PA1 involves only two roles, a ciphertext creator and a plaintext extractor. The case for PA2 is more complicated: a plaintext creator is also allowed, and it provides the ciphertext creator with ciphertexts for which the ciphertext creator does not know the corresponding plaintexts. Eventually, it turns out that PA2 without random oracles seems harder to achieve than IND-CCA2. However, PA is still an important notion of independent interest, not only because it gives insight into the provable security of encryption schemes, but also because it is required in applications, e.g., the deniable authentication protocol of Raimondo et al. [16]. PA can also be classified as perfect/statistical/computational PA, depending on the decryption power of the plaintext extractor.
Perfect PA requires that the plaintext extractor decrypt exactly like the decryption algorithm, with zero error, while statistical/computational PA requires that the plaintext extractor and the decryption algorithm be statistically/computationally indistinguishable, i.e., in the statistical case, the plaintext extractor should give correct decryptions with overwhelming probability, while in the computational case, the outputs of the plaintext extractor and the decryption algorithm are just computationally indistinguishable. Statistical PA2 was proved strictly stronger than computational PA2 by Teranishi and Ogata in [17]. In [8] Dent showed that the Cramer-Shoup hybrid encryption scheme [6] was computational PA2 under the DHK assumption by proving it was PA1+ and encryption simulatable, which imply PA1 and IND-CCA2 security respectively. This is the first evidence that PA2 without random oracles is realistic. Later, Teranishi and Ogata proved that the Cramer-Shoup hybrid was statistical PA2 under the DHK assumption by proving it was equality-PA1 and computationally random-like, which also imply PA1 and IND-CCA2 respectively. Other ways of proving PA2 without random oracles were also invented. Birkett and Dent proposed a weaker variant of PA2 named PA2I, and proved that PA2I + IND-CPA ⇒ PA2 [1]. Recently, Jiang and Wang studied the PA security of hybrid encryption and introduced relation-based definitions of PA2 and IND-CCA2, called R-PA2 and R-IND-CCA2 respectively [14]. They proved that a key encapsulation mechanism (KEM) with R-PA2 and R-IND-CCA2 security, composed with a
one-time pseudorandom and unforgeable data encapsulation mechanism (DEM), can produce a PA2 hybrid encryption scheme, wherein R is a relation based on the data encapsulation mechanism (DEM) of the hybrid encryption scheme. Proving PA2 without introducing sophisticated notions is desirable, e.g., via the seemingly most natural route: whether PA1 + NM-CCA2 ⇒ PA2, or equivalently, PA1 + IND-CCA2 ⇒ PA2, since PA1 and IND-CCA2 are relatively easier to achieve than PA2. Thus, we examine this problem.

1.2   Our Contributions
In this paper, we give several examples to show that PA1 and IND-CCA2 together do not guarantee PA2 in the absence of random oracles, for both statistical and computational PA. In the statistical case, we obtain the result from more convincing natural examples rather than artificial ones. Firstly, we use the Desmedt-Phan hybrid encryption scheme [11], which is IND-CCA2 in the generic group model, as an example. If the data encapsulation mechanism used in the Desmedt-Phan hybrid encryption is an IND-CCA2 symmetric encryption scheme without a message authentication code (MAC), then the resulting hybrid scheme is IND-CCA2 and statistical PA1, but not statistical PA2. Then we reconsider the Cramer-Shoup hybrid encryption scheme [6], which is IND-CCA2 in the standard model and was previously proved, in [8,18], to be statistical PA2 if the underlying DEM is an Encrypt-then-MAC symmetric encryption scheme. Similarly, if the DEM is an IND-CCA2 symmetric encryption scheme without MAC, then the resulting Cramer-Shoup hybrid encryption scheme is IND-CCA2 and statistical PA1 (even PA1+), but no longer statistical PA2. For computational PA, we construct an artificial example. Given an asymmetric encryption scheme AE = (KG, E, D) which is IND-CCA2 and statistical PA2, we construct another asymmetric encryption scheme AE′ = (KG′, E′, D′), and show that AE′ is still IND-CCA2 and statistical PA1, but not computational PA2. That is, neither statistical nor computational PA1 together with IND-CCA2 can guarantee computational PA2.

Organization. The paper is organized as follows. In Section 2 we provide some notation and definitions. In Section 3 we recall the Desmedt-Phan hybrid encryption scheme and recapitulate its security proof. In Section 4 we give several examples to show that PA1 and IND-CCA2 do not guarantee PA2. Finally, Section 5 concludes the paper.
2
Preliminaries
PA1 and IND-CCA2 Do Not Guarantee PA2: Brief Examples

For a bit string x, |x| denotes its length. For a set S, |S| denotes its size, x ∈$ S means that x is a random element of S, and x ←$ S means that x is randomly chosen from S. For a randomized algorithm A, x ←$ A(·) means that x is assigned the output of A. An algorithm is efficient if it runs in polynomial time in its input length. By X1 =c X2 we mean that two distributions X1 and X2 are computationally indistinguishable, and =s denotes statistical indistinguishability. A function is negligible if it decreases faster than the inverse of any polynomial. When mentioning probabilistic polynomial time, we write PPT for short. ⊥ is the error symbol.

2.1
Security of Asymmetric Encryption
Here we recall some security notions for asymmetric encryption, mainly IND-CCA2 security and PA.

IND-CCA2 Security. The indistinguishability against adaptive chosen-ciphertext attack (IND-CCA2) is considered the standard security notion in the field of asymmetric encryption. For an asymmetric encryption scheme AE = (KG, E, D) and a PPT adversary A = (A1, A2), IND-CCA2 security is described by the following game:

  Exp^{IND-CCA2}_{AE,A}(1^k):
    (pk, sk) ←$ KG(1^k); (m0, m1) ←$ A1^{D(sk,·)}(pk)
    b ←$ {0, 1}; C* ←$ E(pk, mb); b′ ←$ A2^{D(sk,·)}(pk, C*)

It is required that |m0| = |m1|. A wins the game if b′ = b. Its advantage is defined as

  Adv^{IND-CCA2}_{AE,A}(1^k) = |Pr[b′ = b] − 1/2|.

Definition 1. (IND-CCA2) An asymmetric encryption scheme AE = (KG, E, D) is said to be IND-CCA2 secure if for all PPT adversaries A, Adv^{IND-CCA2}_{AE,A}(1^k) is negligible.

IND-CCA2 is equivalent to another security notion called non-malleability against adaptive chosen ciphertext attack (NM-CCA2). However, the definition of NM-CCA2 is much more complicated. Here we give a simplified version of NM-CCA2 from [2].

Definition 2. (NM-CCA2) An asymmetric encryption scheme AE = (KG, E, D) is said to be NM-CCA2 secure if for all PPT adversaries A = (A1, A2),

  Adv^{NM-CCA2}_{AE,A}(1^k) = |Succ^{NM-CCA2}_{AE,A}(1^k) − Succ^{NM-CCA2}_{AE,A,$}(1^k)|

is negligible, where M is a valid message distribution, R is a non-trivial PPT binary relation, and Succ^{NM-CCA2}_{AE,A}(1^k) =
Pr[ (pk, sk) ←$ KG(1^k); M ←$ A1^{D(sk,·)}(pk); x* ←$ M; y* ←$ E(pk, x*);
    (R, y) ←$ A2^{D(sk,·)}(M, y*); x ← D(sk, y) : y ≠ y* ∧ x ≠ ⊥ ∧ R(x, x*) = 1 ]

and

  Succ^{NM-CCA2}_{AE,A,$}(1^k) =
  Pr[ (pk, sk) ←$ KG(1^k); M ←$ A1^{D(sk,·)}(pk); (x*, x̃) ←$ M; y* ←$ E(pk, x*);
      (R, y) ←$ A2^{D(sk,·)}(M, y*); x ← D(sk, y) : y ≠ y* ∧ x ≠ ⊥ ∧ R(x, x̃) = 1 ]

Y. Liu et al.

Plaintext Awareness. Here we recap the definitions of PA1, PA1+ and PA2 without random oracles; the definitions are mainly from [1]. PA2 plaintext awareness is described by two games, REAL and FAKE, wherein A is a ciphertext creator, A* is a plaintext extractor for A, and P is a plaintext creator which provides ciphertexts for A. Let R[A] be the coins of A, and let CLIST be a list of ciphertexts that A obtains from P. On receiving a distribution from A, P generates a message m accordingly, computes C ←$ E(pk, m), adds C to the ciphertext list CLIST, and returns C to A. A may query a decryption oracle on any ciphertext C ∉ CLIST. The decryption oracle answers the decryption queries of A with D(sk, ·) in the REAL game, and with A*(pk, ·, R[A], CLIST) in the FAKE game. At the end of both games, A outputs a string describing its interaction with the decryption oracle.

  Exp^{REAL}_{A,D}(1^k):                          Exp^{FAKE}_{A,A*}(1^k):
    (pk, sk) ←$ KG(1^k)                            (pk, sk) ←$ KG(1^k)
    x_REAL ←$ A^{D(sk,·), E(pk,P(·))}(pk)          x_FAKE ←$ A^{A*(pk,·,R[A],CLIST), E(pk,P(·))}(pk)
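These game-based definitions can be made concrete in code. The harness below is a hypothetical sketch (not from the paper): it runs the IND-CCA2 experiment once, and the toy XOR "scheme" together with the bit-flipping adversary shows how a malleable scheme loses the game every single run.

```python
import secrets

def ind_cca2_experiment(KG, E, D, A1, A2):
    """One run of Exp^{IND-CCA2}_{AE,A}; returns True iff the guess b' equals b."""
    pk, sk = KG()
    challenge = [None]
    def dec(C):                         # decryption oracle; refuses the challenge
        if C == challenge[0]:
            return None
        return D(sk, C)
    m0, m1, state = A1(pk, dec)
    assert len(m0) == len(m1)           # the game requires |m0| = |m1|
    b = secrets.randbelow(2)
    challenge[0] = E(pk, m1 if b else m0)
    return A2(pk, challenge[0], state, dec) == b

# Toy malleable "scheme": XOR with a fixed key (NOT secure, illustration only).
def toy_kg():
    k = secrets.token_bytes(4)
    return k, k

def toy_xor(key, data):
    return bytes(a ^ b for a, b in zip(data, key))

def toy_enc(pk, m):
    return toy_xor(pk, m)

def toy_dec(sk, C):
    return toy_xor(sk, C)

def attacker_stage1(pk, dec):
    return b"\x00" * 4, b"\xff" * 4, None

def attacker_stage2(pk, C, state, dec):
    mauled = bytes([C[0] ^ 1]) + C[1:]  # differs from C, so the oracle answers
    m = dec(mauled)
    m = bytes([m[0] ^ 1]) + m[1:]       # undo the flip: this is m_b
    return 1 if m == b"\xff" * 4 else 0
```

Because the toy scheme is malleable, the adversary recovers m_b through a legal oracle query and wins with probability 1, i.e. with the maximum advantage 1/2.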
Definition 3. (PA2 Plaintext Awareness) An asymmetric encryption scheme is said to be statistical (computational) PA2 plaintext aware if for all ciphertext creators A, there exists a plaintext extractor A* such that for all plaintext creators P, x_REAL and x_FAKE are statistically (computationally) indistinguishable.

PA1 is weaker than PA2 in that, in the definition of PA1, the plaintext creator P is not available to A.

Definition 4. (PA1 Plaintext Awareness) An asymmetric encryption scheme is said to be statistical (computational) PA1 plaintext aware if for all ciphertext creators A that do not make any queries to the plaintext creator P, there exists a plaintext extractor A* such that x_REAL and x_FAKE are statistically (computationally) indistinguishable.
PA1+ is an intermediate notion between PA1 and PA2 [8]. In the PA1+ model, the ciphertext creator A cannot access the plaintext creator P. However, A can access a randomness oracle R which takes no input and returns random strings of the same length as real ciphertexts. The formal definition of PA1+ is similar to the definition of PA2, except that P is replaced by R and CLIST is replaced by RLIST, which contains all random strings that A obtains from R.

2.2
IND-CCA2 of Symmetric Encryption
The definition of IND-CCA2 security for symmetric encryption SE = (KG, E, D) has the same flavor as that for asymmetric encryption, except that KG simply produces a secret key K, and the adversary can also access an encryption oracle in addition to the decryption oracle. To construct a symmetric encryption scheme with IND-CCA2 security, the most common way is to combine a one-time pad and a message authentication code (MAC). However, constructions of IND-CCA2 secure symmetric encryption without MAC are given by Desai in [9] and by Phan and Pointcheval in [15]. These constructions avoid the length overhead caused by the MAC.

2.3
Hybrid Encryption
A hybrid encryption scheme [5,6] is the combination of an asymmetric key encapsulation mechanism (KEM) and a symmetric data encapsulation mechanism (DEM). The hybrid encryption scheme HPKE = (HKG, HE, HD) is defined as follows, wherein KEM = (KG, E, D) and DEM = (E, D) are used:

  HKG(1^k):                 HE(pk, m):              HD(sk, C):
    (pk, sk) ←$ KG(1^k)       (K, c) ←$ E(pk)         Parse C as (c, e)
    Return (pk, sk)           e ← E(K, m)             K ← D(sk, c)
                              C = (c, e)              m ← D(K, e)
                              Return C                Return m

Since a hybrid encryption scheme is essentially an asymmetric encryption scheme, security notions from the asymmetric case, such as PA and IND-CCA2, are naturally applicable in the hybrid case.
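The composition can be sketched in Python. Everything below is a toy stand-in: the KEM is plain hashed ElGamal over the order-11 subgroup of Z*_23 (it only illustrates the KEM/DEM interface and is not claimed to be CCA secure), and the DEM is a one-time "XOR pad plus HMAC" Encrypt-then-MAC scheme of the kind mentioned in the previous subsection.

```python
import hashlib, hmac, secrets

p, q, g = 23, 11, 4                      # toy group: order-11 subgroup of Z_23^*

def kem_kg():
    x = secrets.randbelow(q)
    return pow(g, x, p), x               # (pk, sk)

def kem_encap(pk):                       # (K, c) <-$ E(pk)
    r = secrets.randbelow(q)
    K = hashlib.sha256(str(pow(pk, r, p)).encode()).digest()
    return K, pow(g, r, p)

def kem_decap(sk, c):                    # K <- D(sk, c)
    return hashlib.sha256(str(pow(c, sk, p)).encode()).digest()

def dem_enc(K, m):                       # one-time Encrypt-then-MAC (toy: len(m) <= 32)
    ke, km = K[:16], K[16:]
    pad = hashlib.sha256(ke).digest()[:len(m)]
    c = bytes(a ^ b for a, b in zip(m, pad))
    return c + hmac.new(km, c, hashlib.sha256).digest()

def dem_dec(K, ct):
    ke, km = K[:16], K[16:]
    c, tag = ct[:-32], ct[-32:]
    if not hmac.compare_digest(tag, hmac.new(km, c, hashlib.sha256).digest()):
        return None                      # MAC failure: reject
    pad = hashlib.sha256(ke).digest()[:len(c)]
    return bytes(a ^ b for a, b in zip(c, pad))

def hybrid_encrypt(pk, m):               # HE(pk, m) = (c, e)
    K, c = kem_encap(pk)
    return c, dem_enc(K, m)

def hybrid_decrypt(sk, C):               # HD(sk, C)
    c, e = C
    return dem_dec(kem_decap(sk, c), e)
```

Note how the MAC makes the DEM part non-malleable: flipping any bit of e makes HD reject, which is exactly the property the MAC-less DEMs of the paper's counterexamples lack.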
3
The Desmedt-Phan Hybrid Encryption Scheme
Desmedt and Phan proposed a hybrid encryption scheme [11] based on Damgård's ElGamal encryption [7], embedding a DEM in the latter scheme. The Desmedt-Phan hybrid encryption scheme was proved to be IND-CCA2 under assumptions in the generic group model. Here we briefly recall the scheme and its security proof.
Definition 5. (Desmedt-Phan Hybrid Encryption Scheme) Let (E, D) be a DEM, let Hk be a universal family of hash functions, and let G be a group generation algorithm. The algorithms (KG, E, D) are described below:

  KG(1^k):                   E(pk, m):                  D(sk, C):
    (g, q) ←$ G(1^k)           r ←$ Zq                    Parse C as (u1, u2, e)
    x, y ←$ Zq                 u1 = g^r, u2 = c^r         If u2 ≠ u1^x return ⊥
    c = g^x, d = g^y           K = H(d^r)                 Else K = H(u1^y)
    H ←$ Hk                    e = E_K(m)                 m = D_K(e)
    sk = (x, y)                C = (u1, u2, e)            Return m
    pk = (H, g, c, d)          Return C
    Return (pk, sk)
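A toy instantiation of the scheme in Definition 5 can look as follows (an illustrative sketch only): the group is the order-11 subgroup of Z*_23, SHA-256 stands in for the universal hash H, and the DEM is a one-time XOR pad, whereas Theorem 1 below requires an IND-CCA2 DEM.

```python
import hashlib, secrets

p, q, g = 23, 11, 4                 # toy group: g generates the subgroup of order q

def H(elem):                        # hash a group element to a 16-byte key
    return hashlib.sha256(str(elem).encode()).digest()[:16]

def xor(key, data):                 # toy one-time-pad DEM (for data up to 32 bytes)
    return bytes(a ^ b for a, b in zip(data, (key * 2)[:len(data)]))

def KG():
    x, y = secrets.randbelow(q), secrets.randbelow(q)
    return (pow(g, x, p), pow(g, y, p)), (x, y)    # pk = (c, d), sk = (x, y)

def E(pk, m):
    c, d = pk
    r = secrets.randbelow(q)
    u1, u2 = pow(g, r, p), pow(c, r, p)
    return u1, u2, xor(H(pow(d, r, p)), m)

def D(sk, C):
    x, y = sk
    u1, u2, e = C
    if u2 != pow(u1, x, p):
        return None                 # consistency check: (u1, u2) must be a DH pair
    return xor(H(pow(u1, y, p)), e)
```

The decryption check u2 = u1^x is what the DHK extractor exploits in the PA1 proof below: any ciphertext passing it encodes a common exponent r.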
Security. The IND-CCA2 security of the Desmedt-Phan hybrid encryption scheme is based on the hashed DDH (HDDH), modified HDDH (MHDDH), extended HDDH (EHDDH), DHK, and EDHK assumptions. We put the definitions of the HDDH, MHDDH and EHDDH assumptions in the Appendix, and briefly recall the DHK and EDHK assumptions here. The Diffie-Hellman Knowledge (DHK) assumption [7,3,8] was introduced by Damgård, and evidence of its intractability was given by Dent [10] in the generic group model. Let G be a group generation algorithm, let A be a PPT adversary, and let R[A] be the coins of A; the DHK game is defined as follows:

  Exp^{DHK}_{A,K}(1^k):
    (g, q) ←$ G(1^k); a ←$ Zq; A = g^a
    (B, C) ←$ A(g, A, R[A]); b ← K((B, C), g, A, R[A])
    IF C = B^a THEN
      IF B ≠ g^b THEN return 1 ELSE return b
    ENDIF
    Return 0

A wins the game if Exp^{DHK}_{A,K}(1^k) = 1, and the DHK advantage of A is defined as

  Adv^{DHK}_{G,A,K}(1^k) = Pr[Exp^{DHK}_{G,A,K}(1^k) = 1].

Assumption 1. (DHK) For any PPT algorithm A, there is a PPT extractor K such that Adv^{DHK}_{G,A,K}(1^k) is negligible.

The EDHK assumption claims that if an adversary A is given not only (g, A) but also a DH-pair (B, C) relative to A, the only way for A to output another
DH-pair (B′, C′) is to choose x, y ∈ Zq and compute B′ = B^x g^y and C′ = C^x A^y. Desmedt and Phan also gave evidence of the intractability of EDHK in the generic group model [11].

  Exp^{EDHK}_{A,K}(1^k):
    (g, q) ←$ G(1^k); a, b ←$ Zq; A = g^a, B = g^b, C = g^{ab}
    (B′, C′) ←$ A(g, A, B, C, R[A]); x||y ← K((B′, C′), g, A, B, C, R[A])
    IF C′ = B′^a THEN
      IF B′ = B^x g^y and C′ = C^x A^y THEN return (x, y) ELSE return 1
    ENDIF
    Return 0

A wins the game if Exp^{EDHK}_{A,K}(1^k) = 1, and the EDHK advantage of A is defined as

  Adv^{EDHK}_{G,A,K}(1^k) = Pr[Exp^{EDHK}_{G,A,K}(1^k) = 1].
Assumption 2. (EDHK) For any PPT algorithm A, there is a PPT extractor K such that Adv^{EDHK}_{G,A,K}(1^k) is negligible.

Theorem 1. [11] The Desmedt-Phan hybrid encryption scheme is IND-CCA2 assuming that: 1. the HDDH and EHDDH assumptions hold; 2. the DHK and EDHK assumptions hold; and 3. the DEM is IND-CCA2 secure.

In the proof of Theorem 1, the HDDH and EHDDH assumptions are used to establish the semantic security of the scheme. Besides, a decryption simulator is constructed with a DHK extractor K1 and an EDHK extractor K2. Consider an IND-CCA2 adversary A = (A1, A2). If A1 submits a query (u1, u2, e), the decryption simulator runs K1 to extract the common exponent of (u1, u2) and decrypts with that exponent. Similarly, if A2 submits a query (u1, u2, e), the decryption simulator runs K2 to look for the exponent. More details of the proof are in [11].
4
PA1 and IND-CCA2 Do Not Guarantee PA2
In this section, we first use the Desmedt-Phan hybrid encryption scheme as an example to show that statistical PA1 and IND-CCA2 do not guarantee statistical PA2. Then we extend the result to the Cramer-Shoup hybrid encryption scheme. Finally, we construct an artificial example and prove that PA1 and IND-CCA2 cannot reach PA2 in the computational case.
4.1
Use the Desmedt-Phan as an Example
Let DEM = (E, D) in the Desmedt-Phan hybrid encryption scheme be an IND-CCA2 DEM without MAC. The resulting scheme is IND-CCA2, or equivalently, NM-CCA2, according to Theorem 1. Here we prove that the scheme achieves statistical PA1 but not statistical PA2.

Theorem 2. The Desmedt-Phan hybrid encryption scheme is statistical PA1 assuming that the DHK assumption holds.

Proof. Let A be any PA1 ciphertext creator of the Desmedt-Phan hybrid encryption scheme, and let R[A] be the coins of A. We make use of a DHK extractor K to build a plaintext extractor A* for A. If A makes a decryption query on a ciphertext (u1, u2, e), A* proceeds as follows:

1. Run the DHK extractor K((u1, u2), g, c, R[A]). If Exp^{DHK}_{G,A,K}(1^k) returns 0 or 1, then return ⊥ and reject the ciphertext. Otherwise, if the game returns a value r ∈ Zq, continue.
2. Compute K = H(d^r) and return m = D_K(e) to A.
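The extractor's two steps can be sketched as glue code (hypothetical names throughout; `dhk_extract` stands in for the DHK extractor K, whose existence is exactly what Assumption 1 guarantees, and H and `dem_dec` are toy stand-ins):

```python
def pa1_plaintext_extractor(query, pk, coins, dhk_extract, H, dem_dec, p):
    """Sketch of A* from the proof of Theorem 2."""
    u1, u2, e = query
    g, c, d = pk
    r = dhk_extract((u1, u2), g, c, coins)   # common exponent of (u1, u2), or None
    if r is None:
        return None                          # reject: (u1, u2) is not a DH pair w.r.t. c
    K = H(pow(d, r, p))                      # step 2: K = H(d^r)
    return dem_dec(K, e)                     #         m = D_K(e)
```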
The plaintext extractor A* correctly simulates the decryption algorithm whenever it obtains correct answers from the DHK extractor K. Note that the output of K is equal to, rather than merely indistinguishable from, the common exponent of (u1, u2), except with negligible error probability. Since Pr[Exp^{DHK}_{A,K}(1^k) = 1] is negligible, A* fails to decrypt correctly only with negligible probability. Hence, the Desmedt-Phan hybrid encryption scheme is statistical PA1.

Remark 1. Though the DHK assumption is a computational assumption, it has been used to prove statistical PA in several works, such as the statistical PA1 of the Cramer-Shoup lite encryption scheme by Bellare and Palacio [3], and the statistical PA2 of the Cramer-Shoup hybrid encryption scheme by Teranishi and Ogata [18].

Theorem 3. The Desmedt-Phan hybrid encryption scheme is not statistical PA2.

Proof. Let A be a PA2 ciphertext creator of the Desmedt-Phan hybrid encryption scheme, with its coins denoted R[A], and let P be a plaintext creator. CLIST is the list of ciphertexts that A obtains from P; every entry of CLIST is a ciphertext of the whole Desmedt-Phan hybrid encryption scheme, not just of the KEM part or the DEM part. Note that the plaintext extractor A* does not know the coins of P, and thus cannot decrypt ciphertexts in CLIST. A can produce ciphertexts without using the encryption algorithm in the following way: query the plaintext creator P; on receiving a ciphertext C = (u1, u2, e) from P, choose e′ at random from the ciphertext space of the DEM. Then C′ = (u1, u2, e′) is a valid ciphertext, and C′ ∉ CLIST.
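The mauling step used by A is trivial to implement (a sketch with hypothetical names; `dem_ct_len` is an assumed DEM ciphertext length):

```python
import os

def maul(ciphertext, dem_ct_len=32):
    """The PA2 attack of Theorem 3 (sketch): keep the KEM part (u1, u2) of a
    ciphertext received from the plaintext creator and replace the DEM part
    with a fresh random string e'.  Without a MAC the result is still a valid
    ciphertext, it is not in CLIST, and its plaintext is unknown to any
    extractor lacking the common exponent of (u1, u2)."""
    u1, u2, _e = ciphertext
    return (u1, u2, os.urandom(dem_ct_len))
```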
Obviously, no plaintext extractor A* could output the exact decryption of C′ = (u1, u2, e′) with overwhelming probability, since the common exponent of (u1, u2) is not in R[A].

The result may seem unexpected: intuitively, if an encryption scheme is non-malleable, then the ciphertexts from the plaintext creator P should be useless to the ciphertext creator A. However, though a ciphertext (u1, u2, e) can easily be modified into another valid ciphertext (u1, u2, e′), the non-trivial relation between them required by NM-CCA2 is unknown. Thus the modified ciphertext can be used against statistical PA2 plaintext awareness, yet does not contradict the NM-CCA2 security of the scheme. This fact shows a gap between CCA2 and PA2 with respect to ciphertext malleability. Besides, the reason why we choose the Desmedt-Phan hybrid encryption scheme, which is secure in the generic group model, as an example is that the proof of Theorem 1 cited from [11] seems to follow the approach of PA2. The decryption simulator looks like a plaintext extractor, since it answers a decryption query by employing the DHK or EDHK extractor to recover the randomness that A1 or A2 uses in the encryption. The difference is that in the IND-CCA2 case, the decryption simulator knows the randomness used to produce the challenge ciphertext, and thus, with the EDHK extractor, it can deal with A2's decryption queries. In the PA2 case, however, the plaintext extractor does not know the coins that the plaintext creator P uses, so even the EDHK extractor would be useless. The Desmedt-Phan hybrid encryption scheme thus clearly shows the gap between IND-CCA2 and PA2.

4.2
Apply the Same Argument to the Cramer-Shoup Hybrid
The IND-CCA2 security of the Desmedt-Phan hybrid encryption scheme was established under assumptions in the generic group model. However, the fact that statistical PA1 and IND-CCA2 do not guarantee PA2 easily extends to the standard model. Looking back at the Cramer-Shoup hybrid encryption scheme [6], which is IND-CCA2 in the standard model, was proved computational PA1+ and PA2 by Dent [8], and was proved statistical PA2 by Teranishi and Ogata [18], we find that the argument of Theorem 3 also applies to it. Specifically, the Cramer-Shoup hybrid scheme proved statistical PA2 in [18] is composed of the Cramer-Shoup KEM [6] and an Encrypt-then-MAC DEM. If the underlying DEM is IND-CCA2 secure without MAC, then the resulting Cramer-Shoup hybrid, which is still IND-CCA2 secure and PA1+ (hence PA1), is not PA2.

Definition 6. (Cramer-Shoup Hybrid Encryption Scheme) Let (E, D) be a DEM. Let G be a group generation algorithm, and let G = <g> be a cyclic group of order q. F : G × G → {0, 1}^n is a key derivation function, and H : G × G → Zq is a hash function. The Cramer-Shoup hybrid encryption scheme is defined below:
  KG(1^k):                    E(pk, m):                      D(sk, C):
    (g, q) ←$ G(1^k)            r ←$ Zq                        Parse C as (A, Â, V, e)
    w ←$ Zq*                    A = g^r; Â = W^r; B = Z^r      v = H(A, Â)
    x, y, z ←$ Zq               K = F(A, B) ∈ {0, 1}^n         Check if V = A^{x+yv} and Â = A^w
    W = g^w; X = g^x            v = H(A, Â)                    If not then return ⊥
    Y = g^y; Z = g^z            V = X^r Y^{vr}                 Else B = A^z
    pk = (W, X, Y, Z)           e = E_K(m)                     K = F(A, B)
    sk = (w, x, y, z)           C = (A, Â, V, e)               m = D_K(e)
    Return (pk, sk)             Return C                       Return m
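A toy instantiation of Definition 6 (an illustrative sketch only): the group is again the order-11 subgroup of Z*_23, SHA-256 stands in for both the KDF F and the hash H, and the DEM is a one-time XOR pad rather than the Encrypt-then-MAC DEM used in [18].

```python
import hashlib, secrets

p, q, g = 23, 11, 4                 # toy group of order q

def F(A, B):                        # key derivation function (toy)
    return hashlib.sha256(("F%d,%d" % (A, B)).encode()).digest()[:16]

def Hh(A, Ahat):                    # hash to Z_q (toy)
    d = hashlib.sha256(("H%d,%d" % (A, Ahat)).encode()).digest()
    return int.from_bytes(d, "big") % q

def xor(key, data):                 # toy one-time-pad DEM (data up to 32 bytes)
    return bytes(a ^ b for a, b in zip(data, (key * 2)[:len(data)]))

def KG():
    w = 1 + secrets.randbelow(q - 1)             # w in Z_q^*
    x, y, z = (secrets.randbelow(q) for _ in range(3))
    pk = tuple(pow(g, s, p) for s in (w, x, y, z))   # (W, X, Y, Z)
    return pk, (w, x, y, z)

def E(pk, m):
    W, X, Y, Z = pk
    r = secrets.randbelow(q)
    A, Ahat, B = pow(g, r, p), pow(W, r, p), pow(Z, r, p)
    v = Hh(A, Ahat)
    V = (pow(X, r, p) * pow(Y, (r * v) % q, p)) % p
    return A, Ahat, V, xor(F(A, B), m)

def D(sk, C):
    w, x, y, z = sk
    A, Ahat, V, e = C
    v = Hh(A, Ahat)
    if V != pow(A, (x + y * v) % q, p) or Ahat != pow(A, w, p):
        return None                 # reject inconsistent ciphertexts
    return xor(F(A, pow(A, z, p)), e)
```

Exactly as in the Desmedt-Phan case, the KEM part (A, Â, V) is rigid, but with a MAC-less DEM the e-component can be swapped freely, which is the source of the PA2 failure stated in Corollary 1.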
Corollary 1. If the underlying DEM = (E, D) is an IND-CCA2 secure symmetric encryption scheme without MAC, then the Cramer-Shoup hybrid encryption scheme is statistical PA1+ under the DHK assumption but not statistical PA2.

Proof. (Sketch) The computational PA1+ plaintext awareness of the Cramer-Shoup KEM was proved by Dent [8] under the DHK assumption. Since the DHK extractor is correct with overwhelming probability, the Cramer-Shoup KEM is actually statistical PA1+. This means that the Cramer-Shoup hybrid encryption scheme with an arbitrary DEM is statistical PA1+. The same argument as in Theorem 3 can then be used to disprove the statistical PA2 of the hybrid scheme.

The result may be of some interest, since PA2 is desirable in applications; for example, the plaintext awareness of the Cramer-Shoup hybrid encryption scheme was used to prove the deniability of a key exchange protocol in [16]. IND-CCA2 secure symmetric encryption without MAC is also interesting because of its length efficiency. However, caution should be taken when combining a KEM with such a DEM.

4.3
The Situation for Computational PA
The argument in Theorem 3 seems inapplicable to computational PA2: though A* cannot output the correct decryption with overwhelming probability, it can output a random plaintext, which is computationally indistinguishable from the real plaintext in the view of the ciphertext creator. However, this does not mean that computational PA2 is guaranteed by computational PA1 and IND-CCA2. Here we give an artificial example¹ in the computational case. Let AE = (KG, E, D) be an asymmetric encryption scheme which is statistical PA2 and IND-CCA2; it is known that such schemes exist. For the sake of convenience, let 1^k be the security parameter, and let {0, 1}^{2n} be the message space of AE, where n = p(1^k) for some polynomial p(·).
¹ The artificial example was inspired by the comments of an anonymous reviewer.
Then we construct AE′ = (KG′, E′, D′) as follows:

– KG′(1^k): KG′ is the same as KG, except that it additionally chooses a hash function T which is collision-free and pre-image resistant. T is added to the public key.
– E′(pk, m): On input m ∈ {0, 1}^n, randomly select r ∈ {0, 1}^n, compute c = E(pk, m||r) and t = T(r), and output the ciphertext C = (1, c, t).
– D′(sk, C): Parse C as (b, c, t) and compute M = D(sk, c). If M = ⊥ then output ⊥; else parse M as (m, r). If t ≠ T(r) then output ⊥; otherwise, if b = 1 then output m, otherwise output r.
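The wrapper can be sketched as follows; `base_enc`/`base_dec` are placeholders for E and D of the assumed base scheme AE (which must be IND-CCA2 and statistical PA2), SHA-256 plays the role of the collision-free, pre-image resistant hash T, and n is fixed to 16 bytes for illustration.

```python
import hashlib, os

def T(r):                              # stands in for the committing hash T
    return hashlib.sha256(r).digest()

def enc_prime(pk, m, base_enc, n=16):
    r = os.urandom(n)                  # fresh randomness r
    return (1,
            base_enc(pk, m + r),       # c = E(pk, m || r)
            T(r))                      # t = T(r)

def dec_prime(sk, C, base_dec, n=16):
    b, c, t = C
    M = base_dec(sk, c)
    if M is None:
        return None
    m, r = M[:-n], M[-n:]
    if t != T(r):
        return None                    # reject if t does not commit to r
    return m if b == 1 else r          # b = 0 reveals the randomness r
```

The b = 0 branch is the deliberately planted trapdoor: it leaks nothing to an IND-CCA2 adversary (who already chose or cannot use r), but it is exactly what breaks computational PA2 in Lemma 3.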
Theorem 4. The asymmetric encryption scheme AE′ = (KG′, E′, D′) is IND-CCA2 secure and statistical PA1 but not computational PA2, if AE = (KG, E, D) is IND-CCA2 and statistical PA2.

Proof. The theorem follows from the following three lemmas.

Lemma 1. AE′ is IND-CCA2 secure.

Proof. The IND-CCA2 security of AE′ follows from the IND-CCA2 security of AE. If there is an adversary S′ against the IND-CCA2 security of AE′, then we can construct an IND-CCA2 adversary S against the IND-CCA2 security of AE:

Step 1. On receiving the public key pk from a challenger, S selects a collision-free and pre-image resistant hash function T, and challenges S′ with pk′ = (pk, T).
Step 2. On receiving a decryption query C = (b, c, t) from S′, S queries its own decryption oracle with c and gets the answer M. If M = ⊥ then S outputs ⊥. Otherwise, S parses M as (m, r) and proceeds as in the remaining steps of D′.
Step 3. On receiving a pair of messages (m0, m1), S randomly selects r*, and sends its own challenger the message pair (m0||r*, m1||r*). On receiving its own challenge ciphertext c* = E(pk, mσ||r*), where σ ←$ {0, 1}, S computes t* = T(r*) and sends C* = (1, c*, t*) to S′.
Step 4. S answers decryption queries as in Step 2, except that it refuses to decrypt C*. S would not leak information about the plaintext of C* even if S′ obtains r*, since T is collision-free. Finally, S′ outputs a guess bit σ′, and S outputs σ′ as its own guess.

Obviously, Adv^{IND-CCA2}_{AE′,S′}(1^k) = Adv^{IND-CCA2}_{AE,S}(1^k).
Lemma 2. AE′ is statistical PA1.

Proof. The statistical PA1 of AE′ is guaranteed by the statistical PA1 of AE. For any PA1 ciphertext creator A′ against AE′, we construct a plaintext extractor A′* for A′ which is also a ciphertext creator against AE. Since AE is statistical PA1, there exists a PA1 plaintext extractor A* for A′*. Here is the description of A′*:
– On receiving the public key pk′ = (pk, T) from a challenger, set the coins of A′ as its own coins, i.e., R[A′*] = R[A′].
– On receiving a decryption query C = (b, c, t) from A′, invoke A*(pk, c, R[A′*]) and get the answer M. If M = ⊥ then output ⊥; else parse M as (m, r). If t ≠ T(r) then output ⊥; otherwise, if b = 1 then output m, otherwise output r.

A′* correctly simulates D′(sk, ·) if it gets correct answers from A*. Since A* is statistically indistinguishable from D(sk, ·), it returns correct decryptions except with negligible probability. Thus A′* fails to simulate D′(sk, ·) only with negligible probability. Hence, AE′ is statistical PA1, and certainly computational PA1.
Lemma 3. AE′ is not computational PA2.

Proof. Let A be a PA2 ciphertext creator of AE′, with its coins denoted R[A], and let P be a plaintext creator. CLIST is the list of ciphertexts that A obtains from P. A can produce ciphertexts without using the encryption algorithm in the following way: query the plaintext creator P; on receiving a ciphertext C = (1, c, t) from P, where c = E(pk, m||r) and t = T(r), set C′ = (0, c, t). Since C′ ∉ CLIST, A may ask the plaintext extractor A* to decrypt C′. A* fails with overwhelming probability, since it neither knows the coins of P nor can produce a string computationally indistinguishable from r: r is already committed to in t, and the hash function T is collision-free and pre-image resistant.

The artificial example shows that computational PA1 and IND-CCA2 do not guarantee computational PA2, and that statistical PA1 and IND-CCA2 also cannot reach computational PA2. It is an open problem to find natural examples supporting the result.
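The corresponding mauling step is even simpler than in Theorem 3 (a sketch with hypothetical names):

```python
def flip_header(C):
    """The computational-PA2 attack of Lemma 3 (sketch): a ciphertext
    (1, c, t) received from the plaintext creator is turned into (0, c, t),
    which D' decrypts to the hidden randomness r.  No extractor can answer
    this query: r is committed to by t = T(r) and appears neither in the
    creator's coins nor among CLIST's known plaintexts."""
    _b, c, t = C
    return (0, c, t)
```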
5
Conclusion
We give several examples to show that proving PA2 plaintext awareness via PA1 plaintext awareness and IND-CCA2 security is impossible, in both the statistical and the computational case, in the absence of random oracles. We use natural encryption schemes to show the result in the statistical case, and construct an artificial example in the computational case. The result may be of some value, since PA2 is not only a notion used to prove IND-CCA2 but is also of independent interest.
Acknowledgements We are very grateful to anonymous reviewers for their helpful comments. We also thank Xiaoying Jia, Peng Wang and Liting Zhang for helpful discussions.
References

1. Birkett, J., Dent, A.W.: Relations Among Notions of Plaintext Awareness. In: Cramer, R. (ed.) PKC 2008. LNCS, vol. 4939, pp. 47–64. Springer, Heidelberg (2008)
2. Bellare, M., Desai, A., Pointcheval, D., Rogaway, P.: Relations Among Notions of Security for Public-Key Encryption Schemes. In: Krawczyk, H. (ed.) CRYPTO 1998. LNCS, vol. 1462, pp. 26–46. Springer, Heidelberg (1998)
3. Bellare, M., Palacio, A.: Towards Plaintext-Aware Public-Key Encryption without Random Oracles. In: Lee, P.J. (ed.) ASIACRYPT 2004. LNCS, vol. 3329, pp. 48–62. Springer, Heidelberg (2004)
4. Bellare, M., Rogaway, P.: Optimal Asymmetric Encryption – How to Encrypt with RSA. In: De Santis, A. (ed.) EUROCRYPT 1994. LNCS, vol. 950, pp. 92–111. Springer, Heidelberg (1995)
5. Cramer, R., Shoup, V.: A Practical Public Key Cryptosystem Provably Secure against Adaptive Chosen Ciphertext Attack. In: Krawczyk, H. (ed.) CRYPTO 1998. LNCS, vol. 1462, pp. 13–25. Springer, Heidelberg (1998)
6. Cramer, R., Shoup, V.: Design and Analysis of Practical Public-Key Encryption Schemes Secure against Adaptive Chosen Ciphertext Attack. SIAM Journal on Computing 33(1), 167–226 (2004)
7. Damgård, I.B.: Towards Practical Public Key Systems Secure against Chosen Ciphertext Attacks. In: Feigenbaum, J. (ed.) CRYPTO 1991. LNCS, vol. 576, pp. 445–456. Springer, Heidelberg (1992)
8. Dent, A.W.: The Cramer-Shoup Encryption Scheme Is Plaintext Aware in the Standard Model. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 289–307. Springer, Heidelberg (2006)
9. Desai, A.: New Paradigms for Constructing Symmetric Encryption Schemes Secure against Chosen-Ciphertext Attack. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880, pp. 394–412. Springer, Heidelberg (2000)
10. Dent, A.W.: The Hardness of the DHK Problem in the Generic Group Model (2006), http://eprint.iacr.org/2006/156
11. Desmedt, Y., Phan, D.H.: A CCA Secure Hybrid Damgård's ElGamal Encryption. In: Baek, J., et al. (eds.) ProvSec 2008. LNCS, vol. 5324, pp. 68–82. Springer, Heidelberg (2008)
12. Fujisaki, E.: Plaintext-Simulatability. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E89-A(1), 55–65 (2006)
13. Herzog, J., Liskov, M., Micali, S.: Plaintext Awareness via Key Registration. In: Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 548–564. Springer, Heidelberg (2003)
14. Jiang, S., Wang, H.: Plaintext-Awareness of Hybrid Encryption. In: Pieprzyk, J. (ed.) CT-RSA 2010. LNCS, vol. 5985, pp. 57–72. Springer, Heidelberg (2010)
15. Phan, D.H., Pointcheval, D.: About the Security of Ciphers (Semantic Security and Pseudo-Random Permutations). In: Handschuh, H., Hasan, M.A. (eds.) SAC 2004. LNCS, vol. 3357, pp. 182–197. Springer, Heidelberg (2004)
16. Raimondo, M.D., Gennaro, R., Krawczyk, H.: Deniable Authentication and Key Exchange. In: Proceedings of ACM CCS 2006, pp. 400–409. ACM, New York (2006)
17. Teranishi, I., Ogata, W.: Relationship between Standard Model Plaintext Awareness and Message Hiding. In: Lai, X., Chen, K. (eds.) ASIACRYPT 2006. LNCS, vol. 4284, pp. 226–240. Springer, Heidelberg (2006) 18. Teranishi, I., Ogata, W.: Cramer-Shoup Satisfies a Stronger Plaintext Awareness under a Weaker Assumption. In: Ostrovsky, R., De Prisco, R., Visconti, I. (eds.) SCN 2008. LNCS, vol. 5229, pp. 109–125. Springer, Heidelberg (2008)
Appendix: HDDH Assumption and Its Variants

Assumption 3. (Hashed Decisional Diffie-Hellman Assumption, HDDH) Assume a group G = <g> of order q, and let H be a hash function. No adversary can effectively distinguish the following two distributions:
– the distribution RH: (g, g^a, g^b, H(Z)), where a, b ←$ Zq and Z ←$ G;
– the distribution DH: (g, g^a, g^b, H(g^{ab})), where a, b ←$ Zq.

Assumption 4. (Modified Hashed Decisional Diffie-Hellman Assumption, MHDDH) Assume a group G = <g> of order q, and let H be a hash function. No adversary can effectively distinguish the following two distributions:
– the distribution RMH: (g, g^a, g^b, g^c, g^{ac}, H(Z)), where a, b, c ←$ Zq and Z ←$ G;
– the distribution DMH: (g, g^a, g^b, g^c, g^{ac}, H(g^{bc})), where a, b, c ←$ Zq.

Assumption 5. (Extended Hashed Decisional Diffie-Hellman Assumption, EHDDH) Assume a group G = <g> of order q, and let H be a hash function. Choose U ∈ G, U ≠ 1. No adversary, on choosing an element v ∈ Zq*, can effectively distinguish the following two distributions:
– the distribution REH: (g, g^a, g^b, H(g^{ab}), H(Z)), where a, b ←$ Zq and Z ←$ G;
– the distribution DEH: (g, g^a, g^b, H(g^{ab}), H(U g^{abv})), where a, b ←$ Zq.
A Generic Method for Reducing Ciphertext Length of Reproducible KEMs in the RO Model

Yusuke Sakai (1), Goichiro Hanaoka (2), Kaoru Kurosawa (3), and Kazuo Ohta (1)
2
The University of Electro-Communications, Japan {y-sakai,ota}@ice.uec.ac.jp National Institute of Advanced Industrial Science and Technology, Japan
[email protected] 3 Ibaraki University, Japan
[email protected] Abstract. In this paper, a simple generic method is proposed which can make a key encapsulation mechanism (KEM) more efficient. While the original KEM needs to be OW-CCCA secure and to satisfy reproducibility, the transformed KEM is CCA secure in the random oracle model and the size of ciphertexts is shorter. In particular, various existing CCA secure KEMs in the standard model can be transformed into more efficient ones in the random oracle model. We can implement both the original KEM and the transformed one in a single chip simultaneously with a small number of additional gates because our transformation requires computing a single hash value only.
1
Introduction

1.1
Background
Designing secure and efficient public key encryption is widely recognized as an important research topic in the area of cryptography. Especially, for achieving chosen-ciphertext (CCA) security, there are mainly two research directions. Namely, one is to pursue both high security and efficiency without relying on random oracles (which do not exist in the real world), and the other is to put more stress on higher efficiency without losing reasonable security by introducing random oracles. Both directions have their individual merits, and it would be useful for both theory and practice if there existed a unified methodology handling the two approaches simultaneously. However, they are based on fundamentally different paradigms, e.g., the twin encryption paradigm [28], the universal hash proof paradigm [14,15], the identity-based encryption paradigm [12,24], and the broadcast encryption paradigm [17,21] for the former, and the plaintext awareness paradigm [4] and the plaintext simulatability paradigm [18] for the latter. Therefore, it seems not easy to establish a unified methodology.

1.2
Our Contribution
I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 55–69, 2010.
© Springer-Verlag Berlin Heidelberg 2010

In this paper, we propose a generic method for converting a CCA secure key encapsulation mechanism (KEM) which has a specific (but natural) property
into a more efficient one in the random oracle model. More precisely, via our conversion, an arbitrary KEM with constrained CCA security [22] (which is weaker than CCA security) and reproducibility (which we will explain later) can be generically transformed into a CCA secure KEM with shorter ciphertext length. By applying this conversion to existing schemes which do not rely on random oracles, we can also immediately construct CCA secure KEMs with shorter ciphertexts in the random oracle model. For example, if we transform the Cramer-Shoup scheme [14] via our conversion, the resulting scheme becomes another CCA secure KEM in the random oracle model whose ciphertext length is only the plaintext length plus the length of an element of the underlying cyclic group. The underlying mathematical hardness assumptions are identical in both schemes, except that the transformed scheme requires random oracles. There are various concrete instantiations of our generic method, and we present them in Sect. 5. We do not insist that our proposed method unifies the existing paradigms for yielding CCA security in both the standard and the random oracle model, but that it somewhat simultaneously handles some class of practical constructions in both models. Roughly speaking, our result implies that designing a CCA secure KEM with reproducibility in the standard model is also a promising approach to constructing a more efficient one in the random oracle model. Therefore, by using our technique, once we come across a new mathematical structure which can be utilized for designing a CCA secure KEM, it is also possible to "switch" it to a more compact one in the random oracle model if the original scheme has reproducibility. For example, interestingly, a recently proposed scheme due to Rosen and Segev [33] incidentally yields reproducibility, and thus we can significantly shorten its ciphertext length with the help of random oracles.

1.3
Related Works
Shoup [34] made the first attempt at analyzing the security of a single public key encryption scheme in both the standard and random oracle models: he showed that a variant of the Cramer-Shoup scheme [14] is provably CCA secure under the decisional Diffie-Hellman assumption in the standard model and under the computational Diffie-Hellman assumption in the random oracle model. Cash, Kiltz, and Shoup [13] proposed the twin Diffie-Hellman problem and showed that its mathematical structure is useful for constructing CCA secure KEMs in both models. Boldyreva and Fischlin [9] showed that one of the two random oracles in OAEP [7] can be instantiated with an existing function family. Boldyreva, Cash, Fischlin, and Warinschi [8] similarly showed that one of the two random oracles in the Bellare-Rogaway scheme [6] can be replaced with a realizable function. Pandey, Pass, and Vaikuntanathan [31] discussed the feasibility of constructing fairly efficient CCA secure public key encryption by using adaptively secure perfect one-way functions instead of random oracles; however, it is unknown whether such functions exist. There are also various techniques for converting weakly secure public key encryption in the standard model into a CCA secure one in the random oracle
A Generic Method for Reducing Ciphertext Length of Reproducible KEMs
model [6,7,19,20,30]. Since these schemes aim to acquire CCA security (with the help of random oracles) rather than to enhance efficiency, the resulting schemes are generally less efficient than the original schemes.
2 High Level Overview
The basic idea of our conversion is as follows. Let c = (ψ, π) and κ be a ciphertext and its corresponding session key, respectively, for a KEM Π. We assume that Π has the property that π is reproducible from ψ and the decryption key dk. By this property, the sender can remove the component π from the ciphertext (since reproducibility guarantees that π can be computed from ψ and dk). Then, by re-setting the ciphertext and the session key as C = ψ and K = H(ψ, π, κ), respectively, where H is a random oracle, we can construct another KEM Π′. Interestingly, Π′ is still CCA secure if Π is OW-CCCA secure. Roughly speaking, this is because in Π′ the sender cannot generate (C, K) unless he properly computes (c, κ), and thus, for a given (C, K), we can extract (c, κ) from the H-list and interpret the security of (C, K) as that of (c, κ). By assumption Π is OW-CCCA secure, and thus Π′ is CCA secure.
3 Definitions

3.1 Key Encapsulation Mechanism
A key encapsulation mechanism is a triple of probabilistic polynomial-time algorithms (G, E, D) such that: (1) the key generation algorithm G takes as input a security parameter 1^k (k ∈ N) and outputs a pair of a public key and a decryption key (pk, dk); (2) the encapsulation algorithm E takes as input a public key pk and outputs a pair of a session key and a ciphertext (K, C); and (3) the decapsulation algorithm D takes as input a decryption key and a ciphertext and outputs a session key K or the rejection symbol ⊥. We require that for every security parameter k ∈ N, Pr[(pk, dk) ← G(1^k); (K, C) ← E(pk) : D(dk, C) = K] = 1, where the probability is taken over the internal coin tosses of G and E.

3.2 Security Notion
Chosen-ciphertext security of a key encapsulation mechanism is defined using the following game. We use a slightly simpler definition in which the adversary is given the challenge ciphertext together with the public key. In the definition of IND-CCA security for public key encryption, the adversary is allowed to access the decapsulation oracle before obtaining the challenge ciphertext, but for a key encapsulation mechanism the definition here is equivalent to the two-phase one [25]. Definition 1. Let (G, E, D) be a key encapsulation mechanism. We say that (G, E, D) is IND-CCA secure when for every probabilistic polynomial-time algorithm A that does not query the decapsulation oracle with the challenge ciphertext C*, the following quantity
|Pr[b ← {0,1}; (pk, dk) ← G(1^k); (K_0, C*) ← E(pk); K_1 ← K; b′ ← A^{D(dk,·)}(pk, K_b, C*) : b′ = b] − 1/2|    (1)
is negligible in k, where K is the key space from which the session key is picked. In the following we define the notion of one-wayness against constrained chosen-ciphertext attacks (OW-CCCA security for short). The constrained chosen-ciphertext attack was first introduced by Hofheinz and Kiltz [22] in order to give a sufficient condition for the security of the Kurosawa-Desmedt public key encryption scheme. Definition 2. Let (G, E, D) be a key encapsulation mechanism, and let CDec(dk, pred, C) be an oracle that returns D(dk, C) when pred(D(dk, C)) = 1 and ⊥ otherwise. We say that (G, E, D) is OW-CCCA secure when the quantity Pr[(pk, dk) ← G(1^k); (K*, C*) ← E(pk); K′ ← A^{CDec(dk,·,·)}(pk, C*) : K* = K′] is negligible in k for every probabilistic polynomial-time algorithm A that satisfies the following conditions: (1) A does not query the challenge ciphertext C* to the oracle (with any predicate pred); (2) A only queries CDec on predicates that are probabilistic polynomial-time computable; and (3) the quantity max_E (1/Q) Σ_{1≤i≤Q} Pr_{K←K}[pred_i(K) = 1 when A runs with E] is negligible in k, where E is an environment that A interacts with, Q is the number of queries A submits, pred_i is the predicate A submits in the i-th decapsulation query, and the maximum is taken over all E whose running time is not longer than that of the original OW-CCCA challenger.

3.3 Reproducibility
In this subsection we introduce the notion of reproducibility. Reproducibility requires that, given an incomplete ciphertext from which particular components are simply omitted, the receiver, who has the decryption key, is able to reconstruct the omitted part of the ciphertext. We formalize this intuition as follows: Definition 3. A key encapsulation mechanism (G, E, D) is said to be reproducible if the following two conditions hold: (1) the encryption algorithm E always outputs a tuple of two components as a ciphertext, and (2) there exists a polynomial-time algorithm R such that for all k ∈ N it holds that Pr[(pk, dk) ← G(1^k); (κ, (ψ, π)) ← E(pk); π′ ← R(dk, ψ) : π = π′] = 1. Bellare, Boldyreva, and Staddon [3] and Baek, Zhou, and Bao [2] defined notions of reproducibility in manners slightly different from each other. Here we give yet another definition of reproducibility, different from both of theirs, but for all known reproducible KEMs the essential mechanism of "reproduction" is common to the three definitions.
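To make Definition 3 concrete, consider the following toy sketch (a hypothetical Diffie-Hellman-style example, not a scheme from this paper; the group parameters are far too small to be secure): the ciphertext is (ψ, π) = (g^r, ĝ^r) with ĝ = g^w, so the algorithm R can recompute π from ψ alone using the decryption key.

```python
import secrets

# Toy reproducible KEM over a prime-order subgroup of Z_p^* (insecure toy
# parameters, for illustration of Definition 3 only).
P, Q, G = 2039, 1019, 4   # safe prime p = 2q + 1; g = 4 has prime order q

def keygen():
    w = secrets.randbelow(Q - 1) + 1
    z = secrets.randbelow(Q - 1) + 1
    pk = (pow(G, w, P), pow(G, z, P))   # (g^w, g^z)
    dk = (w, z)
    return pk, dk

def encap(pk):
    g_hat, h = pk
    r = secrets.randbelow(Q - 1) + 1
    psi = pow(G, r, P)       # first ciphertext component: g^r
    pi = pow(g_hat, r, P)    # second component: g^{wr}, redundant given dk
    kappa = pow(h, r, P)     # session key: g^{zr}
    return kappa, (psi, pi)

def decap(dk, c):
    w, z = dk
    psi, pi = c
    if pow(psi, w, P) != pi:   # consistency check on the second component
        return None            # reject (⊥)
    return pow(psi, z, P)

def reproduce(dk, psi):
    """The algorithm R of Definition 3: recompute pi from psi and dk."""
    w, _ = dk
    return pow(psi, w, P)

pk, dk = keygen()
kappa, (psi, pi) = encap(pk)
assert decap(dk, (psi, pi)) == kappa   # correctness
assert reproduce(dk, psi) == pi        # reproducibility
```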
4 The Proposed Transformation
In this section we describe our transformation. Given a key encapsulation mechanism (G, E, D) having OW-CCCA security and reproducibility, we construct a key encapsulation mechanism (G′, E′, D′) which is IND-CCA secure in the random oracle model and more efficient than (G, E, D). We require that, given pk and ψ, it is efficiently checkable whether there exists π such that Pr[(ψ, π) ← E(pk)] ≠ 0. We say that ψ is valid on pk if such a π exists. We further require that the decapsulation algorithm D outputs ⊥ for any input c such that Pr[c ← E(pk)] = 0. In the following, K denotes the key space from which the session key is picked. The construction is as follows: Key generation. The algorithm G′(1^k) runs G(1^k) to obtain (pk, dk), and chooses a cryptographic hash function H : {0,1}* → K. The public key PK is (pk, H) and the decryption key DK is dk. Encapsulation. Using the public key PK = (pk, H), the algorithm E′(PK) first runs E(pk) to obtain (κ, (ψ, π)). It then computes the session key as K ← H(ψ, π, κ). The final ciphertext C is ψ. Decapsulation. To decrypt a ciphertext C = ψ, the algorithm D′(DK, C) first checks whether ψ is valid on pk. If ψ is not valid, it outputs ⊥. Otherwise, it invokes the reproduction algorithm R(dk, ψ) to reconstruct π. It then decrypts the complete ciphertext (ψ, π) and obtains κ ← D(dk, (ψ, π)). Finally it outputs K ← H(ψ, π, κ) as the session key. Theorem 1. In the transformation above, if (G, E, D) is OW-CCCA secure and H is modeled as a random oracle, then (G′, E′, D′) is IND-CCA secure. Proof. Given an adversary A which breaks the IND-CCA security of (G′, E′, D′), we construct another adversary B which breaks the OW-CCCA security of (G, E, D). To simulate the decapsulation oracle and the random oracle for A, the adversary B maintains a list of tuples (ψ_j, π_j, κ_j, K_j), in which π_j and κ_j may take the blank symbol "-". This list (the H-list) is initially empty. By interacting with A, adversary B works as follows: Setup.
The adversary B receives pk and (ψ*, π*) as the public key and the challenge ciphertext of the OW-CCCA game for (G, E, D). It then chooses a session key K* uniformly at random from K, and sends ((pk, H), K*, ψ*) to A. Here H is a random oracle controlled by B as follows. Decapsulation query. When A makes a decapsulation query ψ†, B checks whether ψ† is valid on pk and returns ⊥ if it is not. Otherwise, B looks for an entry (ψ_j, π_j, κ_j, K_j) which satisfies the following: – the first component ψ_j is equal to the decapsulation query ψ†, and – (ψ_j, π_j) correctly encrypts κ_j. In order to find such an entry, B first picks the entries whose first component ψ_j is equal to the decapsulation query ψ†. For each of these entries, B checks whether (ψ_j, π_j) encrypts κ_j by asking B's own decapsulation oracle CDec. More precisely, B defines the predicate pred_{κ_j} as pred_{κ_j}(κ) = 1 if κ = κ_j and
pred_{κ_j}(κ) = 0 otherwise (note that κ_j is hard-coded into pred_{κ_j}), and queries ((ψ_j, π_j), pred_{κ_j}) to CDec. If B receives κ_j for some entry, it returns the fourth component K_j of this entry to A. Otherwise, if no such entry is found in the H-list, B chooses a random K ← K, adds (ψ†, -, -, K) to the H-list, and returns K to A. H-query. When A makes the query (ψ†, π†, κ†) to the random oracle, B proceeds as follows: 1. If (ψ†, π†) is equal to the challenge ciphertext (ψ*, π*), which is the input to B, B chooses a random K ← K, adds (ψ†, π†, κ†, K) to the H-list, and returns K to A. 2. If (ψ†, π†) correctly encrypts κ† (to examine whether it does, B again uses the oracle CDec as explained above) and there exists in the H-list an entry of the form (ψ†, -, -, K) for some K, B replaces the entry (ψ†, -, -, K) with (ψ†, π†, κ†, K) and returns K to A. 3. Otherwise B chooses a random K ← K, adds (ψ†, π†, κ†, K) to the H-list, and returns K to A. Finding the session key. Finally A outputs its guess b′ ∈ {0, 1}. At this point B picks a random entry (ψ_j, π_j, κ_j, K_j) from the entries whose first and second components (ψ_j, π_j) are equal to the challenge ciphertext (ψ*, π*). Then B outputs κ_j. Let Q be the event where A queries (ψ*, π*, κ*) (the challenge ciphertext, its reproducible part, and its correct decryption result) to the random oracle. Due to the treatment of H-queries (specifically item 1), A's view in the simulation differs from the one in the real attack if and only if the event Q occurs. This is because the random oracle in the simulation responds to the query (ψ*, π*, κ*) with a value independent of K*, whereas the random oracle in the real attack responds to the query (ψ*, π*, κ*) with a random value with probability 1/2 and with K* itself with probability 1/2. This is the only difference between the simulation and the real attack.
Here we analyze the probability that B correctly outputs the decryption κ* of the challenge ciphertext (ψ*, π*). Lemma 1. Pr[Q] in the simulation above is equal to Pr[Q] in the real attack. Proof. Let Q_l be the event where one of the first l queries of A to the random oracle contains (ψ*, π*, κ*). We prove by induction on l that Pr[Q_l] in the simulation is equal to Pr[Q_l] in the real attack for all l. Both in the simulation and in the real attack, we have Pr[Q_0] = 0. Now assume that for some l ≥ 1, Pr[Q_{l−1}] in the simulation is equal to Pr[Q_{l−1}] in the real attack; we show that the equality also holds for Q_l. We have Pr[Q_l] = Pr[Q_l | Q_{l−1}] Pr[Q_{l−1}] + Pr[Q_l | ¬Q_{l−1}] Pr[¬Q_{l−1}] = Pr[Q_{l−1}] + Pr[Q_l | ¬Q_{l−1}] Pr[¬Q_{l−1}], so it suffices to argue that Pr[Q_l | ¬Q_{l−1}] in the simulation is equal to Pr[Q_l | ¬Q_{l−1}] in the real attack. Observe that as long as A does not query (ψ*, π*, κ*) to the random oracle, its view in the simulation is identical to the one in the real attack. This implies that Pr[Q_l | ¬Q_{l−1}] in the simulation is equal to Pr[Q_l | ¬Q_{l−1}] in the real attack, and the equality follows.
Lemma 2. Let ε be the advantage of A (the quantity of (1)). In the real attack, Pr[Q] ≥ 2ε. Proof. In the real attack, if A does not query (ψ*, π*, κ*) then the decryption of the challenge ciphertext ψ* is independent of A's view, and therefore Pr[A wins | ¬Q] = 1/2. We derive upper and lower bounds on Pr[A wins]: Pr[A wins] = Pr[A wins | ¬Q] Pr[¬Q] + Pr[A wins | Q] Pr[Q] = (1/2) Pr[¬Q] + Pr[A wins | Q] Pr[Q] ≤ (1/2) Pr[¬Q] + Pr[Q] = 1/2 + (1/2) Pr[Q], and Pr[A wins] ≥ Pr[A wins | ¬Q] Pr[¬Q] = (1/2)(1 − Pr[Q]) = 1/2 − (1/2) Pr[Q]. It follows that |Pr[A wins] − 1/2| ≤ (1/2) Pr[Q]. By the assumption on A it holds that |Pr[A wins] − 1/2| ≥ ε, and thus we conclude that Pr[Q] ≥ 2ε. Due to Lemmas 1 and 2, Pr[Q] ≥ 2ε also holds in the simulation. If A makes q queries to the random oracle, then B can choose the correct entry from the H-list (and win the OW-CCCA game) with probability at least 2ε/q. Remark 1. The proposed scheme gives a novel application of the notion of CCCA security. This notion was originally defined by Hofheinz and Kiltz [22] in order to explain the Kurosawa-Desmedt construction of public key encryption [27], in which the KEM part does not need to provide CCA security on its own. Kiltz and Vahlis [26] also employed the notion of CCCA security for identity-based KEMs to reduce the decryption cost in CCA secure IBE. In addition, due to Theorem 1, CCCA secure KEMs gain a new application: Theorem 1 states that if we obtain a CCCA secure KEM with reproducibility, we automatically obtain a more efficient CCA secure KEM in the random oracle model. As we will discuss in Sect. 6, this transformation is quite interesting from both theoretical and practical viewpoints. We can obtain two IND-CCA secure PKEs from (G, E, D) and (G′, E′, D′), respectively, by combining them with appropriate DEMs. These two PKEs can be implemented on a single chip simultaneously at a small additional cost.
This is because our transformation requires only a single additional hash computation, so the two schemes can share a large part of the circuit implementing the key encapsulation mechanism. Notice that the notion of IND-CCA security is the standard requirement for PKE schemes. We refer readers to [17,4] for a rigorous definition of IND-CCA secure PKE, to [34,32] for a construction and rigorous treatment of IND-CCA secure DEMs, and to [5] for authenticated encryption. From Theorem 1, we also immediately have the following two corollaries. Corollary 1. In the transformation above, if (G, E, D) is IND-CCCA secure and H is modeled as a random oracle, then (G′, E′, D′) is IND-CCA secure. Furthermore, by combining (G, E, D) with an authenticated encryption scheme and (G′, E′, D′) with an IND-CCA secure DEM, we can obtain two IND-CCA secure PKEs. Corollary 2. In the transformation above, if (G, E, D) is IND-CCA secure and H is modeled as a random oracle, then (G′, E′, D′) is IND-CCA secure. Furthermore, we can use a common IND-CCA secure DEM to obtain two IND-CCA secure PKEs from both (G, E, D) and (G′, E′, D′).
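The transformation of Sect. 4 can be sketched end-to-end as follows. This is a minimal executable toy, not one of the paper's concrete instantiations: the underlying reproducible KEM is a hypothetical Diffie-Hellman-style scheme over an insecure toy group, SHA-256 stands in for the random oracle H, and the validity check on ψ is trivial in this group and therefore omitted.

```python
import hashlib
import secrets

# Toy reproducible KEM (G, E, D) with reproduction algorithm R, over an
# insecure toy group (illustration only).
P, Q, G = 2039, 1019, 4            # safe prime p = 2q + 1; g = 4 has order q

def keygen():
    w, z = (secrets.randbelow(Q - 1) + 1 for _ in range(2))
    return (pow(G, w, P), pow(G, z, P)), (w, z)   # pk = (g^w, g^z), dk = (w, z)

def encap(pk):                      # underlying KEM: ciphertext is (psi, pi)
    r = secrets.randbelow(Q - 1) + 1
    return pow(pk[1], r, P), (pow(G, r, P), pow(pk[0], r, P))

def reproduce(dk, psi):             # the algorithm R of Definition 3
    return pow(psi, dk[0], P)

def decap(dk, c):
    psi, pi = c
    return pow(psi, dk[1], P) if pow(psi, dk[0], P) == pi else None

# --- transformed KEM: the ciphertext is psi alone, K = H(psi, pi, kappa) ---
def H(psi, pi, kappa):              # SHA-256 stands in for the random oracle
    data = b"".join(x.to_bytes(32, "big") for x in (psi, pi, kappa))
    return hashlib.sha256(data).digest()

def encap2(pk):
    kappa, (psi, pi) = encap(pk)
    return H(psi, pi, kappa), psi   # drop pi from the ciphertext

def decap2(dk, psi):
    pi = reproduce(dk, psi)         # receiver reconstructs the dropped part
    kappa = decap(dk, (psi, pi))
    return None if kappa is None else H(psi, pi, kappa)

pk, dk = keygen()
K, C = encap2(pk)
assert decap2(dk, C) == K           # correctness of the transformed KEM
```

Note that the transformed ciphertext is a single group element, while the underlying KEM's ciphertext consists of two.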
5 Instantiations of the Proposed Transformation
In this section, we describe several instantiations of our proposed transformation.

5.1 Assumptions
Throughout this and the following section we use the cryptographic assumptions below. Definition 4. Let G be a group of prime order p. We say that the (τ, ε)-DDH assumption holds on G when for every τ-time algorithm A it holds that |Pr[A(g, g^α, g^β, g^{αβ}) = 1] − Pr[A(g, g^α, g^β, g^γ) = 1]| ≤ ε, where the probability is taken over the random choices of α, β, γ, and the generator g. Definition 5. Let G be a group of prime order p. We say that the (τ, ε)-CDH assumption holds on G when for every τ-time algorithm A it holds that Pr[A^O(g, g^α, g^β) = g^{αβ}] ≤ ε, where the probability is taken over the random choices of α, β and the generator g, and the oracle O always responds with the empty string. We say that the (τ, ε)-GDH assumption holds on G [29] when the same condition holds except that O = O_DDH, where the oracle O_DDH(g, g^x, g^y, g^z) returns 1 when xy = z and returns 0 otherwise. Definition 6. Let G_1 and G_2 be bilinear groups of prime order p for which there exists an efficiently computable non-degenerate bilinear map e : G_1 × G_2 → G_T such that e(g^x, h^y) = e(g, h)^{xy} for all integers x and y. We say that the (τ, ε)-BDH assumption holds on G_1 and G_2 [10] when for every τ-time algorithm A it holds that Pr[A(g, h, g^α, g^β, g^γ) = e(g, h)^{αβγ}] ≤ ε, where the probability is taken over the random choices of α, β, γ, the generator g of G_1, and the generator h of G_2. Definition 7. A triple of algorithms (I, F, F^{−1}) is said to be a collection of trapdoor functions secure under uniform k-repetition if the following conditions hold: (1) the algorithm I on input 1^k outputs a pair (s, t) ∈ {0,1}^k × {0,1}^k, where s is a description of a function f_s : {0,1}^k → {0,1}^k; (2) the algorithm F on input (s, x) outputs f_s(x); (3) for every probabilistic polynomial-time algorithm A, Pr[A(1^k, s_1, . . . , s_k, F(s_1, x), . . . , F(s_k, x)) = x] is negligible, where the probability is taken over the random generation of s_1, . . . , s_k, the random choice of x, and the internal coin tosses of A; and (4) for all (s, t) in the range of I and all x ∈ {0,1}^k, it holds that F^{−1}(t, F(s, x)) = x.

5.2 Instantiations
Here we describe several instantiations of our proposed transformation. In the following, TCR denotes a target collision resistant hash function and H denotes a cryptographic hash function which is modeled as a random oracle in the security proof. We first show the instantiations from a variant of the Cramer-Shoup KEM, a variant of the Kurosawa-Desmedt KEM, the Hanaoka-Kurosawa KEM, the Kiltz
KEM, and the Boyen-Mei-Waters KEM. In all of these instantiations except the one from the Boyen-Mei-Waters KEM, G denotes a group of prime order p and g a generator of it. In the instantiation from the Boyen-Mei-Waters KEM, G_1 and G_2 denote groups of prime order p equipped with a bilinear map e : G_1 × G_2 → G_T, and g and h denote generators of G_1 and G_2, respectively. Instantiation from a Variant of the Cramer-Shoup KEM. The instantiation from a variant of the Cramer-Shoup KEM [16] is as follows: Key generation. Choose a random (w, x, y, z) ← Z_p^4 and compute ĝ = g^w, e = g^x, f = g^y, h = g^z. Pick a target collision resistant hash function TCR : G^2 → Z_p and a cryptographic hash function H : G^4 → K, where K is a set from which the session key is chosen. Output the decryption key dk = (w, x, y, z) and the public key pk = (G, g, ĝ, e, f, h, TCR, H). Encapsulation. Pick a random r ← Z_p and compute C = g^r and K = H(g^r, ĝ^r, e^r f^{rv}, h^r) where v = TCR(g^r, ĝ^r). The session key is K and the ciphertext is C. Decapsulation. For a ciphertext C ∈ G, compute Ĉ = C^w and v = TCR(C, Ĉ), and output K = H(C, Ĉ, C^{x+yv}, C^z). For C ∉ G, output ⊥. Theorem 2. When the DDH assumption holds on G and H is modeled as a random oracle, the scheme above is IND-CCA secure. The above scheme is based on a modified version of the original Cramer-Shoup KEM. This modified version keeps w = log_g ĝ in the decryption key, while the original does not. This variant was originally described by Shoup [34] and also discussed by Cramer and Shoup [16, Sect. 9.3] (in which it was named CS3b). We employ this variant in order to reproduce ĝ^r from g^r. Instantiation from the Hanaoka-Kurosawa KEM. The instantiation from the Hanaoka-Kurosawa KEM (from the hashed Diffie-Hellman assumption) [21, Sect. 6] is as follows: Key generation. Generate a random polynomial f(x) = a_0 + a_1 x + a_2 x^2 over Z_p and compute (y_0, y_1, y_2) = (g^{a_0}, g^{a_1}, g^{a_2}).
Pick a target collision resistant hash function TCR : G → Z_p and a cryptographic hash function H : G^3 → K, where K is a set from which the session key is chosen. Output the decryption key dk = f(x) and the public key pk = (G, g, y_0, y_1, y_2, TCR, H). Encapsulation. Pick a random r ← Z_p and compute C = g^r and K = H(g^r, g^{rf(i)}, y_0^r) where i = TCR(g^r) (notice that one can easily compute g^{rf(i)} as (y_0 · y_1^i · y_2^{i^2})^r without the decryption key). The session key is K and the ciphertext is C. Decapsulation. For a ciphertext C ∈ G, compute i = TCR(C) and output K = H(C, C^{f(i)}, C^{a_0}). For C ∉ G, output ⊥. Theorem 3. When the CDH assumption holds on G and H is modeled as a random oracle, the scheme above is IND-CCA secure.
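As a sanity check of the algebra in this instantiation, the following is a toy sketch (insecure toy group; SHA-256 stands in for both TCR and the random oracle H, and all parameter choices are illustrative): the encapsulator computes g^{rf(i)} from the public key alone, while the decapsulator computes the same values as C^{f(i)} and C^{a_0}.

```python
import hashlib
import secrets

# Toy sketch of the Hanaoka-Kurosawa instantiation (illustration only).
P, Q, G = 2039, 1019, 4   # safe prime p = 2q + 1; g = 4 has prime order q

def tcr(x):   # stand-in for the target collision resistant hash into Z_q
    return int.from_bytes(hashlib.sha256(b"TCR" + x.to_bytes(32, "big")).digest(), "big") % Q

def H(*xs):   # stand-in for the random oracle
    return hashlib.sha256(b"".join(x.to_bytes(32, "big") for x in xs)).digest()

def keygen():
    a = [secrets.randbelow(Q) for _ in range(3)]   # f(x) = a0 + a1*x + a2*x^2
    y = [pow(G, ai, P) for ai in a]
    return y, a                                    # pk = (y0, y1, y2), dk = f

def encap(pk):
    y0, y1, y2 = pk
    r = secrets.randbelow(Q - 1) + 1
    c = pow(G, r, P)
    i = tcr(c)
    # g^{r f(i)} = (y0 * y1^i * y2^{i^2})^r, computed without the decryption key
    grf = pow(y0 * pow(y1, i, P) * pow(y2, i * i % Q, P) % P, r, P)
    return H(c, grf, pow(y0, r, P)), c

def decap(dk, c):
    a0, a1, a2 = dk
    i = tcr(c)
    fi = (a0 + a1 * i + a2 * i * i) % Q
    return H(c, pow(c, fi, P), pow(c, a0, P))      # C^{f(i)} and C^{a0} = y0^r

pk, dk = keygen()
K, C = keygen_out = encap(pk)
assert decap(dk, C) == K
```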
Instantiation from the Kiltz KEM. The instantiation from the Kiltz KEM [25] is as follows: Key generation. Choose a random (x, y) ← Z_p^2 and compute u = g^x, v = g^y. Pick a target collision resistant hash function TCR : G → Z_p and a cryptographic hash function H : G^3 → K, where K is a set from which the session key is chosen. Output the decryption key dk = (x, y) and the public key pk = (G, g, u, v, TCR, H). Encapsulation. Pick a random r ← Z_p and compute C = g^r and K = H(g^r, (u^t v)^r, u^r) where t = TCR(g^r). The session key is K and the ciphertext is C. Decapsulation. For a ciphertext C ∈ G, compute t = TCR(C) and output K = H(C, C^{xt+y}, C^x). For C ∉ G, output ⊥. Theorem 4. When the GDH assumption holds on G and H is modeled as a random oracle, the scheme above is IND-CCA secure. In Sect. 6.1 we will show a more efficient variant of the above instantiation. This variant happens to be identical to ECIES [1] (see Sect. 6.1 for details). Instantiation from the Boyen-Mei-Waters KEM. The instantiation from the Boyen-Mei-Waters KEM [11] is as follows: Key generation. Choose a random (a, y_1, y_2) ← Z_p^3 and compute h_0 = h^a, u_1 = g^{y_1}, u_2 = g^{y_2}, and Z = e(g, h)^a. Pick a target collision resistant hash function TCR : G_1 → Z_p and a cryptographic hash function H : G_1 × G_1 × G_T → K, where K is a set from which the session key is chosen. Output the decryption key dk = (h_0, y_1, y_2) and the public key pk = (G_1, G_2, G_T, e, g, h, u_1, u_2, Z, TCR, H). Encapsulation. Pick a random r ← Z_p and compute C = g^r and K = H(g^r, (u_1^t u_2)^r, Z^r) where t = TCR(g^r). The session key is K and the ciphertext is C. Decapsulation. For a ciphertext C ∈ G_1, compute t = TCR(C) and output K = H(C, C^{y_1 t + y_2}, e(C, h_0)). For C ∉ G_1, output ⊥. Theorem 5. When the BDH assumption holds on G_1 and G_2 and H is modeled as a random oracle, the scheme above is IND-CCA secure.
In the original papers of the above three schemes (the Hanaoka-Kurosawa KEM [21], the Kiltz KEM [25], and the Boyen-Mei-Waters KEM [11]), decisional assumptions (the hashed Diffie-Hellman assumption, the gap hashed Diffie-Hellman assumption, and the decisional bilinear Diffie-Hellman assumption, respectively) are used to prove indistinguishability of these schemes. However, their computational variants (the CDH assumption, the GDH assumption, and the BDH assumption, respectively) suffice to prove mere one-wayness of the schemes.
Table 1. Brief overview of several instantiations

Scheme             | Assumption                          | Security of KEM | Optimality
Variant of CS [16] | DDH                                 | IND-CCA         |
CS + Ours          | DDH + ROM                           | IND-CCA         | ✓
HK [21]            | HDH                                 | IND-CCCA        |
HK [21]            | CDH                                 | OW-CCCA         |
HK + Ours          | CDH + ROM                           | IND-CCA         | ✓
Kiltz [25]         | GHDH                                | IND-CCA         |
Kiltz [25]         | GDH                                 | OW-CCA          |
Kiltz [25] + Ours  | GDH + ROM                           | IND-CCA         | ✓
BMW [11]           | DBDH                                | IND-CCA         |
BMW [11]           | BDH                                 | OW-CCA          |
BMW + Ours         | BDH + ROM                           | IND-CCA         | ✓
RS [33]            | Correlated Products + Hard-Core Bit | IND-CCA         |
RS [33]            | Correlated Products                 | OW-CCA          |
RS + Ours          | Correlated Products + ROM           | IND-CCA         | ✓

For "Optimality", see Sect. 6.2.
Instantiation from the Rosen-Segev KEM. Let (I, F, F^{−1}) be a collection of trapdoor functions secure under uniform k-repetition, and let (G_Sig, S, V) be a strongly unforgeable one-time signature scheme. We assume that given s and y, it is efficiently checkable whether there exists x such that y = F(s, x). We further assume that the bit-length of the verification key of (G_Sig, S, V) is k (for every security parameter k). The instantiation from the Rosen-Segev KEM [33] is as follows: Key generation. On input 1^k, generate 2k pairs of descriptions of functions (s_1^0, s_1^1), . . . , (s_k^0, s_k^1) and the corresponding trapdoors (t_1^0, t_1^1), . . . , (t_k^0, t_k^1), independently. Pick a cryptographic hash function H : {0,1}* → K, where K is a set from which the session key is chosen. Output the decryption key dk = ((t_1^0, t_1^1), . . . , (t_k^0, t_k^1)) and the public key pk = ((s_1^0, s_1^1), . . . , (s_k^0, s_k^1), H). Encapsulation. Choose a random κ ← {0,1}^k and generate (vk, sk) ← G_Sig(1^k), where vk = (vk_1, . . . , vk_k) ∈ {0,1}^k. Then compute C = (vk, F(s_1^{vk_1}, κ), σ) and K = H(vk, F(s_1^{vk_1}, κ), . . . , F(s_k^{vk_k}, κ), σ, κ), where σ ← S(sk, (F(s_1^{vk_1}, κ), . . . , F(s_k^{vk_k}, κ))). The session key is K and the ciphertext is C. Decapsulation. For a ciphertext C = (vk, y, σ), parse vk = (vk_1, . . . , vk_k) and check whether there exists κ such that y = F(s_1^{vk_1}, κ). If no such κ exists, output ⊥. Otherwise, compute such κ as κ ← F^{−1}(t_1^{vk_1}, y) and y_i ← F(s_i^{vk_i}, κ) for every i ∈ {2, . . . , k}. Finally, if V(vk, (y, y_2, . . . , y_k), σ) = 1 holds, output K = H(vk, y, y_2, . . . , y_k, σ, κ); otherwise output ⊥. Theorem 6. When (I, F, F^{−1}) is a collection of trapdoor functions secure under uniform k-repetition, (G_Sig, S, V) is a strongly unforgeable one-time signature scheme, and H is modeled as a random oracle, the scheme above is IND-CCA secure. As in the instantiation from the Kiltz scheme, we will also show in Sect.
6.1 a more efficient variant of the above instantiation. Other Instantiations. Several existing KEMs in the random oracle model can be viewed as instantiations of the proposed transformation. For example, ECIES-KEM [1] (and its modifications by the twinning technique [13] and by signed
quadratic residues [23]) can be viewed as the instantiation from the ElGamal KEM (and its variants), which is OW-PCA secure. Another example is RSA-KEM, instantiated from textbook RSA, which provides OW-PCA security under the RSA assumption.
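The RSA-KEM example can be sketched as follows. This is a hedged toy illustration, not a standards-compliant RSA-KEM: the primes are tiny, SHA-256 stands in for H, and the session key is derived by hashing the ciphertext together with the random element (viewing textbook RSA as a deterministic trapdoor function).

```python
import hashlib
import secrets

# Toy RSA-KEM sketch (illustration only; real RSA needs ~1024-bit primes each).
P_, Q_ = 104729, 1299709            # small known primes, far too small to be secure
N = P_ * Q_
E = 65537
D = pow(E, -1, (P_ - 1) * (Q_ - 1))   # private exponent (Python 3.8+)

def kdf(c, m):                        # stand-in for the random oracle H
    return hashlib.sha256(c.to_bytes(8, "big") + m.to_bytes(8, "big")).digest()

def encap(n, e):
    m = secrets.randbelow(n - 2) + 2  # random element of Z_n
    c = pow(m, e, n)                  # textbook RSA: the ciphertext
    return kdf(c, m), c               # session key K = H(c, m)

def decap(n, d, c):
    m = pow(c, d, n)                  # invert with the trapdoor
    return kdf(c, m)

K, C = encap(N, E)
assert decap(N, D, C) == K
```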
6 Discussion

6.1 Variants of the Proposed Transformation
Here we describe variants of the transformation proposed in Sect. 4. These variants drop some components from the input of the hash function H and further drop some components from the ciphertext. The variants are based on the following observation: in the proof of Theorem 1, in order to respond to a decryption query ψ†, the simulator extracts the corresponding π† from the H-list and reconstructs (ψ†, π†, κ†), which is then submitted to the simulator's own decryption oracle to check whether κ† is the correct answer for the adversary. This implies that if there is some other functionality which enables checking whether κ† is the correct answer without using the simulator's decryption oracle, then π† need not be used for encryption and/or decryption. The Kiltz KEM (Sect. 5.2) and the Rosen-Segev KEM (Sect. 5.2) indeed have such a functionality: in the security proof of the Kiltz KEM, the simulator has access to the DDH oracle, which provides the above functionality; furthermore, the Rosen-Segev KEM obviously has the above functionality since F is a deterministic function. Due to this observation, the instantiation from the Kiltz KEM in Sect. 5.2 can be further simplified as follows: Key generation. Choose a random x ← Z_p and compute u = g^x. Output the decryption key dk = x and the public key pk = (G, g, u, H). Encapsulation. Pick a random r ← Z_p and compute C = g^r and K = H(g^r, u^r). The session key is K and the ciphertext is C. Decapsulation. For a ciphertext C ∈ G, output K = H(C, C^x). For C ∉ G, output ⊥. Similarly, the instantiation from the Rosen-Segev KEM in Sect. 5.2 can be simplified as follows: Key generation. On input 1^k, generate a pair of a function description and a trapdoor (s, t). Output the decryption key dk = t and the public key pk = s. Encapsulation. Choose a random κ ← {0,1}^k and compute C = F(s, κ) and K = H(F(s, κ), κ). The session key is K and the ciphertext is C. Decapsulation.
For a ciphertext C, check whether there exists κ such that C = F(s, κ). If such a κ exists, compute it as κ ← F^{−1}(t, C) and output K = H(C, κ). Otherwise, output ⊥. We emphasize that the above variants happen to be identical to some well-established schemes: the former is equivalent to ECIES [1], while the latter is equivalent to (the KEM part of) REACT [30]. This fact suggests that our methodology is quite powerful and promising.
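The simplified Kiltz variant above (the hashed-ElGamal / ECIES-KEM shape) can be sketched in a few lines. As before this is a toy illustration with insecure parameters, and SHA-256 stands in for the random oracle H.

```python
import hashlib
import secrets

# Toy sketch of the simplified variant: C = g^r, K = H(g^r, u^r).
P, Q, G = 2039, 1019, 4   # safe prime p = 2q + 1; g = 4 has prime order q

def H(a, b):              # stand-in for the random oracle
    return hashlib.sha256(a.to_bytes(32, "big") + b.to_bytes(32, "big")).digest()

def keygen():
    x = secrets.randbelow(Q - 1) + 1
    return pow(G, x, P), x               # pk = u = g^x, dk = x

def encap(u):
    r = secrets.randbelow(Q - 1) + 1
    c = pow(G, r, P)
    return H(c, pow(u, r, P)), c         # K = H(g^r, u^r), ciphertext C = g^r

def decap(x, c):
    return H(c, pow(c, x, P))            # u^r = C^x

pk, dk = keygen()
K, C = encap(pk)
assert decap(dk, C) == K
```

The ciphertext here is a single group element and the hash input no longer contains the dropped component π at all, matching the observation that the DDH oracle takes over π's role in the proof.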
6.2 Optimality
We now discuss the optimality of the ciphertext length of the several instantiations above. Intuitively, in all of the above Diffie-Hellman type instantiations the ciphertext is just one group element, which seems "optimal" for a Diffie-Hellman type instantiation. Here we give a more rigorous discussion. We first define optimality of the ciphertext length of a KEM from the viewpoint of interpreting the ciphertext as an encoding of the transmitted session key. If we view the ciphertext in this way, it is natural to say that "the ciphertext length is optimal" when it is equal to the Shannon entropy of the session key from the view of the receiver. The formal definition is as follows: Definition 8. We say that a key encapsulation mechanism (G, E, D) is optimal if H(K | PK, DK) = |C| holds, where H(· | ·) denotes the conditional Shannon entropy, |C| denotes the bit-length of C, and K, PK, and DK are the random variables induced by the session key, the public key, and the decryption key, respectively. The instantiations described in Sect. 5.2 are all optimal in the sense of Definition 8. Specifically, when H is instantiated as an injective key derivation function, they all satisfy H(K | PK, DK) = log_2 p (where p is the order of the underlying group). Therefore, if we use an elliptic curve group as the underlying group, a group element is represented in log_2 p bits and the optimality of Definition 8 is achieved (see Table 1).

6.3 Toward a Unified Methodology in the Two Models
The observations in Sect. 5.2 imply that state-of-the-art schemes in the two models (the standard and the random oracle models) can be obtained from a unified methodology. For example, the instantiation from the Hanaoka-Kurosawa KEM (Sect. 5.2) achieves optimal efficiency (one group element as the ciphertext overhead) from the weakest assumption (the CDH assumption) in the random oracle model. We note that this combination of efficiency and security is the state of the art, first achieved by Cash, Kiltz, and Shoup [13]. We also note that the Hanaoka-Kurosawa KEM on its own achieves state-of-the-art efficiency (two group elements (and a MAC) as the ciphertext overhead) and security (the hashed Diffie-Hellman assumption) in the standard model. Therefore, based on the proposed transformation, two state-of-the-art schemes, in the standard and the random oracle models respectively, are explained in a unified way.
Acknowledgments We would like to thank the anonymous reviewers of IWSEC 2010 for their invaluable comments. We are also grateful to Yutaka Kawai, Yoshikazu Hanatani, and Shota Yamada for helpful discussions.
Y. Sakai et al.
References 1. Abdalla, M., Bellare, M., Rogaway, P.: The oracle Diffie-Hellman assumptions and an analysis of DHIES. In: Naccache, D. (ed.) CT-RSA 2001. LNCS, vol. 2020, pp. 143–158. Springer, Heidelberg (2001) 2. Baek, J., Zhou, J., Bao, F.: Generic constructions of stateful public key encryption and their applications. In: Bellovin, S.M., Gennaro, R., Keromytis, A.D., Yung, M. (eds.) ACNS 2008. LNCS, vol. 5037, pp. 75–93. Springer, Heidelberg (2008) 3. Bellare, M., Boldyreva, A., Staddon, J.: Randomness re-use in multi-recipient encryption schemes. In: Desmedt, Y.G. (ed.) PKC 2003. LNCS, vol. 2567, pp. 85–99. Springer, Heidelberg (2002) 4. Bellare, M., Desai, A., Pointcheval, D., Rogaway, P.: Relations among notions of security for public-key encryption schemes. In: Krawczyk, H. (ed.) CRYPTO 1998. LNCS, vol. 1462, pp. 26–45. Springer, Heidelberg (1998) 5. Bellare, M., Namprempre, C.: Authenticated encryption: Relations among notions and analysis of the generic composition paradigm. In: Okamoto, T. (ed.) ASIACRYPT 2000. LNCS, vol. 1976, pp. 531–545. Springer, Heidelberg (2000) 6. Bellare, M., Rogaway, P.: Random oracles are practical: A paradigm for designing efficient protocols. In: CCS 1993: Proceedings of the 1st ACM conference on Computer and Communications Security, pp. 62–73. ACM, New York (1993) 7. Bellare, M., Rogaway, P.: Optimal asymmetric encryption. In: De Santis, A. (ed.) EUROCRYPT 1994. LNCS, vol. 950, pp. 92–111. Springer, Heidelberg (1995) 8. Boldyreva, A., Cash, D., Fischlin, M., Warinschi, B.: Foundations of non-malleable hash and one-way functions. In: Matsui, M. (ed.) ASIACRYPT 2009. LNCS, vol. 5912, pp. 524–541. Springer, Heidelberg (2009) 9. Boldyreva, A., Fischlin, M.: On the security of OAEP. In: Lai, X., Chen, K. (eds.) ASIACRYPT 2006. LNCS, vol. 4284, pp. 210–225. Springer, Heidelberg (2006) 10. Boneh, D., Franklin, M.: Identity-based encryption from the Weil pairing. In: Kilian, J. (ed.) CRYPTO 2001. LNCS, vol. 2139, pp. 213–229. 
Springer, Heidelberg (2001) 11. Boyen, X., Mei, Q., Waters, B.: Direct chosen ciphertext security from identity-based techniques. In: CCS 2005: Proceedings of the 12th ACM Conference on Computer and Communications Security, pp. 320–329. ACM, New York (2005) 12. Canetti, R., Halevi, S., Katz, J.: Chosen-ciphertext security from identity-based encryption. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027, pp. 207–222. Springer, Heidelberg (2004) 13. Cash, D., Kiltz, E., Shoup, V.: The twin Diffie-Hellman problem and applications. In: Smart, N.P. (ed.) EUROCRYPT 2008. LNCS, vol. 4965, pp. 127–145. Springer, Heidelberg (2008) 14. Cramer, R., Shoup, V.: A practical public key cryptosystem provably secure against adaptive chosen ciphertext attack. In: Krawczyk, H. (ed.) CRYPTO 1998. LNCS, vol. 1462, pp. 13–25. Springer, Heidelberg (1998) 15. Cramer, R., Shoup, V.: Universal hash proofs and a paradigm for adaptive chosen ciphertext secure public-key encryption. In: Knudsen, L.R. (ed.) EUROCRYPT 2002. LNCS, vol. 2332, pp. 45–64. Springer, Heidelberg (2002) 16. Cramer, R., Shoup, V.: Design and analysis of practical public-key encryption schemes secure against adaptive chosen ciphertext attack. SIAM J. Comput. 33(1), 167–226 (2003) 17. Dolev, D., Dwork, C., Naor, M.: Non-malleable cryptography. In: STOC 1991: Proceedings of the Twenty-Third Annual ACM Symposium on Theory of Computing, pp. 542–552. ACM, New York (1991)
A Generic Method for Reducing Ciphertext Length of Reproducible KEMs
18. Fujisaki, E.: Plaintext simulatability. IEICE Transactions 89-A(1), 55–65 (2006) 19. Fujisaki, E., Okamoto, T.: How to enhance the security of public-key encryption at minimum cost. In: Imai, H., Zheng, Y. (eds.) PKC 1999. LNCS, vol. 1560, pp. 53–68. Springer, Heidelberg (1999) 20. Fujisaki, E., Okamoto, T.: Secure integration of asymmetric and symmetric encryption schemes. In: Wiener, M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 537–554. Springer, Heidelberg (1999) 21. Hanaoka, G., Kurosawa, K.: Efficient chosen ciphertext secure public key encryption under the computational Diffie-Hellman assumption. In: Pieprzyk, J. (ed.) ASIACRYPT 2008. LNCS, vol. 5350, pp. 308–325. Springer, Heidelberg (2008) 22. Hofheinz, D., Kiltz, E.: Secure hybrid encryption from weakened key encapsulation. In: Menezes, A. (ed.) CRYPTO 2007. LNCS, vol. 4622, pp. 553–571. Springer, Heidelberg (2007) 23. Hofheinz, D., Kiltz, E.: The group of signed quadratic residues and applications. In: Halevi, S. (ed.) CRYPTO 2009. LNCS, vol. 5677, pp. 637–653. Springer, Heidelberg (2009) 24. Kiltz, E.: Chosen-ciphertext security from tag-based encryption. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 581–600. Springer, Heidelberg (2006) 25. Kiltz, E.: Chosen-ciphertext secure key-encapsulation based on gap hashed Diffie-Hellman. In: Okamoto, T., Wang, X. (eds.) PKC 2007. LNCS, vol. 4450, pp. 282–297. Springer, Heidelberg (2007) 26. Kiltz, E., Vahlis, Y.: CCA2 secure IBE: Standard model efficiency through authenticated symmetric encryption. In: Malkin, T.G. (ed.) CT-RSA 2008. LNCS, vol. 4964, pp. 221–238. Springer, Heidelberg (2008) 27. Kurosawa, K., Desmedt, Y.: A new paradigm of hybrid encryption scheme. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 426–442. Springer, Heidelberg (2004) 28. Naor, M., Yung, M.: Public-key cryptosystems provably secure against chosen ciphertext attacks.
In: STOC 1990: Proceedings of the Twenty-Second Annual ACM Symposium on Theory of Computing, pp. 427–437. ACM, New York (1990) 29. Okamoto, T., Pointcheval, D.: The gap-problems: A new class of problems for the security of cryptographic schemes. In: Kim, K.-c. (ed.) PKC 2001. LNCS, vol. 1992, pp. 104–118. Springer, Heidelberg (2001) 30. Okamoto, T., Pointcheval, D.: REACT: Rapid enhanced-security asymmetric cryptosystem transform. In: Naccache, D. (ed.) CT-RSA 2001. LNCS, vol. 2020, pp. 159–175. Springer, Heidelberg (2001) 31. Pandey, O., Pass, R., Vaikuntanathan, V.: Adaptive one-way functions and applications. In: Wagner, D. (ed.) CRYPTO 2008. LNCS, vol. 5157, pp. 57–74. Springer, Heidelberg (2008) 32. Phan, D.H., Pointcheval, D.: About the security of ciphers (semantic security and pseudo-random permutations). In: Handschuh, H., Hasan, M.A. (eds.) SAC 2004. LNCS, vol. 3357, pp. 182–197. Springer, Heidelberg (2004) 33. Rosen, A., Segev, G.: Chosen-ciphertext security via correlated products. In: Reingold, O. (ed.) TCC 2009. LNCS, vol. 5444, pp. 419–436. Springer, Heidelberg (2009) 34. Shoup, V.: Using hash functions as a hedge against chosen ciphertext attack. In: Preneel, B. (ed.) EUROCRYPT 2000. LNCS, vol. 1807, pp. 275–288. Springer, Heidelberg (2000)
An Improvement of Key Generation Algorithm for Gentry’s Homomorphic Encryption Scheme

Naoki Ogura¹, Go Yamamoto², Tetsutaro Kobayashi², and Shigenori Uchiyama¹

¹ Tokyo Metropolitan University
[email protected]
² NTT Information Sharing Platform Laboratories
Abstract. One way of improving the efficiency of Gentry’s fully homomorphic encryption is controlling the number of operations, but to our knowledge no scheme that controls this bound has been proposed. In this paper, we propose a key generation algorithm for Gentry’s homomorphic encryption scheme that controls the bound of the circuit depth by using the relation between the circuit depth and the eigenvalues of a basis of a lattice. We present experimental results showing that the proposed algorithm is practical. We also discuss the security, for practical use, of the lattice bases generated by the algorithm.
1 Introduction
Some encryption schemes, such as the RSA, Paillier [15], and Okamoto-Uchiyama [14] schemes, have a homomorphic property. The homomorphic property enables us to operate on encrypted data without being able to decrypt the data. This property has various applications, such as secure voting systems or cross-table generation. Many homomorphic encryption schemes support the homomorphic property for only one operation; i.e., no such scheme is capable of evaluating an arbitrary function. Constructing a fully homomorphic encryption scheme that could evaluate all functions was a long-standing open problem in cryptography. In 2009, Gentry [5] solved this problem by using ideal lattices. Gentry showed that a fully homomorphic encryption scheme can be constructed in three stages: First, he proposed an abstract construction of homomorphic encryption schemes for some functions. Second, he embodied the idea with ideal lattices; we call this scheme Gentry’s basic scheme. Third, he proposed how to extend the scheme so that it has the fully homomorphic property; we call this scheme Gentry’s full scheme. Here, we concentrate on the basic scheme, because the efficiency of the full scheme is much lower than that of the basic scheme, and we consider that we can construct a practical full scheme by improving the basic scheme. The key generation algorithm of Gentry’s basic scheme generates a random basis of an ideal lattice as the private key. The bound on the number of operations depends on this basis, so it is difficult to control the number of executable operations in advance. Therefore, we must repeat the key generation until the scheme can handle the desired number of operations. In other words, controlling the bound enables us to construct an efficient version of Gentry’s scheme. The problem then naturally arises of how to fix the number of operations before generating the keys. In this paper, we address this problem by proposing a key generation algorithm that controls the bound of the circuit depth by using the relation between the circuit depth and the eigenvalues of a basis of a lattice. That is, the proposed key generation algorithm enables us to create a practical homomorphic encryption scheme for a given number of operations. We discuss the security, for practical use, of the lattice bases generated by the algorithm. Also, we describe an efficient implementation of Gentry’s scheme and show that the proposed algorithm is practical based on experimental results.

This paper is organized as follows. In Section 2, we briefly describe ideal lattices and Gentry’s scheme. In Section 3, we discuss the problem that is dealt with in this paper. In Section 4, we propose an algorithm to address the problem. In Section 5, we explain the efficiency and the security analysis of the proposed algorithm. In Section 6, we present our conclusions.

I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 70–83, 2010.
© Springer-Verlag Berlin Heidelberg 2010
2 Preliminaries
In this section, we explain some basic definitions and facts.

2.1 Definitions on Lattices
Gentry [5] used ideal lattices for constructing a homomorphic encryption scheme. In this section, we briefly review ideal lattices.

Definition 1 (Ideal Lattices). Let R be the residue class ring of the integer univariate polynomial ring Z[x] modulo the ideal (f(x)), where f(x) is a monic integer univariate polynomial of degree n. Then, R is isomorphic to a sublattice of Z^n as a Z-module. We define an ideal lattice (on f) as a sublattice of Z^n isomorphic to an ideal of R.

This isomorphism enables us to introduce multiplication over Z^n by using that over R. So ideal lattices have two operations: addition as a sublattice of Z^n, and multiplication corresponding to polynomial multiplication modulo f. One of the simplest ideals of R is a principal ideal. Sublattices corresponding to principal ideals play important roles in constructing practical encryption schemes.

Definition 2 (Rotation Basis). For a vector v = (v_0, v_1, ..., v_{n−1})^t ∈ Z^n, we define v̄ := v_0 + v_1 x + ··· + v_{n−1} x^{n−1} mod f in R. Any element of the principal ideal (v̄) can be written as a linear combination of the generators v̄, v̄x, ..., v̄x^{n−1}. By rot(v), we denote the matrix consisting of these generators.¹

¹ Gentry [5] refers to such a basis as a “rotation basis.”
For example, if f(x) = x^n − 1, rot(v) is the circulant matrix

\[
\mathrm{rot}(v) = \begin{pmatrix}
v_0 & v_{n-1} & \cdots & v_2 & v_1 \\
v_1 & v_0 & \cdots & v_3 & v_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
v_{n-2} & v_{n-3} & \cdots & v_0 & v_{n-1} \\
v_{n-1} & v_{n-2} & \cdots & v_1 & v_0
\end{pmatrix}.
\]

We refer to the lattice corresponding to this basis as a cyclic lattice. We can see that for f(x) = x^n + a_{n−1}x^{n−1} + ··· + a_0, rot(v) = (b_{ij})_{i,j} satisfies the following recurrence:

\[
b_{ij} = \begin{cases}
v_{i-1} & (1 \le i \le n,\ j = 1) \\
-b_{n,j-1}\,a_0 & (i = 1,\ 2 \le j \le n) \\
b_{i-1,j-1} - b_{n,j-1}\,a_{i-1} & (2 \le i \le n,\ 2 \le j \le n).
\end{cases}
\]

Definition 3 (Half-Open Parallelepiped). Let L be a sublattice of Z^n, regardless of whether or not it is an ideal lattice. There are some linearly independent vectors b_1, b_2, ..., b_m of L such that all elements of L can be written as linear combinations of these vectors. We define a basis as the n × m matrix² B := (b_1 b_2 ··· b_m). For a basis B = (b_1 b_2 ··· b_m), we define the half-open parallelepiped P(B) := { Σ_{i=1}^m x_i b_i | −1/2 ≤ x_i < 1/2 }.

Note that a basis is not uniquely defined for a lattice, and so an infinite number of half-open parallelepipeds exist for a specific ideal lattice. We define the modulo operation by using a half-open parallelepiped.

Definition 4 (Modulo Operation by a Lattice). Let L(B) be a lattice with basis B. For a vector t ∈ Z^n, we can find a unique vector t′ that satisfies the following conditions:
– t′ is equivalent to t: t − t′ ∈ L(B)
– t′ is a reduced vector: t′ ∈ P(B)
We refer to t′ as the remainder of t by B. It is written as t′ ≡ t (mod B). We can compute t mod B as

\[
t \bmod B = t - B \cdot \lfloor B^{-1} t \rceil,
\]
where for v ∈ R^n, ⌊v⌉ denotes the vector of Z^n obtained by rounding each element of v to the nearest integer.

2.2 Gentry’s Scheme
In [5], a homomorphic encryption scheme over an abstract ring is discussed, and then ideal lattices are proposed as a realization of the ring.

² The terminology “basis” usually refers not to a matrix but to a set of vectors. In this paper, we follow Gentry’s notation.

In this subsection,
we explain Gentry’s basic scheme, which has a bound on the circuit depth. We concentrate on the basic scheme since we believe that progress in the basic scheme will lead us to improve the full scheme. First, we select a monic integer polynomial f(x) ∈ Z[x] of degree n. Then, we set the residue ring R = Z[x]/(f(x)). Also, let B_I be a basis for some ideal I ⊂ R and define the plaintext space P as (a subset of) P(B_I) ∩ Z^n. For example, P = {(b_0, b_1, ..., b_{n−1})^t | b_i ∈ {0, 1} for i = 0, 1, ..., n − 1} for the scalar diagonal basis B_I = 2E_n corresponding to I = (2), where E_n is the identity matrix of size n. Moreover, we select a short vector s ∈ L(B_I), where L(B_I) is the sublattice with basis B_I.³ For instance, we can use s = (2, 0, 0, ..., 0)^t for B_I = 2E_n. For φ_1, φ_2 ∈ Z^n, we define φ_1 +_I φ_2 := (φ_1 + φ_2) mod B_I. Similarly, we define φ_1 ×_I φ_2 := (φ_1 × φ_2) mod B_I, φ_1 +_J φ_2 := (φ_1 + φ_2) mod B_J^pk, and so on.

[KeyGen] Generate two basis matrices B_J^pk and B_J^sk corresponding to an ideal J relatively prime to I. Then, the public key is B_J^pk and the secret key is B_J^sk. Typically, we can use B_J^sk = rot(v) for a random vector v whose corresponding polynomial is prime to I. Also, we may set B_J^pk as the Hermite normal form⁴ of B_J^sk. We propose a more concrete key generation algorithm for improving the homomorphic property later.

[Encrypt] For a plaintext π, output φ := (π + r × s) mod B_J^pk, where r ∈ Z^n is chosen randomly such that ‖r‖ ≤ ℓ. Note that ℓ is a security parameter that we determine later.

[Decrypt] For a ciphertext φ, output π := (φ mod B_J^sk) mod B_I.

[Evaluate] For a circuit C_I and a tuple (φ_1, ..., φ_t) of ciphertexts, output C_J(φ_1, ..., φ_t), where C_J is the circuit obtained from C_I by using the gates +_J, ×_J instead of the gates +_I, ×_I.

Gentry discussed the validity of Evaluate; see [5] and [6] for more information.

Definition 5 (ρEnc). ρEnc is the value

\[
\rho_{Enc} := \max_{\pi \in P,\ \|r\| \le \ell} \|\pi + r \times s\|.
\]
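Putting KeyGen, Encrypt, and Decrypt together for I = (2) and f(x) = x^n − 1 gives the following toy sketch. It is our illustration, not the paper's implementation: for brevity the secret rotation basis is reused in place of a published Hermite-normal-form public key, so it demonstrates correctness only, never security, and the vector sizes are assumptions chosen so that π + r × s stays inside P(B_J^sk). The `mod_lattice` helper implements the reduction t − B·⌊B⁻¹t⌉ of Definition 4.

```python
import random
import numpy as np

def rot(v):
    # rotation basis of v for f(x) = x^n - 1 (circulant matrix)
    n = len(v)
    return np.array([[v[(i - j) % n] for j in range(n)] for i in range(n)])

def mod_lattice(t, B, B_inv):
    # t mod B = t - B * round(B^{-1} t)  (Definition 4)
    return t - B @ np.rint(B_inv @ t).astype(np.int64)

random.seed(0)
n = 16
# secret vector almost parallel to e_1 (cf. the discussion in Sect. 3.3),
# so that P(rot(v)) comfortably contains pi + r x s
v = [100] + [random.randint(-1, 1) for _ in range(n - 1)]
B_sk = rot(v)
B_inv = np.linalg.inv(B_sk.astype(float))

pi = np.array([random.randint(0, 1) for _ in range(n)])   # plaintext in {0,1}^n
r = np.array([random.randint(-2, 2) for _ in range(n)])   # small randomness
# with s = (2, 0, ..., 0)^t the term r x s is just 2r; we also add an explicit
# lattice point so the reduction has something to strip off
noise = pi + 2 * r + B_sk @ np.array([random.randint(-3, 3) for _ in range(n)])
phi = mod_lattice(noise, B_sk, B_inv)                     # toy "ciphertext"
dec = mod_lattice(phi, B_sk, B_inv) % 2                   # (phi mod B_sk) mod B_I
assert np.array_equal(dec, pi)
```

Decryption succeeds here because π + 2r lies well inside P(B_sk); once the accumulated noise leaves the parallelepiped, decryption fails, which is exactly why the size of P(B) (the value ρDec below) bounds the circuit depth.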
For example, ρEnc ≤ √n + 2 for I = (2), s = (2, 0, 0, ..., 0)^t. In this paper, we use ℓ satisfying ρEnc ≤ n for the sake of simplicity. The following value expresses the size of P(B).

Definition 6 (ρDec). ρDec is the value

\[
\rho_{Dec} := \sup\{\rho \in \mathbb{R}_{>0} \mid B_\rho \subset P(B)\},
\]

³ As described in [5], we can also select s randomly for every encryption. In the current situation, we select s in advance to improve the homomorphic property.
⁴ The Hermite normal form of a lattice is a unique basis and can be efficiently computed. See [13] for more information.
where B_ρ := {t ∈ R^n | ‖t‖ < ρ}. In fact, ρDec can be determined from the basis B_J^pk. In what follows, we set B = B_J^pk for simplicity.

Lemma 1 ([5] Lemma 1). For (b_1 b_2 ··· b_n) := (B^{-1})^t,

\[
\rho_{Dec} = \frac{1}{2 \max_j \|b_j\|}.
\]
Then, we quote the following important theorem. The theorem states that the bound of the circuit depth depends on the value ρDec. Note that lg denotes the logarithm to base 2.

Theorem 1 ([5] Theorem 8). Set

\[
\gamma := \max\Bigl(2,\ \sup_{u,v \ne 0} \frac{\|u \times v\|}{\|u\|\,\|v\|}\Bigr).
\]

Assume that the depth of a circuit C is less than or equal to

\[
\lg \frac{\lg \rho_{Dec}}{\lg(\gamma \rho_{Enc})}.
\]

Then, Evaluate for C (and any tuple of ciphertexts) is valid.
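Theorem 1's bound is easy to evaluate numerically. In this small calculation of ours, γ is set to its lower bound 2 and ρEnc to n, both assumptions for illustration; it shows that the permitted depth grows only doubly-logarithmically in ρDec:

```python
import math

def depth_bound(rho_dec, rho_enc, gamma):
    # permitted depth from Theorem 1: lg(lg rho_Dec / lg(gamma * rho_Enc))
    return math.log2(math.log2(rho_dec) / math.log2(gamma * rho_enc))

n = 256
gamma, rho_enc = 2.0, float(n)        # assumed values (gamma's lower bound)
for bits in (64, 256, 1024):          # assumed sizes of rho_Dec, as 2**bits
    d = depth_bound(2 ** bits, rho_enc, gamma)
    print(f"lg rho_Dec = {bits:4d}  ->  depth bound {d:.2f}")
```

Quadrupling the bit-length of ρDec buys only two more levels of depth, which motivates controlling ρDec directly rather than hoping a random basis is large enough.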
3 Bound of the Circuit Depth
In this section, we raise some questions related to the bound of the circuit depth.

3.1 Reasoning for Considering the Bound of the Circuit Depth
Gentry achieved a construction of a bootstrappable scheme by using a server-aided cryptographic technique. Roughly speaking, the bootstrappable property means that we can validly execute Evaluate for the decryption circuit. If we have a bootstrappable scheme, we can construct a homomorphic encryption scheme for any given operation bound by using Gentry’s technique. In this subsection, we discuss the potential to improve Gentry’s scheme. As mentioned earlier, the bound of the depth of circuits is connected to ρDec, which is determined by the basis of a lattice. If we select the basis randomly, as Gentry suggested, we cannot predict the bound of the circuit depth before generating keys. Then, we must increase the key size or repeat the key generation until the scheme can handle the bound of the circuit depth. Thus, the complexities of encryption/decryption or key generation are increased. Conversely, if we can control the bound of the circuit depth, we can minimize the key size and time complexity. We may use a homomorphic encryption scheme to construct particular cryptographic protocols where the number of involved parties
is bounded. In this case, we can estimate the bound of the circuit depth. Then, the problem naturally arises of how to fix the number of operations before generating the keys. In this paper, we address this problem. Note that we can construct a homomorphic encryption scheme that has any bound for the circuit depth by using the full scheme. However, the full Gentry scheme requires a security requirement additional to that of the basic scheme. That is, the full scheme is based on the difficulty of not only the problem corresponding to the basic scheme but also a problem associated with server-aided cryptography. Also, since the full scheme is constructed by applying the bootstrapping technique to the basic scheme, the efficiency of the full scheme is much lower than that of the basic scheme. By improving the basic scheme, we can consequently increase the efficiency of the full scheme through a reduction in the number of times the bootstrapping technique is applied. So we concentrate on the basic scheme.

3.2 Circuit Depth and Eigenvalues
The bound of the circuit depth is connected to ρDec, which is determined by the basis of a lattice, as shown in Theorem 1. In this subsection, we show that this value is closely related to the eigenvalues of the basis. In what follows, the elements of matrices are in the complex field. At first we define matrix norms.

Definition 7. Let A be an n-dimensional square matrix. Then, the spectral norm of A is the value

\[
\|A\| := \max_{\|x\| = 1} \|Ax\|.
\]

Also, for A = (a_{ij}), the Frobenius norm of A is the value

\[
\|A\|_F := \sqrt{\sum_{i,j} |a_{ij}|^2}.
\]
As is well known, ‖A‖ = √(λ_{|max|}(A*A)), where A* is the complex conjugate matrix of the transpose matrix A^t of A. Also, we denote the maximum and minimum of the absolute values of the eigenvalues of A by λ_{|max|}(A) and λ_{|min|}(A), respectively. We can easily see ‖A‖ ≤ ‖A‖_F. Then, we deduce the following theorem from these properties.

Theorem 2. For a real non-singular matrix B,

\[
\frac{\sqrt{\lambda_{|\min|}(B^{*}B)}}{2} \le \rho_{Dec} \le \frac{n \sqrt{\lambda_{|\min|}(B^{*}B)}}{2}.
\]

Proof. We denote the column vectors of (B^{-1})^{*} by (b_1 b_2 ··· b_n). Then,

\[
\max_j \|b_j\| \le \max_{\|x\|=1} \|(B^{-1})^{*} x\| = \|(B^{-1})^{*}\|.
\]

So we have

\[
\max_j \|b_j\| \ge \frac{1}{n} \sum_j \|b_j\| \ge \frac{1}{n} \|(B^{-1})^{*}\|_F \ge \frac{1}{n} \|(B^{-1})^{*}\|.
\]

Thus, the following equation and Lemma 1 imply the theorem:

\[
\|(B^{-1})^{*}\| = \sqrt{\lambda_{|\max|}\bigl(B^{-1}(B^{-1})^{*}\bigr)} = 1 \big/ \sqrt{\lambda_{|\min|}(B^{*}B)}.
\]

The theorem says that the bound of the circuit depth is linked to the eigenvalues of B*B. Also, for B = (b_{ij}), we have

\[
\max_{i,j} |b_{ij}| \le \|B\| = \sqrt{\lambda_{|\max|}(B^{*}B)}.
\]
So the eigenvalues are also involved in the size of each element of B.

3.3 Handling the Eigenvalues
Gentry [5] says that we may generate keys as rot(v) for some random vector v. So we analyze the eigenvalues of rot(v).

Theorem 3. Set B = rot(v) for v = (v_0, v_1, ..., v_{n−1})^t on f(x) with degree n. We denote all roots (over the complex field) of f(x) = 0 by α_1, α_2, ..., α_n (counted up to multiplicity). Then, if all roots α_i are distinct, the eigenvalues of B are

\[
\lambda_i := \sum_{k=0}^{n-1} v_k \alpha_i^{\,k},
\]

and B can be diagonalized. More precisely, for P = (α_i^{j−1})_{1≤i,j≤n}, P B P^{-1} = Λ, where Λ represents the diagonal matrix whose i-th diagonal element Λ_{i,i} is λ_i.

Proof. For B = (b_{ij}), it is only necessary to prove the equation

\[
\sum_{k=1}^{n} b_{kj} \alpha_i^{\,k-1} = \lambda_i \alpha_i^{\,j-1}
\]

for any 1 ≤ i, j ≤ n. Note that P is invertible if all the α_i’s are distinct. The equation can be easily proved by induction on j for any (fixed) i.
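For f(x) = x^n − 1 the α_i are the complex n-th roots of unity, and Theorem 3 can be verified numerically. In this quick sanity check of ours, each row of P = (α_i^{j−1}) acts as a left eigenvector of B = rot(v):

```python
import numpy as np

n = 8
rng = np.random.default_rng(1)
v = rng.integers(-5, 6, size=n)
B = np.array([[v[(i - j) % n] for j in range(n)] for i in range(n)])  # rot(v)

alphas = np.exp(2j * np.pi * np.arange(n) / n)       # roots of x^n - 1 = 0
for a in alphas:
    lam = sum(v[k] * a ** k for k in range(n))       # lambda_i of Theorem 3
    p = a ** np.arange(n)                            # a row of P = (alpha^{j-1})
    assert np.allclose(p @ B, lam * p)               # left-eigenvector relation
```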
Note that it is not always true that the eigenvalues of B^tB can be determined by the eigenvalues of the real matrix B. However, if P^t = P, that is, P is symmetric, then the statement is always true. Especially, if B is a circulant matrix, that is, f(x) = x^n − 1, the invertible matrix P equals the discrete Fourier transformation matrix W = (ω^{ij}), where ω is a primitive n-th root of unity. Then, W is a symmetric matrix. Note that if |v_i| is bounded by some constant c and |α_i| ≠ 1, λ_i is bounded as follows:

\[
|\lambda_i| = \Bigl|\sum_{k=0}^{n-1} v_k \alpha_i^{\,k}\Bigr| \le \sum_{k=0}^{n-1} |v_k|\,|\alpha_i|^k \le c\,\frac{|\alpha_i|^n - 1}{|\alpha_i| - 1}.
\]

This means that c must be large if |α_i| ∼ 1. Especially, for f(x) = x^n − 1, λ_i ∼ 0 in the case that α_i ≠ 1 and v_1, v_2, ..., v_n ∼ c. Thus, it is expected that ρDec takes a small value if the v_i’s are generated randomly. We can also generate v by selecting vectors that are almost parallel to e_i := (0, 0, ..., 0, 1, 0, ..., 0). A similar method may also be used in key generation for GGH cryptosystems [8]. In [8], two key generation methods were proposed. One method is to generate keys randomly, and the other is to generate keys by adding short random vectors to a vector that equals the multiplication of e_i by a large constant. Goldreich et al. comment that attackers may obtain a clue to breaking the scheme if the latter is used. Note that it is not easy to generate a secure key, i.e. a basis, that does not correspond to rot(v) for some v. This is because ideal lattices have a special structure. Let v̄_1, v̄_2, ..., v̄_k be generators of an ideal I ⊂ R. Also, we denote the integer vector corresponding to v̄_i by v_i. Then, a basis of the ideal lattice for I should generate the column vectors of rot(v_i). So the size of the basis would be small compared to the size of the v_i. Thus it would seem that we cannot predict the bound of the circuit depth if we use usual key-generation methods such as random generation. Therefore, we propose another algorithm to address this problem. We approach the problem by controlling the eigenvalues in advance.
4 How to Control the Circuit Depth
In this section, we describe the proposed algorithm.

4.1 Key Idea
The proposed strategy for solving the problem is to take a basis whose eigenvalues have guaranteed sizes, instead of generating keys randomly. However, there is a problem in implementing this strategy: the elements of B can be in the complex field. We address this problem by considering each element of B as an element of an integer residue ring in which f(x) can be completely factored. Here, we describe the main points of the algorithm. First, for a circuit depth bound d, we estimate ρ by using Theorem 1; we recall that we assume ρEnc ≤ n. Second, we select a suitable m for regarding the roots of f(x) as elements of the integer residue ring Z/mZ. We provide an algorithm for selecting m by using a splitting field of f(x) over Q. Third, we randomly select λ_i such that |λ_i|/2 ≥ ρ. If the λ_i’s are the eigenvalues of a rotation basis B, the relation between ρDec and the λ_i shown by Theorem 2 ensures that ρDec ≥ ρ; that is, the bound of the circuit depth is greater than d. Finally, we obtain B from the relation between the eigenvalues λ_i and B derived using Theorem 3. Note that we can obtain v such that B = rot(v) by v = (rot(v))_1 = B_1 = (P^{-1}ΛP)_1 = P^{-1}ΛP_1, where the subscript 1 denotes the first column.

4.2 Proposed Algorithm
Here we show, in Table 1, a key generation algorithm that preserves the homomorphic property for circuits whose depth is bounded by a given value.

Table 1. Key Generation Algorithm for Gentry’s Scheme
Input: d: bound of the circuit depth; f(x): monic integer univariate polynomial such that n = deg(f)
Output: (B^pk, B^sk): the pair of keys for Gentry’s scheme
1. Compute ρ := (nγ)^{2^d} for γ := max(2, sup_{u,v≠0} ‖u × v‖/(‖u‖ ‖v‖)).
2. Compute a (not necessarily minimal) splitting field Q(θ) of f(x) over Q.
3. Compute the minimal polynomial g(x) of θ.
4. Compute m = |g(i)| for a randomly generated integer i.
5. If the denominator of a root of f(x) over Q is not prime to m, then go to 4.
6. Call the function GenKeyWithρ(f(x), m, ρ) and output the returned values.
Table 2. GenKeyWithρ (function)
Input: f(x): monic polynomial; m and ρ: integers
Output: (B^pk, B^sk): the pair of keys for Gentry’s scheme
1. Select λ_1, λ_2, ..., λ_n randomly such that 2ρ ≤ |λ_i| < m.
2. Construct P = (α_i^{j−1}) over Z/mZ, where f(x) = Π_{i=1}^n (x − α_i) mod m.
3. Compute v = P^{-1}ΛP_1, where P_1 is the first column vector of P.
4. Compute B = rot(v).
5. Output the integer matrix B^sk corresponding to B.
6. Compute the Hermite normal form of B^sk and output the matrix as B^pk.
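For f(x) = x^n − 1 with n a power of two, P in Step 2 becomes the matrix (ω^{ij}) for an n-th root of unity ω modulo m, and Step 3 collapses to an inverse number-theoretic transform of the eigenvalue vector. The following is our sketch under those assumptions; the concrete n, ω, m, and λ_i are made up for illustration, and a real implementation must also enforce 2ρ ≤ |λ_i| < m and output the Hermite normal form as the public key.

```python
def genkey_with_rho_cyclic(n, omega, m, lams):
    # v = P^{-1} Lambda P_1 over Z/mZ; for f(x) = x^n - 1 the matrix P is
    # (omega^{ij}), so v is the inverse NTT of the eigenvalue vector.
    inv_n = pow(n, -1, m)          # n must be invertible mod m
    inv_w = pow(omega, -1, m)
    v = [inv_n * sum(l * pow(inv_w, i * k, m) for i, l in enumerate(lams)) % m
         for k in range(n)]
    # B = rot(v): a circulant matrix whose column j is v rotated down by j
    B = [[v[(i - j) % n] for j in range(n)] for i in range(n)]
    return v, B

# illustrative parameters: n = 8, omega = 4, m = 4**4 + 1 = 257
n, omega, m = 8, 4, 4 ** 4 + 1
lams = [100, 110, 120, 130, 140, 150, 160, 170]    # made-up eigenvalues
v, B = genkey_with_rho_cyclic(n, omega, m, lams)
# sanity check of Theorem 3: sum_k v_k * omega^{i*k} = lambda_i (mod m)
assert all(sum(v[k] * pow(omega, i * k, m) for k in range(n)) % m == lams[i]
           for i in range(n))
```

This is exactly the fast-transform shortcut discussed later in Sect. 5.1 (three-argument `pow` with exponent −1, used for the modular inverses, requires Python 3.8+).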
The validity of the algorithm in Table 1 easily follows from the following simple proposition.

Proposition 1. Let x_0 ∈ Z be such that m := |g(x_0)| ≠ 0. Then, x_0 is a root of g(x) over the integer residue ring Z/mZ.
Proof. The proposition is clear, as follows:

\[
g(x_0) = \pm m \equiv 0 \pmod{m}.
\]
If we know the minimal splitting field of f in advance, we can skip Step 2 of the algorithm in Table 1. In particular, if we use the simple polynomial f(x) = x^n − 1, the following proposition shows that a primitive root of unity can be expressed over an integer residue ring.

Proposition 2. Let n be a power of 2. Set m := ω^{n/2} + 1 for a power ω of 2. Then, ω is a primitive n-th root of unity over Z/mZ.

Proof. The proposition follows immediately from the congruence

\[
\omega^{n/2} \equiv -1 \pmod{m}.
\]

For simplicity of implementation, we propose another algorithm in Table 3. Note that we input not f(x) but n in this algorithm.

Table 3. Another Key Generation Algorithm for Gentry’s Scheme
Input: d: bound of the circuit depth; n: integer
Output: (B^pk, B^sk): the pair of keys for Gentry’s scheme; f(x): monic integer polynomial
1. Compute ρ := (nγ)^{2^d} for γ := max(2, sup_{u,v≠0} ‖u × v‖/(‖u‖ ‖v‖)).
2. Randomly generate m such that m ≥ 2ρ.
3. Generate integers α_i ∈ Z for i = 1, 2, ..., n.
4. Compute f̃(x) = Π_{i=1}^n (x − α_i).
5. Compute f(x) such that f(x) ≡ f̃(x) (mod m) by adding random multiples of m to each coefficient of f̃ except for the term x^n.
6. Output f(x).
7. Call the function GenKeyWithρ(f(x), m, ρ) and output the returned values.
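Proposition 2 can be checked directly for small parameters (a quick numerical verification of ours, with n and ω chosen arbitrarily as powers of two):

```python
n, omega = 16, 8                  # both powers of two (illustrative choices)
m = omega ** (n // 2) + 1         # m = omega^{n/2} + 1
assert pow(omega, n // 2, m) == m - 1    # omega^{n/2} = -1 (mod m)
assert pow(omega, n, m) == 1             # hence omega^n = 1 (mod m)
# the order of omega divides n but not n/2, and n is a power of two,
# so the order is exactly n: omega is a primitive n-th root of unity
```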
4.3 Feasible Bound of the Circuit Depth
In this subsection, we estimate a feasible bound for the circuit depth. Considering the security requirements, we cannot use too large a circuit depth. As mentioned in Section 5.2, the condition √n · 2ρ < 2^{n^{1−δ}} must be satisfied, where δ ∈ [0, 1) is a security parameter. Thus, we can estimate the maximum circuit depth as follows.

Proposition 3. Assume that ρ satisfies the condition √n · 2ρ < 2^{n^{1−δ}}. Then, the bound of the circuit depth d is less than

\[
\lg \frac{n^{1-\delta} - \lg(2\sqrt{n})}{\lg(n\gamma)}.
\]

For example, if δ = 1/8, we can construct Gentry’s scheme with a circuit depth of 3 for f(x) = x^{256} − 1.
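Proposition 3 in numbers, reproducing the example above (γ = 2 is assumed here as its lower bound):

```python
import math

def max_depth(n, delta, gamma=2.0):
    # Proposition 3: d < lg((n^{1-delta} - lg(2*sqrt(n))) / lg(n*gamma))
    return math.log2((n ** (1 - delta) - math.log2(2 * math.sqrt(n)))
                     / math.log2(n * gamma))

d_max = max_depth(256, 1 / 8)     # about 3.77, so a depth of 3 is feasible
```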
5 Analysis of the Proposed Algorithm
In this section, we analyze the efficiency and the security of the proposed algorithm.

5.1 Practicality of the Proposed Algorithm
First, we consider f(x) = x^n − 1 in terms of efficiency. As noted in Section 3.3, if f(x) = x^n − 1, then P is a discrete Fourier transformation matrix, so fast Fourier transform techniques can be applied to the algorithm. Since ΛP_1 = (λ_1, λ_2, ..., λ_n)^t, we can compute the vector v = P^{-1}ΛP_1 by applying fast Fourier transform techniques (on P^{-1} = (n^{-1} ω^{−ij})) to (λ_1, λ_2, ..., λ_n)^t. Note that the fast Fourier transform is efficient if n is a power of 2.

Next, we describe implementation techniques for Gentry’s scheme. Since the modulo operation by a lattice is the most time-consuming part of Gentry’s scheme, we consider how to improve this operation. If we take B_I as the scalar matrix 2E_n, A = (a_{ij}) mod B_I can be easily computed as (a_{ij} mod 2). Also, to speed up encryption, the inverse matrix of B_J^pk is precomputed. Moreover, reduction modulo B_J^sk = rot(v) can be computed efficiently by using

\[
\mathrm{rot}(v) \cdot \lfloor \mathrm{rot}(v)^{-1} \phi \rceil = v \times \lfloor w \times \phi \rceil,
\]

where w ∈ Q^n⁵ satisfies v × w = (1, 0, 0, ..., 0)^t. Note that v_1 × v_2 = rot(v_1)v_2 for v_1, v_2 ∈ Z^n, so v_1 × v_2 can be computed with a polynomial multiplication. Also, the element w̄ corresponding to w is the inverse in Q[x]/(f(x)) of the element v̄ corresponding to v. So w (or w̄) is computable by applying the extended Euclidean algorithm to v̄ and f(x).

Here, we present the experimental results of Gentry’s scheme using the proposed algorithm. Before that, we briefly summarize the key generation algorithm. First, we generate integers λ_i for the given number of operations. Then, we obtain the matrix corresponding to a rotation basis with the eigenvalues λ_i by executing operations over an integer residue ring. Table 4 shows the experimental results of Gentry’s scheme with the proposed algorithm on f(x) = x^n − 1. We used a computer with a 2-GHz CPU (AMD Opteron 246), 4 GB of memory, and a 160 GB hard disk. Note that we used at most 1 GB of memory to execute the program. Magma [21] was used as the software for writing the program. We measured the computation times and the amount of memory used for each step, including key generation, encryption, decryption, and d multiplications of ciphertexts. Note that we show the average run time for the multiplication. The number of iterations is 10. We take the average values, excluding the maximum and minimum, for each item.

Comparing the experimental results to those of [7], it appears that the proposed algorithm is not very efficient. We used Magma on a computer with 4

⁵ The isomorphism between Z^n and Z[x]/(f(x)) is naturally extended to the isomorphism between Q^n and Q[x]/(f(x)).
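The identity v_1 × v_2 = rot(v_1)v_2 noted above — the ring product is a cyclic convolution when f(x) = x^n − 1 — can be sanity-checked as follows (our check, with random small vectors):

```python
import numpy as np

def rot(v):
    # rotation basis of v for f(x) = x^n - 1 (circulant matrix)
    n = len(v)
    return np.array([[v[(i - j) % n] for j in range(n)] for i in range(n)])

def cyclic_mul(v1, v2):
    # polynomial product of v1 and v2 modulo x^n - 1 (cyclic convolution)
    n = len(v1)
    out = np.zeros(n, dtype=np.int64)
    for i in range(n):
        for j in range(n):
            out[(i + j) % n] += v1[i] * v2[j]
    return out

rng = np.random.default_rng(7)
v1 = rng.integers(-9, 10, size=8)
v2 = rng.integers(-9, 10, size=8)
assert np.array_equal(cyclic_mul(v1, v2), rot(v1) @ v2)
```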
Table 4. Experimental Results for Gentry’s Scheme on f(X) = X^n − 1

  n  | d | Keygen [s] | Encrypt [s] | Decrypt [s] | Multiply [s] | Memory [MB]
  32 | 1 |      0.056 |       0.000 |       0.001 |       0.0003 |        7.73
  32 | 3 |      0.091 |      0.0007 |       0.007 |       0.0003 |        8.03
  64 | 1 |       0.93 |       0.000 |       0.030 |        0.001 |        9.39
  64 | 3 |       1.54 |       0.001 |       0.055 |        0.002 |       10.11
 128 | 1 |      20.12 |       0.007 |        0.38 |        0.006 |       20.61
 128 | 3 |      28.21 |       0.007 |        0.61 |        0.008 |       20.31
 256 | 1 |     416.82 |       0.031 |        7.87 |        0.047 |       77.87
 256 | 3 |     416.48 |       0.029 |        7.83 |        0.048 |       78.79
GB of memory, while Gentry et al. used the NTL/GMP libraries on a computer with 24 GB of memory. Based on the current experiments, C implementations seem to be much faster than the Magma implementation. To obtain more accurate results, the two implementations would have to be compared in the same experimental environment.
Here, we comment on the differences between the proposed algorithm and related schemes.
Smart and Vercauteren's Scheme. In [18], an efficient fully homomorphic encryption scheme is proposed. It uses a specific lattice arising from prime ideals of an algebraic number field, so the scheme is based on the hardness of a stronger problem than that underlying the full Gentry scheme. Also, their experimental results show that the scheme is homomorphic for circuits whose depth is not large enough to enable fully homomorphic encryption. Since the proposed algorithm works with eigenvalues, we expect that it can be applied to their scheme as well.
Stehlé and Steinfeld's Scheme. In [19], an efficient fully homomorphic encryption scheme is proposed. The authors give a security analysis of the Sparse Subset Sum Problem, one of the hard problems underlying the security of the full scheme; the analysis leads to smaller parameter choices. They also improve the decryption algorithm of the full scheme. In contrast, we concentrate on the basic scheme, and on the key generation algorithm in particular: the proposed algorithm improves this specific part of Gentry's scheme, while their work focuses on other parts. The proposed algorithm could be applied to generate a basis for their scheme. 5.2
Security Analysis of the Proposed Algorithm
Attackers may break Gentry's scheme by finding short vectors with a lattice reduction algorithm. The following well-known theorem bounds the length of the shortest vector in terms of the determinant of the basis.
Theorem 4 (Minkowski). Let α(B) be the length of the shortest vector in an n-dimensional full lattice with basis B. Then
α(B) < √n · det(B)^{1/n}.
N. Ogura et al.
Note that det(B) equals the product of all eigenvalues of B, so we can control α(B) by selecting the eigenvalues. Various lattice reduction algorithms have been proposed, for example in [12] and [17]. The most efficient algorithm was proposed by Ajtai et al. [1]; it can find a vector of length at most 2^{O(n lg lg n/lg n)} times the length of the shortest non-zero vector. Also, Gama and Nguyen [4] assess the practical hardness of the shortest vector problem based on many experimental results; in particular, they explain why the 334-dimensional NTRU lattices [10] have not been solved. Since the NTRU lattice is an ideal lattice, we recommend using n > 334.
We analyze the key generation algorithm assuming that short vectors can be computed within an approximation factor of 2^{n^{1−δ}}. Because we take the size of the eigenvalues as almost 2^ρ, the condition √n · 2^ρ < 2^{n^{1−δ}} should be satisfied; indeed, Gentry's scheme is broken once vectors within this factor of α(B) can be found. For more information, refer to [6]. Of course, the proposed algorithm generates more specially-configured keys than simple random generation, so restricting the keys in this way may decrease the security level. Investigating this security is future work.
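Since det(B) is the product of the eigenvalues, the Minkowski bound of Theorem 4 can be evaluated directly from them. The following short numeric sketch (our own illustration, not part of the original analysis) computes the bound in log-space to avoid overflow for large eigenvalues:

```python
import math

def minkowski_bound(eigenvalues):
    """Upper bound sqrt(n) * det(B)^(1/n) on the shortest-vector length,
    where det(B) is the product of the eigenvalues of the basis B."""
    n = len(eigenvalues)
    # det(B)^(1/n) computed as exp(mean of log |eigenvalue|)
    log_det = sum(math.log(abs(ev)) for ev in eigenvalues)
    return math.sqrt(n) * math.exp(log_det / n)
```

For example, if all n eigenvalues equal λ, the bound is simply √n · λ, which shows concretely how choosing larger eigenvalues raises the guaranteed gap for α(B).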
6
Conclusion
We proposed an efficient key generation algorithm that controls the bound on the circuit depth by using the relation between the circuit depth and the eigenvalues of a lattice basis. The key generation algorithm enables us to create a homomorphic encryption scheme for a given number of operations. We also described an efficient implementation of Gentry's scheme and showed, based on experimental results, that the proposed algorithm is practical. The algorithm is summarized as follows: first we generate eigenvalues for the given number of operations; then we obtain the matrix corresponding to a rotation basis from these eigenvalues over an integer residue ring. Although the experimental results show that the algorithm is practical, its efficiency remains a subject for research; in particular, the bound on the circuit depth should be improved. Improving the quality of the algorithm is future work. For specific lattices such as cyclic lattices, we continue to investigate the security of the scheme under the proposed method.
Acknowledgments This work was supported in part by Grant-in-Aid for Scientific Research (C)(20540125).
References 1. Ajtai, M., Kumar, R., Sivakumar, D.: A Sieve Algorithm for the Shortest Lattice Vector Problem. In: STOC 2001, pp. 266–275 (2001) 2. Cohen, H.: A Course in Computational Algebraic Number Theory. GTM, vol. 138. Springer, Heidelberg (1996)
3. ElGamal, T.: A Public Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms. IEEE Transactions on Information Theory IT-31, 469–472 (1985) 4. Gama, N., Nguyen, P.Q.: Predicting Lattice Reduction. In: Smart, N.P. (ed.) EUROCRYPT 2008. LNCS, vol. 4965, pp. 31–51. Springer, Heidelberg (2008), http://www.di.ens.fr/~pnguyen/pub_GaNg08.htm 5. Gentry, C.: Fully Homomorphic Encryption Using Ideal Lattices. In: STOC 2009, pp. 169–178 (2009) 6. Gentry, C.: A Fully Homomorphic Encryption Scheme. PhD thesis, Stanford University (2009), http://crypto.stanford.edu/craig 7. Gentry, C., Halevi, S.: A Working Implementation of Fully Homomorphic Encryption. In: EUROCRYPT 2010 rump session (2010), http://eurocrypt2010rump.cr.yp.to/9854ad3cab48983f7c2c5a2258e27717.pdf 8. Goldreich, O., Goldwasser, S., Halevi, S.: Public-Key Cryptosystems from Lattice Reduction Problems. In: Kaliski Jr., B.S. (ed.) CRYPTO 1997. LNCS, vol. 1294, pp. 112–131. Springer, Heidelberg (1997) 9. Gray, R.M.: Toeplitz and Circulant Matrices: A Review. In: Foundations and Trends in Communications and Information Theory, vol. 2(3). Now Publishers Inc., USA (2006) 10. Hoffstein, J., Pipher, J., Silverman, J.: NTRU: A Ring Based Public Key Cryptosystem. In: Buhler, J.P. (ed.) ANTS 1998. LNCS, vol. 1423, pp. 267–288. Springer, Heidelberg (1998) 11. Kitaev, A.Y., Shen, A.H., Vyalyi, M.N.: Classical and Quantum Computation. Graduate Studies in Mathematics, vol. 47. AMS, Providence (2002) 12. Lenstra, A.K., Lenstra Jr., H.W., Lovász, L.: Factoring Polynomials with Rational Coefficients. Mathematische Annalen 261, 513–534 (1982) 13. Micciancio, D.: Improving Lattice-based Cryptosystems Using the Hermite Normal Form. In: Silverman, J.H. (ed.) CaLC 2001. LNCS, vol. 2146, pp. 126–145. Springer, Heidelberg (2001) 14. Okamoto, T., Uchiyama, S.: A New Public-Key Cryptosystem as Secure as Factoring. In: Nyberg, K. (ed.) EUROCRYPT 1998. LNCS, vol. 1403, pp. 308–318. Springer, Heidelberg (1998) 15.
Paillier, P.: Public-Key Cryptosystems Based on Composite Degree Residuosity Classes. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 223–238. Springer, Heidelberg (1999) 16. Rivest, R.L., Shamir, A., Adleman, L.: A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Communications of the ACM 21(2), 120–126 (1978) 17. Schnorr, C.P.: A Hierarchy of Polynomial Time Lattice Basis Reduction Algorithms. Theoretical Computer Science 53(2-3), 201–224 (1987) 18. Smart, N.P., Vercauteren, F.: Fully Homomorphic Encryption with Relatively Small Key and Ciphertext Sizes. In: Nguyen, P.Q., Pointcheval, D. (eds.) PKC 2010. LNCS, vol. 6056, pp. 420–443. Springer, Heidelberg (2010), http://eprint.iacr.org/2009/571 19. Stehlé, D., Steinfeld, R.: Faster Fully Homomorphic Encryption. In: Cryptology ePrint Archive (2010), http://eprint.iacr.org/2010/299 20. Turing Machines, http://www.math.ku.dk/~wester/turing.html 21. Magma, http://magma.maths.usyd.edu.au/magma
Practical Universal Random Sampling Marek Klonowski, Michał Przykucki, Tomasz Strumiński, and Małgorzata Sulkowska Institute of Mathematics and Computer Science, Wrocław University of Technology, Poland ul. Wybrzeże Wyspiańskiego, 50-370 Wrocław {Marek.Klonowski,Michal.Przykucki,Tomasz.Struminski, Malgorzata.Sulkowska}@pwr.wroc.pl
Abstract. In this paper we modify and extend the line of research on preserving privacy in statistical databases initiated in the CRYPTO 2006 paper [5]. First we present a simpler approach giving explicit formulas for the sampling probabilities. We show that in most cases our analysis gives substantially better results than those presented in the original paper. Additionally, we outline how the simplified approach can be used to construct a protocol for privacy-preserving sampling of distributed databases.
1
Introduction
The amount of data stored in databases nowadays is huge, and a significant part of it contains fragile, private information about individuals. We would like to protect this crucial data, but at the same time we would like to release datasets for public consumption for various reasons: reporting, data mining, scientific discoveries, etc. There are numerous ways of achieving such a goal: interactive approaches assume that the database administrator accepts or refuses each particular query, while non-interactive approaches release a secure (censored) subset of data as a representation of the database. In this paper we investigate one of the most commonly used non-interactive database sanitization mechanisms, simple random sampling, in terms of preserving privacy. Simple random sampling allows one to draw externally valid conclusions about the entire population based on the sample, e.g., about averages, variances, clusters, etc.
We modify and extend the line of research initiated in [5]; i.e., we are interested in finding the sampling probability which, for given parameters ε (the larger ε is, the weaker the privacy guarantee) and δ (meaning that we guarantee ε-privacy with probability at least 1 − δ), ensures individuals' privacy. Our definition of privacy comes from [6,5]. Of course, the larger the database sample is
Partially supported by funds from the Polish Ministry of Science and Higher Education, grant No. N N206 2573 35. The authors are also beneficiaries of the MISTRZ Programme of the Foundation for Polish Science. Marek Klonowski is a beneficiary of a scholarship for young researchers (Polish Ministry of Science and Higher Education).
I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 84–100, 2010. © Springer-Verlag Berlin Heidelberg 2010
released, the more information it statistically brings to its audience. On the other hand, in order to preserve privacy this sample usually cannot be too large. Thus, it is important to compute the highest sampling probability p which preserves privacy. Our contribution includes, among others, a formula for p which in most cases performs better (i.e., outputs a higher p) than the one presented in [5]. In contrast to the result from [5], our formula is an explicit function of the database and the parameters ε and δ.
The other part of our paper concentrates on the application of random sampling to distributed databases (e.g., sensor networks). This application follows from our explicit formula and seems to be a new way of collecting and aggregating information from peers while preserving their privacy. 1.1
Organization of This Paper
In Section 1.2 we outline previous work on the privacy of statistical databases. Precise definitions of differential privacy, and previous results on preserving it in random samples of a database, are stated in Section 2. In Section 3 we derive the sampling probability which preserves differential privacy for the given parameters in databases containing only two distinct values. The general results are derived from this calculation in Section 4. In the next section we compare our results with those presented in [5]. Section 6 is devoted to the application of our result to distributed databases and to collecting a representative sample while preserving the privacy of peers. We conclude in Section 7. 1.2
Related Works
The problem of releasing a statistical database in a way that yields accurate statistics about a population while preserving the privacy of individuals has a rich literature, mainly from the statistics community, e.g., [11,12,14,17]. There has been a series of lower bound results [2,4,3] suggesting that non-interactive databases cannot accurately answer all queries, or else an adversary will be able to reconstruct all but a 1 − o(1) fraction of the original database exactly. Dwork has shown in [6] the impossibility of Dalenius' goal of achieving semantic security for databases [11], i.e., the paradigm that "nothing about an individual should be learnable from the database that cannot be learned without access to the database" cannot be realized. Instead, Dwork introduced and formalized the idea of differential privacy [6,7].
The introduction of this new definition of database privacy resulted in a series of papers which explore and extend the idea [10,18,5,19]. A relaxation of the differential privacy definition and its connection to statistical difference was presented in [10]. In [19] the authors suggested another definition of database privacy, which they called distributional privacy. They showed that
their new formulation guarantees stronger privacy than differential privacy. Using this definition, the authors proved that, for an unlimited number of queries from a special (yet still useful) class, distributional privacy can be preserved in a non-interactive sanitization model. The computational complexity of releasing differentially private data was investigated in [18].
Random sampling of a database as a method of non-interactive release of database information was considered in terms of differential privacy in [5]. Our paper is strongly connected with that work; we therefore present some details of its results in Section 2.
2
Random Sampling and Differential Privacy
Random sampling. We model the problem in the same way as in [5]. The sanitizer starts with a database D with N values in it. He then goes through each row of the database and includes it in the sample with probability p (and does nothing with probability 1 − p). The permuted sample is released. For such a scenario we would like to find the highest value of p which preserves privacy: the higher p is, the statistically larger the released sample can be, and of course a larger sample gives more information. In our considerations we use the differential database privacy definition introduced in [6,7].

Definition 1. Let P_D[S = s] denote the probability that the mechanism S produces the sample s from the database D. We say that the sanitization mechanism S is (1, ε)-private if, for every pair of databases D, D′ that differ in at most one row and every s,
P_D[S = s] / P_{D′}[S = s] ≤ 1 + ε.

Note that for the correctness and completeness of this definition we assume that 0/0 = 1. As remarked in [5], no sanitization mechanism can ensure (1, ε)-privacy with probability 1. This is clear, since with positive probability a characteristic value (e.g., a value which occurs only once in the database) can be chosen into the database sample. Consequently, there is a definition which allows the mechanism to violate (1, ε)-privacy on some subset of the samples whose overall probability of occurrence is limited to δ. The definition can also be generalized to databases that differ in more than one row.

Definition 2. Let T_{D,D′} be the set of all s such that
P_D[S = s] / P_{D′}[S = s] ≤ 1 + ε.
A sanitization mechanism S is (c, ε, δ)-private if, for every pair of databases D, D′ differing in at most c rows,
P_D[S ∈ T_{D,D′}] > 1 − δ.
In the paper [5] the authors showed how to connect the security parameters (ε, δ) and the number of rare values in the database to the sampling probability p. The main result of their work is a theorem describing the dependency between p and the security parameters for the case in which the databases differ in one row (providing (1, ε′, δ)-privacy). Because it is closely connected to our considerations, we present this theorem in its original implicit form [5].

Theorem (Chaudhuri, Mishra [5]). Given a database D, let α = δ/2, let k be the number of distinct values in D, and let t be the total number of values in D that occur fewer than 2 log(k/α)/ε times. Also let ε′ = max(2(p + ε), 6p) and p + ε < 1/2. Then a sample S of D drawn with frequency
p ≤ ε log(1/(1 − α)) / (4t log(k/α))
is (1, ε′, δ)-private when t > 0. When t = 0, a sample S of D drawn with frequency p ≤ ε is (1, ε′, δ)-private.

Note that it is not straightforward to find, using the above theorem, the sampling probability p which preserves (1, ε, δ)-privacy for a given database D. The authors also generalized their result to the situation in which the database samples differ in c rows (providing (c, ε, δ)-privacy). The comparison between our results and those from [5] is given in Section 5.

Notation. We denote the natural numbers by N. Let X be a random variable. The expression X ∼ Bin(n, p) means that X follows the binomial distribution with parameters n and p. We denote the expected value of a random variable X by E[X]; in particular, when X ∼ Bin(n, p), then E[X] = np. Throughout this paper we assume that the binomial coefficient \binom{a}{b} = 0 if b > a or b < 0.
3
Black and White Databases
At the beginning of our analysis we assume that the database contains only two kinds of values, say black and white. Each value from the database is put into the sample independently with probability p. We would like to measure how much the sample reveals about the initial database. More precisely, we would like to determine whether it is possible, judging by the sample, to distinguish a database D from another database D′ that differs in one position. Our main goal is to find the maximal sampling probability p which preserves (1, ε, δ)-privacy for a given database.
From our perspective we can identify a database with a multiset of black and white objects, or simply with a pair D = (W, B) of nonnegative integers. With this convention, the random sample discussed in this paper is a pair of independent random variables S = (S_w, S_b) such that S_w ∼ Bin(W, p) and S_b ∼ Bin(B, p). For clarity of notation, let P[(x, y)|(W, B)] denote the probability that the sample taken from the database D = (W, B) (i.e., containing W white elements and B black elements) is (x, y). One can easily see that
P[(x, y)|(W, B)] = \binom{W}{x} \binom{B}{y} p^{x+y} (1 − p)^{B+W−x−y}.    (1)

3.1
(1, ε, δ)-Privacy for B&W Databases
Our goal is to obtain a result for the above model in terms of differential privacy. In order to do that, we need to calculate the information leakage between two samples drawn from databases which differ in one position. In other words, we need to find all possible outcomes which do not violate (1, ε)-privacy. A pair of integers (x, y) represents a sample S which preserves (1, ε)-privacy when both of the following inequalities hold:
P[(x, y)|(W, B)] / P[(x, y)|(W + 1, B − 1)] ≤ 1 + ε,
P[(x, y)|(W + 1, B − 1)] / P[(x, y)|(W, B)] ≤ 1 + ε.

Fact 1. Define a set C as follows:
C = {(x, y) : x ∈ N ∧ y ∈ N ∧ y ≥ (B(1 + ε)/(W + 1)) x − B ∧ y ≤ (B/((1 + ε)(W + 1))) x + B/(1 + ε)}.
For every pair (a, b), if (a, b) ∈ C then (a, b) represents a sample which preserves (1, ε)-privacy.

Proof. Using (1) we get
P[(x, y)|(W, B)] / P[(x, y)|(W + 1, B − 1)] = ((W − x + 1)/(W + 1)) · (B/(B − y)),
and similarly
P[(x, y)|(W + 1, B − 1)] / P[(x, y)|(W, B)] = ((W + 1)/(W − x + 1)) · ((B − y)/B).
This yields the following constraints for y:
(B/((1 + ε)(W + 1))) x + B/(1 + ε) ≥ y ≥ (B(1 + ε)/(W + 1)) x − B.

In order to achieve (1, ε, δ)-privacy one needs to guarantee that
Σ_{(x,y)∈C} P[(x, y)|(W, B)] ≥ 1 − δ.    (2)
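The condition of Fact 1 together with inequality (2) can be checked numerically. The sketch below is our own illustration (the parameter values W, B, p, ε are arbitrary): it evaluates equation (1), keeps the samples whose likelihood ratios against the neighbouring database (W + 1, B − 1) stay within 1 + ε, and sums their probability.

```python
from math import comb

def sample_prob(x, y, W, B, p):
    # Equation (1): probability of sampling x white and y black elements.
    return comb(W, x) * comb(B, y) * p**(x + y) * (1 - p)**(W + B - x - y)

def private_mass(W, B, p, eps):
    """Total probability of samples (x, y) from (W, B) whose likelihood
    ratios against the neighbour (W+1, B-1) are both at most 1 + eps."""
    total = 0.0
    for x in range(W + 1):
        for y in range(B + 1):
            a = sample_prob(x, y, W, B, p)
            b = sample_prob(x, y, W + 1, B - 1, p)
            if a == 0 and b == 0:
                total += a      # convention 0/0 = 1: counts, contributes 0
                continue
            if a == 0 or b == 0:
                continue        # one-sided zero: ratio unbounded
            if a / b <= 1 + eps and b / a <= 1 + eps:
                total += a
    return total
```

For instance, with W = B = 200, ε = 0.5 and a small sampling probability such as p = 0.05, the private mass comes out well above 1 − δ for reasonable δ, matching the intuition that small samples from large databases leak little.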
Notice that (1, ε, δ)-privacy loosens the constraints of (1, ε)-privacy a bit: it requires only that the sum of all probabilities P[(x, y)|(W, B)] of samples which break (1, ε)-privacy is bounded by δ. It is not possible to find a compact formula for the above sum for arbitrary B and W, and it also seems hard to compute the maximal p which fulfills this inequality for given ε and δ. Thus in the next subsection we concentrate on tight estimates of this sum rather than exact values. 3.2
Finding p for B&W Databases
Theorem 1. Suppose the database contains only two kinds of values, with cardinalities W and B, and set α = log(4/δ), β = (1 + ε)²/ε². Random sampling with probability p equal to or smaller than
p₀ = 1 + 6αβ/min(W, B) − (2√3/min(W, B)) · √(min(W, B)αβ + 3(αβ)²)
preserves (1, ε, δ)-privacy.

Before we prove Theorem 1, let us note that the value of this critical probability p₀ is smaller than 1 for all feasible parameters. In order to prove Theorem 1, we will use the following lemma.

Lemma 1. Suppose the database contains only two kinds of values, with cardinalities W and B. If C is the set of all pairs (x, y) preserving (1, ε)-privacy, then
Σ_{(x,y)∈C} P[(x, y)|(W, B)] > 1 − 4e^{−min(W,B)ε²(1−p)²/(12p(1+ε)²)}.

Proof. We would like to bound the probability mass inside the tetragon T = ((W + 1, B), (0, y₀), (0, 0), (x₀, 0)) presented in Figure 1. Of course x₀ = (W + 1)/(1 + ε), because (x₀, 0) is the intersection point of the line y = (B(1 + ε)/(W + 1))x − B with the horizontal axis. Analogously, y₀ = B/(1 + ε). We construct a rectangle embedded in our tetragon in such a way that its sides are proportional to the sides of the whole probability space, its center is the point (x′, y′), where x′ = E[S_w] and y′ = E[S_b], and its area is maximal under these conditions. Note that the point (x′, y′) lies on the diagonal of the probability space. Without loss of generality we can assume that W ≥ B. Note that x′ = E[S_w] = p(W + 1) and y′ = E[S_b] = pB. According to our geometric construction we get the following proportions:
(εB/(1 + ε)) / (W + 1) = h_b / ((1 − p)(W + 1)),
(ε(W + 1)/(1 + ε)) / B = h_w / ((1 − p)B).
Fig. 1. We estimate the sum of the probabilities over the tetragon T by the sum of the probabilities over the rectangle enclosed by T
Thus
h_b = (εB/(1 + ε))(1 − p),    h_w = (ε(W + 1)/(1 + ε))(1 − p).
Let S₁ = (x₁, y₁) and S₂ = (x₂, y₂), and let h′_w = x₂ − x′ and h′_b = y₁ − y′. Since the slope of the segment (x₀, 0)(W + 1, B) is higher than that of the segment (0, 0)(W + 1, B), which in turn is higher than that of the segment (0, y₀)(W + 1, B), we have
h′_w > h_w/2 = ε(W + 1)(1 − p)/(2(1 + ε)),
h′_b > h_b/2 = εB(1 − p)/(2(1 + ε)).
Therefore
γ = h′_w/(p(W + 1)) = h′_b/(pB) > ε(1 − p)/(2p(1 + ε)).

Fact 2 (Chernoff bound). Let X ∼ Bin(n, p). For 0 < γ < 1,
P[|X − E[X]| ≥ γE[X]] ≤ 2e^{−γ²E[X]/3}.
This fact can be found, for example, in [13].
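Fact 2 can be sanity-checked against the exact binomial tail. The following is a small self-contained numeric check that we add for illustration (the parameter values n = 400, p = 0.2 are arbitrary):

```python
from math import comb, exp

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_sided_tail(n, p, gamma):
    """Exact P[|X - np| >= gamma * np] for X ~ Bin(n, p)."""
    mu = n * p
    return sum(binom_pmf(k, n, p) for k in range(n + 1)
               if abs(k - mu) >= gamma * mu)

def chernoff_bound(n, p, gamma):
    # Right-hand side of Fact 2 with E[X] = np.
    return 2 * exp(-gamma**2 * n * p / 3)
```

For every γ in (0, 1) the exact tail of Bin(400, 0.2) stays below `chernoff_bound(400, 0.2, gamma)`, as Fact 2 guarantees; the bound is loose for small γ and sharpens as γ grows.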
Applying Chernoff bounds we instantly get
P[|S_w − E[S_w]| ≥ γE[S_w]] ≤ 2e^{−E[S_w]γ²/3} < 2e^{−(W+1)ε²(1−p)²/(12p(1+ε)²)},
P[|S_b − E[S_b]| ≥ γE[S_b]] ≤ 2e^{−E[S_b]γ²/3} < 2e^{−Bε²(1−p)²/(12p(1+ε)²)}.
Under the assumption W ≥ B the latter inequality is the weaker one; thus, as the random variables S_w and S_b are independent, we have
P[|S_w − E[S_w]| ≥ γE[S_w] ∪ |S_b − E[S_b]| ≥ γE[S_b]] < 4e^{−Bε²(1−p)²/(12p(1+ε)²)}.
Since Σ_{(x,y)∈C} P[(x, y)|(W, B)] ≥ Σ_{(x,y)∈T} P[(x, y)|(W, B)], we immediately conclude that
Σ_{(x,y)∈C} P[(x, y)|(W, B)] > 1 − 4e^{−min(W,B)ε²(1−p)²/(12p(1+ε)²)}.

Proof (Theorem 1). Combining Lemma 1 and inequality (2), we deduce that for given B, W and ε, if we find p for which
1 − 4e^{−min(W,B)ε²(1−p)²/(12p(1+ε)²)} ≥ 1 − δ,
then random sampling with that p will preserve (1, ε, δ)-privacy. After some straightforward transformations this inequality can be rewritten as
S(p) = p² − (2 + 12 log(4/δ)(ε + 1)²/(min(W, B)ε²)) p + 1 ≥ 0.
The function S(p) has two real roots; the smaller one (p₀) lies in the interval (0, 1). Moreover, when we set α = log(4/δ), β = (1 + ε)²/ε², the value of this root is given by the expression
p₀ = 1 + 6αβ/min(W, B) − (2√3/min(W, B)) · √(min(W, B)αβ + 3(αβ)²).
For a given database, as long as the sanitizer uses p ∈ [0, p₀], random sampling preserves (1, ε, δ)-privacy.
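The closed form for p₀ can be verified directly: it should be a root of S(p), lie in (0, 1), and (as exploited in the next section) grow with min(W, B). Below is a small self-contained check, with illustrative parameter values of our own choosing:

```python
from math import log, sqrt

def p0(W, B, eps, delta):
    """Critical sampling probability of Theorem 1 (smaller root of S)."""
    M = min(W, B)
    alpha = log(4 / delta)
    beta = (1 + eps) ** 2 / eps ** 2
    ab = alpha * beta
    return 1 + 6 * ab / M - (2 * sqrt(3) / M) * sqrt(M * ab + 3 * ab ** 2)

def S(p, W, B, eps, delta):
    """Quadratic from the proof of Theorem 1; S(p) >= 0 gives privacy."""
    M = min(W, B)
    ab = log(4 / delta) * (1 + eps) ** 2 / eps ** 2
    return p * p - (2 + 12 * ab / M) * p + 1
```

For example, p0(1000, 800, 0.2, 0.01) evaluates to roughly 0.2, S vanishes there up to floating-point error, and doubling the cardinalities raises the admissible sampling probability.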
4
General Results
In this section we generalize our considerations and provide results for databases with more than two distinct values. Furthermore, we discuss the value of the sampling probability as a function of ε and M, where M is the multiplicity of the rarest value in the database. We also show how mechanisms that are (1, ε, δ)-private handle databases that differ in more than one row.
4.1
Multicolor Databases
Theorem 2. Let the database contain k distinct values. Let M denote the multiplicity of the rarest value in the database and set α = log(4/δ), β = (1 + ε)²/ε². Random sampling with probability p equal to or smaller than
1 + 6αβ/M − (2√3/M) · √(Mαβ + 3(αβ)²)
preserves (1, ε, δ)-privacy.

Before we prove the above theorem, let us note the following fact.

Fact 3. Let α = log(4/δ), β = (1 + ε)²/ε². The function
F(M) = 1 + 6αβ/M − (2√3/M) · √(Mαβ + 3(αβ)²)
is increasing on R⁺.

Proof. This fact follows from the inequality F′(M) > 0, which holds for every positive real M.
Proof (Theorem 2). This theorem is a simple consequence of Theorem 1 and Fact 3. Let M_i be the multiplicity of the i-th value in the database, i ∈ {1, . . . , k}, and let M_n = min_i M_i. From the inequality min(M_l, M_m) ≥ M_n for all l ≠ m and the fact that F(M) is an increasing function, we get F(min(M_l, M_m)) ≥ F(M_n) for all l ≠ m. This means that sampling with probability equal to or smaller than F(min_i M_i) preserves (1, ε, δ)-privacy.
From the above discussion we instantly get the following corollary.

Corollary 1. For any databases D₁, D₂, . . . , D_n, if the rarest value among these databases appears M times, then sampling with probability F(M) preserves (1, ε, δ)-privacy for each database.

Translating this into practical terms, we can use a universal value p = F(M) for all databases as long as we can assume that the global minimum of the cardinalities of distinct values is at least M. 4.2
Properties of Our Formula for p
Remark 1. It is interesting that if we fix δ (which means fixing α), then F(M) tends to a constant as ε tends to infinity. We have
lim_{ε→∞} β = lim_{ε→∞} (1 + ε)²/ε² = 1,
thus
lim_{ε→∞} F(M) = 1 + 6α/M − (2√3/M) · √(Mα + 3α²).
It means that at some point giving up more ε-privacy will not increase the value of F(M) significantly. Therefore, if we want to achieve a greater value of F(M) at the expense of privacy, it is reasonable to increase δ rather than ε.
Remark 2. It is also worth noticing that, for any fixed ε and δ,
lim_{M→∞} F(M) = 1.
Let N denote the total number of records in the whole database. The above limit shows that we can expect reasonable values of F(M) when dealing with large databases in which the cardinality of the rarest value tends to infinity as N tends to infinity. This is of course an intuitive property; note, however, that some of the previous results do not provide it. 4.3
(c, ε, δ)-Privacy for Multicolor Databases
Let us assume that the mechanism S preserves (1, ε, δ)-privacy. What can be said when comparing two databases differing in c > 1 positions?

Theorem 3. A sanitization mechanism which is (1, ε, δ)-private is also (c, (1 + ε)^c − 1, cδ)-private.

Proof. Assume that the databases D₁ and D_{c+1} differ in c rows. Let us construct a sequence of databases (D₁, D₂, . . . , D_c, D_{c+1}) such that each differs from the previous one in one row. Let S_i be the random variable describing a sample generated by the investigated mechanism from the database D_i (1 ≤ i ≤ c + 1). We start by showing that a mechanism which is (1, ε)-private is also (c, (1 + ε)^c − 1)-private. For each i ∈ {1, . . . , c} we have 1/(1 + ε) ≤ P[S_i = s]/P[S_{i+1} = s] ≤ 1 + ε. Thus
1/(1 + ε)^c ≤ (P[S₁ = s]/P[S₂ = s]) · (P[S₂ = s]/P[S₃ = s]) · · · (P[S_c = s]/P[S_{c+1} = s]) ≤ (1 + ε)^c,
which gives
1/(1 + ε)^c ≤ P[S₁ = s]/P[S_{c+1} = s] ≤ (1 + ε)^c.

Now we need to show that our mechanism produces samples for which the above inequality holds with probability at least 1 − cδ. Let R denote the event {1/(1 + ε)^c ≤ P[S₁ = s]/P[S_{c+1} = s] ≤ (1 + ε)^c}, and let R_i denote the event {1/(1 + ε) ≤ P[S_i = s]/P[S_{i+1} = s] ≤ 1 + ε}. By R̄_i we mean the complement of the event R_i, i.e., R̄_i = {P[S_i = s]/P[S_{i+1} = s] < 1/(1 + ε) ∨ P[S_i = s]/P[S_{i+1} = s] > 1 + ε}. Notice that the event R can be stated as follows:
1/(1 + ε)^c ≤ (P[S₁ = s]/P[S₂ = s]) · (P[S₂ = s]/P[S₃ = s]) · · · (P[S_c = s]/P[S_{c+1} = s]) ≤ (1 + ε)^c.
Then it is easy to see that R₁ ∩ R₂ ∩ . . . ∩ R_c ⊆ R, and thus P[R] ≥ P[R₁ ∩ R₂ ∩ . . . ∩ R_c]. Since our mechanism is (1, ε, δ)-private, we know that P[R_i] ≥ 1 − δ, which gives P[R̄_i] ≤ δ. Finally we have
P[R] ≥ P[R₁ ∩ R₂ ∩ . . . ∩ R_c] = 1 − P[R̄₁ ∪ R̄₂ ∪ . . . ∪ R̄_c] ≥ 1 − (P[R̄₁] + P[R̄₂] + . . . + P[R̄_c]) ≥ 1 − cδ.
It is worth noticing that Theorem 3, stated and proved above, is universal, i.e., it holds for all sanitization mechanisms (not only for random sampling mechanisms). Let us also note that a similar claim appears in [5]: the authors state there that (1, ε, δ)-privacy implies (c, cε, cδ)-privacy; however, we did not manage to find any precise justification of this conjecture.
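The telescoping argument in the proof of Theorem 3 can be illustrated numerically: any product of c per-step ratios, each bounded by 1 + ε, stays within (1 + ε)^c. The sketch below is a toy check with randomly generated (hypothetical) ratio values, not tied to any concrete mechanism.

```python
import random

def composed_eps(eps, c):
    # Privacy parameter after composing c steps: (1 + eps)^c - 1.
    return (1 + eps) ** c - 1

random.seed(1)
eps, c = 0.1, 5
# Each per-step likelihood ratio lies in [1/(1+eps), 1+eps] ...
ratios = [random.uniform(1 / (1 + eps), 1 + eps) for _ in range(c)]
# ... so their telescoping product lies in [1/(1+eps)^c, (1+eps)^c].
product = 1.0
for r in ratios:
    product *= r
assert (1 + eps) ** (-c) <= product <= (1 + eps) ** c
```

Note how fast the guarantee degrades: for ε = 0.1 and c = 5 the composed parameter (1 + ε)^c − 1 is already about 0.61, several times the single-step ε.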
5
Comparison to Previous Results
Let us compare the result obtained here to the one given in [5]. Let pnew(D, ε, δ) denote the probability introduced in the previous section and pold(D, ε, δ) the sampling frequency suggested in [5]. First of all, let us notice that such a comparison is not straightforward, for the following reasons:
– the value pold(D, ε, δ) depends on two database-specific parameters: k (the number of distinct values in the database D) and t (the total number of values in the database D that occur fewer than 2 log(2k/δ)/ε times); pold is a decreasing function of both of these parameters,
– for a given δ, the value pold returned by [5] guarantees the (1, ε′, δ) privacy level, where ε′ = max{2(pold(D, ε, δ) + ε), 6pold(D, ε, δ)}; there is no obvious way to reverse this calculation and extract ε for given δ and ε′,
– for a given database D and parameter δ there are some values of ε′ for which no ε exists (because ε′ = max{2(pold(D, ε, δ) + ε), 6pold(D, ε, δ)}, as a function of ε on [0, ∞), is not a surjection; see Figure 2),
– p and ε are restricted by p + ε < 1/2.
Of course, we would like to confront sampling frequencies guaranteeing the same privacy level. In order to do this, we compare the function pold(D, ε, δ) with pnew(D, max{2(pold(D, ε, δ) + ε), 6pold(D, ε, δ)}, δ) = pnew(D, ε′, δ).
Let D be the database for which we perform the comparison. Since pold is a decreasing function of k, the number of distinct values in the database, we set k = 2, as it is the smallest reasonable value (k = 1 means that all rows have the same value). This means that the database D contains rows with only two distinct values. Similarly, the function pold decreases with t, so we consider only situations in which t ≤ 1. This results in a particular structure of the database D: one of the values (the black one) must not be rare, no matter what the parameters ε and δ are, so we set the
Fig. 2. The value of ε′ = max{2(pold(D, ε, 0.1) + ε), 6pold(D, ε, 0.1)} as a function of ε, for D containing infinitely many black rows and 50 white rows
multiplicity of the black rows in D to infinity. The multiplicity of the second (white) value in the database is used as a parameter.
The comparison between pold (solid line) and pnew (dashed line) for some fixed parameters ε and δ is given in Figure 3. One can notice that there exist situations in which the solution from [5] performs better than ours (i.e., it provides a higher p). This is true only for certain values of ε (ε cannot be greater than 0.25) and for a few database sizes. To be more precise, we can calculate the multiplicities of the white rows for which the solution from [5] outputs a better p than ours. Let us set ε′ = max{2(pold(D, ε, δ) + ε), 6pold(D, ε, δ)}, α = log(4/δ), β = (1 + ε′)²/ε′², and let W₀ be the solution of the following equation:
1 + 6αβ/W − (2√3/W) · √(Wαβ + 3(αβ)²) = ε.
If W₀ ≥
2 log( 4δ )
then for all databases with the multiplicities of the white rows 2 log( 4 )
from the interval I = [ δ , W0 ] function pold performs better than pnew . For a given and δ it is very likely that I will be short or even empty. What is more, asymptotically our solution is far better, namely lim pnew (D, , δ) = 1
W →∞
where = max {2(pold (D, , δ) + ), 6pold (D, , δ)} and . 6 To sum up the comparison, we provide p preserving (1, , δ)-privacy and which is higher than p suggested in [5] in most cases. What is more, our formula has lim pold (D, , δ) =
W →∞
Fig. 3. Comparison between pnew (dashed line) and pold (solid line) for: (left) ε = 0.1, δ = 0.01; (right) ε = 0.2, δ = 0.01
an explicit form and depends only on the multiplicity of the rarest value in the database (and not on other values such as the number of distinct values or the count of the rare values). Therefore it can be applied more easily in further applications, as we show in the next section.
6
Distributed Sampling
In this section we outline how the fact that we use a universal (i.e., the same for each sample and each database) sampling probability p can be helpful for sampling distributed databases. More precisely, let us consider a distributed system of units, called nodes. We assume that each node contains a database with several (say l) kinds of objects. Such a database can be represented as a vector (n1, n2, . . . , nl), i.e., nj is the number of objects of the j-th kind. From such a database one can generate a sample according to the sampling rules described in the previous sections. Thus a sample is a random vector (c1, c2, . . . , cl) such that cj ∼ Bin(nj, p). Note that if several samples are chosen with the same parameter p from several databases, they can all be added together coordinate-wise (as vectors). One can observe that the resulting vector has the same distribution as a sample of the union of the databases. Indeed, let ni be the number of objects of the first kind in the database of the i-th node. Note that if X1 ∼ Bin(n1, p), . . . , Xl ∼ Bin(nl, p), then X1 + · · · + Xl ∼ Bin(n1 + · · · + nl, p) as long as X1, . . . , Xl are independent, which seems to be a natural assumption. The situation is exactly the same for the other coordinates representing the other kinds of objects. That observation can be very useful. One may think, for example, of a network of sensors (nodes). We assume that the nodes may not interact or that the interaction between the nodes is highly restricted (limited energy, small communication range). Each node measures some environmental parameters and collects related data, which are finally mapped
into a relatively small set of objects of several kinds. Periodically, some external party (we call it the Collector) is given the possibility to get a statistical sample of the union of all the databases. One may assume that the Collector can communicate with each node only a very limited number of times (for example, the nodes may be distributed over a very large area). However, we do not want to allow the Collector to get information about the aggregated data (or even a sample of this data) from a particular node or even a small subset of nodes. In other words, the Collector is allowed to get some global information, but not local information. In this section we outline a protocol that realizes the goals described above. One of the techniques we use in our protocol is a slightly modified cryptographic counter [9].

Secure cryptographic counter. A cryptographic n-counter is a cryptographic primitive that can be seen as a kind of secure function evaluation method that allows one to “add” to a ciphertext without decryption. Moreover, only the holder of the secret key is able to find the actual state of the counter.

Definition 3 (Cryptographic Counter [9]). A cryptographic n-counter is a triple of algorithms (G, D, T ). Let S = S0 ∪ . . . ∪ Sn be a set of states representing numbers {0, . . . , n}. Algorithm G returns s0 ∈ S0 and a pair of keys pk, sk. Algorithms D and T denote the decryption and transition (incrementing) algorithms.
– Dsk(s) = j ⇔ s ∈ Sj
– Without sk it is not possible to computationally distinguish two states s, s′ from different sets of states.
– If s ∈ Si and i + j ≤ n then T (pk, s, j) ∈ Si+j.
Note that this definition is not fully formal and is given only for completeness; a fully formal definition is presented in [9]. Many natural implementations of cryptographic counters may be adapted to a system with several parties using several pairs of keys. Moreover, each party processing the counter can remove one cryptographic layer using its secret key. Only after all the layers are removed (i.e.,
the counter has been processed by all parties) is the Collector able to read the state of the counter. Such a modification may be constructed in a straightforward manner using universal re-encryption from [16]. A very similar approach, although in a different context, was presented in [15].

Protocol description. We assume that every node has an assigned pair of keys. The aim of the Collector is to get a sample from all the databases. Each sample is represented as a vector of cryptographic counters that are incremented at each node. The length of the vector is equal to the number of kinds of possible objects; however, in the description below we assume only one counter for the sake of simplicity.
1. The Collector is given s0 – the cryptographic counter secured with the keys of all the nodes.
2. The Collector, interacting with the consecutive nodes, presents the current state st of the cryptographic counter. The node
– computes the sample x and adds it to the counter: st+1 ← T (pkt, st, x),
– partially decrypts the counter.
3. After all the nodes have been visited, the Collector obtains the sample (i.e., the value of the counter).
Note that pkt is the actual public key. In the case of an implementation based on [16] it is just the product of the public keys of the parties not processed yet. Such a procedure works under the assumption that all the nodes are honest (but possibly curious). To make the scheme secure in other settings, depending on the trust model, it may be necessary to implement some additional steps.

Possible extensions. Note that the idea outlined above can be modified in many ways in order to fulfill the requirements of a particular system, especially taking into account various trust models.

Minimal cardinality of collected items. The above method is based on the fact that all the databases use the same sampling parameter p. As underlined in the first part of this paper, using the universal parameter p = F (M ) works well as long as one can ensure that the global minimal cardinality of the sampled items is M . Such an assumption is acceptable in many realistic scenarios related to collecting data in a distributed environment. However, in many cases it is not feasible. One possible method addressing this issue is to use two counters: the first contains the sample, while the other contains the exact value. The first counter is revealed only if the second is greater than or equal to M . Note that a particular realization of this idea strongly depends on the system, in particular on the trust model. If there is a single (semi-)trusted party, it can simply check the value of the second counter (note that this party does not learn the samples from the particular nodes). Otherwise, one may consider methods similar to the subprocedures of the e-voting protocol presented in [1].
The techniques from that paper allow a group of parties to tell whether an encrypted number exceeds a fixed threshold in such a way that no other information is revealed. Alternatively, various other secure function evaluation methods can be applied [8].

Threshold scheme. In some distributed systems it is not possible to have access to all the nodes collecting data. In such a case one may consider revealing the sample from the union of a proper subset of all the databases. Such a family of subsets can, for example, contain all the subsets of cardinality greater than a fixed k. This idea may be addressed using standard cryptographic techniques (e.g., secret sharing and threshold decryption schemes). Of course, in that case the sum of the samples is still a sample of the union of the databases.

Restricting the number of samples. Note that having enough independent binomial (multinomial) samples from the same database, one can approximate the sampled database with arbitrary accuracy using standard statistical techniques. This may obviously lead to breaking the privacy.
In this case a simple countermeasure is to release only a limited number of counters (e.g., one) per period and to control the leakage of information. Similarly, depending on the model, one may consider applying verifiable cryptographic counters [9]. We believe that the presented approach can be very useful for distributed and dynamic systems, with a particular focus on systems of weak devices (e.g., sensors) in which access to each node is expensive and communication between the devices is very restricted.
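The counter-based aggregation above can be illustrated end to end. The sketch below is neither the counter construction of [9] nor the universal re-encryption of [16]; it merely uses a toy Paillier-style additively homomorphic scheme (tiny fixed primes, so entirely insecure, and all names are ours) so that each node can add its binomial sample to the encrypted state without learning it, and only the Collector opens the final sum.

```python
import math
import random

# --- toy Paillier-style scheme (illustration only: tiny fixed primes) ---
P, Q = 1000003, 999983                 # both prime; real keys use ~1024-bit primes
N, N2 = P * Q, (P * Q) ** 2
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                   # valid since we use g = N + 1

def enc(m):
    """Encrypt m under the public key (N, g = N + 1)."""
    r = random.randrange(2, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(2, N)
    return pow(N + 1, m, N2) * pow(r, N, N2) % N2

def dec(c):
    """Decrypt with the secret key (LAM, MU)."""
    return (pow(c, LAM, N2) - 1) // N * MU % N

def add(c, m):
    """Transition T: 'add' m to the encrypted counter without decrypting it."""
    return c * enc(m) % N2

# --- distributed sampling: each node draws c_i ~ Bin(n_i, p) ---
p = 0.3
node_sizes = [40, 25, 60]              # n_i objects of one kind at node i
counter = enc(0)                       # s0, handed to the Collector
samples = []
for n_i in node_sizes:
    c_i = sum(random.random() < p for _ in range(n_i))   # Bin(n_i, p) sample
    samples.append(c_i)
    counter = add(counter, c_i)        # node increments without seeing the state

total = dec(counter)                   # only the Collector opens the counter
assert total == sum(samples)
print(total)
```

Because the samples add up under encryption, the decrypted total is distributed as a single Bin(n1 + · · · + nl, p) sample of the union of the databases, exactly as argued above; the Collector never sees any individual node's contribution.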
7
Conclusions and Future Work
We have presented a study on the differential privacy of random samples of databases. We have improved (in most cases) and simplified the result of [5]: the sampling probability which preserves (1, ε, δ)-privacy can now be derived explicitly for a given database and given ε and δ parameters. It has also been proved that preserving (1, ε, δ)-privacy implies preserving (c, (1 + ε)^c − 1, cδ)-privacy. Regarding the discussion in Section 6, from our perspective one of the most important practical questions is how to construct a protocol offering the same functionality and at least a fair level of security without using asymmetric cryptography. Such a lightweight modification would make the protocol much more suitable for systems of weak devices.
References
1. Desmedt, Y., Kurosawa, K.: Electronic voting: Starting over? In: Zhou, J., López, J., Deng, R.H., Bao, F. (eds.) ISC 2005. LNCS, vol. 3650, pp. 329–343. Springer, Heidelberg (2005)
2. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 265–284. Springer, Heidelberg (2006)
3. Dwork, C., McSherry, F., Talwar, K.: The price of privacy and the limits of LP decoding. In: Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, pp. 85–94 (2007)
4. Dinur, I., Nissim, K.: Revealing information while preserving privacy. In: Proceedings of the Twenty-Second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pp. 202–210 (2003)
5. Chaudhuri, K., Mishra, N.: When random sampling preserves privacy. In: Dwork, C. (ed.) CRYPTO 2006. LNCS, vol. 4117, pp. 198–213. Springer, Heidelberg (2006)
6. Dwork, C.: Differential privacy. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP 2006. LNCS, vol. 4052, pp. 1–12. Springer, Heidelberg (2006)
7. Dwork, C.: Differential Privacy: A Survey of Results. In: Agrawal, M., Du, D.-Z., Duan, Z., Li, A. (eds.) TAMC 2008. LNCS, vol. 4978, pp. 1–19. Springer, Heidelberg (2008)
8. Goldreich, O.: The Foundations of Cryptography, Volume 2: Basic Applications. Cambridge University Press (2004)
9. Katz, J., Myers, S., Ostrovsky, R.: Cryptographic Counters and Applications to Electronic Voting. In: Pfitzmann, B. (ed.) EUROCRYPT 2001. LNCS, vol. 2045, p. 78. Springer, Heidelberg (2001)
10. Kasiviswanathan, S.P., Smith, A.: A Note on Differential Privacy: Defining Resistance to Arbitrary Side Information. CoRR (2008)
11. Dalenius, T.: Towards a methodology for statistical disclosure control. Statistisk Tidskrift 15 (1977)
12. Duncan, G.: Confidentiality and statistical disclosure limitation. In: International Encyclopedia of the Social and Behavioral Sciences. Elsevier, Amsterdam (2001)
13. Janson, S., Łuczak, T., Ruciński, A.: Random Graphs. Wiley and Sons, Chichester (2000)
14. Fienberg, S.: Confidentiality and Data Protection Through Disclosure Limitation: Evolving Principles and Technical Advances. In: IAOS Conference on Statistics, Development and Human Rights (2000)
15. Gomułkiewicz, M., Klonowski, M., Kutyłowski, M.: Routing Based on Universal Re-Encryption Immune against Repetitive Attack. In: Lim, C.H., Yung, M. (eds.) WISA 2004. LNCS, vol. 3325. Springer, Heidelberg (2005)
16. Golle, P., Jakobsson, M., Juels, A., Syverson, P.: Universal Re-encryption for Mixnets. In: Okamoto, T. (ed.) CT-RSA 2004. LNCS, vol. 2964, pp. 163–178. Springer, Heidelberg (2004)
17. Rubin, D.B.: Discussion: Statistical Disclosure Limitation. Journal of Official Statistics 9(2) (1993)
18. Dwork, C., Naor, M., Reingold, O., Rothblum, G.N., Vadhan, S.: On the Complexity of Differentially Private Data Release. In: STOC 2009 (2009)
19. Blum, A., Ligett, K., Roth, A.: A Learning Theory Approach to Non-Interactive Database Privacy. In: STOC 2008 (2008)
20. Mitzenmacher, M., Upfal, E.: Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, Cambridge (2005)
Horizontal Fragmentation for Data Outsourcing with Formula-Based Confidentiality Constraints

Lena Wiese
National Institute of Informatics
2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan

Abstract. This article introduces the notion of horizontal fragmentation to the data outsourcing area. In a horizontal fragmentation, rows of tables are separated (instead of columns as for vertical fragmentation). We give a formula-based definition of confidentiality constraints and an implication-based definition of horizontal fragmentation correctness. Then we apply the chase procedure to decide this correctness property and present an algorithm that computes a correct horizontal fragmentation.
[email protected] Abstract. This article introduces the notion of horizontal fragmentation to the data outsourcing area. In a horizontal fragmentation, rows of tables are separated (instead of columns for vertical fragmentation). We give a formula-based definition of confidentiality constraints and an implicationbased definition of horizontal fragmentation correctness. Then we apply the chase procedure to decide this correctness property and present an algorithm that computes a correct horizontal fragmentation.
1
Introduction
The interest in outsourcing data to a third-party storage (“server”) site has increased over the last decade, with the main advantage being the reduction of storage requirements at the local (“owner”) site. Yet, because the storage server usually cannot be fully trusted, several approaches to protect the outsourced data have emerged. In general, there are the following approaches:
– Encryption only: Before outsourcing, all data tuples are encrypted [1,2]; query execution on the outsourced data is difficult and inexact.
– Vertical fragmentation and encryption: Some table columns are separated into different fragments as cleartext while other (partial) tuples are encrypted [3,4]; query execution is easier on the cleartext part, but decryption still has to be executed by the data owner to answer queries on the encrypted part.
– Vertical fragmentation only: When the data owner is willing to store some columns at his trusted local site in an owner fragment, other columns can be outsourced safely in a server fragment [5,6]; the fragmentation can be optimized with respect to assumptions on query frequencies.
In this article we refrain from using encryption. It has already been argued in [5] that encryption is not necessary if a fragmentation can be identified in which one fragment is stored at the trusted owner site. We reinforce the point that encryption is costly as it requires key management and long-term security of the encryption scheme. Moreover, querying encrypted data is often suboptimal [2] or only weak encryption is possible [7].

I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 101–116, 2010. © Springer-Verlag Berlin Heidelberg 2010
In this article we adopt the client-server setting of [5]. In their approach for vertical fragmentation, only projection onto columns is supported and thus the so-called “confidentiality constraints” are merely defined as sets of attributes of the database schema. They do not take into account the content – the actual data values – of a database instance. Moreover, this approach considers only one (“universal”) relation instead of a full-blown database schema with several relation symbols (which is usually the case for databases in some normal form). However, normalized databases are advantageous because they reduce storage requirements (by removing redundancy) and facilitate data management (e.g., by avoiding anomalies). In the same sense, vertical fragmentation also lacks the notion of database dependencies – that is, constraints that can be specified on the database relations. Such dependencies are crucial when it comes to controlling inferences: with dependencies, further information can be derived from some (partial) database entries. To extend the “vertical fragmentation only” approach we make the following contributions:
– We propose to use not only vertical but also horizontal fragmentation. In particular, we aim to filter out confidential rows to be securely stored at the owner site. The remaining rows can safely be outsourced to the server.
– We extend the expressiveness of the “confidentiality constraints” by using first-order formulas instead of sets of attribute names. This implies that vertical fragmentation can be data-dependent in the sense that only some cells of a column have to be protected.
– We explicitly allow a full database schema with several relation symbols and a set of database dependencies. With these dependencies we introduce the possibility of inferences to the fragmentation topic and provide an algorithm to avoid such inferences.
The paper is organized as follows. Section 2 sets the basic terminology.
Section 3 introduces logical formulas as the syntactical material for confidentiality constraints. Section 4 presents a definition of horizontal fragmentation correctness; it analyzes the problem of fragmentation checking and introduces a new algorithm for the computation of a correct fragmentation. Lastly, we argue that vertical and hybrid fragmentation can also be data-dependent (Section 5) and conclude this article in Section 6.
2
System Description
We view relational databases using the formalism of first-order predicate logic with equality (and possibly with other built-ins like comparison operators). A database has a schema DS = ⟨P, D⟩ where P is the set of relation symbols (that is, table names) and D is the set of dependencies between the relations. Each relation symbol P comprises a set of attributes (that is, data columns), and the arity arity(P) is the number of attributes of relation symbol P. Each such attribute has an associated domain of constant values. A database instance is a set of ground
atomic formulas (representing data tuples or data rows) in accordance with the database schema; these are formulas without variables, and each formula consists of one relation symbol that is filled with appropriate constant values. As database dependencies D we allow tuple-generating dependencies (tgds) and equality-generating dependencies (egds). Tuple-generating dependencies can contain both universal and existential quantifiers. Their body as well as their head consists of a conjunction of atomic formulas.

Definition 1 (Dependencies). A tuple-generating dependency (tgd) is a closed formula of the form ∀x φ(x) → ∃y ψ(x, y) where φ(x) and ψ(x, y) are conjunctions of atomic formulas. φ(x) is called the body and ∃y ψ(x, y) is called the head of the tgd. A tgd is called full if there are no existentially quantified variables y in ψ. An equality-generating dependency (egd) is a closed formula of the form ∀x φ(x) → (x = x′) where φ(x) is a conjunction of atomic formulas and x and x′ are contained in x.

Note that a tgd indeed consists of disjunctions and negations (as the material implication → is only an abbreviation for disjunction and negation); tgds can easily be written in conjunctive normal form by distributing the conjuncts of the head over the disjunctions of the body. For formulas that are more general than tgds (for example, arbitrary disjunctions), feasibility of fragmentation problems cannot be ensured. More precisely, decidability of the “fragmentation checking” and “fragmentation computation” problems (see Section 4) cannot be established in general, and the chase procedure as well as the search algorithm presented below are not guaranteed to terminate. For tgds, cyclicity also leads to undecidability; this is why we impose the additional restriction of weak acyclicity in Definition 2. But before doing that we introduce a running example.

Example 1.
In our running example, the database comprises some medical records and the relation symbols are P = {Illness, Treatment}. The relation Illness has the two attributes (that is, column names) Name and Disease; the relation Treatment has the two attributes Name and Medicine. An instance of the database schema with these relation symbols would be
I:

  Illness                Treatment
  Name   Disease         Name   Medicine
  Mary   Aids            Mary   MedA
  Pete   Flu             Mary   MedB
  Lisa   Flu             Pete   MedA
The set of dependencies D consists of tgds and egds. It can, for example, contain a formula that states that whenever a patient takes two specific medicines, then he is certainly ill with the disease aids: ∀x Treatment(x, MedA) ∧ Treatment(x, MedB) →
Illness(x, Aids). This is a full tgd. Or it can contain a tgd that states that if a patient receives medical treatment, there should be an appropriate diagnosis: ∀x, y Treatment(x, y) → ∃z Illness(x, z). An egd could be a key dependency or a functional dependency, if for example a patient ID uniquely determines a patient’s name.

On tgds we now pose the additional requirement of weak acyclicity (see [8]). This property rules out cyclic dependencies among existentially quantified variables. Such cyclicity could lead to the case that database instances satisfying the dependencies are infinite, which would make the system infeasible. Weakly acyclic tgds have nice properties that will be used later on; for example, the chase on weakly acyclic tgds is ensured to run in polynomial time.

Definition 2 (Weak acyclicity [8]). For a given set S of tgds, its dependency graph is determined as follows:
– For each relation symbol P occurring in S, create arity(P) many nodes P1, . . . , Parity(P); these are the positions of P.
– For every tgd ∀x (φ(x) → ∃y ψ(x, y)) in S: if a universally quantified variable x ∈ x occurs in a position Pi in φ and in a position Pj in ψ, add an edge from Pi to Pj (if it does not already exist).
– For every tgd ∀x (φ(x) → ∃y ψ(x, y)) in S: if a universally quantified variable x ∈ x occurs in a position Pi in φ and in a position Pj1 in ψ, and an existentially quantified variable y ∈ y occurs in a position Pj2 in ψ, add a special edge marked with ∃ from Pi to Pj2 (if it does not already exist).
A dependency graph is weakly acyclic iff it does not contain a cycle going through a special edge. We call a set of tgds weakly acyclic whenever its dependency graph is weakly acyclic.

In our example, the two tgds are acyclic (and hence also weakly acyclic) because edges only go from Treatment to Illness.
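Definition 2 translates directly into a procedure. The following Python sketch uses our own encoding (not from the paper): an atom is a (relation, argument-tuple) pair, variables are lowercase strings, constants are capitalized as in the running example, and a tgd is a triple (body, head, existential variables).

```python
def is_var(t):
    # convention: lowercase = variable (x, y, z), capitalized = constant (Aids, MedA)
    return t[0].islower()

def dependency_graph(tgds):
    """Build the dependency graph of Definition 2: positions are (relation, index)
    pairs; returns the sets of plain and special edges."""
    plain, special = set(), set()
    for body, head, evars in tgds:
        univ = {a for _, args in body for a in args if is_var(a) and a not in evars}
        for x in univ:
            body_pos = [(r, i) for r, args in body for i, a in enumerate(args) if a == x]
            head_pos = [(r, i) for r, args in head for i, a in enumerate(args) if a == x]
            exist_pos = [(r, i) for r, args in head for i, a in enumerate(args) if a in evars]
            for pi in body_pos:
                for pj in head_pos:
                    plain.add((pi, pj))
                if head_pos:  # Definition 2 also requires x to occur in the head
                    for pk in exist_pos:
                        special.add((pi, pk))
    return plain, special

def weakly_acyclic(tgds):
    plain, special = dependency_graph(tgds)
    adj = {}
    for u, v in plain | special:
        adj.setdefault(u, set()).add(v)
    def reaches(src, dst):
        seen, stack = set(), [src]
        while stack:
            u = stack.pop()
            if u == dst:
                return True
            if u not in seen:
                seen.add(u)
                stack.extend(adj.get(u, ()))
        return False
    # a cycle through a special edge (u, v) exists iff v can reach u back
    return not any(reaches(v, u) for u, v in special)

tgd_meds = ([("Treatment", ("x", "MedA")), ("Treatment", ("x", "MedB"))],
            [("Illness", ("x", "Aids"))], [])
tgd_diag = ([("Treatment", ("x", "y"))], [("Illness", ("x", "z"))], ["z"])
print(weakly_acyclic([tgd_meds, tgd_diag]))   # True: edges only go Treatment -> Illness
succ = ([("E", ("x", "y"))], [("E", ("y", "z"))], ["z"])
print(weakly_acyclic([succ]))                 # False
```

On the running example the check succeeds, while the classic weakly cyclic tgd ∀x, y E(x, y) → ∃z E(y, z) is rejected because of the special self-loop on the second position of E.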
For an open formula φ(x) (with free variables x) we can identify instantiations in an instance I; this corresponds to an evaluation of φ in I if φ is seen as a query: we find those constant values a (in accordance with the domains of the attributes) that can be substituted in for the variables x (written as [a/x]) such that φ(x)[a/x] holds in the instance I. For example, evaluating the formula Treatment(x, MedA) ∧ Treatment(x, MedB) would substitute in Mary for x and result in the answer Treatment(Mary, MedA) ∧ Treatment(Mary, MedB). Our aim is now to decompose an input instance I into two disjoint sets of tuples: the “server fragment” Fs and the “owner fragment” Fo . The server fragment has to be such that it satisfies the notion of “fragmentation correctness” (see Definition 4 below) even though we assume that the server has full (a priori) knowledge of the database dependencies D. This can be seen as a form of the “honest but curious” attacker model that is often used in cryptographic settings.
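The instantiation-finding step just described is conjunctive query answering over a set of ground atoms. A naive Python sketch (our own encoding and names, with the same lowercase-variable convention as the running example) might look as follows.

```python
def is_var(t):
    # convention: lowercase = variable (x, y), capitalized = constant (Mary, MedA)
    return t[0].islower()

def answers(query, instance):
    """All substitutions [a/x] making every atom of the conjunctive
    query hold in the instance (a set of ground atoms)."""
    subs = [{}]
    for rel, args in query:
        nxt = []
        for s in subs:
            for frel, fargs in instance:
                if frel != rel or len(fargs) != len(args):
                    continue
                s2, ok = dict(s), True
                for a, f in zip(args, fargs):
                    if is_var(a):
                        if s2.setdefault(a, f) != f:   # conflicting binding
                            ok = False
                            break
                    elif a != f:                       # mismatching constant
                        ok = False
                        break
                if ok:
                    nxt.append(s2)
        subs = nxt
    return subs

# the instance of Example 1
I = {("Illness", ("Mary", "Aids")), ("Illness", ("Pete", "Flu")),
     ("Illness", ("Lisa", "Flu")),
     ("Treatment", ("Mary", "MedA")), ("Treatment", ("Mary", "MedB")),
     ("Treatment", ("Pete", "MedA"))}

q = [("Treatment", ("x", "MedA")), ("Treatment", ("x", "MedB"))]
print(answers(q, I))   # [{'x': 'Mary'}]
```

A closed formula (all variables existentially quantified) holds in I exactly when `answers` returns a non-empty list, which is how the confidentiality constraints of the next section can be evaluated.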
3
An Extended Syntax for Confidentiality Constraints
Usually for vertical fragmentation (see [3,4,5]) a confidentiality constraint is just a subset of the attributes of a universal relation. Its meaning is that no
combination of values (a subtuple of the universal relation with all attributes of the constraint) may be fully disclosed. For example, for the relation Illness the confidentiality constraint {Name, Disease} means that no full tuple of Illness must be disclosed; but either the Name column or the Disease column may appear in a secure fragment. The singleton constraint {Name} means that the Name column must be protected but the Disease column can be published. In other words, a confidentiality constraint is satisfied if either one of its attributes is encrypted in the outsourced relation or the universal relation is decomposed such that each outsourced fragment misses at least one of the attributes involved in the constraint.

We now introduce the formula-based notation for confidentiality constraints that will be used throughout this article. Attribute-based confidentiality constraints for vertical fragmentation can be expressed by formulas with free variables: the free variables of a formula are those contained in the confidentiality constraint. For example, the confidentiality constraint {Name, Disease} as a formula will be written as Illness(x, y). The variable x for the attribute Name as well as the variable y for the attribute Disease are free, such that either column can be removed (to yield a secure fragment) or encrypted. Other attributes not involved in a confidentiality constraint are written as existentially quantified (hence bound) variables. For example, the singleton constraint {Name} will be expressed as ∃y Illness(x, y): the only free variable is x and hence the Name column must be protected.

Going beyond the attribute-based confidentiality constraints used in prior work, we now show how formula-based constraints can greatly improve expressiveness; formulas make it possible to express finer-grained confidentiality requirements:
1.
We can easily express protection of whole relations by existentially quantifying all variables instead of using several singleton constraints. For example ∃xy Illness(x, y) expresses that the whole relation Illness must not be outsourced to the server. 2. We can express data-dependent constraints by using constant values. For example, ∃x Illness(x, Aids) signifies that no row with the value Aids for the attribute Disease must be outsourced. In contrast, the open formula Illness(x, Aids) signifies that no combination of a patient name with the disease aids must be outsourced; that is, all patient names of those rows with an aids entry must be protected. This makes a difference when hybrid fragmentation is used where vertical and horizontal fragmentation can be combined. 3. We can combine several atomic expressions (expressions with one relation symbol only) in formulas with logical connectives like conjunction. For example, Illness(x, Aids) ∧ Treatment(x, MedB) means that for patients suffering from aids and at the same time being treated with a particular medicine MedB, either the name column from the relation Illness or from the relation Treatment must be suppressed. For the formula ∃y(Illness(x, Aids) ∧ Treatment(x, y)) the same applies for any medicine whatsoever.
With our semantics, protection of a disjunctive formula (such as ∃x (Illness(x, Aids) ∨ Illness(x, Cancer))) can be simulated by splitting the formula into separate constraints (∃x Illness(x, Aids) and ∃x Illness(x, Cancer)): a server not allowed to know the whole disjunction is also not allowed to know any of the single disjuncts. In other words, each single disjunct implies the whole disjunction. We define formula-based confidentiality constraints – which can be used for horizontal as well as hybrid fragmentation – as formulas that use the syntactic material (relation symbols and constants) of the database schema; we restrict the syntax to formulas without negation (“positive formulas”) that use only conjunction ∧ as a logical connective and may have some variables bound by existential quantifiers in a prefix.

Definition 3 (Formula-based confidentiality constraints). Formula-based confidentiality constraints are positive conjunctive formulas, possibly with an existential prefix, that mention only relation symbols and constants from the domains of the attributes as defined by the database schema.

Free variables will only be used for vertical fragmentation. In the next section we concentrate on horizontal fragmentation; in this case we restrict confidentiality constraints to “closed” formulas, that is, all variables will be existentially quantified. In sum, a set of formula-based confidentiality constraints for horizontal fragmentation corresponds to a union of conjunctive Boolean queries. Another result of [8] that we will use is that certain answers for unions of conjunctive queries can be computed in polynomial time.
4
Horizontal Fragmentation
For vertical fragmentation, fragments consist of some cleartext columns and some encrypted (partial) tuples. In [5], the server and the owner fragment can simply be represented by two disjoint sets of attribute names. The natural join is used for reconstruction of the original relation (or parts of the original relation after a query; see [3]). In [5], the join of the server and the owner fragment has to be computed only on the tuple id because of an additional non-redundancy requirement. Previous work for vertical fragmentation covers the following two requirements called “fragmentation correctness”: completeness (that is, the original relation can be reconstructed by the owner from the fragments) and confidentiality (not all attributes of an attribute-based confidentiality constraint are contained in one fragment). In the “fragmentation only” approach [5] the requirement of non-redundancy (each attribute is contained either in the server or the owner fragment) is added; this concept has not been analyzed for approaches involving encryption because encrypted tuples usually contain redundant information. In contrast, in our horizontal fragmentation approach fragments are sets of rows instead of sets of columns. The fragments (the rows in the server and the
owner fragment) have to be combined again by simply taking the union ∪ of the fragments. We now introduce our notion of fragmentation correctness for horizontal fragmentation. The completeness requirement easily translates to horizontal fragmentation by requiring that the union of the fragments (that is, rows) yields the original database instance. In the same sense, non-redundancy means that no row is contained in both the server and the owner fragment. The confidentiality requirement is more complex than in the vertical case because
– it depends on the data in the database instance and not only on the attribute names,
– it involves the database dependencies that are assumed to be known a priori by the server,
– it respects the logical nature of closed formula-based confidentiality constraints.
Hence we base confidentiality on the notion of logical implication |=. A set S of formulas implies a formula f (written S |= f) if and only if every model (that is, every satisfying interpretation) of S also satisfies f. If the server knows some dependencies between data – as for example the database dependencies D – these can be applied as deduction rules on the server fragment to infer other facts that are presumably protected in the owner fragment. In our system we have a strong attacker model in the sense that we assume the server to be aware of all dependencies in D, and hence the server fragment has to be such that application of these dependencies does not enable the inference of any of the confidentiality constraints. We can thus say that a fragmentation ensures confidentiality if the server fragment (treating each tuple as a ground atomic formula) together with the database dependencies (applied as deduction rules) does not imply any formula-based confidentiality constraint. We adapt Definition 2 of [5] to formula-based confidentiality constraints as follows.
Note that our fragments are sets of tuples (that is, ground atomic formulas) in contrast to [5], where the fragments are sets of attribute names. Also note that for horizontal fragmentation we only accept closed confidentiality constraints, as already mentioned in Section 3.

Definition 4 (Horizontal fragmentation correctness). Let I be an instance of a database schema DS = ⟨P, D⟩, C = {c1, . . . , cm} be a set of closed formula-based confidentiality constraints, and F = {Fo, Fs} be a fragmentation of I, where Fo is stored at the owner and Fs is stored at a storage server. F is a correct horizontal fragmentation with respect to C iff:
1) Fo ∪ Fs = I (completeness);
2) for every ci ∈ C, Fs ∪ D ⊭ ci (confidentiality);
3) Fo ∩ Fs = ∅ (non-redundancy).

Our aim is now twofold: we first analyze how a given fragmentation can be checked for correctness and then elaborate how a correct fragmentation can be computed from an input instance.
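The three conditions of Definition 4 translate directly into set operations. Below is a minimal Python sketch (the function and variable names are our own); the `implies` argument is a caller-supplied stand-in for the entailment test Fs ∪ D |= ci, which Section 4.1 shows how to decide via the chase:

```python
def is_correct_fragmentation(I, Fo, Fs, constraints, implies):
    """Check Definition 4 on instances given as sets of ground tuples.

    `implies(Fs, c)` is a caller-supplied stand-in for the entailment
    test Fs ∪ D |= c (decidable via the chase, see Section 4.1).
    """
    complete = (Fo | Fs == I)                                    # 1) completeness
    confidential = all(not implies(Fs, c) for c in constraints)  # 2) confidentiality
    non_redundant = not (Fo & Fs)                                # 3) non-redundancy
    return complete and confidential and non_redundant

# Toy check with one relation and one constraint "some tuple mentions Aids";
# the `leaks_aids` predicate is a trivial illustrative implication test.
I  = {("Illness", "Mary", "Aids"), ("Illness", "Lisa", "Flu")}
Fo = {("Illness", "Mary", "Aids")}
Fs = {("Illness", "Lisa", "Flu")}
leaks_aids = lambda frag, c: any(t[2] == "Aids" for t in frag)
print(is_correct_fragmentation(I, Fo, Fs, ["aids-constraint"], leaks_aids))  # True
```

Swapping Fo and Fs makes the check fail, because the server fragment would then imply the constraint.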
L. Wiese

4.1 Fragmentation Checking
We now analyze the following problem:

Problem 1. Given a database schema DS = ⟨P, D⟩, an instance I of DS, a set C = {c1, . . . , cm} of closed formula-based confidentiality constraints, and a fragmentation F = {Fo, Fs}, the fragmentation checking problem is to decide whether F is a correct horizontal fragmentation of I.

The completeness and non-redundancy requirements of Definition 4 can be checked easily by the owner. Checking confidentiality, however, is more complex. We have to check that Fs does not reveal any confidentiality constraint by itself; neither should Fs imply any confidentiality constraint when the server applies the database dependencies to the server fragment. In general, it might happen that the server fragment Fs does not satisfy the database dependencies, so that the server can use them to deduce other facts. To ensure that the deduced facts do not breach the confidentiality constraints, the owner has to apply the dependencies to the server fragment when checking the confidentiality requirement. We will use results from the "data exchange" area to decide the fragmentation checking problem. The chase procedure was introduced as a method to decide implication between two sets of dependencies [9]. Later on, it was used in [8] to compute "universal solutions" in a data exchange setting and in [10] for database repair checking. From a confidentiality point of view it was used in [11] to extend a mandatory access control (MAC) system and mentioned in [12] as a method to decide security of view instances. In particular, the results of [8] show that for the wide class of weakly acyclic tuple-generating dependencies and equality-generating dependencies (see Definition 2), the chase computes a universal solution containing some "null values" in time polynomial in the size of the input instance.
It is also shown in [8] that if a conjunctive query is evaluated in a universal solution, this evaluation can be done in polynomial time and the result is the set of "certain answers": those answers that hold in every possible data exchange solution of the input instance. The results of [8] can be used to check confidentiality of constraints in a server fragment Fs as follows. If we restrict the database dependencies D to weakly acyclic tgds and egds, the chase on the server fragment Fs terminates in time polynomial in the size of the server fragment. It results in a chased server fragment containing null values: existentially quantified variables in tgds are filled in with new null values, and egds are applied to equate some values. More formally, a chase step can be executed ("applied") if there is a mapping (a homomorphism) from the variables in the body of a dependency to the constants const(Fs) and the null values nulls(Fs) in the server fragment. See also [8,9,11] for details.

Definition 5 (Application of dependencies). A tgd ∀x φ(x) → ∃y ψ(x, y) can be applied to the server fragment Fs if
– there is a homomorphism h : x → const(Fs) ∪ nulls(Fs) such that for every atom P(x1, . . . , xk) (where the free variables are xi ∈ x for i = 1 . . . k) in φ(x), the atom P(h(x1), . . . , h(xk)) is contained in Fs,
– but h cannot be extended to map the existentially quantified variables y in the head ∃y ψ(x, y) to const(Fs) ∪ nulls(Fs) such that for every atom Q(x1, . . . , xl, y1, . . . , yl′) (where the free variables are xi ∈ x for i = 1 . . . l and yj ∈ y for j = 1 . . . l′) in ψ(x, y), the atom Q(h(x1), . . . , h(xl), h(y1), . . . , h(yl′)) is contained in Fs.

The result of applying a tgd to Fs is the union of Fs and all those atoms that can be generated from the atoms Q(x1, . . . , xl, y1, . . . , yl′) of ψ(x, y) with the variables x mapped according to h and the variables y each mapped to a new null value.

An egd ∀x φ(x) → (x = x′) can be applied to the server fragment Fs if
– there is a homomorphism h : x → const(Fs) ∪ nulls(Fs) such that for every atom P(x1, . . . , xk) in φ(x), the atom P(h(x1), . . . , h(xk)) is contained in Fs,
– but h(x) ≠ h(x′).

The result of applying an egd to Fs is obtained by
– replacing all occurrences of the null value in Fs with the constant, if one of h(x) and h(x′) is a null value and the other is a constant, or
– replacing all occurrences of one null value in Fs with the other, if both h(x) and h(x′) are null values.

Note that because we assume that the server fragment Fs is a subset of the input instance I and I is assumed to satisfy the dependencies, chasing with an egd cannot "fail" (that is, lead to an inconsistency). On the chased fragment the notion of "certain answers" can also be defined: a certain answer to a query is one that holds in any possible fragment that contains Fs as a subset and satisfies the database constraints D; we can find the certain answers by posing a query to the chased server fragment.
Because we defined confidentiality constraints to be positive, existential, conjunctive and closed formulas, when we pose a constraint as a query to the chased server fragment, the certain answers can be computed in polynomial time as shown in [8]. We can be sure that confidentiality of a constraint is preserved if the certain answer to this constraint in the chased server fragment is false. We give a small example to illustrate the procedure.

Example 2. Assume that we are given the server fragment
Fs:

Illness
Name  Diagnosis
Lisa  Flu

Treatment
Name  Medicine
Mary  MedB
Mary  MedA
Pete  MedC
The set of dependencies contains two tgds that link treatments with diseases:

D = {∀x Treatment(x, MedC) → ∃z Illness(x, z),
     ∀x Treatment(x, MedA) ∧ Treatment(x, MedB) → Illness(x, Aids)}

Chasing Fs with D results in the following instance, where τ is a null value:

Fchase:
Illness
Name  Diagnosis
Lisa  Flu
Pete  τ
Mary  Aids

Treatment
Name  Medicine
Mary  MedB
Mary  MedA
Pete  MedC
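This chase run can be sketched in Python as follows. For brevity, the general homomorphism-based chase of Definition 5 is specialized to the two tgds of Example 2, hard-coded as a single step function; tuples are modeled as (relation, name, value) triples and a `Null` class plays the role of the labelled null τ:

```python
import itertools

class Null:
    """A labelled null value (the τ in the chased fragment)."""
    _count = itertools.count()
    def __init__(self):
        self.id = next(Null._count)
    def __repr__(self):
        return f"tau{self.id}"

def chase_step(frag):
    """Apply the two tgds of Example 2 once; return True if frag changed."""
    changed = False
    # tgd1: Treatment(x, MedC) -> exists z: Illness(x, z)
    for (rel, name, med) in list(frag):
        if rel == "Treatment" and med == "MedC":
            if not any(t[0] == "Illness" and t[1] == name for t in frag):
                frag.add(("Illness", name, Null()))
                changed = True
    # tgd2: Treatment(x, MedA) & Treatment(x, MedB) -> Illness(x, Aids)
    for name in {t[1] for t in frag if t[0] == "Treatment"}:
        if ("Treatment", name, "MedA") in frag and \
           ("Treatment", name, "MedB") in frag and \
           ("Illness", name, "Aids") not in frag:
            frag.add(("Illness", name, "Aids"))
            changed = True
    return changed

Fs = {("Illness", "Lisa", "Flu"),
      ("Treatment", "Mary", "MedB"), ("Treatment", "Mary", "MedA"),
      ("Treatment", "Pete", "MedC")}
while chase_step(Fs):   # iterate to a fixpoint (terminates: weakly acyclic)
    pass

# Certain answers of the two constraints in the chased fragment:
c1 = any(t[0] == "Illness" and t[2] == "Aids" for t in Fs)  # exists x Illness(x, Aids)
c2 = any(t[0] == "Illness" and t[1] == "Pete" for t in Fs)  # exists y Illness(Pete, y)
print(c1, c2)  # True True -> both constraints are implied
```

Both certain answers come out true, matching the conclusion drawn below that this server fragment must not be outsourced.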
Assume confidentiality constraints stating that it should not be outsourced that there is a patient with aids and that there is a disease from which patient Pete suffers:

C = {∃x Illness(x, Aids), ∃y Illness(Pete, y)}

We see that the certain answers of the two confidentiality constraints in Fchase are both true; hence the server fragment does not comply with the confidentiality requirements and should not be outsourced in this form. In addition to fragmentation correctness, the server fragment should be maximal and the owner fragment minimal in some sense; for example, the storage requirements at the owner site should be minimized. Beyond storage analysis, the metrics in [5] also take query frequencies into account. In the context of database repairs, [10] surveys and analyzes other optimization criteria that can also be adopted for fragmentation approaches.
4.2 Fragmentation Computation
We now propose an algorithm for a set of database dependencies D containing weakly acyclic tgds and egds and a set of closed confidentiality constraints C. The main idea is the following: starting from the original input instance I, we identify tuples that must be moved from I to the owner fragment Fo or to the server fragment Fs by evaluating the confidentiality constraints and database dependencies as queries on I. The algorithm decides for each affected tuple whether it can be moved to the server fragment or not. The remaining tuples (not affected by the constraints and dependencies) can simply be moved to the server fragment. The decision can be guided by several optimization criteria (like the ones mentioned in Section 4.1). In contrast to these, we propose here to minimize the number cells(Fo) of table cells that are moved to the owner fragment. That is, we take into account the size of the tuples, where size is measured as the number of attributes. This indeed has an impact when several relations are contained in the database schema (in contrast to approaches considering only a universal relation).
SEARCH:
– Input: instance I, confidentiality constraints C, dependencies D
– Output: correct horizontal fragmentation F = {Fo, Fs}
1. Inst = ∅
2. for each ∃x φ(x) ∈ C: remove ∃x
3.   Inst = Inst ∪ {φ(x)[a/x] | I ∪ Fs |= φ(x)[a/x]}
4. for each ∀x φ(x) → ∃y ψ(x, y) ∈ D
5.   Inst = Inst ∪ {φ(x)[a/x] | I ∪ Fs |= φ(x)[a/x] AND I ∪ Fs ⊭ ∃y ψ(x, y)[a/x]}
6. if Inst = ∅: Fs = Fs ∪ I; return F = {Fo, Fs}
7. else choose l1 ∧ . . . ∧ lk ∈ Inst
8.   if {l1, . . . , lk} ⊆ Fs: conflict
9.   else choose li ∈ {l1, . . . , lk} such that li ∈ I
10.    Fo = Fo ∪ {li}; I = I \ {li}; SEARCH
11.    Fs = Fs ∪ {li}; I = I \ {li}; SEARCH

Fig. 1. Horizontal fragmentation algorithm
We now describe our algorithm in detail and provide a pseudocode listing in Figure 1. We start with the input instance I and Fo = Fs = ∅. We then take the confidentiality constraints C = {c1, . . . , cm} and execute the following steps.
1. Remove all existential prefixes from constraints ci = ∃x φ(x) such that they become open formulas φ(x) with free variables x.
2. Evaluate the constraints in I ∪ Fs. That is, find those tuples of constants a such that the instantiation φ(x)[a/x] of variables x with constants a holds in the input instance and the server fragment: I ∪ Fs |= φ(x)[a/x].
3. Add each such instantiation to the set of "candidate instantiations" Inst.

Similarly, we treat the database dependencies D = {d1, . . . , dn}, with the difference that we have to find those instantiations for which the body of a dependency is satisfied but its head is not. Note that this only applies to tgds: all egds are satisfied in I, so they can never be violated in Fs, which is a subset of I. Hence let di be a tgd di = ∀x φ(x) → ∃y ψ(x, y), where φ(x) is the body, ∃y ψ(x, y) is the head, and both are conjunctions of atomic formulas.
1. Evaluate each tgd in I ∪ Fs and find those instantiations such that the body is satisfied but the head is not. That is, find those tuples of constants a such that
(a) the instantiation φ(x)[a/x] of the body holds in the input instance and the server fragment: I ∪ Fs |= φ(x)[a/x],
(b) but the instantiated head ∃y ψ(x, y)[a/x] is false in I ∪ Fs; that is, I ∪ Fs ⊭ ∃y ψ(x, y)[a/x]. Note that this is a closed formula.
2. Add the instantiated body φ(x)[a/x] to the set of candidate instantiations Inst.

The candidate set Inst contains only positive conjunctive ground formulas of the form l1 ∧ . . . ∧ lk. In order to achieve consistency of the server fragment Fs with the database dependencies D without violating the confidentiality constraints C,
at least one of the conjuncts li has to be moved to the owner fragment Fo. Hence, if there is a formula in Inst for which all ground atoms l1, . . . , lk are contained in the server fragment, a conflict with the dependencies and constraints has occurred; the search then continues with a distinct subproblem by backtracking. Otherwise, choose one formula from Inst and one ground atom li of that formula that is contained in I (and hence neither contained in Fo nor in Fs). Create two new subproblems: one by moving the ground atom li to Fo and the other by moving li to Fs, and recursively execute the search procedure on them. The candidate set Inst is emptied in every recursion. Repeat these steps until the evaluations of constraints and dependencies do not yield further candidate formulas, that is, until the candidate set Inst remains empty; then move all atoms remaining in I to the server fragment. The search is in fact a depth-first search along a binary tree, as pictured in Figure 2. Two additional operations can speed up the search process significantly: unit propagation and branch-and-bound pruning. Unit propagation means that a candidate formula consisting of a single ground atom can be moved to the owner fragment without trying to move it to the server fragment; moving it to the server fragment would immediately result in a conflict. The same applies to formulas in the candidate set Inst for which exactly one ground atom li is contained in I and all other ground atoms have already been moved to the server fragment. Branch-and-bound pruning is helpful when an optimization requirement has to be fulfilled. In this case, the first solution found is not output immediately; instead, the search continues and tries to find a better one. We propose to count the number cells(Fo) of table cells that are contained in the owner fragment Fo and to minimize this amount.
Whenever a fragmentation solution with a better count has been found, we can stop exploring the current branch of the search tree as soon as the number of cells in the owner fragment exceeds the number of cells of the previously found solution. For the sake of simplicity, we leave the details of these two operations out of the pseudocode listing. Unit propagation is, however, incorporated in Figure 2, and the cell count is annotated in each node of the search tree. Note that the algorithm in Figure 1 would return the first solution with cell count 8, while a branch-and-bound algorithm involving optimization would return the first minimal solution with cell count 6. Figure 2 shows the search tree for the following example.

Example 3. The set of dependencies contains two tgds that link treatments with diseases:

D = {∀x Treatment(x, MedC) → ∃z Illness(x, z),
     ∀x Treatment(x, MedA) ∧ Treatment(x, MedB) → Illness(x, Aids)}

The set of confidentiality constraints states that the disease aids is confidential for any patient and that for patients Pete and Lisa it should not be outsourced that both suffer from the same disease:

C = {∃x Illness(x, Aids), ∃y (Illness(Pete, y) ∧ Illness(Lisa, y))}
Finally the input instance I is as follows:

I:

Illness
Name  Diagnosis
Mary  Aids
Pete  Flu
Lisa  Flu

Treatment
Name  Medicine
Mary  MedA
Mary  MedB
Pete  MedC
The input instance I satisfies all database dependencies. The first candidate set Inst contains the following instantiations of confidentiality constraints:

Inst = {Illness(Mary, Aids), Illness(Pete, Flu) ∧ Illness(Lisa, Flu)}

The unit formula Illness(Mary, Aids) can immediately be moved to the owner fragment. For Illness(Pete, Flu) we can try moving it to both the owner and the server fragment and hence have two branches in the search tree. The first fragmentation found, with cell count 8, is the following:

Fo:

Illness
Name  Diagnosis
Mary  Aids
Pete  Flu

Treatment
Name  Medicine
Mary  MedA
Pete  MedC

Fs:

Illness
Name  Diagnosis
Lisa  Flu

Treatment
Name  Medicine
Mary  MedB
The first optimal fragmentation, with cell count 6, is the following:

Fo:

Illness
Name  Diagnosis
Mary  Aids
Lisa  Flu

Treatment
Name  Medicine
Mary  MedA

Fs:

Illness
Name  Diagnosis
Pete  Flu

Treatment
Name  Medicine
Mary  MedB
Pete  MedC
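A runnable sketch of the SEARCH procedure on Example 3: ground atoms are encoded as (relation, name, value) triples, the constraint and tgd instantiations of Example 3 are hard-coded in `candidates`, and atoms are processed in sorted order so the run is deterministic. Without branch-and-bound, the first solution found has cell count 8, matching Figure 2:

```python
def candidates(atoms):
    """Candidate instantiations (lists of ground atoms) for Example 3:
    satisfied constraint instantiations, and tgd bodies whose head is
    false, evaluated over atoms = I ∪ Fs."""
    atoms = sorted(atoms)                       # deterministic order
    ill = [t for t in atoms if t[0] == "Illness"]
    out = [[t] for t in ill if t[2] == "Aids"]  # ∃x Illness(x, Aids)
    for tp in ill:                              # ∃y Illness(Pete,y) ∧ Illness(Lisa,y)
        for tl in ill:
            if tp[1] == "Pete" and tl[1] == "Lisa" and tp[2] == tl[2]:
                out.append([tp, tl])
    for t in atoms:                             # tgd1 body with head false
        if t[0] == "Treatment" and t[2] == "MedC" and \
           not any(i[1] == t[1] for i in ill):
            out.append([t])
    for t in atoms:                             # tgd2 body with head false
        if t[0] == "Treatment" and t[2] == "MedA" and \
           ("Treatment", t[1], "MedB") in atoms and \
           ("Illness", t[1], "Aids") not in atoms:
            out.append([t, ("Treatment", t[1], "MedB")])
    return out

def cells(frag):
    return sum(len(t) - 1 for t in frag)  # data cells only, not relation names

def search(I, Fo, Fs):
    """Depth-first SEARCH of Figure 1 (no unit propagation or pruning)."""
    inst = candidates(I | Fs)
    if not inst:
        return Fo, Fs | I                 # no violations left: outsource the rest
    formula = inst[0]
    if all(l in Fs for l in formula):
        return None                       # conflict: backtrack
    li = next(l for l in formula if l in I)
    return (search(I - {li}, Fo | {li}, Fs)       # branch: move li to Fo
            or search(I - {li}, Fo, Fs | {li}))   # branch: move li to Fs

I0 = {("Illness", "Mary", "Aids"), ("Illness", "Pete", "Flu"),
      ("Illness", "Lisa", "Flu"),
      ("Treatment", "Mary", "MedA"), ("Treatment", "Mary", "MedB"),
      ("Treatment", "Pete", "MedC")}
Fo, Fs = search(set(I0), set(), set())
print(cells(Fo))  # 8: the first (non-optimal) solution of Figure 2
```

The returned fragmentation also satisfies completeness and non-redundancy, and `candidates(Fs)` is empty, so the chase cannot derive any constraint from the server fragment.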
We now briefly analyze the proposed algorithm in terms of correctness and runtime complexity. First of all, the output fragmentation satisfies Definition 4 of horizontal fragmentation correctness:
– Completeness is ensured because once all tuples that have to be moved to the owner fragment have been identified, the remaining tuples of I are moved to the server fragment.
– Confidentiality is ensured because, on the one hand, all instantiations of confidentiality constraints are handled such that they are not implied by the server fragment. On the other hand, the algorithm proceeds such that the server fragment satisfies all database dependencies, because no body of a tgd can be fully instantiated while the instantiated head does not hold in the server fragment. Hence no deduction of other facts is possible. In terms of fragmentation checking (see Section 4.1), the chase cannot apply any dependencies to Fs.
[Search tree of Figure 2: the root moves the unit formula Illness(Mary, Aids) to Fo (cell count 2); each further node branches on moving one ground atom to Fo or to Fs, with the cell count annotated at each node; the leftmost complete branch yields the first solution with cell count 8 and another branch the optimal solution with cell count 6.]

Fig. 2. Example search tree
– Non-redundancy is ensured because ground atoms are contained in I (and hence neither in Fo nor in Fs) before being moved to one of the fragments.

The runtime complexity depends on the number of tuples in the input instance I as follows. Confidentiality constraints (without their existential prefixes) as well as bodies of tgds are positive conjunctive formulas. Hence their instantiations in I ∪ Fs are always finite in number (even with theoretically infinite domains of attribute values) and must indeed be contained in I ∪ Fs. Consequently, in the worst case every tuple in the input instance I has to be tested for whether it is moved to the owner or the server fragment. Due to this binary choice, the worst-case complexity is exponential in the number of tuples in I. However, the average complexity may be much better when unit propagation and pruning are applied.
5 Vertical Fragmentation Can Be Data-Dependent
We now briefly elaborate how vertical fragmentation can be achieved with formula-based confidentiality constraints. In particular, vertical fragmentation can be made data-dependent in the sense that not whole columns are stored in the owner fragment but only sensitive parts of columns. For example, confidentiality of the constraint Illness(Pete, y) can be achieved by removing only those cells of the Name column for which Name equals Pete. The remainder of the Name column and the Diagnosis column can then still be outsourced to the server fragment. Hence, our cell count metric leads to a better solution whenever only part of a column has to be stored in the owner fragment.
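The cell-level suppression just described can be sketched as follows; the `suppress_cells` helper and the row representation are illustrative, with `None` playing the role of a suppressed cell:

```python
def suppress_cells(rows, predicate, columns):
    """Return a server-side view in which, for every row matching
    `predicate`, only the cells in `columns` are blanked out (None);
    all other cells remain outsourced in the clear."""
    out = []
    for row in rows:
        if predicate(row):
            row = {k: (None if k in columns else v) for k, v in row.items()}
        out.append(row)
    return out

illness = [{"Name": "Pete", "Diagnosis": "Flu"},
           {"Name": "Mary", "Diagnosis": "Cold"}]
# Confidentiality constraint Illness(Pete, y): hide only Pete's name cell.
server_view = suppress_cells(illness, lambda r: r["Name"] == "Pete", {"Name"})
print(server_view)
# [{'Name': None, 'Diagnosis': 'Flu'}, {'Name': 'Mary', 'Diagnosis': 'Cold'}]
```

Only one cell moves to the owner fragment, instead of the whole Name column (vertical fragmentation) or the whole Pete row (horizontal fragmentation).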
Indeed, this approach yields a form of "hybrid fragmentation": a combination of vertical and horizontal fragmentation can maximize the amount of outsourced data better than each of the techniques alone. If only some values in a column (for example, only some entries in the Diagnosis column) must be protected, vertical fragmentation would remove the whole column while horizontal fragmentation only suppresses the rows containing sensitive values. On the other hand, if all values of one column have to be protected (for example, all patient names), vertical fragmentation just removes this column while horizontal fragmentation would have to suppress the whole relation. The notion of fragmentation checking can also be applied to this hybrid approach: we can treat suppressed cells in the server fragment as null values and apply the chase to the server fragment as in Definition 5; the certain answers can also be computed for open confidentiality constraints, and confidentiality is preserved if the answer set is empty. Fragmentation computation has to be modified accordingly such that not the whole row but only some cells of a row are suppressed in the server fragment. Yet there is a problem if we assume a well-informed and suspicious server. For example, if the server knows the definition of the confidentiality constraint Illness(Pete, y), then he could suspect that those tuples in the server fragment for which the name is missing actually belong to the patient Pete. This effect is known as "meta-inference" (see [13]): although the fragmentation satisfies the formal correctness definition, inference of confidential information is still possible on a meta-level. In this case, appropriate countermeasures have to be taken, for example, moving more name entries to the owner fragment than strictly necessary (and informing the server about it), or ensuring that the confidentiality constraints lead to a server fragment that satisfies the properties of k-anonymity (see [14]).
6 Related Work and Conclusion
With the introduction of horizontal fragmentation correctness and formula-based confidentiality constraints, we significantly extended the notion of secure fragmentation for data outsourcing (as analyzed in [5,6] for vertical fragmentation). On the one hand, we showed that horizontal fragmentation gives rise to a new application of the chase for the fragmentation checking problem (as used in [8,10,11,12] for similar purposes). On the other hand, we presented an algorithm that computes a correct horizontal fragmentation and can at the same time be used to optimize the fragmentation with respect to criteria such as our cell count criterion; other criteria can also be used with the algorithm. Open questions remain: other fragments of first-order logic can be studied for database dependencies and confidentiality constraints; further research could investigate the behavior and performance of horizontal fragmentation when the user queries or updates the outsourced data (some query strategies are already analyzed for vertical fragmentation in [3,6]). Moreover, the area of hybrid
fragmentation can be advanced and the problem of meta-inferences can be investigated further. An in-depth analysis of applications of k-anonymity techniques [14] to data outsourcing is also desirable.
References

1. Hacigümüs, H., Iyer, B.R., Li, C., Mehrotra, S.: Executing SQL over encrypted data in the database-service-provider model. In: SIGMOD Conference, pp. 216–227. ACM, New York (2002)
2. Hacigümüs, H., Iyer, B.R., Mehrotra, S.: Query optimization in encrypted database systems. In: Zhou, L.-z., Ooi, B.-C., Meng, X. (eds.) DASFAA 2005. LNCS, vol. 3453, pp. 43–55. Springer, Heidelberg (2005)
3. Aggarwal, G., Bawa, M., Ganesan, P., Garcia-Molina, H., Kenthapadi, K., Motwani, R., Srivastava, U., Thomas, D., Xu, Y.: Two can keep a secret: A distributed architecture for secure database services. In: Second Biennial Conference on Innovative Data Systems Research, CIDR 2005, pp. 186–199 (2005)
4. Ciriani, V., De Capitani di Vimercati, S., Foresti, S., Jajodia, S., Paraboschi, S., Samarati, P.: Fragmentation and encryption to enforce privacy in data storage. In: Biskup, J., López, J. (eds.) ESORICS 2007. LNCS, vol. 4734, pp. 171–186. Springer, Heidelberg (2007)
5. Ciriani, V., De Capitani di Vimercati, S., Foresti, S., Jajodia, S., Paraboschi, S., Samarati, P.: Keep a few: Outsourcing data while maintaining confidentiality. In: Backes, M., Ning, P. (eds.) ESORICS 2009. LNCS, vol. 5789, pp. 440–455. Springer, Heidelberg (2009)
6. Ciriani, V., De Capitani di Vimercati, S., Foresti, S., Jajodia, S., Paraboschi, S., Samarati, P.: Enforcing confidentiality constraints on sensitive databases with lightweight trusted clients. In: Gudes, E., Vaidya, J. (eds.) DBSec 2009. LNCS, vol. 5645, pp. 225–239. Springer, Heidelberg (2009)
7. Biskup, J., Tsatedem, C., Wiese, L.: Secure mediation of join queries by processing ciphertexts. In: ICDE Workshops, pp. 715–724. IEEE Computer Society, Los Alamitos (2007)
8. Fagin, R., Kolaitis, P.G., Miller, R.J., Popa, L.: Data exchange: semantics and query answering. Theoretical Computer Science 336(1), 89–124 (2005)
9. Maier, D., Mendelzon, A.O., Sagiv, Y.: Testing implications of data dependencies. ACM Transactions on Database Systems 4(4), 455–469 (1979)
10. Afrati, F.N., Kolaitis, P.G.: Repair checking in inconsistent databases: algorithms and complexity. In: 12th International Conference on Database Theory, ICDT 2009. ACM International Conference Proceeding Series, vol. 361, pp. 31–41. ACM, New York (2009)
11. Brodsky, A., Farkas, C., Jajodia, S.: Secure databases: Constraints, inference channels, and monitoring disclosures. IEEE Transactions on Knowledge and Data Engineering 12(6), 900–919 (2000)
12. Stouppa, P., Studer, T.: A formal model of data privacy. In: Virbitskaite, I., Voronkov, A. (eds.) PSI 2006. LNCS, vol. 4378, pp. 400–408. Springer, Heidelberg (2007)
13. Biskup, J., Gogolin, C., Seiler, J., Weibert, T.: Requirements and protocols for inference-proof interactions in information systems. In: Backes, M., Ning, P. (eds.) ESORICS 2009. LNCS, vol. 5789, pp. 285–302. Springer, Heidelberg (2009)
14. Ciriani, V., De Capitani di Vimercati, S., Foresti, S., Samarati, P.: k-anonymity. In: Secure Data Management in Decentralized Systems. Advances in Information Security, vol. 33, pp. 323–353. Springer, Heidelberg (2007)
Experimental Assessment of Probabilistic Fingerprinting Codes over AWGN Channel

Minoru Kuribayashi

Graduate School of Engineering
1-1 Rokkodai-cho, Nada-ku, Kobe, Hyogo, 657-8501 Japan
[email protected]

Abstract. The estimation of the false-positive probability has been an important concern for fingerprinting codes, and the formula for this probability has been derived under a restricted assumption and statistical model. In this paper, we first analyze the statistical behavior of the score derived from the correlation between a pirated codeword and the codewords of all users when some bits are flipped. Then, the derivation of the score is adaptively designed for an attack model in which a pirated codeword is distorted by additive white Gaussian noise. The traceability and the false-positive probability are estimated by Monte-Carlo simulation, and the validity of the Gaussian approximation for the distribution of the score is evaluated for probabilistic fingerprinting codes.
1 Introduction
Due to the progress of information technology, digital contents such as music, images, and movies are distributed from providers to multiple users connected by a network. Although this offers convenient means for users to obtain digital content, it also creates the threat of illegal distribution by malicious parties. In order to prevent users from distributing pirated versions of digital content, digital fingerprinting techniques have been studied, including procedures for embedding and detecting fingerprints, secure protocols between buyer and seller, and ways of distributing content and identifying illegal actions. One of the critical threats to a fingerprinting system is collusion of users who purchase the same content. Since their fingerprinted copies differ slightly from each other, a coalition of users can combine their fingerprinted copies of the same content in order to remove or change the original fingerprint. Such an attack is called a collusion attack. An early work on designing collusion-resistant binary fingerprinting codes was presented by Boneh and Shaw [1], based on the principle referred to as the marking assumption. In this setting, a fingerprint is a set of redundant digits which are distributed over some random positions of an original content. When a coalition of users attempts to discover some of the fingerprint positions by comparing their copies for differences, the coalition may modify only those positions where they find a difference in their fingerprinted copies. A c-secure code guarantees tolerance against a collusion attack by c pirates or fewer.

I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 117–132, 2010.
© Springer-Verlag Berlin Heidelberg 2010

Tardos [12] has proposed a probabilistic c-secure code with negligible error probability which has a length of theoretically minimal order with respect to the number of colluders. One interesting report about the characteristics of Tardos's code was presented by Škorić et al. [2] concerning the symmetric version of the tracing algorithm, in which correlation scores are used to detect the pirates. In [3], the score derived from the correlation between a pirated codeword and the codewords of all users is approximated by a Gaussian distribution; based on this, the code length can be further shortened under a given false-positive probability. These results are supported and further analyzed by Furon et al. [4]. Nuida et al. [9] studied the parameters for generating the codewords of Tardos's code, which are expressed by a continuous distribution, and presented a discrete version in an attempt to reduce the code length and the required memory without degrading the traceability. Moreover, they gave a security proof under an assumption weaker than the marking assumption. However, their goal is to reduce the code length under a binary symmetric channel with a certain error rate. In addition, their estimation is based on the assumption that the number of colluders is at most a bound c fixed in advance. In this paper, we study the statistical behavior of the score when some bits of a pirated codeword are flipped, and estimate the attenuation of the average score for Nuida's code. In our attack model, a coalition of users produces a pirated copy under the marking assumption, and then the pirated copy is distorted by attacks intended to remove or modify the watermark signal embedded in the digital content. We assume that the noise injected by these attacks is additive white Gaussian noise (AWGN). So, in our attack model, a pirated codeword produced by a collusion attack is further distorted by transmission over an AWGN channel.
In such a case, the symbols of the received (extracted) codeword are represented by analog values. As in the case of error correcting codes, soft decision detection can resolve more errors than hard decision detection, which rounds the analog values into digital ones. In [6], the traceability and false-positive probability of Tardos's code were analyzed experimentally by introducing the soft and hard decision methods into the tracing algorithm, and it was revealed that the false-positive probability increases with the amount of noise for both methods. In this study, the reason is explained by the statistical behavior of the score, and the analysis is further applied to Nuida's code. Moreover, for Nuida's code, the dependency on the number of colluders and on the type of collusion attack is measured via the behavior of the score. It is remarkable that each symbol of a pirated codeword is rounded into a binary digit, which may be flipped by additive noise, if the hard decision method is used. Thus, the performance of the hard decision method is strongly related to the analysis of the statistical behavior of the score. On the other hand, the soft decision method can utilize the analog signal to detect more colluders. The performance of the hard and soft decision methods is compared by Monte-Carlo simulation, and it is revealed that the soft decision method is suitable when the amount of noise added to a pirated copy is very large. It is noted that the experimental results for Tardos's code in [6] were derived only under a restricted environment in which the SNR is more than 3 [dB]. In this paper, we evaluate the performance of Nuida's code as well as Tardos's code by varying the SNR from −4 to 10 [dB]. We further evaluate the false-positive probabilities for various code lengths, and compare the performance of Tardos's code with that of Nuida's code from the false-positive probability point of view. The experimental results reveal an interesting characteristic: the false-positive probability of Nuida's code is almost independent of the amount of noise, whereas it depends heavily on the noise for Tardos's code.
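To make the hard/soft distinction concrete, here is a hedged sketch (not the paper's exact tracing algorithm): the per-position weights follow the symmetric scoring of [2], the soft variant follows the idea of [6], and a BPSK-style mapping of bit b to 2b − 1 for the transmitted signal is our assumption. The hard decoder thresholds each received value before correlating; the soft decoder correlates with the analog values directly.

```python
import math

def weight(x_bit, p):
    """Symmetric-score weight for a user's code bit given bias p (cf. [2])."""
    return math.sqrt((1 - p) / p) if x_bit == 1 else -math.sqrt(p / (1 - p))

def hard_score(y, x, p):
    """Threshold each received value to ±1, then correlate (hard decision)."""
    return sum((1 if yi > 0 else -1) * weight(xi, pi)
               for yi, xi, pi in zip(y, x, p))

def soft_score(y, x, p):
    """Correlate with the analog received values directly (soft decision)."""
    return sum(yi * weight(xi, pi) for yi, xi, pi in zip(y, x, p))

# Noiseless toy check: the pirated word equals one colluder's word,
# transmitted as BPSK values ±1; both scores then coincide.
p = [0.5, 0.3, 0.7, 0.5]       # per-position biases
x = [1, 0, 1, 1]               # a colluder's codeword
y = [2 * b - 1 for b in x]     # received signal without noise
print(hard_score(y, x, p) == soft_score(y, x, p))  # True
```

With AWGN added to y the two scores diverge: the hard score only changes when noise flips a sign, while the soft score degrades gradually with the noise amplitude.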
2 Fingerprinting Code
In a fingerprinting system, a distributor provides users with digital contents that contain fingerprint information. The number of users is N. If at most c users collude to produce a pirated copy from their fingerprinted copies, a fingerprinting code ensures that at least one of them can be caught from the copy under the well-known marking assumption [1]. In a collusion attack, a set of malicious users called colluders try to find the positions of the embedded codeword from the differences among their copies, and then modify the bits of the codeword at these positions. Suppose that a codeword of a fingerprint code is binary and each bit is embedded into one of the segments of the digital content, without overlapping, using a robust watermarking scheme. The colluders can compare their fingerprinted copies of the content with each other to find the differences. In this situation, the positions at which the bits of their codewords differ are detectable. The marking assumption states that any bit in a detectable position may be selected or even erased, while any bit outside the detectable positions is left unchanged in the pirated codeword. A fingerprint code is called totally c-secure if at least one of the colluders is traceable under the marking assumption, provided that the number of colluders is at most c. Boneh and Shaw, however, proved that when c > 1, a totally c-secure code does not exist even if the marking assumption is satisfied [1]. Under the weaker condition that an innocent user may be accused with a tiny probability ε, a c-secure code with ε-error was constructed.

2.1 Probabilistic Fingerprinting Codes
Tardos [12] proposed a probabilistic c-secure code with error probability ε₁ whose length is of theoretically minimal order with respect to the number of colluders. On the binary digits of the codeword, the frequency of "0" and "1" is governed by a specific probability distribution referred to as the bias distribution. The codewords are arranged as an N × L matrix X, where the j-th row corresponds to the fingerprint given to the j-th user. The generation of the matrix X is composed of two steps.
1. A distributor chooses random variables 0 < p_i < 1 independently for every 1 ≤ i ≤ L, according to a given bias distribution.
Table 1. Examples of the discrete version of the bias distribution

  c      p        q      |   c      p        q
 1,2  0.50000  1.00000   |  7,8  0.06943  0.24833
 3,4  0.21132  0.50000   |       0.33001  0.25167
      0.78868  0.50000   |       0.66999  0.25167
 5,6  0.11270  0.33201   |       0.93057  0.24833
      0.50000  0.33598   |
      0.88730  0.33201   |
2. Each entry X_j,i of the matrix X is selected independently from the binary alphabet {0, 1} with Pr(X_j,i = 1) = p_i and Pr(X_j,i = 0) = 1 − p_i for every 1 ≤ j ≤ N.
In the case of Tardos's code, a certain continuous distribution is used as the bias distribution. The value of p_i is selected from the range [t, 1 − t], where t = 1/(300c); specifically, p_i = sin²(r_i) with r_i picked uniformly at random from [t′, π/2 − t′], where 0 < t′ < π/4 and sin²(t′) = t. Nuida et al. [9] proposed a specific discrete distribution, introduced as a discrete variant [10] of Tardos's code, that can be tuned for a given number c of colluders. This bias distribution is called the "Gauss-Legendre distribution" because of its deep relation to Gauss-Legendre quadrature in numerical approximation theory (see [9] for details). Numerical examples of the discrete distribution are shown in Table 1, where q denotes the emerging probability of p.
Let C be the set of colluders and c̃ be the number of colluders. We denote by X_C the c̃ × L matrix of codewords assigned to the colluders. Depending on the attack strategy ρ, the fingerprint y = (y₁, . . . , y_L), y_i ∈ {0, 1}, contained in a pirated copy is denoted by y = ρ(X_C). For a given pirated codeword y, the tracing algorithm first calculates a score S_i^(j) for the i-th bit X_j,i of the j-th user by a certain real-valued function, and then sums them up as the total score S^(j) = Σ_{i=1}^{L} S_i^(j) of the j-th user. For Tardos's code, if the score S^(j) exceeds a threshold Z, the user is determined to be guilty. The design of appropriate parameters has been studied in [12], [3], [10]. For Nuida's code [9], the tracing algorithm outputs only the one guilty user whose score is maximum. Although no explicit description of the use of a threshold has been presented, a threshold is supposed to be applicable for Nuida's code as well.
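For illustration, the two-step generation can be sketched in Python with NumPy (a minimal sketch; the function names and the fixed seed are ours, and the discrete bias hard-codes the c = 7, 8 column of Table 1):

```python
import numpy as np

rng = np.random.default_rng(0)

def tardos_bias(L, c):
    """Step 1, continuous: p_i = sin^2(r_i), r_i uniform on [t', pi/2 - t']."""
    t = 1.0 / (300 * c)
    t_prime = np.arcsin(np.sqrt(t))            # sin^2(t') = t
    r = rng.uniform(t_prime, np.pi / 2 - t_prime, size=L)
    return np.sin(r) ** 2

def nuida_bias(L):
    """Step 1, discrete: Gauss-Legendre bias for c = 7, 8 (Table 1)."""
    p_values = np.array([0.06943, 0.33001, 0.66999, 0.93057])
    q_values = np.array([0.24833, 0.25167, 0.25167, 0.24833])
    return rng.choice(p_values, size=L, p=q_values)

def gen_codewords(N, p):
    """Step 2: X[j, i] = 1 with probability p_i, independently for each user."""
    return (rng.random((N, len(p))) < p).astype(np.uint8)

p = tardos_bias(L=10000, c=20)
X = gen_codewords(N=100, p=p)
```

Either bias function can feed `gen_codewords`; the matrix X is then distributed row by row to the users.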
In this paper, we calculate the threshold of Nuida's code in the same manner as that of Tardos's code, evaluate the validity of the design of the threshold, and measure the performance.
By introducing the auxiliary function σ(p) = √((1 − p)/p), the scoring function S_i^(j) in [12] is given as follows:

    S_i^(j) = { σ(p_i)         if y_i = 1 and X_j,i = 1
              { −σ(1 − p_i)    if y_i = 1 and X_j,i = 0        (1)
              { 0              if y_i ∈ {0, ?} ,
where "?" stands for an erased element. The above scoring function ignores all positions with y_i ∈ {0, ?}. To exploit such positions, Škorić et al. [2] proposed a symmetric version of the accusation sum, whose scoring function is given as follows:

    S_i^(j) = { σ(p_i)         if y_i = 1 and X_j,i = 1
              { −σ(1 − p_i)    if y_i = 1 and X_j,i = 0
              { σ(1 − p_i)     if y_i = 0 and X_j,i = 0        (2)
              { −σ(p_i)        if y_i = 0 and X_j,i = 1
              { 0              if y_i = ?

Note that an erasure symbol "?" is regarded as y_i = 0 in Nuida's code.
The traceability is usually evaluated in terms of the probability ε₁ of accusing an innocent user and the probability ε₂ of missing all colluders. In order to guarantee that the probability of accusing an innocent user is below ε₁, Tardos's original code has length L = 100c² log(N/ε₁) [12]. In [3], the constant "100" was reduced to 4π² without changing the scheme. For the above symmetric conversion [2], the lower bound of the code length was given by L > π²c² log(N/ε₁). In the same paper, it was shown that the code length can be further reduced by converting the construction of the code from a binary to a q-ary alphabet. For simplicity, we consider only binary fingerprinting codes in this paper.
The number of traceable colluders depends on the design of the threshold Z. There are many statistical analyses of a proper threshold Z for the original and symmetric versions of Tardos's fingerprinting code. By modeling the accusation sums as normally distributed stochastic variables, Škorić et al. presented simple approximate expressions for the false-positive and false-negative rates [3]. Moreover, due to the Central Limit Theorem, it is reported that the accusation sum approximately follows a Gaussian distribution. Under the assumption that the score S^(j) follows a Gaussian distribution, the threshold Z is expressed via the complementary error function erfc(·) for a given ε₁ [7]:

    Z = √(2L) · erfc⁻¹(2ε₁/N) .        (3)

Furon et al. studied the statistics of the score S^(j) in [4].
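A minimal sketch of the symmetric scoring function of Eq. (2) and the threshold of Eq. (3), assuming NumPy and SciPy; encoding the erasure symbol "?" as −1 is our convention, not the papers':

```python
import numpy as np
from scipy.special import erfcinv

def sigma(p):
    # auxiliary function sigma(p) = sqrt((1 - p) / p)
    return np.sqrt((1.0 - p) / p)

def symmetric_score(y, Xj, p):
    """Accusation sum S^(j) of Eq. (2); y in {0, 1}, erasure "?" encoded as -1."""
    s = np.where(Xj == 1, sigma(p), -sigma(1.0 - p))  # case y_i = 1
    s = np.where(y == 1, s, -s)                       # sign flips when y_i = 0
    return float(np.sum(np.where(y == -1, 0.0, s)))   # erasures contribute 0

def threshold(L, N, eps1):
    """Threshold Z of Eq. (3) for a target false-positive probability eps1."""
    return np.sqrt(2.0 * L) * erfcinv(2.0 * eps1 / N)
```

A user j is accused when `symmetric_score(y, X[j], p)` exceeds `threshold(L, N, eps1)`.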
Without loss of generality, the probability density function (PDF) of S^(j) is approximated by the normal distribution N(0, L) when the j-th user is innocent, and by N(2L/(c̃π), L(1 − 4/(c̃²π²))) when he is involved in C. In that study, they insisted that the use of the Central Limit Theorem is not recommended when estimating the code length, because it amounts to integrating the distribution function on its tail, where the Gaussianity assumption does not hold. The Berry-Esseen bound shows that the gap between the Gaussian law and the real distribution of the scores depends on their third moment. On the other hand, based on the above distributions of S^(j), the probability of true-positive per colluder and the expected number of detectable colluders are theoretically estimated in [7] when the threshold Z is calculated by Eq. (3) for a given false-positive probability ε₁, and the validity is evaluated through computer simulation. The simulation results also show that the probability of false positive is slightly less than the given ε₁.
Although the above threshold given by Eq. (3) is specified for the symmetric version of the tracing algorithm of Tardos's code, it could be applicable to Nuida's code. Since a theoretical analysis of the validity of such a threshold is difficult because of its complexity, an experimental assessment is performed in this paper.

2.2 Relaxation of Marking Assumption
Although the marking assumption is reasonable for evaluating the performance of fingerprint codes, there is a big gap from practical cases. Even if a watermarking scheme offers a considerable level of robustness, it is still possible to erase/modify the embedded bits with non-negligible probability due to the addition of noise to a pirated copy. Because of the noise, the signal extracted from such a pirated copy must be distorted from the original signal y_i ∈ {0, 1}. Therefore, even bits outside the detectable positions may be erased/modified by attacks on the watermarked signal. In our assumption, the effects caused by attacks are modeled by additive white Gaussian noise, and the noise is added after the collusion attack. The degraded codeword is represented by

    y′ = y + e ,        (4)

where e is the additive white Gaussian noise.
In order to cover more practical cases, various relaxations of the marking assumption have been introduced, and several c-secure codes under those assumptions, called robust c-secure codes, have been proposed in [9], [5], [11], [8]. Among those assumptions, there are two common conditions: at least one of the colluders is traceable, and the number of colluders is at most c. Their goal is mainly to estimate a proper code length L such that the probability of accusing an innocent user is below ε₁, which depends on the number of flipped/erased bits at the undetectable positions. Suppose that a fingerprint code is deployed in a fingerprinting system. Then the code length must be determined under considerations of system policy and attack strategies, such as the number of colluders and the amount of noise. Here, our interest is in how to design a good tracing algorithm that detects more colluders and fewer innocent users, no matter how many colluders are involved in generating a pirated copy and no matter how much noise is added to the copy.
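The degradation of Eq. (4) can be sketched as follows; how the embedded signal power is defined when converting an SNR in dB into a noise variance is an assumption made here for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def awgn(y, snr_db):
    """Return y' = y + e (Eq. (4)) for white Gaussian noise e at a given SNR.

    The signal power is measured empirically from the codeword itself; how
    the power of the actual watermarked signal is defined is an assumption.
    """
    sig_pow = np.mean(np.asarray(y, dtype=float) ** 2)
    noise_pow = sig_pow / 10.0 ** (snr_db / 10.0)
    e = rng.normal(0.0, np.sqrt(noise_pow), size=np.shape(y))
    return y + e
```

Lower SNR values yield larger noise variance, so more symbols of y′ fall on the wrong side of any rounding threshold.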
In this regard, it is meaningful to design a proper threshold Z for a given false-positive probability ε₁ and a fixed code length. The threshold Z given by Eq. (3) could adjust well to the relaxed version of the marking assumption. In [6], the number of detectable colluders and the false-positive probability for Tardos's code were presented under the relaxed version of the marking assumption. However, that work merely showed the results obtained by experiments. The contribution of this paper is to present the effect of bit flips caused by the additive noise from the viewpoint of the correlation score. Moreover, the performance of Tardos's code and that of Nuida's code are compared with each other. In the following sections, we drop the limitation of c-secure codes that the number of colluders is at most c. The performance of the conventional tracing algorithm based on a threshold Z and its variant is evaluated for an arbitrary number of colluders c̃.
3 Distribution of Accusation Sum

3.1 Effect of Bit Flip
In this section, we consider the changes of the accusation sum S^(j) when arbitrary x bits of the pirated codeword are flipped by an attack, under the assumption that each element of the pirated codeword is rounded into a bit, namely y_i ∈ {0, 1}.
Remember that the PDF of S^(j) is approximated by N(2L/(c̃π), L(1 − 4/(c̃²π²))) when the j-th user is involved in C, and that the elements S_i^(j) are independent of each other. Since the length of the codeword is L, the PDF of S_i^(j) is given by N(2/(c̃π), 1 − 4/(c̃²π²)). Suppose that the i-th bit y_i of the pirated codeword is flipped. Then the corresponding score S_i^(j) is changed to −S_i^(j) by Eq. (2). It means that the variance of the accusation sum S^(j) is unchanged by the bit flip, while the average is changed from 2/(c̃π) to −2/(c̃π). When arbitrary x bits of the pirated codeword are flipped, the sum of the x corresponding elements S_i^(j) is expected to be −2x/(c̃π), and that of the other (L − x) unflipped elements to be 2(L − x)/(c̃π). Therefore, without loss of generality, when x bits of the pirated codeword are flipped, the PDF of S^(j) is expected to be N(2(L − 2x)/(c̃π), L(1 − 4/(c̃²π²))).
On the other hand, the PDF of S^(j) is approximated by N(0, L) when the j-th user is innocent. It is then expected that this PDF is unchanged even if any number of bits of the pirated codeword are flipped.
Due to the complexity of the parameters introduced in the discrete version of the bias distribution in Nuida's code, we skip the theoretical analysis of the distribution of the accusation sum under the Gaussian assumption in this paper. Instead, we derive a conjecture on the distribution of the accusation sum from experimental results.

3.2 Numerical Evaluation
The above analysis is evaluated by implementing Tardos's code with the following parameters. The number of users is N = 10⁴ and the code length is L = 10000. The range of the bias distribution p_i is fixed by setting t = 0.000167 (c = 20). For a constant number of colluders c̃ = 10, the PDF of the accusation sum S^(j) is calculated using Monte-Carlo simulation with 10⁶ trials. Table 2 shows the mean and variance of the accusation sum when x symbols of the pirated codeword are flipped, where the values in parentheses are the theoretical ones. In this experiment, the performed collusion attack is the "majority attack": if the sum of the i-th bits exceeds c̃/2, then y_i = 1; otherwise, y_i = 0. The PDF of the distribution is also depicted in Fig. 1, where solid and dashed lines are the experimental and theoretical values, respectively. These results confirm that the PDF of S^(j) actually follows N(2(L − 2x)/(c̃π), L(1 − 4/(c̃²π²))) in this experiment.
The mean and variance of the accusation sum for Nuida's code are calculated using the following parameters. The discrete version of the bias distribution is selected for the case c = 7, 8 in Table 1. The number of colluders is c̃ = 10, the code length is L = 10⁴, and the number of trials for the Monte-Carlo simulation is 10⁶, the same parameters as for Tardos's code. Table 3 shows the mean and variance when x symbols of the pirated codeword are flipped. From this table, we make a
Table 2. The mean and variance of the accusation sum S^(j) of Tardos's code when c̃ = 10, where the values in parentheses are the theoretical ones

                 innocent                          colluders
   x        mean         variance            mean             variance
   0      −2.6 (0.0)   10499.5 (10000)    644.9 (636.6)    9955.6 (9959.5)
 1000     −0.8 (0.0)   10253.5 (10000)    511.6 (509.3)   10042.0 (9959.5)
 2000     −8.9 (0.0)   10501.0 (10000)    382.0 (382.0)   10318.3 (9959.5)
conjecture on the distribution of the accusation sum. At first sight it seems difficult to extract useful information from the variances, because they are almost equal to L and are very similar to those of Tardos's code, whose colluder variance is expected to be L(1 − 4/(c̃²π²)) from the theoretical analysis. We therefore focus on the mean values of the colluders' S^(j). Referring to the mean value 2(L − 2x)/(c̃π) of Tardos's code, that of Nuida's code can be experimentally estimated as 2(L − 2x)/(2.826c̃) from the three mean values in Table 3. In other words, the parameter "π" in the mean value of Tardos's code is replaced by "A = 2.826" in that of Nuida's code under the above condition. We therefore make the following conjecture for the distribution of the accusation sum of Nuida's code: N(2(L − 2x)/(Ac̃), L(1 − 1/(2c̃²))), where A = 2.826 under the "majority attack" and L = 10000. In order to confirm the validity of the conjecture, the PDFs of S^(j) are depicted in Fig. 2, where solid and dashed lines are the experimental and conjectured values, respectively. From the figure, we can see that the conjectured values are very close to the experimental values.
These results are derived by using the discrete version of Nuida's bias distribution for c = 7, 8 in Table 1. However, the number of colluders is fixed at c̃ = 10 in the experiment, and only the "majority attack" is tested. Considering the design of the bias distribution, the parameter A may be sensitive to changes of c̃. Moreover, the value of A should be measured for different types of collusion attack. The changes of the value of A are depicted in Fig. 3 for varying numbers of colluders c̃ under 5 types of collusion attack: "majority", "minority", "random", "all-0", and "all-1". Under the marking assumption, if the i-th bits of the c colluders' codewords differ, the i-th bit y_i of the pirated codeword is selected in the following manner.
– majority: If the sum of the i-th bits exceeds c/2, y_i = 1; otherwise, y_i = 0.
– minority: If the sum of the i-th bits exceeds c/2, y_i = 0; otherwise, y_i = 1.

Table 3. The mean and variance of the accusation sum S^(j) of Nuida's code when c̃ = 10

                 innocent               colluders
   x        mean      variance      mean      variance
   0       −9.5      10456.7       708.3      10316
 1000      −6.9      10833.8       562.9      10039
 2000       0.1      10119.4       421.5      10332
[Fig. 1. The PDF of the accusation sum S^(j) of Tardos's code when c̃ = 10 (experimental vs. theoretical; x = 0, 1000, 2000)]

[Fig. 2. The PDF of the accusation sum S^(j) of Nuida's code when c̃ = 10 (experimental vs. conjecture; x = 0, 1000, 2000)]

[Fig. 3. The value of the parameter A for 5 types of collusion attack when L = 10000: (a) c = 5, 6; (b) c = 7, 8]
– random: y_i ∈_R {0, 1}.
– all-0: y_i = 0.
– all-1: y_i = 1.
The results indicate that the value of A is almost constant when the number c̃ of colluders is below c, and that the value of A varies widely with the type of collusion attack once c̃ exceeds c. Interestingly, we can see from Fig. 3 that the behavior of the values for c̃ > c is completely different depending on the selection of the discrete version of the bias distribution in Table 1. The reason presumably lies in the generation of the bias distribution. A detailed analysis is left for future work.
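The five attack strategies above can be sketched as follows (a NumPy sketch under the marking assumption: the chosen rule is applied only at detectable positions, and the common bit is copied elsewhere):

```python
import numpy as np

rng = np.random.default_rng(2)

def collusion_attack(XC, strategy):
    """Forge a pirated codeword y from the colluders' codewords XC (c x L).

    The rule is applied only at detectable positions (where the colluders'
    bits differ); elsewhere the marking assumption forces the common bit,
    taken here from the first colluder's row.
    """
    c, L = XC.shape
    s = XC.sum(axis=0)                       # number of 1s in each position
    detectable = (s > 0) & (s < c)
    if strategy == "majority":
        forged = (s > c / 2).astype(np.uint8)
    elif strategy == "minority":
        forged = (s <= c / 2).astype(np.uint8)
    elif strategy == "random":
        forged = rng.integers(0, 2, size=L, dtype=np.uint8)
    elif strategy == "all-0":
        forged = np.zeros(L, dtype=np.uint8)
    elif strategy == "all-1":
        forged = np.ones(L, dtype=np.uint8)
    else:
        raise ValueError(strategy)
    y = XC[0].copy()                         # common bit where undetectable
    y[detectable] = forged[detectable]
    return y
```

For example, with three colluders holding rows [1,1,0,0], [1,0,1,0], [1,0,0,0], positions 1 and 2 are detectable while positions 0 and 3 are forced to 1 and 0, respectively.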
3.3 Estimation of True-Positive and False-Positive
Based on the statistical behavior of the colluders' accusation sum derived in the above experiments, the number of detectable colluders from a pirated copy can be estimated by referring to the analysis in [7]. For Tardos's code, the probability Pr[TP] of true-positive per colluder is given by

    Pr[TP] = (1/2) · erfc( (Z − 2L/(c̃π)) / √(2σ²) ) ,        (5)

where

    σ² = L(1 − 4/(c̃²π²)) .        (6)

Using the probability Pr[TP], the expected number of detectable colluders is given by

    N_TP = c̃ · Pr[TP] = (c̃/2) · erfc( (Z − 2L/(c̃π)) / √(2σ²) ) .        (7)

These analyses are based on the Gaussianity assumption for the distribution of the accusation sum. The numerical results on the distribution confirm the validity of the assumption for both Tardos's code and Nuida's code. Therefore, it is expected for Nuida's code that Pr[TP] and N_TP can be represented by Eq. (5) and Eq. (7) with the parameter "π" replaced by "A". On the other hand, even if the accusation sum of innocent users can be approximated by the Gaussian distribution N(0, L) from the experimental results, the probability of false-positive cannot be simply expressed by the Gauss error function, as reported in [4]. Thus, an experimental evaluation is required for the probability of false-positive, which is discussed in Sect. 5.
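Equations (5)-(7) translate directly into code (a sketch assuming SciPy; the usage example plugs in the threshold of Eq. (3)):

```python
import numpy as np
from scipy.special import erfc, erfcinv

def expected_true_positives(Z, L, c_tilde):
    """Pr[TP] per colluder (Eq. (5)) and N_TP (Eq. (7)) for Tardos's code.

    For Nuida's code the text suggests replacing pi in the mean 2L/(c~ pi)
    by the measured attenuation parameter A (e.g. 2.826 under the majority
    attack); that substitution is not implemented here.
    """
    var = L * (1.0 - 4.0 / (c_tilde * np.pi) ** 2)               # Eq. (6)
    p_tp = 0.5 * erfc((Z - 2.0 * L / (c_tilde * np.pi)) / np.sqrt(2.0 * var))
    return p_tp, c_tilde * p_tp                                  # Eq. (5), (7)

# usage: threshold from Eq. (3) with eps1 = 1e-4, N = 1e4, L = 10000, c~ = 10
Z = np.sqrt(2.0 * 10000) * erfcinv(2.0 * 1e-4 / 1e4)
p_tp, n_tp = expected_true_positives(Z, L=10000, c_tilde=10)
```

With these parameters the estimate lands in the same range as the c̃ = 10 entries of Table 4(d) at high SNR.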
4 Soft Decision Method
The signal extracted from a pirated copy is represented by the analog value y′ because of the addition of noise e in our assumption. Considering the scoring function given by Eq. (2), each symbol of the pirated codeword y must be rounded into a bit {0, 1} or the erasure symbol "?". Hence, an extracted signal from a pirated copy is first rounded into a digital value, and then the tracing algorithm is performed to identify the colluders. This method is analogous to the hard decision (HD) method in error correcting codes. Here, there is an interesting question: is a soft decision (SD) method applicable to the tracing algorithm by adaptively designing a proper threshold? In general, the performance of the SD method is much better than that of the HD method in error correcting codes.
Suppose that in the HD method each symbol of the pirated codeword y′ is rounded into a bit, denoted by y_i ∈ {0, 1} for 1 ≤ i ≤ L. If an erasure error occurs, such a symbol is regarded as y_i = 0, similar to the tracing algorithm of Nuida's code. Based on Eq. (2), a score Ŝ_i^(j) for the i-th bit X_j,i of the j-th user is represented by

    Ŝ_i^(j) = { (2y_i − 1) σ(p_i)         if X_j,i = 1
              { −(2y_i − 1) σ(1 − p_i)    if X_j,i = 0 .        (8)

The design of the threshold in Eq. (3) is based on the Gaussian approximation of the score Ŝ_i^(j). From the discussion in Sect. 3.1, the PDF of Ŝ^(j) = Σ_{i=1}^{L} Ŝ_i^(j) is N(0, L) when the j-th user is innocent, even if any symbols in y′ are flipped from
those in y; hence, the proper threshold Z_HD is calculated by Eq. (3). In the SD method, y_i in Eq. (8) is replaced by y′_i to calculate the score directly from the extracted analog signal y′. Since y′ is distorted by the AWGN channel, the effect on the score is also approximated to follow a Gaussian distribution. Hence, if the variance σ²_SD of the accusation sum is obtained, the proper threshold Z_SD can be designed using the same equation as in the HD method:
    Z_SD = √(2σ²_SD) · erfc⁻¹(2ε₁/N) .        (9)

Because of the randomness in the generation of codewords, the variance σ²_SD can be calculated as follows.
1. Generate Ñ fingerprint codewords X_j̃ for j̃ ∈ {1, . . . , Ñ}.
2. Calculate the correlation scores Ŝ^(j̃).
3. Compute the variance of the Ŝ^(j̃), and output it as σ²_SD.
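The three-step estimation of σ²_SD and the threshold Z_SD of Eq. (9) can be sketched as follows (our naming; the per-position score follows Eq. (8) with y_i replaced by the analog y′_i):

```python
import numpy as np
from scipy.special import erfcinv

rng = np.random.default_rng(3)

def sd_threshold(y_analog, p, n_users, eps1, n_dummy=1000):
    """Estimate sigma^2_SD from dummy codewords and return Z_SD (Eq. (9))."""
    sig1 = np.sqrt((1.0 - p) / p)        # sigma(p_i)
    sig0 = np.sqrt(p / (1.0 - p))        # sigma(1 - p_i)
    scores = np.empty(n_dummy)
    for k in range(n_dummy):
        # step 1: generate a dummy codeword from the same bias distribution
        Xd = rng.random(p.size) < p
        # step 2: its SD correlation score, Eq. (8) with the analog y'_i
        s = np.where(Xd, sig1, -sig0)
        scores[k] = np.sum((2.0 * y_analog - 1.0) * s)
    # step 3: the variance of the dummy scores estimates sigma^2_SD
    var_sd = scores.var()
    return np.sqrt(2.0 * var_sd) * erfcinv(2.0 * eps1 / n_users)
```

Since the dummy codewords are freshly drawn, their scores against the fixed pirated signal mimic the scores of innocent users.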
The Ñ generated codewords X_j̃ are statistically uncorrelated with the pirated codeword. If Ñ is sufficiently large, a proper variance σ²_SD can be obtained by the above procedure, and finally a proper threshold Z_SD is derived.
Note that the noisy channel is regarded as a binary symmetric channel (BSC) when the HD method is used, which is the same model as in the report [4]. Since an erasure symbol "?" is regarded as "0" in [8], the erasure channel assumed in that analysis is also equal to a BSC. Even though an AWGN channel is assumed in our paper, the HD method reduces the channel to a BSC. On the other hand, the introduction of the SD method enables us to utilize the characteristics of the AWGN channel. In the next section, we experimentally evaluate the performance of these methods.
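The HD and SD scores then differ only in whether y′_i is rounded before Eq. (8) is applied; rounding to the nearer of {0, 1} (threshold 0.5) is our reading of the HD rule:

```python
import numpy as np

def score_eq8(y, Xj, p):
    """Eq. (8): (2y_i - 1) sigma(p_i) if X_j,i = 1, else -(2y_i - 1) sigma(1 - p_i)."""
    s = np.where(Xj == 1, np.sqrt((1.0 - p) / p), -np.sqrt(p / (1.0 - p)))
    return float(np.sum((2.0 * y - 1.0) * s))

def hd_score(y_analog, Xj, p):
    # hard decision: round each analog symbol to the nearer of {0, 1} first
    return score_eq8((y_analog > 0.5).astype(float), Xj, p)

def sd_score(y_analog, Xj, p):
    # soft decision: feed the analog value y'_i directly into Eq. (8)
    return score_eq8(y_analog, Xj, p)
```

The HD score saturates each position at ±σ, while the SD score keeps the reliability information carried by the analog amplitude.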
5 Experimental Results
The HD and SD methods are applicable to both Tardos's code and Nuida's code when a pirated codeword is distorted by an AWGN channel. The performance of these methods is evaluated by experiments under the following conditions. The number of users is N = 10⁴ and the number of trials for the Monte-Carlo simulation is 10⁵. The range of the bias distribution p_i for Tardos's code is fixed by setting t = 0.000167 (c = 20), and the discrete version of the bias distribution of Nuida's code is selected for the case c = 7, 8 shown in Table 1. In the SD method, the number of codewords used to calculate σ²_SD is Ñ = 1000. The designed false-positive probability is ε₁ = 10⁻⁴. It is reported in [9] that the performance of Nuida's code is better than that of Tardos's code. We therefore mainly compare the HD and SD methods from the viewpoint of traceability, and assess the validity of the Gaussian assumption on the accusation sum of innocent users.

[Fig. 4. The number of detected colluders when c̃ = 10 and L = 10000]

[Fig. 5. The false-positive probability when c̃ = 10 and L = 10000]

[Fig. 6. The number of detected colluders for various numbers of colluders]

[Fig. 7. The false-positive probability for various numbers of colluders]

Table 4. The comparison of the number of detected colluders

(a) L = 1000, c̃ = 3
 SNR [dB]   HD: Tardos   Nuida   SD: Tardos   Nuida
     1         0.76      1.55       0.85      1.65
     2         1.16      2.03       1.13      1.97
     5         2.20      2.79       1.85      2.58
    10         2.61      2.94       2.40      2.87

(b) L = 2000, c̃ = 5
 SNR [dB]   HD: Tardos   Nuida   SD: Tardos   Nuida
     1         0.37      1.02       0.42      1.11
     2         0.65      1.59       0.62      1.49
     5         1.84      3.28       1.33      2.60
    10         2.70      4.05       2.20      3.59

(c) L = 5000, c̃ = 8
 SNR [dB]   HD: Tardos   Nuida   SD: Tardos   Nuida
     1         0.56      1.53       0.64      1.71
     2         0.98      2.40       0.94      2.31
     5         2.76      5.03       2.00      4.05
    10         4.09      6.31       3.36      5.66

(d) L = 10000, c̃ = 10
 SNR [dB]   HD: Tardos   Nuida   SD: Tardos   Nuida
     1         1.80      3.70       2.00      3.94
     2         2.86      5.22       2.72      4.97
     5         6.13      8.42       4.84      7.33
    10         7.75      9.36       6.83      8.84

As shown in Fig. 3, the attenuation of the accusation sum for Nuida's code, measured by the parameter A, becomes maximum when the majority attack is performed by the colluders for the case that the discrete version of the bias distribution
Table 5. The comparison of the probability of false-positive [×10⁻⁴]

(a) L = 1000
 SNR [dB]    c̃    HD: Tardos   Nuida   SD: Tardos   Nuida
     1        3      1108.5      0.1      167.3       0.0
             20      1105.7      0.3      169.2       0.0
            100      1089.3      0.4      153.4       0.0
     2        3       871.5      0.1       94.2       0.0
             20       881.4      0.4       93.2       0.0
            100       858.6      0.3       85.8       0.0
     5        3       300.1      0.1        9.2       0.2
             20       313.3      0.3       10.8       0.2
            100       297.4      0.1        5.6       0.0
    10        3         5.5      0.1        0.1       0.2
             20         6.8      0.1        0.0       0.1
            100         4.3      0.1        0.1       0.1

(b) L = 2000
 SNR [dB]    c̃    HD: Tardos   Nuida   SD: Tardos   Nuida
     1        5       851.3      0.1      133.7       0.5
             20       843.1      0.1      128.8       0.1
            100       817.1      0.3      123.6       0.0
     2        5       673.6      0.2       76.9       0.5
             20       674.0      0.7       73.6       0.2
            100       655.1      0.1       70.1       0.0
     5        5       242.2      0.6        8.7       0.6
             20       246.9      0.1        7.7       0.2
            100       232.8      0.1        7.0       0.1
    10        5         5.1      0.1        0.0       0.2
             20         5.8      0.3        0.0       0.2
            100         5.5      0.0        0.0       0.1

(c) L = 5000
 SNR [dB]    c̃    HD: Tardos   Nuida   SD: Tardos   Nuida
     1        8       523.1      0.6       82.6       0.3
             20       528.1      0.5       78.5       0.6
            100       532.6      0.4       76.2       0.5
     2        8       417.6      0.4       46.6       0.6
             20       424.9      0.4       45.8       0.7
            100       419.0      0.8       45.9       0.3
     5        8       151.4      0.6        7.1       0.5
             20       146.8      0.6        4.5       0.8
            100       151.3      0.3        4.9       0.5
    10        8         4.0      0.5        0.1       0.6
             20         2.4      0.5        0.3       0.8
            100         2.6      0.6        0.4       0.7

(d) L = 10000
 SNR [dB]    c̃    HD: Tardos   Nuida   SD: Tardos   Nuida
     1       10       383.8      0.7       62.7       0.5
             20       380.1      0.5       51.1       0.8
            100       365.1      0.6       54.7       0.5
     2       10       307.4      0.4       34.3       0.2
             20       222.0      0.3       22.2       0.7
            100       287.9      0.5       28.7       0.6
     5       10       111.6      0.8        4.6       0.4
             20       102.8      0.5        3.3       1.0
            100        98.3      0.6        3.6       0.7
    10       10         2.4      0.3        0.1       0.6
             20         2.5      0.6        0.3       0.6
            100         2.1      0.4        0.3       0.6
is the case of c = 7, 8. Therefore, a pirated copy is produced by the majority attack, and it is distorted by transmission through the AWGN channel. Fixing the number of colluders at c̃ = 10 and the code length at L = 10000, the number of detected colluders and the false-positive probability of the HD and SD methods are measured; the results are plotted in Fig. 4 and Fig. 5, respectively. For both codes, the HD method detects more colluders than the SD method when the SNR is more than 2 [dB], and the SD method is preferable only when the SNR is less than 2 [dB]. On the other hand, the characteristics of the two codes clearly appear in the false-positive probability. For Tardos's code, the probability of the HD method is higher than that of the SD method, and both probabilities increase drastically with the amount of additive noise. Meanwhile, for Nuida's code, the probability is almost constant and below ε₁. These results mean that the Gaussian assumption
Table 6. The number of detected colluders for various kinds of collusion attack when c̃ = 10 and L = 10000

 SNR [dB]   code     majority     minority      random       all-0        all-1
                    HD    SD     HD    SD     HD    SD     HD    SD     HD    SD
     1     Tardos  1.80  2.00   1.79  1.96   1.78  1.92   1.78  1.92   1.79  1.94
           Nuida   3.70  3.94   4.24  4.51   3.97  4.16   3.96  4.14   3.97  4.16
     2     Tardos  2.86  2.72   2.81  2.65   2.82  2.61   2.82  2.61   2.83  2.63
           Nuida   5.22  4.97   5.81  5.58   5.53  5.20   5.52  5.19   5.53  5.20
     5     Tardos  6.13  4.84   6.04  4.72   6.10  4.70   6.10  4.69   6.09  4.72
           Nuida   8.42  7.33   8.82  7.87   8.66  7.55   8.65  7.54   8.64  7.54
    10     Tardos  7.75  6.83   7.71  6.74   7.75  6.72   7.75  6.72   7.75  6.74
           Nuida   9.36  8.84   9.59  9.19   9.50  9.00   9.50  9.00   9.50  8.99
Table 7. The probability of false-positive [×10⁻⁴] for various kinds of collusion attack when c̃ = 10 and L = 10000

 SNR [dB]   code      majority       minority       random         all-0         all-1
                      HD    SD       HD    SD      HD    SD       HD    SD      HD    SD
     1     Tardos   383.8  62.7    377.6  68.8   391.6  58.8    384.8  58.0   381.2  65.2
           Nuida      0.7   0.5      1.1   1.4     0.9   1.0      0.6   0.7     0.7   0.3
     2     Tardos   307.4  34.3    315.7  33.8   297.7  29.4    310.4  34.4   302.8  32.4
           Nuida      0.4   0.2      1.8   1.6     1.0   1.0      0.8   0.7     0.6   0.3
     5     Tardos   111.6   4.6    111.9   3.9   105.2   2.3    113.3   4.2   110.6   5.5
           Nuida      0.8   0.4      1.1   1.4     0.3   0.9      0.9   1.2     0.2   0.4
    10     Tardos     2.4   0.1      3.1   0.5     1.5   0.0      2.8   0.4     2.8   0.3
           Nuida      0.3   0.6      1.0   1.2     0.7   0.7      0.6   0.7     0.4   0.6
of the distribution of the accusation sum is invalid for Tardos's code, while it is valid for Nuida's code under the above conditions.
By varying the number c̃, the number of detected colluders and the false-positive probability are measured for the two cases SNR = 1 [dB] and SNR = 2 [dB]; the results are shown in Fig. 6 and Fig. 7. Figure 6 indicates that the traceability of the HD method is better than that of the SD method when the SNR is 2 [dB], while the ranking of the two methods is reversed when the SNR is 1 [dB]. It is remarkable that the false-positive probability is almost constant even if c̃ is changed from 2 to 20. Hence, we can say that the probability is independent of the number c̃ of colluders.
A comparison of the number of detected colluders for various code lengths is shown in Table 4. The table confirms that the HD method is better than the SD method at detecting as many colluders as possible if the SNR is more than 2 [dB], and vice versa. The probabilities of false-positive are also evaluated by changing the parameters c̃ and L; the results are shown in Table 5. The probabilities for Tardos's code are much higher than the given ε₁ = 10⁻⁴, though the values decrease with the code length L. Such characteristics also appear when the number c̃ of colluders is much higher than c. On the
other hand, the probabilities for Nuida's code are almost constant and slightly less than ε₁, no matter how many users collude to produce a pirated copy and no matter how much noise is added to the codeword.
The traceability and the probability of false-positive are further measured for some typical collusion attacks when c̃ = 10 and L = 10000. The results are shown in Table 6 and Table 7. As shown in Fig. 3, the attenuation of the accusation sum for colluders varies over the five types of collusion attack, and the number of detected colluders varies in a similar fashion. Moreover, the HD method is better than the SD method when the SNR is more than 2 [dB] for every type of collusion attack. For Nuida's code, there is a remarkable tendency in the probability of false-positive with respect to the type of collusion attack: the smaller the attenuation of the accusation sum, the higher the probability of false-positive in this experiment. For example, the parameter "A" of the minority attack in Fig. 3 is the minimum among the five types of collusion attack, and the probability of false-positive shown in Table 7 accordingly becomes the maximum in most cases. A detailed theoretical analysis of this characteristic is left for future work.
6 Conclusion
In this paper, we statistically estimated the distribution of the accusation sum under a relaxed marking assumption and experimentally evaluated the validity of the estimation. In the attack model, a pirated codeword is distorted by additive white Gaussian noise after a collusion attack is performed. The experimental results confirm that the estimation of the distribution of the colluders' accusation sums is valid for Tardos's code when some bits are flipped. Assuming that each symbol of the pirated codeword is extracted from a pirated copy as an analog value, hard and soft decision methods for calculating the accusation sum were proposed. The experimental results indicate that the hard decision method is better than the soft one if the SNR is more than 2 [dB], and vice versa. It is also revealed that the false-positive probability is almost constant for Nuida's code, while it increases drastically for Tardos's code in proportion to the amount of noise.
Acknowledgement This research was partially supported by the Ministry of Education, Culture, Sports, Science and Technology, Grant-in-Aid for Young Scientists (B) (21760291), 2010.
References 1. Boneh, D., Shaw, J.: Collusion-secure fingerprinting for digital data. IEEE Trans. Inform. Theory 44(5), 1897–1905 (1998) 2. Škorić, B., Katzenbeisser, S., Celik, M.: Symmetric Tardos fingerprinting codes for arbitrary alphabet sizes. Designs, Codes and Cryptography 46(2), 137–166 (2008)
M. Kuribayashi
3. Škorić, B., Vladimirova, T.U., Celik, M., Talstra, J.C.: Tardos fingerprinting is better than we thought. IEEE Trans. Inform. Theory 54(8), 3663–3676 (2008) 4. Furon, T., Guyader, A., Cérou, F.: On the design and optimization of Tardos probabilistic fingerprinting codes. In: Solanki, K., Sullivan, K., Madhow, U. (eds.) IH 2008. LNCS, vol. 5284, pp. 341–356. Springer, Heidelberg (2008) 5. Guth, H.J., Pfitzmann, B.: Error- and collusion-secure fingerprinting for digital data. In: Pfitzmann, A. (ed.) IH 1999. LNCS, vol. 1768, pp. 134–145. Springer, Heidelberg (2000) 6. Kuribayashi, M.: Tardos's fingerprinting code over AWGN channel. In: Böhme, R., Fong, P.W.L., Safavi-Naini, R. (eds.) IH 2010. LNCS, vol. 6387, pp. 103–117. Springer, Heidelberg (2010) 7. Kuribayashi, M., Morii, M.: Systematic generation of Tardos's fingerprinting codes. IEICE Trans. Fundamentals E93-A(2), 508–515 (2010) 8. Nuida, K.: Making collusion-secure codes (more) robust against bit erasure. ePrint 2009-549 (2009) 9. Nuida, K., Fujitsu, S., Hagiwara, M., Kitagawa, T., Watanabe, H., Ogawa, K., Imai, H.: An improvement of discrete Tardos fingerprinting codes. Designs, Codes and Cryptography 52(3), 339–362 (2010) 10. Nuida, K., Hagiwara, M., Watanabe, H., Imai, H.: Optimization of Tardos's fingerprinting codes in a viewpoint of memory amount. In: Furon, T., Cayre, F., Doërr, G., Bas, P. (eds.) IH 2007. LNCS, vol. 4567, pp. 279–293. Springer, Heidelberg (2008) 11. Safavi-Naini, R., Wang, Y.: Collusion secure q-ary fingerprinting for perceptual content. In: Sander, T. (ed.) DRM 2001. LNCS, vol. 2320, pp. 57–75. Springer, Heidelberg (2002) 12. Tardos, G.: Optimal probabilistic fingerprint codes. J. ACM 55(2), 1–24 (2008)
Validating Security Policy Conformance with WS-Security Requirements Fumiko Satoh and Naohiko Uramoto IBM Research - Tokyo, 1623-14 Shimo-tsuruma, Yamato-shi, Kanagawa 242-8502, Japan {sfumiko,uramoto}@jp.ibm.com
Abstract. Web Services Security (WS-Security) is a technology to secure the data exchanges in SOA applications. The security requirements for WS-Security are specified as a security policy expressed in Web Services Security Policy (WS-SecurityPolicy). The WS-I Basic Security Profile (BSP) describes best practices for addressing the security concerns of WS-Security. It is important to prepare BSP-conformant security policies, but it is quite hard for developers to create valid security policies because the security policy representations are complex and difficult to fully understand. In this paper, we present a validation technology for security policy conformance with WS-Security requirements. We introduce an Internal Representation (IR) representing a security policy and its validation rules; a security policy is known to be valid if it conforms to the rules after the policy is transformed into the IR. We demonstrate the effectiveness of our validation technology and evaluate its performance on a prototype implementation. Our technology makes it possible for a developer without deep knowledge of WS-Security and WS-SecurityPolicy to statically check whether a policy specifies appropriate security requirements. Keywords: WS-SecurityPolicy, WS-Security, Basic Security Profile, Conformance Validation.
1 Introduction Making SOA applications secure is a very important non-functional requirement, especially for enterprise systems. WS-Security [1] can provide end-to-end integrity and confidentiality by signing and encrypting SOAP messages. Web Services Security Policy (WS-SecurityPolicy) [2] is a specification for a security policy representation for WS-Security, and therefore providers need to prepare and publish security policies expressed in WS-SecurityPolicy to ensure secure message exchanges. WS-Security can flexibly and reliably sign and encrypt messages, ensuring their integrity and confidentiality. We need to apply WS-Security to the messages appropriately to eliminate security weaknesses. The WS-I Basic Security Profile (BSP) [3] describes the recommendations for WS-Security message interoperability, and also provides constraints on the WS-Security message structure to improve the security. One example of the requirements is that "a username token with a password should be signed" to prevent replay attacks against the security tokens. I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 133–148, 2010. © Springer-Verlag Berlin Heidelberg 2010
If a policy developer needs to create a security policy that follows these requirements to improve the security of the message exchanges, the security policy should be defined to satisfy them. Defining a security policy conforming to the requirements calls for security expertise and a deep understanding of WS-Security and WS-SecurityPolicy. This is because the requirements refer to the WS-Security message structure, but the developer needs to express them in a security policy using WS-SecurityPolicy; hence the developer must understand the relationships between the structure of WS-Security messages and the security policies, and translate the requirements on WS-Security into constraints in WS-SecurityPolicy. This is quite difficult because these specifications are complex and there are no supporting technologies or tools for checking the policy conformance. Currently developers must check the policy conformance manually. In this paper, we present a validation technology to test policy conformance with the WS-Security requirements. It is difficult to directly check whether a security policy itself conforms to the requirements on WS-Security, so we introduce an Internal Representation (IR) that is the key of our approach. The IR is a model representing the structure of messages, and the policy is validated if it satisfies the constraints after being mechanically transformed into an IR. The IR is defined with predicates, and therefore the policy conformance can be validated by inference. We demonstrate our validation method using a prototype and evaluate its performance. Our technology reduces developers' workloads for creating valid security policies by removing the need for manual checks of policy conformance. The remainder of this paper is structured as follows. Section 2 discusses the problems in defining security policies. Our validation approach is explained in Section 3.
Section 4 demonstrates the conformance validation using an example policy, and describes an experiment to evaluate the performance. Section 5 discusses related work and we conclude in Section 6.
2 Problems in Defining Security Policies WS-Security protects SOAP messages exchanged between requesters and providers using signed and encrypted XML messages. The signatures and encryptions should be done appropriately to protect the messages. The WS-I Basic Security Profile defines a number of constraints on message structures that should be taken into account when we use WS-Security. In Section 17 of the BSP, there is a list of security attacks and best-practice recommendations to defend against them. Signing a username token is one example of a recommendation to prevent replay attacks on the token. Therefore a security policy should require a signature if a username token is included in the policy, but a manual human inspection can easily fail to detect whether or not a security policy conforms to this constraint, because the WS-SecurityPolicy specification is complicated and there are currently no effective tools to check the policies. Here we clarify three reasons why defining conforming security policies is difficult. 1. The Basic Security Profile describes rules for the WS-Security message structure, but there are no rules for the security policies. Hence, a policy developer needs to understand which structures in a WS-Security message correspond to which security policies.
2. The WS-SecurityPolicy specification defines a number of assertions, and therefore we can flexibly specify many kinds of security requirements. Because of this high flexibility, the specification is very complicated and it is difficult for a developer to fully and correctly understand all of the assertions. 3. One security policy can generate many WS-Security message structures. This means that a security policy specifies only part of the WS-Security message structure, and some portions of the message remain unconstrained by any security policy. Reason 1 means that a developer needs to clearly visualize the corresponding WS-Security message structure when defining a security policy in WS-SecurityPolicy. However, it is difficult to mentally visualize the corresponding WS-Security messages because the transformations from WS-SecurityPolicy assertions into the WS-Security requirements involving signatures, encryptions, and security tokens are quite complicated. This complexity of transformation comes from the differences in the descriptive levels of the security requirements between the WS-Security messages and the WS-SecurityPolicy assertions. In WS-Security, we can specify three primitive requirements, signing, encryption, and the use of security tokens, which correspond to integrity, confidentiality, and authentication, respectively. However, a security policy in WS-SecurityPolicy does not specify a primitive requirement directly. For example, a security policy specifies a special higher-level assertion such as ProtectTokens instead of directly specifying a signature on a security token by saying "the signed portion is the security token". In that sense, WS-SecurityPolicy has a high-level description that does not specify the primitive requirements directly. We discussed the details of these description levels in [4]. In addition, we cannot define the message structure uniquely with only one policy.
A security policy specifies a set of minimum security requirements which should be satisfied by requesters, so the policy information is not sufficient to generate a unique message structure. A requester may therefore add other kinds of security requirements to the WS-Security message when it sends the message to a provider. In this sense, many policy-conformant WS-Security messages may be valid. When we verify whether a security policy conforms to the BSP rules, we need to check all of the possible WS-Security messages created by the policy, and this makes policy validation quite difficult for users who are not highly knowledgeable about both WS-Security and WS-SecurityPolicy. For these reasons, we believe there is a need for a technology that can validate whether or not a security policy corresponds to WS-Security messages that conform to the rules defined in the BSP. There are two approaches for checking whether a security policy is defined appropriately. The first is static validation of the security policy itself, and the other is checking the generated WS-Security messages by dynamically testing them against the security policy at runtime. There is an application server [5] that can validate WS-Security messages, but this dynamic validation tends to impact the performance of the SOAP message processing. Static validation promises major improvements in runtime performance. In addition, since there are multiple WS-Security messages that satisfy a security policy, we can eliminate many runtime invocations of the validation engine if the operative policy can be statically validated before runtime. We compared the dynamic and static validation of BSP conformance in [6].
[Figure 1 depicts three elements: (1) the Internal Representation (IR) in predicate logic as the universal set of WS-Security messages, (2) the translation rules that map the BSP rules into the IR, and (3) the transformation from WS-SecurityPolicy into the IR. Policies A–D appear as small circles of policy IRs (Policies B and C map to the same circle), BSP Rules 1 and 2 as larger ovals, and the BSP-conformant IR as their intersection, which contains only Policy A.]
Fig. 1. Policy Conformance defined by IR
Therefore, we are now focusing on the static validation technology for WS-SecurityPolicy conformance. In the next section, we present the key ideas of our policy validation.
3 Security Policy Conformance Validation 3.1 Definition of Policy Conformance To address the difficulties of security policy validation discussed in Section 2, we devised a new Internal Representation (IR) using predicate logic to represent a validated policy and the validation rules. The reason for using predicate logic is that a security policy specifies a minimum set of constraints for WS-Security messages, so we need one representation model that can represent the multiple message structures corresponding to one policy. A logic-based representation can represent undefined parts with variables. Figure 1 shows the policy conformance defined using the IR. The central circle in Figure 1 shows the IR that corresponds to the universal set of WS-Security messages. A security policy can be transformed into the IR that corresponds to the WS-Security messages generated by the policy. For example, the policy IR of Policy B is shown by the small circle that may contain multiple corresponding message structures. Policy C is transformed into the same IR circle as Policy B, which means that these policies generate the same WS-Security messages. We can specify a security policy flexibly in WS-SecurityPolicy, so this situation is possible when the policies are transformed into IRs. The BSP defines multiple rules that describe the requirements on the WS-Security message structure, and therefore the BSP rules can also be transformed into corresponding IRs. In Figure 1, BSP Rule 1 is transformed into the corresponding IR shown as an oval larger than the circles of the policy IRs. A BSP rule is a constraint on only a part of the message structure, so there are many message structures conforming to BSP Rule 1. This is why the oval for the IR corresponding to the BSP rule is larger than the circle of a policy IR. In this figure, there are two BSP rules transformed into the IR space. If a security policy conforms to one BSP rule, then the policy IR
[Figure 2 depicts the validation engine: an input policy XML is automatically transformed, using the definition of the IR predicates, into policy(WSSMsg); the BSP rules are held as bsp(WSSMsg); and the BSP conformance check policy(WSSMsg) → not(bsp(WSSMsg)) returns true or false.]
Fig. 2. Policy Validation Mechanism
should be contained in the IR region of that rule. Here, Policy A is contained in the IR of BSP Rule 1, and Policies A, B, and C are contained in the region of Rule 2. A BSP-conformant policy should satisfy all of the rules defined in the BSP, so we can check that a security policy conforms to the BSP when the policy IR is contained within all of the IRs of the BSP rules. In this example, Policy A satisfies both BSP Rule 1 and Rule 2, and therefore only Policy A conforms to the BSP. We provide a policy validation mechanism using the IR. Figure 2 shows an abstract view of our policy validation mechanism. The IR is written using predicate logic, so the policy conformance can be validated by inference with a Prolog program or a Java implementation of Prolog. The validation engine is the Prolog program that holds the BSP rules as policy validation rules (shown as bsp(WSSMsg) in Figure 2). The input is an XML file of the security policy to be validated, and it is transformed into the IR shown as policy(WSSMsg) in Figure 2. Here the policy IR is regarded as the Prolog facts, and the IRs of the BSP rules are regarded as the Prolog rules for the policy validation. If the policy conforms to the BSP rules, then the predicate policy(WSSMsg) → bsp(WSSMsg) will return true. Our validation engine executes the negation of the predicate as policy(WSSMsg) → not(bsp(WSSMsg)), and true is returned if the policy is not conformant to the BSP. We have three key technologies in the validation mechanism: (1) the IR (Internal Representation) defined with predicate logic, (2) transformations from the security policy into the IR, and (3) translations of the BSP rules into the IR. The IR should correspond to the XML structure of the WS-Security messages. Therefore we define the IR based on the schema of WS-Security messages, and the IR is able to represent any characteristics of WS-Security messages.
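The negation query can also be understood operationally: a policy conforms exactly when no message it can generate violates any rule, so the engine searches for a violating witness. The Python sketch below emulates this idea; the message generator and the single rule are simplified stand-ins invented for the example, not the actual IR or the real BSP rule set.

```python
def messages_of(policy):
    """Enumerate the message structures a policy can generate: fields the
    policy pins down are fixed, unconstrained fields range over choices
    (playing the role of the free logic variables in the IR)."""
    for pwd_type in policy.get("password_type", ["digest", None]):
        yield {"username_token": True, "password_type": pwd_type}

# each rule returns True when a message satisfies it
RULES = [
    lambda m: not m["username_token"] or m["password_type"] is not None,
]

def bsp(msg):
    return all(rule(msg) for rule in RULES)

def violating_witnesses(policy):
    """The negation query: messages generated by the policy for which
    bsp fails; an empty result means the policy conforms."""
    return [m for m in messages_of(policy) if not bsp(m)]

loose = {}                                # leaves the Type attribute open
strict = {"password_type": ["digest"]}    # policy pins the attribute down
```

Here the loose policy yields a witness message with a missing password type, so it is rejected, while the strict policy has no violating witness and is accepted.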
We have mapping rules between the message schema and the predicates of IR, so the input policy can be transformed into the corresponding predicates of the IR by referring to the mapping rules. A security policy is written in XML format, and therefore it can be transformed into the corresponding IR by the transformation engine. In contrast, the BSP rules are provided as a document written in natural language, and we need to manually translate these rules into the IR. The BSP rules translation only needs to be done once, so this does not increase the user’s workload. 3.2 Internal Representation We defined a basic format for predicates to represent XML schema elements. Listing 1 is an example of a schema for a simple element named "A". The corresponding predicate "a" for the element "A" is defined in Listing 2. We present the IR predicates using the Prolog format.
(01) <xs:complexType name="A">
(02)   <xs:sequence>
(03)     <xs:element ref="Elem1" minOccurs="0" maxOccurs="1"/>
(04)     <xs:element ref="Elem2" minOccurs="1" maxOccurs="1"/>
(05)     <xs:element ref="Elem3" minOccurs="0" maxOccurs="unbounded"/>
(06)     <xs:any namespace="##other" minOccurs="0" maxOccurs="unbounded" processContents="lax"/>
(07)   </xs:sequence>
(08)   <xs:anyAttribute namespace="##other" processContents="lax"/>
(09) </xs:complexType>
Listing 1. An example schema for a simple element

(10) a([Aid, Elem1List, Elem2List, Elem3Lists, AElemList, [AAttrList]]) :-
(11)     id(Aid),
(12)     elem1(Elem1List),
(13)     elem2(Elem2List),
(14)     elem3List(Elem3Lists),
(15)     anyElem(AElemList),
(16)     anyAttr(AAttrList).
(17) elem1(null).
(18) elem3List(Elem3Lists) :-
(19)     member(Elem3List, Elem3Lists),
(20)     elem3(Elem3List).
(21) elem3(null).
Listing 2. The predicates for Listing 1
Here are the basic mapping rules between the schema elements and the IR predicates.
- An element is mapped to a predicate that has the same name and a list as a predicate variable.
- The list of the predicate includes three types of variables. The first variable ("Aid" in Line (10)) corresponds to the "wsu:Id" attribute of the element. The next variables ("Elem1List", "Elem2List", and "Elem3Lists" in Line (10)) are lists representing the child elements. There are also two variables for the "any" and "anyAttribute" elements in the schema ("AElemList" and "AAttrList" in Line (10)).
- If the element has attributes besides "wsu:Id", then the corresponding variables are in the inner list of "AAttrList" in Line (10).
- The predicates corresponding to the child elements are defined separately.
- The number of occurrences of an element is specified by the corresponding predicate. If minOccurs="0" is specified ("Elem1" in Line (03) and "Elem3" in Line (05)), the corresponding predicate can have "null" in its variable (Lines (17), (21)). If maxOccurs="unbounded" is specified ("Elem3" in Line (05)), the upper predicate has the member predicate of the corresponding predicate of the element, as in Line (18).
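The core of these mapping rules can be mechanized. The minimal Python sketch below derives the head of the IR predicate from the Listing 1 schema; the helper name predicate_head is hypothetical, and the any/anyAttribute variables and the null clauses for optional elements are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Listing 1's element, inlined (namespace added so the snippet
# parses on its own)
SCHEMA = """<xs:complexType xmlns:xs="http://www.w3.org/2001/XMLSchema" name="A">
  <xs:sequence>
    <xs:element ref="Elem1" minOccurs="0" maxOccurs="1"/>
    <xs:element ref="Elem2" minOccurs="1" maxOccurs="1"/>
    <xs:element ref="Elem3" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>"""

XS = "{http://www.w3.org/2001/XMLSchema}"

def predicate_head(complex_type):
    """Build the head of the IR predicate for a complexType: an Id
    variable followed by one list variable per child element, with
    unbounded children getting a ...Lists variable."""
    name = complex_type.get("name")
    args = [name + "id"]
    for el in complex_type.iter(XS + "element"):
        suffix = "Lists" if el.get("maxOccurs") == "unbounded" else "List"
        args.append(el.get("ref") + suffix)
    return "{}([{}])".format(name.lower(), ", ".join(args))

head = predicate_head(ET.fromstring(SCHEMA))
```

For the schema above, head reproduces the variable layout of Line (10) (without the any/anyAttribute slots).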
%---------- SOAP Envelope ----------
envelope([EnvId, HeaderList, BodyList, EnvElemList, [EnvAttrList]]) :-
    id(EnvId), header(HeaderList), body(BodyList),
    anyElem(EnvElemList), anyAttr(EnvAttrList).
%---------- SOAP Header ----------
header([HeaderId, HeaderContentList, HeaderElemList,
        [Version, Actor, MustUS, Relay, HeaderAttrList]]) :-
    id(HeaderId), soapVersion(Version), ...
    anyElem(HeaderElemList), anyAttr(HeaderAttrList),
    memberchk(HeaderContent, HeaderContentList),
    headerContent(HeaderContent).
%---------- A Security Header is a SOAP Header ----------
headerContent(HeaderContent) :- securityHeader(HeaderContent).
%---------- Security Header ----------
securityHeader([SecId, SigLists, EncLists, TokenLists, TSLists,
                SigConfList, SecElemList, [SecAttrsList]]) :-
    id(SecId), signatureList(SigLists), encryptionList(EncLists),
    sectokenList(TokenLists), timestampList(TSLists),
    signatureConfirmationList(SigConfList),
    anyElem(SecElemList), anyAttr(SecAttrsList).
%---------- A Security Header can contain a list of signatures ----------
signatureList(X) :- memberchk(Y, X), signature(Y).
signature(null).
%---------- Signature Element ----------
signature([SigId, SigInfoList, SigValList, SigKeyInfoList]) :-
    id(SigId), signedInfo(SigInfoList),
    signatureValue(SigValList), keyInfo(SigKeyInfoList).
%---------- Security Tokens ----------
sectoken(SecTokenList) :- usernameToken(SecTokenList).
sectoken(SecTokenList) :- binarySecurityToken(SecTokenList).
%---------- Username Token ----------
usernameToken([UTId, Username, PwdList, NonceList, CreatedList, [UTAttrList]]) :-
    id(UTId), username(Username), password(PwdList),
    nonce(NonceList), created(CreatedList), anyAttr(UTAttrList).
%---------- Binary Security Token ----------
binarySecurityToken([BSTId, Value, [EncodingType, ValueType, BSTAttrList]]) :-
    id(BSTId), bstValue(Value), encodingType(EncodingType),
    valueType(ValueType), anyAttr(BSTAttrList).
Listing 3. Example of IR predicates for WS-Security
We defined the mapping between the five schemas of WS-Security and the predicates for the IR based on these mapping rules. Listing 3 has examples of the IR predicates for the WS-Security schema. The envelope predicate is the top-level predicate of the IR. The SOAP Envelope has header elements and a Body element, represented by the header and body predicates, respectively. A header element has some attributes such as “soapVersion” or “actor”, as specified in the header predicate. A Security
header is defined as a securityHeader predicate, which is a headerContent predicate. The securityHeader predicate has lists for the signature, encryption, sectoken, and timestamp predicates. Here parts of the signature predicate and the usernameToken and binarySecurityToken predicates are shown as examples of sectoken predicates. Listing 3 shows how our IR predicates inherit the structure of the schema, and hence the IR and the WS-Security schema have the same expressive capability and granularity for representing the WS-Security message structure. 3.3 IR Transformation from Policies A security policy is represented in an XML format, hence we need to transform an XML representation into the corresponding predicates for validation. We created a policy transformation engine to convert XML into the IR, and this is one of the primary features of our work. We need mapping rules to transform policy assertions into the IR, but the mapping rules cannot be defined straightforwardly. Each security policy is defined as a combination of security policy assertions, and each policy combination has a nested structure. For example, a SupportingTokens assertion can have an X509Token assertion, and that X509Token assertion can have other assertions, such as an AlgorithmSuite assertion and a SignedParts assertion. Depending on the policy structure, the mapping rules can become complicated. Therefore, our policy transformation uses the template-based approach shown in Figure 3. Here we clarify the definitions of the templates and the rules used to fill the blanks in a template to generate the IR. We defined a top-level template as an IR template with blanks to be filled in by lower-level IR fragments. Several kinds of lower-level IR fragments are defined, mainly for signatures, encryption, tokens, and timestamps. Each IR fragment has parameters that should be filled in by referring to the security assertions.
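The template-and-fragment idea can be sketched with Python's string.Template. The fragment and template texts below are hypothetical abbreviations invented for the example (the real IR fragments are Prolog goal sequences as in Figure 3), but the filling order — nested fragment first, then parent, then top-level template — mirrors the transformation.

```python
from string import Template

SIGNATURE_FRAGMENT = Template(
    "C14NAlgo='${C14NAlgo}', SigMAlgo='${SigMAlgo}', "
    "RefURI='${RefURI}', ${KeyInfoFragment}"
)
KEYINFO_FRAGMENT = Template("WsseRefURI='${WsseRefURI}'")
IR_TEMPLATE = Template("policy(Env) :- ${SignatureFragments}")

def build_policy_ir(sig_params, keyinfo_params):
    # the nested KeyInfoType fragment is filled first, inserted into its
    # parent signature fragment, and the filled parent is inserted into
    # the top-level IR template
    keyinfo = KEYINFO_FRAGMENT.substitute(keyinfo_params)
    sig = SIGNATURE_FRAGMENT.substitute(KeyInfoFragment=keyinfo, **sig_params)
    return IR_TEMPLATE.substitute(SignatureFragments=sig)

ir = build_policy_ir(
    {"C14NAlgo": "exc14n", "SigMAlgo": "rsasha1", "RefURI": "bodyId"},
    {"WsseRefURI": "bstokenId"},
)
```

In the real engine the parameter values come from the policy assertions (Type 2 assertions such as AlgorithmSuite), while the choice of which fragments to fill comes from Type 1 assertion combinations.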
For example, a signature IR fragment has variables to specify the algorithm names used for the signature and digest methods. Figure 3 shows part of an IR template and fragments. The IR fragments have variables such as ${C14NAlgo} and ${SigMAlgo}, and a fragment may also have a place to insert another IR fragment. In a signature IR fragment, the place for a KeyInfoType IR fragment is represented by [KeyInfoType IR fragment]. As shown in this example, an IR fragment may be nested in other fragments. The blanks for IR fragments and variables are filled in by referring to the combinations of security policy assertions. We categorize security policy assertions into two types: Type 1 assertions can be mapped directly to IR fragments, and Type 2 assertions can be mapped to variables within IR fragments. In a security policy, a combination of a SignedParts assertion and a SymmetricBinding assertion specifies a signed portion and signature-related information, and therefore this combination of assertions can be mapped to a Signature IR fragment. This means that they are Type 1 assertions. In contrast, an AlgorithmSuite assertion specifies algorithms for signatures and encryptions, and this assertion can fill in variables for algorithms in signature and encryption IR fragments, such as ${C14NAlgo}, ${SigMAlgo}, and ${DigMAlgo}. This kind of assertion is categorized as Type 2. The transformation executes three steps: (1) map policy assertions of Type 1 to IR fragments, (2) insert the IR fragments into an IR template, and (3) assign or modify the values of the variables in the IR fragments according to the Type 2 policy assertions. An IR fragment may be nested in another fragment. After values are assigned to the variables in the IR fragments, they are inserted into an IR template. The IR
[Figure 3 shows an IR template with insertion points for Signature, Encryption, Token, and Timestamp IR fragments, together with example Signature, Token (UsernameToken), and KeyInfoType IR fragments. The fragments contain variables such as ${SigId}, ${C14NAlgo}, ${SigMAlgo}, ${DigMAlgo}, ${RefURI}, ${STRId}, ${UTId}, ${PwdType}, and ${WsseRefURI}, and the KeyInfoType fragment nests inside the Signature fragment.]
Fig. 3. Example of IR template and fragments
template has places for the signature, encryption, token, and timestamp IR fragments, and the policy IR can be generated by inserting all of the IR fragments. Specifying security requirements by combinations of policy assertions is one of the features of the latest version of the WS-SecurityPolicy specification [2]. Thanks to the template-based approach, we can support transformations from complicated combinations of policy assertions. Hence, we can handle the validation of security policies represented in the latest specification. This is one of our contributions in this study. 3.4 Policy Validation Rules We investigated the BSP rules to select those usable for static policy validation. In this study, we used 21 rules defined in the BSP as the policy validation rules. Here we show the two BSP rules numbered R4201 and C5443 as examples of the policy validation rules. R4201: Any PASSWORD of a Username Token MUST specify a Type attribute. This is one of the simplest rules defined in the BSP. The translated policy validation rule is the invalid_r4201 predicate, written in Prolog syntax as:
invalid_r4201([UTId]) :-
    validatedPolicy(EnvList),
    isSecurityHeader(EnvList, SecHeader),
    isSecurityToken(SecHeader, TokenList),
    isUsernameToken(TokenList),
    isPasswordType(TokenList, PwdType),
    PwdType=='null'.
The invalid_r4201 predicate returns false if the message created by the policy conforms to the R4201 rule; otherwise it returns the ID of the username token that violates this rule as
the value of the UTId variable. The validatedPolicy predicate here is the IR of the security policy to be validated, and the EnvList variable is a list for the envelope predicate shown in Listing 3. This EnvList has all of the information on the WS-Security message structures and attribute values created from the security policy, so the next three predicates extract the username token information from the EnvList. The isSecurityHeader predicate returns the list of the securityHeader predicates in Listing 3, and the isSecurityToken and isUsernameToken predicates return the lists of the usernameToken predicates. The TokenList list has the PwdType variable, which represents the Type attribute of the username token; for the R4201 rule, the PwdType variable must not be 'null'. C5443: When the signer's SECURITY_TOKEN is an INTERNAL_SECURITY_TOKEN, the SIGNED_INFO MAY include a SIG_REFERENCE that refers to the signer's SECURITY_TOKEN to prevent substitution with another SECURITY_TOKEN that uses the same key. This C5443 rule says that the security token may be self-signed if the token is used as the key of the signature.
invalid_c5443([SigId, TokenId, IdList]) :-
    validatedPolicy(EnvList),
    isSecurityHeader(EnvList, SecHeader),
    isSignatureList(SecHeader, SignatureList),
    invalidProtectTokens(SignatureList, SigId, TokenId, IdList).
invalidProtectTokens(SignatureList, SigId, TokenId, IdList) :-
    isMember(Signature, SignatureList),
    isSignatureKeyInfo(SigId, Signature, SignatureKeyInfo),
    securityTokenReference(SignatureKeyInfo),
    referenceType(SignatureKeyInfo, RType, TokenId),
    RType=='strRef',
    reference(RefTypeList),
    isSignTokenNotSigned(Signature, TokenId, IdList).
isSignTokenNotSigned(Signature, TokenId, IdList) :-
    isSigInfoReferenceList(Signature, SigInfoRefList),
    allSigRefIds(SigInfoRefList, IdList),
    not(isMember(TokenId, IdList)).
The invalid_c5443 predicate returns false if the message created by a policy conforms to the C5443 rule; otherwise it returns the ID of the signature that violates this rule, the ID of the security token, and the ID list of the signed parts. The invalid_c5443 predicate consists of the two fine-grained predicates invalidProtectTokens and isSignTokenNotSigned. The invalidProtectTokens predicate returns information on the signatures: the signature ID, the ID of the security token used as a signing key, and the ID list of the signed portions. The isSignTokenNotSigned predicate returns true if the security token ID is not included in the ID list of the signed portions; this means that the signature did not sign the security token itself and thus violates the C5443 rule. Here we use invalid predicates that return true if the policy does not conform to the rules. The reason for this is so we can collect information about the violations of the BSP rules. We translated the 21 rules defined in the BSP, so the policy validation engine can execute the following predicate to confirm that a particular security policy conforms to all of the rules:
Validating Security Policy Conformance with WS-Security Requirements
%------------------ Signature ------------------
(01) Sig=[SigId, SigInfoList, SigValList, SigKeyInfoList, SigObjList],
(02) SigId='sigId',
(03) SigInfoList=[SignedInfoId, C14NMethodList, SigMethodList, SigInfoRefList],
(04) C14NMethodList=[C14NMId, C14NElemList, [C14NAlgo]],
(05) C14NAlgo='exc14n',
(06) SigMethodList=[SigMId, HMacOutLenList, SigMElemList, [SigMAlgo]],
(07) SigMAlgo='rsasha1',
%------------------ SignedInfo for Signature ------------------
(08) SigInfoRef=[RefId, TransList, DigestMList, DigestValList, [RefURI, RefType]],
(09) DigestMList=[DigMId, DigMElemList, [DigMAlgo]],
(10) DigMAlgo='sha1',
(11) RefURI='bodyId',
(12) member(SigInfoRef, SigInfoRefList),
%------------------ KeyInfo for Signature ------------------
(13) SigKeyInfoList=[KeyInfoId, KeyInfoTypeList, KeyInfoElemList],
(14) KeyInfoType=[STRId, RefTypeList, STRElem, [STRTokenType, Usage, STRAttrList]],
(15) RefTypeList=[RType, RefId, [WsseRefURI, RefValueType, RefAttrList]],
(16) WsseRefURI=BSTokenId,
(17) reference(RefTypeList),
(18) securityTokenReference(KeyInfoType),
(19) member(KeyInfoType, KeyInfoTypeList),
(20) signature(Sig),
(21) member(Sig, SigLists),
%------------------ Binary Security Token for Signature ------------------
(22) BSToken=[BSTokenId, Value, [BSTEncodingType, BSTValueType, BSTAttrList]],
(23) BSTokenId='bstokenId',
(24) binarySecurityToken(BSToken),
(25) member(BSToken, TokenLists),
%------------------ Username Token ------------------
(26) UNToken=[UNTokenId, Username, PwdList, NonceList, CreatedList, UTAttrList],
(27) UNTokenId='untokenId',
(28) PwdList=[PwdId, PwdString, [PwdType]],
(29) PwdType='null',
(30) usernameToken(UNToken),
(31) member(UNToken, TokenLists).

Listing 4. IR Example of Security Policy
invalidBSP(Info) :- invalid_r4201(Info); invalid_c5443(Info); ......

The negation of the predicate bsp(Info), invalidBSP(Info), is the logical disjunction of the 21 BSP rules; it returns true when at least one of the validation rules is violated and also returns information about the violation in the Info list. A security policy should conform to all of the rules, so if the predicate returns false, then the security policy conforms to all of the BSP rules. In the next section, we give a concrete example of policy validation using the IR.
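The "invalid predicate" pattern can be sketched outside Prolog as well. The following Python sketch is our illustration, not the authors' engine: it mimics the convention on a simplified, dictionary-based message IR, where each rule function returns violation information or None, and their disjunction gives the invalidBSP behavior. All field names here are assumptions for illustration only.

```python
# Illustrative analogue of the invalid-predicate pattern over a
# simplified message IR (a dict); not the paper's Prolog IR.

def invalid_r4201(env):
    """R4201 (simplified): every username token must carry a password Type."""
    for token in env.get("username_tokens", []):
        if token.get("pwd_type") is None:
            return ("r4201", [token["id"]])
    return None

def invalid_c5443(env):
    """C5443 (simplified): a signature keyed by an internal security token
    should include that token among its signed references."""
    for sig in env.get("signatures", []):
        signed_ids = sig.get("signed_refs", [])
        if sig.get("key_token_id") not in signed_ids:
            return ("c5443", [sig["id"], sig["key_token_id"], signed_ids])
    return None

RULES = [invalid_r4201, invalid_c5443]

def invalid_bsp(env):
    """Disjunction of the rules: first violation found, or None if all hold."""
    for rule in RULES:
        info = rule(env)
        if info is not None:
            return info
    return None
```

Run on an IR resembling the section's example (a username token without a HashPassword, and a signature that does not cover its own token), this returns the r4201 violation first and, once that is repaired, the c5443 violation.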
4 Evaluation

4.1 Policy Validation Rules

We have developed a prototype of our policy conformance validation engine, and here we give an example of policy validation. The validation engine accepts a security policy as input and transforms it into the corresponding IR automatically. The IR then becomes the input for the policy validation predicate invalidBSP of Section 3. Listing 4 shows some of the IR predicates, illustrating how one signature and two security tokens are represented in the IR. Lines (01) to (07) show a part of the IR for the signature, which is assigned the signature ID "sigId" in Line (02). Lines (08) to (12) correspond to the SignedInfo element for the signature. Line (11) shows the ReferenceURI that specifies the ID of the signed portion, which means that the element with the "bodyId" attribute is signed. Lines (13) to (21) are the IR of the KeyInfo for the signature. Here Line (16) means that the security token with the "BSTokenId" attribute as its ID is a key for the signature. Lines (22) to (25) show the IR for the binary security token with the ID "BSTokenId", so this token is used as a signing key. Lines (26) to (31) show the username token with the ID "untokenId" from Line (27). This username token has no password type, so Line (29) shows that the PwdType is 'null'. Here we simplify the policy validation rules and assume that only Rules R4201 and C5443 apply. Hence we need to infer whether the following predicate is true or false:

invalidBSP(Info) :- invalid_r4201(Info); invalid_c5443(Info).

If the input policy has no HashPassword element under the UsernameToken assertion, then the corresponding policy IR has no value for the PwdType, and 'null' is assigned to the PwdType as shown in Line (29). In this case, the invalidBSP predicate becomes true and returns the following counterexample:

InvalidNo = r4201, Info = [untokenId]

This means that the security token with the ID "untokenId" violates Rule R4201. This security policy also violates the other rule, Rule C5443.
According to Rule C5443, the security token used for signing needs to be self-signed. However, the security policy does not sign the security token, and therefore we get the following counterexample:

InvalidNo = c5443, Info = [sigId, bstokenId, [bodyId]]

This result means that the signature with ID "sigId" violates Rule C5443; the invalid signature uses the security token with ID "bstokenId" and signs the element with ID "bodyId". The user receives the violated rule IDs and related information as a counterexample, thus supporting resolution of any problems. A security policy specifies the minimum security requirements for the WS-Security messages, so we cannot uniquely determine the message structure corresponding to a security policy. We can deal with this problem because our IR can represent multiple WS-Security message structures by using variables for the undefined parts. Our policy validation checks whether the minimum security requirements conform to the validation rules. Our static validation technology can be integrated into a policy authoring tool, and the policy validation engine can contribute to resolving the problems of Section 2. The
policy validation rules are embedded in the validation engine, and therefore the user can validate a concrete security policy simply by feeding the policy into the engine, even if the user does not fully understand the security requirements specified in the policy. We can validate the security policy while reducing the users' workloads and without requiring detailed knowledge of WS-Security, which is one of the most important contributions of our work.

4.2 Performance Evaluation

We performed an experiment to evaluate the performance of the policy validation in Prolog. We examined the validation of the test policies shown in Table 1, whose requirements are validated against the 21 validation rules. For example, Policy 1 requires a signature and an encryption on the SOAP Body using an X.509 certificate. Policy 2 adds an IncludeTimestamp assertion to Policy 1, and Policy 3 additionally has a ProtectTokens assertion. UNT(w/ Pwd) means a UsernameToken assertion with a HashPassword element, specified as a SupportingToken assertion, and UNT(w/o Pwd) means a UsernameToken assertion with no HashPassword element. SUNT(w/ Pwd) is a SignedSupportingToken assertion that has a UsernameToken assertion with a HashPassword element. All of these policies violate at least one of the validation rules.

Here is the environment of our experiments. The operating system was Windows XP Professional with Service Pack 2 running on an Intel Core2 2.00-GHz CPU with 3 GB of memory. We performed the validation 30 times for each policy and compared the average execution times. Figure 4 shows the average execution times for the 15 test policies. According to these results, Policy 3 takes about 1.2 times longer than Policies

Table 1. List of Test Policies

Policy Number | Existing Policy Assertions
Policy 1  | Signature and Encryption on SOAP Body by X.509 token
Policy 2  | Policy 1 + TS
Policy 3  | Policy 1 + TS + PT
Policy 4  | Policy 1 + UNT(w/o Pwd)
Policy 5  | Policy 1 + UNT(w/ Pwd)
Policy 6  | Policy 1 + SUNT(w/o Pwd)
Policy 7  | Policy 1 + SUNT(w/ Pwd)
Policy 8  | Policy 1 + TS + UNT(w/o Pwd)
Policy 9  | Policy 1 + TS + PT + UNT(w/o Pwd)
Policy 10 | Policy 1 + TS + PT + UNT(w/ Pwd)
Policy 11 | Policy 1 + TS + PT + SUNT(w/o Pwd)
Policy 12 | Policy 1 + TS + PT + SUNT(w/ Pwd)
Policy 13 | Policy 1 + TS + PT + UNT(w/ Pwd) + UNT(w/o Pwd)
Policy 14 | Policy 1 + TS + PT + SUNT(w/ Pwd) + UNT(w/o Pwd)
Policy 15 | Policy 1 + TS + PT + SUNT(w/ Pwd) + SUNT(w/o Pwd)

TS = Timestamp, PT = ProtectTokens, UNT = UsernameToken, SUNT = Signed UsernameToken
1 or 2, and the execution time for Policy 9 is about 1.2 times that of Policy 8. We believe these performance losses come from the ProtectTokens assertions. In addition, by comparing the results for Policies 1, 4, 5, 6, and 7, we found that a UsernameToken assertion does not have a big impact on the validation performance. Policies 14 and 15 are slower than the others, so we can conclude that the number of assertions in a security policy affects the performance, because Policies 14 and 15 have many SupportingTokens and SignedSupportingTokens assertions. The ProtectTokens and SignedSupportingTokens assertions are transformed into the corresponding signature IR, and therefore multiple signatures are included in the policy IR. The number of signatures affects the validation performance because the signature IR is complicated and its validation is slower than that of the others, such as a token IR. The prototype supports some of the WS-SecurityPolicy assertions and the execution times are not long. However, the total execution time will depend on the usage scenarios of our technology. For example, when the validation is done for an existing policy, the execution time is less critical than when validating a new policy at authoring time. We need to consider the likely usage scenarios for validation and enhance our technology to support more kinds of assertions before further performance evaluations. Other planned enhancements will include custom validation rules and ways to correct invalid security policies.
[Figure 4: bar chart; x-axis: policy number (1-15), y-axis: execution time in milliseconds.]
Fig. 4. Execution time of validation for test policies
5 Related Work

This study is an extension of our previous work [7], which presented the basic idea of syntactic policy validation. That paper only gave an overview of our approach; since then we have refined our policy validation mechanisms, including the definition of the IR, clarification of the policy validation rules, ways to transform policies into the IR, and testing of the validation behavior using our prototype implementation. In addition, we tested the performance of the prototype. This paper therefore presents significant extensions beyond our previous work. There have been other studies of the security policies of WS-SecurityPolicy. Bhargavan et al. studied the operational semantics of WS-SecurityPolicy and provided a tool to test for communication vulnerabilities in [8]. They identified some typical
security vulnerabilities in Web services, such as XML rewriting attacks, with a policy-driven security mechanism in [9]. They also represented the WS-SecurityPolicy assertions using predicate logic and used the predicates for security policy generation and verification. We also focus on validating policies to improve security policies, but our approach is completely different and complements theirs in several ways. Our approach tests whether the WS-Security message structure generated from a security policy is secure against various attacks. In contrast, Bhargavan et al. focused on invalidating a process executed by an attacker. Their process model is quite simple, but in practice there are many attack variations. They cannot accurately model complicated processes, so other approaches, such as static validation of the message structure, are necessary. This is one of the ways our work complements theirs. Also, their predicate logic representations of WS-SecurityPolicy were for the previous version of the WS-SecurityPolicy specification. The latest specification is quite different from and more complicated than the previous one. The security policy assertions have been redesigned, and the predicates in [8] are inadequate for the latest policies, while our IR predicates do address the latest complexities of WS-SecurityPolicy. One policy assertion no longer corresponds to one security requirement; instead, each security requirement is represented by a combination of assertions. The transformation from a policy into a corresponding IR is not trivial, so we provide an automatic transformation mechanism. This addresses one of the difficulties of using the latest WS-SecurityPolicy specification. Tziviskou and Di Nitto [10] proposed a formal specification for the requirements in WS-Security, seeking to verify whether the exchanged messages satisfy the requirements. They defined their predicates based on the WS-SecurityPolicy assertions and proposed a way to compare pairs of policies.

In contrast to their approach, our IR is defined from the WS-Security message schema, not from WS-SecurityPolicy itself. Hence we can validate security policy conformance for the WS-Security message structure generated by the policy. Many WS-Security messages can be generated from one security policy; this is one of the difficulties we address by using predicate logic to represent all of the generated message structures in one IR. Lee et al. [11] worked on composing security policies, using the concept of logically defeasible events to test security policies written in WS-SecurityPolicy. Their motivation, different departments creating a combined policy, is different from ours, but their approach using predicate logic is similar. There are many studies of logic-based approaches for security policies, not only for WS-SecurityPolicy, but a key feature of our work is handling conformance validation for the multiple generated WS-Security messages with a logic-based approach. There are many projects using WS-SecurityPolicy, and policy validation has become important to support such work. We plan to improve our technologies, for example by supporting custom policy validation rules and by improving the performance of the policy validation.
6 Conclusion

Defining a security policy is very difficult when policy conformance must be checked manually, so we propose a validation technology for policy conformance. Our approach introduces an IR (Internal Representation) based on predicate logic, and the policy IR is validated using inference. The constraints for improving security are discussed in the BSP, and we use them as the policy validation rules. The security
policy in XML is transformed into the corresponding IR and can be validated automatically; therefore, a policy developer can ensure that a security policy is valid in terms of conformance to the requirements for message structures without increasing the developers' workloads. We showed how the policy validation works using example policies and evaluated the performance using a prototype implementation. Recently, security policies written in WS-SecurityPolicy have become common and are used in many applications, so our technology is useful and can contribute value. This study used the BSP rules as policy validation rules, but the validation mechanism itself is independent of the BSP. We are working to support custom rules defined by developers or transformed from high-level policies such as corporate regulations. Our current prototype implementation of the validation engine uses Prolog, and we will study new implementation approaches to improve the performance. Possible future work includes a new framework to support custom policy validation rules and verification of the soundness and correctness of our method.
References

1. Web Services Security: SOAP Message Security 1.1, http://www.oasis-open.org/committees/download.php/16790/wss-v1.1-spec-os-SOAPMessageSecurity.pdf
2. WS-SecurityPolicy 1.2, http://www.oasis-open.org/committees/download.php/23821/ws-securitypolicy-1.2-spec-cs.pdf
3. Basic Security Profile Version 1.0 Final Material, http://www.ws-i.org/Profiles/BasicSecurityProfile-1.0.html
4. Satoh, F., Yamaguchi, Y.: Generic Security Policy Transformation Framework for WS-Security. In: International Conference on Web Services (ICWS 2007), pp. 513–520. IEEE Press, New York (2007)
5. IBM WebSphere Application Server, http://www.ibm.com/software/webservers/appserv/was
6. Prennschutz-Schutzenau, S., Mukhi, N.K., Hada, S., Sato, N., Satoh, F., Uramoto, N.: Static vs. Dynamic Validation of BSP Conformance. In: International Conference on Web Services (ICWS 2009), pp. 919–927. IEEE Press, New York (2009)
7. Nakamura, Y., Satoh, F., Chung, H.V.: Syntactic Validation of Web Services Security Policies. In: Krämer, B.J., Lin, K.-J., Narasimhan, P. (eds.) ICSOC 2007. LNCS, vol. 4749, pp. 319–329. Springer, Heidelberg (2007)
8. Bhargavan, K., Fournet, C., Gordon, A.D.: Verifying Policy-Based Security for Web Services. In: 11th ACM Conference on Computer and Communications Security (CCS 2004), pp. 268–277 (2004)
9. Bhargavan, K., Fournet, C., Gordon, A.D., O'Shea, G.: An Advisor for Web Services Security Policies. In: ACM Workshop on Secure Web Services (2005)
10. Tziviskou, C., Nitto, E.D.: Logic-based Management of Security in Web Services. In: IEEE International Conference on Services Computing, pp. 228–235. IEEE Press, New York (2007)
11. Lee, A.J., Boyer, J.P., Olson, L.E., Gunter, C.A.: Defeasible Security Policy Composition for Web Services. In: 4th ACM Workshop on Formal Methods in Security Engineering, pp. 45–54 (2006)
Efficient Secure Auction Protocols Based on the Boneh-Goh-Nissim Encryption

Takuho Mitsunaga (1), Yoshifumi Manabe (2), and Tatsuaki Okamoto (3)

(1) Graduate School of Informatics, Kyoto University, Sakyo-ku, Kyoto, Japan; Kobe Digital Lab., Edocho 93, Chuo-ku, Kobe, Japan
(2) NTT Communication Science Laboratories, 3-1 Morinosato-Wakamiya, Atsugi, Kanagawa, Japan
(3) NTT Information Sharing Platform Laboratories, 3-9-11 Midoricho, Musashino, Tokyo, Japan
Abstract. This paper presents efficient secure auction protocols for the first price auction and the second price auction. Previous auction protocols are based on a general secure multi-party protocol called the mix-and-match protocol. However, although the mix-and-match protocol can securely compute any logical circuit, its time complexity is large. The proposed protocols reduce the number of times the mix-and-match protocol is used by replacing it with the Boneh-Goh-Nissim encryption, which enables the computation of 2-DNF formulas on encrypted data.
1 Introduction

1.1 Background
Recently, as the Internet has expanded, many researchers have become interested in secure auction protocols, and various schemes have been proposed to ensure the safe transaction of sealed-bid auctions. A secure auction is a protocol in which each player can learn only the highest bid and its bidder (called the first price auction) or the second highest bid and the first price bidder (called the second price auction). A simple solution is to assume a trusted auctioneer: bidders encrypt their bids and send them to the auctioneer, and the auctioneer decrypts them to decide the winner. To remove the trusted auctioneer, some secure multi-party protocols have been proposed. The common essential idea is the use of threshold cryptosystems, where the private decryption key is shared by the players. Jakobsson and Juels proposed a secure MPC protocol, called mix-and-match, to evaluate a function comprising a logical circuit [6]. For a target function f and the circuit Cf that calculates f, all players evaluate each gate in Cf on their encrypted inputs, and the evaluations of all the gates in turn lead to the evaluation of f. Based on the mix-and-match protocol, we can easily obtain a secure auction protocol by repeating the millionaires' problem for two players. However, the mix-and-match protocol requires two plaintext equality tests for a two-input one-output gate. Furthermore, one plaintext equality test requires one distributed

I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 149–163, 2010.
© Springer-Verlag Berlin Heidelberg 2010
decryption among the players. Thus, it is important to reduce the number of gates in Cf to achieve function f. Kurosawa and Ogata suggested the "bit-slice auction", an auction protocol that is more efficient than the one based on the millionaires' problem [8]. Boneh, Goh and Nissim suggested a public evaluation system for 2-DNF formulas based on an encryption of Boolean variables [3]. Their protocol is based on Paillier's scheme [12], so it has additive homomorphism in addition to a bilinear map, which allows one multiplication on encrypted values. As a result, this property allows the evaluation of multivariate polynomials of total degree two on encrypted values. In this paper, we introduce bit-slice auction protocols based on the public evaluation of the 2-DNF formula. For the first price auction, the protocol uses no mix-and-match gates. For the second price auction, we use the mix-and-match protocol fewer times than the protocol suggested in [8].

1.2 Related Works
There are many related auction protocols; however, they have problems such as those described hereafter. The first secure auction scheme, proposed by Franklin and Reiter [5], does not provide full privacy, since at the end of an auction players can learn the other players' bids. Naor, Pinkas and Sumner achieved a secure second price auction by combining Yao's secure computation with oblivious transfer, assuming two types of auctioneers [10]. However, the cost of the bidder communication is high because it proceeds bit by bit using the oblivious transfer protocol. Juels and Szydlo improved the efficiency and security of this scheme with two types of auctioneers through verifiable proxy oblivious transfer [7], which still has a security problem: if both types of auctioneers collaborate, they can retrieve all bids. Lipmaa, Asokan and Niemi proposed an efficient (M+1)st price secure auction scheme [9]. The (M+1)st price auction is a type of sealed-bid auction for selling M units of a single kind of goods, in which the (M+1)st highest price is the winning price. The M bidders who bid higher prices than the winning price are the winning bidders, and each winning bidder buys one unit of the goods at the (M+1)st winning price. In this scheme, the trusted auction authority can learn the bid statistics. Abe and Suzuki suggested a secure auction scheme for the (M+1)st price auction based on homomorphic encryption [1]. However, in their scheme a player's bid is not in binary expression, so its time complexity is O(m·2^k) for an m-player, k-bit bidding price auction. Tamura, Shiotsuki and Miyaji proposed an efficient proxy-auction [14]. This scheme only considers the comparison between two sealed bids, the current highest bid and a new bid; it does not consider multiple players because of the nature of the proxy-auction.

1.3 Our Result
In this paper, we introduce bit-slice auction protocols based on the public evaluation of the 2-DNF formula. For the first price auction, the protocol uses no
mix-and-match gates. For the second price auction, we use the mix-and-match protocol fewer times than that suggested in [8].
2 Preliminaries

2.1 The Model of Auctions and Outline of Auction Protocols
This model involves n players, denoted by P1, P2, ..., Pn, and assumes that there exists a public board. The players agree in advance on the representation of the target function f as a circuit Cf. The aim of the protocol is for the players to compute f(B1, ..., Bn) without revealing any additional information. Its outline is as follows.

1. Input stage: Each Pi (1 ≤ i ≤ n) computes ciphertexts of the bits of Bi, broadcasts them, and proves that each ciphertext represents 0 or 1 by using the zero-knowledge proof technique in [3].
2. Mix-and-match stage: The players blindly evaluate each gate Gj in order.
3. Output stage: After evaluating the last gate GN, the players obtain ON, a ciphertext encrypting f(B1, ..., Bn). They jointly decrypt this ciphertext to reveal the output of function f.

Requirements for the encryption function. Let E be a public-key probabilistic encryption function. We denote the set of encryptions of a plaintext m by E(m) and a particular encryption of m by c ∈ E(m). Function E must satisfy the following properties.

1. Homomorphic property. There exist polynomial-time computable operations, (·)^(-1) and ⊗, such that for a large prime q:
  1. If c ∈ E(m), then c^(-1) ∈ E(-m mod q).
  2. If c1 ∈ E(m1) and c2 ∈ E(m2), then c1 ⊗ c2 ∈ E(m1 + m2 mod q).
For a positive integer a, define a · c = c ⊗ c ⊗ ... ⊗ c (a copies of c).
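These requirements are met by, for example, exponential ElGamal, which is not the scheme used later in the paper but illustrates the homomorphic operations and random re-encryption concretely. The following toy Python sketch uses insecure demonstration parameters (p = 2039) and a single local secret key in place of the threshold decryption.

```python
import random

# Toy exponential ElGamal over the order-q subgroup of Z_p*;
# illustrative and insecure, with no threshold key sharing.
p, q, g = 2039, 1019, 4      # g = 2^2 generates the subgroup of prime order q

x = random.randrange(1, q)   # private key (would be shared in the protocol)
h = pow(g, x, p)             # public key

def enc(m):
    r = random.randrange(1, q)
    return (pow(g, r, p), pow(g, m, p) * pow(h, r, p) % p)

def hom_mul(c1, c2):         # the (x) operation: E(m1) (x) E(m2) in E(m1+m2)
    return (c1[0] * c2[0] % p, c1[1] * c2[1] % p)

def hom_inv(c):              # the inverse operation: E(m)^-1 in E(-m mod q)
    return (pow(c[0], -1, p), pow(c[1], -1, p))

def reencrypt(c):            # random re-encryption: fresh randomness, same m
    return hom_mul(c, enc(0))

def dec(c):                  # brute-force discrete log; fine for tiny m
    gm = c[1] * pow(pow(c[0], x, p), -1, p) % p
    for m in range(q):
        if pow(g, m, p) == gm:
            return m
    raise ValueError("not in message space")
```

Here dec(hom_mul(enc(3), enc(5))) recovers 3 + 5, and dec(hom_inv(enc(3))) recovers -3 mod q.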
2. Random re-encryption. Given c ∈ E(m), there is a probabilistic re-encryption algorithm that outputs c′ ∈ E(m), where c′ is uniformly distributed over E(m).
3. Threshold decryption. For a given ciphertext c ∈ E(m), any t out of the n players can decrypt c along with a zero-knowledge proof of correctness. However, no t−1 out of the n players can decrypt c.

MIX protocol. The MIX protocol [4] takes a list of ciphertexts (ξ1, ..., ξL) and outputs a permuted and re-encrypted list of the ciphertexts (ξ′1, ..., ξ′L) without revealing the relationship between (ξ1, ..., ξL) and (ξ′1, ..., ξ′L), where ξi or ξ′i can be a single ciphertext c, or a list of l ciphertexts (c1, ..., cl) for some l > 1. So that all players can verify the validity of (ξ′1, ..., ξ′L), we use the universally verifiable MIX net protocol described in [13].
Plaintext equality test. Given two ciphertexts c1 ∈ E(v1) and c2 ∈ E(v2), this protocol checks whether v1 = v2. Let c0 = c1 ⊗ c2^(-1).

1. (Step 1) Each player Pi (i = 1, ..., n) chooses a random element ai ∈ Z*q and computes zi = ai · c0. He broadcasts zi and proves the validity of zi in zero-knowledge.
2. (Step 2) Let z = z1 ⊗ z2 ⊗ ... ⊗ zn. The players jointly decrypt z using threshold verifiable decryption and obtain the plaintext v. Then it holds that

v = 0 if v1 = v2, and v is random otherwise.
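The two steps can be sketched on the same toy exponential-ElGamal scheme as before (illustrative, insecure parameters); the joint threshold decryption of z is collapsed here into a single secret-key check of whether z encrypts 0.

```python
import random

# Sketch of the plaintext equality test over toy exponential ElGamal.
p, q, g = 2039, 1019, 4
x = random.randrange(1, q)           # secret key (shared in the real protocol)
h = pow(g, x, p)

def enc(m):
    r = random.randrange(1, q)
    return (pow(g, r, p), pow(g, m, p) * pow(h, r, p) % p)

def hom_mul(c1, c2):
    return (c1[0] * c2[0] % p, c1[1] * c2[1] % p)

def hom_inv(c):
    return (pow(c[0], -1, p), pow(c[1], -1, p))

def scalar(a, c):                    # a . c, i.e. c combined with itself a times
    return (pow(c[0], a, p), pow(c[1], a, p))

def dec_is_zero(c):                  # stands in for the joint decryption of z
    return c[1] * pow(pow(c[0], x, p), -1, p) % p == 1

def equality_test(c1, c2, n_players=3):
    c0 = hom_mul(c1, hom_inv(c2))    # encrypts v1 - v2
    zs = [scalar(random.randrange(1, q), c0) for _ in range(n_players)]
    z = zs[0]
    for zi in zs[1:]:
        z = hom_mul(z, zi)           # encrypts (a1+...+an)(v1 - v2)
    return dec_is_zero(z)            # True iff v1 == v2 (w.h.p.)
```

As in the protocol, z encrypts (a1 + ... + an)(v1 − v2), which is 0 exactly when v1 = v2 (and random otherwise, except with negligible probability).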
Mix-and-match stage. For each logical gate G(x1, x2) of a given circuit, the n players jointly compute E(G(x1, x2)) from c1 ∈ E(x1) and c2 ∈ E(x2), keeping x1 and x2 secret. For simplicity, we show the mix-and-match stage for an AND gate.

1. The n players first consider the standard encryption of each entry in the table shown below.
2. By applying a MIX protocol to the four rows of the table, the n players jointly compute blinded and permuted rows of the table. Let the ith row be (ai, bi, ci) for i = 1, ..., 4.
3. The n players next jointly find the row i such that the plaintext of c1 is equal to that of ai and the plaintext of c2 is equal to that of bi, using the plaintext equality test protocol.
4. For that row i, it holds that ci ∈ E(x1 ∧ x2).

Table 1. Mix-and-match table for AND

x1          x2          x1 ∧ x2
a1 ∈ E(0)   b1 ∈ E(0)   c1 ∈ E(0)
a2 ∈ E(0)   b2 ∈ E(1)   c2 ∈ E(0)
a3 ∈ E(1)   b3 ∈ E(0)   c3 ∈ E(0)
a4 ∈ E(1)   b4 ∈ E(1)   c4 ∈ E(1)
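The table-lookup flow of the four steps can be sketched as follows. Here the "ciphertexts" are mock objects carrying their plaintext plus a nonce, so the MIX re-encryption and the plaintext equality tests are simulated rather than cryptographically realized; only the protocol structure is illustrated.

```python
import random

# Protocol-flow sketch of mix-and-match evaluation of an AND gate.
class Ct:
    """Mock ciphertext: plaintext plus a fresh random nonce."""
    def __init__(self, m):
        self.m, self.nonce = m, random.random()

def reencrypt(c):
    return Ct(c.m)                 # same plaintext, fresh randomness

def eq_test(c1, c2):
    return c1.m == c2.m            # stands in for the joint plaintext equality test

def mix_and_match_and(c1, c2):
    # Step 1: standard encryptions of the four table rows.
    table = [(Ct(a), Ct(b), Ct(a & b)) for a in (0, 1) for b in (0, 1)]
    # Step 2: MIX -- permute and re-encrypt the rows.
    random.shuffle(table)
    table = [tuple(reencrypt(c) for c in row) for row in table]
    # Step 3: find the matching row via plaintext equality tests.
    for a_i, b_i, c_i in table:
        if eq_test(c1, a_i) and eq_test(c2, b_i):
            return c_i             # Step 4: c_i encrypts x1 AND x2
    raise RuntimeError("no matching row")
```

Note that each gate evaluation costs up to eight plaintext equality tests here (two per row); the paper's efficiency argument is precisely about avoiding this cost.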
2.2 Bit-Slice Auction Circuit

We introduce an efficient auction circuit called the bit-slice auction circuit, described in [8]. In this scheme, we assume that only one player bids the highest bidding price, so we do not consider the case where two or more players become winners. Suppose that Bmax = (bmax^(k-1), ..., bmax^(0))_2 is the highest bidding price and the bid of a player i is Bi = (bi^(k-1), ..., bi^(0))_2, where (·)_2 is the binary expression. The proposed circuit first determines bmax^(k-1) by evaluating the most significant bits of all the bids. It next determines bmax^(k-2) by looking at the second most significant bits of all the bids, and so on.
For two m-dimensional binary vectors X = (x1, ..., xm) and Y = (y1, ..., ym), let

X ∧ Y = (x1 ∧ y1, ..., xm ∧ ym).

Let Dj be the highest price when considering the upper j bits of the bids. That is,

D1 = (bmax^(k-1), 0, ..., 0)_2
D2 = (bmax^(k-1), bmax^(k-2), 0, ..., 0)_2
...
Dk = (bmax^(k-1), ..., bmax^(0))_2

In the jth round, we find bmax^(k-j) and eliminate each player Pi whose bid satisfies Bi < Dj. For example, in the case of j = 1, a player i is eliminated if his bid satisfies Bi < D1. By repeating this operation for j = 1 to k, the bidder remaining at the end is the winner. For this purpose, we update W = (w1, ..., wm) such that

wi = 1 if Bi ≥ Dj, and wi = 0 otherwise

for j = 1 to k. The circuit is obtained by implementing the following algorithm. For the given m bids B1, ..., Bm, Vj is defined as

Vj = (b1^(j), ..., bm^(j))

for j = 0, ..., k−1; that is, Vj is the vector consisting of the (j+1)th lowest bit of each bid. Let W = (w1, ..., wm), where each wi = 1 initially. For j = k−1 down to 0, perform the following.

(Step 1) For W = (w1, ..., wm), let

Sj = W ∧ Vj = (w1 ∧ b1^(j), ..., wm ∧ bm^(j))
bmax^(j) = (w1 ∧ b1^(j)) ∨ ... ∨ (wm ∧ bm^(j)).

(Step 2) If bmax^(j) = 1, then let W = Sj.

Then the highest price is obtained as Bmax = (bmax^(k-1), ..., bmax^(0))_2. Let the final W be (w1, ..., wm). Then Pi is the winner if and only if wi = 1. We summarize the algorithm as the following theorem.

Theorem 1. [8] In the bit-slice auction above,
- Bmax is the highest bidding price.
- For the final W = (w1, ..., wm), Pi is a winner if and only if wi = 1, and Pi is the only player who bids the highest price Bmax.
2.3 Evaluating 2-DNF Formulas on Ciphertexts
Given encrypted Boolean variables x1, ..., xn ∈ {0, 1}, a mechanism for the public evaluation of a 2-DNF formula was suggested in [3]. The authors presented a homomorphic public-key encryption scheme based on finite groups of composite order that support a bilinear map. The bilinear map allows one multiplication on encrypted values; as a result, their system supports arbitrary additions and one multiplication on encrypted data. This property in turn allows the evaluation of multivariate polynomials of total degree two on encrypted values.

Bilinear groups. Their construction makes use of certain finite groups of composite order that support a bilinear map. We use the following notation.
1. G and G1 are two (multiplicative) cyclic groups of finite order n.
2. g is a generator of G.
3. e is a bilinear map e : G × G → G1.

Subgroup decision assumption. We define an algorithm G that, given a security parameter τ ∈ Z+, outputs a tuple (q1, q2, G, G1, e), where G and G1 are groups of order n = q1 q2 and e : G × G → G1 is a bilinear map. On input τ, algorithm G works as indicated below.
1. Generate two random τ-bit primes q1 and q2 and set n = q1 q2 ∈ Z.
2. Generate a bilinear group G of order n as described above. Let g be a generator of G and e : G × G → G1 be the bilinear map.
3. Output (q1, q2, G, G1, e).
We note that the group actions in G and G1 as well as the bilinear map can be computed in polynomial time. Let τ ∈ Z+ and let (q1, q2, G, G1, e) be a tuple produced by G, where n = q1 q2. Consider the following problem: given (n, G, G1, e) and an element x ∈ G, output '1' if the order of x is q1 and output '0' otherwise; that is, without knowing the factorization of the group order n, decide whether an element x is in a subgroup of G. We refer to this problem as the subgroup decision problem.

Homomorphic public key system. We now describe the public key system of [3], which resembles the Paillier [12] and the Okamoto-Uchiyama [11] encryption schemes.
We describe the three algorithms comprising the system.
1. KeyGen. Given a security parameter τ ∈ Z, run G to obtain a tuple (q1, q2, G, G1, e). Let n = q1 q2. Select two random generators g, u ←R G and set h = u^q2. Then h is a random generator of the subgroup of G of order q1. The public key is PK = (n, G, G1, e, g, h). The private key is SK = q1.
2. Encrypt(PK, M). We assume that the message space consists of integers in the set {0, 1, ..., T} with T < q2. In our main application we encrypt the binary representation of bids, in which case T = 1. To encrypt a message m using public key PK, select a random number r ∈ {0, 1, ..., n − 1} and compute
Efficient Secure Auction Protocols
155
C = g^m h^r ∈ G. Output C as the ciphertext.
3. Decrypt(SK, C). To decrypt a ciphertext C using the private key SK = q1, observe that C^q1 = (g^m h^r)^q1 = (g^q1)^m. Let ĝ = g^q1. To recover m, it suffices to compute the discrete log of C^q1 base ĝ.

Homomorphic properties. The system is clearly additively homomorphic. Let (n, G, G1, e, g, h) be a public key. Given encryptions C1, C2 ∈ G of messages m1, m2 ∈ {0, 1, ..., T} respectively, anyone can create a uniformly distributed encryption of m1 + m2 mod n by computing the product C = C1 C2 h^r for a random number r ∈ {0, 1, ..., n − 1}. More importantly, anyone can multiply two encrypted messages once using the bilinear map. Set g1 = e(g, g) and h1 = e(g, h). Then g1 is of order n and h1 is of order q1. Also, write h = g^(α q2) for some (unknown) α ∈ Z. Suppose we are given two ciphertexts C1 = g^m1 h^r1 ∈ G and C2 = g^m2 h^r2 ∈ G. To build an encryption of the product m1 · m2 mod n given only C1 and C2, 1) select a random r ∈ Zn, and 2) set C = e(C1, C2) h1^r ∈ G1. Then

C = e(C1, C2) h1^r = e(g^m1 h^r1, g^m2 h^r2) h1^r = g1^(m1 m2) h1^(m1 r2 + r1 m2 + q2 r1 r2 α + r) = g1^(m1 m2) h1^r̃ ∈ G1,

where r̃ = m1 r2 + r1 m2 + q2 r1 r2 α + r is distributed uniformly in Zn as required. Thus, C is a uniformly distributed encryption of m1 m2 mod n, but in the group G1 rather than G (this is why we allow for just one multiplication). We note that the system is still additively homomorphic in G1. For simplicity, in this paper we denote an encryption of a message m in G as EG(m) and one in G1 as EG1(m).

2.4 Key Sharing
In [2], efficient protocols are presented for a number of players to jointly generate an RSA modulus N = pq, where p and q are prime, such that each player retains a share of N and none of the players knows the factorization of N. They then show how the players can proceed to compute a public exponent e and shares of the corresponding private exponent. At the end of the computation, the players are convinced by a zero-knowledge proof that N is a product of two large primes. Their protocol is based on m-out-of-m threshold decryption, i.e., all m players are needed to decrypt. The cost of key generation for the shared RSA private key is approximately 11 times greater than that for simple RSA key generation, but the computation cost is still practical. We use this protocol to share private keys among the auction managers.
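As a sanity check, the additive part of the scheme of Sect. 2.3 can be exercised with toy parameters. The sketch below is illustrative only: the group sizes are insecurely small, the bilinear map (and hence the single allowed multiplication) is omitted, and the order-n subgroup is simulated inside Z_p^* for a prime p with n | p − 1.

```python
# Toy illustration (NOT secure) of the additive part of the BGN-style scheme.
# q1 = 11, q2 = 13, n = q1*q2 = 143, p = 859 (n divides p - 1 = 858).
import random

q1, q2 = 11, 13
n = q1 * q2          # composite group order; its factorization is the secret
p = 859              # prime with n | p - 1, so Z_p^* has a subgroup of order n

def element_of_order(d):
    """Find an element of exact order d in Z_p^* (d = n here)."""
    while True:
        x = pow(random.randrange(2, p), (p - 1) // d, p)
        # x has order dividing d; accept only if no proper divisor works
        if x != 1 and all(pow(x, d // f, p) != 1 for f in (q1, q2) if d % f == 0):
            return x

g = element_of_order(n)      # generator of the order-n subgroup
u = element_of_order(n)
h = pow(u, q2, p)            # h = u^q2 has order q1

def encrypt(m):
    r = random.randrange(n)
    return (pow(g, m, p) * pow(h, r, p)) % p     # C = g^m h^r

def decrypt(C, T=10):
    ghat = pow(g, q1, p)                          # g-hat = g^q1
    Cq1 = pow(C, q1, p)                           # C^q1 = (g^q1)^m
    for m in range(T + 1):                        # discrete log by brute force
        if pow(ghat, m, p) == Cq1:
            return m
    raise ValueError("message out of range")

c1, c2 = encrypt(3), encrypt(4)
csum = (c1 * c2 * pow(h, random.randrange(n), p)) % p   # re-randomized sum
print(decrypt(c1), decrypt(c2), decrypt(csum))          # 3 4 7
```

Decryption recovers m only up to the order of ĝ (here 13), which is why the message space {0, ..., T} must be kept small relative to q1.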
3 New Efficient Auction Protocol
In this section, we show bit-slice auction protocols based on the evaluation of multivariate polynomials of total degree two on encrypted values. For
156
T. Mitsunaga, Y. Manabe, and T. Okamoto
the first price auction, we compose a secure auction protocol using only 2-DNF formulas on encrypted bits (we no longer need the mix-and-match protocol). For the second price auction, on the other hand, we still need to use the mix-and-match protocol several times.

3.1 First Price Auction Using 2-DNF Scheme
We assume n players, P1, ..., Pn, and a set of auction managers, AM. The players bid their encrypted prices, and through the protocol they publish encrypted flags indicating whether they are still in the auction. The AM jointly decrypts the results of the protocol. Players find the highest price through the protocol and the winner by decrypting the results.

Setting. The AM jointly generate and share private keys among themselves using the technique described in [2].

Bidding phase. Each player Pi computes a ciphertext of his bidding price Bi as ENCi = (ci,k−1, ..., ci,0), where ci,j ∈ EG(bi^(j)), and publishes ENCi on the bulletin board. He also proves in zero-knowledge that bi^(j) = 0 or 1 by using the technique described in [3].

Opening phase. Suppose that c1 = g^b1 h^r1 ∈ EG(b1) and c2 = g^b2 h^r2 ∈ EG(b2), where b1, b2 are binary and r1, r2 ∈ Z*n are random numbers. We define two polynomial-time computable operations, Mul and ⊗, by applying a 2-DNF formula for AND and OR, respectively:

Mul(c1, c2) = e(c1, c2) = e(g^b1 h^r1, g^b2 h^r2) ∈ EG1(b1 ∧ b2)
c1 ⊗ c2 = g^b1 h^r1 · g^b2 h^r2 = g^(b1+b2) h^(r1+r2) ∈ EG(b1 + b2)

The AM generates W = (w1, ..., wm), where each wj = 1, and encrypts it as W̃ = (w̃1, ..., w̃m). The AM shows that W̃ is the encryption of (1, ..., 1) with the verification protocols.
(Step 1) For j = k − 1 down to 0, perform the following.
(Step 1-a) For W̃ = (w̃1, ..., w̃m), the AM computes si,j = Mul(w̃i, ci,j) for each player i, and
Sj = (Mul(w̃1, c1,j), ..., Mul(w̃m, cm,j))
hj = Mul(w̃1, c1,j) ⊗ ··· ⊗ Mul(w̃m, cm,j)
(Step 1-b) The AM performs a plaintext equality test on whether hj is an encryption of 0. If it is, the AM publishes 0 as the value of bmax^(j) and proves it with the verification protocols; otherwise, the AM publishes 1 as the value of bmax^(j).
(Step 1-c) If bmax^(j) = 1, then each player creates a new encryption w̃i that has the same plaintext value as si,j; otherwise he uses the old w̃i for the next bit. In addition, the player shows the validity of the computation with a zero-knowledge proof.
(Step 2) For the final W̃ = (w̃1, ..., w̃m), the AM decrypts each w̃i with the verification protocols and obtains the plaintext wi. The highest price is obtained as Bmax = (bmax^(k−1), ..., bmax^(0))_2. Pi is a winner if and only if wi = 1.

3.2 Second Price Auction Using 2-DNF Scheme and Mix-and-Match Protocol
In the second price auction, the only information players learn is the second highest price and the bidder of the highest price. To keep the highest bid secret throughout the protocol, we still need the mix-and-match protocol; however, we can reduce the number of times it is used, so the proposed protocol is more efficient than that in [8]. Here, we define three new tables, Selectm, MAP1, and MAP2, for the second price auction. In the proposed protocol, the MAP1 and MAP2 tables are created among the AM before an auction, whereas Selectm is created during the protocol, corresponding to the players' inputs. The AM jointly compute the values in a mix-and-match table by distributed decryption in plaintext equality tests. Table Selectm is also used in the second price auction protocol in [8]; MAP1 and MAP2 are new tables that we propose. Given a message m, MAP1 and MAP2 are tables for mapping an encrypted value a1 ∈ EG1(m) (which is an output of a computation with one multiplication) to a2 ∈ EG(m). Table Selectm has 2k + 1 input bits and k output bits as follows.

Selectm(b, x^(k−1), ..., x^(0), y^(k−1), ..., y^(0)) = (x^(k−1), ..., x^(0)) if b = 1, and (y^(k−1), ..., y^(0)) otherwise.

For two encrypted input vectors (x^(k−1), ..., x^(0)) and (y^(k−1), ..., y^(0)), b is an encryption of the check bit that selects which vector to output. For secure computation, the AM re-encrypts the output vector. In the proposed protocol, the Selectm table is created through the auction to update W̃ according to an input value E(bj). The function of table MAP1, shown in Table 2, is the mapping x1 ∈ {EG1(0), EG1(1)} → x2 ∈ {EG(0), EG(1)}. Table MAP2, shown in Table 3, realizes the mapping x1 ∈ {EG1(0), EG1(1), ..., EG1(m)} → x2 ∈ {EG(0), EG(1)}. These tables can be constructed using the mix-and-match protocol because the Boneh-Goh-Nissim encryption has homomorphic properties. The setting and bidding phases are the same as those for the first price auction, so we start from the opening phase.
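On plaintexts, the input/output relations that these three tables realize are simple. The following sketch (with hypothetical helper names of ours) shows what Selectm, MAP1, and MAP2 compute, ignoring encryption and re-encryption:

```python
# Plaintext semantics of the three mix-and-match tables (helper names are
# hypothetical; in the protocol these act on ciphertexts).

def select_m(b, x_bits, y_bits):
    """Selectm: output x_bits if the check bit b is 1, else y_bits."""
    return list(x_bits) if b == 1 else list(y_bits)

def map1(x):
    """MAP1: re-encode a bit from G1-level to G-level (identity on {0, 1})."""
    assert x in (0, 1)
    return x

def map2(x, m):
    """MAP2: map {0, ..., m} to {0, 1}: 0 -> 0, any positive value -> 1."""
    assert 0 <= x <= m
    return 0 if x == 0 else 1

print(select_m(1, [1, 0, 1], [0, 0, 0]))   # [1, 0, 1]
print(map2(3, 5))                          # 1
```

MAP2 is what turns the sum hj (a count of still-active bidders with bit 1) into the single published bit bj.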
Table 2. Table for MAP1

x1             x2
a1 ∈ EG1(0)    b1 ∈ EG(0)
a2 ∈ EG1(1)    b2 ∈ EG(1)

Table 3. Table for MAP2

x1             x2
a1 ∈ EG1(0)    b1 ∈ EG(0)
a2 ∈ EG1(1)    b2 ∈ EG(1)
···            bi ∈ EG(1)
am+1 ∈ EG1(m)  bm+1 ∈ EG(1)
Opening phase. Let W̃ = (w̃1, ..., w̃m), where each w̃j ∈ EG(1) as shown above.
(Step 1) For j = k − 1 down to 0, perform the following.
(Step 1-a) For W̃ = (w̃1, ..., w̃m), the AM computes si,j = Mul(w̃i, ci,j) for each player i, and
Sj = (Mul(w̃1, c1,j), ..., Mul(w̃m, cm,j))
hj = Mul(w̃1, c1,j) ⊗ ··· ⊗ Mul(w̃m, cm,j)
(Step 1-b) The AM uses table MAP1 on si,j for each i and finds the values s̃i,j. Let S̃j = (s̃1,j, ..., s̃m,j). The AM also uses table MAP2 with hj as the input value; by using this table, the AM retrieves E(bj) ∈ EG(0) if hj is a ciphertext of 0, and otherwise retrieves E(bj) ∈ EG(1).
(Step 1-c) The AM creates the table Selectm with input values (E(bj), S̃j, W̃). The AM executes W̃ = Selectm(E(bj), S̃j, W̃); that is, if E(bj) is the encryption of 1, W̃ is updated to S̃j.
(Step 2) For the final W̃ = (w̃1, ..., w̃m), the AM decrypts each w̃i with the verification protocols and obtains the plaintext wi. Pi is the winner if and only if wi = 1. The AM removes the player who bid the highest price and runs the first price auction protocol again. The second highest price is obtained as Bmax = (bmax^(k−1), ..., bmax^(0))_2.

Verification protocols. Verification protocols allow the players to confirm that the AM decrypts ciphertexts correctly; with them, each player can verify that the results of the auction are correct. We denote by b a plaintext and by C a BGN encryption of b (C = g^b h^r), where g, h, and r are the elements used in the BGN scheme, and let f = C(g^b)^(−1). Before a player verifies whether b is the plaintext of C, the player must prove, with a zero-knowledge proof that he knows the value of x, that the challenge ciphertext C' = g^x f^r' was created by himself.
1. A player proves that he has a random element x ∈ Z*n with a zero-knowledge proof.
2. The player computes f = C(g^b)^(−1) from the published values h, g, and b, and selects a random integer r' ∈ Z*n. He sends C' = g^x f^r' to the AM.
3. The AM decrypts C' and sends the value x' to the player.
4. The player verifies whether x = x'.
The AM can decrypt C' correctly only if order(f) = q1, which means that the AM correctly decrypted C and published b as the plaintext of C.

3.3 Security
1. Privacy of bidding prices. No player can retrieve any information except the winner and the highest price (first price auction) or the second highest price (second price auction). An auction scheme is secure if there is no polynomial-time adversary that breaks privacy with non-negligible advantage ε(τ). We prove the privacy of bidding prices in the proposed auction protocols under the assumption that the BGN encryption is semantically secure in the presence of a mix-and-match oracle. Given a message m, the mix-and-match oracle receives an encrypted value x1 ∈ EG1(m) and returns the encrypted value x2 ∈ EG(m) according to the mix-and-match table shown in Table 3 (which has the same function as MAP2). That is, given a message m and a ciphertext x1 ∈ EG1(m), the function of the mix-and-match table is to map x1 ∈ EG1(m) → x2 ∈ EG(m). The range of the input value is supposed to be {0, 1, ..., m} and the range of the output is {0, 1}; we do not consider cases where the input values are out of range. Using this mix-and-match oracle, an adversary can compute any logical function, without the restriction that the BGN encryption scheme allows only one multiplication on encrypted values. So, an adversary can calculate Selectm(b, x^(k−1), ..., x^(0), y^(k−1), ..., y^(0)) = b(x^(k−1), ..., x^(0)) + (1 − b)(y^(k−1), ..., y^(0)) with additional polynomial computation. MAP1 can also be computed if the range of the input value is restricted to {0, 1}. Here, we define two semantic security games and advantages, for the BGN encryption scheme and for the proposed auction protocols, and show that if there is an adversary B that breaks the proposed auction protocol, we can compose an adversary A that breaks the semantic security of the BGN encryption with the mix-and-match oracle by using B.
Definition 1. Let Π = (KeyGen, Encrypt, Decrypt) be a BGN encryption scheme, and let A^O1 = (A1^O1, A2^O1) be a probabilistic polynomial-time algorithm that can use the mix-and-match oracle O1. Then

BGN-Adv(τ) = Pr[EXPT_{A,Π}(τ) ⇒ 1] − 1/2,

where EXPT_{A,Π} is the semantic security game of the BGN encryption scheme with the mix-and-match oracle shown in Fig. 1. We then define an adversary B for an auction protocol and an advantage for B.
(PK, SK) ← KeyGen
(m0, m1, s) ← A1^O1(PK)
b ← {0, 1}
c ← Encrypt(PK, mb)
b' ← A2^O1(c, s)
return 1 iff b = b'

Fig. 1. EXPT_{A,Π}
(PK, SK) ← KeyGen
(b1, b2, ..., bm−1, bm0, bm1, s) ← B1(PK)
b ← {0, 1}
c ← (Encrypt(PK, b1), Encrypt(PK, b2), ..., Encrypt(PK, bm−1), Encrypt(PK, bmb))
execute the auction protocols using c as the players' bids
b' ← B2(c, s)
return 1 iff b = b'

Fig. 2. EXPT_{B,Π}
Definition 2. Let Π = (KeyGen, Encrypt, Decrypt) be a BGN encryption scheme, and let B = (B1, B2) be a pair of probabilistic polynomial-time algorithms. Then

Auction-Adv(τ) = Pr[EXPT_{B,Π} ⇒ 1] − 1/2,

where EXPT_{B,Π} is the semantic security game for the privacy of the auction protocol shown in Fig. 2. First, B1 generates k-bit integers b1, b2, ..., bm−1 as plaintexts of the bidding prices of players 1 to m − 1, and two challenge k-bit integers bm0 and bm1, where bm0 and bm1 agree in every bit except the i-th bits mi0 and mi1. We assume that bm0 and bm1 are neither the first price bid in a first price auction nor the second highest price in a second price auction. Then the auction is executed with (Encrypt(PK, b1), Encrypt(PK, b2), ..., Encrypt(PK, bm−1), Encrypt(PK, bmb)) as the players' encrypted bidding prices, where b ←R {0, 1}. After the auction, B2 outputs b' ∈ {0, 1} as a guess for b. B wins if b = b'.

Theorem 2. The privacy of the auction protocols holds under the assumption that the BGN encryption is semantically secure with a mix-and-match oracle.

We show that if there is an adversary B that breaks the security of the proposed auction protocol, we can compose an adversary A that breaks the semantic security of the BGN encryption with the mix-and-match oracle. A receives the two challenge k-bit integers bm0 and bm1 from B, and then A uses mi0
and mi1 as challenge bits for the challenger of the BGN encryption. A then receives Encrypt(PK, mib) and executes a secure auction protocol with the mix-and-match table. In the auction, when decrypted values are needed, A can calculate them, since he knows all the input values b1, b2, ..., bm−1 except the i-th bit of bmb. Through the protocol, B observes the calculation of the encrypted bids and the results of the auction. After the auction, B outputs b', the guess for b, and A outputs the same b' as its own guess for bmb. If B can break the privacy of the bidding prices in the proposed auction protocol with advantage ε(τ), then A can break the semantic security of the BGN encryption with the same advantage.

2. Correctness. For correct players' inputs, the protocol outputs the correct winner and price. By Theorem 1, introduced in Section 1.4, the bit-slice auction protocol obviously satisfies correctness.

3. Verification of the evaluation. To verify that the protocol works, players need to check that the AM correctly decrypts the evaluations of the circuit on ciphertexts throughout the protocol. We use the verification protocols introduced above so that each player can verify that the protocol is computed correctly.
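The correctness claim can be sanity-checked on plaintexts: running the bit-slice procedure of Sect. 3.1 on cleartext bids must yield the highest bid and its bidder, and, per Sect. 3.2, re-running it without the winner yields the second highest price. A minimal sketch (plaintext only, no cryptography):

```python
# Plaintext simulation of the bit-slice search: k-bit bids, most significant
# bit first; h_j is nonzero <=> some still-active bidder has bit j set.

def bit_slice_max(bids, k):
    """Return (highest bid, set of indices of its bidders)."""
    alive = set(range(len(bids)))
    best = 0
    for j in reversed(range(k)):                     # j = k-1 down to 0
        with_bit = {i for i in alive if (bids[i] >> j) & 1}
        if with_bit:                                 # h_j encrypts nonzero
            best |= 1 << j                           # publish b_max^(j) = 1
            alive = with_bit                         # keep bidders with bit 1
    return best, alive

bids = [5, 9, 3, 7]
first, winners = bit_slice_max(bids, 4)              # first price
rest = [b for i, b in enumerate(bids) if i not in winners]
second, _ = bit_slice_max(rest, 4)                   # re-run without winner
print(first, sorted(winners), second)                # 9 [1] 7
```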
4 Comparison of Auction Protocols

4.1 First Price Auction
The protocol proposed in [8] requires mk AND computations to calculate Sj = (Mul(w̃1, c1,j), ..., Mul(w̃m, cm,j)) for j = k − 1 down to 0, and k plaintext equality tests to check whether bmax^(j) is a ciphertext of 0. One AND computation requires two plaintext equality tests, so the total number of plaintext equality tests is 2mk + k. The proposed protocol, in contrast, no longer uses the mix-and-match protocol and is based only on the 2-DNF scheme, so Sj can be computed by addition and multiplication of ciphertexts. It requires only the k plaintext equality tests that check bmax^(j). A comparison between the proposed protocol and that in [8] is shown in Table 4.

4.2 Second Price Auction
In the second price auction protocol, the protocol in [8] requires (2m − 1)k AND, (m − 2)k OR, and k Selectm gates. One OR gate requires two plaintext equality

Table 4. Number of PET in the first price auction

           AND   PET   Total PET (approx.)
[KO02]     mk    k     2mk + k
Proposed   0     k     k
Table 5. Number of PET in the second price auction

           AND        OR        Selectm  MAP1  MAP2  PET  Total PET (approx.)
[KO02]     (2m − 1)k  (m − 2)k  k        0     0     0    6mk − 5k
Proposed   0          0         k        mk    k     k    3/2mk + 3k
tests, and Selectm requires one test to check whether b is a ciphertext of 1, so in total approximately 6mk − 5k plaintext equality tests are required. The proposed protocol, on the other hand, uses MAP1 mk times and MAP2 k times. MAP1 requires one plaintext equality test, used to check whether the input value is a ciphertext of 0 or 1. The range of the input value of table MAP2 is m + 1 (from 0 to m), and one plaintext equality test is used for each row of the mix-and-match table, so MAP2 requires approximately m/2 + 1 tests on average. The protocol also requires k plaintext equality tests to decide the second highest price among the remaining players other than the winner. In total, the cost of the proposed protocol is approximately 3/2mk + 3k plaintext equality tests. A comparison between the proposed protocol and that in [8] is shown in Table 5; in the second price auction, too, we reduce the number of plaintext equality test executions.
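The approximate counts in Tables 4 and 5 can be reproduced mechanically. The helper names below are ours; the formulas are the ones derived in the text:

```python
# Plaintext-equality-test (PET) counts for m bidders and k-bit prices,
# using the approximate formulas derived in the text.

def pet_first_price(m, k):
    ko02 = 2 * m * k + k      # mk AND gates at 2 PETs each, plus k checks
    proposed = k              # only the k checks of b_max^(j)
    return ko02, proposed

def pet_second_price(m, k):
    # [KO02]: (2m-1)k AND and (m-2)k OR gates at 2 PETs each, k Selectm at 1.
    ko02 = 2 * (2 * m - 1) * k + 2 * (m - 2) * k + k          # = 6mk - 5k
    # Proposed: mk uses of MAP1 at 1 PET, k uses of MAP2 at ~m/2 + 1 PETs,
    # k Selectm checks, and k final PETs for the second-highest price.
    proposed = m * k + k * (m / 2 + 1) + k + k                # = 3/2 mk + 3k
    return ko02, proposed

m, k = 10, 8
print(pet_first_price(m, k))     # (168, 8)
print(pet_second_price(m, k))    # (440, 144.0)
```

For m = 10 bidders and k = 8 bit prices, the proposed second price protocol needs roughly a third of the PETs of [KO02], matching the claimed improvement.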
5 Conclusion
We introduced new efficient auction protocols based on the BGN encryption and showed that they are approximately twice as efficient as those proposed in [8]. As a topic of future work, we will try to compose a secure second price auction protocol without using the mix-and-match protocol.
References

1. Abe, M., Suzuki, K.: M+1st price auction using homomorphic encryption. In: Naccache, D., Paillier, P. (eds.) PKC 2002. LNCS, vol. 2274, pp. 115–124. Springer, Heidelberg (2002)
2. Boneh, D., Franklin, M.: Efficient generation of shared RSA keys. Invited paper, Public Key Cryptography 1998 (PKC 1998). LNCS, vol. 1431, pp. 1–13. Springer, Heidelberg (1998)
3. Boneh, D., Goh, E., Nissim, K.: Evaluating 2-DNF formulas on ciphertexts. In: Kilian, J. (ed.) TCC 2005. LNCS, vol. 3378, pp. 325–341. Springer, Heidelberg (2005)
4. Chaum, D.: Untraceable electronic mail, return addresses, and digital pseudonyms. Communications of the ACM, 84–88 (1981)
5. Franklin, M.K., Reiter, M.K.: The design and implementation of a secure auction service. IEEE Transactions on Software Engineering 22(5), 302–312 (1995)
6. Jakobsson, M., Juels, A.: Mix and match: Secure function evaluation via ciphertexts. In: Okamoto, T. (ed.) ASIACRYPT 2000. LNCS, vol. 1976, pp. 162–177. Springer, Heidelberg (2000)
7. Juels, A., Szydlo, M.: A two-server sealed-bid auction protocol. In: Blaze, M. (ed.) FC 2002. LNCS, vol. 2357, pp. 72–86. Springer, Heidelberg (2003)
8. Kurosawa, K., Ogata, W.: Bit-slice auction circuit. In: Gollmann, D., Karjoth, G., Waidner, M. (eds.) ESORICS 2002. LNCS, vol. 2502, pp. 24–38. Springer, Heidelberg (2002)
9. Lipmaa, H., Asokan, N., Niemi, V.: Secure Vickrey auctions without threshold trust. In: Blaze, M. (ed.) FC 2002. LNCS, vol. 2357, pp. 87–101. Springer, Heidelberg (2003)
10. Naor, M., Pinkas, B., Sumner, R.: Privacy preserving auctions and mechanism design. In: Proceedings of the 1st ACM Conference on Electronic Commerce (ACM-EC), pp. 129–139. ACM Press, New York (1999)
11. Okamoto, T., Uchiyama, S.: A new public-key cryptosystem as secure as factoring. In: Nyberg, K. (ed.) EUROCRYPT 1998. LNCS, vol. 1403, pp. 308–318. Springer, Heidelberg (1998)
12. Paillier, P.: Public-key cryptosystems based on composite degree residuosity classes. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 223–238. Springer, Heidelberg (1999)
13. Park, C., Itoh, K., Kurosawa, K.: All/nothing election scheme and anonymous channel. In: Helleseth, T. (ed.) EUROCRYPT 1993. LNCS, vol. 765, pp. 248–259. Springer, Heidelberg (1994)
14. Tamura, Y., Shiotsuki, T., Miyaji, A.: Efficient proxy-bidding system. IEICE Transactions on Fundamentals J87-A(6), 835–842 (2004)
Hierarchical ID-Based Authenticated Key Exchange Resilient to Ephemeral Key Leakage Atsushi Fujioka, Koutarou Suzuki, and Kazuki Yoneyama NTT Information Sharing Platform Laboratories 3-9-11 Midori-cho Musashino-shi Tokyo 180-8585, Japan {fujioka.atsushi,suzuki.koutarou,yoneyama.kazuki}@lab.ntt.co.jp
Abstract. In real applications of (public key-based) cryptosystems, hierarchical structures are often used to distribute the workload by delegating key generation. However, there are few previous studies of such hierarchical structures in the ID-based authenticated key exchange (AKE) scenario. In this paper, we introduce the first hierarchical ID-based AKE resilient to ephemeral secret key leakage. First, we provide a formal security model for hierarchical ID-based AKE; our model is based on eCK security to guarantee resistance to leakage of ephemeral secret keys. We also propose an eCK-secure hierarchical ID-based AKE protocol based on a hierarchical ID-based encryption scheme. Keywords: hierarchical ID-based AKE, eCK security.
1 Introduction
Authenticated key exchange (AKE) protocols enable each user to establish a common session key secretly and reliably with the intended peer, based on their own static secret keys. Recently, studies on ID-based AKE [7,10] have received much attention because certificates are easier to manage in this setting. In the ID-based AKE scenario, a trusted key generation center (KGC) generates a static secret key from the identity (ID) of each user in key generation, so a user can execute an AKE session without confirming the certificate of the peer, as long as the user knows the peer's ID. However, the simple ID-based setting has a scalability problem. As the number of users increases, the KGC has to handle enormous computations for key generation, verify proofs of each identity, and establish secure channels to transmit secret keys. The workload for the KGC becomes burdensome because there will be substantially fewer KGCs than total users. Thus, previous ID-based AKE protocols do not scale. To give scalability to ID-based cryptosystems in real applications, we can use hierarchical structures. In the hierarchical ID-based setting, each user has a distinct ID that is assigned to a node of a tree. Specifically, the ID of a user U at level t is represented as (ID1, ID2, ..., IDt). A user can generate the static keys of users at levels below him. For example, the user whose ID is (ID1, ID2) can generate the secret key of any user whose ID is (ID1, ID2, ∗), I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 164–180, 2010. © Springer-Verlag Berlin Heidelberg 2010
Hierarchical ID-Based AKE Resilient to Ephemeral Key Leakage
165
where ∗ is a wild-card. Hence, the KGC can delegate key generation to users (i.e., it is enough for the KGC to generate only the master key and public parameters), and this setting is less burdensome. Various hierarchical ID-based primitives have been introduced, such as hierarchical ID-based encryption [9,4,8]. In the ID-based AKE scenario as well, the extension to hierarchical structures is a quite natural direction for scalable real applications. However, to the best of our knowledge, there is no hierarchical ID-based AKE protocol that is resilient to leakage of the master, static, and ephemeral secret keys, and constructing one is an important open question.

1.1 Contribution
In this paper, we make two main contributions: the first hierarchical ID-based security model, and the first construction of a hierarchical ID-based AKE protocol.

Hierarchical ID-based Security Model. In contrast to the security model of the (non-hierarchical) ID-based AKE setting, we have to consider collusion resistance; that is, disclosure of a user's secret must not compromise the secrets of higher-level users. Since we have to consider the influence of disclosure of static or ephemeral secrets in the session, we adopt the extended Canetti-Krawczyk (eCK) security model [12]. In the eCK model, we allow the adversary to obtain the master secret key, static secret keys, and ephemeral secret keys individually. On the other hand, disclosure of a user's secret trivially compromises the secrets of lower-level users. Thus, in order to capture these situations, we have to carefully reconsider the freshness of the session. Freshness is the condition under which the adversary cannot trivially break the secrecy of the session key. For example, collusion resistance is represented in freshness as follows: the adversary cannot obtain any information about the session key between higher-level users, even though the adversary can obtain the static secret key of a lower-level user.

Hierarchical ID-based AKE Protocol. The proposed hierarchical ID-based AKE protocol is based on a hierarchical ID-based encryption scheme [9]. We consider a tree in which the master secret key corresponds to the root node and a static secret key corresponds to a node of the tree. From a static secret key corresponding to a node, one can generate the static secret keys corresponding to the children of that node. The KGC can delegate key generation ability to other entities by providing a static secret key corresponding to a node, and the delegated entity can generate static secret keys corresponding to nodes below him.
Using this delegation property, the KGC can be constructed hierarchically and is scalable to a large number of users. The proposed protocol needs only two passes and is secure in the proposed hierarchical ID-based security model, i.e., it is resilient to ephemeral secret key leakage and collusion. We construct the proposed scheme based on the selective-ID secure HIBE scheme of [9], though there exist fully secure HIBE schemes, e.g., [4]. The reason is that [9] is based on the BDH assumption, and thus the proposed scheme can be constructed on the gap BDH assumption, which is the bilinear version of the gap DH
166
A. Fujioka, K. Suzuki, and K. Yoneyama
assumption and was proposed in [1]. If we adopted the HIBE scheme of [4], we would need a more complicated assumption. There exists prior research on hierarchical ID-based AKE by Moriyama, Doi, and Chao [13]. Compared with [13], which is secure in the CK security model without random oracles, the proposed scheme is secure in the eCK security model with random oracles.

Organization. In Section 2, we define the hierarchical ID-based eCK security model. In Section 3, we propose our hierarchical ID-based AKE protocol, which is secure in the hierarchical ID-based eCK security model. In Section 4, we conclude the paper. In Appendix A, we provide the security proof.

1.2 Related Works
Models for key agreement were introduced by Bellare and Rogaway [2] and by Blake-Wilson, Johnson, and Menezes [3], in the shared- and public-key settings, respectively. Recent developments in two-party certificate-based AKE in the public-key infrastructure (PKI) setting have improved the security models and definitions, i.e., the Canetti-Krawczyk [6] security model and the LaMacchia, Lauter, and Mityagin [12] eCK security model. There are some ID-based AKE protocols; see, for example, [7,14]. Boyd and Choo [5] observe that many existing ID-based protocols are not as secure as we expect them to be. Furthermore, security analyses of ID-based AKE protocols do not formally treat ephemeral private key leakage. The only protocol that formally considers the eCK security model for ID-based AKE is due to Huang and Cao [10]. Recently, a PKI-based version of [10] was proposed in [11].
2 eCK Security Model for Hierarchical ID-Based AKE
In this section, we provide an eCK security model for hierarchical ID-based AKE. Our hierarchical ID-based eCK security model is an extension of the eCK security model for PKI-based AKE by LaMacchia, Lauter, and Mityagin [12] to hierarchical ID-based AKE. The proposed eCK security model differs from the original eCK security model [12] for PKI-based AKE in the following points: 1) a session is identified by the identity IDi of user Ui, 2) the freshness conditions for StaticKeyReveal queries are different, and 3) the MasterKeyReveal query is allowed for the adversary, as in ID-based AKE. We denote by x ∈U X that an element x is uniformly randomly chosen from a set X.

Algorithms. A hierarchical ID-based AKE protocol Π consists of the following algorithms. We denote a user by Ui and his associated identity by IDi = (IDi,1, ..., IDi,t), where IDi,j ∈ {0, 1}*. We say that ID'i is a prefix of or equal to the identity IDi, and denote ID'i ⪯ IDi, if ID'i = (IDi,1, ..., IDi,k1) and
IDi = (IDi,1, ..., IDi,k1, ..., IDi,k2) for 1 ≤ k1 ≤ k2. User Ui and the other parties are modeled as probabilistic polynomial-time Turing machines.

Parameters. A system parameter params is generated for a security parameter 1^k. All of the following algorithms implicitly take params as input.

Key Generation. The key generation algorithm KeyGen takes a security parameter 1^k as input and outputs a master secret key msk and a master public key mpk, i.e., KeyGen(1^k) → (msk, mpk).

Key Extraction. The key extraction algorithm KeyExt takes the master public key mpk, identities ID'i = (IDi,1, ..., IDi,t−1) and IDi = (IDi,1, ..., IDi,t−1, IDi,t), where the identity ID'i is a prefix of the identity IDi, and a static secret key sskID'i corresponding to the identity ID'i, and outputs a static secret key sskIDi corresponding to the identity IDi, i.e., KeyExt(mpk, ID'i, IDi, sskID'i) → sskIDi. In the case of t = 1, the key extraction algorithm uses the master secret key msk instead of a static secret key, i.e., KeyExt(mpk, ID'i, IDi, msk) → sskIDi.

Key Exchange. Users UA and UB share a session key by performing the following 2-pass protocol. User UA has a static secret key sskIDA corresponding to IDA = (IDA,1, ..., IDA,α), and user UB has a static secret key sskIDB corresponding to IDB = (IDB,1, ..., IDB,β). User UA computes ephemeral keys with the algorithm EphemeralKey, which takes the master public key mpk, the identity IDA, the static secret key sskIDA, and the identity IDB, and outputs an ephemeral secret key eskIDB and an ephemeral public key epkIDB corresponding to the identity IDB, i.e., EphemeralKey(mpk, IDA, sskIDA, IDB) → (eskIDB, epkIDB). User UA sends epkIDB to user UB.
Similarly, user UB computes ephemeral keys with the algorithm EphemeralKey, which takes the master public key mpk, the identity IDB, the static secret key sskIDB, and the identity IDA, and outputs an ephemeral secret key eskIDA and an ephemeral public key epkIDA corresponding to the identity IDA, i.e., EphemeralKey(mpk, IDB, sskIDB, IDA) → (eskIDA, epkIDA). User UB sends epkIDA to user UA. Upon receiving epkIDA, user UA computes a session key with the algorithm SessionKey, which takes the master public key mpk, the identity IDA, the static
secret key sskIDA, the identity IDB, the ephemeral secret key eskIDB, the ephemeral public key epkIDB, and the ephemeral public key epkIDA, and outputs a session key K, i.e., SessionKey(mpk, IDA, sskIDA, IDB, eskIDB, epkIDB, epkIDA) → K. Similarly, upon receiving epkIDB, user UB computes a session key with the algorithm SessionKey, which takes the master public key mpk, the identity IDB, the static secret key sskIDB, the identity IDA, the ephemeral secret key eskIDA, the ephemeral public key epkIDA, and the ephemeral public key epkIDB, and outputs a session key K, i.e., SessionKey(mpk, IDB, sskIDB, IDA, eskIDA, epkIDA, epkIDB) → K.

Session. An invocation of a protocol is called a session. A session is activated by an incoming message of the form (Π, I, IDA, IDB) or (Π, R, IDB, IDA, epkIDB), where Π is a protocol identifier, I and R are role identifiers, and IDA and IDB are user identifiers. If UA was activated with (Π, I, IDA, IDB), then UA is called the session initiator. If UB was activated with (Π, R, IDB, IDA, epkIDB), then UB is called the session responder. The initiator UA outputs epkIDB, may then be activated by an incoming message of the form (Π, I, IDA, IDB, epkIDB, epkIDA) from the responder UB, and computes the session key K if activated by that message. The responder UB outputs epkIDA and computes the session key K. If UA is the initiator of a session, the session is identified by sid = (Π, I, IDA, IDB, epkIDB) or sid = (Π, I, IDA, IDB, epkIDB, epkIDA). If UB is the responder of a session, the session is identified by sid = (Π, R, IDB, IDA, epkIDB, epkIDA). We say that Ui is the owner of session sid if the 3rd coordinate of sid is IDi, and that Ui is the peer of session sid if the 4th coordinate of sid is IDi. We say that a session is completed if a session key is computed in the session.
The matching session of (Π, I, IDA, IDB, epkIDB, epkIDA) is the session with identifier (Π, R, IDB, IDA, epkIDB, epkIDA), and vice versa.

Adversary. The adversary A, modeled as a probabilistic polynomial-time Turing machine, controls all communication between parties, including session activation, by issuing the following query.

– Send(message): The message has one of the following forms: (Π, I, IDA, IDB), (Π, R, IDB, IDA, epkIDB), or (Π, I, IDA, IDB, epkIDB, epkIDA). The adversary obtains the response from the user.

A user's secret information is not directly accessible to the adversary; however, leakage of secret information is captured via the following queries.

– SessionKeyReveal(sid): The adversary obtains the session key for the session sid, provided that the session is completed.
– EphemeralKeyReveal(sid): The adversary obtains the ephemeral secret key of the owner of the session sid.
Hierarchical ID-Based AKE Resilient to Ephemeral Key Leakage
– StaticKeyReveal(IDi): The adversary learns the static secret key corresponding to the identity IDi.
– MasterKeyReveal: The adversary learns the master secret key of the system.
– EstablishParty(Ui, IDi): This query allows the adversary to register a static public key corresponding to identity IDi on behalf of the party Ui; the adversary totally controls that party.

If a party Ui is established by an EstablishParty(Ui, IDi) query issued by the adversary, then we call the party dishonest; otherwise we call the party honest.

Freshness. For the security definition, we need the notion of freshness.

Definition 1 (Freshness). Let sid∗ = (Π, I, IDA, IDB, epkIDB, epkIDA) or (Π, R, IDB, IDA, epkIDB, epkIDA) be a completed session between honest user UA with identity IDA and UB with identity IDB. If the matching session of sid∗ exists, denote it by sid̄∗. Here, we write ID′ ≼ ID when ID′ is a prefix of or equal to ID. The session sid∗ is said to be fresh if none of the following conditions hold:

1. The adversary issues a SessionKeyReveal(sid∗) query, or a SessionKeyReveal(sid̄∗) query if sid̄∗ exists;
2. sid̄∗ exists and the adversary makes either of the following combinations of queries:
   – both StaticKeyReveal(ID) s.t. ID ≼ IDA and EphemeralKeyReveal(sid∗), or
   – both StaticKeyReveal(ID) s.t. ID ≼ IDB and EphemeralKeyReveal(sid̄∗);
3. sid̄∗ does not exist and the adversary makes either of the following queries:
   – both StaticKeyReveal(ID) s.t. ID ≼ IDA and EphemeralKeyReveal(sid∗), or
   – StaticKeyReveal(ID) s.t. ID ≼ IDB;

where, if the adversary issues a MasterKeyReveal() query, we regard the adversary as having issued both StaticKeyReveal(IDA) and StaticKeyReveal(IDB).

Security Experiment. For our security definition, we consider the following security experiment. Initially, the adversary A is given a set of honest users and makes any sequence of the queries described above. During the experiment, A makes the following query.

– Test(sid∗): Here, sid∗ must be a fresh session.
Select a random bit b ∈U {0, 1}, and return the session key held by sid∗ if b = 0, or a random key if b = 1.

The experiment continues until A outputs a guess b′. The adversary wins the game if the test session sid∗ is still fresh and the guess is correct, i.e., b′ = b. The advantage of the adversary A in the AKE experiment with hierarchical ID-based AKE protocol Π is defined as Adv^{HIAKE}_Π(A) = Pr[A wins] − 1/2.
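The freshness notion of Definition 1 can be read as a predicate over the reveal queries the adversary has issued. The sketch below is our own encoding (all flag names are hypothetical); a MasterKeyReveal is treated as implying both static-key reveals, as the definition prescribes:

```python
def is_fresh(q):
    """q: dict of booleans describing the adversary's queries.
    static_A / static_B: StaticKeyReveal(ID) with ID a prefix of ID_A / ID_B,
    eph_test / eph_match: EphemeralKeyReveal on the test / matching session,
    skr_test / skr_match: SessionKeyReveal on the test / matching session,
    matching: whether the matching session exists."""
    static_A = q["static_A"] or q["master"]   # MasterKeyReveal implies both
    static_B = q["static_B"] or q["master"]
    if q["skr_test"] or (q["matching"] and q["skr_match"]):
        return False                                   # condition 1
    if q["matching"]:                                  # condition 2
        if (static_A and q["eph_test"]) or (static_B and q["eph_match"]):
            return False
    else:                                              # condition 3
        if (static_A and q["eph_test"]) or static_B:
            return False
    return True
```

Note that with a matching session present, revealing the master key alone (and hence both static keys) still leaves the session fresh as long as neither ephemeral key is revealed; this is exactly the master-key forward-secrecy-style resilience the model targets.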
We define the security as follows.
Definition 2 (Security). We say that a hierarchical ID-based AKE protocol Π is secure in the hierarchical ID-based eCK model if the following conditions hold:

1. If two honest parties complete matching Π sessions, then, except with negligible probability, they both compute the same session key.
2. For any probabilistic polynomial-time bounded adversary A, Adv^{HIAKE}_Π(A) is negligible.

Moreover, we say that a hierarchical ID-based AKE protocol Π is selective ID secure in the hierarchical ID-based eCK model if adversary A outputs (IDA, IDB) at the beginning of the security experiment.
3 Proposed Hierarchical ID-Based AKE Protocol
We construct a hierarchical ID-based AKE protocol based on the hierarchical ID-based encryption scheme of [9]. By applying the NAXOS technique [12], the proposed protocol satisfies hierarchical ID-based eCK security.

3.1 Assumption
Let k be the security parameter and p a k-bit prime. Let G be a cyclic group of prime order p with generator g, and GT a cyclic group of the same prime order p with generator gT = e(g, g). Let e : G × G → GT be a polynomial-time computable, non-degenerate bilinear map, called a pairing. We say that G, GT are bilinear groups with the pairing e.

The gap BDH (Bilinear Diffie-Hellman) problem is as follows. Define the computational BDH function BDH : G^3 → GT as BDH(g^a, g^b, g^c) = e(g, g)^{abc}, and the decisional BDH predicate DBDH : G^3 × GT → {0, 1} as the function that takes an input (g^a, g^b, g^c, e(g, g)^d) and returns the bit 1 if abc = d mod p and the bit 0 otherwise. An adversary A is given g^a, g^b, g^c ∈U G, selected uniformly at random, together with oracle access to the DBDH(·, ·, ·, ·) oracle, and tries to compute BDH(g^a, g^b, g^c). For adversary A, we define the advantage

Adv^{gapBDH}(A) = Pr[g^a, g^b, g^c ∈U G, A^{DBDH(·,·,·,·)}(g^a, g^b, g^c) = BDH(g^a, g^b, g^c)],

where the probability is taken over the choices of g^a, g^b, g^c and the random tape of A.

Definition 3 (gap BDH assumption). We say that G satisfies the gap BDH assumption if, for all polynomial-time adversaries A, the advantage Adv^{gapBDH}(A) is negligible in the security parameter k.

3.2 Proposed Hierarchical ID-Based AKE Protocol
In this section, we describe the proposed hierarchical ID-based AKE protocol.
Parameters. Let k be the security parameter. Let G, GT be bilinear groups with pairing e : G × G → GT, of k-bit prime order p, with generators g and gT = e(g, g), respectively. Let H1 : {0, 1}∗ → G, H2 : {0, 1}∗ → Zp, and H : {0, 1}∗ → {0, 1}^k be cryptographic hash functions modeled as random oracles.

Key Generation. The key generator randomly selects a master secret key s0 ∈U Zp, sets S0 = 1G (∈ G) and P0 = g (∈ G), and publishes the master public key P0 and Q0 = P0^{s0} (∈ G).

Lower-level Setup. A user U corresponding to (ID1, ..., IDt) picks a random st ∈U Zp and keeps it secret.

Key Extraction. The key extractor generates the static secret key (St, Q1, ..., Qt−1) ((St) if t = 1) corresponding to (ID1, ..., IDt) as Pt = H1(ID1, ..., IDt) (∈ G), St = Π_{i=1}^{t} Pi^{si−1} = St−1 Pt^{st−1} (∈ G), and Qi = P0^{si} (∈ G) for 1 ≤ i ≤ t − 1.

Key Exchange. In the following description, user UA is the session initiator and user UB is the session responder. User UA has the static secret key (SA,α, QA,1, ..., QA,α−1) corresponding to IDA = (IDA,1, ..., IDA,α), and user UB has the static secret key (SB,β, QB,1, ..., QB,β−1) corresponding to IDB = (IDB,1, ..., IDB,β).

1. UA chooses at random an ephemeral private key x̃ ∈U Zp. Then, UA computes x = H2(SA,α, x̃) and the ephemeral public key epkIDB = (P0^x, PB,2^x, ..., PB,β^x), where PB,i = H1(IDB,1, ..., IDB,i) (1 ≤ i ≤ β), and sends it to UB.
2. Upon receiving epkIDB = (X0, XB,2, ..., XB,β), UB checks whether e(P0, XB,i) = e(X0, PB,i) holds for all i (2 ≤ i ≤ β), and aborts if not. UB chooses at random an ephemeral private key ỹ ∈U Zp. Then, UB computes y = H2(SB,β, ỹ) and the ephemeral public key epkIDA = (P0^y, PA,2^y, ..., PA,α^y), where PA,i = H1(IDA,1, ..., IDA,i) (1 ≤ i ≤ α), and sends it to UA.
UB computes the shared secrets

σ1 = e(X0, SB,β) / Π_{i=2}^{β} e(QB,i−1, XB,i), σ2 = e(Q0, PA,1)^y, σ3 = (X0)^y,

and the session key K = H(σ1, σ2, σ3, Π, IDA, IDB, epkIDB, epkIDA).

3. Upon receiving epkIDA = (Y0, YA,2, ..., YA,α), UA checks whether e(P0, YA,i) = e(Y0, PA,i) holds for all i (2 ≤ i ≤ α), and aborts if not. UA computes the shared secrets

σ1 = e(Q0, PB,1)^x, σ2 = e(Y0, SA,α) / Π_{i=2}^{α} e(QA,i−1, YA,i), σ3 = (Y0)^x,
and the session key K = H(σ1, σ2, σ3, Π, IDA, IDB, epkIDB, epkIDA).

The shared secrets that both parties compute are

σ1 = Π_{i=1}^{β} e(P0, PB,i)^{x si−1} / Π_{i=2}^{β} e(P0, PB,i)^{x si−1} = e(Q0, PB,1)^x = e(P0, PB,1)^{s0 x},
σ2 = Π_{i=1}^{α} e(P0, PA,i)^{y si−1} / Π_{i=2}^{α} e(P0, PA,i)^{y si−1} = e(Q0, PA,1)^y = e(P0, PA,1)^{s0 y},
σ3 = (X0)^y = (Y0)^x = P0^{xy},

and therefore they can compute the same session key K.
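The telescoping cancellation behind σ1 can be sanity-checked in a toy exponent-space model: represent g^a by the exponent a mod p, so the pairing e(g^a, g^b) = e(g, g)^{ab} becomes multiplication of exponents and division in GT becomes subtraction. This only checks the algebra; it is in no way a secure instantiation, and the prime and variable names are our own:

```python
import random

p = 2_147_483_647                  # toy prime; group elements stored as exponents
pair = lambda a, b: (a * b) % p    # e(g^a, g^b) = e(g,g)^{ab} in exponent space

beta = 3
s = [random.randrange(1, p) for _ in range(beta)]           # s_0, ..., s_{beta-1}
P = [None] + [random.randrange(1, p) for _ in range(beta)]  # P_{B,1}, ..., P_{B,beta}
S_B = sum(s[i - 1] * P[i] for i in range(1, beta + 1)) % p  # S_{B,beta} = prod P_{B,i}^{s_{i-1}}
Q = {i: s[i] for i in range(beta)}                          # Q_i = P0^{s_i}; Q[0] is Q0

x = random.randrange(1, p)                                  # initiator's NAXOS exponent
X0 = x                                                      # P0^x (P0 = g has exponent 1)
X = {i: (x * P[i]) % p for i in range(2, beta + 1)}         # P_{B,i}^x

# Responder's view: sigma_1 = e(X0, S_{B,beta}) / prod_{i=2}^{beta} e(Q_{B,i-1}, X_{B,i}).
sigma1_B = (pair(X0, S_B) - sum(pair(Q[i - 1], X[i])
                                for i in range(2, beta + 1))) % p
# Initiator's view: sigma_1 = e(Q0, P_{B,1})^x.
sigma1_A = (pair(Q[0], P[1]) * x) % p
assert sigma1_A == sigma1_B        # both equal e(P0, P_{B,1})^{s0 x}
```

All terms e(P0, PB,i)^{x s_{i−1}} with i ≥ 2 cancel between numerator and denominator, leaving only the i = 1 term, exactly as in the derivation above.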
3.3 Security
The proposed hierarchical ID-based AKE protocol is selective ID secure in the hierarchical ID-based eCK model under the gap BDH assumption in the random oracle model.

Theorem 1. If G is a group where the gap Bilinear Diffie-Hellman assumption holds and H1, H2, and H are random oracles, the proposed hierarchical ID-based AKE protocol is selective ID secure in the hierarchical ID-based eCK model described in Section 2.

The proof of Theorem 1 is provided in Appendix A. Here, we give an intuitive sketch.

Proof (Sketch). The adversary A can reveal the master secret key, static secret keys, and ephemeral secret keys of the test session, subject to the freshness conditions. First, when the adversary A poses an EphemeralKeyReveal query, x̃ and ỹ may be revealed. However, by the NAXOS technique (i.e., x = H2(SA,α, x̃) and y = H2(SB,β, ỹ)), x and y are not revealed as long as SA,α and SB,β, respectively, are not revealed. Since A cannot pose both an EphemeralKeyReveal query and a StaticKeyReveal query for the same user in the test session, the EphemeralKeyReveal query gives no advantage to A.

Next, we consider the case where a StaticKeyReveal query for the party UA is posed and there is no matching session. Then, the simulator cannot embed the BDH instance (g^u, g^v, g^w) into the static secret key of UA and the ephemeral public key of the peer. However, the simulator can still embed g^u into the master public key Q0 = P0^{s0}, g^v into the static public key PB,1, and g^w into the ephemeral public key P0^x. Thus, the simulation works, and the simulator can obtain e(g, g)^{uvw} from σ1 = e(P0, PB,1)^{s0 x}.

Finally, we consider the case where a MasterKeyReveal query, or StaticKeyReveal queries for both the test session owner and its peer, is posed and the matching session exists. Then, the simulator cannot embed the BDH instance into the static secret keys of the test session owner and its peer, nor into the master secret key.
However, the simulator can still embed g^u into the ephemeral public key P0^x and g^v into the ephemeral public key P0^y, because A cannot reveal x and y for UA and UB. Thus, the simulation works, and the simulator can obtain e(g, g)^{uvw} by computing e(g^w, σ3) = e(g^w, g^{xy}).
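The NAXOS trick used throughout this argument is easy to state concretely: the exponent actually used on the wire is H2(static secret, ephemeral coin), so leaking the coin x̃ alone reveals nothing about x while the static key stays hidden. A minimal illustration, where the hash choice, encoding, and inputs are our own assumptions:

```python
import hashlib

def naxos_exponent(static_secret: bytes, eph_coin: bytes, p: int) -> int:
    """x = H2(ssk, x~), reduced mod the group order p (sketch, not the paper's H2)."""
    digest = hashlib.sha256(static_secret + b"|" + eph_coin).digest()
    return int.from_bytes(digest, "big") % p

p = 2_147_483_647
x = naxos_exponent(b"static-key-A", b"ephemeral-coin", p)
# An EphemeralKeyReveal leaks only the coin; without the static key the
# derived exponent is unpredictable to the adversary.
x_guess = naxos_exponent(b"guessed-static-key", b"ephemeral-coin", p)
assert x != x_guess
```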
4 Conclusion
In this paper, we first defined the eCK (extended Canetti-Krawczyk) security model for hierarchical ID-based AKE, which captures leakage of ephemeral secret keys, by extending the eCK security model [12] for AKE. We also proposed an eCK-secure hierarchical ID-based AKE protocol based on the hierarchical ID-based encryption scheme of [9], using the NAXOS technique [12]. Our proposed protocol is selective ID secure in the hierarchical ID-based eCK model under the gap BDH assumption in the random oracle model.
References

1. Baek, J., Safavi-Naini, R., Susilo, W.: Efficient multi-receiver identity-based encryption and its application to broadcast encryption. In: Vaudenay, S. (ed.) PKC 2005. LNCS, vol. 3386, pp. 380–397. Springer, Heidelberg (2005)
2. Bellare, M., Rogaway, P.: Entity Authentication and Key Distribution. In: Stinson, D.R. (ed.) CRYPTO 1993. LNCS, vol. 773, pp. 232–249. Springer, Heidelberg (1994)
3. Blake-Wilson, S., Johnson, D., Menezes, A.: Key Agreement Protocols and Their Security Analysis. In: IMA Int. Conf. 1997, pp. 30–45 (1997)
4. Boneh, D., Boyen, X., Goh, E.-J.: Hierarchical Identity Based Encryption with Constant Size Ciphertext. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 440–456. Springer, Heidelberg (2005)
5. Boyd, C., Choo, K.-K.R.: Security of Two-Party Identity-Based Key Agreement. In: Dawson, E., Vaudenay, S. (eds.) Mycrypt 2005. LNCS, vol. 3715, pp. 229–243. Springer, Heidelberg (2005)
6. Canetti, R., Krawczyk, H.: Analysis of Key-Exchange Protocols and Their Use for Building Secure Channels. In: Pfitzmann, B. (ed.) EUROCRYPT 2001. LNCS, vol. 2045, pp. 453–474. Springer, Heidelberg (2001)
7. Chen, L., Cheng, Z., Smart, N.P.: Identity-based key agreement protocols from pairings. Int. J. Inf. Sec. 6(4), 213–241 (2007)
8. Gentry, C., Halevi, S.: Hierarchical Identity Based Encryption with Polynomially Many Levels. In: Reingold, O. (ed.) TCC 2009. LNCS, vol. 5444, pp. 437–456. Springer, Heidelberg (2009)
9. Gentry, C., Silverberg, A.: Hierarchical ID-Based Cryptography. In: Zheng, Y. (ed.) ASIACRYPT 2002. LNCS, vol. 2501, pp. 548–566. Springer, Heidelberg (2002)
10. Huang, H., Cao, Z.: An ID-based authenticated key exchange protocol based on bilinear Diffie-Hellman problem. In: ASIACCS 2009, pp. 333–342 (2009)
11. Kim, M., Fujioka, A., Ustaoglu, B.: Strongly Secure Authenticated Key Exchange without NAXOS' Approach. In: Takagi, T., Mambo, M. (eds.) IWSEC 2009. LNCS, vol. 5824, pp. 174–191. Springer, Heidelberg (2009)
12. LaMacchia, B., Lauter, K., Mityagin, A.: Stronger Security of Authenticated Key Exchange. In: Susilo, W., Liu, J.K., Mu, Y. (eds.) ProvSec 2007. LNCS, vol. 4784, pp. 1–16. Springer, Heidelberg (2007)
13. Moriyama, D., Doi, H., Chao, J.: A Two-Party Hierarchical Identity Based Key Agreement Protocol Without Random Oracles. In: SCIS 2008 (2008)
14. Smart, N.P.: An identity based authenticated key agreement protocol based on the Weil pairing. Electronics Letters 38(13), 630–632 (2002)
A Proof of Theorem 1
We need the gap BDH (Bilinear Diffie-Hellman) assumption, under which one tries to compute BDH(U, V, W) with access to a DBDH oracle. Here, BDH(U, V, W) = e(g, g)^{log U · log V · log W}, and the DBDH oracle on input (g^u, g^v, g^w, e(g, g)^x) returns the bit 1 if uvw = x and the bit 0 otherwise.

We will show that if a polynomially bounded adversary A can distinguish the session key of a fresh session from a randomly chosen session key, then we can solve the gap BDH problem. Let κ denote the security parameter, and let A be a
polynomially (in κ) bounded adversary. We use adversary A to construct a gap BDH solver S that succeeds with non-negligible probability. The adversary A is said to be successful with non-negligible probability if A wins the distinguishing game with probability 1/2 + f(κ), where f(κ) is non-negligible; the event M denotes a successful adversary A.

Let the test session be sid∗ = (Π, I, IDA, IDB, epkIDB, epkIDA) or (Π, R, IDB, IDA, epkIDB, epkIDA), a completed session between honest user UA with identity IDA and UB with identity IDB, where UA and UB are the initiator and responder of the test session sid∗. Let H∗ be the event that adversary A queries (σ1, σ2, σ3, Π, IDA, IDB, epkIDB, epkIDA) to H, and let H̄∗ be its complement. Let sid be any completed session owned by an honest user such that sid ≠ sid∗ and sid is non-matching to sid∗. Since sid and sid∗ are distinct and non-matching, the inputs to the key derivation function H are different for sid and sid∗. Since H is a random oracle, adversary A cannot obtain any information about the test session key from the session keys of non-matching sessions. Hence Pr(M ∧ H̄∗) ≤ 1/2 and Pr(M) = Pr(M ∧ H∗) + Pr(M ∧ H̄∗) ≤ Pr(M ∧ H∗) + 1/2, whence f(κ) ≤ Pr(M ∧ H∗). Henceforth the event M ∧ H∗ is denoted by M∗.

We denote the master secret and public keys by s0 and Q0 = P0^{s0}. For user Ui, we denote the identity by IDi = (IDi,1, ..., IDi,t), the static secret key by Si,t, the ephemeral secret key by x̃i, and the exponent of the ephemeral public key by xi = H2(Si,t, x̃i). We also denote the session key by K. Assume that adversary A succeeds in an environment with n users and activates at most s sessions within a user. Let IDA = (IDA,1, ..., IDA,α), IDB = (IDB,1, ..., IDB,β) be the target identities selected by adversary A. We consider the following events.

– Let D be the event that adversary A queries to H2 a static secret key Si,t corresponding to ID s.t.
(IDA,1) ≼ ID ≼ IDA or (IDB,1) ≼ ID ≼ IDB, before or without asking a StaticKeyReveal query for ID′ s.t. (IDA,1) ≼ ID′ ≼ ID or (IDB,1) ≼ ID′ ≼ ID, respectively, or a MasterKeyReveal query.
– Let D̄ be the complement of event D.

We consider the following events, which cover all cases of behavior of adversary A.

– Let E1 be the event that the test session sid∗ has no matching session sid̄∗ and adversary A queries StaticKeyReveal(ID) s.t. ID ≼ IDA.
– Let E2 be the event that the test session sid∗ has no matching session sid̄∗ and adversary A queries EphemeralKeyReveal(sid∗).
– Let E3 be the event that the test session sid∗ has a matching session sid̄∗ and adversary A queries MasterKeyReveal(), or queries StaticKeyReveal(ID) s.t. ID ≼ IDA and StaticKeyReveal(ID) s.t. ID ≼ IDB.
– Let E4 be the event that the test session sid∗ has a matching session sid̄∗ and adversary A queries EphemeralKeyReveal(sid∗) and EphemeralKeyReveal(sid̄∗).
– Let E5 be the event that the test session sid∗ has a matching session sid̄∗ and adversary A queries StaticKeyReveal(ID) s.t. ID ≼ IDA and EphemeralKeyReveal(sid̄∗).
– Let E6 be the event that the test session sid∗ has a matching session sid̄∗ and adversary A queries EphemeralKeyReveal(sid∗) and StaticKeyReveal(ID) s.t. ID ≼ IDB.

To finish the proof, we investigate the events D ∧ M∗ and Ei ∧ D̄ ∧ M∗ (i = 1, ..., 6), which cover all cases of event M∗, in the following.

A.1 Event D ∧ M∗
In event D, adversary A queries a static secret key Si,t to H2 before or without asking the corresponding StaticKeyReveal queries or a MasterKeyReveal query. Let IDA = (IDA,1, ..., IDA,α), IDB = (IDB,1, ..., IDB,β) be the target identities selected by adversary A. In the case that (IDA,1) ≼ ID ≼ IDA holds in the condition of D, we embed the instance as Q0 = P0^{s0} = U and PA,1 = V, and extract S1 = PA,1^{s0} = g^{uv} from the static secret key (St, Q1, ..., Qt). (In the case that (IDB,1) ≼ ID ≼ IDB holds in the condition of D, we embed the instance as Q0 = P0^{s0} = U and PB,1 = V, and extract S1 = PB,1^{s0} = g^{uv}, analogously to the following simulation.) In the case of event D ∧ M∗ with (IDA,1) ≼ ID ≼ IDA in the condition of D, S performs the following steps.

Setup. S embeds the instance (U = g^u, V = g^v, W = g^w) of the gap BDH problem as Q0 = P0^{s0} = U and PA,1 = V. The algorithm S activates adversary A on this set of users and awaits the actions of A. We next describe the actions of S in response to user activations and oracle queries.

Simulation. The solver S simulates the oracle queries as follows. S maintains a list LH that contains the queries and answers of the H oracle, and a list LS that contains the queries and answers of SessionKeyReveal queries.

1. Send(Π, IDi, IDj): Simulate as in event E1 ∧ D̄ ∧ M∗.
2. Send(Π, IDi, IDj, epki): Simulate as in event E1 ∧ D̄ ∧ M∗.
3. Send(Π, IDi, IDj, epki, epkj): Simulate as in event E1 ∧ D̄ ∧ M∗.
4. H(σ1, σ2, σ3, Π, IDi, IDj, epki, epkj):
   (a) If (σ1, σ2, σ3, Π, IDi, IDj, epki, epkj) is recorded in list LH, then return the recorded value K.
   (b) Else if (Π, IDi, IDj, epki, epkj) is recorded in list LS, DBDH(Pj,1, Q0, P0^x, σ1) = 1, DBDH(Pi,1, Q0, P0^y, σ2) = 1, and e(P0^x, P0^y) = e(P0, σ3), then return the recorded value K and record it in list LH.
   (c) Otherwise, S returns a random value K and records it in list LH.
5. SessionKeyReveal(sid): Simulate as in event E1 ∧ D̄ ∧ M∗.
6. H1(IDi): Simulate as in event E1 ∧ D̄ ∧ M∗.
7. H2((St, Q1, ..., Qt), x̃): If (IDA,1) ≼ IDi ≼ IDA, where the static secret key (St, Q1, ..., Qt) corresponds to IDi, S computes S1 = PA,1^{s0} = g^{uv} from the static secret key (St, Q1, ..., Qt); then S stops and is successful, outputting the answer e(g^{uv}, W) = BDH(U, V, W) to the gap BDH problem. Otherwise, S simulates the random oracle in the usual way.
8. EphemeralKeyReveal(sid): S returns a random value x̃ and records it.
9. StaticKeyReveal(IDi): If (IDA,1) ≼ IDi ≼ IDA, S aborts with failure. Otherwise, simulate as in event E1 ∧ D̄ ∧ M∗.
10. MasterKeyReveal(): S aborts with failure.
11. EstablishParty(Ui, IDi): S responds to the query faithfully.
12. Test(sid): S responds to the query faithfully.
13. If adversary A outputs a guess γ, S aborts with failure.

Analysis. The simulation of adversary A's environment is perfect except with negligible probability. Suppose event D occurs; then S does not abort in Step 9 and Step 10, since the corresponding StaticKeyReveal and MasterKeyReveal queries are not made. Under event M∗, adversary A queries correctly formed σ1, σ2, σ3 to H. Therefore, S is successful as described in Step 7 and does not abort in Step 13. Hence, S is successful with probability Pr(S) ≥ pD, where pD is the probability that D ∧ M∗ occurs.

A.2 Event E1 ∧ D̄ ∧ M∗
In event E1, the test session sid∗ has no matching session sid̄∗; adversary A queries StaticKeyReveal(ID) s.t. ID ≼ IDA, and, by the freshness conditions, does not query EphemeralKeyReveal(sid∗) or StaticKeyReveal(ID) s.t. ID ≼ IDB. Let IDA = (IDA,1, ..., IDA,α), IDB = (IDB,1, ..., IDB,β) be the target identities selected by adversary A. We embed the instance as Q0 = P0^{s0} = U, PB,1 = V, and P0^x = W, and extract e(g, g)^{uvw} from σ1 = e(P0, PB,1)^{s0 x}. In the case of event E1 ∧ D̄ ∧ M∗, S performs the following steps.

Setup. S embeds the instance (U = g^u, V = g^v, W = g^w) of the gap BDH problem as Q0 = P0^{s0} = U, PB,1 = V, and P0^x = W. S randomly selects an integer jA ∈R [1, s], which becomes a guess of the test session with probability 1/s. S sets the ephemeral public key of the jA-th session of user UA with identity IDA as P0^x = W and PB,i^x = W^{bB,i} for i (2 ≤ i ≤ β). The algorithm S activates adversary A on this set of users and awaits the actions of A. We next describe the actions of S in response to user activations and oracle queries.

Simulation. The solver S simulates the oracle queries as follows. S maintains a list LH that contains the queries and answers of the H oracle, and a list LS that contains the queries and answers of SessionKeyReveal queries.

1. Send(Π, IDi, IDj): S picks an ephemeral secret key x ∈U Zp, computes the ephemeral public key epki honestly, records (Π, IDi, IDj, epki), and returns it.
2. Send(Π, IDi, IDj, epki): S picks an ephemeral secret key y ∈U Zp, computes the ephemeral public key epkj honestly, records (Π, IDi, IDj, epki, epkj), and returns it.
3. Send(Π, IDi, IDj, epki, epkj): If (Π, IDi, IDj, epki) is not recorded, S records the session (Π, IDi, IDj, epki, epkj) as not completed. Otherwise, S records the session as completed.
4. H(σ1, σ2, σ3, Π, IDi, IDj, epki, epkj):
   (a) If (σ1, σ2, σ3, Π, IDi, IDj, epki, epkj) is recorded in list LH, then return the recorded value K.
   (b) Else if (Π, IDi, IDj, epki, epkj) is recorded in list LS, DBDH(Pj,1, Q0, P0^x, σ1) = 1, DBDH(Pi,1, Q0, P0^y, σ2) = 1, and e(P0^x, P0^y) = e(P0, σ3), then return the recorded value K and record it in list LH.
   (c) Else if DBDH(Pj,1, Q0, P0^x, σ1) = 1, DBDH(Pi,1, Q0, P0^y, σ2) = 1, e(P0^x, P0^y) = e(P0, σ3), IDi = IDA, IDj = IDB, and the session is the jA-th session of user UA, then S stops and is successful, outputting the answer σ1 = BDH(U, V, W) to the gap BDH problem.
   (d) Otherwise, S returns a random value K and records it in list LH.
5. SessionKeyReveal(sid):
   (a) If the session sid is not completed, return an error.
   (b) Else if sid is recorded in list LS, then return the recorded value K.
   (c) Else if (σ1, σ2, σ3, Π, IDi, IDj, epki, epkj) is recorded in list LH, DBDH(Pj,1, Q0, P0^x, σ1) = 1, DBDH(Pi,1, Q0, P0^y, σ2) = 1, and e(P0^x, P0^y) = e(P0, σ3), then return the recorded value K and record it in list LS.
   (d) Otherwise, S returns a random value K and records it in list LS.
6. H1(IDi): Simulate as described below.
7. H2((St, Q1, ..., Qt), x̃): If IDi ≼ IDB, where the static secret key (St, Q1, ..., Qt) corresponds to IDi, S aborts with failure. Otherwise, S simulates the random oracle in the usual way.
8. EphemeralKeyReveal(sid): S returns a random value x̃ and records it.
9. StaticKeyReveal(IDi): If IDi ≼ IDB, S aborts with failure. Otherwise, simulate as described below.
10. MasterKeyReveal(): S aborts with failure.
11. EstablishParty(Ui, IDi): S responds to the query faithfully.
12. Test(sid): If the ephemeral public key P0^x is not W in session sid, then S aborts with failure. Otherwise, S responds to the query faithfully.
13. If adversary A outputs a guess γ, S aborts with failure.

Simulation of H1 queries: S maintains a list LH1 that contains each asked identity IDi = (IDi,1, ..., IDi,ti), the hash values (Ti,1, ..., Ti,ti) where H1(IDi,1, ..., IDi,k) := Ti,k, the discrete logs (bi,1, ..., bi,ti) where bi,k ∈U Zp and Ti,k := P0^{bi,k}, and the secret values (si,1, ..., si,ti). Here, we denote the selected identity by ID0 = (ID0,1, ..., ID0,t0) instead of IDB. At the beginning of the simulation, S adds the selected identity ID0 to the list LH1 as follows: for 1 ≤ j ≤ t0, S uniformly chooses s0,j and b0,j from Zp, except that b0,1 = 1, and sets T0,j = P0^{b0,j}, except that T0,1 = P1. When IDi is asked in a StaticKeyReveal query, S adds IDi to the list LH1.

First, we briefly explain the strategy of the simulation. For a StaticKeyReveal query, S has to return static secret keys without knowing s0 or P0^{s0}. In order to simulate StaticKeyReveal queries validly, S sets the value so as to cancel the influence of T0,1 = P1
on the node next to the path from the root to the selected identity. Thus, S can compute the answer to a StaticKeyReveal query outside of the path from the root to the selected identity without knowing s0 or P0^{s0}.

The detailed simulation is as follows. Let m be maximal such that (IDi,1, ..., IDi,m) = (IDi′,1, ..., IDi′,m), where LH1 contains (IDi′,1, ..., IDi′,m), and let n (≤ m) be maximal such that (IDi,1, ..., IDi,n) = (ID0,1, ..., ID0,n).

1) For 1 ≤ j ≤ m, S sets Ti,j = Ti′,j, si,j = si′,j, and bi,j = bi′,j.
2) If 0 < m = n ≤ ti, for m ≤ j ≤ ti, S uniformly chooses si,j and bi,j from Zp, and sets Ti,n+1 = P0^{bi,n+1} / P1^{si,n^{−1} bi,1} and Ti,j = P0^{bi,j} if n + 1 < j < ti.
3) If n < m or n = 0, for m < j ≤ ti, S uniformly chooses si,j and bi,j from Zp, and sets Ti,j = P0^{bi,j}.

Simulation of StaticKeyReveal queries: When IDi is asked in a StaticKeyReveal query, S returns the static secret key of IDi as follows: S carries out the same steps as in the simulation of the H1 query for IDi, and defines Si,ti = Π_{j=1}^{ti} Ti,j^{si,j−1} (with si,0 = s0) and {Qi,j = Q0^{si,j} : 1 ≤ j ≤ ti − 1}.

Analysis. The simulation of adversary A's environment is perfect except with negligible probability. The probability that adversary A selects the session with ephemeral public key P0^x = W as the test session sid∗ is at least 1/s. Suppose this is indeed the case; then S does not abort in Step 12. Suppose event E1 occurs; then S does not abort in Step 9 or Step 10, since StaticKeyReveal(IDi) s.t. IDi ≼ IDB and MasterKeyReveal() are not queried. Suppose event E1 ∧ D̄ occurs; then S does not abort in Step 7, since StaticKeyReveal(IDi) s.t. IDi ≼ IDB is not queried and by the condition of D̄. Under event M∗, adversary A queries correctly formed σ1, σ2, σ3 to H. Therefore, S is successful as described in Step 4(c) and does not abort in Step 13. Hence, S is successful with probability Pr(S) ≥ p1/s, where p1 is the probability that E1 ∧ D̄ ∧ M∗ occurs.

A.3 Event E2 ∧ D̄ ∧ M∗
In event E2, the test session sid∗ has no matching session sid̄∗; adversary A queries EphemeralKeyReveal(sid∗), and, by the freshness conditions, does not query StaticKeyReveal(ID) s.t. ID ≼ IDA or StaticKeyReveal(ID) s.t. ID ≼ IDB. Adversary A cannot obtain any information about x except with negligible guessing probability, since H2 is a random oracle and by the condition of event D̄. So S performs the same reduction as in the case of event E1 ∧ D̄ ∧ M∗.

A.4 Event E3 ∧ D̄ ∧ M∗
In event E3, the test session sid∗ has a matching session sid̄∗; adversary A queries MasterKeyReveal(), or queries StaticKeyReveal(ID) s.t. ID ≼ IDA and StaticKeyReveal(ID) s.t. ID ≼ IDB, and, by the freshness conditions, does not query
EphemeralKeyReveal(sid∗) or EphemeralKeyReveal(sid̄∗). Let IDA = (IDA,1, ..., IDA,α), IDB = (IDB,1, ..., IDB,β) be the target identities selected by adversary A. We embed the instance as P0^x = U and P0^y = V, and extract g^{uv} from σ3 = g^{xy}. In the case of event E3 ∧ D̄ ∧ M∗, S performs the following steps.

Setup. S embeds the instance (U = g^u, V = g^v, W = g^w) of the gap BDH problem as P0^x = U and P0^y = V. S randomly selects the master secret key s0 and the si's, and generates static secret keys honestly. S randomly selects integers jA, jB ∈R [1, s], which become a guess of the test session with probability 1/s². S sets the ephemeral public key of the jA-th session of user UA with identity IDA as P0^x = U and PB,i^x = U^{bB,i} for i (2 ≤ i ≤ β). S sets the ephemeral public key of the jB-th session of user UB with identity IDB as P0^y = V and PA,i^y = V^{bA,i} for i (2 ≤ i ≤ α). The algorithm S activates adversary A on this set of users and awaits the actions of A. We next describe the actions of S in response to user activations and oracle queries.

Simulation. The solver S simulates the oracle queries as follows. S maintains a list LH that contains the queries and answers of the H oracle, and a list LS that contains the queries and answers of SessionKeyReveal queries.

1. Send(Π, IDi, IDj): Simulate as in event E1 ∧ D̄ ∧ M∗.
2. Send(Π, IDi, IDj, epki): Simulate as in event E1 ∧ D̄ ∧ M∗.
3. Send(Π, IDi, IDj, epki, epkj): Simulate as in event E1 ∧ D̄ ∧ M∗.
4. H(σ1, σ2, σ3, Π, IDi, IDj, epki, epkj):
   (a) If (σ1, σ2, σ3, Π, IDi, IDj, epki, epkj) is recorded in list LH, then return the recorded value K.
   (b) Else if (Π, IDi, IDj, epki, epkj) is recorded in list LS, DBDH(Pj,1, Q0, P0^x, σ1) = 1, DBDH(Pi,1, Q0, P0^y, σ2) = 1, and e(P0^x, P0^y) = e(P0, σ3), then return the recorded value K and record it in list LH.
   (c) Else if DBDH(Pj,1, Q0, P0^x, σ1) = 1, DBDH(Pi,1, Q0, P0^y, σ2) = 1, e(P0^x, P0^y) = e(P0, σ3), IDi = IDA, IDj = IDB, the session is the jA-th session of user UA, and the session is the jB-th session of user UB, then S stops and is successful, outputting the answer e(σ3, W) = BDH(U, V, W) to the gap BDH problem.
   (d) Otherwise, S returns a random value K and records it in list LH.
5. SessionKeyReveal(sid): Simulate as in event E1 ∧ D̄ ∧ M∗.
6. H1(IDi): S simulates the random oracle in the usual way.
7. H2((St, Q1, ..., Qt), x̃): If ((St, Q1, ..., Qt), x̃) is used in the jA-th session of user UA or in the jB-th session of user UB, S aborts with failure. Otherwise, S simulates the random oracle in the usual way.
8. EphemeralKeyReveal(sid): S returns a random value x̃ and records it.
9. StaticKeyReveal(IDi): S responds to the query faithfully, using the knowledge of the master secret key s0 and the si's.
10. MasterKeyReveal(): S returns s0.
11. EstablishParty(Ui, IDi): S responds to the query faithfully.
12. Test(sid): If the ephemeral public key X is not U or Y is not V in session sid, then S aborts with failure. Otherwise, S responds to the query faithfully.
13. If adversary A outputs a guess γ, S aborts with failure.

Analysis. The simulation of adversary A's environment is perfect except with negligible probability. The probability that adversary A selects the session with ephemeral public keys P0^x = U and P0^y = V as the test session sid∗ is at least 1/s². Suppose this is indeed the case; then S does not abort in Step 12. Suppose event E3 occurs; then S does not abort in Step 7 except with the negligible probability of guessing an ephemeral secret key x̃, since EphemeralKeyReveal(sid∗) and EphemeralKeyReveal(sid̄∗) are not queried. Under event M∗, adversary A queries correctly formed σ1, σ2, σ3 to H. Therefore, S is successful as described in Step 4(c) and does not abort in Step 13. Hence, S is successful with probability Pr(S) ≥ p3/s², where p3 is the probability that E3 ∧ D̄ ∧ M∗ occurs.

A.5 Event E4 ∧ D̄ ∧ M∗
In event E4, the test session sid∗ has a matching session, adversary A queries EphemeralKeyReveal on both the test session sid∗ and its matching session, and, by the condition of freshness, adversary A queries neither StaticKeyReveal(ID) s.t. ID = IDA nor StaticKeyReveal(ID) s.t. ID = IDB. Adversary A can obtain no information about x and y except with negligible guessing probability, since H2 is a random oracle and by the condition of event D. So S performs the same reduction as in the case of event E3 ∧ D ∧ M∗.

A.6
Event E5 ∧ D ∧ M ∗
In event E5, the test session sid∗ has a matching session, adversary A queries StaticKeyReveal(ID) s.t. ID = IDA and EphemeralKeyReveal(sid∗), and, by the condition of freshness, adversary A queries neither EphemeralKeyReveal on the matching session nor StaticKeyReveal(ID) s.t. ID = IDB. Adversary A can obtain no information about y except with negligible guessing probability, since H2 is a random oracle and by the condition of event D. So S performs the same reduction as in the case of event E3 ∧ D ∧ M∗.

A.7
Event E6 ∧ D ∧ M ∗
In event E6, the test session sid∗ has a matching session, adversary A queries EphemeralKeyReveal on the matching session and StaticKeyReveal(ID) s.t. ID = IDB, and, by the condition of freshness, adversary A queries neither StaticKeyReveal(ID) s.t. ID = IDA nor EphemeralKeyReveal(sid∗). Adversary A can obtain no information about x except with negligible guessing probability, since H2 is a random oracle and by the condition of event D. So S performs the same reduction as in the case of event E3 ∧ D ∧ M∗.
Group Signature Implies PKE with Non-interactive Opening and Threshold PKE

Keita Emura¹, Goichiro Hanaoka², and Yusuke Sakai³

¹ Center for Highly Dependable Embedded Systems Technology, Japan Advanced Institute of Science and Technology (JAIST), 1-1, Asahidai, Nomi, Ishikawa, 923-1292, Japan
[email protected]
² Research Center for Information Security (RCIS), National Institute of Advanced Industrial Science and Technology (AIST), 1-18-13, Soto-Kanda, Chiyoda, Tokyo, 101-0021, Japan
[email protected]
³ Graduate School of Information and Communication Engineering, The University of Electro-Communications, 1-5-1, Chofugaoka, Chofu-shi, Tokyo 182-8585, Japan
[email protected]

Abstract. In this paper, we show that both Public Key Encryption (PKE) with non-interactive opening and threshold PKE can be constructed from an arbitrary group signature scheme which is secure in the dynamic group setting. This result implies that group signature (in dynamic groups) is a significantly strong cryptographic primitive, since the above PKEs with additional functionalities are already much stronger primitives than the standard chosen-ciphertext secure PKE, which is itself recognized as a very powerful cryptographic tool. We can interpret our result as meaning that designing secure group signatures is significantly harder than designing many other ordinary cryptographic primitives.
1
Introduction
Background: Group Signature (GS) is a kind of anonymous signature, and is known as a popular cryptographic primitive. The concept of GS was introduced by Chaum and van Heyst [11], and its typical usage is described as follows: the Group Manager (GM) issues a membership certificate to a signer. A signer makes a group signature by using its own membership certificate. A verifier anonymously verifies whether a signer is a member of a group or not. In order to handle some special cases (e.g., an anonymous signer behaves maliciously), the GM can identify the actual signer through the open procedure. Since verifiers do not have to identify individual signers, GS is a useful and powerful tool for protecting signers' privacy, and therefore many attractive applications of GS have been proposed. For example, in a biometric-based authentication scheme [10], a user can be verified anonymously by using the user's biometric trait as a secret key of GS. Therefore, we do not have to consider personal information exposure

I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 181–198, 2010.
© Springer-Verlag Berlin Heidelberg 2010
K. Emura, G. Hanaoka, and Y. Sakai
through the authentication procedure. In the identity management scheme for outsourcing business [22], by applying a GS for the authentication of users, the outsourcee does not have to manage the list of user identities. Therefore, we do not have to consider personal information exposure from the outsourcee. In an anonymous survey [25], the dealer can collect statistical information without identifying individual users by applying the open procedure of GS. These applications suitably capture the fact that individual users' privacy is strongly required in the modern information society, and therefore research on GS is an important topic of cryptography. However, designing a GS looks difficult due to its sophisticated functionalities. Theoretically, a GS can be constructed by using a Simulation-Sound Non-Interactive Zero-Knowledge (SS-NIZK) proof system and a trapdoor one-way function [4,5]. This fact just describes the feasibility of a GS and nothing more, and the gap between feasibility and practicability is clearly wide. For example, as well-known results, a digital signature and a pseudorandom generator can be constructed generically from a one-way function [17,18,27]. However, these constructions are completely impractical, and therefore stronger assumptions are required to construct practical schemes. Consequently, it is an important research issue to investigate the difficulty of constructing GS in practice.

Our Contribution: In this paper, we show that both Public-Key Encryption with Non-interactive Opening (PKENO) [13,15] and Threshold Public Key Encryption (TPKE) [8] can be constructed from an arbitrary GS scheme which is secure in the dynamic group setting [5]. Although it has been shown in [26] that a GS (in static groups [4]) is a stronger cryptographic primitive than Chosen-Ciphertext secure PKE (CCA-secure PKE), it is not obvious at a glance that there are relations among GS, PKENO, and TPKE.
In our constructions, not all properties of a dynamic GS are required to construct PKENO and TPKE. This result implies that a GS (in dynamic groups) is a significantly stronger cryptographic primitive, since PKENO and TPKE are already much stronger primitives than the standard CCA-secure PKE, which is itself recognized as a very powerful cryptographic tool. We can interpret our result as meaning that designing secure GSs is significantly harder than designing many other ordinary cryptographic primitives.

Related Work: Many GS schemes have been proposed so far, and there were many kinds of security requirements for a GS, such as Unforgeability, Exculpability, Traceability, Coalition Resistance, Framing Resistance, Anonymity, and Unlinkability [1,2,11,12]. Therefore, it was important to establish a de facto standard for the security requirements of a GS. In 2003, Bellare, Micciancio, and Warinschi [4] (BMW) showed that formulations of Full-Anonymity and Full-Traceability are strong enough to capture all the above security requirements. Although this BMW model has contributed greatly to GS, it only handles the static group setting: the number of members is decided in the initial setup phase, and new members cannot be added later. This limited setting is not suitable for many GS-based applications. For example, in an anonymous survey [25], each user has membership certificates corresponding to their attributes. One attribute
certificate is issued for an attribute type, and the number of each attribute is the statistical information. In this application, a new membership certificate must be issued after the initial setup phase. As another example, let a signatory group be a branch of a company. Then, it is difficult to determine the number of employees in the initial phase in real environments, since usually some newcomers will join the branch. Therefore, it is natural that the security requirements capture the dynamic (i.e., new members will be added) group setting. In addition, since the GM generates all secret signing keys in the BMW model, the GM can generate a signature such that the opening procedure outputs an honest user who did not make this signature. Moreover, there is no way to verify whether the result of opening is true or not. To handle these issues, Bellare, Shi, and Zhang [5] have proposed a new security model for GS called the BSZ model, which implies the BMW model. We review the definitions of the BSZ model in Section 2.

The concept of PKENO was introduced by Damgård, Hofheinz, Kiltz, and Thorbek [13]. The receiver of a ciphertext can reveal what the result of decrypting the ciphertext was without compromising the corresponding decryption key. For example, when a decryptor has a complaint against the decryption result, PKENO comes into effect. A TPKE is a public key encryption system where the private key is distributed among n decryption servers, and a ciphertext can be decrypted by using at least k servers' private keys. Even if threshold decryption of a valid ciphertext fails, the combiner can identify the failed decryption servers. Very recently, as a work independent of our results, Galindo et al. showed that PKENO can be constructed from a robust TPKE [16]. By the Galindo et al. result, if TPKE can be constructed from a GS, then PKENO also can be constructed from a GS. In addition, Ohtake et al.
mentioned that TPKE might be generically constructed from any GS (secure in the BMW model) by using it as a black box. However, our result (TPKE can be constructed from a GS secure in the BSZ model) is not obvious, even if the results of Galindo et al. [16] and Ohtake et al. [26] are given. See Section 4.4 for details.
2
Review of Group Signatures in the BSZ Model

2.1
GS in Dynamic Groups
In 2005, to handle dynamic group settings, Bellare, Shi, and Zhang [5] showed that the formulations of Anonymity, Traceability, and Non-Frameability are strong enough to capture and imply all existing security requirements. In the BSZ model, the roles of the GM are separated into the Issuing Manager (IM) and the Opening Manager (OM). The IM manages an issuer key ik and executes the interactive joining algorithm with a user to issue his/her membership certificate; therefore, group members can join the signatory group after the initial phase. The OM manages an opening key ok and executes the opening algorithm to reveal information about who the actual signer is. For public verification of the result of this opening procedure, a new algorithm called Judge is added. This
algorithm is important to handle a situation in which the OM is fully corrupted by an adversary.

2.2
Definitions for the BSZ Model
A GS scheme GS consists of the following seven algorithms GS = (GKg, UKg, Join, Iss, GSig, GVf, Open, Judge):

Definition 1. System operations of GS

GKg: This algorithm takes as an input a security parameter 1λ (λ ∈ N), and returns a group public key gpk, an issuer key ik, and an opening key ok.
UKg: For a user Ui, this algorithm takes as an input 1λ, and returns a public and private key pair (upki, uski). upki is added to the public table upk. We assume that upk is included in gpk.
Join, Iss: This (interactive) algorithm takes as inputs gpk, ik, and the registration table reg from the IM, and gpk, upki, and uski from Ui, and returns a signing key gski to Ui. The transcript of the process is stored in reg[i].
GSig: This algorithm takes as inputs gpk, gski, and a message M ∈ MGSig, and returns a group signature σ.
GVf: This deterministic algorithm takes as inputs gpk, σ, and M, and returns 1 if σ is a valid group signature, and 0 otherwise.
Open: This deterministic algorithm takes as inputs gpk, ok, M, σ, and reg, and returns (i, τ), where i is Ui's identity, and τ is a proof that Ui computed σ.
Judge: This deterministic algorithm takes as inputs gpk, upki, M, σ, and τ, and returns 1 if σ was produced by Ui, and 0 otherwise.

The purpose of the Judge algorithm is to handle the situation where the OM is fully corrupted by an adversary. The corrupted OM may produce a faked-but-acceptable proof. The Judge algorithm must handle this case. Therefore, we assume that the Judge algorithm satisfies Soundness (which is implicitly required in the BSZ model): for σ ← GSig(gpk, gski, M), the probability that any Probabilistic Polynomial Time (PPT) adversary with ok produces τ such that Judge(gpk, upkj, M, σ, τ) = 1 for any j ≠ i is negligible.
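Purely as an illustration (not part of the paper), the seven-algorithm interface of Definition 1 can be summarized as an abstract Python class; all names are ours, and a real scheme would implement each method with concrete cryptography:

```python
from abc import ABC, abstractmethod

class GroupSignature(ABC):
    """Abstract interface mirroring GS = (GKg, UKg, Join/Iss, GSig, GVf, Open, Judge)."""

    @abstractmethod
    def gkg(self, security_param):
        """Return (gpk, ik, ok)."""

    @abstractmethod
    def ukg(self, security_param):
        """Return a user key pair (upk_i, usk_i)."""

    @abstractmethod
    def join_iss(self, gpk, ik, upk_i, usk_i, reg):
        """Interactive joining between the IM and user U_i; returns gsk_i and updates reg[i]."""

    @abstractmethod
    def gsig(self, gpk, gsk_i, message):
        """Return a group signature sigma on message."""

    @abstractmethod
    def gvf(self, gpk, message, sigma):
        """Return True iff sigma is a valid group signature on message."""

    @abstractmethod
    def open(self, gpk, ok, message, sigma, reg):
        """Return (i, tau): the signer's identity and a proof."""

    @abstractmethod
    def judge(self, gpk, upk_i, message, sigma, tau):
        """Return True iff tau proves that U_i produced sigma."""

# A subclass missing any of these methods cannot be instantiated:
class Incomplete(GroupSignature):
    pass
```

The constructions in Sections 3 and 4 only ever call GSig, GVf, Open, and Judge on this interface.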
Note that the Groth scheme [20] and a generic construction of GS secure in the BSZ model [5] satisfy this soundness property by applying the soundness of a Non-Interactive Witness-Indistinguishable (NIWI) proof [21] and of a SS-NIZK system, respectively. Next, we define the security requirements of GS (Correctness, Anonymity, Traceability, and Non-Frameability). Correctness captures the fact that any group signature generated by an honest group member is valid, the Open algorithm correctly identifies the actual signer, and proofs generated by the Open algorithm are always accepted by the Judge algorithm. Anonymity captures the fact that, even if an adversary A is given all signing keys, A cannot identify who the actual signer is, and cannot distinguish whether the signers of two signatures
are the same person or not. Traceability captures the fact that an adversary A cannot produce a signature such that either the OM cannot identify the signer, or an honestly generated proof is rejected by the Judge algorithm. Non-Frameability captures the fact that an adversary A with both ik and ok cannot produce a judge-accepted proof for an honest user. In the following experiments, let HU be a set of honest users, CU be a set of corrupted users, GSet be a set of message-signature pairs, upk be a table which contains the public key upki of Ui, and reg be a table which contains the registration information of Ui in reg[i].

Definition 2. A GS is said to satisfy Correctness if the advantage is negligible for any PPT adversary A in the following experiment.

Adv^corr_{GS,A}(1^λ) := Pr[(gpk, ik, ok) ← GKg(1^λ); CU ← ∅; HU ← ∅; (i, M) ← A^{AddU(·),RReg(·)}(1^λ, gpk); σ ← GSig(gpk, gski, M); GVf(gpk, M, σ) = 0 ∨ ((j, τ) ← Open(gpk, ok, M, σ, reg) ∧ i ≠ j) ∨ Judge(gpk, upki, M, σ, τ) = 0]

AddU: This is the add user oracle with the input of an identity i, where A can add Ui to the group as an honest user. Ui is added into HU. The oracle runs (upki, uski) ← UKg(1^λ), and Join (resp. Iss) on behalf of Ui with gpk, upki, and uski (resp. the IM with gpk, ik, and reg). A is given upki.
RReg: This is the read registration table oracle with an input i, where the oracle returns reg[i].

Definition 3. A GS is said to satisfy Anonymity if the advantage is negligible for any PPT adversary A in the following experiment.

Exp^{anon-b}_{GS,A}(1^λ): (gpk, ik, ok) ← GKg(1^λ); CU ← ∅; HU ← ∅; GSet ← ∅; d ← A^{CrptU(·,·),SndToU(·,·),USK(·),Open(·,·),WReg(·,·),Ch(b,·,·,·)}(1^λ, gpk, ik); Return d

Adv^anon_{GS,A} := |Pr[Exp^{anon-1}_{GS,A}(1^λ) = 1] − Pr[Exp^{anon-0}_{GS,A}(1^λ) = 1]|

CrptU: This is the corrupt user oracle with inputs of an identity i and upk, where A can add Ui with upki = upk to the group as a corrupted user.
SndToU: This is the send to user oracle with input i, where A can run the Iss algorithm on behalf of the IM with an honest user Ui.
USK: This is the user secret keys oracle with input i, where the oracle returns both the private signing key gski and the personal private key uski.
Open: This is the opening oracle with inputs M and σ, where the oracle returns Open(gpk, ok, M, σ, reg) if (M, σ) ∉ GSet, and ⊥ otherwise.
WReg: This is the write registration table oracle with inputs i and a value ρ, where the oracle modifies the contents of reg[i] ← ρ.
Ch: This is the challenge oracle with inputs a bit b, i0, i1, and M. The oracle returns σ ← GSig(gpk, gskib, M) if both Ui0 ∈ HU and Ui1 ∈ HU; if not, the oracle returns ⊥. The oracle stores (M, σ) in GSet.
Definition 4. A GS is said to satisfy Traceability if the advantage is negligible for any PPT adversary A in the following experiment.

Adv^trace_{GS,A}(1^λ) := Pr[(gpk, ik, ok) ← GKg(1^λ); CU ← ∅; HU ← ∅; (M, σ) ← A^{SndToI(·,·),AddU(·),RReg(·),USK(·),CrptU(·,·)}(1^λ, gpk, ok); (i, τ) ← Open(gpk, ok, M, σ, reg); GVf(gpk, M, σ) = 1 ∧ (i = 0 ∨ Judge(gpk, upki, M, σ, τ) = 0)]

SndToI: This is the send to issuer oracle with an identity of a corrupted user i, where A can run the Join algorithm on behalf of Ui with an honest issuer.

Definition 5. A GS is said to satisfy Non-Frameability if the advantage is negligible for any PPT adversary A in the following experiment.

Adv^nf_{GS,A}(1^λ) := Pr[(gpk, ik, ok) ← GKg(1^λ); CU ← ∅; HU ← ∅; (M, σ, i, τ) ← A^{SndToU(·,·),WReg(·),GSig(·,·),USK(·),CrptU(·,·)}(1^λ, gpk, ok, ik); i ∈ HU ∧ gski ≠ ⊥ ∧ GVf(gpk, M, σ) = 1 ∧ Judge(gpk, upki, M, σ, τ) = 1 ∧ A did not query USK(i) or GSig(i, M)]

GSig: This is the signing oracle with inputs i and a message M, where the oracle returns σ ← GSig(gpk, gski, M) if both Ui ∈ HU and gski ≠ ⊥, and ⊥ otherwise.

2.3
Properties of GS Schemes in the BSZ Model
To handle dynamic participants, some new (and minute) security requirements are introduced in the BSZ model. In addition, the BSZ model implies the BMW model. One might think that, for constructing a secure GS in the BSZ model, significantly stronger cryptographic tools are required than in the BMW model. However, the difference in the strength of the security requirements between the two models is actually not obviously wide. This is due to the following facts. First, the BSZ model does not consider the revocation of group members (i.e., it covers only partially dynamic groups), and a construction of partially dynamic groups based on a GS secure in the BMW model has been considered (Section 5 of [4]). As in BSZ, the roles of the GM are separated into the OM and the IM, and ik (resp. ok) is given to the adversary in the Full-Anonymity (resp. Full-Traceability) experiment. Nevertheless, the GM can generate a signature such that the Open algorithm outputs an honest user who did not make this signature, and there is no way to verify whether the result of opening is true or not. From the above considerations, the differences between the BMW and BSZ models are Non-Frameability, the Judge algorithm, and the concurrent Join algorithm. These can be implemented by using a SS-NIZK proof system [28], and a SS-NIZK proof system has already been applied to implement a concrete GS scheme secure in the BMW model (presented in [4]). That is to say, we do not have to introduce any new cryptographic primitives to implement these differences from the feasibility point of view.
2.4
Concrete Implementations
A generic construction of a GS scheme secure in the BSZ model was introduced in [5], based on an existentially unforgeable digital signature, a CCA-secure PKE, and a SS-NIZK system. In addition, there are several GS schemes secure in the BSZ model, e.g., the Delerablée and Pointcheval scheme [14] and the Groth schemes [19,20]. The Delerablée and Pointcheval scheme is efficient but secure only in the random oracle model. Although the Groth scheme [19] is secure in the standard model, each group signature consists of a large number of group elements. The other Groth scheme [20] achieves a reasonable constant-size group signature, and is secure in the standard model. We review the Groth scheme [20] in the Appendix.
3
Public Key Encryption with Non-interactive Opening (PKENO)
In this section, we propose a generic construction of PKENO by using a GS secure in the BSZ model. In our construction, the properties of Traceability and Non-Frameability and the interactive Join algorithm are not required. This is evidence that a GS secure in the BSZ model is a much stronger cryptographic primitive than PKENO. This suggests that designing secure GSs is significantly harder than designing CCA-secure PKE.

3.1
Security Requirements of PKENO
We introduce the security requirements of PKENO (Correctness, Completeness, IND-CCPA, and Proof Soundness) presented in [13]. A PKENO scheme PKENO consists of the following five algorithms PKENO = (PKENO.Gen, PKENO.Enc, PKENO.Dec, Prove, PKENO.Ver):

Definition 6. System operations of PKENO

PKENO.Gen: This algorithm takes as an input a security parameter 1λ (λ ∈ N), and returns a public/secret key pair (pk, sk).
PKENO.Enc: This algorithm takes as inputs pk and a message m ∈ MPKENO, and returns a ciphertext C.
PKENO.Dec: This algorithm takes as inputs sk and C, and returns m or ⊥.
Prove: This algorithm takes as inputs sk and C, and returns a proof π.
PKENO.Ver: This algorithm takes as inputs pk, C, m, and π, and returns 1 if C is a ciphertext of m, and 0 otherwise.

Definition 7. Correctness
∀(pk, sk) ← PKENO.Gen(1^λ) and ∀m ∈ MPKENO, Pr[PKENO.Dec(sk, PKENO.Enc(pk, m)) = m] = 1.
Definition 8. Completeness
∀C, Pr[PKENO.Ver(pk, C, PKENO.Dec(sk, C), Prove(sk, C)) = 1] = 1.

Definition 9. A PKENO is said to be Indistinguishable against Chosen-Ciphertext and Prove Attack (IND-CCPA) secure if the advantage is negligible for any PPT adversary A in the following experiment. A can query a decryption oracle ODec(sk, ·) and a proof oracle OProve(sk, ·), which, for an input ciphertext C, return the result of PKENO.Dec(sk, C) and Prove(sk, C), respectively. Note that A cannot query C∗ to either oracle, where C∗ is the challenge ciphertext.

Adv^{ind-ccpa}_{PKENO,A}(1^λ) := |Pr[(pk, sk) ← PKENO.Gen(1^λ); (m0, m1, State) ← A^{ODec(sk,·),OProve(sk,·)}(1^λ, pk); b ←$ {0, 1}; C∗ ← PKENO.Enc(pk, mb); b′ ← A^{ODec(sk,·),OProve(sk,·)}(C∗, State); b′ = b] − 1/2|

Definition 10. A PKENO is said to satisfy Computational Proof Soundness if the advantage is negligible for any PPT adversary A in the following experiment.

Adv^{snd}_{PKENO,A}(1^λ) := Pr[(pk, sk) ← PKENO.Gen(1^λ); (m, State) ← A(1^λ, pk, sk); C ← PKENO.Enc(pk, m); (m′, π′) ← A(C, State); PKENO.Ver(pk, C, m′, π′) = 1 ∧ m′ ≠ m]

Definition 11. A PKENO scheme is said to be secure if both IND-CCPA security and Computational Proof Soundness hold.

3.2
Proposed GS-Based PKENO
Note that in the PKENO.Gen algorithm, the algorithm executer runs the (interactive) Join, Iss algorithm alone. That is to say, since the algorithm executer has both ik and uski, the algorithm executer can play the roles of the IM and of a user simultaneously, and can obtain gski.

Protocol 1. Proposed GS-Based PKENO

PKENO.Gen(1λ): Given a security parameter 1λ (λ ∈ N), for i = 1, 2, it runs (gpk, ik, ok) ← GKg(1λ), (upki, uski) ← UKg(1λ), and gski ← Join, Iss(gpk, ik, uski), and sets an encryption key pk = (gpk, gsk1, gsk2, upk1, upk2, M) and a decryption key sk = ok, where M ∈ MGSig is a (fixed) signed message.
PKENO.Enc(pk, m): For a plaintext m ∈ {0, 1}, it runs σ ← GSig(gpk, gskm+1, M), and outputs C := σ.
PKENO.Dec(pk, sk, C): For C = σ, it runs b ← GVf(gpk, M, σ). If b = 0, then it returns ⊥. Otherwise, it runs (i, τ) ← Open(gpk, ok, M, σ, reg), and outputs m = i − 1.
Prove(pk, sk, C): It runs b ← GVf(gpk, M, σ). If b = 0, then it returns ⊥. Otherwise, it runs (i, τ) ← Open(gpk, ok, M, σ, reg), and outputs π = τ.
PKENO.Ver(pk, C, m, π): It outputs b ← Judge(gpk, upkm+1, M, σ, τ).

3.3
Security Analysis
Theorem 1. Our GS-Based PKENO scheme satisfies Correctness and Completeness if the underlying GS scheme satisfies Correctness. Proof. If the underlying GS scheme satisfies Correctness, then a signature computed by an honestly generated secret key is always valid, and the Open algorithm correctly identifies the signer. In addition, a proof returned by the Open algorithm is always accepted by the Judge algorithm. Therefore, Correctness and Completeness clearly hold.
Theorem 2. Our GS-Based PKENO scheme is IND-CCPA secure if the underlying GS scheme satisfies Anonymity.

Proof. Let A be an adversary who breaks the IND-CCPA security of our GS-based PKENO scheme, and C be the challenger of the underlying GS scheme in the Anonymity experiment. We construct an algorithm B that breaks the Anonymity of the underlying GS scheme. First, C runs (gpk, ik, ok) ← GKg(1λ), and sends gpk and ik to B. B issues the identity 1 (resp. 2) to the SndToU oracle, and also issues USK(1) (resp. USK(2)), and obtains (upk1, usk1, gsk1) (resp. (upk2, usk2, gsk2)). B chooses M ∈ MGSig randomly, and sends pk = (gpk, gsk1, gsk2, upk1, upk2, M) to A. When a decryption query C = σ is issued by A, B issues C to the opening oracle, obtains (i, τ), and returns i − 1 to A. When a proof query C = σ is issued by A, B issues C to the opening oracle, obtains (i, τ), and returns τ to A. Without loss of generality, we assume that m0 = 0 and m1 = 1. B issues (M, i0, i1) to the challenge oracle, where i0 = 1 and i1 = 2, and obtains σ∗ ← GSig(gpk, gskib, M). B sends σ∗ to A as the challenge ciphertext in the IND-CCPA experiment. Finally, A outputs the guessing bit b′ ∈ {0, 1}, and B outputs b′.
Theorem 3. Our GS-Based PKENO scheme satisfies Computational Proof Soundness.

Proof. For m ∈ {0, 1} and σ ← GSig(gpk, gskm+1, M), A outputs (m′, π′). By the definition of Proof Soundness, PKENO.Ver(pk, C, m′, π′) = 1 and m′ ≠ m. This means that GVf(gpk, M, σ) = 1 and Judge(gpk, upkm′+1, M, σ, τ) = 1, where τ = π′. However, this condition occurs with only negligible probability due to the Soundness of the Judge algorithm.
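The plumbing of Protocol 1 — a bit m is encrypted by signing the fixed message as member m + 1, decrypted via Open, and verified via Judge — can be sketched in Python. The group signature below is a deliberately insecure toy stand-in of our own (HMAC- and hash-based, with the signer's identity committed in the signature and recoverable only with the opening key); it only exercises the interfaces and is not a construction from the paper:

```python
import hashlib
import hmac
import os

class ToyGS:
    """Toy two-member 'group signature'. NOT anonymous, NOT secure; illustration only."""
    def __init__(self):
        self.ok = os.urandom(16)                           # opening key
        self.gsk = {1: os.urandom(16), 2: os.urandom(16)}  # member signing keys

    def gsig(self, i, message):
        tag = hmac.new(self.gsk[i], message, hashlib.sha256).digest()
        salt = os.urandom(16)
        commit = hashlib.sha256(salt + bytes([i])).digest()     # binds identity i
        pad = hmac.new(self.ok, tag + commit, hashlib.sha256).digest()[:16]
        enc = bytes(a ^ b for a, b in zip(salt, pad))           # salt hidden from public
        return (tag, commit, enc)

    def gvf(self, message, sigma):  # toy check: well-formedness only
        tag, commit, enc = sigma
        return len(tag) == 32 and len(commit) == 32 and len(enc) == 16

    def open(self, message, sigma):
        tag, commit, enc = sigma
        pad = hmac.new(self.ok, tag + commit, hashlib.sha256).digest()[:16]
        salt = bytes(a ^ b for a, b in zip(enc, pad))
        for i in (1, 2):                                   # recover i; tau = salt
            if hashlib.sha256(salt + bytes([i])).digest() == commit:
                return i, salt
        return 0, b""

    def judge(self, i, message, sigma, tau):
        tag, commit, enc = sigma                           # anyone can check tau against commit
        return hashlib.sha256(tau + bytes([i])).digest() == commit

class GSBasedPKENO:
    """Protocol 1 over the toy GS: Enc/Dec/Prove/Ver reduce to GSig/Open/Judge."""
    def __init__(self):
        self.gs = ToyGS()
        self.M = b"fixed signed message"

    def enc(self, m):                 # m in {0, 1}: sign M as member m + 1
        return self.gs.gsig(m + 1, self.M)

    def dec(self, C):
        if not self.gs.gvf(self.M, C):
            return None
        i, _tau = self.gs.open(self.M, C)
        return i - 1

    def prove(self, C):
        if not self.gs.gvf(self.M, C):
            return None
        _i, tau = self.gs.open(self.M, C)
        return tau

    def ver(self, C, m, tau):
        return self.gs.judge(m + 1, self.M, C, tau)
```

A real instantiation would of course use an actual BSZ-secure GS; the point is only that the four PKENO algorithms are thin wrappers around GSig, Open, and Judge.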
3.4
Concrete Instantiation of PKENO
Protocol 2. A PKENO Scheme based on Groth’s GS
PKENO.Gen(1λ): Run (gpk, ik, ok) = ((gk, Hash, f, h, T, crs, pk), z, xk) ← GKg(1λ), choose M ←$ Msots and x1, x2, r1, r2 ←$ Zp, compute v1 = g^x1, v2 = g^x2, a1 = f^−r1, a2 = f^−r2, b1 = (hv1)^r1 z, and b2 = (hv2)^r2 z, and store v1 and v2 in reg. Output pkpkeno = (gpk, gsk1, gsk2, M) and skpkeno = xk, where gsk1 = (x1, a1, b1) and gsk2 = (x2, a2, b2).
PKENO.Enc(pk, m): For m ∈ {0, 1}, run (vksots, sksots) ← KeyGensots(1λ), and repeat this until Hash(vksots) ≠ xm+1 holds. Choose ρ ←$ Zp, and compute a = am+1 f^−ρ, b = bm+1 (hvm+1)^ρ, σ′ = g^{1/(xm+1 + Hash(vksots))}, π ← PNIWI(crs, (gpk, a, Hash(vksots)), (b, vm+1, σ′)), y ← Epk(Hash(vksots), σ′), ψ ← PNIZK(crs, (gpk, y, π), (r, s, t)), and σsots ← Signsksots(vksots, (M, a, π, y, ψ)), and output C := σ = (vksots, a, π, y, ψ, σsots).
PKENO.Dec(pk, sk, C): If 1 ← Vervksots((M, a, π, y, ψ), σsots), 1 ← VNIWI(crs, (gpk, a, Hash(vksots)), π), 1 ← VNIZK(crs, (gpk, π, y), ψ), and 1 ← ValidCiphertext(pk, Hash(vksots), y), then extract (b, v, σ′) ← Xxk(crs, (gpk, a, Hash(vksots)), π), find the index i with v ∈ reg[i], and return m = i − 1. Else, return ⊥.
Prove(pk, sk, C): If 1 ← Vervksots((M, a, π, y, ψ), σsots), 1 ← VNIWI(crs, (gpk, a, Hash(vksots)), π), 1 ← VNIZK(crs, (gpk, π, y), ψ), and 1 ← ValidCiphertext(pk, Hash(vksots), y), then extract (b, v, σ′) ← Xxk(crs, (gpk, a, Hash(vksots)), π), and return π′ = (σ′, i, v), where v ∈ reg[i]. Else, return ⊥.
PKENO.Ver(pk, C, m, π′): Return 1 if e(σ′, v g^{Hash(vksots)}) = e(g, g) ∧ i = m + 1 ∧ i ≠ 0, and 0 otherwise.

In all previous PKENO constructions [13,15], the PKENO.Ver algorithm is implemented by using decryption procedures. The above GS-based PKENO.Ver algorithm is implemented by using only the (Boneh-Boyen) signature [7] verification, and such a construction has not appeared in the literature before. As a benefit of this construction, we mention that the required computational cost is small compared with the decryption-based constructions.
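The remark above rests on the Boneh-Boyen check e(σ′, v·g^{Hash(vksots)}) = e(g, g): over a symmetric pairing with e(g^a, g^b) = gT^{ab}, the check holds exactly when σ′ = g^{1/(x+H)}, i.e., when (1/(x+H))·(x+H) = 1 in the exponent. The sketch below shows only this exponent arithmetic over a toy prime; it is our illustration, and a real implementation needs a pairing library over a pairing-friendly curve:

```python
# Boneh-Boyen-style verification in the exponent, over a toy prime-order group.
# A group element g^a is represented by its exponent a mod p, and the pairing
# e(g^a, g^b) = gT^(ab) is represented by the product a*b mod p.
p = 2**61 - 1          # toy prime group order (not pairing-friendly; illustration only)

def keygen(x):
    """sk = x; vk 'v = g^x' is represented by the exponent x."""
    return x, x

def sign(sk, h):
    """sigma' = g^(1/(sk+h)), i.e. the exponent 1/(sk+h) mod p (Fermat inverse)."""
    return pow(sk + h, p - 2, p)

def verify(vk, h, sig):
    """e(sigma', v*g^h) == e(g, g)  <=>  sig * (vk + h) == 1 (mod p)."""
    return (sig * (vk + h)) % p == 1

sk, vk = keygen(123456789)
h = 42                 # stands in for Hash(vk_sots)
sig = sign(sk, h)
```

The verification costs one multiplication in the exponent picture (one pairing and one exponentiation in the real group), which is the "small computational cost" claimed above.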
4
Threshold Public Key Encryption (TPKE)
In this section, we propose a generic construction of TPKE by using a GS secure in the BSZ model. The properties of Traceability and Non-Frameability and the interactive Join algorithm are not required. This is evidence that a GS secure in the BSZ model is a much stronger cryptographic primitive than TPKE. As in the case of PKENO (Section 3), this suggests that designing secure GSs is significantly harder than designing CCA-secure PKE.

4.1
Security Requirements of TPKE
We introduce the security requirements of TPKE (Correctness, Robustness, IND-CCA, and Decryption Consistency) defined by Boneh, Boyen, and Halevi [8]. A TPKE scheme TPKE consists of the following five algorithms TPKE = (TPKE.Setup, TPKE.Enc, ShareDecrypt, ShareVerify, Combine):
Definition 12. System operations of TPKE

TPKE.Setup: This algorithm takes as inputs a security parameter 1λ (λ ∈ N), the number of decryption servers n, and a threshold k, where 1 ≤ k ≤ n, and outputs a public key pk, a verification key vk, and a vector of n private key shares sk = (sk1, . . . , skn). Decryption server i is given (i, ski).
TPKE.Enc: This algorithm takes as inputs pk and a message m, and returns a ciphertext C.
ShareDecrypt: This algorithm takes as inputs pk, a decryption server's identity i, ski, and C, and returns a decryption share μi = (i, μ̂i), or ⊥.
ShareVerify: This algorithm takes as inputs pk, vk, C, and μi, and outputs 1 if μi is a valid decryption share, and 0 otherwise.
Combine: This algorithm takes as inputs pk, vk, C, and k decryption shares {μi1, . . . , μik}, and outputs a cleartext m or ⊥.

Definition 13. Correctness
∀(pk, vk, sk) ← TPKE.Setup(1^λ, n, k), ∀m ∈ MTPKE, and ∀S = {i1, i2, . . . , ik} ⊆ {1, 2, . . . , n}, Pr[Combine(pk, vk, C, {ShareDecrypt(pk, i, ski, C)}i∈S) = m] = 1, where C ← TPKE.Enc(pk, m).

Definition 14. Robustness
∀C and ∀i ∈ {1, 2, . . . , n}, Pr[ShareVerify(pk, vk, C, ShareDecrypt(pk, i, ski, C)) = 1] = 1.

Definition 15. A TPKE is said to be Indistinguishable against Chosen-Ciphertext Attacks (IND-CCA) secure if the advantage is negligible for any PPT adversary A in the following experiment. A can query a share decryption oracle OShareDec(pk, ·, ·), which, for an input ciphertext C and an identity of a decryption server i, returns the result of ShareDecrypt(pk, i, ski, C). Note that A cannot query C∗ to the oracle, where C∗ is the challenge ciphertext. Let S ⊂ {1, 2, . . . , n}, where |S| = k − 1.

Adv^{ind-cca}_{TPKE,A}(1^λ) := |Pr[S ← A; (pk, vk, sk) ← TPKE.Setup(1^λ, n, k); (m0, m1, State) ← A^{OShareDec(pk,·,·)}(1^λ, pk, vk, {ski}i∈S); b ←$ {0, 1}; C∗ ← TPKE.Enc(pk, mb); b′ ← A^{OShareDec(pk,·,·)}(C∗, State); b′ = b] − 1/2|
Definition 16. A TPKE is said to satisfy Decryption Consistency if the advantage is negligible for any PPT adversary A in the following experiment. Let S′ = {μ1, μ2, . . . , μk} and S′′ = {μ′1, μ′2, . . . , μ′k} be two sets of decryption shares, where S′ (resp. S′′) contains decryption shares from k distinct servers.

Adv^{dec-consis}_{TPKE,A}(1^λ) := Pr[S ← A; (pk, vk, sk) ← TPKE.Setup(1^λ, n, k); (C, S′, S′′) ← A^{OShareDec(pk,·,·)}(1^λ, pk, vk, {ski}i∈S); for all μ ∈ S′ ∪ S′′, ShareVerify(pk, vk, C, μ) = 1; Combine(pk, vk, C, S′) ≠ Combine(pk, vk, C, S′′)]
4.2
Proposed GS-Based TPKE
Protocol 3. Proposed GS-Based (n, n)-TPKE

TPKE.Setup(1λ, n, n): Given a security parameter 1λ (λ ∈ N), for i = 1, 2, . . . , n and j = 1, 2, it runs (gpki, iki, oki) ← GKg(1λ), (upki,j, uski,j) ← UKg(1λ), and gski,j ← Join, Iss(gpki, iki, uski,j), and sets pk = vk = {(gpki, gski,1, gski,2, upki,1, upki,2, Mi)}ni=1, where Mi ∈ MGSig is a (fixed) signed message, and sk = {ski}ni=1, where ski = oki.
TPKE.Enc(pk, m): For a plaintext m ∈ {0, 1}, it chooses mj ∈ {0, 1} randomly for j = 1, 2, . . . , n − 1, and sets mn := (m1 ⊕ · · · ⊕ mn−1) ⊕ m. For i = 1, 2, . . . , n, it runs σi ← GSig(gpki, gski,mi+1, Mi), and outputs C := {σi}ni=1.
ShareDecrypt(pk, i, ski, C): It runs b ← GVf(gpki, Mi, σi). If b = 0, then it returns ⊥. Otherwise, it runs (j, τi) ← Open(gpki, oki, Mi, σi, regi), and outputs μi = (mi, τi), where mi = j − 1.
ShareVerify(pk, vk, C, μi): It outputs b ← Judge(gpki, upki,mi+1, Mi, σi, τi).
Combine(pk, vk, C, {μ1, . . . , μn}): For all μi, it checks 1 ← Judge(gpki, upki,mi+1, Mi, σi, τi). If all conditions hold, then it outputs m = m1 ⊕ · · · ⊕ mn, and ⊥ otherwise.
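With the group-signature layer stripped away, the sharing used by TPKE.Enc and Combine in Protocol 3 is just an (n, n) one-time-pad split of the bit m; a minimal Python sketch:

```python
import secrets

def share_bit(m, n):
    """Split bit m into n XOR shares; any n-1 shares reveal nothing about m."""
    shares = [secrets.randbelow(2) for _ in range(n - 1)]
    last = m
    for s in shares:
        last ^= s
    shares.append(last)      # m_n = (m_1 ^ ... ^ m_{n-1}) ^ m
    return shares

def combine_bits(shares):
    """Recover m = m_1 ^ ... ^ m_n."""
    m = 0
    for s in shares:
        m ^= s
    return m
```

In Protocol 3 the share mi is never transmitted directly: it is encoded as a group signature by member mi + 1, and ShareDecrypt recovers it via Open.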
Shamir's SSS causes a large public key containing O(np) signing keys to be generated/used for the scheme to work. Therefore, from the viewpoint of share size, we apply a weakly-private (k, n)-threshold SSS for sharing one bit [3], which is a conditionally secure SSS. For example, the (2, n)-TPKE (based on [3]) is described as follows: for a decryption server i, TPKE.Setup(1^λ, 2, n) outputs (gpk_i, gsk_{i,1}, gsk_{i,2}, gsk_{i,3}, gsk_{i,4}, upk_{i,1}, upk_{i,2}, upk_{i,3}, upk_{i,4}, M_i), since the domain of shares is {0, 1, 2, 3}. For m_i ∈ {0, 1, 2, 3}, the partial ciphertext of a decryption server i is σ_i ← GSig(gpk_i, gsk_{i,m_i+1}, M_i). By using m_i ∈ {0, 1, 2, 3} as a share, m can be computed using the reconstruction function of the underlying SSS.
4.3
Security Analysis
Theorem 4. Our GS-Based TPKE scheme satisfies Correctness and Robustness if the underlying GS scheme satisfies Correctness.
GS Implies PKE with Non-interactive Opening and Threshold PKE
193
Proof. If the underlying GS scheme satisfies Correctness, then a signature computed with an honestly generated secret key is always valid, and the Open algorithm correctly identifies the signer. In addition, a proof returned by the Open algorithm is always accepted by the Judge algorithm. Therefore, for all i ∈ {1, 2, . . . , n}, the ShareDecrypt algorithm correctly computes μ_i = (m_i, τ_i), and the ShareVerify algorithm accepts τ_i. Hence, Correctness and Robustness clearly hold.
Theorem 5. Our GS-Based TPKE scheme is IND-CCA secure if the underlying GS scheme satisfies Anonymity.

Proof. We assume k = n in the following proof; IND-CCA security in the (k, n)-threshold setting can be proved by the same approach. Let A be an adversary who breaks the IND-CCA security of our GS-based TPKE scheme, and let C be the challenger of the underlying GS scheme in the Anonymity experiment. We construct an algorithm B that breaks the Anonymity of the underlying GS scheme. First, A sends S ⊂ {1, 2, . . . , n} with |S| = n − 1 to B. For all j ∈ S, B runs (gpk_j, ik_j, ok_j) ← GKg(1^λ), chooses M_j ∈ M_GSig randomly, and computes (upk_{j,1}, usk_{j,1}, gsk_{j,1}) and (upk_{j,2}, usk_{j,2}, gsk_{j,2}). C runs (gpk, ik, ok) ← GKg(1^λ), and sends gpk and ik to B. B issues the identity 1 (resp. 2) to the SndToU oracle, also issues USK(1) (resp. USK(2)), and obtains (upk_1, usk_1, gsk_1) (resp. (upk_2, usk_2, gsk_2)). Let i ∈ {1, 2, . . . , n} \ S. B chooses M_i ∈ M_GSig randomly, and sets gpk_i := gpk, ik_i := ik, (upk_{i,1}, usk_{i,1}, gsk_{i,1}) := (upk_1, usk_1, gsk_1), and (upk_{i,2}, usk_{i,2}, gsk_{i,2}) := (upk_2, usk_2, gsk_2). B sends pk = vk = {(gpk_i, gsk_{i,1}, gsk_{i,2}, upk_{i,1}, upk_{i,2}, M_i)}_{i=1}^n and {sk_j = ok_j}_{j∈S} to A. For a share decryption query (C = {σ_i}_{i=1}^n, i), if i ∈ S, then B returns the result of ShareDecrypt(pk, i, sk_i, C). If i ∉ S, then B issues σ_i to the opening oracle, obtains (j, τ_i), and returns (j − 1, τ_i) to A. A sends (m_0, m_1) to B. For j ∈ S, B chooses m_j ∈ {0, 1} randomly, and computes σ*_j ← GSig(gpk_j, gsk_{j,m_j+1}, M_j). B issues (M_i, i_0, i_1) to the challenge oracle, where i_0 = 1 and i_1 = 2, and obtains σ*_i ← GSig(gpk_i, gsk_{i,i_b}, M_i). B sends C* = {σ*_i}_{i=1}^n to A as the challenge ciphertext in the IND-CCA experiment. Finally, A outputs the guessing bit b′ ∈ {0, 1}. B outputs (⊕_{j∈S} m_j) ⊕ m_{b′}.
Theorem 6. Our GS-Based TPKE scheme satisfies Decryption Consistency.

Proof. We assume k = n in the following proof; Decryption Consistency in the (k, n)-threshold setting can be proved by the same approach. Let A be an adversary who breaks the Decryption Consistency of our GS-based TPKE scheme. A outputs (C = (σ*_1, . . . , σ*_n), S′, S′′). By the definition of Decryption Consistency, Combine(pk, vk, C, S′) ≠ Combine(pk, vk, C, S′′). Therefore, there exists a pair (μ_i, μ′_i) = ((m_i, τ_i), (m′_i, τ′_i)) ∈ S′ × S′′ such that m_i ≠ m′_i, ShareVerify(pk, vk, C, μ_i) = 1, and ShareVerify(pk, vk, C, μ′_i) = 1. This means that Judge(gpk_i, upk_{i,m_i+1}, M_i, σ*_i, τ_i) = 1 and Judge(gpk_i, upk_{i,m′_i+1}, M_i, σ*_i, τ′_i) = 1. However, this condition never occurs due to the Soundness of the Judge algorithm.
4.4
Concrete Instantiation of TPKE
Protocol 4. An (n, n)-TPKE Scheme Based on Groth's GS

TPKE.Setup(1^λ, n, n): Set gk := (p, G, G_T, e, g). For i = 1, 2, . . . , n, run (gpk_i, ik_i, ok_i) = ((gk, Hash, f_i, h_i, T_i, crs_i, pk_i), z_i, xk_i) ← GKg(1^λ), choose M_i ←$ M_GSig and x_{i,1}, x_{i,2}, r_{i,1}, r_{i,2} ←$ Z_p, compute v_{i,1} = g^{x_{i,1}}, v_{i,2} = g^{x_{i,2}}, a_{i,1} = f_i^{−r_{i,1}}, a_{i,2} = f_i^{−r_{i,2}}, b_{i,1} = (h_i v_{i,1})^{r_{i,1}} z_i, and b_{i,2} = (h_i v_{i,2})^{r_{i,2}} z_i, and store v_{i,1} and v_{i,2} to reg_i. Output pk_tpke = vk_tpke = {(gpk_i, gsk_{i,1}, gsk_{i,2}, M_i)}_{i=1}^n, where gsk_{i,j} = (x_{i,j}, a_{i,j}, b_{i,j}), and sk_tpke = {xk_i}_{i=1}^n.

TPKE.Enc(pk, m): For a plaintext m ∈ {0, 1}, choose m_j ∈ {0, 1} randomly for j = 1, 2, . . . , n − 1, and set m_n := (⊕_{j=1}^{n−1} m_j) ⊕ m. For i = 1, 2, . . . , n, run (vk_i, sk_i) ← KeyGen_sots(1^λ), and repeat this until Hash(vk_i) ≠ x_{i,m_i+1} holds. Choose ρ_i ←$ Z_p, and compute a_i = a_{i,m_i+1} f_i^{−ρ_i}, b_i = b_{i,m_i+1} (h_i v_{i,m_i+1})^{ρ_i}, σ′_i = g^{1/(x_{i,m_i+1} + Hash(vk_i))}, π_i ← P_NIWI(crs_i, (gpk_i, a_i, Hash(vk_i)), (b_i, v_{i,m_i+1}, σ′_i)), y_i ← E_{pk_i}(Hash(vk_i), σ′_i), ψ_i ← P_NIZK(crs_i, (gpk_i, y_i, π_i), (r_i, s_i, t_i)), and σ_i ← Sign_{sk_i}(vk_i, (M_i, a_i, π_i, y_i, ψ_i)). Output C := {(vk_i, a_i, π_i, y_i, ψ_i, σ_i)}_{i=1}^n.

ShareDecrypt(pk_tpke, i, xk_i, C): If 1 ← Ver_{vk_i}((M_i, a_i, π_i, y_i, ψ_i), σ_i), 1 ← V_NIWI(crs_i, (gpk_i, a_i, Hash(vk_i)), π_i), 1 ← V_NIZK(crs_i, (gpk_i, π_i, y_i), ψ_i), and 1 ← ValidCiphertext(pk_i, Hash(vk_i), y_i), then extract (b_i, v_i, σ′_i) ← X_{xk_i}(crs_i, (gpk_i, a_i, Hash(vk_i)), π_i), and return μ_i = (m_i = j − 1, (σ′_i, j, v_i)), where v_i ∈ reg_i[j]. Else, return ⊥.

ShareVerify(pk_tpke, vk_tpke, C, μ_i): Return 1 if e(σ′_i, v_i g^{Hash(vk_i)}) = e(g, g) ∧ j = m_i + 1 ∧ j ≠ 0, and 0 otherwise.

Combine(pk_tpke, vk_tpke, C, {μ_1, . . . , μ_n}): For all μ_i, it checks e(σ′_i, v_i g^{Hash(vk_i)}) = e(g, g) ∧ j = m_i + 1 ∧ j ≠ 0. If all conditions hold, then it outputs m = ⊕_{i=1}^n m_i, and ⊥ otherwise.
Ohtake et al. [26] mentioned that TPKE can be constructed from a publicly verifiable PKE scheme, and therefore TPKE might be generically constructed from any GS (secure in the BMW model) by using it as a black box. This observation follows the well-known folklore that TPKE can be constructed from publicly verifiable CCA-secure PKE schemes. For example, TPKE can be constructed from CCA-secure PKE schemes that are built from identity-based techniques [9,24]. In these constructions, a partial ciphertext is a share of an identity-based private key, and it is publicly verifiable whether it was properly generated for an identity or not. By combining partial private keys in a secret-sharing manner, the actual private key can be computed, and therefore the Combine algorithm works. However, we insist that public verifiability of partial ciphertexts is not sufficient to construct TPKE. Even if a partial ciphertext μ_i is publicly verifiable, it is not clear whether the Combine algorithm can work without any secret values. For example, as in [26], if we take a group signature itself as a partial ciphertext, then public verifiability is satisfied. However, there is no way to recover the corresponding opening result from group signatures
only. Even if a decryption key (namely, an opening key of the GS) is distributed in a secret-sharing manner, a partial opening key cannot open a group signature by itself. Therefore, in our constructions, the Judge algorithm comes into effect to publicly verify the result of the opening procedure.
5
Conclusion
In this paper, we showed that PKENO and TPKE can be constructed by using a GS secure in the BSZ model. PKENO and TPKE are themselves much stronger cryptographic primitives than CCA-secure PKE. In addition, the Traceability and Non-Frameability properties and the interactive Join algorithm are not required in our constructions. Our result can be interpreted as saying that designing secure GSs is significantly harder than designing many other ordinary cryptographic primitives.
Acknowledgements The authors would like to thank the anonymous reviewers of IWSEC 2010 for their invaluable comments. The first author, Keita Emura, is supported by the Center for Highly Dependable Embedded Systems Technology as a postdoctoral researcher.
References

1. Ateniese, G., Camenisch, J., Joye, M., Tsudik, G.: A practical and provably secure coalition-resistant group signature scheme. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880, pp. 255–270. Springer, Heidelberg (2000)
2. Ateniese, G., Tsudik, G.: Some open issues and new directions in group signatures. In: Franklin, M.K. (ed.) FC 1999. LNCS, vol. 1648, pp. 196–211. Springer, Heidelberg (1999)
3. Beimel, A., Franklin, M.K.: Weakly-private secret sharing schemes. In: Vadhan, S.P. (ed.) TCC 2007. LNCS, vol. 4392, pp. 253–272. Springer, Heidelberg (2007)
4. Bellare, M., Micciancio, D., Warinschi, B.: Foundations of group signatures: Formal definitions, simplified requirements, and a construction based on general assumptions. In: Biham, E. (ed.) EUROCRYPT 2003. LNCS, vol. 2656, pp. 614–629. Springer, Heidelberg (2003)
5. Bellare, M., Shi, H., Zhang, C.: Foundations of group signatures: The case of dynamic groups. In: Menezes, A. (ed.) CT-RSA 2005. LNCS, vol. 3376, pp. 136–153. Springer, Heidelberg (2005)
6. Boldyreva, A., Fischlin, M., Palacio, A., Warinschi, B.: A closer look at PKI: Security and efficiency. In: Okamoto, T., Wang, X. (eds.) PKC 2007. LNCS, vol. 4450, pp. 458–475. Springer, Heidelberg (2007)
7. Boneh, D., Boyen, X.: Short signatures without random oracles and the SDH assumption in bilinear groups. J. Cryptology 21(2), 149–177 (2008)
8. Boneh, D., Boyen, X., Halevi, S.: Chosen ciphertext secure public key threshold encryption without random oracles. In: Pointcheval, D. (ed.) CT-RSA 2006. LNCS, vol. 3860, pp. 226–243. Springer, Heidelberg (2006)
9. Boyen, X., Mei, Q., Waters, B.: Direct chosen ciphertext security from identity-based techniques. In: ACM Conference on Computer and Communications Security, pp. 320–329 (2005)
10. Bringer, J., Chabanne, H., Pointcheval, D., Zimmer, S.: An application of the Boneh and Shacham group signature scheme to biometric authentication. In: Matsuura, K., Fujisaki, E. (eds.) IWSEC 2008. LNCS, vol. 5312, pp. 219–230. Springer, Heidelberg (2008)
11. Chaum, D., van Heyst, E.: Group signatures. In: Davies, D.W. (ed.) EUROCRYPT 1991. LNCS, vol. 547, pp. 257–265. Springer, Heidelberg (1991)
12. Chen, L., Pedersen, T.P.: New group signature schemes (extended abstract). In: De Santis, A. (ed.) EUROCRYPT 1994. LNCS, vol. 950, pp. 171–181. Springer, Heidelberg (1995)
13. Damgård, I., Hofheinz, D., Kiltz, E., Thorbek, R.: Public-key encryption with non-interactive opening. In: Malkin, T.G. (ed.) CT-RSA 2008. LNCS, vol. 4964, pp. 239–255. Springer, Heidelberg (2008)
14. Delerablée, C., Pointcheval, D.: Dynamic fully anonymous short group signatures. In: Nguyên, P.Q. (ed.) VIETCRYPT 2006. LNCS, vol. 4341, pp. 193–210. Springer, Heidelberg (2006)
15. Galindo, D.: Breaking and repairing Damgård et al. public key encryption scheme with non-interactive opening. In: Fischlin, M. (ed.) CT-RSA 2009. LNCS, vol. 5473, pp. 389–398. Springer, Heidelberg (2009)
16. Galindo, D., Libert, B., Fischlin, M., Fuchsbauer, G., Lehmann, A., Manulis, M., Schröder, D.: Public-key encryption with non-interactive opening: New constructions and stronger definitions. In: Bernstein, D.J., Lange, T. (eds.) AFRICACRYPT 2010. LNCS, vol. 6055, pp. 333–350. Springer, Heidelberg (2010)
17. Goldreich, O.: Foundations of Cryptography. Basic Tools, vol. 1. Cambridge University Press, New York (2001)
18. Goldreich, O.: Foundations of Cryptography. Basic Applications, vol. 2. Cambridge University Press, New York (2004)
19. Groth, J.: Simulation-sound NIZK proofs for a practical language and constant size group signatures. In: Lai, X., Chen, K. (eds.) ASIACRYPT 2006. LNCS, vol. 4284, pp. 444–459. Springer, Heidelberg (2006)
20. Groth, J.: Fully anonymous group signatures without random oracles. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS, vol. 4833, pp. 164–180. Springer, Heidelberg (2007)
21. Groth, J., Sahai, A.: Efficient non-interactive proof systems for bilinear groups. In: Smart, N.P. (ed.) EUROCRYPT 2008. LNCS, vol. 4965, pp. 415–432. Springer, Heidelberg (2008)
22. Isshiki, T., Mori, K., Sako, K., Teranishi, I., Yonezawa, S.: Using group signatures for identity management and its implementation. In: Digital Identity Management, pp. 73–78 (2006)
23. Kiltz, E.: Chosen-ciphertext security from tag-based encryption. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 581–600. Springer, Heidelberg (2006)
24. Lai, J., Deng, R.H., Liu, S., Kou, W.: Efficient CCA-secure PKE from identity-based techniques. In: Pieprzyk, J. (ed.) CT-RSA 2010. LNCS, vol. 5985, pp. 132–147. Springer, Heidelberg (2010)
25. Nakanishi, T., Sugiyama, Y.: An efficient anonymous survey for attribute statistics using a group signature scheme with attribute tracing. IEICE Transactions 86-A(10), 2560–2568 (2003)
26. Ohtake, G., Fujii, A., Hanaoka, G., Ogawa, K.: On the theoretical gap between group signatures with and without unlinkability. In: Preneel, B. (ed.) AFRICACRYPT 2009. LNCS, vol. 5580, pp. 149–166. Springer, Heidelberg (2009)
27. Rompel, J.: One-way functions are necessary and sufficient for secure signatures. In: STOC, pp. 387–394 (1990)
28. Sahai, A.: Non-malleable non-interactive zero knowledge and adaptive chosen-ciphertext security. In: FOCS, pp. 543–553 (1999)
29. Shamir, A.: How to share a secret. Commun. ACM 22(11), 612–613 (1979)
30. Zhou, S., Lin, D.: Shorter verifier-local revocation group signatures from bilinear maps. In: Pointcheval, D., Mu, Y., Chen, K. (eds.) CANS 2006. LNCS, vol. 4301, pp. 126–143. Springer, Heidelberg (2006)
Appendix

In the Appendix, we review the Groth scheme [20].

Certified signature [6]: In a signature scheme with a public-key infrastructure, a public verification key must be certified by a certification authority. Groth applies the (basic) Boneh-Boyen signature scheme [7] for the signing part and the Zhou-Lin signature scheme [30] for certifying the verification key.

Strong one-time signature [7]: Let (vk_sots, sk_sots) ← KeyGen_sots(1^λ) be a verification/signing key pair. For a message M, we denote σ_sots ← Sign_{sk_sots}(vk_sots, M), and 1 ← Ver_{vk_sots}(M, σ_sots) denotes that σ_sots is a valid signature of M ∈ M_sots. In the following GS construction, the message is signed by using sk_sots, and vk_sots is signed by using a secret key x of the certified signature.

Non-Interactive Witness-Indistinguishable (NIWI) proof system [21]: The key generator K_NI takes bilinear groups (p, G, G_T, e, g) as input, where G and G_T are cyclic groups of prime order p, e : G × G → G_T is an efficiently computable bilinear map, and g ∈ G is a generator, and outputs a common reference string crs and an extraction key xk. The prover P takes crs and a witness as input, and outputs a proof π. The verifier V takes crs, π, and a set of equations as input, and outputs 1 if the proof is valid, and 0 otherwise. The extractor X extracts the witness from π by using xk. In the following GS construction, π ← P_NIWI(crs, (pk, a, m), (b, v, σ′)) denotes a (two pairing product equations) NIWI proof π over a certified signature (b, v, σ′) on m such that e(a, hv) e(f, b) = T ∧ e(σ′, v g^m) = e(g, g). 1 ← V_NIWI(crs, (pk, a, m), π) denotes that π is a valid proof. By using xk, (b, v, σ′) can be extracted from π as (b, v, σ′) ← X_xk(crs, (gpk, a, Hash(vk_sots)), π), where Hash : {0, 1}* → {0, 1}^{ℓ(λ)} is a collision-free hash function. This proof consists of three commitments to b, v, and σ′, respectively.
c = (c_1, c_2, c_3) = (F^r U^t, H^s V^t, g^{r+s} W^t x) is a commitment of a value x ∈ G, where r, s, t ∈ Z_p are random values. By the witness indistinguishability of the proof, no adversary can tell from a group signature which group member has signed the message.

Tag-based PKE [23] and NIZK: We denote the encryption algorithm as y ← E_pk(tag, m), where y = (y_1, . . . , y_5) = (F^r, H^s, g^{r+s} m, (g^{tag} K)^r, (g^{tag} L)^s)
and r, s ∈ Z_p are random values. A ciphertext y can be publicly verified, which we denote 1 ← ValidCiphertext(pk, tag, y), meaning that y is a valid ciphertext under tag. Concretely, the ValidCiphertext algorithm outputs 1 if e(F, y_4) = e(y_1, g^{tag} K) and e(H, y_5) = e(y_2, g^{tag} L). In the following GS construction, a part of a certified signature σ′ is encrypted by using Kiltz's (selective-tag weakly CCA-secure) tag-based encryption scheme [23] with the tag Hash(vk_sots). In addition, for a commitment c to σ′ and a ciphertext y under Hash(vk_sots), a signer computes a NIZK proof to prove that the plaintext of y and the committed value in c are the same. We denote it as ψ ← P_NIZK(crs, (gpk, y, π), (r, s, t)). 1 ← V_NIZK(crs, (gpk, π, y), ψ) denotes that ψ is a valid NIZK proof.

Protocol 5. Groth's Fully Anonymous Group Signature Scheme [20]

GKg(1^λ): Set gk := (p, G, G_T, e, g), select f, h, z, K, L ←$ G, compute T := e(f, z) and (crs, xk) ← K_NI(1^λ), where crs = (F, H, U, V, W, U′, V′, W′). Set pk := (F, H, K, L), and output (gpk, ik, ok) = ((gk, Hash, f, h, T, crs, pk), z, xk).

UKg: Set upk_i = i and usk_i = ε.

⟨Join, Iss⟩(U_i: gpk, IM: gpk, ik): As the result of this interactive algorithm, a user U_i obtains (v_i, gsk_i = (x_i, a_i, b_i)) and IM obtains (v_i, a_i, b_i), respectively, with the relations e(a_i, h v_i) e(f, b_i) = T and v_i = g^{x_i}. IM adds v_i to reg[i]. We omit the actual protocol due to page limitations (see [20]).

GSig(gpk, gsk_i, M): Run (vk_sots, sk_sots) ← KeyGen_sots(1^λ), and repeat this until Hash(vk_sots) ≠ x_i holds. Choose ρ ←$ Z_p, and compute a = a_i f^{−ρ}, b = b_i (h v_i)^ρ, σ′ = g^{1/(x_i + Hash(vk_sots))}, π ← P_NIWI(crs, (gpk, a, Hash(vk_sots)), (b, v_i, σ′)), y ← E_pk(Hash(vk_sots), σ′), ψ ← P_NIZK(crs, (gpk, y, π), (r, s, t)), and σ_sots ← Sign_{sk_sots}(vk_sots, (M, a, π, y, ψ)), and output σ = (vk_sots, a, π, y, ψ, σ_sots).

GVf(gpk, M, σ): If 1 ← Ver_{vk_sots}((M, a, π, y, ψ), σ_sots), 1 ← V_NIWI(crs, (gpk, a, Hash(vk_sots)), π), 1 ← V_NIZK(crs, (gpk, π, y), ψ), and 1 ← ValidCiphertext(pk, Hash(vk_sots), y), then return 1, and 0 otherwise.

Open(gpk, ok, M, σ, reg): Extract (b, v, σ′) ← X_xk(crs, (gpk, a, Hash(vk_sots)), π). Return (i, (σ′, i, v)) if there exists i such that v = v_i ∈ reg, and (0, (σ′, 0, v)) otherwise.

Judge(gpk, upk_i, M, σ, (σ′, i, v)): Return 1 if e(σ′, v g^{Hash(vk_sots)}) = e(g, g) ∧ upk_i = i ∧ i ≠ 0, and 0 otherwise.
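As a numeric illustration of the tag-based ciphertext shape y = (F^r, H^s, g^{r+s} m, (g^{tag} K)^r, (g^{tag} L)^s) recalled above, the following sketch works in the plain group Z_p^*. This is an assumption for illustration only: the scheme itself requires a bilinear group, the pairing-based ValidCiphertext check cannot be reproduced here, and the recovery helper uses the encryption randomness, which a real decryptor does not have.

```python
p = 2**127 - 1  # a Mersenne prime; toy modulus standing in for the group
g = 3           # illustrative base element (assumption, not a real generator choice)

def tbe_encrypt(pk, tag, m, r, s):
    """y = (F^r, H^s, g^{r+s} m, (g^tag K)^r, (g^tag L)^s), as in the text."""
    F, H, K, L = pk
    y1 = pow(F, r, p)
    y2 = pow(H, s, p)
    y3 = pow(g, r + s, p) * m % p
    y4 = pow(pow(g, tag, p) * K % p, r, p)
    y5 = pow(pow(g, tag, p) * L % p, s, p)
    return (y1, y2, y3, y4, y5)

def recover_with_randomness(y3, r, s):
    """Strip the mask g^{r+s} off y3 (possible here only because we know r, s;
    the modular inverse uses Fermat's little theorem, p being prime)."""
    return y3 * pow(pow(g, r + s, p), p - 2, p) % p
```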
A Generic Binary Analysis Method for Malware

Tomonori Izumida^{1,2}, Kokichi Futatsugi^2, and Akira Mori^1

^1 National Institute of Advanced Industrial Science and Technology
2-3-26 Aomi, Koto-ku, Tokyo 135-0064, Japan
{tomonori.izumida,a-mori}@aist.go.jp
^2 Japan Advanced Institute of Science and Technology
1-1 Asahidai, Nomi, Ishikawa 923-1292, Japan
{tizmd,kokichi}@jaist.ac.jp
Abstract. In this paper, we present a novel binary analysis method for malware which combines static and dynamic techniques. In the static phase, the target address of each indirect jump is resolved using backward analysis on static single assignment form of binary code. In the dynamic phase, those target addresses that are not statically resolved are recovered by way of emulation. The method is generic in the sense that it can reveal control flows of self-extracting/obfuscated code without requiring special assumptions on executables such as compliance with standard compiler models, which is requisite for the conventional methods of static binary analysis but does not hold for many malware samples. Case studies on real-world malware examples are presented to demonstrate the effectiveness of our method.
1
Introduction
Malicious software, or malware, has become a major threat to today's network-centric society. Malware scanners are used to protect computers from the threat. However, such tools rely on syntactic signature matching algorithms and can be evaded by program transformation techniques. As a result, the threat remains an ongoing issue, and large numbers of newly found malware samples are reported to anti-malware companies each day. After receiving such samples, malware analysts examine each suspicious executable to understand its behavior and effects on the system, characterize the hazard it may cause, and extract a distinctive signature pattern from the binary. These analysis steps involve many manual operations and are time consuming. Various methods have been proposed to automate the analysis of malware binaries. Dynamic analysis, which executes a target on a debugger or an emulator within a controlled environment to observe its activity, is used in many practical studies. However, the method fails to detect fatal activities of interest if the target changes its behavior depending on trigger conditions such as the existence of a specific file, since only a single execution path may be examined in each attempt. These shortcomings come from the lack of a global view of the program's behavior.

I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 199–216, 2010. © Springer-Verlag Berlin Heidelberg 2010

Static analysis, on the other hand, extracts such high-level information as control
flow graphs (CFGs) from a binary and permits multi-path analysis without executing it. Enhancing the range of static analysis will greatly benefit the analysis of malware behaviors. Static analysis may be thwarted by various code obfuscation techniques. For example, opaque constants [14] obtained by lengthy arithmetic or intentional errors at run-time are difficult to predict statically. A number of techniques for obstructing disassemblers using opaque constants have been reported [14]. Executable packers such as UPX and ASPack are other obstacles to static analysis; they compress executable files and prefix small decompressing routines to the compressed data for run-time extraction. Although such packers can be applied to benign software to save disk space or to reduce loading time, many malware samples also abuse executable packers to hide themselves from reverse engineering. Considering the fact that these obfuscation techniques do not pose serious problems to dynamic analysis, it is natural to combine static and dynamic methods so that they mutually cover each other's weaknesses. If the static analyzer is ineffective for exploring fragments of the target binary, it can rely on its dynamic counterpart. The challenge of static binary analysis can be summarized as how to determine the destination addresses of indirect jumps. For example, to follow the control flow after an instruction CALL EAX, we need to know what constant value will appear in the register EAX at the time of execution. This is a difficult task since it must be done without running the code, and traces of values stored in machine registers or memory locations can be complex in malware because of its anti-analysis capabilities. It is an undecidable problem in general, and restrictions and/or approximations must be introduced for a concrete solution. Current approaches to indirect jump resolution assume that the target binary is built from a high-level language like C/C++ with standard compiler conventions. There are many malware samples that are written directly in assembly language and do not conform to such conventions at all. In this paper, we introduce a novel analysis method for malware binaries which does not rely on special assumptions on target binaries. The method combines static and dynamic analysis methods to reveal multi-path program behaviors by backward tracing of values in registers and memory locations at control transfer points. For this, a demand-driven symbolic evaluation method is developed for the static single assignment (SSA) form of machine instructions. We present a new static analysis tool called Syman together with a lightweight emulator Alligator [12,11], both implemented for binary executables for MS Windows Operating Systems on the IA-32 architecture. Syman and Alligator are designed to cooperate with each other to discover the entire control flow of the target. Syman works on execution paths that are not explored by Alligator, while Alligator handles self-extracting code and gathers run-time information including opaque values needed by Syman. The tools are controlled in the following manner. Syman is invoked first from the entry point of the program to build an initial CFG by resolving indeterminate destination addresses as much as possible. Then, Alligator takes over control to start emulation from the entry point until it encounters a code block that is not
yet visited by Syman. When such a disagreement occurs, Syman resumes expanding the CFG from the newly found code block. In this way, Syman and Alligator work together to build an entire CFG. In the particular case where the target executable is compressed by a simple executable packer, the code block extracted by Alligator is considered unvisited and Syman proceeds to the now uncompressed part that was hidden in the original executable. The organization of the paper is as follows. Section 2 gives an overview of the binary analysis method and explains how the method combines static and dynamic techniques. Section 3 explains the static analysis method, which resolves indirect jumps without making assumptions on the target executable, and Section 4 touches upon issues concerning implementation. Case studies with several real-world malware samples are presented in Section 5. We finally list related work in Section 6 and conclude with future work in Section 7.
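The cooperation between Syman and Alligator described above can be sketched as a driver loop. The interfaces `syman.expand` and `alligator.emulate_until_unseen` below are hypothetical stand-ins for this sketch, not the tools' actual APIs:

```python
def analyze(entry_point, syman, alligator):
    """Alternate static expansion (Syman) and emulation (Alligator):
    Syman builds an initial CFG, Alligator emulates until it reaches code
    Syman has not visited (e.g. freshly unpacked code), and Syman resumes
    expanding the CFG from there."""
    cfg = syman.expand(entry_point)
    frontier = [entry_point]
    while frontier:
        start = frontier.pop()
        # Emulate from `start`; returns an address outside `cfg`, or None
        # if emulation finishes inside already-analyzed code.
        unseen = alligator.emulate_until_unseen(start, cfg)
        if unseen is None:
            break
        cfg = syman.expand(unseen, cfg)  # resume static expansion there
        frontier.append(unseen)
    return cfg
```

For a simply packed executable, the first call to `emulate_until_unseen` would return the entry of the decompressed code, which the static phase then explores without re-emulation.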
2
Method Overview
In this section, we explain the ingredients of our method, the static analyser Syman and the dynamic analyser (emulator) Alligator, and how the two interact with each other.
2.1
Dynamic Analyser
The dynamic analyser (emulator) Alligator [12,11] consists of an Intel IA-32 CPU simulator, virtual memory, and a number of stub functions corresponding to Win32 API functions for emulating operating system capabilities such as process/thread management and memory management. Alligator is lightweight in the sense that it does not require an instance of Windows OS and uses dummy software objects instead of emulating actual hardware such as disk drives and network devices. When a target executable is given, Alligator loads it into virtual memory, creates process/thread information structures in the same way as an actual Windows OS does, and loads dynamic link libraries (DLLs). Since Alligator emulates OS capabilities by way of stub functions rather than executing API function code in DLLs, it does not require the real DLLs distributed with Windows OS. Pseudo DLLs containing minimal information about API functions, such as function names and entry addresses, are used instead. Alligator notifies Syman of the lists of loaded DLLs and exported API functions when it finishes loading the target. Syman uses this information to build special basic blocks corresponding to API functions (described in Section 4.1). OS-specific control such as exception handling, thread management, and memory management is emulated accordingly in Alligator. To interpret API functions symbolically, a stub function is invoked when control is transferred to the address of the API function in virtual memory. The default task of a stub function is to remove arguments placed on top of the stack and to store a tentative return value in register EAX. To make stub calls transparent and extendable,
machine-generated pseudo library (DLL) files that imitate the standard memory layout of Windows operating systems are prepared. In the current version, Alligator has approximately 1500 non-trivial stub functions over 1700 pseudo DLL files and operates without any proprietary code, which makes the system highly portable and manageable. The stub mechanism that bypasses external library code contributes greatly to processing speed. Note that Alligator uses a CPU simulator and is much slower than commercial emulators based on native execution. Alligator is able to detect the execution of self-modified code by using the memory page management system of the IA-32 architecture. Alligator clears the dirty bit of each page table entry before execution, and for each step-wise execution it checks whether the dirty bit of the current code page is set. If it is set, the instruction that is about to be executed might have been modified during previous execution. Alligator flags this event to let Syman analyze the code from that address at a later stage. Self-modified code generated by advanced packers can be analyzed by Syman using this mechanism. Alligator was originally developed for detecting unknown computer viruses/worms by identifying common malicious behaviors such as mass mailing, registry overwrites, process scanning, external execution, and dubious Internet connections by monitoring API calls during emulation. Since Alligator operates in a self-contained software process, there is no need to restart a guest OS for a new session, nor are there additional costs for parallel processing in multiple instances, as opposed to other methods involving full OS emulation. Alligator can be used for server-side malware detection; in fact, it successfully detected nearly 95% of email-attached malware on a working email server from 2004 through 2006 without relying on signatures of previously captured samples.
2.2
Static Analyser
From a given address in code, the static analyser Syman disassembles the code stored in the virtual memory of Alligator by recursively traversing each direct jump, and translates each disassembled instruction into a sequence of statements in an intermediate representation (IR) that captures the operational semantics of each instruction. See Fig. 1. Each IR statement is in the form of an assignment, where a variable symbol is placed on the left-hand side and an IR expression on the right-hand side. A statement is said to define a variable v if v occurs on the left-hand side, and is said to use v if v occurs on the right-hand side. var_sz ← expr_sz represents an sz-bit data transfer, where var_sz is a variable symbol and expr_sz is a value expression. We may omit the subscript sz when it is clear from the context. A control statement ← ControlExpr defines no variable and represents a control transfer. To express the effect of a memory write operation, a memory expression is introduced. Assignments involving memory expressions are restricted to the form M′ ← St_sz(M, addr_32, expr_sz), which denotes that an sz-bit value expr_sz is stored into address addr_32 and the memory state is updated from M to M′.
expr_sz ::= int_sz | var_sz | expr_sz + expr_sz | expr_sz − expr_sz | expr_sz ∗ expr_sz | . . . | expr_sz & expr_sz | expr_sz expr_sz | ∼expr_sz | . . . | Ld_sz(M, expr_32)
ControlExpr ::= Jump(expr_32) | CALL(expr_32) | RET(expr_32) | Branch(expr_1, expr_32, expr_32) | . . .
MemoryExpr ::= M | St(M, expr_32, expr_sz)

Fig. 1. Intermediate Representation

X86 Instruction | Translated Statements
ADD EAX, 1234[EDX,ESI] | src ← Ld_32(M, EDX + ESI + 1234); EAX ← EAX + src
CALL EBX | dest ← EBX; ESP ← ESP − 4; M′ ← St_32(M, ESP, NextIP); ← Call(dest)
RET | ESP ← ESP + 4; dest ← Ld_32(M, ESP); ← Ret(dest)
Fig. 2. Translation of Machine Instructions
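A toy translator for the three instructions of Fig. 2 might look like this. IR statements are modeled as (lhs, rhs) pairs with lhs None for control statements; the tuple encoding is an assumption of this sketch, not the paper's data structure:

```python
def translate(instr):
    """Translate a machine instruction into IR assignments following Fig. 2."""
    if instr == "CALL EBX":
        return [
            ("dest", "EBX"),
            ("ESP", ("sub", "ESP", 4)),             # push: move stack pointer down
            ("M", ("St32", "M", "ESP", "NextIP")),  # store the return address
            (None, ("Call", "dest")),               # control statement
        ]
    if instr == "RET":
        return [
            ("ESP", ("add", "ESP", 4)),
            ("dest", ("Ld32", "M", "ESP")),
            (None, ("Ret", "dest")),
        ]
    if instr == "ADD EAX, 1234[EDX,ESI]":
        return [
            ("src", ("Ld32", "M", ("add", "EDX", "ESI", 1234))),
            ("EAX", ("add", "EAX", "src")),
        ]
    raise NotImplementedError(instr)
```

Note how the indirect memory operand and the implicit stack accesses both surface as explicit Ld32/St32 expressions over the memory state M.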
Related to this is the expression Ld_sz(M, addr_32), denoting an sz-bit value stored at address addr_32 in the memory state M. Note that a memory expression M has no bit-size type. Figure 2 shows examples of the translation. Notice that indirect memory operands such as 1234[EDX,ESI] and the stack operations implied by CALL/RET are translated using memory expressions. The last statements appearing in the translations of CALL and RET are inserted to specify the destination. The operators labeled Call/Ret are used for notational convenience. The static single assignment (SSA) form [8] is an intermediate representation originally used for compiler optimization. It is obtained by converting a control flow graph and satisfies the following conditions: 1. each use of a variable is dominated by its definition, and 2. every variable is defined only once. A control point A dominates another point B if A precedes B on any execution path. To meet the first condition, an assignment statement of the form v ← Φ(v, . . . , v) is inserted at each control point where multiple definitions of a variable v merge. The pseudo function Φ has no algebraic meaning. Each variable symbol is renamed to satisfy the second requirement after the insertion of Φ-functions. For example, in Fig. 3(a), the definition of EAX dominates its use in the bottom block, while the other variables have multiple definitions. To convert the graph into the SSA form, Φ-functions are inserted in the bottom block for the variables other than EAX, and each variable is renamed to make its definition unique (Fig. 3(b)). Syman continues the instruction translation illustrated above until it encounters a control transfer instruction. If its destination address is immediately determined,
204
T. Izumida, K. Futatsugi, and A. Mori
(a) Original CFG
(b) Converted SSA Form
Fig. 3. Example of SSA Form
a new basic block is created from that address. Otherwise, the node is marked as indeterminate. After all immediate addresses have been processed, the control flow graph is converted into the SSA form, in which the indeterminate destinations of marked nodes are examined to see if they evaluate to an immediate value. This evaluation step takes a demand-driven approach [18,20], in which the variables in the evaluated expression are replaced with their definitions by tracing their def-use chains. When evaluating an expression Ldsz (M, addr), memory state tracing is performed by searching for the most recent expression of the form M ← Stsz (M′, addr′, val) where addr and addr′ coincide. The expression is then replaced with the value val. The process can be complex when it involves a code segment that is reached from/returns to multiple points, since the calling context must be identified to trace values in the def-use chains of variables. This is what is called "inter-procedural analysis" in a traditional setting. For normal executables generated by standard compilers, it can be done by identifying subroutines/procedures and replicating them for each call site. This is not difficult because a subroutine is entered at the same address and returns to the instruction following the calling CALL instruction via the RET instruction at the bottom. However, in malware code generated by an assembler, code sharing can be hidden in the most obscure ways. For example, inserting an irregular RET instruction after pushing an arbitrary value on the stack can cause serious trouble for static analysis. Static analysis of malware binary code thus requires a new method for generalized inter-procedural analysis. Based on the observation that the situation appears as a destination d = Φ(addr 1, addr 2, . . . , addr n) merged by a Φ-function in the SSA form, we propose identifying a shared code segment between the definition site and the use site of the merged destination d, which has a separate control flow for each argument address addr i. In other words, we interpret the Φ-function in the merged destination as a "choice operator" and perform interprocedural analysis by resolving the choice step by step. In a typical situation,
A Generic Binary Analysis Method for Malware
205
the shared code segment is a subroutine and the arguments of Φ correspond to its callers. Note that the subroutine is called n times. The evaluation algorithm is described in detail in Section 3.

2.3 Controller
The role of the controller is to manage the interaction between Syman and Alligator. When a target executable is given, the controller directs Alligator to load it into memory and to prepare the environment for execution. After that, the controller passes the entry address specified in the executable file to Syman for constructing an initial CFG before invoking dynamic execution. Once an initial CFG is obtained, the controller sets the current block to the initial basic block of the graph, suspends Syman, and gets Alligator to start step-wise execution from the entry point, confirming that each instruction is contained in the current block. After the last instruction of the current block is executed, Alligator continues if the next instruction address points at one of the successor blocks of the current block, in which case that block becomes the new current block. The controller suspends Alligator and resumes Syman from the current instruction address to create a fresh CFG in the following cases:

– The next instruction address after the execution of the current block points at none of the successor blocks. This happens when Syman has failed to resolve an indirect jump.
– Alligator reports that the current code page was modified during execution, as mentioned in Section 2.1. The current block, which was read by Syman in the previous phase, may differ from the actual code residing in memory.
– The current instruction address is out of the current block. This can be caused by, for example, a run-time exception. The details of how the Windows OS handles such run-time exceptions are omitted due to space limitations.

After the dynamic analysis terminates (e.g., on termination of the emulated process), the controller retrieves the sequence of CFGs created by Syman and puts them together to obtain a single large CFG as the final output of the analysis.
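The alternation the controller performs can be sketched as follows. Here `build_cfg` and `execute_block` are hypothetical stand-ins for Syman's and Alligator's interfaces, and the bookkeeping is simplified to block addresses and successor sets.

```python
# A hedged sketch of the controller's static/dynamic alternation.
def analyze(entry, build_cfg, execute_block):
    """Alternate static CFG construction with emulated execution.

    build_cfg(addr)     -> dict: block address -> set of known successors
                           (empty set when the outgoing jump is unresolved)
    execute_block(addr) -> concrete next block address, or None at exit
    """
    cfgs = []
    addr = entry
    while addr is not None:
        cfg = build_cfg(addr)                 # static phase ("Syman")
        cfgs.append(cfg)
        while addr is not None:
            next_addr = execute_block(addr)   # dynamic phase ("Alligator")
            if next_addr is None or next_addr in cfg.get(addr, set()):
                addr = next_addr              # stayed on a statically known edge
            else:
                addr = next_addr              # unresolved jump: back to static phase
                break
    merged = {}
    for g in cfgs:                            # stitch phase graphs into one CFG
        for a, succs in g.items():
            merged.setdefault(a, set()).update(succs)
    return merged
```

In this toy protocol, an execution step that leaves the statically known edges triggers a fresh static phase from the concrete address, mirroring the three resume conditions listed above.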
3 Symbolic Value Evaluation Method
In this section, we explain an algorithm to evaluate indeterminate values for extending CFGs. An expression expr is called a Φ-expression when expr is of the form Φ(expr1 , expr2 , . . . , exprn ). Similarly, a statement var ← expr is called a Φ-statement when expr is a Φ-expression. The location of a statement s, denoted by loc(s), is defined as a pair (n, i) where n is a node containing s and i is a nonnegative integer such that i = 0 when s is a Φ-statement and i = k when s is the k-th non-Φ-statement in n. We write (n, i): var ← expr when the statement is
located at (n, i). The location of a variable var, denoted by loc(var), is defined as the location of its defining statement if one exists, or as (entry, 0) otherwise. Note that entry denotes the designated entry node of a CFG. We say a location (n, i) is higher than (m, j), or equivalently, (m, j) is lower than (n, i), when n dominates m (n ≠ m), or when i < j (n = m). We may write (n, i) ≺ (m, j). A special location (n, ∞) is defined for each node n such that (n, i) ≺ (n, ∞) for any i ≥ 0. We say an expression is valid if the variables appearing in it can be ordered linearly by location. By the nature of control flow graphs in the SSA form, expressions in a CFG are guaranteed to be valid, except for Φ-expressions. The location of an expression expr, denoted by loc(expr), is defined to be the lowest location of its variables when expr is valid, and is undefined otherwise. For example, every expression appearing on the right-hand side of a non-Φ-statement is valid by definition.

3.1 Demand-Driven Evaluation
Suppose that we evaluate a valid expression expr in a control flow graph cfg in the SSA form. The algorithm assumes a height limit to restrict substitution during evaluation: if the limit is set higher, evaluation becomes more accurate but more time-consuming. A height limit can be any location in cfg. We say expr is evaluated up to location l when the height limit is set to l. Let V be the set of variables appearing in expr. If loc(expr) is higher than the height limit, evaluation terminates and expr is returned as the result. Otherwise, a variable v having the lowest location is taken from V and its definition is considered for further evaluation. A new height limit l′ is set to the location of the second lowest variable (next to v) in V. If v is a memory state variable M, an outermost subexpression Ld(M, addr) of expr containing M is evaluated up to l′ using the memory state tracing method explained in the next section, and the subexpression is replaced by the evaluation result in expr. Otherwise, the defining expression expr′ of v, found in an assignment v ← expr′ located at loc(v), is evaluated up to l′. If the definition is not a Φ-statement, expr′ is further evaluated in a recursive manner and v is replaced by the result in expr. If the definition is a Φ-statement, we locate the assignment (n, 0): v ← Φ(v0, . . . , vk). If (n, 0) ≺ loc(vi), vi must have been defined in a node in a loop whose loop header, that is, the top node of the loop, is n. We evaluate such vi up to (n, 0), that is, just for one round of the loop, to see whether the resulting expression expr i contains a recurrence of v. All occurrences of such v are replaced with a placeholder X, which is not treated as a variable in further evaluation. The case where expr i contains other variables defined at (n, 0) is treated similarly. Since loc(expr i) must be higher than (n, 0), we evaluate expr i up to l. In the other case, where vi is defined at a node outside the loop, we simply evaluate vi up to l.
Suppose all arguments vi of the Φ-function have been evaluated up to l. Let {expr i } be the set of resulting expressions. We form μX.Φ(expr 0 , . . . , expr k ) as the evaluation result of Φ(v0 , . . . , vk ), where μ is the fixed point operator.
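The core substitution loop of this demand-driven evaluation can be sketched over a toy expression language. As simplifying assumptions for this sketch, locations are plain integers (smaller means higher in the CFG), expressions are ints, variable names, or `('+', a, b)` tuples, and Φ-handling, μ-terms, and memory tracing are all omitted.

```python
# A minimal sketch of demand-driven evaluation with a height limit.
def evaluate(expr, defs, loc, limit):
    """Expand variables by their definitions until every remaining
    variable is located strictly above `limit`, folding constants."""
    def variables(e):
        if isinstance(e, str):
            return {e}
        if isinstance(e, tuple):
            return variables(e[1]) | variables(e[2])
        return set()

    def subst(e, var, rep):
        if e == var:
            return rep
        if isinstance(e, tuple):
            return (e[0], subst(e[1], var, rep), subst(e[2], var, rep))
        return e

    def simplify(e):
        if isinstance(e, tuple):
            a, b = simplify(e[1]), simplify(e[2])
            if isinstance(a, int) and isinstance(b, int):
                return a + b            # only '+' in this toy language
            # fold nested constant offsets: (x + c1) + c2 -> x + (c1 + c2)
            if isinstance(b, int) and isinstance(a, tuple) and isinstance(a[2], int):
                return ('+', a[1], a[2] + b)
            return (e[0], a, b)
        return e

    while True:
        low = [v for v in variables(expr) if loc[v] >= limit]
        if not low:
            return simplify(expr)
        v = max(low, key=lambda x: loc[x])      # lowest-located variable first
        expr = simplify(subst(expr, v, defs[v]))
```

With `ESP2 = ESP1 − 4` and `ESP3 = ESP2 − 4`, evaluating `ESP3` up to the location of `ESP1` folds the stack-pointer chain into `ESP1 − 8`, analogous to the ESP tracing the paper performs.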
Expressions bound by μ are not utilized in further computation, except in the trivial case where unguarded occurrences of X in a Φ-function are removed; that is, μX.Φ(X, A, X) becomes A. For example, ESP2 in Figure 4 is defined as Φ(ESP1, ESP5, ESP15). Since ESP5 evaluates back to ESP2, and ESP15 is found to be equal to ESP1, the result of evaluating ESP2 becomes μX.Φ(ESP1, X, ESP1) = ESP1. The fixed point operator is only used to prevent endless recursive computation; no fixed point calculus is attempted at the moment. For simplicity, we assume throughout the paper that CFGs are reducible, that is, all nodes in a loop are dominated by the loop header. For irreducible CFGs, the loops must be treated carefully using dominator strong components [19]. It is noted that we keep track of (n, 0), explained above, as a loop header during evaluation along a loop to avoid interminable evaluation in the address comparison of the next section. Whenever evaluation of expr succeeds, the set V of variables appearing in expr is updated, taking into consideration the variables appearing in replaced subexpressions. If the height limit is initially set higher than any variable appearing in expr, the locations of variables in V are always lower than the height limit. The above evaluation steps are iterated until V is empty. Note that loc(expr) ≺ l at this time.

3.2 Memory State Tracing
In this section, we explain how an expression Ldsz (M, addr) is evaluated, namely, by finding the most recent memory assignment to the same memory location addr. We follow the example shown in Figure 4 and explain the case where M's defining assignment is of the form M ← Stsz′ (M′, addr′, value). The other case, where M is defined by M ← Φ(M0, . . . , Mi), is explained in the next section. First, we compare the addresses by evaluating the expression addr − addr′ to see whether the result d is an integer constant such that 0 ≤ d < (sz′/8). If so, we recursively evaluate value up to l. If d = 0 and sz = sz′, the entire result is returned; otherwise, the required portion of the result is returned and evaluation continues to obtain the rest. Note that the expression addr − addr′ is evaluated as far as possible regardless of l, since the height limit is intended to control how far we trace the definitions of M and is irrelevant to address comparison. Consider, for example, the evaluation of Ld(M11, EBP1) up to (h, 0) in Figure 4. The evaluation of EBP1 − ESP14 should go beyond (h, 0) until it is confirmed that ESP7 − (ESP7 + 12) = −12 (although this means failure of the comparison because d < 0), rather than stopping with the result EBP1 − ESP13 + 4 at (h, 0). However, if the comparison occurs in a loop, we limit evaluation up to the loop header, since unrestricted evaluation may never terminate. The address comparison fails when the result d is non-constant or an integer constant out of range. In this case, we recursively evaluate Ld(M′, addr) instead of Ld(M, addr). If loc(M′) ≺ l, evaluation terminates and Ldsz (M′, addr) is returned. Note that an sz-bit value stored at addr is taken from the virtual
Fig. 4. Memory State Tracing
memory of the system if l = (entry, 0), and in such a case addr is evaluated to an integer constant. Since the memory state M is not re-evaluated after a comparison failure, there is a chance that the correct value is not obtained; instead, we may end up with obsolete values. This also happens when one of the addresses in the comparison varies in a loop, since fixed point calculation is currently too inefficient. In the current implementation, we may choose to be conservative in stopping address comparison.

3.3 Backtracking over Φ Functions
We now discuss the case where M is defined by a Φ-statement in the evaluation of Ld(M, addr) up to l. For convenience, we write the Φ-statement as (n, 0): M ← Φ(n0: M0, . . . , nk: Mk), where ni is the predecessor node of n corresponding to the i-th argument. Since loc(addr) ⪯ (n, 0), if addr has variables defined at (n, 0), they are the lowest among the variables occurring in addr and must have been defined by Φ-statements of the form v ← Φ(n0: v0, . . . , nk: vk). For each ni, we define addr i to be the expression obtained by replacing each such variable v in addr with vi. Since loc(vi) ≺ (ni, ∞), the derived address addr i is still valid and loc(addr i) ≺ (ni, ∞). We then evaluate the derived expressions E = {Ld(M0, addr 0), . . . , Ld(Mk, addr k)} by means of backtracking over the incoming edges of n. Let InLoop = {Ld(Mi, addr i)
∈ E | (n, 0) ≺ loc(Mi)} and OutLoop = E \ InLoop. We first evaluate the expressions in InLoop up to (n, 0)¹, paying attention to the confluence of memory state variables. As described in Section 3.2, the loop header (n, 0) is regarded as the height limit of address comparison in this evaluation. If InLoop contains a single expression of the form Ld(Mi, addr i), we evaluate it up to (n, 0). Otherwise, we choose a pair of expressions Ld(Mi, addr i) and Ld(Mj, addr j) from InLoop and let Mk be the nearest common variable appearing in the def-use chains of both Mi and Mj. We evaluate these expressions up to loc(Mk), expecting that they converge. If the results are of the forms Ld(Mk, addr′ i) and Ld(Mk, addr′ j) where addr′ i ≠ addr′ j, we merge them as Ld(Mk, Φ(addr′ i, addr′ j)). We put the resulting expression back into InLoop in place of the original pair. We iterate this step until InLoop contains at most a single expression of the form Ld(Mi, addr i) where loc(Mi) ≺ (n, 0), and evaluate the other expressions in InLoop up to (n, 0). The set OutLoop is similarly evaluated up to (p, ∞). To avoid further recursive evaluation, we replace in InLoop the expressions containing Ld(M, ...) with ⊥. After that, the expressions in InLoop are evaluated up to (p, ∞). Finally, we evaluate Φ(expr 0, . . . , expr k) up to l, where {expr i} = InLoop ∪ OutLoop. We illustrate the above method on a couple of typical cases occurring in the resolution of the destination dest 1 in Figure 4. Let us first consider the evaluation of Ld(M10, EBP1) up to the location loc(M1). In this case, InLoop = ∅ and OutLoop = {Ld(M8, EBP1), Ld(M9, EBP1)}. We evaluate both expressions in OutLoop up to loc(M6). Since EBP1 does not match any address appearing in the tracing paths, the results coincide as Ld(M6, EBP1), which is then evaluated up to loc(M1) (the result is ESP1). Consider next the evaluation of Ld(M2, ESP2) up to (entry, 0).
We have InLoop = {Ld(M4, ESP5)} and OutLoop = {Ld(M11, ESP15)}. Since both are singletons, we only evaluate their contents. We first evaluate Ld(M4, ESP5) up to the loop header (b, 0). This ends up with a recurrence of Ld(M2, ESP2). We abandon this result, since it means that no corresponding write operation is found in the loop (or possibly one is missed due to the incompleteness of address comparison). We next evaluate Ld(M11, ESP15) up to (a, ∞), obtaining Ld(M1, ESP1). By merging the results, evaluation proceeds with the single expression Ld(M1, ESP1), and the value val 0 is returned as the final result.
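The address-matching search at the heart of memory state tracing can be sketched as follows. As simplifying assumptions, symbolic address comparison is reduced to plain equality, partial overlaps and Φ-merged states are omitted, and the `defs` encoding is a hypothetical illustration.

```python
# A hedged sketch of memory state tracing (Section 3.2): walk the def-use
# chain of memory state variables backwards, looking for the most recent
# St on the same address.
def trace_load(mem, addr, defs):
    """Evaluate Ld(mem, addr).  `defs` maps each memory state variable to
    its defining ('St', previous_mem, store_addr, value) tuple."""
    while mem in defs:
        _, prev, st_addr, value = defs[mem]
        if st_addr == addr:          # address comparison succeeds
            return value             # most recent write wins
        mem = prev                   # comparison fails: look further back
    return ('Ld', mem, addr)         # no matching store: stay symbolic
```

For a chain `M3 = St(M2, 0x104, 7)`, `M2 = St(M1, 0x100, 42)`, loading address 0x100 through M3 skips the non-matching store and yields 42, while an unmatched address is left as a symbolic load on the oldest state.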
4 Implementation Issues
Before presenting the case studies, we explain relevant implementation issues.

4.1 API Blocks
The evaluation method explained in the previous sections is implemented in Syman using the Python language. Syman shares virtual memory with Alligator and consults its CPU simulator for the values of registers and memory contents

¹ See the remark on reducibility in Section 3.1.
hModule ← Ld32 (M, ESP + 4)
lpProcName ← Ld32 (M, ESP + 8)
EAX ← Kernel32::GetProcAddress(hModule, lpProcName)
dest ← Ld32 (M, ESP)
ESP ← ESP + 12
← Ret(dest)
Fig. 5. An Example of API Block
prior to static analysis. Alligator is responsible for memory management, including loading executables and library files as well as resolving the addresses of API functions. These API function addresses also concern Syman, since descending into intricate Win32 library code should be avoided when revealing malware behaviors. In Syman, a special basic block is assigned to each API function. Figure 5 shows such an API block for the GetProcAddress function of the Kernel32 library. It is noted that the stdcall convention of Win32 platforms is encoded by storing the return value in the EAX register and removing the return address and the arguments pushed onto the stack before the call, by increasing the ESP value by the corresponding number of bytes. This number may vary for each API function. Syman shares with Alligator a signature database that maintains the numbers and types of arguments of functions². In fact, the type information for strings is utilized for evaluating string values by concatenating each evaluated byte (i.e., character) up to a null character. This typically takes place for LoadLibrary and GetProcAddress, to learn the names of libraries and functions.

4.2 Caching Evaluation Results
To avoid evaluating the same variable over and over, Syman caches the results of variable evaluation for future use. A cache entry for a variable var is a list of pairs (val i, li), where val i is the evaluation result of var up to li. When we evaluate var up to a location l, we look for the pair (val i, li) having the highest li such that l ⪯ li, and evaluate val i, instead of var, up to l. In case l ≠ li, the new result newval is cached by appending (newval, l) to the cache entry of var. The caching mechanism considerably improves the processing speed of Syman.
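The cache lookup rule can be sketched as follows. As simplifying assumptions, locations are integers (smaller means higher in the CFG), so a reusable entry is one whose limit has not gone past the requested one; the class and method names are hypothetical illustrations.

```python
# A hedged sketch of the per-variable evaluation cache of Section 4.2.
class EvalCache:
    def __init__(self):
        self.entries = {}            # var -> list of (value, limit) pairs

    def lookup(self, var, limit):
        """Return the reusable entry evaluated furthest, i.e. the pair
        whose limit is closest to `limit` without going beyond it
        (integer locations: smaller = higher in the CFG)."""
        pairs = [(v, li) for v, li in self.entries.get(var, []) if li >= limit]
        if not pairs:
            return None              # nothing cached that we can continue from
        return min(pairs, key=lambda p: p[1])

    def store(self, var, value, limit):
        self.entries.setdefault(var, []).append((value, limit))
```

Evaluation would then continue from the returned partial result up to the requested limit, and append the new pair if the limits differ.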
5 Case Studies
In this section, we present several case studies performed on malware samples gathered in a real environment.

5.1 Defeating Obfuscated Code
We give two examples of obfuscated malware code. As the first example, we take a Trojan horse identified as Rustock by the BitDefender scanner³. In Rustock, a

² The database can be automatically generated.
³ http://www.bitdefender.com
Disassembled Code              SSA Form
401172 MOV SS:[ESP], 4015A9    M106 ← St32 (M105, ESP208, 004015A9)
401179 PUSH 40123B             ESP209 ← ESP208 − 4
                               M107 ← St32 (M106, ESP209, 40123B)
40117E RET                     dest 58 ← Ld32 (M107, ESP209)
                               ESP210 ← ESP209 + 4
                               ← Ret(dest 58)
40123B RET                     dest 59 ← Ld32 (M107, ESP210)
                               ESP211 ← ESP210 + 4
                               ← Ret(dest 59)
4015A9 INC ESP                 ESP212 ← ESP211 + 1
...                            ...

Fig. 6. Obfuscated Control Flow (Rustock)
402F7C MOV DS:1433[EBX]{8}, 'e'
402F85 MOV DS:1435[EBX]{8}, 'p'
402F92 MOV DS:1431[EBX]{8}, 'S'
402F9C MOV DS:1432[EBX]{8}, 'l'
402FA6 MOV DS:1436[EBX]{8}, '\0'
402FB0 MOV DS:1434[EBX]{8}, 'e'
402FD2 LEA EDI, DS:1431[EBX]{8}
402FD8 MOV SS:4[ESP]{32}, EDI
402FDC CALL DS:1637[EBX]{32}

Fig. 7. Hiding API Name String "Sleep" (Troj.Obfus.Gen)
number of RET instructions are inserted to make arbitrary jumps, after putting the destination address on the stack top, in order to hide its self-decrypting routine. See the sequence of code excerpted from Rustock, in execution order, and the translated statements in SSA form in Figure 6. By manipulating the stack beforehand, the two RET instructions at 40117E and 40123B are used to jump to arbitrary addresses instead of returning from procedure calls. The obfuscation appears simple, but it is good enough to confuse state-of-the-art disassemblers, including IDAPro, as they lack data flow analysis capabilities. Statically tracing RET instructions requires serious data flow analysis, evaluating both the stack pointer and the values it points to. Malware creators can take advantage of this to congest analysis processes with automatically generated variants having slightly different obfuscation code. For Syman, it is an easy task to resolve such obfuscated RET instructions; e.g., dest 58 and dest 59 in Figure 6 are resolved as 40123B and 4015A9, respectively. Though Syman succeeds in generating a straight control flow graph for the entire decrypting routine of Rustock, having about 103 nodes, its main routine cannot be analyzed because of encryption. This is a common limitation of static binary analysis. In such cases, Alligator helps Syman to proceed, as explained before. Secondly, we take a malware sample named Troj.Obfus.Gen by the BitDefender scanner, approximately 17.5KB in size. It is an unencrypted Trojan horse that opens an Internet connection using the WININET.DLL library and serves as a good example of obfuscated code. As its name suggests, Troj.Obfus.Gen tries to hide the names of the API functions it calls and the libraries it dynamically loads by decomposing each name string into groups of characters and distributing them through various registers and memory locations. See the code excerpt from Troj.Obfus.Gen in Figure 7, in which irrelevant code is omitted.
In this example, the API name "Sleep" (including the terminating null character '\0'), given as the second argument of a GetProcAddress library call (corresponding to the last CALL DS:1637[EBX]{32}), is hidden in a sequence of MOV instructions. Although an advanced disassembler such as IDAPro can suggest API function calls for memory-indirect CALLs whose target address is trivial, an obfuscated case as described above cannot be handled and no further information is obtained in analysis.
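The recovery of such a byte-scattered name can be sketched by replaying the single-byte stores of Fig. 7 into a sparse memory map and then concatenating evaluated bytes up to the null terminator, as Syman does for LoadLibrary/GetProcAddress string arguments. The EBX base value below is a hypothetical stand-in, chosen only to make the sketch concrete.

```python
# A hedged sketch of recovering the hidden API name of Fig. 7.
EBX = 0x400000                  # hypothetical base; the real value is
                                # whatever EBX evaluates to at run time
stores = [                      # (address, byte) pairs mirroring Fig. 7
    (EBX + 1433, 'e'), (EBX + 1435, 'p'), (EBX + 1431, 'S'),
    (EBX + 1432, 'l'), (EBX + 1436, '\0'), (EBX + 1434, 'e'),
]

def read_c_string(memory, addr):
    """Concatenate bytes from addr up to (excluding) the null terminator."""
    out = []
    while memory.get(addr, '\0') != '\0':
        out.append(memory[addr])
        addr += 1
    return ''.join(out)

memory = {}
for addr, byte in stores:       # replay the MOV instructions in program order
    memory[addr] = byte

print(read_c_string(memory, EBX + 1431))   # -> Sleep
```

The out-of-order stores are irrelevant once they are replayed against the memory state; only the final byte layout at the LEA'd base address matters.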
Syman has successfully analyzed more than 89% of the code, measured against the size of the code section declared in the executable file. Considering code alignment and padding bytes in the gap, it is fair to say that Syman's analysis is complete for this example. There is only a single indirect destination that Syman failed to resolve in Troj.Obfus.Gen. It is a CALL instruction directed to an address at the midpoint of a heap memory region allocated by the VirtualAlloc API function. The program tries to download an executable file from the Internet using the WININET.DLL library API, store it in the heap, search for an entry point in the downloaded executable, and then jump to the address obtained by the search computation with loops. Such an address is obviously out of scope for static analysis and should be resolved by a dynamic analyzer. Alligator is capable of doing this.

5.2 Automated Analysis of Self-extracting Code
We experimented on a number of variants of online game worms identified as Trojan.PWS.OnlineGames.* collected in 2009. These worms are encrypted by execution packers. We demonstrate how our method treats such samples by alternating Syman and Alligator. After Alligator loads a sample into memory, Syman constructs an initial CFG covering the decryption routine and passes control to Alligator, which in turn executes the routine. When Alligator finishes the decryption, it jumps to the decrypted code block on a page modified during execution. Alligator judges that it has encountered self-modifying code and lets Syman analyze the code from the newly found block. For many samples, the first thing done after decryption is to retrieve the entry addresses of API functions and to store them in order in a global function table. Those API functions are later called directly by consulting the addresses in the table. Since the API functions stored in the table cannot be statically determined, Syman creates the second CFG up to the first API call through the table. When Alligator reaches that CALL instruction, the API function to be called is determined, and Syman resumes analysis from the API function block. In this turn, the API functions in the table are accessible to Syman. The main body of the sample is obtained as the third CFG, in which the CreateThread function may be called. In such cases, Alligator invokes Syman to analyze the thread code before emulating thread creation. As demonstrated above, our method can automatically analyze self-extracting malware code even with multiple encryption layers.

5.3 Processing Speed
In this section, several performance benchmarks are presented to show the current status of the tools. The data were collected on a PC with a Quad-Core Intel Xeon CPU (3.0GHz) and 16GB of RAM running Linux kernel 2.6.24.
– Rustock: 79 seconds for analyzing the decrypting routine.
– Troj.Obfus.Gen: 1588 seconds for the entire analysis.
– Online game worms: approximately 20 minutes to 50 hours.

Since Syman is implemented in the Python language, the processing speed is not as good as we would wish. We are working to improve the efficiency of evaluation.
6 Related Work
The limitation of single-path execution in dynamic analysis has been tackled by Moser and others [13]. Their method tracks "tainted" values, such as run-time system information and data obtained via the network, to identify conditional branches that depend upon them. When one of those branches is met, the execution context is saved and later restored after the termination of a single execution, so that the other path can be explored by adjusting the tainted variables on which the branch depends. The method is reported to allow multi-path exploration in a dynamic fashion, although saving/restoring the execution contexts may become costly as the number of branches to be checked increases; this is called the "path explosion" problem. Manipulating conditional branches without knowledge of global control flows also runs the risk of reaching a deadlock caused by inconsistency, and of an explosive growth in the number of paths to be explored. A similar approach is reported by Brumley and others [5], which performs logical symbolic execution over multiple paths rather than context switching; it seems to suffer from the same path explosion problem. These methods can be rendered ineffective by hiding trigger conditions, as reported by Sharif and others [17], using a source code transformation technique that introduces hash functions. The method proposed in this paper, on the other hand, does not depend on trigger conditions but on the destination addresses of branches at the machine instruction level, and will not be affected by trigger condition hiding alone. A rather different approach is taken by Comparetti and others [7], which abstracts malicious code fragments detected by dynamic analysis with a static analysis method similar to slicing. By comparing the obtained abstract models against other samples, they successfully identified a number of common malicious behaviors that would have escaped usual dynamic analysis.
However, the static analysis method they used is rather simple, and no extensive data flow analysis is performed; the method is thus still vulnerable to the advanced packing techniques and code obfuscation that are common in modern malware. Resolution of the target addresses of indirect jumps has been studied in various contexts. Cifuentes and Van Emmerik proposed a solution for recovering jump tables from binary code [6] in their efforts toward decompilation. A jump table is an array of addresses generated by compilers for transferring control to different program locations with a single indirect jump. Their approach examines disassembled instructions backward from the indirect jump to find the sequence of instructions that fetches the target address, which typically involves adding an index number to the base address of the table. The method requires the target executable to be compiled
from a C-like language by a specific compiler, as it relies on known code patterns. Our method is capable of tracking control flows over such switching structures generated by normal compilers. The value set analysis (VSA) [2] is a technique developed by Balakrishnan and Reps for the static analysis of binary code. It is based on abstract interpretation, using approximation over sets of possible values stored in variable-like memory locations, which they call a-locs. The method assumes that the target binary is generated by a standard compiler and tries to identify memory locations corresponding to variables appearing in the original source code by heuristic analysis. The method depends on a commercial disassembler called IDAPro⁴ for a-loc discovery and initial CFG construction, after which fixed point calculation is performed to estimate possible value ranges of a-locs via over-approximation. While the method focuses on compiler-generated executables, many malware samples are written in low-level assembly languages and are not amenable to heuristic discovery of variables. This is why we developed an on-demand/backward value evaluation method: for such malware code, the instances of registers and memory locations that may hold indeterminate addresses cannot be enumerated before analysis. By its nature, the VSA is unable to analyze malware code when it is not compiler generated and is obfuscated well enough to cheat IDAPro, which is possible, for example, by inserting fake RET instructions after manipulating the stack. The evaluation step of our method employs a constant propagation algorithm in SSA form called the sparse simple constant propagation (SSC) algorithm [15,16]. It traces the def-use chains in SSA form to replace a variable with a constant value when its definition is constant, as described in Section 3.1. Binary code analysis has become a major topic in the area of computer security.
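The sparse constant propagation idea can be sketched over a toy SSA program. This is a simplified illustration of the SSC approach [15,16] over def-use chains, without the full lattice (⊤/⊥) treatment, and not Syman's implementation.

```python
# A simplified sketch of sparse constant propagation over SSA definitions:
# starting from constant definitions, revisit uses until a fixed point,
# folding copies and Φ-functions whose incoming values all agree.
def propagate(defs):
    """defs maps each SSA variable to an int constant, another variable
    (a copy), or ('phi', v1, v2, ...).  Returns the variables proven
    constant, with their values."""
    const = {v: d for v, d in defs.items() if isinstance(d, int)}
    changed = True
    while changed:                       # iterate to a fixed point
        changed = False
        for v, d in defs.items():
            if v in const:
                continue
            if isinstance(d, str) and d in const:     # copy of a constant
                const[v] = const[d]
                changed = True
            elif isinstance(d, tuple) and d[0] == 'phi':
                vals = [const.get(a) for a in d[1:]]
                if all(x is not None for x in vals) and len(set(vals)) == 1:
                    const[v] = vals[0]   # all incoming values agree
                    changed = True
    return const
```

For example, with `x0 = 4`, `x1 = x0`, and `x2 = Φ(x0, x1)`, all three variables are proven to hold the constant 4, whereas a Φ whose arguments disagree stays non-constant.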
The BitBlaze Binary Analysis Platform project at UC Berkeley is one prolific example⁵. It advocates the coordinated use of static and dynamic methods for binary code analysis. In its static analysis component, called Vine [4], the SSA form is used for efficient calculation and representation of weakest preconditions, for which basic forward propagation is performed to simplify internal expressions in the manner of symbolic execution. The value set analysis explained above is employed for resolving indirect jumps/calls. Unresolved memory accesses are made concrete with help from dynamic analysis, that is, real code execution in virtual environments. As has been pointed out, Syman is more suited to malware analysis than the VSA, and Vine would benefit from the capabilities of Syman in this regard.
7 Conclusion and Future Work
In this paper, we have presented a malware analysis method that uses static analysis techniques to resolve indirect jumps for further exploration of control flows. The method is free of artificial limitations on the target binaries and is

⁴ http://www.hex-rays.com/idapro/
⁵ http://bitblaze.cs.berkeley.edu/
able to extract control flows not examined by dynamic analysis. Experiments show that entire CFGs can be generated for malware samples armed with packers and obfuscation. We have developed a static analysis tool called Syman that implements an on-demand symbolic evaluation algorithm on the static single assignment form of binary code, where values passing through memory are traced by locating matching read/write operations. A previously developed dynamic analysis tool (i.e., an emulator) called Alligator is used to overcome the limitations of static analysis. Generated CFGs can be used for many purposes, including detection of unknown malware, multi-path analysis, and identification of dormant malicious behaviors. For further analysis of CFGs, we are planning to apply a tree-difference algorithm [10] to dominator trees derived from CFGs for measuring distance/similarity among them. This will help us establish an automated method for the semantic classification/identification of malware. We plan to enhance the abilities of Syman by introducing extended versions of the SSA form, such as gated single assignment (GSA) [3,20] or the static single information (SSI) [1] form, to better handle loops and conditional branches. Alligator also needs improvements, especially in its emulation capabilities against various anti-debugger/emulator techniques. Since Alligator is a self-contained software tool, it is not difficult to modify and improve its functions, as opposed to full-fledged emulators requiring working Windows OS instances. We are currently working to incorporate Syman into a hardware-based virtualization tool such as Ether [9] to build a fully automated platform for malware analysis.
Acknowledgements

The research described in this paper has been supported by the Strategic Information and Communications R&D Promotion Programme (SCOPE) under the management of the Ministry of Internal Affairs and Communications (MIC) of Japan.
References

1. Ananian, C.S.: The static single information form. Tech. Rep. MIT-LCS-TR-801, Laboratory for Computer Science, Massachusetts Institute of Technology (September 1999), http://www.lcs.mit.edu/specpub.php?id=1340
2. Balakrishnan, G., Reps, T.: Analyzing memory accesses in x86 executables. In: Duesterwald, E. (ed.) CC 2004. LNCS, vol. 2985, pp. 5–23. Springer, Heidelberg (2004)
3. Ballance, R.A., Maccabe, A.B., Ottenstein, K.J.: The program dependence web: a representation supporting control-, data-, and demand-driven interpretation of imperative languages. In: Proceedings of the ACM SIGPLAN 1990 Conference on Programming Language Design and Implementation, pp. 257–271 (1990)
4. Brumley, D.: Analysis and Defense of Vulnerabilities in Binary Code. Ph.D. thesis, School of Computer Science, Carnegie Mellon University (2008)
T. Izumida, K. Futatsugi, and A. Mori
5. Brumley, D., Hartwig, C., Liang, Z., Newsome, J., Poosankam, P., Song, D., Yin, H.: Automatically identifying trigger-based behavior in malware. In: Lee, W., et al. (eds.) Botnet Analysis and Defense (2007)
6. Cifuentes, C., Van Emmerik, M.: Recovery of jump table case statements from binary code. Science of Computer Programming 40(2-3), 171–188 (2001)
7. Comparetti, P.M., Salvaneschi, G., Kirda, E., Kolbitsch, C., Kruegel, C., Zanero, S.: Identifying dormant functionality in malware programs. In: IEEE Symposium on Security and Privacy, pp. 61–76. IEEE Computer Society, Los Alamitos (2010)
8. Cytron, R., Ferrante, J., Rosen, B.K., Wegman, M.N., Zadeck, F.K.: Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems 13(4), 451–490 (1991)
9. Dinaburg, A., Royal, P., Sharif, M., Lee, W.: Ether: malware analysis via hardware virtualization extensions. In: CCS 2008: Proceedings of the 15th ACM Conference on Computer and Communications Security, pp. 51–62. ACM, New York (2008)
10. Hashimoto, M., Mori, A.: Diff/TS: A tool for fine-grained structural change analysis. In: Proceedings of the 15th Working Conference on Reverse Engineering, WCRE (2008)
11. Mori, A.: Detecting unknown computer viruses – a new approach. In: Futatsugi, K., Mizoguchi, F., Yonezaki, N. (eds.) ISSS 2003. LNCS, vol. 3233, pp. 226–241. Springer, Heidelberg (2004)
12. Mori, A., Izumida, T., Sawada, T., Inoue, T.: A tool for analyzing and detecting malicious mobile code. In: Proceedings of the 28th International Conference on Software Engineering (ICSE 2006), pp. 831–834 (2006)
13. Moser, A., Kruegel, C., Kirda, E.: Exploring multiple execution paths for malware analysis. In: IEEE Symposium on Security and Privacy, SP 2007, pp. 231–245 (2007)
14. Moser, A., Kruegel, C., Kirda, E.: Limits of static analysis for malware detection. In: 23rd Annual Computer Security Applications Conference, ACSAC (2007)
15. Reif, J.H., Lewis, H.R.: Symbolic evaluation and the global value graph. In: Proceedings of the 4th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, pp. 104–118 (1977), http://doi.acm.org/10.1145/512950.512961
16. Reif, J.H., Lewis, H.R.: Efficient symbolic analysis of programs. Journal of Computer and System Sciences 32(3), 280–313 (1986), http://dx.doi.org/10.1016/0022-0000(86)90031-0
17. Sharif, M.I., Lanzi, A., Giffin, J.T., Lee, W.: Impeding malware analysis using conditional code obfuscation. In: Proceedings of the Network and Distributed System Security Symposium, NDSS 2008, San Diego, California, USA (2008)
18. Stoltz, E., Wolfe, M., Gerlek, M.P.: Constant propagation: A fresh, demand-driven look. In: Symposium on Applied Computing, ACM SIGAPP, pp. 400–404 (1994)
19. Tarjan, R.E.: Fast algorithms for solving path problems. Journal of the ACM 28, 594–614 (1981)
20. Tu, P., Padua, D.: Gated SSA-based demand-driven symbolic analysis for parallelizing compilers. In: Proc. 9th International Conference on Supercomputing (ICS 1995), pp. 414–423. ACM Press, Barcelona (1995)
A-HIP: A Solution Offering Secure and Anonymous Communications in MANETs

Carlos T. Calafate, Javier Campos, Marga Nácher, Pietro Manzoni, and Juan-Carlos Cano

Department of Computer Engineering, Universidad Politécnica de Valencia, Camino de Vera s/n, 46022 Valencia, Spain
[email protected],
[email protected], {marnacgo,pmanzoni,jucano}@disca.upv.es
Abstract. Offering secure and anonymous communications in mobile ad hoc networking environments is essential to promote confidence and widespread adoption of this kind of network. In this paper we propose and implement a novel solution based on the Host Identity Protocol (HIP) that offers both security and user-level anonymity in MANET environments. In particular, we introduce enhancements to the authentication process to achieve Host Identity Tag (HIT) relationship anonymity, along with source/destination HIT anonymity when combined with multihoming. We implemented our proposal in an experimental testbed, and the results obtained show that the performance degradation introduced by our proposal is minimal. We also detail how to efficiently integrate the proposed mechanism with both a reactive (DSR) and a proactive (OLSR) routing protocol. The improvements achieved using the routing-specific enhancements that we propose are then quantified analytically.

Keywords: Anonymity, HIP, MANETs.
1 Introduction
Mobile ad hoc networks (MANETs) are one of the most challenging fields of research. Securing these networks efficiently is of utmost importance, and several new types of attacks specific to MANETs have been detected and thoroughly analyzed in the literature [1,2].

Communication anonymity, a field of research tightly related to security, has also received much attention from the research community in the past years. Although most proposals have focused on wired environments [3,4,5], mobile ad hoc networks [6,7,8,9] have also been addressed recently. The latter proposals differ from the former due to the special features of wireless transmission. Communication anonymity encompasses various topics [10] such as sender anonymity, recipient anonymity, relationship anonymity and localization anonymity. In this work we are mainly interested in relationship anonymity, i.e., in preventing MANET participants from identifying the two communicating endpoints involved, e.g., in a VoIP call, thereby compromising security.

I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 217–231, 2010.
© Springer-Verlag Berlin Heidelberg 2010
Networked hosts are identified by their IP and MAC addresses; however, these can easily be altered by users. The problem of IP/MAC address forgery can be solved through a unique cryptographic identifier that remains unaltered throughout time, and that must be completely independent from network identifiers.

In earlier anonymous protocols for MANETs, the authentication issue is either not considered [6], or a trusted third party is required to exchange a set of pre-established parameters [7]. A group signature scheme is used in more recent proposals [8,9], where an authenticated key agreement protocol based on a group-oriented signature is used to verify that a message has been signed by one of the group members without actually knowing which one. Hence, this scheme requires that nodes have a priori knowledge of some trusted nodes in the network; this means that other nodes, besides the two endpoints, must also implement the protocol.

An integrated solution, known as HIP (Host Identity Protocol) [11], has been proposed to support authentication, security and mobility, but specifically in the scope of the Internet infrastructure. In this paper we propose Anonymous HIP (A-HIP), a novel solution that improves the functionality of HIP to achieve authentication and relationship anonymity in MANET environments. Our solution is compatible with the original HIP mechanisms, and it is integrated with the IPsec protocol [12], providing a secure communications environment. Moreover, the proposed solution does not strictly require the rest of the nodes in the network to implement the HIP protocol, only requiring them to forward both unicast and broadcast IP datagrams. Thus, only the sender and the recipient have to implement the full HIP stack, thereby reducing resource consumption on intermediate nodes, which is considered a critical issue in MANETs.
Relying on a real testbed, we implemented and then evaluated the efficiency of our proposal, showing that the impact on performance is minimal. Moreover, we combine our proposal with both a reactive and a proactive routing protocol, showing how to achieve further efficiency in the communication process.

To the best of our knowledge, our proposal is pioneering in this area, since no other authors have proposed anonymity mechanisms for MANETs based on improvements to the HIP protocol. Additionally, we implemented and validated our proposal in a real testbed, including a performance evaluation to verify that the proposed solution does not compromise performance. Notice that most existing proposals in this field do not perform any sort of performance evaluation in a real testbed, and many do not even include thorough performance results obtained through simulation.

The structure of this paper is the following: in the next section we refer to some related works on MANET anonymity, focusing mostly on the performance assessment of the different proposals. In section 3 we make a brief introduction to the Host Identity Protocol. Section 4 describes our proposal, also offering details on the packet format and the packet exchange process. The attacker model is addressed in section 5. Implementation details are presented in section 6. Experimental results are then discussed in section 7. In section 8 we explain
how to achieve optimal integration with MANET routing protocols, along with a summary of the improvements achieved. Finally, section 9 concludes the paper.
2 Related Works
Although the field of MANET anonymity has received significant attention from the research community in the past decade, most of the proposals found in the literature suffer from two drawbacks: (i) the performance of the different protocols has remained mostly unaddressed, and (ii) the different protocols have not actually been implemented and validated in a real testbed. For instance, focusing on the two most frequently cited solutions, MASK and ANODR: the MASK protocol [7] has only been validated through simplistic and overly optimistic simulation tests, while for ANODR [6] only limited simulation tests with minimal amounts of traffic are offered.

Concerning the performance evaluation of anonymity solutions for MANETs, some recent works have specifically addressed the performance of anonymous protocols for MANETs. Liu et al. [13] present a comprehensive survey and performance evaluation of different anonymous routing schemes, focusing on the existing trade-offs between performance and the degree of protection. Through simulation they show that the processing delay associated with protocols based on public key cryptography causes performance to degrade significantly. Another study, by Nácher et al. [14], shows that, for anonymous routing protocols like MASK and ANODR, anonymity is obtained at the expense of reducing performance down to inefficient levels for both the TCP and UDP protocols. In particular, they find that ANODR's throughput ranges from 10 to 100 kilobits per second, which is extremely poor, while for the MASK protocol these values range from 100 to 500 kilobits per second, which is still quite poor. With respect to UDP traffic, they find that excessive delay values impede the use of applications with real-time requirements, and that the packet loss ratio is also quite high.
In this paper we offer a solution to provide anonymity in MANETs that is pioneering in the sense that it includes the proposal description along with implementation details and performance assessment tests made in a MANET testbed using real devices.
3 The Host Identity Protocol
The Host Identity Protocol (HIP) [11] was introduced by IETF's HIP working group [15]. It was designed to make host identification independent from the points of attachment (IP addresses). Such a solution, among other benefits, is able to solve the problem of tracking mobile hosts.

One of the main concepts introduced by HIP is the Host Identity Tag (HIT). A HIT consists of a 128-bit identifier assigned to a specific machine. HITs, unlike IP addresses, are permanent. This means that a host can have several IP addresses that change frequently throughout time without causing
220
C.T. Calafate et al.
hosts to break transport layer connections between them. The only requirement is that HITs are used to identify socket connections instead of IP addresses. Therefore, the use of HITs requires introducing a new layer between the routing and transport layers to achieve the desired independence between them.

One of the main advantages of HITs is their close bond with asymmetric cryptography. In fact, HIT generation consists of obtaining a 128-bit hash of a Host Identifier (HI), which consists of a public key generated through an asymmetric encryption algorithm. In particular, HIP's authors propose using the SIGn-and-MAc (SIGMA) algorithm [16].

HIP was designed to operate in the scope of the Internet by extending the DNS functionality to incorporate Host Identifiers (HIs) and Host Identity Tags (HITs). Through such a service it is easy for a user to discover and maintain both HI/HIT and IP addresses for a particular host or domain.

With HIP, the setting up of a secure channel between two hosts relies on an authenticated four-way handshake based on the SIGMA algorithm. Such a four-way handshake includes a Request message from initiator to responder (I1), a Challenge message sent back to the initiator (R1), a Response/Authentication message sent from the initiator to the responder (I2) and, finally, an Authentication message that is sent back to the initiator (R2). During this message interchange a session key is created, as well as a pair of IPsec ESP Security Associations (for more details on IPsec please refer to [12]).

In the scope of mobile ad hoc networks, HIP suffers from two main problems: (i) there is usually no access to a DNS server for HI/HIT or even IP discovery, and (ii) it is quite easy to track connections and their endpoints. Thus, a MANET-specific solution is required when attempting to operate in these environments.
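The HIT derivation described above (a 128-bit hash of the public-key Host Identifier) can be sketched as follows. This is an illustrative simplification: the HIP specification actually builds HITs with the ORCHID construction (a hash over a context identifier plus the HI, under a reserved IPv6 prefix) rather than a plain truncated digest, and the placeholder byte string below stands in for real encoded public-key material.

```python
import hashlib

def derive_hit(host_identifier: bytes) -> bytes:
    """Derive a 128-bit Host Identity Tag from a Host Identifier.

    Simplified sketch: a real HIP implementation uses the ORCHID
    construction rather than a plain truncated hash.
    """
    return hashlib.sha256(host_identifier).digest()[:16]

# The HI would normally be encoded public-key material; placeholder bytes
# are used here purely for illustration.
hi = b"-----placeholder public key bytes-----"
hit = derive_hit(hi)
assert len(hit) * 8 == 128  # HITs are 128 bits long, like IPv6 addresses
```

Because the HIT is a hash of the public key, a peer presenting a HIT can be challenged to prove possession of the matching private key, which is what the SIGMA-based handshake exploits.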
4 A-HIP: Anonymous and Secure MANET Communications Based on HIP
In this section we describe A-HIP, our novel solution offering security, relationship anonymity and limited sender/recipient anonymity to MANET communications by extending and improving the HIP protocol.

The basic assumption of our approach is that, for communication to take place, endpoints must be aware of each other's HIT and the respective public key (HI). This requires a previous exchange of HIs/HITs between peers through a trustworthy mechanism. An example of how this could be achieved using currently available technology is to embed HIs/HITs in digital business cards and then use Bluetooth's Business Card Exchange function [17] (part of the Object Push Profile) to make them available to trusted parties. Another option is to rely on mobile telephony messaging for this task. However, how hosts gain knowledge of each other's HI/HIT is outside the scope of this paper.

Our solution adopts the concept of multihoming, allowing each endpoint to use a different IP address for each destination. Such a requirement is not considered restrictive in MANET environments, since the number of nodes is usually much lower
A-HIP: A Solution Offering Secure and Anonymous Communications
221
compared to, e.g., the address pool offered by a class A private network. Through such a technique, both sender and recipient anonymity are achieved with respect to the rest of the users. In addition, we extend the standard four-way handshake defined by HIP, allowing HITs to be translated into an IPv4 or IPv6 address anonymously, and without requiring a DNS service.

Before detailing how our mechanism works, we must first introduce some definitions. Let HIT_src and HIT_dst denote the source and destination HIT identifiers, and let κ_Pub^i and κ_Pri^i be the public and private keys associated with a certain HIT_i, respectively. We define the encryption of message m using key κ as:

m* = E(m, κ)    (1)

and the decryption of the message using the complementary key κ′ as:

m = D(m*, κ′)    (2)
With HIP, when two stations wish to exchange session setup information, the messages exchanged must comply with the format defined in the RFC for HIP [11]. The basic structure of these messages is shown in figure 1.

| Packet type | HIT_src | HIT_dst | Data |

Fig. 1. Structure of a HIP message
Since the Packet type, HIT_src and HIT_dst fields are unencrypted, all participants in a MANET are able to identify the two communication endpoints. Thus, in our scheme, we propose encrypting all the fields in HIP messages. The participants involved in a HIP session will have to use the public key of the destination, κ_Pub^dst, to encrypt a given HIP message m_HIP:

m*_HIP = E(m_HIP, κ_Pub^dst)    (3)
Message m*_HIP may then propagate through the MANET without other stations being aware of the communication endpoints, as desired. Nevertheless, all the stations in the path must try to decrypt the message using their private key(s) to determine whether they are the destination. Figure 2 illustrates the modified message exchange scheme proposed.

Before proceeding we should remark that, in the scope of HIP, the Initiator and Responder concepts are introduced to refer to the two endpoints, and they remain unaltered throughout the HIP session. Therefore, they are unrelated to the source and destination concepts, which maintain their usual meaning and thus alternate in time, as shown in this figure (see, e.g., messages I1 and R1).

Our proposed scheme works as follows: initially, a HIT discovery packet is generated (message I1) and flooded to all nodes. Flooding is mandatory in a MANET environment, since the initiator has no knowledge of the responder's IP. Hosts receiving the message will use their private key (κ_Pri^i) to try to obtain
222
C.T. Calafate et al.
Fig. 2. Modified message exchange between initiator and responder based on HIP
the original m_HIP message. Only if the appropriate key (κ_Pri^Resp) is applied may the original message be retrieved:

m_HIP = D(m*_HIP, κ_Pri^Resp)    (4)

making the HIT discovery process fully confidential by avoiding HIT traceability.

The responder, aware of the initiator's IP address from the initial request, sends a challenge message (R1) back to the initiator via unicast, allowing the initiator to associate the responder's HIT with its IP address. If the responder does not want to communicate with the initiator, it may opt to block the authentication process by not replying at all. A new message interchange between initiator and responder takes place afterward, during which a Diffie-Hellman authenticated key exchange is used to create a session key, allowing a pair of IPsec ESP Security Associations (SA) to be established between the initiator and the responder, as defined by the standard HIP.

If, later on, either the initiator or the responder wishes to change its network identifiers, it must proceed to update the connection using encapsulated HIP UPDATE packets (represented as U, UA and A in figure 2); since this is part of the standard HIP, please refer to [11] for more details. Long connection disruptions, which may be due to node mobility, require restarting the base exchange to update the existing security associations (SAs). In such a case, a new HIT discovery packet (message I1) must again be flooded throughout the MANET, as described before.
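The flood-and-trial-decrypt step can be illustrated with the toy model below. Since asymmetric encryption is not available in the Python standard library, a keyed symmetric construction (XOR keystream plus HMAC tag) stands in for the E/D operations of equations (1)-(2), and the per-node "keys" are shared-secret placeholders rather than real key pairs. The point is only the control flow: every node attempts decryption, and only the responder's key yields a valid message.

```python
import hashlib, hmac, os

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Expand key+nonce into a keystream of the requested length."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(message: bytes, key: bytes) -> bytes:
    """Toy stand-in for E(m, k): XOR encryption plus an HMAC tag."""
    nonce = os.urandom(16)
    body = bytes(a ^ b for a, b in zip(message, _keystream(key, nonce, len(message))))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag

def try_decrypt(blob: bytes, key: bytes):
    """Toy stand-in for D(m*, k'): returns None when the key is wrong,
    which is how a node learns it is not the intended destination."""
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    if not hmac.compare_digest(tag, hmac.new(key, nonce + body, hashlib.sha256).digest()):
        return None
    return bytes(a ^ b for a, b in zip(body, _keystream(key, nonce, len(body))))

# Flood an I1 message: every node attempts decryption, but only the
# responder (node B here, key names are placeholders) recovers it.
keys = {"A": b"key-A", "B": b"key-B", "C": b"key-C"}
i1 = encrypt(b"I1: HIT_src HIT_dst params", keys["B"])
recovered = {node: try_decrypt(i1, k) for node, k in keys.items()}
assert recovered["B"] == b"I1: HIT_src HIT_dst params"
assert recovered["A"] is None and recovered["C"] is None
```

In the real protocol the integrity check is provided by the public-key decryption itself (only κ_Pri^Resp yields a well-formed HIP message), but the trial-and-reject behavior is the same.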
5 Attacker Model
The trusted exchange of HI/HIT pairs ensures that the cryptographic identity of a user is only known by trusted users. Identifying the two endpoints of a connection, however, becomes a very hard task thanks to our solution, since session initiation is made anonymous to all users by (i) encrypting all packets (both HIP-related and
A-HIP: A Solution Offering Secure and Anonymous Communications
223
data), and (ii) avoiding static mappings between HI/HIT and IP/MAC addresses. So, a basic requirement for an attacker to successfully trace a connection and identify the two intervening endpoints is that it has to be trusted by both parties. The actions required for such an attacker to be successful are: (i) it has to initiate secure connections towards all known users (whose HITs are cached) and attempt to obtain their geographic locations; (ii) it must promiscuously listen to the ongoing traffic in the network and geographically locate the sources and destinations of that traffic; and (iii) it must compare the geographic positions of known users (whose HITs are known) against the geographic positions of the sources and destinations of the traffic being traced, and attempt to guess the correct IP-to-HIT mapping, thus breaking anonymity. Obviously, such an attack is quite complex to undertake, especially when geographical discrimination of users is difficult due to their proximity or the presence of obstacles (e.g., indoor scenarios).

Denial-of-Service (DoS) is another type of attack that could be exploited by an attacker. In particular, the attacker could inject several fake I1 messages (HIT discovery packets) to cause both the network's bandwidth to be exhausted (recall that these messages are flooded through the MANET) and the nodes' resources to be consumed (all nodes attempt to decrypt these packets). Although addressing the DoS problem is outside the scope of this paper, such DoS attacks could be countered through appropriate message filters that limit the injection rate of these messages into the system.
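One simple realization of such a filter is a per-neighbor token bucket that bounds the rate at which I1 discovery packets are accepted for decryption and re-flooding. The rate and burst values below are illustrative, not parameters from the paper.

```python
import time

class I1RateLimiter:
    """Token bucket limiting how many I1 discovery packets a node will
    accept from each neighbor for trial decryption and re-flooding."""

    def __init__(self, rate=1.0, burst=5):
        self.rate, self.burst = rate, burst
        self.buckets = {}  # neighbor id -> (tokens, last-seen timestamp)

    def allow(self, neighbor, now=None):
        now = time.monotonic() if now is None else now
        tokens, last = self.buckets.get(neighbor, (self.burst, now))
        tokens = min(self.burst, tokens + (now - last) * self.rate)  # refill
        if tokens >= 1.0:
            self.buckets[neighbor] = (tokens - 1.0, now)
            return True
        self.buckets[neighbor] = (tokens, now)
        return False

limiter = I1RateLimiter(rate=1.0, burst=3)
# A burst of 3 packets at t=0 passes; the 4th is dropped.
results = [limiter.allow("neighbor-1", now=0.0) for _ in range(4)]
assert results == [True, True, True, False]
# After 2 seconds at 1 token/s, the neighbor may inject again.
assert limiter.allow("neighbor-1", now=2.0)
```

Rate limiting before the (CPU-expensive) trial decryption is what protects the nodes' resources, not just the channel bandwidth.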
6 Implementation Details
To validate our proposal and assess its effectiveness, we developed a fully functional implementation for Linux/Unix operating systems. For this endeavor we enhanced an existing implementation of HIP [18] in order to implement the mechanism described in section 4.¹

In Figure 3 we show a block diagram that illustrates the different elements that make up the HIP service for the reference implementation, along with the interaction with other software components. We highlight in the figure those modules that required enhancements to implement our proposal: session startup and I/O HIP packets. Regarding the session startup module, the changes proposed focused on encrypting all HIP messages using the responder's public key to provide anonymity. Concerning the Input/Output module, changes mainly focused on modifying the target of I1 messages at the IP layer so that they are broadcast and relayed by intermediate HIP agents. This way they are able to reach the intended message recipient in a multi-hop network scenario.

Applications attempting to anonymously contact another MANET user can identify the destination by relying on the notation <domain.hip>. All DNS resolutions in the .hip domain are intercepted and handled internally by the HIP service, which must provide a mapping to a specific IP address. In our case the
¹ Our implementation is freely available upon request.
224
C.T. Calafate et al.
Fig. 3. Integration of the HIP service with other Linux software components
modified HIP session startup phase begins, allowing a temporary mapping to be established between the recipient's HIT and one of its current IP addresses. Data packets are afterward transferred securely using IPsec technology.
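The interception logic for the .hip pseudo-domain can be sketched as follows. The cache layout, names and the example entry are hypothetical; in the real service, a missing HIT-to-IP mapping triggers the anonymous I1 flooding described in section 4 instead of raising an exception.

```python
# Hypothetical local caches of the HIP service.
name_to_hit = {"alice.hip": "2001:10::a1"}  # trusted, pre-exchanged HITs
hit_to_ip = {}  # HIT -> currently mapped IP, filled in by the base exchange

def resolve(name: str):
    """Resolver hook: names under the .hip pseudo-domain are handled
    locally by the HIP service instead of being sent to a DNS server."""
    if not name.endswith(".hip"):
        return None  # fall through to normal DNS resolution
    hit = name_to_hit.get(name)
    if hit is None:
        raise KeyError(f"no trusted HIT known for {name}")
    ip = hit_to_ip.get(hit)
    if ip is None:
        # Real service: flood an encrypted I1 and wait for R1 to learn
        # the responder's current IP; here we just signal the condition.
        raise LookupError("HIT-to-IP mapping missing: start base exchange")
    return ip

hit_to_ip["2001:10::a1"] = "10.0.0.7"  # as if the base exchange completed
assert resolve("alice.hip") == "10.0.0.7"
assert resolve("example.com") is None
```

Keeping the mapping temporary (per session) is what preserves anonymity: no static HIT-to-IP binding is ever observable by other nodes.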
7 Experimental Results
In this section we evaluate the performance cost of our proposal in terms of session startup times, end-to-end delay and throughput. To accomplish this task we set up a small testbed composed of four Asus EeePC 901 netbooks and a desktop PC (see figure 4). We configured their IEEE 802.11g integrated wireless cards (Ralink RT2860 chipset [19]) in ad hoc mode, and fixed the data rate at 54 Mbit/s. All the terminals involved in the testbed ran a GNU/Linux operating system, kernel version 2.6.24, with version 1.7.0.0 of the Ralink wireless card drivers.
Fig. 4. Snapshot of the testbed used for performance evaluation tests
Using the iptables tool [20], we logically enforced a chain topology that allowed us to assess performance at different hop counts between source and destination.
Notice that, in this setup, manually pre-loading the source's cache with the IP address of the destination was required to allow the default HIP implementation to contact stations more than one hop away. Obviously, such an approach is not required for A-HIP.
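A typical way to logically enforce such a chain with iptables is per-node MAC filtering, shown here for a middle node of the chain. This is a configuration sketch: the interface name and MAC addresses are placeholders, not values from the testbed.

```shell
# Chain topology 1 - 2 - 3 - 4 - 5, rules on node 3: accept traffic only
# from the logical neighbours (nodes 2 and 4), drop everything else that
# arrives on the wireless interface. MAC addresses are placeholders.
iptables -A INPUT -i wlan0 -m mac --mac-source 00:11:22:33:44:02 -j ACCEPT  # node 2
iptables -A INPUT -i wlan0 -m mac --mac-source 00:11:22:33:44:04 -j ACCEPT  # node 4
iptables -A INPUT -i wlan0 -j DROP
```

With such rules in place, packets between non-adjacent nodes must be relayed hop by hop even though all terminals are within radio range of each other.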
Fig. 5. Session startup times at different hop counts for the default HIP and the proposed A-HIP solution
In Figure 5 we show the overhead introduced by A-HIP compared to the default HIP implementation as we increase the hop count between the source and destination terminals. We can see that the extra encryption effort required to offer anonymity introduces an additional delay of between 200 and 220 ms, practically doubling the startup time. Since this overhead is limited to the initial exchange, we consider the trade-off reasonable, and the startup time remains within acceptable bounds from the user's perspective. Also notice that the number of hops does not negatively affect A-HIP, the increase being minimal, as in the default HIP case. This shows that the solution is not prone to scalability problems.

In the experiments that follow we compare A-HIP against an insecure solution, that is, a situation where neither HIP nor A-HIP is used. We do not include the results for the default HIP case, since the performance values obtained do not differ from those achieved with our HIP extension.

Figure 6 shows the mean round-trip time (RTT) delay for different payload sizes for both A-HIP and the no-HIP case. We find that the additional processing required by A-HIP causes an RTT increase that ranges from 0.4 to 1.2 ms, with an average of 0.7 ms. Notice that the relative impact of this increase tends to disappear as the number of hops increases - from 97% (1 hop) down to 23% (4 hops) - since the additional overhead imposed is independent of the path traversed.

Fig. 6. Round-trip time delay for different payload sizes in an insecure mode (top) and using A-HIP (bottom)

In terms of throughput, Figure 7 shows that the maximum throughput achievable with the proposed solution is about 12.5 Mbit/s. This upper limit is inherent to the CPU-bound nature of the encryption process, and can only be raised by using faster CPUs or specialized encryption hardware. As we increase the number of hops we observe that the performance drop caused by using encryption is quite limited, achieving throughput values close to those obtained without encryption. Therefore, in a typical MANET environment, the impact on throughput will not be relevant, and it will certainly not compromise transmission efficiency. According to different studies [13,14], these performance values are significantly better than those achieved by other MANET anonymity solutions.
8 Efficient Integration with MANET Routing Protocols
The implementation described in section 6 is generic in the sense that it adapts to any mobile ad hoc network, no matter which routing protocol is used. Nevertheless, depending on the routing protocol used, specific cross-layer optimizations are possible to make the dissemination of I1 messages more efficient. In this section we propose cross-layer optimizations for both a reactive (DSR [21]) and a proactive (OLSR [22]) routing protocol for MANETs; afterward we analytically quantify the improvements achieved with these optimizations.

8.1 Cross-Layer Optimizations for DSR
In this section we detail how to adapt our proposal to operate efficiently in conjunction with the DSR routing protocol.
Fig. 7. Throughput at different hop counts in an insecure mode (No HIP) and using A-HIP
The general-purpose data flooding mechanism used in the proposed A-HIP implementation (see section 6) applies to all routing protocols; nevertheless, for reactive routing protocols, more efficient solutions can avoid two rounds of flooding (HIT search flooding plus IP search flooding). We now present such a solution for the DSR routing protocol.

DSR is an extremely flexible routing protocol for MANETs that relies on source routing [21]. Therefore, adapting DSR to our proposal merely requires extending the route discovery mechanism so that message I1 (see figure 2) is embedded into Route Request (RREQ) packets, replacing the "target address" identifier within the scope of DSR. With DSR, as an RREQ packet is propagated, the IP addresses of intermediate nodes are appended to the packet, forming the entire route traversed. With our solution, intermediate nodes are not aware of the final destination's IP. However, this limitation poses no problem, and the packet can still be propagated throughout the MANET.

When the destination (responder) receives this hybrid DSR/HIT discovery packet, it simultaneously learns the route used, which allows it to send a Route Reply (RREP) packet, including the challenge message, back to the source (initiator). Such a message includes the responder's IP address, and so the initiator also becomes aware of the route used when receiving the reply. If the secure channel is not yet created, the initiator proceeds to establish it according to the message exchange sequence shown in figure 2. Once it is established, route breakages of short duration can be handled by the routing protocol alone. Route breakages lasting longer than the established HIP timeout value require restarting HIP's base exchange, as explained at the end of section 4.
We consider this integration strategy optimal in the sense that the amount of routing overhead involved is about the same as for the standard implementation, with the additional security and anonymity benefits.

8.2 Cross-Layer Optimizations for OLSR
OLSR [22] is a proactive routing protocol for MANETs. As such, it creates and continually maintains routes to all stations participating in the MANET.
Route maintenance is achieved by propagating Topology Control (TC) messages throughout the MANET using Multi-Point Relay (MPR) nodes. These nodes are a subset of the stations participating in the MANET which forms a minimum spanning tree (MST). Thus, message propagation through MPR re-broadcasting offers high efficiency at little cost in terms of the control traffic introduced.

An efficient integration of our proposal with OLSR is achieved by modifying it to forward broadcast I1 messages. This strategy allows the flooding process to be optimized by taking advantage of the MST defined by the different MPR nodes, thus reducing the number of broadcast transmission events in the network. Concerning the non-broadcast messages (R1, I2, R2), these will be forwarded with minimum latency, since OLSR constantly maintains routes and can thus provide the path between sender and recipient with no delay.

8.3 Improvements Achieved
In terms of performance, the values achieved when adopting the optimizations proposed in this section resemble those described in section 7 without any significant differences. However, the amount of overhead introduced in the MANET, measured as the number of HIP-related packet transmissions, varies significantly. Hence, we now proceed to analytically quantify the improvements achieved. Table 1 presents the packet overhead introduced at session startup for both DSR and OLSR routing protocols. We discriminate the number of transmissions associated with each startup message for the sake of clarity.

Table 1. Packet overhead introduced at session startup

Protocol  Mode of operation           I1      R1          I2   R2   Total number of Tx events
DSR       A-HIP                       N       N + 2×HC    HC   HC   2×N + 4×HC
DSR       A-HIP + DSR optimization    N       HC          HC   HC   N + 3×HC
OLSR      A-HIP                       N       HC          HC   HC   N + 3×HC
OLSR      A-HIP + OLSR optimization   MST_i   HC          HC   HC   MST_i + 3×HC
In this table, MST_i is the size of the minimum spanning tree defined by the different Multi-Point Relays (MPRs) as seen by node i [22], while HC refers to the number of hops between sender and recipient and N refers to the total number of nodes. Our approach, when used in a DSR-based MANET, introduces the greatest packet transmission overhead, since it requires two broadcast floodings: one for A-HIP itself, and a second one triggered by DSR's route discovery process. By introducing the proposed cross-layer optimization between DSR and A-HIP we are able to combine both broadcast flooding processes, drastically reducing the total number of transmissions required (N + HC fewer packets).
A-HIP: A Solution Offering Secure and Anonymous Communications
229
With respect to OLSR, a single broadcast flooding is required for the first message. The proposed optimization reduces the total number of packets transmitted by limiting rebroadcast events to MPR nodes (N − MST_i fewer packets). For high node densities in the MANET, this optimization becomes quite relevant compared to the default solution. Comparing both protocols, notice that OLSR allows a smaller number of transmission events than DSR, since the former regularly broadcasts topology control messages, something that does not occur with DSR. Finally, notice that the optimizations we proposed for these two protocols achieve a near-minimum number of transmission events for each routing protocol category.
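The entries of Table 1 can be checked mechanically; the sketch below (the function and parameter names are ours) reproduces the totals and the stated savings:

```python
def startup_tx(protocol, optimized, n, hc, mst_i=0):
    """Total Tx events at A-HIP session startup, per Table 1.

    n:     total number of nodes
    hc:    number of hops between sender and recipient
    mst_i: size of the MPR-defined spanning tree seen by node i (OLSR only)
    """
    if protocol == "DSR":
        return (n + 3 * hc) if optimized else (2 * n + 4 * hc)
    if protocol == "OLSR":
        return (mst_i + 3 * hc) if optimized else (n + 3 * hc)
    raise ValueError("unknown protocol: " + protocol)
```

For example, with N = 50 nodes, HC = 4 hops and MST_i = 20, the DSR optimization saves 116 − 62 = 54 = N + HC transmissions, and the OLSR optimization saves 62 − 32 = 30 = N − MST_i.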
9 Conclusions
In this paper we proposed a novel solution to provide private and untraceable communication between MANET peers. In contrast with previous proposals, we actually implemented and tested our solution in a real testbed. We relied on the concept of HITs to offer user discovery and end-to-end encryption of data through full integration with HIP/IPsec technology. Our proposal assumes that communication endpoints are previously known to one another; this was enforced through the HIT concept. Once a secure channel is set up, communication may start. From that point on, all data being sent benefits from end-to-end encryption through IPsec. In fact, one of the main benefits of our proposal is that it is lightweight and easily implementable in real-life operating systems, as demonstrated in the paper. Experimentally, we showed that the additional delay imposed on session startup is between 200 and 220 ms, with the total time always remaining below 500 ms. In a MANET environment, such an initial delay is not considered restrictive. In terms of delay and throughput, A-HIP offers the same performance as the original HIP implementation. Compared to an insecure solution, delay and throughput values experience only a very slight penalty. The only exception was detected for one-hop distances, where a maximum data encryption speed of 12 Mbit/s was obtained with the hardware used in the experiments. In the paper we also offered some hints on how to efficiently integrate our proposal with both the DSR and OLSR routing protocols, achieving the desired degree of anonymity and security while keeping network overhead to a minimum. In particular, we offered analytical estimations of the improvements achieved for each routing protocol, discriminating the gain per message type.
Acknowledgments

This work was partially supported by the Ministerio de Educación y Ciencia, Spain, under Grant TIN2008-06441-C02-01, and by the Generalitat Valenciana under Grant GV/2009/010.
References

1. Wu, B., Chen, J., Wu, J., Cardei, M.: A Survey on Attacks and Countermeasures in Mobile Ad Hoc Networks. In: Wireless/Mobile Network Security. Springer, Heidelberg (2006)
2. Yih-Chun, H., Perrig, A.: A survey of secure wireless ad hoc routing. IEEE Security & Privacy Magazine 2(3), 28–39 (2004)
3. Chaum, D.: The dining cryptographers problem: Unconditional sender and recipient untraceability. J. Cryptology 1(1), 65–75 (1988)
4. Chaum, D.: Untraceable electronic mail, return addresses, and digital pseudonyms. Communications of the ACM 24(2) (February 1981)
5. Dingledine, R., Mathewson, N., Syverson, P.: Tor: The second-generation onion router. In: Proceedings of the 13th USENIX Security Symposium (August 2004)
6. Kong, J., Hong, X.: ANODR: Anonymous on demand routing with untraceable routes for mobile ad-hoc networks. In: MobiHoc 2003: Proceedings of the 4th ACM International Symposium on Mobile Ad Hoc Networking & Computing, New York, NY, USA, pp. 291–302 (2003)
7. Zhang, Y., Liu, W., Lou, W., Fang, Y.: MASK: Anonymous on-demand routing in mobile ad hoc networks. IEEE Transactions on Wireless Communications 21, 2376–2385 (2006)
8. Lin, X., Lu, R., Zhu, H., Ho, P., Shen, X., Cao, Z.: ASRPAKE: An anonymous secure routing protocol with authenticated key exchange for wireless ad hoc networks. In: Proceedings of the International Conference on Communications (ICC). IEEE, Los Alamitos (2007)
9. Paik, J.H., Kim, B.H., Lee, D.H.: A3RP: Anonymous and authenticated ad hoc routing protocol. In: Proceedings of the International Conference on Information Security and Assurance. IEEE, Los Alamitos (2008)
10. Pfitzmann, A., Hansen, M.: Anonymity, unobservability, and pseudonymity - a proposal for terminology. In: Federrath, H. (ed.) Designing Privacy Enhancing Technologies. LNCS, vol. 2009, pp. 1–9. Springer, Heidelberg (2001)
11. Moskowitz, R., Nikander, P., Jokela, P., Henderson, T.: Host Identity Protocol. IETF RFC 5201 (April 2008)
12. Kent, S., Seo, K.: Security Architecture for the Internet Protocol. IETF RFC 4301 (December 2005)
13. Liu, J., Kong, J., Hong, X., Gerla, M.: Performance Evaluation of Anonymous Routing Protocols in MANETs. In: IEEE Wireless Communications and Networking Conference, New Orleans, USA (April 2006)
14. Nácher, M., Calafate, C.T., Cano, J.C., Manzoni, P.: Anonymous routing protocols: impact on performance in MANETs. In: IEEE International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2009), London, UK (September 2009)
15. Internet Engineering Task Force: Host Identity Protocol working group charter, http://www.ietf.org/html.charters/hip-charter.html
16. Krawczyk, H.: SIGMA: the 'SIGn-and-MAc' Approach to Authenticated Diffie-Hellman and its Use in the IKE Protocols. In: Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 400–425. Springer, Heidelberg (2003)
17. IEEE 802.15.1: IEEE Standard for Information technology - Telecommunications and information exchange between systems - Local and metropolitan area networks - Specific requirements. Part 15.1: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Wireless Personal Area Networks (WPANs) (2002)
18. OpenHIP, http://downloads.sourceforge.net/openhip/hip-0.5.tgz
19. Ralink Technology Corporation, http://www.ralinktech.com/ (accessed: January 30, 2009)
20. The netfilter.org iptables project, http://www.netfilter.org/ (accessed: January 28, 2009)
21. Johnson, D.B., Hu, Y., Maltz, D.A.: The Dynamic Source Routing Protocol (DSR) for Mobile Ad Hoc Networks for IPv4. IETF RFC 4728, MANET Working Group (February 2007), http://www.ietf.org/rfc/rfc4728.txt
22. Clausen, T., Jacquet, P.: Optimized Link State Routing Protocol (OLSR). IETF RFC 3626, MANET Working Group (October 2003), http://www.ietf.org/rfc/rfc3626.txt
Securing MANET Multicast Using DIPLOMA Mansoor Alicherry and Angelos D. Keromytis Department of Computer Science, Columbia University
Abstract. Multicast traffic, such as live audio/video streaming, is an important application for Mobile Ad Hoc Networks (MANETs), including those used by militaries and disaster recovery teams. The open nature of multicast, where any receiver can join a multicast group, and any sender can send to a multicast group, makes it an easy vehicle for launching Denial of Service (DoS) attacks in resource-constrained MANETs. In this paper, we extend our previously introduced DIPLOMA architecture to secure multicast traffic. DIPLOMA is a deny-by-default distributed policy enforcement architecture that can protect the end-host services and network bandwidth. DIPLOMA uses capabilities to provide a unified solution for sender and receiver access control to the multicast groups, as well as to limit the bandwidth usage of the multicast group. We have extended common multicast protocols, including ODMRP and PIM-SM, to incorporate DIPLOMA. We have implemented multicast DIPLOMA in Linux, without requiring any changes to existing applications and the routing substrate. We conducted an experimental evaluation of the system in the Orbit MANET testbed. The results show that the architecture incurs limited overhead in throughput, packet loss, and packet inter-arrival times. We also show that the system protects network bandwidth and the end-hosts in the presence of attackers.
1 Introduction
Multicast enables efficient delivery of information from one source to many destinations, without the source having to unicast to each destination individually. In multicast, nodes send a given packet over each link only once; they create copies of the packet only when it must travel over multiple links to reach all destinations. Multicasting is used for content distribution applications, such as audio and video streaming. Mobile ad hoc networks are increasingly used in tactical military and civil rapid-deployment networks, including emergency rescue operations and disaster relief networks, due to their flexibility in deployment. Audio and video content distribution is an important application on these networks, making support for multicast an absolute necessity. Multicast also improves the efficiency of wireless links in MANETs, due to the broadcast nature of the medium. The set of nodes receiving the messages that are addressed to a common multicast address form a multicast group. Traditionally, a multicast group has three properties [3]:

I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 232–250, 2010.
© Springer-Verlag Berlin Heidelberg 2010
1. All the members receive all the packets sent to the multicast group.
2. Any node can join the multicast group.
3. Any node can send packets to the multicast group.

All these properties have security implications, and solutions have been proposed for them in the context of the (wired) Internet. Most of these solutions differentiate the routers from the receiver nodes (the multicast group members), as is the case in wired networks, and assume the routers are secure and well behaved. These solutions are not suitable for MANETs, since MANET nodes play the dual role of receivers (and senders) of traffic and routers forwarding other nodes' traffic. Furthermore, exploiting these properties increases resource usage, making multicast an easy tool for launching denial of service attacks on resource-constrained MANETs. In this paper, we propose extensions to the DIPLOMA architecture, which stands for DIstributed PoLicy enfOrceMent Architecture, to provide multicast security in MANETs. DIPLOMA is a deny-by-default architecture [2] that enforces trust relationships and traffic accountability between mobile nodes through a distributed policy enforcement scheme for MANETs. In that architecture, capabilities propagate both access control rules and traffic-shaping parameters that should govern a node's traffic. In the deny-by-default model, nodes can only access the services and hosts they are authorized for by the capabilities given to them. The enforcement of the capability is done in a distributed manner by all the nodes in the path from the source to the destination. Compromised or malicious nodes cannot exceed their authority and expose the whole network to an adversary. Upon detection, we can prevent a compromised node from further attacking the network simply by revoking its capabilities. Moreover, that architecture helps mitigate the impact of denial of service (DoS) attacks because excess or unauthorized packets are dropped closer to the attack source.
Thus, we avoid unnecessary data processing and forwarding at the target node and the network itself. Multicast security protocols for wired networks have treated receiver access control and sender access control as two separate problems [10]. Receiver access control is provided using a group policy management system and a group member authorization system [3]. Sender access control can be provided using source-specific multicast (SSM), in which only a single source can transmit to a multicast group. A MANET node's IP address can change when moving between networks, which requires explicit sender access control. Furthermore, because of the broadcast nature of the medium, IP address spoofing is much easier in MANETs. In this paper, we provide a unified solution for both receiver access control and sender access control in MANETs by extending DIPLOMA to secure multicast traffic. We define capabilities for use with multicast traffic, with separate capabilities for sending and receiving. A node cannot send to, or join, a multicast group without possessing these capabilities. The capabilities also impose bandwidth constraints on multicast sessions, preventing resource hogging by the multicast group members. The nodes in the MANET enforce the access control and bandwidth constraints of the capability in a distributed manner. We propose modifications to multicast
protocols to incorporate capabilities, and show the modifications for two popular multicast routing protocols: the On Demand Multicast Routing Protocol (ODMRP) and Protocol Independent Multicast - Sparse Mode (PIM-SM). We implement multicast DIPLOMA on Linux. Our implementation does not require any changes to existing multicast applications or the PIM-SM multicast daemon. However, the applications see the benefit in terms of receiving only authorized traffic, and being able to use their allocated bandwidth even in the presence of rogue nodes trying to conduct a DoS attack. We deploy our system in the Orbit Lab testbed and conduct extensive experiments to evaluate its performance and effectiveness. We show that multicast DIPLOMA incurs minimal overhead in terms of throughput, packet loss and inter-arrival times. We also study the effect on video streaming in our system. Finally, we show that multicast DIPLOMA is effective against attackers. Note that we do not address confidentiality of the multicast messages. Group key encryption is used to encrypt the multicast traffic using symmetric keys, and group key management is used for efficient re-keying for dynamic group memberships [3]. We describe the DIPLOMA architecture in Section 2, the threat model in Section 3, its extension to multicast in Section 4, and the implementation in Section 5. We describe our experimental methodology and results in Section 6. Section 7 discusses related work.
2 System Architecture

2.1 DIPLOMA Overview
In our architecture, one or more pre-defined nodes act as a group controller (GC), which is trusted by all the group nodes. A GC has the authority to assign resources to the nodes in the MANET. This resource allocation is represented as a credential (capability) called a policy token, which can express the services and the bandwidth a node is allowed to access. Policy tokens are cryptographically signed by the GC, and the signature can be verified by any node in the MANET. When a node (initiator) requests a service from another MANET node (responder) using the policy token assigned to the initiator, the responder can provide a capability back to the initiator. This is called a network capability, and it is generated based on the resource policy assigned to the responder and its dynamic conditions (e.g., level of utilization). Figure 1 gives a brief overview of DIPLOMA. All nodes in the path between an initiator and a responder (i.e., nodes relaying the packets) enforce and abide by the resource allocation encoded by the GC in the policy token and by the responder in the network capability. The enforcement involves both access control and bandwidth allocation. A responder accepts packets (except for the first) from an initiator only if the initiator is authorized to send, in the form of a valid network capability. It accepts the first packet only if the initiator's policy token is included. An intermediate node will forward the packets from a node only if they have an associated policy token or network capability, and if they do not
Fig. 1. System overview
violate the conditions contained therein. Possession of a network capability does not imply resource reservation; the capability only specifies the maximum limits a node can use. Available resources are allocated by the intermediate nodes in a fair manner, in proportion to the allocations defined in the policy token and network capability. The capability need not be contained in all packets. The first packet carries the capability, along with a transaction identifier (TXI) and a public key. Subsequent packets contain only the TXI and a packet signature based on that public key. Intermediate nodes cache policy tokens and network capabilities in a capability database, treating them as soft state. A capability database entry contains the source and destination addresses, the TXI, the capability, the public key for the packet signature, and packet statistics. Capability retransmissions update the soft state of intermediate nodes when the route changes due to node mobility. The soft state after a route change is also updated using an on-demand query for the capability database entry from the upstream nodes.
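The paper does not prescribe how an intermediate node shapes traffic to a capability's bandwidth field; a token bucket is one plausible realization. The sketch below is our illustration, not the authors' implementation:

```python
import time

class TokenBucket:
    """Per-capability rate limiter an intermediate node might use to
    enforce a capability's bandwidth limit (an illustrative sketch;
    DIPLOMA itself does not mandate a specific shaping algorithm)."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0      # refill rate in bytes per second
        self.capacity = burst_bytes     # maximum burst size in bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def allow(self, packet_bytes):
        """Return True if the packet conforms to the rate limit."""
        now = time.monotonic()
        # Refill tokens proportionally to the elapsed time, up to capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False
```

An intermediate node would keep one such bucket per capability database entry and drop (rather than queue) non-conformant packets, matching the drop-near-the-source behavior described above.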
2.2 Multicast Capability
DIPLOMA uses multicast capabilities for access control and bandwidth limitation. They have the same syntactic structure as unicast capabilities [2]:

  serial: 1307467
  owner: unit01.nj.army.mil (public key)
  destination: 225.1.1.8
  service: video
  bandwidth: 512kbps
  expiration: 2010-12-31 23:59:59
  flags: MCAST RW
  issuer: captain.nj.army.mil
  signature: sig-rsa 23455656769340646678
The above represents a policy token assigned by node captain.nj.army.mil to unit01. It is a multicast capability, since the destination address is a multicast address: unit01 can multicast video traffic at up to 512 kbps to the group 225.1.1.8. There are two types of multicast capabilities: the Multicast Send Capability (MSC) and the Multicast Receive Capability (MRC). The flags in the capability indicate its type. Nodes possessing an MSC can send traffic to the multicast group, limited by the bandwidth allocation in the capability; they can also join the multicast group and receive its traffic. Nodes possessing an MRC can join the multicast group only to receive data; they do not have the authority to send data to the group. The group controllers allocate MSCs, and hence MSCs are always policy tokens. MRCs can be either policy tokens or network capabilities; they are allocated by the group controller or by a sender that has the authority in the form of a policy token.
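The MSC/MRC semantics reduce to a simple authorization check. In this sketch the field names and the SEND flag are our stand-ins for the paper's MCAST RW flags, and the issuer's RSA signature verification is omitted:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MulticastCapability:
    """Skeletal multicast capability (illustrative; field names are ours)."""
    owner: str
    group: str            # multicast destination address
    bandwidth_bps: int    # bandwidth limit from the capability
    flags: frozenset      # {"MCAST", "SEND"} for an MSC, {"MCAST"} for an MRC

    def is_multicast(self):
        return "MCAST" in self.flags

    def may_join(self, group):
        # Both MSC and MRC holders may join the group and receive traffic.
        return self.is_multicast() and self.group == group

    def may_send(self, group, rate_bps):
        # Only an MSC holder may send, and only within its bandwidth limit.
        return (self.may_join(group) and "SEND" in self.flags
                and rate_bps <= self.bandwidth_bps)
```

A real deployment would verify the issuer's signature over all fields before honoring either check.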
3 Threat Model
Our goal is to protect network resources and the multicast traffic from denial of service attacks, and to enforce access control rules in the absence of a fixed topology. Thus, we want a receiver node to be able to access only the multicast services it is entitled to, and to limit the amount of traffic that can be sent to any multicast group by the authorized senders. To preserve bandwidth and power, we need to filter any unauthorized traffic early on. We assume MANET environments where an adversary may be an existing node that has been compromised (an insider) or a malicious external node that wants to participate in the MANET. In addition, there may be multiple cooperating adversaries, and compromised nodes may not be detected as such immediately, or ever (depending on their actions). The resources needed to access a service are allocated by the group controller(s) (GCs) of the MANET. Group controllers are nodes responsible for maintaining the group membership for a set of MANET nodes, and they authorize communications within the group a priori. This means that GCs do not participate in the actual communications, nor do they need to be consulted by nodes in real time; in fact, if they distribute the appropriate policies ahead of time, they need not even be members of the MANET. In most cases, the GC may be reachable through a high-energy-consumption, high-latency, low-bandwidth long-range link (e.g., a satellite connection); interactions over such a link should be kept to a minimum, and reserved for exceptional circumstances (e.g., revoking the access of compromised nodes). Without compromising a GC, an external node can participate in a MANET only by stealing the authorization credentials that are bound to the identity of a legitimate node. Because we envision GCs as being primarily offline or, at best, intermittently reachable (with respect to the MANET), we do not address the issue of compromised controllers in this paper.
If a node is compromised, an adversary can only access the services and bandwidth that node is authorized to access. If other MANET nodes are adhering
to our architecture, a compromised node does not have the ability to disrupt or interfere with end-to-end service connectivity and other nodes beyond its local radio communication radius. The nodes providing services will receive only the traffic that the compromised node is authorized to transmit, unless the adversary is in the local communication radius.
4 DIPLOMA for Multicast Protocols
Unlike the unicast implementation of DIPLOMA, the multicast implementation depends on the underlying multicast protocol used. This is because a multicast forwarding node cannot enforce the multicast receive capability without knowing about the receiver nodes, which requires interfacing with the multicast routing protocol. Hence, our implementation influences the protocol by snooping and filtering its packets, even though it does not directly modify the multicast protocol processing modules. DIPLOMA may also modify a packet immediately before it is sent to the physical interface and immediately after it is received on the interface. There are two types of multicast routing protocols. The first type comprises flooding-based protocols, where the multicast tree is created for the entire topology based on flooding; parts of the tree that have no receivers are later pruned by explicit prune or status discovery messages. An example of this type is Protocol Independent Multicast in Dense Mode (PIM-DM). This type of protocol is useful when most of the nodes in the network are members of the group. The second type, which is more predominant, creates a tree (or mesh) based on membership: a branch is created only if there is a node in that branch that wants to receive the multicast traffic from the group. There is no wasted data bandwidth in these protocols, even though the efficiency of the bandwidth usage depends on the type of tree construction. Examples of this type include Protocol Independent Multicast in Sparse Mode (PIM-SM), MAODV, and ODMRP. In this paper, we focus on implementing DIPLOMA on this second type of protocol. Here, receivers are required to send explicit messages to join the multicast tree. Such a message may traverse multiple intermediate nodes to reach the tree or the node in charge of constructing the tree.
Depending on the protocol, an intermediate node may forward this message directly, or send a different message to the same effect to its upstream node. We collectively call these messages Join-Tree messages. In the PIM-SM protocol, the Join-Tree messages are the IGMP membership report message and the Join/Prune message. In ODMRP, the Join Reply message serves this role. In DIPLOMA, we use Join-Tree messages to carry the MRCs. Nodes drop Join-Tree messages that do not contain valid MRCs. When there are multiple downstream receivers, the forwarding node needs to send only one of the MRCs upstream. A Join-Tree message forwarded by a node may therefore contain the MRC of a downstream node instead of its own; this happens when the node is just a forwarding node and not a member of the multicast group. To prevent rogue forwarding nodes from reusing an MRC for future multicast sessions, the receivers add an expiration time
to the MRC in Join-Tree messages. Receivers sign the (capability, timestamp) tuple with their private key; we call the result a time-stamped MRC. Many multicast protocols have explicit messages initiated by the sender to form the tree; for example, ODMRP has a Join Query message sent by the sender to initiate tree creation. Not all protocols have this mechanism: PIM-SM, for instance, does not require the sender to join the multicast group in order to send multicast packets. Hence, we do not rely on any protocol message to carry the MSC. Instead, we send the MSC when data traffic starts flowing, as in the unicast case. This has the advantage of treating multicast and unicast data the same way, independent of the underlying protocol. To provide added security, we also send the MSC in protocols that require an explicit tree-create message from the sender. An intermediate node forwards a multicast data packet only if both of the following conditions are satisfied:

1. The data packet has an associated MSC from the sender in the node's capability database, and the data packet conforms to the capability, in the form of a valid packet signature and adherence to the bandwidth constraints.
2. The node has a valid multicast receive capability from one of the receivers in the downstream path. The intermediate node forwards the packet on an interface only if it has a time-stamped MRC for a receiver that is reachable on that interface.

A receiving node may leave the multicast tree in two ways, depending on the multicast protocol. Some protocols support explicit leave messages. Since it may not always be possible to send a leave message (e.g., the receiver node crashed), the protocols also have a periodic membership query. When a receiver node receives such a query, it responds with some form of Join-Tree message. In a DIPLOMA-enabled system, the receiving node also includes a time-stamped MRC in those messages.
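A time-stamped MRC pairs the capability with an expiry and a signature over both. The sketch below uses an HMAC as a stand-in for the receiver's public-key signature described in the text; the function names are ours:

```python
import hashlib
import hmac
import time

def timestamp_mrc(mrc_bytes, key, lifetime_s=30, now=None):
    """Build a time-stamped MRC as (capability, expiry, tag).
    HMAC stands in for the receiver's asymmetric signature."""
    now = time.time() if now is None else now
    expiry = int(now + lifetime_s)
    msg = mrc_bytes + expiry.to_bytes(8, "big")
    tag = hmac.new(key, msg, hashlib.sha256).digest()
    return mrc_bytes, expiry, tag

def verify_timestamped_mrc(stamped, key, now=None):
    """Accept only an unexpired MRC whose tag covers (capability, expiry)."""
    mrc_bytes, expiry, tag = stamped
    now = time.time() if now is None else now
    msg = mrc_bytes + int(expiry).to_bytes(8, "big")
    expected = hmac.new(key, msg, hashlib.sha256).digest()
    return now < expiry and hmac.compare_digest(tag, expected)
```

Because the expiry is covered by the signature, a rogue forwarding node can neither extend an MRC's lifetime nor replay it into a future session after it expires.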
The intermediate (forwarding) node then forwards one of the time-stamped MRCs to its upstream node in its own Join-Tree message. When a node does not receive any time-stamped MRCs from the downstream nodes on an interface, that interface is pruned from the multicast tree (or mesh).
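The forwarding conditions, combined with the per-interface MRC expiry, give roughly the following decision logic (a sketch; the packet and database layouts are our assumptions):

```python
def should_forward(pkt, iface, cap_db, mrc_expiry_by_iface, now):
    """Decide whether an intermediate node forwards a multicast data
    packet on `iface`.

    pkt:                 dict with "src", "txi" (transaction id), "len"
    cap_db:              maps (sender, txi) -> {"sig_ok": fn, "rate_ok": fn}
    mrc_expiry_by_iface: maps iface -> expiry of the freshest time-stamped
                         MRC received from a downstream receiver there
    """
    entry = cap_db.get((pkt["src"], pkt["txi"]))
    if entry is None:
        return False                      # condition 1: no MSC for this flow
    if not entry["sig_ok"](pkt):
        return False                      # condition 1: bad packet signature
    if not entry["rate_ok"](pkt["len"]):
        return False                      # condition 1: over the bandwidth limit
    # Condition 2: a fresh time-stamped MRC from a downstream receiver on iface.
    return mrc_expiry_by_iface.get(iface, 0) > now
```

Interfaces whose MRC expiry has passed simply stop satisfying condition 2, which is what implements the pruning described above.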
4.1 Security Analysis
We now discuss how our architecture relates to the threat model described in Section 3. Since the capabilities are signed by a GC and are verifiable by all nodes, adversaries cannot generate their own valid capabilities; they can create valid capabilities only if the GC is compromised. Since the individual packets are signed, an adversary cannot transmit packets using a transaction id that does not belong to it. A compromised or malicious node that does not enforce the capability protocol can only have impact within its communication radius. Packets generated without a capability, or with a snooped transaction id, will be dropped by the neighboring nodes due to invalid signatures. A compromised
[Figure 2: ODMRP tree construction. (1) Sender S broadcasts a Join Query; (2) forwarding nodes F1 and F2 rebroadcast it; (3) receivers R1, R2 and R3 send Join Reply messages; (4) F1 and F2 forward the Join Replies back to S.]

Fig. 2. ODMRP Protocol
node can only access the services it is authorized to. Packets of nodes trying to use more bandwidth than is allocated to them will be rejected, and a malicious node doing this frequently can be detected and isolated. A multicast receiver can only join the multicast groups for which it possesses an MRC. Similarly, a multicast sender can send traffic only to the groups for which it possesses an MSC; furthermore, its traffic is limited by the bandwidth constraints of the MSC. Only the links that are part of the multicast tree or mesh actually carry the multicast traffic. Since the packets are signed, any injection of packets into a data stream is easily detectable by the nodes in the path.
4.2 DIPLOMA on ODMRP
Figure 2 gives a high-level overview of the On Demand Multicast Routing Protocol (ODMRP) [13]. A sender node S wants to multicast data to a group; three receiver nodes R1, R2 and R3 are part of the multicast group, and two nodes F1 and F2 are on the path from S to the receivers. We call these nodes intermediate nodes. When node S has data to multicast, it broadcasts a Join Query message to its neighbors to discover a multicast tree. This message is received by the intermediate nodes F1 and F2, which in turn broadcast it to their neighbors. Nodes R1 and R2 receive the Join Query from F1, and node R3 receives it from F2. The receiver nodes send a Join Reply message back to the nodes from which they received the Join Query (i.e., the upstream nodes F1 and F2). Once F1 and F2 receive the Join Reply messages, they become part of the forwarding group and forward the Join Reply messages to S. In DIPLOMA systems running over ODMRP, Join Query messages are modified to contain the sender's MSC, together with the transaction id and the key that the sender will use for subsequent communication. The intermediate nodes store this capability information temporarily and forward the Join Query message to their neighbors. On receiving this Join Query, a receiver node in the multicast group responds with a Join Reply message, modified to contain the receiver's time-stamped MRC, which authorizes the node to be part of the multicast group. On receiving a Join Reply, the intermediate node becomes part of the forwarding group, installs the saved MSC in its capability database, and forwards the Join Reply to its upstream node (i.e., towards the sender). It is
240
M. Alicherry and A.D. Keromytis
possible for the intermediate node to receive Join Replies from multiple receivers with different MRCs. The intermediate node needs to forward only one of them to its upstream node. Then the forwarding node starts forwarding the multicast data traffic to the downstream nodes. Similar to the unicast case, the forwarding nodes enforce the capability for all the multicast packets. Whenever the time stamped MRCs expire, a forwarding node stops forwarding any multicast packet received by the node. ODMRP is a stateless protocol that does not have any multicast leave or prune messages. Instead, the tree is valid only for certain duration. The tree is completely dissolved when that timer expires. Furthermore, the receiver nodes can respond with Join-Reply messages only when it receives Join-Request message from a sender. There is no mechanism for a new receiver to add itself to an existing multicast tree. The sender maintains the multicast tree, and adds new receivers by periodically sending the Join-Query message. The DIPLOMA keeps the time stamped MRC up to date through this periodic tree maintenance protocol. Whenever a receiver gets a new Join-Query message, it creates a new time stamped MRC to respond back in the Join-Reply. To maintain continuous multicast data session, it is important for the period in which a new Join-Query is generated to be less than the validity duration of the time stamped MRC. 4.3
DIPLOMA on PIM-SM
Protocol Independent Multicast - Sparse Mode (PIM-SM) is a popular multicast routing protocol that is independent of the underlying unicast protocol. It works in conjunction with the Internet Group Management Protocol (IGMP). The protocol explicitly creates a tree from the sender to the receivers. In PIM-SM, one of the routers is designated as the Rendezvous Point (RP) for a multicast group. All the other routers join the group through the RP. Whenever a node wants to join a multicast group, it conveys this through an IGMP membership report message. A designated router (DR) for the node then sends periodic PIM Join/Prune messages towards the RP for the multicast group. Each router along the path to the RP updates its packet forwarding state (routing entries) and sends the Join/Prune message onwards towards the RP. Whenever a node wants to send traffic to the multicast group, its DR encapsulates the data in PIM Register messages and unicasts them to the RP. The RP decapsulates the messages and sends the data towards the receivers in the multicast tree. If the data rate from the sender is high, the RP sends a source-specific Join/Prune message towards the sender. This extends the tree to the sender, and the sender can then send multicast messages directly to the tree without encapsulating them. If the data rate warrants it, any DR can join the source-specific shortest path tree by sending a Join/Prune message towards the sender, and prune the shared tree towards the RP.

We can enable DIPLOMA in multicast systems running PIM-SM by including the multicast capabilities in the IGMP and PIM messages. Whenever a receiver sends an IGMP membership report message, its time stamped MRC is included. DIPLOMA systems reject any membership report without the capability. A DR includes one of the time stamped capabilities of the downstream nodes in the Join/Prune messages it sends towards the RP or the source node. When a router receives a Prune message, the corresponding time stamped MRC is removed from its tables. A node stops forwarding packets when it does not have any valid time stamped MRC from the downstream receivers.

Fig. 3. DIPLOMA Implementation

The multicast packets are sent similarly to the unicast case. Before sending data, the sender multicasts its MSC in a capability request packet, together with the transaction identifier and the key for the packet signatures. This packet goes to the RP as a regular multicast packet or a Register packet; the RP in turn sends the packet to the multicast group (after decapsulation for Register packets). All the nodes in the multicast tree add the capability to their capability database. If it is a Register packet, then the nodes on the path between the sender and the RP also extract the capability, transaction id and signature key from the capability request, and install them in their databases. Any subsequent data packet multicast by the sender contains the transaction id and the packet signature. The signature is verified and the bandwidth is enforced by all the nodes in the multicast tree, and by the nodes between the sender and the RP in the case of Register packets.

If a receiver node joins the multicast tree after the transmission of the initial capability request packet by the sender, it will not be able to validate the multicast data packets. DIPLOMA solves this in two ways. First, the sender periodically multicasts the capability request packet; the new receiver node can start accepting data packets after the next periodic multicast. Second, the receiver can send a request for the capability towards the sender using a DIPLOMA control (or error) packet.
On receiving this request, either an intermediate node or the sender responds with the capability and the public key for the signature.
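To make the forwarding-node state concrete, the following sketch (hypothetical names and API, not the authors' implementation) shows the per-group table of timestamped MRCs described above: a Join installs an MRC, a Prune removes it, and the node forwards a group's traffic only while at least one downstream MRC is still within its validity window.

```python
import time

class CapabilityTable:
    """Illustrative per-group MRC table for a forwarding node."""

    def __init__(self):
        # group -> {receiver_id: expiry time of that receiver's MRC}
        self.mrcs = {}

    def install(self, group, receiver_id, timestamp, validity):
        # Called when a Join (IGMP report / PIM Join) carries a valid
        # timestamped MRC for this group.
        self.mrcs.setdefault(group, {})[receiver_id] = timestamp + validity

    def prune(self, group, receiver_id):
        # Called on a PIM Prune: drop that receiver's MRC.
        self.mrcs.get(group, {}).pop(receiver_id, None)

    def should_forward(self, group, now=None):
        # Forward only while some downstream MRC is still valid.
        now = time.time() if now is None else now
        return any(exp > now for exp in self.mrcs.get(group, {}).values())
```

Expiry of all MRCs, or a Prune removing the last one, makes `should_forward` return False, which is exactly the "stop forwarding when no valid time stamped MRC remains" behavior described above.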
5
Linux Implementation
We now describe the implementation of multicast DIPLOMA on a Debian Linux system running kernel 2.6.30. For multicast routing, we use pimd, a PIM-SM package that comes with the Debian distribution. Since PIM-SM requires a separate unicast routing protocol, we use the University of Uppsala's AODV implementation, called AODV-UU. Our implementation does not require any changes to the application programs, the routing module or the PIM-SM daemon.

Multicast DIPLOMA is implemented as a user-level process, called the DIPLOMA engine, that interfaces with the rest of the Linux packet processing subsystem using the netfilter framework. We use the netfilter queue to receive, modify, and filter packets in the DIPLOMA engine. Figure 3 shows how the DIPLOMA engine interfaces with the netfilter subsystem. A brief description of the netfilter framework and how DIPLOMA uses it for handling unicast traffic can be found in [1]. The dotted lines are the hooks used only for multicast traffic; the solid lines show the hooks used for both unicast and multicast traffic. Additional hooks are needed for multicast traffic because of the way PIM-SM is implemented in Linux: it uses raw sockets to send and receive traffic, and these packets do not go through the INPUT and OUTPUT hooks, but traverse the PREROUTING and POSTROUTING hooks. Next, we describe the packet flow for control packets (i.e., IGMP and PIM packets) and multicast data packets.

5.1
Membership Messages
When the system sends a membership message, in the form of an IGMP membership message or a PIM Join/Prune message, the DIPLOMA engine receives the packet on the OUTPUT hook. It checks for a valid MRC for the message in its database. The valid capability may be either the node's own capability, or a capability it received from a downstream node. The engine adds the capability to the packet and gives an ACCEPT verdict on the hook.

When the system receives a membership message on the PREROUTING hook, it validates the packet. A valid packet needs to contain a valid MRC. The node saves the MRC in its tables for subsequent requests to the upstream node. The capability is then removed from the packet and an ACCEPT verdict is given; the PIM-SM daemon receives the packet over the raw socket. The engine drops any membership message without a valid capability.

5.2
Capability Establishment
When a sender needs to multicast data, it creates a transaction identifier to be used with subsequent packets to identify the session. It also creates an RSA key for signing the data packets of that session. The sender sends the transaction id, the public key and the MSC authorizing it to send the multicast traffic as a DIPLOMA control message. The DIPLOMA engine sends this message when it first sees a packet from the sender for a multicast group. The application program sending the multicast data need not be aware of this step.

When a multicast member node or a forwarding node receives this message, it validates the capability and stores the transaction id, the public key and the MSC in its capability database. These nodes validate the subsequent data packets coming from the sender against the capability and verify the packet signatures.

To update new receivers, or new intermediate nodes after a route change, the sender multicasts the capability establishment packet periodically. A receiver node can also request, in a unicast message, that the sender send the capability establishment packet, when it does not have that information due to joining late or a route change.

5.3
Multicast Data Packets
All multicast data packets need to contain an associated capability. The DIPLOMA engine at the sender modifies the outgoing packets in the OUTPUT hook by including a capability header, which contains the transaction identifier and the packet signature. The packets sent to a multicast group are treated together as a block for the signature computation [1]. A packet block contains a maximum of block size (P) packets that are sent within the block timeout interval (T). The packet signatures for a block consist of an RSA signature for the first packet and SHA-1 hashes for the remaining packets. The RSA signature is verifiable with the key sent in the capability establishment phase. The SHA-1 hashes are integrity protected by including them in the first packet.

The engine at an intermediate node receives the multicast packet on the FORWARD hook. The engine validates the packet against the capability using the transaction identifier. The validation includes checking whether there is a valid MSC in its database associated with the transaction identifier, whether the packet has a valid signature, and whether the packet conforms to the bandwidth constraints of the capability. If the packet is valid, the engine gives an ACCEPT verdict for forwarding the packet. If the packet is destined for the node itself as a receiver in the multicast group, DIPLOMA receives the packet on the INPUT hook. The engine validates the packet as above, removes the capability header from the packet and gives an ACCEPT verdict, causing the kernel to deliver the packet to the application.
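The block-signature structure above can be sketched as follows. The RSA operation is abstracted behind `sign_fn`/`verify_fn`, and the value of P and all names are illustrative, not taken from the authors' code.

```python
import hashlib

BLOCK_SIZE = 8  # P: maximum packets per block (illustrative value)

def make_block(payloads, sign_fn):
    # The SHA-1 digests of packets 2..P are appended to the first packet,
    # and only that first packet is signed (in DIPLOMA, with the RSA key
    # distributed during capability establishment).
    assert 0 < len(payloads) <= BLOCK_SIZE
    digests = [hashlib.sha1(p).digest() for p in payloads[1:]]
    first = payloads[0] + b"".join(digests)
    return first, sign_fn(first), payloads[1:]

def verify_block(first, signature, rest, verify_fn):
    # Check the signature on the first packet, then check each remaining
    # packet against the digest list the first packet carried.
    if not verify_fn(first, signature):
        return False
    digests = first[len(first) - 20 * len(rest):]  # SHA-1 digests are 20 bytes
    return all(hashlib.sha1(p).digest() == digests[20 * i:20 * (i + 1)]
               for i, p in enumerate(rest))
```

A verifier thus needs only one public-key operation per block; the remaining packets are checked with hashes, which keeps the per-packet validation cost low.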
6
Experimental Evaluation
In this section, we evaluate the effectiveness of multicast DIPLOMA. First, we compare the throughput, packet loss and inter-arrival times of the systems with and without multicast DIPLOMA using periodic traffic. We also study these parameters using real video streaming traces. Finally, we study the effectiveness of DIPLOMA in containing attacker nodes.

6.1
Testbed
We implemented the multicast DIPLOMA engine as described in Section 5 on Linux systems running Debian Linux with kernel 2.6.30. We use AODV-UU for routing unicast traffic, modified to handle multiple interfaces. For multicast routing, we use the PIM-SM implementation pimd that is available with the Debian Linux distribution. We run the resulting system on multiple nodes in the Orbit lab wireless testbed (http://www.orbit-lab.org/). Orbit is an indoor wireless testbed consisting of 400 nodes arranged as a 20x20 grid over a physical area of 20m x 20m. Each node contains a 1-GHz VIA C3 processor, 512 MB RAM, a 20 GB hard disk, two wireless mini-PCI 802.11 a/b/g interfaces, and two 100BaseT Ethernet ports.

Since the nodes are within communication range of each other in the Orbit testbed, we use channel hopping to create multi-hop topologies. The traditional method of MAC-address-based filtering to create multi-hop topologies is not suitable for studying a security system like DIPLOMA, since an attacker node can cause damage within its communication radius.

Since the DIPLOMA engine is a user-level process, all packets are queued for user-level processing before transmission. To make a fair comparison, we also do similar queuing of packets to a user-level process on systems not running DIPLOMA (called original). This user-level program gives an ACCEPT verdict on all packets, without any processing.

For measuring the performance of DIPLOMA, we use two topologies: a line topology and a tree topology. In the line topology, nodes are allocated channels in such a way that each node can directly communicate only with its neighbors on either side (except for the first and last nodes, which have only one neighbor). In this topology, the first node is the sender of the multicast, and all the remaining nodes subscribe to the multicast group. The tree topology is shown in Figure 4. The links are labeled with the channel on which the nodes communicate. Here the sender is node 0 (the root), and the multicast receivers are nodes 3, 4, 5 and 6 (the leaf nodes). In the figure, the solid lines show the multicast tree and the dashed lines show the links that are not participating in the multicast.

We use the multi-generator tool mgen [14] from the Naval Research Laboratory to send and receive traffic in our experiments. Each data point in this section represents an average over six experiment runs, each sending traffic for 30 seconds.

Fig. 4. Tree topology

Fig. 5. Throughput for line topology

6.2
Line Topology
In this set of experiments, we study the performance of DIPLOMA and the original scheme for the line topology. The sender sends periodic traffic of 1024-byte packets at rates of 100, 300 and 500 packets per second. This corresponds to rates of 819.2 Kbps, 2.4576 Mbps and 4.096 Mbps, respectively.
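These figures are simply the packet rate times the 1024-byte packet size; as a quick sanity check:

```python
PKT_BYTES = 1024  # payload size used in the experiments

for rate_pps in (100, 300, 500):
    kbps = rate_pps * PKT_BYTES * 8 / 1000   # packets/s -> Kbps
    print(f"{rate_pps} pkts/s = {kbps} Kbps")
# prints 819.2, 2457.6 and 4096.0 Kbps
```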
Fig. 6. Packet loss for line topology

Fig. 7. Packet inter arrival times for line topology
Figure 5 shows the throughput received by the nodes at different hop lengths for different transmission rates. For the rates of 100 and 300 pkts/sec, both the DIPLOMA and the original schemes receive bandwidth close to the send bandwidth for all hop counts. The bandwidth for DIPLOMA is minimally lower (0.7% and 3.7%, respectively) than the original. For the rate of 500 pkts/sec, the received bandwidth drops as the hop count increases, because the available bandwidth decreases with the number of hops. Here the bandwidth for DIPLOMA is 6.6% lower than the original, due to the larger headers and extra processing DIPLOMA requires.

Figure 6 shows the packet loss for the same experiments. The packet losses are less than 1% for both schemes at the rates of 100 and 300 pkts/sec. The packet losses are higher for the rate of 500 pkts/sec, which explains the lower throughput as the hop count increases. The packet loss is about 5% higher for DIPLOMA, due to the larger headers, which require more bandwidth.

Figure 7 shows the packet inter-arrival times for the same experiments. For the rates of 100 and 300 pkts/sec, the inter-arrival time is close to the inverse of the send rate. The inter-arrival time for DIPLOMA is slightly higher than the original, due to the extra processing required. For the 500 pkts/sec rate, the inter-arrival time increases with hop count, due to the correspondingly higher packet loss.

6.3
Tree Topology
We study the throughput, packet loss and inter-arrival times for the tree topology given in Figure 4. Here the root node 0 is the sender and the leaf nodes 3, 4, 5 and 6 are the receivers. Figure 8 shows the throughput at the nodes for both schemes. Though the nodes are at the same distance from the root, they receive different bandwidths. This may be because of channel conditions and packet scheduling. For some nodes, DIPLOMA receives higher bandwidth than the original. The sum of the bandwidth received by all four receivers is slightly higher for the original scheme compared to the DIPLOMA scheme: the total bandwidth is 2.1%, 2.1% and 3.6% higher, respectively, for the rates of 100, 300 and 500 pkts/sec.
Fig. 8. Throughput for tree topology

Fig. 9. Packet loss for tree topology

Fig. 10. Packet inter arrival times for tree topology
Figure 9 shows the packet loss at the nodes for both schemes. Unlike the line topology, there were some packet losses (6% to 9%) at the rates of 100 and 300 pkts/sec for both schemes on some of the nodes. This may also be due to channel conditions. Figure 10 shows the packet inter-arrival times for both schemes. Here also, for some receivers, the inter-arrival times were shorter for DIPLOMA. However, on average, the inter-arrival times for DIPLOMA were slightly higher than the original, due to the larger processing delays and the extra headers in DIPLOMA.
Fig. 11. Streaming video throughput for line topology

Fig. 12. Streaming video packet inter arrival times for line topology
Fig. 13. Streaming video throughput for the tree topology

Fig. 14. Streaming video packet inter arrival times for the tree topology

6.4
Streaming Video
In this set of experiments, we study the performance of streaming video. The experiments were conducted by creating a trace of streaming video using EvalVid [12], and sending packets based on that trace using mgen. Figures 11 and 12 show the throughput and the inter-arrival times for the streaming video on the line topology. The results show that both the DIPLOMA and the original schemes receive the full bandwidth of the video, and that the packets are received at constant inter-arrival times. Figures 13 and 14 show the results for the tree topology. There was a small loss at two of the nodes for both schemes. This behavior is similar to the results for the periodic traffic.

6.5
Attacker Resiliency
Now we study the effectiveness of multicast DIPLOMA in containing attackers. We use the topology given in Figure 15. The solid lines show the multicast tree and the dashed lines show the unicast path; the labels on the links show the channels. In the experiments below, nodes 0 and 1 are the senders. These nodes have only their neighboring nodes, 2 and 3 respectively, within their communication radius. Hence nodes 2 and 3 are the only nodes that DIPLOMA cannot protect against, when they misbehave at the physical or MAC layer.

We study how DIPLOMA can protect multicast sessions when there is a DoS attacker sending high-rate traffic. Node 1 (the attacker) sends periodic traffic of 1024-byte packets at a rate of 1000 packets per second (i.e., 8.19 Mbps) to node 7. The allocated bandwidth for the attacker was 1 Mbps. At the same time, node 0 multicasts periodic traffic of 1024-byte packets to receiver nodes 4, 5 and 6 at rates of 100 pkts/s (i.e., 819.2 Kbps) or 300 pkts/s (i.e., 2.45 Mbps).

Fig. 15. Attack topology

Fig. 16. Throughput in presence of unicast attacker

Figure 16 shows the throughput at the three multicast receivers and the unicast receiver (attack traffic). With DIPLOMA, the attacker is able to achieve a bandwidth of 844 Kbps, which is the allocated bandwidth (minus the overhead). The multicast receivers receive close to their send bandwidth: on average 749 Kbps and 1.80 Mbps respectively for the 100 and 300 pkt/s traffic. For the original scheme, the attacker takes up most of the bandwidth, at 8.04 Mbps. The multicast traffic receives only a fraction of its send bandwidth: the multicast receivers receive on average only 517 Kbps and 788 Kbps respectively for the 100 pkt/s and 300 pkt/s traffic.
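The capped attacker throughput reflects DIPLOMA's per-capability bandwidth enforcement at the forwarding nodes. Such metering can be sketched as a token bucket (illustrative only; the parameter names are ours, not from the DIPLOMA implementation):

```python
class TokenBucket:
    """Illustrative per-capability rate limiter at a forwarding node."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0   # refill rate in bytes per second
        self.burst = burst_bytes     # maximum accumulated allowance
        self.tokens = burst_bytes
        self.last = 0.0

    def allow(self, pkt_bytes, now):
        # Refill tokens for the elapsed time, then charge the packet
        # against the allowance; packets beyond it are dropped.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if pkt_bytes <= self.tokens:
            self.tokens -= pkt_bytes
            return True
        return False
```

With a 1 Mbps allocation, a node metering the 8.19 Mbps flood this way would drop most of it at the first enforcing hop, which is consistent with the roughly 844 Kbps the attacker achieves above.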
7
Related Work
The concept of capabilities was first used in operating systems for securing resources [17]. Follow-on work investigated the controlled exposure of resources at the network layer using the concept of "visas" for packets [7], which are similar to network capabilities. More recently, network capabilities were proposed to prevent DoS attacks in wired networks [4]. We extend the concept to MANETs and use it for both access control rules and traffic shaping parameters.

A survey of security issues and solutions for multicast in wired networks is presented in [3]. The authors classify the issues and solutions based on the three properties described in Section 1. The solutions are specific to wired networks and not directly applicable to MANETs, which have no specialized router nodes. A number of solutions have been proposed for multicast receiver access control [10,9,5]. These solutions rely on trusted routers or query centralized servers; neither approach is suitable for MANETs. These protocols also do not limit the amount of service accessed. DIPLOMA provides a unified solution to both receiver and sender access control, and supports bandwidth constraints.

There are a number of multicast routing protocols proposed for MANETs; a survey of these protocols is presented in [6]. There has also been work on the security issues of these protocols. A discussion of possible attacks on MAODV (multicast-extended AODV) routing can be found in [15]. The authors also propose an authentication framework to protect an MAODV network against these attacks. Tactical MAODV [16] extends MAODV through the integration of the security services necessary for the tactical deployment of MANETs, such as forward and backward secrecy and data confidentiality. In [8], the authors extend their multicast MANET protocol MMARP with digital signatures using a public key scheme. They use Cryptographically Generated Addresses (CGA) to keep attackers from impersonating other nodes. [11] introduces a protocol for secure communication in multicast groups that uses a pair of multicast trees for each multicast group: one for security information and the other for data traffic.
8
Conclusions and Future Work
We presented multicast DIPLOMA, an architecture for securing multicast traffic in MANETs. DIPLOMA is a deny-by-default, distributed policy enforcement architecture based on network capabilities. It prevents unauthorized senders from sending packets to a multicast group, and unauthorized receivers from joining a multicast group, protecting both end-host resources and network bandwidth. We showed that popular multicast protocols such as ODMRP and PIM-SM can be modified to incorporate DIPLOMA. We implemented DIPLOMA on Linux running PIM-SM, without any changes to applications or routing. We evaluated the system on the Orbit MANET testbed and showed that the impact of the scheme on throughput, packet loss, and packet inter-arrival times is minimal. We also showed that DIPLOMA allocates resources in a fair manner even in the presence of attackers, protecting legitimate traffic.
Acknowledgements

This work was supported in part by the National Science Foundation through Grant CNS-07-14277 and by ONR through Grant N00014-09-10757. Any opinions, findings, conclusions, and recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the NSF, ONR, or the US Government.
References

1. Alicherry, M., Keromytis, A.D.: DIPLOMA: Distributed Policy Enforcement Architecture for MANETs. In: International Conference on Network and System Security (September 2010)
2. Alicherry, M., Keromytis, A.D., Stavrou, A.: Deny-by-Default Distributed Security Policy Enforcement in Mobile Ad Hoc Networks. In: SecureComm (September 2009)
3. Judge, P., Ammar, M.: Security issues and solutions in multicast content distribution: A survey. IEEE Network 17 (2003)
4. Anderson, T., Roscoe, T., Wetherall, D.: Preventing Internet Denial-of-Service with Capabilities. In: Proc. of Hotnets-II (2003)
5. Ballardie, A., Crowcroft, J.: Multicast-Specific Security Threats and Countermeasures. In: SNDSS (1995)
6. Cordeiro, C.M., Gossain, H., Agrawal, D.: Multicast over Wireless Mobile Ad Hoc Networks: Present and Future Directions. IEEE Network 17 (2003)
7. Estrin, D., Mogul, J.C., Tsudik, G.: Visa protocols for controlling interorganizational datagram flow. IEEE JSAC (May 1989)
M. Alicherry and A.D. Keromytis
Preimage Attacks against Variants of Very Smooth Hash

Kimmo Halunen and Juha Röning

Oulu University Secure Programming Group
Department of Electrical and Information Engineering
P.O. Box 4500, 90014 University of Oulu
[email protected]

Abstract. In this paper, we show that some new variants of the Very Smooth Hash (VSH) function are susceptible to similar types of preimage attacks as the original VSH. We also generalise the previous mathematical results that have been used in the preimage attacks. VSH is a hash function based on the multiexponentiation of prime numbers modulo a large product of two primes. The security proof of VSH rests on computational problems in number theory related to the problem of factoring large integers. However, the preimage resistance of VSH has been studied and found somewhat lacking, especially in password protection. Many different variants of VSH have been proposed by the original authors and others. In particular, the discrete logarithm version of VSH was proposed in order to make the hash values shorter, and further proposals have used the discrete logarithm in finite fields and on elliptic curves to shorten the hash length even more. Our results demonstrate that the same ideas for preimage attacks can be applied to these new variants as to the original VSH, and that they result in effective preimage attacks.
1 Introduction
Hash functions are a fundamental building block in modern cryptography. These functions accept any message as their input and output a relatively short hash value of fixed length, e.g., 256 bits. In recent years, hash functions have been studied very actively as the candidates for the new SHA-3 standard are being vetted. For a concise and excellent survey on the state of the art of hash functions, see [16]. For a hash function to be useful in cryptography, there are three basic properties that the function should possess. First of all, the hash function $f$ should be preimage resistant, i.e., for any $y = f(x)$, finding $x$ should be hard. Furthermore, $f$ should be second preimage resistant, i.e., given $x$ and $f(x)$ it should be difficult to find $x' \neq x$ with $f(x) = f(x')$. Finally, $f$ should be collision resistant, meaning that finding any two distinct messages $x$ and $y$ with $f(x) = f(y)$ should be hard.

The work of the authors is supported by Infotech Oulu.

I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 251–266, 2010. © Springer-Verlag Berlin Heidelberg 2010

K. Halunen and J. Röning

Some other, more fine-grained properties have also been proposed, for example in [19] and [9]. Most of the modern hash functions are based on the ideas of Merkle [15] and Damgård [4]. These results established the iterative paradigm, where a compression function that accepts only fixed-length input is used iteratively to form a hash function that can accept messages of arbitrary length as its input. They also show that this construction preserves the collision resistance of the underlying compression function. This type of construction has become the norm in modern hash functions, as can be seen from MD5, SHA-1 and the other hash functions in use today, as well as from the SHA-3 contestants and other new hash function proposals. The competition and new ideas are necessary, as the current hash functions, and even the iterative method in general, have been found quite lacking in security; see, e.g., the attacks published in [25,24,23,5,8,9].

The Very Smooth Hash (VSH) function was proposed by Contini et al. in [3]. The idea of VSH was to use number theory and the difficulty of integer factoring as a basis for building a provably collision resistant hash function. This is possible even though the underlying compression function is not collision resistant [3]. Previously, many hash functions did not have a proof of their collision resistance, as it is very hard to prove a compression function collision resistant and thus be able to apply the results of Merkle and Damgård. Usually, hash functions have only heuristic security arguments presented for them, and practical attacks against specific functions then reveal the strengths and weaknesses of the designs. Even though VSH is provably collision resistant, the authors pointed out that preimage resistance cannot be guaranteed for short messages.
Furthermore, Saarinen discovered in [21] a certain equivalence between VSH hash values and demonstrated with a toy example that it could be used to attack passwords. These ideas were further explored by Halunen et al. in [7] and [6], where a more realistic password scheme was attacked and the technique was shown to have more potential, especially in finding the preimages of multiple messages secured with the same VSH instance. VSH has many different variants, some of which were proposed already by the original authors in [3]. However, these variants share the drawback of a very large hash length (1024 or more bits) needed to achieve the necessary level of security. A discrete logarithm variant of VSH, VSH-DL, was proposed by the original authors in order to alleviate this problem. In [13] Lenstra et al. show that the hash length can be shortened by the use of finite fields or elliptic curves instead of the prime numbers of the original VSH and VSH-DL. In this paper, we demonstrate that even the new variants proposed in [13] are susceptible to attacks similar to those demonstrated in [21,7,6]. We also show some generalisations of the results of Saarinen. These results can be applied in some scenarios to achieve even better preimage attacks against VSH hash values. Our paper is organised in the following way. The next section contains the necessary definitions and describes in detail the VSH algorithm and its variants from [13]. In the third section, we recall the preimage finding methods by
Saarinen [21] and Halunen et al. [7,6]. The fourth section then contains the generalisations of Saarinen's results, together with the results that enable the use of these methods also against the discrete logarithm variants of VSH from [13]. The fifth section presents results on the application of the preimage finding methods to these new VSH-DL variants. The final sections contain a discussion of the findings and the concluding remarks on our research.
2 Very Smooth Hash and Its Variants
The basic definitions given here follow the ones from [3] and [13]. We denote by $p_i$ the $i$th prime number, i.e., $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, etc., with the convention that $p_0 = -1$. We define a constant $c > 0$ and denote by $n$ a hard-to-factor composite integer (usually an RSA modulus [18]). An integer $a$ is $p_k$-smooth if all its prime factors are smaller than or equal to $p_k$. Furthermore, $a \in \mathbb{Z}$ is a very smooth quadratic residue modulo $n$ if the largest prime dividing $a$ is at most $(\log n)^c$ and $a \equiv x^2 \pmod{n}$ for some $x \in \mathbb{Z}$. Here $x$ is a modular square root of $a$.

2.1 The Basic VSH
Next we present the basic VSH algorithm from [3].

The VSH Algorithm
1. Let $M = m_1 m_2 \cdots m_t$ be a message, with $m_j$ the $j$th bit of the message. Let $k$ (the block length) be the largest integer with $\prod_{i=1}^{k} p_i < n$, and assume that $t < 2^k$. Let $x_0 = 1$.
2. Let $l = \lceil t/k \rceil$ be the number of message blocks in $M$. Pad the message by setting $m_i = 0$ for all $t < i \leq lk$.
3. Let $t = b_1 b_2 \cdots b_k$ be $t$ in binary. Let $m_{lk+i} = b_i$ for all $1 \leq i \leq k$.
4. For $j = 0, 1, \ldots, l$ calculate
$$x_{j+1} = x_j^2 \cdot \prod_{i=1}^{k} p_i^{m_{jk+i}} \bmod n.$$
5. Return $x_{l+1}$ as the hash value of $M$.

If we define $e_i = \sum_{j=0}^{l-1} m_{j\cdot k+i}\, 2^{l-j-1}$, the result of the VSH algorithm is the value of the multiexponentiation $\prod_{i=1}^{k} p_i^{e_i} \bmod n$, as stated in [3]. The security of VSH is based on the VSSR problem, which is defined as follows (see also [3, Def. 3]).

Definition 1 (The VSSR problem: Very Smooth number nontrivial modular Square Root). Let $n$ be a product of two primes of approximately the same size, and let $k \leq (\log n)^c$. The VSSR problem is the following: Given $n$, find $x \in \mathbb{Z}_n^*$ such that $x^2 \equiv \prod_{i=0}^{k} p_i^{c_i} \pmod{n}$, where $c_i \in \mathbb{Z}$ for all $i \in \{0, 1, \ldots, k\}$ and at least one of the exponents $c_i$ is odd.
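To make the iteration concrete, the steps of the VSH algorithm above can be sketched in Python. This is a toy illustration only: the modulus and the prime table are invented and far too small, so the sketch offers no security and is not the implementation of [3].

```python
from math import prod

# First few primes p_1, p_2, ... (toy list; a real VSH uses many more).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]

def vsh(bits, n):
    """Toy basic VSH of a list of message bits, following steps 1-5 above."""
    # Step 1: largest k with p_1 * ... * p_k < n.
    k = 0
    while k < len(PRIMES) and prod(PRIMES[:k + 1]) < n:
        k += 1
    t = len(bits)
    assert t < 2 ** k
    # Steps 2-3: zero-pad to l full blocks, then append t in k bits.
    l = -(-t // k)                      # ceil(t / k)
    m = bits + [0] * (l * k - t)
    m += [(t >> (k - i)) & 1 for i in range(1, k + 1)]
    # Step 4: x_{j+1} = x_j^2 * prod_i p_i^{m_{jk+i}} mod n.
    x = 1
    for j in range(l + 1):
        x = x * x % n
        for i in range(k):
            if m[j * k + i]:
                x = x * PRIMES[i] % n
    return x                            # step 5: x_{l+1}

h = vsh([1, 0, 1, 1, 0], 211 * 199)     # toy RSA-style modulus n
```

With the toy modulus $211 \cdot 199$ the block length works out to $k = 6$, since the product of the first six primes (30030) is still below $n$.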
In [3] the authors describe in detail the different assumptions and security limits for choosing a suitable modulus $n$ for VSH hash functions. We omit the details, as these are mostly relevant to the collision resistance proof of the hash function. It suffices to say that finding a collision for VSH can be reduced to the VSSR problem (and to the computational VSSR problem, which gives reasonable values for the size of $n$). These problems are tightly linked to integer factoring algorithms such as the number field sieve (NFS) [11]. The authors of [3] state that 1024-bit VSH gives security comparable to 840-bit RSA. The hash length of VSH is thus much greater than that of the hash functions currently in use, such as SHA-1 and SHA-256. Furthermore, the algorithm is slower than these hash functions, as demonstrated by the comparison in [3].

2.2 Discrete Logarithm Variant of VSH
The original authors propose several variants of the original VSH, such as the cubing variant and Fast-VSH. One of the many variants proposed in the original article was VSH-DL, the discrete logarithm variant. In this case, the hard problem to be solved is the following (see [3, Def. 4]).

Definition 2 (The NDLVS problem: Nontrivial Discrete-Log of Very Smooth numbers). Let $p = 2q + 1$ be prime, with $q$ prime, and $k \leq (\log p)^c$. The NDLVS problem is the following: Given $p$, find integers $d_1, d_2, \ldots, d_k$, with $|d_i| < q$ for all $i \in \{1, 2, \ldots, k\}$ and at least one $d_i \neq 0$, such that $2^{d_1} \equiv \prod_{i=2}^{k} p_i^{d_i} \pmod{p}$.

Now the NDLVS problem can be used to prove the collision resistance of the following compression function.

The VSH-DL compression function
1. Let $p$ be an $S$-bit prime and $q$ a prime with $p = 2q + 1$. Let $k$ be a fixed integer length ($k \approx S/\log S$). Let $M = m_1 m_2 \cdots m_t$ be a $t$-bit message with $t < (S - 2)k$.
2. Let $l = \lceil t/k \rceil$ be the number of message blocks in $M$. Pad the message by setting $m_i = 0$ for all $t < i \leq lk$.
3. Let $t = b_1 b_2 \cdots b_k$ be $t$ in binary. Let $m_{lk+i} = b_i$ for all $1 \leq i \leq k$.
4. For $j = 0, 1, \ldots, l$ calculate
$$x_{j+1} = x_j^2 \cdot \prod_{i=1}^{k} p_i^{m_{jk+i}} \bmod p.$$
5. Return $x_{l+1}$ as the hash value of $M$.

The above compression function takes as input message blocks of length at most $(S - 2)k$. The proof of the collision resistance of VSH-DL is very similar to the proof for the original VSH. Thus the VSH-DL compression function can be extended to a collision resistant hash function by the results of Merkle and Damgård.
2.3 VSH-DL Variants in Finite Fields and Elliptic Curves
Even the original authors state that the hash length of VSH is quite long in comparison with other hash functions. Thus in [13] Lenstra et al. propose new methods for shortening the hash length of VSH-DL by applying arithmetic in finite fields and on elliptic curves. Previously, these methods have been used mainly in public key cryptography, where the key lengths have been quite large. Next we present the discrete logarithm variants of VSH and the necessary definitions from [13].

Let $p$ be a prime, and denote by $\mathbb{F}_p$ the finite field of order $p$ and by $\mathbb{F}_{p^6}$ a sixth degree extension of $\mathbb{F}_p$. The order of the multiplicative group $\mathbb{F}_{p^6}^*$ is $p^6 - 1 = (p^2 - p + 1)(p^2 + p + 1)(p + 1)(p - 1)$. We denote by $G$ the unique subgroup of order $p^2 - p + 1$ in $\mathbb{F}_{p^6}^*$. In [12] it is claimed that this subgroup "contains" the hardness of the discrete logarithm for $\mathbb{F}_{p^6}^*$, as for the other subgroups there are sub-exponential algorithms that can be used to solve the problem.

The idea behind shortening the hash length of VSH-DL is to use finite field arithmetic instead of small prime numbers and then compress the result into $G$. Thus the hash length can be shortened. In [13] two methods are presented, namely XTR [14] and CEILIDH [20]. These offer compression/decompression methods which can be used to shorten the hash length. XTR would also allow very efficient finite field arithmetic, but was deemed unsuitable for the multiexponentiation used in VSH [13]. The hardness assumption is based on a modified version of the discrete logarithm problem (DLP).

Definition 3 (Discrete Logarithm Problem (DLP)). Let $H$ be a finite cyclic group of known prime order $p$ and let $h$ be a generator of $H$. The discrete logarithm problem in $H$ is the following: Given $g$ drawn uniformly at random from $H$, find the unique value $0 \leq t < p$ for which $h^t = g$.

Definition 4 ($k$-modified DLP). Let $H$ be a finite cyclic group and $h$ a generator of $H$. Let $K \leq H$ be a subgroup of $H$ with generator $g = h^{|H|/|K|}$. Furthermore, let $\Psi: H \to H$ be a mapping for which $\Psi(h_i)^{|H|/|K|} = (h_i)^{|H|/|K|}$ for all $h_i \in H$. The $k$-modified discrete logarithm problem for $(H, K, \Psi)$ is as follows: Given $h_i$ drawn uniformly at random from $H$, for $i = 1, 2, \ldots, k$, find a nonzero solution $(c_1, c_2, \ldots, c_k) \in \{0, 1, \ldots, p-1\}^k$ for
$$\prod_{i=1}^{k} \Psi(h_i)^{c_i \cdot |H|/|K|} = 1.$$

In [13] the authors show a reduction from the original DLP to the $k$-modified DLP and thus conclude that a VSH-DL working on the principles of the $k$-modified DLP is as secure as the original VSH-DL. The modified VSH-DL compression function is as follows (see [13, Algorithm 8]).
The modified VSH-DL compression function
1. Let $H$ be a finite cyclic group of known and factored order, and let $h$ be a generator of $H$. Let $K$ be a subgroup of $H$ with generator $g = h^{|H|/|K|}$. Let $p$ be a prime dividing $|K|$ but not $|H|/p$, and let $w = \lceil \log_2 p \rceil$. Let $C$ be an efficiently computable injection and $k \in \mathbb{Z}^+$ such that $(w-1)k < 2^k$. Furthermore, let $\Psi: H \to H$ be a mapping for which $\Psi(h_i)^{|H|/|K|} = (h_i)^{|H|/|K|}$ for all $h_i \in H$. For $i = 1, 2, \ldots, k$ draw $h_i$ uniformly at random from $H$.
2. Let $M = m_1 m_2 \cdots m_t$ be the message in binary, with $t \leq (w-1)k$, and set $x_0 = 1$.
3. Let $l = \lceil t/k \rceil$ be the number of message blocks in $M$. Pad the message by setting $m_i = 0$ for all $t < i \leq lk$.
4. Let $t = b_1 b_2 \cdots b_k$ be $t$ in binary. Let $m_{lk+i} = b_i$ for all $1 \leq i \leq k$.
5. For $j = 0, 1, \ldots, l$ calculate
$$x_{j+1} = x_j^2 \cdot \prod_{i=1}^{k} \Psi(h_i)^{m_{jk+i}}.$$
6. Return $C(x_{l+1}^{|H|/|K|})$ as the hash value of $M$.

Notice that the first step of the algorithm is precomputation that has to be done only once for a given implementation of VSH-DL. Steps 2–4 perform the message padding and the initialisation of the compression function. The actual compression happens in step 5, and in the final step the transformation $C$ is applied and the final hash value is returned. In [13] it is proposed that the modified VSH-DL be instantiated with $H = \mathbb{F}_{p^6}^*$ and $K = G$, and for the compression function $C$ the authors propose CEILIDH. It is also shown how to reduce the computation time by suitable choices of the finite field, and that one can sample the $\Psi(h_i)$ efficiently and that the image is large enough for all $h_i$. This then enables the construction of a hash function from this new compression function by the results of Merkle and Damgård.

The second proposal for modifying the VSH-DL hash function is to use elliptic curve arithmetic and thus shorten the hash value. Elliptic curves over fields $\mathbb{F}_p$, $p > 3$, can be presented in the short Weierstrass form $Y^2 = X^3 + a_4 X + a_6$, where one commonly chooses $a_4 = -3$ to obtain some performance benefits in the arithmetic. The set of all points $(X, Y)$ satisfying this equation forms a commutative group when addition is defined appropriately and a point at infinity is added to the set. This point is the identity element, and the negation of a point $(X, Y)$ on the elliptic curve is the point $(X, -Y)$. Presenting all the algorithms and background on elliptic curves and the arithmetic used in cryptography is well beyond the scope of this paper; most of the relevant details can be found, for example, in [10].
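For illustration, the modified VSH-DL compression function can be sketched with an invented toy instantiation: H = F_P^* for a small safe prime P, K the subgroup of order q = (P-1)/2, and both Psi and C taken as identity maps. This is emphatically not the F_{p^6}/CEILIDH instantiation proposed in [13], and the parameters offer no security whatsoever.

```python
import random

# Toy instantiation (NOT the choice of [13]): H = F_P^* for a safe prime P,
# K the order-q subgroup, Psi = identity, C = identity.
P = 2039                      # safe prime: |H| = P - 1 = 2 * 1019
q = 1019                      # |K| = q, so the cofactor |H|/|K| is 2
COF = (P - 1) // q
K_BITS = 8                    # block length k (toy value)

rng = random.Random(1)
H_ELTS = [rng.randrange(2, P - 1) for _ in range(K_BITS)]  # h_1 .. h_k

def vsh_dl(bits):
    """Toy modified VSH-DL compression of a list of message bits."""
    t, k = len(bits), K_BITS
    l = -(-t // k)                                  # ceil(t/k)
    m = bits + [0] * (l * k - t)
    m += [(t >> (k - i)) & 1 for i in range(1, k + 1)]
    x = 1                                           # x_0 = 1
    for j in range(l + 1):
        x = x * x % P                               # x_j^2
        for i in range(k):
            if m[j * k + i]:
                x = x * H_ELTS[i] % P               # Psi(h_i)^{m_{jk+i}}
    return pow(x, COF, P)                           # C(x_{l+1}^{|H|/|K|})
```

Because the output is a square modulo the safe prime P, it always lies in the order-q subgroup K, mirroring the compression into the subgroup in the real construction.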
The change from the original or modified VSH-DL could be made in a number of ways, and the major difference in the algorithm would be that instead of CEILIDH and finite field arithmetic, it would use elliptic curve arithmetic. Thus the elements $h_i$ in the algorithm would be changed to points $P_i$ on the elliptic curve, and the multiplication in the finite field would be changed to addition on the elliptic curve. Thus, the compression function would be
$$x_{j+1} = 2 \cdot x_j + \sum_{i=1}^{k} m_{jk+i} \cdot P_i.$$

The main benefit of applying finite fields or elliptic curves to VSH-DL would be a further decrease in the hash length. Also, the memory footprint of these new variants is quite small, as described in [13].
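A minimal sketch of this elliptic curve variant, using textbook affine point arithmetic over an invented toy curve (the curve, the base point, and all parameters are illustrative assumptions, not values from [13]; None stands for the point at infinity):

```python
# Toy short-Weierstrass curve y^2 = x^3 + a4*x + a6 over F_p.
p, a4, a6 = 97, 97 - 3, 3           # a4 = -3 mod p, as is common

def ec_add(P1, P2):
    """Affine point addition; None represents the point at infinity."""
    if P1 is None:
        return P2
    if P2 is None:
        return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                  # P + (-P) = O
    if P1 == P2:                     # doubling
        lam = (3 * x1 * x1 + a4) * pow(2 * y1, -1, p) % p
    else:                            # generic addition
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

# Points P_1..P_k as successive multiples of a base point G = (1, 1).
G = (1, 1)
K_BITS = 4
POINTS, acc = [], None
for _ in range(K_BITS):
    acc = ec_add(acc, G)
    POINTS.append(acc)

def vsh_dl_ec(bits):
    """Compression x_{j+1} = 2*x_j + sum_i m_{jk+i} * P_i on the toy curve."""
    t, k = len(bits), K_BITS
    l = -(-t // k)
    m = bits + [0] * (l * k - t)
    m += [(t >> (k - i)) & 1 for i in range(1, k + 1)]
    x = None                         # x_0 = O, the identity
    for j in range(l + 1):
        x = ec_add(x, x)             # doubling: 2 * x_j
        for i in range(k):
            if m[j * k + i]:
                x = ec_add(x, POINTS[i])
    return x
```

The doubling each round plays the role of the squaring in the multiplicative variants, and the point additions replace the multiplications by $\Psi(h_i)$.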
3 Preimage Attacks against Basic VSH
The basic VSH method has been found susceptible to preimage attacks by Saarinen [21] and Halunen et al. [7,6]. All these attacks are based on the following observation by Saarinen [21]. First, we define $\vee$ to be the bitwise OR and $\wedge$ the bitwise AND operation on binary messages of equal length.

Theorem 1. Let $x$, $y$ and $z$ be messages of equal length, with $x \wedge y = 0 = z$. Let $H$ be a VSH hash function. Then the following equivalence holds for VSH hash values:
$$H(x)H(y) \equiv H(x \vee y)\,H(z) \pmod{n}. \quad (1)$$
The equivalence can be seen from the fact that the assumptions on $x$ and $y$ yield equal values for the exponents of each prime $p_i$ on both sides of the equivalence. Thus the equivalence holds modulo $n$. Now Theorem 1 can be used to find preimages by transforming (1) into
$$H(x) \equiv H(y)^{-1} H(x \vee y)\,H(z) \pmod{n}. \quad (2)$$

In [21] the author uses (2) by tabulating the right-hand side of the equivalence and then searching for a match on the left-hand side. This is possible if one knows the message space sufficiently well, e.g., passwords of a certain length. Now, finding a preimage of $H(m)$ works as follows. We assume that $m = m_1 \vee m_2$ for two messages $m_1$ and $m_2$: the first bits of $m_1$ are set to the bits of the first half of $m$, and the last bits of $m_2$ are set to the last half of $m$. Both $m_1$ and $m_2$ are padded with zeroes in the parts that are not equal to $m$. It is easy to see that, replacing $x$ with $m_1$ and $y$ with $m_2$, (2) holds for $m_1$ and $m_2$. Now in (2) we let $y$ run through all the possible values of $m_2$ and tabulate the values of the right-hand side of (2) together with the values of $y$. It should be noted that in these calculations $H(m) = H(x \vee y)$ is given, so the right-hand side values can be computed. After this, we let $x$ run through all the possible values of $m_1$, and when we find a match in the tabulated values, we know that $m_1 = x$ and that $m_2$ equals the tabulated value of $y$ corresponding to the match. Thus we obtain the preimage of $H(m)$ from the equation $m = m_1 \vee m_2 = x \vee y$.

Halunen et al. have shown in [7] that this method can be used with realistic VSH hash lengths of 1024 and 2048 bits, and that it also works against the cubing variants of these. In [6] the authors show that (2) can be transformed in such a way that the tables can be reused and the search made more efficient, if the computationally intensive calculation of the multiplicative inverses is done in the tabulation phase. The reusability of the precomputed tables is very effective in cases where multiple preimages for the same implementation of VSH and in the same message space need to be found.
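The tabulation attack described above can be demonstrated end to end on a toy VSH instance (all parameters invented, 8-bit "passwords" for brevity):

```python
# Demo of the table-based attack via equivalence (2) on a toy VSH.
PRIMES = [2, 3, 5, 7, 11, 13]
N = 211 * 199        # toy modulus; the six primes multiply to < N
K = 6                # block length

def vsh(bits):
    """Toy basic VSH of a list of message bits."""
    t = len(bits)
    l = -(-t // K)
    m = bits + [0] * (l * K - t) + [(t >> (K - i)) & 1 for i in range(1, K + 1)]
    x = 1
    for j in range(l + 1):
        x = x * x % N
        for i in range(K):
            if m[j * K + i]:
                x = x * PRIMES[i] % N
    return x

def find_preimage(target, nbits=8):
    """Meet-in-the-middle preimage search based on equivalence (2)."""
    half = nbits // 2
    z = vsh([0] * nbits)
    # Tabulate H(y)^{-1} * H(m) * H(z) mod n over the low-half candidates y.
    table = {}
    for low in range(2 ** half):
        y = [0] * half + [(low >> (half - 1 - i)) & 1 for i in range(half)]
        rhs = pow(vsh(y), -1, N) * target % N * z % N
        table.setdefault(rhs, y)
    # Search the high-half candidates x for a match H(x) = tabulated value.
    for hi in range(2 ** half):
        x = [(hi >> (half - 1 - i)) & 1 for i in range(half)] + [0] * half
        if vsh(x) in table:
            return [a | b for a, b in zip(x, table[vsh(x)])]
    return None

secret = [1, 0, 1, 1, 0, 0, 1, 0]
recovered = find_preimage(vsh(secret))
```

Note that with such a small modulus a match may occasionally be a second preimage rather than the original message, which is why one should only check that the hash values agree (modular inversion via `pow(a, -1, m)` requires Python 3.8+).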
4 Generalisations of the Preimage Attacks
In this section, we show that equations similar to (1) (and thus also to (2)) hold for VSH-DL and its variants in finite fields and elliptic curves. Furthermore, we improve on the results of Saarinen by generalising (1). This generalisation has applications in cases where some of the preimage is already known to the attacker.

First of all, it is straightforward to see that (1) holds for the original VSH-DL proposal. This is because the compression function works in the same way as with regular VSH, and thus the primes have the same exponents on both sides of the equivalence. Therefore, all the results on preimage finding presented in the previous section can be directly applied to the original VSH-DL.

The first interesting case is the finite field variant of the VSH-DL compression function based on CEILIDH and the field $\mathbb{F}_{p^6}$. In this case, we make the following observations. Recall that $e_i = \sum_{j=0}^{l-1} m_{j\cdot k+i}\, 2^{l-j-1}$. First we note that, because CEILIDH is an efficient compression and decompression algorithm, we can omit the function $C$ from our considerations by decompressing the final result. Thus we are left with the value $x_{l+1}^{|H|/|K|}$. Now, by the modified VSH-DL algorithm we have
$$x_{l+1}^{|H|/|K|} = \Big(x_l^2 \cdot \prod_{i=1}^{k} \Psi(h_i)^{m_{lk+i}}\Big)^{|H|/|K|}. \quad (3)$$
Furthermore, as $\Psi(h_i)^{|H|/|K|} = h_i^{|H|/|K|}$, we get
$$x_{l+1}^{|H|/|K|} = x_l^{2\,|H|/|K|} \cdot \prod_{i=1}^{k} h_i^{m_{lk+i}\cdot|H|/|K|}. \quad (4)$$
We apply the same reasoning to $x_l^{|H|/|K|}$ and continue all the way down to $x_0$ to obtain
$$x_{l+1}^{|H|/|K|} = \prod_{i=1}^{k} h_i^{e_i\cdot|H|/|K|}. \quad (5)$$
Next we prove a theorem generalising (1) to the modified VSH-DL compression function.

Theorem 2. Let $x$, $y$ and $z$ be messages of equal length, with $x \wedge y = 0 = z$. Let $H$ be a modified VSH-DL compression function. Then the following equation holds:
$$H(x)H(y) = H(x \vee y)H(z). \quad (6)$$

Proof. Let $l$ be the common block length of $x$, $y$ and $z$. Now $H(x) = C(x_{l+1}^{|H|/|K|})$, and by the considerations above and (5) we have $x_{l+1}^{|H|/|K|} = \prod_{i=1}^{k} h_i^{u_i\cdot|H|/|K|}$, where $u_i$ is the exponent $e_i$ for the message $x$. Similarly, $y_{l+1}^{|H|/|K|} = \prod_{i=1}^{k} h_i^{v_i\cdot|H|/|K|}$ and $(x \vee y)_{l+1}^{|H|/|K|} = \prod_{i=1}^{k} h_i^{w_i\cdot|H|/|K|}$, where $v_i$ and $w_i$ are the respective $e_i$ exponents for the messages $y$ and $x \vee y$. Because $x \wedge y = 0$, we have $u_i + v_i - m_{lk+i} = w_i$, where the $m_{lk+i}$ are the bits of the binary representation of the message length that the VSH-DL algorithm appends. Now $z_{l+1}^{|H|/|K|} = (\prod_{i=1}^{k} h_i^{m_{lk+i}})^{|H|/|K|}$, and thus $H(x \vee y)H(z)$ and $H(x)H(y)$ have the same exponents for all $h_i$ before the final compression function $C$ is applied. Therefore (6) holds.

The second case concerns the elliptic curve variant of VSH-DL. This case is quite similar, except that the exponents are now multipliers for the points on the elliptic curve; we replace multiplication with addition and exponentiation with multiplication. It is easy to see that (6) holds also in this case: the exponents are only changed to multipliers, and exactly the same arguments can be applied when the messages $x$ and $y$ satisfy the conditions of Theorem 2. These results make it possible to apply the preimage attacks from [21,7,6] exactly as in the original papers. This implies that even though these new variants retain the collision resistance of the original VSH, they do not overcome its weakness against preimage attacks.

Next we present a theorem generalising (1) to multiple messages.

Theorem 3. Let $k \in \mathbb{N}^+$ and let $x_1, x_2, \ldots, x_k, z$ be messages of equal length with $x_i \wedge x_j = 0 = z$ for all $i, j \in \{1, 2, \ldots, k\}$, $i \neq j$. Furthermore, let $H$ be a VSH hash function. Then the following equivalence holds:
$$H(x_1)H(x_2) \cdots H(x_k) \equiv H(x_1 \vee x_2 \vee \cdots \vee x_k)\,H(z)^{k-1} \pmod{n}. \quad (7)$$

Proof. The equivalence (7) can be proved by induction on $k$, using (1) as the base step. Assume that we have $k$ messages $x_i$, $i \in \{1, 2, \ldots, k\}$, and a message $z$ satisfying $x_i \wedge x_j = 0 = z$ for all $i, j \in \{1, 2, \ldots, k\}$, $i \neq j$. Thus,
$$H(x_1)H(x_2) \cdots H(x_{k-1})H(x_k) \equiv H(x_1 \vee x_2 \vee \cdots \vee x_{k-1})\,H(z)^{k-2}H(x_k)$$
$$\equiv H(x_1 \vee x_2 \vee \cdots \vee x_{k-1})H(x_k)\,H(z)^{k-2} \equiv H(x_1 \vee x_2 \vee \cdots \vee x_k)\,H(z)^{k-1} \pmod{n}.$$
The final equivalence holds because $x_k \wedge (x_1 \vee x_2 \vee \cdots \vee x_{k-1}) = z$ by the assumptions on the $x_i$.

Due to Theorem 2, it is evident that Theorem 3 also holds for the VSH-DL compression function and its variants in finite fields and on elliptic curves, with the equivalence changed to equality in the finite field or on the elliptic curve. Furthermore, Theorem 3 enables an attacker to fix some known parts of the message, e.g., some characters in a password or some header information in IP packets, and mount the preimage attack against the unknown parts only. Because the unknown parts will contain many zero bits due to the assumptions of Theorem 3, the computation should also be faster, as fewer arithmetic operations need to be performed when computing the variable hash values. Other hash functions usually gain very little benefit from zero bits, but here the benefit can be quite significant.
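Theorem 3 is easy to check numerically, here for $k = 3$ pairwise disjoint messages on a toy VSH instance (invented small parameters, illustration only):

```python
# Numerically check equivalence (7) for k = 3 disjoint messages.
PRIMES = [2, 3, 5, 7, 11, 13]
N = 211 * 199
K = 6

def vsh(bits):
    """Toy basic VSH of a list of message bits."""
    t = len(bits)
    l = -(-t // K)
    m = bits + [0] * (l * K - t) + [(t >> (K - i)) & 1 for i in range(1, K + 1)]
    x = 1
    for j in range(l + 1):
        x = x * x % N
        for i in range(K):
            if m[j * K + i]:
                x = x * PRIMES[i] % N
    return x

# Three pairwise disjoint 9-bit messages and the zero message z.
x1 = [1, 1, 0, 0, 0, 0, 0, 0, 0]
x2 = [0, 0, 0, 1, 1, 0, 0, 0, 0]
x3 = [0, 0, 0, 0, 0, 0, 1, 0, 1]
z = [0] * 9
union = [a | b | c for a, b, c in zip(x1, x2, x3)]

lhs = vsh(x1) * vsh(x2) % N * vsh(x3) % N
rhs = vsh(union) * pow(vsh(z), 2, N) % N     # H(x1 v x2 v x3) * H(z)^{k-1}
assert lhs == rhs
```

The check succeeds because each message bit is counted exactly once on both sides, while the length-encoding block is counted three times on the left and $1 + (k-1) = 3$ times on the right.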
Due to Theorem 2, it is evident that Theorem 3 also holds for VSH-DL compression function and its variants on elliptic curves and finite fields, with the equivalence changed to equality in the finite field or the elliptic curve. Furthermore, Theorem 3 enables an attacker to fix some known parts of the message, e.g., some characters in a password or some header information in IP packets, and mount the preimage attack against the unknown parts only. Because the unknown parts will contain many zero bits due to the assumptions of Theorem 3, the computation should also be faster. This is because there will be less arithmetic operations to be performed during the computation of the variable hash values. Other hash functions usually gain very little benefit from zero bits, but here the benefit could be quite significant. With the help of Theorem 3, we can now form a general algorithm for finding a preimage of a message of known length with possibly (but not necessarily) some parts of the preimage that are known to the attacker and that is hashed with VSH, VSH-DL, VSH-DL in a finite field or VSH-DL in an elliptic curve. A general preimage f inding algorithm for VSH and VSH-DL variants 1. Let H(m) be hash value for a VSH or VSH-DL variant H. Let m be a (partially) unknown message of known length of n-bits. Let t ∈ N be the number of distinct known parts of m and z a zero vector of length n. 2. Now divide m into m = x0 y1 x1 y2 · · · xt−1 yt xt , where each yi , i ∈ {1, 2, . . . , t} is a known part of m and each xj , j ∈ {0, 1, . . . , t} is an unknown part of m. 3. Set yi to be a message of length n by setting all other parts of m except yi to zeroes for all i ∈ {1, 2, . . . , t}. Similarly, form message variables xj (of length n) from the unknown parts xj for all j ∈ {0, 1, . . . , t}. 4. Divide the possible preimage space into two parts by forming the equation t i=1
H(yi )
2t
·
j=0
H(xj ) = H(z)2t H(m)
t
H(xj )−1
(8)
j= 2t +1
with the help of Theorem 3. 5. Compute and tabulate the values of the left hand side of (8) for all possible values of all the message variables xj on the left hand side. 6. Compute the right hand side of (8) for all possible values of all the message variables xj on the right hand side until a match between the tabulated values is found. 7. When a match is found between the table and the searched values, m = x0 ∨ y1 ∨ x1 ∨ · · · ∨ xt−1 ∨ yt ∨ xt for the values of xj , which yield the matching between the table and the search. 8. Return the preimage m of H(m). Remark 1. It should be noted that the above algorithm produces reusable tables only if the fixed parts of the message remain the same within the set of preimages.
Also, one can move the variables $H(x_j)$ from one side of (8) to the other and trade memory for time: storing a greater part of the preimage space in the tabulation phase makes the search faster, or vice versa for faster precomputation and less memory usage.

The number of basic arithmetic operations in the preimage finding methods for VSH and the different variants of VSH-DL is almost the same for all methods. For the original VSH and VSH-DL, the actual computation of the hash values requires $k$ (the block length) multiplications, one modular squaring and a single modular multiplication for each message block. This is due to the fact that the primes used are selected in such a way that no reduction is needed for the product of the primes, even if all of them are multiplied together. Unfortunately, such benefits are not present in the finite field and elliptic curve variants of VSH-DL, as the arithmetic is no longer simple multiplication and modular reduction, but arithmetic either in the field or on the elliptic curve. Thus for these hash values one requires one squaring or doubling and $k + 1$ multiplications or additions for each message block. In the case of finite fields there is also the final computation with the injection, but as this happens only once for each message (not each message block), the added computational cost is negligible.

If we consider $n$-bit messages as preimages for the preimage finding methods presented in [21,7,6] and in this paper, we can see the following. In the most general case, where the attacker does not have any information about the preimage, there are two constant messages ($z$ and $x \vee y$) for which the hash values need to be computed. If we want reusable tables, $x \vee y$ should not be present in the tabulation. In the tabulation phase, we need to compute $2^{n/2}$ hash values and save these in the table. In the searching phase, the worst case is that we need to go through all $2^{n/2}$ values before a match is found. Thus the overall complexity for finding a preimage is $O(2^{n/2+1})$ when we take into account the possibility of an odd $n$. A normal brute force attack for finding a preimage of an $n$-bit message has complexity $O(2^n)$.
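As a sanity check, equation (8) can be verified numerically for $t = 1$ known part on a toy VSH instance (invented parameters; the split $m = x_0 y_1 x_1$ and the part boundaries are chosen arbitrarily for illustration):

```python
# Check equation (8) for t = 1 known part: m = x0 y1 x1 with y1 known.
PRIMES = [2, 3, 5, 7, 11, 13]
N = 211 * 199
K = 6

def vsh(bits):
    """Toy basic VSH of a list of message bits."""
    t = len(bits)
    l = -(-t // K)
    m = bits + [0] * (l * K - t) + [(t >> (K - i)) & 1 for i in range(1, K + 1)]
    x = 1
    for j in range(l + 1):
        x = x * x % N
        for i in range(K):
            if m[j * K + i]:
                x = x * PRIMES[i] % N
    return x

msg = [1, 0, 1, 1, 0, 1, 1, 0, 0]   # secret 9-bit message
x0 = msg[:3] + [0] * 6              # unknown part, zero-extended
y1 = [0] * 3 + msg[3:6] + [0] * 3   # known part, zero-extended
x1 = [0] * 6 + msg[6:]              # unknown part, zero-extended
z = [0] * 9

# (8) with t = 1:  H(y1) H(x0) = H(z)^2 H(m) H(x1)^{-1}  (mod n)
lhs = vsh(y1) * vsh(x0) % N
rhs = pow(vsh(z), 2, N) * vsh(msg) % N * pow(vsh(x1), -1, N) % N
assert lhs == rhs
```

In a real attack, the left-hand side would be tabulated over the candidates for $x_0$ and the right-hand side searched over the candidates for $x_1$, exactly as in steps 5–6 of the algorithm.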
5 Practical Results
We programmed the VSH-DL finite field and elliptic curve variants using Python [17] and Sage [22] for support in finite field and elliptic curve arithmetic. The results show that we can use the preimage finding methods described above on the VSH-DL variants both in finite fields and on elliptic curves. We considered 8-character passwords over an alphabet consisting of the lowercase letters a to z and the digits 0 to 9. We used a naive approach of going through the possible words in alphabetical order, i.e., from aaaaaaaa to 99999999. The results confirmed that the theoretical tools from the previous section can be applied as preimage finding methods on the VSH-DL variants. For our implementations, we used a block length of 128 bits, as it was easy to implement and reasonably close to the security parameter (131 bits) presented in [3]. We did not use cryptographically secure values for the parameters of the elliptic curves and finite fields, i.e., the primes were not large enough
Table 1. Results for VSH-DL variants for fifty passwords

Variant         Precomputation  Total time  Average time  Fastest time  Slowest time
Finite Field    0.79 h          25.00 h     0.48 h        12 s          0.93 h
Elliptic Curve  0.93 h          24.85 h     0.48 h        11 s          0.92 h
for these functions to be cryptographically secure. For the finite field VSH-DL, we used a prime p of order 105 and for the elliptic curve we used p of order 1020 . These choices provided us with fields and curves which had enough elements/points that collisions before the right value of the password would not be formed. If the fields or curves are too small or the elements/points are chosen in a poor way, the algorithm fails to produce correct results, i.e. collisions between the tabulated values and the search phase values are formed before the correct preimage is found. For the elliptic curves, we chose random points on the curve, but for the finite field we chose the points a + i for i = 0, 1, 2, . . . 127, where a was a generator for the multiplicative group of the field. This is strictly not in accordance with the method proposed in [13]. However, this did not affect the outcome. We also left out the compression with CEILIDH and the exponentiation with |H|/|K| as these would have added to the computational time and would not have affected the results in any way. We applied the first method from [7] to find the preimages of passwords secured with VSH-DL finite field and elliptic curve variants. The tests were conducted for a set of 50 randomly generated passwords from our alphabet on a 12 core 2.6 GHz Intel Xeon with 16 GB of memory. The memory consumption was fairly small with the finite field variant using approximately 2 GB and the elliptic curve variant using slightly less than 1 GB of memory. Our program did not utilise any parallelisation, so it ran on only one of the 12 cores. The results for the computation times are presented in Table 1. It can be seen from the table that there is very little difference in the performance of the preimage finding method on these two VSH-DL variants. The main difference was on the precomputation, which was faster for the finite field variant. 
On the other hand, for the whole set of 50 passwords the finite field variant took somewhat longer to find the preimages. The average time for finding a preimage was the same within our measurement accuracy. We computed the average from the actual search times, i.e. the precomputation was first subtracted from the total time.
6
Discussion
Although our results clearly show that the preimage finding methods for VSH can be easily generalised to work on different VSH-DL variants, some difficulties make our practical results rather inconclusive. When we tested our methods in a test setup similar to that of [6], we found that the memory
Preimage Attacks against Variants of Very Smooth Hash
263
consumption of these methods was quite significant even with the modest parameter values we used. In fact, with our proof-of-concept program, implemented in Python and Sage, our computers ran out of memory and had to resort to swapping to hard drives. This made the program very slow, and we were unable to test the VSH-DL variants with larger alphabets as in [7] and [6], which makes our results incomparable with these previous studies. It would be interesting to take highly optimised implementations of VSH and all the VSH-DL variants and run our tests on these. The code that the original authors of [3] and [13] used for their tests would be a very good starting point; unfortunately, this code was not available for our research. Another option would be to run these tests on better hardware, where memory would be abundant.

Our theoretical findings for VSH give rise to some interesting observations. First of all, the result of Theorem 3 can be applied in scenarios where parts of the message are known. One can fix the known parts and mount the preimage attack on the unknown parts by dividing them into two sets. Furthermore, this means that most of the message bits in the messages to be hashed are zeroes, and thus the computation of hash values should be faster. This of course means that the preimage is also found faster. With other, more traditional, hash functions this is usually not the case, as there is very little performance benefit to be gained from abundant zeroes in the messages.

The results concerning the security of VSH-DL and its variants against the preimage attacks are not surprising. The authors of [13] make no claims about the preimage resistance of these new proposals. However, the implementations and practical results presented in [13] show that these variants are much slower than the original VSH and thus very impractical to use.
This also means that even though the attacks proposed earlier can be used against these variants of VSH, the time for finding a preimage would be significantly longer than for the original VSH. In [6] the authors state that for the original VSH, the computationally most expensive part of the preimage finding method is computing the inverses of the hash values, which are needed in the method. Thus they are able to improve the total time for multiple passwords by performing the computationally intensive part in the tabulation phase, which makes the subsequent searches faster. For the VSH-DL variants in finite fields and elliptic curves these measures are not necessarily needed. If one knows the representation of the multiplicative group of a finite field Fq via a generator g, one can easily compute the inverses of elements: for an element t = g^l, the element g^(q−1−l) gives t^(−1). However, as large fields are not usually represented in this manner in computer memory, the Euclidean algorithm in polynomial rings is then needed for the computation of t^(−1). In Sage, for example, only fields of cardinality less than 2^16 are represented in a form that facilitates the easy computation of inverses; for larger fields, more computationally intensive methods are applied. With elliptic curves, the inverses are easily obtained (for a point P = (X, Y) we have −P = (X, −Y)) and are free in comparison with the actual computation of the sums and multiples of points. Thus, the attack is actually more effective for the elliptic curve variant
K. Halunen and J. Röning
of VSH-DL, as there is no computational drawback to using the inverses of points in the method. One further interesting point is that the choice of the points of the finite field is very important. For example, if one chooses the 128 different points of the field to be just the first 128 powers of the generator a, it is very likely that the computations will lead to a false preimage. In our very first preliminary tests, where the fields were small, this seemed to happen quite often with the worst-case password. Thus it is very important to choose the points randomly from the field.

It is worth noticing that the complexity of the preimage finding attacks is tied much more to the entropy of the message space than to the bit length of the hash values. Thus the reduced bit length of the different VSH-DL variants does not automatically make it easier to find preimages. However, our results show that these functions are also not immune to the weaknesses of the original VSH, as can be seen from the theorems and the practical results. For the original VSH hash function there is an improvement by Bellare and Ristov [1]. This is only a speed improvement and does not affect the security of VSH in any way. This improvement has not been applied to the different variants of VSH-DL, and it could give some performance benefit for these hash functions too.
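The two ways of computing inverses discussed above (the generator representation of Fq* versus point negation on an elliptic curve) can be sketched in a few lines of Python. This is our own illustration, not the authors' code; the prime q = 101 and generator g = 2 are small illustrative parameters, far below cryptographic size.

```python
q = 101                       # a small prime field F_q (illustrative only)
g = 2                         # 2 generates F_q^* for q = 101

l = 37
t = pow(g, l, q)              # t = g^l
t_inv = pow(g, q - 1 - l, q)  # when the dlog l is known, g^(q-1-l) gives t^(-1)
assert (t * t_inv) % q == 1

# Without the generator representation, the extended Euclidean algorithm is
# the fallback (Python's 3-argument pow with exponent -1 uses it internally):
assert t_inv == pow(t, -1, q)

def ec_neg(P):
    # For an affine elliptic-curve point P = (X, Y) the inverse is free:
    X, Y = P
    return (X, (-Y) % q)
```

This is why the attack favours the elliptic curve variant: `ec_neg` costs a single negation, whereas the field inverse needs either a known discrete log or a Euclidean-algorithm computation.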
7
Conclusion
In this paper, we have generalised some previous preimage attacks on VSH to VSH-DL and its implementations in finite fields and on elliptic curves. Furthermore, we have generalised the theoretical result of Saarinen, which is behind all these attacks. This generalisation relates multiple messages and their respective hash values in a similar manner to Saarinen's original result, and it can be applied in some scenarios to make finding preimages even faster when parts of the preimage are known to the attacker. The results of this paper and previous research show that VSH needs some modification to be used as a replacement for a random oracle in cryptographic protocols. It is a very interesting proposal, and the simple form of the VSH hash algorithm makes the design very accessible. Furthermore, it has led to some previously uncharted number-theoretic problems, as presented in [2]. Thus the design is still an interesting research topic even though it has lost some of its appeal as a hash function and is not a participant in the SHA-3 competition. We are confident that it is possible to improve on the design or to use some other mathematical problem to construct practical hash functions that have rigorous proofs of security.
References
1. Bellare, M., Ristov, T.: Hash functions from sigma protocols and improvements to VSH. In: Pieprzyk, J. (ed.) ASIACRYPT 2008. LNCS, vol. 5350, pp. 125–142. Springer, Heidelberg (2008)
2. Blake, I.F., Shparlinski, I.E.: Statistical distribution and collisions of the VSH. Journal of Mathematical Cryptology 1(4), 329–349 (2007)
3. Contini, S., Lenstra, A.K., Steinfeld, R.: VSH, an efficient and provable collision-resistant hash function. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 165–182. Springer, Heidelberg (2006)
4. Damgård, I.B.: A design principle for hash functions. In: Brassard, G. (ed.) CRYPTO 1989. LNCS, vol. 435, pp. 416–427. Springer, Heidelberg (1990)
5. De Cannière, C., Rechberger, C.: Preimages for reduced SHA-0 and SHA-1. In: Wagner, D. (ed.) CRYPTO 2008. LNCS, vol. 5157, pp. 179–202. Springer, Heidelberg (2008)
6. Halunen, K., Rikula, P., Röning, J.: Finding preimages of multiple passwords secured with VSH. In: International Conference on Availability, Reliability and Security, ARES 2009, pp. 499–503 (March 2009)
7. Halunen, K., Rikula, P., Röning, J.: On the security of VSH in password schemes. In: ARES, pp. 828–833. IEEE Computer Society, Los Alamitos (2008)
8. Joux, A.: Multicollisions in iterated hash functions. Application to cascaded constructions. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 306–316. Springer, Heidelberg (2004)
9. Kelsey, J., Kohno, T.: Herding hash functions and the Nostradamus attack. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 183–200. Springer, Heidelberg (2006)
10. Koblitz, N.: Algebraic Aspects of Cryptography. Algorithms and Computation in Mathematics, vol. 3. Springer, Berlin (1998)
11. Lenstra, A.K., Lenstra Jr., H.W., Manasse, M.S., Pollard, J.M.: The number field sieve. In: STOC 1990: Proceedings of the Twenty-Second Annual ACM Symposium on Theory of Computing, pp. 564–572. ACM Press, New York (1990)
12. Lenstra, A.K.: Using cyclotomic polynomials to construct efficient discrete logarithm cryptosystems over finite fields. In: Mu, Y., Pieprzyk, J.P., Varadharajan, V. (eds.) ACISP 1997. LNCS, vol. 1270, pp. 127–138. Springer, Heidelberg (1997)
13.
Lenstra, A.K., Page, D., Stam, M.: Discrete logarithm variants of VSH. In: Nguyên, P.Q. (ed.) VIETCRYPT 2006. LNCS, vol. 4341, pp. 229–242. Springer, Heidelberg (2006)
14. Lenstra, A.K., Verheul, E.R.: The XTR public key system. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880, pp. 1–19. Springer, Heidelberg (2000)
15. Merkle, R.C.: A certified digital signature. In: Brassard, G. (ed.) CRYPTO 1989. LNCS, vol. 435, pp. 218–238. Springer, Heidelberg (1990)
16. Preneel, B.: The state of hash functions and the NIST SHA-3 competition. In: Yung, M., Liu, P., Lin, D. (eds.) Inscrypt 2008. LNCS, vol. 5487, pp. 1–11. Springer, Heidelberg (2009)
17. Python Software Foundation: Python programming language (2007), http://python.org/
18. Rivest, R.L., Shamir, A., Adleman, L.M.: A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM 21(2), 120–126 (1978); see also U.S. Patent 4,405,829
19. Rogaway, P., Shrimpton, T.: Cryptographic hash-function basics: Definitions, implications, and separations for preimage resistance, second-preimage resistance, and collision resistance. In: Roy, B., Meier, W. (eds.) FSE 2004. LNCS, vol. 3017, pp. 371–388. Springer, Heidelberg (2004)
20. Rubin, K., Silverberg, A.: Torus-based cryptography. In: Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 349–365. Springer, Heidelberg (2003)
21. Saarinen, M.J.O.: Security of VSH in the real world. In: Barua, R., Lange, T. (eds.) INDOCRYPT 2006. LNCS, vol. 4329, pp. 95–103. Springer, Heidelberg (2006)
22. Stein, W., et al.: Sage Mathematics Software, Version 4.3.5 (2010), http://www.sagemath.org
23. Stevens, M., Lenstra, A.K., de Weger, B.: Chosen-prefix collisions for MD5 and colliding X.509 certificates for different identities. In: Naor, M. (ed.) EUROCRYPT 2007. LNCS, vol. 4515, pp. 1–22. Springer, Heidelberg (2007)
24. Wang, X., Yin, Y.L., Yu, H.: Finding collisions in the full SHA-1. In: Shoup, V. (ed.) CRYPTO 2005. LNCS, vol. 3621, pp. 17–36. Springer, Heidelberg (2005)
25. Wang, X., Yu, H.: How to break MD5 and other hash functions. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 19–35. Springer, Heidelberg (2005)
Matrix Representation of Conditions for the Collision Attack of SHA-1 and Its Application to the Message Modification
Jun Yajima and Takeshi Shimoyama
Fujitsu Limited, 4-1-1, Kamikodanaka, Nakahara-ku, Kawasaki, 211-8588, Japan
{jyajima,shimo-shimo}@jp.fujitsu.com
Abstract. In this paper, we propose a matrix representation of Chaining Variable Conditions (CVCs) and Message Conditions (MCs) for the collision attack on the hash function SHA-1. We then apply it to an algorithm for constructing the Message Modification procedure in order to reduce the complexity of the collision attack on SHA-1. Message Modification is one of the most important techniques for searching for a collision of SHA-1. Our approach clarifies how to effectively construct a "good" message modification procedure. Using our algorithm, we carried out some experiments on Message Modification for attacking SHA-1 based on the Disturbance Vector and the Differential Path presented by Wang et al., and we show the results of constructing some Message Modifications applicable to CVCs up to step 26¹.
1
Introduction
SHA-1 is a hash function which has been widely used since it was issued by NIST as a FIPS in 1995 [1]. Recently, many researchers have discussed the collision attack against SHA-1. In 2005, Wang et al. succeeded in attacking the full 80-step SHA-1 with 2^69 complexity [2]. They adopted the multi-block collision technique introduced by [3,4], and adjusted the differential path that was known by then in the first round (step 1 to step 16) to another possible differential path. In the attack, they used local collisions. A local collision (LC for short) is a collision within a 6-step differential path, as introduced by Chabaud and Joux [5]. In order to find an appropriate combination of LCs for SHA-1, the disturbance vector (DV, introduced in [5]) is used. After that, the attack complexity was reduced to 2^61–2^62 by improving the message modification techniques in [6,7]. Roughly speaking, the collision attack [2] consists of the following two phases: the Preparing Phase and the Searching Collision Phase.

Preparing Phase
Stage 1: Select a Disturbance Vector (DV) and obtain the message differentials ΔM = M′ − M. Derive the differential path (ΔA), sufficient conditions¹
¹ In this paper, the Non-linear path on Round 1 of SHA-1 is not considered.
I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 267–284, 2010. c Springer-Verlag Berlin Heidelberg 2010
268
J. Yajima and T. Shimoyama
(Chaining Variable Conditions: CVCs), and the sufficient conditions on message variables (Message Conditions: MCs) for the linear part (from step 17 to step 80).
Stage 2: Determine the message modifications (MM) used in Stage 4 so that M satisfies all CVCs and all MCs derived in Stage 1 efficiently, and evaluate the complexity of the collision search.
Stage 3: Locate differential paths, which are the differences between two sequences of chaining variables in the compression function, and derive the CVCs and MCs for the non-linear part (steps 1–16) for H(M) = H(M′).

Searching Collision Phase
Stage 4: Search for a message M which satisfies all CVCs and MCs.

The complexity of the above collision search is evaluated by the number of CVCs located in the differential path that are not satisfied after the execution of MM. In recent research, Stages 1–4 of the above attack procedure have been improved. On Stage 1, Wang's method was proposed in [2], and some related works are described in [3,8,9,10,11]. On Stage 2, the applicable steps of MM were extended from step 21 to step 25 [7]; this improvement reduced the complexity of the collision search of SHA-1 from 2^69 to 2^61–2^62. The techniques of [12,13,14] also help reduce the complexity of the collision search of SHA-1. On Stage 3, some algorithms for constructing differential paths were discussed in [15,16,17]. On Stage 4, De Cannière et al. found a colliding message pair for reduced SHA-1 with 70 steps [18]. In the phase of preparing the collision attack, the following requirements are important:
(i) Smaller #{total CVCs in steps 17–80}
(ii) Many #{satisfiable CVCs by MM in steps 17–80}
(iii) Can find a differential path
(iv) A lot of message freedom for the collision search
In this paper, we focus on property (ii), related to Stage 2 of the Preparing Phase. Among the above requirements, especially (ii), (iii) and (iv), there are properties that have a trade-off relationship with each other. More concretely, applying many MMs will cause a large number of CVCs and MCs. This increase of MCs may decrease both the message freedom and the probability of deriving the differential path. Similarly, the increase of CVCs also decreases the message freedom and the probability of deriving the differential path, because MM is applied recursively. There is still room to consider which of Stage 2 or Stage 3 should be executed first. In this paper, our strategy gives Stage 2 priority over Stage 3. The reasons for our strategy are as follows:
– CVCs and MCs are appended in both Stage 2 (evaluation of message modification) and Stage 3 (deriving the differential path).
– The evaluation of MM is the most effective way to reduce the total complexity of the collision search and to estimate that complexity.
Matrix Representation of Conditions for the Collision Attack
269
– It seems that there are many more candidate solutions in Stage 3 than in Stage 2.
– There may be cases where some CVCs that are modifiable in Stage 2 can no longer be modified once the Ex-CVCs of Stage 3 are added before executing Stage 2.

For these reasons, Stage 2 precedes Stage 3: MM is evaluated and the complexity is reduced, and then the differential path is derived. In order to derive a differential path of the non-linear part, an efficient algorithm is described in [15], for example. Therefore, Stage 3 is outside the scope of this paper, although an improvement in that stage seems to be an important factor. From this consideration, the following assumption is reasonable:
– On the message freedom, all of m0–m15 are free conditions at the beginning of Stage 2.

Because Stage 2 precedes Stage 3, the message freedom is still available, so the above assumption can be made. Related to Stage 2, the following two problems should be considered:
A. How can we construct a concrete algorithm for MM?
B. How many CVCs can be satisfied efficiently by applying the MM technique?

For problem A, Naito et al. proposed an algorithm of MM applicable to the CVCs up to step 25 of SHA-0 [13], and Joux et al. proposed an MM method using LCs [19]. On the other hand, problem B has remained unanalyzed. Because the previous techniques search for MM methods heuristically, it is difficult to search all the possible cases and find the best MM. Therefore it is quite important to construct an algorithm that solves problem B automatically. In this paper, we propose a new approach to constructing an efficient search algorithm for MM patterns that reduces the complexity of the collision search of SHA-1, using a matrix representation over GF(2) and linear algebra. Our search method uses some templates and linear algebra, and its advantage is that a clear answer is obtained: "yes" if there is a solution and "no" if there is not. We do not have to waste computing power searching without knowing whether or not a solution can be found.
2 Summary of the Previous Results

2.1 Abstract of the Collision Search Algorithm of SHA-1
In Stage 4 of the collision search mentioned in Section 1, we derive a colliding hash value after executing the compression function over two blocks. As shown in Figure 1, Stage 4 (the Collision Search Stage) of the collision attack strategy described in Section 1 tries to derive a pair of messages whose hash values after two blocks are the same. The process in each block of this attack is as follows.
Phase of Collision Search Shown by Wang
1. Make a message which satisfies all the MCs.
2. Correct some bits of the message so that it satisfies as many CVCs as possible by using the Message Modification technique.
3. To meet all remaining CVCs, repeat the procedure while changing the messages appropriately.
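The two-block structure behind this search can be illustrated with a toy linear compression function (our own stand-in, emphatically not SHA-1's): a difference injected into the first message block is compensated in the second block, so the chaining values after two blocks coincide.

```python
def compress(h, m):
    # Stand-in compression function for illustration only -- NOT SHA-1.
    # Its linearity is what makes the cancellation below easy to arrange.
    return (31 * h + m) % 2**32

def hash2(iv, blocks):
    # Merkle-Damgard style chaining over the message blocks.
    h = iv
    for m in blocks:
        h = compress(h, m)
    return h

IV = 0x67452301
M  = [0x1111, 0x2222]
M2 = [0x1111 + 1, 0x2222 - 31]   # +1 in block 1 is cancelled by -31 in block 2

assert M != M2 and hash2(IV, M) == hash2(IV, M2)
```

In real SHA-1 attacks the cancellation is of course arranged via differential paths and conditions rather than linearity, but the block-to-block bookkeeping is the same.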
Fig. 1. Collision Attack of SHA-1 by Wang
2.2
CVC and MC
The CVC in the algorithm shown in Section 2.1 is a condition on an output bit of each step of the compression function (a chaining variable) which should be satisfied in order to follow the differential path. An MC is a condition on a message bit which should be satisfied for the same reason. Each CVC and MC is derived in Stage 2 of Section 1. We show an example of a CVC and an MC of a step in Figure 2. When we search for a pair of collision messages, our aim is to generate a message M that meets these conditions. If we can obtain a message M which satisfies all CVCs and MCs, the pair of messages M and M′ (where M′ is the message which has a certain difference from M) generates a chaining variable difference that is useful for the attack.
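As a minimal sketch (our own representation, not the paper's data structures), a single CVC or MC can be stored as a `(kind, step, bit, value)` tuple and checked against the current chaining variables and message words:

```python
def check(cond, a, m):
    """cond = (kind, step, bit, value): the condition holds when bit `bit`
    of chaining variable a[step] (kind == "CVC") or of message word
    m[step] (kind == "MC") equals `value`."""
    kind, step, bit, value = cond
    word = a[step] if kind == "CVC" else m[step]
    return (word >> bit) & 1 == value

# Toy state: only the words referenced by the conditions are populated.
a = {15: 0b10}   # a_{15,1} = 1
m = {14: 0b01}   # m_{14,0} = 1
assert check(("CVC", 15, 1, 1), a, m)
assert check(("MC", 14, 0, 1), a, m)
```

A collision search repeatedly evaluates such predicates over all CVCs/MCs of the differential path after each candidate message.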
3
Message Modification
Message Modification (MM for short) is a technique in which some of the message bits are changed to meet CVCs that are not satisfied, while other CVCs and MCs that are already satisfied are left unchanged, during a collision search.

3.1 Basic Message Modification (BMM)
During the collision search phase, we execute MM when we find that some internal variable ai does not satisfy a CVC. MM is a technique for modifying the particular message variable related to the ai that does not satisfy its CVC. For example,
Fig. 2. CVC and MC
Fig. 3. An Example of Basic Message Modification (BMM)
in the case of Figure 2, during the collision search phase, if a15,1 = 1, the value of the variable a15,1 will become 0 after modifying the message bit m14,1, as shown in Figure 3. This technique is called Basic Message Modification (BMM for short) [6,7]. For a condition related to the variables a1–a16, we can make these variables satisfy such a CVC by the BMM technique.

3.2 Advanced Message Modification (AMM)
The message variables mi (i > 16) after 16 steps are generated by the message expansion (mi = (mi−3 ⊕ mi−8 ⊕ mi−14 ⊕ mi−16) ≪ 1). Therefore, in order to apply MM to the variable ai, we have to apply it to the other variables mi−4, mi−9, mi−15, mi−17, since we cannot correct the message variable mi−1 directly. On the other hand, it does not seem easy to satisfy a CVC in such a simple way, because the change of a message bit influences the internal variables of other steps, and CVCs that have already been satisfied may no longer hold. The modification of the message should be adjusted so as to satisfy the CVC related to the variable after step 16, taking these influences into consideration. Such MM is called Advanced Message Modification (AMM for short) [6,7].
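The message expansion rule just quoted can be written down directly. The sketch below uses 0-based indices (w[0]..w[15] are the input words), which is our convention for the paper's 1-based notation:

```python
def rotl32(x, n):
    # 32-bit left rotation, the "<<< 1" in the expansion rule
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def expand(m):
    """SHA-1 message expansion: from 16 input words to 80 expanded words,
    w[i] = (w[i-3] ^ w[i-8] ^ w[i-14] ^ w[i-16]) <<< 1 for i >= 16."""
    w = list(m)
    for i in range(16, 80):
        w.append(rotl32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    return w

w = expand([0] * 15 + [1])     # flip a single input bit in the last word
assert len(w) == 80
assert w[16] == rotl32(w[13] ^ w[8] ^ w[2] ^ w[0], 1)
```

Running the expansion on a single-bit input shows how one changed bit in m0–m15 propagates into many of the expanded words m16–m79, which is exactly why AMM must compensate for side effects.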
4 Proposed Algorithm for MM

4.1 Study of an Algorithm for MM Search
Representation of CVC/MC by a Matrix. The CVC/MC equations can be represented by a matrix over GF(2). For example, the condition a17,2 + m16,2 + 1 = 0 can be denoted as follows:

(1, 1, 1) · (a17,2, m16,2, 1)^T = 0

Generalizing the above, any condition can be described in the same way. Let

A = (a1,0, ..., a1,31, ..., a80,0, ..., a80,31)
M = (m1,0, ..., m1,31, ..., m80,0, ..., m80,31)

Then every condition has the form

T · (A, M, 1)^T = 0

where T is a vector with 5121 (= 80 · 32 + 80 · 32 + 1) elements, whose entries are 1 if the related variable occurs in the condition and 0 otherwise. With this matrix prepared, we can derive MMs efficiently using linear algebra.

Explanation of the Search Algorithm for MM Using a Matrix. Message Modification must flip the value (on {0, 1}) of a condition that is not satisfied, while keeping the value of every condition that is already satisfied. We therefore only have to solve the linear equation

T · ΔX = (1, 0, ..., 0)^T

In order to obtain a practical MM solution, we prepare some vectors ΔX1, ΔX2, ..., ΔXt in the solution space of ΔX for which the number of required Ex-CVCs is relatively small, and then search for the solution ΔX among their linear combinations. However, it is not guaranteed that a solution ΔXi will really be effective as a solution of MM. We solve this problem by concretely deriving the CVCs.
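As a toy illustration of this vector representation (three variables instead of the full 5121-element vectors), the condition a17,2 + m16,2 + 1 = 0 becomes a coefficient row that is evaluated modulo 2:

```python
T = [1, 1, 1]   # coefficients of (a_{17,2}, m_{16,2}, 1)

def holds(T, a, m):
    # The condition holds iff T . (a, m, 1)^T = 0 over GF(2).
    return (T[0] * a + T[1] * m + T[2] * 1) % 2 == 0

assert holds(T, 1, 0)       # a = 1, m = 0: 1 + 0 + 1 = 0 (mod 2)
assert not holds(T, 1, 1)   # a = 1, m = 1: 1 + 1 + 1 = 1 (mod 2)
```

Stacking one such row per CVC/MC gives the matrix on which the linear-algebra search operates.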
4.2 Main Algorithm for MM Search
Our algorithm for the MM search is as follows:
– [Input] MM TARGET (the vector representation of the target CVC whose value should be changed), CVC VECTOR (the matrix representation of the set of CVCs whose values should remain unchanged).
– [Output] A differential path that satisfies the value of MM TARGET while all CVCs up to the step where MM TARGET is located remain satisfied.
1. Choose one MM TARGET as a CVC which should be satisfied.
2. Let MM STABLE be the set of CVCs and MCs which have already been satisfied and should remain satisfied.
3. Let X be the set of all variables contained in MM TARGET and MM STABLE.
4. Let M be the matrix over GF(2) corresponding to the set of conditions obtained by appending MM TARGET at the top of MM STABLE and omitting the constant term (the term +1).
5. Prepare the set of message differentials generated by the input differential from steps 1 to 16 whose differential path can exist up to the step where MM TARGET is located.
6. Solve the equation M · X = (1, 0, 0, ..., 0)^T by linear algebra from the above set of message differentials.
7. Extract the set of solutions X of the differential path which can be satisfied by adding as few Ex-CVCs as possible.
8. Sort the resulting solutions in increasing order of the number of Ex-CVCs, and output them.

The formula M · X = (1, 0, 0, ..., 0)^T in item 6 of the above algorithm means that the value of the condition MM TARGET is flipped while the values of the conditions in the set MM STABLE are not changed. A differential path X that satisfies this property can be used as a candidate for MM.
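Step 6 can be sketched with a small GF(2) Gaussian-elimination solver. This is our own sketch, not the authors' implementation; condition rows are packed as Python integer bitmasks, and the right-hand side puts 1 on MM TARGET (it must flip) and 0 on every stable condition.

```python
def solve_gf2(rows, ncols, rhs):
    """Solve rows . X = rhs over GF(2) by Gaussian elimination.
    rows: list of int bitmasks (bit c = coefficient of variable c);
    rhs: list of 0/1. Returns one solution as a 0/1 list, or None."""
    rows, rhs = list(rows), list(rhs)
    pivots = {}                           # pivot column -> row index
    r = 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i] >> c & 1), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        rhs[r], rhs[piv] = rhs[piv], rhs[r]
        for i in range(len(rows)):        # reduce to RREF
            if i != r and rows[i] >> c & 1:
                rows[i] ^= rows[r]
                rhs[i] ^= rhs[r]
        pivots[c] = r
        r += 1
    if any(rhs[i] for i in range(r, len(rows))):
        return None                       # a 0 = 1 row: inconsistent system
    x = [0] * ncols
    for c, i in pivots.items():           # free variables stay 0
        x[c] = rhs[i]
    return x

# Two conditions over 3 variables: flip the first (MM TARGET, rhs 1),
# keep the second (stable condition, rhs 0).
X = solve_gf2([0b011, 0b110], 3, [1, 0])
assert X == [1, 0, 0]
```

Returning `None` for inconsistent systems is exactly the "clear yes/no answer" advantage the paper claims for the linear-algebra formulation.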
5
Detailed Description of Algorithm
In this section, we show the results of a computer experiment in which the MM search algorithm described in Section 4 was implemented in software.

5.1 Experiment 1. MM Search of a Specific CVC
Appropriate control of influences of the changed message bits. In item 6 of the algorithm in the previous section, once the set of message bits in m0–m15 to be changed is decided, the influence on the message bits of m0–m79 is uniquely determined by the message expansion. On the other hand, even if a set of changed message bits is given, the influence on the chaining variables a0–a79 is not uniquely determined. The reasons are the uncertainty of carries of arithmetic addition and the uncertainty of the outputs of the non-linear f function. It seems difficult to consider all the patterns
of the arithmetic addition and the f function. We should note that the outcome of the procedures described here appears to depend on the order in which the conditions are treated. In our experiment, the following simplifications are made. Then, the influence on the chaining variables a0–a79 of a set of changed message bits in m0–m15 can be uniquely determined, although the search space is limited. Our solution of MM will be obtained as a combination of these variations.²

Basic Variations of MM
1-bit variation: Choosing one bit from the set m0,0–m15,31 as a changed message bit, and expanding it to the sets m0,0–m79,31 and a0,0–a79,31, where we assume that there are no output differentials of the f function and no carry bits of arithmetic additions in any step.
2-bit variation: Choosing two bits as the changed message bits; one is the bit mi,j in the set m0,0–m14,31 and the other is mi+1,j+5; and expanding them to the set of message differentials and the differential pattern of the chaining variables according to the LC which starts at the changed message bit mi,j, where we assume there are no differential outputs of the f function and no carry bits of arithmetic additions in any step, and the differential of the 6th step of the LC has not vanished.
3-bit variation: Choosing three bits as the changed message bits; one is the bit mi,j in the set m0,0–m10,31 and the others are mi+1,j+5 and mi+5,j+30; and expanding them to the set of message differentials and the differential pattern of the chaining variables according to the LC which starts at the changed message bit mi,j, where we assume there are no differential outputs of the f function and no carry bits of arithmetic additions in any step.
6-bit variation: Choosing six bits as the changed message bits; one is the bit mi,j in the set m0,0–m10,31 and the others are mi+1,j+5, mi+2,j, mi+3,j+30, mi+4,j+30, mi+5,j+30; and expanding them to the set of message differentials and the differential pattern of the chaining variables according to the LC which starts at the changed message bit mi,j, where we assume there are differential outputs of the f function and no carry bits of arithmetic additions in any step.

The above variations are shown in Appendix A. We should note that it is necessary to give some additional conditions (extra chaining variable conditions, Ex-CVCs for short) to satisfy the assumptions related to the f function and the carry bits in the above variations. The following algorithms are used for deriving Ex-CVCs from the differentials of a0,0–a79,31 (ΔA for short, derived in Stage 1 of the Preparing Phase in the Introduction) and the differentials of m0,0–m79,31 (ΔM for short). Note that we disregard any Ex-CVCs related to steps after the step where the head term of MM TARGET is located. In order to execute MM against MM TARGET, only the Ex-CVCs in upper steps (smaller step numbers) than MM TARGET have to be satisfied. On the other hand, it does not matter whether Ex-CVCs in lower steps (larger step numbers) than MM TARGET can be satisfied or not; therefore, we can omit them. In our experiment, when a differential (ΔM, ΔX) has a low success probability under MM despite the appended Ex-CVCs, we judge it useless and do not use it. Moreover, when the corresponding Ex-CVCs contain a contradiction, the differential pattern is removed from the solutions of MM.

² Of course, other variations such as a 4-bit variation can also be considered.

Deriving algorithm of CVC. To explain the algorithm, the following notation is defined:

Definition 1. For an arithmetic differential ΔO = O′ − O, the behaviour of each bit is defined as ∇Oi = o′i − oi.

Roughly speaking, this algorithm executes the following procedure:
– [Input] ‖∇O‖ (= |∇O31|·2^31 + |∇O30|·2^30 + ... + |∇O0|), ‖∇R‖, ‖∇E‖, ‖∇M‖ in Figure 4
– [Output] the CVCs that establish the differential path

To establish the differential path in each step means that the following two equations hold:
1. ∇e0 ⊕ ∇r0 ⊕ ∇m0 ⊕ ∇f0 = ∇o0 (LSB)
(other bits).
In equation 1 and 2, ∇fj and ∇carryj are variables. Let ∇oj ⊕∇mj ⊕∇rj ⊕∇ej = ∇zj . ∇carryj ⊕ ∇fj = ∇zj , CVCs are given to the input of the f function and the variables for the lower position that cause ∇carryj according to ∇fj and ∇carryj . At this point, CVC is given so as not to cause ∇carryj as much as possible. CVC giving algorithm – [Input] ||∇R||, ||∇O||, ||∇M ||, ||∇E||, ||∇B||, ||∇C||, ||∇D|| – [Output] CVC for R, E, M, O, F, B, C, D (CVC Vector) 1. Derive the differential value in each bit position from ||∇R||, ||∇O||, ||∇M ||, ||∇E||. Calculate |∇Rj | + |∇Oj | + |∇Mj | + |∇Ej |(number of differentials in each bit) in each bit. 2. On the differential of the output of the f function, f statej is given in each bit as “pass only,” “do not pass only,” and “pass/do not pass both possible” based on the differential of the input of the f function (|∇Bj |, |∇Cj |, |∇Dj |). 3. Execute “even odd initialization function” 4. Execute “CVC vector generation function”
Fig. 4. Extraction of CVC
Even odd initialization function
– [Input] |∇Rj|, |∇Oj|, |∇Mj|, |∇Ej|, f statej
– [Output] ∇fj, even, odd, carry
1. Based on the flag f statej, ∇fj is adjusted so that |∇Rj| + |∇Oj| + |∇Mj| + |∇Ej| + |∇Fj| (the number of differentials at each bit) becomes an even number whenever possible; depending on the flag f statej, the number may not be made even.
2. If the result of step 1 is even, the number of differentials is saved in even[i][j]; if the result is odd, it is saved in odd[i][j].
3. If odd[i][j] = 1 and even[i][j-1] + odd[i][j-1] + carry[i][j-1] ≥ 2, set carry[i][j] = 1 (a carry is available).

CVC vector generation function
– [Input] |∇Rj|, |∇Oj|, |∇Mj|, |∇Ej|, ∇fj, even, odd, carry
– [Output] CVCs for R, E, M, O, F, B, C, D (CVC vector)
1. For j = 0 to 31:
2. If j = 31, execute as follows:
(a) For |∇Rj|, |∇Ej|, |∇Mj|, and |∇Oj|, XOR 1 into the corresponding position of the matrix, and set carry flag[j] = 1, 2, 3, 4 respectively. In addition, XOR 1 into the position for "1" of the matrix when |∇Oj| = 1.
(b) If even[i][j] = 2:
i. If carry[i][j+1] = 0, there is assumed to be no carry from this bit to the higher bit, and XOR 1 into the position for "1" of the matrix; set carry flag[j] = 0.
(c) If odd[i][j] = 1:
i. If carry[i][j] + odd[i][j] = 2, output the warning message.
Matrix Representation of Conditions for the Collision Attack
ii. If j ≠ 0, XOR 1 into the appropriate position of the matrix based on the value of carry flag[j-1]:
– if 1: the position of |∇Rj−1|
– if 2: the position of |∇Ej−1|
– if 3: the position of |∇Mj−1|
– if 4: the position of |∇Oj−1| and the position for "1"
3. Set the CVCs of B, C, and D in the matrix so that the differential of the output of the f function becomes ∇fj.
4. Continue the loop over j.
Environment of our Computer Experiment. PC: Fujitsu FMV D5340; OS: Microsoft Windows XP Professional; CPU: Intel Core 2 Duo E6700; RAM: 2 GB.
Experimental Results. The computer experiment under the above conditions was performed with the parameters DV and Differential Path shown by Wang et al. [2]. By setting MM TARGET to a26,3 + a25,3 + m28,1 + m26,1 + 1 in step 26, we found two patterns applicable to MM; one of the results is shown in Appendix B. Note that we assumed there is no restriction on the CVCs in the differential path in Round 1.
5.2 Experiment 2: Searching for a Range in Which MM Can Be Applied
It is known that the complexity of finding a collision is given by the following equation³:
Equation 1. The complexity of finding a collision = 2^(N−n), where N is the number of linearly independent conditions that must be satisfied and n is the number of conditions to which MM can be applied.
With this in mind, in this experiment we implemented an algorithm that includes the MM search algorithm.
Algorithm
– [Input] target step: the step of the effort target
– [Output] the number of linearly independent CVCs to which MM cannot be applied
0. MM sol = φ (the set of CVCs judged applicable to MM), ExCVC = φ; the CVC vector is called 'CVC' below.
1. Diagonalize CVC ∪ ExCVC to remove redundancy of the head terms.
2. Extract the target CVC for MM: the CVC whose head term's step is maximal among the CVCs whose head term is at or below the target step and is not included in the set of head terms in MM sol. Count up the serial number of the target CVC (mm target cvc, 1-origin).
³ De Cannière and Rechberger [15] proposed a more precise method for evaluating the search cost of collision attacks.
3. Solve MM by using the MM search algorithm.
– Store the number of solutions in mm num[mm target cvc].
– If the number of solutions is 0, count down mm target cvc of the target CVC and decrease mm num[mm target cvc] by 1. Count down the target step when mm target cvc = 1.
4. Re-compose the ExCVCs.
5. If no CVCs remain to which MM should be applied, for target steps down to TARGET STEP MIN+1, output the number of CVCs among all CVCs to which MM is not applied, and finish the algorithm.
6. Return to step 1.
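Step 1 of this algorithm diagonalizes CVC ∪ ExCVC over GF(2). As a minimal sketch (our own encoding, not the authors' implementation): each linear condition can be packed into an integer bitmask, and GF(2) Gaussian elimination leaves one row per independent head term, so the length of the resulting basis counts the linearly independent conditions.

```python
def gf2_diagonalize(rows):
    """GF(2) Gaussian elimination on linear conditions encoded as int bitmasks.

    Each surviving row has a distinct head (highest set) bit, and
    len(result) is the number of linearly independent conditions.
    """
    basis = []
    for r in rows:
        for b in basis:        # reduce r by rows whose head bit is set in r
            r = min(r, r ^ b)
        if r:                  # r is independent of the current basis
            basis.append(r)
            basis.sort(reverse=True)
    return basis
```

For example, over the conditions {110, 011, 101, 111} (as bit vectors), 101 = 110 ⊕ 011 is redundant, so the rank is 3.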
Experimental Results. We performed Experiment 2 in the same computer environment as Experiment 1, searching while increasing the step of the effort target one by one from step 20. In this experiment we executed the MM algorithm only on the original CVCs, not recursively on the Ex-CVCs located in the higher steps found by MM. When executing a real collision search, one has to consider not only the original CVCs but also the Ex-CVCs. Using the DV/DP/CVCs shown by Wang et al. [2], we found that MM could be applied up to step 26; the number of remaining CVCs was 64, and the computing time was about 40 seconds. The CVCs given by the DP in Round 1 were ignored.
6 Conclusion
In this paper, we proposed a method for representing the CVCs/MCs of a collision attack on hash functions, especially SHA-1, as a matrix. We then proposed an algorithm for constructing the MM procedure in order to reduce the complexity of the collision attack. This algorithm is based on the matrix representation and on linear algebra over GF(2), and we were able to construct the MM procedure by resolving the matrix. We performed two computer experiments. In the first, we succeeded in constructing MM procedures with our proposed algorithm, using the parameters DV, DP and CVCs shown by Wang et al. In the second, we showed that there are constructions of MMs applicable to CVCs up to step 26 under the same parameters. In these experiments, the non-linear path in Round 1 of SHA-1 and the recursive execution of the MM procedures on the added Ex-CVCs were not considered; these are left for future work.
Acknowledgement This research was financially supported by a contract for research with the Information-Technology Promotion Agency (IPA), Japan.
References 1. NIST. Secure hash standard. Federal Information Processing Standard, FIPS180-1 (April 1995) 2. Wang, X., Yin, Y.L., Yu, H.: Finding Collisions in the Full SHA-1. In: Shoup, V. (ed.) CRYPTO 2005. LNCS, vol. 3621, pp. 17–36. Springer, Heidelberg (2005) 3. Biham, E., Chen, R., Joux, A., Carribault, P., Lemuet, C., Jalby, W.: Collisions in SHA-0 and Reduced SHA-1. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 36–57. Springer, Heidelberg (2005) 4. Wang, X., Yu, H.: How to Break MD5 and Other Hash Functions. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 19–35. Springer, Heidelberg (2005) 5. Chabaud, F., Joux, A.: Differential Collisions in SHA-0. In: Krawczyk, H. (ed.) CRYPTO 1998. LNCS, vol. 1462, pp. 56–71. Springer, Heidelberg (1998) 6. Wang, X., Yao, A.C., Yao, F.: Cryptanalysis on SHA-1 Hash Function. Keynote Speech at Cryptographic Hash Workshop 7. Wang, X.: Cryptanalysis of Hash functions and Potential Dangers. Invited Talk at CT-RSA 2006 (2006) 8. Mendel, F., Pramstaller, N., Rechberger, C., Rijmen, V.: The Impact of Carries on the Complexity of Collision Attacks on SHA-1. In: Robshaw, M.J.B. (ed.) FSE 2006. LNCS, vol. 4047, pp. 278–292. Springer, Heidelberg (2006) 9. Pramstaller, N., Rechberger, C., Rijmen, V.: Exploiting Coding Theory for Collision Attacks on SHA-1. In: Smart, N.P. (ed.) Cryptography and Coding 2005. LNCS, vol. 3796, pp. 78–95. Springer, Heidelberg (2005) 10. Yajima, J., Iwasaki, T., Naito, Y., Sasaki, Y., Shimoyama, T., Kunihiro, N., Ohta, K.: A Strict Evaluation Method on the Number of Conditions for the SHA-1 Collision Search. In: ASIACCS 2008 (March 2008) 11. Manuel, S.: Classification and Generation of Disturbance Vectors for Collision Attacks against SHA-1. Cryptology ePrint Archive (November 2008), http://eprint.iacr.org/2008/469 12. Joux, A., Peyrin, T.: Hash Functions and the (Amplified) Boomerang Attack. In: Menezes, A. (ed.) CRYPTO 2007. LNCS, vol. 4622, pp. 244–263. 
Springer, Heidelberg (2007) 13. Naito, Y., Sasaki, Y., Shimoyama, T., Yajima, J., Kunihiro, N., Ohta, K.: Improved Collision Search for SHA-0. In: Lai, X., Chen, K. (eds.) ASIACRYPT 2006. LNCS, vol. 4284, pp. 21–36. Springer, Heidelberg (2006) 14. Sugita, M., Kawazoe, M., Perret, L., Imai, H.: Algebraic Cryptanalysis of 58-Round SHA-1. In: Biryukov, A. (ed.) FSE 2007. LNCS, vol. 4593, pp. 349–365. Springer, Heidelberg (2007) 15. De Cannière, C., Rechberger, C.: Finding SHA-1 Characteristics. In: Lai, X., Chen, K. (eds.) ASIACRYPT 2006. LNCS, vol. 4284, pp. 1–20. Springer, Heidelberg (2006) 16. Hawkes, P., Paddon, M., Rose, G.: Automated Search for Round 1 Differentials for SHA-1: Work in Progress. In: NIST Second Cryptographic Hash Workshop (August 2006) 17. Yajima, J., Sasaki, Y., Naito, Y., Iwasaki, T., Shimoyama, T., Kunihiro, N., Ohta, K.: A New Strategy for Finding a Differential Path of SHA-1. In: Pieprzyk, J., Ghodosi, H., Dawson, E. (eds.) ACISP 2007. LNCS, vol. 4586, pp. 45–58. Springer, Heidelberg (2007)
18. De Cannière, C., Mendel, F., Rechberger, C.: Collisions for 70-step SHA-1: On the full cost of collision search. In: Adams, C., Miri, A., Wiener, M. (eds.) SAC 2007. LNCS, vol. 4876, pp. 56–73. Springer, Heidelberg (2007) 19. Joux, A.: Message Modification, Neutral Bits and Boomerangs: From Which Round Should we Start Counting in SHA? In: NIST Second Cryptographic Hash Workshop (2006)
Appendix A
Fig. 5. 1-bit variation
Fig. 6. 2-bit variation
Fig. 7. 3-bit variation
Fig. 8. 6-bit variation
Appendix B
An example of the MM procedure found in computer experiment 1.
Input DP: the input differential path, without sign.
Input condition: the input CVCs.
cvc target: the target CVC that should be flipped by an MM procedure.
m diff: the message bits that must be flipped when MM is applied. Actually, when we flip m0 - m15, the remaining m16 - m79 are automatically flipped by the message expansion.
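The automatic flipping can be checked against the SHA-1 message expansion rule W_t = ROTL1(W_{t−3} ⊕ W_{t−8} ⊕ W_{t−14} ⊕ W_{t−16}) from FIPS 180. A short illustrative sketch (not the search code itself):

```python
def expand(m):
    # SHA-1 message expansion: w[t] = ROTL1(w[t-3] ^ w[t-8] ^ w[t-14] ^ w[t-16])
    w = list(m)
    for t in range(16, 80):
        x = w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16]
        w.append(((x << 1) | (x >> 31)) & 0xFFFFFFFF)
    return w

base = expand([0] * 16)
flipped = [0] * 16
flipped[5] ^= 1 << 10                 # flip one bit of m_5
# XOR of the two expansions: which bits of m_16..m_79 flip automatically
diff = [a ^ b for a, b in zip(base, expand(flipped))]
```

The first expanded word that inherits the flip is w19 (it takes w5 as its W_{t−14} input), rotated left by one position.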
a diff: Positions of ai that flipped by flipping of m diff. We can see that cvc target is satisfied by flipping a26,3 . ex cvc: ExCVCs that must be satisfied when this MM is executed. 0x40000001, 0x00000002, 0x00000002, 0x80000002, 0x00000001, 0x00000000, 0x80000001, 0x00000002, 0x00000002, 0x00000002, 0x00000000, 0x00000000, 0x00000001, 0x00000000, 0x80000002, 0x00000002, 0x80000002, 0x00000000, 0x00000002, 0x00000000, 0x00000001, 0x00000000, 0x00000002, 0x00000002, 0x00000001, 0x00000000, 0x00000002, 0x00000002, 0x00000001, 0x00000000, 0x00000000, 0x00000002, 0x00000001, 0x00000000, 0x00000002, 0x00000002, 0x00000000, 0x00000000, 0x00000002, 0x00000000, 0x00000000, 0x00000000, 0x00000002, 0x00000000, 0x00000002, 0x00000000, 0x00000002, 0x00000000, 0x00000002, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000004, 0x00000000, 0x00000000, 0x00000008, 0x00000000, 0x00000000, 0x00000010, 0x00000000, 0x00000008, 0x00000020, 0x00000000, 0x00000000, 0x00000040, 0x00000000, 0x00000028, 0x00000080; a13,3 + a15,1 + a17,1 , a13,3 + a14,3 + 1, a16,1 + m16,6 + 1, a13,0 + m16,30 + 1, a15,30 , a13,1 + a14,1 + 1, a14,3 + a16,1 + m17,1 + 1, a14,3 + a15,3 + 1, a17,31 + m17,4 + 1, a17,1 + m17,6 + 1, a15,31 + m17,29 + 1, a16,29 + 1, a13,0 + m17,30 + 1, a16,31 + 1, a15,3 +a17,1 +a19,1 , a15,3 +a16,3 +1, a15,31 +m18,29 +1, a17,29 , a15,1 +a16,1 , a19,1 + m19,6 +1, a15,31 +a17,31 +1, a18,29 +1, a21,0 +m20,0 +1, a17,3 +a18,3 +a19,1 +a21,0 , a17,31 +a18,31 +a19,29 +m20,29 +1, a21,0 +m21,5 , m21,6 +m21,5 +1, a17,31 +m21,29 +1, a19,2 + a20,2 + a21,0 + m22,0 , a23,1 + m22,0 , a24,1 + m23,1 , a23,1 + m23,6 + 1, a20,0 + a21,0 +a22,30 +m23,30 , a25,0 +m24,0 , a21,3 +a22,3 +a23,1 +m24,1 +1, a24,1 +m24,6 +1, a21,0 + a22,0 + a23,30 + m24,30 , a22,3 + a23,3 + a24,1 + m25,1 + 1, a25,0 + m25,5 + 1, a21,0 + m25,30 , a23,2 + a24,2 + a25,0 + m26,0 + 1, a27,1 + m26,1 , 
a24,0 + a25,0 + a26,30 + m27,30 + 1, a25,3 + a26,3 + m26,1 + m28,1 + 1, a28,1 + m27,1 , a27,1 + m27,6 + 1, a29,0 + m28,0 , a28,1 + m28,6 + 1, a25,0 + a26,0 + a27,30 + m28,30 + 1, a26,3 + a27,3 + a28,1 +m29,1 +1, a29,0 +m29,5 +1, a25,0 +m29,30 +1, a27,2 +a28,2 +a29,0 +m30,0 +1, a32,1 + m31,1 , a28,0 + a29,0 + a30,30 + m31,30 + 1, a33,0 + m32,0 + 1, a33,0 + m32,1 , a32,1 + m32,6 + 1, a29,0 + a30,0 + a31,30 + m32,30 + 1, a30,3 + a31,3 + a32,1 + m33,1 + 1, a33,0 + m33,5 , m33,6 + m33,5 + 1, a29,0 + m33,30 + 1, a31,2 + a32,2 + a33,0 + m34,0 , a35,1 + m34,0 , a36,1 + m35,1 , a35,1 + m35,6 + 1, a32,0 + a33,0 + a34,30 + m35,30 , a33,3 + a34,3 + a35,1 + m36,1 + 1, a36,1 + m36,6 + 1, a33,0 + a34,0 + a35,30 + m36,30 , a34,3 + a35,3 + a36,1 + m37,1 + 1, a33,0 + m37,30 , a39,1 + m38,1 , a39,1 + m39,6 + 1, a39,1 + m40,1 + 1, a37,3 + a38,3 + 1, a38,1 + a40,31 + 1, a43,1 + m42,1 , a40,1 + a41,31 + 1, a43,1 + m43,6 + 1, a43,1 + a45,1 , a41,3 + a42,3 + 1, a45,1 + m45,6 + 1, a42,1 + a44,31 + 1, a45,1 + a47,1 , a43,3 + a44,3 + 1, a44,1 + a45,31 + 1, a47,1 + m47,6 + 1, a44,1 + a46,31 + 1, a47,1 + a49,1 , a45,3 + a46,3 + 1, a46,1 + a47,31 + 1, a49,1 + m49,6 + 1, a46,1 + a48,31 + 1,
a49,1 +m50,1 +1, a47,3 +a48,3 +1, a48,1 +a49,31 +1, a48,1 +a50,31 +1, a50,1 +a51,31 +1, a65,2 + m64,2 , a65,2 + m65,7 + 1, a63,4 + a64,4 + a65,2 + m66,2 + 1, a64,2 + a65,2 + a66,0 + m67,0 + 1, a68,3 + m67,3 , a65,2 + a66,2 + a67,0 + m68,0 + 1, a68,3 + m68,8 + 1, a65,2 + m69,0 + 1, a66,5 + a67,5 + a68,3 + m69,3 + 1, a67,3 + a68,3 + a69,1 + m70,1 + 1, a71,4 + m70,4 , a68,3 + a69,3 + a70,1 + m71,1 + 1, a71,4 + m71,9 + 1, a68,3 + m72,1 + 1, a73,3 + m72,3 , a69,6 + a70,6 + a71,4 + m72,4 + 1, a70,4 + a71,4 + a72,2 + m73,2 + 1, a74,5 +m73,5 , a73,3 +m73,8 +1, a71,4 +a72,4 +a73,2 +m74,2 +1, a71,5 +a72,5 +a73,3 + m74,3 + 1, a74,5 + m74,10 + 1, a72,3 + a73,3 + a74,1 + m75,1 + 1, a71,4 + m75,2 + 1, a72,7 + a73,7 + a74,5 + m75,5 + 1, a73,3 + a74,3 + a75,1 + m76,1 + 1, a73,5 + a74,5 + a75,3 + m76,3 + 1, a77,6 + m76,6 , a73,3 + m77,1 + 1, a74,5 + a75,5 + a76,3 + m77,3 + 1, a77,6 + m77,11 + 1, a74,5 + a79,3 , a79,5 + m78,5 , a75,8 + a76,8 + a77,6 + m78,6 + 1, a76,6 + a77,6 + a78,4 + m79,4 + 1, a80,7 + m79,7 , a79,3 + m79,8 + 1, a79,5 + m79,10 + 1; a26 3+a25 3+m28 1+m26 1+1 <m diff:26> 0x00000400, 0x00008000, 0x00000400, 0x00000100, 0x00000100, 0x00000100, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00010200, 0x00000a00, 0x00000000, 0x00020600, 0x00001600, 0x00000000, 0x00040c00, 0x00002c00, 0x00020400 0x00000400, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00010200, 0x00204a00, 0x04084200, 0x812a4c80, 0x21499610, 0xa912cc04, 0x22548ba5, 0xca997284, 0x73249a99 <ex cvc:89> a1,10 + m78,8 + m1,15 + 1, a1,10 + m2,10 + 1, a1,10 + m3,8 , a2,8 + 1, a1,10 + m4,8 + 1, a3,8 , a1,10 + m5,8 + 1, a18,9 + m17,9 , a18,16 + m17,16 , a19,9 + m18,9 , a19,11 + m18,11 , a19,14 + a18,9 , a19,21 + a18,16 , a20,9 + a18,9 + a16,11 , a17,11 + a16,11 + 1, a20,14 + a19,9 
, a19,11 + a18,16 + a16,18 + 1, a17,18 + a16,18 + 1, a20,19 + a19,14 , a20,26 + a19,21 , a21,7 + a19,7 + a18,9 + a17,9 , a19,9 + a18,11 + a17,11 + m20,9 + 1, a21,10 + m20,10 , a21,11 +a19,11 +a18,13 +a17,13 , a21,14 +a20,9 , a21,17 +m20,17 , a21,19 +a20,14 , a21,21 + a19,21 + a18,23 + a17,23 , a21,24 + a20,19 , a22,4 + a21,31 , a22,9 + m21,9 , a22,10 + m21,10 , a22,12 + a21,7 + a20,12 + a19,14 + a18,14 + m21,12 + 1, a22,15 + a21,10 , a22,16 + a21,11 , a22,19 + a21,14 , a22,22 + a21,17 , a22,24 + a21,19 , a21,21 + a20,26 + a19,28 + a18,28 + 1, a22,29 + a21,24 , a23,2 + a22,29 , a21,7 + a20,9 + a19,9 + a18,9 + 1, a22,4 + a21,9 + a20,11 + a19,11 + 1, a23,10 + a21,10 + a20,12 + a19,12 , a23,11 + a21,11 + a20,13 + a19,13 , a23,14 + a22,9 + a21,14 + a20,16 + a19,16 + a18,16 + 1, a23,15 + a22,10 , a23,17 + a22,12 , a23,20 + a22,15 , a22,16 + a21,21 + a20,23 + a19,23 + 1, a23,24 + a22,19 , a23,27 + a22,22 , a23,29 + a22,24 , a24,0 + a23,27 , a24,2 + a23,29 , a23,31 + a22,4 + a21,6 + a20,6 + 1, a24,5 +a22,5 +a21,7 +a20,7 , a24,7 +a23,2 +a22,7 +a21,9 +a20,9 +a19,9 +1, a24,8 +a22,8 +
a21,10 + a20,10 , a24,9 + a19,11 , a22,10 + a21,12 + a20,12 + m23,10 + 1, a24,11 + m23,11 , a22,12 +a21,14 +a20,14 +a19,14 +1, a24,15 +a23,10 , a23,11 +a22,16 +a21,18 +a20,18 +1, a24,18 + m23,18 , a23,14 + a19,21 + 1, a24,20 + a23,15 , a24,22 + a23,17 , a24,25 + a23,20 , a24,29 + a23,24 , a25,2 + a24,29 , a24,0 + a23,5 + a22,7 + a21,7 + 1, a25,7 + a24,2 + a23,7 + a22,9 + a21,9 + a20,9 + 1, a25,9 + a23,9 + a22,11 + a21,11 , a24,5 + m24,10 + 1, a23,11 +a22,13 +a21,13 +m24,11 +1, a25,12 +a24,7 +a23,12 +a22,14 +a21,14 +a20,14 +1, a25,13 + a24,8 + a23,13 + a22,15 + a21,15 + m24,13 + 1, a25,14 + a24,9 , a25,16 + a24,11 , a23,17 + a22,19 + a21,19 + a20,19 + 1, a25,19 + a23,19 + a22,21 + a21,21 , a25,20 + a24,15 , a25,23 + a24,18 , a23,24 + a22,26 + a21,26 + a20,26 + 1, a25,25 + a24,20 , a25,27 + a24,22 , a25,30 + a24,25 ,
Mutual Information Analysis under the View of Higher-Order Statistics Thanh-Ha Le and Mael Berthier Morpho 18, Chaussée Jules César, 95520 Osny, France {thanh-ha.le,mael.berthier}@morpho.com
Abstract. Mutual Information Analysis (MIA) is a generic attack which aims at measuring dependencies between side-channel signals and intermediate data during cryptographic operations. In this paper, we propose a novel approach to estimate the mutual information based on higher-order cumulants. The simulation and experimental results show that the cumulant-based MIA can be a good method in both first- and second-order attacks. The implementation of the proposed method is practical and its extension to higher-order analysis does not require any additional development. Under higher-order statistics, we confirm the generality of MIA by recognizing the similitude between classical analysis and the cumulant-based MIA. Keywords: Mutual Information Analysis, Second-Order Analysis, Higher-Order Statistics, Cumulants, Edgeworth Expansion.
1 Introduction
During the last ten years, power analysis attacks have been widely developed under many forms. Such attacks analyse the relationship between the power consumption of a cryptographic device and the data handled during cryptographic operations. They are considered very powerful attacks for revealing confidential data (e.g. a secret key of a cryptographic algorithm), since they are imperceptible to users and do not require expensive equipment. The power analysis attack technique exploits dependencies between the power dissipation and the intermediate values of the cryptographic algorithms implemented in embedded devices. A representative example is Differential Power Analysis (DPA) [1], which explores the different behaviour of the power consumption signal when the examined bit is switched to 1 or 0. This mono-bit DPA was then improved by multi-bit analysis [2, 3]. The DPA was later extended to Correlation Power Analysis (CPA), first proposed in [5, 4] and then analysed in detail in [6]. Some works have been proposed to improve the CPA. However, they always aim at analysing the linear dependence between the power consumption signals and a selection function based on intermediate values of a cryptographic algorithm. I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 285–300, 2010. © Springer-Verlag Berlin Heidelberg 2010
T.-H. Le and M. Berthier
Eight years after the publication of CPA, the generic form of power analysis was introduced under the name Mutual Information Analysis (MIA) [7, 8]. The mutual information is used as a criterion to measure linear and non-linear dependencies between the power consumption and intermediate values. MIA can be considered an important evolution of power analysis since it works without any knowledge of the dependencies between the observation and the manipulated data. Side-channel analysis based on mutual information has attracted a lot of research effort aiming to build a theoretical foundation for MIA [9], to explain "how, when and why" MIA works [11], to compare MIA and CPA under a Gaussian assumption [12], and to formalize MIA in the context of higher-order analysis [9, 13]. In this paper, we propose a new approach to estimate the mutual information based on higher-order cumulants, called the cumulant-based MIA. We take the histogram-based MIA, used in almost all the aforementioned publications, as a reference to evaluate the cumulant-based MIA. Firstly, simulation and experimental results show that the proposed method improves the efficiency of MIA in both first- and second-order analysis. Secondly, the proposed method is practical since the approximation of the mutual information is computed directly from cumulants, and it is valid in both first- and higher-order analysis, which was not the case in previous works. Finally, in some specific contexts, the cumulant-based MIA is similar to the classical analyses that are frequently used in first-order attacks (e.g. the CPA) and in second-order attacks (e.g. the abs-diff-DPA [21] and the product-DPA [20, 10]). The rest of the paper is structured as follows. In Section 2, we give background about MIA and related works. Section 3 provides the theoretical basis of our solution, which uses higher-order cumulants to estimate the mutual information.
The first- and second-order cumulant-based MIA attacks are presented in Section 4 and Section 5, respectively, with details of the simulation and experimental results. Our conclusions are drawn in Section 6.
2 Related Works

2.1 Correlation Power Analysis and Mutual Information Analysis
Let HK = {HK,i}i=1..n denote the predictions of an intermediate value corresponding to messages mi and a key hypothesis K, and let W = {Wi}i=1..n denote the values of n side-channel signals at the instant t where the intermediate value is manipulated. We define a selection function ϕ(HK), for example the Hamming weight of HK. The original correlation analysis CPA uses the correlation factor ρ(K) between W and ϕ(HK) as the distinguisher to find the correct key; the guessed key is the one corresponding to the highest correlation factor. The correlation coefficient between W and ϕ(HK) can be estimated by the following formula:

ρ(K) = cov(W, ϕ(HK)) / (σW · σϕ(HK))    (1)
where cov(W, ϕ(HK )) is the covariance of W and ϕ(HK ); σW , σϕ(HK ) are the standard deviations of W and ϕ(HK ) respectively. MIA uses the mutual information measure I(K) = I(W, ϕ(HK )) to discover all linear and non-linear dependencies between side-channel signals and manipulated data. Let Pr[W = x], Pr[ϕ(HK ) = y] denote the marginal probability distribution functions of W , ϕ(HK ) and Pr[W = x, ϕ(HK ) = y] denote the joint probability distribution function of W and ϕ(HK ). The mutual information of two discrete random variables W and ϕ(HK ) is defined by:
I(W; ϕ(HK)) = Σ_{x∈W, y∈ϕ(HK)} Pr[W = x, ϕ(HK) = y] · log( Pr[W = x, ϕ(HK) = y] / (Pr[W = x] · Pr[ϕ(HK) = y]) )    (2)

The key hypothesis corresponding to the highest mutual information I(W, ϕ(HK)) is considered the correct key.

2.2 Probability Density Estimation
The core of the MIA technique is focused on efficient and practical ways to estimate the probability densities; and the mutual information can be estimated via Eq. (2). Probability density estimation methods can be classified into two categories: parametric and non-parametric. This section gives an overview on different probability density estimation methods that have been used in the context of MIA. These methods are summarized in Table 1. Non-parametric methods: A non-parametric probability density estimation is a statistical method that does not impose any model on variables and their distribution. Two non-parametric estimators commonly used in MIA are: – Histogram-based Estimator (HE): Histogram-based method is considered as the simplest non-parametric way to estimate the probability density of a variable. It consists in distributing observed values in disjoint zones called bins. The number of bins and the way to distribute the values in bins affect the efficiency of this method. – B-Spline Estimator (BSE): In the classical histogram approach, each data point is assigned to one, and only one, bin. For data points near to the border of a bin, small fluctuations due to noise might shift these points to neighbouring bins. B-Spline Estimator [14], a generalized histogram method, allows the data points to be assigned to several bins simultaneously using B-Spline functions. – Kernel Density Estimator (KDE): Instead of distributing values in determined bins, the Kernel Density Estimator adds neighbouring zones, defined by a Kernel function1 , around values and then sums up these zones together 1
¹ A kernel is a non-negative real-valued integrable function K satisfying the following two requirements: ∫_{−∞}^{+∞} K(u) du = 1, and K(−u) = K(u) for all u. Several types of kernel functions are commonly used: uniform, triangle, Gaussian and cosine.
to build the estimator. In this way, the distribution estimated by a Kernel density estimator is smoother than that of a histogram-based estimator. The advantage of non-parametric estimators over parametric ones is that they are free of distribution hypotheses. However, they are not parameter-free: the efficiency of a histogram-based estimator depends on the number of bins and how the bins are defined, while the Kernel density estimator depends on the bandwidth and how the Kernel function is defined. Table 1. Mutual information estimation methods and publications related to MIA. The sign 'x' means that the method has been tested and evaluated in the publication.
Publications (rows): Gierlichs08 [7], Veyrat09 [11], Prouff09 [9], Moradi09 [12], Gierlichs10 [13], Flament10 [19], Venelli10 [24]. Methods (columns): non-parametric (HE, BSE, KDE), parametric (FM), other methods (KS, KSN, CM). The histogram-based estimator (HE) is marked for all seven publications; the remaining methods are each evaluated in only a few of them.
Parametric methods: Parametric estimation assumes that the data distribution belongs to a known family, such as the normal, log-normal, or exponential; the parameters are chosen by fitting the model to a data set. Until now, the most common parametric method used for MIA is the Finite Mixture (FM). It assumes that the density distribution of a variable can be modelled as a sum of weighted densities, for example a sum of weighted Gaussian probability density functions; a method for determining its parameters is Expectation Maximization [18]. In addition, the Kolmogorov-Smirnov distance (KS), the normalised Kolmogorov-Smirnov distance (KSN) and the Cramer-von-Mises test (CM) can be used to compare two density distributions directly, without explicitly knowing them [11].
3 Cumulant-Based Dependence Measure
In this section, the mutual information estimation using the Edgeworth expansion [16] is presented. This approach, classified among parametric methods, is widely used in Independent Component Analysis to separate statistically independent sources [15, 17]. In fact, the idea of using the Edgeworth expansion in MIA has been mentioned in [8]. However, to the best of our knowledge, cumulant-based probability density estimation has not yet been applied in the MIA context. Moreover, our solution is not limited to probability density estimation: we go further and estimate the mutual information directly using higher-order cumulants.
The Edgeworth expansion [16] allows approximating the probability density of a real random variable close to the standard normal as a function of its cumulants.

Lemma 1. The probability density of a real random variable U, close to the standard normal n(u) = (2π)^{-1/2} exp(−u²/2), can be approximated by:

p(u) ≈ n(u) [ 1 + ((k2(U) − 1)/2!) h2(u) + (k3(U)/3!) h3(u) + (k4(U)/4!) h4(u) ]

where h2(u) = u² − 1, h3(u) = u³ − 3u and h4(u) = u⁴ − 6u² + 3 are Chebyshev-Hermite polynomials. The values kr(U) are the r-th-order cumulants of U (for more details about cumulants, see Appendix 1).

Remark 1. The number of terms of the Edgeworth expansion [16] is not limited; in the context of this paper, we consider its four leading terms. As we suppose that the distribution of U is close to (but not exactly) the standard normal distribution, the second, third and fourth terms are used as correcting ones.

Lemma 2. If U and V are two real random variables with distributions close to the standard normal, the Kullback-Leibler divergence of U and V can be approximated by:

K(U||V) ≈ (k2(U) − k2(V))²/4 + (k3(U) − k3(V))²/12 + (k4(U) − k4(V))²/48

Remark 2. See Appendix 2 for the proof.

Lemma 3. Consider two real random vectors U = [U1, U2, ..., Un] and V = [V1, V2, ..., Vn] with components close to the standard normal distribution. The Kullback-Leibler divergence between U and V can be approximated as:

K(U||V) ≈ (1/4) Σ_{ij} (R^U_ij − R^V_ij)² + (1/12) Σ_{ijk} (T^U_ijk − T^V_ijk)² + (1/48) Σ_{ijkl} (Q^U_ijkl − Q^V_ijkl)²

where i, j, k, l vary from 1 to n, and where, for X ∈ {U, V} and X̄m = Xm − E(Xm) with m ∈ {i, j, k, l}:

R^X_ij = E(X̄i X̄j)
T^X_ijk = E(X̄i X̄j X̄k)
Q^X_ijkl = E(X̄i X̄j X̄k X̄l) − E(X̄i X̄j)E(X̄k X̄l) − E(X̄i X̄k)E(X̄j X̄l) − E(X̄i X̄l)E(X̄j X̄k)

Remark 3. This result generalizes Lemma 2 to the multi-variable context. The terms of the estimation are a double sum over all n² pairs of indices, a triple sum over all n³ triplets of indices and a quadruple sum over all n⁴ quadruples of indices.
Proposition 1. Consider a real random vector U = [U1, U2, ..., Un] with components close to the standard normal distribution. The mutual information between the components of U can be estimated by:

I(U) = I(U1; U2; ...; Un) ≈ (1/4) Σ_{ij≠ii} (R^U_ij)² + (1/12) Σ_{ijk≠iii} (T^U_ijk)² + (1/48) Σ_{ijkl≠iiii} (Q^U_ijkl)²    (3)

Proof. The mutual information between the components of U can be obtained by minimizing the Kullback-Leibler divergence between U and V, over all V whose components are independent. When the components of V are independent, all cross-cumulants of V are zero and the Kullback-Leibler divergence between U and V becomes:

K(U||V) ≈ (1/4) Σ_{ij} (R^U_ij − k2(Vi)δij)² + (1/12) Σ_{ijk} (T^U_ijk − k3(Vi)δijk)² + (1/48) Σ_{ijkl} (Q^U_ijkl − k4(Vi)δijkl)²

where the Kronecker symbol δ equals 1 for identical indices and 0 otherwise. The minimization is achieved for k2(Vi) = R^U_ii, k3(Vi) = T^U_iii and k4(Vi) = Q^U_iiii; we hence obtain Eq. (3).

Remark 4. The mutual information estimation of U (up to the fourth order) contains the second-, third- and fourth-order cross-cumulants of U. As the cross-cumulants of independent variables vanish, the mutual information is logically approximated by a sum of squared cross-cumulants.
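Eq. (3) can be evaluated directly from empirical moments. The sketch below uses our own naming; it centres and normalizes the columns first, as Proposition 1 assumes, then sums the squared second-, third- and fourth-order cross-cumulants:

```python
import itertools
import numpy as np

def cumulant_mi(samples):
    """Approximate I(U1; ...; Un) via Eq. (3): a sum of squared empirical
    cross-cumulants up to order four, one column of `samples` per variable."""
    X = np.asarray(samples, dtype=float)
    X = (X - X.mean(axis=0)) / X.std(axis=0)   # centre and normalize

    def E(idx):                                # empirical moment E[X_i X_j ...]
        return float(np.mean(np.prod(X[:, list(idx)], axis=1)))

    n, mi = X.shape[1], 0.0
    for ij in itertools.product(range(n), repeat=2):
        if len(set(ij)) > 1:                   # skip identical-index tuples
            mi += E(ij) ** 2 / 4.0             # R_ij = E(X_i X_j)
    for ijk in itertools.product(range(n), repeat=3):
        if len(set(ijk)) > 1:
            mi += E(ijk) ** 2 / 12.0           # T_ijk = E(X_i X_j X_k)
    for ijkl in itertools.product(range(n), repeat=4):
        if len(set(ijkl)) > 1:
            i, j, k, l = ijkl                  # Q_ijkl: 4th-order cross-cumulant
            q = (E(ijkl) - E((i, j)) * E((k, l))
                 - E((i, k)) * E((j, l)) - E((i, l)) * E((j, k)))
            mi += q ** 2 / 48.0
    return mi
```

For independent columns all cross-cumulants vanish, so the estimate tends to zero as the sample size grows; for dependent columns the first term already contributes the squared correlation.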
4 First-Order Analysis

4.1 Cumulant-Based MIA
In the rest of the paper, we consider the Hamming weight (HW) of an intermediate value as the prediction (i.e. ϕ(HK) = HW(HK)). If the messages are randomly chosen, the distribution of the prediction ϕ(HK) is approximated by a Gaussian distribution. The distribution of the observation can also be considered close to a Gaussian form [9, 12, 19]. Let us consider a bivariate S(K) = [W̃, ϕ̃(HK)], where W̃ and ϕ̃(HK) represent the centred and normalized values of W and ϕ(HK). Assuming the Gaussian assumption is sufficiently respected for the components of S(K), according to Proposition 1 the mutual information between the components of S(K) can be approximated by:

I(S(K)) ≈ (1/4) Σ_{ij≠ii} (R^{S(K)}_ij)² + (1/12) Σ_{ijk≠iii} (T^{S(K)}_ijk)² + (1/48) Σ_{ijkl≠iiii} (Q^{S(K)}_ijkl)²    (4)
The estimation of probability density functions is transparent. Once the Gaussian assumption is respected, one just needs to use Eq.(4) to estimate directly
the mutual information. Compared to CPA, we only need to replace the correlation factor given by Eq. (1) with the mutual information given by Eq. (4) in the attack analysis procedure. By observing Eq. (4), one can see that the first term is nothing other than the correlation factor between W̃ and ϕ̃(HK), i.e. the correlation factor between W and ϕ(HK). The second and third terms represent higher-order dependencies of W and ϕ(HK).

4.2 Simulation Validation
In our simulations, we observe the output of a DES Sbox and assume that the relation between the power consumption and the Hamming weight of the Sbox output is linear. The observations Wi corresponding to messages mi are then simulated as Wi = HW(HK0,i) + B, where K0 is the correct key and B is Gaussian noise with variance σ². In this simulation, we consider the following methods:
– CPA: the classical CPA given by Eq. (1),
– Histogram-based MIA: the histogram-based MIA with 5 and 16 bins,
– Cumulant-based MIA: the cumulant-based MIA given by Eq. (4).
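The simulated leakage and the CPA distinguisher can be sketched as follows; as an assumption of ours, a 4-bit toy S-box (the first row of the DES S1 table) stands in for a full DES Sbox, and all names and parameters are illustrative:

```python
import numpy as np

# First row of the DES S1 table, used here as a 4-bit toy S-box.
SBOX = np.array([14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7])

def hw(x):
    """Hamming weight of each value in an integer array."""
    return np.array([bin(int(v)).count("1") for v in x])

def simulate_traces(msgs, key, sigma, rng):
    # W_i = HW(Sbox(m_i XOR K0)) + Gaussian noise of standard deviation sigma
    return hw(SBOX[msgs ^ key]) + rng.normal(0.0, sigma, size=len(msgs))

def cpa_guess(msgs, traces):
    # CPA distinguisher (Eq. (1)): the key maximizing |rho| is the guess
    scores = [abs(np.corrcoef(traces, hw(SBOX[msgs ^ k]))[0, 1])
              for k in range(16)]
    return int(np.argmax(scores))
```

With enough traces and moderate noise, the correct key's correlation dominates the "ghost" correlations of the wrong key hypotheses.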
Fig. 1. Simulation results with the linear model: success rate (%) versus number of messages (100 to 1000) for the cumulant-based MIA, the histogram-based MIA with 16 bins, CPA, and the histogram-based MIA with 5 bins; the left panel corresponds to σ = 3 and the right panel to σ = 5
The success rate is used to compare these methods. For each attack, we perform 100 repetitions to obtain an averaged success rate. Fig. 1 represents the success rate of the four methods when the number of messages varies from 100 to 1000. The left plot corresponds to noise level σ = 3 and the right plot to σ = 5. The simulation results show that the histogram with 5 bins gives a better result than the one with 16 bins, because the possible values of the prediction are 0, 1, 2, 3, 4 (i.e. 5 values). We see that the choice of the number of bins plays an important role in the efficiency of the histogram-based MIA attack. The cumulant-based MIA is slightly better than the histogram approach but not as good as the CPA. With a linear model, this result is consistent.

292

T.-H. Le and M. Berthier

In fact, the CPA, exploiting only the linear dependence between the components of S(K) (the first term of Eq. (4)), corresponds perfectly to the simulated linear model, so it is the best solution. Regarding MIA, when the linear model holds and the number of samples tends to infinity, some terms of the cumulant-based MIA estimation, for example the second term, should tend to zero. When the number of samples is small, this term contributes a non-zero noise to the cumulant-based estimator and reduces its efficiency compared to the CPA.

4.3 Experimental Confrontation
We consider in this section the signals from the DPA Contest 2008-2009 [22]. While the hypothesis of linearity between the power consumption and the Hamming weight (or Hamming distance) model is commonly used in side channel analysis, it does not always hold for real devices. The dependency between the power consumption and the Hamming weight of the output of an S-box in the last round of DES is given in Fig. 2. The figure shows that in this case the dependency between the observation and the prediction is not exactly linear.
[Figure 2 plots, for each of the 8 S-boxes, the power consumption (roughly 670 to 700) for Hamming weights HW = 0 to 4 of the S-box output.]

Fig. 2. Dependency between the power consumption and the Hamming weight of the output of 8 Sboxes. The values are computed from 80000 signals.
The methods tested in the simulation are evaluated with the signals from the DPA Contest 2008-2009. They are performed under the same conditions during 100 experiments, and the successful key detection rate is used to compare their efficiency. Fig. 3 shows the success rate of the methods when the number of messages varies from 100 to 500. Note that no particular key selection strategy is used during the key detection. From Fig. 3, we observe that the cumulant-based MIA method is better than the histogram-based ones. For example, with Sboxes 5, 6, 7 and 8, the cumulant-based method needs about 300 messages to reach a success rate of 90% while
[Figure 3 shows, for each of the 8 S-boxes, the success rate (%) of the four methods as the number of messages grows from 100 to 500.]

Fig. 3. Experimental results: Success rates of CPA, histogram-based MIA and cumulant-based MIA on 8 Sboxes at the last round of DES
the histogram-based MIA with 5 bins needs at least 600 messages to reach this threshold. We observe that the number of bins in the histogram-based MIA needs to be carefully selected. The efficiency of the cumulant-based MIA is not far from that of the CPA (compared to the simulation case with the linear model). For some S-boxes, they reach a success rate of 100% almost at the same time. From both simulation and experimental results, one can confirm the effectiveness of the proposed cumulant-based MIA method.
5 Second-Order Analysis

5.1 Masking Technique and Second-Order Analysis
Masking is a common solution to protect the device against first-order analysis. The idea behind this technique is to mask intermediate values with random masks. Let us consider the intermediate value HK0,i corresponding to message mi. With the masking technique, the device manipulates the masked data HK0,i ⊕ Mi instead of the direct value HK0,i. Let t1 and t2 denote the instants when the mask
Mi and the masked data HK0,i ⊕ Mi are manipulated. The power consumptions W1 at t1 and W2 at t2 are simulated as follows:

W1,i = ϕ(Mi) + B1
W2,i = ϕ(HK0,i ⊕ Mi) + B2

where B1 and B2 are Gaussian noises whose variances are σ1² = σ2² = σ². The second-order analysis aims at studying the dependency between three components: the observations W1 at t1 and W2 at t2, and the prediction ϕ(HK). In [21], Messerges focused on one bit and proposed to compute the correlation factor between the value of this bit and the absolute difference of the two observations W1 and W2 (the abs-diff-DPA method). Chari et al. [20] suggested calculating the correlation factor between the Hamming weight of the prediction HW(HK) and the product of the two observations W1 and W2 (the product-DPA method). In [10], it was shown that by replacing the product W1 W2 by the centred product (W1 − E(W1))(W2 − E(W2)), the product-DPA attack is significantly improved (the centred-product-DPA method). According to [9, 13], MIA is considered an appropriate solution for higher-order analysis.

5.2 Extending Cumulant-Based MIA to Second-Order Analysis
In this section, we show how to extend the first-order cumulant-based MIA to the second-order analysis. Consider now a tri-variate S(K) = [W̃1, W̃2, ϕ̃(HK)] where W̃1, W̃2 and ϕ̃(HK) are the centred and normalized values of W1, W2 and ϕ(HK). The mutual information approximation of the components of S(K) is always given by Eq. (4). The extension is thus transparent and straightforward.

In the second-order analysis, we know that there is no relation between any two of the three components of S(K). Therefore all terms with fewer than three distinct indices should be equal to zero. It is important to note that this argument does not depend on any power consumption model. It is only based on the evident aim of the masking technique and so does not reduce the generality of MIA. The mutual information estimation using cumulants can be improved by taking this remark into account. Hence, another estimator can be used in the second-order analysis:

I(S(K)) ≈ (1/12) Σ_{i≠j, j≠k, i≠k} (T_ijk^{S(K)})² + (1/48) Σ_{ijkl∈Q} (Q_ijkl^{S(K)})²    (5)
where Q is the set of all quadruples of indices {i, j, k, l} in which three indices are distinct.

We observe that in the second-order analysis, the first term T_ijk^{S(K)} of Eq. (5) is in fact the product of the three components of S(K). It is very close to the correlation factor between (W1 − E(W1))(W2 − E(W2)) and ϕ(HK), which is the metric used in the centred-product-DPA [10]. When only one bit is considered, this term is also equivalent to the abs-diff-DPA estimator proposed in [21]. This
interesting observation obtained from higher-order analysis confirms once again the generality of MIA.

Comparison with another approach. The mutual information of three variables can also be reduced to that of two variables. For example in [13], the mutual information between the three components of S(K) is rewritten as:

I(S(K)) = I(W1; W2; ϕ(HK)) = I(W1; W2) − I(W1; W2 | ϕ(HK))    (6)
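Eq. (6) suggests a direct estimation scheme: estimate I(W1; W2) and the conditional term with any bivariate estimator, the conditional term being a weighted average over the classes induced by the prediction. The sketch below uses our own names, with a simple Gaussian correlation-based bivariate estimator standing in for any first-order method.

```python
import numpy as np

def gaussian_mi(x, y):
    """Bivariate MI under a Gaussian assumption: -log(1 - rho^2) / 2."""
    rho = np.corrcoef(x, y)[0, 1]
    return -0.5 * np.log(1.0 - rho ** 2)

def combined_mi(w1, w2, phi, mi2=gaussian_mi):
    """Eq. (6): I(W1;W2;phi) = I(W1;W2) - I(W1;W2 | phi).

    The conditional term averages the bivariate MI over the classes
    induced by the prediction phi, weighted by class frequency.
    """
    n = len(phi)
    cond = sum((phi == v).sum() / n * mi2(w1[phi == v], w2[phi == v])
               for v in np.unique(phi))
    return mi2(w1, w2) - cond
```

For a correct prediction the conditional term is large, so the statistic moves far from zero (here it becomes negative); key selection then compares this value across key guesses.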
The mutual information I(W1; W2; ϕ(HK)) is estimated via I(W1; W2 | ϕ(HK)) and I(W1; W2). Any estimation method of the first-order analysis, including the cumulant-based MIA, can be used to approximate I(W1; W2 | ϕ(HK)) and I(W1; W2). As the distributions of W1 and W2 are Gaussian, the mutual information I(W1; W2) and I(W1; W2 | ϕ(HK)) can be estimated by cumulants. We call this method the combined MIA method. In this case, any function ϕ, for example the identity, Hamming weight or Hamming distance function, is valid.

In the generic case, one uses the N-order analysis to exploit the dependence between the prediction and (N − 1) observations. The mutual information of N components is estimated via that of (N − 1) components, and so on down to the mutual information of two components. With the cumulant-based MIA, the extension is easier: we just need to consider an N-variate S(K) = [W̃1, W̃2, ..., W̃_{N−1}, ϕ̃(HK)] and use the cumulant-based estimator to approximate the mutual information of S(K). If the order N is higher than four, we can extend the estimator given by Eq. (4) to higher orders, for example the fifth and sixth orders.

5.3 Simulation Results
In this section, the simulation parameters are similar to those in Section 4.2, but masking is applied to the outputs of the S-boxes. We still assume a linear model. The following methods are considered:
– Cumulant-based MIA given by Eq. (4),
– Histogram-based MIA given by Eq. (6), where the mutual information I(W1; W2) and I(W1; W2 | ϕ(HK)) are estimated by histograms of 5 or 16 bins,
– Combined MIA given by Eq. (6), with the mutual information I(W1; W2) and I(W1; W2 | ϕ(HK)) given by Eq. (4),
– Product-DPA method proposed in [20],
– Product of the 3 components of S(K) (the first term of Eq. (5)),
– Centred-product-DPA method proposed in [10].

Fig. 4 represents the success rate of the different methods as a function of the number of messages. Three noise levels (σ = 1, 2, 5) are considered. According to the figure, the results given by the centred-product-DPA method and the product of the 3 components of S(K) are the best ones. The results of these two methods are, as expected, close to each other. When the noise is high, the method using the product of the 3 components of S(K) is better than the centred-product-DPA method. As a linear model is assumed, it is not surprising that these two methods work well.
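The centred-product distinguisher of [10] can be sketched on this masked model. Everything below is our illustrative choice, not the paper's code: the 4-bit S-box (taken from the first row of DES S-box S1), the parameter values, and the selection of the most negative correlation, which is how a Hamming-weight leakage shows up in the centred product (the conditional mean of the product decreases with the predicted Hamming weight).

```python
import numpy as np

SBOX = np.array([14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7])
HW = np.array([bin(v).count("1") for v in range(16)])

def centred_product_dpa(k0=5, n=20000, sigma=1.0, seed=3):
    rng = np.random.default_rng(seed)
    m = rng.integers(0, 16, n)
    mask = rng.integers(0, 16, n)
    # masked implementation: the mask, then the masked S-box output, leak
    w1 = HW[mask] + rng.normal(0, sigma, n)
    w2 = HW[SBOX[m ^ k0] ^ mask] + rng.normal(0, sigma, n)
    cp = (w1 - w1.mean()) * (w2 - w2.mean())          # centred product
    rhos = [np.corrcoef(cp, HW[SBOX[m ^ k]])[0, 1] for k in range(16)]
    # a Hamming-weight leakage makes the centred product anti-correlated
    # with the correct prediction, so the correct key is the most negative
    return int(np.argmin(rhos))
```

With these parameters the correct key stands out clearly; raising σ rapidly increases the number of messages needed, as Fig. 4 illustrates.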
[Figure 4 plots the success rate (%) of the seven methods listed above against the number of messages, in three panels for σ = 1, σ = 2 and σ = 5; the message range grows with the noise level, up to 8·10^4 messages for σ = 5.]

Fig. 4. Simulation results for second-order analysis with different noise levels
The cumulant-based MIA given by Eq. (4) works well in the first two cases (σ = 1 and σ = 2) but is sensitive to a high noise level (σ = 5). The combined MIA can be a good choice: it is less efficient than the cumulant-based method when the noise is weak, but much more powerful when the noise level is significant. Compared to the histogram-based methods, the combined MIA is always better, particularly when the noise increases. One can also observe that the performance of the centred-product-DPA gets close to that of the product-DPA when the noise increases. In fact, the former method is much more powerful than the latter when the noise is small, but they tend to the same order when the noise is high. In conclusion, the simulation results validate the cumulant-based MIA in the second-order analysis. Even though the cumulant-based MIA and the combined MIA are not the best solutions when the model is linear, we believe that in the general case they will be good choices to exploit both linear and non-linear dependencies between the observation and the prediction.
6 Conclusion
This paper presents a new approach to MIA based on the Edgeworth expansion, which leads to the use of higher-order cumulants to estimate the mutual information. The proposed cumulant-based MIA was validated in both the first- and second-order analysis. Another advantage of the proposed method is that it is practical to implement, particularly in higher-order analysis.

An interesting point of the paper is the study of MIA under the view of higher-order statistics. This approach on the one hand gives us a concrete representation of the mutual information, and on the other hand reveals the similarity (when the model is linear) between MIA and some classical
attacks such as the CPA in the first-order attack, and the abs-diff-DPA and the product-DPA in the second-order analysis. From our point of view, this remark is important since it shows the generality of MIA (covering both linear and non-linear dependencies). In particular cases (for example the linear model), some terms of the cumulant-based MIA estimator, corresponding to particular dependencies, contribute more significantly to the mutual information than the others.
References

1. Kocher, P., Jaffe, J., Jun, B.: Differential Power Analysis. In: Wiener, M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 388–397. Springer, Heidelberg (1999)
2. Messerges, T.S., Dabbish, E.A., Sloan, R.H.: Examining Smart-Card Security under the Threat of Power Analysis Attacks. IEEE Transactions on Computers 51(5), 541–552 (2002)
3. Bevan, R., Knudsen, E.: Ways to Enhance DPA. In: Lee, P.J., Lim, C.H. (eds.) ICISC 2002. LNCS, vol. 2587, pp. 327–342. Springer, Heidelberg (2003)
4. Mayer-Sommer, R.: Smartly Analysing the Simplicity and the Power of Simple Power Analysis on Smartcards. In: Paar, C., Koç, Ç.K. (eds.) CHES 2000. LNCS, vol. 1965, pp. 78–92. Springer, Heidelberg (2000)
5. Coron, J.S., Kocher, P., Naccache, D.: Statistics and Secret Leakage. In: Omicini, A., Tolksdorf, R., Zambonelli, F. (eds.) ESAW 2000. LNCS (LNAI), vol. 1972, pp. 157–173. Springer, Heidelberg (2000)
6. Brier, E., Clavier, C., Olivier, F.: Correlation Power Analysis with a Leakage Model. In: Joye, M., Quisquater, J.-J. (eds.) CHES 2004. LNCS, vol. 3156, pp. 16–29. Springer, Heidelberg (2004)
7. Gierlichs, B., Batina, L., Tuyls, P., Preneel, B.: Mutual Information Analysis: A Generic Side-Channel Distinguisher. In: Oswald, E., Rohatgi, P. (eds.) CHES 2008. LNCS, vol. 5154, pp. 426–442. Springer, Heidelberg (2008)
8. Aumonier, S.: Generalized Correlation Power Analysis. In: Proceedings of the Ecrypt Workshop Tools For Cryptanalysis 2007, Poland (September 2007)
9. Prouff, E., Rivain, M.: Theoretical and Practical Aspects of Mutual Information Based Side Channel Analysis. In: Abdalla, M., Pointcheval, D., Fouque, P.-A., Vergnaud, D. (eds.) ACNS 2009. LNCS, vol. 5536, pp. 499–518. Springer, Heidelberg (2009)
10. Prouff, E., Rivain, M., Bevan, R.: Statistical Analysis of Second Order Differential Power Analysis. IEEE Transactions on Computers 58(6), 799–811 (2009)
11. Veyrat-Charvillon, N., Standaert, F.-X.: Mutual Information Analysis: How, When and Why. In: Clavier, C., Gaj, K. (eds.) CHES 2009. LNCS, vol. 5747, pp. 429–443. Springer, Heidelberg (2009)
12. Moradi, A., Mousavi, N., Paar, C., Salmasizadeh, M.: A Comparative Study of Mutual Information Analysis under a Gaussian Assumption. In: Youm, H.Y., Yung, M. (eds.) WISA 2009. LNCS, vol. 5932, pp. 193–205. Springer, Heidelberg (2009)
13. Gierlichs, B., Batina, L., Preneel, B., Verbauwhede, I.: Revisiting Higher-Order DPA Attacks: Multivariate Mutual Information Analysis. In: Pieprzyk, J. (ed.) CT-RSA 2010. LNCS, vol. 5985, pp. 221–234. Springer, Heidelberg (2010)
14. Daub, C.O., Steuer, R., Selbig, J., Kloska, S.: Estimating Mutual Information Using B-Spline Functions - An Improved Similarity Measure for Analysing Gene Expression Data. BMC Bioinformatics (2004), http://www.ncbi.nlm.nih.gov/pmc/articles/PMC516800/
15. Comon, P.: Independent Component Analysis, A New Concept? Signal Processing 36(3), Special Issue on Higher-Order Statistics, 287–314 (1994)
16. McCullagh, P.: Tensor Methods in Statistics, ch. 5. Chapman and Hall, London (1987), http://www.stat.uchicago.edu/~pmcc/tensorbook/
17. Georgiev, P., Ralescu, A., Ralescu, D.: Cross-cumulants Measure for Independence. Journal of Statistical Planning and Inference 137, 1085–1098 (2006)
18. Dempster, A., Laird, N., Rubin, D.: Maximum Likelihood from Incomplete Data via the EM Algorithm. J. Roy. Statist. Soc., Ser. B 39(1), 1–38 (1977)
19. Flament, F., Guilley, S., Danger, J.-L., Elaabid, M.A., Maghrebi, H., Sauvage, L.: About Probability Density Function Estimation for Side Channel Analysis. In: Proceedings of the First International Workshop on Constructive Side-Channel Analysis and Secure Design (COSADE 2010), Darmstadt, Germany (February 2010)
20. Chari, S., Jutla, C.S., Rao, J.R., Rohatgi, P.: Towards Sound Approaches to Counteract Power-Analysis Attacks. In: Wiener, M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 398–412. Springer, Heidelberg (1999)
21. Messerges, T.S.: Using Second-Order Power Analysis to Attack DPA Resistant Software. In: Paar, C., Koç, Ç.K. (eds.) CHES 2000. LNCS, vol. 1965, pp. 238–251. Springer, Heidelberg (2000)
22. DPA Contest 2008/2009, http://projets.comelec.enst.fr/dpacontest/index.php
23. Kendall, M.G., Stuart, A.: The Advanced Theory of Statistics, 2nd edn. Charles Griffin & Company Limited, London (1963)
24. Venelli, A.: Efficient Entropy Estimation for Mutual Information Analysis Using B-Splines. In: Samarati, P., Tunstall, M., Posegga, J., Markantonakis, K., Sauveron, D. (eds.) WISTP 2010. LNCS, vol. 6033, pp. 17–30. Springer, Heidelberg (2010)
Appendix 1

Moments and cumulants are statistical measures which characterize signal properties. The first-order moment (the mean) and the second-order cumulant (the variance) have been widely used to characterize the probability distribution of a signal. If a signal has a Gaussian probability density function, the first- and second-order measures are sufficient to characterize it. However, many real-life signals are non-Gaussian, and Higher-Order Statistics (moments and cumulants of orders higher than 2) are needed to fully describe them.

Consider a one-dimensional real random variable U with its first and second characteristic functions. The moments of U can be obtained by differentiating the first characteristic function at the point 0, whereas the cumulants can be obtained by differentiating the second characteristic function at the point 0 [23]. The r-th order cumulant is a function of the moments of orders up to r. If the variable U is centred (i.e. μ1(U) = 0), for the orders from 1 to 4, these relations are:
k1(U) = 0
k2(U) = μ2(U) = E[U²]
k3(U) = μ3(U) = E[U³]
k4(U) = μ4(U) − 3μ2(U)² = E[U⁴] − 3E[U²]²

where μr(U) and kr(U) are the r-th order moment and cumulant, respectively.
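These moment-cumulant relations are easy to check numerically. The function below is our own sketch (names are ours); for a Gaussian sample k3 and k4 are close to zero, while a skewed distribution such as the exponential gives clearly non-zero values.

```python
import numpy as np

def cumulants_1to4(u):
    """First four cumulants of a sample via the centred-moment relations."""
    u = u - u.mean()                       # centre the sample: mu_1 = 0
    mu2, mu3, mu4 = (np.mean(u ** r) for r in (2, 3, 4))
    return 0.0, mu2, mu3, mu4 - 3 * mu2 ** 2

rng = np.random.default_rng(0)
g = cumulants_1to4(rng.normal(size=200000))       # Gaussian: k3, k4 near 0
e = cumulants_1to4(rng.exponential(size=200000))  # skewed: k3, k4 clearly > 0
```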
Appendix 2

Proof of Lemma 2. We set:

αU = (k2(U) − 1)/2, αV = (k2(V) − 1)/2, βU = k3(U)/3!, βV = k3(V)/3!, γU = k4(U)/4!, γV = k4(V)/4!.

The Kullback-Leibler divergence of U and V is given by:

K(U‖V) = ∫_R pU(x) log( pU(x)/pV(x) ) dx
       ≈ ∫_R pU(x) log(1 + αU h2(x) + βU h3(x) + γU h4(x)) dx − ∫_R pU(x) log(1 + αV h2(x) + βV h3(x) + γV h4(x)) dx

From the expansion log(1 + x) = x − x²/2! + o(x²), we have:

A := log(1 + αU h2(x) + βU h3(x) + γU h4(x)) − log(1 + αV h2(x) + βV h3(x) + γV h4(x))
   ≈ (αU − αV) h2(x) + (βU − βV) h3(x) + (γU − γV) h4(x)
     − ((αU² − αV²)/2) h2(x)² − ((βU² − βV²)/2) h3(x)² − ((γU² − γV²)/2) h4(x)²
     − (αU βU − αV βV) h2(x) h3(x) − (αU γU − αV γV) h2(x) h4(x) − (βU γU − βV γV) h3(x) h4(x)

Let us rewrite K(U‖V) using the Edgeworth form of pU:

K(U‖V) ≈ ∫_R pU(x) · A dx ≈ ∫_R n(x) [1 + αU h2(x) + βU h3(x) + γU h4(x)] · A dx
       = ∫_R A · n(x) dx + ∫_R A [αU h2(x) + βU h3(x) + γU h4(x)] n(x) dx.

We recall the following properties of the Chebyshev-Hermite polynomials:

∫_R n(x) hk(x) dx = 0 for all k ≥ 1,
∫_R n(x) hk(x)² dx = k! for all k ≥ 1,
∫_R n(x) hk(x) hl(x) dx = 0 for all k ≠ l.

The first term becomes:

∫_R A · n(x) dx = −2! (αU² − αV²)/2 − 3! (βU² − βV²)/2 − 4! (γU² − γV²)/2

The second term is computed similarly; keeping only terms of at most second degree, we obtain:

∫_R A [αU h2(x) + βU h3(x) + γU h4(x)] n(x) dx ≈ 2! (αU − αV) αU + 3! (βU − βV) βU + 4! (γU − γV) γU.

Finally, by combining the two terms, we obtain:

K(U‖V) ≈ −2! (αU² − αV²)/2 − 3! (βU² − βV²)/2 − 4! (γU² − γV²)/2 + 2! (αU − αV) αU + 3! (βU − βV) βU + 4! (γU − γV) γU
       = (αU − αV)² + 3 (βU − βV)² + 12 (γU − γV)²
       = (k2(U) − k2(V))²/4 + (k3(U) − k3(V))²/12 + (k4(U) − k4(V))²/48
Known-Key Attacks on Rijndael with Large Blocks and Strengthening ShiftRow Parameter

Yu Sasaki

NTT Information Sharing Platform Laboratories, NTT Corporation
3-9-11 Midoricho, Musashino-shi, Tokyo 180-8585, Japan
[email protected]

Abstract. In this paper, we present known-key attacks on the block cipher Rijndael for the 192-bit block and the 256-bit block. Our attacks work up to 8 rounds for the 192-bit block and 9 rounds for the 256-bit block, which is one round longer than the previous best known-key attacks. We then search for parameters for the ShiftRow operation which are stronger against our attacks than the one in the Rijndael specification. Finally, we show a parameter for the 192-bit block which forces attackers to activate more bytes to generate a truncated differential path, and thus enhances the security against our attacks.

Keywords: Rijndael, known-key attack, Super-Sbox analysis, truncated differential path, ShiftRow.
1 Introduction
Block ciphers play important roles in various aspects of our life. To evaluate the security of block ciphers, cryptanalysis is important. Block ciphers are sometimes used as stream ciphers or hash functions through modes of operation. Recently, the security of block ciphers when they are instantiated in the hash function mode has been analyzed. Because many hash functions, even in the currently conducted SHA-3 competition [1], are designed based on block ciphers, this kind of analysis is useful. The known-key attack proposed by Knudsen and Rijmen [2] is the framework for this context. In this model, a secret key is given to attackers. Then, attackers aim to efficiently detect a certain property of a random instance of the block cipher, where the same property cannot be observed for an ideal permutation with the same complexity.

In this paper, we analyze the security of the Rijndael block cipher [3], which is the base of the Advanced Encryption Standard (AES) [4,5]. Different from AES, the original version of Rijndael supports three different block sizes: 128, 192, and 256 bits. Among these variants, the state size and the number of rounds computed are different. Several papers have reported security analyses of Rijndael with large block sizes [3,6,7,8,9,10,11,12].

In this paper, we study known-key attacks on Rijndael and AES. The previous results can be classified into two approaches: the integral-based approach and the differential-based approach.

I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 301–315, 2010.
© Springer-Verlag Berlin Heidelberg 2010

302

Y. Sasaki

The first known-key attack was presented by Knudsen and Rijmen [2], who found a non-ideal property of 7-round AES (Rijndael with 128-bit block) by using the integral attack. Then, Minier et al. [11] applied this attack to 7-round Rijndael with 192-bit block and 8-round Rijndael with 256-bit block. These are the current best known-key attacks with an integral-based approach.

On the other hand, Mendel et al. presented a known-key attack on 7-round AES with a differential-based approach [13], where the attack is based on the rebound attack proposed by Mendel et al. [14]. Then, Gilbert and Peyrin [15] and Lamberger et al. [16] independently applied Super-Sbox analysis to the rebound attack. Gilbert and Peyrin [15] showed that 8-round AES is not ideal in the known-key setting. This is the current best result on the 128-bit block with a differential-based approach. The 192- and 256-bit blocks have not been evaluated with the differential-based approach.

Our Contributions. In this paper, we present known-key attacks on the block cipher Rijndael for 192-bit and 256-bit blocks. Different from the previous work by Minier et al. [11], we use the differential-based approach. This enables us to extend the number of attacked rounds by one more round. Our attack on the 192-bit block finds a certain property with a complexity of 2^48 and 2^32 amount of memory, while identifying the same property for an ideal permutation requires a complexity of 2^128. Our attack on the 256-bit block finds a certain property with a complexity of 2^48 and 2^32 amount of memory, while identifying the same property for an ideal permutation requires a complexity of 2^64. The attack results are summarized in Table 1.
Table 1. Comparison of attack results

Block size | Integral-based analysis | Differential-based analysis
128        | 7 rounds [2]            | 8 rounds [15]
192        | 7 rounds [11]           | 8 rounds (Ours)
256        | 8 rounds [11]           | 9 rounds (Ours)
The attack efficiency depends on the diffusion mechanism, including the parameter for the ShiftRow operation. In fact, when the parameter was determined, the resistance against attacks using truncated differentials was considered by the designers [3]. However, the details of how they considered this resistance were not explained. In this research, we search for new parameters which can prevent truncated differential paths with a small number of active bytes, and thus enhance the security against truncated differential attacks. With the parameter defined in the specification, the following truncated differential path can be constructed:

1 → 4 → 16 → 16 → 4 → 1

We show that by changing the parameter, the above path cannot be constructed and attackers need to use the following path instead.
1 → 4 → 16 → 32 → 8 → 2

Due to the additional active bytes, satisfying this truncated differential path requires a higher complexity than the previous path, and thus the new parameter enhances the security against attacks using truncated differentials. We believe that when primitives based on Rijndael or AES with larger block sizes are designed, designers should take this kind of observation into account.

Paper outline. This paper is organized as follows. In Section 2, we describe the specification of Rijndael. In Section 3, we introduce the previous work. In Section 4, we present our known-key attacks on Rijndael 192- and 256-bit blocks. In Section 5, we show the new parameter for the ShiftRow operation which increases the security against our attacks. In Section 6, we conclude this paper.
2 Specifications
The block cipher Rijndael was designed by Daemen and Rijmen in 1998 and was selected as the Advanced Encryption Standard (AES) in 2000 [4,5]. The original version of Rijndael supports three different block sizes (128, 192, and 256 bits) and three different key sizes (128, 192, and 256 bits). The number of rounds (Nr) computed inside the encryption and decryption procedures depends on the block and key sizes, as defined in Table 2.

Table 2. Number of rounds for Rijndael

Nr          | 128-bit block | 192-bit block | 256-bit block
128-bit key | 10            | 12            | 14
192-bit key | 12            | 12            | 14
256-bit key | 14            | 14            | 14
In the Key Schedule function, round keys are generated from the original secret key. Because round keys are regarded as given constant numbers in the known-key setting, the key schedule function does not have any impact on the known-key attack. Hence, we omit its description.

In the encryption procedure, the internal states for 128-, 192-, and 256-bit blocks are represented by 4 × 4, 4 × 6, and 4 × 8 byte arrays, respectively. First, the original key is XORed to the plaintext; then, a round operation is iteratively applied to the state Nr times. The round operation consists of the following four computations.

ByteSub (BS): substitute each byte of the state according to an S-box table.
ShiftRow (SR): apply an sj-byte left rotation to each byte of row j (j = 0, 1, 2, 3) of the state, where (s0, s1, s2, s3) is defined as (0,1,2,3), (0,1,2,3), and (0,1,3,4) for 128-, 192-, and 256-bit blocks, respectively.
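The ShiftRow offsets just quoted can be sketched directly; the function and names below are our own illustration (state as a 4 × Nb byte array, Nb being the number of columns), not code from the Rijndael reference implementation.

```python
import numpy as np

# ShiftRow offsets (s0..s3) per number of columns Nb, from the specification:
# Nb = 4 (128-bit) and Nb = 6 (192-bit): (0,1,2,3); Nb = 8 (256-bit): (0,1,3,4)
OFFSETS = {4: (0, 1, 2, 3), 6: (0, 1, 2, 3), 8: (0, 1, 3, 4)}

def shift_row(state):
    """Rotate row j of a 4 x Nb state left by s_j bytes."""
    nb = state.shape[1]
    return np.array([np.roll(row, -off)
                     for row, off in zip(state, OFFSETS[nb])])
```

Note how for the 256-bit block the offsets jump from 1 to 3; this irregular choice is exactly the kind of parameter the paper later re-examines.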
MixColumn (MC): multiply each column of the state by a maximum-distance separable (MDS) matrix. The MDS property guarantees that the sum of active bytes in the input and output of the MixColumn operation is at least 5, unless all bytes are non-active.
AddRoundKey (AK): apply a bit-wise exclusive-or with a round key.

Note that the MixColumn operation is not computed in the last round. According to the designers [3], the parameters for the ShiftRow operation were determined to satisfy the following properties.

1. The four offsets are different and the parameter for the first row is 0;
2. Resistance against attacks using truncated differentials;
3. Resistance against the integral attack;
4. Simplicity.

3 Previous Work

3.1 Known-Key Attacks on Rijndael
The framework of the known-key attack. The concept of the known-key attack was introduced by Knudsen and Rijmen at Asiacrypt 2007 [2], and was later formalized by Minier et al. at Africacrypt 2009 [11]. Informally, in this framework, attackers proceed as follows. Before the secret key becomes public, attackers define a property which they aim to detect efficiently for the target block cipher. Then, the complexity of detecting the same property for an ideal cipher, denoted by CI, is determined. After that, the key value is given to the attackers. Finally, if the attackers can detect the declared property with the given key faster than CI, the target block cipher is said to be non-ideal.

Integral-based approach for the known-key attack. Attackers collect a set of plaintexts (resp. ciphertexts) satisfying a specific form, so that the set of texts processed through the partial encryption (resp. decryption) also satisfies a certain property. Knudsen and Rijmen applied the integral attack in the known-key setting [2] and showed that AES reduced to 7 rounds is not ideal against known-key attacks. Minier et al. applied the integral attack to Rijndael with larger block sizes [11] and showed that the number of attacked rounds is the same (7 rounds) for the 192-bit block and extends to 8 rounds for the 256-bit block.

Differential-based approach for the known-key attack. At FSE 2009, Mendel et al. proposed a framework of truncated-differential-based analysis called the rebound attack, which works against AES-based primitives with fixed key input [14]. The rebound attack divides a differential path into two parts, the inbound and outbound phases. In the inbound phase, attackers can control the most expensive part of the differential path with a very low average complexity. Then the outbound differential path is satisfied probabilistically. With this
framework, Mendel et al. showed known-key attacks on AES reduced to 7 rounds [13]. After that, Super-Sbox analysis was applied to this framework by Gilbert and Peyrin [15] and by Lamberger et al. [16], independently. It combines 2 non-linear layers (ByteSub) and 1 diffusion layer into 1 non-linear layer with a larger substitution box named the Super-Sbox. This extends the inbound phase of the rebound attack by one more round. Gilbert and Peyrin [15] showed that AES reduced to 8 rounds is not ideal by using the Super-Sbox analysis. They showed that a pair of values which has 4-byte differences in the input and 4-byte differences in the output can be obtained with 2^48 computations and 2^32 amount of memory, while detecting the same paired values for an ideal cipher would cost 2^64 by a limited birthday problem. Note, however, that the fact that no better generic attack is known does not exclude the possibility of the presence of such an attack. Also note that there are approaches to give a lower bound for the generic attack, e.g. [16,17]; the limited birthday problem does not take this approach.

Mendel et al. [18] showed several attacks on the Grøstl-512 [19] hash function, which has an AES-based design with an 8 × 16 byte-array state. Due to the similarity of the state size, the attacks on Grøstl-512, especially the truncated differential path, are useful for considering the attack on Rijndael with 256-bit block. Note that there are several other approaches to extending the rebound attack, for example, the multiple inbound phase by Matusiewicz et al. [20] and considering more details of several AES-based constructions by Peyrin [21,22]. So far, the Super-Sbox approach is the most successful against plain AES or Rijndael.

3.2 Introduction of the Rebound Attack and Super-Sbox Analysis
Rebound attack. The rebound attack can find a pair of values that satisfies the truncated differential path over 2 AES rounds shown in Fig. 1.
(Figure diagrams omitted: round-by-round sequences of ByteSub (BS), ShiftRow (SR), MixColumn (MC), and AddRoundKey (AK), annotated with the states #FOR, #M, #BACK, and #B2, the match of differences in the middle, and "Try 2^32 values". Active and static bytes are distinguished in the diagrams.)
Fig. 1. Rebound attack procedure
Fig. 2. Super-Sbox analysis procedure (the double-dotted arrow line denotes the Super-Sbox part)
Y. Sasaki
To achieve such a pair, the simplest method is to randomly generate a pair that has 4-byte differences in the first column at state #FOR, and check whether the differences after 2 rounds match the ones shown in Fig. 1. Due to the MixColumn operation in the middle round, this succeeds with probability 2^-96. The rebound attack instead finds such a pair with a different approach. First, an attacker chooses 2^8 differences of the state denoted by #FOR in Fig. 1. Then, she processes these differences until just before the next ByteSub operation and stores the results in a table T. Note that the values are not determined at this stage. Next, the attacker chooses a difference of the state denoted by #BACK in Fig. 1. Then, she processes this difference until just after the next ByteSub operation and compares the result with the differences stored in T. Due to the design strategy of the AES S-box, a randomly given pair of an input and an output difference of the S-box has solutions with probability about 2^-1, and the number of solutions is then approximately 2. As a result, by checking the match of a difference from #BACK against the 2^8 differences in T, 2^8 · 2^-16 = 2^-8 matching pairs are expected. Hence, by iterating the computation from #BACK 2^8 times, we expect to find a pair of differences of #FOR and #BACK that can be connected. Each connected pair has 2 solutions for each byte, and we thus obtain 2^16 solutions for each connected pair. In summary, we obtain 2^16 solutions of these two rounds with 2^8 computations from #FOR and #BACK. These solutions are used to satisfy the truncated differential path outside of these two rounds. Note that we can choose at most 2^32 differences for #FOR and #BACK. Hence, the check of the outside path can be performed up to 2^64 times.

Super-Sbox Analysis. Super-Sbox analysis extends the number of rounds that are efficiently connected. It satisfies the differential path for 3 AES rounds shown in Fig. 2.
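The S-box property used in the rebound attack above, that a random input/output difference pair of the AES S-box has solutions with probability about 2^-1 and then about 2 solutions on average, can be checked by tallying the difference distribution table. The following sketch (an illustration, not part of the attack itself) rebuilds the AES S-box from inversion in GF(2^8) and the affine map:

```python
# Sketch: verify the AES S-box differential property used by the rebound attack.
def gf_mul(a, b):  # multiplication in GF(2^8) modulo x^8+x^4+x^3+x+1 (0x11b)
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
        b >>= 1
    return r

def sbox(x):  # AES S-box: field inversion followed by the affine map (+0x63)
    inv = next((t for t in range(1, 256) if gf_mul(x, t) == 1), 0) if x else 0
    r = 0
    for i in range(8):
        bit = (inv >> i) ^ (inv >> ((i + 4) % 8)) ^ (inv >> ((i + 5) % 8)) \
            ^ (inv >> ((i + 6) % 8)) ^ (inv >> ((i + 7) % 8))
        r |= (bit & 1) << i
    return r ^ 0x63

S = [sbox(x) for x in range(256)]
assert S[0] == 0x63 and S[1] == 0x7C  # spot-check against the standard table

solvable = total = solutions = 0
for din in range(1, 256):
    row = [0] * 256
    for x in range(256):
        row[S[x] ^ S[x ^ din]] += 1   # x and x^din land in the same bucket
    for dout in range(1, 256):
        total += 1
        if row[dout]:
            solvable += 1
            solutions += row[dout]

print(solvable / total)      # close to 1/2: a random pair has solutions
print(solutions / solvable)  # close to 2: ~2 solutions per solvable pair
```

The tallies match the counting used in the rebound attack: roughly half of all difference pairs are solvable, with about two solutions each on average.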
First, an attacker chooses all 2^32 possible differences of the state denoted by #FOR in Fig. 2. Then, she processes these differences until just before the next ByteSub operation, which is denoted by #M (M stands for Match), and stores the results in a table T. Next, the attacker chooses a difference of the state denoted by #BACK in Fig. 2. Then, she processes this difference until just after the next ByteSub operation, which is denoted by #B2. After that, the attacker computes the Super-Sbox part, from #B2 to #M, column by column. The bytes involved in computing a Super-Sbox are stressed by bold squares in Fig. 2. Namely, for all 2^32 possible values and a determined difference of column j, for j = 0, 1, 2, 3, in #B2, she computes the corresponding diagonal 4-byte paired values at the state #M. The attacker stores the resulting values and differences in a table T_j. The Super-Sbox computation is iterated for all j, and 4 tables T_0, T_1, T_2, and T_3, each containing 2^32 entries, are generated. Finally, for any entry in T, the attacker finds the items in T_0, T_1, T_2, and T_3 whose differences match the one in T. Because each T_j has 2^32 entries, we expect to obtain 1 entry on average. The Super-Sbox analysis requires 2^32 computations and 2^32 memory. Then, 2^32 paired values satisfying the differential path are generated. Hence, 1 solution is generated with 1 computation on average. The attacker can choose the difference of #BACK from 2^32 candidates. Therefore, the analysis can be repeated up
to at most 2^32 times. Finally, we conclude that the number of candidates for checking the outer path is at most 2^64.
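The Super-Sbox idea, combining two S-box layers and the diffusion layer in between into one larger S-box that can be treated as a single non-linear permutation, can be illustrated at toy scale. The sketch below uses a made-up 4-bit S-box and a toy invertible mixing step (not the AES components) on a 2-nibble "column" and checks that the composition is again a permutation that can be tabulated once and reused:

```python
# Toy illustration of the Super-Sbox idea: S-box layer, key addition, diffusion,
# S-box layer composed into one wider bijection. The 4-bit S-box, the mixing
# step, and the key are made up for the illustration; they are not AES parts.
S4 = [0x6, 0x4, 0xC, 0x5, 0x0, 0x7, 0x2, 0xE,
      0x1, 0xF, 0x3, 0xD, 0x8, 0xA, 0x9, 0xB]   # a 4-bit permutation

def rotl4(v):                       # rotate a nibble left by one bit
    return ((v << 1) & 0xF) | (v >> 3)

def mix(a, b):                      # invertible toy diffusion (Feistel-like)
    a ^= rotl4(b)
    b ^= rotl4(a)
    return a, b

def super_sbox(x, key=0x3A):        # 8-bit "column": S, key add, mix, S
    a, b = x >> 4, x & 0xF
    a, b = S4[a], S4[b]
    a, b = a ^ (key >> 4), b ^ (key & 0xF)
    a, b = mix(a, b)
    return (S4[a] << 4) | S4[b]

table = [super_sbox(x) for x in range(256)]
# Each layer is bijective, so the composition is one bijection on 8 bits:
# it can be tabulated and searched as a single larger S-box, which is
# exactly what Super-Sbox analysis exploits per column.
print(len(set(table)) == 256)       # True: a permutation of 0..255
```

In the actual analysis the same observation is applied per 32-bit column of the AES state, giving the tables T_0, ..., T_3 of 2^32 entries each.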
4  Known-Key Attacks on Rijndael with Large Blocks
In this section, we describe our attacks on Rijndael with larger block sizes. The numbers of attacked rounds are 8 and 9 for the 192-bit block and the 256-bit block, respectively. In the attack on the 192-bit block, we find a pair of values which has 4-byte differences in both the input and the output with a complexity of 2^48 and 2^32 memory, while finding the same property in an ideal permutation requires 2^128. For the 256-bit block, we find a pair of values which has 16-byte differences in both the input and the output with a complexity of 2^48 and 2^32 memory, while the ideal case requires 2^64. The most important part of the attack is determining the truncated differential path. The following points must be considered in the attack.
– The differential propagation is consistent with the MDS property.
– The differential propagation for the outbound phase succeeds with a relatively high probability.
– Enough degrees of freedom are available to satisfy the outbound phase.
– The complexity for finding the same property in an ideal permutation is higher than that of the proposed attack.
In Section 4.1, we show the differential path for the Rijndael 256-bit block and a detailed attack procedure. We then apply the same analysis to the Rijndael 192-bit block in Section 4.2.

4.1  Attacks on 256-Bit Block
Truncated Differential Path. We show the truncated differential path for the 9-round Rijndael 256-bit block in Fig. 3. The number of active bytes in the path propagates as follows (round by round, 1R through 9R):

16 → 4 → 1 → 4 → 16 → 28 → 8 → 5 → 16 → 16

The above differential path is generated so that the numbers of active bytes in the input states to the 3rd and 8th rounds are small. Note that we cannot make a differential path activating only 1 byte in both of these two states. If we did so, the number of active bytes in the input of the MixColumn operation in the 5th round would be 16, and in the output it would also be 16. However, in order to satisfy the MDS property, the sum of the numbers of active bytes in the input and output needs to be 40 or greater. Hence, such an efficient differential path cannot be constructed.

Inbound phase. The inbound phase starts from the state immediately after the ByteSub operation in the 4th round and the state just before the ByteSub operation in the 7th round. The goal of the inbound phase is to generate 2^32 paired values that satisfy the differential path for the inbound phase with a
(Figure diagram omitted: rounds 1–9 of the path, each round applying BS, SR, MC, AK; the MixColumn operations in rounds 3 and 7 are annotated with probability 2^-24.)
Active bytes are filled in grey. Inside the broken line is the inbound phase. The Super-Sbox part is emphasized by bold squares.
Fig. 3. Truncated differential path for 9-round Rijndael 256-bit block
complexity of 2^32 and 2^32 memory. The procedure for the inbound phase is as follows.
1. For each of the 2^32 possible differences in the state immediately after the ByteSub operation in the 4th round, compute the corresponding differences in the input state to the 5th round. Because this computation involves only linear operations, attackers can compute the differences without determining the values. Store the resulting differences in a table with 2^32 entries.
2. Choose a difference in the state just before the ByteSub operation in the 7th round, and compute the corresponding differences in the state between ByteSub and ShiftRow in the 6th round. Then, proceed further backward by computing Super-Sboxes. Namely, for each column in this state, do as follows.
(a) For the 2^32 possible values in a column, compute the difference after the Super-Sbox; in other words, compute the difference after the Inverse ByteSub, Inverse AddRoundKey, Inverse MixColumn, Inverse ShiftRow, and Inverse ByteSub operations.
(b) Check if the resulting difference matches one of the differences stored in the table generated in Step 1. If it matches, store the corresponding value and difference. Otherwise, discard it.
3. After all Super-Sboxes are computed, for each of the 2^32 differences stored in Step 1, we can obtain the corresponding values following the differential path by looking up all Super-Sboxes.
In the above procedure, Step 1 requires 2^32 computations and 2^32 memory. Step 2 requires 1 computation and 1 memory. Step 2a requires 2^32 computations, and Step 2b requires at most 2^32 memory. Step 3 requires looking up all Super-Sboxes 2^32 times. Hence, the complexity of the above procedure is 2^32 time and 2^32 memory. In Step 2a, because the number of possible differences is 2^32 and we compute 2^32 differences, each difference is expected to appear once on average. Therefore, for each of the differences in the table generated in Step 1, we obtain 1 value on average. Hence, we can obtain 2^32 paired values satisfying the differential path for the inbound phase. These solutions of the inbound phase are called starting points. To sum up, in the inbound phase, we generate 2^32 starting points with a complexity of 2^32 and 2^32 memory. In other words, the average complexity to obtain one starting point is 1.

Outbound phase. In the outbound phase, we just check that the differential path is satisfied probabilistically. In the differential path shown in Fig. 3, the differential propagation through the Inverse MixColumn operation in the 3rd round succeeds with probability 2^-24, and the propagation through the MixColumn operation in the 7th round succeeds with probability 2^-24. In the end, if we have 2^48 starting points, we will obtain a pair of values which follows the differential path. In the inbound phase explained in Section 4.1, we obtain 2^32 starting points for a chosen difference of the state just before the ByteSub operation in the 7th round. This difference can be chosen from 2^64 candidates.
Therefore, by starting the attack procedure from each of the 2^64 possible differences at Step 2, the inbound phase can be iterated at most 2^64 times. By iterating the above attack procedure 2^16 times, we obtain 2^16 · 2^32 = 2^48 starting points, and we then expect to obtain a pair of values satisfying the outbound phase. Hence, the total complexity of the attack is 2^48 (= 2^16 · 2^32) computations and 2^32 memory.

Complexity for an ideal case. In this attack, we find a pair of values that has 16-byte differences in both the input and output states (the active-byte positions are fixed). For an ideal permutation, finding the same property can be regarded as finding a collision in 32 − 16 = 16 bytes. This takes approximately 2^64 computations with the birthday attack. Because this complexity is much higher than the one for the Rijndael 256-bit block, we can conclude that the 9-round Rijndael 256-bit block is not ideal in the known-key model. Note, however, that the fact that no better generic attack is known does not exclude the possibility that such an attack exists.
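The 2^64 estimate above is an instance of the birthday bound: a collision on a t-bit quantity is expected after about 2^(t/2) random trials. A small-scale sketch, using SHA-256 truncated to 16 bits as a stand-in for the fixed bytes of an ideal permutation:

```python
# Sketch of the birthday-attack scaling: a collision on t bits appears after
# roughly 2^(t/2) random trials. Here t = 16, so a few hundred trials are
# expected, instead of the ~2^16 needed for a preimage-style search.
import hashlib

def trunc16(i):  # 16-bit truncation of SHA-256, a stand-in for an ideal map
    return hashlib.sha256(i.to_bytes(8, "big")).digest()[:2]

seen = {}
pair = None
for trials in range(1, 1 << 17):      # safety bound, far beyond expectation
    t = trunc16(trials)
    if t in seen:
        pair = (seen[t], trials)      # two distinct inputs, same 16 bits
        break
    seen[t] = trials

print("collision after", trials, "trials")  # typically a few hundred
```

Scaling the same argument to a 16-byte (128-bit) target gives the 2^64 figure quoted in the text.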
4.2  Attacks on 192-Bit Block
Truncated Differential Path. We show the truncated differential path for the 8-round Rijndael 192-bit block in Fig. 4. The number of active bytes in the path propagates as follows (round by round, 1R through 8R):

4 → 1 → 4 → 16 → 16 → 4 → 1 → 4 → 4

Unlike the truncated differential path for the 256-bit block, we can make the path by activating only 1 byte in both of the input states to the 2nd and 7th rounds. In fact, with this strategy, the number of active bytes in the input of the MixColumn operation in the 4th round is 16, and in the output it is also 16. This satisfies the requirement of the MDS property, where the sum of the numbers of active bytes in the input and output must be 30 or greater. Hence, we can obtain the above differential path.

Complexity and comparison with an ideal case. Because the attack procedure is almost the same as the one for the 256-bit block, we only explain the attack results and omit the details. In the inbound phase, we can generate 2^32 starting points with a complexity of 2^32 and 2^32 memory. In other words, one starting point is generated
(Figure diagram omitted: rounds 1–8 of the path, each round applying BS, SR, MC, AK; two MixColumn operations are annotated with probability 2^-24.)
Fig. 4. Truncated differential path for 8-round Rijndael 192-bit block
with a complexity of 1. The outbound phase includes two probabilistic differential propagations, which succeed with probability 2^-48 in total. Therefore, by repeating the inbound phase 2^16 times, the outbound phase is satisfied. As a result, we can obtain a pair of values that has 4-byte differences in both the input and output states with a complexity of 2^48 and 2^32 memory. Note that the maximum number of starting points we can obtain is 2^64, which is enough to satisfy the outbound phase.

Let us consider the complexity for finding the same property in an ideal case. The problem can be regarded as finding a 24 − 4 = 20-byte collision. However, because the form of the differences in the input state is limited, attackers cannot run the plain birthday attack. In this case, attackers need to use the limited birthday attack introduced by Gilbert and Peyrin [15]. With this attack, the complexity for an ideal permutation is evaluated as 2^128, which is much higher than that of the attack on the Rijndael 192-bit block. Finally, we can conclude that the 8-round Rijndael 192-bit block is not ideal in the known-key model.

Note that an attack using the following 10-round path does not work:

16 → 4 → 1 → 4 → 16 → 16 → 4 → 1 → 4 → 16 → 16

In this case, the attack complexity for the 10-round Rijndael 192-bit block would not change: 2^48 computations and 2^32 memory. However, the problem would become much easier for an ideal permutation. Actually, it can be regarded as finding a 24 − 16 = 8-byte collision with the birthday attack, which requires only 2^32 computations. Therefore, the attack on Rijndael would take more computations than the ideal case.
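Both MDS bounds used above (at least 30 active bytes over the 6 columns of the 192-bit state, at least 40 over the 8 columns of the 256-bit state) come from the branch number 5 of the MixColumn matrix: every active column contributes at least 5 active bytes to the sum of its input and output. A matrix over GF(2^8) has branch number 5 exactly when it is MDS, i.e. when every square submatrix is nonsingular, which can be checked directly for the AES/Rijndael circulant (2, 3, 1, 1):

```python
# Check that the MixColumn matrix circ(2,3,1,1) over GF(2^8) is MDS,
# i.e. every square submatrix has a nonzero determinant. This implies
# branch number 5: any nonzero column has >= 5 active input+output bytes.
from itertools import combinations

def gf_mul(a, b):  # GF(2^8) multiplication modulo x^8+x^4+x^3+x+1 (0x11b)
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
        b >>= 1
    return r

def det(m):  # determinant over GF(2^8) by Laplace expansion (signs are XOR)
    n = len(m)
    if n == 1:
        return m[0][0]
    d = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        d ^= gf_mul(m[0][j], det(minor))
    return d

M = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]
mds = all(
    det([[M[r][c] for c in cols] for r in rows]) != 0
    for k in (1, 2, 3, 4)
    for rows in combinations(range(4), k)
    for cols in combinations(range(4), k)
)
print(mds)  # True: the matrix is MDS, hence branch number 5
```

With branch number 5 per active column, a state in which all 8 (resp. 6) columns are active immediately yields the ≥40 (resp. ≥30) bounds used in Sections 4.1 and 4.2.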
5  Strong ShiftRow Parameter against Truncated Differential Attack
In the attack on the Rijndael 192-bit block explained in Section 4.2, we constructed the differential path by activating only 1 byte in the input states to the 2nd and 7th rounds. Due to this small number of active bytes, the attack is efficient. In Section 5.1, we show a parameter for the ShiftRow operation which can increase the number of active bytes of the truncated differential path. In Section 5.2, we present the attack on a modified Rijndael-192 whose ShiftRow parameter is replaced with the stronger one, and confirm that the attack becomes less efficient.

5.1  Search for Strong ShiftRow Parameter
In our attack, the complexity of the outbound phase is determined by the number of active bytes in the truncated differential path. In particular, the intermediate 5 rounds shown below, which start with 1 active byte and end with 1 active byte, keep the number of active bytes small and make the attack efficient.

1 → 4 → 16 → 16 → 4 → 1    (rounds 2R–6R)    (1)
According to the design principle of the ShiftRow parameters [3] explained in Section 2, resistance against attacks using truncated differentials was considered. However, how it was considered is not mentioned. In this research, we explore ShiftRow parameters that force attackers to activate more bytes to construct a truncated differential path.

Possible patterns of ShiftRow parameter. First, we consider the possible patterns for the ShiftRow parameter. Let x, y, z, and w be the numbers of left-cyclic shift bytes in rows 0, 1, 2, and 3, respectively. We only consider patterns that still satisfy the criteria of the original design principle. Therefore, we fix x = 0 and determine y, z, w so that 1 ≤ y < z < w ≤ 5. Hence, the number of possible patterns is C(5,3) = 10, as shown in Table 3. Moreover, a left-cyclic shift by y, z, w bytes and a right-cyclic shift by y, z, w bytes have exactly the same effect (they activate the bytes in symmetric positions). For example, (y, z, w) = (1, 2, 3) and (y, z, w) = (3, 4, 5) have the same effect. Consequently, only the 6 patterns 1, 2, 3, 4, 5, and 7 shown in Table 3 remain.

Table 3. Possible patterns of ShiftRow parameter

Pattern  (x, y, z, w)   Remarks
   1     (0, 1, 2, 3)
   2     (0, 1, 2, 4)
   3     (0, 1, 2, 5)
   4     (0, 1, 3, 4)
   5     (0, 1, 3, 5)
   6     (0, 1, 4, 5)   symmetric pattern of (0, 1, 2, 5)
   7     (0, 2, 3, 4)
   8     (0, 2, 3, 5)   symmetric pattern of (0, 1, 3, 4)
   9     (0, 2, 4, 5)   symmetric pattern of (0, 1, 2, 4)
  10     (0, 3, 4, 5)   symmetric pattern of (0, 1, 2, 3)

Comparison of ShiftRow parameter. For patterns 1, 2, 3, 4, 5, and 7 in Table 3, we check in a standard miss-in-the-middle manner whether or not the differential path shown as Eq. (1) can be constructed. Remember that the sum of the numbers of active bytes in the input and output of the MixColumn operation in the 4th round is 32, which satisfies the requirement of the MDS property that at least 30 bytes are active. However, depending on the ShiftRow parameter, the positions of active bytes may be biased, and some columns may have 4 or fewer active bytes, which is inconsistent with the MDS property. As a result of the analysis, we found that only pattern 4 can avoid the differential path of Eq. (1). The differential propagation with pattern 4 is described in Fig. 5. With pattern 4, 1 active byte in the initial state makes 4 columns include 2 active bytes and 2 columns become fully active. Therefore, to satisfy the MDS property, the output state of the MixColumn operation needs to include
(Figure diagram omitted. Its annotations read: at the MixColumn operation, the MDS property requires that 4 columns have 3 or more active bytes, but only 2 columns have 3 or more active bytes; a contradiction.)
Fig. 5. Impossibility of constructing efficient path with pattern 4
(Figure diagram omitted: rounds 1–8 of the path, each round applying BS, SR, MC, AK; two MixColumn operations are annotated with probabilities 2^-24 and 2^-48.)
Fig. 6. Truncated differential path for Rijndael 192-bit block with stronger ShiftRow parameter (pattern 4)
at least 4 columns that have at least 3 active bytes. However, the backward propagation starting from 1 active byte always makes 4 columns include 2 active bytes and 2 columns become fully active. Therefore, obtaining the differential path of Eq. (1) is proven to be impossible with pattern 4. In other words, if we use pattern 4, attackers need to activate more bytes in the truncated differential path, and thus need more complexity to find a non-ideal property, which means that the security against truncated differential attacks can increase.

5.2  Attack on Strengthened Rijndael-192
We present the attack on Rijndael-192 reduced to 8 rounds with the ShiftRow parameter replaced by pattern 4 in Table 3. The truncated
differential path is shown in Fig. 6. In this path, the number of active bytes changes as follows (round by round, 1R through 8R):

4 → 1 → 4 → 16 → 32 → 8 → 2 → 4 → 4

In this attack, unlike the attack in Section 4.2, 2 bytes are activated in the input state to the 7th round. Due to this change, the success probability of the differential propagation in the MixColumn operation in the 6th round is reduced to 2^-48. As a result, the attack complexity to find a pair of values that has 4-byte differences in both the input and output states becomes 2^72 computations and 2^32 memory. Although the attack on the 8-round Rijndael 192-bit block is still faster than the ideal case, the attack complexity increases. Hence, we confirmed that the attack becomes less efficient with the new ShiftRow parameter compared to the original specification.

5.3  Remarks on Parameter for Rijndael-256
In Section 5, we searched for a parameter preventing an efficient truncated differential path whose intermediate part begins and ends with a 1-byte difference. For Rijndael-256, as we pointed out in Section 4.1, it is easy to show that such an efficient differential path cannot exist. In other words, the parameter for the 256-bit block has already reached the same level as the strengthened parameter for the 192-bit block.
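The reduction in Section 5.1 from 10 candidate ShiftRow patterns to 6 essentially different ones can be checked by enumeration: for a 6-column state, a left-cyclic shift by s and by (6 − s) mod 6 activate symmetric positions, so mirror patterns are equivalent. A sketch:

```python
# Enumerate ShiftRow patterns (0, y, z, w) with 1 <= y < z < w <= 5 for the
# 6-column (192-bit) state and group them into mirror-equivalence classes:
# a left shift by s is equivalent to a left shift by (6 - s) % 6.
from itertools import combinations

patterns = [(0,) + t for t in combinations(range(1, 6), 3)]
assert len(patterns) == 10                      # C(5,3) candidates

def canonical(p):
    mirror = tuple(sorted((6 - s) % 6 for s in p))
    return min(p, mirror)

classes = {}
for p in patterns:
    classes.setdefault(canonical(p), []).append(p)

print(len(classes))                             # 6 essentially different ones
for rep, members in sorted(classes.items()):
    print(rep, members)
```

The 6 class representatives are (0,1,2,3), (0,1,2,4), (0,1,2,5), (0,1,3,4), (0,1,3,5), and (0,2,3,4), i.e. exactly patterns 1, 2, 3, 4, 5, and 7 of Table 3; patterns 5 and 7 are their own mirrors.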
6  Conclusion
In this paper, we presented known-key attacks on the block cipher Rijndael for 192-bit and 256-bit block sizes. We used differential-based analysis, and the numbers of attacked rounds were extended to 8 and 9 for the 192- and 256-bit blocks, respectively, which is one round longer than the previous best attacks. In addition, we studied the parameters for the ShiftRow operation and showed that the parameter (0, 1, 3, 4) can enhance the security against attacks using truncated differentials. We believe that this observation will be useful for designing primitives based on Rijndael or AES with a large block size in the future.
References

1. U.S. Department of Commerce, National Institute of Standards and Technology: Federal Register / Vol. 72, No. 212 / Friday, November 2, 2007 / Notices (2007), http://csrc.nist.gov/groups/ST/hash/documents/FR_Notice_Nov07.pdf
2. Knudsen, L.R., Rijmen, V.: Known-key distinguishers for some block ciphers. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS, vol. 4833, pp. 315–324. Springer, Heidelberg (2007)
3. Daemen, J., Rijmen, V.: AES Proposal: Rijndael (1998)
4. U.S. Department of Commerce, National Institute of Standards and Technology: Specification for the ADVANCED ENCRYPTION STANDARD (AES) (Federal Information Processing Standards Publication 197) (2001)
5. Daemen, J., Rijmen, V.: The Design of Rijndael: AES – the Advanced Encryption Standard. Springer, Heidelberg (2002)
6. Ferguson, N., Kelsey, J., Lucks, S., Schneier, B., Stay, M., Wagner, D., Whiting, D.: Improved cryptanalysis of Rijndael. In: Schneier, B. (ed.) FSE 2000. LNCS, vol. 1978, pp. 213–230. Springer, Heidelberg (2001)
7. Nakahara Jr., J., de Freitas, D.S., Phan, R.C.-W.: New multiset attacks on Rijndael with large blocks. In: Dawson, E., Vaudenay, S. (eds.) Mycrypt 2005. LNCS, vol. 3715, pp. 277–295. Springer, Heidelberg (2005)
8. Nakahara Jr., J., Pavão, I.C.: Impossible-differential attacks on large-block Rijndael. In: Garay, J.A., Lenstra, A.K., Mambo, M., Peralta, R. (eds.) ISC 2007. LNCS, vol. 4779, pp. 104–117. Springer, Heidelberg (2007)
9. Zhang, L., Wu, W., Park, J.H., Koo, B.W., Yeom, Y.: Improved impossible differential attacks on large-block Rijndael. In: Wu, T.-C., Lei, C.-L., Rijmen, V., Lee, D.-T. (eds.) ISC 2008. LNCS, vol. 5222, pp. 298–315. Springer, Heidelberg (2008)
10. Galice, S., Minier, M.: Improving integral attacks against Rijndael-256 up to 9 rounds. In: Vaudenay, S. (ed.) AFRICACRYPT 2008. LNCS, vol. 5023, pp. 1–15. Springer, Heidelberg (2008)
11. Minier, M., Phan, R.C.-W., Pousse, B.: Distinguishers for ciphers and known key attack against Rijndael with large block size. In: Preneel, B. (ed.) AFRICACRYPT 2009. LNCS, vol. 5580, pp. 60–76. Springer, Heidelberg (2009)
12. Wei, Y., Sun, B., Li, C.: New integral distinguisher for Rijndael-256. Cryptology ePrint Archive, Report 2009/559 (2009), http://eprint.iacr.org/2009/559
13. Mendel, F., Peyrin, T., Rechberger, C., Schläffer, M.: Improved cryptanalysis of the reduced Grøstl compression function, ECHO permutation and AES block cipher. In: Jacobson Jr., M.J., Rijmen, V., Safavi-Naini, R. (eds.) SAC 2009. LNCS, vol. 5867, pp. 16–35. Springer, Heidelberg (2009)
14. Mendel, F., Rechberger, C., Schläffer, M., Thomsen, S.S.: The rebound attack: Cryptanalysis of reduced Whirlpool and Grøstl. In: Dunkelman, O. (ed.) FSE 2009. LNCS, vol. 5665, pp. 260–276. Springer, Heidelberg (2009)
15. Gilbert, H., Peyrin, T.: Super-Sbox cryptanalysis: Improved attacks for AES-like permutations. In: Hong, S., Iwata, T. (eds.) Preproceedings of FSE 2010, pp. 368–387 (2010)
16. Lamberger, M., Mendel, F., Rechberger, C., Rijmen, V., Schläffer, M.: Rebound distinguishers: Results on the full Whirlpool compression function. In: Matsui, M. (ed.) ASIACRYPT 2009. LNCS, vol. 5912, pp. 126–143. Springer, Heidelberg (2009)
17. Lamberger, M., Mendel, F., Rechberger, C., Rijmen, V., Schläffer, M.: The rebound attack and subspace distinguishers: Application to Whirlpool. Cryptology ePrint Archive, Report 2010/198 (2010), http://eprint.iacr.org/2010/198
18. Mendel, F., Rechberger, C., Schläffer, M., Thomsen, S.S.: Rebound attack on the reduced Grøstl hash function. In: Pieprzyk, J. (ed.) CT-RSA 2010. LNCS, vol. 5985, pp. 350–365. Springer, Heidelberg (2010)
19. Gauravaram, P., Knudsen, L.R., Matusiewicz, K., Mendel, F., Rechberger, C., Schläffer, M., Thomsen, S.S.: Grøstl addendum. Submission to NIST (2009) (updated)
20. Matusiewicz, K., Naya-Plasencia, M., Nikolić, I., Sasaki, Y., Schläffer, M.: Rebound attack on the full LANE compression function. In: Matsui, M. (ed.) ASIACRYPT 2009. LNCS, vol. 5912, pp. 106–125. Springer, Heidelberg (2009)
21. Peyrin, T.: Improved differential attacks for ECHO and Grøstl. In: Rabin, T. (ed.) CRYPTO 2010. LNCS, vol. 6223, pp. 370–392. Springer, Heidelberg (2010)
22. Peyrin, T.: Improved cryptanalysis of ECHO and Grøstl. Cryptology ePrint Archive, Report 2010/223 (2010), http://eprint.iacr.org/2010/223; full version of the CRYPTO 2010 paper
Differential Addition in Generalized Edwards Coordinates

Benjamin Justus and Daniel Loebenberger

Bonn-Aachen International Center for Information Technology, Universität Bonn, 53113 Bonn, Germany
Abstract. We use two parametrizations of points on elliptic curves in generalized Edwards form x^2 + y^2 = c^2(1 + d x^2 y^2) that omit the x-coordinate. The first parametrization leads to a differential addition formula that can be computed using 6M + 4S, a doubling formula using 1M + 4S and a tripling formula using 4M + 7S. The second one yields a differential addition formula that can be computed using 5M + 2S and a doubling formula using 5S. All formulas apply also for the case c ≠ 1 and arbitrary curve parameter d. This generalizes formulas from the literature for the special case c = 1 or d being a square in the ground field. For both parametrizations the formula for recovering the missing X-coordinate is also provided.

Keywords: Elliptic curve, Edwards form, addition formula, differential addition.
1  Introduction
Efficient arithmetic (addition, doubling and scalar multiplication) on elliptic curves is the core requirement of elliptic curve cryptography. It is the cornerstone in applications such as the digital signature algorithm (DSA), see [10], and Lenstra's elliptic curve factoring method [11]. Various ways of representing elliptic curves have been proposed for the purpose of efficient arithmetic. For an overview, the reader can consult the standard reference [7] or the online Explicit-Formulas Database (EFD)^1. We have selected some of the top candidates and summarized them in the table below. Here M (resp. S) refers to a multiplication (resp. a squaring) in the field. We ignore in this paper multiplications by a constant and additions in the field, since their cost is negligible compared to the cost of a multiplication or squaring. With the advent of Edwards coordinates [8], extensive recent work [1,2,3,4] has provided formulas for addition in Edwards form that are more efficient (by a constant factor) than what is known for other representations. This makes the Edwards form particularly interesting for cryptographic applications.

^1 See http://www.hyperelliptic.org/EFD
I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 316–325, 2010. c Springer-Verlag Berlin Heidelberg 2010
Castryck, Galbraith and Farashahi [6] present doubling formulas for Edwards form with c = 1 like the one given in Corollary 1. They do not consider the case c ≠ 1 and do not provide a general (differential) addition formula. Gaudry and Lubicz [9] present general efficient algorithms for a much broader class of curves. In order to adapt their ideas to the context of elliptic curves in generalized Edwards form, one needs to explicitly express the group law in terms of Riemann's ϑ functions. Due to our inability to do so, we derive in this work formulas for elliptic curves in generalized Edwards form directly. We are in good company here; Castryck, Galbraith and Farashahi write: "This is an euphemistic rephrasing of our ignorance about Gaudry and Lubicz' result, which is somewhat hidden in a different framework." Special cases of our result can also be found on the EFD: there are several formulas given for c = 1 under the assumption that the curve parameter d is a square in the field. The formulas on the EFD are on the one hand consequences of [9], but can also be deduced from our general formulas in Theorem 1 and Corollary 1, as explained at the end of Section 3. In this work we give for the first time differential addition and doubling formulas for Edwards curves with arbitrary curve parameters c and d. The restriction c = 1 is of less importance in practice, since every curve in generalized Edwards form can be transformed into an isomorphic curve with c = 1 via the map (x, y) → (cx, cy). The curve parameter d, on the other hand, is of greater importance: if d is a square in the ground field, the group law, as described in Section 2, is no longer complete, i.e. the formulas defining the addition on the curve are no longer valid for all possible input points, due to a division by zero.

Table 1. Some coordinate choices with fast arithmetic

Form                    Coordinates                       Addition Cost  Doubling Cost
Short Weierstraß        (X : Y : Z) = (X/Z^2, Y/Z^3)      12M + 4S       4M + 5S
Montgomery form         (X : Z)                           4M + 1S        2M + 3S
Edwards form            (X : Y : Z)                       10M + 1S       3M + 4S
Inverted Edwards        (X : Y : Z) = (Z/X, Z/Y)          9M + 1S        3M + 4S
Differential Edwards    (Y : Z)                           4M + 4S        4S
(c = 1 and d square)    (Y^2 : Z^2)                       4M + 2S        4S

In the following, we use two parametrizations for elliptic curves in generalized Edwards form to obtain efficient arithmetic. In the first parametrization a point on the curve is represented by the projective coordinate (Y : Z). Notice that the X-coordinate is absent, so we cannot distinguish P from −P. This is similar to Montgomery's approach [12], where he represents a point in Weierstraß coordinates by omitting the Y-coordinate. The parametrization used here leads to a differential addition formula, a doubling formula and a tripling formula on elliptic curves in generalized Edwards form. The addition formula can
be computed using 6M + 4S (5M + 4S in the case c = 1), the doubling formula using 1M + 4S (5S when c = 1) and the tripling formula using 4M + 7S. We also provide methods for recovering the missing X-coordinate. Compared to earlier work like [6], [9] or the formulas on the EFD, we explicitly consider all formulas also for the case c ≠ 1, even though one would typically use curves with c = 1 in applications. The second parametrization also omits the X-coordinate. Additionally, it uses only the squares of the coordinates of the points. On elliptic curves in generalized Edwards form, addition can then be done with 5M + 2S and point doubling with 5S. We also provide a tripling formula for this second representation. For point doubling we get completely rid of multiplications and employ only squarings in the ground field. This is desirable since squarings can be done slightly faster than generic multiplications, see for example [7]. This second representation is best suited when employed in a scalar multiplication. Again we explicitly consider all formulas also for the case c ≠ 1. On the EFD several formulas for this parametrization can be found, but only for the special case c = 1 and d being a square in the ground field. The idea of this representation can already be found in Gaudry and Lubicz [9], Section 6.2.

The plan of the paper is as follows. We recall the basics of Edwards coordinates in the next section and describe the addition and the doubling formula in Section 3. The tripling formulas are deduced in Section 4. A formula for recovering the X-coordinate is given in Section 5. The parametrization of the points that uses only the squares of the coordinates is analyzed in Section 6.
2 Edwards Form
We describe now the basics of elliptic curves in generalized Edwards form. More details can be found for example in [3,4]. Such curves are given by equations of the form

E_{c,d}: x^2 + y^2 = c^2 (1 + d x^2 y^2),

where c, d are curve parameters in a field k of characteristic different from 2. When c, d != 0 and dc^4 != 1, the addition law is defined by

(x_1, y_1), (x_2, y_2) -> ( (x_1 y_2 + y_1 x_2) / (c (1 + d x_1 x_2 y_1 y_2)), (y_1 y_2 - x_1 x_2) / (c (1 - d x_1 x_2 y_1 y_2)) ).   (1)
For this addition law, the point (0, c) is the neutral element. The inverse of a point P = (x, y) is −P = (−x, y). In particular, (0, −c) has order 2; (c, 0) and (−c, 0) are the points of order 4. When the curve parameter d is not a square in k, then the addition law (1) is complete (i.e. defined for all inputs). As noted earlier, every curve in generalized Edwards form can be transformed into a curve with c = 1 via the map (x, y) → (cx, cy).
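The completeness claim is easy to probe numerically. The following sketch (not from the paper; a toy example over F_11 with c = 1 and the non-square d = 2, parameters chosen only for illustration) enumerates all points of E_{1,2} and exercises the addition law (1):

```python
# Toy check of the Edwards addition law (1) over F_p.
# Illustrative parameters: p = 11, c = 1, d = 2 (2 is a non-square mod 11,
# so by the completeness result the law is defined for all inputs).
p, c, d = 11, 1, 2

def inv(a):
    return pow(a, p - 2, p)  # Fermat inversion in F_p

def on_curve(P):
    x, y = P
    return (x * x + y * y - c * c * (1 + d * x * x * y * y)) % p == 0

def add(P, Q):
    x1, y1 = P
    x2, y2 = Q
    t = d * x1 * x2 * y1 * y2
    x3 = (x1 * y2 + y1 * x2) * inv(c * (1 + t) % p) % p
    y3 = (y1 * y2 - x1 * x2) * inv(c * (1 - t) % p) % p
    return (x3, y3)

points = [(x, y) for x in range(p) for y in range(p) if on_curve((x, y))]
```

Checking that (0, c) is neutral, that (-x, y) inverts (x, y), and that the sum of any two points lands back on the curve then takes only a few more lines.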
Differential Addition in Generalized Edwards Coordinates
3 Representing Points in Edwards Form
As explained in the introduction, we represent a point P on the curve E_{c,d} using projective coordinates P = (Y_1 : Z_1). Write [n]P = (Y_n : Z_n). Then we have

Theorem 1. Let E_{c,d} be an elliptic curve in generalized Edwards form defined over a field k, such that char(k) != 2, c, d != 0, dc^4 != 1 and d is not a square in k. Then for m > n we have

Y_{m+n} = Z_{m-n} [ Y_m^2 (Z_n^2 - c^2 d Y_n^2) + Z_m^2 (Y_n^2 - c^2 Z_n^2) ],
Z_{m+n} = Y_{m-n} [ d Y_m^2 (Y_n^2 - c^2 Z_n^2) + Z_m^2 (Z_n^2 - c^2 d Y_n^2) ].

It can be computed using 6M + 4S. When n = m, the doubling formula is given by

Y_{2n} = -c^2 d Y_n^4 + 2 Y_n^2 Z_n^2 - c^2 Z_n^4,
Z_{2n} = d Y_n^4 - 2 c^2 d Y_n^2 Z_n^2 + Z_n^4,

which can be computed using 1M + 4S.

On EFD one finds related formulas for c = 1 and d being a square in k. We defer a detailed study of the relationship between the formulas given there and ours to the end of this section.

Proof. Let P_1 = (x_1, y_1), P_2 = (x_2, y_2) be two different points on the curve E_{c,d}. Since the curve parameter d is not a square in k, the addition law (1) is defined for all inputs. Let P_1 + P_2 = (x_3, y_3) and P_1 - P_2 = (x_4, y_4). Then the addition law (1) gives

y_3 c (1 - d x_1 x_2 y_1 y_2) = y_1 y_2 - x_1 x_2,
y_4 c (1 + d x_1 x_2 y_1 y_2) = y_1 y_2 + x_1 x_2.

After multiplying the two equations above, we obtain

y_3 y_4 c^2 (1 - d^2 x_1^2 x_2^2 y_1^2 y_2^2) = y_1^2 y_2^2 - x_1^2 x_2^2.   (2)

Next we substitute x_1^2 = (c^2 - y_1^2)/(1 - c^2 d y_1^2) and x_2^2 = (c^2 - y_2^2)/(1 - c^2 d y_2^2) (obtained from the curve equation) in (2), yielding

y_3 y_4 (-d y_1^2 y_2^2 + c^2 d y_1^2 + c^2 d y_2^2 - 1) = c^2 d y_1^2 y_2^2 - y_1^2 - y_2^2 + c^2.   (3)

After switching to projective coordinates, we see that for m > n the formula for adding [m]P = (Y_m : Z_m) and [n]P = (Y_n : Z_n) becomes

Y_{m+n} / Z_{m+n} = ( Z_{m-n} [ Y_m^2 (Z_n^2 - c^2 d Y_n^2) + Z_m^2 (Y_n^2 - c^2 Z_n^2) ] ) / ( Y_{m-n} [ d Y_m^2 (Y_n^2 - c^2 Z_n^2) + Z_m^2 (Z_n^2 - c^2 d Y_n^2) ] ).   (4)
This proves the addition formula. If P_1 = P_2, we obtain by the addition law (1)

y_3 c (1 - d x_1^2 y_1^2) = y_1^2 - x_1^2.

Similarly, we substitute x_1^2 = (c^2 - y_1^2)/(1 - c^2 d y_1^2) into the equation above to obtain

y_3 (c d y_1^4 - 2 c^3 d y_1^2 + c) = -c^2 d y_1^4 + 2 y_1^2 - c^2.

This proves the doubling formula in Theorem 1 after switching to projective coordinates.

We obtain additional savings in the case c = 1:

Corollary 1. Assume the same as in Theorem 1. If c = 1 we have for m > n

Y_{m+n} = Z_{m-n} [ (Y_m^2 - Z_m^2)(Z_n^2 - d Y_n^2) - (d - 1) Y_n^2 Z_m^2 ],
Z_{m+n} = -Y_{m-n} [ (Y_m^2 - Z_m^2)(Z_n^2 - d Y_n^2) + (d - 1) Y_m^2 Z_n^2 ],

which can be computed using 5M + 4S. For doubling we obtain

Y_{2n} = -(Y_n^2 - Z_n^2)^2 - (d - 1) Y_n^4,
Z_{2n} = (d Y_n^2 - Z_n^2)^2 - d (d - 1) Y_n^4,

which can be computed using 5S.
Remark 1. A simple induction argument shows that the 2^j-fold of a point can be computed using 5jS.

A slight variant of the doubling formula in this Corollary is given by Castryck, Galbraith and Farashahi [6] in their section 3. On EFD similar doubling formulas can be found, but only for the special case of d being a square in the ground field. For general c the formulas of Theorem 1 do not seem to be in the literature. In the remainder of this section we will explore this relationship in more detail. We focus here in particular on Corollary 1 since EFD covers the case c = 1 only. As on EFD we assume now that d = r^2 for some r in k. Then we can write

y_{2n} = (-r^2 Y_{2n}^4 + 2 Y_{2n}^2 Z_{2n}^2 - Z_{2n}^4) / (r^2 Y_{2n}^4 - 2 r^2 Y_{2n}^2 Z_{2n}^2 + Z_{2n}^4),

where y_{2n} denotes the corresponding affine y-coordinate of the point. Thus we have

r y_{2n} = ( 2r/(r - 1) * (r^2 Y_{2n}^4 - 2 Y_{2n}^2 Z_{2n}^2 + Z_{2n}^4) ) / ( -2/(r - 1) * (r^2 Y_{2n}^4 - 2 r^2 Y_{2n}^2 Z_{2n}^2 + Z_{2n}^4) ).

If we set A := (1+r)/(1-r) * (r Y_{2n}^2 - Z_{2n}^2)^2 and B := (r Y_{2n}^2 + Z_{2n}^2)^2 we can write the numerator of the last expression as B - A and the denominator as B + A, yielding the formulas dbl-2006-g and dbl-2006-g-2 from EFD. This can be computed with 4S, but only for those restricted curve parameters. The addition formulas dadd-2006-g and dadd-2006-g-2 from EFD can be deduced in a similar way from our differential addition formula in Corollary 1.
4 A Tripling Formula
One also obtains a tripling formula that can be computed using 4M + 7S. This is cheaper than doing first a doubling and afterwards an addition, which costs 7M + 8S (5M + 9S when c = 1).

Proposition 1. Assume the same as in Theorem 1. Furthermore let char(k) != 3. Then we have

Y_{3n} = Y_n ( c^2 (3 Z_n^4 - d Y_n^4)^2 - Z_n^4 ( 8 c^2 Z_n^4 + (Y_n^2 (c^3 d + c^{-1}) - 2 c Z_n^2)^2 - c^{-2} (c^4 d + 1)^2 Y_n^4 ) ),
Z_{3n} = Z_n ( c^2 (Z_n^4 - 3 d Y_n^4)^2 + d Y_n^4 ( 4 c^2 Z_n^4 - (Y_n^2 (c^3 d + c^{-1}) - 2 c Z_n^2)^2 + c^{-2} ((c^4 d + 1)^2 - 12 c^4 d) Y_n^4 ) ),

which can be computed using 4M + 7S.

Proof. Let (x_3, y_3) = 3(x, y) = 2(x, y) + (x, y). Using the addition law (1), we obtain an expression for y_3. Inside the expression, make the substitution x^2 = (c^2 - y^2)/(1 - c^2 d y^2) and simplify to obtain an expression in y only. Then we have

y_3 = y (c^2 d^2 y^8 - 6 c^2 d y^4 + 4 (c^4 d + 1) y^2 - 3 c^2) / (-3 c^2 d^2 y^8 + 4 d (c^4 d + 1) y^6 - 6 c^2 d y^4 + c^2).

Switch to projective coordinates y = Y/Z and rearrange terms. The formula follows.

Corollary 2. Assume the same as in Theorem 1. Furthermore let char(k) != 3 and assume c = 1. Then we have

Y_{3n} = Y_n ( (d Y_n^4 - 3 Z_n^4)^2 - Z_n^4 ( (2 Z_n^2 - (1 + d) Y_n^2)^2 + 8 Z_n^4 - (1 + d)^2 Y_n^4 ) ),
Z_{3n} = Z_n ( (Z_n^4 - 3 d Y_n^4)^2 - d Y_n^4 ( (2 Z_n^2 - (1 + d) Y_n^2)^2 - 4 Z_n^4 + (12 d - (1 + d)^2) Y_n^4 ) ),

which can be computed using 4M + 7S.
5 Recovering the x-Coordinate
In some cryptographic applications it is important to have at some point both the x- and the y-coordinate. Theorem 2 shows how to obtain them. There have been results [13,5] in this direction for other forms of elliptic curves. To recover the (affine) x-coordinate, we need the following

Proposition 2. Fix an elliptic curve E_{c,d} in generalized Edwards form such that char(k) != 2, c, d != 0, dc^4 != 1 and d is not a square in k. Let Q = (x, y), P_1 = (x_1, y_1) be two points on E_{c,d}. Define P_2 = (x_2, y_2) and P_3 = (x_3, y_3) by P_2 = P_1 + Q and P_3 = P_1 - Q. Then we have

x_1 = (2 y y_1 - c y_2 - c y_3) / (c d x y y_1 (y_3 - y_2)),   (5)

provided the denominator does not vanish.
Proof. By the addition law (1), we have

c (1 - d x x_1 y y_1) y_2 = y y_1 - x x_1,
c (1 + d x x_1 y y_1) y_3 = y y_1 + x x_1.

Add the two equations and solve for x_1; the Proposition follows.
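Formula (5) can be tested exhaustively on a small curve. The following sketch (not from the paper; illustrative parameters p = 11, c = 1, d = 2 with d a non-square) recovers x_1 from the y-coordinates of P_1 + Q and P_1 - Q whenever the denominator is nonzero:

```python
# Toy check of the x-recovery formula (5) over F_11, c = 1, d = 2.
p, c, d = 11, 1, 2

def inv(a):
    return pow(a, p - 2, p)

def add(P, Q, sign=1):  # sign = -1 computes P - Q, since -Q = (-x, y)
    x1, y1 = P
    x2, y2 = (sign * Q[0]) % p, Q[1]
    t = d * x1 * x2 * y1 * y2
    return ((x1 * y2 + y1 * x2) * inv((c * (1 + t)) % p) % p,
            (y1 * y2 - x1 * x2) * inv((c * (1 - t)) % p) % p)

def recover_x1(Q, y1, y2, y3):  # formula (5)
    x, y = Q
    den = c * d * x * y * y1 * (y3 - y2) % p
    return (2 * y * y1 - c * y2 - c * y3) * inv(den) % p

points = [(x, y) for x in range(p) for y in range(p)
          if (x * x + y * y - c * c * (1 + d * x * x * y * y)) % p == 0]

checked = 0
for P1 in points:
    for Q in points:
        x1, y1 = P1
        x, y = Q
        if 0 in (x, y, x1, y1):
            continue  # Lemma 1: the formula needs orders not dividing 4
        y2 = add(P1, Q)[1]
        y3 = add(P1, Q, -1)[1]
        if (y3 - y2) % p == 0:
            continue  # denominator vanishes
        assert recover_x1(Q, y1, y2, y3) == x1
        checked += 1
```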
The following lemma provides a simple criterion, which tells us when the denominator in formula (5) does not vanish.

Lemma 1. Assume the same as in Proposition 2. Furthermore, let P_1, Q be points whose order does not divide 4. Then the formula (5) holds.

Proof. The points P_1 and Q have orders that are not 1, 2 or 4, so x, x_1, y, y_1 != 0. Suppose now y_2 = y_3 (i.e. the y-coordinates of P_1 + Q and P_1 - Q are the same). By the addition law (1), this implies

(y y_1 - x x_1) / (c (1 - d x x_1 y y_1)) = (y y_1 + x x_1) / (c (1 + d x x_1 y y_1)).

By solving for d it follows that d y^2 y_1^2 = 1, which is a contradiction since d is not a square in k.

We are now ready to prove

Theorem 2. Let E_{c,d} be an elliptic curve in generalized Edwards form defined over a field k such that char(k) != 2, c, d != 0, dc^4 != 1 and d is not a square in k. Let P = (x, y) be a point whose order does not divide 4. Let y_n, y_{n+1} be the affine y-coordinates of the points [n]P, [n + 1]P respectively. Then we have

x_n = (2 y y_n y_{n+1} - c C_n - c y_{n+1}^2) / (c d x y y_n (C_n - y_{n+1}^2)),

where A = 1 - c^2 d y^2, B = y^2 - c^2, and C_n = (A y_n^2 + B) / (d B y_n^2 + A).

Proof. Let [n]P = (x_n, y_n), where P is not a 4-torsion point on E_{c,d}. Our task is to recover x_n. By Proposition 2 with P_1 = [n]P and Q = (x, y), we may write

x_n = (2 y y_n - c y_{n-1} - c y_{n+1}) / (c d x y y_n (y_{n-1} - y_{n+1})),   (6)

where y_{n-1}, y_{n+1} are the y-coordinates of the points [n - 1]P and [n + 1]P respectively. Now the variable y_{n-1} can be eliminated because of (4). Indeed, we may write using (4) in affine coordinates
y_{n-1} y_{n+1} = (A y_n^2 + B) / (d B y_n^2 + A),   (7)

where A = 1 - c^2 d y^2 and B = y^2 - c^2. Now from (7), y_{n-1} can be isolated and put back in (6). This gives

x_n = ( 2 y y_n y_{n+1} (d B y_n^2 + A) - c (A y_n^2 + B) - c y_{n+1}^2 (d B y_n^2 + A) ) / ( c d x y y_n [ A y_n^2 + B - y_{n+1}^2 (d B y_n^2 + A) ] ).

The claim follows.
6 A Parametrization Using Squares Only
The formulas in Theorem 1 show that for the computation of Y_{m+n}^2 and Z_{m+n}^2 it is sufficient to know the squares of the coordinates of the points (Y_m : Z_m), (Y_n : Z_n) and (Y_{m-n} : Z_{m-n}) only. This gives

Theorem 3. Assume the same as in Theorem 1. Then for m > n we have

Y_{m+n}^2 = Z_{m-n}^2 ( (A + B)/2 )^2,
Z_{m+n}^2 = Y_{m-n}^2 ( (A - B)/2 + (d - 1) Y_m^2 (Y_n^2 - c^2 Z_n^2) )^2,

with

A := (Y_m^2 + Z_m^2) ( (1 - d c^2) Y_n^2 + (1 - c^2) Z_n^2 ),
B := (Y_m^2 - Z_m^2) ( (1 + c^2) Z_n^2 - (1 + d c^2) Y_n^2 ).

This addition can be computed using 5M + 2S if one stores the squares of the coordinates only. When n = m, we obtain

Y_{2n}^2 = ( (1 - c^2 d) Y_n^4 + (1 - c^2) Z_n^4 - (Y_n^2 - Z_n^2)^2 )^2,
Z_{2n}^2 = ( d c^2 (Y_n^2 - Z_n^2)^2 - d (c^2 - 1) Y_n^4 - (c^2 d - 1) Z_n^4 )^2,

which can be computed using 5S if one stores the squares of the coordinates only.

Proof. This follows directly from Theorem 1 and elementary calculations.
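The squares-only doubling is a polynomial identity in Y_n^2 and Z_n^2, so it can be checked against the squared doubling of Theorem 1 for random field elements. A small sketch (not from the paper; the prime and the sample size are illustrative):

```python
# Identity check: Theorem 3 squares-only doubling vs. squared Theorem 1 doubling.
import random

p = 10007  # illustrative prime
random.seed(1)

for _ in range(100):
    c, d, Y, Z = (random.randrange(1, p) for _ in range(4))
    sY, sZ = Y * Y % p, Z * Z % p          # stored squares only
    # Theorem 1 doubling (then squared):
    Y2 = (-c * c * d * sY * sY + 2 * sY * sZ - c * c * sZ * sZ) % p
    Z2 = (d * sY * sY - 2 * c * c * d * sY * sZ + sZ * sZ) % p
    # Theorem 3: the same squares computed from sY, sZ directly:
    Y2sq = pow((1 - c * c * d) * sY * sY + (1 - c * c) * sZ * sZ
               - (sY - sZ) ** 2, 2, p)
    Z2sq = pow(d * c * c * (sY - sZ) ** 2 - d * (c * c - 1) * sY * sY
               - (c * c * d - 1) * sZ * sZ, 2, p)
    assert Y2sq == Y2 * Y2 % p and Z2sq == Z2 * Z2 % p
```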
A direct adaptation of Corollary 1 does not give any speedup. Again on EFD one finds related formulas for c = 1 and d being a square in k. We will now sketch the computation of a scalar multiple [s]P in this parametrization. Assume P has affine coordinates (x : y). Then one would proceed as follows: After changing to projective coordinates (X : Y : Z), two squares (one for each of the coordinates Y and Z) have to be computed. Now a differential addition chain is employed to compute the multiple [s]P. During all but the
last step of the computation we store the squares of the coordinates of the intermediate points only. The last step plays a special role now, since we wish to obtain at the end the coordinates of the point [s]P and not the squares of the coordinates. To do so, we run the last step using the first parametrization. If we construct from the beginning the differential addition chain such that for each computation of P_{m+n} we have that m - n = 1, we obtain an efficient algorithm for computing the scalar multiple [s]P on an elliptic curve in generalized Edwards form using the second parametrization. In order to recover then the x-coordinate one would have to compute also the scalar multiple [s+1]P and use the recovering formula from Theorem 2. Also the tripling formula given in Proposition 1 can be adapted to this second parametrization. Namely we have

Corollary 3. Assume the same as in Theorem 1. Furthermore, we assume char(k) != 3. Then we have

Y_{3n}^2 = Y_n^2 ( c^2 (3 Z_n^4 - d Y_n^4)^2 - Z_n^4 ( 8 c^2 Z_n^4 + (Y_n^2 (c^3 d + c^{-1}) - 2 c Z_n^2)^2 - c^{-2} (c^4 d + 1)^2 Y_n^4 ) )^2,
Z_{3n}^2 = Z_n^2 ( c^2 (Z_n^4 - 3 d Y_n^4)^2 + d Y_n^4 ( 4 c^2 Z_n^4 - (Y_n^2 (c^3 d + c^{-1}) - 2 c Z_n^2)^2 + c^{-2} ((c^4 d + 1)^2 - 12 c^4 d) Y_n^4 ) )^2,

which can be computed using 4M + 7S if one stores the squares of the coordinates only.
Acknowledgments

This work was funded by the B-IT foundation and the state of North Rhine-Westphalia.
References

1. Bernstein, D.J., Birkner, P., Joye, M., Lange, T., Peters, C.: Twisted Edwards curves. In: Vaudenay, S. (ed.) AFRICACRYPT 2008. LNCS, vol. 5023, pp. 389–405. Springer, Heidelberg (2008)
2. Bernstein, D.J., Birkner, P., Lange, T., Peters, C.: ECM using Edwards curves (2008)
3. Bernstein, D.J., Lange, T.: Faster addition and doubling on elliptic curves. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS, vol. 4833, pp. 29–50. Springer, Heidelberg (2007)
4. Bernstein, D.J., Lange, T.: Inverted Edwards coordinates. In: Boztas, S., Lu, H.-F. (eds.) AAECC 2007. LNCS, vol. 4851, pp. 20–27. Springer, Heidelberg (2007)
5. Brier, É., Joye, M.: Weierstraß elliptic curves and side-channel attacks. In: Naccache, D., Paillier, P. (eds.) PKC 2002. LNCS, vol. 2274, pp. 183–194. Springer, Heidelberg (2002)
6. Castryck, W., Galbraith, S., Farashahi, R.R.: Efficient arithmetic on elliptic curves using a mixed Edwards–Montgomery representation. Cryptology ePrint Archive, Report 2008/218 (2008)
7. Cohen, H., Frey, G.: Handbook of Elliptic and Hyperelliptic Curve Cryptography; written with Roberto M. Avanzi, Christophe Doche, Tanja Lange, Kim Nguyen and Frederik Vercauteren. Discrete Mathematics and its Applications. Chapman & Hall/CRC (2006)
8. Edwards, H.M.: A normal form for elliptic curves. Bulletin of the American Mathematical Society 44(3), 393–422 (July 2007)
9. Gaudry, P., Lubicz, D.: The arithmetic of characteristic 2 Kummer surfaces and of elliptic Kummer lines. Finite Fields and Their Applications 15(2), 246–260 (2009)
10. Information Technology Laboratory: FIPS 186-3: Digital Signature Standard (DSS). Technical report, National Institute of Standards and Technology (June 2009)
11. Lenstra Jr., H.W.: Factoring integers with elliptic curves. Annals of Mathematics 126, 649–673 (1987)
12. Montgomery, P.L.: Speeding the Pollard and elliptic curve methods of factorization. Mathematics of Computation 48(177), 243–264 (1987)
13. Okeya, K., Sakurai, K.: Efficient elliptic curve cryptosystem from a scalar multiplication algorithm with recovery of the y-coordinate on a Montgomery-form elliptic curve. In: Koç, Ç.K., Naccache, D., Paar, C. (eds.) CHES 2001. LNCS, vol. 2162, pp. 126–141. Springer, Heidelberg (2001)
Efficient Implementation of Pairing on BREW Mobile Phones

Tadashi Iyama¹, Shinsaku Kiyomoto², Kazuhide Fukushima², Toshiaki Tanaka², and Tsuyoshi Takagi¹

¹ Graduate School of Mathematics, Kyushu University, 744, Motooka, Nishi-ku, Fukuoka, 819-0395, Japan
² KDDI R&D Laboratories Inc., 2-1-15, Ohara, Fujimino, Saitama, 356-8502, Japan
Abstract. Many implementations of pairings on embedded devices such as mobile phones, sensor nodes, and smart cards have been developed. However, pairings at the security level equivalent to a 128-bit AES key have not been implemented on mobile phones without a high-level OS such as Windows. The R-ate pairing is one of the fastest pairings over large prime fields. In this study, we implemented the R-ate pairing at the security level equivalent to a 128-bit AES key on BREW mobile phones. We compared the processing time of the R-ate pairing with those of the Ate pairing and the ηT pairing; the R-ate pairing was the fastest. We also compared the processing time of the pairings with those of RSA and ECC on ARM9 225MHz; the processing time of the R-ate pairing was similar to those of RSA and ECC.

Keywords: R-ate pairing, Addition Chain, BREW mobile phones.
1 Introduction
I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 326–336, 2010.
© Springer-Verlag Berlin Heidelberg 2010

Pairing cryptosystems are applied to novel cryptographic protocols, for example ID-based cryptosystems, keyword searchable cryptosystems, and so on. Implementation algorithms of pairings mainly include the ηT pairing [2] that uses a small characteristic finite field, the Ate pairing [7] that is faster than the Tate pairing and uses a large characteristic finite field, and the R-ate pairing [11] that improves the Ate pairing. There are many reports of implementing pairings not only on PCs but also on mobile phones, sensor nodes, and smart cards. Pairing implementations on mobile phones include the ηT pairings by Kawahara et al. [9] and Yoshitomi et al. [20] and the Ate pairing by Yoshitomi et al. [21]. The implementations on ATmega128L include the ηT pairings by Oliveira et al. [14] and Ishiguro et al. [8] and the Ate pairing by Szczechowiak et al. [19]. Additionally, the implementation on smart cards includes the Ate pairing by Scott et al. [16]. The security level of the above reports is equivalent to an 80-bit AES key (1024-bit RSA and 160-bit ECC [1]). Meanwhile, there are implementation reports of pairings at the security level equivalent to a 128-bit AES key (3072-bit RSA and 256-bit ECC [1,18]). The implementation on ATmega128L includes the ηT pairing by Miyazaki et al. [13]. The implementation on smartcards includes the Ate
pairing by Devegili et al. [5]. However, pairing at the security level equivalent to a 128-bit AES key has not been implemented on mobile phones without a high-level OS such as Windows. In this study, we implemented the R-ate pairing at the security level equivalent to a 128-bit AES key on BREW mobile phones, and compared the processing time of our implementation of the R-ate pairing with those of the Ate and ηT pairings. The processing time on ARM9 225MHz is 1.60 seconds for the R-ate pairing, 2.44 seconds for the Ate pairing, and 3.76 seconds for the ηT pairing. Also, we compared the processing time of the pairings with those of RSA and ECC at the security level equivalent to a 128-bit AES key. The processing time on ARM9 225MHz is 7.51 seconds for RSA and 0.98 seconds for ECC. Additionally, we compared the relationship between the processing times of pairings, RSA, and ECC at the security level equivalent to a 128-bit AES key with that at the security level equivalent to an 80-bit AES key. At the 80-bit security level, the pairings and ECC, both using a small characteristic finite field, and RSA are fast. However, at the 128-bit security level, the pairings and ECC using a large characteristic finite field are fast.
2 Extension Field Fp12
The extension field F_{p^12} is constructed via F_{p^2} and F_{p^6}. Let F_p be a prime field; F_{p^2}, F_{p^6}, and F_{p^12} are represented in the same way as in Devegili et al. [5]:

F_{p^2} = F_p[u]/(u^2 + 2),
F_{p^6} = F_{p^2}[v]/(v^3 - ξ),
F_{p^12} = F_{p^6}[w]/(w^2 - v) = F_{p^2}[W]/(W^6 - ξ),

where ξ = -u - 1 in F_{p^2}. If p ≡ 7 (mod 8), -2 is a quadratic non-residue over F_p and u^2 + 2 is irreducible over F_p[u]. If p ≡ 1 (mod 6), ξ is a quadratic non-residue over F_{p^2} and W^6 - ξ is irreducible over F_{p^2}[W]. An element α in F_{p^12} is represented in any of the following ways:

α = a_0 + a_1 w,   a_0, a_1 in F_{p^6}
  = (a_{0,0} + a_{0,1} v + a_{0,2} v^2) + (a_{1,0} + a_{1,1} v + a_{1,2} v^2) w,   a_{i,j} in F_{p^2}
  = a_{0,0} + a_{1,0} W + a_{0,1} W^2 + a_{1,1} W^3 + a_{0,2} W^4 + a_{1,2} W^5.

F_{p^12} arithmetic consists of addition, subtraction, multiplication, squaring, and inversion, all built from F_p arithmetic. Multiplication over F_{p^12} is calculated using Karatsuba's method, and squaring over F_{p^12} is calculated using the complex method [4].
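To make the bottom of the tower concrete, here is a minimal sketch (assumptions: Python, an illustrative prime p ≡ 7 (mod 8); the paper's p is a 256-bit BN prime) of F_{p^2} = F_p[u]/(u^2 + 2) multiplication via Karatsuba, which uses 3 base-field multiplications instead of 4:

```python
p = 2**61 - 1  # illustrative prime with p ≡ 7 (mod 8), so u^2 + 2 is irreducible

def mul_fp2(a, b):
    # (a0 + a1*u)(b0 + b1*u) = (a0*b0 - 2*a1*b1) + (a0*b1 + a1*b0)*u, using u^2 = -2.
    # Karatsuba: a0*b1 + a1*b0 = (a0 + a1)(b0 + b1) - a0*b0 - a1*b1.
    a0, a1 = a
    b0, b1 = b
    t0 = a0 * b0 % p
    t1 = a1 * b1 % p
    t2 = (a0 + a1) * (b0 + b1) % p
    return ((t0 - 2 * t1) % p, (t2 - t0 - t1) % p)
```

The same trading of multiplications for additions, applied recursively over the tower, gives the Karatsuba F_{p^12} multiplication the paper refers to.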
3 Barreto-Naehrig Curves
In this paper, we use the Barreto-Naehrig (BN) curve [3], given by

E(F_p) = {(x, y) in F_p × F_p | y^2 = x^3 + b} ∪ {∞},
T. Iyama et al.
where b in F_p and ∞ is the point at infinity. The BN parameters are

p = 36z^4 + 36z^3 + 24z^2 + 6z + 1,
r = 36z^4 + 36z^3 + 18z^2 + 6z + 1,
t = 6z^2 + 1,

where z in Z and p, r are primes with #E(F_p) = r [3]. Here t is the trace of the Frobenius map π : (x, y) -> (x^p, y^p). The BN curve has embedding degree k = 12; the embedding degree k is the smallest positive integer such that r | p^k - 1 and r^2 does not divide p^k - 1. We use z = 2^62 + 2^61 + 7981 and b = 3. E(F_p) has the following sextic twist:

E'(F_{p^2}) = {(x, y) in F_{p^2} × F_{p^2} | y^2 = x^3 + b/ξ} ∪ {∞}.

The order of E'(F_{p^2}) is #E'(F_{p^2}) = r(2p - r). There is E'(F_{p^2})[r], which is a subgroup of order r. If W is a root of W^6 - ξ, then a monomorphism is given by (x, y) -> (xW^2, yW^3). If the parameters of Devegili et al. are used, p and r are 256-bit primes and p^12 is a 3067-bit number, which just falls short of satisfying the security level equivalent to a 128-bit AES key.
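The stated relations between the BN parameters are easy to verify directly; the following sketch (not from the paper) checks them for the paper's z:

```python
# Sanity check of the BN parametrization for the paper's z = 2^62 + 2^61 + 7981.
z = 2**62 + 2**61 + 7981
p = 36*z**4 + 36*z**3 + 24*z**2 + 6*z + 1
r = 36*z**4 + 36*z**3 + 18*z**2 + 6*z + 1
t = 6*z**2 + 1

assert r == p + 1 - t                            # #E(F_p) = p + 1 - t = r
assert p.bit_length() == 256 and r.bit_length() == 256
assert p % 8 == 7                                # u^2 + 2 irreducible (Section 2)
assert p % 6 == 1                                # needed for the twist construction
assert (p**12 - 1) % r == 0                      # r divides p^12 - 1 (k = 12)
```

The last assertion holds for every integer z by the BN polynomial identities, since r(z) divides p(z)^4 - p(z)^2 + 1.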
4 Pairings over BN Curves

In this section, we describe some algorithms of the (twisted) Ate and R-ate pairings.

4.1 Ate Pairing
The Ate pairing was proposed in 2006 by Hess et al. [7].

Algorithm 1. Ate pairing [6]
INPUT: P in E'(F_{p^2})[r], Q in E(F_p), 6z^2 in Z
OUTPUT: e_A(P, Q) in F_{p^12}
1: T <- P
2: f <- 1
3: for i <- floor(log_2(6z^2)) - 2 downto 0 do
4:   T <- 2T
5:   f <- f^2 · l_{T,T}(Q)
6:   if (6z^2)_i = 1 then
7:     T <- T + P
8:     f <- f · l_{T,P}(Q)
9:   end if
10: end for
11: f <- f^{(p^12 - 1)/r}
12: return f
Let P in E'(F_{p^2}) and Q in E(F_p). The Ate pairing is defined as

e_A(P, Q) = f_{6z^2, P}(Q)^{(p^12 - 1)/r},
where f_{6z^2, P} is the Miller function [6]. Steps 3 to 10 of Algorithm 1 are the Miller loop, where the number of iterations is floor(log_2(6z^2)) - 1. Let the i-th bit of an integer a be (a)_i. Also, 6z^2 ≈ sqrt(p). The exponentiation in Step 11 is called the final exponentiation.

4.2 R-ate Pairing
The R-ate pairing proposed by Lee et al. [11] is a generalized algorithm of the Ate pairing. Let P in E'(F_{p^2}) and Q in E(F_p). The R-ate pairing e_R(P, Q) in F_{p^12} is defined as

e_R(P, Q) = ( f_{6z+2, P}(Q) · ( f_{6z+2, P}(Q) · l_{(6z+2)P, P}(Q) )^p · l_{π((6z+3)P), (6z+2)P}(Q) )^{(p^12 - 1)/r},

where l_{A,B} is the line through A and B.

Algorithm 2. R-ate pairing [6]
INPUT: P in E'(F_{p^2})[r], Q in E(F_p), 6z + 2 in Z
OUTPUT: e_R(P, Q) in F_{p^12}
1: T <- P
2: f <- 1
3: for i <- floor(log_2(6z + 2)) - 2 downto 0 do
4:   T <- 2T
5:   f <- f^2 · l_{T,T}(Q)
6:   if (6z + 2)_i = 1 then
7:     T <- T + P
8:     f <- f · l_{T,P}(Q)
9:   end if
10: end for
11: f <- f · (f · l_{T,P}(Q))^p · l_{π(T+P), T}(Q)
12: f <- f^{(p^12 - 1)/r}
13: return f
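The shortening of the Miller loop is easy to see numerically; a two-line sketch (using the paper's z; the loop lengths are governed by the bit lengths of the loop parameters):

```python
# Loop-parameter bit lengths: Algorithm 1 iterates over the bits of 6z^2,
# Algorithm 2 over the bits of 6z + 2.
z = 2**62 + 2**61 + 7981
ate_bits = (6 * z * z).bit_length()     # 128
rate_bits = (6 * z + 2).bit_length()    # 66: the R-ate loop is about half as long
```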
In the Miller loop in Steps 3 to 10 of Algorithm 2, the number of iterations is floor(log_2(6z + 2)) - 1. Since the bitlength of 6z + 2 is about half that of 6z^2, the number of iterations of the R-ate pairing is about half that of the Ate pairing. In Step 11, we use f and (6z + 2)P computed by the Miller loop. Note that the exponent in Step 12 is the same value as that of the Ate pairing.

4.3 Arithmetic of Jacobian Coordinates
In Algorithms 1 and 2, we use a point (X, Y, Z) in Jacobian coordinates corresponding to the point (x, y) in affine coordinates with x = X/Z^2 and y = Y/Z^3. In Step 1, we transform P = (x, y) into T = (x, y, 1). In Step 4, 2T is the double of T in Jacobian coordinates. Let T = (X, Y, Z) and 2T = (X_d, Y_d, Z_d); then

X_d = 9X^4 - 8XY^2,
Y_d = 3X^2 (4XY^2 - X_d) - 8Y^4,
Z_d = 2YZ.
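As a quick sanity check (not from the paper) of the doubling formulas above, the following sketch doubles a point on the toy curve y^2 = x^3 + 3 over F_101 both in Jacobian and in affine coordinates (illustrative parameters; the paper's curve uses a 256-bit p):

```python
# Jacobian doubling on y^2 = x^3 + 3 over F_101, checked against affine doubling.
p = 101

def inv(a):
    return pow(a, p - 2, p)

def jacobian_double(X, Y, Z):
    Xd = (9 * X**4 - 8 * X * Y**2) % p
    Yd = (3 * X**2 * (4 * X * Y**2 - Xd) - 8 * Y**4) % p
    Zd = 2 * Y * Z % p
    return Xd, Yd, Zd

def affine_double(x, y):  # textbook chord-and-tangent doubling on y^2 = x^3 + b
    lam = 3 * x * x * inv(2 * y) % p
    x3 = (lam * lam - 2 * x) % p
    y3 = (lam * (x - x3) - y) % p
    return x3, y3

X, Y, Z = jacobian_double(1, 2, 1)      # P = (1, 2) lies on the curve
x3 = X * inv(Z * Z) % p                 # back to affine: x = X/Z^2, y = Y/Z^3
y3 = Y * inv(Z**3) % p
assert (x3, y3) == affine_double(1, 2)  # both give (68, 74)
```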
In Step 5, the tangent line at T is

l_{T,T}(Q) = Z_d Z^2 y - 2Y^2 W^3 - 3X^2 W (Z^2 x - X W^2) in F_{p^12},

where Q = (x, y) and F_{p^12} = F_{p^2}[W]/(W^6 - ξ). In Step 7, T + P is the sum of a point in Jacobian coordinates and a point in affine coordinates. Let T = (X_1, Y_1, Z_1), P = (X_2, Y_2), and T + P = (X_a, Y_a, Z_a); then

X_a = (Y_2 Z_1^3 - Y_1)^2 - (X_2 Z_1^2 - X_1)^2 (X_1 + X_2 Z_1^2),
Y_a = (Y_2 Z_1^3 - Y_1) [ X_1 (X_2 Z_1^2 - X_1)^2 - X_a ] - Y_1 (X_2 Z_1^2 - X_1)^3,
Z_a = (X_2 Z_1^2 - X_1) Z_1.

In Step 8,

l_{T,P}(Q) = (y - Y_2 W^3) Z_a - (Y_2 Z_1^3 - Y_1) W (x - X_2 W^2) in F_{p^12}

is the line through T and P. In Step 11 of Algorithm 2, (6z + 2)P is transformed into a point in affine coordinates, and we compute l_{π((6z+3)P), (6z+2)P}(Q).

4.4 Final Exponentiation
Devegili et al. [5] showed a method of factoring (p^12 - 1)/r into the three parts (p^6 - 1), (p^2 + 1), and (p^4 - p^2 + 1)/r. The exponentiation by (p^6 - 1) is as follows:

f^{p^6 - 1} = (f_0 - f_1 w) / (f_0 + f_1 w),   f_0, f_1 in F_{p^6}.
The cost of the exponentiation by (p^6 - 1) is 1 inversion and 1 multiplication over F_{p^12}. If we let f_{i,j} in F_{p^2} (i = 0, 1; j = 0, 1, 2), the exponentiation by p is

f^p = (f_{0,0}^p + (γ_2 · f_{0,1}^p) v + (γ_4 · f_{0,2}^p) v^2) + (γ_1 · f_{1,0}^p + (γ_3 · f_{1,1}^p) v + (γ_5 · f_{1,2}^p) v^2) w.

When the above expression is used, the exponentiation by (p^2 + 1) can be computed. The constants γ_i = ξ^{i(p-1)/6} (i = 1, 2, 3, 4, 5) can be precomputed, and f_{i,j}^p can be computed by using conjugation over F_{p^2}. Thus, the cost of the exponentiation by (p^2 + 1) is 5 multiplications over F_{p^2}. There are two methods of computing the exponentiation by (p^4 - p^2 + 1)/r (Hard Exponentiation), proposed by Devegili et al. [5] and Scott et al. [15]. The method of Devegili et al. is Algorithm 3.

Algorithm 3. Hard Exponentiation of Devegili et al. [5]
INPUT: f in F_{p^12}, p, r, z in Z
OUTPUT: f^{(p^4 - p^2 + 1)/r} in F_{p^12}
1: a <- f^{-(6z+5)}, b <- a^p, b <- a · b
2: f_1 <- f^p, f_2 <- f^{p^2}, f_3 <- f^{p^3}
3: f <- f_3 · [b · (f_1)^2 · f_2]^{6z^2 + 1} · b · (f_1 · f)^9 · a · f^4
4: return f
In the method of Scott et al., f^{-z}, f^{z^2}, f^{-z^3}, f^p, f^{p^2}, f^{p^3}, (f^{-z})^p, (f^{z^2})^p, (f^{-z^3})^p, and (f^{z^2})^{p^2} are precomputed, and the following y_0 to y_6 are computed:

y_0 = f^p · f^{p^2} · f^{p^3},   y_1 = 1/f,   y_2 = (f^{z^2})^{p^2},   y_3 = (f^{-z})^p,
y_4 = f^{-z} / (f^{z^2})^p,   y_5 = 1/f^{z^2},   y_6 = f^{-z^3} · (f^{-z^3})^p,

where the above inversions are the same as conjugation. When y_0 to y_6 are used, the exponentiation by (p^4 - p^2 + 1)/r is the same as y_0 · y_1^2 · y_2^6 · y_3^12 · y_4^18 · y_5^30 · y_6^36.

Algorithm 4. Hard Exponentiation of Scott et al. [15]
INPUT: y_0, y_1, y_2, y_3, y_4, y_5, y_6 in F_{p^12}
OUTPUT: y_0 · y_1^2 · y_2^6 · y_3^12 · y_4^18 · y_5^30 · y_6^36 in F_{p^12}
1: T_0 <- y_6^2 · y_4 · y_5, T_1 <- y_3 · y_5 · T_0
2: T_0 <- T_0 · y_2, T_1 <- (T_1^2 · T_0)^2
3: T_0 <- T_1 · y_1, T_1 <- T_1 · y_0
4: T_0 <- T_0^2 · T_1
5: return T_0
5 Improved Method

In this section, we describe the improved method for the Hard Exponentiation of Devegili et al. and Scott et al., where p = 36z^4 + 36z^3 + 24z^2 + 6z + 1 with z = 2^62 + 2^61 + 7981.

5.1 Addition Chain
In the method of Devegili et al., there are two exponentiations: by 6z + 5 and by 6z^2 + 1. By contrast, in the method of Scott et al., there is only the exponentiation by z. We speed up each method using an addition chain. When 6z + 5 is written in binary, the bit patterns 1001 and 1011 occur, so f^9 and f^11 are precomputed.

Algorithm 5. Proposed Addition Chain (f^{6z+5})
INPUT: f in F_{p^12}
OUTPUT: f^{6z+5} in F_{p^12}
1: f_1 <- f, f_2 <- f^9, f_3 <- f^11
2: f <- ( [ (f_2^{2^50} · f_3)^{2^4} · f_3 ]^{2^7} · f_2 )^2 · f_1
3: return f
When f^{6z+5} is computed using the square-and-multiply method, the number of multiplications over F_{p^12} is 10. When Algorithm 5 is used, the number of multiplications over F_{p^12} can be reduced to 6. To compute the exponentiation by 6z^2 + 1, we use two exponentiations, by z and by 6z [6]. When z is written in binary, the bit patterns 101, 111, and 1100 occur, so f^5, f^7, and f^12 are precomputed.
Algorithm 6. Proposed Addition Chain (f^z)
INPUT: f in F_{p^12}
OUTPUT: f^z in F_{p^12}
1: f_1 <- f^5, f_2 <- f^7, f_3 <- f^12
2: f <- ( [ (f_3^{2^49} · f_2)^{2^4} · f_3 ]^{2^3} · f_1 )^{2^3} · f_1
3: return f
In contrast, when 6z is written in binary, the bit patterns 1001, 1011, and 1110 occur, so f^9, f^11, and f^14 are precomputed.

Algorithm 7. Proposed Addition Chain (f^{6z})
INPUT: f in F_{p^12}
OUTPUT: f^{6z} in F_{p^12}
1: f_1 <- f^9, f_2 <- f^11, f_3 <- f^14
2: f <- [ (f_1^{2^50} · f_2)^{2^4} · f_2 ]^{2^8} · f_3
3: return f

When f^z and f^{6z} are computed using the square-and-multiply method, the number of multiplications over F_{p^12} is 10 each. When Algorithms 6 and 7 are used, the number of multiplications over F_{p^12} can be reduced to 7.

5.2 Comparison with Existing Methods
In the final exponentiation, Hankerson et al. [6] show that there are 7246 multiplications over F_p in the method of Devegili et al. When the improved Devegili et al. method with Algorithms 5, 6, and 7 (Proposed method 1) is used, there are 6653 multiplications over F_p, a reduction of about 8%. In contrast, there are 7046 multiplications over F_p in the method of Scott et al. When the improved Scott et al. method with Algorithm 6 (Proposed method 2) is used, there are 6560 multiplications over F_p, a reduction of about 7%.

Table 1. The number of multiplications over F_p in the final exponentiation

Devegili et al. [5]   7246      Proposed method 1   6653
Scott et al. [15]     7046      Proposed method 2   6560
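The addition chains of Algorithms 5-7 can be checked in the exponent group: replacing multiplications of powers of f by additions of exponents, each chain must reach 6z + 5, z, and 6z respectively. A small sketch (not from the paper):

```python
# Verify the exponents produced by the addition chains of Algorithms 5-7
# for z = 2^62 + 2^61 + 7981.
z = 2**62 + 2**61 + 7981

# Algorithm 5: f1 = f^1, f2 = f^9, f3 = f^11
e5 = ((((9 * 2**50) + 11) * 2**4 + 11) * 2**7 + 9) * 2 + 1
assert e5 == 6 * z + 5

# Algorithm 6: f1 = f^5, f2 = f^7, f3 = f^12
e6 = ((((12 * 2**49) + 7) * 2**4 + 12) * 2**3 + 5) * 2**3 + 5
assert e6 == z

# Algorithm 7: f1 = f^9, f2 = f^11, f3 = f^14
e7 = (((9 * 2**50) + 11) * 2**4 + 11) * 2**8 + 14
assert e7 == 6 * z
```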
6 Implementation Results
In this section, we describe the implementation results of the Ate pairing and the R-ate pairing on BREW mobile phones.
6.1 ARM Processors and BREW
We implement pairings for the ARM9 processors used by BREW mobile phones. BREW stands for Binary Runtime Environment for Wireless and is an application platform for CDMA mobile phones. The advantages of BREW are electric power savings and high-speed processing, because BREW applications are written in a compiled language. In our implementation, we use ARM9 150MHz and ARM9 225MHz. We use BREW SDK 3.1.2 and Microsoft Visual C++ 6.0. The compiler is the ARM compiler (RVCT 1.2). Note that we have not used BREW-dependent techniques in order to accelerate the speed of our implementation.

6.2 Implementation Results of the Ate and R-ate Pairings
In the Ate pairing and the R-ate pairing, we use F_p, F_{p^2}, F_{p^6}, and F_{p^12} arithmetic; that is, the pairings are constructed using the arithmetic over F_p, where p = 36z^4 + 36z^3 + 24z^2 + 6z + 1 with z = 2^62 + 2^61 + 7981 as discussed in Section 5. The processing time of the arithmetic over F_p is listed in Table 2.

Table 2. Processing time of calculations over F_p (msec)

                 ARM9 150MHz   ARM9 225MHz
Addition         0.0114        0.0094
Subtraction      0.0062        0.0039
Multiplication   0.1205        0.0799
Inversion        0.9910        0.6549
Ate pairing      3681          2436
R-ate pairing    2430          1602
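As a rough cross-check (not from the paper's code), the per-operation timings of Table 2 can be combined with the operation counts reported in the text below (56623 additions, 41855 subtractions, 22282 multiplications, 1 inversion for Ate; 38510, 27997, 14395 and 2 for R-ate) to reproduce the estimated pairing times:

```python
# Estimate the ARM9 225MHz pairing times from Table 2 (timings in msec).
t_add, t_sub, t_mul, t_inv = 0.0094, 0.0039, 0.0799, 0.6549

ate = (56623 * t_add + 41855 * t_sub + 22282 * t_mul + 1 * t_inv) / 1000
rate = (38510 * t_add + 27997 * t_sub + 14395 * t_mul + 2 * t_inv) / 1000

assert abs(ate - 2.48) < 0.02   # about 2.48 s for the Ate pairing
assert abs(rate - 1.62) < 0.02  # about 1.62 s for the R-ate pairing
```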
As shown in Table 2, subtraction over F_p is twice as fast as addition, because the modular reduction step is needed less often in subtraction than in addition. Addition over F_p is 10 times faster than multiplication. In the final exponentiation, we used Proposed method 2 of Section 5.2. The Ate pairing required 56623 additions, 41855 subtractions, 22282 multiplications, and 1 inversion. By contrast, the R-ate pairing required 38510 additions, 27997 subtractions, 14395 multiplications, and 2 inversions. From the results in Table 2, the estimated processing time of the Ate pairing on ARM9 225MHz was 2.48 seconds and that of the R-ate pairing was 1.62 seconds. The measured processing time of the Ate pairing on ARM9 225MHz was 2.44 seconds and that of the R-ate pairing was 1.60 seconds. There was almost no difference between the estimated and measured processing times.

6.3 Comparison of 80-Bit and 128-Bit Security Level
In this section, we compare the processing time of pairings with those of RSA and ECC at the security levels equivalent to 80-bit and 128-bit AES keys. Yoshitomi et al. [20,21] show the processing time of public key cryptosystems (the Ate
pairing, the ηT pairing, RSA, ECC(p), and ECC(2^m)) at the security level equivalent to an 80-bit AES key. The processing time of RSA is that of the modular exponentiation by 1024-bit integers [1], and the processing time of ECC is that of the scalar multiplication by 160-bit integers. There are two finite fields F_p (p a 160-bit prime) and F_{2^m} (m = 163) in ECC [18, Tables 2 and 4]. Also, the ηT pairing requires a finite field F_{3^n} (n = 193) and the Ate pairing requires the extension field F_{p^4} constructed from F_{p^2} (p a 256-bit prime) [21]. Yoshitomi et al. use the Chinese remainder theorem [12], the Montgomery multiplication [12], and the sliding window method of width 3 [12] in RSA, and the non-adjacent form (NAF) for computing the scalar multiplication in ECC. At the security level equivalent to a 128-bit AES key, the processing time of RSA is that of the modular exponentiation by 3072-bit integers [1] and the processing time of ECC is that of the scalar multiplication by 256-bit integers. There are two finite fields F_p (p a 256-bit prime) and F_{2^m} (m = 283) in ECC [18, Tables 2 and 4]. Also, the ηT pairing requires a finite field F_{3^n} (n = 509) [6]. Section 3 describes the structure of the R-ate pairing. In order to implement RSA, ECC, and the ηT pairing at the security level equivalent to a 128-bit AES key, we used the same source code as Yoshitomi et al. [20].

Table 3. Comparison of public key cryptosystems at the security level equivalent to 80-bit and 128-bit AES keys on ARM9 225MHz (sec)

                RSA    ECC(2^m)  ECC(p)  ηT pairing  Ate/R-ate pairing
80-bit [20,21]  0.33   0.46      0.50    0.26        0.70
128-bit         7.51   3.80      0.98    3.76        1.60
Fig. 1. Comparison of public key cryptosystems at the security level equivalent to 80-bit and 128-bit AES keys on ARM9 225MHz(sec)
Efficient Implementation of Pairing on BREW Mobile Phones
335
From Table 3 and Fig. 1, at the security level equivalent to an 80-bit AES key, the ηT pairing is the fastest of all. Also, ECC over a small-characteristic finite field and RSA are comparatively fast. At the security level equivalent to a 128-bit AES key, ECC over a large-characteristic finite field is the fastest of all. In addition, the R-ate pairing is comparatively fast, and RSA is the slowest of all.
7 Conclusion
In this study, we implemented the Ate pairing and the R-ate pairing over BN curves and the ηT pairing at the security level equivalent to a 128-bit AES key on BREW mobile phones. We sped up the final exponentiation of the pairings using the proposed addition chain. With Devegili et al.'s method, the number of multiplications over Fp is 7246, whereas with the proposed addition chain it is 6560. The processing time of the R-ate pairing using the proposed method becomes 1.60 seconds on ARM9 225MHz. We compared the processing time of the pairings to RSA and ECC at the security level equivalent to a 128-bit AES key. As a result, at that security level the R-ate pairing is faster than RSA and ECC(2^m); however, ECC(p) is the fastest of all. Also, we compared the processing times of public key cryptosystems at the security levels equivalent to 80-bit and 128-bit AES keys. At the security level equivalent to an 80-bit AES key, the fastest public key cryptosystems are the ηT pairing and ECC using a small-characteristic finite field. By contrast, at the security level equivalent to a 128-bit AES key, the fastest public key cryptosystems are ECC using a large-characteristic finite field and the R-ate pairing.
References

1. Barker, E., Barker, W., Burr, W., Polk, W., Smid, M.: Recommendation for Key Management - Part 1: General. NIST Special Publication 800-57 Part 1 (2005)
2. Barreto, P., Galbraith, S., Ó hÉigeartaigh, C., Scott, M.: Efficient Pairing Computation on Supersingular Abelian Varieties. Designs, Codes and Cryptography 42(3), 239–271 (2007)
3. Barreto, P., Naehrig, M.: Pairing-Friendly Elliptic Curves of Prime Order. In: Preneel, B., Tavares, S. (eds.) SAC 2005. LNCS, vol. 3897, pp. 319–331. Springer, Heidelberg (2006)
4. Devegili, A., Ó hÉigeartaigh, C., Scott, M., Dahab, R.: Multiplication and Squaring on Pairing-Friendly Fields. Cryptology ePrint Archive, Report 2006/471 (2006)
5. Devegili, A., Scott, M., Dahab, R.: Implementing Cryptographic Pairings over Barreto-Naehrig Curves. In: Takagi, T., Okamoto, T., Okamoto, E., Okamoto, T. (eds.) Pairing 2007. LNCS, vol. 4575, pp. 197–207. Springer, Heidelberg (2007)
6. Hankerson, D., Menezes, A., Scott, M.: Software Implementation of Pairings. In: Identity-Based Cryptography, pp. 188–206. IOS Press, Amsterdam (2009)
7. Hess, F., Smart, N., Vercauteren, F.: The Eta Pairing Revisited. IEEE Transactions on Information Theory 52(10), 4595–4602 (2006)
8. Ishiguro, T., Shirase, M., Takagi, T.: Efficient Implementation of the Pairing on ATmega128L. IPSJ Journal 49(11), 3743–3753 (2008)
9. Kawahara, Y., Takagi, T., Okamoto, E.: Efficient Implementation of Tate Pairing on Mobile Phones using Java. IPSJ Journal 49(1), 427–435 (2008)
10. Koblitz, N., Menezes, A.: Pairing-Based Cryptography at High Security Levels. In: Smart, N.P. (ed.) Cryptography and Coding 2005. LNCS, vol. 3796, pp. 13–36. Springer, Heidelberg (2005)
11. Lee, E., Lee, H.-S., Park, C.-M.: Efficient and Generalized Pairing Computation on Abelian Varieties. IEEE Transactions on Information Theory 55(4), 1793–1803 (2009)
12. Menezes, A.J., van Oorschot, P.C., Vanstone, S.A.: Handbook of Applied Cryptography. CRC Press, Boca Raton (1996)
13. Miyazaki, Y., Shirase, M., Takagi, T.: Elliptic Curve/Pairing-based Cryptographies on MICAz with Some Security Levels. In: WISA 2009, Short Presentation Track, pp. 183–195 (2009)
14. Oliveira, L., Scott, M., Lopez, J., Dahab, R.: TinyPBC: Pairings for Authenticated Identity-Based Non-Interactive Key Distribution in Sensor Networks. In: INSS 2008, pp. 173–180 (2008)
15. Scott, M., Benger, N., Charlemagne, M.: On the Final Exponentiation for Calculating Pairings on Ordinary Elliptic Curves. In: Shacham, H., Waters, B. (eds.) Pairing 2009. LNCS, vol. 5671, pp. 78–88. Springer, Heidelberg (2009)
16. Scott, M., Costigan, N., Abdulwahab, W.: Implementing Cryptographic Pairings on Smartcards. In: Goubin, L., Matsui, M. (eds.) CHES 2006. LNCS, vol. 4249, pp. 134–147. Springer, Heidelberg (2006)
17. Shirase, M., Kawahara, Y., Takagi, T., Okamoto, E.: Universal ηT Pairing Algorithm over Arbitrary Extension Degree. In: Kim, S., Yung, M., Lee, H.-W. (eds.) WISA 2007. LNCS, vol. 4867, pp. 1–15. Springer, Heidelberg (2008)
18. Standards for Efficient Cryptography Group (SECG): SEC 2: Recommended Elliptic Curve Domain Parameters, Ver. 2.0 (2010), http://www.secg.org/
19. Szczechowiak, P., Oliveira, L., Scott, M., Collier, M., Dahab, R.: NanoECC: Testing the Limits of Elliptic Curve Cryptography in Sensor Networks. In: Verdone, R. (ed.) EWSN 2008. LNCS, vol. 4913, pp. 305–320. Springer, Heidelberg (2008)
20. Yoshitomi, M., Takagi, T., Kiyomoto, S., Tanaka, T.: Efficient Implementation of the Pairing on Mobile Phones using BREW. IEICE Transactions E91-D(5), 1330–1337 (2008)
21. Yoshitomi, M., Kiyomoto, S., Fukushima, K., Tanaka, T., Takagi, T.: Implementation of the Pairings on the Ordinary Elliptic Curves using BREW Mobile Phones. In: Symposium on Cryptography and Information Security, SCIS 2009, 3C2-2 (2009)
Introducing Mitigation Use Cases to Enhance the Scope of Test Cases

Lasse Harjumaa and Ilkka Tervonen
Dept. of Information Processing Science, University of Oulu, Oulu, Finland
{lasse.harjumaa,ilkka.tervonen}@oulu.fi
Abstract. Gathering security-related requirements and designing dependable software is difficult. Even though software security has become one of the main challenges of software development and security issues are increasingly taken into account in software companies, the security viewpoint is typically only loosely integrated into developers' routines and development processes. This paper presents results from an experiment where use case, misuse case and mitigation use case descriptions were used to generate test cases for the system. This helps integrate the security characteristics into the product already in the first phases of development. Defining the misuse cases and planning corresponding mitigations helps developers to build the security characteristics right into the product, because security is addressed throughout the development from the requirements phase to the testing phase. We suggest some enhancements to the misuse case approach to help developers identify security requirements more carefully. Furthermore, we present a procedure for generating test cases from the mitigations in order to ensure that security targets have been achieved. Results from our experiments indicate that the approach improves the process of producing relevant test cases.

Keywords: Software testing, software security, security testing, misuse cases, test cases.
1 Introduction

In recent years, security has gained considerable interest in software development. Software security has a great influence on all areas of modern life, and thus it should be taken seriously. For example, energy supply, health care and air traffic rely largely on software systems, and security problems in those systems could have a catastrophic impact on society and the well-being of people. Earlier research shows that most security vulnerabilities and failures in software are caused by mistakes in the design and implementation of the software [19, 24]. Design-level vulnerabilities are the most critical and severe flaws, and they are also very hard to handle [21]. Most software developers have not been trained to pay attention to security-related requirements [16]. Furthermore, commonly utilized software
I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 337–353, 2010. © Springer-Verlag Berlin Heidelberg 2010
development methods and tools do not directly support the detection, planning or modeling of security characteristics [4, 5, 16, 23]. Lack of an integral security approach is the main reason for security failures in software products. The separate methods that are available have not been studied exhaustively [17]. Due to this, security aspects in software development are typically checked rather late in the development cycle, if checked at all. Inadequate integration of security perspectives into the development process makes security assurance difficult and expensive [5]. If appropriate security methods were in place and well integrated into the development, security threats and risks could be tackled already in the early phases of development, making security assurance more effective and less expensive. Many security solutions focus on individual technical-level security defenses, such as encryption or firewalls. However, in most cases the main reason for security problems is inadequacies in system design. A number of methodologies have been introduced to incorporate the security aspect into the development process. A good overview of such methodologies can be found in [18, 22, 25, 28], for example. Wysopal et al. introduce the Secure Software Development Lifecycle (SSDL), which includes milestones for checking security quality at early stages of the development [29]. Villarroel et al. [26] state that a software development methodology comprising all aspects of security would be difficult to define and too complex for practical use. They suggest that techniques and models based on the most accepted modeling standards would probably be more successful. Mouratidis and Giorgini [18] define the research area of secure software engineering as a branch of research concerned with the development of secure software systems, which integrates security and software engineering.
They consider secure software engineering as an umbrella under which the areas of (among others) security requirements engineering, security modeling and secure software development lie. In this paper, we present the results of a study in which test cases are derived from model descriptions including use cases, misuse cases and accompanying mitigation use cases. The cases are described in the Unified Modeling Language (UML) notation, which is the most widely used modeling language in software development. This approach helps in focusing on security issues from the beginning of the development and in building security into the software. Furthermore, because UML is already widely in use, there is no need for excessive learning or new modeling tools. The mitigation use case approach is based on risk management practices, where a mitigation is always defined for each risk. We mainly focus on finding out whether documentation of mitigation use cases can help developers to understand the security aspect better and integrate it more tightly into test cases. Furthermore, we investigate whether an enhanced notation can help developers to produce more test cases that take security into account. Combining misuse cases, mitigation use cases and test cases provides a comprehensive view of the security concerns of a system. Our approach can be seen as a type of security testing [21]. The rest of the paper is organized as follows. Section two lists the related research. Section three provides an overview of two important background topics of the study: software security and misuse cases. Section four outlines the idea of mitigation use cases and section five presents the procedure for deriving test cases from use cases, misuses and mitigations. Section six describes the student experiment and initial
company case that we have carried out. Section seven discusses the validity of our research, and finally section eight summarizes the results of the research work.
2 Related Work

Opdahl and Sindre have recently published a comparison of attack trees and misuse cases for the identification of security threats. They found that attack trees are more effective in finding threats than misuse cases when pre-drawn use case descriptions are missing. However, the participants considered both techniques similarly easy to use. They also outline the need for experiments where participants get a partial threat specification, as in our case. [19] Jürjens [13] has presented an extension of UML to facilitate the development of secure software systems. The extension is called UMLsec, and its aim is to integrate security requirements analysis into the development process. Several benefits can be achieved when security issues are dealt with in UML. First, security becomes part of the standard design. UML also provides a modular and reusable approach for modeling security-related features and structures. Finally, it is possible to make use of tools that can handle UML. [13] UMLsec utilizes stereotypes to express security aspects in diagrams. A security goal tree is developed along with the system specification. Security requirements are captured using use case diagrams. Activity diagrams are utilized during analysis. Class diagrams, sequence diagrams, statechart diagrams and package diagrams present the actual design. Formal semantics can be used to ensure that certain security requirements or protocols have been followed [13]. During implementation, deployment diagrams are used to ensure that security requirements are met in the physical layer. [13] Tool support for the UML approach has been discussed in [14]. Application of the approach in distributed information systems development has been addressed in [7]. Lodderstedt et al. [15] have introduced a set of enhancements to UML called SecureUML for describing role-based access control policies in the overall application model.
The approach is integrated into a model-driven development process in order to deal with security issues from the beginning of the development. Pauli and Xu [20] have also presented an idea of integrating security-related requirements within use case descriptions. However, they do not address the issue of using use case models as a basis for software testing.
3 Software Security and Misuse Cases

Avizienis et al. [2] present a dependability concept tree, which defines dependability through a three-part terminology: threats to dependability, means to achieve dependability, and attributes of dependability. According to this categorization, threats are errors, faults and failures. Attributes of dependability include availability, reliability, safety, confidentiality, integrity and maintainability. The means by which dependability is attained are fault prevention, fault tolerance, fault removal and fault forecasting. Prevention and tolerance aspects are addressed mainly during system design and implementation, while fault forecasting requires data that is gathered from
an operational system. Fault removal can be carried out both during implementation and when the system is in use. [2] Dependability is a subjective value that is based on the needs of the stakeholders. Thus, specifying the security needs should go along with specifying functional requirements, which are often negotiated with stakeholders through use case modeling. For example, Wysopal et al. introduce attack use cases for documenting security requirements [29]. This may help to achieve more comprehensive security-oriented testing [29]. Specifying security-related requirements is a challenging task that most software developers have not been trained to do [16]. The requirements engineering approaches for security vary in premises, process details and notation. Security is a typical non-functional requirement. In some approaches it is defined as a quality attribute, and e.g. in the NFR framework and the i* framework the term softgoal is used. The NFR framework starts by treating security as one among many non-functional requirements. In the i* framework, security is treated as much as possible within the same notational and reasoning framework as other non-functional requirements [28]. Use cases are abstract episodes of interaction between a system and its environment [16]. A use case characterizes a way of using a system, or a dialog that a system and its environment may share as they interact. Thus a use case is a specification of a type of complete interaction between a system and one or more actors. A use case must be complete in the sense that it forms an entire and coherent transaction. For example, making a cash withdrawal at an ATM, placing a call on the telephone, or deleting a file from a file system are examples of complete interactions with various sorts of systems that would qualify as use cases. Alexander [1] defines misuse (or abuse) cases as a form of use cases that help document negative scenarios.
A misuse case is simply a use case from the point of view of an actor hostile to the system under design. Some misuse cases occur in highly specific situations, whereas others continually threaten systems. The authors of [16] characterize abuse cases as a specification that defines interactions resulting in actual harm. A complete abuse case defines an interaction between an actor and the system that results in harm to a resource associated with one of the actors, one of the stakeholders, or the system itself. Hope et al. [12] state that the most practical way to identify misuse cases is an expert brainstorming session, as theoretical methods are laborious and require the system to be completely specified. Sindre and Opdahl [22] suggest a stepwise method for defining appropriate use cases. The stages of their procedure are the following.

1. Normal users and the use cases describing the main functionality of the system are identified.
2. The possible misbehaved users and potential misuse cases are identified.
3. The relationships between misuse cases and normal use cases are identified. This is an important phase, because many security threats are related to normal operations carried out with the system.
4. The next phase is to introduce new use cases that can help in identifying and preventing misuse cases.
5. Finally, more detailed documentation of misuse cases can be composed.
Experts with different specialties can be utilized during these phases in order to gain an extensive view of the system and possible threats against it. End users can provide valuable input for the first two phases. The third and fourth phases are more technically oriented, and expertise in security issues is needed. The fifth phase is important, because solutions to security threats are difficult to describe with diagrams alone [22]. On the other hand, Alexander [1] notes that in some cases the name of a use case is adequate to point out shortcomings in requirements specifications. The actors in a misuse case model are the same kinds of external agents that participate in use cases. However, they should not be the same actors. If a human represented by an actor in a use case might also act maliciously in the corresponding role, then a new actor should be defined in the abuse case. Test cases that are derived from misuse cases have the same problems as other written software documentation. The requirements specifications may be incomplete, misuse case descriptions may not be adequately detailed, or cases may be missing completely. Some use cases can be ambiguous. Furthermore, documentation should be updated and checked for validity. [6] Because use cases are targeted at finding the appropriate functionality of a system during the analysis phase, and not at revealing errors, it may be difficult to find mistakes in the software based on them. The use case approach is the opposite of testing. While use cases are typically used to describe the correct operation of the system, testing tries to reveal as many errors as possible. Furthermore, testing always requires special expertise, and testing cannot be done by just anyone, even though simple use case descriptions may give such an impression. [6] One weakness of use case diagrams is that they describe only the functional requirements of the system, and test cases for non-functional requirements cannot be derived directly from them [27].
4 Mitigation Use Cases

Figure 1 illustrates modeling use cases, misuse cases and mitigation use cases in UML notation. The left-hand side of the figure shows a normal use case named StoreData and the right-hand side a misuse case that threatens the system, named InsertErroneousData. Thus, the figure depicts a situation where a hostile user tries to corrupt data in the system. BackupTheData is a mitigation use case for alleviating the consequences of possible misuse.
Fig. 1. Enhanced use case model
We use the extension point notation for connecting mitigation use cases to the actual use cases that need security enhancements. The benefit of this is that no extra elements need to be added to the UML notation. The <<extend>> relationship suits very well for modeling varying or additional functionality that is needed to ensure the security of the original functionality. Mitigation use cases should always be connected to actual use cases, because they are related to the functionality of the system. Mitigations that cannot be connected to the actual functionality may even be irrelevant for the system. Furthermore, the misuse cases should be categorized according to the dependability viewpoint they are mainly related to. This helps in identifying mitigation strategies and planning appropriate design solutions for certain types of misuses. For example, creating and selecting security patterns [9] for implementation is easier when the types of misuses have been thought about. The categorization is derived mainly from the Unified Model of Dependability (UMD) presented in [3]. The UMD allows stakeholders to define appropriate reactions to security issues for improving the dependability of a system. These reactions are classified into three groups: impact mitigation, recovery and occurrence reduction. [3] Based on the reactions that can be taken to mitigate possible security threats, we suggest the following three relationship types as an extension to misuse case diagrams. The <<mitigates>> relationship stereotype means actions that aim at reducing the impact of a threat or misuse. For example, providing an alternative path to requested services for the user or warning the user about possible security problems mitigates the impact of a misuse case. The <<recovers>> relationship stereotype means functionality and actions that help to recover from a security failure. For example, creating backups of important data enables restoring the system state after a crash. The <<prevents>> relationship stereotype means actions that try to prevent a security issue from happening or reduce the number of occurrences of the issue. For example, guarding a networked system with a firewall prevents attacks toward the system. We have thus identified three types of relationships between misuses and mitigation use cases: <<mitigates>>, <<recovers>> and <<prevents>>. For example, if an attacker tries to block a web server, the consequences can be alleviated by using a mirror server. An example of recovery is creating regular backups of data and restoring the system state after an attack. Closing insecure network connections is an example of a mitigation use case that tries to prevent the misuse entirely.
5 Deriving Test Cases from Mitigation Use Cases

When test cases are derived from misuse cases and mitigation actions, full misuse case descriptions and scenarios are needed, as they contain detailed information about the basic and alternative flows of events related to the misuse cases. The preconditions, inputs and environmental constraints of a test case can be found in the use case descriptions. Heumann [11] describes a three-step process for creating test cases from a fully detailed use case. First, a full set of use-case scenarios is generated for each use case. Second, at least one test case and the conditions that will make it execute are identified for each scenario. Third, the data values with which to test are identified for each test case. [11] The process of generating test cases from misuse
cases is quite similar to this. The following procedure can be used to derive security-oriented test cases from fully described misuse cases and mitigation use cases.

1. Ensure that you have relevant misuse cases and mitigation cases defined (mitigation cases are extensions of use cases).
2. Define the extension points in the use cases for each mitigation case.
3. Classify the relationship of each misuse case and corresponding mitigation case.
4. Create scenarios for each misuse case and mitigation case.
5. Create at least one test case for each scenario. Inputs, expected operation and outputs can be identified from the diagram. You can also use the classification done in step three to outline the test case. Vulnerability databases, organizational experience bases and personal expertise can be utilized to evaluate the accuracy of the descriptions.
The security-related functionality or enhancements of a software system can be tested by creating tests based on the mitigation use cases. As illustrated in Figure 2, test cases should be created for "both directions" from the mitigation use case; that is, one test case should be written from the <<extend>> relationship viewpoint, and another should address the mitigation relationship between the mitigation use case and the corresponding misuse case. It is possible that sometimes these two viewpoints result in a single combined test case.
Fig. 2. Creating test cases from the mitigation use case
When defining test steps for the <<extend>> relationship, steps in the test scenario can be identified quite easily from the actual use case that the mitigation use case extends. Test cases for the <<mitigates>>, <<recovers>> or <<prevents>> relationship of the same mitigation use case are generated from a slightly different viewpoint. Thus, one mitigation use case can initiate more than one separate test scenario. The purpose of testing the <<extend>> direction is to ensure that the security functionality, such as encryption or backup creation, exists, and testing the other direction, <<mitigates>>, <<recovers>> or <<prevents>>, should ensure that the result of the security-related functionality, such as encryption, is as expected. The relationship identifier gives an indication of what the result of the test should be. For example, a <<prevents>> relationship should deny the attacker's access to the system. Figure 3 presents a use case diagram based on the model provided by Pauli and Xu [20]. We have enhanced the model by adding extension points to the mitigation use case and defining the type of mitigation, which in this case is <<prevents>>. The diagram
presents a situation where an attacker tries to view an employee's payroll information. The solution is to encrypt the information and provide the appropriate encryption key only to authorized users. We illustrate the test case creation with two examples, which are derived from the Apply Cryptographic Function mitigation use case.
Fig. 3. An example use case diagram showing possible misuse and mitigation cases. (Modified from Pauli & Xu [20]).
Table 1 lists the test steps derived from the <<extend>> relationship between the mitigation use case and the actual use case, Enter Information in this case.

Table 1. Example test steps for the <<extend>> relationship

Step 1
  Description: Authentication to use the system is asked from the user.
  Result: Username and password are checked against the user database.
Step 2
  Description: User selects a function to add payroll data.
  Result: A form to fill in the user information appears.
Step 3
  Description: The user fills in the details.
  Result: Data in the form is checked for validity.
Step 4
  Description: The user requests to submit the information to the system.
  Result: Data is first encrypted and then transferred over the network and saved in the database.
Step 5
  Description: The validity of the data is checked.
  Result: The database performs an internal check of the validity of the data (e.g. checksum).
Step 6
  Description: The user is provided feedback about the operation.
  Result: Confirmation dialog is presented to the user.
Step 7
  Description: User logs out.
  Result: All network and database connections are closed.
Table 2 represents an example of a test case that can be derived from the <<prevents>> relationship between the mitigation use case and the corresponding misuse case. To summarize, our procedure provides a belt-and-braces approach to security testing. Test cases derived from the extension point of a use case aim at ensuring that the security mechanism is built into the system, and test cases derived from the mitigation relationship try to ensure that the implemented mechanism works correctly.
Table 2. Example test steps for the <<prevents>> relationship

Step 1
  Description: Filter network traffic to find data elements related to payroll information.
  Result: The attacker identifies addresses, ports and data elements that are used to present and transfer payroll information.
Step 2
  Description: Wait for somebody to log in to the system and enter payroll information.
  Result: The payroll data elements are extracted from the network traffic.
Step 3
  Description: Read the payroll information and change the data.
  Result: The data is not readable. An arbitrary change is made.
Step 4
  Description: The change is sent over the network.
  Result: The system checks the validity of the data (checksum) and rejects the request.
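The checksum validation in step 4 can be sketched with a keyed checksum (HMAC); this is a generic illustration of rejecting tampered data, not the system under test, and the key and payload are invented for the example:

```python
import hmac
import hashlib

KEY = b"server-side secret"  # illustrative key, not from the paper

def protect(payload: bytes) -> bytes:
    """Append a keyed checksum (HMAC-SHA256 tag) to the payload."""
    return payload + hmac.new(KEY, payload, hashlib.sha256).digest()

def accept(message: bytes) -> bool:
    """Recompute the checksum and reject any tampered message."""
    payload, tag = message[:-32], message[-32:]
    return hmac.compare_digest(tag, hmac.new(KEY, payload, hashlib.sha256).digest())

msg = protect(b"salary=4200")
tampered = bytes([msg[0] ^ 1]) + msg[1:]  # attacker flips one payload bit
assert accept(msg) and not accept(tampered)  # the arbitrary change is rejected
```

A keyed checksum is used here rather than a plain hash so that the attacker of step 3, who cannot read the key, also cannot recompute a valid tag for the modified data.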
6 Experiment and Initial Company Case

We have tested these ideas in a student experiment during a second-year software engineering course. Furthermore, we have tested the approach in a small software company that has strong expertise in software development, especially in testing. In the company, the approach was applied to a real-life project and models, but the trial was very limited in size: only a portion of the use cases of the system was used in the experiment. It is not possible to provide any statistical analysis based on the company case; however, some useful observations can be made. Data was gathered by interviewing the developers and testers who participated in the experiment. The use cases created for the student experiment were quite simple, so that the participants did not need much domain expertise. Figure 4 depicts the UML diagram that was created for the experiment. It is part of an imaginary web application that allows users to equip a car to their liking by selecting the model, color and other features of the car. There is one possible misuse identified: an attacker injecting malicious code into the input fields. Code injection is a typical security hazard in web-based applications.
Fig. 4. Use case diagram for the student experiment
The students taking the course were divided into two groups, A and B. Students were grouped randomly; there were 19 students in group A and 20 students in group B. Group A was presented the whole diagram depicted in Figure 4, and group B was presented a reduced version of the diagram including only the misuse case and use cases, without mitigation use cases. With this arrangement, we wanted to find out whether our approach of analyzing the misuse and modeling the mitigation use cases would help to identify better test cases for the system. Both groups were then given a test case template and asked to write acceptance test cases based on the diagram. Both groups were also given a step-by-step tutorial for deriving test cases from the use cases and misuse cases. The code injection hazard was also explained in the tutorial. There were two versions of the tutorial, so that the group that was also given the mitigation use cases was instructed to utilize the mitigation use cases in addition to the normal use cases and misuse cases. We were interested in two questions in the experiment: 1) Does the mitigation use case documentation and model help to identify more acceptance test cases? 2) Does the mitigation use case documentation and model help to create more comprehensive acceptance test cases? The main use case in the example is ConfigureCar, which is initiated when a buyer actor accesses the system. There are two extension points related to the configuration, the SelectModel and SelectColor use cases. A brief description of the ConfigureCar use case is provided in Table 3 in a typical use case template, listing the ID, brief description, primary actor, preconditions, main event flow and postconditions of the scenario.

Table 3. ConfigureCar use case

Use case:          ConfigureCar
ID:                uc-1
Brief description: The user wants to configure a car.
Primary actor:     Buyer
Preconditions:     The user has downloaded the car-configurator web page.
Main flow:         1. Extension point uc-1.1
                   2. Extension point uc-1.2
Postconditions:    The user is provided with the car price according to his/her selections.
Table 4. SelectModel use case
  Use case:          SelectModel
  ID:                uc-1.1
  Brief description: The user selects a model of a car.
  Primary actor:     Buyer
  Preconditions:     The user has proceeded to the model selection phase within the configurator.
  Main flow:         1. The user selects a car model from the list of available models.
                     2. The current price of the car is updated and saved in the session object.
Introducing Mitigation Use Cases to Enhance the Scope of Test Cases
Table 4 describes the SelectModel use case, which extends the ConfigureCar use case. The SelectColor use case is very similar to the SelectModel use case. The description of the InjectMaliciousScript misuse case included in the experiment model is listed in Table 5. The template used to document the misuse case is similar to the template of an ordinary use case.

Table 5. InjectMaliciousScript misuse case
  Misuse case:       InjectMaliciousScript
  ID:                muc-1
  Brief description: The system is interrupted by entering hostile code or script as an input.
  Primary actor:     Attacker
  Preconditions:     The information for the configurator script is provided with the URL.
  Main flow:         1. The attacker identifies a vulnerable parameter in a web page script.
                     2. The attacker executes a hostile script through the vulnerable page by providing malicious parameter information.
                     3. As a response, the attacker receives sensitive information. (A possible attack scenario can be found, e.g., at http://en.wikipedia.org/wiki/Code_injection)
  Postconditions:    Worst case scenario: the attacker takes control of the server or obtains account information of the users of the server.
Table 6 provides the description of one of the two mitigation use cases in the example, ValidateInput. The other mitigation use case, ValidateOutput, is very similar. The mitigation use case template differs slightly from the ordinary use case template. Actor information is not needed, as mitigations are always connected to some actual use case. The priority of a mitigation use case can be derived from the misuse case description, where the risk exposure measure related to the misuse case should be calculated. Risk exposure is usually calculated with the formula RE = P x C, where P represents the probability of the risk and C represents the costs caused by the risk [10]. The higher the risk exposure, the higher the priority of the mitigation use case should be. Finally, the CVE (Common Vulnerabilities and Exposures) and CWE (Common Weakness Enumeration) databases can provide practical guidelines for implementing the main flow, which is why we have included a link to the relevant CWE entry in the template. In this case, CWE-94 refers to the Failure to Control Generation of Code ('Code Injection') weakness.

Table 6. ValidateInput mitigation use case
  Mitigation use case: ValidateInput
  ID:                  miti-1
  Brief description:   Ensure that the given input does not contain malicious code.
  Priority:            High
  Preconditions:       The user requests a configurator web page with a specific input.
  Link to CWE:         CWE-94
  Main flow:           1. Remove or escape dangerous characters from the input string.
                       2. Check that the input string is within the allowed set of input strings (e.g. "red", "green", etc.).
                       3. Use default values if the input string does not conform to the rules.
  Postconditions:      The input string is valid and "cleaned".
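The three steps of the main flow translate almost directly into code. The following sketch is only illustrative: the color whitelist, the default value and the function name are our own assumptions, not part of the studied system.

```python
# Illustrative sketch of the ValidateInput main flow (steps 1-3 above).
# The allowed set, the default value and the function name are assumptions.
ALLOWED_COLORS = {"red", "green", "blue"}
DEFAULT_COLOR = "red"

def validate_color(raw: str) -> str:
    # 1. Remove dangerous characters: keep only alphanumerics.
    cleaned = "".join(ch for ch in raw if ch.isalnum()).lower()
    # 2. Check that the input is within the allowed set of input strings.
    if cleaned in ALLOWED_COLORS:
        return cleaned
    # 3. Fall back to a default value if the input does not conform to the rules.
    return DEFAULT_COLOR

print(validate_color("Green"))                      # benign input passes through
print(validate_color("<script>alert(1)</script>"))  # hostile input is replaced
```

Step 2, the whitelist check, carries most of the weight here; escaping alone is easy to get wrong, which is why the mitigation combines all three steps.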
The CWE dictionary summarizes this weakness as follows: "The product does not sufficiently filter code (control-plane) syntax from user-controlled input (data plane) when that input is used within code that the product generates." (http://cwe.mitre.org/) [8]
After reading through the material and creating the test cases, the participants of the experiment were asked to answer a short questionnaire about their perceptions of the method. We take a closer look at the following questions:

Q1. I was already familiar with use case diagrams.
Q2. I was already familiar with misuse cases.
Q3. Why was it difficult to create test cases?
  Q3.1 The descriptions were inadequate.
  Q3.2 The descriptions were too abstract.
  Q3.3 The procedure of deriving test cases was unclear.
  Q3.4 I was unfamiliar with the domain.
  Q3.5 It was difficult to understand how security issues are related to the system.

Questions 1 and 2 concerned the participants' previous knowledge of use case and misuse case modeling. The third question concerned the obstacles that participants perceived during the experiment. The "descriptions" mentioned in the questionnaire refer to the instructions, use case diagrams and templates given to the students. Group A received the additional mitigation use case material, while group B received the basic material. All questions were answered on a 5-point Likert scale, where 1 meant "strongly disagree" and 5 meant "strongly agree". In addition to the questionnaire, the results that the participants achieved were assessed: for each participant, the number of generated test cases and the quality of each case were recorded. The instructors of the course evaluated the quality of the test cases, assessing the descriptions based on their level of detail, their rationality and the extent to which they addressed security issues. We use a simple t-test to analyze whether there are differences between the two groups. The results are summarized in Table 7.
Table 7. T-test results

  Variable               Group A mean   Group B mean   p-value
  Q1                     4.10           3.55           0.2069
  Q2                     2.84           2.05           0.03639
  Q3.1                   2.50           3.25           0.04039
  Q3.2                   2.44           3.45           0.0007778
  Q3.3                   3.61           3.40           0.5562
  Q3.4                   2.33           3.05           0.08162
  Q3.5                   2.50           3.10           0.1715
  Grade                  3.26           2.25           0.01064
  Number of test cases   2.47           1.15           0.000012
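As a side note for readers reproducing this kind of analysis, the two-sample t statistic underlying such a comparison can be computed in a few lines. The data below is made up for illustration (it is not the study's raw data), and only the t statistic is computed; a t-distribution table or a statistics library is still needed for the p-value.

```python
from math import sqrt
from statistics import mean, variance

def t_statistic(a, b):
    """Two-sample Student's t statistic with pooled variance (equal-variance assumption)."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se

# Hypothetical 5-point Likert answers for two small groups (illustrative only).
group_a = [4, 3, 4, 3, 3]
group_b = [2, 3, 2, 2, 1]
print(t_statistic(group_a, group_b))  # positive values favour group A
```

With the degrees of freedom na + nb - 2, the statistic is then compared against the t distribution to obtain the p-values reported above.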
We use a p-value of 0.05 as the threshold for the test results to be statistically significant. The null hypothesis for each question is that there is no difference between the groups. Regarding the questionnaire, the p-value is lower than 0.05, so the null hypothesis has to be rejected, for questions Q2, Q3.1 and Q3.2. We first look at questions Q3.1 and Q3.2. It seems that the mitigation use case descriptions and tutorial were perceived as helpful by the participants, and those who did not have the
mitigation use case descriptions felt that the material was too abstract for creating test cases. Based on these results, we can state that modeling and documenting the mitigation use cases is beneficial and improves the concreteness of the test case derivation process. Regarding questions Q3.3 to Q3.5, there is no statistically significant difference in how the participants perceived the process of deriving test cases. Thus, understanding the procedure of deriving test cases, the application domain and the security threats of the system were not affected by the additional material. Grades gained from the assignment were significantly better in group A, which had the additional material. Furthermore, those who had the mitigation use case material (group A) produced significantly more test cases than those who had only the use case and misuse case material (group B). The left side of Figure 5 shows the grades of the students in both groups. The most typical grades in group A are 3 and 4, while a student in group B typically scored 2 or 3 on the five-point scale.
Fig. 5. Grades and number of produced test cases by group
Figure 5 also shows the number of test cases produced by students in each group. A student in group A typically produced two or three test cases, while most students in group B produced only one test case. Even though students were not purposely grouped by their previous experience, it seems that students in group A were more familiar with misuse cases. Thus, we also have to analyze how this previous knowledge affects the experiment results. We recategorized the participants into two groups: those who were already familiar with the misuse case approach (answered 3-5 to Q2) and those who were not familiar with misuse cases (answered 1 or 2 to Q2). 15 participants fell into the "previously familiar with misuse cases" group and 24 participants into the "not previously familiar with misuse cases" group. The results of the t-test carried out to compare these reorganized groups are summarized in Table 8; we look only at the relevant variables Q3.1-Q3.5, the grades and the number of test cases created. Only the variable describing the number of test cases created by the participants has a p-value lower than 0.05. Thus, we can state that the mitigation use case documentation helps in creating better-quality test cases, and that participants of the experiment perceived the test case generation task as more concrete if
they had the mitigation use case descriptions. However, we cannot distinguish whether it was the mitigation use cases or previous knowledge of the misuse case approach that helped to create more test cases.

Table 8. Does previous knowledge affect the results?

  Variable               Mean, "not familiar" group   Mean, "familiar with" group   p-value
  Q3.1                   2.75                         3.14                          0.2920
  Q3.2                   3.13                         2.71                          0.1699
  Q3.3                   3.42                         3.64                          0.4887
  Q3.4                   2.88                         2.43                          0.2914
  Q3.5                   2.96                         2.57                          0.3979
  Grade                  2.50                         3.13                          0.1336
  Number of test cases   1.50                         2.26                          0.01311
In addition to the student experiment described above, feedback about our approach was collected from an initial company experiment carried out in a small software development organization specialized in testing. The company provides testing services to several customer companies. The system selected for the experiment was a student information system developed for a British university. The case showed that creating the misuse cases and mitigation use cases, and testing them, should be started already in the first phases of the development cycle, when changes are easier to introduce. This also helps in defining the mitigation actions for preventing malicious use. As a positive side-effect, modelling the misuse cases brought up some useful ideas for further development of the system: in our company case, some missing functionality regarding the mitigation actions was identified and added to the system. Some challenges were also recognized. According to one software tester, "creating ample misuse cases is difficult and requires, in addition to the theory, exceptionally deep understanding of the system." The person responsible for describing misuse cases must be familiar with security issues. Another tester stated: "in path testing, one must evaluate how many possible paths exist. Even if one misuse case and a path that a hostile user uses were found, how can we ensure that the other paths are not vulnerable to misuse? Investigating this requires time and resources and may delay the work quite a bit." If the mitigation actions defined in the use case diagrams have not been implemented, carrying out path and scenario testing will be difficult or impossible, because misuse case descriptions are not detailed enough to include all possible paths that a hostile user may take. The results from our student experiment show that mitigation use cases provide more information for deriving test cases and for managing design-level vulnerabilities.
7 Threats to Validity

We believe that the results of our study strongly suggest that modeling the mitigation use cases alongside regular use cases improves the process of generating test cases from the models. However, we recognize certain uncertainties in our study.
First, the participants of the experiment were students, which does not correspond to a real industrial setting. On the other hand, it would be very difficult to carry out comparative experiments in real-life projects. Furthermore, the sample size was quite small, and the results could have been more convincing with a larger number of participants. Second, we could not do any formal analysis of the feedback gathered from the case company. Because the data was collected in free form, it would be difficult to replicate the company experiment; for that reason, we have presented only a very limited portion of it. Clearly, more empirical experimentation is needed to fully justify our approach. We have already carried out a further student experiment (which has not been analyzed yet) and we are planning to carry out larger real-life modeling cases in companies.
8 Conclusion

We have proposed a new approach for modeling possible mitigation activities related to misuse cases. We suggest using extension points in UML diagrams for connecting mitigations to the actual use cases, and defining stereotyped relationships between misuse cases and mitigation use cases for guiding the test case creation. When the modeling has been done carefully, test cases for the security-related functionalities can be identified quite easily. We suggest defining and documenting mitigation use case models alongside misuse cases in the requirements specification phase of a system. According to the results of our experiments, the belt-and-braces approach provided by the mitigation use cases improves the concreteness of the process of generating test cases. Mitigation use case descriptions may help in identifying the possible paths that a hostile user may take. Modeling and documenting the mitigation use cases alongside use cases and misuse cases helps to gain a better understanding of the security risks related to the system, to take into account all stakeholders' viewpoints and to prevent design-level security flaws. Considering security issues from the early stages of development helps to identify and repair possible security vulnerabilities before the software product is released. The security aspect cannot just be glued onto the underlying development process; the whole software quality assurance process needs to be adjusted in order to efficiently address security issues. Further research should address the integration of security aspects and mitigation planning into specific verification and validation techniques, such as software inspection.
References

[1] Alexander, I.: Misuse Cases: Use Cases With Hostile Intent. IEEE Software 20(1), 58–66 (2003)
[2] Avizienis, A., Laprie, J.C., Randell, B.: Fundamental Concepts of Dependability. In: Okamoto, E., Pieprzyk, J.P., Seberry, J. (eds.) ISW 2000. LNCS, vol. 1975, pp. 1–6. Springer, Heidelberg (2000)
[3] Basili, V., Donzelli, P., Asgari, S.: A Unified Model of Dependability: Capturing Dependability in Context. IEEE Software 21(6), 19–25 (2004)
[4] Baskerville, R.: The Developmental Duality of Information Systems Security. Journal of Management Systems 4(1), 1–12 (1992)
[5] Baskerville, R.: Information Systems Security Design Methods: Implications for Information Systems Development. ACM Computing Surveys 25(4), 375–414 (1993)
[6] Berger, B.: The Dangers of Use Cases Employed as Test Cases. In: STAR West Conference (2001), http://www.testassured.com/docs/Dangers.htm (referenced 23.11.2007)
[7] Best, B., Jürjens, J.: Model-based Security Engineering of Distributed Information Systems using UMLsec. In: Proceedings of the 29th International Conference on Software Engineering, pp. 581–590 (2007)
[8] Common Vulnerabilities and Exposures. The Standard for Information Security Vulnerability Names (2007), http://cve.mitre.org/ (referenced 12.9.2007)
[9] Hafiz, M., Adamczyk, P., Johnson, R.E.: Organizing Security Patterns. IEEE Software, 52–60 (July/August 2007)
[10] Hall, E.M.: Managing Risk: Methods for Software Systems Development. Addison-Wesley, Reading (1998)
[11] Heumann, J.: Generating Test Cases from Use Cases. Journal of Software Testing Professionals 3(2) (2002)
[12] Hope, P., McGraw, G., Anton, A.I.: Misuse and Abuse Cases: Getting Past the Positive. IEEE Security & Privacy 2(3), 90–92 (2004)
[13] Jürjens, J.: Using UMLsec and Goal Trees for Secure Systems Development. In: Proceedings of the 2002 ACM Symposium on Applied Computing (SAC), pp. 1026–1030 (2002)
[14] Jürjens, J.: Sound Methods and Effective Tools for Model-based Security Engineering with UML. In: Proceedings of the 27th International Conference on Software Engineering, pp. 322–331 (2005)
[15] Lodderstedt, T., Basin, D., Doser, J.: SecureUML: A UML-Based Modeling Language for Model-Driven Security. In: Jézéquel, J.-M., Hussmann, H., Cook, S. (eds.) UML 2002. LNCS, vol. 2460, pp. 426–441. Springer, Heidelberg (2002)
[16] McDermott, J., Fox, C.: Using Abuse Case Models for Security Requirements Analysis. In: Proceedings of the 15th Annual Computer Security Applications Conference, pp. 55–64 (1999)
[17] Mead, N.R.: Identifying Security Requirements Using the Security Quality Requirements Engineering (SQUARE) Method. In: Mouratidis, H., Giorgini, P. (eds.) Integrating Security and Software Engineering: Advances and Future Visions. IDEA Group Publishing, London (2007)
[18] Mouratidis, H., Giorgini, P.: Integrating Security and Software Engineering: An Introduction. In: Mouratidis, H., Giorgini, P. (eds.) Integrating Security and Software Engineering: Advances and Future Visions. IDEA Group Publishing, London (2007)
[19] Opdahl, A.L., Sindre, G.: Experimental Comparison of Attack Trees and Misuse Cases for Security Threat Identification. Information and Software Technology 51(5), 916–932 (2009)
[20] Pauli, J., Xu, D.: Integrating Functional and Security Requirements with Use Case Decomposition. In: Proceedings of the 11th International Conference on Engineering of Complex Computer Systems, pp. 57–66 (2006)
[21] Potter, B., McGraw, G.: Software Security Testing. IEEE Security & Privacy 2(5), 81–85 (2004)
[22] Sindre, G., Opdahl, A.L.: Eliciting Security Requirements by Misuse Cases. In: Proceedings of the 37th International Conference on Technology of Object-Oriented Languages and Systems, pp. 120–131 (2000)
[23] Siponen, M., Heikka, J.: Do Secure Information System Design Methods Provide Adequate Modeling Support? Information and Software Technology 50(9-10), 1035–1053 (2008)
[24] Tøndel, I., Jaatun, M., Meland, P.: Security Requirements for the Rest of Us: A Survey. IEEE Software 25(1), 20–27 (2008)
[25] Viega, J., McGraw, G.: Building Secure Software: How to Avoid Security Problems the Right Way. Addison-Wesley, Boston (2004)
[26] Villarroel, R., Fernández-Medina, E., Piattini, M.: Secure Information Systems Development: A Survey and Comparison. Computers & Security 24(4), 308–321 (2005)
[27] Weiss, M.: Modelling Security Patterns using NFR Analysis. In: Mouratidis, H., Giorgini, P. (eds.) Integrating Security and Software Engineering: Advances and Future Visions. IDEA Group Publishing, London (2007)
[28] Wood, D., Reis, J.: Use Case Derived Test Cases. Harris Corporation. In: STAREAST Software Quality Engineering Conference (1999)
[29] Wysopal, C., Nelson, L., Dai Zovi, D., Dustin, E.: The Art of Software Security Testing. Addison-Wesley, Reading (2007)
Optimal Adversary Behavior for the Serial Model of Financial Attack Trees

Margus Niitsoo

University of Tartu, Liivi 2, 50409 Tartu, Estonia
Cybernetica AS, Akadeemia tee 21, 12618 Tallinn, Estonia
[email protected]

Abstract. Attack tree analysis is used to estimate different parameters of general security threats based on information available for atomic subthreats. We focus on estimating the expected gains of an adversary based on both the cost and the likelihood of the subthreats. Such a multi-parameter analysis is considerably more complicated than separate probability or skill-level estimation, requiring exponential time in general. However, this paper shows that under reasonable assumptions a completely different type of optimal substructure exists, which can be harnessed into a linear-time algorithm for optimal gains estimation. More concretely, we use a decision-theoretic framework in which a rational adversary sequentially considers and performs the available attacks. The assumption of rationality serves as an upper bound, as any irrational behavior only hurts the adversary's own end result. We show that if the attacker considers the attacks in a goal-oriented way, his optimal expected gains can be computed in linear time. Our model places the fewest restrictions on adversarial behavior of all known attack tree models that analyze the economic viability of an attack and, as such, provides the best efficiently computable estimate for the potential reward.
1 Introduction
Assessing the security of a computer system has become an increasingly important problem over the past decade due to the widespread use of computer communications. To date, many interesting graph-based solutions have been proposed (see [1] for a review) that try to address the problem of threat estimation. Although the approaches vary greatly in their scope and methodologies, most of them have two things in common. First, they tend to concentrate on the technical aspects of computer systems, such as network topology and software flaws. Secondly, the emphasis is usually placed on finding and describing the possible attack vectors, giving less thought to the analysis of attack feasibility or likelihood. For instance, [2,3] concentrate on semi-automatic generation of attack
Supported by Estonian SF grant no. 6944 and the Tiger University Program of the Estonian Information Technology Foundation. This research was additionally supported by the European Regional Development Fund through the Estonian Center of Excellence in Computer Science, EXCS.
I. Echizen, N. Kunihiro, and R. Sasaki (Eds.): IWSEC 2010, LNCS 6434, pp. 354–370, 2010. © Springer-Verlag Berlin Heidelberg 2010
graphs and then only do rather simple path-enumeration and cut-set analysis on what they have generated. Both approaches have since been developed further but the emphasis still seems to remain on fast and accurate graph generation. The emphasis on technical aspects such as software vulnerabilities might not be the most relevant in terms of estimating actual security, especially considering the increasing number of successful social engineering attacks, which are usually much simpler, yet often just as effective, as clever technical hacking can be. Another flaw with the emphasis on technical aspects is that the assumptions about adversarial behavior are often overly simplistic. For instance, it is generally assumed that a system is insecure as soon as a possible means of penetration is found. In reality, knowing that there exists a way to penetrate the system is akin to knowing that it is possible to pick the lock on the front door of a house. However, it is obvious that whether the house will actually be burgled depends much more on whether there is something valuable inside, how high the probability of getting caught is and what the penalties for breaking and entering are. Just considering the quality of the lock on the door or the strength of the bars on the windows will definitely give some information about the security, but it might not tell much about the actual likelihood of an attack. This means that although the analysis of technical level vulnerability is of importance, it cannot provide for the complete analysis of security. It is also important to develop models that work on a higher level of abstraction and allow the integration of social engineering attacks alongside the more technical possibilities. It would also make more sense to concentrate on the incentives and possibilities available to the adversary and try to analyze his behavior rather than simply determining whether the attack is possible or not. 
The attack tree (also called threat tree) approach to security evaluation is not a recent development, but has roots that reach back several decades to fault tree analysis (see [4]). The approach was introduced for the study of information system security by Weiss [5] and made popular in that context by Bruce Schneier [6]. We refer to [7,8] for good overviews of the development and application of the methodology. Initial work concentrated on estimating just one parameter at a time: the cost, the feasibility and the skill level required for the attack have all been independently considered for analysis by different authors [9,6,10]. There is even a software package [11] on the market for performing such analyses¹. Buldas et al. [12] introduced a multi-parameter game-theoretic model which allowed estimation of the expected utility of the attacker with only a linear amount of computation. The model was later used in practice by Buldas and Mägi [13] to evaluate the security of several e-voting schemes in use at that time. However, the model of [12] was somewhat ad hoc and turned out to be theoretically unsound. This was noticed by Jürgenson and Willemson [14], who in turn proposed a modification that resulted in a sound model for parallel adversary behavior, in which the adversary has to attempt all the attacks at
¹ We note that although multiple parameters are used by the software package, one of the parameters is used just for tree pruning and, as such, their computational analysis is still single-parameter in essence.
the same time in parallel. However, it seems that exponential running time is required to determine the maximal possible expected utility of the adversary, which makes the model impractical for all but the smallest attack trees. Jürgenson and Willemson went on to consider a model of serial attacks [15], in which the adversary performs the attacks in a prescribed order and has full information about the results of the previous attacks. For that model they provide a quadratic-time algorithm for calculating the expected utility of the adversary. Their model is sound, but only under a rather odd assumption about adversarial behavior: they (implicitly) assume that the adversary always performs an elementary attack whenever doing so increases the chance of materializing the primary threat. They also propose a model which tries to remedy that problem by considering all the subtrees of the attack instead of just the full tree, but this forces their algorithm into exponential running time and is still not guaranteed to give optimal results in the game-theoretic sense. We start out with a systematic decision-theoretic approach to adversarial behavior analysis and show that there actually exists a linear-time algorithm for computing the maximal expected outcome for an exponential-sized family of attack orders. Our model is also one of the least restrictive in terms of assumptions about adversarial behavior. This means that the expected utility of the adversary computed by our proposed algorithm is the highest among the currently known efficient computation methods and, as such, can serve as an upper bound for all of them.
2 Attack Trees
In security analysis, one is interested in estimating the parameters of some large primary threat. The adversary usually has many ways of materializing the primary threat, and it is often possible to break it down into smaller parts such that either all or at least one of the parts need to be realized for the main attack to succeed. As many of these parts can then also be broken down in a similar manner, a tree structure is formed with increasingly simpler attack scenarios at the nodes. If one continues decomposing the threats, one is bound to eventually reach a point where the attack considered is simple enough that its parameters can be directly estimated. When this happens, the threat is not decomposed further, but is left as a leaf of the tree and called elementary. When all the branches have been stretched out to that level of detail, the process is stopped and the result is called the attack tree corresponding to the primary threat at its root. A small example of an attack tree is depicted in Figure 1. For a given attack tree, we denote the set of all its leaf elementary attacks by X = {X1, ..., Xn}. The internal nodes of the tree, which describe the primary threat and its non-elementary subgoals, can be of two distinct types. If all of the parts of a subgoal need to succeed for the subgoal to be considered successful, the node is called an AND-node (∧-node). If just one successful part is enough for the subgoal to be successful, it is called an OR-node (∨-node). These two node types are usually sufficient to describe all the possible variations of the attacks.
Fig. 1. An example attack tree: the root ∧-node "Decrypt company secrets" requires both ∨-subgoals "Obtain encrypted file" (X1 "Bribe the sysadmin" or X2 "Hack into the database") and "Obtain the password" (X3 "Brute force the password" or X4 "Use a keylogger")
This means that an attack tree is usually just an AND-OR tree in which the primary threat is realized precisely when the root of the AND-OR tree is satisfied. The problem can also be expressed in terms of Boolean functions F on the elementary attacks, as the AND-OR tree can easily be converted into a (monotone) Boolean formula of a certain simple form. For example, the attack tree in Figure 1 corresponds to the Boolean formula F(X1, X2, X3, X4) = (X1 ∨ X2) ∧ (X3 ∨ X4). Attention is usually restricted to trees because they provide for a model that is simpler both computationally and in construction (for a specific security threat). In our case, however, it makes sense to talk about more general attack scenarios where F can be any Boolean function. These scenarios can still be analyzed in different ways depending on which parameters of the attack are taken into consideration and what assumptions are made about adversarial behavior. Our approach follows that of Jürgenson and Willemson [16], who simplify the model of Buldas et al. [12]. Their model has just two parameters for each elementary attack Xi:

1. the probability pi of the attack succeeding;
2. the expected expenses ei of attempting the attack.

Additionally, there is one global parameter g that denotes the expected utility that materializing the primary threat has for the adversary. We note that the expenses are meant to include not only the immediate expenses of performing the attack but also the expected cost of possible penalties that need to be paid if the attacker is caught in the act; how this value can be calculated is described in detail in [14]. We also note that it is reasonable to assume that the adversary does not get paid for attempting the attacks, i.e., the expenses are all non-negative. It is also natural to assume that g is positive, for otherwise the attacker would not be motivated to even attempt the attack.
These parameters allow one to carry out an economic analysis in which it is assumed that the adversary tries to maximize his expected outcome.
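The AND-OR semantics described above can be sketched in a few lines of code. The encoding below is our own illustration; it hard-codes the tree of Figure 1, i.e., F(X1, X2, X3, X4) = (X1 ∨ X2) ∧ (X3 ∨ X4), and checks whether the primary threat is realized for a given set of successful elementary attacks.

```python
# An attack tree as a nested AND-OR structure; leaves carry the (0-based)
# index of the elementary attack X_i. This encoding is an assumption for
# illustration, not notation from the paper.
def realized(tree, outcomes):
    """Is the (sub)goal realized, given which elementary attacks succeeded?"""
    kind, payload = tree
    if kind == "leaf":
        return outcomes[payload]  # payload is the index of X_i
    results = [realized(child, outcomes) for child in payload]
    return all(results) if kind == "AND" else any(results)

# F(X1, X2, X3, X4) = (X1 or X2) and (X3 or X4), the tree of Figure 1.
fig1 = ("AND", [("OR", [("leaf", 0), ("leaf", 1)]),
                ("OR", [("leaf", 2), ("leaf", 3)])])

print(realized(fig1, [True, False, False, True]))   # bribe + keylogger succeed
print(realized(fig1, [True, True, False, False]))   # file obtained, no password
```

Since F is monotone, making any failed attack succeed can never turn a realized primary threat into an unrealized one, which is what lets the later analysis reason about subsets of attempted attacks.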
In the serial model of attack, the adversary is assumed to try the elementary attacks one at a time, in such a way that he always knows what has happened in the previous attacks. This may lead to the adversary not attempting an attack, for instance when it would no longer have any impact on the final outcome, i.e., performing it would not increase the probability of winning g. However, whether this is the case or not largely depends on the order of the elementary attacks. In real life, the elementary attacks often have a natural order imposed on them by practical restrictions. For example, breaking into the company safe requires that the culprit first break into the office itself, so the latter clearly has to precede the former in the order of attacks. For simplicity, we indeed assume that the order of the attacks has been fixed beforehand and that the adversary has to consider the attacks in exactly the order they are given, starting with X1 and ending with Xn. Figure 2 depicts two different possible orders for the example attack tree of Figure 1.
Fig. 2. Two different orderings of the tree in Figure 1
Although we assume that the order of consideration is fixed for the attacks, the adversary still has a choice for each elementary attack of whether to attempt it or not. If the past outcomes imply that performing the attack does not increase the likelihood of the main threat succeeding, one can definitely gain by choosing not to attempt the attack. This was the intuition behind the model of Jürgenson and Willemson [15]. The model they propose has the adversary always attempting an attack whenever it increases the probability of materializing the main threat. They show that their model is sound in the sense of Mauw and Oostdijk [10] and that the optimal adversary behavior can be computed in O(n²) time for the tree model for all the possible orders.
However, their model suffers from one critical flaw. Namely, it may sometimes happen that an attack costs more than it is worth, in the sense that the probability increase of the main threat is so small that it does not outweigh the cost of performing the given elementary attack. Jürgenson and Willemson attempt to fix this flaw by considering all the possible subsets of elementary attacks and looking for the optimal subset. This does not really solve the problem, however, as it may still be rational to perform the attack for one past history while for another it is not (see the full version of this paper [17] for an example). The
Optimal Adversary Behavior for the Serial Model of Financial Attack Trees
359
only possible conclusion is that there exists a behavior strategy for the adversary that is strictly better than any describable by the serial model of Jürgenson and Willemson.
3 Our Model
To overcome the weaknesses of the previous serial model, we approach the problem from the classical viewpoint of decision theory (see [18] for an introduction). We assume the adversary to be a fully rational² expected utility maximizer who can choose whether to attempt each elementary attack Xi ∈ X in their given order. Additionally, we assume full information about the past, so that when considering Xi the adversary is assumed to have information about his past attack decisions and their results. We are interested in the optimal decision strategy for the adversary and in how much he could expect to gain by following it.
The strategy consists of a series of prescriptions on how to behave: one for each attack for each of its possible past series of events. This means that the adversary can choose to behave in different ways for different past outcomes. The strategy is considered optimal when it has the highest possible utility of all the possible decision strategies. This guarantees that the resulting behavior will produce the highest expected outcome of all the possible behaviors that only rely on the information accounted for within the model. Among other things, it allows the adversary to refuse an attack when it is simply too expensive. It is also fairly easy to show that these semantics are consistent in the sense of Mauw and Oostdijk [10].
The classical formalization for serial decision problems is via decision trees. Decision trees, like attack trees, have only two types of internal nodes and one type of leaves. The internal nodes are either decision nodes (depicted as squares), in which the attacker can choose the outcome, or chance nodes (depicted as circles), in which the outcome is chosen randomly according to a given distribution. The leaves are called utility nodes (depicted as diamonds) and they determine how much the current sequence of decisions and chance events is worth to the adversary.
The decision strategy essentially consists of a prescription of what to choose at each decision node. When solving the tree, we are looking for a strategy that maximizes the expected utility. An example of a decision tree is depicted in Figure 3.
It is quite straightforward to draw out the decision tree for adversarial behavior in the serial model described above for any Boolean function F. For example, the decision trees for F(X1, X2) have the form depicted in Figure 3. Finding the utilities at the leaves is also quite simple: the positive utility is either g if F was true or 0 if it was false, and from that one just needs to subtract the expenses incurred during the attack, which depend on the exact decisions made and their outcomes.
Since the decision trees corresponding to the attack trees (or any attack scenarios describable by Boolean formulae, for that matter) are always binary, we
² This provides an upper bound, as an irrational adversary can do no better. This is the generally accepted justification for considering rational actors in decision theory.
Fig. 3. A decision tree for F(X1, X2)
adopt the following convention when depicting them in figures. We always assume that the upper arrow corresponds to the positive answer (attack is attempted, attack succeeds) and that the lower arrow corresponds to the negative answer (attack is not attempted or fails). Additionally, since the chance nodes have just two outputs, the distribution for them is uniquely determined by the probability of the positive answer alone, and we write that value inside the chance node. For utility nodes, we write their utility inside the diamond symbol whenever possible. To differentiate between decision and attack trees visually, decision trees are depicted with the root on the left while attack trees have the root on top.
The standard algorithm for computing the expected value of decision trees is actually very straightforward. It takes an internal node for which the optimal expected outcomes of the children are known and computes the optimal expected outcome for that node as well. This is done by either taking the maximum of the utilities of the children if it is a decision node, or taking the weighted average of the outcomes of the children according to the given distribution if it is a chance node. The process is repeated until the expected maximal outcome for the root node is computed, which is then taken as the expected maximal outcome of the whole tree. As we assume the adversary to be a rational utility maximizer, this is the most natural way to define his optimal gains.
The decision tree approach provides us with a convenient, straightforward and theoretically sound formalization of adversarial behavior. Regrettably, the trees are generally of exponential size in the number of attacks, so for practical attack scenarios it is infeasible to even draw them out in full. When assuming that the attacks can be expressed as Boolean functions, however, the decision trees can often be greatly simplified and the optimal strategy can be found in polynomial time.
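The backward-induction algorithm just described can be sketched as a recursion over attack histories. The code below is our illustration (not from the paper) of the serial model with full information: a skipped attack and a failed attack are treated identically, since both leave the corresponding variable false. It is exponential in n, matching the size of the full decision tree:

```python
# Backward induction over the full decision tree: at each decision
# node take the max over {skip, attempt}, at each chance node the
# probability-weighted average of the outcomes.
def solve_serial(F, p, e, g, n):
    def value(i, history):
        if i > n:                       # leaf: utility g if F holds, else 0
            return g if F(history) else 0.0
        skip = value(i + 1, history + (False,))
        attempt = (-e[i]
                   + p[i] * value(i + 1, history + (True,))
                   + (1 - p[i]) * value(i + 1, history + (False,)))
        return max(skip, attempt)
    return value(1, ())

# Example: F = (X1 or X2) and (X3 or X4) with illustrative parameters.
F = lambda h: (h[0] or h[1]) and (h[2] or h[3])
p = {i: 0.5 for i in range(1, 5)}
e = {i: 10.0 for i in range(1, 5)}
print(solve_serial(F, p, e, 100.0, 4))  # 30.0
```

For these illustrative numbers the optimal strategy is worth 30: notably more than attempting everything unconditionally, since the adversary stops whenever the remaining attacks can no longer pay off.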
We first introduce some additional decision theory to simplify further discussion. The standard model of decision trees is somewhat inconvenient for our purposes. It would be more natural if we could factor in the expenses at the place where they are incurred rather than at the leaves. The model can easily be modified to accommodate this by allowing utility changes to be placed on edges as well as at the leaves³. We also change the semantics so that the utility of a given
³ So a utility node now always has one incoming arc and may also have at most one outgoing arc.
Fig. 4. Generalization of the decision tree for F(X1, X2) = X1 ∨ X2 with extra utility nodes (a) and then into an RDAG (b)
path from root to a leaf is now defined as the sum of all the utility nodes encountered on the way, the leaf where the path ends included. To cope with the augmented semantics within the optimization algorithm, we introduce a rule for compacting a utility node whose child is also a utility node: the two nodes are simply merged by summing their utilities. With this formalization we can move the expenses associated with performing the attacks in between the chance and decision nodes inside the tree, leaving only either 0 or g as the leaf utilities. It is straightforward to verify that this new model still gives the exact same maximal expected outcome as the original decision tree model. Applying this approach to the example of Figure 3 produces Figure 4(a)⁴.
In the case of our model this shift will often tend to produce a tree with many subtrees that are completely identical in the sense of having the exact same types of nodes with the exact same structure, distributions and utilities. As the algorithm for solving decision trees works from the leaves towards the root, it is clear that if two subtrees are identical then the optimization algorithm will work in an essentially identical way in both. As solving two such subtrees separately is just duplicate work, we can save time by using the same solution in both places and computing it just once. A convenient way to represent this in our model is to replace the two equal subtrees with a single one by making the incoming arcs to their roots both point to the same node. See Figure 4(b) for an illustration of this. This requires loosening the assumption of having a (rooted) tree into that of having a rooted directed acyclic graph (RDAG). It is
⁴ For readability we omit minus signs from the utility nodes corresponding to the expenses. They do nonetheless convey negative utility; this is just a notational convenience for the figures.
Fig. 5. A decision compound for Xi (a) and the RDAG in Figure 4 redrawn using decision compounds (b)
easy to see that the maximal expected utility remains unchanged whenever two equal subtrees are merged together. Before moving on, we note that the two generalizations of the decision tree model are both fairly natural and quite standard. The generalization to RDAGs is commonly referred to as coalescing and is viewed as one of the simplest ways of reducing decision tree size. Allowing utilities on the edges is also completely standard.
The RDAG generalization allows us to make one further simplification. Following the example in Figure 4, it should be intuitively clear that the subtree corresponding to an attack Xi failing and the subtree corresponding to the same attack Xi not being attempted are always identical to one another, so we can always merge them together. This allows us to abstract the decision and its corresponding chance and utility nodes into a decision compound like the one depicted in Figure 5(a). After doing that, we are left with a tree composed of just decision compounds and utility nodes (an example is depicted in Figure 5(b)). This allows for a simpler and more informative visual representation, and all the decision RDAG figures in the following will use this abstraction. Therefore, the following diagrams will be composed of decision compounds (marked as rectangles with rounded corners) and two terminating leaves, one for 0 and one for g.
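Computationally, coalescing amounts to memoizing on what a sub-RDAG computes rather than on how it was reached. In the hypothetical sketch below (our illustration, not the paper's algorithm), two histories share one cache entry exactly when they induce the same residual Boolean function on the remaining variables, which is the equality condition under which subtrees are merged; the residual is found by brute-force enumeration purely for illustration:

```python
from itertools import product

# Memoized backward induction: two histories share a sub-RDAG exactly
# when they induce the same residual function on the remaining
# variables, so the cache is keyed on that residual.
def solve_coalesced(F, p, e, g, n):
    cache = {}
    def residual(i, history):
        # Truth table of F restricted by the history so far.
        return frozenset(rest
                         for rest in product([True, False], repeat=n - i + 1)
                         if F(history + rest))
    def value(i, history):
        if i > n:
            return g if F(history) else 0.0
        key = (i, residual(i, history))
        if key not in cache:
            skip = value(i + 1, history + (False,))
            attempt = (-e[i]
                       + p[i] * value(i + 1, history + (True,))
                       + (1 - p[i]) * skip)
            cache[key] = max(skip, attempt)
        return cache[key]
    return value(1, ())

F = lambda h: (h[0] or h[1]) and (h[2] or h[3])
p = {i: 0.5 for i in range(1, 5)}
e = {i: 10.0 for i in range(1, 5)}
print(solve_coalesced(F, p, e, 100.0, 4))  # 30.0
```

Note that the failure branch of an attempted attack reuses the value of the skipped branch, mirroring the observation that those two subtrees are always identical.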
4 Non-crossing Trees
Since the decision trees are of exponential size in general, we need one additional assumption to achieve computational efficiency. We thus constrain the order of the attacks to adhere to a non-crossing assumption. The non-crossing condition basically amounts to assuming goal-oriented behavior from the adversary: the elementary attacks of a subattack are always attempted together, without considering any elementary attacks from the other subattacks in between. If a subattack is something to the effect of "break into the main office and steal the data from the safe", then it is completely natural to assume that all the elementary attacks within it (such as "pick the front door", "find the safe", "crack the safe open", "run for your life") are attempted one after the other without attempting elementary attacks from other subattacks (such as "hack into the mainframe of the overseas office") in between
them. We essentially assume that one large goal is either satisfied or abandoned before a subsequent subattack of the same importance is ever attempted. As people do tend to work and think in a goal-oriented way, we believe this to be a rather reasonable assumption to make of an adversary.
The name "non-crossing" comes from the visual representation of the attack trees. Suppose an attack tree is drawn in such a way that all elementary attacks are positioned on a straight line in the order they are performed. The tree, then, is non-crossing precisely when it can be drawn so that no two arcs intersect one another (without changing the order of elementary attacks) and no arc crosses the straight lines from the root to the first and last elementary attack. This is best illustrated in Figure 2, where subfigure (a) is non-crossing while (b) is crossing. Formally, the condition can be stated in the following way:
Definition 1. Let F(X1, . . . , Xn) be a monotone Boolean formula of n variables. We say that F is non-crossing relative to the order X1, . . . , Xn if it can be written in such a way that
(simplicity) only ∧ and ∨ operators are used;
(single-occurrence) each variable occurs at most once;
(order-preserving) for all i < j, Xi appears to the left of Xj.
We note that for a given attack tree of n elementary attacks there are at least 2ⁿ⁻¹ different non-crossing orders, so our approach works for an exponentially large set of orders. However, this is still only a negligible fraction of all n! possible orders.
For simplicity, we assume the attack trees to be binary, so that every internal node has exactly two children. As we are dealing with AND-OR trees, this is without loss of generality: a single AND-node with many children can be split into a series of AND-nodes with just two children each, and the same can be done for OR-nodes. This means we can decompose any non-crossing AND-OR tree into a non-crossing binary tree without changing the semantics.
As the process is simple and completely mechanical, we will discuss it no further and just assume we are using binary trees.
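Definition 1 can be checked mechanically: since each variable occurs once, F is non-crossing relative to the given order precisely when the leaf indices under every subtree form a contiguous interval (the children of each node can then be swapped into an order-preserving arrangement). A small sketch using a hypothetical nested-tuple encoding of binary AND-OR trees:

```python
# An AND-OR tree is encoded as a leaf index, or ("and"|"or", left, right).
def interval(tree):
    """Return (lo, hi, size) of the leaf indices under `tree`,
    or None if some subtree is not contiguous."""
    if isinstance(tree, int):
        return tree, tree, 1
    _, left, right = tree
    a, b = interval(left), interval(right)
    if a is None or b is None:
        return None
    lo, hi = min(a[0], b[0]), max(a[1], b[1])
    size = a[2] + b[2]
    return (lo, hi, size) if hi - lo + 1 == size else None

def non_crossing(tree):
    return interval(tree) is not None

# Figure 2(a): ((X1 or X2) and (X3 or X4)) is non-crossing ...
print(non_crossing(("and", ("or", 1, 2), ("or", 3, 4))))  # True
# ... while interleaving the two subattacks makes the order crossing.
print(non_crossing(("and", ("or", 1, 3), ("or", 2, 4))))  # False
```

The second example is our own illustration of a crossing order: the subtree {X1, X3} skips over X2, so no planar drawing in that attack order exists.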
5 Efficient Computation for the Non-crossing Trees
For the case of non-crossing trees, the RDAG can be greatly reduced in size. To be precise, it can be made to have just one decision compound for each elementary attack. The result is captured in the following theorem:
Theorem 1. Let X1, . . . , Xn be elementary attacks of an attack scenario described by F(X1, . . . , Xn). If F is a non-crossing Boolean function, the optimal attack strategy can be found in O(n) time.
Proof. The result rests on three observations about the structure of non-crossing attack trees. Firstly, it is clear that two sub-RDAGs rooted at Xi are functionally equal whenever they give the exact same outcome for all the possible future
choices Xi, . . . , Xn. More formally, for an attack tree corresponding to a Boolean function F, two subtrees rooted at Xi with histories r1, . . . , ri−1 ∈ {t, f} and r'1, . . . , r'i−1 ∈ {t, f} are equal precisely when for all Xi, . . . , Xn ∈ {t, f} we have

F(r1, . . . , ri−1, Xi, . . . , Xn) = F(r'1, . . . , r'i−1, Xi, . . . , Xn).    (1)

This is so because the subtrees rooted at two different decision compounds for the same attack Xi always have the same binary subtree structure all the way to the leaves, and can only differ in leaf values. Leaf values, however, are fully determined by the Boolean function F, and the equivalence holds because that, too, behaves in an identical way in the two cases.
It turns out that in the case of non-crossing trees this simplification allows us to systematically reduce the complexity of the decision RDAG to a manageable size. This can be done based on three simple observations. For illustration we will use the example attack tree depicted in Figure 6(a) to demonstrate the simplification process. The corresponding unsimplified decision RDAG is depicted in Figure 6(b).
Fig. 6. An attack tree (a) and the corresponding decision tree (b)
Before moving on to discuss the observations, we introduce some notation. We assume the tree to be a full binary tree, so each internal node has exactly two direct descendants or children. The single-occurrence assumption means that there is one well-defined path from each of the elementary attacks to the root. We denote this path by Pi = {Y0, Y1, . . . , Yk}, where Y0 is the root. Given a fixed non-crossing order, the two children Zl, Zr of a parent node Zp are uniquely ordered to satisfy the non-crossing restriction. We call Zl and Zr siblings of one another, with Zl the left sibling of Zr and Zr the right sibling of Zl. If Zp is an AND-node, we say Zl is the left AND-sibling, and if Zp is
an OR-node, we call Zl the left OR-sibling of Zr. For an elementary attack Xi we call the set Li of left siblings along Pi its left (sibling) set. The left AND-set and left OR-set are defined in a similar way, being composed (respectively) of the left AND-siblings and left OR-siblings along Pi.
For example, consider the elementary attack X4 of the attack tree in Figure 6(a). Its left AND-set is composed of a single node, the OR-node that combines X1 and X2. Its left OR-set also has just one node, the leaf node of the elementary attack X3. These two nodes together form the complete left set for X4. As X4 is the last elementary attack, its right set is empty.
We note that the non-crossing assumption, along with the assumption of full information about the past, guarantees that whenever a decision Xi is being considered, all the values of its left set nodes can already be computed. This is because the elementary attacks required for that have already been performed and their results are known. This makes the left set central in our description of the algorithm, since information about previous attacks can be compressed down to just the values of the nodes contained therein.
We first note that equation (1) always holds whenever r1, . . . , ri−1 ∈ {t, f} and r'1, . . . , r'i−1 ∈ {t, f} give the same results for the elements of the left set of Xi. This is intuitively easy to understand if one considers that the results of the elementary attacks are aggregated at the internal nodes and that having a stake in the aggregate results is the only way the elementary attacks really influence the root value. As our tree is non-crossing, the left set values for Xi can always be computed if r1, . . . , ri−1 are all known, and the left set values are, in fact, the topmost aggregate values that can be computed based solely on the history up to Xi.
This means that the left set truth values are essentially the only information that is needed about the past attacks and their successes. After carrying out the simplification, the RDAG for the example attack tree in Figure 6 simplifies to the form depicted in Figure 7(a).
The second thing we note is that there is actually just one valuation of the left set for which any given decision makes any difference: the one where all the nodes in the left AND-set evaluate to t and all the nodes in the left OR-set evaluate to f. This is because any left set value Zl deviating from that scheme would also determine the value of its parent clause Zp. For instance, if Zp were an OR-node and Zl were t, then no matter what the right branch Zr evaluates to, Zp would still evaluate to t. In these cases no elementary attack within the subtree of Zr could possibly modify the final outcome (as the aggregate value at Zp is already determined). Since Xi is contained in the subtree of Zr, its result is insignificant when computing the final result in this case. After carrying out this simplification for our example, we arrive at the RDAG depicted in Figure 7(b).
The third observation is that the decision compounds whose outgoing arcs both point to the same place are inconsequential and can actually be ignored. The reason is that performing an attack may cost something while not attempting it is always free. As such, it is always safe not to attempt an attack whenever both success and failure lead to the same future outcomes. This allows
Fig. 7. The attack tree being simplified according to the first (a) and the second (b) observation
Fig. 8. The attack tree after the third simplification
us to simplify the decision RDAG even further. For notational convenience, we call the decision compounds that are left after this step consequential. The final RDAG for the example is depicted in Figure 8.
Putting the three observations together, it is easy to see that there is just one consequential decision compound for each attack Xi. This follows from the facts that there is just one sub-RDAG for each left set valuation (the first observation), that there is just one left set valuation for which it matters whether the compound succeeds or fails (the second observation), and that all the other compounds can be removed (the third observation). This means that the decision RDAG is of linear size for non-crossing trees. As the structure of the RDAG can also be computed in linear time⁵ and the optimization process takes time proportional to the number of nodes in the decision RDAG, the whole computation can be done in O(n) time.
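As a concrete illustration of the proof, the hypothetical sketch below (our code, not the paper's) performs the two linear passes for a binary non-crossing tree with leaves numbered 1..n in attack order: one walk records, for each elementary attack, where success and failure jump to, using the rule that success continues at the leftmost elementary attack of the nearest right AND-sibling subtree and failure at that of the nearest right OR-sibling subtree; a backward pass over the resulting chain of consequential compounds then yields the optimal expected outcome:

```python
# Linear-time solution for a binary non-crossing AND-OR tree.
# A tree is a leaf index i (attacks X1..Xn in order), or
# ("and"|"or", left, right) with leaves already in non-crossing order.
def solve_non_crossing(tree, p, e, g):
    succ, fail = {}, {}          # jump targets: leaf index or None

    def leftmost(t):
        return t if isinstance(t, int) else leftmost(t[1])

    def walk(t, and_next, or_next):
        # and_next / or_next: where to continue on success / failure
        # (None means the root value is decided: g resp. 0).
        if isinstance(t, int):
            succ[t], fail[t] = and_next, or_next
            return
        op, left, right = t
        if op == "and":
            walk(left, leftmost(right), or_next)   # right AND-sibling
        else:
            walk(left, and_next, leftmost(right))  # right OR-sibling
        walk(right, and_next, or_next)

    walk(tree, None, None)
    # Backward induction over the chain of consequential compounds;
    # all jump targets point to later attacks, so V is well-defined.
    n = len(succ)
    V = {}
    for i in range(n, 0, -1):
        v_s = g if succ[i] is None else V[succ[i]]
        v_f = 0.0 if fail[i] is None else V[fail[i]]
        V[i] = max(v_f, -e[i] + p[i] * v_s + (1 - p[i]) * v_f)
    return V[leftmost(tree)]

p = {i: 0.5 for i in range(1, 5)}
e = {i: 10.0 for i in range(1, 5)}
print(solve_non_crossing(("and", ("or", 1, 2), ("or", 3, 4)), p, e, 100.0))  # 30.0
```

On the example (X1 ∨ X2) ∧ (X3 ∨ X4) with the illustrative values pi = 0.5, ei = 10 and g = 100, this gives the same value, 30, as solving the full decision tree, while visiting each attack only once.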
6 Connection with Previous Models
As claimed in the introduction, our model achieves the highest expected outcome for the adversary of all the financial models proposed thus far. To be more formal, let F be an attack scenario with elementary attacks X and let σ be a fixed
⁵ Success always goes to the leftmost elementary attack of the subtree rooted at the lowest element of the right AND-set, and analogously for failure and the right OR-set. This is easy to verify and we omit the details. Using this knowledge it is straightforward to generate the graph in linear time by first determining the leftmost elementary attack for each subattack and then traversing the tree left-to-right, keeping track of the right AND- and OR-sets.
ordering for the elementary attacks in X. Denote by Outcome^JW_σ the expected utility assigned to the attack tree with attack order σ on the optimal subset of X by the serial model of Jürgenson and Willemson [15], and let Outcome^DT_σ be the expected utility computed by the decision-theoretic model proposed in this paper. It is easy to show that
Theorem 2. For all attack scenarios F and attack orders σ,

Outcome^JW_σ ≤ Outcome^DT_σ.

It is interesting to note that in the case of non-crossing trees equality always holds in Theorem 2⁶. We do, however, note that finding Outcome^JW_σ requires trying all the 2ⁿ subsets to find the optimal solution, whereas our algorithm finds Outcome^DT_σ in linear time.
This observation leads to another possible practical application. Let Outcome^P denote the expected outcome in the parallel model of [14]. Jürgenson and Willemson [15] showed that

Outcome^P ≤ Outcome^JW_σ

for all attack orders σ. This result, when combined with ours, yields a linear-time method of finding non-trivial upper bounds on the adversarial outcome in the parallel model. This is something that was impossible with the previously known algorithms.
7 Possible Extensions
The simplification steps presented in the preceding section are not restricted to non-crossing trees and can indeed be applied to many other Boolean functions and variable orders. To better understand this, we look at an alternative way of representing the underlying Boolean formulae. Binary Decision Diagrams (BDDs) are a method of compactly representing Boolean formulae by rooted directed acyclic graphs in which each node corresponds to a variable and has two distinguishable outgoing arcs: one for when the variable is set to true and one for when it is set to false. The graph has just two leaves, one corresponding to the formula evaluating to true and the other to false. The most common use of BDDs involves the so-called Reduced Ordered BDDs (ROBDDs), in which the variables are in a fixed order and all equivalent sub-RDAGs are merged. It is also assumed that a node is bypassed whenever both of its outgoing arcs point to the same place. We refer the reader interested in the theory of BDDs to [19].
Following the steps of the previous section, it is trivial to see that our model of economic decision trees actually reduces to the form where we have a directed
⁶ This follows from the fact that there is always just one consequential decision compound for each elementary attack Xi, from which it follows that the optimal subset for the model of [15] is exactly the set of elementary attacks Xi that are considered worthwhile in our model.
graph of decision compounds (with output degree 2) and just two leaves. As the final destination leaf of a path is determined solely by the truth table of the Boolean function, it is clear that our model is in some sense isomorphic to BDDs. As we can also perform all the simplification steps that are allowed for ROBDDs, we can formalize our result as the following corollary:
Corollary 1. Suppose we have an attack scenario with elementary attacks X = {X1, . . . , Xn}, so ordered. If the attack scenario F can be described by a polynomial-sized ROBDD with the same order, the optimal strategy and its expected outcome can also be determined in polynomial time.
This result has three important implications. Firstly, it is known that many Boolean formulae (that cannot be expressed as AND-OR trees) have reasonably small representations as ROBDDs for some variable orders. This is important because it may well make sense to compute the utility even when the order for which this is feasible is completely unrealistic: our model will still produce a strictly higher utility than the parallel model described in the introduction. As the exact computation of the parallel model takes exponential time, our model can thus be used as a fast and practical means of finding an upper bound for the adversarial outcome in it. It is known that most interesting classes of Boolean functions do have orders for which their BDDs are small, allowing our model to be used for just such estimation.
A second implication (being somewhat a corollary of the first) is that we can actually allow a few elementary attacks to occur in multiple places in the tree. Namely, it is rather easy to verify that introducing another occurrence of a variable can at most double the size of the ROBDD⁷. As such, the model still remains efficiently computable if 2–4 variables occur more than once.
This is often the case in practice, as most elementary attacks only matter in one place but there are usually a few (such as gaining root privileges) that are required in multiple places. In these cases there is still an exponential set of orders for which the approach will work.
A third implication is that even for simple attack trees, we cannot use this approach for all the possible orders. To be precise, there are trees of very simple structure for which some orders produce exponential-sized ROBDDs. This means that the approach, although very effective for gaining practical estimates and computing some attack orders, is still inherently limited. The exact extent of these limits needs further exploration and we leave it as an open question for now. A quick review of the BDD literature shows that this question is relatively unexplored. This is probably because in general applications the order of variables in a BDD does not need to be fixed. Due to that, most of the literature is interested in finding orders that produce small BDDs and not in determining the minimal size of a BDD for a given order. Some progress has been made, however, and we refer the more interested reader to Chapter 5 of [19] for a general overview.
⁷ This is essentially achieved by Shannon decomposition on the remaining variables.
8 Open Questions
Our work still leaves many open questions to be explored further. For instance, the assumption of non-crossing trees, although quite natural, is still rather restrictive, and it would be very interesting to find an efficient algorithm that works under more general assumptions. Another interesting question is whether the optimal order for the elementary attacks could be found efficiently in cases where the order is not determined naturally. It would also be interesting to consider more general models, such as those allowing intermediate payouts in the subattack nodes. Research in any of these directions would greatly further the applicability of attack trees and provide us with much more accurate tools for predicting adversarial behavior.
Acknowledgments

The author would like to thank Jan Willemson and Aivo Jürgenson for introducing him to the topic of attack trees and for helpful comments on this article. He is also very grateful to Sven Laur, Peeter Laud and all the anonymous reviewers for their suggestions, which helped to make the article easier to follow and more self-contained.
References

1. Lippmann, R.P., Ingols, K.: An annotated review of past papers on attack graphs (2005)
2. Ammann, P., Wijesekera, D., Kaushik, S.: Scalable, graph-based network vulnerability analysis. In: CCS 2002: Proceedings of the 9th ACM Conference on Computer and Communications Security, pp. 217–224 (2002)
3. Jajodia, S., Noel, S., O'Berry, B.: Topological analysis of network attack vulnerability. In: Managing Cyber Threats: Issues, Approaches and Challenges (2003)
4. Ericson II, C.A.: Fault tree analysis – a history. In: Proceedings of the 17th International System Safety Conference (1999)
5. Weiss, J.D.: A system security engineering process. In: Proceedings of the 14th National Computer Security Conference, pp. 572–581 (1991)
6. Schneier, B.: Attack trees: Modeling security threats. Dr. Dobb's Journal 24, 21–29 (1999)
7. Edge, K.S.: A Framework for Analyzing and Mitigating the Vulnerabilities of Complex Systems via Attack and Protection Trees. PhD thesis, Air Force Institute of Technology, Ohio (2007)
8. Espedahlen, J.H.: Attack trees describing security in distributed internet-enabled metrology. Master's thesis, Department of Computer Science and Media Technology, Gjøvik University College (2007)
9. Moore, A.P., Ellison, R.J., Linger, R.C.: Attack modeling for information security and survivability. Technical Report CMU/SEI-2001-TN-001, Software Engineering Institute (2001)
10. Mauw, S., Oostdijk, M.: Foundations of attack trees. In: Won, D., Kim, S. (eds.) ICISC 2005. LNCS, vol. 3935, pp. 186–198. Springer, Heidelberg (2006)
11. Amenaza: SecurITree attack tree modeling (2010), http://www.amenaza.com/
12. Buldas, A., Laud, P., Priisalu, J., Saarepera, M., Willemson, J.: Rational choice of security measures via multi-parameter attack trees. In: López, J. (ed.) CRITIS 2006. LNCS, vol. 4347, pp. 235–248. Springer, Heidelberg (2006)
13. Buldas, A., Mägi, T.: Practical security analysis of e-voting systems. In: Miyaji, A., Kikuchi, H., Rannenberg, K. (eds.) IWSEC 2007. LNCS, vol. 4752, pp. 320–335. Springer, Heidelberg (2007)
14. Jürgenson, A., Willemson, J.: Computing exact outcomes of multi-parameter attack trees. In: Meersman, R., Tari, Z. (eds.) OTM 2008, Part II. LNCS, vol. 5332, pp. 1036–1051. Springer, Heidelberg (2008)
15. Jürgenson, A., Willemson, J.: Serial model for attack tree computations. In: Lee, D., Hong, S. (eds.) ICISC 2009. LNCS, vol. 5984, pp. 118–128. Springer, Heidelberg (2010)
16. Jürgenson, A., Willemson, J.: Processing multi-parameter attacktrees with estimated parameter values. In: Miyaji, A., Kikuchi, H., Rannenberg, K. (eds.) IWSEC 2007. LNCS, vol. 4752, pp. 308–319. Springer, Heidelberg (2007)
17. Niitsoo, M.: Optimal adversary behavior for the serial model of financial attack trees. Cryptology ePrint Archive, Report 2010/412 (2010), http://eprint.iacr.org/
18. Jensen, F.V.: Bayesian Networks and Decision Graphs. Information Science and Statistics. Springer, Heidelberg (2001)
19. Wegener, I.: Branching Programs and Binary Decision Diagrams: Theory and Applications. SIAM, Philadelphia (2000)