.p<x1>.p<x2>. ··· .p<xn>.P
a(x1, ···, xn).P ::= a(p).p(x1).p(x2). ··· .p(xn).P
Here are some explanations of the syntax:
1. 0 is inaction, which cannot perform any action.
2. The output prefix a<x>: the name x is sent along the name a.
3. The input prefix a(x): a name is received along the name a, and x is a placeholder for the received name.
Using π-Calculus to Formalize Domain Administration of RBAC
4. The unobservable prefix τ.P: the process can evolve to P invisibly to an observer.
5. Sum P+Q: an agent can enact either P or Q. A sum of several agents P1 + P2 + ··· + Pn is written as $\sum_{i=1}^{n} P_i$.
6. Composition P|Q: the components P and Q can proceed independently and can interact via shared names. A composition of several agents P1 | P2 | ··· | Pn is written as $\prod_{i=1}^{n} P_i$.
7. Match if x=y then P else Q: this agent behaves as P if x and y are the same name; otherwise it behaves as Q.
8. Mismatch if x≠y then P else Q: this agent behaves as P if x and y are not the same name; otherwise it behaves as Q.
9. Restriction νx.P: the scope of the name x is limited to P, and x can be used for communication between the components within P. The channel may be passed over another channel for use by another process.
10. Identifier A(y1, ···, yn): every identifier has a definition A(x1, ···, xn) = P, where the xi must be pairwise distinct, and A(y1, ···, yn) behaves as P with yi replacing xi for each i.
11. The polyadic expression is an extension that allows multiple objects in a communication. We also admit the case n=0, when there is no object at all, and we denote this as a and a().

Definition 3. For convenience, we define some other notations which will be used later.
– AND ::= then if. For example, if x=y AND z=s then P ⇔ if x=y then if z=s then P.
– $\vec{x}_n ::= (x_1, x_2, \cdots, x_n)$. $\vec{x}_n$ denotes an n-dimensional vector.
– $\vec{x}_n \uparrow x ::= (x_1, x_2, \cdots, x_n, x)$. $\vec{x}_n \uparrow x$ means that we add x to the vector $\vec{x}_n$ to compose an (n+1)-dimensional vector.
– $\vec{x}_n \downarrow x_i ::= \begin{cases} (x_2, \cdots, x_n) & \text{for } i = 1 \\ (x_1, x_2, \cdots, x_{i-1}, x_{i+1}, \cdots, x_n) & \text{for } 1 < i < n \\ (x_1, x_2, \cdots, x_{n-1}) & \text{for } i = n \end{cases}$
  $\vec{x}_n \downarrow x_i$ means that we delete $x_i$ from the vector $\vec{x}_n$, which becomes an (n-1)-dimensional vector.
– if $x \notin \vec{x}_n$ then P ::= if $x \neq x_1$ AND $x \neq x_2$ ··· AND $x \neq x_n$ then P.
– if $x \in \vec{x}_n$ then $P(y_i)$ ::= if $x = x_1$ then $P(y_1)$ else if $x = x_2$ then $P(y_2)$ ··· else if $x = x_n$ then $P(y_n)$.
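The vector notations of Definition 3 can be read as ordinary operations on tuples of names. The following is a minimal sketch under that reading; the function names `extend`, `delete` and `select` are hypothetical and are not part of the paper's notation.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

Name = str
T = TypeVar("T")
R = TypeVar("R")


def extend(xs: Tuple[Name, ...], x: Name) -> Tuple[Name, ...]:
    """xs ↑ x: append x, turning an n-vector into an (n+1)-vector."""
    return xs + (x,)


def delete(xs: Tuple[Name, ...], i: int) -> Tuple[Name, ...]:
    """xs ↓ x_i for a 1-based index i; one slice covers all three cases."""
    if not 1 <= i <= len(xs):
        raise IndexError("i must satisfy 1 <= i <= n")
    return xs[: i - 1] + xs[i:]


def select(x: Name, xs: Sequence[Name], ys: Sequence[T],
           p: Callable[[T], R]) -> Optional[R]:
    """'if x ∈ xs then P(y_i)': apply p to the y_i paired with the first x_i equal to x."""
    for xi, yi in zip(xs, ys):
        if x == xi:
            return p(yi)
    return None  # assumption: no branch matched, so nothing happens
```

For instance, `delete(("a", "b", "c"), 2)` drops the middle component, which corresponds to the case 1 < i < n.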
4 Formalize DARBAC Using π-Calculus
The π-calculus representation of the elements of DARBAC is shown in Table 1 and Figure 3. In our formalization, every process has several ports that can be accessed by other processes. There are two kinds of ports: administrative ports, such as du, dr, do and dar, which can be accessed by administrative roles, and access ports, such as the role access port r and the object access ports op_i.
Y. Lu et al.
Table 1. The π-calculus representation of DARBAC elements

DARBAC element              π-calculus   Port
User                        Process U    du
Role                        Process R    dr and r
Object                      Process O    do, op1, ···, opn
Administrative Role         Process AR   dar
Object Operation            Channel      op1 ··· opn
Role Hierarchy              Channel      link with r port
Administrative Object       Channel      link with du, dr, do and dar
Administrative Operation    Process      -
User Role Assignment        Channel      link with r port
Role Permission Assignment  Channel      link with op1 ··· opn port
[Figure 3: processes AR, U, R and OBJ with their ports dar, du, dr, r, do and op[1..n]]
Fig. 3. The representation of DARBAC elements
4.1 User Process
The main tasks of the user process are:
– Receive the role access port name from the administrative role and link or unlink with this port
– Interact with the role process along the role access port
– Receive the delete command from the administrative role and destroy itself

Definition 4.
$$
\begin{aligned}
U(du, n, \vec{r}_n) ::={} & \sum_{i=1}^{n} r_i.U(du, n, \vec{r}_n) \\
{}+{} & (\nu x)\, du(command, x). \\
& \text{if } command = \mathrm{DELETE} \text{ then } 0 \\
& \text{else if } command = \mathrm{ARU} \text{ AND } x \notin \vec{r}_n \text{ then } U(du, n+1, \vec{r}_n \uparrow x) \\
& \text{else if } command = \mathrm{DRU} \text{ AND } x \in \vec{r}_n \text{ then } U(du, n-1, \vec{r}_n \downarrow r_i)
\end{aligned}
$$

The user process has one administrative port du, which can be accessed by the administrative role process, and n links to role processes. It can access a role process through the channel r_i, destroy itself if the administrative role process executes the DeleteUser command, or receive a role access channel r_i from the administrative role process in order to link or unlink with that role process.

Example 2. We can define the following user processes for Example 1:
Alice ::= U(du_alice, 1, r_des).
Bob ::= U(du_bob, 0, 0).
Carl ::= U(du_carl, 0, 0).
The Alice process links with one role access port, r_des, which means that Alice is assigned the Designer role. The Bob and Carl processes have no role access ports.
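The behaviour of the user process on its administrative port can be pictured as a small state machine. The sketch below is an illustration only: the class `User` and its method `receive` are hypothetical names, channels are replaced by method calls, and the handling of unmatched commands is an assumption that Definition 4 does not state.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class User:
    du: str                      # administrative port
    roles: Tuple[str, ...] = ()  # linked role access ports r_1..r_n

    def receive(self, command: str, x: str = "") -> Optional["User"]:
        """One step on port du; returns the successor process, or None for 0."""
        if command == "DELETE":
            return None                                   # the process destroys itself
        if command == "ARU" and x not in self.roles:
            return User(self.du, self.roles + (x,))       # link with role port x
        if command == "DRU" and x in self.roles:
            kept = tuple(r for r in self.roles if r != x)
            return User(self.du, kept)                    # unlink from role port x
        return self  # assumption: any other command leaves the process unchanged


# Alice and Bob as in Example 2.
alice = User("du_alice", ("r_des",))
bob = User("du_bob")
bob_as_auditor = bob.receive("ARU", "r_aud")
```

Making the process immutable mirrors the calculus, where each step yields a new process term rather than mutating the old one.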
4.2 Object Process
The main tasks of the object process are:
– Receive accesses from roles
– Receive the query command from the administrative role and return the required object operation port name
– Receive the delete command from the administrative role and destroy itself

Definition 5.
$$
\begin{aligned}
OBJ(do, n, \vec{op}_n, \vec{top}_n) ::={} & \sum_{i=1}^{n} op_i().OBJ(do, n, \vec{op}_n, \vec{top}_n) \\
{}+{} & (\nu x)\, do(command, x, type). \\
& \text{if } command = \mathrm{DELETE} \text{ then } 0 \\
& \text{else if } command = \mathrm{QUERY} \text{ AND } type \in \vec{top}_n \text{ then } (x.OBJ(do, n, \vec{op}_n, \vec{top}_n))
\end{aligned}
$$

The OBJ process has one administrative port do, which can be accessed by the administrative role process, and n operation ports, which can be accessed by role processes. The object process can wait for an access by a role process along the channel op_i, destroy itself if the administrative role process executes the DeleteObject command, or receive a query command from the administrative role process and return the operation port op_i. Here we use top_i to identify the type of each operation port. For example, the name of one operation port is op1 and its type is read. The operation channels op_i of each object process are different, but their types top_i may be the same. Thus the administrative role process does not need to store every operation port name of every object process; it only needs to store the do port and use top_i to query the real operation port name of that object.

Example 3. We can define the following object processes for Example 1:
Requirement ::= OBJ(do_req, 1, op5, read).
Code ::= OBJ(do_code, 2, op1, op2, read, write).
The Requirement object can be read via port op5. The Code object can be read via port op1 and written via port op2.

4.3 Role Process
The main tasks of the role process are:
– Receive the query command from the administrative role and return its access port name
– Receive accesses by users and other roles along its access port
– Receive an object access port name from the administrative role and link or unlink this port
– Receive a child role access port name from the administrative role and link or unlink this port
– Receive the delete command from the administrative role and destroy itself

Definition 6.
$$
\begin{aligned}
R(dr, r, n, \vec{r}_n, m, \vec{op}_m) ::={} & r().R(dr, r, n, \vec{r}_n, m, \vec{op}_m) \\
{}+{} & \sum_{i=1}^{n} (r_i.R(dr, r, n, \vec{r}_n, m, \vec{op}_m)) + \sum_{j=1}^{m} (op_j.R(dr, r, n, \vec{r}_n, m, \vec{op}_m)) \\
{}+{} & (\nu x)\, dr(command, x). \\
& \text{if } command = \mathrm{DELETE} \text{ then } 0 \\
& \text{else if } command = \mathrm{QUERY} \text{ then } (x.R(dr, r, n, \vec{r}_n, m, \vec{op}_m)) \\
& \text{else if } command = \mathrm{DRP} \text{ AND } x \in \vec{op}_m \text{ then } R(dr, r, n, \vec{r}_n, m-1, \vec{op}_m \downarrow op_i) \\
& \text{else if } command = \mathrm{ARP} \text{ AND } x \notin \vec{op}_m \text{ then } R(dr, r, n, \vec{r}_n, m+1, \vec{op}_m \uparrow x) \\
& \text{else if } command = \mathrm{DRR} \text{ AND } x \in \vec{r}_n \text{ then } R(dr, r, n-1, \vec{r}_n \downarrow r_i, m, \vec{op}_m) \\
& \text{else if } command = \mathrm{ARR} \text{ AND } x \notin \vec{r}_n \text{ then } R(dr, r, n+1, \vec{r}_n \uparrow x, m, \vec{op}_m)
\end{aligned}
$$
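Definition 6 can be read as a state machine over two vectors, the child-role ports and the object operation ports. The sketch below illustrates that reading and adds a helper that follows child-role links to collect the operation ports a role can reach. The names `Role`, `receive` and `reachable_ops` are hypothetical, channels are replaced by method calls, and the sample roles reuse the names of the paper's running example.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class Role:
    dr: str                         # administrative port
    r: str                          # role access port
    children: Tuple[str, ...] = ()  # access ports r_1..r_n of child roles
    ops: Tuple[str, ...] = ()       # object operation ports op_1..op_m

    def receive(self, command: str, x: str = "") -> Optional["Role"]:
        """One step on port dr; returns the successor process, or None for 0."""
        if command == "DELETE":
            return None
        if command == "ARP" and x not in self.ops:
            return replace(self, ops=self.ops + (x,))
        if command == "DRP" and x in self.ops:
            return replace(self, ops=tuple(o for o in self.ops if o != x))
        if command == "ARR" and x not in self.children:
            return replace(self, children=self.children + (x,))
        if command == "DRR" and x in self.children:
            return replace(self, children=tuple(c for c in self.children if c != x))
        return self  # assumption: QUERY and unmatched commands keep the state


def reachable_ops(r: str, by_port: Dict[str, Role]) -> Set[str]:
    """Operation ports reachable from access port r through child-role links."""
    seen: Set[str] = set()
    found: Set[str] = set()
    stack = [r]
    while stack:
        port = stack.pop()
        if port in seen or port not in by_port:
            continue
        seen.add(port)
        found.update(by_port[port].ops)
        stack.extend(by_port[port].children)
    return found


manager = Role("dr_man", "r_man", ("r_des", "r_aud"))
designer = Role("dr_des", "r_des", ("r_emp",), ("op2",))
auditor = Role("dr_aud", "r_aud", ("r_emp",))
employee = Role("dr_emp", "r_emp")
by_port = {p.r: p for p in (manager, designer, auditor, employee)}
```

With these sample roles, the Manager reaches op2 only through its child role Designer, which is how the stored child access ports encode the role graph.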
The role process contains one administrative port dr, to be accessed by the administrative role process, one access port r, to be accessed by user processes and other role processes, n links to its child roles, and m links to object access ports. The role graph is maintained by storing the child roles' access ports, and it is changed by the administrative role's AssignRoleEdge and DeAssignRoleEdge commands.

Example 4. We can define the following role processes for Example 1:
Manager ::= R(dr_man, r_man, 2, r_des, r_aud, 0, 0).
Designer ::= R(dr_des, r_des, 1, r_emp, 1, op2).
Auditor ::= R(dr_aud, r_aud, 1, r_emp, 0, 0).
Employee ::= R(dr_emp, r_emp, 0, 0, 0, 0).
The Manager role process holds the role access ports of its child roles: port r_des (Designer role) and port r_aud (Auditor role). The Designer role process has one role access port r_emp (Employee role) and one object access port op2, which means that Designer can write the Code object via op2.

4.4 Administrative Role Process
The main tasks of the administrative role process are:
– Store the administrative ports of the users, roles, objects and child administrative roles within its administrative domain
– Execute administrative operations in its administrative domain

Definition 7.
$$
\begin{aligned}
& AR(ar, s, \vec{dar}_s, k, \vec{du}_k, l, \vec{dr}_l, m, \vec{do}_m, n, \vec{top}_n) ::= (\nu\, du, dr, r, do, \vec{op}_n) \\
& \quad (CreateUser(ar, du) \mid AR(ar, s, \vec{dar}_s, k+1, \vec{du}_k \uparrow du, l, \vec{dr}_l, m, \vec{do}_m, n, \vec{top}_n)) \\
& \quad + (CreateRole(ar, dr, r) \mid AR(ar, s, \vec{dar}_s, k, \vec{du}_k, l+1, \vec{dr}_l \uparrow dr, m, \vec{do}_m, n, \vec{top}_n)) \\
& \quad + (CreateObj(ar, do, \vec{op}_n, \vec{top}_n) \mid AR(ar, s, \vec{dar}_s, k, \vec{du}_k, l, \vec{dr}_l, m+1, \vec{do}_m \uparrow do, n, \vec{top}_n)) \\
& \quad + \sum_{i=1}^{k} (DeleteUser(ar, du_i).AR(ar, s, \vec{dar}_s, k-1, \vec{du}_k \downarrow du_i, l, \vec{dr}_l, m, \vec{do}_m, n, \vec{top}_n)) \\
& \quad + \sum_{i=1}^{l} (DeleteRole(ar, dr_i).AR(ar, s, \vec{dar}_s, k, \vec{du}_k, l-1, \vec{dr}_l \downarrow dr_i, m, \vec{do}_m, n, \vec{top}_n)) \\
& \quad + \sum_{i=1}^{m} (DeleteObj(ar, do_i).AR(ar, s, \vec{dar}_s, k, \vec{du}_k, l, \vec{dr}_l, m-1, \vec{do}_m \downarrow do_i, n, \vec{top}_n)) \\
& \quad + \Big( \sum_{i=1}^{k} \sum_{j=1}^{l} (AssignRoleUser(ar, du_i, dr_j) + DeAssignRoleUser(ar, du_i, dr_j)) \\
& \qquad + \sum_{i=1}^{l} \sum_{j=1}^{l} (AssignRoleEdge(ar, dr_i, dr_j) + DeAssignRoleEdge(ar, dr_i, dr_j)) \\
& \qquad + \sum_{i=1}^{l} \sum_{j=1}^{m} \sum_{t=1}^{n} (AssignRolePerm(ar, dr_i, do_j, top_t) + DeAssignRolePerm(ar, dr_i, do_j, top_t)) \Big) \\
& \qquad .AR(ar, s, \vec{dar}_s, k, \vec{du}_k, l, \vec{dr}_l, m, \vec{do}_m, n, \vec{top}_n)
\end{aligned}
$$

The administrative role process has one administrative port ar, to be accessed by other administrative roles, s administrative ports of child administrative role processes, k administrative ports of user processes, l administrative ports of role processes, m administrative ports of object processes, and n operation types of objects. The administrative role process is composed of administrative operations.
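The bookkeeping part of Definition 7 amounts to growing and shrinking the port vectors of the administrative domain. The sketch below illustrates only that part, under the assumption that a delete is possible only for ports the administrative role already holds (the sums range over its own vectors). The class `AdminRole` and its method names are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Tuple

Names = Tuple[str, ...]


def _without(xs: Names, x: str) -> Names:
    """xs ↓ x: drop the component equal to x."""
    return tuple(v for v in xs if v != x)


@dataclass(frozen=True)
class AdminRole:
    ar: str            # own administrative port
    dars: Names = ()   # child administrative roles
    dus: Names = ()    # users in the domain
    drs: Names = ()    # roles in the domain
    dos: Names = ()    # objects in the domain
    tops: Names = ()   # known operation types

    def create_user(self, du: str) -> "AdminRole":
        return replace(self, dus=self.dus + (du,))      # k becomes k+1

    def create_role(self, dr: str) -> "AdminRole":
        return replace(self, drs=self.drs + (dr,))      # l becomes l+1

    def delete_user(self, du: str) -> "AdminRole":
        self._in_domain(du, self.dus)
        return replace(self, dus=_without(self.dus, du))

    def delete_role(self, dr: str) -> "AdminRole":
        self._in_domain(dr, self.drs)
        return replace(self, drs=_without(self.drs, dr))

    @staticmethod
    def _in_domain(port: str, domain: Names) -> None:
        # Assumption: operating on a port outside the domain is rejected.
        if port not in domain:
            raise PermissionError(f"{port} is outside the administrative domain")


sso = AdminRole("dar_sso", dars=("dar_pso1",), drs=("dr_man",))
sso_with_david = sso.create_user("du_dav")
```

The domain check is what keeps an administrative role from acting on users, roles or objects that belong to another administrative role.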
Example 5. We define the following administrative role processes for Example 1:
SSO ::= AR(dar_sso, 1, dar_pso1, 0, 1, dr_man, 0, 0).
PSO1 ::= AR(dar_pso1, 0, 3, du_alice, du_bob, du_carl, 3, dr_des, dr_aud, dr_emp, 2, do_code, do_req, 2, read, write).
The SSO administrative role (dar_sso) has one child administrative role, PSO1 (port dar_pso1), and one role, Manager (port dr_man), in its domain. The PSO1 administrative role (dar_pso1) has three users (du_alice, du_bob, du_carl), three roles (dr_des, dr_aud, dr_emp) and two objects (do_code, do_req) in its domain.

4.5 Administrative Operation Process
The administrative operations are used by an administrative role to manage users, roles, objects and their relations within its administrative domain. Process reduction rules can be used to explain the meaning of each administrative operation.

Definition 8. The administrative operation processes are defined as follows:
CreateUser(ar, du) ::= U(du, 0, 0)
CreateRole(ar, dr, r) ::= R(dr, r, 0, 0, 0, 0)
CreateObj(ar, do, $\vec{op}_n$, $\vec{top}_n$) ::= OBJ(do, n, $\vec{op}_n$, $\vec{top}_n$)
DeleteUser(ar, du) ::= du
DeleteObj(ar, do) ::= do. $\prod_{i=1}^{m}$ DeAssignRolePerm(dr_i, do)
DeleteRole(ar, dr) ::= dr. $\prod_{i=1}^{k}$ DeAssignRoleUser(du_i, dr). $\prod_{i=1}^{l}$ DeAssignRoleEdge(dr_i, dr)
AssignRoleUser(ar, du, dr) ::= (νt)(dr.t(x).du)
DeAssignRoleUser(ar, du, dr) ::= (νt)(dr.t(x).du)
AssignRolePerm(ar, dr, do, top) ::= (νt)(do.t(x).dr)
DeAssignRolePerm(ar, dr, do, top) ::= (νt)(do.t(x).dr)
AssignRoleEdge(ar, dr1, dr2) ::= (νt)(dr2.t(x).dr1)
DeAssignRoleEdge(ar, dr1, dr2) ::= (νt)(dr2.t(x).dr1)

Here are some examples of administrative operations.

Example 6. SSO creates user David (Figure 4).
SSO ::= AR(dar_sso, 1, dar_pso1, 0, 0, 1, dr_man, 0, 0)
⟹ CreateUser(dar_sso, du_dav) | AR(dar_sso, 1, dar_pso1, 1, du_dav, 1, dr_man, 0, 0)
⟹ ··· ⟹ U(du_dav, 0, 0) | AR(dar_sso, 1, dar_pso1, 1, du_dav, 1, dr_man, 0, 0)
= David | SSO

Example 7. PSO1 assigns Bob to the Auditor role (Figure 5).
PSO1 | Auditor | Bob
⟹ AssignRoleUser(dar_pso1, du_bob, dr_aud) | Auditor | Bob | PSO1
⟹ (νt)(dr_aud.t(x).du_bob) | R(dr_aud, r_aud, 1, r_emp, 0, 0) | U(du_bob, 0, 0) | PSO1
⟹ (νt)(t(x).du_bob) | (t.R(dr_aud, r_aud, 1, r_emp, 0, 0)) | U(du_bob, 0, 0) | PSO1
⟹ ··· ⟹ R(dr_aud, r_aud, 1, r_emp, 0, 0) | U(du_bob, 1, r_aud) | PSO1
= Auditor | Bob | PSO1
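The reduction in Example 7 can be traced step by step in ordinary code. The sketch below is one possible reading of AssignRoleUser, not the paper's definition: it assumes that the administrative role first queries the role on dr and receives the role access port r as x on the fresh channel t, and then forwards x to the user on du. The class `Processes` and its method names are hypothetical, and channels are replaced by method calls.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Processes:
    role_port: Dict[str, str] = field(default_factory=dict)         # dr -> r
    user_links: Dict[str, List[str]] = field(default_factory=dict)  # du -> linked r ports

    def query_role(self, dr: str) -> str:
        """Role side: answer a query on dr with the role access port r."""
        return self.role_port[dr]

    def assign_role_user(self, du: str, dr: str) -> None:
        x = self.query_role(dr)                 # dr.t(x): ask the role, receive r as x
        if x not in self.user_links[du]:        # guard of the user process: x not yet linked
            self.user_links[du].append(x)       # du: the user links with port x

    def de_assign_role_user(self, du: str, dr: str) -> None:
        x = self.query_role(dr)
        if x in self.user_links[du]:            # guard of the user process: x is linked
            self.user_links[du].remove(x)       # du: the user unlinks from port x


# Auditor and Bob as in Example 7.
system = Processes(role_port={"dr_aud": "r_aud"}, user_links={"du_bob": []})
system.assign_role_user("du_bob", "dr_aud")
```

After the call, Bob holds the link r_aud, which matches the final term U(du_bob, 1, r_aud) of the example.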
[Figure 4: processes SSO, PSO1 and Manager with ports dar_sso, dar_pso1, dr_man and r_man, before and after CreateUser; afterwards SSO' and the new process David with port du_dav]
Fig. 4. Create User
[Figure 5: processes PSO1, Auditor and Bob with ports dar_pso1, dr_aud and r_aud, before and after AssignRoleUser; afterwards Bob']
Fig. 5. AssignRoleUser
Example 8. SSO deletes the Manager role.
SSO | Manager
⟹ DeleteRole(dar_sso, dr_man).AR(dar_sso, 1, dar_pso1, 0, 0, 0, 0) | Manager
⟹ ··· ⟹ AR(dar_sso, 1, dar_pso1, 0, 0, 0, 0)
5 Conclusion and Future Work
In this paper, we propose the DARBAC model, a domain administration model for RBAC. In this model, an administrative role can execute administrative operations on users, roles, objects and child administrative roles within its administrative domain. We then use π-calculus to formalize the elements of the DARBAC model. Process reduction rules can be used to explain the meaning of each administrative operation. The work presented in this paper can be extended in several directions. First, in this paper we only use process reduction to represent the dynamic behaviors of administrative roles. Other π-calculus analysis methods, such as bisimulation, congruence and the modal mu-calculus, can be used to further analyze the safety properties and expressive power of the DARBAC model. Second, the method in this paper can be used to formalize other access control models, such as DAC and MAC, and to compare the safety properties and expressive power of these models in a unified π-calculus setting.
Acknowledgement. The work described in this paper was partially supported by the National Basic Research Program of China (Grant No. 2002CB312006) and the National High-Tech R&D Program of China (Grant No. 2003AA411022).
An Efficient Way to Build Secure Disk∗
Fangyong Hou, Hongjun He, Zhiying Wang, and Kui Dai
School of Computer, National University of Defense Technology, Changsha, 410073, P.R. China
Abstract. Protecting data confidentiality and integrity is important for secure computing. An approach that integrates encryption with hash-tree-based verification is proposed here to protect disk data. Operating at the sector level, it provides online checking, high resistance against attacks, protection of any data, and a unified low-level mechanism. To achieve satisfactory performance, it adopts a specially structured hash tree and defines the hash sub-trees that correspond to frequently accessed disk regions as hot-access-windows. By utilizing hot-access-windows, simplifying the layout of the tree structure and properly buffering a portion of the hash tree nodes, it sufficiently reduces the cost of protection. At the same time, it supports fast recovery to maintain consistency effectively. The related model, approach and system realization are elaborated, together with testing results. Theoretical analysis and experimental simulation show that it is a practical way to build a secure disk.
1 Introduction
Providing a private and tamper-proof environment is a crucial factor in ensuring secure or trusted computing. In this paper, we focus on protecting the data confidentiality and integrity of mass storage devices, with the specific instance of a locally connected hard disk. Here, confidentiality means preventing unauthorized data disclosure, while integrity means protecting data from corruption or unauthorized modification. Confidentiality is usually provided through cryptography, and secret-key cryptography (symmetric cryptography, such as a block cipher like AES) is applied to protect large amounts of data. Generally, the process of encryption/decryption is relatively straightforward. Providing solid integrity is a difficult task, especially when online checking and resistance against replay attacks are required. Online mode checks integrity after each access. It avoids committing erroneous results (for example, checking the integrity of a whole file cannot detect an invalid data block before all the data blocks of the file have arrived; if portions of the file have taken effect before the entire file is verified, errors may be committed), but it requires more frequent checking than offline checking (which checks whether the untrusted storage device performs correctly after a
∗ This work is supported by National Laboratory for Modern Communications (No. 51436050505KG0101).
K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 290 – 301, 2006. © Springer-Verlag Berlin Heidelberg 2006
sequence of operations has been performed). A replay attack means that an intruder stores a message and its signature (which is often a hash result), and then uses them to spoof users later.

Generally, integrity is verified through an integrity code, which is also referred to as a MAC (Message Authentication Code). However, a MAC is vulnerable to replay attacks. To resist replay attacks, the hash tree, or Merkle tree [1], is widely used. A hash tree regards the content of all files/blocks at some point in time as one continuous set of data, and maintains a single (all of the data, hash) pair. Relying on its trusted root, an arbitrarily large storage can be verified and updated [2]. In this way, an intruder cannot replace some of the files (or blocks) without being detected. Although a hash tree can provide online checking and resist replay attacks, a naive hash tree makes the system too slow to use, because its checking process involves many node accesses. Some optimizations have been put forward. For example, CHTree [3] uses the L2 cache to store a portion of the hash tree nodes to improve performance when applying a hash tree to memory verification. LHash/H-LHash [4] uses an incremental multiset hash method to maintain access logs so that checking can be performed at a later time (which should be regarded as offline checking).

Many existing systems protect mass data. Among them, some only provide confidentiality, some provide integrity relying on a MAC, and some use the principle of the hash tree to resist replay attacks. CFS [5] and Cryptfs [6] encrypt file data to provide confidentiality. Tripwire [7] uses the principle of the MAC to check the integrity of files. SFSRO [8] uses the hash of a file-data block as the block identifier to guarantee the integrity of content data. SUNDR [9] uses a hash of a block as the block identifier and a hash tree to provide data integrity at the file system level. PFS [10] keeps a list, called the block map, that maps a file system block number to a hash of the block in order to protect data integrity. Arbre [11] builds a hash tree tightly into the file system design to protect the integrity of the entire file system. But nearly all of these existing systems have weaknesses in integrity protection. For example, PFS cannot prevent replay attacks. Arbre inherits some limitations of other tree-structured file systems, which make an application that requires frequent synchronization perform poorly.

In this paper, we bring forward an approach to protect hard disk confidentiality and integrity. Firstly, it operates at the sector level to give low-level protection, which protects meta-data and file data at the same time and is easy to deploy in existing systems. Secondly, it applies encryption to prevent intruders from understanding the data. Lastly, it constructs a hash tree to provide integrity in online mode, and verifies the whole protected data space as a single unit to resist the intractable replay attack. By simplifying the structure of the hash tree, utilizing the locality of hard disk accesses, buffering some of the hash tree nodes, and applying other measures such as asynchronous verification, it optimizes the hash tree checking process adequately to achieve high performance.

The rest of this paper is organized as follows. Section 2 elaborates the fundamental method of our protection. Section 3 describes the specific system realization. Section 4 presents the performance evaluation. Section 5 discusses some related issues. Section 6 concludes this paper.
F. Hou et al.
2 Protection Approach
2.1 Model
Considering the case in which a hard disk connects to the host through a local interface/bus (such as IDE or ATA), the protection model is shown in fig.1.
[Figure 1: the trusted side contains the Host with SecureDriver; the Interface connects it to the Hard Disk on the untrusted side]
Fig. 1. The considered protection model. The trusted boundary lies between host and the attached hard disk.
In fig.1, the hard disk is assumed to be vulnerable to attack. The trusted boundary lies between the host and the hard disk; that is, both the processor/controller and the system memory are treated as trusted. This is reasonable when considering hard disk protection individually. Existing techniques [3, 12] can be used to verify or protect the system memory, if necessary. A special component, “SecureDriver”, lies inside the trusted boundary, and its role is to: (i) provide confidentiality by encrypting any data stored to the hard disk, and (ii) provide integrity by checking that the value the host loads from a particular location of the disk is the most recent value that it stored to the same place. With the proposed model, our approach tries to achieve the following goals.
− Low-level protection. We want the protection mechanism not to touch the file system and to work at the lowest level of disk operation. Operating directly on each disk sector makes our mechanism a unified protection that can protect any data on the disk (metadata, file data, and temporary regions such as the swapped pages of virtual memory). At the same time, ignoring high-level data management gives transparent protection to existing systems, and such a low-level mechanism can be implemented straightforwardly on both legacy and modern storage systems (whether for a file system or for a raw disk, and whether under UNIX or Windows).
− Data encryption. Any data stored on the hard disk should be encrypted.
− Integrity checking in online mode with resistance against replay attacks. To achieve this, a hash tree is perhaps the only feasible way. However, a hash tree may incur a large cost to complete its checking processes, which conflicts with achieving high performance. So, sufficiently optimizing the hash tree becomes the key to disk protection.
− Consistency. As the hard disk is a permanent storage device, maintaining a consistent state in the event of a system crash is required.
− Performance. To make protection worthwhile, it must not impose too great a performance penalty. The related protection processes should perform well.
2.2 Integrity Verification Through Hash Tree
Hash Tree Optimizing. Utilizing the characteristics of the protection model, we propose a simple but practical method to optimize the hash tree checking process, as follows. We build a single hash tree over the whole protected disk space; then, we define an access-window to be a hash sub-tree that corresponds to one hard disk sub-space. Thus, the entire hash tree consists of a number of access-windows. As disk accesses exhibit strong locality, we define hot-access-windows to be the access-windows that are frequently accessed during a given period, and call the other access-windows cool-access-windows. Based on these concepts, a tree-style hash scheme is illustrated in fig.2.
[Figure 2: hash tree with the root at the uppermost level, top nodes at the middle level and leaf nodes at the lowest level; a hot-access-window hotWin is marked]
Fig. 2. The special hash tree with simplified structure and hot-access-windows. Structure specialization and hot-access-windows give the way to make optimization.
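The scheme of fig.2 can be sketched in a few lines: one leaf hash per sector, one top node per fixed-width access-window, and a root over all top nodes, with the leaf nodes of hot-access-windows kept in a trusted buffer. This is a minimal illustration that follows the processes of this section, not the system's implementation. SHA-256, the class name `WindowedHashTree`, its method names, and the in-memory list standing in for the untrusted disk are all assumptions.

```python
import hashlib
from typing import Dict, List


def h(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the collision-resistant hash.
    return hashlib.sha256(data).digest()


class WindowedHashTree:
    def __init__(self, disk: List[bytes], width: int) -> None:
        self.disk = disk      # untrusted sectors; length assumed a multiple of width
        self.width = width    # fixed access-window width, in sectors
        self.tops = [self._top(self._leaves(w)) for w in range(len(disk) // width)]
        self.root = h(b"".join(self.tops))
        self.hot: Dict[int, List[bytes]] = {}   # window index -> buffered leaf nodes

    def _leaves(self, w: int) -> List[bytes]:
        start = w * self.width
        return [h(s) for s in self.disk[start:start + self.width]]

    @staticmethod
    def _top(leaves: List[bytes]) -> bytes:
        return h(b"".join(leaves))

    def make_hot(self, w: int) -> None:
        """Check the window against its buffered top node, then buffer its leaves."""
        leaves = self._leaves(w)
        if self._top(leaves) != self.tops[w]:
            raise ValueError(f"integrity violation in window {w}")
        self.hot[w] = leaves

    def withdraw(self, w: int) -> None:
        """Fold the buffered leaves of a hot window back into its top node."""
        self.tops[w] = self._top(self.hot.pop(w))

    def read(self, sector: int) -> bytes:
        w, off = divmod(sector, self.width)
        if w in self.hot:     # hot window: compare with the buffered leaf, no extra reads
            ok = h(self.disk[sector]) == self.hot[w][off]
        else:                 # cool window: re-hash the whole window
            ok = self._top(self._leaves(w)) == self.tops[w]
        if not ok:
            raise ValueError(f"integrity violation at sector {sector}")
        return self.disk[sector]

    def write(self, sector: int, data: bytes) -> None:
        w, off = divmod(sector, self.width)
        self.disk[sector] = data
        if w in self.hot:
            self.hot[w][off] = h(data)                     # update only the buffered leaf
        else:
            self.tops[w] = self._top(self._leaves(w))      # recompute the top node

    def close(self) -> None:
        """Withdraw all hot windows and refresh the root."""
        for w in list(self.hot):
            self.withdraw(w)
        self.root = h(b"".join(self.tops))
```

In this sketch a read inside a hot window costs one hash and no extra disk access, while a read inside a cool window re-reads only that window, which is the saving the access-window design aims for.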
In fig.2, a special hash tree with a fixed three levels is constructed. The lowest-level nodes are leaf nodes. Each middle-level node is the collision-resistant hash of the leaf nodes affiliated with it. The root node lies at the uppermost level, and is created by hashing the concatenation of all the middle-level nodes. The whole protected space is split into access-windows, and each access-window has one middle-level node as the top node of its hash sub-tree. For example, h1, h2, h3 and h4 compose an access-window. When frequently accessed, it becomes a hot-access-window, such as hotWin1. In hotWin1, one middle-level node of the hash tree, h5, is the top node of the hash sub-tree. The integrity of h1, h2, h3 and h4 can be verified by h5, and updates to them can be aggregated at h5. The integrity of h5, as well as of the other top nodes of access-windows (e.g., h6), can be verified by the root, and updates to them can be aggregated at the root. In this way, dynamically changing data in the whole protected space can be verified and updated. To make it work well, rules and processes are designed as below.
− Rules. An access-window has a fixed width. Hot-access-windows can be located contiguously to cover a bigger protected region, or distributed discretely in different places to cover many small regions.
− Process1 -- Initialize. Done once per normal working cycle: (i) read all the protected data blocks to construct all the nodes of the hash tree; (ii) save all the top nodes of the access-windows to a non-volatile storage device (such as a flash memory disk, or a special region of the hard disk); (iii) save the root node to a non-volatile and secure place (such as a flash memory device placed inside the trusted boundary).
− Process2 -- Start. At the very beginning of each normal start: (i) load the saved top nodes and fill them into the working buffer; (ii) concatenate all the top nodes of the access-windows together, and compute the hash of the concatenated data; (iii) check that the resultant hash matches the saved root node.
− Process3 -- Prepare an access-window. (i) Read the protected data blocks covered by this access-window; (ii) hash each block to get each leaf node; (iii) concatenate all the leaf nodes of this access-window together, and compute the hash of the concatenated data to get the top node.
− Process4 -- Construct a hot-access-window. (i) Execute "Process3"; (ii) check that the newly generated top node matches the buffered top node of this hot-access-window; (iii) buffer all the generated leaf nodes.
− Process5 -- Verify a read covered by a hot-access-window. (i) Calculate the hash of the data block that is currently read; (ii) check that the resultant hash matches the corresponding buffered leaf node.
− Process6 -- Update for a write covered by a hot-access-window. (i) Set a flag to indicate that this hot-access-window has been modified; (ii) calculate the hash of the data block that is currently written; (iii) replace the corresponding buffered leaf node with the resultant hash.
− Process7 -- Remove a hot-access-window. For a hot-access-window that is no longer frequently accessed: (i) jump to step (iv) directly if this hot-access-window has never been modified; (ii) concatenate all the leaf nodes of this hot-access-window together, and compute the hash of the concatenated data; (iii) update the corresponding buffered top node of this hot-access-window with the resultant hash; (iv) withdraw this hot-access-window and discard its buffered leaf nodes; (v) build a new hot-access-window in another place (during working time, withdrawing an old hot-access-window implies creating a new one).
− Process8 -- Verify a read covered by a cool-access-window. (i) Execute "Process3"; (ii) check that the newly generated top node matches the buffered top node of this cool-access-window.
− Process9 -- Update for a write covered by a cool-access-window. (i) Execute "Process3"; (ii) replace the buffered top node of this cool-access-window with the newly generated one.
− Process10 -- Exit. At each normal exit: (i) withdraw all the hot-access-windows as in "Process7" (without creating new hot-access-windows); (ii) concatenate all the top nodes of the access-windows together, and compute the hash of the concatenated data; (iii) update the root node with the resultant hash; (iv) save all the top nodes back to their permanent storage place (as mentioned in "Process1").

The concept of the hot-access-window is similar to that of Hou et al. [13]. But here, as the system memory is treated as trusted, the related processes are significantly different, in order to make better use of a relatively large trusted buffer. The approach also has some similarities with the CHTree method, but its simplified structure and regular allocation scheme differ greatly. Additionally, its regularity makes the recovery process easier than with CHTree. Because all the leaf nodes of a hot-access-window are buffered, checking in a hot-access-window does not need additional disk accesses. Checking in a cool-access-window only
need to read the disk region covered by one access-window. Top nodes of hot-accesswindows are updated only when withdrawing, and root node is updated only when exiting. So, most costly updates of higher-level hash nodes are combined and delayed without affecting the run-time performance, while it still provides online verifying. Fast Recovery. A consistent state can be treated as: (i) a write to a disk sector is completed, and (ii) corresponding hash tree update along the checking path until reaching the root node is completed. Recovery must ensure these two steps to be synchronous (encryption process won’t incur an inconsistent state). Although consistency can be maintained by executing "Section3.1, Algorithm1" again after unexpected crash, it requires big reading cost and has no security assurance. Fast recovery is proposed to make fast and secure recovery through a special data structure called Update-Snap. For the moment, we only describe how to make fast recovery, and we will discuss its security later. Three fields are contained in Update-Snap. One field is a flag, called UFlag, to indicate that if the top node of one access-window is consistent with the protected disk region covered by this access-window (flag value of “Y” means consistency, while “N” means inconsistency). Another field, called UHash, holds the value of the top node of one access-window. The last field is an index to associate each (UFlag, UHash) pair with its corresponding access-windows. To maintain Update-Snap, SecureDriver does as below. − For any modification to the protected region covered by an access-window, sets the corresponding UFlag to "N". This should be completed at the beginning of “Section2.2, 1), Process6, Update for a write covered by a hot-access-window” and “Section2.2, 1), Process9, Update for a write covered by a cool-access-window”. − For any updating to the top node, writes the value of this top node to the corresponding UHash, and sets its UFlag to "Y". 
This should be executed at the end of “Section2.2, 1), Process7, Remove a hot-access-window”, as well as at the end of “Section2.2, 1), Process9, Update for a write covered by a cool-access-window”. To recover from a crash, SecureDriver does the following.
− For each record in Update-Snap, it fetches UHash directly if UFlag equals "Y". Otherwise, it reads the disk region covered by the corresponding access-window to recalculate its top node; then, it replaces UHash with the newly calculated value and sets UFlag to "Y".
− It replaces the saved top nodes mentioned in "Section2.2, 1), Process1, Initialize" with the new values obtained in the above step.
− It concatenates all the top nodes together and computes the hash of the concatenated data; then, it updates the root node with the resulting hash.
For fast recovery, the worst case is that, when the system crashes, all the hot-access-windows have been modified but have not been removed or withdrawn, and the current update falls into a cool-access-window whose top node has not yet been updated. In this case, recovery only needs to read the disk regions covered by all the hot-access-windows and one cool-access-window. This cost is far smaller than reading the entire protected disk space. Additionally, recording Update-Snap during
F. Hou et al.
the working time occurs only occasionally, and the related operations are lightweight. So, fast recovery is suitable for cases where recovery time is critical.
Asynchronous Verification. Online checking may stall the next disk access while it waits for the current check to complete, especially when disk accesses are bursty. To give better performance, we allow limited asynchronous integrity checking. For this purpose, a queue holds several disk accesses so that one access can start immediately after the previous one without waiting for the result of verification, until the queue is full. Thus, the impact of verification delay on the system is reduced. “Limited” means that the queue is much shorter than in pure offline verification (which makes one check for a sequence of operations about millions of accesses long, as in LHash). When asynchronous checking is applied, execution is actually “speculative”: an invalid sector may have been used before the integrity violation is detected. As the length of the asynchronous checking queue is limited, the probability of committing an erroneous result can also be kept within an acceptable level. The existence of such a buffer may affect the correctness of fast recovery. A possible solution is to set UFlag immediately when a write transaction is put into this queue, even though the related checking process has not yet been executed.
2.3 Data Encryption
We apply a cipher in SecureDriver to encrypt/decrypt any data sector written to/read from the hard disk. The cipher used by SecureDriver is the AES block cipher. In fact, file or file system encryption needs a complex key management process, covering, for example, how cryptographic keys are applied and how they are revoked. Different systems control this process in various ways. In real usage, the secret key used by the cipher of SecureDriver should be changeable through an interface to a high-level management program.
In this paper, we leave these issues aside (assuming such management is the task of other components, such as the OS). For convenience, we assume that the AES cipher always uses the same 128-bit secret key. Like the root node of the hash tree, the AES secret key should also be kept in a secure, non-volatile memory device.
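As an illustration only, the Update-Snap crash-recovery procedure described under Fast Recovery can be sketched in Python as follows. The sketch assumes 512-byte sectors and assumes that a top node is the SHA-1 hash of the concatenated SHA-1 hashes of the sectors in an access-window. This layout, the names SnapRecord, top_node and recover, and the read_window callback are assumptions made for the example and are not taken from SecureDriver.

```python
import hashlib
from dataclasses import dataclass

SECTOR_SIZE = 512  # assumed bytes per sector (one leaf node per sector)


def sha1(data: bytes) -> bytes:
    """Return the 20-byte SHA-1 digest of data."""
    return hashlib.sha1(data).digest()


def top_node(window_data: bytes) -> bytes:
    """Assumed layout: hash each sector, then hash the concatenated leaves."""
    leaves = [sha1(window_data[i:i + SECTOR_SIZE])
              for i in range(0, len(window_data), SECTOR_SIZE)]
    return sha1(b"".join(leaves))


@dataclass
class SnapRecord:
    uflag: str    # "Y": UHash matches the disk region; "N": it may be stale
    uhash: bytes  # stored top node of the access-window


def recover(snap, read_window):
    """Rebuild top nodes and the root node after a crash.

    snap: list of SnapRecord, indexed by access-window number.
    read_window: hypothetical callback returning the bytes of one window.
    """
    for index, record in enumerate(snap):
        if record.uflag != "Y":
            # Only stale windows are re-read from disk and re-hashed.
            record.uhash = top_node(read_window(index))
            record.uflag = "Y"
    top_nodes = [record.uhash for record in snap]
    # The root node is the hash of all top nodes concatenated together.
    root = sha1(b"".join(top_nodes))
    return top_nodes, root
```

In this sketch only the windows whose UFlag is "N" are read from disk, which is why the recovery cost is bounded by the number of inconsistent windows rather than by the size of the protected space.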
3 System Realization
A specific realization needs to choose specific parameters for the hash tree. One is the block size covered by each leaf node, which is set to one disk sector (usually 512B). Another is the width of the access-window. A wider window increases the cost of preparing an access-window and slows the process of hashing leaf nodes to match the top node. A narrower window gives better coverage of the disk regions that are frequently accessed, but needs more buffer space to hold more top nodes of access-windows. A compromise is to set the width to cover a 64KB disk region, or 128 contiguous sectors. Such a width makes matching the top node of an access-window quick enough, keeps disk coverage reasonable, and keeps the number of access-windows from becoming too large. The last parameter is the number of hot-access-windows. More of them speeds up checking further, but requires more buffer space to hold more leaf nodes. Additionally, it may slow down the process of fast recovery. For
common cases, maintaining 1K hot-access-windows is appropriate. This means that about 64KB * 1K = 64MB of protected space is covered by hot-access-windows at any time. Additionally, we use the SHA-1 hash function to produce hash tree nodes of 160 bits, or 20B. With these selections, the root node is held in secure memory with a capacity requirement of 20B. For a 10GB hard disk partition, there are about (10GB / 64KB ≈ 0.16M) access-windows. So, buffering the top nodes and the leaf nodes of hot-access-windows needs a buffer of about (0.16M*20B + 128*20B*1K ≈ 5.7MB). The modules of the SecureDriver realization are shown in fig.3.
[Figure 3: block diagram with the modules Tree Checker, Node Buffer, Sector Buffer, Secure Memory, AES Cipher, Snap Recorder, Coupler, and the hard disk interface.]
Fig. 3. Modules of the specific realization of SecureDriver. This security program operates at the disk driver layer, and uses system memory as its trusted buffer.
In fig.3, "Coupler" transfers sectors to/from the disk. According to the addresses (i.e., sector numbers), it can determine which disk regions should be covered by hot-access-windows. To allow asynchronous checking, a queue, "Sector Buffer", is used to hold a number of pending disk accesses. As a compromise, the queue size is set to 16KB. "Tree Checker" executes the actions of hash tree optimization. "Node Buffer" holds all the top nodes of access-windows, as well as the leaf nodes of hot-access-windows. To avoid reading the same disk region twice, it also holds the leaf nodes of the current cool-access-window (requiring an additional 128*16B). So, subsequent checks in the same cool-access-window, as well as converting it into a hot-access-window immediately, do not need to read the corresponding disk region again. "Secure Memory" holds the root node of the hash tree permanently, as well as the AES secret key. "AES Cipher" is a cryptographic routine that converts one sector into its ciphertext or vice versa. Finally, "Snap Recorder" maintains Update-Snap to prepare for fast recovery. For a 10GB hard disk, the size of Update-Snap is about (UFlag+UHash+Index) * (number of access-windows) = (1bit+160bit+20bit) * 0.16M ≈ 3.6MB.
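The sizing figures of this section can be re-derived with a short calculation. The following Python sketch is only an illustration: the function name sizing and its parameter names are hypothetical, and it assumes binary units (1KB = 1024B). Under these assumptions the results come out close to the rounded values quoted in the text, with small differences that depend on how the units are rounded.

```python
KB, MB, GB = 2**10, 2**20, 2**30  # binary units are an assumption


def sizing(partition=10 * GB, window=64 * KB, sector=512,
           digest=20, hot_windows=1024, index_bits=20):
    """Derive buffer and Update-Snap sizes from the Section 3 parameters."""
    leaves = window // sector        # leaf nodes (sectors) per access-window
    windows = partition // window    # number of access-windows
    # Top nodes of all windows plus leaf nodes of the hot-access-windows.
    node_buffer = windows * digest + hot_windows * leaves * digest
    # One (UFlag, UHash, Index) record per access-window, counted in bits.
    snap_bits = windows * (1 + digest * 8 + index_bits)
    return {
        "leaves_per_window": leaves,
        "num_windows": windows,
        "hot_coverage_bytes": hot_windows * window,
        "node_buffer_bytes": node_buffer,
        "snap_bytes": (snap_bits + 7) // 8,
    }


if __name__ == "__main__":
    for name, value in sizing().items():
        print(name, value)
```

Changing hot_windows or window in the call shows directly how the buffer requirement trades off against coverage, which is the compromise discussed above.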
4 Performance Simulation
Compared with the speed of disk access, encryption/decryption latency does not become the performance bottleneck. For integrity checking, the main cost comes from fetching disk sectors. Leaving aside initialization, system start and exit, three cases affect performance. Constructing a new hot-access-window is the first case, which requires reading
the disk region covered by this hot-access-window to build the corresponding hash sub-tree (128 contiguous sectors must be read). Checking an access in a new cool-access-window is the second case, which has a cost similar to the first. Fortunately, these two cases do not occur frequently relative to the total number of disk accesses. The last case is that the checking throughput of hot-access-windows cannot keep up with the bandwidth of disk access. In this case, although the accesses are covered by hot-access-windows, the checking delay will postpone subsequent accesses. However, calculating several hashes and making a match in system memory completes much more quickly than disk I/O. Additionally, the asynchronous checking queue helps greatly to hide the checking delays caused by the cases above. To evaluate performance and correctness, we build a simulation framework on a PC with a 1.7GHz P4 CPU, 256MB of PC2100 DDR SDRAM and an ATA100 disk. We write a block driver for the Linux kernel to implement our SecureDriver. A 10GB disk partition is tested. To check correctness, we exit normally after running for some time. Then, we reconstruct the hash tree and compare it with the saved top nodes, as well as the root node. A match means that the driver works correctly. For simplicity, we do not implement fast recovery in the simulation. Simulation results are shown in fig.4.
[Figure 4: two panels, (a) and (b), with legend entries Trace-A, Trace-B and Trace-C.]
Fig. 4. (a) Main performance results, measured for a 128-sector access-window width, 1024 hot-access-windows, and a 16KB asynchronous checking queue; each sample is compared separately, with the unprotected case set to "1.0". (b) Performance results for different access-window widths, tested for 128, 256 and 2048 sectors per access-window; the capacity of the working buffer is kept the same, that is, the number of hot-access-windows is adjusted accordingly; the best result is set to "1.0".
In fig.4, we use several disk traces to imitate different disk usages. Trace-A is captured from the Andrew benchmark [14] (obtained by I/O monitoring when running it on the same PC), while Trace-B and Trace-C are edited from the HP Labs TPC-C and TPC-D [15] trace files respectively (we cut some segments from these two traces). Fig.4 (a) shows that there is a slight performance penalty of less than 5% for common cases (Trace-A and Trace-B). Trace-C has a visible performance decline of about 12%. The reason is that Trace-C comes from TPC-D, which reflects the disk usage of DSS (Decision Support System) applications. In TPC-D applications, large areas of the disk are scanned (such as when searching or summing in a big database), and such scans do not always have strong physical locality. So, the utilization of hot-access-windows is greatly affected; that is, hot-access-windows are removed more frequently and many disk accesses fall into cool-access-windows.
Some parameters are also adjusted to investigate more deeply. Fig.4 (b) shows that too wide an access-window is not a good selection. With a wider access-window, preparing an access-window requires reading more disk data. Additionally, it may give worse coverage of the disk regions that are frequently accessed. Different selections of other parameters may give different results. Related simulations confirm some intuitive results: higher hash and AES throughput (such as from a special cryptographic accelerator), a longer asynchronous checking queue, and more hot-access-windows can improve performance. However, careless selections may have drawbacks; for example, too long an asynchronous checking queue does not accord well with the meaning of online verification.
5 Discussions
At the current stage, the security of Update-Snap based fast recovery has not been fully studied; this is one of our future tasks. Here, we just list the potential attacks and suggest possible solutions as follows.
− Directly tampering with UHash in a (UHash, UFlag = "Y") pair will cause an integrity violation when the corresponding protected disk region is checked later, because the regenerated top node will differ from the stored value of UHash.
− All the (UHash, UFlag) pairs should be encrypted with a secret key known only to the core of SecureDriver. Then, an attempt to recalculate the hash nodes of some maliciously modified sectors and replace the value of a (UHash, UFlag = "Y") pair can be detected, because the intruder cannot produce the proper ciphertext of the (UHash, UFlag = "Y") pair without obtaining the encryption key.
− To prevent an intruder from replacing a new pair with a copied old (UHash, UFlag = "Y", Sectors covered by this access-window) pair, one available way is to apply an incremental hashing scheme [4, 16] to authenticate Update-Snap at low run-time cost. That is, whenever Update-Snap is modified (for example, one UFlag is set to "N", or one UHash is updated and its UFlag is set to "Y" again), it is signed through an incremental cryptographic method and the signature is saved to a secure place.
− Malicious modifications to disk sectors for a (UFlag = "N", Sectors covered by this access-window) pair cannot be prevented directly. As the recovery process reads the corresponding sectors and recalculates the hash sub-tree, an intruder can tamper with disk sectors and make such tampering take effect. A possible solution is to maintain an access log for each hot-access-window and the current cool-access-window. As soon as the UFlag is set to “N”, the log begins to record each modification to disk sectors; when the UFlag is set to “Y” again, the corresponding log is cleared.
With these logs, the system can roll back to the most recent state of the (Correct UHash, UFlag = "Y", Correct sectors covered by this access-window) pair.
− Additionally, it is better to allow only authorized, reliable users to perform recovery, so as to prohibit arbitrary "crash-recovery" operations.
In fact, the UHash contained in Update-Snap can be the same as the permanent storage of top nodes mentioned in “Section2.2, 1), Process1, Initialize”. For clarity, we give UHash a separate logical name.
Besides operating on disk sectors directly, another available selection is to set the block size equal to several contiguous sectors, such as a “cluster” (commonly used by file systems as the basic data unit). This is equivalent to providing “low-level” protection in real usage, while it greatly decreases the number of leaf nodes and so reduces the cost of node buffering. The most flexible selection in our hash tree verification scheme is the width of the access-window. If run-time performance is more important than any other consideration (e.g., memory occupation, recovery time, etc.), it is better to have more hot-access-windows with a smaller width. From the point of view of tamper detection alone, the permanent storage of top nodes mentioned in “Section2.2, 1), Process1, Initialize” does not need protection, as tampering with them causes a mismatch when they are compared with the root node. However, to improve system availability, it is better to protect these nodes as well. Otherwise, if these nodes are tampered with, the whole hash tree has to be rebuilt to match the root node, which takes a long time. We can put these nodes into a special disk region that cannot be accessed through the “public” disk interface to give them a certain degree of protection. According to the allocation schemes of common file/storage systems, disk files created on a mostly empty disk are likely to occupy sequential sectors. However, they may become fragmented after many operations or a long running time. This affects the efficiency of our hash tree optimization method, as it degrades the locality of disk I/O. Regularly running a “Disk Defragmenter” program (which can combine file fragments into one contiguous region) may improve this. Wang has found collisions for the MD5 hash function [17]. In fact, SHA-1 also has collisions.
For highly security-critical application scenarios, we should use SHA-2 to construct the hash tree, but it takes up more space to store and buffer hash nodes (because the output of SHA-2 is longer than that of MD5/SHA-1).
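To give a rough sense of this space cost, the following Python sketch compares the node buffer needed with SHA-1 and with SHA-256. SHA-256 is chosen here only as one representative member of the SHA-2 family; the function name node_buffer_bytes and the default parameters (a 10GB partition with 64KB access-windows, 1024 hot-access-windows, 128 leaves per window) are assumptions taken from the example configuration of Section 3.

```python
import hashlib


def node_buffer_bytes(digest_size, num_windows=163840,
                      hot_windows=1024, leaves_per_window=128):
    """Bytes needed for all top nodes plus the leaves of hot-access-windows."""
    return digest_size * (num_windows + hot_windows * leaves_per_window)


if __name__ == "__main__":
    for name in ("sha1", "sha256"):
        # digest_size is the hash output length in bytes.
        size = hashlib.new(name).digest_size
        print(name, size, node_buffer_bytes(size))
```

Because every buffered node grows with the digest length, the buffer requirement in this sketch scales linearly with the output size of the chosen hash function.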
6 Conclusions
By encrypting data and constructing a hash tree on the protected disk sectors, our approach provides solid protection at the lowest level of disk access. To achieve good performance, it uses the concept of the hot-access-window to speed up most integrity checks. Together with a simplified tree layout and proper buffering of hash tree nodes, the checking process is well optimized. For common cases, the performance penalty is less than 5%. As we have demonstrated and discussed, this approach should be a practical way to protect hard disks against information disclosure and tampering.
References
1. R. C. Merkle: Protocols for public key cryptography. IEEE Symposium on Security and Privacy (1980) 122-134
2. M. Blum, W. S. Evans, P. Gemmell, S. Kannan, and M. Naor: Checking the correctness of memories. IEEE Symposium on Foundations of Computer Science (1991) 90-99
3. B. Gassend, G. E. Suh, D. Clarke, M. van Dijk, and S. Devadas: Caches and merkle trees for efficient memory authentication. Ninth International Symposium on High Performance Computer Architecture (2003)
4. G. E. Suh, D. Clarke, B. Gassend, M. van Dijk, and S. Devadas: Hardware Mechanisms for Memory Integrity Checking. Technical report, MIT LCS TR-872 (2003)
5. M. Blaze: A cryptographic file system for unix. In 1st ACM Conference on Communications and Computing Security (1993) 9-16
6. E. Zadok, I. Badulescu, and A. Shender: Cryptfs: A stackable vnode level encryption file system. Technical report, Computer Science Department, Columbia University (1998)
7. Tripwire. http://www.tripwire.org
8. K. Fu, F. Kaashoek, and D. Mazieres: Fast and secure distributed read-only file system. In Proceedings of OSDI 2000 (2000)
9. D. Mazieres and D. Shasha: Don't trust your file server. 8th Workshop on Hot Topics in Operating Systems (2001)
10. C. A. Stein, J. H. Howard, and M. I. Seltzer: Unifying file system protection. In 2001 USENIX Annual Technical Conference (2001) 79-90
11. Fujita Tomonori and Ogawara Masanori: Protecting the Integrity of an Entire File System. First IEEE International Workshop on Information Assurance (2003)
12. G. E. Suh, D. Clarke, B. Gassend, M. van Dijk, S. Devadas: Aegis: Architecture for tamper-evident and tamper-resistant processing. 17th Int'l Conference on Supercomputing (2003)
13. Fangyong Hou, Zhiying Wang, Yuhua Tang, Jifeng Liu: Verify Memory Integrity Basing on Hash Tree and MAC Combined Approach. International Conference on Embedded and Ubiquitous Computing (2004)
14. J. H. Howard, M. L. Kazar, S. G. Menees, D. A. Nichols, M. Satyanarayanan, R. N. Sidebotham, M. J. West: Scale and performance in a distributed file system. ACM Transactions on Computer Systems, Vol.6, February (1988) 51-81
15. HP Labs. Tools and traces. http://www.hpl.hp.com/research/
16. M. Bellare and D. Micciancio: A New Paradigm for collision-free hashing: Incrementality at reduced cost. In Proceedings of Eurocrypt'97, Springer-Verlag LNCS 1233 (1997)
17. Xiaoyun Wang, Dengguo Feng, Xuejia Lai, and Hongbo Yu: Collisions for hash functions MD4, MD5, HAVAL-128 and RIPEMD. Crypto2004 (2004)
Practical Forensic Analysis in Advanced Access Content System
Hongxia Jin and Jeffery Lotspiech
IBM Almaden Research Center, San Jose, CA, 95120
{jin, lotspiech}@us.ibm.com
Abstract. In this paper we focus on the use of a traitor tracing scheme for distribution models that are one-to-many. Such a model can be a networked broadcast system, or it can be based on prerecorded or recordable physical media. In this type of system, it is infeasible to mark each copy differently for each recipient. Instead, the system broadcasts limited variations at certain points, and a recipient device has the cryptographic keys that allow it to decrypt only one of the variations at each point. Over time, when unauthorized copies of the protected content are observed, a traitor tracing scheme allows the detection of the devices that have participated in the construction of the pirated copies. The authors have been involved in what we believe is the first large-scale deployment of the tracing traitors approach, in a content protection standard for the new generation of high-definition DVD optical discs. Along the way, we have had to solve both practical and theoretical problems that had not been apparent in the literature to date. In this paper we mainly present this state of practice of traitor tracing technology and share some of our experience in bringing this important technology to practice.
1 Introduction
AACS [1], the Advanced Access Content System, was founded in July 2004 by eight companies: Disney, IBM, Intel, Matsushita, Microsoft, Sony, Toshiba, and Warner Brothers. It develops content protection technology for the next generation of high-definition DVD optical discs. It supports expanded flexibility in accessing, managing, and transferring content within a standalone or networked environment. Compared to the previous DVD CSS system, which is a flat “do not copy” technology, AACS is an enabling technology, allowing consumers to make authorized copies of purchased movie discs, and potentially enriching the experience of the movie with an online connection. The fundamental protection of the AACS system is based on broadcast encryption with a subset-difference tree using device keys and a media key block [2]. It allows unlimited, precise revocation without danger of collateral damage to innocent devices. The mechanism is designed to exclude clones or compromised devices, such as the infamous “DeCSS” application used for copying “protected” DVD Video disks. Once the attacker
K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 302–313, 2006. © Springer-Verlag Berlin Heidelberg 2006
has been detected, they are excluded from newly released content because the new media key blocks in the new content exclude the keys known to the attackers. However, the AACS founders do not believe that this level of renewability solves the piracy problem completely. What if an attacker re-digitizes the analogue output from a compliant device and redistributes the content in unprotected form? It can be an exact in-the-clear digital copy of the movie, with all of its extra navigation and features. In this case, the only forensic evidence available is the unprotected copy of the content. Also, because of the inherent power of revocation in the AACS system, it is possible that attackers may forgo building clones or non-compliant devices and instead devote themselves to server-based attacks in which they try to hide the underlying compromised device(s). In one particular attack, one could imagine the attackers building a server that distributes per-movie keys. Of course, the attackers would have to compromise the tamper-resistance of one or more players to extract these keys. This is progress, because these server attacks are inherently more expensive for the attackers. However, AACS found it desirable to be able to respond even to these types of attacks. These attacks are anonymous. The only forensic evidence available is the per-movie keys or the actual copy of the content. To help defend against these types of attacks, the AACS system uses tracing traitors technology. AACS uses the term sequence keys to refer to its tracing traitors technology against the anonymous attack. The suitability of the term will become apparent. However, to be consistent with the cryptographic literature, in this paper a device that engages in piracy will be called either a traitor or a colluder.
The AACS sequence key scheme allows us to apply the watermark early in the content publishing process while still providing traceability down to the individual content recipient. The traitor tracing problem was first defined by Fiat and Naor in a broadcast encryption system [3]. This system allows encrypted content to be distributed to a privileged group of receivers (decoder boxes). Each decoder box is assigned a unique set of decryption keys that allows it to decrypt the encrypted content. What are the security problems with this system? A group of colluders can construct a clone pirate decoder that can decrypt the broadcast content. This threat is different from the one we are dealing with in this paper. The threat model that this paper is concerned with is what AACS has called the “anonymous attack”. As we mentioned earlier, attackers can construct a pirate copy of the content (content attack) and try to resell the pirate copy over the Internet. Or the attackers can reverse-engineer the devices and extract the decryption keys (key attack). They can then set up a server and sell decryption keys on demand, or build a circumvention device and put the decryption keys into it. There are two well-known models for how a pirated copy (be it the content or the key) can be generated:
1. Given two variants v1 and v2 of a segment, the pirate can only use v1, or v2, or an unrecognizable variant, but not any other valid variant.
2. Given two variants v1 and v2 of a movie segment (v1 ≠ v2), the pirate can generate any variant out of v1 and v2.
H. Jin and J. Lotspiech
In this paper, we assume that the attackers are restricted to the first model. This is not an unreasonable assumption in the AACS application. To enable tracing for the content attack, each piece of content distributed in the AACS application needs to have different variations. Watermarking is one way to create different variations of the content. In a practical watermarking scheme, when given some variants of a movie segment, it would be infeasible for the colluders to come up with another valid variant, because they do not have the essential information to generate such a variant. Even if they mount other attacks, such as averaging two variants, the result may be an unrecognizable version; it can hardly be another valid variant. There are also methods such as [7] that make it very difficult for colluders to remove the marks. However, an important point to note is that building a different variation is a media-format-specific problem. Watermarking is only one of the solutions. For example, in a DVD format using a blue laser, the variation can simply be a different playlist. In this case, it has nothing to do with watermarking; thus it is not restricted by the watermark robustness requirement. Also, for the key attack, the traitors need to redistribute at least one set of keys for each segment. For cryptographic keys that are generated randomly, it is impossible to generate a valid third key by combining two other valid keys. This is equivalent to the first model. A tracing scheme is static if it pre-determines the assignment of the decryption keys for the decoder, or of the watermarked variations of the content, before the content is broadcast. The traitor tracing schemes in [4],[5] are static and probabilistic. They randomly assign the decryption keys to users before the content is broadcast.
The main goal of these schemes in their context is to make the probability of exposing an innocent user negligible under as many real traitors in the coalition as possible. Fiat and Tassa introduced a dynamic traitor tracing scheme [8] to combat the same piracy under the same business scenario considered in this paper. In their scheme, each user gets one of the q variations for each segment. However, the assignment of the variation of each segment to a user is decided dynamically, based on the observed feedback from the previous segment. The scheme can detect up to m traitors, but it involves real-time computational overhead. To avoid this drawback, sequential traitor tracing was presented in [10], and more formal analyses are given in [11], [12]. AACS uses a model similar to [10] that requires no real-time computation or feedback. However, AACS has designed a traitor tracing scheme that attempts to meet all the practical requirements. Existing traitor tracing schemes either need more bandwidth than the content provider can economically afford, or accommodate too few players to be practical, or handle too few colluding traitors. Bringing the long-standing theoretical work to practice was the major effort we undertook in the AACS system. In the rest of this paper we first summarize our basic scheme. We then focus on discussing some other practical problems we encountered in implementing the AACS tracing traitors scheme. AACS has been a collaborative
effort amongst the eight companies involved. Although the authors were the individuals primarily involved in this aspect of AACS, we benefited extensively from discussions, reviews, and proposals from the other companies. We would like to especially acknowledge Toru Kambayashi from the Toshiba Corporation, who worked out the details of mapping the technology to the HD-DVD disc format, and Tateo Oishi from the Sony Corporation, who worked out the details of mapping the technology to the Blu-ray Disc format.
2 Overhead to Enable Tracing
As we mentioned above, in order to enable tracing, the content needs to be prepared with different variations. These variations occupy extra bandwidth in a network broadcast system and extra space on physical optical media. Although the new generation of DVDs has substantially more capacity, the studios can use that capacity to provide a high-definition picture and to offer more features on the disc. While it is perfectly reasonable in a theoretical context to talk about schemes that increase the space required by 200% or 300%, no movie studio would accept this. A traitor tracing scheme can be practical only if it requires an acceptable overhead. In general, most studios were willing to accept some overhead for forensics, for example, below 10%. As a nominal figure, we began to design assuming we had roughly 8 additional minutes (480 seconds) of video strictly for forensic purposes for a normal 2-hour movie. Theoretically we could use our 480 seconds to produce the most variations possible. For example, at one particular point in the movie, we could produce 960 variations of 1/2 second duration. In reality this clearly would not work. The attackers could simply omit that 1/2 second in the unauthorized copy without significantly degrading the value of that copy. We believe a better model is to have on the order of 15 carefully picked points of variation in the movie, each of duration 2 seconds, and each having 16 variations. Even with this, one could argue that the attackers can avoid these 30 or so seconds of the movie. Our studio colleagues have studied whether these parameters are sufficient, and their answer is, frankly, “it depends”. Different formats seem to favor different parameters. As a result, when we mapped our scheme to the actual disc formats, it was very important to make sure that the duration of the variations was not pre-determined. Of course, longer durations require more overhead.
But this is a tradeoff studios can make. We should also mention that whether or not a given movie uses tracing traitors technology is always the studio’s choice. In the absence of attacks, they would never use it, and the discs would have zero overhead for this purpose. To generalize the above observation, in the AACS model, we assume that each movie is divided into multiple segments, among which n segments are chosen to have differently marked variations. Each of these n segments has q possible variations. Each playing device receives the same disc with all the small variations at chosen points in the content. However, each variation is encrypted with a different set of keys such that any given device out of the player population has
access to only one particular variation for each segment. Therefore, each device plays back the movie through a different path, which effectively creates a different movie version. Each version of the content contains one variation for each segment. The model described here is the same as that used in [8][10] for content tracing. Each version can be denoted as an n-tuple $(x_0, x_1, \ldots, x_{n-1})$, where $0 \le x_i \le q-1$ for each $0 \le i \le n-1$. A coalition could try to create a pirated copy based on all the variations broadcast to them. For example, suppose that there are m colluders. Colluder j receives a content copy $t_j = (t_{j,0}, t_{j,1}, \ldots, t_{j,n-1})$. The m colluders can build a pirated copy $(y_0, y_1, \ldots, y_{n-1})$ where the ith segment comes from some colluder $t_k$; in other words, $y_i = t_{k,i}$ for some $1 \le k \le m$, for each $0 \le i \le n-1$. Unfortunately, the variations $(y_0, y_1, \ldots, y_{n-1})$ associated with the pirated copy could happen to belong to an innocent device. A weak traitor tracing scheme aims to prevent a group of colluders from “framing” an innocent user. In the AACS scheme we only deal with strong traitor tracing schemes, which allow at least one of the colluders to be identified once such pirated copies are found. The AACS traitor tracing scheme, called sequence keys hereafter, is a static scheme. Like all tracing schemes in this category, it consists of two basic steps:
1. Assign a variation for each segment to devices.
2. Based on the observed re-broadcast keys or contents, trace back the traitors.
2.1 Basic Key Assignment
For the first step, AACS systematically allocates the variations based on an error-correcting code. A practical scheme needs to have small extra disc space overhead, accommodate a large number of devices in the system, and be able to trace devices under as large a coalition as possible. Unfortunately these requirements are inherently conflicting. Assume that each segment has q variations and that there are n segments. A small extra bandwidth means a small q. We represent the assignment of segments for each user using a codeword $(x_0, x_1, \ldots, x_{n-1})$, where $0 \le x_i \le q-1$ for each $0 \le i \le n-1$. Consider an [n, k, d] code, where n is the length of the codewords, k is the source symbol size and d is the Hamming distance, which corresponds to the minimum number of segments by which any two codewords differ. To defend against a collusion attack, intuitively we would like the variant assignments to be as far apart as possible. In other words, the larger the Hamming distance, the better the traceability of the scheme. On the other hand, the maximum number of codewords the [n, k, d] code can accommodate is $q^k$. In order to accommodate a large number of devices, e.g. billions, intuitively either q or k or both have to be relatively big. Unfortunately a big q means big bandwidth overhead, and a big k means a smaller Hamming distance and thus weaker traceability. It is inherently difficult to defend against collusions. In order to yield a practical scheme that meets all the requirements, AACS concatenates codes [9]. The variations in each segment are assigned following a code, namely the inner code, whose codewords are then encoded using another code, namely the outer code. We call the nested code the super code. The inner code effectively creates multiple movie versions for any movie, and the outer code assigns different movie versions to the user over a sequence of movies, hence the
Practical Forensic Analysis in Advanced Access Content System
term “sequence keys”. This super code avoids the overhead problem by having a small number of variations at any single point. For example, both inner and outer codes can be Reed-Solomon (RS) codes [9]. In an [n, k, d] RS code, d = n − k + 1. For our inner code, we can choose $q_1 = 16$, $n_1 = 15$ and $k_1 = 2$, thus $d_1 = 14$. For the outer code, we can choose $q_2 = 256$, $n_2 = 255$ and $k_2 = 4$, thus $d_2 = 252$. The number of codewords in the outer code is $256^4$, which means that this example can accommodate more than 4 billion devices. If each segment is a 2-second clip, the extra video needed in this example is 450 seconds, within the 10% constraint being placed on us by the studios. So, both q, the extra bandwidth needed, and $q^k$, the number of devices our scheme can accommodate, fit in a practical setting. The actual choices of these parameters used in the scheme depend on the requirements and are also constrained by the inherent mathematical relationship between the parameters q, n, k, d. In fact, there does not exist a single MDS code that can satisfy all the practical requirements. For an MDS code, n
B8A53623DB5BF86D87EC66A13D460670801850BD <ParentTaskHash> jiang|192.168.1.189|1 XXX ...
Fig. 2. Task Schema Instance
Task-subdividing and result-collecting are two key concerns in performing distributed, and especially general-purpose, cryptographic computing. It is impossible to design an all-purpose dividing algorithm and an all-purpose collecting algorithm because of the diversity of cryptographic computing problems. We introduce a Plug-in mechanism to overcome these difficulties. Every Plug-in is a piece of dividing or collecting code for one special type of task. For the details of Plug-ins, see Section 4.4. Likewise, there is no general algorithm that can perform subtask calculation for all kinds of atomic subtasks, so we adopt the Plug-in mechanism there as well. The Dividercollector service is used to automatically load and use these Plug-ins for task-subdividing and result-collecting. It can serve multiple dividable subtasks at one time. A cryptographic job can be divided into more subtasks using task-subdividing Plug-ins. A dividable subtask will be recursively and asynchronously divided until the job is finished. Note that in the original architecture design we separated the Dividercollector into two services (Divider and Collector), which resulted in complex interaction between the two services for a sequential subtask. The Manager service manages (e.g., creates, invokes and destroys) all the Dividercollector instances and maintains the Dividercollector instance queues. If multiple
Z. Jiang et al.
jobs are being computed concurrently, the Manager decides which divider instance to invoke, according to the dynamic priority of the currently working jobs, when a division request comes. The Calculator service is responsible for calculating atomic subtasks. After accepting a calculation request, the service parses the atomic subtask data and then selects the proper tools or code. If a site lacks the corresponding Plug-in, the Calculator service invokes the Replica service to replicate the corresponding tools or code from some remote site. The Calculator service then starts to calculate the atomic subtask, either invoking dynamic libraries through Java JNI or starting a computing engine through the GT4 WSGRAM service. Lastly, it returns the calculation result to the source Dispatcher. 4.4
Plug-Ins Mechanism
As far as computing is concerned, an atomic subtask requires a task-calculation algorithm, and a dividable subtask requires a task-division and a task-collection algorithm. For many types of tasks, code for their task-calculation, task-division or task-collection algorithms is already available. To utilize existing algorithms, computing engines and new algorithms, we wrap these implemented algorithms into Plug-ins. In Crypto-Grid, Plug-ins are classified into three types, called Meta-divider, Meta-collector and Meta-calculator, so three different interfaces need to be implemented. Fig. 3 shows the methods in these interfaces. To divide a subtask, the Manager service must find a Meta-divider that matches the subtask’s problemName attribute, by which the Manager service matches the task type of the subtask with a Plug-in. When a site lacks the corresponding Plug-in, the Manager service invokes the Replica service to search for one and duplicate it from some remote node. For a new computing problem, the metadata of its three new Plug-ins, which wrap the corresponding algorithms, should be registered to CGMR through the CDEDS service, which makes Crypto-Grid open and flexible. Reusability of computing algorithms benefits from the Plug-in mechanism.
Divider Interface
div_SumbitTaskData(taskName, Taskdata) --- submit task data
div_GetNewSubTask(subtaskNumber) --- fetch a new subtask
div_IsFinished() --- check if division is finished
div_CancelTask() --- cancel the division of the task
div_IsItomic() --- decide if the subtasks of the task are atomic
Collector Interface
col_PutSubTaskResult(problemName, taskName, encoding, compress, taskResultData) --- submit the result of its subtask
col_IsFinished() --- judge if the result of the task is gained
col_DivideFinished() --- tell the collector that division is over
col_GetTaskResult() --- fetch the result of the dividable task
Calculator Interface
cal_SubAtomicTaskData(problemName, taskName, encoding, compress, taskResultData) --- submit the data of an atomic task
cal_StartComputing() --- start to calculate the atomic task
cal_GetTaskResult() --- fetch the result of the atomic task
Fig. 3. Interfaces of Three Plug-ins
A dividable task will be divided into either atomic or dividable subtasks. These
Integrating Grid with Cryptographic Computing
subtasks can belong to different types. Each Plug-in can be easily upgraded for higher efficiency. An atomic subtask will become dividable if the division algorithm of its computing problem is improved. We can profit from the improved algorithm almost immediately by simply modifying its parent’s Divider Plug-in, deleting the Calculator Plug-in, and adding new Divider and Collector Plug-ins. Subtasks of a parent task may be parallel or sequential. For sequential subtasks, interactions between the Divider and Collector Plug-ins are complex. Algorithm providers can develop one Plug-in that implements both the Divider and Collector interfaces to reduce interaction complexity. 4.5
Task-Distribution and Data-Transformation
The Replica service is used to duplicate Plug-ins, engines or large datasets between a remote site and the local system. This service is based on the core CDEDS service. On the basis of the requirements and constraints of the Dispatcher service, Replica automates the searching and finding of Plug-ins, engines and large datasets, and securely transfers them using the RFT service, which is included in GT4. The Dispatcher service functions as a subtask dispatcher and controls interactions among Crypto-Grid nodes. If one node is idle, the Dispatcher can request subtasks from other nodes. If the work load of one node is too heavy, the Dispatcher can submit subtasks to other nodes. The core service RAMS selects proper nodes according to their available resources and the resource demands of the subtasks. A subtask can be submitted to either the node itself or other nodes. Note that the submission of a root task (i.e. a job) also goes through the Dispatcher. In addition, the Dispatcher takes charge of fault tolerance. The Dispatcher does not delete the subtask requests it has sent but caches them until the results of these requests are returned, and these subtask results are not deleted until the Dispatcher successfully transfers them to their target nodes (the Manager of the same node or the Dispatcher of a remote node). For example, when an exception occurs (e.g., a Calculator instance ceases unexpectedly or times out during an atomic subtask’s calculation), the Dispatcher resubmits the subtask to some other site using the cached subtask request.
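The caching rule described above can be sketched as follows. This is only an illustrative model, not the Crypto-Grid implementation: the class name `CachingDispatcher`, the `send` callback and the node names are hypothetical, and a real Dispatcher would work asynchronously through Grid services rather than through a local function call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CachingDispatcher:
    """Toy model of the Dispatcher's fault-tolerance rule (all names hypothetical)."""

    # send(node, request) returns a result, or raises on a crash or timeout.
    send: Callable[[str, str], str]
    nodes: List[str]
    # Cached subtask requests, keyed by subtask id, kept until a result arrives.
    pending: Dict[str, str] = field(default_factory=dict)

    def run(self, subtask_id: str, request: str) -> str:
        # Cache the request before sending it anywhere.
        self.pending[subtask_id] = request
        last_error = None
        for node in self.nodes:
            try:
                result = self.send(node, self.pending[subtask_id])
            except (RuntimeError, TimeoutError) as exc:
                # Keep the cached request and resubmit it to another node.
                last_error = exc
                continue
            # A result came back, so the cached request can be dropped.
            del self.pending[subtask_id]
            return result
        raise RuntimeError(f"no node could compute {subtask_id}") from last_error
```

If every node fails, the request stays in `pending`, so a later retry can still reuse it.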
5
Implementation and Experiment Evaluation
5.1
Distributed Computing Process
After the general description of the Crypto-Grid architecture, in this section we describe how it can be exploited through a practical example of a distributed computing process. Cryptographic computing applications over the grid can make use of distributed computational power, datasets, algorithms and computing engines to speed up computing. The example is not representative of every possible scenario; however, it shows how the execution of a significant cryptographic computing application can benefit from the Crypto-Grid services. These Plug-ins aim to perform a number of independent divisions, collections and calculations
Fig. 4. A distributed computing process (diagram labels: NodeGUI Client, Job, Manager Service, Dividercollector Service, Divider Plug-in, Collector Plug-in, Replica Service, Dispatcher Service, Calculator Service, Calculator Plug-in, NodeP1 .. NodePm, NodeC1 .. NodeCn, CE1 .. CEn; CE = Computing Engine)
by applying computing tools to distributed subtask sets in parallel. Dividable subtasks are divided by Meta-dividers, atomic subtasks are calculated by Meta-calculators, and the results are collected and assembled by Meta-collectors to obtain the global result. Fig. 4 shows a distributed cryptographic computing process, in which a global result is finally obtained on the GUI client NodeGUI after its root task RT is launched on NodeGUI. This process is described as follows:
Step 1: on NodeGUI, RT is submitted to one of the NodePi, say NodeP1. Then RT is divided into subtasks. Dividable subtasks are recursively submitted to NodeP1 itself or, if the work load of NodeP1 is heavy enough, to NodeP2, NodeP3, . . . , NodePm; atomic subtasks are submitted to NodeC1, NodeC2, . . . , NodeCn (m n).
Step 2: on each NodeCi (i = 1, . . . , n) the task result Ri is calculated from atomic subtask ASi by the Calculator Ci. Then each Ri is moved from NodeCi to the originator of its subtask, NodePj (j = 1, 2, . . . , m).
Step 3: on each NodePj the results of atomic tasks are collected and assembled by the corresponding Collector Plug-ins. If the result of some branch subtask is obtained on NodePj, it is transferred to the NodePk on which its parent task resides, and results may be recursively collected and assembled among NodeP1, NodeP2, . . . , NodePm.
Step 4: if the final result of RT is not obtained, repeat Step 1, Step 2 and Step 3. Otherwise, the global result is submitted to NodeGUI.
5.2
Crypto-Grid Implementation
As discussed in Section 4, the Crypto-Grid architecture is composed of two hierarchic layers: the Core Crypto-Grid layer and the High level Crypto-Grid layer. To deploy cryptographic computing applications we have implemented the High level Crypto-Grid services (Dividercollector, Manager, Replica and Dispatcher Grid services), which are needed to start a cryptographic computing Grid application. Moreover, we have also implemented the Core layer on top of the GT4
services, which include CDEDS, RAMS and some basic tools allowing a user to publish metadata of cryptographic computing Grid objects (Meta plug-ins, code of cryptographic computing algorithms and computing engines). Maple and Crypto++ have been integrated into Crypto-Grid, and can be accessed by CDEDS and RAMS. For test purposes, a distributed DES Exhaustive Key Search algorithm [13] (Ex-DES), a Parallel Collision Search algorithm [14] (PCS) for elliptic curves and a DES Differential Cryptanalysis algorithm [13][15] (Dif-DES) have been implemented and integrated into Crypto-Grid through the Plug-in mechanism. For Ex-DES we implemented three separate Plug-ins, with sizes of 56KB, 51KB and 53KB, for dividing, collecting and calculating respectively. Dif-DES includes two stages, differential analysis and exhaustive search, so we implemented three Plug-ins for dividing and collecting and two Plug-ins for calculating; the largest of them is 4KB with a 56KB shared library, and the size of the task data is about 2.3MB. PCS also has three Plug-ins for dividing, collecting and calculating, of which the largest is 7KB with a 60KB shared library that in turn dynamically invokes Crypto++. 5.3
Experiment Evaluation
We carried out our experiments on a LAN with a group of computers with P4 2.2 GHz CPUs, using the implemented cryptographic algorithms. Table 1 shows the data (hexadecimal representation) used for these experiments. Table 2 shows their serial computing time in seconds. Ex-DES searches $2^{32}$ plaintext/ciphertext pairs. ECCp-48 and ECCp-49 attack two elliptic curves with 48-bit and 49-bit prime moduli, respectively. Dif-DES computes the 56-bit cipher key of 8-round DES. Fig. 5 shows the corresponding speedups over Crypto-Grid with one NodeP and with different numbers of NodeCs. Fig. 6 shows the experiment results under the same conditions except with two NodePs. The experiment results show that Crypto-Grid is a promising architecture. The depth of a task tree for cryptographic computation is usually not more than four, so inter-layer dependence of subtasks has little effect on speedup; speedup increases approximately linearly with the number of nodes. However, if a computational task includes sequential subtasks or uses a probabilistic algorithm, how the tree structure is arranged in our architecture will influence speedup as the number of nodes increases.

Table 1. Computing data used for our experiments
Test name    Test data
Ex-DES       (p, c, k) = (c1274452c67660cd, 48656c6c6f20776f, 000000ffffffff)
ECCp-48      (p, a, b) = (80000000110d, 34578, 7863d); (P, Q, d) = ((5e2d1, 6ddd21995695), (15804bc9c9d1, 6d8ea7053f42), 99e)
ECCp-49      (p, a, b) = (1000000012b11, 9fa20, 60f6c4); (P, Q, d) = ((62fb91, 3d74a1a1c34e), (76b08fc96694, 64a2571973a0), 1232e)
Dif-DES      (p, c, k) = (9a8102b13e57c2d4, f8616c03876258b4, 9a295bc9b7b7f8)

Table 2. Serial computing time of our experiments (seconds)
Test name      Ex-DES    ECCp-48    ECCp-49    Dif-DES
Serial time    159232    2783.14    3997.29    1352

Fig. 5. Performance with one NodeP (speedup versus number of NodeCs; curves: linear, Ex-DES, ECCp-48, ECCp-49, Dif-DES)
Fig. 6. Performance with two NodePs (speedup versus number of NodeCs; curves: linear, Ex-DES, ECCp-48, ECCp-49, Dif-DES)
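As a small worked illustration of how the speedup figures relate to the serial times, the sketch below assumes the usual definition of speedup (serial time divided by parallel time); the text does not spell out its definition, and the parallel time used here is a made-up value, not a measurement from the experiments.

```python
# Serial times in seconds, copied from Table 2.
SERIAL_TIME = {
    "Ex-DES": 159232.0,
    "ECCp-48": 2783.14,
    "ECCp-49": 3997.29,
    "Dif-DES": 1352.0,
}


def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Speedup = serial running time / parallel running time (assumed definition)."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds: float, parallel_seconds: float, nodes: int) -> float:
    """Fraction of the ideal linear speedup achieved with `nodes` NodeCs."""
    return speedup(serial_seconds, parallel_seconds) / nodes


# Hypothetical measurement (NOT from the paper): Ex-DES on 20 NodeCs in 8400 s.
hypothetical_parallel = 8400.0
s = speedup(SERIAL_TIME["Ex-DES"], hypothetical_parallel)
e = efficiency(SERIAL_TIME["Ex-DES"], hypothetical_parallel, 20)
print(f"speedup {s:.2f}, efficiency {e:.2%}")
```

An efficiency close to 1 corresponds to a curve that stays near the "linear" reference line in the speedup plots.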
6
Conclusion
The Grid infrastructure is growing very quickly and is becoming more and more complete and complex, both in the number of tools and in the variety of supported applications. Crypto-Grid is a grid application dedicated to cryptographic computing. Thanks to fundamental grid services and our special grid services, Crypto-Grid is scalable; it integrates and utilizes distributed and heterogeneous computational power and other grid resources more easily than traditional cryptographic computing systems. The Plug-in mechanism makes computational code reusable. For a subtask, Crypto-Grid can dynamically and automatically locate the corresponding Plug-ins on the grid through the CDEDS service, replicate them from a remote grid node through the Replica service, mount them through the Dividercollector or Calculator service, and manage them through the Manager service, which makes Crypto-Grid suitable for all kinds of cryptographic computing problems. Every subtask in a dynamic execution plan closely related to TDT is mapped to the most suitable grid node by the RAMS service, which makes Crypto-Grid high-performance. In short, Crypto-Grid is a grid-based cryptographic computing environment that is general-purpose, open, scalable and high-performance.
Acknowledgments
We gratefully acknowledge the support of the National Grand Fundamental Research 973 Program of China (No. 2004CB318004), the National Natural Science Foundation of China (NSFC90204016) and the National High Technology Development Program of China under Grant (863, No. 2003AA144030).
References
1. Lenstra, A., Manasse, M.: Factoring by Electronic Mail. Advances in Cryptology – EUROCRYPT 89, LNCS 434. Springer-Verlag, (1990), 355-371.
2. Selkirk, A.P.L., Escott, A. E.: Distributed Computing Attacks on Cryptographic Systems. BT Technology Journal, (1999), 17(2): 69-73.
3. Asbrink, O., Brynielsson, J.: Factoring Large Integers Using Parallel Quadratic Sieve. Technical Report, Royal Institute of Technology, Sweden, (2000).
4. Atkins, D., Graff, M., Lenstra, A. K., Leyland, P. C.: The Magic Words are Squeamish Ossifrage. Advances in Cryptology – ASIACRYPT 94. LNCS 917, Springer-Verlag, (1995), 263-277.
5. Gropp, W., Lusk, E., Skjellum, A.: Using MPI: Portable Parallel Programming with the Message-Passing Interface. Scientific and engineering computation. MIT Press, Cambridge, MA, USA, (1994).
6. Foster, I., Kesselman, C.: The Grid 2: Blueprint for a New Computing Infrastructure, Elsevier Inc, USA, (2004).
7. Foster, I., Kesselman, C., Tuecke, S.: The Anatomy of the Grid: Enabling Scalable Virtual Organization. International J. Supercomputer Applications, (2001), 15(3).
8. Foster, I., Kesselman, C.: Globus: A Metacomputing Infrastructure Toolkit, International J. Supercomputer Application, (1997), 11(2), 115-128.
9. Tuecke, S., Czajkowski, K., Foster, I., et al.: Open Grid Services Infrastructure (OGSI) Version 1.0, Global Grid Forum Draft Recommendation. (2003).
10. Foster, I., Kesselman, C., Nick, J., Tuecke, S.: Grid Services for Distributed System Integration. IEEE Computer Society, (2002), 35(6):37-46.
11. Foster, I., Frey, J., Graham, S., et al.: Modeling Stateful Resources with Web Services Version 1.1. Global Grid Forum Draft Recommendation. (2004).
12. Dai, W.: Crypto++ – A Free C++ Library for Cryptography, Version 5.2.1. Available at http://www.cryptopp.com, (2005).
13. Schneier, B.: Applied Cryptography. John Wiley & Sons, Inc., 2nd edition, (1996).
14. van Oorschot, P. C., Wiener, M. J.: Parallel Collision Search with Cryptanalytic Applications. J. Cryptology, (1999), 12:1-28.
15. Biham, E., Shamir, A.: Differential Cryptanalysis of the Full 16-round DES. Advances in Cryptology – CRYPTO 92, LNCS 740, Springer-Verlag, (1993).
Three-Round Secret Handshakes Based on ElGamal and DSA
Lan Zhou, Willy Susilo and Yi Mu
Center for Information Security Research, School of IT and Computer Science, University of Wollongong, Wollongong, NSW 2522, Australia
{lz815, wsusilo, ymu}@uow.edu.au
Abstract. Secret handshake, introduced recently by Balfanz et al., is a very useful cryptographic mechanism which allows two members of the same group to authenticate each other secretly. In a secret handshake protocol, an honest member in the group will never reveal his group affiliation unless the other party is a valid member of the same group. In other words, only the members who have certificates from the Group Administrator can be successful in handshaking. If a handshake between two parties fails, the identity of either party will not be disclosed. Several secret handshake schemes have been found in the literature, which are based on pairing, CA-Oblivious Encryption and RSA. Furthermore, several Oblivious Signature-Based Envelopes (OSBE) schemes based on the ElGamal signature family were introduced recently by Nasserian and Tsudik, and they proposed a generic construction of secret handshake from OSBE based on the ElGamal signature family as well. It is shown in the generic construction that any ElGamal signature family based OSBE scheme can be converted to a secret handshake within three communication rounds, except the ElGamal and DSA signatures. In this paper, to complement the previous result, we show a three-round secret handshake scheme based on the ElGamal signature. We prove that the scheme is existentially unforgeable in the Random Oracle Model (ROM). Finally we extend our scheme to a DSA-based secret handshake which also requires only three rounds.
Keywords: Secret Handshake, Oblivious Signature Based Envelope, Hidden Credential, Privacy, Key Exchange, ElGamal, DSA.
1
Introduction
The Secret Handshake (SH) scheme introduced by Balfanz et al. [2] allows two members of the same group to identify each other secretly, but if one party does not belong to the group, he will learn nothing about group affiliation of the other party. One scenario for SH is as follows: a CIA agent Alice wants to
This work is partially supported by ARC Discovery Grant DP0663306.
K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 332–342, 2006. c Springer-Verlag Berlin Heidelberg 2006
authenticate herself to Bob, but Alice does not know whether Bob is a CIA agent or not. If Alice shows Bob her credential directly, her CIA identity will be revealed to Bob, who could be an adversary. The situation will be different if Alice authenticates with Bob via an SH protocol; namely, Bob will learn nothing about Alice’s identity if he is not a CIA agent. Alice need never worry about the leakage of her CIA affiliation no matter whom she authenticates to, even with other CIA servers. This secrecy property can be extended to ensure that group members’ affiliations are revealed only to members who hold specific roles in the group. For example, Alice wants to authenticate to Bob if and only if Bob is a CIA agent with security level one, while she herself is a CIA agent with security level two. Another important property of the handshake is that even if a third party Eve observes the exchange between Alice and Bob, she can learn nothing about the process, including whether Alice and Bob belong to the same group, the specific identities of the group, and the roles of either Alice or Bob. Besides introducing the concept of the SH, Balfanz et al. [2] also showed an SH scheme based on pairing, but the computation is not as efficient as the one based on the discrete log problem. Recently, a series of OSBE schemes for the ElGamal family of signature schemes were proposed in [6], where the authors also discussed the generic conversion from OSBE to SH schemes. However, the ElGamal-OSBE and DSA-OSBE schemes cannot be used to achieve a three-move SH scheme. We will discuss this in the next section. In this paper, we propose two novel SH schemes based on the ElGamal signature scheme and the DSA scheme, respectively. For the first time, we achieve three-move SH, which was believed infeasible in [6] (Appendix A). We also prove that our proposed scheme is secure. Finally, we propose a DSA-based SH scheme.
Organization of the Paper
The paper is organized as follows. In Section 2, we discuss some related work in the area of SH, including the first identity-based encryption and several credential systems proposed in recent years, such as OSBE and hidden credentials. We will also discuss the relationship between these protocols. We then discuss the security properties of the SH and provide a security model in Section 3. In Section 4, we introduce an ElGamal-based key agreement protocol, on which our handshake scheme is based, and we proceed with presenting our three-move ElGamal-based SH scheme. In Section 5, we discuss the security proof of our scheme. In Section 6, we propose a DSA-based handshake scheme, which uses the same technique as our ElGamal scheme. Finally, Section 7 concludes the paper.
2
Related Works
The notion of the SH was introduced in [2], where an SH scheme based on pairings was proposed. The security of the scheme is based on the Bilinear Diffie-Hellman (BDH) problem. This scheme was constructed from a pairing-based key agreement scheme. According to [2], existing anonymity tools such as anonymous
L. Zhou, W. Susilo, and Y. Mu
credentials, group signatures, matchmaking protocols, and accumulators have different goals from SH, so it is unclear how to achieve an SH scheme from any of them. In 2003, another credential system called Oblivious Signature Based Envelope (OSBE) was proposed [4]. It has very similar properties to the SH. The most typical property is that both allow credential contents to be used directly in access control processes, so that credentials can be used without ever being disclosed. In an OSBE system, the encrypted message can only be opened with a third party’s signature on an agreed-upon message. The signature itself is used as the credential, and never needs to be disclosed to the message sender. OSBE is defined as an interactive protocol. The original paper [4] defines four parties: a Certificate Authority CA, a message sender S, a qualified recipient R1 and an unqualified recipient R2. In [8], the relationship between OSBE and SH is explained, and it was concluded that an OSBE scheme can easily be used to construct an SH scheme. A new variant of credential system called designated group credential was recently proposed in [7]. An SH scheme based on CA-Oblivious Encryption was introduced in [3]. That work combines ElGamal encryption and the Schnorr signature to construct a CA-oblivious PKI-enabled encryption scheme that is secure under the CDH assumption, and builds a new SH scheme on this primitive. Subsequently, an SH scheme based on RSA was proposed [5]. This scheme is proven secure against active impersonator and detector adversaries, relying on the difficulty of solving the RSA problem. We note that to date, no SH scheme based on the ElGamal signature exists in the literature. In 2005, several OSBE schemes for the ElGamal family of signature schemes were proposed in [6], including Schnorr, Nyberg-Rueppel and DSA. An OSBE scheme can be easily used to construct an SH scheme. As described in [6], OSBEs can be viewed as a sort of one-sided or asymmetric SH. The most naive approach is to simply combine two OSBE interactions (S, R1) and (R1, S) to obtain an SH. In this way, most of the OSBE schemes in [6] can be easily used to construct SH schemes. However, the three-round protocol does not work for the ElGamal- and DSA-OSBE schemes [6]: an extra initial round would be necessary to exchange the first message. This means that these two schemes cannot be used to construct an SH scheme that requires only three rounds. We also point out that the DSA-OSBE proposed in [6] is flawed. We will show this flaw in Appendix B.
3
Definition and Security Properties of Secret Handshakes
We adapt the definition of an SH scheme from [2] to our SH scheme, which might potentially restrict the notion of a secret handshake scheme, but both the SH scheme of [2] and our SH scheme fall into this category. We define an SH scheme as a triple of probabilistic algorithms CreateGroup, AddUser, Handshake.
– CreateGroup, a key generation algorithm executed by the group administrator GA, on input of params, outputs the group public key G and the GA’s private key $t_G$.
– AddUser is an algorithm executed between a group member and GA on GA’s private key $t_G$ and shared inputs: params, G, and the identity of the group member, which is a bit string ID of size regulated by params. After performing the algorithm, the group member will be issued a secret credential produced by GA for the member’s identity ID.
– Handshake is the authentication protocol, executed between two parties A and B who want to authenticate each other on the public inputs $ID_A$, $ID_B$, and params. The private input of each party is their secret credential, and the output of the protocol for either party is either reject or accept.
3.1
Basic Security Properties
An SH scheme must satisfy the properties of completeness, impersonator resistance, and detector resistance. The adversary is allowed to run the protocols several times and to make additional queries after each attempt, before he announces that he is ready for the true challenges. He is allowed to ask for signatures on additional strings $ID_i \neq ID_A$ during the handshake protocol with an honest member V. He can also see all exchanged messages, can delete, modify, inject and redirect messages, can communicate with other parties, and can reuse messages from past communications.
Completeness. If honest members A, B of the same group run Handshake with valid certificates from the group administrator, which are the signatures generated for their ID strings $ID_A$, $ID_B$ and for the same group $G_A = G_B$, then both parties output “accept”.
Impersonator Resistance. The impersonator resistance property is violated if an honest party V who is a member of group G authenticates an adversary A as a group member, even though A is not a member of G. We bound the probability that the property is violated as follows: $\Pr[A \text{ succeeds in authenticating with } V \mid V \in G \wedge A \notin G] \le \varepsilon$, where $\varepsilon$ is negligible.
Detector Resistance. An adversary A violates the detector resistance property if it can decide whether some honest party V is a member of some group G by determining the relationship between the public message of the member and the public key of the group, even though A is not a member of G. The probability that the property is violated is bounded as follows: $\Pr[A \text{ knows whether } V \text{ is a valid member} \mid \text{public messages of } V \wedge A \notin G] \le \varepsilon$.
4
Secret-Handshake Scheme Based on ElGamal Signature
In this section, we present our three-move SH scheme based on the ElGamal signature in two stages. First, we present an ElGamal-based key agreement scheme, and then we construct a three-move SH scheme based on it.
4.1
ElGamal-Based Key Agreement Scheme
Let us review the basic ElGamal signature for completeness [9]. The first step is key generation. Pick a large prime p such that p − 1 has a large prime divisor q, and also pick g, an element of $\mathbb{Z}_p^*$ of order q. Then choose a random number $S \in \mathbb{Z}_q^*$ and compute $y \equiv g^S \pmod p$. The public key is $K = (p, g, y)$ and the private key is S. To sign a message $M \in \{0,1\}^*$, we need to select a hash function $H : \{0,1\}^* \to \mathbb{Z}_q^*$ and a secret random number $r \in \mathbb{Z}_q^*$. The signature is $Sig_K(M, r) = (\alpha, \beta)$, where
$$\alpha = g^{r} \pmod p \quad \text{and} \quad \beta = (H(M) - \alpha \cdot S) \cdot r^{-1} \pmod q.$$
The signature can be verified:
$$Verify_K(M, \alpha, \beta) = \text{true} \iff g^{H(M)} \equiv y^{\alpha} \cdot \alpha^{\beta} \pmod p.$$
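We now turn to the key agreement protocol. Assume that there are two parties A and B whose unique identifiers are $ID_A$ and $ID_B$. Their identifiers are signed with a third party's ElGamal signature. Consequently, A obtains the signature $\langle \alpha_A, \beta_A \rangle$, where $\alpha_A = g^{r_A} \pmod p$, $\beta_A = (h_A - \alpha_A \cdot S) \cdot r_A^{-1} \pmod q$, and $h_A = H(ID_A)$, and B obtains the signature $\langle \alpha_B, \beta_B \rangle$. The key agreement is carried out as follows:
– B chooses $k_B \in \mathbb{Z}_q$ at random and computes $\zeta_B = \alpha_B^{(k_B+1)} \pmod{p \cdot q}$ and $\eta_B = \beta_B \cdot (k_B+1)^{-1} \cdot \alpha_B^{k_B} \pmod q$. Then B sends $\langle \zeta_B, \eta_B \rangle$ to A.
– Upon receiving $\langle \zeta_B, \eta_B \rangle$ from B, A chooses $k_A \in \mathbb{Z}_q$ at random and computes the shared key $K = ((y^{(\zeta_B \bmod q)} \cdot (\zeta_B \bmod p)^{\eta_B})^{h_B^{-1}})^{\alpha_A^{k_A}} \pmod p$, where $h_B = H(ID_B)$. A computes $\zeta_A = \alpha_A^{(k_A+1)} \pmod{p \cdot q}$ and $\eta_A = \beta_A \cdot (k_A+1)^{-1} \cdot \alpha_A^{k_A} \pmod q$. Then A sends the pair $\langle \zeta_A, \eta_A \rangle$ to B.
– Upon receiving the pair $\langle \zeta_A, \eta_A \rangle$ from A, B computes the shared key $K = ((y^{(\zeta_A \bmod q)} \cdot (\zeta_A \bmod p)^{\eta_A})^{h_A^{-1}})^{\alpha_B^{k_B}} \pmod p$, where $h_A = H(ID_A)$.
Note that the value $(\alpha_A, \beta_A, k_A)$ is unknown to B and $(\alpha_B, \beta_B, k_B)$ is unknown to A. To check the correctness of the scheme, we show that the shared keys computed by A and B are equal.
A:
$$\begin{aligned}
K &= \left(\left(y^{(\zeta_B \bmod q)} \cdot (\zeta_B \bmod p)^{\eta_B}\right)^{h_B^{-1}}\right)^{\alpha_A^{k_A}} \\
  &= \left(\left(y^{(\alpha_B^{(k_B+1)} \bmod q)} \cdot \left(\alpha_B^{(k_B+1)} \bmod p\right)^{\beta_B \cdot (k_B+1)^{-1} \cdot \alpha_B^{k_B}}\right)^{h_B^{-1}}\right)^{\alpha_A^{k_A}} \\
  &= \left(\left(g^{(\alpha_B^{(k_B+1)} \cdot S)} \cdot g^{(h_B - \alpha_B \cdot S) \cdot \alpha_B^{k_B}}\right)^{h_B^{-1}}\right)^{\alpha_A^{k_A}} \\
  &= g^{\alpha_A^{k_A} \cdot \alpha_B^{k_B}} \pmod p
\end{aligned}$$
B:
$$\begin{aligned}
K &= \left(\left(y^{(\zeta_A \bmod q)} \cdot (\zeta_A \bmod p)^{\eta_A}\right)^{h_A^{-1}}\right)^{\alpha_B^{k_B}} \\
  &= \left(\left(y^{(\alpha_A^{(k_A+1)} \bmod q)} \cdot \left(\alpha_A^{(k_A+1)} \bmod p\right)^{\beta_A \cdot (k_A+1)^{-1} \cdot \alpha_A^{k_A}}\right)^{h_A^{-1}}\right)^{\alpha_B^{k_B}} \\
  &= \left(\left(g^{(\alpha_A^{(k_A+1)} \cdot S)} \cdot g^{(h_A - \alpha_A \cdot S) \cdot \alpha_A^{k_A}}\right)^{h_A^{-1}}\right)^{\alpha_B^{k_B}} \\
  &= g^{\alpha_A^{k_A} \cdot \alpha_B^{k_B}} \pmod p
\end{aligned}$$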
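The key agreement of this section can also be checked numerically. The following Python sketch is only an illustration under stated assumptions: the toy parameters (p = 2039, q = 1019, g = 4), the SHA-256-based stand-in for H, and all function names are hypothetical choices made here, and the parameter sizes are far too small to be secure.

```python
import hashlib
import secrets

# Toy parameters (assumption): p = 2q + 1 and g has order q in Z_p^*. Insecure sizes.
P, Q, G = 2039, 1019, 4

def H(data: bytes) -> int:
    """Stand-in hash into Z_q^* (assumption: SHA-256 reduced into [1, q-1])."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (Q - 1) + 1

def sign(s: int, identity: bytes):
    """ElGamal signature (alpha, beta) on H(identity) under the secret key s."""
    r = secrets.randbelow(Q - 1) + 1                      # r in Z_q^*
    alpha = pow(G, r, P)
    beta = (H(identity) - alpha * s) * pow(r, -1, Q) % Q
    return alpha, beta

def verify(y: int, identity: bytes, alpha: int, beta: int) -> bool:
    """Check g^H(M) == y^alpha * alpha^beta (mod p)."""
    return pow(G, H(identity), P) == pow(y, alpha, P) * pow(alpha, beta, P) % P

def blind(alpha: int, beta: int):
    """Compute (k, zeta, eta) from a party's own signature."""
    k = secrets.randbelow(Q - 1)                          # k + 1 stays invertible mod q
    zeta = pow(alpha, k + 1, P * Q)
    eta = beta * pow(k + 1, -1, Q) * pow(alpha, k, Q) % Q
    return k, zeta, eta

def shared_key(y, peer_id, zeta, eta, alpha, k):
    """K = ((y^(zeta mod q) * (zeta mod p)^eta)^(h^-1))^(alpha^k) mod p."""
    base = pow(y, zeta % Q, P) * pow(zeta % P, eta, P) % P
    base = pow(base, pow(H(peer_id), -1, Q), P)
    return pow(base, pow(alpha, k, Q), P)

s = secrets.randbelow(Q - 1) + 1                          # third party's secret key
y = pow(G, s, P)
sig_a, sig_b = sign(s, b"IDA"), sign(s, b"IDB")
k_a, zeta_a, eta_a = blind(*sig_a)
k_b, zeta_b, eta_b = blind(*sig_b)
key_a = shared_key(y, b"IDB", zeta_b, eta_b, sig_a[0], k_a)
key_b = shared_key(y, b"IDA", zeta_a, eta_a, sig_b[0], k_b)
print(key_a == key_b)
```

Both parties use only the pair received from the peer, the public key y, and their own alpha and k, which mirrors the remark that neither side learns the other's signature.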
Three-Round Secret Handshakes Based on ElGamal and DSA
Due to lack of space, we omit the security proof of this scheme, since it is not the main goal of this paper. We refer the reader to the full version of this paper [12] for a more complete account.
4.2
ElGamal Based Secret-Handshake Scheme
By modifying the key agreement protocol above, we can obtain an SH scheme. We call this scheme ElGamal-Based Handshake (EBH).
EBH.CreateGroup. The administrator CA runs the ElGamal key generation algorithm to create the set of all keys $\{(p, q, g, y, S) \mid y \equiv g^S \pmod p\}$, in which $S$ is the group secret.
EBH.AddUser. Select two collision-resistant cryptographic hash functions $H_1 : \{0,1\}^* \to \mathbb{Z}_q^*$ and $H_2 : \{0,1\}^* \to \{0,1\}^k$ for some $k$. To add a user U to the group, the administrator CA first allocates a unique identity $ID_U$ to the user and generates a random nonce $r_U \in \mathbb{Z}_q$. The CA then computes the hash value $h_U = H_1(ID_U)$ and gives the user U the corresponding signature $\langle \alpha_U, \beta_U \rangle$, where $\alpha_U = g^{r_U} \pmod p$ and $\beta_U = ((h_U - \alpha_U \cdot S) \cdot r_U^{-1}) \pmod q$.
EBH.Handshake. Let A and B be two users who would like to conduct a secret handshake. The three-move handshake protocol is given as follows.
– B → A: $ID_B, \zeta_B, \eta_B$
  $\zeta_B = \alpha_B^{(k_B+1)} \pmod{p \cdot q}$
  $\eta_B = \beta_B \cdot (k_B+1)^{-1} \cdot \alpha_B^{k_B} \pmod q$
– A → B: $ID_A, V_0, \zeta_A, \eta_A$
  $V_0 = H_2(((y^{(\zeta_B \bmod q)} \cdot (\zeta_B \bmod p)^{\eta_B})^{h_B^{-1}})^{\alpha_A^{k_A}} \bmod p \,\|\, ID_A \,\|\, ID_B \,\|\, 0)$
  $\zeta_A = \alpha_A^{(k_A+1)} \pmod{p \cdot q}$
  $\eta_A = \beta_A \cdot (k_A+1)^{-1} \cdot \alpha_A^{k_A} \pmod q$
– B → A: $V_1$
  $V_1 = H_2(((y^{(\zeta_A \bmod q)} \cdot (\zeta_A \bmod p)^{\eta_A})^{h_A^{-1}})^{\alpha_B^{k_B}} \bmod p \,\|\, ID_A \,\|\, ID_B \,\|\, 1)$
A verifies $V_1$ and accepts only if the following equation holds:
$$V_1 \stackrel{?}{=} H_2(((y^{(\zeta_B \bmod q)} \cdot (\zeta_B \bmod p)^{\eta_B})^{h_B^{-1}})^{\alpha_A^{k_A}} \bmod p \,\|\, ID_A \,\|\, ID_B \,\|\, 1)$$
B verifies $V_0$ and accepts only if the following equation holds:
$$V_0 \stackrel{?}{=} H_2(((y^{(\zeta_A \bmod q)} \cdot (\zeta_A \bmod p)^{\eta_A})^{h_A^{-1}})^{\alpha_B^{k_B}} \bmod p \,\|\, ID_A \,\|\, ID_B \,\|\, 0)$$
If both verifications succeed, then A and B have finished all the steps of the SH, and the handshake is successful.
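The three moves of EBH.Handshake can be sketched as follows. This is a minimal Python model under stated assumptions, not the authors' implementation: the toy group (p = 2039, q = 1019, g = 4), the SHA-256-based stand-ins for H1 and H2, the `|` separator used for concatenation, and the `Member` class and `handshake` function are all hypothetical names and choices introduced for illustration.

```python
import hashlib
import secrets

P, Q, G = 2039, 1019, 4      # toy group (assumption): g has order q, insecure sizes

def H1(data: bytes) -> int:
    """Stand-in for H1 into Z_q^* (assumption)."""
    return int.from_bytes(hashlib.sha256(b"H1" + data).digest(), "big") % (Q - 1) + 1

def H2(key: int, id_a: bytes, id_b: bytes, tag: int) -> bytes:
    """Stand-in for H2(K || ID_A || ID_B || tag) (assumption: '|' as separator)."""
    parts = [str(key).encode(), id_a, id_b, str(tag).encode()]
    return hashlib.sha256(b"|".join(parts)).digest()

class Member:
    def __init__(self, identity: bytes, s: int, y: int):
        # AddUser, performed by the CA, who knows the group secret s.
        self.identity, self.y = identity, y
        r = secrets.randbelow(Q - 1) + 1
        self.alpha = pow(G, r, P)
        self.beta = (H1(identity) - self.alpha * s) * pow(r, -1, Q) % Q

    def commit(self):
        """Pick a fresh k and return the pair (zeta, eta)."""
        self.k = secrets.randbelow(Q - 1)
        zeta = pow(self.alpha, self.k + 1, P * Q)
        eta = self.beta * pow(self.k + 1, -1, Q) * pow(self.alpha, self.k, Q) % Q
        return zeta, eta

    def key(self, peer_id: bytes, zeta: int, eta: int) -> int:
        """Shared value derived from the peer's (zeta, eta) and own alpha^k."""
        base = pow(self.y, zeta % Q, P) * pow(zeta % P, eta, P) % P
        base = pow(base, pow(H1(peer_id), -1, Q), P)
        return pow(base, pow(self.alpha, self.k, Q), P)

def handshake(a: Member, b: Member) -> bool:
    zeta_b, eta_b = b.commit()                        # move 1: B -> A
    zeta_a, eta_a = a.commit()
    k_a = a.key(b.identity, zeta_b, eta_b)
    v0 = H2(k_a, a.identity, b.identity, 0)           # move 2: A -> B
    k_b = b.key(a.identity, zeta_a, eta_a)
    ok_b = v0 == H2(k_b, a.identity, b.identity, 0)   # B checks V0
    v1 = H2(k_b, a.identity, b.identity, 1)           # move 3: B -> A
    ok_a = v1 == H2(k_a, a.identity, b.identity, 1)   # A checks V1
    return ok_a and ok_b

s = secrets.randbelow(Q - 1) + 1                      # group secret of the CA
y = pow(G, s, P)
alice, bob = Member(b"IDA", s, y), Member(b"IDB", s, y)
print(handshake(alice, bob))
```

Two members certified under the same group secret always accept each other in this model. With parameters this small, members of different groups may occasionally collide on the same key, which is one reason the sizes here are only suitable for illustration.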
5
Security Consideration
Theorem 1. The above ElGamal-based SH scheme is Impersonator Resistant under the assumption that ElGamal signature is existentially unforgeable in the Random Oracle Model.
Proof. ElGamal-based SH is impersonator resistant if no polynomially bounded adversary wins the following game against the Challenger with non-negligible probability: The Challenger randomly picks a public key $(g, p, q, y)$ and gives it to the adversary. The adversary responds with an $ID_A$. The Challenger then picks a random pair $\zeta_A, \eta_A$, where $\zeta_A \in \mathbb{Z}_{p \cdot q}$ and $\eta_A \in \mathbb{Z}_q$. The adversary then outputs $k_A \in \mathbb{Z}_q$, and the adversary wins the game if $(g^{h_A})^{k_A} = y^{\zeta_A} \cdot \zeta_A^{\eta_A} \pmod p$.
Given an attacker A that wins the above game with probability $\varepsilon$, we construct another attacker B that can successfully forge the ElGamal signature with probability $\varepsilon$. B does the following:
1. B, when given $(g, p, q, y)$, passes $(g, p, q, y)$ to A and gets $ID_A$ back.
2. B then computes $h_A = H(ID_A)$, picks a random pair $\zeta_A, \eta_A$, and sends it to A. Then B gets $k_A$ from A.
3. Note that $y^{\zeta_A} \cdot \zeta_A^{\eta_A} = (g^{h_A})^{k_A} \pmod p$. If B uses $g^{h_A}$ as the generator, $\langle \zeta_A, \eta_A \rangle$ can be viewed as the ElGamal signature of $k_A$ in $(g^{h_A}, p, q, y)$.
B succeeds in forging the signature if and only if A wins the above game. Hence, if the adversary A can impersonate the credential, it can be used to forge the ElGamal signature. By assumption, the ElGamal signature is existentially unforgeable, so if this assumption holds, the probability $\varepsilon$ must be negligible. Conversely, an adversary who can forge a valid signature can take part in the SH protocol just like an honest member, and thus breaks the underlying security assumption. The probability of breaking this assumption therefore cannot be smaller than the probability of forging a valid signature.
Theorem 2. The above ElGamal-based SH scheme is Detector Resistant under the Computational Diffie-Hellman (CDH) assumption in the Random Oracle Model.
Proof. First, let us review the CDH assumption: given a cyclic group G, a generator $g \in G$, and group elements $g^a$, $g^b$, the probability of computing $g^{ab}$ is negligible.
The proof then proceeds as follows. ElGamal-based SH is detector resistant if no polynomially bounded adversary wins the following game against the Challenger with non-negligible probability: The group administrator holds an ElGamal key set $(g, p, q, y, S)$; the Challenger gets $(g, p, q)$ and gives it to the adversary. The Challenger first asks the member for a triple $ID_A, \zeta_A, \eta_A$, where $\zeta_A = \alpha_A^{(k_A+1)} \pmod{p \cdot q}$ and $\eta_A = \beta_A \cdot (k_A+1)^{-1} \cdot \alpha_A^{k_A} \pmod q$, and $\langle \alpha_A, \beta_A \rangle$ is the ElGamal signature on $ID_A$. The adversary then outputs $y' \in \mathbb{Z}_p$, and the adversary wins the game if $y' = y$.
Given an attacker A that wins the above game with non-negligible probability $\varepsilon$, we construct another algorithm B that can successfully break the CDH assumption with probability $\varepsilon$. Algorithm B is as follows:
1. Given $(g, p, q)$, B passes $(g, p, q)$ to A.
2. Given $\zeta_A, \eta_A$, B can compute $g^{\alpha_A^{-(k_A+1)}} = g^{\zeta_A^{-1}}$ and $g^{\alpha_A^{k_A}} = (y^{\zeta_A} \cdot \zeta_A^{\eta_A})^{h_A^{-1}}$. Let $a$ be $\alpha_A^{-(k_A+1)} \bmod q$ and $b$ be $\alpha_A^{k_A} \bmod q$ as defined in the CDH problem.
3. B sends the pair $\zeta_A, \eta_A$ to A. Subsequently, B obtains $y'$ from A.
4. B can compute $g^{\alpha_A^{-1}} = (\zeta_A^{\eta_A} \cdot \zeta_A^{-1} \cdot y')^{h_A^{-1}}$. Hence, B has successfully broken the CDH assumption by computing $g^{\alpha_A^{-1}} = g^{ab} = g^{\alpha_A^{-(k_A+1)} \cdot \alpha_A^{k_A}} \bmod p$.
6
Secret-Handshake Scheme Based on DSA
In this section, we construct an SH scheme based on DSA signatures using a similar idea as above. The DSA-based scheme is slightly more complex, since DSA [10] uses two moduli. The Digital Signature Algorithm (DSA) was developed by NIST as a more efficient alternative to ElGamal. For completeness, we review the scheme here. Let $p$ be a large prime such that $p - 1$ has a large prime divisor $q$, and let $g$ be an element of order $q$ in $\mathbb{Z}_p^*$. Let $S \in_R \mathbb{Z}_q^*$, and compute $y \equiv g^S \pmod p$. The public key is $K = \langle p, g, y \rangle$ and $S$ is the private key. Define a hash function $H : \{0,1\}^* \to \mathbb{Z}_q^*$, and pick a secret random number $r \in \mathbb{Z}_q^*$. The signature is $\mathrm{Sig}_K(M, r) = \langle \alpha, \beta \rangle$, where
$$\alpha = (g^r \bmod p) \bmod q \quad \text{and} \quad \beta = ((H(M) + \alpha \cdot S) \cdot r^{-1}) \bmod q.$$
The verification algorithm is as follows:
$$\mathrm{Verify}_K(M, \alpha, \beta) = \text{true} \iff \alpha \equiv (y^{\alpha \beta^{-1}} \cdot g^{H(M) \beta^{-1}} \bmod p) \bmod q.$$
From the above, we notice that the DSA signature is very similar to the ElGamal signature, but the two use different moduli. Therefore, we cannot construct a DSA-based handshake scheme in a straightforward manner. The idea is to first convert the DSA signature into a form with a single modulus, and then apply the ElGamal-based SH scheme to the converted signature. First, we compute $\alpha'$ as follows: $\alpha' = (g^h \cdot y^{\alpha})^{\beta^{-1}} \bmod p$. Now we can use $\langle \alpha', \beta \rangle$ as the certificate of the member to conduct the handshake. Since there is a small difference between the DSA signature and the ElGamal signature, we cannot apply the ElGamal-based SH to DSA directly. However, by slightly modifying our scheme in Section 4.2, we can obtain a DSA-based SH. The modification is as follows: A computes $\eta_A = -\beta_A \cdot (k_A+1)^{-1} \cdot \alpha_A^{k_A} \pmod q$ and B computes $\eta_B = -\beta_B \cdot (k_B+1)^{-1} \cdot \alpha_B^{k_B} \pmod q$. The rest of the scheme remains the same as our ElGamal-based SH scheme defined in Section 4.2. We refer the reader to the full version of this paper [12] for the complete treatment of this scheme, including the security proof.
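The conversion of a DSA signature to a single modulus can be illustrated with a short Python sketch. It rests on assumptions that are not part of the paper: the toy parameters (p = 2039, q = 1019, g = 4), a SHA-256-based stand-in for H, and the hypothetical helper names `dsa_sign`, `dsa_verify`, and `full_alpha`. It covers only DSA signing, verification, and the computation of the converted value, not the modified handshake.

```python
import hashlib
import secrets

P, Q, G = 2039, 1019, 4      # toy parameters (assumption): g has order q, insecure sizes

def H(msg: bytes) -> int:
    """Stand-in hash into Z_q^* (assumption)."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % (Q - 1) + 1

def dsa_sign(s: int, msg: bytes):
    """DSA signature: alpha = (g^r mod p) mod q, beta = (H(M) + alpha*S) * r^-1 mod q."""
    while True:
        r = secrets.randbelow(Q - 1) + 1
        alpha = pow(G, r, P) % Q
        beta = (H(msg) + alpha * s) * pow(r, -1, Q) % Q
        if alpha and beta:           # retry on degenerate values so beta is invertible
            return alpha, beta

def dsa_verify(y: int, msg: bytes, alpha: int, beta: int) -> bool:
    """Check alpha == (y^(alpha/beta) * g^(H(M)/beta) mod p) mod q."""
    w = pow(beta, -1, Q)
    return alpha == pow(y, alpha * w % Q, P) * pow(G, H(msg) * w % Q, P) % P % Q

def full_alpha(y: int, msg: bytes, alpha: int, beta: int) -> int:
    """Converted value (g^h * y^alpha)^(beta^-1) mod p, which equals g^r mod p."""
    return pow(pow(G, H(msg), P) * pow(y, alpha, P) % P, pow(beta, -1, Q), P)

s = secrets.randbelow(Q - 1) + 1
y = pow(G, s, P)
alpha, beta = dsa_sign(s, b"IDA")
alpha_prime = full_alpha(y, b"IDA", alpha, beta)
print(dsa_verify(y, b"IDA", alpha, beta), alpha_prime % Q == alpha)
```

The converted value satisfies a relation modulo p alone, namely that its beta-th power equals g^h times y^alpha, which is the single-modulus form the text refers to.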
7
Conclusion
We proposed a three-move secret handshake scheme based on the ElGamal and DSA signatures. Our work answers affirmatively the open problem of constructing three-move SH schemes from the ElGamal signature. We also showed that our ElGamal (DSA) based scheme is secure against impersonator and detector attacks under the assumption that the ElGamal (DSA) signature is existentially unforgeable.
References
1. D. Boneh and M. Franklin. Identity-based encryption from the Weil pairing. Lecture Notes in Computer Science, pages 213–229, 2001.
2. D. Balfanz, G. Durfee, N. Shankar, D. Smetters, J. Staddon, and H. Wong. Secret Handshakes from Pairing-based Key Agreements. 2003 IEEE Symposium on Security and Privacy, pages 180–196, 2003.
3. C. Castelluccia, S. Jarecki, and G. Tsudik. Secret Handshakes from CA-Oblivious Encryption. Advances in Cryptology - Asiacrypt 2004, Lecture Notes in Computer Science 3329, pages 293–307, 2004.
4. N. Li, W. Du, and D. Boneh. Oblivious Signature-Based Envelopes. 22nd ACM Symposium on Principles of Distributed Computing (PODC 2003), pages 182–189, 2003.
5. D. Vergnaud. RSA-based secret handshakes. International Workshop on Coding and Cryptography, Bergen, Norway, March 2005.
6. S. Nasserian and G. Tsudik. Revisiting Oblivious Signature-Based Envelopes. Cryptology ePrint Archive, Report 2005/283, 2005.
7. C.Y. Ng, W. Susilo, and Y. Mu. Designated Group Credentials. ACM Symposium on Information, Computer and Communications Security (ASIACCS'06), 2006.
8. J. E. Holt. Reconciling CA-Oblivious Encryption, Hidden Credentials, OSBE, and Secret Handshakes. Cryptology ePrint Archive, Report 2005/215, 2005.
9. T. ElGamal. A public key cryptosystem and a signature scheme based on discrete logarithms. IEEE Trans. on Information Theory, IT-31, pages 469–472, 1985.
10. National Institute of Standards and Technology. Digital Signature Standard. NIST FIPS PUB 186, U.S. Department of Commerce, 1994.
11. D. Pointcheval and J. Stern. Security proofs for signatures. Eurocrypt '96, pages 387–398, 1996.
12. L. Zhou, W. Susilo, and Y. Mu. Three-round Secret Handshakes based on ElGamal and DSA (full version). Manuscript, 2006.
A
Secret Handshake from ElGamal-OSBE
We now review the SH scheme constructed from the ElGamal-OSBE, derived from the generic construction in [6]. First, we present the ElGamal-OSBE, which is defined as follows:
Setup: This algorithm takes as input a security parameter $t$ and creates an ElGamal key $(p, q, g, S, y)$, selects a suitable cryptographic hash function $H$, a function $H'$ for key derivation, and two security parameters $t_1$ and $t_2$, which are linear in $t$. It also chooses a semantically secure symmetric encryption scheme $\mathcal{E}$ and two messages $M$ and $P$, and computes the ElGamal signature $\delta = (\alpha, \beta)$.
Interaction:
Step 1: $R_1 \to S$: $\alpha = g^k \bmod p$.
Step 2: S receives $\alpha$, generates $z \in_R \{1 \cdots 2^{t_1} p\}$ with $\alpha \pmod{p-1} \neq 0$, computes $K_s = y^{\alpha z} \cdot g^{-hz} \bmod p$, and derives $k_s = H'(K_s)$.
Step 3: $S \to R_1$: $Z = \alpha^z \bmod p$, $C = \mathcal{E}_{k_s}[P]$.
Step 4: Upon receiving $(Z, C)$, $R_1$ computes $K_r = Z^{-\beta}$, derives $k_r = H'(K_r)$, and decrypts $C$ with $k_r$.
The correctness is easy to see:
$$K_s = y^{\alpha z} g^{-hz} = g^{(\alpha S - h) z} = g^{k (\alpha S - h) k^{-1} z} = \alpha^{-\beta z} = Z^{-\beta} = K_r$$
From the OSBE scheme above, an ElGamal-based SH can easily be constructed as follows [6]. CreateGroup and AddUser are the same as in our ElGamal-based handshake scheme, so we only present the Handshake step.
– A → B: $ID_A, \alpha_A$, where $\alpha_A = g^{k_A} \pmod p$
– B → A: $ID_B, \alpha_B, Z_B$
  $\alpha_B = g^{k_B} \pmod p$
  B generates $z_B \in_R \mathbb{Z}_p$
  B computes $Z_B = \alpha_A^{z_B}$
  B computes $K_B = y^{\alpha_A z_B} \cdot g^{-h_A z_B} \bmod p$
– A → B: $\alpha_A, Z_A, V_0$
  $\alpha_A = g^{k_A} \pmod p$
  A generates $z_A \in_R \mathbb{Z}_p$
  A computes $Z_A = \alpha_B^{z_A}$
  A computes $K_A = y^{\alpha_B z_A} \cdot g^{-h_B z_A} \bmod p$
  A computes $V_0 = H_2(K_A \,\|\, Z_B^{-\beta_A} \,\|\, 0)$
– B → A: $V_1$
  $V_1 = H_2(Z_A^{-\beta_B} \,\|\, K_B \,\|\, 1)$
As shown in the OSBE scheme, $K_A = Z_A^{-\beta_B}$ and $K_B = Z_B^{-\beta_A}$, so A verifies $V_1$ and accepts only if $V_1 = H_2(K_A \,\|\, Z_B^{-\beta_A} \,\|\, 1)$. B verifies $V_0$ and accepts only if $V_0 = H_2(Z_A^{-\beta_B} \,\|\, K_B \,\|\, 0)$. If both verifications are successful, then A and B have finished all the steps of the SH, and the handshake succeeds. We observe that this SH scheme requires four rounds.
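The key equality that the ElGamal-OSBE relies on, $K_s = K_r$, can be checked with a small Python sketch. The toy parameters, the SHA-256-based stand-in for H, the choice of z from $\mathbb{Z}_q^*$ (the scheme itself draws z from a larger range), and the function names are assumptions made here for illustration only.

```python
import hashlib
import secrets

P, Q, G = 2039, 1019, 4      # toy parameters (assumption): g has order q, insecure sizes

def H(msg: bytes) -> int:
    """Stand-in hash into Z_q^* (assumption)."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % (Q - 1) + 1

def elgamal_sign(s: int, msg: bytes):
    """ElGamal signature with alpha = g^k mod p, beta = (h - alpha*S) * k^-1 mod q."""
    k = secrets.randbelow(Q - 1) + 1
    alpha = pow(G, k, P)
    beta = (H(msg) - alpha * s) * pow(k, -1, Q) % Q
    return alpha, beta

def sender_values(y: int, alpha: int, h: int):
    """Sender side: K_s = y^(alpha*z) * g^(-h*z) mod p and Z = alpha^z mod p."""
    z = secrets.randbelow(Q - 1) + 1     # assumption: z drawn from Z_q^*
    k_s = pow(y, alpha * z % Q, P) * pow(G, -h * z % Q, P) % P
    return k_s, pow(alpha, z, P)

def receiver_key(big_z: int, beta: int) -> int:
    """Receiver side: K_r = Z^(-beta) mod p."""
    return pow(big_z, -beta % Q, P)

s = secrets.randbelow(Q - 1) + 1
y = pow(G, s, P)
alpha, beta = elgamal_sign(s, b"M")
k_s, big_z = sender_values(y, alpha, H(b"M"))
print(k_s == receiver_key(big_z, beta))
```

Only a receiver who holds beta can turn Z into the sender's key, which is what lets the sender encrypt to "whoever holds a valid signature" without seeing the signature itself.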
B
Secret Handshake from DSA-OSBE
We now construct the DSA-based SH from DSA-OSBE. Since the DSA-OSBE in [6] is very similar to the ElGamal one, the authors only gave the algorithm to compute the secret, as follows:
– $K_s = (y^{\alpha} g^{h})^{z} = g^{(\alpha S + h) z}$
– $K_r = Z^{\beta} = \alpha^{z \beta} = g^{k (\alpha S + h) k^{-1} z} = g^{(\alpha S + h) z}$
Hence, we only present the SH scheme that can be constructed from DSA-OSBE. The CreateGroup and AddUser steps are the same as in our DSA-based handshake scheme, so again we only present the Handshake step.
– A → B: $ID_A, \alpha_A$, where $\alpha_A = g^{k_A} \pmod p$
– B → A: $ID_B, \alpha_B, Z_B$
  $\alpha_B = g^{k_B} \pmod p$
  B generates $z_B \in_R \mathbb{Z}_p$
  B computes $Z_B = \alpha_A^{z_B}$
  B computes $K_B = y^{\alpha_A z_B} \cdot g^{h_A z_B} \bmod p$
– A → B: $\alpha_A, Z_A, V_0$
  $\alpha_A = g^{k_A} \pmod p$
  A generates $z_A \in_R \mathbb{Z}_p$
  A computes $Z_A = \alpha_B^{z_A}$
  A computes $K_A = y^{\alpha_B z_A} \cdot g^{h_B z_A} \bmod p$
  A computes $V_0 = H_2(K_A \,\|\, Z_B^{\beta_A} \,\|\, 0)$
– B → A: $V_1$
  $V_1 = H_2(Z_A^{\beta_B} \,\|\, K_B \,\|\, 1)$
Similar to the handshake scheme from ElGamal-OSBE, the above scheme also requires four rounds. The scheme is complete only if $K_A \stackrel{?}{=} Z_A^{\beta_B}$ and $K_B \stackrel{?}{=} Z_B^{\beta_A}$ hold. Unfortunately, this verification is incorrect. We check the first equation:
$$K_A = (y^{\alpha_B} \cdot g^{h_B})^{z_A} = g^{(\alpha_B S + h_B) z_A},$$
which is correct. The problem is due to $Z_A^{\beta_B}$:
$$K'_A \equiv Z_A^{\beta_B} = \alpha_B^{z_A \beta_B} = g^{k (\alpha_B S + h) k^{-1} z} = g^{(\alpha_B S + h_B) z_A}.$$
The authors of the original paper [6] believe that $K_A = K'_A$. However, the problem is as follows:
$$((g^k \bmod p) \bmod q)^{(\alpha_B S + h) k^{-1} z} \neq ((g^k)^{(\alpha_B S + h) k^{-1} z} \bmod p) \bmod q.$$
Obviously, the equality does not hold. Hence the DSA-OSBE scheme in [6] is flawed.
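A tiny numeric example makes the failure concrete: reducing modulo q before exponentiation is not the same as exponentiating first. The values below (p = 23, q = 11, g = 2, k = 4, and an exponent e = 2 standing in for $(\alpha_B S + h) k^{-1} z$) and the function names are toy assumptions chosen for illustration.

```python
def reduce_then_power(g: int, k: int, e: int, p: int, q: int) -> int:
    """((g^k mod p) mod q)^e mod p: what is computed from the DSA value alpha."""
    return pow(pow(g, k, p) % q, e, p)

def power_then_reduce(g: int, k: int, e: int, p: int, q: int) -> int:
    """((g^k)^e mod p) mod q: what the derivation implicitly assumes."""
    return pow(pow(g, k, p), e, p) % q

# Toy values (assumption): 2 has order 11 modulo 23.
p, q, g, k, e = 23, 11, 2, 4, 2
left = reduce_then_power(g, k, e, p, q)     # (16 mod 11)^2 mod 23 = 25 mod 23
right = power_then_reduce(g, k, e, p, q)    # (16^2 mod 23) mod 11 = 3 mod 11
print(left, right)                          # prints "2 3"
```

Because alpha in DSA is already reduced modulo q, the receiver can no longer recover the group element $g^k \bmod p$ from it, so the two sides of the OSBE key derivation generally disagree.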
Securing C Programs by Dynamic Type Checking
Haibin Shen, Jimin Wang, Lingdi Ping, and Kang Sun
College of Computer Science and Technology, Zhejiang University, China
{shb, wangjm}@vlsi.zju.edu.cn
Abstract. Flexible features of C can be misused and result in potential vulnerabilities that are hard to detect by static checking alone. Existing tools either give up run-time type checking or employ a type system whose granularity is too coarse (it does not differentiate between pointer types), so that many errors may go undetected. This paper presents a dynamic checking approach that addresses these problems. We employ a type system that is based on the physical layout of data types and has a suitable granularity. We also set up rules for propagating dynamic types and for checking the compatibility of types during execution of the target program. On this type system, we then build a model of dynamic type checking that captures run-time type errors. Experimental results show that it can catch most errors, including those that may become system vulnerabilities, and that the overhead is moderate.
1
Introduction
C is one of the most expressive and powerful programming languages, with many flexible features. However, many of these features can be misused and lead to program bugs that are hard to detect and debug. Worse, such bugs may be found and exploited by attackers: the notorious "buffer overflow" and "format string" [9] vulnerabilities are all of this kind. As system security has become an increasingly important issue, many safe languages such as Java have been created and employed in system programming. However, the C language is still the choice of many system programmers. Moreover, there are innumerable lines of C code in current operating systems and all kinds of applications, so it is not sensible to abandon the existing C programs and rewrite them from scratch in a safe language. Thus we must find an alternative solution to this problem that keeps as much of the high performance of C as possible and, at the same time, makes it safe. The rest of the paper is organized as follows: Section 2 discusses related work. Section 3 presents the dynamic type checking method, and Section 4 discusses some practical problems in the application of our method. Section 5 gives the experimental results. Section 6 summarizes this work and gives the conclusion.
Supported by the National High Technology Research and Development Program of China (863 Program) (No. 2003AA1Z1060) and Natural Science Foundation of Zhejiang Province (No. Y105355).
K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 343–354, 2006. © Springer-Verlag Berlin Heidelberg 2006
2
Related Work
The idea of physical subtypes and subtype polymorphism in C is well presented in Siff et al. [6] and Chandra et al. [8]. Siff et al. study casts in C and group them into downcasts, upcasts, and mismatches. Chandra et al. study the physical subtype theory and present a model of physical type checking for C. Our type system is similar to that of Siff et al. and Chandra et al., but we take the dynamic properties of programs into account. Two ground types are added, and the rules for subtype inference are modified to suit our type system. More importantly, we use the type system for dynamic type checking, while the chief application of their type system is static type checking. The idea of dynamic checking appears in several tools. Purify [7] instruments binaries to perform memory access checking, but it does not detect type errors. Valgrind [4], a Purify-like memory access checker for Linux that interprets programs to detect errors, does not detect type errors either. Hobbes [5] and the tool of Loginov et al. [1] perform dynamic type checking, so type errors can be detected. However, they do not differentiate between pointers, so some type errors may be overlooked.
3
Dynamic Type Checking
The dynamic type checking process is composed of two procedures: the propagation procedure and the checking procedure. The propagation procedure takes care of propagation of dynamic types, while the checking procedure detects errors in the propagation procedure and tries to recover the program from errors. 3.1
Preliminaries
Terms
1. Program element (or element): A program element (element for short) is a variable (local or global), a member of a struct variable or a union variable, or an element of an array. Program elements can be classified by their types. For example, we call the program elements of array type array elements and the program elements of struct type struct elements.
2. Active member: This term is only meaningful for union elements. Generally, a union element has one or more members. When part or all of a member is written, we call this member the active member of the union element. Each union element has at most one active member. When a new member of the element is written, it becomes the active member and the previous active member is deactivated.
3. Active part: This term is only meaningful for structure and array elements. Given a structure element or array element e and a pointer p to any position of e, the part from the position pointed to by p to the end of e is called an active part of e as viewed from p (often shortened to active part when the pointer p is clear from the context). A pointer to a structure in our type system can be regarded as a pointer that always points at the starting address of the active part of the structure. The active part shrinks when the pointer moves forward and expands when the pointer moves backward. Note that an active part of element e can be null if the pointer points at the end of e.
The Type System. We employ a modified version of the type system described in Chandra et al. [8] in our dynamic type checking method. The types of our type system are shown in Figure 1. We have added the function type to the type system so that we can catch potential errors caused by misuse of function pointers. We also add two special types to the ground types: uninitialized and error_type. An element of type uninitialized has not been initialized by the programmer and does not have a definite value. An element of type error_type indicates that an error has occurred during the propagation of dynamic types. An array in our type system is treated like a structure (each member is listed explicitly), since the dynamic types of its elements may differ from each other. An array of declared type t that has k elements can also be written as t[k] when we do not care about the dynamic types of its elements.

t ::= ground
    | a{e0, e1, e2, ..., ek}        // array of k elements
    | ptr t                         // pointer to t
    | s{m1, m2, m3, ..., mk}        // struct
    | u{|m1, m2, m3, ..., mk|}      // union
    | (t1, t2, ..., tk) -> t        // function
m ::= (t, l, i)                     // member labeled l of type t at offset i
    | (l:n, i)                      // bit field labeled l of size n at offset i
ground ::= e{id1, id2, ..., idk}
    | uninitialized, error_type, void *, char, unsigned char, short, int, long, float, double, ...
Fig. 1. Types in our type system

Assumptions. For ease of discussion, we assume that all other offsets of structures and unions comply with the ANSI C standard. Another assumption is that all structure elements are explicitly padded. When a structure object is stored in accordance with the system's alignment restrictions, one or more padding bytes may be added to it. In an "explicitly padded" version of a structure object, each padding byte is declared as an anonymous bit field.
3.2
Introduced Variables and Auxiliary Functions
Several variables are introduced into our type system to capture the dynamic properties of a running program; they are listed in Table 1. In the table we use expressions like var(e) to refer to the introduced variable var that is related to the internal variable or program element e. A number of auxiliary functions are also defined in our type system; they are listed in Table 2.

Table 1. Introduced variables
Variable    Explanation
dtype(v)    records the dynamic type of each variable v during program execution
alias(e)    a set that collects the pointers that point directly to any part of the element e
sptr(e)     remembers the starting address of each variable of structure or array type

Table 2. Introduced auxiliary functions
Function          Explanation
stype(e)          compiler-assigned type of a C expression e
sizeof(t)         size in bytes of type or program element t
deref(t)          dereferenced type of t if t is a pointer
ispointer(t)      true iff t is a pointer
isarray(t)        true iff t is an array
isstruct(t)       true iff t is a struct
obj(p)            object that pointer p points to
update_alias(e)   updates the type of each element in the set alias(e) so that it stays consistent with the type of e whenever the type of e changes

Besides the functions listed in Table 2, another function is defined to construct a type from a struct or an array. Let t be a struct or an array: $t = s\{m_1, m_2, \ldots, m_n\}$. We define
$$\mathrm{PostfixType}(t, siz) = \begin{cases} s\{m_{k+1}, m_{k+2}, \ldots, m_n\} & \text{where } \sum_{i=1}^{k} \mathrm{sizeof}(m_i) = siz \\ \mathit{error\_type} & \text{where no such } m_1 \cdots m_k \text{ exist} \end{cases}$$
3.3
Propagation of Dynamic Types
Initialization. When the program starts to run, the dtypes of all global elements get initialized. The rule for initializing dynamic types of program elements of ground types is quite straightforward: if the element e has a definite value, its dtype(e) is set to stype(e); otherwise it is set to uninitialized. On entry to a function, similar things happen to local variables, but the dynamic types of global variables and formal parameters are not affected. Propagation. The rule for propagating dynamic types among program elements of ground types is quite straightforward when no pointers are involved. Once pointers are involved, things get a little more complicated. To simplify the discussion, we assume that statements related to pointers in the input program have been normalized to consist of only a few simple forms which are listed in Table 3. By introducing temporary variables, any complex statement can be normalized. Some examples of normalization are shown in Table 4.
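The Plus and Minus normal forms rely on the PostfixType function defined in Section 3.2. A minimal Python model of that function is sketched below. It is an illustration only, not the authors' implementation: representing a struct type as a list of (label, type, size) tuples, and the names `postfix_type` and `ERROR_TYPE`, are assumptions made here.

```python
ERROR_TYPE = "error_type"    # stands for the special ground type of the paper

def postfix_type(members, siz):
    """Return the members that remain after skipping exactly `siz` bytes.

    `members` is a hypothetical struct model: a list of (label, type, size).
    If no prefix m1..mk has a total size of exactly `siz`, the result is
    ERROR_TYPE, which models a pointer landing inside a member or outside
    the object.
    """
    skipped = 0
    for index, (_label, _type, size) in enumerate(members):
        if skipped == siz:
            return members[index:]
        skipped += size
    # An empty active part is allowed when the pointer is exactly at the end.
    return [] if skipped == siz else ERROR_TYPE

# struct { int x; int y; char c; } with 4-byte ints (assumed layout, no padding)
point = [("x", "int", 4), ("y", "int", 4), ("c", "char", 1)]
print(postfix_type(point, 4))    # active part starting at member y
print(postfix_type(point, 2))    # lands in the middle of x
```

Returning an error value instead of raising mirrors the description in the paper, where the error is only reported once the resulting pointer is dereferenced.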
Table 3. Statements related to pointers in the normal form
Name of Statement             Normal Form
Address-of                    p = &x
Assignment                    p = Cast_opt q
Pointer Dereference on rhs    p = *q
Pointer Dereference on lhs    *p = q
Plus                          p = q + k (k ≥ 0)
Minus                         p = q − k (k ≥ 0)

Table 4. Examples of normalization of statements related to pointers
Statement                Normalization
p = q -> a;              tmp = (char *)q + Offset_a; p = *tmp;
p = &(q.a);              tmp1 = &q; tmp2 = (char *)tmp1 + Offset_a; p = tmp2;
(DstType *)(&p) = q;     tmp1 = &p; tmp2 = (DstType *)tmp1; *tmp2 = q;
The propagation rules for the statements listed in Table 3 are given in Figure 2. Besides the cases listed in the table, dynamic types are also propagated when a function call is made. A function declared as
void func(formal_param1, formal_param2, ..., formal_paramn);
and called in the form
func(act_param1, act_param2, ..., act_paramn);
can be viewed as a sequence of assignments:
formal_param1 = act_param1; formal_param2 = act_param2; ... formal_paramn = act_paramn; call func;
When the function call is made, the dtypes of the actual parameters are propagated to the corresponding formal parameters. The propagation process continues when the statements in the function body are executed.
3.4
Dynamic Type Checking
Dynamic type checking is performed in the checking procedure. First we introduce some auxiliary functions used in this procedure: 1. is_subtype(p, q) is true iff type p is a subtype of q. The concept of "subtype" here is identical to the "physical subtype" presented in Chandra et al. [8]. We also use the notation $t \preceq t'$ to denote that t is a subtype of t'.
Address-of:
    alias(obj(p)) = alias(obj(p)) - p;
    p = &x;
    dtype(p) = ptr dtype(x);
    alias(x) = alias(x) ∪ p;
    update_alias(p);

Assignment:
    alias(obj(p)) = alias(obj(p)) - p;
    p = q;
    dtype(p) = dtype(q);
    alias(obj(q)) = alias(obj(q)) ∪ p;
    update_alias(p);

Pointer Dereference on rhs:
    if (ispointer(p))
        alias(obj(p)) = alias(obj(p)) - p;
    p = *q;
    dtype(p) = deref(dtype(q));
    if (ispointer(p))
        alias(obj(obj(q))) = alias(obj(obj(q))) ∪ p;
    update_alias(p);

Pointer Dereference on lhs:
    if (ispointer(*p))
        alias(obj(obj(p))) = alias(obj(obj(p))) - *p;
    *p = q;
    dtype(*p) = dtype(q);
    if (ispointer(*p))
        alias(obj(q)) = alias(obj(q)) ∪ *p;
    update_alias(obj(p));

Plus:
    alias(obj(p)) = alias(obj(p)) - p;
    p = q + k;
    if (isarray(q))
        dtype(p) = PostfixType(dtype(q), k * sizeof(q[0]));
    else if (isstruct(deref(q)))
        dtype(p) = ptr PostfixType(deref(dtype(q)), k * sizeof(stype(*q)));
    else
        dtype(p) = dtype(q);
    alias(obj(q)) = alias(obj(q)) ∪ p;
    update_alias(p);

Minus:
    alias(obj(p)) = alias(obj(p)) - p;
    p = q - k;
    if (isarray(q))
        dtype(p) = PostfixType(dtype(sptr(q)),
                       sizeof(dtype(sptr(q))) - sizeof(dtype(q)) - k * sizeof(q[0]));
    else if (isstruct(deref(q)))
        dtype(p) = ptr PostfixType(deref(dtype(sptr(q))),
                       sizeof(deref(dtype(sptr(q)))) - sizeof(deref(dtype(q))) - k * sizeof(stype(*q)));
    else
        dtype(p) = dtype(q);
    alias(obj(q)) = alias(obj(q)) ∪ p;

Fig. 2. Propagation rules for statements
2. prototype(f) returns the prototype of function f. 3. compatible(t1, t2) checks whether type t1 is compatible with t2 using the rules listed in Figure 3.

Basic rules: scalar ground types (all ground types except void ptr) are subtypes of themselves and not of other ground types. For example: $\mathit{int} \preceq \mathit{int}$, $\mathit{int} \npreceq \mathit{double}$, $\mathit{int} \npreceq \mathit{long}$.
Inference rules (premises on the left of $\Rightarrow$, conclusion on the right):
[Reflexivity] $t \preceq t$
[Void Pointers] $\mathrm{ptr}\ t \preceq \mathit{void}\,*$
[Member subtype] $m = (l, t, i)$, $m' = (l', t', i')$, $i = i'$, $t \preceq t'$ $\Rightarrow$ $m \preceq m'$
[First members] $t \preceq t'$, $m_1 = (l, t, 0)$ $\Rightarrow$ $s\{m_1, \cdots, m_k\} \preceq t'$
[Flattened first members] $s\{\mathit{flatten}(t)\} \preceq s\{\mathit{flatten}(t')\}$ $\Rightarrow$ $t \preceq t'$, where
$$\mathit{flatten}(t) = \begin{cases} t & \text{if } t \text{ is not a structure} \\ \{\mathit{flatten}(m_1), \cdots, \mathit{flatten}(m_k)\} & \text{if } t = s\{m_1, \cdots, m_k\} \end{cases}$$
[Integer pointers] $\mathrm{sizeof}(\mathit{int}) = \mathrm{sizeof}(\mathit{void}\,*)$ $\Rightarrow$ $\mathit{void}\,* \preceq \mathit{int}$, $\mathrm{ptr}\ t \preceq \mathit{int}$; $\mathrm{sizeof}(\mathit{int}) = \mathrm{sizeof}(\mathit{void}\,*)$ $\Rightarrow$ $\mathit{void}\,* \preceq \mathit{unsigned\ int}$, $\mathrm{ptr}\ t \preceq \mathit{unsigned\ int}$
[Long pointers] $\mathrm{sizeof}(\mathit{long}) = \mathrm{sizeof}(\mathit{void}\,*)$ $\Rightarrow$ $\mathit{void}\,* \preceq \mathit{long}$, $\mathrm{ptr}\ t \preceq \mathit{long}$; $\mathrm{sizeof}(\mathit{long}) = \mathrm{sizeof}(\mathit{void}\,*)$ $\Rightarrow$ $\mathit{void}\,* \preceq \mathit{unsigned\ long}$, $\mathrm{ptr}\ t \preceq \mathit{unsigned\ long}$
Fig. 3. Basic and inference rules for subtypes

When to Perform Checking. When no pointers, structures, or unions are involved, there is no need to perform checking, because the compiler can guarantee the type safety of the program. So we consider only the statements in which these elements are involved. General pointer references and assignments can propagate errors, but they cannot generate errors, so Address-of statements and Assignment statements of pointers can also be skipped. All other statements need to be checked.
How to Perform Checking. The main task of dynamic type checking is to check the compatibility of an operator and its operands, finding potential type errors and reporting them to the programmer. In most cases, what we need to do is to find the answer to the question "Can the value of an expression A be used as a value of type B?" for different A and B. 1. For unions: The dynamic type of a union object tracks the dynamic type of its active member, so there is no problem when its active member is referenced. When a non-active member is referenced, we need to check the compatibility of the active member and the non-active member. If only part of
the non-active member is referenced, we should check the compatibility of the active member and the referenced part of the non-active member. So we formulate the checking process as follows: (1) construct the minimum target type that contains the type of the referenced part of the target member; (2) check the compatibility of the constructed type and the type of the active member.
2. For pointers: We consider only the normalized statements listed in Table 3 here. (1) In Assignment statements and Address-of statements, the dtype is propagated but no check is made, since these statements cannot generate type errors. (2) In Plus statements and Minus statements, pointer arithmetic should be checked for out-of-bounds errors. This check is performed in the PostfixType function calls. If the result of the plus operation goes beyond the last byte of the referenced object, or the result of the minus operation goes beyond the first byte of the referenced object, the returned type is set to error_type. The error will be found once the resulting pointer is dereferenced. (3) When a pointer is dereferenced, the dereferenced object must be a pointer, which is checked in deref. deref(p) returns error_type if p is not a pointer. When error_type is dereferenced, an error is reported. If deref(p) returns some pointer type, we need to check the compatibility of the source type and the target type. Which is the source and which is the target depends on the side on which the pointer dereference happens.
3. For function calls: Each actual parameter of the function call should be checked to see whether it is compatible with the corresponding formal parameter. If the function is called through a function pointer, another check should be made to ensure that the function pointer has the same number of parameters as the function prototype, that the type of each parameter is compatible with that of the corresponding parameter of the function prototype, and that their return types are the same.
Checking the Compatibility of Two Objects.
Function compatible is used to check the compatibility of two objects. It has two parameters: the target type and the type we have. The rules for judging compatibility are as follows:
1. The type uninitialized is not compatible with any type; that is, compatible(t, uninitialized) always returns false.
2. The type error type is not compatible with any type; that is, compatible(t, error type) always returns false.
3. A subtype is compatible with its parent types.
4. If the compiler can automatically convert the source type to the target type, for example from char to int, they are compatible.
5. Two function types are compatible iff the following conditions hold: (1) they have exactly the same number of parameters; (2) the type of each parameter of one function is compatible with that of the corresponding parameter of the other; (3) the size of each parameter of one function equals that of the corresponding parameter of the other; (4) their return types are compatible; (5) their return types have the same size.
6. In all other cases, the two given types are incompatible.
When compatible is called, it applies the rules listed above one by one to test the compatibility of the input types. If one rule cannot give an answer, the next one is used.
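The rule chain above can be sketched in C. This is only an illustration under stated assumptions: the `Type` descriptor, the `Kind` tags, and the helpers `is_subtype` and `converts` are hypothetical (the paper does not give its data layout), a type is treated as a subtype of itself, and rule 4 is reduced to the single char-to-int conversion mentioned in the text.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type descriptor; not the paper's actual representation. */
typedef enum { K_UNINIT, K_ERROR, K_CHAR, K_INT, K_FUNC } Kind;

typedef struct Type {
    Kind kind;
    size_t size;                      /* size in bytes */
    const struct Type *parent;        /* parent type for rule 3, or NULL */
    const struct Type *ret;           /* K_FUNC only: return type */
    const struct Type *const *params; /* K_FUNC only: parameter types */
    size_t nparams;
} Type;

/* Rule 3 helper: walk the parent chain (assumption: a type is its own subtype). */
static bool is_subtype(const Type *have, const Type *target) {
    for (const Type *t = have; t != NULL; t = t->parent)
        if (t == target) return true;
    return false;
}

/* Rule 4 stand-in: only the char -> int conversion from the text. */
static bool converts(const Type *have, const Type *target) {
    return have->kind == K_CHAR && target->kind == K_INT;
}

/* Apply the rules one by one; the first rule that gives an answer wins. */
bool compatible(const Type *target, const Type *have) {
    if (have->kind == K_UNINIT) return false;          /* rule 1 */
    if (have->kind == K_ERROR) return false;           /* rule 2 */
    if (is_subtype(have, target)) return true;         /* rule 3 */
    if (converts(have, target)) return true;           /* rule 4 */
    if (target->kind == K_FUNC && have->kind == K_FUNC) {  /* rule 5 */
        if (target->nparams != have->nparams) return false;
        for (size_t k = 0; k < target->nparams; k++) {
            if (!compatible(target->params[k], have->params[k])) return false;
            if (target->params[k]->size != have->params[k]->size) return false;
        }
        return compatible(target->ret, have->ret)
            && target->ret->size == have->ret->size;
    }
    return false;                                      /* rule 6 */
}
```

Note that under these rules a char value is compatible with an int target, yet a function taking char is not compatible with one taking int, because rule 5 also requires equal parameter sizes.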
3.5 Updating Aliases
Pointer p is an alias of object o if p references o. dtype(p) should be updated when the dynamic type of o changes; otherwise type inconsistencies result. To prevent this, we update the dtypes of the elements in alias(e) by calling update_alias(e) whenever program element e is updated. Since update_alias(e) is itself a recursive function, all affected elements have their dtypes updated in one call.
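A minimal sketch of such a recursive update follows. The `Element` structure, the fixed-size alias array, and the integer dtype tag are all assumptions made for illustration; in particular, each alias here simply records the dtype tag of the object it references, which simplifies the paper's notion of dtype(p).

```c
#include <stddef.h>

#define MAX_ALIASES 4

/* Hypothetical program element; the paper does not specify this layout. */
typedef struct Element {
    int dtype;                            /* current dynamic type tag */
    struct Element *aliases[MAX_ALIASES]; /* alias(e): pointers referencing e */
    size_t nalias;
} Element;

/* Propagate e's dtype to every element of alias(e), recursively. */
void update_alias(Element *e) {
    for (size_t k = 0; k < e->nalias; k++) {
        Element *p = e->aliases[k];
        if (p->dtype == e->dtype)
            continue;             /* already consistent; also stops on cycles */
        p->dtype = e->dtype;
        update_alias(p);          /* aliases of the alias are affected too */
    }
}
```

The early exit on an already consistent alias is a design choice of this sketch: it guarantees termination when the alias graph contains a cycle.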
4 Some Practical Problems
4.1 Relaxing the Rules
Although we can flatten structures when necessary and we do not consider the labels of structure members when we perform type checking, the rules listed in Figure 3 may still be too restrictive for users. For example, some users may use array padding in their programs (see the ColorPoint example of [6]), so we add the following rules to the list of compatible rules:
1. char[n] is compatible with type t iff sizeof(t) = n.
2. Two structure types are compatible if they can be divided into one or more parts such that the size of each part of one structure equals that of the corresponding part of the other, and the type of each part of one structure is compatible with that of the corresponding part of the other.
4.2 Variable Argument Function
The rules for type checking of functions presented above cannot handle variable argument functions, because the number of arguments of such a function is variable. However, the implementation of a variable argument function shows that it differs from an ordinary function only in its use of several wrapper macros, which expand to assignment and pointer arithmetic statements. The C compiler puts the variable arguments in a variable argument list and passes it to the called function. The function uses a series of macros to extract arguments from the list and cast them to their destination types. Here is an implementation of these macros which appears in Microsoft C:
typedef char *va_list;
#define _INTSIZEOF(n)   ((sizeof(n)+sizeof(int)-1)&~(sizeof(int)-1))
#define va_start(ap,v)  (ap=(va_list)&v + _INTSIZEOF(v))
#define va_arg(ap,t)    (*(t *)((ap += _INTSIZEOF(t))-_INTSIZEOF(t)))
#define va_end(ap)      (ap = (va_list)0)
Among these macros, va_start adjusts the pointer to the variable argument list so that it points to the first variable argument; va_arg extracts one argument from the list, casts it to the destination type, and moves the pointer forward to the next argument in the list. _INTSIZEOF is used to align addresses. We can see from the code that these macros are just pointer arithmetic and casts. If the dynamic types of these arguments are provided, we can also perform dynamic type checking for variable argument functions. The dynamic types can be provided as follows: collect the dtype of each actual parameter and assemble them in an array. This array is passed to the called function along with the other arguments. When an argument is extracted from the argument list, its type is also extracted from the array and type checking is performed. The dtypes of the arguments then propagate as for ordinary functions.
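The idea of passing a parallel dtype array can be sketched with the standard `<stdarg.h>` interface. This is an illustration only: the `DType` tags and the function `checked_sum` are hypothetical, and in the paper's system the array would be built and passed by the modified compiler rather than written by hand.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical dtype tags for the actual arguments. */
typedef enum { DT_INT, DT_DOUBLE } DType;

/* Sum n variable arguments that are all expected to be int.
   dtypes[k] is the dtype collected for the k-th actual argument.
   Returns 1 and stores the sum in *out on success, 0 on a type mismatch. */
int checked_sum(int *out, const DType *dtypes, size_t n, ...) {
    va_list ap;
    int sum = 0;
    int ok = 1;
    va_start(ap, n);
    for (size_t k = 0; k < n; k++) {
        if (dtypes[k] != DT_INT) {   /* check the dtype before extracting */
            ok = 0;
            break;
        }
        sum += va_arg(ap, int);
    }
    va_end(ap);
    if (ok) *out = sum;
    return ok;
}
```

Checking the recorded dtype before calling va_arg means a mismatched argument is never read through the wrong type.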
4.3 External Libraries
Almost all C programs call functions from libraries. Variables defined in external libraries may also be used in C programs. Because dynamic types do not propagate through library functions, some information is lost after calling them. The dtypes of referenced variables may no longer be accurate, and we can only assume that the dtypes of all variables referenced by these functions have not changed. This can result in false alarms, and many errors may go undetected. The problem can be solved by recompiling the library if its source code is available. In many cases, however, the source code cannot be obtained, and we can only rely on the programmer's care.
5 Experimental Results
We implemented the dynamic type checking system in the lightweight C compiler lcc [2] by modifying it. The version we used is 4.2. The checking system, which is linked to every user program the compiler compiles, is implemented as part of the compiler libraries. The necessary variable declarations and the code for dynamic type initialization, propagation, and checking are inserted by the modified lcc compiler. We applied the compiler to a set of small programs, including various sorting algorithms and the maze problem, to evaluate its performance; the results are shown in Table 5. Among the sorting algorithms, “Insertion1” is direct insertion sort, “Insertion2” is binary insertion sort, “Merge1” is the iterative version of merge sort, “Merge2” is the recursive version of merge sort, and the others are self-explanatory.
Table 5. Performance and memory consumption measurements for small programs

Program     Base time (s)  Base size (KB)  Time (s)  Size (KB)  Slowdown1  Aug1  Slowdown2  Aug2
Insertion1  6.15           884             1450.54   4216       235.86     4.77  300.96     4.17
Insertion2  6.29           884             1428.96   4216       227.18     4.77  266.26     4.18
Quick       0.01           892             1.15      4220       115.00     4.73  93.00      4.14
Shell       5.48           884             1109.26   4212       202.42     4.76  231.48     4.13
Selection   6.03           884             708.53    4216       117.50     4.77  118.67     4.18
Bubble      21.44          888             3502.22   4216       163.35     4.75  183.95     4.13
Merge1      0.02           1084            2.10      3576       105.00     3.30  143.00     3.57
Merge2      2.68           1164            140.35    8344       52.37      7.17  41.58      7.06
Heap        0.02           892             1.81      4228       90.50      4.74  95.50      4.14
Maze        12.97          16740           357.45    43248      27.56      2.58  30.12      2.12
Table 6. Error detection results when applying the tool to a test suite

Bug Description                                        Detection Result
Reading uninitialized locals                           Yes
Reading uninitialized data on heap                     Yes
Writing overflowed buffer on heap                      Yes
Writing overflowed buffer on stack                     Yes
Writing to unallocated memory                          Yes
Returning stack object                                 No
Overwriting ending zero of string                      No
Function pointer with wrong number of arguments        No
Function pointer with wrong returning type             Yes
Vararg with wrong type of arguments                    Yes
Vararg with wrong number of arguments                  Yes
Bad union access/part of an object is uninitialized    Yes
Bad union access/a complete uninitialized object       Yes
Memory leakage                                         No
Second free                                            No
Bad type cases                                         Yes
All data were collected on a 2.0 GHz Pentium 4 with 256 MB of memory, running Mandrake Linux Limited Edition 2005 (kernel 2.6.11). In Table 5 we also list the results of the tool of Loginov et al., taken from Wang et al. [3], for comparison. Columns “Base time” and “Base size” list the running time in seconds and the memory consumption in KB of each program compiled by the original lcc compiler; columns “Time” and “Size” list the running time and memory consumption of the program compiled by the modified lcc compiler; columns “Slowdown1” and “Aug1” list the running time slowdown and memory consumption augmentation of the dynamically checked version of each program; “Slowdown2” and “Aug2” list the corresponding data for the tool of Loginov et al. We also applied the compiler to a test suite of small programs to evaluate its error detection ability; each program in the suite contains one or more bugs caused by misuse of a flexible feature of C. The results are shown in Table 6.
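The “Slowdown1” and “Aug1” columns are consistent with a plain ratio of the checked measurement to the baseline measurement. A small sketch of that arithmetic follows; the function name `overhead` is ours, and reading the columns as ratios is an inference from the table values rather than a definition given in the text.

```c
/* Ratio of the measurement for the checked program to the baseline:
   a time ratio gives the slowdown, a size ratio the augmentation. */
double overhead(double checked, double base) {
    return checked / base;
}
```

For Insertion1, overhead(1450.54, 6.15) is about 235.86 and overhead(4216, 884) is about 4.77, matching the first row of Table 5.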
6 Conclusion
The dynamic type checking method presented in this paper is effective in detecting bugs caused by misuse of flexible features of C, and the overhead it brings is tolerable. Since many of these bugs are related to system vulnerabilities, the method is significant for enhancing system security.
References
1. Alexey Loginov, Suan Yong, Susan Horwitz, and Thomas Reps. Debugging via runtime type checking. In Proceedings of the Conference on Fundamental Approaches to Software Engineering, pp. 217-232, 2001.
2. David R. Hanson and Christopher W. Fraser. A Retargetable C Compiler. Addison Wesley, 1995.
3. Jimin Wang, Lingdi Ping, Xuezeng Pan, Haibin Shen, and Xiaolang Yan. Tools to make C programs safe: a deeper study. Journal of Zhejiang University SCIENCE, Vol. 6A, No. 1, pp. 63-70, 2005.
4. Julian Seward. Valgrind, an open-source memory debugger for x86-GNU/Linux. Technical report, http://valgrind.kde.org/, 2003.
5. Michael Burrows, Stephen Freund, and Janet Wiener. Run-time type checking for binary programs. In International Conference on Compiler Construction, 2003.
6. Michael Siff, Satish Chandra, Thomas Ball, Krishna Kunchithapadam, and Thomas Reps. Coping with Type Casts in C. Lecture Notes in Computer Science, 1687:180-198, 1999.
7. Reed Hasting and Bob Joyce. Purify: fast detection of memory leaks and access errors. In Proceedings of the Winter USENIX Conference, 1992.
8. Satish Chandra and Thomas Reps. Physical type checking for C. In Proceedings of the ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering, volume 24.5 of Software Engineering Notes (SEN), pp. 66-75, 1999.
9. Umesh Shankar, Kunal Talwar, Jeffrey S. Foster, and David Wagner. Automated Detection of Format-String Vulnerabilities Using Type Qualifiers. In Proceedings of the 10th USENIX Security Symposium, Washington, DC, 2001.
A Chaos-Based Robust Software Watermarking
Fenlin Liu, Bin Lu, and Xiangyang Luo
Information Engineering Institute, Information Engineering University, Zhengzhou, Henan Province, 450002, China
Abstract. In this paper we propose a robust chaos-based software watermarking scheme that addresses several limitations of existing software watermarking. The algorithm combines anti-reverse engineering techniques, a chaotic system, and the idea of Easter Egg software watermarks. Global protection for the program is provided by dispersing the watermark over the whole code of the program with chaotic dispersion coding; resistance against reverse engineering is improved by using anti-reverse engineering techniques. We implement the scheme on the Intel i386 architecture and the Windows operating system, and analyze the robustness and the performance degradation of the watermarked program. The analysis indicates that the algorithm resists various types of semantics-preserving transformation attacks and tolerates reverse engineering attacks well.
1 Introduction
Software piracy has received an increasing amount of interest from the research community [1, 2, 3]. Nowadays, software developers are mainly responsible for copyright protection, using encryption, license numbers, key files, dongles, etc. [1, 4]. These techniques are vulnerable to cracking attacks and make it hard to trace pirates. Moreover, software developers have to spend much time, resources, and effort on copyright protection. If there were a software protection system as reliable as a cryptosystem, software based on it could be protected to a certain extent, and software developers could devote most of their resources and effort to developing the software rather than to intellectual property protection. Software watermarking is a promising attempt in this direction [5].
There are several published techniques for software watermarking. However, no single watermarking algorithm has emerged that is effective against all known attacks. Davidson et al. [6] statically encode the watermark in the ordering of the basic blocks that constitute the program. This scheme is easily subverted by permuting the order of the blocks. A comparable spread spectrum technique was introduced by Stern et al. [7], which embeds a watermark by modifying the frequencies of instructions in the program. This scheme is robust to various types of signal processing. However, the data rate is low and the scheme is easily subverted by inserting redundant instructions, code optimization, etc. Exploiting pointer aliasing effects, Collberg et al. [8] first proposed dynamic software watermarking, which embeds the watermark in the topology of a data structure that is built on the heap at runtime, given some secret input sequence to the program. This scheme is vulnerable to any attack that is able to modify the pointer topology of the program's fundamental data types. Cousot et al. [9] embed the watermark in local variables, and the watermark can be detected even if only part of the watermarked program is present. This scheme can be attacked by obfuscating the program such that the local variables representing the watermark cannot be located, or such that the abstract interpreter cannot determine what values are assigned to those local variables. Nagara et al. [5] proposed thread-based watermarking on the premise that multithreaded programs are inherently more difficult to analyze and that the difficulty of analysis increases with the number of threads that are "live" concurrently. However, the scheme requires introducing a number of threads, and the resulting performance degradation cannot be ignored.
In general, existing schemes have the following limitations:
(A) The assumed threat model is almost always based on automated attacks (i.e., code optimization, obfuscation, data restructuring, and so on), and hardly ever on manual attacks (such as reverse engineering attacks).
(B) The watermark is embedded in only a certain module of the program, so not all modules are protected and cropping attacks cannot be resisted.
(C) The watermark is embedded in the source code; because the program must be recompiled, embedding is rather inefficient, especially for fingerprints.
(D) In the embedding procedure, programmers have to take on all the work, especially the complex construction and embedding of the watermark, so the watermark is not always feasible.
K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 355–366, 2006. © Springer-Verlag Berlin Heidelberg 2006
This paper designs a new scheme, Chaos-based Robust Software Watermarking (CRSW), that integrates a chaotic system, anti-reverse engineering technology, and the idea of Easter Egg software watermarks. CRSW retains the ease and feasibility of Easter Egg software watermarks while resisting various types of semantics-preserving code transformation attacks. Because a chaotic system is involved, dispersing the watermark over the whole code provides global protection for the program. Furthermore, by involving anti-reverse engineering techniques, the resistance against reverse engineering attacks is improved. In addition, CRSW embeds the watermark into the executable code directly. The watermarked program need not be recompiled, so embedding efficiency is improved. The analysis of the proposed algorithm shows that CRSW resists various types of semantics-preserving transformation, tolerates reverse engineering attacks well, and incurs modest performance degradation.
2 The Structure of CRSW
Easter Egg watermarks, a kind of dynamic software watermark, are among the most widely used watermarks [1, 8]. In essence, this kind of watermarking directly embeds a watermark detector (or extractor) into the program. When a special input sequence is received, the detector (or extractor) is activated, and the watermark extracted from the watermarked program is displayed visually. Thus, the semantics of the detecting (or extracting) procedure is included in the semantics of the watermarked program, so Easter Egg watermarks resist various semantics-preserving transformation attacks [10]. The main problem with Easter Egg watermarks is that once the right input sequence has been discovered, standard debugging techniques allow the adversary to locate the watermark in the executable code and then remove or disable it [8]. In addition, the watermark is typically embedded in only one piece of the program; hence, cropping a particularly valuable module from the program for illegal reuse is likely to be a successful attack [10].
In this paper, building on the idea of Easter Egg software watermarks, we propose a more robust and feasible software watermark: CRSW. The watermark consists of 4 essential parts: the watermark $W$, the Input Monitoring Module $C_m$, the Watermark Decoding Module $C_d$, and the Anti-reverse Engineering Module $C_a$. Unlike other watermarking schemes, CRSW embeds not only the watermark $W$ into the program but also $C_m$, $C_d$, and $C_a$, in the form of executable code. In this section, we describe the structure and interrelations of the CRSW embedding code, which includes $C_m$, $C_d$, and $C_a$ (see Fig. 1).
Formally, let $P$ be the considered program and $\{\alpha_1, \alpha_2, \cdots\}$ be the set of acceptable inputs of the program. $P' = T(P, W, C_m, C_d, C_a)$ is the watermarked program ($T$ is the watermarking transformation). The extracting procedure is $\hat{W} = D(\Gamma(P'))$, where $D$ is the extracting transformation and $\Gamma$ is a code transformation. If the watermarked program has been attacked, $\Gamma$ represents the attacking transformation; otherwise, $\Gamma$ is the identity transformation. If $D(\Gamma(P')) \equiv_{cp} W$ holds, $T$ resists $\Gamma$, where $\equiv_{cp}$ is a user-defined equivalence relation.
The Input Monitoring Module realizes the mapping $\Psi : \{\alpha_1, \alpha_2, \cdots\} \to \{0, 1\}$. If $\Psi(\alpha_i) = 1$ holds, $C_d$ is activated. An input $\alpha \in \Sigma = \{\alpha_i \mid \Psi(\alpha_i) = 1\}$ is defined as an activation key. To describe the Watermark Decoding Module clearly, we first briefly describe the watermark embedding procedure.
First, preprocess $W$: $W' = E(W, G)$, where $G$ is a digital chaotic system; then embed $W'$ into the code of the program with chaotic dispersion coding and obtain the code $I_{ew} = \Omega(W', I, G)$, where $I$ is the code of the program (we discuss chaotic dispersion coding in the next section).
The Watermark Decoding Module extracts the watermark $\hat{W}$ from the watermarked program and presents it in the form of a visual action. The module consists of the Watermark Output Module ($C_{do}$) and the Chaotic System Module ($C_{dc}$). In the extracting procedure, the module first extracts $\hat{W}'$ with reverse chaotic dispersion coding ($\hat{W}' = \Omega^{-1}(I_{ew}, G)$); then it obtains the watermark $\hat{W}$ by $\hat{W} = E^{-1}(\hat{W}', G)$; finally, $C_{do}$ transforms $\hat{W}$ into the visual action $V_W$ and displays $V_W$ to users.
The Anti-reverse Engineering Module, which consists of the Anti-static Analyzing Module ($C_{as}$) and the Anti-dynamic Debugging Module ($C_{ad}$), protects $C_m$ and $C_d$ from reverse engineering attacks. $C_{as}$ applies anti-static analyzing techniques, and $C_{ad}$ applies anti-dynamic debugging techniques.
Fig. 1. The structure and interrelations of CRSW embedding code
3 Embedding and Extraction of the CRSW
In this section, we discuss how to embed W, Cm, Cd and Ca into P; the construction of Cm, Cd and Ca is described in Section 5. Before describing embedding and extraction, we present chaotic substitution and chaotic dispersion coding.
3.1 Chaotic Substitution ∂
Chaotic substitution replaces $i$ with $c$ (where $c$ and $i$ are two 8-bit binary integers): the value stored at the position of $i$ becomes $c$, and a save value $s$ is produced. Given $c$ and $s$, the original value of $i$ is recovered by reverse chaotic substitution. Let $G$ be a digital chaotic system and, without loss of generality, let the state space of $G$ be $[a, b)$. Chaotic substitution can then be expressed as

$$s = \partial(i, c, G) = 2^8 \times \frac{G(x, m) - a}{b - a} \oplus i, \quad x = \frac{c(b - a)}{2^8} + a, \quad m = \frac{c}{\lambda} + 1 \qquad (1)$$

where $\oplus$ is XOR, $G(x, m)$ is the state of $G$ after $m$ iterations from the initial value $x$, and $\lambda$ is a parameter that adjusts the number of iterations. Reverse chaotic substitution is given by

$$i = \partial^{-1}(s, c, G) = \partial(s, c, G) \qquad (2)$$

More generally, when the set $A = \{a_1, a_2, \cdots, a_k\}$ replaces $B = \{b_1, b_2, \cdots, b_k\}$ by chaotic substitution, the result is

$$R = \{r_j\} = \partial(B, A, G) = \{\partial(b_j, a_j, G)\}, \quad j = 1, 2, \cdots, k \qquad (3)$$
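Equations (1) and (2) can be sketched in C as follows. Several points are assumptions made for illustration and are not fixed by the text: a piecewise linear chaotic map with parameter 0.3 on the state space [0, 1) stands in for G, the scaled state is truncated to a byte before the XOR, c/λ is computed with integer division (so λ must be at least 1), and all function names are ours.

```c
#include <stdint.h>

/* Assumed control parameter of the stand-in map, 0 < p < 0.5. */
static const double PWLCM_P = 0.3;

/* One iteration of a piecewise linear chaotic map on [0, 1);
   a hypothetical stand-in for the digital chaotic system G. */
static double pwlcm_step(double x) {
    if (x >= 0.5) x = 1.0 - x;                 /* map is symmetric about 0.5 */
    return (x < PWLCM_P) ? x / PWLCM_P
                         : (x - PWLCM_P) / (0.5 - PWLCM_P);
}

/* Byte derived from c alone: 2^8 * (G(x, m) - a) / (b - a) with a = 0, b = 1,
   x = c / 2^8 and m = c / lambda + 1 (integer division, an assumption). */
static uint8_t key_byte(uint8_t c, unsigned lambda) {
    double x = c / 256.0;
    unsigned m = c / lambda + 1;
    for (unsigned k = 0; k < m; k++) x = pwlcm_step(x);
    unsigned v = (unsigned)(256.0 * x);        /* truncation is an assumption */
    return (uint8_t)(v > 255 ? 255 : v);       /* clamp the x == 1.0 edge case */
}

/* Eq. (1): s = key(c) XOR i. Eq. (2): the same function inverts it. */
uint8_t chaotic_sub(uint8_t i, uint8_t c, unsigned lambda) {
    return (uint8_t)(key_byte(c, lambda) ^ i);
}
```

Because the XOR key depends only on c and λ, applying chaotic_sub twice with the same c returns the original byte, which is the self-inverse property stated in Eq. (2).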
The reverse procedure is given by

$$B = \partial^{-1}(R, A, G) = \partial(R, A, G) \qquad (4)$$

3.2 Chaotic Dispersion Coding ξ
Let $X = \{x_1, x_2, \cdots, x_n\}$ be a chaotic sequence and, without loss of generality, suppose $x_j \in [a, b)$, $j = 1, 2, \cdots, n$. Chaotic dispersion coding disperses $W$ over $I$ (the code of the program):

$$[I', S'] = \xi(W, I, X) \qquad (5)$$

where $I'$ is the resulting code and $S'$ is the saved code.
Let the length of $W$ be $n$ bytes and the length of $I$ be $l$ bytes. Thus $W = \{w_1, w_2, \cdots, w_n\}$, $I = \{i_1, i_2, \cdots, i_l\}$, and $S' = \{s'_1, s'_2, \cdots, s'_n\}$. The steps of $\xi$ are as follows:
1) Initialization: $L \leftarrow l$, $N \leftarrow n$, $m \leftarrow L/N$, $j \leftarrow 1$, $d \leftarrow 0$; let $I' = \{i'_1, i'_2, \cdots, i'_l\} = I$.
2) Let $r = m \times \frac{x_j - a}{b - a}$, $d = d + r$, $s'_j = \partial(i'_d, w_j, G)$, $i'_d = w_j$.
3) The algorithm is done if $j = n$. Otherwise go to 4).
4) Let $L = L - r$, $N = N - 1$, $m = L/N$, $j = j + 1$; go to 2).
When the algorithm is done, $S' = \{s'_1, s'_2, \cdots, s'_n\}$ and $I' = \{i'_1, i'_2, \cdots, i'_l\}$. The reverse chaotic dispersion coding, which recovers $W$ and $I$ from $S'$ and $I'$, can be expressed as $[I, W] = \xi^{-1}(S', I', X)$.
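The steps of ξ and its inverse can be sketched in C as below. The sketch rests on assumptions that the text leaves open: the chaotic values are taken in [0, 1), the step is rounded as floor(m·x) + 1 so that every watermark byte lands on a distinct position, and `subst` is a simple XOR stand-in for the chaotic substitution ∂ rather than the chaotic-system version. The names `disperse`, `recover`, `step`, and `subst` are ours.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the substitution of Eq. (1): XOR with a byte derived from c.
   Any c-dependent byte keeps the self-inverse property; this mixer is an
   assumption, not the chaotic system of the paper. */
static uint8_t subst(uint8_t i, uint8_t c) {
    return (uint8_t)(i ^ (uint8_t)(c * 167u + 13u));
}

/* Step r for one watermark byte, with x in [0, 1) and m = L / N.
   floor(m * x) + 1 keeps r in [1, m]; the rounding is an assumption. */
static size_t step(size_t m, double x) {
    return (size_t)((double)m * x) + 1;
}

/* xi: disperse w[0..n) over code[0..l); save[] receives s'_j. Needs l >= n >= 1. */
void disperse(uint8_t *code, size_t l, const uint8_t *w, size_t n,
              const double *x, uint8_t *save) {
    size_t L = l, N = n, d = 0;
    for (size_t j = 0; j < n; j++) {
        size_t r = step(L / N, x[j]);
        d += r;                                /* 1-based position in the code */
        save[j] = subst(code[d - 1], w[j]);    /* s'_j = subst(i'_d, w_j) */
        code[d - 1] = w[j];                    /* i'_d = w_j */
        L -= r;
        N -= 1;
    }
}

/* Inverse of xi: read w back out of the code and restore the original bytes. */
void recover(uint8_t *code, size_t l, uint8_t *w, size_t n,
             const double *x, const uint8_t *save) {
    size_t L = l, N = n, d = 0;
    for (size_t j = 0; j < n; j++) {
        size_t r = step(L / N, x[j]);
        d += r;
        w[j] = code[d - 1];
        code[d - 1] = subst(save[j], w[j]);    /* undo the substitution */
        L -= r;
        N -= 1;
    }
}
```

With the chosen rounding, L stays at least N throughout, so the positions strictly increase and never run past the end of the code.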
3.3 Embedding
In CRSW, all of W, Cm, Cd, and Ca are embedded into the executable code directly. The procedure is described below (Fig. 2 shows the change of the executable code after embedding):
Fig. 2. The drawing of embedding watermark
1) Give the key $\langle K_1, K_2 \rangle$, where $K_1$ is the activation key and $K_2$ is the key for producing the chaotic sequence. Supposing that the length of the watermark is $n$ bytes, $W$ can be expressed as $\{w_1, w_2, \cdots, w_n\}$.
2) Construct the Watermark Decoding Module $C_d$ and the Anti-reverse Engineering Module $C_a$; construct the Input Monitoring Module $C_m$ with $K_1$ (the details of the constructions are discussed in Section 5).
3) Produce the chaotic sequence $X = \{x_1, x_2, \cdots\}$.
4) Apply chaotic substitution to embed $C_m$, $C_d$, $C_a$ into the code of $P$. Let the code blocks that are replaced with $C_m$, $C_d$, $C_a$ be $I_m$, $I_d$, $I_a$ respectively. We get $S_m = \partial(I_m, C_m, G)$, $S_d = \partial(I_d, C_d, G)$, $S_a = \partial(I_a, C_a, G)$, where $G$ is the digital chaotic system.
5) Get the subsequence $X^{(1)}$ (of length $n$) from $X$ and preprocess $W$: $W' = E(W, X^{(1)}) = W \oplus X^{(1)} = \{w'_1, w'_2, \cdots, w'_n\} = \{w_1 \oplus x^{(1)}_1, w_2 \oplus x^{(1)}_2, \cdots, w_n \oplus x^{(1)}_n\}$, where $\oplus$ is XOR.
6) Get the subsequence $X^{(2)}$ (of length $n$) from $X$; embed $W'$ into $I$ ($I$ is the whole code excluding the code that is replaced with $C_m$, $C_d$, $C_a$) with chaotic dispersion coding and get the result $[I', S_W] = \xi(W', I, X^{(2)})$ (Fig. 2 shows the distribution of $W'$ in the watermarked program).
7) Save $S_m$, $S_d$, $S_a$ and $S_W$ to the end of the executable code, and adjust the header of the executable code.
3.4 Extraction
Because the watermark extractor is embedded into the program, the extraction of the watermark is part of the execution of the watermarked program. We describe the execution of the watermarked program to illustrate the extraction.
1) The watermarked program runs.
2) The code of the Anti-reverse Engineering Module runs.
3) The code of the Input Monitoring Module runs and monitors the input of the program.
4) Produce the chaotic sequence $Y$.
5) Get the subsequence $Y^{(2)}$ (of length $n$; $Y^{(2)}$ is the same as $X^{(2)}$ in the embedding algorithm) from $Y$ and recover the code that was replaced with $W'$; the procedure can be expressed by $[I, \hat{W}'] = \xi^{-1}(S_W, I', Y^{(2)})$.
6) Recover the code that was replaced with $C_m$, $C_d$, $C_a$; the procedure can be expressed by $I_m = \partial^{-1}(S_m, C_m, G)$, $I_d = \partial^{-1}(S_d, C_d, G)$, $I_a = \partial^{-1}(S_a, C_a, G)$.
7) The watermarked program keeps on running.
8) If the input matches $K_1$ (the activation key), get the subsequence $Y^{(1)}$ (of length $n$; $Y^{(1)}$ is the same as $X^{(1)}$ in the embedding algorithm) from $Y$, and turn $\hat{W}'$ into $\hat{W}$ with the inverse preprocessing, which is $\hat{W} = E^{-1}(\hat{W}', Y^{(1)}) = E(\hat{W}', Y^{(1)})$.
9) Transform $\hat{W}$ into $V_W$ (the visual action) and perform $V_W$.
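The XOR preprocessing E used in embedding step 5) and undone in extraction step 8) can be sketched as follows. How a real-valued chaotic state is turned into a byte is not stated in the text; the `quantize` helper below (scaling the state in [a, b) to 0..255 and truncating) is an assumption, as are the function names.

```c
#include <stddef.h>
#include <stdint.h>

/* Map a chaotic state x in [a, b) to one byte (hypothetical quantization). */
static uint8_t quantize(double x, double a, double b) {
    return (uint8_t)(256.0 * (x - a) / (b - a));
}

/* E(W, X) = W xor X, applied in place. Since XOR is its own inverse,
   calling the function again with the same sequence restores W. */
void preprocess(uint8_t *w, const double *x, size_t n, double a, double b) {
    for (size_t j = 0; j < n; j++)
        w[j] ^= quantize(x[j], a, b);
}
```

This is why step 8) can write the inverse preprocessing as E itself: the extractor only needs to regenerate the same subsequence from the key.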
4 The Analysis of CRSW
This section discusses the robustness of CRSW and the performance degradation it causes. Let the lengths of $W$, $C_m$, $C_a$, and $C_d$ be $n$ bytes, $l_m$ bytes, $l_a$ bytes and $l_d$ bytes respectively.
First, we analyze the robustness. Let $R[\![P]\!]$ be the semantics of $P$; $\omega \in \Gamma_b = \{\varphi \mid R[\![\varphi(P)]\!] = R[\![P]\!]\}$ is a semantics-preserving transformation. In CRSW, because of the visual output $V_W$, $R[\![V_W]\!] \subseteq R[\![P']\!]$ holds. Then the following relation holds according to the definition of semantics-preserving transformation:

$$R[\![V_W]\!] \subseteq R[\![P']\!] = R[\![\omega(P')]\!] \qquad (6)$$

Equation (6) indicates that semantics-preserving transformations cannot destroy the semantics of $V_W$, so CRSW resists all types of semantics-preserving transformation attacks except those that can distinguish $R[\![V_W]\!]$ from $R[\![P']\!]$.
In the Anti-reverse Engineering Module, anti-static analyzing techniques and anti-dynamic debugging techniques are introduced to thwart reverse engineering attacks. The resistance against reverse engineering depends on the anti-reverse engineering techniques applied in CRSW. Since the module is dynamic and scalable, more effective anti-reverse engineering techniques can be applied to it to enhance this resistance. Moreover, the watermark is embedded into the code of the program by chaotic dispersion coding. This mixes instructions and data, which further hinders static analysis.
Because of chaotic dispersion coding, the watermark causes the program to fail if the adversary tries to reuse any part of the code on its own. Since $W$ is distributed uniformly over the code (the whole code excluding the code that is replaced with $C_m$, $C_d$ and $C_a$), there is on average one byte of watermark per $\frac{l - l_m - l_d - l_a}{n}$ bytes of code. Thus:

$$l_v = \frac{l - l_m - l_d - l_a}{n} \qquad (7)$$

where $l_v$ is the average length of the reused code. To ensure $l_v \le l_T$, the length $n$ of the watermark must satisfy the inequality $n \ge \frac{l - l_m - l_d - l_a}{l_T}$.
It is difficult to locate the watermark because the position of $W$ is generated by a chaotic sequence. In addition, $s = \partial(i, c, G)$ (chaotic substitution) can be regarded as encrypting $i$ with $G$ and $c$. If $c$ is tampered with, $i$ cannot be decoded correctly by $i = \partial^{-1}(s, c, G)$. In CRSW, if $W$ is tampered with, it is impossible to correctly recover the code that was replaced with $W$ in the extracting procedure; if $C_m$, $C_a$ or $C_d$ is tampered with, it is likewise impossible to recover the code that was replaced with $C_m$, $C_a$ and $C_d$, which can cause the program to fail. For a given $G$, $c$ is taken to be the secret key, so the key space is $2^{l_c}$ ($l_c$ is the length of $c$); the key space is $2^{8(n + l_m + l_d + l_a)}$ in CRSW.
We analyze the performance degradation of the watermarked program below. In terms of space, embedding the watermark increases the size of the program.
In the embedding procedure, the size of the program increases by $n + l_m + l_d + l_a$ bytes because of the chaotic substitution applied in our algorithm. In terms of runtime, embedding the watermark increases the running time of the program. The reason is that before the watermarked program executes, the original code must be recovered from $S_m$, $S_d$, $S_a$ and $S_W$; the recovery time depends not only on the iteration efficiency of the digital chaotic system but also on the contents of $W$, $C_m$, $C_a$ and $C_d$. Let $t$ be the time of one iteration. Then $T_1$, the time of recovering code from $S_m$, $S_d$ and $S_a$, satisfies the following inequality:

$$(l_m + l_d + l_a)t \le T_1 \le \frac{1}{\lambda}(l_m + l_d + l_a)t \times 2^8 \qquad (8)$$

To recover code from $S_W$, a chaotic sequence of $n$ bytes must first be generated for $\xi^{-1}$. Thus $T_2$, the time of recovering code from $S_W$, satisfies:

$$nt + nt \le T_2 \le nt + \frac{2^8}{\lambda} nt \qquad (9)$$

$T_1 + T_2$, the time of recovering all code, satisfies

$$2nt + (l_m + l_d + l_a)t \le T_1 + T_2 \le nt + \frac{2^8}{\lambda}(n + l_m + l_d + l_a)t \qquad (10)$$

If $W$, $C_m$, $C_a$ and $C_d$ are bit-balanced (bits 0 and 1 occur with the same frequency), the average time of the procedure is

$$T = nt + \frac{2^7 + 0.5}{\lambda}(n + l_m + l_d + l_a)t \qquad (11)$$

In general, $C_m$, $C_a$ and $C_d$ are fixed, that is, $l_m + l_d + l_a$ is constant, and $t$ is also constant for a given digital chaotic system, so equation (11) can be rewritten as follows:

$$T = nt\left(1 + \frac{2^7 + 0.5}{\lambda}\right) + \frac{2^7 + 0.5}{\lambda}(l_m + l_d + l_a)t = \beta_1 n + \beta_2 \qquad (12)$$
where β1, β2 are constants. Equation (7) shows that the larger n is, the smaller lv is, and the stronger the protection is. Equation (12) shows that T increases linearly with n. Users who intend to apply CRSW must therefore trade off the strength of protection against the performance degradation.
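The trade-off can be explored numerically. The sketch below evaluates the bound on n that follows from Equation (7) and the recovery-time bounds of Equations (10) and (11); the function and type names are ours, `l_mods` stands for the sum lm + ld + la, and the numbers in the usage note are made up purely for illustration.

```c
/* Lower bound, upper bound and bit-balanced average of the recovery time. */
typedef struct {
    double lo;
    double hi;
    double avg;
} Recovery;

/* Smallest n with l_v <= l_T, from Eq. (7): n >= (l - l_mods) / l_T,
   rounded up to a whole number of bytes. */
unsigned long min_watermark_len(unsigned long l, unsigned long l_mods,
                                unsigned long l_T) {
    unsigned long free_len = l - l_mods;
    return (free_len + l_T - 1) / l_T;
}

/* Eqs. (10) and (11); t is the time of one chaotic iteration (any unit). */
Recovery recovery_time(double n, double l_mods, double t, double lambda) {
    Recovery r;
    r.lo = 2.0 * n * t + l_mods * t;
    r.hi = n * t + 256.0 / lambda * (n + l_mods) * t;
    r.avg = n * t + 128.5 / lambda * (n + l_mods) * t;
    return r;
}
```

For example, with a hypothetical 100000-byte program, 4000 bytes of modules and a threshold of 512 bytes, min_watermark_len gives 188 bytes; with n = 10, l_mods = 90, t = 1 and λ = 1 the recovery time lies between 110 and 25610 iterations, with an average of 12860.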
5 Implementation of CRSW
The algorithm is implemented on the Intel i386 architecture and the Windows operating system. This section describes the implementation of the Input Monitoring Module, the Anti-reverse Engineering Module, and the Watermark Decoding Module. Several problems arise when implementing these modules; the corresponding solutions are given at the end of this section.
5.1 Input Monitoring Module Cm
The purpose of Cm is to monitor the input of the program. When implementing Cm, we put the activation key K1 (or µ(K1), where µ is a one-way function) into this module. When the input α matches K1 (or µ(α) matches µ(K1)), the Watermark Decoding Module is activated.
5.2 Anti-reverse Engineering Module Ca
In theory, a sufficiently determined attacker can thoroughly analyze any software by reverse engineering; it is impossible to thwart reverse engineering attacks completely. The goal, then, is to design watermarking techniques that are expensive enough to break, in time, effort, or resources, that for most attackers breaking them is not worth the trouble. Reverse engineering relies on two kinds of techniques: static analysis and dynamic debugging. Therefore, Ca consists of the Anti-Static Analyzing Module and the Anti-Dynamic Debugging Module.
Decompilation is the foundation of static analysis techniques. We can disable static analysis by disturbing the decompiler, which is developed on the hypothesis that data and instructions are separated. However, data and instructions are indistinguishable in the Von Neumann architecture. Thus, we can mix data and instructions to disturb the decompiler by adding special data and instructions (we call them disturbing data) between instructions. In Fig. 3, (a) gives source code in assembly language: lines 1, 5, 6 are original instructions, while lines 2, 3, 4 are disturbing data. (b) shows the instructions produced by the decompiler. We can see that there are errors from line 4 to the end. A number of kinds of disturbing data are given in [4]. In this paper, we insert several disturbing data into Cm, Ca and Cd. If code encryption, compression, etc. are also applied in the Anti-Static Analyzing Module, the performance will be further improved.
Dynamic debugging relies heavily on debugging tools, so the general principle of anti-virus software can be used to detect, from the characteristics of debugging tools, whether the program is being debugged. If it is, the program jumps to a wrong control flow in order to prevent debugging. We have already implemented the algorithm based on the characteristics of SoftICE, Windbg, and Ollydbg. Experiments demonstrate that it resists these debugging tools. The characteristics of other debugging tools can be introduced in an improved implementation.
Fig. 3. Example of disturbing data
The processor of the i386 architecture has several registers for debugging. Several debugging tools build useful functions, such as BPM¹ and hardware breakpoints, on the debug registers [4]. In this paper, we modify the values of the debug registers to invalidate these functions. In addition, time-sensitive code and breakpoint detection are introduced into the implementation. We have thus incorporated several kinds of anti-reverse engineering techniques. It is worth mentioning that this module is scalable, in that more efficient anti-reverse engineering techniques can be introduced to enhance the resistance against reverse engineering attacks.
5.3 Watermark Decoding Module Cd
The Watermark Decoding Module includes a Watermark Output Module and a Chaotic System Module. The Watermark Output Module transforms the watermark extracted from the watermarked program into visual output; the Chaotic System Module implements the digital chaotic system, which is used only in the watermark extraction procedure. When chaotic systems are realized discretely in finite precision, serious problems arise, such as dynamical degradation, short cycle length and non-ideal distribution. We must therefore compensate for the dynamical degradation of the digital chaotic system. We apply 1D piecewise linear chaotic maps and the degradation compensation scheme of [11] in our implementation.
5.4 Problems and Solutions
Because the watermark is embedded directly into the executable code, two problems arise when implementing Cm, Ca and Cd: (1) After the modules are embedded into different executables, their code and data are loaded at different memory addresses, and the code cannot access its data in memory correctly. Therefore, the code must perform self-location (locate its memory addresses by itself). (2) Since the program is not recompiled after embedding, the modules cannot rely on the compiler and loader to resolve the addresses of Windows APIs and must obtain these addresses by themselves. The self-location of code and data can be implemented with call/pop/sub instructions. Fig. 4 gives the specific code. The register EBX saves the difference between the loading address and the design address; the loading address is then the sum of EBX and the design address, which achieves self-location. The addresses of Windows APIs are obtained as follows:
1) Get the loading base address of kernel32.dll. Windows provides structured exception handling (SEH). All exception handling functions are kept in a linked list, and the last element of the list is the default exception handling function, which resides in kernel32.dll. By traversing the linked list we obtain the address of the default exception handling function, from which we can derive the loading base address of kernel32.dll.
2) Get the addresses of the Windows APIs LoadLibrary and GetProcAddress from the export table of kernel32.dll, using the loading base address of kernel32.dll.
3) Get the address of any other Windows API with LoadLibrary and GetProcAddress.
Fig. 4. The code of self-location
1 BPM, an instruction of SoftICE, can set a breakpoint on memory access or execution.
6 Conclusion
A chaos-based robust software watermarking algorithm is proposed in this paper, in which anti-reverse engineering techniques and a chaotic system are combined with the idea of Easter Egg software watermarks. In CBSW, the Anti-reverse Engineering Module is open and scalable, so more efficient anti-reverse engineering techniques can be applied. The program is protected by embedding the watermark into the entire code with chaotic dispersion coding. With chaotic substitution, it is difficult for an adversary to tamper with the message (which includes W, Cm, Cd and Ca) embedded in the program. The analysis of CBSW shows that the scheme can thwart various types of semantics-preserving transformation attacks, such as dead code removal, code optimization, code obfuscation and variable reconstruction. Furthermore, it improves resistance against reverse engineering attacks to a certain extent.
Acknowledgement. The work is supported partially by the National Natural Science Foundation of China (Grant No. 60374004), partially by the Henan Science Fund for Distinguished Young Scholar (Grant No. 0412000200), partially by HAIPURT (Grant No. 2001KYCX008), and partially by the Science-Technology Project of Henan Province of China.
References
1. C. Collberg, C. Thomborson. Watermarking, tamper-proofing, and obfuscation tools for software protection. IEEE Trans. Software Engineering, Vol. 28, No. 8, pages 735-746.
2. Zhang Lihe, Yang YiXian, Niu Xinxin, Niu Shaozhang. A Survey on Software Watermarking. Journal of Software, Vol. 14, No. 2, pages 268-277, in Chinese.
3. Business Software Alliance. Eighth annual BSA global software piracy study: Trends in software piracy 1994-2002, June 2003.
4. Kan X. Encryption and Decryption: Software Protection Technique and Complete Resolvent. Beijing: Electronic Engineering Publishing Company, 2001, in Chinese.
5. Jasvir Nagra and Clark Thomborson. Threading software watermarks. In 6th Workshop on Information Hiding, 2004, pages 208-223.
6. Robert L. Davidson and Nathan Myhrvold. Method and system for generating and auditing a signature for a computer program. US Patent 5,559,884, September 1996. Assignee: Microsoft Corporation.
7. Julien P. Stern, Gael Hachez, François Koeune, and Jean-Jacques Quisquater. Robust object watermarking: Application to code. In 3rd International Information Hiding Workshop, 1999, pages 368-378.
8. C. Collberg and C. Thomborson. Software watermarking: Models and dynamic embeddings. Proceedings of POPL'99, the 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 1999, pages 311-324.
9. Patrick Cousot and Radhia Cousot. An abstract interpretation-based framework for software watermarking. In ACM Principles of Programming Languages (POPL'04), Venice, Italy, 2004, pages 173-185.
10. Christian Collberg, Andrew Huntwork, Edward Carter, and Gregg Townsend. Graph theoretic software watermarks: Implementation, analysis, and attacks. In 6th Workshop on Information Hiding, 2004, pages 192-207.
11. Liu Bin, Zhang Yongqiang, and Liu Fenlin. A New Scheme on Perturbing Digital Chaotic Systems. Computer Science, Vol. 32, No. 4, 2005, pages 71-74, in Chinese.
Privately Retrieve Data from Large Databases
Qianhong Wu1, Yi Mu1, Willy Susilo1, and Fangguo Zhang2
1 Center for Information Security Research, School of Information Technology and Computer Science, University of Wollongong, Wollongong NSW 2522, Australia
{qhw, ymu, wsusilo}@uow.edu.au
2 School of Information Science and Technology, Sun Yat-sen University, Guangzhou 510275, Guangdong Province, P.R. China
[email protected]
Abstract. We propose a general and efficient transformation from Private Information Retrieval (PIR) to Symmetrically Private Information Retrieval (SPIR). Unlike existing schemes that use inefficient zero-knowledge proofs, our transformation exploits an efficient construction of Oblivious Transfer (OT) to reduce the communication complexity, which is a main goal of PIR and SPIR. The proposed SPIR enjoys almost the same communication complexity as the underlying PIR. As a result of independent interest, we propose a novel homomorphic public-key cryptosystem derived from the Okamoto-Uchiyama cryptosystem and prove its security. The new homomorphic cryptosystem has the additional advantage of allowing one to encrypt messages of changeable size with fixed extension bits. Implementing PIR/SPIR on the proposed cryptosystem makes PIR and SPIR applicable to large databases.
1 Introduction
Consider the following scenario. A user wants to obtain an entry from a database of n λ-bit strings but does not want the database to learn which entry it wants. This problem was formally defined as Private Information Retrieval (PIR) in [4]. If, in addition to protecting the privacy of the user, the protocol does not allow the user to learn any information about the entries other than the chosen one, it is called a Symmetrically Private Information Retrieval (SPIR) protocol [10]. Usually, PIR schemes (e.g., [5], [8], [9]) are converted into SPIR protocols by employing zero-knowledge techniques to validate the query. The notion of Oblivious Transfer (OT) is similar to PIR. It was introduced by Rabin [13]: Alice has one secret bit m and wants Bob to obtain it with probability 0.5. Additionally, Bob does not want Alice to know whether he gets m or not. For 1-out-of-2 OT, Alice has two secrets m1 and m2 and wants
This work is supported by ARC Discovery Grant DP0557493 and the National Natural Science Foundation of China (No. 60403007).
K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 367–378, 2006. © Springer-Verlag Berlin Heidelberg 2006
Bob to get one of them at Bob's choice. Again, Bob does not want Alice to know which secret he chooses. 1-out-of-n OT is a natural extension of 1-out-of-2 OT to the case of n secrets. 1-out-of-n OT is also known as all-or-nothing disclosure of secrets (ANDOS) [1], in which Alice is not allowed to gain combined information of the secrets, such as their exclusive-or. Clearly, a 1-out-of-n SPIR is also a 1-out-of-n OT. The two concepts exist because of the different motivations for using these primitives (and the way they were historically defined). The early motivation of OT was to reduce intricate multi-party computation to simple cryptographic primitives. As SPIR is designed to retrieve entries from large databases, it is crucial to reduce the communication complexity and make it much less than n for a 1-out-of-n bit SPIR protocol. Traditionally, there are two models for PIR protocols: the multi-database model and the single-database model. In the former, there are ω > 1 copies of the database on different servers and the servers have unlimited computational power, but communication among them is not allowed. The best upper bound on communication in this model, $O(n^{\log\log\omega/(\omega\log\omega)})$, is due to Beimel et al. [3]. In the single-database model, it is assumed that the database server is computationally bounded and there is only one copy of the database. The first scheme in this model was proposed in [8], with its security based on the quadratic residuosity problem and with $O(\kappa N^{\epsilon})$ server-side communication complexity, where $\epsilon$ is any constant and κ is a security parameter. Stern proposed a SPIR protocol based on a semantically secure homomorphic public-key cryptosystem [14]. It has super-logarithmic total communication $O(\kappa\delta\log_\delta n\,\delta^{\sqrt{\log_\delta n}})$, where δ is the ciphertext expansion ratio of the underlying homomorphic cryptosystem. Cachin et al. [6] constructed PIR with polylogarithmic communication complexity $O(\kappa\log^{\geq 4} n)$ under the Φ-hiding assumption.
Based on the Paillier cryptosystem [12], Chang proposed PIR with communication complexity $O(\kappa 2^{d}\log n)$ on the server side and $O(\kappa d n^{1/d}\log n)$ on the user side [5], where d > 3 can be any integer. Based on the Damgård-Jurik public-key cryptosystems [7], Lipmaa proposed PIR with complexity $O(\kappa\log^{2} n)$ on the user side and $O(\kappa\log n)$ on the server side [9]. This is the best asymptotic result to date. Following the second guideline, we reduce the concrete communication and computation overhead of PIR/SPIR protocols. The main contributions of this paper include:
– The notion of polishing public-key cryptosystems and a general transformation from PIR to SPIR. Unlike existing schemes relying on inefficient zero-knowledge proofs, our transformation employs an efficient construction of OT and meets the goal of reducing the communication complexity of SPIR. The resulting SPIR has almost the same communication complexity as the underlying PIR.
– A novel efficient homomorphic public-key cryptosystem of independent interest. The new homomorphic cryptosystem enables one to encrypt messages of changeable size with fixed extension bits. Based on the proposed cryptosystem, efficient PIR/SPIR protocols are implemented.
The proposals outperform the state-of-the-art PIR/SPIR protocols and make PIR and SPIR applicable to large databases. For instance, to run a PIR or SPIR protocol with a database of $2^{35}$ 512-bit entries, the total communication is only about 134KB. The unavoidable computation cost that is linear in the size of the database falls on the server side, which often has powerful computational resources. The computation cost on the user side, which often has limited computational power, is logarithmic in the size of the database. Hence, the proposals are practical for private information retrieval from large databases. The rest of the paper is organized as follows. In Section 2, we review the security definition of PIR. Section 3 presents a general transformation from PIR to SPIR with almost the same communication complexity as the underlying PIR. A novel homomorphic public-key cryptosystem is proposed and efficient PIR/SPIR protocols are implemented in Section 4, followed by conclusions in the last section.
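The 134KB figure quoted above can be checked against the cost expressions given later in Section 4.2: about (d + 1)(λ log_d n + γ log_d n (log_d n + 1)/2) bits for the User and about λ + γ log_d n bits for the Server, with the concrete parameters γ = 1026, d = 32, λ = 512 and n = 2^35 used there. The following Python sketch is only an arithmetic check under those expressions; the function name and the choice to pass n and d as base-2 logarithms are illustrative assumptions, not part of the paper.

```python
def pir_communication_bits(log2_n, log2_d, lam, gamma):
    """Return (user_bits, server_bits) from the Section 4.2 cost expressions."""
    # Hypothetical helper: n and d are given as exact powers of two so that
    # log_d n is an integer and no floating-point logarithm is needed.
    if log2_n % log2_d != 0:
        raise ValueError("this sketch assumes n is an exact power of d")
    levels = log2_n // log2_d          # log_d n
    d = 2 ** log2_d
    # levels * (levels + 1) is always even, so the integer division is exact.
    user_bits = (d + 1) * (lam * levels + gamma * levels * (levels + 1) // 2)
    server_bits = lam + gamma * levels
    return user_bits, server_bits


user_bits, server_bits = pir_communication_bits(35, 5, lam=512, gamma=1026)
total_kb = (user_bits + server_bits) / 8 / 1000
print(user_bits, server_bits, round(total_kb, 1))
```

With these parameters the query is 1066296 bits (about 133KB) and the answer 7694 bits (about 1KB), which is consistent with the total of about 134KB stated in the text.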
2 Definitions of PIR/SPIR
In this section, we review the definition of 1-out-of-n bit PIR in [5], where each entry of the database is a single bit. The definition extends naturally to 1-out-of-n λ-bit PIR, in which each entry of the database is a λ-bit string. For an integer $a \in \mathbb{N}$, let [a] denote the set $\{1, \ldots, a\}$. We use the notation $a \leftarrow A$ to denote choosing an element a uniformly at random from the set A, and use PPT to denote probabilistic polynomial time. A function f is negligible in κ if for any polynomial p(·) there exists a $\kappa_0$ such that for all $\kappa > \kappa_0$ we have $f(\kappa) < 1/p(\kappa)$. Informally, a private information retrieval (PIR) scheme is an interactive protocol between two parties, a database Server and a User. The Server holds a database of n λ-bit strings $x = x_1 x_2 \cdots x_n$, where $x_k \in \{0,1\}^{\lambda}$ for $k \in [n]$, and the User holds an index $\imath \in [n]$. In its one-round version, the protocol consists of (1) a query sent from the User to the database, generated by an efficient randomized query algorithm taking as input the index $\imath$ and a random string $C(\imath)$; (2) an answer sent by the database, generated by an answer algorithm taking as input the query sent by the User and the database x; and (3) an efficient reconstruction function applied by the User, taking as input the index $\imath$, the random string $C(\imath)$, and the answer sent by the Server. At the end of the execution of the protocol, the following two properties must hold: (I) the User obtains the $\imath$-th λ-bit string $x_\imath$; and (II) a computationally bounded database does not receive any information about the index of the User. We now give a formal definition of a PIR scheme.
Definition 1. The single-database private information retrieval (PIR) is a protocol between two players, a Server, who has n λ-bit strings $x = x_1 x_2 \cdots x_n$ where $x_k$ is the k-th λ-bit string of x, and a User, who has a choice of index $\imath \in [n]$, that satisfies the following two properties:
– Correctness: If the User and the Server follow the protocol, the User learns $x_\imath$ and the Server sends fewer than nλ bits to the User.
– Choice Ambiguity: For any PPT algorithm $\mathcal{A}$ and any $j \in [n]$, the following value is negligible in the security parameter κ: $|\Pr[\mathcal{A}(1^{\kappa}; C(\imath)) = 1] - \Pr[\mathcal{A}(1^{\kappa}; C(j)) = 1]|$, where $C(\sigma)$ is the distribution of communication from the User induced by an index $\sigma \in [n]$.
An SPIR scheme is a PIR scheme satisfying an additional privacy property: choice insulation. Namely, a computationally bounded User does not learn any information about the bits other than the chosen ones. It can also be viewed as an OT protocol (e.g., [13], [1]) with communication overhead lower than the size of the database. As SPIR is designed to retrieve entries from large databases, it is crucial to reduce the communication complexity and make it much less than nλ for a 1-out-of-n λ-bit string SPIR protocol. We now give a formal definition of an SPIR scheme.
Definition 2. The single-database symmetrically private information retrieval (SPIR) is a protocol between two players, a Server, who has n λ-bit strings $x = x_1 x_2 \cdots x_n$ where $x_k$ is the k-th λ-bit string of x, and a User, who has a choice of index $\imath \in [n]$, that satisfies the following three properties:
– Correctness: If the User and the Server follow the protocol, the User learns $x_\imath$ and the Server sends fewer than nλ bits to the User.
– Choice Ambiguity: For any PPT algorithm $\mathcal{A}$ and any $j \in [n]$, the following value is negligible in the security parameter κ: $|\Pr[\mathcal{A}(1^{\kappa}; C(\imath)) = 1] - \Pr[\mathcal{A}(1^{\kappa}; C(j)) = 1]|$, where $C(\sigma)$ is the distribution of communication from the User induced by an index $\sigma \in [n]$.
– Choice Insulation: For any PPT algorithm $\mathcal{A}$ and any two databases of n λ-bit strings $x_1 x_2 \cdots x_n$ and $x'_1 x'_2 \cdots x'_n$ such that $x_\sigma = x'_\sigma$ for some $\sigma \in [n]$, the following value is negligible in the security parameter κ: $|\Pr[\mathcal{A}(1^{\kappa}; C(x_1, x_2, \ldots, x_n)) = 1] - \Pr[\mathcal{A}(1^{\kappa}; C(x'_1, x'_2, \ldots, x'_n)) = 1]|$, where $C(z_1, z_2, \ldots, z_n)$ is the distribution of communication from the Server induced by a database of n λ-bit strings $z = z_1 z_2 \cdots z_n$.
3 General Constructions
In this section, we rewrite the Lipmaa PIR scheme [9] with fewer restrictions on the underlying semantically secure homomorphic public-key encryptions. Subsequently, we transform it into SPIR without using zero-knowledge proofs, unlike most existing schemes.
3.1 General PIR Based on Homomorphic Public-Key Cryptosystems
Assume that the database has n λ-bit strings $x_1, x_2, \ldots, x_n$, where $n = d^{I}$ and d can be any constant. If $n < d^{I}$, one can append a string of $\lambda(d^{I} - n)$ zeroes to the database. Let the User's choice be $\imath = a_1 + a_2 d + \cdots + a_I d^{I-1}$, where $a_i \in [d]$ for $i \in [I]$. Let $E_{y_i}(\cdot) : \{0,1\}^{\lambda + \sum_{j=1}^{i-1}\gamma_j} \to \{0,1\}^{\lambda + \sum_{j=1}^{i}\gamma_j}$ for $i \in [I]$ be I semantically secure homomorphic public-key encryptions, where $\gamma_0 = 0$, $\gamma_i$ is the expansion length of the i-th encryption and $y_i$ is the corresponding public key. Denote the i-th decryption by $D_{s_i}$, where $s_i$ is the corresponding secret key. The Server and the User run the PIR protocol as follows.
– For $i \in [I]$, $j \in [d]$, the User computes $b_{i,j} = E_{y_i}(0)$ for $j \neq a_i$ and $b_{i,j} = E_{y_i}(1)$ for $j = a_i$. It sends $(b_{i,1}, \ldots, b_{i,d})$ as its query to the database Server.
– The Server does the following.
• Compute $x_{0,1} = x_1, \ldots, x_{0,n} = x_n$, $J_0 = n$.
• For $i = 1, \ldots, I$, compute $J_i = J_{i-1}/d$ and $x_{i,1} = \bigotimes_{\mu=1}^{d} b_{i,\mu}^{x_{i-1,\mu}}, \ldots, x_{i,J_i} = \bigotimes_{\mu=1}^{d} b_{i,\mu}^{x_{i-1,\mu+d(J_i-1)}}$.
• Return $x_{I,1}$ to the User.
– The User computes $\tilde{x}_\imath = D_{s_1}(D_{s_2}(\cdots(D_{s_I}(x_{I,1}))\cdots))$.
We first consider the correctness of the protocol. Assume that $\imath_0 = \imath = a_1 + a_2 d + \cdots + a_I d^{I-1} = a_1 + d\imath_1$. In the first iteration, $x_{1,j_1} = \bigotimes_{\mu=1}^{d} b_{1,\mu}^{x_{0,\mu+d(j_1-1)}} = E_{y_1}(x_{0,a_1+d(j_1-1)} \times 1 + \sum_{\mu \neq a_1} x_{0,\mu+d(j_1-1)} \times 0) = E_{y_1}(x_{0,a_1+d(j_1-1)})$ for $j_1 = 1, \ldots, J_1 = J_0/d$. Hence, by decrypting the $(\imath_1+1)$-th entry $x_{1,\imath_1+1}$ of the $J_1$ strings $\{x_{1,j_1}\}$, the User can extract $\tilde{x}_\imath = D_{s_1}(x_{1,\imath_1+1}) = x_{0,a_1+d\imath_1} = x_\imath$. Hence, the 1-out-of-$J_0$ PIR is reduced to a 1-out-of-$J_1$ PIR. By repeating the reduction I times, the User can extract $\tilde{x}_\imath = D_{s_1}(D_{s_2}(\cdots(D_{s_I}(x_{I,1}))\cdots)) = x_\imath$. Then we consider the choice ambiguity. In the protocol, the choice of the User is encoded as $(a_1, a_2, \ldots, a_I)$ and $a_i$ is encrypted as $(\underbrace{E_{y_i}(0), \ldots, E_{y_i}(0)}_{a_i-1}, E_{y_i}(1), \underbrace{E_{y_i}(0), \ldots, E_{y_i}(0)}_{d-a_i})$.
Since the encryption is semantically secure, the Server learns nothing about the choice of the User. Hence, the choice is ambiguous and we have the following result on security.
Theorem 1. The above PIR protocol is secure if the underlying public-key cryptosystem is homomorphic and semantically secure.
In the above PIR, the User needs $d\log_d n$ encryptions and $\log_d n$ decryptions. The Server needs $(dn - d^{2} + d - 1)/(d - 1)$ exponentiations. To query the database, the User needs $\lambda\log_d n + \sum_{i=1}^{\log_d n}(\log_d n - i + 1)\delta_i$ bits, and the Server needs $\lambda + \sum_{i=1}^{\log_d n}\delta_i$ bits to answer the query. The total communication complexity is about $\lambda(\log_d n + 1) + \log_d n(\log_d n + 3)\delta/2$, where $\delta = \max\{\delta_1, \ldots, \delta_{\log_d n}\}$. For a sufficiently large n and an appropriate parameter d, the total communication is $O(\log_d^{2} n)$ and less than nλ bits.
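The recursive selection of Section 3.1 can be illustrated with a small runnable sketch. It is not the scheme of [9] or of this paper: as an assumption, it uses a textbook Paillier encryption with toy primes (89 and 97 for the first level, 10007 and 10009 for the second) as the additively homomorphic E, takes d = 3 and I = 2, and indexes the digits a_i from 0 rather than from 1. All function names are hypothetical, and the parameters offer no security.

```python
import math
import random


def keygen(p, q):
    """Toy Paillier key: public modulus n, secret (lam, mu), generator n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    return n, (lam, pow(lam, -1, n))


def encrypt(n, m, rng):
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, sk, c):
    lam, mu = sk
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def make_query(keys, digits, d, rng):
    # One row of d ciphertexts per level: E(1) at the chosen digit, E(0) elsewhere.
    return [[encrypt(n, 1 if j == a else 0, rng) for j in range(d)]
            for (n, _), a in zip(keys, digits)]


def answer(db, query, moduli, d):
    # Each level folds every block of d entries into one ciphertext.
    level = list(db)
    for row, n in zip(query, moduli):
        n2 = n * n
        folded = []
        for start in range(0, len(level), d):
            acc = 1
            for mu in range(d):
                acc = (acc * pow(row[mu], level[start + mu], n2)) % n2
            folded.append(acc)
        level = folded
    return level[0]


def extract(keys, c):
    # Peel the layers in reverse order: D_1(D_2(c)).
    for n, sk in reversed(keys):
        c = decrypt(n, sk, c)
    return c


rng = random.Random(1)
d = 3
db = [11, 22, 33, 44, 55, 66, 77, 88, 99]           # n = d^2 entries, each < 89 * 97
keys = [keygen(89, 97), keygen(10007, 10009)]       # second modulus exceeds (89 * 97)^2
index = 5
query = make_query(keys, [index % d, index // d], d, rng)
reply = answer(db, query, [n for n, _ in keys], d)
print(extract(keys, reply))
```

The second-level modulus is chosen larger than the square of the first-level modulus so that a first-level ciphertext fits into the second-level plaintext space; this mirrors the growing plaintext lengths λ + Σγ_j in the construction.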
3.2 From PIR to SPIR
To convert PIR into SPIR, a popular method is to let the User run a zero-knowledge proof protocol to convince the Server of the validity of the query ([5], [8], [9]). However, this solution is inefficient in practice, since it adds a cost that is several times the complexity of the underlying PIR protocol. In the following, we propose a general transformation from PIR to SPIR by embedding an efficient OT protocol into the PIR. Our approach requires almost no additional communication overhead. To achieve a general construction of an efficient OT protocol, we introduce the notion of polishing public-key cryptosystems. Let $\lambda, \lambda_0$ be security parameters.
Definition 3. Let $F \leftarrow G(1^{\lambda})$, where $F_y(\cdot) : \{0,1\}^{\lambda_0} \to \{0,1\}^{\lambda}$ is a family of public-key encryptions, $y \in Y$ is the public key and Y is the public key space. Let $s \in S$ be the corresponding secret key satisfying $y = f(s)$ and S the secret key space, where f(·) is a one-way function. $F_y(\cdot)$ is a polishing public-key cryptosystem if for any PPT adversary $\mathcal{A}$, the following probability is negligible: $\Pr[F \leftarrow G(1^{\lambda}), s \leftarrow S, y_0 = f(s), y_1 \leftarrow Y, b \leftarrow \{0,1\}; \tilde{b} \leftarrow \mathcal{A}(1^{\lambda}, F, f, y_b) : \tilde{b} = b]$. The probability is taken over the coin flips of G and $\mathcal{A}$.
This notion is similar to the dense cryptosystem [15], in which the public key is uniformly distributed. However, in a polishing public-key cryptosystem, we only require that no PPT distinguisher can distinguish a correct public key from a random string in the public key space. In the following, we show how to achieve efficient 1-out-of-n OT and SPIR using polishing public-key cryptosystems. Let $H(\cdot) : \{0,1\}^{*} \to Y$ be a cryptographic hash function and $F \leftarrow G(1^{\lambda})$. $F_s^{-1}(\cdot)$ denotes the inverse of $F_y(\cdot)$. Assume that the User's secret choice is $\imath \in [n]$ and the Server has n $\lambda_0$-bit messages $m_k$ for $k \in [n]$. The OT protocol between the two parties is as follows.
– The User randomly selects $s \in S$ and computes $y = f(s)$. It sends the Server $y' = y \oplus H(\imath)$ as its query. Here ⊕ represents an efficient operation $Y \times Y \to Y$ such that there exists another efficient operation $\ominus : Y \times Y \to Y$ satisfying $\alpha \oplus \eta \ominus \eta = \alpha$ for any $\alpha, \eta \in Y$.
– The database Server computes $y_k = y' \ominus H(k)$ for $k \in [n]$. It returns $x_k = F_{y_k}(m_k)$ to the User.
– The User extracts $\tilde{m}_\imath = F_s^{-1}(x_\imath)$.
Clearly, the above protocol is an OT protocol. First, the User can decrypt $m_\imath$ since it knows the secret key of the $\imath$-th public key $y_\imath = y' \ominus H(\imath) = y \oplus H(\imath) \ominus H(\imath) = y = f(s)$. Second, the Server cannot determine the choice of the User, as the underlying public-key cryptosystem is polishing and the database Server cannot distinguish $y_\imath$ from $y_1, \ldots, y_n$. Finally, the User cannot learn any information about the messages other than the chosen one from the ciphertexts sent by the Server, since it does not know the corresponding secret keys. Hence, we have the following result.
Lemma 1. If there exists a polishing public-key cryptosystem, then there is a one-round 1-out-of-n oblivious transfer with complexity O(n).
The protocol is a generalization of the OT protocol in [16], which is a special case based on the ElGamal cryptosystem. The construction allows a wider range of efficient public-key cryptosystems to be used to build efficient OT protocols, for instance, OT from the NTRU cryptosystem. From the viewpoint of SPIR, the above protocol is unsatisfactory since the communication complexity is O(n). However, to enable the User to symmetrically retrieve the message $m_\imath$ from the Server's database of n $\lambda_0$-bit messages $m_k$ for $k \in [n]$, we can embed the above OT protocol into our PIR scheme and achieve an efficient SPIR scheme: the Server does not need to return all the ciphertexts $x_k = F_{y_k}(m_k)$ for $k \in [n]$ directly to the User. It can compress the n ciphertexts into one ciphertext $x_{I,1}$ as shown in Section 3.1, which enables the User to obtain $x_\imath$ and then decrypt $m_\imath$. The detailed transformation is as follows.
– The User randomly selects $s \in S$, and computes $y = f(s)$, $y' = y \oplus H(\imath)$. For $i \in [I]$, $j \in [d]$, the User computes $b_{i,j} = E_{y_i}(0)$ for $j \neq a_i$ and $b_{i,j} = E_{y_i}(1)$ for $j = a_i$. It sends $y'$, $(b_{i,1}, \ldots, b_{i,d})$ as its query to the Server.
– The database Server does the following.
• Compute $y_k = y' \ominus H(k)$, $x_k = F_{y_k}(m_k)$ for $k \in [n]$.
• Compute $x_{0,1} = x_1, \ldots, x_{0,n} = x_n$, $J_0 = n$.
• For $i = 1, \ldots, I$, compute $J_i = J_{i-1}/d$ and $x_{i,1} = \bigotimes_{\mu=1}^{d} b_{i,\mu}^{x_{i-1,\mu}}, \ldots, x_{i,J_i} = \bigotimes_{\mu=1}^{d} b_{i,\mu}^{x_{i-1,\mu+d(J_i-1)}}$.
• Respond with $x_{I,1}$ to the User.
– The User computes $\tilde{x}_\imath = D_{s_1}(D_{s_2}(\cdots(D_{s_I}(x_{I,1}))\cdots))$ and extracts $\tilde{m}_\imath = F_s^{-1}(\tilde{x}_\imath)$.
Clearly, the above protocol is a SPIR with communication $O(\log^{2} n)$. Indeed, compared with the underlying PIR protocol, the SPIR protocol requires only one additional element of the public key space Y. It is much more efficient than schemes relying on zero-knowledge proofs, which add a cost that is several times the complexity of the underlying PIR protocol. From Lemma 1 and Theorem 1, we have the following result.
Theorem 2. If there exist a polishing public-key cryptosystem and a semantically secure homomorphic public-key cryptosystem, then there exists a one-round 1-out-of-n SPIR protocol with communication complexity $O(\log^{2} n)$.
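The OT step that the transformation embeds can be sketched in Python for the ElGamal special case mentioned after Lemma 1. The instantiation is an assumption made for illustration only: f(s) = g^s mod p, ⊕ is multiplication modulo p and ⊖ is multiplication by the modular inverse. The prime 2^127 − 1, the base 3, the SHA-256 based H and all function names are hypothetical choices, and the parameters offer no real security. The sketch returns all n ciphertexts, i.e. it is the O(n) protocol of Lemma 1 before compression by the PIR.

```python
import hashlib
import random

P_MOD = 2**127 - 1     # a Mersenne prime, used here as a toy group modulus
G = 3                  # toy base


def H(k):
    # Hash an index into {1, ..., p - 1}, standing in for H : {0,1}* -> Y.
    digest = hashlib.sha256(str(k).encode()).digest()
    return int.from_bytes(digest, "big") % (P_MOD - 1) + 1


def user_query(choice, rng):
    s = rng.randrange(1, P_MOD - 1)
    y = pow(G, s, P_MOD)                              # y = f(s)
    return s, (y * H(choice)) % P_MOD                 # y' = y (+) H(choice)


def server_answer(y_masked, messages, rng):
    answers = []
    for k, m in enumerate(messages, start=1):
        y_k = (y_masked * pow(H(k), -1, P_MOD)) % P_MOD   # y_k = y' (-) H(k)
        r = rng.randrange(1, P_MOD - 1)
        # ElGamal encryption of m_k under the derived key y_k.
        answers.append((pow(G, r, P_MOD), (m * pow(y_k, r, P_MOD)) % P_MOD))
    return answers


def user_extract(s, choice, answers):
    c1, c2 = answers[choice - 1]
    return (c2 * pow(pow(c1, s, P_MOD), -1, P_MOD)) % P_MOD


rng = random.Random(7)
messages = [1001, 2002, 3003, 4004]                   # entries must lie in [1, p - 1]
s, y_masked = user_query(3, rng)
answers = server_answer(y_masked, messages, rng)
print(user_extract(s, 3, answers))
```

Only the ciphertext at the chosen index is encrypted under a key whose secret exponent the User knows; for every other index the derived key is y·H(ı)/H(k), for which the User holds no secret key.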
4 Implementation Issues
4.1 A Novel Homomorphic Cryptosystem
In this section, we propose a novel homomorphic cryptosystem. It can be viewed as an extension of the Okamoto-Uchiyama cryptosystem [11] and enjoys similar security properties. However, our scheme is much more efficient and has the
additional useful property of enabling one to encrypt messages of changeable size with fixed extension bits. The new cryptosystem relies on the difficulty of factorizing $N = P^{t}Q$, where P and Q are ρ-bit primes. To the best of our knowledge, the most efficient algorithm to factorize $N = P^{t}Q$ runs in time $2^{k^{1-\epsilon}+O(\log\rho)}$ [2], where $t = k^{\epsilon}$. This algorithm, due to Boneh et al. [2], requires only polynomial space in log N. When $\epsilon = 1/2$, the algorithm asymptotically performs better than the Elliptic Curve Method (ECM). However, we can always set the parameters k and $\epsilon$ appropriately so that the running time is about $2^{80}$, which is beyond current computational power. Let P be a prime, t > 0 an integer and $\Gamma = \{x \mid x = 1 \bmod P^{t-1} \wedge x \in \mathbb{Z}^{*}_{P^{t}}\}$. Note that $\mathbb{Z}^{*}_{P^{t}}$ is a cyclic group of order $P^{t-1}(P-1)$ and then #Γ = P. For any $x \in \Gamma$, define $L(x) = (x-1)/P^{t-1}$. Clearly, L(x) is well-defined and has the following homomorphic property.
Lemma 2. For $a, b \in \Gamma$, $L(ab) = L(a) + L(b) \bmod P^{t-1}$. If $x \in \Gamma$ satisfies $L(x) \neq 0$ and $y = x^{m} \bmod P^{t}$ for $m \in \mathbb{Z}_{P^{t-1}}$, then $m = L(y)/L(x) = (y-1)/(x-1) \bmod P^{t-1}$.
Proof. From the definition of L, we have $L(ab) = (ab-1)/P^{t-1} = (a-1)(b-1)/P^{t-1} + (a-1)/P^{t-1} + (b-1)/P^{t-1} = L(a)(b-1) + L(a) + L(b)$. Note that $(b-1) = 0 \bmod P^{t-1}$. It follows that $L(ab) = L(a) + L(b) \bmod P^{t-1}$. Then $m = mL(x)/L(x) = L(x^{m})/L(x) = L(y)/L(x) = (y-1)/(x-1) \bmod P^{t-1}$. This completes the proof.
Let $N = P^{t}Q$ where P, Q are strong primes and gcd(P − 1, Q) = 1, gcd(P, Q − 1) = 1, t > 1. Assume that $P, Q \in \{2^{\rho}, 2^{\rho}+1, \ldots, 2^{\rho+1}\}$, where ρ is a security parameter, and let γ = 2ρ + 2, λ = (t − 1)ρ. We randomly select an integer $g \in \mathbb{Z}^{*}_{N}$ such that the order of $g_1 = g^{P-1} \bmod P^{t}$ is $P^{t-1}$ and the order of g mod N is $P^{t-1}(P-1)(Q-1)/2$. The public key is (N, g, λ). The private key is P.
Encryption: To encrypt a message $m \in \{0,1\}^{\lambda}$, one randomly selects $r \in \mathbb{Z}_N$ and computes $C = g^{m+rN} \bmod N$. The ciphertext is C.
Decryption: Given a ciphertext $C \in \mathbb{Z}^{*}_{N}$, compute $c = C^{P-1} = g_1^{m} \bmod P^{t}$ and $m = L(c)/L(g_1) \bmod P^{t-1}$. For the infeasibility of inverting the encryption function, we have the following result.
Theorem 3. Inverting the encryption function of our scheme is infeasible if and only if it is infeasible to factorize $N = P^{t}Q$.
Proof. Clearly, if there exists a PPT algorithm that factorizes N with non-negligible probability, then with the same probability one can run this algorithm first and then invert the encryption function using the decryption algorithm. The time to recover m is polynomial. Now we assume that our scheme is insecure, i.e., there exists a PPT adversary A which can compute m from C with non-negligible probability. We will construct
a PPT algorithm B using A as a black box to factorize N with non-negligible probability as follows. B randomly selects $g \leftarrow \mathbb{Z}^{*}_{N}$. As P, Q are strong primes and gcd(P − 1, Q) = 1, gcd(P, Q − 1) = 1, the order of $g_1 = g^{P-1} \bmod P^{t}$ is $P^{t-1}$ and the order of g mod N is $P^{t-1}(P-1)(Q-1)/2$ with overwhelming probability, so (N, g, λ) is a correct public key with the same probability. B then randomly selects $u \leftarrow \mathbb{Z}_N$ and computes $C = g^{u} \bmod N$. We prove that C is a correct ciphertext with non-negligible probability. Let the order of g mod $P^{t}$ be $P^{t-1}P'$ and the order of g mod Q be $Q'$, where $P' \mid (P-1)$, $Q' \mid (Q-1)$. The distribution of C can be represented by $(u_1, u_2)$, where $u_1 = u \bmod P^{t-1}$, $u_2 = u \bmod \mathrm{lcm}(P', Q')$, $\gcd(N, \mathrm{lcm}(P', Q')) = 1$. Similarly, the distribution of C mod N can be represented by $(v_1, v_2)$, where $v_1 = v \bmod P^{t-1}$, $v_2 = v \bmod \mathrm{lcm}(P', Q')$ such that $C = g^{v} = g^{m+rN} \bmod N$. When $u_1$ and $v_1$ are fixed, the distributions of $u_2$ and $v_2$ are statistically close. As $v_1$ is uniformly distributed in $\{0,1\}^{\lambda}$ and $u_1$ is uniformly distributed in $\mathbb{Z}_{P^{t-1}}$, where $\{0,1\}^{\lambda} \subset \mathbb{Z}_{P^{t-1}} \subset \{0,1\}^{\lambda+1}$, C lies in the set of correct ciphertexts with probability at least 1/2. Let A output $0 \le m \le 2^{\lambda} < P^{t-1}$ for the forged ciphertext C. Then m satisfies $m = u \bmod P^{t-1}$. If $u > P^{t-1}$ (which holds with overwhelming probability, as u is randomly distributed in $\mathbb{Z}_N$), then u − m is a multiple of $P^{t-1}$. It follows that gcd(N, u − m) is $P^{t-1}$, $P^{t}$ or $P^{t-1}Q$. In each case, there are polynomial-time algorithms to find the factors P and Q. Hence, with the help of A, B can efficiently factorize N with non-negligible probability. This completes the proof. The semantic security of the scheme relies on the following Decisional N-Subgroup Assumption. This assumption is related to the P-subgroup assumption in [11] and the Decisional Composite Residuosity Assumption in [12]. Definition 4. Decisional N-Subgroup Assumption.
Let G(·) be a generator for our scheme such that $(N, g, \lambda) \leftarrow G(1^{\lambda})$ is a public key as defined above. For any PPT adversary $\mathcal{A}$, the following value is negligible in λ: $|\Pr[(N, g, \lambda) \leftarrow G(1^{\lambda}), r \leftarrow \mathbb{Z}_N, b \leftarrow \{0,1\}, A = g^{b+rN} \bmod N; \tilde{b} \leftarrow \mathcal{A}(1^{\lambda}, N, A) : \tilde{b} = b] - 0.5|$. The probability is taken over the coin flips of G and $\mathcal{A}$.
Theorem 4. The above cryptosystem is semantically secure against chosen-plaintext adversaries if and only if the Decisional N-Subgroup Assumption holds.
Proof. Assume that the Decisional N-Subgroup Assumption does not hold, that is, a PPT algorithm A can distinguish E(0) and E(1) with non-negligible probability, where E(·) denotes the encryption defined above. Given $\{m_0, m_1\}$ and C = E(m) where $m \in \{m_0, m_1\}$, we will construct a PPT algorithm B using A as a subroutine to distinguish $E(m_0)$ and $E(m_1)$. B randomly selects $\alpha \leftarrow \mathbb{Z}_N$ and computes $C' = C/g^{m_0} \bmod N$, $g' = g^{(m_1-m_0)-\alpha N} \bmod N$. With non-negligible probability, $\gcd((m_1-m_0)-\alpha N, P^{t-1}(P-1)(Q-1)/2) = 1$. Hence, with the same probability, the distributions of g and g′ are statistically close, and $\beta/((m_1-m_0)-\alpha N) \bmod \mathrm{lcm}(P-1, Q-1)$ is defined for a random
integer β. Denote the encryption with (N, g′) by E′(·). Therefore, when $C = E(m_1)$, $C' = g^{m_1-m_0}g^{rN} = g'^{(m_1-m_0)/((m_1-m_0)-\alpha N)}g^{rN} = g'^{1+((r+\alpha)/((m_1-m_0)-\alpha N))N} \bmod N = E'(1)$, if $(r+\alpha)/((m_1-m_0)-\alpha N) \bmod \mathrm{lcm}(P-1, Q-1)$ is defined. When $C = E(m_0)$, we have $C' = g^{m_0-m_0}g^{rN} = g'^{0+(r/((m_1-m_0)-\alpha N))N} \bmod N = E'(0)$, if $r/((m_1-m_0)-\alpha N) \bmod \mathrm{lcm}(P-1, Q-1)$ is defined. B runs A with (N, g′, C′) as inputs and obtains the answer whether C′ is E′(0) or E′(1), which immediately implies whether C is $E(m_0)$ or $E(m_1)$. Conversely, assume that our scheme is insecure. Then there is a PPT algorithm A that distinguishes $E(m_0)$ and $E(m_1)$ with non-negligible probability. We will construct a PPT algorithm B using A as a subroutine to break the N-subgroup assumption. Let C be either E(0) or E(1). B randomly selects $\alpha \leftarrow \mathbb{Z}_N$ and computes $C' = g^{m_0+\alpha N}C^{(m_1-m_0)} \bmod N$. If C = E(0), then $C' = E(m_0)$. If C = E(1), then $C' = E(m_1)$. B runs A with (N, g, C′) as inputs and obtains the answer whether C′ is $E(m_0)$ or $E(m_1)$, which immediately implies whether C is E(0) or E(1). This completes the proof.
Clearly, the above encryption is homomorphic. Let ρ = . The expansion rate of the scheme is 2/(t + 1). In the case that t = 2, it is the Okamoto-Uchiyama cryptosystem [11]. For t > 2, our extension is more efficient than the original Okamoto-Uchiyama cryptosystem. Furthermore, by keeping ρ fixed and increasing t, one obtains a series of homomorphic encryptions of changeable message length with the same expansion of γ = 2ρ + 2 bits. It is also more efficient than the schemes in [7] with the similar property.
4.2 Implementation of PIR
Let us assume the same settings as in Section 2.2. Following the general construction, the PIR protocol is implemented as follows.

– The User randomly generates $I$ public keys $(N_1, g_1, \lambda), (N_2, g_2, \lambda + \gamma), \cdots, (N_I, g_I, \lambda + (I - 1)\gamma)$, where $N_i$ is generated as above. Denote the corresponding decryption procedures by $D_i(\cdot)$ for $i \in [I]$. For $i \in [I]$, $j \in [d]$, the User randomly selects $r_{i,j} \in Z_{N_i}$ and computes $b_{i,j} = g_i^{\xi_{i,j} + r_{i,j} N_i} \bmod N_i$, where $\xi_{i,j} = 0$ for $j \neq a_i$ and $\xi_{i,j} = 1$ for $j = a_i$. It sends $(N_i, b_{i,1}, \cdots, b_{i,d})$ as its query to the Server.
– The database Server does the following.
  • Compute $x_{0,1} = x_1, \cdots, x_{0,n} = x_n$, $J_0 = n$.
  • For $i \in [I]$, $j = 1, \cdots, n/d^i$, compute $x_{i,j} = \prod_{\mu=1}^{d} b_{i,\mu}^{x_{i-1,\mu+d(j-1)}}$.
  • Return $c_I = x_{I,1}\, g_I^{r N_I} \bmod N_I$ to the User, where $r \in_R Z_{N_i}$.
– The User extracts $x_{\tilde{\imath}} = D_1(D_2(\cdots(D_I(c_I))\cdots))$.

The User needs about $(d + 1)(\lambda \log_d n + \gamma \log_d n (\log_d n + 1)/2)$ bits. The Server needs about $\lambda + \gamma \log_d n$ bits. For the User, the most time-consuming job is to generate the $\log_d n$ public keys. Note that the public keys are reusable, and the User can generate sufficiently many public keys before the protocol is
Privately Retrieve Data from Large Databases
run. After this pre-computation, the User requires about $(d + 1) \log_d n$ $(\lambda + i\gamma)$-bit modular exponentiations. The Server needs about $(n - 1)/(d - 1)$ $d$-base $(\lambda + i\gamma)$-bit modular exponentiations.

We now analyze the practicality of the protocol with concrete parameters. Let $\gamma = 1026$, $d = 32$, $\lambda = 512$, $n = 2^{35}$, so that $\log_d n = 7$. That is, a User privately retrieves one string from a large database Server holding $2^{35}$ 512-bit strings. The largest $t$ is $8 = 512^{1/3}$ and = 1/3. The running time to factorize $N$ is about $2^{73}$. In this scenario, the User needs about 133KB, since $(d + 1)(\lambda \log_d n + \gamma \log_d n(\log_d n + 1)/2) = 33 \cdot (3584 + 28728)$ bits, and the Server needs about 1KB, since $\lambda + \gamma \log_d n = 7694$ bits. The User needs about 231 modular exponentiations, that is, $(d + 1)\log_d n = 33 \cdot 7$. The Server needs about $2^{35}$ modular exponentiations. This computation is heavy and unavoidable. However, in practice, the Server often has scalable computational power, and hence the load is bearable.

4.3
Implementation of SPIR
Assume that the database has $n$ $\lambda$-bit strings $m_1, m_2, \cdots, m_n$, where $n = d^I$ and $d$ can be any constant. The User's choice is $\imath \in [n]$. Let $G = \langle g \rangle$ be a group of a -bit large prime order in which the discrete logarithm is difficult, and let $g$ be a generator of $G$. Let $H(\cdot) : \{0, 1\}^* \rightarrow G$ be a cryptographic hash function.

First, the User randomly selects s ∈ {0, 1} and computes $y = (g \oplus H(\imath))^s$. Here, $\oplus$ means the group operation, and we denote its reverse by $\ominus$. The User sends the Server $y$ and $(N_i, b_{i,1}, \cdots, b_{i,d})$, as in the PIR protocol of the above section. The database Server selects a random integer r ∈ {0, 1} and computes $x_k = m_k \oplus (g \oplus H(k))^r$ for $k \in [n]$. Then it runs the PIR protocol in Section 2.2. Finally, it returns $z = y^r$ and $x_{I,1}$ to the User. The User extracts $x_{\tilde{\imath}} = D_1(D_2(\cdots(D_I(x_{I,1}))\cdots))$ and then decrypts $m_{\tilde{\imath}} = x_{\tilde{\imath}} \ominus z^{1/s}$.

Compared with the underlying PIR protocol, the above SPIR requires only the additional bits introduced by $y$ and $z$. As $x_k = m_k \oplus (g \oplus H(k))^r$ can be pre-computed before the query from the User, the SPIR protocol has almost the same online complexity as the underlying PIR protocol. Hence the SPIR scheme is also practical for large database retrievals.
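The blinding layer that turns PIR into SPIR can be sketched in a few lines. This is a toy illustration, not the authors' implementation. It assumes G is the subgroup of squares modulo a small safe prime, the hash H is replaced by a hypothetical `hash_to_group` helper, messages are already group elements, and the PIR retrieval itself is simulated by a direct list lookup. The parameters are far too small for any real use.

```python
import hashlib
import secrets

# Toy parameters (illustrative assumptions only): P is a safe prime,
# Q = (P - 1) / 2 is prime, and G is the order-Q subgroup of squares mod P.
P = 2039
Q = 1019
g = 4  # a square different from 1, so it generates the order-Q subgroup


def hash_to_group(k: int) -> int:
    """Hypothetical stand-in for H: map an index to a square modulo P."""
    digest = hashlib.sha256(str(k).encode()).digest()
    base = int.from_bytes(digest, "big") % (P - 1) + 1  # value in [1, P-1]
    return pow(base, 2, P)


def group_inverse(a: int) -> int:
    """Inverse of a modulo the prime P (Fermat's little theorem)."""
    return pow(a, P - 2, P)


def user_query(choice: int):
    """User side: pick a secret s and send y = (g * H(choice))^s."""
    s = secrets.randbelow(Q - 1) + 1  # s in [1, Q-1]
    y = pow(g * hash_to_group(choice) % P, s, P)
    return s, y


def server_mask(messages, y):
    """Server side: mask every m_k with (g * H(k))^r and return z = y^r."""
    r = secrets.randbelow(Q - 1) + 1
    masked = [
        m * pow(g * hash_to_group(k) % P, r, P) % P
        for k, m in enumerate(messages, start=1)
    ]
    return masked, pow(y, r, P)


def user_unmask(x_choice: int, z: int, s: int) -> int:
    """User side: remove the mask z^(1/s) = (g * H(choice))^r."""
    s_inv = pow(s, Q - 2, Q)  # 1/s modulo the prime group order Q
    mask = pow(z, s_inv, P)
    return x_choice * group_inverse(mask) % P


def demo(choice: int = 3):
    messages = [pow(v, 2, P) for v in (3, 5, 7, 11)]  # group elements
    s, y = user_query(choice)
    masked, z = server_mask(messages, y)
    x_choice = masked[choice - 1]  # stands in for the PIR retrieval
    return user_unmask(x_choice, z, s), messages[choice - 1]
```

Calling `demo(3)` returns the recovered message together with the original third message, and the two agree. The other masked entries stay blinded because their masks depend on the Server's secret r and on H(k) for a different index k.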
5
Conclusions
Private information retrieval is a useful cryptographic primitive. It implies other well-known cryptographic primitives, such as the existence of one-way functions, oblivious transfer, and multi-party computation. It can also be used directly in applications such as medical database retrievals, digital bank transactions, and so on. In this paper, we proposed a general and efficient transformation from PIR to SPIR that does not rely on zero-knowledge proofs. The proposals are implemented efficiently. The PIR/SPIR schemes are applicable to secure retrievals from large databases. As a contribution of independent interest, we also presented a novel and efficient homomorphic public-key cryptosystem. It can encrypt messages of changeable size while the number of expansion bits stays constant.
References
1. G. Brassard, C. Crépeau, J.-M. Robert. All-or-Nothing Disclosure of Secrets. In Proc. of Crypto'86, LNCS 263, pp. 234-238, Springer-Verlag, 1987.
2. D. Boneh, G. Durfee, and N. Howgrave-Graham. Factoring N = p^r q for large r. In Proc. of Crypto'99, LNCS 1666, pp. 326-337, Springer-Verlag, 1999.
3. A. Beimel, Y. Ishai, E. Kushilevitz, and J.-F. Raymond. Breaking the O(n^{1/(2k-1)}) barrier for information-theoretic private information retrieval. In Proc. of the 43rd IEEE Sym. on Found. of Comp. Sci., 2002.
4. B. Chor, O. Goldreich, E. Kushilevitz, and M. Sudan. Private Information Retrieval. In Proc. of 36th FOCS, 1995.
5. Y. Chang. Single Database Private Information Retrieval with Logarithmic Communication. In Proc. of ACISP'04, LNCS 3108, pp. 50-61, Springer-Verlag, 2004.
6. C. Cachin, S. Micali, and M. Stadler. Computational Private Information Retrieval with Polylogarithmic Communication. In Proc. of Eurocrypt'99, LNCS 1592, pp. 402-414, Springer-Verlag, 1999.
7. I. Damgård, M. Jurik. A Generalisation, a Simplification and Some Applications of Paillier's Probabilistic Public-Key System. In Proc. of PKC'01, LNCS 1992, pp. 119-136, Springer-Verlag, 2001.
8. E. Kushilevitz and R. Ostrovsky. Replication is not needed: single database, computationally-private information retrieval. In Proc. of FOCS'97, pp. 364-373.
9. H. Lipmaa. An Oblivious Transfer Protocol with Log-Squared Communication. In Proc. of ISC'05, LNCS 3650, pp. 314-328, Springer-Verlag, 2005.
10. S. K. Mishra, P. Sarkar. Symmetrically Private Information Retrieval. In Proc. of Indocrypt'00, LNCS 1977, pp. 225-236, Springer-Verlag, 2000.
11. T. Okamoto, S. Uchiyama. A New Public-Key Cryptosystem as Secure as Factoring. In Kaisa Nyberg, editor, Proc. of Eurocrypt'98, LNCS 1403, pp. 308-318, Springer-Verlag, 1998.
12. P. Paillier. Public-Key Cryptosystems Based on Composite Degree Residuosity Classes. In Proc. of Eurocrypt'99, LNCS 1592, pp. 223-238, Springer-Verlag, 1999.
13. M. Rabin. How to Exchange Secrets by Oblivious Transfer. Technical Report TR81, Aiken Computation Laboratory, Harvard University, 1981.
14. J. P. Stern. A New and Efficient All-or-nothing Disclosure of Secrets Protocol. In Proc. of Asiacrypt'98, LNCS 1514, pp. 357-371, Springer-Verlag, 1998.
15. A. De Santis and G. Persiano. Zero-Knowledge Proofs of Knowledge Without Interaction. In Proc. of FOCS'92, pp. 427-436, IEEE Press, 1992.
16. W. Tzeng. Efficient 1-out-of-n Oblivious Transfer Schemes. In Proc. of PKC'02, LNCS 2274, pp. 159-171, Springer-Verlag, 2002.
An Empirical Study of Quality and Cost Based Security Engineering
Seok Yun Lee1, Tai-Myung Chung1, and Myeonggil Choi2,*
1 School of Information and Communication Engineering, Natural Science Campus, Sungkyunkwan University, 300 Cheoncheon-dong, Jangan-gu, Suwon-si, Geonggi-do, 440-746, Korea
2 Department of Systems Management Engineering, INJE University, 607 Obang-dong, Gimhae, Gyeongnam, 621-749, Korea
Abstract. To ensure the reliability and confidentiality of information security systems, many organizations have adopted security engineering methodologies. A security institution in Korea ran into problems with the effectiveness of security engineering. To solve these problems, the institution created a security methodology called ISEM and a tool called SENT. This paper presents the ISEM methodology, which considers both product assurance and production processes and offers advantages in terms of quality and cost. ISEM can compensate for the shortcomings of current security engineering methodologies. To support ISEM, the SENT tool, which operates over the Internet, automatically supports the production processes and the product assurances that ISEM demands.
1 Introduction

Many organizations have invested considerable resources to increase the reliability and confidentiality of information security systems. As part of the effort to obtain high-quality information security systems, security engineering methodologies such as CC, ITSEC, SSE-CMM and SPICE have been introduced [8,11]. Security engineering methodologies can be divided into two approaches according to the object being assured. The first is the product assurance approach and the second is the production process approach. The product assurance approach focuses on the assurance of products by evaluating the functions and assurances of information security systems. CC (Common Criteria), ITSEC (Information Technology Security Evaluation Criteria) and TCSEC (Trusted Computer Security Evaluation Criteria) belong to the product assurance approach. Although the product assurance approach can assure high quality, it incurs high costs and long engineering periods. The production process approach shifts the focus from assuring products to assuring production processes. SSE-CMM (System Security Engineering-Capability Maturity Model),
* Corresponding author.
K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 379 – 389, 2006. © Springer-Verlag Berlin Heidelberg 2006
SPICE, and ISO 9000-3 (Guidelines for the development, supply and maintenance of software) belong to the production process approach. Although the cost and period of the production process approach are lower than those of the product assurance approach, its assurance level is also lower. The product assurance approach has frequently been used in developing highly reliable information systems. To reduce the high engineering costs, many organizations have sought a cost-effective security engineering methodology. By nature, the two security engineering approaches can complement each other [4]. This paper presents a security engineering methodology, and a tool supporting the methodology, with which a security research institution in Korea has tried to resolve the trade-off between cost and quality. The institution created ISEM (High Secure Engineering Methodology), which assures both products and production processes. To support ISEM, SENT (Secure Engineering Tool) was developed. ISEM can make up for shortcomings of product assurance approaches such as CC, ITSEC, and TCSEC, and can reflect the advantages of production process approaches such as SSE-CMM and SPICE. SENT directs the users participating in engineering to follow all the processes and to describe all the assurances that ISEM demands.
2 Review of Security Engineering Methodology

This section briefly reviews the product assurance approach and the production process approach. In the early 1980s, TCSEC was developed in the United States. TCSEC was primarily applied to engineering trusted ADP (automatic data processing) systems. It was used to evaluate information security systems and the acquisition specifications of information security systems in public institutions. TCSEC has two distinct requirement sets: (1) security functional requirements and (2) assurance requirements. The security functional requirements encompass the capabilities typically found in information processing systems employing general-purpose operating systems. General-purpose operating systems are distinct from application programs. However, specific security functional requirements can also be applied to specific systems that have their own functional requirements, applications or special environments. The assurance requirements, on the other hand, can be applied to systems that cover the full range of computing environments, from dedicated controllers to multilevel secure systems [2].

ITSEC is a European-developed set of criteria filling a role roughly equivalent to that of TCSEC. While ITSEC and TCSEC have many similar requirements, there are some important distinctions. ITSEC tends to place emphasis on integrity and availability, and attempts to provide a uniform approach to evaluating both products and systems. Like the production process approach, ITSEC also introduces a distinction between doing the right job (effectiveness) and doing the job right (correctness). To do so, ITSEC allows less restricted collections of requirements for a system, at the expense of more complete and less comparable ratings [3].

CC is the outcome of a series of efforts to develop IT security evaluation criteria that can be broadly accepted within the international community. The sponsoring organizations of TCSEC and ITSEC pooled their efforts and began a joint activity to align their separate criteria into a single set of IT security criteria. CC has security functional requirements and security assurance requirements, and it defines 7 Evaluation Assurance Levels. CC has been standardized as ISO/IEC 15408 [6, 7].
Fig. 1. SSE-CMM consists of three domains such as risk process, engineering process and assurance process
ISO 9000-3, SPICE, and SSE-CMM are security engineering methodologies focusing on quality and controls in the production process [1, 9, 11]. SSE-CMM is based on SE-CMM. To handle the special principles of information system security engineering, SE-CMM was interpreted with respect to the information security area, and new domains of production process and practices were identified. As fig. 1 shows, SSE-CMM consists of three domains: the risk process, the engineering process and the assurance process. In the risk process, the risks of products and services are identified and prioritized. In the engineering process, solutions to manage the risks are suggested. In the assurance process, the assurance rationale is submitted to customers [5].
3 ISEM Methodology

A security research institution in Korea had adopted a production process methodology to develop highly reliable information security systems. To increase the reliability and confidentiality of the information security systems, the institution shifted its focus from assuring the quality of the production process to assuring the products themselves. After this shift, the institution faced increased costs and prolonged periods in engineering the information security systems. To solve the problem, the institution sought to take advantage of both the production process approach and the product assurance approach. The institution created a methodology called ISEM and a tool called SENT to resolve the trade-off between product quality and engineering cost in developing information security systems. ISEM accepts the advantages of the two approaches, so that it focuses on both production process and product assurance. As fig. 2 shows, ISEM consists of a design stage and three stages of developing prototypes. ISEM adopts the three main production processes of SSE-CMM: the assurance process, the risk process and the engineering process. The difference between ISEM and the two security engineering approaches lies in the granularity level. A production process approach such as SSE-CMM adopts the same granularity level of security engineering process throughout an enterprise. The
Fig. 2. ISEM Methodology consists of four stages
product assurance approach such as CC and ITSEC demands a granularity level that depends on the product rating. ISEM, in contrast, raises the granularity level as engineering proceeds through the four stages. Consequently, the granularity level of the production process increases from the 1st stage to the 4th stage. A rating in CC, TCSEC or ITSEC can be regarded as the counterpart of a stage in ISEM. ISEM demands a different assurance level in each stage. The reliability of the information security systems is guaranteed through assurances in the form of documents. The assurance level of ISEM is lower than that of CC and TCSEC: ISEM requires the developers and the evaluators to describe only the essential items in the documents. Not all information security systems need to be engineered from the 1st stage to the 4th stage. The engineering stages are decided according to the reliability demanded. Highly reliable information security systems should be engineered in all four stages, whereas less reliable information security systems can be engineered in two or three stages.

3.1 The 1st Stage in ISEM

In the 1st stage, the developer designs the information security systems at a conceptual level. As fig. 3 shows, the conceptual design should assure the reliability of the design. The 1st stage consists of production processes, product assurances and evaluation. The production processes that the developer should observe are as follows: 1) surveying current trends in information security systems, 2) analyzing user requirements, 3) designing security mechanisms, 4) specifying target systems, and 5) designing the information security systems at a conceptual level. The developer should survey the technical trends that bear on the target systems and analyze the user's requirements and security environments. Based on the survey and the analysis, the developer should develop cipher algorithms and security protocols and specify the target information security systems. After completing these procedures, the developer should specify the target systems. The specification of the target systems includes risk analysis, user requirements, functional requirements, and assurance requirements. The developers
Fig. 3. The production processes in the 1st stage
are then able to design the information security systems at a conceptual level. To manage the configuration of the 1st stage, the developer should describe the systems sub-system by sub-system. For product assurance, the developer should document the activities carried out in the production processes. The documents to be written are as follows: 1) the note of the technical survey, 2) the analysis of user requirements, 3) the design of security mechanisms, 4) the specification of target systems, 5) the conceptual design of information security systems, and 6) the document of configuration management. The granularity level of product assurance in the 1st stage is low in that the 1st stage does not demand a detailed description in the analysis of user requirements, the specification of target systems, or the conceptual design of information security systems. The 1st stage does, however, demand a detailed description of the security mechanism design and of configuration management. After the developing activity is complete, the evaluator should evaluate the observation of production processes, the reliability of the conceptual design, and the completeness of configuration management. To decrease periods and costs, the observation of production processes can be evaluated through check-lists and interviews. In the 1st stage, the focus of evaluation lies in security mechanisms and configuration management. The reason for focusing on security mechanisms is that their design is the most important process in developing highly reliable information security systems.

3.2 The 2nd Stage in ISEM

The 1st prototype is developed and evaluated in the 2nd stage. As fig. 4 shows, the developer should reflect the conceptual design in the 1st prototype. After the 1st prototype is developed, the evaluator should mainly validate the correctness between the assurance of the security mechanisms and the 1st prototype. In the 2nd stage, the production processes are as follows: 1) specifying the 1st prototype, 2) designing security mechanisms for the 1st prototype, 3) developing the 1st prototype, and 4) testing functions of information security systems. The production processes in the 2nd stage should mainly implement the security mechanisms of the 1st prototype. To develop the 2nd prototype, the developer should design the security mechanisms using formal methods and verify them mathematically. In this way, the vulnerability of the security mechanisms and the related functions can be verified.
Fig. 4. The production processes in the 2nd stage
To assure the 1st prototype, the developer should write documents, which are as follows: 1) the specification of the 1st prototype, 2) the design of security mechanisms, 3) the document of functional test results, 4) the conceptual design of systems, and 5) the document of configuration management. The assurance granularity of the 2nd stage is higher than that of the 1st stage. The specification of the 1st prototype and the design of security mechanisms should be described using formal methods. In particular, the design of security mechanisms should include the results of vulnerability tests and security tests of security protocol operation. The document of functional test results includes only the results of the function tests. Despite the increased granularity of assurances, the documents written in the production processes can be simple compared to those of CC. In the 2nd stage, the evaluator should validate the correctness between the assurances and the 1st prototype. To validate reliability and integrity, the evaluator should verify the security mechanisms and functional results using documents and independent tests.

3.3 The 3rd Stage in ISEM

In the 3rd stage, the 2nd prototype is developed and evaluated. As fig. 5 shows, the production processes of the 3rd stage are similar to those of the 2nd stage. To bring the 2nd prototype closer to the target systems, the developer should modify the security mechanisms and improve the entire set of functions of the 1st prototype. In the 3rd stage, the production processes are as follows: 1) specifying the 2nd prototype, 2) modifying security mechanisms for the 2nd prototype, 3) developing the 2nd prototype, and 4) testing the entire functions of information security systems. In the 3rd stage, the security mechanisms and the entire functions should be confirmed. In some cases, the 2nd prototype can serve as the target systems, and the developer can complete the 2nd prototype as the target systems. To assure the 2nd prototype, the developer should write documents, which are as follows: 1) the specification of the 2nd prototype, 2) the design of security mechanisms, 3) the document of functional test results, 4) the detailed design of information security systems, and 5) the document of configuration management. The difference between the 2nd stage and the 3rd stage is the assurance level of products. In the 3rd stage, the detailed specification of the target systems and the detailed design of information security systems should be described. To increase the assurance level of the 2nd prototype, the developer should provide quantitative criteria for evaluating the correctness of the security mechanisms and the entire functions.
Fig. 5. The production processes in the 3rd stage
The evaluator should verify the completeness of the security mechanisms and the entire functions in the 2nd prototype. Although the granularity of the specification of target systems and of the design of systems increases, the costs and the periods of the production processes are similar to those of the 2nd stage.

3.4 The 4th Stage in ISEM

The 3rd prototype is developed and evaluated in the 4th stage. The 3rd prototype is the target systems specified in the first stage. As fig. 6 shows, the production processes of the 4th stage are as follows: 1) specifying the 3rd prototype, 2) developing the 3rd prototype, 3) testing performance of the 3rd prototype, 4) testing hardware adaptation, and 5) testing operation of the 3rd prototype in the target environment. The production processes in the 4th stage focus on the operation of the 3rd prototype in the target environments. Based on the results of the tests, the placement of the 3rd prototype can be decided.
Fig. 6. The production processes in the 4th stage
To assure the 3rd prototype, the developer should write documents, which are as follows: 1) the specification of the 3rd prototype, 2) the document of performance test results, 3) the document of environmental test results, 4) the document of operation test, and 5) the document of configuration management. To assure the completeness of the 3rd prototype, the developer should provide quantitative criteria for meeting the specification of the 3rd prototype in the performance test, the operational test, and the environment test. After testing, the developers should describe the results of the tests. In the 4th stage, the documents of tests should be described in detail.
The evaluator should verify the consistency between the assurances and the overall tests. After the 4th stage, the overall assurances of products are complete, since the assurances have been recorded in the form of documents in each stage.
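The four stages and their production processes, as listed in Sections 3.1 to 3.4, can be collected into a small data structure. The sketch below is only an illustration: the names `Stage`, `ISEM_STAGES` and `plan_stages` are hypothetical, and the assumption that a less demanding project simply takes the first two or three stages in order is ours, since the text does not say which stages are kept.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Stage:
    number: int
    target: str
    processes: Tuple[str, ...]


# Production processes as enumerated in Sections 3.1 to 3.4.
ISEM_STAGES = [
    Stage(1, "conceptual design", (
        "survey current information security systems",
        "analyze user requirements",
        "design security mechanisms",
        "specify target systems",
        "design systems at a conceptual level",
    )),
    Stage(2, "1st prototype", (
        "specify the 1st prototype",
        "design security mechanisms for the 1st prototype",
        "develop the 1st prototype",
        "test functions",
    )),
    Stage(3, "2nd prototype", (
        "specify the 2nd prototype",
        "modify security mechanisms for the 2nd prototype",
        "develop the 2nd prototype",
        "test the entire functions",
    )),
    Stage(4, "3rd prototype", (
        "specify the 3rd prototype",
        "develop the 3rd prototype",
        "test performance",
        "test hardware adaptation",
        "test operation in the target environment",
    )),
]


def plan_stages(stage_count: int) -> List[Stage]:
    """Return the stages to run, assuming they are taken in order from stage 1."""
    if not 2 <= stage_count <= len(ISEM_STAGES):
        raise ValueError("ISEM projects use two, three, or four stages")
    return ISEM_STAGES[:stage_count]
```

For example, `plan_stages(4)` yields all four stages for a highly reliable system, while `plan_stages(2)` stops after the 1st prototype.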
4 SENT Tool

To support the ISEM methodology, SENT has been developed. As fig. 7 shows, it consists of the Process-Supporting Systems (SYS1), the Assurance-Supporting Systems (SYS2), and the Specifying/Evaluating Systems (SYS3).
Fig. 7. SENT tool
The Process-Supporting Systems (SYS1) help the developers observe the production processes in each stage. The Assurance-Supporting Systems (SYS2) help the developers and the evaluators describe the assurances of products. The Assurance-Supporting Systems generate a predefined form in which the developers simply fill in the assurances, so that the documents, and their level of description, stay consistent through all the stages. The Specifying/Evaluating Systems (SYS3) help the users specify and evaluate the prototypes and target systems in each stage. SENT runs on a web server that includes a JSP container, JavaBeans, and a Xindice database.

4.1 The Process-Supporting Systems (SYS1)

The Process-Supporting Systems run on a central server. The users are able to upload and download documents. As fig. 8 shows, the documents are saved in the File Systems and information concerning the documents is saved in the Database. The Process-Supporting Systems provide two functions. First, the rules for the production process guide the users in observing the production process. The users perform their tasks in accordance with the production processes and cannot skip or omit any production process before completing the previous ones. Second, user authentication authorizes the users, who consist of the managers, the developers and the evaluators, to access SENT. The roles of the users differ depending on their tasks: only the manager can review all the documents, grant general users access authorization, and post news in the systems. The general users can search and edit the documents.
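The two functions of SYS1, enforcing the order of production processes and separating user roles, can be illustrated with a minimal sketch. It is not SENT's actual code (which the paper describes as JSP-based); the class `ProcessTracker`, the `PERMISSIONS` table, and the action names are hypothetical. The table grants each role only the actions the text names, so whether managers may also search and edit is left open.

```python
class ProcessTracker:
    """Allow production processes to be completed only in their defined order."""

    def __init__(self, processes):
        self._processes = list(processes)
        self._completed = 0  # number of processes finished so far

    def next_process(self):
        """Return the next process to perform, or None when all are done."""
        if self._completed < len(self._processes):
            return self._processes[self._completed]
        return None

    def complete(self, name):
        """Mark a process as done; reject skipped or out-of-order processes."""
        expected = self.next_process()
        if name != expected:
            raise ValueError(f"expected {expected!r}, got {name!r}")
        self._completed += 1


# Actions per role, limited to what the text states (assumed action names).
PERMISSIONS = {
    "manager": {"review_all_documents", "grant_access", "post_news"},
    "developer": {"search_documents", "edit_documents"},
    "evaluator": {"search_documents", "edit_documents"},
}


def is_allowed(role: str, action: str) -> bool:
    """Check whether a role may perform an action; unknown roles get nothing."""
    return action in PERMISSIONS.get(role, set())
```

A tracker built from a stage's process list refuses `complete("develop the 1st prototype")` until the specification and design steps before it have been completed.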
Fig. 8. The structure of the Process-Supporting Systems
4.2 The Assurance-Supporting Systems (SYS2)

The Assurance-Supporting Systems provide the following functions. First, form generation lets the users edit data in a predefined format. As fig. 9 shows, the Presentation Layer, the Document Manager, and the Database Manager generate a form, which can be drawn and modified by the authorized users. The data and the form of a document are saved in the Database and the File Systems, respectively. Second, configuration management provides two functions: categorizing documents and managing configuration documents. It categorizes the documents by form and title and manages the history of configuration documents in detail. It can also issue a report of document alterations, which lists the modified items of documents, the modification times, and the modifying users. As fig. 9 shows, the Version Manager categorizes documents by title and form and saves them.
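The alteration report described above records three things per change: the modified item, the time, and the user. A minimal sketch of such a history is given below. It is an illustration under assumed names (`Change`, `DocumentHistory`, `categorize`), not the Version Manager's real interface, which the paper does not specify.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple


@dataclass
class Change:
    item: str       # which part of the document was modified
    user: str       # who modified it
    time: datetime  # when it was modified


@dataclass
class DocumentHistory:
    title: str
    form: str
    changes: List[Change] = field(default_factory=list)

    def record(self, item: str, user: str, time: Optional[datetime] = None) -> None:
        """Append one modification; default to the current UTC time."""
        when = time if time is not None else datetime.now(timezone.utc)
        self.changes.append(Change(item, user, when))

    def report(self) -> List[str]:
        """Return one line per modification, in the order recorded."""
        return [
            f"{c.time.isoformat()} {c.user} modified {c.item}"
            for c in self.changes
        ]


def categorize(docs: List[DocumentHistory]) -> Dict[Tuple[str, str], List[DocumentHistory]]:
    """Group documents by (form, title), as the text describes."""
    groups: Dict[Tuple[str, str], List[DocumentHistory]] = {}
    for doc in docs:
        groups.setdefault((doc.form, doc.title), []).append(doc)
    return groups
```

Each call to `record` adds one entry, and `report` turns the accumulated entries into the kind of alteration listing the text mentions.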
Fig. 9. The structure of the Assurance-Supporting Systems
4.3 The Specifying/Evaluating Systems

The Specifying/Evaluating Systems provide two important functions. First, they can save templates of the specification and the evaluation report, in a way similar to the Assurance-Supporting Systems. Second, when a specification is being described, they can analyze the security environment, which includes assumptions, threats, and organizational security policy. As fig. 10 shows, the Inference Engine presents the security environment to the users so that they can describe it. When the users input the threats, assumptions, and security policy of an organization to the Intelligent Manager, the Intelligent Manager passes them to the Inference Engine. On receiving them, the Inference Engine infers the security environment using the Database and returns the final specification of the target systems.
Fig. 10. Specifying/Evaluating Systems
5 Conclusion

This paper presents the ISEM methodology and the SENT tool for engineering highly reliable information security systems. ISEM has been designed to take advantage of two contrasting approaches to information security engineering. Although product assurance approaches such as TCSEC, ITSEC, and CC can engineer information security systems precisely, they incur high costs. Although production process approaches such as SPICE and SSE-CMM cost less than the product assurance approach, they cannot assure information security systems as precisely. ISEM requires the users to describe the assurances of information security systems and to observe the suggested four stages. To keep the approach feasible, ISEM adjusts the assurance level of products and production processes according to the reliability required of the information security systems. ISEM is suitable for engineering highly reliable information security systems because such systems should be developed at a high assurance level and in a cost-effective way, and ISEM provides both a high assurance level and a cost-effective engineering process.
SENT supports all the production processes and product assurances. SENT operates over the Internet, so that the users can access it easily. The Process-Supporting Systems help the users observe the four stages, and the Assurance-Supporting Systems reduce the effort the users spend on describing documents. SENT has proved useful in developing highly reliable information systems. Although the production process approach and the product assurance approach were introduced for developing special-purpose information security systems, they can also be applied to commercial information security systems. Owing to their reliability and effectiveness, ISEM and SENT are suitable for developing highly reliable systems, including cipher systems, military systems, space systems and so on.
References
1. Software Engineering Institute, Carnegie Mellon Univ.: SSE-CMM Appraisal Method, V.2.0 (1999)
2. Department of Defense: Trusted Computer System Evaluation Criteria, DoD 5200.28-STD (1985)
3. European Commission: Information Technology Security Evaluation Criteria (ITSEC) (1992)
4. Eloff, M., Solms, S.H.: Information Security Management: Hierarchical Framework for Various Approaches. Computers & Security, Vol. 19 (2000) 243–256
5. Hefner, R., Monroe, W.: System Security Engineering Capability Maturity Model. Conference on Software Process Improvement (1997)
6. ISO/IEC: Common Criteria for Information Technology Security Evaluation, Part 3: Security Assurance Requirements, Version 2.1 (1999)
7. ISO/IEC: Common Methodology for Information Technology Security Evaluation, Part 2: Evaluation Methodology, Version 1.0 (1999)
8. Piazza, C., Pivato, E., Rossi, S.: CoPS - Checker of Persistent Security. In: Jensen, K., Podelski, A. (eds.): Tools and Algorithms for the Construction and Analysis of Systems. Lecture Notes in Computer Science, Vol. 2988. Springer-Verlag, Berlin Heidelberg New York (2004) 93–107
9. Pijl, G., Swinkels, G., Verijdt, J.: ISO 9000 versus CMM: Standardization and Certification of IS Development. Information & Management, Vol. 32 (1997) 267–274
10. Qadeer, S., Rehof, J.: Context-Bounded Model Checking of Concurrent Software. In: Halbwachs, N., Zuck, L.D. (eds.): Tools and Algorithms for the Construction and Analysis of Systems. Lecture Notes in Computer Science, Vol. 3440. Springer-Verlag, Berlin Heidelberg New York (2005) 93–107
11. Wood, C., Snow, K.: ISO 9000 and Information Security. Computers & Security, Vol. 14, No. 4 (1995) 287–288
Author Index
Baek, Yoo-Jin 1
Bai, Shuo 78
Bao, Feng 112, 142
Cao, Tianjie 314
Cao, Zhenfu 226
Chai, Zhenchuan 226
Chen, Kefei 165
Chen, Liqun 202
Cheng, En 100
Cheng, Zhaohui 202
Choi, Kyunghee 67
Choi, Myeonggil 379
Chung, Tai-Myung 379
Comley, Richard 202
Dai, Kui 290
Dong, Ling 165
Eom, Young Ik 269
Fang, Binxing 45, 57
Feng, Min 13
Han, Dong-Guk 33
Han, Zongfen 100
He, Guangqiang 177
He, Hongjun 290
He, Mingxing 134
Hou, Fangyong 290
Hu, Yupu 25
Huang, Xinyi 214
Hwang, Yong Ho 153
Jia, Weijia 112
Jiang, Zhonghua 321
Jin, Hai 100
Jin, Hongxia 302
Jin, Shiyao 90
Jung, Gihyun 67
Khan, Muhammad Khurram 260
Kim, Gu Su 269
Kim, Hyung Chan 235
Kim, Jangbok 67
Kim, Yosik 248
Lee, Jung Wook 153
Lee, Pil Joong 153
Lee, Seok Yun 379
Li, Jianhua 123
Li, Shipeng 13
Li, Tieyan 112, 142
Li, Xiao 134
Li, Xiehua 123
Lim, Jongin 33
Lin, Dongdai 314, 321
Lin, Lei 321
Liu, Fenlin 355
Liu, Yinbo 278
Lotspiech, Jeffery 302
Lu, Bin 355
Lu, Yahui 278
Luo, Hao 45
Luo, Xiangyang 355
Manulis, Mark 187
Mao, Xianping 314
Mu, Yi 214, 332, 367
Noh, Mi-Jung 1
Park, Sangseo 248
Peng, Jinye 177
Ping, Lingdi 343
Qi, Fang 112
Qiu, Ying 142
Ramakrishna, R.S. 235
Ryou, Jaecheol 248
Sadeghi, Ahmad-Reza 187
Sakurai, Kouichi 235
Schwenk, Jörg 187
Shen, Haibin 343
Shim, Jaehong 67
Shin, Wook 235
Sun, Jiaguang 278
Sun, Kang 343
Susilo, Willy 214, 332, 367
Takagi, Tsuyoshi 33
Tang, Qiang 202
Tao, Zhifei 100
Wang, Baocang 25
Wang, Jimin 343
Wang, Ping 57
Wang, Zhiying 290
Wu, Ji 90
Wu, Qianhong 367
Wu, Yongdong 112
Xie, Feng 78
Xiong, Jin 177
Xu, Lin 321
Yang, Shutang 123
Ye, Chaoqun 90
Yun, Xiaochun 45, 57
Yun, Youngtae 248
Zeng, Guihua 177
Zhang, Fangguo 367
Zhang, Futai 214
Zhang, Jiashu 260
Zhang, Li 278
Zhao, Cunlai 13
Zhou, Lan 332
Zhou, Yuan 226
Zhu, Bin B. 13
Zhu, Hongwen 123
Zhu, Huafei 142