0, be the stochastic kernel on $[\mathbb{R}^d, \mathfrak{B}^d]$ given by
$$K_\lambda(q, \cdot) = N(q, \lambda I, \cdot).$$
Here $I$ denotes the identity matrix and $N(q, \lambda I, \cdot)$ denotes the normal distribution with the expectation vector $q$ and the diagonal matrix $\lambda I$ as covariance matrix. Then
(5) describes how the charge present at initial time zero diffuses until the time
t. Here $l^d$ denotes the Lebesgue measure on $[\mathbb{R}^d, \mathfrak{B}^d]$. The discretization of the process corresponding to the source term $f$ occurs as follows: Let $f$ be a bounded measurable function defined on $\mathbb{R}^d \times \mathbb{R}_+$. We define functions $f^+$, $f^-$ by putting
$$f^+ := \max\{0, f\}, \qquad f^- := \max\{0, -f\}.$$
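$\mu_1$ denotes the measure on $\mathbb{R}^{d+1} \times \{-1, 1\}$ characterized by
$$\mu_1(B \times \{1\}) = \int f^+(x)\, \chi_B(x)\, l^{d+1}(dx), \quad B \in \mathcal{L}^{d+1},$$
$$\mu_1(B \times \{-1\}) = \int f^-(x)\, \chi_B(x)\, l^{d+1}(dx), \quad B \in \mathcal{L}^{d+1}.$$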
$\Phi_n$ denotes a random point system in $\mathbb{R}^{d+1} \times \{-1, 1\}$ with distribution $P_{(n\mu_1)}$. Thus $\Phi_n$ describes the configuration of charge points which are added spatially-temporally. If $\Phi_n(\{[x, s, 1]\}) = 1$ then a charge unit of the amount $1/n$ is added. If $\Phi_n(\{[x, s, -1]\}) = 1$ then a charge unit of the amount $-1/n$ is added. After a particle occurs, it diffuses. Thus the contribution of this source process to the charge density until $t > 0$ is given by
$$D_t^{(n)}(\cdot) = \frac{1}{n} \int c\, \big(K_{t-s}(x, \cdot)\, \chi_{(0,t)}(s) + \delta_x(\cdot)\, \chi_{\{t\}}(s)\big)\, \Phi_n(d[x, s, c]). \qquad (6)$$
Here $\chi_B$ denotes the indicator function of a set $B$. The disturbance term is transformed similarly. Let $\sigma$ be a bounded measurable function defined on $\mathbb{R}^d \times \mathbb{R}_+$. $\sigma^2$ is considered as the density of a measure $\mu_2$ on $[\mathbb{R}^{d+1}, \mathfrak{B}^{d+1}]$. We define a measure $\mu$ on $\mathbb{R}^{d+1} \times \{-1, 1\}$ by
$$\mu := \mu_2 \times \tfrac{1}{2}(\delta_1 + \delta_{-1}).$$
Let $\Psi_n$ be a random point system in $\mathbb{R}^{d+1} \times \{-1, 1\}$ with distribution $P_{(n\mu)}$. Analogously to $\Phi_n$, the random point system $\Psi_n$ describes the configuration of charge points which are added spatially-temporally with mean intensity $n\sigma^2$, including the "sign" of the unit charges. In contrast to the first source process, individual charges of the value $1/\sqrt{n}$ are generated. Thus the contribution of this second source process to the charge density until $t > 0$ is given by
$$G_t^{(n)}(\cdot) = \frac{1}{\sqrt{n}} \int c\, \big(K_{t-s}(x, \cdot)\, \chi_{(0,t)}(s) + \delta_x(\cdot)\, \chi_{\{t\}}(s)\big)\, \Psi_n(d[x, s, c]). \qquad (7)$$
Each particle should give a contribution to the entire charge and evolve independently of the other ones. These requirements do not follow from the properties of the Poisson process. Therefore, we assume that $\Phi_n$ and $\Psi_n$ are independent. Consequently the entire process can be defined as the sum of independent random variables
(8)
By the normalization $1/n$, respectively $1/\sqrt{n}$, the "potential" of one generated particle decreases when $n$ increases, whereas the average number of particles increases, i.e., the produced charge is "smeared over" in a certain manner. From theorem 1.3.6 in 5 we can conclude:
Theorem 3.1. Let $t > 0$. Then the following holds:
(a) The sequence of the $\sigma$-additive processes $u_t^{(n)} = (u_t^{(n)}(B);\ B \in \mathcal{L}^d)$ converges (in the sense of definition 2.3) to a $\sigma$-additive process $u_t = (u_t(B);\ B \in \mathcal{L}^d)$.
(b) For each finite sequence $(B_i)_{i=1}^{m}$ from $\mathcal{L}^d$ the random vector $[u_t(B_1), \dots, u_t(B_m)]$ is normally distributed. The distribution is characterized by the expectation values
and by the covariances
Let us note that the processes $(u_t)_{t>0}$ can be interpreted as the solution of the integral equation corresponding to (4) (cf. 5).
A Completely Discrete Particle Model
In 3 a completely discrete particle model is considered, i.e., in addition to the source term $f$ and the diffusion term $\sigma^2$ the initial condition $\varphi$ has to be discretized. Furthermore the particles of the considered system move according to a Brownian motion. In the following we assume that $\varphi(x)$ is an integrable function on $\mathbb{R}^d$. Furthermore we assume that for all $t > 0$ the functions $f(x,s)\chi_{(0,t)}(s)$, $\sigma^2(x,s)\chi_{(0,t)}(s)$ are integrable. These conditions ensure that the considered random point systems are always finite. In principle one can consider the case that $\varphi$, $f$, $\sigma^2$ are bounded measurable functions as in 5. Now the initial condition $\varphi$ will be discretized as follows: $\mu_0$ denotes the measure on $\mathbb{R}^{d} \times \{-1, 1\}$ characterized by
$$\mu_0(B \times \{1\}) = \int \varphi^+(x)\, \chi_B(x)\, l^d(dx), \quad B \in \mathcal{L}^d,$$
$$\mu_0(B \times \{-1\}) = \int \varphi^-(x)\, \chi_B(x)\, l^d(dx), \quad B \in \mathcal{L}^d.$$
Let $\Xi_n$ be a Poisson random point system on $\mathbb{R}^d \times \{-1, 1\}$ with distribution $P_{(n\mu_0)}$. $\Xi_n$ describes the configuration of charge points. Further $((w_x(t))_{t \ge 0})_{x \in \mathbb{R}^d}$ denotes a family of independent standard Wiener processes on $\mathbb{R}^d$. A charge point starts from $x$ at initial time 0 and moves to $x + w_x(t)$ at time $t$. The contribution of this diffusion process to the charge density until $t > 0$ is given by
(9)
We have already discretized the source term $f$ and the diffusion term $\sigma^2$, i.e., we already have the point configurations of $\Phi_n$ and $\Psi_n$. A charge point occurring at time $s$ on position $x$ moves to $x + w_x(t - s)$ at time $t$. The contributions of these source processes to the charge density until $t > 0$ are given by
$$\tilde{D}_t^{(n)}(\cdot) = \frac{1}{n} \int c\, \chi_{(0,t)}(s)\, \delta_{x + w_x(t-s)}(\cdot)\, \Phi_n(d[x, s, c]), \qquad (10)$$
$$\tilde{G}_t^{(n)}(\cdot) = \frac{1}{\sqrt{n}} \int c\, \chi_{(0,t)}(s)\, \delta_{x + w_x(t-s)}(\cdot)\, \Psi_n(d[x, s, c]). \qquad (11)$$
Furthermore we assume that $((w_x(t))_{t \ge 0})_{x \in \mathbb{R}^d}$, $\Xi_n$, $\Phi_n$, $\Psi_n$ are independent. The entire process can be defined as the sum of independent random variables
(12)
Similarly to the partially discrete particle model, the discrete system $u_t^{(n)}$ should approximate a continuous system. The following theorem makes that more precise 3.
Theorem 3.2. Let $t > 0$, $B \in \mathcal{L}^d$. Further let $\nu_t$, $\alpha_t$ be measures on $\mathbb{R}^d$ given by
$$\nu_t(B) = \Lambda_t(B) + \int K_{t-s}(x, B)\, f(x, s)\, \chi_{(0,t)}(s)\, l^{d+1}(d[x, s]),$$
$$\alpha_t(B) = \int K_{t-s}(x, B)\, \sigma^2(x, s)\, \chi_{(0,t)}(s)\, l^{d+1}(d[x, s]).$$
We assume that $W_t$ is a generalized Brownian motion with noise intensity measure $\alpha_t$. Then the sequence of the $\sigma$-additive processes $(u_t^{(n)})_{n \in \mathbb{N}}$ converges to a decomposable process $u_t$ given by the following equation
Remark 3.1. This theorem means that the completely discrete particle model approximates a continuous system different from the limit considered in 5, i.e., the limit of the completely discrete model has the same expectation values but different covariances, compared with the limit considered in 5.
4. Some Lemmas
For the proof of the main theorem in 3 (theorem 3.2 in this paper) it is necessary to use the following lemmas.
Lemma 4.1. Let $\Psi_n$ be a Poisson random point system with intensity measure $n\mu$ where $\mu$ is a locally finite measure on $G$ and $n \in \mathbb{N}$. $u^{(n)}$ denotes the $\sigma$-additive process defined by $u^{(n)} := \frac{1}{n} \Psi_n$. Then the sequence $(u^{(n)})_{n \in \mathbb{N}}$ converges to the (trivial) $\sigma$-additive process $\mu$, i.e., $C_{u^{(n)}}(g) \to \exp\{i \int g(x)\, \mu(dx)\}$ $(n \to +\infty)$.
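The convergence in Lemma 4.1 can be checked numerically in the simplest case of a single set $B$ with $\mu(B) = $ `mass` and a test function equal to the constant $c$ on $B$. The sketch below is only an illustration under that assumption: it uses the standard closed form of the characteristic function of a Poisson variable, and the function names are hypothetical, not taken from the paper.

```python
import cmath

def char_fn_scaled_poisson(c, mass, n):
    """E exp(i*c*N/n) for N ~ Poisson(n*mass), via the Poisson closed form."""
    return cmath.exp(n * mass * (cmath.exp(1j * c / n) - 1))

def char_fn_limit(c, mass):
    """Characteristic function of the deterministic limit value c*mass."""
    return cmath.exp(1j * c * mass)

def approximation_error(c, mass, n):
    """Distance between the n-th characteristic function and its limit."""
    return abs(char_fn_scaled_poisson(c, mass, n) - char_fn_limit(c, mass))

if __name__ == "__main__":
    for n in (10, 1000, 10**6):
        print(n, approximation_error(1.5, 2.0, n))
```

The error shrinks roughly like $\mu(B)\,c^2/(2n)$, which is the second-order term left over when $e^{ic/n} - 1$ is expanded.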
Lemma 4.2. Let <pr,. Then for each $t > 0$, $\Phi_t := \int \Phi(d[x, s])\, \delta_{x + w_x(t-s)}\, \chi_{(0,t)}(s)$ becomes a Poisson random point system in $\mathbb{R}^d$ with finite intensity measure $\mu_t$ being absolutely continuous w.r.t. the Lebesgue measure with density
5. Proof
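In this section we give the proofs of the lemmas in section 4. Using the lemmas the proof of theorem 3.2 is given in 3.
Proof of Lemma 4.1
Let us consider the characteristic functional $C_{u^{(n)}}(g)$ of $u^{(n)}$. Firstly we consider the special case
$$g^{(m)} = \sum_{k=1}^{m} c_k^{(m)} \chi_{B_k^{(m)}}$$
with pairwise disjoint subsets $B_1^{(m)}, \dots, B_m^{(m)}$. Then we have
$$\int g^{(m)}(x)\, du^{(n)}(x) = \sum_{k=1}^{m} c_k^{(m)} u^{(n)}(B_k^{(m)}) = \frac{1}{n} \sum_{k=1}^{m} c_k^{(m)} \Psi_n(B_k^{(m)}). \qquad (13)$$
From (13) we obtain
$$C_{u^{(n)}}(g^{(m)}) = E \exp\Big(i \int g^{(m)}(x)\, du^{(n)}(x)\Big) = E \exp\Big(\frac{i}{n} \sum_{k=1}^{m} c_k^{(m)} \Psi_n(B_k^{(m)})\Big)$$
$$= \sum_{l_1=0}^{+\infty} \cdots \sum_{l_m=0}^{+\infty} \prod_{k=1}^{m} \exp\Big(\frac{i c_k^{(m)}}{n} l_k\Big) \times \Pr\{\Psi_n(B_1^{(m)}) = l_1, \dots, \Psi_n(B_m^{(m)}) = l_m\}$$
$$= \prod_{k=1}^{m} \sum_{l_k=0}^{+\infty} \exp\Big(\frac{i c_k^{(m)}}{n} l_k\Big) \Pr\{\Psi_n(B_k^{(m)}) = l_k\}. \qquad (14)$$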
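Since
$$\int g^{(m)}(x)\, d\Phi_t(x) = \sum_{k=1}^{m} c_k^{(m)} \Phi_t(B_k^{(m)}) = \sum_{k=1}^{m} c_k^{(m)} \int \Phi(d[x, s])\, \chi_{(0,t)}(s)\, \delta_{x + w_x(t-s)}(B_k^{(m)}) = \sum_{[x,s] \in \Phi,\ 0 < s < t} g^{(m)}(x + w_x(t - s)), \qquad (30)$$
we have
$$C_{\Phi_t}(g^{(m)}) = E \exp\Big(i \int g^{(m)}(x)\, d\Phi_t(x)\Big) = E \exp\Big\{ i \sum_{[x,s] \in \Phi,\ 0 < s < t} g^{(m)}(x + w_x(t - s)) \Big\}$$
$$= \int P_\mu(d\Phi) \prod_{[x,s] \in \Phi,\ 0 < s < t} E \exp\{ i g^{(m)}(x + w_x(t - s)) \}. \qquad (31)$$
The last equality is obtained from the independence of $((w_x(t))_{t \ge 0})_{x \in \mathbb{R}^d}$ and $\Phi$. Further it holds
$$E \exp\{ i g^{(m)}(x + w_x(t - s)) \} = \int \exp\{ i g^{(m)}(y) \}\, k_{t-s}(x, y)\, dy. \qquad (32)$$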
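From (31), (32) we obtain
$$C_{\Phi_t}(g^{(m)}) = \exp\Big( \int \mu(d[x, s])\, \chi_{(0,t)}(s) \Big[ \int \exp\{ i g^{(m)}(y) \}\, k_{t-s}(x, y)\, dy - 1 \Big] \Big)$$
$$= \exp\Big( \int \Big\{ \int\!\!\int h(x, s)\, \chi_{(0,t)}(s)\, k_{t-s}(x, y)\, dx\, ds \Big\} \big[ \exp\{ i g^{(m)}(y) \} - 1 \big]\, dy \Big). \qquad (33)$$
Setting $h_t$ as
$$h_t(y) := \int\!\!\int h(x, s)\, \chi_{(0,t)}(s)\, k_{t-s}(x, y)\, dx\, ds,$$
we obtain from (33)
$$C_{\Phi_t}(g^{(m)}) = \exp\Big( \int h_t(y) \big[ \exp\{ i g^{(m)}(y) \} - 1 \big]\, dy \Big) = \exp\Big( \int \mu_t(dy) \big[ \exp\{ i g^{(m)}(y) \} - 1 \big] \Big). \qquad (34)$$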
Similarly to (18), let us approximate a general function $g$ by the sequence $(g^{(m)})_{m \in \mathbb{N}}$. Then it follows from (34) that
$$C_{\Phi_t}(g) = \exp\Big( \int \mu_t(dy) \big[ \exp\{ i g(y) \} - 1 \big] \Big).$$
That proves lemma 4.4. ∎
References
1. L. Breiman, Probability, Reading, Mass., 1968.
2. J. Feldman, Decomposable processes and continuous products of probability spaces, J. Function. Analysis, 8, 1-51, 1971.
3. K.-H. Fichtner, K. Inoue, M. Ohya, Approximative approaches to a stochastic partial differential equation by point systems, Preprint.
4. K.-H. Fichtner, R. Manthey, Weak approximation of stochastic equations, Stochastics and Stochastics Reports, 43, 139-160, 1993.
5. K.-H. Fichtner, M. Schmidt, Approximation of a continuous system by point systems, SERDICA, 13, 396-402, 1987.
6. K.-H. Fichtner, G. Winkler, Generalized Brownian motion, point processes and stochastic calculus for random fields, Math. Nachr. 161, 291-307, 1993.
7. K. Matthes, J. Kerstan, J. Mecke, Infinitely divisible point processes, J. Wiley, New York, 1978.
8. J.B. Walsh, A stochastic model of neural response, Adv. Appl. Probability, 13, 231-281, 1981.
9. H. Zessin, The method of moments for random measure, Z. Wahrsch. Verw. Gebiete 62, 395-409, 1983.
Quantum Bio-Informatics IV eds. L. Accardi, W. Freudenberg and M. Ohya © 2011 World Scientific Publishing Co. (pp. 173- 183)
ON QUANTUM ALGORITHM FOR EXPTIME PROBLEM
S. IRIYAMA AND M. OHYA
Department of Information Sciences, Tokyo University of Science, 2641, Yamazaki, Noda City, Chiba, Japan
There exists a quantum algorithm with chaos dynamics, called the OMV SAT algorithm, that solves an NP-complete problem in polynomial time. The language class EXPTIME is a larger class than NP, and there is no classical algorithm to solve it in polynomial time. In this paper we propose a quantum algorithm for one of the problems in EXPTIME, Pebble Game, and compare its computational complexity with the classical one. We show that a quantum algorithm with an Oracle solves it in polynomial time while a classical algorithm with the same Oracle takes exponential time.
1. Introduction
We have studied quantum algorithms for several years. Ohya and Volovich discovered the quantum algorithm with chaos dynamics, called the OMV quantum algorithm, which can solve an NP-complete problem in polynomial time. We applied this quantum algorithm to other problems: multiple alignment of amino acid sequences, the Hamilton closed path problem, the protein folding problem and an EXPTIME problem. We found that the OMV quantum algorithm is useful for searching problems, i.e., searching for the objects which satisfy given conditions. In the field of Bio-Information there exist many searching problems, and we can apply our quantum algorithm to them. In this paper, we show a quantum algorithm for an EXPTIME problem, Pebble Game. Then we discuss its computational complexity.
2. Quantum Algorithm
A quantum algorithm is constructed by the following steps:
(1) Prepare a Hilbert space
(2) Construct an initial state
(3) Construct unitary operators to solve the problem
(4) Apply them to the initial state and obtain a result state
(5) If necessary, amplify the probability of the correct result
(6) Measure an observable with the result state
In the first step, we define the Hilbert space depending on the problem. Let
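$\mathbb{C}^2$ be a Hilbert space spanned by $|0\rangle = \binom{1}{0}$ and $|1\rangle = \binom{0}{1}$; a normalized vector $|\psi\rangle = \alpha |0\rangle + \beta |1\rangle$ on this space is called a qubit. Since we can use a superposition of $|0\rangle$ and $|1\rangle$ as an initial state vector, the quantum algorithm is more effective than the classical one. One can apply the Hadamard transformation
$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$
to create a superposition. For $|0\rangle$ and $|1\rangle$, it works as
$$H |0\rangle = \frac{1}{\sqrt{2}} |0\rangle + \frac{1}{\sqrt{2}} |1\rangle,$$
$$H |1\rangle = \frac{1}{\sqrt{2}} |0\rangle - \frac{1}{\sqrt{2}} |1\rangle.$$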
The Hadamard transformation has a very important role in a quantum algorithm. Here we introduce logical gates, which are the NOT gate, C-NOT gate and CC-NOT gate. We call these gates fundamental gates. We can also construct AND and OR gates by considering the product of fundamental gates and some implementations. The NOT gate $U_{NOT}$ is defined on a Hilbert space $\mathbb{C}^2$ as
$$U_{NOT} = |1\rangle\langle 0| + |0\rangle\langle 1|.$$
It works for an arbitrary qubit as
$$U_{NOT}(\alpha |0\rangle + \beta |1\rangle) = \alpha |1\rangle + \beta |0\rangle.$$
The C-NOT gate $U_{CN}$ and the CC-NOT gate $U_{CCN}$ are given on the two- and three-qubit Hilbert spaces as
$$U_{CN} = |0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes U_{NOT},$$
$$U_{CCN} = |0\rangle\langle 0| \otimes I \otimes I + |1\rangle\langle 1| \otimes |0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes |1\rangle\langle 1| \otimes U_{NOT},$$
respectively. The unitary operator to solve the problem is constructed by these fundamental gates.
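The gate definitions of this section can be checked by writing the operators as small matrices. The following is a minimal sketch in plain Python; the helper names (`kron`, `add`, `apply`, `basis`) are hypothetical, and the first qubit is taken as the most significant bit of the basis index, which is a convention assumed here rather than stated in the text.

```python
import math

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def add(*matrices):
    """Entrywise sum of matrices of equal shape."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) for j in range(cols)]
            for i in range(rows)]

def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(row[j] * vector[j] for j in range(len(vector)))
            for row in matrix]

def basis(bits):
    """Computational basis vector |bits>, first qubit most significant."""
    v = [0] * (2 ** len(bits))
    v[int(bits, 2)] = 1
    return v

s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[s, s], [s, -s]]
P0 = [[1, 0], [0, 0]]      # projector |0><0|
P1 = [[0, 0], [0, 1]]      # projector |1><1|
U_NOT = [[0, 1], [1, 0]]   # |1><0| + |0><1|
U_CN = add(kron(P0, I2), kron(P1, U_NOT))
U_CCN = add(kron(P0, kron(I2, I2)),
            kron(P1, kron(P0, I2)),
            kron(P1, kron(P1, U_NOT)))

if __name__ == "__main__":
    print(apply(H, basis("0")))        # equal amplitudes on |0> and |1>
    print(apply(U_CCN, basis("110")))  # target flips only when both controls are 1
```

Building the controlled gates from projectors mirrors the operator sums in the text, so each term of $U_{CN}$ and $U_{CCN}$ corresponds to one `kron` call.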
3. OMV SAT Algorithm
In this section, we explain the OMV (Ohya-Masuda-Volovich) quantum algorithm, which contains two parts: unitary computation and a chaos amplification process. It is discussed precisely in the papers 1, 2, 4, 9.
Let $X \equiv \{x_1, \dots, x_n\}$, $n \in \mathbb{N}$, be a set. $x_k$ and its negation $\bar{x}_k$ $(k = 1, \dots, n)$ are called literals. The set of all literals is denoted by $X' \equiv X \cup \bar{X} = \{x_1, \dots, x_n, \bar{x}_1, \dots, \bar{x}_n\}$. The set of all subsets of $X'$ is denoted by $\mathcal{F}(X')$ and an element $G \in \mathcal{F}(X')$ is called a clause. We take a truth assignment $t$ to all variables $x_k$. If we can assign the truth value 1 to at least one element of $G$, then $G$ is called satisfiable. Let $L = \{0, 1\}$ be a Boolean lattice with usual join $\vee$ and meet $\wedge$, and $t(x)$ be the truth value of a literal $x$ in $X$. Then the truth value of a clause $G$ is written as $t(G) \equiv \vee_{x \in G}\, t(x)$. Moreover the set $C$ of all clauses $G_j$ $(j = 1, 2, \dots, m)$ is called satisfiable iff $t(C) \equiv \wedge_{j=1}^{m} t(G_j) = 1$. Thus the SAT problem is written as follows:
[SAT problem] Given a Boolean set $X \equiv \{x_1, \dots, x_n\}$ and a set $C = \{G_1, \dots, G_m\}$ of clauses, determine whether $C$ is satisfiable or not.
That is, this problem is to ask whether there exists a truth assignment to make $C$ satisfiable. It is known that we can check the satisfiability in polynomial time when a specific truth assignment is given; however, no polynomial-time method is known for determining it when an assignment is not specified.
We first calculate the total number of qubits, and show that this number depends on the input data. This calculation is done in polynomial time of the input size. Once the total number of qubits required by the quantum algorithm is known, we decide the Hilbert space and the initial state vector on it. Let $C = \{G_1, \dots, G_m\}$ be a set of clauses on $X' = \{x_1, \dots, x_n, \bar{x}_1, \dots, \bar{x}_n\}$. The computational basis of this algorithm is on the Hilbert space $\mathcal{H} = (\mathbb{C}^2)^{\otimes n + \mu + 1}$ where $\mu$ is the number of dust qubits; it is shown that $\mu$ is less than $2mn$ 9. Let
be an initial state vector. For $X = \{x_1, \dots, x_n\}$ and a truth assignment $t$, we put

where $\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n \in \{0, 1\}$, and we write $t$ as a sequence of binary symbols:

A unitary operator $U_C : \mathcal{H} \to \mathcal{H}$ computes $t(C)$ for all truth assignments as follows

where $|d^\mu\rangle$ is the dust qubits denoted by $\mu$ strings of binary symbols, and $|e_i\rangle = |\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n\rangle$ is a binary representation of $t$. Accardi and Sabbadini pointed out that the OMV SAT algorithm is combinatorial 6.
Theorem 3.1 (9). For a set of clauses $C = \{G_1, \dots, G_m\}$ on $X' \equiv \{x_1, \dots, x_n, \bar{x}_1, \dots, \bar{x}_n\}$, the number $\mu$ of dust qubits for the algorithm of the SAT problem is
$$\mu \le 2nm.$$
For a set of clauses $C = \{G_1, \dots, G_m\}$, we can construct the unitary operator $U_C$ to calculate the truth value of $C$ as
$$U_C \equiv \prod_{i=1}^{m-1} U_{AND}(i) \prod_{j=1}^{m} U_{OR}(j)\, H(n)$$
where $H(k)$ is a unitary operator applying the Hadamard transformation to the first $k$ qubits, that is
$$H(k) = H^{\otimes k} \otimes I^{\otimes (n + \mu + 1 - k)}.$$
The computational complexity of quantum computation depends on the number of unitary operators in the quantum circuit. Let $U$ be the unitary operator; it is written as
$$U = U_n U_{n-1} \cdots U_1$$
where $U_n, \dots, U_1$ are fundamental gates. The computational complexity $T(U)$ is considered as $n$.
In fact, we need to combine some fundamental gates such as $U_{NOT}$, $U_{CN}$ and $U_{CCN}$ to construct the quantum circuit. $U_{AND}$ and $U_{OR}$ can be written as a combination of fundamental gates. Here we obtain the computational complexity $T(U_C)$ of the SAT algorithm by the number of $U_{NOT}$, $U_{AND}$ and $U_{OR}$.
Theorem 3.2 (9). For a set of clauses $C = \{G_1, \dots, G_m\}$ and literals $X' = \{x_1, \dots, x_n, \bar{x}_1, \dots, \bar{x}_n\}$, $T(U_C)$ is
$$T(U_C) = m - 1 + \sum_{k=1}^{m} \big(|G_k| + 2 i_k - 1\big) \le 4mn - 1.$$
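The truth value $t(C)$ that the unitary computation of this section evaluates in superposition can be written classically as a join over literals and a meet over clauses. The sketch below enumerates all $2^n$ assignments by brute force; it is not the quantum algorithm. The encoding of literals as signed integers (`+k` for $x_k$, `-k` for its negation) and all function names are assumptions made for the example, and clauses are assumed non-empty.

```python
from itertools import product

def literal_value(assignment, literal):
    """Truth value of x_k (literal = +k) or of its negation (literal = -k)."""
    value = assignment[abs(literal) - 1]
    return value if literal > 0 else 1 - value

def clause_value(assignment, clause):
    """Join (logical OR) over the literals of one non-empty clause."""
    return max(literal_value(assignment, lit) for lit in clause)

def formula_value(assignment, clauses):
    """Meet (logical AND) over all clauses: the truth value t(C)."""
    return min(clause_value(assignment, clause) for clause in clauses)

def satisfying_assignments(n, clauses):
    """All assignments in {0,1}^n with t(C) = 1, found by enumeration."""
    return [a for a in product((0, 1), repeat=n)
            if formula_value(a, clauses) == 1]

if __name__ == "__main__":
    # (x1 or x2) and (not x1)
    print(satisfying_assignments(2, [[1, 2], [-1]]))
```

The number of satisfying assignments returned here plays the role of $r$ in the amplification step: the formula is satisfiable exactly when the list is non-empty.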
3.1. Chaos Amplifier
Here we will briefly review how chaos can play a constructive role in computation (see 1, 2 for the details). Consider the so-called logistic map, which is given by the equation
$$x_{n+1} = a x_n (1 - x_n), \quad x_n \in [0, 1].$$
The properties of the map depend on the parameter $a$. If we take, for example, $a = 3.71$, then the Lyapunov exponent is positive, the trajectory is very sensitive to the initial value and one has the chaotic behavior 2. It is important to notice that if the initial value $x_0 = 0$, then $x_n = 0$ for all $n$. The state $|\psi\rangle$ of the previous subsection is transformed into the density matrix of the form

where $P_1$ and $P_0$ are projectors to the state vectors $|1\rangle$ and $|0\rangle$. One has to notice that $P_1$ and $P_0$ generate an Abelian algebra which can be considered as a classical system. The following theorems are proven in 1, 2, 3.
Theorem 3.3. For the logistic map $x_{n+1} = a x_n (1 - x_n)$ with $a \in [0, 4]$ and $x_0 \in [0, 1]$, let $x_0$ be $\frac{1}{2^n}$ and a set $J$ be $\{0, 1, 2, \dots, n, \dots, 2n\}$. If $a$ is 3.71, then there exists an integer $k$ in $J$ satisfying $x_k > \frac{1}{2}$.
Theorem 3.4. Let $a$ and $n$ be the same as in the above theorem. If there exists $k$ in $J$ such that $x_k > \frac{1}{2}$, then $k > \frac{n - 1}{\log_2 3.71} - 1$.
Theorem 3.5. Let $|T(C)|$ be the cardinality of these assignments. If $x_0 \equiv \frac{r}{2^n}$ with $r = |T(C)|$ and there exists $k$ in $J$ such that $x_k > \frac{1}{2}$, then there exists $k$ satisfying the following inequality if $C$ is SAT:
$$\left[ \frac{n - 1 - \log_2 r}{\log_2 3.71} - 1 \right] \le k \le \left[ \frac{5}{4}(n - 1) \right].$$
From these theorems, for all $k$, it holds
$$g_{3.71}^{k}(q^2) \begin{cases} = 0 & \text{iff } C \text{ is not SAT} \\ > 0 & \text{iff } C \text{ is SAT.} \end{cases}$$
Ohya and Volovich proposed a quantum algorithm to calculate the truth function by unitary operators and mentioned that it is necessary to use some amplification process to detect the result when its probability is very small. Then they discovered that one can construct this process by using chaos dynamics. The computational complexity of the OMV SAT algorithm is discussed precisely in the papers 9, 10, 11.
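The amplification effect of the logistic map can be observed with a few lines of floating-point iteration. This is a minimal numerical sketch, not the amplifier acting on a density matrix: it iterates the map from $x_0 = 1/2^n$ with $a = 3.71$ and reports the first index $k$ in $J = \{0, \dots, 2n\}$ with $x_k > 1/2$. The function names are hypothetical.

```python
import math

def logistic_orbit(x0, a, steps):
    """Return [x_0, x_1, ..., x_steps] for x_{k+1} = a * x_k * (1 - x_k)."""
    orbit = [x0]
    for _ in range(steps):
        x = orbit[-1]
        orbit.append(a * x * (1.0 - x))
    return orbit

def first_index_above_half(n, a=3.71):
    """First k in {0, ..., 2n} with x_k > 1/2 when x_0 = 1 / 2**n, else None."""
    orbit = logistic_orbit(1.0 / 2 ** n, a, 2 * n)
    for k, x in enumerate(orbit):
        if x > 0.5:
            return k
    return None

if __name__ == "__main__":
    for n in (5, 10, 20):
        k = first_index_above_half(n)
        print(n, k, (n - 1) / math.log2(3.71))
```

Two elementary estimates explain the output. Since $x_k \le a^k x_0$, any $k$ with $x_k > 1/2$ satisfies $k > (n-1)/\log_2 3.71$; and since $x_{k+1} \ge (a/2)\, x_k$ while $x_k \le 1/2$, the threshold is crossed after a number of steps linear in $n$. An initial value of exactly 0 stays at 0, which is what separates the unsatisfiable case.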
4. Language Classes
There exist several classical language classes defined by a deterministic Turing machine.
Definition 4.1. Let $M$ be a deterministic Turing machine that halts for all inputs. For an input of length $n$, let $f(n)$ be the maximum number of tape cells used by $M$. We define the space complexity of $M$ as $f$.
Definition 4.2. PSPACE is the language class which is recognized by a deterministic Turing machine in polynomial space.
Definition 4.3. NPSPACE is the language class which is recognized by a non-deterministic Turing machine in polynomial space.
Definition 4.4. EXPTIME is the language class which is recognized by a deterministic Turing machine in exponential time.
The following relation is known:
$$P \subseteq NP \subseteq PSPACE = NPSPACE \subseteq EXPTIME$$
5. Pebble Game
Pebble Game is a two-player game using a play board and stones (pebbles). Players alternately move one stone at a time according to a given rule. A player wins when he moves a stone to the winning position on the board given by the rule, and he loses when he cannot move any stone. We want to know whether there exists a strategy such that the first mover can win every time. The computational complexity of this problem is very large, and in some cases, depending on the size of the board and the rule, it belongs to EXPTIME. We therefore propose a quantum algorithm for Pebble Game. First, we explain a representation of a game and a definition of Pebble Game.
5.1. Representation of a Game
Let $I$ be a set of players, $M$ a set of moves (or strategies) available to those players, and $P$ a set of payments for each combination of moves. A game $G$ is given by a triplet $(I, M, P)$. The set of moves is given by a rule of the game. Players choose their moves alternately, each in their own turn, so these choices are described by a sequence of moves. We denote this by a position $p_i$ where a player $i \in I$ has the move. A set of ordered pairs $(p_i, p_j)$, where $p_i$ and $p_j$ are positions such that $p_i \to p_j$ is a feasible move, expresses the rule of the game. The game is finished when there are no moves available in someone's turn. When the game is finished, the payments are given to players by a function of the position. In this study we assume that the game must finish, so that the length of a position is finite. We say a player A wins if the payment of A is greater than that of a player B in a two-player game. Then we consider the following problem:
[Game Problem1] Does there exist a way such that a player A wins independently of how a player B plays?
To answer this, a winning position of A was introduced by König as a set of positions where A wins in finitely many moves regardless of the moves of B. A set of winning positions of A is constructed inductively from a set of positions where A wins by only one move. Let $p$ be a winning position; $q$ is also a winning position if there exists a move $r_A$ of A such that for all moves $r_B$ of B, it holds

Therefore, Problem 1 is equivalent to the following:
[Game Problem2] Determine whether an initial position of the game is a winning position of A.
5.2. Pebble Game
Let $n$ be a positive finite integer denoting the size of Pebble Game. $n$-Pebble Game is given by a triplet $G_{Pebble} = (I, M, P)$ where $I = \{A, B\}$, $M = \{r = (r_1, r_2, r_3) \in \{1, \dots, n\}^3;\ r \text{ is an available move}\}$ and $P = \{f : M^k \to \{0, 1\}\}$. $M$ is a set of available moves depending on the situation of the board and the positions of stones. The rule of Pebble Game is the following:
• Prepare a board of $n$ lattice nodes and put $m$ $(< n)$ stones on nodes. Each node has a unique number $i = 1, \dots, n$.
• Players alternately move one of the stones, provided the chosen stone can jump over a neighboring stone.
• If a player puts a stone on the node $n$, he wins. If a player cannot move any stone, he loses.
Let $x = (x_1, x_2, \dots, x_n)$ be a vector of Boolean variables denoting the situation of the board, where $x_i$ has a one-to-one correspondence with node $i$:
$$x_i = \begin{cases} 0 & \text{no stone on node } i \\ 1 & \text{node } i \text{ has a stone} \end{cases}$$
We put $m$ stones on the board arbitrarily and call the resulting situation $x_s$ an initial situation. We call a situation $x_f$ a finished situation if there are no available moves. A move $r_i$ $(i \in \{A, B\})$ for a player $i$ is represented by $(r_1, r_2, r_3) \in \{1, \dots, n\}^3$ where $r_1$ indicates the position of the stone, $r_2$ the neighbor and $r_3$ the position of destination. If there is a stone on $r_1$ and $r_2$ and there is no stone on $r_3$, the move $r_i = (r_1, r_2, r_3)$ is available. After one move, the situation of the board is changed by
$$x_i \mapsto \begin{cases} \bar{x}_i & i = r_1 \text{ or } i = r_3 \\ x_i & \text{otherwise.} \end{cases}$$
We denote by $r_i(x)$ the situation of the board after a move $r_i$. If there are no moves available or a player can move a stone to the node $n$, the game is finished. Then we have a sequence of moves. For a sequence of moves $p_i\ (i \in \{A, B\}) = (r_A^1, r_B^2, \dots, r_i^k)$ a payment for a player A is given by a function $f_A \in P$:
and for a player B is by fB E P: fB (Pi)
=
{ o1 ii == BA
Let us define the following problem:
[Pebble Problem1] Does there exist a way such that a player A wins independently of how a player B plays in $n$-Pebble Game?
This problem belongs to EXPTIME if $m$ is not fixed 8. We check whether an initial situation is a winning position of A for all numbers $m$ of stones. Then the Pebble Problem1 is translated to the following problem:
[Pebble Problem2] For all finished situations $x_f$ determine whether there exists $k$ such that
$$\exists r_A^{k}\, \forall r_B^{k-1} \cdots \exists r_A^{3}\, \forall r_B^{2}\, \exists r_A^{1}\, (x_s) = x_f$$
where $x_s$ is an initial situation. We propose a quantum algorithm with an Oracle to solve the Pebble Problem2, and show that even if we assume an Oracle, a classical algorithm still takes exponential time.
5.3. Computational Complexity of a Classical Algorithm for Pebble Game
In order to discuss the relative computational complexity of the classical and quantum algorithms, we assume the following Oracle $M_O$:
$$M_O : \{x;\ x \text{ is a situation}\} \to \mathbb{N} \cup \{0\}$$
For a finished situation $x_f$, $M_O$ outputs the time of it immediately, in just one step. If a situation $x$ is not a final situation, $M_O$ outputs 0. A classical algorithm to solve Pebble Problem2 is the following.
Step1 For all situations, we do:
Step2 Calculate the time $k(x)$ for a situation $x$.
Step3 If $x$ is a final situation, construct a set $W_i$ $(i = 1, 2, \dots, k(x))$ of winning situations:
$$W_i = \{y;\ \exists r'_A,\ \forall r_B,\ \exists r_A,\ r'_A r_B r_A y \in W_{i-1}\}$$
where $W_0 = \{x\}$.
Step4 Check $x_s \in W_{k(x)}$.
The number of all situations is $2^n$, and the upper bound of the number of available moves is $4n$ since a player can move one pebble into four directions at most. Therefore the computational complexity of a classical algorithm $T_c(n)$ is
$$T_c(n) \sim (4n)^3 \times 2^n \sim \exp(n).$$
Even if we assume the Oracle, this problem belongs to EXPTIME.
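The question whether the first mover has a winning strategy can be illustrated on a tiny instance by exhaustive game-tree search. The sketch below is not the Oracle-based algorithm of this section; it is a plain brute-force search with memoization. It assumes the rule set is given as a list of triples $(r_1, r_2, r_3)$, that the game always terminates (guaranteed here by the hypothetical `line_rules`, which only allows jumps toward higher-numbered nodes on a line), and all names are invented for the example.

```python
from functools import lru_cache

def available_moves(state, rules):
    """Triples (r1, r2, r3) with stones on r1 and r2 and no stone on r3."""
    return [(r1, r2, r3) for (r1, r2, r3) in rules
            if state[r1 - 1] == 1 and state[r2 - 1] == 1 and state[r3 - 1] == 0]

def apply_move(state, move):
    """Flip x_{r1} and x_{r3}: the stone on r1 jumps over r2 onto r3."""
    r1, _r2, r3 = move
    new_state = list(state)
    new_state[r1 - 1] = 0
    new_state[r3 - 1] = 1
    return tuple(new_state)

def mover_wins(state, rules):
    """True if the player to move can force a win from `state`."""
    n = len(state)
    rules = tuple(rules)

    @lru_cache(maxsize=None)
    def win(s):
        for move in available_moves(s, rules):
            if move[2] == n:                  # stone placed on node n: immediate win
                return True
            if not win(apply_move(s, move)):  # opponent is left in a losing situation
                return True
        return False                          # no move, or every move loses

    return win(tuple(state))

def line_rules(n):
    """Assumed toy rule set: on a line, node i jumps over i+1 onto i+2."""
    return [(i, i + 1, i + 2) for i in range(1, n - 1)]

if __name__ == "__main__":
    print(mover_wins((1, 1, 0, 0, 0), line_rules(5)))
```

Even with memoization the search may visit every one of the $2^n$ situations, which is the source of the exponential factor in $T_c(n)$.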
6. Quantum Algorithm for Pebble Game
As we assume the Oracle $M_O$, we use a quantum Oracle $U_{M_O}$ which works in the same way as $M_O$. Here, we construct the following quantum algorithm:
Step1 Create a superposition of all situations.
Step2 For the superposition, apply the Oracle $U_{M_O}$.
Step3 We construct $W_i$ $(i = 1, 2, \dots, k(x_f))$ for the superposition.
Step4 If $x_s \in W_{k(x)}$, make the final qubit $|1\rangle$.
All situations are represented in binary form, so we can create a superposition of them using the Hadamard transformation. Step3 is achieved by a product of unitary gates. In Step4, the AND operation is constructed by unitary gates 9. If the final qubit of the superposition is $|1\rangle$, there exists a way of winning. Using the chaos amplifier, we obtain the result.
7. Computational Complexity
Here, we calculate the computational complexity of the quantum algorithm as the total number of fundamental gates. Step1 is constructed by $n$ Hadamard gates. Step2 is done by only one Oracle $U_{M_O}$. The computational complexity of Step3 has the same order as the classical algorithm. In Step4 the AND operation requires $n$ gates for $n$ qubits. The upper bound of $|W_{k(x)}|$ is $\binom{n}{m}$ where $m$ is the number of pebbles. Therefore, the computational complexity $T_Q(n)$ of the quantum algorithm of Pebble Problem2 is
$$T_Q(n) \sim \left\{ n + 1 + (4n)^3 + \binom{n}{m} \right\} \times \left[ \frac{5}{4}(n - 1) \right] \sim \mathrm{poly}(n)$$
where $\left[ \frac{5}{4}(n - 1) \right]$ is for chaos amplification to obtain the correct result.
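The two cost expressions can be evaluated directly for small inputs to compare their growth. The sketch below simply transcribes the classical estimate $(4n)^3 \times 2^n$ and the quantum gate-count estimate; reading the square bracket in the amplification factor as the integer part $\lfloor 5(n-1)/4 \rfloor$ is an assumption, and the function names are hypothetical.

```python
from math import comb

def classical_cost(n):
    """Classical estimate (4n)^3 * 2^n from the text."""
    return (4 * n) ** 3 * 2 ** n

def quantum_cost(n, m):
    """Quantum estimate {n + 1 + (4n)^3 + C(n, m)} * [5/4 (n - 1)].

    The bracket is taken as the integer part (an assumption of this sketch).
    """
    amplification_steps = (5 * (n - 1)) // 4
    return (n + 1 + (4 * n) ** 3 + comb(n, m)) * amplification_steps

if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        print(n, classical_cost(n), quantum_cost(n, 2))
```

With the number of pebbles $m$ held fixed, the binomial term is a polynomial in $n$, so the quantum estimate is dominated by the $(4n)^3$ term times the linear amplification factor.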
8. Conclusion
The computational complexity $T_c(n)$ of a classical algorithm for $n$-Pebble Game with the Oracle is
$$T_c(n) \sim (4n)^3 \times 2^n \sim \exp(n).$$
This is exponential time in the input size $n$. We constructed a quantum algorithm for this problem; its computational complexity $T_Q(n)$ is
$$T_Q(n) \sim \left\{ n + 1 + (4n)^3 + \binom{n}{m} \right\} \times \left[ \frac{5}{4}(n - 1) \right] \sim \mathrm{poly}(n).$$
In this study, we assume the Oracle $M_O$ and the unitary operator $U_{M_O}$ which works as $M_O$. Even if we use this Oracle, the computational complexity of the classical algorithm for Pebble Game is exponential, while that of the quantum algorithm is polynomial in $n$.
References
1. M. Ohya and I.V. Volovich, Quantum computing and chaotic amplification, J. Opt. B, 5, No. 6, 639-642, 2003.
2. M. Ohya and I.V. Volovich, New quantum algorithm for studying NP-complete problems, Rep. Math. Phys., 52, No. 1, 25-33, 2003.
3. M. Ohya and I.V. Volovich, Mathematical Foundation of Quantum Computers, Teleportations and Cryptography, to be published.
4. M. Ohya and N. Masuda, NP problem in Quantum Algorithm, Open Systems and Information Dynamics, 7, No. 1, 33-39, 2000.
5. L. Accardi and M. Ohya, A Stochastic limit approach to the SAT problem, Open Systems and Information Dynamics, 11-3, 219-233, 2004.
6. L. Accardi and R. Sabbadini, On the Ohya-Masuda quantum SAT Algorithm, Preprint Volterra, N. 432, 2000.
7. E. Bernstein and U. Vazirani, Quantum Complexity Theory, In Proc. 25th ACM Symp. on Theory of Computation, 11-20, 1993.
8. T. Kasai, A. Adachi, S. Iwata, Classes of Pebble Games and Complete Problems, SIAM J. Comput., Volume 8, Issue 4, pp. 574-586, 1979.
9. S. Iriyama and M. Ohya, Rigorous Estimate for OMV SAT Algorithm, Open Systems & Information Dynamics, 15, 2, 173-187, 2008.
10. S. Iriyama, M. Ohya and I.V. Volovich, Generalized Quantum Turing Machine and its Application to the SAT Chaos Algorithm, QP-PQ: Quantum Prob. White Noise Anal., Quantum Information and Computing, 19, World Sci. Publishing, 204-225, 2006.
11. S. Iriyama and M. Ohya, Language Classes Defined by Generalized Quantum Turing Machine, Open System and Information Dynamics, 15:4, 383-396, 2008.
12. S. Iriyama and M. Ohya, The problem to construct Unitary Quantum Turing Machine for compute partial recursive function, TUS preprint, 2009.
13. S. Iriyama and M. Ohya, Quantum Algorithm for Pebble Game and Its Computational Complexity, TUS preprint, 2010.
Quantum Bio-Informatics IV eds. L. Accardi, W. Freudenberg and M. Ohya © 2011 World Scientific Publishing Co. (pp. 185-197)
ON SUFFICIENT ALGEBRAIC CONDITIONS FOR IDENTIFICATION OF QUANTUM STATES
ANDRZEJ JAMIOLKOWSKI
Institute of Physics, Nicholas Copernicus University, 87-100 Torun, Poland
E-mail: [email protected]
The aim of this paper is to discuss the relationship between some problems of the identification of quantum states by geometric methods used in stroboscopic tomography and, on the other hand, by the algebraic approach typical for the quantum generalization of classical sufficient statistics. Some examples of necessary and sufficient conditions which must be fulfilled by generators of algebras in order to estimate states of quantum systems are discussed.
Keywords: algebras of observables; stroboscopic tomography; generators of subalgebras.
1. Introduction
One of the basic purpose of physical theories, in both classical and quantum sectors, is to describe events that are observed in Nature or in experiments conducted in laboratories. Usually, we have a physical system under investigation, and we try to obtain information about it by making some experiments. As results, measurement outcomes are registered. In fact, in many situations in classical systems, and in all cases in the quantum regime, we can not predict the individual measurement outcomes and we obtain only some probabilities of results. In other words, we obtain, as the outputs of experiments, only probability distributions on a set of possible measurement outcomes. The statistical theory of classical systems is based on classical probability theory. However, atoms and molecules obey the statistical laws of quantum mechanics and one has to develop a parallel theory based on quantum probability (noncommutative probability), which is an essential generalization of classical probability theory. In classical statistical physics we consider probability distributions as natural representation of states and random variables as representation of observables (physical quantities). In the description of microsystems it 185
186
is natural to start with the idea of observables as a quantum analogue of random variables and to define states as a derived concept. This means that we introduce quantum states through the concept of algebra of observables, taking the states to belong to the dual space. The algebra, furnished with the operation of conjugation (*-operation) is postulated to be a C*-algebra with identity. It can be considered as a natural generalization of the algebra of classical functions with *-operation given by complex conjugation, and whose real random variables are self-adjoint elements. A nice discussion of these issues is given by W. Thirring in his lecture notes 1. From the physical point of view, by a state we understand the description of statistical properties of a system when prepared repeatedly in the same way. In other words, one of the basic assumptions of quantum mechanics is based on the observation that determination of a completely unknown state can be achieved by appropriate measurements only if we have at our disposal a set of identically prepared copies of the system in question. From the mathematical point of view, we say that a state on a C*- algebra of observables is an assignment of a number (the expectation value) for every element of the algebra. This assignment should obey the natural laws of linearity and positivity. Thus, if we denote by A a C*algebra with identity (a set of observables) then a state on A is a linear map (1)
such that $\rho(a_1 Q_1 + a_2 Q_2) = a_1 \rho(Q_1) + a_2 \rho(Q_2)$ for all $a_1, a_2 \in \mathbb{C}$ and $Q_1, Q_2 \in \mathcal{A}$, and $\rho(Q) \geq 0$ for all positive $Q \in \mathcal{A}$. Usually, we also assume that $\rho(\mathbb{1}) = 1$. Moreover, to set up an effective approach to the above problem of state determination, one has to identify a collection of observables, a quorum 2, such that their expectation values contain complete information about the state of the system under consideration. In the standard formulation of quantum mechanics, we usually introduce a Hilbert space $\mathcal{H}$ associated with a given microsystem and we identify the algebra of observables $\mathcal{A}$ with the set of Hermitian elements of the Hilbert-Schmidt space $B(\mathcal{H})$. For any normalized vector $|\psi\rangle \in \mathcal{H}$ the formula
\[ \rho_\psi(Q) = \langle \psi | Q | \psi \rangle \tag{2} \]
defines a state on $\mathcal{A}$. Such a state is called a vector state. In general, the expectation values are given by convex combinations of vector states and we have the following equality
\[ \bar{\rho}(Q) = \mathrm{Tr}(\rho Q), \tag{3} \]
where $\rho$ denotes a state of the system in question. The problems of state determination have gained new relevance in recent years, following the realization that quantum systems and their evolutions can perform tasks such as teleportation, secure communication or dense coding (cf. e.g. 3,4). It is important to realize that if we identify the quorum of observables, then we can also determine the expectation values of physical quantities (observables) for which no measuring apparatuses are available 5. The idea of stroboscopic tomography for open quantum systems appeared for the first time in the early 1980s (although it was expressed in terms 6,7 different from the ones used presently). In particular, the question of the minimal number of observables $Q_1, \ldots, Q_\eta$ for which quantum states can be $(Q_1, \ldots, Q_\eta)$-reconstructible was discussed. Simultaneously, it was shown that reconstructibility of states in finite-dimensional systems can be achieved if a sequence of the so-called Krylov subspaces, which are defined by
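\[ \mathcal{K}_k(\mathbb{L}, Q) := \mathrm{span}\{Q, \mathbb{L}Q, \mathbb{L}^2 Q, \ldots, \mathbb{L}^{k-1} Q\}, \tag{4} \]
where $Q$ is a fixed observable and $\mathbb{L}$ is a generator of time evolution of the system in question, span the Hilbert-Schmidt space $B(\mathcal{H})$ (cf. Sect. 3 below). More precisely, if the following equality is satisfied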
\[ \bigoplus_{i=1}^{r} \mathcal{K}_\mu(\mathbb{L}, Q_i) = B(\mathcal{H}). \tag{5} \]
In the above equality $\mu$ denotes the degree of the minimal polynomial of the superoperator $\mathbb{L}$ and $Q_1, \ldots, Q_r$ represent fixed observables. The symbol $\oplus$ in (5) denotes the Minkowski sum of subspaces (cf. e.g. 8). We recall that for two subspaces $\mathcal{K}_1$ and $\mathcal{K}_2$ of a vector space $\mathcal{H}$, by $\mathcal{K}_1 \oplus \mathcal{K}_2$ one understands the smallest subspace of $\mathcal{H}$ which contains both $\mathcal{K}_1$ and $\mathcal{K}_2$. It is well known that the Krylov subspaces $\mathcal{K}_k(\mathbb{L}, Q)$ for $k = 1, 2, \ldots$ form a nested sequence of subspaces of increasing dimensions that eventually become invariant under $\mathbb{L}$. Hence for a given $Q$, there exists an index $\mu = \mu(Q)$, often called the grade of $Q$ with respect to $\mathbb{L}$, for which $\mathcal{K}_1(\mathbb{L}, Q)$ [...]

[...] 0, that is, $\rho$ represents a faithful state. It follows from (16) and (17) that the operator $\pi$ is the quantum analogue of a classical conditional probability. Indeed, in the case where $\mathcal{A}$ is an Abelian algebra, $\pi$ coincides with a classical conditional probability. Definition 3.1. An operator $\pi \in \mathcal{A}$ is called a quantum conditional probability operator (QCPO) if $\pi$ satisfies conditions (16) and (17).
One easily finds the following formula n
7r (log e4> -log p 0 0') = Tr (p~ 01) 7f
) if N is even, otherwise it may be slightly smaller than the minimum 1.
Figure 6. The sectors in the $(B, D)$ plane where the respective $\lambda_i$ are the lowest energy levels, marked with different shades of gray. The dashed line is the boundary between the regions I and II of Fig. 5 and it corresponds to the switch of the min h value, (11).
where $D' = \sqrt{D^2 + 2D + 9}$ and the corresponding eigenvectors, up to normalization, are
1'1/>1) = 10011) + l+D4_DI 10101) + 10110) + 11001) - 1+~+D/1101O) + 11100) and
$|\psi_2\rangle = |1110\rangle - |1101\rangle + |1011\rangle - |0111\rangle$,
The states 1'1/>1) and 1'1/>2) are entangled. We have restricted here the analysis to the B 2 halfplane; the image for negative B is fully analogous, the respective eigenvectors are obtained from the current ones by spin flipping. The corresponding level-crossing diagram, showing regions in the parameter plane where different Ai become the lowest energy levels, is presented in Fig. 6. Thus 1'1/>3) is the ground state in the region below the A2 = A3 line which is D = ~ - 2; it coincides with sector III of Fig. 5. As 1'1/>3) is separable, no entanglement detection by means of (12) is possible in this domain. Above the D = ~ - 2 line the ground state changes to 1'1/>2) and for still larger values of D - to 1'1/>1). Both these domains have nonempty intersections with sectors I and II of Fig. 5 (cf. the dashed parabola in Fig. 6 corresponding to the I-II border). This data allows one to complete the detailed construction of the entanglement witness based on H as prescribed in (12). We have conducted the
construction analytically with the use of Mathematica and we have applied the witness $W$ to a family of equilibrium states of (7),
\[ \varrho(\beta, B, D) = \frac{\exp(-\beta H)}{Z}, \qquad \beta = \frac{1}{kT}, \qquad Z = \mathrm{Tr}\exp(-\beta H), \tag{13} \]
for a range of positive temperatures. For $T$ close to 0 the mixture $\varrho$ is dominated by the terms corresponding to the lowest energies, say $\lambda_1 < \lambda_2 < \ldots$,
\[ \varrho = \frac{1}{Z}\left( e^{-\beta\lambda_1} |\psi_1\rangle\langle\psi_1| + e^{-\beta\lambda_2} |\psi_2\rangle\langle\psi_2| + \ldots \right), \tag{14} \]
and so $\varrho$ inherits mostly the entanglement of $|\psi_1\rangle$. Therefore, it is detected by $W$ which, by construction, is most sensitive precisely to the entanglement of the $|\psi_1\rangle$ type. However, one type of entanglement often present in thermal equilibrium states cannot be detected by witnesses of this kind. The separability of the ground state $|\psi_1\rangle$ of $H$, while making our witness construction faulty, does not exclude that the state (13) can nevertheless be entangled for a range of positive temperatures. This phenomenon, called thermally induced entanglement, happens when the first excited eigenstate $|\psi_2\rangle$ is entangled and its admixture shows up in $\varrho$ via the second term in (14) for certain $T > T_{\min}$. In Fig. 7 we collect the results of our tests of $W_H$. The plots (left column) show the value
\[ \max\{ -\mathrm{Tr}(W_H \varrho),\ 0 \} \]
for a range of $B$ and $T$ parameters. Separate plots correspond to different values of $D$; see also Fig. 8 for the location of these values on the eigenvalue level crossing diagram. We shall now describe how $H$ can be used in the construction of a genuine entanglement witness. In this case, $\Delta$ is the set of biseparable states. The argument that previously led us to the simplification (9-10) of the extremalization problem for $H$ is no longer valid, for the local components [...]
,= 1 +D2D'+ D' where D' = JD2 + 2D + 9. Since, > t for D ~ -2, also in the region (b) - (c) no improvement of (i is possible. In (d) and (d') our approximation reads (i = ,AI + (1-,)A2 and in (e) (i = ,AI + (1 -,)A4' Again the fact that, > for D ~ -2 excludes any improvement of (i in (d) or (d'). Finally, as the maximal squared overlap of 11J'!4) with states in ~ is ~ and , ~ ~ in (e), no better (i can be found as well in this case. Fig. 7 summarizes our computational results. Its left column, as mentioned earlier, contains plots of -Tr(WHI?) for the entanglement witness W H, and the right column - plots for the approximate genuine entanglement witness W H just constructed. The exemplary D values used are 1.0,0.25,0.0, -0.5. The reader may relate these plots with the location of D values on the level crossing diagram of Fig. 8: edges in the plots correspond to transitions between different sectors (a) - (e) with the change of B value.
Acknowledgment The author is greatly indebted to Professor Masanori Ohya and the QBIC Centre for support and hospitality during his visit at Tokyo University of Science in March 2010.
References 1. Michalski M., [in :] Quantum Bio-Informatics III, L. Accardi, W. Freudenberg, M. Ohya, eds., Quantum Probability and White Noise Analysis XXVI, p. 217, World Scientific, 2009. 2. Bennett Ch., et al., Phys. Rev. Lett. 82, 5385 (1999). 3. Collins D., et al., Phys. Rev. Lett. 88, 170405 (2002). 4. Diir W., J. 1. Cirac, R. Tarrach, Phys. Rev. Lett. 83, 3562 (1999). 5. Acin A., et al. , Phys. Rev. Lett. 87, 040401-1 (2001). 6. Bourennane M., et al., Phys. Rev. Lett. 92, 087902-1 (2004). 7. Horodecki M., P. Horodecki, R. Horodecki, Phys. Lett. A 223 , 1 (1996). 8. Lewenstein M., B. Kraus, J. 1. Cirac, P. Horodecki, Phys. Rev. A 62, 052310 (2000). 9. Terhal B., Phys. Lett. A 271 , 319 (2000). 10. T6th G. , Phys. Rev. A 71 , 010301(R) (2005). 11. Horodecki M., P. Horodecki, R. Horodrecki, Phys. Lett. A 283 ,1 (2001). 12. Eisert J., F. G. S. L. Brandao, K. M. Audenaert, New J. Phys. 9 , 46 (2007). 13. Eisert J., D. Gross, Multi-partite entanglement, [in] Lectures on quantum information, D. Bruss and G. Leuchs Eds., Wiley-VCH, Weinheim, 2006. 14. Horodecki R., P. Horodecki, M. Horodecki, K. Horodecki, Rev. Mod. Phys. 81, 865 (2009). 15. Acin A. , et al. , Phys. Rev. Lett . 85, 1560 (2000); quant-ph/0003050. 16. Acin A., A. Andrianov, E. Jane, R. Tarrach, J. Phys. A: Math. Gen. 34, 6725 (2001); quant-ph/0009107. 17. Michalski M., [in:] Quantum Bio-Informatics III, L. Accardi, W. Freudenberg, M. Ohya, eds., Quantum Probability and White Noise Analysis XXVI, p. 231, World Scientific, 2009. 18. Jamiolkowski A., Rep. Math. Phys. 3 , 275 (1972). 19. Carteret H. A., A. Higuchi, A. Sudbery, J. Math. Phys. 41 , 7932 (2000). 20. Verstraete F., J. Dehaene, B. De Moor, Phys. Rev. A 68, 012103 (2003). 21. Verstraete F., J. Dehaene, B. De Moor, H. Verschelde, Phys. Rev. A 65 , 052112 (2002). 22. Diir W., J. 1. Cirac, Phys. Rev. A 61, 042314 (2000). 23. Zhao M.-J., Z.-X. Wang, Rep. Math. Phys. 63 , 409 (2009). 24. Brukner C., V. Vedral, arXiv:quant-ph/0406040v1 (2004). 25. 
Vedral V., Open Sys. Information Dyn. 16, 287 (2009).
Quantum Bio-Informatics IV eds. L. Accardi, W. Freudenberg and M. Ohya © 2011 World Scientific Publishing Co. (pp. 255-265)
ON KADISON-SCHWARZ PROPERTY OF QUANTUM QUADRATIC OPERATORS ON M 2 (C)
FARRUKH MUKHAMEDOV*
ABDUAZIZ ABDUGANIEV+
Department of Computational & Theoretical Sciences, Faculty of Science, International Islamic University Malaysia, P.O. Box 141, 25710, Kuantan, Pahang, Malaysia
* E-mail: [email protected]; [email protected]
+ E-mail: [email protected]

In the present paper we first describe quantum quadratic operators (q.q.o.) acting on the algebra of $2 \times 2$ matrices $M_2(\mathbb{C})$. Moreover, we provide necessary conditions for q.q.o. with a Haar state to satisfy the Kadison-Schwarz condition. By means of such a description we give an example of a q.q.o. which is not a Kadison-Schwarz operator.

Keywords: quantum quadratic operators; Kadison-Schwarz operator.
1. Introduction
It is known that one of the main problems of quantum information is the characterization of positive and completely positive maps on C*-algebras. There are many papers devoted to this problem (see for example 4,10,18,19). In the literature the most tractable maps, the completely positive ones, have proved to be of great importance in the structure theory of C*-algebras. However, general positive (order-preserving) linear maps are very intractable 10,12. It is therefore of interest to study conditions stronger than positivity, but weaker than complete positivity. Such a condition is called the Kadison-Schwarz property, i.e. a map $\phi$ satisfies the Kadison-Schwarz property if $\phi(a)^*\phi(a) \leq \phi(a^*a)$ holds for every $a$. Note that every unital completely positive map satisfies this inequality, and a famous result of Kadison states that any positive unital map satisfies the inequality for self-adjoint elements $a$. In 17 relations between $n$-positivity of a map $\phi$ and the Kadison-Schwarz property of a certain map are established. Some nice properties of the Kadison-Schwarz maps were investigated in 16. The present paper is devoted to the Kadison-Schwarz property of a certain
class of operators on $M_2(\mathbb{C})$. It is known that the theory of Markov processes is a rapidly developing field with numerous applications to many branches of mathematics and physics. However, there are physical systems that cannot be described by Markov processes. One such system is given by quadratic stochastic operators (see 2), which relate to population genetics. The problem of studying the behavior of trajectories of quadratic stochastic operators was stated in 20. The limit behavior and ergodic properties of trajectories of quadratic stochastic operators were studied in 7,8,9. However, operators of this kind do not cover the case of quantum systems. Therefore, in 5,6 quantum quadratic operators acting on a von Neumann algebra were defined and studied. Certain ergodic properties of such operators were studied in 13,14. In the present paper we are going to study quantum quadratic operators (q.q.o.) with the Kadison-Schwarz property. Note that operators with this property need not be completely positive. The aim of the paper is to find some necessary conditions for trace-preserving quadratic operators to be Kadison-Schwarz ones, since trace-preserving maps arise naturally in quantum information theory (see e.g. 15) and in other situations in which one wishes to restrict attention to a quantum system that should properly be considered a subsystem of a larger system with which it interacts. Therefore, in Section 3 we describe q.q.o. with a Haar state (invariant with respect to the trace); namely, certain characterizations of q.q.o. that are Kadison-Schwarz operators are given. By means of such a description, in Section 4 we shall provide an example of a q.q.o. which is not a Kadison-Schwarz operator. It is worth mentioning that characterizations of positive and completely positive maps defined on $M_2(\mathbb{C})$ were considered in 10,11,19 and 18, respectively.
2. Preliminaries

In what follows, by $M_2(\mathbb{C})$ we denote the algebra of $2 \times 2$ matrices over the complex field $\mathbb{C}$. By $M_2(\mathbb{C}) \otimes M_2(\mathbb{C})$ we mean the tensor product of $M_2(\mathbb{C})$ with itself. We note that such a product can be considered as the algebra of $4 \times 4$ matrices $M_4(\mathbb{C})$ over $\mathbb{C}$. In the sequel $\mathbb{1}$ means the identity matrix, i.e. $\mathbb{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. By $S(M_2(\mathbb{C}))$ we denote the set of all states (i.e. linear positive functionals which take the value 1 at $\mathbb{1}$) defined on $M_2(\mathbb{C})$.
Definition 2.1. A linear operator $\Delta : M_2(\mathbb{C}) \to M_2(\mathbb{C}) \otimes M_2(\mathbb{C})$ is said to be
(a) a quantum quadratic operator (q.q.o.) if it satisfies the following conditions: (i) it is unital, i.e. $\Delta\mathbb{1} = \mathbb{1} \otimes \mathbb{1}$; (ii) $\Delta$ is positive, i.e. $\Delta x \geq 0$ whenever $x \geq 0$;
(b) a Kadison-Schwarz operator (KS) if it satisfies
\[ \Delta(x^*x) \geq \Delta(x^*)\Delta(x) \quad \text{for all } x \in M_2(\mathbb{C}). \tag{1} \]
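As a numerical illustration of inequality (1), the following sketch evaluates the smallest eigenvalue of $\Delta(x^*x) - \Delta(x^*)\Delta(x)$ for a single test matrix $x$. The two maps used here (`trace_map` and `transpose_map`) are hypothetical examples chosen for this illustration and are not the operators studied in the paper; the helper name `ks_gap` is likewise an assumption.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)

def ks_gap(delta, x):
    """Smallest eigenvalue of delta(x* x) - delta(x*) delta(x).

    A non-negative value (up to rounding) means inequality (1) holds for this x.
    """
    xs = x.conj().T
    diff = delta(xs @ x) - delta(xs) @ delta(x)
    diff = (diff + diff.conj().T) / 2  # remove rounding asymmetry
    return float(np.linalg.eigvalsh(diff).min())

def trace_map(x):
    # x -> tau(x) 1 (x) 1, tau the normalized trace: unital and completely positive
    return (np.trace(x) / 2) * np.eye(4, dtype=complex)

def transpose_map(x):
    # x -> x^T (x) 1: unital and positive, but not completely positive
    return np.kron(x.T, I2)

x = np.array([[0, 1], [0, 0]], dtype=complex)
gap_trace = ks_gap(trace_map, x)          # inequality (1) holds for this x
gap_transpose = ks_gap(transpose_map, x)  # inequality (1) fails for this x
print(gap_trace, gap_transpose)
```

A negative value for one $x$ is enough to show that a map is not KS, whereas a non-negative value for one $x$ proves nothing in general; a proof has to cover all $x \in M_2(\mathbb{C})$.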
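Note that if $\Delta$ is a unital KS operator, then it is a q.q.o. A state $h \in S(M_2(\mathbb{C}))$ is called a Haar state for a q.q.o. $\Delta$ if for every $x \in M_2(\mathbb{C})$ one has
\[ (h \otimes \mathrm{id}) \circ \Delta(x) = (\mathrm{id} \otimes h) \circ \Delta(x) = h(x)\mathbb{1}. \tag{2} \]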
Remark 2.1. Let $U : M_2(\mathbb{C}) \otimes M_2(\mathbb{C}) \to M_2(\mathbb{C}) \otimes M_2(\mathbb{C})$ be a linear operator such that $U(x \otimes y) = y \otimes x$ for all $x, y \in M_2(\mathbb{C})$. If a q.q.o. $\Delta$ satisfies $U\Delta = \Delta$, then $\Delta$ is called a quantum quadratic stochastic operator. Operators of this kind were studied in 13,14.
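3. Quantum quadratic operators with Kadison-Schwarz property on $M_2(\mathbb{C})$

In this section we are going to describe quantum quadratic operators on $M_2(\mathbb{C})$ as well as find necessary conditions for such operators to satisfy the Kadison-Schwarz property. Recall 3 that the identity and Pauli matrices $\{\mathbb{1}, \sigma_1, \sigma_2, \sigma_3\}$ form a basis for $M_2(\mathbb{C})$, where
\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]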
In this basis every matrix $x \in M_2(\mathbb{C})$ can be written as $x = w_0\mathbb{1} + w\sigma$ with $w_0 \in \mathbb{C}$, $w = (w_1, w_2, w_3) \in \mathbb{C}^3$; here $w\sigma = w_1\sigma_1 + w_2\sigma_2 + w_3\sigma_3$.

Lemma 3.1. 18 The following assertions hold true:
(a) $x$ is self-adjoint iff $w_0, w$ are real;
(b) $\mathrm{Tr}(x) = 1$ iff $w_0 = 0.5$, where Tr is the trace of a matrix $x$;
(c) $x \geq 0$ iff $\|w\| \leq w_0$, where $\|w\| = \sqrt{|w_1|^2 + |w_2|^2 + |w_3|^2}$.
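The coordinates $(w_0, w)$ of the lemma can be computed from $x$ via $w_0 = \frac{1}{2}\mathrm{Tr}(x)$ and $w_i = \frac{1}{2}\mathrm{Tr}(\sigma_i x)$, which follows from $\mathrm{Tr}(\sigma_i\sigma_j) = 2\delta_{ij}$. The sketch below, with the hypothetical helper name `pauli_coordinates` and an arbitrarily chosen test matrix, recovers the coordinates and compares the smallest eigenvalue of a self-adjoint $x$ with $w_0 - \|w\|$, which is the quantity behind assertion (c).

```python
import numpy as np

SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def pauli_coordinates(x):
    """Return (w0, w) such that x = w0*1 + w1*sigma1 + w2*sigma2 + w3*sigma3."""
    w0 = np.trace(x) / 2
    w = np.array([np.trace(s @ x) / 2 for s in SIGMA])
    return w0, w

# An arbitrary self-adjoint test matrix with trace one.
x = np.array([[0.7, 0.1 - 0.2j], [0.1 + 0.2j, 0.3]])
w0, w = pauli_coordinates(x)

rebuilt = w0 * np.eye(2) + sum(wi * s for wi, s in zip(w, SIGMA))
norm_w = float(np.linalg.norm(w))
min_eig = float(np.linalg.eigvalsh(x).min())
print(w0, w, norm_w, min_eig)
```

For a self-adjoint $x$ the eigenvalues are $w_0 \pm \|w\|$, so positivity is equivalent to $\|w\| \leq w_0$.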
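Note that any state $\varphi \in S(M_2(\mathbb{C}))$ can be represented by
\[ \varphi(w_0\mathbb{1} + w\sigma) = w_0 + \langle w, f \rangle, \tag{3} \]
where $f = (f_1, f_2, f_3) \in \mathbb{R}^3$ with $\|f\| \leq 1$. Here, as before, $\langle \cdot, \cdot \rangle$ stands for the scalar product in $\mathbb{C}^3$. Therefore, in the sequel we will identify a state $\varphi$ with a vector $f \in \mathbb{R}^3$. In what follows by $\tau$ we denote the normalized trace, i.e. $\tau(x) = \frac{1}{2}\mathrm{Tr}(x)$, $x \in M_2(\mathbb{C})$.

Let $\Delta : M_2(\mathbb{C}) \to M_2(\mathbb{C}) \otimes M_2(\mathbb{C})$ be a q.q.o. with a Haar state $\tau$. Let us write the operator $\Delta$ in terms of a basis in $M_2(\mathbb{C}) \otimes M_2(\mathbb{C})$ formed by the Pauli matrices. Namely, $\Delta\mathbb{1} = \mathbb{1} \otimes \mathbb{1}$;
\[ \Delta(\sigma_i) = b_i(\mathbb{1} \otimes \mathbb{1}) + \sum_{j=1}^{3} b^{(1)}_{ji}(\mathbb{1} \otimes \sigma_j) + \sum_{j=1}^{3} b^{(2)}_{ji}(\sigma_j \otimes \mathbb{1}) + \sum_{m,l=1}^{3} b_{ml,i}(\sigma_m \otimes \sigma_l), \]
where $i = 1, 2, 3$. By means of the Haar equality with $\tau$ (see (2)) we find that $b_i = 0$ and $b^{(1)}_{ji} = b^{(2)}_{ji} = 0$ for every $i, j \in \{1, 2, 3\}$. Positivity of $\Delta$ implies that all numbers $\{b_{ij,k}\}$ are real. Hence, one can prove the following

Theorem 3.1. Let $\Delta : M_2(\mathbb{C}) \to M_2(\mathbb{C}) \otimes M_2(\mathbb{C})$ be a q.q.o. with a Haar state $\tau$; then it has the following form:
\[ \Delta(x) = w_0\,\mathbb{1} \otimes \mathbb{1} + \sum_{m,l=1}^{3} \langle b_{ml}, \overline{w} \rangle\, \sigma_m \otimes \sigma_l, \tag{4} \]
where $x = w_0\mathbb{1} + w\sigma$, $b_{ml} = (b_{ml,1}, b_{ml,2}, b_{ml,3})$.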
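The form (4) can be explored numerically. The sketch below builds a linear map of the form (4) from an array of real coefficients and checks unitality and the Haar equality (2) for the normalized trace $\tau$. The coefficient values are random placeholders (an assumption made only for illustration), the helper names `make_delta`, `tau_left` and `tau_right` are hypothetical, and nothing here checks positivity, which requires further conditions on the coefficients.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def make_delta(b):
    """b[m, l, i] plays the role of b_{ml,i}; returns x -> Delta(x) as in (4)."""
    def delta(x):
        w0 = np.trace(x) / 2
        w = np.array([np.trace(s @ x) / 2 for s in SIGMA])
        out = w0 * np.kron(I2, I2)
        for m in range(3):
            for l in range(3):
                # b[m, l] @ w = sum_i b_{ml,i} w_i
                out = out + (b[m, l] @ w) * np.kron(SIGMA[m], SIGMA[l])
        return out
    return delta

def tau_left(X):
    """(tau (x) id)(X) for a 4x4 matrix X."""
    return np.einsum('ijil->jl', X.reshape(2, 2, 2, 2)) / 2

def tau_right(X):
    """(id (x) tau)(X) for a 4x4 matrix X."""
    return np.einsum('ijkj->ik', X.reshape(2, 2, 2, 2)) / 2

rng = np.random.default_rng(0)
b = rng.uniform(-0.1, 0.1, size=(3, 3, 3))  # placeholder real coefficients
delta = make_delta(b)
x = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
tau_x = np.trace(x) / 2

unital_ok = np.allclose(delta(I2), np.eye(4))
haar_ok = (np.allclose(tau_left(delta(x)), tau_x * I2)
           and np.allclose(tau_right(delta(x)), tau_x * I2))
print(unital_ok, haar_ok)
```

Both checks succeed for any choice of the coefficients because $\tau(\sigma_m) = 0$, which is exactly why the terms with $b_i$, $b^{(1)}_{ji}$ and $b^{(2)}_{ji}$ must vanish.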
Let us turn to the positivity of $\Delta$. Given a vector $f = (f_1, f_2, f_3) \in \mathbb{R}^3$ put
\[ \beta(f)_{ij} = \sum_{k=1}^{3} b_{ki,j} f_k. \tag{5} \]
Define a matrix $\mathbb{B}(f) = (\beta(f)_{ij})_{i,j=1}^{3}$. By $\|\mathbb{B}(f)\|$ we denote the norm of the matrix $\mathbb{B}(f)$ associated with the Euclidean norm in $\mathbb{C}^3$. Put
\[ S = \{p = (p_1, p_2, p_3) \in \mathbb{R}^3 : p_1^2 + p_2^2 + p_3^2 \leq 1\} \]
and denote
\[ |||\mathbb{B}||| = \sup_{f \in S} \|\mathbb{B}(f)\|. \]
Now given a state $\varphi$ [...]
589 17496
which means that (15) is not satisfied. Hence, $\Delta_\varepsilon$ is not a KS-operator at $\varepsilon = 1/3$. $\square$

Now we are going to show that condition (6) is necessary for positivity of $\Delta$. Let us again consider the operator $\Delta_\varepsilon$. If $|\varepsilon| \leq \frac{1}{\sqrt{3}}$ is satisfied, then (6) holds. Indeed,
t,
1iti bij,kfiPj 12
=
E2(lhpi +Ihpi
+ hp2 + hp31 2 + Ihpi + hp2 + hP31 2
+ hp2 + hp31 2)
:s: E2((j~ + fi + f~)(pi + p~ + p~) +(j~
+ fi + ff)(pi + p~ + p~) +(pi + p~ + p~)(fi + f~ + f~))
:s: 10 2 (1 + 1 + 1) = 310 2 :s: l. From this and Theorem 4.1 we conclude that if operator .6.", is not positive, while (6) is satisfied.
10
E (~, ~) then the
Acknowledgement

The first author (F.M.) would like to thank Professor Noboru Watanabe and Professor Masanori Ohya for their kind hospitality during ICQBIC 2010. This work is partially supported by the Malaysian Ministry of Science, Technology and Innovation Grant 01-01-08-SF0079 and by IIUM Research Endowment Grant B (EDW B 0905-303).

References
1. S. N. Bernstein, Uchen. Zapiski NI Kaf. Ukr. Otd. Mat. 1924, no. 1, 83-115. (Russian)
2. L. Boltzmann, Selected works, Nauka, Moscow 1984. (Russian)
3. O. Bratteli, D. W. Robinson, Operator algebras and quantum statistical mechanics. I, Springer, New York-Heidelberg-Berlin 1979.
4. M-D. Choi, Lin. Alg. Appl. 10 (1975), 285-290.
5. N. N. Ganikhodzhaev, F. M. Mukhamedov, Uzb. Matem. Zh. 1997, no. 3, 8-20. (Russian)
6. N. N. Ganikhodzhaev, F. M. Mukhamedov, Izv. Math. 65 (2000), 873-890.
7. H. Kesten, Adv. in Appl. Probab. 2 (1970), 1-82, 179-228.
8. Yu. I. Lyubich, Russian Math. Surveys 26 (1971).
9. Yu. I. Lyubich, Mathematical structures in population genetics, Springer, Berlin 1992.
10. W. A. Majewski, M. Marciniak, J. Phys. A: Math. Gen. 34 (2001), 5863-5874.
11. W. A. Majewski, M. Marciniak, arXiv:0705.0798.
12. W. A. Majewski, J. Phys. A: Math. Gen. 40 (2007), 11539-11545.
13. F. M. Mukhamedov, Method of Funct. Anal. and Topology 7 (2001), no. 1, 63-75.
14. F. M. Mukhamedov, Izvestiya Math. 68 (2004), 1009-1024.
15. M. A. Nielsen, I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, Cambridge, 2000.
16. A. G. Robertson, Math. Z. 156 (1977), 205-206.
17. A. G. Robertson, Math. Proc. Camb. Philos. Soc. 94 (1983), 291-296.
18. M. B. Ruskai, S. Szarek, E. Werner, Lin. Alg. Appl. 347 (2002), 159-187.
19. E. Stormer, Acta Math. 110 (1963), 233-278.
20. S. M. Ulam, A collection of mathematical problems, Interscience, New York-London 1960.
Quantum Bio-Informatics IV eds. L. Accardi, W . Freudenberg and M. Ohya © 2011 World Scientific Publishing Co. (pp. 267-278)
ON PHASE TRANSITIONS IN QUANTUM MARKOV CHAINS ON CAYLEY TREE
LUIGI ACCARDI, FARRUKH MUKHAMEDOV* and MANSOOR SABUROV+
Centro Interdisciplinare Vito Volterra, II Università di Roma "Tor Vergata", Via Columbia 2, 00133 Roma, Italy
E-mail: [email protected]
Department of Computational & Theoretical Sciences, Faculty of Science, International Islamic University Malaysia, P.O. Box 141, 25710, Kuantan, Pahang, Malaysia
* E-mail: [email protected]; farrukh_m @iiu.edu.my
+ E-mail: [email protected]

In the present paper we continue our investigations started in [Accardi L., Ohno H., Mukhamedov F., Quantum Markov fields on graphs, Inf. Dim. Analysis, Quantum Probab. Related Topics (accepted), arXiv:0911.1667]. In [Accardi L., Mukhamedov F., Saburov M., On Quantum Markov chains on Cayley tree and associated chains with XY-model, arXiv:1004.3623] we provided a construction of forward and backward Quantum Markov Chains (QMC) defined on the Cayley tree, and established uniqueness of the QMC associated with the XY-model on a Cayley tree of order 2. In the present paper we study the same model on a Cayley tree of order 3. Surprisingly, in this case we establish a phase transition (i.e. existence of two distinct quantum Markov chains) for the considered model on the Cayley tree of order 3.
Keywords: quantum Markov chain; Cayley tree; phase transition.
1. Introduction
Nowadays, it is known that Markov fields play an important role in classical probability, in physics, in biological and neurological models and in an increasing number of technological problems such as image recognition. Therefore, it is quite natural to forecast that the quantum analogue of these models will also play a relevant role. The quantum analogues of Markov processes were first constructed in 1, where the notion of quantum Markov
chain on infinite tensor product algebras was introduced. Nowadays, quantum Markov chains have become a standard computational tool in solid state physics, and several natural applications have emerged in quantum statistical mechanics and quantum information theory. The reader is referred to 3,5,6,16,21 and the references cited therein for recent developments of the theory and the applications. First attempts to construct a quantum analogue of classical Markov fields were made in 4,6,9,17. These papers extend to fields the notion of quantum Markov state introduced in 8 as a sub-class of the quantum Markov chains introduced in 1. In 7 a definition of quantum Markov states and chains was proposed, which extends the one proposed in 20 and includes all the presently known examples. Note that in the mentioned papers quantum Markov fields were considered over the multidimensional integer lattice $\mathbb{Z}^d$. This lattice satisfies the so-called amenability condition. Therefore, it is natural to investigate quantum Markov fields over non-amenable lattices. One of the simplest non-amenable lattices is a Cayley tree. First attempts to investigate quantum Markov chains over such trees were made in 12; such studies were related to the investigation of the thermodynamic limit of valence-bond-solid models on a Cayley tree 14. Finitely correlated states were constructed there as ground states of that model. The mentioned considerations naturally suggest the study of the following problem: the extension to fields of the notion of generalized Markov chain. In 10 we introduced a hierarchy of notions of Markovianity for states on discrete infinite tensor products of C*-algebras and for each of these notions we constructed some explicit examples. We showed that the construction of 8 can be generalized to trees. It is worth noting that, in a different context and for quite different purposes, the special role of trees was already emphasized in 17.
Note that noncommutative extensions of classical Markov fields, associated with Ising and Potts models on a Cayley tree, were investigated in 18,19. In the classical case, Markov fields on trees are also considered in 22-26. In the present paper we continue our investigations started in 10,11. Using the construction (see 11) of forward and backward Quantum Markov Chains (QMC) defined on the Cayley tree, we investigate the QMC associated with the XY-model on a Cayley tree of order 3. We establish a phase transition for the XY-model on the Cayley tree of order 3 in a QMC scheme. Note that the classical XY-model on a Cayley tree has been investigated by many authors 13.
2. Preliminaries
Recall that a Cayley tree $\Gamma^k$ of order $k \geq 1$ is an infinite tree in which each vertex has exactly $k + 1$ edges. The vertices $x$ and $y$ are called nearest neighbors, denoted by $l = \langle x, y \rangle$, if there exists an edge connecting them. A collection of pairs $\langle x, x_1 \rangle, \ldots, \langle x_{d-1}, y \rangle$ is called a path from the point $x$ to the point $y$. The distance $d(x, y)$, $x, y \in V$, on the Cayley tree is the length of the shortest path from $x$ to $y$. If we cut away an edge $\{x, y\}$ of the tree $\Gamma^k$, then $\Gamma^k$ splits into connected components, called semi-infinite trees with roots $x$ and $y$, which will be denoted respectively by $\Gamma^k(x)$ and $\Gamma^k(y)$. If we cut away from $\Gamma^k$ the origin $0$ together with all $k + 1$ nearest neighbor vertices, we obtain $k + 1$ semi-infinite trees $\Gamma^k(x)$ with $x \in S_0 = \{y \in \Gamma^k : d(0, y) = 1\}$. Hence we have
\[ \Gamma^k = \bigcup_{x \in S_0} \Gamma^k(x) \cup \{0\}. \]
Therefore, in the sequel we will consider the semi-infinite Cayley tree $\Gamma^k_+ = (L, E)$ with the root $x^0$, where $L$ is the set of vertices and $E$ is the set of edges. Now we are going to introduce a coordinate structure in $\Gamma^k_+$ as follows: every vertex $x$ (except for $x^0$) of $\Gamma^k_+$ has coordinates $(i_1, \ldots, i_n)$, where $i_m \in \{1, \ldots, k\}$, $1 \leq m \leq n$, and for the vertex $x^0$ we put $(0)$. Namely, the symbol $(0)$ constitutes level 0, and the sites $(i_1, \ldots, i_n)$ form level $n$ (i.e. $d(x^0, x) = n$) of the lattice. Let us set
\[ W_n = \{x \in L : d(x, x^0) = n\}, \qquad \Lambda_n = \bigcup_{k=0}^{n} W_k, \qquad \Lambda_{[n,m]} = \bigcup_{k=n}^{m} W_k \quad (n < m), \]
\[ E_n = \{\langle x, y \rangle \in E : x, y \in \Lambda_n\}, \qquad \Lambda_n^c = \bigcup_{k=n}^{\infty} W_k. \]
For $x \in \Gamma^k_+$, $x = (i_1, \ldots, i_n)$, denote
\[ S(x) = \{(x, i) : 1 \leq i \leq k\}, \]
where $(x, i)$ means $(i_1, \ldots, i_n, i)$. This set is called the set of direct successors of $x$. From these one can see that
\[ \Lambda_m = \Lambda_{m-2} \cup \Bigl( \bigcup_{x \in W_{m-1}} \bigl(\{x\} \cup S(x)\bigr) \Bigr), \tag{1} \]
\[ E_m \setminus E_{m-1} = \bigcup_{x \in W_{m-1}} \bigcup_{y \in S(x)} \{\langle x, y \rangle\}. \tag{2} \]
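The coordinate structure above is easy to mirror in code. The following sketch represents vertices of the semi-infinite tree as tuples, with the root written as `(0,)`, and checks relation (1) and the number of edges in (2) for one choice of $k$ and $m$. The function names `level`, `successors` and `ball` are hypothetical and introduced only for this illustration.

```python
from itertools import product

def level(k, n):
    """W_n: vertices at distance n from the root, as coordinate tuples."""
    if n == 0:
        return [(0,)]
    return [tuple(c) for c in product(range(1, k + 1), repeat=n)]

def successors(x, k):
    """S(x): the k direct successors (x, i), 1 <= i <= k."""
    base = () if x == (0,) else x
    return [base + (i,) for i in range(1, k + 1)]

def ball(k, n):
    """Lambda_n: the union of the levels W_0, ..., W_n."""
    return {x for j in range(n + 1) for x in level(k, j)}

k, m = 3, 4

# Right-hand side of (1): Lambda_{m-2} together with {x} and S(x) for x in W_{m-1}.
rhs = ball(k, m - 2)
for x in level(k, m - 1):
    rhs |= {x} | set(successors(x, k))

# Edges added when passing from E_{m-1} to E_m, as in (2).
new_edges = [(x, y) for x in level(k, m - 1) for y in successors(x, k)]

print(rhs == ball(k, m), len(level(k, m)), len(new_edges))
```

In this representation level $n$ has $k^n$ vertices, and each step from $E_{m-1}$ to $E_m$ adds $k^m$ edges.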
The algebra of observables $\mathcal{B}_x$ for any single site $x \in L$ will be taken as the algebra $M_d$ of complex $d \times d$ matrices. The algebra of observables localized in a finite volume $\Lambda \subset L$ is then given by $\mathcal{B}_\Lambda = \bigotimes_{x \in \Lambda} \mathcal{B}_x$. As usual, if $\Lambda_1 \subset \Lambda_2 \subset L$, then $\mathcal{B}_{\Lambda_1}$ is identified with a subalgebra of $\mathcal{B}_{\Lambda_2}$ by tensoring with unit matrices on the sites $x \in \Lambda_2 \setminus \Lambda_1$. Note that, in the sequel, by $\mathcal{B}_{\Lambda,+}$ we denote the positive part of $\mathcal{B}_\Lambda$. The full algebra $\mathcal{B}_L$ of the tree is obtained in the usual manner by an inductive limit
One can see that the shift induces a homomorphism on $\mathcal{B}_L$. In what follows, by $S(\mathcal{B}_\Lambda)$ we will denote the set of all states defined on the algebra $\mathcal{B}_\Lambda$. Consider a triplet $\mathcal{C} \subset \mathcal{B} \subset \mathcal{A}$ of unital C*-algebras. Recall that a quasi-conditional expectation with respect to the given triplet is a completely positive (CP) identity-preserving linear map $\mathcal{E} : \mathcal{A} \to \mathcal{B}$ such that
\[ \mathcal{E}(ca) = c\,\mathcal{E}(a), \qquad a \in \mathcal{A},\ c \in \mathcal{C}. \tag{3} \]
Notice that, as the quasi-conditional expectation $\mathcal{E}$ is a real map, one has $\mathcal{E}(ac) = \mathcal{E}(a)c$, $a \in \mathcal{A}$, $c \in \mathcal{C}$, as well.
Definition 2.1. Let cp be a state on E L . Then cp is called (i) a forward quantum d-Markov chain (QMC) , associated to {An}, on EL if for each An, there exist a quasi-conditional expectation EAi. with respect to the triplet
(4) and a state
such that for any n E N one has
(5) and
(6) in the weak- * topology.
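(ii) a backward quantum d-Markov chain, associated to $\{\Lambda_n\}$, if there exist a quasi-conditional expectation $\mathcal{E}_{\Lambda_n}$ with respect to the triple $\mathcal{B}_{\Lambda_{n-1}} \subset \mathcal{B}_{\Lambda_n} \subset \mathcal{B}_{\Lambda_{n+1}}$ for each $n \in \mathbb{N}$ and an initial state $\rho_0 \in S(\mathcal{B}_{\Lambda_0})$ such that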
in the weak-* topology. In this definition, forward and backward QMC
[...] $\in E$ of the tree is assigned an operator $K_{\langle x,y \rangle} \in \mathcal{B}_{\{x,y\}}$. We would like to define a state on $\mathcal{B}_{\Lambda_n}$ with boundary conditions $w_0 \in \mathcal{B}_{(0),+}$ and $h = \{h_x \in \mathcal{B}_{x,+}\}_{x \in L}$. To do this, let us denote
\[ K_n = w_0^{1/2} K_{[0,1]} K_{[1,2]} \cdots K_{[n-1,n]} \prod_{x \in W_n} h_x^{1/2}, \tag{7} \]
where by definition we put
\[ K_{[0,1]} := \prod_{i=1}^{k} K_{\langle x^0, (x^0, i) \rangle}, \qquad K_{[m-1,m]} := \prod_{x \in W_{m-1}} \prod_{i=1}^{k} K_{\langle x, (x, i) \rangle}, \quad m \geq 1. \tag{8} \]
Since, in general, the operators $K_{\langle x,y \rangle}$ do not commute (they share a vertex), the product over $i$ of the operators $K_{\langle x, (x,i) \rangle}$ is actually an ordered product, i.e. we multiply those operators from left to right in the given coordinate structure. Define
(9) It is clear that Wnj, Wnj are positive.
In what follows, by TrA : BL - f BA we mean normalized partial trace, for any A oo 'P wQ ,h . If hx is invertible for all x E L , then 'P(b) h is a backward QMC on B L . WQ,
On the other hand, it is known 8 that $\{\varphi^{(n,f)}_{w_0,h}\}$ satisfy the compatibility condition if the sequence $\{W_{n]}\}$ is projective with respect to $\mathrm{Tr}_{n]}$, i.e.
\[ \mathrm{Tr}_{n-1]}(W_{n]}) = W_{n-1]}, \qquad \forall n \in \mathbb{N}. \tag{15} \]
One has the following

Theorem 3.2. 11 Let the boundary conditions $w_0 \in \mathcal{B}_{(0),+}$ and $h = \{h_x \in \mathcal{B}_{x,+}\}_{x \in L}$ satisfy (13) and (14). Then $\{W_{n]}\}$ is a projective sequence of density operators, i.e. there is a unique forward QMC $\varphi^{(f)}_{w_0,h}$ on $\mathcal{B}_L$ such that $\varphi^{(f)}_{w_0,h} = w\text{-}\lim_{n \to \infty} \varphi^{(n,f)}_{w_0,h}$.
Definition 3.1. We say that there exists a phase transition for a family of operators $\{K_{\langle x,y \rangle}\}$ if (13) and (14) have at least two solutions $(w_0, \{h_x\}_{x \in L})$ and $(\tilde{w}_0, \{\tilde{h}_x\}_{x \in L})$ such that the corresponding QMC $\varphi_{w_0,h}$ and $\varphi_{\tilde{w}_0,\tilde{h}}$ are disjoint. Otherwise, we say there is no phase transition.

4. Quantum d-Markov chains associated with XY-model
In this section, we prove the existence of a phase transition for the quantum d-Markov chain associated with the XY-model on a Cayley tree of order three. Let us consider a semi-infinite Cayley tree $\Gamma^3_+ = (L, E)$ of order 3. Our starting C*-algebra is the same $\mathcal{B}_L$ but with $\mathcal{B}_x = M_2(\mathbb{C})$ for $x \in L$. By $\sigma_x^{(u)}, \sigma_y^{(u)}, \sigma_z^{(u)}$ we denote the Pauli spin operators at site $u \in L$. Here
\[ \sigma_x^{(u)} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y^{(u)} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z^{(u)} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{16} \]
For every edge $\langle u, v \rangle \in E$ put
\[ K_{\langle u,v \rangle} = \exp\{\beta H_{\langle u,v \rangle}\}, \qquad \beta > 0, \tag{17} \]
where
\[ H_{\langle u,v \rangle} = \frac{1}{2}\left( \sigma_x^{(u)}\sigma_x^{(v)} + \sigma_y^{(u)}\sigma_y^{(v)} \right). \tag{18} \]
Now taking into account the following equalities
\[ H^{2m}_{\langle u,v \rangle} = H^{2}_{\langle u,v \rangle} = \frac{1}{2}\left( \mathbb{1} - \sigma_z^{(u)}\sigma_z^{(v)} \right), \qquad H^{2m-1}_{\langle u,v \rangle} = H_{\langle u,v \rangle}, \qquad m \in \mathbb{N}, \]
one finds
\[ K_{\langle u,v \rangle} = \mathbb{1} + \sinh\beta\, H_{\langle u,v \rangle} + (\cosh\beta - 1) H^{2}_{\langle u,v \rangle}. \]
(19) for every x E L. After a little algebra the equation (19) reduces to the following system {
2 - a(n-l) + A 2 a(n)la(n)1 11 12 11 Bllai~)I(ai~))2 + Allai~)13 = lai~-I)1 (n))3 B 2 (a 11
(20)
274
where Al = sinh 3 ;3 cosh;3, BI = sinh;3 cosh 2 ;3(1 + cosh;3 + cosh 2 ;3), (21) A2 = sinh2 ;3 cosh 2 ;3(1 + 2 cosh ;3), B2 = cosh 6 ;3, (22)
= hen) = hen) = z v
hen) y
(n) ( all (n)
(n))
l2 a (n) a 21 a 22
.
Remark 4.1. Note that according to positivity of h~n) and ai';) = a~~) we conclude that ai~) > lai;) I for all n E N.
Now we are going to investigate the derived system (20). To do this, let us define a mapping $f : (x, y) \in \mathbb{R}^2_+ \to (x', y') \in \mathbb{R}^2_+$ given by
\[ \begin{cases} B_2 (x')^3 + A_2\, x' (y')^2 = x, \\ B_1 (x')^2 y' + A_1 (y')^3 = y. \end{cases} \tag{23} \]
Furthermore, due to Remark 4.1, we restrict the dynamical system (23) to the following domain
\[ \Delta = \{(x, y) \in \mathbb{R}^2_+ : x > y\}. \]
Denote
\[ P_9(t) = t^9 - t^8 - t^7 - t^6 + 2t^4 + 2t^3 - t - 1, \tag{24} \]
\[ E := \frac{1}{\sinh^2\beta \cosh^2\beta\,(1 + 2\cosh\beta) + D \cosh^6\beta}, \tag{25} \]
\[ D := \frac{A_2 - A_1}{B_1 - B_2}. \tag{26} \]
Further, we will need the following auxiliary fact:
Lemma 4.1. Let $A_1, B_1, A_2, B_2, D$ be the numbers defined by (21), (22), (26) and let $P_9(t)$ be the polynomial given by (24), where $\beta > 0$. Then the following statements hold true:
(i) The polynomial $P_9(t)$ has only three positive roots $1$, $t_*$ and $t^*$ such that $1.05 < t_* < 1.1$ and $1.5 < t^* < 1.6$. Moreover, if $t \in (1, t_*) \cup (t^*, \infty)$ then $P_9(t) > 0$, and if $t \in (t_*, t^*)$ then $P_9(t) < 0$. Denote $\beta_* = \cosh^{-1} t_*$ and $\beta^* = \cosh^{-1} t^*$;
(ii) For any $\beta \in (0, \infty)$ we have $A_1 < A_2$;
(iii) If $\beta \in (0, \beta_*] \cup [\beta^*, \infty)$ then $B_1 \le B_2$, and if $\beta \in (\beta_*, \beta^*)$ then $B_1 > B_2$;
(iv) For any $\beta \in (0, \infty)$ we have $A_1 + B_1 < A_2 + B_2$;
(v) If $\beta \in (\beta_*, \beta^*)$ then $D > 1$ and $E > 0$;
(vi) For any $\beta \in (0, \infty)$ we have $A_1 A_2 < B_1 B_2$ and $A_1 B_2 < A_2 B_1$;
(vii) If $\beta \in (\beta_*, \beta^*)$ then $A_2 B_1 < A_1 A_2 + 3 A_1 B_2 + B_1 B_2$ and $2 A_1 A_2 + 3 A_1 B_2 < A_2 B_1$.
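The root locations claimed in Lemma 4.1 (i) can be checked numerically. The following is a minimal sketch, assuming the polynomial reads $P_9(t) = t^9 - t^8 - t^7 - t^6 + 2t^4 + 2t^3 - t - 1$; the helper names `p9` and `bisect_root` are illustrative and not part of the original text.

```python
import math


def p9(t):
    """Polynomial (24): t^9 - t^8 - t^7 - t^6 + 2t^4 + 2t^3 - t - 1."""
    return t**9 - t**8 - t**7 - t**6 + 2 * t**4 + 2 * t**3 - t - 1


def bisect_root(f, lo, hi, tol=1e-12):
    """Plain bisection; requires f(lo) and f(hi) to have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Brackets taken from the statement of the lemma.
t_lower = bisect_root(p9, 1.05, 1.1)
t_upper = bisect_root(p9, 1.5, 1.6)

# Corresponding thresholds beta_* and beta^* via cosh^{-1}.
beta_lower = math.acosh(t_lower)
beta_upper = math.acosh(t_upper)

print(t_lower, t_upper, beta_lower, beta_upper)
```

Bisection only confirms that a sign change, and hence a root, lies inside each bracket; it does not by itself show that there are no further positive roots.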
Let us first find all of the fixed and periodic points of (23).
Theorem 4.1. Let $f$ be a dynamical system given by (23). Then the following assertions hold true:
(i) If $\beta \in (0, \beta_*] \cup [\beta^*, \infty)$ then there is a unique fixed point $\left(\frac{1}{\cosh^3\beta}, 0\right)$ in the domain $\Delta$;
(ii) If $\beta \in (\beta_*, \beta^*)$ then there are two fixed points in the domain $\Delta$, which are $\left(\frac{1}{\cosh^3\beta}, 0\right)$ and $(\sqrt{DE}, \sqrt{E})$;
(iii) For any $\beta \in (0, \infty)$ the dynamical system $f$ does not have any $k$-periodic points, where $k \ge 2$.
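The two fixed points of Theorem 4.1 can be verified by direct substitution into (23): a fixed point satisfies $(x', y') = (x, y)$, so the left-hand sides of (23) evaluated at the point must return the point itself. The sketch below assumes the coefficient formulas (21), (22) and the definitions of $D$ and $E$ as read above; the value $\cosh\beta = 1.3$ is an arbitrary choice inside $(t_*, t^*)$, and the function names are illustrative.

```python
import math


def coefficients(beta):
    """A1, B1, A2, B2 as in (21), (22)."""
    s, c = math.sinh(beta), math.cosh(beta)
    a1 = s**3 * c
    b1 = s * c**2 * (1 + c + c**2)
    a2 = s**2 * c**2 * (1 + 2 * c)
    b2 = c**6
    return a1, b1, a2, b2


def lhs_of_23(xp, yp, beta):
    """Left-hand sides of (23): the point (x, y) whose image is (x', y')."""
    a1, b1, a2, b2 = coefficients(beta)
    return (b2 * xp**3 + a2 * xp * yp**2, b1 * xp**2 * yp + a1 * yp**3)


beta = math.acosh(1.3)  # assumed to lie in (beta_*, beta^*)
a1, b1, a2, b2 = coefficients(beta)
d = (a2 - a1) / (b1 - b2)   # (26)
e = 1.0 / (a2 + d * b2)     # (25)

fixed_axis = (1.0 / math.cosh(beta) ** 3, 0.0)
fixed_inner = (math.sqrt(d * e), math.sqrt(e))

for point in (fixed_axis, fixed_inner):
    print(point, lhs_of_23(*point, beta))
```

The second point follows from dividing the two equations of (23) by $x$ and $y$ respectively, which gives $B_2 x^2 + A_2 y^2 = 1$ and $B_1 x^2 + A_1 y^2 = 1$, hence $x^2 = D y^2$ and $y^2 = E$.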
Now let us formulate results concerning the limiting behavior of $f$.
Theorem 4.2. Let $f : \Delta \to \mathbb{R}^2_+$ be the dynamical system given by (23) and $\beta \in (0, \beta_*] \cup [\beta^*, \infty)$. Then the following assertions hold true:
(i) if $y^{(0)} > 0$ then the trajectory $\{(x^{(n)}, y^{(n)})\}_{n=0}^{\infty}$ of $f$ starting from the point $(x^{(0)}, y^{(0)})$ is finite;
(ii) if $y^{(0)} = 0$ then the trajectory $\{(x^{(n)}, y^{(n)})\}_{n=0}^{\infty}$ starting from the point $(x^{(0)}, y^{(0)})$ has the following form
$$x^{(n)} = \frac{\sqrt[3^n]{x^{(0)} \cosh^3\beta}}{\cosh^3\beta}, \qquad y^{(n)} = 0,$$
and it converges to the fixed point $\left(\frac{1}{\cosh^3\beta}, 0\right)$.
Theorem 4.3. Let $f : \Delta \to \mathbb{R}^2_+$ be the dynamical system given by (23) and $\beta \in (\beta_*, \beta^*)$. The following assertions hold true:
(i) There are two invariant lines w.r.t. $f$ defined by
$$l_1 = \{(x, y) \in \Delta : y = 0\} \quad \text{and} \quad l_2 = \left\{(x, y) \in \Delta : y = \frac{x}{\sqrt{D}}\right\};$$
(ii) if an initial point $(x^{(0)}, y^{(0)})$ belongs to the invariant line $l_k$ of the dynamical system (23), then the trajectory $\{(x^{(n)}, y^{(n)})\}_{n=0}^{\infty}$, starting from the point $(x^{(0)}, y^{(0)})$, converges to the fixed point which belongs to the invariant line $l_k$, where $k = 1, 2$;
(iii) if an initial point $(x^{(0)}, y^{(0)})$ satisfies the following condition
$$\frac{y^{(0)}}{x^{(0)}} \in \left(0, \frac{1}{\sqrt{D}}\right),$$
then the trajectory $\{(x^{(n)}, y^{(n)})\}_{n=0}^{\infty}$, starting from the point $(x^{(0)}, y^{(0)})$, converges to the fixed point $\left(\frac{1}{\cosh^3\beta}, 0\right)$, which belongs to the invariant line $l_1$;
(iv) if an initial point $(x^{(0)}, y^{(0)})$ satisfies the following condition
$$\frac{y^{(0)}}{x^{(0)}} \in \left(\frac{1}{\sqrt{D}}, 1\right),$$
then the trajectory $\{(x^{(n)}, y^{(n)})\}_{n=0}^{\infty}$, starting from the point $(x^{(0)}, y^{(0)})$, is finite.
Let $\beta \in (0, \beta_*] \cup [\beta^*, \infty)$. From Theorem 4.2, we infer that equation (19) has a family of parametric solutions $(\omega_0(\alpha), \{h_x(\alpha)\})$ given by
$$h_x^{(n)}(\alpha) = \begin{pmatrix} \dfrac{\sqrt[3^n]{\alpha \cosh^3\beta}}{\cosh^3\beta} & 0 \\ 0 & \dfrac{\sqrt[3^n]{\alpha \cosh^3\beta}}{\cosh^3\beta} \end{pmatrix} \qquad (27)$$
for every $x \in V$; here $\alpha$ is any positive real number. The boundary conditions corresponding to the fixed point of (23) are given by (27) at the value $\alpha_0 = \frac{1}{\cosh^3\beta}$. Therefore, further, we denote such operators by $\omega_0(\alpha_0)$ and $h_x^{(n)}(\alpha_0)$. Let us consider the states
situation, the minimality of $G$ and $\mathcal{F}$ is guaranteed by the $G$-central ergodicity, i.e., $G$-ergodicity of the centre $\mathfrak{Z}_\pi(\mathcal{F})$ in the representation $\pi$ given by the GNS representation of $\omega_0 \circ m$ induced from the vacuum state $\omega_0$ of $\mathcal{A}^d$, and we have the following commutative diagram:
[Commutative diagram: $\mathcal{F}^G$ (unbroken algebra of observables), $\mathcal{F}^H$ (extended observables) and $\mathcal{F}$ (augmented algebra), connected by 1:1 and onto maps, over $H$ and $G/H$.]
whose dual version describes the sector structure:
[Dual diagram: $G$ (broken), $H$ (unbroken sectors), the sector bundle, and $G/H$ (degenerate vacua), connected by 1:1 and onto maps.]
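where $\hat{\mathcal{F}} = \mathrm{Spec}(\mathfrak{Z}(\mathcal{F}))$ denotes the factor spectrum of $\mathcal{F}$, etc. Remark: The physical essence of the extension $\mathcal{A} \Longrightarrow \mathcal{A}^d$ from the original observable algebra $\mathcal{A}$ to its Haag-dual net algebra $\mathcal{A}^d = \mathcal{F}^H$ can now be interpreted as an "extension of coefficient algebra $\mathcal{A}$" by (the dual of) $G/H$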
to parametrize the degenerate vacua: $\mathcal{A}^d = \mathcal{F}^H = \tilde{\mathcal{F}}^G = [\mathcal{F} \rtimes \widehat{(H\backslash G)}]^G = \mathcal{F}^G \rtimes \widehat{(H\backslash G)} = \mathcal{A} \rtimes \widehat{(H\backslash G)}$. In this extension, a part $G/H$ of the originally invisible $G$ becomes visible through the emergence of degenerate vacua parametrized by $G/H$, due to the condensation of order parameters $\in G/H$ associated with S(pontaneous) S(ymmetry) B(reaking) of $G$ to $H$. As a result, observables $A \in \mathcal{A}$ acquire $G/H$-dependence: $\tilde{A} = (G/H \ni \dot{g} \mapsto \tilde{A}(\dot{g}) \in \mathcal{A}) \in \mathcal{A} \rtimes \widehat{(H\backslash G)}$, which should just be interpreted as an example case of logical extension$^{6}$ transforming a "constant object" ($A \in \mathcal{A}$) into a "variable object" ($\tilde{A} \in \mathcal{A} \rtimes \widehat{(H\backslash G)}$) having functional dependence on the universal classifying space $G/H$ for (multi-valued) semantics (as is familiar in the non-standard and Boolean-valued analyses). By replacing $G/H$ with the space(-time), the above consideration can be utilized as a prototype for the origin of the functional dependence of physical quantities on space(-time) coordinates, due to the physical emergence of space(-time) from the microscopic physical world. Along this line, we prescribe the similar logical extension procedure on the observable algebra $\mathcal{A}^d = \mathcal{F}^H$ adding $G/H$-dependence:
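$$\mathcal{A}^d \rtimes \widehat{(H\backslash G)} = \mathcal{F}^H \rtimes \widehat{(H\backslash G)} = \big(\mathcal{F} \rtimes \widehat{(H\backslash G)}\big)^H = \tilde{\mathcal{F}}^H.$$
Then, the whole sector structure of $\tilde{\mathcal{F}}^H = \mathcal{F}^H \rtimes \widehat{(H\backslash G)}$ can be identified with its factor spectrum $\widehat{\tilde{\mathcal{F}}^H} = G \times_H \hat{H}$; this is seen to constitute a bundle structure, $\hat{H} \hookrightarrow \widehat{\tilde{\mathcal{F}}^H} = G \times_H \hat{H} \twoheadrightarrow G/H$, called a sector bundle, consisting of the classifying space $G/H$ of degenerate vacua, each fibre over which describes the sector structure $\hat{H}$ corresponding to the unbroken remaining symmetry $H$ (or, more precisely, the conjugated group $gHg^{-1}$ for the vacuum parametrized by $\dot{g} = gH \in G/H$). Namely, the sector bundle, $\hat{H} \hookrightarrow \widehat{\tilde{\mathcal{F}}^H} = G \times_H \hat{H} \twoheadrightarrow G/H$, can be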
understood as the connection = splitting of the dual, FH = II

$N_K \in \mathbb{N}$;
• $\mathcal{M} := \{\sigma_i \mid \sigma_i \in M\}$, as the space of all finite sequences in $M$ of length less than or equal to $N_M$, where $\infty > N_M \in \mathbb{N}$ (could be $M = K$);
• $S = S_1 \cup S_2$ with $S_1 \cap S_2 = \emptyset$ and $|S| < \infty$.
Namely, $\mathcal{K}$ is the set of Secret Keys of length $\le N_K$, $\mathcal{M}$ is the set of Clear Messages of length $\le N_M$, and $K$ and $M$ are finite spaces of symbols (just to fix the ideas: $K = M = \{0, 1\}$, the standard binary digits). Also $S$ is a finite set of symbols (or functions), and we can consider it as the set of all possible actions that act on the clear message. Finally, we introduce the following spaces:
• $\mathcal{O} = \mathcal{A} \cup \mathcal{B}$, where $\mathcal{A} = A^{N_A}$ and $\mathcal{B} = \cup_i \mathcal{B}_i$ with $\mathcal{B}_i = A^{N_A} \times B_i^{N_{B_i}}$. Also in this case $A$ and $B_i$ are finite sets of symbols as above. From a biological point of view the space $\mathcal{A}$ is the exons space and $\mathcal{B}$ is the introns space.
$\mathcal{O}$ is the base space for Coded Messages (a coded message will be a finite sequence of elements of $\mathcal{O}$). After this preliminary description of the necessary spaces, we can now introduce some useful definitions:

Definition 1. Let $\kappa \in \mathcal{K}$. For $m \in \mathbb{N}$ with $0 \le m \le N_K$ and for each $i \in \{1, \ldots, N_K\}$ we define
$$\kappa_{i,m} := (\kappa_i, \kappa_{i+1}, \ldots, \kappa_{i+m-1})$$
as a subsequence of $\kappa$ ($m = 0$ represents a subsequence of length 0).

Definition 2. Let $\sigma \in \mathcal{M}$. For $n \in \mathbb{N}$ with $0 \le n \le N_M$ and for each $i \in \{1, \ldots, N_M\}$ we define
$$\sigma_{i,n} := (\sigma_i, \sigma_{i+1}, \ldots, \sigma_{i+n-1})$$
as a subsequence of $\sigma$ ($n = 0$ represents a subsequence of length 0).

In the above definitions the operator $+$ is the sum with the appropriate modulus depending on the length of the sequence and, for simplicity, in the following we will use the form $\kappa_{i,m} \equiv \kappa_i$ and $\sigma_{i,n} \equiv \sigma_i$; moreover, $\kappa_i \in \mathcal{K}$ and $\sigma_i \in \mathcal{M}$.

Definition 3. Let $\mathcal{C} = \{\Sigma_i \mid \Sigma_i \in \mathcal{O}\}$ be the space of all finite sequences in $\mathcal{O}$.
$\mathcal{C}$ is the set of Coded Messages.

2.2. Functions

To reach our goal we must introduce two kinds of functions:
• Coding functions: these functions are used to:
  - transform a portion of the clear message into a portion of the coded message (exons)
  - insert some apparently redundant information in the coded message (introns)
• Operational functions: these functions are used to:
  - modify the local state of the coding system
  - modify the global state of the variables of the system (see below)

For the first kind, we start with the definition of a family of coding functions $f_\alpha : \mathcal{M} \times \mathcal{K} \to \mathcal{O}$ with $\alpha \in S$:
$$f_\alpha(\sigma, \kappa) = \begin{cases} \varepsilon & \text{if } \alpha \in S_1 \\ \iota & \text{if } \alpha \in S_2 \end{cases} \qquad (1)$$
where $\varepsilon \in \mathcal{A}$ and $\iota \in \mathcal{B}$. The functions $f_\alpha$ must be chosen in such a way as to guarantee the existence of a function $\bar{f} : \mathcal{O} \times \mathcal{K} \to \mathcal{M}$ such that
$$\bar{f}(f_\alpha(\sigma, \kappa), \kappa) = \begin{cases} \sigma & \text{if } \alpha \in S_1 \\ \emptyset & \text{if } \alpha \in S_2 \end{cases} \qquad (2)$$
Now we introduce the definition of some operational functions: let $g_\alpha : \mathcal{K} \times \mathcal{O} \to \mathcal{K}$ be as follows:

(3)

The existence of a function $g$ that does not depend on $\alpha$ ensures the chaoticity of the system. Finally, we need, as a member of the set of the operational functions, the trivial characteristic function $\chi_{S_1} : S \to \{0, 1\}$.
2.3. Global variables and further definitions

Now let us introduce some further definitions:

Definition 4. Let $\kappa \in \mathcal{K}$ be a Pre Shared Key (PSK) of length $N_K$.

In cryptography, a pre-shared key or PSK is a shared secret which was previously shared between the two parties using some secure channel before it needs to be used.

Definition 5. Let $\{\alpha\} = \{\alpha_i \mid \alpha_i \in S\}$ be a random sequence of arbitrary length.

A random sequence is a kind of stochastic process. In short, a random sequence is a sequence of random variables. In computer science it comes from a random number generator, often abbreviated as RNG, that is, a computational device designed to generate a sequence of numbers or symbols that lack any pattern, i.e. appear random.

Definition 6. Let $\sigma \in \mathcal{M}$ be a message of length $N_M$.

In communications science, a message is information which is sent from a source to a receiver.

Definition 7. $\Sigma \in \mathcal{C}$ will be the coded message, whose length depends on the system and on the state of the variables.

In cryptography, a coded message is a clear message transformed into an obscured form, preventing those who do not possess the special information, or key, required to apply the transform from understanding what is actually transmitted.

2.4. Biological parallelism

From the biological point of view the PSK and the random sequence $\{\alpha\}$ represent the whole mechanism of splicing (this is the process by which pre-mRNA is modified to remove certain stretches of non-coding sequences, called introns), while the coded message is the pre-mRNA itself and the clear message is the final mRNA.
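3. The algorithm

3.1. Coding

Suppose we have a $\kappa \in \mathcal{K}$ as a Pre Shared Key (PSK) of length $N_K$ and a message $\sigma \in \mathcal{M}$ of length $N_M$, and that we have fixed two numbers $n, m \in \mathbb{N}$. (For simplicity and without affecting the generality of the system, we can suppose that $N_M$ is a multiple of $n$.) Using the above definitions for $\alpha_i$, $\sigma_{l_i}$ and $\kappa_{j_i}$ we can define the following algorithm:
$$\begin{cases} \Sigma_i = f_{\alpha_i}(\sigma_{l_i}, \kappa_{j_i}) = \begin{cases} \varepsilon_i & \text{if } \alpha_i \in S_1 \\ \iota_i & \text{if } \alpha_i \in S_2 \end{cases} \\ l_{i+1} = l_i + \chi_{S_1}(\alpha_i)\, n \\ \kappa_{j_{i+1}} = g_{\alpha_i}(\kappa, \Sigma_i) \end{cases} \qquad (4)$$
where $\varepsilon_i \in \mathcal{A}$ and $\iota_i \in \mathcal{B}$.

Definition 8. The sequence $\Sigma = \{\Sigma_1, \ldots, \Sigma_N\} \in \mathcal{C}$ is the coded message.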
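3.2. Decoding

For the decoding phase suppose we have $\kappa \in \mathcal{K}$ as a Pre Shared Key (PSK) of length $N_K$ and a coded message $\Sigma \in \mathcal{C}$ of arbitrary length, and suppose that we have fixed two numbers $n, m \in \mathbb{N}$. Using the above definitions for $\sigma_{l_i}$ and $\kappa_{j_i}$ we can define the following algorithm:
$$\begin{cases} \bar{f}(\Sigma_i, \kappa_{j_i}) = \sigma_i, \quad \kappa_{j_{i+1}} = g(\kappa, \Sigma_i) & \text{if } \Sigma_i \in \mathcal{A} \\ \bar{f}(\Sigma_i, \kappa_{j_i}) = \emptyset, \quad \kappa_{j_{i+1}} = g(\kappa, \Sigma_i) & \text{if } \Sigma_i \in \mathcal{B}^* \subset \mathcal{B} \end{cases} \qquad (5)$$
Note that, thanks to the definition of $g$, in the decoding phase it is not necessary to know the random sequence $\{\alpha_i\}$.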
3.3. Preliminary observations

The coded message depends on:
• the clear message;
• the secret key (PSK);
• the random sequence $\{\alpha\}$;
and it is important to understand that, during the coding phase, the algorithm can supply different coded messages for the same message and the same PSK simply by using different $\{\alpha\}$. It is also important to underline that it is not necessary, during the decoding phase, to know the sequence $\{\alpha\}$ used in the coding phase. This is because the information about the $\{\alpha\}$ sequence is intrinsic in the couple $(\Sigma_i, \kappa_{j_i})$. In this circumstance we want to underline the possible role of introns in carrying important information for the decoding phase; for example, the function $g(\kappa, \Sigma)$ can use the information present in $\Sigma \in \mathcal{B}$ to produce a change in the PSK. Moreover, during the coding phase there is an expansion of the original message into the coded message, and this expansion comes from several factors; for example, it comes from the dimensions of the $B_i$ and from the number of elements of $S_2$ in the $\{\alpha\}$ sequence.
4. Early implementations

4.1. The environment

Now let us introduce the first realization of the above algorithm. Step by step we will substitute each ingredient of the algorithm with its representative object in the implementation. We start from the definition of $K = M = A = B_1 = \{0, 1\}$, with $N_K, N_M \in \mathbb{N}$, $N_A = 3$, $N_{B_1} = 2$ and $S = S_1 \cup S_2$ with $S_1 = \{a, b\}$ and $S_2 = \{c, d\}$. $\mathcal{O} = \{0, 1\}^{N_A} \cup (\{0, 1\}^{N_A} \times B_1^{N_{B_1}})$. Finally $n = 1$ and $m = 3$.

4.2. Functions and implementation

To introduce the functions $f_\alpha$, Table 1, $T : \mathcal{A} \times K^n \to S$, will be useful, together with the following rules:

1) if $\alpha \in S_1$
  - let $c = \kappa_j$
  - if $\sigma_i = 0$ then localize a binary number $000 \le r \le 111$ such that $T_{r,c} = a$ (or $T_{r,c} = b$)
  - if $\sigma_i = 1$ then localize a binary number $000 \le r \le 111$ such that $T_{r,c} = b$ (or $T_{r,c} = a$)
  - now let $\Sigma_i = (\Sigma_{i,1}, \Sigma_{i,2}, \Sigma_{i,3}) = (r_1, r_2, r_3)$ (i.e. an exon $\varepsilon_i \in \mathcal{A}$).

2) if $\alpha \in S_2$
  - let $c = \kappa_j$
  - localize a binary number $000 \le r \le 111$ such that $T_{r,c} = \alpha$
  - now let $\Sigma_i = ((\Sigma_{i,1}, \Sigma_{i,2}, \Sigma_{i,3}), (\Sigma_{i,4}, \Sigma_{i,5})) = ((r_1, r_2, r_3), (\Sigma_{i,4}, \Sigma_{i,5}))$ (i.e. an intron $\iota_i \in \mathcal{B}$)
  - the other 2 components of the vector (i.e. $\Sigma_{i,4}, \Sigma_{i,5}$) will be given using a random choice in the binary range $\{00, \ldots, 11\}$
Table 1. The seek table $T_{r,c}$ (columns: $r = (\Sigma_{i,1}, \Sigma_{i,2}, \Sigma_{i,3})$; rows: $c = \kappa_j$)

κ_j \ r   000  001  010  011  100  101  110  111
000        a    b    c    d    a    b    c    d
001        b    c    d    a    b    c    d    a
010        c    d    a    b    c    d    a    b
011        d    a    b    c    d    a    b    c
100        a    b    c    d    a    b    c    d
101        b    c    d    a    b    c    d    a
110        c    d    a    b    c    d    a    b
111        d    a    b    c    d    a    b    c
Now for the $g$ function we will use the following rules:
• obtain $r$, $c$ as above
• if $T_{r,c} \in \{a, b\}$ then $g$ will return $\kappa_{j_i + m}$
• if $T_{r,c} = c$ then $g$ will return $\kappa_{j_i + (\Sigma_{i,4}\Sigma_{i,5})_{10}}$
• if $T_{r,c} = d$ then $g$ will return $\kappa_{j_i - (\Sigma_{i,4}\Sigma_{i,5})_{10}}$
where the subscript 10 means the decimal representation of the binary number in parentheses. The last function we must define is $\bar{f} : \mathcal{O} \times \mathcal{K} \to \mathcal{M}$. Remember that the conditions in equation (2) must hold. The function $\bar{f}(\Sigma_i, \kappa_j)$ must be able to understand whether $\Sigma_i \in \mathcal{A}$ or $\Sigma_i \in \mathcal{B}$. Starting from the fact that $\Sigma$ (i.e. the entire coded message) is a sequence of 0 and 1, we can extract the single element in terms of bits (i.e. $\Sigma_{l_i}, \Sigma_{l_i+1}, \ldots$). Following this strategy we can define: (6) the new function acts in this way:
• let $r = (\Sigma_{l_i}, \Sigma_{l_i+1}, \Sigma_{l_i+2})$
• let $c = \kappa_{j_i}$
• if $T_{r,c} = a$ then $\bar{f}$ returns $\sigma_i = \{0\}$ (or $\sigma_i = \{1\}$)
• if $T_{r,c} = b$ then $\bar{f}$ returns $\sigma_i = \{1\}$ (or $\sigma_i = \{0\}$)
• else $\bar{f}$ returns $\sigma_i = \emptyset$
The function $g$ acts as above but using $(\Sigma_{l_i}, \Sigma_{l_i+1}, \Sigma_{l_i+2})$ if $T_{r,c} \in \{a, b\}$ or $(\Sigma_{l_i}, \Sigma_{l_i+1}, \Sigma_{l_i+2}, \Sigma_{l_i+3}, \Sigma_{l_i+4})$ if $T_{r,c} \in \{c, d\}$.
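The rules of this section can be put together in a small round-trip experiment. The following is only a sketch under stated assumptions, not the authors' program: the seek table is taken to be the cyclic pattern $T_{r,c} = $ letter number $(r + c) \bmod 4$, the key window $c$ is read as $m = 3$ consecutive key bits starting at a cyclic pointer that begins at 0, and all function names (`seek`, `key_window`, `next_pointer`, `encode`, `decode`) are hypothetical.

```python
import random

LETTERS = "abcd"
M = 3  # key window length in bits (m in the text)


def seek(r, c):
    """Seek table, assumed cyclic: T[r][c] is letter number (r + c) mod 4."""
    return LETTERS[(r + c) % 4]


def key_window(key, j):
    """Integer value of the M key bits starting at pointer j (cyclic access)."""
    bits = [key[(j + t) % len(key)] for t in range(M)]
    return int("".join(map(str, bits)), 2)


def next_pointer(key, j, action, extra):
    """Rule g: a/b advance by M, c advances by `extra`, d moves back by `extra`."""
    step = {"a": M, "b": M, "c": extra, "d": -extra}[action]
    return (j + step) % len(key)


def encode(message, key, rng):
    """Encode a list of bits; exons carry 3 bits, introns 3 + 2 random bits."""
    out, j, i = [], 0, 0
    while i < len(message):
        action = rng.choice(LETTERS)        # the random sequence {alpha}
        c = key_window(key, j)
        if action in "ab":                  # alpha in S1: emit one clear bit
            action = "ab"[message[i]]
            i += 1
        r = rng.choice([r for r in range(8) if seek(r, c) == action])
        out += [int(b) for b in format(r, "03b")]
        extra = 0
        if action in "cd":                  # alpha in S2: append 2 random bits
            tail = [rng.randint(0, 1), rng.randint(0, 1)]
            out += tail
            extra = 2 * tail[0] + tail[1]
        j = next_pointer(key, j, action, extra)
    return out


def decode(coded, key):
    """Recover the clear bits without knowing the random sequence {alpha}."""
    msg, j, p = [], 0, 0
    while p < len(coded):
        r = int("".join(map(str, coded[p:p + 3])), 2)
        action = seek(r, key_window(key, j))
        p += 3
        extra = 0
        if action in "cd":                  # intron: skip its 2 extra bits
            extra = 2 * coded[p] + coded[p + 1]
            p += 2
        else:                               # exon: a -> 0, b -> 1
            msg.append("ab".index(action))
        j = next_pointer(key, j, action, extra)
    return msg


key = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1]
clear = [0, 1, 1, 0, 1, 0, 0, 1]
coded = encode(clear, key, random.Random(0))
print(len(coded), decode(coded, key) == clear)
```

Decoding works because encoder and decoder update the key pointer with the same rule, so the decoder always looks up the same column of the table that the encoder used.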
4.3. Observation

In this simple implementation of the RNA algorithm, redundancy is given by the following considerations:
• for each processed bit in the clear message, 3 bits will be inserted in the coded one
• each intron will insert 5 bits in the coded message
• we use a uniform distribution for the generation of the sequence $\{\alpha_i\}$
• then we can suppose that the length of the sequence $\{\alpha_i\}$ is $\|\{\alpha_i\}\| = \|\{\alpha_i \mid \alpha_i \in S_1\}\| + \|\{\alpha_i \mid \alpha_i \in S_2\}\|$
• of course, in this implementation, we must have $\|\{\alpha_i \mid \alpha_i \in S_1\}\| = N_M$

Then, if the original message has a length of $N_M$ bits, the length of the coded one is $l_e \approx 3N_M + 5N_M = 8N_M$, because under the uniform distribution about as many introns as exons are generated. But it is important to underline that this implementation uses a redundant table $T_{r,c}$; in fact, for each $c$ there are two $r$ that satisfy the condition $T_{r,c} = a$. If we want to reduce the expansion of the message we can replace it with the one in Table 2 (in this version we must have $N_A = 2$ and $m = 2$).
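The length estimate above can be reproduced with a short calculation. This is a sketch under the assumption that each exon carries exactly one clear bit ($n = 1$) and that each symbol of $\{\alpha\}$ is drawn independently, so that on average $(1 - p)/p$ introns are produced per exon when $p$ is the probability of drawing an element of $S_1$; the function name `expected_expansion` is illustrative.

```python
from fractions import Fraction


def expected_expansion(exon_bits, intron_bits, p_exon):
    """Expected number of coded bits per clear bit.

    exon_bits   -- bits written for each exon
    intron_bits -- bits written for each intron
    p_exon      -- probability that a drawn alpha belongs to S1
    """
    p_exon = Fraction(p_exon)
    introns_per_exon = (1 - p_exon) / p_exon
    return exon_bits + intron_bits * introns_per_exon


# Table 1 with a uniform choice over {a, b, c, d}: p_exon = 1/2.
print(expected_expansion(3, 5, Fraction(1, 2)))
```

The same function can be evaluated for other exon and intron sizes or for a non-uniform choice of $\alpha$ by changing its three arguments.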
Table 2. An alternative seek table $T_{r,c}$ (columns: $r = (\Sigma_{i,1}, \Sigma_{i,2})$; rows: $c = \kappa_j$)

κ_j \ r   00  01  10  11
00         a   b   c   d
01         b   c   d   a
10         c   d   a   b
11         d   a   b   c

In this case the message expansion will be $l_e \approx 2N_M + 4N_M = 6N_M$. Another way to diminish the expansion of the coded message is to use an asymmetric probability function for the generation of the $\{\alpha_i\}$, for example:
Table 3

α    p
a    3/8
b    3/8
c    1/8
d    1/8

Using these last two optimizations the approximate length of the coded message will be reduced to $l_e \approx 3N_M$.

4.4. A biological implementation
Now, as an example, we want to create an implementation of our model nearer to biology. For this let us introduce again the ingredients for the new model. We start from the definition of $K = \mathbb{N}$, $M = A = \{a, b, c, d\}$, with $N_K, N_M \in \mathbb{N}$, and $S = S_1 \cup S_2$ with $S_1 = \{c\}$ and $S_2 = \{nc\}$. $\mathcal{O} = \{A^{N_A} \setminus (a, a, a)\} \cup (\{a, a, a\} \times (\cup_i A^{N_{B_i}}))$, $N_A = n = 3$, $N_{B_i} \in \mathbb{N}$ and $m = 1$. The function $f$ will be:
$$\Sigma_i = f_\alpha(\sigma_i, \kappa_{j_i}) = \begin{cases} \sigma_i \\ (a, a, a, B) \ \text{with } B \in A^{\kappa_{j_i}} \end{cases} \qquad (7)$$
In this case the element $(a, a, a) \in A^3$ is the codon that localizes the beginning of the introns in the sequence (it will never appear in the clear code), and the function $\bar{f}$ will be:
$$\bar{f}(\Sigma_i, \kappa_{j_i}) = \begin{cases} \ldots & \text{if } (\Sigma_{i,1}, \Sigma_{i,2}, \Sigma_{i,3}) \ne (a, a, a) \\ \ldots & \text{otherwise} \end{cases} \qquad (8)$$

5. The role of the Coding Functions

In section 2.2 we introduced the definition of Coding functions. These functions play an important role in the complexity and in the length of the cipher text. It is also very important to understand that the number of functions involved in the process of encryption can be huge. In fact, besides the
canonical functions introduced in section 4.1 we can add functions that have the role of:
• Changing the public/secret keys
  - For example, in the cipher text we can insert a sequence of bits of arbitrary length that will replace the public (private) key used to decode the rest of the message.
• Inserting a long sequence of random bits
  - As before, but the inserted bits are ignored.
• Moving the key pointer position forward or backward
  - The presence of a disruptive effect on the sequential access to the secret key can only add more noise for an attacker.
• Resetting global parameters like message pointer position, key pointer position, secret and public data
  - See above.
It is easy to see that the number of Coding functions that can be inserted in the encoding mechanism is large and may become part of the shared secret information. In the same way, some Coding functions can greatly increase the size of the cipher text, and the inclusion of random bits within the text could increase the randomness of the cipher text.

6. Statistical Analysis

6.1. Cryptanalysis

Cryptanalysis (from the Greek kryptós, "hidden", and analýein, "to loosen" or "to untie") is the study of methods for obtaining the meaning of encrypted information without having access to the secret information which is usually required to perform the operation. Typically this means finding a secret key. Cryptanalysis is the "counterpart" of cryptography, namely the study of techniques for concealing a message, and together they form cryptology, the science of hidden writing. Cryptanalysis also refers to any attempt to circumvent the security of other cryptographic algorithms and cryptographic protocols. Although cryptanalysis usually excludes attacks that are not directed at the inherent weaknesses of the method to be violated, such as bribery, physical coercion, theft and social engineering, such attacks, which are often more productive than traditional cryptanalysis, are still an important component.

The first cryptanalytic tool used to analyze the RNA-Crypto System is based on statistical analysis. For this purpose we used a famous battery of tests called Diehard (or Dieharder in the new version).
• The Diehard tests are a battery of statistical tests for measuring the quality of a set of random numbers.
• It is cited by NIST as one of the best statistical suites for testing randomness.
• It was developed by George Marsaglia over several years and first published in 1995, and then maintained and improved by Robert Brown at Duke University.
• We focused our attention on the following tests:
(1) Birthday spacings
(2) Overlapping permutations
(3) Ranks of matrices
(4) Monkey tests
(5) Count the 1s
(6) Parking lot test
(7) Minimum distance test
(8) Random spheres test
(9) The squeeze test
(10) Overlapping sums test
(11) Runs test
(12) The craps test
All these tests are well described in the software package and in the literature.
6.2. Our Idea

In nature a nucleotide ribbon is a sequence of (almost always) 4 symbols (a, c, g, t/u). In computer science a 'string' is a sequence of 2 symbols (0, 1). Now, suppose we translate the 4 symbols as in Table 4:
Table 4. Translation table

base    binary
a       00
c       01
g       10
t/u     11

Then we have a correspondence between bits and nucleotides. With this assumption the Diehard tests can be used to assess the randomness of this binary conversion of a sequence of nucleotides as well as of the sequence of an encrypted message. Our idea is based on the following questions:
(1) Do RNA ribbons show good random behavior according to the DieHard(er) tests?
(2) Do RNA-Crypto System sequences show good random behavior according to the DieHard(er) tests?
The answers can be given by looking at the test results.

6.3. Bio results

6.3.1. The Experiment

We used only some very long sequences of nucleotides from standard free online databases, because the tests run well on more than 12 Mbytes of data. Since each base is coded with 2 bits, 1 Mbyte = 4 M bases, so we need sequences of at least 48 M bases; our attention was therefore focused also on the human genome, which can give very long sequences.

6.3.2. The protocol

(1) Get a sequence from the database (longer than 48 M bases)
(2) Translate it into binary mode
(3) Run the DH test on it
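Step (2) of the protocol can be sketched as follows, using the translation of Table 4 and packing four bases into each byte. The function name `pack_bases`, the choice of putting the first base in the most significant bits, and the decision to drop an incomplete trailing group are assumptions made for this illustration; real database files may also contain other symbols (for instance ambiguity codes), which this sketch does not handle.

```python
# Table 4: two bits per base, with t and u sharing the same code.
CODE = {"a": 0b00, "c": 0b01, "g": 0b10, "t": 0b11, "u": 0b11}


def pack_bases(sequence):
    """Pack four bases per byte, first base in the two most significant bits."""
    seq = sequence.lower()
    usable = len(seq) - len(seq) % 4      # drop an incomplete trailing group
    out = bytearray()
    for start in range(0, usable, 4):
        byte = 0
        for base in seq[start:start + 4]:
            byte = (byte << 2) | CODE[base]
        out.append(byte)
    return bytes(out)


packed = pack_bases("acgtacgu")
print(packed.hex())
```

With this packing, 48 M bases give 12 Mbytes of binary data, which matches the size requirement stated in section 6.3.1.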
6.3.3. The Data

For the biological sequences we used the following:
• Caenorhabditis elegans chromosomes I-V, complete sequences
• Wallaby, whole genome
• Human chromosome 14, complete sequence
• Drosophila melanogaster, some chromosomes, complete sequences
all encoded using Table 4.
6.3.4. The Results

In Table 5 we can see the results of our experiment using the biological data. The bio-sequences fail every test, so according to this battery they are not random at all. The reason for this fact could be found, of course, in:
• Some parts of a pre-mRNA sequence could be highly repetitive (Satellite, Minisatellite, ...)
• Some parts of a pre-mRNA sequence could be made up of a very long sequence of the same nucleotide (Polyadenylation tail, ...)

Table 5. Bio Results

n    Test name                              Status
1    Birthday Spacings                      FAIL
2    Overlapping Permutations               FAIL
3    Ranks of 31x31 and 32x32 matrices      FAIL
4    Ranks of 6x8 Matrices                  FAIL
5    Monkey Tests on 20-bit Words           FAIL
6    Monkey Tests OPSO, OQSO, DNA           FAIL
7    Count the 1's in a Stream of Bytes     FAIL
8    Count the 1's in Specific Bytes        FAIL
9    Parking Lot Test                       FAIL
10   Minimum Distance Test                  FAIL
11   Random Spheres Test                    FAIL
12   The Squeeze Test                       FAIL
13   Overlapping Sums Test                  FAIL
14   Runs Test                              FAIL
15   The Craps Test                         FAIL
6.4. RNA-Crypto System results

6.4.1. The Experiment

In the RNA-Crypto System experiment we used long sequences of binary data encrypted with our protocol (approximately 100 MBytes of data for each experiment).

6.4.2. The Protocol

The protocol used to estimate the randomness of the RNA-Crypto System is:
(1) Run the DNACrypto program on a message of suitable length
(2) Save the encrypted message
(3) Run the DH test considering it as a random sequence.

6.4.3. The Data

For the cryptographic sequences we used:
• A 10 Mbyte file filled with the ASCII char 'A'
• A 10 Mbyte file filled with random binary numbers
• One of the above biological sequences
encoded with the algorithm.

6.4.4. The Results

In Table 6 we can see the results of our experiment using the RNA-Crypto System data. The RNA-Crypto System sequences pass every test, so according to the DH tests they behave as random sequences.
Table 6. Crypto Results

n    Test name                              Status
1    Birthday Spacings                      PASS
2    Overlapping Permutations               PASS
3    Ranks of 31x31 and 32x32 matrices      PASS
4    Ranks of 6x8 Matrices                  PASS
5    Monkey Tests on 20-bit Words           PASS
6    Monkey Tests OPSO, OQSO, DNA           PASS
7    Count the 1's in a Stream of Bytes     PASS
8    Count the 1's in Specific Bytes        PASS
9    Parking Lot Test                       PASS
10   Minimum Distance Test                  PASS
11   Random Spheres Test                    PASS
12   The Squeeze Test                       PASS
13   Overlapping Sums Test                  PASS
14   Runs Test                              PASS
15   The Craps Test                         PASS
6.5. First Results - Differences

As expected, the results correspond to our ideas on the sequences of nucleotides and maybe also to the ones about the Crypto System. Of course, natural phenomena are not really random like a cryptographic system. Some obvious questions are:
• Why do pre-mRNA sequences not pass the statistical tests?
  - Of course they are not random (life is not random)
• But some of the reasons, from the statistical point of view, may be the following:
  - Some parts of a pre-mRNA sequence could be highly repetitive (Satellite, Minisatellite, ...)
  - Some parts of a pre-mRNA sequence could be made up of a very long sequence of the same nucleotide (Polyadenylation tail, ...)
• Is it possible to manipulate the RNA-Crypto System protocol to obtain the same results?
6.6. Changes: Informatics emulates biology

6.6.1. Modify Cryptographic protocol - Phase I

Due to its random nature the cryptographic model has no repeated sequences inside; we therefore introduce a new coding function $\rho$ (the replicator function) that will add these repeated sequences artificially inside the code. These codes can be considered:
• active (they act in some way with the system)
• passive (just junk!!!)

6.6.2. Strategies

The replicator function

Definition 6.1. Let $\rho : \mathcal{M} \times \mathcal{K} \times \mathcal{O} \to \mathcal{O}$ be a junk function:
$$\rho(\sigma, \kappa, o) = \Sigma = \iota \qquad (9)$$
where $\mathcal{B} \ni \iota = o_{\sigma,\kappa}$ and $o_{\sigma,\kappa}$ is a subsequence of $o$. If $o$ is a part of the encoded message, this mechanism implements the repeated sequences phenomenon.

6.6.3. Alter Cryptographic protocol - Phase II

Due to its random nature the cryptographic model has no significant all-equal subsequences; we therefore introduce a new coding function $\tau$ (the stutter function) that will add these mono-symbol subsequences artificially. Also in this case these subsequences can be created to be:
• active (they act in some way with the system)
• passive (just junk!!!)

6.6.4. Phase II - Strategies

The stutter function

Definition 6.2. Let $\tau : \mathcal{M} \times \mathcal{K} \to \mathcal{O}$ be a junk function:

(10)

where $\mathcal{B} \ni \iota = z^{n_{\sigma,\kappa}}$ with $z \in A \cup B_i$, and $n_{\sigma,\kappa}$ is an integer. This mechanism implements the polyadenylation tail.
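The effect of the stutter function on a simple statistic can be illustrated as follows. This is only a sketch: the stream is produced by Python's pseudo-random generator rather than by the RNA-Crypto System, the run length 200 and the insertion point are arbitrary, and the names `stutter` and `longest_run` are hypothetical. The longest run of identical symbols is used here as an elementary indicator and is not one of the Diehard tests.

```python
import random


def stutter(symbol, count):
    """Junk intron made of `count` copies of one symbol (z repeated n times)."""
    return [symbol] * count


def longest_run(bits):
    """Length of the longest block of identical consecutive symbols."""
    best = run = 0
    prev = None
    for b in bits:
        run = run + 1 if b == prev else 1
        prev = b
        best = max(best, run)
    return best


rng = random.Random(1)
stream = [rng.randint(0, 1) for _ in range(10_000)]

# Insert one mono-symbol subsequence in the middle of the stream.
cut = len(stream) // 2
stuttered = stream[:cut] + stutter(0, 200) + stream[cut:]

print(longest_run(stream), longest_run(stuttered))
```

A single long run of this kind is far outside what a random stream of the same length would be expected to contain, which is the sort of structure that run-based statistical tests are designed to detect.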
Table 7. New Crypto Results

n    Test name                              Status
1    Birthday Spacings                      FAIL
2    Overlapping Permutations               FAIL
3    Ranks of 31x31 and 32x32 matrices      FAIL
4    Ranks of 6x8 Matrices                  FAIL
5    Monkey Tests on 20-bit Words           FAIL
6    Monkey Tests OPSO, OQSO, DNA           FAIL
7    Count the 1's in a Stream of Bytes     FAIL
8    Count the 1's in Specific Bytes        FAIL
9    Parking Lot Test                       FAIL
10   Minimum Distance Test                  FAIL
11   Random Spheres Test                    FAIL
12   The Squeeze Test                       FAIL
13   Overlapping Sums Test                  FAIL
14   Runs Test                              FAIL
15   The Craps Test                         FAIL
6.7. Results

New results come from statistical analysis on the new cryptographic data. Note: the result in Table 7 holds for all the data in the cryptographic set.

6.8. New Results - Differences

We are currently working to prove that the security of the cryptographic system is not affected by the new additions. So far we have not found any attack that can exploit these features, so we are confident that the new system has the same level of security as the standard one.
• In fact, even if the spy could isolate the introns, the rest of the message would be exactly equal to the previous version
• Then the safety remains the same (or better)
Moreover, by adding redundancy we obtained, as a side effect, some properties:
• A certain robustness to errors in transmission
  - Indeed, in a very redundant system, the probability that an error in a bit prevents the decoding is very low
  - This also suggests a particular interpretation of redundancy in RNA (protection against excessive mutations, as an example)
• Furthermore, the spy, during his attack, does not know whether the piece of code that he is trying to attack is an exon or an intron
  - This also suggests another particular interpretation of redundancy in RNA (protection against pathogens?)

7. Conclusion

In conclusion, this work allows us to make some considerations. First, by adding some artificial noise to a good random sequence, we transformed it into a non-random sequence, but this transformation does not affect the cryptographic security. This means that tests of randomness, while giving a good estimate of safety, are not infallible, and more tests are needed to determine the goodness of an encryption system. Finally, the similarity between biology and computer science (cryptography) must be investigated further in order to give further interpretation to the biological mechanisms that are still partially obscure today.

References
1. Lewin, B., "Genes VI", Oxford University Press, 1997.
2. Menezes, A., van Oorschot, P. and Vanstone, S., "Handbook of Applied Cryptography", CRC Press, 1996.
3. Regoli, M., "Bio-Cryptography: A Possible Coding Role for RNA Redundancy", Foundations of Probability and Physics-5 (AIP Conference Proceedings), Accardi ed., 2008.
Quantum Bio-Informatics IV eds. L. Accardi, W. Freudenberg and M. Ohya © 2011 World Scientific Publishing Co. (pp. 311-319)
DISCRETE APPROXIMATION TO OPERATORS IN WHITE NOISE ANALYSIS

SI SI
Faculty of Information Science and Technology, Aichi Prefectural University, Aichi Prefecture, Japan

2000 AMS Classification: 60H40

In this paper we discuss how to approximate white noise functionals and the operators in white noise analysis by using variables depending on a discrete parameter. We aim to clarify the basic idea and the real meaning of the approximation of operators.

1. Introduction

The main aim of this report is to discuss a method of approximation of white noise functionals by using a system of variables with a discrete parameter. Then, we naturally proceed to the approximation of operators. When we discuss the approximation of nonlinear functionals of white noise we meet a crucial difficulty. Namely, to come to some nonlinear functionals of $\dot{B}(t)$'s we are naturally required to have the so-called renormalization. Thus, we shall provide a general theory of renormalization of white noise functionals. First, we shall take a system $\{X_n, n \in \mathbb{Z}\}$ of standard Gaussian random variables and their functions. We then come to the approximation of white noise functionals $\varphi(\dot{B}(t), t \in R)$ by those of the $X_n$'s. In this course, we can see reasons why renormalization is necessary for some white noise functionals. So the approximation that we discuss in this note is very much different from the approximation of ordinary functions. The last section is devoted to the approximation of operators acting on the space of white noise functionals.
2. Analysis of white noise functionals
We wish to discuss functionals $\varphi(\dot{B}(t), t \in R)$, noting that $\{\dot{B}(t), t \in R\}$ is a system of idealized elemental random variables. First, we form basic functionals of $\dot{B}(t)$'s, that is, polynomials in $\dot{B}(t)$'s. Consider the system
$$\mathcal{A} = \text{algebra generated by the system } \{1, \dot{B}(t), t \in R\}.$$
Proposition 2.1. $\mathcal{A}$ forms a graded algebra, that is
i) $\mathcal{A}$ is an algebra.
ii) $\mathcal{A} = \sum_{n=0}^{\infty} \mathcal{A}_n$ (algebraic direct sum), where $\mathcal{A}_n = \{\text{homogeneous polynomials of degree } n\}$.
iii) $\mathcal{A}_n \cdot \mathcal{A}_m = \{f_n(\dot{B}) g_m(\dot{B});\ f_n \in \mathcal{A}_n,\ g_m \in \mathcal{A}_m\}$.

Remark 2.1. $\dot{B}(t) = \dot{B}(t, \omega)$, $\omega \in \Omega(\mu)$, where $B$ is a Brownian motion. For every $\omega$, $\dot{B}(t, \omega)$ is defined as a generalized function of $t$. But the smeared variables $\dot{B}(\xi)$, $\xi \in E$, $E$ being a nuclear space, are well defined as ordinary random variables.

Remark 2.2. The sigma field generated by $\{\dot{B}(t), t \in R\}$ is understood to be equal to the sigma field $\mathbf{B}$ generated by $\{\dot{B}(\xi), \xi \in E\}$.

It is known that
$$(L^2) \equiv L^2(\Omega, \mathbf{B}, \mu) = \bigoplus_{n=0}^{\infty} \mathcal{H}_n \quad \text{(Fock space)}.$$
In particular, $\mathcal{H}_1$ is spanned by $\dot{B}(\xi)$, $\xi \in E$, and we have

(1)

As an extension of the isometry we have

Also it is shown that (1) can be extended to
$$\mathcal{H}_1^{(-1)} \cong K^{(-1)}(R), \qquad (2)$$
where $K^{(-1)}(R)$ is the Sobolev space of degree $-1$ over $R^1$. Since $K^{(-1)}(R)$ contains the delta function $\delta_t(\cdot)$, we see that $\dot{B}(t)$ is a well defined member of $\mathcal{H}_1^{(-1)}$.
In this line the orthogonalization A~ of the sub-algebras An leads us to define a space H~-n) of generalized white noise functionals of degree nand finally come to the space of generalized white noise functionals :
(L2)- = EBcn1{~-n). 3. Discrete parameter case We restrict time parameter to [0,1]. The linear space spanned by (E, ~n)"s ~n being a base of L2([0, 1]) is the same as the space spanned by the (E, Xn,k) where
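$$\chi_{n,k} = \chi_{\Delta_k^n}, \qquad \Delta_k^n = \left[\frac{k-1}{2^n}, \frac{k}{2^n}\right].$$
Write $X_k^n = 2^{n/2} \langle \dot B, \chi_{n,k} \rangle$. Then the $X_k^n$'s are independent, identically $N(0,1)$-distributed, $n = 0, 1, 2, \ldots$; $1 \le k \le 2^n$. Obviously
$$\mathcal{B}(X_k^n,\ n = 0, 1, 2, \ldots;\ 1 \le k \le 2^n) = \mathcal{B} = \bigvee_n \mathcal{B}_n,$$
where $\mathcal{B}_n = \mathcal{B}(X_k^n,\ 1 \le k \le 2^n)$. We have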
Theorem 3.1. The space $(L_n^2)$ is increasing in $n$, and the inductive limit of $(L_n^2)$ is equal to $(L^2)$.

4. Operators: from discrete to continuous form
We first approximate a Brownian motion by Lévy's construction (see [7]), which is well suited to realizing an approximation. That is, we take an independent system $\{\Delta_k^n B\}$, which approximates Brownian motion as $\{\Delta_k^n\}$ gets finer. Thus, we take $\{\Delta_k^n B / \Delta_k^n\}$ as a basic system for the random variables in what follows. We are now going to explain why we take the Fréchet derivative to define $\partial_t = \frac{\partial}{\partial \dot B(t)}$. In fact, we define $\partial_t$ as a limit (in the sense to be prescribed below) of $\frac{\partial}{\partial X_n}$, where $\{X_n\}$ is independent and identically $N(0,1)$-distributed. To see the idea we simply note the following.
1) In the discrete parameter case, let $X = (X_1, X_2, \ldots)$, the $X_i$'s being independent and identically $N(0,1)$-distributed. The partial derivative $\frac{\partial}{\partial X_n} f(X)$ is defined to be
$$\frac{\partial}{\partial x_n} f(x_1, x_2, \ldots)\Big|_{x_j = X_j}. \tag{3}$$
We wish to use an analogous technique.
2) Since
$$\Delta_k^n \to \{t\} \implies \frac{\Delta_k^n B}{\Delta_k^n} \to \dot B(t), \tag{4}$$
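and since the S-transform of $\dot B(t)$ is $\xi(t)$, we can use a counterpart of (3). Namely, for $\varphi(\dot B) = \varphi(\dot B(t), t \in \mathbf{R}^1)$, first take $\frac{\delta}{\delta \xi(t)} (S\varphi)(\xi)$ and apply $S^{-1}$. This is expressible as
$$\partial_t \varphi = \frac{\partial \varphi}{\partial \dot B(t)} = S^{-1} \frac{\delta}{\delta \xi(t)} (S\varphi), \tag{5}$$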
where $\frac{\delta}{\delta \xi(t)}$ is the Fréchet derivative. Formally we follow this; however, we need some interpretation to have (5) understood correctly. Coming back to (3), the partial derivative $\frac{\partial}{\partial X_n}$ means a derivative with respect to $X_n$. This fact should correspond to a variation of the random variable $X_n$, for which we recognize the variation within the world of random functions. This is difficult to justify. Once this difficulty is overcome, the definition (5) is acceptable. In the expression (5), the Fréchet derivative is understood to be a derivative obtained by measuring infinitesimal variations in all possible directions. The idea of understanding the partial derivative with respect to the random variable $\dot B(t)$ is the same as in the discrete case. Concerning the definition of the partial derivative $\partial_t$, we have to note a crucial difference from that in the discrete case. For $\frac{\partial}{\partial X_n}$, the variable $X_n$ is a standard gauge, i.e. $X_n$ has unit variance. On the other hand, $\dot B(t)$ is an infinitesimal random variable; formally speaking, it has variance $\frac{1}{dt}$. Such a difference is absorbed by the S-transform. Nevertheless, we must be careful when $\partial_t$ is approximated by $\frac{\partial}{\partial X_n}$ in the discrete case. The examples which are now in order illustrate this fact. We note that the Fréchet derivative lets the degree of homogeneous functionals decrease by 1, which means $\partial_t$ is an annihilation operator. It
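acts in such a way that the kernel function $F(u_1, \ldots, u_n)$ associated to $\varphi \in H_n$ becomes
The S-transform acting on $(L^2)^-$ is expressible as
$$(S\varphi)(\xi) = C(\xi) \langle e^{\langle x, \xi \rangle}, \varphi(x) \rangle,$$
since $e^{\langle \cdot, \xi \rangle} \in (L^2)^+$. Set $\mathcal{F} = \{(S\varphi)(\xi);\ \varphi \in (L^2)^-,\ \xi \in E\}$.
Theorem 4.1. The vector space $\mathcal{F}$ can be topologized to be a Reproducing Kernel Hilbert Space with kernel $C(\xi - \eta)$, $(\xi, \eta) \in E \times E$, and $\mathcal{F} \cong (L^2)^-$.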
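Example 4.1. If $f$ is continuous, then
$$\partial_t \int f(u) \dot B(u)\,du = f(t).$$
In particular, $\partial_t \dot B(s) = \delta(t - s)$, where $\delta$ is the delta function. This is the analogue of $\frac{\partial}{\partial X_n} X_m = \delta_{n,m}$ in the discrete case.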
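The computation behind Example 4.1 can be traced through definition (5). The following is only a sketch: it assumes the standard facts that the S-transform of the linear functional $\int f(u)\dot B(u)\,du$ is $\int f(u)\xi(u)\,du$ and that $S^{-1}$ leaves constants unchanged, neither of which is stated explicitly in the text.

```latex
\begin{align*}
\varphi &= \int f(u)\,\dot B(u)\,du,
  & (S\varphi)(\xi) &= \int f(u)\,\xi(u)\,du, \\
\frac{\delta}{\delta \xi(t)} (S\varphi)(\xi) &= f(t),
  & \partial_t \varphi &= S^{-1} f(t) = f(t).
\end{align*}
```

Formally taking $f = \delta_s$ in this computation gives $\partial_t \dot B(s) = \delta(t - s)$.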
Example 4.2. $\partial_t :\!\dot B(s)^n\!: \;= n :\!\dot B(t)^{n-1}\!: \delta(t - s)$.
Note that $\delta(t - s)$ appears in the above expression.
Example 4.3. For the Gauss kernel
$$\varphi = N e^{c \int \dot B(u)^2 du}, \tag{6}$$
we have
$$\partial_t \varphi = 2c :\!\dot B(t) \varphi\!:, \tag{7}$$
which comes from the variation of the S-transform
$$\frac{\delta}{\delta \xi(t)} U(\xi) = 2c\, \xi(t) U(\xi) \tag{8}$$
(see reference [6]). Taking the $S^{-1}$-transform, we obtain (7).
Particular interest centers on quadratic normal functionals, in the sense of Lévy [10], the S-transforms of which are expressed in the form
$$U(\xi) = \int_0^1 f(u) \xi(u)^2\,du + \int_0^1\!\!\int_0^1 F(u, v) \xi(u) \xi(v)\,du\,dv,$$
where f is continuous and F E L2(R2). The action of the partial differential operator transform in the expression
at
is seen by its S-
UI(~, t) = 2f(t)~(t) + 10 t F(u, v)~(u)du. The second derivative
(9)
(10)
a; corresponds to the functional U/I(~,
t) = 2f(t)8 t
(11)
to which we wish to correspond 8~2 in the digital case. If we wish to correspond to the trace
z= 8~~
,
then nwe define
(12) In view of this, the second term is annihilated. Hence for
0, the operator (sin)aC c be defined by 00
( . )
sm
r
'"'
al..-C = ~
a
2n-l
r(2n-l)
(2n _1)!l..-c
(we do not discuss now in which sense the series converges) and let (sin)aCb be the operator in a space of E* -cylindrical measures on E, which is adjoint to (sin)aCc. Theorem 5.1. The dynamics of the Wigner measure is governed by the following equation:
The proof can be obtained by combination of technique of the theory of differentiable measures and some methods of developing equations describing the evolution of the Wigner function [4], [5], [6].
Theorem 5.2. Let z; E Mg(E), {ek" .. . , ekn } , {erp ... , er",} C B and let the Hamiltonian function H be Gr" ... ,rTn -cylindrical. If {erp .. . , e rTn } C {ek" ... , ekn }, then
f 2 (sin H .c;,.c v \x)
=
kl J· .. ,kn
2 J 2(sin)~£H
T l , · .. ,T;n
(f{vr 1 , ··· , r rn } U{k 1, · ·· , kn })( ... xs" ... ,xsP ... )dxs, .. .dxs P .
Theorem 5.3. A function W(·) taking values in Mg(E) is a solution of the equation governing the dynamics of the Wigner measure if and only if the functions t f--' gr" ... ,rn (t), defined by the equalities gr" .. .,r n (t) =
fr2,(,s.'.·n.,r) n~.c:H( W (t )) satisfy the following infi nite system of differential equations:
g~" . .,kn (t)( ·)
=
L J2(sin)~£Hr",rTn (g{ r" ... ,r=}U{k" ... ,kn}(t))
where the summation is over all finite sets $\{r_1, \ldots, r_m\}$ of natural numbers for which $\{r_1, \ldots, r_m\} \cap \{k_1, \ldots, k_n\} \neq \emptyset$, and, in accordance with what has been said above, if $\{r_1, \ldots, r_m\} \cap \{k_1, \ldots, k_n\} = \{r_1, \ldots, r_m\}$, then the integral sign is assumed to be missing.
This follows from Theorems 5.1 and 5.2. Suppose that the operator $(\sin)\mathcal{L}_H$ is defined on $M^{\infty}(E)$. Theorem 5.4. Let, for every finite set $\{r_1, \ldots, r_n\}$ of natural numbers, $g_{r_1, \ldots, r_n}$ be a function of a real argument taking values in the space of (bounded continuous) functions on $F_{r_1, \ldots, r_n}$, and, for each $t$, let the family of functions $g_{r_1, \ldots, r_n}(t)$ be compatible and determine a unique measure $w(t) \in M^{\infty}(E)$. Then the function $w(\cdot)$ is a solution of the equation governing the dynamics of the Wigner measure if the family of functions
grl, ... ,rn is a solution of the system of equations
x (.)
+
1
"" ~
lim - -
N-.oo Nm-p
{jl , ,, .,jp }={ rl ,,,. ,r",} n { kl ,,, . ,k n
}
If it is not assumed that the Planck constant is equal to one, then it is necessary to substitute "2(sin)~" by "~(sin)~ ", where $\hbar$ is the Planck constant. Remark 5.4. From Theorem 5.4, which is similar to Theorem 4.1, one can deduce a quantum version of the classical Bogolyubov system of equations and also some other similar systems of equations. It is also worth noticing that the integrals $\int_P w(t)(dq, dp)$ are not probabilities and hence (Remark 5.3) the measures $w(t)$ are not Wigner measures.
6. Generalization of Poincare's model.
In this section we formulate a proposition (Proposition 6.1 below) that improves a similar proposition related to the Poincaré model ([1], [8], [9]) of an evolution that is irreversible but symmetric with respect to time. In the original model one assumes that the initial probability distribution on the phase space is the product of identical one-dimensional distributions; in Proposition 6.1 we consider a general distribution on the phase space and prove that this distribution tends (both when time tends to $+\infty$ and to $-\infty$) to the same limit as in the case when the initial distribution is the product of identical distributions. A similar improvement is valid for the quantum version [10] of the Poincaré model. Proposition 6.1. Let $E$
$= Q \times P$, $E' = P \times Q$, $I(p, q) = (q, -p)$, and let, in natural notations, $\mathcal{H}(q, p) = \sum_k \frac{p_k^2}{2}$ (this Hamiltonian system describes noninteracting particles). Let $\nu(\cdot)$ be a solution of the Liouville equation for probability measures such that the finite-dimensional projections of the measure $\nu(0)$ on $E$ satisfy the conditions of Theorem 4.1 in [10]. If $Q_n$ is a finite-dimensional subspace of the space $Q$ and, for each $t$, $P(t, \cdot)$ is the density of the projection of the measure $\nu(t)$ on $Q_n$, then, for any compact
subsets $K_1$ and $K_2$ of $Q_n$, the following holds:
$$\frac{\int_{K_1} P(t, q)\,dq}{\int_{K_2} P(t, q)\,dq} \to \frac{\operatorname{mes} K_1}{\operatorname{mes} K_2}, \qquad t \to \pm\infty,$$
where mes denotes the Lebesgue measure on $Q_n$ (so one could say that the probability distribution on the configuration space tends to the uniform distribution). The proof uses Theorem 3.2 and is similar to the proof of Theorem 4.1 from [10]. This proposition somewhat strengthens a result of [8] which goes back to Poincaré [1]. In [8] it was actually proved in the special case when the initial probability measure $\nu(0)$ is the product of copies of a one-dimensional probability measure. A completely similar proposition is valid for quantum systems, because for systems with quadratic Hamiltonian functions the equations for the Wigner measure coincide with the Liouville equations (cf. [10]).
References
1. H. Poincaré. J. de Physique théorique et appliquée, 4 série, 5 (1906), 369-403.
2. N. N. Bogolyubov. Problems of dynamical theory in statistical physics. Moscow, 1946 (in Russian; there exists an English translation).
3. E. Wigner. Phys. Rev., 40 (1932), 749.
4. J. E. Moyal. Quantum mechanics as a statistical theory. Proc. Cambridge Philos. Soc., v. 45 (1949), 99-124.
5. G. B. Folland. Harmonic Analysis in Phase Space (Princeton Univ. Press, 1989).
6. Y. S. Kim, M. E. Noz. Phase-Space Picture of Quantum Mechanics. Group Theoretical Approach (World Scientific, 1991).
7. R. Balescu. Equilibrium and nonequilibrium statistical mechanics, vol. 1 (John Wiley and Sons, 1975).
8. V. V. Kozlov. Thermal Equilibrium in the sense of Gibbs and Poincaré (Moscow-Izhevsk, 2002) [in Russian].
9. V. V. Kozlov. Reg. Chaotic Dyn. 1 (2004), 23-34.
10. V. V. Kozlov, O. G. Smolyanov. Theory Probability and Applications. Vol. 51, 1 (2006), pp. 1-13.
11. O. G. Smolyanov, S. V. Fomin. Soviet Mathematical Surveys, V. 31, 4 (1976), 3-56.
12. O. G. Smolyanov, H. von Weizsaecker. Comptes Rend. Acad. Sci. Paris. T. 321, ser. 1 (1995), 103-108.
13. N. Bourbaki. Integration, Chapitre 6 (Springer, 2007).
14. O. G. Smolyanov, H. v. Weizsacker. Smooth probability measures and associated differential operators. Inf. Dimens. Anal., Quantum Probab. and Relat. Top. V. 2, 1 (1999), 51-78.
15. L. Accardi and O. G. Smolyanov. Generalized Levy Laplacians and Cesaro Means. Doklady Mathematics, Vol. 79, 1 (2009), 1-4.
16. T. Hida, H. H. Kuo, J. Pothoff, L. Streit. White noise. An infinite dimensional calculus. Kluwer Academic, 1993.
Quantum Bio-Informatics IV eds. L. Accardi, W. Freudenberg and M. Ohya © 2011 World Scientific Publishing Co. (pp. 339-354)
ANALYSIS OF SEVERAL CATEGORICAL DATA USING MEASURE OF PROPORTIONAL REDUCTION IN VARIATION
KOUJI YAMAMOTO, KOUJI TAHATA, NOBUKO MIYAMOTO AND SADAO TOMIZAWA
Department of Information Sciences, Tokyo University of Science, Noda City, Chiba 278-8510, Japan

For a two-way contingency table with nominal row and column variables, measures which describe the proportional reduction in variation (PRV) from the marginal distribution of one variable to the conditional distribution given the other variable were proposed by Goodman and Kruskal (1954), Theil (1970), and Freeman (1987, p. 101). Tomizawa, Seo and Ebi (1997), and Miyamoto, Usui and Tomizawa (2005) proposed generalizations of those measures. Tomizawa, Miyamoto and Yajima (2002), and Yamamoto and Tomizawa (2009) proposed PRV measures for a nominal-ordinal contingency table and for an ordinal-ordinal contingency table, respectively. The present paper (1) reviews these PRV measures and (2) analyzes and compares several categorical data sets using these PRV measures.
Keywords: Concentration coefficient; Measure, Proportional reduction in variation; Square contingency table; Total uncertainty coefficient.
1. Introduction
The data in Table 1, taken from the Meteorological Agency in Japan, are obtained from the daily temperatures at Nagasaki City, Japan, in two years, 2001 and 2002, using three levels: (1) below normals, (2) normals and (3) above normals (see Tahata, Takazawa and Tomizawa, 2008). The observation, say $n_{ij}$, in the $(i, j)$-th cell indicates that on $n_{ij}$ of the 365 days (i.e., from 1 January to 31 December), the temperature level was $i$ in 2001 and $j$ in 2002. Table 2 is taken from Tallis (1962) and constructed from the cross-classified data of Merino ewes according to the numbers of lambs born in consecutive years, 1952 and 1953 (also see Bishop, Fienberg and Holland, 1975, p. 288; Miyamoto, Niibe and Tomizawa, 2005).
Table 3 is the data on unaided distance vision of 7477 women aged 30-39 employed in Royal Ordnance factories in Britain from 1943 to 1946. The row variable is the right eye grade and the column variable is the left eye grade with the categories ordered from the Best (1) to the Worst (4). The vision data in Table 3 have been analyzed by many statisticians, including Stuart (1955), Bishop et al. (1975, p. 284), McCullagh (1978), Goodman (1979) , Agresti (1983) , Tomizawa (1985 , 1993, 2009), Miyamoto, Ohtsuka and Tomizawa (2004), Tomizawa, Miyamoto and Yamamoto (2006), Tomizawa and Tahata (2007), and Tahata, Yamamoto, Nagatani and Tomizawa (2009). Table 4 is the data on unaided distance vision of 3168 pupils comprising nearly equal number of boys and girls aged 6-12 at elementary schools in Tokyo, Japan, examined in June 1984. The data in Table 4 have also been analyzed by Tomizawa (1985), Miyamoto et al. (2004), and Tahata and Tomizawa (2006). Table 5 is the data on unaided distance vision of 4746 students aged 18 to about 25 including about 10 percent women in Faculty of Science and Technology, Science University of Tokyo in Japan examined in April 1982. The data in Table 5 have been analyzed by Tomizawa (1984, 1985) and Tahata et al. (2009). The data in Table 6 represent the cross-classification of a sample of individuals according to their socioprofessional category in 1954 and in 1962 (see Caussinus, 1965; Bishop et al., 1975, p. 298). Tables 1 through 6 are the data of square contingency tables having the same row and column classifications. In addition, the categories in each of Tables 1 through 6 are ordered. Many observations concentrate on (or near) the main diagonal cells in the table. Therefore the row classification tends to be strongly associated with the column classification, namely, the model of independence (i.e., null association) between the row and column classifications does not hold. 
For those data we are interested in whether or not the row value of an individual is symmetric to the column value. Many models of symmetry and asymmetry have been proposed by many statisticians; for instance, Bowker (1948), Caussinus (1965), Bishop et al. (1975, Chap. 8), McCullagh (1978), Goodman (1979), Agresti (1983, 2002), Tomizawa (1993, 2009) , and Tomizawa and Tahata (2007). We omit here the details of models of symmetry or asymmetry. For the data in Tables 1 through 6 we are also interested in measuring the relative improvement in variation in predicting the value of the other variable when the value of one variable is known, opposed to when it is not
known. Consider an $r \times c$ contingency table in which both the explanatory variable $X$ and the response variable $Y$ have nominal categories. Let $p_{ij}$ denote the probability that an observation will fall in the $(i, j)$-th cell ($i = 1, \ldots, r$; $j = 1, \ldots, c$). A measure which describes the proportional reduction in variation (PRV) from the marginal distribution of $Y$ to the conditional distribution of $Y$ given the value of $X$ has the form
$$\frac{V(Y) - E[V(Y|X)]}{V(Y)}, \tag{1.1}$$
where $V(Y)$ is an index of variation for the marginal distribution of $Y$, and $E[V(Y|X)]$ is the expectation of the conditional variation taken with respect to the distribution of $X$ (Agresti, 2002, p. 56). Tomizawa, Seo and Ebi (1997) proposed the generalized PRV measure defined by
$$T^{(\lambda)} = \frac{\sum_{i=1}^{r} \sum_{j=1}^{c} p_{ij}^{\lambda+1} / p_{i\cdot}^{\lambda} - \sum_{j=1}^{c} p_{\cdot j}^{\lambda+1}}{1 - \sum_{j=1}^{c} p_{\cdot j}^{\lambda+1}} \qquad (\lambda > -1),$$
where
$$p_{i\cdot} = \sum_{t=1}^{c} p_{it}, \qquad p_{\cdot j} = \sum_{s=1}^{r} p_{sj},$$
and the value at $\lambda = 0$ is taken to be the continuous limit as $\lambda \to 0$, and where $\lambda$ is a real value that is chosen by the user. Note that Tomizawa and Ebi (1998) and Tomizawa and Machida (1999) extended the measure $T^{(\lambda)}$ to multi-way contingency tables. The variation index used in $T^{(\lambda)}$ is
$$V(Y) = \frac{1}{\lambda} \left(1 - \sum_{j=1}^{c} p_{\cdot j}^{\lambda+1}\right),$$
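which includes the Shannon entropy (when $\lambda = 0$) and the Gini concentration (when $\lambda = 1$). In special cases, when $\lambda = 1$, $T^{(1)}$ is identical to Goodman and Kruskal's (1954) measure (called the concentration coefficient) defined by
$$\tau = \frac{\sum_{i=1}^{r} \sum_{j=1}^{c} p_{ij}^{2} / p_{i\cdot} - \sum_{j=1}^{c} p_{\cdot j}^{2}}{1 - \sum_{j=1}^{c} p_{\cdot j}^{2}},$$
and when $\lambda = 0$, $T^{(0)}$ is identical to Theil's (1970) measure (called the uncertainty coefficient) defined by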
$$U = \frac{\sum_{i=1}^{r} \sum_{j=1}^{c} p_{ij} \log\left(\dfrac{p_{ij}}{p_{i\cdot} p_{\cdot j}}\right)}{-\sum_{j=1}^{c} p_{\cdot j} \log p_{\cdot j}}.$$
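As a minimal sketch of how a measure of form (1.1) can be evaluated, the following Python function computes the PRV for a table of counts, using the variation index $V(Y) = \frac{1}{\lambda}(1 - \sum_j p_{\cdot j}^{\lambda+1})$ and its Shannon-entropy limit at $\lambda = 0$. The function name `prv_measure` and the plug-in use of observed proportions are illustrative assumptions, not taken from the paper; the counts are those of Table 1.

```python
import math


def prv_measure(table, lam):
    """PRV of form (1.1) for a table of counts (rows = X, columns = Y).

    Hypothetical helper: uses V(Y) = (1 - sum_j p_j^(lam+1)) / lam for
    lam > -1, lam != 0, and the Shannon entropy as the lam -> 0 limit.
    """
    n = sum(sum(row) for row in table)
    p = [[x / n for x in row] for row in table]
    row_tot = [sum(row) for row in p]          # p_i.
    col_tot = [sum(col) for col in zip(*p)]    # p_.j

    def variation(dist):
        if lam == 0:
            return -sum(q * math.log(q) for q in dist if q > 0)
        return (1.0 - sum(q ** (lam + 1) for q in dist)) / lam

    v_y = variation(col_tot)
    # E[V(Y|X)]: conditional variation averaged over the distribution of X
    e_v = sum(
        pi * variation([pij / pi for pij in row])
        for row, pi in zip(p, row_tot)
        if pi > 0
    )
    return (v_y - e_v) / v_y


# Table 1 counts (rows: 2001 level, columns: 2002 level)
table1 = [[11, 18, 30], [38, 79, 64], [19, 64, 42]]
print(prv_measure(table1, 1))  # concentration coefficient (lam = 1)
print(prv_measure(table1, 0))  # uncertainty coefficient (lam = 0)
```

With $\lambda = 1$ the function reduces to the concentration coefficient and with $\lambda = 0$ to the uncertainty coefficient. For the Table 1 counts the value at $\lambda = 1$ is about 0.014, so knowing the 2001 level reduces the variation in the 2002 level only slightly.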
In a nominal-nominal contingency table, for a situation in which the explanatory and response variables are not defined clearly, a measure which describes the PRV from the marginal distribution of one variable (of $X$ and $Y$) to the conditional distribution of the variable given the value of the other variable has the general form
$$\frac{V(Y) + V(X) - E[V(Y|X)] - E[V(X|Y)]}{V(Y) + V(X)}. \tag{1.2}$$
Miyamoto, Usui and Tomizawa (2005) proposed a generalized PRV measure, i.e., a generalized total uncertainty measure $T_{total}^{(\lambda)}$ with $\lambda > -1$ (in a similar idea to $T^{(\lambda)}$). In a special case, when $\lambda = 0$, $T_{total}^{(0)}$ is identical to Freeman's (1987, p. 101) total uncertainty measure defined by
$$U_{total} = \frac{2 \sum_{i=1}^{r} \sum_{j=1}^{c} p_{ij} \log\left(\dfrac{p_{ij}}{p_{i\cdot} p_{\cdot j}}\right)}{-\sum_{i=1}^{r} p_{i\cdot} \log p_{i\cdot} - \sum_{j=1}^{c} p_{\cdot j} \log p_{\cdot j}}.$$
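The factor 2 in the numerator of $U_{total}$ reflects the fact that, with the Shannon entropy as variation index, $V(Y) - E[V(Y|X)]$ and $V(X) - E[V(X|Y)]$ both equal the mutual information of $X$ and $Y$, so the numerator of (1.2) is twice that quantity. A minimal Python sketch of the plug-in computation follows; the function name `total_uncertainty` is a hypothetical helper, not from the paper, and the counts are those of Table 1.

```python
import math


def total_uncertainty(table):
    """Plug-in value of U_total for a table of counts (hypothetical helper)."""
    n = sum(sum(row) for row in table)
    p = [[x / n for x in row] for row in table]
    row_tot = [sum(row) for row in p]          # p_i.
    col_tot = [sum(col) for col in zip(*p)]    # p_.j

    # Mutual information: sum_ij p_ij log(p_ij / (p_i. p_.j)), skipping empty cells
    mutual = sum(
        pij * math.log(pij / (row_tot[i] * col_tot[j]))
        for i, row in enumerate(p)
        for j, pij in enumerate(row)
        if pij > 0
    )
    h_x = -sum(q * math.log(q) for q in row_tot if q > 0)  # entropy of X
    h_y = -sum(q * math.log(q) for q in col_tot if q > 0)  # entropy of Y
    return 2.0 * mutual / (h_x + h_y)


# Table 1 counts (rows: 2001 level, columns: 2002 level)
table1 = [[11, 18, 30], [38, 79, 64], [19, 64, 42]]
print(total_uncertainty(table1))
```

Because rows and columns enter symmetrically, transposing the table leaves the value unchanged, which is the point of using form (1.2) when neither variable is designated as the response.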
For a nominal-ordinal table with a nominal variable $X$ and an ordinal variable $Y$, Tomizawa, Miyamoto and Yajima (2002) proposed a PRV measure. For ordinal-ordinal tables, Tomizawa and Yukawa (2003, 2004) proposed some PRV measures. Also, for an ordinal-ordinal table in which the explanatory and response variables are not defined clearly, Yamamoto and Tomizawa (2009) considered a PRV measure $\Phi_{total}^{(\lambda)}$ with $\lambda > -1$ (see Section 2).
For the data in Tables 1 through 6, we cannot define clearly which of the row and column variables is the explanatory variable and which is the response variable. So, for these data we are interested in applying the Yamamoto-Tomizawa PRV measure $\Phi_{total}^{(\lambda)}$. The purpose of the present paper is (1) to review the PRV measure $\Phi_{total}^{(\lambda)}$ and (2) to analyze and compare the data in Tables 1 through 6 using the measure $\Phi_{total}^{(\lambda)}$.

2. Review of generalized total uncertainty measure

Consider an $r \times c$ contingency table with ordered categories in which the explanatory and response variables are not defined clearly. This section briefly reviews the generalized total uncertainty measure $\Phi_{total}^{(\lambda)}$. The measure is defined as follows: for $\lambda > -1$,
(A + 1) I]>(A) _ total -
[~~(F(k))A+1 ~ ~'J
J(A) l(Jk)
+
j=l k = l
~ ~(dk))A+1 ~ ~ i= l k= l
2'
r- 1
c-1
L H~~]) + L j=l
'
H~~)
i=l
where c
j
F.~1) =
LP.t,
L
F.j2) =
t= l
=
r
'" ~Ps.,
C(2) 2'
= '" ~
Ps· ,
s=i+1
s=l
~A
(1 _~(F(k))A+1)
'
= ~A
(1 _~(dk))A+1)
,
H(A) = l(J)
H(A) 2(2)
p·t ,
t= j + 1
i
(l) C i·
~'J k=l
~
2'
k=l
(A)
J'Uk)
=
1
'\(H 1)
~ Flk) r
F(k)
~
{(
p,) A} ,
F(k)jF(k) 2J
1
J(A) 2(2k)
'J
_
1
with j
FS) =
aU) =
F(2) =
LPit, t=1
2J
c
Pit, L t=j+l
i
r
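The sample version of the measure, $\hat\Phi_{total}^{(\lambda)}$, is given by $\Phi_{total}^{(\lambda)}$ with $\{p_{ij}\}$ replaced by $\{\hat p_{ij}\}$, where $\hat p_{ij} = n_{ij}/n$ and $n = \sum\sum n_{ij}$. Using the delta method (Bishop et al., 1975, Sec. 14.6), $\sqrt{n}(\hat\Phi_{total}^{(\lambda)} - \Phi_{total}^{(\lambda)})$ has asymptotically ($n \to \infty$) a normal distribution with mean zero and variance $\sigma^2[\Phi_{total}^{(\lambda)}]$. For the details of $\sigma^2[\Phi_{total}^{(\lambda)}]$, see Yamamoto and Tomizawa (2009). Therefore we can obtain an approximate confidence interval for $\Phi_{total}^{(\lambda)}$ using the estimated approximate standard error $\hat\sigma[\Phi_{total}^{(\lambda)}]/\sqrt{n}$ for $\hat\Phi_{total}^{(\lambda)}$, where $\hat\sigma^2[\Phi_{total}^{(\lambda)}]$ denotes $\sigma^2[\Phi_{total}^{(\lambda)}]$ with $\{p_{ij}\}$ replaced by $\{\hat p_{ij}\}$.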
4. Analysis of data
We shall analyze the data in Tables 1 through 6 using the total uncertainty measure $\Phi_{total}^{(\lambda)}$. Table 7 gives the value of the estimated measure $\hat\Phi_{total}^{(\lambda)}$ and the approximate 95% confidence interval for the measure $\Phi_{total}^{(\lambda)}$ applied to the data in Tables 1 through 6. We see from Table 7 that the confidence interval for $\Phi_{total}^{(\lambda)}$ applied to the data in Table 1 includes zero for all $\lambda$. This would indicate that there is a structure of independence, i.e., null association, between the daily temperature at Nagasaki City in 2001 and in 2002. Namely, when we know the temperature of a day in 2001 (in 2002), the knowledge would not be useful for predicting the temperature of the same day in 2002 (in 2001). We also see from Table 7 that for the data of Merino ewes in Table 2 the values of the estimated measure $\hat\Phi_{total}^{(\lambda)}$ are close to zero; however, the confidence intervals for $\Phi_{total}^{(\lambda)}$ do not include zero. This would indicate that the number of lambs born in 1953 may be somewhat associated with the number of lambs born in 1952. Namely, when we know the number of lambs born in 1952 (in 1953), the knowledge may be somewhat useful for predicting the number of lambs born in 1953 (in 1952). Moreover, we see from Table 7 that for the three kinds of vision data in Tables 3, 4 and 5, the values of the estimated measure $\hat\Phi_{total}^{(\lambda)}$ are greater than zero and the confidence intervals for $\Phi_{total}^{(\lambda)}$ do not include zero. Therefore, for each of the vision data sets in Tables 3, 4 and 5, the right eye grade for an individual is strongly associated with the left eye grade for the individual. Namely, when we know the grade of one eye (right or left) for an individual, the knowledge would be useful for predicting the grade of the other eye for the individual.
In addition, we see from Table 7 that (1) the value of the estimated measure $\hat\Phi_{total}^{(\lambda)}$ and the values in the confidence interval for $\Phi_{total}^{(\lambda)}$ applied to the vision data of pupils in Table 4 are greater than the corresponding values applied to the vision data of women in Table 3, and (2) the values applied to the vision data of students in Table 5 are greater than the corresponding values applied to the vision data of pupils in Table 4. Thus, when we want to predict the grade of one eye for an individual from knowledge of the grade of the other eye for the individual, (1) the knowledge would be more useful for the data of pupils in Table 4 than for the data of women in Table 3, and (2) the knowledge would be more useful for the data of students in Table 5 than for the data of pupils in Table 4. We see from Table 7 that, for example, when $\lambda = 1$, the value of the estimated measure $\hat\Phi_{total}^{(1)}$ is 0.650 for the data of students in Table 5. Thus, when we predict the grade of one eye for a student from knowledge of the grade of the other eye for the student, the prediction becomes 65% better when we know the information than when we do not. Similarly, for the vision data of pupils in Table 4, the prediction becomes 53% better when we know the information than when we do not. Also, for the vision data of women in Table 3, the prediction becomes 44% better. We see further from Table 7 that for the data of socioprofessional status in Table 6, the values of the estimated measure $\hat\Phi_{total}^{(\lambda)}$ are greater than zero and the confidence intervals for $\Phi_{total}^{(\lambda)}$ do not include zero. In addition, the value of $\hat\Phi_{total}^{(\lambda)}$ applied to the data in Table 6 is greater than any other value of $\hat\Phi_{total}^{(\lambda)}$ applied to the data in Tables 1 through 5. Therefore, for the data of socioprofessional status in Table 6, the socioprofessional status in 1962 for an individual is strongly associated with the status in 1954 for the individual. Namely, when we know the socioprofessional status in 1954 (in 1962) for an individual, the knowledge would be useful for predicting the status in 1962 (in 1954) for the individual. We see from Table 7 that, for example, when $\lambda = 1$, the value of the estimated measure $\hat\Phi_{total}^{(1)}$ is 0.699 for the data of socioprofessional status in Table 6.
Thus, when we want to predict the socioprofessional status of an individual in one of the years 1954 and 1962 from knowledge of the status in the other year, the prediction becomes about 70% better when we know the information than when we do not.

5. Remarks
For a two-way contingency table with the explanatory variable $X$ and the response variable $Y$, a PRV measure of form (1.1), including, e.g., $T^{(\lambda)}$, $\tau$ and $U$, would be useful for seeing to what degree the relative improvement in variation for predicting the value of the response variable $Y$, when we know the value of the explanatory variable $X$, approaches perfect prediction (i.e., when the measure equals 1). For a two-way contingency table in which the explanatory and response variables are not defined clearly, a PRV measure of form (1.2), including, e.g., $T_{total}^{(\lambda)}$, $U_{total}$ and $\Phi_{total}^{(\lambda)}$, would be useful for seeing to what degree the relative improvement in variation for predicting the value of one variable, when we know the value of the other variable, approaches perfect prediction. Yamamoto and Tomizawa (2009) applied the total uncertainty measure $\Phi_{total}^{(\lambda)}$ to the data of cross-classification of father's and son's occupational status in Denmark, in Britain and in Japan (though the details are omitted here). When we want to predict the son's occupational status for a father-son pair from knowledge of the father's occupational status, and conversely to predict the father's occupational status from knowledge of the son's occupational status, we are interested in the degree to which the prediction becomes better when we know the information for one member of the pair than when we do not. In such a case, a PRV measure such as $\Phi_{total}^{(\lambda)}$ would be useful for measuring the degree of the proportional reduction in variation. Note that (1) the measures $T^{(\lambda)}$, $\tau$, $U$, $T_{total}^{(\lambda)}$ and $U_{total}$ are usually used when the row and column classifications both have nominal categories, (2) the measure $\Phi_{total}^{(\lambda)}$ is used when both have ordered categories, and (3) the PRV measure proposed by Tomizawa et al. (2002) is used when one of the row and column classifications has nominal categories and the other has ordered categories.
6. Conclusions
The present paper has analyzed several categorical data sets and compared the degree of PRV between them using the total uncertainty measure $\Phi_{total}^{(\lambda)}$. We have seen that when we want to predict the value of one variable by knowing the value of the other variable, the prediction based on the information would be more useful for the unaided distance vision data and for the data of socioprofessional status than for the data of temperatures in two years and for the data of numbers of lambs born in consecutive years.

7. Discussion
The data in Table 8, taken from Everitt (1992, p. 56), show the frequencies obtained when 284 consecutive admissions to a psychiatric hospital are classified with respect to social class and diagnosis. Everitt examined to what degree the knowledge of a patient's social class is useful for predicting his diagnostic category using Goodman and Kruskal's (1954) lambda measure. We are now interested in applying the PRV measures $T^{(\lambda)}$, $T_{total}^{(\lambda)}$ and $\Phi_{total}^{(\lambda)}$ to the data in Table 8. However, since the categories in Table 8 seem to be nominal (i.e., not ordinal), it would not be suitable to apply the measure $\Phi_{total}^{(\lambda)}$. Therefore we shall apply $T^{(\lambda)}$ and $T_{total}^{(\lambda)}$ to these data. We now see that the estimated values of $T^{(\lambda)}$ ($T_{total}^{(\lambda)}$) are, for instance, 0.041 (0.046) when $\lambda = 0$, 0.043 (0.050) when $\lambda = 0.4$, and 0.039 (0.049) when $\lambda = 1$, and the confidence intervals for $T^{(\lambda)}$ ($T_{total}^{(\lambda)}$) do not include zero, though the details are omitted. Therefore the knowledge of a patient's social class would be useful for predicting his diagnostic category, and conversely the knowledge of a patient's diagnostic category would be useful for predicting his social class. So, using the PRV measure, it would be important to examine to what degree the prediction of his diagnostic category becomes better when one knows the patient's social class than when one does not know it.
References
1. Agresti, A. (1983). A simple diagonals-parameter symmetry and quasi-symmetry model. Statistics and Probability Letters, 1, 313-316.
2. Agresti, A. (2002). Categorical Data Analysis, second edition. Wiley, New York.
3. Bishop, Y. M. M., Fienberg, S. E. and Holland, P. W. (1975). Discrete Multivariate Analysis: Theory and Practice. The MIT Press, Cambridge, Massachusetts.
4. Bowker, A. H. (1948). A test for symmetry in contingency tables. Journal of the American Statistical Association, 43, 572-574.
5. Caussinus, H. (1965). Contribution à l'analyse statistique des tableaux de corrélation. Annales de la Faculté des Sciences de l'Université de Toulouse, 29, 77-182.
6. Cressie, N. and Read, T. R. C. (1984). Multinomial goodness-of-fit tests. Journal of the Royal Statistical Society, Ser. B, 46, 440-464.
7. Everitt, B. S. (1992). The Analysis of Contingency Tables, second edition. Chapman and Hall, London.
8. Freeman, D. H. (1987). Applied Categorical Data Analysis. Marcel Dekker, New York.
9. Goodman, L. A. (1979). Multiplicative models for square contingency tables with ordered categories. Biometrika, 66, 413-418.
10. Goodman, L. A. and Kruskal, W. H. (1954). Measures of association for cross classifications. Journal of the American Statistical Association, 49, 732-764.
11. McCullagh, P. (1978). A class of parametric models for the analysis of square contingency tables with ordered categories. Biometrika, 65, 413-418.
12. Miyamoto, N., Niibe, K. and Tomizawa, S. (2005). Decompositions of marginal homogeneity model using cumulative logistic models for square contingency tables with ordered categories. Austrian Journal of Statistics, 34, 361-373.
13. Miyamoto, N., Ohtsuka, W. and Tomizawa, S. (2004). Linear diagonals-parameter symmetry and quasi-symmetry models for cumulative probabilities in square contingency tables with ordered categories. Biometrical Journal, 46, 664-674.
14. Miyamoto, N., Usui, E. and Tomizawa, S. (2005). Generalized total uncertainty measure for two-way contingency table with nominal categories. The Pacific and Asian Journal of Mathematical Sciences, 1, 23-39.
15. Patil, G. P. and Taillie, C. (1982). Diversity as a concept and its measurement. Journal of the American Statistical Association, 77, 548-561.
16. Stuart, A. (1955). A test for homogeneity of the marginal distributions in a two-way classification. Biometrika, 42, 412-416.
17. Tahata, K. and Tomizawa, S. (2006). Decompositions for extended double symmetry models in square contingency tables with ordered categories. Journal of the Japan Statistical Society, 36, 91-106.
18. Tahata, K. and Tomizawa, S. (2008). Orthogonal decomposition of point-symmetry for multi-way tables. Advances in Statistical Analysis, 92, 255-269.
19. Tahata, K., Takazawa, A. and Tomizawa, S. (2008). Collapsed symmetry model and its decomposition for multi-way tables with ordered categories. Journal of the Japan Statistical Society, 38, 325-334.
20. Tahata, K., Yamamoto, K., Nagatani, N. and Tomizawa, S. (2009). A measure of departure from average symmetry for square contingency tables with ordered categories. Austrian Journal of Statistics, 38, 101-108.
21. Tallis, G. M. (1962). The maximum likelihood estimation of correlation from contingency tables. Biometrics, 18, 342-353.
22. Theil, H. (1970). On the estimation of relationships involving qualitative variables. American Journal of Sociology, 76, 103-154.
23. Tomizawa, S. (1984). Three kinds of decompositions for the conditional symmetry model in a square contingency table. Journal of the Japan Statistical Society, 14, 35-42.
24. Tomizawa, S. (1985). Analysis of data in square contingency tables with ordered categories using the conditional symmetry model and its decomposed models. Environmental Health Perspectives, 63, 235-239.
25. Tomizawa, S. (1993). Diagonals-parameter symmetry model for cumulative probabilities in square contingency tables with ordered categories. Biometrics, 49, 883-887.
26. Tomizawa, S. (2009). Analysis of square contingency tables in statistics. American Mathematical Society Translations, 227, 147-174.
27. Tomizawa, S. and Ebi, M. (1998). Generalized proportional reduction in variation measure for multi-way contingency tables. Journal of Statistical Research, 32, 75-84.
28. Tomizawa, S. and Machida, M. (1999). Measure of proportional reduction in variation for multi-way contingency tables with multiple response variables. The Egyptian Statistical Journal, 43, 167-182.
29. Tomizawa, S. and Tahata, K. (2007). The analysis of symmetry and asymmetry: orthogonality of decomposition of symmetry into quasi-symmetry and marginal symmetry for multi-way tables. Journal de la Société Française de Statistique, 148, 3-36.
30. Tomizawa, S. and Yukawa, T. (2003). Proportional reduction in variation measures of departure from cumulative dichotomous independence for square contingency tables with same ordinal classifications. Far East Journal of Theoretical Statistics, 11, 133-165.
31. Tomizawa, S. and Yukawa, T. (2004). Proportional reduction in variation measure for two-way contingency tables with ordered categories. Journal of Statistical Research, 38, 45-59.
32. Tomizawa, S., Miyamoto, N. and Yajima, R. (2002). Proportional reduction in variation measure for nominal-ordinal contingency tables. Calcutta Statistical Association Bulletin, 53, 167-183.
33. Tomizawa, S., Miyamoto, N. and Yamamoto, K. (2006). Decomposition for polynomial cumulative symmetry model in square contingency tables with ordered categories. Metron, 64, 303-314.
34. Tomizawa, S., Seo, T. and Ebi, M. (1997). Generalized proportional reduction in variation measure for two-way contingency tables. Behaviormetrika, 24, 193-201.
35. Yamamoto, K. and Tomizawa, S. (2009). Measure of proportional reduction in variation and measure of agreement for contingency tables with ordered categories. International Journal of Applied Mathematics and Statistics, 14, 3-23.
Table 1. The daily temperatures at Nagasaki City, Japan, in 2001 and 2002; from Tahata et al. (2008).

                                           2002
2001                Below normals (1)  Normals (2)  Above normals (3)  Total
Below normals (1)                  11           18                 30     59
Normals (2)                        38           79                 64    181
Above normals (3)                  19           64                 42    125
Total                              68          161                136    365
Table 2. Merino ewes according to number of lambs born in consecutive years; from Tallis (1962).

Number of lambs     Number of lambs in 1952
in 1953                 0      1      2   Total
0                      58     52      1     111
1                      26     58      3      87
2                       8     12      9      29
Total                  92    122     13     227
Table 3. Unaided distance vision of 7477 women aged 30-39 employed in Royal Ordnance factories in Britain from 1943 to 1946; from Stuart (1955).

                                Left eye grade
Right eye grade   Best (1)  Second (2)  Third (3)  Worst (4)  Total
Best (1)              1520         266        124         66   1976
Second (2)             234        1512        432         78   2256
Third (3)              117         362       1772        205   2456
Worst (4)               36          82        179        492    789
Total                 1907        2222       2507        841   7477
Table 4. Unaided distance vision of 3168 pupils comprising nearly equal number of boys and girls aged 6-12 at elementary schools in Tokyo, Japan, examined in June 1984; from Tomizawa (1985).

                                Left eye grade
Right eye grade   Best (1)  Second (2)  Third (3)  Worst (4)  Total
Best (1)              2470         126         21         10   2627
Second (2)              96         138         33          5    272
Third (3)               10          42         75         15    142
Worst (4)               12           7         16         92    127
Total                 2588         313        145        122   3168
Table 5. Unaided distance vision of 4746 students aged 18 to about 25 including about 10% women in Faculty of Science and Technology, Science University of Tokyo in Japan examined in April 1982; from Tomizawa (1984).

                                Left eye grade
Right eye grade   Best (1)  Second (2)  Third (3)  Worst (4)  Total
Best (1)              1291         130         40         22   1483
Second (2)             149         221        114         23    507
Third (3)               64         124        660        185   1033
Worst (4)               20          25        249       1429   1723
Total                 1524         500       1063       1659   4746
Table 6. Cross-classification of individuals according to socioprofessional status; from Caussinus (1965).

                            Status in 1962
Status in 1954   (1)   (2)   (3)   (4)   (5)   (6)   Total
(1)              187    13    17    11     3     1     232
(2)                4   191     4     9    22     1     231
(3)               22     8   182    20    14     3     249
(4)                6     6    10   323     7     4     356
(5)                1     3     4     2   126    17     153
(6)                0     2     2     5     1   153     163
Total            220   223   219   370   173   179    1384
Table 7. Estimate of measure
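Theorem 4.3. For an initial state

$$\rho^{(M)} = \bigotimes_{i=-\infty}^{\infty} \rho_i^{(M)} \in \bigotimes_{i=-\infty}^{\infty} \mathfrak{S}_i \qquad (M = \mathrm{OOK}, \mathrm{PSK}),$$

one has the following inequalities:

(1) $\tilde{S}\big(\rho^{(\mathrm{PSK})}; \theta_A, \alpha^N_{(\mathrm{PSK})}\big) \ge \tilde{S}\big(\rho^{(\mathrm{OOK})}; \theta_A, \alpha^N_{(\mathrm{OOK})}\big)$,

(2) $\tilde{I}\big(\rho^{(\mathrm{PSK})}; \Lambda^*, \theta_A, \theta_B, \alpha^N_{(\mathrm{PSK})}, \beta^N_{(\mathrm{PSK})}\big) \ge \tilde{I}\big(\rho^{(\mathrm{OOK})}; \Lambda^*, \theta_A, \theta_B, \alpha^N_{(\mathrm{OOK})}, \beta^N_{(\mathrm{OOK})}\big)$.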
Quantum Bio-Informatics IV eds. L. Accardi, W. Freudenberg and M. Ohya © 2011 World Scientific Publishing Co. (pp. 403-412)
A FAIR SAMPLING TEST FOR EKERT PROTOCOL

GUILLAUME ADENIER* and NOBORU WATANABE
Tokyo University of Science, 2641 Yamazaki, Noda, Chiba 278-8510, Japan
* E-mail: [email protected]

ANDREI YU. KHRENNIKOV
Linnaeus University, Vejdes plats 7, SE-351 95 Växjö, Sweden
We propose a local scheme to enhance the security of quantum key distribution in Ekert protocol (E91). Our proposal is a fair sampling test meant to detect an eavesdropping attempt that would use a biased sample to mimic an apparent violation of Bell inequalities. The test is local and non disruptive: it can be unilaterally performed at any time by either Alice or Bob during the production of the key, and together with the Bell inequality test.
Keywords: Ekert protocol, Entangled states, Fair sampling.
1. Introduction
The Ekert protocol [1-3] uses entangled states to guarantee the secrecy of a key distributed to two parties (Alice and Bob) who wish to communicate secretly through a public channel. Identical measurements performed on a maximally entangled state yield perfect correlation, which can be used to produce a shared key. The secrecy of the key can be guaranteed by the violation of Bell inequalities measured for non-identical measurements. An unconditional violation of Bell inequalities would guarantee that no local (hidden) variables exist that an eavesdropper (Eve) could exploit. It would mean unconditional privacy: even if Eve had full control of the detectors and the source, and more advanced theory and technology, the key would still be secure [2].

However, actual implementations of the Ekert protocol presently require the use of photons, because a key distribution protocol is useful only if Alice and Bob can be separated by macroscopic distances [4], which makes photons the only practical solution. The downside is that the type of photons is limited in practice by the pair creation process (parametric down-conversion) and by the optical components that are used (fiber optics, polarizing beamsplitters) to wavelengths at which standard photon counters have a poor detection efficiency [5,6]. It means that optical implementations of the Ekert protocol cannot avoid a rather heavy postselection: Alice and Bob must discard all measurements for which either of them failed to register a click. The trouble is that the validation of a violation of Bell inequalities observed in such a postselection protocol requires an extra assumption: the sample of detected pairs must fairly represent the population of emitted pairs (fair sampling assumption). It has long been known that a breakdown of this assumption allows local hidden-variable models to reproduce exactly the predictions of Quantum Mechanics on the subset of detected pairs [7], unless the detection efficiency is higher than 83% [8-11].

In the context of experiments on the foundations of Quantum Mechanics, the fair sampling assumption is usually considered reasonable, with the idea that Nature is not conspiratory. In Quantum Key Distribution however, Eve is expected to conspire [12], so that Alice and Bob must assume that Eve is actually attempting to bias their sample [a]. Naturally, this issue becomes critical if Eve manufactured the detectors, which means that Alice and Bob should thoroughly check that their detectors are functioning according to specifications [2]. We will argue here that photomultipliers or avalanche photodiodes can in principle be subjected to such a biased-sampling attack, by exploiting the thresholds of these detectors, in particular if they exhibit nonlinearities.
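The 83% efficiency figure quoted in the introduction can be recovered with a short calculation. The following is only a sketch, assuming (this is not stated in the text) that the figure refers to the Clauser-Horne form of the Bell inequality, tested on a maximally entangled state with the same detection efficiency $\eta$ on both sides, so that coincidence terms scale as $\eta^2$ and single counts as $\eta$:

```latex
% Clauser-Horne combination with detection efficiency \eta (assumed symmetric):
% the quantum maximum of the coincidence terms is (1+\sqrt{2})/2, the two singles sum to 1.
\begin{align}
  \eta^2\,\frac{1+\sqrt{2}}{2} - \eta &\le 0
      && \text{(bound satisfied by any local model)} \\
  \eta &> \frac{2}{1+\sqrt{2}} = 2\,(\sqrt{2}-1) \approx 0.828
      && \text{(condition for a violation without fair sampling)}
\end{align}
```

Below this efficiency, the detected sample alone cannot rule out a local model, which is the loophole exploited in the rest of the paper.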
2. A biased-sampling attack on Ekert protocol

The motivation for the possibility of a biased-sampling attack is that avalanche photodiodes and photomultipliers are fundamentally threshold detectors. They are sometimes referred to as such because they cannot distinguish between the absorption of one or of several photons, but it should also be pointed out that the principle of detection itself relies on thresholds, and that these thresholds are essential to the proper functioning of these detectors. At the input, the energy must be higher than the band gap to trigger an avalanche or a photoelectron, while at the output, the current must be higher than a discriminator value to be counted as a click [14]. This combined threshold could be exploited by Eve to obtain an apparent violation of Bell inequalities on the detected sample using only local states instead of entangled states [15].

The idea of the attack would be to replace the genuine source of entangled photons with a source of near-threshold pulses correlated in polarization. These pulses would lead to a probability of generating a click in either output channel that would depend on the characteristics of the pulse and the measurement settings. Consider a near-threshold pulse linearly polarized along $\lambda$ impinging on a polarizer set at a measurement angle $\varphi$. It would have a significantly greater chance to produce a click when the angle difference $\alpha$ between these two variables is $\alpha = |\lambda - \varphi| \approx k\pi/2$ (parallel or orthogonal) than when $\alpha \approx k\pi/2 + \pi/4$ (diagonal). Indeed, in the first case most of the energy of the pulse would go in one specific channel, retaining enough energy in this channel to remain above the threshold, while in the other case the energy of the pulse would be split between both channels, in principle below the threshold. In the context of low efficiency of detection, it would bias the sampling of the pulses with the effect of artificially increasing the visibility of the coincidence counts, thus leading to an apparent violation of Bell inequalities [15].

Figure 1. Standard Ekert protocol. Alice and Bob randomly switch their measurement settings. Pairs associated with identical measurement settings ($\varphi_A = \varphi_B$) are used to produce a correlated key, while those associated with non-identical measurement settings ($\varphi_A \neq \varphi_B$) are used to check the violation of Bell inequality (and the security of the key).

[a] This possibility should not be underestimated. Quantum hacking has already been successfully implemented experimentally with a time-shift attack [13]. The attack essentially introduced a hidden variable in the protocol (the time-shift) to influence the probability for Alice and Bob to get either a 0 or a 1.
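The angle dependence described above can be illustrated with a few lines of code. This is a minimal sketch, not the authors' model: it assumes ideal threshold detectors, a threshold normalized to 1, and a pulse energy of 1.5 chosen arbitrarily between one and two thresholds; the function names are hypothetical.

```python
import math

def channel_energies(pulse_energy, lam, phi):
    """Malus' law: split a pulse polarized along lam at a polarizing beamsplitter set at phi."""
    alpha = lam - phi
    return pulse_energy * math.cos(alpha) ** 2, pulse_energy * math.sin(alpha) ** 2

def clicks(pulse_energy, lam, phi, threshold=1.0):
    """Ideal threshold detectors (assumption): a channel clicks only above the threshold."""
    e_ch1, e_ch0 = channel_energies(pulse_energy, lam, phi)
    return e_ch1 > threshold, e_ch0 > threshold

# Near-threshold pulse: energy 1.5 in units of the threshold (illustrative value).
parallel = clicks(1.5, 0.0, 0.0)             # alpha = 0: all the energy in channel 1
diagonal = clicks(1.5, math.pi / 4, 0.0)     # alpha = pi/4: 0.75 in each channel
orthogonal = clicks(1.5, math.pi / 2, 0.0)   # alpha = pi/2: all the energy in channel 0
```

With these values the parallel and orthogonal pulses each produce exactly one click, while the diagonal pulse produces none, which is the biased sampling exploited by the attack.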
In principle, Eve could even obtain an apparent violation of Bell inequalities reproducing exactly the predictions of Quantum Mechanics on the detected sample, by reproducing the asymmetrical detection pattern of a Larsson-Gisin model [10,11]. For this purpose, Eve would have to send pairs of correlated pulses with energy $E_0 = 2$ and polarization $\lambda$, with $\lambda$ being a random variable uniformly distributed on the interval $[-\pi/2, \pi/2]$. On one side, Eve would have to send pulses to which the detectors react with a very steep rising detection probability at the threshold (ideal threshold), while on the other side, Eve would have to send pulses to which the detectors react with a detection probability rising linearly at the threshold (linear threshold). With the condition $E_0 = 2$, Malus' law is cut off from below precisely at the intersection of the two channels (at $|\varphi - \lambda| = \pi/4$). Consequently, Alice would always record a click in exactly one channel: channel 1 if $|\varphi_A - \lambda| < \pi/4$, channel 0 otherwise; whereas on Bob's side the probability to get a click in channel 1 would vary with $\cos^2(\varphi_B - \lambda)$ when $|\varphi_B - \lambda| < \pi/4$, and the probability to get a click in channel 0 would vary with $\sin^2(\varphi_B - \lambda)$ when $|\varphi_B - \lambda| > \pi/4$. The crucial feature of the resulting detection pattern is that the probability to obtain a click in either channel on Bob's side depends explicitly on $\lambda$: it is maximum for $|\varphi_B - \lambda| = 0$, and decreases down to zero for $|\varphi_B - \lambda| = \pi/4$. The sampling is thus unfair, or biased, and leads to an apparent violation of Bell inequalities on the detected sample [10,11,15].
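The asymmetrical detection pattern described above can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' calculation: the threshold is normalized to 1 (so that $E_0 = 2$ cuts Malus' law at $\pi/4$), the linear threshold on Bob's side is assumed to have unit slope (click probability equal to the energy in excess of the threshold), and all function names are hypothetical. The average over $\lambda$ uses a plain midpoint rule.

```python
import math

E0 = 2.0  # pulse energy in units of the detector threshold (threshold = 1 is an assumption)

def alice_outcome(phi_a, lam):
    """Ideal threshold: exactly one channel clicks; +1 for channel 1, -1 for channel 0."""
    return 1 if E0 * math.cos(lam - phi_a) ** 2 > 1.0 else -1

def bob_click_probs(phi_b, lam):
    """Linear threshold with unit slope (assumed): P(click) = energy above threshold, clipped to [0, 1]."""
    p1 = min(max(E0 * math.cos(lam - phi_b) ** 2 - 1.0, 0.0), 1.0)
    p0 = min(max(E0 * math.sin(lam - phi_b) ** 2 - 1.0, 0.0), 1.0)
    return p1, p0

def correlation(phi_a, phi_b, n=20000):
    """Correlation on the detected sample, with lam uniform on [-pi/2, pi/2]."""
    num = den = 0.0
    for k in range(n):
        lam = -math.pi / 2 + (k + 0.5) * math.pi / n
        a = alice_outcome(phi_a, lam)
        p1, p0 = bob_click_probs(phi_b, lam)
        num += a * (p1 - p0)   # Bob's outcome is +1 in channel 1, -1 in channel 0
        den += p1 + p0         # only pairs detected by Bob are kept
    return num / den

def chsh(n=20000):
    """CHSH combination on the detected sample for the usual four settings."""
    a, a2, b, b2 = 0.0, math.pi / 4, math.pi / 8, 3 * math.pi / 8
    return (correlation(a, b, n) - correlation(a, b2, n)
            + correlation(a2, b, n) + correlation(a2, b2, n))
```

Under these assumptions `correlation(phi_a, phi_b)` follows $\cos 2(\varphi_A - \varphi_B)$ and `chsh()` comes out close to $2\sqrt{2}$, even though every pulse carries a purely local polarization $\lambda$.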
Note that if instead of scanning the full correlation, Alice and Bob are only checking a few points of the correlation predicted by Quantum Mechanics, as is the case in Ekert protocol, Eve would not need to aim at reproducing the full correlation and could therefore implement her attack with a symmetrical design, inducing the same detection pattern for Alice and Bob. However, if each pulse contains at most one photon, the biased sampling described here would be ineffective, because the energy seen at a detector would always be the same regardless of the measurement settings (when this photon does reach a detector). Eve would therefore have to produce near-threshold pulses consisting of several photons of lower frequencies. For her attack to work, however, the probability that a pulse produces a click in either channel must be close to zero for angle differences close to the diagonals, that is, for $\alpha = |\varphi - \lambda| = \pi/4 + k\pi/2$, while it should be non-zero, and significantly greater than the probability of a dark count, for other angle differences, in particular for $\alpha = |\varphi - \lambda| = k\pi/2$.
3. Countermeasures

In order to prevent Eve from using this biased-sampling attack, Alice and Bob can in principle use several countermeasures.

The first countermeasure consists in increasing the detection efficiency to reach 83%. However, this proves difficult with threshold detectors. Decreasing the band gap threshold does increase the efficiency of the detectors, but only at the cost of higher dark count rates. Unless special detectors operating near absolute zero temperature are used, such as Transition-Edge Sensors (which are too cumbersome and slow to be a practical solution for QKD), this can be considered a general rule that applies to any detector and fundamentally limits their efficiencies, so that this desirable solution can be considered unrealistic in a quantum key distribution framework.

The second countermeasure would be to guarantee that Alice's and Bob's detectors are not susceptible to nonlinearities at any frequency. The probability that a photon produces a click in a detector should remain the same regardless of the circumstances; in particular, it should not depend on the number of other photons reaching the detector simultaneously. If this can be guaranteed, then Eve cannot exploit the threshold because each photon sees the threshold independently of the presence of other photons, and is therefore either above the threshold or below, regardless of the instrument settings. In practice, it might be difficult to guarantee that detectors do not exhibit nonlinearities at any frequency, and as far as this question has been studied experimentally, the result is that detectors do exhibit nonlinearities [16].

A third countermeasure would consist in using filters to prevent Eve from using lower frequency photons. This would however come at the expense of a lower overall quantum efficiency: the narrower the bandwidth of the filter, the lower the quantum efficiency. With photon detectors that have a significant amount of dark counts, this would mean an increase of the quantum bit error rate. The imperfections of the filters could also become the target of Eve's attack. If for instance the filter does not provide a complete extinction at a frequency usable for an attack, Eve would only have to send brighter pulses to allow enough of these photons to go through and perform the biased-sampling attack.

In principle, any of the three above countermeasures would be enough to prevent Eve's attack, but they can be very difficult or impractical to implement. This leads us to suggest a fourth possibility, which is to test the fairness of the sampling.
Figure 2. Fair sampling test on Alice's side. The detector $A_1$ is replaced by a polarimeter with two detectors $A_1^+$ and $A_1^-$, whereas the detector $A_0$ is replaced by a polarimeter with two detectors $A_0^+$ and $A_0^-$. Ekert protocol is unaltered by our test if all detectors have the same efficiency $\eta$: the light green area is equivalent to detector $A_1$ with efficiency $\eta$, whereas the light red area is equivalent to detector $A_0$ with efficiency $\eta$.

For this purpose, we propose to analyze the output channels of the polarizing beamsplitters, instead of simply feeding detectors with them. We keep the standard design of Ekert protocol, with two polarizing beamsplitters on each side (Alice and Bob) projecting the incoming pulses on random bases $\varphi_A$ and $\varphi_B$, as depicted in Fig. 1, but we replace each detector by a polarimeter: a polarizing beamsplitter (PBS) followed by a detector at each output. Consider Alice's side (see Fig. 2). We label the additional PBS in channel 1 by its orientation $\theta_{A_1}$, and the one in channel 0 by its orientation $\theta_{A_0}$. A click in any of the two detectors following $\theta_{A_1}$ is counted as a 1, whereas a click in any of the two detectors following $\theta_{A_0}$ is counted as a 0. Bob proceeds similarly with two polarimeters labeled $\theta_{B_1}$ and $\theta_{B_0}$.

From the point of view of Ekert protocol and of a genuine source of entangled photons, nothing is changed. Consider Alice's channel 1. The setup constituted by the PBS $\theta_{A_1}$ and its corresponding detectors can be seen as one big single detector in channel 1, in which the orientation $\theta_{A_1}$ has no influence on the result, provided that the detectors at the two outputs have the same quantum efficiency $\eta$: a photon exiting the PBS oriented along $\varphi_A$ through channel 1 will be detected in either output channel after $\theta_{A_1}$ with a probability $\eta$. Similarly, the polarimeter $\theta_{A_0}$ and its corresponding detectors can be seen as one single detector in channel 0, where the orientation $\theta_{A_0}$ plays no role whatsoever, and the same goes for Bob's setup. As long as all the detectors have the same efficiency $\eta$, each polarimeter can then be considered as one detector with quantum efficiency $\eta$. The production of the key and the verification of the violation of Bell inequalities are thus unaltered by our fair sampling test setup in the case of a genuine source of entangled photons, because the additional measurement settings $\theta_{A_1}$, $\theta_{A_0}$, $\theta_{B_1}$ and $\theta_{B_0}$ controlled by Alice and Bob have no influence on the measured results.

However, they have a strong influence on the result in the case of a biased-sampling attack by Eve. Let us consider the simpler case of an ideal threshold detector. By ideal threshold detector, we mean a detector that produces a click with certainty if and only if the pulse impinging on the detector carries an energy greater than the detector threshold. Eve sends pairs of correlated pulses with energy $E_0$ and polarization $\lambda$, where $\lambda$ is a random variable uniformly distributed on the interval $[0, 2\pi[$. By Malus' law, the energy of the pulse reaching Alice's detector $A_1^+$ is
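$$E_{A_1^+} = E_0 \cos^2(\lambda - \varphi_A)\,\cos^2(\varphi_A - \theta_{A_1}). \qquad (1)$$

Starting from a uniform distribution of the polarization $\lambda$ of pulses on the circle, we write that $|\rho_\lambda\, d\lambda| = |\rho_{A_1^+}(E_{A_1^+})\, dE_{A_1^+}|$, so that the probability to get an energy between $E_{A_1^+}$ and $E_{A_1^+} + dE_{A_1^+}$ in the $A_1^+$ channel is given by

$$\rho_{A_1^+}(E_{A_1^+})\, dE_{A_1^+} = \frac{dE_{A_1^+}}{\pi \sqrt{E_{A_1^+}\,(E_{\max} - E_{A_1^+})}}, \qquad (2)$$

where $E_{\max} = E_0 \cos^2(\varphi_A - \theta_{A_1})$ is the maximum energy reaching the detector (by Malus' law).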
The probability to obtain a click in an ideal threshold detector placed at the transmitted output (+) of polarimeter $A_1$ is then simply the integral of this density distribution over the energy reaching the detector, from the threshold to $E_{\max}$ (Eqs. (3) and (4)). Similarly, the probability to obtain a click in an ideal threshold detector positioned at the reflected output (-) of polarimeter $A_1$ is given by the same integral with the maximum energy $E_0 \sin^2(\varphi_A - \theta_{A_1})$ (Eq. (5)).
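The integral described above can be carried out in closed form. The following is a sketch rather than the authors' expression: it assumes the arcsine-type density $1/(\pi\sqrt{E\,(E_{\max}-E)})$ on $0 < E < E_{\max}$, which is what a uniformly distributed polarization gives under Malus' law, and it writes $E_{\mathrm{th}}$ for the detector threshold, a symbol introduced here only for illustration.

```latex
\begin{align}
  P_{\mathrm{click}}
    &= \int_{E_{\mathrm{th}}}^{E_{\max}} \frac{dE}{\pi\sqrt{E\,(E_{\max}-E)}}
    && \text{(substitute } E = E_{\max}\cos^2 u\text{)} \\
    &= \frac{2}{\pi}\,\arccos\sqrt{\frac{E_{\mathrm{th}}}{E_{\max}}}
    && \text{for } E_{\max} \ge E_{\mathrm{th}},\ \text{and } 0 \text{ otherwise.}
\end{align}
% Transmitted output: E_max = E_0 cos^2(phi_A - theta_A1)
% Reflected output:   E_max = E_0 sin^2(phi_A - theta_A1)
```

As a check, with $E_0 = 2E_{\mathrm{th}}$ the transmitted output gives $P_{\mathrm{click}} = \tfrac{2}{\pi}\arccos(1/\sqrt{2}) = 1/2$ at $\varphi_A - \theta_{A_1} = 0$ and $P_{\mathrm{click}} = 0$ at $\varphi_A - \theta_{A_1} = \pi/4$, in line with the maximum and the zero stated in the text.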
We can also consider a less ideal threshold detector, one more likely to resemble the characteristics of a real detector. In order to keep the calculations of the integrals simple enough, we considered the case of a threshold detector with linear rise, that is, a threshold detector for which the probability to generate a click increases linearly starting from the threshold, possibly with a saturation value after which increasing the impinging energy no longer increases the probability to generate a click. In these cases, the analytical results are more complicated since the probability to get a click for an energy $E + dE$ is not always equal to 1, but the principle of calculation remains the same: integrate the product of the probability density distribution by the probability of obtaining a click for a given energy. The analytical results are qualitatively similar to those of ideal threshold detectors. The results in the case $E_0 = 2$ (which leads to a violation of Bell inequalities exactly reproducing the predictions of Quantum Mechanics) are displayed in Fig. 3: the probability to get a click in the polarimeter oriented along $\theta_{A_1}$ depends on $|\varphi_A - \theta_{A_1}|$. It is maximum for $|\varphi_A - \theta_{A_1}| = 0 + k\pi/2$, and reaches zero for $|\varphi_A - \theta_{A_1}| = \pi/4 + k\pi/2$. Similar results would be obtained for Alice's $\theta_{A_0}$ polarimeter, and for Bob's polarimeters.

Figure 3. Analytical results in case of a biased-sampling attack on threshold detectors in Alice's $A_1$ channel, with $E_0 = 2$. The blue plots represent the probability to get a click in detector $A_1^+$, whereas the purple plots represent the probability to get a click in detector $A_1^-$. The probability to get a click in polarimeter $\theta_{A_1}$ thus depends on $|\varphi_A - \theta_{A_1}|$. By contrast, in case of a genuine source of entangled photons, Quantum Mechanics predicts independence.

This fair sampling test can be implemented very simply on Alice's side by fixing $\theta_{A_1} = \theta_{A_0} = 0$. The random switching in Ekert protocol (Fig. 1 and Fig. 2) ensures that the points at 0 and $\pi/4$ are both scanned automatically. Any significant difference in the number of single counts recorded when $\varphi_A = 0$ and $\varphi_A = \pi/4$ would betray Eve's attempt to bias the sample through a biased-sampling attack on the threshold detectors. Similarly, Bob would choose $\theta_{B_1} = \theta_{B_0} = \pi/8$, and compare the number of singles when $\varphi_B = -\pi/8$ and $\varphi_B = \pi/8$.
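The comparison of single counts at the two settings can be sketched numerically for the ideal-threshold case. The code below is an illustration under assumptions that are not in the text: the threshold is normalized to 1, the closed form $(2/\pi)\arccos\sqrt{E_{\mathrm{th}}/E_{\max}}$ is used for the click probability, and the function names are hypothetical. A brute-force count over polarizations is included as a cross-check of the closed form.

```python
import math

def click_prob_closed(e_max, e_th=1.0):
    """Ideal threshold: P(e_max * cos^2(lam) > e_th) for lam uniform on the circle."""
    if e_max <= e_th:
        return 0.0
    return (2.0 / math.pi) * math.acos(math.sqrt(e_th / e_max))

def click_prob_numeric(e_max, e_th=1.0, n=100000):
    """Same probability, counted directly over a uniform grid of polarizations."""
    hits = 0
    for k in range(n):
        lam = (k + 0.5) * 2.0 * math.pi / n
        if e_max * math.cos(lam) ** 2 > e_th:
            hits += 1
    return hits / n

def polarimeter_singles(delta, e0=2.0, e_th=1.0):
    """P(at least one click) behind a polarimeter, with delta = phi_A - theta_A1.

    Both outputs are driven by the same polarization, so at least one detector
    clicks exactly when the brighter output exceeds the threshold.
    """
    p_plus = click_prob_closed(e0 * math.cos(delta) ** 2, e_th)   # transmitted (+)
    p_minus = click_prob_closed(e0 * math.sin(delta) ** 2, e_th)  # reflected (-)
    return max(p_plus, p_minus)

at_zero = polarimeter_singles(0.0)              # phi_A = 0 with theta_A1 = 0
at_diagonal = polarimeter_singles(math.pi / 4)  # phi_A = pi/4 with theta_A1 = 0
```

Under these assumptions the singles probability is 1/2 at zero angle difference and essentially zero on the diagonal, which is the dip that Alice would look for; a genuine single-photon source would give the same value at both settings.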
4. Conclusion

This fair sampling test can be implemented during the production of the key and together with the violation of Bell inequality check, so that it seems hard to fool it without reducing the visibility of the violation. For instance, increasing the energy of the pulses with respect to the threshold would tend to reduce the dip in the fair sampling test, but it would reduce the visibility of the correlation (weaker violation of Bell inequalities) at the same time, and it would also give rise to double counts. The combination of a Bell inequality test with a monitoring of the double counts and our local fair sampling test therefore constitutes a solid scheme against eavesdropping on an E91 protocol using a biased-sampling attack. In principle, the same idea could be implemented in other QKD protocols, by replacing passive detectors in each channel by a device with the same efficiency that would analyze further whichever degree of freedom is used to encode the key, instead of simply feeding detectors with it [b].
Acknowledgments We are grateful to Hoi-Kwong Lo, Jan-Ake Larsson, Takashi Matsuoka and Masanori Ohya for useful discussions on Quantum Key Distribution.
References
1. A. K. Ekert, Phys. Rev. Lett. 67, 661 (Aug 1991).
2. N. Gisin, G. Ribordy, W. Tittel and H. Zbinden, Rev. Mod. Phys. 74, 145 (Mar 2002).
3. G. Jaeger, Quantum Information (Springer, New York, 2007).
4. V. Scarani, H. Bechmann-Pasquinucci, N. J. Cerf, M. Dusek, N. Lütkenhaus and M. Peev, Rev. Mod. Phys. 81, 1301 (Sep 2009).
5. H.-K. Lo and Y. Zhao, Quantum cryptography (2008).
6. D. Stucki, G. Ribordy, A. Stefanov, H. Zbinden, J. G. Rarity and T. Wall, Journal of Modern Optics 48, p. 1967 (2001).
7. P. M. Pearle, Phys. Rev. D 2, 1418 (Oct 1970).
8. A. Garg and N. D. Mermin, Phys. Rev. D 35, 3831 (Jun 1987).
9. P. H. Eberhard, Phys. Rev. A 47, R747 (Feb 1993).
10. J.-Å. Larsson, Physics Letters A 256, 245 (1999).
11. N. Gisin and B. Gisin, Physics Letters A 260, 323 (1999).
12. A. Ekert, Less reality, more security (September 2009).
13. Y. Zhao, C.-H. F. Fung, B. Qi, C. Chen and H.-K. Lo, Phys. Rev. A 78, p. 042333 (Oct 2008).
14. G. F. Knoll, Radiation Detection and Measurement (Wiley & Sons, 1999).
15. G. Adenier, Violation of Bell inequalities as a violation of fair sampling in threshold detectors, in Foundations of Probability and Physics 5, 1101 (AIP, 2009).
16. K. J. Resch, J. S. Lundeen and A. M. Steinberg, Phys. Rev. A 63, p. 020102 (Jan 2001).
17. B. Qi, L. Qian and H.-K. Lo, A brief introduction of quantum cryptography for engineers (2010).
[b] The use of four detectors on each side can also serve other purposes, like shielding Alice and Bob from a time-shift attack [17].
Quantum Bio-Informatics IV eds. L. Accardi, W. Freudenberg and M. Ohya © 2011 World Scientific Publishing Co. (pp. 413-426)
BROWNIAN DYNAMICS SIMULATION OF MACROMOLECULE DIFFUSION IN A PROTOCELL

TADASHI ANDO
Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318-5304, USA

JEFFREY SKOLNICK
Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318-5304, USA

The interiors of all living cells are highly crowded with macromolecules, which makes the thermodynamics and kinetics of biological reactions differ considerably between in vivo and in vitro conditions. For example, the diffusion of green fluorescent protein (GFP) in E. coli is ~10-fold slower than in dilute conditions. In this study, we performed Brownian dynamics (BD) simulations of rigid macromolecules in a crowded environment mimicking the cytosol of E. coli to study the motions of macromolecules. The simulation systems contained 35 70S ribosomes, 750 glycolytic enzymes, 75 GFPs, and 392 tRNAs in a 100 nm x 100 nm x 100 nm simulation box, where the macromolecules were represented by rigid objects using models with one bead per amino acid or four beads per nucleotide. Diffusion tensors of these molecules in dilute solutions were estimated by using a hydrodynamic theory to take into account the diffusion anisotropy of arbitrarily shaped objects in the BD simulations. BD simulations of the system where each macromolecule is represented by its Stokes radius were also performed for comparison. Excluded volume effects greatly reduce the mobility of molecules in crowded environments for both molecular-shaped and equivalent sphere systems. Additionally, there were no significant differences in the reduction of diffusivity over the entire range of molecular size between the two systems. However, the reduction in diffusion of GFP in these systems was still 4-5 times larger than for the in vivo experiment. We will discuss other plausible factors that might cause the large reduction in diffusion in vivo.
1. Introduction

One of the most characteristic features of the interiors of all living cells is the extremely high total concentration of biological macromolecules. Typically, 20-30% of the total volume of cytoplasm is occupied by a variety of proteins, nucleic acids and other macromolecules. Under these conditions, the distance between neighboring proteins is comparable to the protein size itself, though the molar concentration of each protein ranges from nM to μM. In this crowded, heterogeneous environment, biomolecules work to maintain living systems, and they have evolved over several billion years. Therefore, modeling the crowded cellular environment is not only an important first step toward whole cell simulation but also a crucial factor in understanding the nature of living systems.

In this study, we performed Brownian dynamics (BD) simulations of rigid macromolecules in a crowded environment mimicking the cytosol of E. coli to study the motions of macromolecules. BD simulations using an equivalent sphere system, where macromolecules were represented by their Stokes radius, were also performed. It has been reported that the diffusion of green fluorescent protein (GFP) in E. coli is ~10-fold slower than in dilute conditions (1, 2). Our aim is to investigate the mechanism(s) that cause this large reduction in diffusion in vivo.

2. Methods
2.1. Estimation of diffusion tensor of a macromolecule from atomic structure

To account for the diffusion anisotropy of macromolecules in our simulation, the diffusion tensors of macromolecules were calculated by using the rigid-particle formalism method (3-5). Here, we will describe this approach briefly. The diffusion of an arbitrarily shaped object undergoing Brownian motion is expressed by a 6 x 6 diffusion tensor, $D$, which is related to a frictional or resistance tensor, $\Xi$, through the generalized Einstein relationship, $D = k_B T\,\Xi^{-1}$. Both $D$ and $\Xi$ can be partitioned into 3 x 3 sub-matrices, which correspond to translation (tt), rotation (rr), and translation-rotation coupling (tr and rt) tensors:

$$ D = \begin{pmatrix} D_{tt} & D_{rt} \\ D_{tr} & D_{rr} \end{pmatrix} = k_B T \begin{pmatrix} \Xi_{tt} & \Xi_{rt} \\ \Xi_{tr} & \Xi_{rr} \end{pmatrix}^{-1}, \tag{1} $$

where

$$ D_{rt} = D_{tr}^{T}, \qquad \Xi_{rt} = \Xi_{tr}^{T}. \tag{2} $$

Here, the superscript T indicates transposition. Translational and rotational diffusion coefficients in a dilute solution are given by

$$ D_0^{t} = \tfrac{1}{3}\,\mathrm{Tr}(D_{tt}), \tag{3} $$

$$ D_0^{r} = \tfrac{1}{3}\,\mathrm{Tr}(D_{rr}), \tag{4} $$

where Tr is the trace of the tensor.
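The components of $\Xi$ can be obtained by the following procedure. From the Cartesian coordinates of the object consisting of $N$ beads with the same radius, $a$, the 3 x 3 hydrodynamic interaction tensors between beads $i$ and $j$, $T_{ij}$ ($i, j = 1, \ldots, N$), are calculated using the expression formulated by Rotne and Prager (6) and Yamakawa (7), the so-called RPY tensor,

$$ T_{ij} = \frac{1}{8\pi\eta r_{ij}} \left[ \left( I + \frac{\mathbf{r}_{ij}\mathbf{r}_{ij}}{r_{ij}^{2}} \right) + \frac{2a^{2}}{r_{ij}^{2}} \left( \frac{I}{3} - \frac{\mathbf{r}_{ij}\mathbf{r}_{ij}}{r_{ij}^{2}} \right) \right], \qquad r_{ij} \ge 2a. \tag{5} $$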
Here, $\eta$ is the viscosity of the solvent and $\mathbf{r}_{ij}$ is the distance vector between beads $i$ and $j$. It is important to note that the bead radius is the only parameter to be optimized to reproduce hydrodynamic properties in dilute conditions. In what follows, we ignore intermolecular hydrodynamic interactions. Now, consider a 3N x 3N supermatrix, $B$, consisting of N x N blocks $B_{ij}$ at an arbitrary origin O,

$$ B = \begin{pmatrix} B_{11} & \cdots & B_{1N} \\ \vdots & \ddots & \vdots \\ B_{N1} & \cdots & B_{NN} \end{pmatrix}, \qquad B_{ij} = \delta_{ij}\,\frac{1}{6\pi\eta a}\,I + (1 - \delta_{ij})\,T_{ij}. \tag{6} $$
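Here, $\delta_{ij}$ is the Kronecker delta function. This supermatrix is then inverted to obtain a 3N x 3N supermatrix, $C$,

$$ C = B^{-1} = \begin{pmatrix} C_{11} & \cdots & C_{1N} \\ \vdots & \ddots & \vdots \\ C_{N1} & \cdots & C_{NN} \end{pmatrix}, \tag{7} $$

in which each $C_{ij}$ block is a 3 x 3 matrix. Now, the elements of $\Xi$ can be written as

$$ \Xi_{tt} = \sum_{i}\sum_{j} C_{ij}, \qquad \Xi_{tr} = \sum_{i}\sum_{j} U_{i} \cdot C_{ij}, \qquad \Xi_{rr} = -\sum_{i}\sum_{j} U_{i} \cdot C_{ij} \cdot U_{j}, \tag{8} $$

where

$$ U_{i} = \begin{pmatrix} 0 & -z_{i} & y_{i} \\ z_{i} & 0 & -x_{i} \\ -y_{i} & x_{i} & 0 \end{pmatrix}. \tag{9} $$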
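The bead-model procedure of this subsection can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, only the non-overlapping form of the RPY tensor (bead separation of at least 2a) is used, and the 6 x 6 friction tensor is assembled on the assumption that its upper-right block is the transpose of the tr block.

```python
import numpy as np

def cross_matrix(r):
    """Antisymmetric matrix U such that U @ v == np.cross(r, v)."""
    x, y, z = r
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def rpy_tensor(rij, a, eta):
    """RPY tensor for two equal beads; assumes |rij| >= 2a (no overlap)."""
    r = np.linalg.norm(rij)
    rr = np.outer(rij, rij) / r**2
    eye = np.eye(3)
    return ((eye + rr) + (2.0 * a**2 / r**2) * (eye / 3.0 - rr)) / (8.0 * np.pi * eta * r)

def friction_blocks(coords, a, eta):
    """Return (Xi_tt, Xi_tr, Xi_rr) for N beads of radius a at the given origin."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    B = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(n):
            if i == j:
                block = np.eye(3) / (6.0 * np.pi * eta * a)
            else:
                block = rpy_tensor(coords[i] - coords[j], a, eta)
            B[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
    C = np.linalg.inv(B)  # supermatrix of 3 x 3 blocks C_ij
    xi_tt = np.zeros((3, 3))
    xi_tr = np.zeros((3, 3))
    xi_rr = np.zeros((3, 3))
    for i in range(n):
        Ui = cross_matrix(coords[i])
        for j in range(n):
            Uj = cross_matrix(coords[j])
            Cij = C[3 * i:3 * i + 3, 3 * j:3 * j + 3]
            xi_tt += Cij
            xi_tr += Ui @ Cij
            xi_rr -= Ui @ Cij @ Uj
    return xi_tt, xi_tr, xi_rr

def diffusion_tensor(coords, a, eta, kT):
    """6 x 6 diffusion tensor D = kT * Xi^{-1} (generalized Einstein relation)."""
    xi_tt, xi_tr, xi_rr = friction_blocks(coords, a, eta)
    xi = np.block([[xi_tt, xi_tr.T], [xi_tr, xi_rr]])  # assumed block layout
    return kT * np.linalg.inv(xi)
```

For a single bead the translational block reduces to the Stokes friction 6πηa, which is a convenient sanity check. The 6 x 6 inversion needs at least a few non-collinear beads, since a single bead at the origin has a vanishing rotational block in this model.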
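Here, $x_i$, $y_i$, and $z_i$ are the components of the position vector of bead $i$ at origin O. So far the choice of the origin of the coordinates has been left arbitrary. However, the diffusion tensor, $D$, especially the translational and translation-rotation coupling tensors, depends on the origin. At a certain origin, the so-called center of diffusion Q, the translational diffusion coefficient reaches a minimum. The position of Q with respect to the arbitrary origin O is calculated with the diffusion tensor obtained at O, $D_O$, as

$$ \mathbf{r}_{OQ} = \begin{pmatrix} x_{OQ} \\ y_{OQ} \\ z_{OQ} \end{pmatrix} = \begin{pmatrix} D_{rr,O}^{yy} + D_{rr,O}^{zz} & -D_{rr,O}^{xy} & -D_{rr,O}^{xz} \\ -D_{rr,O}^{xy} & D_{rr,O}^{xx} + D_{rr,O}^{zz} & -D_{rr,O}^{yz} \\ -D_{rr,O}^{xz} & -D_{rr,O}^{yz} & D_{rr,O}^{xx} + D_{rr,O}^{yy} \end{pmatrix}^{-1} \begin{pmatrix} D_{tr,O}^{yz} - D_{tr,O}^{zy} \\ D_{tr,O}^{zx} - D_{tr,O}^{xz} \\ D_{tr,O}^{xy} - D_{tr,O}^{yx} \end{pmatrix}. \tag{10} $$

The components of $D$ at the center of diffusion Q are calculated by

$$ \begin{aligned} D_{tt,Q} &= D_{tt,O} - U_{OQ} \cdot D_{rr,O} \cdot U_{OQ} + D_{rt,O} \cdot U_{OQ} - U_{OQ} \cdot D_{tr,O}, \\ D_{rt,Q} &= D_{rt,O} - U_{OQ} \cdot D_{rr,O}, \\ D_{rr,Q} &= D_{rr,O}, \end{aligned} \tag{11} $$

where

$$ U_{OQ} = \begin{pmatrix} 0 & -z_{OQ} & y_{OQ} \\ z_{OQ} & 0 & -x_{OQ} \\ -y_{OQ} & x_{OQ} & 0 \end{pmatrix}. \tag{12} $$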
A volume correction term for rotational and intrinsic viscosity estimation is applied in Eq. 8 in some studies (3, 4). However, significant deviations of calculated diffusion properties from experimental values were not observed even without the correction.
2.2. Brownian dynamics for arbitrarily shaped objects

BD is one of the most important simulation approaches to investigate the Brownian motion of arbitrarily shaped objects, in which solvent molecules are treated implicitly and the influence of solvent on solute particles is incorporated through frictional and stochastic forces (8). In the high-friction limit, where it is assumed that momentum relaxation is much faster than position relaxation, and when we treat the diffusion tensor as a constant, a BD propagation scheme for an arbitrarily shaped object can be written as (9)

$$ X_i = X_i^{0} + \frac{\Delta t}{k_B T}\, D_i \cdot F_i^{P} + g_i(\Delta t), \tag{13} $$

where $\Delta t$ is the time step and $X_i$ is the vector describing the position of the center of diffusion and orientation of the i-th object,

$$ X_i = (r_1, r_2, r_3, \varphi_1, \varphi_2, \varphi_3)^{T}. \tag{14} $$

Here, $r_1$, $r_2$, and $r_3$ are the position of the center of diffusion, and $\varphi_1$, $\varphi_2$, $\varphi_3$ describe its orientation. $F^{P}$ is a generalized system force having two components, the force acting on the center of diffusion ($\mathbf{f}$) and the torque ($\boldsymbol{\tau}$):

$$ F^{P} = (\mathbf{f}, \boldsymbol{\tau})^{T}. \tag{15} $$

$g(\Delta t)$ is a 6 x 1 random displacement vector during time step $\Delta t$ due to the Brownian noise, which satisfies the following relations:

$$ \langle g(\Delta t) \rangle = 0, \qquad \langle g(\Delta t)\, g(\Delta t)^{T} \rangle = 2 D_i \Delta t. \tag{16} $$

Here $D_i$ is the 6 x 6 diffusion tensor of object $i$ at the center of diffusion. Once this diffusion tensor is calculated as described above, we can compute the random displacement vector using a Cholesky decomposition technique (9). The Cholesky decomposition of the diffusion tensor $D$ is determined as

$$ D = S \cdot S^{T}, \tag{17} $$

where $S$ is a lower triangular matrix. The desired vector $g(\Delta t)$ is then obtained by the following:

$$ g(\Delta t) = \sqrt{2 \Delta t}\; S \cdot Z, \tag{18} $$

where $Z$ is a 6 x 1 vector, which has elements chosen from a Gaussian distribution so that

$$ \langle Z \rangle = 0, \qquad \langle Z Z^{T} \rangle = I. \tag{19} $$

In BD simulations, quaternions, $q = (q_0, q_1, q_2, q_3)$, were used for handling rotations of rigid objects (8). Diffusion tensors of objects were evaluated in body-fixed frames only once at the start of the simulation. The force and torque on each object calculated in the laboratory or space-fixed frame ($\mathbf{f}^{s}$ or $\boldsymbol{\tau}^{s}$) were converted to their body-fixed frame ($\mathbf{f}^{b}$ or $\boldsymbol{\tau}^{b}$) using the rotation matrix $Q$ obtained with quaternions,

$$ \mathbf{f}^{b} = Q \cdot \mathbf{f}^{s}, \qquad \boldsymbol{\tau}^{b} = Q \cdot \boldsymbol{\tau}^{s}. \tag{20} $$
At each step, quaternions were rescaled using Lagrange's method of undetermined multipliers to satisfy $q^2 = 1$ (10).
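The propagation scheme of this subsection can be illustrated with a short NumPy sketch. It is a simplified illustration under stated assumptions, not the authors' code: `bd_step` is a hypothetical name, the random displacement is taken as sqrt(2Δt) times the Cholesky factor of D applied to a standard normal vector (so that its covariance is 2DΔt), and the six generalized coordinates are updated as a plain vector, whereas the text above handles orientations with quaternions.

```python
import numpy as np

def bd_step(X, D, F, dt, kT, rng):
    """One Brownian dynamics step for a rigid body with constant 6 x 6 tensor D.

    X : generalized coordinates (3 positions, 3 orientation angles)
    F : generalized force (3 force components, 3 torque components)
    """
    S = np.linalg.cholesky(D)            # lower triangular factor, D = S @ S.T
    Z = rng.standard_normal(6)           # zero mean, unit covariance
    g = np.sqrt(2.0 * dt) * (S @ Z)      # random displacement with covariance 2*D*dt
    return X + (dt / kT) * (D @ F) + g   # deterministic drift plus noise
```

Because D is constant in the body-fixed frame, the Cholesky factor could be computed once per object and reused; it is recomputed here only to keep the sketch self-contained.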
2.3. Potential function

In this study, we considered only repulsive interactions between intermolecular particles in BD simulations using a soft-sphere potential described by

$$ V_{ss} = \begin{cases} k_{ss}\,(r_{ij} - r_m)^{2}, & r_{ij} < r_m, \\ 0, & r_{ij} \ge r_m, \end{cases} \tag{21} $$

where $r_{ij}$ is the distance between particles $i$ and $j$, and $k_{ss}$ is a force constant. $r_m$ is $a_i + a_j + \Delta$, in which $a_i$ and $a_j$ are the radii of particles $i$ and $j$ and $\Delta$ is an arbitrary parameter representing the buffer distance between particles. In this study, $\Delta$ of 2 Å and $k_{ss}$ of $5 k_B T / \Delta^2$ were used, which means $V_{ss} = 5 k_B T$ at the distance $r_{ij} = a_i + a_j$.
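As a quick check of the parameter choice, the following sketch evaluates a purely repulsive soft-sphere energy. It assumes the harmonic form V_ss = k_ss (r_ij - r_m)^2 for r_ij < r_m and zero otherwise, which is the form consistent with the stated value of 5 k_BT at contact; the function name and the use of k_BT as the energy unit are illustrative choices.

```python
def soft_sphere_energy(r_ij, a_i, a_j, delta=2.0, kT=1.0):
    """Repulsive soft-sphere energy (assumed harmonic form), lengths in angstroms."""
    k_ss = 5.0 * kT / delta**2      # force constant quoted in the text
    r_m = a_i + a_j + delta         # cutoff: sum of radii plus buffer distance
    if r_ij >= r_m:
        return 0.0                  # no interaction beyond the cutoff
    return k_ss * (r_ij - r_m) ** 2

# At contact (r_ij = a_i + a_j) the energy is 5 kT
print(soft_sphere_energy(30.0, 10.0, 20.0))
```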
2.4. Simulation conditions and analysis
All simulations were performed at 298 K with periodic boundary conditions. For all simulation systems, ten independent simulations were run with different initial configurations. Simulations of 35 μs were performed with a time step of 0.5 ps. Configurations of the systems were sampled every 1 ns. Trajectories for the first 5 μs were discarded for analysis. The translational diffusion coefficient of a particle in three dimensions is estimated by

$$ \left\langle [\mathbf{r}(t + \tau) - \mathbf{r}(t)]^{2} \right\rangle = 6 D \tau, \tag{22} $$

where $\mathbf{r}(t)$ is the position of the particle at time $t$ and $\tau$ is the time interval. $\langle\,\rangle$ indicates the ensemble average over the same particle type and time $t$.
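The estimator in Eq. 22 can be sketched for a single sampled trajectory as follows. The function names are hypothetical, and the sketch averages only over time origins of one particle, whereas the analysis described above also averages over all particles of the same type.

```python
def mean_squared_displacement(positions, lag):
    """Average of |r(t + lag) - r(t)|^2 over all time origins t."""
    n_origins = len(positions) - lag
    total = 0.0
    for t in range(n_origins):
        total += sum((b - a) ** 2 for a, b in zip(positions[t], positions[t + lag]))
    return total / n_origins

def diffusion_coefficient(positions, lag, dt):
    """Translational diffusion coefficient from MSD = 6 * D * tau in three dimensions."""
    tau = lag * dt                  # time interval corresponding to the lag
    return mean_squared_displacement(positions, lag) / (6.0 * tau)
```

In practice the lag would be chosen within the range where the mean squared displacement grows linearly with time.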
3. Results
3.1. Estimation of diffusion tensor of a macromolecule from atomic structure

[Figure: plot with two series, Translational and Rotational.]
$\sum_{j} p_{ij} = p_i$, $\sum_{i} p_{ij} = \bar{p}_j$. So the dynamics $\Lambda^*$ describing the change of sequence from $x^n$ to $x^{n+1}$ is given by a certain mapping called a channel sending the probability distribution $p$ to $\bar{p} \equiv \Lambda^* p$. It is difficult to know the details of this dynamics in the course of sequence changes. The ECD can be used to measure the complexity without knowing the exact dynamics [7], which is one of the aspects due to the sequence change. The ECD for the amino acid sequences is given by the following formula:

$$ \mathrm{ECD}(x^{n}, x^{n+1}) \equiv \sum_{i} p_i\, S(\Lambda^* \delta_i), $$

where $S(\cdot)$ is the Shannon entropy and

$$ p = \sum_{i} p_i \delta_i, \qquad \delta_i(j) = \begin{cases} 1 & (i = j) \\ 0 & (i \ne j). \end{cases} $$

Note that $\mathrm{ECD}(x^{n}, x^{n+1})$ is written as $\mathrm{ECD}(p, \Lambda^*)$ to indicate $p$ and $\Lambda^*$ explicitly. This chaos degree was originally introduced to measure how much chaos is produced by the dynamics $\Lambda^*$ [10, 11]. Therefore it is considered that: (1) $\Lambda^*$ produces chaos iff ECD > 0; (2) $\Lambda^*$ does not produce chaos iff ECD = 0. Moreover, the chaos degree $\mathrm{ECD}(x^{n}, x^{n+1})$ provides a certain difference between $x^n$ and $x^{n+1}$ through a change from $x^n$ to $x^{n+1}$, so that the chaos degree characterizes the dynamics changing $x^n$ to $x^{n+1}$. We calculated the ECD of the dynamics leading to sequence changes of the V3 region, using sequences obtained from patients infected with HIV-1 at several points in time after infection or seroconversion.
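A minimal sketch of an ECD calculation is given below. It rests on assumptions that the text does not spell out: the joint distribution p_ij is estimated by counting symbol pairs at matching positions of two aligned sequences of equal length, and the channel output for the point mass at symbol i is taken to be the conditional distribution p_ij / p_i, so that the ECD equals the negative sum of p_ij log(p_ij / p_i). The function name is hypothetical and natural logarithms are used.

```python
from collections import Counter
from math import log

def entropic_chaos_degree(seq_before, seq_after):
    """ECD between two aligned sequences, sum_i p_i * S(conditional distribution given i)."""
    if len(seq_before) != len(seq_after):
        raise ValueError("sequences must be aligned to the same length")
    n = len(seq_before)
    joint = Counter(zip(seq_before, seq_after))   # counts of pairs (i, j)
    marginal = Counter(seq_before)                # counts of symbol i in the first sequence
    ecd = 0.0
    for (i, _j), count in joint.items():
        p_ij = count / n
        p_i = marginal[i] / n
        ecd -= p_ij * log(p_ij / p_i)             # contribution to sum_i p_i * S(.|i)
    return ecd
```

With this definition a deterministic substitution pattern (each symbol always mapped to the same symbol) gives ECD = 0, while a symbol that changes to two different symbols with equal frequency contributes log 2.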
2.3. Longitudinal Sequence Data

We obtained envelope V3 sequence data from several longitudinal studies of HIV-1 infection [12-16]. Some patients had progressed to AIDS and died of AIDS-related complications, while others remained asymptomatic during the period of follow-up. This paper presents the results for four patients as representatives of many patients.

3. Results and Discussion

The values of $D_C$ for various codes $C$ and those of the ECD for the V3 sequences are shown in Figure 1 for Patient A and Patient B, who remained asymptomatic during the follow-up period, and in Figure 2 for Patient C and Patient D, who progressed to AIDS and died of AIDS-related complications during the follow-up period.
[Figure 1, Patient A. Upper panel: $D_C$ versus months post seroconversion (0, 13, 22, 33, 46, 59). Lower panel: ECD for the intervals (0, 13), (13, 22), (22, 33), (33, 46), (46, 59).]
[Figure 1, Patient B. Upper panel: $D_C$ versus months post seroconversion (3, 29, 42, 58, 70, 100). Lower panel: ECD for the intervals (3, 29), (29, 42), (42, 58), (58, 70), (70, 100).]

Figure 1. The value of $D_C$ and the value of the ECD for the V3 sequences obtained at each point in time from two patients (Patient A and Patient B) who have been asymptomatic during the follow-up period.

[Figure legend table with columns Codes, Error-correcting Capacity, and Generator Polynomial. Codes listed: (15, 10)-cyclic code, (21, 14)-cyclic code, self-orthogonal code (constraint length 69), and self-orthogonal code (constraint length 120); capacities listed include 2 (random error) and 3 (random error).]
The values of $D_C$ for the V3 region obtained from asymptomatic patients were low for the artificial codes represented by dark blue, pink and light blue at all points in time. In addition, all codes had a constant value of $D_C$. Although Patient B is a long-term non-progressor, CD4+ T-lymphocyte counts gradually decreased from around 100 months post-HIV-1 seroconversion. We observed that the code structure of the V3 region showed a change prior to the decrease of CD4+ T-lymphocyte counts. The ECD of patients who were not diagnosed with AIDS during follow-up, like Patient A, maintained a stable and low value. For Patient B, the value of the ECD for the V3 sequences obtained at 70 months and 100 months post-HIV-1 seroconversion was larger than that at other points in time.

[Figure 2, Patient C. Upper panel: $D_C$ versus months post seroconversion. Lower panel: ECD for the intervals (3, 9), (9, 21), (21, 34), (34, 49), (49, 68), (68, 81).]

[Figure 2, Patient D. Upper panel: $D_C$ versus months post seroconversion. Lower panel: ECD for the intervals (3, 14), (14, 24), (24, 34), (34, 45), (45, 61), (61, 77).]
0 for AF interaction), and $K_m$ the exchange integral for the exchange interaction between the spin of a dopant hole $\mathbf{s}_{im}$ and the $d_{x^2-y^2}$ localized spin $\mathbf{S}_i$ in the i-th CuO6 octahedron (i-th CuO5 pyramid). The values of the parameters in Eq. (1) for the case of LSCO are: $J = 0.1$, $K_{a^*_{1g}} = -2.0$, $K_{b_{1g}} = 4.0$, $t_{a^*_{1g} a^*_{1g}} = 0.2$, $t_{b_{1g} b_{1g}} = 0.4$, $t_{a^*_{1g} b_{1g}} = \sqrt{t_{a^*_{1g} a^*_{1g}}\, t_{b_{1g} b_{1g}}} \approx 0.28$, $\varepsilon_{a^*_{1g}} = 0$, $\varepsilon_{b_{1g}} = 2.6$ in units of eV, where the values of Hund's coupling exchange constant $K_{a^*_{1g}}$ and Zhang-Rice exchange constant $K_{b_{1g}}$ are taken from the first principles cluster calculations for a CuO6 octahedron in LSCO 7,8, and the energy difference of the effective one-electron energies between the $a^*_{1g}$ and $b_{1g}$ orbital states, $\varepsilon_{a^*_{1g}} - \varepsilon_{b_{1g}} = -2.6$, is determined so as to reproduce the energy difference between the $^3B_{1g}$ and $^1A_{1g}$ multiplets in LSCO in the MCSCF cluster calculations 8, while those of the $t_{mn}$'s are due to band structure calculations 4,5.
3. Effective energy band and the shape of Fermi surface

Ushio and Kamimura 13,12 started from a tight-binding Hamiltonian. They then obtained the effective Hamiltonian (1), which includes many-body effects, by taking into account the exchange term as a molecular field, determined so that the energies of $^1A_{1g}$ and $^3B_{1g}$ coincide with those calculated by Eto and Kamimura 8.
3.1. Effective energy band

In figure 5 (b), the calculated energy band structure including many-body effects for up-spin (or down-spin) dopant holes in LSCO is shown for various values of wave-vector k and symmetry points in the AF Brillouin zone. Here one should note that the energy in this figure is the electron energy, not the hole energy, and that the Hubbard bands for localized $b_{1g}$ holes do not appear in this figure. In the undoped La2CuO4, all the energy bands in figure 5 (b) are fully occupied by electrons, so that La2CuO4 is an insulator, consistent with experimental results. In this respect the present effective energy band structure is completely different from the ordinary LDA energy bands 16,17. When Sr is doped, holes begin to occupy the top of the highest band in figure 5 (b), marked by #1, at the Δ point, which corresponds to (π/2a, π/2a, 0) in the AF Brillouin zone. At the onset concentration of superconductivity, the Fermi level is located at the energy of E = 9.04 eV just below the top of the #1 band at Δ, which is a little higher than that of the G1 point. Here the G1 point in the AF Brillouin zone lies at (π/a, 0, 0), and corresponds to a saddle point of the van Hove singularity.

Figure 5. (a) The Fermi surface of hole-carriers for x = 0.15 calculated for the #1 band. Here the $k_x$ axis is taken along ΓG1, corresponding to the x-axis (the Cu-O-Cu direction) in real space. (b) The band structure including many-body effects for up-spin dopant holes, obtained by Ushio and Kamimura 13,12. In this figure the Cu-O-Cu distance a is taken to be unity.

3.2. The shape of Fermi surface
Based on the calculated band structure shown in figure 5 (b), Ushio and Kamimura 13,6 calculated the Fermi surfaces for the underdoped regime of LSCO. This Fermi surface structure is also completely different from that of an ordinary Fermi liquid picture, in which the Fermi surface is large. In figure 5 (a) the Fermi surface structure of hole-carriers calculated for x = 0.15 is shown as an example, where the Fermi surface (FS) consists of two pairs of extremely flat tubes. Thus the characteristic feature of the FS in the underdoped regime is that it is small, and its volume is proportional to the concentration of dopant holes.
Figure 6. Observed doping dependence of Fermi arc (the inner section of FS) of La2-xSrxCuO4 from x = 0.03 to 0.15, observed in ARPES by T. Yoshida et al 20,21. The calculated results based on the K-S model for x = 0.05 and x = 0.15 shown by thin curves are superimposed on the experimental results obtained by Yoshida et al.
Since the Bloch function of the top band has Fourier components mainly in the AF BZ, only the inner section of the FS might be observed. ARPES researchers have called it the "Fermi arc". Recently the Fermi arcs of hole carriers have been observed for the underdoped regime of LSCO by Yoshida et al 20,21. In figure 6 the calculated FS for x = 0.05 and x = 0.15 are compared with experimental results of La2-xSrxCuO4 with x = 0.03, x = 0.05, x = 0.07 and x = 0.15. As seen in the figure, the agreement on the doping dependence of Fermi arcs between theory and experiment is fairly good. Recently, Meng et al. 22 reported ARPES measurements of Bi2Sr2-xLaxCuO6+δ (La-Bi2201) that reveal Fermi pockets, supporting our calculated Fermi surface. We extend our calculations to the overdoped regime of LSCO. Since the superexchange interaction is destroyed with increasing hole concentration in the overdoped regime, the K-S model is considered not to hold beyond a certain hole concentration $x_0$. As a result the FS will change from small ones to a large one beyond $x_0$. In the case of LSCO we think that $x_0$ is 0.2 from the experimental results of Loram et al 23 and of Nakano et al 24.

4. High temperature superconductivity
4.1. Spin-dependent electron-phonon interaction

In this subsection we describe how the electron-phonon interaction in the K-S model depends on the spin direction of a hole carrier. In doing so, we give only an outline of the theoretical derivation. Readers interested in the derivation of the equations are referred to chapters 13 and 14 of our Springer textbook entitled "Theory of Copper Oxide Superconductors" 6. On the basis of the K-S model we have shown that the interplay between the electron-phonon interaction and the AF order leads to the d-wave phonon-involved mechanism in LSCO 26,27,6. As seen in figure 7, the atomic wave function for up-spin coincides with that for down-spin, if it is displaced by vector a (a is the Cu-O-Cu distance). Thus, the wavefunctions of a hole-carrier with up and down spin in the K-S model have the following phase relation:

(2)

From this relation (2), the electron-phonon interaction matrix element from state k to k' with up spin scattered by a phonon with wave vector q, $V_{\uparrow}(\mathbf{k}, \mathbf{k}', \mathbf{q})$, has the following spin-dependent property:

$$ V_{\uparrow}(\mathbf{k}, \mathbf{k}', \mathbf{q}) = \exp(i\mathbf{K} \cdot \mathbf{a})\, V_{\downarrow}(\mathbf{k}, \mathbf{k}', \mathbf{q}), \tag{3} $$

where $\mathbf{K} = \mathbf{k} - \mathbf{k}' - \mathbf{q}$ is a reciprocal lattice vector in the AF Brillouin zone and a is the Cu-O-Cu distance. Since a reciprocal lattice vector in the AF Brillouin zone is expressed as $\mathbf{K} = (n\pi/a, m\pi/a, 0)$ with n + m = even, the electron-phonon interactions for up spin and down spin may have a different sign depending on the value of K.
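The sign rule can be checked numerically with a small sketch. It assumes, for illustration only, that the displacement vector a points along the x-axis (the Cu-O-Cu direction) with length a, so that K·a = nπ; the function name is hypothetical.

```python
import cmath
import math

def spin_phase_factor(n, m, a=1.0):
    """Return exp(i K.a) as +1 or -1 for K = (n*pi/a, m*pi/a, 0), with n + m even."""
    if (n + m) % 2 != 0:
        raise ValueError("n + m must be even for an AF reciprocal lattice vector")
    K = (n * math.pi / a, m * math.pi / a, 0.0)
    a_vec = (a, 0.0, 0.0)                        # assumed direction of the displacement
    phase = cmath.exp(1j * sum(k * x for k, x in zip(K, a_vec)))
    return round(phase.real)                     # the phase is real: (-1)**n

print(spin_phase_factor(2, 0), spin_phase_factor(1, 1))
```

Under this assumption the factor is +1 for even n and -1 for odd n, so the up-spin and down-spin matrix elements agree in sign for even n and differ in sign for odd n.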
4.2. Effective inter-hole interaction via phonon

From the relation (3) and $\mathbf{K} = (n\pi/a, m\pi/a, 0)$ with n + m = even, the effective interaction of a pair of holes from $(\mathbf{k}\uparrow, -\mathbf{k}\downarrow)$ to $(\mathbf{k}'\uparrow, -\mathbf{k}'\downarrow)$, scattered by a phonon with wave vector q, is expressed as

$$ V_{\uparrow}(\mathbf{k}, \mathbf{k}', \mathbf{q})\, V_{\downarrow}(-\mathbf{k}, -\mathbf{k}', \mathbf{q}) = \exp(i\mathbf{K} \cdot \mathbf{a})\, |V_{\uparrow}(\mathbf{k}, \mathbf{k}', \mathbf{q})|^{2}. \tag{4} $$

Figure 7. Schematic picture showing the phase relation between the wavefunctions of a hole-carrier with up and down spin on the extended two-story house model for La2-xSrxCuO4.
Since $\exp(i\mathbf{K} \cdot \mathbf{a}) = +1$ for n = even and $\exp(i\mathbf{K} \cdot \mathbf{a}) = -1$ for n = odd, the effective interaction for forming a Cooper pair becomes attractive for n = even and repulsive for n = odd. This remarkable result in the K-S model leads to a superconducting gap of $d_{x^2-y^2}$ symmetry. In figure 8 we show how attractive and repulsive subprocesses compete with each other in a scattering process of a conduction hole by a phonon. Suppose a hole occupies the state A in figure 8. The hole is scattered by a phonon with wavevector q from A to the shaded region, because q is in the normal BZ. Thus the scattering process from the state A to the state B consists of two kinds of subprocesses. In one subprocess, the hole is scattered by a phonon with wavevector q from A to state C' on the FS. Since state C' is transferred to an equivalent state B by the translation of a reciprocal lattice vector (-Q1 - Q2), the effective interaction for this subprocess is attractive according to equation (4) with n = 2. In the other subprocess, the hole at the state A on the FS is scattered from A to state C by a phonon with wavevector q. Since the state C is equivalent to state B by the translation of a reciprocal lattice vector (-Q1), the effective interaction is repulsive by equation (4) with n = m = 1. Since state C' is inside the AF Brillouin zone while state C is outside it, one may say that the scattering process from A to B via C' is a Normal scattering while the scattering process from A to B via C is an Umklapp scattering. Normal and Umklapp scatterings have different signs, so that the effective interaction between holes at states A and B becomes attractive or repulsive depending on the strengths of the two scatterings. These strengths vary with the wave vector q.

Figure 8. Competition of attractive and repulsive subprocesses in the same scattering process of a carrier hole by a phonon from state A to B in the AF Brillouin zone. A hole is first scattered by a phonon to a point in the shaded area, and then to the point equivalent to it. In the figure, q is the phonon wave vector and Q1, Q2 are AF reciprocal lattice vectors; the umklapp sub-process is repulsive and the normal sub-process is attractive.

As an example of calculated results, we present in figure 9 the k and k' dependence of the electron-phonon spectral function $\alpha^2 F_{\uparrow\downarrow}(\Omega, \mathbf{k}, \mathbf{k}')$ with spin-singlet for one of the out-of-plane modes, the $A_{1g}$ mode (see the lower left corner of Fig. 9(a)), in LSCO with tetragonal symmetry, where $F_{\uparrow\downarrow}(\Omega, \mathbf{k}, \mathbf{k}')$ is the momentum-dependent density of phonon states and $\alpha^2$ is the square of the electron-phonon coupling constant
26,6. From Fig. 9(c) we can see that the momentum-dependent spectral function takes values of both + and - sign when the wave vector k' moves from one numbered section of the small FS to another while k is fixed. This is clearly a d-wave behavior. By using this spectral function, we obtain the gap function in Fig. 9(c): the obtained d-component gap function $\Delta(\mathbf{k})$ varies as a function of $(\cos(k_x a) - \cos(k_y a))$, and the wavefunction of a Cooper pair has a spatial extension of $d_{x^2-y^2}$ symmetry. Kamimura et al calculated the electron-phonon spectral function $\alpha^2 F(\Omega)$ for all phonon modes of LSCO with tetragonal symmetry, and their calculated result for the d-wave component of the spectral function for LSCO with tetragonal symmetry, $\alpha^2 F^{(2)}_{\uparrow\downarrow}(\Omega)$, is shown in figure 10 as a function of phonon frequency $\Omega$. The phonon modes of LSCO with tetragonal symmetry are classified into in-plane modes and out-of-plane modes with regard to a CuO2 plane. Figure 10 shows that the out-of-plane modes

Figure 9. Calculated result of $\alpha^2 F_{\uparrow\downarrow}(\Omega, \mathbf{k}, \mathbf{k}')$ and a gap function for one of the out-of-plane phonon modes, the $A_{1g}$ mode in LSCO, as a function of k'. Panels (a), (b) and (c) mark attractive and repulsive regions.
[Figure 10: d-wave component of the spectral function versus Phonon Energy Ω (meV).]
[...] Center) project from 2006 to 2011, supported by the Japanese Ministry of Education and Tokyo University of Science