
Steganography Preserving Statistical Properties Elke Franz Dresden University of Technology, Department of Computer Science D–01062 Dresden, Germany

Abstract. Steganographic modiﬁcations must not noticeably change characteristics of the cover data known to an attacker. However, it is a diﬃcult task to deﬁne an appropriate set of features that should be preserved. This article investigates possibilities to maintain statistical properties while embedding information. As an example, the simple embedding function “overwriting the least signiﬁcant bits” is discussed, w.r.t. greyscale images used as covers. Two modiﬁcations of this technique are presented here: The ﬁrst modiﬁcation avoids changes of ﬁrst-order statistics and thereby overcomes possible attacks based on histogram analysis. The second modiﬁcation tries to reduce noticeable modiﬁcations of second-order statistics and, thereby, of the structure of an image. The suggested modiﬁcations are tested on a set of images. Testing a possible attack and comparing these results to the results of the same attack on stego images created by the original method clariﬁes the improvements.

1 Introduction

Steganography aims to hide the existence of secret messages by embedding them into innocuous-looking cover data. The generated stego data must not raise the suspicion of being the result of steganographic processing. The goal of attacks is to detect the use of steganography [10, 18]. Successful attacks are able to detect modifications of the cover data which can only have been caused by the embedding. With regard to these attacks, an important requirement for steganography can be specified: The embedding must not produce conspicuous properties that indicate the use of steganography. Attacking means nothing but analysing features of the stego data. It is not necessary to preserve the absolute values of the features, but changing their essential properties is critical for the security of a steganographic algorithm. As the analyses of attackers cannot be foreseen, it is very difficult to define the set of features that should be considered. Obviously, the modifications caused by embedding have to be imperceptible. This can be regarded as the basic requirement that should be fulfilled by every steganographic method. Steganographic algorithms of the first generation concentrated on this problem. However, successful attacks on such systems have shown that imperceptibility is necessary, but not sufficient for secure steganography. Human visual perception is not able to recognize all the information in an image; this is commonly exploited for lossy compression. Statistical analysis can disclose imperceptible differences between images. Steganographic algorithms of the second generation are the answer to these successful attacks [15, 19]. The evaluation of additional features is necessary to improve the security of steganographic algorithms. This article concentrates on statistical analysis, which will be used to define modifications of the well-known embedding operation "overwriting of the least significant bits" (LSBs). Replacing all LSBs of an image with a sequence of random bits is easily attackable by histogram analysis (statistical attacks in [18]). The first modification overcomes this weakness. However, even if first-order statistics are necessary to hide the use of steganography, they are not sufficient: an image consisting of a white and a black half has the same histogram as an image consisting of the same number of white and black pixels randomly scattered over the image. Therefore, the image structure also has to be considered. The second modification introduces a possible approach to further improve the overwriting by analysing second-order statistics. To evaluate the improvements yielded by the modifications, various stego images were generated from test images and analysed afterwards.

F.A.P. Petitcolas (Ed.): IH 2002, LNCS 2578, pp. 278–294, 2003. © Springer-Verlag Berlin Heidelberg 2003

2 First Modification

2.1 First-Order Statistical Properties

First-order statistics describe the frequencies of the colours or shades of grey of an image. Usually, histograms are used to describe the distribution of the shades. The problem of the embedding function "overwriting" is that it changes the characteristics of the histogram [18]. If embedding is done in the spatial domain, the frequencies of adjacent colours or shades, which differ only in the least significant bit, become equal by replacing the LSBs with a random bit stream. The same happens to the histogram of frequency coefficients if embedding is done in the frequency domain. To avoid such deviations, [19] suggests using decrementing instead of overwriting. A possible solution to maintain the frequencies of coefficients despite overwriting is introduced in [15]: Every change is corrected by an inverse change of another coefficient. Another approach is presented in this paper.

2.2 Modelling the Embedding

For this approach, it is useful to model the embedding as a Markov source [9]. The frequencies of the shades after embedding depend on the frequencies of the shades before embedding and on the transition probabilities. Because each state depends only on the previous state, we can think of it as a first-order Markov source. The state probabilities of a Markov source are usually variable while the transition probabilities are invariant. The frequencies of the shades are given by the cover image and will be changed by embedding. The transition probabilities are defined by the embedding algorithm and the distribution of the message to be embedded. Therefore, they are constant when embedding the same message with the same embedding algorithm. For overwriting, the probabilities of the message bits correspond to the transition probabilities. Transitions can happen solely between adjacent shades that differ only in the least significant bits, which are replaced with random message bits. Such adjacent shades are referred to as groups in the following.

2.3 Suggested Modification

To preserve the distribution of the shades, we suggest the following approach: First, determine the ideal distribution of the message to be embedded with the given algorithm in a given cover. The distribution is ideal if the message can be embedded without changing the characteristics of the histogram; in fact, even the exact frequencies are preserved in our approach. Second, adjust the message so that it matches the ideal distribution and embed it into the cover. For practical reasons, it is assumed that a given distribution can be changed to match an arbitrary one. Adding the necessary bits and subsequently permuting the message could do this. Permutation algorithms can be found in the literature, for example in [12]. Another possibility to change a distribution is described in [17]. The frequencies of the shades after embedding can be described by using the total probability theorem [14]. The present message distribution is given by the probability q_0 (q_1) for a message bit to be zero (one). In order to preserve the cover histogram, the embedding is modelled as a stationary Markov source. In this special case, the frequencies of the shades after embedding p'_i are equal to the frequencies of the shades before embedding p_i, where p_i stands for the probability of shade g_i. Using this condition p'_i = p_i, we can determine the ideal distribution of the message. As the message is a sequence of zeros and ones, it is sufficient to determine one probability. For example, the required probability q_0 is given by:

p'_{2i} = q_0 (p_{2i} + p_{2i+1}), \quad\text{and for } p'_{2i} = p_{2i}: \quad q_0 = \frac{p_{2i}}{p_{2i} + p_{2i+1}} \qquad (1)

where q_0 + q_1 = 1, \; \sum_{j=0}^{255} p_j = 1, \; i = 0, \ldots, 127.
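Equation (1) can be read as a per-group computation on the cover histogram. A minimal sketch, assuming a 256-bin greyscale histogram; the helper name and the `None` convention for unusable groups are illustrative, not from the paper:

```python
def ideal_q0(hist):
    """For each group (g_2i, g_2i+1), compute the ideal probability q_0
    of a message bit being zero, per equation (1): q_0 = p_2i / (p_2i + p_2i+1).
    hist: list of 256 absolute shade frequencies. Groups with a missing
    shade yield None (they cannot be used for embedding)."""
    q0 = []
    for i in range(128):
        a, b = hist[2 * i], hist[2 * i + 1]
        # Relative frequencies p_2i, p_2i+1 share the same total, which cancels.
        q0.append(a / (a + b) if a > 0 and b > 0 else None)
    return q0

# Example: a group with absolute frequencies 30 and 10 needs 75% zero bits.
hist = [0] * 256
hist[50], hist[51] = 30, 10
print(ideal_q0(hist)[25])  # group starting at shade 50 -> 0.75
```

Note that the groups yielding `None` here are exactly the groups that are excluded as unusable in the algorithm of Section 2.4.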

The distribution of the message has to match for each group. If overwriting a pixel of a group with the message bit "0" yields the shade g_{2i}, the probability for the message bit to be "0" has to be equal to the probability for a pixel belonging to this group to be of shade g_{2i}. In other words, the probabilities of the message bits have to be equal to the probabilities of the corresponding shades of the respective group. Moreover, all pixels belonging to a selected group really have to be used for embedding; otherwise the transition probabilities would be modified.

It is also possible to overwrite more than just the LSBs. This could be done to increase the embedding capacity. The modification described here can be extended to maintain the histogram as follows: By overwriting the least significant b bits of the pixels, groups of m = 2^b shades can be transformed into each other. The groups consist of the shades that differ only in the least significant b bits. The message is divided into n = length(emb)/b blocks B_i of length b. For each group, the distribution of these blocks has to match the probability of a pixel to be of the corresponding shade of the respective group:

p'_{mi+j} = q_j \sum_{k=0}^{m-1} p_{mi+k}, \quad\text{and for } p'_{mi+k} = p_{mi+k}: \quad q_j = \frac{p_{mi+j}}{\sum_{k=0}^{m-1} p_{mi+k}} \qquad (2)

where j = 0, \ldots, m-1, \; i = 0, \ldots, 256/m - 1, \; \sum_{j=0}^{m-1} q_j = 1, \; \sum_{l=0}^{255} p_l = 1, \; q_j = p(B_j).

Figure 1 illustrates the modelling. Theoretically, it would be possible to overwrite all bits of a pixel (supposing an appropriate distribution). Of course, this would significantly change the image. Without evaluating any further properties, the allowed bit rate per pixel must be subjectively limited by the human user. The histogram of the stego image is identical to the histogram of the cover image; therefore, attacks based on first-order statistics will fail.
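The adjustment of the message distribution mentioned above (adding the necessary bits, then permuting) can be sketched as follows. This is only one illustrative reading of the idea; the keyed pseudo-random permutation as the shared secret and all names are assumptions, not the author's exact procedure:

```python
import random

def adjust_distribution(message, q0_target, key):
    """Pad a bit list with under-represented bit values until the fraction
    of zeros reaches q0_target, then apply a keyed permutation so the
    padding is spread over the whole string (sketch only)."""
    bits = list(message)
    while len(bits) == 0 or bits.count(0) / len(bits) < q0_target:
        bits.append(0)
    while len(bits) > 0 and bits.count(0) / len(bits) > q0_target:
        bits.append(1)
    rng = random.Random(key)           # sender and recipient share the key
    order = list(range(len(bits)))
    rng.shuffle(order)
    return [bits[i] for i in order], order

def restore(shuffled, order):
    """Invert the keyed permutation; stripping the padding additionally
    requires knowing the original message length."""
    out = [None] * len(order)
    for pos, i in enumerate(order):
        out[i] = shuffled[pos]
    return out

adjusted, order = adjust_distribution([0, 1, 1, 1], 0.75, key=42)
print(adjusted.count(0) / len(adjusted))   # 0.75, matching the target q_0
print(restore(adjusted, order)[:4])        # [0, 1, 1, 1], the original message
```

In practice one would use a cryptographically keyed permutation as in [12]; `random.Random` merely keeps the sketch self-contained.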

2.4 Algorithm

The necessary steps are illustrated in Figure 2. First, the cover histogram is calculated in order to obtain the probabilities of the shades. Afterwards, one looks for usable groups of shades. A group is usable if all of its shades occur in the image. In the next step, the groups are sorted. An embedding table is used for this step. The first three columns of this table are directly derived from the histogram. Groups are referenced by their starting value, which is the first shade of the group. The relative frequency of a group is the sum of the relative frequencies of the shades belonging to it. The capacity specifies the number of bits that can be embedded in the pixels belonging to a certain group; it is the product of the absolute frequency of the group and the number of embedded bits per pixel. Groups can be sorted according to various criteria, for example according to

– the capacity, largest capacity first (as used in the tests), or
– the deviation from normal distribution, least deviation first.
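The preprocessing steps (histogram, usable groups, capacities, sorting) can be sketched as follows; the table layout and names are illustrative, assuming sorting by capacity as in the tests:

```python
def build_embedding_table(pixels, b=1):
    """Sketch of the preprocessing for the first modification: find usable
    groups of 2^b adjacent shades, compute their relative frequency and
    capacity, and sort them by capacity, largest first."""
    m = 2 ** b
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    table = []
    for start in range(0, 256, m):
        freqs = hist[start:start + m]
        if 0 in freqs:
            continue                        # group unusable: a shade is missing
        absolute = sum(freqs)
        table.append({
            "start": start,                 # starting value of the group
            "rel_freq": absolute / total,   # relative frequency
            "capacity": absolute * b,       # bits embeddable in this group
        })
    table.sort(key=lambda row: row["capacity"], reverse=True)
    return table

# Tiny example: shade 47 never occurs, so the group starting at 46 is unusable.
pixels = [48] * 5 + [49] * 3 + [50] * 2 + [51] * 2 + [46] * 4
table = build_embedding_table(pixels)
print([row["start"] for row in table])  # [48, 50]
```

The remaining columns (the prepared message parts) would be filled by splitting the message and adjusting each part's distribution to the group's ideal q_j, as described in Section 2.3.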

[Figure 1 shows the modelling as transition diagrams: for overwriting the least significant bit, the shades with probabilities p_{2i}, p_{2i+1} before embedding are mapped to p'_{2i}, p'_{2i+1} after embedding; for overwriting the two least significant bits, the transition probabilities are q_0, ..., q_3; in general, for overwriting the b = ld m = log2 m least significant bits, the probabilities p_{mi}, ..., p_{mi+m-1} before embedding are mapped to p'_{mi}, ..., p'_{mi+m-1} after embedding via the transition probabilities q_0, ..., q_{m-1}.]

Fig. 1. Overwriting the least significant bits

The sorting corresponds to the selection of groups if the embedded message is shorter than the embedding capacity of the cover image. The overall goal is the best utilization of the cover image. After this step, the first three columns of the embedding table are filled. The order of rows corresponds to the order of processing while embedding. In the last preprocessing step, the message is prepared for embedding: It is split into parts which will be embedded in the groups. The distribution of every part has to be adjusted as described above in order to match the intended group. The length of a part is determined by the capacity of this group. These parts are inserted into the last column of the embedding table, subsequently filling up the rows. The last group is padded if necessary. Finally, the message is embedded. The algorithm processes the cover pixel by pixel in a fixed order. For each pixel belonging to a usable group, the next b bits of the appropriate row of the embedding table overwrite the b least significant bits of the pixel. The resulting stego image is transmitted to the recipient of the message.

[Figure 2 illustrates the first modification on an example histogram: 1. calculate the cover histogram; 2. look for usable groups; 3. sort the usable groups; 4. split the message into pieces for embedding into the groups and match their distributions; 5. process the cover pixel by pixel in a fixed order, embedding in pixels belonging to a usable group. The embedding table holds, per group, the starting value, the relative frequency, the capacity in bits, and the message part (e.g., group 48: relative frequency 0.3736, capacity 934 bit).]

Fig. 2. Overwriting the LSBs — first modification

Extracting the message is done in a similar manner. The recipient must generate the same embedding table to be able to extract the message. As the frequencies of the shades are not changed, he can do so by using the histogram of the stego image. Sender and recipient must know the sorting criteria of the groups. After generating the embedding table, the stego image has to be processed in exactly the same order as the cover was while embedding. The b least significant bits of pixels belonging to a usable group are sorted into the respective row of the embedding table. Thereafter, the adjusting of the distribution has to be reversed; after this reversal, the rows of the table contain pieces of the original message. In the last step, the message is successively put together from these pieces.
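For b = 1, the symmetric embedding and extraction loops can be sketched as follows; `rows` stands for the last column of the embedding table, and all names are illustrative, not the author's code:

```python
def embed(pixels, rows):
    """Overwrite the LSB of each pixel belonging to a usable group with the
    next bit of that group's row. rows: {group starting value: list of bits},
    consumed in place. Pixels of unusable groups stay untouched."""
    stego = []
    for p in pixels:
        start = p & ~1                   # starting value of the pixel's group
        if start in rows and rows[start]:
            bit = rows[start].pop(0)
            p = start | bit              # replace the LSB with the message bit
        stego.append(p)
    return stego

def extract(stego, group_starts):
    """Re-read the LSBs in the same fixed order, sorting them into the rows
    per group; reversing the distribution adjustment is omitted here."""
    rows = {s: [] for s in group_starts}
    for p in stego:
        start = p & ~1
        if start in rows:
            rows[start].append(p & 1)
    return rows

cover = [48, 49, 50, 48, 51]
stego = embed(cover, {48: [1, 0, 1], 50: [0, 1]})
print(stego)                     # [49, 48, 50, 49, 51]
print(extract(stego, [48, 50]))  # {48: [1, 0, 1], 50: [0, 1]}
```

Because embedding only moves pixels within their group, the extraction loop recovers exactly the rows that were embedded, as the example shows.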

3 Second Modification

3.1 Co-occurrence Matrices

As first-order statistics are not sufficient, we want to have a look at correlations between pixels in order to get more information about the structure of an image. A common means of describing such correlations is the use of co-occurrence matrices. A co-occurrence matrix is a two-dimensional matrix whose entries give the frequency of the co-occurrence of shades at two pixels separated by a fixed distance and direction. Relations (∆x, ∆y) describe distance and direction. These are second-order statistics because relations between two pixels are considered. For the shade g(x, y) of the pixel at position (x, y), the entry c_{ij} of a co-occurrence matrix describes the frequency of pairs of pixels with (g(x, y) = i) ∧ (g(x + ∆x, y + ∆y) = j). A separate matrix is generated for each relation.
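Computing a co-occurrence matrix for one relation (∆x, ∆y) is straightforward; a small illustrative helper (the function name and image layout are assumptions):

```python
def cooccurrence(image, dx, dy, shades=256):
    """Count pairs (g(x, y), g(x + dx, y + dy)) for one relation (dx, dy).
    image: 2D list of grey values, indexed image[y][x]; pairs reaching
    outside the image are skipped."""
    h, w = len(image), len(image[0])
    c = [[0] * shades for _ in range(shades)]
    for y in range(h):
        for x in range(w):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < w and 0 <= y2 < h:
                c[image[y][x]][image[y2][x2]] += 1
    return c

# Tiny example with two shades and the relation (1, 0):
img = [[0, 1],
       [1, 1]]
c = cooccurrence(img, 1, 0, shades=2)
print(c)  # [[0, 1], [0, 1]] -> the pairs (0,1) and (1,1) occur once each
```

One such matrix per relation in {(1, 0), (−1, 1), (0, 1), (1, 1)} suffices for the analysis described below.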

3.2 Approach to Maintain the Matrix Entries

It is more difficult to describe the effects of embedding on second-order statistics. Instead of looking at a single histogram, various matrices must be considered, and it is not clear from the beginning which relations must be analysed. Therefore, we want to use the co-occurrence matrices only as a means to find shades that occur randomly and independently of each other in the image. If those are transformed into each other while embedding, it is neither noticeable nor measurable. That means we are looking for noisy parts of the image that can be replaced by noise, and the analysis of second-order statistics serves as a means of detecting noise. The random occurrence of the shades is identifiable in the co-occurrence matrices: Two events A and B are called independent if their probabilities satisfy P(AB) = P(A)P(B) [14]. The entries of the matrix correspond to the joint probability, and the probabilities of the events are given by the frequencies of the shades in the histogram. That means the formula is valid if the shades are stochastically independent. This condition can be used to establish rules that will maintain the matrix entries:

1. The shades which can be transformed into each other while embedding must be stochastically independent.
2. The embedding function used has to maintain first-order statistics.

A method to fulfill the second rule was already proposed in the previous chapter. So the remaining task is to find independent shades. The relations to be analysed have to be specified. As it can be expected that neighbouring pixels are most correlated, we concentrate on relations that describe the direct neighbourhood of a pixel, namely (∆x, ∆y) ∈ {(1, 0), (−1, 1), (0, 1), (1, 1)}.

3.3 Stochastically Independent Shades

The χ2 -test of independence (e. g., [2]) was used to detect stochastically independent shades. This test belongs to the class of signiﬁcance tests that determine the probability that a given result did not occur by chance. The following null hypothesis was tested here: The shades, which can be transformed into each other while embedding, are stochastically independent. The number of shades for which independence must be tested depends on the embedding operation. In the example examined here it is given by the groups.


If the least significant bit is used for embedding, two shades can be transformed into each other at a time: the shades g_{2i} and g_{2i+1}. With regard to the co-occurrence matrix, the independence of the features "shade of the first pixel" and "shade of the second pixel" is tested. The sample for the test consists of all pairs of pixels belonging to the given relation and having shades g_{2i} and g_{2i+1}. A fourfold table can be used in this special case:

                     Feature X
Feature Y      g_{2i}     g_{2i+1}
g_{2i}         h_{11}     h_{12}      h_{1.}
g_{2i+1}       h_{21}     h_{22}      h_{2.}
               h_{.1}     h_{.2}      n

The statistic for the test of the hypothesis

\chi^2_r = \frac{n\,(h_{11} h_{22} - h_{12} h_{21})^2}{h_{1.}\, h_{2.}\, h_{.1}\, h_{.2}}

is χ²-distributed with one degree of freedom. The Yates correction is used for small samples of size less than 50. The statistic is calculated for all relations mentioned above. The modified embedding algorithm must be extended by this additional analysis: After selecting usable groups, they are further restricted to the groups whose shades occur independently of each other.
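The fourfold-table test can be sketched as follows; the statistic and the Yates continuity correction are standard, while the helper itself and the critical value 3.841 (α = 0.05, one degree of freedom) are given for illustration:

```python
def chi2_fourfold(h11, h12, h21, h22, yates=False):
    """Chi-square statistic for a 2x2 (fourfold) contingency table.
    With yates=True the continuity correction for small samples is applied."""
    n = h11 + h12 + h21 + h22
    r1, r2 = h11 + h12, h21 + h22      # row sums h_1., h_2.
    c1, c2 = h11 + h21, h12 + h22      # column sums h_.1, h_.2
    d = h11 * h22 - h12 * h21
    if yates:
        d = abs(d) - n / 2             # Yates continuity correction
    return n * d * d / (r1 * r2 * c1 * c2)

# Pairs split evenly -> statistic 0, independence not rejected.
print(chi2_fourfold(25, 25, 25, 25))           # 0.0
# Strong diagonal structure -> statistic 36.0, far above 3.841: rejected.
print(chi2_fourfold(40, 10, 10, 40) > 3.841)   # True
```

In the modified algorithm this test would be run for each usable group and each of the four relations; only groups passing all tests remain usable.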

4 Evaluating the Modifications

4.1 General Testing

Both algorithms were realized in Mathematica®. A set of 50 test images was used for embedding. The maximum amount of data was embedded, whereby the messages were random bit strings of the necessary distribution. To show the improvements of the modifications, embedding was compared to the original method of overwriting the least significant bits. For each of the test images, four stego images were generated:

– stego1: overwriting according to the original method,
– stego2: overwriting according to the first modification, 1 bit per pixel,
– stego3: overwriting according to the first modification, 2 bits per pixel,
– stego4: overwriting according to the second modification, 1 bit per pixel.

The test images are scans of photographs showing natural scenes. They are of different characteristics. Some of them include larger homogeneous areas of shades that are possible in the saturation areas of the scanner used; others consist of medium shades only. In the former case, it can be expected to find mainly one shade in such areas [4]. The maximum size of the images was 110 kByte.

This test shows that the restriction to usable groups does not generally restrict the embedding capacity. The histograms of natural images usually cover a continuous area of shades. Therefore, the restriction may become noticeable only at the marginal areas of the histograms and only if the images include large homogeneous areas. The same holds for embedding two bits per pixel. The second modification results in stronger restrictions of the capacity, but in general, despite these restrictions, quite an amount of data can be embedded. The possible capacities for each method are compared in Figure 3; the maximal number of bits that can be embedded in a cover is given in percent (all LSBs of the cover correspond to 100%). The covers are grouped into classes of different capacity as described in the figure.

[Figure 3 compares, for the first modification (1 bit and 2 bits per pixel) and the second modification, the percentage of test images falling into each embedding-capacity class; the classes 1–11 cover the capacity ranges [0%–10%), [10%–20%), [20%–30%), [30%–40%), [40%–50%), [50%–60%), [60%–70%), [70%–80%), [80%–90%), [90%–99%), and [99%–100%].]

Fig. 3. Embedding capacities of the test images

4.2 Possible Attacks

Describing the Attacks. The discussion of possible attacks on the steganographic methods is of special interest. To classify the attacks tested here, some relevant items are shortly discussed in the following. Possible attacks can be described w.r.t.

– the strategy of the attack,
– the analysed features of the stego images, and
– the possible result of the attack.

Attacking Strategies. The first strategy may be to model features of the cover images and to look for deviations from this model. However, it is difficult to describe the "normal" appearance of these features in general. There are different strategies known to approach this problem. One of them is making assumptions about the features of the stego image by modelling the impact of embedding [18]. Another approach is to investigate changes of the features, either by steganographic modifications [5] or by other operations, such as image processing functions [1]. Considering a set of training images, the impact of processing the images on the chosen features is analysed. To answer the question whether an intercepted image contains steganographic data, the same processing is done with this image and the changes of the analysed features are compared to the empirically determined thresholds.

Analysed Features. Attacks concentrate on special features of the suspected image. Regarding statistical analysis, these features can be classified by first-order statistics, second-order statistics, and so on. It can be assumed that the force of an attack increases in this order. First-order statistics are used in attacks that analyse histogram modifications [5, 13, 15, 18]. Analysing more than one pixel at a time leads to higher-order statistics. Examples are the Laplace filtering described in [11], the subband filtering used in [3], and the analysis of regular and singular groups in [7].

Possible Result. Of course, it will not be possible to get absolute certainty about the use of steganography, because this would require absolute certainty about the cover images and the impact of all possible processing. Therefore, at best it can be stated that an image is a stego image with the utmost probability. It depends on the attack what such a statement is based on:

– only on the attacker's subjective estimation,
– on the comparison to empirically determined thresholds, or
– on statistical hypothesis testing.

Both efficiency and reliability of an attack seem to increase in this order. The visual attacks in [18] as well as the subjective estimation of other image features belong to the first category. The attacks in [1] and [5] can be seen as examples of the second category, and the statistical attacks in [15, 18] belong to the third category.

Examined Attacks.
As the suggested modifications maintain the first-order statistics of the cover images, it is not necessary to test attacks based on histogram analysis. We can focus on attacks based on higher-order statistics. The strategy used here is to make assumptions about the features of the stego images, and the result shall be based on hypothesis testing. The starting point is the Laplace filtering described in [11]. The images are filtered using the following Laplace operator, regarding four adjacent pixels:

H = \begin{pmatrix} 0 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 0 \end{pmatrix}

The results of filtering two of the test images are shown in Figure 4. Because it is assumed that adjacent pixels are of similar colours, the maximum of the resulting filter histogram is expected to be at zero.

[Figure 4 shows the filter histograms (probability over the result of filtering, range −40 to 40) for two covers and the corresponding stego1 images: one image with large homogeneous areas and one image consisting of medium shades.]

Fig. 4. Results of Laplace filtering; cover on the left, stego1 on the right

Overwriting the least significant bits changes this characteristic; the LSBs become stochastically independent, which can be noticed in the histogram. However, the result depends on the structure of the image: the impact of overwriting cannot be noticed when the LSBs of the cover image look noisy. To work as efficiently as possible, a hypothesis test was used. The results of Laplace filtering cannot be estimated in general because they depend on the image. However, regarding just the LSBs simplifies the situation: at least, the impact of overwriting all LSBs according to the original method can be modelled. After overwriting all LSBs with a random bit sequence, as done by the original method, each bit is zero or one with probability 0.5. Filtering only the LSBs of such a stego image therefore yields a filter histogram whose frequencies can be derived from the probabilities of the LSBs. Possible results of this operation are values in the range −4, ..., 4. Only the numbers of zeros n_0 and ones n_1 among the adjacent bits have to be regarded when determining the number of combinations that yield a certain result. The number of possible combinations of the adjacent pixels n_m(n_0, n_1) is given by the permutation P_n^m of n elements of m groups with k_1, k_2, ..., k_m equal elements [8]:

n_m(n_0, n_1) = P_n^m = \frac{n!}{k_1! \cdot k_2! \cdots k_m!} = \frac{4!}{n_0! \cdot n_1!}

The probability of a result, P(res), of the filtering is then simply given by:

P(res) = n_m(n_0, n_1) \cdot 0.5^{\,4+1}
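The expected filter histogram can be cross-checked by exhaustively enumerating the 2^5 configurations of a centre bit and its four neighbours; a small sketch (the function name is illustrative):

```python
from itertools import product
from math import factorial

def expected_filter_histogram():
    """Expected distribution of 4*c - sum(neighbours) when the centre bit c
    and its four neighbours are independent fair bits, i.e. the Laplace
    operator applied to the LSB plane after overwriting with a random message."""
    hist = {}
    for c, *nb in product([0, 1], repeat=5):
        res = 4 * c - sum(nb)
        hist[res] = hist.get(res, 0) + 1 / 32   # each of the 2^5 cases: 0.5^5
    return hist

hist = expected_filter_histogram()
print(hist[-4])  # 0.03125 = 1/32, matching Table 1
print(hist[0])   # 0.0625: centre 0 with 4x0 plus centre 1 with 4x1

# Cross-check against the closed form n_m(n_0, n_1) * 0.5^(4+1):
n0, n1 = 2, 2
assert abs(factorial(4) / (factorial(n0) * factorial(n1)) * 0.5 ** 5
           - hist[-2]) < 1e-12
```

The enumeration reproduces every row of Table 1 below.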


Table 1. Expected distribution after filtering the LSBs with the Laplace operator

Middle pixel   Adjacent pixels   Result   Number   Probability
0              4x1               −4       1        0.03125
0              3x1, 1x0          −3       4        0.125
0              2x1, 2x0          −2       6        0.1875
0              1x1, 3x0          −1       4        0.125
0 / 1          4x0 / 4x1          0       2        0.0625
1              3x1, 1x0           1       4        0.125
1              2x1, 2x0           2       6        0.1875
1              1x1, 3x0           3       4        0.125
1              4x0                4       1        0.03125

This way, the theoretically expected distribution of the filter histogram for filtering the LSBs of such a stego image can be determined (Table 1). Summarizing: to discriminate between cover and stego images, the Laplace operator filters the least significant bit plane. The χ²-test of homogeneity [2] is used to test the null hypothesis that the distribution of this result matches the expected distribution after overwriting the LSBs (Table 1). If the hypothesis is rejected, it is concluded that the LSBs of the analysed image do not contain a secret message in the form of a random bit string. The significance level α of the test determines the critical value that decides whether or not the hypothesis is rejected. The significance level is therefore the probability of making a type I error, that is, of wrongly rejecting the null hypothesis. In the tests examined here, two significance levels were used: α = 0.05 and α = 0.01. Therefore, three possible regions for the results can be defined:

– the hypothesis is not rejected,
– the hypothesis is still accepted for α = 0.01, but rejected for α = 0.05, and
– the hypothesis is rejected even for α = 0.01.

If the hypothesis is not rejected, the LSBs may contain a secret message. Other possible reasons are that the LSBs of this image match the expected distribution anyway, or that a type II error occurred. This kind of error cannot be handled by a significance test. This attack has a high probability of detecting the original embedding method. The first modification only considers first-order statistics and is not constructed to resist such an analysis. However, detection will become more difficult because the distribution of the message was changed and therefore the embedded stream is not uniformly distributed. The best resistance can be expected from the second modification, which respects second-order statistics. Of course, an attack has to be tailored to the specific embedding method. A second analysis was done to illustrate a possibility.
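The attack can be sketched as follows; the goodness-of-fit statistic against the expected distribution of Table 1 is standard, while the function name and the handling of border pixels are illustrative assumptions:

```python
# Expected probabilities from Table 1, indexed by the filter result -4..4.
EXPECTED = {-4: 0.03125, -3: 0.125, -2: 0.1875, -1: 0.125, 0: 0.0625,
            1: 0.125, 2: 0.1875, 3: 0.125, 4: 0.03125}

def laplace_lsb_attack(lsb_plane):
    """Filter the LSB plane with the Laplace operator and compute the
    chi-square statistic of the observed filter histogram against EXPECTED.
    lsb_plane: 2D list of bits; border pixels are skipped for simplicity."""
    h, w = len(lsb_plane), len(lsb_plane[0])
    counts = {r: 0 for r in EXPECTED}
    n = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            res = (4 * lsb_plane[y][x]
                   - lsb_plane[y - 1][x] - lsb_plane[y + 1][x]
                   - lsb_plane[y][x - 1] - lsb_plane[y][x + 1])
            counts[res] += 1
            n += 1
    # Chi-square statistic over the 9 result classes (8 degrees of freedom).
    return sum((counts[r] - n * p) ** 2 / (n * p) for r, p in EXPECTED.items())

# A completely flat bit plane is pure structure: the statistic is huge,
# the hypothesis "LSBs look like a random message" is clearly rejected
# (critical values for 8 degrees of freedom: 15.51 at alpha = 0.05,
# 20.09 at alpha = 0.01).
print(laplace_lsb_attack([[0] * 10 for _ in range(10)]) > 20.09)  # True
```

For a plane of genuinely random bits, the statistic would typically stay below the critical value, so the hypothesis would not be rejected.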
The object of this analysis is the independence of shades. The embedding process in the first modification does not consider correlations between pixels. The distribution of the embedded stream is modified, but it is still just a random sequence of bits. It can be expected that the embedding will destroy existing correlations; the number of independent shades is expected to increase after embedding. In contrast, the number of independent shades will not increase when using the second modification, because only independent shades are used for embedding. This is just one possibility for a more sophisticated attack tailored to the modification suggested here. Another possibility would be testing randomness especially for the possible groups.

Results of the Attacks. The Laplace filtering tests whether the LSBs of an image are uniformly distributed. The evaluation shows the advantage of using hypothesis testing: The test is able to detect deviations from the uniform distribution that are imperceptible to the human eye. Deviations from the uniform distribution are, in fact, structures in the corresponding bit plane. Of course, even the LSBs of a cover image can look noisy and may not include any structure at all. The test identified structures in 33 of the 50 test images (with the naked eye, the structures were recognized in only 17 images). For the images without any measurable structures, the test result will not change for the analysis of the stego images. The 33 images with measurable structures in the LSBs are of greater interest. The assumptions made above are corroborated by the tests (Figure 5). Embedding without any modification was detectable by the test for every test image. The first modification makes the recognition more difficult, and finally, the stego images generated by using the second modification yield the same test results as the cover images.

[Figure 5 shows, for the covers and for stego1–stego4, the percentage of these images for which the null hypothesis "the distribution of the LSBs matches the distribution after overwriting with a random message" is not rejected (indicating possible use of steganography), rejected for α = 0.05, or rejected for α = 0.01 (detecting structures, i.e., no signs of steganography).]

Fig. 5. Result of Laplace attack for images with non-noise LSBs

[Figure 6 shows the ratio of independent shades (in percent) for the covers and for stego1–stego4, separately for images with larger homogeneous areas (left) and images with medium shades (right).]

Fig. 6. Ratio of independent shades (in percents)

The second assumption about the number of independent shades was also supported by the tests (Figure 6). Randomization of the least signiﬁcant bit plane(s) has little eﬀect on the number of independent shades for noisy images (right hand side of the ﬁgure). In contrast, this number is signiﬁcantly changed if there are recognizable structures in the least signiﬁcant bit plane of the cover (left hand side of the ﬁgure). However, using the second modiﬁcation (stego4) will largely avoid these changes. Future work will be done to create further evaluations.

5 Summary and Outlook

In this paper we investigated possibilities to preserve statistical properties while overwriting the LSBs of an image. For first-order statistics, the embedding was modelled as a stationary Markov source. For second-order statistics, an additional analysis of the cover image was suggested to improve the selection of random image parts. The modifications use groups of shades for embedding. Another approach, which uses pairs of colours, is described in [16]: Two colours form a pair and can be used for embedding if the difference between them is less than the noise of the host, and if they occur with about the same statistical frequency. The approach suggested here can be considered a generalization of that approach. The impact of the modifications is illustrated in Figure 7. Overwriting all LSBs with a random string destroys structures in the least significant bit plane. Our modifications clearly improve on this result by maintaining existing structures. Tests have shown the usability of the suggested modifications. At worst, the algorithm reduces the embedding capacity of improper images; in spite of the necessary restrictions, quite an amount of data can still be embedded. Analysing the LSBs of stego images generated according to the second modification yielded the same results as analysing the LSBs of the corresponding cover images. As can be expected, the test results depend on the images: only overwriting non-noise LSBs is detectable at all. Therefore, images with a perceptible structure in the LSBs are not suited for embedding according to the original method. However,


Fig. 7. Results of suggested modifications: cover “bastei.pgm”, LSBs of the cover, and LSBs of stego1-stego4

one can use them as covers nevertheless by preserving the existing structures, as proposed here. Future work has to be done to improve the statistical analyses and to use their results to further improve the embedding. The investigations were done for overwriting the LSBs of images in the spatial domain; for practical reasons, it is important to investigate the applicability of the modifications to other embedding operations. And finally, as mentioned above, the analyses used for validating the results have to be improved, too. This will also include researching the applicability of attacks known from the literature. In [6] it was pointed out that the use of cover images previously stored in the JPEG format should be avoided; the JPEG compatibility test described in that article has not yet been applied to the algorithms suggested here. Especially the RS analysis in [7] seems to be of great interest: some first tests were done to determine the R and S groups for cover and stego images produced here, but much more evaluation is necessary before concrete statements can be made. Finally, the author would like to thank the reviewers for their helpful comments.


References

[1] Ismail Avcibas, Nasir Memon, Bülent Sankur: Steganalysis of Watermarking Techniques using Image Quality Metrics. In: Ping Wah Wong, Edward J. Delp (Eds.): Security and Watermarking of Multimedia Contents III. Proceedings of SPIE Vol. 4314, 2001, 523-531.
[2] Wilfried J. Dixon, Frank J. Massey: Introduction to Statistical Analysis. McGraw-Hill Book Company, Inc., New York, 1957.
[3] Hany Farid: Detecting Steganographic Messages in Digital Images. http://www.cs.dartmouth.edu/~farid/publications/tr01.html
[4] Elke Franz, Andreas Pfitzmann: Steganography Secure Against Cover-Stego-Attacks. In: Andreas Pfitzmann (Ed.): Information Hiding, Third International Workshop, IH'99, Dresden, Germany, September/October 1999, Proceedings. Springer, LNCS 1768, 2000, 29-46.
[5] Jessica Fridrich, Rui Du, Meng Long: Steganalysis of LSB Encoding in Color Images. ICME 2000, New York City, USA, July 31 - August 2, 2000. http://www.ssie.binghampton.edu/fridrich/publications.html
[6] Jessica Fridrich, Miroslav Goljan, Rui Du: Steganalysis Based on JPEG Compatibility. SPIE Multimedia Systems and Applications IV, Denver, CO, August 20-24, 2001.
[7] Jessica Fridrich, Miroslav Goljan, Rui Du: Reliable Detection of LSB Steganography in Color and Grayscale Images. Proc. of the ACM Workshop on Multimedia and Security, Ottawa, CA, October 5, 2001, 27-30.
[8] Wilhelm Göhler: Höhere Mathematik: Formeln und Hinweise. Bearb. von Barbara Ralle, 10. überarb. Auflage, VEB Deutscher Verlag für Grundstoffindustrie, Leipzig, 1987.
[9] Solomon W. Golomb, Robert E. Peile, Robert A. Scholtz: Basic Concepts in Information Theory and Coding. Plenum Press, New York, 1994.
[10] Neil F. Johnson, Sushil Jajodia: Steganalysis of Images Created Using Current Steganography Software. In: David Aucsmith (Ed.): Information Hiding, Second International Workshop, IH'98, Portland, Oregon, USA, April 1998, Proceedings. Springer, LNCS 1525, 1998, 273-289.
[11] Stefan Katzenbeisser, Fabien A. P. Petitcolas (Eds.): Information Hiding Techniques for Steganography and Digital Watermarking. Artech House, 2000.
[12] Donald E. Knuth: The Art of Computer Programming. Volume 2: Seminumerical Algorithms. Addison-Wesley, 3rd Ed., 1998.
[13] Maurice Maes: Twin Peaks: The Histogram Attack to Fixed Depth Image Watermarks. In: David Aucsmith (Ed.): Information Hiding, Second International Workshop, IH'98, Portland, Oregon, USA, April 1998, Proceedings. Springer, LNCS 1525, 1998, 290-305.
[14] Athanasios Papoulis: Probability, Random Variables, and Stochastic Processes. 2nd Ed., McGraw-Hill, New York, 1984.
[15] Niels Provos: Defending Against Statistical Steganalysis. 10th USENIX Security Symposium, August 2001. http://www.citi.umich.edu/u/provos/stego/
[16] Maxwell T. Sandford, Jonathan N. Bradley, Theodore G. Handel: The Data Embedding Method. In: Proc. of the SPIE Photonics East Conference, Philadelphia, September 1995.
[17] Peter Wayner: Mimic Functions. Technical Report, Cornell University, Department of Computer Science, 1990.
[18] Andreas Westfeld, Andreas Pfitzmann: Attacks on Steganographic Systems. In: Andreas Pfitzmann (Ed.): Information Hiding, Third International Workshop, IH'99, Dresden, Germany, September/October 1999, Proceedings. Springer, LNCS 1768, 2000, 61-76.
[19] Andreas Westfeld: F5 - A Steganographic Algorithm: High Capacity Despite Better Steganalysis. In: Ira S. Moskowitz (Ed.): Information Hiding, 4th International Workshop, IH'01, Pittsburgh, PA, USA, April 2001, Proceedings. Springer, LNCS 2137, 2001, 289-302.

Abstract. Steganographic modifications must not noticeably change characteristics of the cover data that are known to an attacker. However, it is a difficult task to define an appropriate set of features that should be preserved. This article investigates possibilities to maintain statistical properties while embedding information. As an example, the simple embedding function “overwriting the least significant bits” is discussed with respect to greyscale images used as covers. Two modifications of this technique are presented: The first modification avoids changes of first-order statistics and thereby overcomes possible attacks based on histogram analysis. The second modification tries to reduce noticeable modifications of second-order statistics and, thereby, of the structure of an image. The suggested modifications are tested on a set of images; comparing the results of a possible attack on the resulting stego images with the results of the same attack on stego images created by the original method clarifies the improvements.

1 Introduction

Steganography aims to hide the existence of secret messages by embedding them into innocuous-looking cover data. The generated stego data must not raise the suspicion of being the result of steganographic processing. The goal of attacks is to detect the use of steganography [10, 18]. Successful attacks are able to detect modifications of the cover data that can only have been caused by the embedding. With regard to these attacks, an important requirement for steganography can be specified: The embedding must not cause significant properties that indicate the use of steganography.

Attacking means nothing but analyzing features of the stego data. It is not necessary to preserve the absolute values of the features, but changing their essential properties is critical for the security of a steganographic algorithm. As the analyses of attackers cannot be foreseen, it is very difficult to define the set of features that should be considered.

Obviously, the modifications caused by embedding have to be imperceptible. This can be regarded as the basic requirement that should be fulfilled by every steganographic method. Steganographic algorithms of the first generation concentrated on this problem. However, successful attacks on such systems have shown that imperceptibility is necessary, but not sufficient for secure steganography. Human visual perception is not able to recognize all information of an image; this is commonly exploited for lossy compression. Statistical analysis can disclose imperceptible differences between

F.A.P. Petitcolas (Ed.): IH 2002, LNCS 2578, pp. 278-294, 2003. © Springer-Verlag Berlin Heidelberg 2003


images. Steganographic algorithms of the second generation are the answer to these successful attacks [15, 19]. The evaluation of additional features is necessary to improve the security of steganographic algorithms. This article concentrates on statistical analysis, which is used to define modifications of the well-known embedding operation “overwriting the least significant bits” (LSBs). Replacing all LSBs of an image with a sequence of random bits is easily attacked by histogram analysis (the statistical attacks in [18]); the first modification overcomes this weakness. However, even if preserving first-order statistics is necessary to hide the use of steganography, it is not sufficient: An image consisting of a white and a black half has the same histogram as an image consisting of the same number of white and black pixels randomly scattered over the image. Therefore, the image structure also has to be considered. The second modification introduces a possible approach to further improve the overwriting by analysing second-order statistics. To evaluate the improvements yielded by the modifications, various stego images were generated from test images and analysed afterwards.

2 First Modification

2.1 First-Order Statistical Properties

First-order statistics describe the frequencies of the colours or shades of grey of an image; usually, histograms are used to describe the distribution of the shades. The problem of the embedding function “overwriting” is that it changes the characteristics of the histogram [18]. If embedding is done in the spatial domain, the frequencies of adjacent colours or shades, which only differ in the least significant bit, become equal by replacing the LSBs with a random bit stream. The same happens to the histogram of frequency coefficients if embedding is done in the frequency domain. To avoid such deviations, [19] suggests decrementing instead of overwriting. A possible solution that maintains the frequencies of coefficients despite overwriting is introduced in [15]: Every change is corrected by an inverse change of another coefficient. Another approach is presented in this paper.

2.2 Modelling the Embedding

For this approach, it is useful to model the embedding as a Markov source [9]. The frequencies of the shades after embedding depend on the frequencies of the shades before embedding and on the transition probabilities. Because each state just depends on the previous state, we can think of it as a first-order Markov source. The state probabilities of a Markov source are usually variable while the transition probabilities are invariant. The frequencies of the shades are given by the cover image and will be changed by embedding. The transition probabilities


are defined by the embedding algorithm and the distribution of the message to be embedded. Therefore, they are constant for embedding the same message with the same embedding algorithm. For overwriting, the probabilities of the message bits correspond to the transition probabilities. Transitions can solely happen between adjacent shades that only differ in the least significant bits which are replaced with random message bits. Such adjacent shades are referred to as groups in the following.

2.3 Suggested Modification

To preserve the distribution of the shades, we suggest the following approach: First, determine the ideal distribution of the message to be embedded with the given algorithm in a given cover. The distribution is ideal if the message can be embedded without changing the characteristics of the histogram; in fact, even the exact frequencies are preserved in our approach. Second, adjust the message so that it matches the ideal distribution and embed it into the cover. It is assumed that, for practical purposes, a given distribution can be changed to match an arbitrary distribution. Adding the necessary bits and subsequently permuting the message could do this; permutation algorithms can be found in the literature, for example in [12]. Another possibility to change a distribution is described in [17].

The frequencies of the shades after embedding can be described by using the total probability theorem [14]. The present message distribution is given by the probability $q_0$ ($q_1$) for a message bit to be zero (one). In order to preserve the cover histogram, the embedding is modelled as a stationary Markov source. In this special case, the frequencies of the shades after embedding $p'_i$ are equal to the frequencies of the shades before embedding $p_i$, where $p_i$ stands for the probability of shade $g_i$. Using this condition $p'_i = p_i$, we can determine the ideal distribution of the message. As the message is a sequence of zeros and ones, it is sufficient to determine one probability. For example, the required probability $q'_0$ is given by:

$$p'_{2i} = q'_0 \, (p_{2i} + p_{2i+1}), \qquad \text{for } p'_{2i} = p_{2i}: \quad q'_0 = \frac{p_{2i}}{p_{2i} + p_{2i+1}} \tag{1}$$

where $q_0 + q_1 = 1$, $q'_0 + q'_1 = 1$, $\sum_{j=0}^{255} p_j = 1$, $i = 0, \ldots, 127$.
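As an illustration, the required probability $q'_0$ of equation (1) can be computed per group from a histogram of absolute frequencies. This is a minimal sketch with hypothetical values, not the paper's implementation:

```python
# Sketch (illustrative, not the paper's code): computing the ideal message
# distribution q0' for each group of adjacent shades (g_2i, g_2i+1), Eq. (1).

def ideal_q0(histogram):
    """For each usable group (both shades occur), return the required
    probability q0' that a message bit embedded into this group is 0."""
    q0 = {}
    for i in range(len(histogram) // 2):
        n_even, n_odd = histogram[2 * i], histogram[2 * i + 1]
        if n_even > 0 and n_odd > 0:       # group usable only if both shades occur
            q0[2 * i] = n_even / (n_even + n_odd)  # Eq. (1)
    return q0

# Hypothetical histogram: a group with 300 pixels of shade 46 and 100 of 47.
# Overwriting with uniform bits (q0 = 0.5) would equalise both frequencies;
# embedding bits with q0' = 0.75 keeps the histogram unchanged.
hist = [0] * 256
hist[46], hist[47] = 300, 100
hist[48], hist[49] = 350, 250

q = ideal_q0(hist)
print(q[46])  # 0.75
```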

The distribution of the message has to match for each group: If overwriting a pixel of a group with the message bit “0” yields the shade $g_{2i}$, the probability for the message bit to be “0” has to be equal to the probability that a pixel belonging to this group is of shade $g_{2i}$. In other words, the probabilities of the message bits have to be equal to the probabilities of the corresponding shades of the respective group. Moreover, really all pixels belonging to a selected group have to be used for embedding; otherwise the transition probabilities would be modified.

It is also possible to overwrite more than just the LSBs, e.g. to increase the embedding capacity. The modification described above can be extended to maintain the histogram as follows: By overwriting the least significant $b$ bits of the pixels, groups of $m = 2^b$ shades can be transformed into each other. The groups consist of the shades that only differ in the least significant $b$ bits. The message is divided into $n = \mathrm{length}(emb)/b$ blocks $B_i$ of length $b$. For each group, the distribution of these blocks has to match the probability of a pixel to be of the corresponding shade of the respective group:

$$p'_{mi+j} = q'_j \sum_{k=0}^{m-1} p_{mi+k}, \qquad \text{for } p'_{mi+k} = p_{mi+k}: \quad q'_j = \frac{p_{mi+j}}{\sum_{k=0}^{m-1} p_{mi+k}} \tag{2}$$

where $j = 0, \ldots, m-1$, $i = 0, \ldots, 256/m - 1$, $\sum_{j=0}^{m-1} q_j = 1$, $\sum_{j=0}^{m-1} q'_j = 1$, $\sum_{l=0}^{255} p_l = 1$, $q'_j = p(B_j)$.
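The distribution matching mentioned above (adding bits and subsequently permuting, cf. [12]) can be sketched as follows. The seed handling and helper names are hypothetical choices for illustration, not taken from the paper:

```python
# Sketch of one way to realise the distribution matching: append padding
# bits, then apply a keyed permutation so sender and recipient can invert it.
# Assumes the target zero fraction is above the current one (pad with ones
# otherwise).
import random

def match_distribution(bits, q0_target, seed):
    """Append zeros until the fraction of zeros reaches q0_target, then
    permute so the padding is spread over the whole stream."""
    bits = list(bits)
    while bits.count(0) / len(bits) < q0_target:
        bits.append(0)
    rng = random.Random(seed)
    perm = list(range(len(bits)))
    rng.shuffle(perm)                      # keyed permutation, cf. [12]
    return [bits[p] for p in perm], perm

def unmatch(adjusted_bits, perm, original_length):
    """Invert the permutation and drop the padding to recover the message."""
    inv = [0] * len(perm)
    for pos, p in enumerate(perm):
        inv[p] = pos
    ordered = [adjusted_bits[inv[i]] for i in range(len(perm))]
    return ordered[:original_length]

msg = [1, 1, 1, 0]                          # 25 % zeros
adjusted, perm = match_distribution(msg, 0.6, seed=42)
print(adjusted.count(0) / len(adjusted))    # 0.625, i.e. >= 0.6
```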

Figure 1 illustrates the modelling. Theoretically, it would be possible to overwrite all bits of a pixel (assuming an appropriately distributed message). Of course, this would significantly change the image. Without evaluating any further properties, the allowed bit rate per pixel must be subjectively limited by the human user. The histogram of the stego image is identical to the histogram of the cover image; therefore, attacks based on first-order statistics will fail.

2.4 Algorithm

The necessary steps are illustrated in Figure 2. First, the cover histogram is calculated in order to obtain the probabilities of the shades. Afterwards, one looks for usable groups of shades; a group is usable if all of its shades occur in the image. In the next step, the groups are sorted. An embedding table is used for this step. The first three columns of this table are directly derived from the histogram. Groups are referenced by their starting value, which is the first shade of the group. The relative frequency of a group is the sum of the relative frequencies of the shades belonging to it. The capacity specifies the number of bits that can be embedded in the pixels belonging to a certain group; it is the product of the absolute frequency of the group and the number of embedded bits per pixel. Groups can be sorted according to various criteria, for example according to

– the capacity, largest capacity first (as used in the tests), or
– the deviation from normal distribution, least deviation first.
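The preprocessing steps 1-3 can be sketched for $b = 1$ as follows; the data structures are illustrative assumptions, not the paper's implementation:

```python
# Minimal sketch of the preprocessing for b = 1: cover histogram, usable
# groups, and an embedding table sorted by capacity (largest first).
from collections import Counter

def build_embedding_table(pixels):
    hist = Counter(pixels)                      # step 1: cover histogram
    table = []
    for start in range(0, 256, 2):              # step 2: look for usable groups
        n_even, n_odd = hist[start], hist[start + 1]
        if n_even > 0 and n_odd > 0:
            capacity = n_even + n_odd           # bits embeddable (1 bit/pixel)
            q0 = n_even / capacity              # required P(bit = 0), Eq. (1)
            table.append({"start": start, "q0": q0, "capacity": capacity})
    table.sort(key=lambda row: -row["capacity"])  # step 3: sort by capacity
    return table

pixels = [48, 49, 48, 50, 51, 48, 49, 50, 7]    # toy "image"; shade 7 lacks 6
table = build_embedding_table(pixels)
print([row["start"] for row in table])          # [48, 50] -- group 6/7 unusable
```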

Fig. 1. Overwriting the least significant bits: the transition probabilities $q_0, \ldots, q_{m-1}$ map the probabilities $p_i$ of the shades of a group before embedding to the probabilities $p'_i$ after embedding (shown for $b = 1$, $b = 2$, and general $b = \mathrm{ld}\, m = \log_2 m$)

The sorting corresponds to the selection of groups if the embedded message is shorter than the embedding capacity of the cover image; the overall goal is the best utilization of the cover image. After this step, the first three columns of the embedding table are filled. The order of rows corresponds to the order of processing while embedding. In the last preprocessing step, the message is prepared for embedding: It is split into parts which will be embedded in the groups. The distribution of every part has to be adjusted as described above in order to match the intended group; the length of a part is determined by the capacity of this group. These parts are inserted into the last column of the embedding table, subsequently filling up the rows. The last group is padded if necessary. Finally, the message is embedded: The algorithm processes the cover pixel by pixel in a fixed order, and for each pixel belonging to a usable group, the next $b$ bits of the appropriate row of the embedding table overwrite the $b$ least significant bits of the pixel. The resulting stego image is transmitted to the recipient of the message.

Extracting the message is done in a similar manner. The recipient must generate the same embedding table to be able to extract the message; as the frequencies of the shades are not changed, he can do so by using the histogram of the stego image. Sender and recipient must know the sorting criteria of the groups. After generating the embedding table, the stego image has to be processed in exactly the same order as the cover was while embedding. The $b$ least significant bits of pixels belonging to a usable group are sorted into the respective row of the embedding table. Thereafter, the adjustment of the distribution has to be reversed; after this reversal, the rows of the table contain pieces of the original message. In the last step, the message is successively put together from these pieces.

Fig. 2. Overwriting the LSBs – first modification (1. calculate cover histogram; 2. look for usable groups; 3. sort usable groups; 4. split the message into pieces for embedding into the groups and match distributions; 5. process the cover pixel by pixel in a fixed order, embedding in pixels belonging to a usable group)
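The extraction side can be sketched for $b = 1$; since the histogram is preserved, the recipient recomputes the usable groups from the stego image itself. Again a hedged sketch, not the original code:

```python
# Sketch of extraction for b = 1: recompute the usable groups from the stego
# image (its histogram equals the cover's) and collect the LSBs in the fixed
# pixel order. Reversing the distribution adjustment (not shown) would then
# yield the original message pieces.
from collections import Counter

def extract_lsb_rows(stego_pixels):
    hist = Counter(stego_pixels)
    usable = {s for s in range(0, 256, 2)
              if hist[s] > 0 and hist[s + 1] > 0}   # same groups as the sender
    rows = {}                                       # group start -> collected bits
    for pixel in stego_pixels:                      # same fixed order as embedding
        start = pixel & ~1                          # first shade of the pixel's group
        if start in usable:
            rows.setdefault(start, []).append(pixel & 1)
    return rows

stego = [48, 49, 48, 50, 51, 48, 7]                 # toy stego "image"
rows = extract_lsb_rows(stego)
print(rows)  # {48: [0, 1, 0, 0], 50: [0, 1]}
```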

3 Second Modification

3.1 Co-occurrence Matrices

As first-order statistics are not sufficient, we want to have a look at correlations between pixels in order to get more information about the structure of an image. A common means of describing such correlations are co-occurrence matrices. A co-occurrence matrix is a two-dimensional matrix whose entries give the frequency of the co-occurrence of shades at two pixels separated by a fixed distance and direction; relations $(\Delta x, \Delta y)$ describe distance and direction. These are second-order statistics, because relations between two pixels are considered. For the shade $g(x, y)$ of the pixel at position $(x, y)$, the entry $c_{ij}$ of a co-occurrence matrix describes the frequency of pairs of pixels with $(g(x, y) = i) \wedge (g(x + \Delta x, y + \Delta y) = j)$. A separate matrix is generated for each relation.

3.2 Approach to Maintain the Matrix Entries

It is more difficult to describe the effects of embedding on second-order statistics: Instead of looking at a single histogram, various matrices must be considered, and it is not clear from the beginning which relations must be analysed. Therefore, we want to use the co-occurrence matrices only as a means to find shades that occur randomly and independently of each other in the image. If such shades are transformed into each other while embedding, this is neither noticeable nor measurable. That means we are looking for noisy parts of the image that can be replaced by noise, and the analysis of second-order statistics should be used as a means of detecting noise. The random occurrence of the shades is identifiable in the co-occurrence matrices: Two events $A$ and $B$ are called independent if their probabilities satisfy $P(AB) = P(A)P(B)$ [14]. The entries of the matrix correspond to the joint probability, and the probabilities of the events are given by the frequencies of the shades in the histogram. That means the formula is valid if the shades are stochastically independent. This condition can be used to establish rules that will maintain the matrix entries:

1. The shades which can be transformed into each other while embedding must be stochastically independent.
2. The embedding function used has to maintain first-order statistics.

A method to fulfill the second rule was already proposed in the previous chapter, so the remaining task is to find independent shades. The relations to be analysed have to be specified; as it can be expected that neighbouring pixels are most correlated, we concentrate on relations that describe the direct neighbourhood of a pixel, namely $(\Delta x, \Delta y) \in \{(1, 0), (-1, 1), (0, 1), (1, 1)\}$.

3.3 Stochastically Independent Shades

The $\chi^2$-test of independence (e.g., [2]) was used to detect stochastically independent shades. This test belongs to the class of significance tests, which determine the probability that a given result did not occur by chance. The following null hypothesis was tested here: The shades which can be transformed into each other while embedding are stochastically independent. The number of shades for which independence must be tested depends on the embedding operation; in the example examined here, it is given by the groups.

If the least significant bit is used for embedding, two shades can be transformed into each other at a time: the shades $g_{2i}$ and $g_{2i+1}$. With regard to the co-occurrence matrix, the independence of the features “shade of the first pixel” and “shade of the second pixel” is tested. The sample for the test consists of all pairs of pixels belonging to the given relation and having shades $g_{2i}$ and $g_{2i+1}$. A fourfold table can be used in this special case:

                 Feature Y
  Feature X      $g_{2i}$    $g_{2i+1}$
  $g_{2i}$       $h_{11}$    $h_{12}$      $h_{1.}$
  $g_{2i+1}$     $h_{21}$    $h_{22}$      $h_{2.}$
                 $h_{.1}$    $h_{.2}$      $n$

The statistic for the test of the hypothesis,

$$\chi^2_r = \frac{n \, (h_{11} h_{22} - h_{12} h_{21})^2}{h_{1.} \, h_{2.} \, h_{.1} \, h_{.2}},$$

is $\chi^2$-distributed with one degree of freedom. The Yates correction is used for small samples with a size of less than 50. The statistic is calculated for all relations mentioned above. The modified embedding algorithm is extended by the additional analysis described above: after selecting usable groups, they are further restricted to the groups whose shades occur independently of each other.
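A sketch of the independence test for one group and one relation, building the fourfold table from co-occurring pixel pairs and computing the $\chi^2$ statistic with Yates correction; the image data and the simplified bounds handling are illustrative:

```python
# Sketch (illustrative): fourfold chi-square test of independence for the
# group (g, g+1) under one relation (dx, dy), one degree of freedom.

def chi2_fourfold(image, g, dx, dy):
    """image: 2D list of shades; g: even shade, the group is (g, g+1)."""
    h = [[0, 0], [0, 0]]                   # h[first pixel shade][second pixel shade]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < width and 0 <= y2 < height:
                a, b = image[y][x], image[y2][x2]
                if a in (g, g + 1) and b in (g, g + 1):
                    h[a - g][b - g] += 1
    n = sum(map(sum, h))
    r1, r2 = sum(h[0]), sum(h[1])                    # row sums h1., h2.
    c1, c2 = h[0][0] + h[1][0], h[0][1] + h[1][1]    # column sums h.1, h.2
    if min(r1, r2, c1, c2) == 0:
        return 0.0
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    if n < 50:                                       # Yates correction
        return n * max(abs(det) - n / 2, 0) ** 2 / (r1 * r2 * c1 * c2)
    return n * det ** 2 / (r1 * r2 * c1 * c2)

# A strictly alternating pattern of shades 2 and 3 is highly dependent, so
# the statistic exceeds the 5% critical value 3.84 of chi-square with 1 df:
img = [[2, 3, 2, 3], [2, 3, 2, 3], [2, 3, 2, 3], [2, 3, 2, 3]]
print(chi2_fourfold(img, 2, 1, 0) > 3.84)  # True
```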

4 Evaluating the Modifications

4.1 General Testing

Both algorithms were realized in Mathematica®. A set of 50 test images was used for embedding. The maximum amount of data was embedded, whereby the messages were random bit strings of the necessary distribution. To show the improvements of the modifications, embedding was compared to the original method of overwriting the least significant bits. For each of the test images, four stego images were generated:

– stego1: overwriting according to the original method,
– stego2: overwriting according to the first modification, 1 bit per pixel,
– stego3: overwriting according to the first modification, 2 bits per pixel,
– stego4: overwriting according to the second modification, 1 bit per pixel.

The test images are scans of photographs showing natural scenes. They are of different characteristics: Some of them include greater homogenous areas of shades that are possible in the saturation areas of the scanner used, others consist of medium shades only. In the former case, it can be expected to find mainly one shade in such areas [4]. The maximum size of the images was 110 kByte.

The tests show that the restriction to usable groups does not generally restrict the embedding capacity. The histograms of natural images usually cover a continuous range of shades; therefore, the restriction may become noticeable only at the marginal areas of the histograms, and only if the images include large homogenous areas. The same holds for embedding two bits per pixel. The second modification results in stronger restrictions of the capacity, but in general, despite these restrictions, quite an amount of data can be embedded. The possible capacities for each method are compared in Figure 3; the maximal number of bits that can be embedded in a cover is given in percent (all LSBs of the cover correspond to 100%). The covers are grouped into classes of different capacity as described in the figure.

Fig. 3. Embedding capacities of the test images (percentage of test images per capacity class, for the first modification with 1 bit and 2 bits per pixel and for the second modification; classes 1-11 correspond to the capacity intervals [0%-10%), [10%-20%), ..., [80%-90%), [90%-99%), [99%-100%])

4.2 Possible Attacks

Describing the Attacks. The discussion of possible attacks on the steganographic methods is of special interest. To classify the attacks tested here, some relevant items are shortly discussed in the following. Possible attacks can be described with respect to

– the strategy of the attack,
– the analysed features of the stego images, and
– the possible result of the attack.

Attacking Strategies. A first strategy may be to model features of the cover images and to look for deviations from this model. However, it is difficult to describe the “normal” appearance of these features in general. There are different strategies known to approach this problem. One of them is making assumptions about the features of the stego image by modelling the impact of embedding [18]. Another approach is to investigate how the features are changed, either by steganographic modifications [5] or by other operations such as image processing functions [1]: Considering a set of training images, the impact of processing the images on the chosen features is analysed. To answer the question whether an intercepted image contains steganographic data, the same processing is done with this image and the changes of the analysed features are compared to the empirically determined thresholds.

Analysed Features. Attacks concentrate on special features of the suspected image. Regarding statistical analysis, these features can be classified by first-order statistics, second-order statistics and so on. It can be assumed that the force of an attack increases in this order. First-order statistics are used in attacks that analyse histogram modifications [5, 13, 15, 18]. Analysing more than one pixel at a time leads to higher-order statistics; examples are the Laplace filtering as described in [11], the subband filtering used in [3], and the analysis of regular and singular groups in [7].

Possible Result. Of course, it will not be possible to get absolute certainty about the use of steganography, because this would require absolute certainty about the cover images and about the impact of all possible processing. Therefore, at best it can be stated that an image is a stego image with the utmost probability. It depends on the attack what such a statement is based on:

– only the attacker's subjective estimation,
– the comparison to empirically determined thresholds, or
– statistical hypothesis testing.

Both efficiency and reliability of an attack seem to increase in this order. The visual attacks in [18] as well as the subjective estimation of other image features belong to the first category; the attacks in [1] and [5] can be seen as examples of the second category; and the statistical attacks in [15, 18] belong to the third category.

Examined Attacks.
As the suggested modifications maintain the first-order statistics of the cover images, it is not necessary to test attacks based on histogram analysis; we can focus on attacks based on higher-order statistics. The strategy used here is to make assumptions about the features of the stego images and to base the final decision on hypothesis testing. The starting point is the Laplace filtering described in [11]. The images are filtered with the following Laplace operator, which regards the four adjacent pixels:

         (  0  -1   0 )
    H =  ( -1   4  -1 )
         (  0  -1   0 )

The results of filtering two of the test images are shown in Figure 4. Because adjacent pixels are assumed to be of similar colours, the maximum of the resulting filter histogram is expected to be at zero.

Fig. 4. Results of Laplace filtering; cover on the left, stego1 on the right (panels: an image with large homogeneous areas and an image with medium shades)

Overwriting the least significant bits changes this characteristic: the LSBs become stochastically independent, which can be noticed in the histogram. However, the result depends on the structure of the image; the impact of overwriting cannot be noticed when the LSBs of the cover image already look noisy. To work as efficiently as possible, a hypothesis test was used. The results of Laplace filtering cannot be estimated in general because they depend on the image. However, regarding only the LSBs simplifies the situation: at least the impact of overwriting all LSBs according to the original method can be modelled. After overwriting all LSBs with a random bit sequence, as done by the original method, each bit is zero or one with probability 0.5. Filtering only the LSBs of such a stego image therefore yields a filter histogram whose frequencies can be derived from the probabilities of the LSBs. Possible results of this operation are values in the range -4, ..., 4. Only the number of zeros n_0 and ones n_1 among the adjacent bits has to be regarded when determining the number of combinations that yield a certain result. The number of possible combinations of the adjacent pixels n_m(n_0, n_1) is given by the permutation P_n^m of n elements in m groups with k_1, k_2, ..., k_m equal elements [8]:

    n_m(n_0, n_1) = P_n^m = n! / (k_1! · k_2! · ... · k_m!) = 4! / (n_0! · n_1!)

The probability P(res) of a filter result res is then simply given by (with n = 4 adjacent pixels plus the middle pixel):

    P(res) = n_m(n_0, n_1) · 0.5^(n+1)
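As an illustration of these two formulas, the following short Python sketch (not from the paper; the function name is my own) recomputes the filter-histogram probabilities by enumerating the middle bit m and the number of ones n_1 among the four adjacent bits; on the bit plane, the Laplace response is simply 4·m - n_1:

```python
from math import comb

def expected_filter_distribution():
    """Expected Laplace filter histogram for an i.i.d. uniform bit plane.

    For each result res = 4*m - n1 (middle bit m, n1 ones among the four
    neighbours) there are comb(4, n1) = 4!/(n0!*n1!) neighbour combinations,
    and each of the 2**5 equally likely 5-bit patterns has probability 0.5**5.
    """
    dist = {}
    for m in (0, 1):
        for n1 in range(5):
            res = 4 * m - n1
            dist[res] = dist.get(res, 0.0) + comb(4, n1) * 0.5 ** 5
    return dist

dist = expected_filter_distribution()
# e.g. dist[-4] == 0.03125; dist[0] == 0.0625 (two contributing combinations)
```

The probabilities sum to one and reproduce the values of Table 1, including the doubled entry at res = 0, which is reached both by a 0 pixel with four 0 neighbours and by a 1 pixel with four 1 neighbours.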

Table 1. Expected distribution after filtering the LSBs with the Laplace operator

    Middle pixel   Adjacent pixels   Result   Number   Probability
    0              4x1               -4       1        0.03125
    0              3x1, 1x0          -3       4        0.125
    0              2x1, 2x0          -2       6        0.1875
    0              1x1, 3x0          -1       4        0.125
    0 / 1          4x0 / 4x1          0       2        0.0625
    1              3x1, 1x0           1       4        0.125
    1              2x1, 2x0           2       6        0.1875
    1              1x1, 3x0           3       4        0.125
    1              4x0                4       1        0.03125

This way, the theoretically expected distribution of the filter histogram for filtering the LSBs of a stego image as stated above can be determined (Table 1).

Summarizing, to discriminate between cover and stego images, the least significant bit plane is filtered with the Laplace operator. The χ2-test of homogeneity [2] is used to test the null hypothesis that the distribution of this result matches the expected distribution after overwriting the LSBs (Table 1). If the hypothesis is rejected, it is concluded that the LSBs of the analysed image do not contain a secret message in the form of a random bit string. The significance level α of the test determines the critical value that decides whether or not the hypothesis is rejected; it is therefore the probability of making a type I error, that is, of wrongly rejecting the null hypothesis. In the tests examined here, two significance levels were used: α = 0.05 and α = 0.01. Accordingly, three possible regions for the results can be defined:

– the hypothesis is not rejected,
– the hypothesis is still accepted for α = 0.01, but rejected for α = 0.05, and
– the hypothesis is rejected even for α = 0.01.

If the hypothesis is not rejected, the LSBs may contain a secret message. Other possible reasons are that the LSBs of the image match the expected distribution anyway, or that a type II error occurred; this kind of error cannot be controlled by a significance test.

This attack has a high probability of detecting the original embedding method. The first modification only considers first-order statistics and is not constructed to resist such an analysis. However, detection becomes more difficult because the distribution of the message was changed, so the embedded stream is not uniformly distributed. The best resistance can be expected from the second modification, which respects second-order statistics. Of course, an attack has to be tailored to the specific embedding method; a second analysis was done to illustrate one possibility.
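The decision procedure described above can be sketched as follows. This is an illustrative reimplementation, not the author's code: it filters an LSB plane with the Laplace operator (interior pixels only), compares the resulting histogram to the expected distribution of Table 1 with a χ2 goodness-of-fit statistic, and contrasts a random bit plane with a perfectly flat, i.e. strongly structured, one. The critical value for 8 degrees of freedom at α = 0.05 is about 15.51; the image size and variable names below are my own choices.

```python
import random
from math import comb

def expected_probs():
    # Expected filter-result probabilities for i.i.d. uniform bits (Table 1).
    p = {}
    for m in (0, 1):
        for n1 in range(5):
            res = 4 * m - n1
            p[res] = p.get(res, 0.0) + comb(4, n1) * 0.5 ** 5
    return p

def laplace_chi_square(bits):
    """Chi-square statistic of the LSB-plane filter histogram vs. Table 1.

    Note: neighbouring filter responses overlap, so the statistic is only an
    approximation to a true chi-square variable, as in the attack itself.
    """
    h, w = len(bits), len(bits[0])
    counts = {r: 0 for r in range(-4, 5)}
    for y in range(1, h - 1):            # interior pixels only
        for x in range(1, w - 1):
            res = 4 * bits[y][x] - (bits[y - 1][x] + bits[y + 1][x]
                                    + bits[y][x - 1] + bits[y][x + 1])
            counts[res] += 1
    n = sum(counts.values())
    p = expected_probs()
    return sum((counts[r] - n * p[r]) ** 2 / (n * p[r]) for r in counts)

random.seed(1)
size = 64
noisy = [[random.randint(0, 1) for _ in range(size)] for _ in range(size)]
flat = [[0] * size for _ in range(size)]  # strongly structured LSB plane

chi_noisy = laplace_chi_square(noisy)  # small: hypothesis not rejected
chi_flat = laplace_chi_square(flat)    # far above 15.51: clearly rejected
```

The random plane behaves like an image whose LSBs were overwritten with a random message, so its statistic stays near the expected value of 8; the flat plane concentrates all filter results at zero and is rejected by a wide margin.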
The object of this second analysis is the independence of shades. The embedding process of the first modification does not consider correlations between pixels: the distribution of the embedded stream is modified, but it is still a random sequence of bits. It can therefore be expected that the embedding will destroy existing correlations, so the number of independent shades should increase after embedding. In contrast, the number of independent shades will not increase when using the second modification, because only independent shades are used for embedding. This is just one possibility for a more sophisticated attack tailored to the modifications suggested here; another possibility would be testing randomness especially for the possible groups.

Fig. 5. Result of the Laplace attack for images with non-noise LSBs (null hypothesis: the distribution of the LSBs matches the distribution after overwriting with a random message; for cover and stego1 to stego4, the bars show whether the hypothesis is not rejected, rejected for α = 0.05, or rejected for α = 0.01)

Results of the Attacks. The Laplace filtering tests whether the LSBs of an image are uniformly distributed. The evaluation shows the advantage of using hypothesis testing: the test is able to detect deviations from the uniform distribution that are imperceptible to the human eye. Such deviations are, in fact, structures in the corresponding bit plane. Of course, even the LSBs of a cover image can look noisy and may not contain any structure at all. The test identified structures in 33 of the 50 test images (with the naked eye, structures were recognized in only 17 images). For the images without any measurable structures, the test result will not change for the analysis of the stego images. The 33 images with measurable structures in the LSBs are of greater interest. The assumptions made above are corroborated by the tests (Figure 5): embedding without any modification was detectable by the test for every test image, the first modification makes the recognition more difficult, and finally, the stego images generated with the second modification yield the same test results as the cover images.

Fig. 6. Ratio of independent shades (in percent) for cover and stego1 to stego4; left: images with larger homogeneous areas, right: images with medium shades

The second assumption, concerning the number of independent shades, was also supported by the tests (Figure 6). Randomizing the least significant bit plane(s) has little effect on the number of independent shades for noisy images (right-hand side of the figure). In contrast, this number changes significantly if there are recognizable structures in the least significant bit plane of the cover (left-hand side of the figure). However, using the second modification (stego4) largely avoids these changes. Further evaluations are left to future work.

5  Summary and Outlook

In this paper we investigated possibilities that aim to preserve statistical properties while overwriting the LSBs of an image. For first-order statistics, the embedding was modelled as a stationary Markov source. For second-order statistics, an additional analysis of the cover image was suggested to improve the selection of random image parts. The modifications use groups of shades for embedding. Another approach, which uses pairs of colours, is described in [16]: two colours form a pair and can be used for embedding if the difference between them is less than the noise of the host, and if they occur with about the same statistical frequency. The approach suggested here can be considered a generalization of that approach.

The impact of the modifications is illustrated in Figure 7. Overwriting all LSBs with a random string destroys structures in the least significant bit plane; our modifications clearly improve this result by maintaining existing structures.

Fig. 7. Results of the suggested modifications: the cover "bastei.pgm" and the least significant bit planes of the cover and of stego1 to stego4

Tests have shown the usability of the suggested modifications. At worst, the algorithm reduces the embedding capacity of improper images; in spite of the necessary restrictions, a considerable amount of data can still be embedded. Analysing the LSBs of stego images generated according to the second modification yielded the same results as analysing the LSBs of the corresponding cover images. As can be expected, the test results depend on the images: only overwriting non-noise LSBs is detectable at all. Therefore, images with a perceptible structure in the LSBs are not suited for embedding according to the original method. However, one can nevertheless use them as covers by preserving existing structures, as proposed here.

Future work has to be done to improve the statistical analyses and to use their results to improve the embedding. The investigations were done for overwriting the LSBs of images in the spatial domain; for practical reasons, it is important to investigate the applicability of the modifications to other embedding operations. And finally, as mentioned above, the analyses for validating the results have to be improved as well. This will also include researching the applicability of attacks known from the literature. In [6] it was pointed out that the use of cover images previously stored in the JPEG format should be avoided; the JPEG compatibility test described in that article was not yet tested for the algorithms suggested here. Especially the RS analysis of [7] seems to be of great interest. Some first tests were done to determine the R and S groups for the cover and stego images produced here, but much more evaluation is necessary to make concrete statements.

Finally, the author would like to thank the reviewers for their helpful comments.


References

[1] Ismail Avcibas, Nasir Memon, Bülent Sankur: Steganalysis of Watermarking Techniques using Image Quality Metrics. In: Ping Wah Wong, Edward J. Delp (Eds.): Security and Watermarking of Multimedia Contents III. Proceedings of SPIE Vol. 4314, 2001, 523-531.
[2] Wilfried J. Dixon, Frank J. Massey: Introduction to Statistical Analysis. McGraw-Hill Book Company, Inc., New York, 1957.
[3] Hany Farid: Detecting Steganographic Messages in Digital Images. http://www.cs.dartmouth.edu/~farid/publications/tr01.html
[4] Elke Franz, Andreas Pfitzmann: Steganography Secure Against Cover-Stego-Attacks. In: Andreas Pfitzmann (Ed.): Information Hiding. Third International Workshop, IH'99, Dresden, Germany, September/October 1999, Proceedings, Springer, LNCS 1768, 2000, 29-46.
[5] Jessica Fridrich, Rui Du, Meng Long: Steganalysis of LSB Encoding in Color Images. ICME 2000, New York City, July 31 - August 2, 2000, USA. http://www.ssie.binghampton.edu/fridrich/publications.html
[6] Jessica Fridrich, Miroslav Goljan, Rui Du: Steganalysis Based on JPEG Compatibility. SPIE Multimedia Systems and Applications IV, Denver, CO, August 20-24, 2001.
[7] Jessica Fridrich, Miroslav Goljan, Rui Du: Reliable Detection of LSB Steganography in Color and Grayscale Images. Proc. of the ACM Workshop on Multimedia and Security, Ottawa, CA, October 5, 2001, 27-30.
[8] Wilhelm Göhler: Höhere Mathematik: Formeln und Hinweise. Bearb. von Barbara Ralle, 10. überarb. Auflage, VEB Deutscher Verlag für Grundstoffindustrie, Leipzig, 1987.
[9] Solomon W. Golomb, Robert E. Peile, Robert A. Scholtz: Basic Concepts in Information Theory and Coding. Plenum Press, New York, 1994.
[10] Neil F. Johnson, Sushil Jajodia: Steganalysis of Images Created Using Current Steganography Software. In: David Aucsmith (Ed.): Information Hiding. Second International Workshop, IH'98, Portland, Oregon, USA, April 1998, Proceedings, Springer, LNCS 1525, 1998, 273-289.
[11] Stefan Katzenbeisser, Fabien A. P. Petitcolas (Eds.): Information Hiding Techniques for Steganography and Digital Watermarking. Artech House, 2000.
[12] Donald E. Knuth: The Art of Computer Programming. Volume 2: Seminumerical Algorithms. Addison-Wesley, 3rd Ed., 1998.
[13] Maurice Maes: Twin Peaks: The Histogram Attack to Fixed Depth Image Watermarks. In: David Aucsmith (Ed.): Information Hiding. Second International Workshop, IH'98, Portland, Oregon, USA, April 1998, Proceedings, Springer, LNCS 1525, 1998, 290-305.
[14] A. Papoulis: Probability, Random Variables, and Stochastic Processes. 2nd Ed., McGraw-Hill, New York, 1984.
[15] Niels Provos: Defending Against Statistical Steganalysis. 10th USENIX Security Symposium, August 2001. http://www.citi.umich.edu/u/provos/stego/
[16] Maxwell T. Sandford, Jonathan N. Bradley, Theodore G. Handel: The Data Embedding Method. In: Proc. of the SPIE Photonics East Conference, Philadelphia, September 1995.
[17] Peter Wayner: Mimic Functions. Technical Report, Cornell University, Department of Computer Science, 1990.
[18] Andreas Westfeld, Andreas Pfitzmann: Attacks on Steganographic Systems. In: Andreas Pfitzmann (Ed.): Information Hiding. Third International Workshop, IH'99, Dresden, Germany, September/October 1999, Proceedings, Springer, LNCS 1768, 2000, 61-76.
[19] Andreas Westfeld: F5 - A Steganographic Algorithm: High Capacity Despite Better Steganalysis. In: Ira S. Moscowitz (Ed.): Information Hiding. 4th International Workshop, IH'01, Pittsburgh, PA, USA, April 2001, Proceedings, Springer, LNCS 2137, 2001, 289-302.
