DEPENDENCE ANALYSIS
A Book Series On
LOOP TRANSFORMATIONS FOR RESTRUCTURING COMPILERS
Utpal Banerjee
Series Titles:
Loop Transformations for Restructuring Compilers: The Foundations
Loop Parallelization
Dependence Analysis
DEPENDENCE ANALYSIS
Utpal Banerjee Intel Corporation
A Book Series on
Loop Transformations for Restructuring Compilers
Kluwer Academic Publishers
Boston / Dordrecht / London
Distributors for North America: Kluwer Academic Publishers 101 Philip Drive Assinippi Park Norwell, Massachusetts 02061 USA Distributors for all other countries: Kluwer Academic Publishers Group Distribution Centre Post Office Box 322 3300 AH Dordrecht, THE NETHERLANDS
Library of Congress Cataloging-in-Publication Data A C.I.P. Catalogue record for this book is available from the Library of Congress.
Copyright © 1997 by Kluwer Academic Publishers All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher, Kluwer Academic Publishers, 101 Philip Drive, Assinippi Park, Norwell, Massachusetts 02061 Printed on acid-free paper.
Printed in the United States of America
To my parents:
Late Santosh Kumar Banerjee
Santi Rani Banerjee
Contents

Preface
Acknowledgments

1 Introduction

2 Single Loops
  2.1 Introduction
  2.2 Index and Iteration Spaces
  2.3 Dependence Concepts
  2.4 Dependence Problem
  2.5 Solution to the Linear Problem
  2.6 Method of Bounds

3 Double Loops
  3.1 Introduction
  3.2 Index and Iteration Spaces
  3.3 Dependence Concepts
  3.4 Dependence Problems

4 Perfect Loop Nests
  4.1 Introduction
  4.2 Index and Iteration Spaces
  4.3 Dependence Concepts
  4.4 Subscript Representation
  4.5 Dependence Problem
  4.6 Special Cases

5 General Program
  5.1 Introduction
  5.2 Dependence Concepts
  5.3 Dependence Problem
  5.4 Generalized gcd Test

6 Method of Bounds
  6.1 Introduction
  6.2 Perfect Nest, One-Dimensional Array
  6.3 Rectangular Loops, One-Dimensional Array
  6.4 Rectangular Loops, Multi-Dimensional Array
  6.5 General Method

7 Method of Elimination
  7.1 Introduction
  7.2 Dependence Testing by Elimination
  7.3 Two-variable Problems
  7.4 Other Methods

8 Conclusions

A Linear Equations on Polytopes
  A.1 Introduction
  A.2 Polytopes
  A.3 Real Solutions to a Single Equation
  A.4 Lagrangean Relaxation
  A.5 Real Solutions to a System of Equations
  A.6 Integer Solutions to Linear Equations

Bibliography
Index
List of Figures

2.1 Statement dependence graph for Example 2.2
2.2 The triangle P of Theorem 2.11
2.3 Loop nest of Example 2.7 after unrolling
3.1 Index and iteration spaces for Example 3.1

List of Tables

2.1 Steps of Algorithm 2.1 for a = 21 and b = 34
3.1 Index values for loops of Example 3.1
3.2 Iteration values for loops of Example 3.1
3.3 Some iterations of (L1, L2) in Example 3.2
4.1 Loop nest of Example 4.2 after unrolling
4.2 Dependence structure of (L1, L2, L3) in Example 4.2
5.1 Three statement instances for Example 5.1
5.2 Parameters for the Dependence Problem
List of Notations

In the following, $i = (i_1, i_2, \ldots, i_m)$ and $j = (j_1, j_2, \ldots, j_m)$ are $m$-vectors (integer or real), and $1 \le$ [...] $\le m$.
a/b R Rm Z Zm
[...] $-3\hat\imath \ge -110$, so that $0 \le \hat\imath \le 110/3$. Here, the largest iteration value is $\lfloor 110/3 \rfloor = 36$, and the iteration points of the loop are $0, 1, 2, \ldots, 36$. The iteration space is the set of these 37 integers. The index points of $L$ are found from (2.5) by plugging in the values of $\hat\imath$:

$$120,\; 120 - 3(1),\; 120 - 3(2),\; \ldots,\; 120 - 3(36).$$

Thus, the index space is the set $\{120, 117, 114, \ldots, 12\}$.
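The computation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `iteration_and_index_space` is ours, and the final limit q = 10 is inferred from the inequality $-3\hat\imath \ge -110$, since the loop header itself does not appear in this passage.

```python
def iteration_and_index_space(p, q, theta):
    """Iteration points and index points of the loop `do I = p, q, theta`.

    Index and iteration values are related by I = p + (iteration value) * theta.
    """
    if theta == 0:
        raise ValueError("the stride must be nonzero")
    # Floor division handles both signs of theta; a negative count
    # means the loop fails to execute, so it is clamped to zero.
    count = max(0, (q - p) // theta + 1)
    iterations = list(range(count))
    indices = [p + k * theta for k in iterations]
    return iterations, indices


# Assumed header for the example: do I = 120, 10, -3
iters, idx = iteration_and_index_space(120, 10, -3)
print(len(iters), iters[-1])    # 37 36
print(idx[:3], idx[-1])         # [120, 117, 114] 12
```

The same function returns two empty lists when the loop does not execute, for instance for p = 10, q = -13, and a stride of 3.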
EXERCISES 2.2

1. What are the smallest and the largest values of $I$ in the model loop $L$ of this section?

2. Find a closed-form expression for the number of iterations of $L$ that holds for all values of $p$, $q$, and $\theta$ (including cases where the loop fails to execute).

3. Find the index and iteration spaces of the loop $L$, when
   (a) $p = 0$, $q = 9$, $\theta = 1$;
   (b) $p = 17$, $q = 39$, $\theta = 5$;
   (c) $p = -15$, $q = 20$, $\theta = 2$;
   (d) $p = 10$, $q = -13$, $\theta = -3$.

4. Find the number of iterations of $L$ and the value of $I$ in the last iteration, when
   (a) $p = 0$, $q = 1000$, $\theta = 1$;
   (b) $p = -17$, $q = 390$, $\theta = 4$;
   (c) $p = -15$, $q = -200$, $\theta = 3$;
   (d) $p = 1001$, $q = -137$, $\theta = -7$.
2.3 Dependence Concepts

Consider two, not necessarily distinct, assignment statements $S$ and $T$ in our model loop $L$:

    L:  do I = p, q, θ
          H(I)
        enddo

If $S$ lexically precedes $T$ in the program, we write $S < T$. The meaning of the notation $S \le T$ is then clear, and it is obvious that $\le$ is a total order in the set of assignment statements in the program.

An index point (i.e., a value of the index variable $I$) determines an instance of each assignment statement in the loop. Since, by hypothesis, there are no conditionals in the program, all such instances are executed. Let $S(i)$ denote the instance of statement $S$ determined by an index point $i$, and $T(j)$ the instance of statement $T$ determined by an index point $j$. The distance from $S(i)$ to $T(j)$ is defined to be the integer $(\hat\jmath - \hat\imath)$, where $\hat\imath$ and $\hat\jmath$ are the iteration points corresponding to $i$ and $j$, respectively.¹ We leave to the reader the proof of the following result.

Lemma 2.4 Consider two assignment statements $S$ and $T$ in the single loop $L$. Let $d$ denote the distance from an instance $S(i)$ of $S$ to an instance $T(j)$ of $T$. In the sequential execution of the program, $S(i)$ is executed before $T(j)$ iff either $d > 0$, or $d = 0$ and $S < T$.

The concept of dependence can be introduced in many different contexts. We define dependence first between statement instances, and then between statements. An instance $T(j)$ of a statement $T$ depends on an instance $S(i)$ of a statement $S$, if there exists a memory location $\mathcal{M}$ such that

1. Both $S(i)$ and $T(j)$ reference (read or write) $\mathcal{M}$;
2. $S(i)$ is executed before $T(j)$ in the sequential execution of the program;
3. During sequential execution, the location $\mathcal{M}$ is not written in the time period from the end of execution of $S(i)$ to the beginning of execution of $T(j)$.

The statements $S$ and $T$ need not be distinct, but Condition 2 requires that the instances $S(i)$ and $T(j)$ be distinct. Let $\hat\imath$ and $\hat\jmath$ denote the iteration values corresponding to $i$ and $j$, respectively. This dependence is loop-carried if $\hat\imath < \hat\jmath$; it is loop-independent if $\hat\imath = \hat\jmath$ (i.e., if $i = j$). In the loop-independent case, we necessarily have $S < T$ since $S(i)$ is to be executed before $T(i)$.

Since a memory reference is either a "read" or a "write," a pair of statement instances can reference the same memory location in four different ways. This leads to four different types of dependence:
1. $T(j)$ is flow dependent on $S(i)$, if $S(i)$ writes $\mathcal{M}$ and $T(j)$ reads it (the value computed by $S(i)$ is used by $T(j)$);
2. $T(j)$ is anti-dependent on $S(i)$, if $S(i)$ reads $\mathcal{M}$ and $T(j)$ writes it ($S(i)$ uses the value in $\mathcal{M}$ before it is changed by $T(j)$);
3. $T(j)$ is output dependent on $S(i)$, if $S(i)$ and $T(j)$ both write $\mathcal{M}$ (the value computed by $T(j)$ is stored after the value computed by $S(i)$ is stored);
4. $T(j)$ is input dependent on $S(i)$, if both $S(i)$ and $T(j)$ read $\mathcal{M}$ (the "read" by $T(j)$ comes after the "read" by $S(i)$).

¹ See [Pugh 92b] for concerns expressed about the proper definition of dependence distance, and [Wolf 94] for related comments.

Now, we consider dependence between two statements. A statement $T$ depends on a statement $S$, if there is at least one instance $S(i)$ of $S$ and one instance $T(j)$ of $T$, such that $T(j)$ depends on $S(i)$. We can be more specific. For example, $T$ is flow dependent on $S$ if there is an instance pair $(S(i), T(j))$ such that $T(j)$ is flow dependent on $S(i)$. One may similarly define what is meant by $T$ is anti-dependent, output dependent, or input dependent on $S$.

The dependence of $T$ on $S$ can be formally defined to be the set of all instance pairs $(S(i), T(j))$ such that $T(j)$ depends on $S(i)$. Thus, $T$ depends on $S$ iff the dependence of $T$ on $S$ is nonempty. Sometimes, it is convenient to say that there is dependence between $S$ and $T$ if either $T$ depends on $S$, or $S$ on $T$ (or both).

The dependence of $T$ on $S$ can be separated into two parts: the loop-carried part, consisting of all instance pairs $(S(i), T(j))$ such that $T(j)$ depends on $S(i)$ and $\hat\imath < \hat\jmath$; and the loop-independent part, consisting of all instance pairs $(S(i), T(i))$ such that $T(i)$ depends on $S(i)$. The loop-independent part is necessarily empty if $T \le S$.

We can also break up the dependence of $T$ on $S$ by the type of dependence between a pair of instances. This way we get four subsets:

• The flow dependence of $T$ on $S$ consists of all instance pairs $(S(i), T(j))$ such that $T(j)$ is flow dependent on $S(i)$;
• The anti-dependence of $T$ on $S$ consists of all instance pairs $(S(i), T(j))$ such that $T(j)$ is anti-dependent on $S(i)$;
• The output dependence of $T$ on $S$ consists of all instance pairs $(S(i), T(j))$ such that $T(j)$ is output dependent on $S(i)$;
• The input dependence of $T$ on $S$ consists of all instance pairs $(S(i), T(j))$ such that $T(j)$ is input dependent on $S(i)$.
These subsets are not necessarily pairwise disjoint (e.g., $T(j)$ may be both flow dependent and anti-dependent on $S(i)$).

The dependence between two statements can be described in terms of the program variables² involved. (See Section I.4.3.) The instances of the output variable of a statement $S$ determine the memory locations written by the instances of $S$. On the other hand, the instances of the input variables of $S$ determine the memory locations read by the instances of $S$. Let $u$ denote a variable of statement $S$ and $v$ a variable of statement $T$. The pair $(u, v)$ causes a dependence of $T$ on $S$, if there are statement instances $S(i)$ and $T(j)$, such that $T(j)$ depends on $S(i)$, and the corresponding memory location $\mathcal{M}$ is represented by both the instance of $u$ for $I = i$ and the instance of $v$ for $I = j$. If $u$ is the output variable of $S$ and $v$ an input variable of $T$, then $(u, v)$ can cause a flow dependence of $T$ on $S$, or an anti-dependence of $S$ on $T$, or both. The output variables of $S$ and $T$ can cause an output dependence of $T$ on $S$, and/or of $S$ on $T$. Two input variables of the statements can cause only an input dependence.

A distance for the dependence of $T$ on $S$ is the distance $(\hat\jmath - \hat\imath)$ from an instance $S(i)$ of $S$ to an instance $T(j)$ of $T$, where $T(j)$ depends on $S(i)$. A given dependence has at least one, but usually many distances. Since $S(i)$ must be executed before $T(j)$ by the definition of dependence, we have the following result as a direct consequence of Lemma 2.4.

Theorem 2.5 If a statement $T$ depends on a statement $S$ in the loop $L$, then each dependence distance satisfies $d \ge 0$. The equality $d = 0$ is possible only if $S < T$.

The relation of dependence between statements is denoted by $\delta$, and we write $S\,\delta\,T$ to indicate that $T$ depends on $S$.³ The statement dependence graph of the given program is the directed graph that represents the relation $\delta$. (In other words, it is the graph of $\delta$; see Section 1.1.4.) We denote flow dependence, anti-dependence, output dependence, and input dependence by the symbols $\delta^f$, $\delta^a$, $\delta^o$, and $\delta^i$, respectively. (Thus, for example, $S\,\delta^f\,T$ means $T$ is flow dependent on $S$.) These relations may overlap, and they constitute the main relation of dependence: $\delta = \delta^f \cup \delta^a \cup \delta^o \cup \delta^i$. From this point, dependence will not include input dependence, unless otherwise stated.

² We often use the term "variable" somewhat loosely, although the exact meaning should always be clear from the context. In the current context, for "program variables," we may take the output variable X(I) of S and the input variable X(I-2) of T in Example 2.2 below.
³ Often, $\delta$ is used as a generic symbol for dependence. For example, one may write $S(i)\,\delta\,T(j)$ to denote dependence between statement instances.

The transitive closure $\bar\delta$ of the relation $\delta$ is the relation of indirect dependence. Thus, a statement $T$ is indirectly dependent on a statement $S$ if there is a (directed) path from $S$ to $T$ in the statement dependence graph. In symbols, we have $S\,\bar\delta\,T$ if there is a nonempty sequence of statements $S_1, S_2, \ldots, S_N$ such that

$$S = S_1,\quad S_1\,\delta\,S_2,\ \ldots,\ S_{N-1}\,\delta\,S_N,\quad S_N = T.$$
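As an illustration of indirect dependence, the following Python sketch computes the transitive closure of a dependence relation stored as a set of ordered pairs. The function name `indirect_dependence` and the three-statement relation are hypothetical and chosen only for the demonstration; a pair (A, B) stands for "B depends on A".

```python
def indirect_dependence(delta):
    """Transitive closure of a relation given as a set of pairs (A, B)."""
    closure = set(delta)
    changed = True
    while changed:
        changed = False
        # Join every pair (a, b) with every pair (b, d) until nothing new appears.
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical program: T depends on S, and U depends on T.
delta = {("S", "T"), ("T", "U")}
closure = indirect_dependence(delta)
print(("S", "U") in closure)   # True: U depends on S indirectly
print(("U", "S") in closure)   # False: there is no path from U to S
```

The repeated join is quadratic per pass and is meant for tiny graphs such as statement dependence graphs of short loop bodies; a graph search from each node would scale better.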
Indirect dependence between statement instances is defined similarly.

Example 2.2 Consider the loop $L$:

    L:  do I = 4, 100, 2
    S:    X(I) = X(I) + 1
    T:    Y(2I - 4) = X(I) + X(I + 1) + X(I + 2) + X(I - 2)
    U:    Y(I) = Z(I)
        enddo

The index values of the loop are 4, 6, 8, ..., 100, and the corresponding iteration values are 0, 1, 2, ..., 48. The first three iterations of $L$ along with the last one are shown below:

    S(4):    X(4) = X(4) + 1
    T(4):    Y(4) = X(4) + X(5) + X(6) + X(2)
    U(4):    Y(4) = Z(4)

    S(6):    X(6) = X(6) + 1
    T(6):    Y(8) = X(6) + X(7) + X(8) + X(4)
    U(6):    Y(6) = Z(6)

    S(8):    X(8) = X(8) + 1
    T(8):    Y(12) = X(8) + X(9) + X(10) + X(6)
    U(8):    Y(8) = Z(8)

      ...

    S(100):  X(100) = X(100) + 1
    T(100):  Y(196) = X(100) + X(101) + X(102) + X(98)
    U(100):  Y(100) = Z(100)
Figure 2.1: Statement dependence graph for Example 2.2.

(Statement instances are always labeled by index values.) From this pattern, it is easy to figure out the dependence structure of the program. Note the following facts:

1. $S$ does not depend on itself.

2. Fix the output variable X(I) of $S$ and take the input variables of $T$, one at a time, in the order of their appearance. We can make these observations:
   (a) X(I) of $S$ and X(I) of $T$ cause a flow dependence of $T$ on $S$. This dependence is loop-independent; it has no loop-carried part. The only dependence distance is 0.
   (b) X(I) of $S$ and X(I + 1) of $T$ do not cause a dependence between $S$ and $T$.
   (c) X(I) of $S$ and X(I + 2) of $T$ cause an anti-dependence of $S$ on $T$. This dependence is loop-carried; it has no loop-independent part. The only dependence distance is 1.
   (d) X(I) of $S$ and X(I - 2) of $T$ cause a flow dependence of $T$ on $S$ that is also loop-carried, without any loop-independent part. Again, the only dependence distance is 1.

3. There is an output dependence of $U$ on $T$. It has a loop-carried part and a loop-independent part. There are several dependence distances: 0, 1, 2, ....

4. $U$ does not depend on $S$, but $U$ depends on $S$ indirectly.
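The facts listed above can be checked mechanically. The Python sketch below replays the loop of Example 2.2 instance by instance and applies the three conditions in the definition of dependence directly. It is a brute-force illustration, not the method developed in this book; the helper `accesses` and the dictionary layout are our own choices, and input dependences are ignored, as in the text.

```python
from collections import defaultdict


def accesses(stmt, i):
    """Locations read and written by statement instance stmt(i) in Example 2.2."""
    if stmt == "S":
        return {("X", i)}, {("X", i)}
    if stmt == "T":
        reads = {("X", i), ("X", i + 1), ("X", i + 2), ("X", i - 2)}
        return reads, {("Y", 2 * i - 4)}
    return {("Z", i)}, {("Y", i)}          # statement U


# Sequential execution order: (statement, index value, iteration value).
trace = [(s, i, k) for k, i in enumerate(range(4, 101, 2)) for s in "STU"]

# For every memory location, the instances that touch it, in execution order.
touch = defaultdict(list)
for s, i, k in trace:
    reads, writes = accesses(s, i)
    for loc in reads | writes:
        touch[loc].append((s, k, loc in reads, loc in writes))

# (type, source statement, dependent statement) -> set of distances
deps = defaultdict(set)
for insts in touch.values():
    for a in range(len(insts)):
        for b in range(a + 1, len(insts)):
            # Condition 3: no instance strictly in between writes the location.
            if any(w for (_, _, _, w) in insts[a + 1:b]):
                break
            s1, k1, r1, w1 = insts[a]
            s2, k2, r2, w2 = insts[b]
            if w1 and r2:
                deps[("flow", s1, s2)].add(k2 - k1)
            if r1 and w2:
                deps[("anti", s1, s2)].add(k2 - k1)
            if w1 and w2:
                deps[("output", s1, s2)].add(k2 - k1)

for key in sorted(deps):
    print(key, sorted(deps[key]))
```

Distances are computed from iteration values, as the definition requires. Running the sketch reports a flow dependence of T on S with distances 0 and 1, an anti-dependence of S on T with distance 1, and an output dependence of U on T with distances 0 through 24.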
The statement dependence graph for the program is given in Figure 2.1. Strictly speaking, this figure shows the superposition of the graphs of the relations $\delta^f$, $\delta^a$, $\delta^o$. (We have ignored input dependences between statements, and will usually do so in future examples.) Note the different markings on the edges: the triangle represents flow dependence, the dash anti-dependence, and the circle output dependence. If we want to show more detailed dependence information, then a more detailed graph will be necessary. For example, to emphasize that there are two pairs of program variables each of which causes a flow dependence of $T$ on $S$, we would create a graph with two flow dependence edges from $S$ to $T$, each labeled with the appropriate variables.

EXERCISES 2.3

1. Take two iteration points $\hat\imath$ and $\hat\jmath$ of the model loop $L$. Let $i$ and $j$ denote the corresponding index points. Let $d = \hat\jmath - \hat\imath$ and $d' = j - i$. How is $d'$ related to $d$? Explain the difficulties that may arise if the distance from a statement instance $S(i)$ to a statement instance $T(j)$ is defined to be $d'$.

2. Write down the set of all pairs of statement instances that constitute the flow dependence of $T$ on $S$ in Example 2.2. Show the different subsets that represent the flow dependences caused by different pairs of variables.

3. Write down the set of all pairs of statement instances that constitute the output dependence of $U$ on $T$ in Example 2.2. Give all the distances for this dependence.

4. Are there any input dependences between statements in Example 2.2?

5. Give a complete description of the dependence structure of the loop in Example 2.2 (including a dependence graph) after changing the loop-header to
   (a) L: do I = 4, 100, 1
   (b) L: do I = 4, 100, 3
   (c) L: do I = 100, 4, -2

2.4 Dependence Problem
Consider again the model loop $L$ and two statements $S$ and $T$ in its body. Let $u$ denote a variable of $S$ and $v$ a variable of $T$. Now, we address the practical problem of determining whether $u$ and $v$ cause a dependence between the statements: a dependence of $T$ on $S$, or of $S$ on $T$, or both. In this context, we are not interested in the type (flow, anti-, output, or input) of dependence. This book focuses on array elements, and here we consider the case where $u$ and $v$ are both elements of a given array $X$. Throughout this chapter, we assume that $X$ is one-dimensional, unless otherwise specified.

Let $u = X(f(I))$ and $v = X(g(I))$, where $f$ and $g$ are real-valued functions that return integer values for integer values of $I$. When $S < T$, $u$ is the output variable of $S$, and $v$ an input variable of $T$, we may partially represent the model program as follows:

    L:  do I = p, q, θ
          ...
    S:    X(f(I)) = ...
          ...
    T:    ... = ... X(g(I)) ...
          ...
        enddo

Theorem 2.6 gives necessary conditions under which $u$ and $v$ will cause a dependence between $S$ and $T$.

Theorem 2.6 Consider any two statements $S$ and $T$ in the loop $L$. Let $X(f(I))$ denote a variable of $S$ and $X(g(I))$ a variable of $T$, where $X$ is a one-dimensional array. If these variables cause a dependence between $S$ and $T$, then the equation

$$f(p + \theta\hat\imath) - g(p + \theta\hat\jmath) = 0 \tag{2.6}$$
has an integer solution $(\hat\imath, \hat\jmath)$ such that $0$ [...]

[...] $\tau_2$, then there is no dependence between statements $S$ and $T$; terminate the algorithm.

12. [Find the set $\Psi_0$. From (2.13) we get

$$\hat\jmath - \hat\imath = (a_1 - b_1)t - c_1(\hat\imath_0 - \hat\jmath_0). \tag{2.14}$$

Since $a \ne b$, we have $a_1 \ne b_1$. Let $\bar t$ denote the value of $t$ for which $\hat\imath = \hat\jmath$. If $\bar t$ is an integer between $\tau_1$ and $\tau_2$, then it yields an element (the only element in this case) of $\Psi_0$.]
Set $\bar t \leftarrow c_1(\hat\imath_0 - \hat\jmath_0)/(a_1 - b_1)$. If $\bar t$ is an integer such that $\tau_1 \le \bar t \le \tau_2$, then set

$$\Psi_0 \leftarrow \{(c_1\hat\imath_0 + b_1\bar t,\; c_1\hat\jmath_0 + a_1\bar t)\}.$$

13. [Find the sets $\Psi_1$ and $\Psi_{-1}$. The difference $(\hat\jmath - \hat\imath)$ given by (2.14) is either positive for all $t < \bar t$ and negative for all $t > \bar t$, or negative for all $t < \bar t$ and positive for all $t > \bar t$. We compute the intersections $[\tau_1, \tau_2] \cap (-\infty, \bar t)$ and $[\tau_1, \tau_2] \cap (\bar t, \infty)$. See Algorithm 1.3.1 and figures 1.3.4, 1.3.5 for details.]
Set
$$\tau_3 \leftarrow \lceil \bar t - 1 \rceil, \quad \tau_4 \leftarrow \lfloor \bar t + 1 \rfloor, \quad \tau_5 \leftarrow \min(\tau_2, \tau_3), \quad \tau_6 \leftarrow \max(\tau_1, \tau_4).$$
Select the proper case based on the sign of $(a_1 - b_1)$:

Case ($a_1 > b_1$): If $\tau_6$ [...]
[...] 0, continue.

2. $c_0 \leftarrow b_0 - a_0 + (b - a)p = -25$. Since $c_0 \bmod \theta = 0$, continue.

3. $c \leftarrow c_0/\theta = -5$.

4. $\Psi_1 \leftarrow \Psi_{-1} \leftarrow \Psi_0 \leftarrow \emptyset$.

5. Since $a \ne b$, go to Step 9.

9. [By Algorithm 2.1, find $\gcd(7, 3)$ and two integers $\hat\imath_0$, $\hat\jmath_0$ such that $7\hat\imath_0 - 3\hat\jmath_0 = \gcd(7, 3)$.] $g \leftarrow 1$, $(\hat\imath_0, \hat\jmath_0) \leftarrow (1, 2)$. Since $c \bmod g = 0$, continue.

10. $(a_1, b_1, c_1) \leftarrow (a/g, b/g, c/g) = (7, 3, -5)$.

11. Go to Case ($b_1 > 0$). $\tau_1 \leftarrow \lceil -c_1\hat\imath_0/b_1 \rceil = 2$, $\tau_2 \leftarrow \lfloor ([...] - c_1\hat\imath_0)/b_1 \rfloor = 14$. Go to Case ($a_1 > 0$). $\tau_1 \leftarrow \max(\tau_1, \lceil -c_1\hat\jmath_0/a_1 \rceil) = 2$, $\tau_2 \leftarrow \min(\tau_2, \lfloor ([...] - c_1\hat\jmath_0)/a_1 \rfloor) = 6$. Since $\tau_1 \le \tau_2$, continue.

12. $\bar t \leftarrow c_1(\hat\imath_0 - \hat\jmath_0)/(a_1 - b_1) = 5/4$. [Since $\bar t$ is neither an integer, nor does it lie between $\tau_1$ and $\tau_2$, the set $\Psi_0$ remains empty.]

13. $\tau_3 \leftarrow \lceil \bar t - 1 \rceil = 1$, $\tau_4 \leftarrow \lfloor \bar t + 1 \rfloor = 2$, $\tau_5 \leftarrow \min(\tau_2, \tau_3) = 1$, $\tau_6 \leftarrow \max(\tau_1, \tau_4) = 2$. Go to Case ($a_1 > b_1$). Since $\tau_6 \le \tau_2$, set
$$\Psi_1 \leftarrow \{(-5 + 3t,\, -10 + 7t) : 2 \le t \le 6\}.$$

14. Since $\Psi_1 \ne \emptyset$, there is a loop-carried dependence of statement $T$ on statement $S$. The corresponding set of distances is $\{4t - 5 : 2 \le t \le 6\}$. Since the sets $\Psi_{-1}$ and $\Psi_0$ are empty, there is no loop-carried dependence of $S$ on $T$, nor any loop-independent dependence between $S$ and $T$.

We see that the pairs of iteration points that cause the dependence of $T$ on $S$ form the set

$$\Psi_1 = \{(1, 4),\ (4, 11),\ (7, 18),\ (10, 25),\ (13, 32)\}.$$

The corresponding set of pairs of index points is obtained by repeated applications of the relation $I = p + \hat\imath\theta$:

$$\{(15, 30),\ (30, 65),\ (45, 100),\ (60, 135),\ (75, 170)\}.$$

Thus, $T(30)$ depends on $S(15)$, $T(65)$ depends on $S(30)$, etc. There are five such pairs of statement instances. The set of dependence distances is $\{3, 7, 11, 15, 19\}$. (Note that dependence distances are computed from iteration points.)
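The result of this worked example can be cross-checked by brute force. The sketch below enumerates the integer solutions of $7\hat\imath - 3\hat\jmath = -5$ inside a box of iteration values and sorts them by the sign of $\hat\jmath - \hat\imath$. Several values are assumptions: the function name `classify_solutions` is ours, the upper limit `n_max = 38` on the iteration values is only one value consistent with the bounds 14 and 6 computed in Step 11, and p = 10 and a stride of 5 are read off from the index pairs listed above. It is not Algorithm 2.2 itself.

```python
def classify_solutions(a, b, c, n_max):
    """Integer solutions (i, j) of a*i - b*j = c with 0 <= i, j <= n_max.

    The result maps the sign of (j - i) to a list of solutions:
    key 1 for j > i, key 0 for j == i, key -1 for j < i.
    """
    if b == 0:
        raise ValueError("this sketch assumes b != 0")
    psi = {1: [], 0: [], -1: []}
    for i in range(n_max + 1):
        num = a * i - c              # must equal b * j
        if num % b == 0:
            j = num // b
            if 0 <= j <= n_max:
                psi[(j > i) - (j < i)].append((i, j))
    return psi


p, theta = 10, 5                     # inferred from the index pairs in the text
psi = classify_solutions(7, 3, -5, n_max=38)   # n_max is an assumption
distances = [j - i for i, j in psi[1]]
index_pairs = [(p + i * theta, p + j * theta) for i, j in psi[1]]

print(psi[1])         # [(1, 4), (4, 11), (7, 18), (10, 25), (13, 32)]
print(distances)      # [3, 7, 11, 15, 19]
print(index_pairs)    # [(15, 30), (30, 65), (45, 100), (60, 135), (75, 170)]
```

The enumeration costs one pass over the iteration range, which is fine for a check like this one; the point of the parametric solution in the text is that it needs no enumeration at all.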
EXERCISES 2.5

1. (a) By Algorithm 2.1, find $g = \gcd(10, 14)$ and two integers $x_0$ and $y_0$, such that $10x_0 - 14y_0 = g$. Show all steps. Apply Theorem 2.8 to find the general solution to the equation $10x - 14y = 6$, using these values of $x_0$ and $y_0$.
   (b) By trial and error, find a pair of integers $(x_1, y_1) \ne (x_0, y_0)$ with the same property: $10x_1 - 14y_1 = g$. Solve $10x - 14y = 6$ again, this time using $(x_1, y_1)$ instead of $(x_0, y_0)$ in Theorem 2.8. Show that the general solution obtained here generates the same set of ordered pairs as generated by the general solution in the previous problem.
   (c) Find a formula that will generate all integer pairs $(x, y)$ such that $10x - 14y = g$.

2. In Theorem 2.8, show that irrespective of whether $g$ divides $c$ or not, the set of all real solutions to Equation (2.9) is given by
$$x = (c/g)x_0 + (b/g)t, \qquad y = (c/g)y_0 + (a/g)t,$$
where $t$ is a real parameter.

3. Write an algorithm such that given three integers $a$, $b$, and $c$, it decides if there is an integer solution $(x, y) \ge (0, 0)$ (i.e., $x \ge 0$ and $y \ge 0$) to the diophantine equation $ax - by = c$, and gives a formula to generate all such solutions when they exist. Apply your algorithm to three examples to demonstrate that the number of nonnegative solutions could be infinite, finite and positive, or zero.

4. In the context of Theorem 2.9, consider these three conditions:
   (a) $X(aI + a_0)$ and $X(bI + b_0)$ cause a dependence between $S$ and $T$;
   (b) $(b_0 - a_0 + bp - ap)$ is an integral multiple of $\theta \cdot \gcd(a, b)$;
   (c) $(b_0 - a_0)$ is an integral multiple of $\gcd(a, b)$.
   Show that (a) $\Rightarrow$ (b) $\Rightarrow$ (c). (We already know about the first implication.) Give an example where (c) does not imply (b). Find conditions on loop parameters such that (b) $\Leftrightarrow$ (c) for all integers $a$, $a_0$, $b$, and $b_0$. While (b) is the gcd test for the dependence equation in terms of iteration values, we can think of (c) as the gcd test for the equation $ai - bj = b_0 - a_0$, that is, for the dependence equation in terms of index values. Which is the stronger test for dependence in general: the gcd test in terms of iteration values, or the gcd test in terms of index values? When are they equivalent? (Note that (b) does not involve the final loop limit $q$, and (c) does not involve any loop parameters.)
5. Apply Algorithm 2.2 to the following examples:

   (a) L:  do I = 3, 100, 1
       S:    X(9I + 22) = ...
       T:    ... = ... X(6I - 17) ...
           enddo

   (b) L:  do I = 0, 100, 2
       S:    X(3I) = ...
       T:    ... = ... X(36) ...
           enddo

   (c) L:  do I = 100, -10, -3
       S:    X(4I + 16) = ...
       T:    ... = ... X(4I - 4) ...
           enddo

   (d) L:  do I = 100, -10, -2
       S:    X(4I + 16) = ...
       T:    ... = ... X(4I - 4) ...
           enddo

   (e) L:  do I = 100, -10, -1
       S:    X(4I + 16) = ...
       T:    ... = ... X(4I - 4) ...
           enddo

   (f) L:  do I = -50, 100, 1
       S:    X(-2I + 25) = ...
       T:    ... = ... X(3I + 4) ...
           enddo

   (g) L:  do I = -300, 200, 3
       S:    X(6I - 17) = ...
       T:    ... = ... X(9I + 22) ...
           enddo

   (h) L:  do I = 0, 100, 1
       S:    X(2I, 2I + 1) = ...
       T:    ... = ... X(3I + 1, 3I + 2) ...
           enddo

6. In Algorithm 2.2, let $a \ne b$. Show that if $\Psi_0$ is nonempty, then it contains the single point $(c/(a - b),\, c/(a - b))$.

7. In the context of Algorithm 2.2, give examples such that of the three sets $\Psi_1$, $\Psi_{-1}$, and $\Psi_0$,
   (a) only one is nonempty (three examples, one for each set);
   (b) only two are nonempty (three examples, one for each combination);
   (c) all three are nonempty (one example).

8. Write a simpler version of Algorithm 2.2 for the case where $b = -a \ne 0$.

9. Consider the program

       L:  do I = p, q, θ
       S:    x = ...
       T:    ... = ... x ...
           enddo

   where x is a scalar. Can Algorithm 2.2 handle this dependence problem?
2.6 Method of Bounds

As we shall see, the dependence problem becomes more difficult when we move from a single loop to a more complicated program. The major difficulty arises from the fact that the general solution to the dependence equation usually contains more than one integer parameter, and therefore we get a system of inequalities with more than one integer variable when the solution is substituted in the dependence constraints (Step 4, Algorithm 1.1). Such a system of inequalities is not trivial to solve. We mentioned in Chapter 1 that for this reason, an approximate algorithm (Algorithm 1.2) is often used, where we test if there is an integer solution to the dependence equation without any constraints, and a real solution to the equation with the dependence constraints. We also discussed two implementations of that algorithm: the method of bounds and the method of elimination.

Since the dependence problem involving a one-dimensional array in a single loop is a simple problem, an application of the method of elimination (Algorithm 1.3) to it is uninteresting (why?). On the other hand, by applying the method of bounds to this problem, we get results that can be applied to a more general program. In this section, we introduce the method of bounds in terms of a single loop. In Chapter 6, this method will be studied in detail in the context of a nest of several loops, where it is normally used in practice. The method of bounds originated in [Bane 76].

Consider again the dependence problem between two statements $S$ and $T$ in a loop of the form

    L:  do I = p, q, θ
          H(I)
        enddo

posed by a variable $X(aI + a_0)$ of $S$ and a variable $X(bI + b_0)$ of $T$. Write the dependence equation (2.11)

$$(a\theta)\hat\imath - (b\theta)\hat\jmath = b_0 - a_0 + (b - a)p$$

in the form

$$a\hat\imath - b\hat\jmath = c, \tag{2.15}$$

where $c = [b_0 - a_0 + (b - a)p]/\theta$. Also, rewrite the dependence constraints (2.12):
o-<j- j2, i3 = j3.
4.6. SPECIAL CASES
117
The set of all such solutions is
\[ \{(i_1, j_1, i_2, j_2, i_3, j_3) : (i_1, j_1) \in \Psi_{1,1},\ (i_2, j_2) \in \Psi_{2,-1},\ (i_3, j_3) \in \Psi_{3,0}\}, \]
which is clearly empty. Hence, $T$ does not depend on $S$ with the direction vector $(1, -1, 0)$.

Next, suppose we want to know if statement $S$ depends on statement $T$ with the direction vector $(1, -1, 0)$. Then the question is whether there is an integer solution $(i_1, j_1, i_2, j_2, i_3, j_3)$ to the system of equations (4.37)-(4.39) satisfying the constraints (4.40)-(4.42), such that
\[ j_1 < i_1, \quad j_2 > i_2, \quad j_3 = i_3. \]
(Note that $(i_1, i_2, i_3)$ goes with $S$ and $(j_1, j_2, j_3)$ with $T$, so that their roles are now reversed.) The set of all such solutions is
\[ \{(i_1, j_1, i_2, j_2, i_3, j_3) : (i_1, j_1) \in \Psi_{1,-1},\ (i_2, j_2) \in \Psi_{2,1},\ (i_3, j_3) \in \Psi_{3,0}\}. \]
This set is nonempty, so that $S$ depends on $T$ with the direction vector $(1, -1, 0)$. In fact, all instance pairs $(S(i_1, i_2, i_3), T(j_1, j_2, j_3))$, such that $i_1 > j_1$, $i_2 < j_2$, $i_3 = j_3$, and $S(i_1, i_2, i_3)$ depends on $T(j_1, j_2, j_3)$, form the set
\[ \{(S(t_1 + 1, t_2, 5),\ T(t_1, 3t_2 - 3, 5)) : 0 \le t_1 \le 99,\ 2 \le t_2 \le 34\}. \]
The number of such pairs is $100 \times 33 = 3300$. The set of distance vectors for this dependence of $S$ on $T$ is
\[ \{(1, 3 - 2t_2, 0) : 2 \le t_2 \le 34\} = \{(1, -1, 0), (1, -3, 0), \ldots, (1, -65, 0)\}. \]
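The counts quoted above can be checked by walking the parametrized set directly. The following is only an illustrative sketch: the function names `instance_pairs` and `differences` are assumptions introduced here, and the code simply enumerates the ranges $0 \le t_1 \le 99$, $2 \le t_2 \le 34$ stated in the text rather than solving the equations (4.37)-(4.39).

```python
def instance_pairs():
    """Enumerate the pairs (S(t1+1, t2, 5), T(t1, 3*t2-3, 5)) given in the text."""
    pairs = []
    for t1 in range(0, 100):       # 0 <= t1 <= 99
        for t2 in range(2, 35):    # 2 <= t2 <= 34
            s_index = (t1 + 1, t2, 5)
            t_index = (t1, 3 * t2 - 3, 5)
            pairs.append((s_index, t_index))
    return pairs


def differences(pairs):
    """Componentwise difference i - j for each pair (S(i), T(j))."""
    return {tuple(a - b for a, b in zip(s, t)) for s, t in pairs}


pairs = instance_pairs()
vectors = differences(pairs)

# Every pair respects the direction conditions i1 > j1, i2 < j2, i3 = j3.
assert all(s[0] > t[0] and s[1] < t[1] and s[2] == t[2] for s, t in pairs)

print(len(pairs), len(vectors))
```

The set of differences collapses to 33 distinct vectors because $t_1$ cancels in the first component.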
Special Case 3. A Generalization of Cases 1 and 2

Let $\hat{A} = EC$ and $\hat{B} = FC$, where $E$ and $F$ are $m \times m$ diagonal matrices, and $C$ is an $m \times n$ matrix. It is clear that this last case includes the first two. We get $\hat{A} = \hat{B}$ when $E$ and $F$ are equal to the identity matrix, while $\hat{A} = E$ and $\hat{B} = F$ when $C$ is the identity matrix. An example of this case is provided by the variables $X(I_3, 2I_1 - 1, 3I_2 - 1)$ and $X(5, 2I_1 + 1, I_2 + 2)$ in a rectangular triple loop, since their coefficient matrices have the forms:
\[ \hat{A} = \mathrm{diag}(2, 3, 1) \cdot C, \qquad \hat{B} = \mathrm{diag}(2, 1, 0) \cdot C. \]
In general, the analysis of Special Case 2 can be extended to include variables whose coefficient matrices are diagonal matrices postmultiplied by a permutation matrix.

As in Special Case 2, let
\[ E = \mathrm{diag}(e_1, e_2, \ldots, e_m), \qquad F = \mathrm{diag}(f_1, f_2, \ldots, f_m), \]
and assume that for each $r$ in $1 \le r \le m$, at least one of $e_r$ and $f_r$ is nonzero. Then, we may choose the matrix $C$ in such a way that $\gcd(e_r, f_r) = 1$ for each $r$. The dependence equation (4.24) has the form
\[ \hat{i}(EC) - \hat{j}(FC) = \hat{b}_0 - \hat{a}_0. \]
It can be broken up into two separate systems:
\[ \hat{i}E - \hat{j}F = k \tag{4.43} \]
\[ kC = \hat{b}_0 - \hat{a}_0. \tag{4.44} \]
For a given $k = (k_1, k_2, \ldots, k_m)$, the general solution to (4.43) can be written as
\[ \hat{i} = (\hat{i}_{10}k_1 + f_1t_1,\ \hat{i}_{20}k_2 + f_2t_2,\ \ldots,\ \hat{i}_{m0}k_m + f_mt_m) \]
\[ \hat{j} = (\hat{j}_{10}k_1 + e_1t_1,\ \hat{j}_{20}k_2 + e_2t_2,\ \ldots,\ \hat{j}_{m0}k_m + e_mt_m), \]
where $t_1, t_2, \ldots, t_m$ are integer parameters, and
\[ e_r\hat{i}_{r0} - f_r\hat{j}_{r0} = 1 \qquad (1 \le r \le m). \]
The constraints (4.25) can also be expressed in terms of $k_1, k_2, \ldots, k_m$ and $t_1, t_2, \ldots, t_m$.
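Each component of system (4.43) is a two-variable linear diophantine equation $e_r\hat{i}_r - f_r\hat{j}_r = k_r$ with $\gcd(e_r, f_r) = 1$, so a particular pair $(\hat{i}_{r0}, \hat{j}_{r0})$ can be obtained from the extended Euclidean algorithm. The sketch below is one possible way to do this; the function names `extended_gcd`, `particular_solution`, and `general_solution` are assumptions made for illustration and do not come from the text.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g and g == gcd(a, b) >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:  # normalize the sign so that the gcd is nonnegative
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def particular_solution(e, f):
    """Return (i0, j0) with e*i0 - f*j0 == 1, assuming gcd(e, f) == 1."""
    g, x, y = extended_gcd(e, f)
    if g != 1:
        raise ValueError("expected gcd(e, f) == 1")
    return x, -y  # e*x + f*y == 1, hence e*x - f*(-y) == 1


def general_solution(e, f, k, t):
    """One component of the general solution: (i0*k + f*t, j0*k + e*t)."""
    i0, j0 = particular_solution(e, f)
    return i0 * k + f * t, j0 * k + e * t


# For any integer t, the pair solves e*i - f*j == k.
i, j = general_solution(2, 3, k=5, t=4)
assert 2 * i - 3 * j == 5
```

Substituting the pair back gives $k(e_r\hat{i}_{r0} - f_r\hat{j}_{r0}) + (e_rf_r - f_re_r)t_r = k$, which is why the parameter $t_r$ is free.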
This kind of problem in a regular loop nest (where $\hat{Q}$ is the identity matrix) becomes a simple problem if the matrix $C$ is such that Equation (4.44) has no solution or a unique solution in $k$. We omit the details; see Example 1.5.12. To see how to test whether the matrices $\hat{A}$ and $\hat{B}$ satisfy the condition defining Special Case 3, see Exercise 1.5.6.1.

EXERCISES 4.6

1. Consider the dependence equation $\hat{i}\hat{A} - \hat{j}\hat{B} = \hat{b}_0 - \hat{a}_0$. If $\hat{A} = \hat{B}$, then for any given solution $(\hat{i}_0, \hat{j}_0)$ and any given $m$-vector $h$, $(\hat{i}_0 + h, \hat{j}_0 + h)$ is also a solution. Prove the converse: if the equation has this property, then $\hat{A} = \hat{B}$.

2. Find conditions on the iteration space of L such that it is finite and Theorem 4.9 is still valid.

3. Give an example of a nonuniform dependence with a unique distance vector.

4. Find the distance vectors, if any, of the dependence of T on S, S on T, and S on S in the following programs:

(a)
    L1:  do I1 = 0, 200, 2
    L2:    do I2 = 100 + I1, 0, -3
    S:       X(3I1 + I2 + 5, I1 + 2I2 + 3) = ...
    T:       ... = ... X(3I1 + I2 + 1, I1 + 2I2) ...
           enddo
         enddo

(b)
    L1:  do I1 = 0, 100, 3
    L2:    do I2 = I1, 50 + I1, 2
    S:       X(2I1 + 3I2 + 12) = ...
    T:       ... = ... X(2I1 + 3I2 - 5) ...
           enddo
         enddo

5. Decide if T depends on S at level 2 in the program

    L1:  do I1 = 0, 100, 1
    L2:    do I2 = 0, 50 + I1, 1
    S:       X(2I1 + 3I2 + 12) = ...
    T:       ... = ... X(2I1 + 3I2 - 5) ...
           enddo
         enddo

6. In the following programs, study the dependence of T on S, S on T, S on S, and T on T. Find all distance and direction vectors, and levels of dependence. For each valid direction vector, find the set of all pairs of statement instances that are involved in the dependence.
(a)
    L1:  do I1 = 10, 100, 2
    L2:    do I2 = 100, 0, -2
    L3:      do I3 = 20, 100, 1
    S:         X(2I1 - 1, 3I2 - 1, I3) = ...
    T:         ... = ... X(2I1 + 1, I2 + 2, 5) ...
             enddo
           enddo
         enddo

(b)
    L1:  do I1 = 1, 100, 2
    L2:    do I2 = 100, 0, -3
    S:       X(6I1 - 1, 2I2 - 1) = ...
    T:       ... = ... X(4I1 + 9, 2I2 + 9) ...
           enddo
         enddo

7. Discuss the possibility of extending the concepts and results of this section to a program that is not a perfect nest.
Chapter 5

General Program

5.1 Introduction
A sequence of loops forms a nest if each loop (except the first) is totally included within the previous loop. So far in the book, we have studied the dependence problem in a perfect loop nest, where the body of each loop (except the last) is the next loop in the sequence. The parameters we used to characterize a perfect loop nest (initial and final vectors, initial and final matrices, and stride matrix) can also be used to characterize an ordinary loop nest.

The model for this chapter is a program that consists of loops and assignment statements, where the loops are arbitrarily (but legally) nested. If necessary, we create a trivial loop with one iteration so that the whole program is a loop. This does not alter the concepts of dependence in any way, but helps simplify the notation. Each given statement in the program determines a loop nest; it is the sequence of all loops in the program containing that statement (counted from the outermost loop inward). If the program is a perfect loop nest, then the nests determined by all statements are the same.

When we compare two statements in the current program model to test for dependence between them, we have to deal with two possibly different loop nests. This is how the dependence problem is to be generalized as we move from a perfect nest of loops to an arbitrary program. Much of the needed formalism is already in place.

We show in Section 5.2 how the dependence concepts can be easily extended from a perfect loop nest to a general program. The linear dependence problem in the general setting is formulated in Section 5.3. In Section 5.4, we discuss the generalized gcd test that is the basic test in dependence analysis.
5.2 Dependence Concepts

First, take any single assignment statement $S$ in the program. Let $m_S$ denote the number of loops containing $S$ and $L_S$ the loop nest determined by $S$. For $L_S$, let $I_S$ denote the index vector, $\hat{i}_S$ the iteration vector, $p_{S0}$ and $q_{S0}$ the initial and final vectors, $P_S$ and $Q_S$ the initial and final matrices, and $\Theta_S$ the stride matrix. The normalized final vector and the normalized final matrix of $L_S$ are denoted by $\hat{q}_{S0}$ and $\hat{Q}_S$, respectively, and they are computed using (4.10):
\[ \hat{Q}_S = \Theta_S P_S^{-1} Q_S \Theta_S^{-1}, \qquad \hat{q}_{S0} = \left(q_{S0} - p_{S0} P_S^{-1} Q_S\right) \Theta_S^{-1}. \tag{5.1} \]
Next, consider any two, not necessarily distinct, assignment statements $S$ and $T$ in the given program. If $S$ lexically precedes $T$ in the program, we write $S < T$. The meaning of the notation $S \le T$ is then clear, and it is obvious that $\le$ is a total order in the set of all assignment statements.

Let $m = m_{ST}$ denote the number of loops that contain both statements $S$ and $T$, and $L$ the nest formed by these loops. We have $m \le \min(m_S, m_T)$. Since the outermost $m$ loops of $L_S$ and $L_T$ are the same, the first $m$ elements of $p_{S0}$ and $p_{T0}$ are identical, and so are the first $m$ elements of $q_{S0}$ and $q_{T0}$. Also, the $m \times m$ submatrices formed by the topmost $m$ rows and the leftmost $m$ columns of the matrices $P_S$ and $P_T$ are the same, and so are the similar submatrices of the matrices $Q_S$ and $Q_T$. Let us label the loops of the program $L_1, L_2, \ldots, L_{m_S + m_T - m}$, such that
\[ L = (L_1, L_2, \ldots, L_m) \]
\[ L_S = (L_1, L_2, \ldots, L_m, L_{m+1}, \ldots, L_{m_S}) \]
\[ L_T = (L_1, L_2, \ldots, L_m, L_{m_S+1}, \ldots, L_{m_S+m_T-m}). \]
A value $i = (i_1, i_2, \ldots, i_m, i_{m+1}, \ldots, i_{m_S})$ of the index vector $I_S$ determines an instance of statement $S$ that is denoted by $S(i)$, and a value $j = (j_1, j_2, \ldots, j_m, j_{m_S+1}, \ldots, j_{m_S+m_T-m})$ of the index vector $I_T$ determines an instance of statement $T$ that is denoted by $T(j)$. The distance from the instance $S(i)$ to the instance $T(j)$ is defined to be the $m$-vector
\[ (\hat{j}_1 - \hat{i}_1,\ \hat{j}_2 - \hat{i}_2,\ \ldots,\ \hat{j}_m - \hat{i}_m), \]
where
\[ \hat{i} = (\hat{i}_1, \ldots, \hat{i}_m, \hat{i}_{m+1}, \ldots, \hat{i}_{m_S}) \]
\[ \hat{j} = (\hat{j}_1, \ldots, \hat{j}_m, \hat{j}_{m_S+1}, \ldots, \hat{j}_{m_S+m_T-m}) \]
are the iteration vectors corresponding to $i$ and $j$, respectively. Thus, the distance is defined in terms of the first $m$ elements of the iteration vectors of the loop nests $L_S$ and $L_T$. In other words, the distance is defined¹ by the iterations of the common nest $L$. An extension of Lemma 4.4 to the general case is stated below without proof.

Lemma 5.1 Consider two assignment statements $S$ and $T$ in the model program. Let $d$ denote the distance from an instance $S(i)$ of $S$ to an instance $T(j)$ of $T$. In the sequential execution of the program, $S(i)$ is executed before $T(j)$ iff one of the following two conditions holds:
(a) $d \succ 0$,
(b) $d = 0$ and $S < T$.

[…] $\succeq 0$. The equality $d = 0$ is possible only if $S < T$.

¹ Note that in a general program, the iteration points $\hat{i}$ and $\hat{j}$ need not have the same number of elements, so that the difference $\hat{j} - \hat{i}$ may not even make sense.
Corollary 1 If $\sigma$ is a direction vector for the dependence of statement $T$ on statement $S$ in the general program, then $\sigma \succeq 0$. The equality $\sigma = 0$ is possible only if $S < T$.

Corollary 2 If $\ell$ is a dependence level for the dependence of statement $T$ on statement $S$ in the general program, then $1 \le \ell \le m + 1$ (where $m$ is the number of loops that contain both $S$ and $T$). The value $\ell = m + 1$ is possible only if $S < T$.
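The distance, direction vector, and level of a pair of statement instances can be computed mechanically once the iteration vectors and the number $m$ of common loops are known. The sketch below is illustrative only. All function names are assumptions, and it takes for granted the perfect-nest definitions of Chapter 4 (the direction vector is the componentwise sign of the distance, and the level is the position of its first nonzero component, or $m + 1$ when the distance is zero), which are not restated in this section.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def distance(i_hat, j_hat, m):
    """Distance from S(i) to T(j): differences over the m common loops only."""
    return tuple(j - i for i, j in zip(i_hat[:m], j_hat[:m]))


def direction(d):
    """Direction vector: componentwise sign of the distance."""
    return tuple(sign(x) for x in d)


def level(d):
    """1-based position of the first nonzero component, or m + 1 if d is zero."""
    for pos, x in enumerate(d, start=1):
        if x != 0:
            return pos
    return len(d) + 1


def lex_positive(d):
    """True iff d is lexicographically greater than the zero vector."""
    for x in d:
        if x != 0:
            return x > 0
    return False


def executed_before(d, s_precedes_t):
    """Lemma 5.1: S(i) runs before T(j) iff d > 0, or d = 0 and S < T."""
    return lex_positive(d) or (all(x == 0 for x in d) and s_precedes_t)


# Hypothetical iteration vectors with m_S = 3, m_T = 4, and m = 2 common loops.
d = distance((2, 1, 0), (3, 1, 5, 0), m=2)
print(d, direction(d), level(d), executed_before(d, s_precedes_t=True))
```

Only the first $m$ components enter the computation, which is why iteration vectors of different lengths cause no difficulty.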
Example 5.1 Consider the program

    L1:  do I1 = 1, 100, 1
    L2:    do I2 = 1, I1, 2
    L3:      do I3 = I1, I1 + I2, 1
    S:         X(2I1 - 1, 3I2 + 1, 2I3) = ...
             enddo
    L4:      do I4 = 100, 0, -3
    L5:        do I5 = I1, I4, 1
    T:           ... = ... X(2I1 + 1, 4I2 + 6, I4 + I5) ...
               enddo
             enddo
           enddo
         enddo
Statement $S$ determines the loop nest $L_S = (L_1, L_2, L_3)$ with $m_S = 3$ loops. For this nest, we have:
\[ p_1 = p_{10} = 1, \qquad q_1 = q_{10} = 100, \]
\[ p_2 = p_{20} + p_{21}I_1 = 1 + 0 \times I_1, \qquad q_2 = q_{20} + q_{21}I_1 = 0 + 1 \times I_1, \]
\[ p_3 = p_{30} + p_{31}I_1 + p_{32}I_2 = 0 + 1 \times I_1 + 0 \times I_2, \qquad q_3 = q_{30} + q_{31}I_1 + q_{32}I_2 = 0 + 1 \times I_1 + 1 \times I_2, \]
\[ \theta_1 = 1, \quad \theta_2 = 2, \quad \theta_3 = 1. \]
Hence, the initial and final vectors are
\[ p_{S0} = (p_{10}, p_{20}, p_{30}) = (1, 1, 0), \qquad q_{S0} = (q_{10}, q_{20}, q_{30}) = (100, 0, 0); \]
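and the initial, final, and stride matrices are
\[ P_S = \begin{pmatrix} 1 & -p_{21} & -p_{31} \\ 0 & 1 & -p_{32} \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \]
\[ Q_S = \begin{pmatrix} 1 & -q_{21} & -q_{31} \\ 0 & 1 & -q_{32} \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & -1 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix}, \]
\[ \Theta_S = \begin{pmatrix} \theta_1 & 0 & 0 \\ 0 & \theta_2 & 0 \\ 0 & 0 & \theta_3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]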
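We compute the normalized final vector and matrix:
\[ \hat{q}_{S0} = \left(q_{S0} - p_{S0} P_S^{-1} Q_S\right) \Theta_S^{-1} = (99, 0, 1), \]
\[ \hat{Q}_S = \Theta_S P_S^{-1} Q_S \Theta_S^{-1} = \begin{pmatrix} 1 & -1/2 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{pmatrix}. \]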
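The normalization in (5.1) is plain matrix arithmetic over the rationals, so the values above can be reproduced with exact fractions. The following is a minimal sketch, not a definitive implementation: the helper names `mat_mul`, `mat_inv`, and `normalize_bounds` are assumptions introduced for illustration, and the inverse is a generic Gauss-Jordan routine rather than anything specific to the triangular structure of $P_S$.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def mat_inv(M):
    """Gauss-Jordan inverse over the rationals (M must be nonsingular)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def normalize_bounds(p0, q0, P, Q, Theta):
    """Return the normalized final vector and matrix as in (5.1)."""
    P_inv = mat_inv(P)
    Theta_inv = mat_inv(Theta)
    Q_hat = mat_mul(mat_mul(mat_mul(Theta, P_inv), Q), Theta_inv)
    shift = mat_mul(mat_mul([p0], P_inv), Q)[0]   # p0 * P^{-1} * Q
    diff = [[Fraction(q) - s for q, s in zip(q0, shift)]]
    q_hat0 = mat_mul(diff, Theta_inv)[0]
    return q_hat0, Q_hat


# Data for the nest L_S = (L1, L2, L3) of Example 5.1.
p_S0 = [1, 1, 0]
q_S0 = [100, 0, 0]
P_S = [[1, 0, -1], [0, 1, 0], [0, 0, 1]]
Q_S = [[1, -1, -1], [0, 1, -1], [0, 0, 1]]
Theta_S = [[1, 0, 0], [0, 2, 0], [0, 0, 1]]

q_hat_S0, Q_hat_S = normalize_bounds(p_S0, q_S0, P_S, Q_S, Theta_S)
print(q_hat_S0)
print(Q_hat_S)
```

Exact fractions are used because the normalized final matrix is not integral in general; here its entry in row 1, column 2 is $-1/2$.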
The iteration vector $\hat{i}_S = (\hat{i}_1, \hat{i}_2, \hat{i}_3)$ satisfies the constraints:
\[ 0 \le \hat{i}_S, \qquad \hat{i}_S \hat{Q}_S \le \hat{q}_{S0}. \]
[…] $\succeq 0$ if $S < T$, and $\sigma \succ 0$ otherwise. Also, by Corollary 2 to the same theorem, we have $1 \le \ell \le m + 1$ if $S < T$, and $1 \le \ell \le m$ otherwise.

Theorem 5.4 Suppose that Equation (5.3) has an integer solution $(\hat{i}, \hat{j})$ that satisfies the constraints (5.4)-(5.5). Let $\hat{i} = (\hat{i}_1, \hat{i}_2, \ldots, \hat{i}_{m_S})$ and $\hat{j} = (\hat{j}_1, \ldots, \hat{j}_m, \hat{j}_{m_S+1}, \ldots, \hat{j}_{m_S+m_T-m})$.
(a) If $(\hat{i}_1, \hat{i}_2, \ldots, \hat{i}_m) \prec (\hat{j}_1, \hat{j}_2, \ldots, \hat{j}_m)$, then $S\,\delta\,T$.
(b) If $(\hat{i}_1, \hat{i}_2, \ldots, \hat{i}_m) \succ (\hat{j}_1, \hat{j}_2, \ldots, \hat{j}_m)$, then $T\,\delta\,S$.
(c) If $(\hat{i}_1, \hat{i}_2, \ldots, \hat{i}_m) = (\hat{j}_1, \hat{j}_2, \ldots, \hat{j}_m)$ and $S < T$, then $S\,\delta\,T$.

As always, we do not distinguish between dependence and indirect dependence. If there is an integer solution to (5.3), satisfying (5.4)-(5.5) and additional conditions, if any, then we assume that the
corresponding dependence exists. The dependence problem for the general program is defined as in Section 4.5.

Example 5.2 In this example, we state the dependence problem for the program of Example 5.1:

    L1:  do I1 = 1, 100, 1
    L2:    do I2 = 1, I1, 2
    L3:      do I3 = I1, I1 + I2, 1
    S:         X(2I1 - 1, 3I2 + 1, 2I3) = ...
             enddo
    L4:      do I4 = 100, 0, -3
    L5:        do I5 = I1, I4, 1
    T:           ... = ... X(2I1 + 1, 4I2 + 6, I4 + I5) ...
               enddo
             enddo
           enddo
         enddo
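Much of the computation has been done already. The variable of statement $S$ can be written as $X((I_1, I_2, I_3)A + a_0)$, where
\[ A = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 2 \end{pmatrix} \qquad \text{and} \qquad a_0 = (-1, 1, 0). \]
Its normalized form is $X((\hat{i}_1, \hat{i}_2, \hat{i}_3)\hat{A} + \hat{a}_0)$, where
\[ \hat{A} = \Theta_S P_S^{-1} A = \begin{pmatrix} 2 & 0 & 2 \\ 0 & 6 & 0 \\ 0 & 0 & 2 \end{pmatrix}, \]
\[ \hat{a}_0 = p_{S0} P_S^{-1} A + a_0 = (1, 4, 2). \]
The variable of $T$ can be written as $X((I_1, I_2, I_4, I_5)B + b_0)$, where
\[ B = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \qquad b_0 = (1, 6, 0). \]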
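Its normalized form is $X((\hat{j}_1, \hat{j}_2, \hat{j}_4, \hat{j}_5)\hat{B} + \hat{b}_0)$, where
\[ \hat{B} = \Theta_T P_T^{-1} B = \begin{pmatrix} 2 & 0 & 1 \\ 0 & 8 & 0 \\ 0 & 0 & -3 \\ 0 & 0 & 1 \end{pmatrix}, \]
\[ \hat{b}_0 = p_{T0} P_T^{-1} B + b_0 = (3, 10, 101). \]
The dependence equation (5.3) for this problem is
\[ (\hat{i}_1, \hat{i}_2, \hat{i}_3) \begin{pmatrix} 2 & 0 & 2 \\ 0 & 6 & 0 \\ 0 & 0 & 2 \end{pmatrix} - (\hat{j}_1, \hat{j}_2, \hat{j}_4, \hat{j}_5) \begin{pmatrix} 2 & 0 & 1 \\ 0 & 8 & 0 \\ 0 & 0 & -3 \\ 0 & 0 & 1 \end{pmatrix} = (2, 6, 99). \tag{5.7} \]
It consists of three scalar equations in 7 integer variables:
\[ \begin{aligned} 2\hat{i}_1 - 2\hat{j}_1 &= 2 \\ 6\hat{i}_2 - 8\hat{j}_2 &= 6 \\ 2\hat{i}_1 + 2\hat{i}_3 - \hat{j}_1 + 3\hat{j}_4 - \hat{j}_5 &= 99. \end{aligned} \]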
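The right-hand side of (5.7) and the three scalar equations can be checked numerically from the normalized matrices. The sketch below is illustrative: the names `vec_mat`, `rhs`, `residual`, and `passes_columnwise_gcd` are assumptions, and the divisibility check applied to each scalar equation separately is only a simple necessary condition, weaker than the generalized gcd test of Section 5.4. The sample vectors at the end satisfy the dependence equation only; nothing is claimed about the dependence constraints.

```python
from math import gcd

# Normalized coefficient matrices and constant vectors from Example 5.2.
A_hat = [[2, 0, 2], [0, 6, 0], [0, 0, 2]]
B_hat = [[2, 0, 1], [0, 8, 0], [0, 0, -3], [0, 0, 1]]
a_hat0 = (1, 4, 2)
b_hat0 = (3, 10, 101)


def vec_mat(v, M):
    """Row vector times matrix."""
    return tuple(sum(x * row[c] for x, row in zip(v, M)) for c in range(len(M[0])))


def rhs():
    """Right-hand side b_hat0 - a_hat0 of the dependence equation."""
    return tuple(b - a for a, b in zip(a_hat0, b_hat0))


def residual(i_hat, j_hat):
    """i_hat*A_hat - j_hat*B_hat - (b_hat0 - a_hat0); zero for a solution."""
    lhs = tuple(x - y for x, y in zip(vec_mat(i_hat, A_hat), vec_mat(j_hat, B_hat)))
    return tuple(l - r for l, r in zip(lhs, rhs()))


def passes_columnwise_gcd():
    """Each scalar equation needs gcd(coefficients) to divide its right side."""
    for c, r in enumerate(rhs()):
        coeffs = [row[c] for row in A_hat] + [-row[c] for row in B_hat]
        g = 0
        for x in coeffs:
            g = gcd(g, x)
        if g == 0:
            if r != 0:
                return False
        elif r % g != 0:
            return False
    return True


print(rhs(), passes_columnwise_gcd())
# One integer solution of the equation alone (constraints not checked):
print(residual((2, 1, 0), (1, 0, 32, 0)))
```

The column gcds here are 2, 2, and 1, each dividing the corresponding right-hand side, so this simple check does not rule out a dependence.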
In Example 5.1, we found the dependence constraints (5.4)-(5.5) for this problem: $0 \le \hat{i}_1 \le 99$, $0 \le \hat{i}_2 \le \hat{i}_1/2$, 0