Dependence Analysis (Loop Transformation for Restructuring Compilers)


Author: Utpal Banerjee


DEPENDENCE ANALYSIS

A Book Series On

LOOP TRANSFORMATIONS FOR RESTRUCTURING COMPILERS

Utpal Banerjee

Series Titles:

Loop Transformations for Restructuring Compilers: The Foundations
Loop Parallelization
Dependence Analysis

DEPENDENCE ANALYSIS

Utpal Banerjee Intel Corporation

A Book Series on

Loop Transformations for Restructuring Compilers

Kluwer Academic Publishers Boston / Dordrecht / London

Distributors for North America: Kluwer Academic Publishers 101 Philip Drive Assinippi Park Norwell, Massachusetts 02061 USA Distributors for all other countries: Kluwer Academic Publishers Group Distribution Centre Post Office Box 322 3300 AH Dordrecht, THE NETHERLANDS

Library of Congress Cataloging-in-Publication Data A C.I.P. Catalogue record for this book is available from the Library of Congress.

Copyright © 1997 by Kluwer Academic Publishers All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher, Kluwer Academic Publishers, 101 Philip Drive, Assinippi Park, Norwell, Massachusetts 02061 Printed on acid-free paper.

Printed in the United States of America

To my parents:

Late Santosh Kumar Banerjee
Santi Rani Banerjee

Contents

Preface
Acknowledgments

1 Introduction

2 Single Loops
  2.1 Introduction
  2.2 Index and Iteration Spaces
  2.3 Dependence Concepts
  2.4 Dependence Problem
  2.5 Solution to the Linear Problem
  2.6 Method of Bounds

3 Double Loops
  3.1 Introduction
  3.2 Index and Iteration Spaces
  3.3 Dependence Concepts
  3.4 Dependence Problems

4 Perfect Loop Nests
  4.1 Introduction
  4.2 Index and Iteration Spaces
  4.3 Dependence Concepts
  4.4 Subscript Representation
  4.5 Dependence Problem
  4.6 Special Cases

5 General Program
  5.1 Introduction
  5.2 Dependence Concepts
  5.3 Dependence Problem
  5.4 Generalized gcd Test

6 Method of Bounds
  6.1 Introduction
  6.2 Perfect Nest, One-Dimensional Array
  6.3 Rectangular Loops, One-Dimensional Array
  6.4 Rectangular Loops, Multi-Dimensional Array
  6.5 General Method

7 Method of Elimination
  7.1 Introduction
  7.2 Dependence Testing by Elimination
  7.3 Two-variable Problems
  7.4 Other Methods

8 Conclusions

A Linear Equations on Polytopes
  A.1 Introduction
  A.2 Polytopes
  A.3 Real Solutions to a Single Equation
  A.4 Lagrangean Relaxation
  A.5 Real Solutions to a System of Equations
  A.6 Integer Solutions to Linear Equations

Bibliography

Index

List of Figures

2.1 Statement dependence graph for Example 2.2
2.2 The triangle P of Theorem 2.11
2.3 Loop nest of Example 2.7 after unrolling
3.1 Index and iteration spaces for Example 3.1

List of Tables

2.1 Steps of Algorithm 2.1 for a = 21 and b = 34
3.1 Index values for loops of Example 3.1
3.2 Iteration values for loops of Example 3.1
3.3 Some iterations of (L1, L2) in Example 3.2
4.1 Loop nest of Example 4.2 after unrolling
4.2 Dependence structure of (L1, L2, L3) in Example 4.2
5.1 Three statement instances for Example 5.1
5.2 Parameters for the Dependence Problem

List of Notations

In the following, $\mathbf{i} = (i_1, i_2, \ldots, i_m)$ and $\mathbf{j} = (j_1, j_2, \ldots, j_m)$ are m-vectors (integer or real), and $1 \le \ell \le m$.

$a/b$, $\mathbb{R}$, $\mathbb{R}^m$, $\mathbb{Z}$, $\mathbb{Z}^m$ …

… $-3\hat{\imath} \ge -110$, so that $0 \le \hat{\imath} \le 110/3$. Here, the largest iteration value is $\lfloor 110/3 \rfloor = 36$, and the iteration points of the loop are 0, 1, 2, ..., 36. The iteration space is the set of these 37 integers. The index points of L are found from (2.5) by plugging in the values of $\hat{\imath}$:

120, 120 - 3(1), 120 - 3(2), ..., 120 - 3(36).

Thus, the index space is the set {120, 117, 114, ..., 12}.
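The computation above can be sketched in a few lines of Python. This is only an illustration: the loop header `do I = 120, 10, -3` is an assumption inferred from the inequality and the index values shown, and `loop_spaces` is a hypothetical helper name. The sketch enumerates the index points directly and numbers them 0, 1, 2, ... to obtain the iteration points.

```python
def loop_spaces(p, q, theta):
    """Return (iteration_points, index_points) for the loop `do I = p, q, theta`."""
    if theta == 0:
        raise ValueError("stride must be nonzero")
    # range() excludes its stop value, so step one unit past q in the
    # direction of travel to make the limit q inclusive.
    stop = q + (1 if theta > 0 else -1)
    index_points = list(range(p, stop, theta))
    # Iteration points simply count the index points from 0.
    iteration_points = list(range(len(index_points)))
    return iteration_points, index_points


# Assumed header: do I = 120, 10, -3
iters, idx = loop_spaces(120, 10, -3)
print(len(iters), iters[-1])   # 37 36
print(idx[:3], idx[-1])        # [120, 117, 114] 12
```

Each index point equals `p + theta * k` for the matching iteration point `k`, which is the relation the text applies when it plugs iteration values into (2.5).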

EXERCISES 2.2

1. What are the smallest and the largest values of I in the model loop L of this section?

2. Find a closed-form expression for the number of iterations of L that holds for all values of p, q, and $\theta$ (including cases where the loop fails to execute).

3. Find the index and iteration spaces of the loop L, when
   (a) p = 0, q = 9, $\theta$ = 1;
   (b) p = 17, q = 39, $\theta$ = 5;
   (c) p = -15, q = 20, $\theta$ = 2;
   (d) p = 10, q = -13, $\theta$ = -3.

4. Find the number of iterations of L and the value of I in the last iteration, when
   (a) p = 0, q = 1000, $\theta$ = 1;
   (b) p = -17, q = 390, $\theta$ = 4;
   (c) p = -15, q = -200, $\theta$ = 3;
   (d) p = 1001, q = -137, $\theta$ = -7.

2.3 Dependence Concepts

Consider two, not necessarily distinct, assignment statements S and T in our model loop L:

    do I = p, q, θ
      H(I)
    enddo

If S lexically precedes T in the program, we write S < T. The meaning of the notation S $\le$ T is then clear, and it is obvious that $\le$ is a total order in the set of assignment statements in the program.

An index point (i.e., a value of the index variable I) determines an instance of each assignment statement in the loop. Since, by hypothesis, there are no conditionals in the program, all such instances are executed. Let S(i) denote the instance of statement S determined by an index point i, and T(j) the instance of statement T determined by an index point j. The distance from S(i) to T(j) is defined to be the integer $(\hat{\jmath} - \hat{\imath})$, where $\hat{\imath}$ and $\hat{\jmath}$ are the iteration points corresponding to i and j, respectively.¹ We leave to the reader the proof of the following result.

Lemma 2.4 Consider two assignment statements S and T in the single loop L. Let d denote the distance from an instance S(i) of S to an instance T(j) of T. In the sequential execution of the program, S(i) is executed before T(j) iff either d > 0, or d = 0 and S < T.

The concept of dependence can be introduced in many different contexts. We define dependence first between statement instances, and then between statements. An instance T(j) of a statement T depends on an instance S(i) of a statement S, if there exists a memory location $\mathcal{M}$ such that

1. Both S(i) and T(j) reference (read or write) $\mathcal{M}$;
2. S(i) is executed before T(j) in the sequential execution of the program;
3. During sequential execution, the location $\mathcal{M}$ is not written in the time period from the end of execution of S(i) to the beginning of execution of T(j).

The statements S and T need not be distinct, but Condition 2 requires that the instances S(i) and T(j) be distinct. Let $\hat{\imath}$ and $\hat{\jmath}$ denote the iteration values corresponding to i and j, respectively. This dependence is loop-carried if $\hat{\imath} < \hat{\jmath}$; it is loop-independent if $\hat{\imath} = \hat{\jmath}$ (i.e., if i = j). In the loop-independent case, we necessarily have S < T since S(i) is to be executed before T(i).

Since a memory reference is either a "read" or a "write," a pair of statement instances can reference the same memory location in four different ways. This leads to four different types of dependence:

1. T(j) is flow dependent on S(i), if S(i) writes $\mathcal{M}$ and T(j) reads it (the value computed by S(i) is used by T(j));
2. T(j) is anti-dependent on S(i), if S(i) reads $\mathcal{M}$ and T(j) writes it (S(i) uses the value in $\mathcal{M}$ before it is changed by T(j));
3. T(j) is output dependent on S(i), if S(i) and T(j) both write $\mathcal{M}$ (the value computed by T(j) is stored after the value computed by S(i) is stored);
4. T(j) is input dependent on S(i), if both S(i) and T(j) read $\mathcal{M}$ (the "read" by T(j) comes after the "read" by S(i)).

¹ See [Pugh 92b] for concerns expressed about the proper definition of dependence distance, and [Wolf 94] for related comments.

Now, we consider dependence between two statements. A statement T depends on a statement S, if there is at least one instance S(i) of S and one instance T(j) of T, such that T(j) depends on S(i). We can be more specific. For example, T is flow dependent on S if there is an instance pair (S(i), T(j)) such that T(j) is flow dependent on S(i). One may similarly define what is meant by T is anti-dependent, output dependent, or input dependent on S.

The dependence of T on S can be formally defined to be the set of all instance pairs (S(i), T(j)) such that T(j) depends on S(i). Thus, T depends on S iff the dependence of T on S is nonempty. Sometimes, it is convenient to say that there is dependence between S and T if either T depends on S, or S on T (or both).

The dependence of T on S can be separated into two parts: the loop-carried part, consisting of all instance pairs (S(i), T(j)) such that T(j) depends on S(i) and $\hat{\imath} < \hat{\jmath}$; and the loop-independent part, consisting of all instance pairs (S(i), T(i)) such that T(i) depends on S(i). The loop-independent part is necessarily empty if T < S.

We can also break up the dependence of T on S by the type of dependence between a pair of instances. This way we get four subsets:

1. The flow dependence of T on S consists of all instance pairs (S(i), T(j)) such that T(j) is flow dependent on S(i);
2. The anti-dependence of T on S consists of all instance pairs (S(i), T(j)) such that T(j) is anti-dependent on S(i);
3. The output dependence of T on S consists of all instance pairs (S(i), T(j)) such that T(j) is output dependent on S(i);
4. The input dependence of T on S consists of all instance pairs (S(i), T(j)) such that T(j) is input dependent on S(i).


These subsets are not necessarily pairwise disjoint (e.g., T(j) may be both flow dependent and anti-dependent on S(i)).

The dependence between two statements can be described in terms of the program variables² involved. (See Section I-4.3.) The instances of the output variable of a statement S determine the memory locations written by the instances of S. On the other hand, the instances of the input variables of S determine the memory locations read by the instances of S. Let u denote a variable of statement S and v a variable of statement T. The pair (u, v) causes a dependence of T on S, if there are statement instances S(i) and T(j), such that T(j) depends on S(i), and the corresponding memory location $\mathcal{M}$ is represented by both the instance of u for I = i and the instance of v for I = j. If u is the output variable of S and v an input variable of T, then (u, v) can cause a flow dependence of T on S, or an anti-dependence of S on T, or both. The output variables of S and T can cause an output dependence of T on S, and/or of S on T. Two input variables of the statements can cause only an input dependence.

A distance for the dependence of T on S is the distance $(\hat{\jmath} - \hat{\imath})$ from an instance S(i) of S to an instance T(j) of T, where T(j) depends on S(i). A given dependence has at least one, but usually many distances. Since S(i) must be executed before T(j) by the definition of dependence, we have the following result as a direct consequence of Lemma 2.4.

Theorem 2.5 If a statement T depends on a statement S in the loop L, then each dependence distance satisfies $d \ge 0$. The equality d = 0 is possible only if S < T.

The relation of dependence between statements is denoted by $\delta$, and we write $S\,\delta\,T$ to indicate that T depends on S.³ The statement dependence graph of the given program is the directed graph that represents the relation $\delta$. (In other words, it is the graph of $\delta$; see Section 1.1.4.) We denote flow dependence, anti-dependence, output dependence, and input dependence by the symbols $\delta^f$, $\delta^a$, $\delta^o$, and $\delta^i$, respectively. (Thus, for example, $S\,\delta^f\,T$ means T is flow dependent on S.) These relations may overlap, and they constitute the main relation of dependence:

$$\delta = \delta^f \cup \delta^a \cup \delta^o \cup \delta^i.$$

From this point, dependence will not include input dependence, unless otherwise stated.

² We often use the term "variable" somewhat loosely, although the exact meaning should always be clear from the context. In the current context, for "program variables," we may take the output variable X(I) of S and the input variable X(I-2) of T in Example 2.2.
³ Often, $\delta$ is used as a generic symbol for dependence. For example, one may write $S(i)\,\delta\,T(j)$ to denote dependence between statement instances.

The transitive closure $\bar{\delta}$ of the relation $\delta$ is the relation of indirect dependence. Thus, a statement T is indirectly dependent on a statement S if there is a (directed) path from S to T in the statement dependence graph. In symbols, we have $S\,\bar{\delta}\,T$ if there is a nonempty sequence of statements $S_1, S_2, \ldots, S_N$ such that

$$S = S_1, \quad S_1\,\delta\,S_2, \quad \ldots, \quad S_{N-1}\,\delta\,S_N, \quad S_N = T.$$
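Indirect dependence is plain graph reachability, which the following Python sketch makes concrete. The function name `indirect` and the encoding of the relation as a set of (source, sink) pairs are assumptions made for illustration. The three edges used are the direct dependences that Example 2.2 exhibits: T on S, S on T, and U on T.

```python
def indirect(edges):
    """Transitive closure of a dependence relation given as (S, T) pairs,
    where a pair (S, T) means that T depends on S."""
    successors = {}
    for source, sink in edges:
        successors.setdefault(source, set()).add(sink)
    closure = set()
    for start in successors:
        # Depth-first search over every path of length >= 1 leaving `start`.
        stack, seen = list(successors[start]), set()
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(successors.get(node, ()))
        closure |= {(start, node) for node in seen}
    return closure


direct = {("S", "T"), ("T", "S"), ("T", "U")}
closure = indirect(direct)
print(("S", "U") in direct, ("S", "U") in closure)   # False True
```

Here U is not directly dependent on S, but the path from S through T to U makes it indirectly dependent.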

Indirect dependence between statement instances is defined similarly.

Example 2.2 Consider the loop

    L:  do I = 4, 100, 2
    S:    X(I) = X(I) + 1
    T:    Y(2I - 4) = X(I) + X(I + 1) + X(I + 2) + X(I - 2)
    U:    Y(I) = Z(I)
        enddo

The index values of the loop are 4, 6, 8, ..., 100, and the corresponding iteration values are 0, 1, 2, ..., 48. The first three iterations of L along with the last one are shown below:

    S(4):    X(4) = X(4) + 1
    T(4):    Y(4) = X(4) + X(5) + X(6) + X(2)
    U(4):    Y(4) = Z(4)

    S(6):    X(6) = X(6) + 1
    T(6):    Y(8) = X(6) + X(7) + X(8) + X(4)
    U(6):    Y(6) = Z(6)

    S(8):    X(8) = X(8) + 1
    T(8):    Y(12) = X(8) + X(9) + X(10) + X(6)
    U(8):    Y(8) = Z(8)

      ...

    S(100):  X(100) = X(100) + 1
    T(100):  Y(196) = X(100) + X(101) + X(102) + X(98)
    U(100):  Y(100) = Z(100)

Figure 2.1: Statement dependence graph for Example 2.2.

(Statement instances are always labeled by index values.) From this pattern, it is easy to figure out the dependence structure of the program. Note the following facts:

1. S does not depend on itself.

2. Fix the output variable X(I) of S and take the input variables of T, one at a time, in the order of their appearance. We can make these observations:
   (a) X(I) of S and X(I) of T cause a flow dependence of T on S. This dependence is loop-independent; it has no loop-carried part. The only dependence distance is 0.
   (b) X(I) of S and X(I + 1) of T do not cause a dependence between S and T.
   (c) X(I) of S and X(I + 2) of T cause an anti-dependence of S on T. This dependence is loop-carried; it has no loop-independent part. The only dependence distance is 1.
   (d) X(I) of S and X(I - 2) of T cause a flow dependence of T on S that is also loop-carried, without any loop-independent part. Again, the only dependence distance is 1.

3. There is an output dependence of U on T. It has a loop-carried part and a loop-independent part. There are several dependence distances: 0, 1, 2, ....

4. U does not depend on S, but U depends on S indirectly.
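The facts listed above can be cross-checked by brute force. The Python sketch below is an illustration, not the book's method: it lists every statement instance of Example 2.2 in execution order together with the locations it reads and writes, then applies the three conditions of the definition of dependence literally. The names `trace` and `dependences` are hypothetical, a positive stride is assumed, and input dependence is left out.

```python
def trace(p=4, q=100, theta=2):
    """Statement instances of Example 2.2 in sequential execution order.

    Each entry is (statement, iteration value, locations read, locations written).
    Assumes a positive stride theta.
    """
    out = []
    for it, i in enumerate(range(p, q + 1, theta)):
        out.append(("S", it, {("X", i)}, {("X", i)}))
        out.append(("T", it,
                    {("X", i), ("X", i + 1), ("X", i + 2), ("X", i - 2)},
                    {("Y", 2 * i - 4)}))
        out.append(("U", it, {("Z", i)}, {("Y", i)}))
    return out


def dependences(tr):
    """Map (kind, source, sink) to the set of dependence distances."""
    deps = {}
    for a, (s, it_s, reads_s, writes_s) in enumerate(tr):
        for loc in reads_s | writes_s:
            # Scan the later instances in execution order (condition 2).
            for t, it_t, reads_t, writes_t in tr[a + 1:]:
                kinds = []
                if loc in writes_s and loc in reads_t:
                    kinds.append("flow")
                if loc in reads_s and loc in writes_t:
                    kinds.append("anti")
                if loc in writes_s and loc in writes_t:
                    kinds.append("output")
                for kind in kinds:
                    deps.setdefault((kind, s, t), set()).add(it_t - it_s)
                if loc in writes_t:
                    break  # condition 3: this write shields all later instances
    return deps


for key, distances in sorted(dependences(trace()).items()):
    print(key, sorted(distances))
```

Under these assumptions the only keys produced are a flow dependence of T on S, an anti-dependence of S on T (reported with source T and sink S), and an output dependence of U on T. Distances are differences of iteration values, in line with the definition of distance.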

The statement dependence graph for the program is given in Figure 2.1. Strictly speaking, this figure shows the superposition of the graphs of the relations $\delta^f$, $\delta^a$, $\delta^o$. (We have ignored input dependences between statements, and will usually do so in future examples.) Note the different markings on the edges: the triangle represents flow dependence, the dash anti-dependence, and the circle output dependence. If we want to show more detailed dependence information, then a more detailed graph will be necessary. For example, to emphasize that there are two pairs of program variables each of which causes a flow dependence of T on S, we would create a graph with two flow dependence edges from S to T, each labeled with the appropriate variables.

EXERCISES 2.3

1. Take two iteration points $\hat{\imath}$ and $\hat{\jmath}$ of the model loop L. Let i and j denote the corresponding index points. Let $d = \hat{\jmath} - \hat{\imath}$ and $d' = j - i$. How is $d'$ related to d? Explain the difficulties that may arise if the distance from a statement instance S(i) to a statement instance T(j) is defined to be $d'$.

2. Write down the set of all pairs of statement instances that constitute the flow dependence of T on S in Example 2.2. Show the different subsets that represent the flow dependences caused by different pairs of variables.

3. Write down the set of all pairs of statement instances that constitute the output dependence of U on T in Example 2.2. Give all the distances for this dependence.

4. Are there any input dependences between statements in Example 2.2?

5. Give a complete description of the dependence structure of the loop in Example 2.2 (including a dependence graph) after changing the loop-header to
   (a) L: do I = 4, 100, 1
   (b) L: do I = 4, 100, 3
   (c) L: do I = 100, 4, -2

2.4 Dependence Problem

Consider again the model loop L and two statements S and T in its body. Let u denote a variable of S and v a variable of T. Now, we address the practical problem of determining whether u and v cause a dependence between the statements: a dependence of T on S, or of S on T, or both. In this context, we are not interested in the type (flow, anti-, output, or input) of dependence. This book focuses on array elements, and here we consider the case where u and v are both elements of a given array X. Throughout this chapter, we assume that X is one-dimensional, unless otherwise specified.

Let u = X(f(I)) and v = X(g(I)), where f and g are real-valued functions that return integer values for integer values of I. When S < T, u is the output variable of S, and v an input variable of T, we may partially represent the model program as follows:

    L:  do I = p, q, θ
          ...
    S:    X(f(I)) = ...
          ...
    T:    ... = ... X(g(I)) ...
          ...
        enddo

Theorem 2.6 gives necessary conditions under which u and v will cause a dependence between S and T.

Theorem 2.6 Consider any two statements S and T in the loop L. Let X(f(I)) denote a variable of S and X(g(I)) a variable of T, where X is a one-dimensional array. If these variables cause a dependence between S and T, then the equation

$$f(p + \theta\hat{\imath}) - g(p + \theta\hat{\jmath}) = 0 \tag{2.6}$$

has an integer solution $(\hat{\imath}, \hat{\jmath})$ such that $0 \le$ …

… $> \tau_2$, then there is no dependence between statements S and T; terminate the algorithm.

12. [Find the set $\Psi_0$. From (2.13) we get

$$\hat{\jmath} - \hat{\imath} = (a_1 - b_1)t - c_1(\hat{\imath}_0 - \hat{\jmath}_0). \tag{2.14}$$

Since $a \ne b$, we have $a_1 \ne b_1$. Let $\xi$ denote the value of t for which $\hat{\imath} = \hat{\jmath}$. If $\xi$ is an integer between $\tau_1$ and $\tau_2$, then it is an element (the only element in this case) of $\Psi_0$.]
Set $\xi \leftarrow c_1(\hat{\imath}_0 - \hat{\jmath}_0)/(a_1 - b_1)$. If $\xi$ is an integer such that $\tau_1 \le \xi \le \tau_2$, then set

$$\Psi_0 \leftarrow \{(c_1\hat{\imath}_0 + b_1\xi,\; c_1\hat{\jmath}_0 + a_1\xi)\}.$$

13. [Find the sets $\Psi_1$ and $\Psi_{-1}$. The difference $(\hat{\jmath} - \hat{\imath})$ given by (2.14) is either positive for all $t < \xi$ and negative for all $t > \xi$, or negative for all $t < \xi$ and positive for all $t > \xi$. We compute the intersections $[\tau_1, \tau_2] \cap (-\infty, \xi)$ and $[\tau_1, \tau_2] \cap (\xi, \infty)$. See Algorithm 1.3.1 and figures 1.3.4, 1.3.5 for details.]
Set
$\tau_3 \leftarrow \lceil \xi - 1 \rceil$
$\tau_4 \leftarrow \lfloor \xi + 1 \rfloor$
$\tau_5 \leftarrow \min(\tau_2, \tau_3)$
$\tau_6 \leftarrow \max(\tau_1, \tau_4)$.
Select the proper case based on the sign of $(a_1 - b_1)$:

Case $(a_1 > b_1)$: If $\tau_6$ …
… $> 0$, continue.

2. $c_0 \leftarrow b_0 - a_0 + (b - a)p = -25$. Since $c_0 \bmod \theta = 0$, continue.

3. $c \leftarrow c_0/\theta = -5$.

4. $\Psi_1 \leftarrow \Psi_{-1} \leftarrow \Psi_0 \leftarrow \emptyset$.

5. Since $a \ne b$, go to Step 9.

9. [By Algorithm 2.1, find gcd(7, 3) and two integers $\hat{\imath}_0$, $\hat{\jmath}_0$ such that $7\hat{\imath}_0 - 3\hat{\jmath}_0 = \gcd(7, 3)$.] $g \leftarrow 1$, $(\hat{\imath}_0, \hat{\jmath}_0) \leftarrow (1, 2)$. Since $c \bmod g = 0$, continue.

10. $(a_1, b_1, c_1) \leftarrow (a/g, b/g, c/g) = (7, 3, -5)$.

11. Go to Case $(b_1 > 0)$. $\tau_1 \leftarrow \lceil -c_1\hat{\imath}_0/b_1 \rceil = 2$, $\tau_2 \leftarrow \lfloor (\hat{q} - c_1\hat{\imath}_0)/b_1 \rfloor = 14$. Go to Case $(a_1 > 0)$. $\tau_1 \leftarrow \max(\tau_1, \lceil -c_1\hat{\jmath}_0/a_1 \rceil) = 2$, $\tau_2 \leftarrow \min(\tau_2, \lfloor (\hat{q} - c_1\hat{\jmath}_0)/a_1 \rfloor) = 6$. Since $\tau_1 \le \tau_2$, continue.

12. $\xi \leftarrow c_1(\hat{\imath}_0 - \hat{\jmath}_0)/(a_1 - b_1) = 5/4$. [Since $\xi$ is neither an integer, nor does it lie between $\tau_1$ and $\tau_2$, the set $\Psi_0$ remains empty.]

13. $\tau_3 \leftarrow \lceil \xi - 1 \rceil = 1$, $\tau_4 \leftarrow \lfloor \xi + 1 \rfloor = 2$, $\tau_5 \leftarrow \min(\tau_2, \tau_3) = 1$, $\tau_6 \leftarrow \max(\tau_1, \tau_4) = 2$. Go to Case $(a_1 > b_1)$. Since $\tau_6 \le \tau_2$, set $\Psi_1 \leftarrow \{(-5 + 3t, -10 + 7t) : 2 \le t \le 6\}$.

14. Since $\Psi_1 \ne \emptyset$, there is a loop-carried dependence of statement T on statement S. The corresponding set of distances is $\{4t - 5 : 2 \le t \le 6\}$. Since the sets $\Psi_{-1}$ and $\Psi_0$ are empty, there is no loop-carried dependence of S on T, nor any loop-independent dependence between S and T.

We see that the pairs of iteration points that cause the dependence of T on S form the set

$$\Psi_1 = \{(1, 4), (4, 11), (7, 18), (10, 25), (13, 32)\}.$$

The corresponding set of pairs of index points is obtained by repeated applications of the relation $I = p + \hat{I}\theta$:

{(15, 30), (30, 65), (45, 100), (60, 135), (75, 170)}.

Thus, T(30) depends on S(15), T(65) depends on S(30), etc. There are five such pairs of statement instances. The set of dependence distances is {3, 7, 11, 15, 19}. (Note that dependence distances are computed from iteration points.)
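The numbers in this example can be reproduced with a short Python sketch. It is not Algorithm 2.2 itself: it handles only the case a > 0, b > 0 that the example exercises, it enumerates the parameter t instead of working through the case analysis, and the names `ext_gcd`, `ceil_div`, `solve`, and `last` are hypothetical. The largest iteration value is not recoverable from the text shown, so `last = 38` is an assumption; it is consistent with the bounds 14 and 6 computed in Step 11. The values p = 10 and θ = 5 are inferred from the index pairs listed above.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g = gcd(a, b); a, b > 0."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        quot, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - quot * x1
        y0, y1 = y1, y0 - quot * y1
    return a, x0, y0


def ceil_div(n, d):
    """Ceiling of n/d for integers with d > 0."""
    return -((-n) // d)


def solve(a, b, c, last):
    """Integer solutions (ih, jh) of a*ih - b*jh = c with 0 <= ih, jh <= last.

    Assumes a > 0 and b > 0. Returns three lists keyed by the sign of jh - ih:
    1 (T on S, loop-carried), -1 (S on T, loop-carried), 0 (loop-independent).
    """
    g, x, y = ext_gcd(a, b)      # a*x + b*y = g
    i0, j0 = x, -y               # hence a*i0 - b*j0 = g
    sets = {1: [], -1: [], 0: []}
    if c % g:
        return sets              # gcd test fails: no integer solution at all
    a1, b1, c1 = a // g, b // g, c // g
    # General solution: ih = c1*i0 + b1*t, jh = c1*j0 + a1*t.
    t_lo = max(ceil_div(-c1 * i0, b1), ceil_div(-c1 * j0, a1))
    t_hi = min((last - c1 * i0) // b1, (last - c1 * j0) // a1)
    for t in range(t_lo, t_hi + 1):
        ih, jh = c1 * i0 + b1 * t, c1 * j0 + a1 * t
        sets[(jh > ih) - (jh < ih)].append((ih, jh))
    return sets


sets = solve(7, 3, -5, last=38)                 # last = 38 is an assumption
print(sets[1])                                  # iteration pairs, T on S
print([j - i for i, j in sets[1]])              # dependence distances
print([(10 + 5 * i, 10 + 5 * j) for i, j in sets[1]])   # index pairs
```

With these inputs the sketch yields the same five iteration pairs, distances, and index pairs as the worked example, and leaves the other two sets empty.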

EXERCISES 2.5

1. (a) By Algorithm 2.1, find g = gcd(10, 14) and two integers $x_0$ and $y_0$, such that $10x_0 - 14y_0 = g$. Show all steps. Apply Theorem 2.8 to find the general solution to the equation 10x - 14y = 6, using these values of $x_0$ and $y_0$.
   (b) By trial and error, find a pair of integers $(x_1, y_1) \ne (x_0, y_0)$ with the same property: $10x_1 - 14y_1 = g$. Solve 10x - 14y = 6 again, this time using $(x_1, y_1)$ instead of $(x_0, y_0)$ in Theorem 2.8. Show that the general solution obtained here generates the same set of ordered pairs as generated by the general solution in the previous problem.
   (c) Find a formula that will generate all integer pairs (x, y) such that 10x - 14y = g.

2. In Theorem 2.8, show that irrespective of whether g divides c or not, the set of all real solutions to Equation (2.9) is given by

   $$x = (c/g)x_0 + (b/g)t$$
   $$y = (c/g)y_0 + (a/g)t,$$

   where t is a real parameter.

3. Write an algorithm such that given three integers a, b, and c, it decides if there is an integer solution $(x, y) \ge (0, 0)$ (i.e., $x \ge 0$ and $y \ge 0$) to the diophantine equation ax - by = c, and gives a formula to generate all such solutions when they exist. Apply your algorithm to three examples to demonstrate that the number of nonnegative solutions could be infinite, finite and positive, or zero.

4. In the context of Theorem 2.9, consider these three conditions:
   (a) $X(aI + a_0)$ and $X(bI + b_0)$ cause a dependence between S and T;
   (b) $(b_0 - a_0 + bp - ap)$ is an integral multiple of $\theta \cdot \gcd(a, b)$;
   (c) $(b_0 - a_0)$ is an integral multiple of gcd(a, b).
   Show that (a) $\Rightarrow$ (b) $\Rightarrow$ (c). (We already know about the first implication.) Give an example where (c) does not imply (b). Find conditions on loop parameters such that (b) $\Leftrightarrow$ (c) for all integers a, $a_0$, b, and $b_0$. While (b) is the gcd test for the dependence equation in terms of iteration values, we can think of (c) as the gcd test for the equation $ai - bj = b_0 - a_0$, that is, for the dependence equation in terms of index values. Which is the stronger test for dependence in general: the gcd test in terms of iteration values, or the gcd test in terms of index values? When are they equivalent? (Note that (b) does not involve the final loop limit q, and (c) does not involve any loop parameters.)

5. Apply Algorithm 2.2 to the following examples:

   (a) L:  do I = 3, 100, 1
       S:    X(9I + 22) = ...
       T:    ... = ... X(6I - 17) ...
           enddo

   (b) L:  do I = 0, 100, 2
       S:    X(3I) = ...
       T:    ... = ... X(36) ...
           enddo

   (c) L:  do I = 100, -10, -3
       S:    X(4I + 16) = ...
       T:    ... = ... X(4I - 4) ...
           enddo

   (d) L:  do I = 100, -10, -2
       S:    X(4I + 16) = ...
       T:    ... = ... X(4I - 4) ...
           enddo

   (e) L:  do I = 100, -10, -1
       S:    X(4I + 16) = ...
       T:    ... = ... X(4I - 4) ...
           enddo

   (f) L:  do I = -50, 100, 1
       S:    X(-2I + 25) = ...
       T:    ... = ... X(3I + 4) ...
           enddo

   (g) L:  do I = -300, 200, 3
       S:    X(6I - 17) = ...
       T:    ... = ... X(9I + 22) ...
           enddo

   (h) L:  do I = 0, 100, 1
       S:    X(2I, 2I + 1) = ...
       T:    ... = ... X(3I + 1, 3I + 2) ...
           enddo

6. In Algorithm 2.2, let $a \ne b$. Show that if $\Psi_0$ is nonempty, then it contains the single point $(c/(a - b), c/(a - b))$.

7. In the context of Algorithm 2.2, give examples such that of the three sets $\Psi_1$, $\Psi_{-1}$, and $\Psi_0$,
   (a) only one is nonempty (three examples, one for each set);
   (b) only two are nonempty (three examples, one for each combination);
   (c) all three are nonempty (one example).

8. Write a simpler version of Algorithm 2.2 for the case where $b = -a \ne 0$.

9. Consider the program

       L:  do I = p, q, θ
       S:    x = ...
       T:    ... = ... x ...
           enddo

   where x is a scalar. Can Algorithm 2.2 handle this dependence problem?

2.6 Method of Bounds

As we shall see, the dependence problem becomes more difficult when we move from a single loop to a more complicated program. The major difficulty arises from the fact that the general solution to the dependence equation usually contains more than one integer parameter, and therefore we get a system of inequalities with more than one integer variable when the solution is substituted in the dependence constraints (Step 4, Algorithm 1.1). Such a system of inequalities is not trivial to solve. We mentioned in Chapter 1 that for this reason, an approximate algorithm (Algorithm 1.2) is often used, where we test if there is an integer solution to the dependence equation without any constraints, and a real solution to the equation with the dependence constraints. We also discussed two implementations of that algorithm: the method of bounds and the method of elimination.

Since the dependence problem involving a one-dimensional array in a single loop is a simple problem, an application of the method of elimination (Algorithm 1.3) to it is uninteresting (why?). On the other hand, by applying the method of bounds to this problem, we get results that can be applied to a more general program. In this section, we introduce the method of bounds in terms of a single loop. In Chapter 6, this method will be studied in detail in the context of a nest of several loops, where it is normally used in practice. The method of bounds originated in [Bane 76].

Consider again the dependence problem between two statements S and T in a loop of the form

    L:  do I = p, q, θ
          H(I)
        enddo

posed by a variable $X(aI + a_0)$ of S and a variable $X(bI + b_0)$ of T. Write the dependence equation (2.11)

$$(a\theta)\hat{\imath} - (b\theta)\hat{\jmath} = b_0 - a_0 + (b - a)p$$

in the form

$$a\hat{\imath} - b\hat{\jmath} = c, \tag{2.15}$$

where $c = [b_0 - a_0 + (b - a)p]/\theta$. Also, rewrite the dependence constraints (2.12):

o-<j- j2, i3 = j3.

4.6. SPECIAL CASES

117

The set of all such solutions is

$$\{(i_1, j_1, i_2, j_2, i_3, j_3) : (i_1, j_1) \in \Psi_{1,1},\ (i_2, j_2) \in \Psi_{2,-1},\ (i_3, j_3) \in \Psi_{3,0}\}$$

which is clearly empty. Hence, T does not depend on S with the direction vector $(1, -1, 0)$.

Next, suppose we want to know if statement S depends on statement T with the direction vector $(1, -1, 0)$. Then the question is whether there is an integer solution $(i_1, j_1, i_2, j_2, i_3, j_3)$ to the system of equations (4.37)-(4.39) satisfying the constraints (4.40)-(4.42), such that

$$j_1 < i_1, \quad j_2 > i_2, \quad j_3 = i_3.$$

(Note that $(i_1, i_2, i_3)$ goes with S and $(j_1, j_2, j_3)$ with T, so that their roles are now reversed.) The set of all such solutions is

$$\{(i_1, j_1, i_2, j_2, i_3, j_3) : (i_1, j_1) \in \Psi_{1,-1},\ (i_2, j_2) \in \Psi_{2,1},\ (i_3, j_3) \in \Psi_{3,0}\}.$$

This set is nonempty so that S depends on T with the direction vector $(1, -1, 0)$. In fact, all instance pairs $(S(i_1, i_2, i_3), T(j_1, j_2, j_3))$, such that $i_1 > j_1$, $i_2 < j_2$, $i_3 = j_3$, and $S(i_1, i_2, i_3)$ depends on $T(j_1, j_2, j_3)$, form the set

$$\{(S(t_1 + 1, t_2, 5),\ T(t_1, 3t_2 - 3, 5)) : 0 \le t_1 \le 99,\ 2 \le t_2 \le 34\}.$$

The number of such pairs is $100 \times 33 = 3300$. The set of distance vectors for this dependence of S on T is

$$\{(1, 3 - 2t_2, 0) : 2 \le t_2 \le 34\} = \{(1, -1, 0), (1, -3, 0), \ldots, (1, -65, 0)\}.$$
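The parametric description above can be checked by direct enumeration. The following Python sketch (the function names are illustrative and do not come from the text) lists the instance pairs for $0 \le t_1 \le 99$, $2 \le t_2 \le 34$ and collects the distance vectors of the dependence of S on T, measured from the T instance to the S instance.

```python
def instance_pairs():
    """Enumerate the pairs (S(t1 + 1, t2, 5), T(t1, 3*t2 - 3, 5)) from the text."""
    pairs = []
    for t1 in range(0, 100):        # 0 <= t1 <= 99
        for t2 in range(2, 35):     # 2 <= t2 <= 34
            s_inst = (t1 + 1, t2, 5)
            t_inst = (t1, 3 * t2 - 3, 5)
            pairs.append((s_inst, t_inst))
    return pairs


def distance_vectors(pairs):
    """Distances for S depending on T: S components minus T components."""
    return {
        tuple(i - j for i, j in zip(s_inst, t_inst))
        for s_inst, t_inst in pairs
    }


pairs = instance_pairs()
distances = distance_vectors(pairs)
print(len(pairs))       # 3300
print(len(distances))   # 33
```

Every enumerated pair satisfies $i_1 > j_1$, $i_2 < j_2$, $i_3 = j_3$, which is the direction vector $(1, -1, 0)$ read from T to S.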

Special Case 3. A Generalization of Cases 1 and 2

Let $\hat{A} = EC$ and $\hat{B} = FC$, where $E$ and $F$ are $m \times m$ diagonal matrices, and $C$ is an $m \times n$ matrix. It is clear that this last case includes the first two. We get $\hat{A} = \hat{B}$ when $E$ and $F$ are equal to the identity matrix, while $\hat{A} = E$ and $\hat{B} = F$ when $C$ is the identity matrix. An example of this case is provided by the variables $X(I_3, 2I_1 - 1, 3I_2 - 1)$ and


CHAPTER 4. PERFECT LOOP NESTS

$X(5, 2I_1 + 1, I_2 + 2)$ in a rectangular triple loop, since their coefficient matrices have the forms $\operatorname{diag}(2, 3, 1) \cdot C$ and $\operatorname{diag}(2, 1, 0) \cdot C$, respectively.
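As a quick check of this factorization, the sketch below multiplies the two diagonal matrices by a single matrix C. The text does not display C here; the permutation matrix used below is an assumption, chosen under the convention that row r of a coefficient matrix holds the coefficients of $I_r$ and column k corresponds to the k-th subscript. The helper names are illustrative.

```python
def diag(*entries):
    """Square diagonal matrix with the given diagonal entries."""
    n = len(entries)
    return [[entries[r] if r == c else 0 for c in range(n)] for r in range(n)]


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


# Assumed permutation matrix C (not shown in the text).
C = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]

# Coefficient matrix of X(I3, 2*I1 - 1, 3*I2 - 1): rows I1, I2, I3.
A = [[0, 2, 0],
     [0, 0, 3],
     [1, 0, 0]]

# Coefficient matrix of X(5, 2*I1 + 1, I2 + 2): rows I1, I2, I3.
B = [[0, 2, 0],
     [0, 0, 1],
     [0, 0, 0]]

print(matmul(diag(2, 3, 1), C) == A)
print(matmul(diag(2, 1, 0), C) == B)
```

Both comparisons hold for this choice of C, so the two variables fit the pattern of Special Case 3 with the same C.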

In general, the analysis of Special Case 2 can be extended to include variables whose coefficient matrices are diagonal matrices postmultiplied by a permutation matrix.

As in Special Case 2, let

$$E = \operatorname{diag}(e_1, e_2, \ldots, e_m), \qquad F = \operatorname{diag}(f_1, f_2, \ldots, f_m),$$

and assume that for each $r$ in $1 \le r \le m$, at least one of $e_r$ and $f_r$ is nonzero. Then, we may choose the matrix $C$ in such a way that $\gcd(e_r, f_r) = 1$ for each $r$. The dependence equation (4.24) has the form

$$\hat{\imath}(EC) - \hat{\jmath}(FC) = \hat{b}_0 - \hat{a}_0.$$

It can be broken up into two separate systems:

$$\hat{\imath}E - \hat{\jmath}F = k \qquad (4.43)$$

$$kC = \hat{b}_0 - \hat{a}_0. \qquad (4.44)$$

For a given $k = (k_1, k_2, \ldots, k_m)$, the general solution to (4.43) can be written as

$$\hat{\imath} = (\hat{\imath}_{10}k_1 + f_1t_1,\ \hat{\imath}_{20}k_2 + f_2t_2,\ \ldots,\ \hat{\imath}_{m0}k_m + f_mt_m)$$

$$\hat{\jmath} = (\hat{\jmath}_{10}k_1 + e_1t_1,\ \hat{\jmath}_{20}k_2 + e_2t_2,\ \ldots,\ \hat{\jmath}_{m0}k_m + e_mt_m),$$

where $t_1, t_2, \ldots, t_m$ are integer parameters, and

$$e_r\hat{\imath}_{r0} - f_r\hat{\jmath}_{r0} = 1 \qquad (1 \le r \le m).$$

The constraints (4.25) can also be expressed in terms of $k_1, k_2, \ldots, k_m$ and $t_1, t_2, \ldots, t_m$.
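The general solution to (4.43) can be sketched in Python with the extended Euclidean algorithm. This is a minimal illustration, not the book's algorithm: the function names and the sample values of E, F, k, and t in the usage lines are hypothetical, and each pair $(e_r, f_r)$ is assumed to be coprime as stated above.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g and g == gcd(a, b) >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def particular(e, f):
    """Return (i0, j0) with e*i0 - f*j0 == 1; requires gcd(e, f) == 1."""
    g, x, y = extended_gcd(e, f)
    if g != 1:
        raise ValueError("e and f must be coprime")
    return x, -y


def general_solution(E, F, k, t):
    """Componentwise solution of i*E - j*F = k for diagonal E, F.

    E, F, k, t are sequences of equal length; t holds the free integer
    parameters t_1, ..., t_m.
    """
    i_hat, j_hat = [], []
    for e, f, k_r, t_r in zip(E, F, k, t):
        i0, j0 = particular(e, f)
        i_hat.append(i0 * k_r + f * t_r)
        j_hat.append(j0 * k_r + e * t_r)
    return i_hat, j_hat


# Hypothetical data: E = diag(2, 3, 1), F = diag(1, 1, 0).
E, F = (2, 3, 1), (1, 1, 0)
i_hat, j_hat = general_solution(E, F, k=(4, -2, 7), t=(5, 0, -3))
print(i_hat, j_hat)
```

Each component satisfies $e_r\hat{\imath}_r - f_r\hat{\jmath}_r = k_r$ for every choice of $t_r$, because the terms $e_rf_rt_r$ cancel.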


This kind of problem in a regular loop nest (where $\hat{Q}$ is the identity matrix) becomes a simple problem, if the matrix $C$ is such that Equation (4.44) has no solution or a unique solution in $k$. We omit the details; see Example 1.5.12. To see how to test whether the matrices $\hat{A}$ and $\hat{B}$ satisfy the condition defining Special Case 3, see Exercise 1.5.6.1.

EXERCISES 4.6

1. Consider the dependence equation

$$\hat{\imath}\hat{A} - \hat{\jmath}\hat{B} = \hat{b}_0 - \hat{a}_0.$$

If $\hat{A} = \hat{B}$, then for any given solution $(\hat{\imath}_0, \hat{\jmath}_0)$ and any given $m$-vector $h$, $(\hat{\imath}_0 + h, \hat{\jmath}_0 + h)$ is also a solution. Prove the converse: if the equation has this property, then $\hat{A} = \hat{B}$.

2. Find conditions on the iteration space of L such that it is finite and Theorem 4.9 is still valid.

3. Give an example of a nonuniform dependence with a unique distance vector.

4. Find the distance vectors, if any, of the dependence of T on S, S on T, and S on S in the following programs:

(a)

    L1:  do I1 = 0, 200, 2
    L2:    do I2 = 100 + I1, 0, -3
    S:       X(3I1 + I2 + 5, I1 + 2I2 + 3) = ...
    T:       ... = ... X(3I1 + I2 + 1, I1 + 2I2) ...
           enddo
         enddo

(b)

    L1:  do I1 = 0, 100, 3
    L2:    do I2 = I1, 50 + I1, 2
    S:       X(2I1 + 3I2 + 12) = ...
    T:       ... = ... X(2I1 + 3I2 - 5) ...
           enddo
         enddo

5. Decide if T depends on S at level 2 in the program

    L1:  do I1 = 0, 100, 1
    L2:    do I2 = 0, 50 + I1, 1
    S:       X(2I1 + 3I2 + 12) = ...
    T:       ... = ... X(2I1 + 3I2 - 5) ...
           enddo
         enddo

6. In the following programs, study the dependence of T on S, S on T, S on S, and T on T. Find all distance and direction vectors, and levels of dependence. For each valid direction vector, find the set of all pairs of statement instances that are involved in the dependence.


(a)

    L1:  do I1 = 10, 100, 2
    L2:    do I2 = 100, 0, -2
    L3:      do I3 = 20, 100, 1
    S:         X(2I1 - 1, 3I2 - 1, I3) = ...
    T:         ... = ... X(2I1 + 1, I2 + 2, 5) ...
             enddo
           enddo
         enddo

(b)

    L1:  do I1 = 1, 100, 2
    L2:    do I2 = 100, 0, -3
    S:       X(6I1 - 1, 2I2 - 1) = ...
    T:       ... = ... X(4I1 + 9, 2I2 + 9) ...
           enddo
         enddo

7. Discuss the possibility of extending the concepts and results of this section to a program that is not a perfect nest.

Chapter 5

General Program

5.1 Introduction

A sequence of loops forms a nest if each loop (except the first) is totally included within the previous loop. So far in the book, we have studied the dependence problem in a perfect loop nest, where the body of each loop (except the last) is the next loop in the sequence. The parameters we used to characterize a perfect loop nest (initial and final vectors, initial and final matrices, and stride matrix) can also be used to characterize an ordinary loop nest.

The model for this chapter is a program that consists of loops and assignment statements, where the loops are arbitrarily (but legally) nested. If necessary, we create a trivial loop with one iteration so that the whole program is a loop. This does not alter the concepts of dependence in any way, but helps simplify the notation. Each given statement in the program determines a loop nest; it is the sequence of all loops in the program containing that statement (counted from the outermost loop inward). If the program is a perfect loop nest, then the nests determined by all statements are the same. When we compare two statements in the current program model to test for dependence between them, we have to deal with two possibly different loop nests. This is how the dependence problem is to be generalized as we move from a perfect nest of loops to an arbitrary program.

Much of the needed formalism is already in place. We show in Section 5.2 how the dependence concepts can be easily extended from a perfect loop nest to a general program. The linear dependence problem in the general setting is formulated in Section 5.3. In Section 5.4, we discuss the generalized gcd test that is the basic test in dependence analysis.

5.2 Dependence Concepts

First, take any single assignment statement S in the program. Let $m_S$ denote the number of loops containing S and $L_S$ the loop nest determined by S. For $L_S$, let $I_S$ denote the index vector, $\hat{I}_S$ the iteration vector, $p_{S0}$ and $q_{S0}$ the initial and final vectors, $P_S$ and $Q_S$ the initial and final matrices, and $\Theta_S$ the stride matrix. The normalized final vector and the normalized final matrix of $L_S$ are denoted by $\hat{q}_{S0}$ and $\hat{Q}_S$, respectively, and they are computed using (4.10):

$$\hat{Q}_S = \Theta_S P_S^{-1} Q_S \Theta_S^{-1}, \qquad \hat{q}_{S0} = (q_{S0} - p_{S0} P_S^{-1} Q_S)\,\Theta_S^{-1}. \qquad (5.1)$$

Next, consider any two, not necessarily distinct, assignment statements S and T in the given program. If S lexically precedes T in the program, we write $S < T$. The meaning of the notation $S \le T$ is then clear, and it is obvious that $\le$ is a total order in the set of all assignment statements.

Let $m = m_{ST}$ denote the number of loops that contain both statements S and T, and L the nest formed by these loops. We have $m \le \min(m_S, m_T)$. Since the outermost $m$ loops of $L_S$ and $L_T$ are the same, the first $m$ elements of $p_{S0}$ and $p_{T0}$ are identical, and so are the first $m$ elements of $q_{S0}$ and $q_{T0}$. Also, the $m \times m$ submatrices formed by the topmost $m$ rows and the leftmost $m$ columns of the matrices $P_S$ and $P_T$ are the same, and so are the similar submatrices of the matrices $Q_S$ and $Q_T$.

Let us label the loops of the program $L_1, L_2, \ldots, L_{m_S + m_T - m}$, such that

$$L = (L_1, L_2, \ldots, L_m)$$

$$L_S = (L_1, L_2, \ldots, L_m, L_{m+1}, \ldots, L_{m_S})$$

$$L_T = (L_1, L_2, \ldots, L_m, L_{m_S+1}, \ldots, L_{m_S+m_T-m}).$$

A value $i = (i_1, i_2, \ldots, i_m, i_{m+1}, \ldots, i_{m_S})$ of the index vector $I_S$ determines an instance of statement S that is denoted by $S(i)$, and a value $j = (j_1, j_2, \ldots, j_m, j_{m_S+1}, \ldots, j_{m_S+m_T-m})$ of the index vector $I_T$ determines an instance of statement T that is denoted by $T(j)$. The distance from the instance $S(i)$ to the instance $T(j)$ is defined to be the $m$-vector $(\hat{\jmath}_1 - \hat{\imath}_1, \hat{\jmath}_2 - \hat{\imath}_2, \ldots, \hat{\jmath}_m - \hat{\imath}_m)$, where

$$\hat{\imath} = (\hat{\imath}_1, \ldots, \hat{\imath}_m, \hat{\imath}_{m+1}, \ldots, \hat{\imath}_{m_S})$$

$$\hat{\jmath} = (\hat{\jmath}_1, \ldots, \hat{\jmath}_m, \hat{\jmath}_{m_S+1}, \ldots, \hat{\jmath}_{m_S+m_T-m})$$

are the iteration vectors corresponding to $i$ and $j$, respectively. Thus, the distance is defined in terms of the first $m$ elements of the iteration vectors of the loop nests $L_S$ and $L_T$. In other words, the distance is defined¹ by the iterations of the common nest L.

An extension of Lemma 4.4 to the general case is stated below without proof.

Lemma 5.1 Consider two assignment statements S and T in the model program. Let $d$ denote the distance from an instance $S(i)$ of S to an instance $T(j)$ of T. In the sequential execution of the program, $S(i)$ is executed before $T(j)$ iff one of the following two conditions holds:

(a) $d \succ 0$,
(b) $d = 0$ and $S < T$.

… $d \succeq 0$. The equality $d = 0$ is possible only if $S < T$.

¹ Note that in a general program, the iteration points $\hat{\imath}$ and $\hat{\jmath}$ need not have the same number of elements, so that the difference $\hat{\jmath} - \hat{\imath}$ may not even make sense.
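The execution-order test of Lemma 5.1 can be sketched as follows. This is an illustration under stated assumptions: iteration vectors are plain tuples whose first m entries belong to the common nest, `s_precedes_t` encodes the strict lexical order $S < T$, and all function names are hypothetical.

```python
def lex_sign(d):
    """Return 1, -1 or 0 as d is lexicographically positive, negative or zero."""
    for x in d:
        if x > 0:
            return 1
        if x < 0:
            return -1
    return 0


def distance(i_hat, j_hat, m):
    """Distance from S(i) to T(j): uses only the m common-nest components."""
    return tuple(j_hat[r] - i_hat[r] for r in range(m))


def executes_before(i_hat, j_hat, m, s_precedes_t):
    """Lemma 5.1: S(i) runs before T(j) iff d > 0 lexicographically,
    or d == 0 and S lexically precedes T."""
    sign = lex_sign(distance(i_hat, j_hat, m))
    return sign > 0 or (sign == 0 and s_precedes_t)


# S lies in 3 loops, T in 4, and they share the outermost m = 2 loops.
print(executes_before((3, 1, 2), (3, 1, 0, 5), m=2, s_precedes_t=True))
print(executes_before((3, 1, 2), (2, 9, 0, 0), m=2, s_precedes_t=True))
```

The two iteration vectors may have different lengths; only the common prefix of length m enters the comparison, which is the point of the footnote above.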


Corollary 1 If $\sigma$ is a direction vector for the dependence of statement T on statement S in the general program, then $\sigma \succeq 0$. The equality $\sigma = 0$ is possible only if $S < T$.

Corollary 2 If $\ell$ is a dependence level for the dependence of statement T on statement S in the general program, then $1 \le \ell \le m + 1$ (where $m$ is the number of loops that contain both S and T). The value $\ell = m + 1$ is possible only if $S < T$.
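The two corollaries can be illustrated with a small Python sketch. It assumes the usual definitions, which this excerpt does not restate: the direction vector is the componentwise sign of a distance vector, and the level is the position of the first nonzero component, or $m + 1$ when every component is zero. The function names are illustrative.

```python
def sign(x):
    """Sign of an integer: 1, 0 or -1."""
    return (x > 0) - (x < 0)


def direction_vector(d):
    """Componentwise sign of a distance vector."""
    return tuple(sign(x) for x in d)


def level(sigma):
    """Position (1-based) of the first nonzero component, or m + 1 if none."""
    for position, component in enumerate(sigma, start=1):
        if component != 0:
            return position
    return len(sigma) + 1


sigma = direction_vector((1, -65, 0))
print(sigma, level(sigma))
print(level((0, 0)))
```

With $m = 2$, the zero direction vector gives level 3, the value $m + 1$ that Corollary 2 allows only when $S < T$.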

Example 5.1 Consider the program

    L1:  do I1 = 1, 100, 1
    L2:    do I2 = 1, I1, 2
    L3:      do I3 = I1, I1 + I2, 1
    S:         X(2I1 - 1, 3I2 + 1, 2I3) = ...
             enddo
    L4:      do I4 = 100, 0, -3
    L5:        do I5 = I1, I4, 1
    T:           ... = ... X(2I1 + 1, 4I2 + 6, I4 + I5) ...
               enddo
             enddo
           enddo
         enddo

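Statement S determines the loop nest $L_S = (L_1, L_2, L_3)$ with $m_S = 3$ loops. For this nest, we have:

$$\begin{aligned}
p_1 &= p_{10} = 1, \\
q_1 &= q_{10} = 100, \\
p_2 &= p_{20} + p_{21}I_1 = 1 + 0 \times I_1, \\
q_2 &= q_{20} + q_{21}I_1 = 0 + 1 \times I_1, \\
p_3 &= p_{30} + p_{31}I_1 + p_{32}I_2 = 0 + 1 \times I_1 + 0 \times I_2, \\
q_3 &= q_{30} + q_{31}I_1 + q_{32}I_2 = 0 + 1 \times I_1 + 1 \times I_2, \\
\theta_1 &= 1, \quad \theta_2 = 2, \quad \theta_3 = 1.
\end{aligned}$$

Hence, the initial and final vectors are

$$p_{S0} = (p_{10}, p_{20}, p_{30}) = (1, 1, 0), \qquad q_{S0} = (q_{10}, q_{20}, q_{30}) = (100, 0, 0);$$

and the initial, final, and stride matrices are

$$P_S = \begin{pmatrix} 1 & -p_{21} & -p_{31} \\ 0 & 1 & -p_{32} \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},$$

$$Q_S = \begin{pmatrix} 1 & -q_{21} & -q_{31} \\ 0 & 1 & -q_{32} \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & -1 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix},$$

$$\Theta_S = \begin{pmatrix} \theta_1 & 0 & 0 \\ 0 & \theta_2 & 0 \\ 0 & 0 & \theta_3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

We compute the normalized final vector and matrix:

$$\hat{q}_{S0} = (q_{S0} - p_{S0}P_S^{-1}Q_S)\,\Theta_S^{-1} = (99, 0, 1),$$

$$\hat{Q}_S = \Theta_S P_S^{-1} Q_S \Theta_S^{-1} = \begin{pmatrix} 1 & -1/2 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{pmatrix}.$$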
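The normalization (5.1) for this nest can be reproduced with exact rational arithmetic. The Python sketch below is only a numerical check: the helper names are illustrative, and the Gauss-Jordan inverse is a generic routine chosen for this sketch, not a method taken from the text.

```python
from fractions import Fraction


def matmul(X, Y):
    """Matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def diag(entries):
    """Diagonal matrix with the given entries."""
    n = len(entries)
    return [[Fraction(entries[r]) if r == c else Fraction(0) for c in range(n)]
            for r in range(n)]


def inverse(M):
    """Exact inverse of a nonsingular square matrix by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def normalize(p0, q0, P, Q, theta):
    """Return (q_hat0, Q_hat) as defined in (5.1)."""
    P_inv = inverse(P)
    T = diag(theta)
    T_inv = inverse(T)
    Q_hat = matmul(matmul(matmul(T, P_inv), Q), T_inv)
    shift = matmul(matmul([list(p0)], P_inv), Q)[0]
    diff = [Fraction(q) - s for q, s in zip(q0, shift)]
    q_hat0 = matmul([diff], T_inv)[0]
    return q_hat0, Q_hat


# Data of Example 5.1 for the nest of statement S.
P_S = [[1, 0, -1], [0, 1, 0], [0, 0, 1]]
Q_S = [[1, -1, -1], [0, 1, -1], [0, 0, 1]]
q_hat0, Q_hat = normalize((1, 1, 0), (100, 0, 0), P_S, Q_S, (1, 2, 1))
print(q_hat0)
print(Q_hat)
```

The results agree with the values $(99, 0, 1)$ and $\hat{Q}_S$ computed above, with the entry $-1/2$ represented exactly as a fraction.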

The iteration vector $\hat{I}_S = (\hat{I}_1, \hat{I}_2, \hat{I}_3)$ satisfies the constraints:

$$0 \le \hat{I}_S, \qquad \hat{I}_S\hat{Q}_S \le \hat{q}_{S0}$$

… $\sigma \succeq 0$ if $S < T$, and $\sigma \succ 0$ otherwise. Also, by Corollary 2 to the same theorem, we have $1 \le \ell \le m + 1$ if $S < T$, and $1 \le \ell \le m$ otherwise.

Theorem 5.4 Suppose that Equation (5.3) has an integer solution $(\hat{\imath}, \hat{\jmath})$ that satisfies the constraints (5.4)-(5.5). Let $\hat{\imath} = (\hat{\imath}_1, \hat{\imath}_2, \ldots, \hat{\imath}_{m_S})$ and $\hat{\jmath} = (\hat{\jmath}_1, \ldots, \hat{\jmath}_m, \hat{\jmath}_{m_S+1}, \ldots, \hat{\jmath}_{m_S+m_T-m})$.

(a) If $(\hat{\imath}_1, \hat{\imath}_2, \ldots, \hat{\imath}_m) \prec (\hat{\jmath}_1, \hat{\jmath}_2, \ldots, \hat{\jmath}_m)$, then $S\,\bar{\delta}\,T$.

(b) If $(\hat{\imath}_1, \hat{\imath}_2, \ldots, \hat{\imath}_m) \succ (\hat{\jmath}_1, \hat{\jmath}_2, \ldots, \hat{\jmath}_m)$, then $T\,\bar{\delta}\,S$.

(c) If $(\hat{\imath}_1, \hat{\imath}_2, \ldots, \hat{\imath}_m) = (\hat{\jmath}_1, \hat{\jmath}_2, \ldots, \hat{\jmath}_m)$ and $S < T$, then $S\,\bar{\delta}\,T$.

As always, we do not distinguish between dependence and indirect dependence. If there is an integer solution to (5.3), satisfying (5.4)-(5.5) and additional conditions, if any, then we assume that the


corresponding dependence exists. The dependence problem for the general program is defined as in Section 4.5.

Example 5.2 In this example, we state the dependence problem for the program of Example 5.1:

    L1:  do I1 = 1, 100, 1
    L2:    do I2 = 1, I1, 2
    L3:      do I3 = I1, I1 + I2, 1
    S:         X(2I1 - 1, 3I2 + 1, 2I3) = ...
             enddo
    L4:      do I4 = 100, 0, -3
    L5:        do I5 = I1, I4, 1
    T:           ... = ... X(2I1 + 1, 4I2 + 6, I4 + I5) ...
               enddo
             enddo
           enddo
         enddo

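Much of the computation has been done already. The variable of statement S can be written as $X((I_1, I_2, I_3)A + a_0)$, where

$$A = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 2 \end{pmatrix} \quad \text{and} \quad a_0 = (-1, 1, 0).$$

Its normalized form is $X((\hat{I}_1, \hat{I}_2, \hat{I}_3)\hat{A} + \hat{a}_0)$, where

$$\hat{A} = \Theta_S P_S^{-1} A = \begin{pmatrix} 2 & 0 & 2 \\ 0 & 6 & 0 \\ 0 & 0 & 2 \end{pmatrix}, \qquad \hat{a}_0 = p_{S0}P_S^{-1}A + a_0 = (1, 4, 2).$$

The variable of T can be written as $X((I_1, I_2, I_4, I_5)B + b_0)$, where

$$B = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \qquad b_0 = (1, 6, 0).$$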


5.3. DEPENDENCE PROBLEM

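Its normalized form is $X((\hat{I}_1, \hat{I}_2, \hat{I}_4, \hat{I}_5)\hat{B} + \hat{b}_0)$, where

$$\hat{B} = \Theta_T P_T^{-1} B = \begin{pmatrix} 2 & 0 & 1 \\ 0 & 8 & 0 \\ 0 & 0 & -3 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \hat{b}_0 = p_{T0}P_T^{-1}B + b_0 = (3, 10, 101).$$

The dependence equation (5.3) for this problem is

$$(\hat{\imath}_1, \hat{\imath}_2, \hat{\imath}_3)\begin{pmatrix} 2 & 0 & 2 \\ 0 & 6 & 0 \\ 0 & 0 & 2 \end{pmatrix} - (\hat{\jmath}_1, \hat{\jmath}_2, \hat{\jmath}_4, \hat{\jmath}_5)\begin{pmatrix} 2 & 0 & 1 \\ 0 & 8 & 0 \\ 0 & 0 & -3 \\ 0 & 0 & 1 \end{pmatrix} = (2, 6, 99). \qquad (5.7)$$

It consists of three scalar equations in 7 integer variables:

$$\begin{aligned}
2\hat{\imath}_1 - 2\hat{\jmath}_1 &= 2 \\
6\hat{\imath}_2 - 8\hat{\jmath}_2 &= 6 \\
2\hat{\imath}_1 + 2\hat{\imath}_3 - \hat{\jmath}_1 + 3\hat{\jmath}_4 - \hat{\jmath}_5 &= 99.
\end{aligned}$$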
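The normalized forms used in (5.7) can be cross-checked against the original subscripts. The Python sketch below maps iteration values back to index values using the loop headers of the program (for example $I_2 = 1 + 2\hat{\imath}_2$ and $I_4 = 100 - 3\hat{\jmath}_4$) and compares the subscripts with the affine forms built from the normalized matrices. The function names are illustrative, and the single iteration pair in the usage lines is a value worked out for this sketch, not one taken from the text.

```python
A_HAT = [[2, 0, 2], [0, 6, 0], [0, 0, 2]]
A0_HAT = (1, 4, 2)
B_HAT = [[2, 0, 1], [0, 8, 0], [0, 0, -3], [0, 0, 1]]
B0_HAT = (3, 10, 101)


def affine(v, M, c):
    """Row vector v times matrix M, plus the constant vector c."""
    return tuple(sum(v[r] * M[r][k] for r in range(len(v))) + c[k]
                 for k in range(len(c)))


def subscript_S(i_hat):
    """Subscripts of X in S, computed from index values."""
    i1, i2, i3 = i_hat
    I1 = 1 + i1
    I2 = 1 + 2 * i2
    I3 = I1 + i3
    return (2 * I1 - 1, 3 * I2 + 1, 2 * I3)


def subscript_T(j_hat):
    """Subscripts of X in T, computed from index values."""
    j1, j2, j4, j5 = j_hat
    I1 = 1 + j1
    I2 = 1 + 2 * j2
    I4 = 100 - 3 * j4
    I5 = I1 + j5
    return (2 * I1 + 1, 4 * I2 + 6, I4 + I5)


def residual(i_hat, j_hat):
    """Left side minus right side of (5.7); all zeros means the equation holds."""
    zero3 = (0, 0, 0)
    lhs = tuple(a - b for a, b in zip(affine(i_hat, A_HAT, zero3),
                                      affine(j_hat, B_HAT, zero3)))
    rhs = tuple(b - a for a, b in zip(A0_HAT, B0_HAT))
    return tuple(l - r for l, r in zip(lhs, rhs))


# Illustrative pair of iteration vectors chosen for this sketch.
i_hat, j_hat = (2, 1, 0), (1, 0, 32, 0)
print(subscript_S(i_hat), subscript_T(j_hat))
print(residual(i_hat, j_hat))
```

For this pair both statements reference the same element of X and the residual is zero, as expected from the equivalence between equal subscripts and equation (5.7).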

In Example 5.1, we found the dependence constraints (5.4)-(5.5) for this problem: $0 \le \hat{\imath}_1 \le 99$, $0 \le \hat{\imath}_2 \le \hat{\imath}_1/2$, 0
