ANNALS OF DISCRETE MATHEMATICS
General Editor: Peter L. HAMMER Rutgers University, New Brunswick, NJ, USA
Advisory Editors: C. BERGE, Universite de Paris, France; R.L. GRAHAM, AT&T Bell Laboratories, NJ, USA; M.A. HARRISON, University of California, Berkeley, CA, USA; V. KLEE, University of Washington, Seattle, WA, USA; J.H. VAN LINT, California Institute of Technology, Pasadena, CA, USA; G.C. ROTA, Massachusetts Institute of Technology, Cambridge, MA, USA; T. TROTTER, Arizona State University, Tempe, AZ, USA
54
SUBMODULAR FUNCTIONS AND ELECTRICAL NETWORKS
H. Narayanan
Department of Electrical Engineering Indian Institute of Technology at Bombay Bombay, India
1997
ELSEVIER AMSTERDAM
LAUSANNE
NEW YORK
OXFORD
SHANNON
TOKYO
ELSEVIER SCIENCE B.V. Sara Burgerhartstraat 25, P.O. Box 211, 1000 AE Amsterdam, The Netherlands
ISBN: 0 444 82523 1
© 1997 Elsevier Science B.V. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher, Elsevier Science B.V., Copyright & Permissions Department, P.O. Box 521, 1000 AM Amsterdam, The Netherlands. Special regulations for readers in the U.S.A. - This publication has been registered with the Copyright Clearance Center Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the U.S.A. All other copyright questions, including photocopying outside of the U.S.A., should be referred to the publisher. No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein.
This book is printed on acid-free paper. Printed in The Netherlands
Learning is a shoreless sea; the learner's days are few;
Prolonged study is beset with a thousand ills;
With clear discrimination learn what's meet for you,
Like swan that leaves the water, drinks the milk.

- Naladi, circa 800 A.D.
Preface

This book has grown out of an attempt to understand the role that the topology of an electrical network plays in its efficient analysis. The approach taken is to transform the problem of solving a network with a given topology to that of solving another with a different topology (and the same devices), but with additional inputs and constraints. An instance of this approach is network analysis by multiport decomposition - breaking up a network into multiports, solving these in terms of port variables and finally imposing the port connection conditions to obtain the complete solution. The motivation for our approach is that of building more efficient circuit simulators, whether they are to run singly or in parallel. Some of the ideas contained in the book have already been implemented - BITSIM, the general purpose circuit simulator built at the VLSI Design Centre, I.I.T. Bombay, is based on the 'topological hybrid analysis' contained in this book and can further be adapted to use topological decomposition ideas.

Many combinatorial optimization problems arise naturally when one adopts the above approach, particularly the hybrid rank problem and its generalizations. The theory required for the solution of these problems was developed by electrical engineers parallel to, and independent of, developments taking place in the theory of matroids and submodular functions. Consider, for instance, the work of Kishi and Kajitani, Iri, Ohtsuki et al. in the late 60's on principal partition and its applications, independent of Edmonds' work on matroid partitions (1965). There is a strong case for electrical network topologists and submodular function theorists being aware of each other's fields. It is hoped that the present book will fill this need.

The topological network analysis that we have considered is to be distinguished from the kind of work exemplified by 'Kirchhoff's Third Law', which has been discussed in many books published in the 60's (e.g. the book by Seshu and Reed [Seshu+Reed61]). In the 70's much interesting work in this area was done by Iri, Tomizawa, Recski and others using the 'generality assumption' for linear devices. Details may be found, for instance, in Recski's book [Recski89]. In the present book devices play a very secondary role. Mostly we manipulate only Kirchhoff's Laws.

Submodular functions are presented in this book adopting the 'elementary combinatorial' as opposed to the 'polyhedral' approach. Three things made us decide in favour of the former approach.
- It is hoped that the book would be read by designers of VLSI algorithms. In order to be convincing, the algorithms presented would have to be fast. So very general algorithms based on the polyhedral approach are ruled out.

- The polyhedral approach is not very natural to the material on Dilworth truncation.

- There is an excellent and comprehensive monograph, due to Fujishige, on the polyhedral approach to submodular functions; a book on polyhedral combinatorics including submodular functions from A. Schrijver is long awaited.
In order to make the book useful to a wider audience, the material on electrical networks and that on submodular functions are presented independently of each other. A final chapter on the hybrid rank problem displays the link. An area which can benefit from algorithms based on submodular functions is that of CAD for VLSI - particularly the building of partitioners. Some space has therefore been devoted to partitioning in the chapter on Dilworth truncation.
The book is intended primarily for self study - hence the large number of problems with solutions. However, most of the material has been tested in the classroom. The network theory part has been used for many years for an elective course on 'Advanced Network Analysis' - a third course on networks taken by senior undergraduates at the EE Dept., I.I.T. Bombay. The submodular function part has been used for special topics courses on combinatorics taken by doctoral students in Maths and Computer Science. This material can be covered in a semester if the students have a prior background in elementary graphs and matroids, leaving all the starred sections and relegating details and problems to self study.

It is a pleasure to acknowledge the author's indebtedness to his many colleagues, teachers and friends and to express his heartfelt gratitude. He was introduced to electrical network theory by Professors R.E. Bedford and K. Shankar of the EE Dept., I.I.T. Bombay, and to graph theory by Professor M.N. Vartak of the Dept. of Maths, I.I.T. Bombay. Professor Masao Iri, formerly of the University of Tokyo, now of Chuo University, has kept him abreast of the developments in applied matroid theory during the last two decades and has also generously spared time to comment on the viability of lines of research. He has benefited through interaction with the following: Professors S.D. Agashe, P.R. Bryant, A.N. Chandorkar, M. Chandramouli, C.A. Desoer, A. Diwan, S. Fujishige, P.L. Hammer, M.V. Hariharan, Y. Kajitani, M.V. Kamath, M.S. Kamath, E.L. Lawler, K.V.V. Murthy, T. Ozawa, S. Patkar, S.K. Pillai, P.G. Poonacha, G.N. Revankar, S. Roy, S.C. Sahasrabudhe, P.C. Sharma, M. Sohoni, V. Subbarao, N.J. Sudarshan, V.K. Tandon, N. Tomizawa, P.P. Varaiya, J.M. Vasi. The friends mentioned below have critically read parts of the manuscript: S. Batterywala, A. Diwan, N. Jayanthi, S. Patkar, P.G. Poonacha and the '96 batch students of the course 'Advanced Network Analysis'. But for Shabbir Batterywala's assistance (technical, editorial, software consultancy), publication of this book would have been delayed by many months.
Mr Z.A. Shirgaonkar has done the typing in LaTeX and Mr R.S. Patwardhan has drawn the figures.
The writing of this book was supported by a grant (HN/EE/TXT/95) from the C.D.P., I.I.T. Bombay. The author is grateful to his mother Lalitha Iyer, wife Jayanthi and son Hari for their continued encouragement and support.
Note to the Reader

This book appears long for two reasons:

- it is meant for self study - so it contains a large number of exercises and problems with solutions;

- it is aimed at three different types of readers:

  - electrical engineers interested in topological methods of network analysis;

  - engineers interested in submodular function theory;

  - researchers interested in the link between electrical networks and submodular functions.

To shorten the book for oneself it is not necessary to take recourse to drastic physical measures. During a first reading all starred sections, starred exercises and problems may be omitted. If the reader belongs to the first two categories mentioned above, she would find that only about two hundred pages have to be read. Sections, exercises and problems have been starred to indicate that they are not necessary for a first reading. The length of the solution is a fair indicator of the level of difficulty of a problem - a star does not indicate level of difficulty. There are only a handful of routine (drill type) exercises. Most of the others require some effort. Usually the problems are harder than the exercises. Many of the results, exercises, problems etc. in this book are well known but cannot easily be credited to any one author. Such results are marked with a '(k)'.

Electrical Engineers interested in topological methods
Such readers should first brush up on linear algebra (say the first two chapters of the book by Hoffman and Kunze [Hoffman+Kunze72]), read a bit of graph theory (say the chapter on Kirchhoff's laws in the book by Chua et al. [Chua+Desoer+Kuh87] and the first four chapters of the book by Narsingh Deo [Narsingh Deo74]) and then read Chapters 2 to 8. The chapter on graphs contains material on contraction and restriction which is not easily available in textbooks on circuit theory, but which is essential for an understanding of subsequent chapters. So this chapter should be read carefully, particularly since it is written tersely. The chapter on matroids is optional. The chapter on electrical networks should be easy reading but scanning it is essential since it fixes some notation used subsequently and also because it contains material motivating subsequent chapters, e.g. multiport decomposition. The next three chapters contain whatever the book has to say on topological network analysis.

Engineers interested in submodular functions

Such readers should read Chapters 2 to 4 and Chapters 9 to 13 and the first four sections of Chapter 14. If the reader is not interested in matroids he may skip material (chapters, sections, exercises, examples) dealing with them without serious loss of continuity. This would mean he would have to be satisfied with bipartite graph based instances of the general theory. The key chapter for such a reader is Chapter 9. This is tersely written - so it should be gone through carefully.

Researchers interested in the link between submodular functions and electrical networks

The key chapter for such a reader is Chapter 14. To read the first four sections of this chapter the reader has to be familiar with Chapters 5, 6, 7 from the electrical networks part and the unstarred sections of the chapters on submodular functions. If he has some prior familiarity with submodular functions and electrical networks it is possible to directly begin reading the chapter, picking up the required results on submodular functions as and when they are referred to in the text. To read the last section of the chapter, familiarity with Chapter 8 is required.
Comments on Notation

Sometimes, instead of numbering equations, key statements etc., we have marked them with symbols such as (*) and (**). These marks are used over and over again and have validity only within a local area such as a paragraph, a proof or the solution to a problem.
In some cases, where there is no room for confusion, the same symbol denotes different objects. For instance, usually B denotes a bipartite graph. But in Chapter 4, B denotes a base of a matroid - elsewhere a base is always denoted by b. The symbol E is used for the edge set of a graph, in particular a bipartite graph. But E(X), X ⊆ V(G), denotes the set of edges with both endpoints within X, while E_L(X), X ⊆ V_L, in the case of a bipartite graph, denotes the set of all vertices adjacent only to vertices in X.
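The two set functions E(X) and E_L(X) just defined are easy to write out directly. The following is only an illustrative sketch (the function names and the small graph are invented, and the convention on isolated vertices is an assumption, not taken from the book):

```python
def edges_within(edges, X):
    """E(X): edges with both endpoints inside the vertex set X."""
    return [(u, v) for (u, v) in edges if u in X and v in X]

def adjacent_only(left_neighbours, X):
    """E_L(X) for a bipartite graph: vertices of the right part all of
    whose neighbours lie in X, where X is a subset of V_L.
    Isolated vertices are excluded here (a chosen convention)."""
    X = set(X)
    return {r for r, nbrs in left_neighbours.items() if nbrs and set(nbrs) <= X}

edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(edges_within(edges, {1, 2, 3}))          # [(1, 2), (2, 3), (1, 3)]
left_neighbours = {'a': {1, 2}, 'b': {2}, 'c': {3}}
print(adjacent_only(left_neighbours, {1, 2}))  # {'a', 'b'}
```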
We have often used brackets to write two statements in one. Example: We say that set X is contained in Y (properly contained in Y), if every element of X is also a member of Y (every element of X is a member of Y and X ≠ Y) and denote it by X ⊆ Y (X ⊂ Y). This is to be read as the following two statements.

i. We say that set X is contained in Y, if every element of X is also a member of Y, and denote it by X ⊆ Y.

ii. We say that set X is properly contained in Y, if every element of X is a member of Y and X ≠ Y, and denote it by X ⊂ Y.
List of Commonly Used Symbols

Sets, Partitions, Partial Orders

{e_1, e_2, ..., e_n} : set whose elements are e_1, e_2, ..., e_n
{x_i, i ∈ I} : set whose members are x_i, i ∈ I
a family (used only in Chapters 2 and 11)
x ∈ X : element x belongs to set X
x ∉ X : element x does not belong to set X
∀x : for all elements x
∃x : there exists an element x
X ⊆ Y : set X is contained in set Y
X ⊂ Y : set X is properly contained in set Y
X ∪ Y : union of sets X and Y
X ∩ Y : intersection of sets X and Y
X ⊎ Y : disjoint union of sets X and Y
∪ X_i : union of the sets X_i
⊎ X_i : disjoint union of the sets X_i
f_t : the Dilworth truncation of f: f_t(∅) = 0, f_t(X) = min_{Π ∈ P_X} (Σ_{X_i ∈ Π} f(X_i)), where P_X is the collection of partitions of X
L_λ : collection of all partitions of S that minimize (f - λ)(·)
maximal and minimal member partitions in L_λ
(usually) decreasing sequence of critical PLP values of f(·)
principal sequence of partitions of f(·)
partition of Π with N_fus as one of its blocks iff the members of N_fus are the set of blocks of Π contained in a single block of Π' (Π_fus, a partition of Π)
a partition with N as a block, iff N is the union of all blocks of Π which are members of a single block of Π_fus
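For small ground sets the Dilworth truncation f_t can be computed by brute force straight from its definition. The sketch below is illustrative only (the function names are invented, and exhaustive enumeration of partitions is exponential, so this is not an efficient algorithm):

```python
def partitions(elements):
    """Yield all partitions of a list of elements, as lists of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for part in partitions(rest):
        # put `first` into each existing block in turn ...
        for i in range(len(part)):
            yield part[:i] + [part[i] + [first]] + part[i + 1:]
        # ... or into a new singleton block
        yield part + [[first]]

def dilworth_truncation(f, X):
    """f_t(X) = min over partitions {X_1, ..., X_k} of X of sum f(X_i)."""
    if not X:
        return 0
    return min(sum(f(frozenset(b)) for b in part)
               for part in partitions(list(X)))

# For a subadditive f a single block wins; for f(X) = |X|^2
# splitting into singletons wins.
print(dilworth_truncation(lambda S: len(S) + 1, {1, 2, 3}))   # 4
print(dilworth_truncation(lambda S: len(S) ** 2, {1, 2, 3}))  # 3
```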
Chapter 1

Introduction

Topological Methods

The methods described in this book could be used to study the properties of electrical networks that are independent of the device characteristic. We use only topological constraints, namely, KCL and KVL. Our methods could, therefore, be called 'network topological'. However, in the literature, 'topological' is used more loosely for all those results which use topological ideas, e.g. Kirchhoff's Third Law, where the admittance of a resistive multiport is obtained in terms of products of admittances present in all the trees and certain special kinds of subtrees of the network. These latter results, though important, are not touched upon in this book. Here our aim has been to

- give a detailed description of 'topological methods in the strict sense' for electrical networks,

- present applications:

  - to circuit simulation and circuit partitioning,

  - to establish relations between the optimization problems that arise naturally, while using these methods, and the central problems in the theory of submodular functions.
Applications

There are two kinds of applications possible for the approach taken in this book:

i. To build better (faster, numerically more rugged, parallelizable) circuit simulators. Typically, our methods will permit us to speak as follows.
'Solution of a network N containing arbitrary devices is equivalent to solution of topologically derived networks N_1, ..., N_k under additional topological conditions.'

An obvious application would be for the (coarse grained) parallelization of circuit simulation. We could have a number of machines M_1, ..., M_k which could run general/special purpose circuit simulation of the derived networks N_1, ..., N_k. The central processor could combine their solutions using the additional topological conditions. Optimization problems would arise naturally, e.g. 'how to minimize the additional topological conditions?'

There are more immediate applications possible. The most popular general purpose simulator now running, SPICE, uses the modified nodal analysis approach. In this approach the devices are divided into two classes: generalized admittance type, whose currents can be written in terms of voltages appearing somewhere in the circuit, and the remaining devices, whose current variables will figure in the list of unknowns. The final variables in terms of which the solution is carried out would be the set of all nodal voltages and the above mentioned current variables. The resulting coefficient matrix is very sparse but suffers from the following defects:

- the matrix often has diagonal zeros;

- even for pure RLC circuits the coefficient matrix is not positive definite;

- if the subnetwork containing the admittance devices is disconnected, then the corresponding principal submatrix is singular.

These problems are not very severe if we resort to sparse LU methods [Hajj81]. However, it is generally accepted that for large enough networks (≈ 5000 nodes) preconditioned conjugate gradient methods would prove superior to sparse LU techniques. The main advantage of the former is that if the matrix is close to a positive definite matrix, then we can bound the number of iterations. The above defects make MNA ill suited to conjugate gradient methods. There is a simple way out - viz. to use hybrid analysis (partly loop and partly nodal), where we partition elements into admittance type and impedance type. The structure of the coefficient matrix that is obtained in this latter case is well suited to solution by the conjugate gradient technique but could easily, for a wrong choice of variables, be dense. A good way of making the matrices sparse is to use the result that we call the 'N_AL - N_BK theorem' (see Section 6.4). Here the network is decomposed into two derived networks whose solution under additional topological (boundary) conditions is always equivalent to the solution of the original network. We select N_AL so that it contains the admittance type elements and N_BK so that it contains the impedance type elements. We then write nodal equations for N_AL and generalized mesh type equations for N_BK. The result is a sparse matrix with good structure for using conjugate gradient methods - for instance for RLC networks, after discretization, we would get a positive definite matrix, and for most practical networks a large submatrix would be positive definite. A general purpose simulator BITSIM has been built using these ideas [Roy+Gaitonde+Narayanan90].
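The conjugate gradient method referred to above can be sketched in a few lines. The following minimal, unpreconditioned implementation (illustrative only, not BITSIM's code) works for symmetric positive definite systems, which is precisely why positive definiteness of the coefficient matrix matters:

```python
def cg(A, b, tol=1e-12, max_iter=100):
    """Solve A x = b by the conjugate gradient method.
    A is a symmetric positive definite matrix, given as a list of rows."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                        # residual b - A x (since x = 0)
    p = r[:]                        # search direction
    rs = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol:
            break
        p = [r[i] + (rs_new / rs) * p[i] for i in range(n)]
        rs = rs_new
    return x

# A positive definite 2x2 example; CG converges in at most n steps
# in exact arithmetic.
x = cg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```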
The application to circuit partitioning arises as a byproduct when we try to solve a version of the hybrid rank problem using the operation of Dilworth truncation on submodular functions. Many problems in the area of CAD for VLSI need the underlying graph/hypergraph to be partitioned such that the 'interaction' between blocks is minimized. For instance, we may have to partition the vertex set of a graph so that the number of lines going between blocks is a minimum. This kind of problem is invariably NP-hard. But, using the idea of principal lattice of partitions (PLP), we can solve a relaxation of such problems exactly. This solution can then be converted to an approximate solution of the original problem [Narayanan91], [Roy+Narayanan91], [Patkar92], [Roy93], [Roy+Narayanan93], [Narayanan+Roy+Patkar96].
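The objective being minimized in the vertex partitioning problem just mentioned is simple to state in code. A small illustrative sketch (names invented) counting the lines going between blocks of a vertex partition:

```python
def crossing_lines(edges, blocks):
    """Count edges whose endpoints lie in different blocks of a vertex
    partition - the 'interaction' to be minimized."""
    block_of = {v: i for i, blk in enumerate(blocks) for v in blk}
    return sum(1 for (u, v) in edges if block_of[u] != block_of[v])

edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
print(crossing_lines(edges, [{1, 2}, {3, 4}]))  # 3
```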
ii. A second kind of application is to establish strong relationships between electrical networks and combinatorial optimization, in particular submodular function theory. There are a number of optimization problems which arise when we view electrical networks from a topological point of view. These motivate, and are solved by, important concepts such as convolution and Dilworth truncation of submodular functions. The hybrid rank problem and its generalizations are important instances. Other algorithmic problems (though not entirely topological) include the solvability of electrical networks under 'generality' conditions (see for instance [Recski+Iri80]). It is no longer possible for electrical engineers to directly apply well established mathematical concepts. They themselves often have to work out the required ideas. The principal partition is a good instance of such an idea conceived by electrical engineers. A nice way of developing submodular function theory, it appears to the author, is to look for solutions to problems that electrical networks throw up.

We now present three examples which illustrate the concepts that we will be concerned with in network analysis. The following informal rule should be kept in mind while reading the examples (see Theorem 6.3.1 and also the remark on page 179).
Let N be an electrical network (not necessarily linear) with the set of independent current sources E_J and the set of independent voltage sources E_V. We assume that the independent source values do not affect the device characteristic of the remaining devices. Then, the structure of the constraints of the network, in any method of analysis (as far as variables other than voltage source currents and current source voltages are concerned), is that corresponding to setting the independent sources to zero, i.e., short circuiting voltage sources and open circuiting current sources. In particular, for linear networks, the structure of the coefficient matrix multiplying the unknown vector is that corresponding to the network obtained by short circuiting the voltage sources and open circuiting the current sources.

Example 1.0.1 The N_AL - N_BK method: Consider the electrical network whose graph is given in Figure 1.1. We assume that the devices associated with branches {1, 2, 3, 4, 5} (= A) are independent of
Figure 1.1: To illustrate the N_AL - N_BK Method
those associated with branches {6, 7, 8, 9, 10, 11} (= B). Then we can show that computing the solution of the network N in Figure 1.1 is always equivalent to the simultaneous computation of the solutions of the networks N_AL, N_BK, in the same figure, under the boundary conditions

i_11 in N_AL = i_11 in N_BK,

v_5 in N_AL = v_5 in N_BK.
Here, in N_AL, the devices in A are identical to the corresponding devices in N. Similarly, in N_BK, the devices in B are identical to the corresponding devices in N. The subset L ⊆ B is a set of branches which, when deleted, breaks all circuits intersecting both A and B. The subset K ⊆ A is a set of branches which, when contracted, destroys all circuits intersecting both A and B. The graph of N_AL is obtained from that of N by short circuiting the branches of B - L. We denote it by G × (A ∪ L). In this case L = {11}. The graph of N_BK is obtained from that of N by open circuiting the branches of A - K. We denote it by G · (B ∪ K). In this case K = {5}.
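The two constructions just described, open circuiting (deletion) and short circuiting (contraction), can be sketched on an edge-list representation of a graph. This is only an illustrative sketch (the function names and the toy graph are invented; the graph of Figure 1.1 is not reproduced here); contraction merges endpoints with a union-find structure:

```python
def delete(edges, S):
    """Open circuit (delete) the edges whose labels are in S."""
    return {e: uv for e, uv in edges.items() if e not in S}

def contract(edges, S):
    """Short circuit (contract) the edges in S: merge their endpoints,
    then relabel the remaining edges by the merged vertices."""
    parent = {}
    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path compression
            v = parent[v]
        return v
    for e in S:
        u, v = edges[e]
        parent[find(u)] = find(v)
    return {e: (find(u), find(v)) for e, (u, v) in edges.items() if e not in S}

G = {1: ('a', 'b'), 2: ('b', 'c'), 3: ('c', 'a'), 4: ('c', 'd')}
print(contract(G, {2}))  # edge 2 gone, b and c merged into one vertex
print(delete(G, {4}))    # edge 4 gone, everything else untouched
```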
If the network is linear and A and B are of conductance and impedance type respectively, then we can, if we choose, solve N_AL by nodal analysis and N_BK by loop analysis. So this method can be regarded as a topological generalization of 'hybrid analysis'.

If we so desire, we could try to choose N_AL or N_BK such that they appear (when
Figure 1.2: A Network to be decomposed into Multiports
i_L, v_K are set to zero) in several electrically disconnected pieces. So the method can be regarded as a technique of 'Network Analysis by Decomposition'. Now we mention some related combinatorial optimization problems.

i. Given a partition of the edges into A and B, how to choose L, K minimally - this is easy.

ii. Suppose the network permits arbitrary partitions into A and B and we choose nodal variables for N_AL and loop variables for N_BK. Which partition would give the coefficient matrix of least size?

It can be shown that the size of the coefficient matrix is r(G · A) + ν(G × B), where r(G · A), ν(G × B) respectively denote the rank of G · A and the nullity of the graph G × B. Minimization of this expression, over all partitions {A, B} of the edge set E(G) of G, is the hybrid rank problem which gave rise to the theory of principal partition.

Example 1.0.2 Multiport Decomposition: Let N be an electrical network with the graph shown in Figure 1.2. We are given that A = {1, 2, ..., 10} and B = {11, ..., 24} (with devices in A and B decoupled). The problem is to split N into two multiports N_AP1, N_BP2 and a 'port connection diagram' N_P1P2, and solve N by solving N_AP1, N_BP2, N_P1P2 simultaneously. (In general this would be a problem involving n multiports.) It is desirable to choose
Figure 1.3: Decomposition into Multiports
P_1, P_2 minimally. It turns out that the minimum number of ports equals r(G · A) - r(G × A). (Here G · A is obtained by open circuiting edges in B, while G × A is obtained by short circuiting edges in B.) In this case this number is 1. The multiports are shown in Figure 1.3.

The general solution procedure using multiport decomposition is as follows. Find the voltage-current relationship imposed on P_1 by the rest of the network in N_AP1, and on P_2 by the rest of the network in N_BP2. This involves solution of N_AP1, N_BP2 in terms of some of the current/voltage port variables of N_AP1 and some of the current/voltage port variables of N_BP2. The voltage-current relationships imposed on P_1, P_2 (as described above) are treated as their device characteristics in the network N_P1P2. When this is solved, we get the currents and voltages of P_1, P_2. Networks N_AP1, N_BP2 have already been solved in terms of these variables. So this completes the solution of N.

Like the N_AL - N_BK method (to which it is related), this is also a general method independent of the type of network. As before, the technique is more useful when the network is linear. This method again may be used as a network decomposition technique (for parallelizing) at a different level. Suppose N_AP1 (or N_BP2) splits into several subnetworks when some of the branches P_o1 (P_o2) of P_1 (P_2) are opened and others P_s1 (P_s2) shorted. Then, by using i_Po1 (i_Po2), v_Ps1 (v_Ps2) as variables in terms of which N_AP1 (N_BP2) are solved, we can make the analysis look like the simultaneous solution of several subnetworks under boundary conditions. There is no restriction on the type of network - we only need the subnetworks to be decoupled in the device characteristic.

The optimization problem that arises naturally in this case is the following: Given a partition of the edges of a network N into E_1, ..., E_k, find a collection of multiports N_E1P1, ..., N_EkPk and a port connection diagram N_P1,...,Pk, whose combined KCE and KVE are equivalent to those of N, with the size of P_i a minimum under these conditions. This problem is solved in Chapter 8.
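For small examples, both the hybrid rank expression r(G · A) + ν(G × B) of Example 1.0.1 and a two-block port count can be evaluated by brute force from the graphic matroid rank function. The sketch below is illustrative only (function names invented); it assumes the standard graphic matroid identities r(G · A) = r(A) and r(G × A) = r(E) - r(B), which give r(G · A) + ν(G × B) = 2 r(A) + |B| - r(E) and r(G · A) - r(G × A) = r(A) + r(B) - r(E):

```python
from itertools import combinations

def rank(edges):
    """Graphic matroid rank: size of a spanning forest of the subgraph
    formed by `edges`, computed with a union-find."""
    parent = {}
    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    r = 0
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            r += 1
    return r

def hybrid_rank(edges):
    """min over partitions {A, B} of r(G.A) + nu(G x B),
    i.e. min over A of 2 r(A) + |B| - r(E)."""
    E, rE = list(edges), rank(edges)
    return min(2 * rank(A) + (len(E) - k) - rE
               for k in range(len(E) + 1)
               for A in combinations(E, k))

def min_ports(edges, A):
    """Minimum ports for a two-block decomposition with blocks A and
    B = E - A: r(G.A) - r(G x A) = r(A) + r(B) - r(E)."""
    B = [e for e in edges if e not in A]
    return rank(A) + rank(B) - rank(edges)

# A single circuit (triangle) has hybrid rank 1: one loop variable.
print(hybrid_rank([(0, 1), (1, 2), (2, 0)]))                 # 1
# Two halves of a 4-cycle meet in two vertices: one port suffices.
print(min_ports([(0, 1), (1, 2), (2, 3), (3, 0)],
                [(0, 1), (1, 2)]))                           # 1
```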
Remark: At an informal level multiport decomposition is an important technique in classical network theory, e.g. the Thevenin-Norton Theorem, extracting reactances in synthesis, extracting nonlinear elements in nonlinear circuit theory, etc. However, for the kind of topological theory to be discussed in the succeeding pages we need a formal definition of ports that will carry over to vector spaces from graphs. Otherwise the minimization problems cannot be stated clearly, let alone be solved. In the example described above, it is clear that if we match N_AP1 and N_BP2 along P_1, P_2 we do not get back N. A purely graph theoretic definition of multiport decomposition would therefore not permit the decomposition given in this particular example. Such a definition would lead to optimization problems with additional constraints which have no relevance for network analysis. Further, even after optimization according to such a definition, we would end up with more ports than required.

Example 1.0.3 Fusion-Fission method: Consider the network in Figure 1.4. Six subnetworks have been connected together to make up the network. Assume that the devices in the subnetworks are decoupled. Clearly the networks in Figure 1.4 and Figure 1.5 are equivalent, provided the current through the additional unknown voltage source and the voltage across the additional unknown current source are set equal to zero. But the network in Figure 1.5 is equivalent to that in Figure 1.6 under the additional conditions

i_v1 + i_v2 + i_v3 + i = 0.

As can be seen, the subnetworks of Figure 1.6 are decoupled except for the common variables v and i and the additional conditions. A natural optimization problem here is the following: Given a partition of the edges of a graph into E_1, ..., E_k, what is the minimum size set of node pair fusions and node fissions by which all circuits passing through more than one E_i are destroyed?
Figure 1.5: A Network equivalent to N with Virtual Sources
Figure 1.6: Network N decomposed by the Fusion-Fission Method
In the present example the optimal set of operations is to fuse nodes a and b and to cut node a into a_1, a_2 as in Figure 1.5. Artificial voltage sources are introduced across the node pairs to be fused and artificial current sources are introduced between the two halves of a split node.
It can be shown that this problem generalizes the hybrid rank problem (see Section 14.4). Its solution involves the use of the Dilworth truncation operation on an appropriate submodular function.

We now speak briefly of the mathematical methods needed to derive the kind of results hinted at in the above examples. The N_AL - N_BK method needs systematic use of the operations of contraction and restriction, both for graphs and vector spaces, and the notion of duality of operations on vector spaces. These have been discussed in detail in Chapter 3. The N_AL - N_BK method itself is discussed in Chapter 6. The multiport decomposition method requires the use of the 'Implicit Duality Theorem'. This result, which should be regarded as a part of network theory folklore, has received too little attention in the literature. We have tried to make amends by devoting a full chapter to it. The optimization problem relevant to multiport decomposition ('port minimization') is discussed in Chapter 8. The fusion-fission method is a special case of the method of topological transformations discussed in Chapter 7. The solution of the optimization problem that
it gives rise to (minimization of the number of fusion and fission operations needed to electrically decouple the blocks of a partition of the edge set) is given in Section 14.4. The solution uses the Dilworth truncation operation on submodular functions.
We next give a chapterwise outline of the book. Chapter 2 is concerned with mathematical preliminaries such as sets, families, vectors and matrices. Also given is a very brief description of inequality systems. Chapter 3 contains a very terse description of graphs and their vector space representation. Only material that we need later on in this book is included. Emphasis is placed on complementary orthogonality (Tellegen’s Theorem) and the important minor operations linked through duality. The duality described corresponds to complementary orthogonality of vector spaces (and not to the vector space - functional space relation).
Also included is a sketch of the basic algorithms relevant to this book - such as bfs and dfs trees, construction of f-circuits, the shortest path algorithm, algorithms for performing graph minor operations, and the basic join and meet operations on partitions. Some space is devoted to the flow maximization problem, particularly certain special ones that are associated with a bipartite graph. (Many of the optimization problems considered in this book reduce ultimately to (perhaps repeated) flow maximization.)

Chapter 4 gives a brief account of matroids. Important axiom systems such as the ones in terms of independence, circuit, rank, closure etc. are presented and shown to be equivalent to each other. The minor operations and dual matroids are described. Finally the relation between matroids and the greedy algorithm is presented. This chapter is included for two reasons:
i. some of the notions presented in the previous chapter lead very naturally to their extension to matroids;

ii. matroids are perhaps the most important instance of submodular functions, which latter is our main preoccupation in the second half of this book.

Chapter 5 contains a brief introduction to conventional electrical network theory, with the aim of making the book self contained. The intention here is also to indicate the author's point of view to a reader who is an electrical engineer. This chapter contains a rapid sketch of the basic methods of network analysis including a very short description of the procedure followed in general purpose circuit simulators. Also included is an informal account of multiport decomposition and of some elementary results including the Thevenin-Norton Theorem. Chapter 6 contains a description of the topological hybrid analysis indicated in Example 1.0.1. This chapter is a formalization of the topological ideas behind Kron's Diakoptics. The methods used involve vector space minors. The main result is
Theorem 6.4.1, which has already been illustrated in the above-mentioned example. Chapter 7 contains a detailed description of the Implicit Duality Theorem, its applications and its extensions to linear inequality and linear integrality systems. The operation of generalized minor is introduced and made use of in this chapter. The implicit duality theorem was originally a theorem on ideal transformers and states that if we connect 2-port transformers arbitrarily and expose k ports, the result would be a k-port ideal transformer. (An ideal transformer, by definition, has its possible port voltage vectors and possible port current vectors as complementary orthogonal spaces.) We show that its power extends beyond these original boundaries. One of the applications described is to the construction of adjoints, another to topological transformations of electrical networks. The latter are used to solve a given network as though it had the topology of a different network, paying a certain cost in terms of additional variables. Multiport decomposition, from a topological point of view, is the subject of Chapter 8. We make free use of the Implicit Duality Theorem of the previous chapter. We indicate that multiport decomposition is perhaps the most natural tool for network analysis by decomposition. It can be shown that multiport decomposition generalizes topological hybrid analysis (see Problem 8.5). We present a few algorithms for minimizing the number of ports for a multiport decomposition corresponding to a given partition of edges of a graph. Finally, we show that this kind of decomposition can be used to construct reduced networks which mimic some of the properties of the original network. In particular we show that any RLMC network can be reduced to a network without zero eigenvalues (i.e., without trapped voltages or currents) but with, otherwise, the same 'dynamics' as the original network.
The second half of the book is about submodular functions and the link between them and electrical networks. Chapter 9 contains a compact description of submodular function theory omitting the important operations of convolution and Dilworth truncation. (The latter are developed in subsequent chapters.) We begin with the basic definition and some characteristic properties followed by a number of examples of submodular functions which arise in graphs, hypergraphs (represented by bipartite graphs), matrices etc. Basic operations such as contraction, restriction, fusion, dualization etc. are described next. These are immediately illustrated by examples from graphs and bipartite graphs. Some other operations, slightly peripheral, are described next. A section is devoted to the important cases of polymatroid and matroid rank functions. It is shown that any submodular function is a 'translate' through a modular function of a polymatroid rank function. The idea of connectedness is described next. This corresponds to 2-connectedness of graphs. After this there is a very brief but general description of polyhedra associated with set functions in general and with submodular and supermodular functions in particular. The important result
due to Frank, usually called the 'Sandwich Theorem', is described in this section. The recent solution, due to Stoer, Wagner and Frank, of the symmetric submodular function minimization problem is described in the next section.
Chapter 10 is devoted to the operation of (lower) convolution of two submodular functions. We begin with purely formal properties and follow them with a number of examples of results from the literature which the operation of convolution unifies. Next we give the polyhedral interpretation for convolution, viz. it corresponds to the intersection of the polyhedra of the interacting submodular functions. This is followed by a section in which the operation of convolution is used to show that every polymatroid rank function can be obtained by the fusion of an appropriate matroid rank function. In the next section, the principal partition (PP) of a submodular function with respect to a strictly increasing polymatroid rank function is dealt with. We begin with the basic properties of PP which give structural insight into many practical problems. An alternative development of PP from the point of view of density of sets is next presented. Finally the PP of naturally derived submodular functions is related to the PP of the original function. In the next section, the refined partial order associated with the PP is described. After this we present general algorithms for the construction of the PP of a submodular function with respect to a nonnegative weight function. These use submodular function minimization as a basic subroutine. We consider two important special cases of this algorithm. The first, the weighted left adjacency function of a bipartite graph, is described in this chapter. In this case the submodular function minimization reduces to a flow problem. The second is the PP of a matroid rank function which is taken up in the next chapter. The last (starred) section in this chapter describes a peculiar situation where, performing certain operations on the original submodular function, we get functions whose PP is related in a very simple way to the original PP. This section is developed through problems. Chapter 11 is on the matroid union operation.
In the first section, we give a sketch of submodular functions induced through a bipartite graph and end the
section with a proof that 'union' of matroids is a matroid. Next we give Edmonds' algorithm for constructing the matroid union. We use this algorithm to study the structure of the union matroid - in particular the natural partition of its underlying set into coloops and the complement, and the manner in which the base of the union is built in terms of the bases of the individual matroids. Finally we use the matroid union algorithm to construct the PP of the rank function of a matroid with respect to the '| · |' (cardinality) function. In Chapter 12 we study the Dilworth truncation operation on a submodular function. This chapter is written in a manner that emphasizes the structural analogies that exist between convolution and Dilworth truncation. As in the case of convolution, we begin with formal properties and follow them with examples of results
from the literature unified by the truncation operation. In the next section, we describe the principal lattice of partitions (PLP) of a submodular function. This notion is analogous to the PP of a submodular function - whereas in the case of the PP there is a nesting of special sets, in the case of the PLP the special partitions get increasingly finer. We begin with basic properties of the PLP, each of which can be regarded as a 'translation' of a corresponding property of the PP. We then present an alternative development of the PLP in terms of cost of partitioning. In the next section we use this idea for building approximation algorithms for optimum cost partitioning (this problem is of great practical relevance, particularly in CAD for large scale integrated circuits). After this we describe the relation between the PLP of a submodular function and that of derived functions. Here again there is a strong analogy between the behaviours of PP and PLP. In Chapter 13, we present algorithms for building the PLP of a general submodular function. These algorithms are also analogous to those of the PP. The core subroutine is one that builds a '(strong) fusion' set which uses minimization of an appropriately derived submodular function. We specialize these algorithms to the important special cases of the weighted adjacency and exclusivity functions associated with a bipartite graph. (The matroid rank function case is handled in Section 14.3.) Next we present some useful techniques for improving the complexity of PLP algorithms for functions arising in practice. Lastly, using the fact that the PP of the rank function of a graph can be regarded, equivalently, as the PLP of the |V(·)| function on the edge set, we present fast algorithms for the former.
The last chapter is on the hybrid rank problem for electrical networks. In this chapter, four different (nonequivalent) formulations of this problem are given. The second, third and fourth formulations can be regarded as generalizations of the first. Except in the case of the fourth formulation, we have given fast algorithms for the solution of the problems. This chapter is intended as the link between electrical networks and submodular functions. Each of the formulations has been shown to arise naturally in electrical network theory. The first two formulations require convolution and the third requires Dilworth truncation for its solution. The fourth formulation gives rise to an optimization problem over vector spaces which is left as an open problem.
Chapter 2
Mathematical Preliminaries

2.1 Sets
A set (or collection) is specified by the elements (or members) that belong to it. If element x belongs to the set (does not belong to the set) X, we write x ∈ X (x ∉ X). Two sets are equal iff they have the same members. The set with no elements is called the empty set and is denoted by ∅. A set is finite if it has a finite number of elements. Otherwise it is infinite. A set is often specified by actually listing its members, e.g. {e1, e2, e3} is the set with members e1, e2, e3. More usually it is specified by a property, e.g. the set of even numbers is specified as {x : x is an integer and x is even} or as {x, x is an integer and x is even}. The symbols ∀ and ∃ are used to denote 'for all' and 'there exists'. Thus, '∀x' should be read as 'for all x' and '∃x' should be read as 'there exists x'. A singleton set is one that has only one element. The singleton set with the element x as its only member is denoted by {x}. In this book, very often, we abuse this notation and write x in place of {x}, if we feel that the context makes the intended object unambiguous. We say that set X is contained in Y (properly contained in Y), if every element of X is also a member of Y (every element of X is a member of Y and X ≠ Y) and denote it by X ⊆ Y (X ⊂ Y). The union of two sets X and Y, denoted by X ∪ Y, is the set whose members are either in X or in Y (or in both). The intersection of X and Y, denoted by X ∩ Y, is the set whose members belong both to X and to Y. When X and Y do not have common elements, they are said to be disjoint. Union of disjoint sets X and Y is often denoted by X ⊎ Y. Union of sets X1, …, Xn is denoted by X1 ∪ ⋯ ∪ Xn or simply by ⋃ Xi. When the Xi are pairwise disjoint, their union is denoted by ⊎ Xi.
The difference of X relative to Y, denoted by X − Y, is the set of all elements in X but not in Y. Let X ⊆ S. Then the complement of X relative to S is the set S − X and is denoted by X̄ when the set S is clear from the context.
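As an aside in modern notation, the operations just defined map directly onto Python's built-in set type (the element names here are arbitrary):

```python
X = {"e1", "e2", "e3"}
Y = {"e3", "e4"}
S = {"e1", "e2", "e3", "e4", "e5"}   # a universe S, so complements make sense

print(sorted(X | Y))   # union: ['e1', 'e2', 'e3', 'e4']
print(sorted(X & Y))   # intersection: ['e3']
print(sorted(X - Y))   # difference: ['e1', 'e2']
print(sorted(S - X))   # complement of X relative to S: ['e4', 'e5']
print(X.isdisjoint({"e4", "e5"}))  # True: no common elements
```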
A mapping f : X → Y, denoted by f(·), associates with each element x ∈ X the element f(x) in Y. The element f(x) is called the image of x under f(·). We say f(·) maps X into Y. The sets X, Y are called, respectively, the domain and codomain of f(·). We denote by f(Z), Z ⊆ X, the subset of Y which has as members the images of elements in Z. The set f(X) is called the range of f(·). The restriction of f(·) to Z ⊆ X, denoted by f/Z(·), is the mapping from Z to Y defined by f/Z(x) ≡ f(x), x ∈ Z. A mapping that has distinct images for distinct elements in the domain is said to be one to one or injective. If the range of f(·) equals its codomain, we say that f(·) is onto or surjective. If the mapping is one to one onto we say it is bijective. Let f : X → Y, g : Y → Z. Then the composition of g and f is the map, denoted by gf(·) or g∘f(·), defined by gf(x) ≡ g(f(x)) ∀x ∈ X. The Cartesian product X × Y of sets X, Y is the collection of all ordered pairs (x, y), where x ∈ X and y ∈ Y. The direct sum X ⊕ Y denotes the union of disjoint sets X, Y. We use 'direct sum' loosely to indicate that structures on two disjoint sets are 'put together'. We give some examples where we anticipate definitions which would be given later. The direct sum of vector spaces V1, V2 on disjoint sets S1, S2 is the vector space V1 ⊕ V2 on S1 ⊕ S2 whose typical vectors are obtained by taking a vector x1 ≡ (a1, …, ak) in V1 and a vector x2 ≡ (b1, …, bm) in V2 and putting them together as x1 ⊕ x2 ≡ (a1, …, ak, b1, …, bm). When we have two graphs G1, G2 on disjoint edge sets E1, E2, G1 ⊕ G2 would have edge set E1 ⊕ E2 and is obtained by 'putting together' G1 and G2. Usually the vertex sets would also be disjoint. However, where the context permits, we may relax the latter assumption and allow 'hinging' of vertices.
We speak of a family of subsets as distinct from a collection of subsets. The collection {{e1, e2}, {e1, e2}, {e1}} is identical to {{e1, e2}, {e1}}. But often (e.g. in the definition of a hypergraph in Subsection 3.6.6) we have to use copies of the same subset many times and distinguish between copies. This we do by 'indexing' them. A family of subsets of S may be defined to be a mapping from an index set I to the collection of all subsets of S. For the purpose of this book, the index set I can be taken to be {1, …, n}. So the family ({e1, e2}, {e1, e2}, {e1}) can be thought of as the mapping φ(·) with

φ(1) ≡ {e1, e2},  φ(2) ≡ {e1, e2},  φ(3) ≡ {e1}.
(Note that a family is denoted using ordinary brackets while a set is denoted using curly brackets).
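In programming terms, a family is a mapping from an index set to subsets, whereas a collection collapses duplicate copies; a small Python sketch (the representation is ours, for illustration):

```python
# The family ({e1, e2}, {e1, e2}, {e1}) as a mapping from the index set {1, 2, 3}.
family = {1: frozenset({"e1", "e2"}),
          2: frozenset({"e1", "e2"}),
          3: frozenset({"e1"})}

# The corresponding collection of subsets: duplicate copies merge into one.
collection = set(family.values())

print(len(family))      # 3: the two copies of {e1, e2} stay distinguished
print(len(collection))  # 2: the collection {{e1, e2}, {e1}}
```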
2.2 Vectors and Matrices
In this section we define vectors, matrices and related notions. Most present day books on linear algebra treat vectors as primitive elements in a vector space and leave them undefined. We adopt a more old fashioned approach which is convenient
for the applications we have in mind. The reader who wants a more leisurely treatment of the topics in this section is referred to [Hoffman+Kunze72]. Let S be a finite set {e1, e2, …, en} and let F be a field. We will confine ourselves to the field ℝ of real numbers, the field ℂ of complex numbers and the GF2 field on elements 0, 1 (0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, 1 + 1 = 0, 1·1 = 1, 1·0 = 0, 0·1 = 0, 0·0 = 0). For a general definition of a field see for instance [Jacobson74]. By a vector on S over F we mean a mapping f of S into F. The field F is called the scalar field and its elements are called scalars. The support of f is the subset of S over which it takes nonzero values. The sum of two vectors f, g on S over F is defined by (f + g)(ei) ≡ f(ei) + g(ei) ∀ei ∈ S. (For convenience the sum of two vectors f on S, g on T over F is defined by (f + g)(ei) ≡ f(ei) + g(ei) ∀ei ∈ S ∩ T, as agreeing with f on S − T, and as agreeing with g on T − S.) The scalar product of f by a scalar λ is a vector λf defined by (λf)(ei) ≡ λ(f(ei)) ∀ei ∈ S. A collection V of vectors on S over F is a vector space iff it is closed under addition and scalar product. We henceforth omit mention of underlying set and field unless required.
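The GF2 arithmetic above is easy to sketch in Python, representing a vector on S as a dict from elements to {0, 1} (a representation chosen here for illustration; since 1 + 1 = 0, addition is XOR):

```python
def gf2_add(f, g):
    """Sum of two GF2 vectors on the same set S: addition is bitwise XOR."""
    return {e: f[e] ^ g[e] for e in f}

def support(f):
    """The subset of S on which the vector takes nonzero values."""
    return {e for e, v in f.items() if v != 0}

f = {"e1": 1, "e2": 1, "e3": 0}
g = {"e1": 1, "e2": 0, "e3": 1}
h = gf2_add(f, g)
print(h)                   # {'e1': 0, 'e2': 1, 'e3': 1}
print(sorted(support(h)))  # ['e2', 'e3']
```

Note that over GF2 every vector is its own additive inverse: gf2_add(f, f) is the zero vector.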
A set of vectors {f1, f2, …, fn} is linearly dependent iff there exist scalars λ1, …, λn, not all zero, such that λ1f1 + ⋯ + λnfn = 0. (Here the 0 vector is one which takes value 0 on all elements of S.) Vector fn is a linear combination of f1, …, fn−1 iff fn = λ1f1 + ⋯ + λn−1fn−1 for some λ1, …, λn−1.
The set of all vectors linearly dependent on a collection C of vectors can be shown to form a vector space which is said to be generated by or spanned by C. Clearly if V is a vector space and C ⊆ V, the subset of vectors generated by C is contained in V. A maximal linearly independent set of vectors of V is called a basis of V. In general maximal and minimal members of a collection of sets may not be largest and smallest in terms of size. Example: Consider the collection of sets {{1,2,3}, {4}, {5,6}, {1,2,3,5,6}}. The minimal members of this collection are {1,2,3}, {4}, {5,6}, i.e., these do not contain proper subsets which are members of this collection. The maximal members of this collection are {4}, {1,2,3,5,6}, i.e., these are not proper subsets of other sets which are members of this collection. The following theorem is therefore remarkable.

Theorem 2.2.1 All bases of a vector space on a finite set have the same cardinality.
The number of elements in a basis of V is called the dimension of V, denoted by dim(V), or the rank of V, denoted by r(V). Using Theorem 2.2.1 one can show that the size of a maximal independent subset contained in a given set C of vectors is unique. This number is called the rank of C. Equivalently, the rank of C is the dimension of the vector space spanned by C. If V1, V2 are vector spaces and V1 ⊆ V2, we say V1 is a subspace of V2.
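Over the reals, the rank of a finite collection of vectors can be checked numerically; a sketch using numpy (an external library, used here purely for illustration):

```python
import numpy as np

# Three vectors on S = {e1, e2, e3}; f3 = f1 + f2, so the collection is dependent.
f1 = np.array([1.0, 0.0, 2.0])
f2 = np.array([0.0, 1.0, 1.0])
f3 = f1 + f2

C = np.vstack([f1, f2, f3])
# rank of C = size of any maximal independent subset = dimension of the span
print(np.linalg.matrix_rank(C))  # 2
```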
A mapping A : {1, 2, …, m} × {1, 2, …, n} → F is called an m × n matrix. It may be thought of as an m × n array with entries from F. We denote A(i, j) often by the lower case aij with i as the row index and j as the column index. We speak of the array (ai1, …, ain) as the ith row of A and of the array (a1j, …, amj) as the
jth column of A. Thus we may think of A as made up of m row vectors or of n
column vectors. Linear dependence, independence and linear combination for row and column vectors are defined the same way as for vectors. We say two matrices are row equivalent if the rows of each can be obtained by linearly combining the rows of the other. Column equivalence is defined similarly. The vector space spanned by the rows (columns) of A is called its row space (column space) and denoted by R(A) (C(A)). The dimension of R(A) (C(A)) is called the row rank (column rank) of A. If A is an m × n matrix then the transpose of A, denoted by Aᵀ, is an n × m matrix defined by Aᵀ(i, j) ≡ A(j, i). Clearly the ith row of A becomes the ith column of Aᵀ. If B is also an m × n matrix the sum A + B is an m × n matrix defined by (A + B)(i, j) ≡ A(i, j) + B(i, j). If D is an n × p matrix, the product AD is an m × p matrix defined by AD(i, j) ≡ Σk aik dkj. Clearly if AD is defined it does not follow that DA is defined. Even when it is defined, in general AD ≠ DA. The most basic property of this notion of product is that it is associative, i.e. A(DF) = (AD)F.
Matrix operations are often specified by partitioning. Here we write a matrix in terms of submatrices (i.e., matrices obtained by deleting some rows and columns of the original matrix). A matrix may be partitioned along rows:

A = [A11]
    [A21]

or along columns:

A = [A11 | A12]

or both. When two partitioned matrices are multiplied we assume that the partitioning is compatible, i.e., for each triple (i, j, k) the number of columns of Aik equals the number of rows of Bkj. Clearly this is achieved if the original matrices A, B are compatible for product and each block of the column partition of A has the same size as the corresponding row partition of B. The following partitioning rules can then be verified.

i.  [A11] C = [A11 C]
    [A21]     [A21 C]

ii. C [A11 | A12] = [C A11 | C A12]
In general if A is partitioned into submatrices Aik, and B into submatrices Bkj, then the product C = AB would be naturally partitioned into Cij ≡ Σk Aik Bkj.
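The rule Cij ≡ Σk Aik Bkj is easy to check numerically; a numpy sketch with a single split of the inner dimension (the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 3))

# Compatible partitioning: split A's columns and B's rows at the same index.
A1, A2 = A[:, :2], A[:, 2:]
B1, B2 = B[:2, :], B[2:, :]

C_blocks = A1 @ B1 + A2 @ B2   # C = sum over k of A_ik B_kj
assert np.allclose(C_blocks, A @ B)
```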
Matrices arise most naturally in linear equations such as Ax = b, where A and b are known and x is an unknown vector. When b = 0 it is easily verified that the set of all solutions of Ax = b, i.e., of Ax = 0, form a vector space. This space will be called the solution space of Ax = 0, or the null space of A. The nullity of A is the dimension of the null space of A. We have the following theorem.

Theorem 2.2.2 If two matrices are row equivalent then their null spaces are identical.

Corollary 2.2.1 If A, B are row equivalent matrices then a set of columns of A are independent iff the corresponding set of columns of B are independent.
The following are elementary row operations that can be performed on the rows of a matrix:

i. interchanging rows,

ii. adding a multiple of one row to another,

iii. multiplying a row by a nonzero number.

Each of these operations corresponds to premultiplication by a matrix. Such matrices are called elementary matrices. It can be seen that these are the matrices we obtain by performing the corresponding elementary row operations on the unit matrix with the same number of rows as the given matrix. We can define elementary column operations similarly. These would correspond to postmultiplication by elementary column matrices.
A matrix is said to be in Row Reduced Echelon form (RRE) iff it satisfies the following: Let r be the largest row index for which aij ≠ 0 for some j. Then the columns of the r × r unit matrix (the matrix with 1s along the diagonal and zeros elsewhere) e1, …, er appear as columns, say Ci1, …, Cir of A with i1 < ⋯ < ir. Further if p < ik then akp = 0. We have the following theorem.

Theorem 2.2.3 Every matrix can be reduced to a matrix in the RRE form by a sequence of elementary row transformations and is therefore row equivalent to such a matrix.

It is easily verified that for an RRE matrix row rank equals column rank. Hence using Theorem 2.2.3 and Corollary 2.2.1 we have

Theorem 2.2.4 For any matrix, row rank equals column rank.
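Problem 2.6 below asks for an algorithm computing the RRE form; one standard sketch (Gauss-Jordan elimination over the reals, with a tolerance in place of exact zero tests) is:

```python
import numpy as np

def rre(A, tol=1e-12):
    """Reduce A to Row Reduced Echelon form by elementary row operations.
    At most min(m, n) pivots, each costing O(mn) arithmetic steps."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    r = 0  # index of the next pivot row
    for j in range(n):
        # find a row at or below r with a nonzero entry in column j
        p = next((i for i in range(r, m) if abs(A[i, j]) > tol), None)
        if p is None:
            continue
        A[[r, p]] = A[[p, r]]      # i. interchange rows
        A[r] = A[r] / A[r, j]      # iii. scale so the pivot entry is 1
        for i in range(m):         # ii. subtract multiples to clear column j
            if i != r:
                A[i] = A[i] - A[i, j] * A[r]
        r += 1
    return A

print(rre([[2, 4, 2], [1, 2, 3]]))  # pivot columns 0 and 2 carry the unit columns
```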
The rank of a matrix A, denoted by r(A), is its row rank (= column rank). Let the elements of S be ordered as (e1, …, en). Then for any vector f on S we define Rf, the representative vector of f, as the one rowed matrix (f(e1), …, f(en)). We will not usually distinguish between a vector and its representative vector. When the rows of a matrix R are representative vectors of some basis of a vector space V
we say that R is a representative matrix of V. When R, R1 both represent V they can be obtained from each other by row operations. Hence by Corollary 2.2.1 their column independence structure is identical. An r × n representative matrix R, r ≤ n, is a standard representative matrix iff R has an r × r submatrix which can be obtained by permutations of the columns of the r × r unit matrix. For convenience we will write a standard representative matrix in the form [I | R12] or [R11 | I]. (Here I denotes the unit matrix of appropriate size.) The dot product of two vectors f, g on S over F, denoted by ⟨f, g⟩, is defined by ⟨f, g⟩ ≡ Σ_{e∈S} f(e)·g(e). We say f, g are orthogonal if their dot product is zero. If C is a collection of vectors on S then C⊥ ≡ the set of all vectors orthogonal to every vector in C. It can be verified that C⊥ is a vector space. Let V be a vector space on S with basis B. Since vectors orthogonal to each vector in B are also orthogonal to linear combinations of these vectors we have B⊥ = V⊥. If R is a representative matrix of V, it is clear that V⊥ is its null space. Equivalently V⊥ is the solution space of Rx = 0. If R is a standard representative matrix with R = [I_{r×r} | R12], then the solution space of Rx = 0 can be shown to be the vector space generated by the columns of the n × (n − r) matrix obtained by stacking −R12 over I_{(n−r)×(n−r)}, where n = |S|. (Here I_{k×k} denotes the unit matrix with k rows.) Equivalently V⊥ has the representative matrix [−R12ᵀ | I_{(n−r)×(n−r)}]. The representative matrix of (V⊥)⊥ will then be R. We then have the following

Theorem 2.2.5 i. If [I_{r×r} | R12] is a representative matrix of vector space V on S then [−R12ᵀ | I_{(n−r)×(n−r)}] is a representative matrix of V⊥.

ii. r(V⊥) = |S| − r(V).

iii. (V⊥)⊥ = V. Hence two matrices are row equivalent iff their null spaces are identical.
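Theorem 2.2.5 can be checked numerically; a numpy sketch with an arbitrary choice of R12:

```python
import numpy as np

R12 = np.array([[1.0, 2.0],
                [0.0, 1.0],
                [3.0, 0.0]])
r, k = R12.shape                         # r = 3, n = r + k = 5

R = np.hstack([np.eye(r), R12])          # standard representative matrix of V
R_perp = np.hstack([-R12.T, np.eye(k)])  # claimed representative matrix of V-perp

assert np.allclose(R @ R_perp.T, 0)      # every row of R orthogonal to every row of R_perp
assert R.shape[0] + R_perp.shape[0] == r + k   # r(V) + r(V-perp) = |S|
```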
Consider the collection of n × n matrices over F. We say that I is an identity for this collection iff for every n × n matrix B we have IB = BI = B. If I1, I2 are identity matrices we must have I1 = I2 = I. The unit matrix (with 1s along the diagonal and 0s elsewhere) is clearly an identity matrix. It is therefore the only identity matrix. Two n × n matrices A, B are inverses of each other iff AB = BA = I. We say A, B are invertible or nonsingular. If A has inverses B, C we must have C = C(AB) = (CA)B = IB = B. Thus the inverse of a matrix A, if it exists, is unique and is denoted by A⁻¹. We then have the following
Theorem 2.2.6 i. (Aᵀ)⁻¹ = (A⁻¹)ᵀ.

ii. If A, D are n × n invertible matrices, then (AD)⁻¹ = D⁻¹A⁻¹.
With a square matrix we associate an important number called its determinant. Its definition requires some preparation.
A bijection of a finite set to itself is also called a permutation. A permutation that interchanges two elements (i.e. maps each of them to the other) but leaves
all others unchanged is a transposition. Every permutation can be obtained by repeated application of transpositions. We then have the following

Theorem 2.2.7 If a permutation σ can be obtained by composition of an even number of transpositions then every decomposition of σ into transpositions will contain an even number of them.

By Theorem 2.2.7 we can define a permutation to be even (odd) iff it can be decomposed into an even (odd) number of transpositions. The sign of a permutation σ, denoted by sgn(σ), is +1 if σ is even and −1 if σ is odd. It is easily seen, since the identity (= σσ⁻¹) permutation is even, that sgn(σ) = sgn(σ⁻¹). The determinant of an n × n matrix A is defined by

det(A) ≡ Σ_σ sgn(σ) a_{1σ(1)} a_{2σ(2)} ⋯ a_{nσ(n)},

where the summation is taken over all possible permutations σ of {1, 2, …, n}. It is easily seen that the determinant of the unit matrix is +1. We collect some of the important properties of the determinant in the following

Theorem 2.2.8
i. det(A) = det(Aᵀ).

ii. Let A, A′ and A″ be matrices which agree everywhere except in one row, with the corresponding row of A″ equal to the sum of that row of A and that row of A′. Then det(A″) = det(A) + det(A′).

iii. If A has two identical rows, or has two identical columns, then det(A) = 0.

iv. If E is an elementary matrix, det(EA) = det(E)det(A). Since every invertible matrix can be factored into elementary matrices, it follows that det(AB) = det(A)det(B) for every pair of n × n matrices A, B.

v. det(A) ≠ 0 iff A is invertible.

Problem 2.1 Size of a basis: Prove

i. Theorem 2.2.1.

ii. If V1 is a subspace of vector space V2, dim V1 ≤ dim V2.

iii. If V1 ⊆ V2 and dim V1 = dim V2 then V1 = V2.

iv. An m × n matrix with m > n cannot have linearly independent rows.

v. Any vector in a vector space V can be written uniquely as a linear combination of the vectors in a basis of V.

Problem 2.2 Ways of interpreting the matrix product: Define the product of matrices in the usual way, i.e. C = AB is equivalent to Cij = Σk aik bkj. Now show that it can be thought of as follows:
i. Columns of C are linear combinations of columns of A using entries of columns of B as coefficients.

ii. Rows of C are linear combinations of rows of B using entries of rows of A as coefficients.
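The permutation expansion of the determinant given earlier in this section can be evaluated directly for small matrices; a brute-force Python sketch, exponential in n and intended only as an illustration:

```python
from itertools import permutations

def sgn(p):
    """Sign of a permutation given as a tuple: parity of the inversion count."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def det(A):
    """det(A) = sum over permutations sigma of sgn(sigma) * prod_i a[i][sigma(i)]."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sgn(p)
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total

print(det([[1, 2], [3, 4]]))                   # -2
print(det([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))  # 1: the unit matrix
```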
Problem 2.3 Properties of the matrix product: Prove, when A, B, C are matrices and the products are defined,

i. (AB)C = A(BC)

ii. (AB)ᵀ = BᵀAᵀ
Problem 2.4 Partitioning rules: Prove the partitioning rules.
Problem 2.5 Solution space and column dependence structure: Prove Theorem 2.2.2 and Corollary 2.2.1.

Problem 2.6 Algorithm for computing RRE: Give an algorithm for converting any rectangular matrix into the RRE form. Give an upper bound for the number of arithmetical steps in your algorithm.

Problem 2.7 Uniqueness of the RRE matrix: Show that no RRE matrix is row equivalent to a distinct RRE matrix. Hence prove that every matrix is row equivalent to a unique matrix in the RRE form.

Problem 2.8 RRE of special matrices:

i. If A is a matrix with linearly independent columns what is its RRE form? If in addition A is square what is its RRE form?

ii. If A, B are square such that AB = I show that BA = I.

iii. Prove Theorem 2.2.6.
Problem 2.9 Existence and nature of solution for linear equations: Consider the equation Ax = b.

i. Show that it has a solution

(a) iff r(A) = r(A | b);

(b) iff whenever λᵀA = 0, λᵀb is also zero.

ii. Show that a vector is a solution of the above equation iff it can be written in the form xn + xp, where xp is a particular solution of the equation while xn is a vector in the null space of A (i.e. a solution to the linear equation with b set equal to zero).
iii. Motivation for the matrix product: Why is the matrix product defined as in Problem 2.2? (In the above equation suppose we make the substitution x = By. What would the linear equation in terms of y be?)
iv. Linear dependence and logical consequence: The above equation may be regarded as a set of linear equations (one for each row of A) each of which in turn could be thought of as a statement. Show that a linear equation is a logical consequence of others iff it is linearly dependent on the others.

Problem 2.10 Positive definite matrices:
i. Construct an example where A, B are invertible but their sum is not.

ii. A matrix K is positive semidefinite (positive definite) iff xᵀKx ≥ 0 ∀x ≠ 0 (xᵀKx > 0 ∀x ≠ 0). Show that
(a) a matrix is invertible if it is positive definite;

(b) the sum of two positive semidefinite matrices (positive definite matrices) is positive semidefinite (positive definite);

(c) if K is a positive definite matrix, then AKAᵀ is positive semidefinite and if, further, the rows of A are linearly independent, then AKAᵀ is positive definite;

(d) the inverse of a symmetric positive definite matrix is also symmetric positive definite.

Problem 2.11 Projection of a vector on a vector space: Let x be a vector on S and let V be a vector space on S. Show that x can be uniquely decomposed as x = x1 + x2, where x1 ∈ V and x2 ∈ V⊥. The vector x1 is called the projection of x on V along V⊥.
Problem 2.12 Parity of a permutation: Show that if a permutation can be obtained by composing an odd number of transpositions it cannot also be obtained by composing an even number of transpositions.

Problem 2.13 Graph of a permutation: Define the graph Gσ of a permutation σ on {1, 2, …, n} as follows: V(Gσ) ≡ {1, 2, …, n}; draw an edge with an arrow from i to j iff σ(i) = j.

i. Show that every vertex in this graph has precisely one arrow coming in and one going out. Hence, conclude that each connected component is a directed circuit.

ii. Show that if Gσ has an odd (even) number of even length circuits then σ is odd (even).
Problem 2.14 Properties of the determinant: Prove Theorem 2.2.8.

Problem 2.15 Equivalence of definitions of a determinant: Show that the usual definition of a determinant by expanding along a row or column is equivalent to the definition using permutations.

Problem 2.16 Laplace expansion of the determinant: Let A be an n × n matrix and fix rows i1 < ⋯ < ip. Show that

det(A) = Σ (−1)^(i1 + ⋯ + ip + j1 + ⋯ + jp) det A(i1, …, ip ; j1, …, jp) · det A(i′1, …, i′n−p ; j′1, …, j′n−p),

the sum being over all choices of columns j1 < ⋯ < jp, where the primed indices denote the complementary rows and columns, and A(i1, …, ip ; j1, …, jp) is the p × p matrix whose (s, t) entry is the (is, jt) entry of A.
Problem 2.17 Binet-Cauchy Theorem: Let A be an m × n and B an n × m matrix with m ≤ n. If an m × m submatrix of A is composed of columns i1, …, im, the corresponding m × m submatrix of B is the one with rows i1, …, im. Prove the Binet-Cauchy Theorem: det(AB) = Σ (product of determinants of corresponding m × m submatrices of A and B).
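The Binet-Cauchy theorem can be verified numerically by brute force over column subsets; a numpy sketch (the helper name is ours):

```python
import numpy as np
from itertools import combinations

def binet_cauchy_rhs(A, B):
    """Sum over all m-subsets of columns of A (equivalently rows of B) of the
    product of determinants of the corresponding m x m submatrices."""
    m, n = A.shape
    return sum(np.linalg.det(A[:, list(cols)]) * np.linalg.det(B[list(cols), :])
               for cols in combinations(range(n), m))

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 4))
B = rng.standard_normal((4, 2))
assert np.isclose(np.linalg.det(A @ B), binet_cauchy_rhs(A, B))
```

When m = n the sum has a single term and the statement reduces to det(AB) = det(A)det(B).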
2.3 Linear Inequality Systems

2.3.1 The Kuhn-Fourier Theorem
In this section we summarize basic results on inequality systems which we need later on in the book. Proofs are mostly omitted. They may be found in standard references such as [Stoer+Witzgall70] and [Schrijver86]. This section follows the former reference. A linear inequality system is a set of constraints of the following kind on the vector x ∈ R^n:

Ax = a,
Bx > b,      (I)
Cx ≥ c.

Here A, B, C are matrices and a, b, c are column vectors with the appropriate number of rows. We say x1 > x2 (x1 ≥ x2) iff each component of x1 is greater than (greater than or equal to) the corresponding component of x2. A solution of an inequality system is a vector which satisfies all the inequality constraints of the system. A constraint which is satisfied by every solution of an inequality system is said to be a consequence of the system. In particular, we are concerned with constraints of the kind d^T x = d0 or > d0 or ≥ d0. A legal linear combination of the system (I) is obtained by linearly combining
the equations and inequalities with real coefficients - α_i for the linear equations, and non-negative real coefficients β_j, γ_k for the '>' linear inequalities and '≥' linear inequalities respectively. The resulting constraint would be a linear equation iff β_j, γ_k are all zero. It would be a '>' inequality iff at least one of the β_j is nonzero. It would be a '≥' inequality iff all of the β_j are zero but at least one of the γ_k is nonzero. A legal linear combination is thus a consequence of the system. A legal linear combination, with at least one of the α_i, β_j, γ_k nonzero, that results in the LHS becoming zero is called a legal linear dependence of the system. Another important way of deriving consequence relations is by weakening. This means to weaken '=' to '≥' and '>' to '≥', and also, in the case of '>' and '≥', to lower the right side value.
Example 2.3.1 Consider the system of linear inequalities:

x1 + 2x2 = 3
2x1 + x2 = 4
x1 + x2 > 1
2x1 + 3x2 > 2
x1 + 5x2 ≥ 2
−x1 − 2x2 ≥ 4.

The legal linear combination corresponding to α1 = 1, α2 = 1, β1 = 0, β2 = 0, γ1 = 0, γ2 = 0 is 3x1 + 3x2 = 7; that corresponding to α1 = 1, α2 = 0, β1 = 1, β2 = 0, γ1 = 1, γ2 = 0 is 3x1 + 8x2 > 6; that corresponding to α1 = 1, α2 = 0, β1 = 0, β2 = 0, γ1 = 1, γ2 = 0 is 2x1 + 7x2 ≥ 5. The legal linear combination corresponding to α1 = 1, α2 = 0, β1 = 0, β2 = 0, γ1 = 0, γ2 = 1 is the zero relation 0x1 + 0x2 ≥ 7. Thus in this case, the system has a legal linear dependence that is a contradiction.
We can now state the fundamental theorem of Kuhn and Fourier [Fourier1826], [Kuhn56].
Theorem 2.3.1 ( Kuhn-Fourier Theorem) A linear inequality system has a solution iff no legal linear dependence is a contradiction. Sketch of the Proof of Theorem 2.3.1: First reduce the linear equations to the RRE form and express some of the variables in terms of the others. This substitution is now carried out also in the inequalities. So henceforth, without loss of generality, we may assume that we have only inequalities. If we prove the theorem for such a reduced system, it can be extended to one which has equalities also. Suppose all variables have either zero coefficient or the same sign in all the inequalities. In this case it is easy to see that the system has a solution whether the
coefficients are all zero or otherwise. If all the coefficients are zero we are done - the theorem is clearly true. If not, it is not possible to get a legal linear dependence without using zero coefficients. So the theorem is again true in this case. We now present an elimination procedure which terminates at one of the above mentioned cases. Let the inequalities be numbered (1), ..., (r), (r+1), ..., (k). Let x_n be present with coefficient +1 in the inequalities (1), ..., (r) and with coefficient −1 in the inequalities (r+1), ..., (k). We create r(k − r) inequalities without the variable x_n by adding each of the first r inequalities to each of the last (k − r) inequalities. Note that if both the inequalities are of the (≥) kind, the addition would result in another of the (≥) kind and if one of them is of the (>) kind, the addition would result in another of the (>) kind.
If the original system has a solution, it is clear that the reduced system also has one. On the other hand, if the reduced system has a solution (x′1, ..., x′_{n−1}), it is possible to find a value x′_n of x_n such that (x′1, ..., x′_{n−1}, x′_n) is a solution of the original system. We indicate how, below. Let the inequalities added be

a_{i1}x1 + ... + a_{i(n−1)}x_{n−1} + x_n ≥ b_i

a_{j1}x1 + ... + a_{j(n−1)}x_{n−1} − x_n > b_j.

(The cases where both are (>), both are (≥), or the first inequality (>) and the second (≥) are similar.) The pair of inequalities can be written equivalently as

a_{j1}x1 + ... + a_{j(n−1)}x_{n−1} − b_j > x_n ≥ b_i − a_{i1}x1 − ... − a_{i(n−1)}x_{n−1}.      (*)

The extreme left of the above inequality (*) is always derived from the inequalities (r+1) to (k) while the extreme right is always derived from the (1) to (r) inequalities. When x′1, ..., x′_{n−1} is substituted in the above inequality, it would be satisfied for every pair of inequalities, from (r+1) to (k) on the extreme left and (1) to (r) on the extreme right. After substitution, let the least of the extreme left terms be reached for inequality (p) and let the highest of the extreme right terms be reached for inequality (q). Since

a_{p1}x′1 + ... + a_{p(n−1)}x′_{n−1} − b_p > b_q − a_{q1}x′1 − ... − a_{q(n−1)}x′_{n−1}

(this inequality results when (p) and (q) are added), we can find a value x′_n of x_n which lies between the left and right sides of the above inequality. Clearly (x′1, ..., x′_n) is a solution of the original system. If this procedure were repeated, we would reach a system where either all the coefficients have zero value or where the signs of the coefficients of a variable are all the same in all the inequalities. As mentioned before, for these cases the theorem is true. □
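The elimination step in the proof is exactly the Fourier-Motzkin procedure, which can be sketched as follows (the data representation is our own; each inequality is stored as 'coeffs · x ≥ rhs', with a flag marking strict inequalities):

```python
from fractions import Fraction as F

def eliminate_last(ineqs, n):
    """One Fourier-Motzkin step: eliminate variable x_{n-1}.

    Each inequality is (coeffs, rhs, strict), read as
    sum(coeffs[i]*x_i) >= rhs (or > rhs when strict is True)."""
    plus, minus, keep = [], [], []
    for c, b, s in ineqs:
        if c[n - 1] > 0:
            t = F(1, 1) / c[n - 1]           # normalise coefficient to +1
            plus.append(([ci * t for ci in c], b * t, s))
        elif c[n - 1] < 0:
            t = F(-1, 1) / c[n - 1]          # normalise coefficient to -1
            minus.append(([ci * t for ci in c], b * t, s))
        else:
            keep.append((list(c[: n - 1]), b, s))
    for cp, bp, sp in plus:
        for cm, bm, sm in minus:
            # adding a (+1) row to a (-1) row cancels x_{n-1};
            # the sum is strict if either summand is strict
            keep.append(([a + b for a, b in zip(cp[: n - 1], cm[: n - 1])],
                         bp + bm, sp or sm))
    return keep

# x0 + x1 >= 1 and -x1 >= -2 (i.e. x1 <= 2): eliminating x1 leaves x0 >= -1
sys2 = [([F(1), F(1)], F(1), False), ([F(0), F(-1)], F(-2), False)]
reduced = eliminate_last(sys2, 2)
```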
As an immediate consequence we can prove the celebrated ‘Farkas Lemma’.
Theorem 2.3.2 (Farkas Lemma) The homogeneous system

Ax ≤ 0

has the consequence

d^T x ≤ 0

iff the row vector d^T is a nonnegative linear combination of the rows of A.
Proof: By the Kuhn-Fourier Theorem (Theorem 2.3.1), the system

A^T y = d,  y ≥ 0

has a solution iff 'x^T A^T + μ^T I = 0, μ^T ≥ 0' implies 'x^T d ≤ 0'; i.e., iff 'Ax ≤ 0' implies 'd^T x ≤ 0'. □
The analogue of 'vector spaces' for inequality systems is 'cones'. A cone is a collection of vectors closed under addition and non-negative scalar multiplication. It is easily verified that the solution set of Ax ≥ 0 is a cone. Such cones are called polyhedral. We say vectors x, y (on the same set S) are polar iff ⟨x, y⟩ (i.e., the dot product) is nonpositive. If K is a collection of vectors, the polar of K, denoted by K^p, is the collection of vectors polar to every vector in K. Thus, Farkas Lemma states: 'Let C be the polyhedral cone defined by Ax ≤ 0. A vector d belongs to C^p iff d^T is a nonnegative linear combination of the rows of A.'
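The easy direction of Farkas Lemma can be checked mechanically: if d^T = y^T A with y ≥ 0, then d^T x ≤ 0 for every x with Ax ≤ 0. A small sketch (the data is chosen purely for illustration):

```python
from fractions import Fraction as F

# If d^T is a nonnegative combination y^T A of the rows of A, then every
# x with Ax <= 0 satisfies d^T x <= 0.  (The converse is the nontrivial
# direction of the lemma.)
A = [[1, -1],
     [0, 1]]
y = [F(2), F(3)]                      # nonnegative multipliers
d = [sum(y[i] * A[i][j] for i in range(2)) for j in range(2)]   # d = y^T A

def satisfies_Ax_le_0(x):
    return all(sum(A[i][j] * x[j] for j in range(2)) <= 0 for i in range(2))

# Sample points: those inside the cone Ax <= 0 must satisfy d^T x <= 0
samples = [[-1, -1], [0, -2], [-3, 0], [-2, -1]]
for x in samples:
    if satisfies_Ax_le_0(x):
        assert sum(d[j] * x[j] for j in range(2)) <= 0
```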
2.3.2 Linear Programming
Let S be a linear inequality system with '≤' and '=' constraints ('≥' and '=' constraints). The linear programming problem or linear program is that of finding a solution x of S which maximizes a given linear function c^T x (minimizes a given linear function c^T x). The linear function to be optimized is called the objective function. A solution of S is called a feasible solution, while a solution which optimizes c^T x is called an optimal solution, of the linear programming problem. The value of a feasible solution is the value of the objective function on it. The following linear programming problems are said to be duals of each other:

Primal program

Maximize c1^T x1 + c2^T x2
subject to
A11 x1 + A12 x2 = b1
A21 x1 + A22 x2 ≤ b2
x2 ≥ 0.

Dual program

Minimize b1^T y1 + b2^T y2
subject to
A11^T y1 + A21^T y2 = c1
A12^T y1 + A22^T y2 ≥ c2
y2 ≥ 0.

We now present the duality theorem of linear programming [vonNeumann47], [Gale+Kuhn+Tucker51].

Theorem 2.3.3 For dual pairs of linear programs the following statements hold:
i. The value of each feasible solution of the minimization program is greater than or equal to the value of each feasible solution of the maximization program;

ii. if both programs have feasible solutions then both have optimal solutions and the optimal values are equal;

iii. if one program has an optimal solution then so does the other.

The usual proof uses Farkas Lemma, or more conveniently, the Kuhn-Fourier Theorem. We only sketch it.

Sketch of Proof: Part (i) follows by the solution of Exercise 2.1. Now we write down the inequalities of the primal and dual programs and another '≤' inequality which is the opposite of the inequality in part (i). Part (ii) would be proved if this system of inequalities has a solution. We assume it has no solution and derive a contradiction by using the Kuhn-Fourier Theorem. □
Exercise 2.1 Prove part (i) of Theorem 2.3.3.
A very useful corollary of Theorem 2.3.3 is the following:

Corollary 2.3.1 (Complementary Slackness) Let

max c^T x s.t. Ax ≤ b, x ≥ 0   and   min b^T y s.t. A^T y ≥ c, y ≥ 0

be dual linear programs. Let x̂, ŷ be optimal solutions to the respective programs. Then,

i. x̂_i > 0 implies (A^T ŷ)_i = c_i;

ii. (A^T ŷ)_i > c_i implies x̂_i = 0.

Proof: We have by part (ii) of Theorem 2.3.3, c^T x̂ = ŷ^T b; equivalently,

c^T x̂ − ŷ^T A x̂ = (c^T − ŷ^T A) x̂ = 0.

The result now follows since (c^T − ŷ^T A) ≤ 0 and x̂ ≥ 0. □
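A small numerical illustration of Theorem 2.3.3 and of complementary slackness (the instance and the optimal solutions below were chosen and verified by hand; this is not a general-purpose solver):

```python
# Primal: max 3x1 + 2x2  s.t.  x1 + x2 <= 4,  x1 <= 2,  x >= 0.
# Dual:   min 4y1 + 2y2  s.t.  y1 + y2 >= 3,  y1 >= 2,  y >= 0.
A = [[1, 1],
     [1, 0]]
b = [4, 2]
c = [3, 2]

x_opt = [2, 2]          # primal optimal (found by checking the vertices)
y_opt = [2, 1]          # dual optimal

primal_value = sum(ci * xi for ci, xi in zip(c, x_opt))
dual_value = sum(bi * yi for bi, yi in zip(b, y_opt))
assert primal_value == dual_value == 10   # equal optimal values: part (ii)

# Complementary slackness: a positive variable forces the matching dual
# constraint to be tight, and vice versa.
for i in range(2):
    slack_i = sum(A[j][i] * y_opt[j] for j in range(2)) - c[i]   # (A^T y)_i - c_i
    assert x_opt[i] == 0 or slack_i == 0
for j in range(2):
    slack_j = b[j] - sum(A[j][i] * x_opt[i] for i in range(2))   # b_j - (Ax)_j
    assert y_opt[j] == 0 or slack_j == 0
```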
2.4 Solutions of Exercises
E 2.1: We use the linear programs given in the definition of dual linear programs. We have
2.5 Solutions of Problems
Most of these problems can be found as standard results in undergraduate texts on linear algebra (see for instance [Hoffman+Kunze72]). We only give the solutions to the last two problems. Here we follow [MacDuffee33] and [Gantmacher59] respectively.
P 2.16: We state the following simple lemma without proof.

Lemma 2.5.1 If σ1, ..., σt are permutations of {1, ..., n} then sgn(σ1σ2...σt) = (sgn(σ1))(sgn(σ2))...(sgn(σt)) (where σiσj denotes composition of the permutations σi, σj).

In the expansion, each term factors through permutations α, β on the sets {r1, ..., rk}, {r_{k+1}, ..., r_n} respectively. Let α′ agree with α over {r1, ..., rk} and with the identity permutation over {r_{k+1}, ..., r_n}. Let β′ agree with β over {r_{k+1}, ..., r_n} and with the identity permutation over {r1, ..., rk}. So each term equals

sgn(β′α′σ)(a_{i1 β′α′σ(i1)} ... a_{ik β′α′σ(ik)} a_{i_{k+1} β′α′σ(i_{k+1})} ... a_{in β′α′σ(in)}) = sgn(ρ)(a_{i1 ρ(i1)} ... a_{in ρ(in)}),

where ρ = β′α′σ. Since the RHS is the usual definition of the determinant of A, the proof is complete.
P 2.17: Let a_ij, b_ij denote respectively the (i, j)th entry of A, B. Then the matrix

AB = [ Σ_{i1=1}^{n} a_{1 i1} b_{i1 1}   ...   Σ_{im=1}^{n} a_{1 im} b_{im m} ]
     [        ...                       ...             ...                 ]
     [ Σ_{i1=1}^{n} a_{m i1} b_{i1 1}   ...   Σ_{im=1}^{n} a_{m im} b_{im m} ].

Now each column of AB can be thought of as the sum of n appropriate columns - for instance the first column is the sum of n columns, a typical one having transpose (a_{1 i1} b_{i1 1}, ..., a_{m i1} b_{i1 1}). Using Theorem 2.2.8,

det(AB) = Σ_{i1, ..., im} b_{i1 1} ... b_{im m} det Ã(i1, ..., im),

where Ã(i1, ..., im) is the m × m matrix which has the first m rows of A in the same order as in A but whose jth column is the i_j th column of A. So, again by Theorem 2.2.8,

det Ã(i1, ..., im) = sgn(σ) det Ã(k1, ..., km),

where k1 < ... < km, {k1, ..., km} = {i1, ..., im} and σ is the permutation taking (k1, ..., km) to (i1, ..., im). So,

det(AB) = Σ_{k1 < ... < km} det Ã(k1, ..., km) det B(k1, ..., km ; 1, ..., m),

which is the required result.
Chapter 3

Graphs

3.1 Introduction
We give definitions of graphs and related notions below. Graphs should be visualized as points joined by lines with or without arrows rather than be thought of as formal objects. We would not hesitate to use informal language in proofs.
3.2 Graphs: Basic Notions

3.2.1 Graphs and Subgraphs
A graph G is a triple (V(G), E(G), i_G), where V(G) is a finite set of vertices, E(G) is a finite set of edges and i_G is an incidence function which associates with each edge a pair of vertices, not necessarily distinct, called its end points or end vertices (i.e., i_G : E(G) → collection of subsets of V(G) of cardinality 2 or 1). Vertices are also referred to as nodes or junctions while edges are also referred to as arcs or branches. We note

i. an edge may have a single end point - such edges are called selfloops.

ii. a vertex may have no edges incident on it - such vertices are said to be isolated.

iii. the graph may be in several pieces.

Figure 3.1 shows a typical graph G_u.

A directed graph G is a triple (V(G), E(G), a_G) where V(G), E(G) are the vertex set and the edge set respectively and a_G associates with each edge an ordered
Figure 3.1: Undirected and Directed Graphs
pair of vertices, not necessarily distinct (i.e., a_G : E(G) → V(G) × V(G)). The first element of the ordered pair is the positive end point or tail of the arrow and the second element is the negative end point or head of the arrow. For selfloops, positive and negative endpoints are the same. Directed graphs are usually drawn as graphs with arrows in the edges. In Figure 3.1, G_d is a directed graph. We say a vertex v and an edge e are incident on each other iff v is an end point of e. If e has end points u, v we say that u, v are adjacent to each other. Two edges e1, e2 are adjacent if they have a common end point. The degree of a vertex is the number of edges incident on it with selfloops counted twice. A graph G_s is a subgraph of G iff G_s is a graph, V(G_s) ⊆ V(G), E(G_s) ⊆ E(G), and the end points of an edge in G_s are the same as its end points in G. Subgraph G_s is a proper subgraph of G iff it is a subgraph of G but not identical to it. The subgraph of G on V1 is that subgraph of G which has V1 as its vertex set and the set of edges of G with both end points in V1 as the edge set. The subgraph of G on E1 has E1 ⊆ E(G) as the edge set and the end points of edges in E1 as the vertex set. If G is a directed graph the edges of a subgraph would retain the directions they had in G (i.e., they would have positive and negative end points as in G).
Exercise 3.1 (k) In any graph with no parallel edges (edges with the same end points) or selfloops show that the degrees of some two vertices must be equal.

Exercise 3.2 (k) Show that

i. the sum of the degrees of the vertices of any graph is equal to twice the number of edges of the graph;

ii. the number of odd degree vertices in any graph must be even.
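Exercise 3.2 can be verified mechanically on any edge list. A sketch (the representation is ours; a selfloop is stored as a pair with equal end points, so it contributes 2 to the degree, as in the text):

```python
from collections import Counter

def degrees(vertices, edges):
    """Degree of each vertex of an undirected graph given as a list of
    end-point pairs; a selfloop (v, v) counts twice."""
    deg = Counter({v: 0 for v in vertices})
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1            # for a selfloop u == v, so it adds 2
    return deg

# A small graph with a selfloop at 'c' and an isolated vertex 'd'
V = ['a', 'b', 'c', 'd']
E = [('a', 'b'), ('b', 'c'), ('c', 'c'), ('a', 'c')]
deg = degrees(V, E)
assert sum(deg.values()) == 2 * len(E)                    # Exercise 3.2 (i)
assert sum(1 for v in V if deg[v] % 2 == 1) % 2 == 0      # Exercise 3.2 (ii)
```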
3.2.2 Connectedness
A vertex edge alternating sequence (alternating sequence for short) of a graph G is a sequence in which

i. vertices and edges of G alternate,

ii. the first and last elements are vertices and

iii. whenever a vertex and an edge occur as adjacent elements they are incident on each other in the graph.

Example: For the graph G_u in Figure 3.1, (a, e1, b, e3, c, e9, g, e6, c, e4, d) is an alternating sequence. A path is a graph all of whose edges and vertices can be arranged in an alternating sequence without repetitions. It can be seen that the degree of precisely two of the vertices of the path is one and the degree of all other vertices (if any) is two. The two vertices of degree one must appear at either end of any alternating sequence containing all nodes and edges of the path without repetition. They are called terminal nodes. The path is said to be between its terminal nodes. It is clear that there are only two such alternating sequences that we can associate with a path. Each is the reverse of the other. The two alternating sequences associated with the path in Figure 3.2 are (v1, e1, v2, e2, v3, e3, v4) and (v4, e3, v3, e2, v2, e1, v1). We say 'go along the path from v_i to v_j' instead of 'construct the alternating sequence without repetitions having v_i as the first element and v_j as the last element'. Such sequences are constructed by considering the alternating sequence associated with the path in which v_i precedes v_j and taking the subsequence starting with v_i and ending with v_j. A directed graph may be a path if it satisfies the above conditions. However, the term strongly directed path is used if the edges can be arranged in a sequence so that the negative end point of each edge, except the last, is the positive end point of the succeeding edge. A graph is said to be connected iff for any given pair of distinct vertices there exists a path subgraph between them. The path graph in Figure 3.2 is connected
Figure 3.2: A Path Graph
Figure 3.3: A Circuit Graph and a Strongly Directed Circuit Graph

while the graph G_u in Figure 3.1 is disconnected. A connected component of a graph G is a connected subgraph of G that is not a proper subgraph of any connected subgraph of G (i.e., it is a maximal connected subgraph). Connected components correspond to 'pieces' of a disconnected graph.

Exercise 3.3 (k) Let G be a connected graph. Show that there is a vertex such that if the vertex and all edges incident on it are removed the remaining graph is still connected.
3.2.3 Circuits and Cutsets

A connected graph with each vertex having degree two is called a circuit graph or a polygon graph. (G_c in Figure 3.3 is a circuit graph.) If G′ is a circuit subgraph of G then E(G′) is a circuit of G. A single edged circuit is called a selfloop. Each of the following is a characteristic property of circuit graphs (i.e., each can be used to define the notion).
3.2. GRAPHS: BASIC NOTIONS
35
We omit the routine proofs.

i. A circuit graph has precisely two paths between any two of its vertices.

ii. If we start from any vertex v of a circuit graph and follow any path (i.e., follow an edge, reach an adjacent vertex, go along a new edge incident on that vertex and so on) the first vertex to be repeated would be v. Also during the traversal we would have encountered all vertices and edges of the circuit graph.

iii. Deletion of any edge (leaving the end points in place) of a circuit graph reduces it to a path.
Exercise 3.4 Construct

i. a graph with all vertices of degree 2 that is not a circuit graph,

ii. a non circuit graph which is made up of a path and an additional edge,

iii. a graph which has no circuits,

iv. a graph which has every edge as a circuit.

Exercise 3.5 Prove

Lemma 3.2.1 (k) Deletion of an edge (leaving end points in place) of a circuit subgraph does not increase the number of connected components in the graph.

Exercise 3.6 Prove

Theorem 3.2.1 (k) A graph contains a circuit iff it contains two distinct paths between some two of its vertices.

Exercise 3.7 Prove

Theorem 3.2.2 (k) A graph contains a circuit if every one of its vertices has degree ≥ 2.
A set T ⊆ E(G) is a crossing edge set of G if V(G) can be partitioned into sets V1, V2 such that T = {e : e has an end point in V1 and in V2}. (In Figure 3.4, C is a crossing edge set.) We will call V1, V2 the end vertex sets of T. Observe that while end vertex sets uniquely determine a crossing edge set there may be more than one pair of end vertex sets consistent with a given crossing edge set. A crossing edge set that is minimal (i.e., does not properly contain another crossing edge set) is called a cutset or a bond. A single edged cutset is a coloop.
Exercise 3.8 Construct a graph which has (a) no cutsets (b) every edge as a cutset. Exercise 3.9 Construct a crossing edge set that is not a cutset (see Figure 3.4). Exercise 3.10 (k) Show that a cutset is a minimal set of edges with the property that when it is deleted leaving endpoints in place the number of components of the graph increases.
Figure 3.4: A Crossing Edge Set and a Strongly Directed Crossing Edge Set

Exercise 3.11 Short (i.e., fuse end points of an edge and remove the edge) all branches of a graph except a cutset. How does the resulting graph look?
Exercise 3.12 Prove
Theorem 3.2.3 (k) A crossing edge set T is a cutset iff it satisfies the following:

i. If the graph has more than one component then T must meet the edges of only one component.

ii. If the end vertex sets of T are V1, V2 in that component, then the subgraphs on V1 and V2 must be connected.
3.2.4 Trees and Forests
A graph that contains no circuits is called a forest graph (see graphs G_t and G_f in Figure 3.5). A connected forest graph is also called a tree graph (see graph G_t in Figure 3.5). A forest of a graph G is the set of edges of a forest subgraph of G that has V(G) as its vertex set and has as many connected components as G has. A forest of a connected graph G is also called a tree of G. The complement relative to E(G) of a forest (tree) is a coforest (cotree) of G. The number of edges in a forest (coforest) of G is its rank (nullity). Theorem 3.2.4 assures us that this notion is well defined.
Exercise 3.13 (k) Show that a tree graph on two or more nodes has

i. precisely one path between any two of its vertices;

ii. at least two vertices of degree one.
Figure 3.5: A Tree Graph Gt and a Forest Graph Gf
Exercise 3.14 Prove

Theorem 3.2.4 (k) A tree graph on n nodes has (n − 1) branches. Any connected graph on n nodes with (n − 1) edges is a tree graph.

Corollary 3.2.1 A forest subgraph on n nodes and p components has (n − p) edges.

Exercise 3.15 Prove

Theorem 3.2.5 (k) A subset of edges of a graph is a forest (coforest) iff it is a maximal subset not containing any circuit (cutset).

Exercise 3.16 (k) Show that every forest (coforest) of a graph G intersects every cutset (circuit) of G.

Exercise 3.17 Prove

Lemma 3.2.2 (k) A tree graph splits into two tree graphs if an edge is opened (deleted leaving its end points in place).

Exercise 3.18 (k) Show that a tree graph yields another tree graph if an edge is shorted (removed after fusing its end points).

Exercise 3.19 Prove

Theorem 3.2.6 (k) Let f be a forest of a graph G and let e be an edge of G outside f. Then e ∪ f contains only one circuit of G.

Exercise 3.20 Prove
Theorem 3.2.7 (k) Let f̄ be a coforest of a graph G and let e be an edge of G outside f̄ (i.e., e ∈ f). Then e ∪ f̄ contains only one cutset of G (i.e., only one cutset of G intersects f in e).

Exercise 3.21 (k) Show that every circuit is an f-circuit with respect to some forest (i.e., intersects some coforest in a single edge).
Exercise 3.22 (k) Show that every cutset is an f-cutset with respect to some forest (i.e., intersects some forest in a single edge).

Exercise 3.23 (k) Show that shorting an edge in a cutset of a graph does not reduce the nullity of the graph.
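Theorem 3.2.5 suggests a way to compute a forest: scan the edges and keep each one that does not create a circuit. A hedged sketch using union-find (helper names are ours), which also yields the rank and nullity of the graph:

```python
def greedy_forest(vertices, edges):
    """Build a forest as a maximal circuit-free edge set (Theorem 3.2.5),
    using union-find to reject edges whose end points are already connected."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]      # path halving
            v = parent[v]
        return v

    forest = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:                           # e creates no circuit: keep it
            parent[ru] = rv
            forest.append((u, v))
    return forest

V = [1, 2, 3, 4, 5]
E = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (3, 5)]
f = greedy_forest(V, E)
rank, nullity = len(f), len(E) - len(f)
# The graph is connected on 5 nodes, so the forest is a tree with
# n - 1 = 4 edges (Theorem 3.2.4) and the nullity is 2.
```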
3.2.5 Strongly Directedness
All the above definitions hold also in the case of directed graphs. The subgraphs in each case retain the original orientation for the edges. However, the prefix 'strongly directed' in each case implies a stronger condition. We have already spoken of the strongly directed path. A strongly directed circuit graph has its edges arranged in a sequence so that the negative end point of each edge is the positive end point of the succeeding edge and the positive end point of the last edge is the negative end point of the first (see G_dc in Figure 3.3). The set of edges of such a graph would be a strongly directed circuit. A strongly directed crossing edge set would have the positive end points of all its edges in the same end vertex set (see C_d in Figure 3.4). In this book we will invariably assume that the graph is directed but our circuit subgraphs, paths etc., although they are directed graphs, will, unless otherwise stated, not be strongly directed. When it is clear from the context the prefix 'directed' will be omitted when we speak of a graph. For simplicity we would write directed path, directed circuit, directed crossing edge set instead of strongly directed path etc.

Exercise 3.24 Prove: (Minty) Any edge of a directed graph is either in a directed circuit or in a directed cutset but not both. (For solution see Theorem 3.4.7.)
3.2.6 Fundamental Circuits and Cutsets
Let f be a forest of G and let e ∉ f. It can be shown (Theorem 3.2.6) that there is a unique circuit contained in e ∪ f. This circuit is called the fundamental circuit (f-circuit) of e with respect to f and is denoted by L(e, f). Let e_t ∈ f. It can be shown (Theorem 3.2.7) that there is a unique cutset contained in e_t ∪ f̄. This cutset is called the fundamental cutset (f-cutset) of e_t with respect to f and is denoted by B(e_t, f).

Remark: The f-circuit L(e, f) is obtained by adding e to the unique path in the forest subgraph on f between the end points of e. For the subgraph on f, the edge e_t is a crossing edge set with end vertex sets say V1, V2. Then the f-cutset B(e_t, f) is the crossing edge set of G with end vertex sets V1, V2.
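The remark above describes L(e, f) concretely: add e to the unique forest path between its end points. A sketch (the representation is ours; edges are vertex pairs, and the non-tree edge's end points are searched for in the forest by BFS):

```python
from collections import deque

def f_circuit(tree_edges, e):
    """Fundamental circuit L(e, f): the non-tree edge e together with the
    unique path in the forest subgraph between its end points."""
    u, v = e
    adj = {}
    for a, b in tree_edges:
        adj.setdefault(a, []).append((b, (a, b)))
        adj.setdefault(b, []).append((a, (a, b)))
    # BFS from u to v through tree edges, remembering the edge used
    pred = {u: None}
    q = deque([u])
    while q:
        w = q.popleft()
        for x, edge in adj.get(w, []):
            if x not in pred:
                pred[x] = (w, edge)
                q.append(x)
    path = []
    w = v
    while pred[w] is not None:                 # walk back from v to u
        w, edge = pred[w]
        path.append(edge)
    return set(path) | {e}

# Path tree 1-2-3-4; the chord (1, 3) closes exactly one circuit
tree = [(1, 2), (2, 3), (3, 4)]
circuit = f_circuit(tree, (1, 3))
```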
Figure 3.6: Circuit subgraph and Crossing Edge Set with Orientations
3.2.7 Orientation

Let G be a directed graph. We associate orientations with circuit subgraphs and crossing edge sets as follows:
An orientation of a circuit subgraph is an alternating sequence of its vertices and edges, without repetitions except for the first vertex being also the last (note that each edge is incident on the preceding and succeeding vertices). Two orientations are equivalent if one can be obtained by a cyclic shift of the other. Diagrammatically an orientation may be represented by a circular arrow. It is easily seen that there can be at most two orientations for a circuit graph. (A single edge circuit subgraph has only one.) These are obtained from each other by reversing the sequence. When there are two non equivalent orientations we call them opposite to each other. We say that an edge of the circuit subgraph agrees with the orientation if its positive end point immediately precedes it in the orientation (or in an equivalent orientation). Otherwise it is opposite to the orientation. The orientation associated with a circuit subgraph would also be called the orientation of the circuit.

Example: For the circuit subgraph of Figure 3.6 the orientations (n1, e, n6, e6, n5, e5, n4, e4, n3, e3, n2, e2, n1) and (n6, e6, n5, e5, n4, e4, n3, e3, n2, e2, n1, e, n6) are equivalent. This is the orientation shown in the figure. It is opposite to the orientation (n1, e2, n2, e3, n3, e4, n4, e5, n5, e6, n6, e, n1). The edge e agrees with this latter orientation and is opposite to the former orientation.

An orientation of a crossing edge set is an ordering of its end vertex sets V1, V2 as (V1, V2) or as (V2, V1). An edge e in the crossing edge set with positive end point in V1 and negative end point in V2 agrees with the orientation (V1, V2) and is opposite to the orientation (V2, V1). In Figure 3.6 the orientation of the crossing edge set is (V1, V2).

Theorem 3.2.8 (k) Let f be a forest of a directed graph G. Let e_t ∈ f and let e_c ∈ f̄. Let the orientations of L(e_c, f) and B(e_t, f) agree with e_c, e_t, respectively.
T″ can be simplified to a minor of the form V · T′ × T″ or V × T′ · T″.
3.4.3 Vector Space Duality

We now relate the minors of V to the minors of V^⊥. We remind the reader that V^⊥, the complementary orthogonal space of V, is defined to be on the same set as V. In the following results we see that the contraction (restriction) of a vector space corresponds to the restriction (contraction) of the orthogonal complement. We say that contraction and restriction are (orthogonal) duals of each other.
Theorem 3.4.3 (k) Let V be a vector space on S and let T ⊆ S. Then,

i. (V · T)^⊥ = V^⊥ × T;

ii. (V × T)^⊥ = V^⊥ · T.

Proof: i. Let g_T ∈ (V · T)^⊥. For any f on S let f_T denote f/T. Now if f ∈ V, then f_T ∈ V · T and ⟨g_T, f_T⟩ = 0. Let g on S be defined by g/T = g_T, g/(S − T) = 0. If f ∈ V we have

⟨f, g⟩ = ⟨f_T, g_T⟩ + ⟨f_{S−T}, 0_{S−T}⟩ = ⟨f_T, g_T⟩ = 0.

Thus g ∈ V^⊥ and therefore, g_T ∈ V^⊥ × T. Hence, (V · T)^⊥ ⊆ V^⊥ × T.

Next let g_T ∈ V^⊥ × T. Then there exists g ∈ V^⊥ s.t. g/(S − T) = 0 and g/T = g_T. Let f_T ∈ V · T. There exists f ∈ V s.t. f/T = f_T. Now 0 = ⟨f, g⟩ = ⟨f_T, g_T⟩ + ⟨f_{S−T}, 0_{S−T}⟩ = ⟨f_T, g_T⟩. Hence, g_T ∈ (V · T)^⊥. We conclude that V^⊥ × T ⊆ (V · T)^⊥. This proves that (V · T)^⊥ = V^⊥ × T.
3.4. BASIC OPERATIONS ON GRAPHS AND V E C T O R SPACES
ii. We have (V^⊥ · T)^⊥ = (V^⊥)^⊥ × T, by part (i). By Theorem 2.2.5, ((V^⊥ · T)^⊥)^⊥ = V^⊥ · T and (V^⊥)^⊥ = V. Hence, V^⊥ · T = (V × T)^⊥. □

The following corollary is immediate.

Corollary 3.4.1 (k) (V × P · T)^⊥ = V^⊥ · P × T, T ⊆ P ⊆ S.
3.4.4 Relation between Graph Minors and Vector Space Minors
We now show that the analogy between vector space minors and graph minors is more substantial than hitherto indicated - in fact the minors of voltage and current spaces of a graph correspond to appropriate graph minors.

Theorem 3.4.4 (k) Let G be a graph with edge set E. Let T ⊆ E. Then

i. V_v(G · T) = (V_v(G)) · T;

ii. V_v(G × T) = (V_v(G)) × T.

Proof: We remind the reader that by definition a voltage vector v is a linear combination of the rows of the incidence matrix, the coefficients of the linear combination being given by the entries in a potential vector λ. We say v is derived from λ.

i. Let v_T ∈ V_v(G · T). Now V_v(G · T) = V_v(G_open(E − T)). Thus, v_T ∈ V_v(G_open(E − T)). The graph G_open(E − T) has the same vertex set as G but the edges of (E − T) have been removed. Let v_T be derived from the potential vector λ of G_open(E − T). Now for any edge e ∈ T, v_T(e) = λ(a) − λ(b), where a, b are the positive and negative end points of e. However, λ is also a potential vector of G. Let the voltage vector v of G be derived from λ. For the edge e ∈ T, we have, as before, v(e) = λ(a) − λ(b). Thus, v_T = v/T and therefore, v_T ∈ (V_v(G)) · T. Hence V_v(G · T) ⊆ (V_v(G)) · T. The reverse containment is proved similarly.

ii. Let v_T ∈ V_v(G × T). Now V_v(G × T) = V_v(G_short(E − T)). Thus, v_T ∈ V_v(G_short(E − T)). The vertex set of G_short(E − T) is the set {V1, V2, ..., V_k} where V_i is the vertex set of the ith component of G_open T. Let v_T be derived from the potential vector λ̃ in G_short(E − T). The vector λ̃ assigns to each of the V_i the value λ̃(V_i). Now define a potential vector λ on the nodes of G as follows: λ(n) ≡ λ̃(V_i), n ∈ V_i. Since {V1, ..., V_k} is a partition of V(G), it is clear that λ is well defined. Let v be the voltage vector derived from λ in G. Whenever e ∈ E − T we must have v(e) = 0 since both end points must belong to the same V_i.
Next, whenever e ∈ T we have v(e) = λ(a) − λ(b) where a is the positive end point of e and b, the negative end point. Let a ∈ V_a, b ∈ V_b, where V_a, V_b ∈ V(G_short(E − T)). Then the positive end point of e in G_short(E − T) is V_a and the negative end point, V_b.

By definition λ(a) − λ(b) = λ̃(V_a) − λ̃(V_b). Thus v/T = v_T. Hence, v_T ∈ (V_v(G)) × T. Thus, V_v(G × T) ⊆ (V_v(G)) × T. The reverse containment is proved similarly. □
Using duality we can now prove

Theorem 3.4.5 (k) Let G be a directed graph on edge set E. Let T ⊆ E. Then,

i. V_i(G · T) = (V_i(G)) × T;

ii. V_i(G × T) = (V_i(G)) · T.

Proof: i. V_i(G · T) = (V_v(G · T))^⊥ by the strong form of Tellegen's Theorem. By Theorem 3.4.4, V_v(G · T) = (V_v(G)) · T. Hence,

V_i(G · T) = ((V_v(G)) · T)^⊥ = (V_v(G))^⊥ × T = (V_i(G)) × T.

ii. The proof is similar. □
Exercise 3.51 (k) For a connected directed graph G on node set {v1, ..., v_k}, if currents J1, J2, ..., J_k enter nodes v1, v2, ..., v_k, show that there exists a vector i on E(G) s.t. Ai = J iff Σ J_i = 0.

Exercise 3.52 Prove Theorem 3.4.5 directly. (Hint: the result of the preceding exercise would be useful in extending a current vector of G × T to a current vector of G.)
3.4.5 Representative Matrices of Minors
As defined earlier, the representative matrix R of a vector space V on S has the vectors of a basis of V as its rows. Often the choice of a suitable representative matrix would give us special advantages. We describe how to construct a representative matrix which contains representative matrices of V · T and V × (S − T) as its submatrices. We say in such a case that V · T and V × (S − T) become 'visible' in R.
Theorem 3.4.6 (k) Let V be a vector space on S. Let T ⊆ S. Let R be a representative matrix as shown below:

         T       S − T
R = [  R_TT     R_12  ]
    [   0       R_22  ]      (3.7)

where the rows of R_TT are linearly independent. Then R_TT is a representative matrix for V · T and R_22, a representative matrix for V × (S − T).
Proof: The rows of R_TT are restrictions of vectors on S to T. Hence, any linear combination of these rows will yield a vector of V · T. If f_T is any vector in V · T there exists a vector f in V s.t. f/T = f_T. Now f is a linear combination of the rows of R. Hence, f/T (= f_T) is a linear combination of the rows of R_TT. Further it is given that the rows of R_TT are linearly independent. It follows that R_TT is a representative matrix of V · T.

It is clear from the structure of R (the zero in the second set of rows) that any linear combination of the rows of R_22 belongs to V × (S − T). Further if f is any vector in V s.t. f/T = 0 then f must be a linear combination only of the second set of rows of R. For, if the first set of rows are involved in the linear combination, since the rows of R_TT are linearly independent, f/T cannot be zero. We conclude that if f/(S − T) is a vector in V × (S − T), it is linearly dependent on the rows of R_22. Now the rows of R are linearly independent. We conclude that R_22 is a representative matrix of V × (S − T). □
Remark: To build a representative matrix of V with the form as in Theorem 3.4.6, we start from any representative matrix of V and perform row operations on it so that under the columns T we have a matrix in the RRE form.
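The row operations of the remark can be sketched in code. The fragment below is mine, not from the text: it works over GF(2) to avoid fractions, and the function name `rre_with_priority` and argument conventions are assumptions. Pivoting on the columns of T first makes V · T and V × (S − T) visible: after reduction, the rows nonzero on T form [RTT RT2], the rows zero on T form [0 R22].

```python
def rre_with_priority(rows, t_cols, n):
    """Row reduce a 0/1 matrix over GF(2), pivoting on the columns in
    t_cols first, so that V.T and V x (S - T) become 'visible'."""
    rows = [r[:] for r in rows]
    order = list(t_cols) + [c for c in range(n) if c not in t_cols]
    pivot_row = 0
    for c in order:
        # find a row at or below pivot_row with a 1 in column c
        for r in range(pivot_row, len(rows)):
            if rows[r][c]:
                rows[pivot_row], rows[r] = rows[r], rows[pivot_row]
                break
        else:
            continue  # column c has no new pivot
        # clear column c in every other row (addition mod 2 is xor)
        for r in range(len(rows)):
            if r != pivot_row and rows[r][c]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return rows

# small demonstration: S = {0,1,2,3}, T = {0,1}
R = rre_with_priority([[1, 0, 1, 0], [0, 1, 1, 1], [0, 0, 1, 1]], [0, 1], 4)
top = [r for r in R if any(r[c] for c in [0, 1])]       # [RTT RT2]
bottom = [r for r in R if not any(r[c] for c in [0, 1])]  # [0 R22]
```

In the demonstration, `[r[:2] for r in top]` is the representative matrix RTT of V · T and the T-zero rows of `bottom` represent V × (S − T).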
The following corollary is immediate.

Corollary 3.4.2 (k) r(V) = r(V · T) + r(V × (S − T)), T ⊆ S.

Corollary 3.4.3 (k) Let G be a graph on E. Then

r(G) = r(G · T) + r(G × (E − T)), ∀ T ⊆ E.

Proof : We observe that r(G) = number of edges in a forest of G = r(Vv(G)). The result follows by Theorem 3.4.4. □
In the representative matrix of Theorem 3.4.6 the submatrix RT2 contains information about how T, S − T are linked by V. If RT2 is a zero matrix then it is clear that V = VT ⊕ VS−T where VT, VS−T are vector spaces on T, S − T.

Definition 3.4.4 A subset T of S is a separator of V iff V × T = V · T.
3. GRAPHS
It is immediate that if T is a separator so is (S − T). Thus, we might say that T, (S − T) are decoupled in this case. Now by definition V · T ⊇ V × T. Hence, equality of the spaces follows if their dimensions are the same. Hence, T is a separator iff r(V × T) = r(V · T). The connectivity of V at T is denoted by ξ(T) and defined as follows:

ξ(T) ≡ r(V · T) − r(V × T).

It is easily seen that ξ(T) = ξ(S − T). Further, this number is zero if T is a separator.

Exercise 3.53 (k)
i. Let

            T1     T2     T3
R  =  [   R11    R12    R13  ]
      [   R21     0     R23  ]                    (3.8)
      [    0      0     R33  ]

Rows of R12, R21 and R33 are given to be linearly independent. Show that R33 is a representative matrix of V × T3, R12 of V · T2, R21 of V · (T1 ∪ T2) × T1 as well as of V × (T1 ∪ T3) · T1 (and hence these spaces must be the same).

ii. How would R look if V · (T1 ∪ T2) has T1, T2 as separators?
Exercise 3.54 (k) Let

            T1     T2
R  =  [   R11     0   ]
      [   R21    R22  ]                    (3.9)
      [    0     R33  ]

Suppose rows of [ R11 / R21 ] and of [ R22 / R33 ] (stacked) are linearly independent. Show that the number of rows of R2· = r(V · T2) − r(V × T2)

(= r(V · T1) − r(V × T1)).
Exercise 3.55 (k) Prove: Let ξ⊥(·) be the ξ(·) function for V⊥. Then ξ⊥(T) = ξ(T), ∀ T ⊆ S.

Exercise 3.56 (k) Show that the union of a forest of G × T and a forest of G · (E − T) is a forest of G. Hence, (Corollary 3.4.3) r(G × T) + r(G · (E − T)) = r(G).
Exercise 3.57 (k) Prove: ν(G · T) + ν(G × (E − T)) = ν(G).
Exercise 3.58 (k) Prove: Let G be a graph on E. Then T ⊆ E is a separator of G (i.e., no circuit intersects both T and E − T) (Subsection 3.2.9) iff T is a separator of Vv(G). Hence, T is a separator of G iff r(G · T) = r(G × T).
Exercise 3.59 Let T be a separator of G. Let G · T, G · (E − T) have α1, α2 forests respectively, β1, β2 circuits respectively and γ1, γ2 cutsets respectively. How many forests, coforests, circuits and cutsets does G have?
3.4.6 Minty’s Theorem
Tellegen’s Theorem is generally regarded as the most fundamental result in Electrical Network Theory. There is, however, another fundamental result which can be proved to be formally equivalent to Tellegen’s Theorem [Narayanan85c] and whose utility is comparable to the latter. This is Minty’s Theorem (strong form) [Minty60], which we state and prove below.
Theorem 3.4.7 (Minty’s Theorem (strong form)) Let G be a directed graph. Let E(G) be partitioned into red, blue and green edges. Let e be a green edge. Then e either belongs to a circuit containing only blue and green edges with all green edges of the same direction with respect to the orientation of the circuit, or e belongs to a cutset containing only red and green edges with all green edges of the same direction with respect to the orientation of the cutset, but not both.

Proof: We first prove the weak form: ‘in a graph each edge is present in a directed circuit or in a directed cutset but not both’.
Proof of weak form: We claim that a directed circuit and a directed cutset of the same graph cannot intersect. For, suppose otherwise. Let the directed cutset have the orientation (V1, V2). The directed circuit subgraph must necessarily have vertices in V1 as well as in V2 in order that the intersection be nonvoid. But if we traverse the circuit subgraph starting from a node in V1 we would at some stage cross over into V2 by an edge e12 and later return to V1 by an edge e21. Now e12, e21 have the same orientation with respect to the circuit, which means that if one of them has its positive end point in V1 and negative end point in V2, the other must have its positive and negative end points in V2, V1, respectively. But this contradicts the fact that they both belong to the same directed cutset with orientation (V1, V2). Next we show that any edge e must belong either to a directed circuit or to a directed cutset. To see this, start from the negative end point n2 of the edge and reach as many nodes of the graph as possible through directed paths. If through one of these paths we reach the positive end point n1 of e we can complete the directed circuit using e. Suppose n1 is not reachable through directed paths from n2. Let the set of all nodes reachable by directed paths from n2 be enclosed in a surface. This surface cannot contain n1 and has at least one edge, namely e, with one end inside the surface and one outside. It is clear that all such edges must be directed into the surface, as otherwise the surface can be enlarged by including more reachable nodes. This collection of edges is a directed crossing edge set and contains a directed cutset which has e as a member (see Exercise 3.60). This completes the proof of the weak form.
Proof of strong form: We open the red edges r and short the blue edges b to obtain from G, the graph Gg on the green edge set g, i.e., Gg = G × (E(G) − b) · g. In this graph the weak form holds. Suppose the edge e is part of a directed cutset in Gg. Then this is still a directed cutset containing only green edges in G · (E(G) − r). (By Lemma 3.4.2, a set C ⊆ T ⊆ E(G) is a cutset of G × T iff it is a cutset of G.) It would be a part of a red and green cutset in G when red edges are introduced between existing nodes. On the other hand, suppose the edge e is part of a directed circuit in Gg. Then this is still a directed circuit containing only green edges in G × (E(G) − b). (By Lemma 3.4.1, a set C ⊆ T ⊆ E(G) is a circuit of G · T iff it is a circuit of G.) It would be a part of a blue and green circuit in G when blue edges are introduced by splitting existing nodes. Thus, the strong form is proved. □
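The reachability argument in the proof of the weak form translates directly into a small program. The sketch below is mine, not the book's: an edge is stored as a (tail, head) pair with the tail as its positive end point and the head as its negative end point, and the function name is an assumption. Starting from the head (the negative end n2), the code either finds a directed path back to the tail (closing a directed circuit through e) or returns the directed crossing edge set of the surface argument.

```python
from collections import defaultdict

def classify_edge(edges, e_index):
    """Weak form of Minty's theorem: each edge lies in a directed
    circuit or in a directed crossing edge set, but never both."""
    tail, head = edges[e_index]
    succ = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
    # nodes reachable from the head (negative end) along directed paths
    reached, stack = {head}, [head]
    while stack:
        u = stack.pop()
        for v in succ[u]:
            if v not in reached:
                reached.add(v)
                stack.append(v)
    if tail in reached:
        # a directed path head -> tail plus e gives a directed circuit
        return ('circuit', None)
    # otherwise every edge crossing the reachable set points into it;
    # these edges form a directed crossing edge set containing e
    crossing = [i for i, (u, v) in enumerate(edges)
                if (v in reached) != (u in reached)]
    return ('cutset', crossing)
```

For instance, on the edges (0,1), (1,2), (2,0), (2,3), edge 0 closes a directed circuit, while edge (2,3) dangles and is classified into a crossing edge set consisting of itself.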
Exercise 3.60 (k) Let e be a member of a directed crossing edge set C. Show that there exists a directed cutset C1 s.t. e ∈ C1 ⊆ C.

Exercise 3.61 (k) A Generalization: Prove: Let V be a vector space on S over the real field and let e ∈ S. Then e is in the support of a nonzero nonnegative vector f in V or in the support of a nonzero nonnegative vector g in V⊥ but not in both.

Exercise 3.62 (k) Partition into strongly connected components: Prove: The edges of a directed graph can be partitioned into two sets - those that can be included in directed circuits and those which can be included in directed cutsets.

i. Hence show that the vertex set of a directed graph can be partitioned into blocks so that any pair of vertices in each block are reachable from each other; a partial order can be imposed on the blocks s.t. Bi ≥ Bj iff a vertex of Bj can be reached from a vertex of Bi.

ii. Give a good algorithm for building the partition as well as the partial order.
3.5 Problems
Problems on Graphs

Problem 3.1 (k) If a graph has no odd degree vertices, then it is possible to start from any vertex and travel along all edges without repeating any edge and to return to the starting vertex. (Repetition of nodes is allowed.)

Problem 3.2 (k) Any graph on 6 nodes has either 3 nodes which are pairwise adjacent or 3 nodes which are pairwise non-adjacent.

Problem 3.3 (k) A graph is made up of parallel but oppositely directed edges only. Let T, E − T be a partition of the edges of G such that
i. if e ∈ T then the parallel oppositely directed edge e′ ∈ T;

ii. it is possible to remove from each parallel pair of edges in T (E − T) one of the edges so that the graph is still strongly connected.

Show that it is possible to remove one edge from each parallel pair of edges in G so that the graph remains strongly connected.

Problem 3.4 (k) We denote by Kn, the graph on n nodes with a single edge between every pair of nodes, and by Km,n the bipartite graph (i.e., no edges between left vertices and no edges between right vertices) on m left vertices and n right vertices, with edges between every pair of right and left vertices.

i. How many edges do Kn, Km,n have?
ii. Show that every circuit of Km,n has an even number of edges.

iii. Show that Kn has n^(n−2) trees.

iv. A vertex colouring is an assignment of colours to vertices of the graph so that no two of them which have the same colour are adjacent. What is the minimum number of colours required for Kn, Km,n?

Problems on Circuits

Problem 3.5 [Whitney35] Circuit Matroid: Show that the collection C of circuits of a graph satisfies the matroid circuit axioms:

i. If C1, C2 ∈ C then C1 cannot properly contain C2.

ii. If ec ∈ C1 ∩ C2, ed ∈ C1 − C2, then there exists C3 ∈ C, C3 ⊆ C1 ∪ C2, s.t. ec ∉ C3 but ed ∈ C3.
Problem 3.6 (k) Circuit Characterization:

i. A subset of edges C is a circuit of a graph iff it is a minimal set of edges not intersecting any cutset in a single branch.

ii. Same as (i) except ‘single branch’ is replaced by ‘odd number of branches’.

iii. C is a circuit of a graph iff it is a minimal set of branches not contained in any forest (intersecting every coforest).

Problem 3.7 (k) Cyclically Connected in terms of Edges: A graph in which any two vertices can be included in a circuit subgraph is said to be cyclically connected. In such a graph any two edges can also be so included.

Problem 3.8 (k) Cut Vertex: A graph with no coloops is cyclically connected iff it has no cut vertex (a vertex whose removal along with its incident edges disconnects the graph).
Problems on Cutsets
Problem 3.9 (k) Cutset Matroid: Show that the collection of cutsets of a graph satisfies the circuit axioms of a matroid.

Problem 3.10 (k) Cutset Characterization:

i. A subset of edges C is a cutset of a graph iff it is a minimal set of edges not intersecting any circuit in a single edge (in an odd number of edges).

ii. C is a cutset of a graph iff it is a minimal set of branches not contained in any coforest (intersecting every forest).

Problem 3.11 (k) Show that every crossing edge set is a disjoint union of cutsets.

Problem 3.12 (k) Cyclically Connected in terms of Edges in Cutsets: In a cyclically connected graph any two edges can be included in a cutset.

Problems on Graphs and Vector Spaces

Problem 3.13 (k) Show directly that KCE of a tree graph has only the trivial solution. What is the structure for which KVE has only the trivial solution?

Problem 3.14 Rank of Incidence Matrix of a Tree Graph: Give three proofs for ‘rank of incidence matrix of a tree graph = number of edges of the graph’ using
(a) the determinant of a reduced incidence matrix, (b) current injection, (c) by assuming branches to be voltage sources and evaluating node potentials.

Problem 3.15 (k) Nontrivial KCE Solution and Coforest: Prove directly that the support of every nonzero solution to KCE meets every coforest. Hence, the rows of an f-circuit matrix of G span Vi(G). Hence, r(Vi(G)) = e − (n − p).

Problem 3.16 (k) Nontrivial KVE Solution and Forest: Prove directly that the support of every nonzero solution to KVE meets every forest. Hence, the rows of an f-cutset matrix of G span Vv(G). Hence, r(Vv(G)) = (n − p).

Problem 3.17 (k) Determinants of Submatrices of Incidence Matrix: The determinant of every submatrix of the incidence matrix A is 0, ±1. Hence, this property also holds for every Qf and Bf.

Problem 3.18 Interpreting Current Equations: Let A be an incidence matrix.
i. Find one solution to Ax = b, if it exists, by inspection (giving a current injection interpretation).
ii. Find one solution to A^T y = v by inspection (using voltage sources as branches).
Problem 3.19

i. Let A be the incidence matrix of G. If Ax = b is equivalent to Qf x = b′, relate b′ to b. Using current injection give a simple rule for obtaining b′ from b.

ii. If Qf1 x = b1 and Qf2 x = b2 are equivalent, give a simple rule for obtaining b1 from b2.

iii. If Bf1 y = d1 and Bf2 y = d2 are equivalent, give a simple rule for obtaining d1 from d2.
Problem 3.20 If two circuit (cutset) vectors figure in the same f-circuit (f-cutset) matrix, show that the signs of the overlapping portion fully agree or fully oppose. So overlapping f-circuits (f-cutsets) fully agree or fully oppose in their orientations.

Problem 3.21 (k) Give simple rules for computing A A^T, Bf Bf^T, Qf Qf^T. Show that the number of nonzero entries of A A^T is 2e. Show that Bf Bf^T, Qf Qf^T may not have any zero entries. Hence observe that nodal analysis is preferable to fundamental loop analysis and fundamental cutset analysis from the point of view of using Gaussian elimination. (Consider the case where a single edge lies in every circuit (cutset) corresponding to rows of Bf (Qf).)

Problem 3.22 Under what conditions can two circuit (cutset) vectors of a given graph be a part of the same f-circuit (f-cutset) matrix?

Problem 3.23 (k) Construct good algorithms for building f-circuit and f-cutset vectors for a given forest (use dfs or bfs described in Subsections 3.6.1, 3.6.2). Compute the complexity.
vi(G):
Prove that the following algorithm works for building a representative matrix of
ui(5): G2
51 be a subgraph of 5, be a subgraph of 5 s.t. E ( & ) n E ( & ) is a forest of GI,
bk
be a subgraph of G s. t. E(5k) f
Let
;:u(
l
[u:z:
E(G,)] is a forest of the subgraph
E ( 6 i ) ) and let U E(Gi) = E(G). Build representative matrices Rj for Ui(Gj),j= 1,2;'.k. Extend the rows of Rj to size E(G) by padding with 0s. Call the resulting matrix Rj. Then R is a representative matrix for V i ( G ) , where
G.
Problem 3.25 Equivalence of Minty’s and Tellegen’s Theorems: Prove that Minty’s Theorem (strong form) and TeJJegen’s Theorem (strong form) are formally equivalent. Problems on Basic Operations of Graphs Problem 3.26 (k) Let CJ be graph. Let K i. K is a forest of
C E(G). Then
G .7‘ iff i t js a maximal intersection of forests of G with T .
ii. K is a forest of G x T iff it is a minimal intersection of forests of iii. K is a forest of
with ?’.
G x T iff KU (a forest of G . ( S - T ) )is a forest of G.
iv. K is a coforest of
G . T iff KU (a coforest of G
x ( S - T ) ) is a coforest of
G.
Problem 3.27 Relation between Forests Built According to Priority and Graph Minors: Let A1, . . . , An be pairwise disjoint subsets of E(G).

i. A forest f of G contains edges from these sets in the same priority iff it is the union of forests from G · A1, G · (A1 ∪ A2) × A2, G · (A1 ∪ A2 ∪ A3) × A3, . . . , G × An.

ii. Suppose the graph has only such forests; what can you conclude?

iii. What can you conclude if the priority sequences Ai, i = 1, . . . , n and Aσ(i), i = 1, . . . , n for every permutation σ of 1, . . . , n yield the same forests?
Problem 3.28 (k) Show how to build an f-circuit (f-cutset) matrix of G in which f-circuit (f-cutset) matrices of G · T and G × (E − T) become ‘visible’ (appear as submatrices). Let T2 ⊆ T1 ⊆ E(G). Repeat the above so that the corresponding matrix of G × T1 · T2 is ‘visible’.

Problem 3.29 (k) Suppose in an electrical network on graph G the subset T is composed of current (voltage) sources. How will you check that they do not violate KCL (KVL)?
3.6 Graph Algorithms
In this section we sketch some of the basic graph algorithms which we take for granted in the remaining part of the book. The algorithms we consider are

construction of trees and forests of various kinds for the graph (bfs, dfs, minimum spanning),
finding the connected components of the graph,
construction of the shortest path between two vertices of the graph,
construction of restrictions and contractions of the graph,
bipartite graph based algorithms such as for dealing with partitions,
flow maximization in networks.

The account in this section is very brief and informal. For more details the readers are referred to [Aho+Hopcroft+Ullman74], [Kozen92], [Cormen+Leiserson+Rivest90].
For each of the above algorithms we compute or mention the ‘asymptotic worst case complexity’ of the algorithm. Our interest is primarily in computing an upper bound for the worst case running time of the algorithm and sometimes also for the worst case storage space required for the algorithm. A memory unit, for us, contains a single elementary symbol (a number - integer or floating point, or an alphabet). Accessing or modifying such a location would be assumed to cost unit time. Operations such as comparison, addition, multiplication and division are all assumed to cost unit time. Here as well as in the rest of the book we use the ‘big Oh’ notation: Let f, g : N^p → N where N denotes the set of nonnegative integers and p is a positive integer. We say f is O(g) iff there exists a positive integer k s.t. f(n) ≤ k g(n) for all n outside a finite subset of N^p. The time and space complexity of an algorithm to solve a problem (the number of elementary steps it takes and the number of bits of memory it requires) would be computed in terms of the size of the problem instance. The size normally refers to the number of bits (within an independently specified multiplying constant) required to represent the instance of the problem in a computer. It could be specified in terms of several parameters. For example, in the case of a directed graph with capacitated edges the size would be in terms of number of vertices, number of edges and the maximum number of bits required to represent the capacity of an edge. In general, the size of a set would be its cardinality while the size of a number would be the number of bits required to represent it. Thus, if n is a positive integer, log n would be its size - the base being any convenient positive integer. All the algorithms we study in this book are polynomial time (and space) algorithms, i.e., their worst case complexity can be written in the form O(f(n1, . . . , np)) where f(·) is a polynomial in the ni. Further, in almost all cases, the polynomials would have low degree (≤ 5). Very rarely we have used words such as NP-complete and NP-Hard. Informally, a problem is in P if the ‘answer to it’ (i.e., the answer to every one of its instances) can be computed in polynomial time (i.e., in time polynomial in the size of the instance) and is in NP if the correctness of the candidate answer to every instance of it can be verified in polynomial time. It is clear that P ⊆ NP. However, although it is widely believed that P ≠ NP, a proof for this statement has not been obtained so far. An NP-Hard problem is one which has the property that if its answer can be computed in polynomial time, then we can infer that the answer to every problem in NP can be computed in polynomial time. An NP-Hard problem need not necessarily be in NP. If it is in NP, then it is said to be NP-complete. The
reader interested in formal definitions as well as in additional details is referred to [Garey+Johnson79], [Van Leeuwen90].
Exercise 3.63 A decision problem is one for which the answer is ‘yes’ or ‘no’. Convert the problem ‘find the shortest path between v1 and v2 in a graph’ into a ‘short’ sequence of decision problems.

For most of our algorithms elementary data structures such as arrays, stacks, queues are adequate. Where more sophisticated data structures (such as Fibonacci Heaps) are used, we mention them by name and their specific property (such as time for retrieval, time for insertion etc.) that is needed in the context. Details are skipped and may be found in [Kozen92].
Storing a graph: A graph can be stored in the form of a sequence whose ith (composite) element contains the information about the ith edge (names of end points; if the edge is directed, the names of positive and negative end points). This sequence can be converted into another whose ith element contains the information about the ith node (names of incident edges, their other end points; if the graph is directed, the names of out-directed and in-directed edges and their other end points). We will assume that we can retrieve incidence information about the ith edge in O(1) time and about the ith node in O(degree of node i) time. The conversion from one kind of representation to the other can clearly be done in O(m + n) time where m is the number of edges and n is the number of vertices.
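The edge-list-to-node-list conversion above can be sketched in a few lines. The fragment is mine, not the book's (function name and conventions are assumptions): vertices are numbered 0 to n − 1, and edge i is given as its (positive, negative) end-point pair; a single O(m + n) pass builds the per-node lists.

```python
def to_adjacency(n, edge_list):
    """Convert an edge list (positive end, negative end per edge) into
    per-node incidence lists, in O(m + n) time as described above."""
    out_edges = [[] for _ in range(n)]  # edges directed out of each node
    in_edges = [[] for _ in range(n)]   # edges directed into each node
    for i, (p, q) in enumerate(edge_list):
        out_edges[p].append((i, q))     # (edge index, other end point)
        in_edges[q].append((i, p))
    return out_edges, in_edges
```

With this structure, scanning the edges incident on a node costs time proportional to its degree, as assumed in the complexity arguments that follow.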
Sorting and Searching: For sorting a set of indexed elements in order of increasing indices, algorithms of complexity O(n log n) are available, where n is the number of elements [Aho+Hopcroft+Ullman74]. We use such algorithms without naming them. In such a sorted list of elements, to search for a given indexed element takes O(log n) steps by using binary search.
3.6.1 Breadth First Search
A breadth first search (bfs) tree or forest for the given graph G is built as follows: Start from any vertex v0 and scan the edges incident on it. Select these edges and put the vertices v1, v2, . . . , vk0 which are adjacent to v0 in a queue in the order in which the edges between them and v0 were scanned. Mark v0 as belonging to component 1 and level 0. Mark v1, . . . , vk0 as belonging to component 1 and level 1 and as children of v0. Mark the vertex v0 additionally as a parent of its children (against each of its children). Suppose at any stage we have the queue vi1, . . . , vik and a set Mi of marked vertices. Start from the left end (first element, vi1) of the queue, scan the edges incident on it and select those edges whose other ends are unmarked. If a selected edge is between vi1 and the unmarked vertex vu, then the former (latter) is the parent (child) of the latter (former). Put the children of vi1 in the queue after vik and delete vi1 from the queue. Mark these vertices as belonging to the level next to that of vi1 and to the same
component as vi1 and as children of vi1 (against vi1). Mark the vertex vi1 as a parent of its children (against its children). Continue. When the graph is disconnected it can happen that the queue is empty but all vertices have not yet been marked. In this case continue the algorithm by picking an unmarked vertex. Mark it as of level 0 but as of component number one more than that of the previous vertex. Continue. STOP when all vertices of the graph have been marked.
At the end of the above algorithm we have a breadth first search forest made up of the selected edges and a partition of the vertex set of the graph whose blocks are the vertex sets of the components of the graph. The starting vertices in each component are called roots. The level number of each vertex gives its distance from the root (taking the length of each edge to be one). The path in the forest from a given vertex in a component to the root in the component is obtained by travelling from the vertex to its parent and so on back to the root. In a directed graph a bfs starting from any vertex would yield all vertices reachable from it through directed paths. In this case, while processing a vertex, one selects only the outward directed edges.

The complexity of the bfs algorithm is O(m + n) where m is the number of edges and n is the number of vertices. (Each edge is ‘touched’ at most twice. Each vertex other than the root is touched when an edge incident on it is touched or when it is a new root. Except where the root formation is involved, the labour involved in touching a vertex can always be absorbed in that of touching an edge. Each touching involves a fixed number of operations.) The complexity of computing all the reachable vertices from a given vertex or a set of vertices of a directed graph through bfs is clearly also O(m + n).
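The procedure above can be sketched in Python (the sketch and its names are mine, not the book's): adj[v] lists the neighbours of v, the queue is a collections.deque so that deletions from the left end are O(1), and the returned arrays record the parent, level and component number marked against each vertex.

```python
from collections import deque

def bfs_forest(n, adj):
    """Breadth first search forest: parent, level and component number
    for each vertex, following the marking scheme described above."""
    parent = [None] * n
    level = [None] * n
    comp = [None] * n
    c = 0
    for root in range(n):
        if comp[root] is not None:
            continue          # already reached from an earlier root
        c += 1                # new component, new root at level 0
        comp[root], level[root] = c, 0
        q = deque([root])
        while q:
            u = q.popleft()   # first element of the queue
            for v in adj[u]:
                if comp[v] is None:
                    comp[v], level[v], parent[v] = c, level[u] + 1, u
                    q.append(v)
    return parent, level, comp
```

Each edge is examined at most twice and each vertex enters the queue once, giving the O(m + n) bound of the text.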
3.6.2 Depth First Search
A depth first search (dfs) tree or forest for the given graph G is built as follows: Start from any vertex v0 and scan the edges incident on it. Select the first non-selfloop edge. Let v1 be its other end point. Put v0, v1 in a stack. (A stack is a sequence of data elements in which the last (i.e., latest) element would be processed first.) Mark v0 as belonging to component 1 and as having dfs number 0, v1 as belonging to component 1 and as having dfs number 1. Mark v0 as the parent of v1 (against v1) and v1 as a child of v0 (against v0). Suppose at any stage, we have the stack vi1, . . . , vik and a set Mi of marked vertices. Start from the top of the stack, i.e., from vik and scan the edges incident on it. Let e be the first edge whose other end point vi(k+1) is unmarked. Select e. Mark vi(k+1) as of dfs number one more than that of the highest dfs number of a vertex in Mi
and of component number same as that of vik. Mark (against vi(k+1)) vik as its parent and (against vik) vi(k+1) as one of its children. Add vi(k+1) to the top of the stack and repeat the process. Suppose vik has no incident edges whose other end points are unmarked. Then delete vik from the stack (so that vi(k−1) goes to the top of the stack). Continue. STOP when all vertices in the graph have been marked. When the graph is disconnected it can happen that the stack is empty but all vertices have not yet been marked. In this case continue the algorithm by picking an unmarked vertex. Give it a dfs number 0 but component number one more than that of the previous vertex. At the end of the above algorithm we have a depth first search forest made up of the selected edges and a partition of the vertex set of the graph whose blocks are the vertex sets of the components of the graph. The starting vertices in each component are called roots. The path in the forest from a given vertex in a component to the root in the component is obtained by travelling from the vertex to its parent and so on back to the root. The time complexity of the dfs algorithm can be seen to be O(m + n) where m is the number of edges and n, the number of vertices in the graph.
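A sketch of the procedure with an explicit stack (mine, not the book's; one reading assumed here is that, as each new root gets dfs number 0, the numbering restarts in every component):

```python
def dfs_forest(n, adj):
    """Depth first search forest: parent, dfs number and component
    number for each vertex, using an explicit stack as described above."""
    parent = [None] * n
    dfsnum = [None] * n
    comp = [None] * n
    c = 0
    for root in range(n):
        if dfsnum[root] is not None:
            continue
        c += 1
        counter = 0                 # each new root restarts numbering at 0
        dfsnum[root], comp[root] = counter, c
        stack = [root]
        while stack:
            u = stack[-1]           # top of the stack
            for v in adj[u]:
                if dfsnum[v] is None:
                    counter += 1
                    dfsnum[v], comp[v], parent[v] = counter, c, u
                    stack.append(v)
                    break
            else:
                stack.pop()         # no unmarked neighbour: backtrack
    return parent, dfsnum, comp
```

Rescanning adj[u] from the start on every visit keeps the sketch short; keeping a per-vertex iterator instead recovers the O(m + n) bound of the text.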
Exercise 3.64 (k) Let e be an edge outside a dfs tree of the graph. Let v1, v2 be the end points of e with dfs numbering a, b respectively. If b > a, show that v1 is necessarily an ancestor of v2 (ancestor ≡ parent, parent’s parent, . . . ).
The dfs tree can be used to detect 2-connected components of the graph in O(m + n) time [Aho+Hopcroft+Ullman74]. It can be used to construct the planar embedding of a planar graph in O(n) time [Hopcroft+Tarjan74], [Kozen92]. There is a directed version of the dfs tree using which a directed graph can be decomposed into strongly connected components (maximal subsets of vertices which are mutually reachable by directed paths). Using the directed dfs tree this can be done in O(m + n) time [Aho+Hopcroft+Ullman74].
Fundamental circuits: Let t be a forest of graph G and let e ∈ (E(G) − t). To construct L(e, t) we may proceed as follows: Do a dfs of G · t starting from any of its vertices. This would give a dfs number to every vertex in G · t. Let v1, v2 be the end points of e. From v1, v2 proceed towards the root by moving from child to parent until you meet the first common ancestor v3 of v1 and v2. This can be done as follows: Suppose v1 has a higher dfs number than v2. Move from v1 towards the root until you reach the first v1′ whose dfs number is less than or equal to that of v2. Now repeat the procedure with v2, v1′ and so on alternately until the first common vertex is reached. This would be v3. Then L(e, t) ≡ {e} ∪ {edges in the paths from v1 to v3 and v2 to v3}. To build the circuit vector corresponding to L(e, t) proceed as follows: Let v1 be the positive end point and v2, the negative end point of e. The path from v2 to v1 in the tree is the path from v2 to v3 followed by the path from v3 to v1. The circuit vector has value +1 at e, 0 outside L(e, t) and +1 (−1) at ej, if it is along (against) the path from v2 to v1 in the tree. Complexity of building L(e, t) is O(|L(e, t)|) and that of building all the L(ei, t) is O(Σ |L(ei, t)|).
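The climb towards the common ancestor can be sketched as follows. The sketch is mine and simplifies the batched alternation of the text into single steps: since dfs numbers decrease towards the root, the walk whose front has the larger dfs number can never be at the common ancestor, so it is the one advanced. It assumes parent and dfsnum come from a dfs of the tree G · t and that both end points lie in the same tree.

```python
def f_circuit(parent, dfsnum, e):
    """Vertices of the two tree paths defining the fundamental circuit
    L(e, t) of a non-tree edge e = (v1, v2); both paths end at the
    first common ancestor v3."""
    v1, v2 = e
    path1, path2 = [v1], [v2]
    while path1[-1] != path2[-1]:
        # advance the walk whose front has the larger dfs number
        if dfsnum[path1[-1]] > dfsnum[path2[-1]]:
            path1.append(parent[path1[-1]])
        else:
            path2.append(parent[path2[-1]])
    # tree edges along path1 and path2, together with e, form L(e, t)
    return path1, path2
```

The work done is proportional to the two path lengths, matching the O(|L(e, t)|) bound.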
Exercise 3.65 How would you build the f-circuit for a bfs tree?
3.6.3 Minimum Spanning Tree
We are given a connected undirected graph G with real weights on its edges. The problem is to find a spanning tree of least total weight (total weight = sum of weights of edges in the tree). We give Prim’s algorithm for this purpose: Choose an arbitrary vertex v0. Among the edges incident on v0, select one of least weight. Suppose at some stage, X is the set of edges selected and V(X), the set of their end points. If V(X) ≠ V(G), select an edge e of least weight among those which have only one end point in V(X). Now replace X by X ∪ e and repeat. Stop when V(X) = V(G). The selected edges constitute a minimum spanning tree.
Exercise 3.66 Justify Prim’s algorithm for minimum spanning tree.

Complexity: Let n be the number of vertices and m, the number of edges of the graph. The algorithm has n stages. At each stage we have to find the minimum weight edge among the set of edges with one end point in V(X). Such edges cannot be more than m in number. So finding the minimum is O(m) and the overall complexity is O(mn). However, this complexity can be drastically improved if we store the vertices in (V(G) − V(X)) in a Fibonacci Heap. This data structure permits the extraction of the minimum valued element in O(log n) amortized time (where n is the number of elements in the heap), changing the value of an element in O(1) amortized time and deleting the minimum element in O(log n) amortized time. (Loosely, an operation being of amortized time O(f(n)) implies that, if the entire running of the algorithm involves performing the operation k times, then the time for performing these operations is O(k f(n)).) For each vertex v in (V(G) − V(X)) the value is the minimum of the weights of the edges connecting it to V(X). To pick a vertex of least value we have to use O(log n) amortized time. Suppose v has been added to V(X) and X replaced by X ∪ e, where e has v as one of its ends. Now the value of a vertex v′ in (V(G) − (V(X) ∪ v)) has to be updated only if there is an edge between v and v′. Throughout the algorithm this updating has to be done only once per edge and each such operation takes O(1) amortized time. So overall the updating takes O(m) time. The extraction of the minimum valued element takes O(n log n) time over all the n stages. At each stage the minimum element has to be deleted from the heap. This takes O(log n) amortized time and O(n log n) time overall. Hence, the running time of the algorithm is O(m + n log n). (Note that the above analysis shows that, without the use of the Heap, the complexity of Prim’s algorithm is O(n²).)
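A sketch of Prim's algorithm using the standard library heapq module (mine, not the book's). heapq is a binary heap, so this variant runs in O(m log n) rather than the O(m + n log n) obtained above with a Fibonacci heap; instead of changing the value of a vertex in place, stale heap entries are simply skipped when popped.

```python
import heapq

def prim(n, adj):
    """Prim's algorithm with a binary heap.  adj[u] is a list of
    (weight, v) pairs; the graph is assumed connected."""
    in_tree = [False] * n
    tree_edges, total = [], 0
    heap = [(0, 0, None)]            # (weight, vertex, other end of edge)
    while heap:
        w, u, p = heapq.heappop(heap)
        if in_tree[u]:
            continue                 # stale entry: u already in V(X)
        in_tree[u] = True
        if p is not None:
            tree_edges.append((p, u))
            total += w
        # offer every edge with exactly one end point in V(X)
        for wt, v in adj[u]:
            if not in_tree[v]:
                heapq.heappush(heap, (wt, v, u))
    return total, tree_edges
```

On the weighted triangle with edges 0-1 of weight 1, 1-2 of weight 2 and 0-2 of weight 3, the two lighter edges are selected.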
3.6.4 Shortest Paths from a Single Vertex
We are given a graph G, without parallel edges, in which each edge e has a nonnegative length l(v_1, v_2), where v_1, v_2 are the end points of e. If v_1 = v_2, then l(v_1, v_2) = 0. The length of a path is defined to be the sum of the lengths of the edges in the path. The problem is to find shortest paths from a given vertex (called the source) to every vertex in the same connected component of the graph. We give Dijkstra's Algorithm for this problem.

Start from the source vertex v_0 and assign to each adjacent vertex v_i a current distance d_c(v_i) = l(v_0, v_i). Mark, against each v_i, the vertex v_0 as its foster parent. (We will call the v_i the foster children of v_0.) Let v_1 be the adjacent vertex to v_0 with the least value of d_c(v_i). Declare the final distance of v_1, d_f(v_1) = d_c(v_1). Mark, against v_1, the vertex v_0 as its parent. (We will call v_1 a child of v_0.) (At this stage we have processed v_0 and marked its adjacent vertices.) Assign a current distance ∞ to each unmarked vertex.

Suppose X ⊆ V(G) denotes the processed set of vertices at some stage. For each neighbour v_j ∈ V(G) − X of the last added vertex v_k, check if d_c(v_j) > d_f(v_k) + l(v_k, v_j). If yes, then mark, against v_j, the vertex v_k as its foster parent (deleting any earlier mark, if present), and set d_c(v_j) = d_f(v_k) + l(v_k, v_j). Find a vertex v_q ∈ V(G) − X with the least current distance d_c(v_q). Declare v_q to have been processed and its final distance d_f(v_q) from v_0 to be d_c(v_q). Mark, against v_q, its foster parent as its parent (we will call v_q a child of that vertex). Add v_q to X. Repeat the procedure with X ∪ v_q in place of X. STOP when all vertices in the connected component of v_0 are processed.

To find a shortest path from a processed vertex v_j to v_0, we travel back from v_j to its parent and so on, from child to parent, until we reach v_0.
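The procedure above can be sketched in Python as follows; heapq (a binary heap with lazy deletion) again stands in for the Fibonacci heap, and the names dijkstra, adj, parent are ours.

```python
import heapq

def dijkstra(adj, source):
    """adj: dict vertex -> list of (neighbour, length); lengths nonnegative.
    Returns (final distances, parent marks) for the component of source."""
    dist = {source: 0}
    parent = {source: None}
    done = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue                  # stale entry
        done.add(u)                   # d is now the final distance d_f(u)
        for v, l in adj[u]:
            nd = d + l
            if v not in dist or nd < dist[v]:
                dist[v] = nd          # current distance d_c(v)
                parent[v] = u         # foster parent of v
                heapq.heappush(heap, (nd, v))
    return dist, parent

def shortest_path(parent, v):
    """Trace back from v through the parent marks to the source."""
    path = []
    while v is not None:
        path.append(v)
        v = parent[v]
    return path[::-1]
```

When a vertex is extracted for the first time its foster parent mark is final, which is exactly the moment the text declares it processed.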
Justification: To justify the above algorithm, we need to show that the shortest distance from v_0 to v_q (the vertex with the least current distance in V(G) − X) is indeed d_f(v_q). First, we observe that a finite d_c(v), and therefore d_f(v), for any vertex v is the length of some path from v_0 to v. By induction, we may assume that for every vertex v_in in X, d_f(v_in) = length of the shortest path from v_0 to v_in. Note that this is justified when X = {v_0}. Suppose d_f(v_q) is greater than the length of a shortest path P(v_0, v_q) from v_0 to v_q. Let P(v_0, v_q) leave X for the first time at v_3 and let the next vertex be v_out ∈ V(G) − X. If v_out = v_q, we must have d_f(v_q) ≤ d_f(v_3) + l(v_3, v_q) ≤ length of P(v_0, v_q). This is a contradiction. So v_out ≠ v_q. Now d_c(v_out) ≤ d_f(v_3) + l(v_3, v_out) ≤ length of P(v_0, v_q). Hence, d_c(v_out) < d_c(v_q) = d_f(v_q), which contradicts the definition of v_q. We conclude that d_f(v_q) must be the length of the shortest path from v_0 to v_q.
3.6. GRAPH ALGORITHMS
Complexity: Let n be the number of vertices and m, the number of edges of the graph. This algorithm has n stages. At each stage we have to compute d_c(v_j) for vertices v_j adjacent to the last added vertex. This computation cannot exceed O(m) over all the stages. Further, at each stage we have to find the minimum of d_c(v_i) over the vertices v_i in V(G) − X. This is O(n). So we have an overall complexity of O(n² + m). Now m ≤ n². So the time complexity reduces to O(n²). We note that the complexity of this algorithm reduces to O(m + n log n) if the elements in V(G) − X are stored in a Fibonacci heap (see [Kozen92]).

3.6.5 Restrictions and Contractions of Graphs
Let G be a graph and let T ⊆ E(G). To build G · T, we merely pick out the edge-end point list corresponding to T. This has complexity O(|T|). (Note that the edges of T still bear their original index as in the sequence of edges of G.) To build G × T we first build G open T. The graph G open T has the edge-end point list of G · (E(G) − T), with the remaining vertices of G as isolated vertices. Next we find the connected components of G open T. Let the vertex sets of the components be X_1, ..., X_k. For each X_i, whenever v ∈ X_i, mark it as belonging to X_i. Now in the edge-end point list of T, for each edge e, if v_1, v_2 are its (positive and negative) endpoints, and if v_1 ∈ X_i, v_2 ∈ X_j, then replace v_1 by X_i and v_2 by X_j (G × T has vertex set {X_1, ..., X_k}). The complexity of building G open T is O(n + |E − T|), where n is the number of vertices of G; that of finding its components is O(n + |E − T|) (using dfs, say); that of changing the names of vertices is O(|T|). So the overall complexity is O(n + m), where m = |E(G)|.
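As an illustration, a sketch (ours) of building G × T: a union-find structure over the edges of E(G) − T yields the component vertex sets X_i of G open T, and the endpoints of the edges of T are then renamed by component.

```python
def contract(n, edges, T):
    """G x T: n vertices 0..n-1; edges: list of (u, v) indexed 0..m-1;
    T: set of edge indices to keep.  Returns (edge list of G x T with
    endpoints renamed to component ids, number of new vertices)."""
    comp = list(range(n))          # union-find parent array

    def find(x):
        while comp[x] != x:
            comp[x] = comp[comp[x]]   # path halving
            x = comp[x]
        return x

    # components of G open T, i.e. of the graph with edge set E - T
    for i, (u, v) in enumerate(edges):
        if i not in T:
            comp[find(u)] = find(v)
    # rename each vertex by its component X_i
    names = {}
    for x in range(n):
        names.setdefault(find(x), len(names))
    new_edges = [(i, names[find(u)], names[find(v)])
                 for i, (u, v) in enumerate(edges) if i in T]
    return new_edges, len(names)
```

The edges of T keep their original index, as the text requires.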
Elsewhere, we describe methods of network analysis (by decomposition) which require the construction of the graphs G · E_1, ..., G · E_k or G × E_1, ..., G × E_k, where {E_1, ..., E_k} is a partition of E(G). The complexity of building ⊕_i G · E_i is clearly O(n + m), while that of building ⊕_i G × E_i is O(k(n + m)).
3.6.6 Hypergraphs represented by Bipartite Graphs
Hypergraphs are becoming increasingly important for modeling many engineering situations. By definition, a hypergraph H is a pair (V(H), E(H)), where V(H) is the set of vertices of H and E(H), a family of subsets of V(H) called the hyperedges of H. (We remind the reader that in a family, the same member subset could be repeated with distinct indices yielding distinct members of the family.) The reader would observe that undirected graphs are a special case of hypergraphs (with the hyperedges having cardinality 1 or 2). The most convenient way of representing a hypergraph is through a bipartite graph B ≡ (V_L, V_R, E), a graph which has a left vertex set V_L, a (disjoint) right vertex set V_R, and a set of edges E, each having one end in V_L and the other in V_R. We could represent H by B_H ≡ (V_L, V_R, E), identifying V(H) with V_L and E(H) with V_R, with an edge in the bipartite graph between v ∈ V_L and e ∈ V_R iff v is a member of the hyperedge e of H.
We can define connectedness for H in a manner similar to the way the notion is defined for graphs. H is connected iff for any pair of vertices v_1, v_f there exists an alternating sequence v_1, e_1, v_2, e_2, ..., e_{f−1}, v_f, where the v_i are vertices and the e_i edges, s.t. each edge has both the preceding and succeeding vertices as members. It is easily seen that H is connected iff B_H is connected. Hence, checking connectedness of H can be done in O(|V_L| + |V_R| + |E|) time. Since everything about a hypergraph is captured by a bipartite graph, we confine our attention to bipartite graphs in this book. The reader interested in 'standard' hypergraph theory is referred to [Berge73].
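A minimal sketch (ours) of this reduction: the hypergraph is stored as its bipartite graph B_H, and connectedness is checked by a bfs on B_H.

```python
from collections import deque

def hypergraph_connected(vertices, hyperedges):
    """vertices: iterable of vertex names; hyperedges: list of vertex sets.
    Builds B_H (left nodes = vertices, right nodes = hyperedges) and
    checks that every vertex is reached by a bfs on B_H."""
    vertices = list(vertices)
    if not vertices:
        return True
    adj = {('v', v): [] for v in vertices}
    for i, e in enumerate(hyperedges):
        adj[('e', i)] = []
        for v in e:                        # v is a member of hyperedge e
            adj[('v', v)].append(('e', i))
            adj[('e', i)].append(('v', v))
    start = ('v', vertices[0])
    seen = {start}
    q = deque([start])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                q.append(y)
    return all(('v', v) in seen for v in vertices)
```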
3.6.7 Preorders and Partial Orders
A preorder is an ordered pair (P, ≤) where P is a set and '≤' is a binary relation on P that satisfies the following:

x ≤ x, ∀ x ∈ P;  x ≤ y, y ≤ z ⇒ x ≤ z, ∀ x, y, z ∈ P.

We can take the elements of P to be vertices and join x and y by an edge directed from y to x if x ≤ y. Let G_P be the resulting directed graph on the vertex set P. Then the vertex sets of the strongly connected components of G_P are the equivalence classes of the preorder (x, y belong to an equivalence class iff x ≤ y and y ≤ x). Let P̄ be the collection of equivalence classes. If X_1, X_2 ∈ P̄, we define X_1 ≤ X_2 iff in the graph G_P a vertex in X_1 can be reached from a vertex in X_2. It is easily seen that this defines a partial order (X_i ≤ X_i; X_i ≤ X_j and X_j ≤ X_i iff X_i = X_j; X_i ≤ X_j, X_j ≤ X_k ⇒ X_i ≤ X_k). This partial order (P̄, ≤) is said to be induced by (P, ≤). By using a directed dfs forest on the graph G_P representing the preorder (P, ≤) we can get a graph representation of the induced partial order in time O(m + n), where m is the number of edges and n is the number of vertices in G_P [Aho+Hopcroft+Ullman74].
A partial order can be represented more economically by using a Hasse diagram. Here a directed edge goes from a vertex y to a vertex x iff y covers x, i.e., x ≤ y, x ≠ y and there is no z s.t. z ≠ x, z ≠ y and x ≤ z ≤ y. An ideal I of (P, ≤) is a collection of elements of P with the property that if x ∈ I and y ≤ x then y ∈ I. The principal ideal I_x in (P, ≤) of an element x ∈ P is the collection of all elements y ∈ P s.t. y ≤ x. Clearly an ideal is the union of the principal ideals of its elements. A dual ideal I^d is a subset of P with the property that if x ∈ I^d and x ≤ z then z ∈ I^d. Ideals and dual ideals of preorders are defined similarly. The dual of a partial order (P, ≤) is the partial order (P, ≥), where x ≥ y iff y ≤ x. We define the dual of a preorder in the same manner. We use ≤ and ≥ interchangeably (writing y ≤ x or x ≥ y) while speaking of a partial order or a preorder. Preorders and partial orders are used repeatedly in this book (see for instance Chapter 10).
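The construction of the induced partial order can be illustrated by the following small sketch, ours and deliberately naive: it queries the preorder relation directly instead of running the linear-time directed dfs-forest construction mentioned above.

```python
def induced_partial_order(P, le):
    """P: list of elements; le(x, y) == True iff x <= y in the preorder.
    Returns (equivalence classes, set of pairs (i, j) with class i <= class j)."""
    # equivalence classes: x, y together iff x <= y and y <= x
    classes = []
    for x in P:
        for c in classes:
            if le(x, c[0]) and le(c[0], x):
                c.append(x)
                break
        else:
            classes.append([x])
    # class order via representatives (well defined within a class)
    order = {(i, j)
             for i, ci in enumerate(classes)
             for j, cj in enumerate(classes)
             if le(ci[0], cj[0])}
    return classes, order
```

On the preorder x ≤ y iff x mod 3 ≤ y mod 3, the classes are exactly the residue classes mod 3 and the induced order is the usual order on residues.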
Lattices

Let (P, ≤) be a partial order. An upper bound of e_1, e_2 ∈ P is an element e_3 ∈ P s.t. e_1 ≤ e_3 and e_2 ≤ e_3. A lower bound of e_1 and e_2 would be an element e_4 ∈ P s.t. e_4 ≤ e_1 and e_4 ≤ e_2. A least upper bound (l.u.b.) of e_1, e_2 would be an upper bound e″ s.t. whenever e_3 is an upper bound of e_1, e_2 we have e_3 ≥ e″. A greatest lower bound (g.l.b.) of e_1, e_2 would be a lower bound e′ s.t. whenever e_4 is a lower bound of e_1, e_2 we have e_4 ≤ e′. It is easy to see that if the l.u.b. (g.l.b.) of e_1, e_2 exists, then it must be unique. We denote the l.u.b. of e_1, e_2 by e_1 ∨ e_2 and call it the join of e_1 and e_2. The g.l.b. of e_1, e_2 is denoted by e_1 ∧ e_2 and called the meet of e_1 and e_2. If every pair of elements in P has a g.l.b. and an l.u.b. we say that (P, ≤) is a lattice.

A lattice can be defined independently of a partial order, taking two operations '∨' and '∧' as primitives satisfying the properties given below:

(idempotency) x ∨ x = x, x ∧ x = x, ∀ x ∈ P.
(commutativity) x ∨ y = y ∨ x, x ∧ y = y ∧ x, ∀ x, y ∈ P.
(associativity) (x ∨ y) ∨ z = x ∨ (y ∨ z), (x ∧ y) ∧ z = x ∧ (y ∧ z), ∀ x, y, z ∈ P.
(absorption) x ∧ (x ∨ y) = x ∨ (x ∧ y) = x, ∀ x, y ∈ P.

The reader may verify that these properties are indeed satisfied by the g.l.b. and l.u.b. operations if we start from a partial order. A lattice that satisfies the following additional property is called a distributive lattice:

(distributivity) x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z), x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z), ∀ x, y, z ∈ P.

E.g. the collection of all subsets of a given set, with union as the join operation and intersection as the meet operation, is a distributive lattice. (For a comprehensive treatment of lattice theory see [Birkhoff67].)
Exercise 3.67 Show that the collection of ideals of a partial order forms a distributive lattice under union and intersection.
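The exercise can be checked by brute force on a small example; the code and the example poset (divisibility on {1, 2, 3, 6}) are ours.

```python
from itertools import combinations

def ideals(P, le):
    """All ideals of the partial order (P, le): subsets I such that
    x in I and y <= x imply y in I."""
    out = []
    for r in range(len(P) + 1):
        for comb in combinations(P, r):
            I = set(comb)
            if all(y in I for x in I for y in P if le(y, x)):
                out.append(frozenset(I))
    return out

# divisibility order on {1, 2, 3, 6}: le(x, y) iff x divides y
P = [1, 2, 3, 6]
le = lambda x, y: y % x == 0
ids = set(ideals(P, le))
# ideals are closed under union and intersection; since union and
# intersection of sets distribute over each other, the ideals form a
# distributive lattice
assert all(a | b in ids and a & b in ids for a in ids for b in ids)
```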
3.6.8 Partitions
Let S be a finite set. A collection {S_1, ..., S_k} of nonvoid subsets of S is a partition of S iff ∪_i S_i = S and S_i ∩ S_j = ∅ whenever i, j are distinct. If {S_1, ..., S_k} is a partition of S then the S_i are referred to as its blocks. Let P_S denote the collection of all partitions of S. We may define a partial order (P_S, ≤) on P_S as follows: Let Π_1, Π_2 ∈ P_S. Then Π_1 ≤ Π_2 (equivalently Π_2 ≥ Π_1) iff each block of Π_1 is contained in a block of Π_2. We say Π_1 is finer than Π_2, or Π_2 is coarser than Π_1. If Π_a, Π_b are two partitions of S, the join of Π_a and Π_b, denoted by Π_a ∨ Π_b, is the finest partition of S that is coarser than both Π_a and Π_b, and the meet of Π_a and Π_b, denoted by Π_a ∧ Π_b, is the coarsest partition of S that is finer than both Π_a and Π_b. It can be seen that these notions are well defined: To obtain the meet of Π_a and Π_b we take the intersection of each block of Π_a with each block of Π_b and throw away the empty intersections. Observe that any element of S lies in precisely one such intersection.
Clearly, the resulting partition Π_ab is finer than both Π_a and Π_b. Suppose Π_c is finer than both Π_a and Π_b. Let N_c be a block of Π_c. Then N_c is contained in some block N_a of Π_a and some block N_b of Π_b. So N_c ⊆ N_a ∩ N_b and hence N_c is contained in some block of Π_ab. This proves that Π_ab is the meet of Π_a and Π_b and therefore, that the 'meet' is well defined. Next let Π, Π′ be two partitions coarser than Π_a and Π_b. It is then easy to see that Π ∧ Π′ is also coarser than Π_a and Π_b. Hence there is a unique finest partition of S coarser than Π_a and Π_b. Thus, the 'join' is well defined.
Storing partitions: We can store a partition by marking against an element of S the name of the block to which it belongs.

Building Π_a ∧ Π_b: When Π_a, Π_b are stored, each element of S would have against it two names, a block of Π_a and a block of Π_b; a pair of names of intersecting blocks of Π_a, Π_b can be taken to be the name of a block of Π_a ∧ Π_b. Thus forming Π_a ∧ Π_b from Π_a, Π_b is O(|S|).

Building Π_a ∨ Π_b: We first build a bipartite graph B with the blocks of Π_a as V_L and the blocks of Π_b as V_R, with an edge between N_a ∈ V_L and N_b ∈ V_R iff N_a ∩ N_b ≠ ∅. It can be seen that this bipartite graph can be built in O(|S|) time (for each element of S, check which blocks of Π_a, Π_b it belongs to). We find the connected components of this bipartite graph. This can be done in O(m + n) time, where m is the number of edges and n the number of vertices in the bipartite graph. But both m and n do not exceed |S|. So O(m + n) = O(|S|). Now we collect the blocks of Π_a (or Π_b) belonging to the same connected component of B. Their union would make up a block of Π_a ∨ Π_b. (For, this block is the union of a collection K of blocks of Π_a as well as a union of some blocks of Π_b. The union of any proper subset of K would cut some block of Π_b.) This involves changing the name marked against an element u ∈ S: instead of, say, N_a, it would be N_c, the name of the connected component of B in which N_a is a vertex. Thus, building Π_a ∨ Π_b is O(|S|).
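The two constructions can be sketched as follows (the code is ours): a partition is stored as a map from elements to block names, and a union-find structure stands in for the connected-component computation on the bipartite graph B.

```python
def meet(pa, pb):
    """pa, pb: partitions as dicts element -> block name.
    A block of the meet is named by a pair of intersecting block names."""
    return {x: (pa[x], pb[x]) for x in pa}

def join(pa, pb):
    """A block of the join is the union of the blocks of pa lying in one
    connected component of the bipartite graph B on blocks of pa, pb."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for x in pa:  # element x gives an edge between intersecting blocks
        parent[find(('a', pa[x]))] = find(('b', pb[x]))
    # the component root serves as the new block name
    return {x: find(('a', pa[x])) for x in pa}
```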
3.6.9 The Max-Flow Problem
In this subsection we outline the max-flow problem and a simple solution for it. We also indicate the directions in which more sophisticated solutions lie. In subsequent chapters we use max-flow repeatedly to model various minimization problems. Other than the flexibility in modeling that it offers, the practical advantage of using the concept of max-flow lies in the availability of efficient algorithms.
Let G be a directed graph. The flow graph (or flow network) F(G) is the tuple (G, c, s, t), where c : E(G) → ℝ₊ is a real nonnegative capacity function on the edges of G, and s and t are two vertices of G named source and sink, respectively. A flow f associated with F(G) is a vector on E(G) satisfying the following conditions:

i. f satisfies KCE at all nodes except s and t, i.e., at each vertex v other than s, t, the net outward flow (the sum of f over the edges directed out of v minus the sum of f over the edges directed into v) is zero.
Figure 3.12: A Flow Graph with a max-flow and a min-cut
ii. the net outward flow at s is nonnegative, and at t, is non-positive.

iii. 0 ≤ f(e) ≤ c(e) ∀ e ∈ E(G).

(Often a flow is defined to be a vector satisfying (i) and (ii) above, while a feasible flow would be one that satisfies all three conditions.) An edge e with f(e) = c(e) is said to be saturated with respect to f. The value of the flow f, denoted by |f|, is the net outward flow at s. A flow of maximum value is called a max-flow. An s,t-cut (cut for short) is an ordered pair (A, B), where A, B are disjoint complementary subsets of V(G) s.t. s ∈ A and t ∈ B. The capacity of the cut (A, B), denoted by c(A, B), is the sum of the capacities of edges with positive end in A and negative end in B. A cut of minimum capacity is called a min-cut. The flow across (A, B), denoted by f(A, B), is the sum of the flows in the 'forward' edges going from A to B minus the sum of the flows in the 'backward' edges going from B to A.

Example: Figure 3.12 shows a flow graph. Alongside each directed edge is an ordered pair with the second component indicating the capacity of the edge. A feasible flow f is defined on this flow graph with f(e) being the first component of the ordered pair alongside e. The reader may verify that the net flow leaving any node other than the source s and the sink t is zero. At s there is a net positive outward flow (= 7) and at t there is a net negative outward flow (= −7). Let
A ≡ {s, a, b, c, d} and let B ≡ {g, f, t}. Then (A, B) is an s,t-cut. It can be verified that f(A, B) = 4 + 3 − 0 = 7. Observe that the forward edges (c, g) and (d, f) of the cut (A, B) are saturated while the backward edge (g, d) carries zero flow. It is clear that in the present case f(A, B) = c(A, B). From the arguments given below it would follow that the given flow has the maximum value, i.e., is a max-flow, and that the cut (A, B) is a min-cut, i.e., has minimum capacity.
Clearly the flow across an s,t-cut (A, B) cannot exceed the capacity of (A, B), i.e., f(A, B) ≤ c(A, B). Let (A, B) be an s,t-cut. If we add the outward flows at all nodes inside A we would get the value f(A, B) (the flow of each edge with both ends within A is added once with a (+) sign and another time with a (−) sign and hence cancels) as well as |f| (at all nodes other than s the net outward flow is zero). We conclude that |f| = f(A, B).

Let f be a flow in F(G). Let P be a path oriented from s to t. Suppose it is possible to change the flow in the edges of P, without violating capacity constraints, as follows: the flow in each edge e of P is increased (decreased) by δ > 0 if e supports (opposes) the orientation of P. Such a path is called an augmenting path for the flow f. Observe that this process does not disturb the KCE at any node except s, t. At s, the net outward flow goes up by δ, while at t, the net inward flow goes up by δ. Thus, if f′ is the modified flow, |f′| = |f| + δ. This is the essential idea behind flow maximization algorithms.
It is convenient to describe max-flow algorithms and related results in terms of the residual graph G_f associated with the flow f. The graph G_f has the vertex set V(G). Whenever e ∈ E(G) and f(e) < c(e), G_f has an edge e⁺ between the same end points and in the same direction as e; and if 0 < f(e), G_f has an edge e⁻ in the opposite direction to e. Note that both e⁺ and e⁻ may be present in G_f. The edge e⁺ has the residual capacity r_f(e⁺) ≡ c(e) − f(e) and the edge e⁻ has the residual capacity r_f(e⁻) ≡ f(e). We note that a directed path P from s to t in the residual graph G_f corresponds to an augmenting path in F(G) with respect to f. Henceforth we would call such a path P in G_f also an augmenting path of f. The maximum amount by which the flow can be increased using this augmenting path is clearly the minimum of the residual capacities of the edges of P. This value we would call the bottleneck capacity of P. We now present a simple algorithm for flow maximization. This algorithm is due to Edmonds and Karp [Edmonds+Karp72].
ALGORITHM 3.1 Algorithm Max-Flow

INPUT: A flow graph F(G) ≡ (G, c, s, t).

OUTPUT: (i) A maximum valued flow f_max for F(G). (ii) A min-cut (A, B) s.t. |f_max| = c(A, B).

Initialize: Let f be any flow of F(G) (f could be the zero flow, for instance).

STEP 1: Draw the residual graph G_f. Do a directed bfs starting from s.

STEP 2: If t is reached, we also have a shortest augmenting path P. Compute the bottleneck capacity δ of P. Increase the flow along P by δ. Let f′ be the new flow. Set f = f′ and GOTO STEP 1. If t is not reached, let A be the set of all vertices reached from s and let B ≡ V(G) − A. Declare f_max = f and the min-cut to be (A, B).

STOP.
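A compact Python sketch of Algorithm 3.1 (the Edmonds-Karp form: shortest augmenting paths found by bfs on the residual graph); the dictionary representation of the flow graph is our own choice.

```python
from collections import deque

def max_flow(cap, s, t):
    """cap: dict (u, v) -> capacity.  Returns (flow value, set A of
    vertices reachable from s in the final residual graph); (A, V - A)
    is then a min-cut."""
    r = dict(cap)                       # residual capacities
    for (u, v), c in cap.items():
        r.setdefault((v, u), 0)         # reverse edges start at 0
    adj = {}
    for (u, v) in r:
        adj.setdefault(u, []).append(v)
    value = 0
    while True:
        # STEP 1: directed bfs from s in the residual graph
        pred = {s: None}
        q = deque([s])
        while q and t not in pred:
            u = q.popleft()
            for v in adj.get(u, []):
                if v not in pred and r[(u, v)] > 0:
                    pred[v] = u
                    q.append(v)
        if t not in pred:               # no augmenting path: flow is maximum
            return value, set(pred)
        # STEP 2: bottleneck capacity of the shortest augmenting path
        path, v = [], t
        while pred[v] is not None:
            path.append((pred[v], v))
            v = pred[v]
        delta = min(r[e] for e in path)
        for (u, v) in path:
            r[(u, v)] -= delta
            r[(v, u)] += delta
        value += delta
```

On a small example the routine returns both the max-flow value and the source side A of a min-cut, as the algorithm's OUTPUT specifies.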
Justification of Algorithm 3.1: We need the following

Theorem 3.6.1 (Max-Flow Min-Cut Theorem [Ford+Fulkerson56], [Ford+Fulkerson62])
i. The flow reaches its maximum value iff there exists no augmenting path.
ii. The maximum value of a flow in F(G) is the minimum value of the capacity of a cut.

Proof: If a flow has maximum value it clearly cannot permit the existence of an augmenting path. If there exists no augmenting path, the directed bfs from s in the residual graph will not reach t. Let A be the set of all vertices reached from s and let B be the complement of A. All edges with one end in A and the other in B must be directed into A, as otherwise the set of reachable vertices can be enlarged. Now consider the corresponding edges of F(G). Each one of these edges, if forward (away from A), must have reached full capacity, i.e., be saturated, and if backward (into A), must have zero flow. But then, for this cut, f(A, B) = c(A, B). Since for any flow f and any cut (A′, B′) we have |f| = f(A′, B′) ≤ c(A′, B′), we conclude that f is a maximum flow. This completes the proof of (i).

Since |f| = c(A, B) and |f| ≤ c(A′, B′) for any cut (A′, B′), it is clear that c(A, B) is the minimum capacity of a cut of F(G). This proves (ii). ∎

The integral capacity case: We can justify the above algorithm for the case where capacities are integral quite simply. Let M be the capacity of the cut ({s}, V(G) − s). The bottleneck capacity of any augmenting path is integral. Whenever we find an augmenting path we would increase the flow by an integer, and Theorem 3.6.1 assures us that if we are unable to find an augmenting path we have reached max-flow. Thus, in at most M augmentations we reach maximum flow. This justification also proves the following corollary.

Corollary 3.6.1 If the capacity function of a flow graph is integral, then there exists a max-flow in the flow graph which is integral.
Complexity: We consider the integral capacity case. Each augmentation involves a directed bfs. This is O(m) in the present case. Hence, the overall complexity of Algorithm Max-Flow is O(Mm), where m ≡ |E(G)|. It is not obvious that Algorithm Max-Flow would terminate for real capacities. However, it can be shown that it does. Since the augmenting path is constructed through a bfs, it is clear that it has minimum length. Edmonds and Karp [Edmonds+Karp72] have shown that if the shortest augmenting path is chosen every time, there are at most mn augmentations. So the overall complexity of Algorithm Max-Flow is O(m²n).
Exercise 3.68 [Edmonds+Karp72] In Algorithm Max-Flow, if the shortest augmenting path is chosen every time, show that there are at most mn augmentations.

We mention a few other algorithms which are faster. These are based on Dinic's Algorithm [Dinic70]. This algorithm proceeds in phases, in each of which flow is pushed along a maximal set of shortest paths. Each phase takes O(mn) effort. The total number of phases is bounded by the length L of the longest s-t path in G (clearly L ≤ n). So the overall complexity is O(Lmn). The MPM Algorithm [MPM78] has the same number of phases as Dinic's Algorithm, but each phase is O(n²). So the overall complexity is O(Ln²). The Sleator Algorithm [Sleator80], [Sleator+Tarjan83] computes each phase in O(m log n) time and has an overall complexity O(Lm log n). (Usually the above complexities are stated with n in place of L.) For a comprehensive treatment of flow algorithms the reader is referred to [Ahuja+Magnanti+Orlin93].
The Nearest Source Side and Sink Side Min-Cuts

When combinatorial problems are modelled as max-flow problems, usually the cuts with minimum capacity have physical significance. Of particular interest would be minimum capacity cuts (A, B) where A or B is minimal. Below we show that these cuts are unique. Further, we show that computing them, after a max-flow has been found, is easy. We begin with a simple lemma.
Lemma 3.6.1 (k) Let (A_1, B_1), (A_2, B_2) be two minimum capacity cuts. Then (A_1 ∪ A_2, B_1 ∩ B_2) and (A_1 ∩ A_2, B_1 ∪ B_2) are also minimum capacity cuts.

Proof: Let f(A) ≡ the sum of the capacities of the edges with one end in A and directed away from A, A ⊆ V(G). Later, in Chapter 9 (see Exercise 9.1, Examples 9.2.5, 9.2.6) we show that f(·) is submodular, i.e.,

f(X) + f(Y) ≥ f(X ∪ Y) + f(X ∩ Y) ∀ X, Y ⊆ V(G).

Now if X, Y minimize f(·), the only way the above inequality can be satisfied is for f(·) to take the minimum value on X ∪ Y, X ∩ Y also. This proves the lemma. ∎
The following corollary is now immediate.
Corollary 3.6.2 Let F(G) ≡ (G, c, s, t). Then F(G) has a unique min-cut (A, B) in which A is minimal (B is minimal).

We will call the min-cut (A, B) the nearest source side (sink side) min-cut iff A is minimal (B is minimal). To find the nearest source side (sink side) min-cut we proceed as follows.

Algorithm Source (Sink) Side Min-Cut: First maximize flow and let f be the max-flow output by the algorithm. Draw the residual graph G_f. Do a directed bfs in G_f starting from s and proceeding forward. Let A_s be the set of all vertices reachable from s. Then (A_s, V(G) − A_s) is the desired nearest source side min-cut. Let G_f^R denote the directed graph obtained from G_f by reversing all arrows. The nearest sink side min-cut is obtained by doing a directed bfs starting from t in G_f^R. Let B_t be the set of all vertices reachable in G_f^R from t. Then (V(G) − B_t, B_t) is the desired nearest sink side min-cut.

In order to justify the above algorithms we first observe that when we maximize flow, for each min-cut (A, B) we would have f(A, B) = c(A, B). Thus, if (A, B) is a min-cut, all the forward edges from A to B would be saturated and all the backward edges from B to A would have zero flow. Therefore, in the residual graph G_f all edges across the cut would be directed into A. Now s ∈ A and doing a bfs starting from s we cannot go outside A. Hence, if (A, B) is a min-cut, A_s ⊆ A, where A_s is the set of all vertices reachable from s in G_f. But (A_s, V(G) − A_s) is a min-cut. Hence, (A_s, V(G) − A_s) is the nearest source side min-cut. The justification for the sink side min-cut algorithm is similar. (Note that the above justification provides an alternative proof that min-cuts (A, B), where A or B is minimal, are unique.) The complexity of the above algorithms is O(m). So if they are added to the max-flow algorithms the overall complexity would not increase.
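The two bfs computations can be sketched as follows (the code is ours). The small residual graph below is hand-built from a max-flow on the path s → a → t with capacities c(s, a) = 2 and c(a, t) = 1, so that (a, t) is saturated.

```python
from collections import deque

def reachable(r, start, forward=True):
    """Vertices reachable from start along positive-residual arcs of the
    residual graph r (dict (u, v) -> residual capacity); if forward is
    False, all arrows are reversed first (the graph G_f^R)."""
    adj = {}
    for (u, v), c in r.items():
        if c > 0:
            if not forward:
                u, v = v, u
            adj.setdefault(u, []).append(v)
    seen = {start}
    q = deque([start])
    while q:
        u = q.popleft()
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                q.append(v)
    return seen

# residual graph of the max-flow (value 1) on s -> a -> t
r = {('s', 'a'): 1, ('a', 's'): 1, ('a', 't'): 0, ('t', 'a'): 1}
A_s = reachable(r, 's')                  # source side of the nearest source side min-cut
B_t = reachable(r, 't', forward=False)   # sink side of the nearest sink side min-cut
assert A_s == {'s', 'a'} and B_t == {'t'}
```

In this example the two nearest min-cuts coincide, as the only min-cut is ({s, a}, {t}).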
3.6.10 Flow Graphs Associated with Bipartite Graphs
Many optimization problems considered in this book are based on bipartite graphs. Usually they reduce to max-flow problems on a flow graph derived from the bipartite graph in a simple manner. We give below a brief account of the situation and standardize notation.

Let B ≡ (V_L, V_R, E) be a bipartite graph. The flow graph F(B, c_L, c_R) associated with B, with capacity c_L(·) ⊕ c_R(·), is defined as follows: c_L(·), c_R(·) are nonnegative real functions on V_L, V_R respectively. (They may therefore be treated as weight vectors.) Each edge e ∈ E is directed from V_L to V_R and given a capacity ∞. Additional vertices (source) s and (sink) t are introduced. Directed edges (s, v_L), (v_R, t) are added for each v_L ∈ V_L and each v_R ∈ V_R. The capacity of the edge (s, v_L) is c_L(v_L), v_L ∈ V_L, and the capacity of the edge (v_R, t) is c_R(v_R), v_R ∈ V_R. Figure 3.13 illustrates the construction of this flow graph.
Figure 3.13: The Flow Graph associated with B
Figure 3.14: The Cut corresponding to X in the Flow Graph associated with B
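The construction can be sketched as follows (the code is ours); vertices of V_L and V_R are tagged 'L' and 'R' to keep the two sides disjoint, and the resulting capacity dictionary can be fed to any max-flow routine that accepts one.

```python
INF = float('inf')

def bipartite_flow_graph(edges, c_left, c_right):
    """Build F(B, c_L, c_R): each edge of B is directed left to right
    with capacity infinity; s feeds V_L with capacities c_L, and V_R
    feeds t with capacities c_R.  Returns a dict (u, v) -> capacity."""
    cap = {}
    for vl, vr in edges:
        cap[(('L', vl), ('R', vr))] = INF
    for vl, c in c_left.items():
        cap[('s', ('L', vl))] = c
    for vr, c in c_right.items():
        cap[(('R', vr), 't')] = c
    return cap
```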
Let Γ(X) denote the set of vertices adjacent to the vertex subset X ⊆ V_L ⊎ V_R in the bipartite graph B. Let c_L(Z) (c_R(Z)) denote the sum of the values of c_L(·) (c_R(·)) on the elements of Z. The cut corresponding to X ⊆ V_L is the cut (s ⊎ X ⊎ Γ(X), t ⊎ (V_L − X) ⊎ (V_R − Γ(X))) (see Figure 3.14). We now have the following simple theorem which brings out the utility of the flow formulation.

Theorem 3.6.2 (k) Let B, c_L(·), c_R(·), F(B, c_L, c_R) be as defined above.

i. The capacity of the cut corresponding to X is c_L(V_L − X) + c_R(Γ(X)).

ii. Z ⊆ V_L minimizes the expression c_L(V_L − X) + c_R(Γ(X)), X ⊆ V_L, iff the cut corresponding to Z is a min-cut of F(B, c_L, c_R). Further, the capacity of the cut corresponding to Z equals c_L(V_L − Z) + c_R(Γ(Z)).

iii. There is a unique maximal subset Z_max and a unique minimal subset Z_min which minimize the above expression. Let c_R(·) be strictly positive. Then the cuts corresponding to Z_max, Z_min are respectively the nearest sink side and the nearest source side min-cuts of F(B, c_L, c_R).
Proof: i. This is immediate (see Figure 3.14).
ii. We will first show that there exist min-cuts which are cuts corresponding to some X_1 ⊆ V_L. Let (s ⊎ X_1 ⊎ Y, t ⊎ (V_L − X_1) ⊎ (V_R − Y)) be a min-cut of F(B, c_L, c_R), where X_1 ⊆ V_L, Y ⊆ V_R. Since this is a min-cut, no infinite capacity edge must pass from s ⊎ X_1 ⊎ Y to its complement. This means that any edge leaving X_1 must terminate in Y, i.e., Γ(X_1) ⊆ Y. The capacity of the cut is c_L(V_L − X_1) + c_R(Y). Now consider the cut (s ⊎ X_1 ⊎ Γ(X_1), t ⊎ (V_L − X_1) ⊎ (V_R − Γ(X_1))). The capacity of this cut is c_L(V_L − X_1) + c_R(Γ(X_1)) ≤ c_L(V_L − X_1) + c_R(Y) (c_L, c_R are nonnegative vectors). Thus the cut corresponding to X_1 is a min-cut.

Let Z minimize the expression c_L(V_L − X) + c_R(Γ(X)), X ⊆ V_L, and let the cut corresponding to Z′ ⊆ V_L be a min-cut of F(B, c_L, c_R). The capacity of this cut is c_L(V_L − Z′) + c_R(Γ(Z′)). So c_L(V_L − Z) + c_R(Γ(Z)) ≤ c_L(V_L − Z′) + c_R(Γ(Z′)). However the LHS is the capacity of the cut corresponding to Z. Since the cut corresponding to Z′ is a min-cut we must have c_L(V_L − Z) + c_R(Γ(Z)) ≥ c_L(V_L − Z′) + c_R(Γ(Z′)). We conclude that the two capacities are equal. Therefore Z′ minimizes c_L(V_L − X) + c_R(Γ(X)), X ⊆ V_L, and the cut corresponding to Z is a min-cut.
iii. The nearest source side min-cut can be seen to correspond to some subset X_1 of V_L even if c_R(·) is nonnegative but not necessarily strictly positive. Now let c_R(·) be strictly positive. The nearest sink side min-cut is obtained by travelling backward from t through all unsaturated arcs to reach T_R ⊆ V_R and then backwards to Γ(T_R) ⊆ V_L. The cut that we obtain by this process is (s ⊎ X_2 ⊎ (V_R − T_R), t ⊎ Γ(T_R) ⊎ T_R), where X_2 ≡ V_L − Γ(T_R). It is clear that Γ(X_2) ⊆ V_R − T_R. Suppose Γ(X_2) ⊂ V_R − T_R. Then the capacity of the cut corresponding to X_2 = c_L(V_L − X_2) + c_R(Γ(X_2)) < c_L(V_L − X_2) + c_R(V_R − T_R), since c_R is strictly positive. The RHS of the above inequality is the capacity of the min-cut (s ⊎ X_2 ⊎ (V_R − T_R), t ⊎ Γ(T_R) ⊎ T_R), a contradiction. We conclude that Γ(X_2) = V_R − T_R, so that the nearest sink side min-cut corresponds to X_2.
We know that X_1, X_2 minimize the expression c_L(V_L − X) + c_R(Γ(X)), X ⊆ V_L. Let A ⊂ s ⊎ X_1 ⊎ Γ(X_1) and let B be the complement of A with respect to V_L ⊎ V_R ⊎ {s, t}. Then (A, B) cannot be a min-cut (using the justification for the Algorithm Source Side Min-Cut). Also s ⊎ X_1 ⊎ Γ(X_1) is the unique set with the above property. It follows that X_1 is the unique minimal set s.t. (s ⊎ X_1 ⊎ Γ(X_1), t ⊎ (V_L − X_1) ⊎ (V_R − Γ(X_1))) is a min-cut. Hence, X_1 is the minimal set that minimizes c_L(V_L − X) + c_R(Γ(X)). The proof that X_2 is the maximal set that minimizes the above expression is similar. ∎
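On small examples the theorem's expression can be minimized by brute force; the following sketch (ours, with a hand-picked example) also exhibits the closure of the minimizing sets under union and intersection claimed in part iii.

```python
from itertools import combinations

def minimizers(VL, adj, cL, cR):
    """All X subsets of V_L minimizing c_L(V_L - X) + c_R(Gamma(X)).
    adj: dict v_L -> set of right-side neighbours; cL, cR: capacities."""
    def cost(X):
        gamma = set().union(*(adj[v] for v in X)) if X else set()
        return sum(cL[v] for v in VL - X) + sum(cR[w] for w in gamma)
    subsets = [frozenset(c) for r in range(len(VL) + 1)
               for c in combinations(sorted(VL), r)]
    best = min(cost(X) for X in subsets)
    return [X for X in subsets if cost(X) == best]

# hand-picked example with two minimizing sets and c_R strictly positive
VL = frozenset({'a', 'b'})
adj = {'a': {'x'}, 'b': {'x', 'y'}}
cL = {'a': 3, 'b': 1}
cR = {'x': 2, 'y': 1}
mins = minimizers(VL, adj, cL, cR)
# the minimizing sets are closed under union and intersection, hence
# there is a unique minimal (Z_min) and a unique maximal (Z_max) minimizer
assert all(a | b in mins and a & b in mins for a in mins for b in mins)
```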
Remark: The expression that was minimized in the above proof is a submodular function. We shall see later, in Chapter 9, that such functions always have a unique minimal and a unique maximal set minimizing them.

Complexity of Max-Flow Algorithms for Bipartite Graph Case

Finally we make some observations on the complexity of the max-flow algorithms when the flow graph is associated with a bipartite graph. We note that, in this case, the length of the longest undirected path from s to t is O(min(|V_L|, |V_R|)), since every path from s to t has to alternate between vertices of V_L, V_R. So the number of phases for Dinic's (and related) algorithms would be O(min(|V_L|, |V_R|)). Therefore the overall complexities of the algorithms for this case would be

Dinic's: O(mn · min(|V_L|, |V_R|))
MPM: O(n² · min(|V_L|, |V_R|))
Sleator: O(m log n · min(|V_L|, |V_R|)).

Here, n, m refer to the total number of vertices and edges respectively in the flow graph. So n = |V_L| + |V_R| + 2 and m = |E| + |V_L| + |V_R|.
Exercise 3.69 [Menger27] In any graph, show that the number of arc disjoint paths between any pair of vertices s and t is the number of branches in a min-cut separating s and t.
3.7 Duality
Duality is a useful concept often met with in mathematics, e.g. duality of vector spaces and spaces of functionals, duality of partial orders, duality of functions and Fourier transforms etc. When we encounter it we need to know why it arises
and how to use it. The duality that one normally deals with in electrical network theory arises because the voltage and current spaces of graphs are complementary orthogonal. (For other examples of duality that one encounters within electrical network theory see [Iri+Recski80].) In this section we discuss informally how to dualize statements about graphs, vector spaces (and therefore, implicitly, electrical networks) and also as to when we may expect the dual of a true statement to be true.

Let V be a vector space on S. We associate with V

i. a set of operations each of which converts V to a vector space on a subset of S; a typical operation is (S − T_1, T_1 − T_2)(·), T_2 ⊆ T_1 ⊆ S, where

(S − T_1, T_1 − T_2)(V) = V · T_1 × T_2;
ii. classes of objects:
- class of forests
- class of coforests
- class of circuits
- class of cutsets
- primal vectors (vectors in V)
- dual vectors (vectors in V⊥).
Remark: For convenience we generalize the usual definitions of forest, coforest, circuit, cutset etc. to vector spaces. The reader may verify that, if V were replaced by Vv(G), these definitions do reduce to the usual definitions in terms of graphs. A forest of V is a maximally independent subset of columns of a representative matrix of V; a coforest of V is the complement, relative to the set of columns of the representative matrix, of a forest; a circuit of V is a minimal set that is not contained in any forest of V, while a cutset of V is a minimal set that is not contained in any coforest of V. The classes of coforests, circuits and cutsets are used for convenience. Actually any one of the four classes can be treated as primitive and the rest expressed in terms of it.

Now we list the results which 'cause' duality.

i. (V⊥)⊥ = V; equivalently, x is a primal vector for V iff it is a dual vector for V⊥.
ii. (V · T1 × T2)⊥ = V⊥ × T1 · T2 = V⊥ · (S − (T1 − T2)) × T2, i.e., the operation (S − T1, T1 − T2)(·) holds the same place relative to V that the operation (T1 − T2, S − T1)(·) holds relative to V⊥. We say (S − T1, T1 − T2)(·) is dual to (T1 − T2, S − T1)(·).

iii. (later we add one more operation which includes all the above, namely, that of generalized minor)

(VS ↔ VP)⊥ = VS⊥ ↔ VP⊥,  P ⊆ S.
iv. T is a forest (coforest) of V iff T is a coforest (forest) of V⊥.

v. C is a circuit (cutset) of V iff C is a cutset (circuit) of V⊥.
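Result (iv) can be checked computationally. The sketch below (an illustration, not from the text; it works over GF(2) purely for convenience of enumeration) lists the forests of a space V given by a representative matrix and verifies that they are exactly the complements of the forests of V⊥:

```python
from itertools import combinations

def gf2_rank(vectors):
    """Rank of a list of GF(2) vectors given as int bitmasks."""
    basis = {}
    rank = 0
    for v in vectors:
        while v:
            b = v.bit_length() - 1
            if b in basis:
                v ^= basis[b]        # reduce the leading bit
            else:
                basis[b] = v
                rank += 1
                break
    return rank

def columns(rows, n):
    """Columns of a matrix whose rows are n-bit masks (bit j = entry in column j)."""
    return [sum(((r >> j) & 1) << i for i, r in enumerate(rows)) for j in range(n)]

def forests(rows, n):
    """Forests of the row space: maximal independent column subsets."""
    cols = columns(rows, n)
    r = gf2_rank(cols)
    return {T for T in combinations(range(n), r)
            if gf2_rank([cols[j] for j in T]) == r}

def orthogonal_rows(rows, n):
    """A basis (as row bitmasks) of the space orthogonal to the given rows."""
    basis = []
    for x in range(1, 1 << n):
        if all(bin(x & r).count('1') % 2 == 0 for r in rows):
            if gf2_rank(basis + [x]) > len(basis):
                basis.append(x)
    return basis

# V given by a representative matrix on S = {0, 1, 2}: rows (1 0 1), (0 1 1)
R = [0b101, 0b110]
Rperp = orthogonal_rows(R, 3)        # here the single row (1 1 1)
fV = forests(R, 3)
fVperp = forests(Rperp, 3)
# result (iv): T is a forest of V iff S - T is a forest of V-perp
complements = {tuple(sorted(set(range(3)) - set(T))) for T in fVperp}
assert fV == complements
```

The same enumeration can be repeated for circuits and cutsets to check result (v).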
Let us consider how to 'dualize' a statement about a vector space and the associated set of operations and classes of objects. Our procedure requires that the statement to be dualized be in terms of the primitive objects and operations, associated with a vector space, that we described above. Consider the statement

i. 'A subset is a circuit of V × T iff it is a minimal intersection of a circuit of V with T'.

The first step is to write the statement in terms of V⊥: 'A subset is a circuit of V⊥ × T iff it is a minimal intersection of a circuit of V⊥ with T'. Next we try to express the sets of objects involved in terms of the appropriate complementary orthogonal space. Thus 'circuit of V⊥ × T' becomes 'cutset of (V⊥ × T)⊥' and 'circuit of V⊥' becomes 'cutset of (V⊥)⊥'. We thus obtain the dual of (i):

i_d. 'A subset is a cutset of V · T iff it is a minimal intersection of a cutset of V with T'.

The above procedure will yield a true (false) dual statement if we start with a true (false) statement. However, as we mentioned before, the statement that we start with must involve only the 'primitives', viz. the sets of operations and the classes of objects.

Next let us consider the case of (directed) graphs. We associate with a graph a vector space, namely, its voltage space. Given a statement about graphs we first see whether it can be written entirely in terms of its voltage space. If so, then we dualize it and interpret the dual statement in terms of graphs. For instance consider the statement

ii. 'A subset is a circuit of G × T iff it is a minimal intersection of a circuit of G with T'.

This statement can be written entirely in terms of Vv(G). If we substitute V in place of Vv(G) in this latter statement, we get the statement (i) above. Its dual is (i_d). Now we resubstitute Vv(G) in place of V. This gives us 'A subset is a cutset of Vv(G) · T iff it is a minimal intersection of a cutset of Vv(G) with T'. Interpreting this statement in terms of G gives us

ii_d. 'A subset is a cutset of G · T iff it is a minimal intersection of a cutset of G with T'.

The above procedure could fail in the beginning when we try to write the statement about G as a statement about Vv(G), or when we replace Vv(G) by a general V. It could fail in the end when we replace V by Vv(G) in the dual statement. Here are a couple of examples of statements which cannot be dualized by our procedure.
i. ‘Let G be a connected graph and let f be a forest of G. Then there exists a unique path between any given pair of vertices using the edges of f alone ’.
The procedure fails because ‘path’ and ‘vertices’ cannot be extended to vector spaces.
ii. 'There exists a graph G that has the given sets of edges C1, ..., Cn as circuits'. We can extend this to Vv(G), thence to V, and dualize the statement involving V. This dual statement would be: 'There exists a vector space V that has the given sets of edges C1, ..., Cn as cutsets'. The procedure can fail if we replace V by Vv(G) since the latter statement may be false.

Exercise 3.70 What are the duals of the following?
i. rank function of a graph
ii. r(·), where r(T) = dim(V · T)
iii. ξ(·), where ξ(T) = dim(V · T) − dim(V × T)
iv. closed sets of a graph (a subset of edges is closed if its rank is less than that of any proper superset)
v. self loops
vi. coloops
vii. separators of V
viii. separators (2-connected components) of a graph.

Exercise 3.71 Dualize the following statements. Assuming the truth of the original statement, comment on the truth of the dual.
i. A coforest is a minimal set that intersects every circuit.

ii. A circuit is a minimal set that intersects every coforest.

iii. Every ring sum of circuits of G is a disjoint union of circuits. (C1 ⊕ ... ⊕ Cn is the set of all elements which occur in an odd number of the Ci.)

iv. Let C1, C2 be circuits of G and let e2 ∈ C1 ∩ C2 and e1 ∈ C1 − C2. Then there exists a circuit C3 of G s.t. e1 ∈ C3 ⊆ C1 ∪ C2 − e2.

v. Let G be a graph and let E(G) be partitioned into E1, ..., En. Let f be a forest of G which has as many edges as possible of E1, then as many as possible of E2, ..., up to En. Then f ∩ Ej is a forest of G · (E1 ∪ ... ∪ Ej) × Ej, j = 1, ..., n.

vi. Let G be a graph. Let E ≡ E(G) be partitioned into sets A, B. Then L ⊆ B is a minimal set such that G · (E − L) has A as a separator iff
(a) r(G × (A ∪ L)) = r(G · A), and
(b) L has no self loops in G × (A ∪ L).
vii. Let V be a vector space on S and let S be partitioned into A, B. Let K ⊆ A be s.t. V × (S − K) has B as a separator. If x on S is s.t. x/A ∈ V · A and x/(B ∪ K) ∈ V · (B ∪ K), then x ∈ V.
Remark:
i. We have described a 'sufficient' procedure for dualization. If the procedure fails we cannot be sure that the 'dual' statement is necessarily false. The procedure is, however, applicable wherever duality is found; we merely have to use the appropriate dual objects and operations.
ii. If we restrict ourselves to the class of planar graphs we have the interesting result that there exists a graph G* s.t. Vi(G) = Vv(G*). In this case a wider range of statements can be dualized. In particular there is the notion of 'mesh' or 'window' that is dual to that of a vertex. In this book we do not exploit 'planar duality'.
3.8 Notes
Graph theory means different things to different authors. The kind of graph theory that electrical networks need was developed systematically (for quite different reasons) by Whitney [Whitney32], [Whitney33a], [Whitney33b], [Whitney33c]. In this chapter the emphasis has been on the topics of direct use to us later on in the book. We have followed, in the main, [Seshu+Reed61] and [Tutte65] for the sections dealing with graphs, their representations and operations on graphs and vector spaces. For the section on graph algorithms we have used [Aho+Hopcroft+Ullman74] and [Kozen92]. For a recent survey of graph algorithms, the reader is referred to [VanLeeuwen90].
3.9 Solutions of Exercises
E 3.1: There are (n − 1) possible values for the degree of a node and n vertices (if the graph is connected and n > 1).

E 3.2:
i. If we add all the degrees (counting self loops twice), each edge is counted twice.
ii. The sum of all degrees is even. So is the sum of all even degrees. So the sum of odd degrees is even and therefore the number of odd degree vertices is even.
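The two counting arguments in E 3.2 are easy to check mechanically; a small illustrative sketch (the edge list is hypothetical):

```python
from collections import Counter

# an undirected multigraph as a list of edges; (3, 3) is a self loop
edges = [(1, 2), (2, 3), (3, 1), (3, 3), (1, 4)]

deg = Counter()
for u, v in edges:
    deg[u] += 1
    deg[v] += 1          # a self loop (u == v) thus counts twice, as in E 3.2

assert sum(deg.values()) == 2 * len(edges)                  # E 3.2 (i)
assert sum(1 for d in deg.values() if d % 2) % 2 == 0       # E 3.2 (ii)
```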
E 3.3: (Sketch) Start from any vertex v and go to a farthest vertex. If this vertex is deleted there would still be paths from v to the remaining vertices.

E 3.4: i. circuit graphs disconnected from each other; ii. add an edge between a non-terminal vertex and another vertex of the path; iii. a single edge with two end points; iv. a graph with only self loop edges.
E 3.5: Consider the graph obtained after deleting the edge e. If v1, v2 are two vertices of this graph, there must have been a path P1 between them in the original graph. If this path had no common vertex with the circuit subgraph it would be present in the new graph also. So let us assume that it has some common vertices with the circuit subgraph. If we go along the path from v1 to v2 we will encounter a vertex of the circuit graph for the first time (say the vertex a) and a vertex of the circuit graph for the last time (say b). In the circuit subgraph there is a path P2 between a and b which does not have e as an edge. If we replace the segment in P1 between a and b by P2, we get a path P3 in the new graph between v1 and v2.
E 3.6: Let P1, P2 be the two paths between nodes a, b. We start from node a and go along P1, P2 towards b until we reach a vertex, say c, after which the two paths have different edges. (Note that the vertex c could be a itself.) From c we follow P1 towards b until we reach a vertex, say d, which belongs also to P2. Such a vertex must exist since b belongs both to P1 and to P2. From d we travel back towards c along P2. The segments c to d along P1 and d to c along P2 would constitute a circuit subgraph since the subgraph would be connected and every vertex would have degree 2.

E 3.7: If the graph has a self loop then that is the desired circuit. Otherwise we start from any vertex a and travel outward without repeating edges. Since every vertex has degree ≥ 2, if we enter a vertex for the first time we can also leave it by a new edge. Since the graph is finite we must meet some vertex again. We stop as soon as this happens for the first time. Let c be this vertex. The segment (composed of edges and vertices) starting from c and ending back at c is a circuit subgraph.

E 3.8: (a) A graph made up of self loops only. (b) A single edge with two end points.

E 3.10: A cutset is a set of crossing edges. Hence, it contains a minimal set of edges which, when deleted, increases the number of components of the graph. Consider any edge set C with the given property. It should be possible to partition the vertices of the graph into two subsets so that all edges between the two subsets are in C, since deletion of C increases the number of components of the graph. Thus, we have two collections of subsets each member of which contains a member of the other. Hence, minimal members of both collections must be identical.

E 3.11: All edges are parallel. There may be isolated vertices.

E 3.12:
i. Deletion of T must increase the number of components. Minimality implies that only one of the components should be split.
ii. If the subgraphs on V1, V2 are not connected, deletion of T would increase the number of components by more than one. On the other hand, if the subgraphs on V1, V2 are connected, the set of edges between them must constitute a cutset, because their deletion increases the number of components and deletion of a proper subset will leave a connected subgraph with vertex set V1 ∪ V2.
E 3.13:
i. There must be at least one path because of connectedness. More than one path would imply the presence of a circuit by Theorem 3.2.1.
ii. The tree graph cannot have only nodes of degree greater than or equal to two, as otherwise by Theorem 3.2.2 it would contain a circuit. Hence, it has a node a of degree less than two. Now if it has more than one node, because of connectedness, a has degree one. If we start from a and proceed away from it we must ultimately reach a node b of degree one, since the graph is finite and repetition of a node would imply two distinct paths between some two nodes.
E 3.14: Proof of Theorem 3.2.4: The trivial single node graph with no edges is a tree graph. The graph on two nodes with an edge between them is also a tree graph. It is clear that in these cases the statement of the theorem is true. Suppose it is true for all tree graphs on (n − 1) nodes. Let t be a tree graph on n nodes. This graph, by Theorem 3.2.2, has a vertex v of degree less than two. If n > 1, since t is connected, this vertex has degree 1. If we delete this vertex and the single edge incident on it, it is clear that the remaining graph t' has no circuits. It must also be connected. For, if nodes u1, v1 have no path in t', the path between them in t uses v as a nonterminal node, which therefore has degree ≥ 2 in t, a contradiction. Thus t' is a tree graph on (n − 1) nodes. By induction it has (n − 2) edges. So t has (n − 1) edges. On the other hand, let G be a connected graph on n nodes with (n − 1) edges. If it contains a circuit, by Lemma 3.2.1 we can delete an edge of the circuit without destroying connectedness of the graph. Repeating this procedure would ultimately give us a graph on n nodes that is connected but has no circuits. But this would be a tree graph with (n − 1) edges. We conclude that G must itself be a tree graph.

Proof of Corollary 3.2.1: The number of edges = Σi (ni − 1), where ni is the number of nodes of the i-th component.
E 3.15: We will only show that maximality implies the subset is a forest (coforest). Suppose the set is maximal with respect to not containing a circuit. Then it must intersect each component of the graph in a tree. For, if not, at least one more edge can be added without the formation of a circuit. This proves the set is a forest. Next suppose a set L is maximal with respect to not containing a cutset. Removal of such a set from the graph would leave at least a forest of the graph. However, it cannot leave more edges than a forest, for in that case the remaining graph contains a circuit. Let e be in this circuit. Deletion of L ∪ e cannot disconnect the graph and so L ∪ e contains no cutset; this contradicts the maximality. So removal of L leaves precisely a forest.

E 3.16: Deletion of the edges in a cutset increases the number of components of the graph. Hence, every forest must intersect the cutset (otherwise the corresponding forest subgraph would remain when the cutset is deleted and would ensure that the number of components remains the same). Removal of edges of a coforest must destroy every circuit, as otherwise the corresponding forest would contain a circuit. So a coforest intersects every circuit of the graph.
E 3.17: Proof of Lemma 3.2.2: Let a, b be the end points of the edge e being deleted. Let Va, Vb be the sets of all vertices which can be reached from a, b respectively, by paths in the tree graph which do not use e. Suppose node u is not in Va or Vb. But the connected component containing u cannot meet Va or Vb (otherwise u could be reached from a or b by a path) and hence, even if e is put back, u cannot be connected to Va ∪ Vb by a path. But this would make the tree graph disconnected. We conclude that Va ∪ Vb is the vertex set of the tree graph. The subgraphs on Va, Vb are connected and contain no circuits and are therefore tree graphs.
E 3.18: Let a, b be the end points of the edge e being contracted. It is clear that the graph after contraction of the edge e is connected. If it contains a circuit graph, this latter must contain the fused node {a, b}. But if so, there exists a path in the original tree graph between a, b which does not use e. This is a contradiction.
E 3.19: Proof of Theorem 3.2.6: Let fG denote the subgraph of G on f. By the definition of a forest subgraph, between the end points of e, say n1, n2, there must be a path, say P, in fG. Addition of e to fG creates precisely two paths between n1, n2, namely, P and the subgraph on e. The path P has n1, n2 of degree 1 and remaining vertices of degree two. Hence addition of e to P will create a connected subgraph in which every vertex has degree two. Now this must be the only circuit subgraph created when e is added to f. For if there are two such subgraphs, e must be a part of both of them since f contains no circuit. Hence they, and therefore fG, must have distinct paths between n1, n2 which do not use e. But then by Theorem 3.2.1 there must be a circuit subgraph in fG, a contradiction.

E 3.20: Proof of Theorem 3.2.7: We will prove the result for a connected graph first. Deletion of an edge of a tree graph must increase its connected components by one, by Lemma 3.2.2. Deletion of e ∪ f̄ from the given graph G is equivalent to first deleting f̄ and then, in the resulting tree subgraph fG on f, deleting e. Therefore, the number of connected components must increase precisely by one when e ∪ f̄ is deleted. Let a, b be the endpoints of e and let Va, Vb be the vertex sets of the tree subgraphs (which do not however correspond to trees of G) that result when the edge e is deleted from fG, equivalently, when e ∪ f̄ is deleted from G. Any crossing edge set that e ∪ f̄ contains must have Va, Vb as end vertex sets. There is only one such. We conclude that e ∪ f̄ contains only one crossing edge set. This must be a cutset since the subgraphs on Va, Vb are connected. If the graph were disconnected, when e ∪ f̄ is deleted, only one component, say Ge, which contains e would be split. Since any cutset contained in e ∪ f̄ is contained in Ge, we could argue with Ge in place of G and the subset of f̄ in Ge in place of f̄. So the theorem would be true in this case also.
E 3.21: Let C be a circuit. Let e ∈ C. Then C − e does not contain a circuit and can be grown to a forest of G. C is an f-circuit of this forest.

E 3.22: Let B be a cutset with e ∈ B. By minimality, deletion of B − e will not increase the number of components of the graph, i.e., there is a forest remaining when B − e is deleted. So B − e can be included in a coforest and B is an f-cutset of the corresponding forest.
E 3.23: Let f̂ be a forest subgraph of the given graph containing edge e of cutset C. This is possible since e is not a self loop. Contraction of this edge would convert f̂ to a forest subgraph f of the new graph. The number of edges in the coforest would not have changed.
E 3.25: Given such a matrix, associate a vertex with each row and an edge with each column. The edge has an arrow leaving the vertex (row) where its column has a +1 and entering the vertex (row) where its column has a −1. If the column has only zeros the corresponding edge is a self loop incident on any of the vertices.
E 3.31: The matrix retains the property given in Exercise 3.25 when these operations are performed.

E 3.35:
i. When the vertex v is not a cutvertex (i.e., a vertex which lies in every path between some two vertices a, b of the graph which are distinct from itself). In this case deletion of the edges incident at the vertex would break up the graph into at least three components, viz. v alone, the component containing a and the component containing b.
ii. Consider a graph with two parallel edges.
iii. No. It then has to be orthogonal to itself. Over the real field this would imply that it has null support.

E 3.36: Scan the columns from the left. Pick the first column corresponding to a non-self-loop edge. If k columns (edges) have been picked, pick the next column corresponding to the first edge which does not form a circuit with the previously picked edges. Continue until all columns are exhausted. This gives us a forest of the graph. The f-cutset matrix of this forest, with columns in the same order as before and rows such that an identity matrix appears corresponding to the forest, would constitute the first set of rows of the RRE matrix. The second set of rows would be zero rows, equal in number to the number of components.
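The column scan of E 3.36 is the usual greedy forest extraction; below is a sketch using a union-find structure (an implementation choice not mentioned in the text) to test whether a newly scanned edge forms a circuit with the edges already picked:

```python
class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False   # a, b already connected: the edge would close a circuit
        self.parent[ra] = rb
        return True

def scan_forest(edges):
    """Scan edges left to right, keeping an edge iff it does not form a
    circuit with the edges already kept (self loops are skipped)."""
    uf = UnionFind()
    forest = []
    for idx, (u, v) in enumerate(edges):
        if u != v and uf.union(u, v):
            forest.append(idx)
    return forest

# hypothetical edge list in column order; edge 2 is a self loop,
# edge 4 closes a circuit with edges 0 and 1
edges = [(1, 2), (2, 3), (3, 3), (3, 4), (1, 3)]
assert scan_forest(edges) == [0, 1, 3]
```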
E 3.38: If the graph is connected, all nodes must have the same potential in order that the voltages of all branches are zero. (Otherwise we can collect all nodes of a particular potential inside a surface. At least one branch has only one endpoint within this surface. This branch would be assigned a nonzero voltage by the voltage vector.) If the graph is disconnected, all nodes of the same component must have the same potential by the above argument.
E 3.39: We use the above solution. If the graph is connected we see that λᵀA = 0 iff λ has all entries the same, i.e., iff λ belongs to the one dimensional vector space spanned by (1 1 ... 1). But this means (C(A))⊥ has dimension one. Hence dim(C(A)) = n − 1, i.e., r(A) = n − 1.
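The rank statement of E 3.39 can be confirmed numerically; a sketch (the graph and all helper names are illustrative) that builds an incidence matrix and computes its rank by Gaussian elimination over the rationals:

```python
from fractions import Fraction

def matrix_rank(mat):
    """Row rank by Gaussian elimination over the rationals (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in mat]
    rows, cols = len(m), len(m[0])
    rank = 0
    for col in range(cols):
        piv = next((r for r in range(rank, rows) if m[r][col] != 0), None)
        if piv is None:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        for r in range(rows):
            if r != rank and m[r][col] != 0:
                f = m[r][col] / m[rank][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def incidence(n, edges):
    """Incidence matrix A: one row per vertex; +1 at the tail and -1 at the
    head of each edge (a self loop gives a zero column)."""
    A = [[0] * len(edges) for _ in range(n)]
    for j, (u, v) in enumerate(edges):
        if u != v:
            A[u][j] += 1
            A[v][j] -= 1
    return A

# a connected directed graph on 4 vertices
A = incidence(4, [(0, 1), (1, 2), (2, 0), (2, 3)])
assert matrix_rank(A) == 4 - 1       # r(A) = n - 1 for a connected graph
```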
E 3.41: Let i be a nonzero current vector. Let T be the support of i. The subgraph G · T of G must have each vertex of degree at least two (otherwise the corresponding row of A cannot be orthogonal to i). Hence G · T contains a circuit by Theorem 3.2.2. Thus the support of i contains a circuit. Next, every circuit vector is a current vector (Theorem 3.3.1). It follows that its support cannot properly contain the support of another nonzero current vector, since a circuit cannot properly contain another circuit. Next let i be an elementary current vector. Clearly its support must be a circuit C. Let iC be the corresponding circuit vector. Now by selecting a suitable scalar σ, the current vector i + σiC can be made to have a support properly contained in C. But this implies that the support of i + σiC is void, i.e., i = −σiC, as needed.

Now regarding the cutset vector. Let v be a voltage vector. We know that it must be derived from a potential vector. Let V1 be the set of all nodes having some fixed potential (among the values taken by the potential vector). Then the crossing edge set corresponding to (V1, V(G) − V1) must be a subset of the support of v. Thus, the support of v must contain a cutset. Now every cutset vector is a voltage vector (Theorem 3.3.2). It follows that its support cannot properly contain the support of another nonzero voltage vector, since a cutset cannot properly contain another cutset. Next let v be an elementary voltage vector. Proceeding analogously to the current vector case we can show that v must be a scalar multiple of a cutset vector, as required.
E 3.42: A set of columns T of A is linearly dependent iff there exists a vector i with support T such that Ai = 0. By definition i is a current vector. By Theorem 3.3.6 we know that T must contain a circuit of G. Further, if T contains a circuit of G, the corresponding circuit vector of G is a current vector, from which it follows that the set of columns T of A is linearly dependent. The rows of Bf constitute a basis for Vi(G). By the strong form of Tellegen's Theorem we know that Vv(G) = (Vi(G))⊥. Hence, v is a voltage vector iff Bf v = 0. The rest of the argument parallels that of the linear dependence of columns of A.

E 3.43: An f-cutset matrix Qf of G is a representative matrix of Vv(G), since by Theorem 3.3.2 its rows are linearly dependent on the rows of the incidence matrix and its rank equals the rank of A. Now we know that (Theorem 3.3.7) the columns of A are linearly independent iff the corresponding edges do not contain a circuit. This must also be true of any representative matrix Qf of Vv(G), since A and Qf have the same column dependence structure. Let Q' be any standard representative matrix of Vv(G). Let us assume without loss of generality that

Q' = ( I  Q'12 ),     (3.10)

where the columns of I correspond to an edge set T. The columns corresponding to T are linearly independent and (n − p) in number. Hence, T must be a forest of G. Let QT be the f-cutset matrix with respect to T. Then QT = ( I  Q12 ) for some Q12. But Q' and QT are row equivalent to each other. So we conclude that Q12 = Q'12 and Q' = QT. The f-circuit case proof is similar.
E 3.45: Proof of Theorem 3.3.8: Each KVE has the form cᵀv = 0, where c is a circuit vector. Now every circuit vector is a current vector. So the size of a maximal independent set of circuit vectors cannot exceed r(Vi(G)). However, the rows of Bf constitute an independent set of circuit vectors of this size. The result follows.
E 3.48:
i. is immediate.
ii. (Sketch) If we start from any node of a circuit subgraph of G (that intersects T) and go around it, this would also describe an alternating sequence (without edge repetition) of G × T starting and ending at the same vertex. This subgraph of G × T has each vertex of degree ≥ 2 and so contains a circuit of G × T. On the other hand, given any circuit subgraph of G × T, we can trace a closed alternating sequence around it which can be expanded to a closed alternating sequence of G corresponding to a circuit subgraph. So every circuit of G × T is the intersection of some circuit of G with T.
E 3.49: (Sketch) Assume without loss of generality that G is connected. Any cutset of G that intersects T would, when removed, increase the number of components of G · T. Hence, it contains a cutset BT of G · T. Any cutset of G · T corresponds to vertex sets V1, V2 between which it lies (the subgraphs of G · T on V1, V2 are connected). Now let V1 be grown to as large a vertex subset V1' of (V(G) − V2) as possible using paths that do not intersect BT. The cutset of G defined by (V1', V(G) − V1') intersects T in BT. Next consider any cutset CT of G × T. This corresponds to a partition V1T, V2T of V(G × T). Now V1T, V2T are composed of supernodes of G which are the vertex sets of components of (GopenT). The union of these supernodes yields a partition V1, V2 of V(G). Clearly CT is the set of edges between V1, V2. The subgraphs of G × T on V1T, V2T are connected. So the subgraphs of G on V1, V2 are also connected. So CT is a cutset of G. Any cutset of G made up only of edges in T can similarly be shown to be a cutset of G × T.
E 3.51: Ai = J has a solution iff λᵀA = 0 implies λᵀJ = 0. If the graph is connected, λᵀA = 0 implies all components of λ are identical.
E 3.52:
i. A vector satisfies the KC equations of G · T iff, when padded with 0s corresponding to edges in E(G) − T, it satisfies the KC equations of G. Hence, Vi(G · T) = (Vi(G)) × T.
ii. Let iT ∈ Vi(G × T). In the graph G this vector satisfies generalized KCE at supernodes which are vertex sets of components of GopenT. The previous exercise implies that we can extend this vector to edges within each of these components. Thus there is a vector i ∈ Vi(G) s.t. i/T = iT. Thus, Vi(G × T) ⊆ (Vi(G)) · T. Any vector that satisfies KCE of G would satisfy generalized KCE at supernodes. Hence, if i ∈ Vi(G) then i/T ∈ Vi(G × T). Hence, (Vi(G)) · T ⊆ Vi(G × T).
E 3.53:
i. From Theorem 3.4.6, R33 is a representative matrix of V × T3 and

      T1    T2
   ( R11   R12 )
   ( R21   0   )        (3.11)

is a representative matrix of V · (T1 ∪ T2). Now R21, R12 are given to have linearly independent rows. So R21, R12 are representative matrices of V · (T1 ∪ T2) × T1 and V · (T1 ∪ T2) · T2 (= V · T2) respectively. Next

      T1    T3
   ( R21   R23 )
   ( 0     R33 )        (3.12)

must be a representative matrix of V × (T1 ∪ T3). So R21 is a representative matrix of V × (T1 ∪ T3) · T1.
ii. If R11 is a zero matrix, then V · (T1 ∪ T2) would have T1, T2 as separators.

E 3.54: R33 is a representative matrix of V × T2, while

   ( R11 )
   ( R21 )

is a representative matrix of V · T1. The result follows.

E 3.55: ξ(T) = r(V⊥ · T) − r(V⊥ × T) = (|T| − r(V × T)) − (|T| − r(V · T)) = r(V · T) − r(V × T) (by Theorem 3.4.3).
E 3.56: We shall show that the union of a forest f1 of Gshort(E − T) and a forest f2 of GopenT yields a forest of G. GopenT has a number of connected components. The forest f2 intersects each of these components in a tree. The vertex sets (supernodes) Vi of these components Gi figure as nodes of Gshort(E − T). If f1 ∪ f2 contains a circuit of G, it cannot be contained entirely in GopenT. The corresponding circuit subgraph can be traced as a closed path starting from some vertex in Vi, going through other sets Vj and returning to Vi. When the Vi are fused to single nodes this subgraph would still contain two distinct paths between any pair of its vertices (which are supernodes in the old graph G). Thus, f1 would contain a circuit of Gshort(E − T), which is a contradiction. Hence, f1 ∪ f2 contains no circuit of G. On the other hand, we can travel from any vertex v1 in G to any other vertex vf in the same component using only edges of f1 ∪ f2. This is because a connected component of G would reduce to a connected component of Gshort(E − T). So v1, vf would be present in supernodes, say V1, Vf, which are nodes of Gshort(E − T) and which have a path between them using only the edges of f1. This path P can be exploded into a path P12 using only edges of f1 ∪ f2 in G as follows. The path P can be thought of as a sequence

v1, v11, e1, v2, v22, e2, ..., ef, vf', vf

where v1, v11 belong to the same component and in general vj, vjj belong to the same component of GopenT. So would (for notational convenience) vf', vf. Now we can travel from v1 to v11, v2 to v22, vj to vjj etc. using edges of f2. Addition of these intermediate edges and vertices yields the path P12. Thus, f1 ∪ f2 contains no circuits and contains a tree of each component of G.
E 3.57: Immediate from the above.
E 3.58: Consider the incidence matrix A of G. A set of columns of A is linearly independent iff the corresponding edges do not contain a circuit. Thus, T is a separator of G iff there is no minimal dependent set of columns of A intersecting both T and (E − T). Let

      T      E − T
   ( RT1    RT2 )
   ( 0      R22 )        (3.13)

be a representative matrix of Vv(G). This matrix and the incidence matrix are row equivalent and therefore have the same column dependence structure. If the rows of RT2 are linearly dependent on the rows of R22, we can perform reversible row operations using the rows of the latter so that the rows of RT2 are made zero. If RT2 is the zero matrix, it is clear that no minimal dependent set of columns can intersect both T and E − T, where E ≡ E(G). If the rows of RT2 are not linearly dependent on those of R22, then r((Vv(G)) · T) > r((Vv(G)) × T). Now, let f1, f2 be forests of G · T, G · (E − T), respectively. The union of these two forests contains more edges than the rank of G and therefore contains a circuit. But f1, f2 do not individually contain circuits. We conclude that there must exist a circuit that intersects both f1 and f2. Thus, we see that T is a separator of G iff the rows of RT2 are linearly dependent on the rows of R22, i.e., iff T is a separator of Vv(G), i.e., iff r((Vv(G)) · T) = r((Vv(G)) × T). The last statement is equivalent to saying r(G · T) = r(G × T).
E 3.59: The graph G has α1α2 forests as well as coforests, β1 + β2 circuits, γ1 + γ2 cutsets. This is because every forest of G, when T is a separator, is a union of a forest of G · T and a forest of G · (E − T). Further, each circuit of G is either a circuit of G · T or a circuit of G · (E − T).

E 3.60: Let the directed crossing edge set have the orientation (V1, V2). The tail of the edge e lies in a component of the subgraph on V1. Let V' be the vertex set of this component. Consider the directed crossing edge set defined by (V', V(G) − V'). The head of the edge e lies in a component of the subgraph on V(G) − V'. Let V'' be the vertex set of this component. Consider the crossing edge set defined by (V(G) − V'', V''). This has e as a member. It can be seen that it is a directed cutset.
E 3.61: We use the Kuhn-Fourier Theorem. Let V be the solution space of Ax = 0. Suppose V has no nonnegative vector whose support contains e. Then the following system of inequalities has no solution:

Ax = 0
x(e) > 0
x ≥ 0.

By the Kuhn-Fourier Theorem there exist a vector λ, a scalar α > 0 and a vector σ ≥ 0 s.t. λᵀA + αχe + σᵀ = 0. Thus, −λᵀA = (σᵀ + αχe). The vector σᵀ + αχe lies in the space V⊥ and has e in its support.
E 3.62: ii. From each vertex obtain the set of all reachable vertices (do a bfs). This takes O(|V||E|) time. Sort each of these sets and obtain a list in increasing order of indices. This takes O(|V|² log |V|) time. For each pair (v₁, v₂) check if v₂ is reachable from v₁ and if v₁ is reachable from v₂. This takes O(|V|² log |V|) time. So the overall complexity is O(|V| max(|E|, |V| log |V|)).

E 3.63: We assume that the length of an edge is an integer. We first find an upper bound u and a lower bound l for the length of this path. The upper bound could be the sum of all the lengths and the lower bound could be the minimum length of an edge. Narrow down to the correct value of the distance between v₁ and v₂ by asking questions of the type 'is there a path between v₁ and v₂ of length ≤ dᵢ'. The value of dᵢ in this question could be chosen by binary search between u and l: d₁ = (l + u)/2; if yes, d₂ = (l + d₁)/2; if no, d₂ = (d₁ + u)/2; and so on. (Whenever any of these numbers is a fraction we take the nearest integer.) Clearly the number of such dᵢ is O(log(u − l)). Suppose d is the length of the shortest path. To find the edges of the shortest path we ask, for each edge e between v₁ and v₁′ say, if there is a path of length d − d(e) between v₁′ and v₂. If yes (and it must be yes for one such edge) then e belongs to the path and we now try to find a path of length d − d(e) between v₁′ and v₂. By this process the shortest path can be found by framing O(|E(G)|) decision problems. Overall the total number of decision problems is O(log(u − l) + |E(G)|).
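The halving scheme can be sketched as follows. The decision oracle is a black box here; purely for illustration we fake it with exact Dijkstra distances (names and the example graph are our own).

```python
import heapq

def shortest_path_length(oracle, lo, hi):
    # Binary search for the least d with oracle(d) true, where oracle(d)
    # answers 'is there a v1-v2 path of length <= d?'.
    # Uses O(log(hi - lo)) decision questions.
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

def dijkstra(adj, s):
    # stand-in used only to fake the oracle: exact distances from s
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

adj = {1: [(2, 3), (3, 9)], 2: [(3, 4)]}
d12 = dijkstra(adj, 1)
oracle = lambda d: d12.get(3, float('inf')) <= d
# lower bound: the minimum edge length; upper bound: the sum of all lengths
assert shortest_path_length(oracle, 3, 16) == 7
```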
E 3.64: Observe that in the stack at any stage the top vertex has the highest dfs numbering and we cannot get below it unless it has been deleted from the stack. Once a vertex has been deleted from the stack it can never reappear. If v₁ is not an ancestor of v₂ then they have a common ancestor v₃ of highest dfs number. Since v₁ has a lower dfs number than v₂ it would have been deleted from the stack before we went back to v₃ and travelled down to v₂. But then the edge e would have been scanned when we were processing v₁ for the last time. At that time the other end of e would have been unmarked and e would then have been included in the dfs tree. This is a contradiction.
E 3.65: The technique described for building f-circuits using dfs would work for any rooted tree (a tree in which each node has a single parent). The advantage in the case of dfs and bfs is that we can stop as soon as we reach the earliest common ancestor. In the case of bfs we walk from v₁ and v₂ towards the root by first equalising levels (if v₁ has a higher level number we first reach an ancestor v₁′ of the same level as v₂). Thereafter we move alternately one step at a time in the paths v₁ to root and v₂ to root until the first common ancestor is reached.

E 3.66: Suppose t is a minimum spanning tree whose total weight is less than that of the tree t_alg generated by the algorithm. Let t be the nearest such tree to t_alg (i.e., |t_alg − t| is minimum). Let e ∈ (t_alg − t). Consider the f-circuit L(e, t). If w(e) ≤ w(eⱼ) for some eⱼ ∈ (L(e, t) − e), then we could replace t by the tree t ∪ e − eⱼ without increasing its weight. This would contradict the fact that t is the nearest minimum spanning tree to t_alg. Hence w(e) > w(eⱼ) for each eⱼ in (L(e, t) − e). However, e was selected, during some stage of the algorithm, as the
edge of least weight with exactly one end in the current set of vertices. The edges of (L(e, t) − e) constitute a path between the end points of e. At least one of them, therefore, has only one end point in the current vertex set and, therefore, has weight not less than that of e. This contradiction proves that t_alg − t is void. Since both t_alg and t have the same number of edges we conclude that t_alg = t.
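The algorithm referred to here is Prim's method: repeatedly add the least-weight edge with exactly one end in the current vertex set. A minimal sketch (the adjacency format and names are our own):

```python
import heapq

def prim(vertices, adj, root):
    # adj: vertex -> list of (weight, neighbour, edge-name);
    # always add the least-weight edge with exactly one end in the tree
    in_tree = {root}
    tree = []
    heap = list(adj[root])
    heapq.heapify(heap)
    while heap and len(in_tree) < len(vertices):
        w, v, e = heapq.heappop(heap)
        if v in in_tree:
            continue  # both ends already inside: skip
        in_tree.add(v)
        tree.append(e)
        for item in adj[v]:
            heapq.heappush(heap, item)
    return tree

# a 4-cycle 'a', 'b', 'c', 'd' with weights 1, 1, 5, 2: drop the heavy edge 'c'
adj = {1: [(1, 2, 'a'), (2, 4, 'd')],
       2: [(1, 1, 'a'), (1, 3, 'b')],
       3: [(1, 2, 'b'), (5, 4, 'c')],
       4: [(5, 3, 'c'), (2, 1, 'd')]}
assert sorted(prim({1, 2, 3, 4}, adj, 1)) == ['a', 'b', 'd']
```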
E 3.68: Construct a 'level graph' containing all the edges of a bfs tree in the residual graph from s to t and any other edge of that graph that travels from a lower to a higher level. (The level of a node is its bfs number.) Clearly only such edges can figure in a shortest path from s to t. Whenever we augment the flow using a shortest path upto its bottleneck capacity, at least one of the edges, say e, of the residual graph will drop out of the level graph. In the residual graph an oppositely directed edge to e would remain. But this edge cannot figure in the level graph unless the length of the shortest path changes (increases), since it would be travelling from a higher to a lower level. An edge that has dropped out cannot return until the length of the shortest path changes. It follows that there can be at most m augmentations at a particular length of the shortest path from s to t. The length of the shortest path cannot decrease and also cannot exceed the number of nodes in the graph. Hence the total number of augmentations cannot exceed mn.

E 3.69: (Sketch) Replace each edge by two oppositely directed edges of capacity 1. Treat s as source and t as sink. Maximize flow from source to sink. Each unit of flow travels along a path whose edges (since their capacity is 1) cannot be used by another unit of flow. Hence, the maximum flow ≤ the maximum number of arc disjoint paths. The reverse inequality is obvious. In any cut of the flow graph the forward arcs (each of capacity 1) would correspond to arcs in the corresponding cut of the original graph. The result follows.

E 3.70: i. r(T) = size of the maximum circuit free set contained in T. So the dual function at T would give the size of the maximum cutset free set contained in T, i.e., the dual is ν(·), the nullity function (ν(T) = |T| − r(G × T)).

ii. Let r*(·) be the dual. Then r*(T) = dim(V⊥ · T) = |T| − dim(V × T).

iii. Let ξ*(·) be the dual. Then
Thus, K is a minimal set that meets no circuit of M in a single element.

Next, let K be a subset of S that has the property P of meeting no circuit of M in a single element. Then K cannot be contained in a cobase of M since, if e ∈ S − B, where B is a base of M, then L(e, B) meets (S − B) in e. Hence, K contains a circuit of M*. So if K is a minimal subset having the property P, then K contains a circuit K₁ of M*. We have already seen that the circuit K₁ of M* must have the property P. So if K₁ ⊊ K there would be a contradiction. We conclude that K₁ = K, i.e., K is a circuit of M*. □
4. MATROIDS
We next relate the rank function of a matroid to that of its dual.
Theorem 4.3.3 (k) Let M = (S, I) be a matroid and let M* ≡ (S, I*) be its dual. Let r(·), r*(·) be the rank functions of M and M* respectively. Then

r*(X) = |X| − (r(S) − r(S − X)).
Proof: We have, r*(X) = size of a maximal subset of X that is a member of I*. Now T ∈ I* iff there exists a base of M that is contained in S − T. Thus,

r*(X) = size of a maximal subset of X whose complement contains a base of M
      = size of the complement relative to X of a minimal subset of X that is the intersection of a base of M with X.

Now a base of M has minimal intersection with X iff it has maximal intersection with (S − X). Hence, the size of a minimal intersection of a base of M with X = r(S) − r(S − X). Therefore, r*(X) is the size of the complement of a set of size (r(S) − r(S − X)) relative to X. It follows that r*(X) = |X| − (r(S) − r(S − X)). □
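The formula of Theorem 4.3.3 turns any rank oracle for M into one for M*. A tiny sketch (our own; the hard-coded oracle is the graphic matroid of a triangle, in which any two edges form a base):

```python
def dual_rank(r, S, X):
    # Theorem 4.3.3:  r*(X) = |X| - (r(S) - r(S - X))
    return len(X) - (r(S) - r(S - X))

# graphic matroid of a triangle on edges {0, 1, 2}
r = lambda X: min(len(X), 2)
S = frozenset({0, 1, 2})
assert dual_rank(r, S, frozenset({0})) == 1  # a single edge is independent in M*
assert dual_rank(r, S, S) == 1               # the dual matroid has rank 1
```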
We know that, if B is a base of M and e ∉ B, then e ∪ B contains a unique circuit L(e, B) called the fundamental circuit of e with respect to B in the matroid M. Let e_t ∈ B. Now (S − B) is a base of M*. Consider the fundamental circuit of e_t with respect to (S − B) in the matroid M*. This is a bond of M and meets B in e_t. We call this bond the fundamental bond of e_t with respect to B in the matroid M and denote it by B(e_t, B). We then have the following theorem.
Theorem 4.3.4 (k) Let B be a base of a matroid M on S. Let e_t ∈ B and let e_r ∈ S − B. Then

i. B ∪ e_r − e_t is a base of M iff e_r ∈ B(e_t, B) or, equivalently, e_t ∈ L(e_r, B), and hence

ii. e_r ∈ B(e_t, B) iff e_t ∈ L(e_r, B).

Proof:
i. We know that B ∪ e_r contains a unique circuit L(e_r, B). If e_t ∈ L(e_r, B), then B ∪ e_r − e_t is independent in M and has the same size as B and, therefore, is a base of M. If e_t ∉ L(e_r, B), then B ∪ e_r − e_t contains L(e_r, B) and is therefore not a base of M. Thus, B ∪ e_r − e_t is a base of M iff e_t ∈ L(e_r, B). On the other hand, working with the dual matroid, (S − B) ∪ e_t − e_r is a base of M* iff e_r belongs to the fundamental circuit of e_t with respect to S − B in M*. Equivalently, B ∪ e_r − e_t is a base of M iff e_r ∈ B(e_t, B).

ii. We have, from the above, that B ∪ e_r − e_t is a base of M iff e_t ∈ L(e_r, B) and also iff e_r ∈ B(e_t, B). The result follows. □
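Theorem 4.3.4 says the base-exchange pairs can be read off either from fundamental circuits or from fundamental bonds. The sketch below (our own illustration) enumerates the pairs (e_t, e_r) for which B ∪ e_r − e_t is a base, using a graphic rank oracle; for the spanning tree {a, b, d} of a triangle with a pendant edge, c exchanges with a or b (its fundamental circuit) but not with d.

```python
def graph_rank(edge_ends):
    # rank oracle of a graphic matroid: count union-find merges
    def r(X):
        parent = {}
        def find(x):
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        merges = 0
        for name in X:
            u, v = edge_ends[name]
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                merges += 1
        return merges
    return r

def exchange_pairs(r, S, B):
    # pairs (e_t, e_r) with B u e_r - e_t a base; by Theorem 4.3.4 this
    # means e_r in B(e_t, B) and, equivalently, e_t in L(e_r, B)
    full = r(S)
    return {(et, er) for et in B for er in S - B
            if r((B - {et}) | {er}) == full}

E = {'a': (1, 2), 'b': (2, 3), 'c': (1, 3), 'd': (3, 4)}
r = graph_rank(E)
S = frozenset(E)
B = frozenset({'a', 'b', 'd'})   # a spanning tree
assert exchange_pairs(r, S, B) == {('a', 'c'), ('b', 'c')}
```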
Let us next consider the closure operators of a matroid and its dual. We have the following
Theorem 4.3.5 (k) Let M be a matroid on S. Let cl(·), cl*(·) be the closure operators of M, M* respectively. Let T ⊆ S and let B, B′ be bases of M that intersect T maximally and minimally, respectively, among all bases of M. Then

i. e ∈ cl(T) iff e ∈ T or e ∪ (B ∩ T) is dependent in M.

ii. e ∈ cl(T) iff e ∈ T or e ∈ S − B and (L(e, B)) ∩ B ⊆ T.

iii. e ∈ cl*(T) iff e ∈ T or e ∈ B′ and (B(e, B′)) ∩ (S − B′) ⊆ T, where the fundamental circuit and bond are taken with respect to M.

Proof: Let B_T ≡ B ∩ T, B′_T ≡ B′ ∩ T. We will throughout consider the case where e ∉ T.

i. We have, e ∈ cl(T) iff there exists a circuit C s.t. e ∈ C and C − e ⊆ T (Lemma 4.2.2). Now, if e ∪ B_T is dependent, there exists a circuit contained in it. This circuit has e as a member, since B_T is independent. Hence, e ∈ cl(T). Next, if e ∈ cl(T), we have r(B_T) = r(T) = r(T ∪ e) ≥ r(B_T ∪ e). We conclude (since r(·) is increasing) that r(B_T) = r(B_T ∪ e). Hence B_T ∪ e is dependent.

ii. Let e ∈ cl(T). Then by (i) above, e ∪ B_T is dependent in M. So it contains a circuit which has e as a member. This must be the unique circuit L(e, B) contained in e ∪ B. Hence, L(e, B) ∩ B ⊆ T. Next suppose L(e, B) ∩ B ⊆ T, i.e., L(e, B) ∩ B ⊆ B_T. But this means that e ∪ B_T contains the circuit L(e, B). So e ∪ B_T is dependent in M. So by (i) e ∈ cl(T).

iii. We first observe that B′ has a minimal intersection among all bases of M with T iff S − B′ has a maximal intersection among all bases of M* with T. So by (ii) above, it follows that e ∈ cl*(T) iff e ∈ T or e ∈ S − (S − B′) and L*(e, (S − B′)) ∩ (S − B′) ⊆ T, where L*(e, (S − B′)) is the fundamental circuit of e with respect to (S − B′) in the matroid M*. Now L*(e, (S − B′)) = B(e, B′). So e ∈ cl*(T) iff e ∈ T or e ∈ B′ and (B(e, B′)) ∩ (S − B′) ⊆ T. □
Exercise 4.10 (k) Let G be a graph, let v ∈ V_v(G) and let i ∈ V_i(G). Let T ⊆ E ≡ E(G). Suppose only v/T, i/T are known about v and i. Let cl(·), cl*(·) be the closure operators of M(G), M*(G) respectively. Show that v(e) (i(e)) can be uniquely determined iff e ∈ cl(T) (e ∈ cl*(T)).

Exercise 4.11 (k) Let f, g be real valued functions on the reals and let f(·) be increasing. Consider the functions | · |, r(·) on subsets of S, where |X| is the size of X and r(·) is the rank function of a matroid M. Show that f(| · |) − g(r(·)) reaches a maximum on a closed subset of M.
4.4 Minors of Matroids

In this section we generalize the notion of minors of graphs and vector spaces to matroids. Let M ≡ (S, I) be a matroid. Let T ⊆ S. The restriction (or reduction) of M to T, denoted by M · T, is the matroid (T, I_T) where I_T is the collection of all subsets of T which are members of I. The contraction of M to T, denoted by M × T, is the pair (T, I′_T) in which X ∈ I′_T iff X ∪ B_{S−T} ∈ I whenever B_{S−T} is a base of M · (S − T). A minor of M is a matroid of the form (M × T₁) · T₂ or (M · T₁) × T₂, T₂ ⊆ T₁ ⊆ S. Since there is no room for confusion we omit the brackets while denoting minors. It is clear from the definition that M · T is a matroid. We prove below that M × T is also a matroid.
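In terms of rank functions the two minor operations read r_{M·T}(X) = r(X) and r_{M×T}(X) = r(X ∪ (S − T)) − r(S − T); this standard reformulation (not spelled out above) is easy to code against a rank oracle. A minimal sketch of ours:

```python
def restriction(r, T):
    # rank function of M . T
    return lambda X: r(X & T)

def contraction(r, S, T):
    # rank function of M x T:  r(X u (S - T)) - r(S - T)
    rest = S - T
    base = r(rest)
    return lambda X: r((X & T) | rest) - base

# uniform matroid U_{2,4}: any set of at most 2 elements is independent
r = lambda X: min(len(X), 2)
S = frozenset(range(4))
T = frozenset({0, 1})
assert contraction(r, S, T)(frozenset({0})) == 0  # every element of M x T is a loop
assert restriction(r, T)(T) == 2                  # {0, 1} is a base of M . T
```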
Lemma 4.4.1 (k) Let M be a matroid on S and let X ⊆ T ⊆ S. Suppose B¹_{S−T}, B²_{S−T} are two bases of M · (S − T) and X ∪ B¹_{S−T} is independent in M. Then X ∪ B²_{S−T} is also independent in M.

Proof: Suppose the lemma fails. Then there exist two bases B¹_{S−T}, B²_{S−T} of M · (S − T) s.t. X ∪ B¹_{S−T} is independent, X ∪ B²_{S−T} is dependent and |B¹_{S−T} − B²_{S−T}| is a minimum for these conditions. Let e ∈ B²_{S−T} − B¹_{S−T}. Then e ∪ B¹_{S−T} contains the unique circuit L(e, B¹_{S−T}) of M · (S − T). Now L(e, B¹_{S−T}) has some element e′ ∈ (B¹_{S−T} − B²_{S−T}). Hence, B³_{S−T} ≡ (B¹_{S−T} − e′) ∪ e is a base of M · (S − T), using Theorem 4.2.2. Let B¹ be a base of M containing X ∪ B¹_{S−T}. We know that e ∪ B¹ contains the unique circuit L(e, B¹). Now by the definition of M · (S − T) it follows that circuits of M · (S − T) are the same as circuits of M contained in (S − T). Hence, L(e, B¹) = L(e, B¹_{S−T}) and B³ ≡ (B¹ − e′) ∪ e is a base of M. Now X ∪ B³_{S−T} ⊆ B³ and therefore X ∪ B³_{S−T} is independent in M. But |B³_{S−T} − B²_{S−T}| < |B¹_{S−T} − B²_{S−T}|, which contradicts the minimality of |B¹_{S−T} − B²_{S−T}|. □

Consider the collection F of all subsets of the sets {6}, {2,3,4,5}, {7,8}. Find a maximum size subset of {1,2,3,4,5,6,7,8} that is a member of F. Suppose we start with {6} and check if there is a member of F properly containing the set; we would find there is no such member. However, {6} is not the maximum size subset that is a member of F (the required subset is {2,3,4,5}). Our main result relates matroids to the greedy algorithm. We need some preliminary definitions. Let S be a finite set and let I be a collection of subsets of S such that 'Y ∈ I, X ⊆ Y' implies 'X ∈ I.' Let w(·) be a real weight function on S. Let the weight of a subset X of S, denoted by w(X), be defined by w(X) ≡ Σ_{e∈X} w(e), X ⊆ S. Let us call a maximal member of I a base of I. Let X₁, X₂ be two bases of I. Let X₁ ≡ {a₁, ..., a_k}, X₂ ≡ {e₁, ..., e_m} and further let w(a₁) ≥ ... ≥ w(a_k) and let w(e₁) ≥ ... ≥ w(e_m).
We define a preorder ⪰ on the maximal members of I as follows: X₁ ⪰ X₂ iff w(a_i) > w(e_i) whenever i is the least index s.t. w(a_i) ≠ w(e_i). If X₀ is a base of I s.t. X₀ ⪰ X_j whenever X_j is a base of I, we say that X₀ is a lexicographically optimum base of I relative to w(·).
Assume that we have an oracle (I-oracle) which tells us whether a given subset of S belongs to I. Then it is clear that with |S| queries to the I-oracle one can determine a lexicographically optimum base of I relative to w(·): Let S ≡ {e₁, ..., e_n}. Without loss of generality let us assume that w(e₁) ≥ ... ≥ w(e_n). Let e_{i₁} be the heaviest element s.t. {e_{i₁}} ∈ I. Suppose at some stage we have constructed a member T ∈ I. Let e_k be the lightest element of T. To grow T further we look for the heaviest element e_j in {e_{k+1}, ..., e_n} for which T ∪ e_j ∈ I. If no such e_j exists, T is the lexicographically optimum base of I. Clearly the above algorithm can be called 'greedy' since the set is grown to its full size by doing only local optimization with no back tracking.

Theorem 4.6.1 (k) Let S be a finite set and I a collection of subsets of S s.t. X ⊆ Y, Y ∈ I implies X ∈ I.

i. [Gale68] Let w(·) be a weight function on S. If M = (S, I) is a matroid with I as the collection of independent subsets of M, then a base of M, relative to w(·), is lexicographically optimum iff it is a base of maximum weight.

ii. If for every weight function the lexicographically optimum base of I is also the base with maximum weight, then (S, I) is a matroid.
Proof: i. Only if: Let X be a lexicographically optimum base of M relative to w(·) and let Y be a base of maximum weight s.t. |Y ∩ X| is the maximum possible. Let X ≡ {a₁, ..., a_k}, Y ≡ {e₁, ..., e_k} with w(a_i) ≥ w(a_j), i < j and w(e_i) ≥ w(e_j), i < j. Let X ≠ Y. Let r be the highest index for which {a₁, ..., a_r} = {e₁, ..., e_r}. (If a₁ ≠ e₁, we take r to be zero.) Now a_{r+1} ≠ e_{r+1}. We have X ⪰ Y. Hence, w(a_{r+1}) ≥ w(e_{r+1}). Consider the fundamental circuit L(a_{r+1}, Y). There exists an element e_i ∈ (Y − X) ∩ L(a_{r+1}, Y) s.t. Y′ ≡ Y ∪ a_{r+1} − e_i is a base of M. We have i ≥ r + 1. Hence, w(e_i) ≤ w(e_{r+1}) ≤ w(a_{r+1}). Hence, w(Y′) ≥ w(Y). But this contradicts the fact that Y is the maximum weight base nearest to X. We conclude therefore that X = Y.

if: Let Y ≡ {e₁, ..., e_k}, w(e_i) ≥ w(e_j), i < j, be a base of M of maximum weight and let X = {a₁, ..., a_k}, w(a_i) ≥ w(a_j), i < j, be a lexicographically optimum base of M. Suppose Y is not lexicographically optimum. Let r be the least index for which w(a_r) > w(e_r). (For i < r we must have w(a_i) = w(e_i).) Now {e₁, ..., e_r} cannot span {a₁, ..., a_r}, as otherwise the base (Y − {e₁, ..., e_r}) ∪ {a₁, ..., a_r} would have greater weight than Y. Let a_j, j ≤ r, be an element that does not belong to the span of {e₁, ..., e_r}. Consider L(a_j, Y). This set must intersect {e_{r+1}, ..., e_k}. Let e_m belong to the intersection. Clearly (a_j ∪ Y) − e_m is a base of M of greater weight than Y. This contradiction shows that w(a_i) = w(e_i), i ≤ k. Hence Y is a lexicographically optimum base of M.

ii. Suppose (S, I) is not a matroid. Then there exist a subset T of S and two maximal members B₁, B₂ of the collection of members of I contained in T, with |B₁| ≡ n < |B₂| ≡ m. Let p ≡ |B₁ ∩ B₂|. Give each element of B₁ the weight 1, each element of B₂ − B₁ the weight α and every other element the weight 0. Then we can select α so that α < 1 but n < (m − p)α + p. In this case the lexicographically optimum base clearly contains B₁ (and no element from B₂ − B₁). But we have w(B₂) > w(B₁), so that a base that contains B₁ (and does not intersect B₂ − B₁) cannot have maximum weight. We can avoid this contradiction only if n = m. □
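The greedy procedure and the failure mode of the earlier {6} versus {2,3,4,5} example can be sketched as follows (our own illustration; in the non-matroid case the weights are rigged so that the greedy pass starts with 6, the bad start described above):

```python
def greedy(S, w, indep):
    # grow a member of the family by always adding the heaviest
    # element that keeps the set inside the family (the I-oracle)
    T = set()
    for e in sorted(S, key=w, reverse=True):
        if indep(T | {e}):
            T.add(e)
    return T

# On a matroid greedy finds a maximum weight base (Theorem 4.6.1 i):
# graphic matroid of a triangle -- any set of at most 2 edges is independent
matroid = lambda X: len(X) <= 2
assert greedy({0, 1, 2}, {0: 5, 1: 3, 2: 4}.get, matroid) == {0, 2}

# On a non-matroid it can fail: the family of all subsets of
# {6}, {2,3,4,5}, {7,8}, with 6 weighted heaviest
family = lambda X: X <= {6} or X <= {2, 3, 4, 5} or X <= {7, 8}
heavy6 = lambda e: 10 if e == 6 else 1
assert greedy({2, 3, 4, 5, 6, 7, 8}, heavy6, family) == {6}
```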
Exercise 4.20 Let M be a matroid on S and let w(·) be a weight function taking the values 1, 2, ..., k. Let T_j ≡ w⁻¹(k − j + 1). Show that a base B of M has maximum weight iff B = B₁ ∪ ... ∪ B_k, where B_j is a base of (M · (T₁ ∪ ... ∪ T_j)) × T_j.

4.7 Notes
Matroids were introduced into combinatorics by H. Whitney in 1935 [Whitney35], when he described several equivalent axiom systems which characterized the 'abstract properties of linear independence'. One such axiom system was also described by Van der Waerden in his book on Modern Algebra [Van der Waerden37]. Early work on the lattice of flats of a matroid was done by Birkhoff [Birkhoff35]. In the 1940's Rado and Dilworth made important contributions to this theory [Rado42], [Dilworth44]. The subject received a big impetus when Tutte solved the regular matroid and graphic matroid characterization problems in 1959 [Tutte58], [Tutte59]. In the mid 60's important applications to combinatorial optimization were discovered by Edmonds and Fulkerson [Edmonds+Fulkerson65], [Edmonds65a], [Edmonds65b]. Since then, research in this area has remained very active, both in theory and in applications. The reader who wishes to pursue the subject further may refer to [Tutte65], [Tutte71], [Crapo+Rota70], [Randow76], [Welsh76], [Aigner79], [White86]. Applications may be found in [Papadimitriou+Steiglitz82], [Lawler76], [Faigle87], [Recski89]. A good way of accessing the basic papers of the subject is through [Kung86].
4.8 Solutions of Exercises

E 4.1: Example 4.2.1: This follows from the facts that the maximal intersection of a forest (coforest) of G with T, T ⊆ E(G), is a forest of G · T (coforest of G × T), that all forests of G · T have the same cardinality, and that all coforests of G × T have the same cardinality. Since forests and coforests are complements of each other, the bases of either of the two matroids are cobases of the other.

Example 4.2.2: This follows from the fact that maximally independent subsets of columns of any submatrix of R (R⊥) have the same cardinality. Further, if (I : K) is a standard representative matrix of V then (−Kᵀ : I) is a standard representative matrix of V⊥. Since there is a standard representative matrix corresponding to each maximally independent subset of columns, we conclude that the bases of either of (S, I), (S, I*) are cobases of the other.

Example 4.2.3: This follows from Example 4.2.1 and Theorem 11.2.6.

Example 4.2.4: (sketch) Let T ⊆ V(G) and let I₁, I₂ be two maximal members of the collection contained in T. There exist matchings M₁, M₂ which meet I₁, I₂. Consider the subgraph of G on M₁ ∪ M₂. (Note that the vertex set of this subgraph may contain vertices outside T.) It can be seen that each component of this subgraph is either a circuit graph or a path graph. If |I₂| > |I₁| we must have a subset I₂′ of I₂ in one of these components of larger size than the subset I₁′ of I₁ in the same component. It is then possible to find a matching in this component which meets I₁′ ∪ v for some v ∈ I₂′ − I₁′. Hence, I₁ ∪ v is a member of the collection. This is a contradiction. We conclude that |I₁| = |I₂|.
E 4.2: Let I satisfy the Base Axioms with condition (ii′) (the case where condition (ii) is satisfied is similar). In order to show that it satisfies the Independence Axioms we need only show that maximally independent subsets contained in T ⊆ S have the same cardinality.

Case 1: T = S. If B₁, B₂ are bases and e ∈ B₁ − B₂, we can find an e′ ∈ B₂ − B₁ s.t. (B₁ − e) ∪ e′ is a base. If we repeat this procedure we would finally get a base B_k ⊆ B₂ s.t. |B_k| = |B₁|. But one base cannot properly contain another. So B_k = B₂ and |B₂| = |B₁|.

Case 2: T ⊂ S. Suppose X = {x₁, ..., x_k} and Y = {y₁, ..., y_m} are maximally independent sets contained in T. Further let k < m. First grow X to a base B_x and Y to a base B_y. Let

B_x ≡ {x₁, ..., x_k, p_{k+1}, ..., p_r},
B_y ≡ {y₁, ..., y_m, q_{m+1}, ..., q_r}.

Since k < m, there are more elements p_i than q_j, and hence there is an element p_t ∈ B_x − B_y among the p_i. Hence, there is an element z ∈ B_y − B_x s.t. (B_x − p_t) ∪ z is a base. Now z cannot be one of the y_i, as otherwise X would not be a maximally independent subset contained in T. So z = q_s, say. We thus have a new base B_x′ ≡ (B_x − p_t) ∪ q_s. Observe that (B_y − B_x′) ∩ (S − T) ⊂ (B_y − B_x) ∩ (S − T). Repeating this procedure we would reach a base B_f s.t. B_f ∩ (S − T) ⊇ B_y ∩ (S − T). Now if e ∈ B_f − B_y, then there must exist z′ ∈ B_y − B_f s.t. (B_f − e) ∪ z′ is a base. But then z′ ∈ Y and X ∪ z′ is independent, which contradicts the fact that X is a maximally independent subset of T. We conclude therefore that |X| = |Y|, i.e., that maximally independent subsets contained in T have the same cardinality.
E 4.3: i. Let Y ∈ I and let X ⊆ Y. We need to show that X ∈ I. We have r(∅) = 0 and r(A ∪ e) ≤ r(A) + 1 ∀ A ⊆ S. Hence, r(X)

7. THE IMPLICIT DUALITY THEOREM

< ·, · > — but in most cases we deal with the dot product. If inner product is intended this is mentioned. In Section 7.2 the 'q-bilinear' operation (a generalization of dot product and inner product) is defined and denoted also by < ·, · >. This should however cause no confusion since discussion on this operation in the general sense is confined to that section.

iii. We abuse the notation in the following sense: even when we refer to two different dot product operations, say one on V₁ × V₁ and the other on V₂ × V₂, where V₁, V₂ are spaces on different sets, we would use the same symbol < ·, · >. When f₁ is on S₁ and f₂ is on S₂ then we take the dot product of f₁ with f₂, also denoted by < f₁, f₂ >, to be

Σ_{e ∈ S₁ ∩ S₂} f₁(e) f₂(e).

By definition, the dot product is zero when S₁ ∩ S₂ = ∅.
For notational convenience we generalize the usual ideas of addition of vectors and of collections of vectors below. We also extend the definition of contraction and restriction to arbitrary collections of vectors. Let us define addition of vectors f_{S₁}, f_{S₂} on distinct sets S₁, S₂ by

(f_{S₁} + f_{S₂})(e) ≡ f_{S₁}(e) + f_{S₂}(e), e ∈ S₁ ∩ S₂,
(f_{S₁} + f_{S₂})(e) ≡ f_{S₁}(e), e ∈ S₁ − S₂,
(f_{S₁} + f_{S₂})(e) ≡ f_{S₂}(e), e ∈ S₂ − S₁.

When S₁, S₂ are disjoint sets we usually write (f_{S₁} ⊕ f_{S₂}) in place of (f_{S₁} + f_{S₂}). Addition of collections of vectors K_{S₁}, K_{S₂} is denoted as usual by K_{S₁} + K_{S₂} and is defined by

K_{S₁} + K_{S₂} ≡ {f_{S₁} + f_{S₂} : f_{S₁} ∈ K_{S₁}, f_{S₂} ∈ K_{S₂}}.

Once again, if S₁, S₂ are disjoint sets we usually write K_{S₁} ⊕ K_{S₂} in place of K_{S₁} + K_{S₂}. For convenience we define K_{S₁} − K_{S₂} by

K_{S₁} − K_{S₂} ≡ {f_{S₁} − f_{S₂} : f_{S₁} ∈ K_{S₁}, f_{S₂} ∈ K_{S₂}}.

It is clear that if K_{S₁}, K_{S₂} are vector spaces on S₁, S₂ respectively then K_{S₁} − K_{S₂} is equal to K_{S₁} + K_{S₂} and is a vector space on S₁ ∪ S₂. On the other hand, if K_{S₁}, K_{S₂} are cones on S₁, S₂ respectively (a cone is a collection of vectors such that nonnegative linear combinations of vectors in the collection remain in the collection) then K_{S₁} + K_{S₂}, K_{S₁} − K_{S₂} are distinct but are still cones on S₁ ∪ S₂.

For any collection K_{S₁} of vectors on S₁ and T ⊆ S₁, let

K_{S₁} · T ≡ {f_T : f_T = f/T for some f ∈ K_{S₁}},

and let

K_{S₁} × T ≡ {f_T : f_T = f/T, where f ∈ K_{S₁} and f/(S₁ − T) = 0}.

We then have the following simple lemma whose routine proof we omit.

Lemma 7.1.1 Let K_{SP}, K_P be collections of vectors on S ∪ P, P respectively. Then K_{SP} ↔ K_P = (K_{SP} − K_P) × S.
7.1. THE VECTOR SPACE VERSION

7.1.1 The Implicit Duality Theorem: Orthogonality Case
Let K_P be a vector space whose representative matrix R_P is as follows:

        P₁   P₂
R_P = (  I    0 ),

where P₁ ∪ P₂ = P. It is then clear that

K_{SP} ↔ K_P = K_{SP} · (S ∪ P₂) × S = K_{SP} × (S ∪ P₁) · S.

Further, K_P⊥ has the representative matrix

        P₁   P₂
      (  0    I ).

Hence, in this case

K_{SP}⊥ ↔ K_P⊥ = K_{SP}⊥ · (S ∪ P₁) × S = (K_{SP} ↔ K_P)⊥

(using Corollary 3.4.1). We now show that this result is true for arbitrary vector spaces.

Theorem 7.1.1 (The Implicit Duality Theorem) Let V_{SP}, V_P be vector spaces on S ∪ P, P respectively. Then,

(V_{SP} ↔ V_P)⊥ = V_{SP}⊥ ↔ V_P⊥.
We need the following lemma for the proof of this theorem. Lemma 7.1.2 A z = b has a solution iff ( A T a = 0)
(bTa= 0).
Proof of the Lemma: Let V A be the space spanned by the columns of A . Since ( V ; ) l = V A (Theorem 2.2.5), a vector b belongs to V A iff it is orthogonal to all vectors in V i , i.e., iff (A*a = 0) (bTa= 0). 0
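Numerically, Lemma 7.1.2 says Ax = b is solvable iff b is orthogonal to the null space of Aᵀ. A sketch with numpy (tolerances and helper names are our own):

```python
import numpy as np

def solvable(A, b, tol=1e-9):
    # A x = b has a solution iff (A^T a = 0) implies (b^T a = 0),
    # i.e. iff b is orthogonal to the null space of A^T; the left
    # singular vectors of A beyond rank(A) span that null space.
    u, s, vt = np.linalg.svd(A)
    rank = int((s > tol).sum())
    null_At = u[:, rank:]               # columns span {a : A^T a = 0}
    return bool(np.all(np.abs(null_At.T @ b) < tol))

A = np.array([[1., 0.], [0., 1.], [1., 1.]])
assert solvable(A, np.array([1., 2., 3.]))      # b in the column space
assert not solvable(A, np.array([1., 2., 4.]))  # b outside the column space
```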
Proof of Theorem 7.1.1: Let

   S    P                P
( A_S  A_P ),         ( Ã_P )

be the representative matrices of V_{SP}, V_P respectively. A vector x_S belongs to V_{SP} ↔ V_P iff there exist vectors λ₁, λ₂ s.t.

λ₁ᵀ (A_S  A_P) = (x_Sᵀ  λ₂ᵀ Ã_P),

i.e., iff the system

A_Sᵀ λ₁ = x_S,   A_Pᵀ λ₁ − Ã_Pᵀ λ₂ = 0

has a solution. By Lemma 7.1.2, this happens iff

(A_S y_S + A_P y_P = 0 and Ã_P y_P = 0) ⇒ (x_Sᵀ y_S = 0),

i.e., iff

(y_S ⊕ y_P ∈ V_{SP}⊥ and y_P ∈ V_P⊥) ⇒ (x_Sᵀ y_S = 0).

Thus, x_S ∈ V_{SP} ↔ V_P iff x_S is orthogonal to every vector y_S ∈ V_{SP}⊥ ↔ V_P⊥, i.e.,

V_{SP} ↔ V_P = (V_{SP}⊥ ↔ V_P⊥)⊥.

By Theorem 2.2.5, it follows that

(V_{SP} ↔ V_P)⊥ = ((V_{SP}⊥ ↔ V_P⊥)⊥)⊥ = V_{SP}⊥ ↔ V_P⊥.

(It is clear that V_{SP}⊥ ↔ V_P⊥ is a vector space.) □
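The theorem can be sanity-checked numerically. The sketch below is entirely our own machinery (the book works coordinate-free): a space is represented by rows spanning it, the generalized minor is computed as (V_SP ∩ (ℝ^S ⊕ V_P)) · S, and (V_SP ↔ V_P)⊥ is compared with V_SP⊥ ↔ V_P⊥ on a small example.

```python
import numpy as np

TOL = 1e-9

def complement(R, n):
    # rows of R span a subspace of R^n; return rows spanning its
    # orthogonal complement (null space of R via SVD)
    if R.shape[0] == 0:
        return np.eye(n)
    u, s, vt = np.linalg.svd(R, full_matrices=True)
    rank = int((s > TOL).sum())
    return vt[rank:, :]

def minor(R_SP, R_P, nS, nP):
    # V_SP <-> V_P = (V_SP intersect (R^S (+) V_P)) . S; the intersection of
    # row spaces A, B is the complement of (complement A stacked on complement B)
    B = np.vstack([np.hstack([np.zeros((R_P.shape[0], nS)), R_P]),  # 0 (+) V_P
                   np.hstack([np.eye(nS), np.zeros((nS, nP))])])    # R^S (+) 0
    inter = complement(np.vstack([complement(R_SP, nS + nP),
                                  complement(B, nS + nP)]), nS + nP)
    return inter[:, :nS]                                            # restrict to S

def same_space(A, B):
    rk = np.linalg.matrix_rank
    return rk(A) == rk(B) == rk(np.vstack([A, B]))

nS = nP = 2
R_SP = np.array([[1., 0., 1., 0.],
                 [0., 1., 0., 1.]])   # V_SP: p = s
R_P = np.array([[1., 1.]])            # V_P: p1 = p2
lhs = complement(minor(R_SP, R_P, nS, nP), nS)               # (V_SP <-> V_P) perp
rhs = minor(complement(R_SP, nS + nP), complement(R_P, nP), nS, nP)
assert same_space(lhs, rhs)           # both are the line spanned by (1, -1)
```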
Remark: The fact that vectors in V_{SP}⊥ ↔ V_P⊥ and V_{SP} ↔ V_P are orthogonal is easy to see but of limited value. For almost all applications it is necessary to show that the two spaces are complementary orthogonal.

Example: Suppose spaces V and V′ are solution spaces respectively of

If the two spaces are complementary orthogonal and if we know that x₁ = Kx₂, we can conclude that y₂ = −Kᵀy₁. This conclusion cannot be reached if we only knew that the two spaces are merely orthogonal. Complementary orthogonality of the concerned spaces is critical in the derivation of adjoints (see Subsection 7.3.4).

We observe that the proof of the Implicit Duality Theorem depends on two facts: the spaces that we deal with satisfy (V⊥)⊥ = V, and they have finite bases or, more generally, are finitely generated. In order to generalize Theorem 7.1.1 we could look for situations where the above two conditions hold. Lemma 7.1.2 has a conical version (Farkas Lemma) and an integral version (a variation of Van der Waerden's Theorem). In both these cases the collections of vectors of interest are finitely generated. So our initial proofs assume this. For completeness we later give a proof which works even when 'finitely generated' is not assumed (Problem 7.14).

For the following exercises take V_K, V_{KL}, where K, L are disjoint sets, to be vector spaces on K, K ∪ L respectively.

Exercise 7.1 If S₁ ∩ S₂ = ∅, and V₁, V₂ denote the vector spaces V_{S₁}, V_{S₂}, show that
i. (V₁ ⊕ V₂)⊥ = V₁⊥ ⊕ V₂⊥;
ii. (V₁ ⊕ V₂) ↔ V₂ = V₁.
Exercise 7.2 Let V₁, V₂ be vector spaces on S ∪ P and let V_P, V_P′ be vector spaces on P. Show that
i. (V₁ + V₂) ↔ V_P ⊇ (V₁ ↔ V_P) + (V₂ ↔ V_P);
ii. (V₁ ∩ V₂) ↔ V_P ⊆ (V₁ ↔ V_P) ∩ (V₂ ↔ V_P);
iii. (V₁ ↔ (V_P + V_P′)) ⊇ (V₁ ↔ V_P) + (V₁ ↔ V_P′);
iv. (V₁ ↔ (V_P ∩ V_P′)) ⊆ (V₁ ↔ V_P) ∩ (V₁ ↔ V_P′);
v. (V₁ ↔ V_P)⊥ = (V₁ − V_P)⊥ · S, where (V₁ − V_P) is a vector space on S ∪ P.

Exercise 7.3 Changes in the spaces which leave the generalized minor unaltered: Show that
i. V_{SP} ↔ V_P = V_{SP} ↔ (V_P ∩ (V_{SP} · P));
ii. V_{SP} ↔ V_P = (V_{SP} ∩ (V_{SP} · S ⊕ V_P)) ↔ V_P;
iii. let Ṽ_P be a vector space on P s.t. Ṽ_P − V_{SP} × P = V_P − V_{SP} × P; then V_{SP} ↔ V_P = V_{SP} ↔ Ṽ_P.

Exercise 7.4 Duality of contraction and restriction: Let Q ⊆ T ⊆ S₁. Prove, using the Implicit Duality Theorem,
i. (V · T)⊥ = V⊥ × T;
ii. (V × T)⊥ = V⊥ · T;
iii. (V × T · Q)⊥ = V⊥ · T × Q.

Exercise 7.5 When can a space be a generalized minor of another? Let V_{SP}, V_S be vector spaces on S ∪ P, S respectively. Then there exists a vector space V_P on P s.t. V_{SP} ↔ V_P = V_S iff V_{SP} × S ⊆ V_S ⊆ V_{SP} · S.
7.1.2 Matched and Skewed Sums

For applications such as the decomposition of multiports it would be useful to have an extension of the '↔' notation to the situation where one of the underlying sets is not contained in the other.

Definition 7.1.2 Let K_{S₁}, K_{S₂} be collections of vectors on sets S₁, S₂ respectively. Then the matched sum K_{S₁} ↔ K_{S₂} is defined by

K_{S₁} ↔ K_{S₂} ≡ {f : f = f₁/(S₁ − S₂) ⊕ f₂/(S₂ − S₁), where f₁ ∈ K_{S₁}, f₂ ∈ K_{S₂} and f₁/(S₁ ∩ S₂) = f₂/(S₁ ∩ S₂)}.

The skewed sum K_{S₁} ⇋ K_{S₂} is defined by

K_{S₁} ⇋ K_{S₂} ≡ {f : f = f₁/(S₁ − S₂) ⊕ f₂/(S₂ − S₁), where f₁ ∈ K_{S₁}, f₂ ∈ K_{S₂} and f₁/(S₁ ∩ S₂) = −f₂/(S₁ ∩ S₂)}.

The reader may verify that if S₁, S₂ are disjoint, the matched sum and skewed sum both correspond to the direct sum. If S₂ ⊆ S₁, they correspond to the generalised minor.
We now have the following useful corollary to the Implicit Duality Theorem.

Corollary 7.1.1 Let V_{S₁}, V_{S₂} be vector spaces on sets S₁, S₂ respectively and let < ·, · > be the usual dot product operation. Then,

(V_{S₁} ↔ V_{S₂})⊥ = V_{S₁}⊥ ⇋ V_{S₂}⊥.

Proof: Let P₁, P₂ be disjoint copies of S₁ ∩ S₂, with e ∈ S₁ ∩ S₂ corresponding to e₁ in P₁ and e₂ in P₂. Let S₁′ ≡ (S₁ − S₂) ∪ P₁ and let S₂′ ≡ (S₂ − S₁) ∪ P₂. Let V′_{S₁′}, V′_{S₂′} be copies of V_{S₁}, V_{S₂} built on S₁′, S₂′ respectively as follows:

V′_{S₁′} ≡ {f′ : f′/(S₁ − S₂) = f/(S₁ − S₂), f′(e₁) = f(e) ∀ e ∈ S₁ ∩ S₂, for some f ∈ V_{S₁}},

and V′_{S₂′} is defined similarly with respect to V_{S₂}. Let V₁₂ be the vector space on P₁ ∪ P₂ with the representative matrix

  P₁   P₂
(  I    I ),

in which each row has a 1 in the columns of the elements corresponding to one element of S₁ ∩ S₂. It is now clear that

V_{S₁} ↔ V_{S₂} = (V′_{S₁′} ⊕ V′_{S₂′}) ↔ V₁₂,

and therefore,

(V_{S₁} ↔ V_{S₂})⊥ = ((V′_{S₁′} ⊕ V′_{S₂′}) ↔ V₁₂)⊥.

The RHS of the above equation, by Theorem 7.1.1, reduces to (V′_{S₁′} ⊕ V′_{S₂′})⊥ ↔ V₁₂⊥. Since

(V′_{S₁′} ⊕ V′_{S₂′})⊥ = (V′_{S₁′})⊥ ⊕ (V′_{S₂′})⊥,

this reduces to

((V′_{S₁′})⊥ ⊕ (V′_{S₂′})⊥) ↔ V₁₂⊥.

Thus, a vector f on (S₁ − S₂) ∪ (S₂ − S₁) belongs to the RHS iff there exist vectors f₁′ ∈ (V′_{S₁′})⊥, f₂′ ∈ (V′_{S₂′})⊥ s.t. f₁′/(S₁ − S₂) = f/(S₁ − S₂), f₂′/(S₂ − S₁) = f/(S₂ − S₁) and f₁′/P₁ ⊕ f₂′/P₂ ∈ V₁₂⊥. Now the last condition is equivalent to f₁′(e₁) = −f₂′(e₂) ∀ e ∈ S₁ ∩ S₂, since V₁₂⊥ has the representative matrix

  P₁   P₂
(  I   −I ),

and the corollary follows. □
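The corollary can be sanity-checked on a one-generator toy case with S₁ = {a, b}, S₂ = {b, c} (the construction below is our own): the matched sum of the two lines and the skewed sum of their orthogonal complements come out as orthogonal complements in the plane indexed by {a, c}.

```python
import numpy as np

def perp2(v):
    # orthogonal complement of span{v} in R^2
    return np.array([v[1], -v[0]])

# V_S1 on (a, b) and V_S2 on (b, c), one generator each
v1, v2 = np.array([1., 2.]), np.array([1., 3.])

# matched sum on (a, c): f1 = t v1, f2 = u v2 with f1(b) = f2(b);
# taking t = v2's b-coefficient and u = v1's b-coefficient forces the match
t, u = v2[0], v1[1]
matched = np.array([t * v1[0], u * v2[1]])

# skewed sum of the duals: g1 = t' d1, g2 = u' d2 with g1(b) = -g2(b)
d1, d2 = perp2(v1), perp2(v2)
tp, up = -d2[0], d1[1]
skewed = np.array([tp * d1[0], up * d2[1]])

# Corollary 7.1.1: the two lines are orthogonal complements of each other
assert abs(matched @ skewed) < 1e-12
assert np.linalg.matrix_rank(np.vstack([matched, skewed])) == 2
```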
Exercise 7.6 Let V_{AB} denote a vector space on A ∪ B, and V_{A₁A₂...A_k} a vector space on A₁ ∪ A₂ ∪ ... ∪ A_k. Show that

i. Restricted associativity:

(V_{ST} ↔ V_{TP}) ↔ V_{PQ} = V_{ST} ↔ (V_{TP} ↔ V_{PQ}) = (V_{ST} ⊕ V_{PQ}) ↔ V_{TP},

if S, T, P, Q are pairwise disjoint, and repeat for the skewed sum;

ii.

(V_{S₁T₁} ↔ V_{T₁P₁}) ⊕ (V_{S₂T₂} ↔ V_{T₂P₂}) = (V_{S₁T₁} ⊕ V_{S₂T₂}) ↔ (V_{T₁P₁} ⊕ V_{T₂P₂}) = (V_{S₁T₁} ⊕ V_{T₂P₂}) ↔ (V_{S₂T₂} ⊕ V_{T₁P₁}),

if S₁, T₁, S₂, T₂, P₁, P₂ are pairwise disjoint, and repeat for the skewed sum;

iii.

(V_{S₁T₁} ⊕ ... ⊕ V_{SₙTₙ}) ↔ V_{T₁T₂...Tₙ} = (V_{S₁T₁} ↔ V_{T₁T₂...Tₙ}) ↔ (V_{S₂T₂} ⊕ ... ⊕ V_{SₙTₙ}),

where the S_i, T_i are all pairwise disjoint, and repeat for the skewed sum.
Exercise 7.7 Compatibility: An ordered pair (V_{S₁T}, V_{S₂T}), where S₁, S₂, T are pairwise disjoint and V_{S₁T}, V_{S₂T} are vector spaces, is said to be compatible iff

V_{S₁T} · T ⊇ V_{S₂T} · T and V_{S₁T} × T ⊆ V_{S₂T} × T.

Show that
i. (V_{S₁T}, V_{S₂T}) is compatible iff (V_{S₁T}⊥, V_{S₂T}⊥) is;
ii. if (V_{S₁T}, V_{S₂T}) is compatible, then V_{S₁T} ↔ (V_{S₁T} ↔ V_{S₂T}) = V_{S₂T};
iii. if (V_{S₁T}, V_{S₂T}) is compatible then V_{S₁T} ⇋ (V_{S₁T} ⇋ V_{S₂T}) = V_{S₂T}.
7.2 *Quasi Orthogonality

It is convenient to focus attention on the essential properties of 'dot product' and 'orthogonality' which are needed for a result like Theorem 7.1.1 to hold. This would help us in generating other versions of the theorem. We do this through our definitions of a 'q-bilinear operation' and 'q-orthogonality'.

For the following discussion < ·, · > would be a quasi bilinear (q-bilinear for short) operation.
Definition 7.2.1 Let $X$ be a vector space over the scalar field $\mathcal{F}$. A q-bilinear operation $\langle \cdot, \cdot \rangle$ on the collection of all ordered pairs of vectors in $X$ takes values in $\mathcal{F}$ and satisfies the following conditions:
i. $\langle \alpha f + \beta g, h \rangle = \alpha \langle f, h \rangle + \beta \langle g, h \rangle$;
ii. $\langle f, g + h \rangle = \langle f, g \rangle + \langle f, h \rangle$;
for all vectors $f, g, h$ in $X$ and scalars $\alpha, \beta$.
Remark: i. The reader would notice that the second condition in the above definition differs from the usual inner product as well as dot product definitions, being weaker. We need a definition which includes these operations as special cases. Further, this weaker condition is adequate for our purposes.
ii. Note that the definition implies that $\langle f, 0 \rangle = \langle 0, f \rangle = 0$.
iii. In this section, unless otherwise stated, $\langle \cdot, \cdot \rangle$ would always denote a q-bilinear operation.
Definition 7.2.2 Let $X$ be a vector space over the field $\mathcal{F}$. Let $\langle \cdot, \cdot \rangle$ be a q-bilinear operation on ordered pairs of vectors in $X$. Let $\Lambda$ be a proper subset of $\mathcal{F}$ closed under addition and further let it satisfy (i) $0 \in \Lambda$, (ii) $\langle f, g \rangle \in \Lambda \Rightarrow \langle g, f \rangle \in \Lambda$ for all vectors $f, g \in X$. Vectors $f, g \in X$ are orthogonal iff $\langle f, g \rangle = \langle g, f \rangle = 0$ and q-orthogonal iff $\langle f, g \rangle \in \Lambda$.

We mention some examples of q-orthogonality.
• Clearly orthogonality is a special case of q-orthogonality. Here $\Lambda \equiv \{0\}$.
• We say two real vectors $x, y$ are polar iff their dot product is nonpositive. So polarity is a special case of q-orthogonality, taking $\Lambda$ to be the set of all nonpositive real numbers.
• We say two real vectors $x, y$ are integrally dual iff their dot product is an integer. In this case we take $\Lambda$ to be the set of all integers, so that integral duality becomes a special case of q-orthogonality.
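The three examples can be written as predicates that differ only in the admissible set $\Lambda$ of Definition 7.2.2. A small sketch (our own illustration; rational vectors are represented with Python `Fraction`s, and the q-bilinear operation is the ordinary dot product):

```python
from fractions import Fraction

def dot(f, g):
    return sum(a * b for a, b in zip(f, g))

# Each instance of q-orthogonality below differs only in the admissible set
# Lambda of Definition 7.2.2.
def orthogonal(f, g):       # Lambda = {0}
    return dot(f, g) == 0

def polar(f, g):            # Lambda = the nonpositive reals
    return dot(f, g) <= 0

def integrally_dual(f, g):  # Lambda = the integers
    return Fraction(dot(f, g)).denominator == 1

assert orthogonal([1, 1], [1, -1])
assert polar([1, 2], [-3, 1])
assert integrally_dual([Fraction(1, 2), 1], [2, 3])
assert not integrally_dual([Fraction(1, 2), 0], [1, 0])
```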
The vector $0$ is easily seen to be orthogonal to all vectors in $X$ and is therefore also q-orthogonal to them. For a collection $\mathcal{K} \subseteq X$, $\mathcal{K}^\perp$ would denote the collection of all vectors orthogonal to every vector in $\mathcal{K}$, and $\mathcal{K}^*$ would denote the collection of all vectors q-orthogonal to every vector in $\mathcal{K}$. We now have the following simple lemma.
Lemma 7.2.1 Let $\mathcal{V}$ be a subspace of $X$. Then $\mathcal{V}^* = \mathcal{V}^\perp$.

Proof: Let $f \in \mathcal{V}$ and $g \in \mathcal{V}^*$ s.t. $\langle f, g \rangle = \alpha \neq 0$. Let $\beta \in \mathcal{F} - \Lambda$. Now $\beta\alpha^{-1} f \in \mathcal{V}$. Further, $\langle (\beta\alpha^{-1})f, g \rangle = \beta\alpha^{-1}\alpha = \beta \notin \Lambda$. Thus, $g \notin \mathcal{V}^*$, which is a contradiction. We conclude that $\alpha = 0$ and therefore, $g \in \mathcal{V}^\perp$. □
Let us say that a collection of vectors $\mathcal{K} \subseteq X$ is closed under q-orthogonality iff $(\mathcal{K}^*)^* = \mathcal{K}$. We then have

Lemma 7.2.2 Let $\mathcal{K}_1, \mathcal{K}_2 \subseteq X$ with $0 \in \mathcal{K}_1 \cap \mathcal{K}_2$. Then
i. $(\mathcal{K}_1 + \mathcal{K}_2)^* = \mathcal{K}_1^* \cap \mathcal{K}_2^*$;
ii. if $\mathcal{K}_1, \mathcal{K}_2, \mathcal{K}_1^* + \mathcal{K}_2^*$ are closed under q-orthogonality, then $(\mathcal{K}_1 \cap \mathcal{K}_2)^* = \mathcal{K}_1^* + \mathcal{K}_2^*$.
Proof: i. Let $g \in (\mathcal{K}_1 + \mathcal{K}_2)^*$. Since $0 \in \mathcal{K}_1 \cap \mathcal{K}_2$, we must have $g \in \mathcal{K}_1^*$ as well as $g \in \mathcal{K}_2^*$. Next let $g' \in \mathcal{K}_1^* \cap \mathcal{K}_2^*$. Let $f_1 \in \mathcal{K}_1$, $f_2 \in \mathcal{K}_2$. We have $\langle f_1 + f_2, g' \rangle = \langle f_1, g' \rangle + \langle f_2, g' \rangle$. Now $\langle f_1, g' \rangle, \langle f_2, g' \rangle \in \Lambda$. Since $\Lambda$ is closed under addition, it follows that $\langle f_1 + f_2, g' \rangle \in \Lambda$. Thus, $g'$ is q-orthogonal to every vector in $\mathcal{K}_1 + \mathcal{K}_2$. We conclude
$$(\mathcal{K}_1 + \mathcal{K}_2)^* = \mathcal{K}_1^* \cap \mathcal{K}_2^*.$$
ii. We have
$$(\mathcal{K}_1^* + \mathcal{K}_2^*)^* = (\mathcal{K}_1^*)^* \cap (\mathcal{K}_2^*)^* = \mathcal{K}_1 \cap \mathcal{K}_2.$$
Hence, $((\mathcal{K}_1^* + \mathcal{K}_2^*)^*)^* = (\mathcal{K}_1 \cap \mathcal{K}_2)^*$, i.e., $(\mathcal{K}_1 \cap \mathcal{K}_2)^* = \mathcal{K}_1^* + \mathcal{K}_2^*$, since $\mathcal{K}_1^* + \mathcal{K}_2^*$ is closed under q-orthogonality. □

Lemma 7.2.3 Let $\mathcal{K}_1, \mathcal{K}_2 \subseteq X$. Then the following hold.
i. $\mathcal{K}_1^{**} \supseteq \mathcal{K}_1$.
ii. If $\mathcal{K}_1 \supseteq \mathcal{K}_2$ then $\mathcal{K}_1^* \subseteq \mathcal{K}_2^*$.
iii. $\mathcal{K}_1^*$ is closed under q-orthogonality.
iv. If $\mathcal{K}_1, \mathcal{K}_2$ are closed under q-orthogonality then $\mathcal{K}_1 \cap \mathcal{K}_2$ is closed under q-orthogonality.

Proof:
i., ii. are immediate from the definition of q-orthogonality.
iii. We have $(\mathcal{K}_1^*)^{**} \supseteq \mathcal{K}_1^*$. Next, $((\mathcal{K}_1^*)^*)^* = (\mathcal{K}_1^{**})^* \subseteq \mathcal{K}_1^*$, since $\mathcal{K}_1^{**} \supseteq \mathcal{K}_1$. The result follows.
iv. $(\mathcal{K}_1 \cap \mathcal{K}_2)^* \supseteq \mathcal{K}_1^*$, $(\mathcal{K}_1 \cap \mathcal{K}_2)^* \supseteq \mathcal{K}_2^*$, since $\mathcal{K}_1 \cap \mathcal{K}_2 \subseteq \mathcal{K}_1, \mathcal{K}_2$. Hence, $((\mathcal{K}_1 \cap \mathcal{K}_2)^*)^* \subseteq (\mathcal{K}_1^*)^* \cap (\mathcal{K}_2^*)^* = \mathcal{K}_1 \cap \mathcal{K}_2$. But for any $\mathcal{K}$, we must have $\mathcal{K}^{**} \supseteq \mathcal{K}$. Hence, $(\mathcal{K}_1 \cap \mathcal{K}_2)^{**} \supseteq \mathcal{K}_1 \cap \mathcal{K}_2$. The result follows. □
In the following lemma $x_P$ denotes a vector on $P$.

Lemma 7.2.4 Let $\mathcal{K}_S$ be a collection of vectors on $S$ and let $T \subseteq S$. Then $(\mathcal{K}_S \cdot T)^* = \mathcal{K}_S^* \times T$, if, for all $f_T, f_{S-T}, g_T, g_{S-T}$, we have
$$\langle f_T \oplus f_{S-T},\ g_T \oplus g_{S-T} \rangle = \langle f_T, g_T \rangle + \langle f_{S-T}, g_{S-T} \rangle.$$
7.3 Applications of the Implicit Duality Theorem
In this section we list applications of the Implicit Duality Theorem. Some of these are discussed in more detail in subsequent chapters.
7.3.1 Ideal Transformer Connections
An ideal transformer on a set of ports $S$ is a pair $(\mathcal{V}_S, \mathcal{V}_S^\perp)$ of complementary orthogonal spaces on $S$. The constraints of the transformer are: $v_S \in \mathcal{V}_S$, $i_S \in \mathcal{V}_S^\perp$, where $v_S, i_S$ are the port voltage and current vectors. For example, the two port ideal transformer satisfies: $v_1 = n v_2$, $i_1 = -\frac{1}{n} i_2$. Thus, the voltage space has the representative matrix $(n \ \ 1)$ and the current space has the representative matrix $(1 \ \ -n)$. The two spaces are clearly complementary orthogonal. It is a well known fact in network theory that (*) if a set of 2-port transformers are connected together in an arbitrary manner and some ports exposed, then on the exposed set of ports the permissible voltage and current vectors form complementary orthogonal spaces (see for instance [Belevitch68]). Usually, however, only one half of this fact is proved, namely, that the current and voltage vectors are orthogonal. We prove the result using the Implicit Duality Theorem.

First we observe that if $\mathcal{G}$ is a graph then, by Tellegen's Theorem, $(\mathcal{V}_v(\mathcal{G}), \mathcal{V}_i(\mathcal{G}))$ constitutes an ideal transformer. Next, if $(\mathcal{V}_{SQ}, \mathcal{V}_{SQ}^\perp)$, $(\mathcal{V}_Q, \mathcal{V}_Q^\perp)$ are ideal transformers, with $\mathcal{V}_{SQ}, \mathcal{V}_Q$ denoting vector spaces on $S \uplus Q$, $Q$ respectively, then equivalent to the Implicit Duality Theorem is the statement that $(\mathcal{V}_{SQ} \leftrightarrow \mathcal{V}_Q,\ \mathcal{V}_{SQ}^\perp \leftrightarrow \mathcal{V}_Q^\perp)$ is an ideal transformer.

For the following discussion we take $Q = E(\mathcal{G})$, $S = (\biguplus_j P_j) - Q$. Now let the $j$th ideal transformer be $(\mathcal{V}_{P_j}, \mathcal{V}_{P_j}^\perp)$ and let the $P_j$'s be all disjoint. This disconnected set of transformers constitutes the ideal transformer $(\bigoplus_j \mathcal{V}_{P_j},\ \bigoplus_j \mathcal{V}_{P_j}^\perp)$. Let the set of ports which are connected together according to the graph $\mathcal{G}$ be $Q$. The set of exposed ports is $\biguplus_j P_j - Q$. If the ports $Q$ form the edges of a graph $\mathcal{G}$, then on the set $Q$ we are imposing the constraints of the ideal transformer $(\mathcal{V}_v(\mathcal{G}), \mathcal{V}_i(\mathcal{G}))$. The voltage vectors that can exist on the ports $\biguplus_j P_j - Q$ are precisely those in $(\bigoplus_j \mathcal{V}_{P_j}) \leftrightarrow \mathcal{V}_v(\mathcal{G})$ and the current vectors that can exist on these ports are precisely those in $(\bigoplus_j \mathcal{V}_{P_j}^\perp) \leftrightarrow \mathcal{V}_i(\mathcal{G})$. By the Implicit Duality Theorem the last two spaces are complementary orthogonal. Equivalently,
$$\Big( \big(\bigoplus_j \mathcal{V}_{P_j}\big) \leftrightarrow \mathcal{V}_v(\mathcal{G}),\ \big(\bigoplus_j \mathcal{V}_{P_j}^\perp\big) \leftrightarrow \mathcal{V}_i(\mathcal{G}) \Big)$$
is an ideal transformer. Further, the fact (*) implies the Implicit Duality Theorem and is therefore equivalent to it. For, by connecting 2-port ideal transformers and exposing ports appropriately it can be seen that any ideal transformer can be built (see for instance Exercise 7.8). Thus, $(\mathcal{V}_{SP}, \mathcal{V}_{SP}^\perp)$, $(\mathcal{V}_P, \mathcal{V}_P^\perp)$ can be built this way. By plugging the '$P$ ports' of the first ideal transformer with the second (using (*) again) we get another ideal transformer, namely $(\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P,\ \mathcal{V}_{SP}^\perp \leftrightarrow \mathcal{V}_P^\perp)$. We conclude that $\mathcal{V}_{SP}^\perp \leftrightarrow \mathcal{V}_P^\perp = (\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P)^\perp$.
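Complementary orthogonality of a candidate pair of spaces is mechanically checkable: the spaces must be mutually orthogonal and their ranks must add up to the number of ports. A numerical sketch for the two-port transformer above (the helper name and the numeric tolerance are our own choices, not from the text):

```python
import numpy as np

def is_ideal_transformer(V, I, nports):
    """True iff the row spaces V and I are complementary orthogonal on nports ports."""
    V = np.atleast_2d(np.asarray(V, float))
    I = np.atleast_2d(np.asarray(I, float))
    mutually_orthogonal = np.allclose(V @ I.T, 0.0)
    ranks_complementary = (np.linalg.matrix_rank(V)
                           + np.linalg.matrix_rank(I) == nports)
    return mutually_orthogonal and ranks_complementary

n = 3.0  # an arbitrary turns ratio
# voltage space (n  1), current space (1  -n) of the two-port ideal transformer
assert is_ideal_transformer([[n, 1.0]], [[1.0, -n]], 2)
# a pair that is orthogonal but NOT complementary (ranks do not add to 2)
assert not is_ideal_transformer([[n, 1.0]], [[0.0, 0.0]], 2)
```

The second assertion illustrates why the usual textbook argument ("voltages and currents are orthogonal") proves only half of the fact (*): orthogonality alone does not force the two spaces to be complementary.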
Exercise 7.8 To create a 'graph' using ideal transformers: Using only 2-port ideal transformers, show how to build an ideal transformer $(\mathcal{V}_v(\mathcal{G}), \mathcal{V}_i(\mathcal{G}))$ where $\mathcal{G}$ is a specified graph.

Exercise 7.9 Ideal transformers cannot be connected inconsistently: Our discussion implies that ideal transformers cannot be connected inconsistently. What would happen if we connect two 2-port transformers of different turns ratio in parallel?

Exercise 7.10 Effect of an ideal transformer on remaining edges: Consider an electrical network with graph $\mathcal{G}$. Let $P \subseteq E(\mathcal{G})$ be the ports of an ideal transformer $(\mathcal{V}_P, \mathcal{V}_P^\perp)$. What would be the voltage and current constraints on $E(\mathcal{G}) - P$? Equivalently, what is the ideal transformer to whose ports the remaining devices of the network are connected?
7.3.2 Multiport Decomposition
An electrical network can often be conveniently visualized as being made up of a number of multiports whose ports are connected together according to a connection diagram. In the literature it is often not clear that this is essentially a topological notion. We dwell at greater length on this important concept in a separate chapter. Here we merely outline the basic idea.

Let $\mathcal{G}$ be the graph of the electrical network. Let $E \equiv E(\mathcal{G})$ be partitioned into $E_1, \ldots, E_k$. Let $P$, a set disjoint from $E$, be partitioned into $P_1, \ldots, P_k$. Let $\mathcal{V}_{E_iP_i}$ be a vector space on $E_i \cup P_i$, $i = 1, \ldots, k$, and let $\mathcal{V}_P$ be a vector space on $P$. Then, $(\mathcal{V}_{E_1P_1}, \ldots, \mathcal{V}_{E_kP_k};\ \mathcal{V}_P)$ is a multiport decomposition of $\mathcal{V}_v(\mathcal{G})$ iff
$$\big(\bigoplus_i \mathcal{V}_{E_iP_i}\big) \leftrightarrow \mathcal{V}_P = \mathcal{V}_v(\mathcal{G}).$$
Usually the spaces $\mathcal{V}_{E_iP_i}$ would be voltage spaces of graphs $\mathcal{G}_{E_iP_i}$ respectively (this is not necessary, as we point out in the next chapter). In such a case, we have
Figure 7.1: Multiport Decomposition of a Graph
Figure 7.1: Multiport Decomposition of a Graph voltage vectors vE1p1,.. . , v E , ~ , of G E l p l , . . . , GE,pk respectively. s.t. V E ~ /PP ~i @ . . . @ V E , , ~ /Pk , belongs to V p (i.e., their port voltages match) iff V E p1 ~ / E l 63 . . . @ v ~ , p , / E kbelongs to v,(G). Let us call a graph G E ~ Pon~ El bJ PI with set PI specified as 'ports' as a multiport (i.e., the multiport is the pair ( G E ~ PPI)). ~, In Figure 7.1, the graph G is decomposed into multiports G E ,G~ E~ connected ~ ~ ~according to the port connection diagram G p 1 p 2 .A voltage vector (vi ,'W, 213, v4, U s , v 6 ) beloiigs to V,,(G) iff we can find vectors (ull,v12),( v z i , v 2 2 ) s.t. ( v i , ~ z , ~ i i , ~ 1E2 ) V p ( G ~ l),~ (713,0q,vs, l 7 ~ 6 , 2 1 2 1U , Z ) E V , , ( G E ~ Pand ~ ) ( ~ 1 11.~12, , u z i , 7 ~ 2 2 )E VU(GP, pZ). 1x1 our notation we say, if the above condition is satisfied, that
( U v ( G ~ l ~ ~l )~, ( G E ~V,(Gplp2)) P~); is a multiport decomposition of V,(G). We expect intuitively that the multiport decomposition represerlted through graphs should work for both voltages and currents. This fact requires proof. Essentially we need to show that ( U ; ( G ~ E V~ i~( G ~ )~,z p 2V)i;( G p l p 2 ) )is a multiport decomposition of Vi(G), whenever ( V , ( G E ~ P V ~ ), (, G E ~ P ~V,(Gp,p,)) ); is a multiport decomposition of V u ( G ) . Now, U,(G) = ( V v ( G ~ , p 1C)B V , ( G E ~ P ~ ) V ) v ( G p 1 p 2 )Hence, . by Implicit Duality Theorem
*
*
( V u ( G ) ) l = ((vv(GE1Pl))l@ ( V U ( G ; E ~ P Z ) ) ~(Vv(GPipz))l. )
i.e., $\mathcal{V}_i(\mathcal{G}) = (\mathcal{V}_i(\mathcal{G}_{E_1P_1}) \oplus \mathcal{V}_i(\mathcal{G}_{E_2P_2})) \leftrightarrow \mathcal{V}_i(\mathcal{G}_{P_1P_2})$, as required.

When an idea is intuitive in terms of graphs, why bring in vector spaces? Here are some reasons:
• The graph version is misleading: if we actually connect the multiports along their ports according to the connection diagram, we usually would not get a graph with the same voltage space as $\mathcal{G}$.
• It is inadequate for optimization purposes: for instance, we can usually reduce the number of port edges if we formulate decomposition as a vector space problem.
• While analysing the network, the vector spaces that we work with need not necessarily be associated with graphs - it is sufficient that their representative matrices be sparse and, preferably, 0-1.
7.3.3 Topological Transformation of Electrical Networks
A general way of looking at Network Analysis through Decomposition is to view it as a way of modifying network structure: a desired structure is imposed on the network at the cost of additional variables and additional constraints [Narayanan87]. We give a sketch of the method. The technique is valid for arbitrary networks. For linear networks we can do more: we can prove bounds on the effort involved, making certain additional assumptions.

Let the given network have graph $\mathcal{G}$ and device characteristic $N(v_E - e) + M(i_E - j) = 0$, where $v, i$ are the branch voltage and current vectors and $e, j$ are the voltage source and current source vectors. Each branch is composite - made up of a device in series with a voltage source, the combination being in parallel with a current source.

Network analysis entails the solution of the following constraints:
$$A i_E = 0$$
$$v_E - A^T v_n = 0$$
$$N(v_E - e) + M(i_E - j) = 0.$$

Suppose that the desired structure is a graph $\mathcal{G}'$ on the same set of edges as $\mathcal{G}$. Let $\mathcal{V}, \mathcal{V}'$ be the voltage spaces of the graphs $\mathcal{G}, \mathcal{G}'$ respectively. We look for a space $\mathcal{V}_{EP}$ s.t. $\mathcal{V} = \mathcal{V}_{EP} \leftrightarrow \mathcal{V}_P$ and $\mathcal{V}' = \mathcal{V}_{EP} \leftrightarrow \mathcal{V}_P'$, i.e., $\mathcal{V}_{EP}$ is an extension of both $\mathcal{V}$ and $\mathcal{V}'$. It is desirable that $|P|$ is minimized since, as we shall show, each element of $P$ is associated with an additional variable.
z ] ’ R n b e The space V E P can be built using V V’ as follows. Let [:A]’[ the representative matrices of V , V’, U n V’ respectively. We take the representative ~ be matrix of U E to
7. THE IMPLICIT DUALITY THEOREM
226
that of V b to be
and that of V p to be
Pl
p2
[ I 01. I t is then immediate that V E p ++ V p = V and V E i+~ V b = V ' . By a suitable row transformation, the representative matrix of V E can ~ be put in the form (for an appropriate representative matrix A: of V ' )
The KVE of $\mathcal{G}$ can be written as: $v_E \in \mathcal{V}_{EP} \leftrightarrow \mathcal{V}_P$. As far as $v_E$ is concerned, the above is equivalent to the existence of vectors $v_n', v_{P_1}$ s.t.
$$v_E = (A_v')^T v_n' + R_1^T v_{P_1}, \qquad R_{P_2}^T v_n' = 0.$$
By the Implicit Duality Theorem we have,
$$\mathcal{V}^\perp = \mathcal{V}_{EP}^\perp \leftrightarrow \mathcal{V}_P^\perp.$$
So the KCE of $\mathcal{G}$ can be written as
$$i_E \in \mathcal{V}_{EP}^\perp \leftrightarrow \mathcal{V}_P^\perp, \qquad (7.4)$$
i.e., as the existence of a vector $i_{P_2}$ s.t.
$$A_v' i_E + R_{P_2} i_{P_2} = 0, \qquad R_1 i_E = 0. \qquad (7.5)$$
Thus, the overall constraints can be written as
$$\left[\begin{array}{ccc|cc} -(A_v')^T & I & 0 & 0 & -R_1^T \\ 0 & 0 & A_v' & R_{P_2} & 0 \\ 0 & N & M & 0 & 0 \\ \hline R_{P_2}^T & 0 & 0 & 0 & 0 \\ 0 & 0 & R_1 & 0 & 0 \end{array}\right] \left[\begin{array}{c} v_n' \\ v_E \\ i_E \\ \hline i_{P_2} \\ v_{P_1} \end{array}\right] = \left[\begin{array}{c} 0 \\ 0 \\ Mj + Ne \\ \hline 0 \\ 0 \end{array}\right]. \qquad (7.7)$$
Notice in Equation 7.7 that the border of the coefficient matrix has size equal to $|P_1| + |P_2|\ (= |P|)$. The core of the matrix, i.e., the left hand top corner of the matrix, is precisely the constraint coefficient matrix of the 'new' network $\mathcal{N}'$ with graph $\mathcal{G}'$ but device characteristic the same as that of the original network. Let $\mathcal{G}$ and $\mathcal{G}'$ be near each other in the sense that $r(\mathcal{V}_v(\mathcal{G}) + \mathcal{V}_v(\mathcal{G}')) - r(\mathcal{V}_v(\mathcal{G}) \cap \mathcal{V}_v(\mathcal{G}'))$ is very small in comparison with the ranks and nullities of the spaces involved. Then the form of Equation 7.7 permits us to solve the network by solving the network $\mathcal{N}'$, for appropriate source distributions, $|P| + 1$ times. There is an additional set of equations of size $|P| \times |P|$ that has to be solved after this to complete the solution. (See Exercise 7.15.)
Exercise 7.11 Minimum common extension of two vector spaces: Let $\mathcal{V}, \mathcal{V}'$ be vector spaces on $S$. Let $P$ be disjoint from $S$. Let $\mathcal{V}_{SP}$ be a vector space on $S \cup P$, and $\mathcal{V}_P, \mathcal{V}_P'$ be vector spaces on $P$. Let $\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P = \mathcal{V}$, $\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P' = \mathcal{V}'$. Show that when the above conditions are satisfied, $|P|$ is minimum iff it equals
$$r(\mathcal{V}' + \mathcal{V}) - r(\mathcal{V}' \cap \mathcal{V}).$$

Exercise 7.12 Minimum extension of graphic spaces not always graphic: Construct a simple example for which the minimum extension $\mathcal{V}_{EP}$ of $\mathcal{V}_v(\mathcal{G}), \mathcal{V}_v(\mathcal{G}')$, where $\mathcal{G}, \mathcal{G}'$ are given graphs, is not the voltage space of a graph.
Exercise 7.13 Suppose $\mathcal{G}'$ is made up only of coloops (selfloops). What would the matrices $R_1$, $R_{P_2}$ be?
7. THE lMPLIClT DUALITY THEOREM
228
Exercise 7.14 [Narayanan8O],[Kajitani+Sakurai+Okamoto]A metric on graphs on a given set of edges: Define the distance between two graphs G and G’ s.t. E ( G )= E(G’) by ~ ( G , G ’ )= T(v,,(G) v,(G’)> - T(v,,(G) n V ~ G ’ N . i. Show that d(., .) is a metric on the space of all graphs with edge set E(G),i.e., d(G,G) = 0, d(G,G’) d(G‘,G), d(G,G’) + d ( G ’ , G ” ) 2 d(G”G’’). ii. Define d ( V ,V’) E T(V + V’) - T(V n V‘), where V , V’ are vector spaces on E . (a) Show that d(., .) is a metric on the collection of all vector spaces on E . (b) Show that d ( U , V ’ ) = d ( U L , ( V ’ ) l )
+
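The distance of Exercise 7.14 can be computed from representative matrices alone, using $r(\mathcal{V} \cap \mathcal{V}') = r(\mathcal{V}) + r(\mathcal{V}') - r(\mathcal{V} + \mathcal{V}')$, so that $d(\mathcal{V}, \mathcal{V}') = 2r(\mathcal{V} + \mathcal{V}') - r(\mathcal{V}) - r(\mathcal{V}')$. A sketch (our own helper, over the reals; row spaces on a common edge set):

```python
import numpy as np

def dist(V1, V2):
    """d(V1, V2) = r(V1 + V2) - r(V1 n V2) for row spaces on a common edge set."""
    V1, V2 = np.atleast_2d(np.asarray(V1, float)), np.atleast_2d(np.asarray(V2, float))
    r_sum = np.linalg.matrix_rank(np.vstack([V1, V2]))  # r(V1 + V2)
    # r(V1 n V2) = r(V1) + r(V2) - r(V1 + V2), hence:
    return 2 * r_sum - np.linalg.matrix_rank(V1) - np.linalg.matrix_rank(V2)

A = [[1., 0., 0.]]
B = [[0., 1., 0.]]
C = [[1., 1., 0.]]
assert dist(A, A) == 0
assert dist(A, B) == 2
assert dist(A, B) <= dist(A, C) + dist(C, B)   # triangle inequality
```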
Exercise 7.15 To solve $\mathcal{N}$ as though it has the structure of $\mathcal{N}'$: Let $\mathcal{N}, \mathcal{N}'$ be networks on graphs $\mathcal{G}, \mathcal{G}'$ respectively with the same device characteristic $M(i_E - j) + N(v_E - e) = 0$. Assume that $d(\mathcal{G}, \mathcal{G}')$ ...
Remark: i. The reader would notice that the above proof is a translation (with additional explanations) of the proof of Theorem 7.1.1.
ii. We note that both the pairs of polyhedral cones $\mathcal{K}_{SP}, \mathcal{K}_P$ as well as $\mathcal{K}_{SP}^p, \mathcal{K}_P^p$ may be defined through inequalities:
let $\mathcal{K}_{SP}$ be the solution set of $(B_S \ \ B_P) \begin{pmatrix} x_S \\ x_P \end{pmatrix} \leq 0$;
$\mathcal{K}_P$ be the solution set of $B_P' x_P \leq 0$;
$\mathcal{K}_{SP}^p$ be the solution set of $(A_S \ \ A_P) \begin{pmatrix} y_S \\ y_P \end{pmatrix} \geq 0$;
$\mathcal{K}_P^p$ be the solution set of $A_P' y_P \leq 0$.
Let $\mathcal{C}_S$ be the collection of all vectors $x_S$ s.t. for some $x_P$ we have $x_S \oplus x_P \in \mathcal{K}_{SP}$ and $x_P \in \mathcal{K}_P$. Then $\mathcal{C}_S^p$ is the collection of all vectors $y_S$ s.t. for some $y_P$ we have $y_S \oplus y_P \in \mathcal{K}_{SP}^p$ and $-y_P \in \mathcal{K}_P^p$.
iii. Since, by Lemma 7.2.1, the collection of all vectors q-orthogonal to a vector space is its complementary orthogonal space, and since the q-bilinear operation used here is the dot product and further a vector space is a special case of a cone, it follows that the Implicit Duality Theorem is a special case of the Implicit Polarity Theorem.

The technique of the proof of Corollary 7.1.1 will work in the case of the present instance of q-orthogonality, namely, polarity. In this case also we would be working with a vector space $\mathcal{V}_{12}$ as defined in the proof of the above mentioned corollary. By Lemma 7.2.1, $\mathcal{V}_{12}^p = \mathcal{V}_{12}^\perp$ and therefore $\mathcal{V}_{12}^p$ would have the representative matrix
$$\begin{array}{c} P_1 \ \ \ P_2 \\ [\,I \ \ -I\,]. \end{array}$$
Since $(\mathcal{K}_{S_1} \oplus \mathcal{K}_{S_2})^p = \mathcal{K}_{S_1}^p \oplus \mathcal{K}_{S_2}^p$, we must have
$$(\mathcal{K}_{S_1} \oplus \mathcal{K}_{S_2})^p \leftrightarrow \mathcal{V}_{12}^p = (\mathcal{K}_{S_1}^p \oplus \mathcal{K}_{S_2}^p) \leftrightarrow \mathcal{V}_{12}^p.$$
The term in the RHS can now be seen to be equal to $\mathcal{K}_{S_1}^p \rightleftharpoons \mathcal{K}_{S_2}^p$. We thus have,

Corollary 7.4.1 Let $\mathcal{K}_{S_1}, \mathcal{K}_{S_2}$ be polyhedral cones of vectors on $S_1, S_2$ respectively. Let $\langle \cdot, \cdot \rangle$ be the usual dot product operation. Then $(\mathcal{K}_{S_1} \leftrightarrow \mathcal{K}_{S_2})^p = \mathcal{K}_{S_1}^p \rightleftharpoons \mathcal{K}_{S_2}^p$, where the superscript '$p$' denotes polarity.
Exercise 7.16 Redo the Exercises on implicit duality - vector space case - for the case of polyhedral collections.
Exercise 7.17 Using the Implicit Duality Theorem prove i. ( K s p . S)* = IC:p x S ii. ( K s p x S)"= K > p . S where K s p is a polyhedral cone of vectors on S U P. Exercise 7.18 Proof of Implicit Duality from duality of contraction and restriction: Let S , P be disjoint and let P = PI u Pz. i. Show that ( K S P +f K P , )
* KP,
= =
* (KP, @ K P 2 ) ( K S P * K,) * KP,
KSP
.. Let X p be the collection of all vectors on P and T : X p + X p , be a nonsin-
11.
gular transformation. Let T(ICp) = K b . Let Tsp(fs K;.[>= T s p ( K ~ p )Show . that
... 111. that
Kkp
t)K&
=Ksp
@
fp)
G
fs
@
T(fp) and let
H Kp.
Assuming i t is true for all K s p and all finite sets S , P I ,P, s.t. P = PI U €'l
( K S P * ( X P , @ OP,))* = K:p
show that (KSP
+)
KP)* = ICZp
* ( O P , Cq X P , ) , * (-GI,
for the following two cases: K p
is a vector space on P , and K s p , a collection of vectors on S N P
K p , K s p are collections of vectors on P, S U P respectively.
7.4.1 Applications of the Polar Form
Both the polar version and the integrality version of the implicit duality theorem are presented in this chapter essentially for completeness [Narayanan85a]. However, we are able to cite at least one reference in the literature on polyhedral combinatorics where a result, that could be regarded as an instance of the Implicit Polarity Theorem, is derived and applied [Balas+Pulleyblank87]. We state this Projection Theorem of Balas and Pulleyblank below but prove it using the Implicit Duality Theorem.

Theorem 7.4.2 Let
$$Z \equiv \{(u, x) :\ Au + Bx = b_1,\ \ Du + Ex \leq b_2,\ \ u \geq 0\}.$$
Let $W \equiv \{(y, z) :\ y^T A + z^T D \geq 0,\ z \geq 0\}$. Then,
$$X \equiv \{x : \exists u \ \text{s.t.}\ (u, x) \in Z\} = \{x :\ (y^T B + z^T E)x \leq y^T b_1 + z^T b_2 \ \ \forall (y, z) \in W\}.$$

Proof: For notational convenience we take the 'primal' cone to be made up of column vectors and the polar cone to be made up of row vectors. We regard $b_1, b_2$ also as variables initially.

Let $C \equiv C(u, x, b_1, b_2)$ denote the cone
$$Au + Bx - I_{b_1} b_1 = 0$$
$$Du + Ex - I_{b_2} b_2 \leq 0$$
and let $C_u \equiv C_u(u)$ denote the cone $(-I_u) u \leq 0$.

Let $\hat{C} \equiv \hat{C}(x, b_1, b_2) = C \leftrightarrow C_u$. The restriction of $\hat{C}$ to the components corresponding to $x$, for fixed $b_1, b_2$, is the set $X$. Then $(\hat{C})^p = C^p \leftrightarrow -C_u^p$, by the Implicit Polarity Theorem. Now,
$$(C)^p = \{(y^T A + z^T D) \oplus (y^T B + z^T E) \oplus -y^T \oplus -z^T :\ z \geq 0\}$$
and
$$-C_u^p = \{p^T I_u :\ p \geq 0\}$$
by Farkas Lemma (Theorem 2.3.2, also part (ii) of Lemma 7.4.1). Hence,
$$(\hat{C})^p = \{(y^T B + z^T E) \oplus -y^T \oplus -z^T :\ z \geq 0 \ \text{and}\ y^T A + z^T D = p^T \geq 0\} = \{(y^T B + z^T E) \oplus -y^T \oplus -z^T :\ (y, z) \in W\}.$$
Now,
$$\hat{C} = (\hat{C})^{pp} = \{(x \oplus b_1 \oplus b_2) :\ \langle (x \oplus b_1 \oplus b_2), (\alpha \oplus \beta \oplus \gamma) \rangle \leq 0 \ \ \forall (\alpha \oplus \beta \oplus \gamma) \in \hat{C}^p\}.$$
Thus,
$$\hat{C} = \{(x \oplus b_1 \oplus b_2) :\ (y^T B + z^T E)x - y^T b_1 - z^T b_2 \leq 0 \ \ \forall (y, z) \in W\}.$$
To get $X$ we restrict the above set to the components corresponding to $x$, fixing $b_1, b_2$. Hence,
$$X = \{x :\ (y^T B + z^T E)x \leq y^T b_1 + z^T b_2 \ \ \forall (y, z) \in W\}. \qquad \square$$
The authors ([Balas+Pulleyblank87]) use this result to study the perfectly matchable subgraph polytope of an arbitrary graph. The variable of interest x is found in the naturally obtained constraints along with other variables. Direct elimination of these other variables would destroy the structure of the problem. The authors therefore, use the above ‘implicit projection’.
7.5 *Integrality Systems
Another good example of q-orthogonality is integral duality. We say two vectors on a set $S$ over the rational field are integrally dual iff their dot product is an integer. As in the case of polyhedral cones and polarity, here too we have a good family of (regularly generated) collections of vectors which is closed under sum, intersection and integral duality. We can therefore prove an implicit integral duality theorem for such systems, which we do in the present section. Throughout this section we will be dealing only with rational vectors. In this case also '$\langle \cdot, \cdot \rangle$' denotes the usual dot product.

Let $\mathcal{K}_S$ be a collection of rational vectors. Then, the integral dual of $\mathcal{K}_S$ is denoted by $\mathcal{K}_S^d$ and is defined by
$$\mathcal{K}_S^d \equiv \{y :\ \langle y, x \rangle \ \text{is an integer}\ \forall x \in \mathcal{K}_S\}.$$
The analogue of vector space and polyhedral cone in the present case is a 'regularly generated' collection of vectors. We define this notion below.

Let $(A, B)$ be an ordered pair of matrices whose rows are vectors over the rational field defined on a set $S$. The collection of all vectors of the form $\lambda_1^T A + \lambda_2^T B$, where $\lambda_1$ is an integral vector and $\lambda_2$, any rational vector, is said to be regularly generated by the rows of $(A, B)$, or regularly generated for short. For each row $A_j$ of $A$ let $A_{jn}$ denote $A_j - A_{jB}$, where $A_{jB}$ is the projection of $A_j$ on the space spanned by the rows of $B$. (Note that $A_{jn}$ is rational.) Let $A_n$ denote the matrix obtained from $A$ by replacing each row $A_j$ by $A_{jn}$. Let $\bar{B}$ denote a matrix made up of a maximal linearly independent subset of the rows of $B$. It is then clear that the collection of vectors regularly generated by the rows of $(A, B)$ is identical to that regularly generated by $(A_n, \bar{B})$. We show next that $A_n$ can be replaced by an appropriate matrix $\bar{A}$ with the same row space as $A_n$ but with linearly independent rows.
Definition 7.5.1 An integral matrix of full column rank is said to be in the Hermite Normal Form if it has the form
$$\begin{bmatrix} B \\ 0 \end{bmatrix}$$
where $B$ satisfies the following:
i. it is an upper triangular, integral, nonnegative matrix;
ii. its diagonal entries are positive and have the unique highest magnitude in their columns.
We now have the following well known result.
Theorem 7.5.1 By using the elementary integral row operations:
i. interchanging two rows;
ii. adding an integer multiple of one row to another;
iii. multiplying a row by $-1$;
any integral matrix $A$ can be transformed, after column permutation, to an integral matrix of the form
$$\begin{bmatrix} A_{11} & A_{12} \\ 0 & 0 \end{bmatrix}$$
where $A_{11}$ is a nonsingular matrix in the Hermite Normal Form.

Proof: After column permutations, if necessary, we can partition $A$ as $(A_{11} \ \vdots \ A_{12})$, where $A_{11}$ is composed of a maximal linearly independent set of columns of $A$. With elementary integral row operations on the entire matrix, modify the first column of $A_{11}$ to a nonnegative vector with the least sum possible. Only one of these entries can be nonzero, as otherwise the least nonzero entry can be subtracted from the others to reduce the sum. Bring this entry to the top left hand corner. The matrix $A_{11}$ has now been converted into a matrix of the form
$$\begin{pmatrix} a_{11} & \ast \\ 0 & A_{11}'' \end{pmatrix}.$$
Repeat the procedure with the first column of $A_{11}''$, and so on. At the end of this procedure we would have the matrix in the form
$$\begin{pmatrix} \tilde{A}_{11} & \tilde{A}_{12} \\ 0 & 0 \end{pmatrix}$$
where $\tilde{A}_{11}$ would be in the upper triangular form. Now use the diagonal entries of $\tilde{A}_{11}$ to convert all the entries above them, by elementary integral row operations, to nonnegative numbers of value less than the diagonal entries. The resulting matrix $A_{11}$, by definition, is in the Hermite Normal Form. □
Suppose a collection of vectors $\mathcal{K}$ is regularly generated by the rows of $(A_n, B)$ where the rows of $A_n$ are orthogonal to the rows of $B$. Now $A_n = \frac{1}{k}(A_n')$, where $A_n'$ is an integral matrix and $k$ is an integer. By elementary integral row operations as in Theorem 7.5.1, we can reduce $A_n'$ to a row equivalent matrix $A'$ which has linearly independent rows. Since it is clear that the inverse of each elementary integral row operation is another such operation, a vector is an integral linear combination of the rows of $A_n'$ iff it is an integral linear combination of the rows of $A'$. Let $\bar{A} \equiv \frac{1}{k}(A')$. Then the rows of $\bar{A}$ are linearly independent and further, the rows of $\bar{A}$ and $A_n$ can be generated from each other by integral linear combinations. Thus, the collection of vectors $\mathcal{K}$ is regularly generated by the rows of $(\bar{A}, B)$, where the rows of $\bar{A}$ and the rows of $B$ are independent and are mutually orthogonal. We say that such an ordered pair $(\bar{A}, B)$ is in the standard form.
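The reduction in the proof of Theorem 7.5.1 can be carried out with integer arithmetic exactly as described. The sketch below is our own implementation (not from the text); it assumes the pivot columns are met in left-to-right order, so the column permutation of the theorem is not performed, and it uses only the three elementary integral row operations - hence the rows of the output generate the same integral lattice as the rows of the input.

```python
def hermite_rows(A):
    """Reduce an integer matrix (list of rows) by elementary integral row
    operations: Euclidean reduction in each pivot column, pivot made positive,
    entries above each pivot reduced to the range [0, pivot)."""
    A = [list(row) for row in A]
    m, n = len(A), len(A[0])
    r = 0                                 # index of the next pivot row
    for c in range(n):
        # Euclidean reduction among rows r..m-1 in column c
        while True:
            nz = [i for i in range(r, m) if A[i][c] != 0]
            if len(nz) <= 1:
                break
            nz.sort(key=lambda i: abs(A[i][c]))
            p = nz[0]                     # row with smallest nonzero entry
            for i in nz[1:]:
                q = A[i][c] // A[p][c]
                A[i] = [A[i][j] - q * A[p][j] for j in range(n)]
        nz = [i for i in range(r, m) if A[i][c] != 0]
        if not nz:
            continue                      # no pivot in this column
        A[r], A[nz[0]] = A[nz[0]], A[r]   # bring the pivot up
        if A[r][c] < 0:
            A[r] = [-x for x in A[r]]     # make the pivot positive
        for i in range(r):                # reduce entries above the pivot
            q = A[i][c] // A[r][c]
            A[i] = [A[i][j] - q * A[r][j] for j in range(n)]
        r += 1
    return A

assert hermite_rows([[4, 2], [2, 0]]) == [[2, 0], [0, 2]]
assert hermite_rows([[2, 3, 6], [5, 7, 3]]) == [[1, 0, -33], [0, 1, 24]]
```

In the first example the rows $(4,2)$ and $(2,0)$ generate the same lattice as $(2,0)$ and $(0,2)$, which is what the reduction returns.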
The integral dual operation is sufficiently well behaved, as we show below, for us to apply the implicit duality technique.

Theorem 7.5.2 Let $\mathcal{K}$ be a collection of vectors over the rational field regularly generated by the rows of $(A, B)$ in standard form. Then
i. $\mathcal{K}^d$ is regularly generated by the rows of $(C, D)$ in standard form, where $C$ is row equivalent to $A$, $AC^T = I$, and $D$ is a representative matrix for $\mathcal{K}^\perp$;
ii. $\mathcal{K}^{dd} = \mathcal{K}$;
iii. if $\mathcal{K}$ is a regularly generated collection of vectors on $S \uplus P$ then $\mathcal{K} \cdot S$ is regularly generated;
iv. if $\mathcal{K}'$ is another regularly generated collection then $\mathcal{K} + \mathcal{K}'$, $\mathcal{K} \cap \mathcal{K}'$ are regularly generated.
i. We note that rows of A, B, D are mutually orthogonal. Also
\ :I / A \
is a rank matrix. This is also true of C, B, D. Let K' be regularly generated by rows of ( C , D ) ,where C , D are defined as in the statement of the theorem. Let x E K and y E K'. Then, \
< X,Y >= (X1TX2T) for suitable vectors XI, X2, 01,
~2
=
I
( ) (CT ! DT) ( : )
where XI,
XITgl,
g1
are integral. Thus,
which i s a n integer
We see therefore, that K' K d . On the other hand, suppose y is integrally dual to all vectors in K. We have, XzTBy is an integer for arbitrary Ax. This can happen only if By = 0, i.e., yT belongs to the space spanned by rows of Let y T =
~ T c T (T
)
. Suppose
01
(a
is not integral. We know that ACT = I.
Hence, for some integral value of A1 we would have XITACTg1nonintegral, which contradicts the fact that y E Kd. We conclude that ~1 must be integral and that K' 2 Kd.This proves the first part.
..
11.
From the above proof it is clear that if K" is regularly generated by rows of is regularly generated by rows of (A,B), i.e., ( K d ) d= K .
(C,D) then (K')'
iii. This is immediate by the definition of regularly generated collections.
7.5. INTEGRALITY SYSTEMS
24 1
iv. If K ,K‘ are regularly generated by rows of (A,B ) , (A’,B’) respectively then the rows of
(( ) , ( )) regularly generate K + K’. :t
Now K d , ( K ’ ) d are regularly generated and so is their sum. By Lemma 7.2.2 ( K d (K’)d)d= Kdd n KIdd = K n K’.
+
Hence, by the first part of the present theorem, K
n K’ is regularly generated.
(a )
0
Remark: An easy way of constructing $C$ from $(A, B)$ in the standard form is to first build a representative matrix $D$ for $\mathcal{K}^\perp$. Let the matrix
$$\begin{bmatrix} A \\ B \\ D \end{bmatrix}$$
have the inverse $(P^T \ \vdots \ Q^T \ \vdots \ R^T)$, with $AP^T = I$. Then, $P$ can be taken to be $C$.

Theorem 7.5.3 (Implicit Integral Duality Theorem): Let $\mathcal{K}_{SP}, \mathcal{K}_P$ be regularly generated collections of vectors on $S \uplus P$, $P$ respectively. Then,
$$(\mathcal{K}_{SP} \leftrightarrow \mathcal{K}_P)^d = \mathcal{K}_{SP}^d \leftrightarrow \mathcal{K}_P^d.$$
Proof of Theorem 7.5.3: Let $\mathcal{K}_{SP}, \mathcal{K}_P$ be regularly generated by the rows of $((A_S \ A_P), (B_S \ B_P))$ and $(\bar{A}_P, \bar{B}_P)$ respectively. A vector $x_S$ belongs to $\mathcal{K}_{SP} \leftrightarrow \mathcal{K}_P$ iff there exist vectors $\lambda_1, \mu_1$ and $\lambda_2, \mu_2$, where $\lambda_1, \lambda_2$ are integral, s.t.
$$x_S \oplus 0_P = \lambda_1^T (A_S \ \ A_P) + \mu_1^T (B_S \ \ B_P) + \lambda_2^T (0 \ \ \bar{A}_P) + \mu_2^T (0 \ \ \bar{B}_P).$$
Thus, $x_S \in \mathcal{K}_{SP} \leftrightarrow \mathcal{K}_P$ iff $x_S \oplus 0_P$ belongs to the regularly generated collection $\mathcal{K}_{SP} + \mathcal{K}_P$. By part (ii) of Theorem 7.5.2, $x_S \oplus 0_P$ belongs to $\mathcal{K}_{SP} + \mathcal{K}_P$ iff it belongs to $((\mathcal{K}_{SP} + \mathcal{K}_P)^d)^d$. The collection $(\mathcal{K}_{SP} + \mathcal{K}_P)^d$ is the collection of all vectors $y_S \oplus y_P$ s.t.
$$y_S \oplus y_P \in \mathcal{K}_{SP}^d \quad \text{and} \quad y_P \in \mathcal{K}_P^d. \qquad (*)$$
Hence, $x_S \in \mathcal{K}_{SP} \leftrightarrow \mathcal{K}_P$ iff
$$x_S^T y_S \ \text{is integral for every} \ y_S \oplus y_P \ \text{satisfying} \ (*). \qquad (**)$$
Now, $y_S \in \mathcal{K}_{SP}^d \leftrightarrow \mathcal{K}_P^d$ iff it is the restriction to $S$ of a vector $y_S \oplus y_P$ satisfying the condition $(*)$. Equivalently, $\mathcal{K}_{SP}^d \leftrightarrow \mathcal{K}_P^d$ is the restriction of $(\mathcal{K}_{SP} + \mathcal{K}_P)^d$ to $S$. Since $\mathcal{K}_{SP}, \mathcal{K}_P$ are regularly generated, we must have $(\mathcal{K}_{SP} + \mathcal{K}_P)^d$ also regularly generated, by Theorem 7.5.2. Hence, this must be true also of $(\mathcal{K}_{SP} + \mathcal{K}_P)^d \cdot S$, i.e., of $\mathcal{K}_{SP}^d \leftrightarrow \mathcal{K}_P^d$. By the same theorem we must have
$$(\mathcal{K}_{SP}^d \leftrightarrow \mathcal{K}_P^d)^{dd} = \mathcal{K}_{SP}^d \leftrightarrow \mathcal{K}_P^d.$$
By $(**)$, $\mathcal{K}_{SP} \leftrightarrow \mathcal{K}_P = (\mathcal{K}_{SP}^d \leftrightarrow \mathcal{K}_P^d)^d$. Hence,
$$(\mathcal{K}_{SP} \leftrightarrow \mathcal{K}_P)^d = (\mathcal{K}_{SP}^d \leftrightarrow \mathcal{K}_P^d)^{dd} = \mathcal{K}_{SP}^d \leftrightarrow \mathcal{K}_P^d. \qquad \square$$
The technique of the proof of Corollary 7.1.1 will work in the case of the present instance of q-orthogonality, namely, integral duality. In this case also we would be working with a vector space $\mathcal{V}_{12}$ as defined in the proof of the above mentioned corollary. By Lemma 7.2.1, $\mathcal{V}_{12}^d = \mathcal{V}_{12}^\perp$ and therefore $\mathcal{V}_{12}^d$ would have the representative matrix
$$\begin{array}{c} P_1 \ \ \ P_2 \\ [\,I \ \ -I\,]. \end{array}$$
Since $(\mathcal{K}_{S_1} \oplus \mathcal{K}_{S_2})^d = \mathcal{K}_{S_1}^d \oplus \mathcal{K}_{S_2}^d$, we must have
$$(\mathcal{K}_{S_1} \oplus \mathcal{K}_{S_2})^d \leftrightarrow \mathcal{V}_{12}^d = (\mathcal{K}_{S_1}^d \oplus \mathcal{K}_{S_2}^d) \leftrightarrow \mathcal{V}_{12}^d.$$
The term in the RHS can now be seen to be equal to $\mathcal{K}_{S_1}^d \rightleftharpoons \mathcal{K}_{S_2}^d$. We thus have,

Corollary 7.5.1 Let $\mathcal{K}_{S_1}, \mathcal{K}_{S_2}$ be regularly generated collections of vectors on $S_1, S_2$ respectively. Let $\langle \cdot, \cdot \rangle$ be the usual dot product operation. Then
$$(\mathcal{K}_{S_1} \leftrightarrow \mathcal{K}_{S_2})^d = \mathcal{K}_{S_1}^d \rightleftharpoons \mathcal{K}_{S_2}^d.$$
Remark: As in the case of polarity, here too it is easy to see that the Implicit Duality Theorem is a special case of the Implicit Integral Duality Theorem.

Exercise 7.19 Examine if implicit duality would work for the following instances of q-orthogonality.
i. $f, g$ are q-orthogonal iff $\langle f, g \rangle$ is a nonnegative integer.
ii. $f, g$ are q-orthogonal iff $\langle f, g \rangle$ is an integral multiple of a given integer.
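In the special case where $B$ is empty and $A$ is square and nonsingular, the matrix $C$ of Theorem 7.5.2 is just $(A^{-1})^T$, since then $AC^T = AA^{-1} = I$ and $D$ is empty. The sketch below (our own helpers, not from the text; exact rational arithmetic via `Fraction`) computes this dual basis and checks $\mathcal{K}^{dd} = \mathcal{K}$ on a small lattice.

```python
from fractions import Fraction

def inverse(M):
    """Exact inverse of a nonsingular rational matrix (Gauss-Jordan)."""
    n = len(M)
    A = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)   # pivot search
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [x / piv for x in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [x - f * y for x, y in zip(A[r], A[c])]
    return [row[n:] for row in A]

def dual_lattice_basis(A):
    """Rows generating K^d for K = {integral combinations of the rows of A}."""
    Ainv = inverse(A)
    n = len(A)
    return [[Ainv[i][j] for i in range(n)] for j in range(n)]  # (A^{-1})^T

A = [[2, 0], [1, 3]]
C = dual_lattice_basis(A)
# A C^T = I: the dot product of row i of A with row j of C is the integer
# delta_{ij}, so every vector of the dual basis is integrally dual to K.
for i, a in enumerate(A):
    for j, c in enumerate(C):
        assert sum(x * y for x, y in zip(a, c)) == (1 if i == j else 0)
# dual of the dual returns a basis of the original lattice: K^{dd} = K
assert dual_lattice_basis(C) == [[2, 0], [1, 3]]
```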
7.6 Problems
Problem 7.1
i. Algorithm for building representative matrix of generalized minor: Given representative matrices for $\mathcal{V}_{SP}, \mathcal{V}_P$, show how to build a representative matrix for $\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P$.
ii. Rank formula for generalized minor; another proof of the Implicit Duality Theorem: Complete the details of the following alternative proof of the Implicit Duality Theorem.
(a) $r(\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P) = r(\mathcal{V}_{SP} \times S) + r((\mathcal{V}_{SP} \cdot P) \cap \mathcal{V}_P) - r((\mathcal{V}_{SP} \times P) \cap \mathcal{V}_P)$.
(b) $(\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P)$ is orthogonal to $(\mathcal{V}_{SP}^\perp \leftrightarrow \mathcal{V}_P^\perp)$.
(c) $r(\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P) + r(\mathcal{V}_{SP}^\perp \leftrightarrow \mathcal{V}_P^\perp) = |S|$. Hence, $(\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P)^\perp = \mathcal{V}_{SP}^\perp \leftrightarrow \mathcal{V}_P^\perp$.
Problem 7.2 (*) Let $\mathcal{V}_{SP}, \mathcal{V}_{SP_1}$ be called equivalent in $S$ iff $\mathcal{V}_{SP} \cdot S = \mathcal{V}_{SP_1} \cdot S$. We say $(\mathcal{V}_{SP}, \mathcal{V}_{SP}^\perp)$, $(\mathcal{V}_{SP_1}, \mathcal{V}_{SP_1}^\perp)$ are equivalent in $S$ iff $\mathcal{V}_{SP}, \mathcal{V}_{SP_1}$ and $\mathcal{V}_{SP}^\perp, \mathcal{V}_{SP_1}^\perp$ are equivalent in $S$. We say $(\mathcal{V}_{SP}, \mathcal{V}_{SP}^\perp)$ is minimal in $P$ iff whenever $(\mathcal{V}_{SP_1}, \mathcal{V}_{SP_1}^\perp)$ is equivalent to $(\mathcal{V}_{SP}, \mathcal{V}_{SP}^\perp)$, $|P_1| \geq |P|$. Show that $(\mathcal{V}_{SP}, \mathcal{V}_{SP}^\perp)$ is minimal in $P$ iff either of the following equivalent conditions holds:
i. $|P| = r(\mathcal{V}_{SP} \cdot S) - r(\mathcal{V}_{SP} \times S) = r(\mathcal{V}_{SP}^\perp \cdot S) - r(\mathcal{V}_{SP}^\perp \times S)$;
ii. $r(\mathcal{V}_{SP} \times P) = r(\mathcal{V}_{SP}^\perp \times P) = 0$.

Problem 7.3 Minor of generalized minors in terms of minor of $\mathcal{V}_{SP}$: Let $S_2 \subseteq S_1 \subseteq S$ and let $\mathcal{V}_{SP}$ be a vector space on $S \cup P$. Show that
$$(\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P) \cdot S_1 \times S_2 = (\mathcal{V}_{SP} \cdot (S_1 \cup P) \times (S_2 \cup P)) \leftrightarrow \mathcal{V}_P.$$
Problem 7.4 (*) Let $(\mathcal{V}_{SP}, \mathcal{V}_P)$ be compatible, i.e., $\mathcal{V}_{SP} \cdot P \supseteq \mathcal{V}_P$ and $\mathcal{V}_{SP} \times P \subseteq \mathcal{V}_P$.
i. Let $r(\mathcal{V}_{SP} \times P) \neq 0$, let $f_P \in \mathcal{V}_{SP} \times P$, and let $e$ belong to the support of $f_P$. Let $P_1 \equiv P - e$, $\mathcal{V}_{P_1} \equiv \mathcal{V}_P \times P_1$, and let $\mathcal{V}_{SP_1} \equiv \mathcal{V}_{SP} \times (S \cup P_1)$. Show that
(a) $\mathcal{V}_{SP_1} \leftrightarrow \mathcal{V}_{P_1} = \mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P$;
(b) $\mathcal{V}_{SP_1} \cdot P_1 \supseteq \mathcal{V}_{P_1}$ and $\mathcal{V}_{SP_1} \times P_1 \subseteq \mathcal{V}_{P_1}$;
(c) $\mathcal{V}_{SP_1} \cdot S = \mathcal{V}_{SP} \cdot S$ and $\mathcal{V}_{SP_1} \times S = \mathcal{V}_{SP} \times S$.
ii. Let $r(\mathcal{V}_{SP}^\perp \times P) \neq 0$, let $f_P \in \mathcal{V}_{SP}^\perp \times P$ and let $e$ belong to the support of $f_P$. Let $P_2 \equiv P - e$, $\mathcal{V}_{P_2} \equiv \mathcal{V}_P \cdot P_2$, and let $\mathcal{V}_{SP_2} \equiv \mathcal{V}_{SP} \cdot (S \cup P_2)$. Show that
(a) $\mathcal{V}_{SP_2} \leftrightarrow \mathcal{V}_{P_2} = \mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P$;
(b) $\mathcal{V}_{SP_2} \cdot P_2 \supseteq \mathcal{V}_{P_2}$ and $\mathcal{V}_{SP_2} \times P_2 \subseteq \mathcal{V}_{P_2}$;
(c) $\mathcal{V}_{SP_2} \cdot S = \mathcal{V}_{SP} \cdot S$ and $\mathcal{V}_{SP_2} \times S = \mathcal{V}_{SP} \times S$.
iii. Repeat the above problem when 'compatibility' is replaced by 'strong compatibility' (i.e., $\mathcal{V}_{SP} \cdot P = \mathcal{V}_P$ and $\mathcal{V}_{SP} \times P = \mathcal{V}_P$).
Problem 7.5 Compatibility permits recovery of V_P from V_SP, V_SP ↔ V_P: Prove

Theorem 7.6.1 Let V_S ≡ V_SP ↔ V_P. Then V_P = V_SP ↔ V_S iff V_SP · P ⊇ V_P and V_SP × P ⊆ V_P (i.e., iff (V_SP, V_P) are compatible).
Problem 7.6 (*) Let V₁, V₂ be vector spaces on S₁, S₂ respectively. Find a vector space V_12P on (S₁ ∪ S₂) ⊎ P that is an extension of both V₁ and V₂ such that |P| is the minimum possible. Show that the minimum value of |P| is
    r(V₁ + V₂) − r((V₁ × (S₁ ∩ S₂)) ∩ (V₂ × (S₁ ∩ S₂))),
where the sum V₁ + V₂ is taken after padding each space with zeros to S₁ ∪ S₂.
Problem 7.7 Minimal common extension and algorithms for construction of the spaces:
i. Given vector spaces V_S¹, ..., V_S^k, show that if V_SP is to be a common extension of V_S¹, ..., V_S^k, then
    |P| ≥ r(Σᵢ V_S^i) − r(∩ᵢ V_S^i).
ii. Give a procedure for building V_SP, V_P^i s.t. |P| is equal to the RHS of the above inequality.

Problem 7.8 Let V_SP be a common extension of V_S¹, V_S². Then
i. V_SP is a common extension of V_SP ↔ V_S¹ and V_SP ↔ V_S².
ii. Distance between spaces does not change when we take generalized minors with respect to an extension: If V_SP is an extension of V_S¹, V_S², ..., V_S^k, then
    d(V_S¹, V_S², ..., V_S^k) = d(V_SP ↔ V_S¹, V_SP ↔ V_S², ..., V_SP ↔ V_S^k).
Here d(..., V^i, ...) denotes r(Σ V^i) − r(∩ V^i).
iii. Suppose V_SP is a common extension of V_S¹, V_S², ..., V_S^k. Show that, to find a minimal extension, the following procedure is valid. Let
    V_P¹ ≡ V_SP ↔ V_S¹, V_P² ≡ V_SP ↔ V_S², ..., V_P^k ≡ V_SP ↔ V_S^k.
Let V_PQ be a minimal extension of V_P¹, V_P², ..., V_P^k. Then V_SP ↔ V_PQ is a minimal extension of V_S¹, V_S², ..., V_S^k.
Problem 7.9 How to build R₁, R_P₂ efficiently in special situations: The procedure described in Subsection 7.3.3 is useful only if the matrices R₁, R_P₂ can be built efficiently. Brute force Gaussian elimination should be avoided. Linear time algorithms, if available, would obviously be the best. Let {E₁, ..., E_k} be a partition of E(G). For the following cases build R₁, R_P₂ efficiently. Let G′ be
i. ⊕ᵢ G · Eᵢ,
ii. ⊕ᵢ G × Eᵢ,
iii. obtained from G by fusing nodes,
iv. obtained from G by node splitting,
v. obtained from G by first fusing certain nodes and then splitting some nodes of the resulting graph.
Problem 7.10 Nodal analysis of N by bordering the nodal matrix of N′: In Subsection 7.3.3 let N = G and M = −I. For this case derive a nodal analysis-like procedure in which the nodal matrix of graph G′ appears as the core of the overall coefficient matrix. Specialise this derivation to the case where G′ is made up only of self loops and show that the usual nodal analysis equations result.

Problem 7.11
i. Using the circuit matrix of G′ instead of the reduced incidence matrix: In Subsection 7.3.3 the reduced incidence matrix of G′ is used for writing KCE, and KVL has been applied in terms of node potentials of G′. Instead, derive a dual set of equations using a representative matrix of the current space of G′ for writing KVE, applying KCE in terms of loop currents of G′.
ii. Loop analysis of N by bordering the loop matrix of N′: Let M = R and N = −I. For this case derive a loop analysis-like procedure in which the loop analysis coefficient matrix of graph G′ appears as the core of the overall coefficient matrix. Specialise this derivation to the case where G′ is made up only of coloops and show that the usual loop analysis equations result.
Problem 7.12 (Tellegen) Reciprocity: Let N be an electrical network with graph G. Let E(G) be partitioned into E_P, E_R, where the devices in E_P are norators and the devices in E_R have the following characteristic: Let x_R denote the vector (v_R, i_R) and y_R denote the vector (−i_R, v_R), where (v_R, i_R) belongs to the device characteristic of E_R. Then the collection V_xR of all the x_R's is a vector space and is complementary orthogonal to the collection V_yR of all the y_R's; e.g., v_R = (R) i_R, where R is a symmetric matrix. Let x_P denote the vector (v_P, i_P) and y_P denote the vector (−i_P, v_P), where ((v_P, v_R), (i_P, i_R)) is a solution of the network. Show that the collection V_xP of all the x_P's is complementary orthogonal to the collection V_yP of all the y_P's.

Problem 7.13 Adjoint networks: Let N, N′ be electrical networks with graph G. Let E(G) be partitioned into E_D ⊎ E_yv ⊎ E_yi ⊎ E_uv ⊎ E_ui, where the characteristic
i. of the devices in E_D is
    M_D v_D + N_D i_D = 0 in N and M_D′ v_D′ + N_D′ i_D′ = 0 in N′;
ii. of the devices in E_yv is
    i_yv = 0 in N,
    unconstrained (norators) in N′ (in particular no constraints on v′_yv);
iii. of the devices in E_yi is
    v_yi = 0 in N,
    unconstrained (norators) in N′ (in particular no constraints on i′_yi);
iv. of the devices in E_uv is
    unconstrained in N (in particular i_uv has no constraints),
    v′_uv = 0 in N′;
v. of the devices in E_ui is
    unconstrained in N (in particular v_ui has no constraints),
    i′_ui = 0 in N′.
The symbols u, y indicate input, output in N and output, input in N′. If the device characteristics are as above, show that the input-output behaviour of N′ is adjoint to that of N.
Remark:
i. (M_D  N_D), (M_D′  N_D′) are representative matrices of complementary orthogonal spaces.
ii. The usual electrical network adjoint N″ is obtained by replacing i′ by −i′ and v′ by −v′ in the device characteristic.
iii. u, y indicate 'input', 'output' respectively in N and 'output', 'input' respectively in N′.
Problem 7.14 (*) Proof of the Implicit Duality Theorem without using the finiteness assumption: Let ⟨·, ·⟩ be a q-bilinear operation as in Section 7.2. Further, whenever S, P are disjoint, let ⟨f_S ⊕ f_P, g_S ⊕ g_P⟩ = ⟨f_S, g_S⟩ + ⟨f_P, g_P⟩. Let '*' denote the q-orthogonality operation, with the set A being closed with respect to both addition and subtraction. We say a collection K of vectors on a set T is closed iff (K*)* = K. Prove the following:
i. Lemma 7.6.1 Let V_SP, V_P be vector spaces. Let U_P ≡ V_P ∩ (V_SP · P). Then
(a) V_SP ↔ V_P = V_SP ↔ U_P;
(b) if V_P, V_SP, V_SP · P, V_P*, V_SP* × P are closed, then V_SP* ↔ U_P* = V_SP* ↔ V_P*.

ii. Theorem 7.6.2 Let U_P ⊆ V_SP · P. If V_SP, V_SP* · S are closed, then
    (V_SP ↔ U_P)* = V_SP* ↔ U_P*.

iii. Hence, if V_P, V_SP, V_SP · P, V_SP* · S, V_P* + V_SP* × P are closed, then
    (V_SP ↔ V_P)* = V_SP* ↔ V_P*.
7.7 Notes

It seems very difficult to trace the origins of the Implicit Duality Theorem. The first publication that the author could trace which refers to the ideal transformer version of the result is [Belevitch68]. However, in that reference the proof of the result only deals with orthogonality and omits the crucial aspect of the ranks of the orthogonal spaces adding up to the full rank. Kron's use of the 'power invariance postulate' reminds us of this theorem [Kron39]. Unfortunately he never makes it clear under what conditions the 'postulate' can be used. Some of the proofs of the theorem presented in this chapter may be found in [Narayanan86a], [Narayanan86b], [Narayanan87].
7.8 Solutions of Exercises
E 7.1: We remind the reader that the dot product of f₁ on S₁ with f₂ on S₂ is equal to
    Σ_{e ∈ S₁ ∩ S₂} f₁(e) f₂(e).
If S₁ ∩ S₂ = ∅, then, by definition, the dot product is zero. Wherever possible we state and solve a more general version. However, throughout, the q-bilinear operation is a dot product. The reader is referred to Section 7.2 for the definition of q-orthogonality and of the set A.
i. If S₁ ∩ S₂ = ∅, then (K₁ ⊕ K₂)* = K₁* ⊕ K₂*, where K₁ (K₂) is a collection of vectors on S₁ (S₂) with the zero vector as a member.
Proof: Any vector in the RHS is of the form f₁ ⊕ f₂, where f₁ ∈ K₁* and f₂ ∈ K₂*. Consider g₁ ∈ K₁, g₂ ∈ K₂. We have
    ⟨g₁ ⊕ g₂, f₁ ⊕ f₂⟩ = ⟨g₁, f₁⟩ + ⟨g₂, f₂⟩.
Now ⟨g₁, f₁⟩, ⟨g₂, f₂⟩ belong to A. Therefore, so does their sum (taking g, f to be q-orthogonal iff ⟨g, f⟩ ∈ A), i.e., f₁ ⊕ f₂ is q-orthogonal to every vector in K₁ ⊕ K₂. Thus, RHS ⊆ LHS.
On the other hand, if f ∈ LHS, then f = f/S₁ ⊕ f/S₂ = f₁ ⊕ f₂, say. Now for every g₁ ∈ K₁, g₂ ∈ K₂, we have
    A ∋ ⟨f₁ ⊕ f₂, g₁ ⊕ g₂⟩ = ⟨f₁, g₁⟩ + ⟨f₂, g₂⟩.
Setting g₁ to 0, we see that ⟨f₂, g₂⟩ ∈ A. Similarly, ⟨f₁, g₁⟩ ∈ A. Hence f₁ ∈ K₁* and f₂ ∈ K₂*, and f ∈ RHS. Thus, RHS ⊇ LHS.
ii. (K₁ ⊕ K₂) ↔ K₂ = K₁, with the usual dot product as the q-bilinear operation. Proof straightforward.

E 7.2: K_P, K_P′ are collections of vectors on P and K₁, K₂ are collections on S ⊎ P. We assume the collections are all closed under addition.
i. (K₁ + K₂) ↔ K_P ⊇ (K₁ ↔ K_P) + (K₂ ↔ K_P).
Proof: Let f_S ∈ RHS. Then f_S = f_S¹ + f_S², where f_S¹ ∈ K₁ ↔ K_P and f_S² ∈ K₂ ↔ K_P. There exist vectors f_S¹ ⊕ f_P¹ ∈ K₁, f_P¹ ∈ K_P, and f_S² ⊕ f_P² ∈ K₂, f_P² ∈ K_P. Then (f_S¹ + f_S²) ⊕ (f_P¹ + f_P²) ∈ K₁ + K₂ and f_P¹ + f_P² ∈ K_P. Hence, f_S ∈ LHS.
ii. (K₁ ∩ K₂) ↔ K_P ⊆ (K₁ ↔ K_P) ∩ (K₂ ↔ K_P). This is immediate.
iii. K₁ ↔ (K_P + K_P′) ⊇ (K₁ ↔ K_P) + (K₁ ↔ K_P′). The proof is similar to that of part (i) above. We use the fact that K₁ is closed under addition.
iv. K₁ ↔ (K_P ∩ K_P′) ⊆ (K₁ ↔ K_P) ∩ (K₁ ↔ K_P′). This is immediate.
v. (K₁ ↔ K_P)* = (K₁ − K_P)* · S. (Note that K₁ ↔ K_P = (K₁ − K_P) × S.) This holds if ((K₁ − K_P) × S)* = (K₁ − K_P)* · S.
E 7.3: For (iii) below, we assume K_SP is closed under addition and K_SP − K_SP × P ⊆ K_SP.
i. K_SP ↔ K_P = K_SP ↔ (K_P ∩ (K_SP · P)).
ii. K_SP ↔ K_P = (K_SP ∩ (K_SP · S ⊕ K_P)) ↔ K_P.
iii. Let K̂_P be such that K̂_P − K_SP × P = K_P − K_SP × P. Then K_SP ↔ K̂_P = K_SP ↔ K_P.
Proof:
i. If f_S ∈ K_SP ↔ K_P, then there exists f_P ∈ K_P s.t. f_S ⊕ f_P ∈ K_SP. Clearly f_P ∈ K_P ∩ (K_SP · P). Hence, f_S ∈ K_SP ↔ (K_P ∩ (K_SP · P)). The reverse containment is clear, since K_SP ↔ K_P′ ⊇ K_SP ↔ K_P″ whenever K_P′ ⊇ K_P″.
ii. We have K_SP′ ↔ K_P ⊇ K_SP″ ↔ K_P whenever K_SP′ ⊇ K_SP″. Hence, LHS ⊇ RHS. Let f_S ∈ K_SP ↔ K_P. Then there exists f_P ∈ K_P s.t. f_S ⊕ f_P ∈ K_SP. Clearly, f_S ⊕ f_P ∈ (K_SP · S ⊕ K_P). Hence, f_S ⊕ f_P ∈ K_SP ∩ (K_SP · S ⊕ K_P). Thus, LHS ⊆ RHS.
iii. Let f_S ∈ LHS. Then there exists f̂_P ∈ K̂_P s.t. f_S ⊕ f̂_P ∈ K_SP. By the given condition on K_P, K̂_P there exist vectors f_P¹, f_P² in K_SP × P s.t. f̂_P − f_P¹ + f_P² ∈ K_P. Denote this last vector by f_P³. Clearly, f_S ⊕ f_P³ ∈ K_SP, since K_SP − K_SP × P ⊆ K_SP. Hence, f_S ∈ K_SP ↔ K_P. Thus LHS ⊆ RHS. The reverse containment is proved identically, interchanging the roles of K_P and K̂_P.
E 7.4: Let Q ⊆ T ⊆ S. If Implicit q-orthogonality holds:
i. (K · T)* = K* × T.
ii. (K × T)* = K* · T.
iii. (K × T · Q)* = K* · T × Q.
Proof:
i. Take K_{S−T} ≡ V_{S−T}, where V_{S−T} is the space F_{S−T} of all vectors on S − T. Then, by Lemma 7.2.1, V_{S−T}* = {0_{S−T}}. We have
    (K · T)* = (K ↔ V_{S−T})* = K* ↔ V_{S−T}* = K* × T.
ii. In this case K_{S−T} ≡ {0_{S−T}} and K_{S−T}* is the vector space F_{S−T} on S − T having full rank.
iii. Letting F_P, P ⊆ S, denote the space on P with full rank, we have K × T · Q = K ↔ K_{S−Q}, where K_{S−Q} ≡ 0_{S−T} ⊕ F_{T−Q}. Now K_{S−Q}* = F_{S−T} ⊕ 0_{T−Q}. So, K* ↔ K_{S−Q}* = K* · T × Q.
E 7.5: Let K_SP, K_S be collections of vectors on S ⊎ P, S respectively. Then there exists a collection of vectors K_P on P s.t. 0 ∈ K_P and K_SP ↔ K_P = K_S only if K_SP × S ⊆ K_S ⊆ K_SP · S. The latter condition is sufficient for the existence of K_P with 0 ∈ K_P, provided K_SP is a vector space and K_S is closed under addition.
Proof: Suppose K_SP ↔ K_P = K_S and 0 ∈ K_P. It is clear from the definition of the generalized minor operation that K_SP · S ⊇ K_S. Since 0 ∈ K_P, every vector f_S s.t. f_S ⊕ 0_P ∈ K_SP must belong to K_S. Thus, K_SP × S ⊆ K_S.
On the other hand, suppose K_SP · S ⊇ K_S ⊇ K_SP × S. Let K_P be the collection of all vectors f_P s.t. for some vector f_S ∈ K_S, f_S ⊕ f_P ∈ K_SP. Clearly K_SP ↔ K_P ⊇ K_S. If f_S′ ∈ K_SP × S, then f_S′ ∈ K_S and f_S′ ⊕ 0_P ∈ K_SP. Hence, by the definition of K_P, 0 ∈ K_P. Let f_S be s.t. f_S ⊕ f_P ∈ K_SP for some f_P ∈ K_P. We know that there exists f_S′ ∈ K_S s.t. f_S′ ⊕ f_P ∈ K_SP. Since K_SP is a vector space, we must have (f_S − f_S′) ⊕ 0_P ∈ K_SP. Hence, f_S − f_S′ ∈ K_SP × S ⊆ K_S. Since K_S is closed under addition and f_S′ ∈ K_S, it follows that (f_S − f_S′) + f_S′ = f_S also belongs to K_S. Thus, K_SP ↔ K_P ⊆ K_S. □
E 7.6:
i. A vector f_S ⊕ f_Q belongs to each of the spaces (whose equality is to be proved) iff there exist vectors f_S ⊕ f_T, f_T ⊕ f_P, f_P ⊕ f_Q belonging respectively to V_ST, V_TP, V_PQ. The skewed sum case is similar.
ii. For each of these spaces a vector f_S₁ ⊕ f_P₁ ⊕ f_S₂ ⊕ f_P₂ is a member iff there exist vectors f_S₁ ⊕ f_T₁, f_T₁ ⊕ f_P₁, f_S₂ ⊕ f_T₂, f_T₂ ⊕ f_P₂ in the vector spaces V_S₁T₁, V_T₁P₁, V_S₂T₂, V_T₂P₂ respectively. The skewed sum case is similar.
iii. A vector f_S₁ ⊕ ⋯ ⊕ f_Sₙ belongs to the LHS (as well as the RHS) iff there exist vectors f_S₁ ⊕ f_T₁, f_S₂ ⊕ f_T₂, ⋯, f_Sₙ ⊕ f_Tₙ belonging respectively to V_S₁T₁, ⋯, V_SₙTₙ (so that their direct sum belongs to V_S₁T₁ ⊕ ⋯ ⊕ V_SₙTₙ), and a vector f_T₁ ⊕ ⋯ ⊕ f_Tₙ belonging to V_T₁T₂⋯Tₙ. The skewed sum case is similar.
E 7.7:
i. This is clear since
    (V_S₁T · T)^⊥ = V_S₁T^⊥ × T,
    (V_S₁T × T)^⊥ = V_S₁T^⊥ · T,
and similarly for V_S₂T.
ii. Let V_S₁S₂ ≡ V_S₁T ↔ V_S₂T. A vector f_S₂ ⊕ f_T belongs to V_S₁T ↔ V_S₁S₂ iff there exist vectors f_S₁ ⊕ f_T ∈ V_S₁T, f_S₁ ⊕ f̃_T ∈ V_S₁T, and f_S₂ ⊕ f̃_T ∈ V_S₂T. Now 0_S₁ ⊕ (f_T − f̃_T) ∈ V_S₁T, i.e., f_T − f̃_T ∈ V_S₁T × T ⊆ V_S₂T × T. Thus, f_S₂ ⊕ (f̃_T + (f_T − f̃_T)) ∈ V_S₂T, i.e., f_S₂ ⊕ f_T ∈ V_S₂T.
Next let f_S₂ ⊕ f_T ∈ V_S₂T. Since V_S₁T · T ⊇ V_S₂T · T, there exists a vector f_S₁ ⊕ f_T ∈ V_S₁T. Hence, f_S₁ ⊕ f_S₂ ∈ V_S₁S₂. Therefore, f_S₂ ⊕ f_T ∈ V_S₁T ↔ V_S₁S₂. This proves the result.
iii. We have (V_S₁T^⊥, V_S₂T^⊥) compatible. Therefore,
    V_S₁T^⊥ ↔ (V_S₁T^⊥ ↔ V_S₂T^⊥) = V_S₂T^⊥.
Taking orthogonal complements on both sides we get
    V_S₁T ↔ (V_S₁T ↔ V_S₂T) = V_S₂T.
E 7.8: Let B_f = [I ⋮ B₁₂] be the f-circuit matrix of G, where the identity matrix columns correspond to a coforest f̄ and the columns of B₁₂ correspond to the forest f. Associate with each edge of f an ordered pair of terminals, with no two pairs corresponding to different edges having common terminals. (We describe a procedure which would work for any rational matrix [I ⋮ B₁₂], with the provision that we use 1 : bᵢ transformers instead of 1 : 1 transformers.) Suppose a cotree voltage
    v_e = B_e v_f = b₁ v_e₁ + ⋯ + b_k v_e_k,
where B_e = (b₁, ..., b_k) is the appropriate row of B₁₂ and v_e₁, ..., v_e_k are the voltages associated with the forest branches. We will use one 2-port ideal transformer for each nonzero entry of B₁₂. Corresponding to bᵢ, i = 1, ..., k, we would have a 1 : bᵢ transformer. The primary and secondary of this transformer have reference arrows, so the nodes of the primary (secondary) can be called tail node and head node, corresponding to the tail of the arrow and the head of the arrow associated with the primary (secondary). Attach the tail (head) node of the primary of the 1 : bᵢ transformer to the first (second) node of the ordered pair of nodes associated with edge eᵢ. Put the secondaries of the 1 : bᵢ transformers in series. Assume for simplicity that b₁, ..., b_k are nonzero. The sum b₁ v_e₁ + ⋯ + b_k v_e_k would now be associated with an ordered pair of nodes (n₁, n₂), where n₁ is the tail node of the secondary of the 1 : b₁ transformer and n₂ is the head node of the secondary of the 1 : b_k transformer. The ordered pair (n₁, n₂) of nodes will now be associated with the directed cotree edge e. When this procedure is completed, each directed edge of the graph would be associated with an ordered pair of nodes. The voltage constraint at these pairs of terminals is given by
    v_f̄ = B₁₂ v_f.
By the Implicit Duality Theorem the current constraints must be
    i_f = −B₁₂ᵀ i_f̄.
These are precisely the voltage and current constraints of the graph.
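The duality used in the last step can be checked numerically. The sketch below (numpy assumed; B₁₂ is an arbitrary illustrative block, not taken from the text) verifies that the voltage space {(v_f, B₁₂ v_f)} and the current space {(−B₁₂ᵀ i_f̄, i_f̄)} are orthogonal and that their ranks add up to the full dimension, which is exactly what the transformer construction realises.

```python
import numpy as np

# hypothetical fundamental-circuit block B12: rows = cotree edges, cols = forest edges
B12 = np.array([[1., -1., 0.],
                [2.,  1., 1.]])
k_cot, k_for = B12.shape

# voltage space imposed by the transformers: v_cotree = B12 @ v_forest
V = np.hstack([np.eye(k_for), B12.T])    # columns ordered (forest, cotree)
# current space predicted by the Implicit Duality Theorem: i_forest = -B12.T @ i_cotree
C = np.hstack([-B12, np.eye(k_cot)])

# orthogonality of every voltage vector with every current vector ...
assert np.allclose(V @ C.T, 0)
# ... and the ranks add to the full dimension, so the spaces are complementary
assert np.linalg.matrix_rank(V) + np.linalg.matrix_rank(C) == k_for + k_cot
print("voltage and current constraints are complementary orthogonal")
```

Any rational B₁₂ works here, matching the remark that the construction extends to 1 : bᵢ transformers.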
E 7.9: Ideal transformers cannot be connected inconsistently, since the zero solution would always work for linear homogeneous equations. When two ideal transformers of different turns ratios are connected in parallel they would still permit zero voltage across the ports.

E 7.10: The voltage constraints are V_v(G) ↔ V_P and the current constraints V_i(G) ↔ V_P^⊥. Thus the ideal transformer that the remaining devices of the network 'see' is
    (V_v(G) ↔ V_P, V_i(G) ↔ V_P^⊥).
E 7.11: [Narayanan87] If V_SP ↔ V_P = V, then clearly V_SP · S ⊇ V. Further, V_SP^⊥ ↔ V_P^⊥ = V^⊥. So V_SP^⊥ · S ⊇ V^⊥, i.e., (V_SP^⊥ · S)^⊥ ⊆ V, i.e., V_SP × S ⊆ V. In the present problem it is therefore clear that V_SP · S ⊇ V + V′ and V_SP × S ⊆ V ∩ V′. Hence,
    r(V_SP · S) − r(V_SP × S) ≥ r(V + V′) − r(V ∩ V′).
But
    r(V_SP · S) − r(V_SP × S) = r(V_SP · P) − r(V_SP × P) ≤ |P|.
Thus, |P| ≥ r(V + V′) − r(V ∩ V′). However, the construction given in Subsection 7.3.3 actually achieves equality. Thus, for V_SP to be a minimum extension of V and V′ it is necessary and sufficient that
    |P| = r(V + V′) − r(V ∩ V′).
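The bound |P| ≥ r(V + V′) − r(V ∩ V′) is easy to evaluate with rank arithmetic, using dim(U ∩ W) = dim U + dim W − dim(U + W). A small numpy sketch (the two row spaces below are illustrative data, not from the text):

```python
import numpy as np

def rank(M):
    return np.linalg.matrix_rank(M) if M.size else 0

def dim_sum(A, B):
    # dim(row(A) + row(B)) = rank of the stacked rows
    return rank(np.vstack([A, B]))

def dim_cap(A, B):
    # modular identity: dim(U ∩ W) = dim U + dim W - dim(U + W)
    return rank(A) + rank(B) - dim_sum(A, B)

# two example spaces on a four-element set S
V  = np.array([[1., 0., 1., 0.],
               [0., 1., 0., 1.]])
Vp = np.array([[1., 0., 1., 0.],
               [0., 0., 1., 1.]])

bound = dim_sum(V, Vp) - dim_cap(V, Vp)   # minimum |P| per E 7.11
print("minimum |P| =", bound)
```

For this data r(V + V′) = 3 and r(V ∩ V′) = 1, so a minimum extension needs |P| = 2.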
E 7.12: Let G be composed of two edges e₁, e₂ in parallel, directed in the same direction, and let G′ have e₁, e₂ in parallel, directed oppositely. To make both G and G′ minors of a graph G_EP on E ⊎ P, we may think of P as composed of P₁ ⊎ P₂. When the edges of P₁ are shorted and those of P₂ opened, we should get G; when the edges of P₂ are shorted and those of P₁ opened, G′. It can be seen that this requires four edges. Starting from G we first introduce two edges in series with e₂, one at its tail end and the other at its head end. These would be P₁. Now add an edge from the tail of e₂ to the head of e₁ and an edge from the head of e₂ to the tail of e₁. These would be P₂. The reason we cannot do with fewer edges in P is that to reverse e₂ both its endpoints have to be detached and reattached. However,
    r(V_v(G) + V_v(G′)) − r(V_v(G) ∩ V_v(G′)) = 2,
so our construction requires more elements in P than this bound: a space V_EP with |P| = 2 exists, but it cannot be the voltage space of a graph.
Remark: Node pair fusion and node fission operations (to be discussed in the chapter on hybrid rank) are more powerful than graph minor operations. In the present case, to move from G to G′ we require only two such operations.
E 7.13: If G′ is made up only of coloops, then V_v(G′) ⊇ V_v(G). Hence R₁ would have zero rows, i.e., would not exist. As described in Subsection 7.3.3, the KVE of G can be written as the two constraints
    v_E = (A_r′)ᵀ v_n′   and   R_P₂ᵀ v_n′ = 0,
where A_r′ is the reduced incidence matrix of G′. But A_r′ is a unit matrix, since G′ is made up of coloops. So the KVE of G is equivalent to R_P₂ᵀ v_E = 0. Further, the columns of R_P₂ are linearly independent. Thus R_P₂ᵀ must be a representative matrix of V_i(G).
If G′ is made up entirely of self loops, A_r′ will have no rows, i.e., would not exist. Neither would R_P₂. The matrix R₁ would be the same as the reduced incidence matrix A_r of G.

E 7.14: i. Let V, V′, V″ denote V_v(G), V_v(G′), V_v(G″) respectively. We will only verify the triangle inequality. We have
    d(G, G′) + d(G′, G″) − d(G, G″)
      = r(V + V′) + r(V′ + V″) − r(V ∩ V′) − r(V′ ∩ V″) − r(V + V″) + r(V ∩ V″).
We use the identity r(V₁) + r(V₂) = r(V₁ ∩ V₂) + r(V₁ + V₂). Hence,
    d(G, G′) + d(G′, G″) − d(G, G″)
      = 2[(r(V′) − r(V ∩ V′ + V′ ∩ V″)) + (r(V ∩ V″) − r(V ∩ V′ ∩ V″))] ≥ 0.
ii. The first part is similar to the above. The second part follows from the facts that
    (V + V′)^⊥ = V^⊥ ∩ (V′)^⊥   and   r(V^⊥) = |E| − r(V), if V is on E.
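Since d(G, G′) depends only on the voltage spaces, the triangle inequality can be spot-checked on random row spaces, using d(U, W) = 2 r(U + W) − r(U) − r(W), a rearrangement via the modular identity quoted above. A numpy sketch with hypothetical data:

```python
import numpy as np

def d(A, B):
    # d = r(U + W) - r(U ∩ W) = 2 r(U + W) - r(U) - r(W), by the modular identity
    s = np.linalg.matrix_rank(np.vstack([A, B]))
    return 2 * s - np.linalg.matrix_rank(A) - np.linalg.matrix_rank(B)

rng = np.random.default_rng(0)
for _ in range(200):
    U, V, W = (rng.integers(-2, 3, size=(2, 5)).astype(float) for _ in range(3))
    assert d(U, W) <= d(U, V) + d(V, W)   # triangle inequality
print("triangle inequality verified on 200 random triples")
```

d also vanishes exactly when the two row spaces coincide, as a distance should.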
E 7.15: Let us rewrite Equation 7.7 as follows:
    ( C₁₁  C₁₂ ) ( x )   ( a )
    ( C₂₁   0  ) ( y ) = ( 0 ),      (7.10)
where x collects the variables v_E, i_E, v_n′ of N′ and y collects the variables i_P₂, v_P₂. We will describe the plan of our method first algebraically and later give it a network interpretation. Assume that the overall matrix is invertible. We have, from the above equations, if C₁₁ is invertible,
    x = C₁₁⁻¹ (a − C₁₂ y),      (7.11)
    C₂₁ x = 0.                  (7.12)
Hence,
    C₂₁ C₁₁⁻¹ (a − C₁₂ y) = 0,      (7.13)
i.e.,
    C₂₁ C₁₁⁻¹ C₁₂ y = C₂₁ C₁₁⁻¹ a.  (7.14)
Since we assumed that Equation 7.10 is uniquely solvable, the above equation in y must also be uniquely solvable. Substituting this value of y in Equation 7.11 we get the value of x.
Now the network interpretation. We first show that our assumptions above follow from assumptions about the unique solvability of N and N′. Equation 7.7 is equivalent to the constraints of the network N as far as the variables v_E, i_E are concerned. Hence the solution of this equation is unique as far as v_E, i_E are concerned. The overall solution would be unique if we can show that the columns corresponding to v_n′, i_P₂, v_P₂ are linearly independent. This is so because (a) the columns of ((A_r′)ᵀ ⋮ R₁ᵀ) are linearly independent; indeed they form a basis for V + V′; and (b) the matrix R_P₂ is row equivalent to an identity matrix bordered by zero rows and therefore has linearly independent columns. Thus the relevant set of columns is linearly independent, as desired. We assumed N′ is uniquely solvable. The matrix C₁₁ would be the coefficient matrix if, for N′, we write KCE, KVL (in the potential difference form) and the device characteristic equations. Hence, C₁₁ is invertible.
Next we interpret the steps of the solution when we solve
    C₁₁ x = a − C₁₂ y.
We are solving network N′ (1 + |P|) times. First solve with a in place but y = 0. Next set a = 0 and keep one component of y at a time equal to 1 and all the rest zero. The solutions we get are respectively equivalent to C₁₁⁻¹ a and the columns of C₁₁⁻¹ C₁₂. Referring to Equation 7.7 we see that each column of C₁₂ is effectively either a column of R_P₂ or a column of −R₁ᵀ. When we set one of the components of i_P₂ equal to 1, the corresponding column R_P₂^j comes into the equation, since we have A_r′ i_E = −R_P₂ i_P₂. This means that in the network N′ with the sources j, e set to zero we have the current source vector −R_P₂^j entering the nodes. When we set one of the components of v_P₂ equal to 1, the corresponding column −R₁ᵀ comes into the picture, since v_E enters the KVL equations through (A_r′)ᵀ v_n′ + R₁ᵀ v_P₂. Here −R₁ᵀ may be thought of as a branch voltage source vector, each entry being the value of a source voltage in series with the corresponding branch.

Let K̂_P ≡ T_P(K_P) and K̂_SP ≡ T_SP(K_SP), where T_SP is defined in terms of T_P as in the previous section of this problem. We have
    K_SP ↔ K_P = K̂_SP ↔ K̂_P.
Now since K̂_P = T_P(K_P), we claim that
    (K̂_P)* = (T_Pᵀ)⁻¹(K_P*).      (*)
To prove this we first check that vectors in K̂_P and (T_Pᵀ)⁻¹(K_P*) are q-orthogonal (i.e., that (K̂_P)* ⊇ (T_Pᵀ)⁻¹(K_P*)). We have, if f_P′ ∈ K̂_P and g_P′ ∈ (T_Pᵀ)⁻¹(K_P*), that there must exist f_P ∈ K_P and g_P ∈ K_P* s.t. f_P′ = T_P(f_P) and g_P′ = (T_Pᵀ)⁻¹(g_P). Hence,
    (f_P′)ᵀ(g_P′) = (T_P(f_P))ᵀ((T_Pᵀ)⁻¹(g_P)) = f_Pᵀ g_P ∈ A.
(Here we have treated T_P(f_P) as a matrix product.) On the other hand, if f_P ∈ K_P and g_P ∈ (T_Pᵀ)((K̂_P)*), then there must exist f_P′ ∈ K̂_P and g_P′ ∈ (K̂_P)* s.t. f_P = T_P⁻¹(f_P′) and g_P = (T_Pᵀ)(g_P′). Hence,
    (f_P)ᵀ(g_P) = ⟨T_P⁻¹ f_P′, T_Pᵀ g_P′⟩ = ⟨f_P′, g_P′⟩ ∈ A,
i.e., T_Pᵀ((K̂_P)*) ⊆ K_P*, equivalently, (K̂_P)* ⊆ (T_Pᵀ)⁻¹(K_P*). This proves (*).
Since
    T_SP(f_S ⊕ f_P) = f_S ⊕ T_P(f_P),
it follows that
    (T_SPᵀ)⁻¹(f_S ⊕ f_P) = f_S ⊕ (T_Pᵀ)⁻¹(f_P).
As in the case of K_P, K̂_P we can verify in the case of K_SP, K̂_SP (= T_SP(K_SP)) also that (K̂_SP)* = (T_SPᵀ)⁻¹(K_SP*). By the previous section of the present problem we must have (K̂_SP)* ↔ (K̂_P)* = K_SP* ↔ K_P*. But it is given that
    (K_SP ↔ K_P)* = K_SP* ↔ K_P*,
and we have already seen that K_SP ↔ K_P = K̂_SP ↔ K̂_P. It follows therefore that
    (K̂_SP ↔ K̂_P)* = (K̂_SP)* ↔ (K̂_P)*.
Next, we handle the case where K_P is not a vector space. Let K_P′ be a copy of K_P on a set P′ which is itself a copy of P disjoint from S ⊎ P. Let K_PP′ be the vector space with representative matrix
     P   P′
    ( I   I ),
where the rows have 1's on columns corresponding to elements which are copies of each other. We then have, by the definition of generalized minor,
    K_SP ↔ K_P = (K_SP ⊕ K_P′) ↔ K_PP′.
Since K_PP′ is a vector space we must have
    ((K_SP ⊕ K_P′) ↔ K_PP′)* = (K_SP* ⊕ K_P′*) ↔ K_PP′*
(see the solution of Exercise 7.1). Now K_PP′^⊥ (= K_PP′*) has the representative matrix
     P    P′
    ( I  −I ).
Hence,
    (K_SP* ⊕ K_P′*) ↔ K_PP′* = K_SP* ↔ (−K_P*),
which proves the required result.
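The elimination steps (7.10)-(7.14) amount to a Schur-complement solve in which the large block C₁₁ is solved against 1 + |P| right-hand sides and only a small |P| × |P| system remains. A numpy sketch on randomly generated, purely illustrative blocks:

```python
import numpy as np

# bordered system  [C11 C12; C21 0] [x; y] = [a; 0]  as in Eq. 7.10 (hypothetical data)
rng = np.random.default_rng(1)
n, p = 5, 2
C11 = rng.standard_normal((n, n)) + n * np.eye(n)   # diagonally loaded, hence invertible
C12 = rng.standard_normal((n, p))
C21 = rng.standard_normal((p, n))
a = rng.standard_normal(n)

# solve with C11 (1 + p) times: once against a, once per unit component of y
x0 = np.linalg.solve(C11, a)            # C11^{-1} a
X = np.linalg.solve(C11, C12)           # columns of C11^{-1} C12
# the small p x p system  (C21 C11^{-1} C12) y = C21 C11^{-1} a   (Eqs. 7.13-7.14)
y = np.linalg.solve(C21 @ X, C21 @ x0)
x = x0 - X @ y                          # back-substitution, Eq. 7.11

# verify against a direct solve of the full bordered matrix
full = np.block([[C11, C12], [C21, np.zeros((p, p))]])
sol = np.linalg.solve(full, np.concatenate([a, np.zeros(p)]))
assert np.allclose(sol, np.concatenate([x, y]))
print("bordered solve agrees with the direct solve")
```

In the network reading, each of the p unit right-hand sides is one auxiliary solve of N′ with a single port excitation.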
E 7.19: We need to check if there is an appropriate class of collections of vectors s.t. if K belongs to this class, (K*)* = K.
i. f, g are q-orthogonal iff ⟨f, g⟩ is a nonnegative integer: Consider the collection of integral solutions to Ax ≤ 0. It can be shown that this is not closed under q-orthogonality. So implicit duality would not work in this case if we take the above mentioned collection as basic.
ii. f, g are q-orthogonal iff ⟨f, g⟩ is an integral multiple of a given integer n: The Implicit Duality Theorem should go through in this case. We sketch the argument below. In this case, we could define 'regularly generated by (A, B)' to be the collection of vectors of the form λᵀA + αᵀB, where λ is an integral vector and α an arbitrary rational vector. If K is regularly generated by (A, B) we can, without loss of generality, take
    ( A )
    ( B )
to have linearly independent rows; the argument is as in Theorem 7.5.1. Then, using the arguments of Theorem 7.5.2, we conclude that K* would be regularly generated by (C, D), where C satisfies ACᵀ = n(I) and the rows of D span the orthogonal complement of the space spanned by the rows of
    ( A )
    ( B ).
The rest of Theorem 7.5.2 also goes through in this case. Hence, we can mimic the proof of Theorem 7.5.3 and conclude
    (K_SP ↔ K_P)* = K_SP* ↔ K_P*.
7.9 Solutions of Problems
P 7.1:
i. Let V_SP′ be the subspace consisting of all vectors of V_SP whose restrictions to P belong to V_P. Choose a representative matrix for V_SP′ of the form
      S     P
    ( R₁₁  R₁₂ )
    (  0   R₂P )      (*)
    ( R₂₁   0  )
where the rows of R₂P form a basis of (V_SP × P) ∩ V_P, the rows of R₁₂ together with those of R₂P are linearly independent and span (V_SP · P) ∩ V_P, and the rows of (R₂₁ ⋮ 0) form a representative matrix of (V_SP × S) ⊕ 0_P. The number of rows (as well as the rank) of this representative matrix is clearly r(V_SP × S) + r((V_SP · P) ∩ V_P). Now
    ( R₁₁ )
    ( R₂₁ )
is a representative matrix for V_SP ↔ V_P; its rows can be seen to be linearly independent.
To determine the space V_SP′: construct representative matrices R ≡ (R_S ⋮ R_Z) for V_SP and R_P for V_P. Find the solution space of the equation
    (σᵀ ⋮ τᵀ) ( R_Z over −R_P ) = 0.
Let (Q₁₁ ⋮ Q₁₂) be a representative matrix of the solution space. (Note that Q₁₁ R_Z = Q₁₂ R_P.) The rows of Q₁₁ R_Z generate (V_SP · P) ∩ V_P and the rows of Q₁₁ (R_S ⋮ R_Z) generate V_SP′.
ii. The rank of V_SP ↔ V_P:
(a) The rank is the number of rows of (R₁₁ over R₂₁). Noting that the number of rows of R₂P (in (*)) equals r((V_SP × P) ∩ V_P), we see that
    r(V_SP ↔ V_P) = r(V_SP × S) + r((V_SP · P) ∩ V_P) − r((V_SP × P) ∩ V_P).
(b) If f_S, g_S belong respectively to V_SP ↔ V_P and V_SP^⊥ ↔ V_P^⊥, there exist vectors f_P ∈ V_P and g_P ∈ V_P^⊥ s.t. f_S ⊕ f_P ∈ V_SP and g_S ⊕ g_P ∈ V_SP^⊥. Hence,
    ⟨f_S, g_S⟩ = ⟨f_S ⊕ f_P, g_S ⊕ g_P⟩ − ⟨f_P, g_P⟩ = 0.
(c)
    r(V_SP^⊥ ↔ V_P^⊥) + r(V_SP ↔ V_P)
      = [r(V_SP^⊥ × S) + r((V_SP^⊥ · P) ∩ V_P^⊥) − r((V_SP^⊥ × P) ∩ V_P^⊥)]
        + [r(V_SP × S) + r((V_SP · P) ∩ V_P) − r((V_SP × P) ∩ V_P)],
where we have used the results of a previous section of the present problem. Now we make use of the following facts:
    r(V_SP^⊥ × S) = |S| − r(V_SP · S),
    ((V_SP^⊥ · P) ∩ V_P^⊥)^⊥ = (V_SP × P) + V_P,
    ((V_SP^⊥ × P) ∩ V_P^⊥)^⊥ = (V_SP · P) + V_P,
    r(V_SP · S) − r(V_SP × S) = r(V_SP · P) − r(V_SP × P).
A routine computation then shows that the sum equals |S|.
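The Q₁₁ construction of part i translates directly into code: stack the P-columns of R over −R_P, take a basis of the left null space, and read off the S-columns. The sketch below (numpy assumed; the spaces are small illustrative examples) also confirms the rank formula of part ii(a) for this data.

```python
import numpy as np

def nullspace(A, tol=1e-10):
    # rows spanning {z : A @ z = 0}
    _, s, vt = np.linalg.svd(A)
    r = int((s > tol).sum())
    return vt[r:]

def rk(M):
    return np.linalg.matrix_rank(M) if M.size else 0

def cap_dim(A, B):
    # dimension of the intersection of two row spaces
    return rk(A) + rk(B) - rk(np.vstack([A, B]))

# V_SP on S = {0,1}, P = {2,3}; V_P on P (illustrative data)
R_SP = np.array([[1., 0., 1., 0.],
                 [0., 1., 0., 1.],
                 [0., 0., 1., 1.]])
R_P = np.array([[1., 0.]])
S, P = [0, 1], [2, 3]

# rows (sigma | tau) with sigma @ R_SP[:, P] = tau @ R_P: left null space of the stack
K = np.vstack([R_SP[:, P], -R_P])
Q = nullspace(K.T)
Q11 = Q[:, :R_SP.shape[0]]
minor = Q11 @ R_SP[:, S]                     # representative rows of V_SP <-> V_P

VxS = nullspace(R_SP[:, P].T) @ R_SP[:, S]   # V_SP x S
VxP = nullspace(R_SP[:, S].T) @ R_SP[:, P]   # V_SP x P
lhs = rk(minor)
rhs = rk(VxS) + cap_dim(R_SP[:, P], R_P) - cap_dim(VxP, R_P)
assert lhs == rhs                            # the rank formula of part ii(a)
print("rank of the generalized minor:", lhs)
```

For this example both sides evaluate to 2: here the generalized minor is all of the space on S.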
P 7.2: We have
    |P| ≥ r(V_SP · P) − r(V_SP × P).
The RHS equals r(V_SP · S) − r(V_SP × S) by Corollary 3.4.2. Select V_SP so that it has the following representative matrix:
      S     P
    ( R₁S   I )
    ( R₂S   0 )      (7.15)
where R₂S is a representative matrix of V_SP × S and the rows of (R₁S over R₂S) are linearly independent, together spanning V_SP · S. It is clear that
    |P| = r(V_SP · P) − r(V_SP × P) = r(V_SP · S) − r(V_SP × S).
Now
    r(V_SP^⊥ · P) − r(V_SP^⊥ × P) = r(V_SP^⊥ · S) − r(V_SP^⊥ × S)
      = (|S| − r(V_SP × S)) − (|S| − r(V_SP · S))
      = r(V_SP · S) − r(V_SP × S) = |P|.
Since we have already seen that, in general, |P| cannot be less than r(V_SP · S) − r(V_SP × S), the pair (V_SP, V_SP^⊥) is minimal in P.
To see the validity of the second condition we first observe that
    |P| = r(V_SP · P) − r(V_SP × P) (= r(V_SP · S) − r(V_SP × S))
only if r(V_SP × P) = 0, and
    |P| = r(V_SP^⊥ · P) − r(V_SP^⊥ × P) (= r(V_SP^⊥ · S) − r(V_SP^⊥ × S))
only if r(V_SP^⊥ × P) = 0. On the other hand, since
    r(V_SP^⊥ · P) = |P| − r(V_SP × P)   and   r(V_SP · P) = |P| − r(V_SP^⊥ × P),
we must have, if r(V_SP × P) = r(V_SP^⊥ × P) = 0, then
    |P| = r(V_SP · P) − r(V_SP × P) = r(V_SP^⊥ · P) − r(V_SP^⊥ × P).
So the result follows.

P 7.3: Let V_{S−S₂} have the representative matrix
     S − S₁   S₁ − S₂
    (   I        0    ),
where S₂ ⊆ S₁ ⊆ S. In Exercise 7.18 we saw that generalized minors with respect to spaces on disjoint port sets can be taken in either order; hence
    (V_SP ↔ V_P) ↔ V_{S−S₂} = (V_SP ↔ V_{S−S₂}) ↔ V_P.
But the LHS is (V_SP ↔ V_P) · S₁ × S₂, while the RHS is (V_SP · (S₁ ∪ P) × (S₂ ∪ P)) ↔ V_P. Thus the desired result follows.
P 7.4:
i(a) Let f_S ∈ V_SP₁ ↔ V_P₁. Then there exists f_P₁ ∈ V_P₁ s.t. f_S ⊕ f_P₁ ∈ V_SP₁. Since V_P₁ = V_P × (P − e) and V_SP₁ = V_SP × (S ∪ (P − e)), we must have f_P₁ ⊕ 0_e ∈ V_P and f_S ⊕ f_P₁ ⊕ 0_e ∈ V_SP. Hence, f_S ∈ V_SP ↔ V_P.
On the other hand, let f_S ∈ V_SP ↔ V_P. Then there exists f_P ∈ V_P s.t. f_S ⊕ f_P ∈ V_SP. It is given that there exists f̃_P ∈ V_SP × P ⊆ V_P with e in the support of f̃_P. For some λ we must have (f_P + λ f̃_P)(e) = 0. It is clear that f_S ⊕ (f_P + λ f̃_P) ∈ V_SP and f_P + λ f̃_P ∈ V_P. Let f_P₁ be the restriction of (f_P + λ f̃_P) to P₁. Clearly f_S ⊕ f_P₁ ∈ V_SP₁ and f_P₁ ∈ V_P₁. Hence, f_S ∈ V_SP₁ ↔ V_P₁.
i(b) We have that V_SP · P ⊇ V_P. Let f_P₁ ∈ V_P₁. Then f_P₁ ⊕ 0_e ∈ V_P ⊆ V_SP · P. Hence, there exists f_S ⊕ f_P₁ ⊕ 0_e ∈ V_SP. Hence, f_S ⊕ f_P₁ ∈ V_SP × (S ∪ P₁) = V_SP₁. Hence f_P₁ ∈ V_SP₁ · P₁. Thus, V_P₁ ⊆ V_SP₁ · P₁. Next, let f_P₁ ∈ V_SP₁ × P₁. Then 0_S ⊕ f_P₁ ⊕ 0_e ∈ V_SP. Hence, f_P₁ ⊕ 0_e ∈ V_SP × P ⊆ V_P. Hence, f_P₁ ∈ V_P × P₁ = V_P₁. Thus, V_SP₁ · P₁ ⊇ V_P₁ ⊇ V_SP₁ × P₁.
i(c) It is clear that V_SP · S ⊇ V_SP × (S ∪ P₁) · S = V_SP₁ · S. To see the reverse containment let f_S ∈ V_SP · S. Then there exists f_P s.t. f_S ⊕ f_P ∈ V_SP. We have a vector f̃_P in V_SP × P with e in its support. Hence, for a suitable λ, (f_P + λ f̃_P)(e) = 0. Now f_S ⊕ (f_P + λ f̃_P) ∈ V_SP and this vector takes zero value on e. Let f_P₁ be the restriction of (f_P + λ f̃_P) to P₁. Then f_S ⊕ f_P₁ ∈ V_SP₁. Thus, f_S ∈ V_SP₁ · S. This proves that V_SP · S ⊆ V_SP₁ · S, and since the reverse containment is clear we have V_SP · S = V_SP₁ · S. Next, to prove that V_SP × S = V_SP₁ × S, we merely note that V_SP₁ = V_SP × (S ∪ P₁).
ii(a) From the previous section it is clear that
    V_SP^⊥ ↔ V_P^⊥ = V_SP₂^⊥ ↔ V_P₂^⊥,
since V_SP₂^⊥ = (V_SP · (S ∪ P₂))^⊥ = V_SP^⊥ × (S ∪ P₂) and V_P₂^⊥ = (V_P · P₂)^⊥ = V_P^⊥ × P₂. Hence,
    V_SP ↔ V_P = (V_SP^⊥ ↔ V_P^⊥)^⊥ = (V_SP₂^⊥ ↔ V_P₂^⊥)^⊥ = V_SP₂ ↔ V_P₂.
ii(b) Since V_SP · P ⊇ V_P ⊇ V_SP × P, it follows that (V_SP · P)^⊥ ⊆ V_P^⊥ ⊆ (V_SP × P)^⊥. Hence, V_SP^⊥ · P ⊇ V_P^⊥ ⊇ V_SP^⊥ × P. Now by the argument of the previous section of the present problem,
    V_SP₂^⊥ · P₂ ⊇ V_P₂^⊥ ⊇ V_SP₂^⊥ × P₂.
Hence, V_SP₂ × P₂ ⊆ V_P₂ ⊆ V_SP₂ · P₂.
ii(c) We have V_SP^⊥ · S = V_SP₂^⊥ · S by the arguments of the previous section. Hence, V_SP × S = V_SP₂ × S. Similarly, V_SP^⊥ × S = V_SP₂^⊥ × S by the arguments of the previous section. Hence, by taking orthogonal complements of both sides, V_SP · S = V_SP₂ · S.
iii. The proof is similar to the 'compatibility' case.
P 7.5: If V p = V s p +) U i for some V g , then by Exercise 7.5;
This takes care of the necessity of the condition. Suppose VSP x P vp VSP . P.
c
c
Let fp E VSP t) V S . Then there exists fs E V s s.t. fs @ f p E V s p . But V S = V s p +-+ V p . Hence, there exists fp’ E V p s.t. fs @ fp’ E V s p . It follows that fp‘ - f p E V s p x P C V p . Hence, fp E V p . Next let fp E V p . Then there exists fs s.t. fs @ fp E V S P since V S P. P 2 V P . Now fs E V s p t) V p = V s , by the definition of the generalized minor operation. Hence, f p E V s p t)V s , once again using the definition of generalized minor. P 7.6: Extend each vector in V 1 ,Vz to S1U Sz by padding with zeros. Let us call the resulting vector spaces V i , V;l. Using the result in Problem 7.7, if V s p is a minimum extension of V i , V;l,then
IP
(=~
( v+;v;)- ~ ( vn ;v ; ) .
It is clear that V s p is a minimum common extension of V : , V i iff it is a minimum common extension of V1,Vz.
261
7.9. SOLUTIONS OF PROBLEMS
+
+
Now .(Ui U;)= r(V1 UZ)and Vl fl U; is the collection of vectors which are zero outside S1 fl SZ and belong to both Vi and U;. Hence,
~ ( vn;u;)= ~ ( ( ux, (sln sZ))n (vzx (sln SZ))).
P 7.7: This is a generalization of Exercise 7.11 and can be solved similarly. The essential difference is that the minors may have to be generalized, while in the case of two spaces ordinary minors were adequate. For details see [Narayanan87].

P 7.8: i. is obvious.

ii. Let $\mathcal{V}_{SP}$ be a common extension of $\mathcal{V}_S^1, \ldots, \mathcal{V}_S^k$. Let $\mathcal{V}_P^i = \mathcal{V}_{SP} \leftrightarrow \mathcal{V}_S^i$, $i = 1, \ldots, k$. By the result in Problem 7.5 it is clear that $\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P^i = \mathcal{V}_S^i$, $i = 1, \ldots, k$. Grow a basis for $\sum_i \mathcal{V}_S^i$ starting with vectors in $\bigcap_i \mathcal{V}_S^i$ and using vectors which belong to some $\mathcal{V}_S^i$. Let $f_S^1, \ldots, f_S^r$ be the vectors of this basis which are outside $\bigcap_i \mathcal{V}_S^i$ and let them belong to spaces $\mathcal{V}_{S1}, \ldots, \mathcal{V}_{Sr}$ respectively, where the $\mathcal{V}_{Si}$ are not necessarily distinct and are each equal to one of the $\mathcal{V}_S^i$. Then there exist $f_P^1, \ldots, f_P^r$ in $\mathcal{V}_{P1}, \ldots, \mathcal{V}_{Pr}$ respectively such that $f_S^1 \oplus f_P^1, \ldots, f_S^r \oplus f_P^r$ belong to $\mathcal{V}_{SP}$. (Here the $\mathcal{V}_{Pi}$ are spaces not necessarily distinct, each equal to one of the $\mathcal{V}_P^i$.) Suppose $f_P^1, \ldots, f_P^r$ are dependent modulo $\bigcap_i \mathcal{V}_P^i$. Then there is a linear combination $f_S = \lambda_1 f_S^1 + \cdots + \lambda_r f_S^r$ which does not belong to $\bigcap_i \mathcal{V}_S^i$ but $f_P = \lambda_1 f_P^1 + \cdots + \lambda_r f_P^r \in \bigcap_i \mathcal{V}_P^i$. Now the vector $f_S \in \bigcap_i (\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P^i)$ but $f_S \notin \bigcap_i \mathcal{V}_S^i$. This is a contradiction. We conclude therefore, that
$$r\Big(\sum_i \mathcal{V}_P^i\Big) - r\Big(\bigcap_i \mathcal{V}_P^i\Big) \geq r\Big(\sum_i \mathcal{V}_S^i\Big) - r\Big(\bigcap_i \mathcal{V}_S^i\Big).$$
The reverse inequality follows by repeating the argument interchanging S and P.
iii. We first show that $\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_{PQ}$ is an extension of $\mathcal{V}_S^1, \ldots, \mathcal{V}_S^k$. Let $f_S \in \mathcal{V}_S^i$. Then there exists $f_P$ s.t. $f_S \oplus f_P \in \mathcal{V}_{SP}$. Hence, $f_P \in \mathcal{V}_{SP} \leftrightarrow \mathcal{V}_S^i = \mathcal{V}_P^i$. Since $\mathcal{V}_{PQ}$ is an extension of $\mathcal{V}_P^i$ there exists $f_Q$ s.t. $f_P \oplus f_Q \in \mathcal{V}_{PQ}$. Hence, $f_S \oplus f_Q \in \mathcal{V}_{SP} \leftrightarrow \mathcal{V}_{PQ}$. Then $\mathcal{V}_S^i \subseteq (\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_{PQ}) \cdot S$. Similarly we can show that $(\mathcal{V}_S^i)^\perp \subseteq (\mathcal{V}_{SP}^\perp \leftrightarrow \mathcal{V}_{PQ}^\perp) \cdot S = ((\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_{PQ}) \times S)^\perp$. Hence, $\mathcal{V}_S^i \supseteq (\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_{PQ}) \times S$. By Exercise 7.5 this proves that $\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_{PQ}$ is an extension of $\mathcal{V}_S^i$. To see that it is a minimal extension we note that
$$|Q| = r\Big(\sum_i \mathcal{V}_P^i\Big) - r\Big(\bigcap_i \mathcal{V}_P^i\Big) = r\Big(\sum_i \mathcal{V}_S^i\Big) - r\Big(\bigcap_i \mathcal{V}_S^i\Big),$$
using the previous section of the present problem and Problem 7.7. Thus, once again using the result in the abovementioned problem, $\mathcal{V}_{SQ} \equiv \mathcal{V}_{SP} \leftrightarrow \mathcal{V}_{PQ}$ is a minimal extension of $\mathcal{V}_S^1, \ldots, \mathcal{V}_S^k$.
P 7.9: i. This is a special case of part (iv) of the present problem, whose solution is given below.
7. THE IMPLICIT DUALITY THEOREM
262
ii. $\mathcal{G}' = \bigoplus_i \mathcal{G} \times E_i$, where $E_1, \ldots, E_k$ is a partition of $E$. Now,
$$\mathcal{V}_v\Big(\bigoplus_i \mathcal{G} \times E_i\Big) = \bigoplus_i \mathcal{V}_v(\mathcal{G} \times E_i) = \bigoplus_i (\mathcal{V}_v(\mathcal{G})) \times E_i.$$
Any vector of the form $f_{E_i} \oplus 0_{E - E_i}$, where $f_{E_i} \in (\mathcal{V}_v(\mathcal{G})) \times E_i$, is also in $\mathcal{V}_v(\mathcal{G})$ by the definition of contraction. So $\mathcal{V}_v(\bigoplus_i \mathcal{G} \times E_i) \subseteq \mathcal{V}_v(\mathcal{G})$. Clearly $R_{P_2}$ cannot exist in this case. We show how to build $R_1$. Select a forest $T$ of $\mathcal{G}' = \bigoplus_i \mathcal{G} \times E_i$. Build the graph $\mathcal{G} \times (E - T)$. Let $A_{1r}$ be the reduced incidence matrix of $\mathcal{G} \times (E - T)$. Let $A_{1e} \equiv (A_{1r} \;\; 0)$, where the zero submatrix corresponds to the set $T$. We claim $A_{1e}$ can be chosen as the matrix $R_1$.
To prove this statement we first observe that if
$$\begin{pmatrix} A_r' \\ A_{1e} \end{pmatrix}$$
is a representative matrix of $\mathcal{G}$, with $A_r'$ a representative matrix of $\mathcal{G}'$, then $R_1$ can be taken to be $A_{1e}$. The rows of $A_{1e}$ are voltage vectors of $\mathcal{G}$ since $\mathcal{G} \times (E - T)$ is a contraction of $\mathcal{G}$. The columns $T$ are independent in the reduced incidence matrix $A_r'$ of $\mathcal{G}'$ whereas they are zero in $A_{1e}$. Hence, the matrix
$$\begin{pmatrix} A_r' \\ A_{1e} \end{pmatrix}$$
has linearly independent rows. Next $r(\mathcal{G} \times (E - T)) + r(\mathcal{G} \cdot T) = r(\mathcal{G})$. Hence, $r(\mathcal{G} \times (E - T)) + |T| \geq r(\mathcal{G})$. Thus, the above matrix must have $r(\mathcal{G})$ rows and is therefore a representative matrix of $\mathcal{G}$. This completes the proof that $R_1$ can be taken to be $A_{1e}$. The labour involved is to build $\mathcal{G} \times (E - T)$ and its reduced incidence matrix. So the algorithm is $O(|E|)$.
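The construction just described can be sketched computationally. In the fragment below (our own illustration; the edge-list representation and helper names are assumptions, not the book's notation), $\mathcal{G} \times (E - T)$ is formed by shorting the edges of $T$ with a union-find pass, its reduced incidence matrix is taken, and $R_1$ is the same matrix with zero columns on $T$.

```python
import numpy as np

def contract(edges, n_nodes, short_idx):
    """Short the edges with indices in short_idx (the contraction G x (E - T))."""
    parent = list(range(n_nodes))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for j in short_idx:
        u, v = edges[j]
        parent[find(u)] = find(v)
    label = {}
    for node in range(n_nodes):
        label.setdefault(find(node), len(label))
    # shorted edges become self-loops, so their incidence columns are zero
    return [(label[find(u)], label[find(v)]) for u, v in edges], len(label)

def reduced_incidence(edges, n_nodes):
    A = np.zeros((n_nodes, len(edges)))
    for j, (u, v) in enumerate(edges):
        A[u, j] += 1.0
        A[v, j] -= 1.0
    return A[1:, :]          # drop the datum-node row (graph assumed connected)

# square 0-1-2-3 with a diagonal edge 4 = (0, 2); short T = {4}
edges, n = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)], 4
c_edges, c_n = contract(edges, n, short_idx={4})
R1 = reduced_incidence(c_edges, c_n)   # the T-columns are already zero
print(R1.shape)                        # -> (2, 5)
```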
iii. Let $\mathcal{G}'$ be obtained from $\mathcal{G}$ by fusing nodes. Then every voltage vector of $\mathcal{G}'$ can be derived from a node potential vector of $\mathcal{G}$ by assigning the same potential to each group of nodes of $\mathcal{G}$ which make up a node of $\mathcal{G}'$. Hence, $\mathcal{V}_v(\mathcal{G}') \subseteq \mathcal{V}_v(\mathcal{G})$. Thus, the method of the previous section of this problem can be used to show the following: Let $T$ be a forest of $\mathcal{G}'$. Let $A_{1r}$ be the reduced incidence matrix of $\mathcal{G} \times (E - T)$. Then $R_1$ can be taken to be $(A_{1r} \;\; 0)$, with the zero submatrix corresponding to $T$.

iv. Since $\mathcal{G}'$ is obtained from $\mathcal{G}$ by node splitting, $\mathcal{V}_v(\mathcal{G}') \supseteq \mathcal{V}_v(\mathcal{G})$. The matrix $R_1$ is composed of row vectors which belong to $\mathcal{V}_v(\mathcal{G})$ and not to $\mathcal{V}_v(\mathcal{G}')$. Hence, in this case $R_1$ would not exist. We have to construct $R_{P_2}$ through a fast algorithm. From the discussion in Subsection 7.3.3 it is clear that imposing KVL for $\mathcal{G}$ is equivalent to
$$v_E - (A_r')^T v_n' = 0 \qquad (*)$$
$$R_{P_2}^T v_n' = 0 \qquad (**)$$
The equation (*) represents the KVL conditions for $\mathcal{G}'$. It expresses the branch voltages of $\mathcal{G}'$ in terms of the node voltages. Suppose $\mathcal{G}'$ is made up of $m$ connected components. For each connected component we choose a pseudo datum node. The node voltage vector $v_n'$ represents the voltages of the nodes in each component with respect to the pseudo datum node voltage. We could draw additional edges between each node and the corresponding pseudo datum node, with the arrow directed towards the latter. The voltages of these additional branches $E_a$ would be given by $v_n'$. Let us call the graphs obtained by adding $E_a$ to $\mathcal{G}'$ and by adding $E_a$ to $\mathcal{G}$ respectively $\mathcal{G}_a', \mathcal{G}_a$. ($\mathcal{G}_a$ is obtained by performing those node fusions on $\mathcal{G}_a'$ through which $\mathcal{G}'$ becomes $\mathcal{G}$.) In $\mathcal{G}_a'$ there are no voltage constraints on $E_a$. The voltage constraints on $E_a$ in the graph $\mathcal{G}_a$ would be equivalent to $R_{P_2}^T v_n' = 0$. But these constraints are precisely the voltage constraints of $\mathcal{G}_a \cdot E_a$. Thus, $R_{P_2}^T$ may be taken to be an f-circuit matrix of $\mathcal{G}_a \cdot E_a$. The complexity of this construction is $O$(number of nonzero entries of $R_{P_2}$). This is bounded above by $(\nu(\mathcal{G}_a \cdot E_a))(|E_a| - \nu(\mathcal{G}_a \cdot E_a)) + \nu(\mathcal{G}_a \cdot E_a)$. Usually however the effort required would be much less, particularly if we choose a variation of the mesh matrix.

v. Let $\mathcal{G}''$ be obtained from $\mathcal{G}$ by fusing some nodes and $\mathcal{G}'$ from $\mathcal{G}''$ by splitting some nodes of $\mathcal{G}''$. Let $\mathcal{G}$ have KCE $A_r i_E = 0$ and KVL constraints $A_r^T v_n = v_E$. Then, the KCE of $\mathcal{G}$ can be written as
$$\begin{pmatrix} A_r'' \\ R_1 \end{pmatrix} i_E = 0, \qquad (\dagger)$$
where $A_r''$ is the reduced incidence matrix of $\mathcal{G}''$ and
$$\begin{pmatrix} A_r'' \\ R_1 \end{pmatrix}$$
is a representative matrix of $\mathcal{V}_v(\mathcal{G})$. Observe that the first set of equations of $(\dagger)$, namely $A_r'' i_E = 0$, are the KCE of $\mathcal{G}''$. Since $\mathcal{G}''$ is obtained by fusing vertices of $\mathcal{G}$, we can use the procedure outlined in the solution of parts (ii) and (iii) of the present problem to construct the matrix $R_1$. The KVL of $\mathcal{G}$ is imposed by
$$v_E = (A_r'')^T v_n'' + R_1^T v_{P_1}.$$
Let us define $v_E'' \equiv (A_r'')^T v_n''$. So the KVL of $\mathcal{G}$ can be written as
$$(A_r'')^T v_n'' - v_E'' = 0$$
$$v_E'' + R_1^T v_{P_1} - v_E = 0. \qquad (\ddagger)$$
Observe that the first set of equations of $(\ddagger)$ above are the KVE of $\mathcal{G}''$. Now let $\mathcal{G}'$ be obtained from $\mathcal{G}''$ by splitting nodes. Then the KCE of $\mathcal{G}''$ can be written as $A_r' i_E + R_{P_2} i_{P_2} = 0$ and the KVE of $\mathcal{G}''$ can be written as
$$(A_r')^T v_n' - v_E'' = 0$$
$$R_{P_2}^T v_n' = 0,$$
where $A_r'$ is the reduced incidence matrix of $\mathcal{G}'$. The matrix $R_{P_2}$ can be constructed therefore by using the procedure outlined in the solution of parts (i) and (iv) above. Observe that the result of the above procedure is that the KCL and KVL constraints of $\mathcal{G}$ are written equivalently (as far as $i_E, v_E$ are concerned) as
$$A_r' i_E + R_{P_2} i_{P_2} = 0$$
$$R_1 i_E = 0$$
$$(A_r')^T v_n' + R_1^T v_{P_1} - v_E = 0$$
$$R_{P_2}^T v_n' = 0.$$
P 7.10: Our starting point is Equation 7.7. We will assume that the device characteristic can be written in the form
$$-G(v_E - e) + (i_E - j) = 0.$$
We start with the KCE in the modified form
$$A_r' i_E + R_{P_2} i_{P_2} = 0$$
$$R_1 i_E = 0.$$
We now use the device characteristic and get
$$A_r' G v_E + R_{P_2} i_{P_2} = -A_r' j + A_r' G e$$
$$R_1 G v_E = -R_1 j + R_1 G e.$$
Next we use the KVL constraints in the altered form:
$$A_r' G\big((A_r')^T v_n' + R_1^T v_{P_1}\big) + R_{P_2} i_{P_2} = -A_r' j + A_r' G e$$
$$R_1 G\big((A_r')^T v_n' + R_1^T v_{P_1}\big) = -R_1 j + R_1 G e$$
$$R_{P_2}^T v_n' = 0.$$
The final transformed nodal equations are as follows:
$$\begin{pmatrix} A_r' G (A_r')^T & A_r' G R_1^T & R_{P_2} \\ R_1 G (A_r')^T & R_1 G R_1^T & 0 \\ R_{P_2}^T & 0 & 0 \end{pmatrix} \begin{pmatrix} v_n' \\ v_{P_1} \\ i_{P_2} \end{pmatrix} = \begin{pmatrix} -A_r' j + A_r' G e \\ -R_1 j + R_1 G e \\ 0 \end{pmatrix}.$$

P 7.11: i. Let the KVE of $\mathcal{G}$ be $B v_E = 0$. We write this as
$$\begin{pmatrix} B_n \\ B_1 \end{pmatrix} v_E = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$
where $B_n$ is a representative matrix of $(\mathcal{V}_i(\mathcal{G})) \cap (\mathcal{V}_i(\mathcal{G}'))$. Similarly, let the KVE of $\mathcal{G}'$ be
$$\begin{pmatrix} B_n \\ B_2' \end{pmatrix} v_E = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
Let $\mathcal{V}_1^i, \mathcal{V}_2^i$ be the spaces $\mathcal{V}_i(\mathcal{G})$ and $\mathcal{V}_i(\mathcal{G}')$. Then a minimal common extension $\mathcal{V}_{EP}$ will have the following representative matrix (columns ordered $E, P_1, P_2$):
$$\begin{pmatrix} B_n & 0 & 0 \\ B_1 & I & 0 \\ B_2' & 0 & I \end{pmatrix},$$
where $\begin{pmatrix} B_n \\ B_2' \end{pmatrix}$ is a representative matrix of $\mathcal{V}_2^i$. Let $\mathcal{V}_{EP} \leftrightarrow \mathcal{V}_P^1 = \mathcal{V}_1^i$ and let $\mathcal{V}_{EP} \leftrightarrow \mathcal{V}_P^2 = \mathcal{V}_2^i$. Then it is clear that $\mathcal{V}_P^1$ can be taken to have the representative matrix $[\,I \;\; 0\,]$ and $\mathcal{V}_P^2$ the representative matrix $[\,0 \;\; I\,]$ (columns ordered $P_1, P_2$). Thus the KVL, KCL constraints of $\mathcal{G}$ can be rewritten (using a procedure analogous to that followed in Subsection 7.3.3) as follows:
$$B_2' v_E + B_{P_2} v_{P_2} = 0$$
$$B_1 v_E = 0$$
$$i_E - (B_2')^T i_l - B_1^T i_{P_1} = 0$$
$$B_{P_2}^T i_l = 0.$$
ii. We give the final transformed loop equations below:
$$\begin{pmatrix} B_2' R (B_2')^T & B_2' R B_1^T & B_{P_2} \\ B_1 R (B_2')^T & B_1 R B_1^T & 0 \\ B_{P_2}^T & 0 & 0 \end{pmatrix} \begin{pmatrix} i_l \\ i_{P_1} \\ v_{P_2} \end{pmatrix} = \begin{pmatrix} -B_2' e + B_2' R j \\ -B_1 e + B_1 R j \\ 0 \end{pmatrix}.$$

P 7.12: Consider the constraints (*) and (**) of the problem statement. Let us denote (*) and (**) jointly by $(\ddagger)$.
It is clear that (*), (**) can also be denoted jointly by $(\ddagger)$. It is given that $F_R(v_R, i_R) = 0$ can also be written as $F_R'(-i_R, v_R) = 0$; denote the device characteristic by $(\ddagger\ddagger)$. As far as the variables $(v_P, i_P)$ are concerned, $(\ddagger)$, $(\ddagger\ddagger)$ are together equivalent to a single constraint on $(v_P, i_P)$. By the Implicit Duality Theorem it follows that
$$\big[(\text{solution space of } (\ddagger)) \leftrightarrow (\text{solution space of } (\ddagger\ddagger))\big]^\perp = (\text{solution space of } (\ddagger))^\perp \leftrightarrow (\text{solution space of } (\ddagger\ddagger))^\perp.$$
But the LHS $= \mathcal{V}_P^v$ and the RHS $= \mathcal{V}_P^i$. The result follows.
Remark: The above may be used to prove reciprocity for certain kinds of networks. Example: $v_R = R\, i_R$, where $R$ is a symmetric matrix. Suppose $\mathcal{V}_P^v$ is defined by
$$[\,I \;\; K\,] \begin{pmatrix} v_P \\ i_P \end{pmatrix} = 0.$$
Then $\mathcal{V}_P^i$ would be defined by
$$[\,-K^T \;\; I\,] \begin{pmatrix} v_P \\ i_P \end{pmatrix} = 0.$$
This is possible only if $K$ is a symmetric matrix.

Let us now subject the network to two different port excitations. Let us call the $(v_P, i_P)$ vector corresponding to the first situation $(v_P', i_P')$ and that corresponding to the second situation $(v_P'', i_P'')$. Then
$$v_P' = -K i_P', \qquad v_P'' = -K i_P''.$$
Hence, $(v_P')^T i_P'' = (v_P'')^T i_P'$.

P 7.13: We use the notation of Subsection 7.3.4. The constraints of $\mathcal{N}$ are the KVE, the KCE and the device characteristic.
The device characteristic constraints are: no constraints on $v_{uv}$, no constraints on $i_{ui}$. As far as the variables $v_{yv}, i_{yi}, v_{uv}, i_{ui}$ are concerned, these constraints are equivalent to, say, $F_{uy}(v_{yv}, i_{yi}, v_{uv}, i_{ui}) = 0$.

By the Implicit Duality Theorem, the constraints
$$F_1'(v_g', v_{yv}', v_{yi}', v_{uv}', v_{ui}') = 0 \quad \cdots \text{KVE}$$
$$F_2'(i_g', i_{yv}', i_{yi}', i_{uv}', i_{ui}') = 0 \quad \cdots \text{KCE}$$
$$F_3'(i_g', i_{yi}', i_{ui}', v_g', v_{yv}', v_{uv}') = 0 \quad \cdots \text{device characteristic}$$
are together equivalent, as far as the variables $i_{yv}', v_{yi}', i_{uv}', v_{ui}'$ are concerned, to
$$F_{uy}^d(i_{yv}', v_{yi}', i_{uv}', v_{ui}') = 0.$$
But the constraint $F_3'(i_g', i_{yi}', i_{ui}', v_g', v_{yv}', v_{uv}') = 0$ is equivalent to the following: no constraints on $v_{yv}'$, no constraints on $i_{yi}'$, $i_{ui}' = 0$, $v_{uv}' = 0$. The result follows.

P 7.14: Proof of Lemma 7.6.1: (a) is immediate.
(b). Let $\mathcal{U}_P \equiv \mathcal{V}_P \cap \mathcal{V}_{SP} \cdot P$. Since $\mathcal{U}_P \subseteq \mathcal{V}_P$ it is clear that $\mathcal{V}_P^\perp \subseteq \mathcal{U}_P^\perp$. Hence,
$$\mathcal{V}_{SP}^\perp \leftrightarrow \mathcal{V}_P^\perp \subseteq \mathcal{V}_{SP}^\perp \leftrightarrow \mathcal{U}_P^\perp.$$
To prove the reverse containment, we first observe that, by Lemma 7.2.4, $(\mathcal{V}_{SP} \cdot P)^\perp = \mathcal{V}_{SP}^\perp \times P$. Next, since $\mathcal{V}_P^\perp + \mathcal{V}_{SP}^\perp \times P$ is given to be closed, we must have
$$\mathcal{U}_P^\perp = (\mathcal{V}_P \cap \mathcal{V}_{SP} \cdot P)^\perp = \mathcal{V}_P^\perp + \mathcal{V}_{SP}^\perp \times P$$
(by Lemma 7.2.2). Suppose $g_S \in \mathcal{V}_{SP}^\perp \leftrightarrow \mathcal{U}_P^\perp$. Then there exists $g_P \in \mathcal{U}_P^\perp$ s.t. $g_S \oplus g_P \in \mathcal{V}_{SP}^\perp$. Now $\mathcal{U}_P^\perp = \mathcal{V}_P^\perp + \mathcal{V}_{SP}^\perp \times P$. Thus, there exists $g_P' \in \mathcal{V}_{SP}^\perp \times P$ s.t. $g_P - g_P' \in \mathcal{V}_P^\perp$. Since $g_P' \in \mathcal{V}_{SP}^\perp \times P$, we must have $0_S \oplus g_P' \in \mathcal{V}_{SP}^\perp$. Hence, $\big((g_S \oplus g_P) - (0_S \oplus g_P')\big) \in \mathcal{V}_{SP}^\perp$ ($\mathcal{V}_{SP}^\perp$ is a vector space since $\mathcal{V}_{SP}$ is one, by Lemma 7.2.1). Hence, $g_S \oplus (g_P - g_P') \in \mathcal{V}_{SP}^\perp$.
It follows that $g_S \in \mathcal{V}_{SP}^\perp \leftrightarrow \mathcal{V}_P^\perp$. $\square$
Proof of Theorem 7.6.2: We have $\mathcal{V}_P \subseteq \mathcal{V}_{SP} \cdot P$. It is easily seen that vectors in $\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P$ and $\mathcal{V}_{SP}^* \leftrightarrow \mathcal{V}_P^*$ are q-orthogonal. For, if $f_S \in \mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P$, then there exists $f_P \in \mathcal{V}_P$ s.t. $f_S \oplus f_P \in \mathcal{V}_{SP}$. If $g_S \in \mathcal{V}_{SP}^* \leftrightarrow \mathcal{V}_P^*$, then there exists $g_P \in \mathcal{V}_P^*$ s.t. $g_S \oplus g_P \in \mathcal{V}_{SP}^*$. Now $\langle f_S \oplus f_P, g_S \oplus g_P \rangle \in \mathcal{A}$ and $\langle f_P, g_P \rangle \in \mathcal{A}$. Since $\mathcal{A}$ is closed under subtraction, it follows that
$$\langle f_S \oplus f_P, g_S \oplus g_P \rangle - \langle f_P, g_P \rangle \in \mathcal{A},$$
i.e., $\langle f_S, g_S \rangle \in \mathcal{A}$. So $f_S, g_S$ are q-orthogonal. Thus $(\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P)^* \supseteq (\mathcal{V}_{SP}^* \leftrightarrow \mathcal{V}_P^*)$.

Next let $g_S \in (\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P)^*$. Let $f_S \in \mathcal{V}_{SP} \times S$. Since $0_P \in \mathcal{V}_P$ and $f_S \oplus 0_P \in \mathcal{V}_{SP}$, it is clear that $f_S \in \mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P$ and $\langle f_S, g_S \rangle \in \mathcal{A}$. Hence, $g_S \in (\mathcal{V}_{SP} \times S)^*$. Since $\mathcal{V}_{SP}$, $\mathcal{V}_{SP}^* \cdot S$ are closed, we have, using Lemma 7.2.4,
$$\mathcal{V}_{SP}^* \cdot S = (\mathcal{V}_{SP}^* \cdot S)^{**} = (\mathcal{V}_{SP}^{**} \times S)^* = (\mathcal{V}_{SP} \times S)^*.$$
Thus, $g_S \in \mathcal{V}_{SP}^* \cdot S$. Hence, there exists $g_P$ on $P$ s.t. $g_S \oplus g_P \in \mathcal{V}_{SP}^*$. We will show that $g_P \in \mathcal{V}_P^*$. Let $f_P \in \mathcal{V}_P \subseteq \mathcal{V}_{SP} \cdot P$. Then there exists $f_S$ on $S$ s.t. $f_S \oplus f_P \in \mathcal{V}_{SP}$. Since $g_S \in (\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P)^*$, we must have $\langle f_S, g_S \rangle \in \mathcal{A}$. We also have
$$\langle f_S \oplus f_P, g_S \oplus g_P \rangle = \langle f_S, g_S \rangle + \langle f_P, g_P \rangle \in \mathcal{A}.$$
Since $\mathcal{A}$ is closed under subtraction it follows that $\langle f_P, g_P \rangle \in \mathcal{A}$, i.e., $g_P \in \mathcal{V}_P^*$, and therefore, $g_S \in \mathcal{V}_{SP}^* \leftrightarrow \mathcal{V}_P^*$. Thus $(\mathcal{V}_{SP} \leftrightarrow \mathcal{V}_P)^* \subseteq (\mathcal{V}_{SP}^* \leftrightarrow \mathcal{V}_P^*)$. $\square$
Chapter 8
Multiport Decomposition

8.1 Introduction

An informal discussion of the role of multiports in electrical networks may be found in Section 5.8 of Chapter 5. Its relation to the Implicit Duality Theorem is brought out briefly in Subsection 7.3.2 of Chapter 7. In this chapter, we give a formal description of multiport decomposition using the notion of generalized minor and the Implicit Duality Theorem. The primary application we have in mind is network analysis by decomposition. Relevant to this application is the port minimization of component multiports. These topics we deal with in detail. The port connection diagram can be viewed as the graph of a reduced network which keeps invariant the interrelationship between the subnetworks which go into the making of the different multiports. We give some instances of this idea in Section 8.5.

Remark: The word multiport has been used in two different senses in this chapter. An electrical multiport is an electrical network with some devices, which are norators, specified as ports. A component multiport is a vector space $\mathcal{V}_{EP}$ on $E \uplus P$ with the subset $P$ specified as ports. Formally, in both cases no further property of ports needs to be specified.
8.2 Multiport Decomposition of Vector Spaces
Definition 8.2.1 Let $\mathcal{V}_E$ be a vector space on $E$ and let $E$ be partitioned into $E_1, E_2, \ldots, E_k$. Let $\mathcal{V}_{E_1P_1}, \ldots, \mathcal{V}_{E_kP_k}, \mathcal{V}_P$, where $P \equiv P_1 \uplus \cdots \uplus P_k$ is a set disjoint from $E$, be vector spaces on $E_1 \uplus P_1, \ldots, E_k \uplus P_k, P$ respectively. We say $(\mathcal{V}_{E_1P_1}, \ldots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ is a k-multiport decomposition (k-decomposition or
8. MULTIPORT DECOMPOSITION
270
decomposition for short) of $\mathcal{V}_E$ iff
$$\mathcal{V}_E = (\mathcal{V}_{E_1P_1} \oplus \cdots \oplus \mathcal{V}_{E_kP_k}) \leftrightarrow \mathcal{V}_P.$$
The set $P$ is called the set of ports; $\mathcal{V}_{E_iP_i}$, $i = 1, \ldots, k$, are called the components or component multiports of the decomposition, while $\mathcal{V}_P$ is called the coupler. A multiport decomposition $(\mathcal{V}_{E_1P_1}, \ldots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ of $\mathcal{V}_E$ is said to be minimal iff whenever $(\mathcal{V}_{E_1P_1'}, \ldots, \mathcal{V}_{E_kP_k'}; \mathcal{V}_{P'})$ is a multiport decomposition of $\mathcal{V}_E$, we have $|P| \leq |P'|$.
Then we say that $(\mathcal{G}_{E_1P_1}, \ldots, \mathcal{G}_{E_kP_k}; \mathcal{G}_P)$ is a k-multiport decomposition of $\mathcal{G}$. The edges in $P$ are called ports. The graphs $\mathcal{G}_{E_iP_i}$, $i = 1, \ldots, k$, are called the components or component multiports in the decomposition, while $\mathcal{G}_P$ is called the port connection diagram. We would usually write $((\mathcal{G}_{E_jP_j})_k; \mathcal{G}_P)$ instead of $(\mathcal{G}_{E_1P_1}, \ldots, \mathcal{G}_{E_kP_k}; \mathcal{G}_P)$. If the components and coupler of a multiport decomposition of $\mathcal{V}_v(\mathcal{G})$ are voltage spaces of graphs we say that the decomposition is graphic. In general, a multiport decomposition of $\mathcal{V}_v(\mathcal{G})$ would not be graphic. Also the procedure for minimization of the number of ports that we describe in this section yields a multiport decomposition of $\mathcal{V}_v(\mathcal{G})$ that is not always graphic. Computationally this is not a great hindrance, as we shall show.

Coupling of given components to yield a given vector space
We describe below necessary and sufficient conditions under which given components $\mathcal{V}_{E_jP_j}$, $j = 1, \ldots, k$, can be coupled to yield a given vector space $\mathcal{V}_E$. First we state a simple lemma which characterizes an extension of a vector space.

Lemma 8.2.1 Let $\mathcal{V}_{EP}, \mathcal{V}_E$ be vector spaces on $E \uplus P$, $E$. Then, there exists a vector space $\mathcal{V}_P$ on $P$ s.t. $\mathcal{V}_{EP} \leftrightarrow \mathcal{V}_P = \mathcal{V}_E$ iff
$$\mathcal{V}_{EP} \cdot E \supseteq \mathcal{V}_E \supseteq \mathcal{V}_{EP} \times E.$$
For proof see the solution of Exercise 7.5.
Theorem 8.2.2 Let $\mathcal{V}_{E_1P_1}, \ldots, \mathcal{V}_{E_kP_k}$ be vector spaces on $E_1 \uplus P_1, \ldots, E_k \uplus P_k$. Let $E \equiv \uplus E_i$, $P \equiv \uplus P_i$, and let $\mathcal{V}_E$ be a vector space on $E$. Then there exists a vector space $\mathcal{V}_P$ on $P$ such that $(\mathcal{V}_{E_1P_1}, \ldots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ is a k-multiport decomposition of $\mathcal{V}_E$ iff the following two equivalent conditions are satisfied:

i. $\big(\bigoplus_j \mathcal{V}_{E_jP_j}\big) \cdot E \supseteq \mathcal{V}_E \supseteq \big(\bigoplus_j \mathcal{V}_{E_jP_j}\big) \times E$;

ii. $\mathcal{V}_{E_jP_j} \cdot E_j \supseteq \mathcal{V}_E \cdot E_j$ and $\mathcal{V}_{E_jP_j} \times E_j \subseteq \mathcal{V}_E \times E_j$, $j = 1, \ldots, k$.

Proof:

i. From Lemma 8.2.1 we have that there exists $\mathcal{V}_P$ with $(\bigoplus_j \mathcal{V}_{E_jP_j}) \leftrightarrow \mathcal{V}_P = \mathcal{V}_E$ iff $(\bigoplus_j \mathcal{V}_{E_jP_j}) \cdot E \supseteq \mathcal{V}_E \supseteq (\bigoplus_j \mathcal{V}_{E_jP_j}) \times E$. The result follows.

ii. We observe that $(\bigoplus_j \mathcal{V}_{E_jP_j}) \cdot E \supseteq \mathcal{V}_E$ iff $(\bigoplus_j \mathcal{V}_{E_jP_j}) \cdot E \cdot E_j \supseteq \mathcal{V}_E \cdot E_j$, $j = 1, \ldots, k$, i.e., iff $\mathcal{V}_{E_jP_j} \cdot E_j \supseteq \mathcal{V}_E \cdot E_j$, $j = 1, \ldots, k$. Next $\mathcal{V}_E \supseteq (\bigoplus_j \mathcal{V}_{E_jP_j}) \times E$ iff $\mathcal{V}_E \times E_j \supseteq (\bigoplus_j \mathcal{V}_{E_jP_j}) \times E \times E_j$, $j = 1, \ldots, k$, i.e., iff $\mathcal{V}_{E_jP_j} \times E_j \subseteq \mathcal{V}_E \times E_j$, $j = 1, \ldots, k$. $\square$
Compatibility of a decomposition

In general one cannot recover the coupler $\mathcal{V}_P$ of a decomposition given the components $\mathcal{V}_{E_jP_j}$ and the decomposed space $\mathcal{V}_E$. This is possible (as we show below) precisely when
$$\mathcal{V}_{E_jP_j} \cdot P_j \supseteq \mathcal{V}_P \cdot P_j, \quad j = 1, \ldots, k, \quad \text{and} \quad \mathcal{V}_{E_jP_j} \times P_j \subseteq \mathcal{V}_P \times P_j, \quad j = 1, \ldots, k.$$
When these conditions are satisfied we say that the components and the coupler of the decomposition are compatible, or more briefly, that the decomposition is compatible.
Theorem 8.2.3 Let $((\mathcal{V}_{E_jP_j})_k; \mathcal{V}_P)$ be a decomposition of $\mathcal{V}_E$. Then
$$\Big(\bigoplus_j \mathcal{V}_{E_jP_j}\Big) \leftrightarrow \mathcal{V}_E = \mathcal{V}_P$$
iff the decomposition is compatible.

Proof: Given spaces $\mathcal{V}_{EP}, \mathcal{V}_P, \mathcal{V}_E$ on $E \uplus P$, $P$, $E$ respectively such that $\mathcal{V}_{EP} \leftrightarrow \mathcal{V}_P = \mathcal{V}_E$, we have $\mathcal{V}_P = \mathcal{V}_{EP} \leftrightarrow \mathcal{V}_E$ iff $\mathcal{V}_{EP} \cdot P \supseteq \mathcal{V}_P$ and $\mathcal{V}_{EP} \times P \subseteq \mathcal{V}_P$ (by the result of Problem 7.5). The result now follows from the fact that the two sets of conditions of Theorem 8.2.2 are equivalent. $\square$
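The containment test of Lemma 8.2.1 (and hence the conditions of Theorems 8.2.2 and 8.2.3) is easy to check numerically. The sketch below is our own illustration (function names and the toy spaces are assumptions, not the book's notation): restriction and contraction of a row space are computed via an SVD-based null space, and containment of row spaces is verified by rank comparison.

```python
import numpy as np

def rowspace_contains(A, B, tol=1e-9):
    """True iff row space of A contains row space of B."""
    if B.size == 0:
        return True
    return np.linalg.matrix_rank(np.vstack([A, B]), tol) == np.linalg.matrix_rank(A, tol)

def restriction(M, keep):
    """V . E : rows of M restricted to the columns `keep`."""
    return M[:, keep]

def contraction(M, keep, drop, tol=1e-9):
    """V x E : vectors of rowspace(M) that are zero on `drop`, restricted to `keep`."""
    D = M[:, drop]
    _, s, vh = np.linalg.svd(D.T)      # rows c of vh beyond rank satisfy c @ D = 0
    r = int((s > tol).sum())
    return vh[r:] @ M[:, keep]

# V_EP on E = {0, 1}, P = {2}: rows (1, 0 | 1) and (0, 1 | 0)
M = np.array([[1., 0., 1.],
              [0., 1., 0.]])
E, P = [0, 1], [2]

def extension_exists(VE):
    """Lemma 8.2.1: V_EP . E >= V_E >= V_EP x E."""
    return (rowspace_contains(restriction(M, E), VE)
            and rowspace_contains(VE, contraction(M, E, P)))

print(extension_exists(np.eye(2)))              # -> True
print(extension_exists(np.array([[1., 1.]])))   # -> False (misses V_EP x E)
```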
Further Decomposition of Components

Let $(\mathcal{V}_{E_1P_1}, \ldots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ be a decomposition of $\mathcal{V}_E$. It would often be convenient to further decompose the components $\mathcal{V}_{E_jP_j}$. There are two ways in which this could be done:

We could perform an $m_j$-multiport decomposition of $\mathcal{V}_{E_jP_j}$, $j = 1, \ldots, k$. In this case while decomposing $\mathcal{V}_{E_jP_j}$ we would treat $E_j$ and $P_j$ the same way, i.e., not distinguish between them.

We could try another kind of decomposition in which the final ports $P_j$ do not appear in the individual components $\mathcal{V}_{E_{ji}Q_{ji}}$ but only in the port connection diagram $\mathcal{V}_{QP}$ (see Figure 8.1).
Figure 8.1: Decomposition of a Multiport
The latter is encountered more often in network theory when electrical multiports are decomposed. For instance, in network theory, there are procedures for 2-port synthesis where simpler electrical 2-ports are first built and their ports then connected together to form the final electrical 2-port. In Figure 8.1, two component multiports on $\mathcal{G}_{E_1Q_1}$ and $\mathcal{G}_{E_2Q_2}$ are connected according to the port connection diagram $\mathcal{G}_{QP}$ to yield the final multiport $\mathcal{G}_{EP}$. The reader would notice that this corresponds to the series connection of the two 2-ports. We formally define the 'decomposition of component multiports', as opposed to 'decomposition of vector spaces', below. The reader may, if he so wishes, identify $\mathcal{V}_{EP}$ with one of the $\mathcal{V}_{E_jP_j}$ in the decomposition of a vector space $\mathcal{V}_E$. Let $\mathcal{V}_{EP}$ be a vector space on $E \uplus P$. The ordered pair $(\mathcal{V}_{EP}, P)$ is called a vector space on $E \uplus P$ with ports $P$. More briefly we might say $\mathcal{V}_{EP}$ is a vector space on $E \uplus P$ with ports $P$. We say $(\mathcal{V}_{E_1Q_1}, \ldots, \mathcal{V}_{E_kQ_k}; \mathcal{V}_{QP})$, where $Q \cap (E \uplus P) = \emptyset$, $\uplus Q_j = Q$, is a matched k-multiport decomposition of $(\mathcal{V}_{EP}, P)$ iff
$$\mathcal{V}_{EP} = \Big(\bigoplus_j \mathcal{V}_{E_jQ_j}\Big) \leftrightarrow \mathcal{V}_{QP}.$$
We say it is a skewed k-multiport decomposition of $(\mathcal{V}_{EP}, P)$ iff the analogous equation holds with the skewed generalized minor $\rightleftharpoons$ in place of the matched one, i.e., $\mathcal{V}_{EP} = (\bigoplus_j \mathcal{V}_{E_jQ_j}) \rightleftharpoons \mathcal{V}_{QP}$.

Thus the notion of decomposition of component multiports may be used to decompose a vector space hierarchically. Suppose we have $\mathcal{V}_E = (\bigoplus_j \mathcal{V}_{E_jP_j}) \leftrightarrow \mathcal{V}_P$. We could then further decompose the components $\mathcal{V}_{E_jP_j}$ as $(\mathcal{V}_{E_{j1}Q_{j1}}, \ldots, \mathcal{V}_{E_{jm_j}Q_{jm_j}}; \mathcal{V}_{Q_jP_j})$, i.e., $\mathcal{V}_{E_jP_j} = (\bigoplus_i \mathcal{V}_{E_{ji}Q_{ji}}) \leftrightarrow \mathcal{V}_{Q_jP_j}$.

Remark: As in the case of decomposition of vector spaces, we usually write $((\mathcal{V}_{E_jQ_j})_k; \mathcal{V}_{QP})$ instead of $(\mathcal{V}_{E_1Q_1}, \ldots, \mathcal{V}_{E_kQ_k}; \mathcal{V}_{QP})$. Again as in the case of decomposition of vector spaces, the $\mathcal{V}_{E_jQ_j}$ would be called components and $\mathcal{V}_{QP}$ the coupler. We say a k-multiport decomposition of $(\mathcal{V}_{EP}, P)$ is graphic if the components and the coupler space are voltage spaces of graphs. We now have
Theorem 8.2.4 Let $\mathcal{V}_{EP}$ be a vector space with ports $P$. Then $(\mathcal{V}_{E_1Q_1}, \ldots, \mathcal{V}_{E_kQ_k}; \mathcal{V}_{QP})$ is a matched k-multiport decomposition of $\mathcal{V}_{EP}$ iff $(\mathcal{V}_{E_1Q_1}^\perp, \ldots, \mathcal{V}_{E_kQ_k}^\perp; \mathcal{V}_{QP}^\perp)$ is a skewed k-multiport decomposition of $\mathcal{V}_{EP}^\perp$.

Proof: We use Corollary 7.1.1. We have in general (with $\rightleftharpoons$ denoting the skewed generalized minor),
$$(\mathcal{V}_{EQ} \leftrightarrow \mathcal{V}_{QP})^\perp = \mathcal{V}_{EQ}^\perp \rightleftharpoons \mathcal{V}_{QP}^\perp.$$
Hence,
$$\Big(\Big(\bigoplus_j \mathcal{V}_{E_jQ_j}\Big) \leftrightarrow \mathcal{V}_{QP}\Big)^\perp = \Big(\bigoplus_j \mathcal{V}_{E_jQ_j}^\perp\Big) \rightleftharpoons \mathcal{V}_{QP}^\perp,$$
and the theorem follows. $\square$
Exercise 8.1 Number of ports needed to access a graph: Let $\mathcal{G}$ be a connected graph on edge set $E_1$. Suppose $\mathcal{G}$ were a subgraph of another graph $\mathcal{G}'$ on the set of edges $E'$.
i. If nothing is known about the structure of $\mathcal{G}'$, show that the minimum number of ports $P_1$ we require, in a multiport decomposition of $\mathcal{V}_v(\mathcal{G}')$ with respect to a partition $\{E_1, E_2, \ldots, E_k\}$ of $E'$, is equal to $r(\mathcal{G})$, and that these ports can in general be arranged as the copy of a tree of $\mathcal{G}$.
ii. What if $\mathcal{G}$ had $p$ connected components?
iii. Repeat for a multiport decomposition of $\mathcal{V}_i(\mathcal{G}')$.
Exercise 8.2 Essential information about one part of a solution for the remaining part: Consider the complementary orthogonal linear equations
$$(A_1 \;\; A_2) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 0 \qquad (*)$$
$$(B_1 \;\; B_2) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 0 \qquad (\dagger)$$
Suppose $x_1 = \hat{x}_1$. Let $S_2(\hat{x}_1)$ be the collection of all $x_2$ s.t. $(\hat{x}_1, x_2)$ is a solution of Equations (*). To determine $S_2(\hat{x}_1)$ we usually would not have to know individual entries of $\hat{x}_1$.
i. Show that it is sufficient to know the image $p_1$ of $x_1$ through an appropriate linear transformation, where the number of entries of the vector $p_1 = r(\mathcal{V} \cdot E_1) - r(\mathcal{V} \times E_1)$, $\mathcal{V}$ denoting the row space of $A$ and $E_1$ the columns corresponding to $x_1$. Also show that if the vector $p_1$ is obtained by a linear transformation of $x_1$, then it cannot have fewer entries than the above.
ii. Repeat for the complementary orthogonal set of equations (Equations $(\dagger)$).
iii. How is this notion related to multiport decomposition?
Exercise 8.3 Counter-intuitive behaviour of decomposition: Give an example of a graph $\mathcal{G}$ and a multiport decomposition $(\mathcal{G}_{E_1P_1}, \mathcal{G}_{E_2P_2}, \ldots; \mathcal{G}_P)$ such that if the graphs $\mathcal{G}_{E_1P_1}, \mathcal{G}_{E_2P_2}, \ldots$ are connected along their ports according to the port connection diagram $\mathcal{G}_P$, we do not get back a graph with the same voltage space as $\mathcal{G}$.

Exercise 8.4 Violation of port conditions after connection: In some practical synthesis procedures, one first synthesizes 'component' multiports, which are put together later to construct the desired larger multiport. It is however necessary to check, after connection, whether the component multiports continue to satisfy the port conditions they were originally assumed to satisfy. In Figure 8.2, $\mathcal{G}_{E_1Q_1}, \mathcal{G}_{E_2Q_2}$ are put together according to $\mathcal{G}_{QP}$ to yield $\mathcal{G}_{EP}$.
i. Check if the port conditions of $\mathcal{G}_{E_1Q_1}, \mathcal{G}_{E_2Q_2}$ are satisfied in $\mathcal{G}_{EP}$.
ii. [Narayanan85b] Give a general procedure for testing whether port conditions of component multiports are satisfied in the combined multiport.
Exercise 8.5 The decomposition of generalised minors of $\mathcal{V}_E$: Let $(\mathcal{V}_{E_1P_1}, \ldots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ be a k-multiport decomposition of $\mathcal{V}_E$. Let $Q \subseteq E$ and let $\mathcal{V}_Q = \mathcal{V}_{Q_1} \oplus \cdots \oplus \mathcal{V}_{Q_k}$, where $Q_j = Q \cap E_j$, $j = 1, \ldots, k$. Then, $\mathcal{V}_E \leftrightarrow \mathcal{V}_Q$ has the decomposition $((\mathcal{V}_{E_1P_1} \leftrightarrow \mathcal{V}_{Q_1}), \ldots, (\mathcal{V}_{E_kP_k} \leftrightarrow \mathcal{V}_{Q_k}); \mathcal{V}_P)$. Hence, if $T \subseteq S \subseteq E$, $\mathcal{V}_E \times S \cdot T$ has the decomposition $((\mathcal{V}_{E_jP_j} \times (S_j \cup P_j) \cdot (T_j \cup P_j))_k; \mathcal{V}_P)$, where $S_j \equiv S \cap E_j$ and $T_j \equiv T \cap E_j$.
Exercise 8.6 If $P_i$ are separators then $E_i$ are separators: Let $\mathcal{V}_P$ have $P_1, \ldots, P_k$ as separators. Then $(\bigoplus_j \mathcal{V}_{E_jP_j}) \leftrightarrow \mathcal{V}_P$ has $E_1, \ldots, E_k$ as separators.

Figure 8.2: Testing Port Conditions of Component Multiports
Exercise 8.7 Compatible decomposition - minors of $\mathcal{V}_E$ that can be obtained through minors of $\mathcal{V}_P$: Let $((\mathcal{V}_{E_jP_j})_k; \mathcal{V}_P)$ be a compatible decomposition of $\mathcal{V}_E$. Let $I_1 \subseteq \{1, \ldots, k\}$. Let $P_{I_1} = \bigcup_{j \in I_1} P_j$ and $E_{I_1} = \bigcup_{j \in I_1} E_j$. Then
i. $\big(\big(\bigoplus_{j \in I_1} \mathcal{V}_{E_jP_j}\big) \leftrightarrow \mathcal{V}_P \cdot P_{I_1}\big) = \mathcal{V}_E \cdot E_{I_1}$;
ii. $\big(\big(\bigoplus_{j \in I_1} \mathcal{V}_{E_jP_j}\big) \leftrightarrow \mathcal{V}_P \times P_{I_1}\big) = \mathcal{V}_E \times E_{I_1}$;
iii. $\big(\big(\bigoplus_{j \in I_2} \mathcal{V}_{E_jP_j}\big) \leftrightarrow \mathcal{V}_P \times P_{I_1} \cdot P_{I_2}\big) = \mathcal{V}_E \times E_{I_1} \cdot E_{I_2}$, where $I_2 \subseteq I_1 \subseteq \{1, \ldots, k\}$;
iv. In each of the above cases the derived decomposition is also compatible.

Exercise 8.8 Counterintuitive behaviour of decomposition of a multiport: Give an example of a graphic decomposition of a vector space $\mathcal{V}_{EP}$ with ports $P$ such that when the graphs $\mathcal{G}_{E_jQ_j}$ are connected according to $\mathcal{G}_{QP}$, the resulting graph $\mathcal{G}_{EP}'$ does not have $\mathcal{V}_{EP}$ as its voltage space.

Exercise 8.9 Flattening a hierarchical multiport decomposition: Let $\mathcal{M} = (\mathcal{V}_{E_1P_1}, \ldots, \mathcal{V}_{E_kP_k}; \mathcal{V}_{PQ})$ be a matched k-multiport decomposition of the
vector space $\mathcal{V}_{EQ}$ with ports $Q$. Let $\mathcal{V}_{E_jP_j}$ with ports $P_j$ have a matched $m_j$-multiport decomposition $((\mathcal{V}_{E_{ji}T_{ji}})_{m_j}; \mathcal{V}_{T_jP_j})$. Show that $\mathcal{V}_{EQ}$ has the matched $\sum_j m_j$-multiport decomposition $(\ldots, \mathcal{V}_{E_{ji}T_{ji}}, \ldots; \mathcal{V}_{TQ})$, where $\mathcal{V}_{TQ} = (\bigoplus_j \mathcal{V}_{T_jP_j}) \leftrightarrow \mathcal{V}_{PQ}$.
8.3 Analysis through Multiport Decomposition
Let $\mathcal{N}$ be an electrical network on the directed graph $\mathcal{G}$. Let $E(\mathcal{G})$ be partitioned into subsets $E_1, \ldots, E_k$ which are mutually decoupled in the device characteristic of the network. Let $(\mathcal{V}_{E_1P_1}, \ldots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ be a multiport decomposition of $\mathcal{V}_v(\mathcal{G})$. We now describe a scheme for analyzing the network $\mathcal{N}$ using the above multiport decomposition. The procedure is valid for a network with arbitrary devices, but we illustrate our ideas through a linear network since computationally this is the most important case.
8.3.1 Rewriting Network Constraints in the Multiport Form
Let the device characteristic of the network $\mathcal{N}$ be
$$M_j(i_{E_j} - j_j) + N_j(v_{E_j} - e_j) = 0, \quad j = 1, \ldots, k, \qquad (8.1)$$
where $i_{E_j}, v_{E_j}$ denote the current and voltage vectors respectively associated with the edge subset $E_j$. Let $\mathcal{V}_E$ be the voltage space of the graph $\mathcal{G}$. Let the representative matrix of the space $\mathcal{V}_{E_jP_j}$ be $(R_j \;\; R_{P_j})$, $j = 1, \ldots, k$, and of the space $\mathcal{V}_P$ be $R_P$. Thus, the KVL constraints can be written equivalently, as far as the variables $v_E (= v_{E_1} \oplus \cdots \oplus v_{E_k})$ are concerned, as follows (with $\lambda_j$, $\sigma$ auxiliary coefficient vectors):
$$\begin{pmatrix} v_{E_j} \\ v_{P_j} \end{pmatrix} - \begin{pmatrix} R_j^T \\ R_{P_j}^T \end{pmatrix} \lambda_j = 0, \quad j = 1, \ldots, k \qquad (8.2)$$
$$v_P - R_P^T \sigma = 0. \qquad (8.3)$$
We note that $v_P = (v_{P_1} \oplus \cdots \oplus v_{P_k})$ and $R_P = (R_{P_1} \cdots R_{P_k})$ (the column partition of $R_P$ according to the $P_j$). We know by Theorem 8.2.1 that $\mathcal{V}_i(\mathcal{G})$ has the multiport decomposition $(\mathcal{V}_{E_1P_1}^\perp, \ldots, \mathcal{V}_{E_kP_k}^\perp; \mathcal{V}_P^\perp)$. Hence, the KCE of $\mathcal{N}$ may be written equivalently, as far as the variables $i_E (= i_{E_1} \oplus \cdots \oplus i_{E_k})$ are concerned, as follows:
$$(R_j \;\; R_{P_j}) \begin{pmatrix} i_{E_j} \\ i_{P_j} \end{pmatrix} = 0, \quad j = 1, \ldots, k \qquad (8.4)$$
$$R_P i_P = 0. \qquad (8.5)$$
We note that $i_P = (i_{P_1} \oplus \cdots \oplus i_{P_k})$. Equations 8.1, 8.2, 8.3, 8.4 and 8.5 are together equivalent to the constraints of the network $\mathcal{N}$ as far as the variables $i_E, v_E$ are concerned.
For convenience we rearrange these equations according to multiports and the coupler as follows:
$$(R_j \;\; R_{P_j}) \begin{pmatrix} i_{E_j} \\ i_{P_j} \end{pmatrix} = 0 \qquad (8.6)$$
$$\begin{pmatrix} v_{E_j} \\ v_{P_j} \end{pmatrix} - \begin{pmatrix} R_j^T \\ R_{P_j}^T \end{pmatrix} \lambda_j = 0 \qquad (8.7)$$
$$M_j(i_{E_j} - j_j) + N_j(v_{E_j} - e_j) = 0, \quad j = 1, \ldots, k \qquad (8.8)$$
$$R_P i_P = 0 \qquad (8.9)$$
$$v_P - R_P^T \sigma = 0. \qquad (8.10)$$
We may regard the equations 8.6 through 8.8 as consisting of the KCL, KVL and device characteristic constraints of the electrical multiport networks $\mathcal{N}_{E_jP_j}$, $j = 1, \ldots, k$. We remind the reader that informally an electrical multiport (multiport for short) is a network with some devices, which are norators, specified as ports. It is in this sense that we use this word when we talk of solution of multiports henceforth.
8.3.2 An Intuitive Procedure for Solution through Multiports

Let us assume that each of the electrical multiports $\mathcal{N}_{E_jP_j}$ can be uniquely solved for arbitrary values of $(i_{P_{j1}}, v_{P_{j2}})$ for some partition $\{P_{j1}, P_{j2}\}$ of $P_j$. The natural way of solving equations 8.6 to 8.10 is as follows:

STEP 1: Solve $\mathcal{N}_{E_jP_j}$, $j = 1, \ldots, k$,
(a) setting all entries of $(i_{P_{j1}}, v_{P_{j2}})$ equal to zero; let $(v_{E_jP_j}^0, i_{E_jP_j}^0)$ be the corresponding solution;
(b) setting the source values $e_j, j_j$ equal to zero and setting one entry of $(i_{P_{j1}}, v_{P_{j2}})$ equal to one in turn and the rest to zero. If $P_j$ has $k_j$ elements we would have $k_j$ solutions $(v_{E_jP_j}^t, i_{E_jP_j}^t)$, $t = 1, \ldots, k_j$;
(c) write the general solution (denoting $k_j$ by $q$)
$$v_{E_jP_j} = v_{E_jP_j}^0 + \sum_{t=1}^{q} z^t v_{E_jP_j}^t \qquad (8.11)$$
$$i_{E_jP_j} = i_{E_jP_j}^0 + \sum_{t=1}^{q} z^t i_{E_jP_j}^t. \qquad (8.12)$$
(Here $z^1, \ldots, z^q$ are the entries of $(i_{P_{j1}}, v_{P_{j2}})$.) Thus the $z$ variables, appearing in the RHS of Equation 8.12, also appear as some of the variables on the left. So the
equation can be rewritten involving only the variables in $v_{P_j}$ and in $i_{P_j}$. Thus the Equation 8.12 has the form of a device characteristic on the set of edges $P_j$. We will call this equation the equivalent device characteristic of $P_j$.

STEP 2: Combine all the equivalent device characteristics of $P_j$, $j = 1, \ldots, k$ (called $\mathcal{D}$, say), with the KCL and KVL constraints of the coupler given in equations 8.9 and 8.10 and solve the resultant 'generalized network' $\mathcal{N}_P$ on $P$. Let $(v_P^f, i_P^f)$ be the solution of this network.

STEP 3: Substitute appropriate entries of $(v_P^f, i_P^f)$ in $(i_{P_{j1}}, v_{P_{j2}})$ (i.e., in the $z$ variables) of Equation 8.11. STOP.

Remark: We define a generalized network $\mathcal{N}_P$ to be a pair $(\mathcal{V}, \mathcal{D})$ where $\mathcal{V}$ is a real vector space on $P$ and $\mathcal{D}$ is a device characteristic on $P$ as in Definition 6.2.1. The space $\mathcal{V}$ takes the place of $\mathcal{V}_v(\mathcal{G})$ of an ordinary network. A solution is a pair of vectors $(v(\cdot), i(\cdot))$ s.t. $v(t) \in \mathcal{V} \; \forall t \in \Re$, $i(t) \in \mathcal{V}^\perp \; \forall t \in \Re$ and $(v(\cdot), i(\cdot)) \in \mathcal{D}$.
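For a linear resistive device characteristic, the generalized network of the Remark can be solved directly: parametrize $v = B^T x$ with $B$ a representative matrix of $\mathcal{V}$, and $i = C^T y$ with $C$ a representative matrix of $\mathcal{V}^\perp$, and substitute into device equations $M i + N v = s$. The sketch below is our own illustration of this idea (the names $B, C, M, N, s$ and the two-edge example are assumptions, not the book's notation).

```python
import numpy as np

def solve_generalized_network(B, C, M, N, s):
    """Solve v in rowspace(B), i in rowspace(C), with M @ i + N @ v = s."""
    # unknowns: x (coefficients of v), y (coefficients of i)
    A = np.hstack([N @ B.T, M @ C.T])
    sol, *_ = np.linalg.lstsq(A, s, rcond=None)
    x, y = sol[:B.shape[0]], sol[B.shape[0]:]
    return B.T @ x, C.T @ y                 # (v, i)

# two parallel edges: V = span{(1,1)}, V_perp = span{(1,-1)}
B = np.array([[1., 1.]])
C = np.array([[1., -1.]])
# devices: v1 - 2*i1 = 1 (resistor 2 ohm in series with a 1 V source), v2 - i2 = 0
M = np.diag([-2., -1.])
N = np.eye(2)
s = np.array([1., 0.])
v, i = solve_generalized_network(B, C, M, N, s)
print(v, i)   # v is approximately [1/3, 1/3], i approximately [-1/3, 1/3]
```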
Detailed description of STEPS 1(a) and 1(b)

In subsequent sections we describe a procedure for port minimization which allows the electrical multiport $\mathcal{N}_{E_jP_j}$ to have a graph structure (i.e., $\mathcal{V}_{E_jP_j}$ can be chosen to be the voltage space of a graph $\mathcal{G}_{E_jP_j}$). We now go into the details of steps 1(a) and 1(b) assuming this. (It should be noted however that graph structure is not as important as the sparsity that is its consequence.) We also simplify the notation as follows: $S \equiv E_j$, $T \equiv P_j$, $T_1 \equiv P_{j1}$, $T_2 \equiv P_{j2}$, $\mathcal{G} \equiv \mathcal{G}_{E_jP_j}$, $\mathcal{V}_{ST} \equiv \mathcal{V}_{E_jP_j}$, $\mathcal{G}_1 \equiv \mathcal{G} \times (S \cup T_1) \cdot S$.

Select a reduced incidence matrix $A_1$ of $\mathcal{G} \times (S \cup T_1)$. When port minimization is done it would follow that in the graph $\mathcal{G}_{E_jP_j}$ the edges $P_j$ would have no circuit or cutset. So we may assume that in the graph $\mathcal{G} \times (S \cup T_1)$ there is a forest $t$ which does not intersect $T_1$. Let $A_2$ be an f-cutset matrix with respect to the forest $T_2$ of $\mathcal{G} \times ((S - t) \cup T)$. Let $\bar{A}_1, \bar{A}_2$ be the matrices obtained from $A_1, A_2$ by lengthening the rows, padding them with zeros, so that they become vectors on $S \uplus T$. Let us partition the matrix $\begin{pmatrix} \bar{A}_1 \\ \bar{A}_2 \end{pmatrix}$ as
$$\begin{pmatrix} A_{1S} & A_{1T_1} & 0 \\ A_{2S} & A_{2T_1} & A_{2T_2} \end{pmatrix}.$$
This matrix is a representative matrix of $\mathcal{V}_v(\mathcal{G})$ (see Problem 7.9). We can rewrite the constraints
of $\mathcal{N}_{E_jP_j}$ as a single sparse linear system (Equation 8.13) in the variables $v_{n_1}, v_{n_2}, i_S, i_{T_1}, i_{T_2}, v_S, v_{T_1}, v_{T_2}$: its coefficient matrix is built from the blocks $A_{1S}, A_{2S}, A_{1T_1}, A_{2T_1}, A_{2T_2}$ together with the device blocks $M, N$, and its right hand side from $M_j j_j + N_j e_j$ and the imposed values of $(i_{T_1}, v_{T_2})$.
Computational Effort for the Procedure

Let us examine the computational effort required in steps 1, 2 and 3 of the above procedure. Solving the network $\mathcal{N}_{E_jP_j}$ in STEP 1 entails the solution of a linear network on $E_j$, $(|P_j| + 1)$ times, for appropriate source distributions corresponding to (a) the actual source vectors $j_j, e_j$ and (b) setting one term at a time of $(i_{P_{j1}}, v_{P_{j2}})$ to value 1 and the rest, as well as the source terms, to zero. We expect this step to be the most expensive computationally because of the size of the $E_j$'s. In practice the partition $\{E_1, \ldots, E_k\}$ of $E$ can usually be chosen (using heuristics) such that $|P|$ is less than about 5% of $|E|$ (assuming $|E| > 10000$). Hence, the effort involved in
Figure 8.4: Graph of a Multiport
STEP 2, i.e., in computing the solution of $\mathcal{N}_P$, should be regarded as negligible in comparison with the effort involved in solving the $\mathcal{N}_{E_jP_j}$ repeatedly.

Exercise 8.10 Structure of constraints of electrical multiports during solution: The method of multiport decomposition gives us the additional freedom of imposing appropriate structure on the equations corresponding to $\mathcal{N}_{E_jP_j}$ (on $\mathcal{V}_{E_jP_j}$) by solving in terms of $i_{P_{j1}}, v_{P_{j2}}$.
i. What is this structure?
ii. What is the best partition $\{P_{j1}, P_{j2}\}$ for the multiport in Figure 8.4? Assume that all the edges are decoupled in the device characteristic.
iii. For a given partition $\{E_1, \ldots, E_k\}$, what is the complete range of structures possible for the electrical multiport constraint equations, through variation of the port sets (through nonsingular transformation) as well as the partition of ports $P_j$ into $P_{j1}, P_{j2}$?
Remark: Let us suppose that we use LU factorisation to solve the linear equations. When we solve the network $\mathcal{N}_{E_jP_j}$ for various source distributions, the coefficient matrix, i.e., the core submatrix composed of the first three sets of rows and the first three sets of columns of Equation 8.13, remains the same. So the LU factorisation of this submatrix has to be done only once. We have to solve an equation, say $(L_j U_j) x_j = b$, for $(|P_j| + 1)$ values of $b$. For each of these values of $b$ we have to premultiply it by $L_j^{-1}$ and then premultiply the result of the multiplication by $U_j^{-1}$. Thus when we say solve $\mathcal{N}_{E_jP_j}$ $(|P_j| + 1)$ times we are really speaking of an upper bound of the effort involved.
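The factor-once, solve-many-times point of the Remark can be sketched as follows. This is our own minimal illustration (a textbook outer-product LU with partial pivoting, not the book's implementation): the $O(n^3)$ factorisation is done once, and each additional right-hand side costs only two triangular solves.

```python
import numpy as np

def lu_factor(A):
    """LU with partial pivoting: returns L, U, perm with A[perm] = L @ U."""
    n = A.shape[0]
    W = A.astype(float).copy()
    perm = np.arange(n)
    for k in range(n - 1):
        p = k + np.argmax(np.abs(W[k:, k]))       # partial pivot
        W[[k, p]] = W[[p, k]]
        perm[[k, p]] = perm[[p, k]]
        W[k+1:, k] /= W[k, k]                     # multipliers
        W[k+1:, k+1:] -= np.outer(W[k+1:, k], W[k, k+1:])
    return np.tril(W, -1) + np.eye(n), np.triu(W), perm

def lu_solve(L, U, perm, b):
    """Two triangular solves, O(n^2) each, reusing the factorisation."""
    y = np.linalg.solve(L, b[perm])               # forward substitution
    return np.linalg.solve(U, y)                  # back substitution

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 20)) + 20 * np.eye(20)
L, U, perm = lu_factor(A)                         # factor once ...
for _ in range(5):                                # ... solve for many right-hand sides
    b = rng.standard_normal(20)
    x = lu_solve(L, U, perm, b)
    assert np.allclose(A @ x, b)
```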
8.4 Port Minimization
The discussion in the previous section suggests that port minimization (see the definition at the beginning of Section 8.2) is useful since

i. the number of times each of the electrical multiports $\mathcal{N}_{E_jP_j}$ has to be solved equals $(|P_j|+1)$, and

ii. if the ports are not minimized, they may contain circuits or cutsets, so that the imposed port conditions cannot be treated as independent; our procedure of setting one entry of $(i_{P_{j1}}, v_{P_{j2}})$ equal to one and the rest to zero would then not be feasible.
However, port minimization is relevant to network analysis only if it can itself be carried out in near-linear time. In the present section we give a few minimization algorithms. Two of these algorithms are very fast and can be used during the preprocessing stage of network analysis.
8.4.1 An Algorithm for Port Minimization
We begin with a general algorithm based on vector spaces [Narayanan86a], [Narayanan87]. The reader might like to review Section 3.4, particularly Subsections 3.4.2 and 3.4.5 and Exercises 3.53 and 3.54.
ALGORITHM 8.1 PORT MINIMIZATION 1

INPUT A representative matrix $R$ of the vector space $\mathcal{V}_E$ on $E$ and a partition $\{E_1, \dots, E_k\}$ of $E$.

OUTPUT Representative matrices $(R_j\ \vdots\ R_{P_j})$ of spaces $\mathcal{V}_{E_jP_j}$, $j = 1, \dots, k$, and a representative matrix $R_P$ of a space $\mathcal{V}_P$, where $(\mathcal{V}_{E_1P_1}, \dots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ is a $k$-multiport decomposition of $\mathcal{V}_E$ such that $|P|$ is a minimum.
STEP 1 Compute a representative matrix of $\mathcal{V}_E$ of the form

$$R = \begin{bmatrix} R_{11} & & & \\ & R_{22} & & \\ & & \ddots & \\ & & & R_{kk} \\ R_{(k+1)1} & R_{(k+1)2} & \cdots & R_{(k+1)k} \end{bmatrix},$$

where $R_{jj}$ is a representative matrix of $\mathcal{V}_E \times E_j$, $j = 1, \dots, k$, and the last set of rows extends the rows above it to a basis of $\mathcal{V}_E$.

STEP 2 For $j = 1, \dots, k$, compute the RRE of $R_{(k+1)j}$ and let $T_j$ be the set of columns corresponding to the unit matrix of appropriate order appearing in it; the columns $T_j$ form a maximal linearly independent subset of the columns of $R_{(k+1)j}$. (Observe that $T_j$ has $r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j)$ elements.) Let $\hat R_{(k+1)T_j}$ be the full row submatrix of $R_{(k+1)j}$ corresponding to $T_j$. Let $R_{P_j} \equiv \hat R_{(k+1)T_j}$, $j = 1, \dots, k$; the representative matrix of $\mathcal{V}_{E_jP_j}$ is then

$$(R_j\ \vdots\ R_{P_j}) = \begin{bmatrix} R_{jj} & 0 \\ R_{(k+1)j} & \hat R_{(k+1)T_j} \end{bmatrix}.$$

STEP 3 Take the representative matrix $R_P$ to be $(R_{P_1}\ \vdots\ \cdots\ \vdots\ R_{P_k})$.

STOP
Complexity of Algorithm (Port Minimization 1)

In STEP 1, to compute the matrix $R$ for the case where $k = 2$, we could proceed as follows. For the submatrix of $R$ composed of the columns $E_1$ and all rows, build the RRE, but extend the row operations to all columns. This takes $O(r_1 r |E|)$ operations, where $r_1 = r(\mathcal{V}_E \cdot E_1)$, $r = r(\mathcal{V}_E)$. The result is a matrix of the form

$$A = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix},$$

where $A_{11}$ has $r_1$ rows. For the matrix $A_{12}$ compute the RRE, but extend the row operations to all columns. This converts $A_1 \equiv (A_{11}\ A_{12})$ to $A_1'$, in which the rows with zero entries in the columns $E - E_1$ appear at the bottom. This takes $O(r_1^2 |E|)$ steps. At the end of these steps we have the matrix $A'$ shown below:

$$A' = \begin{bmatrix} A_{11}' & 0 \\ A_{21}' & A_{22} \end{bmatrix},$$

where the rows of $A_{11}'$ form a representative matrix of $\mathcal{V}_E \times E_1$. To compute $A'$ from $R$, as we have shown, takes $O(r_1 r |E|)$ steps. Now if $k > 2$ we proceed as above, but the columns of $A_{22}$ correspond to $E - E_1$, and we have to repeat the procedure on $A_{22}$ recursively. To break up $A_{22}$ as above takes $O(r_2 r |E - E_1|)$ steps, where $r_2 = r(\mathcal{V}_E \times (E - E_1) \cdot E_2)$. Repeating this procedure we see that the computation of $R$ takes

$$O(r_1 r |E| + r_2 r |E - E_1| + \cdots + r_k r |E_{k-1} \cup E_k|)$$

steps, where $r_j = r(\mathcal{V}_E \times (E - (E_1 \cup \cdots \cup E_{j-1})) \cdot E_j)$. By using Corollary 3.4.2 we see that $r_1 + \cdots + r_k = r(\mathcal{V}_E)$. Hence, the complexity of the above computation is $O(r^2 |E|)$, where $r \equiv r(\mathcal{V}_E)$.

In STEP 2 we need to compute the RRE of $R_{(k+1)j}$, $j = 1, \dots, k$. The columns $T_j$ are merely those columns corresponding to the unit matrix of appropriate order appearing in the RRE (corresponding to the rows of $R_{(k+1)j}$). Thus, this computation, for all $j = 1, \dots, k$, is $O(r^2 |E|)$. In STEP 3, we merely put together the matrices $\hat R_{(k+1)T_j}$ that are already computed. Thus, the overall complexity of Algorithm 8.1 is $O(r^2 |E|)$, where $r \equiv r(\mathcal{V}_E)$.
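The STEP 1 computation just described can be sketched in a few lines. The routine below is our own illustrative implementation (floating point with a tolerance-based pivot test, in place of exact arithmetic): it row-reduces a representative matrix using pivots only from the columns $E_1$, extending each row operation across the whole matrix, so that the rows below the pivot rows vanish on $E_1$ — exactly the block structure $\begin{bmatrix} A_{11} & A_{12}\\ 0 & A_{22}\end{bmatrix}$ used above.

```python
import numpy as np

def rre_on_columns(A, cols, tol=1e-9):
    """Row-reduce A choosing pivots only among `cols`, extending the
    row operations to all columns (a sketch of STEP 1 for k = 2)."""
    A = A.astype(float).copy()
    p = 0                                     # next pivot row
    for c in cols:
        nz = np.where(np.abs(A[p:, c]) > tol)[0]
        if nz.size == 0:
            continue
        r = p + nz[0]
        A[[p, r]] = A[[r, p]]                 # bring the pivot row up
        A[p] /= A[p, c]                       # normalise the pivot
        for i in range(A.shape[0]):
            if i != p:
                A[i] -= A[i, c] * A[p]        # clear column c elsewhere
        p += 1
    return A, p                               # p = r(V_E . E_1)

R = np.array([[1., 2., 0., 1., 0.],
              [0., 1., 1., 0., 1.],
              [1., 0., 1., 1., 1.],
              [0., 0., 0., 1., 1.]])
E1 = [0, 1]                                   # columns of E_1
A1, r1 = rre_on_columns(R, E1)
# rows r1.. of A1 now vanish on E_1: they form the block (0  A22)
```

Recursing on the trailing block $A_{22}$ (the columns $E - E_1$) then yields the full matrix $R$ of STEP 1.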
Remark: Algorithm 8.1 can be easily adapted to the case where $\mathcal{V}_E$ is the voltage space of a graph, so that it is near-linear time and all the matrices generated are sparse. Further, for this case it produces a coupler space $\mathcal{V}_P$ which is the voltage space of a graph. The component spaces $\mathcal{V}_{E_jP_j}$, however, are each the sum of the voltage spaces of two graphs. Algorithm 8.2, described later, produces component spaces which are voltage spaces of graphs, but a coupler space which is not the voltage space of a graph.
Justification of Algorithm (Port Minimization 1)
We now proceed to show that the output of Algorithm 8.1 is a minimal $k$-multiport decomposition of $\mathcal{V}_E$. We first show that

$$\mathcal{V}_E = \Big(\bigoplus_j \mathcal{V}_{E_jP_j}\Big) \leftrightarrow \mathcal{V}_P.$$

Let $f_E \in \mathcal{V}_E$. We denote $f_E/E_j$ by $f_{E_j}$. Then $f_E = (\lambda_1, \dots, \lambda_k, \lambda_{k+1})R$ for an appropriately partitioned row vector, so that

$$f_{E_j} = \lambda_j R_{jj} + \lambda_{k+1} R_{(k+1)j}, \quad j = 1, \dots, k.$$

Observe that, by the structure of the representative matrix $R$, when $j$ varies, $\lambda_j$ would vary but $\lambda_{k+1}$ would remain fixed. Let

$$f_{P_j} \equiv \lambda_{k+1} R_{P_j}, \quad j = 1, \dots, k, \qquad f_P \equiv \lambda_{k+1}(R_{P_1}\ \vdots\ \cdots\ \vdots\ R_{P_k}).$$

Hence, taking $f_{E_jP_j} \equiv f_{E_j} \oplus f_{P_j} \in \mathcal{V}_{E_jP_j}$, it is now clear that $\bigoplus_j f_{E_jP_j} - f_P = f_E \oplus 0_P$. Hence,

$$f_E \in \Big(\Big(\bigoplus_j \mathcal{V}_{E_jP_j}\Big) \leftrightarrow \mathcal{V}_P\Big).$$

On the other hand, let $f_E \in \big(\big(\bigoplus_j \mathcal{V}_{E_jP_j}\big) \leftrightarrow \mathcal{V}_P\big)$. Then there exist $f_{E_jP_j} \in \mathcal{V}_{E_jP_j}$, $j = 1, \dots, k$, and $f_P \in \mathcal{V}_P$ such that

$$f_{E_jP_j}/E_j = f_E/E_j, \qquad f_{E_jP_j}/P_j = f_P/P_j, \quad j = 1, \dots, k.$$

Write $f_{E_jP_j} = \mu_j(R_{jj}\ \vdots\ 0) + \nu_j(R_{(k+1)j}\ \vdots\ \hat R_{(k+1)T_j})$ and $f_P = \sigma(R_{P_1}\ \vdots\ \cdots\ \vdots\ R_{P_k})$. The port conditions give $(\nu_j - \sigma)\hat R_{(k+1)T_j} = 0$, and since the columns corresponding to $T_j$ span the columns of $R_{(k+1)j}$, we must have

$$\nu_j \hat R_{(k+1)T_j} = \sigma \hat R_{(k+1)T_j}$$

as well as

$$\nu_j R_{(k+1)j} = \sigma R_{(k+1)j}.$$

Hence,

$$f_E = (\mu_1, \dots, \mu_k, \sigma)R.$$

But the RHS of the above equation is clearly a linear combination of the rows of $R$ and therefore belongs to $\mathcal{V}_E$.

Next we need to show that the $k$-multiport decomposition that is generated by the algorithm is minimal. It is clear that in the decomposition generated by the algorithm
$$|P_j| = r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j), \quad j = 1, \dots, k.$$

Lemma 8.4.1, proved below, assures us that for every $k$-multiport decomposition $|P_j|$ cannot be less than the above RHS. The minimality of the decomposition follows.

Lemma 8.4.1 Let $(\mathcal{V}_{E_1P_1}, \dots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ be a $k$-multiport decomposition of $\mathcal{V}_E$. Then $|P_j| \geq r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j)$, $j = 1, \dots, k$.
Proof: By Corollary 3.4.2,

$$r(\mathcal{V}_{E_jP_j} \cdot E_j) - r(\mathcal{V}_{E_jP_j} \times E_j) = r(\mathcal{V}_{E_jP_j} \cdot P_j) - r(\mathcal{V}_{E_jP_j} \times P_j) \leq |P_j|.$$

Next, by Theorem 8.2.2, we know that $\mathcal{V}_{E_jP_j} \cdot E_j \supseteq \mathcal{V}_E \cdot E_j$ and $\mathcal{V}_{E_jP_j} \times E_j \subseteq \mathcal{V}_E \times E_j$. We conclude, therefore, that

$$r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j) \leq |P_j|. \qquad \Box$$
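The two quantities in Lemma 8.4.1 are easy to compute from a representative matrix. The sketch below (a toy example of our own) uses the identity $r(\mathcal{V} \times S) = r(\mathcal{V}) - r(\mathcal{V} \cdot (E - S))$, so that the lower bound on $|P_1|$ is just a difference of matrix ranks.

```python
import numpy as np

def rank(M):
    return int(np.linalg.matrix_rank(M)) if M.size else 0

def r_dot(R, S):
    """r(V . S): rank of the columns S of a representative matrix R of V."""
    return rank(R[:, S])

def r_times(R, S):
    """r(V x S) = r(V) - r(V . (E - S)): rank of the vectors of V
    that vanish outside S."""
    comp = [j for j in range(R.shape[1]) if j not in S]
    return rank(R) - rank(R[:, comp])

# Representative matrix of a small space on E = {0,...,4}; E_1 = {0,1,2}.
R = np.array([[1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1],
              [0, 0, 0, 1, 1]])
E1 = [0, 1, 2]
lower_bound = r_dot(R, E1) - r_times(R, E1)   # any decomposition needs |P_1| >= this
```

Here $r(\mathcal{V} \cdot E_1) = 2$ and $r(\mathcal{V} \times E_1) = 1$, so at least one port is needed for $E_1$ in any $k$-multiport decomposition.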
8.4.2 Characterization of Minimal Decomposition
In Theorem 8.4.1, below, we give a number of equivalent conditions for the minimality of a $k$-multiport decomposition. A preliminary lemma is needed for the proof of this theorem.

Lemma 8.4.2 Let $(\mathcal{V}_{E_1P_1}, \dots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ be a $k$-multiport decomposition of $\mathcal{V}_E$. Then

$$r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j) \leq r(\mathcal{V}_P \cdot P_j) - r(\mathcal{V}_P \times P_j), \quad j = 1, \dots, k.$$
Proof: Let $f_{E_j}^1, \dots, f_{E_j}^s$ be a set of vectors which together with a basis of $\mathcal{V}_E \times E_j$ form a basis for $\mathcal{V}_E \cdot E_j$. Then there exist vectors $f_{P_j}^1, \dots, f_{P_j}^s$ in $\mathcal{V}_P \cdot P_j$ s.t. $f_{E_j}^1 \oplus f_{P_j}^1, \dots, f_{E_j}^s \oplus f_{P_j}^s \in \mathcal{V}_{E_jP_j}$. Suppose a nontrivial linear combination $f_{P_j}$ of $f_{P_j}^1, \dots, f_{P_j}^s$ belongs to $\mathcal{V}_P \times P_j$. Let the same linear combination of $f_{E_j}^1, \dots, f_{E_j}^s$ yield the vector $f_{E_j}$. Then $f_{E_j} \oplus f_{P_j} \in \mathcal{V}_{E_jP_j}$ with $f_{E_j} \in \mathcal{V}_E \cdot E_j - \mathcal{V}_E \times E_j$ and $f_{P_j} \in \mathcal{V}_P \times P_j$. Let $f_{EP}$ be the vector, on $E \cup P$, whose restriction to $E_j \cup P_j$ is $f_{E_j} \oplus f_{P_j}$ and whose value outside this set is zero. Let $f_P$ be the vector, on $P$, whose restriction to $P_j$ is $f_{P_j}$ and whose value outside this set is zero. Since $\mathcal{V}_E = (\bigoplus_j \mathcal{V}_{E_jP_j}) \leftrightarrow \mathcal{V}_P$, it follows that $f_{E_j} \oplus 0_{(E-E_j)}$ belongs to $\mathcal{V}_E$. Hence, $f_{E_j} \in \mathcal{V}_E \times E_j$, which is a contradiction. We conclude that $f_{P_j}^1, \dots, f_{P_j}^s$ together with a basis of $\mathcal{V}_P \times P_j$ form an independent set. The result follows immediately. $\Box$
Theorem 8.4.1 Let $(\mathcal{V}_{E_1P_1}, \dots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ be a $k$-multiport decomposition of $\mathcal{V}_E$. Then it is minimal iff the following equivalent conditions are satisfied.

i. $|P_j| = r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j)$, $j = 1, \dots, k$.

ii. $r(\mathcal{V}_{E_jP_j} \cdot E_j) = r(\mathcal{V}_E \cdot E_j)$, $r(\mathcal{V}_{E_jP_j} \times E_j) = r(\mathcal{V}_E \times E_j)$ and $r(\mathcal{V}_{E_jP_j} \times P_j) = r(\mathcal{V}_{E_jP_j}^{\perp} \times P_j) = 0$, $j = 1, \dots, k$.

iii. $r(\mathcal{V}_P \cdot P_j) = |P_j|$ and $r(\mathcal{V}_P \times P_j) = 0$, $j = 1, \dots, k$.

iv. If $(R_j\ \vdots\ R_{P_j}')$, $(B_j\ \vdots\ B_{P_j}')$ are representative matrices of $\mathcal{V}_{E_jP_j}$, $\mathcal{V}_{E_jP_j}^{\perp}$ respectively, and $(R_{P_1}\ \vdots\ \cdots\ \vdots\ R_{P_k})$, $(B_{P_1}\ \vdots\ \cdots\ \vdots\ B_{P_k})$ are representative matrices of $\mathcal{V}_P$, $\mathcal{V}_P^{\perp}$ respectively, then the matrices $R_{P_j}$, $R_{P_j}'$, $B_{P_j}$, $B_{P_j}'$ all have independent columns, $j = 1, \dots, k$.
Proof: By Lemma 8.4.1 it follows that $|P_j| \geq r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j)$, $j = 1, \dots, k$, for every $k$-multiport decomposition. But in Algorithm 8.1 we have constructed a $k$-multiport decomposition which satisfies the above condition with equality. Thus, $(\mathcal{V}_{E_1P_1}, \dots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ is a minimal $k$-multiport decomposition iff Condition (i) is satisfied.
Conditions (i) and (ii) are equivalent: Let Condition (i) hold. We have

$$|P_j| \geq r(\mathcal{V}_{E_jP_j} \cdot P_j) - r(\mathcal{V}_{E_jP_j} \times P_j) = r(\mathcal{V}_{E_jP_j} \cdot E_j) - r(\mathcal{V}_{E_jP_j} \times E_j). \qquad (*)$$

But $r(\mathcal{V}_{E_jP_j} \cdot E_j) - r(\mathcal{V}_{E_jP_j} \times E_j) \geq r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j)$ by Theorem 8.2.2. It follows that Condition (i) can hold only if we have equality in place of the inequality in (*). But $|P_j| \geq r(\mathcal{V}_{E_jP_j} \cdot P_j)$, $r(\mathcal{V}_{E_jP_j} \times P_j) \geq 0$, $r(\mathcal{V}_{E_jP_j} \cdot E_j) \geq r(\mathcal{V}_E \cdot E_j)$ and $r(\mathcal{V}_{E_jP_j} \times E_j) \leq r(\mathcal{V}_E \times E_j)$. Hence, equality holds in (*) only if $|P_j| = r(\mathcal{V}_{E_jP_j} \cdot P_j)$, $r(\mathcal{V}_{E_jP_j} \times P_j) = 0$, $r(\mathcal{V}_{E_jP_j} \cdot E_j) = r(\mathcal{V}_E \cdot E_j)$ and $r(\mathcal{V}_{E_jP_j} \times E_j) = r(\mathcal{V}_E \times E_j)$. Now $|P_j| = r(\mathcal{V}_{E_jP_j} \cdot P_j) + r(\mathcal{V}_{E_jP_j}^{\perp} \times P_j)$. So $|P_j| = r(\mathcal{V}_{E_jP_j} \cdot P_j)$ iff $r(\mathcal{V}_{E_jP_j}^{\perp} \times P_j) = 0$. This proves that Condition (i) implies Condition (ii).

Next let Condition (ii) hold. We have (by Theorem 3.4.3)

$$|P_j| = r(\mathcal{V}_{E_jP_j} \cdot P_j) + r(\mathcal{V}_{E_jP_j}^{\perp} \times P_j) = r(\mathcal{V}_{E_jP_j}^{\perp} \cdot P_j) + r(\mathcal{V}_{E_jP_j} \times P_j).$$

So we must have, using Condition (ii),

$$|P_j| = r(\mathcal{V}_{E_jP_j} \cdot P_j) - r(\mathcal{V}_{E_jP_j} \times P_j).$$

The RHS equals $r(\mathcal{V}_{E_jP_j} \cdot E_j) - r(\mathcal{V}_{E_jP_j} \times E_j)$ by Corollary 3.4.2. But this expression equals $r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j)$. Hence, Condition (i) holds.
Conditions (i) and (iii) are equivalent: Let $|P_j| = r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j)$. Now $|P_j| = r(\mathcal{V}_{E_jP_j} \cdot P_j) + r(\mathcal{V}_{E_jP_j}^{\perp} \times P_j)$. Hence, by Condition (ii), $|P_j| = r(\mathcal{V}_{E_jP_j} \cdot P_j)$ and $r(\mathcal{V}_{E_jP_j}^{\perp} \times P_j) = 0$. Further, by Lemma 8.4.2,

$$r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j) \leq r(\mathcal{V}_P \cdot P_j) - r(\mathcal{V}_P \times P_j), \quad j = 1, \dots, k,$$

i.e., $|P_j| \leq r(\mathcal{V}_P \cdot P_j) - r(\mathcal{V}_P \times P_j)$, $j = 1, \dots, k$. The only way this inequality can be satisfied is to have $|P_j| = r(\mathcal{V}_P \cdot P_j)$ and $r(\mathcal{V}_P \times P_j) = 0$. Thus Condition (i) implies Condition (iii).

Next let us assume that Condition (iii) holds, i.e., $r(\mathcal{V}_P \cdot P_j) = |P_j|$ and $r(\mathcal{V}_P \times P_j) = 0$, $j = 1, \dots, k$. By Lemma 8.4.1 we already have

$$|P_j| \geq r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j), \quad j = 1, \dots, k.$$
Suppose $|P_j| > r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j)$ for some $j$. So we have $r(\mathcal{V}_{E_jP_j} \cdot P_j) - r(\mathcal{V}_{E_jP_j} \times P_j) > r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j)$. So, by Theorem 8.2.2, there exists (a) $f_{E_j} \in (\mathcal{V}_{E_jP_j} \cdot E_j - \mathcal{V}_E \cdot E_j)$, or (b) $g_{E_j} \in (\mathcal{V}_E \times E_j - \mathcal{V}_{E_jP_j} \times E_j)$.

Case (a): $f_{E_j} \in (\mathcal{V}_{E_jP_j} \cdot E_j - \mathcal{V}_E \cdot E_j)$. In this case there exists a vector $f_{E_jP_j} \in \mathcal{V}_{E_jP_j}$ s.t. $f_{E_jP_j}/E_j = f_{E_j}$. Since $\mathcal{V}_P \cdot P_j$ has full rank, there must exist a vector $f_P \in \mathcal{V}_P$ such that $f_P/P_j = f_{E_jP_j}/P_j$. Since $\mathcal{V}_{E_iP_i} \cdot P_i$ has full rank for all $i$, it follows that there exist vectors $f_{E_iP_i}$ for all $i \neq j$ also s.t. $f_{E_iP_i}/P_i = f_P/P_i$. But this means that $\bigoplus_i f_{E_iP_i}/E_i$ belongs to $\mathcal{V}_E$. But then $f_{E_j} \in \mathcal{V}_E \cdot E_j$, which is a contradiction.

Case (b): $g_{E_j} \in (\mathcal{V}_E \times E_j - \mathcal{V}_{E_jP_j} \times E_j)$. In this case there exists a vector $g_E \in \mathcal{V}_E$ s.t. $g_E/E_j = g_{E_j}$ and $g_E/(E - E_j) = 0$. By the definition of $k$-multiport decomposition it follows that there must exist vectors $g_{E_iP_i} \in \mathcal{V}_{E_iP_i}$, $i = 1, \dots, k$, and $g_P \in \mathcal{V}_P$ s.t. $g_{E_iP_i}/E_i = g_E/E_i$, $i = 1, \dots, k$, and $g_P/P_i = g_{E_iP_i}/P_i$, $i = 1, \dots, k$. But then it follows that $g_{E_iP_i}/E_i = 0$, $i \neq j$. Hence, $g_{E_iP_i}/P_i \in \mathcal{V}_{E_iP_i} \times P_i$, $i \neq j$. But $r(\mathcal{V}_{E_iP_i} \times P_i) = 0$, $i = 1, \dots, k$. Thus $g_{E_iP_i}/P_i = 0$, $i \neq j$. It follows that $g_P/P_i = 0$, $i \neq j$. Hence, $g_P/P_j \in \mathcal{V}_P \times P_j$. But this latter vector space also has zero rank. We conclude that $g_P/P_j = 0$. But this means that $g_{E_jP_j}/P_j = 0$, i.e., $g_{E_jP_j}/E_j$ ($= g_{E_j}$) belongs to $\mathcal{V}_{E_jP_j} \times E_j$, a contradiction.

Thus, in both cases we arrive at contradictions. We therefore must have $|P_j| = r(\mathcal{V}_E \cdot E_j) - r(\mathcal{V}_E \times E_j)$, $j = 1, \dots, k$. Thus, Conditions (i) and (iii) are equivalent.
Conditions (iii) and (iv) are equivalent: We observe that, since $|P_j| = r(\mathcal{V}_P \cdot P_j) + r(\mathcal{V}_P^{\perp} \times P_j) = r(\mathcal{V}_P^{\perp} \cdot P_j) + r(\mathcal{V}_P \times P_j)$, we have $r(\mathcal{V}_P \cdot P_j) = |P_j|$ and $r(\mathcal{V}_P \times P_j) = 0$ iff the columns $P_j$ of $R_P$, as well as those of $B_P$, are linearly independent. Similar statements hold for $\mathcal{V}_{E_jP_j}$ and $\mathcal{V}_{E_jP_j}^{\perp}$ in terms of the columns $P_j$ of $R_{P_j}'$ and $B_{P_j}'$. Now Condition (iv) merely states that all these column sets are independent. It follows that Condition (iii) is equivalent to Condition (iv). $\Box$
Exercise 8.11 (Strongly) Compatible Decomposition and Minimization starting from a Strongly Compatible Decomposition: We remind the reader that a $k$-multiport decomposition $(\mathcal{V}_{E_1P_1}, \dots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ of $\mathcal{V}_E$ is compatible iff

$$\mathcal{V}_{E_iP_i} \cdot P_i \supseteq \mathcal{V}_P \cdot P_i, \qquad \mathcal{V}_{E_iP_i} \times P_i \subseteq \mathcal{V}_P \times P_i, \quad i = 1, \dots, k.$$

Let us define a $k$-multiport decomposition $(\mathcal{V}_{E_1P_1}, \dots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ of $\mathcal{V}_E$ to be strongly compatible iff

$$\mathcal{V}_{E_iP_i} \cdot P_i = \mathcal{V}_P \cdot P_i, \qquad \mathcal{V}_{E_iP_i} \times P_i = \mathcal{V}_P \times P_i, \quad i = 1, \dots, k.$$

Prove the following:

i. $((\mathcal{V}_{E_jP_j})_1^k; \mathcal{V}_P)$ is a (strongly) compatible decomposition of $\mathcal{V}_E$ iff

$$\Big(\bigoplus_j \mathcal{V}_{E_jP_j}\Big) \cdot P \supseteq \mathcal{V}_P \quad \Big(\Big(\bigoplus_j \mathcal{V}_{E_jP_j}\Big) \cdot P = \mathcal{V}_P\Big)$$

and

$$\Big(\bigoplus_j \mathcal{V}_{E_jP_j}\Big) \times P \subseteq \mathcal{V}_P \quad \Big(\Big(\bigoplus_j \mathcal{V}_{E_jP_j}\Big) \times P = \mathcal{V}_P\Big).$$

ii. $((\mathcal{V}_{E_jP_j})_1^k; \mathcal{V}_P)$ is a compatible $k$-multiport decomposition of $\mathcal{V}_E$

(a) iff $((\mathcal{V}_{E_jP_j})_1^k; \mathcal{V}_E)$ is a compatible $k$-multiport decomposition of $\mathcal{V}_P$;

(b) iff $((\mathcal{V}_{E_jP_j}^{\perp})_1^k; \mathcal{V}_P^{\perp})$ is a compatible $k$-multiport decomposition of $\mathcal{V}_E^{\perp}$.

iii. $((\mathcal{V}_{E_jP_j})_1^k; \mathcal{V}_P)$ is a strongly compatible $k$-multiport decomposition of $\mathcal{V}_E$ iff $((\mathcal{V}_{E_jP_j}^{\perp})_1^k; \mathcal{V}_P^{\perp})$ is a strongly compatible $k$-multiport decomposition of $\mathcal{V}_E^{\perp}$.

iv. (*) Let $r(\mathcal{V}_{E_iP_i} \times P_i) > 0$, let $f_{P_i}$ be a nonzero vector in $\mathcal{V}_{E_iP_i} \times P_i$ and let $e$ belong to its support. If $(\mathcal{V}_{E_1P_1}, \dots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ is a strongly compatible $k$-multiport decomposition of $\mathcal{V}_E$, then $(\mathcal{V}_{E_1P_1}, \dots, \mathcal{V}_{E_iP_i'}, \dots, \mathcal{V}_{E_kP_k}; \mathcal{V}_{P'})$, where $P_i' \equiv P_i - e$, $P' \equiv P - e$, $\mathcal{V}_{E_iP_i'} \equiv \mathcal{V}_{E_iP_i} \times (E_i \cup (P_i - e))$ and $\mathcal{V}_{P'} \equiv \mathcal{V}_P \times (P - e)$, is also a strongly compatible $k$-multiport decomposition of $\mathcal{V}_E$.

v. (*) Let $r((\mathcal{V}_{E_iP_i})^{\perp} \times P_i) > 0$, let $g_{P_i}$ be a nonzero vector in $(\mathcal{V}_{E_iP_i})^{\perp} \times P_i$ and let $e$ belong to the support of $g_{P_i}$. Let $Q_j \equiv P_j$ $\forall j \neq i$, $Q_i \equiv P_i - e$, $Q \equiv P - e$, $\mathcal{V}_{E_jQ_j} \equiv \mathcal{V}_{E_jP_j}$ $\forall j \neq i$, $\mathcal{V}_{E_iQ_i} \equiv \mathcal{V}_{E_iP_i} \cdot (E_i \cup Q_i)$ and let $\mathcal{V}_Q \equiv \mathcal{V}_P \cdot Q$. Then $((\mathcal{V}_{E_jQ_j})_1^k; \mathcal{V}_Q)$ is a strongly compatible $k$-multiport decomposition of $\mathcal{V}_E$ if $((\mathcal{V}_{E_jP_j})_1^k; \mathcal{V}_P)$ is strongly compatible.
vi. (*) The preceding two parts give us an algorithm for constructing a minimal decomposition, starting from a strongly compatible decomposition, by successively contracting and deleting suitable port elements. Show that this algorithm terminates in a minimal decomposition. Show further that every minimal decomposition is a strongly compatible decomposition.

Exercise 8.12 Natural transformation from $\mathcal{V}_E$ to $\mathcal{V}_P$ for minimal decompositions: Let $((\mathcal{V}_{E_jP_j})_1^k; \mathcal{V}_P)$ be a minimal decomposition of $\mathcal{V}_E$. Show that

i. if $f_E \in \mathcal{V}_E$, there is a unique $f_{P_j}$ s.t. $f_{E_j} \oplus f_{P_j} \in \mathcal{V}_{E_jP_j}$, $j = 1, \dots, k$ (where $f_{E_j} \equiv f_E/E_j$), and hence there is a linear transformation $T: \mathcal{V}_E \rightarrow \mathcal{V}_P$ s.t. $T(f_E) = \bigoplus_j f_{P_j}$ with $f_{E_j} \oplus f_{P_j} \in \mathcal{V}_{E_jP_j}$, $j = 1, \dots, k$;

ii. if $\bigoplus_j f_{P_j} \in \mathcal{V}_P$, then there exists $f_E \in \mathcal{V}_E$ s.t. $f_E/E_j \oplus f_{P_j} \in \mathcal{V}_{E_jP_j}$, $j = 1, \dots, k$. If $f_E^1$ and $f_E^2$ both correspond in this manner to $\bigoplus_j f_{P_j}$, then

$$(f_E^1 - f_E^2)/E_j \in \mathcal{V}_{E_jP_j} \times E_j = \mathcal{V}_E \times E_j.$$
Exercise 8.13 Uniqueness of $\mathcal{V}_P$ for minimal decompositions: Let $((\mathcal{V}_{E_jP_j})_1^k; \mathcal{V}_P)$, $((\mathcal{V}_{E_jP_j})_1^k; \hat{\mathcal{V}}_P)$ both be minimal decompositions of $\mathcal{V}_E$. Then $\mathcal{V}_P = \hat{\mathcal{V}}_P$.

Exercise 8.14 Nonsingular transformation of port variables:
i. Let $(\mathcal{V}_{E_1P_1}, \dots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ be a minimal $k$-decomposition of $\mathcal{V}_E$. Let $(\mathcal{V}_{E_1P_1}', \dots, \mathcal{V}_{E_kP_k}'; \mathcal{V}_P')$ be obtained as follows:

$$f_{E_j} \oplus f_{P_j} \in \mathcal{V}_{E_jP_j} \ \text{ iff }\ f_{E_j} \oplus T_j(f_{P_j}) \in \mathcal{V}_{E_jP_j}', \quad j = 1, \dots, k;$$

$$\bigoplus_j f_{P_j} \in \mathcal{V}_P \ \text{ iff }\ \bigoplus_j T_j(f_{P_j}) \in \mathcal{V}_P',$$

where $T_j$ is a nonsingular linear transformation acting on vectors defined on $P_j$. Show that the latter decomposition is also minimal.

ii. Given two minimal $k$-decompositions of $\mathcal{V}_E$, show that one can be obtained from the other by nonsingular linear transformations acting on vectors on the port sets (as in the previous part of this problem).
Exercise 8.15 Structure of the columns $P_j$ for a minimal decomposition: Let $(\mathcal{V}_{E_1P_1}, \dots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ be a minimal $k$-decomposition of $\mathcal{V}_E$.

i. Show that in any representative matrix of $\mathcal{V}_{E_jP_j}$, $\mathcal{V}_{E_jP_j}^{\perp}$, $\mathcal{V}_P$ or $\mathcal{V}_P^{\perp}$ the columns $P_j$ would be linearly independent.

ii. The hybrid rank of a vector space $\mathcal{V}_P$ is defined to be

$$\min_{K_1 \subseteq P}\big(r(\mathcal{V}_P \cdot K_1) + r(\mathcal{V}_P^{\perp} \cdot (P - K_1))\big).$$

Show that the hybrid rank of $\mathcal{V}_P$ is $\geq |P_j|$, $j = 1, \dots, k$.

iii. We say $\mathcal{V}_{P'}$ is obtained from $\mathcal{V}_P$ by nonsingular transformation of the $P_j$ iff

$$f_{P_1} \oplus \cdots \oplus f_{P_k} \in \mathcal{V}_P \iff T_1(f_{P_1}) \oplus \cdots \oplus T_k(f_{P_k}) \in \mathcal{V}_{P'},$$

where the $T_j$ are nonsingular linear transformations on vectors defined on the $P_j$. Show that $|P_j| \leq$ hybrid rank of $\mathcal{V}_{P'}$, if $\mathcal{V}_{P'}$ is obtained by nonsingular transformation of the $P_j$.
8.4.3 Complexity of Algorithm (Port Minimization 1) for Graphic Spaces and Sparsity of the Output Matrices
We have proposed multiport decomposition as a prelude to network analysis. It is necessary, therefore, that the algorithm for decomposition be near-linear time in the size of the set $E$ and that the matrices generated be sparse. However, this can be hoped for, and is essential, only in the case where $\mathcal{V}_E$ is the voltage or current space of a graph. Algorithm 8.1 is intended to work on an arbitrary vector space. While it is polynomial time ($O(r^2|E|)$, as shown earlier), it needs to be shown that it is acceptably fast when $\mathcal{V}_E$ is the voltage space of a graph. The same holds for sparsity. In general we can say little about the sparsity of the matrices generated, but with voltage spaces of graphs one can hope to do better. Below, we first show that the algorithm is near-linear time for voltage spaces of graphs. From the discussion it would also be clear that the matrices involved are quite sparse.

Adaptation of Algorithm (Port minimization 1) to graphic spaces

i. Let $R$ be a reduced incidence matrix of a graph $\mathcal{G}$. The matrix $R_{jj}$ can be taken to be the reduced incidence matrix of $\mathcal{G} \times E_j$, $j = 1, \dots, k$. Building all the $\mathcal{G} \times E_j$ takes $O(k|E|)$ time (in fact this can be shown to take $O(k|V| + |E|)$ time), and then building all the $R_{jj}$ takes an additional $O(|E|)$ time.

ii. To build the matrix $(R_{(k+1)1}\ \vdots\ \cdots\ \vdots\ R_{(k+1)k})$, we could first select a forest $t$ of the graphs $\mathcal{G} \times E_j$ taken together. This would be a disjoint union of forests $t_j$ of $\mathcal{G} \times E_j$, $j = 1, \dots, k$. Construct a reduced incidence matrix of the graph $\mathcal{G} \times (E - t)$. This takes $O(|E|)$ time. Adjoin a zero submatrix corresponding to the set $t$. This would be the desired matrix $(R_{(k+1)1}\ \vdots\ \cdots\ \vdots\ R_{(k+1)k})$, since the rows of this matrix along with the rows of the reduced incidence matrices of the $\mathcal{G} \times E_j$ form a basis for $\mathcal{V}_v(\mathcal{G})$ (see Problem 7.9). Building $(R_{(k+1)1}\ \vdots\ \cdots\ \vdots\ R_{(k+1)k})$ takes $O(|E|)$ time.

iii. The representative matrix $(R_j\ \vdots\ R_{P_j})$ of $\mathcal{V}_{E_jP_j}$ is

$$\begin{bmatrix} R_{jj} & 0 \\ R_{(k+1)j} & R_{(k+1)T_j} \end{bmatrix},$$

where $R_{(k+1)T_j}$ is a nonsingular submatrix of $R_{(k+1)j}$ of full rank. We obtain the rows of $(R_{(k+1)j}\ \vdots\ R_{(k+1)T_j})$ as follows:
Observe that the matrix $(R_{(k+1)1}\ \vdots\ \cdots\ \vdots\ R_{(k+1)k})$ is the reduced incidence matrix of the graph $\mathcal{G}'$ obtained by adding the branches of $t$ as self loops to the graph $\mathcal{G} \times (E - t)$. Let $(R_{(k+1)1}'\ \vdots\ \cdots\ \vdots\ R_{(k+1)k}')$ be the incidence matrix of $\mathcal{G}'$. Then $R_{(k+1)j}'$ is the incidence matrix of $\mathcal{G}' \cdot E_j$. Thus, $R_{(k+1)j}$ can be taken to be the reduced incidence matrix of this graph. Let $T_j$ be a forest of this graph. The columns corresponding to this set form a maximal linearly independent subset of the columns of $R_{(k+1)j}$. Let this submatrix of $R_{(k+1)j}$ be denoted by $R_{(k+1)T_j}$. The matrix $(R_{(k+1)j}\ \vdots\ R_{(k+1)T_j})$ is the reduced incidence matrix of the graph $\mathcal{G}'_{E_jP_j}$ obtained from $\mathcal{G}' \cdot E_j$ by adding a forest $P_j$ that is a copy of $T_j$. Building the $R_{(k+1)j}$, $j = 1, \dots, k$, takes $O(|E|)$ time overall. Building the forests $T_j$ and the $R_{(k+1)T_j}$, $j = 1, \dots, k$, takes $O(|E|)$ time overall.

iv. Let $\hat R_{(k+1)T_j}$ be the submatrix of $\hat R_{(k+1)j}$ corresponding to the columns $T_j$. Select $R_{P_j} \equiv \hat R_{(k+1)T_j}$. Building the $R_{P_j}$, $j = 1, \dots, k$, clearly takes $O(|E|)$ time overall.

Thus, Algorithm (Port minimization 1) takes $O(k|E|)$ time.

We now briefly speak of the sparsity of the above matrices. We saw in the above discussion that the matrices $R_{jj}$, $R_{(k+1)j}$, $R_{(k+1)T_j}$ are reduced incidence matrices of appropriate graphs. Thus, the matrix

$$\begin{bmatrix} R_{jj} & 0 \\ R_{(k+1)j} & R_{(k+1)T_j} \end{bmatrix}$$

has at most four nonzero entries per column in $R_j$ and at most two nonzero entries per column in $R_{P_j}$. The matrix $R_P$ is the reduced incidence matrix of a graph, so it has at most two nonzero entries per column.
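The sparsity claim is easy to check concretely. Below is a small, entirely hypothetical directed graph; its reduced incidence matrix (one row per vertex except a reference vertex, $+1$ at the tail and $-1$ at the head of each edge) has at most two nonzero entries per column, which is what keeps the matrices above sparse.

```python
import numpy as np

# Directed graph on vertices 0..3; edges as (tail, head) pairs.
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 0)]
n_v = 4

A = np.zeros((n_v, len(edges)), dtype=int)   # full incidence matrix
for k, (u, v) in enumerate(edges):
    A[u, k] = 1       # edge k leaves u
    A[v, k] = -1      # edge k enters v

R = A[:-1, :]         # reduced incidence matrix: drop the reference vertex row

# Every column of A has exactly two nonzeros; dropping one row leaves
# at most two nonzeros per column of R.
assert all(np.count_nonzero(A[:, k]) == 2 for k in range(len(edges)))
assert all(np.count_nonzero(R[:, k]) <= 2 for k in range(len(edges)))
```

For a connected graph the reduced incidence matrix also has full row rank $|V| - 1$, which is why it serves as a representative matrix of the voltage space.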
8.4.4 Minimal Decomposition of Graphic Vector Spaces to make Component Spaces Graphic
We saw in the previous subsection that Algorithm (Port minimization 1) permits us to minimally decompose the voltage space of a graph in such a way that the coupler space is graphic. For such a vector space, it is not clear whether minimal multiport decomposition is possible with both the component spaces and the coupler graphic. However, in this subsection we give an algorithm which makes the component spaces graphic while losing control over the coupler space. In the interest of brevity we only sketch the justification.

ALGORITHM 8.2 (PORT MINIMIZATION 2)

INPUT A connected directed graph $\mathcal{G}$ with a partition $\{E_1, \dots, E_k\}$ of $E \equiv E(\mathcal{G})$. The space $\mathcal{V}_E$ to be decomposed is $\mathcal{V}_v(\mathcal{G})$.

OUTPUT Graphs $\mathcal{G}_{E_jP_j'}$, $j = 1, \dots, k$, and a representative matrix $R_{P'}$ of a space $\mathcal{V}_{P'}$, s.t. $(\mathcal{V}_{E_1P_1'}, \dots, \mathcal{V}_{E_kP_k'}; \mathcal{V}_{P'})$ is a minimal $k$-multiport decomposition of $\mathcal{V}_E$, where $\mathcal{V}_{E_jP_j'} \equiv \mathcal{V}_v(\mathcal{G}_{E_jP_j'})$.
STEP 1 Construct a reduced incidence matrix $R \equiv (R_{E_1}\ \vdots\ \cdots\ \vdots\ R_{E_k})$ of $\mathcal{G}$. This is a representative matrix for $\mathcal{V}_E$.

STEP 2 For $j = 1$ to $k$, do the following: Construct a forest $f_j$ of $\mathcal{G} \cdot E_j$, extending $f_j$ to a forest $f$ of $\mathcal{G}$. Extend $f - f_j$ to a forest $f^{-j}$ of $\mathcal{G} \cdot (E - E_j)$. Let $P_j' \equiv f^{-j} - (f - f_j)$. Contract $(f - f_j)$ in $\mathcal{G}$ and delete $E - E_j - (f^{-j} - (f - f_j))$. The resulting graph is on $E_j \cup P_j'$ and will be denoted by $\mathcal{G}_{E_jP_j'}$. Take $\mathcal{V}_{E_jP_j'}$ to be the voltage space of $\mathcal{G}_{E_jP_j'}$. A reduced incidence matrix of $\mathcal{G}_{E_jP_j'}$ would be a representative matrix of $\mathcal{V}_{E_jP_j'}$.

STEP 3 Let $t_j$ be a forest of $\mathcal{G} \times E_j$, $j = 1, \dots, k$. Let $t \equiv \bigcup t_j$. Construct the graph $\mathcal{G} \times (E - t)$. Add the branches of $t$ as self loops to this graph. Call the resulting graph $\mathcal{G}'$. Select a forest $T_j$ for $\mathcal{G}' \cdot E_j$, $j = 1, \dots, k$. Let $T \equiv \bigcup T_j$. Let $\mathcal{G}_T \equiv \mathcal{G}' \cdot T$. Let $(R_{T_1}\ \vdots\ \cdots\ \vdots\ R_{T_k})$ be a reduced incidence matrix of $\mathcal{G}_T$. In the graph $\mathcal{G}_{E_jP_j'}$ contract $t_j$ and delete $E_j \cup P_j' - (T_j \cup t_j)$. The resulting graph is on $T_j \cup P_j'$ and is denoted by $\mathcal{G}'_{T_jP_j'}$. Let $(I\ \vdots\ Q_{T_jP_j'})$ be an f-cutset matrix of $\mathcal{G}'_{T_jP_j'}$ with respect to the forest $T_j$. Let $R_{P_j'} \equiv (R_{T_j})(Q_{T_jP_j'})$, $j = 1, \dots, k$, and let $R_{P'} \equiv (R_{P_1'}\ \vdots\ \cdots\ \vdots\ R_{P_k'})$. Output $R_{P'}$ as the representative matrix of $\mathcal{V}_{P'}$.

STOP

Justification of Algorithm (Port minimization 2)
We confine ourselves to a statement of the main steps in the justification, omitting details, in the interest of brevity. We need the following elementary lemma. It essentially states that if we apply the same nonsingular transformation to the '$P_j$ part' of the vectors in $\mathcal{V}_{E_jP_j}$ and $\mathcal{V}_P$, the resulting spaces would still constitute a $k$-multiport decomposition of $\mathcal{V}_E$.

Lemma 8.4.3 Let $(\mathcal{V}_{E_1P_1}, \dots, \mathcal{V}_{E_kP_k}; \mathcal{V}_P)$ be a $k$-multiport decomposition of $\mathcal{V}_E$.
Let $(R_j\ \vdots\ R_{P_j})$ be a representative matrix of $\mathcal{V}_{E_jP_j}$, $j = 1, \dots, k$, and let $(R_{P_1}\ \vdots\ \cdots\ \vdots\ R_{P_k})$ be a representative matrix of $\mathcal{V}_P$. Let $K_j$ be a square nonsingular matrix of size $|P_j|$. Let $(R_j\ \vdots\ R_{P_j'})$ be a representative matrix of $\mathcal{V}_{E_jP_j'}$, $j = 1, \dots, k$, and let $(R_{P_1'}\ \vdots\ \cdots\ \vdots\ R_{P_k'})$ be a representative matrix of $\mathcal{V}_{P'}$, where $R_{P_j'} = R_{P_j}K_j$. Then $(\mathcal{V}_{E_1P_1'}, \dots, \mathcal{V}_{E_kP_k'}; \mathcal{V}_{P'})$ is a $k$-multiport decomposition of $\mathcal{V}_E$.
We omit the routine proof. It is essentially the same as that of Exercise 7.18.

We indicate below how the $k$-multiport decomposition output by Algorithm 8.2 (Port minimization 2) is related, through nonsingular transformation of the $P_j$ part of the vectors in $\mathcal{V}_{E_jP_j}$ and $\mathcal{V}_P$, to that output by the adaptation of Algorithm 8.1 (Port minimization 1) given in Subsection 8.4.3 (hereinafter called 'Modified Algorithm (Port minimization 1)'). This would justify Algorithm (Port minimization 2). Since $|P'| = |P|$, it would also follow that the decomposition produced by the latter algorithm is minimal.
i. The graph $\mathcal{G}_j \equiv \mathcal{G}_{E_jP_j'} \cdot E_j$ of Algorithm (Port minimization 2) is 2-isomorphic to $\mathcal{G} \cdot E_j$. This is because in the graph $\mathcal{G} \cdot (E_j \cup (f - f_j))$, $(f - f_j)$ is a separator. Thus, contraction or deletion of $(f - f_j)$ will result in 2-isomorphic graphs. It is also easily seen that $\mathcal{G}_{E_jP_j'} \times E_j = \mathcal{G} \times E_j$.

ii. Since the graph $\mathcal{G}_{E_jP_j'}$ in the same algorithm is built so that

$$\mathcal{G}_{E_jP_j'} \cdot E_j \cong \mathcal{G} \cdot E_j, \qquad \mathcal{G}_{E_jP_j'} \times E_j = \mathcal{G} \times E_j,$$

it follows that

$$\mathcal{V}_{E_jP_j'} \cdot E_j = \mathcal{V}_E \cdot E_j, \qquad \mathcal{V}_{E_jP_j'} \times E_j = \mathcal{V}_E \times E_j.$$

iii. In the Modified Algorithm (Port minimization 1),

$$\mathcal{V}_{E_jP_j} \cdot E_j = \mathcal{V}_E \cdot E_j, \qquad \mathcal{V}_{E_jP_j} \times E_j = \mathcal{V}_E \times E_j.$$

Now $|P_j| = |P_j'|$. Hence, a representative matrix of $\mathcal{V}_{E_jP_j'}$ can be obtained from that of $\mathcal{V}_{E_jP_j}$ by postmultiplying the columns $P_j$ by a nonsingular matrix $K_j$. By Lemma 8.4.3, if $R_{P_j'} = (R_{P_j})K_j$, $j = 1, \dots, k$, and $\mathcal{V}_{P'}$ has the representative matrix $(R_{P_1'}\ \vdots\ \cdots\ \vdots\ R_{P_k'})$, then $(\mathcal{V}_{E_1P_1'}, \dots, \mathcal{V}_{E_kP_k'}; \mathcal{V}_{P'})$ is a $k$-multiport decomposition of $\mathcal{V}_E$.
iv. We need to show that our computation of $K_j$ is correct. The graph $\mathcal{G}'$ of the Modified Algorithm (Port minimization 1) and that of Algorithm (Port minimization 2) are identical. In Algorithm (Port minimization 2), let $\mathcal{G}'_{E_jP_j'}$ denote the graph obtained from $\mathcal{G}_{E_jP_j'}$ by contracting the forest $t_j$ of $\mathcal{G}_{E_jP_j'} \times E_j$ ($\cong \mathcal{G} \times E_j$) and adding it as self loops. The graph $\mathcal{G}'_{E_jP_j'}$ is related to the graph $\mathcal{G}'_{E_jP_j}$ of Modified Algorithm (Port minimization 1) as follows: in $\mathcal{G}'_{E_jP_j'}$, the set of edges $P_j'$ forms a forest. If we delete this forest and replace it by a copy $P_j$ of the forest $T_j$, we get $\mathcal{G}'_{E_jP_j}$. Let $\mathcal{G}'_{E_jP_jP_j'}$ denote the graph obtained from $\mathcal{G}'_{E_jP_j'}$ by adding $P_j$ but not deleting $P_j'$. If $(A_{E_j}\ \vdots\ A_{P_j}\ \vdots\ A_{P_j'})$ is a reduced incidence matrix of this graph, there is no loss of generality in assuming that $(A_{E_j}\ \vdots\ A_{P_j})$ is the same as the matrix $(R_{(k+1)j}\ \vdots\ R_{(k+1)T_j})$ of Modified Algorithm (Port minimization 1); further, $(A_{E_j}\ \vdots\ A_{P_j'})$ is then a reduced incidence matrix of the graph $\mathcal{G}'_{E_jP_j'}$. The matrix $K_j$ is defined by $A_{P_j'} = (A_{P_j})K_j$. Now $A_{P_j}$ is identical to the submatrix $A_{T_j}$ of $A_{E_j}$ corresponding to $T_j$. Thus, to compute $K_j$, we need only consider the matrix $(A_{T_j}\ \vdots\ A_{P_j'})$. This is a reduced incidence matrix of the graph $\mathcal{G}''_{T_jP_j'} \equiv \mathcal{G}'_{E_jP_j'} \cdot (T_j \cup P_j')$. The column dependence structure of this matrix is identical to that of the f-cutset matrix $(I\ \vdots\ Q_{T_jP_j'})$ of this graph with respect to $T_j$. Hence,

$$K_j = Q_{T_jP_j'}.$$
Complexity of Algorithm (Port minimization 2)

The computation of the $R_{jj}$, $j = 1, \dots, k$, is as in Modified Algorithm (Port minimization 1); this takes $O(k|E|)$ time overall. The computation of $(R_j\ \vdots\ R_{P_j'})$ (the representative matrix of $\mathcal{V}_{E_jP_j'}$), $j = 1, \dots, k$, is $O(k|E|)$, since the graph operations needed to reach $\mathcal{G}_{E_jP_j'}$, $j = 1, \dots, k$, are each $O(|E|)$. We need to examine more carefully the following:

(a) The construction of the matrices $K_j$. This involves building an f-cutset matrix for each $\mathcal{G}''_{T_jP_j'}$, which is $O(\sum_j |P_j'|^2)$ $(= O(\sum_j |P_j|^2))$.

(b) The postmultiplication of $R_{P_j}$ by $K_j$, $j = 1, \dots, k$. Now $R_{P_j}$ is a reduced incidence matrix. Multiplying $K_j$ by a row of $R_{P_j}$ (corresponding to a vertex $v_i$) is equivalent to adding some rows of $K_j$ and subtracting some other rows. If $q_i$ is the degree of $v_i$ (the number of nonzero entries in the row), then the number of operations is $q_i|P_j|$. Thus, the product can be computed in $(\sum_i q_i)|P_j|$ steps. But $\sum_i q_i \leq 2|P_j|$, since $R_{P_j}$ is a reduced incidence matrix. Thus the product takes $O(|P_j|^2)$ steps, and the overall multiplication takes $O(\sum_j |P_j|^2)$ steps.

Thus Algorithm (Port minimization 2) has time complexity $O(k|E| + \sum_j |P_j|^2)$. It is slower than Algorithm (Port minimization 1), but would still be acceptable as long as the $|P_j|$ remain small relative to $|E|$.

8.5 Network Reduction by Decomposition

For a solution $(v, i)$ trapped with respect to $E_R$, the voltage and current vectors on $E_R$ are orthogonal, so that

$$0 = (v_R(t))^T i_R(t) = (i_R(t))^T R\, i_R(t).$$

The RHS is zero only if $i_R(t) = 0$, since $R$ is positive definite. Thus $i_R(t) = 0$ and $v_R(t) = 0$.

If the capacitor matrix $C$ is positive definite, then we can show that a solution $(v, i)$ is trapped with respect to $E_C$ iff $i_C = 0$ and $v_C(t) \in (\mathcal{V}_v(\mathcal{G})) \times E_C$ $(= \mathcal{V}_v(\mathcal{G} \times E_C))$, and further is constant in time. An example of this is where two capacitors are in series with nonzero initial voltages which are equal and opposite. The proof is similar to the resistor case discussed above, but here we work with $v_C, i_C$ and show that $\dot v_C(t) = 0$, $i_C(t) = 0$, using the fact that $C$ is a positive definite matrix.

If the mutual inductor matrix $L$ is positive definite, then a solution $(v, i)$ is trapped with respect to $E_L$ iff $v_L = 0$ and $i_L(t) \in (\mathcal{V}_i(\mathcal{G})) \times E_L = \mathcal{V}_i(\mathcal{G} \cdot E_L)$, and further is constant in time. An example of this is where inductors form a circuit and their initial currents circulate within the loop. To prove this fact we work with $v_L, i_L$ and show that $v_L(t) = 0$ and $\dot i_L(t) = 0$, using the fact that $L$ is a positive definite matrix and using arguments similar to the resistive case.
We will now show how to use minimal multiport decomposition to write state equations for a circuit with capacitors, mutual inductors and nondynamic devices. A byproduct of our method is that we would get a reduced generalized network of the same kind whose solution mimics the solution of the original network except for the trapped solution corresponding to zero eigen values. Let hr be a linear network on graph G. Let E I E(G) be partitioned into E c , E L ,ER. Let the voltage and current vectors associated with these edges be denoted by V C , i C , V L , i L , V R , i R . Let the device characteristic of N be given by (C)vc (C)i,
-
ic = O
VL = 0
M(VR- e R ) + N ( i R - j,) = 0. where C, C are symmetric positive definite matrices. We could use any algorithm for minimal 3-multiport decomposition of the space V E = V,(G). But for convenience we use the notation of Algorithm (Port minimiza,j tion 1). Let ( R j i R p , ) , (BjiBp,) be representative matrices of V ~ , p , , V h , ~ ,=
C , L , R. Let (Rpc:iRpLiRpR) be the representative matrix of the space V p . The capacitor multiport equations are (8.14)
8.5. NETWORK REDUCTION BY DECOMPOSITION Cvc-ic
299 (8.15)
= 0
(8.16) These may be rewritten to obtain the relationship between ip, ,vp, and vc
= -Re C vc = -RcCRZV,,
(8.17) (8.18)
VP,
= R& vn,
(8.19)
VP,
=
-R&(Rc C R;)-'
=
-(Cp)-'
Rp, ip,
RPc ip,
(8.20) (8.21)
ip,, say.
Since the decomposition is minimal, rows of Rc are linearly independent and so are columns of Rp,. It follows that since C is symmetric positive definite so are (Rc C RZ)-l and R& (Rc C RZ)-' Rp,. We call the inverse of this latter matrix C p . The inductor multiport equations are (8.22) CiL
-VL
(8.23)
= 0
(8.24) This may be rewritten to obtain the relationship between ip,, vp, and B P Z VPL
= =
(L:) iL -Br, (C) BTil,
(8.25) (8.26)
-BL
ip,
= BZ,
iPz
= =
iL
(8.27)
il,
-B;z(B~ C BT)-' Bpz vp, -(&)-I VP, say.
(8.28) (8.29)
Since the decomposition is minimal, rows of BL are linearly independent and so are columns of Bp, . It follows that since C is symmetric positive definite so are (BL C BT)-' and BTz (BL L: BZ)-' Bp,. We call the inverse of this latter matrix Lp. The resistor multiport equations are (8.30)
M(VR- e R )
+ N(iR - j R )
= 0
(8.31) (8.32)
Let us suppose that these latter can be equivalently written, in terms of the port variables, as

$$M_P(v_{P_R} - e_{P_R}) + N_P(i_{P_R} - j_{P_R}) = 0. \qquad (8.34)$$

(We will now work with a generalized network as defined on page 279 — the network $\mathcal{N}_P \equiv (\mathcal{V}_P, \mathcal{D}_P)$, where $P \equiv P_C \cup P_L \cup P_R$; the device characteristic $\mathcal{D}_P$ is defined in Subsection 8.3.2.) Consider a network $\mathcal{N}_P'$ whose device characteristic is

$$C_P \dot v'_{P_C} = i'_{P_C}$$
$$L_P\, \dot i'_{P_L} = v'_{P_L}$$
$$-M_P(v'_{P_R} + e_{P_R}) + N_P(i'_{P_R} - j_{P_R}) = 0.$$

We take $v'_{P_C} \equiv -v_{P_C}$, $v'_{P_R} \equiv -v_{P_R}$, $v'_{P_L} \equiv -v_{P_L}$, and $i'_P \equiv i_P$. This network differs from $\mathcal{N}_P$ in that the voltages of $\mathcal{N}_P'$ are the negatives of the voltages of $\mathcal{N}_P$, but the currents in both networks are the same. We use $\mathcal{N}_P'$ instead of $\mathcal{N}_P$ since the former is of the same kind as $\mathcal{N}$. Let the state equations for this network be written as in Section 5.7. Let these equations be
$$\begin{bmatrix} \dot v'_{P_C} \\ \dot i'_{P_L} \end{bmatrix} = \begin{bmatrix} A'_{CC} & A'_{CL} \\ A'_{LC} & A'_{LL} \end{bmatrix} \begin{bmatrix} v'_{P_C} \\ i'_{P_L} \end{bmatrix} + (\text{terms linear in the sources } e_{P_R}, j_{P_R}). \qquad (8.35)$$

Let the output equations be

$$[y_P] = \begin{bmatrix} C'_{P_C} & C'_{P_L} \end{bmatrix} \begin{bmatrix} v'_{P_C} \\ i'_{P_L} \end{bmatrix} + (\text{terms linear in the sources}), \qquad (8.36)$$

where $y_P$ includes variables such as $i_{P_C}, v_{P_L}, i_{P_R}, v_{P_R}$. Here we have assumed that $v'_{P_C}, i'_{P_L}$ do not become dependent in the network $\mathcal{N}_P'$ when the resistive device characteristic alone is used (and $P_C$, $P_L$ are treated as voltage sources and current sources respectively). This is done only for notational convenience.
$\mathcal{N}_P'$ contains the dynamics of $\mathcal{N}$

In the discussion to follow we relate the solutions of $\mathcal{N}_P$ and $\mathcal{N}$ without assuming that the initial conditions are specified. We will show that Equation 8.35 contains a description of the 'essential dynamics' of the network $\mathcal{N}$. We have, from Equations 8.16 and 8.24,

$$v_{P_C}(t) = R_{P_C}^T v_n(t), \qquad i_{P_L}(t) = B_{P_L}^T i_l(t).$$
Note that knowledge of v_n(t) and i_l(t) gives us complete knowledge of the dynamics of the network. Decompose v_n(t) into two orthogonal components v'_n(t) and v''_n(t), where R_{P_C}^T v'_n(t) = 0 and v''_n(t) is spanned by the columns of R_{P_C}. Similarly decompose i_l(t) into two orthogonal components i'_l(t) and i''_l(t), where B_{P_L}^T i'_l(t) = 0 and i''_l(t) is spanned by the columns of B_{P_L}. We will show that v''_n(t), i''_l(t) are uniquely determinable from v_{P_C}(t), i_{P_L}(t); the ambiguity in obtaining v_n(t), i_l(t) from the latter variables is contained in v'_n(t) and i'_l(t). We now show that v'_n(t) and i'_l(t) correspond to trapped constant solutions. We prove only the v'_n case; the other case is similar (dual). Let us split v_C into v'_C and v''_C, where v'_C(t) = R_C^T v'_n(t). We then have
So (d/dt) v'_C(t), and therefore v'_C(t), also belongs to V_{E P_C} × E_C, i.e., to V_E × E_C (using Theorem 8.4.1, since the multiport decomposition is minimal). Hence ⟨i_C(t), v'_C(t)⟩ = 0, i.e., ⟨C (d/dt) v'_C(t), v'_C(t)⟩ = 0. The matrix C is positive definite. Hence (d/dt) v'_C(t) = 0 and therefore i'_C(t) = 0. Thus v'_C(t) is a constant and (v'_C ⊕ 0_L ⊕ 0_R, 0_C ⊕ 0_L ⊕ 0_R) is a trapped solution. Similarly one can show that i'_L(t) is constant (where i'_L(t) ≡ B_L^T i'_l(t)) and (0_C ⊕ 0_L ⊕ 0_R, 0_C ⊕ i'_L ⊕ 0_R) is a trapped solution.

To obtain v''_n(t) from v_{P_C}(t) we proceed as follows:

v''_n(t) = R_{P_C} k(t), say.
Hence, v_{P_C}(t) = (R_{P_C}^T R_{P_C}) k(t). Now R_{P_C} has linearly independent columns since, by the minimality of the decomposition, r(V_{E P_C} · P_C) = |P_C|. Hence,

v''_n(t) = R_{P_C} (R_{P_C}^T R_{P_C})^{-1} v_{P_C}(t).
Similarly one can show that

i''_l(t) = B_{P_L} (B_{P_L}^T B_{P_L})^{-1} i_{P_L}(t).
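The recovery formula v''_n = R_{P_C} (R_{P_C}^T R_{P_C})^{-1} v_{P_C} is an orthogonal projection onto the column space of R_{P_C}. A small exact-arithmetic sketch; the matrix R and the vector v_n are our own illustration.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse(A):
    n = len(A)
    M = [row[:] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

# R plays the role of R_PC: linearly independent columns (illustrative).
R = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)],
     [Fraction(0), Fraction(1)]]
v_n = [[Fraction(2)], [Fraction(3)], [Fraction(5)]]   # a node-voltage vector

v_PC = mul(transpose(R), v_n)                         # port voltages R^T v_n

# v''_n = R (R^T R)^{-1} v_PC : the component of v_n in the column
# space of R (the orthogonal projection of v_n).
v2 = mul(mul(R, inverse(mul(transpose(R), R))), v_PC)

# The residue v_n - v''_n is orthogonal to the columns of R.
resid = [[a[0] - b[0]] for a, b in zip(v_n, v2)]
print(v2, mul(transpose(R), resid))
```

The residue corresponds to v'_n, the part annihilated by R_{P_C}^T, which carries the trapped constant solutions.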
The trapped solutions corresponding to v'_n, i'_l are constant solutions of linear constant-coefficient differential equations which can exist in the absence of inputs. They are of the form k e^{0t} and therefore correspond to zero eigenvalues.

For completeness we show how to obtain the overall state equations. We have

(d/dt) v_n(t) = (d/dt) v''_n(t) = R_{P_C} (R_{P_C}^T R_{P_C})^{-1} (d/dt) v_{P_C}(t),
(d/dt) i_l(t) = (d/dt) i''_l(t) = B_{P_L} (B_{P_L}^T B_{P_L})^{-1} (d/dt) i_{P_L}(t)

((d/dt) v'_n(t), (d/dt) i'_l(t) are zero since (d/dt) v'_C(t), (d/dt) i'_L(t) are zero, v'_C(t) = R_C^T v'_n(t), i'_L(t) = B_L^T i'_l(t), and the coefficient matrices R_C^T, B_L^T have full column rank).
Now in Equation 8.35, (d/dt) v'_{P_C}, (d/dt) i'_{P_L} have been expressed as time invariant linear functions of v'_{P_C}, i'_{P_L}, e_R and j_R. Since v_{P_C} = -v'_{P_C} and i_{P_L} = i'_{P_L}, we can express (d/dt) v_n, (d/dt) i_l as time invariant linear functions of v_{P_C}, i_{P_L}, e_R and j_R. But v_{P_C} = R_{P_C}^T v_n and i_{P_L} = B_{P_L}^T i_l. Thus, (d/dt) v_n, (d/dt) i_l are expressed as time invariant linear functions of v_n, i_l, e_R and j_R. These would be the required state equations. The number of state variables that we have obtained equals the number of entries in v_n and i_l. Clearly this equals r(V_{E_C P_C} · E_C) + r(V⊥_{E_L P_L} · E_L). Now, r(V_{E_C P_C} · E_C) = r(V_E · E_C), since the multiport decomposition is minimal (Theorem 8.4.1). But r(V_E · E_C) = r(V_v(G · E_C)) = r(G · E_C). Similarly,

r(V⊥_{E_L P_L} · E_L) = r(V⊥_E · E_L) = r(V_i(G × E_L)) = ν(G × E_L).
If, in the network N'_P, when P_C, P_L are treated as voltage sources and current sources, the voltages in v'_{P_C} and the currents in i'_{P_L} are independent, then it is clear from the above discussion on writing state equations in terms of the variables v_n, i_l, using Equation 8.35, that v_n, i_l can have arbitrary initial conditions. Thus, r(G · E_C) + ν(G × E_L) is the least number of state variables for this network. If N has the device characteristic of E_R of the form

-G(v_R - e_R) + (i_R - j_R) = 0,
where G is positive definite, then one can show that N'_P has the device characteristic of P_R of the form

-G_P(v_{P_R} - e_{P_R}) + (i_{P_R} - j_{P_R}) = 0,

where G_P is positive definite. In this case, N'_P has no constant solution and therefore no zero eigenvalues (see Problem 8.7). Thus, all the zero eigenvalues of the network N are concentrated in the trapped solutions. For the above case, the number of zero eigenvalues of N equals
r(V_E × E_C) + r(V⊥_E × E_L) = r(G × E_C) + ν(G · E_L).
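The counts r(G · E_C), ν(G × E_L), r(G × E_C) and ν(G · E_L) are matroidal quantities computable with a union-find spanning-forest routine, using r(G × T) = r(G) - r(G · (E - T)). A sketch on a small RLC graph of our own (the edge partition is an assumption for illustration):

```python
def rank(edges, subset):
    """Rank of an edge subset in the cycle matroid of the graph
    (size of a spanning forest), via union-find."""
    parent = {}
    def find(x):
        while parent.setdefault(x, x) != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    r = 0
    for i in subset:
        u, v = edges[i]
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            r += 1
    return r

# A small illustrative RLC graph on nodes 0..3 (our own example).
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 0), (0, 2)]
E = list(range(len(edges)))
E_C, E_L, E_R = [0, 1], [2, 3], [4, 5]

def r_dot(T):
    # r(G . T): delete the edges outside T.
    return rank(edges, T)

def r_cross(T):
    # r(G x T) = r(G) - r(G . (E - T)): contract the edges outside T.
    return rank(edges, E) - rank(edges, [i for i in E if i not in T])

state_vars = r_dot(E_C) + (len(E_L) - r_cross(E_L))   # r(G.E_C) + nu(G x E_L)
zero_eigs  = r_cross(E_C) + (len(E_L) - r_dot(E_L))   # r(G x E_C) + nu(G.E_L)
print(state_vars, zero_eigs)   # 4 1
```

For this example the least number of state variables is 4, of which one eigenvalue is zero.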
8.6 Problems
Problem 8.1 Given two decompositions, to get a decomposition of one of the couplers: Let ((V_{E_j P_j})_k; V_P), ((V_{E_j Q_j})_k; V_Q) be decompositions of V_E, and further let the former be compatible. Then ((V_{Q_j P_j})_k; V_Q), where V_{Q_j P_j} ≡ (V_{E_j P_j} ↔ V_{E_j Q_j}), j = 1, ..., k, is a compatible decomposition of V_P.

Problem 8.2 In a compatible decomposition the generalized hybrid ranks of V_E and V_P are the same: The generalized hybrid rank of a vector space V_T relative to a partition {T_1, ..., T_k} of T equals min_{V'_T} {d(V_T, V'_T)}, where V'_T is a
vector space on T which has T_j, j = 1, ..., k, as separators. We remind the reader that

d(V_T, V'_T) = r(V_T + V'_T) - r(V_T ∩ V'_T).
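The distance d(·,·) can be computed from ranks alone, since r(V ∩ V') = r(V) + r(V') - r(V + V'). A sketch in exact arithmetic; the two spaces below are our own illustration.

```python
from fractions import Fraction

def rank(rows):
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0]) if M else 0):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def d(V, W):
    # d(V, W) = r(V + W) - r(V ∩ W); by the dimension formula
    # r(V ∩ W) = r(V) + r(W) - r(V + W), so only sums of row
    # spaces need to be ranked.
    return 2 * rank(V + W) - rank(V) - rank(W)

# Two spaces on a 4-element set, given by spanning rows (illustrative).
V = [[1, 0, 1, 0],
     [0, 1, 0, 1]]
W = [[1, 0, 1, 0],
     [0, 0, 1, 1]]
print(d(V, W), d(V, V))   # 2 0
```

Minimizing d(V_T, V'_T) over all V'_T with the T_j as separators then gives the generalized hybrid rank.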
i. Let V_E be a vector space on E. Let E be partitioned into {E_1, ..., E_k}. Let V̂_E have E_j, j = 1, ..., k, as separators and further be such that

d(V_E, V̂_E) ≤ d(V_E, V'_E)

whenever V'_E has the E_j as separators. Then

⊕_j (V_E · E_j) ⊇ V̂_E ⊇ ⊕_j (V_E × E_j).
ii. Let (V_{E_1 P_1}, ..., V_{E_k P_k}; V_P) be a compatible decomposition of V_E. Then the generalized hybrid rank of V_E relative to {E_1, ..., E_k} equals the generalized hybrid rank of V_P relative to {P_1, ..., P_k}.

Problem 8.3 Let (G_{E_1 P_1}, ..., G_{E_k P_k}; G_P) be a strongly compatible multiport decomposition of G, i.e.,
(V_{E_1 P_1}, ..., V_{E_k P_k}; V_P), where V_{E_j P_j} ≡ V_v(G_{E_j P_j}), j = 1, ..., k, and V_P ≡ V_v(G_P), is a strongly compatible decomposition of V_E ≡ V_v(G). Justify the following algorithm for building a minimal k-multiport decomposition starting from the above decomposition.

ALGORITHM 8.3 (Port minimization 3)
STEP 1 Construct graphs G_{E_j P_j} × P_j, j = 1, ..., k. Let t_j, j = 1, ..., k, respectively be forests of these graphs.
STEP 2 Construct graphs G_{E_j P_j} · P_j, j = 1, ..., k. Let L_j, j = 1, ..., k, respectively be coforests of these graphs such that L_j ∩ t_j = ∅, j = 1, ..., k. Let

Q_j ≡ P_j - (t_j ∪ L_j),  Q ≡ ∪_j Q_j,
G_{E_j Q_j} ≡ G_{E_j P_j} × (E_j ∪ (P_j - t_j)) · (E_j ∪ (P_j - t_j - L_j)),
G_Q ≡ G_P × (P - ∪_j t_j) · (P - ∪_j t_j - ∪_j L_j).

Then (G_{E_1 Q_1}, ..., G_{E_k Q_k}; G_Q) is a minimal k-multiport decomposition of G.
STOP

Problem 8.4 Fast algorithm for minimal graphic 2-decomposition: Let G be a graph on E, and let E = E_1 ∪ E_2 (disjoint). Let P_1 be a copy of E_2, and P_2 a copy of E_1. Let G_{E_1 E_2} ≡ G, and let G_{E_1 P_1}, G_{E_2 P_2}, G_{P_1 P_2} all be copies of G.
i. Show that (G_{E_1 P_1}, G_{E_2 P_2}; G_{P_1 P_2}) is a strongly compatible decomposition of G.
ii. Use the algorithm of Problem 8.3 to obtain a minimal graphic 2-decomposition of G.

Problem 8.5 *The N_{AL} - N_{BK} method and 2-decomposition: Show that the N_{AL} - N_{BK} method (see Section 6.4) can be regarded as a special case of network analysis through decomposition into two multiports.
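Steps 1 and 2 of Algorithm 8.3 (used in Problems 8.3 and 8.4) reduce to forest/coforest computations, which can be sketched with a union-find routine. The graph encoding and helper names are our own; contraction is simulated by pre-merging end points and deletion by restricting the candidate edges.

```python
def forest(n_nodes, edges, candidates, pre_merged=(), prefer=()):
    """Greedy spanning forest among `candidates` (edge indices) after
    contracting the edges in `pre_merged`; edges in `prefer` are
    offered to the forest first."""
    parent = list(range(n_nodes))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    def union(i):
        u, v = edges[i]
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
        return True
    for i in pre_merged:
        union(i)
    picked = []
    for i in list(prefer) + [i for i in candidates if i not in prefer]:
        if union(i):
            picked.append(i)
    return picked

# One component graph G_{E_j P_j}: nodes 0..3; E_j are edges 0..1 and
# the ports P_j are edges 2..4 (a small example of our own).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (0, 3)]
E_j, P_j = [0, 1], [2, 3, 4]

# STEP 1: t_j = forest of G_{E_j P_j} x P_j (contract E_j).
t_j = forest(4, edges, P_j, pre_merged=E_j)

# STEP 2: coforest L_j of G_{E_j P_j} . P_j (delete E_j) with
# L_j ∩ t_j = ∅: grow the forest of the deleted graph preferring t_j,
# so that the coforest avoids t_j.
f2 = forest(4, edges, P_j, prefer=t_j)
L_j = [i for i in P_j if i not in f2]

Q_j = [i for i in P_j if i not in t_j and i not in L_j]
print(t_j, L_j, Q_j)   # [3] [4] [2]
```

The preference trick works because t_j, being a forest of the contracted graph, stays acyclic in the deleted graph and can always be extended to one of its forests.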
8. MULTIPORT DECOMPOSITION
304
Problem 8.6 *Formal description of network reduction: Let N (G,D) be a network. Let E = E(G) be partitioned into { E l , . . . , E k } such that the Ei are decoupled in the device characteristic, i.e., V = D E x~ . . . x V E ~ Let. ((VE,P,)~; V p ) be a minimal k-multiport decomposition of VE. Let multiport networksNE,pJ ( V E , P , , ~ E , x 6pJ), j = l ; . . , k . Define D'p,, j = 1 , .. . , k as follows: (vbJ,ibJ)E DbJ iff there exist vectors v ~ , , i E ,s.t. ( v ~ @, (-vLJ),iEJ @ ih,) is a solution of NE, p, . Let Mf. f (Vp,DlP, x . . . 2 ) k k ) i. Let (v,i) be a solution of N. Then there is a unique solution (v;, i>) of Nf. which corresponds to (v,i) s.t.
(v/Ej @ (-v>/Pj), i/Ej @ i>/Pj) isasolutionofNE,p,, j = l ; . . , k ii. The reverse process is not quite unique: Given a solution (vk, i>) of a corresponding solution in the above sense is unique within a pair ( C , i)
&'A,
where G E
ejVE x Ej and E ejV& x Ej.
iii. Let ((VE,Q, ) k , VQ) be a compatible (not necessarily minimal) k-rnultiport decomposition of VE. Define N& and relate its solutions to those of Nb.
Problem 8.7 *Minimal reduced network for an RLC network has no zero eigen values: Show that the network NL defined in Section 8.5 has
i. no nonzero trapped solution if the sources have zero values. ii. all e i g e n d u e s nonzero.
Problem 8.8 *If the decomposition is compatible VE, V p have essentially the same polymatroid: Let ( V ~ , p , ) k ; Vp) be a k-multiport compatible decornposition of VE, ie., V E , ~ , . Pj 2 V p . Pj and VE,p,
x Pj
vp
x pj, j = l , ' . . , k
Define the following set functions on subsets of { 1 , 2 . . . k} PE(1)
=
T(vE
'
(u
Ez))
iEI
wE(1)
=
CT(VE~P, x Ei) iEI
P~(I) =
up
(UP i ) ) &I
Show that
8.7. SOLUTIONS OF' EXERCISES
305
i. p ~p p, axe polyrnatroid rank functions while W E , w p are modular functions. 11.
pfj - W E = p p - W
p
Problem 8.9 *Is the generalized minor operation matroidal?: Let V E P H V p = V E . Show that i. if the vector spaces are over GF2 then M ( U E ) can be determined from
M W E P )M , (VP); ii. in general, knowledge of M ( V E P )M , ( U p ) would not be sufficient for determine M (Y E ) .
US
to
8.7 Solutions of Exercises E 8.1: i. We denote El by E . By Theorem 8.2.2 we have
V E X ~E & ~ V E JX E Thus, T ( V E ~. E , ) - r(V,yp1 x E ) 2 T ( V E J. E ) - T ( V E x~ E ) But the LHS = T ( V E P.~P I )- ~ ( V E Px, PI)51 PI I . Hence, 1 PI 1 1 T ( V E J. S) - T ( V x~ E ) . Next suppose the graph G' is obtained by adding a copy t' of a tree t of G again to G. We have T ( V E , E ) - ~ ( U E x' E ) = r(G). So we cannot do with number of ports less than this number. On the other hand this number of ports is adequate. For, from these port voltages all voltages of the graph are uniquely determined. More formally, use of Algorithm (Port minimisation 2) will allow us to have a multiport decomposition of Vv(G') in which V E is~the~voltage space of a graph G E with PI containing no circuits or cutsets. Hence a copy of a tree of G is always sufficient to act as ports in V E ~ The ~ . structure of the tree does not matter by Lemma 8.4.3. +
..
We require r(G) ports in general. For each component we could add a tree of ports on the set of nodes of the component. 11.
...
If ( V E ~ P. * ~. VE*P*; , V p ) is a multiport decomposition of V , (GI) then we know that ( V i I p l , . . . V k h p k; V b ) is a multiport decomposition of Vi(G'). So the arguments of the previous sections of this problem are adequate to prove that a copy of a forest of would be adequate as ports in general. 111.
E 8.2: i. Consider the equation
~ ~
306
8. MULTIPORT DECOMPOSITION
By row transformation we can transform these into an equivalent form
(8.37) where A l l , A32 are representative matrices of V x E l , V x ( E - El), respectively. The rows of Azl, A22 are linearly independent and in number equal to r(V . El) r(U x E l ) . If All121 # 0, then S2(%1) = 0. So let us assume that A l l x ~= 0. Now we can rewrite Equation 8.37 as shown below as far as 1 2 1 , x2 are concerned. (8.38)
The affine space & ( % I ) is determined by Az1X1. So if p1 = A21Xl we do not require a vector of number of entries greater than that of p1 to determine S z ( X 1 ) . One cannot do with a vector in place of p1 with less number of entries provided that vector is obtained from X I by a linear transformation.To prove this we proceed as follows: The rows of A21 are linearly independent. So for each p1 there exists a corresponding 121. Also for different values of p1 the spaces S2 would be disjoint. Given a value of x2, p1 is determined uniquely. Suppose there is a matrix M s.t. y = Mxl determines &(XI) uniquely. Since & ( X I ) determines A21 (XI)uniquely A21 (XI)= f ( y ) for some function f(.). It is easily seen that f(.) must be a linear transformation. If U,,V, represent the range spaces of A21 and M then we have an onto linear transformation f : (U,) -+ 1.1,. Thus dim(V,) 2 dzrn(Up) as required.
ii. For (B1iB2);: = 0 the minimum number of entries of a vector to determine the affine space of vectors z . ~ knowing z1 can be similarly shown to be r(UL . El) - r ( V L x E l ) = T ( V . El) - r(V x El).
...
If we decompose V E into ( V E ~ PV ~E, ~ PV p~ );where E2 = E - El then the minimum size of P2 equals (r(V . E l ) - r(V x El)). 4 represents El as far as E2 is concerned, in the space V E ~If x ~ E~V E. and we are given xlE1 then the range of possible values xlE2 can take can be determined as follows: First find all possible vectors xlE1 @xp, in V E , ~If~the . decomposition is minimal there would be only one such vector. Let us for simplicity assume the decomposition is minimal. Next find the vector xpl @ xp, in U p , pz . Finally find the collection of vectors X E @j ~ xp, in V~,p,.The restriction of this collection to E2 gives the range of poissible values that xIE2 can take. 111.
E 8.3: The decomposition shown in Figure 8.5 provides such an example.
8.7. SOLUTIONS OF EXERCISES
" c1
"
c2
c3
-
307
N--
r '-v 0
pL 2
Figure 8.5: A Graph and its Multiport Decomposition
8. MULTIPORT DECOMPOSITION
308
E 8.4: i. In ( ~ E ~of QFigure ~ 8.2, the currents entering e l , e4 are always negatives of each other. However, in G E ~ because , of the circulating current in {e4, e.5,e6, e 7 } , this condition is not necessarily satisfied.
.. The port conditions of G E ~ Q ~ , G would E ~ Q ~be satisfied in C j ~ piff the current vectors that can exist in { e l , . . . ,elo} in GEP can also exist in ~ E ~ G QE ~~ QEquiv,~ . ii.
alently, iff
V ~ ( ~ xEEP) C @ ~ V ~ ( G E x, QEi). , The easiest way to check this is to see if each row of the incidence matrix of the graph on the right can be obtained by adding appropriate rows of the incidence matrix on the left. More generally, one can check if the rows of a circuit matrix of g ~ xpE are orthogonal to an incidence matrix of @GE,Q, x Ei.
Remark: In practice we need to know whether the port conditions of electrical ~ devices ~ ~ present are satisfied in the multiport NEQ multiports NE,p, ,N E with with devices present. However, the above procedure is still useful. This is because the characteristics of electrical devices, except those of short circuits and open circuits, cannot be constructed exactly. So one performs the above test after fusing the end points of short circuit branches and deleting open circuit branches. If the test succeeds then of course the connection is permissible. If it fails, the ~ ~ (restricted ~ to probability of the possible current solution vectors of N E @N,y2p2 E ) lying in the set of possible current solution vectors of NEQ(restricted to E ) can be seen to be zero. This can be shown by using the fact that if one picks a vector randomly out of a vector space, the probability of its lying in a given proper subspace is zero.
ej
E 8.5: Let V E P V E , ~. ,Then V E = V E P t) U p . Hence, ( V E t) V Q ) = ( V E Pt)U p ) t) V Q . Since P n Q = 0, the RHS can be written as (VEP f+ V Q )t) V p . N o w V ~ = V ~ ~ ~ . . . ~ V ~ ~ , w h e r e Q j c E ~ , ., 3 = 1 , . . . , k It is then easily seen that (@~E,P,) j
* (@vQ,)= @ ( ~ E ~ tfP VQ,). ~ j
j
E-S
I 1. I t is clear where Q = E - T and VQ has the representative matrix [ 0 that VQ is the direct sum of spaces V Q , whose representative matrices are given by , E J O S J S,-T,
I
1. Hence
S-T
8.7. SOLUTIONS OF EXERCISES
309
Since we have shown above that
the required result about decomposition of a minor follows. E 8.6: Let $ J f ~ J , @ fE,’ 3 E V E . w e will show that for any i, ( $ , + f ~ , ) @ fEJ’ E Y E . This is clearly equivalent to showing that V E = $(VE . E 3 ) .We have, ( YE, p, ) +) V p = V E . So there exist fp, , fp, ’ E V p s.t. fE, CD fp, E VE,P, and fE,’@fp,’ E V ~ , p , , j= l,...,k. Now fp, @ fp,‘ E U p , since V p = $ , ( V P . P 3 ) . Further, fE, fBfp, E VE,P,, j # i and fL, @ fp,‘ E VE,P,. Hence, fE,) @ fEt E LIE.
e3 e3+
ej eJ
(ej+
E 8.7: Let II, E (1, ‘ . , k}. i. Let fE, be a vector in the LHS. Then there exists a vector
ejEI1
Hence there exists a vector
We would then have
where 1 2 = { 1 , . - . , k} - I1 Hence there exists a vector
Thus there exists a vector
8. M ULTlPORT DE COMPOSlTl O N
310
(eJEI,. (eJEI1 E V E .EI,. Thus Using compatibility we have, ((eJEIl V i , p , ) * Vp‘ P I , ) = V& . EI,.
I t is therefore clear that fE,) E V E and hence, LHS C RHS. The reverse containment is easier to see.
fE,)
* ii. Hence, ( ( @ J E I l V k J P , ) l+) (V,‘ . PI,)* = (V,’ . E I , ) ~ This is clearly equivalent to the required result. ... This is a direct consequence of the previous two sections of this problem. 111.
We verify the compatibility condition only for the first section of this iv . problem. We have VE,P, VE,P,
’
PJ 2 VP ‘ pj = ( V P ‘ PI,)‘ p j , j E 11
x PI
s VP x p:, s ( V P . PI,) x P J , J E 11
E 8.8: Figure 8.2 shows Component multiports G;E~Q,,5 7 and~ a port ~ connection ~ ~ diagram GQP such that when the multiports are connected according to the latter the graph GEP shown in the figure results. But the voltage space of this graph G,p cannot be decomposed into the voltage spaces of the given component multiports and the port connection diagram. The reason is that K V E corresponding to (e4, e 5 , e6, e 7 ) in GEP is not indirectly imposed through the component multiports and the port connection diagram. Equivalently, if V E P = ( V V ( G ~ , Q 163) V , ( G E ~ Q ~ ) ) V,(GQP), it is clear that V E P # V , ( ~ E P ) . E 8.9: We have
*
VFQ
= ($($ 1
We first prove the following Lemma 8.7.1
i.e., iff there exist vectors
VEJ,T’, %
* VT’P, 1)
f+
VPQ
(*I
8.7. SOLUTIONS OF EXERCISES
31 1
Applying this lemma to the RHS of (*) we get
It is easy to see in general that
E 8.10: i. As discussed in Section 8.3, when we solve a multiport in terms of ip,, , vpJ2 the essential structure of the equations corresponds to LIE,p, x (Ej U Pjl) . Ej. If V E , ~ is , the voltage space of a graph G E , ~ ,then this structure corresponds to GE, pJ x ( E j U Pjl ) Ej . If we have freedom in choosing Pj, , Pjz we should choose +
that partition which gives us a large number of separators, preferably of uniform size, for the space V E , ~ x, (Ej U Pjl) ’ Ej. If the separators E j l , + . . Ejt , of this space are decoupled in the device characteristic we would have an advantage during analysis.
.-
It can be observed, in Figure 8.4, that if P I were ~ shorted and P I opened ~ we get two separators. The other three options: opening both, shorting both, opening PI1 and shorting P12do not give this advantage. Therefore, while solving for J ~ E ~ P , we should solve in terms of vp,, , ipI2.
11.
...
Shorting all the port edges gives us the structure V E , ~ x, E j . Deleting all port edges gives us the structure V E , ~ *, Ej. Now if we perform nonsingular transformations on the columns Pj of representative matrices of VE,p, and the same transformations also on columns Pj of representative matrices of V p , it can be seen that we would get a new multiport decomposition of LIE. For any vector space V g , s.t. 111.
(using Exercise 7.5). By column transformation if required we can convert the representative matrix of Vpr to the form [O I]. We may, therefore, without loss of generality assume that the representative matrix has this form. But this implies that the generalized minor operation is equivalent to an ordinary minor operation. l the form V E , ~ this ; would correspond to performing a Further, since V ~ p has
ej
8. M ULTIPORT DE CUMPOSITION
312
minor operation on each V E ~ ~ ; . Thus, for any V g = @jV& s.t.
We can find a set of transformed ports P' as well as a partition of each Pj' s.t
u,&= @ v&, = @ ( v E , p i J
(Ej
u pJ')
'
EJ)
3
If we use minimal multiport decomposition we must have
and and the range of possible structures lies between V E x Ej and V E . E j .
E 8.11: We need to show that i.
a compatible decomposition iff
( ( V ~ , p , ) hV; p ) is
.P_>VP
@VElPl
(*>
j
@vEIPl
x p
E VP.
(**)
j
Only if part is obvious. To prove the if part we observe that (@'E,pl)
'
i
@(vf?,pJ"3). j
So the condition (*) implies that U E , ~ ., Pj 2 U p . Pj V j . The condition (**) implies @(Vk]P]
1.p
2 Vp'
j
from which we conclude
v;lpl . Pj 2 v,l. Pj v j
and therefore, I j ~ , p , x Pj C V p x P3 V j . .. 11. (a) By the previous section of this problem we have that compatibility is equivalent to the conditions (*) and (**). But by Theorem 7.6.1 in Problem 7.5 these conditions are equivalent to
along with V,y:,p, . Ej 2 V E . Ej ' d j
(J )
8.7. SOLUTIONS OF EXERCISES
313
and v.qJpjX
Ej & V E X Ej ‘dj
(JJ)
NOW conditions ( J )and ( J J )are equivalent to the compatibility of the decompoof) V P . sition ( ( V E , P , ) ~ , V F The ‘strongly compatible’ case is proved similarly.
.. 11.
(b) By the Implicit Duality Theorem ( ( U E ~ P , )V~p; ) is a decomposition of
U E iff ( ( V i J p)k; J V,‘) is a decomposition of U&. Compatibility of the decomposition of V& follows since V E , ~.,Pj 2 V p . Pj is equivalent to V i J p Jx Pj Vjk x Pj and u p x P? is equivalent to vkjpJ . ~j 2 v,l . Pj. vE, p, x ~j
%.The equivalence of the strong compatibility of the decompositions ( ( V E ~ P ) k,; V P ) and ((VbJp,)k; V h ) is proved similar t o (ii) ( b ) above. We have V E (ejLI~,pj) ++ V p . Let V k G ( e j V ~ , p ;+)) V p i , fp, E VE,P, x Pi. We will first show that V E = V k . Let fE E V E . Then fE = @ fE, and there exists fpj s.t. fp fp, E vp and fE, @ fp, E V E , ~for , each j . By strong compatibility, VE,P,x Pi = V p x Pi. Hence Op-p, + fpt belongs to V P . Let fp denote this vector. Suppose fp, (e) = X f , (e). Then, fp - X f p E V p . Let fp, G fp - Xfp/P’. Since (fp - Xfp)(e) 3 0 , it is clear that f p E V p . We know that fp, E V E , ~x, Pi. Thus, fE, ‘3 (fp, - Xfpz) E V E , P , . But (fp, - xfpt)(e)= 0. so fE, @ (fpt/P,!) E V E , ~ ;Noting . that Pj = P,! v j # i, it is clear that fE, E V b . so V E E V b . The reverse containment is easier to see. Next we need to show that the decomposition ( ( V ~ , p ; ) kVp),where ; Pi = Pj V j # a , is strongly compatible. A vector fp; E V E z p ; x P,! iff fp; E V E ~x ~(Ei , kJ P,!) x P:(= V E , ~x, Pi x P i ) , i.e., iff 0, fpf E U E , ~ x, Pi. NOW by sirrong compatibility of ( ( V ~ , p , ) kU; p ) we have V E , ~x, Pi = V p x Pi. Hence, fp; E U E , ~ x; P,! iff 0, + fp; E V p x Pi, i.e., iff fp; E V p x Pi x P,!, i.e., iff fp; E V p x Pi x Pl, i s . , iff f p E Vpl x P,!. Next, a vector fp; E V E , ~.‘Pi’ ; iff fp; E V E , ~x, (Ei kJPi) . Pi, i.e., iff fp; E U E , P , . Pi x P;, i.e., iff fp; E U p . Pi x P,! (using strong compatibility), i.e., iff E V p x Pi Pi(= U p , . Pi). This proves the strong compatibility of ( ( V ~ , p ; ) kV;p ) . iv.
ej
ej
+
fpt
v. I t is clear by the previous section of this problem that ( ( U k J g , ) k ;VQ‘)is a strongly compatible decomposition of V i . (Observe that IlkEQ,= Uktpzx ( E i U Q i ) ) . But by an earlier section of the present problem this is equivalent to ( ( V E ~ Q ,V)Q ~ ), being a strongly compatible decomposition of V E . vi. The algorithm terminates in a strongly compatible decomposition ( (VE,T,) k ; V T ) , for which V E , ~ ;x T j = VT x T j = 0 and V&,?, x Tj = V$ x T j = 0. By Theorem 8.4.1 it follows that the decomposition is minimal. By the same theorem it also follows that every minimal decomposition is strongly compatible. E 8.12: i. Suppose there exist distinct vectors fp, and fpj’ in V p s.t. fEj @fp, as
ej
ej
8. M ULTIPORT DECOMPOSITION
314
well as fE, @fp,' belong to VE, p, ,3 = 1, . . . , k. But then fp, - fp,' E VE, P, x p3,3 = 1, . . . , k . For minimal decompositions V E , ~ , x P, has zero dimension (Theorem 8.4.1). Hence,
@ fP, ' = @ fP, . 3
3
The linearity of this correspondence is clear since, if for each j ,
and
belongs to V p . Thus, if $, fE, 1 corresponds to fp, 1 and
.. 11.
fp,' it follows that $ , ( X l f ~ ,
1
+ X'fE,')
$, fEJ correspond to 1
corresponds to $,(X,fp,
+ X'fp,').
We use the following facts: vE,P,
'
PJ = VP ' PJ
and
9
V E , ~ , X E3 = VE
X
E3.
(Theorem 8.2.2 and Theorem 8.4.1). The existence of a vector fEJ E VE corresponding to fp, is clear from the 1 fact that V E , ~ ., P, 2 Vp . P, ,j = 1, . . . ,k. If both @, fEJ and fE, L of VE correspond to $, fp, , it is clear that
e3
e3
ej
(fE,' - f E J 2 ) @ O p J E v ~ , ~ , , j = l ; ~ ~ , k fE,1-fE,2
EVE,^, X E , = V E X E , , ~ = ~ , . . . , ~
E 8.13: We use the results in Exercise 8.11. We see that both the decompositioris being minimal are also compatible decompositions of VE. But then ((VE, p , ) k ; V E ) is a decomposition of both Vp and Pp. We conclude that Vp = c p E 8.14: i. It is easily seen that fEJ ~ f p E, V ~ , p , , j= 1 , . . . , k and $fp, E V iff fEJ @fp; E V E , ~and @ f p J 3E Vpr, where fp; F T,(fp,). This proves that ( ( V ~ , p ~ )Vk p, , ) is a decomposition of VE. Minimality follows from the fact that I P' I=I P I . .. 11. We have VE, P, ' E3 = VE, P ' j Ej '
Ej = VE,pfj
EJ Hence, we have a representative matrix for V E , ~ , of the form Rjj 0 and for V ~ , p i jof the form R2j Q& vE,P, X
[2 &]
[
X
8.7. SOLUTIONS OF EXERCISES
315
( 2 ) is a representative matrix for V E , ~.E, j and Rjj is
where
a representative
matrix for V E , ~x, Ej. Now by the minimality of the decomposition (using Theorem 8.4.1) we have
1 PJ I=) PJ‘1
=
T(vEJP,‘ T(UE,P, ’
EJ)- T ( v E J P Jx EJ) PJ)= T ( V E 3 P ’ j ’ PJ’)
Thus, Q 2 J , Q& are representative matrices of V E ,pJ . PJ,V E ,ptJ . P3’ and are square and nonsingular. Clearly, there is a matrix T, s.t. QjJT3= Q&. Now let ( ( $ ~ ~ p ! , ) k , $ p )be the decomposition derived from (( VE,p J ) k ;U p ) by using the nonsingular transformation T, on fpJ as in the previous section of the present problem. Thus, ( ( $ ~ , p t ~ ) k$;p ( ) is a minimal k-decomposition of V E . We claim that the decomposition is identical to ( ( V ~ , p ! ~ ) k ; V p For ! ) . it is clear that $ E , P J ~= V E J p J by construction. By Exercise 8.13, since the decompositions are minimal and therefore compatible we can conclude that Vpf = V p f . E 8.15: i. This is condition (iv) of Theorem 8.4.1.
..
Since the set of columns PJ are linearly independent in a representative matrix of V p it is clear that PJnK1would be linearly independent in a representative matrix of U p . h’l . Similarly, since PJ are linearly independent in a representative matrix of Up‘, PJn ( P - K1) would be linearly independent in a representative matrix Of Vp‘ ( P - K1). Thus, 11.
9
I Pj IL T ( V P
*
K1)
+ T(Upl. ( P
-
K1))
for any K 1 P. The result follows.
iii.
Let
U p ,Ujk
have the representative matrices (RI!, . . . iRh), (B1 i
. . . ! Bk)
and let U p f ,Ub,have the representative matrices (Ri i, . . . !Ri), (B: i . . . i BL), where the columns of Rj, Bj correspond to Pj and those of R[i,BS correspond to Pi. Then it is clear that R[i = RjTj
for an appropriate nonsingular matrix T,, j = 1 , . . . ,Ic. Since (R’)(B’)T= 0 = (R)(B)Twe must have
B; = B ~ ( T ~ ~ ) - ’ . It is clear therefore, that the columns Pj‘ are linearly independent in the representative matrices of Ub, as well as V b , . Hence, I Pj’ ( 5 hybrid rank of U p by the previous section of the present problem. But I Pj’ I=) Pj 1 . So the result follows.
8. M ULTlPORT DE COMPOSlTION
316
8.8
Solutions of Problems
e3 e3
J = 1, , k , satisfy the above condition only if fE, E V E since ( ( V E , Q , ) ~U; Q ) is a decomposition of UE. Hence, this happens only if there exists fp, , J = 1, . . . , k , s.t. fE, @ fp, E VE,p, , j = 1 , . . . ,k,and fp, E U p , since ( ( V ~ , p , ) kV ; p ) is a decomposition of UE.But such fp, must satisfy,
Now fE,,
fPJ
- f , E
VE,P, x
Sirice the decomposition ((VE,p , ) k ; fp,
U p ) is
PJ, j = l , ’ . . , k . compatible this means that
- fp, E V p x P3.
L%P conclude that f , E V p . Hence, c p C V p . On the other hand let fp, E V p Since the decomposition is compatible,
e3
V F > ’ P j‘ V E , p , ‘ P J , J = 1 , ” ’ , k Hence, there exist
alld
fE,
,j
= 1 , . . . ,k , s.t.
8.8. SOLUTIONS OF PROBLEMS
317
+
Consider the space V b G i ) ~( e j ( V x~ E j ) ) .Since both i ) and ~ e j ( V x~ E j ) have the Ej as separators, V b also will have the Ej as separators. But
+
+
,. But UE n V b 3 V E n i ) ~ . Hence, V E V k = V E 3 Hence, ~ ( V EV ,b ) < ~ ( V E i ), ~ )a, contradiction. (V E x Ej ) . We can similarly prove that Hence, i ) 2~
ej
(The result also follows by using duality, i.e., working with V1 in place of V and using the facts that
(V . Ej)* = V* x E j , (V
+ V ’ ) I = VL fl (V’)’
and
d(V,V’) = d(+, ( V ’ ) l ) . Let V g be a vector space on E which has the Ej as separators. Let V& = V E , ~ ,+ ) ) Vb. We know by Exercise 8.6 that V b has Pj, j = l;.. ,k as separators. The decomposition ( ( V E ~ P ,V) ~ p ); is compatible and therefore by Exercise 8.1 I
.. 11.
(ej
VP
= (@(VEJP,)k)
* VE.
j
Hence by Problem 7.8
d(VE,&) = d ( V P , V & ) Thus, the hybrid rank of V E relative to { E l , .. . , E k } is not less than the hybrid rank of V p relative t o {PI , . . . , Pk}. Next let i ) p have the Pj as separators. Then the vector space
has the E3 as separators. Again d ( V p , $ p ) = ~ ( U E , $ E ) So . the hybrid rank of V p relative to {PI , . , Pk} is not less than the hybrid rank of V E relative to { E l , . .. , E k } . This proves the required result. s .
P 8.3: We need t o use the ideas of Exercise 8.11. Let al,.. ‘ , up be the edges of t, and let e l , . . . , ep be the edges of Lj. Let T3( 2 t 3 ) 3, = 1, . . . , k be forests of GE, pJ respectively. We observe that whenever 1 _< T 5 p , the f-cutsets of T3 with respect to a,+l ,. . . ,ap in the graph G E , ~ ,are contained in P3 and remain as f-cutsets even after a1 , . .. ,a,. are contracted. The corresponding
vectors (appropriately padded with zeros whenever required) belong to the voltage spaces of GE,P, x P3,GE,P, as well as that of the graph obtained from G E pJ ~ by contracting al , . . . ,a,.
8. MULTIPORT DECOMPOSITION
318
Since VE,P, x P ’ = V p x P3, (strong compatibility) these vectors (again appropriately padded with zeros) belong to V , ( G p ) as well as the voltage spaces of the graph obtained from G p by contracting a l , . . . ,a,. Thus, the branches al, . , a, may be successively contracted (i.e.,t’ may be contracted) in G E , ~ , as well as G p leaving the remaining components of the decomposition as they were earlier. The voltage spaces of the resulting graphs would continue to be a strongly compatible decomposition of V,(G). This process may be repeated for each of the components of the decomposition since when t , , j # i, is contracted the f-cutsets o f t , remain as they were before the contraction. At the end of this process involving all the j we would have the strongly compatible decomposition, ((GE,T, ) k , G T ) , where GE,T, = G E , P , GT = G P x
x (PJ- t j )
( P - Utj). j
The edges L j would be contained in a coforest of GE,T, as well as GT. (Because contraction of some tree edges would not disturb the corresponding coforest). We have (WGE,TJ
i.e., i.e.,
.Tj
= ( V v ( G T ) ’Tj)
(V~(GE,T,)) x Tj = (vi(G~)) x Tj
V ~ ( G E. T , ~j ) = ( V ~ ( G T . T j~ = ) ,l , . . . , k .
We repeat the argument now in terms of circuits. It would then follow that Lj can be deleted from all the GE,T, as well as GT and the result would be the strongly compatible decomposition ( ( G E , Q ~ )GQ) ~ ; of G. This decomposition would be minimal by Theorem 8.4.1 for the following reasons: i. ~ ( ( G E , Q1,. Q j ) =I Q j I= ~ G . QQ j ) (since the coforest edges Lj have been deleted from GE,T, . T j ) .
..
T((GE,Q,) x Qj) = 0 = ~ ( G Qx Q j ) (since the forest edges of G E , ~ ,x Pj have been contracted).
11-
P 8.4: i. is trivial since the components and the port connection diagram are copies of the same graphs and the relevant sets for application of strong compatibility conditions are also trivial.
..
11.
This is an immediate specialization of the algorithm of Problem 8.3.
P 8.5: In the NAL–NBK method we have a graph whose edge set E(G) is partitioned into {A, B}. Sets K ⊆ A, L ⊆ B are such that G × (A ∪ L) . A ≅ G . A and G . (B ∪ K) × B ≅ G × B. We need to show, using multiport decomposition, that

i. V_v(G) equals the collection of all vectors f_{A−K} ⊕ f_K ⊕ f_B s.t. there exist vectors f_{A−K} ⊕ f_K ⊕ f_L ∈ V_v(G × (A ∪ L));

ii. V_i(G) equals the collection of all vectors
We will only prove the statement about voltage vectors; the statement about current vectors can be proved similarly (dually). For the discussion to follow we need to build copies of graphs derived from G. We use the following notation: the sets P_A, P_B are copies of A, B. If G is alternatively denoted by G_{AB}, then G_{P_AP_B} would denote its copy on P_A ∪ P_B. In general, if G_{ST}, S ⊆ A, T ⊆ B, is a graph derived from G_{AB} by a sequence of operations, then G_{P_SP_T} would denote the graph derived from G_{P_AP_B} by the same sequence of operations on the corresponding elements of the copy. Denote the graphs G × (A ∪ L), G . (B ∪ K) respectively by G_{AL}, G_{BK}, and the graph G . (B ∪ K) × (K ∪ L) ≅ G × (A ∪ L) . (K ∪ L) by G_{KL}.

We will first show that (G_{AP_L}, G_{BP_K}; G_{P_KP_L}) is a 2-multiport decomposition of G.
We start with the strongly compatible decomposition (Exercise 8.11) (V_{AP_B}, V_{BP_A}; V_{P_AP_B}) of V_v(G), where V_{ST} denotes the voltage space V_v(G_{ST}). From Exercise 8.11 we will use the following idea: if (V_{AP_1}, V_{BP_2}; V_{P_1P_2}) is a strongly compatible decomposition of V_{AB} and f_{P_1} ∈ V_{AP_1} × P_1, with e an element in the support of f_{P_1}, then e can be contracted in V_{AP_1} as well as in V_{P_1P_2} and we would be left with a new strongly compatible decomposition of V_{AB}. Dually, if g_{P_1} ∈ V*_{AP_1} × P_1, with e an element in the support of g_{P_1}, then e can be deleted in V_{AP_1} as well as in V_{P_1P_2}, yielding a new strongly compatible decomposition of V_{AB}. If V_{AP_1} is the voltage space of G_{AP_1}, then a cutset (circuit) vector with support contained in P_1 would belong to V_{AP_1} × P_1 (V*_{AP_1} × P_1). So we can work directly with cutsets and circuits of G_{AP_1} × P_1, G_{AP_1} . P_1, respectively.

Build a forest t_K of G . K and extend it to a forest t_A of G . A. Let t_{B−L} be a forest of G . (B − L). Since G × (A ∪ L) . A ≅ G . A, t_A contains no circuits in G × (A ∪ L). Hence, observing that G × (A ∪ L) is obtained by contracting the branches of B − L, t_{B−L} ∪ t_A contains no circuits of G. Extend this set to a forest t of G. Let t̄ be the corresponding coforest. We will denote by t_Y, t̄_Y the sets t ∩ Y, t̄ ∩ Y respectively. For simplicity the copies of these sets in P_A ∪ P_B would also be denoted by the same symbols. Observe that the f-cutsets of t with respect to edges in t_{B−L} do not intersect A. Let e_1, ..., e_p be the edges in t_{B−L}. When e_1, ..., e_r (1 ≤ r < p) are contracted, it is clear that the f-cutsets of t − {e_1, ..., e_r} with respect to e_{r+1}, ..., e_p remain as cutsets in the contracted graph. We can, therefore, contract t_{B−L} in G_{AP_B} and G_{P_AP_B} while leaving G_{BP_A} unaltered, and the resulting voltage spaces would continue to be a strongly compatible decomposition of V_{AB}. The edges of t_{B−L} would now have become selfloops in G_{AP_B} × (A ∪ P_B − t_{B−L}) and G_{P_AP_B} × (P_A ∪ P_B − t_{B−L}). They can, therefore, be deleted in G_{AP_B} × (A ∪ P_B − t_{B−L}) as well as in G_{P_AP_B} × (P_A ∪ P_B − t_{B−L}). But deleting or contracting selfloops has the same effect. So they may be contracted in both the graphs. Thus, at this stage we are left with the graphs G_{AP_L} ≡ G_{AP_B} × (A ∪ P_L), G_{BP_A} and G_{P_LP_A} ≡ G_{P_AP_B} × (P_A ∪ P_L), whose voltage spaces constitute a strongly compatible decomposition of V_{AB}.

Next consider the f-circuits of t with respect to edges in t̄_{A−K} in the graph G. These would remain as such in G . A ≅ G × (A ∪ L) . A and therefore also in G_{BP_A} and G_{P_AP_L}. So t̄_{A−K} may be deleted in G_{BP_A} and G_{P_AP_L} while leaving G_{AP_L} unaltered. The resulting voltage spaces would continue to be a strongly compatible decomposition of V_{AB}. The edges of t_{A−K} would now have become coloops. They can therefore be contracted in G_{BP_A} . (B ∪ P_A − t̄_{A−K}) and G_{P_AP_L} . (P_A ∪ P_L − t̄_{A−K}). But deleting or contracting coloops has the same effect as far as voltage spaces are concerned. So they may be deleted in both the graphs. Thus at this stage we are left with G_{AP_L}, G_{BP_K} ≡ G_{BP_A} . (B ∪ P_K), G_{P_KP_L} ≡ G_{P_AP_L} . (P_K ∪ P_L). Their voltage spaces constitute a 2-multiport decomposition of V_{AB}.

We will next show that the port voltage matching conditions are equivalent to the condition that the voltage vector on K be the same in G × (A ∪ L) and G . (B ∪ K). The equivalence of the port current matching conditions to the condition that the current vector on L be the same in G . (B ∪ K) and G × (A ∪ L) can be proved similarly (dually). In what follows, if f_Y is a vector on Y, f_{P_Y} would denote that vector on P_Y whose value on e' is the value of f_Y on e, where e' in P_Y is the copy of e in Y.
Let f_A ⊕ f_{P_L}, f_B ⊕ f_{P_K}, f_{P_K} ⊕ f_{P_L} be vectors of V_v(G_{AP_L}), V_v(G_{BP_K}), V_v(G_{P_KP_L}) respectively. Let us write f_A as f_{A−K} ⊕ f_K. The vector f_K ⊕ f_{P_L} ∈ V_v(G_{AP_L} . (K ∪ P_L)). Hence, noting that G_{P_KP_L} ≅ G_{AP_L} . (P_K ∪ P_L), the vector f_{P_K} ⊕ f_{P_L} ∈ V_v(G_{P_KP_L}). The vector f_B ⊕ f_{P_K} belongs to V_v(G_{P_AP_L} . (P_K ∪ P_L)) (= V_v(G_{P_KP_L})). Hence, the vector f_K ⊕ f_{P_L} ∈ V_v(G_{AP_L} . (K ∪ P_L)).

Since (v'_{E_j} ⊕ (−v'_{P_j}), i'_{E_j} ⊕ i'_{P_j}), in addition to the above conditions, also belongs to D_{E_j} × S_{P_j}, it must be a solution of N_{E_jP_j}. So (v'_{P_j}, i'_{P_j}) ∈ V_{D_j}. Further, ⊕_j v'_{P_j} ∈ V_P and ⊕_j i'_{P_j} ∈ V*_P. Thus, (v'_P, i'_P) is a solution of N'_P. If (v''_P, i''_P) is another such solution, v'_{P_j} − v''_{P_j} ∈ V_{E_jP_j} × P_j for each j. Since the decomposition is minimal (using Theorem 8.4.1), each of these spaces has zero dimension. Hence, v'_P = v''_P. Similarly one can show that i'_P = i''_P.

iii. (v'_E, i'_E), (v''_E, i''_E) are both solutions of N corresponding to (v'_P, i'_P) iff, for each j, v'_{E_j} − v''_{E_j} ∈ V_{E_jP_j} × E_j ⊆ V_E × E_j and i'_{E_j} − i''_{E_j} ∈ V*_{E_jP_j} × E_j ⊆ V*_E × E_j.

N'_M would be defined in an identical manner to N'_P. A solution (v'_M, i'_M) of N'_M corresponds to a collection S_N of solutions of N such that whenever (v_1, i_1), (v_2, i_2) belong to S_N we must have (v_1 − v_2) ∈ ⊕_j V_E × E_j and (i_1 − i_2) ∈ ⊕_j V*_E × E_j. But whenever two solutions of N differ in this manner (by the previous sections of the present problem) they correspond to the same solution (v'_P, i'_P) of N'_P. Hence, each solution of N'_M corresponds to a unique solution of N'_P.
P 8.7: i. Using equations 8.33 and 8.34 we can show that i_{P_R} = −G_P v_{P_R}, where G_P is a positive definite matrix. The edges P_R in N'_P would therefore have the device characteristic

i'_{P_R} = G_P v'_{P_R}.

P_C, P_L have the device characteristics

C_P (dv'_{P_C}/dt) = i'_{P_C},   L_P (di'_{P_L}/dt) = v'_{P_L},

where C_P, L_P are positive definite. By the discussion on trapped solutions in RLMC networks in the above section we know that in N'_P, if (v, i) is trapped relative to P_R then v/P_R = 0 and i/P_R = 0; if trapped relative to P_L then v/P_L = 0 and i/P_L ∈ V*_P × P_L; if trapped relative to P_C then i/P_C = 0 and v/P_C ∈ V_P × P_C. But the decomposition is given to be minimal. So by Theorem 8.4.1 we have r(V_P × P_C) = 0 and r(V*_P × P_L) = 0. So i/P_L = 0 and v/P_C = 0. This proves the required result.
ii. We will show that in N'_P a zero eigenvalue solution has to be a trapped solution relative to P_C, P_R, P_L. A zero eigenvalue solution implies that v̇'_{P_C}, i̇'_{P_L} are zero (this means v'_{P_C}, i'_{P_L} are constant vectors). Let v'_{P_C} ∉ V_P × P_C. This means that either v'_{P_R} ≠ 0 or v'_{P_L} ≠ 0. Since i̇'_{P_L} = 0 we must have v'_{P_L} = 0. So v'_{P_R} ≠ 0. Since G_P is positive definite we conclude that ⟨v'_{P_R}, i'_{P_R}⟩ ≠ 0. Now we have

⟨v'_{P_C}, i'_{P_C}⟩ + ⟨v'_{P_L}, i'_{P_L}⟩ + ⟨v'_{P_R}, i'_{P_R}⟩ = 0.

So either ⟨v'_{P_C}, i'_{P_C}⟩ ≠ 0 or ⟨v'_{P_L}, i'_{P_L}⟩ ≠ 0. As we have seen, v'_{P_L} = 0, and since v̇'_{P_C} = 0, we must have i'_{P_C} = 0. This is a contradiction. A similar contradiction can be derived from the assumption that i'_{P_L} ∉ V*_P × P_L.

Thus, a zero eigenvalue solution corresponds to a trapped solution. But from the previous section of this problem the only trapped solution to this problem is the zero solution. This proves the required result.
P 8.8: i. This is routine.

ii. We have

r(V_{EP} ↔ V_P) = r(V_{EP} × E) + r((V_{EP} . P) ∩ V_P) − r((V_{EP} × P) ∩ V_P)

(see Problem 7.1). Let V_{EP} ≡ ⊕_j V_{E_jP_j} and let ((V_{E_jP_j})_k; V_P) be a compatible decomposition of V_E. We know that

V_{E_jP_j} . P_j ⊇ V_P . P_j,  j = 1, ..., k,

V_{E_jP_j} × P_j ⊆ V_P × P_j,  j = 1, ..., k.

By Exercise 8.7, ((V_{E_jP_j})_{j∈I}; V_P . P_I) is a compatible decomposition of V_E . E_I. Thus,

r(V_E . E_I) = r((⊕_{j∈I} V_{E_jP_j}) × E_I) + r(V_P . P_I) − r((⊕_{j∈I} V_{E_jP_j}) × P_I)
as required.

P 8.9: i. If the vector spaces are over GF(2), M(V_S) fully determines V_S. Hence if we know M(V_{EP}), M(V_P), we know V_{EP}, V_P and therefore V_E and M(V_E).

ii. See Example 7.1, p. 95, [Narayanan86a].
Chapter 9

Submodular Functions

9.1 Introduction

In combinatorial mathematics submodular functions are a relatively recent phenomenon. Systematic interest in this area perhaps began with the work of Edmonds in the late sixties [Edmonds70]. By then matroids were well studied, with numerous applications to engineering systems already known. Submodular functions can be regarded as a generalization of matroid rank functions, and it is natural to wonder whether they are really required. The answer is that, even if we ignore considerations of theory, we come across them far more often in practical problems than we come across matroids. The method of attack for these problems using submodular function theory is usually quite simple, and the algorithms generated are very efficient. Study of basic 'submodular' operations such as convolution and Dilworth truncation is likely to prove fruitful for practical algorithm designers since, in addition to completely capturing the essence of many practical situations, they also allow us to give acceptable approximate solutions to several intractable problems.

In this chapter we begin with simple equivalent restatements of the definition of submodularity, along with a number of instances where submodular functions are found in 'nature'. Next we discuss some standard operations by which we get new submodular / supermodular (more compactly, semimodular) functions starting from such functions. We then pay special attention to the important special cases of matroid and polymatroid rank functions. Next we give a sketch of the polyhedral approach to the study of semimodular functions. Finally we give a brief outline of some recent work on minimization of symmetric submodular functions. The important notions of convolution and Dilworth truncation of submodular functions are relegated to subsequent chapters.
9.2 Submodularity
We begin with a few definitions of submodularity which are easily proved to be equivalent. The most useful form appears to be the one which essentially states that the 'rate of increase' of a submodular function is less on 'larger' sets (analogous to 'cap' (concave) functions over the real line). Thereafter we present a number of simple examples of submodular functions.
Definition 9.2.1 Let S be a finite set and let f : 2^S → ℜ. f is said to be a submodular (supermodular) function iff

f(X) + f(Y) ≥ f(X ∪ Y) + f(X ∩ Y)  ∀X, Y ⊆ S    (9.1)

(f(X) + f(Y) ≤ f(X ∪ Y) + f(X ∩ Y)  ∀X, Y ⊆ S).
The function f is modular if the inequality is replaced by equality. A function is semimodular if it is submodular or supermodular.¹ The following theorem gives equivalent conditions for submodularity / supermodularity. These conditions are often easier to apply in practice than the original definitions. The conditions for supermodularity are obtained by reversing the submodular inequalities, and the proof of the equivalence is obtained by reversing the inequalities line by line in the submodular case proof.
Theorem 9.2.1 (k) i. A function f : 2^S → ℜ is submodular iff it satisfies any one of the following properties:

f(X ∪ a) − f(X) ≥ f(X ∪ b ∪ a) − f(X ∪ b)  ∀X ⊆ S, ∀a, b ∈ S − X    (9.2)

f(X ∪ a) − f(X) ≥ f(Y ∪ a) − f(Y)  ∀X ⊆ Y ⊆ S, ∀a ∈ S − Y    (9.3)

f(X ∪ Z) − f(X) ≥ f(Y ∪ Z) − f(Y)  ∀X ⊆ Y ⊆ S, ∀Z ⊆ S − Y    (9.4)

ii. The function f is supermodular iff it satisfies any one of the above three properties with the inequalities reversed.
Proof: i. We show (9.2) ⇒ (9.3) ⇒ (9.4) ⇒ (9.1) ⇒ (9.2).

(9.2) ⇒ (9.3): Let Y = X ∪ b_1 ∪ b_2 ∪ ... ∪ b_k. If a is not in Y, we have, by (9.2),

f(X ∪ a) − f(X) ≥ f(X ∪ b_1 ∪ a) − f(X ∪ b_1)

≥ f(X ∪ b_1 ∪ b_2 ∪ a) − f(X ∪ b_1 ∪ b_2)
¹Warning: before the early 70s, submodular functions used to be referred to as semimodular functions in the literature.
...

≥ f(Y ∪ a) − f(Y).

(9.3) ⇒ (9.4): Let Z = {a_1, ..., a_t}. Applying (9.3) repeatedly,

f(X ∪ a_1 ∪ ... ∪ a_t) − f(X ∪ a_1 ∪ ... ∪ a_{t−1}) ≥ f(Y ∪ a_1 ∪ ... ∪ a_t) − f(Y ∪ a_1 ∪ ... ∪ a_{t−1}).

Adding all the inequalities we get (9.4).

(9.4) ⇒ (9.1): This is immediate by setting (in (9.4)) X to X ∩ Y, Z to (X − X ∩ Y) (= (X ∪ Y − Y)) and Y to Y.

(9.1) ⇒ (9.2): This is immediate by setting (in (9.1)) X to X ∪ a and Y to X ∪ b.

ii. The supermodular case is similar. □
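Since S is finite, the equivalences in Theorem 9.2.1 can be checked by exhaustive enumeration. The sketch below is our own illustration (not from the text): it tests conditions (9.1) and (9.3) against each other on a 4-element ground set, using a concave-of-cardinality function (submodular) and a convex-of-cardinality one (not submodular).

```python
import math
from itertools import combinations

def subsets(S):
    """All subsets of S as frozensets."""
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def holds_91(f, S):
    """Inequality (9.1): f(X) + f(Y) >= f(X ∪ Y) + f(X ∩ Y) for all X, Y."""
    return all(f(X) + f(Y) >= f(X | Y) + f(X & Y) - 1e-9
               for X in subsets(S) for Y in subsets(S))

def holds_93(f, S):
    """Property (9.3): f(X ∪ a) − f(X) >= f(Y ∪ a) − f(Y) for X ⊆ Y, a ∉ Y."""
    for Y in subsets(S):
        for X in subsets(Y):
            for a in S - Y:
                if f(X | {a}) - f(X) < f(Y | {a}) - f(Y) - 1e-9:
                    return False
    return True

S = frozenset(range(4))
f = lambda X: math.sqrt(len(X))   # concave in |X|: submodular
g = lambda X: len(X) ** 2         # convex in |X|: not submodular
```

The two tests agree on both functions, as the theorem demands.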
We now give some common examples of submodular functions. The first set involves graphs. Let G be a graph on vertices V and edges E.

Example 9.2.1 (k) Let V(X) ≡ set of endpoints of edges of X, X ⊆ E(G). Then |V|(.) (called the vertex function of G) is submodular.

Example 9.2.2 (k) Let E(V_1) ≡ set of edges with both endpoints in V_1, V_1 ⊆ V(G). Then |E|(.) (called the interior edge function of G) is supermodular.

Example 9.2.3 (k) Let I(V_1) ≡ set of edges with at least one endpoint in V_1, V_1 ⊆ V(G). Then |I|(.) (called the incidence function of G) is submodular.

Example 9.2.4 (k) Let Γ(V_1) ≡ set of vertices adjacent to some vertex in V_1, V_1 ⊆ V(G). Then |Γ|(.) (called the adjacency function of G) is submodular.

Example 9.2.5 (k) Let cut(V_1) ≡ set of all branches with only one endpoint in V_1, V_1 ⊆ V(G). Then |cut|(.), called the cut function of G, is submodular.²

²Throughout, |X(.)| and |X|(.) are used interchangeably to specify the cardinality of X(.), where X(.) is any set function.
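Examples 9.2.2, 9.2.3 and 9.2.5 can be rendered directly in code. The following sketch is ours (the 4-vertex graph is an arbitrary choice); it verifies the claimed semimodularity by enumerating all pairs of vertex subsets.

```python
from itertools import combinations

# A small undirected graph on vertices 0..3 (arbitrary illustrative choice).
V = frozenset(range(4))
EDGES = [frozenset(p) for p in [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]]

def interior(V1):
    """|E|(V1): edges with both endpoints in V1 (Example 9.2.2)."""
    return sum(1 for e in EDGES if e <= V1)

def incidence(V1):
    """|I|(V1): edges with at least one endpoint in V1 (Example 9.2.3)."""
    return sum(1 for e in EDGES if e & V1)

def cut(V1):
    """|cut|(V1): edges with exactly one endpoint in V1 (Example 9.2.5)."""
    return sum(1 for e in EDGES if len(e & V1) == 1)

def semimodular(f, sense):
    """Check f(X)+f(Y) >= (or <=) f(X|Y)+f(X&Y) over all pairs of vertex sets."""
    subs = [frozenset(c) for k in range(len(V) + 1) for c in combinations(V, k)]
    for X in subs:
        for Y in subs:
            lhs, rhs = f(X) + f(Y), f(X | Y) + f(X & Y)
            if (sense == 'sub' and lhs < rhs) or (sense == 'sup' and lhs > rhs):
                return False
    return True
```

On this graph the interior edge function comes out supermodular and the incidence and cut functions submodular, matching the examples.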
Example 9.2.6 (k) Let G be a directed graph. We can now define, analogous to the definition in Example 9.2.1, V_tail(X), V_head(X) over the subsets of the edge set of the directed graph. These lead to submodular functions. Analogous to the definitions in Examples 9.2.3, 9.2.4, 9.2.5, we could define I_out(.), I_in(.), Γ_out(.), Γ_in(.), cut_out(.), cut_in(.) etc. In each case the functions are submodular. (The functions |I_out|(.), |I_in|(.) are actually modular.)

The above examples can be generalized to the context of hypergraphs as represented by bipartite graphs. The reader might like to think of the right vertex set of the bipartite graph as the edge set of the hypergraph.
Example 9.2.7 Let B = (V_L, V_R, E) be a bipartite graph. Let E_L(X) ≡ set of all vertices in V_R adjacent only to vertices in X, X ⊆ V_L. E_R(.) is defined similarly on subsets of V_R. Then |E_L|(.), |E_R|(.) (called the left exclusivity function and the right exclusivity function respectively of B) are supermodular.

Example 9.2.8 Let B = (V_L, V_R, E) be a bipartite graph. For X ⊆ V_L define c(X) to be the set of all vertices in V_R whose images under Γ(.) intersect both X and V_L − X. Then |c|(.) is submodular.

The next couple of examples are of the matroid kind.
Example 9.2.9 (k) Let E be the set of columns of a matrix over any field F. Then the rank function r(.) on the subsets of E is submodular.

Example 9.2.10 (k) Let G be a graph on the set of edges E(G). Let r(X), X ⊆ E, be the number of edges in a forest of G.X, the subgraph on X. Let r'(X), X ⊆ E, be the number of edges in a forest of G × X, the graph obtained by shorting and removing all edges in E − X. Let ν(X), X ⊆ E, be the number of edges in a coforest of G × X. Let ν'(X), X ⊆ E, be the number of edges in a coforest of G.X. Then r(.), ν(.) (called the rank and nullity functions of the graph respectively) are submodular, while r'(.), ν'(.) are supermodular.

Exercise 9.1 Show that the functions listed in the above examples are submodular or supermodular as the case may be.

Remark: In Examples 9.2.9, 9.2.10 above, the size of the 'maximal independent set' contained in a subset turns out to be submodular. Weaker notions of independence do not always yield submodular functions. Exercise 9.2 presents such an instance.

Exercise 9.2 Let G be a graph. Let a set of vertices V_1 ⊆ V(G) be called e-independent if no two vertices of V_1 are joined by an edge. Let k(V_1), V_1 ⊆ V(G), be the maximum size of an e-independent set contained in V_1. Show that k(.) is not in general submodular or supermodular.

Exercise 9.3 Let (V_L, V_R, E) be a bipartite graph. Let w(.) be a nonnegative weight function assigned to V_R. Let q : 2^{V_L} → R be defined by q(X) ≡ w(Y), where Y is the set of all vertices in V_R which are adjacent to every vertex in X. Let q' : 2^{V_L} → R be defined by q'(X) ≡ w(Z), where Z is the set of all vertices in V_R which are adjacent to none of the vertices in X.

i. Show that q'(X) = w(E_L(V_L − X)) and hence is supermodular.
ii. Define the complementary bipartite graph B̄ = (V_L, V_R, Ē) of B as follows: e ∈ Ē iff the endpoints of e are not connected by an edge of E. Show that q_B(X) ≡ q'_{B̄}(X), where q_B and q'_{B̄} denote the appropriate functions defined for B and B̄ respectively. Hence show that q_B is supermodular.
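Exercise 9.2 above can be settled mechanically. In this sketch (ours; the 3-vertex path is the smallest convenient witness) the e-independence number k(.) is computed by brute force, and pairs violating each of the two inequalities are collected.

```python
from itertools import combinations

# The path a—b—c (vertices 0—1—2): the smallest convenient witness.
V = frozenset(range(3))
EDGES = [(0, 1), (1, 2)]

def k(V1):
    """Maximum size of an e-independent (stable) set contained in V1."""
    for size in range(len(V1), 0, -1):
        for cand in combinations(sorted(V1), size):
            if all(not (a in cand and b in cand) for a, b in EDGES):
                return size
    return 0

def subsets(S):
    return [frozenset(c) for t in range(len(S) + 1) for c in combinations(S, t)]

# Pairs violating submodularity and supermodularity respectively.
sub_violations = [(X, Y) for X in subsets(V) for Y in subsets(V)
                  if k(X) + k(Y) < k(X | Y) + k(X & Y)]
sup_violations = [(X, Y) for X in subsets(V) for Y in subsets(V)
                  if k(X) + k(Y) > k(X | Y) + k(X & Y)]
```

For instance, X = {a, b}, Y = {b, c} violates submodularity (1 + 1 < 2 + 1), while X = {a}, Y = {b} violates supermodularity.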
The terms ‘submodular’ and ‘supermodular’ have arisen from the well known notion of modular set functions. However in our framework the latter functions are essentially trivial. For completeness we define modularity and give equivalent definitions below.
Definition 9.2.2 A function w : 2^S → R is modular iff it satisfies the relation

w(X) + w(Y) = w(X ∪ Y) + w(X ∩ Y)  ∀X, Y ⊆ S.

If w(X) ≡ Σ_{e∈X} w(e), X ⊆ S, then we call w(.) a weight function.
Theorem 9.2.2 (k)

i. Let w(.) be a modular function on subsets of S. Then w(X) = Σ_{e∈X}(w(e) − w(∅)) + w(∅). Hence, if w(∅) = 0, w(X) = Σ_{e∈X} w(e).

ii. If w_E(X) ≡ |X ∪ E|, where E is a fixed subset of S, then w_E(.) is modular. (Note that w_E(∅) = |E|, which could be nonzero.)

iii. The function f(X) ≡ |X ∩ E|, where E is a fixed subset of S, is modular. In particular |.| is modular. (In this case observe that the weight of elements in S − E is zero.)
Proof: We prove only part (i). Let X ⊆ S and let a ∈ S − X. We then have

w(X) + w(a) = w(X ∪ a) + w(∅).

The result then follows by induction. □
9.3 Basic Operations on Semimodular Functions
We now present a number of operations which act on submodular / supermodular / modular functions and convert them to submodular or supermodular functions. We begin with addition and scalar multiplication; here the underlying set does not change. In the case of 'direct sum' it becomes the disjoint union of the original sets, while in the case of 'fusion' it is a partition of the original set. We next consider the fundamental operations of 'restriction', 'contraction' and two types of dualization.
In the case of contraction and restriction the new functions are over the power sets of appropriate subsets of the old set, while the dualization operations do not change the underlying set.

Definition 9.3.1 Let p_1(.), p_2(.) be real valued set functions on the subsets of S_1, S_2, where S_1 ∩ S_2 = ∅. The direct sum of p_1(.), p_2(.), denoted by (p_1 ⊕ p_2)(.), is defined over subsets of S_1 ⊎ S_2 by

(p_1 ⊕ p_2)(X_1 ∪ X_2) ≡ p_1(X_1) + p_2(X_2)  ∀X_1 ⊆ S_1, X_2 ⊆ S_2.
Exercise 9.4 (k) Let p_1(.), p_2(.) be submodular functions and let w(.) be a modular function on the subsets of S. Then (p_1 + p_2)(.), (p_1 + w)(.), (p_1 − w)(.), λp_1(.) (λ ≥ 0) are submodular, while −p_1(.) is supermodular. If p_1, p_2 are submodular (supermodular) on subsets of disjoint sets S_1, S_2 then (p_1 ⊕ p_2)(.) is submodular (supermodular).
A common technique in optimization problems which involve finding the 'best subset' is to somehow show that the optimum subset can be thought of as a union of some of the blocks of an appropriate partition of the underlying set. In this manner the size of the problem is reduced, since each block can be treated as a single element. The fusion operation defined below formalizes this notion.

Definition 9.3.2 Let p(.) be a set function on subsets of S and let Π be a partition {S_1, ..., S_k} of S. Then the fusion of p relative to Π, denoted by p_fus.Π(.), is defined on subsets of Π by

p_fus.Π(X_f) ≡ p(∪_{T∈X_f} T),  X_f ⊆ Π.
It is immediate that p_fus.Π(.) is submodular (supermodular) [modular] if p(.) is such a function.

Contraction, restriction and dualization are fundamental matroidal operations. For graphs these ideas correspond to short circuiting (contracting), open circuiting (deleting) and taking planar duals. These ideas generalize naturally to submodular functions. We prefer to define them for real valued set functions and then specialize them.

Definition 9.3.3 Let p(.) be a real valued set function on subsets of S. The restriction of p(.) to X ⊆ S, denoted by p/X(.), is defined by

p/X(Y) ≡ p(Y)  ∀Y ⊆ X ⊆ S.

(Note that there is an abuse of notation here: the original function is on 2^S, while the restriction according to the definition of page 16 is on 2^X.) The contraction of p(.) to X ⊆ S, denoted by p∘X(.), is defined by

p∘X(Y) ≡ p(Y ∪ (S − X)) − p(S − X)  ∀Y ⊆ X ⊆ S.
Definition 9.3.4 Let p(.) be a real valued set function on subsets of S. Let a(.) be a modular function with a(∅) = 0 (i.e., a(.) is 'essentially' a real vector) and let a(e) ≥ p(e) ∀e ∈ S. The comodular dual of p(.) relative to a(.), denoted by p*(.), is defined by

p*(X) ≡ Σ_{e∈X} a(e) − [p(S) − p(S − X)].

(If a(.) is unspecified we take a(e) ≡ p(e) ∀e ∈ S.) The contramodular dual of p(.), denoted by p^d(.), is defined by

p^d(X) ≡ p(S) − p(S − X).
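The operations of Definitions 9.3.3 and 9.3.4, together with fusion, translate directly into higher-order functions. The sketch below is our own illustration (sets are frozensets; a(.) is given as a dict of element weights); its checks anticipate Theorem 9.3.4, with min(|X|, 2) as an arbitrary submodular test function.

```python
from itertools import combinations

def restriction(p, X):
    """p/X : p itself, viewed on subsets of X."""
    return lambda Y: p(Y)

def contraction(p, S, X):
    """(p o X)(Y) = p(Y ∪ (S − X)) − p(S − X)."""
    return lambda Y: p(Y | (S - X)) - p(S - X)

def contramodular_dual(p, S):
    """p^d(X) = p(S) − p(S − X)."""
    return lambda X: p(S) - p(S - X)

def comodular_dual(p, S, a):
    """p*(X) = Σ_{e∈X} a(e) − [p(S) − p(S − X)], with a a dict of weights."""
    return lambda X: sum(a[e] for e in X) - (p(S) - p(S - X))

def fusion(p):
    """p_fus.Π(X_f) = p(union of the blocks in X_f)."""
    return lambda Xf: p(frozenset().union(*Xf)) if Xf else p(frozenset())

def is_submodular(f, S):
    """Brute-force check of inequality (9.1) on subsets of S."""
    subs = [frozenset(c) for k in range(len(S) + 1) for c in combinations(S, k)]
    return all(f(X) + f(Y) >= f(X | Y) + f(X & Y) for X in subs for Y in subs)
```

Numerically, the contramodular dual of the test function comes out supermodular and its comodular dual submodular, as Theorem 9.3.4 below asserts.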
Let p(.), Π, a(.) be as in the above definitions. We collect properties of contraction, restriction, fusion, comodular and contramodular dualization in the following theorems. The first of these speaks of how to reverse the order of contraction and restriction without affecting the outcome. The reader might like to compare it with Theorems 3.4.1 and 3.4.2. The routine proof is omitted.

Theorem 9.3.1 (k) i. Let P ⊆ Q ⊆ S. Then

(p∘(S − P)/(Q − P))(Y) = (p/Q∘(Q − P))(Y)  ∀Y ⊆ Q − P.

ii. Let P ⊆ Q ⊆ S and let P, Q be unions of blocks of Π. Let P_f, Q_f be the sets of blocks of Π contained in P, Q respectively. Then

(p∘(S − P)/(Q − P))(Y) = (p_fus.Π∘(Π − P_f)/(Q_f − P_f))(Y_f) = ((p_fus.Π)/Q_f∘(Q_f − P_f))(Y_f),

where Y is the union of the blocks in Y_f ⊆ Π.
The next theorem speaks of the dual of the dual of a function (the function itself), of the duality of contraction and restriction, and of the self-dual nature of fusion. We note that if the comodular dual of p(.) is taken with respect to a(.), then we would take that of p_fus.Π(.) with respect to a_fus.Π(.).

Theorem 9.3.2 (k) If a(∅) = 0, then
Proof: i.

(p*∘X)(Z) = p*(Z ∪ (S − X)) − p*(S − X)

= Σ_{e∈Z∪(S−X)} a(e) − [p(S) − p(X − Z)] − (Σ_{e∈S−X} a(e) − [p(S) − p(X)])

= Σ_{e∈Z} a(e) − (p(X) − p(X − Z))

= (p/X)*(Z).

(c) This follows by using the above two results.

(d) This follows from the definitions of fusion and of the comodular duals of p(.) and p_fus.Π(.).

ii. The proof is similar to the '*' case and is omitted. □
The next theorem is a generalization of Corollaries 3.4.3 and 3.4.2. Its routine proof is omitted.

Theorem 9.3.3 (k) Let A ⊆ S and let X ⊆ (S − A). Then
We now show that contraction and restriction preserve submodularity and supermodularity, while comodular and contramodular dualization behave as the names indicate.

Theorem 9.3.4 (k)

i. Let X ⊆ S. Then p/X(.), p∘X(.) are submodular (supermodular) if p(.) is submodular (supermodular).

ii. If p(.) is submodular (supermodular), then p*(.) is submodular (supermodular) while p^d(.) is supermodular (submodular).
Proof: We consider only the submodular case.

i. The submodularity of the restriction of a submodular function is obvious. We now consider contraction. We have

p(Y_1 ∪ (S − X)) + p(Y_2 ∪ (S − X)) ≥ p(Y_1 ∪ Y_2 ∪ (S − X)) + p((Y_1 ∩ Y_2) ∪ (S − X))  ∀Y_1, Y_2 ⊆ X.

Further, p(S − X) is a constant for subsets of X. The submodularity of p∘X(.) follows.

ii. We have

p(S − X) + p(S − Y) ≥ p(S − (X ∪ Y)) + p(S − (X ∩ Y))  ∀X, Y ⊆ S.

Further, w(X) ≡ Σ_{e∈X} a(e) is a modular function and p(S) is a constant. The submodularity of p*(.) follows. The supermodularity of p^d(.) follows on noting that the negative of a submodular function is supermodular. □
Submodular and supermodular functions associated with graphs, hypergraphs (as represented by bipartite graphs) etc. usually behave in an interesting way: a basic operation such as the ones described above, applied to a semimodular function associated with a graph G, takes it to another such function associated with a second graph which can be derived from G in a simple way. The exercises given below illustrate this idea.
Exercise 9.5 (k) Rank and nullity functions of a graph. Let G be a graph and let X ⊆ E(G). We remind the reader that G.X is the subgraph of G on X and G × X is the graph obtained from G by fusing the end points of edges in E(G) − X and removing them. Let r'(X), X ⊆ E, be the number of edges in a forest of G × X. We will call r'(.) the prime rank function of G. Let ν'(X), X ⊆ E, be the number of edges in a coforest of G.X. We will call ν'(.) the prime nullity function of G. Prove:

i. The rank function of G.X = r/X(.).

ii. The rank function of G × X = r∘X(.).

iii. r^d(.) = r'(.).

iv. r*(.) = ν(.).

v. The nullity function of G × X = ν/X(.).

vi. The nullity function of G.X = ν∘X(.).

vii. The prime rank function of G × X = r'/X(.).

viii. The prime rank function of G.X = r'∘X(.).

ix. The prime nullity function of G.X = ν'/X(.).

x. The prime nullity function of G × X = ν'∘X(.).
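Parts iii and iv can be confirmed computationally: r'(.) is computed directly (contract E − X with a union-find, then grow a forest of X) and compared with the contramodular dual of r(.), while the comodular dual r*(.) (taking a(e) = 1, valid since there are no selfloops) is compared with the nullity ν(.). The graph is our illustrative choice; this is a sketch, not the book's proof.

```python
from itertools import combinations

EDGES = [(0, 1), (1, 2), (2, 0), (2, 3)]   # a triangle with a pendant edge
E = frozenset(range(len(EDGES)))

def forest_size(contract, count):
    """Fuse endpoints of `contract` edges, then count forest edges among `count`."""
    parent = {}
    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    def union(i):
        a, b = find(EDGES[i][0]), find(EDGES[i][1])
        if a == b:
            return 0
        parent[a] = b
        return 1
    for i in contract:
        union(i)
    return sum(union(i) for i in sorted(count))

r = lambda X: forest_size((), X)              # rank of G.X
rprime = lambda X: forest_size(E - X, X)      # prime rank: rank of G x X
nu = lambda X: len(X) - rprime(X)             # nullity of G x X
rd = lambda X: r(E) - r(E - X)                # contramodular dual r^d
rstar = lambda X: len(X) - (r(E) - r(E - X))  # comodular dual r* with a(e) = 1

all_subsets = [frozenset(c) for k in range(len(E) + 1) for c in combinations(E, k)]
```

Enumerating all edge subsets, r^d agrees with r' and r* agrees with ν, exactly as items iii and iv claim.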
Exercise 9.6 The incidence function |I|(.) and the interior edge function |E|(.) of a graph G (see Examples 9.2.2 and 9.2.3). Let G' denote the graph obtained by fusing V(G) − X into a single node and deleting all edges with both end points in V(G) − X, and let |I'|(.), |E'|(.) be its incidence and interior edge functions respectively. Let |I''|(.), |E''|(.) be the incidence and interior edge functions respectively of the subgraph of G on X ∪ Γ(X). Let |Î|(.), |Ê|(.) be the incidence and interior edge functions of the subgraph of G on X. Let Π be a partition of V(G) and let |I_Π|(.) be the incidence function of the graph obtained from G by fusing the blocks of Π into single vertices but not deleting any edges. Prove:

i. |I|/X(.) = |I'|/X(.) = |I''|/X(.) and |I|∘X(.) = |I'|∘X(.) = |I''|∘X(.).

ii. |I|∘X(.) = |Î|(.).

iii. |I|_fus.Π(.) = |I_Π|(.).

iv. |I|^d(.) = |E|(.).

v. Let G have no selfloops. Then |I|*(.) = |I|(.), where the dual is defined with respect to the weight vector a(.) with a(v) = |I|(v) ∀v ∈ V(G).

vi. |E|∘X(.) = |E'|∘X(.) = |E''|∘X(.) and |E|/X(.) = |E'|/X(.) = |E''|/X(.).

vii. |E|/X(.) = |Ê|(.).

viii. |E|_fus.Π(.) = |E_Π|(.).
Exercise 9.7 The |Γ_L|(.), |E_L|(.) functions of a bipartite graph. Let B = (V_L, V_R, E) be a bipartite graph. Let Γ_L(.) ≡ Γ/V_L(.) and let Γ_R(.) ≡ Γ/V_R(.). We will call |Γ_L|(.) (|Γ_R|(.)), |E_L|(.) (|E_R|(.)) the left (right) adjacency function and the left (right) exclusivity function respectively of B (see Example 9.2.7). Let X ⊆ V_L. Let B._LX be the subgraph of B on X ∪ Γ(X) and let B∘_LX be the graph obtained by first deleting V_L − X, Γ(V_L − X) and all edges with at least one end point in this set (B._RX, B∘_RX, X ⊆ V_R, are similarly defined, interchanging left and right). Let Π be a partition of V_L. Let B_Π be the graph obtained from B by fusing the blocks of Π into single vertices. Let |Γ_ΠL|(.), |E_ΠL|(.) be the corresponding left adjacency and left exclusivity functions. Show that:

i. |Γ_L|/X(.) is the left adjacency function of B._LX.

ii. |Γ_L|∘X(.) is the left adjacency function of B∘_LX.

iii. |Γ_L|_fus.Π(.) = |Γ_ΠL|(.).

iv. |Γ_L|^d(.) = |E_L|(.).

v. |E_L|/X(.) is the left exclusivity function of B∘_LX.

vi. |E_L|∘X(.) is the left exclusivity function of B._LX.

vii. |E_L|_fus.Π(.) = |E_ΠL|(.).
9.4 *Other Operations on Semimodular Functions
We now consider a number of other operations on semimodular functions which yield other such functions. These operations, while being useful, are by no means standard. We therefore study them through a sequence of problems.

Problem 9.1 (k) i. Let f(.), g(.) be submodular (supermodular) on subsets of S and let (f − g)(.) be monotone increasing or monotone decreasing. Show that h(.) ≡ min(f(.), g(.)) (max(f(.), g(.))) is a submodular (supermodular) function.

ii. Let f(.) be an increasing or decreasing submodular (supermodular) function and let k be a constant. Show that min(k, f(.)) (max(k, f(.))) is a submodular (supermodular) function.
Solution: We consider only the monotone increasing case, since (f − g)(.) is monotone decreasing iff (g − f)(.) is monotone increasing. Further, we confine ourselves to submodular functions.

i. Let X, Y ⊆ S. We will verify that

h(X) + h(Y) ≥ h(X ∪ Y) + h(X ∩ Y).

This is clear if h(.) agrees with f(.) or with g(.) on both X and Y. Let us therefore assume that h(X) = f(X) and h(Y) = g(Y). We then have

h(X) + h(Y) ≥ f(X ∪ Y) + f(X ∩ Y) + g(Y) − f(Y).

But f(X ∪ Y) + g(Y) − f(Y) ≥ g(X ∪ Y). Hence

h(X) + h(Y) ≥ g(X ∪ Y) + f(X ∩ Y) ≥ h(X ∪ Y) + h(X ∩ Y).

ii. This is a direct consequence of the previous result.
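Both parts of the problem can be checked numerically. This sketch is ours (the functions are arbitrary examples): min(k, f) for an increasing submodular f stays submodular, while dropping the monotone-difference hypothesis lets the minimum of two modular functions fail.

```python
import math
from itertools import combinations

def subsets(S):
    return [frozenset(c) for t in range(len(S) + 1) for c in combinations(S, t)]

def is_submodular(h, S):
    """Brute-force check of inequality (9.1), with a small float tolerance."""
    return all(h(X) + h(Y) >= h(X | Y) + h(X & Y) - 1e-9
               for X in subsets(S) for Y in subsets(S))

S = frozenset(range(4))
f = lambda X: math.sqrt(len(X))      # increasing submodular
h = lambda X: min(1.2, f(X))         # truncation by a constant (part ii)

# Without the monotone-difference hypothesis, min can fail to be submodular:
f1 = lambda X: len(X & {0})          # modular
f2 = lambda X: len(X & {1})          # modular; (f1 − f2) is not monotone
bad = lambda X: min(f1(X), f2(X))
```

The failure for `bad` occurs already at X = {0}, Y = {1}: 0 + 0 < 1 + 0.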
The next problem involves an instance of the convolution operation to be discussed in the next chapter. As we shall see later, even more important than the p_min(.) and p_max(.) functions are the collections of subsets over which these functions become equal to p(.).
Show that:

i. p_min(.), p̂_min(.) are submodular if p(.) is submodular, and p_max(.), p̂_max(.) are supermodular if p(.) is supermodular.

ii. If p(.) is a submodular (supermodular) function on subsets of S, then the subsets over which it reaches a minimum (maximum) form a distributive lattice (i.e., the collection is closed under union and intersection).

Definition 9.4.2 The collection of subsets of S over which a submodular (supermodular) function reaches a minimum (maximum) is called its principal structure.

iii. Prove:
Theorem 9.4.1 (k) Let p(.) be a submodular function on subsets of S. Let X ⊆ S have the property that

p(X) ≤ p(Y)  ∀Y ⊆ X   (p(X) < p(Y)  ∀Y ⊂ X).

Then X is contained in some (every) set that minimizes p(.).

Solution:
i. Let m_Z denote a subset of Z ⊆ S at which p(.) reaches the minimum among all the subsets of Z, and let m_{Xa}, m_{Ya} denote m_{X∪a}, m_{Y∪a}. Let X ⊆ Y ⊆ S and let a ∈ S − Y. We now have

p(m_{Xa}) + p(m_Y) ≥ p(m_Y ∪ m_{Xa}) + p(m_Y ∩ m_{Xa}),

i.e.,

p(m_Y ∪ m_{Xa}) − p(m_Y) ≤ p(m_{Xa}) − p(m_Y ∩ m_{Xa}).

Now m_Y ∪ m_{Xa} ⊆ Y ∪ a and m_Y ∩ m_{Xa} ⊆ X. But then

p(m_{Ya}) ≤ p(m_Y ∪ m_{Xa})  and  p(m_X) ≤ p(m_Y ∩ m_{Xa}).

We therefore have

p(m_{Ya}) − p(m_Y) ≤ p(m_{Xa}) − p(m_X).

Since p_min(T) = p(m_T) ∀T ⊆ S, it is clear that p_min(.) is submodular. The proof of the supermodular case p_max(.) is similar. Next let σ(X) ≡ p(S) − p(S − X). Observe that

p̂_min(X) = p(S) − σ_max(S − X)  and  p̂_max(X) = p(S) − σ_min(S − X).

The required results follow by noting that σ(.) is supermodular (submodular) iff p is submodular (supermodular).
ii. Let X, Y minimize the submodular function p(.). By the basic inequality,

p(X) + p(Y) ≥ p(X ∪ Y) + p(X ∩ Y),

the only way this inequality can be satisfied is for the values of p(.) on all four sets to be the same. The result follows. The proof for the supermodular case is similar.
iii. Proof of Theorem 9.4.1: Let μ(·) reach the minimum, among all subsets of S, at Z. If X is a subset of Z, there is nothing to prove. Suppose X is not. Then X ∩ Z ⊂ X. By the submodularity of μ(·), we then have,

μ(X) + μ(Z) ≥ μ(X ∪ Z) + μ(X ∩ Z).

Case 1: μ(X) ≤ μ(X ∩ Z). In this case μ(Z) ≥ μ(X ∪ Z). Thus X is contained in a subset that minimizes μ(·), viz. X ∪ Z.

Case 2: μ(X) < μ(X ∩ Z). In this case μ(Z) > μ(X ∪ Z), which is a contradiction. We conclude that X must be a subset of Z.

□
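The lattice property in part (ii) is easy to check by brute force. The sketch below (illustrative only; the graph is a hypothetical example, not from the text) uses the cut function of a small disconnected graph, whose minimum value 0 is attained exactly on unions of components:

```python
from itertools import combinations

def cut(edges, X):
    # number of edges with exactly one endpoint in X
    return sum((u in X) != (v in X) for u, v in edges)

# a graph with two components, {0,1} and {2,3}; its cut function is submodular
edges = [(0, 1), (2, 3)]
S = (0, 1, 2, 3)
subsets = [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

vals = {X: cut(edges, X) for X in subsets}
m = min(vals.values())
minimizers = {X for X, v in vals.items() if v == m}

# the minimizers must be closed under union and intersection
for X in minimizers:
    for Y in minimizers:
        assert X | Y in minimizers and X & Y in minimizers
print(sorted(sorted(X) for X in minimizers))
```

Here the minimizers come out as ∅, {0,1}, {2,3} and {0,1,2,3}, visibly closed under ∪ and ∩.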
Problem 9.3 Let s(·) be an increasing set function taking subsets of a finite set S₁ to subsets of another finite set S₂, i.e., s(Y) ⊇ s(X) ∀X ⊆ Y ⊆ S₁. Suppose

s(X) ∪ s(Y) = s(X ∪ Y)    (s(X) ∩ s(Y) = s(X ∩ Y)).

i. Let w(·) assign each element of S₂ a nonnegative weight. Define ŝ(X) ≡ Σ_{e ∈ s(X)} w(e). Show that

(a) the function ŝ(·) is submodular (supermodular).

(b) (k) the set functions (defined in Exercise 9.1) V(·), Γ(·), I(·), V_tail(·), V_head(·), Γ_in(·), Γ_out(·), I_in(·), I_out(·) yield submodular functions when 'weighted nonnegatively' while E(·) yields a supermodular function. Hence show that the nonnegatively weighted versions of cut(·), cutin(·), cutout(·) are also submodular.

ii. (k) Let μ(·) be an increasing submodular (supermodular) function on subsets of S₂. Define σ(·) on subsets of S₁ by σ(X) ≡ μ(s(X)), X ⊆ S₁. Show that σ(·) is submodular (supermodular). Example: (k) Let B be a bipartite graph on V_L, V_R. Let the function s(·) be taken as the Γ(·) (E_L(·)) function of the bipartite graph. Let μ(·) be any increasing submodular (supermodular) function on subsets of V_R. Then μ(Γ(·)) (μ(E_L(·))) is a submodular (supermodular) function on subsets of V_L.
Solution: i(a) For any increasing set function, we must have,

s(X) ∪ s(Y) ⊆ s(X ∪ Y),    s(X) ∩ s(Y) ⊇ s(X ∩ Y).

If s(X) ∪ s(Y) = s(X ∪ Y), it is clear that

ŝ(X) + ŝ(Y) = w(s(X)) + w(s(Y)) = w(s(X) ∪ s(Y)) + w(s(X) ∩ s(Y)) ≥ ŝ(X ∪ Y) + ŝ(X ∩ Y).

The supermodular case (where s(X) ∩ s(Y) = s(X ∩ Y)) is handled similarly.

i(b) It is easily verified that the functions V(·), Γ(·), I(·), V_tail(·), V_head(·), Γ_in(·), Γ_out(·), I_in(·), I_out(·) all satisfy the property s(X) ∪ s(Y) = s(X ∪ Y), while E(·) satisfies s(X) ∩ s(Y) = s(X ∩ Y). Further they are all increasing set functions. Thus it is clear that they must yield sub- or supermodular functions (as the case may be) when weighted. To study the weighted version of cut(·), we observe that

cut(X) = Γ(X) − E(X) and Γ(X) ⊇ E(X).

So

w(cut(X)) = w(Γ(X)) − w(E(X)),

and the submodularity of this function follows from the submodularity of w(Γ(·)) and the supermodularity of w(E(·)). One can similarly prove that cutin(·), cutout(·) yield submodular functions when weighted.

ii. We consider only the case where s(X ∪ Y) = s(X) ∪ s(Y) and μ(·) is submodular. The supermodular case can be handled similarly. We have σ(·) ≡ μ(s(·)). Now

μ(s(X)) + μ(s(Y)) ≥ μ(s(X) ∪ s(Y)) + μ(s(X) ∩ s(Y)) ≥ μ(s(X ∪ Y)) + μ(s(X ∩ Y))

(since s(X) ∩ s(Y) ⊇ s(X ∩ Y), s(X) ∪ s(Y) = s(X ∪ Y), and μ(·) is increasing). Thus σ(·) is submodular on subsets of S₁. To verify the correctness of the example we need only verify that Γ(·), E_L(·) are increasing and respectively satisfy the property s(X) ∪ s(Y) = s(X ∪ Y) and the property s(X) ∩ s(Y) = s(X ∩ Y). This, as mentioned before, is routine.
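Part (ii) can likewise be spot-checked numerically. In the sketch below (a hypothetical example, not from the text) s(·) is the Γ(·) function of a small bipartite graph and μ(Y) = min(|Y|, 2) is an increasing submodular function on the right vertex set:

```python
from itertools import combinations

# left vertex -> set of right neighbours of a small bipartite graph
adj = {0: {'a', 'b'}, 1: {'b', 'c'}, 2: {'c'}}

def gamma(X):
    # s(X): union of neighbourhoods; increasing and s(X) | s(Y) == s(X | Y)
    return frozenset().union(*(adj[v] for v in X)) if X else frozenset()

def mu(Y):
    # an increasing submodular function on subsets of the right vertex set
    return min(len(Y), 2)

def sigma(X):
    # the composition sigma(X) = mu(s(X))
    return mu(gamma(X))

VL = (0, 1, 2)
subsets = [frozenset(c) for r in range(len(VL) + 1) for c in combinations(VL, r)]
for X in subsets:
    for Y in subsets:
        assert sigma(X) + sigma(Y) >= sigma(X | Y) + sigma(X & Y)
print("sigma = mu(Gamma(.)) is submodular on subsets of V_L")
```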
Problem 9.4 (k) Let f be any increasing cap (increasing cup) function from ℜ to ℜ. Let μ be an increasing submodular (increasing supermodular) function. Show that f(μ(·)) is submodular (supermodular). Examples:

i. Let r(·) be the rank function of a graph G. Let f(t) ≡ (r(G))² − (t − r(G))². Note that f(·) is increasing in the interval [0, r(G)]. Then f(r(·)) is submodular.

ii. Let w(·) be a weight function on S. Let f(t) ≡ (w(S))² − (t − w(S))² and let g(t) ≡ t². Then f(w(·)) is submodular while g(w(·)), g(r′(·)), g(ν′(·)) (the latter of Example 9.2.10) are supermodular.

Solution: If the function f(·) is an increasing cap function from ℜ to ℜ then

f(x + h) − f(x) ≥ f(y + h) − f(y)    ∀y ≥ x, h ≥ 0.

Let μ(·) be an increasing submodular function on the subsets of S and let X ⊆ Y ⊆ S, a ∈ S − Y. Then

μ(X ∪ a) − μ(X) ≥ μ(Y ∪ a) − μ(Y) ≥ 0.

Let δ, ε represent the left and right sides of the above inequality. Then, since f(·) is increasing and μ(X) ≤ μ(Y),

f(μ(X) + δ) − f(μ(X)) ≥ f(μ(X) + ε) − f(μ(X)) ≥ f(μ(Y) + ε) − f(μ(Y)).

This is equivalent to saying that

f(μ(X ∪ a)) − f(μ(X)) ≥ f(μ(Y ∪ a)) − f(μ(Y)).

This proves the submodularity of f(μ(·)). The supermodular case is similar and the examples are direct applications of the result.
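A quick numerical check of the claim (a sketch with hypothetical choices, not from the text: r(X) = min(|X|, 3) as the increasing submodular function and f(t) = √t as the increasing cap function):

```python
import math
from itertools import combinations

def r(X):
    # an increasing submodular function (rank function of a uniform matroid)
    return min(len(X), 3)

def f(t):
    # an increasing cap (concave) function on [0, infinity)
    return math.sqrt(t)

S = (0, 1, 2, 3)
subsets = [frozenset(c) for k in range(len(S) + 1) for c in combinations(S, k)]
for X in subsets:
    for Y in subsets:
        assert f(r(X)) + f(r(Y)) >= f(r(X | Y)) + f(r(X & Y)) - 1e-9
print("f(r(.)) is submodular")
```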
9.5
Polymatroid and Matroid Rank Functions
Matroid rank functions are the most important class of submodular functions. Polymatroid rank functions are their immediate generalization. As we shall show, any submodular function is a translate of a polymatroid rank function by a modular function. In this section we define these functions and study the results of applying to them some of the basic operations introduced in Section 9.3.

Definition 9.5.1 A submodular function is a polymatroid rank function iff it takes zero value on ∅ and is nonnegative and increasing.

Definition 9.5.2 A polymatroid rank function is a matroid rank function iff it takes integral values and does not exceed 1 on any of the singletons.

Exercise 9.8 (k) Show that the rank and nullity functions of a graph are matroid rank functions.

Exercise 9.9 (k) Show that the vertex function, incidence function and adjacency function of a graph are polymatroid rank functions while the cut function |cut|(·) is not (see Examples 9.2.1, 9.2.3, 9.2.4, 9.2.5).

Both matroid and polymatroid rank functions behave nicely with respect to comodular dualization.
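Exercise 9.8 can be checked by brute force on a small graph. The sketch below (illustrative; the graph is hypothetical) computes the rank of an edge subset with a tiny union–find and tests the defining properties of a matroid rank function:

```python
from itertools import combinations

def rank(edges, X):
    # rank of an edge subset = size of a maximal forest it contains,
    # computed with a small union-find over the touched vertices
    parent = {}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v
    r = 0
    for i in X:
        u, v = edges[i]
        parent.setdefault(u, u); parent.setdefault(v, v)
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            r += 1                           # this edge joins two components
    return r

edges = [(0, 1), (1, 2), (0, 2), (2, 3)]     # small graph with a circuit {0,1,2}
E = range(len(edges))
subs = [frozenset(c) for k in range(len(edges) + 1) for c in combinations(E, k)]

assert rank(edges, frozenset()) == 0                         # zero on the void set
for X in subs:
    for Y in subs:
        if X <= Y:
            assert rank(edges, X) <= rank(edges, Y)          # increasing
        assert (rank(edges, X) + rank(edges, Y)
                >= rank(edges, X | Y) + rank(edges, X & Y))  # submodular
for e in E:
    assert rank(edges, frozenset([e])) <= 1                  # at most 1 on singletons
print("r(.) is a matroid rank function")
```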
Exercise 9.10 Theorem 9.5.1 (k) μ*(·) is a polymatroid (matroid) rank function if μ(·) is one.

Remark: In the definition of μ*(·), if α(e) is less than μ(e) for some e, the dual of a polymatroid rank function would not be a polymatroid rank function. But we would still have μ*(·) as submodular. The next theorem states among other things that every submodular function is a translate of a polymatroid rank function. This idea would be useful when we relate minimization of submodular functions to the operation of convolution in the next chapter.

Theorem 9.5.2 (k) Let μ(·) be a submodular function on subsets of a finite set S.

i. μ(·) is an increasing function iff

μ(S) − μ(S − e) ≥ 0    ∀e ∈ S.

It is nonnegative increasing iff it satisfies, in addition, μ(∅) ≥ 0.

ii. If

μ(S) − μ(S − e) = μ(e) − μ(∅)    ∀e ∈ S,

then μ(·) is modular.

iii. Let a weight function w(·) on S be defined by w(e) ≡ μ(S) − μ(S − e). Then μ(·) − w(·) − μ(∅) is a polymatroid rank function.
Proof: 'only if' is clear. 'if':

i. Since μ(·) is submodular, we have

μ(S) − μ(S − e) ≤ μ(X ∪ e) − μ(X)    ∀X ⊆ S, e ∈ S − X.

The result follows. The nonnegative increasing case is trivial.

ii. Since

μ(S) − μ(S − e) ≤ μ(X ∪ e) − μ(X) ≤ μ(e) − μ(∅)    ∀X ⊆ S, e ∈ S − X,

it follows that the given condition implies that

μ(X ∪ e) − μ(X) = μ(e) − μ(∅)    ∀X ⊆ S, e ∈ S − X.

Thus

μ(X) = μ(∅) + Σ_{e ∈ X} (μ(e) − μ(∅)).

Clearly this means μ(·) is modular.

iii. It is easily verified that the function μ(·) − w(·) − μ(∅) satisfies the above condition for being nonnegative increasing and takes zero value on ∅. Since it is clearly submodular, the function is a polymatroid rank function.

□
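Part (iii) — every submodular function is a translate of a polymatroid rank function — can be seen concretely. Below (a hypothetical example, not from the text) μ is the cut function of a small graph, submodular but not increasing; the translate μ(·) − w(·) − μ(∅) with w(e) = μ(S) − μ(S − e) is checked to be a polymatroid rank function:

```python
from itertools import combinations

V = (0, 1, 2, 3)
edges = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]

def cut(X):
    # cut function on vertex subsets: submodular but not increasing
    return sum((u in X) != (v in X) for u, v in edges)

S = frozenset(V)
w = {v: cut(S) - cut(S - {v}) for v in V}    # w(e) = mu(S) - mu(S - e)

def p(X):
    # the translate mu(X) - w(X) - mu(void) of Theorem 9.5.2(iii)
    return cut(X) - sum(w[v] for v in X) - cut(frozenset())

subs = [frozenset(c) for k in range(len(V) + 1) for c in combinations(V, k)]
assert p(frozenset()) == 0                            # zero on the void set
for X in subs:
    for Y in subs:
        assert p(X) + p(Y) >= p(X | Y) + p(X & Y)     # submodular
        if X <= Y:
            assert p(X) <= p(Y)                       # increasing (hence nonnegative)
print({tuple(sorted(X)): p(X) for X in subs if len(X) <= 1})
```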
The next exercise speaks of contraction, restriction and dualization of natural functions associated with a matroid. The reader might like to compare it with Exercise 9.5.

Exercise 9.11 (k) The rank and nullity functions of a matroid. As has already been pointed out in Chapter 4, we can give a number of alternative descriptions of a matroid in terms of independent sets, circuits, bases, matroid rank function. Let us assume the last description (of a matroid rank function on subsets of S) is available. Then, a set X ⊆ S is said to be independent iff r(X) = |X|, it is a base iff r(X) = |X| = r(S), it is a circuit iff r(X) = |X| − 1 and all proper subsets of X are independent. We remind the reader that every independent set can be extended to a base and that circuits are minimal dependent (non-independent) sets. It is verified elsewhere that these classes satisfy the conditions of the appropriate axiom sets. Let us denote the matroid corresponding to these classes by M. The comodular dual of the rank function with respect to the |·| function is called the nullity function of the matroid and denoted by ν(·). Let Π be a partition of S. Show that

i. r/X(·), r∘X(·), ν(·) are matroid rank functions.

ii. r_fus.Π(·) is an integral polymatroid rank function. (We show later, in the next chapter, that all integral polymatroid rank functions can be obtained by fusion of matroid rank functions.)

iii. The independent sets of the matroid (denoted by M.X) defined by r/X(·) are independent sets of M contained in X. The bases of M.X are maximal intersections of bases of M with X. The circuits of M.X are circuits of M contained in X.

iv. The independent sets of the matroid (denoted by M×X) defined by r∘X(·) are sets whose union with every independent set of M.(S − X) is independent in M. The bases of M×X are minimal intersections of bases of M with X. The circuits of M×X are minimal intersections of circuits of M with X.

v. The bases of the matroid (denoted by M*) defined by ν(·) are the complements of bases of M.

vi. r^d(X) = rank of M×X.

vii. The nullity function of M×X = ν/X(·). Thus M*.X = (M×X)*.

viii. The nullity function of M.X = ν∘X(·). Thus M*×X = (M.X)*.

ix. The contramodular dual of the rank function of M×X = r^d/X(·).

x. The contramodular dual of the rank function of M.X = r^d∘X(·).

xi. The contramodular dual of the nullity function of M×X = ν^d∘X(·).

xii. The contramodular dual of the nullity function of M.X = ν^d/X(·).

An elementary but useful notion in graphs is that of putting additional edges in parallel to existing ones. This notion immediately generalizes to matroids. For submodular functions, however, more is possible provided some minor conditions on monotonicity are satisfied.
Definition 9.5.3 Let μ(·) be a submodular function on subsets of S. Elements e₁, e₂ ∈ S are parallel with respect to μ(·) iff

μ(X ∪ e₁) = μ(X ∪ e₂) = μ(X ∪ e₁ ∪ e₂)    ∀X ⊆ S.

Observe that other elements cannot distinguish between e₁, e₂.

Definition 9.5.4 [Lovász83] Let μ(·) be a submodular function on subsets of S. Let T ⊆ S and element u_T ∉ S. The parallel extension of μ(·) by u_T parallel to T, denoted by μ̂(·), on subsets of S ∪ u_T is defined by

μ̂(X) ≡ μ(X),    μ̂(X ∪ u_T) ≡ μ(X ∪ T)    ∀X ⊆ S.

Theorem 9.5.3 (k)

i. μ̂(·) is a submodular function, if, for each e ∈ T we have μ(S) − μ(S − e) ≥ 0.

ii. If μ(·) is a polymatroid rank function, then so is μ̂(·).

iii. If μ(·) is a matroid rank function on subsets of S then e₁, e₂ are in parallel iff

μ(e₁) = μ(e₂) = μ({e₁, e₂}).
Proof: i. We need to verify that

μ̂(X ∪ e) − μ̂(X) ≥ μ̂(Y ∪ e) − μ̂(Y)    ∀X ⊆ Y ⊆ S ∪ u_T, ∀e ∈ (S ∪ u_T) − Y.

We have the following cases: (1) u_T ∉ Y; (2) u_T ∈ X; (3) u_T ∈ Y − X and e ∉ T; (4) u_T ∈ Y − X and e ∈ T. In the first three cases the inequality holds by the submodularity of μ(·) and the definition of μ̂(·). In the last case the RHS is zero while the LHS is nonnegative since

μ(X ∪ e) − μ(X) ≥ μ(S) − μ(S − e) ≥ 0.

ii. By the above reasoning, since in this case μ(·) is increasing, we must have the parallel extension as submodular. Further μ̂(∅) = 0 and μ̂(·) is clearly increasing.

iii. We need only prove the 'if' part. Let μ(·) be a matroid rank function. We must have, for i = 1, 2,

μ(X ∪ eᵢ) − μ(X) ≤ μ(eᵢ) = μ({e₁, e₂}).    …(*)

If μ(e₁) = 0 then it is easy to see that

μ(X ∪ e₁) = μ(X ∪ e₂) = μ(X ∪ e₁ ∪ e₂)    ∀X ⊆ S.    …(**)

Let μ(e₁) = 1. Suppose μ(X ∪ e₁) − μ(X) = 1. We claim e₂ ∉ X, as otherwise

μ(X ∪ e₁) − μ(X) ≤ μ({e₁, e₂}) − μ(e₂) = 0,

a contradiction. But then

μ(X ∪ e₂) + μ({e₁, e₂}) ≥ μ(X ∪ e₁ ∪ e₂) + μ(e₂),

i.e., μ(X ∪ e₂) = μ(X ∪ e₁ ∪ e₂) (since μ(·) is increasing and μ({e₁, e₂}) = μ(e₂)). Similarly μ(X ∪ e₁) = μ(X ∪ e₁ ∪ e₂). Thus the desired equality (**) holds. So we need only consider the case where

μ(X ∪ e₁) − μ(X) = 0.

By the above argument μ(X ∪ e₂) − μ(X) cannot be 1 and is therefore 0. Now, by the submodularity of μ(·), we must have

μ(X ∪ e₁ ∪ e₂) − μ(X ∪ e₁) ≤ μ(X ∪ e₂) − μ(X) = 0,

from which the desired equality (**) follows.

□
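Part (ii) of the theorem can be exercised numerically. In this sketch (hypothetical data, not from the text) p(·) is a small coverage function — a polymatroid rank function — and u_T is adjoined parallel to T = {0, 1}:

```python
from itertools import combinations

adj = {0: {'a'}, 1: {'a', 'b'}, 2: {'c'}}      # a small coverage function

def p(X):
    # p(X) = number of right-side elements covered by X: a polymatroid rank function
    return len(set().union(*(adj[v] for v in X))) if X else 0

S = (0, 1, 2)
T = frozenset({0, 1})
U = 'uT'                                        # the new element, parallel to T

def p_hat(X):
    # parallel extension: p_hat(X) = p(X), p_hat(X + uT) = p((X - uT) | T)
    X = set(X)
    if U in X:
        return p(frozenset(X - {U}) | T)
    return p(frozenset(X))

ground = S + (U,)
subs = [frozenset(c) for k in range(len(ground) + 1) for c in combinations(ground, k)]
assert p_hat(frozenset()) == 0
for X in subs:
    for Y in subs:
        assert p_hat(X) + p_hat(Y) >= p_hat(X | Y) + p_hat(X & Y)   # submodular
        if X <= Y:
            assert p_hat(X) <= p_hat(Y)                             # increasing
print("the parallel extension is again a polymatroid rank function")
```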
9.6
Connectedness for Semimodular Functions
When a submodular function can be expressed as the direct sum of other such functions, problems involving it simplify drastically. We essentially have to look at much smaller underlying sets which are disconnected under the function. We sketch elementary ideas regarding connectedness in this section. We introduce the notion of an elementary separator of a submodular function below. This notion is a generalization of 2-connectedness for graphs and connectedness for matroids.

Definition 9.6.1 Let μ(·) be a submodular (supermodular) function on subsets of S with μ(∅) = 0. A set E ⊆ S is a separator of μ(·) iff

μ(E) + μ(S − E) = μ(S).

A minimal nonvoid separator is called an elementary separator.

Theorem 9.6.1 (k) Let μ(·) be a submodular (supermodular) function on subsets of S with μ(∅) = 0. Then,

i. μ(X₁) + ⋯ + μ(Xₙ) ≥ μ(X₁ ∪ ⋯ ∪ Xₙ) ∀X₁, …, Xₙ ⊆ S with Xᵢ ∩ Xⱼ = ∅, i ≠ j

(μ(X₁) + ⋯ + μ(Xₙ) ≤ μ(X₁ ∪ ⋯ ∪ Xₙ) ∀X₁, …, Xₙ ⊆ S with Xᵢ ∩ Xⱼ = ∅, i ≠ j).

ii. If E₁, E₂ are separators of μ(·), then so are E₁ ∪ E₂, E₁ ∩ E₂.

iii. E is a separator of μ(·) iff

μ(X₁) + μ(X₂) = μ(X₁ ∪ X₂)    ∀X₁ ⊆ E, X₂ ⊆ S − E.

(Thus when E is a separator, studying μ(·) reduces to studying μ/E(·), μ/(S − E)(·). In other words μ(·) = (μ/E ⊕ μ/(S − E))(·).)

iv. E is a separator of μ(·) iff

μ/E(·) = μ∘E(·).

v. If E is a separator of μ(·), it is also a separator of μ^d(·) and μ*(·).
Proof: We will handle only the submodular case. The supermodular situation is similar.

i. If X₁, X₂ do not intersect we have

μ(X₁) + μ(X₂) ≥ μ(X₁ ∪ X₂) + μ(∅).

Since μ(∅) = 0, the result follows by induction on the number of sets.

ii. We have,

μ(E₁) + μ(E₂) ≥ μ(E₁ ∪ E₂) + μ(E₁ ∩ E₂),

μ(S − E₁) + μ(S − E₂) ≥ μ(S − (E₁ ∪ E₂)) + μ(S − (E₁ ∩ E₂)).

Adding the two inequalities we get,

μ(E₁) + μ(E₂) + μ(S − E₁) + μ(S − E₂) ≥ μ(E₁ ∪ E₂) + μ(S − (E₁ ∪ E₂)) + μ(E₁ ∩ E₂) + μ(S − (E₁ ∩ E₂)) ≥ 2μ(S).

But E₁, E₂ are separators and hence the LHS = 2μ(S). Thus the inequalities are throughout equalities and therefore

μ(E₁ ∪ E₂) + μ(S − (E₁ ∪ E₂)) = μ(S),

μ(E₁ ∩ E₂) + μ(S − (E₁ ∩ E₂)) = μ(S),

as required.

iii. The 'if' part is trivial. To show the 'only if' part, let E be a separator. Let E₂ denote S − E and let X₁ ⊆ E, X₂ ⊆ E₂. We need to show that μ(X₁) + μ(X₂) ≤ μ(X₁ ∪ X₂). By submodularity,

μ(X₁ ∪ X₂) + μ(E) ≥ μ(E ∪ X₂) + μ(X₁)

and

μ(E ∪ X₂) + μ(E₂) ≥ μ(S) + μ(X₂) = μ(E) + μ(E₂) + μ(X₂),

so that μ(E ∪ X₂) ≥ μ(E) + μ(X₂). Combining the two,

μ(X₁ ∪ X₂) ≥ μ(X₁) + μ(X₂),

and the result follows, the reverse inequality already being shown.

iv. Observe that

μ/E(X) = μ∘E(X)    ∀X ⊆ E

iff

μ(X) = μ((S − E) ∪ X) − μ(S − E)    ∀X ⊆ E.

We see that this last condition, using the previous parts of the present theorem, is equivalent to E being a separator.

v. If E is a separator of μ(·), it is easily verified that

μ^d(E) + μ^d(S − E) = μ^d(S)  and  μ*(E) + μ*(S − E) = μ*(S).

Since this condition is sufficient for E to be a separator both when the function is submodular as well as when it is supermodular, the result follows.

□
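The separator calculus above is easy to verify by enumeration. The sketch below (a hypothetical disconnected graph, not from the text) lists all separators of its cut function, confirms closure under union and intersection, and recovers the elementary separators, which here are the two components:

```python
from itertools import combinations

edges = [(0, 1), (1, 2), (3, 4)]          # two connected components
V = (0, 1, 2, 3, 4)
S = frozenset(V)

def f(X):
    # cut function: submodular, f(void) = 0
    return sum((u in X) != (v in X) for u, v in edges)

subs = [frozenset(c) for k in range(len(V) + 1) for c in combinations(V, k)]
seps = {X for X in subs if f(X) + f(S - X) == f(S)}

# Theorem 9.6.1(ii): separators are closed under union and intersection
for A in seps:
    for B in seps:
        assert A | B in seps and A & B in seps

# minimal nonvoid separators = elementary separators (here: the components)
elem = [X for X in seps if X and not any(Y and Y < X for Y in seps)]
print(sorted(sorted(X) for X in elem))
```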
9.7
*Semimodular Polyhedra

A powerful technique to study a class of real (rational) valued set functions is to associate a polyhedron with it and study the geometry of the polyhedron. In this section we begin with the simple notions of a set polyhedron and its dual and their specialization to semimodular polyhedra. We show that any 'polyhedrally tight' set function can be naturally extended to a convex function over ℜⁿ. In particular submodular functions are polyhedrally tight. Their extension due to Lovász [Lovász83] (called the Lovász extension [Fujishige91]) has a very simple alternative description.
Definition 9.7.1 Let f(·) be a real valued set function on 2^S, S = {e₁, …, eₙ}. Let χ_X denote the characteristic vector of X ⊆ S. When x is a real vector let

x(X) ≡ (χ_X)ᵀ x    ∀X ⊆ S.

Then the polyhedron associated with f(·), denoted by P_f, is defined as follows: a vector x ∈ ℜ^S belongs to P_f iff

x(X) ≤ f(X)    ∀X ⊆ S.

We say f(·) is polyhedrally tight iff for each X ⊆ S there exists a vector x ∈ P_f such that x(X) = f(X). The dual polyhedron associated with f(·), denoted by P_f^d, is defined as follows: a vector x ∈ ℜ^S belongs to P_f^d iff

x(X) ≥ f(X)    ∀X ⊆ S.

We say f(·) is dually polyhedrally tight iff for each X ⊆ S there exists a vector x ∈ P_f^d such that x(X) = f(X).
We list some simple properties of polyhedrally tight functions in the next theorem.

Theorem 9.7.1 (k)

i. x ∈ P_f and x(S) = f(S) iff x ∈ P^d_(f^d) and x(S) = f(S), where f^d(X) ≡ f(S) − f(S − X).

ii. If f(·), g(·) are polyhedrally tight and λ ≥ 0 then (λf + g)(·) is polyhedrally tight.

iii. If A ⊆ S and f(·) is polyhedrally tight, then f/A(·) is polyhedrally tight.

iv. If f(·) is modular with f(∅) = 0 then P_f is the set of all vectors beneath a single point and f(·) is polyhedrally tight.

Proof: i. We have, x(X) ≤ f(X) ∀X ⊆ S and x(S) = f(S) iff x(S − X) ≥ f(S) − f(X) = f^d(S − X) ∀X ⊆ S and x(S) = f(S).

ii. If x ∈ P_f and y ∈ P_g such that x(X) = f(X) and y(X) = g(X) then clearly (λx + y) ∈ P_(λf+g) and further (λx + y)(X) = (λf + g)(X).

iii. This is immediate from the relevant definitions.

iv. Clearly f(·) is induced by the vector x_f ≡ (f(e₁), …, f(eₙ)) (see Definition 9.2.2 and Theorem 9.2.2) and the vectors in P_f are precisely the set of all vectors less than or equal to this vector. Since f(X) = x_f(X) ∀X ⊆ S, f(·) is polyhedrally tight.

□

Exercise 9.12 Let f(·) be a set function on subsets of S and let x denote also the modular function induced by the vector x. Then P_(f+x) = P_f + x, where the latter addition denotes translation by the vector x.
When f(·) is submodular, the problem of maximising a linear objective function over P_f is particularly easy. One need only use a greedy strategy. The next theorem speaks of this strategy. As a consequence it follows that f(·) must be polyhedrally tight. Further, when f(·) is integral it turns out that P_f must have integral vertices.

Theorem 9.7.2 (k) Let S = {e₁, …, eₙ} and let f(·) be a submodular function on subsets of S such that f(∅) = 0. Let c ∈ ℜ^S be a nonnegative vector and let c(e₁) ≥ ⋯ ≥ c(eₙ). Let Tⱼ denote {e₁, …, eⱼ}, with T₀ ≡ ∅, and let x ∈ ℜ^S be the vector such that

x(eᵢ) ≡ f(Tᵢ) − f(Tᵢ₋₁),    i = 1, …, n.

i. x is integral if f(·) is integral.

ii. x optimizes the linear program max cᵀz, z ∈ P_f.

Further, if c has a negative entry then the above linear program has no optimal solution.

Proof: i. This is immediate.

ii. We first show that x ∈ P_f. If not, then there exists a subset T ⊆ S such that x(T) > f(T). Let T have the smallest size consistent with this condition and let eᵢ be the element of T with the largest index. Observe that T cannot be null since x(∅) = f(∅) = 0. We have x(T − eᵢ) ≤ f(T − eᵢ). Next, Tᵢ ⊇ T. Hence, by the submodularity of f(·),

f(Tᵢ) − f(Tᵢ − eᵢ) ≤ f(T) − f(T − eᵢ).

But

x(eᵢ) = x(Tᵢ) − x(Tᵢ − eᵢ) = f(Tᵢ) − f(Tᵢ − eᵢ).

So x(eᵢ) ≤ f(T) − f(T − eᵢ). But x(eᵢ) = x(T) − x(T − eᵢ). Hence x(T − eᵢ) > f(T − eᵢ), which contradicts the definition of T. Thus x ∈ P_f.

Next we show that x optimizes the linear program. We use LP duality. The dual linear program is

min Σ_(T ⊆ S) y_T f(T)    subject to    Σ_(T ∋ eᵢ) y_T = c(eᵢ) ∀eᵢ ∈ S, y ≥ 0

(observe that y has one component for each subset of S). We select y_T = c(eᵢ) − c(eᵢ₊₁) if T = Tᵢ, taking c(eₙ₊₁) to be 0. Otherwise y_T is taken to be zero. It is easily verified that for such a selection y ≥ 0 and that

Σ_(T ∋ eⱼ) y_T = c(eⱼ)    ∀eⱼ ∈ S.

Further,

Σ_T f(T) y_T = (c(e₁) − c(e₂)) f(e₁) + (c(e₂) − c(e₃)) f({e₁, e₂}) + ⋯ + c(eₙ) f({e₁, …, eₙ})
= c(e₁) f(e₁) + c(e₂)(f({e₁, e₂}) − f(e₁)) + ⋯ + c(eₙ)(f({e₁, …, eₙ}) − f({e₁, …, eₙ₋₁}))
= c(e₁) x(e₁) + ⋯ + c(eₙ) x(eₙ)
= cᵀx.

This implies that x and y are optimal solutions to the primal and dual programs respectively. Finally, let us consider the situation where c has a negative entry. We note that decreasing any component of a vector in P_f will not take it out of the polyhedron. Therefore the component corresponding to the negative entry in c can be decreased indefinitely, remaining in the polyhedron while arbitrarily increasing the objective function.

□
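The greedy construction of Theorem 9.7.2 can be run on a small example. Below (hypothetical data, not from the text: f is a coverage function, c a nonnegative cost vector) the greedy vector x(eᵢ) = f(Tᵢ) − f(Tᵢ₋₁) is built, its membership in P_f is checked against all the constraints, and optimality is certified by the explicit dual solution y_T of the proof:

```python
from itertools import combinations

adj = {'e1': {1, 2}, 'e2': {2, 3}, 'e3': {3}}   # f = coverage function: submodular, f(void) = 0

def f(X):
    return len(set().union(*(adj[e] for e in X))) if X else 0

c = {'e1': 3.0, 'e2': 2.0, 'e3': 1.0}           # nonnegative weights
order = sorted(c, key=c.get, reverse=True)      # e_1, ..., e_n with c decreasing

# greedy vector: x(e_i) = f(T_i) - f(T_{i-1})
x, prefix = {}, []
for e in order:
    x[e] = f(prefix + [e]) - f(prefix)
    prefix.append(e)

# primal feasibility: x lies in the polyhedron P_f
items = list(adj)
for k in range(len(items) + 1):
    for X in combinations(items, k):
        assert sum(x[e] for e in X) <= f(X) + 1e-9

# the dual solution y_T of Theorem 9.7.2 gives the same objective value
primal = sum(c[e] * x[e] for e in items)
dual = sum((c[order[i]] - (c[order[i + 1]] if i + 1 < len(order) else 0.0))
           * f(order[:i + 1]) for i in range(len(order)))
assert abs(primal - dual) < 1e-9
print(x, primal)
```

Equality of the primal and dual objective values certifies, via LP duality, that the greedy vector is optimal.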
Corollary 9.7.1 (k) If f(·) is submodular (supermodular) with f(∅) = 0 then f(·) is polyhedrally tight (dually polyhedrally tight).

Proof: We first consider the situation where f(·) is submodular. Let X ⊆ S. In the statement of Theorem 9.7.2 we select c = χ_X. The selection procedure for x used in the statement of the theorem ensures that x(X) = f(X). Further x ∈ P_f. Next let f(·) be supermodular. Let X ⊆ S. In the polyhedron P_(f^d) (the function f^d(·) being submodular) we select a vector x such that x(S − X) = f^d(S − X) and x(S) = f^d(S) = f(S) (the procedure given in the statement of the theorem permits this). This vector belongs to P_f^d and satisfies x(X) = f(X) (see Theorem 9.7.1(i)).

□

Corollary 9.7.2 (k) If f(·) is submodular (supermodular) and integral then all the vertices of P_f (P_f^d) are integral.

Proof: We will consider only the submodular case. If x is a vertex of the polyhedron P_f then there exists a vector c such that cᵀz reaches its maximum value (among all vectors of the polyhedron) only at x. Clearly this vector (by Theorem 9.7.2) must be nonnegative. But then the procedure outlined in the same theorem yields an integral optimum if f(·) is integral. We conclude that this integral optimum must be the given vertex. The supermodular case follows by noting that g(·) is supermodular iff −g(·) is submodular, and that x ∈ P_g^d iff −x ∈ P_(−g), so that the vertices of P_g^d are the negatives of the vertices of P_(−g).

□
We now show that there is a natural convex extension of every polyhedrally tight set function. We need the following definitions.

Definition 9.7.2 Let S = {e₁, …, eₙ}. Let f(·) : ℜ^S → ℜ and let g(·) be a set function on subsets of S defined by g(X) ≡ f(χ_X) ∀X ⊆ S. Then we say that g(·) is the set function induced by f(·).

Definition 9.7.3 Let P(A, b) denote the polyhedron defined by the system of inequalities Ax ≤ b, where x is a vector in ℜ^S. Let f_(Ab)(·) be the function on ℜ^S defined by

f_(Ab)(c) ≡ max (cᵀx, x ∈ P(A, b)).

Then we say the function f_(Ab)(·) is induced by the polyhedron P(A, b). If P(A, b) is empty we take f_(Ab)(c) = −∞ ∀c ∈ ℜ^S. Let

f^(Ab)(c) ≡ min (cᵀx, x ∈ P(A, b)).

Then we say the function f^(Ab)(·) is dually induced by the polyhedron P(A, b). If P(A, b) is empty we take f^(Ab)(c) = +∞ ∀c ∈ ℜ^S.

Theorem 9.7.3 (k)

i. f_(Ab)(λc₁ + μc₂) ≤ λ f_(Ab)(c₁) + μ f_(Ab)(c₂),    λ, μ ≥ 0.

ii. The collection of vectors on which f_(Ab)(·) takes finite values is closed under addition and nonnegative scalar multiplication (i.e., forms a cone).

iii. f^(Ab)(λc₁ + μc₂) ≥ λ f^(Ab)(c₁) + μ f^(Ab)(c₂), λ, μ ≥ 0, and the collection of vectors on which f^(Ab)(·) takes finite values is closed under addition and nonnegative scalar multiplication.

iv. Let f(·) be a submodular function on subsets of S. Let f′(·) be the function induced by P_f and let f″(·) be the set function induced by f′(·). Let c be a vector such that c(e₁) ≥ ⋯ ≥ c(eₙ) ≥ 0. Let Tᵢ, i = 1, 2, …, n denote {e₁, …, eᵢ}. Then

(a) f′(·) takes finite values on all nonnegative vectors.

(b) f″(X) = f(X) ∀X ⊆ S.

(c) [Lovász83] f′(c) = (c(e₁) − c(e₂)) f′(χ_T₁) + ⋯ + (c(eₙ₋₁) − c(eₙ)) f′(χ_Tₙ₋₁) + c(eₙ) f′(χ_Tₙ).
Proof: i. Let f_(Ab)(λc₁ + μc₂) = (λc₁ + μc₂)ᵀx for some x ∈ P(A, b). By the definition of f_(Ab)(·) it follows that (c₁)ᵀx ≤ f_(Ab)(c₁) and (c₂)ᵀx ≤ f_(Ab)(c₂). The result follows by multiplying the first inequality by λ and the second by μ and adding.

ii. This follows immediately from the preceding result.

iii. Similar to the proof of the first two parts. (Note that when P(A, b) is not void, f^(Ab)(·) would take the value −∞ when it does not take a finite value.)

iv(a) This follows from the definition of f′(·) and Theorem 9.7.2.

iv(b) This follows from the definition of f″(·), the fact that f(·) is polyhedrally tight (Corollary 9.7.1) and by use of the greedy strategy.

iv(c) By using the procedure given in the statement of Theorem 9.7.2 we can construct a vector x which optimizes

max (χ_Tᵢ)ᵀ z, z ∈ P_f

simultaneously for i = 1, 2, …, n. We then have, by the definition of f′(·),

f′(c) ≥ cᵀx.

But

c = (c(e₁) − c(e₂)) χ_T₁ + ⋯ + (c(eₙ₋₁) − c(eₙ)) χ_Tₙ₋₁ + c(eₙ) χ_Tₙ

and

cᵀx = (c(e₁) − c(e₂)) x(T₁) + ⋯ + (c(eₙ₋₁) − c(eₙ)) x(Tₙ₋₁) + c(eₙ) x(Tₙ).

Noting that

x(Tᵢ) = f(Tᵢ) = f′(χ_Tᵢ),    i = 1, 2, …, n,

we have,

f′(c) ≥ (c(e₁) − c(e₂)) f′(χ_T₁) + ⋯ + (c(eₙ₋₁) − c(eₙ)) f′(χ_Tₙ₋₁) + c(eₙ) f′(χ_Tₙ).

The reverse inequality follows from the statement of the first part of the present theorem.

□
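The formula in iv(c) gives a direct way to evaluate the Lovász extension: sort the components of c in decreasing order and take the corresponding combination of the values f(Tᵢ). A sketch (the coverage function and the test vectors are hypothetical, not from the text):

```python
from itertools import combinations

adj = {'e1': {1, 2}, 'e2': {2, 3}, 'e3': {3}}   # a small submodular (coverage) function

def f(X):
    return len(set().union(*(adj[e] for e in X))) if X else 0

def lovasz(f, items, c):
    # f'(c) = sum_i (c(e_i) - c(e_{i+1})) f(T_i), with the e_i sorted so c is decreasing
    order = sorted(items, key=lambda e: -c[e])
    val, T = 0.0, []
    for i, e in enumerate(order):
        T.append(e)
        nxt = c[order[i + 1]] if i + 1 < len(order) else 0.0
        val += (c[e] - nxt) * f(T)
    return val

items = ['e1', 'e2', 'e3']
# on characteristic vectors the extension agrees with f (part iv(b))
for k in range(len(items) + 1):
    for X in combinations(items, k):
        chi = {e: (1.0 if e in X else 0.0) for e in items}
        assert abs(lovasz(f, items, chi) - f(X)) < 1e-9
# a spot check of convexity on a segment between two nonnegative points
c1 = {'e1': 2.0, 'e2': 1.0, 'e3': 0.5}
c2 = {'e1': 0.5, 'e2': 2.0, 'e3': 1.0}
mid = {e: (c1[e] + c2[e]) / 2 for e in items}
assert lovasz(f, items, mid) <= (lovasz(f, items, c1) + lovasz(f, items, c2)) / 2 + 1e-9
print(lovasz(f, items, c1))
```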
Remark: Observe that if c(e₁) ≥ ⋯ ≥ c(eₙ), then

c = (c(e₁) − c(e₂)) χ_T₁ + ⋯ + (c(eₙ₋₁) − c(eₙ)) χ_Tₙ₋₁ + c(eₙ) χ_Tₙ.

Thus f′(c) is obtained by performing the above linear combination of the f′(χ_Tᵢ).

The next result is one of the deepest in submodular function theory. It is a very good starting point for proving many of the important results in this area. Our proof is, however, not polyhedral even though the result is naturally polyhedral. A generalization of the result using polyhedral methods (in fact the Hahn–Banach Separation Theorem) is given in Problem 9.15.

Theorem 9.7.4 (The 'Sandwich Theorem') [Frank82] Let f(·), g(·) be submodular and supermodular functions defined on subsets of S such that f(·) ≥ g(·). Then there exists a modular function h(·) on subsets of S such that f(·) ≥ h(·) ≥ g(·) (equivalently, such that h(·) separates or lies between f(·) and g(·)). Further, if f(·), g(·) are integral then h(·) can be chosen to be integral.
Our proof of the theorem is an algorithmically more efficient version of that due to Lovász and Plummer [Lovász+Plummer86] and is based on the following lemma.

Lemma 9.7.1 Let ∅ ⊂ A ⊂ S be such that f(A) = g(A) and let there exist modular functions h_A(·), h_(S−A)(·) on subsets of A, S − A respectively such that f/A(·) ≥ h_A(·) ≥ g/A(·) and f∘(S − A)(·) ≥ h_(S−A)(·) ≥ g∘(S − A)(·). Then

f(·) ≥ (h_A ⊕ h_(S−A))(·) ≥ g(·).

Proof of the Lemma: Let X ⊆ S. Clearly,

f∘(S − A)(X ∩ (S − A)) = f(A ∪ (X ∩ (S − A))) − f(A),

g∘(S − A)(X ∩ (S − A)) = g(A ∪ (X ∩ (S − A))) − g(A).

Since f(·), g(·) are submodular and supermodular respectively, we have

f(X) ≥ f∘(S − A)(X ∩ (S − A)) + f(X ∩ A),

g(X) ≤ g∘(S − A)(X ∩ (S − A)) + g(X ∩ A).

Further we are given that

f∘(S − A)(X ∩ (S − A)) ≥ h_(S−A)(X ∩ (S − A)) ≥ g∘(S − A)(X ∩ (S − A)).

Thus

f(X) ≥ h_(S−A)(X ∩ (S − A)) + f(X ∩ A) ≥ h_S(X)

and

g(X) ≤ h_(S−A)(X ∩ (S − A)) + g(X ∩ A) ≤ h_S(X),

where h_S ≡ h_A ⊕ h_(S−A). Hence, f(X) ≥ h_S(X) ≥ g(X).

□
Proof of Theorem 9.7.4: Assume without loss of generality that f(∅) = 0. Suppose f(·) has every element e as a separator. Then f(·) is already modular and the theorem is trivially true. We therefore assume that the submodular function has at least one element, say e, that is not a separator. Let Y be the subset of all elements of S which are singleton separators. We will assume that the theorem is true for (|S| < n) and for (|S| = n, |S − Y| < m). We note that the theorem is trivially true for |S| = 1 and also for |S − Y| = 0. We will now prove the result when (|S| = n, |S − Y| = m). We have f(S) < f(S − e) + f(e). Let f′(·) be the function

f′(X) ≡ f(X − e) + (f(S) − f(S − e)),    e ∈ X,

f′(X) ≡ f(X),    e ∉ X.

The function f′(·) is obviously a submodular function with e as a separator (see Theorem 9.6.1) and further f(·) ≥ f′(·) (since, by the submodularity of f(·), f(X) − f(X − e) ≥ f(S) − f(S − e) ∀X such that e ∈ X). Let the set A minimize (f′ − g)(·).

i. Case 1. If (f′ − g)(A) is nonnegative, we have found a submodular function, namely f′(·), which lies between f(·) and g(·) and further has one more singleton separator (namely {e}) than f(·) has. Also f′(·) is integral if f(·) is. By induction on |S − Y|, the theorem is true for f′(·) and therefore for f(·).

ii. Case 2. Let (f′ − g)(A) be negative. Clearly A is not null or equal to S. Let f″(·) be defined by

f″(X) ≡ min (f(X), f′(X) + (g − f′)(A))    ∀X ⊆ S.

It can be verified that f″(·) is submodular (directly or by using the idea of convolution, to be introduced in the next chapter). Further f(·) ≥ f″(·) ≥ g(·) and f″(A) = g(A). It follows that f″/A(·) ≥ g/A(·), f″∘(S − A)(·) ≥ g∘(S − A)(·). By induction on |S|, we may assume that there are modular functions h_A(·) between f″/A(·) and g/A(·) and h_(S−A)(·) between f″∘(S − A)(·) and g∘(S − A)(·). Now by Lemma 9.7.1, (h_A ⊕ h_(S−A))(·) lies between f″(·) and g(·) and therefore also lies between f(·) and g(·). Further, in case f(·), g(·) are integral it is clear that f″(·) is integral and we may assume by induction that h_A(·), h_(S−A)(·) are integral. It follows that (h_A ⊕ h_(S−A))(·) is also integral. Thus the theorem is true when |S| = n, |S − Y| = m.

□
Remark: The above proof of the Sandwich Theorem contains an efficient algorithm for finding the separating modular function provided we have an efficient algorithm for minimizing submodular functions (in this case the function (f′ − g)(·)). It may be verified that the algorithm requires no more than |S| submodular function minimizations.

Exercise 9.13 Let M be a matroid on S. Let r(X) ≡ r(M.X), r′(X) ≡ r(M×X), ν(X) ≡ ν(M×X), ν′(X) ≡ ν(M.X) as in Exercise 9.11. We have already seen that r(·), ν(·) are submodular and r′(·), ν′(·) are supermodular. Show that

i. r(·) ≥ r′(·) and ν(·) ≥ ν′(·).

ii. Find vectors w_r, w_ν such that r(·) ≥ w_r(·) ≥ r′(·) and ν(·) ≥ w_ν(·) ≥ ν′(·).
Exercise 9.14 Let p(·) be a polymatroid rank function on subsets of S. Let p′(X) ≡ p(S) − p(S − X).

i. Show that p(·) ≥ p′(·).

ii. Choose a vector w(·) so that p(·) ≥ w(·) ≥ p′(·).

9.8

Symmetric Submodular Functions
A key problem in submodular function theory is that of minimization. To be precise, the search is for a practically efficient polynomial time algorithm for a general submodular function which is available through a rank oracle. (The oracle will give the value of the function on any given subset). The input size for such an algorithm is determined by the size of the underlying set and the maximum number of bits needed to represent a value of the submodular function. As we shall see later minimization is equivalent to convolution of an appropriate polymatroid rank function with a weight vector. The solution is known to the minimization problem in many practical situations: e.g. minimum directed cut in a graph, convolution of a matroid rank function with a weight vector [Cunningham84], [Narayanan95b].
9. SUBMODULAR FUNCTIONS
354
For the general problem the ellipsoid method [Grotschel+Lovasz+Schrijver81] does provide a polynomial algorithm which however is practically useless. There are a few algorithms for minimization [Cunningham85], [Sohoni92] which are practical but pseudo polynomial (for integral functions the algorithm is polynomial in the size of the underlying set and the maximum value of the function). Recently the case of symmetric submodular functions was solved in a surprisingly simple way. We describe this solution in this section. Definition 9.8.1 A set function g : 2’ g ( S - X ) VX c_
s.
+ % is symmetric iffg ( X ) =
Example 9.8.1 (k) The following are symmetric submodular functions.

i. The cut function on the vertex subsets of a graph.

ii. The function |c|(X), X ⊆ E(G), where c(X) ≐ the set of vertices common to edges in X and E(G) − X.

iii. |c|(X) acting on the left vertex set of a bipartite graph. Here |c|(X) = number of right side vertices adjacent to vertices in X as well as to vertices in V_L − X (see Example 9.2.8).

iv. The function ξ(X) ≐ r(M.X) − r(M × X), X ⊆ S, where M is a matroid on S and r(.) denotes the rank function of the matroid.

v. θ(X) ≐ f(X) + f(S − X), X ⊆ S, where f(.) is a submodular function on subsets of S.
Exercise 9.15 Show that the functions in Example 9.8.1 are symmetric submodular functions.

Remark: Prior to the work of Nagamochi and Ibaraki [Nagamochi+Ibaraki92a] (see also [Nagamochi+Ibaraki92b], [Nagamochi+Ono+Ibaraki94]) the standard way of finding a min cut was through flow techniques. The above authors used a special (linear time) decomposition of a graph into forests to identify a pair of vertices for which a minimum separating cut was immediately available. This cut was stored. The pair was fused and the process repeated until the graph had only two vertices. The minimum value cut among all the stored cuts gives the minimum cut. A simpler algorithm of the same complexity was found by M. Stoer and F. Wagner [Stoer+Wagner94] and, independently, by A. Frank [Frank94]. It is this algorithm that we generalize below. The version we present is essentially the same as the one due to M. Queyranne [Queyranne95]. However, we have tried to bring out the relationship to the Stoer-Wagner algorithm more strongly. We follow the notation of Stoer-Wagner and essentially do a line by line translation of their algorithm for finding a min cut to one for minimising a symmetric submodular function over all sets not equal to the full set or the void set.
Let f : 2’ +R ! be a submodular function. Let c(A, B ) $ ( f ( A ) + f ( B ) - f ( A U B ) ) V A , B C S , A n B = 0. Observethat c(A,B) = c(B,A).
We then have c(X, S − X) = ½(f(X) + f(S − X) − f(S)). If g(X) ≐ c(X, S − X), it is clear that g(.) is symmetric and submodular. We minimize g(.) over X ⊆ S, X ≠ ∅, X ≠ S.
Subroutine Minimum Phase (g, a, S_j)
BEGIN
A ← {a}
while (A ≠ S_j)
{
add to A the element e ∉ A such that c(A, e) is the largest, i.e., f(A ∪ e) − f(e) is least.
}
Store the 'pair of the phase' (e_n, S_j − e_n) and the 'value of the pair of the phase' c(e_n, S_j − e_n).
Fuse e_n, e_{n−1}: replace the set S_j by S_{j+1} ≐ {e_1, e_2, ..., e_{n−2}, e_{n−1} ∪ e_n}. (Note that e_{n−1}, e_n are treated as subsets of S.)
Replace the function g by g_fus.S_{j+1}.
END Subroutine Minimum Phase.

Algorithm Symmetric:
Initialize S_0 = S.
BEGIN
The algorithm has |S| − 1 phases of Subroutine Minimum Phase.
Value of current minimum pair = ∞
while (|S_j| > 1)
{
Subroutine Minimum Phase (g, a, S_j).
If the pair of the phase (e_n, S_j − e_n) has a lower value than the current minimum pair, then store (e_n, S_j − e_n) as the current minimum pair.
}
Let the minimum pair be (e_{n_i}, S_j − e_{n_i}) at the end of the algorithm (i.e., when |S_j| = 2). Output the sets e_{n_i}, S − e_{n_i} as minimising sets (nonvoid proper subsets of S) for g(.).
END Algorithm Symmetric.

We justify Algorithm Symmetric through the following theorem. We first introduce some convenient notation and a couple of lemmas.
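The two routines above can be sketched compactly in Python (our own sketch, with invented names): fused elements are represented as frozensets of original elements, so the fused function is simply f evaluated on the union of the underlying elements, and g(X) = ½(f(X) + f(S − X) − f(S)) is evaluated directly.

```python
from itertools import chain

def queyranne_minimize(S, f):
    """Minimize g(X) = (f(X) + f(S - X) - f(S)) / 2 over nonvoid proper
    subsets of S, following Algorithm Symmetric."""
    full = frozenset(S)

    def g(X):
        return 0.5 * (f(X) + f(full - X) - f(full))

    groups = [frozenset([e]) for e in S]      # current set S_j (fused groups)
    best_set, best_val = None, None
    while len(groups) > 1:
        # Minimum Phase: greedily order by maximum connectivity c(A, e)
        order = [groups[0]]
        rest = list(groups[1:])
        while rest:
            A = frozenset(chain.from_iterable(order))
            # maximizing c(A, e) is the same as minimizing f(A ∪ e) - f(e)
            e = min(rest, key=lambda t: f(A | t) - f(t))
            order.append(e)
            rest.remove(e)
        pair = order[-1]                      # pair of the phase (e_n, S_j - e_n)
        val = g(pair)
        if best_val is None or val < best_val:
            best_set, best_val = pair, val
        # fuse e_n with e_{n-1}
        groups = order[:-2] + [order[-2] | order[-1]]
    return best_set, best_val

# Example: f is the cut function of two triangles joined by one edge,
# so g(X) equals |cut|(X) and the minimum cut value is 1.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
f = lambda X: sum(1 for (u, v) in edges if (u in X) != (v in X))
best_set, best_val = queyranne_minimize({1, 2, 3, 4, 5, 6}, f)
```

The phase cost here is quadratic in the number of oracle calls, matching the O(|S|³) total count discussed below.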
In any phase the elements of the current set S_i are ordered as, say, (e_1, e_2, ..., e_k). Let A_j ≐ {e_1, e_2, ..., e_{j−1}}, j = 2, ..., k + 1. (In particular A_{k+1} = S_i.) Let A ⊆ {e_1, e_2, ..., e_k} with e_k ∈ A, e_{r−1} ∈ A and {e_r, ..., e_{k−1}} ∩ A = ∅.

Lemma 9.8.1
c(A_k, e_k) − c(A_r, e_k) ≤ c(A, A_{k+1} − A) − c(A_{r+1} ∩ A, A_{r+1} − A_{r+1} ∩ A).

Proof: We note that
c(X, Y) = ½[f(X) + f(Y) − f(X ∪ Y)].
Hence the LHS of the inequality in the statement simplifies to
½[f(A_k) − f(A_{k+1}) + f(A_r ∪ e_k) − f(A_r)],
while the RHS simplifies to
½[f(A) + f(A_{k+1} − A) − f(A_{k+1}) + f(A_{r+1}) − f(A_{r+1} ∩ A) − f(A_{r+1} − A)].
We therefore have to show (after simplification) that the following inequality is valid:
f(A_k) − f(A_{k+1}) + f(A_r ∪ e_k) − f(A_r) ≤ f(A) + f(A_{k+1} − A) − f(A_{k+1}) + f(A_{r+1}) − f(A_{r+1} ∩ A) − f(A_{r+1} − A).   (9.5)
Now we note that A_k = (A_{k+1} − A) ∪ (A_{r+1} ∩ A), A_{r+1} = (A_{r+1} − A) ∪ (A_{r+1} ∩ A) and A_{r+1} − A ⊆ A_{k+1} − A. Hence by submodularity
f(A_k) − f(A_{k+1} − A) ≤ f(A_{r+1}) − f(A_{r+1} − A).
By a similar argument we see that
f(A_r ∪ e_k) − f(A_r) ≤ f(A) − f(A_{r+1} ∩ A)
(since A_r ∪ e_k − A_r = e_k = A − A_{r+1} ∩ A). Adding these two inequalities and subtracting f(A_{k+1}) from both sides proves the inequality (9.5) and hence the lemma. □
Lemma 9.8.2 Let (e_k, S_i − e_k) be the pair of the phase. Then
c(e_k, S_i − e_k) ≤ c(A, S_i − A),
where e_k ∈ A and e_{k−1} ∈ S_i − A.

Proof: The proof is by induction on the size of S_i. The lemma is true for |S_i| = 2. Suppose it to be true for all sets of smaller size. Let A ⊆ S_i with e_k ∈ A, e_{r−1} ∈ A, {e_r, ..., e_{k−1}} ∩ A = ∅. By induction it follows that
c(A_r, e_r) ≤ c(A_{r+1} ∩ A, A_{r+1} − A_{r+1} ∩ A)
(since e_{r−1} ∈ A, e_r ∈ A_{r+1} − A). We have
c(A_r, e_k) ≤ c(A_r, e_r),
since e_r was chosen to maximize c(A_r, ·). Hence, by Lemma 9.8.1 with A_{k+1} = S_i,
c(A_k, e_k) ≤ c(A_r, e_k) + c(A, S_i − A) − c(A_{r+1} ∩ A, A_{r+1} − A_{r+1} ∩ A) ≤ c(A, S_i − A),
which is the required inequality, since c(e_k, S_i − e_k) = c(A_k, e_k). □
Theorem 9.8.1 The current minimum pair (e_{n_i}, S_j − e_{n_i}) at the end of Algorithm Symmetric yields the minimising set e_{n_i} for g(.).

Proof: The proof is by induction on the size of the set S. The theorem is clearly valid when |S| = 2. Suppose it to be valid when the size of the set is n − 1. Let |S| = n. Now let the last two elements of the first phase be e_{n−1}, e_n. If the minimum value pair (X, S − X), X ⊆ S, X ≠ ∅, X ≠ S, has {e_{n−1}, e_n} ⊆ X (say, using the symmetry of g(.)), then we can fuse {e_{n−1}, e_n} and work with the set S_2 = {e_1, e_2, ..., e_{n−2}, {e_{n−1}, e_n}} and the function g_fus.S_2. By induction, in the subsequent phases of the algorithm the minimum pair will be revealed as a pair of some phase. On the other hand, if the minimum pair (X, S − X) has e_n ∈ X, e_{n−1} ∈ S − X, by Lemma 9.8.2 the pair of the first phase has this minimum value. This completes the proof of the theorem. □
Complexity of Algorithm Symmetric: Suppose there is an oracle that gives the value of c(X, Y) for given X, Y ⊆ S, X ∩ Y = ∅. If the current set is S_i, the number of calls to this oracle is O(|S_i|²) during the phase. At the end of the phase the set reduces in size by one. We continue until the set reaches size two. Hence the total number of calls to the oracle is O(|S|³). Special cases can be handled much faster: for instance, in the case of the cut(.) function of a graph, Stoer and Wagner show that the complexity is O(|E| + |V| log |V|) elementary operations per phase, giving an overall complexity of O(|V||E| + |V|² log |V|).
The case of the symmetric function (|Γ_L| − |E_L|)(.) of a bipartite graph B = (V_L, V_R, E) is almost identical. The complexity here per phase is O(|E| + |V_L| log |V_L|) and overall is O(|V_L||E| + |V_L|² log |V_L|).
Exercise 9.16 (k) For any function f(.) show that g(X) ≐ ½(f(X) + f(S − X)) is symmetric. If f(.) is a symmetric function, show that c(X, S − X) = f(X) provided f(S) = 0.
Exercise 9.17 If f(.) is a symmetric function, which of the following operations preserve symmetry?

i. Contraction.

ii. Restriction.

iii. Comodular dual.

iv. Contramodular dual.

Exercise 9.18 For any submodular function f(.) show (using the notation of Section 9.8) that c(A, B) ≥ c(X, Y) whenever X ⊆ A, Y ⊆ B, A ∩ B = ∅. Interpret this statement for the cut function of a graph.

Exercise 9.19 Let g(.) : 2^S → ℝ be a symmetric submodular function. Give an algorithm to minimize g(.) over

i. all subsets X, A ⊆ X ⊆ S;

ii. all subsets X, ∅ ⊂ X ⊆ A.
Exercise 9.20 Specialize Algorithm Symmetric to the case of

i. the cut function |cut|(.) of a graph G = (V, E);

ii. the function (|Γ_L| − |E_L|)(.) of the bipartite graph B ≐ (V_L, V_R, E);

iii. the symmetric function arising from |Γ_L|(.) in B above.

Exercise 9.21 In Theorem 9.5.2 we have shown how to subtract a modular function from a given submodular function to yield a polymatroid rank function. Work this case out for |cut|(.).
9.9 Problems

Problem 9.5 (k) Prove the principle of inclusion and exclusion for weight functions. (The intersection form: w(X_1 ∪ ... ∪ X_n) = Σ w(X_i) − Σ w(X_i ∩ X_j) + Σ w(X_i ∩ X_j ∩ X_k) − ...) State the union form of the principle. Suppose the value of such a function is known for arbitrary unions of sets from a collection {X_1, X_2, ..., X_n}; show how to obtain its value on sets obtained by arbitrary intersection of such sets and their complements.
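The intersection form in Problem 9.5 is easy to check numerically for a weight function; the sets and weights below are our own illustrative choices.

```python
from itertools import combinations

def w(X, wt):
    """Weight function: w(X) is the sum of element weights over X."""
    return sum(wt[e] for e in X)

def intersection_form(sets, wt):
    """Right-hand side of the intersection form of inclusion-exclusion."""
    n, total = len(sets), 0
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            inter = frozenset.intersection(*(sets[i] for i in idx))
            total += (-1) ** (k + 1) * w(inter, wt)
    return total

wt = {1: 2, 2: 3, 3: 5, 4: 7, 5: 11}
sets = [frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({3, 5})]
union_weight = w(frozenset().union(*sets), wt)
```

For these values both sides equal the weight of the union, as the principle asserts for modular functions.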
Problem 9.6 (k) Let p : 2^S → ℝ be a submodular (supermodular) function. Then show that
p(A_1 ∪ ... ∪ A_r) ≤ (≥) Σ_i p(A_i) − Σ_{i,j distinct} p(A_i ∩ A_j) + Σ_{i,j,k distinct} p(A_i ∩ A_j ∩ A_k) − ... + (−1)^{r+1} Σ_{i_1,...,i_r distinct} p(A_{i_1} ∩ ... ∩ A_{i_r}),
the sums being over sets of distinct indices.
Problem 9.7 Let G be a graph on vertices V and edges E. Let w be a nonnegative weight function on the edges of G. As usual w(X) ≐ Σ_{e∈X} w(e). Let p : 2^E → ℝ_+ be defined by p(X) ≐ maximum weight of a forest of the subgraph of G on X. Show that p(.) is submodular. Examine the case where p(.) is defined by minimum weight instead of maximum weight.
Problem 9.8 How would you generalize Problem 9.7 above for the adjacency and exclusivity functions of a bipartite graph, given a nonnegative weight function on the left vertex set?

Problem 9.9 (Generalization of Problem 9.7 to the matroid base case) Let M be a matroid on S. Let w be a nonnegative weight function on the elements of M. Let p : 2^S → ℝ_+ be defined by
p(X) ≐ maximum weight of a base of M.X.
Show that p(.) is a polymatroid rank function.

Problem 9.10 (k) Let A be a matrix with real entries. Let R ≐ {r_1, ..., r_k}, C ≐ {c_1, ..., c_n} be the sets of rows and columns of A respectively. Let A_X denote the submatrix of A using only the rows in X ⊆ R but all columns. Let f : 2^R → ℝ be defined as follows:
f(X) ≐ log(det(A_X A_X^T)) ∀X ⊆ R.
Prove that f(.) is submodular.

Problem 9.11 Let A be a matrix with linearly independent rows. Let T be a set of maximally independent columns of A and let the submatrix corresponding to the columns of T be the identity matrix. Let A_X, X ⊆ T, denote the submatrix of A composed of those rows of A which have nonzero entries in one of the columns corresponding to X. Let f(X) ≐ log(det(A_X A_X^T)), X ⊆ T. Prove that f(.) is submodular.
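The submodularity asserted in Problem 9.10 can be verified exhaustively on a tiny instance; the matrix below and the naive determinant routine are our own illustrative sketch.

```python
import math
from itertools import combinations

def det(M):
    """Determinant by Laplace expansion; adequate for the tiny matrices here."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

# A fixed matrix with linearly independent rows (values chosen arbitrarily).
A = [[2.0, 0.0, 0.0, 1.0],
     [0.0, 3.0, 0.0, 1.0],
     [0.0, 0.0, 4.0, 1.0]]

def f(X):
    """f(X) = log det(A_X A_X^T) over row-index sets X, with f(empty) = 0."""
    rows = [A[i] for i in sorted(X)]
    if not rows:
        return 0.0
    gram = [[sum(a * b for a, b in zip(r, s)) for s in rows] for r in rows]
    return math.log(det(gram))

R = range(3)
subsets = [frozenset(c) for k in range(4) for c in combinations(R, k)]
is_submodular = all(f(X) + f(Y) >= f(X | Y) + f(X & Y) - 1e-9
                    for X in subsets for Y in subsets)
```

Equivalently, f(X) is the log-determinant of the principal submatrix of AA^T indexed by X, which is where the submodularity comes from.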
Problem 9.12 Let A be a totally unimodular matrix (every subdeterminant of A is 0, ±1) with columns S. Let M be the matroid on S associated with A. Let T be a base of M. Let
f(X) ≐ log(number of bases of M × (S − (T − X))), X ⊆ T.
Prove that f(.) is submodular.

Problem 9.13 [Fujishige78a], [Fujishige91] We consider some analogies between polymatroids and matroids in this problem. We note that if we perform the fusion operation on a matroid rank function we obtain a polymatroid rank function. Later, in the next chapter, we show that every polymatroid rank function can be so obtained. In other words every polymatroid rank function can be 'expanded' into a matroid rank function. If p : 2^S → ℝ is a polymatroid rank function we say that P_p ≐ {x ∈ ℝ^S : x(X) ≤ p(X) ∀X ⊆ S} is a polymatroid.

An independent vector of a polymatroid is a nonnegative vector in P_p, i.e., it is a vector x ≥ 0 s.t. x(X) ≤ p(X) ∀X ⊆ S. We will show later that if x is an integral independent vector, in the expanded matroid there is an independent set T whose intersection with the set e ∈ S has size x(e). (Here the expanded set is the expanded version of the set S, the latter being a partition of the former.)

i. Let x be a real vector on S. Define
D(x) ≐ {X | X ⊆ S, x(X) = p(X)}.
Show that D(x) is closed with respect to union and intersection and hence has a unique maximal and a unique minimal set.

ii. If x is independent, define the saturation function
sat(x) ≐ ∪{X | X ⊆ S, x(X) = p(X)}.
Show that
sat(x) = {e | e ∈ S, ∀α > 0 : x + αχ_e ∉ P_p}.
Observe that the saturation function generalizes the closure function of a matroid.

iii. Let D(x, e) denote the collection of sets in D(x) which have e as a member. Show that D(x, e) is closed under union and intersection and hence has a unique maximal and a unique minimal element.

iv. For an independent vector x ∈ P_p and e ∈ sat(x), define the dependence function
dep(x, e) ≐ ∩{X | e ∈ X ⊆ S, x(X) = p(X)}.
Let x(e) < p(e) and let 0 < α < p(e) − x(e). Let x' = x + αχ_e and let θ(X) ≐ x'(X) − p(X) ∀X ⊆ S. Let D'(x') denote the collection of sets where θ(.) reaches a maximum. Show that
(a) D'(x') is closed under union and intersection and hence has a unique maximal and a unique minimal set.

(b) Let K be the minimal member of D'(x'). Then K = dep(x, e).

(c) dep(x, e) = {e' | e' ∈ S, ∃α > 0 : x + α(χ_e − χ_{e'}) ∈ P_p}.

Observe that the dependence function generalizes the fundamental circuit of a matroid.

Problem 9.14 Let f : 2^S → ℝ be a submodular function with f(∅) = 0. Let c be a nonnegative weight function on S. Let c_X denote the row vector on S with
c_X(e) = 0, e ∉ X;   c_X(e) = c(e), e ∈ X.
Let p : 2^S → ℝ be defined by
p(X) ≐ max c_X x^T, x ∈ P_f.
Show that p(.) is a submodular function.

Problem 9.15 We state the Hahn-Banach Separation Theorem below. Let V be a normed space over ℝ and let E_1, E_2 be nonempty disjoint convex subsets of V, where E_1 is open in V. Then there is a real hyperplane in V which separates E_1 and E_2 in the following sense: For some linear functional g(.) over V and t ∈ ℝ, we have
g(y_1) > t ≥ g(y_2) ∀ y_1 ∈ E_1, y_2 ∈ E_2.
Use this result to prove the following: Let f_1, f_2 : 2^S → ℝ be polyhedrally tight and dually polyhedrally tight set functions respectively with f_1(.) ≥ f_2(.). Then there is a modular function w(.) such that f_1(.) ≥ w(.) ≥ f_2(.).

Problem 9.16 Use the Sandwich Theorem to prove the following: Let M_1, M_2 be two matroids on S. Then the maximum size of a common independent set of the two matroids = min_{X⊆S} r_1(X) + r_2(S − X), where r_1(.), r_2(.) are the rank functions of the matroids M_1, M_2 respectively.
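The min-max identity of Problem 9.16 can be confirmed by brute force on a small pair of matroids; the uniform and partition matroids used here are our own illustrative choice.

```python
from itertools import combinations

S = frozenset({0, 1, 2, 3})

def subsets(T):
    T = sorted(T)
    for k in range(len(T) + 1):
        yield from (frozenset(c) for c in combinations(T, k))

def r1(X):
    """Rank of the uniform matroid U_{2,4}: sets of size <= 2 are independent."""
    return min(len(X), 2)

def r2(X):
    """Rank of a partition matroid with blocks {0,1}, {2,3}, capacity 1 each."""
    return (1 if X & {0, 1} else 0) + (1 if X & {2, 3} else 0)

# X is independent in a matroid iff its rank equals its size.
max_common = max(len(X) for X in subsets(S)
                 if len(X) == r1(X) and len(X) == r2(X))
min_bound = min(r1(X) + r2(S - X) for X in subsets(S))
```

Both quantities come out equal (to 2 here), as the matroid intersection theorem predicts.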
i. f ( X - e ) - f(x)2 f((S- X ) u e ) - f(S - X) VX
c S, e E X .
ii. f(X- e ) - f ( X ) 5 f ( ( S - X ) U e ) - f(S- X) VX E S , e E X
9.10 Notes
Submodular functions became an active area of research in optimization after Edmonds' pioneering work. Earlier prominent workers who used such ideas were H. Whitney [Whitney35] and W.T. Tutte [Tutte58], [Tutte59], [Tutte61], [Tutte65], [Tutte71] (in their work on graphs and matroids), G. Choquet [Choquet55] (in his work on capacity theory) and O. Ore [Ore56] (in his work on graph theory). The present chapter follows Lovasz's excellent review paper [Lovasz83]. We have also made use of Fujishige's comprehensive monograph [Fujishige91] on the subject. In the present book, the polyhedral approach to submodular functions has not been emphasized. A thorough treatment of this subject through the polyhedral approach may be found in [Frank+Tardos88] and in the above mentioned monograph of Fujishige. The notion of comodular duality is due to McDiarmid [McDiarmid75]. The Sandwich Theorem first appeared as a technical result in [Frank82]. Its importance as a central result in submodular function theory was recognized by Lovasz and Plummer [Lovasz+Plummer86].
9.11 Solutions of Exercises
E 9.1: For most of the examples the easiest route for proving sub- or supermodularity is to show that Inequality 9.3 holds.

Example 9.2.1: Let X ⊆ Y ⊆ E(G) and let a ∈ E(G) − Y. Then
V(X ∪ a) = V(X) ⊎ (V(a) − V(X)),
V(Y ∪ a) = V(Y) ⊎ (V(a) − V(Y)).
Clearly, since V(Y) ⊇ V(X), it follows that
V(a) − V(Y) ⊆ V(a) − V(X).
Hence
|V|(X ∪ a) − |V|(X) ≥ |V|(Y ∪ a) − |V|(Y).

Example 9.2.2: Let X ⊆ Y ⊆ V(G) and let a ∈ V(G) − Y. Now
E(Z ∪ a) = E(Z) ⊎ E_{Za}, Z ⊆ V(G), a ∈ V(G) − Z,
where E_{Za} is the set of all edges with a as one end point, the other end point lying in Z ∪ a. Clearly E_{Xa} ⊆ E_{Ya}. Thus
|E|(X ∪ a) − |E|(X) ≤ |E|(Y ∪ a) − |E|(Y).
Example 9.2.3: Let X ⊆ Y ⊆ V(G) and let a ∈ V(G) − Y. Then
I(X ∪ a) = I(X) ⊎ [I(a) − I(X)],
I(Y ∪ a) = I(Y) ⊎ [I(a) − I(Y)].
Clearly
I(a) − I(X) ⊇ I(a) − I(Y).
The result follows.

Example 9.2.4: Let X ⊆ Y ⊆ V(G), a ∈ V(G) − Y. We have
Γ(X ∪ a) = Γ(X) ⊎ (Γ(a) − Γ(X)),
Γ(Y ∪ a) = Γ(Y) ⊎ (Γ(a) − Γ(Y)).
Clearly
(Γ(a) − Γ(X)) ⊇ (Γ(a) − Γ(Y)).
The result follows.

Example 9.2.5: Let X ⊆ Y ⊆ V(G) and let a ∈ V(G) − Y. Then
cut(X ∪ a) = (cut(X) − cut(X) ∩ cut(a)) ∪ (cut(a) − cut(X)),
cut(Y ∪ a) = (cut(Y) − cut(Y) ∩ cut(a)) ∪ (cut(a) − cut(Y)).
Clearly
cut(a) − cut(X) ⊇ cut(a) − cut(Y)
and cut(X) ∩ cut(a) ⊆ cut(Y) ∩ cut(a). The result follows.

Example 9.2.6: In every one of these cases the method of the above solutions works.

Example 9.2.7: The proof is similar to that of Example 9.2.2. We only have to define E_{Za} to be the set of all vertices in V_R adjacent only to vertices in Z ∪ a and adjacent to a.

Example 9.2.8: We have seen that the function |E_L|(.) is supermodular in Example 9.2.7. Next let Γ_L(.) denote Γ/V_L(.). Now
c(X) = Γ_L(X) − E_L(X), X ⊆ V_L,
and hence
|c|(X) = |Γ_L|(X) − |E_L|(X).
Now |Γ_L|(.) is submodular while |E_L|(.) is supermodular. The result follows from the fact that subtraction of a supermodular function from a submodular function yields another submodular function. Such facts are presented in Exercise 9.4.

Example 9.2.9: Let X ⊆ Y ⊆ S and let e ∈ E − Y. Clearly if e is independent of Y it must be independent of X. Hence
r(X ∪ e) − r(X) ≥ r(Y ∪ e) − r(Y).
The submodularity of r(.) follows.

Example 9.2.10: We parallel the argument of the previous example. Let X ⊆ Y ⊆ E(G) and let e ∈ E(G) − Y. If e does not form a circuit with edges in Y, it is clear that e will not do so with edges in X either. Hence
r(X ∪ e) − r(X) ≥ r(Y ∪ e) − r(Y).
So r(.) is submodular. We know that
ν'(X) = |X| − r(X).
Since |·| is modular and r(.) is submodular, it is easily seen that the function ν'(.) is supermodular. Next observe that r'(X ∪ e) > r'(X) iff E − X contains no circuit with e as a member. Clearly, whenever Y ⊇ X, if E − X contains no circuit with e as a member, neither will E − Y contain a circuit with e as a member. Hence
r'(X ∪ e) − r'(X) ≤ r'(Y ∪ e) − r'(Y).
This is equivalent to supermodularity of r'(.). The submodularity of ν(.) follows since ν(X) = |X| − r'(X).
k(Y
u a ) - k ( Y ) = 1,
k(X
u a ) - k ( X ) = 0.
So k(.) cannot be submodular. On the other hand k(Z u d ) - k ( 2 ) = 0 ,
k(X
u d ) - k ( X ) = 1.
So k(.) cannot be supermodular. E 9.5: i. This is immediate.
ii. Let the rank function of G × X be r'(.). It is easily verified that r'(.) and r°X(.) are matroid rank functions (increasing, integral, submodular, with zero value on ∅ and value not exceeding one on singletons). A matroid rank function r(.) is fully determined by its independent sets (sets on which r(Y) = |Y|). So we need to show that r'(.) and r°X(.) have the same independent sets. Now the independent sets of r'(.) are the circuit free sets of G × X. Let Y ⊆ X. We know that this set contains no circuit of G × X iff for each Z ⊆ E(G) − X that contains no circuit of G, Y ∪ Z contains no circuit of G, i.e., iff no circuit of G intersects X in a subset of Y, i.e., iff r(Y ∪ (E(G) − X)) = |Y| + r(E(G) − X), i.e., iff r°X(Y) = |Y|.

iii. This follows immediately from the above proof and the definition of contramodular dual.
iv. Immediate from the definition of nullity function. The remaining parts follow from the above through the use of Theorem 9.3.2 (i), (ii).
E 9.6:

i. This is immediate from the definitions of restriction, contraction and incidence functions of the relevant graphs.

ii. Let Y ⊆ X ⊆ V(G). Then |I|°X(Y) = |I|((V(G) − X) ∪ Y) − |I|(V(G) − X) = the number of edges which are incident on Y but not on V(G) − X. The result follows.

iii. This is immediate.

iv. We have
|Γ|*(X) = |Γ|(V(G)) − |Γ|(V(G) − X).

v. Let X ⊆ V(G). We have … = |I|(X).

The remaining parts follow through the use of Theorem 9.3.2 (ii) on the corresponding results for the incidence function.
E 9.7: i. This is immediate.

ii. Let Y ⊆ X. Then |Γ_L ° X|(Y) = |Γ_L|((V_L − X) ∪ Y) − |Γ_L|(V_L − X), i.e., the size of the set of vertices adjacent to Y but not in Γ_L(V_L − X). This is clearly the size of the set adjacent to Y in B ° X (taken with respect to the left vertex set).

iii. This is immediate from the definitions of the two functions involved.

iv. Let X ⊆ V_L. Then
|Γ_L*|(X) = |Γ_L|(V_L) − |Γ_L|(V_L − X).
This is the size of the set of vertices which are adjacent to X but not to V_L − X, i.e., the size of E_L(X). The remaining parts follow by the use of Theorem 9.3.2 (ii) on the above results.
E 9.8: The submodularity of these functions has already been shown (Example 9.2.10). That they are increasing and integral functions and that their values on singletons do not exceed 1 is clear.
E 9.9: The submodularity of all these functions has already been shown (see the Solution of Exercise 9.1). It is clear that they all take zero value on the null set. Except for the cut function all the functions are monotone increasing. The cut function is symmetric, i.e., its value is the same on a set and its complement. So it cannot be monotone increasing if the graph is not trivial.

E 9.10: Proof of Theorem 9.5.1: We will consider only the polymatroid case, since the matroid case is an easy consequence. We have
p*(X) ≐ Σ_{e∈X} α(e) − [p(S) − p(S − X)].
(Note that α(.) satisfies α(e) ≥ p(e) ∀e ∈ S.) We have already seen that p*(.) is submodular (Theorem 9.3.4). It is immediate from the definition that p*(∅) = 0.

Let Y ⊇ X. Since p(.) is submodular and p(∅) = 0, we have
p(S − X) − p(S − Y) ≤ Σ_{e∈(Y−X)} p(e).
Hence
p*(Y) − p*(X) = Σ_{e∈Y−X} α(e) − [p(S − X) − p(S − Y)] ≥ 0.
Thus p*(.) is increasing.

p*(e) ≥ 0 ∀e ∈ S, because p*(∅) = 0 and p*(.) is increasing.

Thus p*(.) is a polymatroid rank function. □
E 9.11: i. It is easily verified that all these functions are submodular, increasing, integral, take the value zero on the null set and zero or one on singleton sets.

ii. Clearly this function is submodular, increasing, integral and takes value zero on the null set.

iii. The statements about independent sets, bases and circuits are all immediate from the definitions of these sets in terms of the rank function and the definition of the notion of restriction.

iv. A set Y is independent in M × X iff
|Y| = r°X(Y) = r((S − X) ∪ Y) − r(S − X).
Now it can be verified that, whenever Z is a base of M.(S − X),
r((S − X) ∪ Y) − r(S − X) = r(Z ∪ Y) − r(Z)
and r(Z) = |Z|. Hence Y is independent in M × X iff, whenever Z is a base of M.(S − X), we have
|Y| + |Z| = r(Z ∪ Y),
equivalently, we have Y ∪ Z independent in M. Noting that a base is a maximal independent set, we find that Y is a base of M × X iff, whenever Z is a base of M.(S − X), Y ∪ Z is a base of M. But a base of M.(S − X) is a maximal intersection of a base of M with S − X. The desired result follows.

A circuit of a matroid can be seen to be a minimal set not contained in any base of the matroid. A circuit of M × X is a minimal set not contained in any minimal intersection of bases of M with X. Suppose C is a circuit of M with C ∩ X ≠ ∅. Now C − X is independent in M and hence is contained in a base b of M.(S − X). If b_X is any base of M × X, we know that b ∪ b_X is a base of M which intersects X minimally. Clearly b ∪ b_X cannot contain C. Hence no base of M × X can contain C ∩ X. Hence C ∩ X contains a circuit of M × X. On the other hand, if C' is a circuit of M × X, it is seen from the definition (in terms of rank) of such a circuit that C' ∪ Z, where Z is a base of M.(S − X), contains a circuit of M. Since Z contains no circuit of M, it follows that C' contains the intersection of a circuit of M with X.

v. We have, B is a base of M* iff
|B| = r*(B) = r*(S) = |S| − r(S).
Now r*(B) = |B| − r(S) + r(S − B). Thus B is a base of M* iff |S − B| = r(S − B), i.e., iff S − B is a base of M.

vi. We have
r'(X) ≐ r(S) − r(S − X) = r°X(X).
+
E 9.12: A vector y belongs to P_{f+x} iff y(X) ≤ f(X) + x(X) ∀X ⊆ S, i.e., iff (y − x)(X) ≤ f(X) ∀X ⊆ S, i.e., iff (y − x) ∈ P_f. □
S i.e., iff 0
E 9.13: i. Since r ( 0 ) = 0, T(X)
Hence,
+ T(S - X ) 2 T ( S )
T(S) - T(S - X ) = T ' ( X ) 5 T ( X ) .
The proof for the v(.) case is similar. ii. Let b be any base of M. If X C S we have X n b independent in M and hence X n b is contained in a base of M . X . Thus IX n bl 5 T ( X ) .On the other hand X n b contains a base of M x X . Hence IX n bl 2 r ' ( X ) . Thus we can choose w,
4
as follows. w,(e) = 1, e E b and w,(e) = 0, e b. Similarly it can be seen that choosing w,(.) corresponding to a cobase would satisfy v(.) 2 w,(.) v'(.).
>
E 9.14: i. Similar to the matroid case.

ii. Choose w to be a base of the polymatroid, i.e., a vector on S such that w(X) ≤ p(X) ∀X ⊆ S and w(S) = p(S). We then have
p(X) ≥ w(X) ∀X ⊆ S
and
p(S) − p(X) ≤ w(S) − w(X) ∀X ⊆ S.
E 9.15: The functions in (i) and (ii) are special cases of the function in (iii). The latter has already been shown to be submodular in the solution to Exercise 9.1, Example 9.2.8. The symmetry follows by definition.

P 9.15: … and (0, z) ∈ E_2 iff z ≤ 0.
By the Separation Theorem we have a hyperplane H (which is the set of all points in ℝ^{|S|+1} on which the functional g(.) of the separation theorem takes the value t) separating E_1 and E_2. Clearly (0, 0) ∈ H and H is a linear subspace. Now if (c, x_1), (c, x_2) ∈ H we must have x_1 = x_2. For the hypothesis implies that (0, x_1 − x_2) ∈ H and (0, x_2 − x_1) ∈ H. So neither (0, x_1 − x_2) nor (0, x_2 − x_1) belongs to E_1. Hence x_1 − x_2 ≤ 0 and x_2 − x_1 ≤ 0. We conclude x_1 = x_2.

Thus H induces a function w(.) on ℝ^{|S|} through w(c) ≐ x iff (c, x) ∈ H. Since H is a linear subspace it is easy to verify that w(.) is a linear functional. Further, w(.) induces a modular function on 2^S. For
w(χ_X) + w(χ_Y) = w(χ_X + χ_Y) = w(χ_{X∪Y} + χ_{X∩Y}) = w(χ_{X∪Y}) + w(χ_{X∩Y}).

Next let (c, z_1), (c, z_0), (c, z_2) belong respectively to E_1, H, E_2. We must have z_1 > z_0; otherwise (c, z_0) ∈ E_1 (by the definition of E_1), which is a contradiction. Suppose z_0 < z_2. Then g(c, z_2) ≠ t, as otherwise (c, z_2) ∈ H and z_2 = z_0. Hence g(c, z_2) < t. Now g(c, z_1) > t. Hence g(c, λz_1 + (1 − λ)z_2) = t for some 0 ≤ λ ≤ 1. But then z_0 = λz_1 + (1 − λ)z_2, which is impossible since z_0 < z_1, z_2. Hence we conclude that z_1 > z_0 ≥ z_2. Thus f_1(c) ≥ w(c) ≥ f_2(c). Restricting these functions to characteristic vectors of subsets of S we get the desired result.
P 9.16: We first show that ∀X ⊆ S,
r_1(X) + r_2(S − X) ≥ max size of common independent set.   (9.7)
Let T be a common independent set. Then
|T| = |X ∩ T| + |(S − X) ∩ T| = r_1(X ∩ T) + r_2((S − X) ∩ T) ≤ r_1(X) + r_2(S − X).
So the desired result (9.7) follows. We will next construct a common independent set T whose size equals r_1(X) + r_2(S − X) for an appropriate subset X. We will use the Sandwich Theorem for this purpose. Consider the submodular and supermodular functions r_1(.), r_2^d(.) (where r_2^d(X) ≐ r_2(S) − r_2(S − X) ∀X ⊆ S). Let k be the least integer for which r_1(.) ≥ r_2^d(.) − k. (Note that this means there exists some X ⊆ S for which r_1(X) = r_2^d(X) − k.) By the Sandwich Theorem there is an integral weight function h(.) such that
r_1(.) ≥ h(.) ≥ r_2^d(.) − k.
It is clear that h(e) = 0 or 1 on each e ∈ S. Let A be the support of h(.). We have
r_1(A) ≥ h(A) = |A|.
Therefore A is independent in the matroid M_1. Next,
h(S − A) = 0 ≥ r_2^d(S − A) − k,
i.e., r_2^d(S − A) ≤ k.
It follows that r_2(A) ≥ r_2(S) − k. But k is the least integer such that r_1(.) ≥ r_2^d(.) − k. Hence there exists X ⊆ S such that r_1(X) = r_2^d(X) − k. Therefore r_2(A) ≥ r_1(X) + r_2(S − X) for some X ⊆ S. Let T ⊆ A be independent in M_2 with |T| = r_2(A). Now T is a common independent set for M_1 and M_2 and we have
|T| = r_2(A) ≥ min_{X⊆S} r_1(X) + r_2(S − X).
Thus T is a maximum size common independent set.
Remark: The generalization of the above result is given in the next chapter. The above technique has to be modified slightly to prove that result.

P 9.17: i. Let a ∉ S. Let g : 2^{S∪a} → ℝ be defined as follows:
g(X ∪ a) = f(X), X ⊆ S,
g(X) = f(S − X), X ⊆ S.
Observe that g(.) is symmetric, since if a ∉ Y, g(Y) = f(S − Y) = g((S − Y) ∪ a), and if a ∈ Y, g(Y) = f(Y − a) = g(S − (Y − a)). For proving that g(.) is submodular we consider a number of cases.

i(a) If X ⊆ Y ⊆ S and e ∉ Y, e ≠ a:
g(X ∪ e) − g(X) = f(S − X − e) − f(S − X),
g(Y ∪ e) − g(Y) = f(S − Y − e) − f(S − Y),
and
f(S − Y) − f(S − Y − e) ≥ f(S − X) − f(S − X − e).

i(b) If X ⊆ Y ⊆ S and e = a:
g(X ∪ e) − g(X) = f(X) − f(S − X),
g(Y ∪ e) − g(Y) = f(Y) − f(S − Y),
so we need to show
f(X) − f(Y) ≥ f(S − X) − f(S − Y).
Let Y − X = {e_1, ..., e_n}. We then have
(f(X) − f(X ∪ e_1)) + (f(X ∪ e_1) − f(X ∪ e_1 ∪ e_2)) + ... + (f(Y − e_n) − f(Y))
≥ (f(S − X) − f(S − X − e_1)) + (f(S − X − e_1) − f(S − X − e_1 − e_2)) + ... + (f(S − (Y − e_n)) − f(S − Y)).
But this follows from the conditions of the problem.

i(c) If X ⊆ Y ⊆ S ∪ a and a ∈ X, e ∉ Y:
g(X ∪ e) − g(X) = f(X ∪ e − a) − f(X − a),
g(Y ∪ e) − g(Y) = f(Y ∪ e − a) − f(Y − a).
So the desired inequality is immediate.
i(d) If X ⊆ Y ⊆ S ∪ a and a ∈ Y − X, e ∉ Y:
g(X ∪ e) − g(X) = f(S − X − e) − f(S − X),
g(Y ∪ e) − g(Y) = f(Y ∪ e − a) − f(Y − a).
Now
f(S − X − e) − f(S − X) ≥ f(X ∪ e) − f(X) (given). But
f(X ∪ e) − f(X) ≥ f(Y ∪ e − a) − f(Y − a),
since a ∈ Y − X. Thus the desired inequality holds.

Now the function g(.) is symmetric and submodular, so that we can apply Algorithm Symmetric to it. If g(.) reaches its minimum among all nonvoid proper subsets of S ∪ a at X, either X or S ∪ a − X contains a. So f(.) reaches its minimum at X − a or at (S ∪ a − X) − a. However, g(.) might reach its minimum at {a}. To avoid this eventuality, minimize g(.) over all subsets that contain {a, e_i}, e_i ∈ S. Repeat the minimization for each e_i.

ii. In this case we define g : 2^{S∪a} → ℝ as follows:
g(X ∪ a) = f(S − X), X ⊆ S,
g(X) = f(X), X ⊆ S.
It can be seen that g(.) is symmetric and submodular by arguments similar to those of the previous section of the present problem. We can minimize g(.) over nonvoid proper subsets of S ∪ a, using Algorithm Symmetric. If the minimum occurs at X, either X or S ∪ a − X does not contain a. Then f(.) reaches its minimum on that set. If the minimum falls on S we use the strategy described in the previous section of the present problem.
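The symmetrization in part i can be sanity-checked directly: for an arbitrary f (random values below, purely illustrative; symmetry needs no submodularity), the function g built on S ∪ a is symmetric.

```python
from itertools import combinations
import random

def subsets(T):
    for k in range(len(T) + 1):
        yield from (frozenset(c) for c in combinations(T, k))

random.seed(1)
S = (1, 2, 3)
a = 'a'                        # the element adjoined to S
full = frozenset(S)
fvals = {X: random.randint(0, 10) for X in subsets(S)}   # an arbitrary f
f = lambda X: fvals[frozenset(X)]

def g(Y):
    """g(X ∪ a) = f(X) and g(X) = f(S - X) for X ⊆ S, as in P 9.17 i."""
    Y = frozenset(Y)
    return f(Y - {a}) if a in Y else f(full - Y)

ground = tuple(S) + (a,)
is_symmetric = all(g(Y) == g(frozenset(ground) - Y) for Y in subsets(ground))
```

With g verified symmetric (and submodular when f satisfies the stated condition), Algorithm Symmetric applies on the enlarged ground set as described above.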
Chapter 10

Convolution of Submodular Functions

10.1 Introduction
The operations of convolution and Dilworth truncation are fundamental to the development of the theory of submodular functions. In this chapter we concentrate on convolution and the related notion of principal partition. We begin with a formal description of the convolution operation and study its properties. Convolution is important both in terms of the resulting function as well as in terms of the sets that arise in the course of the definition of the operation. The principal partition displays the relationships that exist between these sets when the convolved functions are scaled. In this chapter, among other things, we study the principal partitions of functions derived from the original functions in simple ways such as through restricted minor operations and dualization. We also present efficient algorithms for its construction and specialize these algorithms to an important instance based on the bipartite graph.
10.2 Convolution

10.2.1 Formal Properties
Definition 10.2.1 Let f(·), g(·) : 2^S → ℜ. The lower convolution of f(·) and g(·), denoted by f * g(·), is defined by

f * g(X) ≡ min_{Y ⊆ X} [f(Y) + g(X − Y)].

The collection of subsets Y at which f(Y) + g(X − Y) = f * g(X) is denoted by B^X_{f,g}. But if X = S, we will simply write B_{f,g}.
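For very small ground sets the definition can be checked directly by enumeration. The following sketch (our own illustrative code, not part of the text) computes f * g(X) together with the collection of minimizing subsets:

```python
from itertools import combinations

def subsets(ground):
    """All subsets of `ground`, as frozensets."""
    for k in range(len(ground) + 1):
        for c in combinations(ground, k):
            yield frozenset(c)

def convolve(f, g, X):
    """f * g(X) = min over Y <= X of f(Y) + g(X - Y), plus the minimizers."""
    best, minimizers = None, []
    for Y in subsets(sorted(X)):
        v = f(Y) + g(X - Y)
        if best is None or v < best:
            best, minimizers = v, [Y]
        elif v == best:
            minimizers.append(Y)
    return best, minimizers

# f a polymatroid rank function, g a weight function, on S = {1, 2, 3}:
S = frozenset({1, 2, 3})
val, B = convolve(lambda X: min(len(X), 2), lambda X: 3 * len(X), S)
# val = 2, attained only at Y = S (leaving elements to g costs 3 each)
```

Here B plays the role of B^S_{f,g} = B_{f,g} for this particular pair of functions.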
The upper convolution of f(·) and g(·), denoted by f ⋆ g(·), is defined by

f ⋆ g(X) ≡ max_{Y ⊆ X} [f(Y) + g(X − Y)].
It is clear, in particular, that if x is a vector with all entries equal and the corresponding weight function is denoted by x(·), then

(f(·) + x(·)) * (g(·) + x(·)) = f * g(·) + x(·).

We now have the following elementary but important result.

Theorem 10.2.1 (k) If f(·) is submodular (supermodular) and g(·) is modular, then f * g(·) (f ⋆ g(·)) is submodular (supermodular).

We need the following lemma for the proof of the theorem.

Lemma 10.2.1 Let g(·) be a modular function on the subsets of S. Let A, B, C, D ⊆ S be such that A ∪ B = C ∪ D, A ∩ B = C ∩ D. Then

g(A) + g(B) = g(C) + g(D).

Proof: Since g(·) is modular, both the LHS and the RHS in the statement of the lemma are equal to g(A ∪ B) + g(A ∩ B). □
Proof of the theorem: We consider only the submodular case. Let X, Y ⊆ S. Further let

f * g(X) + f * g(Y) = f(Z_X) + g(X − Z_X) + f(Z_Y) + g(Y − Z_Y), Z_X ⊆ X, Z_Y ⊆ Y.

Then, by the submodularity of f(·),

f * g(X) + f * g(Y) ≥ f(Z_X ∪ Z_Y) + f(Z_X ∩ Z_Y) + g(X − Z_X) + g(Y − Z_Y).

We observe that, since Z_X ⊆ X, Z_Y ⊆ Y,

(X − Z_X) ∪ (Y − Z_Y) = ((X ∪ Y) − (Z_X ∪ Z_Y)) ∪ ((X ∩ Y) − (Z_X ∩ Z_Y))

and

(X − Z_X) ∩ (Y − Z_Y) = ((X ∪ Y) − (Z_X ∪ Z_Y)) ∩ ((X ∩ Y) − (Z_X ∩ Z_Y)).
we must have, by Lemma 10.2.1,

g(X − Z_X) + g(Y − Z_Y) = g((X ∪ Y) − (Z_X ∪ Z_Y)) + g((X ∩ Y) − (Z_X ∩ Z_Y)).

Hence,

f * g(X) + f * g(Y) ≥ f(Z_X ∪ Z_Y) + f(Z_X ∩ Z_Y) + g((X ∪ Y) − (Z_X ∪ Z_Y)) + g((X ∩ Y) − (Z_X ∩ Z_Y)).

Thus,

f * g(X) + f * g(Y) ≥ f * g(X ∪ Y) + f * g(X ∩ Y),

which is the desired result. □
Remark: It is clear that if g(·) is not modular, but only submodular, then g(X − Z_X) + g(Y − Z_Y) need not be greater than or equal to g((X ∪ Y) − (Z_X ∪ Z_Y)) + g((X ∩ Y) − (Z_X ∩ Z_Y)). Thus the above proof would not hold if g(·) is only submodular.
Henceforth we will confine our attention to lower convolution of submodular functions with submodular or modular functions. The results can be appropriately translated for upper convolution in the supermodular case.
Exercise 10.1 (k) If f(·), g(·) are both submodular, f * g(·) is not always submodular. Construct an example to illustrate this fact.

Exercise 10.2 (k) Show that in the following situations f * g(·) is submodular:

i. f(·) − f(∅) ≤ g(·) − g(∅),

ii. g(·) = f(·),

iii. f(·) − g(·) is monotonically decreasing (increasing).
Exercise 10.3 Let f1(·), f2(·) be increasing submodular functions and let g(·) be a non-negative weight function. Show that

((f1 + f2) * g)(·) = ((f1 * g + f2 * g) * g)(·).
Exercise 10.4 (k) Show that B_{f,g} is closed under union and intersection, if f(·), g(·) are both submodular.
10.2.2 Examples
We now list, from the literature, a number of examples which are related to the notion of convolution.
i. Hall's Theorem (P. Hall [Hall35]). Hall's Theorem on systems of distinct representatives states the following in the language of bipartite matching: 'Let B = (V_L, V_R, E) be a bipartite graph. There exists a matching meeting all the vertices in V_L iff for no subset X of V_L we have |Γ(X)| < |X|.' This condition is equivalent to saying '. . . iff (|Γ| * |·|)(V_L) = |V_L|.'
ii. Dulmage-Mendelsohn decomposition of a bipartite graph (Dulmage and Mendelsohn [Dulmage+Mendelsohn59]). The above mentioned authors made a complete analysis of all min covers and max matchings in a bipartite graph through a unique decomposition into derived bipartite graphs. We present their decomposition using the language of convolution.

Let B = (V_L, V_R, E) be a bipartite graph. Let B1 denote the collection of subsets of V_L which minimize h1(X) ≡ |Γ_L(X)| + |V_L − X|, with Γ_L(X) denoting the set of vertices adjacent to vertices in X. Thus min_{X ⊆ V_L} h1(X) = (|Γ_L| * |·|)(V_L). We have seen (Exercise 10.4) that B1 is closed under union and intersection. Let X_min and X_max be the minimal and maximal sets which are members of B1. Then X_max − X_min can be partitioned into sets N_i such that, for any given member of B1, each N_i is either contained in it or does not intersect it, and further, the partition is the coarsest with this property. Let Π be the partition whose blocks are X_min, all the N_i and V_L − X_max. Let us define a partial order (≥) on the blocks of Π as follows: N_i ≥ N_j iff N_j is present in a member of B1 whenever N_i is present. For all N_i, it is clear that N_i ≥ X_min and V_L − X_max ≥ N_i. Next, for each block K of Π we build the bipartite graph B_K as follows: let I_K be the principal ideal of K (i.e., the collection of all elements (blocks of Π) that are 'less than or equal to' K) in the partial order, and let J_K be the union of all the elements in I_K. Then B_K is defined to be the subgraph of B on K ∪ (Γ(J_K) − Γ(J_K − K)). The partial order (≥) induces a partial order (≥_B) on the collection of bipartite graphs B_K, K ∈ Π. The Dulmage-Mendelsohn decomposition is the collection of all B_K together with the partial order (≥_B). We now present the important properties of this decomposition.
• A set (V_L − X) ∪ Y, X ⊆ V_L, Y ⊆ V_R, is a minimum cover of B (cover ≡ every edge of B is incident on some vertex of the set) iff X is the union of blocks which are in an ideal of the partial order (≥) and which are also contained in X_max, and Y = Γ(X).

• Every maximum matching is incident on all the vertices in Γ(X_max) and V_L − X_min.

• A set of edges P is a maximum matching of B iff P = ∪_{K ∈ Π} P_K, where P_K is a maximum matching of B_K.

Exercise 10.5 (k) Show that |Γ(X)| ≥ |X| ∀X ⊆ V_L is equivalent to (|Γ| * |·|)(V_L) = |V_L|.
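The collection B1 and its extreme members X_min, X_max can be computed by enumeration for a small example (an illustrative sketch; the graph below is our own):

```python
from itertools import combinations

adj = {'a': {1}, 'b': {1}, 'c': {2}}     # a and b share their only neighbour
VL = set(adj)

def Gamma(X):
    return set().union(*(adj[v] for v in X)) if X else set()

def h1(X):
    return len(Gamma(X)) + len(VL) - len(X)

subsets_VL = [set(X) for k in range(len(VL) + 1)
              for X in combinations(sorted(VL), k)]
m = min(h1(X) for X in subsets_VL)
B1 = [X for X in subsets_VL if h1(X) == m]

X_min = set.intersection(*B1)    # unique minimal member of B1
X_max = set.union(*B1)           # unique maximal member of B1
```

Here B1 = [{a,b}, {a,b,c}], so X_min = {a,b} and X_max = {a,b,c}; one can also check directly that B1 is closed under union and intersection (Exercise 10.4).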
Exercise 10.6 Prove Theorem 10.2.2 (König [Konig36]): In a bipartite graph the sizes of a maximum matching and a minimum cover are equal.

Exercise 10.7 (k) Prove the properties of the Dulmage-Mendelsohn decomposition listed above.
iii. Decomposition of a graph into a minimum number of subforests (Tutte [Tutte61], Nash-Williams [Nash-Williams61]).
Tutte and Nash-Williams characterized graphs which can be decomposed into k disjoint subforests as those which satisfy k·r(X) ≥ |X| ∀X ⊆ E(G), where r(·) is the rank function of the graph. This condition can be shown to be equivalent to (kr * |·|)(E(G)) = |E(G)|.

iv. The matroid intersection problem (Edmonds [Edmonds70], [Lawler76]). Given two matroids M1, M2 on S, find a maximum cardinality subset which is independent in both matroids.
The size of the maximum cardinality common independent set equals (r1 * r2)(S). To find this set one can either use Edmonds' algorithm for this purpose or find bases b1, b2 of M1, M2*, which are maximally distant. (See the solution of Problem 9.16 and the matroid union algorithm in the next chapter.)

v. The matroid union problem. Given two matroids M1, M2, find the maximum cardinality union of an independent set in M1 and an independent set in M2.
The collection of all unions of two sets, one independent in M1 and the other in M2, is also a matroid, denoted M1 ∨ M2. Thus the maximum cardinality union of an independent set of M1 and one of M2 is a base of M1 ∨ M2. There is the well known matroid union algorithm ([Edmonds65a], [Edmonds68]) for constructing this set. The rank function of this matroid is ((r1 + r2) * |·|)(·). The union of all circuits of this matroid is the minimal set X which satisfies

((r1 + r2) * |·|)(S) = (r1 + r2)(X) + |S − X|.
vi. Representability of matroids (A. Horn [Horn55]). Horn showed that k independent sets of columns can cover the set of all columns of a matrix iff there exists no subset A of columns such that |A| > k·r(A). He conjectured that this might be correct only for representable matroids (i.e., for matroids which are associated with column sets of matrices over fields). If the conjecture had been true then there would have been a nice characterization of representability. However, Edmonds [Edmonds65a] showed that this result is true for all matroids. He gave an algorithm for constructing k bases of a matroid whose union has the maximum cardinality. His results are equivalent to saying that k bases will cover the underlying set S of a matroid M iff M^k (the union of M with itself k times) has no circuits. The rank function of this matroid is (kr * |·|)(·). So the result can be stated equivalently as 'covering is possible iff (kr * |·|)(S) = |S|'.
10.2.3 Polyhedral interpretation for convolution
We now show that the convolution operation is naturally polyhedral in the sense that natural questions about submodular set polyhedra are related to it [Edmonds70].
Theorem 10.2.3

i. Let f(·), g(·) be set functions on subsets of S that take zero value on ∅. Then P_{f*g} = P_f ∩ P_g.
ii. If f(·), g(·) are submodular functions that take zero value on ∅, then f * g(·) is polyhedrally tight. Equivalently,

f * g(X) = min_{Y ⊆ X} (f(Y) + g(X − Y)) = max (x(X)),

where x is a vector satisfying x(Z) ≤ f(Z), x(Z) ≤ g(Z) ∀Z ⊆ X. Further, if f(·), g(·) are integral, then x can be chosen to be integral.

Proof:
i. A vector x ∈ P_{f*g} only if x(X) ≤ min_{Y ⊆ X} (f(Y) + g(X − Y)) ∀X ⊆ S, i.e., only if x(X) ≤ f(X) and x(X) ≤ g(X) for every subset X of S, i.e., only if x ∈ P_f ∩ P_g.

A vector x ∈ P_f ∩ P_g only if x(Y) ≤ f(Y) and x(X − Y) ≤ g(X − Y) ∀Y ⊆ X ⊆ S, i.e., only if x(X) ≤ min_{Y ⊆ X} (f(Y) + g(X − Y)) ∀X ⊆ S, i.e., only if x ∈ P_{f*g}.

ii. Any submodular (supermodular) function that takes zero value on the null set can be made into a polymatroid rank function by adding a large enough weight function x(·) which takes the same value on all singletons. In this case we know that

(f(·) + x(·)) * (g(·) + x(·)) = f * g(·) + x(·).

Further, the polyhedron of the above function is P_{f*g} + x. Hence we need only prove the required result for the case where f(·) and g(·) are polymatroid rank functions. This we assume henceforth in the proof. Our proof follows Lovász and Plummer [Lovasz+Plummer86] and uses the Sandwich Theorem (Theorem 9.7.4). We will exhibit a vector x in P_{f*g} such that x(Z) = f * g(Z) for a given Z ⊆ S. Since it is clear that x(X) ≤ f * g(X) ∀X ⊆ S, the result would follow.
Consider the (respectively) submodular and supermodular functions

f'(X) ≡ min(k, f(X)),  g'(X) ≡ max(0, k − g(Z − X))  ∀X ⊆ Z.

Choose k = min_{X ⊆ Z} (f(X) + g(Z − X)) = f * g(Z). We claim that f'(X) ≥ g'(X) ∀X ⊆ Z. If f'(X) = k then this fact is immediate. Otherwise let f'(X) = f(X). We know that f(X) ≥ 0. Let us assume therefore that g'(X) = k − g(Z − X). But f(X) + g(Z − X) ≥ k. Hence f(X) ≥ k − g(Z − X).
Further, we observe that

f'(∅) = g'(∅) = 0 and that f'(Z) = g'(Z) = k = f * g(Z).
Therefore, by the Sandwich Theorem (Theorem 9.7.4), there exists a nonnegative weight function x(·) on subsets of Z such that f'(·) ≥ x(·) ≥ g'(·) and f'(Z) = x(Z) = g'(Z) = k. We then have x(X) ≤ f(X) for all X ⊆ Z, and since x(Z − X) ≥ k − g(X) for all X ⊆ Z, we also have x(X) = x(Z) − x(Z − X) ≤ x(Z) − k + g(X) = g(X).

Thus the vector x corresponding to the weight function x(·) belongs to P_f ∩ P_g and further satisfies x(Z) = f * g(Z).
Further, the Sandwich Theorem assures us that the vector x can be chosen to be integral if f(·), g(·) are integral. □
Exercise 10.8 (k) Let f(·) be a submodular function and let g(·) be a weight function, i.e., g(X) ≡ Σ_{e ∈ X} g(e). It is natural to ask whether the vector g whose components are g(ei), ei ∈ S, belongs to P_f. More generally, one could ask for a set X on which g(·) − f(·) reaches its maximum. (If this value is less than or equal to zero, we know that g belongs to P_f; otherwise we at least know an inequality in the definition of P_f which g fails to satisfy in the worst possible way.) This latter problem is called the membership problem of g over P_f. Show that the above mentioned sets X are precisely those in B_{f,g}.

Exercise 10.9 (k) Let p(·) be a submodular function. Let g(·) be the weight function defined through g(e) ≡ p(S − e) − p(S). Let f(·) = p(·) + g(·). In Problem 9.5.2 we saw that f(·) is a polymatroid rank function. Show that p(·) reaches a minimum at X ⊆ S iff X ∈ B_{f,g}. Thus minimization of a submodular function is equivalent to solving the membership problem over a polymatroid.
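Exercise 10.9 can be illustrated with the cut function of a small graph, which is submodular and takes zero value on ∅ and on S (a sketch with our own naming, not from the text):

```python
from itertools import combinations

# Cut function of a small connected graph on S = {0, 1, 2, 3}.
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
S = frozenset(range(4))

def p(X):
    return sum(1 for (u, v) in edges if (u in X) != (v in X))   # submodular

g_wt = {e: p(S - {e}) - p(S) for e in S}     # g(e) = p(S - e) - p(S)
g = lambda X: sum(g_wt[e] for e in X)
f = lambda X: p(X) + g(X)                    # a polymatroid rank function

subs = [frozenset(c) for k in range(len(S) + 1)
        for c in combinations(sorted(S), k)]

argmin_p = {X for X in subs if p(X) == min(p(Y) for Y in subs)}
val = min(f(Y) + g(S - Y) for Y in subs)
B_fg = {Y for Y in subs if f(Y) + g(S - Y) == val}
```

Since f(Y) + g(S − Y) = p(Y) + g(S), the minimizers of p(·) (here ∅ and S, the graph being connected) coincide with the members of B_{f,g}.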
10.3 Matroids, Polymatroids and Convolution
In this section we relate matroids and polymatroids through the operation of convolution. The first result given below contains one of the most powerful ways of constructing matroid rank functions - by convolving a polymatroid rank function with a weight function which takes value 1 on singletons. The second result shows a way of regarding polymatroid rank functions as obtained through fusion of the underlying set of a matroid rank function. Once again convolution plays an important role.
Theorem 10.3.1 Let f(·), g(·) be arbitrary set functions on subsets of S.

i. Then f * g(X ∪ e) − f * g(X) ≤ min[max_{Y ⊆ X}(f(Y ∪ e) − f(Y)), max_{Y ⊆ X}(g(Y ∪ e) − g(Y))], X ⊆ S, e ∈ S.

ii. Let f(·), g(·) be increasing. Then so is f * g(·).

iii. Let f(·), g(·) be integral. Then so is f * g(·).

iv. (Edmonds [Edmonds70]) Let f(·) be an integral polymatroid rank function and let g(·) = |·|. Then f * g(·) is a matroid rank function.
10. CONVOLUTION OF SUBMODULAR FUNCTIONS
386
ii.
Let, without loss of generality
,
f * g(X u e) = f ( Z ue) +g(X
...
-
Z),Z C X , e E (S - X ) .
The proof is immediate from the definition of convolution.
111.
iv. We need t o show that f * g ( . ) is an integral polymatroid rank function that takes value atmost one on singletons. We have, f(.),g ( . ) are increasing, integral, submodular, taking value zero on the null set and further g ( . ) is a weight function with y(e) = 1 Ve E S . From Theorem 10.2.1 it follows that f * g ( . ) is submodular. It is clear that f * g (0 ) = 0. The remaining properties for being a matroid rank function follow from the preceding sections of the present theorem. 0
Exercise 10.10 (k) Let p ( . ) be an integral polymatroid rank function on subsets of S . Characterize the independent sets and circuits of the matroid rank function /I*
I.1
Matroid expansion of a polymatroid It is clear that we would get a polymatroid rank function if we fuse the elements of a inatroid. Our next result shows that every polymatroid rank function is obtained by fusing the elements of an appropriate matroid.
Definition 10.3.1 Let f(.) be an integral polymatroid rank function on subsets of a set S E {el, e 2 , . . +,en}. Let S be another set with a partition (21, G 2 . . . ,&} and let I-(.) be a matroid rank function on subsets of S such that whenever X C { e l 3ez, . . . , e n } , we have,
f(W =W , , E X ~ Z ) .
Then
T(.)
is called a matroid expansion o f f ( . )
We will now show that every integral polymatroid rank function has a matroid expansion.
For each pi, if f ( e i ) = ki, replace ei by ki copies of it, all in parallel, and call the resulting set & . Thus f(.)is extended to a submodular function f(.)on subsets of S = u& (equivalently, f(.) is a restriction of f(.)). Let I . 1 be the cardinality function on the subsets of 5. By Theorem 10.3.1 we know that f(.)* 1 . I is a matroid Iank function, say T ( . ) on subsets of S. We will now show that T ( . ) is a matroid exparision of f ( . ) . Let X = { e z , , , . , e z r} and let X = E z l
k,
. . . k, F t t . We need to show that
.(R) = f * I . l(2)= f ( X ) = f ( X ) .
10.4. THE PRINCIPAL PARTITION
387
= f ( P )+ 1.k -PI. The RHS of the above equation is the minimum Now let f^*1. ((X) possible among all such sums. Thus P and therefore 2 - P must be the union of sets of the form e^il (addition of parallel elements to P does not increase the value of f(.)while decreasing the value of the 1 2 - PI term). Further, (again since the RHS must be minimum), we must have f(X - P ) 2 (X- PI as otherwise, since f(.) is submodular and takes zero value on the null set, we must have
+
f ( P ) IX
-
P ) > j ( P )+ p(X - P ) 2
f(X),
which is a contradiction. On the other hand, by submodularity o f f ( . ) and by reason of its taking zero value on the null set, we must have f(X - P)
5
c f(&) 12 =
- PI.
CicX-P
It follows that, f(2- P ) = 1 2 - PI. Thus f* 1 . I ( X ) = f ( P )+f(k- P ) . The RHS of this equation must, using the above mentioned properties of f(.),be greater or equal to f(.k),and, by the definition of convolution be less than or equal to it. We conclude that
r ( 2 ) f i * I . [(X) = f ( X ) = f(X).
Exercise 10.11 The matroid rank function r ( . ) has some unusual symmetry properties: Any subset of & behaves identically as any other subset of & that has the same cardinality. For instance if a circuit (base) of the matroid defined by r ( . ) contains a k element subset of Bi i t can be replaced by any other such k element subset of t?i and the resulting subset would remain a circuit (base). Prove the above statements.
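The construction can be carried out explicitly for a tiny polymatroid (an illustrative sketch; the element names are ours):

```python
from itertools import combinations

# Integral polymatroid rank function on S = {'a', 'b'}:
f_vals = {frozenset(): 0, frozenset({'a'}): 2, frozenset({'b'}): 1,
          frozenset({'a', 'b'}): 2}
f = lambda X: f_vals[frozenset(X)]

copies = {'a': ['a1', 'a2'], 'b': ['b1']}    # e replaced by f(e) parallel copies
S_hat = [x for e in sorted(copies) for x in copies[e]]

def f_hat(Xh):
    """Extension of f to S_hat: depends only on which groups Xh meets."""
    return f({e for e in copies if any(x in Xh for x in copies[e])})

def r(Xh):
    """r = f_hat * |.|, a matroid rank function by Theorem 10.3.1 iv."""
    Xh = frozenset(Xh)
    return min(f_hat(set(Y)) + len(Xh - frozenset(Y))
               for k in range(len(Xh) + 1)
               for Y in combinations(sorted(Xh), k))
```

As the proof predicts, r agrees with f on unions of full groups, e.g. r({a1, a2}) = f({a}) = 2 and r(Ŝ) = f(S) = 2, while on singletons r takes value at most one.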
10.4 10.4.1
The Principal Partition Introduction
The notion of principal partition (PP) is important because of the structural insight it provides in the case of many practical problems [Iri79b], [Iri+Fbjishige81], [Iri83], [Iri84], [Fujishigegl]. The literature on this subject is extensive. The idea began as the 'principal partition of a graph' [Kishi+Kajitani68] and was originally an offshoot of the work on maximally distant trees (summarized in Lemma 14.2.1). The extensions of this concept can be in two directions: towards making the partition finer or towards making the functions involved more general. Our present description favours the former approach and is based on the principal partition of a matroid ("arayanan741, [Tomizawa76]). Thus, although we begin by defining the PP of ( f ( . ) , g ( . ) ) , where f(.),g(.) are general submodular functions, most of the properties described are for the case where g(*) is a strictly increasing polymatroid rank function. For structural results (as in Sections 10.4.5 and 10.4.6) and algorithms (Section 10.6) we restrict ourselves to the important practical case of g ( . ) being a positive weight function.
10. C O N V O L U T I O N OF S U B M O D U L A R F U N C T I O N S
388
10.4.2
Basic Properties of PP
Definition 10.4.1 Let f ( . ) , g ( . ) be submodular functions on the subsets of a set S . The collection of all sets in t ? ~ (i.e., ~ , the ~ collection of sets which minimize X f ( X ) g(S - X ) over subsets of S ) , VX, X 2 0, is called the principal partition (PP) of (f(.),g(.)). We denote t ? ~ by~ B,X ~when f ( . ) , g ( . ) are clear from the context. We denote the maximal and minimal members of B X by X X , X X , respectively.
+
We now list the important properties of the principal partition of ( f ( . ) , g ( . ) ) ,where f(.)is a submodular function and g ( . ) , a strictly increasing polymatroid rank function on subsets of S . Property PP1 (This property is valid even if g ( . ) is not strictly increasing.) The collection B x ~ X, ~2, 0, is closed under union and intersection and thus has a unique maximal and a unique minimal element. Remark: For the remaining properties we assume f(.) to be submodular and g ( . ) to be a strictly increasing (i.e., g ( Y ) < g ( X ) V Y c X C S ) polymatroid rank function. Property PP2 If X i > X.L 2 0, then Xxl C X x z . Definition 10.4.2 A nonnegative d u e X for which B X has more than one subset as a member is called a critical value of (f(.),g(.)). Property PP3 the number of critical values of ( f ( . ) , g ( . ) ) is bounded by ISI. Property PP4 Let ( A t ) , z = 1 , . . . , t be the decreasing sequence of critical values of ( f ( . ) , g ( . ) ) . T h e n , X X t= X X , +f~o r z = l , . . . , t - l . Property PP5 Let (A,) be the decreasing sequence of critical values. Let A, > cr > X X Z = xu = = Xx,+l.
x,
&+I.
Then
Definition 10.4.3 Let f(.) be submodular and let g ( . ) be a strictly increasingpolymatroid rank function on subsets of S . Let ( A t ) ,i = 1 , ’ ’ ’ ,t be the decreasing s e quence of critical values of (f(.),g ( . ) ) . Then the sequence 0 = Xxl, X x 2 ,. . . , Xxt,X x t = S is called the principal sequence of ( f ( . ) , g ( . ) ) . A member ofBx would be alternatively referred to as a minimizing set corresponding to X in the principal partition of (f(.),g(.)). Proof of the properties of the Principal Partition i. PP1: Define h ( X ) = Af(X) + g(S - X ) V X C S , A 2 0. Observe that the function g ’ ( . ) , defined through g ’ ( X ) = g(S - X ) V X C S, is submodular. So
10.4. THE
PRINCIPAL PARTlTION
389
it is clear that h(.) is a submodular function. The principal structure (i.e., the collection of subsets that minimize this function) has been shown in Problem 9.2 to be closed under union and intersection and thus to have a unique minimal and a unique maximal set. But then the principal structure of h(.)is precisely the same
as Bx.
+
ii. PP2: Observe that minimizing X i f ( X ) g(S - X ) , V X C S, X i 2 0, i = 1 , 2 , is equivalent to minimizing f ( X ) + ( X i ) - ' g ( S - X ) VX S , X i 3 0 , i = 1,2. (Here O X +cc is treated as zero). So we may take the sets which minimize the latter expression to be the sets in f 3 ~,i, = 1,2. Define p i ( X ) G f ( X ) ( X i ) - ' g ( S - X ) VX E S , X i 2 0, i = 1,2. As in the case of h i ( . ) , p i ( . ) , i = 1 , 2 is also submodular. Let 2 1 minimize P I ( . ) .We will now show that pZ(21) < pZ(Y) V Y c 2 1 . By Theorem 9.4.1, it would then follow that 21 is a subset of every subset that minimizes p a . In particular it would follow that X x l C X x 2 . Let Y C 21. We have,
c
+
PZ(Z1) =p1(Z1)
+
-(k-l)g(S-21)
and
P.a(Y)= Pl ( Y )+ ( ( A d - ' - ( A d - ' ) g ( S - Y ) . Since g ( . ) is a strictly increasing submodular function, S - 21 c S - Y and ((AZ)-' - ( X I ) - ' ) > 0 , we must have ((X2)-' - (XI)-l)g(S - 2 1 ) < ( ( A 2 ) - ' - ( X l ) - ' ) g ( S - Y ) . Sincepl(Y) 2 p l ( Z 1 ) ,it follows that pa(Y) > ~ ~ ( 2 1 ) . iii. PP3: If BX has more than one set as a member then lX'l > IX'l. So if A1 , X2 are critical values and A1 > Xz, by Property PP2, we must have lXx,l < lXxzI. Thus the sequence X x , , where (Xi) is the decreasing sequence of critical values st elements. cannot have more than 1 iv. PP4: We need the following lemma. Lemma 10.4.1 Let X minimizes X - E is X'.
> 0. Then, for sufficiently small
E
> 0, the only set that
Proof of the Lemma: Since there are only a finite number of ( f ( X ) , g ( S- X ) ) pairs, for sufficiently small E > 0 we must have the value of (A - E ) ~ ( X g(S ) - X) lower on the members of BX than on any other subset of S . We will now show that, arnong the members of Bx,X' takes the least value of ( X - f ) f ( x ) + g ( S - X ) , > ~ 0. This would prove the required result. If X is not a critical value this is trivial. Let X be a critical value and let X I , X' be two distinct sets in Bx. Since X I c X ' , we have, g(S - X I ) > g(S - X ' ) . But, X f ( X 1 ) g(S - X I ) = X f ( X x ) g ( S - X ' ) . So, A f ( X 1 ) < Xf(XX). Since X > 0, we must have, - ~ f ( X 1 )> - ~ f ( X ' ) , c > 0. It follows that, (A - e)f(X1) g ( S - X I ) > (A - e ) f ( X ' ) g ( S - X ' ) .
+
+
+
+
+
Proof of PP4: By Lemma 10.4.1, for sufficiently small values of E > 0, X'x would continue to minimize ( X i - c ) f ( X ) + g ( S - X ) . As E increases, because there are only a finite number of ( f ( X ) , g ( S- X ) ) pairs, there would be a least value € 1 at which X': and atleast one other set minimize ( X i - ~ l ) f ( X+) g(S - X ) . Clearly, the next critical value Xi+l = X i - €1. Since X i > X i - € 1 , by Property PP2, we must have X x l E X X , - ~ , .Hence we must have, Xxi = X X , - ~= , X X ~ +as~desired. ,
390
10. CONVOLUTION OF SUBMODULAR FUNCTIONS
v. PP5: This is clear from the above arguments. Exercise 10.12 (k) Show that the critical values have to be positive. What are X x , X x when X = +m?
Remark: All the properties hold in the case where f(.) is a strictly increasing polymatroid rank function while g( .) is merely submodular. Proofs are essentially the same except that while proving ‘if XI > Xz, then Xxl Xxp’ we work with X rather than (A)-’. Exercise 10.13 (k) Let f(.) be a submodular function on subsets of S . W e say ( X ,Y ) is a modular pair for f(.) i f f
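For small examples the principal partition can be computed by brute force, sweeping λ over rationals and recording where B_λ has more than one member (an illustrative sketch, not from the text; exact arithmetic via Fraction avoids floating point ties):

```python
from itertools import combinations
from fractions import Fraction

S = frozenset({'a', 'b'})
f_vals = {frozenset(): 0, frozenset({'a'}): 1, frozenset({'b'}): 3, S: 3}
f = lambda X: f_vals[frozenset(X)]        # submodular
g = len                                    # |.|: strictly increasing polymatroid rank

subs = [frozenset(c) for k in range(len(S) + 1)
        for c in combinations(sorted(S), k)]

def B(lam):
    vals = {X: lam * f(X) + g(S - X) for X in subs}
    m = min(vals.values())
    return {X for X in subs if vals[X] == m}

grid = sorted({Fraction(p, q) for p in range(1, 9) for q in range(1, 9)})
critical = [lam for lam in grid if len(B(lam)) > 1]
# critical values 1/2 and 1; principal sequence: empty set, {'a'}, S
```

Here B(1) = {∅, {a}} and B(1/2) = {{a}, S}, exhibiting Properties PP2 and PP4: the maximal member at the larger critical value equals the minimal member at the smaller one.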
f ( W + f(Y)= f ( X u Y ) + f ( X ” Y ) . Let g ( . ) be a positive weight function on S s.t. f ( e ) 5 g(e) Ve E S. Show that if X , Y are in B X with respect to ( f ( . ) , g ( . ) ) ,then i. ( X ,Y ) is a modular pair for f(.), ii. ( S - X, S - Y ) is a modular pair for f*(.) (the dual o f f ( . ) with respect to
d.1).
A characterization of principal partition would be useful for justifying algorithms for its construction. We will describe two such characterizations in Theorems 10.4.1 and Theorems 10.4.6 below. The first of these is a routine restatement of the proper ties. Theorem 10.4.1 Let f(.) be a submodular function on subsets of S and let g(.)
be a strictly increasing polymatroid rank function on S . Let BX denote t?xf,g. Let
XI, . . . , Xt be a strictly decreasing sequence of numbers such that i. each BA,,i = 1, . . . ,t has atleast two members, ii. B x , , Bjt~,+~ ,i = 1 , . . . ,t jii.
-
1 have atleast one common member set,
0 belongs to Bx,, while S belongs to Bx, .
Then XI,. . . , Xt is the decreasing sequence of critical values o f (f(.),g ( . ) ) and therefore the collection of all the sets which are member sets in all the Bx,,i = 1,. . . ,t is the principal partition o f (f(.), g ( . ) ) . Proof: Let X i , . . . , X i be the critical values and let 0 = Yo,.. . , Y k = S be the principal sequence of ( f ( . ) , g ( . ) ) .By Property PP2 of ( f ( . ) , g ( . ) ) ,the only member set in Bx, when X > X i , is 0. Further when X < X i , 0 is not in Bx. Hence XI = X i . Next by Property PP5, when X i > X > X i , the only member in Bx is Yl which is the maximal set in Bx;. Since Bx2 has atleast two sets we conclude that Xz 5 Ah. We know that Bx, and Bxz have a common member which by Property PP2 can only be YI. But for X < X i , by Property PP5, Yl cannot be a member of Bx.
10.4. THE PRINCIPAL PARTITION
391
Hence A 2 = A;. By repeating this argument, we see that t must be equal to k and X i = A:,z = l,...,t. 0
Exercise 10.14 Let g(.) be a submodular function on subsets of S and let f(.) be a positive weight function on S . Describe the principal partition of (f(.),g(.)). Exercise 10.15 What is the principal partition of a weight function f (.) with respect to another weight function g(.)? How is it related to the principal partition of g(.) with respect to f ( . ) ? Exercise 10.16 Let f ( . ) , f 2 ( . ) be submodular functions on subsets of S and Jet a positive weight function on S. f i r t h e r let fi(e) 5 g(e) Ve E S.
g(G) be
Storing PP
-
Partial order representation of a distributive lattice
We have seen that B A is closed under union and intersection, equivalently, the elements of B A form a distributive lattice with join in place of union and meet in place of intersection. The number of elements in B, would be usually too large to store directly. Fortunately there is a very simple representation [Birkhoff67] available, by which a distributive lattice may be stored. We describe this below: Let C be a collection of subsets of S closed under union and intersection. Let S be the union of all the sets in C. Define a preorder ‘kc’ on the elements of S through ‘el kc: e2 Vel, e2 E S iff whenever el belongs to a member of C, ez also belongs to it’. Definition 10.4.4 The preorder 9 ~and’ the partial order that i t induces on its equivalence classes are referred to as the preorder and partial order associated with C . Exercise 10.17 (k) Verify that kc is a preorder (and hence that its equivalent classes partition S). As discussed in Subsection 3.6.7, the preorder induces a partial order 2 on this partition.
We remind the reader that an ideal of a preorder is a subset Z of the set over which the preorder is defined, with the property that ei E Z,ei kc e j implies e j E Z. We now show that the ideals of ‘kc’are precisely the members of C. Let T be a member of C. Let el E T. Suppose el kc e2. Then, we have that whenever el belongs to a member of C, e2 will also belong to it. Hence e2 E T. Thus T is an ideal of C. Next, let I be an ideal of ‘kc’. By the definition of the
10. CONVOLUTION OF SUBMODULAR FUNCTIONS
392
preorder, every member of 1 belongs to some set that is a member of C. Since C is closed under intersection there is a unique minimal member say T, that contains a given element e E I . Now if e' is any other element in T, we must have, by the definition of the preorder and the unique minimality of T,, that e t c e'. Hence 7; C I . Since C is closed under union we have that U e E r T eis a set in C. But this latter set is clearly the same as I. We observe that, in general, C might have size exponential in the size of S. But the Hasse diagram contains atmost as many elements as in S and atmost 0(/Sl2) edges.
Example 10.4.1 Let M be the collection {1,2}, {2}, { 1 , 2 , 3 , 6 } , { 2 , 3 , 4 , 6 } , { 1 , 2 , 3 , 4 , 6 } ,{2,3,6}. This collection can be emi1.y seen to be closed under union and intersection. The equivalence classes of the preorder are {2},{1},{3,6},{4}. The Hasse diagram has ( 1 ) 2 {2},{3,6} 2 {a}, (4) ? 1316). Exercise 10.18 Prove Lemma 10.4.2 (k) Let C be a collection of subsets of S closed under union and intersection. Let C' be the collection of complements of sets in S . Then i.
C' is closed under union and intersection.
ii. the preorders associated with C and C' are duals of each other, iii. the equivalence classes of the preorders are identical and the induced partial orders are duals of each other.
The Partition - Partial Order Pair Associated with ( f ( . ) , g ( . ) ) Each of the collections B x , in the the principal partition of (f(.),g(.)) (f(.)submodular, g ( . ) , a strictly increasing polymatroid rank function) is closed under union and intersection. Hence one can define preorders for each of them, so that the ideals of the preorders are identical to the members of the corresponding B x . For a given A, it is clear that X x is partitioned by the equivalence classes of the corresponding preorder. (One of the blocks of this partition is X x , provided this set is nonvoid). More conveniently, let us denote by N x , the collection { X - Xx : X E B x } . There is a one to one correspondence between sets in t ? and ~ sets in N x , with union of the sets in the former being X x and of those in the latter being X x - Xx. Further X x in the former corresponds t o 0 in the latter. Clearly Nx is closed under union and intersection and induces a preorder on the elements of X x - X x . Let us denote by I I ( A ) , the partition of X x - X x , whose blocks are the equivalence classes of the preorder induced by Nx. This partition is identical to that of X x induced by the preorder of Bx except that X x is omitted. Clearly the partitions induced in this manner by all the critical values, if put together, constitute a partition of S .
10.4. THE PRINCIPAL PARTITION
393
Let us denote by nPp, the union of all the partitions n(X),X a critical value. We denote by k p , the preorder on elements in S induced by all preorders of N A , X a critical value, with the additional condition e2 t p el whenever el E Xxl - Xxl,e2 E X x 2 - X x z , X I 2 Xz. The partial order induced on the blocks of nPpby k p is denoted by 2,. The pair ( I I p p , 2,)is referred to as the partition-partial order pair associated with (f(.), g(.)). This partial order is refined later.
10.4.3
Symmetry Properties of the Principal Partition of a Submodular Function
One of the attractive features of the principal partition is that, under fairly general conditions, it preserves the symmetries of the original structure. This fact is relatively easy to see if we use convolution as the basis for the development, but somewhat less obvious if we develop the idea of the principal partition algorithmically (in terms of maximally distant forests and their generalizations, for instance). Below we formalize the notion of symmetry for a set function.

Definition 10.4.5 Let f(·) be a real valued function on the subsets of S. An automorphism of f(·) is a bijection σ : S → S such that f(X) = f(σ(X)) ∀X ⊆ S. A set X is invariant under σ(·) iff σ(X) = X. Let (≥_Y) be a preorder on the elements of Y ⊆ S. We say (≥_Y) is invariant under σ(·) iff

• Y is invariant under σ(·),
• a ≥_Y b iff σ(a) ≥_Y σ(b).
A function g(·) is symmetric with respect to f(·) iff every automorphism of f(·) is also an automorphism of g(·).

Theorem 10.4.2 Let f(·) be a submodular function and let g(·) be a strictly increasing polymatroid rank function on the subsets of S. If g(·) is symmetric with respect to f(·), then
i. the principal partition of (f(·), g(·)) is invariant under the automorphisms of f(·), and
ii. (equivalently) the partition–partial order pair of (f(·), g(·)) is invariant under such automorphisms.
Proof: i. Let σ(·) be an automorphism of f(·). We need to show
• a set in B_λ moves to another such set under σ(·), and
• the sets in the principal sequence remain invariant under σ(·).
10. CONVOLUTION OF SUBMODULAR FUNCTIONS
We have that σ(·) is a bijection on S. It is then immediate that X ⊆ Y iff σ(X) ⊆ σ(Y). Let X₁ minimize λf(X) + g(S − X). Since g(·) is symmetric with respect to f(·), we must have g(S − σ(X₁)) = g(S − X₁). Hence σ(X₁) also minimizes the expression λf(X) + g(S − X). Thus the image of a set in B_λ is also in B_λ. Since σ(·) is a bijection, the sizes of these two sets must be the same. It then follows, from the fact that X_λ is the unique minimal set minimizing the above expression, that X_λ = σ(X_λ). A similar argument shows that X^λ is invariant under σ.
ii. To show that the partition–partial order pair of (f(·), g(·)) is invariant under σ(·), we need to show
• a block of Π(λ) moves to another such block under σ(·), and
• the partial order on these blocks induced by B_λ is invariant under σ(·).

We saw that σ(·) is a bijection on S that leaves both X_λ and X^λ invariant and further simply permutes the sets in B_λ, preserving the containment property. Now the blocks of Π(λ) are determined by the member sets of B_λ (being the maximal subsets of X^λ which are not 'cut' by the member sets of B_λ, i.e., which lie entirely inside or entirely outside these member sets). Hence a permutation of B_λ determines a permutation of Π(λ). Next, (e₁) ≥_π (e₂) iff every set in B_λ containing (e₁) also contains (e₂) (denoting the equivalence class determined by an element e by (e)). Now X contains an element e iff σ(X) contains σ(e). Since σ permutes the sets in B_λ, it follows that every set in this collection which contains σ((e₁)) also contains σ((e₂)), and hence σ((e₁)) ≥_π σ((e₂)). □
10.4.4 Principal Partition from the Point of View of Density of Sets

The principal partition gives information about which subsets are densely packed relative to (f(·), g(·)). For instance, if f(·) is the rank function of a graph and g(X) = |X|, the sets of the highest density (the sets in B_λmax, where λmax is the highest critical value) correspond to subgraphs in which we can pack the largest (fractional) number of disjoint forests.

Definition 10.4.6 Let f(·) be a submodular function and g(·) a strictly increasing polymatroid rank function on subsets of S. The density of X ⊆ S with respect to (f(·), g(·)) is the ratio (g(S) − g(S − X))/(f(X) − f(∅)). The set S is said to be molecular with respect to (f(·), g(·)) iff (f(·), g(·)) has only one critical value, equivalently, iff it has ∅, S as the principal sequence. A set S that is molecular is said to be atomic with respect to (f(·), g(·)) iff S and ∅ are the only sets in B_λ, λ being the only critical value. A set X ⊆ S is said to be molecular (atomic) with respect to (f(·), g(·)) iff it is molecular (atomic) with respect to (f/X(·), g/X(·)).
The problem of finding a subset T of S of highest density for a given (g(S) − g(S − T)) value would be NP-hard even for very simple submodular functions.
Example: Let f(·) ≡ the rank function of a graph and g(X) ≡ |X|. In this case g(S) − g(S − T) = |T|, and if we could find a set of branches of given size and highest density we could solve the problem of finding the maximum clique subgraph of a given graph. However, every set in the principal partition has the highest density for its (g(S) − g(S − T)) value (see Exercise 10.19) and further is easy to construct. This apparent contradiction is resolved when we note that there may be no set of the given value of (g(S) − g(S − T)) in the PP.
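On a small instance, the density of every subset can be computed exhaustively. The sketch below is our own illustration (graph and names assumed): with f the graph rank and g(X) = |X|, the density of X reduces to |X|/f(X), and the triangle of the triangle-plus-pendant graph is the unique densest subset, its density 3/2 being the highest critical value.

```python
from fractions import Fraction
from itertools import combinations

EDGES = {0: (0, 1), 1: (1, 2), 2: (0, 2), 3: (2, 3)}  # triangle + pendant edge
S = frozenset(EDGES)

def rank(X):
    """Graph rank of edge subset X, via union-find."""
    parent = {v: v for e in X for v in EDGES[e]}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    r = 0
    for e in X:
        a, b = find(EDGES[e][0]), find(EDGES[e][1])
        if a != b:
            parent[a] = b
            r += 1
    return r

def nonempty_subsets(ground):
    elems = list(ground)
    return [frozenset(c) for k in range(1, len(elems) + 1)
            for c in combinations(elems, k)]

# With g(X) = |X| and f(empty) = 0, density(X) = (g(S) - g(S-X))/f(X) = |X|/f(X).
dens = {X: Fraction(len(X), rank(X))
        for X in nonempty_subsets(S) if rank(X) > 0}
best = max(dens.values())
densest = {X for X, d in dens.items() if d == best}
print(best, [sorted(X) for X in densest])  # 3/2 [[0, 1, 2]]
```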
The idea of 'density' is natural for polymatroid rank functions. So, in this subsection, we confine ourselves to such functions, even though the results can be generalized to arbitrary submodular functions.

Exercise 10.19 Let f(·), g(·) be polymatroid rank functions on subsets of S with g(·) strictly increasing. Let T be a set in the principal partition of (f(·), g(·)). If T′ ⊆ S s.t. g(S − T) = g(S − T′) and T′ is not in the principal partition, show that the density of T is greater than that of T′.
Exercise 10.20 Let f(·), g(·) be polymatroid rank functions on subsets of S with g(·) strictly increasing.
i. Show that S is molecular with critical value λ iff ((λf) * g)(S) = λf(S) = g(S).
ii. When S is molecular, show that the critical value is equal to g(S)/f(S).
iii. Show that S is molecular (atomic) iff S has the highest density among all its subsets (has higher density than all its proper subsets).
Remark: When the context makes (f(·), g(·)) clear, we will simply say S is molecular (atomic) instead of 'S is molecular (atomic) with respect to (f(·), g(·))'. Similarly while speaking of density.
*Alternative Development of PP
The following exercises constitute an alternative development of the principal partition from the density based point of view. They are included primarily to bring out an aspect of the analogy between principal partition and the principal lattice of partitions (to be introduced in the next chapter). They may therefore be omitted during a first reading.
We need a preliminary definition.

Definition 10.4.7 Let f(·), g(·) be polymatroid rank functions on subsets of S with g(·) strictly increasing. A set T satisfies the λ-density gain (λ-density loss) condition with respect to (f(·), g(·)) iff whenever T′ ⊇ T (T″ ⊆ T), we have

g(S − T) − g(S − T′) ≤ λ(f(T′) − f(T))   (respectively, g(S − T″) − g(S − T) ≥ λ(f(T) − f(T″))).
We say that these conditions are satisfied strictly if the inequalities above are strict.

Exercise 10.21 Let h_λ(X) ≡ g(S − X) + λf(X), X ⊆ S, with f(·), g(·) as defined in Definition 10.4.7. Prove
Theorem 10.4.3 i. If T ⊆ S satisfies the λ-density loss (λ-density gain) condition, then there exists a subset T̂ of S such that T̂ minimizes h_λ(·) over subsets of S and T̂ ⊇ T (T̂ ⊆ T).
ii. If T ⊆ S satisfies the λ-density loss (λ-density gain) condition strictly, then whenever T̂ minimizes h_λ(·) over subsets of S we must have T̂ ⊇ T (T̂ ⊆ T).
iii. A subset T of S satisfies both the λ-density gain condition and the λ-density loss condition iff it minimizes h_λ(·) over subsets of S.

Exercise 10.22 Let h_λ(X) ≡ g(S − X) + λf(X), X ⊆ S, with f(·), g(·) as defined in Definition 10.4.7. If T₁, T₂ ⊆ S satisfy the λ-density gain (λ-density loss) property with respect to (f(·), g(·)), then
h_λ(T₁ ∩ T₂) ≤ h_λ(Tᵢ), i = 1, 2   (h_λ(T₁ ∪ T₂) ≤ h_λ(Tᵢ), i = 1, 2).

Exercise 10.23 Let h_λ(X) ≡ g(S − X) + λf(X), X ⊆ S, with f(·), g(·) as defined in Definition 10.4.7. Using the above exercises show that if T₁, T₂ minimize h_λ(·), then so do T₁ ∪ T₂ and T₁ ∩ T₂.

Exercise 10.24 Let h_λ(X) ≡ g(S − X) + λf(X), X ⊆ S, with f(·), g(·) as defined in Definition 10.4.7. Let T₁ minimize h_λ₁(·), and let T₂ minimize h_λ₂(·), with λ₁ > λ₂. Then T₂ ⊇ T₁.

An alternative to the last problem:

Exercise 10.25 Let h_λ(X) ≡ g(S − X) + λf(X), X ⊆ S. Let f(·), g(·) be as defined in Definition 10.4.7. Let T₁, T₂ minimize h_λ₁(·), h_λ₂(·) respectively over subsets of S, with λ₁ > λ₂. Show that h_λ₂(T₁) < h_λ₂(T) for every T ⊂ T₁, and hence T₁ ⊆ T₂.
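Theorem 10.4.3(iii) can be verified exhaustively on a small instance. In the sketch below (our own illustration; the gain/loss conditions are written in cross-multiplied form so that no division is needed), a set passes both the λ-density gain and λ-density loss tests exactly when it minimizes h_λ(·).

```python
from fractions import Fraction
from itertools import combinations

EDGES = {0: (0, 1), 1: (1, 2), 2: (0, 2), 3: (2, 3)}  # triangle + pendant edge
S = frozenset(EDGES)

def rank(X):
    """Graph rank of edge subset X, via union-find."""
    parent = {v: v for e in X for v in EDGES[e]}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    r = 0
    for e in X:
        a, b = find(EDGES[e][0]), find(EDGES[e][1])
        if a != b:
            parent[a] = b
            r += 1
    return r

def subsets(ground):
    elems = list(ground)
    return [frozenset(c) for k in range(len(elems) + 1)
            for c in combinations(elems, k)]

def h(lam, X):                      # h_lam with g(X) = |X|
    return lam * rank(X) + len(S - X)

def gain(lam, T):
    # for every T' >= T:  g(S-T) - g(S-T') <= lam*(f(T') - f(T))
    return all(len(Tp) - len(T) <= lam * (rank(Tp) - rank(T))
               for Tp in subsets(S) if T <= Tp)

def loss(lam, T):
    # for every T'' <= T:  g(S-T'') - g(S-T) >= lam*(f(T) - f(T''))
    return all(len(T) - len(Tpp) >= lam * (rank(T) - rank(Tpp))
               for Tpp in subsets(S) if Tpp <= T)

# Theorem 10.4.3(iii), checked for several lambda on every subset:
for lam in (Fraction(3, 2), Fraction(1), Fraction(2)):
    m = min(h(lam, X) for X in subsets(S))
    for T in subsets(S):
        assert (gain(lam, T) and loss(lam, T)) == (h(lam, T) == m)
```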
10.4.5 Principal Partition of f*(·) and f * g(·)
In this subsection we show that the principal partitions of (f * g(·), g(·)) and (f*(·), g(·)) are closely related to that of (f(·), g(·)). In the case of f*(·), the principal partition is 'oppositely directed' to that of f(·), while in the case of f * g(·),
the principal partition remains identical to that of f(·) above a certain critical value and, below it, becomes a set of coloops. Throughout, we take f(·) to be submodular on subsets of S and g(·) to be a positive weight function on S. While studying the case of f * g(·), we further assume that f(∅) = 0. We begin with some preliminary definitions.
Definition 10.4.8 Let f(·) be a submodular function on subsets of S with f(∅) = 0. The function f(·) is said to be modular over X ⊆ S iff (f/X)(·) is a modular function. The modular part of f(·) is the union of all the separators over which it is modular. Let g(·) be a positive weight function on S. If e belongs to the modular part of f(·) with f(e) = 0 (f(e) ≥ g(e)), it is called a self loop (a coloop with respect to g(·)).

Remark:
i. When it is clear from the context, while speaking of coloops, we will omit reference to the function g(·).
ii. In the case of a general submodular function, the above definition, where we insist that f(e) ≥ g(e), seems necessary since single element separators, with value of f(·) greater or less than g(·), behave differently in the PP of f * g(·). In the case of a matroid, the g(·) function one would normally work with would satisfy g(e) ≤ 1 ∀e ∈ S. So coloops of the matroid (singleton separators with rank 1) would turn out to be the same as coloops with respect to g(·).

Lemma 10.4.3 Let f(·) be a submodular function on subsets of S, taking zero value on the null set, and let g(·) be a positive weight function on S. Let f(e) ≤ g(e) ∀e ∈ S. Let C be the set of coloops of f(·) with respect to g(·) and let L be the set of self loops of f(·). Then

i. every subset of L is a minimizing set in the principal partition of (f(·), g(·)) corresponding to λ = ∞, and further L is the maximal such minimizing set;
ii. every set between S and S − C (both sets inclusive) is a minimizing set corresponding to λ = 1 in the principal partition of (f(·), g(·)); therefore no critical value is less than 1;
iii. S − C is the minimal minimizing set corresponding to λ = 1 in the principal partition of (f(·), g(·)).
Proof: i. This is immediate from the fact that f(X) = 0 iff X ⊆ L.
ii. Since f(∅) = 0, we must have
f(X ∪ Y) ≤ f(X) + f(Y)   ∀X, Y ⊆ S, X ∩ Y = ∅.
Now f(e) ≤ g(e) ∀e ∈ S. Hence
f(X) + g(S − X) ≥ f(X) + f(S − X) ≥ f(S).
Thus S is a minimizing set corresponding to λ = 1. Now by the definition of coloops, and since f(e) ≤ g(e) ∀e ∈ S, it follows that if K is a set of coloops then g(K) + f(S − K) = f(K) + f(S − K) = f(S). Therefore S − K is a minimizing set corresponding to λ = 1. That no critical value is less than 1 follows from Property PP2.
iii. Let Z be any such minimizing set. We must have f(Z) + g(S − Z) = f(S). But the LHS cannot be lower than f(Z) + f(S − Z) ≥ f(S). Thus the only way these conditions can be satisfied is to have f(S − Z) = g(S − Z) and f(Z) + f(S − Z) = f(S). Thus S − Z must be a set of coloops. Hence S − C is the minimal such minimizing set. □
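For a matroid rank function with g(X) = |X| (so f(e) ≤ g(e) throughout), the coloops with respect to g(·) are just the coloops (bridges) of the matroid, i.e., elements e with f(S − e) = f(S) − 1. The sketch below is our own brute-force check of Lemma 10.4.3 on the toy triangle-plus-pendant graph:

```python
from fractions import Fraction
from itertools import combinations

EDGES = {0: (0, 1), 1: (1, 2), 2: (0, 2), 3: (2, 3)}  # triangle + pendant edge
S = frozenset(EDGES)

def rank(X):
    """Graph rank of edge subset X, via union-find."""
    parent = {v: v for e in X for v in EDGES[e]}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    r = 0
    for e in X:
        a, b = find(EDGES[e][0]), find(EDGES[e][1])
        if a != b:
            parent[a] = b
            r += 1
    return r

def subsets(ground):
    elems = list(ground)
    return [frozenset(c) for k in range(len(elems) + 1)
            for c in combinations(elems, k)]

def minimizers(lam):
    val = {X: lam * rank(X) + len(S - X) for X in subsets(S)}
    m = min(val.values())
    return {X for X, v in val.items() if v == m}

# Coloops of the graphic matroid = bridges; here only the pendant edge.
C = frozenset(e for e in S if rank(S - {e}) == rank(S) - 1)
print(sorted(C))  # [3]

# (ii)-(iii): minimizing sets for lambda = 1 are exactly the sets between
# S - C and S, with S - C the minimal one; no critical value lies below 1.
assert minimizers(1) == {(S - C) | X for X in subsets(C)}
assert min(minimizers(1), key=len) == S - C
assert minimizers(Fraction(1, 2)) == {S}  # for lambda < 1, S alone minimizes
```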
We now study the principal partition of (f * g(·), g(·)) through the following result.
Theorem 10.4.4 Let f(·) be a submodular function on subsets of S with f(∅) = 0 and let g(·) be a positive weight function on S. Let p(X) denote λ(f * g)(X) + g(S − X) and let h(X) denote λf(X) + g(S − X) ∀X ⊆ S.

i. When λ ≥ 1,
• the minimum values of p(·) and h(·) over subsets of S are equal;
• if Y minimizes p(·) then it contains a subset Z that minimizes h(·);
• any set that minimizes h(·) also minimizes p(·).
ii. When λ > 1, Y minimizes p(·) iff it minimizes h(·).
iii. When λ ≥ 1 there is a unique minimal set that minimizes both p(·) and h(·), and when λ = 1, this set is the complement of the set of coloops of f * g(·) with respect to g(·).

Proof: i. By the definition of convolution,
(f * g)(X) = min over Z ⊆ X of (f(Z) + g(X − Z)).
Hence, since λ ≥ 0, p(X) ≤ h(X) ∀X ⊆ S and min_{X⊆S} p(X) ≤ min_{X⊆S} h(X). Next, for any subset X of S, when λ ≥ 1 we have, for some Z ⊆ X,
p(X) = λ(f(Z) + g(X − Z)) + g(S − X) ≥ λf(Z) + g(X − Z) + g(S − X) = λf(Z) + g(S − Z) = h(Z),
i.e., any subset X of S contains a subset Z such that p(X) ≥ h(Z), and p(X) cannot be less than the minimum value of h(·). We conclude that
min_{X⊆S} p(X) = min_{X⊆S} h(X)
and that any set that minimizes p(·) contains a subset that minimizes h(·). Let m denote this minimum value. Suppose Y minimizes h(·). We then have
m = λf(Y) + g(S − Y) ≥ λ(f * g)(Y) + g(S − Y) ≥ m.
Thus Y must minimize p(·).
ii. (λ > 1) We need to show that if Y minimizes p(·) it also minimizes h(·). We claim that in this case f * g(Y) = f(Y), from which it would follow that the minimum value m = h(Y). Suppose otherwise. Then we must have
m = p(Y) = λ(f * g(Y)) + g(S − Y)
  = λ(f(Z) + g(Y − Z)) + g(S − Y) for some Z ⊂ Y
  > λf(Z) + g(S − Z) = h(Z) ≥ m,
which is a contradiction. Thus we must have f * g(Y) = f(Y), and hence Y minimizes h(·).
iii. Since h(·) is clearly submodular (it is the sum of the submodular function λf(Y) and the submodular function g(S − Y)), we must have the minimal minimizing set to be unique. From parts (i) and (ii) above, this set is also the unique minimal minimizing set of p(·) when λ ≥ 1. We observe that f * g(e) ≤ g(e) ∀e ∈ S. Hence it follows from Lemma 10.4.3 that, when λ = 1, the minimal minimizing set of p(·) is the complement of the set of coloops of f * g(·) with respect to g(·). □
The following corollary is now obvious.
Corollary 10.4.1 Let f(·) be a submodular function on subsets of S with f(∅) = 0 and let g(·) be a positive weight function on S. Let λ ≥ 1. Then
((λ(f * g)) * g)(·) = ((λf) * g)(·).

From Theorem 10.4.4 it is clear that, for λ > 1, the sets in the principal partition of (f(·), g(·)) (f(·) submodular with f(∅) = 0, g(·) a positive weight function) and in that of (f * g(·), g(·)) are identical. The least critical value of (f * g(·), g(·)) is 1, and the minimizing sets for this value are the complements of subsets of coloops of f * g(·). For λ = 1, the minimal sets in the principal partition of (f(·), g(·)) and in that of (f * g(·), g(·)) coincide (see Figure 10.1). The principal partition of (f(·), g(·)) may have critical values lower than 1, but we lose this information when we construct the principal partition of (f * g(·), g(·)).

We next study the principal partition of the dual. We have the following result, which summarizes the relation between the PP of (f(·), g(·)) and that of (f*(·), g(·)) (see Figure 10.1).
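The relation between the PP of (f(·), g(·)) and that of (f * g(·), g(·)) can be observed numerically. In the sketch below (our own illustration; the graph and the weight 1/2 on the pendant edge are assumptions) f * g differs from f on the light edge, yet for λ > 1 both problems have exactly the same minimizing sets, and at λ = 1 the minimizers of the convolved problem are the complements of the subsets of coloops of f * g(·).

```python
from fractions import Fraction
from itertools import combinations

EDGES = {0: (0, 1), 1: (1, 2), 2: (0, 2), 3: (2, 3)}  # triangle + pendant edge
S = frozenset(EDGES)
W = {0: Fraction(1), 1: Fraction(1), 2: Fraction(1), 3: Fraction(1, 2)}

def rank(X):
    """Graph rank of edge subset X, via union-find."""
    parent = {v: v for e in X for v in EDGES[e]}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    r = 0
    for e in X:
        a, b = find(EDGES[e][0]), find(EDGES[e][1])
        if a != b:
            parent[a] = b
            r += 1
    return r

def g(X):
    return sum(W[e] for e in X)

def subsets(ground):
    elems = list(ground)
    return [frozenset(c) for k in range(len(elems) + 1)
            for c in combinations(elems, k)]

def conv(X):
    # (f * g)(X) = min over Z subset of X of (f(Z) + g(X - Z))
    return min(rank(Z) + g(X - Z) for Z in subsets(X))

def argmin(h):
    val = {X: h(X) for X in subsets(S)}
    m = min(val.values())
    return {X for X, v in val.items() if v == m}

assert conv(frozenset({3})) == Fraction(1, 2)  # f * g < f on the light edge

# Theorem 10.4.4(ii): for lambda > 1 the two problems have the same minimizers.
for lam in (Fraction(3, 2), 2, 3):
    assert argmin(lambda X: lam * conv(X) + g(S - X)) == \
           argmin(lambda X: lam * rank(X) + g(S - X))

# At lambda = 1 the minimizers of the convolved problem are the complements
# of the coloop subsets of f * g (here the single coloop is edge 3).
assert argmin(lambda X: conv(X) + g(S - X)) == {S, S - frozenset({3})}
```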
Figure 10.1: Comparison of PP of (f(·), g(·)), (f * g(·), g(·)), (f*(·), g(·))
Theorem 10.4.5 Let f(·) be a submodular function on the subsets of S and let g(·) be a positive weight function on S. Let B_λ, B*_λ denote respectively the collections of minimizing sets corresponding to λ in the principal partitions of (f(·), g(·)), (f*(·), g(·)), where f*(·) denotes the dual of f(·) with respect to g(·). Let λ* denote (1 − (λ)⁻¹)⁻¹ ∀λ ∈ R. Then

i. a subset X of S is in B_λ iff S − X is in B*_λ*;
ii. if λ₁, …, λ_t is the decreasing sequence of critical values of (f(·), g(·)), then λ*_t, …, λ*₁ is the decreasing sequence of critical values of (f*(·), g(·));
iii. if the principal sequence of (f(·), g(·)) is ∅ = X₀, …, X_t = S, then the principal sequence of (f*(·), g(·)) is ∅ = S − X_t, …, S − X₀ = S;
iv. the partitions associated with the principal partitions of both (f(·), g(·)) and (f*(·), g(·)) are identical, but the partial orders are duals.

Proof: i. We will show that Y minimizes λf(X) + g(S − X) iff S − Y minimizes λ*f*(X) + g(S − X). We have
λ*f*(X) + g(S − X) = λ*f(S − X) + (λ* − 1)g(X) − λ*f(S) + g(S).
Minimizing this is equivalent to minimizing the expression λ*(λ* − 1)⁻¹ f(S − X) + g(X). Noting that λ*(λ* − 1)⁻¹ = λ, we get the desired result. The remaining sections of the theorem are now straightforward. For the last section, however, we need to use Lemma 10.4.2. □
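Theorem 10.4.5(i) is easy to test numerically. The sketch below (our own illustration) uses the triangle-plus-pendant example with g = cardinality, forms the dual f*(X) = g(X) + f(S − X) − f(S), and checks that for λ = 3/2 (so λ* = (1 − λ⁻¹)⁻¹ = 3) the minimizing sets of the dual problem are exactly the complements of those of the primal.

```python
from fractions import Fraction
from itertools import combinations

EDGES = {0: (0, 1), 1: (1, 2), 2: (0, 2), 3: (2, 3)}  # triangle + pendant edge
S = frozenset(EDGES)

def rank(X):
    """Graph rank of edge subset X, via union-find."""
    parent = {v: v for e in X for v in EDGES[e]}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    r = 0
    for e in X:
        a, b = find(EDGES[e][0]), find(EDGES[e][1])
        if a != b:
            parent[a] = b
            r += 1
    return r

def subsets(ground):
    elems = list(ground)
    return [frozenset(c) for k in range(len(elems) + 1)
            for c in combinations(elems, k)]

def f_star(X):  # dual of the rank function with respect to g(X) = |X|
    return len(X) + rank(S - X) - rank(S)

def argmin(h):
    val = {X: h(X) for X in subsets(S)}
    m = min(val.values())
    return {X for X, v in val.items() if v == m}

lam = Fraction(3, 2)
lam_star = 1 / (1 - 1 / lam)
assert lam_star == 3

B = argmin(lambda X: lam * rank(X) + len(S - X))
B_star = argmin(lambda X: lam_star * f_star(X) + len(S - X))
assert B_star == {S - X for X in B}        # X in B_lam iff S - X in B*_lam*
print(sorted(sorted(X) for X in B_star))   # [[0, 1, 2, 3], [3]]
```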
10.4.6 The Principal Partition associated with Special Minors
In general very little can be said about the relation between the principal partition of f(.) with respect to a positive weight function g ( . ) , and the principal partitions of the contractions and restrictions of f(.) with respect to g(.). For certain special cases the situation is better. The following lemma allows us to study such cases.
+
Let h x ( X ) denote X f ( X ) g(S - X ) V X S , h x ~ ( Xdenote ) Xf(X) g ( K - X ) VX 2 K , h s - ~ , x ( Ydenote ) X ( f o (S - K ) ) ( Y ) (g/(S - K ) ) ( S- K We then have
+
+
-
Y ) VY C (S - K).
Lemma 10.4.4 Let f(.) be a submodular function on subsets of S and let g ( . ) be a positive weight function on S . Let K C S .
i. If h_λK(·) is minimized at Z ⊆ K, then h_λ(·) is minimized at a superset of Z, i.e., X^λ in the principal partition of (f(·), g(·)) contains Z.
ii. Y ⊆ (S − K) minimizes h_{S−K,λ}(·) iff Y ∪ K minimizes h_λ(·) among all supersets of K. Hence, if in addition K minimizes h_λ(·), then Y ∪ K minimizes h_λ(·).

Proof: i. We observe that h_λK(X) = h_λ(X) − g(S − K) ∀X ⊆ K. Hence h_λ(Z) ≤ h_λ(Z′) ∀Z′ ⊆ Z. We know that h_λ(·) is submodular (see the proof of Property PPP). Hence by Theorem 9.4.1 the desired result follows.
ii. We have h_{S−K,λ}(X) = λ(f(K ∪ X) − f(K)) + g(S − K − X) ∀X ⊆ S − K. So Y ⊆ S − K minimizes h_{S−K,λ}(·) iff Y ∪ K minimizes h_λ(·) among all supersets of K. □
The following corollary is immediate from the first part of the above lemma.
Corollary 10.4.2 Let f(·) be a submodular function on subsets of S and let g(·) be a positive weight function on S. Let S₁ satisfy ((λf) * g)(S₁) = λf(S₁). Then there exists a subset X₁ of S such that X₁ ⊇ S₁ and X₁ ∈ B_λ^{f,g}. Hence X^λ, in the principal partition of (f(·), g(·)), contains S₁.

Exercise 10.26 Prove
Lemma 10.4.5 Let f(·) be a submodular function on subsets of S and let g(·) be a positive weight function on S. Let K ⊆ S. We then have
i. (f * g)/K(X) = (f/K * g/K)(X) ∀X ⊆ K;
ii. if f * g(S − K) = f(S − K), then (f * g) ∘ K(X) = (f ∘ K * g ∘ K)(X) ∀X ⊆ K.
Let f(·) be a submodular function on the subsets of S and let g(·) be a positive weight function on S. Let (using the notation of Lemma 10.4.4) K minimize h_β(·) and let P minimize h_θ(·), with β ≥ θ. Let f₁(·) ≡ (f ∘ (S − K)/(P − K))(·) and let g₁(·) ≡ (g/(P − K))(·).
We now describe
i. the principal partition of (f/K(·), g/K(·)),
ii. the principal partition of ((f ∘ (S − K))(·), (g/(S − K))(·)),
iii. the principal partition of (f₁(·), g₁(·)).
i. Let B′_λ denote B_λ^{f/K, g/K}, and let B_λ denote B_λ^{f,g} as before. We have h_λK(X) = λf/K(X) + g(K − X) ∀X ⊆ K. So to determine the principal partition of (f/K(·), g/K(·)) we need only determine the subsets of K that minimize h_λK(·). Now h_λK(X) = h_λ(X) − g(S − K) ∀X ⊆ K. Since K itself minimizes h_β(·), the sets that minimize h_βK(·) are precisely those that minimize h_β(·) and are contained in K. If λ > β, we know that X^λ ⊆ K by Property PP2 of (f(·), g(·)); hence the sets that minimize h_λK(·) are the same as those that minimize h_λ(·). If, however, λ < β, because K is in B′_λ, by Property PP2 of (f/K(·), g/K(·)) the minimal set that minimizes h_λK(·) must contain K, i.e., be equal to it. To summarize,
• (λ = β). In this case B′_λ = {members of B_λ contained in K};
• (λ > β). In this case B′_λ = B_λ;
• (λ < β). In this case B′_λ = {K}.
ii. We are given that K is in B_β associated with the principal partition of (f(·), g(·)). Let us denote by B″_λ the collection of minimizing sets for λ in the principal partition of ((f ∘ (S − K))(·), (g/(S − K))(·)). Now by Lemma 10.4.4, Y minimizes h_{S−K,λ}(·) iff, among all supersets of K, K ∪ Y minimizes h_λ(·). Since K belongs to B_β, when λ = β these supersets are precisely those sets in B_β that contain K. Thus B″_β = {Z − K : Z ∈ B_β, Z ⊇ K} = {(Z ∪ K) − K : Z ∈ B_β}. By Property PP2 of (f(·), g(·)), every set that minimizes h_λ(·), λ < β, contains all sets in B_β. So in this case the desired supersets are all the members of B_λ, i.e., B″_λ = {Z − K : Z ∈ B_λ}. Again by Property PP2 of ((f ∘ (S − K))(·), (g/(S − K))(·)), when λ > β, every set that is in B″_λ is contained in all sets in B″_β. But the latter has ∅ as a member. To summarize,
• (λ = β). In this case B″_λ = {Z − K : Z ∈ B_β, Z ⊇ K} = {(Z ∪ K) − K : Z ∈ B_β};
• (λ < β). In this case B″_λ = {Z − K : Z ∈ B_λ};
• (λ > β). In this case B″_λ = {∅}.
iii. Observe that when θ < β, P is a superset of K, and when θ = β, P ∪ K is also a set in B_θ. So, without loss of generality, we need only consider the situation where P ⊇ K. Now, by applying the ideas developed in the previous sections of the present problem, we see that the principal partition of (f₁(·), g₁(·)) can be described as follows (see Figure 10.2):

• the critical values are those of the principal partition of (f(·), g(·)) that lie in the range β to θ, including both numbers;
• the minimizing sets corresponding to these critical values in the principal partition of (f₁(·), g₁(·)) are precisely the sets (X ∩ P) − K, where X is a minimizing set corresponding to these critical values in the principal partition of (f(·), g(·)). Thus, if B¹_λ denotes B_λ^{f₁,g₁}, we have, when θ < β,
i. (β ≥ λ > θ). B¹_λ = {X − K, X ∈ B_λ};
ii. (λ = θ). B¹_λ = {(X ∩ P) − K, X ∈ B_λ}.
Figure 10.2: Comparison of PP of (f(·), g(·)), (f₁(·), g₁(·))

Further, when θ = β, B¹_θ = {(X ∩ P) − K, X ∈ B_θ}. Observe that in this case P − K is molecular with respect to (f₁(·), g₁(·)). It would be atomic if no set lies strictly between P and P ∩ K in B_θ (= B_β), i.e., if P − K is a block of Π(β).
To describe the same situation in terms of the partition–partial order pair, denoting the partition for (f₁(·), g₁(·)) by Π′ and the partial order by ≥′, we have

• Π′ = {Z : Z ∈ Π_PP, Z ⊆ P − K};
• a block in Π′ corresponds to a critical value λ in the PP of (f₁(·), g₁(·)) iff it corresponds to λ in the PP of (f(·), g(·));
• the partial order ≥′ is the restriction of the partial order of (f(·), g(·)) to Π′.
The next theorem is a useful characterization of the principal partition using the foregoing ideas.
Theorem 10.4.6 (Uniqueness Theorem) Let f(·) be a submodular function on subsets of S and let g(·) be a positive weight function on S. Let {S₁, …, S_t} be a partition of S. Let S₁ ∪ … ∪ S_k be denoted by E_k for k = 1, …, t. Let (f/E_k ∘ (E_k − E_{k−1}))(·), (g/(E_k − E_{k−1}))(·) be denoted by f_k(·), g_k(·) respectively for k = 1, …, t, and let the collection of minimizing sets corresponding to λ in the principal partition of (f_k(·), g_k(·)), for k = 1, …, t, be denoted by B^k_λ.
Let S_k, for k = 1, …, t, be molecular with respect to (f_k(·), g_k(·)) with critical value λ_k, and let λ₁ > … > λ_t. Then, for (f(·), g(·)),
i. the decreasing sequence of critical values is λ₁, …, λ_t, and ∅, E₁, …, E_t is the principal sequence;
ii. B_{λ_k} = {Z : Z = E_{k−1} ∪ Y, Y ∈ B^k_{λ_k}}, k = 1, …, t,
where E₀ ≡ ∅.

Proof: We prove the theorem by induction on t. The result is obviously true for t = 1. Let it be true for t = n − 1. Let f′(·), g′(·) denote (f ∘ (S − E₁))(·), (g/(S − E₁))(·), respectively, and let B′_λ denote the collection of minimizing sets corresponding to λ in the principal partition of (f′(·), g′(·)). By the use of Theorem 9.3.1 we know that f_k(·) = (f′/(E_k − E₁) ∘ (E_k − E_{k−1}))(·) for k = 2, …, t, and g_k(·) = (g′/(E_k − E_{k−1}))(·) for k = 2, …, t.
By the induction assumption it follows that for (f′(·), g′(·)),
i. the decreasing sequence of critical values is λ₂, …, λ_t, and ∅, E₂ − E₁, …, E_t − E₁ is the principal sequence;
ii. B′_{λ_k} = {Z : Z = (E_{k−1} − E₁) ∪ Y, Y ∈ B^k_{λ_k}}, k = 2, …, t.

We will use Lemma 10.4.4 and the notation adopted therein. Since, by Property PP2 of (f′(·), g′(·)), ∅ is the only set that minimizes h_{S−E₁,λ₁}(·), we must have that h_{λ₁}(·) takes strictly lower value on E₁ than on all its proper supersets. But E₁ ∈ B¹_{λ₁}. So by the lemma, E₁ is a subset of some set in B_{λ₁}. We conclude that E₁ is the maximal set that minimizes h_{λ₁}(·). Now if X ∈ B¹_{λ₁}, the value of h_{λ₁,E₁}(·) is the same on both X and E₁. Hence the value of h_{λ₁}(·) is the same on both these sets, i.e., B¹_{λ₁} = B_{λ₁}. Since ∅ is in B¹_{λ₁} (= B_{λ₁}), λ₁ must be the highest critical value of (f(·), g(·)). When λ < λ₁, E₁ is the only set in B¹_λ and E₁ is contained in every set in B_λ, by Property PP2 of (f₁(·), g₁(·)) and of (f(·), g(·)). Once again by Lemma 10.4.4, Y ∪ E₁ minimizes h_λ(·) iff Y minimizes h_{S−E₁,λ}(·). Thus Y ∪ E₁ is in B_λ iff Y is in B′_λ. We now see that λ₁, …, λ_t is a strictly decreasing sequence of numbers such that each of the B_{λ_i} has at least two members, B_{λ₁} has ∅ as a member, B_{λ_t} has S as a member, and further the maximal member of B_{λ_i} is the minimal member of B_{λ_{i+1}}, i = 1, …, t − 1. Hence, by Theorem 10.4.1, we conclude that λ₁, …, λ_t is the decreasing sequence of all critical values for (f(·), g(·)) and the B_{λ_i} together constitute the principal partition of (f(·), g(·)). Thus the proof is complete for t = n. □
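The Uniqueness Theorem can be exercised on the running toy example. The sketch below (our own illustration) partitions S into S₁ = the triangle and S₂ = the pendant edge, confirms that each minor (f_k(·), g_k(·)) is molecular with critical values λ₁ = 3/2 > λ₂ = 1, and checks the theorem's conclusion for (f(·), g(·)) directly.

```python
from fractions import Fraction
from itertools import combinations

EDGES = {0: (0, 1), 1: (1, 2), 2: (0, 2), 3: (2, 3)}  # triangle + pendant edge
S = frozenset(EDGES)
E1 = frozenset({0, 1, 2})          # S1 = triangle, S2 = {3}, E2 = S

def rank(X):
    """Graph rank of edge subset X, via union-find."""
    parent = {v: v for e in X for v in EDGES[e]}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    r = 0
    for e in X:
        a, b = find(EDGES[e][0]), find(EDGES[e][1])
        if a != b:
            parent[a] = b
            r += 1
    return r

def subsets(ground):
    elems = list(ground)
    return [frozenset(c) for k in range(len(elems) + 1)
            for c in combinations(elems, k)]

def mins(f, ground, lam):
    """Minimizing sets of lam*f(X) + |ground - X| over subsets of ground."""
    val = {X: lam * f(X) + len(ground - X) for X in subsets(ground)}
    m = min(val.values())
    return {X for X, v in val.items() if v == m}

f1 = rank                                # f/E1, restricted to E1
f2 = lambda X: rank(X | E1) - rank(E1)   # f/E2 contracted on E2 - E1

# Each piece is molecular; its critical value is g_k(piece)/f_k(piece)
# (cf. Exercise 10.20), and both the empty set and the piece minimize there.
lam1 = Fraction(len(E1), rank(E1))         # 3/2
lam2 = Fraction(len(S - E1), f2(S - E1))   # 1
assert lam1 > lam2
assert mins(f1, E1, lam1) == {frozenset(), E1}
assert mins(f2, S - E1, lam2) == {frozenset(), S - E1}

# The theorem then gives, for (f, g): critical values lam1 > lam2 and
# principal sequence: empty set, E1, S. Check directly:
assert mins(rank, S, lam1) == {frozenset(), E1}
assert mins(rank, S, lam2) == {E1, S}
```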
Problem 10.1 ([Fujishige80a], [Fujishige91]) Let f(·) be a submodular function on subsets of S ≡ {e₁, …, e_n} and let g(·) be a positive weight function on S. Let ∅ = E₀, E₁, …, E_k = S be the principal sequence and let λ₁, …, λ_k be the decreasing sequence of critical values of (f(·), g(·)). Let x be a vector defined by
x(e) ≡ g(e)/λ_i, e ∈ E_i − E_{i−1}, i = 1, …, k.
Show that
i. x(E_i) = f(E_i), i = 1, …, k;
ii. x is a base of P_f (i.e., x(X) ≤ f(X) ∀X ⊆ S and x(S) = f(S));
iii. x is an F-lexicographically optimum base of P_f relative to g(·), i.e., if x(e₁)/g(e₁) ≤ … ≤ x(e_n)/g(e_n), then whenever x′ is a base of P_f with x′(e′₁)/g(e′₁) ≤ … ≤ x′(e′_n)/g(e′_n), and t is the first index for which x(e_t)/g(e_t) ≠ x′(e′_t)/g(e′_t), we have x(e_t)/g(e_t) > x′(e′_t)/g(e′_t);
iv. the F-lexicographically optimum base is unique.
Remark: The above definition of F-lexicographically optimum base is inconsistent (even as a generalization) with the one given in Section 4.6 for the lexicographically optimum base of a matroid.
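Taking x(e) = g(e)/λ_i for e ∈ E_i − E_{i−1} (the reading of the definition in Problem 10.1 that makes part (i) come out, since λ_i(f(E_i) − f(E_{i−1})) = g(E_i − E_{i−1}) at a critical value), parts (i) and (ii) can be checked on the toy example. The sketch below is our own illustration, with the principal sequence and critical values precomputed for the triangle-plus-pendant graph.

```python
from fractions import Fraction
from itertools import combinations

EDGES = {0: (0, 1), 1: (1, 2), 2: (0, 2), 3: (2, 3)}  # triangle + pendant edge
S = frozenset(EDGES)

def rank(X):
    """Graph rank of edge subset X, via union-find."""
    parent = {v: v for e in X for v in EDGES[e]}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    r = 0
    for e in X:
        a, b = find(EDGES[e][0]), find(EDGES[e][1])
        if a != b:
            parent[a] = b
            r += 1
    return r

def subsets(ground):
    elems = list(ground)
    return [frozenset(c) for k in range(len(elems) + 1)
            for c in combinations(elems, k)]

# Principal sequence and critical values of (rank, |.|) for this graph.
E = [frozenset(), frozenset({0, 1, 2}), S]
lams = [Fraction(3, 2), Fraction(1)]

x = {}
for i, lam in enumerate(lams):
    for e in E[i + 1] - E[i]:
        x[e] = 1 / lam          # g(e)/lambda_i with g(e) = 1

def xsum(X):
    return sum(x[e] for e in X)

# (i) x(E_i) = f(E_i);  (ii) x is a base of P_f.
assert all(xsum(Ei) == rank(Ei) for Ei in E)
assert xsum(S) == rank(S)
assert all(xsum(X) <= rank(X) for X in subsets(S))
```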
Problem 10.2 ([Tomizawa+Fujishige82], [Fujishige91]) Let f(·), g(·) be submodular functions on subsets of S and further let g(·) be an increasing function. This problem describes the structure of the principal partition in this case.
i. Let σ ≥ 0. B_{f,σg} is a distributive lattice. Hence it has a unique maximal element Y^σ and a unique minimal element Y_σ.
ii. If 0 ≤ σ₁ < σ₂, then Y^{σ₁} ⊆ Y^{σ₂} and Y_{σ₁} ⊆ Y_{σ₂}.
iii. Let g^d(X) ≡ g(S) − g(S − X) and let σ < 0. Then B_{f,σg^d} is a distributive lattice. Hence it has a unique maximal and a unique minimal element.
iv. Let B_σ refer to B_{f,σg} for σ ≥ 0 and to B_{f,σg^d} for σ < 0. If X₁ ∈ B_{σ₁} and X₂ ∈ B_{σ₂}, σ₁ < σ₂, then X₁ ∪ X₂ ∈ B_{σ₂}, X₁ ∩ X₂ ∈ B_{σ₁}. Hence Y_{σ₁} ⊆ Y_{σ₂} and Y^{σ₁} ⊆ Y^{σ₂}.
10.5 *The Refined Partial Order of the Principal Partition

We spoke earlier about the partial order (≥_π) associated with (f(·), g(·)) (f(·) submodular, g(·) a positive weight function). The elements of this partial order
were the blocks of Π_PP (maximal sets which are not cut by any of the members of the B_λ). The ideals of the partial order correspond to members of all the B_λ, i.e., correspond to the 'solutions' of min_{X⊆S} λf(X) + g(S − X) for all the λs. Suppose ∅, E₁, …, E_t is the principal sequence of (f(·), g(·)), with E₁ = X^{λ₁}, …, S = E_t = X^{λ_t}. The partial order relationship between blocks that lie entirely within E_{j+1} − E_j is determined by the family B_{λ_{j+1}}, while if A is a block within E_{j+1} − E_j and B a block within E_{r+1} − E_r (r > j), we take B > A. It follows by the Uniqueness Theorem (Theorem 10.4.6) that the principal partition and partial order would be unchanged even if we replaced f(·) by ⊕_k f_k(·), where f_k(·) is as defined in the Uniqueness Theorem. Thus, the relationship (imposed by f(·)) between the blocks corresponding to different critical values is not brought out by the partial order (≥_π).
Additional structure related to the principal partition is revealed if we refine the partial order (but use the same partition Π_PP) as described below. This new partial order (≥_R) has all the ideals of (≥_π) and some more. It therefore has ideals which are not solutions of min_{X⊆S} λf(X) + g(S − X). But, as we will see, if X₁, X₂ are ideals in (≥_R), the principal partition of (f/X₁ ∘ (X₁ − X₂))(·) relative to (g/(X₁ − X₂))(·) is easy to describe. As in the case of (≥_π), these ideals also are modular with respect to f(·). Further, the ideals behave well with respect to addition, dualization and convolution of functions.
Throughout this section we assume f(·) to be submodular and g(·) to be a positive weight function. We now informally describe the construction of this refined partial order. On the collection of blocks of Π_PP contained in X^{λ₁}, both the partial orders (≥_π) and (≥_R) coincide. Suppose we have already built the partial order (≥_R) on the blocks contained in X_{λ_i}. We extend it to blocks in X_{λ_{i+1}} as follows. Let X be a member of B_{λ_i}. Then X − X_{λ_i} is molecular with respect to ((f/X ∘ (X − X_{λ_i}))(·), (g/(X − X_{λ_i}))(·)). To reach this structure we restricted f(·) to X and contracted out X_{λ_i}. However, it may be possible to achieve this structure by restricting to a smaller set Y ∪ (X − X_{λ_i}), where Y ⊆ X_{λ_i}, and contracting out Y. But we would insist that Y be an ideal of the restriction of (≥_R) (already defined) on blocks of Π_PP contained in X_{λ_i}. It turns out that there is a unique smallest ideal of (≥_R) (restricted to the collection of blocks contained in X_{λ_i}) with the above property. We would take all the blocks contained in Y to be below the maximal blocks contained in X. (For technical reasons we insist on simpler but equivalent conditions, which are brought out in the following exercise.) For blocks of Π_PP within X_{λ_{i+1}} − X_{λ_i}, we retain the same relationship as in (≥_π).

Exercise 10.27 (k) Let f(·) be submodular on subsets of S. Let T₁ ⊇ T₂ ⊇ T₃. If f(T₂) − f(T₂ − T₃) = f(T₁) − f(T₁ − T₃), then (f/T₁ ∘ T₃)(X) = (f/T₂ ∘ T₃)(X), X ⊆ T₃.
In the next couple of pages we give an inductive definition of the refined partial order (≥_R) associated with the principal partition of (f(·), g(·)) (f(·) submodular, g(·) a positive weight function). Before we can do so, however, we need a few preliminary definitions and a lemma.
Let (≥′) be a partial order on the blocks of Π_PP contained in E_k. We say (≥′) is a modular refinement of (≥_π) on the blocks of Π_PP contained in E_k iff

i. A ≥′ B implies A ≥_π B, and

ii. if Y_1, Y_2 are ideals of (≥′), then f(Y_1) + f(Y_2) = f(Y_1 ∪ Y_2) + f(Y_1 ∩ Y_2).

(Note that the second condition above is satisfied by (≥_π) (Exercise 10.13).)

Let (≥_k) be a modular refinement of (≥_π) over the blocks of Π_PP contained in E_k (= X_{λ_k}). Let X be a member of B_{λ_{k+1}}. We say that X − E_k is contraction related to a subset Y of E_k that is also a union of the blocks in an ideal of (≥_k) iff

f((X − E_k) ∪ Y) − f(Y) = f(X ∪ E_k) − f(E_k).

We then have the following lemma. (Henceforth we abuse the notation and say 'Y is an ideal of (≥)' instead of 'Y is a union of blocks of Π_PP in an ideal of (≥)'.)
Lemma 10.5.1 Let (≥_k) be a modular refinement of (≥_π) over the blocks of Π_PP contained in E_k. Let X_1, X_2, X_3 be members of B_{λ_{k+1}} s.t. X_1 ⊇ X_2, and let Y_1, Y_2, Y_3 ⊆ E_k be ideals of (≥_k) such that Y_3 ⊇ Y_1, X_1 − E_k is contraction related to Y_1 as well as to Y_2, and X_3 − E_k is contraction related to Y_1. Then

i. X_2 − E_k is contraction related to Y_1,

ii. X_1 − E_k is contraction related to Y_3,

iii. X_1 − E_k is contraction related to Y_1 ∩ Y_2,

iv. (X_1 ∪ X_3) − E_k is contraction related to Y_1.

Proof: We use the following notation: for each set Z, Ẑ ≡ Z − E_k.

i. We have f(X̂_1 ∪ Y_1) − f(Y_1) = f(X_1 ∪ E_k) − f(E_k). Suppose

f(X̂_2 ∪ Y_1) − f(Y_1) > f(X_2 ∪ E_k) − f(E_k).

(Note that by submodularity, LHS ≥ RHS.) We then have

f(X̂_1 ∪ Y_1) − f(X̂_2 ∪ Y_1) < f(X_1 ∪ E_k) − f(X_2 ∪ E_k),

which contradicts the submodularity of f(·). We conclude that X̂_2 is contraction related to Y_1.

ii. Suppose

f(X̂_1 ∪ Y_3) − f(Y_3) > f(X_1 ∪ E_k) − f(E_k).

We then have, using the fact that X̂_1 is contraction related to Y_1,

f(X̂_1 ∪ Y_3) − f(X̂_1 ∪ Y_1) > f(Y_3) − f(Y_1),

which contradicts the submodularity of f(·). Thus, X̂_1 is contraction related to Y_3.

iii. Since, over the blocks of Π_PP contained in E_k, (≥_k) is a modular refinement of (≥_π) and Y_1, Y_2 are ideals of (≥_k), we have

f(Y_1) + f(Y_2) = f(Y_1 ∪ Y_2) + f(Y_1 ∩ Y_2).

Write c ≡ f(X_1 ∪ E_k) − f(E_k). By part (ii), X̂_1 is contraction related to Y_1 ∪ Y_2. Submodularity gives

f(X̂_1 ∪ Y_1) + f(X̂_1 ∪ Y_2) ≥ f(X̂_1 ∪ Y_1 ∪ Y_2) + f(X̂_1 ∪ (Y_1 ∩ Y_2)).

Substituting f(X̂_1 ∪ Y_i) = c + f(Y_i), i = 1, 2, and f(X̂_1 ∪ Y_1 ∪ Y_2) = c + f(Y_1 ∪ Y_2), and using the modular equality above, we conclude that f(X̂_1 ∪ (Y_1 ∩ Y_2)) − f(Y_1 ∩ Y_2) ≤ c. The reverse inequality holds by submodularity, since Y_1 ∩ Y_2 ⊆ E_k. Thus X̂_1 is contraction related to Y_1 ∩ Y_2.

iv. We have, using the submodularity of f(·) and the fact that X̂_1, X̂_3 are contraction related to Y_1,

(f(X̂_1 ∪ X̂_3 ∪ Y_1) − f(Y_1)) + (f((X̂_1 ∩ X̂_3) ∪ Y_1) − f(Y_1)) ≤ (f(X̂_1 ∪ Y_1) − f(Y_1)) + (f(X̂_3 ∪ Y_1) − f(Y_1)) = (f(X_1 ∪ X_3 ∪ E_k) − f(E_k)) + (f((X_1 ∩ X_3) ∪ E_k) − f(E_k)),

the last equality because X_1, X_3, X_1 ∪ X_3, X_1 ∩ X_3 all belong to B_{λ_{k+1}}, so that f(X_1) + f(X_3) = f(X_1 ∪ X_3) + f(X_1 ∩ X_3). But f(·) is submodular, so each term on the LHS is at least the corresponding term on the RHS. The above inequality must therefore be satisfied as an equality, with the first and second terms on the LHS being respectively equal to the first and second terms on the RHS. Thus,

f(X̂_1 ∪ X̂_3 ∪ Y_1) − f(Y_1) = f(X_1 ∪ X_3 ∪ E_k) − f(E_k). □
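Part (iii) — contraction relatedness to two ideals satisfying the modular equality implies contraction relatedness to their intersection — can be checked by brute force on a small example. The coverage function and the set E_k below are illustrative assumptions, not the book's data:

```python
from itertools import combinations

# small coverage function (submodular); E_k and the sets below are illustrative
cover = {1: {'p'}, 2: {'p', 'q'}, 3: {'q', 'r'}, 4: {'r'}, 5: {'s'}}
S = frozenset(cover)

def f(X):
    return len(set().union(*(cover[e] for e in X))) if X else 0

def subsets(T):
    T = sorted(T)
    return [frozenset(c) for r in range(len(T) + 1) for c in combinations(T, r)]

Ek = frozenset({1, 2})

def related(Xhat, Y):
    # "X − Ek is contraction related to Y": f(X̂ ∪ Y) − f(Y) = f(X̂ ∪ Ek) − f(Ek)
    return f(Xhat | Y) - f(Y) == f(Xhat | Ek) - f(Ek)

# part (iii): under the modular equality, contraction relatedness to
# Y1 and Y2 implies contraction relatedness to Y1 ∩ Y2
for Xhat in subsets(S - Ek):
    for Y1 in subsets(Ek):
        for Y2 in subsets(Ek):
            modular = f(Y1) + f(Y2) == f(Y1 | Y2) + f(Y1 & Y2)
            if modular and related(Xhat, Y1) and related(Xhat, Y2):
                assert related(Xhat, Y1 & Y2)
```

Without the modular equality in the guard the implication can fail; it is exactly the modular-refinement condition that makes the intersection property, and hence the corollary below, work.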
Corollary 10.5.1 Let X be a member of B_{λ_{k+1}}. Then there is a unique minimal ideal Y of the partial order (≥_k) s.t. X − E_k is contraction related to Y.

Proof: This is an immediate consequence of the third part of Lemma 10.5.1. □

If Y is the minimal ideal of (≥_k) such that X − E_k is contraction related to it, we say X − E_k is properly related to Y. From the above lemma it is clear that there is a unique such subset Y.

We are now ready to present the inductive definition of (≥_R). To begin with, (≥_1) is defined to agree with (≥_π) on the blocks of Π_PP contained in E_1 = X_{λ_1}. Let (≥_k) be a modular refinement of (≥_π) on the blocks contained in E_k = X_{λ_k}. We now extend the partial order (≥_k) to the partial order (≥_{k+1}), which is a modular refinement of (≥_π) on the blocks of Π_PP contained in E_{k+1} = X_{λ_{k+1}}, as follows. Let A, B be blocks of Π_PP.

i. If A, B are contained in E_k, then A ≥_{k+1} B iff A ≥_k B.

ii. If A, B are contained in E_{k+1} − E_k, then A ≥_{k+1} B iff A ≥_π B.

iii. If A is contained in E_{k+1} − E_k and B is contained in E_k, then A ≥_{k+1} B iff X_A − E_k is properly related to Y_B ⊇ B, where X_A is the minimal member of B_{λ_{k+1}} containing A and Y_B is a union of blocks of an ideal of (≥_k) defined over the blocks contained in E_k.

Lemma 10.5.1 assures us that the above definition does yield a partial order on the blocks of Π_PP contained in E_{k+1}, and further that this partial order is a refinement of (≥_π) over the blocks of Π_PP within E_{k+1}. It can be verified (see Exercise 10.28 below) that if X_1, X_2 are unions of blocks in an ideal of (≥_{k+1}), then f(X_1) + f(X_2) = f(X_1 ∪ X_2) + f(X_1 ∩ X_2).

The above procedure therefore extends to a unique partial order (≥_R) (≡ (≥_t)) that is also a modular refinement of (≥_π), on all the blocks of Π_PP. We refer to this partial order as the refined partial order associated with the principal partition of (f(·), g(·)) (f(·) submodular, g(·) a positive weight function).
10.5 THE REFINED PARTIAL ORDER OF THE PP

Exercise 10.28 (k) Verify that if X_1, X_2 are unions of blocks in an ideal of (≥_{k+1}), then f(X_1) + f(X_2) = f(X_1 ∪ X_2) + f(X_1 ∩ X_2).
We now give a simple characterization of the partial order (≥_R) associated with (f(·), g(·)).

Theorem 10.5.1 Let λ_1 > ··· > λ_t be the decreasing sequence of critical values of (f(·), g(·)) (f(·) submodular on subsets of S, g(·) a positive weight function on S). A set X ⊆ S is an ideal of the refined partial order (≥_R) associated with (f(·), g(·)) iff it satisfies the following conditions:

i. for each critical value λ_j, (X ∩ X_{λ_j}) ∪ X_{λ_{j−1}} is a set in B_{λ_j};

ii. for each critical value λ_j, X ∩ (X_{λ_j} − X_{λ_{j−1}}) is contraction related to X ∩ X_{λ_{j−1}}.
Proof: We denote X_{λ_j} by E_j, j = 1, …, t.

only if: Let X be an ideal of (≥_R). Then X ∩ E_j is an ideal of (≥_R), since by definition E_j is an ideal of (≥_R) and the intersection of ideals yields another ideal. We observe that any two blocks of Π_PP that lie in E_j − E_{j−1} have the same relationship in both (≥_π) and (≥_R). Let A be a block of Π_PP in X ∩ (E_j − E_{j−1}). The blocks of Π_PP that are contained in E_j − E_{j−1} and that lie beneath A in (≥_R) must be identical to those beneath A in (≥_π). The remaining blocks beneath A in (≥_π) are precisely those in E_{j−1}. Thus, (X ∩ (E_j − E_{j−1})) ∪ E_{j−1} is an ideal of (≥_π) and therefore a member of B_{λ_j}. This proves the first condition.

Next observe that E_j is an ideal of (≥_R). Hence X ∩ E_j is also an ideal of (≥_R). Since X ∩ E_j is an ideal of (≥_R), by the inductive definition of (≥_R) it follows that each block A of Π_PP within X ∩ (E_j − E_{j−1}) is contained in a minimal set X_A of B_{λ_j} s.t. X_A − E_{j−1} is properly related to an ideal Y of (≥_R) contained in X ∩ E_{j−1}. Hence, by the second part of Lemma 10.5.1 it is clear that X_A − E_{j−1} is contraction related to X ∩ E_{j−1}. The union of all such sets X_A − E_{j−1} is clearly equal to X ∩ (E_j − E_{j−1}). By the fourth part of Lemma 10.5.1 the latter set must be contraction related to X ∩ E_{j−1}. This proves the second condition.

if: Conversely, suppose X satisfies conditions (i) and (ii). We will show that X is an ideal of (≥_R). We proceed inductively. It is clear that X ∩ E_1 is a set in B_{λ_1} and hence an ideal of (≥_π) as well as of (≥_R). Suppose X ∩ E_{k−1} is an ideal of (≥_R). We are given that (X ∩ E_k) ∪ E_{k−1} is a set in B_{λ_k} and also that X ∩ (E_k − E_{k−1}) is contraction related to X ∩ E_{k−1}. Hence, there exists an ideal Y ⊆ X ∩ E_{k−1} of (≥_R) s.t. X ∩ (E_k − E_{k−1}) is properly related to Y. Now, if A is a block of Π_PP in X ∩ (E_k − E_{k−1}), all blocks beneath A in (≥_R) which are in E_{k−1} are contained in Y. Further, since (X ∩ E_k) ∪ E_{k−1} is an ideal of (≥_π), all blocks of Π_PP in E_k − E_{k−1} which are beneath A in (≥_π) are contained in X ∩ (E_k − E_{k−1}). We conclude that this is also true with respect to (≥_R), since the relationship between such blocks is identical in both (≥_π) and (≥_R). From the inductive construction of (≥_R) it is clear that (X ∩ (E_k − E_{k−1})) ∪ Y is an ideal of (≥_R). Hence, ((X ∩ (E_k − E_{k−1})) ∪ Y) ∪ (X ∩ E_{k−1}) is an ideal of (≥_R), i.e., X ∩ E_k is an ideal of (≥_R). This completes the proof. □
Exercise 10.29 (k) A base of a polymatroid rank function f: 2^S → ℝ is a vector x on S s.t.

x(X) ≤ f(X) ∀X ⊆ S, and x(S) = f(S).

We say a base is consistent with a preorder (and with the induced partial order on the equivalence classes of the preorder) iff x(X) = f(X) whenever X is an ideal of the preorder. Let (≥_π) be the partial order and (≥_R) be the refined partial order associated with the principal partition of (f(·), g(·)) (g(·) a positive weight function). Let x be a base of f(·) consistent with (≥_π). Show that it is consistent with (≥_R).

Exercise 10.30 Let (≥_R) be the refined partial order and let Π_PP be the partition of S associated with the principal partition of (f(·), g(·)), where f(·) is submodular on subsets of S and g(·) is a positive weight function on S. Let X be the union of blocks of Π_PP in an ideal of (≥_R). Describe the principal partition of (f/X)(·) and the partial orders associated with it.

Exercise 10.31 (k) Let f(·) be a submodular function on subsets of S and let g(·) be a positive weight function on S. Let f*(·) be the dual of f(·) with respect to g(·). Then the refined partial order of (f*(·), g(·)) is dual to the refined partial order of (f(·), g(·)).
10.6 Algorithms for PP

10.6.1 Basic Algorithms

In this subsection we give a collection of algorithms which together construct the principal partition of (f(·), g(·)), where f(·) is submodular and g(·) is a positive weight function. The submodular functions may be available in various ways. One common way is through a 'rank oracle' which, when presented with a subset, returns the value of the function on it. We assume that we have available an algorithm called Convolve_K(f_1, f_2) which, given submodular functions f_1(·), f_2(·) on subsets of K, outputs the unique minimal and maximal sets (Minset(convolve) and Maxset(convolve) respectively) which minimize f_1(X) + f_2(K − X), X ⊆ K. In general such an algorithm involves minimization of a submodular function. As we have remarked before, although this problem is polynomially solvable, we do not yet have truly practical algorithms for it. However, for the instances that are our primary concern in this book we do have very good algorithms.

Informally, Algorithm 10.1 proceeds as follows. We start with the set interval (∅, S). The subroutine given below breaks up the set interval (∅, S) into (∅, Z) and (Z, S), where Z minimizes the expression λf(X) + g(S − X), X ⊆ S, for λ = λ_0 ≡ g(S)/(f(S) − f(∅)). If for every set X between the end sets the value of λf(X) + g(S − X) does not exceed its value at the end sets (= g(S) + λf(∅)), then we are done: the principal sequence is ∅, S and the critical value is λ_0. Otherwise we find the minimal set, say T, that minimizes the above expression. Now we work with the intervals (∅, T), (T, S) and look for minimizing sets within the interval in question. In each case we use a value of λ for which λf(X) + g(T′ − X), where T′ is the right end of the interval, reaches the same value at both ends of the interval. When we are unable to subdivide the intervals any further, we get a sequence of sets and a sequence of values which, the Uniqueness Theorem (Theorem 10.4.6) assures us, are respectively the principal sequence and the sequence of critical values of (f(·), g(·)).
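For small ground sets the Convolve oracle itself can be mimicked by exhaustive search. The sketch below (the function name and interface are my own assumptions, not the book's) returns the Minset and Maxset; for submodular f_1, f_2 the minimizers form a lattice, so their intersection and union are again minimizers:

```python
from itertools import combinations

def convolve(f1, f2, K):
    """Brute-force stand-in for Convolve_K(f1, f2): return (Minset, Maxset),
    the unique minimal and maximal minimizers of f1(X) + f2(K - X), X ⊆ K."""
    K = frozenset(K)
    subs = [frozenset(c) for r in range(len(K) + 1)
            for c in combinations(sorted(K), r)]
    vals = {X: f1(X) + f2(K - X) for X in subs}
    best = min(vals.values())
    mins = [X for X in subs if vals[X] == best]
    # submodularity makes the minimizers a lattice: the intersection and
    # the union of all minimizers are themselves minimizers
    return frozenset.intersection(*mins), frozenset.union(*mins)
```

For example, with f_1(X) = 2|X| and f_2(X) = w(X) for the weights w = {a: 1, b: 2, c: 4}, the minimizers of f_1(X) + f_2(K − X) are {c} and {b, c}, so Minset = {c} and Maxset = {b, c}.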
Subdivide_{f,g}(A, B)

INPUT A submodular function f(·) and a positive weight function g(·) on subsets of S. Sets A, B s.t. ∅ ⊆ A ⊂ B ⊆ S.

OUTPUT The unique minimal minimizing set (Minset) A ∪ Z for λf(X) + g(B − X), A ⊆ X ⊆ B, where λ = (g(B) − g(A))/(f(B) − f(A)).

STEP
λ ← (g(B) − g(A))/(f(B) − f(A)).
Let f′(Y) ≡ f(A ∪ Y) − f(A), Y ⊆ B − A.
Convolve_{B−A}(λf′, g).
Let Z be the Minset(convolve) of the output.
Output A ∪ Z as the Minset.
STOP
ALGORITHM 10.1 Algorithm P-Sequence

INPUT A submodular function f(·) and a positive weight function g(·) on subsets of S.

OUTPUT The principal sequence of (f(·), g(·)).

Initialize
Current Set Sequence ← (∅, S)
λ_0 ← g(S)/(f(S) − f(∅))
Current λ Sequence ← (λ_0)
j ← 0
∅ is unmarked.

STEP 1
Let Current Set Sequence be (S_1^j, …, S_{r_j}^j) and let Current λ Sequence be (λ_1^j, …, λ_{r_j−1}^j). If S_t^j, 1 ≤ t ≤ r_j − 1, is unmarked, then Subdivide_{f,g}(S_t^j, S_{t+1}^j); Else GOTO STEP 3.

STEP 2
Let (S_1^j, …, S_{r_j}^j) ≡ (T_1, …, T_q), (λ_1^j, …, λ_{r_j−1}^j) ≡ (λ_1, …, λ_{q−1}).
Let T be the Minset output by Subdivide_{f,g}(S_t^j, S_{t+1}^j).
If T = S_t^j, then
j ← j + 1;
(S_1^j, …, S_{r_j}^j) ← (T_1, …, T_q);
(λ_1^j, …, λ_{r_j−1}^j) ← (λ_1, …, λ_{q−1});
mark S_t^j;
Else
j ← j + 1;
Current Set Sequence (S_1^j, …, S_t^j, S_{t+1}^j, S_{t+2}^j, …, S_{r_j}^j) ← (T_1, …, T_t, T, T_{t+1}, …, T_q);
Current λ Sequence is updated correspondingly, the entry λ_t for the old interval being replaced by the two values of λ computed by Subdivide for the new intervals (T_t, T) and (T, T_{t+1}).
GOTO STEP 1.

STEP 3
Output Current Set Sequence as the Principal Sequence and Current λ Sequence as the Critical Value Sequence.

STOP
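Putting the pieces together for small ground sets, the following sketch computes the principal sequence by repeated subdivision. The brute-force inner minimization stands in for Convolve, the function names are my own, and the example assumes f integer-valued and strictly increasing so that the λ chosen in each Subdivide is well defined:

```python
from itertools import combinations
from fractions import Fraction

def principal_sequence(S, f, g):
    """Principal sequence of (f, g) by repeated subdivision (brute force).
    Assumes f integer-valued and strictly increasing; g a modular weight."""
    def between(A, B):                      # all X with A ⊆ X ⊆ B
        extra = sorted(B - A)
        for r in range(len(extra) + 1):
            for c in combinations(extra, r):
                yield A | set(c)
    def subdivide(A, B):                    # λ balancing the interval ends
        lam = Fraction(g(B) - g(A), f(B) - f(A))
        vals = {frozenset(X): lam * f(X) + g(B - X) for X in between(A, B)}
        best = min(vals.values())
        mins = [set(X) for X, v in vals.items() if v == best]
        return lam, set.intersection(*mins)  # minimal minimizer (lattice)
    seq, lams = [set(), set(S)], {}
    while True:
        for i in range(len(seq) - 1):
            A, B = seq[i], seq[i + 1]
            key = (frozenset(A), frozenset(B))
            if key in lams:
                continue                    # interval already marked
            lam, T = subdivide(A, B)
            if T == A:                      # no proper subdivision: mark it
                lams[key] = lam
            else:                           # split the interval at T
                seq.insert(i + 1, T)
            break
        else:
            break                           # every interval marked: done
    crit = [lams[(frozenset(seq[i]), frozenset(seq[i + 1]))]
            for i in range(len(seq) - 1)]
    return seq, crit
```

On S = {a, b, c} with f = | · | and weights w(a) = 1, w(b) = 2, w(c) = 4 this yields the principal sequence ∅ ⊂ {c} ⊂ {b, c} ⊂ S with critical values 4, 2, 1.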
Next we consider the problem of construction of B_λ^{f,g}. Since the number of sets in this family is very large, we try to get a representation of it through a partial order whose ideals correspond to the members of the family. This is possible since B_λ^f is closed under union and intersection (see page 391). We assume we have available an algorithm (easy to build) that, for a preorder ⪰ on S, given the collection F ≡ {(e, T_e), e ∈ S}, T_e ≡ {e_j : e_j ⪰ e}, produces the Hasse diagram of the induced partial order. We will call this algorithm Hasse Diagram(F).

ALGORITHM 10.2 Algorithm B_λ^{f,g}

INPUT S = {e_1, …, e_n}. Submodular function f(·) and positive weight function g(·) on subsets of S; λ ≥ 0.

OUTPUT A preorder whose ideals are precisely the members of B_λ^f. The preorder is specified through the Hasse diagram of the induced partial order.

STEP 1
Convolve_S(λf, g). Let Z be the Minset and Z′ be the Maxset; X_λ ← Z, X^λ ← Z′.

STEP 2
For each j, e_j ∈ Z′ − Z, let f_j(·) ≡ f/(S − e_j)(·), g_j(·) ≡ g/(S − e_j)(·).
Convolve_{S−e_j}(λf_j, g_j). Let Z_j be the Maxset. {e : e ⪰ e_j} = Z′ − Z_j.

STEP 3
Let F ≡ {(e_j, Y_j) : Y_j = Z′ − Z_j if e_j ∈ Z′ − Z, Y_j = Z′ if e_j ∈ Z}.
Hasse Diagram(F).

STOP

Construction of (Π_PP, ≥_π) is as described on page 392.
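On small instances the two convolve calls per element can again be replaced by exhaustive enumeration. The sketch below (function name assumed) recovers the Minset, the Maxset, and, for each element between them, the set of elements above it in the preorder:

```python
from itertools import combinations

def b_lambda_preorder(S, f, g, lam):
    """Brute-force sketch of Algorithm 10.2: enumerate the minimizers of
    lam*f(X) + g(S - X); return Minset Z, Maxset Z', and, for each element
    e in Z' - Z, the set T_e of elements lying above e in the preorder
    (those present only in minimizers that contain e)."""
    subs = [frozenset(c) for r in range(len(S) + 1)
            for c in combinations(sorted(S), r)]
    vals = {X: lam * f(X) + g(set(S) - X) for X in subs}
    best = min(vals.values())
    mins = [X for X in subs if vals[X] == best]
    Z, Zp = frozenset.intersection(*mins), frozenset.union(*mins)
    T = {}
    for e in Zp - Z:
        # largest member of B_lambda avoiding e = union of minimizers avoiding e
        Ze = frozenset.union(*[X for X in mins if e not in X])
        T[e] = Zp - Ze
    return Z, Zp, T
```

With f = | · |, weights w = {a: 1, b: 2, c: 4} and λ = 2, the minimizers are {c} and {b, c}, so Z = {c}, Z′ = {b, c}, and the only element of Z′ − Z is b, with T_b = {b}.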
Remark 10.6.1 In Algorithm 10.2 we have found, for each element e in S, the set of all elements ⪰ e. From this the Hasse diagram of the induced partial order has been built. An equivalent procedure would be to find, for each e in S, the set of all elements ⪯ e. Clearly from this also the Hasse diagram of the induced partial order can be constructed. The set of all elements ⪯ e is obtained, when e ∈ X^λ, by finding the minimal minimizing set for the function λf(X) + g(S − X), e ∈ X ⊆ S. But this is precisely what the subroutine Subdivide_{f,g}({e}, S) does.

Justification for the PP algorithms

Justification for Algorithm P-Sequence is directly by use of the Uniqueness Theorem (Theorem 10.4.6). For, at the end of the algorithm, we have a sequence of sets and a sequence of critical values which satisfy the conditions of the theorem. Algorithm B_λ^{f,g} essentially involves, for each e ∈ S, finding the largest member Z_e of B_λ^{f,g} which does not contain e. By the definition of the preorder associated with B_λ^{f,g}, the complement of Z_e contains precisely those elements which are present only in those members of B_λ^{f,g} which have e present. If e ∈ X_λ, there is no member of B_λ^{f,g} without e.

Complexity of the PP algorithms

The main subroutine is Convolve, so we will bound the number of calls to it. In Algorithm P-Sequence, at each stage we have a nested sequence of sets. Hence the number of subdivisions is bounded by |S|. Each call to Convolve either creates a subdivision or marks a set. Marking a set S_j is equivalent to omitting (S_j, S_{j+1}) from further consideration. Now S_{j+1} − S_j must have had at least two elements, otherwise we could not have called Convolve. Thus, the total number of calls to Convolve cannot exceed |S|. Algorithm B_λ^{f,g} requires |Z′ − Z| + 1 calls to Convolve. The total number of critical values is bounded by |S|. Hence, for all the critical values together we do not have to make more than 2|S| calls to Convolve. Thus, building the principal partition as well as the partition–partial order pair (Π_PP, ≥_π) associated with the principal partition requires O(|S|) calls to Convolve.
Remark: Note that we have assumed Convolve to be powerful, since it produces Minset and Maxset instead of just any set minimizing λf(X) + g(S − X). This, however, appears valid for practical situations such as those where f(·) = the rank function of a graph or a matroid and g(·) = a positive weight function.

Exercise 10.32 (Speeding up the PP algorithm) Let f(·) be a submodular function on subsets of S and let g(·) be a positive weight function. Let X ⊆ S and let e ∈ S − X. Let T_max, T_min denote the maximal and minimal subsets that minimize f(Y) + g(T − Y) over Y ⊆ T. Prove that the following hold.

ii. Let f(·) be increasing. Then (f ∗ g)(X ∪ e) = (f ∗ g)(X) iff f(X_max ∪ e) = f(X_max).
Further, if (f ∗ g)(X ∪ e) = (f ∗ g)(X), then (X ∪ e)_max = X_max ∪ e. How would you use this fact for computing S_max efficiently when f(·) is integral?

10.6.2 Construction of the Refined Partial Order

In the sequel we do not distinguish between an ideal of (≥_R) and the union of blocks of Π_PP contained in the ideal, and we use terms such as 'properly related' (defined on page 410) for the latter also. Let λ_1 > ··· > λ_t be the decreasing sequence of critical values. The partial orders (≥_R) and (≥_π) agree with each other over Π(λ_1). Suppose we have built (≥_R) for Π(λ_1) ⊎ ··· ⊎ Π(λ_k). For each block A_j of Π(λ_{k+1}), let I_j be the principal ideal determined by A_j in (≥_{λ_{k+1}}). Let I_j be properly related to Y, where Y is an ideal of (≥_R) (restricted to Π(λ_1) ⊎ ··· ⊎ Π(λ_k)). Then the blocks beneath A_j in (≥_R) are the blocks in Y and the blocks in I_j. We give an algorithm for computing Y below.
ALGORITHM 10.3 Algorithm Y

INPUT (≥_R) over Π(λ_1) ⊎ ··· ⊎ Π(λ_k); I_j.

OUTPUT Ideal Y of (≥_R) over Π(λ_1) ⊎ ··· ⊎ Π(λ_k) s.t. I_j is properly related to Y.

Initialize
Current ideal X ← X_{λ_k}.

STEP 1
For each B_i in Π(λ_1) ⊎ ··· ⊎ Π(λ_k), let U_i ≡ {B_m : B_m ≥_R B_i}, and if I_j is contraction related to X − U_i, then X ← (X − U_i).

STEP 2
Output the current ideal as Y.

STOP

Complexity of construction of refined partial order

To compute the principal ideal of A_j in the refined partial order, we need to compute at most |Π_PP| U_i's. For each U_i we need to compute the ranks f(X − U_i), f(I_j ∪ (X − U_i)). Thus, O(|Π_PP|) ranks have to be computed. Hence, to compute the principal ideals for all the blocks of Π_PP, O(|Π_PP|²) U_i's have to be computed and O(|Π_PP|²) ranks have to be computed. We note that each U_i is a principal ideal in an appropriate partial order.
10.6.3 Algorithm Convolve_S(λw_R(Γ_L)(·), w_L(·))

We must examine the algorithm Convolve_S(λf, g) for two important special cases:

i. f(·) ≡ w_R(Γ_L)(·), where Γ_L is the left adjacency function of a bipartite graph (V_L, V_R, E), w_R(·) is a positive weight function on V_R, and g(·) ≡ w_L(·), a positive weight function on V_L;

ii. f(·) ≡ r(·), the rank function of a matroid, and g(·) ≡ | · |.

We study the former case in this subsection and relegate the latter to the next chapter.

For the present instance

λf(X) + g(S − X) = λw_R(Γ_L)(X) + w_L(V_L − X). Let λ > 0.

As we saw for the case λ = 1 in the proof of Theorem 10.2.2, minimizing the above function is equivalent to finding the min cut in the network N_λ ≡ F(B, w_L, λw_R). For convenience we repeat the definition and sketch the discussion given in Subsection 3.6.10. The flow graph is built as follows: each edge of the bipartite graph is directed from left to right with capacity ∞; there is a source vertex s and a sink vertex t; from s to each left vertex v_L there is a directed edge of capacity w_L(v_L); from each right vertex v_R there is a directed edge to t with capacity λw_R(v_R).

Using the facts that a min cut should not have infinite capacity and that λ > 0, we can show that a min cut must have the form

(s ∪ X̄ ∪ Γ_L(X̄), t ∪ (V_L − X̄) ∪ (V_R − Γ_L(X̄)))

(see Figure 10.3). The capacity of this cut is w_L(V_L − X̄) + λw_R(Γ_L)(X̄). On the other hand, given X̄, we can always build a 'corresponding cut' of the above form whose capacity is given by the expression w_L(V_L − X̄) + λw_R(Γ_L)(X̄). Thus, there is a one to one correspondence between min cuts and sets X̄ which minimize the expression w_L(V_L − X̄) + λw_R(Γ_L)(X̄). Any standard max flow algorithm would yield a min cut. Finding the min cuts corresponding to the unique largest and unique smallest minimizing set X̄ is also easy and does not cost additional complexity, as we have shown in the above mentioned subsection.
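A compact, self-contained sketch of this reduction follows (Edmonds–Karp max flow, with the residual-reachable source side giving the minimal minimizing set; all function names here are my own assumptions):

```python
from collections import deque
from fractions import Fraction

def max_flow_min_cut(cap, s, t):
    """Edmonds-Karp max flow. Returns (flow value, source side of the min cut);
    the residual-reachable source side is the unique minimal one."""
    flow = {}
    def residual(u, v):
        return cap.get((u, v), 0) - flow.get((u, v), 0) + flow.get((v, u), 0)
    adj = {}
    for (u, v) in cap:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    value = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:          # BFS for an augmenting path
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and residual(u, v) > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return value, set(parent)
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(residual(u, v) for (u, v) in path)
        for (u, v) in path:                   # push aug units along the path
            back = min(flow.get((v, u), 0), aug)   # cancel opposite flow first
            if back:
                flow[(v, u)] -= back
            if aug > back:
                flow[(u, v)] = flow.get((u, v), 0) + (aug - back)
        value += aug

def min_left_set(VL, VR, E, wL, wR, lam):
    """Minimal X ⊆ VL minimizing wL(VL - X) + lam*wR(Γ_L(X)), via the
    network F(B, wL, lam*wR) described in the text."""
    INF = sum(wL.values()) + lam * sum(wR.values()) + 1
    cap = {}
    for u in VL:
        cap[('s', ('L', u))] = Fraction(wL[u])
    for v in VR:
        cap[(('R', v), 't')] = Fraction(lam) * wR[v]
    for (u, v) in E:
        cap[(('L', u), ('R', v))] = INF       # bipartite edges must not be cut
    value, side = max_flow_min_cut(cap, 's', 't')
    return value, {u for u in VL if ('L', u) in side}
```

On the small instance V_L = {1, 2}, V_R = {a, b} with edges 1–a, 2–a, 2–b, w_L(1) = 1, w_L(2) = 3, unit w_R and λ = 1, the minimum value is 2 and the minimal minimizing set is {1, 2}.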
Exercise 10.33 The subroutine Subdivide_{K,S}(λw_R(Γ_L), w_L) involves minimizing

w_L(V_L − X) + λw_R(Γ_L)(X), X ⊇ K.

Show that this minimization is equivalent to solving a max flow problem in which the capacities of the edges in the flow graph F(B, w_L, λw_R) are modified as follows: the capacity of (s, u), u ∈ K, is changed from w_L(u) to ∞.

Exercise 10.34 Let B ≡ (V_L, V_R, E) and let w_L(·), w_R(·) be positive weight functions on V_L, V_R. Show that

Figure 10.3: Convolution through Flow Maximization
10.6.4 Example: PP of (|Γ_L|(·), w_L(·))

We illustrate how to construct a bipartite graph with a desired principal partition of (|Γ_L|(·), w_L(·)) and a desired refined partial order, say the one given in part (a) of Figure 10.4. Here |Γ_L|(·), w_L(·) are the adjacency function and a positive weight function on the left vertex set of a bipartite graph (V_L, V_R, E). We work with | · | but the same ideas work for any w_L(·).

We begin with a stock of '(|Γ_L|(·), | · |) atomic bipartite graphs' (i.e., bipartite graphs for which the principal partition of (|Γ_L|(·), | · |) has only two sets: ∅ and V_L) of the specified critical value. If a connected bipartite graph has a totally symmetric left vertex set, then it has to be atomic with respect to (|Γ_L|(·), | · |). The critical value for such bipartite graphs equals |V_L|/|V_R|. These bipartite graphs are seen in part (b) of the figure if one ignores the dotted lines. Let us call these bipartite graphs B_A, B_{B1}, B_{B2}, B_{B3}, B_{C1}, B_{C2}, B_{D1}, B_{D2}. We remind the reader that the bipartite graph B·L_X is defined to be the subgraph of B on X ∪ Γ_L(X), whereas the bipartite graph B∘L_X is defined to be the subgraph on X ∪ (V_R − Γ_L(V_L − X)).

Figure 10.4: The Principal Partition and Refined Partial Order for (|Γ_L|(·), | · |)

Given a bipartite graph B ≡ (V_L, V_R, E) with left adjacency function |Γ_L|(·), the derived bipartite graphs B·L_X, B∘L_X, X ⊆ V_L, have as left adjacency functions |Γ_L|/X(·) and |Γ_L|∘X(·) respectively (see Exercise 9.7). Let S_i be one of the sets A, B1, …, D2 and let I_{S_i} be the set corresponding to the principal ideal of S_i relative to (≥_R). We want the structure on S_i to become atomic with respect to the left adjacency function and the | · | function in the bipartite graph (B·L_{I_{S_i}})∘L_{S_i}. Thus, for instance, (B·L_{(C1 ∪ B3 ∪ B1)})∘L_{C1} must be the same as B_{C1} (the bipartite graph on C1 ∪ Γ_L(C1) when the dotted lines are removed). If the original bipartite graph has edges from the left vertex set C1 to the right vertex set of B_{B1}, then unless the ideal B3 ∪ B1 is contracted we would be unable to get the desired atomic structure on C1. We therefore force this, since we want C1 >_R B1, B3, and attach such (dotted) edges. This procedure, if carried out for each of the sets A, …, D2, results in the bipartite graph with additional dotted edges shown in part (b) of the figure. Conversely, the latter bipartite graph has the principal partition and refined partial order given in part (a) of the figure. It should be clear from this example that we can build a bipartite graph with any desired refined partial order by using the above procedure. The Hasse diagram for the partial order (≥_π) can be obtained from part (a) of the figure by adding the following additional lines: (D1, C1), (D2, C1), (C1, B2), (C2, B2), (B1, A), and deleting (B3, A). Observe that the principal partition and refined partial order carry more information than the Dulmage–Mendelsohn decomposition. The latter would show the partial order on B1, B2, B3, but lump all of D1, D2, C1, C2 and also lump all elements corresponding to λ > 1.
Exercise 10.35 What is the significance of an atomic (|Γ_L|, | · |) structure for a critical value λ ≠ 1?

Exercise 10.36 Verify the following simple rules for building atomic structures relative to (|Γ_L|(·), k| · |), where |Γ_L|(·) is the adjacency function acting on the left vertex set of a bipartite graph (V_L, V_R, E) and k is any positive number. We say 'atomic bipartite graph' to be brief.

i. Any connected bipartite graph in which the left vertex set is totally symmetric is atomic.

ii. Build two atomic bipartite graphs on the same left vertex set. Merge corresponding left vertices. The result is an atomic bipartite graph.

iii. Start with an atomic bipartite graph. Replace each left (right) vertex by m copies. The result is an atomic bipartite graph.

iv. Start with an atomic bipartite graph and interchange left and right vertex sets. This yields an atomic bipartite graph.
10.7 *Aligned Polymatroid Rank Functions

In this section we deal with situations where certain submodular functions have strongly related principal partitions, and these relations carry through under special operations. We show that if we perform these operations on a single polymatroid rank function, the principal partition is relatively unaffected. Our ideas culminate in Theorem 10.7.5, where we show that a polymatroid obtained from another through a 'positive' or 'negative' expression is 'aligned' to the original. Since the ideas of this section, though useful, are not standard, they are stated in the form of problems.

Definition 10.7.1 (Aligned polymatroid rank functions with respect to a positive weight function) Let f_0(·), f_1(·) be polymatroid rank functions on subsets of S and let g(·) be a positive weight function on S. Let B_λ^i, i = 0, 1, denote the collections of sets in the principal partition of (f_i(·), g(·)), i = 0, 1, corresponding to λ. We say f_0(·), f_1(·) are aligned with respect to g(·) iff

i. the set of self loops of one of the polymatroid rank functions f_0(·), f_1(·) is contained in that of the other polymatroid rank function, and similarly the set of coloops relative to g(·) of one of the functions is contained in that of the other;

ii. every set in the principal sequence of one of the functions, which contains the set of self loops of both the functions and does not intersect the coloops of either function, is a set in the principal sequence of the other function;

iii. whenever X ⊆ S contains the set of self loops of one of the functions, say f_i(·), does not contain any of its coloops, and X ∈ B_{λ_i}^i for some λ_i, then X ∈ B_{λ_j}^j for some λ_j, where j = i + 1 mod 2.

In general the property of being aligned is not transitive. With some additional conditions, however, as illustrated in the next problem, it is.
Problem 10.3 Definition 10.7.2 Let f_0(·), f_1(·) be polymatroid rank functions on subsets of S and let g(·) be a positive weight function on S. If f_0(·), f_1(·) are aligned with respect to g(·), and further every self loop of f_0(·) is a self loop of f_1(·) and every coloop relative to g(·) of f_0(·) is a coloop relative to g(·) of f_1(·), then we say that the principal partition of (f_1(·), g(·)) is coarser than that of (f_0(·), g(·)), and that the principal partition of (f_0(·), g(·)) is finer than that of (f_1(·), g(·)).

Prove

Theorem 10.7.1 Let f_0(·), f_1(·), f_2(·) be polymatroid rank functions on subsets of S and let g(·) be a positive weight function on S. If f_0(·), f_1(·) and f_1(·), f_2(·) are aligned with respect to g(·), and the principal partition of (f_0(·), g(·)) is finer than that of (f_1(·), g(·)), which in turn is finer than that of (f_2(·), g(·)), then f_0(·), f_2(·) are aligned with respect to g(·) and further the principal partition of (f_0(·), g(·)) is finer than that of (f_2(·), g(·)).

Solution: Proof of Theorem 10.7.1: We have,

• the set of self loops of f_0(·) is contained in that of f_1(·), which in turn is contained in the set of self loops of f_2(·), and

• the set of coloops of f_0(·) is contained in that of f_1(·), which in turn is contained in the set of coloops of f_2(·).

The alignedness of f_0(·), f_2(·) now follows if the second and third parts of its definition are satisfied. It is easy to verify that the part of the definition of alignedness about principal sequences of the two functions is satisfied. We will therefore verify only the third part.

To see this, consider first the case where X is a set, corresponding to λ_2 in the principal partition of (f_2(·), g(·)), that contains all self loops of f_2(·) and none of its coloops. Since f_1(·), f_2(·) are aligned, X must be a set in the principal partition of (f_1(·), g(·)), and further it does not intersect any of the coloops of f_1(·) and contains all its self loops. Since f_0(·), f_1(·) are aligned, it follows that X must be a set in the principal partition of (f_0(·), g(·)).

Next, let X be a set corresponding to λ_0 in the principal partition of (f_0(·), g(·)), that contains all self loops of f_0(·) and none of its coloops. It must then be a set in the principal partition of (f_1(·), g(·)), say corresponding to λ. We then have three possibilities, by Lemma 10.4.3.

• (∞ > λ > 1, i.e., X contains all self loops of f_1(·) and none of its coloops.) Clearly X must be a set in the principal partition of (f_2(·), g(·)) also, since f_1(·), f_2(·) are aligned.

• (λ = ∞, i.e., X is a subset containing only self loops of f_1(·).) Clearly X contains only self loops of f_2(·), since the principal partition of (f_2(·), g(·)) is coarser than that of (f_1(·), g(·)).

• (λ = 1, i.e., X is a subset whose complement contains only coloops of f_1(·).) Clearly the complement of the set X contains only coloops of f_2(·), since the principal partition of (f_2(·), g(·)) is coarser than that of (f_1(·), g(·)). □
Problem 10.4 Prove

Lemma 10.7.1 Let f_0(·), f_1(·) be polymatroid rank functions on subsets of S and let g(·) be a positive weight function on S. Let B_λ^2 denote the collection of sets corresponding to λ in the principal partition of (f_0(·) + f_1(·), g(·)).

i. Let X be a set in B_{λ_0}^0 as well as in B_{λ_1}^1. Then X ∈ B_{λ_3}^2, where λ_3 = ((λ_0)^{−1} + (λ_1)^{−1})^{−1}.

ii. Suppose, in addition, X is a maximal (minimal) member of B_{λ_0}^0 as well as of B_{λ_1}^1; then X is a maximal (minimal) member of B_{λ_3}^2.

iii. If B_{λ_0}^0 = B_{λ_1}^1, then B_{λ_3}^2 = B_{λ_0}^0.

Solution: Proof of Lemma 10.7.1:

i. We have

f_i(X) + ((λ_i)^{−1}) g(S − X) ≤ f_i(Y) + ((λ_i)^{−1}) g(S − Y), i = 0, 1, Y ⊆ S.

Hence,

f_0(X) + f_1(X) + ((λ_0)^{−1} + (λ_1)^{−1}) g(S − X) ≤ f_0(Y) + f_1(Y) + ((λ_0)^{−1} + (λ_1)^{−1}) g(S − Y).

This proves the required result.

ii. In the above proof note that the final inequality reduces to an equality iff the former inequalities do so for i = 0, 1. The result now follows.

iii. The proof depends on the fact stated above.

The following theorem is an immediate consequence of the above lemma.

Theorem 10.7.2 Let f_0(·), f_1(·) be polymatroid rank functions on subsets of S and let g(·) be a positive weight function on S. Let (f_0(·), g(·)), (f_1(·), g(·)) have the same principal partition with decreasing sequences of critical values λ_{01}, …, λ_{0t} and λ_{11}, …, λ_{1t}. Then ((f_0 + f_1)(·), g(·)) has the same principal partition with decreasing sequence of critical values λ_{31}, …, λ_{3t}, where λ_{3i} = ((λ_{0i})^{−1} + (λ_{1i})^{−1})^{−1}, i = 1, …, t.
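A quick numeric sanity check of the combination rule λ_3 = ((λ_0)^{−1} + (λ_1)^{−1})^{−1} on a toy pair of modular functions (the example is my own, chosen so that the two functions have the same principal partition):

```python
from itertools import combinations
from fractions import Fraction

S = ['a', 'b', 'c']
w = {'a': 1, 'b': 2, 'c': 4}
g = lambda X: sum(w[e] for e in X)
f0 = lambda X: len(X)            # modular toy "rank" functions
f1 = lambda X: 2 * len(X)

def subsets():
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def B(f, lam):
    """Collection B_lambda: minimizers of lam*f(X) + g(S - X)."""
    vals = {X: lam * f(X) + g(set(S) - X) for X in subsets()}
    best = min(vals.values())
    return {X for X, v in vals.items() if v == best}

lam0, lam1 = Fraction(2), Fraction(1)
lam3 = 1 / (1 / lam0 + 1 / lam1)           # combined critical value 2/3
assert B(f0, lam0) == B(f1, lam1)          # same collection at lam0 and lam1 ...
assert B(lambda X: f0(X) + f1(X), lam3) == B(f0, lam0)   # ... reappears at lam3
```

Here B_{λ_0}^0 = B_{λ_1}^1 = {{c}, {b, c}}, and exactly this collection shows up for f_0 + f_1 at λ_3 = 2/3, as part (iii) of the lemma predicts.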
Problem 10.5 Prove

Theorem 10.7.3 Let f_1(·), f_0(·) be aligned polymatroid rank functions on subsets of S relative to the positive weight function g(·) such that f_i(e) ≤ g(e), i = 0, 1, ∀e ∈ S. Then

i. the principal sequence of (f_1(·) + f_0(·), g(·)) is the coarsest common refinement of those of (f_0(·), g(·)), (f_1(·), g(·));

ii. f_1(·) + f_0(·) is aligned to both f_1(·) and f_0(·) with respect to g(·);

iii. if both f_1(·) and f_0(·) are aligned with f_3(·) with respect to g(·), and further if the principal partitions of (f_0(·), g(·)) and (f_1(·), g(·)) are coarser than that of (f_3(·), g(·)), then f_1(·) + f_0(·) is aligned with f_3(·) and the principal partition of (f_1(·) + f_0(·), g(·)) is coarser than that of (f_3(·), g(·)).

Solution: The assumption f_i(e) ≤ g(e) is not essential. It has been made only to make the proof simpler and also because it is the only case of importance.

Proof of Theorem 10.7.3: Let B_λ^i, i = 0, 1, 2, denote the collection of sets corresponding to λ in the principal partitions of (f_0(·), g(·)), (f_1(·), g(·)) and (f_0(·) + f_1(·), g(·)) respectively. Suppose the set of self loops S_i of f_i(·) contains the set S_j of self loops of f_j(·), where j = i + 1 mod 2. From the definition of alignedness it follows that the principal sequence of (f_j(·), g(·)) has the form ∅, S_j, …, S_i, …. Now by the use of Lemma 10.7.1 it is clear that

• the principal sequence of (f_i(·) + f_j(·), g(·)) will be identical to the above sequence up to the set S_i, and that

• if λ is a critical value of (f_j(·), g(·)) whose maximal minimizing set is contained in S_i, it satisfies the same property with respect to (f_i(·) + f_j(·), g(·)).

For these values of λ the collections of sets are identical both in (f_j(·), g(·)) and in (f_i(·) + f_j(·), g(·)). From Lemma 10.4.3, if C_i is the set of coloops of f_i(·) with respect to g(·), we know that S − C_i is the penultimate set in the principal sequence of (f_i(·), g(·)), i = 0, 1. Suppose next, without loss of generality, that the set of coloops C_0 of f_0(·) contains the set of coloops C_1 of f_1(·). We then have, by Lemma 10.7.1,

• the segment S_i, …, S − C_0 appears in the principal sequence of (f_0(·) + f_1(·), g(·)), since it appears in the principal sequences of both (f_0(·), g(·)) and (f_1(·), g(·));

• if λ_0, λ_1 are the critical values of (f_0(·), g(·)), (f_1(·), g(·)) respectively corresponding to two successive sets in this segment, then B_{λ_0}^0 = B_{λ_1}^1 = B_{λ_2}^2, where λ_2 = ((λ_0)^{−1} + (λ_1)^{−1})^{−1};

• since in the principal partition of (f_0(·), g(·)), S − C_0, S are the minimal and maximal sets in B_{λ=1}^0 and every set between these two is also in this collection, the segment S − C_0, …, S would be identical in the principal sequences of (f_1(·), g(·)) and (f_0(·) + f_1(·), g(·)). If λ_1 is a critical value of (f_1(·), g(·)) corresponding to two successive sets in the above segment, then B_{λ_1}^1 = B_{λ_2}^2, where λ_2 = (1 + (λ_1)^{−1})^{−1}.

All the parts of the theorem are now immediate. □
Problem 10.6 Prove
Lemma 10.7.2 Let f_0(·), f_1(·) be polymatroid rank functions on subsets of S and let g(·) be a positive weight function on S. Let f_0(·), f_1(·) be aligned with respect to g(·). Then
i. f_0 ∗ g(·), f_1(·) are aligned with respect to g(·);
10. CONVOLUTION OF SUBMODULAR FUNCTIONS
ii. if the principal partition of (f_0(·), g(·)) is coarser than that of (f_1(·), g(·)) then the principal partition of (f_0 ∗ g(·), g(·)) is coarser than that of (f_1(·), g(·)).

Solution: Proof of Lemma 10.7.2: Let B^i_λ, i = 0, 1, 2, denote the class of sets corresponding to λ in the respective principal partitions of (f_0(·), g(·)), (f_1(·), g(·)), (f_0 ∗ g(·), g(·)). By Theorem 10.4.4 it is clear that for λ > 1, B^0_λ = B^2_λ. In particular the maximal sets in B^0_λ and B^2_λ are the same for any value of λ > 1. But then by Property PP5 of the principal partition, the minimal sets, say Z, in B^0_λ and B^2_λ are the same for λ ≥ 1. Again by Theorem 10.4.4, S − Z is the set of all coloops of f_0 ∗ g(·) with respect to g(·). Thus we see that

• the principal sequence of (f_0 ∗ g(·), g(·)) has the form ∅, X_1, ..., X_r = Z, S, while that of (f_0(·), g(·)) has the form ∅, X_1, ..., X_r = Z, X_{r+1}, ..., S;

• S − Z is the set of all coloops of f_0 ∗ g(·) with respect to g(·);

• the sets Z and S are the minimal and maximal sets in B^2_λ when λ = 1, and for λ > 1, B^0_λ = B^2_λ.

Thus f_0(·), f_0 ∗ g(·) are aligned with respect to g(·) and further the principal partition of (f_0 ∗ g(·), g(·)) is coarser than that of (f_0(·), g(·)). The second part of the theorem now follows by Theorem 10.7.1. □
Problem 10.7 Prove
Theorem 10.7.4 Let f_0(·), f_1(·), f_3(·) be polymatroid rank functions on subsets of S and let g(·) be a positive weight function on S. Let f_0(·), f_1(·) be aligned with respect to g(·). Let f_3(·) be aligned to both f_0(·) and f_1(·) with respect to g(·) and further let the principal partitions of (f_0(·), g(·)), (f_1(·), g(·)) be coarser than that of (f_3(·), g(·)). Then
i. f_0, (f_0 + f_1) ∗ g(·) and f_1, (f_0 + f_1) ∗ g(·) are aligned with respect to g(·);
ii. the principal partition of ((f_0 + f_1) ∗ g(·), g(·)) is coarser than that of (f_3(·), g(·)).
Solution: The result is immediate from Lemma 10.7.2 and Theorem 10.7.1.
Problem 10.8 Prove
Lemma 10.7.3 Let f_i(·), i = 0, 1 be polymatroid rank functions on the subsets of S and let g(·) be a positive weight function on S with g(e) ≥ f_i(e), ∀e ∈ S, i = 0, 1. Then
i. f_0(·), f_1(·) are aligned with respect to g(·) iff f_0^∗(·), f_1^∗(·) are so aligned. If in addition the principal partition of (f_0(·), g(·)) is coarser than that of (f_1(·), g(·)) then the principal partition of (f_0^∗(·), g(·)) is coarser than that of (f_1^∗(·), g(·)).
ii. f_0^∗(·), f_1(·) are aligned with respect to g(·) iff f_0(·), f_1^∗(·) are so aligned.
iii. if f_0(·), f_1(·) are aligned with respect to g(·) then f_0^∗ ∗ g(·), f_1^∗(·) are so aligned.
Solution: The assumption f_i(e) ≤ g(e) is made only to make the duals into polymatroid rank functions.
Proof of Lemma 10.7.3:
i. This follows from Theorem 10.4.5 and the definition of alignedness.
ii. This follows from the above result and the fact that f_0^∗∗(·) = f_0(·).
iii. This follows from the first part and Lemma 10.7.2.
Definition 10.7.3 Let f_i(·), i = 0, 1 be polymatroid rank functions on the subsets of S and let g(·) be a positive weight function on S with g(e) ≥ f_i(e), ∀e ∈ S, i = 0, 1. We say that the polymatroid rank functions f_0(·), f_1(·) are oppositely aligned with respect to g(·) iff f_0^∗(·), f_1(·) (equivalently, f_1^∗(·), f_0(·)) are aligned with respect to g(·).
The results about alignedness presented thus far permit us to talk of the alignedness of polymatroid rank functions derived from simpler aligned polymatroid rank functions through certain formal expressions involving the operations of addition, convolution with g(·), and dualization. We know that addition of aligned polymatroid rank functions results in another such, convolution with g(·) results in a coarser aligned polymatroid rank function, while dualization oppositely aligns the polymatroid rank function. It is therefore clear that if the formal expressions are constructed according to certain simple rules, then we have complete knowledge of the principal partition associated with the resulting polymatroid rank function. The care that we have to take essentially lies in convolving with g(·) whenever the value of the polymatroid rank function can become greater than that of g(·) at any element; otherwise we cannot use dualization ideas freely.

Definition 10.7.4
• ‘i’ is a positive expression of length 1.
• ‘i∗’ is a negative expression of length 2.
• If ‘w’ is a positive expression of length l − 1 then ‘w ∗ g’ is a positive expression of length l.
• If ‘w’ is a negative expression of length l − 1 then ‘w ∗ g’ is a negative expression of length l.
• If ‘w’ is a positive expression of length l − 1 then ‘(w)∗’ is a negative expression of length l. If ‘w’ is a negative expression of length l − 1 then ‘(w)∗’ is a positive expression of length l.
• If ‘w’ is a positive expression (negative expression) of length l − 1 then ‘(λw) ∗ g’, λ > 0, is a positive expression (negative expression) of length l + 1.
• If ‘w_0’, ‘w_1’ are positive expressions (negative expressions) of lengths k, l respectively, then ‘(w_0 + w_1) ∗ g’ is a positive expression (negative expression) of length k + l + 1.
Remark: If ‘w’ is an expression (positive or negative) with respect to the weight function g(·) on S and f(·) is a polymatroid rank function on subsets of S, then w(f)(·) denotes the polymatroid rank function obtained by replacing all the occurrences of the symbol ‘i’ in ‘w’ by f(·).
Example 10.7.1 Consider the expression ‘(i + i ∗ g) ∗ g’. If this operates on the polymatroid rank function f(·) we get the polymatroid rank function ((f(·) + f ∗ g(·)) ∗ g)(·). The expressions ‘i’, ‘i ∗ g’ are positive with lengths 1, 2 respectively. The expressions ‘i + i ∗ g’, ‘(i + i ∗ g) ∗ g’ are therefore positive with lengths 3, 4 respectively. So the expression ‘((i + i ∗ g) ∗ g)∗ ∗ g’ is negative with length 6 and the expression ‘(((i + i ∗ g) ∗ g)∗ ∗ g + (3i ∗ g)∗) ∗ g’ is negative with length 6 + 4 + 1 = 11.
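These expression operations are easy to prototype by brute force. The sketch below is our own illustration; it assumes the dual with respect to g is f^∗(X) = g(X) − f(S) + f(S − X), and evaluates ‘(i + i ∗ g) ∗ g’ on a small polymatroid rank function.

```python
from itertools import combinations

def subsets(X):
    for r in range(len(X) + 1):
        for c in combinations(sorted(X), r):
            yield frozenset(c)

def conv(f, g):
    # (f * g)(X) = min over Y subset of X of f(Y) + g(X - Y)
    return lambda X: min(f(Y) + g(X - Y) for Y in subsets(X))

def dual(f, g, S):
    # dual with respect to g: f*(X) = g(X) - f(S) + f(S - X)
    return lambda X: g(X) - f(S) + f(S - X)

S = frozenset({1, 2, 3})
g = len                               # weight function g(X) = |X|
f = lambda X: min(len(X), 2)          # polymatroid rank fn with f(e) <= g(e)

i_g = conv(f, g)                      # 'i * g' applied to f
w_f = conv(lambda X: f(X) + i_g(X), g)   # '(i + i * g) * g' applied to f

assert i_g(S) == 2                    # here f * g = f, since f(e) <= g(e)
assert w_f(S) == 3
assert dual(f, g, S)(S) == 1 and dual(f, g, S)(frozenset()) == 0
```

Note that w_f(e) ≤ g(e) on every singleton, as Theorem 10.7.5 below asserts for positive expressions.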
Remark: We will omit inverted commas henceforth while speaking of expressions.

Problem 10.9 Prove
Theorem 10.7.5 Let w be a positive (negative) expression with respect to g(·), a positive weight function on S. Let f(·) be a polymatroid rank function on subsets of S. Then the polymatroid rank function w(f)(·) is aligned to f(·) (respectively f^∗(·)) with respect to g(·) and the principal partition of (w(f)(·), g(·)) is coarser than that of (f(·), g(·)) (respectively (f^∗(·), g(·))); further w(f)(e) ≤ g(e) ∀e ∈ S.

Solution: Proof of Theorem 10.7.5: The proof is by induction using the following results:
• if f_0(·), f_1(·) are aligned with the principal partition of the former coarser than that of the latter, then f_0 ∗ g(·), f_1(·) are aligned, with the former having a coarser principal partition than the latter (Lemma 10.7.2);

• if f_0(·), f_1(·) are aligned with the principal partition of the former coarser than that of the latter, then f_0^∗(·), f_1^∗(·) are aligned, with the former having a coarser principal partition than the latter (Lemma 10.7.3);

• if f_0(·), f_1(·) are aligned with the principal partition of the former coarser than that of the latter, then f_0^∗ ∗ g(·), f_1^∗(·) are aligned, with the former having a coarser principal partition than the latter (Lemma 10.7.3);

• if f_0(·), f_1(·) are aligned with f_2(·) with their principal partitions coarser than that of the latter, then (f_0 + f_1) ∗ g(·) is aligned with f_2(·) and has a coarser principal partition than the latter has (Theorem 10.7.3, Lemma 10.7.2);

• if f_0(·), f_1(·) are aligned with f_2(·) with their principal partitions coarser than that of the latter, then λf_0(·), λf_1(·), λ > 0, are aligned with f_2(·) and have coarser principal partitions than the latter has.
Clearly the theorem is true for expressions of length 1. Suppose it is true for expressions of length less than or equal to k. Let w be a positive (negative) expression of length k + 1. Then w could have been built up out of shorter expressions in one of the following ways, for each of which the results mentioned above enable us to show that w satisfies the theorem.

• w = θ ∗ g, where θ is positive (negative) of length k. Here w remains positive (negative) and the required result follows from Lemma 10.7.2.

• w = (θ)∗, where θ is positive (negative) of length k. Here w becomes negative (positive) and the required result follows from Lemma 10.7.3.

• w = (θ ∗ g)∗, where θ is positive (negative) of length k − 1. Here w becomes negative (positive) and the required result follows from Lemma 10.7.2 and Lemma 10.7.3.

• w = λθ ∗ g, where θ is positive (negative) of length k − 1 and λ > 0. Here w remains positive (negative) and the required result follows from Lemma 10.7.2 and the fact that for any polymatroid rank function f(·), (λf(·), g(·)), λ > 0, has the same principal partition (but different critical values) as (f(·), g(·)).

• w = (θ_1 + θ_2) ∗ g, where θ_i, i = 1, 2 are both positive (negative) of lengths d, k − d respectively. Here w remains positive (negative) and the required result follows from Theorem 10.7.3 and Lemma 10.7.2.

The fact that in each case the resulting polymatroid rank function has lower value on singletons than g(·) follows from the definition of convolution and the properties of the dual. □
When f(·) is molecular with respect to g(·) we can make a stronger statement than in Theorem 10.7.5. In this case w(f)(·) is molecular even if w is not positive or negative.

Problem 10.10 Prove
Let w be an expression involving
• convolution with g(·),
• addition followed by convolution with g(·),
• positive scalar multiplication followed by convolution with g(·),
• dualization with respect to g(·).
Let f(·) be a molecular polymatroid rank function on subsets of S with respect to g(·). Then w(f)(·) is also a molecular polymatroid rank function on subsets of S.
Solution: If f_0(·), f_1(·) are aligned with respect to g(·) we know that (f_0 + f_1) ∗ g(·), λf_0 ∗ g(·), λ > 0, are all aligned to f_1(·) (by Theorem 10.7.3, Lemma 10.7.2, Lemma 10.7.3). Clearly by the definition of alignedness, if f_0(·), f_1(·) are aligned and f_1(·) is molecular then so must f_0(·) be. It follows by induction on the number of elementary operations that w(f)(·) must be molecular if f(·) is molecular.

The next theorem shows that, if we so wish, we could generalize the notion of alignedness using different weight functions instead of a single one.
Problem 10.11 Prove
Theorem 10.7.6 Let f_0(·), f_1(·) be polymatroid rank functions on subsets of S which are molecular respectively with respect to the positive weight functions g_0(·), g_1(·). Then (ρ_0 f_0 + ρ_1 f_1)(·) is molecular with respect to (σ_0 g_0 + σ_1 g_1)(·), where ρ_i, σ_i, i = 0, 1 are greater than zero.
Solution: Proof of Theorem 10.7.6: Suppose not. Then there exist λ > 0 and X ⊆ S such that
λ(ρ_0 f_0 + ρ_1 f_1)(X) + (σ_0 g_0 + σ_1 g_1)(S − X) < λ(ρ_0 f_0 + ρ_1 f_1)(S).
Then,
σ_0 [(λ/σ_0) ρ_0 f_0(X) + g_0(S − X)] + σ_1 [(λ/σ_1) ρ_1 f_1(X) + g_1(S − X)] < σ_0 [(λ/σ_0) ρ_0 f_0(S)] + σ_1 [(λ/σ_1) ρ_1 f_1(S)].
So we must have, for i = 0 or 1,
(λ/σ_i) ρ_i f_i(X) + g_i(S − X) < (λ/σ_i) ρ_i f_i(S).
This contradicts the molecularity of f_i(·) with respect to g_i(·). □
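Molecularity admits a direct finite check: at the candidate critical value λ = g(S)/f(S), every X must satisfy λf(X) + g(S − X) ≥ λf(S). The following brute-force sketch (the example functions and weights are our own, not the book's) confirms Theorem 10.7.6 on one instance.

```python
from itertools import combinations

def subsets(S):
    for r in range(len(S) + 1):
        for c in combinations(sorted(S), r):
            yield frozenset(c)

def is_molecular(f, g, S):
    # (f, g) molecular: at lam = g(S)/f(S) no set beats the empty set and S,
    # i.e. lam*f(X) + g(S - X) >= lam*f(S) for all X
    lam = g(S) / f(S)
    return all(lam * f(X) + g(S - X) >= lam * f(S) - 1e-9 for X in subsets(S))

S = frozenset({1, 2, 3})
f0, g0 = (lambda X: len(X)), (lambda X: len(X))          # molecular pair
f1, g1 = (lambda X: min(len(X), 2)), (lambda X: len(X))  # molecular pair
assert is_molecular(f0, g0, S) and is_molecular(f1, g1, S)

rho0, rho1, sig0, sig1 = 1, 2, 1, 3                      # arbitrary positive weights
F = lambda X: rho0 * f0(X) + rho1 * f1(X)
G = lambda X: sig0 * g0(X) + sig1 * g1(X)
assert is_molecular(F, G, S)                             # Theorem 10.7.6 holds here
```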
Problem 10.12 Let f_0(·), f_1(·) be polymatroid rank functions on subsets of S and let g(·) be a positive weight function on S. Let (f_0(·), g(·)), (f_1(·), g(·)) have identical principal partitions. If the refined partial orders with respect to (f_0(·), g(·)) and (f_1(·), g(·)) are identical, then they are also identical to the refined partial order associated with ((f_0 + f_1)(·), g(·)).

Solution: Lemma 10.7.1 assures us that ((f_0 + f_1)(·), g(·)) also has the same principal partition. Next, the characterization of ideals of ≥_R given in Theorem 10.5.1 is such that if the conditions hold for a particular set with respect to both f_0(·), f_1(·) they would also hold for (f_0 + f_1)(·). This proves the required result.
The program that we carried out for aligned polymatroids can also be carried out for what could be called ‘strongly aligned polymatroids’ (defined below). We sketch these ideas in the following problems; however, we omit the solutions.
Problem 10.13 Let f_0(·), f_1(·) be aligned polymatroid rank functions on subsets of S relative to g(·), a positive weight function on S. We say f_0(·), f_1(·) are strongly aligned relative to g(·) iff, whenever A, B are two blocks present in both the partitions associated with (f_0(·), g(·)) and (f_1(·), g(·)) but which are not coloops or selfloops in either partition, the relationship between A and B is identical in both the refined partial orders. If f_0(·), f_1(·) are strongly aligned polymatroid rank functions relative to the positive weight function g(·), show that (f_0 + f_1)(·) is strongly aligned to both f_0(·) as well as f_1(·) relative to g(·).

Problem 10.14 Let f(·) be a polymatroid rank function on subsets of S and let g(·) be a positive weight function on S. Show that f(·) is strongly aligned with λf ∗ g(·), λ > 0.
Problem 10.15 Let f_1(·), f_0(·) be strongly aligned polymatroid rank functions on subsets of S relative to a positive weight function g(·). If both f_1(·) and f_0(·) are strongly aligned with f_3(·) with respect to g(·), and further the principal partitions of (f_0(·), g(·)) and (f_1(·), g(·)) are coarser than that of (f_3(·), g(·)), then show that (f_1(·) + f_0(·)) ∗ g(·) is strongly aligned with f_3(·) relative to g(·) and the principal partition of (f_1(·) + f_0(·)) ∗ g(·) is coarser than that of f_3(·).
Problem 10.16 Show that the statement obtained by replacing ‘aligned’ by ‘strongly aligned’ in Theorem 10.7.5 is true.

Problem 10.17 Let f_1(·), f_2(·) be two polymatroid rank functions on subsets of S with identical principal partitions relative to the positive weight function g(·) and identical refined partial order given in Figure 10.4(a). The critical value sequences are given to be λ_11 = 5, λ_12 = 4, λ_13 = 3, λ_14 = 2 and λ_21 = 4, λ_22 = 3, λ_23 = 3/2, λ_24 = 4/3. Describe the principal partition and refined partial order of
i. …
ii. …
iii. ((((2f_1 ∗ g)^∗ + f_1^∗) ∗ g)^∗(·), g(·))
iv. In the previous parts compute the value of the functions on S.
10.8 Notes
Convolution, as an operation on submodular functions, was probably first studied systematically by Edmonds [Edmonds70]. Principal partition began with graphs when Kishi and Kajitani [Kishi+Kajitani68] decomposed a graph into three parts (for λ = 2). These ideas were generalized to matroids for integral λ by Bruno and Weinberg [Bruno+Weinberg71] and for rational (real) λ independently by Tomizawa and Narayanan [Tomizawa76], [Narayanan74]. For about a decade and a half, from the late sixties to the middle eighties, extensive work was done in Japan on the principal partition, its extensions and applications. Good surveys of this work may be found in [Iri79a], [Iri79b], [Iri+Fujishige81], [Tomizawa+Fujishige82], [Iri83] and in the comprehensive monograph due to Fujishige [Fujishige91]. Extensions of the basic ideas may be found for instance in [Ozawa74], [Ozawa75], [Ozawa76], [Ozawa+Kajitani79], in several papers due to Tomizawa in Japanese [Tomizawa80a], [Tomizawa80b], [Tomizawa80c], [Tomizawa80d] etc., in [Fujishige80a], [Fujishige80b], [Nakamura+Iri81], [Iri84], [Murota88] etc. Applications may be found, apart from the above mentioned surveys and monograph, in the following very partial list of references: [Iri71], [Tomi+Iri74], [Iri+Tomi76], [Ozawa76], [Fujishige78b], [Sugihara79], [Sugihara80], [Sugihara+Iri80], [Sugihara82], [Iri+Tsunekawa+Murota82], [Sugihara83], [Sugihara84], [Sugihara86], [Murota+Iri85], [Murota87], [Murota90]. The west, except for the notable case of Bruno and Weinberg, has been largely immune to the principal partition virus. Recently, however, there have been some signs of activity (see for instance [Catlin+Grossman+Hobbs+Lai92]). This chapter, as far as the discussion on principal partition theory goes, is in the main a translation of the author's PhD thesis [Narayanan74], which used matroid union and partition as basic notions, to the language of convolution of polymatroids and submodular functions (Subsections 10.4.5, 10.4.6, Section 10.7 etc. are very natural for matroids). We have adopted this approach because it is elementary and the extensions follow naturally. The readers interested in pursuing this subject further would do well to begin with the above mentioned survey papers of Iri. Those interested in studying these ideas and their extensions in the context of convex programming are referred to [Fujishige91].
10.9 Solutions of Exercises
E 10.1: Let B_1, B_2 be bipartite graphs on V_L ≡ {a, b, c}, V_R ≡ {1, 2, 3, 4} with adjacency functions Γ_1, Γ_2 defined as follows: … Hence … This shows that (|Γ_1| ∧ |Γ_2|)(·) is not submodular. But we do know that |Γ_1(·)|, |Γ_2(·)| are submodular.
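The adjacency tables of B_1, B_2 did not survive the scan, so the sketch below uses bipartite adjacencies of our own choosing to make the same point: each |Γ_i(·)| is submodular, but their minimum need not be.

```python
from itertools import combinations

def subsets(S):
    for r in range(len(S) + 1):
        for c in combinations(sorted(S), r):
            yield frozenset(c)

def is_submodular(f, S):
    return all(f(X) + f(Y) >= f(X | Y) + f(X & Y)
               for X in subsets(S) for Y in subsets(S))

VL = frozenset({'a', 'b'})
G1 = {'a': {1}, 'b': {2, 3}}     # hypothetical adjacencies, not the book's
G2 = {'a': {2, 3}, 'b': {1}}

f1 = lambda X: len(set().union(*[G1[v] for v in X]))   # |Gamma1(X)|
f2 = lambda X: len(set().union(*[G2[v] for v in X]))   # |Gamma2(X)|
h = lambda X: min(f1(X), f2(X))                        # (|Gamma1| ^ |Gamma2|)(X)

assert is_submodular(f1, VL) and is_submodular(f2, VL)
assert not is_submodular(h, VL)   # h({a}) + h({b}) = 2 < 3 = h({a,b}) + h({})
```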
E 10.2: i. We know that f ∗ g(·) = (f(·) − f(∅)) ∗ (g(·) − g(∅)) + f(∅) + g(∅). Thus f ∗ g(·) is submodular if (f(·) − f(∅)) ∗ (g(·) − g(∅)) is submodular. Calling these new functions f′(·), g′(·) respectively, we observe, using the fact that f′(∅) = 0, by Theorem 9.6.1, that f′ ∗ g′(·) is submodular. The next two instances are special cases of the above result.
E 10.3: Since f_1(·) ≥ (f_1 ∗ g)(·) and f_2(·) ≥ (f_2 ∗ g)(·), we have LHS ≥ RHS. Now we prove the reverse inequality. We have

((f_1 ∗ g + f_2 ∗ g) ∗ g)(S) = min over Z ⊆ S of ( min over Y ⊆ Z of (f_1(Y) + g(Z − Y)) + min over Y ⊆ Z of (f_2(Y) + g(Z − Y)) + g(S − Z) )
= (f_1(Y_1) + g(Z − Y_1)) + (f_2(Y_2) + g(Z − Y_2)) + g(S − Z)
(for some Y_1, Y_2, Z with Y_1, Y_2 ⊆ Z ⊆ S)
≥ f_1(Y_1 ∩ Y_2) + f_2(Y_1 ∩ Y_2) + g(S − (Y_1 ∩ Y_2))
(since f_1(·), f_2(·) are increasing and g(·) is a non-negative weight function)
≥ ((f_1 + f_2) ∗ g)(S).

E 10.4: Suppose X_1, X_2 belong to B_{f,g}. We then have
f ∗ g(S) = f(X_1) + g(S − X_1) = f(X_2) + g(S − X_2).
Thus
2 f ∗ g(S) = f(X_1) + g(S − X_1) + f(X_2) + g(S − X_2)
≥ f(X_1 ∪ X_2) + g(S − (X_1 ∪ X_2)) + f(X_1 ∩ X_2) + g(S − (X_1 ∩ X_2)),
using the submodularity of f(·), g(·). By the definition of convolution the only way the inequality can be satisfied is to have
f ∗ g(S) = f(X_1 ∪ X_2) + g(S − (X_1 ∪ X_2)) = f(X_1 ∩ X_2) + g(S − (X_1 ∩ X_2)).
Thus X_1 ∪ X_2, X_1 ∩ X_2 belong to B_{f,g}.
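E 10.4's closure property can be confirmed exhaustively: the minimizers of f(X) + g(S − X) form a lattice. A brute-force sketch on an example of our own:

```python
from itertools import combinations

def subsets(S):
    for r in range(len(S) + 1):
        for c in combinations(sorted(S), r):
            yield frozenset(c)

S = frozenset({1, 2, 3, 4})
f = lambda X: min(len(X & {1, 2}), 1) + len(X & {3, 4})   # submodular
g = lambda X: 0.5 * len(X)                                # positive weight fn

vals = {X: f(X) + g(S - X) for X in subsets(S)}
m = min(vals.values())
B = {X for X, v in vals.items() if abs(v - m) < 1e-9}     # B_{f,g}, the minimizers

# closed under union and intersection
assert all(X | Y in B and X & Y in B for X in B for Y in B)
assert len(B) > 1   # the check is not vacuous for this example
```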
E 10.5: Consider the expression
min over Y, S − K ⊆ Y ⊆ (S − K) ∪ X, of ( f(Y) + g(((S − K) ∪ X) − Y) − f(S − K) )   ... (∗).
Next,
(f ∘ K ∗ g ∘ K)(X) = min over Z ⊆ X of ( f ∘ K(Z) + g ∘ K(X − Z) ) = min over Z ⊆ X of ( f((S − K) ∪ Z) − f(S − K) + g(X − Z) )   ... (∗∗).
Taking Y = (S − K) ∪ Z we see that (∗) and (∗∗) are identical minimization problems. The result follows.
E 10.27: We have, for X ⊆ T_3,
(f|T_1 ∘ T_3)(X) = f(X ∪ (T_1 − T_3)) − f(T_1 − T_3),
(f|T_2 ∘ T_3)(X) = f(X ∪ (T_2 − T_3)) − f(T_2 − T_3),
and, by hypothesis, f(T_1) − f(T_1 − T_3) = f(T_2) − f(T_2 − T_3). By submodularity of f(·),
f(X ∪ (T_1 − T_3)) − f(T_1 − T_3) ≤ f(X ∪ (T_2 − T_3)) − f(T_2 − T_3).
Suppose the above inequality is strict. Then
f(T_1) − f(X ∪ (T_1 − T_3)) > f(T_2) − f(X ∪ (T_2 − T_3)),
which contradicts the submodularity of f(·). We conclude that the inequality must be satisfied as an equality, which proves the required result.
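The submodularity step above (contracting out a larger restriction can only lower the minor) is easy to check exhaustively for a concrete submodular f; the coverage function below is our own choice.

```python
from itertools import combinations

def subsets(S):
    for r in range(len(S) + 1):
        for c in combinations(sorted(S), r):
            yield frozenset(c)

# coverage function: f(X) = size of the union of the sets A_e, e in X (submodular)
A = {1: {'p'}, 2: {'p', 'q'}, 3: {'q', 'r'}, 4: {'r', 's'}}
f = lambda X: len(set().union(*[A[e] for e in X]))

T1 = frozenset({1, 2, 3, 4})
T2 = frozenset({1, 2, 3})
T3 = frozenset({1, 2})          # T3 within T2 within T1

minor = lambda T: (lambda X: f(X | (T - T3)) - f(T - T3))   # (f|T o T3)(X)
for X in subsets(T3):
    assert minor(T1)(X) <= minor(T2)(X)
```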
E 10.28: Each of X_1 ∩ E_k, X_2 ∩ E_k, (X_1 ∪ X_2) ∩ E_k, (X_1 ∩ X_2) ∩ E_k is a union of blocks of Π_PP in an ideal of ≥_k. By the manner in which ≥_k was extended to blocks in E_{k+1} and by use of Lemma 10.5.1, we know that X_1 − E_k is contraction related to X_1 ∩ E_k, X_2 − E_k to X_2 ∩ E_k, (X_1 ∪ X_2) − E_k to (X_1 ∪ X_2) ∩ E_k and (X_1 ∩ X_2) − E_k to (X_1 ∩ X_2) ∩ E_k. Since we assume that ≥_k is a modular refinement of ≥_R over blocks contained in E_k, and the concerned sets correspond to ideals of ≥_k, we have
f(X_1 ∩ E_k) + f(X_2 ∩ E_k) = f((X_1 ∪ X_2) ∩ E_k) + f((X_1 ∩ X_2) ∩ E_k).   (∗)
Further we have, by the contraction relations above,
f(X_i) = f(X_i ∩ E_k) + f(X_i ∪ E_k) − f(E_k), i = 1, 2, and similarly for X_1 ∪ X_2 and X_1 ∩ X_2.
Using (∗) this reduces to
f(X_1) + f(X_2) = f(X_1 ∪ X_2) + f(X_1 ∩ X_2),
as desired.
E 10.29: Clearly x/E_1 is a base for (f/E_1)(·). On the blocks of Π_PP contained in E_1 both the partial orders agree. So the statement is true in this case. Next suppose x/E_k is a base for (f/E_k)(·) and the statement is true for this case. We will show that it is also true for the base x/E_{k+1} of (f/E_{k+1})(·). Now x/E_{k+1} is consistent with respect to ≥_π. Let X be the union of blocks of an ideal of ≥_R. Then X ∩ E_k corresponds to an ideal of ≥_R and by the induction assumption we must have
x(X ∩ E_k) = f(X ∩ E_k).   (∗)
Further, X ∪ E_k and E_k correspond to ideals of ≥_R. Hence, x(X ∪ E_k) = f(X ∪ E_k) and x(E_k) = f(E_k). Hence,
x(X − E_k) = f(X ∪ E_k) − f(E_k) = f(X) − f(X ∩ E_k),
since X − E_k is contraction related to X ∩ E_k. Hence, using (∗), x(X) = f(X).
E 10.30: Let E_1, ..., E_t be the principal sequence of (f(·), g(·)). Let X_{i_1}, ..., X_{i_p} be the earliest distinct nonvoid terms (in the same order) in the sequence E_1 ∩ X, ..., E_t ∩ X and λ_{i_1}, ..., λ_{i_p} be the corresponding critical values. Then ∅, X_{i_1}, ..., X_{i_p} is the principal sequence of ((f/X)(·), (g/X)(·)) and λ_{i_1}, ..., λ_{i_p} is the decreasing sequence of critical values. The partition Π′ associated with the principal partition would be the collection of blocks of Π_PP which are contained in X. The partial order associated with the principal partition would be the restriction of ≥_π to Π′ and the refined partial order would be the restriction of ≥_R to Π′. We only sketch the proof. It can be seen that E_{i_1} ∩ X is molecular with critical value λ_{i_1}, and that if E_{i_1} ∩ X is contracted, ((E_{i_2} − E_{i_1}) ∩ X) is molecular with critical value λ_{i_2}, and so on. (By the definition of ≥_R, using Exercise 10.27,
(f/((E_{i_1} ∩ X) ∪ E_{i_1 − 1}) ∘ (E_{i_1} ∩ X))(·) = (f/E_{i_1} ∩ X)(·),
noting that E_{i_1 − 1} ∩ X = ∅. Further,
(f/((E_{i_2} ∩ X) ∪ E_{i_2 − 1}) ∘ ((E_{i_2} − E_{i_1}) ∩ X))(·) = (f/(E_{i_1} ∩ X) ∘ ((E_{i_2} − E_{i_1}) ∩ X))(·).)
Also if Y is an ideal of ≥_R, then Y ∩ X ∩ E_{i_1}, if nonnull, would be molecular with critical value λ_{i_1}, and when X ∩ E_{i_1} (or Y ∩ E_{i_1}) is contracted, Y ∩ X ∩ (E_{i_2} − E_{i_1}), if nonnull, would be molecular with critical value λ_{i_2}, etc. Thus, using Theorem 10.4.6, the partial order associated with the principal partition of ((f/X)(·), (g/X)(·)) is the restriction of ≥_π to Π′. Further if Y is the union of blocks of Π_PP in an ideal of ≥_R contained in X, Y ∩ E_{i_1} would similarly correspond to an ideal of ≥_R. Thus, Y ∩ (E_{i_2} − E_{i_1}) would be properly related, in the principal partition of (f(·), g(·)), to Y ∩ E_{i_1} which is contained in X. Using this argument inductively it follows that the ideals of the refined partial order of ((f/X)(·), (g/X)(·)) are precisely the ideals of ≥_R of (f(·), g(·)) contained in X, as required.
E 10.31: We will show that the collection of all complements of ideals of ≥_R with respect to (f(·), g(·)) satisfies the characteristic properties (given in Theorem 10.5.1) of the ideals of the refined partial order with respect to (f^∗(·), g(·)). We claim that the ideals of this partial order are precisely the complements of ideals of ≥_R. (We note that this statement is true for the partial orders associated with the principal partitions of (f(·), g(·)), (f^∗(·), g(·)) respectively.) Let Z_σ, Z^σ denote respectively the minimal and maximal members of B^∗_σ, the collection of subsets which minimize the expression σ f^∗(X′) + g(S − X′). We need to verify two conditions when Z is a complement of an ideal X of ≥_R.

i. We know that (by Theorem 10.4.5), for some λ,
Z_{σ_{r+1}} = S − X^{λ_r},  Z^{σ_r} = S − X_{λ_{r+1}},
where X_{λ_r}, X_{λ_{r+1}} are the minimal members respectively of B_{λ_r}, B_{λ_{r+1}}. Next,
(Z ∩ (S − X_{λ_r})) ∪ (S − X_{λ_{r+1}}) = ((S − X) ∩ (S − X_{λ_r})) ∪ (S − X_{λ_{r+1}})
= S − ((X ∪ X_{λ_r}) ∩ X_{λ_{r+1}})
= S − [(X ∩ X_{λ_{r+1}}) ∪ X_{λ_r}].
Thus the LHS is the complement of a member of B_{λ_r} (whose maximal member is X_{λ_{r+1}}). It is therefore a member of B^∗_{σ_{r+1}} (whose minimal member is Z_{σ_{r+1}} = S − X_{λ_{r+1}}). This proves the first condition.
ii. This follows directly by using the definition of f^∗(·), that Z is a complement of an ideal X in ≥_R, that Z_{σ_{r+1}}, Z_{σ_r} are complements of X_{λ_r}, X_{λ_{r+1}} respectively, and that any two ideals of ≥_R form a modular pair for f(·).

E 10.32: i. Let h_1(Y), h_2(Y) denote respectively f(Y) + g(X − Y) and f(Y) + g((X ∪ e) − Y), e ∈ (S − X). It is easily verified that these functions are submodular. Now h_2(Y) = h_1(Y) + g(e) for Y ⊆ X. Hence, h_2(Y) > h_2(X_min) ∀Y ⊊ X_min, and h_2(Y) ≥ h_2(X_max) ∀Y ⊆ X. So using the submodular inequality for h_2(·) on X_min and (X ∪ e)_min, we conclude that (X ∪ e)_min ⊇ X_min. Similarly using it on X_max and (X ∪ e)_max, we conclude that (X ∪ e)_max ⊇ X_max.
ii. If f(X_max ∪ e) = f(X_max), then
(f ∗ g)(X) = f(X_max) + g(X − X_max) = f(X_max ∪ e) + g(X − X_max) ≥ (f ∗ g)(X ∪ e).
However, we know that (since f(·), g(·) are increasing) (f ∗ g)(X ∪ e) ≥ (f ∗ g)(X) (by Theorem 10.3.1). Hence, (f ∗ g)(X) = (f ∗ g)(X ∪ e).
Next let (f ∗ g)(X) = (f ∗ g)(X ∪ e). We know (X ∪ e)_max ⊇ X_max. Suppose e ∉ (X ∪ e)_max. Then
(f ∗ g)(X ∪ e) = f((X ∪ e)_max) + g((X ∪ e) − (X ∪ e)_max) > f((X ∪ e)_max) + g(X − (X ∪ e)_max),
since g(·) is a positive weight function. But the last expression on the RHS is ≥ (f ∗ g)(X), which is a contradiction. Hence, e ∈ (X ∪ e)_max. We then have
(f ∗ g)(X) ≤ f((X ∪ e)_max − e) + g(X − ((X ∪ e)_max − e)) ≤ f((X ∪ e)_max) + g((X ∪ e) − (X ∪ e)_max) = (f ∗ g)(X ∪ e) = (f ∗ g)(X).
The only way these inequalities can be satisfied is through equalities. Hence, (X ∪ e)_max − e ⊆ X_max, i.e., (X ∪ e)_max ⊆ X_max ∪ e, and therefore, (X ∪ e)_max = X_max ∪ e. Now we have f((X ∪ e)_max − e) = f((X ∪ e)_max). Hence, f(X_max) = f(X_max ∪ e), and therefore, (f ∗ g)(X) = (f ∗ g)(X ∪ e) iff f(X_max) = f(X_max ∪ e).
To compute S_max efficiently we start from e_1 ∈ S and grow it to S in the following manner. Suppose we have reached the set X = {e_1, ..., e_k} in this process and know X_max. For each e ∈ S − X we check if f(X_max) = f(X_max ∪ e). If so we discard e. If f(X_max) ≠ f(X_max ∪ e′) we update X to the set X ∪ e′. We compute (X ∪ e′)_max. We continue this process until we reach a set Y with (f ∗ g)(Y) = (f ∗ g)(S). Observe that …
… x(S_j ∩ E_r) > f(S_j ∩ E_r) or x(S_j ∪ E_r) > f(S_j ∪ E_r). In the former eventuality we pick the set S_j ∩ E_r for our subsequent arguments. Otherwise we repeat the process with S_j ∪ E_r. Thus without loss of generality we may assume that E_r ⊇ S_j ⊇ E_{r−1} for some r. Now we have, by the definition of x,
(x(S_j) − x(E_{r−1})) / (g(S_j) − g(E_{r−1})) = 1/λ_r.
But x(S_j) > f(S_j) while x(E_{r−1}) = f(E_{r−1}). Hence,
λ_r f(S_j) + g(S − S_j) < λ_r f(E_{r−1}) + g(S − E_{r−1}),
i.e., λ_r f(S_j) − g(S_j) < λ_r f(E_{r−1}) − g(E_{r−1}). This contradicts the fact that E_{r−1} minimizes λ_r f(X) + g(S − X). We conclude that x(S_j) ≤ f(S_j), j = 1, ..., n, and hence, by Theorem 9.7.2, x is a base of P_f.
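The base x used in this solution (x(e)/g(e) = 1/λ_r on E_r − E_{r−1}) can be built and checked by brute force: compute the principal sequence of (f, g), assign the per-segment rates, and verify that x lies in P_f and is tight on S. The example functions below are our own.

```python
from itertools import combinations

def subsets(S):
    for r in range(len(S) + 1):
        for c in combinations(sorted(S), r):
            yield frozenset(c)

def principal_sequence(f, g, S):
    """Nested maximal minimizers of lam*f(X) + g(S - X) for decreasing critical lam."""
    cands = {(g(S - X) - g(S - Y)) / (f(Y) - f(X))
             for X in subsets(S) for Y in subsets(S) if X < Y and f(Y) > f(X)}
    seq = []
    for lam in sorted(cands, reverse=True):
        best = min(lam * f(Z) + g(S - Z) for Z in subsets(S))
        mins = [X for X in subsets(S)
                if abs(lam * f(X) + g(S - X) - best) < 1e-9]
        top = max(mins, key=len)                 # maximal minimizer (lattice top)
        if top and (not seq or top != seq[-1][1]):
            seq.append((lam, top))
    return seq                                   # [(lam_1, E_1), (lam_2, E_2), ...]

S = frozenset({1, 2, 3, 4})
g = len
f = lambda X: min(len(X & {1, 2}), 1) + len(X & {3, 4})

x, prev = {}, frozenset()
for lam, E in principal_sequence(f, g, S):
    for e in E - prev:
        x[e] = 1.0 / lam            # x(e)/g(e) = 1/lam_r on E_r - E_{r-1}
    prev = E

xv = lambda X: sum(x[e] for e in X)
assert abs(xv(S) - f(S)) < 1e-9                        # x is a base: x(S) = f(S)
assert all(xv(X) <= f(X) + 1e-9 for X in subsets(S))   # x lies in P_f
```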
iii. Let x′ be any other base of P_f with
x′(e′_1)/g(e′_1) ≤ ... ≤ x′(e′_n)/g(e′_n).
10.10 Solutions of Problems
Define 5’5 = {e:, . . . ,el}, j = 1 , . . * ,n. Let us say that the set E , is broken by a sequence e : , . . . , e k iff for some i, e: # E, but e l E E,. for some j > i . Let t be the least index for which z’(e:)/g(ei) # z ( e t ) / g ( e t ) .We need to show that z’(ei)/g(ei)< x ( e t ) / g ( e t ) .We will prove this by contradiction. Let z’(e:)/g(ei)> z ( e t ) / g ( e t ) .We proceed by first proving the following claim. Claim: Sets E l , . . . ,Eli are not broken by e i , . . . ,e i p 1 . Suppose not. Let m be the least index for which Em is broken by e: , . . ,eLVl. Now e l , . . . , et does not break E l , . . . ,Eh. So if et E Ej - Ej-1 it is clear, since El C EZ C . . . c En, that m 5 j and therefore, Am 2 A j . Hence, +
1
z ( e t ) / g ( e t )L -.
Xrn
Let q be the last index for which e i E E m . Since e l , ’ . . ,ell does not break Em, it follows that ell E E, - Em,s > rn. Hence, if q 5 t - 1, we have z’(eb) -- - z(eq)> de;)
and if q
dell)
1
-,
Am
> t - 1, we have
Thus it is clear that in every case
1
z’(eb)
deb) Now let { e l l , .
> -.
Am
. , e : , } = Em - Em-l, where i l < . . . < iw.If i l < t , then ~ ’ ( e : ,) z(ei1) -g(e:,)
(since
> -1
deil) -
Am
is not broken by e:, . . . , eh, and, therefore, eil @ Em-l). If i l 2 t , then
z’(e’) 1 We thus see that for each e‘ E Em - Em-l, -> g(e’) - Am
It follows that
10. C O N V O L U T I O N OF SUBMODULAR FUNCTIONS
448
%'(Em)-x'(Em-I) 1 > -. g ( E m ) - g(Em-1) Am Now E l , . . , E m P 1axe not broken by e ; , ' . . , eiPl. They are also not broken by { e ; , . . . , eLP1}. It follows that e l , . . ,en. Further, et E Ej with j 2 m. So Em-l for each e', E Ei - Ei-1, i 5 m - 1, we must have .
a.e.,
z'(el>
--
s(e2
4%)dez)
1 A2
Hence, z ' ( E i ) = z ( E i ) ,i = 1 , . . . ,m - 1. But z ( E i ) = f ( E i ) , i = 1 , .. . , k. Hence, d(E,,-l) = f ( E m - l ) . Hence,
f ( E m )- f ( E m - 1 ) g ( E m ) - g(Em-1)
,.'(Em)
-
- z'(Em-1)
g ( E m ) - g(Em-1)
1
> -.
Am
This is a contradiction, as we have shown in the previous section. This contradiction can be avoided only if E_1, ..., E_k is not broken by e′_1, ..., e′_{t−1}. Thus, the claim is justified.

Now let E_v ⊇ S′_{t−1} ⊇ E_{v−1}. We consider two cases.
Case 1: E_v ⊇ S′_t. Suppose S′_{t−1} − E_{v−1} = {e′_{j_1}, ..., e′_{j_y}}, j_1 < ... < j_y. Now … Further, … Hence, for each e′ ∈ E_v − S′_{t−1}, … It follows that … By arguments used in the previous sections of this problem, … Further, x′(E_r) ≤ f(E_r). We therefore have …
Since x′(E_r) = x(E_r) = f(E_r), and x′(E_{r+1}) ≤ f(E_{r+1}), it follows that …
Since the assumption that x′(e′_t)/g(e′_t) > x(e_t)/g(e_t) leads us to a contradiction in every case, and since the two sides are not equal, we conclude that
x′(e′_t)/g(e′_t) < x(e_t)/g(e_t).

iv. From the arguments of the previous section, the only F-lexicographically optimum base is the one defined by
x(e) = g(e)/λ_r, e ∈ E_r − E_{r−1}, r = 1, ..., t.
This definition yields a unique base. So the F-lexicographically optimum base is unique.
P 10.2: [Fujishigegl] i. This has been already shown for the case where g ( . ) is a general submodular function (in the proof of P P l ) . ii. We proceed as in the proof of P P 2 slightly modifying the notation of that proof. Let PI ( X ) E f ( X ) alg(S - X ) P Z(X ) 3 f ( X ) m g ( S - X ) . If 2 1 minimizes p i ( . ) and Y c 21 we have
+
+
Pz(Y) = P l ( Y ) + ( m - m)dS - Y ) Pz (Zl) = Pl(&
+ ( g 2 - g1Ids - 21)
Since g ( . ) is increasing, Y c Z1 and cr.. > P I , pz(Y) 2 ~ ~ ( 2 1 ) .Hence, there is a set 1 that minimizes pz(.)and contains Z1 (by Theorem 9.4.1). Hence, Y u l C Ygz. If 2 is the unique set that minimizes P I ( . ) we have p i (Y) > pl(Zl), VY c 21. Hence, P z ( Y >> P z ( Z , ) , w c 21. Thus, in this case every set that minimizes pi(.)would contain 21. Thus, Yg22 Y g l .
iii. The proof is similar to the case where σ ≥ 0. Note that σ·g_d(·) is a submodular function if g(·) is submodular and σ is negative. iv. We have three cases: a. 0 ≤ σ1 < σ2.
b. σ1 < 0 < σ2. c. σ1 < σ2 ≤ 0. Cases (a) and (c) have already been considered in the previous sections of the present problem. So we consider only Case (b). Case (b): Let X1 ∈ B_{σ1}, X2 ∈ B_{σ2}. We use the following facts:
g(S − (X1 ∩ X2)) − [g(S − X2) − g(S − (X1 ∪ X2))] ≤ g(S − X1),

σ2(g(S − (X1 ∩ X2)) − g(S − X1)) ≥ 0,

σ1(g(X1 ∩ X2) − g(X1)) ≥ 0.

We have

f(X1) + σ1(g(S) − g(X1)) + f(X2) + σ2 g(S − X2)
≥ f(X1 ∪ X2) + f(X1 ∩ X2) + σ2 g(S − (X1 ∪ X2)) + σ1(g(S) − g(X1 ∩ X2))
  + σ2(g(S − (X1 ∩ X2)) − g(S − X1)) + σ1(g(S) − g(X1)) − σ1(g(S) − g(X1 ∩ X2))
≥ f(X1 ∪ X2) + σ2 g(S − (X1 ∪ X2)) + f(X1 ∩ X2) + σ1(g(S) − g(X1 ∩ X2)).

The only way the final inequality can be satisfied is to have f(X1) + σ1(g(S) − g(X1)) = f(X1 ∩ X2) + σ1(g(S) − g(X1 ∩ X2)) and f(X2) + σ2 g(S − X2) = f(X1 ∪ X2) + σ2 g(S − (X1 ∪ X2)). The required result now follows.
P 10.17: Familiarity with Theorems 10.4.4, 10.4.5, 10.7.2 and Problems 10.12 and 10.10 is assumed in the following solution. A polymatroid rank function f(·) and another obtained from it by taking the direct sum of the structures f/X_{λ_{i+1}} ∘ (X_{λ_{i+1}} − X_{λ_i})(·) have identical principal partitions, and would continue to have identical principal partitions even if both are operated on by positive or negative expressions. This fact can be proved similarly to the way in which Theorem 10.7.5 is proved, and is also used below. i.
(f1 + f2)(·) would have identical principal partition and partial order as f1(·) and f2(·). The critical values would change as follows:

λ2 = (1/4 + 1/3)^{−1} = 12/7,
λ3 = (1/3 + 2/3)^{−1} = 1,
λ4 = (1/2 + 3/4)^{−1} = 4/5.

((f1 + f2) ∗ g) would have D1 ∪ D2 ∪ C1 ∪ C2 as coloops. The refined partial order would not change as far as A, B1, B2, B3 are concerned. The elements in D1 ∪ D2 ∪ C1 ∪ C2 would remain as isolated vertices in the Hasse diagram.
ii. The function ((f1 + f2) ∗ g)∗(·) would have D1 ∪ D2 ∪ C1 ∪ C2 as self-loops. The refined partial order would be the dual of that of ((f1 + f2) ∗ g)(·). The critical values of ((f1 + f2) ∗ g)(·) are 20/9, 12/7 and 1, where the value 1 corresponds to D1 ∪ D2 ∪ C1 ∪ C2. So the critical values of ((f1 + f2) ∗ g)∗(·) are

∞, (1 − 7/12)^{−1} = 12/5, (1 − 9/20)^{−1} = 20/11,

where the critical value ∞ corresponds to D1 ∪ D2 ∪ C1 ∪ C2, (1 − 7/12)^{−1} = 12/5 corresponds to B1 ∪ B2 ∪ B3 and (1 − 9/20)^{−1} = 20/11 corresponds to A.
A quick way of computing the principal partition in such cases is to examine what happens to each of the molecular structures f/X_{λ_{i+1}} ∘ (X_{λ_{i+1}} − X_{λ_i})(·) corresponding to the different critical values. In the following discussion, for notational convenience, the function which is the direct sum of the above functions on the sets (X_{λ_{i+1}} − X_{λ_i}) will be denoted by f_⊕(·). Let us examine what would happen to the molecular structure on B1 ∪ B2 ∪ B3 in ((f1 + f2) ∗ g)∗(·). The corresponding critical value in f1(·) is 4 and in f2(·) it is 3. When two functions have molecular (f_i(·), g(·)) structure, adding them corresponds to addition of the reciprocals of the critical values. So the critical value for (f1 + f2)(·) is (1/4 + 1/3)^{−1} = 12/7. This value is above 1. So convolution with g(·) will not affect the critical value. Dualization would replace λ by (1 − 1/λ)^{−1}. So we get (1 − 7/12)^{−1} = 12/5 as the critical value for ((f1 + f2) ∗ g)∗(·). This does not correspond to coloops. So we know the refined partial order to be as in the case of f1(·) and f2(·) as far as B1, B2, B3 and elements below these are concerned. If, however, for an intermediate expression w(f1, f2) the critical value for E_{i+1} − E_i (where the E_i are the sets in the principal sequence) falls ≤ 1, convolving with g(·) will make that region into coloops for the resulting function. In the present case no such intermediate expression occurs.
iii. Let us use the above technique with critical values to study the principal partition of (((2f1 ∗ g)∗ + f1∗) ∗ g)∗(·). Critical value λ11 for f1(·) becomes (1/2)λ11 for 2f1(·). Since (1/2)λ11 > 1, the corresponding critical value continues to be (1/2)λ11 for 2f1 ∗ g(·). Next, (1/2)λ11 for 2f1 ∗ g(·) becomes (1 − 2/λ11)^{−1} for (2f1 ∗ g)∗(·).

So λ11 = 5 for f1(·) becomes the critical value (1 − 2/5)^{−1} = 5/3 for (2f1 ∗ g)∗(·). Next, corresponding to λ11, f1∗(·) has critical value (1 − 1/5)^{−1} = 5/4. So ((2f1 ∗ g)∗ + f1∗)(·) has critical value (3/5 + 4/5)^{−1} = 5/7 corresponding to λ11. This value is less than 1. So convolution with g(·) will convert this set A into coloops, and dualization would make these into self-loops. Repeating this computation with λ12, λ13 we find that B1 ∪ B2 ∪ B3 and C1 ∪ C2 also become sets of self-loops in (((2f1 ∗ g)∗ + f1∗) ∗ g)∗(·). However, in the case λ14 = 2, 2f1(·) has critical value 1. So (2f1 ∗ g)(·) has only coloops, and its dual has only self-loops, in D1 ∪ D2. Hence, ((2f1 ∗ g)∗ + f1∗)(·) would coincide with f1∗(·) on D1 ∪ D2. Now f1∗(·) has critical value 2. So does f1∗ ∗ g(·) and also (f1∗ ∗ g)∗(·). Thus the function (((2f1 ∗ g)∗ + f1∗) ∗ g)∗(·) has the critical value 2 on D1 ∪ D2 and, since S − (D1 ∪ D2) is made of self-loops, the same value also on S.

Calculations in the case of the other functions are similar.
Chapter 11

Matroid Union

11.1 Introduction
In this chapter we study the important operation of matroid union. We first prove that the matroid union operation yields a matroid if we start with matroids. There are many ways of proving this result. We have chosen a route which has the merit of displaying, along the way, some of the deepest results in matroid theory (e.g. Rado's Theorem). However, space considerations have prevented us from giving the results the motivation they deserve. The key notion in our development is the idea of a submodular function induced through a bipartite graph. Next we present an algorithm for matroid union that is an immediate extension of Edmonds' famous algorithm for matroid partition [Edmonds65a]. We study this algorithm in detail and, as a consequence, the structure of the union matroid. Finally we use this algorithm to construct the principal partition of the rank function of a matroid with respect to the | · | function.
11.2 Submodular Functions induced through a Bipartite Graph
In this section we study the effect that a submodular function, defined on one side of a bipartite graph, has on the other side. Results of this nature include some of the deepest (e.g. Rado’s Theorem on independent transversals) and some of the most practically useful (e.g. Matroid Union Theorem) in matroid theory. The treatment in this section largely follows that of Welsh [Welsh76]. However, we choose to work in the context of bipartite graphs rather than in that of families of sets.
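The induced-function phenomenon described above can be verified by brute force on small instances. The following sketch is not from the text: the bipartite graph and the function f are invented for illustration, and we check that X ↦ f(Γ_L(X)) is increasing and submodular on subsets of V_L, as Theorem 11.2.1 below asserts.

```python
from itertools import chain, combinations

# Hypothetical small bipartite graph: left vertices and their right neighbours.
VL = ["a", "b", "c"]
adj = {"a": {0, 1}, "b": {1, 2}, "c": {2, 3}}

def gamma_L(X):
    """Gamma_L(X): set of right vertices adjacent to the left set X."""
    return frozenset().union(*(adj[x] for x in X)) if X else frozenset()

def f(Y):
    """An increasing submodular function on subsets of V_R (rank of U_{3,4})."""
    return min(len(Y), 3)

def g(X):
    """The induced function X -> f(Gamma_L(X)) on subsets of V_L."""
    return f(gamma_L(X))

def subsets(ground):
    s = list(ground)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

P = subsets(VL)
increasing = all(g(A) <= g(B) for A in P for B in P if A <= B)
submodular = all(g(A) + g(B) >= g(A | B) + g(A & B) for A in P for B in P)
print(increasing, submodular)
```

Swapping in any other increasing submodular f (a matroid rank, a coverage function) leaves both checks true; dropping monotonicity of f can break submodularity of the induced function.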
Remark: In this chapter and in subsequent chapters, maximal independent sets would be denoted invariably by b and not by B.
We begin with a simple result which is a restatement of the one found in Problem 9.3 (second part).

Theorem 11.2.1 Let B = (V_L, V_R, E) be a bipartite graph. Let f(·) be an increasing submodular function on subsets of V_R. Then f(Γ_L(·)) (where Γ_L(X), X ⊆ V_L, is the set of vertices adjacent to X in B) is an increasing submodular function on subsets of V_L. (For proof see the solution of the above mentioned problem.)

It is convenient at this stage to introduce terms which are commonly used in 'transversal theory'. Let B = (V_L, V_R, E) be a bipartite graph. A family (x_i : i ∈ I) of vertices in V_L is a system of representatives (SR) of a subset Y ≡ {y_i : i ∈ I} of V_R iff there exists a bijection η : I → I s.t. x_i ∈ Γ_R(y_{η(i)}) (where Γ_R(X), X ⊆ V_R, is the set of vertices adjacent to X in B). (We remind the reader that the definition of family (see page 16) permits x_i = x_j even if i ≠ j.) The system of representatives of Y becomes a system of distinct representatives (SDR) or a transversal of Y iff x_i ≠ x_j, i ≠ j. Alternatively, a set T ⊆ V_L is a transversal of Y iff there is a bijection η : T → I s.t. x ∈ Γ_R(y_{η(x)}) ∀x ∈ T. (A convenient way of defining this bijection when Y = V_R is to use the same index set for both the family of left vertices in an SR and the right vertex set V_R, and to take the bijection to be the identity mapping.) A matching in B is a set of edges of B no two of which have a vertex in common. Thus, T ⊆ V_L is a transversal of Y iff there exists a matching in B with T as the set of left vertices and Y as the set of right vertices. Throughout this section, index sets (such as I) are taken to be finite. The set underlying a family (x_i : i ∈ I) is denoted by {x_i : i ∈ I}.

Theorem 11.2.2 (Welsh [Welsh76]) Let B = (V_L, V_R, E) be a bipartite graph with no parallel edges. Let V_R = {y_i : i ∈ I}. Let f(·) be an increasing submodular function on subsets of V_L. Then V_R has a system of representatives (x_i : i ∈ I) such that
f({x_i : i ∈ J}) ≥ |J|   ∀J ⊆ I   (*)

iff

f(Γ_R(Y)) ≥ |Y|   ∀Y ⊆ V_R.   (**)
Proof: Only if: Let (x_i : i ∈ I) be an SR of V_R. Let Y ≡ {y_i : i ∈ J}. (We note that by definition y_i ≠ y_j, i ≠ j.) Then Γ_R(Y) = ∪_{i∈J} Γ_R(y_i) ⊇ {x_i : i ∈ J}, since x_i ∈ Γ_R(y_i) ∀i ∈ I. Further f(·) is increasing. Hence, f(Γ_R(Y)) ≥ f({x_i : i ∈ J}) ≥ |J| = |Y|. So (*) is satisfied.

If: Let (**) be satisfied. The proof is by induction on the number of edges in the bipartite graph. Clearly the result is true if there is only one edge. Let us assume that the result is true for all bipartite graphs with k edges and let B have k + 1 edges. The result is in fact trivially true (there can be no SR) unless each vertex in V_R has degree at least 1. If each vertex in V_R has degree 1 the result is immediately true (the SR is (Γ_R(y_i) : i ∈ I)). So we assume that each vertex of V_R has degree at least 1 and, without loss of generality, that vertex y1 ∈ V_R has degree at least 2.
Let edges (z1, y1), (z2, y1) be incident on y1. Let B1 (B2) be the bipartite graph obtained by deleting (z1, y1) ((z2, y1)) from B. We claim that one of B1, B2 must satisfy (**) (with Γ_R(·) replaced by their right adjacency functions Γ_{R1}(·), Γ_{R2}(·) respectively). Suppose not. Then there must exist subsets Y1, Y2 of V_R − {y1} s.t. (**) fails in B1 for Y1 ∪ {y1} and in B2 for Y2 ∪ {y1}. Now

|Y1| + |Y2| + 1 = |Y1 ∪ Y2| + |Y1 ∩ Y2| + 1 > f(Γ_R(y1) ∪ Γ_R(Y1 ∪ Y2)) + f(Γ_R(Y1 ∩ Y2)).

Further, f(Γ_R(Y1 ∩ Y2)) ≥ |Y1 ∩ Y2|
by (**). But then f(Γ_R(y1) ∪ Γ_R(Y1 ∪ Y2)) < |Y1 ∪ Y2| + 1, contradicting (**).
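The special case f = |·| of Theorem 11.2.2 is Hall's theorem: condition (**) becomes Hall's condition, and an SR satisfying (*) becomes an SDR (a transversal). A minimal brute-force sketch (hypothetical bipartite graph, invented for illustration):

```python
from itertools import chain, combinations, permutations

# Hypothetical bipartite graph: each right vertex lists its left neighbours.
VL = ["p", "q", "r", "s"]
gammaR = {"y1": {"p", "q"}, "y2": {"q", "r"}, "y3": {"p", "s"}}

def gamma_R(Y):
    return set().union(*(gammaR[y] for y in Y)) if Y else set()

def subsets(ground):
    s = list(ground)
    return [set(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

# Condition (**) with f = cardinality, i.e. Hall's condition.
cond = all(len(gamma_R(Y)) >= len(Y) for Y in subsets(gammaR))

# Brute-force search for a transversal (SDR): distinct left vertices,
# one chosen from each right vertex's neighbourhood.
ys = list(gammaR)
has_sdr = any(all(x in gammaR[y] for x, y in zip(p, ys))
              for p in permutations(VL, len(ys)))
print(cond, has_sdr)
```

On every instance the two booleans agree, which is the content of the theorem for this choice of f; replacing f by a matroid rank function gives Rado's theorem on independent transversals.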
Exercise 11.11 Let b1, ..., bk be bases of the matroid M s.t. ∪_{i=1}^k b_i is a base of M^k.
i. In the graph G(b1, ..., bk) show that the shortest path between two vertices cannot exceed length r.
Exercise 11.12 (k) Convert Algorithm Matroid Union to an algorithm for finding the maximum size common independent set of two matroids M1, M2.
11.3.4 Structure of the Matroid Union
The Algorithm Matroid Union gives some insight into the structure of the union of matroids. Specifically, using the algorithm, we present simple results on the set of non-coloops, circuits and f-circuits for this matroid. Finally, we discuss the idea of approachability relative to a base of the union and show that this notion is helpful in the construction of the principal partition of a matroid. We define bases b1, ..., bk of M1, ..., Mk respectively to be maximally distant iff ∪_{i=1}^k b_i is a base of M1 ∨ ... ∨ Mk (i.e., iff ∪_{i=1}^k b_i cannot be enlarged).
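Maximally distant bases can be found by brute force on tiny instances, and compared against the union-rank formula r∨(S) = min_X (Σ_i r_i(X) + |S − X|) that appears in Lemma 11.3.3 below. The sketch uses an invented graph (two copies of its graphic matroid), not an example from the text.

```python
from itertools import chain, combinations

# Hypothetical small graph; edges are the ground set S of a graphic matroid.
edges = {0: (0, 1), 1: (1, 2), 2: (0, 2), 3: (2, 3)}
S = set(edges)

def rank(X):
    """Graphic-matroid rank of an edge set, via union-find."""
    parent = {}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    r = 0
    for e in X:
        u, w = edges[e]
        parent.setdefault(u, u); parent.setdefault(w, w)
        ru, rw = find(u), find(w)
        if ru != rw:
            parent[ru] = rw
            r += 1
    return r

def subsets(ground):
    s = list(ground)
    return [set(c) for c in chain.from_iterable(
        combinations(s, k) for k in range(len(s) + 1))]

bases = [B for B in subsets(S) if len(B) == rank(S) and rank(B) == rank(S)]

# Maximally distant pair of bases of (M, M): maximize |b1 U b2|.
best = max(len(b1 | b2) for b1 in bases for b2 in bases)

# Rank of M v M on S, via the union-rank formula with r1 = r2 = rank.
union_rank = min(rank(X) + rank(X) + len(S - X) for X in subsets(S))
print(best, union_rank)
```

The maximum achievable |b1 ∪ b2| coincides with the union rank, which is exactly what makes such a pair "maximally distant".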
Lemma 11.3.3 Let b1, ..., bk be maximally distant bases of matroids M1, ..., Mk respectively on S. Let b∨ = ∪_{i=1}^k b_i and let R be the set of all elements reachable from S − ∪_{i=1}^k b_i in the graph G(b1, ..., bk). Then

i. b1 ∩ R, ..., bk ∩ R are pairwise disjoint bases of M1 · R, ..., Mk · R,

ii. M1 · R, ..., Mk · R do not contain any coloops within R,

iii. R is the unique minimal set that minimizes Σ_{i=1}^k r_i(X) + |S − X|, X ⊆ S, equivalently, minimizes Σ_{i=1}^k r_i(X) − |X|, X ⊆ S,

iv. S − R is the set of coloops of M1 ∨ ... ∨ Mk.
Proof:

i. By definition, R ⊇ S − b∨ and is the set of all elements reachable from elements in S − b∨ in G(b1, ..., bk). Further, the sets R ∩ b_i are pairwise disjoint. Otherwise an element v_m ∈ b_i ∩ b_j is reachable from some v_0 ∈ S − b∨ in G(b1, ..., bk). Use of STEP 4 of Algorithm Matroid Union would allow us to enlarge ∪ b_i, a contradiction. Next, if R ∩ b_j is not a base of M_j · R, then for some v ∈ R − b_j, L_j(v, b_j) has an element v′ ∉ R. But then v is reachable from some v_0 ∈ S − b∨ and so must v′ be. Thus, v′ ∈ R, a contradiction.
ii. We have to show that the base b_j ∩ R of M_j · R contains no coloop of M_j · R. Let v_m ∈ b_j ∩ R. Then v_m is reachable from v_0 ∈ S − b∨ in G(b1, ..., bk). But this means there is v_{m−1} ∈ R − b_j, s.t. L_j(v_{m−1}, b_j) contains v_m. So v_m is not a coloop of M_j · R.
iii. A set minimizes Σ_{i=1}^k r_i(X) + |S − X|, X ⊆ S, iff it minimizes Σ_{i=1}^k r_i(X) − |X|, X ⊆ S, i.e., iff it maximizes |X| − Σ_{i=1}^k r_i(X), X ⊆ S. We will work with the latter function. We have

|X| − Σ_{i=1}^k r_i(X) ≤ |X| − r∨(X),

where r∨(·) is the rank function of the union of the matroids. Now |X| − r∨(X) is a supermodular function, takes value zero on the null set and 1 or 0 on singletons. It is therefore an increasing function and reaches its maximum on S. Next, |S| − r∨(S) = |S − b∨|. We know that R ⊇ S − b∨. Further, the sets R ∩ b_i are pairwise disjoint bases of M_i · R, i = 1, ..., k. Hence,

|R| − r∨(R) = |R| − Σ_{i=1}^k r_i(R) = |S − b∨| = |S| − r∨(S).

Thus, |X| − Σ r_i(X) reaches a maximum at R. This function is supermodular and use of the supermodular inequality reveals that it has a unique minimal set maximizing it. So it suffices to show that no proper subset of R maximizes the function. Suppose R′ ⊂ R is such that

|R′| − Σ_{i=1}^k r_i(R′) = |S − b∨|.
We conclude that |S − b∨| = |R′ − b∨ ∩ R′| and |b∨ ∩ R′| = Σ_{i=1}^k r_i(R′). Since S − b∨ ⊇ R′ − b∨ ∩ R′, we have S − b∨ = R′ − b∨ ∩ R′. Since |b∨ ∩ R′| = Σ_{i=1}^k r_i(R′), the sets R′ ∩ b_i are pairwise disjoint bases of M_i · R′, i = 1, ..., k. Hence, from the elements in S − b∨ it is impossible to reach outside R′ in G(b1, ..., bk), i.e., R′ ⊇ R. Thus, R = R′, i.e., R is the minimal set that maximizes |X| − Σ_{i=1}^k r_i(X).
iv. This part follows by Lemma 10.4.3. We give an alternative proof because the technique of the proof is useful. We need to show that S − R is contained in every base of the union, and that for each v ∈ R there is some base of the union that does not contain it. If we initialize Algorithm Matroid Union on (b1, ..., bk) the algorithm must output the same set of bases. Now if b′_1, ..., b′_k are bases of M1, ..., Mk s.t. ∪_{i=1}^k b′_i is a base of the union, we must have |∪_{i=1}^k b′_i| = |∪_{i=1}^k b_i|. Hence, ∪_{i=1}^k b′_i ⊇ S − R, as otherwise its size would be less than that of ∪_{i=1}^k b_i. Next let v_m ∈ R. If v_m ∉ ∪ b_i we already have a base of the union to which v_m does not belong. So let v_m ∈ ∪ b_i. There exists v ∈ S − ∪ b_i from which v_m can be reached in G(b1, ..., bk). Using a shortest path from v to v_m, we apply STEP 4 of Algorithm Matroid Union. This would result in a new set of bases b″_1, ..., b″_k, the cardinality of whose union would be the same as that of ∪ b_i. However, v ∈ ∪ b″_i but v_m ∉ ∪ b″_i. Thus, we have found a base of the union to which v_m does not belong. So we conclude that v_m is not a coloop. □
Using arguments similar to the proof of the above lemma we can prove the following two results.

Lemma 11.3.4 Let b1, ..., bk be bases of matroids M1, ..., Mk respectively on S s.t. ∪_{i=1}^k b_i is a base b∨ of M1 ∨ ... ∨ Mk. Let v ∈ S − ∪_{i=1}^k b_i. Let R_v be the set of all elements reachable from v in G(b1, ..., bk). Then the f-circuit L∨(v, b∨) = R_v.

We thus see that R_v does not change in G(b′_1, ..., b′_k) provided ∪ b′_i = ∪ b_i and this subset is a base of the matroid union.

Lemma 11.3.5 Let C be a circuit of M1 ∨ ... ∨ Mk. Then C can be expressed as v ⊎ b¹ ⊎ ... ⊎ b^k, where b^i is a base of M_i · C, i = 1, ..., k, the b^i are pairwise disjoint, and every element of ∪_{i=1}^k b^i can be reached from v in G(b¹, ..., b^k).
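Lemma 11.3.3 (iii)-(iv) can be illustrated by brute force: on an invented instance (two copies of a graphic matroid; not from the text), compute the unique minimal minimizer R of Σ_i r_i(X) + |S − X| and check that S − R is exactly the set of coloops of the union, i.e. the elements lying in every maximum-size union of independent sets.

```python
from itertools import chain, combinations

# Hypothetical instance: two parallel triangles on vertices {0,1,2} plus a
# pendant edge; we take k = 2 copies of the graphic matroid M, so the union
# is M v M.
edges = {0: (0, 1), 1: (1, 2), 2: (0, 2),
         3: (0, 1), 4: (1, 2), 5: (0, 2), 6: (2, 3)}
S = frozenset(edges)

def rank(X):
    parent = {}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]; v = parent[v]
        return v
    r = 0
    for e in X:
        u, w = edges[e]
        parent.setdefault(u, u); parent.setdefault(w, w)
        ru, rw = find(u), find(w)
        if ru != rw:
            parent[ru] = rw; r += 1
    return r

def subsets(ground):
    s = list(ground)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, k) for k in range(len(s) + 1))]

allX = subsets(S)
val = {X: 2 * rank(X) + len(S - X) for X in allX}   # sum r_i(X) + |S - X|, k = 2
m = min(val.values())
minimizers = [X for X in allX if val[X] == m]
R = frozenset.intersection(*minimizers)             # unique minimal minimizer
assert val[R] == m

# Bases of M v M: maximum-size unions b1 U b2 of two independent sets.
ind = [X for X in allX if rank(X) == len(X)]
best = max(len(a | b) for a in ind for b in ind)
union_bases = {a | b for a in ind for b in ind if len(a | b) == best}
coloops = frozenset.intersection(*union_bases)
print(sorted(R), sorted(coloops), best == m)
```

Here the pendant edge is the unique coloop of the union, and R is its complement, matching the lemma.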
Approachability

Let b1, ..., bk be maximally distant bases of matroids M1, ..., Mk respectively on S. Let b∨ = ∪_{i=1}^k b_i. Let D be the set of all vertices of G(b1, ..., bk) contained in b∨
from which no element common to more than one b_i can be reached. It is clear that D ∩ b_i are bases of M_i · D, i = 1, ..., k, and further are disjoint. Let v_p, v_q ∈ D. We say v_q is approachable from v_p relative to (b1, ..., bk) iff v_q is reachable from v_p in G(b1, ..., bk). Approachability depends on the base b∨ of M1 ∨ ... ∨ Mk and the matroids M1, ..., Mk, but not on the individual b_i. However, it cannot be defined knowing only M1 ∨ ... ∨ Mk (without knowledge of M1, ..., Mk). This idea is algorithmically useful in building the principal partition of a matroid rank function. We collect all the useful properties of approachability in the following lemma. We remind the reader that minimizing Σ r_i(X) − |X| is equivalent to minimizing Σ r_i(X) + |S − X|.

Lemma 11.3.6 Let b1, ..., bk be maximally distant bases of matroids M1, ..., Mk respectively on S. Let b∨ ≡ ∪ b_i and let D ⊆ b∨ be the set of all vertices of G(b1, ..., bk) contained in b∨ from which no element common to more than one b_i can be reached. Let f(·) ≡ (Σ r_i(·)) − |·|, where r_i(·) is the rank function of M_i, i = 1, ..., k. Then,

i. f(·) reaches a minimum on D among all subsets of b∨ and D is the unique maximal such set,

ii. if v_p ∈ D, then there is a unique minimal subset A_p of D on which f(·) reaches a minimum among all subsets of b∨ s.t. v_p ∈ A_p; further, A_p is the set of all elements approachable from v_p relative to (b1, ..., bk),

iii. if R is the set of all elements reachable from S − b∨ in the graph G(b1, ..., bk), then R ∪ D is the maximal subset of S that minimizes f(·) over subsets of S,

iv. if v_p ∈ D − R then R ∪ A_p minimizes f(·) over subsets of S and is the minimal set containing v_p with this property.

Proof:
i. The function f(·) is obtained by summing submodular functions and subtracting a modular function, and so is submodular. Let X ⊆ b∨. Then, since ∪_i (b_i ∩ X) = X, it is clear that f(X) ≥ 0. Next let b∨ ⊇ D′ ⊃ D. If the b_i ∩ D′ are all pairwise disjoint bases of M_i · D′, i = 1, ..., k, then D′ does not contain any element common to more than one b_i and it is impossible to reach outside D′ from v ∈ D′ in G(b1, ..., bk). Hence, D′ ⊆ D, a contradiction. So we must have that either the b_i ∩ D′ have nonvoid intersections or they do not all form bases of M_i · D′. In either case f(D′) > 0. On the other hand, f(D) = 0, since D ∩ b_i are bases of M_i · D, i = 1, ..., k, and pairwise disjoint. So the minimum value of f(·) over subsets of b∨ is reached at D and it is the maximal such set. The unique maximality of D follows from the submodularity of f(·).
ii. If there are two distinct minimal sets which minimize f(·) among subsets of b∨ containing v_p, by using the submodular inequality it would follow that their intersection is also a set of the same kind. This would contradict the minimality of the sets. So there must be a unique minimal set A_p which minimizes f(·) among
subsets of bv containing u p . Also up E D . So f ( D ) = f ( A p )= O.Thus A, minimizes f(.)among subsets of bv and further is the unique minimal such set containing up.
Since f(A_p) = 0, Σ_{i=1}^k r_i(A_p) = |A_p|. Further, ∪_{i=1}^k (b_i ∩ A_p) = A_p. We conclude that the b_i ∩ A_p are pairwise disjoint bases of M_i · A_p, i = 1, ..., k. Since v_p ∈ A_p, the set A′_p of all elements approachable from v_p relative to (b1, ..., bk) cannot have an element from S − A_p. Hence, A′_p ⊆ A_p. On the other hand, it is easy to see that A′_p has the property that the b_i ∩ A′_p are pairwise disjoint bases of M_i · A′_p, i = 1, ..., k. Hence, f(A′_p) = 0. Also v_p ∈ A′_p and hence A′_p must contain the minimal set A_p that contains v_p and minimizes f(·) among subsets of b∨. So A′_p ⊇ A_p. Thus A′_p = A_p.
iii. By Lemma 11.3.3 we know that R minimizes f(·). Consider the submodular inequality: f(R) + f(D) ≥ f(R ∪ D) + f(R ∩ D).
We know that f(D) ≤ f(R ∩ D). Hence, f(R) ≥ f(R ∪ D), i.e., R ∪ D minimizes f(·).
Now R = (S − b∨) ⊎ (R ∩ b∨). Further, R ∩ b_i is a base of M_i · R, i = 1, ..., k, and these bases are pairwise disjoint. So f(R ∩ b∨) = 0 and hence R ∩ b∨ ⊆ D. Therefore,

f(R) = (Σ_i r_i(R) − |R ∩ b∨|) − |R − b∨|
     = (Σ_i r_i(R ∩ b∨) − |R ∩ b∨|) − |S − b∨|
     = f(R ∩ b∨) − |S − b∨|
     = −|S − b∨|.
Since S − R ⊆ b∨, in order to show that R ∪ D is the maximal subset of S that minimizes f(·) over subsets of S, it suffices to prove that f(R ∪ D′) > f(R ∪ D) whenever b∨ ⊇ D′ ⊃ D. We have

f(R ∪ D′) = Σ_i r_i(R ∪ D′) − |R ∪ D′|
          = (Σ_i r_i(R ∪ D′) − |D′|) − |R − b∨|
          = (Σ_i r_i(D′) − |D′|) − |R − b∨|

(since R is spanned by R ∩ b∨ in each M_i and R ∩ b∨ ⊆ D ⊆ D′). We saw earlier that the D′ ∩ b_i are either not all bases of M_i · D′, i = 1, ..., k, or are not pairwise disjoint. Hence, f(D′) > 0. Thus, f(R ∪ D′) = f(D′) − |R − b∨| > 0 − |S − b∨|. We have
already seen that f(R ∪ D) = f(R) = −|S − b∨|. Thus, f(R ∪ D′) > f(R ∪ D), as needed.
iv. We have,

f(R ∪ A_p) = Σ_{i=1}^k r_i(R ∪ A_p) − |R ∪ A_p|
           = (Σ_{i=1}^k r_i(R ∪ A_p) − |(R ∪ A_p) ∩ D|) − |R − b∨|
           = (Σ_{i=1}^k r_i(R ∪ A_p) − |(R ∪ A_p) ∩ D|) − |S − b∨|.

Now A_p ∩ b_i are bases of M_i · A_p, i = 1, ..., k, and R ∩ b_i are bases of M_i · R, i = 1, ..., k. So (R ∪ A_p) ∩ b_i spans R ∪ A_p in M_i, i = 1, ..., k. Further, the b_i are independent in M_i and the b_i ∩ D are pairwise disjoint. Also R ∩ b∨, A_p ∩ b∨ are subsets of D. Hence, (R ∪ A_p) ∩ b_i are pairwise disjoint bases of M_i · (R ∪ A_p), i = 1, ..., k. Hence,

Σ_{i=1}^k r_i((R ∪ A_p) ∩ D) − |(R ∪ A_p) ∩ D| = 0.

Hence,

f(R ∪ A_p) = (Σ_{i=1}^k r_i((R ∪ A_p) ∩ D) − |(R ∪ A_p) ∩ D|) − |S − b∨| = −|S − b∨|.

Thus, R ∪ A_p minimizes f(·). Next suppose v_p ∈ R ∪ A″_p and R ∪ A″_p minimizes f(·). We must have

−|S − b∨| = f(R ∪ A″_p) = (Σ_{i=1}^k r_i(R ∪ A″_p) − |(R ∪ A″_p) ∩ b∨|) − |(R ∪ A″_p) − b∨|.

Since S − b∨ = R − b∨ we must have S − b∨ = (R ∪ A″_p) − b∨ and therefore,

Σ_{i=1}^k r_i(R ∪ A″_p) − |(R ∪ A″_p) ∩ b∨| = 0.

This can happen only if the (R ∪ A″_p) ∩ b_i span M_i · (R ∪ A″_p), i = 1, ..., k, and are pairwise disjoint. Thus f((R ∪ A″_p) ∩ b∨) = 0. So (R ∪ A″_p) ∩ b∨ minimizes f(·) among subsets of b∨ containing v_p and therefore (R ∪ A″_p) ∩ b∨ ⊇ A_p, by the definition of A_p. Hence,

R ∪ A_p = (R − b∨) ∪ ((R ∪ A_p) ∩ b∨) ⊆ (R − b∨) ∪ ((R ∪ A″_p) ∩ b∨) ⊆ R ∪ A″_p,

as required. □
11.4 PP of the Rank Function of a Matroid

11.4.1 Constructing B_λ
Algorithm Matroid Union can be used as a basic subroutine in building the principal partition of (r(·), |·|), where r(·) is the rank function of a matroid.
Consider the problem of building B_λ. Our way of constructing this family would be as in Remark 10.6.1 of page 415, i.e., we first find the minimal and maximal sets, X_λ, X^λ respectively, minimizing the function λr(X) + |S − X|. Next we find, for each e ∈ X^λ, the minimal minimizing set containing e. From this we can build the Hasse diagram of the preorder (≥_λ) whose ideals are the members of B_λ. Finding sets that minimize λr(X) + |S − X|, where λ = p/q with p, q positive integers and X ⊆ S, is equivalent to computing minimizing sets of p·r(X) + q·|S − X|. (Observe that the λ's that come up in this case when Algorithm P-sequence is used are all rational. Further, p ≤ |S| and q ≤ r(S). Hence one can use a technique called 'balanced bisection', which is described on page 541.) The following lemma is useful in solving this latter problem.
f(n’),then in fi we could replace N 1 , . . . , Nk by the blocks of IIfand get a new partition nnem s.t. < f@).
f(n,,,)
ii. Proof similar to the above. 0
Using Theorem 12.2.1 and Lemma 12.2.2 we can prove the following results
Theorem 12.2.2 [NarayananSl] Let f(.) be intersecting submodular (intersecting supermodular) over subsets of S and let II, $2 minimize (maximize) f(.)over P s . Then i.
nl V n, ,n, A &I
also minimize (maxim‘ze) ?(.).
485
12.2. DILWORTH TRUNCATION
ii. if N , , . . . , Nk are some of the blocks of IIl and M I , . . . , A4,.are some of the blocks of IIz such that Ni M j = 8 V i ,j and (U N i ) U (UM j ) = S , then the partition {NI ... )
)
n
Nk , M I
)
. . . ,M ~ minimizes } (maximizes ) ?(.).
Proof: We consider only the intersecting submodular case. i.(a) 111V II2 minimizes f(.): Let N1 be a block of
f(ni)+ f ( n ~ ,2>?(Hi
V
n2.By Theorem
n ~ ,+)f(ni A n ~ , ) .
12.2.1,
(*I
By Lemma 12.2.2,
m v , Hence, Hence,
f(.)reaches
f(W 2 f(H1 v K V , ) a minimum also at 111 V n ~ ,Repeating .
n, V n , ~v,. . . v IIN, and IIN,+,for j
of
Ilp,
I ml A n N l ).
we have the required result.
i.(b) II1 A II2 minimizes f(.): above,
the argument with = 1 t o T - 1, where T is the number of blocks
Let Ni be a block of
ml v n,
XIZ.
By the argument used
) = f(nl>.
But using this in the inequality (*) and using Lemma 12.2.2, we see that f(HN,)
Let
and
II2
=
f(nl A n,>.
have T blocks. Then by the definition of I I N ~we , have
r
f(nl A 1121 = C f(n1 A
-
(T
- 1)
i= 1
SO f(II2)
Cf(e). eES
= f(n1 A
nz).
ii. The result follows directly by the use of Lemma 12.2.2. Theorem 12.2.3 [NarayanangEib] Let f(.) be intersecting submodular (intersecting supermodular) over subsets of S and let X Y s. Let II minimize (maximize) ?(.) over P x . Then there exists a II’ in P y such that the blocks of II are contained in the blocks of II’ and XI’ minimizes (maximizes) 7(.)over P y .
s
486
12. DILWORTH TRUNCATION
Proof : We handle only the intersecting submodular case. Let II,II” minimize f(.)over Px.J(.) over P y respectively. Suppose II has a block N that is not contained in any block of II”. We then have by Theorem 12.2.1, taking ITN t o be a partition of L’,
f(n,) + f(n”)2 f ( n A~n”)+ f(n, V n”). Now f(n,) 5 f ( n A~n”),using Lemma 12.2.2. Hence, f(n”)2 f(n, v H”). Thus, IIN V II” minimizes f(.)over P y and has a block containing N . Repeating
this process yields a partition 11’ of Y such that its blocks contain the blocks of II. 0
Theorem 12.2.4 [Lov&z83] Iff (.) is intersecting submodular (intersecting supermodular) on subsets of S then ft(.) (ft(.))is submodular (supermodular). Proof: Select minimizing partitions II(X), JI(X u a ) ,I I ( Y )II(Y , u a ) respectively for f(.) over P x , f(.)over PxU,, f(.)over P y , over P y u , s.t. each block of I I ( X ) is contained in some block of n ( X U u ) as well as some block of n(Y)and each block n(Y)is contained in some block of II(Y U u ) . This is possible by Theorem 12.2.3. Let N,, M , be the blocks of n ( X U a ) , II(Y U a ) respectively that have u as a member. By Lemma 12.2.2, there is no loss of generality in assuming that the blocks outside N , are identical in II(X U a ) and n ( X ) and those outside M , are identical in II(Y U a ) and II(Y). We need to show that
I(.)
u a ) )- f ( r w ) .
f(n(xu a ) ) - f(Ww 1
n(x)
We can cancel terms involving common blocks between II(X U a ) and and those involving common blocks between II(Y ua) and n(Y).Let N1, . . . ,Nk be the blocks of n ( X ) contained in N , and M I , . . . , M r be the blocks of n(Y)contained in M,. (Observe that each N , is contained in some M J ) . Thus, we need to show that
f(Nn)-f({N1,.“,Nk}) L f ( M n ) -f({Ml,..‘,MT))
Let n + denote ~ the partition of M , that has blocks M I ,. . . ,M , and { u } . Let I I N ~ denote the partition of M , that has N , as a block and all the rest as singletons. We then have by Theorem 12.2.1, f(nNa)
+ f(II+M) 2 f(n,
/ I n+M)
+ fwiv, v n + M ) .
On both sides there are singleton terms corresponding to elements in M , - N , and a. If we cancel these and shift terms appropriately we have f(~a) -
f ( { ~ :., *
‘ 7
~ i jL ).f(n~,v n
where Ni , . . . , Nl is a partition of N
+ ~-)f({MI,.. . induced by KIN- A n + ~ .
>
Mr j),
The required result now follows, since we must have by Lemma 12.2.2 that
487
12.2. DILWORTH TRUNCATION
f({N{,'",N,'}) 2 . f ( { N l , ' * * , N k } ) and . f ( n N , v n + M ) 2 f(Ma). Exercise 12.2 Let f(.) be submodular on subsets of S . Iff(.) is increasing, show that so is jt(.). Exercise 12.3 [Lov&z83] Let f(.),g ( . ) be submodular functions on subsets of S with g ( . ) 5 f(.).I f g(8) = 0, show that
12.2.2
Examples
We now list a number of examples from the literature relevant to the Dilworth truncation operation.
i. Truncation of matroids(Di1worth [Dilworth44], see also [Mason81]) Let M be a matroid on S . Let s k be the collection of k-rank flats of M . Build a matroid M k on s k such that 0
each element of 5'1, has rank 1 if A is a flat of M with rank p > k then A, the collection of all k-rank flats of M contained in A , is a flat of M k with rank p - (k - 1).
Solution Let P X denote the collection of all partitions of X function r k ( . ) on subsets of S k as follows: Tk(0)
= O , r k ( x ) = minnepx C
X,En
(T'
- (k -
Here T ' ( . ) is the function induced by T ( . ) on subsets of s X ) . Clearly, T k ( ' ) = (T' - (k - l ) ) t ( . ) .
k
s s k . Define the rank I))(x~).
(i.e., d ( X )
~ (Y u , ) ,yi E
It can be shown that is a matroid rank function
0
rk(.)
0
if A is a flat of M with rank p > k then
,d is a flat of M
k
with rank p - (k- 1).
ii. Hybrid rank relative t o a partition of the edges of a graph("arayanan901) The problem described below arises when we attempt to solve an electrical network by decomposing it. First we define two operations on graphs. A node pair fusion means fusing two specified vertices 'UI,212 into a single vertex v12 while a node fission means splitting a node ' ~ 1into v l l , 2112, making some of the edges incident at
12. DILWORTH TRUNCATION
488
u I now incident at u1 and the remaining at 1112. We are given a partition n of the edge set E ( G ) of a graph G such that the subgraph on each block of the partition
is connected. Find a sequence of fusion and fission operations least in number such that the resulting graph has no circuit intersecting more than one block of n.
Solution (This problem is handled in detail in Section 14.4).It is easy to see that one cannot lose if one performs fusion operations first and then fission operations. Let I ( X ) , X C V ( G )be the set of blocks of II whose member branches are incident on vertices in X . Let n“ be a partition that minimizes 111 - 2(.). The best sequence is the following: Fuse each block of IIv into a single node. (If k nodes are in a single block this involves k - 1 operations). In the resulting graph, which we shall call G’, perform the minimum number of node fissions required to destroy all circuits intersecting more than one block of II. This is relatively easy to do and the number r ‘ ( N i )-r’(E(G)),where T I ( . ) is the rank function of such fission operations is of G‘. iii. New matroids A simple method (see Exercise 12.4 below) for generating new matroids from polymatroid rank functions is the following (see for example [Patkar92], [Patkar+Narayanan924). Let f ( . ) be an integral polymatroid rank function with f ( e ) = k , e E S. Let p k - q = 1. Then (pf - q ) t ( . ) is a matroid rank function.
Example: Let |V|(·) be the polymatroid rank function on the subsets of E(G) (where G is a self-loop free graph) such that |V|(X) ≡ number of vertices incident on edges in X. Clearly |V|(e) = 2. Then (k|V|(·) − (2k − 1))_t is a matroid rank function. In particular, (|V|(·) − 1)_t is the rank function of the graph and (2|V|(·) − 3)_t is the rank function of the rigidity matroid associated with the graph [Laman70], [Asimow+Roth78], [Asimow+Roth79].

Exercise 12.4 Let f(·) be an integral polymatroid rank function on subsets of S with f(e) = k ∀e ∈ S, k an integer. Let p, q be integers s.t. pk − q = 1. Prove that (pf − q)_t(·) is a matroid rank function.

Exercise 12.5 Prove the statements about the function r_k(·) given in the solution to the Dilworth truncation of matroids problem.
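On small ground sets the lower Dilworth truncation, taken in the form f_t(X) = min over partitions {X_i} of X of Σ_i f(X_i), can be evaluated by brute force. The sketch below is illustrative only (the graph, its edge labels and the `partitions` helper are assumptions, not the book's notation); it checks that (|V|(·) − 1)_t reproduces the cycle-matroid rank of a small graph.

```python
def partitions(items):
    """Yield every partition of a list as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield p + [[first]]

def truncate(f):
    """Lower Dilworth truncation: f_t(X) = min over partitions {X_i} of X of sum f(X_i)."""
    def ft(X):
        X = sorted(X)
        return 0 if not X else min(sum(f(set(b)) for b in q) for q in partitions(X))
    return ft

# Triangle on vertices 1,2,3 plus a pendant edge d = (3,4) (illustrative data).
edges = {'a': (1, 2), 'b': (2, 3), 'c': (1, 3), 'd': (3, 4)}
nV = lambda X: len({v for e in X for v in edges[e]})   # |V|(X)

rank = truncate(lambda X: nV(X) - 1)                   # (|V|(.) - 1)_t
print(rank({'a', 'b', 'c'}), rank(set(edges)))         # 2 3: ranks of the triangle and of G
```

The same machinery would evaluate (2|V|(·) − 3)_t on larger examples, though enumerating partitions is exponential and only usable on tiny sets.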
iv. Posing convolution problems as truncation problems ([Narayanan90], [Narayanan91], [Patkar+Narayanan92b], [Narayanan+Roy+Patkar92])

We give an example. Consider the convolution problem: 'Find min_{X⊆E(G)} λr(X) + w(E(G) − X), λ ≥ 0, where E(G) is the edge set, r(·) is the rank function of the graph G and w(·) is a nonnegative weight function on E(G).' Let I(Y) ≡ set of edges incident on vertices in Y ⊆ V(G), let E(Y) ≡ set of edges incident only on vertices in Y ⊆ V(G), and let w(I(Y)), w(E(Y)) denote the sum of the weights of edges in the corresponding sets. Then one can show that X ⊆ E(G) solves the above convolution problem iff X = ∪_{N_i∈Π′} E(N_i), where Π′ solves the truncation problem: 'Find min_{Π∈P_{V(G)}} w(I(·))(Π) − λ|Π|', or equivalently, 'find max_{Π∈P_{V(G)}} w(E(·))(Π) + λ|Π|.' Thus the principal partition of the rank function of a graph can be determined by
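One numerical form of this equivalence, min_X (λr(X) + w(E(G) − X)) = λ|V| + w(E) − max_Π (w(E(·))(Π) + λ|Π|), obtained by pairing each optimal X with the union of internal edge sets of a vertex partition, can be confirmed by brute force. The graph, weights and value of λ below are illustrative assumptions.

```python
from itertools import chain, combinations

def partitions(items):
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield p + [[first]]

edges = {'a': (1, 2), 'b': (2, 3), 'c': (1, 3), 'd': (3, 4)}
w = {'a': 3, 'b': 1, 'c': 2, 'd': 5}
V = {1, 2, 3, 4}
lam = 2                     # an illustrative value of lambda

def rank(X):
    """Cycle-matroid rank of edge set X: |V(X)| - number of components."""
    parent = {v: v for e in X for v in edges[e]}
    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v
    for e in X:
        u, v = find(edges[e][0]), find(edges[e][1])
        if u != v:
            parent[u] = v
    return len(parent) - len({find(v) for v in parent})

subsets = lambda s: chain.from_iterable(combinations(list(s), r)
                                        for r in range(len(s) + 1))
conv = min(lam * rank(set(X)) + sum(w[e] for e in set(edges) - set(X))
           for X in subsets(edges))

wE = lambda B: sum(w[e] for e in edges if set(edges[e]) <= set(B))   # internal weight
trunc = max(sum(wE(B) for B in p) + lam * len(p) for p in partitions(sorted(V)))

print(conv, lam * len(V) + sum(w.values()) - trunc)   # the two sides agree: 6 6
```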
12.3. THE PRINCIPAL LATTICE OF PARTITIONS
solving either of the above mentioned truncation problems for appropriate values of λ. Indeed, this approach yields the fastest algorithm currently known for the principal partition problem (O(|E||V| log²(|V|)) for the unweighted case and O(|E||V|² log(|V|)) for the weighted case).
Polyhedral Interpretation for Truncation

The polyhedral interpretation for Dilworth truncation appears to be less important than is the case for convolution. We however have the following simple result.

Theorem 12.2.5 Let f(·) be a real set function on subsets of S.
i. P_f = P_{f_t} and P^f = P^{f^t}.
ii. If f(·) is polyhedrally tight then f_t(·) = f(·). If f(·) is dually polyhedrally tight then f^t(·) = f(·).
iii. If f(·) is submodular then f_t(·) is polyhedrally tight and is the greatest polyhedrally tight function below f(·). If f(·) is supermodular then f^t(·) is dually polyhedrally tight and is the least dually polyhedrally tight function above f(·).
Proof: We will handle only the 'polyhedral' (as opposed to the 'dually polyhedral') case.
i. If x(X) ≤ f(X) ∀X ⊆ S, then

Σ_{i=1}^k x(X_i) ≤ Σ_{i=1}^k f(X_i)

for all partitions {X_1, …, X_k} of X ⊆ S. Hence P_f ⊆ P_{f_t}. The result follows since f_t(X) ≤ f(X), X ≠ ∅, and therefore P_f ⊇ P_{f_t}.
ii. We have P_{f_t} = P_f. For each X ⊆ S there exists x ∈ P_f s.t. x(X) = f(X). But x(X) ≤ f_t(X) ≤ f(X). Hence, f_t(X) = f(X) ∀X ⊆ S.
iii. If f(·) is submodular, f_t(·) is submodular and f_t(∅) = 0. We have seen (Corollary 9.7.1) that every submodular function that takes zero value on the empty set is polyhedrally tight. The remaining part of the statement follows from Exercise 12.3. □
12.3 The Principal Lattice of Partitions

12.3.1 Basic Properties of the PLP
The principal lattice of partitions has its roots in the variation of the hybrid rank problem described on page 487. It is curious that the principal partition is strongly
12. DILWORTH TRUNCATION
linked to the original hybrid rank problem, as we indicate in Chapter 14. There are also formal analogies between the principal partition and the principal lattice of partitions, which we will indicate at appropriate places. We begin by suggesting that the reader compare Subsection 10.4.2 with the present subsection.

Definition 12.3.1 Let f(·) be submodular on the subsets of S. Let L_{λ,f} (L_λ when f(·) is clear from the context) denote the collection of partitions of S that minimize (f − λ)(·). The collection of all partitions of S which belong to some L_λ, λ ∈ ℝ, is called the principal lattice of partitions of f(·).
As in the case of the principal partition, one of the interesting features of the principal lattice of partitions is that one need only examine a few (not more than |S|) values of λ in order to solve the optimization problems for all λ. We list below the main properties of the principal lattice of partitions. The reader might like to compare them with those of the principal partition (Subsection 10.4.2).
i. Property PLP1 The collection L_λ is closed under the join (∨) and meet (∧) operations and thus has a unique maximal and a unique minimal element.
ii. Property PLP2 If λ_1 > λ_2, then Π̄_{λ_1} ≤ Π_{λ_2}, where Π̄_λ, Π_λ respectively denote the maximal and minimal elements of L_λ.
iii. Definition 12.3.2 A number λ for which L_λ has more than one partition as a member is called a critical PLP value of f(·) (critical value for short).
Property PLP3 The number of critical PLP values of f(·) is bounded by |S|.

iv. Property PLP4 Let λ_1, …, λ_t be the decreasing sequence of critical PLP values of f(·). Then Π̄_{λ_i} = Π_{λ_{i+1}}, for i = 1, …, t − 1.
v. Property PLP5 Let λ_1, …, λ_t be the decreasing sequence of critical PLP values. Let λ_i > σ > λ_{i+1}. Then Π̄_σ = Π_σ = Π̄_{λ_i} = Π_{λ_{i+1}}.
Definition 12.3.3 Let f(·) be submodular on subsets of S. Let λ_i, i = 1, …, t, be the decreasing sequence of critical PLP values of f(·). Then the sequence Π_0 = Π_{λ_1}, Π̄_{λ_1}, …, Π̄_{λ_t} = {S} is called the principal sequence of partitions of f(·). A member of L_λ would alternatively be referred to as a minimizing partition corresponding to λ in the principal lattice of partitions of f(·).

Proof of the properties of the Principal Lattice of Partitions
i. PLP1: This follows directly from Theorem 12.2.2.
ii. PLP2: The following lemma and theorem are needed for the proof of PLP2.
Lemma 12.3.1 Let f(·) be a submodular function on subsets of S and let N ⊆ S. Let Π be any partition of S. Then,

(f − λ_2)(Π) + (f − λ_1)(Π_N) ≥ (f − λ_2)(Π ∨ Π_N) + (f − λ_1)(Π ∧ Π_N) − (λ_2 − λ_1)(|Π ∧ Π_N| − |Π_N|).

Proof: We have, by the definition of (f − λ_i)(·),

(f − λ_2)(Π ∨ Π_N) + (f − λ_1)(Π ∧ Π_N) = (f − λ_2)(Π ∨ Π_N) + (f − λ_2)(Π ∧ Π_N) + (λ_2 − λ_1)|Π ∧ Π_N|.

By Theorem 12.2.1, the RHS

≤ (f − λ_2)(Π) + (f − λ_2)(Π_N) + (λ_2 − λ_1)|Π ∧ Π_N|
= (f − λ_2)(Π) + (f − λ_1)(Π_N) + (λ_2 − λ_1)(|Π ∧ Π_N| − |Π_N|).

The required result now follows immediately. □
Theorem 12.3.1 Let f(·) be a submodular function on the subsets of S. Let Π_i, i = 1, 2, be a partition at which (f − λ_i)(·), i = 1, 2, reaches a minimum. If λ_1 > λ_2 then Π_2 ≥ Π_1.

Proof: Let N be any block of Π_1. By the definition of Π_2, we have

(f − λ_2)(Π_2 ∨ Π_N) ≥ (f − λ_2)(Π_2).

Hence, by Lemma 12.3.1, we have

(f − λ_1)(Π_2 ∧ Π_N) ≤ (f − λ_1)(Π_N) + (λ_2 − λ_1)(|Π_2 ∧ Π_N| − |Π_N|).

Since λ_1 > λ_2, using Lemma 12.2.2, we must have |Π_2 ∧ Π_N| = |Π_N|. Thus, N is contained in a block of Π_2. Hence, Π_2 ≥ Π_1. □
iii. PLP3: If L_λ has more than one partition as a member, then |Π_λ| > |Π̄_λ|. So if λ_1, λ_2 are critical values and λ_1 > λ_2, by property PLP2 we must have |Π_{λ_1}| > |Π_{λ_2}|. Now the number of blocks of a partition of S cannot exceed |S|. Hence, the number of critical values cannot exceed |S|.
iv. PLP4: We need the following lemma.

Lemma 12.3.2 Let λ be a real number. Then for sufficiently small ε > 0, the only partition that minimizes (f − (λ − ε))(·) over P_S is Π̄_λ.
Proof: Since there is only a finite number of partitions of S, for sufficiently small ε > 0 the value of (f − (λ − ε))(·) is lower on the members of L_λ than on any other partition of S. But if Π ∈ L_λ and Π ≠ Π̄_λ, then

(f − (λ − ε))(Π) = (f − λ)(Π) + ε|Π| > (f − λ)(Π̄_λ) + ε|Π̄_λ| = (f − (λ − ε))(Π̄_λ).

The result follows. □
Proof of PLP4: For sufficiently small values of ε > 0, Π̄_{λ_i} would continue to minimize (f − (λ_i − ε))(·) over partitions of S. As ε increases, because there are only a finite number of partitions of S, there would be a least value ε_1 at which Π̄_{λ_i} and at least one other partition of S minimize (f − (λ_i − ε_1))(·). Clearly (λ_i − ε_1) is the next critical value λ_{i+1}. Since λ_i > λ_{i+1}, by property PLP2 we must have Π̄_{λ_i} ≤ Π_{λ_{i+1}}. Hence, we must have Π̄_{λ_i} = Π_{λ_{i+1}}, as desired.
v. PLP5: This is clear from the above arguments.
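The nesting of minimizing partitions as λ decreases (Theorem 12.3.1 and Property PLP2 above) can be observed numerically. A minimal sketch, assuming f = |V|(·) on the edge set of a small illustrative graph and a sweep over a few rational λ values (the graph and the `partitions` helper are assumptions, not from the book):

```python
from fractions import Fraction

def partitions(items):
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield p + [[first]]

edges = {'a': (1, 2), 'b': (2, 3), 'c': (1, 3), 'd': (3, 4)}
f = lambda X: len({v for e in X for v in edges[e]})    # |V|(.), submodular

def minimizers(lam):
    """All partitions of E(G) minimizing (f - lam)(Pi) = sum over blocks of (f(N) - lam)."""
    best, arg = None, []
    for p in partitions(sorted(edges)):
        val = sum(f(set(b)) - lam for b in p)
        if best is None or val < best:
            best, arg = val, [p]
        elif val == best:
            arg.append(p)
    return arg

def finer(p, q):   # p <= q: every block of p lies inside a block of q
    return all(any(set(b) <= set(c) for c in q) for b in p)

# As lambda strictly decreases, any minimizer at the larger lambda refines
# any minimizer at the smaller one (Theorem 12.3.1); take the finest each time.
chain_ = [max(minimizers(Fraction(l, 2)), key=len) for l in (5, 4, 3, 2)]
for p1, p2 in zip(chain_, chain_[1:]):
    assert finer(p1, p2)
print([sorted(map(sorted, p)) for p in chain_])
```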
Exercise 12.6 Let g(·) be a modular function on the subsets of S. Describe the PLP of g(·).

Exercise 12.7 Show that

i. the PLP of βf(·), where β > 0 and f(·) is submodular on subsets of S, is the same as that of f(·); λ is a critical value of the PLP of f(·) iff βλ is a critical value of the PLP of βf(·).

ii. the PLP of (f + g)(·) is the same as that of f(·) if g(·) is a weight function. The critical values of the PLP of f(·) and the PLP of (f + g)(·) are identical.

iii. the PLP of f(·) and (f + k)(·), where k is a real number, are identical. The critical values of (f + k)(·) are λ_i + k, where λ_i are the critical values of f(·).

Just as in the case of the principal partition, we give two characterizations for the principal lattice of partitions also. The first of these is presented below. Both the result and the proof are line by line translations of the statement and proof of Theorem 10.4.1.

Lemma 12.3.3 Let f(·) be a submodular function on subsets of S. Let λ_1, …, λ_t be a strictly decreasing sequence of numbers such that
i. each L_{λ_i}, i = 1, …, t, has at least two member partitions,

ii. L_{λ_i}, L_{λ_{i+1}}, i = 1, …, t − 1, have at least one common member partition,

iii. Π_0 belongs to L_{λ_1} while Π_S belongs to L_{λ_t}.

Then λ_1, …, λ_t is the decreasing sequence of critical values of the PLP of f(·), and therefore the collection of all partitions which are member partitions in the L_{λ_i}, i = 1, …, t, is the principal lattice of partitions of f(·).
Proof: Let λ_1′, …, λ_k′ be the critical values and let Π_0, Π_1, …, Π_k = Π_S be the principal sequence of partitions of f(·). By Property PLP2 the only member partition in L_λ when λ > λ_1′ is Π_0. Further, when λ < λ_1′, Π_0 is not in L_λ. Hence, λ_1 = λ_1′. Next, by Property PLP5, when λ_1′ > λ > λ_2′ the only member in L_λ is Π_1, which is the maximal partition in L_{λ_1′}. Since L_{λ_2} has at least two member partitions we conclude that λ_2 ≤ λ_2′. We know that L_{λ_1} and L_{λ_2} have a common member, which by Property PLP2 can only be Π_1. But for λ < λ_2′, by Property PLP5, Π_1 cannot be a member of L_λ. Hence, λ_2 = λ_2′. By repeating this argument, we see that t = k and λ_i = λ_i′, i = 1, …, t. □
The next result is the main characterization of the PLP. Later, for convenience, we restate it as Theorem 12.6.1. The latter result could be viewed as a 'translation' of the Uniqueness Theorem (PP) (Theorem 10.4.6).
Theorem 12.3.2 (Uniqueness Theorem (PLP)) [Narayanan91] Let f(·) be a submodular function on subsets of S. Let Π_0 < Π_1 < ⋯ < Π_t = Π_S be a strictly increasing sequence of partitions of S and let λ_1, …, λ_t be a strictly decreasing sequence of real numbers such that

i. (f − λ_{i+1})(Π_i) = (f − λ_{i+1})(Π_{i+1}), i = 0, …, t − 1,

ii. (f − λ_{i+1})(Π) ≥ (f − λ_{i+1})(Π_i), Π_i ≤ Π ≤ Π_{i+1}.

Then Π_0, …, Π_t is the principal sequence of partitions of f(·) and λ_1, …, λ_t is its decreasing sequence of critical values. Further, Π_i is the minimal and Π_{i+1} the maximal partition at which (f − λ_{i+1})(·) reaches a minimum.
Proof: The proof is by induction on t, where t + 1 is the length of the principal sequence. The theorem is obviously true for t = 1, when the principal sequence is Π_0, Π_S. Let the theorem be true for t < k, and let t = k. Let Π̂_i be the minimal partition of S at which (f − λ_{i+1})(·) reaches a minimum. We will first show that Π̂_i ≤ Π_i for i = 0, …, k. We know that Π̂_0 = Π_0. Suppose for some i > 0, Π̂_i ≰ Π_i. Then there exists a block N of Π̂_i that is not contained in any block of Π_i, so that Π̂_i ∧ Π_N is distinct from Π̂_i. Use of Lemma 12.2.2, together with Theorem 12.2.1, then yields a strict inequality (*).
However, on the sequence of partitions (Π_i)_{fus.Π̂_i}, …, (Π_k)_{fus.Π̂_i} = (Π_S)_{fus.Π̂_i} (which are partitions of Π̂_i), and on the sequence of values λ_{i+1}, …, λ_k, the function f_{fus.Π̂_i}(·) satisfies the conditions of the theorem. Therefore, by the induction assumption, (Π_i)_{fus.Π̂_i}, …, (Π_k)_{fus.Π̂_i} is the principal sequence and λ_{i+1}, …, λ_k the critical values of f_{fus.Π̂_i}(·). But then (*) cannot be true, a contradiction. We therefore conclude Π̂_i ≤ Π_i, i = 0, …, k.
We will now prove that Π̂_i = Π_i, i = 0, …, k, by induction on k. Clearly Π̂_0 = Π_0. Let Π̂_i = Π_i for i = 0, …, k − 1. Let Π̃_i be the maximal partition at which (f − λ_{i+1})(·) reaches a minimum, for i = 0, …, k.
… Σ_{e∈N} f(e),
i.e., λ(1 − |N|) + β|N| ≥ 0,
i.e., β|N| ≥ λ(|N| − 1),
i.e., λ ≤ 2β.
iii. If f(·) is increasing,

f(N) ≥ f(e) ∀e ∈ N.

Hence, if f(·) is also nonnegative in addition, we may replace (*) above by

λ(1 − |N|) + β|N| − q ≥ 0.

The last two parts of the theorem follow by direct application of the above results.
Storing the lattice L_λ
We saw in the case of the principal partition that the collection of minimizing sets is a distributive lattice which can be stored in the form of a Hasse diagram. It is a bit more difficult to store L_λ. The following result shows that we only need to store |Π_λ| distributive lattices. (In the next chapter we present an alternative way of storing L_λ.)

Lemma 12.3.4 Let f(·) be a submodular function on subsets of S. Let N_i, i = 1, …, k, be the blocks of Π_λ, the unique minimal partition that is a member of L_λ. Let D_{N_i} be the collection of all blocks of partitions in L_λ which contain N_i. Then

i. D_{N_i} is a distributive lattice,

ii. A partition Π ≡ {M_1, …, M_t} of S belongs to L_λ iff each M_j belongs to one of the D_{N_i}.

Proof: i. Let P, Q ∈ D_{N_i}. Since P, Q are unions of blocks of Π_λ, so must P ∪ Q as well as P ∩ Q be. Let Π_P, Π_Q, Π_{P∩Q}, Π_{P∪Q} denote respectively the partitions of S which have P, Q, P ∩ Q, P ∪ Q as a block and their remaining blocks from Π_λ. By Theorem 12.2.2, Π_P, Π_Q belong to L_λ and so do Π_P ∨ Π_Q and Π_P ∧ Π_Q. But Π_P ∨ Π_Q = Π_{P∪Q} and Π_P ∧ Π_Q = Π_{P∩Q} (note that P ∩ Q ⊇ N_i). The result follows.

ii. If each block of Π belongs to one of the D_{N_i}, then by Theorem 12.2.2, Π ∈ L_λ. On the other hand, if Π ∈ L_λ we must have Π ≥ Π_λ, which implies that each block of Π is a union of blocks of Π_λ and therefore belongs to one of the D_{N_i}. □
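Part ii of the lemma can be verified exhaustively for a small submodular function; the graph, the choice f = |V|(·) and the value λ = 1 below are illustrative assumptions.

```python
def partitions(items):
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield p + [[first]]

edges = {'a': (1, 2), 'b': (2, 3), 'c': (1, 3), 'd': (3, 4)}
f = lambda X: len({v for e in X for v in edges[e]})    # |V|(.), submodular
lam = 1                                                # a critical value for this f

val = lambda p: sum(f(B) - lam for B in p)
allp = [frozenset(frozenset(B) for B in p) for p in partitions(sorted(edges))]
best = min(map(val, allp))
L = [p for p in allp if val(p) == best]                # the collection L_lambda

Pi_min = max(L, key=len)                               # unique minimal (finest) member
D = {N: {B for p in L for B in p if N <= B} for N in Pi_min}

for p in allp:                                         # Lemma 12.3.4 ii
    assert (p in L) == all(any(B in D[N] for N in D) for B in p)
print(len(L), sorted(sorted(B) for B in Pi_min))
```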
Symmetry Properties of the PLP

The PLP of a submodular function exhibits the expected symmetry properties. However, unlike the PP, it does not induce a partial order among the elements, since the basic object here is a partition. The reader might like to review the definitions preceding Theorem 10.4.2 on page 393. If σ: S → S is a bijection and Π = {N_1, …, N_k} is a partition of S, then σ(Π) denotes {σ(N_1), …, σ(N_k)}. We then have the following elementary but useful result.

Theorem 12.3.4 Let f(·) be a submodular function on subsets of S and let σ(·) be an automorphism of f(·).

i. (f − λ)(·) reaches a minimum at Π among all partitions of S iff it reaches a minimum at σ(Π).

ii. For each λ the partitions Π̄_λ and Π_λ are invariant under σ(·). Hence, if N_1, N_2 are blocks of different sizes in some Π_λ and x ∈ N_1, y ∈ N_2, then no automorphism of f(·) can map x to y.

iii. If N_1, N_2 are blocks of Π_λ s.t. σ(N_1) = N_2, then D_{N_1} (the collection of all blocks of partitions in L_λ which contain N_1) is isomorphic to D_{N_2} under σ(·).
Proof: i. Immediate from the definition of an automorphism of f(·) and the definition of σ(Π).

ii. We have σ(Π_λ) also as a member of L_λ. Further, it has the same number of blocks as Π_λ. But Π_λ is the unique minimal partition that is a member of L_λ. Hence, σ(Π_λ) ≥ Π_λ and we conclude that σ(Π_λ) = Π_λ. The case of Π̄_λ is similar. If σ(·) is an automorphism of f(·) and Π_λ = {N_1, …, N_k}, then Π_λ = σ(Π_λ) = {σ(N_1), …, σ(N_k)}; clearly N_i and σ(N_i) have the same size since σ(·) is a bijection.

iii. If M is a block containing N_1, then σ(M) must contain σ(N_1) = N_2. The converse holds since σ⁻¹ is also a bijection. The result follows. □
12.3.2 PLP from the Point of View of Cost of Partitioning
The principal partition gives information about which subsets are densely packed relative to (f(·), g(·)), while the principal lattice of partitions gives information about weak links between different subsets relative to f(·). The latter thus appears to be a better way of examining natural partitions relative to f(·). Below we give an informal description of the principal lattice of partitions from this point of view. Let us define the cost of a partition Π of S relative to f(·) to be f(Π) − f(S). (Note that the cost of the single block partition Π_S is zero.) Let the cost rate (gain rate) of a partition Π_1 (Π_2) with respect to a coarser partition Π_2 (finer partition Π_1) be defined to be

(f(Π_1) − f(Π_2)) / (|Π_1| − |Π_2|).

Let λ_1, …, λ_t be the decreasing sequence of critical values and let Π_0 = Π_{λ_1}, …, Π̄_{λ_t} = Π_S be the principal sequence of partitions of f(·). The partition Π_{λ_t} would have the cost λ_t(|Π_{λ_t}| − 1) (i.e., cost rate λ_t relative to Π_S). We may imagine that we are attempting to break S; for a lower cost rate no partition occurs. Note that even at the above mentioned cost rate we may not be able to reach from Π_S to Π_{λ_t} through partitions whose number of blocks increases one at a time. (To reach Π_{λ_t} from Π_S we can use partitions in L_{λ_t}, but these do not necessarily have every number of blocks between |Π_S| and |Π_{λ_t}|.) To further break up Π_{λ_t} we have to pay a higher cost rate λ_{t−1}, and so on, with the cost rate increasing each time we reach a partition in the principal sequence of partitions, until we reach the partition Π_0 in which each block is a singleton.
Every partition in the PLP has least cost relative to its number of blocks (see Exercise 12.8 below) and further is easy to construct (there is a polynomial algorithm which is often quite fast). However, the problem of finding the partition of least cost with a given number of blocks is NP-hard even for the simplest submodular functions (example: f(·) ≡ (wI)(·), where (wI)(X) ≡ weighted sum of edges incident on the vertex set X ⊆ V(G), G a graph [Saran+Vazirani91]). This apparent contradiction is resolved when we remember that there may be no partition with the given number of blocks in the PLP. Even in this case, however, good approximation algorithms can be given, as we will indicate later (see Section 12.4).

Exercise 12.8 (Compare Exercise 10.21) Let Π be a partition of S in the PLP of a submodular function f(·). If Π′ is any other partition of S with the same number of blocks as Π, then f(Π) ≤ f(Π′), the equality holding only if Π′ is also a partition in the PLP of f(·).
The following exercises constitute an alternative development of the PLP from the cost point of view [Roy93], [Roy+Narayanan93b]. We need a preliminary definition and a basic result.

A partition Π satisfies the λ-gain rate (λ-cost rate) condition with respect to f(·) iff whenever Π′ ≥ Π (Π′ ≤ Π) we have

f(Π) − f(Π′) ≤ λ(|Π| − |Π′|)    (f(Π′) − f(Π) ≥ λ(|Π′| − |Π|)).

We say that these conditions are satisfied strictly if the inequalities above are strict.
Theorem 12.3.5 (The Optimum Cost Rate Theorem) Let f(·) be a submodular function on subsets of S.

i. If a partition Π of S satisfies the λ-cost rate (λ-gain rate) condition, then there exists a partition Π̂ of S such that Π̂ minimizes (f − λ)(·) and Π̂ ≥ Π (Π̂ ≤ Π).

ii. If a partition Π of S satisfies the λ-cost rate (λ-gain rate) condition strictly, then whenever Π̂ minimizes (f − λ)(·) we must have Π̂ ≥ Π (Π̂ ≤ Π).

iii. A partition Π of S satisfies both the λ-gain rate condition and the λ-cost rate condition iff it minimizes (f − λ)(·).
Proof: i. Let Π satisfy the λ-cost rate condition, let Π̂ minimize (f − λ)(·) and let N be a block of Π. We have, by Theorem 12.2.1,

(f − λ)(Π_N) + (f − λ)(Π̂) ≥ (f − λ)(Π_N ∨ Π̂) + (f − λ)(Π_N ∧ Π̂).

Let {N_1, …, N_k} be the partition of N in Π_N ∧ Π̂. Suppose (f − λ)(Π_N ∧ Π̂) < (f − λ)(Π_N), i.e.,

Σ_{i=1}^k f(N_i) − kλ < f(N) − λ.

We now replace N by {N_1, …, N_k} in Π; let this new partition be called Π″. We then have

f(Π″) − f(Π) < (k − 1)λ = (|Π″| − |Π|)λ,

which violates the λ-cost rate condition. Hence,

(f − λ)(Π_N ∧ Π̂) ≥ (f − λ)(Π_N),

and therefore

(f − λ)(Π_N ∨ Π̂) ≤ (f − λ)(Π̂).

Thus, (Π_N ∨ Π̂) minimizes (f − λ)(·) and one of its blocks contains N. Repeating this procedure gives us a partition that minimizes (f − λ)(·) and is coarser than Π. The argument for the 'λ-gain rate' case is similar. (We use a block N of Π̂ and use Theorem 12.2.1 on Π, Π_N.)

ii. (Strict λ-cost rate case.) Going through the argument of the λ-cost rate case used above, here we get

(f − λ)(Π_N ∧ Π̂) > (f − λ)(Π_N)

unless Π_N ∧ Π̂ = Π_N. The former alternative implies

(f − λ)(Π_N ∨ Π̂) < (f − λ)(Π̂),

a contradiction. Hence, we must have Π_N ∧ Π̂ = Π_N. Hence, Π̂ must be coarser than Π.

iii. Suppose Π satisfies both the λ-cost rate and the λ-gain rate conditions. Then, by the former condition, there exists a partition Π̂ ≥ Π s.t. Π̂ minimizes (f − λ)(·). Suppose (f − λ)(Π̂) < (f − λ)(Π). It is easily seen that this violates the λ-gain rate condition satisfied by Π. We conclude that (f − λ)(Π̂) = (f − λ)(Π). Thus, Π minimizes (f − λ)(·). On the other hand, let Π minimize (f − λ)(·). For any Π′ we then have

(f − λ)(Π) ≤ (f − λ)(Π′).

By taking Π′ to be coarser (finer) than Π, it follows that Π satisfies the λ-gain rate (λ-cost rate) condition. □
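Part iii of the theorem can be checked exhaustively on a small example; the function f = |V|(·) and the value λ = 1.5 are illustrative assumptions.

```python
def partitions(items):
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield p + [[first]]

edges = {'a': (1, 2), 'b': (2, 3), 'c': (1, 3), 'd': (3, 4)}
f = lambda X: len({v for e in X for v in edges[e]})    # |V|(.), submodular
lam = 1.5

fp = lambda p: sum(f(B) for B in p)                    # f(Pi)
finer = lambda p, q: all(any(b <= c for c in q) for b in p)

allp = [frozenset(frozenset(B) for B in p) for p in partitions(sorted(edges))]
best = min(fp(p) - lam * len(p) for p in allp)

for p in allp:
    minimizes = fp(p) - lam * len(p) == best
    gain = all(fp(p) - fp(q) <= lam * (len(p) - len(q))     # coarser q >= p
               for q in allp if finer(p, q))
    cost = all(fp(q) - fp(p) >= lam * (len(q) - len(p))     # finer q <= p
               for q in allp if finer(q, p))
    assert minimizes == (gain and cost)                     # Theorem 12.3.5 iii
print(best)
```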
Exercise 12.9 If two partitions Π_1, Π_2 have the λ-gain rate (λ-cost rate) property with respect to the submodular function f(·) on subsets of S, then show that

(f − λ)(Π_1 ∧ Π_2) ≤ (f − λ)(Π_i), i = 1, 2    ((f − λ)(Π_1 ∨ Π_2) ≤ (f − λ)(Π_i), i = 1, 2).
Exercise 12.10 Let Π_1, Π_2 minimize (f − λ)(·). Then, using the above exercise, show that so do Π_1 ∨ Π_2 and Π_1 ∧ Π_2.

Exercise 12.11 Let Π_1 minimize (f − λ_1)(·) and let Π_2 minimize (f − λ_2)(·), with λ_1 > λ_2. Then Π_1 satisfies the strict λ_2-cost rate property and hence Π_2 ≥ Π_1.

The following result is useful for creating approximation algorithms for finding partitions of least cost with a given number of blocks. The result is easiest to apply when the principal sequence is simple. We therefore introduce some appropriate preliminary definitions. Let f(·) be a submodular function on subsets of S. We say S is Π-molecular relative to f(·) iff Π_0, Π_S is the principal sequence of f(·). If in addition the only partitions in the PLP are Π_0, Π_S, we say S is Π-atomic. (The reader might like to compare these notions with 'molecular' and 'atomic' in the case of the PP.)
Theorem 12.3.6 Let Π_j, Π_{j+1} both minimize (f − λ)(·) for some λ.

i. If |Π_j| > k > |Π_{j+1}|, then for any Π ∈ P_S with |Π| = k,

f(Π) ≥ f(Π_{j+1}) + λ(k − |Π_{j+1}|).

Further, the equality holds only if Π is a member of the PLP of f(·).
ii. If S is Π-molecular with respect to f(·) and Π is a partition of S, then

f(Π) ≥ f(S) + λ(|Π| − 1),

where λ is the critical value.
Proof: i. Suppose f(Π) < f(Π_{j+1}) + λ_{j+1}(k − |Π_{j+1}|). It follows that

f(Π) − λ_{j+1}|Π| < f(Π_{j+1}) − λ_{j+1}|Π_{j+1}|.   (*)

Now Π_j, Π_{j+1} minimize (f − λ)(·) for a certain value of λ. But the only value of λ for which (f − λ)(Π_j) = (f − λ)(Π_{j+1}) is clearly λ_{j+1}. Hence, λ = λ_{j+1}, and therefore (*) is a contradiction, since Π_{j+1} minimizes (f − λ_{j+1})(·). If there is equality, then

(f − λ_{j+1})(Π) = (f − λ_{j+1})(Π_{j+1}).

Hence, Π also minimizes (f − λ_{j+1})(·) and is therefore a member of the PLP of f(·).
ii. This is a restatement of the above for the molecular case. □
We give some simple examples of submodular functions and examine the cost of a partition in each case.
Examples
i. Let G be a graph and let w(·) be a positive weight function on the edge set of G. Let (wI)(X) ≡ sum of the weights of edges incident on X ⊆ V(G). In this case (wI)(Π) − (wI)(V(G)), which is the cost of Π relative to (wI)(·), is the sum of the weights of the edges whose end points are in different blocks of Π. ((wI)(Π) counts the weights of such edges twice and those of other edges once, while (wI)(V(G)) counts the weights of all edges once.)
ii. Let (wE)(X) ≡ sum of weights of edges with both end points in X. Clearly −(wE)(·) is a submodular function, and (wE)(V(G)) − (wE)(Π), which is the cost of Π relative to −(wE)(·), is again the sum of the weights of edges whose end points are in different blocks of Π. ((wE)(V(G)) is the sum of weights of all the edges in the graph; (wE)(Π) counts the weight of every edge with both end points within a block once.)
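Both cost identities can be verified over all partitions of a small vertex set; the graph and weights below are illustrative assumptions.

```python
def partitions(items):
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield p + [[first]]

edges = {'a': (1, 2), 'b': (2, 3), 'c': (1, 3), 'd': (3, 4)}
w = {'a': 3, 'b': 1, 'c': 2, 'd': 5}
V = [1, 2, 3, 4]

wI = lambda X: sum(w[e] for e in edges if set(edges[e]) & set(X))   # incident weight
wE = lambda X: sum(w[e] for e in edges if set(edges[e]) <= set(X))  # internal weight

for p in partitions(V):
    crossing = sum(w[e] for e in edges
                   if not any(set(edges[e]) <= set(B) for B in p))
    assert sum(wI(B) for B in p) - wI(V) == crossing    # cost w.r.t. (wI)(.)
    assert wE(V) - sum(wE(B) for B in p) == crossing    # cost w.r.t. -(wE)(.)
print("both costs equal the weight of crossing edges")
```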
iii. Let B ≡ (V_L, V_R, E) be a bipartite graph and let w(·) be a positive weight function on V_R. Let (wΓ_L)(X) ≡ sum of the weights of vertices adjacent to X ⊆ V_L, and let (wE_L)(X) ≡ sum of the weights of the vertices adjacent to vertices in X but not to vertices in V_L − X. Here the cost of Π relative to (wΓ_L)(·) is

(wΓ_L)(Π) − (wΓ_L)(V_L) = Σ_{v_i ∈ V_R} (k_i − 1)w(v_i),

where k_i is the number of blocks of Π to whose vertices v_i is adjacent, and the cost of Π relative to −(wE_L)(·) is

(wE_L)(V_L) − (wE_L)(Π) = sum of weights of vertices which are adjacent to vertices in more than one block of Π.

Observe that the cost of the partition in both cases, (wΓ_L)(·) as well as (wE_L)(·), is related to the 'overlap' between blocks of Π as reflected by the vertex sets in V_R adjacent to these blocks. However, in the above two cases the overlap is measured in very different ways. In the case of (wΓ_L)(·), if a vertex is adjacent to k blocks its weight is counted (k − 1) times. (In particular, a vertex adjacent to only one block does not contribute to the cost.) In the case of (wE_L)(·), however, each vertex which is adjacent to more than one block of Π has its weight counted exactly once, and a vertex adjacent to only one block is not counted.
A graph G can be associated with a bipartite graph B_G with V_L ≡ V(G), V_R ≡ E(G), with e ∈ V_R adjacent to v ∈ V_L iff in G edge e is incident on v. However, in this case the cost of a partition Π of V(G) would be the same relative to (wΓ_L)(·) as well as (wE_L)(·), since here each vertex e ∈ V_R is adjacent to exactly two vertices in V_L.
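The two bipartite cost formulas above can likewise be checked exhaustively; the bipartite graph and weights below are illustrative assumptions.

```python
def partitions(items):
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield p + [[first]]

adj = {'p': {1, 2}, 'q': {2, 3}, 'r': {3}}   # R-vertex -> adjacent L-vertices
w = {'p': 4, 'q': 1, 'r': 2}
VL = [1, 2, 3]

wG = lambda X: sum(w[u] for u in adj if adj[u] & set(X))    # (w Gamma_L)(X)
wEL = lambda X: sum(w[u] for u in adj if adj[u] <= set(X))  # (w E_L)(X)

for p in partitions(VL):
    # k[u]: number of blocks of the partition that u is adjacent to
    k = {u: sum(1 for B in p if adj[u] & set(B)) for u in adj}
    assert sum(wG(B) for B in p) - wG(VL) == sum((k[u] - 1) * w[u] for u in adj)
    assert wEL(VL) - sum(wEL(B) for B in p) == sum(w[u] for u in adj if k[u] > 1)
print("bipartite cost identities hold on this example")
```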
Problem 12.1 This problem is analogous to Problem 10.2. To complete the analogy we remind the reader that if g′(·) is a submodular function on subsets of S then g(X) ≡ g′(S − X) ∀X ⊆ S is also a submodular function. Let f(·), g(·) be submodular on subsets of S and let g(∅) = 0.

i. Let L_{λg,f}, λ ≥ 0, denote the collection of all partitions that minimize f(Π) + λg(Π), Π ∈ P_S. If Π_1, Π_2 ∈ L_{λg,f}, then Π_1 ∨ Π_2, Π_1 ∧ Π_2 ∈ L_{λg,f}.

ii. …

iii. If instead of g(∅) = 0 we have the condition that g(X) < g(Π) for all partitions Π of X not equal to {X}, then every partition in L_{λ_2 g,f} is below every partition in L_{λ_1 g,f} when λ_2 > λ_1 ≥ 0.

Remark: The idea in the above problem can be used to modify the PLP of f(·). Instead of minimizing (f − λ)(·) we could minimize (f + g − λ)(·). In practice this could allow us to tamper with the size of blocks. The cost of a partition in the PLP of this new function would deviate from the optimum f(·) cost (for its number of blocks) at most by g(Π).
12.4 *Approximation Algorithms through PLP for the Min Cost Partition Problem
We present a simple scheme for constructing approximation algorithms for partitions of minimum cost which appears to do well when the cost is in terms of a polymatroid rank function. For greater simplicity we first consider the case where the underlying set S is Π-molecular with respect to the submodular function f(·). The generalization to the non-molecular case is easy and will be presented afterwards.

ALGORITHM 12.1 Algorithm Min Cost Partition Approximation (Molecular)
INPUT A polymatroid rank function f(·) whose value is available at each subset X ⊆ S (through a rank oracle) and k ≡ the number of blocks for the desired partition. S is Π-molecular relative to f(·).
OUTPUT A partition Π of S with k blocks whose cost

≤ (f(Π_0)/(f(Π_0) − f(S))) × (cost of optimum partition),

where n ≡ |S|.
STEP 1 Sort the elements of S according to decreasing f(·) value. Let N be the block composed of the first (n − k + 1) elements in this sequence. Output Π_N as the desired partition.
STOP

Justification: Let Π̂ be the min cost partition with k blocks. We have

f(Π̂) − f(S) ≥ ((k − 1)/(n − 1))(f(Π_0) − f(S)),

using Theorem 12.3.6. Since N is composed of the largest valued first (n − k + 1) elements of S, we must have

f(Π_N) − f(S) ≤ Σ_{e∉N} f(e) ≤ ((k − 1)/n) f(Π_0),

since f(·) is a polymatroid rank function. Hence,

(f(Π_N) − f(S))/(f(Π̂) − f(S)) ≤ f(Π_0)/(f(Π_0) − f(S)). □
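A minimal sketch of Algorithm 12.1, with a brute-force optimum for comparison. The polymatroid f (the weight of vertices touched by an edge set, with illustrative vertex weights) and the graph are assumptions, and the normalization step discussed below is omitted; the heuristic is only guaranteed to be feasible, never better than the optimum.

```python
def partitions(items):
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield p + [[first]]

# Polymatroid rank: weight of vertices touched by an edge set (illustrative data).
edges = {'a': (1, 2), 'b': (2, 3), 'c': (1, 3), 'd': (3, 4)}
vw = {1: 2, 2: 1, 3: 3, 4: 5}
f = lambda X: sum(vw[v] for v in {u for e in X for u in edges[e]})

def approx_partition(k):
    """Lump the n-k+1 largest-f elements into one block, keep the rest singletons."""
    order = sorted(edges, key=lambda e: f({e}), reverse=True)
    cut = len(edges) - k + 1
    return [order[:cut]] + [[e] for e in order[cut:]]

k = 2
Pi = approx_partition(k)
cost = lambda p: sum(f(set(B)) for B in p) - f(set(edges))
opt = min(cost(p) for p in partitions(sorted(edges)) if len(p) == k)
print(len(Pi), cost(Pi), opt)   # 2 blocks; here the heuristic happens to hit the optimum
```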
Let us now examine the ratio f(Π_0)/(f(Π_0) − f(S)). If we were to subtract a positive weight function g(·) from f(·), the above ratio would reduce even though the cost of a partition remains the same. The largest weight function that we can subtract from f(·), still retaining it as a polymatroid rank function, is the function g(e) ≡ f(S) − f(S − e) ∀e ∈ S. (Since f((X − e) ∪ e) − f(X − e) ≥ f(S) − f(S − e), it is clear that (f − g)(·) is a polymatroid rank function.) So the above algorithm must use as its input the function (f − g)(·). Let us, therefore, assume without loss of generality that f(S) − f(S − e) = 0 ∀e ∈ S.
In the above discussion we only used the fact that Π_0 and Π_S minimize (f − λ)(·). The 'Π-molecularity' was not otherwise used. In general, let Π_0, Π_1, …, Π_q = Π_S be the principal sequence of partitions of f(·). As before, let it be required that we find an optimum partition with k blocks. If k = |Π_j| for some j, then Π_j is the optimum partition. So let |Π_j| > k > |Π_{j+1}|. In what follows, in place of successive terms of the principal sequence, we can take any two partitions that minimize (f − λ)(·) for the same λ and whose numbers of blocks are on either side of k. Let Π_{j+1} ≡ {S_1, S_2, …, S_m} and let Π_j ≡ {N_11, …, N_1p_1, …, N_m1, …, N_mp_m}, where {N_j1, …, N_jp_j} is a partition of S_j. By Theorem 12.2.2 any partition of S that is obtained by taking some blocks from Π_{j+1} and others from Π_j would also minimize (f − λ_{j+1})(·). Thus, we could obtain two partitions

Π′ ≡ {N_11, …, N_1p_1, S_2, …, S_m},  Π″ ≡ {S_1, S_2, …, S_m},

which both minimize (f − λ_{j+1})(·) and differ from each other only in that Π″ has S_1 as a block while Π′ has in its place N_11, …, N_1p_1 as blocks. Further, |Π′| > k > |Π″|.
ALGORITHM 12.2 Algorithm Min Cost Partition Approximation
INPUT
i. A polymatroid rank function f(·) whose value is available at each X ⊆ S through a rank oracle,
ii. k ≡ the number of blocks for the desired partition,
iii. partitions Π′ ≡ {N_11, …, N_1p_1, S_2, …, S_m}, Π″ ≡ {S_1, S_2, …, S_m} which both minimize (f − λ)(·) for some λ and s.t. |Π′| > k > |Π″| (equivalently, (p_1 − 1) > k − m > 0).
OUTPUT A partition Π of S with k blocks whose cost ≤ α × (cost of optimum partition) (α to be defined at the end of the algorithm); in general α ≤ (p_1 − 1).
STEP 1 Let f¹(·) ≡ f/S_1(·) and let f²(·) ≡ (f¹_{fus.Π(S_1)})(·), where Π(S_1) ≡ {N_11, …, N_1p_1}. Let w({N_1j}) ≡ f²(Π(S_1)) − f²(Π(S_1) − {N_1j}) (= f(S_1) − f(S_1 − N_1j)). Let f³(·) ≡ f²(·) − w(·). Sort {N_11}, …, {N_1p_1} in order of decreasing value of f³(·).

STEP 2 Lump the first (p_1 − k + m) members of the sorted list into a single block M. Let {N′_1, …, N′_{k−m}} be the remaining blocks of Π(S_1). Let Π ≡ {N′_1, …, N′_{k−m}, M, S_2, …, S_m}.
Define α. Output Π as the desired partition.
STOP

Justification: For simplicity of notation, we replace f³({N_1j}) by f³(N_1j).
One flow maximization thus yields one set C_{w,v}. Thus, to determine the zero bipartite graph for B_{N_j} corresponding to λ requires |N_j|² flow maximizations. By arguing as in the case of a general submodular function, we conclude that computing the DTL of all the (f − λ)(·) for all critical values requires O(|V_L|²) flow maximizations in a flow graph that has O(|V_L| + |V_R|) vertices, O(|V_L| + |V_R| + |E|) edges and longest path from source to sink of length O(min(|V_L|, |V_R|)). Thus, the construction of the PLP of (w_RΓ)(·) requires O(|V_L|²) flow maximizations in the above flow graph. We remind the reader that Sleator's algorithm [Sleator80] for this problem would have complexity (as given in Subsection 3.6.10) O(|V_L|²(min(|V_L|, |V_R|))(|E| log|E|)).
+
+
13.7.2
+
PLP of
(-wREL)(-)
We use a flow formulation for this problem also. The function E_L(·) (of the bipartite graph B ≡ (V_L, V_R, E)) is inconvenient to work with directly for such a formulation. So we use the fact that E_L(X) = Γ(V_L) − Γ(V_L − X). Hence, (w_R E_L)(X) = (w_R Γ)(V_L) − (w_R Γ)(V_L − X). Let f(X) ≡ (w_R Γ)(V_L − X), X ⊆ V_L. Then minimizing (−w_R E_L − λ + (w_R Γ)(V_L))(·) is equivalent to minimizing (f − λ)(·). We concentrate on the construction of the PLP of the function f(X) ≡ (w_R Γ)(V_L − X).

Find Strong Fusion Set for −w_R E_L(·) − λ + (w_R Γ)(V_L)

We first convert (f − λ)(·) to a z.s.s. function. Let q'(·) be a weight function with q'(v) ≡ (f − λ)(v) ∀v ∈ V_L. Denote −q'(v) by q(v). Let p(·) ≡ (f − λ)(·) + q(·). Suppose we know that the left vertex subset T contains no fusion set of the z.s.s. function p(·). Let e ∉ T. We minimize f(X) + q(X), X ⊆ T ∪ e, e ∈ X, and test whether the minimum value is less than λ. The minimal minimizing set is a strong fusion set if the minimum value is less than λ. Otherwise T ∪ e has no fusion set.
13.7. PLP ALGORITHMS FOR (w_R Γ)(·) AND −(w_R E_L)(·)
Figure 13.2: Flow graph for minimization of f(X) + q(X), X ⊆ Z.

Let us consider the problem of minimizing

f(X) + q(X) = (w_R Γ)(V_L − X) − Σ_{v ∈ X} (w_R Γ)(V_L − v) + λ|X|,  X ⊆ V_L.

This can be posed as the flow problem of Figure 13.2, as we showed in Subsection 10.6.3 (and by the result in Exercise 10.33). The flow graph can be seen to be F(B, w'_Z, w_R) (Z is the set currently being tested for containing a strong fusion set), where w'_Z(v) ≡ λ − (w_R Γ)(V_L − v) for v ∈ Z and w'_Z(v) ≡ ∞ for v ∈ V_L − Z. Maximizing flow yields a min cut of the form

(s ∪ Y ∪ Γ(Y), t ∪ (V_L − Y) ∪ (V_R − Γ(Y))),

where Y ≡ V_L − X, V_L ⊇ Y ⊇ (V_L − Z). Next we need X ≠ ∅, i.e., Y ⊂ V_L. When Z grows from T to T ∪ e, the capacity of (s, e) falls from ∞ to (λ − (w_R Γ)(V_L − e)). If instead we make the capacity of (s, e) equal to zero, then it is easily verified that the capacity of the cut (s ∪ V_L ∪ V_R, t) is not less than the capacity of the cut

(s ∪ (V_L − e) ∪ Γ(V_L − e), t ∪ e ∪ (V_R − Γ(V_L − e))).
13. ALGORITHMS FOR THE PLP
Now if X is a fusion set, then the capacity of the cut corresponding to V_L − X is less than that corresponding to V_L − e in the original flow graph. This would hold true in the modified flow graph also, since e ∈ X if X is a fusion set. Hence, in the modified flow graph, if X is a fusion set, the cut corresponding to Y ≡ V_L − X has a lower capacity than the cut (s ∪ V_L ∪ V_R, t). We can find the maximal Y by finding the nearest sink side cut (Theorem 3.6.2). We now check if f(X) + q(X) < λ, where X ≡ V_L − Y. If so, X is a strong fusion set. Otherwise T ∪ e contains no fusion set.

One final remark needs to be made for the case where q(a) is negative for some a ∈ V_L. If such an element is in Y, then (w_R Γ)(Y − a) + Σ_{v ∈ (V_L − Y) ∪ a} q(v) cannot have higher value than (w_R Γ)(Y) + Σ_{v ∈ V_L − Y} q(v). So we lose no generality in assuming that a ∉ Y in the first place when q(a) is negative. In particular, it follows that if q(e) is negative, T ∪ e does not contain a fusion set.
Thus, the Subroutine Find Strong Fusion Set reduces to the following. Suppose we know that T contains no fusion set. Let e ∉ T. If (λ − (w_R Γ)(V_L − e)) is negative, T ∪ e contains no fusion set. Otherwise, build the flow graph F(B, w'_T, w_R) corresponding to T ⊆ V_L, where

w'_T(v) ≡ ∞, v ∈ V_L − T − e,
w'_T(v) ≡ λ − (w_R Γ)(V_L − v), v ∈ T,
w'_T(e) ≡ 0.

Maximize flow and find the nearest sink side min cut. By Theorem 3.6.2 this has the form

(s ∪ Y ∪ Γ(Y), t ∪ (V_L − Y) ∪ (V_R − Γ(Y))).

Take X ≡ V_L − Y. Check if f(X) + q(X) < λ. If yes, declare X to be a strong fusion set. Otherwise T ∪ e contains no fusion set.
We thus see that detection of a strong fusion set in this case requires O(|V_L|) flow maximizations in a flow graph that has O(|V_L| + |V_R|) vertices, O(|V_L| + |V_R| + |E|) edges and longest path from source to sink of O(min(|V_L|, |V_R|)).
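The test that the subroutine performs can be mimicked by brute force on a tiny bipartite graph (an illustrative sketch only; the book computes the same minimum with a single flow maximization, and the instance below is an assumption):

```python
from itertools import combinations

# f(X) = (w_R Gamma)(V_L - X) and q(v) = lam - f({v}); the subroutine looks
# for the minimal X with e in X, X inside T + {e}, minimizing f(X) + q(X),
# and declares X a strong fusion set when that minimum is below lam.

def wR_gamma(adj, wR, left):
    nbrs = set().union(*(adj[v] for v in left)) if left else set()
    return sum(wR[u] for u in nbrs)

def find_strong_fusion_set(adj, wR, VL, T, e, lam):
    f = lambda X: wR_gamma(adj, wR, VL - X)
    q = lambda X: sum(lam - f({v}) for v in X)
    best_val, best_X = None, None
    for r in range(len(T) + 1):
        for comb in combinations(sorted(T), r):
            X = set(comb) | {e}
            val = f(X) + q(X)
            # strict "<" keeps the first (smallest) minimizer found,
            # standing in for the minimal minimizing set
            if best_val is None or val < best_val:
                best_val, best_X = val, X
    return best_X if best_val < lam else None

adj = {'a': {'x'}, 'b': {'x', 'y'}}      # left vertices a, b; right x, y
wR = {'x': 2, 'y': 1}
N = find_strong_fusion_set(adj, wR, {'a', 'b'}, T={'a'}, e='b', lam=2.5)
assert N == {'a', 'b'}                   # {a, b} is a strong fusion set here
```

For larger λ no subset drops below λ and the routine reports that T ∪ e contains no fusion set.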
If the problem were to determine a quasi-fusion set, the algorithm is essentially the same as above. The differences are:

i. Y must be made minimal;

ii. if f(X) + q(X) ≤ λ and X is a nonsingleton set, we declare that X is a quasi-fusion set.
Exercise 13.7 In order to ensure that the minimization of (w_R Γ)(Y) + q(V_L − Y) takes place over Y ⊇ V_L − Z, we put the capacity of (s, v) = ∞ whenever v ∈ V_L − Z. Show that it is adequate to make this capacity λ instead of ∞, provided λ ≥ (w_R Γ)(V_L).
Exercise 13.8 In the flow graph of Figure 13.2, show that it is unnecessary to consider the case where (λ − (w_R Γ)(V_L − v)) is negative.

Algorithm Min(f, S) for (−w_R E_L)(·)

Let B_Π denote the bipartite graph obtained from B by fusing the blocks of the partition Π of V_L into single vertices. Let (w_R E_Π)(·) be the weighted left exclusivity function of this graph. It can be verified that (w_R E_Π)(·) = ((w_R E_L) fus.Π)(·).
Let N be a strong fusion set that has just been detected. We fuse N to a single element and use Subroutine Find Strong Fusion Set on the function ((−weighted left exclusivity function of the bipartite graph B_Π) − λ)(·). This procedure is repeated until, in the last bipartite graph, this function has no fusion set. The left vertex set of this bipartite graph can be associated with a partition of V_L (each vertex of the last bipartite graph is obtained by fusing a block of the partition of V_L). This partition is the minimum partition that minimizes (−(w_R E_L) − λ)(·).
A few remarks on the initialization of Find Strong Fusion Set. Suppose T has no fusion set but T ∪ e has the strong fusion set N. This is fused to the vertex v_N and we work thenceforward with the bipartite graph B_Π. The flow graph (for the PLP of −(w_R E_L)(·)) corresponding to B_Π would have the capacities of all edges which are not leading into N unchanged. The capacity of the edge (s, v_N) is taken to be λ − (w_R Γ)(V_L − N). We know that, for the new function derived from the current weighted left exclusivity function, (T − N) ∪ {v_N} does not contain a fusion set. Subroutine Find Strong Fusion Set is now run initializing it at (T − N) ∪ {v_N}. It is therefore clear that, in order to complete Algorithm Min(f, V_L), we have to perform O(|V_L|) flow maximizations in a flow graph that has O(|V_L| + |V_R|) vertices, O(|V_L| + |V_R| + |E|) edges, and longest path from source to sink of length O(min(|V_L|, |V_R|)).
Subroutine Subdivide f(Π_1, Π_2) for (−w_R E_L)(·)

Let B ≡ (V_L, V_R, E) be the bipartite graph under discussion. Then Π_1, Π_2 are partitions of V_L. In this case f(·) = −(w_R E_L)(·). In STEP 1, f'(·) ≡ f fus.Π_1(·). Let B_Π_1 denote the bipartite graph obtained from B by fusing the blocks of Π_1 into single vertices. Let E_Π_1(·) denote the left exclusivity function of B_Π_1. It is clear that f'(·) = −(w_R E_Π_1)(·). As in STEP 2, let (Π_2) fus.Π_1 have N_1, ..., N_k as blocks and let f_j(·) ≡ f'/N_j(·), j = 1, ..., k. Let B_N_j denote the subgraph of B_Π_1 on (N_j ∪ Γ_Π_1(N_j)). Let E_N_j(·) denote the left exclusivity function of B_N_j. Then f_j = −(w_R E_N_j)(·). Now we can do Algorithm Min(f_j, N_j) using a suitable flow graph on B_N_j as described above. Thus, it is clear that Subroutine Subdivide f(Π_1, Π_2) requires O(Σ_j |N_j|) flow maximizations on a flow graph of size
O(|V_L| + |V_R|) vertices, O(|V_L| + |V_R| + |E|) edges, and longest path from source to sink of length O(min(|V_L|, |V_R|)). Thus, Algorithm P-sequence of partitions requires at most |V_L|² flow maximizations on a flow graph of the above size.
DTL of (−(w_R E_L)(·) − λ)

Let B ≡ (V_L, V_R, E) be the bipartite graph under discussion. Let Π_1, Π_2 be the minimal and maximal partitions of V_L minimizing (h − λ)(·), where h(·) ≡ −(w_R E_L)(·). Let (Π_2) fus.Π_1 have blocks N_1, ..., N_k. Let B_Π_1, B_N_j, E_Π_1, E_N_j, etc. have the same meanings as above. It is clear that h fus.Π_1(·) = −(w_R E_Π_1)(·) and h fus.Π_1/N_j(·) = −(w_R E_N_j)(·). Let

f_j(X) ≡ (w_R Γ_N_j)(N_j − X), X ⊆ N_j,

where Γ_N_j(·) is the left adjacency function of B_N_j. The minimizing partitions of N_j for (−(w_R E_N_j) − λ)(·) are identical to those for (f_j − (λ + (w_R Γ_N_j)(N_j)))(·). Let λ̂ ≡ λ + (w_R Γ_N_j)(N_j) and let f̂_j ≡ (f_j − λ̂)(·) − w(·), where w(·) is a weight function on N_j with w(e) ≡ (f_j − λ̂)(e) ∀e ∈ N_j. The function f̂_j(·) is a type (000) function. Our problem is to find minimal zero sets containing a given pair {v_i, v_k} ⊆ N_j. This problem is the same as the flow problem considered on page 551, with the added condition that {v_i, v_k} ⊆ X. (Note that in this case Z = N_j = left vertex set of B_N_j. The flow graph is as in Figure 13.2.) This can be handled by putting the capacities of the edges going into v_i and v_k equal to zero while maximizing flow. We look for a nearest sink side min cut, i.e., a min cut of the form (s ∪ Y ∪ Γ(Y), t ∪ (N_j − Y) ∪ (Γ(N_j) − Γ(Y))) such that Y is maximal under the condition that {v_i, v_k} ∩ Y = ∅ (see Remark below). Then C_{v_i,v_k} = N_j − Y. Thus, to determine all the C_{v_i,v_k} corresponding to λ and N_j requires O(|N_j|²) flow maximizations on the flow graph associated with B_N_j as given on page 551. To determine all the zero bipartite graphs corresponding to λ requires O(Σ_j |N_j|²) flow maximizations. To determine all such graphs for all the λ requires O(|V_L|²) flow maximizations on a flow graph that has O(|V_L| + |V_R|) vertices, O(|V_L| + |V_R| + |E|) edges and length of longest path from source to sink equal to O(min(|V_L|, |V_R|)).
Remark: Since f̂_j(·) is a type (000) function, we can show the following:

When λ̂ = 0, f̂_j(·) is a modular function and therefore all subsets of N_j are zero sets.

When λ̂ > 0, if the capacities of (s, v_i), (s, v_k) are made equal to zero, the capacity of the cut separating t from the rest is greater than the capacity either
- of the cut separating s from the rest, or
- of the cut (s ∪ Z ∪ Γ(Z), t ∪ (N_j − Z) ∪ (Γ(N_j) − Γ(Z))), where Z = V_L − {v_i, v_k}.

Thus, in any case, the nearest sink side min cut would yield a subset Y s.t. {v_i, v_k} ∩ Y = ∅ and, as mentioned above, we can take C_{v_i,v_k} = N_j − Y.
As in the case of (w_R Γ)(·), the construction of the PLP in this case also has complexity (using Sleator's flow algorithm [Sleator80]) O(|V_L|² (min(|V_L|, |V_R|)) (|E| log |E|)).
13.8 Structural Changes in Minimizing Partitions

In this section we study the structural changes in the maximal and minimal minimizing partitions of f(·) (f(·) submodular) as the set grows, and how to exploit these changes to improve the efficiency of the PLP algorithms. The key result which we need is Theorem 12.2.3, which assures us that the blocks of both the minimal minimizing partition and the maximal minimizing partition grow as the set grows. We have already seen that the blocks of minimal minimizing partitions of (f − σ)(·) are elementary separators of (f − σ)_t(·) (Theorem 12.5.1). The maximal minimizing partition of (f − σ)(·) would of course have this property with respect to (f − σ_next)_t(·), where σ_next is the next lower critical value after σ. But it also has an interesting and useful property when the set grows without changing its (f − σ)_t(·) value, which is the main result of this section. This result, given below, is a routine generalization of ideas present in [Patkar+Narayanan91], [Patkar92]. We denote the maximal (minimal) partition that minimizes f(·) over partitions of X ⊆ S by Π_max(X) (Π_min(X)).
Theorem 13.8.1 Let f(·) be an increasing submodular function over subsets of S. Let X ⊆ S and let e ∈ S − X. Then f_t(X) = f_t(X ∪ e) iff there is a block N of Π_max(X) s.t. f(N) = f(N ∪ e) and N ∪ e is a block of Π_max(X ∪ e).

Proof: Since f(·) is an increasing submodular function, so must f_t(·) be (Exercise 12.2). Further, by Theorem 12.2.3, the blocks of Π_max(X) are each contained in some block of Π_max(X ∪ e).

If: Clearly the blocks other than N, N ∪ e are identical in Π_max(X) and Π_max(X ∪ e). Further, f(N) = f(N ∪ e). Hence, f(Π_max(X)) = f(Π_max(X ∪ e)) and f_t(X) = f_t(X ∪ e).

Only if: Let f_t(X) = f_t(X ∪ e). Let M be the block of Π_max(X ∪ e) that has e as a member. Since each block of Π_max(X) is contained in a block of Π_max(X ∪ e), it follows that

M = N_1 ∪ ... ∪ N_k ∪ e,

where N_1, ..., N_k are blocks of Π_max(X). The remaining blocks of Π_max(X), if any, would be blocks also in Π_max(X ∪ e). We have f(Π_max(X ∪ e)) = f(Π_max(X)). Hence,

f(M) = f(N_1) + ... + f(N_k).

But

f(M) ≥ f(M − e) = f(∪_{i=1}^k N_i).

Hence, the partition Π of X which has ∪_{i=1}^k N_i as a block and the others as in Π_max(X) also satisfies f_t(X) = f(Π). If k > 1, Π > Π_max(X), which is a contradiction. We conclude that k = 1 and, therefore, M = N ∪ e, where N is a block of Π_max(X), and further f(M) = f(N), as required. □
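Here f_t is the lower truncation, f_t(X) = min over partitions Π of X of Σ_{N ∈ Π} f(N). On a tiny increasing submodular function both f_t and the equality condition of the theorem can be checked by enumerating partitions (an illustrative sketch; the toy function is an assumption):

```python
def partitions(xs):
    """All set partitions of the list xs (Bell-number many; fine for tiny sets)."""
    if not xs:
        yield []
        return
    first, rest = xs[0], xs[1:]
    for part in partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [part[i] | {first}] + part[i + 1:]
        yield part + [{first}]

def f_t(f, X):
    """Lower truncation: minimum over partitions of X of the sum of f on blocks."""
    return min(sum(f(frozenset(B)) for B in p) for p in partitions(sorted(X)))

# increasing submodular toy function: rank of U_{2,n} plus a "nonempty" indicator
f = lambda X: min(len(X), 2) + (1 if X else 0)

assert f_t(f, {1, 2}) == 3
assert f_t(f, {1, 2, 3}) == 3            # adding e = 3 leaves f_t unchanged ...
assert f(frozenset({1, 2})) == f(frozenset({1, 2, 3}))   # ... and f(N) = f(N + e)
```

In this example Π_max({1, 2}) = {{1, 2}}, and its block N = {1, 2} grows to the block N ∪ e = {1, 2, 3} of Π_max({1, 2, 3}), exactly as the theorem predicts.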
Theorem 13.8.1 is algorithmically useful in the case of an integral submodular function f(·) on subsets of S whose lower truncation f_t(·) is a polymatroid rank function with f_t(S) ≤

t_1 ∪ t_2) = 2r(R) + |E − R|.
The dual has an identical proof.

iii. The function 2r(X) + |E − X| is submodular and hence has a unique minimal and a unique maximal set minimizing it. (The unique minimal set is in fact R, as mentioned above.) Minimizing this function has earlier been shown to be equivalent to possessing the first two properties stated in this lemma. Similarly for the dual.

iv. By Lemma 11.3.3, E − R is the set of coloops of M(G) ∨ M(G). Thus, every element in R can be put outside some base of M(G) ∨ M(G); equivalently, for each e ∈ R there exist maximally distant forests t'_1, t'_2 of G s.t. e ∉ t'_1 ∪ t'_2. Similarly for the dual.

v. We have ν(X) = |X| − r(E) + r(E − X). So

2ν(X) + |E − X| = (2r(E − X) + |X|) + |E| − 2r(E).

Hence, a set minimizes 2ν(X) + |E − X| iff its complement minimizes 2r(X) + |E − X|. The result is an immediate consequence. □
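The quantity 2r(X) + |E − X| can be explored by brute force on a small graph, with the graph rank computed by union-find (an illustrative sketch; the helper names are assumptions):

```python
from itertools import combinations

# Brute-force minimization of 2 r(X) + |E - X| over edge subsets X, where the
# rank r(X) is the size of a spanning forest of the subgraph on X.

def rank(X):
    parent = {}
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a
    r = 0
    for (u, v) in X:
        parent.setdefault(u, u), parent.setdefault(v, v)
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            r += 1
    return r

def min_2r_plus_rest(E):
    return min(2 * rank(X) + (len(E) - len(X))
               for k in range(len(E) + 1) for X in combinations(E, k))

# K4 is the union of two edge-disjoint spanning trees, so the minimum value
# is 2 r(E) = 6, attained at X = E (and also at X = empty set).
K4 = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
assert min_2r_plus_rest(K4) == 6
```

The minimizing sets found this way are exactly the sets R (and their closures) discussed in the lemma.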
Exercise 14.1 Show that the minimum size among all representations of forests in G equals

min_{X ⊆ E} (r(G·X) + ν(G × (E − X))),

where E ≡ E(G).

Exercise 14.2 Show that a subset A ⊆ E(G) minimizes r(G·X) + ν(G × (E − X)) iff it minimizes 2r(X) + |E − X|, where E ≡ E(G).

Exercise 14.3 Show that two forests t_1, t_2 of a graph G are maximally distant iff

|t_2 − t_1| = min_{X ⊆ E} (r(G·X) + ν(G × (E − X))),

where E ≡ E(G).

Exercise 14.4 Describe the best strategy for the Shannon switching game for each of the three cases listed in the brief solution to the problem.

Exercise 14.5 Justify the algorithm given in the brief solution to the 'maximum rank of a cobase matrix problem' and hence prove that the maximum rank of a cobase matrix equals the minimum term rank.
14. THE HYBRID RANK PROBLEM

14.3 The Hybrid Rank Problem - Second Formulation

14.3.1 Introduction
Consider the problem of analyzing a network N whose devices are naturally partitioned into blocks, with the device characteristic of each block available to us in both the conductance and the resistance form, i.e.,

v_b − R_b i_b = s_b  or  G_b v_b − i_b = ŝ_b

(R_b and G_b are not necessarily diagonal matrices). When we break up the devices into A and B to use the hybrid analysis of Chapter 6, we would not like to split the blocks, since the variables in each block are coupled. So we are led to the following second formulation of the hybrid rank problem:

Let G be a graph and let Π be a partition of E(G). Find a partition {A, B} of E(G) that minimizes r(G·A) + ν(G × B) under the condition that A (and therefore B) is a union of blocks of Π.

We know that minimizing r(G·A) + ν(G × B) over any family of subsets is the same as maximizing |A| − 2r(G·A) (= −(r(G·A) + ν(G × B) + r(G) − |E(G)|)) over the same family of subsets. This is an instance of the membership problem for a polymatroid, given a matroid expansion, described in the next subsection.

In practice, by using the method of multiport decomposition (see Chapter 8), one can replace each subnetwork by a forest subgraph, which would be adequate to represent the topological relationship between different subnetworks. Below, when we model the second formulation in terms of matroids, we therefore assume that the expansion has an independent set in place of a single element of the polymatroid. We give two ways of solving the above problem. The first solves a more general version in terms of matroids. The second converts this problem into a series of simple flow problems.
14.3.2 Membership Problem with Matroid Expansion

We begin with the matroid version of the second formulation of the hybrid rank problem:

Let M be a matroid on Ŝ and let S be a partition of Ŝ. Let

f(·) ≡ r fus.S(·) (i.e., f(X) = r(∪_{N_i ∈ X} N_i)).

Let g(·) be a weight function on S defined by g(e) ≡ f(e) ∀e ∈ S. Find the subset of S which minimizes f(X) + f*(S − X), where f*(·) is the comodular dual of f(·) relative to the weight function g(·).

The following exercise shows that the above problem is related to the membership problem for a polymatroid, given a matroid expansion.
Exercise 14.6 Let f(·) be a polymatroid rank function on subsets of S. Let f*(·) be the dual of f(·) relative to the weight function g(·), where g(e) ≡ f(e) ∀e ∈ S. Let the hybrid rank of f(·) ≡ min_{X ⊆ S} (f(X) + f*(S − X)).

Show that

i. the hybrid rank of f(·) = the hybrid rank of f*(·);

ii. a subset K minimizes f(X) + f*(S − X) iff it maximizes g(X) − 2f(X);
iii. the hybrid rank of a contraction, a restriction or a minor of f(·) cannot be more than the hybrid rank of f(·).

We now state the membership problem for a polymatroid, given a matroid expansion:

Let M be a matroid on Ŝ and let S be a partition of Ŝ. Let

f(·) ≡ r fus.S(·) (i.e., f(X) = r(∪_{N_i ∈ X} N_i)).

Find the subset of S which maximizes g(X) − f(X), X ⊆ S, where g(·) is a weight function on S.
We give a simple general solution to this problem using the matroid union operation. For the case where the matroid is graphic, a more efficient solution is given later. First, we introduce some convenient notation. Let f(·) be an integral polymatroid rank function on subsets of S ≡ {e_1, ..., e_n}. Let Ŝ be the set obtained from S by replacing each e_i by the set ê_i, with the condition that when e_i, e_j are distinct we have ê_i ∩ ê_j = ∅. Let H(·) be a function from subsets of S to subsets of Ŝ defined by H(X) ≡ ∪_{e_i ∈ X} ê_i, X ⊆ S. Let M be a matroid on Ŝ whose rank function r(·) satisfies the following:

f(X) = r(H(X))  ∀X ⊆ S.

We say that M or r(·) is an expansion of f(·), while f(·) is a fusion (aggregation) of r(·). Henceforth we take |ê| = r(ê) = f(e) ∀e ∈ S. Let g(·) be a positive integral weight function on S. If for some e', g(e') > f(e'), it is easy to see (Exercise 14.9) that e' must necessarily belong to any set that maximizes g(X) − f(X). We can therefore work with S − e' in place of S, f ∘ (S − e')(·) in place of f(·) and g/(S − e')(·) in place of g(·). So, without loss of generality, we may assume that g(e) ≤ f(e) ∀e ∈ S. Let M_α be the matroid on Ŝ with rank function α(·) defined by

α(·) ≡ (r_ê1 ⊕ ... ⊕ r_ên)(·),

where

r_êi(X) = |X|, X ⊆ ê_i, |X| ≤ r(ê_i) − g(e_i),
r_êi(X) = r(ê_i) − g(e_i), otherwise
(i.e., M_α is a direct sum of uniform matroids on the ê_i).

Exercise 14.7 Derived matroids are expansions of derived polymatroids: Let f(·) be a polymatroid rank function on subsets of S and let M on Ŝ be a matroid expansion of f(·) with rank function r(·). Further, let |ê| = r(ê) ∀e ∈ S. Let T̂ = H(T). Then

i. f/T(·) has the expansion M · T̂;

ii. f ∘ T(·) has the expansion M × T̂;

iii. let f*(·) denote the dual of f(·) with respect to the weight function g(·) defined by g(e) ≡ f(e) ∀e ∈ S. Then M* is an expansion of f*(·).

The following theorem assumes the notation of page 577 for g(·), f(·), r(·), α(·), M, M_α, etc.
Theorem 14.3.1 Let f(·) be an integral polymatroid rank function and let g(·) be a positive weight function with g(e) ≤ f(e) ∀e ∈ S. Let r(·), α(·), M, M_α be as above. Then:

i. g(X) − f(X) = |H(X)| − (r + α)(H(X)), X ⊆ S.

ii. Let R̂ be the set of noncoloops of M ∨ M_α and let R̂_1 be its closure relative to α(·). Then R̂_1 maximizes |X̂| − (r + α)(X̂), X̂ ⊆ Ŝ; R̂_1 = H(R_1) for some R_1 ⊆ S; and g(R_1) − f(R_1) = |R̂_1| − (r + α)(R̂_1).

iii. max_{X ⊆ S} (g(X) − f(X)) = max_{X̂ ⊆ Ŝ} (|X̂| − (r + α)(X̂)).

iv. The vector g belongs to the polymatroid P_f iff there exists a base b̂ of M s.t. |b̂ ∩ ê| ≥ g(e) ∀e ∈ S.
their rank functions. Let b l ?b2 be maximally distant bases of M I , M2 respectively (i.e., bl u bz is a base of M1 V M z ) . Let R be the set o f noncoloops of M1 V M2 , and let R , denote its closure in the matroid M I . Then i.
R 1 maximizes I 2 1
ii. Rl
-
-(TI
+ . 2 ) ( 2 ) , R c S,
R is a set of coloop~of M~ . R l ,
iii. R , contains no coloops of M~ , Rl ,
iv. bl
n A,, b2 n Rl are disjoint bases of M I . R , , M z . R z respectively.
579
14.3. T I f E HYBRID R A N K PROBLEM - SECOND FORMULATION Proof : By Lemma 11.3.3, R maximizes we are done. Let 3 R. We must have
I 2 I -(rl + r 2 ) ( X ) , X G 9. If
R1
=R
1 Rl I -(r1 + 7 - 2 ) ( & ) >)
e,EX
= g ( X ) - f ( X ) as required.
ii. By Lemma 14.3.1, we know that R l maximizes I ,? I - ( r + a ) ( k ) , X ,!?. Next let i:;i n R1 # 0. We will show that 6i C R1. By the definition of a,,(.) any subset of E i whose size does not exceed ( r ( & )- g ( & ) ) is independent in M,,If 1 ei fR1 l 15 r ( & ) - g(&),-then ii n Rl would be a set of coloops of M , . R I . Since, by Lemma 14.3.1, R1 contains no coloops of M, R 1 , we conclude that I & n i l. ; 1 r ( & ) - g ( & ) = a,, (&). Next Rl is closed relative to a(.).But all elements in 6i are dependent on any subset of &i of cardinality a ( & ) .Hence, Rl 2 i i . Thus, R I = H ( l 3 l ) for some R1. Hence, g(R1) - f(R1)
=I
Rl
I4 .+ a ) ( & ) .
(*I
iii. (i) above implies that m a z x c s g ( X )- f ( X ) L maz*@
I 2 I 4.+ a>(Xi>
14. T H E H Y B R I D RANK P R O B L E M
580
while (ii) implies the reverse inequality. This proves the required result. iv. The vector g belongs to Pf iff
niazxcsg(~)- f ( ~=)rnazgc3 1 X
1 -(r
+ a ) ( X )5 0
This happens iff there are bases b, b, of M , M , respectively s.t. b U b, = S, i.e., iff (bn.G)u(b, ni?)= 2 V e E S i.e., iff I 6 n 2 1 +r(E) - g(2) 2 r ( 2 ) i.e., iff I bn 6 I> g(e) as required. Suppose we are given a matroid expansion of f(.).It is easy to build ,an expansion of kf(.), k a positive integer, as follows. Build k disjoint copies of s1,. . . , s k . An element ei E S now has k f ( e i ) copies in Si. Let us call this set &(k). Build copies M I , . . . ,MI, of the matroid M on !?I,. . . ,S k with rank functions r l ( . ) ,. ' , rk..) respectively. Let T - + ( . ) = @:xl ri(.). Define H k ( X ) = Ue,EX6 , ( k ) , X S. Since f ( X ) = T ( U ~ &) ,we~ must ~ have
ut=l
c
k f ( W = r+(
u
F,
&(k)).
EX
We can, therefore, handle the problem of maximizing g ( X ) - kf(X), X C S ,k E Z+ which arises in connection with the construction of the principal partition of ( f ( . ) , g ( + ) )by , using Theorem 14.3.1. Exercise 14.8 Concerning Theorem 14.3.1
Let M be the expansion o f f (.). Show that
- Good expansion:
i. If C is a circuit of M . 6 and a E C , then M . (S - u ) is an expansion o f f (.) with P^ - a in place o f 2.
ii. There exists an expansion o f f (.) with no more than
CeES f ( e ) elements.
Exercise 14.9 Concerning Theorem 14.3.1 - Better expansion: Let h ( e ) f(S)- f(S - e ) , e E S. Prove:
=
i. Ifg(e) > f ( e ) thene belongs toeveryset t h a t m a x i m i z e s g ( X ) - f ( X ) , X
C: S.
ii. (f-h)(.) is an integral polymatroid rank function and a set maximizes g ( X )-
f ( X ) ,X
C S i f fit maximizes
( g - h ) ( X )- (f - h ) ( X ) X , C: S.
iii. I f M is an expansion for f(.),then the expansion of (f - h)(.) can be built as follows: Foreache E S f i n d a b a s e 6 , 0fM.s.t. 1 b,n(S-e^) I=r(M.(S-E)). Let h ( e ) =I 6 , n 2 . Let Mred = M x (S - UeES(be n 2 ) ) . Then Mred is an expansion of (f - h )(.) .
Exercise 14.10 Let f(·) be an integral polymatroid rank function on subsets of S. Let f*(·) be the comodular dual of f(·) relative to g(·), where g(e) ≡ f(e), e ∈ S. Let the notation for matroid expansion be as in Theorem 14.3.1. If M is an expansion of f(·) on Ŝ, we say a pair (T̂_1, T̂_2), T̂_1, T̂_2 ⊆ Ŝ, is a common independent pair of M, M* relative to S iff

- T̂_1, T̂_2 are independent in M, M* respectively, and
- |T̂_1 ∩ ê| = |T̂_2 ∩ ê| ∀e ∈ S.

The size of (T̂_1, T̂_2) is defined to be |T̂_1| (= |T̂_2|).

Prove:
i. If M is an expansion of f(·), M* is an expansion of f*(·). Further, if f(S) = f(S − e) ∀e ∈ S, then f*(e) = f(e).

ii. (Assuming |ê| = f(e) ∀e ∈ S)

min_{X ⊆ S} (f(X) + f*(S − X)) = max (size of a common independent pair of M, M* relative to S) ≥ max (size of a common independent set of M, M*).
Further, there is an expansion M_1 of f(·) s.t. the inequality above becomes an equality.

Complexity of solving the membership problem given a matroid expansion

Before using the present method, the size of the problem has to be reduced as in Exercise 14.8 and Exercise 14.9, i.e.,
g(e)
Off(.)
> f(e) we work with (S - e ) in place of S and f o (S - e)(.)in place a d
iii. we work with (f - h ) ( . ) and (g - h ) ( . ) ,in place of f ( . ) , g ( . )(where h(e) = f(s)- f(S- e ) ,e E S as in Exercise 14.9). Without loss of generality, we may assume that g(e) 5 f(e) Ve E S and h ( . )= 0. The rank of the matroid expansion of f(.)is f(S)and the size of is CeaES f(ei). Let m = max,* E~ f(ei). Now the complexity of Algorthm Matroid Union\is in terms of calls to the independence oracle and some elementary steps. The independence oracle for M , in the present case is trivial since M a is the direct sum of uniform matroids. So we will speak only of calls to the independence oracle of M .
582
14. THE HYBRID RANK PROBLEM
Let T denote r ( M ) .Let b, b, be the current bases of M , M,. To build G(b, b,) takes at most (I S I - T ) T calls to the independence oracle of M . This has to be done at most (I S 1 - T ) times. So the number of calls to the independence oracle is O(T(I9 I - T ) ’ ) . The complexity in terms of 1 S 1, noting that T , I S I are not more than m I S 1, is O(m3 I S 13). Next we consider the complexity of performing the bf s in G(b,b,), which takes us from S - ( b u b,) to an element in b n b,. The number of edges of this graph is O((l S I - 7 ) ~ ) . So the complexity of the search is also O((l 3 I -r)r). Thus, the number of elementary steps in Algorithm Matroid Union is O ( r ( (3 1 - T ) ~ )which again reduces to O(m3 I S 13). If m is less than 1 S 1, the present method compares favourably with that of the general pseudopolynomial algorithm of Cunningham [Cunningham841 (O(m I S 16. 5 Zog(m I S I)).) The membership problem, as it is usually stated, does not assume the availability of a matroid expansion. So the present algorithm cannot strictly be regarded as a solution to that problem. Exercise 14.11 Rewrite the algorithm for the solution of the membership problem given a matroid expansion, using only bases of the matroid M .
14.3.3
Membership Problem with Graphic Matroid Expansion
The special case where we need to maximize g ( X ) - Xf(X), X C S , given a graphic rnatroid expansion o f f ( . ) , is more relevant to electrical network analysis. For this case two alternative flow based procedures are possible. Neither of these methods attempts to build maximally distant trees. The optimum set, or a set contained in it, appears either in the source or the sink side of a min cut in the flow graph. The first of these procedures is edge based and is given below. The second procedure would be evident from the discussions in Section 13.9 on how to convert certain principal partition problems related to one side of a bipartite graph into principal lattice of partition problems (see Theorem 13.9.1). The complexity of the procedure, presented here, is substantially better than the one made possible by Theorem 14.3.1. We, however, follow the notation of the above mentioned theorem. The flow technique that we describe below is similar to that of Imai [Imai83] The difference lies in the following: We grow sets until a minimal nonvoid set can be found with positive value of h t ( X ) (defined below). This we contract and work with an updated ht(.). We continue this procedure until no sets can be found with a positive value of current ht(.). At this stage the union of all the contracted sets gives us the desired optimum set. Imai computes the optimal set using the ‘fundamental functions’ of the concerned polymatroid.
in that the flow graphs are identical.
Let G be a graph with S E(G) and rank function r ( . ) . Let S be a partition , 5 S. Let f(.) be defined on subsets of S s.t. of S and let H ( X ) G U e c E X e iX f ( X ) G T ( H ( X ) ) .Henceforth, in this subsection, H ( X ) would be denoted by 2.
583
14.3. T H E H Y B R I D R A N K P R O B L E M - SECOND FORMULATION
Let g ( . ) be a weight function on S . We further assume that
the subgraph of G on
i3
is connected for each e E S.
(*)
This assumption is necessary for the following procedure t o work. (Note that such an assumption would not be required for a procedure based on Theorem 14.3.1). Let V ( X )K set of end points of edges of G in members of X, X C S. We then have, using Assumption (*), f ( X ) = (I V I -l)t(X). Let h ( X ) E g(X) -- X(l V 1 -1)(X), X C S. We remind the reader that h t ( X ) G E masngp,h(n). In order to maximize g ( X ) - Xf(X),X S , we need to maximize h t ( X ) z g ( X ) - X(l V I -l)t(X), X C S.
s
Let us start with a set X, s.t. h t ( X ) _< 0 VX E X,. Let e 6 X,. We find the minimal nonvoid subset X , that maximizes h ( X ) among subsets of X , U e by using an algorithm called say M a z ( S , h). If h ( X , ) 5 0, we conclude that X , U e satisfies h t ( X ) 5 0 V X X , U e. For, if II is any partition of Y X , u e , then &(II) = CNiEn h ( N i ) 5 0. If h ( X , ) > 0 we contract S to G x ($ - X,).The function V ( . ) is now defined over subsets of S - X , , with V ( X ) = set of end points of edges of 4 x (S - X,) which are present in members of X, X S - X,. The function r(.) is now the rank function of G x (9 - X , ) and g ( . ) , the restriction of the original weight function to (S - X , ) . The functions h(.),ht(.) are defined as before in terms of g ( . ) and V(.). We initialize the algorithm M a z ( S - X , , h) at X , - X , and repeat the process. The process terminates when the current S - K has no nonvoid subset at which h(.) takes a positive value. We then declare K to be the minimal set that maximizes
s
g(X) - A(( V I -1)t(X),X
c s.
The flow formulation (The discussion that follows needs familiarity with Subsection 3.6.10 and Subsection 10.6.3. It would help t o have Figure 10.3 at hand. For the present discussion one may replace, in that figure, X by X , , the vertex VI by e , w ~ ( w 1 )by 00, w 2 by ei, w t ( w ) by g(ei) and WR(V) by I). We now consider the problem of maximizing h ( X ) among nonvoid subsets of %. This is equivalent to minimizing A(l V 1 ( X ) )+ g ( Z - X ) , S C X C 2.This is a flow problem (as described in Subsection 3.6.10 and in Subsection 10.6.3). But we have to be careful to confine ourselves only to nonvoid subsets. As in Exercise 10.33 we do this by forcing the newly added element e to be a member of the sets over which optimization is carried out. The flow graph for this problem is built as , z ) with 2 as the left vertex follows: First build the bipartite graph B E (2,V ( Z ) E set, I; being the current graph, V ( 2 ) in G as the right vertex set, an edge between 21 E V(2)and ei E 2 iff in the graph Q there is an edge in &i that is incident on v. Now add a source vertex s and join it to each vertex in 2 , a sink verte* t and join it to each vertex in V ( G ) .The capacities are: edge ( s , e i ) has capacity g(ei),ei # e,ei E 2,
14. THE HYBRID RANK PROBLEM
• edge (s, e) has capacity ∞,

• each edge (e_i, v) of B has capacity ∞,

• edge (v, t) has capacity λ, ∀v ∈ V(Z).
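The cut structure just described can be checked by brute force on a toy instance. In the sketch below all data (the set Z, the boundary-vertex map V(·), the weights g and the value of λ) are made-up assumptions, and for the purpose of the illustration h is taken as h(X) = g(X) − λ|V(X)|; minimizing the cut capacity g(Z − X) + λ|V(X)| over nonvoid X containing e is then the same as maximizing h(X), since the two objectives differ only by the constant g(Z).

```python
from itertools import combinations

# Made-up toy data: elements of Z, their boundary vertices V, weights g
Z = ['e1', 'e2', 'e3']
V = {'e1': {0, 1}, 'e2': {1, 2}, 'e3': {2, 3}}
g = {'e1': 3, 'e2': 2, 'e3': 4}
lam = 2
e_new = 'e3'                      # the newly added element, forced into X

def vertices(X):
    return set().union(*(V[x] for x in X))

def subsets_containing_e():
    rest = [x for x in Z if x != e_new]
    for k in range(len(rest) + 1):
        for c in combinations(rest, k):
            yield frozenset(c) | {e_new}

def cut_value(X):
    # capacity of the cut separating {s} U X U V(X) from the rest
    return sum(g[x] for x in Z if x not in X) + lam * len(vertices(X))

def h(X):
    # h(X) = g(X) - lam*|V(X)|; note cut_value(X) = g(Z) - h(X)
    return sum(g[x] for x in X) - lam * len(vertices(X))

best_cut = min(subsets_containing_e(), key=lambda X: (cut_value(X), len(X)))
best_h = max(subsets_containing_e(), key=lambda X: (h(X), -len(X)))
assert best_cut == best_h         # same minimal optimizer, as the text argues
```

With these numbers the minimum cut has capacity 8 and is attained at X = Z itself; the tie-breaking keys select the minimal optimal set in both searches.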
The nearest source side min cut of this flow graph would have the form (s ∪ X_e ∪ V(X_e), t ∪ (Z − X_e) ∪ (V(Z) − V(X_e))), e ∈ X_e. As discussed in Subsection 3.6.10, g(Z − X_e) + λ|V(X_e)| would have the minimum value among all nonvoid subsets of Z and X_e would be a minimal nonvoid such set. Hence, h(X_e) would have the maximum value among all nonvoid subsets of Z and X_e would be a minimal nonvoid such set.
From this flow graph, we can build the flow graph corresponding to G × (Z − X_e) by first building the subgraph B′ of B on X_e ∪ Γ(X_e). Next Γ(X_e) is partitioned into V_1, ..., V_k corresponding to the connected components B′_1, ..., B′_k of B′. V_1, ..., V_k are now made into single nodes, X_e and all edges incident on X_e are deleted, and single edges go from V_1 to t, ..., V_k to t, each with capacity λ. (It can however be shown easily that the bipartite graph B′ is connected, i.e., k = 1.) All other vertices and edges of the original flow graph and the capacities associated with the latter are left unchanged.
Justification
To justify the above algorithm for maximizing h_t(X) ≡ g(X) − λ((|V| − 1)_t)(X), we only need to explain the contraction step. Let h(X) ≤ 0, ∅ ⊂ X ⊆ X_1, and let X_e be the minimal set that maximizes h(·) among nonvoid subsets of X_1 ∪ e. If h(X_e) > 0, it follows that h(·) reaches its maximum among partitions of X_e at {X_e} and h(X_e) = h_t(X_e). Further it is clear, since h(X) ≤ 0, ∅ ⊂ X ⊂ X_e, that h_t(X) ≤ 0 ∀X ⊂ X_e. Since h_t(·) is a supermodular function, use of the supermodular inequality and the fact that h_t(X) < h_t(X_e) ∀X ⊂ X_e would reveal that X_e is a subset of any set that maximizes h_t(·) over subsets of S. So we can work with the contraction of h_t(·) over subsets of S − X_e, and if Y is the minimal set that maximizes the latter function, Y ∪ X_e would be the minimal set that maximizes h_t(·). Now

(h_t ∘ (S − X_e))(X) = (g ∘ (S − X_e))(X) − λ(r ∘ (S − X_e))(X), X ⊆ S − X_e.

Since g(·) is a weight function, contraction is the same as restriction. The function (r ∘ (S − X_e))(·) is the rank function of the graph G × (S − X_e).
Exercise 14.12 Improvement of graphic expansion: Let G be a graph on S with r(·) as its rank function. Let r̃(·) be the expansion for f(·) on subsets of S̃. We follow the notation of Exercise 14.9. Show that

i. the matroid M_red ≡ M(G × (S̃ − ∪_{e∈S}(b̃_e ∩ ẽ))), where M(G) denotes the matroid whose independent sets are subforests of G, is an expansion of f(·).
14.3. THE HYBRID RANK PROBLEM - SECOND FORMULATION
ii. Let us assume without loss of generality that |ẽ| = r(e) ∀e ∈ S (see Exercise 14.8). In the graph G_red ≡ G × (S̃ − ∪_{e∈S}(b̃_e ∩ ẽ)), if G·ẽ is connected, then G_red·(ẽ − b̃_e ∩ ẽ) would be a tree with no node incident only on edges of ẽ. Further no edge of the tree would be a cutset of G_red.

iii. Assume, for simplicity, that G_red·(ẽ − b̃_e ∩ ẽ) is connected for each e. If we replace each (ẽ − b̃_e ∩ ẽ) in G_red by a tree on the same set of nodes, the rank function of the resulting graph would be an expansion for f(·).
We note that the set S that occurs in the above discussions would be the edge set of a reduced network. In the present case it can be built very easily from the original graph of the network. (For matroids, in Exercise 14.9, we contracted certain interior 'hidden' elements within each ẽ, e ∈ S.) Here each e represents a connected subgraph. For solving the present membership problem we can replace each subgraph by a tree on the 'boundary nodes' of the subgraph (see Exercise 14.12), contracting branches appropriately to eliminate cutsets. So each e would represent no more than r(G_red·ẽ) edges in the new reduced graph. However, g(e) remains the same as before. In most practical network problems large subnetworks have comparatively few boundary nodes. Thus the S̃ specified in the above discussions would be much smaller in size than the edge set of the original graph (say ≈ 10%). If this is kept in mind, it would be clear from the discussion below that the algorithm we have presented is practical in the sense that it can be included in the preprocessing stage of a circuit simulator.
Complexity

Now we discuss the complexity of the above algorithm, first for general λ and later for λ = 2, in both cases with g(e) = f(e) = r(G) ∀e ∈ S. We remind the reader that |S̃| = Σ_{e∈S} f(e).

i. General λ: We have to perform |S| flow maximizations. The number of edges in the flow graph is O(|S̃|). The complexity of one flow maximization using the Sleator algorithm is O(|S̃|(|S̃| log |S̃|)) elementary steps. (The Sleator algorithm [Sleator80] proceeds in stages. Each stage has complexity O(|S̃| log |S̃|). The number of stages is the length of the longest undirected path from source to sink and is of O(min(|S̃|, |V|)).) Here we may assume |S̃| ...
w_P(·) ≡ Σ_{i∈I} r(V_{E,P_i} × P_i).
Then

i. ρ_E(·), ρ_P(·) are polymatroid rank functions while w_E(·), w_P(·) are modular functions

ii. (ρ_E − w_E)(·) = (ρ_P − w_P)(·)

iii. the fusion rank of V_E relative to {E_1, ..., E_k} is equal to the fusion rank of V_P relative to {P_1, ..., P_k}

iv. the fission rank of V_E relative to {E_1, ..., E_k} is equal to the fission rank of V_P relative to {P_1, ..., P_k}.

Proof :
For proof of the first two parts see Problem 8.8.
iii. The fusion rank of V_E relative to {E_1, ..., E_k} = (ρ_E − w_E)(S) = (ρ_P − w_P)(S) = fusion rank of V_P relative to {P_1, ..., P_k}.
iv. We observe that if ((V_{E,P_i})_k, V_P) is a compatible multiport decomposition of V_E then ((V*_{E,P_i})_k, V*_P) is a compatible multiport decomposition of V*_E (by Theorem 8.2.1 and the facts that contractions and restrictions are orthogonal duals and that if V_1 ⊇ V_2 then V*_2 ⊇ V*_1). We note also that the fission rank of V_E relative to {E_1, ..., E_k} equals the fusion rank of V*_E relative to {E_1, ..., E_k} (Exercise 14.23).

Let ρ*_E(·) be defined by replacing V_E by V*_E in the definition of ρ_E(·); w*_E(·) by replacing V_{E,P_i} by V*_{E,P_i} in the definition of w_E(·); ρ*_P(·) by replacing V_P by V*_P in the definition of ρ_P(·); and w*_P(·) by replacing V_{E,P_i} by V*_{E,P_i} in the definition of w_P(·). It is clear that (ρ*_E − w*_E)(·) = (ρ*_P − w*_P)(·). Hence the fission rank of V_E relative to {E_1, ..., E_k} = the fusion rank of V*_E relative to {E_1, ..., E_k} = (ρ*_E − w*_E)(S) = (ρ*_P − w*_P)(S) = fusion rank of V*_P relative to {P_1, ..., P_k} = fission rank of V_P relative to {P_1, ..., P_k}. □
Theorem 14.5.2 Let ((V_{E,P_1}, ..., V_{E,P_k}); V_P) be a minimal multiport decomposition of a vector space V_E on E and let Π ≡ {E_1, ..., E_k}. Then

i. the fusion rank of V_E relative to Π ≡ {E_1, ..., E_k} equals the rank of V_P

ii. the fission rank of V_E relative to Π ≡ {E_1, ..., E_k} equals the nullity of V_P

iii. if V_{PQ}, V_Q are s.t. (V_{PQ} ↔ V_Q) = V_P, and further if (V_{PQ} ↔ V′_Q) has P_1, ..., P_k as separators, then (V_{EP} ↔ V_{PQ}) ↔ V′_Q has E_1, ..., E_k as separators

iv. if V_{EQ}, V_Q are s.t. (V_{EQ} ↔ V_Q) = V_E, and further if (V_{EQ} ↔ V′_Q) has E_1, ..., E_k as separators, then (V_{EP} ↔ V_{EQ}) ↔ V′_Q has P_1, ..., P_k as separators

v. the generalized hybrid rank of V_E relative to Π ≡ {E_1, ..., E_k} equals the generalized hybrid rank of V_P relative to {P_1, ..., P_k}.
Proof : We first observe that a minimal decomposition is a compatible decomposition (Theorem 8.4.1), and therefore (Exercise 8.11) the ordered pair (⊕_j V_{E,P_j}, V_P) is compatible. Further, by the above mentioned theorem, when the decomposition is minimal, r(V_P × P_i) = r(V*_P × P_i) = 0, i = 1, ..., k.

i. Follows from Lemma 14.5.4 when we observe that the fusion rank of V_P relative to {P_1, ..., P_k} is its rank since r(V_P × P_i) = 0, i = 1, ..., k.

ii. Follows from the above mentioned lemma when we observe that the fission rank of V_P relative to {P_1, ..., P_k} is its nullity (ν(V_P)), since r(V*_P × P_i) = 0, i = 1, ..., k.
14.5. T H E HYBRID R A N K PROBLEM - F O U R T H F O R M U L A T I O N
iii. We have (V_{EP} ↔ V_P) = V_E. Hence, V_{EP} ↔ (V_{PQ} ↔ V_Q) = V_E. Hence by Lemma 14.5.1,

(V_{EP} ↔ V_{PQ}) ↔ V_Q = V_E.

Next, by the same lemma,

(V_{EP} ↔ V_{PQ}) ↔ V′_Q = V_{EP} ↔ (V_{PQ} ↔ V′_Q).

By Lemma 14.5.3, since V_{EP} ≡ ⊕_j V_{E,P_j} and (V_{PQ} ↔ V′_Q) has the P_j as separators, V_{EP} ↔ (V_{PQ} ↔ V′_Q) has the E_j as separators.

iv. Since ((V_{E,P_i})_k, V_P) is a minimal decomposition, (⊕_j V_{E,P_j}, V_P) is compatible. Then, by Lemma 14.5.2,

(⊕_j V_{E,P_j}) ↔ ((⊕_j V_{E,P_j}) ↔ V_P) = V_P,

i.e., (⊕_j V_{E,P_j}) ↔ V_E = V_P, i.e., ((V_{E,P_i})_k, V_E) is a decomposition of V_P. The result now follows by arguing as in (iii) above. (Note that in (iii) above we did not use minimality of the decomposition.)

v. This follows directly from (iii) and (iv) above and the definition of generalized hybrid rank (page 599).
Remark: In the above theorem it may be noted that the fifth part depends only on compatibility of the decomposition and not on its minimality.
If V_E is the voltage space of a graph G and {E_1, ..., E_k} a given partition of E, we can build a minimal multiport decomposition ((V_{E,P_j})_k, V_P) such that V_P is the voltage space of a graph G_P (see Algorithm (Port minimization) of Chapter 8). In G_P each G_P·P_j, j = 1, ..., k, would appear as a forest graph containing no cutsets of G_P. By using the theorems listed in this subsection it follows that for computing the generalized hybrid rank we can work with G_P rather than G. The results of Section 14.4 imply that the same is true in the case of the third formulation, as well as for computing the minimum length fusions and fissions sequence.

Exercise 14.24 A polymatroid membership problem: Give a polynomial algorithm for the following problem: Find if there exists an independent set of columns of a representative matrix of V_E that contains precisely k_j columns from E_j, j = 1, ..., k.
14.5.4 Relation between the Hybrid Rank of a Representative Matrix of a Vector Space and its Generalized Hybrid Rank relative to a Partition
In this subsection we relate the generalized hybrid rank of a vector space relative to a partition of the underlying set to the hybrid rank of the matroid of a modified representative matrix. Analogous to the case of the third formulation (Subsection 14.4.3), we show that the hybrid rank of a vector space on E, relative to a partition Π ≡ {E_1, ..., E_k} of E, is the minimum of the hybrid ranks of the matroids associated with matrices which are obtained by replacing the columns in E_j by a set of independent cospanning columns.
A few preliminary definitions: Let M be a matroid on E. The hybrid rank of M has already been defined to be

min_{K ⊆ E} r(M·K) + ν(M × (E − K)).

Let A be a matrix. The matroid M(A) is the matroid on the set of columns of A where a subset is independent in M(A) iff the corresponding set of columns is independent in A. We say M(A) is associated with A. Let V_E be a vector space on E. Then A represents V_E if its rows form a basis for V_E. If A represents V_E, we denote by A* a representative matrix of V*_E. The hybrid rank of V_E is the hybrid rank of M(A), which can be seen to be, equivalently,

min_{K ⊆ E} r(V·K) + ν(V × (E − K)).
Let Π ≡ {E_1, ..., E_k} be a partition of E. If ((V_{E,P_i})_k, V_P) is a minimal multiport decomposition of V_E, we know (by Theorem 8.4.1) that r(V_P·P_i) = |P_i| and r(V*_P·P_i) = |P_i|, i = 1, ..., k. Further, by Theorem 14.5.2, the generalized hybrid rank of V_E relative to {E_1, ..., E_k} equals that of V_P relative to {P_1, ..., P_k}. More can be said using the above mentioned theorem: if we know how to find the nearest V′_P (to V_P) which has the P_i as separators, we also know how to find the nearest V′_E (to V_E) which has the E_i as separators. So, for all practical purposes, we may pretend that we are working with V_P, {P_1, ..., P_k}, corresponding to a minimal decomposition of V_E relative to {E_1, ..., E_k}. Now if ((V_{E,P_j})_k, V_P) is a minimal multiport decomposition of V_E, we know by Theorem 8.4.1 that the columns corresponding to P_i are independent in representative matrices of both V_P as well as V*_P. Therefore, whenever convenient, we assume that each set of columns E_i is independent in A as well as in A*.

Let A be a matrix with E as its column set and let Π ≡ {E_1, ..., E_k} be a partition of E. Let C_j be the vector space spanned by the columns of E_j. We say a matrix A′ is equivalent to A relative to {C_j, j = 1, ..., k} if its columns can be partitioned into {E′_1, ..., E′_k} where the E′_j are bases of C_j, j = 1, ..., k. If V′ is a vector space on E, then d(V, V′) = d(V, V ∩ V′) + d(V ∩ V′, V′) (see Exercise 14.21). If V′ is the nearest space to V with blocks of Π as separators, it follows that no proper subspace V″ of V′ can contain V ∩ V′ as well as have blocks of Π as separators (otherwise d(V, V″) ≤ d(V, V ∩ V′) + d(V ∩ V′, V″) < d(V, V′)). Now V′ = ⊕_j(V′·E_j) ⊇ ⊕_j((V ∩ V′)·E_j) ⊇ V ∩ V′. Hence, V′ = ⊕_j((V ∩ V′)·E_j) and d(V ∩ V′, V′) is the fission rank of V ∩ V′.
The problem of finding the nearest V′ with the desired properties is therefore equivalent to finding a subspace V_1 of V for which (d(V, V_1) + fission rank of V_1) is a minimum. Let us call this number the fusion-fission number of V_1 relative to (V, Π). We then have the following simple result whose routine computational proof we omit.

Theorem 14.5.3 Let V be a vector space on E. Let Π ≡ {E_1, ..., E_k} be a partition of E. Let V_1 be a subspace of V. Then

i. the fusion-fission number of V_1 relative to (V, Π) = Σ_j r(V_1·E_j) − 2r(V_1) + r(V)

ii. the generalized hybrid rank of V relative to Π equals the minimum of the fusion-fission numbers of subspaces of V relative to (V, Π).
Exercise 14.25 Fission-fusion number: We know that d(V, V_1) = d(V, V + V_1) + d(V + V_1, V_1). One may associate with a superspace V′ of V the 'fission-fusion number' d(V, V′) + d(V′, V_1), where d(V′, V_1) is the fusion number of V′ relative to the partition Π ≡ {E_1, ..., E_k}. Show that

i. the fission-fusion number of a superspace V′ of V equals −Σ_j r(V′ × E_j) + 2r(V′) − r(V)

ii. the fission-fusion number of a superspace V′ of V relative to (V, Π) equals the fusion-fission number of the subspace (V′)^⊥ of V^⊥ relative to (V^⊥, Π)

iii. the minimum of the fission-fusion numbers of superspaces of V equals the minimum of the fusion-fission numbers of subspaces of V = generalized hybrid rank of V relative to Π.

The reader would notice that the above result is analogous to Theorem 14.4.2. Both results may be regarded as of 'row' type. (V_1 is a row subspace; the voltage space of G_{fus·Π} is a subspace of the voltage space of G.) Our ultimate aim is to prove a result analogous to Theorem 14.4.3, where hybrid rank relative to Π is shown to be equal to the minimum of hybrid ranks over 'equivalent' spaces. To do this we need a 'column' version of the hybrid rank relative to Π. We need a few definitions to proceed further.

Let A be the representative matrix of a vector space V_E. Then the row space R(A) = V_E. The column space C(A) is the span of the columns of A. We relate the subspaces of R(A) to subspaces of C(A) as follows. Let V_1 ⊆ R(A). The column annihilator with respect to A of V_1, denoted by A_c(V_1, A), is the collection of vectors of the form Ax s.t. λ^T Ax = 0 whenever λ^T A belongs to V_1. Similarly the row annihilator with respect to A of C_1 ⊆ C(A), denoted by A_r(A, C_1), is the collection of vectors of the form λ^T A s.t. λ^T Ax = 0 whenever Ax belongs to C_1. When it is clear from the context, we would omit reference to the matrix A and write A_c(V_1), A_r(C_1) in place of A_c(V_1, A), A_r(A, C_1) respectively. It is clear that A_c(V_1), A_r(C_1) are vector spaces whether or not V_1, C_1 are vector spaces. By routine linear algebra we now prove the following simple result.
Lemma 14.5.5 Let A be a representative matrix of V_E. Let E′ ⊆ E. Let V_1 be a subspace of V_E and let C_1 be a subspace of C(A). Then

i. if C_1 = A_c(V_1) then

(a) r(C_1 ∩ C(A′)) + r(V_1·E′) = r(V_E·E′), where A′ is the submatrix of A on the columns E′

(b) r(C_1) + r(V_1) = r(V_E)

ii. C_1 = A_c(V_1) iff V_1 = A_r(C_1).
Proof :

i(a) Let V_E have the representative matrix A = (A′ | A″), where A′ denotes the submatrix of A composed of all rows of A and E′ as the set of columns. Let V_1 have the representative matrix L(A′ | A″). Then C_1 is the collection of all vectors (A′ | A″)x s.t. L(A′ | A″)x = 0, and C_1 ∩ C(A′) is the collection of all vectors A′x_1 s.t. (LA′)x_1 = 0. Now the rows of LA′ span V_1·E′. The collection of all vectors orthogonal to this space is the solution space of (LA′)x_1 = 0. Let us call this space V_{x_1}. Clearly r(V_{x_1}) = |E′| − r(V_1·E′). Consider the linear transformation

A′ : V_{x_1} → C_1 ∩ C(A′),

where x_1 ∈ V_{x_1} is mapped to A′x_1. This is an onto mapping and its null space is the space of all vectors x_1 ∈ V_{x_1} s.t. A′x_1 = 0. This space is (V_E·E′)^⊥ ∩ V_{x_1}. However, (V_E·E′)^⊥ ⊆ V_{x_1}. So the null space is (V_E·E′)^⊥. Hence,

r(C_1 ∩ C(A′)) = r(V_{x_1}) − r((V_E·E′)^⊥) = |E′| − r(V_1·E′) − |E′| + r(V_E·E′) = r(V_E·E′) − r(V_1·E′).

i(b) This follows from the above by putting E′ = E.

ii. By the definition of A_c(·), A_r(·), if C_1 = A_c(V_1) it is clear that V_1 ⊆ A_r(C_1). Further C_1 ⊆ A_c(A_r(C_1)). Since C_1 = A_c(V_1), we have r(C_1) = r(V_E) − r(V_1). If V_1 ⊂ A_r(C_1), since C_1 ⊆ A_c(A_r(C_1)) we must have

r(C_1) ≤ r(A_c(A_r(C_1))) = r(V_E) − r(A_r(C_1)) < r(V_E) − r(V_1) = r(C_1),

a contradiction. So V_1 = A_r(C_1). The same contradiction results if C_1 ⊂ A_c(A_r(C_1)). So C_1 = A_c(A_r(C_1)). Now if V_1 = A_r(C_1) we have A_c(V_1) = A_c(A_r(C_1)) = C_1. □
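Part i(b) of the lemma can be verified numerically. In the sketch below the matrices A and L are arbitrary made-up examples: C_1 = A_c(V_1) is computed as the image under A of the null space of LA, and the identity r(C_1) + r(V_1) = r(V_E) is checked with exact rational arithmetic.

```python
from fractions import Fraction

def rref(M):
    # reduced row echelon form over the rationals; returns (matrix, pivot columns)
    A = [[Fraction(x) for x in row] for row in M]
    pivots, r = [], 0
    for c in range(len(A[0]) if A else 0):
        piv = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        A[r] = [x / A[r][c] for x in A[r]]
        for i in range(len(A)):
            if i != r and A[i][c] != 0:
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
    return A, pivots

def rank(M):
    return len(rref(M)[1]) if M else 0

def null_space(M):
    # basis of {x : Mx = 0}
    R, pivots = rref(M)
    n = len(M[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        x = [Fraction(0)] * n
        x[free] = Fraction(1)
        for i, p in enumerate(pivots):
            x[p] = -R[i][free]
        basis.append(x)
    return basis

def matmul(X, Y):
    Yt = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

def mat_vec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

A = [[1, 0, 1],
     [0, 1, 1]]         # rows span V_E, so r(V_E) = 2
L = [[1, 1]]             # V_1 = row space of L*A, a 1-dimensional subspace
LA = matmul(L, A)        # [[1, 1, 2]]

r_VE, r_V1 = rank(A), rank(LA)
C1 = [mat_vec(A, x) for x in null_space(LA)]  # spanning set of C_1 = A_c(V_1)
r_C1 = rank(C1)
assert r_C1 + r_V1 == r_VE   # Lemma 14.5.5 i(b)
```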
We now give an expression for the hybrid rank of V relative to Π in terms of column subspaces of a representative matrix of V. The result is analogous to Lemma 14.4.1.
Lemma 14.5.6 Let V be a vector space on E and let Π ≡ {E_1, ..., E_k} be a partition of E. Let A be a representative matrix of V and let C_j denote the space spanned by the columns E_j, j = 1, ..., k. Then the generalized hybrid rank of V relative to Π

= min_C [ Σ_j (r(C_j) − r(C ∩ C_j)) + 2r(C) − r(V) ],

where C is a subspace of C(A).

Proof : By Theorem 14.5.3, the generalized hybrid rank of V relative to Π

= min_{V′ ⊆ V} [ Σ_j r(V′·E_j) − 2r(V′) + r(V) ].

Let C′ denote A_c(V′). We know by Lemma 14.5.5 that every subspace C′ of C(A) can be written in the form A_c(V′) for some subspace V′ ⊆ V. By the same lemma,

r(V′·E_j) = r(V·E_j) − r(C′ ∩ C_j) and r(V′) = r(V) − r(C′),

so the expression to be minimized becomes

Σ_j (r(V·E_j) − r(C′ ∩ C_j)) − 2(r(V) − r(C′)) + r(V).

The desired result follows when we note that r(V·E_j) = r(C_j). □
We now present the main result of this section. We have already indicated that restricting oneself to the case where the blocks of Π are independent in a representative matrix of V_E does not entail any loss of generality.
Theorem 14.5.4 Let V be a vector space on E and let Π ≡ {E_1, ..., E_k} be a partition of E. Let A be a representative matrix of V with E denoting its set of columns. Let C_j be the span of the columns E_j for j = 1, ..., k. Then, the generalized hybrid rank of V relative to Π = min (hybrid rank of A′), where A′ ranges over matrices equivalent to A relative to {C_j, j = 1, ..., k}.
Proof : Let A′ be the matrix with column set E′, equivalent to A relative to {C_j, j = 1, ..., k}. Let M′ ≡ M(A′). Let K be the subset of columns of A′ s.t. r(M′·K) + ν(M′ × (E′ − K)) = hybrid rank of A′. Let C(K) be the vector space spanned by the columns of K. Let {E′_j, j = 1, ..., k} be the partition of the columns of A′ s.t. E′_j is a basis for C_j, j = 1, ..., k. Now

|E′ − K| = Σ_j (|E′_j| − |K ∩ E′_j|) ≥ Σ_j (r(C_j) − r(C_j ∩ C(K)))

and

r(M′·K) = r(C(K)), r(M′ × (E′ − K)) = r(V) − r(V·K) = r(V) − r(C(K)),

so that ν(M′ × (E′ − K)) = |E′ − K| − r(V) + r(C(K)). Hence, the hybrid rank of A′

≥ Σ_j (r(C_j) − r(C_j ∩ C(K))) + 2r(C(K)) − r(V) ≥ generalized hybrid rank of V relative to Π.

Next suppose Ĉ minimizes the expression

Σ_j (r(C_j) − r(C ∩ C_j)) + 2r(C) − r(V)

over all subspaces C of C(A). Let us choose E′_j so that Ĉ ∩ C_j has as basis a subset ...

iii. if λ_1 > λ_2 and V_1, V_2 minimize f_{λ_1}(·), f_{λ_2}(·) respectively then V_1 ⊇ V_2

iv. if λ_1 > λ_2 and C_1, C_2 minimize g_{λ_1}(·), g_{λ_2}(·) respectively then C_1 ⊆ C_2.
Proof :

i. We have h(V_1) + h(V_2) ≥ h(V_1 + V_2) + h(V_1 ∩ V_2). So if V_1, V_2 minimize h(·), the above inequality must reduce to an equality. Hence V_1 + V_2 and V_1 ∩ V_2 minimize h(·).
ii. This is an immediate consequence of the previous part of this theorem.
iii. By Theorem 14.5.5, we know that f_λ(·) is submodular. We have

f_{λ_1}(V_1) + f_{λ_2}(V_2) = f_{λ_1}(V_1) + f_{λ_1}(V_2) + (λ_1 − λ_2)r(V_2)
 ≥ f_{λ_1}(V_1 + V_2) + f_{λ_1}(V_1 ∩ V_2) + (λ_1 − λ_2)r(V_2)
 = f_{λ_1}(V_1 + V_2) + f_{λ_2}(V_1 ∩ V_2) + (λ_1 − λ_2)(r(V_2) − r(V_1 ∩ V_2)).

Since λ_1 > λ_2, unless r(V_2) = r(V_1 ∩ V_2), we must have LHS > f_{λ_1}(V_1 + V_2) + f_{λ_2}(V_1 ∩ V_2), a contradiction, since V_1, V_2 minimize f_{λ_1}(·), f_{λ_2}(·) respectively. We conclude that r(V_2) = r(V_1 ∩ V_2) and hence V_2 ⊆ V_1.

iv. The proof is similar to the above and is omitted. □
14.6 Solutions of Exercises
E 14.1: Let us assume that G does not have self loops or coloops, since these anyway do not figure in a minimal representation and do not affect the minimum value of r(G·X) + ν(G × (E − X)) (by including all the self loops in X and coloops in (E − X)). Let (A_t, B_t) be the representation of a forest t of G. Let T be the set of coloops of G·(E − B_t). There can be only one forest (namely, t) of G·(E − B_t) that contains A_t (by the definition of a representation). Hence, t = A_t ∪ T. Let L = E − B_t − (A_t ∪ T). In the graph G, edges of L must be spanned by A_t (i.e., there must be paths between their endpoints containing only edges of A_t) since T is the set of coloops of G·(E − B_t). Let A ≡ A_t ∪ L. Now A_t is a forest of G·A. Hence, T is a forest and B_t a coforest of G × (E − A). Hence,

|A_t ∪ B_t| = r(G·A) + ν(G × (E − A)).

On the other hand, given any set A ⊆ E, (A_t, B_t) is a representation of a forest if A_t is a forest of G·A and B_t a coforest of G × (E − A). For, there is only one forest T of G × (E − A) that does not intersect B_t, and A_t ∪ T is a forest of G. Thus, the problem of finding a minimum representation is equivalent to finding a partition {A, E − A} of E s.t. A minimizes r(G·X) + ν(G × (E − X)).
E 14.2: We have, A ⊆ E minimizes r(G·X) + ν(G × (E − X)) iff it minimizes r(G·X) + ν(G × (E − X)) + r(G), i.e., iff it minimizes r(G·X) + ν(G × (E − X)) + r(G·X) + r(G × (E − X)), i.e., iff it minimizes 2r(G·X) + |E − X|.
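This equivalence is easy to confirm exhaustively on a small made-up graph: since ν(G × (E − X)) = |E − X| − (r(G) − r(G·X)), the two objective functions differ by the constant r(G), so their families of minimizers coincide. A minimal sketch (the example graph is an assumption, not from the text):

```python
from itertools import combinations

edges = [(0, 1), (1, 2), (0, 2), (2, 3)]      # a triangle plus a pendant edge
n_vertices = 4
E = list(range(len(edges)))

def r(X):
    # rank of G.X: size of a spanning forest of the subgraph on edge set X
    parent = list(range(n_vertices))
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    count = 0
    for i in X:
        a, b = find(edges[i][0]), find(edges[i][1])
        if a != b:
            parent[a] = b
            count += 1
    return count

rG = r(E)

def nu_contract(X):
    # nullity of G x (E - X) = |E - X| - (r(G) - r(G.X))
    return (len(E) - len(X)) - (rG - r(X))

subsets = [frozenset(c) for k in range(len(E) + 1) for c in combinations(E, k)]
f1 = {X: r(X) + nu_contract(X) for X in subsets}
f2 = {X: 2 * r(X) + (len(E) - len(X)) for X in subsets}
minimizers_1 = {X for X in subsets if f1[X] == min(f1.values())}
minimizers_2 = {X for X in subsets if f2[X] == min(f2.values())}
assert minimizers_1 == minimizers_2           # the two criteria agree
```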
E 14.3: Let t_1, t_2 be two forests of G and let A ⊆ E. Then t_2 − t_1 is a subforest as well as a subcoforest of G. Hence, (t_2 − t_1) ∩ A is a subforest of G·A and (t_2 − t_1) ∩ (E − A) is a subcoforest of G × (E − A). Hence,

|t_2 − t_1| ≤ r(G·A) + ν(G × (E − A)).

Thus, it is clear that if t_1, t_2 are two forests of G and A ⊆ E s.t. the above inequality becomes an equality, then t_1, t_2 must be maximally distant and A must minimize the expression

r(G·X) + ν(G × (E − X)).

Next let t_1, t_2 be maximally distant and let A be as in Lemma 14.2.1 (i(a) & i(b)). We have t_2 − t_1 = ((t_2 − t_1) ∩ A) ∪ ((t_2 − t_1) ∩ (E − A)). Now, t_2 ∩ A = ((t_2 − t_1) ∩ A) is a forest of G·A. Hence, t_2 ∩ (E − A) is a forest of G × (E − A). Similarly t_1 ∩ (E − A) is also a forest of G × (E − A). Further, t_1 ∪ t_2 ⊇ (E − A). Hence, (t_2 − t_1) ∩ (E − A) is a coforest of G × (E − A). We thus see that |t_2 − t_1| = r(G·A) + ν(G × (E − A)).
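The equality just derived, namely that the maximum of |t_2 − t_1| over pairs of maximally distant forests equals the minimum over A of r(G·A) + ν(G × (E − A)), can be verified by exhaustive search. A hedged sketch on a made-up five-edge graph:

```python
from itertools import combinations

edges = [(0, 1), (1, 2), (0, 2), (0, 3), (1, 3)]
n_vertices = 4
E = list(range(len(edges)))

def r(X):
    # rank of the subgraph on edge set X, via union-find
    parent = list(range(n_vertices))
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    count = 0
    for i in X:
        a, b = find(edges[i][0]), find(edges[i][1])
        if a != b:
            parent[a] = b
            count += 1
    return count

rG = r(E)
# all spanning forests (bases of the graphic matroid)
forests = [set(c) for c in combinations(E, rG) if r(c) == rG]
max_dist = max(len(t2 - t1) for t1 in forests for t2 in forests)

def expr(A):
    # r(G.A) + nu(G x (E - A))
    return r(A) + (len(E) - len(A)) - (rG - r(A))

min_expr = min(expr(set(c)) for k in range(len(E) + 1)
               for c in combinations(E, k))
assert max_dist == min_expr
```

For this graph two edge-disjoint spanning trees would need six edges, so the maximum distance is 2, and the minimum of the expression (attained at A = ∅) is 2 as well.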
E 14.4: (Original solution due to Lehman, Edmonds [Edmonds65b]. Version in terms of principal partition due to Bruno and Weinberg [Bruno+Weinberg71]).

Notation: By X_min (X_max) of G′ we mean the unique minimal (maximal) set that minimizes 2r(X) + |E(G′) − X|, where r(·) is the rank function of G′, i.e., r(T) = r(G′·T). By Y_min (Y_max) of G′ we mean the unique minimal (maximal) set that minimizes 2ν(X) + |E(G′) − X|, where ν(·) is the nullity function of G′, i.e., ν(T) = ν(G′ × T). All the fundamental circuits are according to the graph G. Lemma 14.2.1 is used repeatedly.

Case 1: Let e_M ∈ X_min of G. Then there exist two maximally distant forests t_1, t_2 of G such that e_M ∉ t_1 ∪ t_2 and t_1 ∩ X_min, t_2 ∩ X_min are disjoint forests of G·X_min. If the short player plays first, he contracts a branch e′ ∈ L(e_M, t_1). Let e ∈ L(e′, t_2). Edges e′, e belong to X_min of G and t_1 − e′, t_2 − e are maximally distant forests of G × (E − e′) ((t_1 − e′) ∪ (t_2 − e) ⊇ E − X_min and t_1 − e′, t_2 − e have the maximum possible intersection, among all forests of G × (E − e′), with X_min − e′). Then e_M ∈ X_min of G × (E − e′) since e_M ∉ (t_1 ∪ t_2 − e − e′).

Let the cut player play first. We consider all the alternative situations. In each case we show that either after the cut player's first move, or after the short player has responded, e_M belongs to the X_min of a reduced graph. Continuing this procedure we would finally reach a graph in which there are three parallel edges, one of which is e_M. For this graph it is clear that the short player would always win whether he plays first or second. Let e be the edge that is being deleted.

i. If e ∉ t_1 ∪ t_2, then t_1, t_2 continue to be maximally distant forests of G·(E − e). Hence, e_M belongs to E − (t_1 ∪ t_2) and therefore to X_min of G·(E − e).
ii. Let e not belong to X_min of G. Then e ∈ t_1 ∪ t_2 − X_min. Let, without loss of generality, e ∈ t_2 − X_min. Extend t_2 − e to a new forest t′_2 by adding a suitable edge of t_1. This latter edge cannot belong to X_min since edges of t_1 ∩ X_min are spanned by edges of t_2 ∩ X_min. Now t_1, t′_2 would be maximally distant forests in the graph G·(E − e) (t_1 ∪ t′_2 ⊇ E − e − X_min and t_1, t′_2 have the maximum possible intersection, among all forests of G·(E − e), with X_min) and e_M ∈ E − e − (t_1 ∪ t′_2). Thus, e_M belongs to X_min of G·(E − e).

iii. Let e ∈ (t_1 ∪ t_2) ∩ X_min. Now we pick maximally distant forests t′_1, t′_2 of G s.t. e ∉ t′_1 ∪ t′_2. By Lemma 14.2.1 this is possible since e ∈ X_min of G. If e_M ∉ t′_1 ∪ t′_2, it is clear that e_M ∈ X_min of G·(E − e). So let us assume that e_M ∈ t′_1 ∪ t′_2. Suppose e_M ∈ t′_1. Then (t′_1 − e_M) can be extended to a forest of G·(E − e) using an edge e′ ∈ t′_2 s.t. e′ ∈ L(e_M, t′_2). Now t′_1, t′_2 intersect X_min in forests of G·X_min. Hence, e′ ∈ X_min. If now we short e′ (i.e., if this is the short player's move), t′_1 − e_M and t′_2 − e′ become maximally distant forests of G′ ≡ G·(E − e) × (E − e − e′) ((t′_1 − e_M) ∪ (t′_2 − e′) ⊇ E − e − e′ − X_min and further (t′_1 − e_M), (t′_2 − e′) have the maximum possible intersection with X_min − e − e′ in G′). Thus, in G′, e_M lies outside the maximally distant pair of forests t′_1 − e_M, t′_2 − e′. Hence, e_M belongs to X_min of G′. This completes Case 1.
Case 2: Let e_M ∈ Y_min. We use arguments dual to those used in the previous case to show that the cut player can always win playing first or second. In particular this means that in the argument we replace the rank of G·T by the nullity of G × T, forests by coforests, X_min by Y_min, deletion (contraction) by contraction (deletion), and fundamental circuit with respect to a forest by fundamental cutset with respect to a coforest. The final graph that we reach in this case will have three series edges with e_M one of them.

Case 3: Let e_M ∈ X_max − X_min = Y_max − Y_min. In this case the one who plays first would win. Observe that if t_1, t_2 are maximally distant forests of G then G·X_max has t_1 ∩ X_max, t_2 ∩ X_max as disjoint forests. Also (E − (t_1 ∪ t_2)) ⊆ X_min. Hence, in this case e_M ∈ t_1 ∪ t_2. Let e_M ∈ t_1 and let the short player play first. Let e ∈ t_2 ∩ X_max s.t. e_M ∈ L(e, t_1). Let e be contracted by the short player. It is then clear that t_2 − e, t_1 − e_M are maximally distant forests of G × (E − e). Now we can use the arguments of Case 1 to show that the short player must win, treating the cut player's next move as the 'first move'. The situation where the cut player plays first can be handled by arguments dual to the above.
E 14.5: We first show (in the following lemma) that the term rank of a given cobase matrix cannot be less than the rank of any cobase matrix.

Lemma 14.6.1 Let Q, Q′ be row equivalent matrices with columns partitioned into the sets T_1, T_2, T_3, T_4 and with block entries Q_{ij}, Q′_{ij}. Then the rank of the submatrix of Q′ on the columns T_2 ∪ T_4 cannot exceed the term rank of the submatrix of Q on the same columns.
Proof : Observe that the second set of rows of Q′ is obtained by linear combination of the second set of rows of Q. Hence, Q_{23} is a nonsingular matrix. From this fact it can be inferred that the term rank of

[ Q_{22}  Q_{24} ]

must be less than or equal to the term rank of

[ Q_{22}  Q_{23}  Q_{24} ].

But the rank of

[ Q′_{22}  Q′_{24} ]

is the same as the rank of

[ Q_{22}  Q_{24} ],

which must be less than or equal to its term rank. The result follows. □
Next we display a cobase matrix whose term rank equals its rank. This cobase matrix is constructed according to the algorithm given in the brief solution earlier. So the following lemma justifies the algorithm and has the consequence that the maximum rank = minimum term rank.
Lemma 14.6.2 Let Q be the matrix shown below with set of columns S,

      b_1      b_2 − b_1    c
Q = [ Q_{11}   Q_{12}    Q_{13} ],

where the column sets b_1, b_2 are maximally distant bases of the column set of Q and c = S − (b_1 ∪ b_2). Let

       b_1   b_2 − b_1   c
Q′ = [  I       P        R ]

be row equivalent to Q. Then the matrix [ P  R ] has term rank = rank.
Proof : By Lemma 14.2.1 there exists a set A of columns s.t. A ⊇ c and A ∩ b_1, A ∩ b_2 are disjoint and span all of A. Hence, perhaps after rearranging rows, Q′ would have the form shown below (where the columns correspond from left to right respectively to b_1 ∩ A, b_1 − A, b_2 ∩ A, b_2 − b_1 − A, S − (b_1 ∪ b_2)):

[ I   0   Q′_{13}   Q′_{14}   Q′_{15} ]
[ 0   I   0         Q′_{24}   0       ],

with Q′_{13} being a nonsingular matrix. Clearly the term rank of the matrix [ Q′_{13}  Q′_{14}  Q′_{15} ] equals |b_2 ∩ A| while that of Q′_{24} does not exceed |b_2 − b_1 − A|. Thus the matrix

[ Q′_{13}   Q′_{14}   Q′_{15} ]
[ 0         Q′_{24}   0       ],

whose columns correspond from left to right to b_2 ∩ A, b_2 − b_1 − A, c respectively, has term rank not exceeding |b_2 − b_1|. But this matrix has the columns corresponding to b_2 − b_1 linearly independent and therefore has rank equal to |b_2 − b_1|. Since the rank of any matrix cannot exceed its term rank, this proves the result. □
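The term rank of a matrix is the maximum number of nonzero entries no two of which share a row or a column, i.e., the size of a maximum matching in the bipartite graph of nonzero positions; the inequality rank ≤ term rank used above is thus easy to check computationally. A sketch (the example matrix is an illustrative assumption):

```python
from fractions import Fraction

def term_rank(M):
    # maximum bipartite matching between rows and columns on nonzero entries
    nrows, ncols = len(M), len(M[0])
    match_col = [-1] * ncols            # column -> matched row

    def augment(row, seen):
        for c in range(ncols):
            if M[row][c] != 0 and c not in seen:
                seen.add(c)
                if match_col[c] == -1 or augment(match_col[c], seen):
                    match_col[c] = row
                    return True
        return False

    return sum(augment(row, set()) for row in range(nrows))

def rank(M):
    # Gaussian elimination over the rationals
    A = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(A[0])):
        piv = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        for i in range(len(A)):
            if i != r and A[i][c] != 0:
                f = A[i][c] / A[r][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

M = [[1, 1, 0],
     [1, 1, 0],
     [0, 0, 1]]
assert rank(M) <= term_rank(M)            # always true
assert (rank(M), term_rank(M)) == (2, 3)  # and strictly smaller here
```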
E 14.6:

i. Immediate from the definition.

ii. We have

f(X) + f*(S − X) = f(X) + g(S − X) − f(S) + f(X) = 2f(X) − g(X) + g(S) − f(S).

The result follows.

iii. We show the result for contraction. Since restriction of f(·) corresponds to contraction of f*(·), by (i) above the result would be true also for restriction. Since every minor is a restriction followed by a contraction, the result would follow for minors. Let f_1(·) ≡ f ∘ T(·). We then have, for X ⊆ T,

f_1(X) + f*_1(T − X) = f(X ∪ (S − T)) − f(S − T) + f*(T − X)
 ≤ f(X) + f(S − T) − f(S − T) + f*(T − X)
 ≤ f(X) + f*(S − X)

(where we have used the facts that f*_1(T − X) = f*(T − X) and that f*(·) is an increasing function). So min of LHS ≤ min of RHS.
E 14.7: We prove only the statement about the dual. We have

f*(T) = g(T) − f(S) + f(S − T) = Σ_{e∈T} f(e) − r(S) + r(S − T)
 = |T| − r(S) + r(S − T)
 = r*(T),

where r*(·) is the rank function of M*.
E 14.8:

i. Let r′(·) be the rank function of M·(S̃ − a). If e ∉ X, it is clear that f(X) = r′(∪_{e_i ∈ X} ẽ_i). If e ∈ X, r′((∪_{e_i ∈ X} ẽ_i) − a) = r(∪_{e_i ∈ X} ẽ_i) = f(X), since a is dependent on ẽ − a in the matroid M. This proves the result.
ii. Repeating the operation in the previous part, we can destroy all circuits of M contained in each ẽ, retaining the property of being an expansion of f(·) for the resulting matroid. So there is no loss of generality in assuming that M·ẽ contains no circuit for each e ∈ S. But then |ẽ| = r(M·ẽ) = f(e). Hence, |S̃| = Σ_{e∈S} f(e).

E 14.9:

i. Suppose Y maximizes g(X) − f(X), X ⊆ S, and e ∉ Y. Then

...

Thus, g(Y ∪ e) − f(Y ∪ e) > g(Y) − f(Y), a contradiction.
ii. We need only verify that (f − h)(·) is an integral polymatroid rank function (the other part being valid for any weight function h(·)). We have (f − h)(e) = f(e) − h(e) = f(e) − (f(S) − f(S − e)). So (f − h)(e) ≥ 0, since f(·) is a polymatroid rank function (by submodularity, f(e) + f(S − e) ≥ f(S) + f(∅) = f(S)). Now h(·) is a weight function and, therefore, (f − h)(·) is submodular. Further, (f − h)(∅) = f(∅) = 0. We conclude that (f − h)(·) is a polymatroid rank function. The integrality is obvious.
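The verification above can be replayed by brute force on a small example. In this sketch (ours, not the book's) f is an illustrative coverage function, which is submodular with f(∅) = 0, and h(e) = f(S) − f(S − e):

```python
# Sketch (ours, not from the book): check that (f - h)(.), with
# h(e) = f(S) - f(S - e), is again an integral polymatroid rank function.
from itertools import chain, combinations

S = frozenset(range(3))
COVER = {0: {'a', 'b'}, 1: {'b', 'c'}, 2: {'c', 'd'}}   # illustrative choice

def f(X):
    # coverage function: submodular, increasing, f(empty) = 0
    return len(set().union(*(COVER[e] for e in X))) if X else 0

def h(X):
    # the weight (modular) function induced by f
    return sum(f(S) - f(S - {e}) for e in X)

def fh(X):
    return f(X) - h(X)

subsets = [frozenset(c) for c in
           chain.from_iterable(combinations(S, k) for k in range(len(S) + 1))]

assert fh(frozenset()) == 0
assert all(fh(frozenset({e})) >= 0 for e in S)
# submodularity: fh(X) + fh(Y) >= fh(X | Y) + fh(X & Y) for all X, Y
assert all(fh(X) + fh(Y) >= fh(X | Y) + fh(X & Y)
           for X in subsets for Y in subsets)
```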
iii. Let r′(·) denote the rank function of M_red. For each e ∈ S, let e′ denote ê − (b_e ∩ ê) and let X′ ≡ {e′ : e ∈ X}, X ⊆ S. We need to show that
$$(f - h)(X) = r'\Big(\bigcup_{e' \in X'} e'\Big) \qquad \forall X \subseteq S. \qquad (*)$$
We need the following preliminary lemma:

Lemma 14.6.3 Let $b_{S-X}$ be a base of $M \cdot (\bigcup_{e \in S-X} \hat{e})$. Then $(\bigcup_{e \in X}(b_e \cap \hat{e})) \cup b_{S-X}$ is independent.

Proof: Suppose the set is dependent. Then there is a minimal subset Y of X s.t. $(\bigcup_{e \in Y}(b_e \cap \hat{e})) \cup b_{S-X}$ is dependent. Let e₁ ∈ Y. Let $K \equiv (\bigcup_{e \in Y - e_1}(b_e \cap \hat{e})) \cup b_{S-X}$. Grow K into a base b₂ of M · (Ŝ − ê₁). Now $b_{e_1} \cap \hat{e}_1$ is a minimal intersection of a base of M with ê₁ and, therefore, is a base of M × ê₁. Hence, $b_2 \cup (b_{e_1} \cap \hat{e}_1)$ is a base of M. But this contradicts the fact that a subset of this set, namely $K \cup (b_{e_1} \cap \hat{e}_1)$, is dependent. □
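The step "grow K into a base" is the usual greedy extension of an independent set under an independence oracle. A minimal sketch (ours; the oracle below is the illustrative uniform matroid $U_{2,4}$, not the matroid of the lemma):

```python
# Sketch (ours): grow an independent set K into a base, given an
# independence oracle, as in the proof step "Grow K into a base b2".
def grow_to_base(ground, independent, K):
    base = set(K)
    assert independent(base)
    for x in ground:
        if x not in base and independent(base | {x}):
            base.add(x)          # greedy extension keeps independence
    return base

# illustrative oracle: uniform matroid U_{2,4} (independent iff size <= 2)
ground = range(4)
indep = lambda X: len(X) <= 2
b2 = grow_to_base(ground, indep, {1})
assert len(b2) == 2 and 1 in b2
```

Matroid exchange guarantees that the greedy sweep always stops at a set of full rank, whatever order `ground` is scanned in.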
Now LHS of (*) equals
$$f(X) - \sum_{e \in X} \big(f(S) - f(S - e)\big),$$
while RHS equals (by the definition of contraction),
$$r\Big(\big(\bigcup_{e \in X} e'\big) \cup B\Big) - r(B), \qquad \text{where } B \equiv \bigcup_{e \in S} (b_e \cap \hat{e}).$$
E 14.10: i. Routine.
ii. By the polymatroid intersection theorem (Theorem 10.2.3),
$$\max_{x \in P_f \cap P_{f^*}} x(S) = \min_{X \subseteq S} \big(f(X) + f^*(S - X)\big)$$
(the equality being satisfied with integral x if f(·) is integral). Now x is an integral independent vector of $P_f$ ($P_{f^*}$) iff there exists an independent set T₁ of M (independent set T₂ of M*) s.t. $x(e) = \mid T_1 \cap \hat{e} \mid$ ($x(e) = \mid T_2 \cap \hat{e} \mid$), e ∈ S (by the last part of Theorem 14.3.1). This proves the min-max equality. Now if T is a common independent set of M, M*, the vector y defined by $y(e) \equiv \mid T \cap \hat{e} \mid$ is a vector in $P_f \cap P_{f^*}$. Hence, $y(S) \leq \max_{x \in P_f \cap P_{f^*}} x(S)$, as required.
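The min-max equality of Theorem 10.2.3 can be confirmed by exhaustive search on a tiny ground set. In this sketch (ours, not the book's) f and g are small illustrative integral polymatroid rank functions, and the polytope $P_f$ is represented directly by its defining inequalities $x(A) \leq f(A)$:

```python
# Sketch (ours, not from the book): brute-force check of
#   max { x(S) : x in P_f and x in P_g } = min_{X <= S} f(X) + g(S - X)
# for two small integral polymatroid rank functions f and g.
from itertools import chain, combinations, product

S = frozenset(range(3))
subsets = [frozenset(c) for c in
           chain.from_iterable(combinations(S, k) for k in range(len(S) + 1))]

def f(X):
    return min(len(X), 2)            # rank function of U_{2,3}

def g(X):
    cover = {0: {'a'}, 1: {'a', 'b'}, 2: {'b'}}
    return len(set().union(*(cover[e] for e in X))) if X else 0

def in_polytope(x, rank):
    # x is a dict e -> value; membership means x(A) <= rank(A) for all A
    return all(sum(x[e] for e in A) <= rank(A) for A in subsets)

best = max(sum(v) for v in product(range(3), repeat=3)
           if in_polytope(dict(enumerate(v)), f)
           and in_polytope(dict(enumerate(v)), g))
bound = min(f(X) + g(S - X) for X in subsets)
assert best == bound                 # both sides equal 2 here
```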
Let M₁ be the expansion of f(·) described on page 386. This matroid is obtained as follows: replace each e by f(e) (= g(e)) parallel copies, making up the set ê. Let f(·) denote the new polymatroid rank function. The rank function of M₁ is given by $r_1(\cdot) \equiv (f * \mid \cdot \mid)(\cdot)$. Now the minimal set that minimizes f(X) + f*(S − X), X ⊆ S, can be seen to be the minimal set that maximizes g(X) − 2f(X), X ⊆ S. Let H(Y) ≡ set of all elements in Ŝ parallel to elements in Y, Y ⊆ S. Now by symmetry arguments one can show (Lemma 11.4.1) that Y maximizes g(X) − 2f(X), X ⊆ S, iff H(Y) maximizes $\mid Z \mid - 2r_1(Z)$, Z ⊆ Ŝ, and further, if Y is the minimal set which maximizes g(X) − 2f(X), H(Y) is the minimal set maximizing the corresponding
expression. By Lemma 11.4.1, it would also follow that
$$\max_{X \subseteq S}\, \big(g(X) - 2f(X)\big) = \max_{Z \subseteq \hat{S}}\, \big(\mid Z \mid - 2r_1(Z)\big).$$
Let $b_V$ be a base of $M_1 \vee M_1$. If $b_V = b \vee b'$ where b, b′ are bases of M₁, we must have $b_V - b$ as a common independent set of M₁ and M₁*. Now if $\hat{Y} \equiv H(Y)$ is the minimal set that maximizes $\mid Z \mid - 2r_1(Z)$, Z ⊆ Ŝ, then $(b_V - b) \cap \hat{Y}$ would be a base of M₁ · Ŷ (Lemma 11.3.3). If we use a 'matroid translation' of Lemma 14.2.1, we can show that $(b_V - b) \cap (\hat{S} - \hat{Y})$ would be a base of M₁* · (Ŝ − Ŷ). Thus
$$\mid b_V - b \mid = r_1(\hat{Y}) + r_1^*(\hat{S} - \hat{Y}) = f(Y) + f^*(S - Y).$$
This proves the equality we required.

E 14.11: Order the elements of S as (e₁, …, eₙ). Start from any base b₀ of M. Let $g_{0i} \equiv \mid b_0 \cap \hat{e}_i \mid$, i = 1, …, n. Let e₁, …, eₖ be the 'deficient' elements, for which $g_{0i} < g(e_i)$. For each element in ê₁ − b₀, …, êₖ − b₀, construct f-circuits relative to b₀. Suppose the f-circuits contain elements of $\hat{e}_{11}, \ldots, \hat{e}_{1k_1}$. For each of the elements in $\hat{e}_{11} - b_0, \ldots, \hat{e}_{1k_1} - b_0$, construct f-circuits relative to b₀. Repeat this procedure until you reach elements of a set êₛ for which $g_{0s} > g(e_s)$ (a 'saturated' element). We now have a 'path' (listing only vertices), say $a_1, a_2', a_3, \ldots, a_j', a_{j+1}, \ldots, a_{t-1}, a_t'$, where the unprimed elements are outside b₀ and the primed ones inside it, $a_j', a_{j+1}$ belong to the same êᵣ, and further $a_{j+1}' \in L(a_j, b_0)$ while $a_p' \notin L(a_j, b_0)$ whenever p > j + 1. Now we push the unprimed elements of the path into b₀ and drop the primed ones. If a₁ ∈ êᵢ and $a_t'$ ∈ êₛ, say, then the resulting base has one more element of êᵢ and one less element of êₛ than b₀ has. Of all other sets êₖ the updated base has the same number of elements as before. Repeating this procedure, we either find a base which has, for each e, a number of elements in ê not less than g(e), or a base $b_f$ from whose deficient elements it is not possible to reach saturated elements. The set of all elements which can be reached from deficient elements gives the minimal set that maximizes g(X) − f(X), X ⊆ S.

E 14.12: i. This follows from the fact that M(G × T) = M(G) × T and Exercise 14.9 (third part).
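The f-circuits used in the procedure of E 14.11 are fundamental circuits relative to the current base. A minimal sketch (ours, not the book's) that computes one from an independence oracle, here for an illustrative graphic matroid:

```python
# Sketch (ours): the fundamental circuit ("f-circuit") of an element a
# relative to a base, from an independence oracle; the matroid here is the
# cycle matroid of a small graph, an illustrative choice.
class GraphicMatroid:
    """Elements are edge ids; a set is independent iff its edges form a forest."""
    def __init__(self, edges):
        self.edges = dict(edges)                 # id -> (u, v)

    def independent(self, subset):
        parent = {}
        def find(x):
            while parent.get(x, x) != x:
                x = parent[x]
            return x
        for eid in subset:
            u, v = self.edges[eid]
            ru, rv = find(u), find(v)
            if ru == rv:
                return False                     # this edge closes a cycle
            parent[ru] = rv
        return True

def f_circuit(M, base, a):
    """Unique circuit of base + a (empty if base + a stays independent):
    a together with every x in base whose removal restores independence."""
    if M.independent(base | {a}):
        return set()
    return {a} | {x for x in base if M.independent((base - {x}) | {a})}

# edges 0, 1, 2 form a triangle on nodes 1, 2, 3; edge 3 hangs off node 3
M = GraphicMatroid({0: (1, 2), 1: (2, 3), 2: (1, 3), 3: (3, 4)})
assert f_circuit(M, {0, 1, 3}, 2) == {0, 1, 2}
```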
ii. Since $\mid \hat{e} \mid = r(\hat{e})$, the graph G · ê, if connected, must be a tree graph. Now in G_red, it must be true that $r'(S_{red}) = r'(S_{red} - (\hat{e} - b_e \cap \hat{e}))$, where r′(·) is the rank function of G_red, $S_{red} \equiv \hat{S} - \bigcup_{e \in S}(\hat{e} \cap b_e)$ (by the exercise referred to above). This means that ê − (b_e ∩ ê) cannot contain a cutset of G_red. In particular, there can be no node in G_red to which only edges of ê are incident.
iii. Let e′ = ê − b_e ∩ ê and let e″ be the tree on the same set of nodes as e′. Let G_big contain both e′ and e″ for each e, while G″_red contains only e″ for each e. It is clear that $G_{big} \cdot (\bigcup_{e \in S} e') = G_{red}$ and $G_{big} \cdot (\bigcup_{e \in S} e'') = G''_{red}$. Let $r_b(\cdot)$ be the rank function of G_big and let r″(·) be the rank function of G″_red. Then it is clear that
$$r'\Big(\bigcup_{e \in X} e'\Big) = r_b\Big(\bigcup_{e \in X} (e' \cup e'')\Big) = r''\Big(\bigcup_{e \in X} e''\Big) \qquad \forall X \subseteq S.$$
This proves the required result.
E 14.13: We assume f ( X ) = (I V 1 -l)t(X) in the procedure. In particular, f ( e ) = (I I/’ 1 - l ) ( e ) . This latter is valid only if the subgraph of 5:on 6 is connected. E 14.14: This procedure identical to the one described for finding the minimal set that minimizes (I V I -l)t(X) + g(S - X ) VX E S. We note that 1 V 1 (.) is submodular and replacing 1 by G does not change the problem. E 14.15: i. The circuits of M are minimal sets s.t. (f - k ) t ( X )