COMBINATORIAL AND COMPUTATIONAL MATHEMATICS
Present and Future
Pohang, The Republic of Korea
15-17 February 2000
Editors
Sungpyo Hong Pohang University of Science and Technology, Korea
Jin Ho Kwak Pohang University of Science and Technology, Korea
Ki Hang Kim Alabama State University, USA
Fred W. Roush Alabama State University, USA
World Scientific
Singapore • New Jersey • London • Hong Kong
Published by
World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 912805
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
COMBINATORIAL AND COMPUTATIONAL MATHEMATICS: PRESENT AND FUTURE Copyright © 2001 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-02-4678-1
Printed in Singapore by World Scientific Printers
FOREWORD
The Combinatorial and Computational Mathematics Center (Com²MaC) of Pohang University of Science and Technology (POSTECH), Pohang, Korea, held a workshop entitled "Combinatorial and Computational Mathematics: Present and Future" during February 15-17, 2000. The invited speakers were chosen from various countries as being among the world leaders in their fields. They provided a comprehensive survey of their respective areas and presented current important open problems. I deeply appreciate their participation in this history-making first major workshop for the Center.

Henri Faure of the University of Marseilles spoke on Monte-Carlo and related methods for numerical integration. This deals with the difficult case of numerical integration with a large number of different variables, where the usual formulas of calculus are completely inadequate. The Monte-Carlo method in its simplest aspect estimates the volume of an n-dimensional set S by enclosing it in a box, choosing points at random in the box, and counting the number which lie in S. The proportion of the points which lie in S is then multiplied by the volume of the box.

Peter Fishburn, winner of the 1996 von Neumann prize, gave an overview of normative approaches to the development of formal theories of preference, judgment and choice within the theory of decision making. It begins with basic definitions of choice functions and preference relations, continues with the theory of ordinal preferences and comparable preference differences, then describes aspects of multiattribute utility theory and preferences over time streams. These are followed by discussions of choice functions, social choice theory, and the ranking of subsets in comparative probability assessment. The paper concludes with a review of alternative theories of choice for decisions under risk and uncertainty, including expected utility theories and generalizations.

Ki Hang Kim, in joint work with Fred W. Roush, of Alabama State University gave a condensed introduction to mathematical social science from a combinatorial viewpoint for those new to this area, illustrating it with a combinatorial treatment of Arrow's Impossibility Theorem.

Joseph Kung of the University of North Texas spoke on the various approaches to matroid theory in combinatorics, its applications and its open problems, including the basic idea that it is "linear independence without linear algebra". Going from the axioms, he discussed in particular varieties of matroids and matroids labeled by group elements.

Jin Ho Kwak, in joint work with Jaeun Lee, examined several enumeration problems for various types of nonisomorphic graph coverings of a graph and some of their applications to group theory or to surface theory. The paper contains some enumeration of subgroups of surface groups and distribution problems of surface branched coverings.

George Markowsky of the University of Maine spoke on lattice theory, including various special kinds of lattices and duality. He pointed out that lattices have applications to genetics.

David Pointcheval of Ecole Normale Superieure spoke on theoretical cryptology, giving a short history of cryptography leading into public key cryptography.

Fred Roberts of Rutgers University and DIMACS discussed several crucial concepts in graph theory with important applications, including graph coloring, competition graphs, and intersection graphs, and the currently open problems in each area.

F. W. Roush, in joint work with K. H. Kim and Susan Williams, dealt with ordered cohomology of shifts of finite type. Shifts of finite type are a kind of dynamical system with a particularly combinatorial nature, since they can be presented by means of graphs or (0,1)-matrices. Ordered cohomology is a new and important invariant which can also be seen as a kind of K-theory.

Mike Waterman of the University of Southern California spoke on a new computational method of identifying DNA segments mathematically. Efficient mathematical methods for DNA analysis are critical to the new fields related to genetic engineering.

I also truly thank the Korean Science and Engineering Foundation and the Ministry of Science and Technology of Korea for their financial support of this successful workshop. Lastly, I would like to thank Prof. K. H. Kim and Prof. F. W. Roush of Alabama State University and Prof. Sungpyo Hong of POSTECH for successfully organizing the workshop and publishing this collection of papers.

Jin Ho Kwak, Director
The Combinatorial and Computational Mathematics Center (Com²MaC)
POSTECH
Pohang, Korea
CONTENTS
Foreword
Monte-Carlo and Quasi-Monte-Carlo Methods for Numerical Integration (Henri Faure)
Theoretical Approaches to Judgment and Choice (Peter Fishburn)
Combinatorial Aspects of Mathematical Social Science (K. H. Kim and F. W. Roush)
Twelve Views of Matroid Theory (Joseph P. S. Kung)
Enumeration of Graph Coverings, Surface Branched Coverings and Related Group Theory (Jin Ho Kwak and Jaeun Lee)
An Overview of the Poset of Irreducibles (George Markowsky)
Number Theory and Public-Key Cryptography (David Pointcheval)
Some Applications of Graph Theory (Fred Roberts)
Duality and its Consequences for Ordered Cohomology of Finite Type Subshifts (K. H. Kim, F. W. Roush and Susan G. Williams)
Simple Maximum Likelihood Methods for the Optical Mapping Problem (Vlado Dancik and Michael S. Waterman)
MONTE-CARLO AND QUASI-MONTE-CARLO METHODS FOR NUMERICAL INTEGRATION
HENRI FAURE
Institut de Mathematiques de Luminy, U.P.R. 9016 CNRS
163 avenue de Luminy, case 907, F-13288 Marseille Cedex 09, France
1
INTRODUCTION
We consider the problem of numerical integration in dimension $s$, with possibly large $s$. As the dimension increases, the usual rules need a huge number of nodes to obtain some accuracy, say an error bound less than $10^{-2}$; this phenomenon is called "the curse of dimensionality". To overcome it, two kinds of methods have been developed: the so-called Monte-Carlo and Quasi-Monte-Carlo methods. Very good and up-to-date monographs on the subject exist; our purpose in the present survey is only to present the basic constructions in the two approaches, with special attention to the second one, which performs better for numerical integration and sets the trend with randomized hybridizations. To avoid technicalities, we restrict ourselves to the integration domain $I^s = [0,1]^s$ and to the more commonly used sets of nodes which are already implemented in computer routines; we also leave out the so-called Lattice methods to keep this proceedings paper to an appropriate length.

The interested reader who would like to go further in the subject should consult the monographs of H. Niederreiter [8], S. Tezuka [13], M. Drmota and R. Tichy [2], B. L. Fox [3], J. Matousek [7] (in chronological order) and also of J. E. Gentle [4] for MC methods and I. H. Sloan and S. Joe [12] for Lattice methods. The collective Springer Lecture Notes in Statistics 138 [5] and the proceedings of the three MCQMC Conferences published by Springer (Lecture Notes in Statistics 106 [10], 127 [9], and the book published this year [11]) present the main contributions of the last five years. Good references on the Number Theory background for Irregularities of Distribution are the books of J. Beck and W. W. L. Chen [1] and L. Kuipers and H. Niederreiter [6].

The goal is to obtain a numerical estimate for
$$\int_{I^s} f(x)\,dx,$$
with the integrand $f$ in $L^1(I^s)$, for various purposes such as computational Physics, numerical solutions of integro-differential equations (Boltzmann, particle transport), numerical Probabilities, Finance. The approximation is obtained by means of
$$\frac{1}{N}\sum_{n=0}^{N-1} f(X_n)$$
with a well chosen point set $P_N = \{X_0, X_1, \ldots, X_{N-1}\}$; the choice of the point set $P_N$ determines the name of the method.

In Monte-Carlo methods, the points are chosen 'randomly' (in the sense of the random functions of computers); the procedure converges almost surely and there is a probabilistic error bound of the form $O(1/\sqrt{N})$.

In Quasi-Monte-Carlo methods, the points come from deterministic multidimensional sequences with very low irregularities of distribution; there is a deterministic error bound in terms of both the quality of the distribution (by means of the discrepancy $D(P_N)$, see 3.1 for the definition) and the regularity of the integrand (by means of the total variation $V(f)$ of $f$), the famous Koksma-Hlawka theorem:
$$\left|\int_{I^s} f(x)\,dx - \frac{1}{N}\sum_{n=0}^{N-1} f(X_n)\right| \le \ldots$$

[...]

$\succ$, $\sim$, and $\parallel$ as defined above are comparative probability relations denoting more likely than, equally likely as, and probabilistically incomparable to. Axioms for the binary relation approach are restrictions on $\succsim$. Examples, applied to all $x, y, z \in X$, are:
reflexivity: $x \succsim x$
transitivity: $(x \succsim y,\ y \succsim z) \Rightarrow x \succsim z$
completeness: $x \succsim y$ or $y \succsim x$.
Completeness clearly implies reflexivity, and completeness and transitivity define $\succsim$ as a weak order. Examples for comparative probability are $x \succsim \emptyset$, where $\emptyset$ is the empty event, and the additivity condition which says that if event $z$ is disjoint from events $x$ and $y$ then $x \succsim y \Leftrightarrow (x \cup z) \succsim (y \cup z)$.

The choice function and binary relation approaches often operate in tandem. When $C$ is the primitive construct and $\mathcal{A}$ contains all singletons and doubletons from $X$, we often define $\succsim$ in terms of binary choices by
$$x \succsim y \quad\text{if}\quad x \in C(\{x,y\})$$
and refer to $\succsim$ as a revealed preference relation. When $\succsim$ is primitive, one plausible definition of $C$ is $C(A) = \{x \in A : x \succsim y \text{ for all } y \in A\}$, provided that the set of maximally-preferred objects in $A$ is not empty. Empty maximal sets can arise from intransitive or cyclic preferences, incomparabilities, or infinite sets with no largest members.
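The two constructions just described can be tried on a small finite set. The following is a minimal sketch, not taken from the chapter: the helper names `choice_set` and `revealed_relation`, the score table, and the three-element cycle are all hypothetical illustrations. It builds $C(A)$ from a weak preference relation, recovers the revealed relation from binary choices, and shows that a cyclic strict preference leaves the maximal set empty.

```python
from itertools import combinations

def choice_set(A, weakly_prefers):
    """C(A) = {x in A : x is weakly preferred to every y in A}."""
    return {x for x in A if all(weakly_prefers(x, y) for y in A)}

def revealed_relation(X, C):
    """Pairs (x, y) with x in C({x, y}), using singletons and doubletons of X."""
    pairs = {(x, x) for x in X if x in C({x})}
    for x, y in combinations(X, 2):
        chosen = C({x, y})
        if x in chosen:
            pairs.add((x, y))
        if y in chosen:
            pairs.add((y, x))
    return pairs

# Hypothetical weak order induced by scores (equal scores mean indifference).
score = {"a": 2, "b": 2, "c": 1}
weak = lambda x, y: score[x] >= score[y]
X = sorted(score)

best = choice_set(set(X), weak)  # maximal elements under the weak order
revealed = revealed_relation(X, lambda A: choice_set(A, weak))

# Hypothetical strict 3-cycle a > b > c > a: no element is weakly preferred to all.
cycle = {("a", "b"), ("b", "c"), ("c", "a")}
cyclic = lambda x, y: x == y or (x, y) in cycle
empty = choice_set({"a", "b", "c"}, cyclic)
```

In this example the revealed relation coincides with the weak order that generated the choices, while the cyclic relation gives an empty choice set, which is the first of the failure modes listed above.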
The ensuing sections of this chapter offer introductions to structures addressed by deterministic decision theories. Preference relations will serve as the primitive constructs in most cases. The sections are loosely organized by the traditional trichotomy of decision under certainty, under risk, and under uncertainty. More mathematical aspects of our topics are often subsumed by the representational theory of measurement as discussed in Krantz, Luce, Suppes and Tversky [2], Roberts [3] and Fishburn [4,5]. Surveys that provide more detailed overviews of many of our topics include Fishburn [6] for general coverage, Karni and Schmeidler [7] and Camerer and Weber [8] for decision under risk and uncertainty, Fishburn [9] for lexicographic decision rules, Plott [10], Sen [11] and Blais [12] for choice and social choice, and Keeney and Raiffa [13] and Wakker [14] for multiple attributes.

As a final introductory note, I list the titles of ensuing sections along with a few remarks on each.
2. Ordinal preferences. Representing preferences by order-preserving utility functions; partial orders and zones of indifference or indistinguishability; preference cycles.
3. Comparable preference differences. Preferences for preference differences; strength of preference.
4. Multiple attributes. Objects characterized by multiple attributes or criteria; additive conjoint measurement and other algebraic representations of holistic preferences; interdependencies among attributes; intransitivities.
5. Time streams. Preference between time-indexed outcome streams; finite vs. infinite-period streams.
6. Choice functions. Basic theory of the choice function approach; special conditions, including path independence; revealed preference.
7. Social choice functions. Group choice and theories of voting; impossibility theorems and voting paradoxes; Condorcet social choice functions.
8. Subset ranking and choice. Comparative probability and ambiguity for event sets; relationships between orders on objects and subsets of objects; signed orders; joint receipt.
9. Lotteries and risk. Expected utility theory; notions of risk and stochastic dominance; nonlinear utility; multiple attributes.
10. Uncertainty. Expected utility with subjective probability; lottery acts; lexicographic orders; Choquet expected utility and other nonadditive or nontransitive theories.
2
ORDINAL PREFERENCES
The fundamental theorem of utility identifies conditions on $(X, \succsim)$ that are necessary and sufficient for the existence of a real-valued utility function $u$ on $X$ that preserves preferences in the sense that, for all $x, y \in X$,
$$x \succsim y \Leftrightarrow u(x) \ge u(y).$$
Weak order suffices unless $X/\!\sim$, the set of indifference classes in $X$ determined by $\sim$, is uncountably infinite. Then we also require a countable order denseness condition which says that $X/\!\sim$ includes a countable subset $Y$ such that, whenever $x \succ y$ and the indifference classes of $x$ and $y$ are not in $Y$, $x \succ z \succ y$ for some $z$ whose indifference class is in $Y$.

When weak order holds but countable order denseness fails, $\succsim$ can be preserved by utility vectors ordered lexicographically. The simplest example is $(X, \succsim) = (\mathbb{R} \times \mathbb{R}, \ge_L)$, where $\ge_L$ is the lexicographic order on $\mathbb{R} \times \mathbb{R}$ defined by
$$(x_1, x_2) \ge_L (y_1, y_2) \quad\text{if}\quad x_1 > y_1 \ \text{or}\ (x_1 = y_1,\ x_2 \ge y_2).$$
The general form for $n$-dimensional vectors is
$$x \succsim y \Leftrightarrow (u_1(x), \ldots, u_n(x)) \ge_L (u_1(y), \ldots, u_n(y)),$$
where each $u_i$ is real-valued and $(a_1, \ldots, a_n) \ge_L (b_1, \ldots, b_n)$ if the two vectors are equal or $a_i > b_i$ for the first $i$ at which they differ.

The strict preference relation $\succ$ is a partial order if it is asymmetric ($x \succ y \Rightarrow \text{not}(y \succ x)$) and transitive. This accommodates intransitive indifference, e.g. $(x \sim y,\ y \sim z,\ x \succ z)$, which reflects successive judgments of indifference or indistinguishability that aggregate to a noticeable difference. When $X$ is countable, partial order allows the one-way representation
$$x \succ y \Rightarrow u(x) > u(y),$$
but partial order is not entirely necessary for this. The necessary and sufficient condition is that $\succ^*$, the transitive closure of $\succ$ defined by $x \succ^* y$ if $x \succ y$ or if $x \succ z_1 \succ \cdots \succ z_m \succ y$ for some $z_j$, is irreflexive.

Two-way representations exist for some partial orders when we replace $u$ by a more flexible construct. An example is
$$x \succ y \Leftrightarrow I(x) > I(y),$$
where each $I(x)$ is a real interval, and $I(x) > I(y)$ means that $a > b$ for all $a \in I(x)$ and all $b \in I(y)$. This representation requires the partial order to satisfy $(x \succ y,\ x' \succ y') \Rightarrow (x \succ y' \text{ or } x' \succ y)$, in which case it is called an interval order. If it is true also that $(x \succ y \succ z) \Rightarrow (x \succ w \text{ or } w \succ z)$ for every $w \in X$, then $\succ$ is a semiorder and, for finite $X$, every $I(x)$ can be assigned the same length. Semiorders formalize a notion of zones of indistinguishability whose left and right ends are ordered the same way, whereas interval orders allow very different zones of indistinguishability for different objects.

We say that $(X, \succ)$ has a preference cycle if $x \succ y \succ \cdots \succ z \succ x$ for $x, y, \ldots, z \in X$. Preference cycles are often associated with multiple criteria situations in which different criteria govern different comparisons. Although preference cycles preclude traditional numerical representations, they can be accommodated by the positive codomain of a bivariate real-valued function, say
$$x \succ y \Leftrightarrow p(x, y) > 0.$$
This is a trivial representation unless additional properties are imposed on $p$.

[...]

3
COMPARABLE PREFERENCE DIFFERENCES
[...] A common numerical representation involves real-valued utility differences:
$$(x, y) \succsim^* (z, w) \Leftrightarrow u(x) - u(y) \ge u(z) - u(w).$$
This necessitates symmetry conditions such as $(x, y) \succsim^* (z, w) \Rightarrow (w, z) \succsim^* (y, x)$, along with cancellation assumptions that allow decomposition of $(x, y)$ into the difference $u(x) - u(y)$, for example
$$\{(x, z) \succsim^* (w, b),\ (w, y) \succsim^* (a, z)\} \Rightarrow (x, y) \succsim^* (a, b).$$
Proofs of the preceding utility-difference representation are often facilitated by special structural assumptions of an algebraic or topological nature.
One generalization of the difference representation replaces $u(x) - u(y)$ by [...]

4
MULTIPLE ATTRIBUTES
[...] $x \succ y$ if not$(x_i \sim_i y_i)$ for some $i$, and $x_i \succ_i y_i$ for the smallest such $i$. A more common situation under preferential independence arises when there are tradeoffs between factors, such as $(x_1, x_2) \sim (y_1, y_2)$ with $x_1 \succ_1 y_1$ and $y_2 \succ_2 x_2$. In such cases the additive conjoint measurement or additive utility model that represents the utility of an object as the sum of utilities for its factor levels may apply:
$$(x_1, \ldots, x_n) \succsim (y_1, \ldots, y_n) \Leftrightarrow \sum_{i=1}^{n} u_i(x_i) \ge \sum_{i=1}^{n} u_i(y_i),$$
where $u_i$ is a real-valued function on $X_i$. The additive utility model presumes weak order, preferential independence, and more involved independence conditions. The general independence or cancellation condition says that if $m \ge 2$, if $x^1, \ldots, x^m, y^1, \ldots, y^m \in X$ are such that $x_i^1, \ldots, x_i^m$ is a permutation of $y_i^1, \ldots, y_i^m$ for $i = 1, \ldots, n$, and if $x^j \succsim y^j$ for $j = 1, \ldots, m-1$, then not$(x^m \succ y^m)$. If $X$ has sufficiently rich algebraic or topological structure, it may suffice to use only the $m = 2$ or $m = 3$ part of the general condition, and the $u_i$ may be unique up to similar positive affine transformations, which means that $(v_1, \ldots, v_n)$ satisfies the representation in place of $(u_1, \ldots, u_n)$ if and only if there are real numbers $\alpha > 0$ and $\beta_1, \ldots, \beta_n$ such that $v_i = \alpha u_i + \beta_i$ for every $i$.

Other algebraic combinations of factor utilities that do not necessarily presume preferential independence can be considered. An example is the two-factor multiplicative model
$$(x_1, x_2) \succsim (y_1, y_2) \Leftrightarrow u_1(x_1)u_2(x_2) \ge u_1(y_1)u_2(y_2)$$
in which each $u_i$ can have negative as well as positive values. Other types of interdependence among factors that involve some degree of utility decomposition have also been investigated.

A generalization of the additive utility model that does not presume transitivity is
$$[\ldots] \sum_{i=1}^{n} [\ldots]$$
where [...], but in other cases it allows preference cycles.

5
TIME STREAMS
The multiattribute formulation with $x = (x_1, x_2, \ldots, x_n)$ can be cast as a time-dependent process when $i$ indexes successive time periods and $x_i$ is the outcome for period $i$ in time stream $x$. When the number of periods is unbounded, we write $x = (x_1, x_2, \ldots)$. Utility representations in the preceding section apply to finite-periods cases and can be extended to denumerable-periods cases. An example with $X_1 = X_2 = \cdots$ is the weighted additive form
$$u(x_1, x_2, \ldots) = \sum_{i=1}^{\infty} \lambda_i u(x_i)$$
in which $u$ is a one-period utility function, $\lambda_i > 0$ is a weighting constant for period $i$, and both $u$ and $\sum \lambda_i$ are bounded to ensure convergence. If $\lambda_1 > \lambda_2 > \lambda_3 > \cdots$, the future is discounted, and it is discounted at a constant rate if $\lambda_i = \lambda^i$ for some $0 < \lambda < 1$. But other patterns are plausible, as when the $\lambda_i$ increase for a time and then decrease toward zero.

Although the $X_i$ need not be the same in a time-stream formulation, identical outcome sets for the periods allow comparisons associated with notions of persistence and impatience. Suppose the $X_i$ are identical, preferential independence applies, and $\succsim_i$ is the marginal preference relation on $X_0 = X_i$ for each $i$. We say that preferences are persistent or stationary if $\succsim_i = \succsim_j$ for all $i$ and $j$, and that they exhibit impatience if, whenever outcome $a$ is preferred to $b$, $(\ldots, a, \ldots, b, \ldots) \succ (\ldots, b, \ldots, a, \ldots)$, where the two streams are identical in all other periods.

One cause of preference interdependence among periods is a desire for variety, for example in one's diet. If preference in period $i + 1$ depends only on the outcome in period $i$, we might consider the additive first-order Markovian model in which
$$u(x_1, x_2, \ldots) = u_1(x_1) + u_2(x_2, x_1) + u_3(x_3, x_2) + \cdots.$$
This can be specialized under relaxed notions of persistence and related aspects to $u_i(a, b) = \lambda_i u(a, b)$, $\lambda_i > 0$, for each $i \ge 2$.

Comparable preference differences of section 3 can be considered in the time-stream formulation as a way of enriching the structure. For example, the utility difference representation coupled with a straightforward notion of persistent preference differences in the periods leads to the additive representation $u(x_1, \ldots, x_n) = \sum_i u_i(x_i)$ in the finite-periods setting.
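The weighted additive form with constant-rate discounting from this section can be illustrated numerically. The sketch below is an assumption-laden example rather than anything given in the chapter: the square-root one-period utility, the discount factor 0.9, and the two four-period streams are hypothetical. It evaluates two finite streams that differ only by swapping a better and a worse outcome, which is the comparison used to define impatience.

```python
def stream_utility(stream, u, weights):
    """Weighted additive utility: sum of weights[i] * u(stream[i]) over the periods."""
    return sum(w * u(x) for w, x in zip(weights, stream))

def geometric_weights(lam, n):
    """Constant-rate discounting: weight lam**i for periods i = 1..n, with 0 < lam < 1."""
    return [lam ** i for i in range(1, n + 1)]

u = lambda x: x ** 0.5               # hypothetical one-period utility
weights = geometric_weights(0.9, 4)  # hypothetical discount factor and horizon

early = [9, 1, 4, 4]  # the better outcome (9) arrives in period 1
late = [1, 9, 4, 4]   # same outcomes with 9 and 1 swapped

u_early = stream_utility(early, u, weights)
u_late = stream_utility(late, u, weights)
```

Because the weights decrease, the stream that delivers the preferred outcome earlier gets the higher utility; with increasing weights the comparison would reverse.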
6
CHOICE FUNCTIONS
We now consider a choice function $C$ defined on a family $\mathcal{A}$ of nonempty subsets of $X$ as the primitive construct with
$$\emptyset \subset C(A) \subseteq A \quad\text{for every } A \in \mathcal{A}.$$
A prominent theme for choice functions is their ability to be characterized by maximal elements of weak orders on $X$. For any such weak order $\succsim$ let
$$M(A, \succsim) = \{x \in A : x \succsim y \text{ for all } y \in A\}.$$
We say that $C$ is weak-order representable if there is a weak order $\succsim$ on $X$ such that $M(A, \succsim)$ is a nonempty subset of $C(A)$ for every $A \in \mathcal{A}$, and that $C$ is exactly representable if $C(A) = M(A, \succsim)$ for some weak order and all $A \in \mathcal{A}$.
Supposing that $\mathcal{A}$ contains every nonempty finite subset of $X$, the inclusion condition
$$[A \subseteq B \text{ and } A \cap C(B) \ne \emptyset] \Rightarrow C(A) = A \cap C(B)$$
implies that $C$ is exactly representable under the weak order $\succsim$ whose strict part is defined by
$$x \succ y \quad\text{if}\quad x \ne y \text{ and } C(\{x, y\}) = \{x\}.$$
When no special structure is presumed for $\mathcal{A}$, we define a revealed preference-or-indifference relation $\succsim_0$ on $X$ by
$$x \succsim_0 y \quad\text{if}\quad x \in C(A) \text{ and } y \in A \text{ for some } A \in \mathcal{A},$$
and consider its transitive closure $\succsim_0^*$ in the following condition:
$$[x \in A \in \mathcal{A},\ y \in C(A),\ x \succsim_0^* y] \Rightarrow x \in C(A).$$
This condition, known as Richter's congruence axiom, is necessary and sufficient for exact representability.

For every nonempty $\mathcal{B} \subseteq \mathcal{A}$, define $C(\mathcal{B})$ as the set of all $x$ in $\bigcup_{B \in \mathcal{B}} B$ such that $x \in C(B)$ for all $B \in \mathcal{B}$ that contain $x$. The modified congruence axiom $C(\mathcal{B}) \ne \emptyset$ for every nonempty finite $\mathcal{B} \subseteq \mathcal{A}$ implies that $C$ is weak-order representable when $C(A)$ is finite for every $A \in \mathcal{A}$. The implication can fail, however, if some choice sets are infinite.

Many other conditions on choice functions have been proposed. An interesting example is Plott's path independence condition
$$C(A \cup B) = C[C(A) \cup C(B)].$$
This says that choices from larger sets can be based on choices among choices from smaller sets that cover the larger sets. Under suitable structure for $\mathcal{A}$, path independence implies that $\succ$ based on $C(\{x, y\}) = \{x\}$ is a partial order. Other conditions, referred to as axioms of revealed preference, have been used in consumer economics as an alternative to utility maximization to explain choices of budget-restricted consumption bundles.
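Plott's path independence condition from this section can be checked by brute force when $X$ is small. The following sketch is only illustrative: the function names, the score table, and the "second-best" rule are hypothetical and not from the chapter. It tests the condition over all pairs of nonempty subsets for one choice function built from maximal elements of a weak order and for one deliberately ill-behaved rule.

```python
from itertools import combinations

def nonempty_subsets(X):
    """All nonempty subsets of X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def is_path_independent(X, C):
    """Check C(A | B) == C(C(A) | C(B)) for all nonempty subsets A, B of X."""
    subsets = nonempty_subsets(X)
    return all(C(A | B) == C(C(A) | C(B)) for A in subsets for B in subsets)

# Hypothetical scores defining a weak order on four alternatives.
score = {"a": 2, "b": 2, "c": 1, "d": 0}

def C_max(A):
    """Maximal elements of A under the weak order given by the scores."""
    top = max(score[x] for x in A)
    return frozenset(x for x in A if score[x] == top)

def C_second(A):
    """Hypothetical rule: choose the second-highest score level when one exists."""
    levels = sorted({score[x] for x in A}, reverse=True)
    target = levels[1] if len(levels) > 1 else levels[0]
    return frozenset(x for x in A if score[x] == target)

ok_max = is_path_independent(score, C_max)
ok_second = is_path_independent(score, C_second)
```

The maximal-element rule passes, while the second-best rule fails, for instance with A = {a, c} and B = {d}, where choosing from the union gives {c} but choosing among the earlier choices gives {d}.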
7
SOCIAL CHOICE FUNCTIONS
A social choice function is a mapping $F$ from a set $\mathcal{A} \times \mathcal{D}$, where $\mathcal{A}$ is as in the preceding section, into nonempty subsets of $X$ such that $F(\cdot, D)$ is a choice function for every $D \in \mathcal{D}$. Each $D$ is a data set that describes preferences or potential choices of a set of individuals or voters with respect to $X$ or $\mathcal{A}$, and $\mathcal{D}$ is a collection of such data sets. Members of $X$ are often referred to as candidates or alternatives, and data sets in $\mathcal{D}$ are sometimes called voter preference profiles. The social choice set $F(A, D)$ can be viewed as the candidates in $A$ most acceptable to the voters when $A$ is the feasible set of candidates and $D$ is the voter preference profile.

The premier result (and challenge!) in social choice theory is Arrow's impossibility theorem [15]. Suppose that $F$ is a social choice function on $\mathcal{A} \times \mathcal{D}$ and that $X$ has at least three candidates, $\mathcal{A}$ contains every two-element subset of $X$, and $\mathcal{D}$ is the set of all $n$-tuples $D = (\succsim_1, \succsim_2, \ldots, \succsim_n)$ of weak orders on $X$. For each profile $D$, define $\succ_D$ on $X$ by
$$x \succ_D y \quad\text{if}\quad x \ne y \text{ and } F(\{x, y\}, D) = \{x\},$$
and let $x \succsim_D y$ denote not$(y \succ_D x)$. Arrow's theorem says that $F$ cannot simultaneously satisfy four apparently reasonable conditions:
1 (Pareto). For all $D$ and $\{x, y\}$, $x \succ_D y$ if $x \succ_i y$ for all $i$;
2 (Binariness). For all $D$, $D'$ and $\{x, y\}$, if $D$ and $D'$ are the same on $\{x, y\}$, then $\succ_D$ and $\succ_{D'}$ are the same on $\{x, y\}$;
3 (Social order). Every $\succsim_D$ is a weak order;
4 (No dictator). No $i$ is a dictator in the sense that for all $D$ and all $\{x, y\}$, $x \succ_i y \Rightarrow x \succ_D y$.

Arrow's impossibility theorem gave rise to a few dozen other theorems for $\mathcal{A} \times \mathcal{D}$ structures that identify conditions for $F$ that are mutually inconsistent. All have roots in an old observation known as Condorcet's paradox of cyclical majorities. Its simplest example uses three candidates and three voters with transitive preferences $x \succ_1 y \succ_1 z$, $z \succ_2 x \succ_2 y$ and $y \succ_3 z \succ_3 x$. Let $a \succ_m b$ mean that more voters prefer $a$ to $b$ than $b$ to $a$. Then $x \succ_m y \succ_m z \succ_m x$, so the simple majority relation $\succ_m$ is cyclic.

Many other voting anomalies for elections with three or more candidates have been noted. For example, some widely used procedures have the property that a potentially victorious candidate turns into a loser after a profile is unambiguously changed in its favor. Moreover, virtually all election procedures are vulnerable to strategic misrepresentation whereby voters can secure the election of a preferred candidate by lying about their preferences.

Together, impossibility theorems, voting paradoxes, and strategic misrepresentation indicate that there is no such thing as a fully acceptable election procedure when three or more candidates compete. In consequence, new procedures continue to be proposed and old ones rediscovered in attempts to find a better voting system. A recent example is approval voting, where each voter votes for a subset of candidates without ranking and the winner is the candidate with the most votes. This deceptively simple system has many nice features and has been adopted by several professional societies. It has also been promoted for party primary elections but has encountered strong resistance in that political sphere from people with vested interests who are averse to change.
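The three-voter Condorcet example in this section is small enough to verify mechanically. The sketch below uses the profile stated in the text; the function names and the approval ballots in the last lines are hypothetical additions for illustration. It computes the simple majority relation, lists the majority cycles, and tallies an approval election.

```python
from itertools import permutations

# Each voter's ranking, most preferred first (the three-voter profile from the text).
profile = [("x", "y", "z"), ("z", "x", "y"), ("y", "z", "x")]

def majority_prefers(profile, a, b):
    """True if more voters rank a above b than rank b above a."""
    a_over_b = sum(r.index(a) < r.index(b) for r in profile)
    return a_over_b > len(profile) - a_over_b

def majority_cycles(profile):
    """All triples (a, b, c) with a beating b, b beating c, and c beating a."""
    candidates = sorted(profile[0])
    return [(a, b, c) for a, b, c in permutations(candidates, 3)
            if majority_prefers(profile, a, b)
            and majority_prefers(profile, b, c)
            and majority_prefers(profile, c, a)]

def approval_winners(ballots):
    """Candidates approved by the largest number of voters."""
    tally = {}
    for ballot in ballots:
        for c in ballot:
            tally[c] = tally.get(c, 0) + 1
    top = max(tally.values())
    return {c for c, n in tally.items() if n == top}

cycles = majority_cycles(profile)
# Hypothetical approval ballots: each voter approves a subset, with no ranking.
winners = approval_winners([{"x", "y"}, {"z", "x"}, {"y"}])
```

The cycle $x \succ_m y \succ_m z \succ_m x$ appears in `cycles` (together with its two rotations), even though every individual ranking is transitive. The approval tally needs only counts, not rankings, and may return several tied winners.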
8
SUBSET RANKING AND CHOICE
A popular construct for subset comparisons is a comparative probability relation $\succsim$ on a set $\mathcal{E}$ of events, or subsets of a state space $S$, that contains the empty event $\emptyset$ and the universal event $S$. We say that $(\mathcal{E}, \succsim)$ agrees with a probability measure $\mu$ on $\mathcal{E}$ if, for all $A, B \in \mathcal{E}$,
$$A \succsim B \Leftrightarrow \mu(A) \ge \mu(B).$$
Agreement entails weak order, $S \succ \emptyset$, $A \succsim \emptyset$ for all $A \in \mathcal{E}$, and independence or cancellation conditions similar to those of additive conjoint measurement. When $\mathcal{E}$ is infinite, an Archimedean axiom is also needed. The simplest cancellation condition,
$$A \succ A' \Leftrightarrow A \cup B \succ A' \cup B, \quad\text{provided } (A \cup A') \cap B = \emptyset,$$
suffices when a suitably strong Archimedean axiom is used. A variety of weaker representations have been proposed for comparative probability, including several that use intervals bounded by lower and upper measures.

The $(\mathcal{E}, \succsim)$ formulation has also been used to address event ambiguity, a notion concerned with the difficulty in assessing probability. One axiom here is $A \sim (S \setminus A)$, which asserts that an event and its complement are equally ambiguous.

A different subset concern is preference or choice among objects formulated as subsets such as committees, option packages, or meals. One approach considers relationships between preferences on single items and on subsets of items. A simple axiom in this case is: if $x \succ y$ and $A \cap \{x, y\} = \emptyset$ then $A \cup \{x\} \succ A \cup \{y\}$. However, interdependencies arising from substitutabilities, complementarities, and desires for variety or representativeness often invalidate such axioms and force consideration of interactions among items in viable evaluations of subsets.

The subset comparison problem has given rise to the notion of a signed order as a more informative basis than preferences between single items in extending those preferences to subsets. Let $X$ denote the set of single items, and let $X^*$ denote a disjoint copy of $X$. We can think of $x^*$ as the negation or denial of $x$, with $(x^*)^* = x$ so that $X \cup X^*$ is closed under the $*$ operation. A signed order is a binary relation $\succsim$ on $X \cup X^*$ that satisfies $a \succsim b \Leftrightarrow b^* \succsim a^*$ for all $a, b \in X \cup X^*$. In the committee selection setting, $x \succ y$ indicates that you would rather have $x$ than $y$ on the committee, $x^* \succ y$ means that you would rather have $x$ not on than to have $y$ on, and so forth.

Another recent notion related to subsets is that of joint receipt. This has been considered primarily for sums of money where there is a natural addition operation, but it applies to other entities as well. The question arises in the monetary setting as to whether an individual who receives distinct amounts $x$ and $y$, which could be gains or losses, evaluates them separately and then aggregates their values, or evaluates the package holistically after forming the sum $x + y$. With $\oplus$ denoting joint receipt, the related utility question is whether $u(x \oplus y) = f(u(x), u(y))$ or $u(x \oplus y) = u(x + y)$. A hedonic editing rule has been proposed to the effect that $u(x \oplus y) = \max\{u(x + y),\ u(x) + u(y)\}$, in which case the utility of $x \oplus y$ is the larger of the utility of the sum $x + y$ and the sum of the utilities of $x$ and $y$ considered separately.

9
LOTTERIES AND RISK
A lottery on a set X is a probability distribution p on X for which p{A) — 1 for some finite AC X. Members of X could be wealth levels, gains and losses, consumption bundles, multiattribute outcomes, time streams, candidates, or pure strategies. We consider a preference relation ^ o n a set P of lotteries on X that is closed under convex combinations so that Xp + (1 — A) Xp + (1 - X)r y Xq + (1 - X)r; 3. p y q y r => ap+(l—a)r (0,1).
y q y (3p+(l—(3)q for some
Q,j5 6
When these hold, u is unique up to a positive affine transformation $v = au + b$ with $a > 0$. Additional axioms are needed to extend the expected utility form to $u(p) = \int u(x)\,dp(x)$ when P is a set of probability measures on an algebra of subsets of X. There are also weaker versions of the fundamental theorem that replace weak order by partial order for $p \succ q \Rightarrow u(p) > u(q)$, or that omit the Archimedean axiom 3, in which case u maps P linearly into a multidimensional vector space ordered lexicographically.
Axiom 2 is the notorious independence axiom that is often violated by expressed preferences. An example is (\$3000 with pr. 1) $\succ$ (\$4000 with pr. 0.8, \$0 otherwise) and (\$4000 with pr. 0.2, \$0 otherwise) $\succ$ (\$3000 with pr. 0.25, \$0 otherwise). Theories that weaken axiom 2 to accommodate such violations have been developed. One example has the weighted linear representation
\[ p \succ q \Leftrightarrow \frac{u(p)}{w(p)} > \frac{u(q)}{w(q)}, \]
where both u and w are linear and w is positive. Another is the rank-dependent form: $p \succ q \Leftrightarrow u(p) > u(q)$, with f an increasing map from [0,1] into [0,1] and
\[ u(p) = u(x_1) + \sum_{j=2}^{n} \bigl[u(x_j) - u(x_{j-1})\bigr]\, f\Bigl(\sum_{i=j}^{n} p_i\Bigr) \]
when p assigns positive probabilities $p_1, \dots, p_n$ to $x_1, x_2, \dots, x_n$, which are ordered by increasing preference. A third relaxation of the linear theory that allows preference cycles as well as violations of independence has
\[ p \succ q \Leftrightarrow \varphi(p, q) > 0, \]
where $\varphi$ is a real-valued skew-symmetric function on $P \times P$ that is linear separately in each argument. The function $\varphi$ is often referred to as an SSB (skew-symmetric bilinear) utility function.
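The rank-dependent form can be made concrete with a short sketch. The code below evaluates the rank-dependent sum for finite lotteries and checks it on the \$3000/\$4000 example. The linear utility u and the weighting function certainty_effect are hypothetical choices made only for illustration; with the identity weighting the form reduces to expected utility, which cannot produce both of the stated preferences.

```python
from fractions import Fraction as Fr

def rank_dependent_utility(lottery, u, f):
    """lottery: dict mapping outcome -> positive probability (summing to 1)."""
    # Order the outcomes by increasing preference (here: increasing utility).
    xs = sorted(lottery, key=u)
    total = u(xs[0])
    for j in range(1, len(xs)):
        tail = sum(lottery[x] for x in xs[j:])   # probability of x_j or better
        total += (u(xs[j]) - u(xs[j - 1])) * f(tail)
    return total

def identity(p):
    # f(p) = p gives ordinary expected utility.
    return p

def certainty_effect(p):
    # Hypothetical increasing map on [0, 1] that underweights
    # every probability short of certainty.
    return p if p in (0, 1) else Fr(9, 10) * p

def u(x):
    # Hypothetical linear utility of money.
    return Fr(x)

a = {3000: Fr(1)}
b = {4000: Fr(4, 5), 0: Fr(1, 5)}
c = {4000: Fr(1, 5), 0: Fr(4, 5)}
d = {3000: Fr(1, 4), 0: Fr(3, 4)}

for name, lot in [("a", a), ("b", b), ("c", c), ("d", d)]:
    print(name,
          rank_dependent_utility(lot, u, identity),
          rank_dependent_utility(lot, u, certainty_effect))
```

Under the assumed weighting, a is ranked above b and c above d, which is the pattern that violates axiom 2.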
Risk attitudes typically refer to curvature properties of an increasing and differentiable function u on wealth or changes in wealth in the expected utility setting, but can also be formulated for nonlinear representations. Risk aversion applies when u is concave, or when the expected value of a nondegenerate lottery is preferred to the lottery or its certainty equivalent. Risk seeking describes the opposite behavior. It is often observed that people are risk averse in gains and risk seeking in losses.
The theory of stochastic dominance associates classes of utility functions on wealth with comparisons between the cumulative distribution functions of lotteries. When $p \ne q$, p first-degree stochastically dominates q if the cumulative of p at x is no greater than the cumulative of q at x, for all x, and this is true if and only if the expected utility of p is greater than the expected utility of q for all increasing u. And p second-degree stochastically dominates q if the left-partial integral of the cumulative of p is uniformly no greater than the left-partial integral of the cumulative of q, and this is true if and only if p's expected utility exceeds q's for all increasing, concave u. Notions of stochastic dominance have also been developed for multivariate distribution functions.
When $X \subseteq X_1 \times X_2 \times \cdots \times X_n$, special conditions on $(P, \succsim)$ allow u on X for the expected utility model to be decomposed into functions $u_i$ on $X_i$ for each attribute. If $p \sim q$ whenever p and q have the same marginal distribution on $X_i$ for each i, u has an additive decomposition $u(x_1, x_2, \dots, x_n) = \sum_i u_i(x_i)$. If the preference order for each i induced over marginal distributions on $X_i$ when levels of other attributes are fixed is independent of those fixed levels, then u has a multiplicative if not additive decomposition. Decompositions have also been investigated for SSB functions and other nonlinear forms.
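For lotteries with finitely many wealth levels, the two dominance tests above reduce to finitely many comparisons, since the cumulative is a step function and its left-partial integral is piecewise linear with kinks only at the wealth levels. The following is a minimal sketch under that assumption; the function names and the three lotteries are hypothetical.

```python
from fractions import Fraction as Fr

def cdf(lottery, x):
    # Cumulative probability of receiving at most x.
    return sum(p for y, p in lottery.items() if y <= x)

def integrated_cdf(lottery, x):
    # Integral of the step-function cumulative from -infinity to x,
    # which equals sum over y of p(y) * max(x - y, 0).
    return sum(p * max(x - y, 0) for y, p in lottery.items())

def first_degree_dominates(p, q):
    grid = sorted(set(p) | set(q))
    return p != q and all(cdf(p, x) <= cdf(q, x) for x in grid)

def second_degree_dominates(p, q):
    grid = sorted(set(p) | set(q))
    return p != q and all(
        integrated_cdf(p, x) <= integrated_cdf(q, x) for x in grid
    )

safe = {150: Fr(1)}
risky = {100: Fr(1, 2), 200: Fr(1, 2)}   # same mean as `safe`, more spread
worse = {50: Fr(1, 2), 200: Fr(1, 2)}    # `risky` with its low outcome lowered

print(first_degree_dominates(risky, worse))
print(first_degree_dominates(safe, risky))
print(second_degree_dominates(safe, risky))
```

Here risky dominates worse in the first degree, while safe dominates risky only in the second degree, as expected for a comparison that matters only to risk-averse (concave) utilities.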
10 UNCERTAINTY
Our final category generalizes decision under risk to decision under uncertainty. Each potential decision in the uncertainty case is an act that assigns a consequence in X to each state in S. The act set F is a subset of $X^S$, the set of all maps from S into X. The decision maker is uncertain about which state is the true state and cannot affect its occurrence by the act taken. Consequences in X are the primary objects of value to the decision maker.
Savage's theory 16 assumes that $F = X^S$ and that $\mathcal{E}$ is the set of all subsets of S. It uses seven axioms for $(F, \succsim)$ to imply the existence of a unique subjective probability measure $\mu$ on $\mathcal{E}$ and a bounded utility function u on X such that, for all $f, g \in F$,
\[ f \succsim g \Leftrightarrow \int_S u(f(s))\,d\mu(s) \ge \int_S u(g(s))\,d\mu(s). \]
The axioms imply that S is infinite and that u is unique up to a positive affine transformation. They include weak order, independence axioms, and an Archimedean condition.
Numerous alternatives to Savage's SEU (subjective expected utility) theory have been developed. One group replaces consequences by lotteries in P, which facilitates derivation of the SEU model for finite S. If $S = \{1, 2, \dots, n\}$ and lottery act f assigns lottery $p_i$ to state i, we obtain the SEU form
\[ u(f) = \sum_{i=1}^{n} \mu_i\, u(p_i), \]
where the $\mu_i$ are subjective probabilities and u is a linear function on P. This approach has been used with a weak Archimedean condition to obtain a lexicographic SEU model in which subjective probabilities are matrices and utilities are multidimensional vectors ordered lexicographically. The lottery-act formulation also facilitates derivation of decompositional forms for multiattribute utilities.
Two examples motivate theories that weaken the SEU representation but retain some of its key ideas. The first considers acts f and g for payoffs that depend on which face of a die comes up on one roll:
\[ \begin{array}{c|cccccc} & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline f & \$1000 & \$900 & \$800 & \$700 & \$600 & \$500 \\ g & \$900 & \$800 & \$700 & \$600 & \$500 & \$1000 \end{array} \]
Even if an individual believes the die is balanced with probability 1/6 for each face, the correlations between payoffs under each state may lead to $f \succ g$, or perhaps $g \succ f$. SEU theory requires $f \sim g$. The generalization of Savage's representation with
\[ f \succsim g \Leftrightarrow \int_S \varphi(f(s), g(s))\,d\mu(s) \ge 0, \]
where $\varphi$ is a skew-symmetric function on $X \times X$, accommodates such preferences. Its axioms are similar to Savage's once weak order has been relaxed to allow preference cycles.
Ellsberg's famous urn example suggests a different form for $\mu$ rather than u. One ball is to be drawn randomly from an urn containing 90 balls: 30 are red (R) and 60 are black (B) and yellow (Y) in unknown proportion. Consider acts
f: win \$10,000 if R drawn, nothing otherwise
g: win \$10,000 if B drawn, nothing otherwise
f': win \$10,000 if R or Y drawn, nothing otherwise
g': win \$10,000 if B or Y drawn, nothing otherwise.
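Many people prefer f to g and g' to f' for reasons of specificity. However, these preferences violate the SEU principle which says that if the only difference between (f, g) and (f', g') is that for some event E
f(s) = g(s) = x for all $s \in E$,
f'(s) = g'(s) = y for all $s \in E$,
then $f \succ g \Leftrightarrow f' \succ g'$. (In the urn example, E is the event that a yellow ball is drawn.) When $f \succ g$ and $g' \succ f'$, subjective probabilities that reflect preferences in an obvious way cannot be additive. Nonadditive but monotonic subjective probabilities $\sigma$ are involved in the representation
\[ f \succsim g \Leftrightarrow \int_S u(f(s))\,d\sigma(s) \ge \int_S u(g(s))\,d\sigma(s), \]
in which integration with respect to $\sigma$ is defined by
\[ \int_S w(s)\,d\sigma(s) = \int_{t=0}^{\infty} \sigma\{s : w(s) \ge t\}\,dt \;-\; \int_{t=-\infty}^{0} \bigl[1 - \sigma\{s : w(s) \ge t\}\bigr]\,dt. \]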
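When S is finite, the integral above with respect to a monotonic set function becomes a finite sum over the states ordered by decreasing w. The sketch below computes that sum and applies it to Ellsberg's urn. The set function lower_envelope (the smallest probability of an event over all possible urn compositions) and the utilities u(\$10,000) = 1, u(\$0) = 0 are assumptions made only for illustration; they are not taken from the text.

```python
from fractions import Fraction as Fr

def choquet(w, capacity):
    """Integral of w (dict state -> value) with respect to a monotonic
    set function `capacity` defined on frozensets of states."""
    states = sorted(w, key=w.get, reverse=True)          # decreasing w
    total = w[states[-1]] * capacity(frozenset(states))  # lowest value times weight of S
    for k in range(1, len(states)):
        upper = frozenset(states[:k])                    # the k best states
        total += (w[states[k - 1]] - w[states[k]]) * capacity(upper)
    return total

def lower_envelope(event):
    # Hypothetical monotonic, nonadditive set function: the smallest
    # probability of `event` over urns with 30 red, b black, 60 - b yellow.
    def prob(b):
        counts = {"R": 30, "B": b, "Y": 60 - b}
        return Fr(sum(counts[s] for s in event), 90)
    return min(prob(b) for b in range(61))

# Utility profiles of the four acts, with u($10,000) = 1 and u($0) = 0.
f = {"R": 1, "B": 0, "Y": 0}
g = {"R": 0, "B": 1, "Y": 0}
f_prime = {"R": 1, "B": 0, "Y": 1}
g_prime = {"R": 0, "B": 1, "Y": 1}

for name, act in [("f", f), ("g", g), ("f'", f_prime), ("g'", g_prime)]:
    print(name, choquet(act, lower_envelope))
```

With this set function the value of f exceeds that of g and the value of g' exceeds that of f', so the commonly expressed preferences are representable even though no additive probability can produce them.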
The representation with a monotonic $\sigma$ in place of an additive $\mu$ is referred to as Choquet expected utility, or CEU. Sufficiently strong structural assumptions imply that $\sigma$ is unique and u is unique up to a positive affine transformation. It is instructive to note that, when S is finite, the preceding integral turns into integration of a step function with a form similar to that of the preceding section for rank-dependent utility. Thus, CEU in the setting of uncertainty is analogous to rank-dependent utility in the lottery-based risk setting.
References
1. P. C. Fishburn, Stochastic utility. In: S. Barbera, P. J. Hammond, C. Seidl, eds. Handbook of Utility Theory, volume 1. New York: Kluwer, 1998, pp. 273-319.
2. D. H. Krantz, R. D. Luce, P. Suppes and A. Tversky, Foundations of Measurement, volume 1. New York: Academic Press, 1971.
3. F. S. Roberts, Measurement Theory. Reading, Massachusetts: Addison-Wesley, 1979.
4. P. C. Fishburn, Utility Theory for Decision Making. New York: Wiley, 1970.
5. P. C. Fishburn, Nonlinear Preference and Utility Theory. Baltimore, Maryland: Johns Hopkins University Press, 1988.
6. P. C. Fishburn, Utility and subjective probability. In: R. J. Aumann, S. Hart, eds. Handbook of Game Theory, volume 2. Amsterdam: Elsevier, 1994, pp. 1397-1435.
7. E. Karni and D. Schmeidler, Utility theory with uncertainty. In: W. Hildenbrand and H. Sonnenschein, eds. Handbook of Mathematical Economics, volume 4. Amsterdam: Elsevier, 1991, pp. 1763-1831.
8. C. Camerer and M. Weber, Recent developments in modeling preferences: uncertainty and ambiguity, J. of Risk and Uncertainty, 5: 325-370 (1992).
9. P. C. Fishburn, Lexicographic orders, utilities and decision rules: a survey, Management Sci., 20: 1442-1471 (1974).
10. C. R. Plott, Axiomatic social choice theory: an overview and interpretation, Amer. J. Pol. Sci., 20: 511-596 (1976).
11. A. K. Sen, Social choice theory: a re-examination, Econometrica, 45: 53-89 (1977).
12. A. Blais, The debate over electoral systems, Internat. Pol. Sci. Rev., 12: 239-260 (1991).
13. R. L. Keeney and H. Raiffa, Decisions with Multiple Objectives: Preferences and Value Tradeoffs. New York: Wiley, 1976.
14. P. P. Wakker, Additive Representations of Preferences. New York: Kluwer, 1989.
15. K. J. Arrow, Social Choice and Individual Values, second edition. New York: Wiley, 1963.
16. L. J. Savage, The Foundations of Statistics. New York: Wiley, 1954.
COMBINATORIAL ASPECTS OF MATHEMATICAL SOCIAL SCIENCE
K.H. KIM
Mathematics Research Group, Alabama State University, Montgomery, AL 36101-0271, U.S.A. and Fellow, Korean Academy of Science and Technology (KAST)
F.W. ROUSH
Mathematics Research Group, Alabama State University, Montgomery, AL 36101-0271, U.S.A.
We survey briefly mathematical methods in a number of social sciences, and discuss the most famous result in the theory of social welfare functions from the viewpoint of Boolean matrices, to demonstrate that the theory is both combinatorial and discrete. Lastly, we provide important open problems.
1 INTRODUCTION
The purpose of this paper is to indicate the general nature of the mathematical social sciences for those new to the area, and then to go into somewhat more depth on one important topic. What is mathematical social science? It is the application of mathematics to social science problems. Currently active areas of research include (1) mathematical economics; (2) mathematical psychology; (3) mathematical sociology; and (4) game theory. Each of these disciplines has its own research journals, and there are many related journals. There is also a journal, Mathematical Social Sciences, published since 1981, which encompasses all of the above-mentioned areas. The theory of binary relations or Boolean matrices is an effective tool for attacking problems in various areas of social science and in fact serves as one common thread among these areas.
2 LIST OF APPLICABLE MATHEMATICS IN SOCIAL SCIENCE
2.1 Archaeology
Mathematical methods of seriation have been used to date artifacts in archaeology. These methods involve studying similarities between artifacts and then finding an order of the artifacts which results in the least total change between one date and the next. The mathematics can involve matrix theory. This has some similarities with classification theory but is not the same, since the end product is an ordering of the different items, not a grouping of them into disjoint subsets.
2.2 Demography
At the end of the eighteenth century, Thomas Malthus gave a mathematical theory of population. Demography is the forecasting of population by mathematical methods, based on birth and death rates for various age and class segments of the population. The basic methodology is finite difference equations, which can often be solved by matrix methods.
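As an illustration of a finite difference equation solved by matrix methods, the following sketch projects an age-structured population one period at a time: the first age class is refilled by births and every other class by the survivors of the class below it. This age-class scheme and all of the rates are assumptions chosen for the example, not data from the text.

```python
def project(population, fertility, survival, steps=1):
    """Advance an age-structured population by `steps` periods.

    population: head counts per age class, youngest first
    fertility:  births per individual in each age class
    survival:   fraction of each class (except the oldest) that
                survives into the next class
    """
    n = list(population)
    for _ in range(steps):
        births = sum(f * x for f, x in zip(fertility, n))
        survivors = [s * x for s, x in zip(survival, n[:-1])]
        n = [births] + survivors   # one step of the difference equation
    return n

# Hypothetical rates for three age classes.
fertility = [0.0, 1.5, 1.0]
survival = [0.5, 0.8]

after_one = project([100, 50, 20], fertility, survival, steps=1)
after_ten = project([100, 50, 20], fertility, survival, steps=10)
print(after_one)
print(after_ten)
```

Each step is multiplication of the population vector by a fixed matrix whose first row holds the fertility rates and whose subdiagonal holds the survival rates, so long-run growth is governed by that matrix's dominant eigenvalue.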
2.3 Economics and Econometrics
In the nineteenth century, David Ricardo translated some of Adam Smith's work into mathematics, such as the theory of comparative advantage in international trade. Cournot developed a mathematical theory of duopoly, that is, economic competition between two agents. For example, if AT&T were considered as competing against MCI only, we would have a duopoly. Bentham invented the idea of a mathematical utility function and its role in computing a welfare for society as a whole. Walras worked out aspects of modern competitive economics in detail. In particular he formulated the idea of a competitive equilibrium with a number of individuals.
Economics is the most quantitative of the social sciences, and Walras's theory of competitive equilibrium is central to it. The basic ingredients are a utility function or, equivalently, a preference function for each agent over consumption (would he prefer a used car, or a vacation, a computer, and a country club membership?), and a specification of the goods the agents can produce. An equilibrium is a set of prices at which supply equals demand for all goods, given that each individual can pay for what he buys out of what he sells. A variety of mathematical methods are used in this area: multidimensional calculus, linear programming, fixed point theorems, differential topology. Methods have been developed for computing equilibria based on simplicial approximation.
As we will discuss at greater length later, individual preferences are represented by numbers called utilities, which give the value of a situation to a certain person; the more he or she likes or desires or prefers the situation, the higher the utility. In market economies, it is expected that the outcome will be the result of a game played by the individual agents seeking to maximize their own utilities (again, to some degree). Thus, in these situations, one is interested in some kind of equilibrium rather than an optimum. Before discussing the general theory of games, we mention some other areas of mathematical economics.
Mathematical economists have produced models of nearly every conceivable situation bearing on economic life. Some are taken as qualitative; some are quantitatively fitted to reality using statistical methods. In the latter class belong the econometric models which are used to predict the performance of national economies, using a large number of aggregated economic quantities such as incomes in various sectors and aggregated demands estimated from the past behavior of consumers. These have been quite successful in predicting short-run behavior. This involves primarily statistics, which in turn involves a lot of linear algebra. Practical econometric models involve hundreds of variables, but a simplified example of an econometric model is the use of least squares to find a formula for the increase of wages over time. Some consider econometrics a separate discipline from mathematical economics.
For long-term behavior, it has been shown that many economic models are chaotic in the sense of the mathematical theory of dynamical systems. If this reflects reality, then it may never be possible to uniformly predict economic behavior long in advance, as with the weather, unless government is able to control this behavior. It has been stated that no matter how powerful computers become, or how accurate weather measurements, it will never be possible to predict the weather in detail for more than 2 weeks in advance, due to intrinsic chaos in the mathematical sense. This involves chaos theory.
Questions of motivation, incentives, and mechanisms have become very important in economic theory. One theme is how to design an economic system so that the individual agents, acting out of self-interest, will act so as to benefit the group. For instance, one would want a situation in which "honesty is the best policy", efficiency pays off, and there is a basic fairness of the resulting distribution. To some degree the benefits of a free market are that it has many of these properties.
Brams and Taylor's theory of fair division 5 became well known in the nineties. It deals with questions like the best way to divide goods among several players when the goods have different values to different players. It also asks whether there is some mechanism which, under self-interest, will lead players to make fair divisions, like the classic method where one player cuts a cake and the other chooses which piece he wants. The mathematics involved is typically elementary algebra and probability.
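The simplified econometric model mentioned earlier in this section, a least-squares formula for the increase of wages over time, can be sketched in a few lines. The wage series below is invented purely for illustration, and the closed-form slope and intercept are the standard ones for fitting a straight line.

```python
def least_squares_line(ts, ys):
    """Return (slope, intercept) of the line minimizing squared error."""
    n = len(ts)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * t_bar

# Hypothetical data: year index and average hourly wage.
years = [0, 1, 2, 3, 4]
wages = [10.0, 12.1, 13.9, 16.2, 17.8]

slope, intercept = least_squares_line(years, wages)
print(f"wage ~ {intercept:.2f} + {slope:.2f} * year")
```

A practical model would regress on many variables at once, which is where the linear algebra mentioned above enters.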
2.4 Game Theory
Game theory analyzes situations of conflict between individuals, groups, or countries, who are viewed as competitors or players in the game. It was invented by von Neumann in 1928. Game theory is in part a branch of mathematics, insofar as its concepts are given and the question is to deduce their consequences; but it ties in to many mathematical social sciences in its attempts to define concepts that accurately reflect human behavior.
Games are structures made up of a sequence of plays, in which each of a set of players chooses from a specified set of moves, given some information about previous moves, and obtains specified payoffs as a result. Each player has the ultimate goal of maximizing his or her utility in some sense, usually maximizing expected utility, but occasionally minimizing losses. The classical games of perfect information like chess have a complexity which is not much simplified by mathematics, and are not much related to most human situations.
A class of games much more relevant to human interactions are n-person games consisting of one stage of play, in which each person chooses a strategy with no knowledge of what strategy the others choose. Thus, these are not games of perfect information, because the players do not already know what moves the other players have made. A strategy refers to some action which the player can take: a poker player can bluff, a general can invade a country. To each n-tuple of plays there corresponds an n-tuple of expected utility payoffs.
EXAMPLE. Probably the most famous 2-person game is the prisoner's dilemma, with payoff matrix something like
\[ \begin{pmatrix} (-1,-1) & (-10,0) \\ (0,-10) & (-9,-9) \end{pmatrix}. \]
Each of two prisoners (who collaborated on a crime) is held in isolation, and is offered immunity if he testifies against the other, and the other does not testify against him. In that case the other receives a 10 year sentence. If both testify against each other, both get 9 years; but if neither talks they can only be convicted of a lesser charge resulting in 1 year in prison. (In the matrix, the first row and column correspond to staying silent, the second to testifying, and each entry lists the payoff to the row player first.) Whatever strategy the other prisoner chooses, each player acting in self-interest alone will choose to testify, with the result that both lose. It has been argued that this situation is at the heart of many real-life problems such as damage to the environment, where self-interest is at odds with a common interest which one player alone cannot secure.
As it stands, this is an example of a noncooperative game, where players are not allowed to communicate and form binding agreements. A noncooperative game can be like war, where the opposite sides do not meet together and jointly plan what they will do. The most important solution concept for noncooperative games is the Nash equilibrium. An n-tuple of strategies is in Nash equilibrium if and only if no single player by himself can achieve a higher payoff when all the other players keep their strategies the same. A game is said to be 0-sum if the sum of the players' utilities is always zero, meaning that whatever one player gains, another will lose. The number of players makes some difference in game theory; when there are only 2 players, game theory is simpler. The above prisoner's dilemma outcome is a Nash equilibrium, but Nash equilibrium solutions are much more convincing in some other situations like 2-person, 0-sum games.
A cooperative n-person game is one in which the players can communicate and make binding agreements. This is like peace negotiations after a war, or labor negotiations, or political give-and-take within a country's government. If we extend the Nash equilibrium concept to coalitions (sets of players) by saying that no subset of the players, by choosing different strategies, can strictly improve the payoffs for all its members, we have a strong Nash equilibrium. This is one solution concept for cooperative n-person games. However, in a great many classes of games, existence of a strong Nash equilibrium is very unlikely, and all sorts of complicated schemes have been devised: the kernel, the nucleolus, von Neumann-Morgenstern solutions.
Game theory is important in modern mathematical economics in relation to the theory of incentive-compatible mechanisms, that is, social structures such that players have an incentive to act in a way which is socially beneficial.
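The Nash equilibrium condition can be checked mechanically for a small one-stage game by testing every unilateral deviation. The sketch below does this for the prisoner's dilemma payoffs given earlier; the strategy labels "silent" and "testify" and the dictionary encoding of the game are our own assumptions.

```python
from itertools import product

def pure_nash_equilibria(payoffs):
    """payoffs: dict mapping a strategy profile (tuple with one strategy
    per player) to the tuple of payoffs the players receive."""
    n = len(next(iter(payoffs)))
    strategies = [sorted({profile[i] for profile in payoffs}) for i in range(n)]
    equilibria = []
    for profile in product(*strategies):
        stable = True
        for i in range(n):
            for alt in strategies[i]:
                # Player i deviates alone; everyone else keeps their strategy.
                deviation = profile[:i] + (alt,) + profile[i + 1:]
                if payoffs[deviation][i] > payoffs[profile][i]:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria

prisoners_dilemma = {
    ("silent", "silent"): (-1, -1),
    ("silent", "testify"): (-10, 0),
    ("testify", "silent"): (0, -10),
    ("testify", "testify"): (-9, -9),
}

print(pure_nash_equilibria(prisoners_dilemma))
```

The only profile that survives is the one in which both prisoners testify, even though both would be better off if both stayed silent.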
Most of game theory involves only algebra and probability theory, but some uses linear programming, convex sets, topology, fixed point theorems, even measure theory.
2.5 Political science
The theory of voting methods and their properties has a great overlap with the theory of social welfare functions, which has a more economic origin. The goal of this theory is to describe which voting methods, in elections with at least 3 candidates, are to be preferred to others under various conditions. For example, one common method is plurality-runoff, in which, if no candidate receives a majority on the first vote, the two candidates who receive the largest numbers of votes are compared with each other in a 2-way vote. This has some advantages over the straight plurality method, but other methods such as the Borda method (the ranks of candidates by voters are added and compared), approval voting 3, and cumulative voting have strong advocates.
Two other areas of political science that have received a lot of mathematical study are the theory of parliamentary coalitions, an application of cooperative game theory (see, for instance, 4), and the theory of power indices such as the Shapley value. These have been used in legal settlements. They involve primarily combinatorial probability. Statistical methods are also widely used in political science.
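To see why power indices involve combinatorial probability, consider the Shapley-Shubik index, which is the Shapley value of a weighted voting game: a voter's power is the fraction of orderings of the voters in which that voter's weight first brings the running total up to the quota. The sketch below computes it by enumerating all orderings; the three-voter assembly and its quota are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def shapley_shubik(weights, quota):
    """weights: dict voter -> voting weight; quota: weight needed to win."""
    players = list(weights)
    pivots = dict.fromkeys(players, 0)
    orders = list(permutations(players))
    for order in orders:
        running = 0
        for player in order:
            running += weights[player]
            if running >= quota:        # this voter turns the coalition winning
                pivots[player] += 1
                break
    return {p: Fraction(c, len(orders)) for p, c in pivots.items()}

# Hypothetical assembly: seats 50, 49 and 1, with 51 needed to pass a motion.
index = shapley_shubik({"A": 50, "B": 49, "C": 1}, quota=51)
print(index)
```

In this example the voter with 49 seats has no more power than the voter with a single seat, which shows that power need not be proportional to weight.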
2.6 Psychology
In the nineteenth century, Galton and Fechner introduced mathematical methods into psychology. Psychology is in a sense the foundational social science, as the individual is the unit of society. Quantification of perceptions, attitudes, values, and preferences leads to measurement theory, which in general is the theory of characterizing various types of scales in terms of qualitative properties. A very simple example is that any weak order on a finite set can be represented by a utility function. A less elementary one involves the theory of additive conjoint measurement, in which axioms about how pairs or triples of quantities are related can give rise to an additive function relating various scales. Measurement theory involves the theories of ordered groups and n-ary relations. This is only the briefest mention of a large body of theory; for more see 7, 9, 18, 19, 25, 30. The first of these in particular is highly combinatorial. In a sense game theory is relevant to psychology as a theory of individual behavior.
2.7 Sociology
In the nineteenth century, Quetelet applied mathematics, specifically statistics, to the study of sociology. Mathematical sociologists have used the theory of clustering and related methods to analyze social groups. For some limited group of people one considers a number of binary relations on it, such as friendship, spending time together, requesting assistance, or confrontation. All these binary relations are represented as matrices, where an (i,j) entry may denote the strength of this relationship between person i and person j . Then the group is divided into subgroups (clusters) based on the analysis of these matrices. The mathematics is primarily matrix theory and graph theory. This kind of analysis has recently been extended to a theory of social networks n .
3 SOCIAL WELFARE FUNCTIONS (SWF)
A social welfare function, given a set of alternatives, expresses which choices are better than others for a given group of individuals. It is derived from knowing which alternatives are preferred to which others by the members of the group. A more combinatorial and discrete aspect of economics is the theory of social welfare functions 22, 23. Its most famous result won Kenneth Arrow a Nobel prize in 1972 for work done in 1951. Here we shall treat s.w.f. in terms of Boolean matrices rather than the usual relation-theoretic approach. For the latter, see Sen 31. Sen also won a Nobel prize in 1998 for his excellent contributions to welfare economics, including social choice theory, welfare and poverty indices, and studies of famine. In 1994, John Harsanyi won a Nobel prize for his game theory solutions, and Reinhard Selten won a Nobel prize for his perfect equilibrium concept in game theory; John Nash won the same year for his solution concepts in game theory such as Nash equilibrium.
We consider a group of m individuals (consumers, competitors, voters), denoted $M = \{1, 2, \dots, m\}$, who are faced with a group choice between n alternatives (candidates, goods, proposals, choices) $X = \{1, 2, \dots, n\}$.
DEFINITION. The two-element Boolean algebra $\beta = \{0, 1\}$ is as follows:
\[ \begin{array}{c|cc} + & 0 & 1 \\ \hline 0 & 0 & 1 \\ 1 & 1 & 1 \end{array} \qquad\qquad \begin{array}{c|cc} \cdot & 0 & 1 \\ \hline 0 & 0 & 0 \\ 1 & 0 & 1 \end{array} \]
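A matrix over $\beta$ is called a Boolean matrix. Let $B_n$ denote the set of all $n \times n$ Boolean matrices over $\beta$. For both algebraic and combinatorial properties of Boolean matrices see 14. Let $B_X$ denote the set of all binary relations on X. For $R \in B_X$ and $A \in B_n$, let
\[ a_{ij} = \begin{cases} 1 & \text{if } (i,j) \in R, \\ 0 & \text{otherwise.} \end{cases} \]
This defines an isomorphism from $B_X$ to $B_n$ under which union and intersection correspond to Boolean addition and the elementwise product $\odot$. There is also an isomorphism from $B_n$ to $D_X$, where $D_X$ denotes the set of all directed graphs on X.
EXAMPLE. Let $X = \{1, 2, 3\}$. Let $R = \{(1,2), (1,3), (2,3)\}$. Then
\[ A = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \]
and the corresponding directed graph has arcs from 1 to 2, from 1 to 3, and from 2 to 3. Let $R = \{(1,2)\}$. Then
\[ A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \]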
Most workers in this area use binary relations, since the basic datum that an individual prefers one alternative to another is a binary relation. However, Boolean matrices have simple mathematical properties which enable us to give a simple proof of Arrow's theorem, and they are closer than binary relations to the main body of mathematics, such as linear algebra. Graphs are not much used by workers in social welfare functions but are often used in mathematical sociology, perhaps because they give a simple picture of social networks. Another, geometric, approach is given in 27, 28.
Since we are going to work with Boolean matrices, we will restate the various order relations in terms of Boolean matrices.
(a) Reflexive relation $\forall a \in X,\ (a,a) \in R$: $I \le A$, where I is the identity matrix.
(b) Symmetric relation $\forall a, b \in X,\ (a,b) \in R \Rightarrow (b,a) \in R$: $A = A^T$.
(c) Antisymmetric relation $\forall a, b \in X,\ [(a,b) \in R \wedge (b,a) \in R] \Rightarrow a = b$: $A \odot A^T \le I$.
(d) Transitive relation $\forall a, b, c \in X,\ [(a,b) \in R \wedge (b,c) \in R] \Rightarrow (a,c) \in R$: $A^2 \le A$.
(e) Complete relation $\forall a, b \in X,\ (a,b) \in R \vee (b,a) \in R$: $A + A^T = J$, the matrix each of whose entries is 1.
(f) Weak (pre-) order relation (complete, transitive binary relation): $A + A^T = J$, $A^2 \le A$.
(g) Linear (total) order relation (complete, antisymmetric, transitive binary relation): $A \odot A^T \le I$, $A + A^T = J$, $A^2 \le A$.
(h) Quasiorder relation (reflexive, transitive binary relation): $I \le A$, $A^2 \le A$.
(i) Partial order relation (reflexive, antisymmetric, transitive binary relation): $I \le A$, $A \odot A^T \le I$, $A^2 \le A$.
Let $W_X$ denote the set of all weak orders on X. Let $L_X$ denote the set of all linear orders on X. Let $W_n$ be the set of all $n \times n$ matrices corresponding to $W_X$. Likewise, let $L_n$ denote the set of all $n \times n$ matrices corresponding to $L_X$. Then
(1) $|L_n| = n!$,
(2) $|W_n| = \sum_{k=1}^{n} S(n,k)\,k!$,
where $S(n,k)$, the Stirling number of the second kind, is the number of equivalence relations on a set of n elements having k equivalence classes.
\[ \begin{array}{c|rrrrr} n & 1 & 2 & 3 & 4 & 5 \\ \hline |L_n| & 1 & 2 & 6 & 24 & 120 \\ |W_n| & 1 & 3 & 13 & 75 & 541 \end{array} \]
\[ \begin{array}{c|rrrrr} n & 6 & 7 & 8 & 9 & 10 \\ \hline |L_n| & 720 & 5040 & 40320 & 362880 & 3628800 \\ |W_n| & 4683 & 47293 & 545835 & 7087261 & 102247563 \end{array} \]
DEFINITION. A profile is an m-tuple of linear orders, an element of $(L_n)^m$.
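The Boolean matrix conditions for completeness, transitivity, and antisymmetry are easy to test by machine, and for very small n the first columns of the table of $|L_n|$ and $|W_n|$ can be confirmed by enumerating all $2^{n^2}$ Boolean matrices. The following is a minimal sketch; the function names are ours, and the brute-force search is only meant as a check, not as a practical counting method.

```python
from itertools import product

def transpose(a):
    return [list(row) for row in zip(*a)]

def boolean_square(a):
    # Boolean matrix product A * A: entry (i, j) is 1 when some k has
    # a[i][k] = a[k][j] = 1.
    n = len(a)
    return [[int(any(a[i][k] and a[k][j] for k in range(n)))
             for j in range(n)] for i in range(n)]

def leq(a, b):
    # Entrywise comparison A <= B.
    return all(x <= y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def is_complete(a):          # A + A^T = J
    t = transpose(a)
    return all(x or y for ra, rt in zip(a, t) for x, y in zip(ra, rt))

def is_transitive(a):        # A^2 <= A
    return leq(boolean_square(a), a)

def is_antisymmetric(a):     # elementwise product of A and A^T is <= I
    n = len(a)
    return all(not (a[i][j] and a[j][i])
               for i in range(n) for j in range(n) if i != j)

def all_boolean_matrices(n):
    for bits in product((0, 1), repeat=n * n):
        yield [list(bits[i * n:(i + 1) * n]) for i in range(n)]

def count_orders(n):
    """Return (number of linear orders, number of weak orders) on n points."""
    weak = linear = 0
    for a in all_boolean_matrices(n):
        if is_complete(a) and is_transitive(a):
            weak += 1
            linear += is_antisymmetric(a)
    return linear, weak

for n in (1, 2, 3):
    print(n, count_orders(n))
```

For n = 3 the search covers 512 matrices; the number of candidates grows as $2^{n^2}$, so the closed formulas (1) and (2) are the practical route for larger n.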
EXAMPLE. Suppose there are 3 committee members, John, Joe, and Smitty. They have to choose between 3 plans a, b, c for investing the committee's money. John prefers a to b and b to c. Joe prefers b to c and c to a. Smitty prefers a to c and c to b. The profile would be made up of these three linear orders, which as Boolean matrices (rows and columns in the order a, b, c) are as follows:
\[ \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}. \]
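The three matrices of the committee example can be generated directly from the members' rankings, which is a convenient way to check them. In the sketch below a ranking lists the plans from most to least preferred, and entry (x, y) of the matrix is 1 when x is ranked at or above y; the helper name linear_order_matrix is our own.

```python
def linear_order_matrix(ranking, alternatives):
    """Boolean matrix of a linear order given as a best-to-worst ranking."""
    position = {x: r for r, x in enumerate(ranking)}
    return [[int(position[x] <= position[y]) for y in alternatives]
            for x in alternatives]

alternatives = ["a", "b", "c"]
profile = {
    "John": linear_order_matrix(["a", "b", "c"], alternatives),
    "Joe": linear_order_matrix(["b", "c", "a"], alternatives),
    "Smitty": linear_order_matrix(["a", "c", "b"], alternatives),
}

for member, matrix in profile.items():
    print(member)
    for row in matrix:
        print("  ", row)
```

Each matrix has ones on the diagonal, and for every pair of distinct plans exactly one of the two off-diagonal entries is 1, as completeness and antisymmetry require.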
We will denote the elements of a profile as $R(1), \dots, R(m)$. In Boolean matrix terms, we write them as $A(1), \dots, A(m)$. The linear orders can be called preference relations in the sense that they express the preferences of the individual voters. On the other hand, they are linear orders also. Linear order is a more general concept insofar as it can apply to many other situations than individual or social choice, such as the linear order on the real numbers. Preference relation is restricted to linear order in this article to make possible a treatment by Boolean matrices, but other authors consider preference relations as being weak orders.
DEFINITION. A social welfare function (SWF) is a function $F : (L_n)^m \to W_n$.
Thus in this definition, any function is allowed (the range being $W_n$ implicitly gives transitivity and completeness). Later additional restrictions are considered. The notation $(L_n)^m$ refers to the set of m-tuples $(R(1), \dots, R(m))$, so that we can write a social welfare function in the form $F(R(1), \dots, R(m))$.
EXAMPLE. Let $F(A(1), \dots, A(m)) = A(1)$. Then the group preference is the same as that of individual 1. If we take the profile in the last example, then we get
\[ F\left( \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} \right) = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}. \]
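EXAMPLE. Let $F(A(1), \dots, A(m)) = J$. Then the group is indifferent between any two alternatives. If we apply this to the profile in the last example, we will get
\[ F\left( \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} \right) = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}. \]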
EXAMPLE. Suppose m = n = 2. We can write out a social welfare function (chosen at random) by giving its value, a matrix in $W_2$, on each of the $(2!)^2 = 4$ profiles in $(L_2)^2$.
[Display: the value of F on each of the four profiles of $2 \times 2$ linear order matrices.]
There are many other social welfare functions which are somewhat more realistic, but they are much more complicated to define. The total number of SWF is
\[ |W_n|^{|L_n|^m} = \Bigl( \sum_{k=1}^{n} S(n,k)\,k! \Bigr)^{(n!)^m}. \]
The number of dictatorial SWF is m, so the number of nondictatorial SWF is
\[ |W_n|^{|L_n|^m} - m = \Bigl( \sum_{k=1}^{n} S(n,k)\,k! \Bigr)^{(n!)^m} - m. \]
It is difficult to count the number of Pareto optimal SWF, but a related condition, unanimity, would suffice for Arrow's theorem: a SWF is unanimous if, whenever all voters have the same linear order, this is also the group preference. The number of unanimous SWF is
\[ |W_n|^{|L_n|^m - |L_n|} = \Bigl( \sum_{k=1}^{n} S(n,k)\,k! \Bigr)^{(n!)^m - n!}. \]
This is because a unanimous SWF is completely determined when it is specified on the set of all profiles where not all voters have the same linear order; that is, we remove the n! profiles where all voters have the same linear order. The number of SWF which satisfy Independence of Irrelevant Alternatives is much smaller and will be stated in the main theorem.
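The three counts above grow extremely fast, which a short computation makes vivid. The sketch below evaluates them exactly with Python integers; the Stirling numbers are computed by the standard inclusion-exclusion formula, and the function names are our own.

```python
from math import comb, factorial

def stirling2(n, k):
    # Stirling number of the second kind by inclusion-exclusion.
    total = sum((-1) ** j * comb(k, j) * (k - j) ** n for j in range(k + 1))
    return total // factorial(k)

def weak_orders(n):
    # |W_n| = sum over k of S(n, k) * k!
    return sum(stirling2(n, k) * factorial(k) for k in range(1, n + 1))

def swf_counts(n, m):
    profiles = factorial(n) ** m              # |L_n|^m
    total = weak_orders(n) ** profiles
    return {
        "total": total,
        "nondictatorial": total - m,
        "unanimous": weak_orders(n) ** (profiles - factorial(n)),
    }

print([weak_orders(n) for n in range(1, 11)])
print(swf_counts(2, 2))
print(len(str(swf_counts(3, 3)["total"])), "digits for n = m = 3")
```

Already for two voters and two alternatives there are 81 social welfare functions, and for three voters and three alternatives the count has hundreds of digits.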
Since individuals in a free society think and behave differently, individuals demonstrate different degrees of various attributes and characteristics. The following three properties safeguard the basic human rights of individuals in the group and fairness to them. For the benefit of the readers, we will first state these attributes in terms of relations, then state them in terms of Boolean matrices. There are many other attributes. For a thorough treatment, see 31 .
(α) Pareto Optimality (PO).
DEFINITION. A SWF is P.O. if whenever everybody in the group strictly prefers a to b, then the group always strictly prefers a to b (a, b here are only two out of n alternatives in X).
Therefore a SWF is P.O. if and only if
\[ \forall i, j \in X,\ \bigl[\forall k \in M,\ R(k) \in L_X \wedge (i,j) \in R(k) \wedge (j,i) \notin R(k)\bigr] \Rightarrow (i,j) \in F(R(1), \dots, R(m)) \wedge (j,i) \notin F(R(1), \dots, R(m)). \]
Equivalently, a SWF is P.O. if and only if
\[ \forall i, j \in X,\ \bigl[\forall k \in M,\ A(k) \in L_n \wedge a(k)_{ij} = 1 \wedge a(k)_{ji} = 0\bigr] \Rightarrow F(A(1), \dots, A(m))_{ij} = 1 \wedge F(A(1), \dots, A(m))_{ji} = 0. \]
EXAMPLE. The Borda social welfare function is computed by having each individual i assign a rank $r_{ij}$ to alternative j, where for each i, the $r_{ij}$ are 1, 2, ..., n in some order. Then the group ranks the alternatives according to the magnitude of $\sum_i r_{ij}$.
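The Borda rule is simple to compute. The sketch below adopts one convention as an assumption, since the text does not fix it: each individual gives rank n to the most preferred alternative and rank 1 to the least preferred, and the group weakly prefers x to y when the rank sum of x is at least that of y. It is applied to the committee profile of John, Joe, and Smitty from the earlier example.

```python
def borda(profile, alternatives):
    """profile: list of rankings, each listing the alternatives from most
    to least preferred. Returns the rank sums and the Boolean matrix of
    the resulting group weak order."""
    n = len(alternatives)
    score = {x: 0 for x in alternatives}
    for ranking in profile:
        for place, x in enumerate(ranking):
            score[x] += n - place          # best gets n, worst gets 1
    matrix = [[int(score[x] >= score[y]) for y in alternatives]
              for x in alternatives]
    return score, matrix

alternatives = ["a", "b", "c"]
committee = [
    ["a", "b", "c"],   # John
    ["b", "c", "a"],   # Joe
    ["a", "c", "b"],   # Smitty
]

score, matrix = borda(committee, alternatives)
print(score)
for row in matrix:
    print(row)
```

Ties in the rank sums give a weak order rather than a linear order, which is why the range of a SWF is $W_n$. The Borda rule is Pareto optimal: if everyone ranks x above y, the rank sum of x is strictly larger.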
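(β) Nondictatorial Condition (NC).
DEFINITION. A SWF is nondictatorial if and only if there is no individual in the group such that if he prefers a to b, then the group always prefers a to b.
Therefore a SWF is nondictatorial if and only if (where $k_1$ denotes the potential dictator)
\[ \forall k_1 \in M,\ \exists (R(k)) \in (L_X)^m,\ \exists i, j \in X:\ (i,j) \in R(k_1) \wedge (j,i) \notin R(k_1) \wedge \neg\bigl[(i,j) \in F(R(1), \dots, R(m)) \wedge (j,i) \notin F(R(1), \dots, R(m))\bigr]. \]
(γ) Independence of Irrelevant Alternatives (IIA).
DEFINITION. A SWF is independent of irrelevant alternatives if and only if the group preference between any two alternatives depends only on the individual preferences between those two alternatives. In Boolean matrix terms, for all profiles $(A(1), \dots, A(m))$ and $(B(1), \dots, B(m))$ in $(L_n)^m$ and all $i, j \in X$,
\[ \bigl[\forall k \in M,\ a(k)_{ij} = b(k)_{ij}\bigr] \Rightarrow F(A(1), \dots, A(m))_{ij} = F(B(1), \dots, B(m))_{ij}. \]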
EXAMPLE. The Borda social welfare function is not independent of irrelevant alternatives. We will show that very few social welfare functions are independent of irrelevant alternatives. Dictatorial social welfare functions, antidictatorial ones (which reverse the preferences of a fixed individual), and constant social welfare functions satisfy (γ).
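The failure of independence for the Borda rule can be exhibited with two voters and three alternatives. In the sketch below, which assumes the convention that the most preferred alternative receives rank n, the two profiles P and Q are our own example: every voter compares a and b the same way in both, and only the position of c changes, yet the group comparison of a and b is reversed.

```python
def borda_scores(profile):
    """profile: list of rankings, most preferred alternative first."""
    n = len(profile[0])
    scores = {}
    for ranking in profile:
        for place, x in enumerate(ranking):
            scores[x] = scores.get(x, 0) + (n - place)
    return scores

def prefers(ranking, x, y):
    return ranking.index(x) < ranking.index(y)

P = [["a", "b", "c"], ["b", "c", "a"]]
Q = [["a", "c", "b"], ["b", "a", "c"]]

# Each voter compares a and b identically in the two profiles.
same_individual_views = all(
    prefers(p, "a", "b") == prefers(q, "a", "b") for p, q in zip(P, Q)
)

scores_P = borda_scores(P)
scores_Q = borda_scores(Q)
print(same_individual_views)
print(scores_P, scores_Q)
```

Under P the group ranks b above a, while under Q it ranks a above b, so the (a, b) entry of the group matrix is not a function of the individual (a, b) entries alone.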
Arrow originally proved his theorem on the domain $W_X$. For a simple Boolean matrix treatment, it is necessary to deal with the subset $L_X$. However, to some degree this can be considered a stronger theorem, since the assumptions on the SWF are not as strict. It could be shown that a dictator on $L_X$ must be a dictator on $W_X$ also.
ARROW'S IMPOSSIBILITY THEOREM No SWF has attributes ($\alpha$), ($\beta$), ($\gamma$) for $|X| \ge 3$.
Characterization Theorem. All SWF satisfying ($\gamma$) are specified as follows. There is a certain fixed weak order on $X$. It determines group preferences between blocks. Within each equivalence class of this weak order, there is a dictator, an antidictator, or the s.w.f. is constantly $J$. The dictator or antidictator must be the same in each block, and in blocks below a constant block there is always a constant block.
Proof: Let $A(p)$ denote the linear order of individual $p$. Note that by antisymmetry of linear orders, for any $A(p) \in L_n$ and distinct $i, j \in X$, $a(p)_{ij} = a(p)_{ji}^c$. Independence of irrelevant alternatives ($\gamma$) on the domain of linear orders means that the $(i,j)$-entry of the social welfare function is some function $f_{ij}(v)$, where $v = (a(1)_{ij}, \ldots, a(m)_{ij})$ is the vector of $(i,j)$-entries in the individual preference matrices $A(1), \ldots, A(m)$. That is, the group choice between $i, j$ depends only on all individual preferences between $i, j$, and this vector specifies those. Completeness of the social welfare function means $f_{ij}(v) + f_{ji}(v^c) = 1$ always, where $v^c$ denotes the vector whose entries are the complements of the entries of $v$. This follows from the Boolean matrix formulation of completeness above. The definition of s.w.f. implies completeness and transitivity (weak order). Take any three alternatives $i, j, k$, using our assumption that $|X| \ge 3$, and let $u, v, w$ be the vectors of individual preferences for $i$ to $j$; $j$ to $k$; $i$ to $k$. That is,
$u = (a(1)_{ij}, \ldots, a(m)_{ij})$, $v = (a(1)_{jk}, \ldots, a(m)_{jk})$, $w = (a(1)_{ik}, \ldots, a(m)_{ik})$.
We wish to consider exactly what the possibilities are for $u, v, w$. Transitivity says that if the $i,j$ and $j,k$ entries are one, so is the $i,k$ entry. So $w \ge uv$. It also says that if the $i,j$ and $j,k$ entries are zero, so that an individual prefers $j$ to $i$ and $k$ to $j$, then he prefers $k$ to $i$, so the $i,k$ entry is zero. Therefore $w \le u + v$. We check, for any individual linear order $R(s)$, that the 6 linear orders on $i, j, k$ produce exactly the 6 possibilities for $u_s, v_s, w_s$ that satisfy $uv \le w \le u + v$. Specifically:
Preference order   $u_s$   $v_s$   $w_s$
ijk                1       1       1
ikj                1       0       1
jik                0       1       1
jki                0       1       0
kij                1       0       0
kji                0       0       0
Then the Boolean matrix interpretation of transitivity says that
(1)   $f_{ij}(u) f_{jk}(v) \le f_{ik}(w)$
whenever the vectors $u, v, w$ satisfy $uv \le w \le u + v$. Pareto optimality says that $f_{ij}(0, \ldots, 0) = 0$, $f_{ij}(1, \ldots, 1) = 1$. If we let $w = u$, $v = \mathbf{1}$, where $\mathbf{1} = (1, 1, \ldots, 1)$, the equations give $f_{ij}(u) \le f_{ik}(u)$. By symmetry $f_{ik}(v) = f_{ij}(v)$ for every vector $v$. Likewise $f_{ki}(v) = f_{kj}(v)$. It follows for a domain of at least 3 alternatives that all the $f_{ij}$ are equal, and we can write them as $f(v)$. Transitivity then says that, whenever $uv \le w \le u + v$,
(2)   $f(u) f(v) \le f(w)$.
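The table of six linear orders can be checked mechanically. The following is a minimal sketch, not part of the original argument: the helper names `preference_bits` and `satisfies_transitivity_bounds` are hypothetical, and the alternatives are simply labelled by the characters `i`, `j`, `k`. It enumerates the linear orders and compares the resulting triples with the Boolean triples satisfying $uv \le w \le u + v$.

```python
from itertools import permutations

def preference_bits(order):
    """Return (u, v, w) for one linear order, listed best to worst:
    u = 1 if i is above j, v = 1 if j is above k, w = 1 if i is above k."""
    pos = {alt: rank for rank, alt in enumerate(order)}
    u = int(pos["i"] < pos["j"])
    v = int(pos["j"] < pos["k"])
    w = int(pos["i"] < pos["k"])
    return u, v, w

def satisfies_transitivity_bounds(u, v, w):
    # Boolean product is AND, Boolean sum is OR.
    return (u & v) <= w <= (u | v)

# One row of the table per linear order on {i, j, k}.
table = {"".join(order): preference_bits(order) for order in permutations("ijk")}

# All Boolean triples allowed by uv <= w <= u + v.
allowed = {(u, v, w)
           for u in (0, 1) for v in (0, 1) for w in (0, 1)
           if satisfies_transitivity_bounds(u, v, w)}

for order, bits in sorted(table.items()):
    print(order, *bits)
```

The two excluded triples are $(1,1,0)$ and $(0,0,1)$, which are exactly the intransitive patterns.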
Therefore $f(u) = f(v) = 1 \Rightarrow f(uv) = 1$ (set $w = uv$). And $u \le w \Rightarrow f(u) \le f(w)$ (set $v = \mathbf{1}$). Completeness says that for all $v$, either $f(v) = 1$ or $f(v^c) = 1$. In set-theoretic terms, $S^* = \{z \in V_m \mid f(z) = 1\}$ is an ultrafilter. It is nonempty since it contains $\mathbf{1}$ by ($\alpha$). Therefore it has a minimal element $w$. By PO ($\alpha$), $w > 0$. Suppose $w$ has at least two one entries. Let $0 < z < w$. Then $f(z) = 0$, $f(z^c) = 1$. Therefore $f(wz^c) = 1$, $wz^c < w$. This contradicts $w$ being minimal. So $w$ has a single 1 entry, say in location $s$, and $z \ge w \Rightarrow f(z) = 1$. Thus $z_s = 1 \Rightarrow f(z) = 1$. Since the product of any two minimal vectors of $S^*$ is less than either, the minimal element of $S^*$ is unique, and $f(z) = 1 \Leftrightarrow z_s = 1$. Therefore for all $i, j \in X$, $z \in V_m$, $f_{ij}(z) = f(z) = z_s$. This says individual $s$ is a dictator. This contradicts ($\beta$).
We briefly outline the arguments needed instead to prove the characterization theorem when we assume IIA only. We define a fixed weak order $W(L)$ by saying that $w(L)_{ij} = 0$ if and only if
$\forall R(1), \ldots, R(m) \in L_n^m$, $F(R(1), \ldots, R(m))_{ij} = 0$.
The transitivity relation (1) above still holds, and it implies that if $w(L)_{ij} = 1$ then
(3)   $\forall k \in X$, $f_{ik} \ge f_{jk}$, $f_{ki} \ge f_{kj}$.
This means that within each equivalence class, all the functions $f$ are equal. On this class, completeness gives 3 possibilities: $f(0) = 1$, $f(\mathbf{1}) = 0$; $f(0) = 0$, $f(\mathbf{1}) = 1$; $f(\mathbf{1}) = f(0) = 1$. By the arguments above these give an antidictator, a dictator, or a constant $J$ within the matrix block represented by that equivalence class. Completeness implies that $\forall i, j \in X$, $w(L)_{ij} = 0 \Rightarrow w(L)_{ji} = 1$. The inequalities (3) imply that below a dictator or antidictator can be only a constant or the same dictator or antidictator. Below a constant block can be only a constant block. $\square$
COROLLARY The number of s.w.f. satisfying IIA is
$\sum_{k=1}^{n} \left( \sum_{r=0}^{k} S(n,k,r)\, k!\, (2mr + 1) \right)$.
Here $S(n,k,r)$ denotes the number of equivalence relations with $k$ equivalence classes among which $r$ equivalence classes have at least 2 members.
Proof: For each weak order chosen as $W(L)$, and each value of $j$ from 1 to $k$, we can choose that at or above level $j$ there is a dictator or antidictator from $m$ possible individuals and below is a constant function. This gives $2mr$ choices for each weak order of this type. Then we add in the case when all equivalence classes are constant. The special case $r = 0$ also fits the formula.
A number of other attributes have been defined and studied for social welfare functions.
DEFINITION A social welfare function $F$ is anonymous if the following holds, where $S_m$ denotes the symmetric group:
$\forall \pi \in S_m$, $\forall R(1), \ldots, R(m) \in L_n^m$, $F(R(1), \ldots, R(m)) = F(R(\pi(1)), \ldots, R(\pi(m)))$.
This means that the identity of the individuals makes no difference in the social choice, that everyone plays a symmetrical role.
DEFINITION A social welfare function $F$ is neutral if the following holds, where for any binary relation $R$ and permutation $\pi$ of $X$, $R^\pi$ denotes the binary relation such that $\forall x, y \in X$, $\pi(x) R^\pi \pi(y) \Leftrightarrow xRy$:
$\forall \pi \in S_n$, $\forall R(1), \ldots, R(m) \in L_n^m$, $F^\pi(R(1), \ldots, R(m)) = F(R(1)^\pi, \ldots, R(m)^\pi)$.
Being neutral means that every alternative is treated in an equal and symmetrical way, so that the ordering of the alternatives makes no difference.
DEFINITION An s.w.f. $F$ has positive responsiveness provided that whenever $(R(1), \ldots, R(m))$ and $(S(1), \ldots, S(m))$ are two profiles and $x \in X$ is such that
$\forall y, z \in X$, $z \ne x$, $\forall j$, $yR(j)z \Rightarrow yS(j)z$,
we have the following:
$\forall y \in X$, $xF(R(1), \ldots, R(m))y \Rightarrow xF(S(1), \ldots, S(m))y$,
$\forall y \in X$, $yF(S(1), \ldots, S(m))x \Rightarrow yF(R(1), \ldots, R(m))x$.
The meaning of this is that if we change the preferences so as to increase the position of $x$, keeping other relative positions the same, the position of $x$ is not decreased for the group choice. This hypothesis is more or less equivalent to ($\alpha$), ($\beta$), and also implies strategy-proofness, so it is very strict. In fact it is so strict that on the full domain $L_n^m$, $n > 2$, it implies dictatorial. This corresponds to the concept of monotonicity for social choice functions. EXAMPLE. The Borda method has positive responsiveness, as do methods of this general rank-sum type.
EXAMPLE. An s.w.f. can be defined with positive responsiveness which is not strictly a rank-sum method by considering pairwise majority votes. Define a relation $C_0$ (Condorcet) by $x C_0 y$ if and only if a majority of voters would prefer $x$ to $y$ in a contest between those two only. Now rank the alternatives according to the number of other alternatives which they would defeat in pairwise contests, i.e. $|\{y \mid x C_0 y\}|$.
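The pairwise-majority ranking just described can be sketched as follows. This is an illustration under assumptions: the function name `condorcet_scores`, the encoding of a profile as `(count, ranking)` pairs, and the small sample profile are all hypothetical and do not come from the text.

```python
from itertools import combinations

def condorcet_scores(profile, alternatives):
    """Count, for each alternative x, the alternatives y with x C_0 y.

    profile is a list of (count, ranking) pairs; each ranking is a string
    listing the alternatives from most to least preferred.
    """
    scores = {x: 0 for x in alternatives}
    for x, y in combinations(alternatives, 2):
        x_votes = sum(c for c, r in profile if r.index(x) < r.index(y))
        y_votes = sum(c for c, r in profile if r.index(y) < r.index(x))
        if x_votes > y_votes:
            scores[x] += 1
        elif y_votes > x_votes:
            scores[y] += 1
        # An exact tie gives neither alternative a pairwise win.
    return scores

# Hypothetical profile with 7 voters.
sample_profile = [(3, "abc"), (2, "bac"), (2, "cab")]
sample_scores = condorcet_scores(sample_profile, "abc")
```

Sorting the alternatives by decreasing score gives the group ranking; alternatives with equal scores are tied in the resulting weak order.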
EXAMPLE. Plurality-runoff does not have positive responsiveness (or monotonicity) in general. Suppose we have 3 candidates a, b, c and that 102 voters rank them as abc, 99 voters rank them as bca, 100 voters rank them as cab, 5 voters rank them as bac. Then b, a receive more first-place votes than c, and in a contest between them, a majority of the voters prefer a to b, so that a will be chosen. But if all bac voters were to raise the position of a by voting abc, then alternatives a, c have the largest number of first place votes, and now c defeats a in a contest between the two.
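The failure of monotonicity in this example can be reproduced directly. The sketch below uses the vote counts from the example; the function name `plurality_runoff` and the `(count, ranking)` encoding are illustrative assumptions, and ties are not handled because none occur here.

```python
def plurality_runoff(profile):
    """Return the winner of a two-candidate runoff between the two
    alternatives with the most first-place votes.

    profile is a list of (count, ranking) pairs, each ranking a string
    from most to least preferred. Ties are not handled in this sketch.
    """
    first_place = {}
    for count, ranking in profile:
        first_place[ranking[0]] = first_place.get(ranking[0], 0) + count
    x, y = sorted(first_place, key=first_place.get, reverse=True)[:2]
    x_votes = sum(c for c, r in profile if r.index(x) < r.index(y))
    y_votes = sum(c for c, r in profile if r.index(y) < r.index(x))
    return x if x_votes > y_votes else y

# Profile from the example.
before = [(102, "abc"), (99, "bca"), (100, "cab"), (5, "bac")]
# The 5 bac voters raise a to first place and now vote abc.
after = [(107, "abc"), (99, "bca"), (100, "cab")]

winner_before = plurality_runoff(before)
winner_after = plurality_runoff(after)
```

Raising $a$ in five ballots changes the runoff pair from $\{a, b\}$ to $\{a, c\}$, and $a$ loses the second contest by 107 votes to 199.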
We have mentioned the Borda rule above, which adds up points for the different ranks for different voters. It has been extensively studied by D. Saari, who has argued that it is the best social welfare function. Indeed it does have many positive properties. However, there are some other social welfare functions which have desirable properties which it lacks.
Individual rationality is the same concept as strategy-proofness, mentioned elsewhere. It means that if the voters vote as if they were playing a game, so that their votes are not necessarily their actual preferences, then the outcome from the s.w.f. is a Nash equilibrium. In order for this to make sense for an s.w.f., some method of tie-breaking must be selected. Then individual rationality means that no individual, by voting as if his preferences were different, can obtain a result which is better according to his true preferences. Gibbard 12 and Satterthwaite 29 proved that if at least 3 alternatives are in the range of a social choice function and it is individually rational, then it must be dictatorial. The references by Pattanaik and Moulin study this topic in more detail.
Some modified types of social welfare functions have been considered.
DEFINITION A binary relation $R$ on $X$ is acyclic if and only if
$\forall k > 1$, $\forall x_1, \ldots, x_k \in X$, $\neg[(x_1, x_2) \in R \wedge (x_2, x_3) \in R \wedge \cdots \wedge (x_{k-1}, x_k) \in R \wedge (x_k, x_1) \in R]$.
A binary relation on $X$ is acyclic if and only if its matrix satisfies $A^n = 0$, where $n = |X|$. Let $T_X$ denote the set of acyclic binary relations on $X$, and $T_n$ the corresponding set of Boolean matrices. EXAMPLE Any strictly subtriangular Boolean matrix represents an acyclic binary relation.
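The matrix criterion $A^n = 0$ is easy to test on small examples. A minimal sketch follows, with Boolean matrices stored as lists of 0/1 rows; the helper names `bool_mat_mult` and `is_acyclic` are hypothetical.

```python
def bool_mat_mult(A, B):
    """Boolean matrix product: entry (i, j) is OR over k of A[i][k] AND B[k][j]."""
    n = len(A)
    return [[int(any(A[i][k] and B[k][j] for k in range(n))) for j in range(n)]
            for i in range(n)]

def is_acyclic(A):
    """Return True when the Boolean n-th power of the n x n matrix A is zero."""
    n = len(A)
    power = A  # currently A^1
    for _ in range(n - 1):
        power = bool_mat_mult(power, A)  # after the loop this is A^n
    return all(entry == 0 for row in power for entry in row)

# A strictly subtriangular matrix (ones only below the diagonal).
subtriangular = [[0, 0, 0],
                 [1, 0, 0],
                 [1, 1, 0]]

# A 3-cycle: its third Boolean power is the identity matrix.
three_cycle = [[0, 1, 0],
               [0, 0, 1],
               [1, 0, 0]]
```

A nonzero entry of $A^n$ records a walk of length $n$ in a relation on $n$ points, and such a walk must revisit a point, which is why a nonzero power signals a cycle.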
A social decision function is a function $L_n^m \to T_n$. Social decision functions are not a major topic in welfare economics. Their primary advantage is that, in contrast to Arrovian social welfare functions, they do exist. Their primary disadvantage is that unless an issue has overwhelming support, the group may not have a preference. This is like the case of two-thirds majority in the example above. Most legislative bills do not have enough support to pass by a two-thirds majority.
Social welfare functions on general domains of partial orders were studied by Barthelemy 2, who proved a version of Arrow's theorem where the conclusion involves an oligarchy instead of solely a dictatorship. He gives other references to related results. Another variation, which is the subject of a large literature, is social choice functions.
DEFINITION A social choice function is a function $C : (2^X - \emptyset) \times L_n^m \to 2^X - \emptyset$, such that for any $S \subseteq X$, $P \in L_n^m$, $C(S) \subseteq S$.
In other words, a social choice function, given any set of voter preferences and any subset $S$ of the alternatives, chooses a subset $C(S) \subseteq S$ of the alternatives which it collectively prefers. Every social welfare function gives rise to a social choice function, which selects the set of alternatives which are maximal within the weak order.
EXAMPLE. The Pareto set for preferences $R(1), \ldots, R(m)$ is $\{y \in X \mid \neg \exists z \in X \ni \forall i \in M,\ zR(i)y \wedge \neg yR(i)z\}$. That is, an alternative is Pareto optimal if there is no other alternative preferred to it by every voter.
A well-known theorem of Plott 24 characterizes choice functions which arise from binary relations. Strategy-proofness is usually studied in terms of some form of social choice function. Every social welfare function gives rise to a social choice function: this function selects those members of $S$ which are optimal in the weak order defined by the social welfare function. The study of special forms of strategy-proofness involves combinatorics, for instance systems of distinct representatives, statistical strategy-proofness, and the study of restricted domains where strategy-proofness holds.
Social choice theory has been exhaustively studied since Arrow's work 1. Social welfare functions have been extended to continuous domains of interest in economics and have been characterized in various ways. For example John Harsanyi 13 characterized a version of Bentham's social utility. In general, to have a nontrivial social welfare function of this type it is necessary to allow some degree of interpersonal comparison of utilities, that is, one must say the value of alternative $a$ for person $i$ is greater than the value of alternative $b$ for person $j$. Harsanyi's social utility function is defined when $X$ is a subspace of $R^k$ and each individual's preferences are given, not by a binary relation, but by a utility function $U_i(x_1, \ldots, x_k)$. It represents the social utility simply as
The leximin social welfare function proposed by John Rawls 26 and others uses the same domain, but its range is a weak order.
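The Pareto set defined earlier in this passage can be computed by brute force for linear preferences. The sketch below is illustrative only: the function name `pareto_set`, the encoding of each voter's linear order as a string from best to worst, and the sample profile are assumptions.

```python
def pareto_set(alternatives, rankings):
    """Return the alternatives y for which no z is preferred to y by every voter.

    rankings holds one string per voter, listing alternatives best to worst,
    so "z is preferred to y" means z appears earlier in the string.
    """
    def prefers(ranking, z, y):
        return ranking.index(z) < ranking.index(y)

    return {y for y in alternatives
            if not any(all(prefers(r, z, y) for r in rankings)
                       for z in alternatives if z != y)}

# Hypothetical profile of three voters over four alternatives.
sample_rankings = ["abcd", "bacd", "acbd"]
sample_pareto = pareto_set("abcd", sample_rankings)
```

In the sample profile every voter ranks $a$ above both $c$ and $d$, so only $a$ and $b$ survive.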
DEFINITION. Let $v, w$ be vectors in $R^n$. In the leximin sense, $v$ is preferred to $w$ if for some $x \in R$, for all $y < x \in R$
|{»K0• b g
AL){a}.
An antiexchange closure is a closure satisfying the antiexchange property. An antiexchange closure satisfying the finite basis property on a set $S$ defines an antimatroid on $S$. An example of an antiexchange closure is convex closure for subsets of real $n$-dimensional Euclidean space $R^n$. If $a$ is in the convex closure of $A \cup \{b\}$, then $a$ is "inside" the convex set $A \cup \{b\}$ and hence $b$ is "outside" the convex set $A \cup \{a\}$, that is, $b$ is not in the convex closure of $A \cup \{a\}$. Thus, convex closure over the reals (or any ordered field) is an antiexchange closure. An axiomatic characterization of the lattice of convex sets in $R^n$ can be found in 3. It might be of interest to remark that there is a characterization of compact convex sets in $R^n$ not using the order relation. If $A$ and $B$ are subsets of $R^n$, then the Minkowski sum $A + B$ is defined by
$A + B = \{a + b : a \in A \text{ and } b \in B\}$.
A subset $C$ is $m$-divisible if there exists a set $C'$ such that $C = C' + C' + \cdots + C'$, where there are $m$ copies of $C'$ in the Minkowski sum. The set $C$ is infinitely divisible if it is $m$-divisible for every positive integer $m$. It is known (50, p. 22) that a compact subset $C$ in $R^n$ is convex if and only if it is infinitely divisible. It would be interesting to explore the idea of infinitely divisible sets over other fields, such as the $p$-adics. Over fields of prime characteristic $p$, one cannot divide by $p$, and a more useful definition is $[p]$-divisibility, where we require $m$-divisibility only for integers $m$ not divisible by $p$. The $[p]$-divisible sets over the finite vector space $[GF(p^e)]^d$ have been characterized 65: they are exactly the sets closed under addition. In particular, over $[GF(p)]^d$, a set is $[p]$-divisible if and only if it is a subspace.
Let $L$ be a lattice of finite rank. If $x$ is an element in $L$, the element $x_*$ is the meet of all the elements covered by $x$. The lattice $L$ is said to be locally lower distributive or meet-distributive if for every element $x$ in $L$, the interval $[x_*, x]$ is a distributive lattice. The following theorem (combining results due to Dilworth 14, Edelman 18, and probably others) describes the lattice structure of the lattice of closed sets of antiexchange closures.
3.4. THEOREM. Let $L$ be a lattice of finite rank. Then the following are equivalent:
(1) $L$ is the lattice of closed sets of an antimatroid.
(2) $L$ is locally lower distributive.
(3) Every element in $L$ has a unique decomposition into a join of join-irreducibles.
Condition (3) is a finite version of the Krein-Milman theorem, which says that every convex set in $R^n$ is the convex closure of its extreme points. It follows from Theorems 3.3 and 3.4 that locally lower distributive lattices are consistent. For more about antiexchange closures and locally lower distributive lattices, see 19 and 52.
We end this section with a discussion of subobjects and morphisms of matroids. If $G$ is a matroid on the set $S$ and $T \subseteq S$, then the submatroid $G|T$ is the matroid on $T$ with the rank function of $G$ restricted to subsets in $T$. The matroid $G|T$ is also described as the matroid obtained by deleting the complement $S \setminus T$ from $G$. Note that the lattice of flats of a submatroid of $G$ is usually not a sublattice of the lattice $L(G)$. If $a$ is an element in $S$, the deletion $G \setminus \{a\}$ is often written simply as $G \setminus a$. If $U \subseteq S$, the contraction $G/U$ of $G$ by $U$ is the matroid on the complement $S \setminus U$ with rank function
$\mathrm{rank}_{G/U}(A) = \mathrm{rank}_G(A \cup U) - \mathrm{rank}_G(U)$
for a subset $A$ in $S \setminus U$. The lattice of flats of $G/U$ is isomorphic to the upper interval $[U, \hat{1}]$ in $L(G)$. As for deletions, the contraction $G/\{a\}$ is often written as $G/a$. The lattice of flats of $G/a$ is the upper interval $[\{a\}, \hat{1}]$. Thus, the simplification of $G/a$ is the matroid defined on the lines of $G$ containing $a$, with lattice of flats $[\{a\}, \hat{1}]$. In particular, contraction by a point $a$ corresponds to the classical geometric operation of projection from the point $a$.
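The rank functions of a submatroid and of a contraction can be written down directly from the definitions above. The following is a minimal sketch under assumptions: the helper names `restrict` and `contract` are hypothetical, and the sample matroid is the 4-point line, whose rank function is $\min(|A|, 2)$ (chosen here purely for illustration).

```python
def restrict(rank, T):
    """Rank function of the submatroid on T: the same rank, used only on subsets of T."""
    T = frozenset(T)
    def restricted_rank(A):
        A = frozenset(A)
        if not A <= T:
            raise ValueError("subset is not contained in T")
        return rank(A)
    return restricted_rank

def contract(rank, U):
    """Rank function of the contraction by U: rank(A | U) - rank(U) for A disjoint from U."""
    U = frozenset(U)
    def contracted_rank(A):
        A = frozenset(A)
        if A & U:
            raise ValueError("subset must be disjoint from U")
        return rank(A | U) - rank(U)
    return contracted_rank

def rank_line(A):
    """Illustrative matroid: four points in general position on a line (rank 2)."""
    return min(len(frozenset(A)), 2)

# Contracting the point a leaves three parallel points of rank 1.
rank_contracted = contract(rank_line, {"a"})

# Deleting d and then contracting a, and the same operations in the other order.
delete_then_contract = contract(restrict(rank_line, {"a", "b", "c"}), {"a"})
contract_then_delete = restrict(contract(rank_line, {"a"}), {"b", "c"})
```

On this example the two orders of operation give the same rank function on $\{b, c\}$.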
Contractions and deletions commute. A matroid $H$ is a minor of $G$ if it can be obtained from $G$ by a sequence of contractions and restrictions. Minors are subobjects when the morphisms are strong maps. There are two other categories of matroids: weak maps or specializations, and comaps. See 40 and 41 for more information on this. If $G$ and $H$ are matroids on disjoint sets $S$ and $T$, then their direct sum $G \oplus H$ is the matroid on the union $S \cup T$ with rank function
$\mathrm{rank}(A) = \mathrm{rank}_G(A \cap S) + \mathrm{rank}_H(A \cap T)$.
Taking the direct sum corresponds to putting the matroids $G$ and $H$ in the most general position possible, that is, in different dimensions. The lattice $L(G \oplus H)$ is the (cartesian) product $L(G) \times L(H)$. A matroid is connected if it is not the direct sum of two proper submatroids. An element $a$ in a matroid $G$ is an isthmus if $\mathrm{rank}(\{a\}) = 1$ and $G$ equals the direct sum $(G \setminus a) \oplus G|\{a\}$.
4 GRAPH THEORY WITHOUT VERTICES
Matroids can also be axiomatized using circuits. Circuits are abstractions of cycles in graphs and minimal linearly dependent sets in vector spaces. Matroid theorists usually allow graphs to have loops and multiple edges. A matroid on the set $S$ can be specified by a collection of non-empty subsets of $S$ called circuits satisfying the following axioms. (C1) If $C_1$ and $C_2$ are circuits, then $C_1 \not\subseteq C_2$ and $C_2$ [...] 3, then the Dowling group matroid $Q_n(A)$ is not a secret-sharing matroid.
8 GREEDY ALGORITHMS, MATROID INTERSECTION, AND MATROID PARTITION
Matroids also occur prominently in the theory of combinatorial optimization. A reason for this is that collections of independent sets of matroids are set systems on which the greedy algorithm always works. Let $S$ be a finite set and $\mathcal{I}$ a collection of subsets of $S$ containing the empty set $\emptyset$. Let $w : S \to R^+$ be a non-negative real-valued "weight" function on $S$. If $J$ is a subset of $S$, its weight $w(J)$ is defined to be the sum of the weights of its elements, that is,
$w(J) = \sum_{a \in J} w(a)$.
The greedy algorithm attempts to find a subset of maximum weight in $\mathcal{I}$ in the following way: Start with $I = \emptyset$. Suppose that $I$ has been chosen. Amongst all the elements not in $I$, choose an element $a$ such that $I \cup \{a\}$ is in the collection $\mathcal{I}$ and $w(a)$ is maximum. Replace $I$ by $I \cup \{a\}$. When $I$ is a maximal subset of $\mathcal{I}$, stop and output $I$ as the subset having maximum weight.
Edmonds 22 discovered that the independent sets of a finite matroid can be characterized using the greedy algorithm. Specifically, he showed that a collection $\mathcal{I}$ of subsets of a finite set $S$ is the collection of independent sets of a matroid on $S$ if and only if the following axioms hold.
(Gr1) $\emptyset \in \mathcal{I}$.
(Gr2) If $I \in \mathcal{I}$ and $J \subseteq I$, then $J \in \mathcal{I}$.
(Gr3) For every non-negative real-valued weight function on $S$, the greedy algorithm outputs a subset in $\mathcal{I}$ having maximum weight.
The greedy algorithm applied to the cycle matroid of a graph is Kruskal's algorithm for finding the maximum-weight spanning tree in a graph. A history of the greedy algorithm for trees in graphs can be found in 28. There is a similar axiomatization in which maximum-weight subsets are replaced by lexicographically-greatest subsets.
The axiomatization using the greedy algorithm led to an interesting generalization of a matroid. A greedoid on a finite set $S$ is defined by a collection $\mathcal{I}$ of subsets of $S$ called feasible sets satisfying (Gr1), (Gr3), and the following weakening of (Gr2).
Accessibility. If $I$ is a non-empty feasible set, then there exists an element $a \in I$ such that $I \setminus \{a\}$ is feasible.
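The greedy algorithm described above only needs an independence test. The sketch below is one possible rendering, not the text's own code: `greedy` takes a hypothetical oracle `independent`, and `is_forest` is an assumed oracle for the cycle matroid of a graph (edge sets without cycles), implemented with a small union-find. The weighted graph is made up for illustration.

```python
def is_forest(edges):
    """Independence oracle for the cycle matroid: True if the edges contain no cycle."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge would close a cycle (or is a loop)
        parent[root_u] = root_v
    return True

def greedy(ground, weight, independent):
    """Repeatedly add a heaviest element that keeps the chosen set independent."""
    chosen = []
    remaining = set(ground)
    while True:
        candidates = [a for a in remaining if independent(chosen + [a])]
        if not candidates:
            return chosen  # chosen is now a maximal independent set
        best = max(candidates, key=weight)
        chosen.append(best)
        remaining.remove(best)

# Hypothetical weighted graph on vertices a, b, c, d.
edge_weights = {("a", "b"): 4, ("b", "c"): 3, ("a", "c"): 2, ("c", "d"): 1}
tree = greedy(edge_weights, edge_weights.get, is_forest)
tree_weight = sum(edge_weights[e] for e in tree)
```

With `is_forest` as the oracle this is Kruskal's procedure for a maximum-weight spanning tree; swapping in a different oracle runs the same loop on another set system, with no guarantee of optimality unless (Gr1) to (Gr3) hold.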
Introductions to greedoids can be found in [B7] and 4. The point of view of combinatorial optimization also led to the study of polytopes associated with matroids. The classic paper in this area is 21. Perhaps the deepest results in matroid theory coming out of combinatorial optimization are the matroid partition theorem and the matroid intersection theorem. Both theorems are due to Edmonds 20, 21. The matroid partition theorem is similar to the marriage theorem in matching theory: both assert that an "obviously" necessary condition is also sufficient.
8.1. MATROID PARTITION THEOREM. For $1 \le i \le m$, let $G_i$ be a matroid with rank function $\mathrm{rank}_i$ on the finite set $S$. Then there exist subsets $S_1, S_2, \ldots, S_m$ such that $S_i$ is independent in $G_i$ and $S_1 \cup S_2 \cup \cdots \cup S_m = S$ if and only if for every subset $A \subseteq S$,
$\sum_{i=1}^{m} \mathrm{rank}_i(A) \ge |A|$.
In particular, if $G$ is a matroid on $S$, then $S$ can be partitioned into $m$ independent sets if and only if for every subset $A \subseteq S$, $m\,\mathrm{rank}(A) \ge |A|$. Applying the matroid partition theorem to the matroid $G$ and the dual of the matroid $H$, we obtain the matroid intersection theorem.
8.2. MATROID INTERSECTION THEOREM. Let $G$ and $H$ be matroids with rank functions $\mathrm{rank}_G$ and $\mathrm{rank}_H$ on the same finite set $S$. Then the maximum size of a subset independent in both $G$ and $H$ equals
$\min\{\mathrm{rank}_G(A) + \mathrm{rank}_H(B) : A \cup B = S\}$.
There are polynomial-time algorithms to find partitions into independent sets and maximum-sized common independent sets. These algorithms also give the clearest proofs of Theorems 8.1 and 8.2. For more about matroid algorithms (in particular, the matroid matching problem), see [B9], [S1], [S11], and 47.
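The min-max formula of the matroid intersection theorem can be checked by exhaustive search on a tiny instance. The sketch below is an illustration, not one of the polynomial-time algorithms mentioned above. The example is an assumption: the ground set is the edge set of a small bipartite graph, $G$ allows at most one edge per left vertex, and $H$ allows at most one edge per right vertex, so common independent sets are matchings. All names are hypothetical.

```python
from itertools import combinations

def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

# Hypothetical ground set: edges (left vertex, right vertex) of a bipartite graph.
S = frozenset({(0, "x"), (0, "y"), (1, "x"), (2, "x")})

def rank_G(A):
    """Rank in G: number of distinct left endpoints used by A."""
    return len({left for left, _ in A})

def rank_H(A):
    """Rank in H: number of distinct right endpoints used by A."""
    return len({right for _, right in A})

def max_common_independent(S, rank_G, rank_H):
    """Largest A with rank_G(A) = rank_H(A) = |A|, found by brute force."""
    return max(len(A) for A in subsets(S)
               if rank_G(A) == len(A) and rank_H(A) == len(A))

def min_rank_cover(S, rank_G, rank_H):
    """Minimum of rank_G(A) + rank_H(B) over covers; since rank is monotone,
    it is enough to take B to be the complement of A."""
    return min(rank_G(A) + rank_H(S - A) for A in subsets(S))

largest = max_common_independent(S, rank_G, rank_H)
smallest = min_rank_cover(S, rank_G, rank_H)
```

For these two matroids the theorem specializes to the König-type statement that the maximum matching size equals a minimum vertex cover.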
9. MATROID UNIONS AND TRANSVERSAL MATROIDS
The idea of partitioning into independent sets of different matroids leads to the following construction. If $G_1$ and $G_2$ are matroids on the same set $S$, then the matroid union $G_1 \vee G_2$ is the matroid whose independent sets are sets of the form $I_1 \cup I_2$, where $I_1$ is independent in $G_1$ and $I_2$ is independent in $G_2$. There is no known elementary proof that $G_1 \vee G_2$, as defined, is actually a matroid. See [B11, p. 403] for a proof. A comprehensive survey of matroid unions can be found in [B11, Chapter 12].
Roughly speaking, matroid union corresponds to putting one representation matrix on top of another. To make this intuition precise, we need a way to remove "accidental" linear dependences among the columns of a matrix. Let $G$ be an $F$-representable matroid on the set $S$ and let $M$ be a representation matrix of $G$ over $F$. Let $x_a$, $a \in S$, be indeterminates, one for each element of $S$, thought of as transcendental elements over an extension of the field $F$. The generic diagonal matrix $D$ on the set $S$ is the diagonal matrix with rows and columns indexed by $S$ whose $aa$-entry on the diagonal is $x_a$. Then the product $MD$ is the matrix obtained from $M$ by multiplying the column indexed by $a$ by the indeterminate $x_a$. In particular, right multiplication by $D$ makes the columns of $M$ "algebraically independent".
9.1. THEOREM. Let $G_1$ and $G_2$ be $F$-representable matroids on the same set $S$ with representation matrices $M_1$ and $M_2$. Then the matrix $M$ defined by
$M = \begin{pmatrix} M_1 \\ M_2 D \end{pmatrix}$
is a representation matrix for the matroid union $G_1 \vee G_2$ over a transcendental extension of $F$.
The proof uses the multiple Laplace expansion for a single matrix. If $M$ is a matrix, $I$ is a subset of row indices, and $J$ is a subset of column indices, then $M[I|J]$ is the $|I| \times |J|$ submatrix of $M$ obtained by restricting $M$ to the rows and columns indexed by $I$ and $J$ (keeping the same order as in $M$).
9.2. MULTIPLE LAPLACE EXPANSION. Let $N$ be an $l \times l$ square matrix, with rows and columns indexed by the integers $1, 2, \ldots, l$. Then
$\det N = \sum (-1)^{i_1 + i_2 + \cdots + i_k + k(k+1)/2} \det N[1, 2, \ldots, k \mid i_1, i_2, \ldots, i_k] \times \det N[k+1, k+2, \ldots, l \mid j_{k+1}, j_{k+2}, \ldots, j_l]$,
where the sum ranges over all $k$-element subsets $\{i_1, i_2, \ldots, i_k\}$ of $\{1, 2, \ldots, l\}$ and $\{j_{k+1}, j_{k+2}, \ldots, j_l\}$ is the complement of $\{i_1, i_2, \ldots, i_k\}$ in $\{1, 2, \ldots, l\}$.
Let $M_2'$ be any representation matrix for $G_2$ over any field containing the field $F$, let $M$ be the matrix
$M = \begin{pmatrix} M_1 \\ M_2' \end{pmatrix}$,
and let $H$ be the linear matroid defined by the matrix $M$. We shall prove that every $H$-independent set is the union of a $G_1$-independent set and a $G_2$-independent set. Suppose $I$ is $H$-independent. Then there exists an $|I|$-element subset $J$ of rows such that the square submatrix $M[J|I]$ with rows indexed by $J$ and columns indexed by $I$ is non-singular. Let $J_1$ be the subset of rows in $J$ from the matrix $M_1$ and let $J_2$ be the subset of rows in $J$ from $M_2'$. Expanding the matrix $M[J|I]$ according to the Laplace expansion and observing that $\det M[J|I] \ne 0$, we conclude that one of the summands in the expansion, say, $\pm \det M_1[J_1|I_1] \det M_2'[J_2|I_2]$, is non-zero. Because both subdeterminants are non-zero, $I_1$ is independent in $G_1$ and $I_2$ is independent in $G_2$. Hence, $I$ is the union of a $G_1$-independent set and a $G_2$-independent set.
Conversely, suppose that $I$ is independent in $G_1 \vee G_2$. Then $I = I_1 \cup I_2$, where $I_1$ is $G_1$-independent and $I_2$ is $G_2$-independent. Removing elements from $I_1$ or $I_2$ if necessary, we may suppose that $I_1$ and $I_2$ are disjoint. Choose a subset $J_1$ of row indices so that $M_1[J_1|I_1]$ is non-singular and a subset $J_2$ so that $M_2[J_2|I_2]$ is non-singular. In the Laplace expansion of $\det M[J_1 \cup J_2 \mid I_1 \cup I_2]$, the term
$\pm \det M_1[J_1|I_1] \det M_2[J_2|I_2] \left( \prod_{a \in I_2} x_a \right)$
is non-zero. The monomial $\prod x_a$ comes from the indeterminates in the generic matrix $D$. Because every summand in the Laplace expansion has a different monomial and the indeterminates are algebraically independent, there are no algebraic relations between the summands. Hence, $\det M[J_1 \cup J_2 \mid I_1 \cup I_2]$ is non-zero and the columns indexed by $I_1 \cup I_2$ are linearly independent. This completes the proof of Theorem 9.1.
Important examples of matroid unions are transversal matroids. Let $R$ be a relation between the set $S$ and the set $\{1, 2, \ldots, m\}$. A subset $I$ in $S$ is
said to have a partial matching if there is an injection $\iota : I \to \{1, 2, \ldots, m\}$ such that for every element $a$ in $I$, $a$ is related to $\iota(a)$. The sets with partial matchings form the independent sets of a matroid on $S$ called the transversal matroid $T(R)$ of the relation $R$. Transversal matroids are matroid unions of rank-1 matroids. If $G$ is a rank-1 matroid on the set $S$, then $S$ is the disjoint union of $\overline{\emptyset}$, the set of loops, and $S \setminus \overline{\emptyset}$, the set of elements having rank 1. Then the transversal matroid $T(R)$ is the matroid union $G_1 \vee G_2 \vee \cdots \vee G_m$, where $G_i$ is the rank-1 matroid whose set of rank-1 elements is $R^{-1}(i)$, the set $\{a : aRi\}$ of elements in $S$ related to the element $i$.
Let $F$ be a field and let $x_{i,a}$ be indeterminates, one for each pair $(i, a)$ such that $a$ is related to $i$ in the relation $R$. Then, using Theorem 9.1 repeatedly, we conclude that the transversal matroid $T(R)$ can be represented over an extension of $F$ by the $m \times |S|$ matrix whose $i,a$-entry is $x_{i,a}$ if $aRi$ and 0 otherwise. This matrix is called the free matrix of the relation. Frobenius 25 was the first to study free matrices. For a historical study, see 59. The following research area, proposed by Rota, may be of interest to the philosophically-minded.
9.3. RESEARCH AREA. Develop matching theory using free matrices, linear algebra, and the theory of determinants.
Some work in this direction can be found in 16, 31, 38. Many theorems in matching theory are special cases of matroid theorems. As one might expect, the marriage theorem is a special case of the matroid partition theorem. The matroid intersection theorem yields a necessary and sufficient condition for the existence of a common transversal for two relations and Rado's theorem for independent transversals. For more on matching theory, see 31 and 48.
There are many unsolved problems in the area of matroid unions. The best known one, posed by Welsh, is to characterize the union-irreducible matroids, that is, those matroids which cannot be expressed as the matroid union of two matroids, both having strictly smaller rank. See [B11, p. 474].
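Independence in a transversal matroid, as defined in this section, amounts to finding a partial matching. The sketch below tests this with the standard augmenting-path method for bipartite matching, which is a technique substituted here for illustration and not taken from the text. The function name `has_partial_matching`, the dictionary encoding of the relation, and the sample relation are assumptions.

```python
def has_partial_matching(I, related):
    """Return True if the elements of I can be matched injectively to indices.

    related[a] is the set of indices i with a R i; elements missing from the
    dictionary are treated as related to nothing (loops of the matroid).
    """
    match = {}  # index i -> element currently matched to i

    def augment(a, seen):
        # Try to match a, re-routing earlier elements along an augmenting path.
        for i in related.get(a, ()):
            if i in seen:
                continue
            seen.add(i)
            if i not in match or augment(match[i], seen):
                match[i] = a
                return True
        return False

    return all(augment(a, set()) for a in I)

# Hypothetical relation between S = {a, b, c, d} and the indices {1, 2}.
sample_relation = {"a": {1}, "b": {1, 2}, "c": {2}, "d": set()}
```

For the sample relation, $\{a, b\}$ and $\{a, c\}$ are independent, no 3-element set is, and $d$ is a loop, so the transversal matroid has rank 2.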
9 MATRIX MULTIPLICATION AND THE CAUCHY-BINET IDENTITY
Yet another useful determinantal identity in matroid theory is the Cauchy-Binet identity, which is a generalization of the homomorphism or multiplicative property $\det(MN) = \det M \det N$ of determinants.
10.1. THE CAUCHY-BINET IDENTITY. Let $M$ be an $n \times s$ matrix and $N$ be an $s \times n$ matrix with the columns of $M$ and the rows of $N$ labelled by the same set $K$. Then,
$\det(MN) = \sum_{I} \det M[I] \det N[I]$,
where the sum is over all $n$-element subsets $I$ of $K$, $M[I]$ is the $n \times n$ matrix obtained by restricting $M$ to the columns indexed by $I$, and $N[I]$ is the $n \times n$ matrix obtained by restricting $N$ to the rows indexed by $I$. For example,
$\det \left( \begin{pmatrix} 1 & 4 & 0 \\ 2 & 3 & -1 \end{pmatrix} \begin{pmatrix} 1 & 3 \\ 0 & 7 \\ 6 & 9 \end{pmatrix} \right)$
equals the sum
$\begin{vmatrix} 1 & 4 \\ 2 & 3 \end{vmatrix} \begin{vmatrix} 1 & 3 \\ 0 & 7 \end{vmatrix} + \begin{vmatrix} 1 & 0 \\ 2 & -1 \end{vmatrix} \begin{vmatrix} 1 & 3 \\ 6 & 9 \end{vmatrix} + \begin{vmatrix} 4 & 0 \\ 3 & -1 \end{vmatrix} \begin{vmatrix} 0 & 7 \\ 6 & 9 \end{vmatrix}$.
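The identity can be checked numerically on a small case. The sketch below uses a $2 \times 3$ matrix and a $3 \times 2$ matrix whose entries are our reading of the worked example above, which is badly damaged in the source, so they should be treated as illustrative values. The helpers `det` and `matmul` are hypothetical names, written in pure Python for small integer matrices.

```python
from itertools import combinations

def det(A):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not A:
        return 1  # determinant of the empty 0 x 0 matrix
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def matmul(A, B):
    """Ordinary matrix product of lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

M = [[1, 4, 0],
     [2, 3, -1]]
N = [[1, 3],
     [0, 7],
     [6, 9]]

lhs = det(matmul(M, N))

# One summand per 2-element subset I of the common index set {0, 1, 2}:
# columns of M indexed by I times rows of N indexed by I.
summands = [det([[row[i] for i in I] for row in M]) * det([N[i] for i in I])
            for I in combinations(range(3), 2)]
rhs = sum(summands)
```

Both sides evaluate to 142, with the three summands equal to $-35$, $9$, and $168$.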
The Cauchy-Binet identity sheds some light on a special case of matroid intersection, the common basis problem: given two matroids $M$ and $N$ having the same rank, does there exist a subset which is a basis in both $M$ and $N$? When the matroids $M$ and $N$ are representable over the same field, the common basis problem is equivalent to determining whether a generic matrix product is non-singular.
10.2. THEOREM. Let $M$ and $N$ be rank-$n$ $n \times s$ matrices with columns indexed by the same set $S$. Then the linear matroids on $S$ defined by $M$ and $N$ have a common basis if and only if the matrix $MDN^t$ is non-singular. Here, $D$ is the generic diagonal matrix on $S$ defined in Section 9.
Theorem 10.2 follows from the identity
(10.1)   $\det(MDN^t) = \sum_{B} \det M[B] \det N[B] \left( \prod_{a \in B} x_a \right)$.
Since the indeterminates $x_a$ are algebraically independent, $\det(MDN^t) \ne 0$ if and only if one of the summands on the right-hand side is non-zero. The set $B$ indexing that term is a common basis of the linear matroids defined by $M$ and $N$.
We can also use the Cauchy-Binet identity to define bimatroid multiplication. Let $B$ be a bimatroid between $S$ and $E$ and $C$ be a bimatroid between $E$ and $T$. Then the product bimatroid $C \circ B$ between $S$ and $T$ is defined by: $(X, U)$ is a non-singular minor in $C \circ B$ if and only if there exists $A \subseteq E$ such that $(X, A)$ is a non-singular minor in $B$ and $(A, U)$ is a non-singular minor in $C$. It is not easy to prove that the bimatroid product is in fact a bimatroid. In fact, the known proof uses the matroid intersection theorem. Because many constructions (such as strong maps, matroid unions, and matroid induction) can be modelled by matrix multiplication, bimatroid multiplication is a unifying idea in the study of matroid constructions. A product of oriented bimatroids can be defined, at the cost of introducing a lexicographic order on the subsets of $C$. (A similar idea is used in 44 to define oriented matroid-union.) A more natural product for oriented bimatroids (if it exists) would be very useful in combinatorial differential topology (see 1 and 2).
10 BASIS GENERATING FUNCTIONS AND THE MATRIX-TREE THEOREM
As in the previous two sections, let $\{x_a, a \in S\}$ be a set of indeterminates. The basis generating function $B(G; x)$ of the matroid $G$ on the set $S$ is the polynomial defined by
$B(G; x) = \sum_{B} \prod_{a \in B} x_a$,
where the sum ranges over all bases $B$ of $G$. The basis generating function encodes in an algebraic form the description of the matroid $G$ in terms of its bases. The basis generating function satisfies the following recursions.
If $a$ is neither a loop nor an isthmus, $B(G; x) = B(G \setminus a; x) + x_a B(G/a; x)$.
If $a$ is a loop, $B(G; x) = B(G \setminus a; x)$.
If $a$ is an isthmus, $B(G; x) = x_a B(G \setminus a; x)$.
For certain matroids, the basis generating function can be expressed as a determinant. Let $M$ be an $n \times s$ matrix representing the rank-$n$ matroid $G$ with $s$ elements. Then, by equation (10.1),
$\det(MDM^t) = \sum_{B} (\det M[B])^2 \left( \prod_{a \in B} x_a \right)$.
Setting all the variables $x_a$ to 1, we have
$\det(MM^t) = \sum_{B} (\det M[B])^2$.
Hence, we have the following theorem.
11.1. THEOREM. Let $M$ be an $n \times s$ representation matrix for a rank-$n$ matroid $G$. If all the $n \times n$ square subdeterminants of $M$ have values $-1$, 0 or 1, then
$B(G; x) = \det(MDM^t)$
and the number of bases in $G$ equals $\det(MM^t)$.
A class of matroids to which the theorem can be applied is the class of regular matroids. A regular matroid is a matroid which is representable over all fields. It can be shown that a regular matroid $G$ can be represented by a totally unimodular matrix, that is, a matrix all of whose subdeterminants are $-1$, 0, or 1. The decomposition theorem of Seymour 61 says that every regular matroid can be built using simple operations from graphic matroids, cographic matroids, and a "sporadic" 10-point rank-4 matroid $R_{10}$. The oriented vertex-edge incidence matrix $M$ of a connected graph $\Gamma$ with one row deleted is a totally unimodular matrix representing the cycle matroid $M(\Gamma)$. Applying Theorem 11.1 to this case, one obtains the classical matrix-tree theorem of graph theory. Seymour's decomposition theorem indicates that Theorem 11.1 for regular matroids is almost the same as the matrix-tree theorem for graphs.
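The matrix-tree case of Theorem 11.1 can be checked on a small graph. The sketch below is an illustration under assumptions: the graph is the complete graph on 4 vertices (chosen here, not taken from the text), each edge is oriented from its smaller to its larger endpoint, the row of vertex 0 is deleted, and `det` is a hypothetical pure-Python helper.

```python
from itertools import combinations

def det(A):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not A:
        return 1
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

vertices = range(4)
edges = list(combinations(vertices, 2))  # the 6 edges of the complete graph

# Oriented incidence matrix with the row of vertex 0 deleted:
# +1 at the tail of an edge, -1 at its head, 0 elsewhere.
M = [[1 if v == tail else -1 if v == head else 0 for tail, head in edges]
     for v in vertices if v != 0]

# M times its transpose (the reduced Laplacian of the graph).
MMt = [[sum(a * b for a, b in zip(row1, row2)) for row2 in M] for row1 in M]

# All 3 x 3 subdeterminants of M, one per 3-element set of edges.
minors = [det([[row[j] for j in B] for row in M])
          for B in combinations(range(len(edges)), 3)]

bases_by_determinant = det(MMt)
bases_by_enumeration = sum(1 for d in minors if d != 0)
```

Both counts equal 16, the number of spanning trees of the complete graph on 4 vertices, and every maximal minor lies in $\{-1, 0, 1\}$ as the hypothesis of the theorem requires.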
11 GENERIC RANK-GENERATING POLYNOMIALS
The basis generating function satisfies all except one of the contraction-and-deletion relations. Let $f$ be a function defined on matroids on finite sets. The function $f$ satisfies the contraction-and-deletion relations if
(CD1) $f(M) = f(N)$ if $M$ is isomorphic to $N$, and
(CD2) for every matroid $G$ and every element $a$ in $G$, $f(G) = f(G \setminus a) + f(G/a)$ if $a$ is neither an isthmus nor a loop, and $f(G) = f(G|\{a\}) f(G \setminus a)$ otherwise.
The rank-generating polynomial $R(G; x, \lambda)$ is the polynomial in the variables $x$ and $\lambda$ defined by
$R(G; x, \lambda) = \sum_{A \subseteq S} \lambda^{\mathrm{rank}(S) - \mathrm{rank}(A)} x^{|A| - \mathrm{rank}(A)}$,
where the sum ranges over all subsets $A$ of $S$. The exponent of the variable $\lambda$ is the corank of the set $A$. Tutte 66 (in the special case of graphs) and Brylawski 8 showed that every function on matroids satisfying the contraction-and-deletion relations is a specialization of the rank-generating polynomial. Setting $x = -1$ in the rank-generating polynomial, we obtain the characteristic polynomial $\chi(G; \lambda)$ of $G$. Indeed, if $G$ has no loops, then
$$\chi(G;\lambda) = \sum_{X:\,X \in L(G)} \mu(\hat{0},X)\,\lambda^{\mathrm{rank}(S)-\mathrm{rank}(X)},$$

where $\mu$ is the Möbius function [56] of the lattice $L(G)$; if $G$ has loops, then $\chi(G;\lambda) = 0$. Uninverting the Möbius inversion, we obtain the following identity which will be used later:

$$\sum_{X:\,A \subseteq X} \chi(G/X;\lambda) = \lambda^{\mathrm{rank}(S)-\mathrm{rank}(A)}. \tag{12.1}$$
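The contraction-and-deletion relations (CD1) and (CD2) stated earlier in this section can be tried out on a concrete function. A minimal sketch, assuming the standard fact (not stated in the text) that the number of bases satisfies these relations, with $f(G|\{a\}) = 1$ for a loop or an isthmus $a$: a matroid is represented here by a ground set together with a rank function on frozensets, and the names `uniform_rank`, `count_bases_directly` and `count_bases_by_cd` are hypothetical.

```python
from itertools import combinations

def uniform_rank(k):
    """Rank function of a uniform matroid of rank k (illustrative helper)."""
    return lambda subset: min(len(subset), k)

def count_bases_directly(ground, rank):
    """Count the subsets of size rank(S) that have full rank."""
    ground = frozenset(ground)
    r = rank(ground)
    return sum(1 for b in combinations(sorted(ground), r) if rank(frozenset(b)) == r)

def count_bases_by_cd(ground, rank):
    """Count bases using only the relations (CD2)."""
    ground = frozenset(ground)
    if not ground:
        return 1  # the empty matroid has the empty set as its only basis
    a = min(ground)
    rest = ground - {a}
    is_loop = rank(frozenset({a})) == 0
    is_isthmus = rank(rest) == rank(ground) - 1
    if is_loop or is_isthmus:
        # f(G) = f(G|{a}) f(G\a), and a one-element matroid has exactly one basis
        return 1 * count_bases_by_cd(rest, rank)

    def contracted_rank(subset):
        # rank function of G/a on the remaining elements
        return rank(subset | {a}) - rank(frozenset({a}))

    # f(G) = f(G\a) + f(G/a); deletion keeps the same rank function on `rest`
    return count_bases_by_cd(rest, rank) + count_bases_by_cd(rest, contracted_rank)

def mixed_rank(subset):
    """Element 0 is a loop, 1 is an isthmus, 2 and 3 are parallel."""
    return (1 if 1 in subset else 0) + (1 if subset & {2, 3} else 0)

print(count_bases_by_cd({0, 1, 2}, uniform_rank(2)))  # 3
print(count_bases_by_cd({0, 1, 2, 3}, mixed_rank))    # 2
```

The recursion never inspects bases directly; agreement with the direct count is what (CD2) predicts for this particular function.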
The rank-generating polynomial and its close relative, the Tutte polynomial, occur all over mathematics. A comprehensive survey to about 1991 can be found in [9]. Some new applications can be found in [45, 69]. Tutte polynomials for graphs with weights on their edges have been studied [64]. (See also [5] and the references therein.) Many of the theorems for weighted graphs generalize in a straightforward way to weighted matroids.

Let $G$ be a matroid on the set $S$, let $\{x_a : a \in S\}$ be a set of variables, and let $\lambda$ be a new variable. The Tugger polynomial $R(G;x,\lambda)$ is the polynomial in the variables $x_a$ and $\lambda$ defined by

$$R(G;x,\lambda) = \sum_{A \subseteq S} \lambda^{\mathrm{rank}(S)-\mathrm{rank}(A)} \Bigl(\prod_{a \in A} x_a\Bigr),$$
where the sum ranges over all subsets $A$ of $S$. For example, the Tugger polynomial of $U_{2,3}$, the 3-point line with points $a$, $b$, and $c$, is

$$\lambda^{2} + \lambda(x_a + x_b + x_c) + x_ax_b + x_ax_c + x_bx_c + x_ax_bx_c.$$
The Tugger polynomial is a generic rank-generating function of the matroid $G$. It contains a complete description of the rank function of $G$ in an algebraic form. The Tugger polynomial satisfies an identity generalizing an identity of Tutte [68]. Let $R(G;x-1,\lambda)$ be the polynomial obtained from $R(G;x,\lambda)$ by replacing every variable $x_a$ by $x_a - 1$.

12.1. THEOREM.

$$R(G;x-1,\lambda) = \sum_{X:\,X \in L(G)} \chi(G/X;\lambda) \Bigl(\prod_{a \in X} x_a\Bigr),$$

where the sum ranges over all the closed sets of $G$.

To prove Theorem 12.1, we need the following elementary identity:

$$\prod_{a \in A} x_a = \sum_{B:\,B \subseteq A} \Bigl(\prod_{b \in B} (x_b - 1)\Bigr). \tag{12.2}$$
For example, $x_ax_b = (x_a - 1)(x_b - 1) + (x_a - 1) + (x_b - 1) + 1$. Consider the right-hand side of the identity in the theorem. Note that since $\chi(G/X;\lambda) = 0$ when $X$ is not closed, we can take the sum to range over all subsets $X$ in $S$. Thus, using equations (12.1) and (12.2) and changing the order of summation, we have

$$\begin{aligned}
\sum_{X:\,X \subseteq S} \chi(G/X;\lambda) \Bigl(\prod_{a \in X} x_a\Bigr)
&= \sum_{X:\,X \subseteq S} \chi(G/X;\lambda) \Bigl(\sum_{A \subseteq X} \Bigl(\prod_{a \in A} (x_a - 1)\Bigr)\Bigr) \\
&= \sum_{A:\,A \subseteq S} \Bigl(\prod_{a \in A} (x_a - 1)\Bigr) \Bigl(\sum_{X:\,A \subseteq X} \chi(G/X;\lambda)\Bigr) \\
&= \sum_{A:\,A \subseteq S} \Bigl(\prod_{a \in A} (x_a - 1)\Bigr) \lambda^{\mathrm{rank}(S)-\mathrm{rank}(A)} \\
&= R(G;x-1,\lambda).
\end{aligned}$$

This completes the proof of Theorem 12.1.

The striking thing about Theorem 12.1 is that it allows us to obtain the family of closed sets of $G$ from the rank function of $G$ using a simple (albeit time-consuming) algebraic operation.

12.2. PROBLEM. Find algebraic relations between other descriptions of matroids.
Such relations would allow matroid computations to be done using computer algebra.
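As a small illustration of such a computation, Theorem 12.1 and identity (12.1) can be checked numerically on the 3-point line. The sketch below is not a computer algebra implementation: it evaluates both sides at integer values of the variables, and it assumes the standard subset expansion $\chi(G/X;\lambda) = \sum_{B \subseteq S \setminus X} (-1)^{|B|} \lambda^{\mathrm{rank}(S)-\mathrm{rank}(X \cup B)}$, which is not stated in the text. All function names are hypothetical.

```python
from itertools import combinations
from math import prod

def subsets(ground):
    """Yield every subset of the ground set as a frozenset."""
    items = sorted(ground)
    for size in range(len(items) + 1):
        for chosen in combinations(items, size):
            yield frozenset(chosen)

def tugger(ground, rank, x, lam):
    """Evaluate R(G; x, lam) from its defining sum over all subsets A."""
    ground = frozenset(ground)
    return sum(lam ** (rank(ground) - rank(a_set)) * prod(x[a] for a in a_set)
               for a_set in subsets(ground))

def chi_contraction(ground, rank, x_set, lam):
    """Evaluate chi(G/X; lam) by the subset expansion assumed in the lead-in."""
    ground = frozenset(ground)
    return sum((-1) ** len(b_set) * lam ** (rank(ground) - rank(x_set | b_set))
               for b_set in subsets(ground - x_set))

def theorem_rhs(ground, rank, x, lam):
    """Right-hand side of Theorem 12.1, summed over all subsets X.

    Non-closed X contribute 0, so this equals the sum over closed sets."""
    return sum(chi_contraction(ground, rank, x_set, lam) * prod(x[a] for a in x_set)
               for x_set in subsets(ground))

def rank_u23(subset):
    """Rank function of the 3-point line U_{2,3}."""
    return min(len(subset), 2)

ground = {"a", "b", "c"}
x = {"a": 2, "b": 3, "c": 5}
lam = 7
shifted = {a: value - 1 for a, value in x.items()}  # replace x_a by x_a - 1

lhs = tugger(ground, rank_u23, shifted, lam)
rhs = theorem_rhs(ground, rank_u23, x, lam)
print(lhs, rhs)  # 120 120
```

Agreement at a few integer points is evidence, not a proof; a symbolic check would need a computer algebra system.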
ACKNOWLEDGEMENT

I would like to thank Joseph Bonin for his comments on several drafts of this paper. I would also like to thank the National Security Agency for supporting my research under Grant MDA904-98-1-0025.

Books on matroid theory and related areas

B1. G. Birkhoff, Lattice theory, 3rd edition, Amer. Math. Soc., Providence, Rhode Island, 1967.
B2. J. E. Bonin, J. G. Oxley, and B. Servatius, eds., Matroid theory, Amer. Math. Soc., Providence, Rhode Island, 1996.
B3. A. Björner, M. Las Vergnas, B. Sturmfels, N. L. White, and G. M. Ziegler, Oriented matroids, Cambridge Univ. Press, Cambridge, 1993.
B4. H. H. Crapo and G.-C. Rota, On the foundations of combinatorial theory: Combinatorial geometries, Preliminary edition, M. I. T. Press, Cambridge, Massachusetts, 1970.
B5. P. Crawley and R. P. Dilworth, Algebraic theory of lattices, Prentice-Hall, Englewood Cliffs, New Jersey, 1973.
B6. G. Grätzer, General lattice theory, 2nd edition, Birkhäuser, Basel, 1998.
B7. B. Korte, L. Lovász, and R. Schrader, Greedoids, Springer-Verlag, Berlin and New York, 1991.
B8. J. P. S. Kung, ed., A sourcebook in matroid theory, Birkhäuser, Boston and Basel, 1986.
B9. E. L. Lawler, Combinatorial optimization: Networks and matroids, Holt, Rinehart and Winston, New York, 1976.
B10. P. Orlik and H. Terao, Arrangements of hyperplanes, Springer-Verlag, Berlin and New York, 1992.
B11. J. G. Oxley, Matroid theory, Oxford Univ. Press, Oxford, 1992.
B12. B. Polster, A geometrical picture book, Springer-Verlag, Berlin and New York, 1998.
B13. M. Stern, Semimodular lattices, Cambridge Univ. Press, Cambridge, 1999.
B14. W. T. Tutte, Graph theory as I have known it, Oxford Univ. Press, Oxford, 1998.
B15. D. J. A. Welsh, Matroid theory, Academic Press, London and New York, 1976.
B16. N. L. White, ed., Theory of matroids, Cambridge Univ. Press, Cambridge, 1986.
B17. N. L. White, ed., Combinatorial geometries, Cambridge Univ. Press, Cambridge, 1987.
B18. N. L. White, ed., Matroid applications, Cambridge Univ. Press, Cambridge, 1992.
Introductory or survey papers on matroid theory

S1. R. E. Bixby and W. H. Cunningham, Matroid optimization and algorithms, in Handbook of combinatorics, R. L. Graham, M. Grötschel, and L. Lovász, eds., Elsevier North-Holland, Amsterdam, 1995, pp. 551-609.
S2. A. Delandtsheer, Dimensional linear spaces, in Handbook of incidence geometry, F. Buekenhout, ed., Elsevier North-Holland, Amsterdam, 1995, pp. 193-294.
S3. A. W. Ingleton, Representations of matroids, in Combinatorial mathematics and its applications, D. J. A. Welsh, ed., Academic Press, London and New York, 1971, pp. 149-169.
S4. J. P. S. Kung, Extremal matroid theory, in Graph structure theory, N. Robertson and P. D. Seymour, eds., Amer. Math. Soc., Providence, RI, 1992, pp. 21-62.
S5. J. P. S. Kung, The geometric approach to matroid theory, in Gian-Carlo Rota on combinatorics, J. P. S. Kung, ed., Birkhäuser, Boston, 1995, pp. 604-622.
S6. J. P. S. Kung, Matroids, in Handbook of algebra, Volume 1, M. Hazewinkel, ed., Elsevier North-Holland, Amsterdam, 1996, pp. 157-184.
S7. J. P. S. Kung, Critical problems, in [B2], pp. 1-127.
S8. J. G. Oxley, Matroid structure and connectivity, in [B2], pp. 129-170.
S9. J. G. Oxley, Matroids, in Graph connections, L. W. Beineke and R. J. Wilson, eds., Oxford Univ. Press, Oxford, 1987, pp. 110-115.
S10. P. D. Seymour, Matroid minors, in Handbook of combinatorics, R. L. Graham, M. Grötschel, and L. Lovász, eds., Elsevier North-Holland, Amsterdam, 1995, pp. 527-550.
S11. D. J. A. Welsh, Matroids and their applications, in Selected topics in graph theory 3, L. W. Beineke and R. J. Wilson, eds., Academic Press, London and San Diego, 1988, pp. 43-70.
S12. D. J. A. Welsh, Matroids: Fundamental concepts, in Handbook of combinatorics, R. L. Graham, M. Grötschel, and L. Lovász, eds., Elsevier North-Holland, Amsterdam, 1995, pp. 481-609.
References

1. L. Anderson, Topology of combinatorial differentiable manifolds, Topology, 38(1999), 197-221.
2. L. Anderson, Matroid bundles, in New perspectives in algebraic combinatorics, L. J. Billera et al., eds., Cambridge Univ. Press, Cambridge, 1999, pp. 1-21.
3. M. K. Bennett, Convexity closure operators, Algebra Universalis, 10(1980), 345-354.
4. A. Björner and G. M. Ziegler, Introduction to greedoids, in [B18], pp. 284-357.
5. B. Bollobás and O. Riordan, A Tutte polynomial for coloured graphs, Combin. Probab. Comput., 8(1999), 45-93.
6. E. Brickell and D. M. Davenport, On the classification of ideal secret sharing schemes, J. Cryptology, 4(1991), 123-134.
7. R. Brualdi and H. J. Ryser, Combinatorial matrix theory, Cambridge Univ. Press, Cambridge, 1991.
8. T. Brylawski, A decomposition for combinatorial geometries, Trans. Amer. Math. Soc., 171(1971), 235-282.
9. T. Brylawski and J. G. Oxley, The Tutte polynomial and its applications, in [B18], pp. 123-225.
10. J. Bukowski and A. G. de Oliveira, Invariant theory-like theorems for matroids and oriented matroids, Adv. Math., 109(1994), 34-44.
11. H. H. Crapo, Erecting geometries, in Proceedings of the Second Chapel Hill Conference on Combinatorics and its Applications, Univ. North Carolina, Chapel Hill, NC, 1970, pp. 74-99.
12. H. H. Crapo, Orthogonality, in [B11], pp. 76-96.
13. J. Désarménien, J. P. S. Kung and G.-C. Rota, Invariant theory, Young bitableaux, and combinatorics, Adv. Math., 27(1978), 63-92.
14. R. P. Dilworth, Lattices with unique irreducible decompositions, Ann. Math., 41(1940), 771-777.
15. P. Doubilet, G.-C. Rota and R. Stanley, On the foundations of combinatorial theory. VI. The idea of generating function, in Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and its Applications, Vol. II (Probability theory), Univ. California Press, Berkeley, CA, 1972, pp. 267-318.
16. P. Doubilet, G.-C. Rota and J. Stein, On the foundations of combinatorial theory. IX. Combinatorial methods in invariant theory, Stud. Appl. Math., 53(1974), 185-216.
17. T. A. Dowling, A class of geometric lattices based on finite groups, J. Combin. Theory Ser. B, 14(1973), 61-86; erratum, ibid., 15(1973), 211.
18. P. H. Edelman, Meet-distributive lattices and the anti-exchange closure, Algebra Universalis, 10(1980), 290-299.
19. P. H. Edelman and R. E. Jamison, The theory of convex geometries, Geometriae Dedicata, 19(1985), 247-270.
20. J. Edmonds, Minimal partition of a matroid into independent sets, J. Res. Nat. Bur. Standards Sect. B, 69B(1965), 67-77.
21. J. Edmonds, Submodular functions, matroids and certain polyhedra, in Combinatorial structures and their applications, Gordon and Breach, New York, 1970, pp. 69-87.
22. J. Edmonds, Matroids and the greedy algorithm, Math. Programming, 1(1971), 127-136.
23. U. Faigle, Geometries on partially ordered sets, J. Combin. Theory Ser. B, 28(1980), 26-51.
24. J. V. Field and J. J. Gray, The geometrical work of Girard Desargues, Springer-Verlag, Berlin and New York, 1987.
25. G. Frobenius, Über zerlegbare Determinanten, Sitzber. Preuss. Akad. Wiss., 1917, 274-277.
26. J. D. Golić, On matroid characterization of ideal secret sharing schemes, J. Cryptology, 11(1998), 75-86.
27. K. M. Gragg and J. P. S. Kung, Consistent dually semimodular lattices, J. Combin. Theory Ser. A, 60(1992), 246-263; erratum, ibid., 71(1995), 173.
28. R. L. Graham and P. Hell, On the history of the minimum spanning tree problem, Ann. Hist. Comput., 7(1985), 43-57.
29. H.-J. Groh, Varieties of topological geometries, Trans. Amer. Math. Soc., 337(1993), 691-702.
30. M. Halsey, Line-closed combinatorial geometries, Discrete Math., 65(1987), 245-248.
31. L. H. Harper and G.-C. Rota, Matching theory, an introduction, in Advances in Probability, Vol. 1, P. Ney, ed., Marcel Dekker, New York, 1971, pp. 171-215.
32. J. Kahn, A problem of P. Seymour on non-binary matroids, Combinatorica, 5(1985), 319-323.
33. J. Kahn and J. P. S. Kung, Varieties of combinatorial geometries, Trans. Amer. Math. Soc., 271(1982), 485-499.
34. J. Kahn and J. P. S. Kung, A classification of modularly complemented geometric lattices, European J. Combin., 7(1986), 243-248.
35. D. G. Kelly and J. G. Oxley, Asymptotic properties of random subsets of projective spaces, Math. Proc. Cambridge Philos. Soc., 91(1982), 119-130.
36. D. G. Kelly and J. G. Oxley, Threshold functions for some properties of random subsets of projective spaces, Quart. J. Math. Oxford (2), 33(1982), 463-469.
37. J. P. S. Kung, Bimatroids and invariants, Adv. Math., 30(1978), 238-249.
38. J. P. S. Kung, Jacobi's identity and the König-Egerváry theorem, Discrete Math., 49(1984), 75-77.
39. J. P. S. Kung, Basis exchange properties, in [B16], pp. 62-75.
40. J. P. S. Kung, Strong maps, in [B16], pp. 224-253.
41. J. P. S. Kung, Weak maps, in [B16], pp. 256-271.
42. J. P. S. Kung, Matchings and Radon transforms in lattices. I. Consistent lattices, Order, 2(1985), 105-112.
43. J. P. S. Kung, Pfaffian structures and critical problems in finite symplectic spaces, Ann. Combin., 1(1997), 159-172.
44. J. Lawrence and L. Weinberg, Unions of oriented matroids, Linear Algebra Appl., 41(1981), 183-200.
45. C. M. Lopez, Chip firing and the Tutte polynomial, Ann. Combin., 1(1997), 253-259.
46. L. Lovász, Matroid matching and some applications, J. Combin. Theory Ser. B, 28(1980), 208-236.
47. L. Lovász, The matroid matching problem, in Algebraic methods in graph theory, Vol. I, II (Szeged, 1978), North-Holland, Amsterdam, 1981, pp. 495-517.
48. L. Lovász and M. D. Plummer, Matching theory, North-Holland, Amsterdam, 1986.
49. S. Mac Lane, Some interpretations of abstract linear dependence in terms of projective geometry, Amer. J. Math., 58(1936), 236-240.
50. G. Matheron, Random sets and integral geometry, Wiley, New York, 1974.
51. F. Matúš, Matroids represented by partitions, Discrete Math., 203(1999), 169-194.
52. B. Monjardet, The consequences of Dilworth's work on lattices with unique irreducible decompositions, in The Dilworth theorems, K. Bogart, R. Freese, and J. Kung, eds., Birkhäuser, Boston, 1990, pp. 192-201.
53. J. von Neumann, Continuous geometry, I. Halperin, ed., Princeton Univ. Press, Princeton, NJ, 1960.
54. G. Peano, Sui fondamenti della geometria, Rivista di Matematica, 4(1894), 73.
55. K. Reuter, The Kurosh-Ore exchange property, Acta Math. Hungar., 53(1989), 119-127.
56. G.-C. Rota, On the foundations of combinatorial theory. I. Theory of Möbius functions, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 2(1964), 340-368.
57. G.-C. Rota, Combinatorial theory and invariant theory, Notes taken by L. Guibas from the National Science Foundation Seminar in Combinatorial Theory, Bowdoin College, Maine, 1971, unpublished typescript.
58. E. R. Scheinerman and D. H. Ullman, Fractional graph theory, Wiley, New York, 1997.
59. H. Schneider, The concepts of irreducibility and full indecomposability of a matrix in the works of Frobenius, König and Markov, Linear Algebra Appl., 18(1977), 139-162.
60. A. Schrijver, Matroids and linking systems, J. Combin. Theory Ser. B, 26(1979), 349-369.
61. P. D. Seymour, Decomposition of regular matroids, J. Combin. Theory Ser. B, 28(1980), 305-359.
62. P. D. Seymour, On secret-sharing matroids, J. Combin. Theory Ser. B, 56(1992), 69-73.
63. D. R. Stinson, Cryptography: theory and practice, CRC Press, Boca Raton, FL, 1995.
64. L. Traldi, A dichromatic polynomial for weighted graphs and link polynomials, Proc. Amer. Math. Soc., 106(1989), 279-286.
65. R. T. Tugger, Convexity without order, manuscript, May 2000.
66. W. T. Tutte, A ring in graph theory, Proc. Cambridge Philos. Soc., 43(1947), 26-40.
67. W. T. Tutte, Matroids and graphs, Trans. Amer. Math. Soc., 90(1959), 527-552.
68. W. T. Tutte, On dichromatic polynomials, J. Combin. Theory, 2(1967), 301-320.
69. D. J. A. Welsh and G. P. Whittle, Arrangements, channel assignments, and associated polynomials, Adv. in Appl. Math., 23(1999), 375-406.
70. H. Whitney, Non-separable and planar graphs, Trans. Amer. Math. Soc., 34(1932), 339-362.
71. H. Whitney, On the abstract properties of linear dependence, Amer. J. Math., 57(1935), 509-533.
72. T. Zaslavsky, Biased graphs. I. Bias, balance, and gains, J. Combin. Theory Ser. B, 47(1989), 32-52.
73. G. Ziegler, What is a complex matroid? Discrete and Comput. Geom., 10(1993), 313-348.
ENUMERATION OF GRAPH COVERINGS, SURFACE BRANCHED COVERINGS AND RELATED GROUP THEORY*

JIN HO KWAK
Combinatorial and Computational Mathematics Center, Pohang University of Science and Technology, Pohang 790-784, Korea

JAEUN LEE
Mathematics, Yeungnam University, Kyongsan 712-749, Korea
Many graphs with a symmetry property can be described as coverings of simpler graphs. In this manuscript, we examine several enumeration problems for various types of nonisomorphic graph coverings of a graph and some of their applications to group theory and to surface theory. This manuscript is organized as follows. In section 1, we introduce basic concepts. In section 2, by using the covering graph construction, we count the positive isomorphism classes of cycle permutation graphs; this number is equal to the number of double cosets of the dihedral group $D_n$ in the symmetric group $S_n$ on n elements. In section 3, we count nonisomorphic (connected) coverings of a graph and, as an application, we obtain another recursive formula for the number of conjugacy classes of subgroups of given index of a finitely generated free group. In section 4, we count nonisomorphic regular coverings of a graph whose covering transformation groups are abelian and, as an application, we count subgroups of given index of free abelian groups. The same work is done in section 5 for regular coverings having dihedral voltage groups. In section 6, we discuss a general counting formula for regular coverings having any finite voltage group. In section 7, after discussing a combinatorial proof of the Hurwitz theorem for surface branched coverings, we consider the number of subgroups of surface groups. Finally, in section 8, we discuss a distribution of branched surface coverings of surfaces and some related topological properties including a generalization of the classical Alexander theorem.
1 Definitions and Notations
Let $G$ be a connected finite simple graph with vertex set $V(G)$ and edge set $E(G)$. The neighborhood of a vertex $v \in V(G)$, denoted by $N(v)$, is the set of vertices adjacent to $v$. We use $|X|$ for the cardinality of a set $X$. The number $\beta(G) = |E(G)| - |V(G)| + 1$ is equal to the number of independent cycles in $G$ and it is referred to as the Betti number of $G$. Two graphs $G$ and $H$ are isomorphic if there exists a one-to-one correspondence between their vertex sets which preserves adjacency, and such a correspondence is called an isomorphism between $G$ and $H$. An automorphism of a graph $G$ is an isomorphism of $G$ onto itself. Thus, an automorphism of $G$ is a permutation of the vertex set $V(G)$ which preserves adjacency. Obviously, a composition of two automorphisms is also an automorphism. Hence the automorphisms of $G$ form a permutation group, $\mathrm{Aut}(G)$, which acts on the vertex set $V(G)$.

*This work is partially supported by Com2MaC-KOSEF, Korea.

A graph $\tilde{G}$ is called a covering of $G$ with projection $p : \tilde{G} \to G$ if there is a surjection $p : V(\tilde{G}) \to V(G)$ such that $p|_{N(\tilde{v})} : N(\tilde{v}) \to N(v)$ is a bijection for any vertex $v \in V(G)$ and $\tilde{v} \in p^{-1}(v)$. We also say that the projection $p : \tilde{G} \to G$ is an n-fold covering of $G$ if $p$ is n-to-one. A covering $p : \tilde{G} \to G$ is said to be regular (simply, $\mathcal{A}$-covering) if there is a subgroup $\mathcal{A}$ of the automorphism group $\mathrm{Aut}(\tilde{G})$ of $\tilde{G}$ acting freely on $\tilde{G}$ so that the graph $G$ is isomorphic to the quotient graph $\tilde{G}/\mathcal{A}$, say by $h$, and the quotient map $\tilde{G} \to \tilde{G}/\mathcal{A}$ is the composition $h \circ p$ of $p$ and $h$. The fibre of an edge or a vertex is its preimage under $p$. Two coverings $p_i : \tilde{G}_i \to G$, $i = 1, 2$, are said to be isomorphic (or, equivalent) if there exists a graph isomorphism $\Phi : \tilde{G}_1 \to \tilde{G}_2$ such that the diagram formed by $\Phi$, $p_1$ and $p_2$ commutes, that is, $p_2 \circ \Phi = p_1$. Such a $\Phi$ is called a covering isomorphism. In particular, when $p_1 = p_2$ (say, $= p$) with $\tilde{G}_1 = \tilde{G}_2$ (say, $= \tilde{G}$), it is called a covering transformation of $p$, and the set of all covering transformations forms a group under the composition, called the covering transformation group of the covering $p : \tilde{G} \to G$.

Every edge of a graph $G$ gives rise to a pair of oppositely directed edges. By $e^{-1} = vu$, we mean the reverse edge to a directed edge $e = uv$. We denote the set of directed edges of $G$ by $D(G)$. Each directed edge $e$ has an initial vertex $i_e$ and a terminal vertex $t_e$. Following [4], a permutation voltage assignment $\phi$ on a graph $G$ is a map $\phi : D(G) \to S_n$ with the property that [...]

[...] $q^{-1}(v)$ is a bijection between the n vertices $\{v_1, v_2, \ldots, v_n\}$ for all $v \in V(G)$. Now, we define $f : V(G) \to S_n$ by $f(v) = \Phi|_{p^{-1}(v)}$ for all $v \in V(G)$. For an edge $uv \in D(G)$, if $(u, h)$ is joined to $(v, k)$ in $G^{\phi}$, then $\phi(uv)(h) = k$ and $(u, f(u)(h))$ is joined to $(v, f(v)(k))$ in $G^{\psi}$ for any $h$. Thus, we have $\psi(uv)f(u) = f(v)\phi(uv)$, or $\psi(uv) = f(v)\phi(uv)f(u)^{-1}$ for all $uv \in D(G)$. The authors showed that the converse is also true.

Theorem 4 ([24]) Two n-fold coverings $p : G^{\phi} \to G$ and $q : G^{\psi} \to G$ are isomorphic if and only if there exists a function $f : V(G) \to S_n$ such that $\psi(uv) = f(v)\phi(uv)f(u)^{-1}$ for each $uv \in D(G)$. Moreover, if $\phi, \psi \in C^1_T(G;n)$, then it is equivalent to say that there exists a permutation $\sigma \in S_n$ such that $\psi(uv) = \sigma\phi(uv)\sigma^{-1}$ for each $uv \in D(G) - D(T)$.

By labeling the positively directed edges in $D(G) - D(T)$ as $e_1, e_2, \ldots, e_{\beta(G)}$, a normalized permutation voltage assignment can be identified as a $\beta(G)$-tuple of permutations in $S_n$, and the set $C^1_T(G;n)$ can be identified as $S_n \times S_n \times \cdots \times S_n$ ($\beta(G)$ times).
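The "if" direction of Theorem 4 can be illustrated on a triangle. This is a minimal sketch under an explicit assumption: the derived covering $G^{\phi}$ is taken to have vertex set $V(G) \times \{0, \ldots, n-1\}$ with $(u, h)$ joined to $(v, \phi(uv)(h))$, which is what the argument before the theorem suggests but is not spelled out in the surviving text. The voltages `phi`, the function `f` and all helper names are hypothetical examples. Permutations are tuples acting on $\{0, \ldots, n-1\}$ and compose right to left.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def derived_edges(edges, voltage):
    """Edge set of the derived covering: (u, h) ~ (v, voltage[uv](h)).

    Only one direction of each edge is listed; the reverse direction
    carries the inverse permutation and yields the same undirected edges."""
    n = len(next(iter(voltage.values())))
    return {frozenset({(u, h), (v, voltage[(u, v)][h])})
            for (u, v) in edges for h in range(n)}

def conjugated_assignment(edges, phi, f):
    """psi(uv) = f(v) phi(uv) f(u)^{-1}, as in Theorem 4."""
    return {(u, v): compose(compose(f[v], phi[(u, v)]), inverse(f[u]))
            for (u, v) in edges}

def is_covering_isomorphism(edges, phi, psi, f):
    """Check that (u, h) -> (u, f(u)(h)) maps the edges of G^phi onto those of G^psi."""
    image = {frozenset((u, f[u][h]) for (u, h) in edge)
             for edge in derived_edges(edges, phi)}
    return image == derived_edges(edges, psi)

edges = [(0, 1), (1, 2), (2, 0)]                    # a triangle, one direction per edge
phi = {(0, 1): (1, 2, 0), (1, 2): (0, 2, 1), (2, 0): (0, 1, 2)}
f = {0: (1, 0, 2), 1: (2, 0, 1), 2: (0, 1, 2)}      # an arbitrary choice of f(v)
psi = conjugated_assignment(edges, phi, f)
print(is_covering_isomorphism(edges, phi, psi, f))  # True
```

The map fixes the first coordinate, so it commutes with the two projections onto the triangle by construction.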
With an $S_n$-action on the set $C^1_T(G;n)$ defined by simultaneous coordinatewise conjugacy, that is, $g(\sigma_1, \ldots, \sigma_{\beta(G)}) = (g\sigma_1g^{-1}, \ldots, g\sigma_{\beta(G)}g^{-1})$ for any $g \in S_n$ and any $(\sigma_1, \ldots, \sigma_{\beta(G)}) \in C^1_T(G;n)$, two assignments $\phi$, $\psi$ in $C^1_T(G;n)$ derive isomorphic coverings of $G$ if and only if they belong to the same orbit under the $S_n$-action. That is, each $\beta(G)$-tuple of permutations $(\sigma_1, \ldots, \sigma_{\beta(G)})$, $\sigma_i \in S_n$, is identified with a normalized permutation voltage assignment in $C^1_T(G;n)$, and two such tuples derive isomorphic coverings of $G$ if and only if they are simultaneous coordinatewise conjugate. Two $\beta$-tuples of permutations $(\sigma_1, \sigma_2, \ldots, \sigma_\beta)$ and $(\tau_1, \tau_2, \ldots, \tau_\beta)$ in $S_n$ are said to be similar by $g$, or simply similar, if they are simultaneous coordinatewise conjugate by $g$, that is,

$$\tau_i = g\sigma_ig^{-1} \quad \text{for } i = 1, 2, \ldots, \beta.$$

Table 2. The number Iso(G;n) for small n and small β(G)

β(G)      1      2        3          4            5              6
n = 1     1      1        1          1            1              1
n = 2     2      4        8         16           32             64
n = 3     3     11       49        251         1393           8051
n = 4     5     43      681      14491       336465        7997683
n = 5     7    161    14721    1730861    207388305    24883501301
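The first entries of Table 2 can be recomputed directly from the definition of similarity. The sketch below simply enumerates all $\beta$-tuples of permutations in $S_n$ and counts orbits under simultaneous conjugation; it is a brute-force check that is only feasible for very small $n$ and $\beta$, and the function names are hypothetical.

```python
from itertools import permutations, product

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p) on {0, ..., n-1}."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def count_similarity_classes(n, beta):
    """Number of orbits of S_n acting on beta-tuples by simultaneous conjugation."""
    group = list(permutations(range(n)))
    seen = set()
    classes = 0
    for sigma in product(group, repeat=beta):
        if sigma in seen:
            continue
        classes += 1  # sigma starts a new similarity class
        for g in group:
            g_inv = inverse(g)
            # record the whole orbit (g s g^{-1}) of sigma
            seen.add(tuple(compose(compose(g, s), g_inv) for s in sigma))
    return classes

for n in (1, 2, 3):
    print(n, [count_similarity_classes(n, beta) for beta in (1, 2, 3)])
# 1 [1, 1, 1]
# 2 [2, 4, 8]
# 3 [3, 11, 49]
```

The running time grows like $(n!)^{\beta+1}$, so this enumeration is meant as a sanity check on small cases rather than a way to produce the larger table entries.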
If we can find such a $g \in S_n$ that leaves fixed some $k$ in $\{1, 2, \ldots, n\}$, then the tuples are said to be k-similar. By Theorem 4, there is a one-to-one correspondence between the similarity classes of $\beta(G)$-tuples of permutations in $S_n$ and the isomorphism classes of n-fold coverings of the graph $G$. We denote by $\mathrm{Iso}(G;n)$ the number of such isomorphism classes of n-fold coverings of $G$.

To count $\mathrm{Iso}(G;n)$ by Burnside's Lemma, we first count $\mathrm{Fix}(g)$ for each $g \in S_n$. Let $C(g)$ and $Z(g)$ denote the conjugacy class containing $g$ and the centralizer of $g$ in the symmetric group $S_n$, respectively.

Lemma 2 Under the $S_n$-action on $C^1_T(G;n) = S_n \times S_n \times \cdots \times S_n$, we have
(1) if $g_1$ and $g_2$ are conjugate, then $|\mathrm{Fix}(g_1)| = |\mathrm{Fix}(g_2)|$,
(2) for each $g \in S_n$, $\mathrm{Fix}(g) = Z(g) \times Z(g) \times \cdots \times Z(g)$ ($\beta(G)$ times),
(3) $|C(g)||Z(g)| = n!$ for any