Predicting Party Sizes
Predicting Party Sizes The Logic of Simple Electoral Systems
Rein Taagepera
Great Clarendon Street, Oxford OX2 6DP Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide in Oxford New York Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto With offices in Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam Oxford is a registered trademark of Oxford University Press in the UK and in certain other countries Published in the United States by Oxford University Press Inc., New York © Rein Taagepera 2007 The moral rights of the author have been asserted Database right Oxford University Press (maker) First published 2007 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above You must not circulate this book in any other binding or cover and you must impose the same condition on any acquirer British Library Cataloguing in Publication Data Data available Library of Congress Cataloging in Publication Data Data available Typeset by SPI Publisher Services, Pondicherry, India Printed in Great Britain on acid-free paper by Biddles Ltd., King’s Lynn, Norfolk ISBN 978–0–19–928774–1 1 3 5 7 9 10 8 6 4 2
Preface
This book uses electoral systems to predict a significant aspect of party systems—the number and size distribution of parties, and some of the consequences. It builds on Seats and Votes (Taagepera and Shugart 1989), but it is a very different book. This is so because of the marked advances in our understanding of the connection between institutional inputs, such as electoral systems, and political outputs, such as the number of parties and the duration of governmental cabinets. During the last 15 years, many important works have been published that reflect advances in the study of electoral systems. Farrell (2001) has a most thorough description of the multiplicity of electoral systems used, including their history. The handbook edited by Colomer (2004a) focuses on the origins of electoral systems and regularities in the choice among them, presenting numerous case studies. Gallagher and Mitchell (2006) describe electoral systems in 22 countries. Lijphart (1994) studies systematically the effect of institutional inputs on the number of parties and proportionality between seats and votes. Reynolds, Reilly, and Ellis (2005) offer practical advice on the strong and weak aspects of the various systems, and Diamond and Plattner (2006) evaluate the outcomes for consolidation of democracy. Katz (1997), Lijphart (1999), Powell (2000), and Reynolds (2002) analyze the implications of elections as a core part of democracy, each from a very different angle. Cox (1997), Blais (2000), and Norris (2004) connect institutions to political behavior, stressing the various forms of strategic coordination. Shugart and Carey (1992) and Jones (1995) study the impact of electoral systems in presidential regimes. Monroe (2007a, 2007b) reconsiders application of social choice theory to electoral systems, introduces new measures of bias and responsiveness, and applies the results to institutional engineering in democracies. 
Specific electoral rules and/or geographic areas have been addressed in more detail by Grofman et al. (1999) on single non-transferable vote in East Asia, Bowler and Grofman (2000) on single transferable vote, Shugart
and Wattenberg (2001) on mixed-member proportional systems, Grofman and Lijphart (2002) on proportional representation in Nordic countries, Elklit (1997) on emerging democracies, and Reynolds (1999) on Southern Africa. Darcy, Welch, and Clark (1994) and Henig and Henig (2001) have investigated the effect of electoral systems on women’s representation, and Rule and Zimmerman (1994) cover women and minorities.
The ability to carry out analyses of worldwide scope has depended on the availability of electoral data. Among data collections, Mackie and Rose (1974, 1982, 1991, 1997) has been the major workhorse for long-established democracies. Nohlen, Krennerich, and Thibaut (1999), Nohlen, Grotz, and Hartmann (2001), and Nohlen (2005) have filled the gap for Africa, Asia-Pacific, and the Americas, respectively. A comparable collection for East Central Europe (Nohlen and Kasapovic 1996) seems to be available, as yet, only in German.
How does the present book complement this extensive work? It focuses on the very simplest electoral systems. In that narrow slice, it specifies the average relationships between institutional inputs and the party political outputs to an unprecedented degree. The results may be of interest to the practitioners of politics, for the following reason. Political scientists knew a long time ago that certain electoral systems tend to restrict the number of parties and lengthen cabinet duration, but they did not quite know by how much, and why by precisely that much. Suppose a practicing politician felt that cabinets in her country did not last long enough to implement long-term policy. We could tell her that reducing the number of parties could help prolong cabinet duration, and this reduction in the number of parties, in turn, could be obtained by reducing the ‘district magnitude’. The latter means the number of seats allocated within the same district.
The smaller it is, the tougher it usually is for small parties to obtain representation. Yet, we would have been at a loss, 20 years ago, if the practitioner continued to ask: ‘To what level should the district magnitude be reduced, if we want to double the average lifetime of cabinets?’ Back then, we could just recommend that the politicians try some change in the given direction, and see if it is too little or too much. If they overdid the change in district magnitude, we would not know it until several decades later, because individual cabinets vary in duration. Only the mean duration over a long time span is strongly determined by institutional constraints.
In contrast, we can now make explicit predictions, in the case of stable democracies, when the electoral rules are rather simple. New Zealand used to have a mean cabinet duration of 6.3 years, which implies that
the same party often remained in office after new elections. In 1996, New Zealand changed electoral rules and ushered in an era of short-lived coalition cabinets. It will be seen that a logical model predicts a new mean cabinet duration of 2.0 years, within ±30 percent. This means somewhere between 1.4 and 2.6 years. In 1996–2002, the actual mean duration was 1.4 years. If the model holds, this mean is likely to increase slightly in the future. We shall see.
It can be seen that I am willing to go on record with a specific prediction. This book presents the models on which such predictions are based. These models tie cabinet duration and the number and size of parties to institutional inputs. The book also presents actual evidence. Prediction becomes hazier for new democracies where the political culture is still in flux. Even for mature democracies, prediction becomes unmanageable when electoral rules are rather complex, as they often are. Yet, compared to 1989, we have shifted from no ability to predict in any quantitative detail to some such ability. This advance gives us hope that, within the next 20 years, further strides will be made. The present book will help do so, by pointing out some promising directions, and by presenting a unified picture of the methods that have brought us this far.
My method owes more to my Ph.D. (see Taagepera and Williams 1966) and other work in physics (Taagepera and Nurmia 1961; Taagepera, Storey, and McNeill 1961) than to what currently prevails in comparative political science. In physics, one tries to break up a problem into smaller pieces. It often results in a sequence of equations of varied formats, as dictated by the nature of the issue at hand, each equation involving only a few variables and even fewer constants. Interaction of variables follows the basic format D ← C ← B ← A. There are no alternatives. The same constant values often recur in many equations and have names; they are here to stay.
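The New Zealand figures quoted earlier in this preface can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `prediction_range` and the variable names are assumptions introduced here, not notation from the book, and the numbers are simply those quoted in the text (a predicted mean of 2.0 years, an error band of ±30 percent, and an observed mean of 1.4 years for 1996–2002).

```python
def prediction_range(predicted, relative_error):
    """Return (low, high) bounds for a prediction with a symmetric relative error."""
    low = predicted * (1 - relative_error)
    high = predicted * (1 + relative_error)
    return low, high


predicted_years = 2.0   # model prediction quoted in the text
relative_error = 0.30   # the stated +/-30 percent band
observed_years = 1.4    # actual mean duration for 1996-2002, as quoted

low, high = prediction_range(predicted_years, relative_error)

# A small tolerance keeps a value sitting exactly on a bound from being
# excluded by floating-point rounding.
within = (low - 1e-9) <= observed_years <= (high + 1e-9)

print(f"range: {low:.1f} to {high:.1f} years")
print(f"observed {observed_years} years inside range: {within}")
```

With these inputs the observed mean sits at the lower edge of the band, which is consistent with the remark that the mean is likely to increase slightly if the model holds.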
In contrast, today’s political scientists often throw a large number of input variables into a simple regression equation that is either linear or follows a limited number of other habitual patterns. The same researcher may present several alternative regressions, with some variables included or excluded. These are parallel expressions for the same output, rather than sequential: D ← A or D ← (A, B) or D ← (B, C) or D ← (A, B, C). These are alternatives. The number of constants and coefficients exceeds the number of variables, and their numerical values are rarely used again, once published; they are dead on arrival in print. Many distinguished exceptions occur, but regression equations are presently the main pattern in comparative political science.
The physicist goes first after the most general, and only gradually fleshes it out with more realistic details. In contrast, today’s political scientists often want to include all possible factors at once. Does a colleague suggest another variable? The answer may well be: ‘OK, I’ll put it in the regression.’ I observed precisely this response at a conference, January 14, 2005, where the speaker entered all his variables linearly. If it were a physics conference, the reaction might rather be: ‘OK, I’ll try to fit it logically into the model when I get to work on a second approximation.’ I definitely prefer the sequential approach, using only a few variables at each stage, connected by an equation the mathematical format of which is based on logical expectations guided by empirical evidence. This method is described in more detail in Beyond Regression (Taagepera 2008). Most often the resulting format is nonlinear. Many of my colleagues in political science understand this method and make use of it (see Coleman 2007; Colomer 2007; Grofman 2007). Hopefully, the results presented here will enable others to appreciate the power this method adds to the more usual approaches in today’s political science. Still, some colleagues may not be convinced. After raising a slew of detailed issues, an anonymous reviewer for Taagepera and Allik (2006) candidly noted: ‘Perhaps I have continued problems with this paper because I am skeptical that there is much of value operating at such a high level of generality . . . huge amounts of real-world variation are consigned to nowhere.’ Actually, we consigned these variations to a much better place than nowhere, namely to the next-level analysis. While ferreting out the universal, science does not ignore detail, but it does introduce some hierarchy in approach to detail. 
The reviewer continued: ‘The pattern the paper identifies, even though it can be modeled in a convincing way, may simply be a contingent summary of the particular real-world data used.’ Here we reach the core of a general unease about my approach, among some colleagues. If my models fit, they supposedly must fit for the wrong reason, even if the hidden artifact cannot be pinned down! We shall see. This book makes a number of specific predictions that can turn out false. In contrast, all too many studies in political science are safe against being proved false because they only predict the broad direction of a change, while leaving its precise amount unspecified. Or they present a probability distribution, without specifying the expectation value, as quantum physicists call it. It roughly means the value at which there is a 50-50 probability of the next actual case being below or above the predicted value. This expectation value is what I have in mind when predicting a
mean cabinet duration of 2.0 years for New Zealand, with a specified range of error. It is not a rigidly ‘deterministic’ prediction—it only expresses the average expectation. This is how we should present our results, if we want political science to be considered relevant by the public and by the practitioners of politics. King, Tomz, and Wittenberg (2000) have rightly criticized substantively ambiguous conclusions like ‘The coefficient on education was statistically significant at the 0.05 level.’ A decision-maker would be hard put to make use of such research. If one expects political practitioners to make use of what political scientists publish, this gap between jargon and usable results should be a matter of concern. This brings me to another aspect where this book may differ from Taagepera and Shugart (1989). Meanwhile, I was drawn into active politics three times. I participated in Estonia’s constitutional assembly in 1991–2. In late 1992, I ran for President of Estonia, ending a respectable third (Taagepera 1993). And in 2001, I was the founding chair of a new party which, soon after the end of my tenure, went on to win the parliamentary elections, before crashing (see Taagepera 2006). I never was a hands-on manager of a political organization or campaign, but I was a high-level participant and acquired some feeling for time pressures faced by sleepless politicians. They may need to decide on an institutional feature without time for detailed study of alternatives and their possible consequences. Scholars choose their problems; problems choose their politicians. Can we present scholarly results in a way politicians could use? This book tries to open all chapters with a section ‘For the practitioner of politics’. What can we expect from electoral systems? If these systems are simple, we can already predict quite a lot, at a level of precision useful to the political practitioner. 
In the case of more complex electoral systems, we are still far from specific advice. This book tries to establish the basis for tackling ever more complex cases. There is some tension between these two goals: the clear-cut recipes the practitioner needs, and the more tentative reasoning that gropes toward the unknown. At the start of the chapters, I present the practical recipe first, in a simple form which risks overstating the level of certainty of the claims. After all, the practitioner is often under pressure to act pretty quickly on the basis of the best current evidence, rather than wait for better evidence in years to come. Thereafter, in the body of the chapter, I shift to the scholarly mood, supplying theoretical considerations and empirical evidence. I try to distinguish between what is widely usable and what is more technical, shifting the latter to appendices at the end of the
chapters. This is where I also express doubts about my own findings and point out deviating cases and trends. My main emphasis is on elections for and party strengths in the lower or only chamber of legislative assemblies. As a spin-off from assembly models, a model also results for seat allocation among regions and countries in federal and supranational assemblies—such as the European Parliament. Presidential elections enter only as a conceptual limit case when an assembly is gradually reduced in size. Party system specialists may be disappointed that this book says so little about the internal structure of parties and their interaction in a system. Our knowledge about parties continues to expand (among the most recent books, see Ware 1996; Mair 1997; Bowler, Farrell, and Katz 1999; Dalton and Wattenberg 2000; Gallagher, Laver, and Mair 2000; Gunther, Montero, and Linz 2002; Webb and Farrell 2002; Cross 2004; Adams, Merrill, and Grofman 2005; Katz and Crotty 2006). The structure and interaction of parties should affect their size distribution. By how much they do, however, does not seem to have reached the stage of operational prediction. Establishing and isolating the mean impact of electoral systems, as this book does, serves to narrow the range of what remains to be explained by other factors. Many people deserve thanks for helping this book to come into existence. My wife Mare kept bugging me about writing the book when other academic and intellectual concerns tended to take precedence. Her activity in chemistry and knowledge space theory approaches to teaching of science maintained my contact with practices in natural sciences. Claire M. Croft at Oxford University Press urged me to bring my post-1989 work together in book form, and Elizabeth Suffling, Tanya Dean, and Mick Belson edited it into a technically superb form. 
Bernard Grofman (2003) has been an early exponent of my approach, and his inseparable associate, A Wuffle, has supplied quotes to introduce two subparts of the book. Arend Lijphart, Mathew S. Shugart, Lorenzo de Sio, Daniel Bochsler, Evald Mikkel, Russ Dalton, and Anthony McGann have commented on various drafts and/or helped in other ways. Undergraduate and graduate students at the University of California, Irvine, and at the University of Tartu in Estonia have wittingly or unwittingly raised new questions about various aspects of content and style. Indeed, several of them (Mirjam Allik, John Ensch, and Allan Sikk) have become coauthors of detailed studies condensed in this volume. Mirjam Allik also finalized most of the graphs. I thank them all.
R.T.
Contents
List of Figures
List of Tables
List of Symbols
1. How Electoral Systems Matter
Part I. Rules and Tools
2. The Origins and Components of Electoral Systems
3. Electoral Systems: Simple and Complex
4. The Number and Balance of Parties
5. Deviation from Proportional Representation and Proportionality Profiles
6. Openness to Small Parties: The Micro-Mega Rule and the Seat Product
Part II. The Duvergerian Macro-Agenda: How Simple Electoral Systems Affect Party Sizes and Politics
7. The Duvergerian Agenda
8. The Number of Seat-Winning Parties and the Largest Seat Share
9. Seat Shares of All Parties and the Effective Number of Parties
10. The Mean Duration of Cabinets
11. How to Simplify Complex Electoral Systems
12. Size and Politics
13. The Law of Minority Attrition
14. The Institutional Impact on Votes and Deviation from PR
Part III. Implications and Broader Agenda
15. Thresholds of Representation and the Number of Pertinent Electoral Parties
16. Seat Allocation in Federal Second Chambers and the Assemblies of the European Union
17. What Can We Expect from Electoral Laws?
Appendix: Detecting Factors Other than the Seat Product
References
Index
List of Figures
1.1. The opposite impacts of electoral systems and current party politics
2.1. When district magnitude changes, PR and plurality rules affect largest party bonus in opposite ways
4.1. Balance vs. effective number of legislative parties in 25 countries, 1985–1996
4.2. Effective number of legislative parties vs. effective number of electoral parties
5.1. Deviation from proportional representation vs. effective number of electoral parties
5.2. Proportionality profiles for FPTP elections in New Zealand and the USA
5.3. Proportionality profile for Two-Rounds and PR elections in France
5.4. Proportionality profile for PR elections in Finland
5.5. Proportionality profile for Mixed-Member Proportional elections in Germany
6.1. Vote shares at which parties tend to win their first seat vs. district magnitude, for various PR formulas
7.1. The opposite impacts of current politics and electoral system
7.2. The macro-Duvergerian agenda, as of 2007
8.1. The number of seat-winning parties vs. the seat product MS
8.2. The median seat share of the largest party vs. the number of seat-winning parties
8.3. The median seat share of the largest party vs. assembly size, for 30 single-seat systems: predictive model and regression line
8.4. The median seat share of the largest party vs. seat product MS for 46 single- and multi-seat systems: predictive model and regression line
9.1. Actual average seat shares of parties ranked by size vs. largest seat share
9.2. Average seat shares of parties ranked by size vs. largest seat share: politically adjusted predictive model and actual data
9.3. Effective number of legislative parties vs. seat product MS: predictive model and regression line
10.1. Mean cabinet duration vs. effective number of legislative parties: predictive model and regression line
10.2. Mean cabinet duration vs. seat product MS: predictive model and regression line
13.1. Seat shares vs. vote shares for FPTP with high disproportionality exponents: attrition law and Caribbean data
13.2. Actual seat shares vs. those calculated from the attrition law, for FPTP systems with high disproportionality exponents
13.3. Actual effective numbers of legislative parties vs. those calculated from the attrition law, for FPTP systems with high disproportionality exponents
13.4. Actual deviations from PR vs. those calculated from the attrition law, for two-party FPTP systems with high disproportionality exponents
15.1. Nationwide number of seat-winning parties vs. average threshold of representation
16.1. Number of subunit-based second chamber seats vs. the geometric mean of first chamber size and the number of subunits
16.2. Seat and voting weight distribution in the European Parliament and the Council of the EU in 1995: predictive model and actual values
List of Tables
3.1. Possible seat allocation rules in a single-seat district
3.2. Example of basic seat allocation options in a single-seat district
3.3. Allocation of 6 seats by d’Hondt divisors (1, 2, 3, . . . )
3.4. Allocation of seats in a 6-seat district, by various quota and divisor formulas
3.5. Example of seat allocation by single transferable vote (STV) in a 5-seat district
3.6. Established democracies 1945–90: number of electoral systems and the total number of elections in which they were used
3.7. Electoral systems and British–French heritage
5.1. Satisfaction of the Taagepera and Grofman (2003) criteria by three indices of deviation from PR
5.2. Mean values of deviation indices D1 and D∞, for given mean values of D2
5.3. An example where Loosemore–Hanby’s deviation index D1 looks too low
5.4. A counterexample where Loosemore–Hanby’s deviation index D1 no longer looks too low
5.5. Examples where Loosemore–Hanby’s deviation index D1 may look preferable to Gallagher’s D2
5.6. Which pattern would correspond to one-half of maximum deviation from PR?
6.1. Effect of district magnitude and seat allocation formula on the distribution of seats in a district where the percentage vote shares are 48+, 25−, 13−, 9−, 4, and 1+
6.2. Magnitudes at which parties with percentage vote shares 48+, 25−, 13−, 9−, 4, and 1+ would win their first seat under various allocation formulas
6.3. Effect of district magnitude and seat allocation formula on deviation from PR and on the effective number of parties, in a district where the percentage vote shares are 48+, 25−, 13−, 9−, 4, and 1+
6.4. The seat product and the resulting expected number of seat-winning parties in the assembly
6.5. Tentative values of allocation formula exponent F in the seat product, for various PR formulas
8.1. The actual number of seat-winning parties, the expected number (based on district magnitude and assembly size), and their ratio
8.2. Assembly size (S) and the largest party’s seat share (s1), for single-seat district systems
8.3. District magnitude (M), assembly size (S), and the largest party’s seat share (s1), for multi-seat PR systems
8.4. Complexity of electoral systems and deviation of the largest seat share from the model $s_1 = 1/(MS)^{1/8}$
9.1. Actual average seat shares of parties ranked by size vs. largest share
9.2. Seat shares of parties ranked by size, for given largest share: probabilistic model, politically adjusted model, and the actual world averages
9.3. Effective number of parties for given largest share
9.4. Entropy-based effective number of parties
11.1. Actual district magnitudes (M) and effective magnitudes derived from $M_{\mathrm{eff}} = (N^6/S)^F$, for stable democracies with relatively simple electoral systems in 1945–96
11.2. Effective magnitudes for complex electoral systems, with output Meff calculated from $M_{\mathrm{eff}} = (N^6/S)^F$
12.1. Predicted largest seat shares and effective numbers of parties, at selected populations and district magnitudes
12.2. Total and per capita party memberships at selected populations and district magnitudes: empirical approximations
13.1. Women’s share in US public office
13.2. Caribbean countries with unusually high disproportionality exponents
13.3. Volleyball scores and the law of minority attrition
13.4. Selection constant for women’s attrition in US politics seems to be 3.5
14.1. From votes to seats, and back to votes: hypothetical vote shares, with S = 100 and n = 3.00
14.2. Predicting the number of electoral parties from assembly size, for high responsiveness FPTP systems
14.3. Predicting the deviation from PR (Gallagher’s D2) from assembly size, for high responsiveness FPTP systems
15.1. District-level thresholds of minimal representation for various seat allocation formulas: general and for a 6-seat district
15.2. Sample constellations (in %) where the party shown in bold narrowly wins or narrowly fails to win a seat in a 6-seat district, using the Sainte-Laguë seat allocation formula
15.3. Number of ‘pertinent’ electoral parties (p) and resulting thresholds of representation (in %), if $p = M^{1/2} + 2M^{1/4}$ and $T_R = (T_I T_E)^{1/2}$
16.1. The number of seats in the European Parliament: prediction by the cube root law and the actual number
16.2. Total voting weights in the Council of the European Union: prediction by $S = P^{1/6} T^{1/2}$ and actual
16.3. Characteristics of seat allocations in the Council of the European Union and the European Parliament
16.4. Incongruent seat allocations for the European Parliament elections of 2004, compared to population in 2000
A.1. Residuals of the number of seat-winning parties (N0)
A.2. Residuals of the largest seat shares (s1)
A.3. Residuals of the largest seat shares (s1), effective numbers of parties (N), and mean cabinet durations (C)
List of Symbols
This list of symbols is not intended to be complete, but it includes the most important, especially those used in several chapters. Numbers refer to the chapter in which the symbol first occurs.
A: aggregate of an electoral system, 6
AV: alternative vote, 3
a: advantage ratio, seat–vote ratio, 5
B: index of balance, 4
BC: Borda count, 3
BV: block vote, 3
b: break-even point, 5
C: mean duration of cabinets, in years, 10; number of Electoral College members, 13
D1: deviation from PR, Loosemore–Hanby measure, 5
D2: deviation from PR, Gallagher measure, 5
D∞: deviation from PR, Lijphart (1994) measure, 5
d: divisor gap in seat allocation formula, 6
E: number of electoral districts, 13
F: formula exponent in seat product, 6; first chamber size, 16
FPTP: first-past-the-post, single-seat plurality, 2
H: entropy, 4
k: conversion exponent between largest seat and vote share, 14; general constant elsewhere
L: literate fraction of population, 12
LR: largest remainders, 3
LV: limited vote, 3
M: district magnitude, 2
Meff: effective magnitude at given assembly size, 2
MMP: mixed-member proportional, 2
MS: seat product, 6
m: total membership of parties, 12
N, N2: effective number of components, parties, 4
NS: effective number of parties, based on seat shares, 4
NV: effective number of parties, based on vote shares, 4
N0: number of seat-winning parties, 4
N1: entropy-based number of parties, 4
N∞: number of parties that profit from small party abandonment, 4
n: disproportionality exponent, 7, 13; general constant elsewhere
OLS: ordinary least squares method of linear regression, 8
P: population, 12
PBV: party block vote, 3
PR: proportional representation, 2
p: number of seat-winning parties in a district, 8
p: number of parties running in a district, 15
q0: simple quota for seat allocation, 3
q1: Hagenbach–Bischoff quota for seat allocation, 3
qDroop: Droop quota for seat allocation, 3
qi: general quota for seat allocation, 3
R: number of registered parties, 12
R²: linear correlation coefficient squared, 8
r: number of registered parties, 12
S: assembly size, total number of seats, 2; second chamber size, 16
SNTV: single non-transferable vote, 3
STV: single transferable vote, 3
s, s1: fractional seat share of the largest party, 8
si: fractional seat share of ith party, 3
T: threshold of votes to win a seat, 8; number of territorial subunits, 16
TE: threshold of exclusion, 15
TI: threshold of inclusion, 15
TR: Two-Rounds elections, 3
t: total seat share of third parties, 12; turnout, 17
UV: Unlimited Vote, 3
V: total number of votes, 13
VA: aggregate volatility, 5
VI: individual volatility, 5
V1: volatility, Pedersen measure, 5
V2: volatility, Gallagher measure, 5
V∞: volatility, Lijphart (1994) measure, 5
v, v1: fractional vote share of the largest party, 4
vi: fractional vote share of ith party, 4
W: working-age fraction of population, 12
1 How Electoral Systems Matter
For the practitioner of politics:
• Electoral systems help determine how many parties a country has, how cohesive they are, who forms the government, and how long the government cabinets tend to last.
• Electoral systems are expressed in electoral laws. Their impact depends on the way politicians and voters make use of these laws.
• At times, flawed electoral laws can undo democracy or lead to staleness.
Who governs? Electoral systems matter in democracies because they affect the answer to this question, which Robert Dahl (1961) posed in a different context. In the January 2006 Palestinian elections, the electoral system used gave Hamas 70 percent of the seats and hence threw Palestinian–Israeli relations into turmoil. Yet Hamas received only about 45 percent of the list votes, as against about 41 percent for the more moderate Fatah. [1] With proportional representation rules, no party would have won an absolute majority of the seats, leading to a more balanced coalition government. In contrast, the actual, heavily majoritarian electoral system was bound to boost the seats for whichever party received even slightly more votes. The answer to the question ‘Who governs?’ was determined as much by the electoral system as by popular votes.

[1] Matthew S. Shugart, ‘The magnitude of the Hamas sweep: The electoral system did it’ (http://fruitsandvotes.com, visited on February 28, 2006), calculates the Hamas list vote as 44.5 percent. The complex multi-seat nontransferable vote system, with many other embellishments, makes an exact count for candidates difficult. See also Steven Hill, ‘Vote system gave Hamas huge victory’, Hartford Courant, February 8, 2006; The Prague Post, February 15, 2006.

Elections are one way to determine who the leaders will be. This method is more peaceful than fighting it out, more credible in modern times than
claims of divine favor, and more systematic than estimating the loudness of noise made by various factions at an open-air meeting. Only transfer of power from parent to offspring can compete with elections in orderliness of procedure; and in the modern world, elections have become a more widespread practice. The supposed goal is to have the ‘people’ express their will.

By electoral system, we mean the set of rules that specify how voters can express their preferences (ballot structure) and how the votes are translated into seats. The system must specify at least the number of areas where this translation takes place (electoral districts), the number of seats allocated in each of these areas (district magnitude), and the seat allocation formula. All this will be discussed in more detail later.

This book deals only with elections that offer some choice. It bypasses fake elections where a single candidate for a given post is given total or overwhelming governmental support, while other candidates are openly blocked or covertly undermined. It also largely overlooks pathologies of electoral practices such as malapportionment and gerrymander, except for pointing out which electoral systems are more conducive to such manipulation. The physical conditions of elections matter, such as ease of registration of voters and candidates, location and opening times of polling stations, the timing of elections, and ballot design—see Mozaffar and Schedler (2002) and Reynolds and Steenbergen (2006). It is presumed in this book that such conditions of electoral governance are satisfactory. My only concern is to explain, in what are considered fair elections, how electoral systems affect the translation of votes into seats, how the results also affect the distribution of the votes in the next elections, and what it means for party systems.
Moreover, the book largely limits itself to first or only chambers of legislatures, except for one chapter on second chambers and supranational assemblies, plus incidental comments on presidential and local elections.

This scope may look narrow, but translation of votes into seats by different electoral systems can lead to drastically different outcomes. We already saw what it meant for Palestine. Also, with a different electoral system (and traditions in applying it), a mere 36.3 percent of the total vote would not have made Salvador Allende president of Chile in 1970, and Chile’s history could have taken a very different course. Around 1930, the vote shares of the British Liberals and the Icelandic Progressives were practically the same: 23.4 percent for the Liberals in 1929 and 23.9 percent for the Progressives in 1933. But the rules for
Figure 1.1. The opposite impacts of electoral systems and current party politics
[Diagram not reproduced. Box labels: Electoral systems, Seats distribution, Votes distribution, Politics & parties, Political culture.]
allocating assembly seats on the basis of popular votes differed. The Icelandic Progressives won 33 percent of the seats and played a leading role in the country’s politics, while the British Liberals won less than 10 percent of the seats. The resulting disappointment affected the votes in the next election and sent the Liberals down to near-oblivion. Thus, electoral systems can sometimes make or break a party—or even a country. In less spectacular ways, they affect party strengths in the representative assembly and the resulting composition of the governing cabinet. They can encourage the rise of new parties, bringing in new blood but possibly leading to excessive fractionalization, or they can squeeze out all but two parties, bringing clarity of choice but possibly leading to eventual staleness. It is well worth discovering in quantitative detail how electoral systems and related institutions affect the translation of votes into seats.
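The size of these distortions can be put in a single number. The following is a minimal sketch, assuming a plain ratio of seat share to vote share as the measure; the function name `seat_vote_ratio` is illustrative and is not a term defined in this chapter. It uses only the rounded percentages quoted above. For the British Liberals the text gives only 'less than 10 percent' of the seats, so 10.0 serves as an upper bound.

```python
def seat_vote_ratio(seat_share_pct: float, vote_share_pct: float) -> float:
    """Return seat share divided by vote share (both given in percent).

    A value above 1 means the rules over-rewarded the party;
    a value below 1 means they under-rewarded it.
    """
    if vote_share_pct <= 0:
        raise ValueError("vote share must be positive")
    return seat_share_pct / vote_share_pct


# (seat share %, vote share %) as quoted in the text, rounded.
examples = {
    "Hamas 2006 (list votes)": (70.0, 45.0),
    "Icelandic Progressives 1933": (33.0, 23.9),
    # Assumption: 10.0 is only an upper bound for the Liberals' seat share.
    "British Liberals 1929 (upper bound)": (10.0, 23.4),
}

for party, (seats, votes) in examples.items():
    print(f"{party}: ratio = {seat_vote_ratio(seats, votes):.2f}")
```

On these figures, two parties with almost identical vote shares end up on opposite sides of 1, which is the contrast the paragraph describes.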
Electoral Systems, Seats, Votes, and Party Politics

Figure 1.1 shows the opposite impacts of electoral systems and party politics on the distribution of seats and votes among parties. Electoral systems restrict directly the way seats can be distributed. In particular, when single-seat districts are used, only one party can win a seat in the given district. The impact on votes is more remote. When a party fails to obtain seats in several elections, it may lose votes because voters give up on it, or it may decide not to run in the given district. The impact on party system and hence on politics in general is even more remote. Still, if a party fails to win seats all across the country, over many elections, it may fold, reducing the number of parties among which the voters can choose.

The impacts of the existing party system and current politics are attenuated in the reverse direction. The total number of meaningful parties may be limited by the workings of the electoral system, but current politics determines which parties obtain how many votes. The impact of current
politics on the seats distribution is weaker, as the electoral system may restrict the number of parties that can win seats. Still, current politics determines which parties win seats. Finally, current politics has no impact at all on the electoral system, most of the time. Yet, infrequently, it has a major impact, when a new electoral system is worked out from scratch, or when protest against the existing electoral system builds up for any reason.

At all stages, political culture plays a role. The same electoral laws play out differently in different political cultures, shaping different party systems. Along with the initial party system, political culture shapes the adoption of electoral laws. If stable electoral and party systems succeed in lasting over a long time, this experience itself can alter the initial political culture—a connection not shown in Figure 1.1. This book mentions political culture rarely, but not because I underestimate it. I just do the relatively easy things first, and political culture is harder to tackle.

In the study of current politics, votes come first, and seats follow—the arrows at the top of Figure 1.1. This direction may look natural, but it is reversed when we study the impact of electoral systems. Now seats are restricted directly, and restrictions on votes follow in a slow and diffuse way—the arrows at the bottom of Figure 1.1. Recognition of such reversal is essential for elucidating the impact of electoral systems.
The Limiting Frames of Political Games

Politics takes place in time and space—both the immutable physical space and the institutional space that politics can alter, but with much inertia. The physical size of polities matters for their functioning, as stressed early on by Robert Dahl and Edward Tufte (1973). Institutional size also places constraints on politics. For instance, in a five-seat electoral district, at least one party and at most five parties can win seats. Within these bounds, politics is not predetermined, but the limiting frame still restricts the political game. It is rare for one party to win all seats in a five-seat district, while such an outcome is inevitable in a single-seat district. This observation may look obvious and hence pointless, but it will be seen that it leads to far-reaching consequences.

Institutions are containers within which the political processes take place. Containers matter. True, the content matters more, and containers do not decide what is poured into them. But if they leak, crack, overflow,
or corrode, they do affect the outcome. Indirectly, they even affect the content, because one learns from experience not to pour, for instance, the proverbial new wine into old wineskins. It would be a false dichotomy to ask whether containers matter or not. It is a question of how much they matter, and how. So it is with political institutions. An excellent institutional framework cannot compensate for a flawed political culture, but inadequate institutions can make it worse. Such a risk is high when political culture is corrosively intolerant and does not value cooperation and compromise. To maximize stability, institutions should be congruent with political culture, to use Harry Eckstein’s terminology (Eckstein 1966, 1998), but not so congruent as to help perpetuate an undemocratic culture. Electoral systems are part of such institutions.
Electoral Systems and Party Systems

Here I use ‘electoral systems’ with some hesitation. In systems theory, a system divides the world into external and internal, and it has some capacity to restore internal equilibrium when disturbed by external factors. If so, then one could speak of an electoral system only when the electoral rules have been embedded in a political culture where voters and politicians have acquired reasonable skills in handling the rules to their enlightened self-interest, which includes most actors’ long-term interest in preserving a modicum of stability. Such skills are based on experience. A set of electoral rules can be promulgated as laws overnight, but it takes several electoral cycles for politicians and voters to learn how to handle these laws to their best advantage. Hence electoral rules become a stable limiting frame for the electoral game only when they have been used a fair number of times.

In this light, should we define ‘electoral systems’ as not only a set of rules but also include the skills people exert in using them? There is some merit in such a definition (Taagepera 1998a), but it also leads to new difficulties in telling electoral and party systems apart. Therefore, I adhere to the generally accepted definition of electoral system as the set of rules that govern ballot structure and seat allocation.

Electoral system thus defined is inextricably intertwined with party system. Even the earliest election in a new democracy is bound to take place in the context of some constellation of proto-parties, but to talk of a party system truly serves a purpose only when some degree of stability
has set in regarding the identity, size, and interaction of parties. Early party constellations are often kaleidoscopic configurations of individual politicians, devoid of anything akin to a system. During early democratization, major parties may vanish completely and new ones may arise. Thus, the early party constellations can be even more fleeting than a kaleidoscope, where at least the pieces remain the same (Grofman, Mikkel, and Taagepera 2000). Such party constellations become a party system only slowly.

What is involved in a party system? It is more than just the number and sizes of parties. It also includes their interactions. Peter Mair (1997: 214–20) offers two convincing examples of decoupling between parliamentary strengths of parties and their interaction patterns regarding government formation and maintenance. A long-standing feature of the Irish party system was Fianna Fail’s refusal to engage in coalition cabinets, which constrained the voters to vote either for Fianna Fail and single-party cabinets or for coalition cabinets by ‘The Rest’. When a variety of reasons induced Fianna Fail to participate in coalitions, starting in 1989, the entire pattern of possible combinations expanded. The party system changed without any change in the electoral system or any appreciable shift in the seat shares of parties.

Denmark is Mair’s contrary example (1997) of a party system remaining the same despite shifts in party strengths. Instead of the previous 5 parties, 10 parties won seats in 1973, and the combined vote share of the established 5 dropped from 93 percent to 65. Yet the interaction pattern of parties changed little, as another minority cabinet replaced the previous one. Note, however, that the electoral system did not change either—only the electoral outcomes did. The fact that a party system also involves interaction patterns among parties does not do away with the importance of the number and sizes of parties.
A two-party system offers inherently different options, compared to a multiparty system. The shift in Ireland 1989 and the non-shift in Denmark 1973 both played themselves out within the usual range of options available in multiparty systems. When describing party systems, it would be needlessly limiting to claim either that only party sizes matter or, conversely, that party sizes do not matter at all.

The initial electoral system can play a major role in determining the party system, but it is not the only factor. This book does focus on the impact of electoral systems on representation and party systems, because the workings of electoral systems are relatively well understood qualitatively and also with some quantitative rigor. But we should remember
that historical and cultural factors may produce a different party system on the basis of the same electoral system.

Electoral systems affect politics, but they are also products of politics. Political pressures can alter them. This is well known, but after an initial bow to this two-way causality, most researchers treat electoral systems as causes of party systems rather than results. Consider the famous Duverger’s law (to which I will return), saying that plurality rule for seat allocation tends to produce a two-party system. How often does this allocation rule produce a two-party constellation, and how often does it result from a preexisting two-party constellation? Indeed, if the dawn of democracy in a given country finds the decision-makers divided into two parties, these parties may wish to choose the plurality rule so as to block entry of new competitors. If, on the contrary, the initial decision-makers are split into many parties, they may wish to play it safe and adopt proportional representation (PR) so as to reduce their risk of total elimination. Only recently has this issue been addressed systematically (Boix 1999; Benoit 2002, 2004; Colomer 2005). Party constellations do tend to precede and determine the electoral systems. Once in place, though, the electoral system helps to preserve the initial party constellation and to freeze it into a party system. To avoid causal implications in either direction, we may reword Duverger’s law: ‘Seat allocation by plurality rule tends to go with two major parties.’
Chess Rules and Electoral Rules

Electoral laws establish the rules for how the electoral game is carried out and how the winners are determined. In this, they are somewhat akin to chess rules (Taagepera 1998a). But there is one marked difference. Chess rules are extraneous to the game, while electoral rules are interwoven with the game.

In his classic Fights, Games, and Debates (1960), Anatol Rapoport imagines going to a statistics-oriented person to analyze chess. The latter reports items like the distribution of duration of games and the attrition rates of chess figures at successive moves. Rapoport, however, mumbles: ‘But is this what we want to know about chess?’ In particular, does this enable us to play better chess? Guess not.

Still, such statistical information on chess would be of interest. It certainly would, if proposals arose for changing the chess rules. Would the change make the game boringly long or, to the contrary, awkwardly
short? But even then, the rules would not be part of the game. Before sitting down at the chessboard, a player will not negotiate for fairer rules to prevail on the chessboard, threatening otherwise with boycott. No player will declare that, if s/he wins, s/he will change the rules. These rules are quite constant in space and time. They define the game rather than being part of the game. The loser cannot claim that the rules were biased. Electoral rules also define the game, but they are part of it. They vary in space and time. Losers can blame them, and at times do. Change in electoral rules can be part of an election platform. Because these rules can be changed through political processes, the statistical and logical analysis of the properties of electoral systems is part of the study of politics, while the study of the consequences of various conceivable chess rules is not part of learning chess. This is not to deny the strategic aspects of politics, which are subject to game-theoretical approaches and conditioned by political culture and various path-dependent factors. One need not even claim equal importance for institutional aspects and for electoral systems in particular. They are merely the limiting frames for political games. A good electoral system cannot save a polity where many other institutions, attitudes, and policies have broken down. And on the other hand, a healthy polity can find ways to compensate for a poor electoral system. However, an inadequate electoral system can contribute to crisis in the case of shaky polities—and most polities have their fragile aspects and periods. My approach to electoral systems is very much in line with what made Rapoport ask ‘But is this chess?’ For chess, the response would be ‘No’, but for the study of politics, it is ‘Yes’, because here the rules are themselves part of the game.
The Study of Electoral Systems

Within political science, electoral studies are a relatively mature field of study. They are located at the core of political science:

Although there are many concerns of political science that do not center around elections, the study of democratic practices—to which elections indisputably are central—is certainly one of the most crucial topics for the discipline as a whole. The study of elections is more than the study of electoral systems, and the study of electoral systems is more than ‘seats and votes’, but the numerical values of seats and votes for individual political parties and candidates are among the most important quantitative indicators that we, as political scientists, employ in our work. (Shugart 2006)
For political scientists, electoral laws offer a further attraction: the possibility of institutional engineering. For the given votes, one can calculate the extent to which different electoral laws would have altered the composition of the representative assembly, and one can propose changes in laws. Of course, under different laws voters may have voted differently. For instance, a shift from plurality to PR may encourage voters to shift to third parties. Such tendencies also must be taken into account. Actually, fundamental changes in electoral laws are infrequent, because they usually require agreement by representatives chosen under the old laws—and why should they change laws that served them well in getting elected? Still, electoral laws may well be more conducive to institutional engineering than institutions firmly stipulated in constitutions, not to mention political culture. The quantitative nature of many features of electoral systems—the numbers of seats and votes, precise allocation algorithms, and the like— may attract those political scientists who yearn to discover quantitative regularities akin to those that have paid off in natural sciences. For the same reason, electoral studies may repel those who consider the study of politics an art rather than a science, or at most a science that thrives on richness of details rather than broad generalizations—zoology rather than molecular biology. Students of politics are largely reduced to nonrepeatable observations in vivo instead of repeatable in vitro laboratory tests. Hence any general scientific laws in politics, if they exist at all, are bound to be hidden, submerged underneath considerable random scatter in data. This scatter may easily be construed as absence of general laws. 
This book, however, presents evidence that logical models can be constructed in the context of electoral systems, and that they lead to specific quantitative predictions, which are confirmed empirically by the averages of many elections, and even more by averages of many electoral systems.

This book first reviews the typology of electoral systems and introduces some analytical tools that make a comparative study of electoral systems possible. This overview of rules and tools is a prelude to the ‘Duvergerian agenda’ which has dominated the electoral studies for the last half-century—the attempt to express the impact of the main features of electoral systems on representation and party system.
The central part of the book presents recent advances in the macroscopic aspect of the Duvergerian agenda. These advances help us understand the logic of simple electoral systems to the degree that specific quantitative predictions can be made for the average of many elections carried out under the same rules. Individual elections, of course, can vary wildly, just as daily weather can vary within a well-defined climatic pattern. All this applies to simple systems. We are still far from being able to predict in detail the impact of complex electoral systems, but we have made marked progress during the last few decades. In this light, the final part of the book broadens the agenda and asks: What can we expect from electoral laws? It briefly describes advances in studying more complex electoral systems and lays out the agenda for extending our predictive ability from simple to complex systems. Once we have a firm grip on the impact of institutions, we can separate it from the impact of political culture and study the latter in relative isolation from confusing side effects. We are then in a position to attempt to design electoral laws so as to obtain specific average outcomes—and also to have a sober awareness of the limits to our ability to design. The aforementioned quantitative nature of many features of electoral systems enables us to build and test logical quantitative models more extensively than has been the case in other studies of politics. Can we transfer some of the methodology developed for electoral systems? The book concludes with this provocative issue: To what extent can electoral studies supply a ‘Rosetta Stone’ to some other parts of political science?
Part I Rules and Tools
2 The Origins and Components of Electoral Systems
For the practitioner of politics:
• When choosing an electoral system, a main trade-off is between decisiveness of government and representation of various minority views.
• As long as you keep the electoral system simple, its average effect can be predicted to a fair extent, and this book is of some use.
• When electoral systems are made complex, no one can predict their actual workings—and you kid yourself if you think that you can.
• Try remembering the future. Do not push for laws that favor large parties just because your party is large now—it may shrink.
• Do not change electoral laws frequently. Allow an understanding to develop of how the electoral system works.
Elections are one way to determine who the leaders will be. But who determines what the rules for elections should be, and what are the options? How well does the resulting electoral system satisfy the original intent? The choice of electoral system is affected by so many contradictory concerns that the choices made in specific historical instances could hardly have been predicted, although some outcomes were more likely than some others (cf. Colomer 2004b). The path from devising electoral laws to a mature understanding of how the resulting electoral system works is also fraught with uncertainties.

The first third of the book deals with ‘rules and tools’. It describes the variety of electoral rules devised that combine into electoral systems used in various countries. It also presents the analytic tools needed to measure inputs, such as the total number of seats, and outputs, such as the number of parties.
Basic Choices

The main concern is balance between decisiveness of government and representation of various minority views. Electoral systems that push toward a two-party system and hence one-party cabinets may promote decisiveness of government. This outcome has been claimed for seat allocation by plurality in single-seat districts, often designated as first-past-the-post (FPTP). The desire for maximal PR, in contrast, is best satisfied by a PR seat allocation rule applied nationwide. One can have both one-party cabinets and PR only if the political culture spontaneously develops just two parties of any appreciable size. This was the case in post-World War II Austria, despite its PR rules, but it occurs rarely. On the other hand, some political cultures may miss out both on decisiveness and on proportionality, having many small parties and yet large deviations from PR.

One may also be concerned about party cohesion, which is weakened by some electoral systems, about voters having a personal representative, about regional, ethnic, and women’s representation, and so on. Colomer (2004b) stresses the desire to avoid the worst possible outcome for the largest party. If this party feels safe against electoral reverses, it may push for FPTP. But if it feels insecure, it may opt for PR as insurance against catastrophic loss.

In new democracies, two considerations emerge stronger than in the established ones. One is legitimacy of electoral laws. If these laws are perceived as unfair, for whatever reason, founded or unfounded, then democracy is in trouble. The other aspect is the cost of elections, both in terms of money and expert labor. Some electoral systems are appreciably costlier than some others, and new democracies, in particular, are often strapped for funds and skilled administrators (Reynolds, Reilly, and Ellis 2005).
Electoral Laws Are Often Chosen or Changed in a Messy Way

Electoral laws are made by humans. They depend on the constellation of political forces and opinions at the time they are adopted. A founding assembly consisting of one or two major blocs may prefer FPTP so as to
freeze out newcomers, while a fractured assembly may pick some form of PR so as to enable all groupings to survive (Colomer 2004b, 2005). Yet many other factors and concerns also enter.

It may look like hard-boiled realism to declare that self-interest of the original decision-makers determines the choice of electoral laws when democracy is introduced (or reintroduced). However, such a claim retroactively explains away whatever the outcome happens to be, and hence it explains nothing (Taagepera 1998a). People decide on what is in their interest on varied and sometimes fleeting grounds. Winning the next election is a major concern, but it can conflict with long-term interests (including preservation of stability), ideological preferences (including advice by foreign advisers who belong to the same ideological strain), and the force of habit. Which of these will overrule the others in defining ‘self-interest’?

The means used to achieve one’s presumed self-interest can be misinformed and hence counterproductive (Kaminski 2002; Andrews and Jackman 2005). Thus, during the liberalization processes in the Soviet-dominated area of the late 1980s, the old communist regimes preferred to keep the Soviet electoral rules, which favor the largest party even when applied honestly. The Communists did so not only by force of habit but also because they expected to be the largest party. It turned out to be a catastrophic misjudgment in many countries. In Palestine 2006, Fatah may have made the same miscalculation.

The predominant forces may stick to the pseudo-democratic election rules inherited from the preceding political regime, either because they are unaware of the alternatives or because they rationally try to balance the merits of the existing rules against the costs and risks of innovation. Thus, most ex-British colonies adopted the British FPTP.
Little did they realize that what produces a fairly balanced two-party representation in the British House of Commons of some 600 members can produce lopsided one-party predominance in the 20-seat assembly of a small island nation. Such nations often ended up with a decimated parliamentary opposition. Similarly, several post-Soviet states maintained the Soviet electoral rules for a while, which required high participation, allowed voting against all candidates, yet required an absolute majority to win. What formally worked in Soviet one-candidate pseudo-elections led in independent Ukraine to interminable repeat elections, with participation ever decreasing. Some seats remained vacant permanently.

Countries may return to a tradition interrupted by dictatorship or foreign occupation. Thus Zambia’s Third Republic returned to the rules of the
multiparty First Republic, after the de jure one-party state of the Second Republic. Earlier tradition itself may offer contradictory options. Thus, Estonia’s choice in 1992 was influenced both by its ultra-proportional rules of the 1920s and by the disproportional rules adopted in 1938, in reaction to excessive multipartism. Such reaction to undesirable aspects of the existing electoral system is a major motive for changing it. Time pressures may be less than during the original introduction of democracy, but overreaction to the existing system may cause an excessive shift in a different direction. The role of random happenings should not be ignored. As Nigel Roberts (1997) asks regarding New Zealand: ‘What would have happened if David Lange had not made an inadvertent pledge during the 1987 election to hold a binding referendum on the question of electoral reform?’ But for this irretrievable slip of the tongue, the ball might not have started rolling. If this could happen in stable New Zealand, then how often may the choice of the initial electoral rules in new democracies have been decided by who happened to be at which meeting, and in what mood? With 20-20 hindsight, one can always invoke ‘self-interest’ so as to perfectly explain away this conglomeration of desire to win, yet follow tradition, avoid new thinking and information gathering, satisfy foreign ideological sponsors, and maintain some idealistic concern about future stability—all this combined with miscalculation and chance happenings. The process of determining the electoral laws often starts with competing simple formats being proposed, for example FPTP or nationwide PR. A compromise between the two may be negotiated, for example the ‘Mixed Member Proportional’ (MMP) system in West Germany in the late 1940s. If one of the basic formats carries, the losing side may try to introduce amendments. 
For instance, if nationwide PR promises representation to tiny parties, a legal threshold of some percentage of votes may be proposed so as to block them. Regional parties, however, who are major players in their respective regions, may oppose such a limitation, aimed chiefly at tiny nationwide parties. It may then be decided that the legal threshold does not apply to parties that satisfy certain local requirements. In the course of such wrangling, a superficially strong stipulation may be gutted by subtle further additions. Blocking and enabling measures may reach such complexity that no one can predict the actual pattern of outcomes. At this point, the electoral systems expert might well give up, but opposing politicians may still believe that they have outfoxed each other. The actual pattern, as it takes shape during several elections, may not satisfy anyone.
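The interplay of a blocking rule and an enabling exemption can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the party labels, the vote counts, and the 5 percent figure are all hypothetical, and the 'local requirement' is reduced to a plain list of exempt parties, which real laws would define in far more detail.

```python
def parties_passing_threshold(national_votes, threshold_pct, exempt=()):
    """Return the set of parties allowed to take part in seat allocation.

    national_votes: dict mapping party name -> nationwide vote count.
    threshold_pct:  legal threshold, in percent of the nationwide vote.
    exempt:         parties excused from the threshold (assumption: they
                    are simply listed, standing in for a local requirement).
    """
    total = sum(national_votes.values())
    passing = set()
    for party, votes in national_votes.items():
        share_pct = 100.0 * votes / total
        if share_pct >= threshold_pct or party in exempt:
            passing.add(party)
    return passing


# Hypothetical nationwide result, 1,000 votes in all.
votes = {"A": 480, "B": 390, "C": 80, "Regional": 30, "Tiny": 20}

print(parties_passing_threshold(votes, 5.0))
print(parties_passing_threshold(votes, 5.0, exempt={"Regional"}))
```

Even in this toy version, one added clause changes who is blocked, which hints at how quickly several stacked clauses become hard to foresee.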
If so, what opportunity does all that leave for supposedly rational advice by neutral experts on electoral systems? It is not up to the experts to question the motivational basis of the desired outcome. They can only help avoid misconceived means to reach the desired ends. They can ask ‘Which kind of results do you want?’ and then point out to what extent the rules under consideration may ensure or defeat the stated goals. To some extent, one can design for a two-party system and the long-lasting cabinets that tend to result. One can also design for maximal PR, at the cost of relatively short-lived multiparty coalition cabinets. But it is almost impossible to design simultaneously for near-perfect PR and long-lasting cabinets. And only simple electoral systems lead to somewhat predictable party systems.
Components of Electoral Systems

Elections can apply to one position (president), a few (local council) or several hundred (parliament). Voters may have to voice unqualified support for one or several candidates (‘categorical ballot’), or they may be able to rank candidates (‘ordinal ballot’). Details of electoral laws that have been or could be used are given in Chapter 3. Here the basic choices are outlined.

Some of the most fundamental choices that pertain to elections are outside the electoral laws as such. Every democratic country needs electoral laws for the first or only chamber of its legislative assembly. If a presidential regime is chosen, it also needs laws for presidential elections. If a two-chamber assembly is chosen, both chambers need election or selection rules. Some aspects of electing a president or selecting a second chamber are addressed in later chapters. Here I consider only the inevitable first or only chamber.

The first question is: How many seats should such a chamber have? Large countries are almost bound to have more seats (in line with a logical relationship presented in Chapter 12), but there is some leeway. Given that smaller assemblies offer less room for variety, the choice of assembly size (S) affects the chances of smaller parties. For the given election, assembly size is usually fixed in advance, but in some systems it fluctuates slightly, depending on the outcome of the election.

The next question is: Into how many electoral districts should the country be divided? Electoral districts mean the areas within which popular votes are converted into assembly seats. The number of seats allocated
within a district is called district magnitude (M). It is arguably the single most important number for election outcomes. One can have numerous single-seat districts where M = 1, or fewer multi-seat districts where M > 1. The limit is one nationwide district where district magnitude equals assembly size: M = S. All districts need not be of equal magnitude, and overlapping districts also occur. The next or rather concurrent issue is the seat allocation formula within the district. It is tied in with ballot structure. The voter may be given one or more votes. If only the first preferences are taken into account, the voter is asked to cast one or several categorical votes (categorical ballot). If second and later preferences are also taken into account, the voter is asked to rank the candidates (ordinal ballot). The allocation formula stipulates how the resulting votes are to be converted into seats. At the one extreme, all seats in the district may be given to the party with the most votes (plurality rule). At the other extreme, one could use a PR formula that favors the smallest parties and takes into account second preferences and support for specific candidates. Assembly size, district magnitude, and seat allocation formula (plus the corresponding ballot structure) are the three indispensable features regarding which a choice cannot be avoided, if one wants to allocate seats on the basis of votes. Further features can be added, such as legal thresholds for minimum representation. A mix of district magnitudes and allocation formulas can be introduced. Several rounds of voting can be used. Several tiers (levels) of seat allocation can be used, going beyond the basic districts. Such additions are frosting on the cake rather than indispensable ingredients. Seats can be allocated on the basis of votes for party lists or votes for individual candidates. The two options can also be mixed. 
Instead of having only the choice of party lists (closed lists), voters may have the option to voice preferences for one or more candidates on a list (open lists), or they may even be required to vote for a specific candidate. The so-called panachage (literally, cocktail) may even enable them to vote for a list but also mix in candidates from other lists.

In sum, choosing an electoral system involves three inevitable choices (S, M, and allocation formula) and numerous optional ones. The ways to combine and mix them are infinite in principle and extremely numerous in practice.

One can promulgate electoral laws, but the resulting party system may differ from the expected. Thus, with FPTP, most voters in most countries tend to vote for the two largest nationwide parties, but in some
countries regional parties may subsist, or there may be large numbers of successful independents.

Single-seat districts may look simple, but they still offer several choices for seat allocation, to be discussed in Chapter 3. In multi-seat districts the options multiply. Seat allocation can be made on the basis of votes for individual candidates or votes for party lists. Voters can have only one vote or as many as M, the number of seats at stake. If votes are for individual candidates, transfer of votes among candidates may be possible, when second preferences are marked on the ballot. When party lists are used (usually with one vote per voter) the basic choice is between plurality rule and one of the many PR seat allocation formulas.

Multi-seat plurality favors the largest party nationwide. This advantage is already marked in the case of single-seat districts, but it grows with district magnitude. With a single nationwide district (M = S), this advantage becomes absolute: The largest party wins all the seats in the assembly. Because of this lopsided advantage, multi-seat plurality is rarely used in districts of more than 2 or 3 seats.

The effect of magnitude is reversed when a PR formula is used. One comes closest to ideal PR when the entire country forms a single huge district. Here, a decreasing district magnitude increases the large party advantage and hurts the small parties. Proportionality is the least when a PR rule is applied in a single-seat district. Here, the plurality and PR rules meet and lead to the same outcomes. Indeed, ‘single-seat plurality’ could as well be called ‘single-seat PR’! This is why I prefer to designate it as FPTP, a relatively neutral term between plurality and PR.

Figure 2.1 shows the overall picture for party lists in single- and multi-seat districts. The contrast between plurality and PR allocation rules is extreme for a nationwide single district (M = S).
Here plurality rule would assign all S seats in the assembly to the winning list, while PR rules would produce highly proportional outcomes. As the electorate is divided into increasingly smaller districts (M < S), the contrast between the outcomes of plurality and PR rules softens, until they yield the same outcome in the case of single-seat districts (M = 1).
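The way a PR rule converges on the plurality outcome as districts shrink can be checked with a few lines of code. What follows is only a sketch under stated assumptions: it uses d'Hondt divisors (one of the PR formulas described in Chapter 3) as the PR rule, and the vote shares of 46, 28, 17, and 9 percent are hypothetical and assumed to be the same in every district. The names `dhondt`, `VOTES`, and `largest_party_seat_share` are illustrative and do not come from the text.

```python
from fractions import Fraction

def dhondt(votes, seats):
    """Allocate `seats` among party lists by d'Hondt divisors 1, 2, 3, ...

    Exact ties go to the list that appears first (an arbitrary choice
    made for this sketch only).
    """
    alloc = [0] * len(votes)
    for _ in range(seats):
        # A list's current quotient is its votes / (seats already won + 1).
        quotients = [Fraction(v, a + 1) for v, a in zip(votes, alloc)]
        winner = quotients.index(max(quotients))
        alloc[winner] += 1
    return alloc

# Hypothetical vote shares (percent) of four party lists.
VOTES = [46, 28, 17, 9]

def largest_party_seat_share(magnitude):
    """Seat share (percent) of the largest list in a district of magnitude M."""
    return 100 * dhondt(VOTES, magnitude)[0] / magnitude

if __name__ == "__main__":
    for m in (1, 5, 10, 100):
        print(f"M = {m:3d}: seats = {dhondt(VOTES, m)}, "
              f"largest list holds {largest_party_seat_share(m):.0f}% of seats")
```

With M = 1 the PR formula hands the only seat to the list with the most votes, exactly as FPTP would, while at large M the seat shares in this example approach the vote shares.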
Figure 2.1. When district magnitude changes, PR and plurality rules affect largest party bonus in opposite ways
[Schematic. Both rules start from FPTP (M = 1), with a marked largest party bonus. Under the plurality rule the bonus grows with magnitude until, at M = S, all seats go to the largest party. Under the PR rule the bonus shrinks as magnitude grows, leaving hardly any largest party bonus at M = S.]

Simple Electoral Systems

The more complex the laws are, the more we are in uncharted territory, for several reasons. Voters may react to complex laws in different ways. Also, the more complex the laws, the fewer past cases with similar laws we have in the world, so as to draw empirical lessons. Lastly, logical predictive models also become so complex that they can offer little guidance.

The simplest family of electoral systems is the one where a total of S seats are allocated to closed lists in a single round, in districts of equal magnitude M and according to a standard PR formula. For M = 1, it boils down to FPTP. Semi-proportional or plurality formulas do not count as simple. Two parameters, M and S, largely suffice to specify a simple system. The electoral formula also affects the outcome, especially when M ranges from 2 to 5. Still, changes in magnitude matter markedly more (as is shown in Chapter 6).

Apart from FPTP elections that involve no primaries (e.g. UK), perfectly simple electoral systems are rare. However, the specific impact of electoral systems on the translation of votes into seats is easiest to investigate for simple systems. This is what this book focuses on.
How Easy Should It Be to Alter the Electoral Laws?

Should the electoral laws be specified in the constitution, making them hard to change? Or should they be regular laws that the national assembly can alter fairly easily? It depends on how easy it is to change the constitution. Practices vary, and consistency may not prevail even within the same country. Thus the Estonian Constitution of 1992 goes into detail on the procedure for electing the figurehead president but specifies only a vague ‘principle of proportionality’ for the election of the real power center, the national assembly. Similar vagueness has led to heated debates in the Czech Republic on how semi-proportional the electoral formula
can become without violating the constitutional norm of proportionality (Novák and Lebeda 2005). The more the constitution spells out the details of the electoral system, the more difficult it becomes to change electoral laws that have proved inadequate. On the other hand, excessive ease can lead to opportunistic changes by an unpopular government that hopes to soften the blow at the next elections. Such was the case in France in 1986, where the long-standing Two-Rounds rule in single-seat districts was replaced by multi-seat PR, only to be reintroduced immediately by the incoming majority. If electoral laws are changed too frequently, no stable pattern has time to develop. Maybe it should take at least a 55 or 60 percent majority to change the electoral law, or a simple majority in two successive assemblies (Arend Lijphart, private communication).
On Terminology

Varying terminology continues to plague electoral studies. Districts are sometimes called constituencies. Some traditional texts talk of single-member and multi-member districts. However, electoral rules allocate seats or memberships to candidates or parties; they do not allocate ‘members’ as such. It is hence more logical to talk of single-seat and multi-seat districts.

The terms electoral ‘rules’, ‘formulas’, ‘formats’, ‘laws’, ‘arrangements’, ‘systems’, and ‘design’ have at times been used on the very same page, almost as synonyms, but not quite. One might talk of the ‘FPTP rule by which the major party dominance is enhanced’, or the ‘FPTP laws according to which the major party dominance is enhanced’, or the ‘FPTP system in which the major party dominance is enhanced’. Different authors have used these and other terms in slightly different senses. Therefore, their definitions for the purposes of this book should be given. My definitions need not be more functional, but they clarify what the terms mean here.

The focus of this book is on the effect of those rules by which votes are translated into seats and also the rules on how voters can express their votes: categorical or ranked ballot, number of votes per voter, open or closed list, etc. Apart from these ballot structure and seat allocation rules, there are other rules governing the process of elections in a broader sense: voting rights, registration of voters, calling the election, candidate nomination, campaigning, advertising, opinion polls, and distribution of polling places (Farrell 2001: 3; Reynolds, Reilly, and Ellis 2005: 5). The
study of such rules is outside the scope of this book, even while I recognize their impact on the votes. Hence, ‘electoral rules’ or ‘electoral system’ stands in this book for ballot structure and seat allocation rules, unless otherwise specified. I use ‘seat allocation formula’ when this is what I mean, rather than ‘electoral formula’, which sometimes has been used as a synonym for electoral rules in a broader sense.

In individual countries, the rules that form the ‘electoral system’ find expression in ‘electoral laws’ specific to the country. Features that form part of the same law in one country may belong to separate laws in another. For analytic purposes, we overlook such differences: we compare the workings of similar ballot structures and seat allocation rules across countries, regardless of the legal formats. The electoral system is embedded in the electoral laws of the particular country. I take ‘electoral arrangements’ or ‘format’ to mean the way the electoral system is embedded in laws, but I rarely use these terms. Finally, I use ‘electoral design’ to mean future-oriented institutional engineering. It is not a mere synonym for existing system or laws.

In previous literature, the consensus has been that ‘electoral system’ stands for a set of rules that are mutually consistent and completely specify the ballot structure and seat allocation. Explicitly or implicitly, this applies to Lijphart (1994: 1, 7), Farrell (2001: 3), Colomer (2004b: 3), Reynolds, Reilly, and Ellis (2005: 5), and Taagepera and Shugart (1989: xi). Farrell (2001) uses ‘electoral laws’ for what I designate as rules for elections in the broad sense, regulating everything from voting rights to opinion polls. Among these laws, there is one ‘set of rules which deal with the process of election itself: how citizens vote, the style of ballot paper, the method of counting, the final determination of who is elected . . . This is electoral system . . . ’ (Farrell 2001: 3).
In contrast to Farrell, authors like Lijphart (1994), Colomer (2004b), Reynolds, Reilly, and Ellis (2005), and Taagepera and Shugart (1989) avoid the term ‘electoral laws’.
3 Electoral Systems—Simple and Complex
For the practitioner of politics:
• To allocate seats to candidates or parties, laws must specify at least the following: the total number of seats in the assembly or its first chamber (assembly size), the number of seats allocated in each electoral district (district magnitude), how these seats are allocated (allocation formula), and how a voter can express her/his preferences (ballot structure).
• Assembly size depends strongly on population size.
• District magnitude can be as low as 1 (single-seat districts) or as high as assembly size.
• The simplest seat allocation formulas are d’Hondt and Sainte-Laguë divisors, and Hare quota plus largest remainders. For single-seat districts, these PR formulas are reduced to first-past-the-post, where the candidate with the most votes wins.
• With these formulas, the larger the district magnitude, the more proportional the seat shares are to the vote shares, and the more parties may be represented. The smaller the district magnitude, the larger the seat share of the largest party tends to be, and one-party cabinets become more likely.
• Optional features include legal thresholds, Two-Rounds elections, each voter having several votes, and voters ranking candidates.
• The advantages of complex and composite electoral systems may be real or imagined. Either way, they make it harder to predict the number of parties and the average proportionality of seats to votes.
At the minimum, a full set of electoral rules must stipulate the following: the total size of the representative body, the magnitude of electoral
districts, the seat allocation formula, and the corresponding ballot structure. To these, Lijphart (1994: 1) adds electoral threshold. However, no legal threshold needs to be stipulated (cf. Lijphart 1994: 11), and the ‘effective threshold’ inherent in district magnitude (to be discussed later) cannot be prescribed separately from district magnitude and seat allocation formula. As further factors, Lijphart (1994: 15) adds malapportionment, presidentialism, and apparentement. These three are discussed toward the end of this chapter. Farrell (2001), Colomer (2004b), and Reynolds, Reilly, and Ellis (2005) do not explicitly list the indispensable ingredients of electoral systems, nor did Taagepera and Shugart (1989). Assembly size depends strongly on population size. It is the most undervalued factor—it occurs in the subject index only in Taagepera and Shugart (1989) and Lijphart (1994). This chapter first describes the systems that are basic in that they do not add any further ingredients to the basic three, do not mix the basic formulas or systems, and apply the basic rules in a fairly uniform way. Thereafter, more complex or composite systems will be surveyed. I focus on the most widely used electoral systems. Farrell (2001) describes these and others in more detail, and Reynolds, Reilly, and Ellis (2005) indicate how they have worked out in specific countries. Colomer (2004b) presents a most thoughtful account on how and why the various approaches developed over history, with preferences shifting from unanimity and drawing by lots toward majority and then toward PR. The basic systems include what I call the simplest family of electoral systems (Chapter 2) and also some others that offer more elaborate allocation formulas or ballot structures, but without mixing them. 
In line with the scheme in Figure 2.1, the basic systems are divided into groups by district magnitude and by allocation formula: single-seat districts, multi-seat districts with plurality-oriented seat allocation formulas, and multi-seat districts with PR-oriented seat allocation formulas. The latter, in turn, may be party centered or candidate centered.
Single-Seat Districts (M = 1)

When the country is divided into single-seat districts, then the basic choices are two. The candidate with the most votes (plurality or ‘relative majority’) could be declared the winner. This is traditionally called the FPTP formula. Alternatively, absolute majority (more than 50 percent of the votes) could be required. In either case, the voter can be asked either to cast a categorical ballot for one candidate or to rank the candidates, as shown in Table 3.1.

Table 3.1. Possible seat allocation rules in a single-seat district

Ballot | Plurality | Majority
Categorical ballot (marking one candidate) | First-past-the-post | Two-Rounds
Ordinal ballot (ranking several) | Borda Count | Alternative Vote

In FPTP systems, the candidate with the most votes wins. This system is widely used in British-heritage countries and tends to produce or preserve a two-party system with a fair balance between government and opposition. However, assembly size can make a difference. Large countries with large assemblies such as the United Kingdom (S around 650) and India (S around 550) can have more than two significant parties in the assembly, while in small countries like Barbados (S = 26) one party tends to have about 70 percent of the seats, leaving a weak opposition with only 30 percent (Taagepera and Ensch 2006). The reasons are discussed in Chapter 8. Among the systems with single-seat districts, FPTP qualifies as a simple electoral system. All others are more complex.

In Two-Rounds (TR, ‘Second-Round Runoff’) systems, if no candidate reaches 50 percent of the votes, the two candidates with the most votes go into a second round of elections, where one of them is bound to reach 50 percent of valid ballots. Few stable democracies use this system (Birch 2003), but it is fairly widespread in Africa and Asia (see tables in Reynolds, Reilly, and Ellis 2005: 30–1). France requires 50 percent in the first round but only a plurality in the second (‘Two-Rounds Majority-Plurality’). The number of parties running in the first round can be large, but the second round tends to focus on only a few parties.

In Alternative Vote (AV, ‘Majority Preferential’, ‘Instant Runoff’), voters may or must rank all candidates. When the votes are counted, the weakest candidate is eliminated and his voters’ votes are transferred according to their second preferences. The process is repeated, if necessary, possibly involving some voters’ third and lower preferences.
When only two candidates remain, one of them is bound to have at least 50 percent. Alternative Vote has been used for nearly 100 years in Australia, where a two-and-a-half party system has taken root (Farrell and McAllister 2006). Instead of eliminating the least popular candidate (the one with the fewest first place votes), the most disliked candidate (the one with the most last place votes) could be dropped. This variant has not been used but has theoretical advantages (Grofman and Feld 2004).

In Borda count (BC), the ranked votes are weighted. With 4 candidates running, every first preference receives 3 points, every second preference 2 points, and every third preference 1 point. These weighted votes are added up, and the candidate with the most points wins. As Jean-Charles de Borda himself put it 200 years ago, it is a good system ‘only for honest men’ (Colomer 2004b: 30), because it is highly susceptible to strategic voting. Only two Pacific countries use variants of BC: Nauru and Kiribati (Reilly 2002).

Intermediary approaches are possible, especially with Two-Rounds. Recall that France has a majority-plurality system. In some intermediary approaches, winning in the first round may require only 40 percent of the votes or being sufficiently ahead of the next-ranking candidate (e.g. by 10 percent of the votes). All these options also apply to direct presidential elections, where M = 1 by definition.

The workings of these systems are illustrated in Table 3.2. There are 100 voters and 4 candidates, assumed to line up on the simplistic left-right ideological scale. When voters are asked to rank candidates, I will assume that their second preference is the candidate closest to their first choice. In the case of equal closeness, they are assumed to split their second preferences evenly between the candidates to the left and to the right of their first preference. Assume the first preferences are as shown in the first line in Table 3.2. The rest follows from these simplifying assumptions.

Table 3.2. Example of basic seat allocation options in a single-seat district

 | Left | Center-Left | Center-Right | Right | Total
First or only preference | 33 | 14 | 24 | 29 | 100
Second preference | 14/2 = 7 | 33 + 24/2 = 45 | 14/2 + 29 = 36 | 24/2 = 12 | 100
Third preference | 14/2 = 7 | 24/2 + 29 = 41 | 33 + 14/2 = 40 | 24/2 = 12 | 100
First-past-the-post | 33 wins | | | |
Two-Rounds, 2nd round | 33 + 14 = 47 | Eliminated | Eliminated | 29 + 24 = 53 wins |
Borda Count, total points | 120 | 173 | 184 wins | 123 | 600
Alternative Vote, 2nd stage | 33 + 14/2 = 40 | Eliminated | 14/2 + 24 = 31 | 29 | 100
Alternative Vote, 3rd stage | 40 | - | 31 + 29 = 60 wins | Eliminated | 100

Left has the most first preference votes and wins by the FPTP rule. By the Two-Rounds majority rule, the second round pits Left against Right, the centrist voters shift to their ideologically closest candidates, and Right wins. Borda count multiplies the first preferences by 3, second preferences by 2, and third preferences by 1, and then adds these points. Center-Right narrowly surpasses Center-Left and wins. Finally, by the Alternative Vote rule, the process is more complex. Since no one reaches 50 percent, the weakest candidate, Center-Left, is eliminated. His votes are reallocated according to the second preferences. Still, no one reaches 50 percent. The weakest candidate now is Right, narrowly surpassed by Center-Right, thanks to the boost of second preferences from Center-Left. With Right eliminated, her votes are reallocated according to the second preferences, and Center-Right wins comfortably, with 60 votes to 40.

Thus, depending on the seat allocation formula chosen, almost any candidate could win in this particular sample, chosen to illustrate the importance of the allocation formula. In most actual cases, many formulas yield the same result.

It is most important to realize that the actual election outcomes are not rule-blind. The seat allocation formula is known before the elections take place, and parties and voters will adjust. The outcome depends on how well they can coordinate (Cox 1997). When the rule is FPTP, then Center-Right would effectively play a ‘spoiler’ role, enabling Left to win. Hence, if the opinion polls offer a realistic idea of the relative strengths of the candidates, Center-Right might drop out so as not to split the right. If Right and Center-Right fail to coordinate in such a way in the first election, the Left victory may teach them to present a joint candidate in the next election. The latter step, in turn, will force Left and Center-Left to choose between presenting a joint candidate and facing sure defeat. This is how FPTP pushes the party system toward two dominant parties, as claimed by Duverger’s law, but it may take time, and exceptions outnumber the cases where a balanced nationwide two-party system develops. The other allocation formulas exert less pressure toward concentration.
In Two-Rounds, many candidates may continue to run in the first round and the losers may bargain with their support prior to the second round. In Alternative Vote, voters do not have to worry about playing a ‘spoiler’ role. The voter may express support for her/his favorite, even if the latter has no chance to win, and then mark as second preference the preferred one among the top candidates. However, suppose two rightist parties and only one leftist party run. If ranking is mandatory, leftist voters are forced to mark a rightist as their second preference. To give them another option, the leftist party may induce a weak centrist candidate to run. In BC, if the opinion polls enable the Left voters to anticipate the outcome, they may tilt victory to Center-Left by strategically marking
their third preference as Right, so that the total points for Center-Right drop to 151. Anticipating this Left ploy, Right may respond in kind, reducing the Center-Left points to 144. Or Right may even induce a fake leftist candidate to run so as to act as a spoiler. The resulting uncertainties are the main reason why BC is little used.
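The four single-seat rules applied in Table 3.2 can be reproduced with a short script. This is a sketch under the same simplifying assumptions as the worked example: the six ballot types below spell out the full rankings implied by the first preferences (33, 14, 24, 29) and by the rule that lower preferences go to the ideologically closest remaining candidate, with even splits. All function names are illustrative, and `left_ploy` models only the first strategic move described above (Left voters swapping their third and last preferences).

```python
from collections import Counter

CANDIDATES = ["Left", "Center-Left", "Center-Right", "Right"]

# (number of voters, full ranking), derived from the Table 3.2 assumptions.
BALLOTS = [
    (33, ["Left", "Center-Left", "Center-Right", "Right"]),
    (7,  ["Center-Left", "Left", "Center-Right", "Right"]),
    (7,  ["Center-Left", "Center-Right", "Left", "Right"]),
    (12, ["Center-Right", "Center-Left", "Right", "Left"]),
    (12, ["Center-Right", "Right", "Center-Left", "Left"]),
    (29, ["Right", "Center-Right", "Center-Left", "Left"]),
]

def tally(ballots, running):
    """Count each ballot for its highest-ranked candidate still running."""
    counts = Counter({c: 0 for c in running})
    for n, ranking in ballots:
        counts[next(c for c in ranking if c in running)] += n
    return counts

def fptp(ballots):
    counts = tally(ballots, CANDIDATES)
    return max(counts, key=counts.get), counts

def two_rounds(ballots):
    first = tally(ballots, CANDIDATES)
    leader = max(first, key=first.get)
    if 2 * first[leader] > sum(first.values()):
        return leader, first          # absolute majority in the first round
    finalists = [c for c, _ in first.most_common(2)]
    second = tally(ballots, finalists)
    return max(second, key=second.get), second

def borda(ballots):
    points = Counter({c: 0 for c in CANDIDATES})
    for n, ranking in ballots:
        for place, c in enumerate(ranking):
            # 3 points for first place, 2 for second, 1 for third, 0 for last.
            points[c] += n * (len(CANDIDATES) - 1 - place)
    return max(points, key=points.get), points

def alternative_vote(ballots):
    running = list(CANDIDATES)
    while True:
        counts = tally(ballots, running)
        leader = max(counts, key=counts.get)
        if 2 * counts[leader] > sum(counts.values()):
            return leader, counts
        running.remove(min(counts, key=counts.get))  # drop the weakest

def left_ploy(ballots):
    """Left voters insincerely swap their third and last preferences."""
    return [(n, [r[0], r[1], r[3], r[2]]) if r[0] == "Left" else (n, r)
            for n, r in ballots]

if __name__ == "__main__":
    for rule in (fptp, two_rounds, borda, alternative_vote):
        winner, counts = rule(BALLOTS)
        print(f"{rule.__name__:17s} -> {winner:12s} {dict(counts)}")
    print("borda after ploy  ->", borda(left_ploy(BALLOTS))[0])
```

Keeping full rankings rather than separate preference tallies is a design choice made for the sketch: one `tally` helper then serves FPTP, the runoff, and every Alternative Vote stage.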
Multi-Seat Districts: Overview

Seat allocation in multi-seat districts can be carried out on the basis of votes for individual candidates or votes for lists presented by parties (or other groups). A large number of combinations are possible, such as the following.

When voters vote for individual candidates, each of them may have only one vote or as many as M, the number of seats at stake. If they have several votes, they may or may not be allowed to place them all on the same candidate (cumulation). If voters have only one vote, they may be allowed (or even required) to rank candidates, making transfer of votes among candidates possible. This way the votes for the losing candidates and also the superfluous votes for the top candidates would not be ‘wasted’ but would help these voters’ second preference candidates. Finally, various forms of Approval Ballot permit voting for up to M candidates, or even more.

When party lists are used (usually with one vote per voter), the basic choice is between plurality, where the party with the most votes wins all the M seats in the district, and one of the many PR seat allocation formulas to be described later. Intermediary ‘semi-proportional’ formulas also exist.
Multi-Seat Districts with Plurality-Oriented Seat Allocation Formulas

One can run party lists in multi-seat districts and allocate all seats to the list with the most votes. Stable democracies have largely abandoned such Party Block Vote (PBV, ‘Block Ballot’), because it boosts the advantage of the largest party nationwide even more than FPTP, weakening the opposition to the point of making it ineffective.

In contrast to such a party-centered approach, one can formally ignore the existence of parties, focus on candidates, and give each voter as many
votes as there are seats in the district. Such Unlimited Vote (UV, ‘Unlimited Ballot’) enables a voter to spread her votes among, say, the candidates she deems the most honest, regardless of ideology or party. However, if there is strong party loyalty, UV becomes akin to PBV, decimating all opposition to the dominating party. Unlimited Vote is used in many local elections in the USA, which are formally run on a non-party basis.

One can alleviate major party dominance by allowing Cumulative Vote (‘Cumulative Ballot’), where the minority can load their votes heavily on a few candidates, or by shifting from UV to Limited Vote (LV, ‘Limited Ballot’), meaning that for M seats in the district, each voter has fewer than M votes (to be used with or without cumulating). ‘Multiple Non-Transferable Vote’ is a term that could mean either LV or UV. This was the ingredient that made Hamas the big winner in Palestine 2006 (cf. Chapter 1). As the number of votes per voter is reduced, the dominant party voters find it ever harder to hoard all the seats in the district. When the number of votes per voter is reduced to the square root of district magnitude (M^0.5), seat shares seem to become fairly proportional to the vote shares, although no theoretical proof to that effect seems to exist.

Approval Voting (‘Approval Ballot’) amounts to unlimited vote carried to the extreme: Vote for as many candidates as you please, and the M candidates with the most votes win. Jean-Charles de Borda might again say that it is a good system ‘only for honest men’, those who tolerantly approve of their less-preferred candidates too. When the electorate is polarized, however, and voters vote only for their own party’s candidates, the plurality party wins all the seats. Even worse, an intolerant minority can prevail over a tolerant majority of 60 percent who also approve of some of their less-preferred candidates. Approval voting easily turns into disapproval voting.
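How an intolerant minority can prevail under Approval Voting is easy to see in a small tally. The sketch below is a hypothetical illustration, not data: it assumes 100 voters and M = 3 seats, a 60 percent majority whose voters all approve their own three candidates, and a 40 percent minority whose voters approve only their own three. The parameter `tolerant_share`, the fraction of majority voters who also approve the minority's candidates, and all other names are assumptions made for this example.

```python
def approval_winners(tolerant_share, seats=3, majority=60, minority=40):
    """Return the winning candidates under Approval Voting.

    tolerant_share: fraction of majority voters who also approve the
    minority's candidates. Minority voters approve only their own.
    """
    approvals = {}
    for i in range(1, seats + 1):
        approvals[f"majority-{i}"] = majority
        approvals[f"minority-{i}"] = minority + tolerant_share * majority
    # The `seats` candidates with the most approvals win. Python's sort is
    # stable, so exact ties keep insertion order (an arbitrary tie-break).
    ranked = sorted(approvals, key=approvals.get, reverse=True)
    return ranked[:seats]

if __name__ == "__main__":
    for share in (0.0, 0.25, 0.5):
        print(f"tolerant share {share:.2f}: {approval_winners(share)}")
```

Under these assumptions the minority's candidates overtake the majority's as soon as more than one-third of the majority voters approve them, since 40 + 60t exceeds 60 only when t > 1/3.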
Multi-Seat Districts with PR-Oriented Formulas: Party Centered

Here, the goal is to make the seat shares of parties reasonably proportional to their vote shares. The voter is given one categorical vote, to be cast for a party list (‘closed list’) or for a candidate within the list (‘open list’). Either way, parties receive seats on the basis of their total votes (List PR). These seats go to the candidates at the top of the closed list, or to the candidates with the most personal votes in the case of open list. Intermediary ways to allocate seats within the party are also used. The issue of intraparty
seat allocation is important, for example for representation of women (Matland and Taylor 1997) and local interests (Crisp et al. 2004; Shugart, Valdini, and Suominen 2005), but this book largely bypasses it.

How should the seats be allocated among the parties? The easiest approach to understand might be Simple Quota and Largest Remainders, called more briefly Hare-LR. It is used for instance in Costa Rica, and it works as follows. If M seats are available, a share 100%/M of the votes should entitle a party to a seat. This is the simple or Hare quota (also called exact or Hamilton quota), designated here as q0 for reasons of systematics that will become apparent later. Parties receive as many seats as they have full quotas of votes. These quotas are subtracted from the total vote shares, leaving almost always remainders of votes, and some seats also remain unallocated. Such seats are allocated to the parties with the largest remainders. Any party with a remainder of more than half-quota (q0/2) is likely to receive such a remainder seat, but it depends on how the votes happen to be distributed among the other parties. The quickest way to determine the Hare-LR allocation is to multiply the fractional vote shares of parties by M. The integer parts of these products represent quota seats, and the remaining seats go to parties with the largest decimal parts.

So as to avoid having too many remainder seats, we could reduce the quota. One might consider 100%/(M + 1), designated here as q1. Now it is possible to allocate more seats by full quotas, or even all of them. But one runs a tiny risk of allocating more seats than the district has. Suppose M = 4, so that q1 = 20 percent. If party votes happen to be exactly 60.00, 20.00, and 20.00 percent, 5 seats would be allocated!
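Both the Hare-LR shortcut and the over-allocation risk with q1 can be sketched in a few lines. This is an illustration under stated assumptions rather than any country's actual counting procedure: vote shares are given in percent, exact ties go to the party listed first, and the function names `hare_lr` and `quota_seats` as well as the four-party example (45, 30, 15, 10) are hypothetical. The 60-20-20 case is the one from the text.

```python
from fractions import Fraction

def quota_seats(shares, magnitude, k=0):
    """Seats won by full quotas when the quota is 100% / (M + k).

    k = 0 gives the simple (Hare) quota q0; k = 1 gives q1.
    """
    quota = Fraction(100, magnitude + k)
    return [int(Fraction(s) // quota) for s in shares]

def hare_lr(shares, magnitude):
    """Simple quota and largest remainders (Hare-LR)."""
    # Multiplying fractional vote shares by M yields quota seats (integer
    # part) and remainders (fractional part) in a single step.
    products = [Fraction(s) * magnitude / 100 for s in shares]
    seats = [int(p) for p in products]        # floor, since products >= 0
    remainders = [p - s for p, s in zip(products, seats)]
    unallocated = magnitude - sum(seats)
    # Remaining seats go to the parties with the largest remainders.
    by_remainder = sorted(range(len(shares)),
                          key=lambda i: remainders[i], reverse=True)
    for i in by_remainder[:unallocated]:
        seats[i] += 1
    return seats

if __name__ == "__main__":
    print("Hare-LR, M = 4:", hare_lr([45, 30, 15, 10], 4))
    over = quota_seats([60, 20, 20], 4, k=1)
    print("q1 quota seats, M = 4:", over, "-> total", sum(over))
```

With shares of 60, 20, and 20 percent and M = 4, the q1 quota of 20 percent yields 3 + 1 + 1 = 5 quota seats, one more than the district has.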
To guard against this admittedly unlikely outcome, the Hagenbach-Bischoff quota adds 1 vote to the total votes, before calculating the quota, while Droop quota adds 1 vote to the quota itself (NOT 1 percent of all votes!), leading to qDroop = 100%/(M + 1) + 1 vote. This single vote makes over-allocation of seats impossible. For practical purposes, the Droop and Hagenbach-Bischoff quotas are identical to q1. Some electoral systems (e.g. formerly in Italy) have not worried about over-allocation and have used q1, and even q2 = 100%/(M + 2) and q3 = 100%/(M + 3), both called Imperiali quotas. In case of over-allocation, the lucky district simply receives extra seats in the representative assembly, until the next election. Quotas larger than simple quota can also be devised, for example, q−1 = 100%/(M − 1), and so on, but they are not used in practice. Somewhat counterintuitively, small quotas favor large parties, while large quotas favor
small parties. This is easiest to see by considering extreme cases. Suppose again that M = 4, and we decide to make use of q−3 = 100%/(M − 3) so that the quota is 100 percent. No one receives a quota seat, and the four largest parties receive one remainder seat each, even when the vote shares are as unbalanced as 60, 30, 7, 2, and 1. Of course, considering q−3 is unrealistic, but unrealistic extreme cases offer a powerful conceptual tool in disciplines such as physics. I will use this tool extensively, and it is explained in Beyond Regression (Taagepera 2008).

Fixed quotas of a specified number of votes have also been used. Then the total number of seats a district receives depends on turnout. This might be an incentive to go and vote.

The basic philosophy for all quotas plus largest remainders is subtraction: Each time a seat is allocated to a party, a specified amount of votes is subtracted from its total votes. Instead, one can also use a divisor philosophy: Each time a party is allocated a seat, divide its total votes by a specified amount, before the allocation of the next seat is considered. Such divisors are described next.

The most widely used divisors are the d’Hondt (Jefferson) divisors, 1, 2, 3, 4, . . . They work as shown in Table 3.3. Suppose the party percentage of vote shares in the district are exactly 48, 25, 13, 9, 4, and 1, and M = 6 seats are to be allocated. First, we divide all vote shares symbolically by 1, which does not alter them. Next, we allocate the first seat to the largest share, 48 percent, as indicated by (1) next to ‘48’ in Table 3.3. But we also divide this share by 2, reducing it to 24. As we compare the new shares, the second-largest party’s 25 percent exceeds the largest party’s 24, and hence it receives the second seat, while its share is divided by 2. The next two seats go again to the largest party, with its share divided by 3 and then by 4.
(A common mistake students make at this point is to divide 24 by 3, instead of dividing the original 48 by 3.) The fifth and sixth seats go to the third- and second-largest parties, whose quotients (13 and 12.5, respectively) narrowly surpass the largest party’s 12.

Table 3.3. Allocation of 6 seats by d’Hondt divisors (1, 2, 3, . . . )

Votes, %   Quotients (order of seat allocation in parentheses)     Seats   Seats, %
48         48 (1)   48/2 = 24 (3)     48/3 = 16 (4)   48/4 = 12     3       50
25         25 (2)   25/2 = 12.5 (6)                                 2       33
13         13 (5)   13/2 = 6.5                                      1       17
9          9                                                        0       0
4          4                                                        0       0
1          1                                                        0       0
In this particular case, all major parties are overpaid: their seat shares exceed their vote shares. In general, however, d’Hondt divisors tend to favor the largest party. As M increases, this advantage lessens. Finland has used d’Hondt divisors for a full 100 years (with mean M = 14), and other current examples include Switzerland, Luxembourg, Spain, and Portugal.

Various other divisors can also be used. Faster increase in divisors reduces large party advantage. Sainte-Laguë (Webster) divisors (1, 3, 5, 7, . . . ) abolish this advantage, and the so-called Danish divisors (1, 4, 7, 10, . . . ) actually favor the smaller parties (as is shown in Chapter 6). To increase their seat shares, large parties might then split their lists strategically, but it would be risky to do so, because fake splits may become real. At the extreme, one could use huge divisor gaps such as 1, 51, 101, 151, . . . Then all M largest parties may win one seat each.

In the other direction, slowly increasing divisors favor heavily the largest party. Imperiali divisors (1, 1.5, 2, 2.5, . . . ) have been used. (Do not confuse them with the aforementioned Imperiali quotas!) The divisor series with the slowest increase would be 1, 1, 1, 1, . . . where the largest party wins all the seats. Thus, surprisingly, multi-seat plurality rule surfaces as the extreme member of the divisor family of the PR formulas.

One can also devise divisors that tend to favor middle-sized parties. The Modified Sainte-Laguë divisors (1.4, 3, 5, 7, . . . ) are used in Norway and Sweden. Here the initial divisor 1.4 (instead of 1) makes it hard for small parties to receive their first seat. The quaintest divisors ever used might be the ‘modified d’Hondt’ divisors used in Estonia: 1^0.9, 2^0.9, 3^0.9, 4^0.9, 5^0.9, . . . They are equivalent to 1, 1.87, 2.69, 3.48, 4.26, . . .
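The divisor philosophy lends itself to a compact sketch in Python. The code below is an illustration only: the function `divisor_allocation` and the dictionary `series` are hypothetical names, each divisor series is written as a function of the number of seats a party already holds, and ties are resolved in favor of the party listed first (actual electoral laws specify their own tie-breaks).

```python
def divisor_allocation(votes, M, divisor):
    """Award M seats one at a time to the party with the largest current quotient.

    divisor(k) is the divisor applied to a party that already holds k seats.
    The ORIGINAL vote total is always divided, never the previous quotient.
    """
    seats = [0] * len(votes)
    for _ in range(M):
        quotients = [v / divisor(s) for v, s in zip(votes, seats)]
        winner = quotients.index(max(quotients))   # first party wins any tie
        seats[winner] += 1
    return seats

# Divisor series named in the text, indexed by seats already held (k = 0, 1, 2, ...).
series = {
    "d'Hondt (1, 2, 3, ...)": lambda k: k + 1,
    "Sainte-Lague (1, 3, 5, ...)": lambda k: 2 * k + 1,
    "Modified Sainte-Lague (1.4, 3, 5, ...)": lambda k: 1.4 if k == 0 else 2 * k + 1,
    "Danish (1, 4, 7, ...)": lambda k: 3 * k + 1,
    "Imperiali (1, 1.5, 2, ...)": lambda k: 1 + k / 2,
    "Estonian modified d'Hondt (1, 1.87, 2.69, ...)": lambda k: (k + 1) ** 0.9,
}

votes = [48, 25, 13, 9, 4, 1]          # vote shares of Table 3.3, in percent
results = {name: divisor_allocation(votes, 6, d) for name, d in series.items()}
for name, seats in results.items():
    print(f"{name:50s} {seats}")
```

For the vote shares of Table 3.3 and M = 6, this sketch gives 3, 2, 1 for d’Hondt and Modified Sainte-Laguë, 3, 1, 1, 1 for Sainte-Laguë and Danish divisors, and 4, 2 for the Imperiali and Estonian series.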
To illustrate the effect of the various quota and divisor approaches, Table 3.4 shows the allocations of seats when vote shares are again exactly 48, 25, 13, 9, 4, and 1 percent, as in Table 3.3, and the district has 6 seats. The perfectly proportional seat share, as shown at the top of Table 3.4, is fractional and can only be approximated. Visibly, allocations 3, 2, 1, and 3, 1, 1, 1, which occur in the center of the table, come closest to proportionality, and these are the only rules used fairly widely. (Operational measures for deviation from PR are presented in Chapter 5.) Allocation formulas at the top of the table tend to overpay the largest party and are rarely used. Those at the bottom tend to overpay the small parties and are hardly ever used.

Table 3.4. Allocation of seats in a 6-seat district, by various quota and divisor formulas

Votes, %                                            48    25    13    9     4     1
Perfectly proportional seat share                   2.88  1.50  0.78  0.54  0.24  0.06
Steady divisors (1, 1, 1, 1, . . . ) = plurality    6     0     0     0     0     0
Imperiali divisors (1, 1.5, 2, 2.5, . . . )         4     2     0     0     0     0
Modified d’Hondt (1, 1.87, 2.69, 3.48, . . . )      4     2     0     0     0     0
Imperiali quota q3 = 100%/(M + 3) = 11.1%           4     2     1     [overallocation!]
Imperiali quota q2 = 100%/(M + 2) = 12.5%           3     2     1     0     0     0
D’Hondt divisors (1, 2, 3, 4, . . . )               3     2     1     0     0     0
Modified Sainte-Laguë div. (1.4, 3, 5, 7, . . . )   3     2     1     0     0     0
Droop/Hagenbach-B. quota q1 = 14.3%                 3     2     1     0     0     0
Sainte-Laguë divisors (1, 3, 5, 7, . . . )          3     1     1     1     0     0
Hare quota q0 = 100%/M = 16.7%                      3     1     1     1     0     0
Danish divisors (1, 4, 7, 10, . . . )               3     1     1     1     0     0
Quota q−1 = 100%/(M − 1) = 20%                      3     1     1     1     0     0
Quota q−2 = 100%/(M − 2) = 25%                      2     1     1     1     1     0
Quota q−3 = 100%/(M − 3) = 33.3%                    2     1     1     1     1     0
Divisors 1, 51, 101, 151, . . .                     1     1     1     1     1     1
Quota q−4 = 100%/(M − 4) = 50%                      1     1     1     1     1     1

Given that large quotas allocate all too many seats by largest remainders, while small quotas risk over-allocation of quota seats, one may look for a sufficient quota. Start with Droop quota and reduce the quota gradually, until all seats are allocated by quota, with no need to consider the remainders. The result may surprise: The remainderless quota is equivalent to d’Hondt divisors. In other words, d’Hondt represents the sufficient quota (Colomer 2004b: 44). Thus, the d’Hondt formula occupies a central position on the landscape of List PR formulas: It is at the crossroads of quota and divisor methods. This is how Thomas Jefferson actually came to define what later came to be called d’Hondt divisors in electoral studies (Colomer 2004a: 44). Like Alexander Hamilton, who first defined the simple quota, and Daniel Webster, who first defined the Sainte-Laguë divisors, Jefferson was concerned with seat allocation to the US states according to their populations. All these approaches were reinvented in Europe when the need arose to allocate seats to parties according to their votes.

As district magnitude increases, all allocation formulas in the central range of Table 3.4 tend to produce seat allocations closer to perfect PR, and the choice of the particular formula makes less of a difference (see Chapter 6). In principle, these allocation formulas can be applied to any district magnitude, up to and including nationwide allocation. In the opposite direction, what happens if these formulas are applied to single-seat districts? All of them allocate the only seat at stake to the party with the most votes, and hence they become equivalent to FPTP. In this sense, drawing a line between List PR in multi-seat districts and FPTP is artificial. FPTP is a limiting case of List PR.
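The equivalence between the sufficient quota and d’Hondt divisors can be checked numerically. The sketch below rests on one assumption that is not spelled out in the text: the largest quota that hands out all M seats by full quotas is taken to be the M-th largest d’Hondt quotient (votes divided by 1, 2, 3, . . . ). The function name `sufficient_quota` is hypothetical, and ties are ignored.

```python
from fractions import Fraction

def sufficient_quota(votes, M):
    """Largest quota for which full quotas alone allocate exactly M seats (no ties assumed)."""
    quotients = sorted(
        (Fraction(v, k) for v in votes for k in range(1, M + 1)),
        reverse=True,
    )
    return quotients[M - 1]            # the M-th largest d'Hondt quotient

votes = [48, 25, 13, 9, 4, 1]          # vote shares of Table 3.3, in percent
q = sufficient_quota(votes, 6)
seats = [int(v // q) for v in votes]   # full-quota seats only, no remainders used
print("sufficient quota:", float(q), "percent")
print("seats:", seats, "total:", sum(seats))
```

For this example the sufficient quota comes out at 12.5 percent, which here happens to coincide with the Imperiali quota q2 of Table 3.4, and the full-quota seats are 3, 2, 1, the same as the d’Hondt row.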
The description of seat allocation formulas in various countries often risks confusion. The d’Hondt procedure can be speeded up by first allocating seats by full Droop or Hagenbach-Bischoff quotas and then, instead of using largest remainders, switching to d’Hondt. This is how the Swiss electoral law describes the procedure. The outcome is pure d’Hondt. Yet describing it as ‘the Hagenbach–Bischoff formula’ may leave the mistaken impression that the Hagenbach–Bischoff quota is used with largest remainders.
Multi-Seat Districts with PR-Oriented Formulas—Candidate Centered

Rather than force voters to vote for parties as blocs, one may wish to let them express preferences for specific candidates, regardless of party affiliations. The aforementioned LV allows a voter to vote for several candidates. It seems to achieve a reasonable degree of PR among parties when the number of votes per voter does not exceed the square root of district magnitude (M^0.5).

The most limited number of votes per voter would be one vote. Called single non-transferable vote (SNTV), this may be the simplest method that could be applied in multi-seat districts. In a district with M seats, the M candidates with the most votes win. In analogy to FPTP, SNTV may be considered an ‘Mth past the post’ system (Reed and Bolland 1999).

Simplicity is desirable, but SNTV has a unique drawback. It is the only major multi-seat system that penalizes parties for running too many candidates. This is illustrated by the following example. Consider a five-seat district. Based on opinion polls and previous election results, party votes (in percent) are expected to be A: 45, B: 13, C: 28, and D: 14. Assume party A fields 2 candidates, who split the vote 23 to 22, and party C also fields 2 candidates, who split the vote 16 to 12. A would win 2 seats and each other party 1 seat. However, A is underpaid, with 45 percent of the votes and only 40 percent of the seats. It might consider running 3 candidates and would win 3 seats, if the votes are divided evenly among the candidates (15, 15, 15). But if one of the candidates is overly popular, so that the split is 23, 11, 11, then A could end up with only 1 seat. The same could happen, if the shares of the 3 candidates are roughly equal but the total vote for A falls below expectations.

Because of such coordination problems, SNTV is rarely applied in districts of more than 3–5 seats. Such a low district magnitude reduces the
degree of proportionality to the point where SNTV is often called semiproportional. The limited proportionality, however, is more a matter of magnitude than of the allocation formula as such (Cox 1996). Japan, South Korea, and Taiwan have used SNTV, but Japan abandoned it in 1994 (see Grofman et al. 1999).

Coordination dilemmas can be avoided, if voters are asked to rank candidates. Stipulate that it takes a Droop quota worth of votes to win a seat. Second and later preferences are used to transfer excess votes of popular candidates to other candidates. Called single transferable vote (STV), this method is the multi-seat equivalent of the aforementioned Alternative Vote. While the latter uses only elimination of weaker candidates, multi-seat districts also impose the need to transfer excess votes of the successful candidates. The STV is used in Ireland, Malta, and the Australian Senate (Bowler and Grofman 2000).

Table 3.5 follows up on the previous example of an unlucky vote constellation, where SNTV would allocate the largest party only 1 seat out of 5. How would STV allocate these seats? We will assume that the second preferences go to the candidates of the same party, or to the ideologically closest party. Droop quota for M = 5 is 100%/6 = 16.7 percent. (The extra 1 vote in the Droop quota can be neglected.) Any candidate who reaches this quota wins a seat. Her excess votes are allocated according to her voters’ second preferences. If this helps further candidates to reach a full quota, the process is repeated. If not, then the weakest candidate is eliminated, and his votes are allocated according to his voters’ second preferences. In later stages, third and fourth preferences may come into play. In this particular example, the largest party wins 2 seats, as it did with SNTV when not taking risks and fielding only 2 candidates.

Table 3.5. Example of seat allocation by single transferable vote (STV) in a 5-seat district

Candidates                    A1       A2       A3       B        C1       C2       D
First preference votes (%)    23.0     11.0     11.0     13.0     16.0     12.0     14.0
Quota allocation              −16.7
Remainder transfer (assumed)  6.3 →    +4.3     +2.0
New totals                             15.3     13.0
Elimination of the weakest                                        +9.0 ←   −12.0 →  +3.0
New totals                                                        25.0              17.0
Quota allocations                                                 −16.7             −16.7
Remainder transfers                                      +8.6 ←   8.3               0.3
New totals                                               21.6
Quota allocation                                         −16.7
Remainder transfers                             +4.9 ←   4.9
New totals                                      17.9
Quota allocation                                −16.7
Residual remainders                    15.3     1.2      [they add up to one Droop quota]
Seats for parties             A: 2 seats                 B: 1 seat   C: 1 seat      D: 1 seat

Note: Assume that candidates are listed in the order of placement on left-right scale.

Visibly, the STV procedure is more complex than those previously described—and I have omitted some details that can make it even messier—see Farrell, Mackerras, and McAllister (1996) and Bowler and Grofman (2000). With M = 5, it is quite possible that 6 parties might run. Since there is no penalty for fielding many candidates, in contrast to SNTV, there might be an average of 4 candidates per party, for a total of 24. It is hard to rank that many candidates in a meaningful way. Hence, STV is rarely used in districts with more than 5 seats. This low district magnitude impedes approach to perfect PR. Yet, STV offers maximal freedom of choice to the voters, without fear that one’s vote might be wasted. For instance, if a voter’s main concern is to enhance women’s representation, he could express high preference for all female candidates. Thus, STV may have considerable philosophical appeal, and computer programs can handle the technical aspects easily.

What happens when the SNTV or STV procedures are applied in single-seat districts? The SNTV is then reduced to FPTP, similarly to List PR formulas. The STV, however, is reduced to Alternative Vote.

In sum, the traditional distinction between single- and multi-seat districts is not needed in the analysis of the impact of electoral systems on party systems. Indeed, such distinction makes analysis harder. Single-seat districts are merely the limiting cases of multi-seat districts.
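The SNTV coordination problem from the five-seat example earlier in this section can be replayed with a short Python sketch. The function name `sntv` and the scenario labels are hypothetical; the vote figures are those given in the text, and ties (which do not occur at the cutoff here) would be broken by list order.

```python
def sntv(candidate_votes, M):
    """candidate_votes: list of (party, votes). The M candidates with the most votes win."""
    winners = sorted(candidate_votes, key=lambda c: c[1], reverse=True)[:M]
    seats = {}
    for party, _ in winners:
        seats[party] = seats.get(party, 0) + 1
    return seats

others = [("B", 13), ("C", 16), ("C", 12), ("D", 14)]
scenarios = {
    "A runs 2 candidates (23, 22)": [("A", 23), ("A", 22)] + others,
    "A runs 3, even split (15, 15, 15)": [("A", 15)] * 3 + others,
    "A runs 3, uneven split (23, 11, 11)": [("A", 23), ("A", 11), ("A", 11)] + others,
}
outcomes = {label: sntv(cands, M=5) for label, cands in scenarios.items()}
for label, seats in outcomes.items():
    print(label, "->", seats)
```

Party A’s seat count moves from 2 to 3 to 1 across the three scenarios although its total vote stays at 45 percent, which is the penalty for nomination errors that the text describes.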
Complex and Composite Electoral Systems

All electoral systems previously described offer only one district magnitude and one seat allocation formula, even while this formula might be quite involved. The possibilities for electoral design multiply when district magnitude varies from district to district, when legal thresholds are introduced to constrain the workings of the basic allocation formula, or when different allocation formulas are applied in sequence or in parallel. These and some other practices are reviewed next.

The prime task of this book, however, is to explain in depth the effects of the simplest of the basic electoral systems. From this viewpoint, what follows can be bypassed, but it is needed, if one wants to obtain a picture
of systems actually used. Indeed, tabulation at the end of the chapter shows that a fair proportion of actual electoral systems involve complex or composite features.
Unequal District Magnitudes

When multi-seat districts are used in a country, they almost never have the same magnitude, because they tend to correspond to historical–geographical subunits. Typically, M would be higher in major cities and lower in rural areas, while in small ethnically distinct areas some single-seat districts might be thrown in. Thus, most magnitudes in Finland are close to the arithmetic mean of 14, but extremes have ranged at times from 3 to 28, plus one single-seat district in the autonomous Åland Islands.

Habitually, the mean magnitude is used to characterize the electoral system, but variation may have political consequences. The direction and nature of these consequences emerges best when we consider a hypothetical extreme case. Suppose a country with an assembly of 200 seats has one district of M = 100 around the capital city and 100 districts of M = 1 in the sparsely settled countryside. All districts use the same List PR formula, which in the single-seat districts amounts to FPTP. The arithmetic mean magnitude is very close to 2, but the resulting party system is likely to be quite different from that in a country of 100 districts of M = 2. Two-seat districts may enable 2 or 3 parties to receive seats. In contrast, the district of M = 100 may offer opportunities for a dozen parties. With such a power base, some small parties may also try their luck in the single-seat districts. In sum, the country with all seats at M = 2 is likely to have a two- or three-party system, while the country with variation in M is likely to have a multiparty system.

One may devise measures of average M different from the arithmetic mean, so as to magnify the impact of the largest districts (see Appendix of Taagepera 1998b). But no such averaging may circumvent a feature noticed by Monroe and Rose (2002): Uneven distribution of district magnitudes can bias party representation. A simple illustration follows.
Assume that our aforementioned hypothetical country has only two major parties, one having 70 percent support in the urban district but only 30 percent in the rural districts, and the other having the proportions reversed. Thus, the overall voting strengths of the two parties are equal and amount to 50 percent (assuming no malapportionment). Their
representation in the assembly, however, is lopsided. The rural party wins all the rural FPTP seats, plus its proportional share of 30 seats in the urban district, for a total of 130 seats. The urban party only wins its proportional share of 70 seats in the urban district. This extreme example is unlikely to materialize. But to a lesser degree, the same tendencies do occur in actual countries that use a wide range of district magnitudes, as shown by Monroe and Rose (2002) for Spain.

Compared to more complex and composite electoral systems, uneven M has been considered a minor deviation from simple electoral systems. Yet it can significantly alter the survival chances of small parties. It can also alter the seat balance of major parties.
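The arithmetic of this hypothetical country can be laid out in a few lines of Python. The sketch is only an illustration: the function `assembly_seats` is a hypothetical name, and it assumes, as the example does, that every rural single-seat district shows the same vote split and that the urban district allocates seats exactly in proportion to votes.

```python
def assembly_seats(urban_pct, rural_pct, urban_M=100, rural_districts=100):
    """Seats of one of two parties, given its vote percentage in the urban PR district
    and in every rural single-seat (FPTP) district."""
    pr_seats = urban_pct * urban_M // 100                     # proportional share, exact here
    fptp_seats = rural_districts if rural_pct > 50 else 0     # wins every rural seat or none
    return pr_seats + fptp_seats

urban_party = assembly_seats(urban_pct=70, rural_pct=30)
rural_party = assembly_seats(urban_pct=30, rural_pct=70)
print("urban party:", urban_party, "seats; rural party:", rural_party, "seats")
```

Both parties have 50 percent of the votes overall, yet the seats divide 70 to 130.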
Legal Thresholds

While PR for parties may be considered desirable in general, a profusion of tiny parties is not. Therefore, limits on minimal representation are imposed in many countries that use List PR in large magnitude districts or even nationwide. Typically, parties below a given threshold of votes are not entitled to participate in seat allocation. The legal threshold used may be as low as 0.67 percent (The Netherlands) or as high as 5 percent (Germany). Some countries apply even higher thresholds to alliance lists of several parties.

It matters whether the threshold applies nationwide or in individual districts. Suppose a party has 4.9 percent of the nationwide votes. A nationwide threshold of 5 percent would bar it from obtaining seats. The same threshold applied in individual districts, however, would not prevent it from winning seats in those districts where the party has more than 5 percent of the votes.

District magnitude as such imposes an effective threshold. For example, when M = 5, it is nearly impossible for a party to win a seat with less than 10 percent of the votes. These effective thresholds are calculated later (Chapter 15). For the moment, it is important to realize that a district-level legal threshold may block small parties in large districts while having no impact whatsoever in small districts. Take Spain, with many five-seat districts but also huge districts in Barcelona and Madrid. Only the latter are affected by Spain’s district-level legal threshold of 3 percent, because the effective thresholds inherent in small district magnitudes are larger than that.
Legal thresholds can be expressed in terms other than a percentage of votes. It may be a fixed number of votes, or a threshold of 2 seats in the assembly, so as to block one-seat parties and independents. On the other hand, legal thresholds may contain loopholes. Thus, the 5 percent bar in Germany does not apply to parties that win at least 3 seats in single-seat districts.
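The difference between a nationwide and a district-level threshold can be made concrete with a small Python sketch. Everything in it is hypothetical: the function `eligible_districts`, the two districts, and the vote counts were chosen only so that party X has 4.9 percent of the nationwide vote, as in the example above. The sketch also assumes that a party exactly at the threshold qualifies.

```python
def eligible_districts(district_votes, party, threshold_pct, nationwide):
    """Districts in which `party` may take part in seat allocation.

    district_votes: {district: {party: votes}}.
    nationwide=True applies the threshold to the national vote share,
    nationwide=False applies it separately in each district.
    """
    if nationwide:
        total = sum(sum(d.values()) for d in district_votes.values())
        mine = sum(d.get(party, 0) for d in district_votes.values())
        return set(district_votes) if 100 * mine >= threshold_pct * total else set()
    return {
        name for name, d in district_votes.items()
        if 100 * d.get(party, 0) >= threshold_pct * sum(d.values())
    }

# Hypothetical country: X has 8.0 percent in North, 1.8 percent in South, 4.9 percent overall.
votes = {
    "North": {"X": 80, "Others": 920},
    "South": {"X": 18, "Others": 982},
}
national_rule = eligible_districts(votes, "X", threshold_pct=5, nationwide=True)
district_rule = eligible_districts(votes, "X", threshold_pct=5, nationwide=False)
print("nationwide 5% threshold:", national_rule)
print("district-level 5% threshold:", district_rule)
```

Under the nationwide rule party X is shut out everywhere, while under the district-level rule it can still compete for seats in North.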
Legal Majorities

Sometimes minimal representation for the largest party is prescribed, so as to enhance governability. It so happened in Malta that the largest party by votes obtained fewer seats than the next-largest party, whose votes were placed more advantageously across the districts. As a reaction to this anomaly, Malta now stipulates that a party with more than 50 percent votes must obtain more than 50 percent of the seats. If needed, extra seats are added to the usual number. In Italian municipal elections, parties with sufficient pluralities are given 60 percent of the seats in large cities and 66.7 percent in small municipalities, before seats are allocated to other parties. In contrast to Malta, the total size of the assembly does not change.
Multiple Tiers

When simple quota is applied in multi-seat districts, one may decide to allocate seats only by full quota. The remainder votes and seats are transferred to larger second-tier super-districts or even a single nationwide district. There, the allocation formula may change. Thus, Belgium used to switch from quota to d’Hondt. The overall outcome would be equivalent to pure d’Hondt in Belgian super-districts, except for limitations on small parties and alliance lists, in the upper tier. Some countries have three tiers, and Greece has even four, with complex limitations—see Lijphart (1994: 44).

When too many seats are deemed to go to the second tier, one may alleviate the full quota rule in the districts and allocate seats by largest remainders, as long as these remainders surpass 0.9 or 0.75 of the full quota. Estonia introduced such a relaxation around 2000.

Note that the number of seats allocated in the upper tier(s) varies from election to election and is not known ahead of time. One can also assign a fixed number of seats to each part of the system. Then we have a composite system, as discussed next, rather than a multitier one.
Composite Systems: Single versus Double Ballot, Parallel versus Compensatory

A country that uses single-seat districts may offer smaller parties a chance by superimposing a parallel system that operates by List PR. The number of such mixed systems has been increasing, and the variety of options is large—see Massicotte and Blais (1999).

Sometimes the voter has a single ballot for both systems—for example, in Mexico, Taiwan, South Korea, and the Italian Senate (Ferrara 2006). This puts a voter who prefers a small party in a bind. If she votes for the small party, her vote has no effect on the outcome in the single-seat district, where only the largest parties have a chance. If she votes for a large party, her preferred small party may lose a PR seat. Such dilemmas are avoided by giving each voter separate ballots for single-seat and large-magnitude districts. A voter who prefers a small party would vote for this party in the large district but may well vote strategically for one of the two largest parties in the single-seat district.

Nationwide allocation, in turn, can proceed either in parallel or in a compensatory way, as explained below. The two approaches, which may yield very different outcomes, are often confused.

Seats can be allocated separately in two parallel systems. When there are 100 single-seat districts and one nationwide district of M = 20, then small party representation remains symbolic. But overall semi-PR results when the large district has 100 seats. Then small parties win about one-half of their proportional shares, while the two largest parties still maintain the hefty bonus gained in the single-seat districts. Italy and Japan shifted in the 1990s to such parallel systems.

One may go a step further and use the vote shares in the nationwide district to establish nationwide PR. Here we have a compensatory system. Parties that win many small-district seats receive accordingly fewer seats in the nationwide allocation.
If the nationwide seats are few, say, 20 out of a total of 120, they may not suffice to restore nationwide PR. But if there are as many nationwide seats as small-district seats, then one comes close to perfect PR, unless blocked by a legal threshold—which usually is the case. Germany uses such a compensatory MMP system (see Shugart and Wattenberg 2001). It has also been called Personalized PR, and Additional Member system, although the latter term conjures the image of parallel systems that do not interact. Note that compensatory systems
could also function with a single ballot per voter. The small party voter would not have any impact on who wins in the local small district, but it would not matter in the nationwide allocation by compensatory PR. One can also complement small List PR districts with a fixed number of nationwide compensatory seats. Usually, small party access is subject to limitations. In Iceland, only parties with at least one district seat can receive compensatory seats. Denmark sets a legal threshold of 2 percent votes, but with complex loopholes.
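The contrast between parallel and compensatory allocation can be sketched in Python. The code is an illustration under several assumptions that do not come from the text: the function names, the three-party vote shares, and the split of single-seat districts are hypothetical; the PR part uses the simple quota with largest remainders; no legal threshold is applied; and a party that wins more district seats than its nationwide PR entitlement simply keeps them.

```python
from fractions import Fraction

def largest_remainders(votes, total_seats):
    """Simple (Hare) quota plus largest remainders; votes is {party: votes}."""
    quota = Fraction(sum(votes.values()), total_seats)
    seats = {p: int(v // quota) for p, v in votes.items()}
    left = total_seats - sum(seats.values())
    ranked = sorted(votes, key=lambda p: votes[p] - seats[p] * quota, reverse=True)
    for p in ranked[:left]:
        seats[p] += 1
    return seats

def parallel(list_votes, district_seats, list_seats):
    """List seats are allocated on their own and added to the district seats."""
    pr = largest_remainders(list_votes, list_seats)
    return {p: district_seats.get(p, 0) + pr[p] for p in list_votes}

def compensatory(list_votes, district_seats, list_seats):
    """List seats top each party up toward its PR share of the whole assembly."""
    assembly = list_seats + sum(district_seats.values())
    target = largest_remainders(list_votes, assembly)
    return {p: max(target[p], district_seats.get(p, 0)) for p in list_votes}

# Hypothetical election: 100 single-seat districts plus 100 nationwide list seats.
list_votes = {"A": 45, "B": 40, "C": 15}          # nationwide vote shares, percent
district_seats = {"A": 60, "B": 40, "C": 0}       # assumed single-seat district results
parallel_result = parallel(list_votes, district_seats, list_seats=100)
compensatory_result = compensatory(list_votes, district_seats, list_seats=100)
print("parallel:    ", parallel_result)
print("compensatory:", compensatory_result)
```

In this made-up case the small party C ends up with 15 of 200 seats under the parallel rule, about half of its proportional share, and with 30 seats under the compensatory rule.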
Intra-List Party Competition and Apparentement

Most List PR systems favor larger parties even at medium district magnitudes, and the effect becomes strong at low magnitudes. This creates an incentive for parties to present joint lists in specific districts or nationwide, if electoral law allows it. The effect is similar when separate lists are used, but they are declared to count together for the purposes of seat allocation—a format traditionally called apparentement. Switzerland is a long-standing example. Detailed arrangements vary. Competition among allied parties remains possible, if the voter votes for a specific candidate.

Such alliances become crucial in Chile, where M = 2. The logic of two-seat districts, similar to Duverger’s law, forces the parties to form two large blocks. Within the block, each party presents one candidate (it would be self-defeating to run more than one). The votes for all candidates in the block are added and supply the basis for allocation of seats by d’Hondt. This means that in order to win both seats, a block must have more than twice the votes of the other block, which is rarely the case. Within the block, the seat(s) won go to the candidate(s) with the most votes.

The same method could be used in single-seat districts, if candidates were allowed to make alliances. The seat would go to the candidate with plurality vote within the alliance with plurality in the district. In analogy with the FPTP, it could be called FHFT—First Horse in the First Team. Such a system would push the parties toward formation of two major blocks of parties, while preserving many parties. This approach has been used for presidential elections in Uruguay. ‘Desistance pacts’ among Italian parties (Gambetta and Warner 2004: 246) also amount to an informal application of ‘FHFT’.
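The two-seat block logic described for Chile can be sketched as follows. This is a hedged illustration, not a rendering of any electoral law: the function `two_seat_district`, the block and candidate names, and the vote figures are all hypothetical, at least two blocks are assumed to run, and ties are ignored.

```python
def two_seat_district(candidates):
    """candidates: list of (block, candidate, votes) in a district with M = 2.

    Seats go to blocks by d'Hondt on the summed block votes, then to the
    top candidate(s) within each winning block.
    """
    block_votes = {}
    for block, _, votes in candidates:
        block_votes[block] = block_votes.get(block, 0) + votes
    ranked = sorted(block_votes, key=block_votes.get, reverse=True)
    first, second = ranked[0], ranked[1]
    # d'Hondt with M = 2: the second seat compares first/2 with the runner-up's total.
    if block_votes[first] / 2 > block_votes[second]:
        block_seats = {first: 2}
    else:
        block_seats = {first: 1, second: 1}
    winners = []
    for block, n in block_seats.items():
        members = sorted((c for c in candidates if c[0] == block),
                         key=lambda c: c[2], reverse=True)
        winners += [name for _, name, _ in members[:n]]
    return winners

close_race = [("Left", "L1", 30), ("Left", "L2", 25), ("Right", "R1", 28), ("Right", "R2", 17)]
landslide = [("Left", "L1", 40), ("Left", "L2", 30), ("Right", "R1", 20), ("Right", "R2", 10)]
print(two_seat_district(close_race))
print(two_seat_district(landslide))
```

With block totals of 55 and 45 percent each block takes one seat; only when the leading block has more than twice the votes of the other (70 against 30 here) does it take both.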
Party-Candidate Cocktails (Panachage)

We have seen that in the presence of party lists, a voter may be required to vote for a party only (closed list), for a party and optionally for a candidate within a party (open list), or for a specific candidate within the party (the most open list). Panachage (literally: cocktail) carries the option one step further: The voter may vote for a party but also for a specific candidate in another party. Why allow such a mix? By this means a supporter of a hopelessly small party can cast a symbolic vote for this party, while still giving support to the most acceptable candidate in a major party. Used in Switzerland and Luxembourg, it may be seen as a step from List PR toward STV.
Primary Elections

Parties can (or may be required by law to) carry out intraparty elections so as to determine the rank order of candidates on a closed list—or the only candidate in the case of single-seat districts. Such primaries may be limited to formal party members, or they may be open to the public, the only limit being that one can participate in only one party’s primary. In the USA, where ‘bipartisan gerrymander’ (see next section) often makes one party’s primary the only election with a real choice, proposals have arisen to have a joint primary for all parties. The two top candidates would advance to the actual election. This system would be similar to Two-Rounds, except that the two finalists might belong to the same party. Thus the line between intraparty elections (that might be considered a private affair) and public elections is not clear-cut. Comparative studies of primaries are as yet few. This is one area where more work is needed.
Pathologies of Electoral Systems: Malapportionment and Gerrymander

Electoral fraud can take place in many ways. Parties and candidates can be prevented from running or campaigning and advertising. Voters can be intimidated or bought, and votes can be miscounted. But short of such blatant fraud, various games can also be played with magnitudes and boundaries of electoral districts.
Malapportionment means that some districts have too few or too many seats, compared to their population. During rapid urbanization, malapportionment can arise spontaneously, when the countryside becomes depopulated but still preserves its assembly seats. Reapportionment on the basis of new census results may be delayed, if parties who stand to lose from it are strong in the existing assembly. The larger the district magnitude, the easier it is to reapportion in an incremental way. Indeed, malapportionment becomes impossible when a nationwide single district is used.

Gerrymander means drawing single-seat district borders in such a way as to assure safe districts (say, 60 percent majorities) to the party in charge of districting, while leaving the other major party with wastefully large losing minorities (say, 40 percent) in those districts and with wastefully huge winning majorities (say, 80 percent) in other districts. It started with Governor Elbridge Gerry’s somewhat salamander-shaped district in Massachusetts, 1812, which came to be called Gerry’s Mander. Many districts in the contemporary USA look like salamanders that have gone through a meat grinder, with almost disconnected pieces scattered around—examples are shown in Rush and Engstrom (2001: 5–13). Gerrymander is a specific problem of single-seat districts. It is hard to carry out with low-magnitude multi-seat districts and impossible when the magnitude is large.

Bipartisan gerrymander is a development of recent decades in the USA. The two major parties agree to divide the state into districts safe for either of them. Effectively, voters no longer choose the assembly members but assembly members choose their voters. The only election with a meaningful choice is the locally dominant party’s primary. As a result, nearly all incumbents who run are reelected. Furthermore, appealing to the center of the general public (the median voter) no longer is a winning strategy.
One has to appeal to the median voter of the locally dominant party. This way, bipartisan gerrymander may have contributed to an increase in ideological polarization of the US House, although contrary evidence (Brunell 2006) also exists.
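The arithmetic behind the 60/40/80 percent illustration of gerrymander can be sketched in a few lines of Python. The five equal-sized districts below are a hypothetical example, not data from the text: the districting party holds 60 percent in four districts and leaves its rival an 80 percent majority in the fifth.

```python
# Hypothetical illustration: five equal-sized single-seat districts.
# Each value is the districting party's vote share in one district.
district_shares = [0.60, 0.60, 0.60, 0.60, 0.20]

def seats_and_votes(shares):
    """Return (seat share, overall vote share) for the districting party,
    assuming equal district populations and a two-party contest."""
    seats_won = sum(1 for s in shares if s > 0.5)
    seat_share = seats_won / len(shares)
    vote_share = sum(shares) / len(shares)
    return seat_share, vote_share

seat_share, vote_share = seats_and_votes(district_shares)
print(f"votes: {vote_share:.0%}, seats: {seat_share:.0%}")
```

Under these assumed numbers, a bare majority of the votes translates into four of the five seats, which is the waste of the rival's votes that the definition describes.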
Which Electoral Systems are used the Most?
Lijphart (1994) analyzed 70 different electoral systems, used in 27 most stable democracies, from 1945 to 1990. The countries range from developed countries to Costa Rica and India. Also included were elections to the European Parliament (EP). A change in electoral system sometimes meant a thorough reversal (e.g. from FPTP to large-magnitude PR), but often it was a relatively minor shift within the same basic framework. Some systems lasted only one or two elections, before they were changed, while the USA with its frequent House elections went through 23 elections. The frequency distribution by type of electoral rule is shown in Table 3.6.

Table 3.6. Established democracies 1945–90

Electoral system                        Systems  Elections
Basic (plus legal thresholds and unequal district magnitudes)
  First-past-the-post                         7         78
  M = 1, alternative vote                     3         19
  M = 1, majority–plurality                   2          8
  List PR, d’Hondt formula                   21         94
  List PR, Hare-LR formula                    5         38
  List PR, modified Sainte-Laguë              2         15
  Single transferable vote                    4         27
  Single nontransferable vote                 2         18
Composite
  PR, with two-tier districting               7         31
  PR, two tiers and adjustment seats         13         68
  PR, four-tier districting                   3          4
  Mixed PR-majority                           1          2
TOTAL                                        70        402

Note: Number of electoral systems and the total number of elections in which they were used. Source: Tables in Lijphart (1994: 16–47).

In sum, 26 percent of the elections used single-seat districts, mostly FPTP, while 37 percent used List PR, mostly d’Hondt. Only 11 percent used STV or SNTV, and 26 percent used composite systems, mostly combining two-tier districting and adjustment seats.

From the 1950s to the 1990s, Golder (2005) records a vast expansion of democratic elections, and the relative shares of various systems fluctuate. By and large, the proportion of majoritarian systems has been steady around 36 percent, and there has been a shift away from pure PR (down from 41 percent in the 1950s to 28 percent in the 1990s) in favor of mixed systems (up from 14 to 22 percent).

In the early 2000s, Reynolds, Reilly, and Ellis (2005: 30) counted 68 established democracies in sovereign or autonomous areas of the world. Their electoral systems were distributed as follows: Single-seat districts 39.7 percent, mostly FPTP; List PR 30.9 percent; STV or SNTV 5.8 percent; Block Vote 11.8 percent; various composites 11.8 percent. Block Vote owes its rather large share to small British islands such as Guernsey, Jersey, and Man. If one goes by population shares, India and the USA bring the FPTP
share to 70 percent. On the other hand, 61 percent of the new democracies of the recent decades have chosen List PR. When existing systems are changed, the trend also is away from plurality/majority systems toward PR and mixed systems (Soudriette and Ellis 2006).

What determines the choice of electoral systems? It makes eminent sense to consider the constellation of political forces at the moment of decision. It is in the interest of a two-party founding body to choose FPTP and of a multiparty body to choose PR. However, colonial history looks like an even more powerful determinant. Of all countries that have election rules, 36 percent have some British heritage, but this share shoots up to 87 percent for the FPTP countries and drops to 6 percent for the List PR countries. Also, only 13 percent of all countries have some French heritage, but 45 percent of the Two-Rounds countries do. Many other countries used FPTP or Two-Rounds in the past, but most of those with no British/French heritage have later switched to List PR or mixed systems. Strikingly, while colonialism has receded into a more distant past, its impact on electoral systems has become more marked.

The details of this Franco-British factor are shown in Table 3.7, based on Reynolds, Reilly, and Ellis (2005: 32 and 166–73). Many of the 199 countries and territories listed have poor or nonexistent democratic credentials. Authoritarian regimes can subvert any electoral laws for their noncompetitive fake elections, but they disproportionately tend to prefer Two-Rounds.

Table 3.7. Electoral systems and British–French heritage

Electoral system      Total  British   French    Other  British   French  Other’s
                      cases heritage heritage    cases share (%) share (%) share (%)
Total                   199       72       26      101       36       13       51
British heritage favorites
  FPTP                   47       41        2        4       87        4        9
  AV                      3        3        0        0      100        0        0
  STV                     2        2        0        0      100        0        0
  LV & BC                 2        2        0        0      100        0        0
  BV & PBV               19       11        4        4       58       21       21
  SNTV                    4        2        0        2       50        0       50
French heritage favorite
  TR                     22        2       10       10        9       45       45
Favorites of the rest of the world
  List PR                70        4        6       60        6        9       86
  Parallel               21        3        4       14       14       19       67
  MMP                     9        2        0        7       22        0       78
The prevalence of FPTP in the British-heritage countries, from St Kitts to India, is striking, as is its current absence in the rest of the world. Alternative Vote, STV, LV, and modified BC occur in very small numbers, but all those cases happen to be in the British-heritage one-third of all the countries. To a lesser degree, this also applies to Block Vote, PBV, and SNTV. As noted, Two-Rounds occurs with relatively high frequency in French-heritage countries. The rest of the world uses preponderantly List PR, with an increasing sprinkle of parallel systems and MMP.
Conclusions
The central purpose of this book is to elucidate regularities in the impact of electoral systems on party systems. This effort has to start with the simplest electoral systems. Even some of the electoral systems designated here as basic surpass as yet our present capabilities for building quantitatively predictive models. This applies even more to those systems designated as complex and composite. Pathologies of electoral systems are a source of increased random noise.

While building logical models of manageable simplicity, one has to overlook many such complications. At the same time, we must be aware of the simplifications made, so as not to mistake the models for the real world. Only then can we use these models for prediction, and know the limits on these predictions.

Many claims made in this chapter and in the preceding ones need logical proof and empirical testing. For this we need operational tools. What exactly do we mean when talking of a smaller or a larger number of parties, or of deviation from PR? Some such analytic tools are presented in the next chapters.
4 The Number and Balance of Parties
For the practitioner of politics:

• When some parties have many seats and some have few, we need a meaningful ‘effective’ number of parties, so as to compare the effects of electoral systems on party systems.
• The standard way to express the effective number of parties is to convert to fractional seat shares, square them, add, and take the inverse. Thus, for seat shares 50-40-10, the effective number is N = 1/[(0.5 × 0.5) + (0.4 × 0.4) + (0.1 × 0.1)] = 2.38.
• The same can be done with vote shares.
• This method is not ideal, but all others are worse.
• Effective number can be complemented by a measure of balance in party sizes.
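The recipe in the list above (convert to fractional shares, square, add, invert) can be written as a short Python function. This is a minimal sketch; the function name `effective_number` is chosen here for illustration and does not come from the text.

```python
def effective_number(sizes):
    """Effective number of parties from seats, votes, or percentages.

    Steps follow the text: convert to fractional shares, square them,
    add the squares, and take the inverse of the sum.
    """
    total = sum(sizes)
    shares = [s / total for s in sizes]
    return 1 / sum(share ** 2 for share in shares)

# Seat shares 50-40-10, the example used in the text.
print(round(effective_number([50, 40, 10]), 2))  # 2.38
```

Because the shares are normalized inside the function, the same call works for raw seat counts, vote counts, or percentages.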
The number of parties is among the most frequent numbers in political analysis, and it is central to the study of party systems. A party system involves, of course, much more than the mere number of parties, but it is impossible to describe it without giving some idea of how many players are involved. But what is a meaningful number of parties in an assembly, when some parties have many seats and some have few? Also, what is a meaningful number of parties in an election, when some of them obtain many more votes than some others?

Description of electoral systems in Chapter 3 has included hints at how they affect the number of parties. It is time to measure it. The main part of this chapter presents what is needed to measure the number of parties in an informed way, for various purposes. The chapter appendix addresses various methodological issues that some readers may wish to bypass.
We begin with the number of legislative parties—those in the assembly. Three ways to measure this number will be introduced, plus a resulting index of balance, so as to characterize a mix of large and small parties. These measures are then used to classify party systems. Extension to electoral parties leads to a comparison of the numbers based on votes and seats. Thereafter, the effective number of components is expanded to uses beyond electoral systems. In chapter appendix, the discussion becomes somewhat more technical, as I justify the choice of indices, indicate relationships among them, and discuss the quest for a number of relevant parties.
Basic Indices of Number and Balance of Legislative Parties
It often suffices to talk of two-party or two-and-a-half party systems, and so on. But in the face of large, small, and tiny parties, we need a more precise measure for cross-country comparisons of institutional effects, and also for detecting gradual changes within a country. Three approaches to the number of parties are useful either for practical purposes or for construction of predictive models. As a specific example, suppose 100 seats are distributed among six parties as 48-25-13-9-4-1. How many parties are there?

The simplest way is to count the seat-winning parties, meaning those who are represented in the assembly with at least one seat. I will call this number N0, for reasons of systematics explained in the chapter appendix:

N0 = the number of seat-winning parties.

In our example, N0 = 6. This is the largest number of parties that could possibly be claimed for this constellation. Its obvious shortcoming is that we may not feel the system really has 6 meaningful parties.

We should take into account at least the relative size of the largest party. Indeed, the inverse of the largest fractional share represents the smallest number of parties that could be claimed for a party constellation (see chapter appendix). I will call this number N∞, for reasons of systematics:

$$N_\infty = \frac{1}{s_1} = \text{inverse of the largest fractional share.}$$

In our specific example, s1 = 0.48, hence N∞ = 1/0.48 = 2.08. This may look more realistic than N0 = 6, but we may feel that it underestimates the number of parties, given that more than two parties can have some impact.
The effective number of parties yields a value intermediary between N0 and N∞. Called N2 for reasons of systematics, it is calculated as

$$N_2 = \frac{1}{\sum (s_i)^2} = \text{inverse sum of squared fractional shares.}$$

Here si stands for the fractional seat share of the ith party. In our specific example, N2 = 1/[0.48² + 0.25² + 0.13² + 0.09² + 0.04² + 0.01²] = 1/0.3196 = 3.13. For some purposes, this party system acts as if it were composed of 3 equal-sized parties, plus a minor fourth party. This does not imply that the largest three parties matter fully, the fourth only a little, and the rest not at all; the meaning is more subtle.

Introduced into political science by Laakso and Taagepera (1979), the effective number has ‘become the most widely used measure of the number of parties’ (Lijphart 1994: 70). ‘It is now the standard measure of how concentrated vote shares are in electoral contests’ (Cox 1997: 29).

The formula above requires that all seats be first converted into fractional seat shares, but one can bypass this stage. Suppose seats in an 80-seat assembly are distributed 45-34-1 (as in New Zealand in 1943). Rather than convert into fractional shares, one can calculate the effective number as N2 = 80²/(45² + 34² + 1²) = 2.01. The following formula can be used with the numbers of seats as well as with percentages:

$$N_2 = \frac{(\sum S_i)^2}{\sum (S_i)^2}.$$

Here Si is the number of seats for the ith party. Note the beautiful symmetry of this expression. The numerator and denominator include exactly the same symbols, with only the parentheses shifted.

N∞ cannot be larger than N2, and N0 cannot be smaller:

N0 ≥ N2 ≥ N∞.

They are equal when all seat-winning parties have equal shares. For example, for 25-25-25-25, all three measures yield 4.00. We use the effective number more often than any other measure. Therefore, I will from now on designate it simply as N, unless otherwise specified:

N = N2.

The formula for the effective number is ‘operational’ in that it can be applied mechanically to any constellation of fractional shares. But the formula does not tell us which shares we should feed in. Should the German CDU and its Bavarian ally CSU be counted as a single party or two sister parties, given their noncompetition in elections and cooperation in government? In the reverse direction, factions inside the Japanese Liberal Democratic Party could be seen as ‘parties within the party’ (Reed and Bolland 1999). Such judgment calls are up to the researcher, regardless of how the number of parties is defined. Lijphart (1999: 69–74) settles such dilemmas by calculating N in both ways and taking the arithmetic mean. This is a reasonable solution when the two horns of the dilemma are fairly close. But Chile presents difficulties. There, the effective number of major voting blocks is close to 2. However, the effective number of mutually competing parties within the blocks is many times higher. Both numbers make sense, in different ways, but their mean might not make any sense.

For a given effective number of parties, the actual shares of parties may be quite equal or highly unequal. The following index of balance may be a suitable measure of such variation (Taagepera 2005):

$$B = \frac{\log N_\infty}{\log N_0} = \frac{-\log s_1}{\log N_0}.$$

This index takes the ratio of the logarithms of the minimal (N∞) and maximal (N0) estimates of the number of parties. It is also the logarithm of the largest fractional share divided by the logarithm of the number of seat-winning parties, with a negative sign. The index of balance can range from nearly 0 to 1. It is 1 for full balance, meaning that all seat-winning parties have equal shares. It approaches 0 for utter imbalance. Most constellations have a balance around 0.5. Our sample constellation 48-25-13-9-4-1 yields B = 0.41, meaning that it is slightly less balanced than most constellations. Among the various constellations that all lead to N = 3.00, 34-33-33 has a high balance of B = 0.98, while a low balance of B = 0.18 is reached with 57-1 plus 21 parties at 2 percent. Odd as the definition of B may look, the next section shows that it leads to some agreement with intuitive notions about balance.
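The three measures and the index of balance defined in this section can be checked against the examples in the text with a short Python sketch. The function names (`n0`, `n2`, `n_inf`, `balance`) are illustrative choices, and the sketch assumes at least two seat-winning parties, since B is undefined when N0 = 1.

```python
import math

def n0(seats):
    """Number of seat-winning parties."""
    return sum(1 for s in seats if s > 0)

def n2(seats):
    """Effective number: squared sum of seats over sum of squared seats."""
    return sum(seats) ** 2 / sum(s * s for s in seats)

def n_inf(seats):
    """Inverse of the largest fractional share."""
    return sum(seats) / max(seats)

def balance(seats):
    """Index of balance B = log(N_inf) / log(N0); needs N0 >= 2."""
    return math.log(n_inf(seats)) / math.log(n0(seats))

examples = {
    "48-25-13-9-4-1": [48, 25, 13, 9, 4, 1],
    "34-33-33": [34, 33, 33],
    "57-1 plus 21 parties at 2": [57, 1] + [2] * 21,
}
for label, seats in examples.items():
    print(label, n0(seats), round(n2(seats), 2),
          round(n_inf(seats), 2), round(balance(seats), 2))
```

The last column reproduces the balance values quoted above (0.41, 0.98, and 0.18), and the middle columns respect N0 ≥ N2 ≥ N∞ in each case.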
Mapping Party Systems, Using Number and Balance of Parties
Characterizing the types of party systems has concerned students of party politics for a long time; see the excellent overview by Steven Wolinetz (2006). A long strand of researchers, extending from Jean Blondel (1968) to Allan Siaroff (2000), has considered the number of parties and the relative balance among them the basic criteria for classifying party systems.

Of course, several other criteria also enter. The nature of alternation among ruling parties must be considered (Mair 1997). Alternation may be nil (permanent hegemonic party), partial (when some coalition partners are replaced), or wholesale (ruling party or coalition fully replaced). The degree of ideological polarization can be mild or strong (Sartori 1976). Patterns of opposition can be competitive, cooperative, or coalescent (Dahl 1966). Still, with few exceptions (Gunther and Diamond 2003), these specifications complement the number and balance of parties rather than replacing them. In the words of Wolinetz (2006: 60): ‘Relationships depend on numbers.’

Blondel (1968) had to depend on impressionistic estimates of the number of parties rather than an operational measure. Siaroff (2000) made use of the effective number of parties but had to depend on qualitative estimates of balance. With the benefit of the index of balance, we can now operationalize both measures, as exemplified in Figure 4.1. Here balance is graphed against the effective number in 25 stable party systems, from 1985 to 1996. The average indicators for two to four elections are shown, as derived from data in Mackie and Rose (1997). Cases with and without absolute majorities are shown with different symbols.

Approximately in line with Siaroff’s typology (2000), Figure 4.1 distinguishes regions that correspond to two-party, two-and-a-half to three-party, multiparty, and highly multiparty systems. The line B = 0.50 offers a convenient separation line between relatively balanced and unbalanced systems, with about half the cases falling on either side of the line. Balance is a matter of degree, though. In many ways, Denmark is closer to the Netherlands, across the B = 0.5 line, than to Italy, in the same conventional region.

Extreme lack of balance is rare, although it could occur for any effective number. Also rare is near-perfect balance, of which Malta is the only case in Figure 4.1. If a third party had won even a single seat, Malta’s index of balance would have tumbled to around 0.65, close to the USA. Given such a possibility, balance may look very unstable, but this is not so. Over two to four elections in 1985–96, it is observed to remain stable within ± 0.03 units, while the effective number at times changes by more than 1 unit. As the number of parties increases, balance tends to be restricted to an ever narrowing zone around B = 0.5. Deviations from this mean are possible but rare.
[Figure 4.1. Balance vs. effective number of legislative parties in 25 countries, 1985–1996. Horizontal axis: effective number of legislative parties (1 to 8). Vertical axis: balance (0 to 1). The plot marks the one-party, two-party, three-party, multiparty, and highly multiparty regions, the balanced and unbalanced zones, the conceptually forbidden area, and the dashed curve B = −log2/log(1 − N/4); cases with s1 > 0.5 and s1 < 0.5 are shown with different symbols.]
At the left and top of the graph, conceptually forbidden areas are shown. Indeed, for less than two effective parties, high balance becomes impossible. Also, very high balance is impossible for most values of the effective number of parties, given that the definition of B involves N0, which comes in integer numbers.

An empty one-party zone at N < 1.7 and high imbalance is shown in the graph. Among the stable democracies by Lijphart’s criteria (1999), Botswana would be located there. African party systems in general tend to have low balance, with a few dominant large parties surrounded by numerous small ones (Mozaffar, Scarritt, and Galaich 2003).

Sister parties present problems in Germany and Australia. Should they be counted together or separately? Both options are shown in Figure 4.1 and are seen to matter relatively little for the German CDU and CSU. For the National and Liberal Parties in Australia, however, the way to count them makes a difference. Such judgment calls cannot be avoided.

Spain and Japan illustrate the limits of a classification of party systems on the basis of number and balance alone. Both countries have large major parties plus many minor ones, meaning low balance. They are located close to each other in Figure 4.1. Their alternation patterns, however, differ. In Japan, the Liberal Democratic Party continued being the predominant party, while in Spain the People’s Party relieved the Socialists as the largest party. A quantitative measure of alternation remains to be worked out. It would represent a third dimension, orthogonal to N and B.

For given N and B, can we tell whether the largest party has absolute majority? It can be shown that majority never prevails above the dashed curve B = −log2/log(1 − N/4), shown in Figure 4.1. Below this curve, majority becomes increasingly likely as balance decreases, but it never becomes a certainty at N larger than 2. Even when N is as low as 2.2 and B is as low as 0.20, this combination still could arise from 2 parties barely short of 50 percent plus a smattering of parties with 1 seat each, unlikely as such a combination would be. Note, however, that a party can remain predominant even while not enjoying absolute majority at the moment. Hence the characterization of party system as such need not depend on absolute majority. In Japan, majority materialized in 1986 and 1990, but not in 1993. It certainly affected the moment’s politics but not the broad framework.
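The dashed curve B = −log2/log(1 − N/4) can be explored numerically. The Python sketch below is an illustration under one assumption that the text does not spell out: the boundary case is taken to be a largest party with exactly half the seats and all other seat-winning parties equal in size. The constellation 50-25-25 is such a case and should fall on the curve.

```python
import math

def majority_boundary(n):
    """Balance on the curve B = -log 2 / log(1 - N/4), for 2 <= N < 4."""
    return -math.log(2) / math.log(1 - n / 4)

def effective_number(seats):
    return sum(seats) ** 2 / sum(s * s for s in seats)

def balance(seats):
    """B = -log(s1) / log(N0), with s1 the largest fractional share."""
    s1 = max(seats) / sum(seats)
    n0 = sum(1 for s in seats if s > 0)
    return -math.log(s1) / math.log(n0)

# Assumed boundary case: largest party at exactly 50 percent,
# the remaining seat-winning parties equal in size.
seats = [50, 25, 25]
n = effective_number(seats)
print(round(n, 2), round(balance(seats), 3), round(majority_boundary(n), 3))

# The curve at a few values of N.
for n_value in (2.2, 2.5, 3.0, 3.5):
    print(n_value, round(majority_boundary(n_value), 3))
```

For the assumed 50-25-25 case the balance and the curve value coincide, which is consistent with the claim that an absolute majority never occurs above the curve.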
Legislative Versus Electoral Parties
The effective number could also be calculated on the basis of vote shares of parties:

$$N_V = \frac{1}{\sum (v_i)^2}.$$

Here vi stands for the fractional vote share of the ith party. To distinguish between the number of legislative and electoral parties, we can use the notation NS for seats and NV for votes when the need arises. These numbers are sometimes referred to as effective number of parliamentary parties (ENPP) and effective number of electoral parties (ENEP), respectively, but single symbols with subscripts are more in line with scientific notation. Indeed, ENP risks being mistaken for multiplication of the quantities E, N, and P.

The effective number based on votes (NV) almost always exceeds the one based on seats (NS), although exceptions occur. Figure 4.2 shows the average NS over many elections graphed against the average NV. It includes those 37 electoral systems distinguished by Lijphart (1994: 17–47 and 160–2) that involved at least 3 national elections.¹ The effective number of parties is rarely much less than 2.0; overwhelming dominance by one party is unusual in stable democracies. The effective number rarely exceeds 5.5 for electoral parties or 5.0 for legislative parties. Most differences between the two range from 0 to 0.8, and the median is 0.4 (Taagepera and Shugart 1989: 84). Hence, for most electoral systems,

NS = NV − 0.4 ± 0.4.

This relationship is purely empirical. It obviously could not apply when NV is less than 1.4, as NS would fall below 1. The gap exceeds 0.8 mainly in early, unsettled electoral systems, where more parties run than can realistically expect to win seats under the given electoral laws. Only India 1962–84, France 1958–81, and Spain 1977–89 have a persistent gap of more than 0.8 between NV and NS. The latter two countries are among those where malapportionment in favor of rural areas has been heavy (Lijphart 1994: 128).

[Figure 4.2. Effective number of legislative parties vs. effective number of electoral parties. Horizontal axis: NV (0 to 6). Vertical axis: NS (0 to 6). PR and M = 1 systems are shown with different symbols; reference lines NS = NV, NS = NV − 0.4, and NS = NV − 0.8; labeled outliers: FRA 1958–81, SPA 1977–89, IND 1962–84.]

¹ The effective numbers data are given in Lijphart (1994: 160–2), labeled ENEP and ENPP. The number of elections is listed in other tables (Lijphart 1994: 17–47). I have omitted the electoral systems that were discarded after one or two elections, because the first election after a change of system is atypical. Also omitted were elections to the European Parliament, because the perceived weakness of this body affects voting behavior. When the electoral outcome is considered unimportant, voters are more willing to waste their vote on minor parties that fail to win seats. As a result, the gap between vote- and seat-based effective numbers of parties is often much wider in Euroelections than in national elections.

Effective Number of Components: Seats, Votes, Polities, and Power Shares
All that has been said about seat shares also applies to shares of votes, except that it would be difficult to specify N0, the number of parties that receive at least one vote. Actually, we would be more interested in the number of ‘serious’ parties that run, but this number is hard to define (as is discussed in Chapter 15). The task becomes even more difficult when numerous independents run, some of them seriously. In some ways, each independent acts as a separate party, yet they lack many characteristics of parties.

The effective number of components other than parties can be useful outside the realm of electoral and party studies. This approach applies whenever well-defined components add up to a well-defined total. One can measure the effective number of polities in the world, based on their areas or populations (Taagepera 1997a). Over the last 5,000 years, their number has fallen from close to a million to around 20. One could also measure the effective number of car manufacturers in a country or the effective number of ethnic groups or religions, provided one can agree on which groups represent different ethnies or religions. The number obviously differs depending on whether one enters the fractional shares
of Muslims and Christians as such, or whether one enters the Sunni, Shia, Catholic, Orthodox, and Protestant shares separately.

In addition to seat or vote shares, one could also consider the effective number of parties based on shares of power, using for instance the standardized Banzhaf indices (Dumont and Caulier 2006). This may bring us closer to operationalizing the measurement of Sartori’s ‘relevant parties’ (1976), to be discussed in the chapter appendix. The power index approach presumes that minimal winning coalitions are the norm for government formation. The actual distribution of cabinets in stable democracies, however, shows an appreciable incidence of minority cabinets and oversized (i.e. larger than minimal winning) coalitions (Lijphart 1999: 98):

Minimal winning, one-party    37%
Minimal winning coalitions    25%
Minority cabinets             17%
Oversized coalitions          21%
Nonetheless, the power index approach is well worth further investigation. The choice depends on the goal. For the purposes of momentary governmental power, a constellation 53-47 means that the larger party has all the governmental power, as reflected by the power-based effective number, NP = 1.00, rather than by the seats-based effective number, NS = 1.99. But the smaller party may well become the dominant one, come next elections. Here the seats-based NS would reflect better the nature of the party system, compared to power-based NP , which remains at 1.00 after power transition to the other party.
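The contrast between NS = 1.99 and NP = 1.00 for the constellation 53-47 can be reproduced with a small Python sketch. It uses the textbook normalized Banzhaf index with a simple-majority quota; whether this matches the exact standardization used by Dumont and Caulier (2006) is an assumption, and the function names are illustrative.

```python
from itertools import combinations

def banzhaf(seats, quota=None):
    """Normalized Banzhaf index of each party.

    A party has a 'swing' in a winning coalition if the coalition
    would lose without it. The quota defaults to a strict majority
    of all seats (an assumption made for this sketch).
    """
    total = sum(seats)
    if quota is None:
        quota = total // 2 + 1
    parties = range(len(seats))
    swings = [0] * len(seats)
    for size in range(1, len(seats) + 1):
        for coalition in combinations(parties, size):
            weight = sum(seats[i] for i in coalition)
            if weight < quota:
                continue  # losing coalition, no swings to count
            for i in coalition:
                if weight - seats[i] < quota:
                    swings[i] += 1
    all_swings = sum(swings)
    return [s / all_swings for s in swings]

def effective_number(shares):
    total = sum(shares)
    return total ** 2 / sum(x * x for x in shares)

seats = [53, 47]
print(round(effective_number(seats), 2))           # seats-based, 1.99
print(round(effective_number(banzhaf(seats)), 2))  # power-based, 1.0
```

Enumerating all coalitions grows exponentially with the number of parties, which is acceptable for assemblies with a dozen or so parties but not for much larger sets.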
Conclusion
This completes the overview of how to apply the effective number and balance of parties (or other components) in various ways. What follows in the chapter appendix may not be needed to make use of these measures but matters from the methodological viewpoint. Several weak aspects of the effective number of parties, defined as N = 1/Σ(si)², have been pointed out by Taagepera and Shugart (1989: 259), Molinar (1991), Taagepera (1997b, 1999a), and Dunleavy and Boucek (2003), as is discussed in the appendix. Still, for most purposes the effective number remains preferable to any of the alternatives offered. The only way to improve on it significantly would be to supplement it with a second indicator. It could be the index of balance (B = logN∞/logN0), or the inverse of the largest
share, or Sartori’s number of relevant parties (1976) (discussed in chapter appendix). The fractional shares that enter the equation for N can be shares of votes, seats, power, or something else, depending on the purpose. In this book, N without a subscript stands for the effective number of parties based on seat shares. Mapping the party systems on the basis of the effective number and balance of legislative parties leads to a picture close to some previous intuitive classifications.
Appendix to Chapter 4
What options do we have for measuring the number of parties in the first place, and what are their limitations? How do the various measures interrelate? Are they inherently different, or are they different variants of the same master equation? Could the number and balance of parties be combined into a single super-index? Last but not least, could we feed in something else, apart from the sheer size of parties, in a quest for a number of relevant parties? It may look at times that I am going very slow and making a fuss about the obvious, but the superficially obvious sometimes has unexpected implications.
How to measure the number of parties
First, let us consider some properties of the three standard ways to measure the number of parties. The number of seat-winning parties (N0) looks simple, but the same value N0 = 6 could represent 20-20-20-20-10-10 or 90-2-2-2-2-2. Moreover, in the presence of numerous tiny parties and independents, many data collections may lump as many as 20 percent of the seats under the label ‘Others’. When 9 seats are reported as ‘Others’, it could mean 9 seats for a single ephemeral party or one seat each for 9 parties or independents. In the absence of any other knowledge, the square root of the seats for ‘Others’ might be added to the explicitly named seat-winning parties. Apart from such uncertainty, N0 is the largest number of parties that conceivably could be claimed for the given constellation.

At the other extreme, the inverse of the largest fractional share (N∞ = 1/s1) is the smallest number that could be claimed, for the following reason. Consider 25-25-25-25. Clearly, there are 1/0.25 = 4 parties. Suppose now that we are given 25- . . . - . . . and are told that none of the unknown shares surpasses 25 percent. The smallest number of parties we could possibly guess at is 4. If we guessed 3, at least one of the other shares would have to surpass 25 percent. Extending this observation to any largest fractional seat shares usually leads to inverses that are not integers. Consider a largest share of 30 percent. Then N∞ = 1/0.30 = 3.33. This means 1+1+1+1/3. The corresponding constellation with the largest shares, and hence arguably with the fewest possible number of parties, is 30-30-30-10, where 10 is indeed one-third of the large shares. It can be shown that such an interpretation of the fractional part of N∞ always applies.

The nice aspect of such a definition of the number of parties is that, whenever N∞ is less than 2, we know that one party has absolute majority. The complementary weakness of N∞ is that it is not sensitive at all to the distribution of shares beyond the first, so that N∞ = 2.08 could mean a balanced 48-48-4 or an unbalanced 48-10-10-10-10-10-2. Note that N∞ is not affected by the ‘Others’ category. In this sense it is a more operational measure than N0: less is left to the judgment of the person doing the calculation.

In the presence of large and small parties, the effective number of parties, N = N2 = 1/Σ(si)², is an intermediary measure between N0 and N∞. It saves us an arbitrary decision on which parties matter, by applying self-weighting. This means that, for seat shares 50-40-10, each fractional share is given a weight proportional to its size:

0.50 × 0.50 = 0.25
0.40 × 0.40 = 0.16
0.10 × 0.10 = 0.01

The third party contributes very little to the resulting sum of 0.42. We do not have to exclude it by setting some arbitrary threshold; the small party effectively eliminates itself, through self-weighting. The effective number N = 1/0.42 = 2.38 indicates that for many purposes (but not all!) the party system acts as if it were composed of 2 equal-sized parties, plus a smaller party. The constellation that fits that description exactly and leads to N = 2.38 is 45.3-45.3-9.4. The algebraic connection between the fractional part (0.38) of N and the seat share of the residual party (0.094) exists but is quite complex.
For our previous example 48-25-13-9-4-1, the simple equivalent with the same value of $N = 3.13$ would be 32.6-32.6-32.6-2.2, meaning 4 seat-winning parties rather than the actual 6.

The minimal number of parties needed to form a majority coalition can usually be estimated from $N$ as follows. Take one-half of the effective number of parties and round it off to the closest integer (Taagepera 2002a). In 50-40-10, we have 2.38/2 = 1.19, which rounds off to 1. For 48-25-13-9-4-1, one-half of the effective number (3.13/2 = 1.57) rounds off to 2, but it does so only narrowly, reflecting the fact that just a small increase in the largest party share would enable it to form a single-party majority cabinet. This visualization of $N$ fails sometimes, but such cases are rare in practice.

The effective number of parties conveys some intuitive meaning as long as all parties are roughly of the same size. But consider the constellation 53-15-10-10-10-2, where $N = 3.00$. In what sense are there 3 effective parties, rather than 1 or 5? The visualization of at least $N/2$ parties being needed for majority coalition also fails here. Indeed, 3.00/2 = 1.50 technically rounds off to 2, while actually one party alone has absolute majority.

This leads us to a shortcoming of $N$: it does not always tell us whether one party has an absolute majority. When $N < 2$, the largest share is bound to be more than 50 percent. When $N$ exceeds 4, the largest share is bound to be less than 50 percent. But when $2 < N < 4$, we do not know, and this is the range where the effective number most often falls (cf. Figure 4.2). From this viewpoint, we are better served by $N_\infty = 1/s_1$, but at the cost of not distinguishing between cases such as 53-47, where the next election might reverse the power relationship, and 53-15-10-10-10-2, where the largest party is likely to remain dominant even if it loses its majority. Both have $N_\infty = 1.89$.

In contrast to $N_\infty$, the effective number $N$ is sensitive to the distribution of shares beyond the first. The constellation 48-48-2 has $N = 2.17$, while 48-10-10-10-10-10-2 has $N = 3.56$, although both have $N_\infty = 2.08$. But this also means that a residual category of ‘Others’ can affect $N$. Suppose we have 48-25-13-4-1, plus 9 seats for ‘Others’. If these 9 seats go to a single ephemeral party, then $N = 3.13$. If they go to 9 separate parties or independent candidates, then $N = 3.20$. In the absence of any further information, one might take the mean of these extremes, leading to $N = 3.16$. It can be seen that the possible error is small, unless ‘Others’ include a large share of the seats. In that case, more refined approaches exist (see Taagepera 1997b).
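The two extremes for the ‘Others’ category can be reproduced with a short sketch. It assumes seat counts are integers and that the most fragmented case is one seat per party or independent; the helper names are hypothetical.

```python
def effective_number(seats):
    """N = (sum of seats)^2 / sum(seats_i^2), equivalent to 1 / sum(s_i^2)."""
    total = sum(seats)
    return total ** 2 / sum(s * s for s in seats)


def bounds_with_others(named_seats, others_seats):
    """Return (N if 'Others' is one party, N if every 'Others' seat is a separate party)."""
    lumped = effective_number(list(named_seats) + [others_seats])
    split = effective_number(list(named_seats) + [1] * others_seats)
    return lumped, split


low, high = bounds_with_others([48, 25, 13, 4, 1], 9)
print(round(low, 2), round(high, 2), round((low + high) / 2, 2))  # 3.13 3.2 3.16
```

The gap between the two bounds widens as the ‘Others’ share grows, which is the situation where the text points to more refined approaches.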
Approximate relationships between the three measures of the number of parties

This section presents approximate relationships that prevail between $N_0$, $N$, and $N_\infty$, on the average. Logical proof and empirical evidence for these relationships are given in later chapters. If we only knew the largest seat share and hence $N_\infty$, our best guess for the effective number would be $N_\infty$ with exponent 4/3 = 1.33:

$$N \approx \left(\frac{1}{s_1}\right)^{4/3} = N_\infty^{4/3}.$$

Our best guess for the number of seat-winning parties would be the square of $N_\infty$:

$$N_0 \approx \left(\frac{1}{s_1}\right)^{2} = N_\infty^{2}.$$

Do not expect exact agreement in individual cases! Thus, for 48-25-13-9-4-1, we actually have $N = 3.13$, but $N_\infty^{4/3}$ would lead to $N \approx (1/0.48)^{4/3} = 2.08^{4/3} = 2.66$. Also, since $2.08^2 = 4.3$, we would guess that 4 or 5 parties would win seats, when the actual number is 6. The point is that, even in the absence of any other information but the largest share, we can still estimate the effective number of parties and the number of seat-winning parties, although with an appreciable error. Even with such limited information, we can do better than saying ‘We do not know’. This book makes systematic use of such an ‘ignorance-based’ approach (Taagepera 1999b) to develop useful predictive models.

Conversely, if we only knew that $N_0$ parties won seats, our best guess for the effective number would be

$$N \approx N_0^{2/3},$$

and for $N_\infty$ it would be the square root of $N_0$:

$$N_\infty \approx N_0^{1/2}.$$

Thus, for 6 parties winning seats, our best guesses would be $6^{2/3} \approx 3.30$ for the effective number of parties, instead of 3.13 for our specific example. We would also guess at $6^{1/2} \approx 2.45$ for $N_\infty$, which leads to 1/2.45 = 0.41 = 41% for the largest share, instead of the actual 48 percent.

If we knew both the number of seat-winning parties and the largest seat share, our best guess for the effective number would be the geometric mean of the two separate guesses. In our specific example, $(2.66 \times 3.30)^{1/2} = 2.96$. Compared to either of the separate guesses, this is closer to the actual 3.13, as one would expect: the more information we have, the better our guess is likely to become.

Do not use these coarse approximations if you have more detailed information on the distribution of seats! But also have the courage to make the most out of incomplete information, rather than say, ‘I don’t know’. This is one of the most valuable lessons I learned from my Ph.D. work in physics. Indeed, nuclear physicist Enrico Fermi (after whom the element fermium is named) reputedly challenged his students with a social-sciency question: ‘How many piano tuners are there in the City of New York?’ The idea was to show how much knowledge could be deduced from apparently total ignorance, so as to guide one’s research into the right ballpark. Indeed, with minimal coaching, most of my undergraduates get the number of piano tuners within a factor of 2 of the census figure.
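The ‘ignorance-based’ guesses of this section can be written out as a small sketch. The function names are hypothetical, and the exponents are simply those stated in the text; nothing here goes beyond the average relationships given above.

```python
def guess_from_largest_share(s1):
    """Guess N and N0 from the largest fractional share s1 alone."""
    n_inf = 1.0 / s1
    return {"N": n_inf ** (4.0 / 3.0), "N0": n_inf ** 2}


def guess_from_party_count(n0):
    """Guess N and N_inf from the number of seat-winning parties alone."""
    return {"N": n0 ** (2.0 / 3.0), "N_inf": n0 ** 0.5}


def combined_guess(s1, n0):
    """Geometric mean of the two separate guesses for N."""
    return (guess_from_largest_share(s1)["N"] * guess_from_party_count(n0)["N"]) ** 0.5


# Worked example 48-25-13-9-4-1 (actual N = 3.13)
print(round(guess_from_largest_share(0.48)["N"], 2))  # 2.66
print(round(guess_from_party_count(6)["N"], 2))       # 3.3
print(round(combined_guess(0.48, 6), 2))              # 2.96
```

Written out, the combined guess equals $N_\infty^{2/3} N_0^{1/3}$, which follows directly from taking the geometric mean of the two exponent rules.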
Systematics of the number of parties

The expressions for $N_0$, $N_2$, and $N_\infty$ all derive from the same master equation (Laakso and Taagepera 1979):

$$N_a = \left[\sum (s_i)^a\right]^{1/(1-a)},$$

where the parameter $a$ can range from 0 to ∞. The larger the value of $a$, the more heavily the largest share weighs in, compared to the smaller shares. When $a = 0$, all seat-winning parties weigh in at equal weight of 1, so that the formula yields the total number of parties, $N_0$. When $a$ tends to ∞, only the largest share matters, and the formula yields the inverse of this share, $N_\infty$. As $a$ is increased, the value of $N_a$ decreases. When $a = 2$, the effective number $N_2 = 1/\sum (s_i)^2$ results. Obviously, no measure of the number of parties that exceeds $N_0$ would make sense. The master equation also suggests that no measure that falls short of $N_\infty = 1/s_1$ would make sense.

The master equation is undefined at $a = 1$, but its limit as $a$ tends toward 1 is well defined. $N_1$ is the inverse of the product of all fractional shares, raised to the exponents equal to themselves:

$$N_1 = \frac{1}{\prod s_i^{s_i}}.$$

It can also be expressed as $N_1 = e^H$, where $H$ is the system entropy, as defined in physics:

$$H = -\sum s_i \ln s_i.$$

While physicists use natural logarithms in the definition of entropy, information scientists prefer logarithms to the base 2. It does not matter; either way, $N_1 = 1/\prod s_i^{s_i}$.

The connection to entropy is remarkable and possibly surprising. This measure of the number of parties has been occasionally used ever since 1960 (see Taagepera and Shugart 1989: 260). It has great philosophical appeal, in view of the importance of entropy as a unifying concept that extends from thermodynamics to information science. Unfortunately, it has one practical drawback. Its values, located in-between those of $N_0$ and $N_2$, are overly sensitive to the presence of tiny parties and independents, which data sources tend to group under the label ‘Others’. Thus, it may not even be possible to calculate $N_1$ with sufficient precision, and it exaggerates the number of parties, compared to our intuitive notions. For these reasons, $N_2 = 1/\sum (s_i)^2$ has proved preferable as a measure of the number of parties. It is also slightly easier to calculate $N_2$, and some desirable consequences of using $N_2$ will emerge in later chapters.
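As an illustration, the master equation and its two limiting cases can be sketched as one function. The name `n_a` and the handling of the special cases (an explicit branch for $a = 1$ via the entropy form, and for $a = \infty$ via the largest share) are choices made for this sketch, not something prescribed by the text.

```python
import math


def n_a(seats, a):
    """Master equation N_a = [sum(s_i^a)]^(1/(1-a)) for fractional shares s_i."""
    total = sum(seats)
    shares = [s / total for s in seats if s > 0]
    if math.isinf(a):
        # Limit a -> infinity: only the largest share matters.
        return 1.0 / max(shares)
    if abs(a - 1.0) < 1e-12:
        # Limit a -> 1: N_1 = exp(H), with H = -sum(s_i * ln s_i).
        return math.exp(-sum(s * math.log(s) for s in shares))
    return sum(s ** a for s in shares) ** (1.0 / (1.0 - a))


example = [48, 25, 13, 9, 4, 1]
for a in (0, 1, 2, math.inf):
    print(a, round(n_a(example, a), 2))
```

For the running example the values fall in the order $N_0 \ge N_1 \ge N_2 \ge N_\infty$, in line with the statement that $N_a$ decreases as $a$ increases.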
The quest for a single super-index to characterize party systems

It would be nice to have a single number that all alone would express all there is to characterize about a party system. Unfortunately, no single measure of the number of parties can be satisfactory. The hard fact is that no single number can indicate both the central tendency and the variation around it.

What does this abstract statement mean? Consider a normal distribution with a mean of 1,000 units. If the individual values range around this mean with a standard deviation of only 1 unit, then the mean pretty much tells the entire story. But if the standard deviation is 100 units, then this standard deviation needs to be specified so as to warn us about the appreciable scatter. Similarly, the effective number is a measure of central tendency. It alone will do, when all parties are of roughly equal size. But if they are widely out of balance, then the effective number cannot tell us so and needs a complementary index to show how balanced the party sizes are.

In particular, an effective number between 2 and 4 does not tell us with certainty whether one party has absolute majority. To be on the safe side, one might supplement $N$ in such cases with $N_\infty$ (Taagepera 1999a). This approach was used by Siaroff (2003) to investigate two-and-a-half-party systems. But in most cases the second number conveys little extra information, given the heavy collinearity of $N$ and $N_\infty$ (remember the approximation $N_2 \approx N_\infty^{4/3}$!). This is why the index of balance was introduced.

True, $B$ still does not clearly tell us whether one party has majority, and it is slightly collinear with $N$, for the following reasons. Although $B$ ranges in principle from 0 to 1, very low values of $B$ can be reached only with a huge number of tiny parties, or with a very large largest party. For example, 99-1 is characterized by $N = 1.02$ and $B = 0.01$. At the opposite extreme, values of $B$ exceeding 0.5 become impossible when the largest share exceeds $2^{1/2}/2 = 0.7071$. Due to these limitations, some collinearity between $B$ and $N$ remains, in contrast to zero collinearity, in principle, between the mean and standard deviation of a normal distribution. But this collinearity is minimal, compared to the one between $N_\infty$ and $N$.

All the preceding applies to the balance of legislative parties. Estimation of the balance of electoral parties is made difficult because the number of parties that obtain at least a minimum of votes is hard to stipulate. I will return to this issue in Chapter 15.
Nonetheless, the idea of having a single number to characterize a party system is so attractive that the optimistic quest for such a perfect measure has lasted for many decades, and probably will continue. Apart from the entropy-based measure, it has produced two indices that boil down to a mix of $N$ and $N_\infty$.

Molinar’s index $N_P$ (1991) can be shown (Taagepera 1999a) to amount to

$$N_P = 1 + N - (N/N_\infty)^2.$$

For 50-50, we have of course $N = N_\infty = N_P = 2.00$. If one of these parties breaks up, leading to 50-25-25, then $N_\infty$ remains the same, $N$ goes up to 2.67, but $N_P$ actually goes down to 1.89. The latter shift may well reflect the increased degree of predominance of the largest party and the resulting imbalance, but hardly the number of parties as such. While avoiding some problems of $N$, index $N_P$ introduces several others (see Taagepera 1999a).

Dunleavy and Boucek (2003) propose an index $N_b$ that amounts to averaging $N$ and $N_\infty$. This average, $N_b = (N + N_\infty)/2$, involves, of course, all the same problems these authors rightly criticize in $N$, attenuated by a half. Still, given that $N$ has problems that $N_\infty$ does not have, and vice versa, their mean might conceivably alleviate both sets of problems to a sufficient degree. Unfortunately, this is not always the case. A major shortcoming of $N$ is that, in the range $2 < N < 4$, it gives no hint on whether the largest party has absolute majority. Within the reduced range $2 < N_b < 3$, $N_b$ preserves this indeterminacy. For instance, for the constellation 52-8-8-8-8-8-8, where $N_2 = 3.24$ and $N_\infty = 1.92$, we get $N_b = 2.58$, which, like $N$, gives no hint about the absolute majority of the largest party. To repeat a statement that the proponents of $N_b$ found hard to accept:

‘One may harbor the illusion that by judicious combination of N and N∞ (plus possibly something else) one might achieve a super-index that satisfies all desiderata. This is about as wishful as hoping to combine the mean and the standard deviation into a single measure. Two numbers are inherently able to transmit more information than one.’ (Taagepera 1999a)
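The two composite indices discussed above are easy to check numerically. The following is a minimal sketch using the reduced forms given in the text ($N_P = 1 + N - (N/N_\infty)^2$ and $N_b = (N + N_\infty)/2$); the function name `party_indices` is hypothetical.

```python
def party_indices(seats):
    """Return (N, N_inf, N_P, N_b) for a list of seat counts or percentages."""
    total = sum(seats)
    shares = [s / total for s in seats]
    n = 1.0 / sum(s * s for s in shares)      # effective number N
    n_inf = 1.0 / max(shares)                 # inverse of the largest share
    n_p = 1.0 + n - (n / n_inf) ** 2          # Molinar's index, reduced form
    n_b = (n + n_inf) / 2.0                   # Dunleavy-Boucek average
    return n, n_inf, n_p, n_b


for constellation in ([50, 50], [50, 25, 25], [52, 8, 8, 8, 8, 8, 8]):
    print(constellation, [round(x, 2) for x in party_indices(constellation)])
```

For 52-8-8-8-8-8-8 the result $N_b = 2.58$ lies in the same range as values produced by constellations with no majority party, which illustrates the indeterminacy discussed above.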
The number of relevant parties

All the approaches discussed up to now are mechanical in the sense that they do not take into account anything but the number of seats parties have. This is the best one can do in the absence of any other knowledge. Here, we face both the strength and the weakness of such a mechanical approach. Its strength lies in enabling us to compare a large number of cases, across countries and within countries at different times, without the need to dig up huge amounts of detailed information and then risk getting lost in the multiplicity of considerations. The weakness of the mechanical approach lies in not making use of extra knowledge that would color the picture. After all, some small centrist parties matter more than some extremist larger parties, for coalition formation and other purposes.

Giovanni Sartori (1976) proposed a very different approach to the number of parties: the number of relevant parties, designated here as $R$. His basic criterion for relevance is whether a party has entered into governing coalitions. Some large parties, such as the Italian Communist Party, have at times been perennially excluded from coalitions on grounds of ideological incompatibility. Yet they are relevant, because their exclusion makes formation of majority cabinets difficult, and minority cabinet survival depends on the tacit support by such excluded parties. Therefore, Sartori also includes in his count of relevant parties those capable of ‘blackmail’.

Sartori counts the relevant parties, without assigning different weights to highly or only marginally relevant ones. Expert opinions may differ on whether to include a party with marginal coalition or blackmail ability. This means that the number of relevant parties ($R$) is less ‘operational’ than the effective number of parties ($N$). ‘Operational’ means that anyone who carries out the prescribed operation (such as $N = 1/\sum s_i^2$) obtains the same result.
Note that I do not claim that even $N$ is fully operational, because opinions may differ on whether to include affiliated parties such as the German CDU and CSU jointly or separately.

Excluding the blackmail parties, Sartori (1976: 300–3) also offers a measure of the number of government-relevant parties alone, designated here as $G$. In a private communication to Sartori (January 27, 2000), I used the data in his Parties and Party Systems (1976) to try to establish some average relationships between $R$, $G$, and the effective number $N$. These relationships depend on several assumptions, on the use of which Sartori has not corrected me.

The central pattern for the relationship between Sartori’s numbers of government-relevant parties ($G$) and all relevant parties ($R$) is around $R = G + 1.5$, meaning that most often there are 1 to 2 relevant parties that are rarely included in government support. The opposite extremes are Austria’s classical grand coalitions ($R = G$) and Norway and Italy ($R = G + 3$). The latter two also differ from each other. Norway truly had three relevant center-right parties in addition to the usual one-party social-democrat cabinet. In Italy, one large blackmail party indirectly made more parties relevant for government support in the following way. When 30 percent of seats are excluded from the game, majority cabinet formation requires 50 of the remaining 70, rather than 50 of 100, making even minor parties desirable partners. This accounts for the large total number of relevant parties Sartori finds in Italy.

For the relationship between Sartori’s number of government-relevant parties and the effective number of parties ($N$), the central pattern is $G = N - 1$, meaning very roughly that one of the $N$ largest parties is most often outside the cabinet. However, in a number of cases $G = N$, meaning roughly that either all large parties are included (Austria) or large parties left out of the cabinet are balanced by rather small parties being included (West Germany, Italy, and most of all, France IV). At the other extreme, in Norway, Sweden, and Canada $G$ falls short of $(N - 1)$, reflecting unusually marked size disparity between the largest party and the 2 or 3 next-largest ones. Such disparity would imply a low value of $B$, the index of balance.
Finally, the relationship between Sartori’s total number of relevant parties and the effective number of parties varies. While $R$ and $N$ agree in two-party constellations, $R$ is larger than $N$ for many multiparty constellations. The gap increases as the number of parties increases. Among the multiparty systems, Switzerland and the Netherlands are closest to $R = N$. At the other extreme, $R$ exceeds $N$ most in Italy, with Norway and France IV and V coming next. Italy and Norway are of course the outliers regarding $R = G + 1.5$ too. This means that in these countries $R$ is out of whack not only with $N$ but also with Sartori’s own number of government-relevant parties.

In sum, these various ways to express the number of parties ($N$, $R$, and $G$) partly tell the same story, while partly illuminating different aspects of party systems. This book prefers to use $N$, because it is highly operational and is found to be connected to institutional factors. For cases that deviate from the general patterns thus predicted, it might be worthwhile to consider by how much the number of relevant parties differs from the effective number.
5 Deviation from Proportional Representation and Proportionality Profiles
For the practitioner of politics:
• Excessive deviation from proportional representation may hurt the democratic legitimacy of the regime.
• Two alternatives predominate for measuring the deviation from PR.
• The simplest is to add all the absolute differences between the seat and vote shares of each party, then divide by 2.
• This measure ($D_1$) ranges from 2 to 7 percent for most of the stable PR systems, and from 6 to 24 percent for most of the first-past-the-post systems.
• By a fancier measure ($D_2$), these ranges shift to 1 to 5 percent and 5 to 17 percent, respectively.
• Malapportionment of districts can increase the deviation from PR appreciably.
• Proportionality profiles are a way to show which parties are advantaged and which are disadvantaged.
• Volatility of votes from one election to the next can be measured the same way as the deviation from PR.
This chapter deals with measurement of deviation from PR, meaning the deviation of seat shares of parties from their vote shares. One can consider the overall deviation for all parties and characterize it by a number. One can also consider each party separately and graph the resulting proportionality profiles of parties.
Two other features have the same mathematical format as deviation from PR: volatility of votes from one election to the next, and the extent of ticket splitting, when voters have more than one ballot. All these phenomena deal with measuring deviation from a norm. Deviation from PR is 0 when seat shares equal vote shares, which represent the norm. Volatility is 0 when vote shares in the next election equal those in the previous one, which is taken as the norm. The extent of ticket splitting is 0 when all voters vote exactly for the same parties with all their ballots. Here, any ballot could be taken as the norm for the others.
Basic Indices of Deviation from PR and Volatility

Two ways to measure deviation from PR, volatility, and ticket splitting have dominated. Both start with the difference between the actual share and the norm, for each party, but then they process these differences in different ways. Loosemore and Hanby (1971) introduced into political analysis the index of deviation that I will designate as $D_1$, for reasons of systematics explained in the chapter appendix. For deviation from PR, it is

$$D_1 = \frac{1}{2}\sum |s_i - v_i|.$$

Here $s_i$ is the $i$th party’s seat share, and $v_i$ is its vote share. The index can range in principle from 0 to 1 (or 100 percent). Note that $|s_i - v_i| = |v_i - s_i|$ is never negative. $D_1$ dominated until Gallagher (1991) introduced what I will designate as $D_2$:

$$D_2 = \left[\frac{1}{2}\sum (s_i - v_i)^2\right]^{1/2}.$$

It has often been designated as the ‘least square’ index, but this is a misnomer. The index does involve squaring a difference but no minimization procedure so as to find some ‘least’ squares. $D_2$ can range from 0 to 1 (100 percent), but whenever more than two parties have nonzero deviations the upper limit actually remains below 1, an awkward feature to be discussed in the chapter appendix.

When only two parties have nonzero deviations, the one gaining what the other is losing, then $D_1$ and $D_2$ have the same value. But when more than two parties have nonzero deviations, then $D_1$ is bound to be larger than $D_2$. In sum, $D_1 \ge D_2$.
It is possible, though rare, that one of these indices increases while the other decreases, from one election to the next.

A third possible index is simply the largest single difference for any party. For reasons of systematics (see the chapter appendix), I will designate it as $D_\infty$:

$$D_\infty = \max |s_i - v_i|.$$

The largest $|s_i - v_i|$ may or may not pertain to the largest party. At times a third party loses more than the largest party wins, because some of the gain goes to the second-largest party. Along with $D_1$ and $D_2$, Lijphart (1994: 62) proposed $D_\infty$ for consideration. He labeled these three indices D, LSq, and LD, respectively, and published their average values for the aforementioned electoral systems in various countries and periods (Lijphart 1994: 160–2). In most cases, the values of $D_2$ and $D_\infty$ are fairly close: $D_2 \approx D_\infty$. $D_2$ tends to fall slightly short of the largest single seat–vote difference but can exceed it occasionally. It is always the case that $D_1 \ge D_\infty$.

For volatility, the corresponding equations are

$$V_1 = \frac{1}{2}\sum |v_{1i} - v_{0i}|,$$

$$V_2 = \left[\frac{1}{2}\sum (v_{1i} - v_{0i})^2\right]^{1/2},$$

$$V_\infty = \max |v_{1i} - v_{0i}|.$$

Here the subscripts 1 and 0 refer to two elections. $V_1$ is often called the Pedersen index (Pedersen 1979), though Przeworski (1975) foreshadowed it, under the name disinstitutionalization.

In the case of deviation from PR, Gallagher’s $D_2$ rapidly displaced $D_1$ during the 1990s as the most popular index. Discussion of concepts in the chapter appendix suggests that this shift may not have been as justified as it looked at the time. Maybe $D_1$ is actually preferable. But going with the flow, I will use $D_2$ for applications that follow, despite my reservations. For volatility, $V_1$ has continued to reign. The same seems to be the case for ticket splitting. In the following, I will focus mainly on deviation
from PR. The chapter appendix will address the general issue of how to measure deviation from a norm, how the various indices are related, and the relative advantages of $D_1$ and $D_2$.

Figure 5.1. Deviation from proportional representation vs. effective number of electoral parties. (Vertical axis: $D_2$ in percent; horizontal axis: $N_V$. Trend lines: $D_2 = 5.5(N_V - 1)$ for $M = 1$ and $D_2 = 15(N_V - 1)/N_V^2$ for PR. Labeled malapportioned systems: FRA 1958–81, SPA 1977–89, ICE 1946–59, JPN 1947–90. The conceptual anchor point is marked.)
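Because deviation from PR and volatility share one mathematical format, a single helper can compute all three indices against any norm. The sketch below is illustrative only: the function name is hypothetical, and the seat and vote percentages are made-up numbers, not data from any election.

```python
import math


def deviation_indices(actual, norm):
    """Return (D1, D2, D_inf) for actual shares measured against a norm.

    For deviation from PR: actual = seat shares, norm = vote shares.
    For volatility: actual = votes in this election, norm = votes in the previous one.
    """
    diffs = [a - n for a, n in zip(actual, norm)]
    d1 = 0.5 * sum(abs(d) for d in diffs)
    d2 = math.sqrt(0.5 * sum(d * d for d in diffs))
    d_inf = max(abs(d) for d in diffs)
    return d1, d2, d_inf


# Hypothetical three-party example (percent): seats 55-35-10, votes 45-40-15
d1, d2, d_inf = deviation_indices([55, 35, 10], [45, 40, 15])
print(d1, round(d2, 2), d_inf)
```

In this made-up example the relations stated in the text, $D_1 \ge D_2$ and $D_1 \ge D_\infty$, both hold; with only two parties deviating, the same function returns three equal values.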
Empirical Patterns of Deviation from PR

One may wonder whether the number of parties that run affects deviation from PR. More parties could lead to more deviation from PR, because more votes might be wasted on small parties that fail to get seats. However, party systems are likely to adjust themselves to electoral systems. High deviation from PR would push less successful parties toward giving up, which would reduce deviation. Thus, an informal ‘law of conservation of D’ (Taagepera and Shugart 1989: 123) might prevail, where political culture would decide which equilibrium level of deviation is considered tolerable.

Figure 5.1 shows the empirical picture when Gallagher’s $D_2$ is graphed against the effective number of electoral parties ($N_V$). The aforementioned 37 electoral systems studied by Lijphart (1994: 160–2) are used, which involve at least 3 national elections. While the number of parties tends to remain stable or change gradually over long periods, deviation from PR can fluctuate wildly from one election to the next (see graph in Taagepera and Shugart 1989: 109). Thus only averages over at least 3 elections are meaningful. Three categories of systems are shown with distinct symbols: single-seat districts with FPTP or Alternative Vote; multi-seat PR; and some of the systems where Lijphart (1994: 128) observes high malapportionment in favor of rural areas: France 1958–81, Spain 1977–89, Iceland 1946–59, and Japan 1947–90.

For FPTP and AV, deviations are mainly around 10 percent but surpass 16 percent in India. $D_2$ tends to increase steeply with increasing number of parties, roughly following a pattern $D_2 = 5.5\%\,(N_V - 1)$. Note that $D_2$ must be 0 when a single party runs ($N_V = 1$), because it is bound to receive all the votes given to any party, as well as all the seats. This conceptual ‘anchor point’ is shown with a triangle in Figure 5.1, and the approximate trend line is made to pass through it. The importance of such anchor points is discussed in Beyond Regression (Taagepera 2008).

Deviation is less than 5 percent for multi-seat PR systems not subject to high malapportionment. In contrast to FPTP/AV, deviation tends to decrease slightly with increasing number of parties. The average pattern is roughly $D_2 = 15\%\,(N_V - 1)/N_V^2$. Why use such a curved approximation rather than a simple straight line? A straight line would predict a deviation of about 5 percent when a single party runs, which does not make any sense. Conceptually, 0 percent is required. The linear fit would also predict a negative deviation at very large numbers of parties. Although such large numbers might never be reached in actual party systems, it is good scientific practice to avoid conceptual inconsistencies even in extreme situations. The curve $D_2 = 15\%\,(N_V - 1)/N_V^2$ is the simplest one that passes through the anchor point at $N_V = 1$ and also remains above 0 even at high $N_V$.

Apart from respecting the anchor point at $N_V = 1$, the curves shown in Figure 5.1 are empirical. $D_2$ tends to range from 1 to 5 percent for stable PR systems, and from 5 to 17 percent for stable FPTP/AV systems. Systems with high malapportionment (France, Spain, Iceland, and Japan) fall outside the FPTP/AV and multi-seat PR patterns observed.

When Loosemore–Hanby’s $D_1$ is used, the pattern of Figure 5.1 becomes more diffuse but is preserved. The curves (not shown) shift upward. $D_1$ tends to range from 2 to 7 percent for stable PR systems, and from 6 to 24 percent for stable FPTP systems.

Lijphart (1999: 169) graphs the effective number of legislative parties ($N_S$) against $D_2$, for 36 most stable democracies. Here $D_2$ is found to decrease with increasing $N_S$ both for PR and for FPTP/AV. It is unclear whether the reversal of the pattern for FPTP/AV is due to the use of $N_S$ or whether the inclusion of many small FPTP countries alters the picture.
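The two empirical trend curves quoted for Figure 5.1 can be tabulated with a short sketch, which makes the role of the anchor point at $N_V = 1$ easy to see. The function names are hypothetical; the coefficients 5.5 and 15 are the rough empirical values given in the text, in percent.

```python
def d2_fptp(n_v):
    """Rough FPTP/AV pattern: D2 (percent) = 5.5 * (N_V - 1)."""
    return 5.5 * (n_v - 1)


def d2_pr(n_v):
    """Rough multi-seat PR pattern: D2 (percent) = 15 * (N_V - 1) / N_V^2."""
    return 15.0 * (n_v - 1) / n_v ** 2


for n_v in (1, 2, 3, 4, 6, 20):
    print(n_v, round(d2_fptp(n_v), 2), round(d2_pr(n_v), 2))
```

Both functions return 0 at $N_V = 1$, and the PR curve stays positive however large $N_V$ becomes, which is exactly what a fitted straight line would fail to guarantee.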
Proportionality Profiles

Indices of deviation from PR characterize the entire electoral system, which is useful for comparative purposes, but they do not tell us how the system affects parties of different sizes. In particular, they do not tell us whether the system advantages large or small parties. In Chapter 3, electoral systems were characterized by their inputs (institutions and laws), but similar laws sometimes produce different outcomes in different countries, or even during different periods in the same country. It would be useful to have a way to characterize electoral systems by their outputs, including their degree of proportionality for individual parties in individual elections. This is what the proportionality profiles are about.

The advantage ratio or seat–vote ratio of a given party in a given election is defined as the ratio of its seat share to its vote share:

$$a = \frac{\%\ \text{seats}}{\%\ \text{votes}}.$$

Proportionality profile is the graph of advantage ratio versus vote share, for each party, in one or several elections (Taagepera and Laakso 1980). Figures 5.2–5.5 present some typical profiles, using data from Mackie and Rose (1991, 1997). Note that all curves that try to express the mean proportionality profile must respect two conceptual anchor points. A party with almost 0 votes will win 0 seats, and hence $a = 0$ when $v = 0$. At the other extreme, a party with all votes will win all the seats, so that $a = 1$ when $v = 1$. In between these two anchor points, smaller parties tend to be underpaid ($a < 1$) and larger parties overpaid ($a > 1$).

The share of votes at which the average pattern crosses the horizontal line $a = 1$ is the break-even point ($b$). Ideally, no party with vote share below $b$ should be overpaid, and no party with vote share above $b$ should be underpaid. An operational definition for $b$ might balance the number of errors in both directions (Taagepera and Shugart 1989: 258). Then the break-even point is the vote share such that there are as many cases with $a > 1$ and $v < b$ (top left quadrant) as there are cases with $a < 1$ and $v > b$ (bottom right quadrant).

Figure 5.2 shows the profiles for 10 elections in New Zealand (1966–93) that preceded the shift to PR, and also for 10 elections in the USA (1976–94) during a similar period. Both countries used FPTP, but in the USA
practically only two parties competed, while in New Zealand third parties kept challenging the big two, especially toward the end of the period.

Figure 5.2. Proportionality profiles for FPTP elections in New Zealand and the USA. (Vertical axis: advantage ratio; horizontal axis: votes in percent. Series: NZ 1966–93, b ~ 33%; USA 1976–94, b ~ 49%. Curves: one opponent, two opponents, and the limit $a = 100\%/v$ that bounds the ‘Forbidden Area’.)

It will be seen later (Chapter 13) that the expected profiles can be calculated for FPTP, depending on the number of opponents the given party faces. Figure 5.2 shows these expected curves in the case of one and two opponents, respectively. These curves assume that the so-called cube law applies (see Chapter 13). With only one opponent, this means $a = v^2/[v^3 + (1 - v)^3]$. With two equal-sized opponents, $a = v^2/[v^3 + 0.25(1 - v)^3]$.

With one opponent, the break-even point comes at 50 percent. The maximum overpayment is 33 percent ($a = 1.33$), and it would occur at 67 percent of the votes. At extremely high vote shares the advantage ratio is bound to decrease, because $a$ is limited to $a = 100\%/v$; otherwise the party would have more than 100 percent of the seats. This conceptual maximum curve for $a$ is shown in Figure 5.2, and the region above it is indicated as conceptually ‘Forbidden Area’. The US data visibly hug the one-opponent curve. The empirical break-even point is around $b = 49\%$.

With two equal-sized opponents, the break-even point is lowered to 33.3 percent. In the face of a split opposition, a party can achieve much higher overpayments, and at lower vote shares. The maximum is $a = 1.61$, and it would occur at a vote share of 52.5 percent. The New Zealand data are mostly scattered between the one-opponent and two-opponent
curves. The empirical break-even point is no higher than 33 percent. Most parties with less than 14 percent votes receive no seats.

Figure 5.3. Proportionality profile for Two-Rounds and PR elections in France. (Vertical axis: advantage ratio; horizontal axis: votes in percent. Separate symbols for M = 1 and PR.)

Figure 5.3 shows the profile for 10 elections held in France (1958–93) after the introduction of Two-Rounds majority-plurality rule in single-seat districts. Vote shares in the first (or only) round are used, as this indicates the voters’ first preferences. A different symbol is used for 1986, when PR was used. No theoretical curves are available. An approximate best-fit curve is sketched in. The outcomes are extremely scattered, as seat allocation depends on nationwide and local deals prior to the second round. Depending on their ability to cut deals, some parties with 20 percent votes won more than 40 percent of the seats, while some others won as little as 2 percent. According to the operational definition of the break-even point, $b \approx 16\%$, but the wide scatter renders the very notion of a break-even point almost pointless.

As for the PR election in 1986, the data points are almost as widely scattered as for the Two-Rounds elections. This scatter reflects the low average district magnitude ($M = 5.8$), the high legal threshold of 5 percent
applied at district level, plus the inability of many parties to learn how to cope with the new rules.

Figure 5.4. Proportionality profile for PR elections in Finland. (Vertical axis: advantage ratio; horizontal axis: votes in percent.)

Figure 5.4 shows the profile of a typical List PR system with numerous parties: Finland 1979–95. Only five elections are shown, so as not to clutter the graph, but the pattern is almost identical for the preceding five elections (1962–75). District magnitudes varied widely around a mean of $M = 14$, and local alliances (apparentement) were allowed. No theoretical curves are shown. The operationally defined break-even point is around 6 percent (much lower than for the FPTP systems), and the maximum overpayment of large parties remains modest. Without alliances and without some extra-large districts, the profile might follow the curve shown in Figure 5.4, with $b = 11\%$. However, large parties are willing to strike alliances with tiny parties ($v < 6\%$), so as to shift seats away from their major competitors. When a tiny party thus wins even a few seats, its advantage ratio can exceed 1.0, the more so, the smaller its nationwide vote share. In Figure 5.4, this region of well-allied tiny parties appears as a distinct cluster at the top left.
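Returning to the expected FPTP curves described for Figure 5.2, the cube-law advantage ratio can be sketched as follows. The function names are hypothetical. The text gives the formula only for one opponent and for two equal-sized opponents; writing the coefficient as $1/n^2$ for $n$ equal-sized opponents is an assumption made here that reproduces both stated cases (1 and 0.25). The peak is located by a simple grid search, so its position is only approximate.

```python
def advantage_ratio_cube_law(v, n_opponents=1):
    """a = v^2 / (v^3 + k * (1 - v)^3), with k = 1 / n_opponents^2 (assumed generalization)."""
    k = 1.0 / n_opponents ** 2
    return v ** 2 / (v ** 3 + k * (1.0 - v) ** 3)


def peak(n_opponents, steps=1000):
    """Grid search for the vote share where the advantage ratio is largest."""
    grid = [i / steps for i in range(1, steps)]
    v_star = max(grid, key=lambda v: advantage_ratio_cube_law(v, n_opponents))
    return v_star, advantage_ratio_cube_law(v_star, n_opponents)


for n in (1, 2):
    v_star, a_max = peak(n)
    print(n, round(v_star, 3), round(a_max, 2))
```

The grid search places the one-opponent peak near 67 percent with $a \approx 1.33$ and the two-opponent peak near 52 percent with $a \approx 1.61$, in line with the values quoted in the text; the break-even points ($a = 1$) fall at 50 and 33.3 percent.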
[Figure 5.5. Proportionality profile for Mixed-Member Proportional elections in Germany. Axes: advantage ratio vs. votes (%).]
Finally, Figure 5.5 shows the profile for eight elections in West Germany (1961–87) after its electoral laws stabilized in the late 1950s but prior to German unification. This is the typical profile for essentially nationwide allocation by PR, subject to a legal threshold of 5 percent nationwide votes. It is a step function: a is 0 for v < 5%, and a is slightly above 1.0 for v > 5% (1.03 on the average). The degree of advantage of the successful parties depends on the vote share (v0) that went to the parties that did not surpass the threshold in that particular election: a = 100%/(100% − v0).

How do advantage ratios of individual parties tie in with the systemwide deviation from PR? Since a = 1 means perfect PR, we should consider the deviations 1 − a. All these deviations can be turned positive by taking either the absolute value |1 − a| or the square (1 − a)². We cannot just take their mean for all parties, large and small, because tiny parties would carry too much weight. Once we weight the deviations 1 − a by the respective vote shares, we are back to D1 in the case of |1 − a|. By weighting the squares (1 − a)² with the squares of the respective vote shares, we are essentially back to D2.

A proportionality profile is a ‘snapshot’ of the given electoral system. It indicates at a glance the average impact of the electoral system on large and small parties and the degree of scatter around this average. Oddities
are quickly spotted, such as the high advantage of some tiny parties in Finland. If we truly understand the functioning of an electoral system, then we should be able to predict its proportionality profile without looking at any data. The theoretical curves in Figure 5.2 hint at some ability to do so. This challenge is addressed toward the end of the book (Chapter 14).
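As a rough numerical illustration of the step-function profile and of the link between advantage ratios and D1, the following Python sketch uses made-up vote shares (an assumption, not data from any election) under an idealized nationwide PR allocation with a 5 percent threshold. It takes D1 as one-half the sum of |si − vi|, and checks that each successful party has a = 100%/(100% − v0) and that the vote-weighted sum of |1 − a| comes out as twice D1.

```python
# Hypothetical vote shares (percent); the last two fall below the threshold.
votes = [45.0, 40.0, 9.0, 4.0, 2.0]
THRESHOLD = 5.0  # legal threshold in percent (assumed for the illustration)

# v0: total vote share of the parties that miss the threshold.
v0 = sum(v for v in votes if v < THRESHOLD)

# Idealized nationwide PR among the parties that pass the threshold.
seats = [100.0 * v / (100.0 - v0) if v >= THRESHOLD else 0.0 for v in votes]

# Advantage ratio a = seat share / vote share, party by party.
ratios = [s / v for s, v in zip(seats, votes)]

# D1 taken as half the sum of absolute seat-vote differences (percent).
d1 = 0.5 * sum(abs(s - v) for s, v in zip(seats, votes))

# Vote-weighted sum of |1 - a| (percent); algebraically equal to sum |v - s|.
weighted = sum(v * abs(1.0 - a) for v, a in zip(votes, ratios))

print("advantage ratios:", [round(a, 3) for a in ratios])
print("100/(100 - v0):  ", round(100.0 / (100.0 - v0), 3))
print("D1 and weighted sum:", round(d1, 3), round(weighted, 3))
```

With these invented shares, D1 equals the wasted vote share v0, because all the seats lost by the excluded parties are gained by the successful ones.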
Volatility and Ticket Splitting When measuring volatility, V1 has continued to rule, and no shift to V2 seems to have been considered. Volatility is usually measured on the basis of votes in consecutive elections. But it can also be measured on the basis of seats instead of votes, or on the basis of nonconsecutive elections (e.g. 20 years apart, so as to evaluate long-term change), etc. Splits and mergers of parties make it sometimes hard to decide whether the voters are volatile rather than parties (Sikk 2005). It is important to distinguish between individual and aggregate volatility. Individual volatility (VI ) can be established only by exit polls or interviews, and the result depends on the quality of voters’ recollections of how they voted several years earlier. Aggregate volatility (VA ) refers to changes in party total vote shares. It can easily be measured whenever party votes data are available, but it hides individual shifts that cancel out. If as many voters switch from party A to party B as vice versa, then the aggregate volatility is 0. In the opposite direction, aggregate volatility could be as high as individual volatility, but no higher. To repeat, for given individual volatility, the corresponding aggregate volatility can be as high as the individual, or as low as 0. In the absence of any other knowledge, our best guess is halfway between these limits: VA ≈ 0.5VI . It is somewhat risky to reverse the direction. But when all we know is aggregate volatility—which is usually the case—our best guess for individual volatility still might be double the aggregate volatility: VI ≈ 2VA . The study of volatility as well as ticket splitting remains outside the scope of this book. I only point out that the methods of measurement are the same as for deviation from PR, with similar dilemmas.
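A minimal sketch of aggregate volatility, assuming that V1 is computed in the same way as D1 (one-half the sum of absolute changes in party vote shares between two elections). The party labels and vote shares are invented, and the doubling rule is only the rough best guess described above.

```python
def aggregate_volatility(prev, curr):
    """Half the sum of absolute changes in party vote shares (percent).

    A party missing from one of the two elections is treated as having 0 there.
    """
    parties = set(prev) | set(curr)
    return 0.5 * sum(abs(curr.get(p, 0.0) - prev.get(p, 0.0)) for p in parties)

# Hypothetical vote shares (percent) in two consecutive elections.
prev = {"A": 40.0, "B": 35.0, "C": 25.0}
curr = {"A": 45.0, "B": 30.0, "C": 20.0, "D": 5.0}

va = aggregate_volatility(prev, curr)

# Best guess for individual volatility when only aggregate data are known.
vi_guess = 2.0 * va

print("aggregate volatility:", va)
print("guessed individual volatility:", vi_guess)
```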
Conclusion

No perfect way to measure deviation from PR has been found. As already observed for the number of parties, a single index cannot express all the details. The chapter appendix shows that two indices offer comparable advantages. Fortunately, in most actual cases they tell pretty much the same story. In the case of FPTP, deviation from PR may tend to increase with increasing number of parties running, while the reverse may be true for PR in large-magnitude districts. Proportionality profiles offer a way to express visually the differential impact of an electoral system on the fortunes of individual parties, large and small.
Appendix to Chapter 5 What follows matters from the methodological viewpoint, although it may not be needed to make use of proportionality profiles and indices of deviation. What are the options for measuring deviation from a norm? How do the various measures interrelate? Are they inherently different, or are they different variants of the same master equation? What are the advantages and limitations of the two main indices (D1 and D2 ), and what are the alternatives that may work better for some purposes? These questions of methodology are similar to those that arose for the number of parties. I will consider these issues from the viewpoint of shares of votes (vi ) and seats (si ) of parties. Volatility and ticket splitting may present different problems.
How to measure deviation from a norm How do we measure deviation from a norm? While 0 deviation is clearly defined as si = vi , maximum deviation is hazier. It could be argued that maximum deviation is reached when parties with votes have no seats, and parties with no votes have all the seats. But should such maximum deviation be construed as ∞, leading to a scale from 0 to ∞, or as 100 percent deviation, leading to a scale from 0 to 1? Here we compare two numbers, si and vi , which leads to possibilities that did not arise when establishing the number of parties, based on si alone. Should we first modify si and vi separately (by squaring them, for instance), before subtracting or dividing them? In view of such multiplicity of options, it should come as no surprise that a recent review (Taagepera and Grofman 2003) found that 19 different indices have been proposed. For choosing among them, Taagepera and Grofman (2003) proposed 12 criteria, which an ideal measure of deviation should satisfy—see Table 5.1. They partly overlap with criteria proposed earlier by Monroe (1994).
Table 5.1. Satisfaction of the Taagepera and Grofman (2003) criteria by three indices of deviation from PR

Criterion                                                                           D1     D2     D∞
Theory-inspired criteria
 1. Is informationally complete (makes use of all si and vi)                       1      1      0
 2. Uses data for all parties uniformly                                            1      1      1
 3. Uses si and vi symmetrically                                                   1      1      1
 4. Varies between 0 and 1 (or 0 and 100 percent)                                  1      1      1
 5. Has value 0 when si = vi for all parties                                       1      1      1
 6. Has value 1 (or 100 percent) when parties with no votes have all the seats     1      0      0
 7. Satisfies Dalton’s principle of transfers (for differences)                    0.5    1      0.5
Practical criteria
 8. Does not include the number of parties                                         1      1      1
 9. Is insensitive to lumping of residuals                                         0      1      0
10. Is simple to compute                                                           1      0.5    1
11. Is insensitive to shift from fractional to per cent shares                     1      1      1
12. Apart from si and vi, does not include any other inputs                        1      1      1
TOTAL SCORE                                                                        10.5   10.5   8.5
7a. Satisfies Dalton’s principle of transfers (for ratios)                         1      0      0.5
ALTERNATE TOTAL SCORE                                                              11     9.5    8.5
Some criteria may look self-evident, but each of them is violated by at least 1 of the 19 measures proposed. Among these criteria, Dalton’s principle needs explanation (see Monroe 1994). When a seat is transferred from a richer component to a poorer one, the index of deviation should increase (the strong form of Dalton’s principle) or, at least, it should not decrease (the weak form). ‘Richer’ here refers to the party that is more overpaid, or at least less underpaid, in terms of seat shares as compared to vote shares. It seems straightforward, but there is a hitch. To evaluate relative over- or underpayment, do we compare the differences (si − vi) or the ratios (si/vi)? It can happen that one party has a higher difference while another has a higher ratio. Which one is ‘richer’ in such a case? The difference criterion is usually accepted. If so, then D1 satisfies the weak form but fails the strong form of Dalton’s principle, while D2 does satisfy the strong form too. This is the outcome shown in the main body of Table 5.1. However, the ratio criterion sounds equally reasonable. By the ratio criterion, it turns out (Taagepera and Grofman 2003) that it is D1 that satisfies the strong form, while D2 fails to satisfy even the weak form!

Among the 19 indices considered by Taagepera and Grofman (2003), Table 5.1 shows only 3. Loosemore–Hanby’s D1 (1971) and Gallagher’s D2 (1991) are included because they reach the highest total scores (10.5 of a possible 12). Thus they look equally adequate but not ideal. However, if we use the ratio criterion for transfers, instead of differences, then D1 increases to 11.0 and widely surpasses D2, which drops to 9.5. Then D2 would share the second and third places with the Gini index (not discussed here), which advances from 8.5 to 9.5. The single largest deviation, D∞, is included because it is part of the same systematics, to be presented next. By the criteria shown, D∞ (total score 8.5) is surpassed by 4 and equaled by 5 other indices, so it is not among the better measures.
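To make the hitch in Dalton’s principle concrete, here is a small Python sketch with two invented parties (the labels X and Y and their shares are assumptions for illustration only). It shows that the difference criterion and the ratio criterion can disagree about which party is ‘richer’.

```python
# Hypothetical (vote share %, seat share %) for two parties.
parties = {"X": (40.0, 46.0), "Y": (5.0, 7.0)}

# Overpayment measured as a difference s - v and as a ratio s / v.
difference = {name: s - v for name, (v, s) in parties.items()}
ratio = {name: s / v for name, (v, s) in parties.items()}

# The 'richer' party under each criterion.
richer_by_difference = max(difference, key=difference.get)
richer_by_ratio = max(ratio, key=ratio.get)

print("differences:", difference)
print("ratios:", ratio)
print("richer by difference:", richer_by_difference)
print("richer by ratio:", richer_by_ratio)
```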
Systematics of deviation from a norm

The simplest measure of deviation from a norm might indeed be the largest deviation alone, meaning D∞. Its disadvantage is that whatever happens to the rest of the parties does not matter. This problem is akin to the one we encountered previously when considering the inverse of the largest share as a measure of the number of parties (N∞). The next simplest approach might be D1. It runs into trouble with Dalton’s principle, at least when the difference criterion is used. It is also felt to overemphasize the importance of numerous small deviations as compared to one large one. This problem is akin to what disqualifies N0 and weakens the entropy-based N1 as measures of the number of parties. Finally, like N2 previously, D2 represents an intermediary approach, taking into account all the deviations but giving more weight to the larger ones. Actually, all three indices, D1, D2, and D∞, are special cases of a master equation, presented here for the first time:

$$D_k = \left[ \frac{1}{2} \sum_i |s_i - v_i|^k \right]^{1/k}.$$
Any k < 1 would emphasize smaller deviations at the expense of larger ones, which is hardly desirable, and k = 1 is on the verge of doing so. At the other extreme, as k tends toward ∞, Dk tends toward the largest deviation alone, neglecting all others. As a compromise, k = 2 leads to D2, which boosts the impact of larger deviations. The smaller deviations eliminate themselves through squaring, much the same way smaller parties do in N2. If there are only two parties, then one’s loss is the other’s gain, so that D1 = D2 = D∞. When more than two parties have nonzero deviations, then the broad pattern is the following. As k increases beyond 1, Dk at first decreases and then starts to increase again. This means that

D1 > D2   [more than two nonzero deviations]

is inevitable, while D∞ may or may not catch up with D2. It depends on whether the square of the largest deviation exceeds the sum of squares of the other deviations. For instance, when D1 = 10%, individual deviations 8-2-5-5 lead to D∞ > D2, while 7-3-5-5 lead to D∞ < D2. Whenever at least two parties are underpaid and two other parties are overpaid in terms of seats, then D1 is bound to be larger than D∞.

Table 5.2. Mean values of deviation indices D1 and D∞, for given mean values of D2

Range of D2 (%)   Mean D2 (%)   D1 (%)   100(D2/100)^0.863   D∞ (%)   126(D2/100)^1.058
0–1.99            1.45          2.65     2.58                1.47     1.42
2.00–4.99         3.23          5.11     5.17                3.33     3.33
5.00–9.99         7.23          10.36    10.36               8.10     7.81
10.00–17          12.28         16.32    16.37               14.09    14.33

To evaluate the empirical relationships among the three indices, we can use again those 37 electoral systems studied by Lijphart (1994: 160–2), which involved at least 3 national elections. Table 5.2 divides them into groups by size of D2. As expected, D1 is always larger than D2. The ratio of D1 to D2 decreases with increasing deviation from PR, while their difference increases. Consider some simple cases. If 8 parties had equal deviations (4 gaining and 4 losing), it can be shown that the ratio D1/D2 would have to be exactly 2. The observed ratio at low deviations (1.83) comes close to this simple equivalent constellation. On the other hand, for 4 parties with equal deviations, we would have D1/D2 = 2^0.5 = 1.41. The observed ratio at the highest deviations (1.30) drops somewhat below this simple equivalent. Table 5.2 shows that the overall pattern is well approximated by

$$D_1 = 100 \left( \frac{D_2}{100} \right)^{0.863} \qquad [D_1 \text{ and } D_2 \text{ in percent}].$$
This means that in about one-half the cases the actual value of D1 is expected to fall below D1 = 100(D2/100)^0.863, and in about one-half the cases it is expected to surpass it. This equation is empirical. It can be expected to express the average relationship between D1 and D2, as long as most small deviations result from many parties competing, while most large deviations result from 3 to 4 parties competing. It can also be seen in Table 5.2 that D∞ tends to be larger than D2, although values smaller than D2 are also observed at low deviations (D2 < 5%). In contrast to D1/D2, the ratio of D∞ to D2 increases with increasing deviation, as does their difference. Table 5.2 shows that the overall pattern is well approximated by D∞ = 126(D2/100)^1.058, with D∞ and D2 in percent. I have not worked out the theoretical justification behind this empirical approximation.
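The master equation and the worked cases in this section can be checked numerically. The Python sketch below reads the master equation as Dk = (½ Σ|si − vi|^k)^(1/k), with k = ∞ giving the largest deviation; the vote and seat shares are invented so as to produce the deviation patterns 8-2-5-5 and 7-3-5-5, and the helper `d1_from_d2` simply evaluates the empirical fit with exponent 0.863.

```python
import math

def d_k(votes, seats, k):
    """Master index: (0.5 * sum |s_i - v_i|**k) ** (1/k); k = math.inf gives the largest deviation."""
    devs = [abs(s - v) for v, s in zip(votes, seats)]
    if math.isinf(k):
        return max(devs)
    return (0.5 * sum(d ** k for d in devs)) ** (1.0 / k)

# Hypothetical shares (percent) chosen to give the deviation patterns in the text.
votes = [30.0, 20.0, 25.0, 25.0]
seats_a = [38.0, 22.0, 20.0, 20.0]   # deviations 8-2-5-5
seats_b = [37.0, 23.0, 20.0, 20.0]   # deviations 7-3-5-5

for label, seats in (("8-2-5-5", seats_a), ("7-3-5-5", seats_b)):
    values = [round(d_k(votes, seats, k), 2) for k in (1, 2, math.inf)]
    print(label, "D1, D2, Dinf =", values)

# Eight parties with equal deviations (4 gaining, 4 losing): D1/D2 should be 2.
v8 = [12.5] * 8
s8 = [13.5] * 4 + [11.5] * 4
ratio_8 = d_k(v8, s8, 1) / d_k(v8, s8, 2)
print("D1/D2 for 8 equal deviations:", ratio_8)

def d1_from_d2(d2):
    """Empirical approximation D1 = 100 * (D2/100)**0.863, both in percent."""
    return 100.0 * (d2 / 100.0) ** 0.863

print("fitted D1 for mean D2 = 7.23:", round(d1_from_d2(7.23), 2))
```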
The advantages and disadvantages of Loosemore–Hanby’s D1 and Gallagher’s D2 The fact that D2 practically always is less than D1 is often considered an advantage. This is based on the feeling that D1 exaggerates the total degree of deviation when numerous parties have tiny deviations. One might offer the examples in Table 5.3.
Table 5.3. An example where Loosemore–Hanby’s deviation index D1 looks too low

                 Case A          Case B
votes (%)        55   45         10 ... 10    10 ... 10
seats (%)        60   40         11 ... 11     9 ... 9
D1               5.0%            5.0%
D2               5.0%            2.24%
D∞               5.0%            1.0%
In case A, a single party falls short by 5 percent and a single other party has a corresponding bonus. We have the impression of a 5 percent deviation, and the indices D1, D2, and even D∞ all confirm this expectation. Now suppose that 10 parties receive 10 percent votes each, but 5 of them win 11 percent of the seats each, while the other 5 win only 9 percent each (case B). These many tiny individual deviations may look like negligible random fluctuations, but they still add up to D1 = 5%, the same value we have in case A. In contrast, D2 distinguishes between cases A and B. It is 5 percent for A but only 2.24 percent for B. D∞ drops even further, to 1 percent. It would seem that D2 can distinguish between a few significant and many random deviations, while D1 cannot.

But I have cheated. Cases A and B differ in more than concentration of deviation. Let us introduce cases C and D, shown in Table 5.4.

Table 5.4. A counterexample where Loosemore–Hanby’s deviation index D1 no longer looks too low

                 Case C          Case D
votes (%)        50   50         15 ... 15    5 ... 5
seats (%)        45   55         16 ... 16    4 ... 4
D1               5.0%            5.0%
D2               5.0%            2.24%
D∞               5.0%            1.0%

First compare case C to previous case A. All indices are exactly the same, but in case A we saw a nonrandom deviation (in contrast to randomness in case B), while case C can be explained only as a large random deviation (cf. the large scatter in advantage ratios for France, in Figure 5.3). How come? In case A, we were influenced by our expectation of the larger party obtaining a bonus and the smaller party being penalized, because this is the usual pattern. This expectation is extraneous to measurement of deviation as such. Next, how would we feel about case D, where 5 parties with 15 percent votes receive a 1 percent bonus, while 5 parties with 5 percent votes have a 1 percent penalty? The indices are exactly the same as in case B above, and all individual deviations are as tiny as in case B, but now the bonuses and penalties all go in the expected directions. Are we still certain that D1 = 5.0% is excessive and D2 = 2.24% is adequate, in case D? There is a systematic shift of 5 percent.

Even if we decide that the lower value of D2 for cases B and D, as compared to cases A and C, is justified, it partly results from an awkward artifact: When many parties are present, the possible scale for D2 stops much short of 100 percent. The hypothetical examples in Table 5.5 clarify this claim.

Table 5.5. Examples where Loosemore–Hanby’s deviation index D1 may look preferable to Gallagher’s D2

                 Case E          Case F             Case G
votes (%)        100    0        50   50    0       50   50   0   0 ... 0
seats (%)          0  100         0    0  100        0    0   1   1 ... 1
D1               100%            100%               100%
D2               100%            86.6%              50.5%
D∞               100%            100%               50%

Suppose an absolute monarch decides to have elections for a 100-seat assembly. Suppose one party receives all the votes, but the monarch still decides to assign all the seats to a royal party that received 0 votes (case E). Clearly, this outcome represents 100 percent deviation from PR, and all 3 indices lead to such a result. Now suppose two parties split the vote, but neither receives any seats (case F). Is the deviation from PR now alleviated, just because the vote is less concentrated? D1 and D∞ respond ‘no’: Deviation remains at 100 percent. But D2 tells us ‘yes’: Deviation is reduced to 86.6 percent. Now suppose the monarch does not assign the seats to a royal party but to 100 individuals, who formally are independents and hence in many ways are equivalent to 100 separate parties (case G). D1 tells us that deviation remains at 100 percent. But D2 says deviation is further reduced to a mere 50.5 percent. D∞ also says that deviation is only 50 percent. Thus, D2 and D∞ hide the blunt fact that none of those parties who received votes got any seats. They would be the autocrat’s preferred indices!

In sum, the very attentiveness to concentration of deviation that makes Gallagher’s D2 look more attractive than Loosemore–Hanby’s D1 in case B leads to gross underestimation of total deviation in case G. No satisfactory method has been devised to correct for this discrepancy, and this is not for lack of trying. Up to now, we have defined zero deviation from PR, and also what would constitute full deviation.
To specify the central part of a scale anchored by these extremes, we should further ask which pattern of deviation would correspond to one-half of maximum deviation. It might be argued that this is a situation where one-half of the votes is converted by perfect PR, while the other half is converted by utmost lack of PR (cases H and I shown in Table 5.6). Once more, the outcome depends on the number of parties. In case H, all 3 indices yield 50 percent. In case I, only D1 does. D2 yields 35.4 percent, while D∞ drops to 25 percent.

Table 5.6. Which pattern corresponds to a half of maximum deviation from PR?

                 Case H            Case I
votes (%)        50   50    0      25   25   25   25    0    0
seats (%)        50    0   50      25   25    0    0   25   25
D1               50%               50%
D2               50%               35.4%
D∞               50%               25%

Recall that both D1 and D2 were found to be superior to any other measures proposed, when it comes to the desiderata listed in Table 5.1, yet neither satisfies all of them. When Dalton’s principle is applied with the ratio criterion, D1 strongly surpasses D2. It also yields reasonable values in most cases above, while the values of D2 are counterintuitive for cases F, G, and I. The only situation where D1 seems inferior is lack of discrimination between cases A and B, but this contrast becomes suspect when comparing cases C and D. Moreover, when it comes to logical model building, D1 has been easier to visualize (cf. Taagepera and Shugart 1989: 109–11). In sum, the shift from D1 to the more complex D2 in the electoral studies of the 1990s may not be as justified as it looked at the time. Maybe D1 is preferable.

Actually, we might need an index that takes into account the directionality of shifts: an index that labels large party bonuses and small party penalties positive, while declaring large party penalties and small party bonuses negative. Consider the index

$$d = \sum_i (s_i - v_i)\left(v_i - \frac{1}{N_V}\right),$$

where NV is the effective number of electoral parties. It would express expected directionality in most cases, but not for cases E–G in Table 5.5. Furthermore, I do not know how to normalize it with respect to conceptual limits.

Going even deeper, the very quest for a single ideal measure to characterize all aspects of deviation from a norm might be as futile as the quest for a single super-index to characterize the number of parties in all its aspects (see Chapter 4). Here, the analogy with central tendency and variation around it does not apply. Something else is needed.
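A tentative numerical check of the directional index suggested above, under two assumptions that should be stated loudly: the index is read as a sum over parties of (si − vi)(vi − 1/NV), and NV is computed as the inverse of the sum of squared fractional vote shares. The sketch applies it to cases A to D from Tables 5.3 and 5.4, using fractional shares.

```python
def effective_number(shares):
    """Effective number of parties, assumed here to be 1 / sum of squared fractional shares."""
    return 1.0 / sum(x * x for x in shares)

def directional_index(votes, seats):
    """Assumed reading of the index: d = sum (s_i - v_i) * (v_i - 1/N_V)."""
    inv_nv = 1.0 / effective_number(votes)
    return sum((s - v) * (v - inv_nv) for v, s in zip(votes, seats))

# Cases A-D from Tables 5.3 and 5.4, as fractional shares.
cases = {
    "A": ([0.55, 0.45], [0.60, 0.40]),
    "B": ([0.10] * 10, [0.11] * 5 + [0.09] * 5),
    "C": ([0.50, 0.50], [0.45, 0.55]),
    "D": ([0.15] * 5 + [0.05] * 5, [0.16] * 5 + [0.04] * 5),
}

d = {name: directional_index(v, s) for name, (v, s) in cases.items()}
for name in sorted(d):
    print(name, round(d[name], 6))
```

On this reading, cases A and D (bonuses going to the larger parties) come out positive, while cases B and C (equal-sized parties) come out at zero, which seems in line with the directionality argument made for those cases.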
6 Openness to Small Parties: The Micro-Mega Rule and the Seat Product
For the practitioner of politics:
- For inclusive representation of even small parties, it helps to have large assemblies, large district magnitudes, and large quotas or large gaps between divisors in formulas for allocating seats.
- Conversely, large parties would prefer small assemblies, magnitudes, and quotas, but only if they are absolutely certain to stay large.
- Worldwide tendency has been to play it safe and move toward more inclusive representation.
- The number of parties increases with increasing ‘Seat Product’: the number of seats in the assembly × the number of seats in the average district.
- When seats are allocated by plurality in multi-seat districts, however, the number of parties decreases with increasing number of seats per district.
Now that the ways to measure the number of parties and deviation from PR have been specified, we can proceed to sense more quantitatively how the three indispensable components of electoral systems affect the openness of the system to small parties. These components are assembly size, district magnitude, and seat allocation formula. The degree of inclusiveness (openness to smaller parties) of a simple electoral system largely depends on the combination of these three institutional inputs. The direction of the impact of these inputs on system openness is compactly worded in Josep Colomer’s ‘micro-mega rule’ (2004b: 3). Going beyond directionality, their quantitative impact is expressed by the ‘seat product’, first explicitly proposed in this book. This chapter also offers some illustrative cases and other evidence that may help in getting a feel for how district magnitude, seat allocation formula, and assembly size affect the number and sizes of parties.

This focus on electoral system openness to small parties involves no value judgment. I do not claim that more inclusive systems are better. All I say is that electoral systems differ in their degrees of openness, and that these differences impact the style and nature of politics. The golden mean lies somewhere between extreme closure and extreme openness, and its location may vary, depending on circumstances. Offering a measure of institutional openness, in the form of seat product, merely helps us to place a given system on the openness scale.
Colomer’s Micro-Mega Rule

Large assemblies, large electoral district magnitudes, and List PR allocation formulas with a large quota or large gaps between successive divisors: all these enhance openings for small parties. Conversely, it would seem to be in the interest of large parties to keep the competition out by having small assemblies, small district magnitudes, and small quotas or small gaps between divisors. While such knowledge has diffusely been around for some time, Colomer (2004b: 3) compresses it in a felicitous ‘micro-mega rule’: The small prefer the large, and the large prefer the small. He extends it to large parties preferring a single allocation formula, while it is in the interest of small parties to have composite systems where different parts have different formulas. Such variety tends to offer further entry points to small parties.

If it is true that ‘the small prefer the large’, then the reverse should be in the interest of the large parties, except for one major reservation. The longer democracies last, the more the generalization above needs qualification. Over time, even large parties experience moments of weakness and learn to appreciate less risky formulas: larger assemblies, district magnitudes, and quotas, plus composite systems. Therefore, the secular trend has been to shift from nationwide winner-take-all toward evermore inclusive electoral systems, as extensively documented by Colomer (2004b: 53–62). So, over time, the micro-mega rule might become: The small prefer the large, and the large hesitate preferring the small.
Having defined in the preceding chapters how to measure the number of parties and the deviation from PR, we can now illustrate quantitatively the way small parties stand to profit from large district magnitudes and from seat allocation formulas that have large quotas or large gaps between divisors. It is more difficult to illustrate the impact of assembly sizes. Detailed formulas and empirical evidence will be given in later chapters. But a reasonably chosen illustration may help in getting a feel for what is involved.
The Effect of District Magnitude and Seat Allocation Formula: An Illustrative Example

Consider again the distribution of percentage votes 48, 25, 13, 9, 4, 1, as used in Table 3.3. That table compared the effects of various electoral formulas when M = 6. Now we gradually increase the district magnitude, from 1 to above 100, and ask how the seats are allocated on the basis of three different formulas. Hare quota (with largest remainders) is the largest quota used in practice, and it should favor the smaller parties, according to Colomer’s rule. Among the widely used divisor rules, d’Hondt has the smallest gaps between divisors and thus should favor the larger parties. Table 3.3 confirmed it, at one specific district magnitude. All other formulas widely used in practice fit in-between Hare-LR and d’Hondt. Among these, Sainte-Laguë divisors will be seen to emerge as arguably the most proportional seat allocation formula.

How do these three formulas affect the seat allocation as magnitude is gradually increased? This is shown in Table 6.1, where percentage vote shares are assumed to be 48+, 25−, 13−, 9−, 4, and 1+. To avoid ties in allocation of seats, a tiny amount is added to (+) or subtracted from (−) the integer percentages. It can be seen that Hare-LR often gives fewer seats to the largest party and more to the small ones, compared to d’Hondt. Sainte-Laguë most often follows the Hare pattern but sometimes agrees with d’Hondt and in rare cases differs from both (M = 20, 51, shown for this purpose, and also M = 25, not shown in the table). Entries in bold script for Sainte-Laguë highlight these deviations from Hare. The Hare-LR result for the party with 4 percent votes at M = 9 and 10 (shown in bold script) highlights the so-called Alabama paradox: This party wins a seat when 9 seats are to be allocated, but loses it again when the total is raised to 10 seats! It wins it back at M = 11 (not shown in
Table 6.1. Effect of district magnitude and seat allocation formula on the distribution of seats in a district where the percentage vote shares are 48+, 25−, 13−, 9−, 4, and 1+

      Hare-LR                          Sainte-Laguë                     d’Hondt
  M |  48+  25−  13−   9−    4   1+ |  48+  25−  13−   9−    4   1+ |  48+  25−  13−   9−    4   1+
  1 |    1    0    0    0    0    0 |    1    0    0    0    0    0 |    1    0    0    0    0    0
  2 |    1    1    0    0    0    0 |    1    1    0    0    0    0 |    1    1    0    0    0    0
  3 |    2    1    0    0    0    0 |    2    1    0    0    0    0 |    2    1    0    0    0    0
  4 |    2    1    1    0    0    0 |    2    1    1    0    0    0 |    3    1    0    0    0    0
  5 |    2    1    1    1    0    0 |    3    1    1    0    0    0 |    3    1    1    0    0    0
  6 |    3    1    1    1    0    0 |    3    1    1    1    0    0 |    3    2    1    0    0    0
  7 |    3    2    1    1    0    0 |    3    2    1    1    0    0 |    4    2    1    0    0    0
  8 |    4    2    1    1    0    0 |    4    2    1    1    0    0 |    5    2    1    0    0    0
  9 |    4    2    1    1    1    0 |    5    2    1    1    0    0 |    5    2    1    1    0    0
 10 |    5    3    1    1    0    0 |    5    3    1    1    0    0 |    5    3    1    1    0    0
 20 |   10    5    2    2    1    0 |    9    5    3    2    1    0 |   11    5    2    2    0    0
 30 |   14    8    4    3    1    0 |   14    8    4    3    1    0 |   15    8    4    2    1    0
 51 |   24   13    7    5    2    0 |   24   13    7    4    2    1 |   25   13    7    4    2    0
 70 |   34   17    9    6    3    1 |   34   17    9    6    3    1 |   35   18    9    6    2    0
100 |   48   25   13    9    4    1 |   48   25   13    9    4    1 |   48   25   13    9    4    1
106 |   51   26   14   10    4    1 |   51   26   14   10    4    1 |   51   27   14    9    4    1
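The allocations in Table 6.1 can be reproduced with a short Python sketch. It is only an illustration: the function names are hypothetical, and the tie-avoiding perturbations (48.02, 24.99, 12.99, 8.99, 4.00, 1.01 percent, stored as integers in hundredths of a percent) are my own assumption, since the table specifies only 'a tiny amount'. Divisor rules are parameterized by the gap between successive divisors (1 for d’Hondt, 2 for Sainte-Laguë), starting from 1.

```python
from fractions import Fraction

# Vote shares in hundredths of a percent; the perturbations are an assumption.
VOTES = [4802, 2499, 1299, 899, 400, 101]

def hare_lr(votes, m):
    """Hare quota with largest remainders, in exact integer arithmetic."""
    total = sum(votes)
    seats = [v * m // total for v in votes]          # full quotas
    remainders = [v * m % total for v in votes]      # leftover fractions (scaled)
    left = m - sum(seats)
    order = sorted(range(len(votes)), key=lambda i: -remainders[i])
    for i in order[:left]:
        seats[i] += 1
    return seats

def divisor_alloc(votes, m, gap):
    """Divisor rule with divisors 1, 1 + gap, 1 + 2*gap, ... (gap=1 d'Hondt, gap=2 Sainte-Lague)."""
    gap = Fraction(gap)
    seats = [0] * len(votes)
    for _ in range(m):
        quotients = [Fraction(v) / (1 + gap * n) for v, n in zip(votes, seats)]
        winner = quotients.index(max(quotients))     # first index wins an exact tie
        seats[winner] += 1
    return seats

for m in (5, 9, 10):
    print(m, hare_lr(VOTES, m), divisor_alloc(VOTES, m, 2), divisor_alloc(VOTES, m, 1))

# Magnitudes up to 15 at which the 4 percent party holds a seat under Hare-LR.
alabama = [m for m in range(1, 16) if hare_lr(VOTES, m)[4] > 0]
print("4% party seated under Hare-LR at M =", alabama)
```

The last line makes the Alabama paradox visible: the list of magnitudes at which the 4 percent party holds a seat has gaps. Results for the 1 percent party at large magnitudes may depend on the exact perturbations chosen.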
Table 6.1), but loses it once more at M = 12. Only from M = 13 on does this party steadily win a seat. Similarly, the party with 1 percent votes wins a seat when 25, 33, or 48–50 seats are allocated but loses it again when the total is raised. Only from M = 53 on does the 1 percent party steadily win a seat. The name of the Alabama paradox originates in the late 1800s, when Alabama did lose a seat in the US House when the House size was increased, even while Alabama’s share in the US population did not decrease. Thus the Alabama paradox is of concern when quota formulas are applied to seat allocation among territorial units. It is a minor inconvenience in practice when applied to allocation of seats to parties. However, it becomes a nuisance when one tries to answer the simple theoretical question: At what magnitude does a party with a given vote share stand a chance to win a seat? For divisor formulas, the answer is clear. For Hare-LR, the answer ‘25 seats’ would here be superficially correct for the tiny party, yet misleading, because the first seat becomes safe only above M = 53.

Table 6.2 shows the magnitudes at which the parties in Table 6.1 would obtain at least one seat under various allocation formulas. Added to the previous three formulas are Imperiali (1, 1.5, 2, 2.5, . . . ) and Danish (1, 4, 7, . . . ) divisors. For divisor formulas, d represents the gap between successive divisors (starting from 1). Sufficiently large parties obtain minimal representation almost regardless of the allocation formula used. The
Table 6.2. Magnitudes at which parties with percentage vote shares 48+, 25−, 13−, 9−, 4, and 1+ would win their first seat under various allocation formulas

Vote shares   Danish (d = 3)   Hare-LR              Sainte-Laguë (d = 2)   d’Hondt (d = 1)   Imperiali (d = 0.5)
48+           1                1                    1                      1                 1
25−           2                2                    2                      2                 3
13−           3                4                    4                      5                 9
9−            5                5                    6                      9                 15
4             9                9, 11, 13            13                     24                43
1+            33               25, 33, 48–50, 53    49                     96                189
100/d         33.3             -                    50                     100               200
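For divisor rules, the first-seat magnitudes in Table 6.2 can be recovered by counting how many quotients of the other parties exceed the party's own vote total. The sketch below does this for the 4 percent party; as before, the perturbed vote shares (in hundredths of a percent) and the function name are assumptions for illustration, and the gap must be positive. The last column printed is the magnitude at which v = 1/(Md) would hold exactly for v = 4 percent, a limit that the text describes as applying to tiny parties.

```python
from fractions import Fraction

# Vote shares in hundredths of a percent; the tiny perturbations are an assumption.
VOTES = [4802, 2499, 1299, 899, 400, 101]

def first_seat_magnitude(votes, party, gap):
    """Smallest M at which `party` wins a seat under divisors 1, 1 + gap, 1 + 2*gap, ...

    A divisor rule awards the M seats to the M largest quotients, so the party's
    first seat arrives once all larger quotients of its competitors are served.
    """
    gap = Fraction(gap)
    own = Fraction(votes[party])
    ahead = 0
    for i, v in enumerate(votes):
        if i == party:
            continue
        n = 0
        while Fraction(v) / (1 + gap * n) > own:   # competitor quotient still larger
            n += 1
        ahead += n
    return ahead + 1

FOUR_PERCENT = 4  # index of the party with 4 percent of the votes
for name, gap in [("Danish", 3), ("Sainte-Lague", 2), ("d'Hondt", 1), ("Imperiali", 0.5)]:
    m = first_seat_magnitude(VOTES, FOUR_PERCENT, gap)
    limit = 100 / (4.0 * gap)   # M at which v = 1/(M d) for v = 4 percent
    print(name, m, round(limit, 1))
```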
success of the small parties, however, depends very much on the allocation formula. The tiny 1 percent party can win its first seat only when district magnitude approaches M = 100/d. This means a magnitude close to 33 for Danish divisors but close to 200 for Imperiali. Indeed, it can be shown that tiny parties win their first seat when their vote share approaches v = 1/(Md), meaning simple quota 1/M divided by the divisor gap. With d’Hondt (d = 1), small parties need almost a full quota to win a seat. With Sainte-Laguë (d = 2), they need only one-half of the simple quota, while with Imperiali (d = 0.5), they need double the simple quota. When the Alabama paradox cases are overlooked, Hare-LR is close to Sainte-Laguë. The effects of various other quotas remain to be studied.

Figure 6.1 makes the results in Table 6.2 more explicit, graphing the percentage of votes at which the first seat tends to be won against district magnitude. Both variables are on logarithmic scales. The lines shown are theoretical (from Chapter 15). The quasi-data points are from Table 6.2, omitting the Alabama paradox cases. The striking feature is that the Sainte-Laguë pattern (shown as round symbols) is a straight line at all district magnitudes. This means that parties always tend to win their first seat when their vote share reaches one-half of the simple quota. The same is true for Hare-LR (small crosses), except for the Alabama paradox cases. In contrast, allocations with other formulas start out at the Sainte-Laguë line but then gradually shift, at higher magnitudes: upwards for d < 2 (slanted crosses for d’Hondt, squares for Imperiali) and downwards for d > 2 (triangles for Danish).

From this viewpoint, Sainte-Laguë appears central among all divisor rules. So might Hare-LR be among all quota rules, when overlooking the Alabama paradox. On different mathematical grounds, Balinski and
[Figure 6.1 plot. Vertical axis: votes (%) at which the first seat is won, logarithmic, 1 to 100. Horizontal axis: district magnitude, logarithmic, 1 to 1,000. Lines labeled d’Hondt limit, Imperiali limit, Danish limit, and Sainte-Laguë limit.]
Figure 6.1. Vote shares at which parties tend to win their first seat vs. district magnitude, for various PR formulas
Young (2001) also consider Sainte-Laguë the most proportional formula. So do Schuster et al. (2003), both on theoretical and on empirical grounds, along with Hare-LR.

Table 6.3 shows the deviation from PR (Gallagher’s measure, D2) and the effective number of parties for the seat allocations in Table 6.1. As district magnitude increases, deviation from PR at first decreases rapidly, with occasional minor reversals. It later decreases slowly, as D2 approaches the conceptual limit of D2 = 0. Hare-LR always produces deviations at least as low as d’Hondt, and often appreciably lower. Sainte-Laguë most often agrees with Hare but occasionally with d’Hondt. The vote shares used are bound to yield perfect lack of deviation (D2 = 0.00) at M = 100, but D2 increases again at higher magnitudes, though only slightly. Perfect lack of deviation is seen to be extremely rare. The effective number of legislative parties is low at low magnitudes. When Hare-LR or Sainte-Laguë is used, it quickly catches up with the
Table 6.3. Effect of district magnitude and seat allocation formula on deviation from PR and on the effective number of parties, in a district where the percentage vote shares are 48+, 25−, 13−, 9−, 4, and 1+

        Deviation from PR (D2)                 Effective number (NS)
  M     Hare-LR   Sainte-Laguë   d’Hondt       Hare-LR   Sainte-Laguë   d’Hondt
  1      42.4        42.4         42.4          1.00        1.00         1.00
  2      21.2        21.2         21.2          2.00        2.00         2.00
  3      18.5        18.5         18.5          1.80        1.80         1.80
  4      11.0        11.0         22.3          2.67        2.67         1.60
  5      11.7        12.6         12.6          3.57        2.27         2.27
  6       9.0         9.0          9.6          3.00        3.00         2.57
  7       6.5         6.5         11.1          3.27        3.27         2.33
  8       4.1         4.1         12.4          2.91        2.91         2.13
  9       6.3         6.6          6.6          3.52        2.61         2.61
 10       5.3         5.3          5.3          2.78        2.78         2.78
 20       1.3         2.8          6.2          3.16        3.33         2.60
 30       2.6         2.6          2.6          3.15        3.15         2.90
 51       1.3         1.4          1.4          3.16        3.19         3.01
 70       1.3         1.3          1.9          3.12        3.12         2.93
100       0.0         0.0          0.0          3.13        3.13         3.13
106       0.5         0.5          0.5          3.13        3.13         3.10
Note: For deviation, disagreements between Hare-LR and Sainte-Laguë are shown in bold script. For effective number, values higher than the one based on votes (N V = 3.13) are shown in bold.
effective number of electoral parties (3.13) and at times even surpasses it. Such overbeats are indicated in bold script. Large overbeats are fewer for Sainte-Laguë, although they exceed Hare at M = 20 and 51. At higher magnitudes, Hare and Sainte-Laguë yield NS = NV , on the average. This equality suggests that these rules favor large and small parties equally. With d’Hondt, in contrast, NS approaches NV but never surpasses it. The remaining gap indicates that d’Hondt maintains a large party advantage, however small, even at large M.
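The two indicators reported in Table 6.3 can be recomputed from any pair of vote and seat distributions. The following Python sketch is only an illustration, not the procedure used to build the table. It assumes Gallagher's least-squares form D2 = sqrt(0.5 · Σ(s_i − v_i)²) and the effective number N = 1/Σp_i², and it uses the rounded vote shares 48, 25, 13, 9, 4, and 1 percent. The function names are hypothetical.

```python
import math

def effective_number(shares):
    """Effective number of parties, N = 1 / sum(p_i^2), for any list of sizes."""
    total = sum(shares)
    return 1.0 / sum((s / total) ** 2 for s in shares)

def gallagher_d2(votes, seats):
    """Deviation from PR in percent: sqrt(0.5 * sum((s_i - v_i)^2))."""
    v_total, s_total = sum(votes), sum(seats)
    squared_gaps = sum(
        (100.0 * s / s_total - 100.0 * v / v_total) ** 2
        for v, s in zip(votes, seats)
    )
    return math.sqrt(0.5 * squared_gaps)

votes = [48, 25, 13, 9, 4, 1]      # rounded percentage vote shares (assumption)
seats_m1 = [1, 0, 0, 0, 0, 0]      # M = 1: the largest party takes the seat
seats_m2 = [1, 1, 0, 0, 0, 0]      # M = 2: one seat each for the two largest

n_v = effective_number(votes)              # effective number of electoral parties
d2_m1 = gallagher_d2(votes, seats_m1)      # deviation from PR at M = 1
d2_m2 = gallagher_d2(votes, seats_m2)      # deviation from PR at M = 2
n_s_m2 = effective_number(seats_m2)        # effective number of legislative parties
```

Under these assumptions the values round to NV = 3.13, D2 = 42.4 at M = 1, and D2 = 21.2 with NS = 2.00 at M = 2, which agrees with the first two rows of the table.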
The Effect of District Magnitude and Seat Allocation Formula: Generalization

The conclusions drawn from the illustrative example could be derived through more formal mathematical procedures. Doing so would gain in respectability but lose in readability and intuitive understanding. So I will leave it at that, at least until Chapter 15. The seat allocation formula matters for the fortunes of parties. At the same district magnitude, the smallest parties need almost twice the votes to land their first seat with d’Hondt, as compared to Sainte-Laguë. Yet
district magnitude matters much more, as long as one sticks to the usual PR formulas. (Shifting to plurality allocation rule in multi-seat districts reverses the effect!) Altering the magnitude by a factor of 2 (meaning multiplying or dividing by 2) affects small parties more than any shift among the usual PR formulas could—and even larger changes in M are available, given a possible range from M = 1 to M = 100 and even higher. At M = 5, a party with 5 percent votes has little chance of winning a seat even under Sainte-Laguë. At M = 20, it is bound to win a seat even under d’Hondt. In this qualified sense, district magnitude is the decisive factor, as long as all seats are allocated within districts, by some PR allocation rule. As pointed out in Figure 2.1, allocation rule can reverse the effect of magnitude when allocation by plurality is included. It is obviously possible to overrule the effect of district magnitude by district level legal thresholds or apparentement (which is relatively rare), or when the seat allocation process is extended beyond the basic districts (which is frequent). The illustrative example used in the previous section has a major weakness. It tacitly presumes that vote distribution is unaffected by district magnitude. This is not so. The distribution 48-25-13-9-4-1 is realistic for a district of magnitude around 10, embedded among similar neighboring districts where the two smallest parties might be doing better than in the given district. If the actual district magnitude is 1 or 2, however, only 2 or 3 parties or blocs are likely to survive. On the other hand, if the actual M is around 100 (with no legal threshold), then even more parties than 6 are likely to try their luck, and the 48 percent party would be subject to heavy centrifugal forces. Such interaction between district magnitude and the size distribution of parties needs to be put on a less impressionistic foundation, by empirical measurement and predictive model construction. 
This will be done in the central part of this book.
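The allocations discussed in this section can be reproduced with a generic divisor routine. The Python sketch below is illustrative only. It assumes that successive divisors are 1, 1 + d, 1 + 2d, and so on, where d is the divisor gap (d = 1 for d’Hondt, d = 2 for Sainte-Laguë, d = 0 for multi-seat plurality), that ties go to the party listed first, and that the vote shares are the rounded 48, 25, 13, 9, 4, and 1 percent. The function name `allocate_by_divisors` is hypothetical.

```python
def allocate_by_divisors(votes, magnitude, gap):
    """Allocate `magnitude` seats using divisors 1, 1 + gap, 1 + 2*gap, ..."""
    seats = [0] * len(votes)
    for _ in range(magnitude):
        # Current quotient of each party: votes / (1 + seats_already_won * gap).
        quotients = [v / (1 + s * gap) for v, s in zip(votes, seats)]
        # max() returns the first maximum, so ties go to the party listed first.
        winner = max(range(len(votes)), key=lambda i: quotients[i])
        seats[winner] += 1
    return seats

votes = [48, 25, 13, 9, 4, 1]   # rounded percentage vote shares (assumption)

dhondt_m4 = allocate_by_divisors(votes, 4, gap=1)         # d'Hondt
sainte_lague_m4 = allocate_by_divisors(votes, 4, gap=2)   # Sainte-Laguë
plurality_m5 = allocate_by_divisors(votes, 5, gap=0)      # multi-seat plurality
```

With these inputs, d’Hondt at M = 4 gives three seats to the largest party and one to the second, while Sainte-Laguë gives two, one, and one seats to the three largest. With a zero divisor gap the largest party takes every seat, which is how multi-seat plurality reverses the effect of magnitude.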
The Effect of Assembly Size

The effect of district magnitude and seat allocation formula could be illustrated with a hypothetical vote distribution. This is not possible with assembly size. We could look for actual cases where everything but assembly size is the same, or carry out some quasi-experiments on what would happen if we reduced an actual assembly size. Neither approach is conclusive, but both help.
Grofman and Handley (1989) considered the percentage of blacks in the Houses and Senates of the US states. The electorates and electoral rules for the two bodies are basically the same, the only difference being that the House usually has more members. The percentage of blacks is found to be higher in the Houses than in the respective Senates. While this result does not refer to parties, it suggests that if an ethnic or ideological minority decides to form a party, it would have more success in larger assemblies.

The following quasi-experiment could be carried out. Take a country with single-seat districts and plurality rule, where the vote distribution in each district is known. Fuse neighboring districts two by two, apply plurality rule, and see how many seats parties would win in the combined districts. Fuse the resulting districts once more, and so on. Such a quasi-experiment was carried out for the British elections of 1983 (Kaskla and Taagepera 1988; cf. Taagepera and Shugart 1989: 173–4). Reducing the number of districts reduces the effective number of parties in the House of Commons as follows:

Votes        2.93 effective parties
633 seats    1.98 (actual assembly)
316 seats    1.86
79 seats     1.70
11 seats     1.53
1 seat       1.00
This quasi-experiment presumes that reducing assembly size would not appreciably alter the voting pattern. In reality, it would. Voters would give up on smaller parties, reducing the effective number of electoral parties, and thus possibly depressing the number of legislative parties even further. The figure for 11 seats compares well with the pattern in some tiny island countries such as St Vincent & Grenadines, where the parliament in 1974–84 had 13 seats and NS ranged from 1.35 to 1.74 even while the identity of the dominant party changed (calculations based on data in Nohlen 1993: 701). When the assembly is reduced to one seat, NS is bound to fall to 1.00. Taking this anchor point into account supports the qualitative idea that reducing assembly size must tend to reduce the number of parties. The two studies presented do not amount to systematic empirical evidence, but they may help to give a feeling for the trend. A predictive model will be presented and tested later on.
The Seat Product of Simple Electoral Systems

District magnitude, seat allocation formula, and assembly size can be combined into a ‘seat product’ that characterizes the degree of openness or inclusiveness of a simple electoral system. Inspired by Colomer’s micro-mega rule (2004b: 3), it is first made explicit in this book. Here, I run ahead of the evidence to be presented later on. However, it may help to know in which direction we are proceeding. In some ways, the seat product represents the comprehensive outcome of my trying to make sense of electoral systems, over 40 years.

The single number that best characterizes any simple electoral system is the product of the number of the seats in the assembly (S) and the number of seats in the average district (mean district magnitude, M), the latter modulated by the seat allocation formula through a ‘formula exponent’ (F):

$$\text{Seat Product} = M^{F} S.$$

For the usual PR formulas, F is around +1. It is slightly smaller for d’Hondt than for Sainte-Laguë or Hare-LR, by an amount to be specified later. For plurality formula, F = −1. For semi-proportional formulas, intermediary values of F would apply. In the case of FPTP, M = 1, and thus the value assigned to F does not matter. Using F = −1 would stress the fact that FPTP is the most proportional outcome to which plurality formula could lead, while using F = +1 would remind us that FPTP is the least proportional outcome to which a PR formula could lead. When overlooking multi-seat plurality, the seat product basically amounts simply to MS.

The seat product incorporates all three indispensable components of a simple electoral system. What range do its numerical values cover, and what do they mean? Some typical values of the seat product are shown in Table 6.4, along with its fourth root. The latter represents the number of parties expected to win at least one seat in the assembly (N0), as will be proposed and tested later, in Chapter 8.

Table 6.4. The seat product and the resulting expected number of seat-winning parties in the assembly

                                                 $M^{F}S$    N0 = $(M^{F}S)^{1/4}$
Any country with nationwide plurality                 1         1
If UK had plurality in M = 10 districts              65         2.8
Actual UK, FPTP in 650 districts                    650         5.0
Malta, 65 seats by PR in M = 5 districts            325         4.2
Finland, 200 seats by PR in M = 14 districts      2,800         7.3
Netherlands, 150 seats, nationwide PR            22,500        12.2

It has been noted earlier that the most and the least proportional outcomes both occur when there is a single nationwide district: M = S (cf. Figure 2.1). In this case, the plurality formula yields a seat product of $S^{-1}S = 1$, expressing the fact that only one party can obtain seats. At the other extreme, M = S with a PR formula leads to a seat product $S^{2}$, which becomes quite large in the case of large assemblies. This large number suggests that many parties can obtain representation, and Chapter 8 will specify how many. The seat product implies that a 625-seat assembly elected by FPTP is as inclusive (open to small parties) as a 25-seat assembly elected by nationwide PR, given that the seat product is 625 in both cases. Such a claim may look debatable at first glance, but it will be substantiated in later chapters.

The exponent F in the seat product $M^{F}S$ is not merely a symbolic formulation that can take only the values +1 for PR and −1 for plurality. There are ways to specify its values for different proportional and semi-proportional seat allocation formulas. These specifications have not yet been worked out in detail, but the general method for doing so is already apparent. The tentative results are shown in Table 6.5. See chapter appendix for the reasoning behind it. At the level of a single district, S = M, and hence the seat product is $M^{F+1}$. This value is also shown in the table, and some implications are presented in the chapter appendix. By this count, the use of d’Hondt would reduce the seat product appreciably, compared to Sainte-Laguë or Hare-LR. For Finland, it would drop from 2,800 to 1,030, reducing the expected number of seat-winning parties from 7.3 to 5.7.
For the Netherlands, the cut is even more drastic, from 22,500 to 3,350, reducing the seat-winning parties from 12.2 to 7.6, and the 0.67 percent legal threshold would reduce it even further. However, it is still uncertain whether the specific PR formula really has that much impact. Hence, unless otherwise indicated, I will use F = 1 for the d’Hondt systems too.

Table 6.5. Tentative values of allocation formula exponent F in the seat product, for various PR formulas

                            d        F       F + 1
Plurality                   0      −1.00     0.00
Imperiali divisors          0.5     0.32     1.32
D’Hondt                     1       0.62     1.62
Sainte-Laguë, Hare-LR       2       1.00     2.00
Danish divisors             3       1.26     2.26
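The arithmetic behind the seat product is easy to check numerically. The Python sketch below is a minimal illustration under the definitions given in this section (seat product equal to M raised to the power F, times S, and N0 equal to its fourth root). The function names and the dictionary of cases are hypothetical, and the inputs are the magnitudes and assembly sizes quoted in Table 6.4.

```python
def seat_product(magnitude, assembly_size, formula_exponent=1.0):
    """Seat product M^F * S for a simple electoral system."""
    return magnitude ** formula_exponent * assembly_size

def expected_seat_winning_parties(magnitude, assembly_size, formula_exponent=1.0):
    """Expected number of seat-winning parties, N0 = (M^F * S)^(1/4)."""
    return seat_product(magnitude, assembly_size, formula_exponent) ** 0.25

# (M, S, F) for some of the cases listed in Table 6.4.
cases = {
    "UK, FPTP in 650 districts": (1, 650, 1.0),
    "UK, hypothetical plurality in M = 10 districts": (10, 650, -1.0),
    "Finland, PR in M = 14 districts": (14, 200, 1.0),
    "Netherlands, nationwide PR": (150, 150, 1.0),
}
n0 = {name: expected_seat_winning_parties(*args) for name, args in cases.items()}

# Tentative d'Hondt exponent F = 0.62 from Table 6.5, applied to Finland.
finland_dhondt = expected_seat_winning_parties(14, 200, 0.62)
```

Rounded to one decimal, these calls give 5.0, 2.8, 7.3, and 12.2 for the four cases, and 5.7 for Finland with the tentative d’Hondt exponent.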
Conclusion

The examples presented help visualize how large assemblies, large district magnitudes, and large gaps between divisors (or large quotas) help the small parties, and how small assemblies, magnitudes, and divisor gaps or quotas help the large parties. This tendency is expressed qualitatively in Colomer’s micro-mega rule and quantitatively in the dependence of the number of parties on the seat product. Inclusiveness (openness to small parties) of an electoral system is reflected in the number of parties that make it to the assembly. Whichever way one measures this number, it tends to increase with increasing seat product $M^{F}S$. This formula suggests that district magnitude and assembly size enter basically on an equal basis, but the seat allocation formula enters as a modulator of district magnitude that can reverse the direction of its impact. The small (parties) prefer the large (assemblies, district magnitudes, divisor gaps, and quotas) because all these contribute to increase $M^{F}S$.
Appendix to Chapter 6

This appendix explains how I estimated the values of the formula exponent (F), as offered in Table 6.5. It also asks whether the square root of the seat product is the basic building block for evaluating the properties of electoral systems.
How to estimate the formula exponent in the seat product

Which seat allocation formula should be taken to represent perfect proportionality (F = 1.00)? As ‘remainderless quota’, d’Hondt has considerable appeal, but it keeps the effective number of legislative parties below the effective number of electoral parties even at large magnitudes. Thus, it is not quite neutral regarding large and small parties. Deviation from PR offers limited guidance, because it has been pointed out (e.g. Cox and Shugart 1991; Gallagher 1991) that a measure of deviation can be constructed on the basis of any seat allocation formula so as to make that particular formula look the most proportional. The effective number of parties seems more neutral in this respect. When one takes the effective number based on votes as the standard, Table 6.3 shows that d’Hondt restricts the number
of parties at all district magnitudes, while Hare-LR and Sainte-Laguë fluctuate fairly evenly between under- and over-representation. Moreover, these two formulas also look central in Figure 6.1. Thus the choice seems to be between Hare-LR and Sainte-Laguë, as the benchmark (F = 1.00) for comparing the ‘PR-ness’ of all seat allocation formulas. Hare-LR is simple to visualize, but the Alabama paradox makes it inconvenient. Therefore, it would seem that Sainte-Laguë is the most suitable benchmark. Recall that this agrees with the theoretical arguments by Balinski and Young (2001) and Schuster et al. (2003) who consider Sainte-Laguë the most proportional formula.

There is another reason to focus on divisor formulas. As pointed out in Table 3.4, divisors offer a link to plurality (divisor gap d = 0) at the one extreme and to ‘every party gets a seat’ (huge divisor gaps) at the other. Quotas do not offer such a wide range without risking overallocation. The usefulness of this continuous variable, the divisor gap, emerges as one graphs the effective number of parties versus district magnitude and considers what happens, depending on the allocation formula used. This is what Kaskla and Taagepera (1988) did in a quasi-experiment that grouped the British single-seat electoral districts two by two, four by four, and so on. Treating them as multi-seat districts, they applied various seat allocation formulas to the resulting vote shares based on the actual votes in 1983. The resulting curves, N versus M, are shown in Taagepera and Shugart (1989: 265).

The broad pattern is as follows. We have two benchmarks for the effective number of parties. One is NV, which depends on the vote constellation. The other is NS at M = 1, designated here as NFPTP, which is lower than NV and tends to increase with increasing assembly size. This dependence on S is addressed in Chapter 9.
Here we ask what happens to NS at constant assembly size, as the district magnitude is increased. With multi-seat plurality (d = 0), NS decreases below NFPTP, until it drops to 1.00 at M = S. With d’Hondt (d = 1), it increases above NFPTP, slowly approaching NV but never consistently reaching it. With Sainte-Laguë (d = 2), it quickly reaches NV and by M = 10 exceeds it, before slowly approaching NV again, down from higher values. The extent of this overbeat increases, as the divisor gap is increased to Danish divisors (d = 3) and beyond. But what would happen at tiny non-zero divisors such as d = 0.2? At low M, they behave like plurality, lowering NS below NFPTP, but later the pattern reverses itself: NS surpasses NFPTP and approaches NV, thus behaving more like a PR formula. Around M = 10, NS is back to the level of NFPTP, as if the effect of the d = 0.2 formula were not affected by district magnitude. In this light, if we take Sainte-Laguë to correspond to pure PR (F = 1.00) in the seat product $M^{F}S$, then the Danish divisors should be assigned a value of F somewhat higher than 1.00 so as to express their tendency to favor small parties. D’Hondt should have a value of F lower than 1.00, and Imperiali an even lower value. Multi-seat plurality calls for F = −1 so that the seat product is reduced to
1 at M = S. In between, a divisor close to 0.2 would correspond to F = 0. This assignment depends on taking M = 10 as the magnitude where the divisor effect is required to be nil, compared to FPTP. The precise value of d may depend on assembly size. Here we have a first approximation. In sum, we must have F = −1 for d = 0 (plurality) and F = +1 for d = 2 (Sainte-Laguë). For d = 1 (d’Hondt) we would expect a value of F moderately below 1, and for d = 3 (Danish) moderately above 1. These demands are roughly satisfied, if we set the relationship between F and d to

$$F = 2\left(\frac{d}{2}\right)^{k} - 1.$$

The parameter k depends on the value of d at which we want F to assume the value 0. If d = 0.2 for F = 0, then the equation above leads to k = log 0.5/log 0.1 = 0.30. Hence the tentative connection between divisor gap and allocation formula exponent F might be around

$$F = 2\left(\frac{d}{2}\right)^{0.3} - 1.$$
This is only a preliminary estimate. No constant divisor gap exists for which proportionality would be independent of magnitude. One can only balance decreasing proportionality at low magnitudes and increasing proportionality at high magnitudes. Empirical usefulness of the formula above remains to be tested, and I have as yet no theoretical justification. Empirically, it would lead to the values of allocation formula exponent F listed in Table 6.5. This table focuses on divisor formulas, but the general approach used should apply to quota formulas with various quota values, taking Hare-LR to correspond to F = 1.00.
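The tentative relation between divisor gap and formula exponent can be evaluated directly. The Python sketch below is an illustration that assumes the form F = 2(d/2)^k − 1, with k fixed by the requirement that F = 0 at d = 0.2 and then rounded to 0.3, as in the text. The function names are hypothetical.

```python
import math

def exponent_k(zero_gap=0.2):
    """Solve 2 * (d/2)^k - 1 = 0 for k at the divisor gap where F should be 0."""
    return math.log(0.5) / math.log(zero_gap / 2)

def formula_exponent(gap, k=0.3):
    """Tentative formula exponent F = 2 * (d/2)^k - 1 for divisor gap d."""
    return 2 * (gap / 2) ** k - 1

k = exponent_k(0.2)   # log 0.5 / log 0.1, about 0.30

# Divisor gaps for plurality, Imperiali, d'Hondt, Sainte-Laguë, and Danish divisors.
table_6_5 = {d: round(formula_exponent(d), 2) for d in (0, 0.5, 1, 2, 3)}
```

With k = 0.3 the rounded results are −1.00, 0.32, 0.62, 1.00, and 1.26, the values of F listed in Table 6.5.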
Implication of seat product for a single district

When considering a single district of M seats, one would have to set the total number of seats also at M, so that the seat product becomes $M^{F}M = M^{F+1}$. With plurality formula (F = −1), this equation leads to $M^{0} = 1$, as it should, as one party wins all the seats. With a perfect PR formula (F = 1), we get $M^{2}$. The values of F in Table 6.5 would imply the following. The same degree of proportionality that prevails with Sainte-Laguë at magnitude M should prevail already at a lower magnitude $M^{2/2.26} = M^{0.88}$ with Danish divisors, at a higher magnitude $M^{2/1.62} = M^{1.23}$ with d’Hondt, and at a still higher magnitude $M^{2/1.32} = M^{1.52}$ with Imperiali. (With plurality, this magnitude would be pushed to infinity.) A quick comparison with Tables 6.1 to 6.3 shows both agreement and disagreement, depending on the value of M and also on the criteria of proportionality used (deviation from PR, either D1 or D2, or closeness of the seat-based effective number to the vote-based).
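The equivalent magnitudes quoted here follow from equating single-district seat products. A formula with exponent F at magnitude M′ matches Sainte-Laguë at magnitude M when M′ raised to the power F + 1 equals M squared, so that M′ = M^(2/(F + 1)). The Python sketch below illustrates this reading of the argument. The function name and the choice of M = 10 as an example are assumptions.

```python
def equivalent_magnitude(magnitude, formula_exponent):
    """Magnitude M' with M'^(F+1) = M^2, that is M' = M^(2 / (F + 1))."""
    return magnitude ** (2 / (formula_exponent + 1))

# Tentative exponents F from Table 6.5.
tentative_f = {"Danish": 1.26, "Sainte-Laguë": 1.00, "d'Hondt": 0.62, "Imperiali": 0.32}

# Exponent applied to M for each formula, rounded to two decimals.
scaling = {name: round(2 / (f + 1), 2) for name, f in tentative_f.items()}

# Magnitude expected to match Sainte-Laguë at M = 10 (illustrative choice of M).
at_m10 = {name: equivalent_magnitude(10, f) for name, f in tentative_f.items()}
```

The rounded scaling exponents come out as 0.88, 1.00, 1.23, and 1.52. At M = 10 the equivalent magnitude is below 10 for Danish divisors and above 10 for d’Hondt and Imperiali.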
The aggregate of a simple electoral system

In some contexts, the square root of the seat product appears as the underlying basic building block for expressing the impact of an electoral system. I will designate it as the ‘aggregate’ (A) of the electoral system:

$$A = (M^{F}S)^{1/2}.$$

Subsequent chapters will present the logical arguments and observational tests for relationships between the electoral system aggregate and the various measures of the number of parties, as highlighted in Chapter 4. With PR or FPTP (F = 1), the following applies, on the average:

N0 = the number of seat-winning parties = $A^{1/2}$ (see Chapter 8).
N2 = the effective number of legislative parties = $A^{1/3}$ (see Chapter 9).
N∞ = inverse of the largest seat share = 1/s1 = $A^{1/4}$ (see Chapter 8).

These are, respectively, the square, cube, and fourth roots of the electoral system aggregate. Consequently, the average relationships among these measures of the number of parties form a remarkable series with exponents 1, 2, 3, and 4:

$$N_{\infty}^{4} = N_{2}^{3} = N_{0}^{2} = A^{1}.$$
What it implies is that, never mind how one tries to measure the number of parties, it still boils down to the same aggregate of assembly size and district magnitude, the latter critically modulated by the seat allocation formula.

What does A represent? Allocation by plurality in a single nationwide district leads to A = 1, expressing the fact that only one party can win seats in the assembly. At the other extreme, M = S with PR allocation yields A = S. It expresses the fact that up to S parties can conceivably win seats in the assembly, even while such an outcome is highly unlikely. Thus, at both extremes, the electoral system aggregate equals the maximum number of parties that conceivably could win a seat.

Unfortunately, such a simple interpretation of the aggregate no longer applies when the country is divided into several districts (M < S). Take for instance the case M = 1, where PR and plurality rules both yield $A = S^{1/2}$. The number of parties winning seats could still be as high as S, if a different party (or independent) wins in each district. As long as a PR allocation formula is applied, the conceivable maximum number of seat-winning parties remains the same, S, regardless of whether district magnitude is S or 1 or anything in-between. We may sense that the likely number of seat-winning parties will go down as district magnitude decreases. This is what $N_{0} = A^{1/2}$ expresses. Thus, for FPTP (M = 1), the likeliest number of seat-winning parties would be $A^{1/2} = S^{1/4}$. With 2-seat districts, the number of seat-winning parties would be expected to go slightly up with PR rules, to $(2S)^{1/4} = 1.19\,S^{1/4}$, and down with plurality, to $(S/2)^{1/4} = 0.84\,S^{1/4}$.
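The chain of roots linking the aggregate to the measures of the number of parties can be verified numerically. The Python sketch below is illustrative and assumes the average relations N0 = A^(1/2), N2 = A^(1/3), and N∞ = A^(1/4) stated in this appendix. The function names are hypothetical, and S = 650 is used only as an example assembly size.

```python
def aggregate(magnitude, assembly_size, formula_exponent=1.0):
    """Electoral system aggregate A = (M^F * S)^(1/2)."""
    return (magnitude ** formula_exponent * assembly_size) ** 0.5

def party_numbers(a):
    """Average expectations N0 = A^(1/2), N2 = A^(1/3), N_inf = A^(1/4)."""
    return {"N0": a ** (1 / 2), "N2": a ** (1 / 3), "N_inf": a ** (1 / 4)}

a_fptp = aggregate(1, 650)          # FPTP with S = 650, so A = S^(1/2)
numbers = party_numbers(a_fptp)

# Ratio of N0 in 2-seat districts to N0 under FPTP, at the same assembly size.
ratio_pr = party_numbers(aggregate(2, 650, 1.0))["N0"] / numbers["N0"]
ratio_plurality = party_numbers(aggregate(2, 650, -1.0))["N0"] / numbers["N0"]
```

Raising N∞ to the fourth power, N2 to the third, and N0 to the second recovers A in each case. The two ratios round to 1.19 and 0.84, the factors quoted above for 2-seat districts under PR and under plurality.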
It still is not clear whether the seat product or the aggregate should be seen as the centerpiece of an electoral system, and so I present them both. As seen above, the aggregate has appealing theoretical features, but in the chapters that follow, the product MS fits in naturally into my prose, while the aggregate does not. Also, it would be hard to get a political practitioner to be involved with something that includes a square root! So I use MS more often than A.

The order in which discoveries are made is rarely the logical order in which it is convenient to present them in retrospect. It is evident that multi-seat plurality rule is even less favorable to small parties than FPTP. But that it acts on the number of parties as if it changed a magnitude M into 1/M is an intuition that has bugged me for several decades, without my being able to prove it. The relationships between the seat product MS and the various measures of the number of parties were gradually developed and tested in articles published in 1993–2006. A final correction for the effective number of legislative parties was made while writing this book. Only then could the pieces come together into the dual notion of seat product and aggregate of an electoral system.
Part II The Duvergerian Macro-Agenda: How Simple Electoral Systems Affect Party Sizes and Politics
Ask not what the electoral rules can do for your country, ask what those rules can do on the average. A Wuffle
7 The Duvergerian Agenda
For the practitioner of politics:
• Starting with the characteristics of the electoral system, the ‘Duvergerian agenda’ aims at predicting the average seat and vote share distributions of parties, their effective number, and deviation from PR.
• In the case of simple electoral systems, such prediction has become possible for seat shares and cabinet duration.
• Simple electoral systems are those using a usual PR formula or first-past-the-post, so that assembly size and district magnitude tell the whole story.
• The following chapters will offer specific formulas and tell when they can or cannot be used.
Duverger’s law has been mentioned from the very first chapter on. Maurice Duverger (1951, 1954) highlighted the possibility of predictable relationships between electoral systems and political outcomes. The search for such regularities has been called the ‘Duvergerian agenda’ (Shugart 2006: 28), and it arguably has formed the core of the field of electoral studies during the late 1900s. This search looks for answers to questions like: How does the electoral system shape the party system? To what extent are voters’ choices affected by electoral rules? And what are the processes that cause the relationships found?

The very idea of the existence of predictable relationships between electoral systems and party political consequences remains controversial, but it keeps revolving. Our aim should be to go beyond qualitative statements and establish quantitative average patterns. Semi-quantitative answers
such as Colomer’s micro-mega rule (Chapter 6) are useful guides, but one should get as specific as possible. This chapter specifies the core idea of the Duvergerian approach and traces its development over the last fifty years. It indicates the broad research agenda this approach suggests and the opportunities the simplest electoral systems present, in particular. A road map results for the several following chapters, which form the centerpiece of this book.
The Core Idea of the Duvergerian Approach

One broad idea underlies the line of inquiry that received a major boost from Duverger’s work, although Duverger himself expressed it in a narrower form. When the electoral system is simple: average distribution of party sizes depends on the number of seats available.

Directly, this means the number of seats in the electoral district. Single-seat districts restrict the number of parties more than do multi-seat districts. However, the total number of seats in the representative assembly also matters, because more seats offer more room for variety. It is possible to have more than 10 parties in a 500-seat assembly, but not in the 10-seat national assembly of St Kitts and Nevis. At the same district magnitude, a larger assembly is likely to have more parties, all other factors being the same. Once we add the impact of the seat allocation formula, Colomer’s micro-mega rule and the notion of seat product emerge (cf. Chapter 6).

The chapters that follow will deduce and test the logical consequences of this dependence of party system on the number of seats available in the assembly and the district. Within the Duvergerian agenda, this is the ‘macro’ part in that it deals with system-level variables (Shugart 2006). The complementary ‘micro’ part tries to elucidate how such macro-changes emerge from decisions made on the individual level, by voters and politicians. Before proceeding to the resulting research agenda, this chapter summarizes the history of the Duvergerian thought.
Duverger’s Law and Hypothesis: Mechanical and Psychological Effects

The study of electoral systems began with advocacy pieces for specific sets of rules, such as those written by Borda (cf. Colomer 2004b: 30), Hare
(1859), and Mill (1861). This tradition continued up to the mid-1900s. (For details, see Taagepera and Shugart 1989: 47–50 or Colomer 2004b). A major analytical landmark was reached with Maurice Duverger’s work. Duverger (1951, 1954) was the first to announce clearly what came to be called Duverger’s law and hypothesis (Riker 1982), making a connection between electoral and party systems. Avoiding implications of unidirectional causality, they can be worded as follows: (1) Seat allocation by FPTP tends to go with two major parties (‘law’). (2) PR formulas in multi-seat districts tend to go with more than two major parties (‘hypothesis’, because more exceptions were encountered). Note that the Duverger statements (law and hypothesis) involve only one parameter, district magnitude. This means they address only the systems I have called simple. They say nothing about elections with run-offs, tiers, legal thresholds, ordinal ballots, or any other complications, unless such complex systems are somehow reduced to analogous simple systems, as has been tried in different ways by Taagepera and Shugart (1989) and Lijphart (1994). These statements can be made more specific thanks to improved operationalization of the notions involved. Rae (1967) coined the term district ‘magnitude’ and applied it to systematic worldwide analysis. Laakso and Taagepera (1979) introduced the effective number of parties. In those terms, we could interpret Duverger’s law to mean ‘M = 1 goes with 1.5 < N < 2.5’, while Duverger’s hypothesis means ‘M ≥ 2 goes with N > 2.5’. Actually, as district magnitude increases from M = 1 to M = S (nationwide single district), the number of parties tends to increase gradually and at a decreasing rate, as first shown graphically in Taagepera and Shugart (1989: 144). In this light, the discontinuity between the law and hypothesis should be removed, leading to a single function N = f (M) for the average pattern. 
Taagepera and Shugart (1989: 144, 153) offered empirical equations, but no logical explanation for the patterns observed could be found. These equations should no longer be used, because models with a stronger conceptual foundation are presented in Chapters 9 and 14.

What produces the outcomes noted by Duverger? Low district magnitudes (and M = 1 in particular) arguably put a squeeze on the number of parties in two ways. In any single-seat district with plurality, one of the two largest parties nationwide will win, unless a third party has a
local concentration of votes quite different from its nationwide degree of support. This is the so-called Duverger mechanical effect. Hence third party votes most often are ‘wasted’ (for the purpose of winning seats), so that these parties are underpaid, nationwide. Correspondingly, the two largest parties will be overpaid in terms of seats. This effect is observed instantaneously, for any given election, once the seat and vote shares are compared. In this sense, it is ‘mechanical’. In contrast, the so-called Duverger psychological effect develops slowly, over several elections. The mechanical effect means that votes for third parties are effectively wasted in most districts. In the next election, some voters are tempted to abandon such parties, except in the few districts where the third party won or came close. With reduced votes, such parties stand to win still fewer seats in the next election, causing even further voters to give up on them. Thus, third parties are gradually eliminated, unless they have local strongholds. But even there, voters may hesitate between a preferred third party and a tolerable major party, which has more chances to form the cabinet and bring resources to the district. The psychological effect is often presented in terms of voter strategies, but it also works on politicians and contributors. Anticipating another defeat and lacking resources, a third party may desist from running in a district even before its former voters have a chance to abandon it. Financial contributors may be hard to find, and few people may volunteer to campaign for a lost cause. ‘Scholars disagree over which of these causal mechanisms—strategic voting in the mass electorate or strategic contributing in the elite strata—is the more important. . . . In my view, both kinds of resource concentration are important. Elites typically act first’ (Cox 1997: 30). The Duverger effects apply foremost at the district level. 
This is where the seat is lost or won and where the votes are wasted or not, regardless of nationwide results. Voters have no direct reason to abandon a nationwide third party that won in their own district, or that only narrowly lost and could win in the next election. The extension of the psychological effect to the nationwide scene need not follow, but it often does, if voters perceive the third party representatives as ineffective in the assembly. Third parties have vanished in the USA, but have survived and even made a comeback in the United Kingdom. In Canada, Duverger’s law operates at the provincial level, but the two dominant parties are not the same in all provinces, leading to a more scattered nationwide pattern. The picture is even more diverse in India.
Many non-Duvergerian factors may counter the psychological effect (and other strategic considerations) in UK and elsewhere. When parties are successful in subnational or supranational elections where PR is used (such as elections to the Scottish Assembly or the Parliament of the European Union), these parties are motivated to show their flag in the FPTP elections too. When parties are publicly financed, it may pay to run even when few or no seats are won. The advent of TV may enhance third party visibility, compared to printed press, and the Internet makes intraparty contacts less expensive for parties with dispersed memberships. When the major party programs converge toward the middle voter (Downs 1957), they may come to look so similar in the eyes of third party voters that neither party may be seen as a ‘lesser evil’ worth a strategic shift away from the third party. Strategic considerations, such as those in the classical Duverger effects, are not limited to single-seat districts. As district magnitude increases, these effects are attenuated but still restrict the smaller parties. If a country has larger and smaller districts, smaller parties may decide to run only in the larger districts. Thus in Finland 1962–83, about 7.5 parties ran in the smallest districts (M = 7 to 11), while about 8.5 did in the largest (M = 17 to 27). The effective number of electoral parties was around 4.6 in the smallest and around 5.1 in the largest (Taagepera and Shugart 1989: 119). Once again, parties may give up on voters before voters have a chance to abandon the parties. It is hard to sort out the strategic considerations of voters, a party’s anticipation of such considerations, and the party’s own dilemma between concentration of resources and showing the flag in many districts.
The Broad Duvergerian Agenda
The Duvergerian agenda consists of explaining and predicting the results and causes of Duverger’s effects. It includes micro and macro aspects. A micro dimension underlies the psychological effect and related strategic considerations. It involves the individual decisions of voters, party leaders and contributors in what Cox (1997) calls strategic coordination. Reed (1991) observed that, in Japanese SNTV elections, M + 1 ‘serious’ candidates tend to run in a district with M seats. Cox (1997: 99) proposed an extended M + 1 rule as a direct generalization of Duverger’s law and tested it in various ways. This issue will be revisited in Chapter 15.
The longstanding macroscopic approach tries to make use of the restrictions imposed by electoral rules (low district magnitude, in particular) to explain the number and size distribution of parties, as well as the degree of disproportionality of seats to votes. The number of political cleavages or ‘issue dimensions’ is also taken into account, to the extent it can be estimated independently of party differences. In a recent overview of electoral systems, Shugart (2006) considers the macro dimension of the Duvergerian agenda the ‘core of the core’ of electoral studies. Since 1980, advances in the study of simple electoral systems for parliamentary elections have been such that Shugart (2006) feels that ‘the agenda of proportionality and number of parties is largely closed’ and needs only fine-tuning. But it is always risky to call an agenda closed. Around 1900, just prior to the birth of relativity and quantum mechanics, many considered physics a closed field. True, the ‘core of cores’ of the Duvergerian agenda has been investigated to the point where meaningful spin-offs have become possible toward systematic investigation of more complex electoral systems, intraparty impact of electoral rules, and the effects of ‘second-order’ rules such as closed versus open lists. But this need not mean that the core issues are resolved to a satisfactory degree. This book presents recent findings in the macro-Duvergerian realm and indicates that quite a lot still remains to be done. The Duverger statements, important as they are historically, still give only semiquantitative answers. In FPTP systems, should we visualize Duverger’s law more as 52-48 or as 50-40-10? Saying that it could mean either and that more precision is not needed would restrict prediction to the level politicians can handle on their own, without a need for political scientists. It surely would be of interest to know which electoral laws are more conducive to 52-48 rather than to 50-40-10. 
Saying that such precision cannot be achieved would mean giving up before even trying. For starters, we can determine the empirical world average pattern for all FPTP systems at various assembly sizes. Once we have such a baseline, we can measure by how much individual countries deviate from it, and ask why. Alternatively or concurrently, we may ask what patterns we would expect on logical grounds. In multi-seat PR systems, similar questions need answers. Duverger’s hypothesis merely predicts more than two significant parties. But how many, and what are their most likely relative sizes? Obviously, district magnitude matters, as approximated by the outdated empirical equations offered by Taagepera and Shugart (1989: 144, 153). But if, at a given
magnitude, N is 3.00, does it more often imply three equal parties (34-33-33) or one large party and several smaller ones (45-29-21-16)? And why is the empirical relationship what it is, rather than some other expression in M? There must be a reason for the specific form, and we would not be scientists if we did not try to find it. Colomer’s micro-mega rule expands the scope of Duverger’s statements, by going beyond district magnitude and including assembly size and seat allocation formula. But it still is only semi-quantitative. For the tiniest parties, Colomer’s recipe is clear: Promote as large assemblies, district magnitudes, and quotas as possible. But what would be the optimal combinations for a medium-sized party? In retrospect, no progress could be made as long as the dominant picture was votes determining the seats, with the electoral system a black box in-between the votes and seats (cf. Taagepera and Shugart 1989: 64, 202). The erroneous central idea was that this black box determines how votes are translated into seats:

VOTES → ELECTORAL SYSTEM → SEATS. [WRONG]

The breakthrough came with the realization (Taagepera and Shugart 1993) that the electoral system is not an intervening control box between votes and seats. Rather, votes and electoral system both affect seats, from opposite directions:

VOTES → SEATS ← ELECTORAL SYSTEM. [RIGHT]
This may seem a minor difference in visualizing the Duvergerian idea. However, the first format leads to a dead end, while the second one opened up a way to crack the Duvergerian nut. The basic idea was expressed in Figure 1.1. Figure 7.1 introduces the mechanical and psychological effects. For simplicity, it omits political culture.
[Diagram: boxes for ELECTORAL SYSTEM, SEATS DISTRIBUTION, VOTES DISTRIBUTION, and CURRENT POLITICS, CULTURE, HISTORY, with arrows labeled MECHANICAL EFFECT and PSYCHOLOGICAL EFFECT.]
Figure 7.1. The opposite impacts of current politics and electoral system
For individual elections, votes come first, based on current politics and, more remotely, on the country’s historical peculiarities (‘path dependence’). These votes will determine the seats, in conjunction with the mechanical effect of the electoral system. For the average of many elections, however, the causal arrow reverses its direction. Through the mechanical effect, the electoral system pressures the distribution of seats to conform to what best fits in with the total number of seats available. Through the psychological effect, the electoral system eventually also impacts the distribution of votes, possibly counteracting culture and history. Indeed, voters are no longer free to vote for a seventh-ranking party, if the electoral system has deprived it of seats and it has ceased to exist. As a result, the average of many elections in many countries using similar electoral systems may produce a predictable pattern. Do other factors matter, such as a country’s historical tradition and culture, and the moment’s political events? Of course they do. But they can be addressed only when the more universal patterns have been elucidated. Actually, the impact of the electoral system reaches even further than votes. Expanding on findings by Anderson and Guillory (1997) and Klingemann (1999), Andrew Drummond (2006) has documented that it affects political attitudes. Those voters whose preferred parties win elections tend to view elections, government and even democracy more favorably than the losers. On average, majoritarian electoral systems create more losers, and the loss itself is starker. As a result, overall evaluation of politics and democracy tends to be lower in majoritarian systems and higher in PR systems.
Going even further, Lijphart (1999: 270–300) presents evidence that people in consensual countries, largely defined by a PR electoral system, not only are more satisfied with democracy but also have less social violence, more political equality, and higher women’s representation. Lijphart (1999: 258–70) also challenges the widespread belief that consensual countries are less efficient economically. Consensus politics involves aspects beyond the electoral system, so some of these differences may not derive from electoral systems as such. It could well be that countries with inherently consensual political cultures tend to choose multi-seat PR rather than FPTP in the first place. It would be harder to claim that people initially satisfied with democracy choose multi-seat PR. This book will not investigate such cultural effects of electoral systems.
The Macro-Duvergerian Agenda for Simple Electoral Systems
The overall road map for the central part of this book is shown in Figure 7.2, to be discussed in some detail. It expands and clarifies a scheme first offered in Taagepera (2001), tracing the connections to be specified in the following chapters.

[Flow chart with the following boxes: POPULATION, ELECTORATE, FOUNDING PARTIES, Self-Preservation, DISTRICT MAGNITUDE, ASSEMBLY SIZE, LARGEST SEAT SHARE, DISPROPORTIONALITY EXPONENT, STRATEGIC CONSIDERATIONS, OTHER SEAT SHARES, EFFECTIVE NUMBER OF PARLIAMENTARY PARTIES, CABINET DURATION, VOTE SHARES, DEVIATION FROM PR, EFFECTIVE NUMBER OF ELECTORAL PARTIES.]
Figure 7.2. The macro-Duvergerian agenda, as of 2007

All this applies only to simple electoral systems: those that include no features beyond assembly size, a fairly uniform magnitude for districts, and seat allocation according to a usual PR formula (which boils down to FPTP when M = 1). Thick arrows indicate definitions, such as defining the effective number on the basis of seat shares of parties. Thin arrows indicate connections for which we have quantitatively predictive models. Dashed arrows show conceptual connections for which more than fine-tuning is needed, because even the broadest form of the quantitative model is fuzzy or missing. Only downward arrows are shown, because this is the predominant direction of causality under usual conditions, but mutual interaction is not to be excluded. For instance, politics usually has little effect on population size, but when politics leads to secession or annexation, it has a major effect. Starting from the top of Figure 7.2, population strongly constrains assembly size. It will be seen (Chapter 12) that a cube relationship prevails: Assemblies of 100 seats tend to go with 1,000,000 people, while island countries with little more than 1,000 people have little more than 10 seats. In democracies, the electorate is almost proportional to population. District magnitude is wide open in principle. The founding parties can choose any magnitude, from M = 1 (FPTP) to M = S, the assembly size. Parliamentary parties can later initiate changes. Actually, choice is restricted by the self-preservation instinct of parties, as condensed in Colomer’s micro-mega rule. True, in new democracies the dominant parties may be short-sighted and act contrary to their long-term interests. Boix (1999) and Colomer (2005) have advanced our knowledge of how parties choose electoral systems, but a quantitatively predictive model still eludes us. One of the best rules of thumb still is ‘British heritage → FPTP, no Franco-British heritage → List PR’ (cf. Chapter 3).
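The cube relationship between population and assembly size mentioned above can be illustrated with a short sketch. It assumes the simplest reading of the text, namely that assembly size is roughly the cube root of population; the function name is ours, and the model itself is developed in Chapter 12, not here.

```python
def cube_root_assembly_size(population: float) -> float:
    """Assembly size under a pure cube-root relationship (illustrative assumption)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return population ** (1.0 / 3.0)


# The two examples quoted in the text: about 1,000 and 1,000,000 people.
for population in (1_000, 1_000_000):
    seats = cube_root_assembly_size(population)
    print(f"{population:>9,} people -> about {seats:.0f} seats")
```

The sketch only reproduces the two round-number examples in the text; real assemblies scatter around such a line.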
District magnitude places constraints on the number of parties that can win seats in the district. The expected average number of seat-winning parties in one district can be determined, using the ignorance-based model approach (cf. Taagepera 1999b, 2003). Adding the constraints exerted by assembly size, the same approach enables us to predict the average number of seat-winning parties nationwide one could expect in the absence of any other information. This is where the seat product MS (cf. Chapter 6) emerges, as shown in Chapter 8. For clarity, Figure 7.2 omits the number of seat-winning parties. The number of seat-winning parties, in turn, constrains the seat share of the largest party. The largest share cannot be more than 100 percent of
the entire assembly, nor can it be less than the average share, which is the inverse of the total number of seat-winning parties. The ignorance-based model approach allows us to determine the expected average seat share of the largest party, based only on the number of seat-winning parties, itself based on the seat product MS. It is a purely institutional model, up to this point, and it fits the data, on the average. This model is also developed and tested in Chapter 8. Subtracting the expected seat share of the largest party from the total yields the range in which the second largest share can lie. On this basis the entire most likely distribution of seat shares of parties can be inferred, in the absence of any other information. However, at this stage, the observed average distribution differs from the one predicted by the purely institutional model. For the first time, we have to introduce a non-institutional parameter to account for strategic and other factors that hurt the smaller parties. A part of their inherent support is shifted to major parties. A good fit to the observed average seat share distribution is obtained in Chapter 9 when the transfer parameter is set around a half. The effective number of legislative parties can be estimated in two ways. It can be calculated on the basis of the estimated seat shares of all parties. Alternatively, it can be deduced from the largest share alone. Indeed, the largest share places constraints on the value the effective number of parties can take. While this approach is less precise than the one based on the shares of all parties, it has the advantage of being purely institutional, bypassing the strategic considerations that work against the small parties. Thus it enables us to make a prediction (in Chapter 9) for the effective number of legislative parties, based on the seat product MS alone. 
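The two limits on the largest seat share named above (the average share 1/N0 and 100 percent) can be turned into a rough numerical sketch. It assumes that the ignorance-based approach takes the geometric mean of the two limits, which gives a largest share of (MS)^(-1/8) and is consistent with the eighth power used in the practitioner's summary of Chapter 8. The effective number is computed with the standard definition, 1 divided by the sum of squared shares, which this section uses but does not restate. Function names are ours.

```python
import math


def expected_largest_share(seat_product: float) -> float:
    """Geometric mean of the logical limits on the largest seat share.

    Assumption: N0 = (MS)^(1/4) seat-winning parties, so the largest
    share lies between 1/N0 (all parties equal) and 1 (one party wins all).
    """
    n0 = seat_product ** 0.25
    lower, upper = 1.0 / n0, 1.0
    return math.sqrt(lower * upper)


def effective_number(shares) -> float:
    """Effective number of parties: 1 / sum of squared fractional shares."""
    total = sum(shares)  # normalize, so percentages or seat counts both work
    return 1.0 / sum((s / total) ** 2 for s in shares)


print(f"largest share at MS = 10,000: {expected_largest_share(10_000):.2f}")
print(f"effective number for 34-33-33: {effective_number([34, 33, 33]):.2f}")
```

The first function is purely institutional, in the sense used in the text: it needs nothing beyond the seat product.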
A major payoff comes in Chapter 10, when the effective number of legislative parties, in turn, is connected to duration of governmental cabinets. This time, the predictive model does not use the ignorance-based approach but optimizes the number of communication channels. The overall outcome is a specific prediction regarding average cabinet duration, made solely on the basis of the seat product MS plus one empirically determined constant. By this time, the ignorance-based approach has been repeated so many times that the impact of district magnitude and assembly size is quite distant and can be expected to be completely blurred out by other political and cultural factors. The wonder is that this is not the case. The institutional effect on cabinet duration is still evident, and it has the predicted functional form.
Thus, this sequence of predictive models actually offers a baseline for informed institutional engineering. Population largely fixes the assembly size, but district magnitude can be modified so as to alter average cabinet duration not only in the desired direction but also to the desired degree. However, the given country’s historical tendency to deviate from worldwide averages must be corrected for. With the same seat product, Italy will have shorter-lasting cabinets than Spain. What about the third ingredient of the micro-mega rule, the seat allocation formula? As long as one keeps away from multi-seat plurality, the allocation formula affects the impact of district magnitude to a relatively minor extent (cf. Figure 6.1 and Table 6.5). The following chapters study in detail the links in the concatenation that extends from district magnitude to cabinet duration. At the same district magnitude, we should expect the d’Hondt systems to deviate from the average in the direction less favorable to small parties: fewer seat-winning parties, lower effective number of parties, and longer cabinet durations. The Hare-LR and Sainte-Laguë systems should fit better. The difference, however, is small and hard to detect. This concludes the study of the largely mechanical effect of electoral systems on the distribution of seats (the left half of Figure 7.1). This part of the macro-Duvergerian agenda is now largely closed, except for connections to population and founding parties. Details about the effective number of parties need to be fine-tuned. Future emphasis would be on extending the theory from simple to more complex electoral systems. The largely strategic (‘psychological’) impact of electoral systems on the distribution of votes, on the other hand (the central part of Figure 7.1), still remains wide open. This is the region at the bottom right of Figure 7.2. Institutions are bound to impact votes in a fuzzier way than seats.
An electoral system can block the seventh-largest party from getting any seats, but it cannot prevent people from voting for this party, if they really insist and the party refuses to fold. How are seat shares connected to vote shares of parties? This is the thin dashed horizontal line at lower right of Figure 7.2. Here the relationship was first worked out in the opposite direction, going from votes to seats. This work started a century ago with the so-called cube law of Anglo-Saxon elections that applied to FPTP systems and empirically connected the seat ratios of two parties to their vote ratios. The empirical cube law was extended into a theory-based seat-vote equation (Taagepera 1973, 1986) that applied to PR elections, too. It can now be seen as part of a broader law of minority attrition (Chapter 13). The disproportionality
exponent that enters here depends on the number of voters (electorate) and seats (assembly size). It is strongly conditioned by district magnitude. Throughout the path from seat-winning parties to cabinet duration, assembly size and district magnitude play an essentially symmetrical role in the form of the seat product. In contrast, their role is cardinally different in the law of minority attrition. This law can be reversed to go from seats to votes. (Only this direction is shown in Figure 7.2.) This way, the vote shares, too, can be inferred from assembly size and district magnitude alone, but with increasing blur, plus a new difficulty. Parties with few votes are easily predicted to win no seats. But how does one go in the reverse direction and estimate the vote shares of parties that run and do not get any seats? Ways to work out the most probable distributions need refining. Once this major link has been completed (Chapter 14), the effective number of electoral parties follows from vote shares. It is found to be more manageable, however, to estimate the largest vote share from the largest seat share and the seat-vote equation, and use it to estimate the effective number of electoral parties. The difference between the largest seat and vote shares also enables us to estimate at least some of the indices of deviation from PR. By this time, one can expect that the distant connection to the institutional factors (assembly size and district magnitude) would largely be overridden by other political and cultural factors. Surprisingly, some faint connection can still be detected, but full testing remains to be done. Only at that point could we say that the macro-Duvergerian agenda is closed, as far as the simple electoral systems are concerned, apart from fine-tuning.
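The step from votes to seats described above can be sketched numerically. The sketch assumes the textbook form of the cube law, in which the seat ratio of two parties equals the cube of their vote ratio, and treats the exponent `n` as a stand-in for the disproportionality exponent. How that exponent depends on electorate, assembly size, and district magnitude is the subject of Chapter 13 and is not reproduced here; the function name and the 52-48 example are ours.

```python
def seat_shares_from_votes(vote_shares, n: float = 3.0):
    """Seat shares proportional to vote shares raised to the power n.

    n = 3 gives the classical cube law; n = 1 gives perfect proportionality.
    The value of n is an input here, not something this sketch predicts.
    """
    powered = [v ** n for v in vote_shares]
    total = sum(powered)
    return [p / total for p in powered]


votes = [0.52, 0.48]
for n in (1.0, 3.0):
    seats = seat_shares_from_votes(votes, n)
    print(f"n = {n:.0f}: seats " + " - ".join(f"{s:.1%}" for s in seats))
```

Going in the reverse direction, from seats to votes, amounts to using the exponent 1/n, with the difficulty noted in the text that parties without seats drop out of the calculation.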
Conclusion
The quantitative study of the relations between votes, seats, and electoral systems took off with the observation of the ‘cube law’ at the beginning of the twentieth century. It received a major boost with Maurice Duverger in the mid-1900s, to the point that the core of electoral studies during the most recent 50 years has largely consisted in trying to implement the Duvergerian agenda. The core idea of the Duvergerian approach is that, when the electoral system is simple, the average distribution of party seat shares depends on the number of seats available. As the next chapters document in detail, it can now be specified that this distribution depends on the product of
the number of seats available in the assembly and in the district. When one shifts from legislative to electoral parties, assembly size and district magnitude begin to enter asymmetrically. Plurality rule and complex and compound electoral systems need separate treatment. The macro-Duvergerian agenda has recently made marked advances, and for simple electoral systems it might become closed in the near future. This would mean that the average seat and vote share distributions and the resulting measures of number of parties and deviation from PR could be inferred from the characteristics of the electoral system. Proportionality profiles could be predicted. Thereafter, the macro-Duvergerian agenda would focus on elucidation of more complex electoral systems and elections other than nationwide legislative elections. The micro-Duvergerian agenda remains to be developed to the point where quantitative predictions can supplement postdictions and the macro-level phenomena can be explained through micro-level processes.
8 The Number of Seat-Winning Parties and the Largest Seat Share
For the practitioner of politics:
• The quantity to watch is the product of the number of seats in the assembly and the number of seats allocated in the average district.
• The larger this ‘seat product’, the larger the number of parties in the assembly.
• The larger the seat product, the smaller the seat share of the largest party.
• If you wish to increase the largest seat share by one-tenth (0.1), your best bet is to multiply (1 + 0.1) by itself 8 times, which yields 2.14. This is by how much you must divide the present seat product. This means that either you split the present districts into two smaller districts or you cut the assembly size by half.
• If you wish to reduce the largest seat share by one-tenth, your best bet is to multiply (1 − 0.1) by itself 8 times, which yields 0.43. This is by how much you must divide the present seat product. To do so, you can increase district magnitude or assembly size or both.
• This way to calculate is based on a logical model that agrees with the world average. It is approximate, because other factors enter.
• At the same average district magnitude, widely unequal districts usually reduce the seat share of the largest party.
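The arithmetic in the two ‘if you wish’ items can be checked with a few lines. This is a minimal sketch that takes the rule exactly as stated (raise one plus the desired change to the eighth power); reading it as a largest share that varies as the seat product to the power −1/8 is our inference from that eighth power. The function name is hypothetical.

```python
def seat_product_divisor(relative_change: float) -> float:
    """Factor by which to divide the seat product MS.

    relative_change is the desired proportional change in the largest
    seat share: +0.1 for one-tenth larger, -0.1 for one-tenth smaller.
    """
    if relative_change <= -1.0:
        raise ValueError("relative_change must be greater than -1")
    return (1.0 + relative_change) ** 8


for change in (0.1, -0.1):
    divisor = seat_product_divisor(change)
    print(f"change {change:+.1f}: divide the seat product by {divisor:.2f}")
```

A divisor above 1 means shrinking the seat product (smaller districts or a smaller assembly); a divisor below 1 means enlarging it.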
Here we start from nothing but district magnitude and assembly size, and presume FPTP or a standard PR seat allocation rule. On that institutional basis alone, we predict how many parties are likely to win at least one seat, and how large the seat share of the largest party is likely to be,
on the average. If you think it cannot be that simple, you have lots of company, but I will try to explain why it works. The world average supplies a comparison point for individual countries. The main body of the chapter offers what is needed to apply the predictive models developed, along with evidence about their degree of validity for simple electoral systems. It concludes with implications for institutional engineering. Most derivations of models are given in the chapter appendix, along with more technical and philosophical concerns.
The Number of Seat-Winning Parties
The logical model to be presented claims that the most likely number of parties (p) that win at least one seat in an isolated district of M seats is (Taagepera and Shugart 1993) $p = M^{1/2}$, when List PR is used. It obviously also applies to FPTP. Nationwide, when an assembly of S seats is elected in districts of M seats, the most likely number of seat-winning parties ($N_0$) is, according to this model, $N_0 = (MS)^{1/4}$. This means that, with a large number of cases, we expect one-half of them to fall above and one-half below the value $N_0 = (MS)^{1/4}$. No prediction is made about individual elections, where the number of seat-winning parties could deviate widely from the model. The mean for many elections using a simple electoral system (List PR or FPTP) is expected to be within a factor of 2 of the model, that is, no more than the double and no less than a half of the predicted number of seat-winning parties. We cannot expect to be closer, because country-specific factors beyond the seat product also enter. Finally, the mean of many electoral systems with the same seat product is expected to be close to the model, because here the country-specific factors should cancel out. Before presenting the quantitatively predictive model, let us see how well it works. All those elections in Mackie and Rose (1991) were considered where at least four elections were carried out under the same rules and all seats were allocated in districts. Values for the resulting thirty electoral systems are tabulated in Taagepera (2002b). Table 8.1 compares the actual and predicted numbers of seat-winning parties, using the geometric means for electoral systems with the same seat allocation formula.
Table 8.1. The actual number of seat-winning parties, the expected number (based on district magnitude and assembly size), and their ratio

Electoral formula        No. of systems   Actual N0   N0 = (MS)^(1/4)   Ratio
FPTP                     9                3.57        3.87              0.92
Modified Sainte-Laguë    2                5.99        6.29              0.95
STV                      3                4.29        3.95              1.09
D’Hondt                  8                6.96        6.12              1.14
Alternate Vote           2                3.54        3.05              1.16
SNTV (Japan)             1                10.0        6.6               1.5
Two Rounds               5                7.03        4.04              1.74
All M = 1 systems        16               4.41        3.82              1.15
All M > 1 systems        14               6.30        5.63              1.12
All systems              30               5.21        4.58              1.14

Data source: Taagepera (2002b).
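The ratio column of Table 8.1 can be recomputed from the two columns of means. The sketch below simply re-enters the printed values per electoral formula (no new data) and compares each recomputed ratio with the printed one; the variable and function names are ours.

```python
# (formula, number of systems, actual N0, expected N0, printed ratio),
# copied from Table 8.1.
TABLE_8_1 = [
    ("FPTP", 9, 3.57, 3.87, 0.92),
    ("Modified Sainte-Laguë", 2, 5.99, 6.29, 0.95),
    ("STV", 3, 4.29, 3.95, 1.09),
    ("D'Hondt", 8, 6.96, 6.12, 1.14),
    ("Alternate Vote", 2, 3.54, 3.05, 1.16),
    ("SNTV (Japan)", 1, 10.0, 6.6, 1.5),
    ("Two Rounds", 5, 7.03, 4.04, 1.74),
]


def actual_to_expected(rows):
    """Return {formula: actual N0 / expected N0} for each table row."""
    return {name: actual / expected for name, _n, actual, expected, _r in rows}


ratios = actual_to_expected(TABLE_8_1)
for name, _n, _actual, _expected, printed in TABLE_8_1:
    print(f"{name:<22} recomputed {ratios[name]:.2f}   printed {printed}")
```

A ratio of 1.00 would mean the mean for that formula sits exactly on the model line; the further from 1.00, the poorer the fit.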
The electoral systems are listed in Table 8.1 in the order of increasing ratios of the actual and expected numbers of seat-winning parties. The deviations of these ratios from 1.00 indicate the degree of fit of the model. The means of simple electoral systems fit within 15 percent. The model need not fit for more complex systems, but for STV and Alternate Vote, it still does. Only for Two-Rounds systems and the single case of SNTV would we need to look for factors beyond the impact of the seat product. Among all 30 electoral systems, the actual-to-expected ratio is the lowest for the USA 1938–88 (0.54) and the highest for two Two-Rounds systems, Germany 1871–1912 (3.0) and the Netherlands 1888–1913 (2.0). See Taagepera (2002b) for individual electoral systems and various methodological issues. Figure 8.1 uses the same data to graph the actual $N_0$ for 30 electoral systems against the seat product MS, both on logarithmic scales. The mean expectation line $N_0 = (MS)^{1/4}$ is shown, as well as the lines at one-half and at twice these values. The data points are expected to be crowded in the center of the zone delineated by the latter two lines. If the model does not hold, the data points could be scattered all over the place or they could all be above or below the predicted line. Except for two Two-Rounds systems (Germany 1871–1912 and the Netherlands 1888–1913), system averages are located in the expected zone. Thus the model has some merit. Any acceptable model must include the anchor point MS = 1 → $N_0$ = 1 (also shown in Figure 8.1), because one seat obviously goes to one and only one party. The best-fitting line (not shown) that passes through this
[Log-log plot: number of seat-winning parties (N0, vertical axis, 1 to 100) against the seat product MS (horizontal axis, 1 to 10,000), with separate symbols for M = 1 and M > 1 systems and the lines N0 = (MS)^0.25, N0 = 2(MS)^0.25, and N0 = (MS)^0.25/2. Labeled points: GER 1871−, SWI, IRE, SPA, JPN, NET 1888−, USA 1938−.]
Figure 8.1. The number of seat-winning parties vs. the seat product MS
Data source: Taagepera (2002b).
anchor point corresponds to $N_0 = (MS)^{0.26}$ rather than the predicted $N_0 = (MS)^{0.25}$. The difference is a minor one. Note that the criteria of agreement with the model differ for predictive models and for postdictive statistical data fits (see Taagepera 2008). For the latter, measures of scatter, such as $R^2$, are the main criteria. For predictive models, in contrast, closeness to the predicted value is what matters. The visibly low $R^2$ in Figure 8.1 is due to the relatively short range that MS can take on the logarithmic scale, but all simple systems are within the expected zone. Indeed, if the FPTP systems were tested separately, $R^2$ would be close to 0, due to the short range of MS = S, but the predictive model would still be confirmed within an 8 percent average deviation from the model.
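The prediction and the factor-of-2 zone used in this section can be written out as a small sketch. It assumes only the formula stated in the text, N0 = (MS)^(1/4); the function names are ours, and the single worked case is the Netherlands 1918–52 (M = S = 100, geometric mean of 10.29 seat-winning parties), which is taken from this chapter.

```python
def expected_seat_winning_parties(magnitude: float, assembly_size: float) -> float:
    """Model value N0 = (M * S) ** (1/4) for a simple electoral system."""
    return (magnitude * assembly_size) ** 0.25


def within_expected_zone(actual_n0: float, magnitude: float,
                         assembly_size: float, factor: float = 2.0) -> bool:
    """True if the actual mean lies between model/factor and model*factor."""
    expected = expected_seat_winning_parties(magnitude, assembly_size)
    return expected / factor <= actual_n0 <= expected * factor


# Anchor point: one seat in one district goes to exactly one party.
print(expected_seat_winning_parties(1, 1))

# Netherlands 1918-52: a single nationwide district of 100 seats.
expected = expected_seat_winning_parties(100, 100)
print(f"expected {expected:.2f}, actual 10.29, "
      f"in zone: {within_expected_zone(10.29, 100, 100)}")
```

The zone test is a closeness criterion, in line with the distinction drawn above between predictive models and postdictive fits; it says nothing about scatter.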
The Number of Seat-Winning Parties: The Model for Single District
Having shown that the model reflects reality, I now present the first part of the model, the one for a single district, at some length, because it
introduces the broad idea of ‘ignorance-based models’ (Taagepera 1999b), explained in more detail in Beyond Regression (Taagepera 2008). It is important to clarify this broad idea, as it will be used repeatedly later on. The nationwide model for seat-winning parties is placed in the chapter appendix, as will subsequent models based on this approach. Consider an electoral district with 100 seats, such as the Netherlands actually had in 1918–52 in its first chamber, a single nationwide district. How many parties are likely to win seats, and what is the likely average number of seats per party? Assume a usual PR formula, so as to exclude multi-seat plurality. Also assume that each seat is allocated to one specific party. At one extreme, a single party could win all the seats. At the other, 100 parties could win one seat each. Both extreme outcomes are unlikely under PR rules, but they are conceptually possible. In contrast, a number of parties below 1 or above 100 is logically impossible. In the absence of any other information, if we had to hazard a guess, we would minimize the maximum error by choosing the mean between the logical limits. For reasons explained in Taagepera (2008), the geometric mean should be used, because both limits are positive and, moreover, vary by several orders of magnitude. Hence we would guess at 10 parties to win seats. We could approach the problem from a different angle, asking: What is the likely average number of seats per party? This number, too, can range from 1 (when 100 parties win 1 seat each) to 100 (when one party wins all the seats). The geometric mean is 10 seats per party. The two approaches yield congruent answers: 10 parties, each winning an average of 10 seats, amount to a total of 100 seats, which is the actual number. This may look obvious, so why belabor it? The point is that it would not work with the arithmetic mean. Indeed, some colleagues have wondered why I try to avoid the good old arithmetic mean. So let us use it.
The arithmetic mean of 1 and 100 parties winning at least one seat each yields 50.5 seat-winning parties. If someone told you that a country is likely to have as many as 50 parties in its 100-seat assembly, would your common sense accept it? Never mind, let us continue. The average number of seats per party can also range from 1 to 100, so the arithmetic mean is 50.5 seats per party. However, an average of 50.5 parties winning an average of 50.5 seats per party would amount to a total of 2,550 seats rather than 100! Thus, using the arithmetic mean would lead to logical inconsistency. Only the geometric mean avoids it, for reasons given in Taagepera (2008). Maybe the present specific example makes it more believable on an intuitive level.
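The consistency argument can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names are illustrative assumptions. It guesses both the number of parties and the seats per party from the limits 1 and 100, and then compares the implied total number of seats under each kind of mean.

```python
import math


def geometric_mean(low, high):
    """Geometric mean of two positive limits."""
    return math.sqrt(low * high)


def arithmetic_mean(low, high):
    """Arithmetic mean of two limits."""
    return (low + high) / 2


def implied_total_seats(mean, seats):
    """Guess parties and seats per party from the limits 1..seats, then multiply.

    A logically consistent guess should reproduce the actual number of seats.
    """
    parties = mean(1, seats)
    seats_per_party = mean(1, seats)
    return parties * seats_per_party


if __name__ == "__main__":
    for label, mean in (("geometric", geometric_mean), ("arithmetic", arithmetic_mean)):
        total = implied_total_seats(mean, 100)
        print(f"{label} mean: implied total of {total:.2f} seats in a 100-seat district")
```

Only the geometric mean returns the 100 seats that actually exist; the arithmetic mean implies about 2,550.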
The Duvergerian Macro-Agenda
Actually, 8 to 17 parties won seats in the 9 nationwide PR elections where the Netherlands had 100 seats in its first chamber (1918–52). The mean number of seats per party ranged from 5.9 to 12.5. The geometric mean was 10.29 parties winning seats, and the geometric mean number of seats per party was 9.72, pretty close to the expected 10. This example shows that a guess based on conceptual limits can be appreciably off for an individual election, yet can be close for the average of several elections.

Let us now generalize for districts of any magnitude $M$, excluding a multi-seat plurality allocation rule. Designate by $p$ the number of seat-winning parties within one district. The conceivably allowed range for $p$ is $1 \le p \le M$. In the absence of any other information, our best guess for the number of parties that win at least one seat is the geometric mean of the extremes:

$$p = M^{1/2}.$$

This is a mathematically elegant expression. The reaction of many political scientists may well be the one described by Steven Reed (1996):

Political scientists are traditionally less comfortable with the assumption of mathematical elegance than are physicists. We tend to be more comfortable with the presumption that ‘things are more complicated than that’. More importantly, we expect some behavioral model to underlie our theories. There is no particular behavioral basis to Taagepera’s theory. It does not depend on rational voters or strategic political parties. It is less a ‘political’ theory than a mechanical one. Taagepera and Shugart see this characteristic as a strength (Taagepera and Shugart 1993: 456). However, theories without actors seem more appropriate for physics than for political science... When we have an accurate equation produced in this fashion, what is it exactly that we know? (Reed 1996: 73)
In the sense of German Verstehen, intuitive insight into what is going on, we may have little. Yet we have the ability to predict, on a definitely nonempirical basis, and this is not to be discounted. Accordingly, Reed concludes:

One must be able to isolate mechanical effects before one can properly evaluate behavioral and political effects. Mechanical effects may not be as fascinating as real politics, but they are equally important. (Reed 1996: 80)
To paraphrase Winston Churchill’s dictum about democracy, $p = M^{1/2}$ is the worst possible prediction one could make, except for all others. This prediction is pulled almost completely from thin air, but all the others would be completely so, in the absence of further information. If further information should yield a different outcome, we would be happy to accept it. Such information could come in two forms. One is the empirical median of a large number of cases at the same $M$. The other is a logical argument of a political nature. In the absence of either, $p = M^{1/2}$ is the best we can do.

Note that for single-seat districts ($M = 1$), $p = M^{1/2}$ yields $p = 1$, as it certainly should. This is an essential anchor point. Whatever format should be proposed for $p = f(M)$, it must yield $p = 1$ when $M = 1$. For a single case with $M > 1$, no firm prediction is made. The actual figure may be far off from $p = M^{1/2}$. But for a large number of cases, our best guess is that one half of them would be above and the other half below the curve $p = M^{1/2}$. I call this best guess the expectation value. In the words of a classical text on quantum mechanics:

The expectation value is the mathematical expectation (in the sense of probability theory) for the result of a single measurement, or it is the average of the results of a large number of measurements . . . (Schiff 1955: 24)
It is a notion useful in quantum mechanics, among others, but it involves nothing specific to physics. If I had to predict, I would predict the expectation value, because everything else would be even less justified, prior to receiving any further information. This is the meaning of ‘expectation’ and ‘prediction’ in this chapter and the following.

All this applies to one isolated district, such as a single nationwide one. In a country with many multi-seat districts, one can expect $p = M^{1/2}$ to underestimate the number of seat-winning parties, for the following reason. There is a difference between a district of $M = 25$ within a larger country and a small country where all 25 assembly seats are elected within the same district. In the small country, all parties are generated within the district. In the large country, large nationwide parties with no ready constituency within the given district may still run, bringing in funds and personnel from elsewhere, and occasionally winning at least one of the 25 seats. In order to show the flag throughout the country, they may shift resources to such districts even when this diversion might reduce their seats in their strongholds. If so, then the number of seat-winning parties in the given district may exceed $25^{1/2} = 5$. This phenomenon does not increase the number of parties that win seats nationwide, since it refers only to large parties. Parties that barely stand to win one or a few seats in the assembly will try to concentrate their resources in their strongholds. Taagepera and Shugart (1993) offer a more complex model that accounts for the likely impact of nationwide politics in the districts. It has not been fully tested, because we are more interested in the number of parties that win seats nationwide, and there, the effect of district-center interaction cancels out.

The extension of the model to the nationwide scene, $N_0 = (MS)^{1/4}$, is given in the chapter appendix. It follows from the conceptual limits of 1 to $M$ parties winning seats in a district and 1 to $S$ parties winning seats in an assembly of $S$ seats.
The Largest Seat Share

Connecting the seat share of the largest party ($s_1$) to the electoral system involves two steps. First, $s_1$ is connected to the number of seat-winning parties ($N_0$), yielding

$$s_1 = 1/N_0^{1/2} = N_0^{-1/2},$$

or, in an equivalent but more symmetric form,

$$s_1 N_0^{1/2} = 1.$$

Second, the connection to the seat product is almost automatic, through $N_0 = (MS)^{1/4}$:

$$s_1 = \frac{1}{(MS)^{1/8}},$$

or, in an equivalent but more symmetric form, $s_1 (MS)^{1/8} = 1$.

The derivation of the predictive model is given in the chapter appendix. It starts with the observation that the largest party wins at least the average number of seats and at most close to all seats. The rest follows. Here, the model will be tested in three stages: (1) largest share versus the number of seat-winning parties, (2) largest share versus the seat product for single-seat systems, (3) largest share versus the seat product for multi-seat systems. The first stage is interconnected with the index of balance presented in Chapter 4.
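The two-step connection can be sketched in a few lines of Python. This is an illustrative sketch rather than anything given in the text: the function names and the sample values of $M$ and $S$ are assumptions chosen for round numbers. It checks that going through $N_0$ gives the same largest share as the direct seat-product formula.

```python
import math


def seat_winning_parties(M, S):
    """Expected nationwide number of seat-winning parties, N0 = (MS)^(1/4)."""
    return (M * S) ** 0.25


def largest_seat_share(M, S):
    """Expected largest seat share, s1 = (MS)^(-1/8)."""
    return (M * S) ** -0.125


if __name__ == "__main__":
    # Hypothetical systems, for illustration only: (district magnitude M, assembly size S).
    for M, S in ((1, 1), (1, 256), (4, 100)):
        n0 = seat_winning_parties(M, S)
        s1 = largest_seat_share(M, S)
        # The two-step route s1 = N0^(-1/2) must agree with the direct formula.
        assert math.isclose(s1, n0 ** -0.5)
        print(f"M={M}, S={S}: N0={n0:.2f}, s1={s1:.3f}")
```

The case $M = S = 1$ returns a share of exactly 1, in line with the single-seat anchor point.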
The Largest Share, the Number of Seat-Winning Parties, and the Index of Balance

The model $s_1 = 1/N_0^{1/2}$ has been tested (Taagepera 2005) with those 604 elections in Mackie and Rose (1991, 1997) where the fuzzy ‘Others’ category did not include more than one seat. This meant 24 countries. The overall median of the product $N_0^{1/2} s_1$ is 0.985, only 1.5 percent off the predicted 1.000.

Figure 8.2 shows the degree of agreement at different numbers of seat-winning parties. Both $s_1$ and $N_0$ are graphed on logarithmic scales, so $s_1 = 1/N_0^{1/2}$ becomes a straight line. This median relationship is confirmed when 3–12 parties win seats. Deviations occur when only two parties win seats (93 elections), or more than 12 (29 elections). These deviations are discussed in the chapter appendix. Shown as thick lines in Figure 8.2 are the conceptual limits of $s_1$ at given $N_0$. The lower limit ($s_1 = 1/N_0$) occurs when all seat-winning parties have equal shares. The upper limit (close to $s_1 = 1$) occurs when the largest party has almost everything. The expected average line is the geometric mean of these extremes.

Figure 8.2. The median seat share of the largest party vs. the number of seat-winning parties
Axes (both logarithmic): seat share of the largest party ($s_1$) against the number of seat-winning parties ($N_0$). Lines shown: $s_1 = 1$, $s_1 = 1/N_0^{0.5}$, and $s_1 = 1/N_0$, with the area below $s_1 = 1/N_0$ marked as forbidden.
Source: Reprinted, with modified labels, by permission of Sage Publications Ltd from Rein Taagepera, ‘Conservation of Balance in the Size of Parties’, Party Politics, 11: 283–98 (© Sage Publications, 2005).

One unintended result of testing $s_1 = 1/N_0^{1/2}$ was the development of the index of balance (Taagepera 2005), introduced in Chapter 4 as a means to complement the effective number of parties in somewhat the same way as standard deviation complements the mean of a normal distribution. This index was defined as

$$B = \frac{-\log s_1}{\log N_0} = \frac{\log N_\infty}{\log N_0}.$$
Reconsider Figure 8.2 in this light. The lower limit ($s_1 = 1/N_0$) corresponds to perfect balance ($B = 1$). The upper limit (close to $s_1 = 1$) corresponds to near-complete imbalance ($B \approx 0$). What Figure 8.2 says is that party systems have a median balance of 0.5, except at $N_0 = 2$, where there is more balance, and at very high $N_0$, where there is more imbalance between the largest party and the tiny ones.

In sum, if the number-share conservation $s_1 N_0^{1/2} = 1$ holds, then a balance of 0.5 results. These are two ways to express the same average relationship. Extremely unbalanced and balanced distributions of seats can and do occur, as shown in the chapter appendix. It is just that half-balanced distributions are more frequent.
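The link between the conceptual limits and the index of balance can be illustrated with a short Python sketch. It is not part of the original text; the function name and the sample party system with four seat-winning parties are assumptions made for illustration.

```python
import math


def index_of_balance(largest_share, n_seat_winning):
    """Index of balance B = -log(s1) / log(N0).

    largest_share is the seat share of the largest party (0 < s1 <= 1);
    n_seat_winning is the number of seat-winning parties (N0 >= 2, since
    log(1) = 0 would leave B undefined).
    """
    if n_seat_winning < 2:
        raise ValueError("B is undefined when fewer than two parties win seats")
    if not 0 < largest_share <= 1:
        raise ValueError("the largest share must lie in (0, 1]")
    return -math.log(largest_share) / math.log(n_seat_winning)


if __name__ == "__main__":
    n0 = 4  # hypothetical number of seat-winning parties
    cases = {
        "equal shares, s1 = 1/N0": 1 / n0,
        "expected average, s1 = N0^(-1/2)": n0 ** -0.5,
        "near-total dominance, s1 = 0.97": 0.97,
    }
    for label, s1 in cases.items():
        print(f"{label}: B = {index_of_balance(s1, n0):.2f}")
```

Equal shares give $B = 1$, the expected average line gives $B = 0.5$, and a dominant largest party pushes $B$ toward 0.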
The Largest Share versus the Seat Product for Single-Seat Systems

The clearest cases to test are systems with single-seat districts, because many actual single-seat systems are truly simple FPTP. With $M = 1$, the model $s_1 = 1/(MS)^{1/8}$ is reduced to

$$s_1 = \frac{1}{S^{1/8}} \qquad \text{[FPTP]}$$

or, more symmetrically,

$$s_1 S^{1/8} = 1. \qquad \text{[FPTP]}$$
Based on detailed data in Taagepera and Ensch (2006), Table 8.2 shows the mean assembly sizes, largest seat shares and products $s_1 S^{1/8}$ for FPTP, Two-Rounds (TR) and Alternate Vote (AV) systems. For FPTP, systems with small, medium, and large assemblies are shown separately. Taagepera and Ensch (2006) tabulate the country, time period and the number of elections for each system. As a rule, only systems with at least 5 elections were accepted. However, in order to extend the range to very small assemblies, some systems with only 3 or 4 elections were included for assemblies with 10 to 42 seats. For the 24 FPTP systems, the mean product $s_1 S^{1/8}$, expected to be 1.00, is off by only 1.1 percent. For groups by assembly sizes, the geometric means are off by ±4 percent at most, and these errors look random, meaning that no systematic dependence on $S$ is left unaccounted. Hence the model is confirmed within ±1 percent.

Table 8.2. Assembly size ($S$) and the largest party’s seat share ($s_1$), for single-seat district systems

| Systems | Number of elections | Geometric mean of $S$ | Geometric mean of $s_1$ | Geometric mean of $s_1 S^{1/8}$ |
|---|---|---|---|---|
| 8 FPTP systems with $S$ = 10 to 24 | 37 | 17.04 | .676 | 0.963 |
| 8 FPTP systems with $S$ = 26 to 68 | 71 | 43.25 | .653 | 1.045 |
| 8 FPTP systems with $S$ = 75 to 643 | 209 | 178.11 | .537 | 1.027 |
| All 24 FPTP systems | 317 | | | 1.011 |
| 5 TR systems with $S$ = 100 to 508 | 42 | 261.95 | .437 | 0.876 |
| 1 AV system | 31 | 106.3 | .497 | 0.890 |
| All 30 single-seat systems | 390 | | | 0.985 |
| Presidential elections | | 1 | 1.000 | 1.000 |

Data source: Taagepera and Ensch (2006).

The lowest mean $s_1 S^{1/8}$ for an individual system is 0.68, for the 3 elections in St Kitts 1980–9 ($S = 10.3$). The highest is 1.29, for the 7 elections in Botswana 1965–94 ($S = 33.2$). The 84 elections in the USA 1828–1994 are also markedly off ($s_1 S^{1/8} = 1.22$).

The mean for 5 Two-Rounds systems, where the model is not expected to apply, deviates much more, falling below 1.00 by 13 percent. The individual systems vary widely, with $s_1 S^{1/8}$ ranging from 0.58 for the 13 elections in Germany 1871–1912 ($S = 396$) to 1.70 for the 6 elections in Italy 1895–1913 ($S = 508$). On the average, Two-Rounds seems to reduce the seat share of the largest party, when controlling for assembly size.

In the single case of Alternate Vote (Australia, 31 elections in 1919–93), the low value of $s_1 S^{1/8}$ (0.89) depends on counting the Liberal and Country/National seats separately, as Mackie and Rose (1991, 1997) do. If the Coalition of Liberal and Country/National parties were counted as a single party, as Nohlen, Grotz, and Hartmann (2001) do, the mean largest share would increase to 59.2 percent and $s_1 S^{1/8} = 1.06$.

For all 30 single-seat systems, the mean product $s_1 S^{1/8}$, expected to equal 1.00, is off by only 1.5 percent. Thus the model is confirmed within ±1.5 percent even when including the Two-Rounds systems, which can be expected to deviate from the model for simple electoral systems. At the bottom, Table 8.2 also includes the presidential systems, as a reminder that the model also applies to them, as an extreme limiting case.

Figure 8.3 (from Taagepera and Ensch 2006) shows the largest seat shares of the 30 single-seat systems graphed against assembly sizes, both on logarithmic scales, so that $s_1 = 1/S^{1/8}$ becomes a straight line. The ordinary least squares (OLS) best fit of the logarithms corresponds to

$$s_1 = 0.915\,S^{-0.108}. \qquad \text{[best fit, } R^2 = 0.256\text{]}$$

It is almost superimposed on the predicted line, for which the $R^2$ is practically as high:

$$s_1 = 1.00\,S^{-0.125}. \qquad \text{[predictive model, } R^2 = 0.245\text{]}$$

Figure 8.3. The median seat share of the largest party vs. assembly size, for 30 single-seat systems: predictive model and regression line
Axes (both logarithmic): mean $s_1$ against mean $S$. Lines shown: ‘Model’, $s_1 = S^{-0.125}$ ($R^2 = 0.245$), and ‘Best-Fit’, $s_1 = 0.915\,S^{-0.108}$ ($R^2 = 0.256$). Labeled systems, with number of elections: Italy 1895–1913, 6; St Kitts 1980–9, 3; Netherlands 1888–1913, 8; Germany 1871–1912, 13.
Source: Reprinted from Electoral Studies, 25, R. Taagepera and J. Ensch, ‘Institutional Determinants of the Largest Seat Share’, 760–75, © 2006 Elsevier Ltd., with permission from Elsevier.
The low $R^2$ comes largely from the scatter of the Two-Rounds systems. Recall that $R^2$ matters for postdictive analysis, because it is the only measure of quality of fit one has there. Here, however, the main question is whether the predicted average curve agrees with the actual best fit within the random scatter of data. It visibly does when assemblies are on the small side. The largest seat shares are larger than predicted for most assemblies of more than 200 seats. One must ask whether this deviation from the model is random or systematic.

The case of India, not included in the Taagepera and Ensch (2006) dataset, may be instructive in this respect. Here I use data from Nohlen, Grotz, and Hartmann (2001: I: 577–9). From 1951 to 1984, India had a very large dominant party (geometric mean $s_1 = 0.671$). With mean $S = 517.2$ for the 8 elections, it yields an extremely high $s_1 S^{1/8} = 1.465$. Yet for the next 5 elections (1989–99) the mean $s_1$ drops sharply to 0.348. With $S = 543$, it yields a very low $s_1 S^{1/8} = 0.764$. The overall mean is still on the high side (1.140), but the shift tells us that it may be too early to conclude that the model needs adjustment at high assembly sizes.

It was observed in Chapter 6 that it is hard to devise a hypothetical example to illustrate how assembly size affects the number and size distribution of parties. Figure 8.3 supplies evidence: reduced assembly size does enhance largest party predominance, at least in single-seat systems. But such a directional model is the easy part, and it cannot offer quantitative prediction. The real triumph here is that one specific decreasing curve was offered as the best possible guess, on purely theoretical grounds, and the actual best-fit curve is almost superimposed on it.
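The products $s_1 S^{1/8}$ quoted for Table 8.2 and for India can be recomputed from the geometric means given in the text. The sketch below is an illustration only, not the authors' procedure; the function name and the data layout are assumptions. Because the geometric mean is multiplicative, the product of the geometric means should agree with the reported mean product up to rounding.

```python
def share_size_product(s1, S):
    """Return s1 * S^(1/8), expected to be about 1.00 for simple single-seat systems."""
    return s1 * S ** 0.125


# (label, geometric mean s1, geometric mean S, product as reported in the text)
SYSTEMS = [
    ("FPTP, S = 10 to 24", 0.676, 17.04, 0.963),
    ("FPTP, S = 26 to 68", 0.653, 43.25, 1.045),
    ("FPTP, S = 75 to 643", 0.537, 178.11, 1.027),
    ("Two-Rounds", 0.437, 261.95, 0.876),
    ("Alternate Vote", 0.497, 106.3, 0.890),
    ("India 1951-84", 0.671, 517.2, 1.465),
    ("India 1989-99", 0.348, 543, 0.764),
]

if __name__ == "__main__":
    for label, s1, S, reported in SYSTEMS:
        computed = share_size_product(s1, S)
        print(f"{label}: computed {computed:.3f}, reported {reported:.3f}")
```

Small differences in the third decimal reflect rounding of the published means.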
The Largest Seat Share versus the Seat Product for Multi-Seat Systems

Now we proceed to test the full model, $s_1 = 1/(MS)^{1/8}$ or $s_1 (MS)^{1/8} = 1$, using multi-seat PR systems. Taagepera and Ensch (2006) could locate only 10 systems where all seats were allocated within districts, for at least 5 elections. Even most of these systems deviate from the ideal of simple systems. In 6 systems List PR with divisors was used, mostly d’Hondt, but also modified Sainte-Laguë. Complexities included varying district magnitudes, legal thresholds, and district level apparentement. In 4 other systems, candidate-centered PR was used: STV, SNTV, or List PR approaching nonlist through extensive panachage and cumulation of multiple votes per voter (Switzerland).

In order to extend the range of the seat product $MS$, elections in a nationwide single district are of high interest, but all such systems involve legal thresholds. Relaxing the criteria even further, 6 nationwide single district systems were included, with legal thresholds ranging from 0.67 to 5 percent. It is shown in Chapter 15 that the restraining effect of a nationwide threshold $T$ corresponds approximately to that of a district magnitude of $M = (75\%/T) - 1$. Instead of $M = S$, this adjusted magnitude was used for nationwide PR with thresholds.

The geometric means for these three groupings are shown in Table 8.3. See Taagepera and Ensch (2006) for data on individual systems. In the relatively simple list PR systems, mean $s_1 (MS)^{1/8}$ exceeds the predicted 1.00 by 9 percent, individual systems ranging from 0.94 (Luxembourg 1919–94) to 1.30 (Spain 1977–96). The mean for the more complex systems is within 3 percent of the expectation, but the range is wide, from 0.67 (Switzerland 1919–95) to 1.28 (Japan 1928–93). In nationwide PR systems, the mean $s_1 (MS)^{1/8}$ formally exceeds 1.00 by 12 percent. This figure depends very much on the adequacy of the threshold correction $M = (75\%/T) - 1$. For all 16 PR systems in Table 8.3, the mean $s_1 (MS)^{1/8}$ exceeds the expected 1.00 by 7 percent.

Table 8.3. District magnitude ($M$), assembly size ($S$), and the largest party’s seat share ($s_1$), for multi-seat PR systems

| Systems | Number of elections | $M$ | $S$ | $MS$ | $s_1$ | $s_1 (MS)^{1/8}$ |
|---|---|---|---|---|---|---|
| List PR in districts (6 cases) | 106 | 9.39 | 171.8 | 1613 | .432 | 1.086 |
| Candidate-centered PR in districts (4) | 88 | 4.90 | 152.3 | 746 | .428 | 0.971 |
| Nationwide PR + threshold (6) | 68 | (42.0)ᵃ | 196.7 | 8270 | .362 | 1.119 |
| All 16 multi-seat systems | 262 | | | | | 1.068 |
| All 46 systems ($M > 1$ and $M = 1$) | 652 | | | | | 1.013 |

The columns from $M$ to $s_1 (MS)^{1/8}$ are geometric means.
Data source: Taagepera and Ensch (2006).
ᵃ In nationwide single district with legal threshold $T$, $M = S$ is adjusted to $M = (75\%/T) - 1$.

Finally, joining the single-seat and multi-seat systems and taking the geometric mean for all 46 systems in Tables 8.2 and 8.3 yields $s_1 (MS)^{1/8} = 1.013$, only 1.5 percent above the predicted 1.00. If the seat product for the d’Hondt systems were taken as $M^{0.62} S$ rather than $MS$ (as suggested in Chapter 6), then $s_1 (M^{0.62} S)^{1/8} = 0.978$, on the average. Overall agreement with the expected 1.000 would become even better. However, the method for estimating the exponent $F$ in $M^F S$ should be refined before it can be used with any confidence.

Figure 8.4 shows the largest seat shares of all 46 systems (single-seat and multi-seat) graphed against the seat product, both on logarithmic scales, so that $s_1 = 1/(MS)^{1/8}$ becomes a straight line. The corresponding data are tabulated in the Appendix to the book. The OLS best fit of the logarithms of $M$ and $S$ separately corresponds to

$$s_1 = 0.847\,M^{-0.113} S^{-0.090}. \qquad \text{[best fit, } R^2 = 0.54\text{]}$$

This is to be compared to the predicted relationship, for which the $R^2$ is practically as high:

$$s_1 = 1.00\,M^{-0.125} S^{-0.125}. \qquad \text{[theoretical model, } R^2 = 0.51\text{]}$$
Figure 8.4. The median seat share of the largest party vs. seat product $MS$ for 46 single- and multi-seat systems: predictive model and regression line
Axes (both logarithmic): mean $s_1$ against mean $MS$. Lines shown: ‘Model’, $s_1 = (MS)^{-0.125}$ ($R^2 = 0.509$), and ‘Best-Fit’, $s_1 = 0.847\,M^{-0.113} S^{-0.090}$ ($R^2 = 0.538$). Labeled systems, with number of elections: Italy 1895–1913, 6; St Kitts 1980–89, 3; Netherlands 1888–1913, 8; Germany 1871–1912, 13; Switzerland 1919–95, 21.
Note: Squares: $M = 1$; Triangles: $M > 1$.
Source: Reprinted from Electoral Studies, 25, R. Taagepera and J. Ensch, ‘Institutional Determinants of the Largest Seat Share’, 760–75, © 2006 Elsevier Ltd., with permission from Elsevier.
These two lines are almost superimposed. (How does one graph on a scale of $MS$ when $M$ and $S$ have different exponents? See Taagepera and Ensch 2006. Maximum error is under 3 percent.) Here $R^2$ is higher than in Figure 8.3, simply because the range of $MS$ is wider. But again, the question is not how high $R^2$ is, but whether the predicted curve agrees with the actual best fit within the random scatter of data. It visibly does.

Table 8.4 lists the various electoral systems in the order of increasing mean deviation from the predicted largest share. The main features adding complexity to the systems are also shown. Two-Rounds falls short of the model and has by far the widest range of variation. Connecting first-round votes to seats, largely won in the second round, may add unpredictability. As mentioned earlier, the figure for Alternate Vote depends heavily on how to count parties in Australia; both alternatives are shown in Table 8.4. The effect of different candidate-centered PR systems may differ. The mean of this mixed category happens to agree with the model.

Table 8.4. Complexity of electoral systems and deviation of the largest seat share from the model $s_1 = 1/(MS)^{1/8}$

| Allocation formula | Factors of complexity | Mean deviation | Range of deviation |
|---|---|---|---|
| Two-Rounds | Loose connection between first-round votes and second-round seats. | −12% | −42 to +70% |
| Alternate Vote | Individual cross-party preferences. | −11/+6% | — |
| Candidate-centered PR in districts | Individual cross-party preferences. | −3% | −33 to +28% |
| FPTP | Simple. Some primaries and gerrymander. | −1% | −32 to +29% |
| Multi-district list PR | Uneven $M$, apparentement, legal thresholds. | +9% | −6 to +30% |
| Nationwide PR | Legal thresholds, their effect on $M$ conventionally adjusted. | +12% | −7 to +33% |

The FPTP approaches the ideal simple system more than any other category, and the deviation from the model is minimal on the average. Individual systems, however, can deviate appreciably. Multi-district List PR is another category simple in principle, but in practice many additional features enter, and they apparently tend to reinforce the largest party. This is surprising at first look, because most factors of complexity would seem to favor the smaller parties. However, if the exponent $F$ in $M^F S$ were taken as less than 1.0 for the relatively many cases using d’Hondt, then the excess for the largest share would decrease. Systems with nationwide PR also look unexpectedly favorable to the largest party, but here the outcome depends on the accuracy of the conventional way to adjust magnitude in view of legal thresholds.

The model treats district magnitude and assembly size in a symmetrical way, but the extent of their ranges differs. The limited range of assembly sizes shows up when the logarithm of the largest seat share is regressed against the logarithm of $S$ alone. Here $R^2 = 0.28$. When regressed against the logarithm of $M$ alone, $R^2$ increases to 0.43. Regression against both $M$ and $S$ leads to $R^2 = 0.54$ (Taagepera and Ensch 2006). Thus assembly size still matters in practice, but district magnitude matters more.
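The conventional threshold adjustment used for nationwide PR, $M = (75\%/T) - 1$, is easy to misread, so a small Python sketch may help. It is an illustration under stated assumptions: the function names are hypothetical, and the assembly size of 150 seats is an invented example rather than a case from the data.

```python
def effective_magnitude(threshold_percent):
    """Effective district magnitude for a nationwide legal threshold T (in percent).

    Implements the adjustment quoted in the text, M = (75% / T) - 1.
    """
    if threshold_percent <= 0:
        raise ValueError("the legal threshold must be positive")
    return 75.0 / threshold_percent - 1.0


def adjusted_seat_product(threshold_percent, assembly_size):
    """Seat product MS with M replaced by the threshold-adjusted magnitude."""
    return effective_magnitude(threshold_percent) * assembly_size


if __name__ == "__main__":
    assembly_size = 150  # hypothetical assembly size, for illustration only
    for threshold in (0.67, 2.0, 5.0):  # 0.67 and 5 percent are the extremes cited in the text
        m_eff = effective_magnitude(threshold)
        product = adjusted_seat_product(threshold, assembly_size)
        predicted_share = product ** -0.125  # s1 = (MS)^(-1/8)
        print(f"T = {threshold}%: M = {m_eff:.1f}, MS = {product:.0f}, s1 = {predicted_share:.3f}")
```

A higher threshold lowers the effective magnitude and so raises the predicted largest share, which is why the result for nationwide PR in Table 8.4 hinges on this adjustment.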
Implications for Institutional Engineering

Now we are approaching payoff for institutional engineering. Connecting the largest seat share to institutional characteristics would enable us to go beyond trial-and-error. The number of all seat-winning parties is of little interest to the practical politician, because it heavily depends on the tiniest parties that barely win a seat or two and have almost no impact on politics. But the largest seat share matters for government formation and survival. When it is less than 50 percent, it influences the number and weight of potential coalition partners the largest party can choose. If the largest party remains in opposition, its seat share influences its blocking ability. When the largest share surpasses 50 percent, the extent of the excess still makes a difference by enhancing the ruling party’s clout, yet also encouraging factions within it.

Frequent changes in electoral systems are not desirable. I still believe the following:

A major purpose of elections is to supply a stable institutional framework for the expression of various viewpoints. Even if imperfect, a long-established existing electoral system may satisfy this purpose better than could a new and unfamiliar system, even if it were inherently more advantageous. . . . Familiarity breeds stability. . . . Major electoral reforms should not be undertaken lightly. (Taagepera and Shugart 1989: 218)
However, if the urge to change becomes strong, the model presented here may help to fine-tune change so as not to overdo it. For institutional engineering, the quantity to watch is the seat product: the product of the number of seats in the assembly and the number of seats allocated in the average district. The lower the seat product, the larger the seat share of the largest party. To increase the average share of the largest party, one can lower either district magnitude or assembly size, or lower both to a more moderate degree. The simple model presented here goes beyond mere directionality of adjustment. It tells us by how much we should change the seat product, for a desired effect.

The model indicates the world average of the largest share, for a given combination of district magnitude and assembly size. It would be simple-minded, however, to base institutional engineering in a given country solely on a universal model, without taking into account the country’s peculiarities. If a country has exceeded the world average in the past, it is likely to continue to do so when the electoral system is modified. A corresponding adjustment term should be introduced into the worldwide model. We do not have to know which factors cause the need for adjustment. They could be due to political culture, institutional features other than $M$ and $S$, or something else. Chances are that they will continue to exert a similar influence on the altered electoral system. In other words, the effective seat product could differ somewhat from the product of $M$ and $S$, depending on other factors.
Consider Finland, for instance. Its largest seat share during the last 50 years has been around 27 percent, which is rather low. Indeed, it is only 3/4 of the worldwide expectation. Here, the relationship is not $s_1 = 1/(MS)^{1/8}$. It can be expected to be around $s_1 = 0.75/(MS)^{1/8}$, subject to some assumptions. Suppose Finland wishes to raise the largest share to an average around 32 percent, changing nothing but district magnitude. One should not resort to the full worldwide model. There is no need even to calculate the adjusted $s_1 = 0.75/(MS)^{1/8}$ for the existing and the desired conditions. It suffices to observe that the desired increase would multiply the present largest share by $32/27 = 1.185$. To obtain this outcome, the model suggests that $MS$ should be divided by $(1.185)^8 = 3.89$. If the assembly size is not changed, this would mean going from the present 14 districts at mean $M = 14.3$ to 54 districts at mean $M = 3.7$. The present assembly of 200 seats is relatively large for Finland’s population of 5 million (see Chapter 12). If it can be reduced to 150 (0.75 of the present), then $M$ needs to be reduced only by a factor of $3.89 \times 0.75 = 2.92$, meaning 31 districts at mean $M = 4.9$. Either change would obviously whittle down or eliminate some smaller parties. Whether this would be acceptable is up to the decision-makers. The political scientist can only offer alternative projections.

Apart from assembly size and the mean district magnitude, other aspects of the electoral system must also be considered. The largest district in Finland at times has had 27 seats, double the mean. This district is where many small parties win their only seat. Simply making the district magnitudes more uniform could reduce the number of seat-winning parties by a factor of $2^{1/4} = 1.19$, that is, by 19 percent. It would boost the largest share by $2^{1/8} = 1.09$, meaning a 9 percent increase, from 27 to 29.5. Some small parties in Finland win seats thanks to district-level alliances.
If such alliances are prohibited, further small parties would be squeezed out, and the largest seat share would increase by an amount hard to estimate. In sum, when the electoral system includes stipulations that go beyond the simplest possible, changing those extra features can go a long way to alter the seat share of the largest party.

Political inertia that opposes any change in the electoral system is considerable. After all, those in a position to decide profited from the existing system: they got elected. It is usually easier to make the rules more favorable to small parties. The large parties might not object, as this buys them insurance in case they themselves lose popularity (Colomer 2004b). Going in the reverse direction (as presented above) could meet vociferous opposition by smaller parties. However, if the proliferation of parties should become manifestly excessive, then public opinion may demand a sharp cutback, for instance by going from PR to FPTP. A gross over-correction may result. This is where the present model could become useful.

For a reasonably simple electoral system in a country with some past democratic record, we have advanced much beyond the qualitative advice ‘To have fewer and larger parties, reduce district magnitude’. We can now tell roughly by how much it should be reduced. Yes, it is ‘roughly’, but this is better than no estimate at all of the degree of reduction needed. Estimation is more difficult when the country has not had any democratic elections, because then we do not know the direction of local correctives to the universal average pattern. The best we could do is to compare with neighboring democratic countries with somewhat similar cultures.

All bets are off when politicians choose to go beyond the simplest format and insert all sorts of complex and mutually contradictory features. Then we cannot calculate with any precision. In particular, if the choice is Two-Rounds, anything can happen. If you want predictability, keep it simple.
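The Finland calculation earlier in this section can be retraced step by step. The following Python sketch is only an illustration of that arithmetic; the function names are assumptions, and the inputs (27 and 32 percent, mean $M = 14.3$, $S = 200$ or 150) are the figures quoted in the text. Note that the country-specific factor of 0.75 cancels out, because only the ratio of shares matters.

```python
def seat_product_divisor(current_share, target_share):
    """Factor by which MS must be divided to move the largest share from current to target.

    Since s1 is proportional to (MS)^(-1/8), multiplying s1 by a ratio r
    requires dividing MS by r^8.
    """
    return (target_share / current_share) ** 8


def redistricting_plan(mean_magnitude, assembly_size, new_assembly_size, divisor):
    """Return (new mean district magnitude, number of districts) for the reduced seat product."""
    new_seat_product = mean_magnitude * assembly_size / divisor
    new_magnitude = new_seat_product / new_assembly_size
    n_districts = new_assembly_size / new_magnitude
    return new_magnitude, n_districts


if __name__ == "__main__":
    divisor = seat_product_divisor(27, 32)
    print(f"MS should be divided by about {divisor:.2f}")
    for new_size in (200, 150):
        magnitude, districts = redistricting_plan(14.3, 200, new_size, divisor)
        print(f"S = {new_size}: mean M of about {magnitude:.1f}, about {districts:.0f} districts")
```

The sketch returns roughly the same district counts as the text; any remaining difference comes from rounding the intermediate figures.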
Appendix to Chapter 8

This appendix presents the derivations for the nationwide number of seat-winning parties and the largest seat share. It offers further insights into the minimal measure of the number of parties ($N_\infty$) and the index of balance. It wonders about the apparent symmetry of the roles of $M$ and $S$ in the seat product. Finally, it points out that nothing obliges the real world to follow the probabilistic averages as expressed in the models presented. Then why do they fit, nonetheless?
The number of seat-winning parties nationwide: The model

Consider the number of parties that are likely to win at least one seat in an assembly of $S$ members. Assume a simple electoral system where all representatives are elected in districts of uniform magnitude $M$, using some usual PR formula (or FPTP, when $M = 1$). The nationwide number of seat-winning parties ($N_0$) is at least equal to the number of such parties ($p$) in a single district, which itself can conceivably range from 1 to $M$, with an expected mean of $M^{1/2}$. (At least as a first approximation, the impact of nationwide politics in districts is ignored, since it would cancel out nationwide.) Even if approximately $M^{1/2}$ parties win seats in each district, these may not be the same parties in all the districts. Hence the nationwide number can be expected to be larger than in any district: $N_0 > p = M^{1/2}$.

The most favorable condition for small parties to win at least one seat would be if the entire country were made a single nationwide district of magnitude $S$. Then the number of seat-winning parties could range from 1 to $S$, with an expected mean of $S^{1/2}$. Any subdividing of the country into several districts is bound to reduce the chances of the smallest parties. Hence $N_0 < S^{1/2}$. In sum, $M^{1/2} < N_0 < S^{1/2}$. If nothing else is known besides $M$ and $S$, then the best guess for $N_0$ is the one that balances the district level and nationwide constraints. The geometric mean of the extremes is

$$N_0 = (MS)^{1/4}.$$
This is the point where the seat product MS, announced in Chapter 6, emerges from purely probabilistic considerations. In this sense, it is a pivotal point. Remarkably, M and S play a symmetrical role in predicting the number of seat-winning parties, as long as we avoid multi-seat plurality and complex electoral systems. This model agrees with the following two anchor points. When M = S (nationwide single district), the model yields $N_0 = S^{1/2}$, as it should. When S = 1, M is also bound to be 1, given that M ≤ S, so that $N_0 = 1$ results, as it well should.

When only one seat is at stake, it means presidential rather than assembly elections. The two elections differ in a number of ways, but presidential elections also offer similarities with elections in an M = 1 district to fill an assembly seat. The same seat allocation formulas offer themselves, even while the importance of presidential elections may affect the choice of a formula. Hence we should be worried if a predictive model for the number of seat-winning parties in the assembly did not predict correctly the outcome of presidential elections. The present model does correctly predict $N_0 = 1$ for S = 1.

In the case of FPTP elections for an assembly (M = 1, S > 1), the model predicts $N_0 = S^{1/4}$—the number of seat-winning parties is simply the fourth root of assembly size. This is relatively easy to test, given the large number of FPTP election results available. For countries with several multi-seat districts, testing becomes more difficult because hardly any truly simple PR system exists. If nothing else, district magnitude varies from district to district. Malta seems to be the only country where a uniform district magnitude of M = 5 has been maintained over a long time. Malta held the assembly size constant at S = 40 for the 5 elections held in 1947–55. The model predicts that $(5 \times 40)^{1/4} = 3.76$ parties would win seats. The actual figures ranged widely, from 2 to 6, but their geometric mean was 3.73, close to the prediction. Actually, so close an agreement is plain luck. With only five elections to average, one would be happy if the prediction were within plus or minus one party—especially given that Malta used STV rather than categorical List PR.
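The geometric-mean argument reduces to a one-line calculation. The following is a minimal Python sketch, not part of the original text; the function name `seat_winning_parties` is an illustrative choice. It evaluates $N_0 = (MS)^{1/4}$ and checks the two anchor points and the Malta case discussed above.

```python
def seat_winning_parties(magnitude: float, assembly_size: float) -> float:
    """Expected number of seat-winning parties, N0 = (M*S)**(1/4).

    Hypothetical helper: assumes a simple electoral system with uniform
    district magnitude M and assembly size S, where 1 <= M <= S.
    """
    if not 1 <= magnitude <= assembly_size:
        raise ValueError("require 1 <= M <= S")
    return (magnitude * assembly_size) ** 0.25


# Anchor point: a single seat (S = M = 1) leaves exactly one winner.
single_seat = seat_winning_parties(1, 1)

# Anchor point: one nationwide district (M = S) gives S**(1/2).
nationwide = seat_winning_parties(100, 100)

# Malta 1947-55: M = 5, S = 40.
malta = seat_winning_parties(5, 40)

print(single_seat, nationwide, round(malta, 2))
```

The Malta value rounds to the 3.76 quoted in the text. The guard on M and S only enforces the conceptual limit M ≤ S; it says nothing about whether a real system is simple enough for the model to apply.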
How the largest share relates to the number of seat-winning parties: The model

There must be an average relationship between the number of seat-winning parties and the seat share of the largest party, because many seat-winning parties would restrict the number of seats that could go to the largest. Vice versa, a small largest party leaves many more opportunities to the small parties. Let us specify the conceptual limits. When $N_0$ parties win at least one seat each, the average fractional share for a seat-winning party is $1/N_0$. The largest share ($s_1$) obviously must at least equal this average. It also must fall slightly short of the total (1), so as to leave $N_0 - 1$ seats to the other parties. So the largest share cannot be larger than $(S - N_0 + 1)/S$. We can neglect $N_0$ and simply say that $s_1 < 1$, if assembly size S is much larger than $N_0$, as it usually is. The conceptual limits on the largest share then are $1/N_0 \le s_1 < 1$. In the absence of any other information, our best guess for $s_1$ is the geometric mean of the conceptual limits (Taagepera and Shugart 1993):

$$s_1 = 1/N_0^{1/2} = N_0^{-1/2}.$$

Conversely, $N_0 = 1/s_1^2 = s_1^{-2}$. A symmetrical form that does not take a stand on which variable influences the other is

$$s_1^2 N_0 = 1 \quad \text{or} \quad s_1 N_0^{1/2} = 1.$$
This is a kind of a ‘law of conservation’, in that the value of this product remains unchanged. Quantities that are conserved during a transformation are of considerable interest in physics, and might be so in social sciences. The ‘number-share conservation’ developed here applies much more broadly than just to seats in an assembly. For instance, it enables us to predict the shares of the largest federal subunits, once the number of subunits is given. For the populations as well as areas in the USA, Canada, and Australia, it works within ±20 percent (Taagepera 1999b). In the absence of any other knowledge, whenever a well-defined total is divided among N0 components, the fractional share of the largest component multiplied by the square root of the number of components is expected to be 1.
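As a hedged illustration of the number-share conservation, the sketch below takes the geometric mean of the two conceptual limits ($1/N_0$ and 1) and confirms that the product of the largest share and the square root of the number of components stays at 1. The function name `largest_share` and the sample values of $N_0$ are assumptions chosen for illustration; no empirical data are used.

```python
import math


def largest_share(n_components: int) -> float:
    """Best-guess largest fractional share: geometric mean of 1/N0 and 1."""
    if n_components < 1:
        raise ValueError("need at least one component")
    lower, upper = 1.0 / n_components, 1.0
    return math.sqrt(lower * upper)  # equals N0 ** -0.5


for n0 in (2, 4, 9, 16):
    s1 = largest_share(n0)
    # s1 * sqrt(N0) is the conserved product; it should print as 1.0.
    print(n0, round(s1, 3), round(s1 * math.sqrt(n0), 3))
```

With four components the guess is a largest share of one-half, and with sixteen it is one-quarter. These are expectations for an average over many cases, not predictions for any single case.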
Some of the apparent consequences of number-share conservation may seem hard to accept. Suppose an external factor increases $N_0$—e.g. a party splits up. Does the current largest share now feel pressure, so to say, to change so as to conserve the product around 1? Conversely, if some external factor increases the largest share, does the number of components feel pressure to change downwards? Such a dilemma actually arises regarding the Central Australian and Canadian Northern territories, which are distinct but not full-fledged federal subunits. Should they be counted among the $N_0$ components, when estimating the populations of New South Wales and Ontario, respectively? And how could the latter populations depend on such number games? Similar questions may arise for parties that run as semi-coalitions. It so happens that counting the aforementioned territories as separate components leads to an underestimate of the populations of the largest subunit, while discounting them leads to an overestimate—but both are in the ±20 percent range. The apparent problem is akin to the one that arises with normal distributions: If some property is normally distributed in several subspecies, how can it also be normally distributed for the entire species? But most often it is.

Party-based elections offer a possibility to test the number-share conservation with literally hundreds of cases: The largest seat share in every election where the number of seat-winning parties can be specified. It has nothing to do with the particular electoral system, because the number-share conservation is as universal a relationship as normal distribution. Such testing of a proposed law of conservation is of interest by itself. What makes it even more interesting for the study of electoral systems is that the number of seat-winning parties can be estimated from the product MS.
If so, then the largest seat share, too, can be estimated from purely institutional data (M and S), to the extent that $s_1 N_0^{1/2} = 1$ applies. The variation around the median thus estimated is of course considerable, especially in the middle ranges of the variables. Consider the data used in Figure 8.2. With 4 parties winning seats, the median prediction is $s_1 = 0.500$. The actual median of the 79 cases is 0.495, even though the distribution is wide:

Range of s1        0.30–0.39  0.40–0.49  0.50–0.59  0.60–0.69  0.70–0.79  0.80–0.89
Number of cases        12         32         26          4          3          2
Conversely, for $s_1$ ranging from 0.45 to 0.54, the median prediction is $N_0 = 4$. The actual median of the 162 cases is 4, with the following distribution:

Range of N0         2     3     4     5     6     7    8–9   10–12  13–16
Number of cases    23    42    30    28    15     9     6      5      4

Visibly, $s_1 N_0^{1/2} = 1$ cannot be expected to work in single cases any better than the weight of one particular British woman can be inferred from the average weight of British females. But the average prediction is quite precise, without any input of data.
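The medians quoted for the two distributions can be checked against the tabulated counts. The sketch below is an illustration only; `median_bin` is a hypothetical helper that locates the bin containing the middle case of a grouped distribution. It cannot recover the exact median inside a bin.

```python
def median_bin(labels, counts):
    """Return the label of the bin that contains the median case.

    For an even total the two middle cases are assumed to fall in the
    same bin, which holds for the distributions used here.
    """
    total = sum(counts)
    midpoint = (total + 1) / 2  # 1-based position of the median case
    running = 0
    for label, count in zip(labels, counts):
        running += count
        if running >= midpoint:
            return label
    raise ValueError("empty distribution")


# Largest share s1 when 4 parties win seats (counts from the text).
s1_bins = ["0.30-0.39", "0.40-0.49", "0.50-0.59",
           "0.60-0.69", "0.70-0.79", "0.80-0.89"]
s1_counts = [12, 32, 26, 4, 3, 2]

# Number of seat-winning parties when s1 is between 0.45 and 0.54.
n0_bins = ["2", "3", "4", "5", "6", "7", "8-9", "10-12", "13-16"]
n0_counts = [23, 42, 30, 28, 15, 9, 6, 5, 4]

print(sum(s1_counts), median_bin(s1_bins, s1_counts))
print(sum(n0_counts), median_bin(n0_bins, n0_counts))
```

The counts add up to the 79 and 162 cases cited. The median case for $s_1$ falls in the 0.40–0.49 bin, consistent with the reported median of 0.495 lying at the upper edge of that bin, and the median for $N_0$ falls on 4.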
Methodological issues remain. The distributions above are visibly not normal—as one would expect for variables that cannot take negative values (cf. Taagepera 2008). Thus arithmetic means would be misleading. But these distributions are not clearly lognormal either, so that the use of geometric mean is also questionable. This is why the median was used, as it makes no assumptions about the form of the distribution. Another remaining problem is the direction of testing. Figure 8.2 shows the median $s_1$ at given $N_0$, as if the number of seat-winning parties were driving the largest share, rather than vice versa. What would result from a reverse graphing of median $N_0$ at given $s_1$? We run into difficulties because empirical $N_0$ comes in integer values only.

The discrepancy in Figure 8.2 at $N_0 = 2$ arises from a hidden assumption that fails at very low $N_0$. Consider our starting point: $1/N_0 \le s_1 < 1$. Taking the geometric mean of these conceptual limits implicitly presumes that the distribution of $\log s_1$ is symmetrical, so that extreme cases on both sides occur with equal frequency. If so, then the outcome would be 71-29, since $2^{-1/2} = 0.71$. However, when only two parties achieve representation, political competition may push toward a balance between them. Hence constellations close to the lower limit (50-50) are politically quite plausible, while constellations approaching the upper limit (about 99-1) are unlikely in democracies. Assuming a distribution that reflects these considerations produces the observed median value of $s_1$ when only two parties win seats (see Taagepera 2005).

The deviation from $s_1 N_0^{1/2} = 1$ at high $N_0$ in Figure 8.2 is harder to explain. Beyond 12 seat-winning parties, further tiny parties or independents winning a seat or two no longer seem to affect the largest share. This brings us to the thorny issue of how to count the independents. In one sense, an independent representative is equivalent to a minor party with a single seat. Hence the independent should be counted as one more seat-winning party. But there is a difference between a party with nationwide organization and ambitions that happens to win only one seat nationwide, and an independent candidate who concentrates solely on one particular district. Parties tend to have claims of representing some ideology or interest. Independents have fewer such claims and, once elected, often tend to coalesce with a larger party to an extent small parties cannot afford without losing their raison d’être. In electoral studies the issue becomes salient when a large proportion of seats are occupied by independents, as has happened in Ireland and Japan, in particular. (For this reason, all elections in Ireland had to be eliminated from the testing of $s_1 N_0^{1/2} = 1$.) Assuming that these countries had 20–50 distinct parties makes little sense, but pretending that the seats occupied by independents do not exist leads to equally odd results. In sum, the question of how to count the independents as parts of a party system matters when there are many of them. Here more work is needed.
The minimal measure of the number of parties and index of balance

It was shown in Chapter 4 that the inverse of the largest share ($N_\infty = 1/s_1$) is the minimal measure of the number of parties for a given seats constellation. The relationship $s_1^2 N_0 = 1$ can now be recast as $N_0/N_\infty^2 = 1$. This average connection means that, on the average, $N_\infty$ is the square root of $N_0$, and, conversely, $N_0$ is the square of $N_\infty$:

$$N_\infty = N_0^{1/2} \quad \text{and} \quad N_0 = N_\infty^2.$$

These relationships (hinted at in appendix to Chapter 4) represent a probabilistic average. It can be seen from Figure 8.2 that this average holds when $N_0$ ranges from 3 to 12, meaning $N_\infty$ ranging from 1.7 to 3.5 or the largest share ranging from 29 to 58 percent. This covers most of the usual range. One has to be careful, however, in the case of highly multiparty systems where even the largest share falls much below 30 percent. Given that $N_0 = (MS)^{1/4}$, the minimal measure of the number of parties can also be tied to institutions:

$$N_\infty = N_0^{1/2} = (MS)^{1/8}.$$
The formula for balance itself can be recast in terms of the two extreme measures of the number of parties. Balance is the ratio of logarithms of the minimal and maximal measures of the number of parties:

$$B = \frac{\log N_\infty}{\log N_0}.$$

We have been successful in connecting the minimal and maximal measures of the number of parties to the seat product and might expect the same to be the case for balance—but here we fail. The balance for particular countries cannot be predicted on the basis of institutions. Indeed, the average relationship $N_0 = N_\infty^2$ leads to B = 0.5, which certainly holds as a world average. This is the only thing we can predict about balance, for any M and S. In other words, the index of balance for a given country tells us precisely by how much it deviates from the balance that could be expected, given its assembly size and district magnitude. It is a second-order measure, like deviation from PR.
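A small numerical sketch may make the index of balance more concrete. It assumes only the definitions given above ($N_\infty = 1/s_1$ and B as the ratio of logarithms). The function name `balance` and the two off-average constellations are hypothetical inputs chosen for illustration, not observed election results.

```python
import math


def balance(largest_share: float, n_seat_winning: int) -> float:
    """Index of balance B = log(N_inf) / log(N0), with N_inf = 1 / s1."""
    if n_seat_winning < 2:
        raise ValueError("B is undefined when only one party wins seats")
    n_inf = 1.0 / largest_share
    return math.log(n_inf) / math.log(n_seat_winning)


# A constellation on the average pattern s1 = N0 ** -0.5.
on_average = balance(0.5, 4)

# Hypothetical constellations with the same N0 but other largest shares.
dominant = balance(0.7, 4)     # largest party above the average pattern
fragmented = balance(0.3, 4)   # largest party below the average pattern

print(round(on_average, 3), round(dominant, 3), round(fragmented, 3))
```

The on-average case returns B = 0.5 regardless of the value of $N_0$ chosen, which is why M and S carry no information about B. A larger-than-expected largest share pulls B below 0.5 and a smaller one pushes it above.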
How the largest seat share relates to the seat product

When all seats are allocated within districts of fairly equal magnitudes, it was shown that, on the average, $N_0 = (MS)^{1/4}$ parties can be expected to win seats. Upon testing, this prediction held within 15 percent for FPTP and the usual PR systems. It was also shown that $s_1 = 1/N_0^{1/2}$ is to be expected on quite universal grounds, and this expectation is confirmed, with some deviation when only two parties win seats. Connecting the two equations immediately yields

$$s_1 = \frac{1}{(MS)^{1/8}}$$

or, in a more symmetric form, $s_1 (MS)^{1/8} = 1$. This is another sort of a law of conservation: The product of the largest seat share and the 8th root of the seat product is constant at 1. A symmetrical formulation avoids the issue of which variables are dependent, and which are ‘independent’. In the following, the largest seat share is treated as dependent on institutions, but the model only posits interdependence. A dominant party that happens to become unusually large may conceivably exert pressure to change the electoral system so as to ensure its continued domination. If so, then the largest share becomes the driving force, and MS becomes the ‘dependent’ part.

Oddly but pleasantly, at this stage we leave behind one methodological headache of the previous stages of model construction. Counting the number of seat-winning parties, be it at district or nationwide level, is made hard by lack of detailed data on the smallest parties, often lumped as ‘Others’, and the fuzzy nature of independents. But now we are dealing only with the largest seat share, for which clear data are much easier to locate. The only dilemmas involve semijoint largest parties like the CDU and CSU in Germany and fractured parties like the LDP in Japan.
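The chaining of the two models can be sketched in a few lines. This is an illustration under the stated model assumptions only; the function names are hypothetical, and the inputs M = 5 and S = 40 are borrowed from the Malta example purely as sample values, so the result is a model expectation rather than an observed share.

```python
def seat_winning_parties(magnitude: float, assembly_size: float) -> float:
    """N0 = (M*S) ** (1/4)."""
    return (magnitude * assembly_size) ** 0.25


def largest_seat_share(magnitude: float, assembly_size: float) -> float:
    """s1 = (M*S) ** (-1/8), obtained by chaining s1 = N0 ** (-1/2)."""
    return (magnitude * assembly_size) ** (-1.0 / 8.0)


m, s = 5, 40  # sample inputs
n0 = seat_winning_parties(m, s)
s1 = largest_seat_share(m, s)

# Both routes to s1 agree, and the conserved product equals 1.
print(round(s1, 3), round(n0 ** -0.5, 3), round(s1 * (m * s) ** 0.125, 3))
```

Computing $s_1$ directly from the seat product, or indirectly through $N_0$, gives the same number, which is all that "connecting the two equations" asserts.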
Seat product—the main characteristic of a simple electoral system

According to the models $N_0 = (MS)^{1/4}$ and $s_1 = 1/(MS)^{1/8}$, district magnitude and assembly size play a strikingly symmetrical role. An increase in one would compensate for a decrease in the other. The same number of seat-winning parties would be expected for a large 625-seat assembly elected from single-seat districts (S = 625, M = 1) and for a tiny 25-seat assembly elected in a single nationwide district (S = M = 25). In both, MS = 625, so that $N_0 = (MS)^{1/4} = 5$ parties are expected to win seats. Is there really such equivalence? And regardless of how many parties win one or a few seats, does such equivalence extend to the effective number of parties? It will be seen that it does. Such equivalence makes the seat product MS the single most important indicator to characterize a simple electoral system. Just as district magnitude (along with the seat allocation formula) characterizes the effect of the electoral system in a single district, the product MS characterizes the nationwide effect of the electoral system (along with the allocation formula).
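The claimed equivalence of the two configurations can be verified directly. The sketch below is illustrative; the dictionary labels and the helper name `seat_product` are assumptions. It computes the seat product and the two model expectations for each configuration.

```python
def seat_product(magnitude: int, assembly_size: int) -> int:
    """Seat product MS for uniform district magnitude M and assembly size S."""
    return magnitude * assembly_size


configs = {
    "625 single-seat districts": (1, 625),
    "one 25-seat nationwide district": (25, 25),
}

results = {}
for label, (m, s) in configs.items():
    ms = seat_product(m, s)
    # Store MS, the expected N0 and the expected largest share s1.
    results[label] = (ms, ms ** 0.25, ms ** -0.125)
    print(label, ms, round(ms ** 0.25, 2), round(ms ** -0.125, 3))
```

Both configurations share the seat product 625, whose fourth root is 5, so the model cannot tell them apart. Whether real systems behave alike is the empirical question the chapter goes on to address.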
The seat product could be used at the district level, where it becomes simply the magnitude, modified by F. It was noted in Taagepera and Shugart (1989: 118, 139–41) that the empirical relationships of deviation from PR and break-even point with magnitude involved the square root of M, as if this were some sort of a basic building block. This square root of M would correspond, at district level, to what was called electoral system aggregate in appendix to Chapter 6.

But if district magnitude and assembly size play symmetrical roles, how come the impact of magnitude on electoral outcomes was recognized a century ago, while the impact of assembly size is still questioned? The answer is that the widths of the ranges that M and S can take are not at all symmetrical. M can vary over more than two orders of magnitude—from 1 to 100 and beyond. Since $100^{1/4} = 3.16$, having nationwide PR instead of FPTP can triple the number of seat-winning parties. In contrast, S in the 30 systems condensed in Table 8.1 varies barely over one order of magnitude—25–628. Going from an assembly of 25 seats to 628 would only double the number of seat-winning parties, given that $(628/25)^{1/4} = 2.24$. Moreover, the actual options when choosing an assembly size are limited by the population to be represented (as is explained in Chapter 12). Median countries have presently around 10 million people, and their assemblies rarely have less than 100 or more than 400 seats. Having a 400-seat assembly instead of 100 can boost the number of seat-winning parties only by $(400/100)^{1/4} = 1.41$, meaning 41 percent. Such a change is dwarfed by what one can achieve by changing district magnitude. It is hence no wonder that the impact of S does not catch the eye as easily as the impact of M.
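The three ratios quoted in this passage follow from the fourth-root dependence on the seat product. A minimal sketch, with `n0_ratio` as a hypothetical helper name:

```python
def n0_ratio(old_value: float, new_value: float) -> float:
    """Factor by which N0 = (MS)**(1/4) changes when M or S is rescaled.

    Only the ratio matters, because the other factor of the seat
    product is held fixed and cancels out.
    """
    return (new_value / old_value) ** 0.25


magnitude_range = n0_ratio(1, 100)    # M from 1 to 100
assembly_range = n0_ratio(25, 628)    # S from 25 to 628
typical_choice = n0_ratio(100, 400)   # S from 100 to 400

print(round(magnitude_range, 2), round(assembly_range, 2), round(typical_choice, 2))
```

The asymmetry lies entirely in the ranges available to M and S, not in the exponents, which are the same for both.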
It need not work, but it does. Why?

When I expected 10 parties to win seats in a 100-seat PR district, it was not because of some positive arguments. There was only a negative reason: Any other expectation would be even harder to justify, in the absence of any further information. It need not materialize, even as average, when further information arrives. It is merely the only guess we are justified to make under complete information blackout, apart from conceptual limits. The same goes for all subsequent stages of the model, up to the largest seat share in the assembly. If the term ‘prediction’ has slipped into the model building, its meaning is the following: I cannot be certain it actually is so, but if I had to make a quantitative prediction, yes, I can make one (and only this one!), rather than say ‘I don’t know’.

As information is added, we could fully expect to find that the empirical average pattern differs from the simple model. The direction and extent of deviation from the model may help us locate factors of a political nature that could explain the deviation. Yes, reality can be expected to deviate from a model built on nothing else than conceptual limits. But we are in for a surprise: For the simplest electoral systems we can find, the simple model does work quite well. Deviations from the model arguably increase at the rate the systems turn more complex.
A second surprise is that the model fits even better for the largest seat share than for the number of seat-winning parties. Indeed, for the simplest system (FPTP), the number of seat-winning parties deviates from the model by 8 percent on the average, while the largest share deviates only by 1 percent. For all systems tested, the deviations are 14 percent and less than 1.5 percent, respectively. Yet, in order to build up the model for the largest share, we played the ignorance-based card three times over. We did it twice for the number of seat-winning parties—at district and assembly levels—and for a third time when going from the number of seat-winning parties to the largest share. We started on a thin limb to begin with, and then we took more and more risks. It should work less well at each successive stage, because random error accumulates. Moreover, sooner or later, some universal sociopolitical factor may enter so as to tilt the world average away from the purely ignorance-based best guess.

Such factors do enter, indeed, for individual electoral systems. Imperial Germany falls 42 percent short of the expectation for the largest share, while pre-World War I Italy exceeds it by 70 percent. With such a wide observed range, the world average itself could easily be anywhere between 80 and 120 percent of the simple model. Why is it within 1 percent for the simplest category (FPTP) and within 1.5 percent for the mean of all systems? True, when testing for the largest seat share, we circumvented the problem of independents and other ‘Others’ that bedeviled measurement of the number of seat-winning parties. Also, somewhat different sets of electoral systems were used in the two studies. Still, improvement of fit with the simple model remains puzzling. Maybe the micro-Duvergerian specialists can explain it.
The fact is that the simple model seems to apply, as a worldwide average, without need for major adjustments. In this particular case, sociopolitical nature turns out to be as simple, on the average, as it possibly could be. Individual countries need correction terms, but the world does not. One may feel like protesting: ‘This is outrageous. It cannot be that simple and devoid of political content. There must be some artifact.’ Indeed, several reviewers for the studies condensed here suggested that many other factors could affect the largest seat share. They then went on to claim that overlooking these factors may have led to an artificially good agreement with the simple model. But how could that be? Overlooking significant factors reduces agreement almost by definition of what is significant. Occam’s razor played a role in development of natural sciences: Omit what is not absolutely needed. This is no time to blunt Occam’s razor in political science. Instead of remaining in denial, one is better off by accepting the outrageously simple model as a baseline and focusing on the political features that make individual polities deviate from it. This is where many other factors enter— political, cultural, and other institutional. What is it that makes the Swiss largest party at any given election so small and the US largest party so large, compared to expectations based on district magnitude and assembly size? How could we
predict the quantitative degree of such deviations? What is it that makes the Two-Rounds systems so unpredictable—and what should we include to make it more predictable?

To the extent that the ignorance-based model fits the world average for the simplest systems, we are tempted to go beyond the initial claim that ‘This is the best guess we can make in the absence of any further information.’ We may be tempted to become more assertive: ‘This is no longer a guess but a firm prediction, because it has worked previously.’ It would be risky. We should preserve some humility, in view of our inability to predict for individual systems. As further data accumulate, a small but significant deviation from the simple model may appear for the average of even the simplest electoral systems.
9 Seat Shares of All Parties and the Effective Number of Parties
For the practitioner of politics:

• The quantity to watch is again the seat product—the number of seats in the assembly × the number of seats allocated in the average district.
• The lower the seat product, the lower the effective number of parties in the assembly.
• If you wish to reduce the effective number of parties by one-tenth, your best bet is to multiply 1 − 0.1 = 0.9 by itself 6 times, which yields 0.53. This is by how much you must multiply the present seat product. To do so, you can cut the present districts into two smaller districts or cut the assembly size by a half. You can also reduce both district magnitude and assembly size by about 30 percent.
• This way to calculate is based on a logical model that agrees with the world average. It is approximate, because other factors enter, but this is your best bet.
• The average seat shares of second-largest and third-largest parties also can be calculated from the seat product.
• At the same average district magnitude, unequal districts usually increase the effective number of parties.
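The rule of thumb in the list above can be turned into a small calculator. The sketch assumes, as the "multiply 0.9 by itself 6 times" recipe implies, that the effective number of parties varies as the sixth root of the seat product; the function name `seat_product_multiplier` and its `exponent` parameter are illustrative choices.

```python
def seat_product_multiplier(fractional_cut: float, exponent: int = 6) -> float:
    """Factor by which to multiply MS to lower the effective number of parties.

    Assumes the effective number varies as (MS) ** (1 / exponent);
    exponent = 6 is inferred from the rule of thumb, not derived here.
    """
    if not 0.0 <= fractional_cut < 1.0:
        raise ValueError("fractional_cut must be in [0, 1)")
    return (1.0 - fractional_cut) ** exponent


multiplier = seat_product_multiplier(0.10)

# If the cut is split evenly, M and S are each multiplied by sqrt(multiplier).
per_factor = multiplier ** 0.5

print(round(multiplier, 2), round(1.0 - per_factor, 2))
```

The multiplier comes out at 0.53, roughly a halving of either M or S. Splitting the change evenly means cutting each by about 27 percent, which the text rounds to "about 30 percent".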
Chapter 8 established a model that predicts the largest seat share in simple electoral systems on the basis of the seat product. Here the model is extended to seat shares of parties at all other ranks by size. This is not about sizes of specific parties but parties that happen to occupy a given rank by size at a given election. The following questions are of interest. Does political competition tend to place the two largest parties in a special category by size, far larger than the third parties, or do seat shares taper off gradually? Could either pattern predominate, depending on the size of the largest share? How is the resulting effective number of parties connected to the seat product?

Here the other seat shares are calculated in terms of an intervening variable: the largest seat share. Since the latter can be expressed in terms of the seat product MS, so can the other shares, in principle. However, this is messy mathematically, because the equations that connect the other seat shares to the largest are quite involved. Mathematics as such does not become more complex, compared to the previous chapter, but simple expressions pile up. The effective number of parties is the most widely used single number to characterize a party system. Fortunately, at this stage of cumulation, the mathematics becomes simpler again, because the probabilistically expected pattern can be approximated by another simple function of the seat product.
The Empirical Pattern of Seat Share Distribution

The first task is to find out what the empirical pattern looks like. At given seat share of the largest party, what is the typical share that goes to the second-largest party, and so on? Note that we do not follow the pattern of one specific party but parties that happen to have a given rank by size, at a given election. For individual elections the possibilities are wide open. The second-largest party may tie with the largest (e.g. The Netherlands 1901, 1905, 1909, both largest parties at 25.0 percent), or it may have as few seats as the third-largest party (e.g. Germany 1898: 25.7-14.1-14.1-...; Italy 1900: 81.1-6.7-6.6-...). Some countries may follow a steady pattern that deviates from the worldwide average—but we can establish such a deviation only after we determine the average, to serve as a benchmark. In some other countries, the distribution may vary widely from one election to the next. A major shift in the fortunes of one specific party may or may not alter the size distribution, compared to the previous election. When the Conservatives in Canada plummeted from 57.3 percent of the seats in 1988 to 0.7 percent in 1993, the ranked distribution shifted merely from 57.3-28.1-14.6 to 60.0-18.3-17.6-3.1-0.7-0.3. All this variation nonetheless occurs around some worldwide average, which may have considerable inertia over space and time. It may also have
Table 9.1. Actual average seat shares of parties ranked by size vs. largest share

        Seat shares (%), for given largest share (±2.5%)
Rank
 1      20.0   25.0   30.0   35.0   40.0   45.0   50.0   55.0   60.0   65.0   70.0
 2      17.8   22.4   24.8   27.8   29.9   32.0   37.1   40.6   35.3   29.9   25.4
 3      14.9   19.0   17.5   18.2   14.8   13.7    8.0    4.1    3.3    3.4    2.9
 4      13.3   12.3   11.9    9.5    9.0    6.3    3.5    0.2    0.8    1.4    0.7
 5      11.2    7.2    6.6    4.9    3.8    2.1    1.2    0.1    0.4    0.3    0.5
 6       8.8    5.3    3.8    2.1    1.6    0.6    0.2    0.0    0.1    0.0    0.3
 7       5.5    3.8    2.2    0.9    0.5    0.2    0.1    0.0    0.1    0.0    0.2
 8       2.8    1.8    1.4    0.7    0.2    0.1
 9       2.2    1.2    0.8    0.4    0.1
10       1.7    0.8    0.5    0.2    0.1
11       0.8    0.5    0.2    0.1
12       0.4    0.3    0.1    0.1
13       0.3    0.2    0.1    0.1
14       0.1    0.1    0.1
15       0.1    0.1
16       0.1
SUM    100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0
Source: Smoothed from Taagepera and Allik (2006).
a logical explanation in terms of statistics and/or what democratic politics is about, at its most universal. This is why establishing the worldwide average pattern is of interest.

Taagepera and Allik (2006) used essentially all of the more than 700 elections recorded in Mackie and Rose (1991, 1997), regardless of the complexity of the electoral system used. The elections were grouped by intervals of the largest share ($s_1$), and in each interval the arithmetic mean was calculated for the second-largest party, the third-largest, and so on. Table 9.1 shows the empirical results for largest shares up to 70 percent. Beyond 70 percent, few cases are available, and hence the averages come with a large random error. Compared to data published in Taagepera and Allik (2006), Table 9.1 converts these data to 5 percent intervals of the largest share. A few more parties than predicted by $N_0 = 1/s_1^2$ gain occasional representation, though mostly with an average of less than one seat per election.

Figure 9.1 presents the graph of the original data, reproduced from Taagepera and Allik (2006). The format used is an extension of what has been called the Nagayama triangle. Nagayama (1997) graphed the vote shares of the second-running contestant against the vote shares of the top contestant. The total of the two shares is 100 percent at most ($s_1 + s_2 = 1 = 100\%$), and the second-largest share can at most equal the largest ($s_2 = s_1$). These lines determine a triangle that delimits the allowed
[Figure 9.1 (graph not reproduced): seat shares of the first- through seventh-largest parties (vertical axis, 0% to 50%) plotted against the largest party seat share (horizontal axis, 0% to 100%), with the reference lines s2 = s1, s1 + s2 = 100%, s2 = 0.8s1, and s2 = 0.8(100 − s1).]
Figure 9.1. Actual average seat shares of parties ranked by size vs. largest seat share Source: Reprinted from Electoral Studies, 25, R. Taagepera and M. Allik, ‘Seat Share Distribution of Parties: Models and Empirical Patterns’, 696–713, © 2006 Elsevier Ltd., with permission from Elsevier.
zone for any election results. Reed (2001) popularized this format, and Grofman et al. (2004) investigated thoroughly its theoretical properties. Taagepera (2004) extended its use from candidates to parties, from vote shares to seat shares, and—most relevant here—to third-ranking parties, and so on. In addition to the upper limits for the second-largest party (s2 = s1 and s2 + s1 = 1 = 100%, respectively), Figure 9.1 shows the empirical patterns for parties at all ranks by size, up to s7 . Also shown, as a dashed line, is an approximation for the pattern of the second-largest party. When the largest share is less than 50 percent, the second-largest share tends to be 0.8 times the largest share: s2 = 0.8s1 . When the largest share surpasses 50 percent, the second-largest share tends to be 0.8 times what is left by the largest fractional share: s2 = 0.8(1 − s1 ). At this stage, we do not distinguish between cases with different electoral systems that happen to yield the same largest share in a given election. The cases with very large largest shares correspond mainly to single-seat districts and those with very small largest shares to PR in large districts. There is a wide overlap in the center. Work in progress (Taagepera and Laatsit 2007) indicates that the patterns for FPTP and List PR diverge to some extent. What we have in Table 9.1 and Figure 9.1 is the average
for all electoral systems. Such an approach offers the advantage of a large number of cases and the disadvantage of wide dispersal among them. It is a basis for more detailed work to follow. The broad pattern is clear, except for the blank at largest shares less than 20 percent. Here, a thought experiment will help. If the largest share were extremely small, then it would take a huge number of parties, some of them with shares almost equal to the largest, to bring the total up to 100 percent. For instance, if the largest share were 4 percent, it would take more than 25 parties, many of them close to but none of them larger than 4 percent. However, within the actual range of the largest share, even the seventh-ranking party (the smallest one shown in Figure 9.1 and Table 9.1) has a larger share than 4 percent. Hence, as the largest share increases, all the curves must rise at first, starting out from an anchor point at s1 = 0 → si = 0. Thereafter, they must peak and decline when the largest share becomes sufficiently large. This is the broad common pattern shared by parties at all ranks, from the second-largest share to the seventhlargest (and beyond). But when does the peak occur, and how high does it reach? Furthermore, do some curves bunch together more tightly than some others? At first glance at Figure 9.1, it might be tempting to guess that the ith ranking party peaks when the largest percentage share is 100%/i. In other words, maximum si corresponds to s1 ≈ 1/i. It is close, but not quite accurate. For the second-largest party, the peak occurs at 55 percent, higher than 100%/2 = 50. In contrast, the peaks for the third- and fourth-ranking parties occur at less than 33.3 and 25 percent, respectively. The heights of the peaks offer no readily visible regularities. In particular, the peak for the second-largest party towers way above all the others. 
It hugs the line for the largest party (graphed against itself) when s1 < 0.5 = 50%, but so does the third-largest party when s1 < 0.25 = 25%. Later, the two curves part company, and very drastically so. The shares of third- and fourth-largest parties show a minor increase at very large largest shares. This may be an artifact due to a low number of cases. In sum, the details of the curves are so intricate that it might look hopeless to determine the reasons and hence the underlying pattern. But let us try, anyway. In particular, why is it that we start to have a two-party game, with the third-largest party bunched with minor parties, only when the largest share surpasses 30 percent? Why is it that, at largest share 25 percent, it looks like 3 parties standing apart from the rest, rather than 2?
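The dashed-line approximation for the second-largest party, described earlier in this section, can be written out as a small function. This is only a sketch of that rule of thumb (0.8 times the largest share below 50 percent, 0.8 times the remainder above it); the function name `approx_second_share` is an illustrative choice, not something taken from the book.

```python
def approx_second_share(s1: float) -> float:
    """Rule-of-thumb second-largest fractional share for a given largest share.

    s1 is the largest fractional share (0 < s1 <= 1). The function name is
    hypothetical; the two branches follow the dashed-line approximation.
    """
    if not 0.0 < s1 <= 1.0:
        raise ValueError("s1 must be a fraction in (0, 1]")
    if s1 < 0.5:
        # Below 50 percent: 0.8 times the largest share.
        return 0.8 * s1
    # At or above 50 percent: 0.8 times what the largest party leaves over.
    return 0.8 * (1.0 - s1)


for s1 in (0.25, 0.40, 0.50, 0.60):
    print(f"s1 = {s1:.2f} -> s2 is roughly {approx_second_share(s1):.2f}")
```

Both branches give the same value at exactly 50 percent, so the approximation is continuous there.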
The Probabilistic and Politically Adjusted Models for Seat Shares

In the previous chapter, we guessed at the largest fractional share, based on its conceptual limits: the average share 1/N0 and the near total, 1. The result, $s_1 = 1/N_0^{1/2}$, agreed pretty well with actual values. We can repeat the procedure with the second-largest party, using only the part (1 − s1) left over by the largest party. Details are given in the chapter appendix. Taagepera and Allik (2006) tabulate the complete results, for the largest share ranging from 14 to 91 percent, and show the resulting graph, again following the Nagayama format. The pattern vaguely agrees with the observed average pattern when the largest share is small, but it fails to agree when the largest share increases. In particular, the model predicts that the second-largest party would peak later and at a much lower value than it actually does, and the third-largest party is also predicted to peak much later.

At this point, we should recall what was said about $s_1 = 1/N_0^{1/2}$ in the previous chapter: It is the only guess we can reasonably make, if conceptual boundaries are all we know. It is an expectation value only in this limited sense. We guessed at the number of seat-winning parties, purely on the basis of two institutional constraints (district magnitude and assembly size), and were surprised to find that it actually worked, with no need to introduce any further political considerations. We went on to the largest share, and the surprise was repeated. With the shares of other parties, politics finally catches up with institutions. We need some political input. But we should keep such input at the bare minimum: the least that we can get away with so as to explain the pattern observed. This means introducing only some broad principle of politics that applies to all democratic systems. The actual mean seat shares almost always penalize smaller parties in favor of larger parties, compared to the probabilistic expectations.
The transition point between ‘smaller’ and ‘larger’ parties shifts as the largest share increases. At s1 = 0.2 = 20%, as many as 6 largest parties exceed the probabilistic expectation. The number drops to 4 at s1 = 0.3, to 3 at s1 = 0.35, and to only 2 at s1 = 0.40. The simplest mathematical function that comes close to expressing these observations is the inverse of the largest fractional share, 1/s1 . We encountered 1/s1 earlier (Chapter 4), as one of the ways to express the number of parties: N∞ . As a measure of the number of parties, it tends
to be inferior to the usual effective number of parties. But here it acquires an intriguing special meaning: N∞ seems to be the number of parties that profit from a politically induced shift of seats to larger parties, compared to probabilistic expectations. Recall that the peak value for ith-ranked party was observed to correspond to s1 ≈ 1/i. In other words, this peak occurs when i = N∞ . Peaking means that the party shifts from the bonus group when N∞ is large to the penalized group when N∞ becomes smaller. So the two observations are mutually consistent. There must be a way to explain logically why the shift occurs at rank equal to N∞ , but I have not found it yet. What causes this rather systematic shift? Duverger’s mechanical and psychological effects immediately come to mind, but here we are dealing with something even more general. The Duverger effects are most marked for single-seat districts, where the largest share tends to be large. Yet here we observe penalization of the smallest parties even when the largest share is quite small, which most often corresponds to PR. Such penalization may be a major puzzle raised by the discrepancy between data and probabilistic expectations in Taagepera and Allik (2006). Several factors may disadvantage the smallest parties even under PR. Legal thresholds of representation block them in some systems—but only in some. More broadly, small parties always suffer from lack of ‘economics of scale in advertising, raising funds, securing portfolios supplying policy benefits, and so on’ (Cox 1997: 141). Media coverage of minor parties is so limited that some voters could be unaware of the very existence of parties whose programs might appeal to them. The major factor may consist of strategic devices much broader than Duverger’s psychological effect, devices of the type Cox (1997: 194–6) has called strategic sequencing. Moreover, winning seats is not the only goal the voters have in mind. 
Even if a preferred party wins seats proportional to its votes, some voters may still abandon it, if larger parties offer a better chance to be represented not only in the assembly but also in government. Taagepera and Allik (2006) construct a model to account for the resulting shift of support from small to large parties. It is condensed in the chapter appendix. Table 9.2 shows the seat share distributions at three values of the largest share: a low 25 percent, where about 16 parties are expected to win seats according to $N_0 = 1/s_1^2$; a median 38 percent, where about 7 parties are expected to win seats; and a rather high 50 percent, where about 4 parties are expected to win seats. At each of these values
Table 9.2. Seat shares (in percent) of parties ranked by size, for given largest share: probabilistic model, politically adjusted model, and the actual world averages

           Largest share 25.0%       Largest share 38.0%       Largest share 50.0%
Rank       Prob.  Polit.  Actual     Prob.  Polit.  Actual     Prob.  Polit.  Actual
2           19.4    20.7    22.4      25.4    30.0    29.9      28.9    34.6    37.1
3           14.9    17.4    19.0      16.5    16.0    17.0      14.9     8.2     8.0
4           11.3    14.2    12.3      10.1     5.7     7.8       6.2     4.5     3.5
5            8.5     8.3     7.2       5.9     3.9     3.1       0.0     2.2     1.2
6            6.3     3.4     5.3       2.9     2.7     1.5               0.5     0.2
7            4.6     2.8     3.8       1.2     1.7     0.7
8            3.3     2.2     1.8       0.0     1.0     0.6
9            2.4     1.7     1.2               0.6     0.5
10           1.6     1.3     0.8               0.3     0.4
11           1.1     1.0     0.5               0.1     0.3
12           0.7     0.8     0.3               0.0     0.2
13           0.4     0.6     0.2                       0.1
14           0.3     0.5     0.1
15           0.1     0.3     0.1
16           0.0     0.1     0.0
SUM        100.0   100.0   100.0     100.0   100.0   100.0     100.0   100.0   100.0

Note: Each SUM includes the largest share itself.
Source: Data interpolated from Taagepera and Allik (2006).
of the largest share, three distributions are shown: those predicted by purely probabilistic and politically adjusted models, plus the actual one. Horizontal lines indicate the ranks beyond which parties no longer are expected to win seats, on the basis of $N_0 = 1/s_1^2$. Bold script indicates the cases where the model differs from the actual average by more than 2.0 percentage points. For the second- to fourth-ranking parties, most predictions by the probabilistic model deviate from the actual distribution by more than 2 percentage points. At s1 = 0.50 = 50%, this model underestimates the second-largest share by 8 percentage points (and the deviation becomes worse for larger s1 , not shown here). The politically adjusted model, though far from perfect, agrees with the actual distribution within 2 percentage points in two-thirds of the cases. Note that the model $N_0 = 1/s_1^2$ predicts the number of seat-winning parties quite well at s1 = 0.25 = 25%. At 38 percent, many more parties occasionally win a seat, resulting in a string of average shares of less than 1 percent. The politically adjusted model reproduces this long tail, while the probabilistic model does not. The same applies at s1 = 0.50 = 50%, where the fifth-ranking party achieves some minimal representation in most cases. Here the general deviation from $N_0 = 1/s_1^2$ at low number of seat-winning parties starts to set in.
Figure 9.2. Average seat shares of parties ranked by size vs. largest seat share: politically adjusted predictive model and actual data
[Figure not reproduced. Horizontal axis: largest party seat share, 0% to 100%. Vertical axis: seat shares, 0% to 50%. Curves: model and empirical shares of the second-, third-, and fourth-largest parties, with the limit lines s2 = s1 and s1 + s2 = 100%.]
Source: Reprinted from Electoral Studies, 25, R. Taagepera and M. Allik, ‘Seat Share Distribution of Parties: Models and Empirical Patterns’, 696–713, © 2006 Elsevier Ltd., with permission from Elsevier.
The overall degree of fit is shown in Figure 9.2. The model-based curves are shown along with the actual data (the same as in Figure 9.1), for the second- to fourth-largest parties. Consider each curve separately.

The seat shares of the second-largest party agree with the model within ±2 percentage points, as long as the largest share remains below 50 percent. This error is within the range of random scatter of data points themselves, so the fit is as good as it possibly could be. When the largest share is between 50 and 60 percent, the model falls short of the actual values by up to 6 percentage points. This deviation looks serious enough to call for further refinement of the model. At largest shares beyond 70 percent, the model predicts a pure two-party constellation, so that s2 = 1 − s1 . The actual curve falls below that expectation by up to 8 percentage points, but the number of data points is low and restricted to a few countries, so that the data are questionable.

The seat shares of the third-largest party agree with the model within ±3 percentage points, as long as the largest share remains below 55 percent. The model predicts the rise, peak, and decline quite accurately. When the largest share ranges from 55 to 65 percent, the model exceeds the actual values by up to 5 percentage points. At largest shares beyond 70 percent, the model predicts complete extinction of the third party,
while actually its share rises again, but once again the data themselves are questionable. The seat shares of the fourth-largest party agree with the model within ±2 percentage points everywhere, except when the largest share is 40 percent. The corresponding graphs for fifth- to seventh-largest parties are available in Taagepera and Allik (2006). Agreement with the model is within ±2 percentage points, which says little, given that the shares themselves are below 12 percent and mostly below 5 percent. Here the accuracy of the empirical data suffers from the presence of the ‘Others’ category. In sum, this graph shows that the politically adjusted model does fit, within the random fluctuation of the empirical data, when the largest seat share is below 50 percent. For the largest shares above 50 percent, a refinement of the model may be in order, along with further data collection so as to determine the empirical means with more confidence. Table 9.2 agrees with the picture described. Taagepera and Allik (2006) offer full tabulation of numerical values.
The Effective Number of Legislative Parties

We now come to the effective number of legislative parties (N), arguably the most important single indicator to characterize a party system. For a given largest seat share, the values of N are restricted to a range of which the upper and lower boundaries are somewhat complex. As shown in the chapter appendix, the geometric mean of these extreme values can be approximated with

$N = 1/s_1^{4/3}$.

Combining it with $s_1 = (MS)^{-1/8}$ leads to a mean estimate of the effective number of parties from purely institutional inputs. The effective number of legislative parties is around the sixth root of the seat product:

$N = (MS)^{1/6}$.

I tested this approximate model with those twenty-five countries in Lijphart’s Patterns of Democracy (1999) in which all seats are allocated in districts, so that M can be determined. The corresponding data are tabulated in Taagepera and Sikk (2007). The ratio $N/(MS)^{1/6}$, expected to
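Figure 9.3. Effective number of legislative parties vs. seat product MS: predictive model and regression line
[Figure not reproduced. Horizontal axis: seat product MS, logarithmic scale from 10 to 100,000. Vertical axis: N, logarithmic scale from 1 to 100. Lines: predictive model $N = (MS)^{1/6}$ and regression line $N = 1.09(MS)^{0.153}$, $R^2 = 0.5091$. Labeled points: PNG, NED, SPA, MRT, MLT, UK, BOT.]
Data source: Taagepera and Sikk (2007).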
be 1.00, ranges from 0.72 (UK 1945–97) to 2.74 (Papua New Guinea 1977–97). The geometric mean of this ratio is 1.036 for 14 single-seat systems and 0.953 for the 11 multi-seat systems. For all 25 systems, the geometric mean is 0.999. This is closer to 1.000 than one could hope for. Using these data, Figure 9.3 shows the mean effective number of legislative parties (see data in Appendix to the book) graphed against the seat product MS, using logarithmic scales on both axes. The best linear fit of logarithms corresponds to

$N = 1.09(MS)^{0.153}$. [observed best fit, $R^2 = 0.51$]

It almost overlaps with the expected

$N = 1.00(MS)^{0.167}$. [theoretical model]

Visibly, the predictive model fits this particular data-set practically as well as the postdictive best fit. Lighter lines indicate a half and double the expected value. Most data points crowd together in the center of the zone, and only Papua New Guinea is outside. Due to the approximation involved in $N = 1/s_1^{4/3}$, we would expect some deviation at MS > 3,500, but it seems insignificant compared to random variation.

Various other factors besides the seat product also interact with the effective number of parties. The degree of centralization offers a puzzling example. An increasing number of parties has been found to increase
central government expenditures, roughly as the cube of the largest party seat share (Mukherjee 2003). In turn, higher political and economic centralization has been found to reduce the effective number of parties in India and the USA (Chhibber and Kollman 1998). While the measures of centralization and number of parties differ somewhat, it should not make that much of a difference. Do we have a situation where centralization and party multiplication keep each other in check? I have no answer.
Conclusions and Implications for Institutional Engineering

The most important conceptual result is that we have logically connected the various ways to measure the number of parties to each other and to institutional inputs of the electoral system, condensed in the seat product. Three ways to measure the number of parties were pointed out earlier: N0 , N2 , and N∞ . We can now express their average relationship to the seat product MS. The basic building block seems to be not the seat product itself but its square root. It is the geometric mean of M and S and might be called the aggregate of the electoral system (cf. appendix to Chapter 6):

$A = (MS)^{1/2}$.

The averages of N0 , N2 , and N∞ correspond to the square, cube and 4th roots, respectively, of the electoral system aggregate:

$N_0$ = the number of seat-winning parties = $(MS)^{1/4} = A^{1/2}$.
$N_2$ = the effective number of legislative parties = $N = (MS)^{1/6} = A^{1/3}$.
$N_\infty$ = inverse of the largest seat share = $1/s_1 = (MS)^{1/8} = A^{1/4}$.

These relationships among them form a remarkable series:

$N_\infty^4 = N_2^3 = N_0^2 = A^1$.

It follows (and this is of course the same as $N_0 = 1/s_1^2$) that

$N_0 = N_\infty^2$.
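The chain of roots above is easy to evaluate numerically. The following is a minimal sketch, assuming a simple electoral system described only by district magnitude M and assembly size S; the function name `seat_product_estimates` and the dictionary keys are illustrative choices, not notation from the book.

```python
def seat_product_estimates(magnitude: float, assembly_size: float) -> dict:
    """Average expectations derived from the seat product MS.

    Returns the aggregate A = (MS)^(1/2) and the three ways to count
    parties, plus the implied largest seat share s1 = 1 / N_inf.
    """
    ms = magnitude * assembly_size
    return {
        "A": ms ** 0.5,            # aggregate of the electoral system
        "N0": ms ** 0.25,          # number of seat-winning parties
        "N2": ms ** (1.0 / 6.0),   # effective number of legislative parties
        "N_inf": ms ** 0.125,      # inverse of the largest seat share
        "s1": ms ** -0.125,        # largest fractional seat share
    }


# Example: an assembly of 200 seats elected in 8-seat districts (MS = 1,600).
for name, value in seat_product_estimates(8, 200).items():
    print(f"{name:>5} = {value:.3f}")
```

These are worldwide average expectations only; an individual country can deviate from them considerably.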
The systematics of measuring the number of parties, which included N0 , N2 , and N∞ , was first pointed out almost 30 years ago (Laakso and Taagepera 1979). The approximate relationships shown above, however, do not emerge directly from the definitions of N0 , N2 , and N∞ . The path of discovery, always using the geometric means of upper and lower limits, was the following. First the highest possible way to count the parties (N0 ) was obtained from MS. Then the lowest possible way (N∞ = 1/s1 ) was
calculated from N0 , and it was found to be the square root of the highest. The intermediary N2 was obtained from balancing these two extremes.

At the start of the chapter, the following question was posed: Does political competition tend to place the two largest parties in a special category of size, far larger than the third parties, or do seat shares taper off gradually? We find it depends on the size of the largest share. The electoral system divides the parties ranked by size into two groups: relative losers and winners. As the largest share increases, ever fewer parties belong to the advantaged group. As the largest share surpasses 30 percent, only two winning parties tower above the rest.

The seat product allows us to make much more specific predictions for the distribution of seat shares, for any simple electoral system. For instance, in an assembly of 200 seats elected in 8-seat districts, we would expect somewhat more than $(1{,}600)^{1/4} = 6.3$ parties to win seats, on the average. The largest party is expected to have about $(1{,}600)^{-1/8} = 0.40 = 40\%$ of the seats. Figure 9.2 and Table 9.2 for politically adjusted shares then suggest a distribution around 40-30-16-6-4-2-1-1, with 8 parties winning seats (for a more detailed table, see Taagepera and Allik 2006). Instead, one may prefer to use the empirically observed distributions in Table 9.1. Then the pattern changes slightly, to 40-30-15-9-4-1.5-0.5, with 7 parties winning seats usually, and sometimes a few more. This would be our best guess in the absence of any other information.

The main use of the pattern thus established is to supply the baseline for characterizing the average seat distributions in a given country. Once we find out how the country differs from the worldwide average, we can start looking for the underlying reasons. This should be a fertile field of study in years to come.

What are the implications for institutional engineering? The observations made in the previous chapter still apply.
The quantity to watch is the seat product. The larger the seat product is, the larger the effective number of parties. To lower the average effective number of parties, one can lower either district magnitude or assembly size, or lower both to a more moderate degree. Once again, the country’s peculiarities must be taken into account. The effective seat product could differ somewhat from the product of M and S, depending on other factors. To alter an existing effective number of parties, however, there is no need to go through the worldwide average model. Suppose you wish to reduce the effective number of parties by one-tenth. Your best bet is to multiply 1 − 0.1 = 0.9 by itself 6 times, which yields 0.53. This is by how much you must multiply the present seat product, which roughly means
dividing the seat product by 2. To do so, you can cut the present districts of magnitude M into two smaller districts of magnitude M/2. You can also cut the assembly size by a half, but now it gets a bit tricky. If you keep the same district boundaries, district magnitudes would be cut by a half, and you would reduce the number of parties excessively. To keep the same magnitude, you would have to join two existing districts. You can also reduce both district magnitude and assembly size by about 30 percent, but district boundaries would have to be redesigned. More generally, for any desired fraction x of decrease in the effective number of parties, multiply 1 − x by itself 6 times. This is our best bet for the fraction by which the existing seat product has to be multiplied. This change can be obtained by altering either district magnitude, assembly size, or both, but check that the new M and S do give the desired product.
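The rule of thumb in the preceding paragraph follows from N being roughly the sixth root of the seat product: multiplying N by (1 − x) requires multiplying MS by (1 − x) to the sixth power. A minimal sketch follows; the function names are hypothetical, and the even split between M and S is just one of the options the text mentions.

```python
def seat_product_multiplier(x: float) -> float:
    """Factor by which to multiply the seat product MS so that the
    effective number of parties drops by the fraction x (0 <= x < 1)."""
    if not 0.0 <= x < 1.0:
        raise ValueError("x must be a fraction in [0, 1)")
    return (1.0 - x) ** 6


def even_split_factor(x: float) -> float:
    """Factor applied to BOTH M and S when the change is shared evenly
    between them (the square root of the seat product multiplier)."""
    return seat_product_multiplier(x) ** 0.5


x = 0.10  # reduce the effective number of parties by one-tenth
print(f"multiply MS by {seat_product_multiplier(x):.2f}")
print(f"or multiply both M and S by {even_split_factor(x):.2f}")
```

For x = 0.1 the second line of output corresponds to cutting both district magnitude and assembly size by a little under 30 percent.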
Appendix to Chapter 9

This appendix derives the predictive models for seat shares and the effective number of parties. It extends the model to the entropy-based variant of the effective number. It also points out a marked conceptual inconsistency in the seat share model, which makes us wonder why the model still fits actual data.
The probabilistic model for seat shares

In the previous chapter, we used the conceptual limits of the largest fractional share, average share 1/N0 and almost 1, to estimate the likely mean. The result, $s_1 = 1/N_0^{1/2}$, agrees pretty well with actual values, as long as the largest share remains less than 63 percent (cf. Figure 8.3), which would correspond to N0 = 2. Given this success, let us repeat the procedure with the second-largest party. For given largest share, the remaining (N0 − 1) parties account for a total fraction (1 − s1 ) of all the seats. Hence their average share is (1 − s1 )/(N0 − 1). This is the lower limit for the second-largest party. Its upper limit is slightly below (1 − s1 ), so as to leave a minimal number of seats to the third parties. The geometric mean of these limits is

$$s_2 = \frac{1 - s_1}{(N_0 - 1)^{1/2}}.$$

For the third-largest party, similar reasoning yields

$$s_3 = \frac{1 - s_1 - s_2}{(N_0 - 2)^{1/2}}.$$

The general formula for the ith ranking party is

$$s_i = \frac{1 - \sum_j s_j}{(N_0 - i + 1)^{1/2}},$$

where the summation ranges from j = 1 to i − 1. Once the largest seat share is given, all other shares can be calculated. Note that this approach does not presume a connection between the largest share and the number of parties, such as $s_1 = 1/N_0^{1/2}$. Introducing this relationship would lead to

$$s_2 = s_1 \left( \frac{1 - s_1}{1 + s_1} \right)^{1/2}.$$

For integer values of N0 , the sum of shares thus calculated is 1, as it should. If we do calculate N0 from $N_0 = 1/s_1^2$ and it has a noninteger value, then we apply the formula to the integer part of N0 , and the resulting sum of seat shares is slightly less than 1. We take the remainder to represent a tiny party whose rank exceeds the integer part of N0 . Taagepera and Allik (2006) include a table of resulting seat shares and graph it.
Political adjustment to the probabilistic model for seat shares

We now assume that the small parties give up a fraction m of their inherent support base to larger parties. The tiniest parties may well give up a larger fraction. If so, then m gradually decreases with increasing party size. It would reach 0 when the party almost accedes to the select club of the parties that profit from the shift. Thus m can be expected to be a function of the relative size of the parties. Determining the shape of this function, however, is difficult. As a first approximation, we assume that m is the same for all parties that lose support. The conceptual limits on such a constant are m = 0, when no support is lost, and m = 1, when all support is lost. In the absence of any further knowledge, we try first the median value m = 0.5, which simplifies calculations. On this basis we establish the total share of seats given up by the small parties. This total is transferred to the large parties. The number of such parties was observed to be close to N∞ = 1/s1 . The largest among them may well profit more than proportionately, but it is hard to estimate how much more. So again, as a first approximation, we assume that all such parties profit equally. The party at the watershed between the losers and the winners occupies a special position. This is the party at rank i0 closest to 1/s1 . We assume it gains as much as it loses, so that its share remains unchanged. This assumption breaks down when the largest share is so large (more than 40 percent) that even the third-largest party starts losing seats. From this point on, we must assume that there is no middle ground between losers and winners.
The previous probabilistic equation for seat shares was $s_i = (1 - \sum_j s_j)/(N_0 - i + 1)^{1/2}$. It is now adjusted according to the description above. The notation $s_i'$ is used so as to tell the adjusted shares apart from the unadjusted. Full derivation of the following equations is given in Taagepera and Allik (2006). For small parties, those at rank index i > i0 , the adjusted seat shares are

$s_i' = (1 - m)s_i$. [i > i0]

For large parties, those at rank index i < i0 , the adjusted seat shares become

$s_i' = s_i + \dfrac{m(1 - \sum_k s_k)}{i_0 - 1}$, [i < i0]

where k runs from 1 to i0 . The intermediary party (i = i0 ) undergoes no adjustment when the largest share is small:

$s_i' = s_i$. [i = i0, s1 < 0.4]

At unadjusted s1 > 0.4 = 40%, this intermediary stage vanishes, and for the two largest parties the equation becomes

$s_i' = s_i + \dfrac{m(1 - s_1 - s_2)}{2}$. [2 largest parties, s1 > 0.4]

When the largest share exceeds 50 percent, the number of winners from the adjustment drops below 2, so that even the second-largest party begins to suffer from the adjustment. Then

$s_1' = s_1 + m(1 - s_1)$. [largest party, s1 > 0.50]

Assuming m = 0.5 yields a slightly simplified set, still fairly messy:

$s_i' = 0.5 s_i$. [i > i0]
$s_i' = s_i + 0.5(1 - \sum_k s_k)/(i_0 - 1)$. [i < i0]
$s_i' = s_i$. [i = i0, s1 < 0.40]
$s_i' = s_i + (1 - s_1 - s_2)/4$. [2 largest parties, s1 > 0.40]
$s_1' = 0.5(1 + s_1)$. [largest party, s1 > 0.50]
This is what the curves in Figure 9.2 and the politically adjusted numbers in Table 9.2 are based on. Work in progress (Taagepera and Laatsit 2007) graphs the curves for values of m ranging from 0 to 1 and compares them to actual data.
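The piecewise adjustment can be sketched as follows. This is a hedged illustration under several assumptions of my own: the input list holds unadjusted shares ranked by size and summing to 1, the watershed rank i0 is 1/s1 rounded to the nearest integer, and the boundary cases s1 = 0.4 and s1 = 0.5 (which the equations leave open) are both placed in the two-winner regime. The function name `adjust_shares` is hypothetical, and the full derivation is in Taagepera and Allik (2006), not here.

```python
import math


def adjust_shares(shares: list[float], m: float = 0.5) -> list[float]:
    """Apply the constant-m political adjustment to unadjusted shares.

    shares[0] is the unadjusted largest share s1; shares must be ranked
    by size and sum to 1. Losers give up a fraction m of their shares,
    and the pooled total is split equally among the winners.
    """
    if not shares or abs(sum(shares) - 1.0) > 1e-6:
        raise ValueError("shares must be non-empty and sum to 1")
    s1 = shares[0]
    if s1 > 0.5:
        n_winners, neutral = 1, None      # only the largest party gains
    elif s1 >= 0.4:
        n_winners, neutral = 2, None      # two winners, no neutral party
    else:
        # Watershed rank i0: the integer closest to 1 / s1 (assumption:
        # ordinary rounding, capped at the number of parties).
        i0 = min(math.floor(1.0 / s1 + 0.5), len(shares))
        n_winners, neutral = i0 - 1, i0
    protected = neutral if neutral is not None else n_winners
    # Total given up by the losers: m * (1 - sum of protected shares).
    pool = m * (1.0 - sum(shares[:protected]))
    adjusted = []
    for rank, share in enumerate(shares, start=1):
        if rank <= n_winners:
            adjusted.append(share + pool / n_winners)
        elif rank == neutral:
            adjusted.append(share)            # gains as much as it loses
        else:
            adjusted.append((1.0 - m) * share)
    return adjusted


print(adjust_shares([0.30, 0.25, 0.20, 0.15, 0.10]))
```

In each regime the adjusted shares still sum to 1, because everything the losers give up is handed to the winners. Note that the adjusted largest share differs from the unadjusted one, so the output is not directly comparable with a column of Table 9.2, which is indexed by the observed largest share.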
Critique of the politically adjusted model for seat shares

Once again, this is the worst possible way to account for political adjustment to the probabilistic model, except for all others, if we want to end up with a specific prediction. We approximate a varying loss function m(s) with a constant. We also use a constant level of gain, plus one neutral point that suddenly vanishes when the largest share becomes large. Even before making this gross simplification, we assumed a single function m(s) to account for the multitude of distinct factors that range from strategic sequencing to consequences of economies of scale. Could we feed them in separately? We could, as far as algebra is concerned. But we would end up with so many parameters that our database would not suffice to determine their numerical values. It would be another of those impressive models that are unable to make specific numerical predictions. In the face of such complexities, one may encounter advice to give up and limit oneself to qualitative or directionally predictive models, which never are falsified because their predictions are so fuzzy that everything and its opposite fit in. In terms of quantitative prediction, we would be left with the previous probabilistic model, with all its discrepancies. Yet these discrepancies all point in one broad direction: Compared to PR, the largest parties win, while the smaller ones lose, with a neutral point around the party whose rank is close to N∞ = 1/s1 . Even a single parameter, such as m, should go a long way to correct for this broad discrepancy. The result would be expected to be closer to the actual pattern, compared to the probabilistic model. If our guess at m = 0.5 is excessive, we would observe an overcorrection, and vice versa. Hence, such a first approximation would help us to refine the model, either by adjusting the value of m or by suggesting how to introduce a second parameter.
Figure 9.2 indicates that the model can stand refinement at large values of the largest share, where $s_1 = 1/N_0^{1/2}$ breaks down. One hazy aspect of the model is that it equivocates between shares of seats and shares of votes, when it talks of the ‘inherent support base’ of a party. For PR, it does not matter much, but when seat shares differ appreciably from vote shares, as is the case for FPTP, it might matter. The seat and vote levels are hard to disentangle conceptually, because voters can react only at the vote level (by withholding votes), while being motivated to defect by what happens at the seat level: low representation of the party in assembly and government, plus the concomitant low press coverage. Two questions immediately arise: What would the pattern be, if Figure 9.1 were redone, using vote shares instead of seat shares? And would the patterns be the same for List PR and for FPTP? Work in progress (Taagepera and Laatsit 2007) indicates that the overall patterns for votes and seats are identical, within the range of random error. Thus the equivocation between votes and seats in modeling political adjustment is less severe than it could have been. The value of m might
be around 0.25 for List PR and around 0.75 for FPTP, but still with essentially the same patterns for seats and for votes. A relatively minor problem with the adjustment is that the procedure includes discontinuities. As the largest share increases, the other larger parties make sudden transitions from winner status to neutral and then to loser status. This explains the jagged shapes of the curves in Figure 9.2. Such ‘quantum jumps’ are most likely due to approximating a smoothly changing loss function m(s) and the corresponding gain function with constants. Broad agreement with data suggests that we are basically on the right path. Hence it may be worthwhile to invest in working out a model with smoothly changing m(s). Further questions arise when the largest share increases beyond 63 percent, so that applying $s_1 = 1/N_0^{1/2}$ rounds off to N0 = 2. Figure 8.3 shows that $s_1 = 1/N_0^{1/2}$ no longer applies when N0 = 2. So the model should be reworked for cases with a hegemonic largest party. But the most severe problem with the adjusted model is one that no commentator has picked up: The model undermines its own foundations. The very starting point of the adjustment was that the number of seat-winning parties is close to $N_0 = 1/s_1^2$. But in the process of adjustment, the largest share itself shifts from its ‘inherent support level’ (s1 ) to an adjusted level. This adjusted level is the only one we actually observe, according to the model. But if so, then $N_0 = 1/s_1^2$ no longer applies to the observed largest share; it only applies to a hypothetical and unobservable ‘inherent support level’ for the largest party! The nice part of this paradox is that it explains a discrepancy in Table 9.2. At largest shares 38 and 50 percent, more parties are observed to win minor shares of seats than was expected on the basis of $N_0 = 1/s_1^2$. This is indicated by the horizontal lines in Table 9.2.
Moreover, the adjusted model quite accurately reproduces these shares. But how come that the relationship $N_0 = 1/s_1^2$ (in its reversed form $s_1 = 1/N_0^{1/2}$) fitted the actually observed values so well in Figure 8.2? At some values of the ‘inherent support level’, the adjusted largest share exceeds it appreciably, and hence both should not fit. Is the reader confused? It should be so, because I am confused myself. In view of the agreement with data, this confusion is no reason to give up on the model presented. Clumsily, it expresses something real. It will take time to clarify the terms used.
How the effective number of legislative parties connects to the largest share

We have two ways to estimate the effective number of legislative parties, for a given largest seat share. We could calculate it from the equations of the politically adjusted model. But the result would be algebraically messy. Alternatively, we could try a shortcut. Observing that the effective number depends most heavily on the largest share, we could try to estimate the effective number from that largest
Table 9.3. Effective number of parties for given largest share

s1 (%)                  20.0   25.0   30.0   35.0   40.0   45.0   50.0   55.0   60.0   65.0   70.0
Act. N                  7.32   5.75   4.93   4.08   3.56   3.05   2.53   2.13   2.06   1.95   1.80
$1/s_1^{3/2}$          11.18   8.00   6.09   4.83   3.95   3.31   2.83   2.45   2.15   1.91   1.71
$2/s_1^{0.9} - 1$       7.51   5.96   4.91   4.14   3.56   3.10   2.73   2.43   2.17   1.95   1.76
$1/s_1^{4/3}$           8.55   6.35   4.98   4.05   3.39   2.90   2.52   2.22   1.98   1.78   1.61

Source: Calculated from actual mean shares in Table 9.1 and as estimated from $N = 1/s_1^{3/2}$ (old model), $N = 2/s_1^{0.9} - 1$ (new more precise model), and $N = 1/s_1^{4/3}$ (new approximate model).
Note: Deviations of more than 0.3 parties from the actual mean values are shown in bold.
share alone. The values derived from the adjusted model for all seat shares would serve as a check on how accurate the shortcut is.
An average connection between the effective number (N) and the largest share ($s_1$) was proposed by Taagepera and Shugart (1993): $N = 1/s_1^{3/2}$. Combined with $s_1 = (MS)^{-1/8}$, it leads to $N = (MS)^{3/16}$. It turns out that $N = 1/s_1^{3/2}$ overestimates N, except when the largest share exceeds 62 percent. The discrepancy is due to a mistake in the model. I first present the old model, because it is simple to follow, and then two versions of the corrected one.
At a given value of $s_1$, the effective number could be almost as low as $1/s_1 = N_\infty$. This is the case when all shares are equal. It could also be almost as high as $1/s_1^2 = N_0$. This is the case when all other shares are infinitesimally small. With only the limits $1/s_1 < N < 1/s_1^2$ known, the best guess is the geometric mean of $N_\infty$ and $N_0$: $N = 1/s_1^{3/2}$.
Taagepera and Allik (2006) compared these estimates with the actual mean values and found deviations. Similar contrasts can be seen in Table 9.3, as compared to empirical mean seat shares in Table 9.1. When the largest share is small, $1/s_1^{3/2}$ severely exceeds the observed mean N. The observed mean catches up around $s_1 = 0.62$ and surpasses $1/s_1^{3/2}$ when the largest share becomes predominant. The reason for such a discrepancy is that the simple model overstates both conceptual limits.
The old model assumed that the minimal effective number, at given largest share, is reached when all shares are equal. Then $N = 1/s_1$. However, all shares can be equal only when $s_1$ = 1/2, 1/3, 1/4, etc. At any other values, N cannot go as low as $1/s_1$. The difference is marked when $s_1$ is large, and this is why the observed mean N remains larger than $1/s_1^{3/2}$ when the largest share exceeds 62 percent.
At the other extreme, the old model assumed that maximal N at given largest share is reached when the shares of all seat-winning parties but the largest are infinitesimally small. Then $N = 1/s_1^2$. But the other shares cannot be that small. The smallest conceivable nonzero share is one out of the S seats in the assembly. Yet even 1/S is too low. All shares but the largest being 1/S would imply a number of seat-winning parties that most often exceeds by far the mean expectation of $N_0 = 1/s_1^2$ parties winning seats. If we limit the number of seat-winning parties to $N_0 = 1/s_1^2$, the effective number of parties is the highest when all parties apart from the largest are equal. These equal shares amount to a total of $(1 - s_1)$ divided among $(N_0 - 1)$ parties. The resulting maximum N falls short of the previous limit, $1/s_1^2$. The discrepancy is the largest at small $s_1$. This is why the observed mean N remains smaller than $1/s_1^{3/2}$ when the largest share is small. The picture changes when the largest share becomes so large that $N_0 = 1/s_1^2$ projects to fewer than 2 seat-winning parties. Here we partly have to revert to the earlier model.
The important outcome is that the resulting intricate relationship between the effective number of parties and the largest share can be approximated by models that agree with data better than does the previous $N = 1/s_1^{3/2}$. The details of two refined models are given in the next section, and the results are shown in Table 9.3. The two-parameter model

$N = \dfrac{2}{s_1^{0.9}} - 1$   [refined new model]

agrees with actual mean data within ±0.3 parties at all ranges of the largest share. Note that it respects the conceptual anchor point $s_1 = 1 \to N = 1$. The simplified one-parameter model

$N = \dfrac{1}{s_1^{4/3}}$   [approximate new model]

agrees with actual mean data within ±0.3 parties only when the largest share exceeds 27 percent. Still, this is a degree of agreement the previous $N = 1/s_1^{3/2}$ offers only when the largest share exceeds 43 percent (cf. Table 9.3). When the largest share is as small as it is ever observed to be (around 20 percent), even the new approximate model exceeds the observed mean by 1.2 parties, but this is appreciably better than the old model’s excess of 3.9 parties.
One may lose in generality by trying to fit too closely to theoretical boundary conditions which offer a wide permissible zone. So I will use $N = 1/s_1^{4/3}$. Combining it with $s_1 = (MS)^{-1/8}$ leads to a simple format for the mean estimate of the effective number of parties from purely institutional inputs: $N = (MS)^{1/6}$. The effective number of legislative parties is around the sixth root of the seat product. Compared to the old model’s exponent 3/16 = 0.1875, we have shifted to 1/6 = 0.1667. For very large seat products (MS > 3,500), it can be expected to overestimate N by more than 0.3. If this difference matters, $N = 2/s_1^{0.9} - 1$ should be considered instead of $N = 1/s_1^{4/3}$. In most cases, random variation exceeds this level.
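The three estimates compared in Table 9.3 are simple enough to evaluate directly. The following is a minimal sketch, not part of the original text; the function names are hypothetical, and only the formulas themselves come from the chapter.

```python
def n_old(s1: float) -> float:
    """Old model: geometric mean of the limits 1/s1 and 1/s1**2."""
    return 1.0 / s1 ** 1.5


def n_refined(s1: float) -> float:
    """Refined two-parameter model; gives N = 1 at s1 = 1."""
    return 2.0 / s1 ** 0.9 - 1.0


def n_approx(s1: float) -> float:
    """Approximate one-parameter model."""
    return 1.0 / s1 ** (4.0 / 3.0)


def n_from_seat_product(m: float, s: float) -> float:
    """N = (MS)**(1/6), i.e. n_approx combined with s1 = (MS)**(-1/8)."""
    return (m * s) ** (1.0 / 6.0)


if __name__ == "__main__":
    # Largest shares given as fractions, not percentages.
    for pct in (20, 35, 50, 70):
        s1 = pct / 100.0
        print(f"s1={pct}%  old={n_old(s1):.2f}  "
              f"refined={n_refined(s1):.2f}  approx={n_approx(s1):.2f}")
```

Note that the shares must be entered as fractions (0.20, not 20) for the exponents to behave as in the table.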
The effective number of parties and the largest share: Details of the refined model
This section involves tedious calculations that yield a somewhat wiggly mean relationship between the effective number of legislative parties and the largest seat share. I have to present these calculations so as to justify the claim that, with just a slight smoothing of the actual average relationship, $N = 1/s_1^{4/3}$ is the probabilistically expected value of N. The result-oriented reader, however, may wish to bypass this section.
For a given largest share ($s_1$) and number of seat-winning parties ($N_0$), the largest possible value of the effective number of parties (N) corresponds to the other $(N_0 - 1)$ parties having equal shares $(1 - s_1)/(N_0 - 1)$:

$\max N = \dfrac{1}{s_1^2 + (N_0 - 1)(1 - s_1)^2/(N_0 - 1)^2} = \dfrac{1}{s_1^2 + (1 - s_1)^2/(N_0 - 1)}.$

As long as $N_0 = 1/s_1^2$ applies, this expression simplifies into

$\max N = \dfrac{1 + s_1}{2 s_1^2}.$
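The simplification is stated without intermediate steps. One way to fill them in, substituting $N_0 = 1/s_1^2$ into the denominator of the general expression, is sketched below.

```latex
\begin{align*}
N_0 - 1 &= \frac{1 - s_1^2}{s_1^2} = \frac{(1 - s_1)(1 + s_1)}{s_1^2}, \\
\frac{(1 - s_1)^2}{N_0 - 1} &= \frac{s_1^2\,(1 - s_1)}{1 + s_1}, \\
s_1^2 + \frac{(1 - s_1)^2}{N_0 - 1}
  &= s_1^2 \cdot \frac{(1 + s_1) + (1 - s_1)}{1 + s_1}
   = \frac{2 s_1^2}{1 + s_1}, \\
\max N &= \frac{1 + s_1}{2 s_1^2}.
\end{align*}
```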
When the largest share is very small, max N approaches $0.5/s_1^2$. This is appreciably lower than the previous limit $1/s_1^2$.
When $N_0 = 2$, the relationship $N_0 = 1/s_1^2$ no longer applies (cf. Figure 9.2). This $N_0 = 2$ corresponds to $s_1$ larger than $1/(2.5)^{0.5} = 0.63$. Here maximum possible N corresponds to the situation where all parties but the largest have one seat. Using the actual number of seats, there are then $(S - S_1)$ such small parties in an assembly with S seats, and $\max N = S^2/(S_1^2 + S - S_1)$. Since $S_1^2 \gg S - S_1$ even for an assembly as small as 10 seats, this limit reduces itself to the previous $\max N = S^2/S_1^2 = 1/s_1^2$.
The lower limit is even more complex. Assume that assembly size is sufficiently large ($S \gg N_0$), so that the minimum of one seat going to each minor party can be neglected. Actually, this is the case for all assemblies with at least three seats. Indeed, whenever $N_0 = 1/s_1^2$ and $s_1 = (MS)^{-1/8}$ hold, then $S \gg N_0$ amounts to $M \ll S^3$, which is the case whenever S > 3. When $s_1$ = 1/2, 1/3, 1/4, etc., we can have all shares equal and hence $N = 1/s_1$. For intervening values of the largest share, making as many shares as possible equal to the largest minimizes N, leaving the remainder as a smaller party. For $1/2 < s_1 < 1$, no other party can match the largest. The effective number is lowest when the second-largest party is as large as possible, meaning $s_2 = 1 - s_1$. The resulting minimal effective number is

$\min N = \dfrac{1}{1 - 2s_1 + 2s_1^2},$   [$1/2 < s_1 < 1$]

a value larger than $1/s_1$. At $s_1 = 0.65$, $1/s_1 = 1.54$, while $1/(1 - 2s_1 + 2s_1^2) = 1.83$.
For $1/3 < s_1 < 1/2$, the lowest N corresponds to $s_2 = s_1$ and $s_3 = 1 - 2s_1$. Then

$\min N = \dfrac{1}{1 - 4s_1 + 6s_1^2}.$   [$1/3 < s_1 < 1/2$]
For the next ranges which are still of practical interest, the equations are

$\min N = \dfrac{1}{1 - 6s_1 + 12s_1^2},$   [$1/4 < s_1 < 1/3$]

$\min N = \dfrac{1}{1 - 8s_1 + 20s_1^2},$   [$1/5 < s_1 < 1/4$]
where the deviation from $1/s_1$ becomes negligible.
We should try to fit the geometric means of these minimum and maximum values with some simple function of $s_1$. They can be well fitted with a two-parameter format $N = a/s_1^b - a + 1$, which correctly predicts N = 1 when $s_1 = 1$. The best fit is close to $N = 2/s_1^{0.9} - 1$. However, we should also try to fit with the simpler one-parameter format $N = 1/s_1^n$, because that form can be easily combined with $s_1 = (MS)^{-1/8}$ so as to connect N with the seat product MS in a simple way. This is possible, indeed. Depending on whether one wishes to emphasize the fit at lower or higher values of the largest share, the exponent n could be taken as anywhere between 1.30 and 1.45. The choice of the simple fraction n = 4/3 = 1.333 agrees with data (cf. Figure 11.4), although it could run into trouble at very high values of the seat product.
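The piecewise minimum and the maximum described in this section can be written compactly. The sketch below is an illustration under stated assumptions, not the author's procedure: the function names are hypothetical, the minimum is generalized to any range by filling in as many parties of size $s_1$ as fit and one remainder party, and the switch to $1/s_1^2$ above $s_1 = 0.63$ follows the text's remark that the earlier limit applies there. The fitted curves in the text may rest on details not reproduced here.

```python
import math


def max_n(s1: float) -> float:
    """Upper limit on N for a given largest share s1 (a fraction)."""
    if s1 > 1.0 / math.sqrt(2.5):          # N0 = 1/s1**2 no longer applies
        return 1.0 / s1 ** 2
    return (1.0 + s1) / (2.0 * s1 ** 2)


def min_n(s1: float) -> float:
    """Lower limit: as many parties as possible at s1, plus one remainder party."""
    k = math.floor(1.0 / s1 + 1e-9)        # parties that can hold share s1
    remainder = max(0.0, 1.0 - k * s1)     # share left for one smaller party
    return 1.0 / (k * s1 ** 2 + remainder ** 2)


def geometric_mean_n(s1: float) -> float:
    """Geometric mean of the two limits."""
    return math.sqrt(min_n(s1) * max_n(s1))


if __name__ == "__main__":
    for s1 in (0.25, 0.40, 0.50, 0.65):
        print(f"s1={s1:.2f}  min={min_n(s1):.2f}  max={max_n(s1):.2f}  "
              f"geometric mean={geometric_mean_n(s1):.2f}")
```

For $s_1 = 0.4$, for instance, `min_n` reduces to $1/(1 - 4s_1 + 6s_1^2)$, the closed form given for the range between 1/3 and 1/2.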
Entropy-based effective number of legislative parties and the largest share
The same approach can be used to estimate any measure of the number of parties, $N_a = [\sum (s_i)^a]^{1/(1-a)}$, with 0 < a < ∞ (cf. Chapter 4). In particular, the entropy-based $N_1 = e^H$ might be of interest. The exact equations for minimum and maximum values are even more involved than those for $N_2$. The best two-parameter fit is around $N_1 = 4/s_1^{0.67} - 3$. The best one-parameter fit, around $N_1 = 1/s_1^{1.6}$, is very coarse (see Table 9.4).
Table 9.4. Entropy-based effective number of parties

$s_1$ (%)              20.0   25.0   30.0   35.0   40.0   45.0   50.0   55.0   60.0   65.0   70.0
Act. $N_1$             8.60   7.08   6.10   5.03   4.34   3.68   3.02   2.33   2.36   2.30   2.21
$4/s_1^{0.67} - 3$     8.76   7.13   5.96   5.08   4.39   3.83   3.36   2.97   2.63   2.34   2.08
$1/s_1^{1.6}$         13.13   9.19   6.86   5.36   4.33   3.59   3.03   2.60   2.26   1.99   1.77

Source: Calculated from actual mean shares in Table 9.1 and as estimated from two equations.
Note: Deviations of more than 0.3 parties, compared to the actual mean values, are shown in bold.
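For readers who want to compute these measures from a list of seat shares, a minimal sketch follows. The function name and the example shares are hypothetical; the only formulas used are the general $N_a$ defined above and its entropy-based limit at a = 1. Normalizing the shares to sum to 1 is a convenience added here, not something the text prescribes.

```python
import math


def effective_number(shares, a: float = 2.0) -> float:
    """N_a = (sum of s_i**a) ** (1 / (1 - a)); a = 1 is taken as exp(entropy)."""
    total = sum(shares)
    p = [s / total for s in shares if s > 0]   # normalize, drop empty parties
    if abs(a - 1.0) < 1e-12:
        entropy = -sum(x * math.log(x) for x in p)
        return math.exp(entropy)
    return sum(x ** a for x in p) ** (1.0 / (1.0 - a))


if __name__ == "__main__":
    example = [0.4, 0.3, 0.2, 0.1]             # hypothetical seat shares
    print("N2 =", round(effective_number(example, a=2.0), 2))
    print("N1 =", round(effective_number(example, a=1.0), 2))
```

With equal shares every $N_a$ returns the plain number of parties; with unequal shares $N_1$ is at least as large as $N_2$, which is why the values in Table 9.4 run above those in Table 9.3.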
10 The Mean Duration of Cabinets
For the practitioner of politics:
• The quantity to watch is again the seat product: the product of the number of seats in the assembly and the number of seats allocated in the average district. The lower the seat product, the longer the mean duration of government cabinets.
• If you wish to double the mean duration of cabinets, your best bet is to divide the present seat product by 8. To do so, you can cut each present district into 8 smaller districts. You can also cut each present district into 2 smaller districts and cut assembly size by a half.
• This way to calculate is based on a logical model that agrees with the world average. It is approximate, because other factors enter, but this is your best bet.
• A desired increase in cabinet duration does not come free: it reduces representation of small parties, and they will put up a fight.
• If, as a price for reduction in district magnitude, you agree to introduce new complexities in the electoral rules, then all bets are off regarding consequences.
Many political and even economic factors may be affected by the effective number of parties, to judge by the numerous empirical studies that include N. Even when significant, most such relationships remain empirical. For the mean duration of government cabinets, however, we can also establish a predictive model of why the number of parties should affect it, and by how much. This means that a logical connection is established between the mean cabinet duration and the effective number of parties. As the latter, in turn, is connected to the seat product, so will cabinet duration. Thus, mean cabinet duration can be predicted from institutional inputs.
The duration of cabinets can make a difference for the nature and quality of governance. If the regular interelection period is 4 years, then a mean cabinet duration of 4 years implies that cabinets tend to last until the next election, while a mean duration of 2 years implies roughly one cabinet change between two elections. Extremely short-lived cabinets make it hard to formulate and implement policy. At the other extreme, cabinets that last past many elections may favor cronyism and stagnation. So the ability to use institutional means to modify the mean cabinet duration by a specified amount could be of interest to political practitioners. In particular, when new democracies decide on their institutions, duration of cabinets is often among the concerns of decision-makers, foremost in the form of what they do not want: ‘Let us avoid short-lived cabinets like they have in. . . .’
As for the students of politics, Warwick (1994: 139) represents the widespread view that even mildly short-lived cabinets can put regime survival in danger, over the longer run. Dogan disagrees (1989), and Lijphart (1999: 130) puts it bluntly: ‘This view is as wrong as it is widespread.’ Lijphart (1999: 131–9) sees mean cabinet duration as a useful indicator of executive dominance and proceeds to measure it in various ways. To the extent we can explain why some countries have shorter durations than some others, we might also understand various implications of cabinet duration, such as how it affects regime performance.
Nothing will be said here on why some cabinets last longer than some others, within the same country with stable institutions. This is the subject of a rich separate literature, reviewed by Laver (2003), which focuses on bargaining models based on rational choice. As far as the mean duration is concerned, however, bargaining models offer no specific predictions.
Thus the chapter on ‘Party systems and cabinet stability’ in a book by Laver and Shepsle (1996: 195–222) offers ‘two basic conclusions’: Certain bargaining constellations are ‘substantially more stable’, and ‘the model can be used to understand why governments might change tack between elections’ (Laver and Shepsle 1996: 215). They present theory, simulations, and discussion of specific past cases. But if one asks how much duration is expected in a given country, on the average, they offer no answer. The present book does give such an answer, with a 50-50 probability of this prediction being high or low and with an estimate of likely margin of error. This answer is based on the number of players (parties) involved, a number itself grounded in institutions.
The Inverse Square Law of Cabinet Duration, Relative to the Number of Parties
That the number of parties affects cabinet duration was documented by Lijphart (1984: 124–6) several decades ago. Coalition cabinets, prevalent in multiparty systems, tend to be more short-lived than one-party majority cabinets. When Grofman (1989) controlled for the effective number of parties, the effect of cabinet type (minimal winning or larger or smaller) on cabinet duration largely disappeared. The number of parties and cabinet types are strongly correlated, and it is easier to visualize the number of parties imposing cabinet type rather than cabinet type impacting the number of parties.
A logical connection between the number of parties and cabinet duration was presented in Taagepera and Shugart (1989: 99–101). The crucial link is the number of communication channels among the parties, which can also become conflict channels. Hence more parties mean more potential conflicts, which can undo a cabinet. The following model emerges.
We may surmise that cabinets break up, on the average, when a certain amount of conflict has accumulated in the political system. It may look simplistic, but this is actually the only logical guess we could make, in the absence of any further information, short of a sterile ‘We cannot know’. If this assumption is wrong or overly simple, lack of agreement with data will tell us so. To the extent it holds, the mean duration of cabinets (C) can be expected to be inversely proportional to the frequency of conflicts (f): When conflict frequency doubles, duration is halved. More broadly, $C = k'/f$, where $k'$ is a constant. True, it could well be that C is not proportional to 1/f. But if so, would it increase at a more or less than proportional rate? If we cannot answer this question, our best bet is proportionality.
Conflict frequency itself may depend on the number of conflict channels (c) among parties. Would it be more than proportional to c, or less? Not knowing which way it is, the only defensible guess is proportionality: $f = k''c$, where $k''$ is a constant. Of course, some channels are less conflictual than some others, and the degree of conflict varies over time. This is why some individual coalitions last longer than others. But here we are concerned with the mean degree of conflict per channel.
If the assembly has n equal-sized parties, then the number of communication channels among them is $c = n(n - 1)/2$, as one can easily check by drawing connecting lines between points. However, this would underestimate the number of potential conflict channels. Parties are not fully unitary actors. Coalitions sometimes break up because of a coalition partner’s internal conflict. If we add an intraparty conflict channel per party, we would have $c = n(n + 1)/2$, which, in turn, might be an overestimate. The mean of the two estimates is simply $c = n^2/2$. Combining all these links yields $C = k/n^2$, where k is a constant, $k = 2k'/k''$. Thus the model predicts an inverse square relationship, leaving k to be determined empirically. Since n is a pure number, k must have the same time units as cabinet duration itself. We will measure C and k in years. Actually, all parties are rarely equal-sized. In such cases we will assume that the effective number of parties (N) is to be used:

$C = \dfrac{k}{N^2}.$
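The channel-counting argument above can be checked by brute force. The sketch below is illustrative only; the function names are hypothetical, and the default k = 42 years is the world-mean estimate reported in this chapter, used here purely as a placeholder.

```python
from itertools import combinations


def interparty_channels(n: int) -> int:
    """Count pairs of distinct parties: n(n - 1)/2."""
    return len(list(combinations(range(n), 2)))


def channels_with_intraparty(n: int) -> int:
    """Add one internal conflict channel per party: n(n + 1)/2."""
    return interparty_channels(n) + n


def mean_channels(n: int) -> float:
    """Mean of the two estimates, which equals n**2 / 2."""
    return (interparty_channels(n) + channels_with_intraparty(n)) / 2


def cabinet_duration(n_eff: float, k_years: float = 42.0) -> float:
    """Mean cabinet duration C = k / N**2, in years."""
    return k_years / n_eff ** 2


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(n, interparty_channels(n), channels_with_intraparty(n),
              mean_channels(n), round(cabinet_duration(n), 1))
```

Doubling the number of parties quadruples the mean channel count, which is the source of the inverse square in the duration formula.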
The choice of the effective number of parties can be disputed, and should be. First, let us see where it leads us. If using N is erroneous, then the presumed relationship will not be observed.
Using Lijphart’s data (1984) for stable democracies, Taagepera and Shugart (1989) found that $C = 400 \text{ months}/N^2 = 33 \text{ years}/N^2$ predicts mean cabinet duration within a factor of 2, meaning that the observed values are within a zone that extends from a half to double the predicted value. A journal reviewer once mistook ‘within a factor of 2’ to mean within ±2 years. So it is worth stressing: if C = 1 year, then ‘within a factor of 2’ means from 0.5 to 2 years; but if C = 10 years, then it means from 5 to 20 years. Then the logarithm of C is within ±log 2 of the predicted value. More extensive data (from Lijphart 1999) leads to $C = 39 \text{ years}/N^2$ (Taagepera 2003). A reanalysis with slightly corrected data (Taagepera and Sikk 2007) puts the best fit with the format $C = k/N^2$ at

$C = \dfrac{42 \text{ years}}{N^2}.$
The measure of cabinet duration used was the one devised by Dodd (1976) and designated as ‘average cabinet life I’ by Lijphart (1999: 132–3).
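Because ‘within a factor of 2’ is a multiplicative band rather than an additive one, it is easy to misread. A small sketch of the criterion follows; the function names are hypothetical, and the sample numbers are the ones used in the text’s own illustration.

```python
import math


def within_factor(observed: float, predicted: float, factor: float = 2.0) -> bool:
    """True if observed lies between predicted/factor and predicted*factor."""
    return predicted / factor <= observed <= predicted * factor


def log_deviation(observed: float, predicted: float) -> float:
    """Deviation on a base-10 logarithmic scale; the band is +/- log10(factor)."""
    return math.log10(observed / predicted)


if __name__ == "__main__":
    # Predicted C = 10 years: the band runs from 5 to 20 years, not 8 to 12.
    for observed in (4.0, 5.0, 12.0, 20.0, 21.0):
        print(observed, within_factor(observed, 10.0),
              round(log_deviation(observed, 10.0), 3))
```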
[Figure 10.1: log-log plot of mean cabinet duration C (years, vertical axis) against the effective number of legislative parties N (horizontal axis). Lines shown: observed best fit $C = 31.3/N^{1.757}$ ($R^2 = 0.787$) and theoretical model $C = 42/N^2$ ($R^2 = 0.770$). Labeled outlier: MRT.]
Figure 10.1. Mean cabinet duration vs. effective number of legislative parties: predictive model and regression line. Source: Taagepera and Sikk (2007).
Notes: Thin solid line: best-fit between logarithms. Bold solid line: theoretically based prediction [$C = 42 \text{ years}/N^2$]. Dashed lines: a half and double the expected value.
The model implies that the product of $N^2$ and cabinet duration is conserved: $N^2 C = k$. The mean C and N for every country yield a different value of duration constant k. Estimates of the world mean have moved from 33 to 42 years. When calculated on the basis of individual country data for the 36 stable democracies in Lijphart (1999), the distribution around the mean of 42 years is roughly normal, with a standard deviation of 14 years. The lowest individual value is 16 years, and the highest is 72 years, except for Switzerland at 445 years! Being more than 3 standard deviations off justifies the exclusion of Switzerland from the test set. Switzerland is the only non-presidential country where the executive, once empowered by parliament, does not depend on legislative confidence (Lijphart 1999: 119–29). Thus some key assumptions of the inverse square model may not apply to such regimes.
Figure 10.1 (from Taagepera and Sikk 2007) shows cabinet duration for the remaining 35 democracies in Lijphart (1999) graphed against the effective number of parties, both on logarithmic scales, so that the inverse square relationship becomes a straight line. The best linear fit of logarithms corresponds to

$C = \dfrac{31.3 \text{ years}}{N^{1.76}}.$   [observed best fit, $R^2 = 0.79$]

When the exponent 2 is imposed, the best fit is

$C = \dfrac{42 \text{ years}}{N^2}.$   [theoretical model, $R^2 = 0.77$]
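The two fits differ only in whether the exponent is estimated or imposed. A minimal sketch of both procedures, on logarithmic scales, is given below. It is an assumption about how such fits are typically computed (ordinary least squares on logarithms), not a reproduction of the analysis in Taagepera and Sikk (2007); the function names are hypothetical and the data in the demonstration are synthetic, generated from the model itself.

```python
import math


def fit_power_law(n_values, c_values):
    """Least-squares line through (log N, log C); returns (k, exponent) in C = k / N**exponent."""
    x = [math.log(n) for n in n_values]
    y = [math.log(c) for c in c_values]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def fit_with_imposed_exponent(n_values, c_values, exponent: float = 2.0) -> float:
    """Best k in C = k / N**exponent when the exponent is fixed (least squares in logs)."""
    logs = [math.log(c) + exponent * math.log(n)
            for n, c in zip(n_values, c_values)]
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    # Synthetic illustration only: points placed exactly on C = 42 / N**2.
    ns = [1.5, 2.0, 3.0, 4.0, 5.0]
    cs = [42.0 / n ** 2 for n in ns]
    print(fit_power_law(ns, cs))
    print(fit_with_imposed_exponent(ns, cs))
```

With the exponent imposed, the fitted k is simply the geometric mean of the individual values of $C N^2$.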
The two lines almost superimpose, and $R^2$ is barely reduced, so the model visibly fits. In view of the existence of a logical model and its empirical confirmation, we have here a law, in the scientific sense of the term: the inverse square law of cabinet duration, relative to the number of parties.
It might seem more appropriate to consider only the communication channels within the coalition. Why would stresses among the parties outside the coalition shorten coalition duration? This question is well founded, but unpublished work by Lijphart and me (reported in Taagepera and Shugart 1989: 101–2) yields a surprising result: There is poor correlation between the mean coalition duration and the number of parties in the coalition itself. Parties excluded from the coalition still seem to have ways to affect its duration.
It remains to account for the deviation from $C = 42 \text{ years}/N^2$ for individual countries. Here the cabinet types may enter separately from the number of parties. Also, balance in the size of parties may play a role because, at the same effective number, party systems with a dominant party might be more durable. Inclusion of centrist parties also might contribute to duration (van Roozendaal 1992), at the same effective number of parties.
Is there an Inverse Cube Law of Cabinet Duration, Relative to the Seat Product?
Chapter 9 connected the effective number of parties to the seat product MS: $N = (MS)^{1/6}$, on the average. It follows (as first pointed out in Taagepera and Sikk 2007) that cabinet duration has an inverse cube relationship to the seat product, on the average:

$C = \dfrac{42 \text{ years}}{(MS)^{1/3}}.$
Out of the 35 democracies tested in Figure 10.1, district magnitude cannot be specified for 10 countries, because not all seats are allocated within the districts, or further features are introduced, such as large legal thresholds. Figure 10.2 (from Taagepera and Sikk 2007) shows cabinet duration for the remaining 25 democracies graphed against the seat product, both on logarithmic scales, so that the inverse cube relationship becomes a straight line. See data in Appendix to the book. The best linear fit of logarithms corresponds to

$C = \dfrac{21.8 \text{ years}}{(MS)^{0.233}}.$   [observed best fit, $R^2 = 0.30$]

When the exponent 1/3 is imposed, the best fit is

$C = \dfrac{42 \text{ years}}{(MS)^{0.333}}.$   [theoretical model, $R^2 = 0.24$]

[Figure 10.2: log-log plot of mean cabinet duration C (years, vertical axis) against the seat product MS (horizontal axis, 1 to 100,000). Lines shown: theoretical model $C = 42/(MS)^{1/3}$ ($R^2 = 0.240$) and observed best fit $C = 21.848/(MS)^{0.233}$ ($R^2 = 0.300$). Labeled points: BOT, SPA, NED, MRT, PNG, IND.]
Figure 10.2. Mean cabinet duration vs. seat product MS: predictive model and regression line. Source: Taagepera and Sikk (2007).
Notes: Thin straight line: best-fit between logarithms. Bold straight line: theoretically based prediction [$C = 42 \text{ years}/(MS)^{1/3}$]. Dashed lines: a half and double the expected value.
Thus, in this case, the expected slope exponent is appreciably above the actual, and $R^2$ is reduced. In model building, we are now 4 steps removed from the seat product: $MS \to N_0 \to s_1 \to N \to C$. Agreement with the model is bound to decrease with each extra step, because further political, cultural, historical, and institutional factors can enter, and random variation also accumulates. Consequently, $R^2$ for the best-fit line (for logarithms) can be expected to decrease, and does so indeed, as we go from C versus N ($R^2 = 0.79$) to C versus $s_1$ ($R^2 = 0.53$) and to C versus MS ($R^2 = 0.30$). For the predicted line, the decrease in $R^2$ is even steeper (Taagepera and Sikk 2007), from 0.77 to 0.35 and 0.24, respectively.
Such a decrease in $R^2$ would be of little concern, as long as the predicted exponent (slope of the log-log graph) agrees with the model. Actually, as one successively regresses C against N, $s_1$, and MS, the empirical slopes increasingly fall short of the expected: by 12 percent (slope 1.76 compared to 2.00), 20 percent (2.14 to 2.67) and 30 percent (0.233 to 0.333), respectively. Part of this increasing discrepancy is an artifact of one-directional regressing, as discussed in Beyond Regression (Taagepera 2008). Still, quite a few cases in Figure 10.2 differ from expectation by more than a factor of 2 (Botswana and Spain on the high side, and Mauritius and Papua New Guinea on the low side). More data are needed to clarify the issue, and possibly a more refined definition of cabinet duration.
Taagepera and Sikk (2007) point out the problem of Mauritius, which deviates markedly from the expected pattern in Figures 10.1 and 10.2. The same prime minister stayed in office for 14 years but juggled the party composition 7 times, in quite minor ways. By the Dodd (1976) counting rules, this led to a mean cabinet duration of only 2.1 years, which feels low in the face of such a long tenure by the same prime minister. Maybe the way to measure cabinet duration should be altered in the light of this anomaly.
A general inverse cube law of cabinet duration, relative to the seat product, may well exist, as a universal average, but for the moment it is a possibility rather than certainty. We may either tentatively accept it, as basis of uncertain prediction, or we would have to abstain from prediction altogether. Note that, in 19 cases of 25, $C = 42 \text{ years}/(MS)^{1/3}$ still predicts mean cabinet duration within a factor of 2. Moreover, the 6 widely deviant cases are spread evenly in Figure 10.2: Botswana, Spain, and the Netherlands are above the predicted zone, while Mauritius, Papua New Guinea, and India are below.
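The expected slopes quoted above (2.00, 2.67, and 0.333) are not derived in one place. Assuming the chapter's own relationships $C = k/N^2$, $N = 1/s_1^{4/3}$, and $s_1 = (MS)^{-1/8}$, the chain and the quoted shortfalls can be sketched as follows.

```latex
\begin{align*}
C &= \frac{k}{N^{2}}
   && \text{(expected slope against } N\text{: } 2.00\text{)} \\
N = s_1^{-4/3} \;\Rightarrow\; C &= k\, s_1^{8/3}
   && \text{(expected slope against } s_1\text{: } 8/3 \approx 2.67\text{)} \\
s_1 = (MS)^{-1/8} \;\Rightarrow\; C &= \frac{k}{(MS)^{1/3}}
   && \text{(expected slope against } MS\text{: } 1/3 \approx 0.333\text{)} \\[4pt]
1 - \frac{1.76}{2.00} &= 0.12, \qquad
1 - \frac{2.14}{2.67} \approx 0.20, \qquad
1 - \frac{0.233}{0.333} \approx 0.30 .
\end{align*}
```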
Conclusions and Implications for Institutional Engineering
Once more, we have reached a remarkably simple connection to institutions, this time for a highly visible feature in politics: how often do governmental cabinets change, on the average. Mean cabinet duration tends to relate to the seat product as $C = 42 \text{ years}/(MS)^{1/3}$ not just empirically but for well-defined theoretical reasons. This equation is our best bet in institutional engineering, but it should be used with some good sense.
The regular interelection period is around 4 years in most countries. If so, then a mean duration of 2 years implies roughly one cabinet crisis between two elections, while a mean cabinet duration of 4 years implies that cabinets tend to last until the next election. We can now say that such a doubling of mean duration corresponds to cutting the electoral system magnitude MS by a factor of 2 cubed, meaning by 8.
To double the mean duration, one might split each existing district into 8 smaller ones so as to reduce district magnitude. Alternatively, one might split each district into 2 smaller districts, while also reducing the total assembly size by one-half. The latter step alone would cut the magnitude of existing districts by a half. With existing districts split, district magnitude is only one-quarter of the previous. For example, assume an assembly of S = 128 is elected in 16 districts of M = 8, so that MS = 1,024. Reducing the assembly to S = 64 seats and splitting the districts into two would result in 32 districts with M = 64/32 = 2. Now MS = 128, which is 1/8 of 1,024.
Note that the duration constant k did not enter in the example above. In comparing the relative effect (doubling) of two combinations of M and S in the same country, the constant k cancels out. This is how one should always proceed in institutional engineering, if the country has a previous democratic record. If it does not have such a record, the value of k in $C = k/N^2$ could be estimated from the records of democratic neighbors with approximately the same sociopolitical characteristics. Applying the worldwide mean of k = 42 years should be the last resort. It carries a large margin of uncertainty.
A desired increase in cabinet duration does not come free: it also alters party constellation. If cabinet duration is to be doubled, MS is to be divided by 8. In whatever way assembly size and district magnitude are juggled to that end, the effective number of parties, $N = (MS)^{1/6}$, must be expected to go down by a factor of about $8^{1/6} = 1.41$. The largest share $s_1 = (MS)^{-1/8}$ is likely to increase by a factor of $8^{1/8} = 1.30$, that is, by 30 percent. The number of seat-winning parties, $N_0 = (MS)^{1/4}$, is likely to be reduced by a factor of $8^{1/4} = 1.68$, that is, by 40 percent.
If the existing assembly has 10 parties represented, only 10/1.68 = 6 are likely to remain, and the smallest of them are likely to have their shares of seats reduced. They and the four parties completely excluded will put up a fight. If they cannot block the reduction in district magnitude, they will look for ways to counterbalance it by introducing various complexities in the electoral rules. Think Italy in the 1990s. Once the electoral rules are made complex, the resulting impact on the number of parties and cabinet duration becomes unpredictable.
Changes in electoral rules are never easy and painless, nor should they be. I am not here in the business of pointing out tricks of how to pull it off. All I can do is to say: If you manage to change the seat product MS in a simple electoral system by a certain amount, up or down, then specific degrees of change in party constellation and cabinet duration are the most likely to follow, within a large but specified margin of error.
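The worked example in this section (S = 128 with M = 8, reduced to S = 64 with M = 2) can be reproduced with a few lines. This is a sketch under the chapter's mean relationships only; the function name and dictionary keys are hypothetical, and the duration constant k is deliberately absent because it cancels out of relative comparisons.

```python
def predicted_changes(m_old: float, s_old: float, m_new: float, s_new: float) -> dict:
    """Multiplicative changes expected, on average, when MS changes."""
    ratio = (m_old * s_old) / (m_new * s_new)      # factor by which MS is cut
    return {
        "seat_product_cut": ratio,
        "duration_factor": ratio ** (1 / 3),       # C is proportional to (MS)**(-1/3)
        "effective_parties_factor": ratio ** (-1 / 6),   # N = (MS)**(1/6)
        "largest_share_factor": ratio ** (1 / 8),        # s1 = (MS)**(-1/8)
        "seat_winning_parties_factor": ratio ** (-1 / 4),  # N0 = (MS)**(1/4)
    }


if __name__ == "__main__":
    changes = predicted_changes(m_old=8, s_old=128, m_new=2, s_new=64)
    for name, value in changes.items():
        print(f"{name}: {value:.3f}")
    # With 10 seat-winning parties before the reform:
    print("parties likely to remain:",
          round(10 * changes["seat_winning_parties_factor"], 1))
```

A factor below 1 means a reduction: for instance, an effective-parties factor of about 0.71 corresponds to the text's division by 1.41.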
Appendix to Chapter 10
What determines the value of the duration constant in $C = k/N^2$?
Over time, the mean estimates of duration constant k in $C = k/N^2$ have shifted from 33 to 42 years. Both values are in the ballpark of the length of a full political career. Extended to N = 1, it would suggest that one-party democracies would still undergo change in partisan composition of cabinet every 40 years or so. Given that one-party democracy is unusual, Taagepera and Shugart (1989: 101) approach the issue through an ideal two-party system (N = 2). If the tendency to ‘throw the rascals out’ is balanced by the increase in resources that incumbency brings, then the party in power has a 50 percent chance to win the next elections, a 25 percent chance to win the two next elections, and so on. Assume no early dissolution. Then the probabilities of a cabinet lasting 1, 2, 3, etc. interelection periods (p) are 1, 1/2, 1/4, 1/8, etc., which add up to C = 2p. If p is 4 years, then the average duration of the cabinet in an ideal two-party system (N = 2) would amount to 8 years. Then $k = C N^2 = 32$ years. Larger values of k would suggest that incumbent resources more than outweigh incumbent unpopularity.
If we take $C = k/N^2$ dead seriously and use individual country values of C and N to calculate the constant k, the results range from 16 years for Mauritius 1976–97 (a methodologically problematic case) and 24 years for Greece 1974–2004, to 71 years for the Netherlands 1946–2002 and 72 years for Botswana 1965–2004 (with its nearly one-party democracy). These figures represent cabinet durations ‘normalized’ for the effect of N, as physicists would put it, or ‘controlled’ for N, as some social scientists put it. This means that the impact of the number of parties has been removed, so that the variation in k is either random or due to some other factors. Which other factors might affect mean cabinet duration?
With the same number of parties, it may be presumed that polities with a more transparent political culture have longer lasting cabinets. Indeed, rough calculations by Allan Sikk and me suggest that k is higher in countries with less corruption and higher selfexpression scores on Inglehart’s (1997) scale. Predictive models for this aspect remain to be worked out. Could the value of k have something to do with Lijphart’s ‘executives-parties’ (1999) dimension? The latter distinguishes between consensual and majoritarian systems. With the same number of parties, consensual polities might be expected to have longer lasting cabinets. However, when graphing k versus executives-parties scores for 1945–96 (from Lijphart 1999: 312), hardly any trend can be seen. The
geometric mean of the 7 most majoritarian cases (scores below −1.0) is $k = 39.4$ years. For the 7 most consensual cases (scores above +1.0), it is $k = 46.5$ years. The difference, though in the expected direction, is not significant. The catch here is that the number of parties is not the same. Having a PR electoral system and a large number of parties are major criteria for Lijphart rating a country consensual in the first place. Most of the consensual steam goes into raising the number of parties, which shortens cabinet life, rather than into raising the duration constant, which lengthens cabinet life. Hence consensus systems, despite a possibly higher duration constant, tend to have shorter-lived cabinets than majoritarian ones.
Which came first, the model or the facts?

Science consists of interaction between mental constructs and data. In which order do they tend to come? Do we first construct the logical model of how things should be and then gather data to test it? Or do we first gather data on how things are, graph them and then look for patterns that beg for a logical explanation? Cabinet duration offers examples of both. The possible connection between the number of parties and cabinet duration was so direct that it made sense to graph $C$ versus $N$. For broad reasons explained in Beyond Regression (Taagepera 2008), both would be graphed on logarithmic scales. Once this was done, the slope was so blatantly close to −2 that one had to suspect an inverse square relationship even before looking for a logical model.

The reverse direction applies to the cube root law. The path that leads from the seat product to cabinet duration is so indirect that one hardly would have the idea to graph $C$ against $MS$ just for the heck of it. And if someone did, the empirical slope would not have led toward an explanation in terms of a cube root, because the data are too scattered. Here the model $C = k/(MS)^{1/3}$ definitely came first, resulting from the combination of $C = k/N^2$ and $N = (MS)^{1/6}$, and testing with data followed.

So it can start with either the egg or the hen, depending on circumstances. Actually, it is a repeated interaction. The idea that the number of parties might affect cabinet duration was already a directional model, albeit not yet quantitative. And testing $C$ versus $MS$ might lead to discrepancies that make one modify the model.
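The combination of the two relationships named above can be written out in one step. This only restates the formulas given in the text and adds no new assumption:

```latex
\begin{align*}
C &= \frac{k}{N^{2}}, \qquad N = (MS)^{1/6} \\
\Rightarrow\quad N^{2} &= (MS)^{2/6} = (MS)^{1/3} \\
\Rightarrow\quad C &= \frac{k}{(MS)^{1/3}}
\end{align*}
```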
11 How to Simplify Complex Electoral Systems
For the practitioner of politics:
• Resistance to simplifying a complex electoral system is least when the existing number and size of parties are not altered.
• The existing effective number of parties is most likely to be maintained when the district magnitude used in the new simple electoral system is taken as the sixth power of the effective number of parties, divided by assembly size.
• The task remains risky, especially when one party is very large and the others very small. The devil is in the details. But the level of details can be addressed only once effective magnitude lays out the broad picture.
The stated goal of this book is predicting party sizes on the basis of electoral systems. For simple electoral systems, this task is now completed for parties-in-the-assembly. With only two basic characteristics, assembly size and district magnitude, we can specify the likeliest number and size distribution of parties in the assembly, and we get some agreement with actual data. We can even extend this approach to a major output of assembly politics: stability of government. Most surprising, the two basic characteristics play a symmetrical role (as long as multi-seat plurality is avoided and the differences among the various PR formulas are overlooked). They fuse into a single characteristic, the seat product, which alone predicts the average party constellation for a given simple electoral system. Predicting parties-in-the-assembly completes a half of the macro-Duvergerian agenda: the central and lower left parts of the scheme
in Figure 7.2. The remaining major task is to predict parties-in-the-electorate (the lower right corner in Figure 7.2). This objective also involves deviation from PR, at the intersection of seats and votes. Two side issues also remain: Where do electoral systems come from? What about the nonsimple systems? All too many of the actual electoral systems are complex. What can we predict about their impact on party systems? This chapter addresses the complex systems. The next one deals with population, which strongly determines assembly size and may affect party politics in other ways, too. The determinants of the other component of the seat product, district magnitude, remain an open question.
The Notion of Output-Based Effective Magnitude

Effective magnitude can be approached from two opposite directions. One can start from actual electoral laws in a complex electoral system and skillfully try to evaluate their likely effect on electoral outcomes, as compared to the effect of a given district magnitude in a simple electoral system. This path was followed by Taagepera and Shugart (1989) and, in the format of effective threshold, by Lijphart (1994). The chapter appendix describes the evolution of this approach.

Alternatively, one can take at face value some output of the electoral system, such as the effective number of parties, and calculate which simple electoral system would be expected to produce it, on the average. This is what is done here. The mean relationship between $MS$ and the effective number of parties was established as $N = (MS)^{1/6}$. Reversing it yields

$$M_{\text{eff}} = \frac{N^6}{S}. \qquad [N^6/S \geq 1]$$

This approach applies as long as it turns out that $M_{\text{eff}} \geq 1$. If not, we will have to assume seat allocation by plurality, as will be explained soon.

Due to the large exponent 6 of $N$ in the effective magnitude formula, even small random variations in the number of parties would be magnified. Suppose a small assembly of 60 seats elected by FPTP is observed to have $N = 2.0$. It would lead to an estimate of $M_{\text{eff}} = 1.07$, which largely agrees with the actual $M = 1$. But a minor and possibly random shift of 10 percent, from $N = 2.0$ to $N = 2.2$, would yield $M_{\text{eff}} = 1.89$, suggesting PR in 2-seat districts. In the opposite direction, $N = 1.8$ would yield $M_{\text{eff}} = 0.567$, much less than 1. How should this outcome be interpreted?
We no longer can assume a simple electoral system, meaning List PR or FPTP, but have to keep in mind the full form of the seat product, $M^F S$, which allows for multi-seat plurality. If $N^6/S$ turns out less than 1, we have to interpret it as use of multi-seat plurality, meaning $F = -1$. Now $N^6/S$ is reversed into $S/N^6$:

$$M_{\text{eff}} = \frac{S}{N^6}, \text{ plurality.} \qquad [N^6/S < 1]$$

In the example above, this means $M_{\text{eff}} = 1/0.567 = 1.76$. It rounds off to 2 and thus suggests plurality in 2-seat districts, if anything. The general expression is $M_{\text{eff}} = (N^6/S)^{1/F}$. If $F$ is restricted to values +1 and −1, one could hesitatingly simplify it to

$$M_{\text{eff}} = \left(\frac{N^6}{S}\right)^{F},$$

where using $F = -1$ implies allocating seats by plurality rule.

The example above indicates that a mere 10 percent random deviation in the effective number of parties will alter the effective magnitude by a factor of almost 2, because $1.1^6 = 1.77$. An electoral system with 1-seat districts can easily seem to have 2-seat districts, with either PR or plurality. Also, a PR system with $M = 10$ can seem to have 5- or 20-seat districts, judging by the value of the effective number of parties. Only larger deviations might need further explanation.
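The rule described above, including the switch to the plurality interpretation when $N^6/S$ falls below 1, can be sketched in a few lines. The function name and the returned (value, F) pair are illustrative choices, not notation from the book.

```python
def effective_magnitude(n_eff, seats):
    """Output-based effective magnitude from N and S.

    Returns (M_eff, F): F = +1 when N^6/S >= 1 (PR, or FPTP when M_eff is
    about 1), and F = -1 when N^6/S < 1, read as multi-seat plurality.
    """
    ratio = n_eff ** 6 / seats
    if ratio >= 1:
        return ratio, +1
    return 1 / ratio, -1


# The 60-seat example from the text, with N = 2.0, 2.2 and 1.8.
for n in (2.0, 2.2, 1.8):
    m_eff, f = effective_magnitude(n, 60)
    print(n, round(m_eff, 2), f)
# 2.0 1.07 1
# 2.2 1.89 1
# 1.8 1.76 -1
```

The spread from 1.07 to 1.89 for a 10 percent change in N shows the sensitivity that the sixth power introduces.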
Output-Based Effective Magnitude for Systems of Known District Magnitude

Effective magnitude will first be calculated for relatively simple electoral systems where district magnitude should play the major role in determining the effective number of parties. This way, we can see how closely the formula reproduces district magnitude. The actual systems used for testing $N = (MS)^{1/6}$ still have complexities such as unequal district magnitudes and second rounds. Therefore, we should expect discrepancies between the calculated and actual district magnitudes. If certain degrees of disagreement with the actual district magnitude correspond to certain types of complexities, then we might have a way to measure the impact of such complexities. In this light, the general formula $M_{\text{eff}} = (N^6/S)^F$ is now applied to those 26 out of Lijphart’s 36 stable democracies (1999) in which all seats are distributed within districts. None use multi-seat plurality. Table 11.1
Table 11.1. Actual district magnitudes (M) and effective magnitudes derived from $M_{\text{eff}} = (N^6/S)^F$, for stable democracies with relatively simple electoral systems in 1945–1996.

| Country | M | S | MS | N | $M_{\text{eff}} = (N^6/S)^F$ | Comments |
|---|---|---|---|---|---|---|
| Barbados | 1 | 26 | 26 | 1.76 | 1 | |
| Trinidad | 1 | 36 | 36 | 1.82 | 1 | |
| Botswana | 1 | 37 | 37 | 1.35 | **6 Plur.** | One ethnic party hegemony |
| Bahamas | 1 | 42 | 42 | 1.68 | 2 Plur. | |
| Jamaica | 1 | 55 | 55 | 1.62 | **3 Plur.** | No explanation |
| Mauritius | 1 | 68 | 68 | 2.71 | ***6*** | Ethnic parties? |
| New Zealand | 1 | 85 | 85 | 1.96 | 1 | |
| Papua-NG | 1 | 108 | 108 | 5.98 | ***423*** | Extremely local ethnic parties |
| Australia | 1 | 128 | 128 | 2.22 | 1 | |
| Canada | 1 | 270 | 270 | 2.37 | 1 | |
| USA | 1 | 435 | 435 | 2.40 | 2 | |
| France | 1 | 508 | 508 | 3.43 | ***3*** | Two-Rounds |
| India | 1 | 542 | 542 | 4.11 | ***9*** | Ethnic-regional parties |
| UK | 1 | 635 | 635 | 2.11 | **7 Plur.** | No explanation |
| Malta | 5.0 | 59 | 294 | 1.99 | **1** | No explanation |
| Costa Rica | 7.8 | 55 | 426 | 2.41 | 4 | |
| Ireland | 3.5 | 154 | 538 | 2.84 | 3 | |
| Luxembourg | 14.2 | 57 | 809 | 3.36 | 25 | |
| Norway | 7.7 | 154 | 1190 | 3.35 | 9 | |
| Japan | 4.0 | 486 | 1940 | 3.71 | 5 | |
| Spain | 6.7 | 350 | 2330 | 2.76 | **1** | Despite ethnic parties and uneven M |
| Portugal | 11.3 | 249 | 2810 | 3.33 | 6 | |
| Finland | 14 | 200 | 2940 | 5.03 | ***81*** | Uneven M and local alliances |
| Switzerland | 25 | 197 | 4920 | 5.24 | ***104*** | Panachage and cumulation |
| Israel | 120 | 120 | 14,400 | 4.55 | 74 | |
| Netherlands | 140 | 140 | 20,000 | 4.65 | 72 | |

Sources: Lijphart (1999: 76–7) for effective numbers of parties (N), various sources for assembly sizes (S), and actual magnitudes (M).
presents these countries in the order of increasing $MS$, except for keeping the $M = 1$ systems separate. The calculated values of $M_{\text{eff}}$ are rounded off to the nearest integer, with apparent use of multi-seat plurality indicated. In 5 cases of 25, the effective district magnitude is less than the actual by more than a factor of 2. These are shown in bold. The 6 contrary cases, where the effective district magnitude exceeds the actual by more than a factor of 2, are shown in bold italics. Some single-seat systems have so few parties that one might think they have multi-seat plurality, if the effective number of parties were the only information on hand. This is the case for Jamaica, Botswana, and, most strongly, for the UK. One would think the UK had plurality rule in 7-seat districts rather than FPTP, and no institutional explanation is in sight. Some other single-seat systems have so many parties that one might think they have
PR in multi-seat districts, if the effective number of parties were the only information on hand. In France, Two-Rounds may give smaller parties an entry point. In India and Mauritius, local ethnic parties might be seen as a factor, but in this case Canada’s marked local variety should also show up as an unusually high effective magnitude, yet it does not. The truly extraordinary case is Papua New Guinea, where fractionalization exceeds what would be expected if PR were used in a single nationwide district! Indeed, its effective magnitude surpasses by far its assembly size. No strategic coordination whatsoever seems to take place among the various tribes.

On the PR side, Malta and Spain look as if they had single-seat districts. It cannot be pinned on Malta’s use of STV, because Ireland fits almost perfectly ($M_{\text{eff}} = 3.4$ vs. actual $M = 3.5$). In Spain, the low $M_{\text{eff}}$ occurs despite a wide variation in district magnitudes that would push precisely in the opposite direction. Nor can it be due to ethnic parties as such, because these, too, tend to increase the effective number of parties (cf. India). In the opposite direction, Finland and Switzerland look as if these countries consisted of 2 to 3 huge districts. Uneven $M$ and local alliances help small parties in Finland, and panachage and cumulation may do the same in Switzerland, but their effect could hardly be that large. Historical path dependence might have to be invoked.

Could the effect of possible random variation in the effective number of parties be mitigated by basing the estimate of effective magnitude on several outputs, including the largest party’s seat share and cabinet duration? This can readily be done, converting the previous relationships into the forms $M_{\text{eff}} = 1/(s_1^8 S)$ and $M_{\text{eff}} = (42\ \text{years}/C)^3/S$, respectively. Neither offers more agreement with the actual magnitude than $M_{\text{eff}} = N^6/S$ does. The geometric mean of the three approaches does no better than $M_{\text{eff}} = N^6/S$ alone.
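The three output-based estimates mentioned above can be placed side by side. This is a sketch for the PR case only (no plurality reversal); the function names are illustrative, and the input values are hypothetical numbers constructed to be self-consistent with a seat product of 8 × 125 = 1,000, not data from any country.

```python
def meff_from_parties(n_eff, seats):
    """M_eff = N^6 / S."""
    return n_eff ** 6 / seats


def meff_from_largest_share(s1, seats):
    """M_eff = 1 / (s1^8 * S), with s1 the largest party's seat share."""
    return 1 / (s1 ** 8 * seats)


def meff_from_cabinet_duration(c_years, seats, k_years=42.0):
    """M_eff = (k / C)^3 / S, with k = 42 years as in the text."""
    return (k_years / c_years) ** 3 / seats


def geometric_mean(values):
    product = 1.0
    for v in values:
        product *= v
    return product ** (1 / len(values))


# Hypothetical, internally consistent outputs for M = 8, S = 125 (MS = 1000).
seats = 125
n_eff = 1000 ** (1 / 6)          # about 3.16 effective parties
s1 = 1000 ** (-1 / 8)            # largest seat share, about 0.42
c_years = 42 / 1000 ** (1 / 3)   # mean cabinet duration, 4.2 years

estimates = [
    meff_from_parties(n_eff, seats),
    meff_from_largest_share(s1, seats),
    meff_from_cabinet_duration(c_years, seats),
]
print([round(e, 2) for e in estimates], round(geometric_mean(estimates), 2))
```

For such idealized inputs all three estimates coincide at about 8. With real data they diverge, which is the point made in the text.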
Many party systems appear strongly affected by other factors besides the seat product, but the nature of these factors is not readily visible.
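As a rough cross-check of a few rows of Table 11.1, the sketch below recomputes $N^6/S$ and compares it with the actual district magnitude. The factor-of-2 classification applied to the unrounded ratio is an assumption made here for illustration (the book appears to work with rounded values); the M, S, and N inputs are taken from the table.

```python
def raw_ratio(n_eff, seats):
    """N^6 / S, the quantity behind the output-based effective magnitude."""
    return n_eff ** 6 / seats


def compare_with_actual(n_eff, seats, actual_m):
    """Classify N^6/S against the actual M, using a factor-of-2 band."""
    ratio = raw_ratio(n_eff, seats) / actual_m
    if ratio > 2:
        return "above"
    if ratio < 0.5:
        return "below"
    return "within"


# (country, actual M, S, N) as listed in Table 11.1.
rows = [
    ("UK", 1, 635, 2.11),
    ("Papua-NG", 1, 108, 5.98),
    ("Ireland", 3.5, 154, 2.84),
    ("Finland", 14, 200, 5.03),
    ("Spain", 6.7, 350, 2.76),
]
for country, m, s, n in rows:
    print(country, round(raw_ratio(n, s), 2), compare_with_actual(n, s, m))
```

A ratio below 1, as for the UK, corresponds to the plurality reading, where the reported effective magnitude is the reciprocal.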
Output-Based Effective Magnitudes for Complex Systems

We may now proceed to cases where other features override district magnitude, so that the latter can be expected to have little impact on party political outputs. We can still try to use $M_{\text{eff}} = (N^6/S)^F$ to estimate the effective magnitude, always at given assembly size (cf. chapter appendix), but with huge doubts. Table 11.2 shows the remaining 10 cases among Lijphart’s 36 stable democracies (1999). Countries are shown in the order
Table 11.2. Effective magnitudes for complex electoral systems, with output $M_{\text{eff}}$ calculated from $M_{\text{eff}} = (N^6/S)^F$

| Country | District M | S | N | Output $M_{\text{eff}}$ | Input $M_{\text{eff}}$ | Comments |
|---|---|---|---|---|---|---|
| Greece 1974 on | 4 | 300 | 2.20 | 3 Plur. | 3 | ‘Reinforced PR’ |
| Austria 1945 on | 7/20 | 175 | 2.48 | 1 | ∼3.5/20 | Two-party tradition |
| Germany 1949 on | 1 | 526 | 2.93 | 2 | 10 | |
| Sweden 1948 on | 8/11 | 298 | 3.33 | 5 | 9/12 | |
| Colombia 1958 on | – | 191 | 3.32 | 7 | – | |
| Venezuela 1959 on | – | 199 | 3.38 | 7 | 4 | |
| Italy 1946 on | 20 | 613 | 4.91 | 23 | ∼20 | |
| Belgium 1946 on | 7 | 205 | 4.32 | 29 | 12 | |
| Denmark 1945 on | 6/7 | 170 | 4.52 | 29 | 25 | |
| Iceland 1946 on | 1.5/6 | 58 | 3.72 | 64 | 60 | |

Sources: Lijphart (1994: 31–44) for the lowest level district magnitudes, Lijphart (1999: 76–7) for effective numbers of parties (N), various sources for assembly sizes (S), and Taagepera and Shugart (1989: 136–9) for estimates of input-based $M_{\text{eff}}$.
of increasing effective magnitude. For comparison, the table also shows the input-based effective magnitudes given for roughly the same periods in Taagepera and Shugart (1989: 136–9). They broadly agree with the effective thresholds in Lijphart (1994). When the output- and input-based effective magnitudes agree, institutional explanation seems sufficient. The cases where they disagree are shown in bold in Table 11.2. Here historical–cultural factors may have to be considered.

Greece has had various forms of ‘reinforced PR’, meaning low-magnitude districts plus legal thresholds that at times have reached 30 percent for 3-party alliances. The total effect seems stronger than was deemed possible by Taagepera and Shugart (1989): it looks close to plurality in 3-seat districts rather than PR. The Greek electoral system acts like reinforced plurality rather than ‘reinforced PR’.

Austria has a long tradition of one major sociopolitical cleavage that supports a two-party system even while the electoral systems might have allowed more parties to rise in 1945–70 and definitely made it easy from 1971 on. Despite district level $M = 7$ and $M = 20$ for the two periods (plus mildly restrictive second tiers), Austria’s number of parties would make one think that it has single-seat districts.

Germany has effectively nationwide PR, restricted by a 5 percent legal threshold. Taagepera and Shugart (1989) estimated this restriction to be comparable to having seat allocation in districts of $M = 10$. The conversion formula $M = 75\%/T - 1$ (cf. chapter appendix) suggests $M = 75/5 - 1 = 14$. Actually, Germany looks as if it had PR in 2-seat districts. In contrast to Austria, a prior two-party tradition cannot be invoked. The
Erststimme (‘primary vote’) on the ballot being cast for single-seat districts might exert a strong psychological effect (as it was intended), despite the compensatory Zweitstimme (‘secondary vote’) that restores PR. The input- and output-based effective magnitudes agree for Italy, Denmark, and Iceland. Sweden has fewer parties than the electoral system would enable it to have, while Venezuela has mildly more. Belgium looks like a country with very high magnitude. Here, ethnic split has produced more parties than one could expect on the basis of institutional inputs.
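The coarse conversion between a legal threshold and a magnitude, used for the German case above, can be sketched as follows. The function names are illustrative; the formulas are the ones quoted in the text, $M = 75\%/T - 1$ and its inverse $T = 75\%/(M + 1)$.

```python
def magnitude_from_threshold(threshold_percent):
    """M = 75% / T - 1, with the threshold T given in percent."""
    return 75.0 / threshold_percent - 1


def threshold_from_magnitude(magnitude):
    """T = 75% / (M + 1), returned in percent."""
    return 75.0 / (magnitude + 1)


print(magnitude_from_threshold(5))   # 5 percent legal threshold -> 14.0
print(threshold_from_magnitude(1))   # single-seat districts -> 37.5
```

The second value, 37.5 percent for single-seat districts, is the figure whose interpretation as a nationwide threshold the chapter appendix calls into question.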
Conclusion and Implications for Institutional Engineering

The seat product $MS$ allows us to calculate the expected party political output of a simple electoral system. For complex electoral systems, where district magnitude clearly does not tell everything, one can reverse direction. We can calculate an effective magnitude from known assembly size and the actual output in the form of effective number of legislative parties. This would be the district magnitude which, for given assembly size, would be expected to lead to the same effective number of parties, if simple electoral rules were used. In some cases, effective magnitude corresponds to multi-seat plurality.

This output-based effective magnitude can be compared with actual district magnitudes, or with input-based effective magnitudes derived from judicious evaluation of the impact of various features in a complex electoral system. Such comparisons help us gain confidence in the estimates based on inputs when there is agreement. Disagreements make us ask questions about causes that go beyond electoral systems, such as ethnic saliency and historical path dependence.

What does output-based effective magnitude mean for institutional engineering? Suppose it is felt in a country that its existing electoral system is overly complex and could stand simplification. Even when most parties in the assembly share this feeling, they would be leery of introducing changes for fear that their own party may suffer. So it becomes a question of how to simplify the rules without altering the number and size distribution of parties. This is precisely what the output-based effective magnitude is about. Starting from the actual effective number of parties, the effective magnitude indicates which simple electoral system the given complex system most resembles. Thus, effective magnitude is the
measure of the present system that helps to find the district magnitude to be used in the simpler system. It is not that simple, of course. It works best when the present party system is well balanced. In the presence of one very large and numerous very small parties, the effective number of parties does not tell the whole story. In such a case, one would also want to calculate the effective magnitude based on the largest seat share, $M_{\text{eff}} = 1/(s_1^8 S)$, and try to balance this number with the one obtained from $M_{\text{eff}} = N^6/S$. A satisfactory compromise may or may not be available. The devil is in the details. But the level of details can be addressed only once the broad picture has been adequately laid out.
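As a first, broad-picture step of such a simplification, one might sketch the calculation like this. The function name, the rounding to whole seats, and the number of districts taken as assembly size divided by effective magnitude are illustrative assumptions; the inputs N = 3.5 and S = 200 are hypothetical.

```python
def simplification_plan(n_eff, seats):
    """Suggest a simple system that would, on average, reproduce n_eff."""
    ratio = n_eff ** 6 / seats
    m_eff = ratio if ratio >= 1 else 1 / ratio
    magnitude = max(1, round(m_eff))
    if magnitude == 1:
        rule = "FPTP"
    elif ratio >= 1:
        rule = "PR"
    else:
        rule = "multi-seat plurality"
    # Rough number of equal districts; actual districting would need
    # adjustment so that the seats add up to the assembly size.
    districts = max(1, round(seats / magnitude))
    return {"rule": rule, "magnitude": magnitude, "districts": districts}


print(simplification_plan(3.5, 200))
# {'rule': 'PR', 'magnitude': 9, 'districts': 22}
```

This covers only the estimate based on the effective number of parties. When one party is very large, the estimate from the largest seat share would have to be weighed against it, as noted above.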
Appendix to Chapter 11

The quest for input-based effective magnitude

Effective magnitude is a notion that has caused considerable confusion in electoral studies. Given that I was the one who coined the term, it is up to me to clarify it, and now it can be done. Since the publication of Taagepera and Shugart (1989), there have been times when I exclaimed to myself that the entire concept was self-contradictory, and I began to avoid it. At the same time, I kept receiving inquiries about how to determine it for a given complex electoral system. Notions and procedures have a life of their own. They sometimes continue to be used even while their originators have given up on them. The brief history of effective magnitude is presented here in the hope of reducing the incidence of faulty applications in the future. The main message is the following:
• Effective magnitude makes sense only in the context of a given assembly size.
• For complex electoral systems, effective magnitude is the value to plug into the seat product $MS$, in lieu of district magnitude, so as to obtain the observed party constellation.
The notion of an effective magnitude was introduced by Taagepera and Shugart: ‘The concept of magnitude will have to be broadened to include not only the district magnitude but also a nationwide [emphasis added] “effective magnitude” ’ (Taagepera and Shugart 1989: 126). We did not call it ‘effective district magnitude’, because we felt it addressed precisely the various complex features of electoral systems, most of which operate outside the existing districts. Nonetheless, our chapter title ‘Magnitude: The Decisive Factor’ (Taagepera and Shugart 1989: 112) engendered a small cottage industry, claiming that district magnitude is not always the decisive factor, because there is the seat allocation formula (plurality!) and all those factors that go beyond the district. But read the
title again: the word ‘district’ is not there. The review table (Taagepera and Shugart 1989: 138–9) explicitly contrasts the two magnitudes, ‘District’ and ‘Effective’. Effective magnitude was meant to transcend the existing districts.

We cannot complain, because we were confused ourselves. We looked for some sort of a magnitude that went beyond the existing districts and would produce roughly the same effect on outputs, such as the number of parties and deviation from PR, as the existing complex electoral system. We could not quite pin it down. The crucial shortcoming was ignoring assembly size, even while we pointed out elsewhere (Taagepera and Shugart 1989: 174–5) that it affected the number of parties. We imagined districts where the number of seats would equal the effective magnitude, but left open how many such districts there would be, even while the answer is self-evident in retrospect: assembly size divided by effective magnitude. For such a fuzzy equivalent system, we tried to develop some guidelines on how to evaluate the impact of legal thresholds, adjustment seats, and multistage seat allocation (Taagepera and Shugart 1989: 135–40, 206–69).

Lijphart (1994) added valuable insights by shifting the focus from effective magnitude to effective threshold. A degree of equivalence between district magnitudes and legal thresholds had been pointed out by Taagepera and Shugart (1989: 117, 276–7). Lijphart (1994: 12) stated explicitly that reducing a legal threshold or increasing the district magnitude ‘can be seen as the two sides of the same coin’. (The statement obviously implies that all seats are allocated within districts.) Rather than expressing legal thresholds and various complexities in terms of a somewhat equivalent effective magnitude, he expressed district magnitude and all the rest in terms of a somewhat equivalent effective threshold.
The coarse conversion formula between the ‘two sides of the coin’ shifted from $T = 50\%/M$ (Taagepera and Shugart 1989: 117) to $T = 75\%/(M + 1)$. The latter formula was first reported in Lijphart (1994: 183) as private communication by Taagepera, and it was explicated in Taagepera (1998b) (see Chapter 15). The main flaw was that nationwide and district level legal thresholds were confused and mistakenly treated as equivalent. The incongruence becomes blatant when the notion of effective threshold is applied to single-seat districts. Lijphart (1994) estimated it as $T = 35\%$, and the formula $T = 75\%/(M + 1)$ yields $T = 37.5\%$ for $M = 1$. While such a threshold reasonably reflects a party’s chances to win in a district (graph in Taagepera 1998b: 399), imagining its equivalence to a nationwide legal threshold of 37.5 percent would be preposterous. In New Zealand 1928, such a nationwide legal threshold would have disqualified even the nationwide winner from obtaining seats!

While less visible, the same incongruence affects the ‘other side of the coin’, effective magnitude. According to $T = 75\%/(M + 1)$, a nationwide legal threshold as high as 37.5 percent would appear to be no more restrictive on parties than having single-seat districts. This is the point where I began to doubt the existence of this phantom, the effective magnitude. In retrospect, the central position of the product $MS$, highlighted already in Taagepera and Shugart (1993), should have led
to the answer that finally dawned on me while writing the present book. For the effect of a simple electoral system on the number and seat share distribution of parties, it is the combination $M^F S$ that matters, in the first approximation. For a complex electoral system with given assembly size and seat allocation formula, one can look for an effective magnitude that would have roughly the same effect on the party system as would an actual district magnitude in a simple system.

In retrospect, the input-based estimates of effective magnitude in Taagepera and Shugart (1989: 136–9) coincide reasonably well with output-based ones (cf. Table 11.2), despite ignoring the assembly size factor. How come? Most national assemblies have 50–500 members, which is a fairly limited range. The estimates were instinctively fitted to about 200 members. When one goes to very small or very large assemblies, more marked differences could arise.
12 Size and Politics
For the practitioner of politics:
• Take the cube of the number of seats in the first or only chamber of your national assembly, and you roughly get your country’s population.
• Unconsciously, assembly sizes are chosen to fit the cube root of population, because this size minimizes the workload of a representative.
• The population of a country puts broad limits on the size of its national assembly and thus limits the options for institutional engineering.
• Smaller countries have fewer registered parties but more party members per 1,000 population. They may have slightly more durable cabinets.
Is politics different in small and large countries, all other factors being the same? In their seminal Size and Democracy, Robert Dahl and Edward Tufte (1973) convincingly showed that systematic differences are to be expected and that they do show up empirically. Some of these differences involve political institutions, such as the size of representative assemblies, while others, such as trade/GNP ratio or military capabilities, affect politics indirectly. This chapter reviews the work done since 1970 on the impact of population on assembly size, cabinet duration, the number of parties and their memberships.
Overview

A causal link between country population ($P$) and assembly size ($S$) was established by Taagepera (1972). It starts with the number of communication channels, the same basic notion used for the inverse square law of cabinet duration (Chapter 10). It leads to a cube root law of assembly
sizes: $S = P^{1/3}$, which applies within a factor of 2 to most countries with sufficient literacy. This equation predicts assemblies (or first chambers) of 100 seats for countries of 1 million people, and 1,000 seats for 1,000 million. The model was streamlined and re-tested by Taagepera and Shugart (1989: 173–83). Using an analogy with the absorption law in physics, a logical model for the trade/GNP ratio was also developed (Taagepera 1976; Taagepera and Hayes 1977), but it does not concern us here directly.

After the 1970s, interest waned in population (or area) as a factor in domestic politics. Dag and Carsten Anckar (1995) raised again the issue of size and democracy, adding insularity as a special factor. Taagepera (1972, 1976) had noted that small island countries tend to have assembly sizes below the cube root of population, as well as trade/GNP ratios lower than those of continental countries of similar population. Dag Anckar (1997) and Carsten Anckar (1997a, 1998, 2000) focused on three variables relevant to party politics: the number of parties registered, the vote share of the largest party, and the effective number of electoral parties. The input consisted of population and area (which are highly correlated) of 77 states. It was found that an increase in either measure of size tends to go with increases in the number of parties registered and the effective number of electoral parties, while the vote share of the largest party tended to decrease correspondingly. These empirical findings continue to hold when elections with FPTP and PR rules are considered separately.

Steven Weldon (2006) added a complementary empirical observation: The total membership of parties also tends to increase with the electorate (which is very highly correlated with population). However, the per capita party membership decreases with increasing electorate, and so may the degree of activity of these members.
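The cube root law and its factor-of-2 tolerance can be illustrated with a short sketch. The function names are hypothetical, and the third call uses made-up numbers rather than data for any country.

```python
def expected_assembly_size(population):
    """Cube root law: S = P^(1/3)."""
    return population ** (1 / 3)


def within_factor_of_two(actual_seats, population):
    """True if the actual size is within a factor of 2 of the prediction."""
    expected = expected_assembly_size(population)
    return expected / 2 <= actual_seats <= expected * 2


print(round(expected_assembly_size(1_000_000)))        # 100
print(round(expected_assembly_size(1_000_000_000)))    # 1000
print(within_factor_of_two(150, 8_000_000))            # True (expected is 200)
```

Read in the other direction, cubing the number of seats gives a rough estimate of population: 200 seats would correspond to about 8 million people.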
Meanwhile, a different line of research began to stress the significance of assembly size, without explicitly connecting it to population. Grofman and Handley (1989) noted differences in the composition of Houses and Senates of US states that could be attributed only to differences in the number of members (cf. Chapter 6). Lijphart (1994) found some impact of assembly size on disproportionality and multipartism. Taagepera and Shugart (1993) built a multilayered logical model centered on what I now call the seat product MS, and assembly size emerged as one of the two key variables. The indirect connection to population, through the aforementioned cube root law, was noted in Taagepera (2001), but no direct test was carried out.
Subsequent sections describe the cube root law of assembly sizes and its implications for party sizes, as well as the work by Anckar (1998, 2000) and Weldon (2006) on the number and membership of parties. I focus on population (rather than area or electorate) as the central measure of country size. It is not always clear how population influences various political quantities. It may affect the number of registered parties and total party membership directly, or through assembly size, which may put restraints on party politics.
The Cube Root Law of Assembly Sizes

When the memberships of the first or only chambers of assemblies of stable democracies are graphed against population, both on logarithmic scales, the empirical slope is visibly around 1/3 (see the graph in Taagepera and Shugart 1989: 175). The best fit line passes near the point $P = 1$, $S = 1$. This is an important conceptual anchor point. A population of 1 person would be expected to represent itself. Thus it seems that assembly size is close to the cube root of population represented:

$$S = P^{1/3}.$$

Taagepera and Shugart (1989: 175) also show that, from 1790 to 1913, the US House played a continual catch-up game with the cube root law, as population expanded during the intercensal periods. In 1913 the House size was frozen, and it is by now only two-thirds of the cube root of population. Graphing all national assemblies of the world (Taagepera and Shugart 1989: 176) confirms that the cube root pattern applies to one-party or no-party regimes as well as to those with at least some multiparty elements. Yet there is a tendency for assemblies to be smaller than $S = P^{1/3}$ when literacy is low and populations are less than 1 million.

It may be argued that only literate people can meaningfully participate. Taagepera (1972) defined ‘active population’ as the literate working age population: $P_a = PLW$, where $P$ is the total population, $L$ is the literacy rate, and $W$ is the working age fraction of the population. Omission of the literate population past retirement age is unjustified conceptually but was imposed by availability of data. It leads to moderate underestimation of active population. A logical quantitative model, presented in the chapter appendix, predicts that assembly size would be the cube root of double the active population:

$$S = (2P_a)^{1/3}.$$
The Duvergerian Macro-Agenda
When all national assemblies of the world are graphed against $P_a = PLW$ (Taagepera and Shugart 1989: 178), the fit is good for active populations above 0.2 million. For very small populations, assembly sizes still tend to fall short of the model. Notably, island nations tend to have smaller assemblies than expected. In all countries with extensive literacy, total population tends to be close to double the working-age population, as countries with many children tend to have few old people, and vice versa. Therefore, when literacy is above 90 percent, little is lost by using $S = P^{1/3}$ rather than the more complex expression. When literacy is 75 percent, it is expected to reduce assembly size by $1 - 0.75^{1/3} \approx 9\%$.

The logical model applies only to democratic countries, so why would nondemocratic regimes pick similar sizes for their puppet assemblies? It may be imitation. Communist regimes tended to have twice the usual assembly size for a given population, presumably so as to look especially democratic. Most postcommunist democracies have cut down on assembly sizes. China continues with a bloated assembly of 3,000, while the cube root of its population of 1,300,000,000 is still only 1,100. Even if a genuine world parliament were formed, the present world population of 6 billion would project to only 1,800 seats.

As of now, assembly sizes still fit $S = P^{1/3}$ when population exceeds 1 million and its growth rate is low. Countries with rapidly growing populations seem to play a perennial catch-up game, gradually enlarging their assemblies. Overall, countries with single-seat districts tend to have relatively small assemblies, the British counterexample notwithstanding. The two effects, population growth and single-seat districts, combine in many small island countries. By now, we have appreciable time series for growth in population and assembly size.
In view of the importance of assembly size in affecting the number of parties, a thorough reanalysis would be desirable, updating Taagepera and Shugart (1989). Does the cube root law apply to subnational assemblies? The way the model is constructed, it should. But empirically, such assemblies tend to fall short. I could find data for all city councils of the capital cities in the European Union (EU), except Ljubljana and Valletta. The geometric mean for the 23 cities is $S = 0.82P^{1/3}$, which is 18 percent below expectation. The Houses of the US states also fall short of $S = P^{1/3}$. As for US cities, they often elect only 5 to 10 commissioners. Here the philosophy seems to be to elect a city government rather than a deliberative assembly.
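The cube root relationships of this section can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names are illustrative assumptions, and the inputs are the round population figures quoted in the text.

```python
def cube_root_assembly_size(population):
    """Expected assembly size S = P^(1/3) for total population P."""
    return population ** (1.0 / 3.0)


def active_population(population, literacy, working_age_fraction):
    """Active population Pa = P * L * W (literate working-age population)."""
    return population * literacy * working_age_fraction


def assembly_size_from_active(active_pop):
    """Model prediction S = (2 * Pa)^(1/3)."""
    return (2.0 * active_pop) ** (1.0 / 3.0)


if __name__ == "__main__":
    # Round figures quoted in the text: China and a hypothetical world parliament.
    print(cube_root_assembly_size(1.3e9))
    print(cube_root_assembly_size(6.0e9))
    # With full literacy and half the population of working age,
    # the two expressions coincide.
    print(assembly_size_from_active(active_population(1.0e6, 1.0, 0.5)))
    # Relative shortfall in assembly size when literacy is 75 percent.
    print(1.0 - 0.75 ** (1.0 / 3.0))
```

The literacy and working-age values passed in the usage lines are illustrative inputs, not data from any particular country.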
Size and Politics

Table 12.1. Predicted largest seat shares and effective numbers of parties, at selected populations and district magnitudes

Population      s1 (%) for                     N for
                M = 1    M = 10    M = 100     M = 1    M = 10    M = 100
10,000          68       51        (38)        1.67     2.45      (3.59)
1 million       56       42        32          2.15     3.16      4.64
100 million     46       35        26          2.78     4.08      5.99
Do Smaller Countries have Fewer Legislative Parties?

When $S$ is replaced by $P^{1/3}$, in line with the cube root law, the previous models $s_1 = (MS)^{-1/8}$ for the largest seat share and $N = (MS)^{1/6}$ for the effective number of parties become $s_1 = M^{-1/8}P^{-1/24}$ and $N = M^{1/6}P^{1/18}$. Table 12.1 shows the corresponding average predictions. No systematic testing has been carried out. For a population of 10,000, the cube root law indicates a 22-seat assembly, and all actual assemblies fall much short of 40 seats; hence the entry for $M = 100$ in Table 12.1 is unreal.
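The entries of Table 12.1 follow directly from the two formulas above. A short sketch that recomputes them is given here for illustration; the function names are assumptions, and the populations and magnitudes are those of the table.

```python
def largest_seat_share(magnitude, population):
    """s1 = M^(-1/8) * P^(-1/24), as a fraction of all seats."""
    return magnitude ** (-1.0 / 8.0) * population ** (-1.0 / 24.0)


def effective_number_of_parties(magnitude, population):
    """N = M^(1/6) * P^(1/18)."""
    return magnitude ** (1.0 / 6.0) * population ** (1.0 / 18.0)


if __name__ == "__main__":
    for population in (1.0e4, 1.0e6, 1.0e8):
        for magnitude in (1, 10, 100):
            s1 = 100.0 * largest_seat_share(magnitude, population)
            n = effective_number_of_parties(magnitude, population)
            print(f"P={population:.0e} M={magnitude:>3} s1={s1:.0f}% N={n:.2f}")
```

Rounding conventions may shift the last digit of an entry relative to the printed table.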
Have Smaller Countries More Durable Cabinets?

If mean cabinet duration tends to be $C = 42\ \text{years}/(MS)^{1/3}$ (cf. Chapter 10) and $S = P^{1/3}$, then we would expect

$$C = \frac{42\ \text{years}}{M^{1/3}P^{1/9}}.$$

This implies that smaller countries would have more durable cabinets, on the average. When single-seat districts are used, we would expect a mean duration of 11.7 years in a country of 100,000 and only 5.4 years in a country of 100 million, a twofold decrease. However, population enters as its ninth root, which could make its impact hardly noticeable compared to that of district magnitude. A country of 100 million using 100-seat districts would have an expected cabinet duration of 1.2 years, meaning a much larger 4.5-fold decrease due to $M$.
I tested the 14 single-seat systems for which Lijphart (1999) gives cabinet durations. The geometric mean cabinet duration is 7.6 years for the 8 smaller countries (population 0.25 to 4 million in 1993, geometric mean 1.18 million). It is less indeed—5.8 years—for the 6 larger countries (population 17 to 870 million, geometric mean 82 million). The ratio of durations for the two groups would be expected to be the ninth root of their population ratio, which is 1.6. The actual ratio of durations is only 1.3. While the mean trend goes in the expected direction, it may well be accidental, given that the scatter of data exceeds by far both means. It would take many more data to test whether smaller countries really tend to have more durable cabinets. Such data will become available by 2010, as the Third Wave democracies reach the minimum age of 20 years, the criterion of stable democracy proposed by Lijphart (1999). We shall see.
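The cabinet duration figures in this section can be reproduced with a few lines of code. This is an illustrative sketch only; the function name is an assumption, and the group means used in the ratio check are the ones quoted in the preceding paragraph.

```python
def expected_cabinet_duration(magnitude, population):
    """C = 42 years / (M^(1/3) * P^(1/9))."""
    return 42.0 / (magnitude ** (1.0 / 3.0) * population ** (1.0 / 9.0))


if __name__ == "__main__":
    # Single-seat districts, small versus large country.
    print(expected_cabinet_duration(1, 1.0e5))
    print(expected_cabinet_duration(1, 1.0e8))
    # Large country with 100-seat districts.
    print(expected_cabinet_duration(100, 1.0e8))
    # Expected ratio of durations for the two groups of single-seat systems:
    # the ninth root of the ratio of their geometric mean populations.
    print((82.0 / 1.18) ** (1.0 / 9.0))
    # Observed ratio of geometric mean durations.
    print(7.6 / 5.8)
```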
How Population could Affect the Number of Parties Registered and their Memberships

The general format of equations in previous sections is $y = kM^aP^b$, where $k$, $a$, and $b$ are constants. Given that Anckar (1998) and Weldon (2006) deal with quantities that can take only positive values, considerations presented in Beyond Regression (Taagepera 2008) suggest that their data might be reanalyzed in the same framework, where $y$ stands, successively, for the number of parties registered, their combined membership, etc. When logarithms are taken, these equations become linear: $\log y = \log k + a\log M + b\log P$. As in Chapters 8–11, linear regressions should be tried with all variables logged, rather than using the variables themselves or logging only some of them (such as population).

What is the rationale for expecting that, along with population, district magnitude may play a role in quantities such as the number of parties registered and their memberships? Ever since Duverger’s law was spelled out, it has been clear that seat allocation by FPTP cuts down minor party representation. When unsuccessful parties go out of business, the number of parties registered may be expected to decrease. With fewer parties, their combined membership may also decrease. With PR, in contrast, the larger the district magnitude, the more parties can win at least a few seats, possibly inducing more parties to register and recruit members.

Even if it should be confirmed that population and district magnitude do impact the number and membership of parties, it remains to be verified whether the changes are proportional to population and magnitude with some exponent so that the format $y = kM^aP^b$ applies. And even if this should be the case, the question would remain: Why do the constants have the values they empirically have? Our understanding remains incomplete until we can answer the ‘why’. This is ongoing research, and only tentative results can be reported in the next few sections.
Do Small PR Countries have Higher Party Memberships, Per Capita?

Party membership means here the combined membership of all parties in the country. One might expect that relatively more people join parties in smaller countries, where party politics is less anonymous and closer to home. Thus, smaller countries may have higher per capita party memberships. However, small countries also have smaller assemblies, which limit the number of successful parties and hence may reduce per capita party membership. So which way is it?

There are two ways to find out. One is to construct more detailed quantitative models so as to see by how much coziness could increase political participation and by how much small assemblies could impede it. These models have not been completed. The other way is empirical. A study by Steven Weldon (2006) confirms that relatively more people join parties in smaller countries. Weldon compares the party membership to the size of the electorate in 27 countries in the late 1990s. Electorate ($P_e$) amounts to approximately 3/4 of the total population. Membership density ($d$, the ratio of membership to electorate) in long-standing democracies ranges from 1.6 percent in France and 1.9 percent in UK to 23.8 percent in Malta and 27.3 percent in Iceland. Linear regression of the logarithms of $d$ and $P_e$ corresponds to $d = 1{,}550\,P_e^{-0.37}$ ($R^2 = 0.51$). Austria is a major outlier, with $d = 17.7\%$, despite its medium population.

I prefer to think in terms of total party membership ($m$) rather than per capita. Keep in mind that $d$ is in percentage, and assume that electorate is roughly three-quarters of the population ($P_e = 0.75P$). Then Weldon’s regression line (2006) would correspond to $m = 0.75P(d/100) = 12P_e^{0.63}$. This would imply that total party membership grows somewhat faster than the square root of electorate. The joker in the equation is the constant 12. At $P = 1$, one could visualize at most one party with a membership of 1.
Yet, when the equation above is carried to this limit, it would imply that an electorate of 1 person would still include 12 party members! If the anchor point at $P = 1$ is to be respected, the empirical equation would need some modification, unless some other factor is introduced, such as district magnitude.

At the same population, more parties are available in PR systems. Hence one might expect that PR systems incite more people to join parties. Weldon (2006) does not consider the possible effect of electoral system, but his data and graph suggest that higher district magnitudes go with higher per capita party memberships. Shifting from electorate to total population, preliminary reanalysis of his data suggests that total party membership ($m$) might be related to district magnitude ($M$) and population ($P$), approximately as $m = M^{3/8}P^{3/4}$. This equation says that total party membership grows slower than population, and that larger district magnitudes boost party membership (presumably by boosting the number of parties). This equation satisfies the anchor point: $P = 1$, $M = 1 \rightarrow m = 1$. However, I have no theoretical justification as yet for these particular exponents. It must have something to do with the number of parties. The corresponding per capita party membership would be

$$\frac{m}{P} = \frac{M^{3/8}}{P^{1/4}}.$$

As population increases, per capita party membership would decrease with the fourth root of population, meaning fairly slowly. Table 12.2 shows the estimates that result from these extremely approximate empirical fits. Again, the entry for $M = 100$ is unreal for a population of 10,000.

Table 12.2. Total and per capita party memberships at selected populations and district magnitudes: empirical approximations

Population      Total membership (thou.)       Per capita membership (%)
                M = 1    M = 10    M = 100     M = 1    M = 10    M = 100
10,000          1.0      2.4       (5.6)       10.0     23.7      (56.2)
1 million       31       75        178         3.2      7.5       17.8
100 million     1,000    2,371     5,623       1.0      2.4       5.6

Weldon (2006) also finds that party member activism seems to decrease as party membership increases. Larger memberships may reinforce a feeling of anonymity and passivity among the rank and file. This finding depends crucially on assuming that the subunits are the relevant population units in federal countries, and the evidence rests on Germany alone. More research remains to be done on the effect of population on party membership. Recent decrease in party memberships in most democracies adds another factor that makes the study of population effects more difficult (Tan 1989, 1997; Dalton and Wattenberg 2000).
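The tentative fit $m = M^{3/8}P^{3/4}$ and its per capita form can be evaluated for the populations and magnitudes of Table 12.2. The sketch below is only an illustration of that arithmetic; the function names are assumptions, and the exponents are the approximate empirical values reported above, not a tested model.

```python
def total_party_membership(magnitude, population):
    """Tentative empirical fit m = M^(3/8) * P^(3/4)."""
    return magnitude ** (3.0 / 8.0) * population ** (3.0 / 4.0)


def per_capita_membership(magnitude, population):
    """m / P = M^(3/8) / P^(1/4), as a fraction of the population."""
    return total_party_membership(magnitude, population) / population


if __name__ == "__main__":
    for population in (1.0e4, 1.0e6, 1.0e8):
        for magnitude in (1, 10, 100):
            thousands = total_party_membership(magnitude, population) / 1000.0
            percent = 100.0 * per_capita_membership(magnitude, population)
            print(f"P={population:.0e} M={magnitude:>3} "
                  f"m={thousands:.1f} thou. m/P={percent:.1f}%")
```

The anchor point is respected by construction: with $P = 1$ and $M = 1$ the function returns a membership of 1.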
Do Smaller Countries have a Lower Number of Parties Registered?

The number of parties registered ($r$) might be expected to depend foremost on ease of registration, which could hardly be expected to be size-related. In some countries, party registration is a matter of sending in a form and paying a symbolic fee, while some other countries set various further conditions, such as a large founding membership. Still, with similar registration requirements, smaller countries would have fewer politicians interested in going through the process. Indeed, Anckar (1998, 2000) does find a correlation between $r$ and the logarithm of population.

Since both $r$ and $P$ can take only positive values, the format $r = aP^b$ should be preferred to the format $r = a + b\log P$ used by Anckar (cf. Taagepera 2008). At $P = 1$, the number of parties registered could not be above 1, but it could be much lower. Hence $a \le 1$ could be expected. Reanalyzing Anckar’s data along these lines, the best fit of logarithms shows that the number of parties registered is approximately $r = 0.04P^{3/8}$. This equation implies 1.3 registered parties for a country of 10,000 people, 7 parties for 1 million, and 40 parties for a country of 100 million. Most countries agree within a factor of 2. Countries with a lower number of parties registered are Mexico, Turkey, South Korea, Japan, and Jamaica. In contrast to the number of seat-winning and effective parties, the number of parties registered shows no correlation with district magnitude.

The exponent 3/8 is sufficiently close to 1/3 to recall that assembly size is $S = P^{1/3}$. Hence, one might wonder whether the number of registered parties might be proportional to assembly size. However, when the number of registered parties is graphed against assembly size, both on logarithmic scales, scatter widens. The best symmetric fit is around $r = 1.6S^{0.9}$. The proportionality line $r = 0.1S$ expresses the trend almost as well, except at very small assemblies of less than 30 seats. The widening scatter suggests that the number of parties registered does not have a causal link to assembly size. Both seem to be affected by population (among other factors), separately: $r \leftarrow P \rightarrow S$ rather than $P \rightarrow S \rightarrow r$. For assembly size, minimization of communication channels suggests the exponent 1/3. For the number of parties registered, I have no logical model, as yet.

Another puzzling regularity is observed empirically: Total party membership, as measured by Weldon (2006), tends to be proportional to the square root of the product of population and the number of parties registered, as measured by Anckar (1998): $m = 27(rP)^{1/2}$. It fits within a factor of 2 for most $M = 1$ and $M > 1$ systems. I have no logical explanation, and it does not connect easily with the regularities previously observed. More research remains to be done on all these aspects.
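The implied numbers of registered parties can be recomputed from the empirical fit $r = 0.04P^{3/8}$. The snippet is a hedged illustration, with an assumed function name, of the arithmetic behind the three figures quoted above.

```python
def registered_parties(population):
    """Approximate empirical fit r = 0.04 * P^(3/8)."""
    return 0.04 * population ** (3.0 / 8.0)


if __name__ == "__main__":
    for population in (1.0e4, 1.0e6, 1.0e8):
        print(f"P={population:.0e} r={registered_parties(population):.1f}")
```

At $P = 1$ the function returns 0.04, which is consistent with the expectation that the constant should not exceed 1.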
The Number of Electoral Parties

The number of seat-winning parties, the largest seat share, and the effective number of legislative parties were all found to be connected to the seat product $MS$, with $S$, in turn, connected to population. For the electoral parties, the number of registered parties has been empirically connected to population. Assuming that parties are registered in order to run, $r$ represents the number of vote-getting parties. Anckar (1998, 2000) has empirically connected population to the largest vote share and the effective number of electoral parties. A logical connection of these quantities to institutions is established in Chapter 14, but the extension to population remains to be worked out.
Third Parties in FPTP Systems

For FPTP systems, the largest seat share is $s_1 = (1/S)^{1/8}$ according to the model developed in Chapter 8. Thus, its preponderance decreases with increasing assembly size, and the same is bound to be the case for the second-largest share, as confirmed by inspection of Table 9.1. Hence the relative share of third parties should increase with increasing assembly size.

This is found to be so indeed, when the combined seat shares of third parties ($t = 1 - s_1 - s_2$), as tabulated in Gerring (2005), are graphed against assembly sizes. For $S < 50$, the median $t$ is around 4 percent, while it is around 7 percent for $50 < S < 200$ and 18 percent for $S > 200$. The theoretical pattern becomes too complex to be calculated. The empirical median fit is close to $t = 0.006(S - 2)^{0.59}$, where the subtraction of 2 reminds us that an assembly of 2 seats could not possibly fit in a third party. Marked deviations are UK (4.3 percent) and USA (0.1 percent) on the low side and Papua New Guinea (54 percent), Solomon Islands (36 percent), and St Kitts (22 percent) on the high side. To the extent the cube root law holds, the relationship to population would be around $t = 0.006P^{0.2}$.

Gerring (2005) observes a strong correlation of third party share with federalism. It is partly a size effect, as federal countries tend to be on the large side, but partly seems real, as Papua New Guinea, Solomon Islands, and St Kitts stand out among smaller countries by being at least mildly federal.
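The empirical median fit for third-party seat shares is easy to evaluate for a few assembly sizes. This is a minimal sketch under the stated fit; the function name and the sample sizes are illustrative assumptions, not values from Gerring's tabulation.

```python
def third_party_share(seats):
    """Empirical median fit t = 0.006 * (S - 2)^0.59, as a fraction of seats."""
    if seats < 2:
        raise ValueError("the fit is only defined for assemblies of 2 or more seats")
    return 0.006 * (seats - 2) ** 0.59


if __name__ == "__main__":
    # Illustrative assembly sizes only.
    for seats in (2, 30, 100, 400):
        print(f"S={seats:>3} t={100.0 * third_party_share(seats):.1f}%")
```

An assembly of 2 seats yields a share of zero, in line with the remark that it could not accommodate a third party.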
Population and Institutional Engineering

Many other factors besides its size should affect the functioning of a representative assembly, and unusual size does not always look openly dysfunctional. Tiny New Hampshire does not seem to suffer from its unusually large lower chamber of some 400 members. To fit in with the cube root law, the British House of Commons would have to shrink by 40 percent, while the US House would have to expand by 40 percent. Yet neither body shows ill effects that could be traced back to its size. Even if optimizing the channels of communication mattered, the bottom of the curve of channels versus assembly size is so flat that a large range of sizes should be almost equally acceptable.

And yet assemblies seem to be sensitive to the cube root of population. The USA started out in 1790 with a House much smaller than half of the cube root norm, but within 40 years it brought the House up to the cube root of its ever-expanding population, and then stayed close to the cube root until 1913. Worldwide, as the population increased in most countries from 1970 to 1985, assembly sizes went up by more than 10 percent in 57 countries, stayed about the same in 44, and went down by more than 10 percent in only 4 countries (Taagepera and Shugart 1989: 179). This trend seems to continue (except for reductions in postcommunist countries). It may be felt that assembly sizes could be all over the place, but somehow they hew close to the cube root of population.

What this means for institutional engineering is that countries do have some leeway in choosing the size of their national assembly, but there are limits. Most actual assembly sizes are within a factor of 2 of the cube root of the populations represented. For a country of 1 million, this zone extends from 50 to 200 seats. For 64 million, it is 200 to 800, and for 343 million, 350 to 1,400. Still, most assemblies are closer to the centers of these zones, as if it mattered.

It was observed earlier that the previous models for the largest seat share, the effective number of parties, and mean cabinet duration can be reformulated in terms of population and district magnitude. I have superficially tested only the expected relationship to cabinet duration, $C = 42\ \text{years}/(M^{1/3}P^{1/9})$. Populations are a given that institutional engineers find hard to alter. Thus their connection to legislative parties seems to be of only academic interest, for the time being, but it may change. The same applies to the population dependence of electoral parties. Here, all findings are recent and empirical. Logical models need to be constructed and tested. Once this is done, something useful for practical purposes may emerge.
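The factor-of-2 zones quoted above can be generated for any population. The helper below is a small illustrative sketch; its name and the default factor argument are assumptions introduced here.

```python
def assembly_size_zone(population, factor=2.0):
    """Return (low, expected, high) seats around the cube root of population."""
    expected = population ** (1.0 / 3.0)
    return expected / factor, expected, expected * factor


if __name__ == "__main__":
    for population in (1.0e6, 64.0e6, 343.0e6):
        low, expected, high = assembly_size_zone(population)
        print(f"P={population:.0f}: {low:.0f} to {high:.0f} seats "
              f"(center {expected:.0f})")
```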
Appendix to Chapter 12

The cube root law of assembly sizes: The model

The main purpose of legislative assemblies may be to pass laws, but what they physically do most of the time is talk and listen. The original French term parlement comes from parler, to speak. Interacting with constituents and colleagues, assembly members always face an overload of communication. Minimizing this load is a significant factor for their efficiency. By trial and error, assembly sizes tend to adjust toward maximum efficiency.

Consider two extreme cases. If the assembly is very small, the interaction load within the assembly is low, but the number of constituents per representative is large. On the other hand, if the assembly is very large, the constituent load decreases, but assembly interactions grow even faster. Some intermediary size may be optimal.

Let us put it in terms of the number of communication channels, the approach used previously when modeling mean cabinet duration (Chapter 10). Each active member of the population must have a two-way channel (talking and listening) to a representative, if representative democracy is to have meaning. Channels among constituents also exist, but they do not put a load on the representative. The active population outside the assembly is $(P_a - S)$. The number of constituent channels toward each of the $S$ representatives is $(P_a - S)/S = P_a/S - 1$. But since they both send and receive information, the total number of constituency channels per representative is

$$c_C = \frac{2P_a}{S} - 2.$$

Within the assembly, each member communicates with each of the other $S - 1$ members both as speaker and as listener, meaning $2(S - 1)$ channels per representative. But this is not all. Whenever two members are talking, the other $S - 2$ members are interested in what they are talking about and, figuratively, try to listen in. How many channels does a given representative try to monitor? Each of the other $(S - 1)$ representatives is at one end of a channel to each of the $(S - 2)$ remaining representatives, meaning $(S - 1)(S - 2)$ ends of channels or $(S - 1)(S - 2)/2$ channels. In sum, the number of assembly channels per representative is

$$c_A = 2(S - 1) + \frac{(S - 1)(S - 2)}{2} = \frac{S^2}{2} + \frac{S}{2} - 1.$$

Thus the total number of channels making demands on an average assembly member is

$$c = c_C + c_A = \frac{2P_a}{S} + \frac{S^2}{2} + \frac{S}{2} - 3.$$

As a check, note that when the population is reduced to 1 ($P_a = S = 1$), the number of channels becomes 0, as it should. All channels are of course not equally demanding, but as a first approximation we will assume that they are. Then the load on the assembly member is minimized when $c$ is minimal. Apply differential calculus. The number of communication channels is minimized when the derivative of $c$ with respect to $S$ is made 0:

$$\frac{dc}{dS} = \frac{d}{dS}\left[\frac{2P_a}{S} + \frac{S^2}{2} + \frac{S}{2} - 3\right] = S - \frac{2P_a}{S^2} + \frac{1}{2} = 0.$$

This equation can be transformed into

$$S^3 + \frac{S^2}{2} - 2P_a = 0.$$

For populations of more than 1,000, the central term adds less than 0.5 percent to the first one and can be neglected. Hence the optimal assembly size is

$$S = (2P_a)^{1/3}. \qquad [P > 1{,}000]$$
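The minimization can also be checked by brute force. The sketch below evaluates the channel count $c$ for every integer assembly size in a search range and compares the minimizer with $(2P_a)^{1/3}$. The function names, the search range, and the active population of 500,000 used in the usage lines are illustrative assumptions.

```python
def total_channels(active_population, seats):
    """Channels per representative: c = 2*Pa/S + S^2/2 + S/2 - 3."""
    # Two-way channels to the constituents of one representative.
    constituency = 2.0 * active_population / seats - 2.0
    # Direct two-way channels plus monitored channels among the other members.
    assembly = 2.0 * (seats - 1) + (seats - 1) * (seats - 2) / 2.0
    return constituency + assembly


def optimal_assembly_size(active_population):
    """Integer S that minimizes total_channels, found by exhaustive search."""
    approx = (2.0 * active_population) ** (1.0 / 3.0)
    # Search up to about three times the cube root estimate.
    upper = max(2, int(3 * approx) + 2)
    return min(range(1, upper + 1),
               key=lambda seats: total_channels(active_population, seats))


if __name__ == "__main__":
    pa = 500_000  # hypothetical active population
    print(optimal_assembly_size(pa), (2.0 * pa) ** (1.0 / 3.0))
```

An exhaustive search is used instead of a root finder so that the sketch needs no external library; for national-scale populations the search range stays in the low thousands.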
The approximation $P = 2P_a$ holds within 10 percent when literacy is above 75 percent. Then, simply, $S = P^{1/3}$. Taagepera and Shugart (1989: 181) show graphically how the number of constituent channels decreases with increasing assembly size, while the number of assembly channels increases. The bottom of the curve at medium assembly sizes is rather flat, so that deviations from the optimal $S$ by a factor of 2 do not increase the load per representative appreciably. This is the zone where most actual assemblies are observed to lie.

Some of the implicit assumptions that enter the model are listed in Taagepera and Shugart (1989: 181). With these reservations, the cube root expression has a theoretical foundation as well as empirical confirmation. Hence it qualifies as a law in the scientific sense. The various approximations made do not apply when the population represented is extremely small. Disturbingly for the validity of the model, however, actual assemblies tend to fall noticeably below the expectation even for populations as large as 200,000. I have not found a logical reason. As noted, the actual sizes of the assemblies of US states and of the capital cities of the EU also fall below expectations. The mean shortfall for the latter is only 18 percent, but it makes one wonder whether the model needs adjustment in the case of non-sovereign assemblies.
13 The Law of Minority Attrition
For the practitioner of politics:
• The more important levels have fewer positions, and the share of minorities goes down. A party with a small share of votes gets an even smaller share of seats. If women are few in city councils, they are even fewer in the legislative assembly. The law of minority attrition expresses it quantitatively.
• In first-past-the-post systems, we can calculate the seat shares of all parties from their vote shares. A 1 percentage point increase in votes for a major party produces an increase of about 3 percentage points in seats, but it depends on the number of voters and seats. This ‘responsiveness’ can range from 2 percentage points when the assembly has many seats to 4 when it has few seats. If you feel responsiveness is too steep or too mild, it could be altered by changing assembly size.
• For a guess at the responsiveness, divide the zeroes in the number of voters by the zeroes in the number of seats. For example, with 1,000,000 voters and 100 seats, the responsiveness ratio is 6/2 = 3.0. For more accuracy, use logarithms.
• Country-specific cultural and geographic factors can alter responsiveness.
• The law of minority attrition can be adjusted for multi-seat plurality, such as the US Electoral College, where responsiveness is even higher than in FPTP. It also applies to PR systems, but the responsiveness is so close to 1 (perfect PR) that it may not be of practical interest.
• The law of minority attrition might help determine which part of the ‘rubber ceiling’ on women’s advancement is natural and which part is socially imposed.
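The rule of thumb in the list above can be made concrete. The sketch below computes the responsiveness as a ratio of logarithms and applies the seat-vote relationship that this chapter develops further on, in which seat shares are proportional to vote shares raised to the power $n$. The function names are illustrative assumptions; the numerical inputs are the examples used in the chapter.

```python
import math


def disproportionality_exponent(voters, seats):
    """Responsiveness n = log V / log S (requires more than one seat)."""
    if seats <= 1:
        raise ValueError("the exponent is undefined for a single seat")
    return math.log(voters) / math.log(seats)


def seat_shares(vote_shares, n):
    """Seat shares proportional to vote shares raised to the power n."""
    powered = [v ** n for v in vote_shares]
    total = sum(powered)
    return [p / total for p in powered]


if __name__ == "__main__":
    n = disproportionality_exponent(1_000_000, 100)
    print(n)
    # Two parties at 60 and 40 percent of the votes.
    print(seat_shares([0.60, 0.40], n))
    # Three parties at 40, 35, and 25 percent of the votes.
    print(seat_shares([0.40, 0.35, 0.25], n))
    # Seat gain from a 1 percentage point vote gain near an even split.
    print(seat_shares([0.51, 0.49], n)[0] - 0.50)
```

Counting zeroes amounts to using base-10 logarithms of round numbers; any logarithm base gives the same ratio.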
We now move on to investigate the impact of electoral systems on the distribution of votes. As indicated in Figure 7.1, the electoral system has a massive, direct, and largely mechanical effect on the distribution of seats. When it comes to votes, it becomes a diffuse and indirect psychological effect (plus other strategic and logistic effects). Voters are in principle free to vote for any party, regardless of institutional constraints. However, they might not be free to vote for the seventh-largest party when the FPTP system has driven this party out of existence. And they may be free to vote for the third-largest party only while being aware that their vote may not count in a positive way—and may count negatively by reducing the vote for their second preference, who otherwise could have a chance to win. Conversely, with large-magnitude PR, one may feel one’s vote counts, but one’s preferred party may join a coalition cabinet that may soft-pedal the program items that made one vote for that party in the first place (Strøm 1990). So, electoral systems affect seats immediately and votes in the long term. However, as one looks at a single election, votes come first, and seats are allocated later. Therefore, it made sense in early electoral studies to take the votes as given and try to explain the seats in terms of votes. In contrast, this book focused first on the seat share distribution, due to realization that the institutional impact on the average of many elections operates in the opposite direction, starting with institutions. This impact on seat distribution has been addressed with some success. It is now time to ask: What can average seat share distributions tell us about the average vote shares? But we have to take a detour. This chapter reviews what can be found by going from votes to seats. A broad predictive model is presented, the law of minority attrition. 
Only thereafter can Chapter 14 proceed to the reverse approach, trying to predict votes from seats, which themselves are estimated from institutional givens.
The Law of Minority Attrition: Women’s ‘Rubber Ceiling’, Elections, and Volleyball Scores

In the course of sequential selection processes, the categories underrepresented in the early phases tend to be underrepresented even more in the later phases, where the positions available are scarcer. Robert Putnam (1976: 33) called it the ‘law of increasing disproportionality’ as one moves up the ladder of authority. Suppose the share of an ethnic minority is small at the lower echelons, where positions are relatively numerous: total seats in provincial legislatures, or assistant professorships at universities. If so, then its share tends to be even smaller at higher echelons with fewer seats or positions: national assemblies, or full professorships. The same goes for women, although they are not a minority in the overall population, as their careers raise them toward what used to be called the ‘glass ceiling’ but more recently has been characterized as a ‘rubber ceiling’ because it is not firm but offers ever stronger resistance as one moves up.

The pattern is clear in US politics: ‘The higher one goes, the fewer women one finds in public office’ (Darcy, Welch, and Clark 1994). Table 13.1 shows rough percentages of women in US public offices of the early 1980s. Women’s share decreases as positions are fewer.

Table 13.1. Women’s share in US public office

Public office                   Number of positions    Women’s share (%)
City council members            ~100,000               20
Mayors                          ~10,000                10
State lower house members       ~10,000                10
US House of Representatives     435                    5
US Senate                       100                    2

Source: Adjusted from Taagepera (1994).

Something analogous happens in elections. The number of positions is large in the electorate. It is much smaller in the assembly elected, and the seat share of minor parties tends to be smaller than their vote share. The effect is minor in PR systems and marked in plurality systems, but it is there. Now suppose the assembly is an electoral college that chooses the president, by plurality vote. The number of positions is reduced to one, and minority attrition is bound to become extreme, as all seat shares but one are reduced to zero.

But the phenomenon of attrition applies even beyond social categories. In volleyball, the total number of points won by either team is relatively large compared to the total number of games won, which in the US championships can range only from 3 to 5. In terms of Table 13.1, games are scarcer ‘positions’ than points, and the losers’ share of total games won is smaller than their share of the points earned. The ability of votes to win seats in FPTP elections depends on their location among electoral districts.
Similarly, points help to win games, depending on their location among the game periods.

This brings us to two fundamental questions. Can we formulate logically founded predictive models for minority attrition processes such as party votes to seats, volleyball points to games, and women’s shares in city councils and national assemblies? And if we do, are the models different, or does the same model fit all phenomena? To put it differently, are the mechanisms that underlie minority attrition in elections specifically political, more broadly social, or imposed by even more general mechanisms? If the detailed patterns of attrition of minorities are similar in elections and volleyball scores, then it would suggest that the underlying mechanisms do not depend on human nature any more than the normal distribution of weights of humans or peas depends on the specific nature of humans or peas. This is what makes the study of volleyball scores worthwhile. The outcome is more important for elections than for playing ball.

The brief answer is that the minority attrition processes can indeed be expressed by a quantitatively predictive model. Moreover, the attrition processes involved in volleyball matches and elections do look similar. The same equation comes close to expressing the attrition in both cases, and the only free parameter in this equation is the number of positions available at various stages of selection.

In sum, the basic attrition process may not depend on specifically political or even broadly human considerations. A certain degree of attrition is inherent to minority–majority relationships. However, political or social extras can be thrown in so as to reinforce or reverse it. Males are a minority among US school teachers, but there is no attrition of this minority when it comes to advancement to school principals. Quite to the contrary! Top physicians also are males even in societies where most physicians are females. The law of minority attrition may supply a baseline so as to determine which part of attrition is natural and which part is socially imposed.
In the following, the law of minority attrition will be presented in the context of FPTP elections, where it was first developed under the name of the seat–vote equation (Taagepera 1969, 1973). It is later extended to multi-seat districts with seat allocation by PR and plurality. Analysis of volleyball scores and women’s ‘rubber ceiling’ is given in the chapter appendix.
The Law of Minority Attrition for FPTP Systems

How are the seat shares ($s_i$, $s_j$) of two parties, i and j, related to their vote shares ($v_i$, $v_j$)? When all seats are allocated by plurality in single-seat districts, the expected relationship can be shown to be

$$\frac{s_i}{s_j} = \left(\frac{v_i}{v_j}\right)^n,$$

where the ‘disproportionality exponent’ (n) is

$$n = \frac{\log V}{\log S}.$$
Here V is the total number of voters and S the total number of single-seat districts. This is the format in which a special case of the law was first observed, around 1910. It was found that

$$\frac{s_i}{s_j} = \left(\frac{v_i}{v_j}\right)^3$$

fits British elections. The relationship came to be called the ‘cube law’ of Anglo-Saxon elections (Kendall and Stuart 1950). When vote shares of two major parties are 60 to 40 percent, then their seat shares tend to be around 77 to 23. This empirical regularity was observed to apply in several countries where FPTP was used. The connection of n to the number of districts and votes was established around 1970 (Taagepera 1969, 1973). How this model is theoretically derived is explained in the chapter appendix.

For calculation of seat shares, the following format is more practical:

$$s_i = \frac{v_i^n}{\sum_k v_k^n},$$

where the summation is over all parties that receive votes. For instance, suppose the vote percentages are 40-35-25, and n = 3. Then $s_1 = 0.40^3/(0.40^3 + 0.35^3 + 0.25^3) = 0.064/(0.064 + 0.043 + 0.016) = 0.064/0.123 = 0.52 = 52\%$. I have written it out in detail, because then the other seat shares can be computed quickly: $s_2 = 0.043/0.123 = 35\%$ and $s_3 = 0.016/0.123 = 13\%$. Here the largest party gets a bonus, the second-largest breaks even, while the third party is heavily penalized. This is Duverger’s mechanical effect spelled out quantitatively.

The equation above is fully equivalent to $s_i/s_j = (v_i/v_j)^n$, as one can easily check by dividing $s_i = v_i^n/\sum_k v_k^n$ by $s_j = v_j^n/\sum_k v_k^n$. This equation and $n = \log V/\log S$ are quite different in nature. The first has little specifically sociopolitical content, as the format $y_i = f(x_i)/\sum_k f(x_k)$ applies to many physical as well as social phenomena. The second is more specific. The exponent n reflects the disproportionality of the electoral system. When only two parties contest the seats, n is what has been called the responsiveness of an electoral system to a shift in votes (Tufte 1973). Suppose both
of these parties have close to 50 percent of the votes. Then n = 3.0 means that when the vote share of a party increases by 1 percentage point, its seat share increases by 3 percentage points. For n = 1, we have perfect proportionality of seat shares to vote shares. The more n increases beyond 1, the steeper the seats-versus-votes curve becomes at $v_i = 50\%$, meaning a more disproportional seat distribution. When n tends to ∞, the curve tends to become vertical at $v_i = 50\%$, and we reach utmost disproportionality: the party with the most votes wins all the seats.

The equation $n = \log V/\log S$ indicates that disproportionality increases when more voters are added but decreases when more single-seat districts are added. When the cube root law of assembly sizes applies (see Chapter 12), then $\log V/\log S \approx \log P/\log S = 3$. As an example, suppose P = 10 million. If the cube root law applies, $S = (10^7)^{1/3} = 215$, and $\log P/\log S = 7/2.33 = 3.00$. Suppose only 5 million people vote. In this case, $\log V/\log S = 6.70/2.33 = 2.87$, which is within 5 percent of 3.00.

It follows from $s_i = v_i^n/\sum_k v_k^n$ and $n = \log V/\log S$ that

$$(\log S)\,\log\frac{s_i}{s_j} = (\log V)\,\log\frac{v_i}{v_j}.$$

This is the law of minority attrition expressed as a single equation, in the seats–votes context. This equation is symmetric in seats and votes, which has important implications, as discussed in the chapter appendix.

The law of minority attrition applies beyond national assemblies. When n = 1, it would express perfect PR. This situation could formally emerge with FPTP in the extreme case where there are as many seats as voters, with each voter voting for oneself. Then $n = \log V/\log S = 1$. Some trade union elections actually come close to the limit n = 1, because even a 2-worker shop elects a shop steward. Thus the average n for trade union elections is predicted to be around 1.5. For agreement with data, see the graph in Taagepera and Shugart (1989: 163). In national assemblies, n ranges from 2.5 to 4.0.
Finally, in direct presidential elections with plurality rule, S = 1 means logS = 0, so that n = logV/logS tends toward ∞—which corresponds, as it should, to a seat ratio of 1:0, regardless of the vote ratio. Let us see in more detail what the attrition law can and cannot do.
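The two equations of this section can be sketched in a few lines of Python. This is only an illustration: the function names are assumptions introduced here, not notation from the text, and the treatment of the limiting case S = 1 (winner takes all, ties split equally) is one possible convention.

```python
import math


def disproportionality_exponent(total_votes, total_seats):
    """Return n = log V / log S (hypothetical helper name).

    With a single seat (S = 1), log S = 0, so n is treated as infinite.
    """
    if total_seats == 1:
        return math.inf
    return math.log(total_votes) / math.log(total_seats)


def seat_shares(vote_shares, n):
    """Return s_i = v_i**n / sum_k v_k**n for every party."""
    if math.isinf(n):
        # Limiting case: the party with the most votes takes everything
        # (an exact tie is split equally, an assumption made here).
        top = max(vote_shares)
        winners = [1.0 if v == top else 0.0 for v in vote_shares]
        return [w / sum(winners) for w in winners]
    powered = [v ** n for v in vote_shares]
    total = sum(powered)
    return [p / total for p in powered]


# Worked example from the text: votes 40-35-25 with n = 3.
print([round(s, 2) for s in seat_shares([0.40, 0.35, 0.25], 3)])  # [0.52, 0.35, 0.13]

# Cube law for a 60-40 split of the vote.
print([round(s, 2) for s in seat_shares([0.60, 0.40], 3)])  # [0.77, 0.23]

# Five million voters and 215 single-seat districts, as in the text.
print(round(disproportionality_exponent(5_000_000, 215), 2))  # 2.87
```

The ratio of logarithms does not depend on the base, so natural logarithms give the same exponent as the base-10 values quoted in the text.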
Testing the Law of Minority Attrition for FPTP Systems

The cube law emerges from the general minority attrition law because democratic assemblies tend to follow the cube root law of assembly sizes.
Table 13.2. Caribbean countries with unusually high disproportionality exponents

| Country | Period | Assembly size | Exponent (n) | Two-party elections |
|---|---|---|---|---|
| Antigua | 1980–9 | 17 | 3.5 | 1980, 1989 |
| Barbados | 1966–91 | 26 | 3.5 | 1971, 1976, 1981, 1986 |
| Trinidad | 1961–91 | 36 | 3.6 | 1961, 1971, 1986 |
| St Lucia | 1974–92 | 17 | 3.8 | 1974, 1979, 1992 |
| Grenada | 1972–84 | 15 | 3.9 | 1972, 1976, 1984 |
| St. Vincent | 1974–89 | 13 | 4.0 | 1984, 1989 |

Data source: Nohlen (1993).
When the actual number of seats and valid votes is used, the expected disproportionality exponent around 1970 was observed to range from n = 2.61 for the relatively large British House of Commons to n = 3.17 in the relatively small House of Representatives in New Zealand (Taagepera and Shugart 1989: 166). Assemblies in small island nations tend to fall below the cube root of population. This means that the disproportionality exponent n reaches 3.5 and even 4, when they use FPTP (Lijphart 1990). The winner’s advantage is huge, and the opposition is often utterly decimated. These nations may think they use the British electoral system, but the outcome is more extreme because of excessively small assembly sizes.

Table 13.2 shows assembly sizes and the resulting disproportionality exponents n in six small island countries where this exponent is unusually high, ranging from 3.5 to 4.0. These are the only countries in Latin America and the Caribbean where data listed in Nohlen (1993) lead to n ≥ 3.5. Their assemblies are very small even for their small populations. With such high exponent values, any disagreements with the model should stand out most strongly. This is why I will use them as test cases throughout this chapter. Elections where third party votes are less than 5 percent are indicated in the table. These practically pure two-party elections will be seen to fit the attrition law better than multiparty elections.

Figure 13.1 graphs $s_i$ versus $v_i$ for individual elections in these countries, for data in Nohlen (1993), using n = 3.75. Two curves and two sets of data points are shown. The ‘one-opponent’ curve represents the prediction of the attrition law when only two parties run. Then the attrition law becomes

$$s = \frac{v^n}{v^n + (1 - v)^n},$$
Figure 13.1. Seat shares vs. vote shares for FPTP with high disproportionality exponents: attrition law and Caribbean data. [Graph of Seats (%) against Votes (%), both from 0 to 100, showing the line s = v, the one-opponent and two-opponent curves, and separate symbols for two-party and multiparty contests.]
s and v being the seat and vote shares, respectively, of either of the two parties. The corresponding data for elections where the third party vote share is below 5 percent are shown with round symbols. Recall that the disproportionality exponent n (responsiveness) is the slope of the curve at $s_i = v_i = 50\%$, meaning that when the vote share of a party increases by 1 percentage point around 50 percent, its seat share increases by n = 3.75 percentage points. At very low or high vote shares, the seat–vote curve bends so as to respect the conceptual anchor points (0; 0) and (100; 100), along with a third anchor point at (50; 50). This is an example of a general curve with 3 anchor points, as discussed in Beyond Regression (Taagepera 2008). The data largely agree with the prediction, with some random scatter spread almost evenly on the two sides of the curve.
When more than two significant parties compete, no unique curve, $s_i$ versus $v_i$, emerges from $s_i = v_i^n/\sum_k v_k^n$, because the seats of a given party depend on how the votes are distributed among its competitors. A party with 40 percent votes can be a big winner when the distribution is 40-30-30, while 40-50-10 would make it a relative loser. The outcomes from all such constellations can be calculated from $s_i = v_i^n/\sum_k v_k^n$, but graphical representation becomes more complicated. The ‘two-opponent’ curve in Figure 13.1 shows what the minority attrition law predicts when a party faces two opponents with exactly the same vote shares. Here the equation is

$$s = \frac{v^n}{v^n + 2^{1-n}(1 - v)^n}.$$
A split opposition would greatly benefit the given party. Elections with more than two significant parties running are shown in Figure 13.1 with cross symbols. As expected, they are mostly on the left of the one-opponent curve. Small parties are even on the left of the two-opponent curve. Surprisingly, they actually tend to receive PR, indicated by the line s = v, which would correspond to n = 1. We will return to this disagreement. For the moment, let us observe that the general case cannot be graphed and tested in the format of Figure 13.1, as each data point corresponds to a different curve. We have to look for other formats, and it is not easy.

One approach is to graph seat ratios versus vote ratios for all parties, always relative to the party with the largest vote share: $s_i/s_1$ versus $v_i/v_1$. This approach was used by Taagepera and Shugart (1989: 160–8). Shugart (2007) applies a related format to 208 elections, ranging from Canadian provinces to the Caribbean. He graphs the expected ratio of seats of the two major parties, calculated as $(v_1/v_2)^n$, against the actual seat ratio, $s_1/s_2$. The scatter is wide, but on the average the expected ratio is close to the actual, except for nationwide elections in India. Here the actual ratio falls steadily much below the expected. In India, both largest parties have been unusually small.

A general shortcoming of any ratio approach is that it boosts relative error, and it is especially severe here for the following reason. When the largest share surpasses its expected size by a random amount, the second-largest share is likely to be correspondingly smaller by a similar amount. Hence the ratio $(v_1 + \varepsilon)/(v_2 - \varepsilon)$ is boosted doubly: by the +ε in the numerator and by the −ε in the denominator.
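The one-opponent and two-opponent curves discussed above can be checked numerically. The following Python sketch uses illustrative function names (assumptions, not notation from the text) and the exponent n = 3.75 used for Figure 13.1. It also confirms that the two-opponent formula is just the general seat-share formula applied to a 40-30-30 constellation, and that the slope of the one-opponent curve at 50 percent equals n.

```python
def seat_shares(vote_shares, n):
    """General attrition law: s_i = v_i**n / sum_k v_k**n."""
    powered = [v ** n for v in vote_shares]
    total = sum(powered)
    return [p / total for p in powered]


def one_opponent(v, n):
    """Seat share of a party with vote share v facing a single opponent."""
    return v ** n / (v ** n + (1 - v) ** n)


def two_equal_opponents(v, n):
    """Seat share of a party facing two opponents with equal vote shares."""
    return v ** n / (v ** n + 2 ** (1 - n) * (1 - v) ** n)


n = 3.75
v = 0.40

print(round(one_opponent(v, n), 3))         # share against one opponent at 60 percent
print(round(two_equal_opponents(v, n), 3))  # share against two opponents at 30 percent each

# The special-case formula agrees with the general one for votes 40-30-30.
general = seat_shares([0.40, 0.30, 0.30], n)[0]
print(abs(general - two_equal_opponents(v, n)) < 1e-12)

# Numerical slope of the one-opponent curve at v = 50 percent.
h = 1e-5
slope = (one_opponent(0.5 + h, n) - one_opponent(0.5 - h, n)) / (2 * h)
print(round(slope, 2))
```

With these inputs a party with 40 percent of the votes receives roughly 18 percent of the seats against a single opponent, but close to 60 percent when the opposition is split evenly, which illustrates how strongly a split opposition benefits the given party.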
Figure 13.2. Actual seat shares vs. those calculated from the attrition law, for FPTP systems with high disproportionality exponents. [Graph of Actual seat share (%) against Expected seat share (%), both from 0 to 100, with the line ‘Actual = Expected’ and separate symbols for two-party and multiparty contests.]
When more than two parties run, the most precise way to test the attrition law might be the following. Calculate the expected seat share for each party directly from $s_i = v_i^n/\sum_k v_k^n$, using all the actual vote shares and the exponent value $n = \log V/\log S$ calculated from the number of valid votes in that particular election. Graph the actual seat shares against these expected ones. Because this approach is relatively time-consuming, it has not been used previously, but there may not be any way around it.

Figure 13.2 graphs the previous Caribbean data, based on Nohlen (1993), in this format: actual seat shares versus the expected. Again, two- and multiparty contests are shown with distinct symbols. For the two-party contests, limited and balanced scatter around the line ‘Actual = Expected’ shows that data fit well with the predictive model. In contrast, multiparty contests tend to give the largest party fewer seats than would be expected on the basis of the attrition law, with the corresponding boost to the smaller parties. This one-sided deviation from expectation does not recur in the aforementioned larger sample investigated by Shugart (2007), so it is too early to declare that the model applies only to almost pure two-party contests. But a thorough testing in the format of Figure 13.2, using all available FPTP elections, would be called for. We should also examine the conceptual underpinnings of the model, which may lead to a correction term for the law of minority attrition.
Using Votes to Predict the Effective Number of Legislative Parties and Deviation from PR

Effective number of legislative parties and deviation from PR are central to the study of the impact of electoral systems on party systems. How well can the attrition law predict these quantities on the basis of votes? I use again the previous Caribbean data, where the disproportionality exponent is extreme, so that large deviations from PR can be expected.

Figure 13.3 shows the actual effective number of legislative parties, based on Nohlen (1993), graphed against the one calculated from the vote shares with the help of the attrition law. The scatter around the line ‘Actual = Expected’ is limited and balanced for two-party contests, showing agreement, on the average. One does not expect an institutional model to fit the outcomes of individual elections but only the overall tendency. For multiparty contests, in contrast, the attrition law steadily underestimates the number of legislative parties, in line with Figure 13.2, where this model overestimated the largest seat share. Grenada 1990 stands out as an extreme case. Once more, Shugart’s aforementioned study (2007) suggests that more data analysis is needed before drawing conclusions.

When we proceed to deviation from PR (Gallagher’s $D_2$), we are reaching the predictive bounds of the attrition law for the following reason. Deviation from PR involves subtraction of two variables, $s_i - v_i$, which boosts random error. When the largest seat share surpasses the vote share by more than the expected amount, through random fluctuation, then the other seat shares are bound to be correspondingly smaller. Note that, in contrast, calculation of effective number of parties involves only additions.
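How the expected values of both quantities follow from vote shares can be sketched in Python. The sketch assumes the standard definitions, which are not restated in this section: the effective number of parties as the inverse of the sum of squared shares, and Gallagher's index as the square root of half the sum of squared seat-vote differences. The 55-45 election is a hypothetical input, not a data point from Nohlen (1993), and all function names are illustrative.

```python
import math


def attrition_seat_shares(vote_shares, n):
    """Expected seat shares from the attrition law: v_i**n / sum_k v_k**n."""
    powered = [v ** n for v in vote_shares]
    total = sum(powered)
    return [p / total for p in powered]


def effective_number(shares):
    """Effective number of parties, assumed here to be 1 / sum(share**2)."""
    return 1.0 / sum(s * s for s in shares)


def gallagher_index(vote_shares, seat_shares):
    """Gallagher's deviation from PR, assumed to be sqrt(0.5 * sum((s - v)**2))."""
    squared = sum((s - v) ** 2 for v, s in zip(vote_shares, seat_shares))
    return math.sqrt(0.5 * squared)


# Hypothetical two-party election with the exponent used for Figure 13.1.
votes = [0.55, 0.45]
n = 3.75

expected_seats = attrition_seat_shares(votes, n)
expected_ns = effective_number(expected_seats)
expected_d2 = gallagher_index(votes, expected_seats)

print([round(s, 3) for s in expected_seats])
print(round(expected_ns, 2))
print(round(expected_d2, 3))  # expressed as a fraction, not in percent
```

The effective number sums squared shares, whereas the deviation index is built from differences between seat and vote shares, which is where random error gets amplified.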
Figure 13.3. Actual effective numbers of legislative parties vs. those calculated from the attrition law, for FPTP systems with high disproportionality exponents. [Graph of Actual Ns against Expected Ns, with the line ‘Actual = Expected’ and separate symbols for two-party and multiparty contests; the outlier Grenada 1990 is labeled Gr90.]
For the multiparty contests in Caribbean countries the correlation between actual and predicted deviations from PR becomes almost nil, and there is no point in graphing it. For two-party contests, shown in Figure 13.4, the scatter around the line ‘Actual = Expected’ is wide but fairly balanced. In sum, the law of minority attrition can be used to infer the effective numbers of legislative parties from the vote shares of parties, at least for some types of elections. For deviation from PR, random scatter tends to take over. We are here approaching the limits of predictability based on this model.
Figure 13.4. Actual deviations from PR vs. those calculated from the attrition law, for two-party FPTP systems with high disproportionality exponents. [Graph of Actual D2 against Expected deviation from PR (D2), both from 0 to 30, with the line ‘Actual = Expected’.]
The Law of Minority Attrition for Multi-Seat Districts

The law of minority attrition can be adjusted for multi-seat plurality, such as the US Electoral College, where responsiveness is even higher than with FPTP. It can also be adjusted for PR systems, where the responsiveness is close to 1 (perfect PR). The corresponding disproportionality exponents are

$$n = \left(\frac{\log V}{\log S}\right)^{1/M} \quad \text{[PR]}$$

and

$$n = \frac{\log V}{\log E}. \quad \text{[plurality]}$$
Here E stands for the number of electoral districts. At M = 1, both expressions yield $n = \log V/\log S$. See the chapter appendix for elaboration and possible extension to Two-Rounds elections.
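The two multi-seat exponents can be compared numerically with a short Python sketch. The function names are illustrative assumptions, and the inputs (one million voters, a 100-seat assembly) are chosen only so that log V/log S = 3.

```python
import math


def exponent_pr(votes, seats, magnitude):
    """PR exponent: n = (log V / log S) ** (1 / M)."""
    return (math.log(votes) / math.log(seats)) ** (1.0 / magnitude)


def exponent_plurality(votes, districts):
    """Plurality exponent: n = log V / log E, infinite for a single district."""
    if districts == 1:
        return math.inf
    return math.log(votes) / math.log(districts)


V, S = 1_000_000, 100  # illustrative values giving log V / log S = 3

for M in (1, 5, 100):
    print(M, round(exponent_pr(V, S, M), 3))

# With single-seat districts (E = S, M = 1) both expressions coincide.
print(round(exponent_plurality(V, S), 3))

# Multi-seat plurality in 20 five-seat districts: fewer districts, higher n.
print(round(exponent_plurality(V, S // 5), 3))
```

Raising the district magnitude pulls the PR exponent rapidly toward 1, while merging districts under plurality rule pushes the exponent above the FPTP value.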
Why Some FPTP Contests Deviate from the Law of Minority Attrition

The century-long awareness of the tendencies expressed in the ‘cube law’ may have contributed to its own demise in recent UK elections (Blau 2004), as parties have learned to counteract this natural tendency by concentration of resources into the most profitable districts. The result is that the disproportionality exponent has recently been reduced to a value lower than $\log V/\log S$. In the USA, the pattern has been affected in the past by one-party elections in the South and traditional gerrymander elsewhere. More recently, bipartisan gerrymander also enters. Once more, one is reminded that few actual electoral systems are simple. The attrition law still holds for FPTP systems as a unifying first approximation that joins trade union elections (n ≈ 1.5), assembly elections in single-seat districts (n = 2.5 to 4), and direct presidential elections, as a limiting case where S = 1 and n tends toward ∞.

The fading of the cube law in Britain may be among the first instances where political science expressly encounters a broad question: When does our understanding of the world alter the world itself? In quantum mechanics, the observation of an elementary particle inevitably alters either its position or momentum (the famous principle of indeterminacy), but the problem fades in macroscopic physics. Microorganisms respond to the invention of antibiotics by mutations that increase their resistance. Awareness of the law of gravitation helped humans to devise ways to circumvent its impact and build airplanes. When political science develops laws that describe simple political phenomena, politicians can be expected to look for loopholes. Their inventiveness can match that of aeronautical engineers.
Coming now to the Caribbean countries graphed in Figures 13.2 and 13.3, we observed that multiparty contests tend to give the largest party fewer seats than the attrition law would predict, with the corresponding boost to the smaller parties. This deviation suggests that we should test with more data but also examine the conceptual underpinnings of the model. This may lead to a correction term for the law of minority attrition.
One underlying assumption of the attrition law is that the support of parties in the individual districts follows some ‘regular’ distribution pattern. At the very least, a unimodal distribution is implied. However, the distribution of effective number of electoral parties across districts is far from unimodal in all four countries graphed by Chhibber and Kollman (2004: 42). The USA comes closest, with a sharp symmetric peak at N = 2.0 marred only by uncontested elections that produce a minor spike at N = 1.0. Great Britain has a high one-sided peak at N = 2.0, followed by extremely few cases at N = 2.1 and a low and wide secondary peak at N = 2.5. Canada and India follow vaguely similar patterns. The striking common feature of the latter three countries is the lack of districts with 2.1 effective parties. Such a value, barely higher than N = 2.0, could result from the presence of a third party of only 3 percent when the two major parties are balanced (48.5-48.5-3), or a larger third party, up to 7 percent, when the distribution is fairly lopsided, such as 60-33-7. The second peak around N = 2.5 would correspond to constellations such as 44-44-12, with a larger third party. What these bimodal patterns might express is the reluctance of third parties to run in districts where they have very little support. It remains to be determined whether and how the attrition law could be modified so as to account for bimodal distributions of party strengths.
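The effective numbers quoted for these district-level constellations can be verified with a brief Python check. It assumes the standard definition of the effective number of parties as the inverse of the sum of squared vote shares; the function name is illustrative.

```python
def effective_number(shares):
    """Effective number of parties, assumed here to be 1 / sum(share**2)."""
    return 1.0 / sum(s * s for s in shares)


constellations = {
    "48.5-48.5-3": [0.485, 0.485, 0.03],
    "60-33-7": [0.60, 0.33, 0.07],
    "44-44-12": [0.44, 0.44, 0.12],
}

for label, shares in constellations.items():
    print(label, round(effective_number(shares), 2))
# 48.5-48.5-3 2.12
# 60-33-7 2.11
# 44-44-12 2.49
```

The first two constellations land at about 2.1, the value that is nearly absent in three of the four countries, while the third lands at the secondary peak near 2.5.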
Conclusion

We have developed a predictive model for FPTP systems to convert from vote shares of parties to their seat shares. It covers the basic patterns of a wide variety of elections, from direct and indirect presidential elections to parliamentary elections with FPTP or List PR, and with a slight modification, multi-seat plurality. In the context of elections, the model is expressed as a ‘seat–vote equation’ in two senses. It converts votes to seats, and the disproportionality exponent (‘responsiveness’) of the conversion depends on the total votes and seats, at least in FPTP and List PR. Beyond elections, this law of minority attrition applies to attrition of social minorities as their careers raise them toward the ‘rubber ceiling’. It can be tested in the unlikely context of volleyball scores.

In the preceding chapters, district magnitude and assembly size played such symmetric roles that their effect could be condensed into a single indicator, the seat product MS. This convenient package no longer works in the conversion of votes into seats. Here S enters the expression
n = logV/logS separate from M. There is no inherent reason why M and S should come in a single package in all aspects of electoral studies. Now that we see them part company, there is even more cause to marvel about their staying together in the same package MS for so long—all the way from the number of seat-winning parties to mean cabinet duration. Once more, it is time to remind ourselves that logical regularities can apply only to averages in simple electoral systems, with equal numbers of seats and votes per district. The regularities observed depend on a quasinormal (or at least unimodal) distribution of voting strengths of parties across the districts. If a country consists of ethnically distinct regions, and the parties are ethnically based, then FPTP could lead to pretty proportional outcomes (exponent n close to 1) rather than anything like the cube relationship. Thus country-specific cultural, geographic, and political factors can alter responsiveness. The law of minority attrition is but a first approximation. It enables us to calculate the seat shares of all parties from their vote shares, as long as some known and possibly also some unknown assumptions hold. Thus, it supplies a base line. If (and only if) deviations from its predictions are observed do we have to ask what further factors enter.
Appendix to Chapter 13

This appendix offers a formal derivation of the law of minority attrition for FPTP systems, extends it to multi-seat districts, and tests it with volleyball scores and women’s shares in politics.
Derivation of the law of minority attrition

My Master’s thesis on ‘The Seat–Vote Equation’ (Taagepera 1969) at the University of Delaware first formulated the law of minority attrition and indicated its two forms: a single symmetric equation or two separate ones. The same year, Henri Theil (1969) offered formal proof that the mechanical effect of FPTP on the transformation of vote shares into seat shares must follow the format

$$\frac{s_i}{s_j} = \left(\frac{v_i}{v_j}\right)^n.$$

The important observation was that, among all functions of the form $s_i/s_j = F(v_i/v_j)$, this is the only one that does not lead to inconsistencies in the presence of more than two parties. Here we have an example of the ‘Sherlock Holmes approach’: winnowing out the inconsistent options leaves only one acceptable form. The alternative format $s_i = v_i^n/\sum_k v_k^n$ automatically follows.

An alternative approach is to observe that the format $y_i = f(x_i)/\sum_k f(x_k)$ results whenever the values of $y_i$ depend on $x_i$ and their sum is 1: $y_i = f(x_i)$ and $\sum_i y_i = 1$. These are very loose conditions and hence fit many physical and social phenomena. Impose the further condition that the ratio of two values of y must be a function of the ratio of the corresponding values of x. This means $y_i/y_j = F(x_i/x_j)$, as presumed by Theil. Then $f(x) = ax^n$ is the only satisfactory function f(x). Hence $y_i = x_i^n/\sum_k x_k^n$, or in terms of s and v, $s_i = v_i^n/\sum_k v_k^n$.

But why presume that ratios depend on ratios, as expressed by $s_i/s_j = F(v_i/v_j)$, rather than presuming that differences depend on differences, as expressed by $s_i - s_j = F(v_i - v_j)$? Still other combinations of two seat shares might be presumed to depend on the same combination of the corresponding vote shares. It can be shown (Taagepera and Shugart 1989: 185) that the relationship of differences can be valid only for perfect PR, and that other combinations run into inconsistencies.

Theil (1969) left the value of exponent n open. Its value 3 in the empirical ‘cube law’ remained unexplained. It results when a certain degree of district-to-district variability is assumed, but this degree of variability itself had to be arbitrarily chosen so as to yield n = 3. In contrast, my Master’s thesis, the gist of which reached print several years later (Taagepera 1973), started with the following thought experiment that shows that the value of n cannot be 3 under all circumstances. First, reduce the number of seats available gradually to 1, which corresponds to direct presidential election. Whenever party j has the most votes, it wins the single seat, so that the seat ratio must become $s_i/s_j = 0/1$.
Such a ratio can be obtained from the vote ratio $(v_i/v_j)$ only when the exponent n is made to tend to ∞. Next, go in the opposite direction and increase the number of seats until it equals the number of voters (V). Assuming that everyone runs and votes for oneself, perfect PR would prevail. This would require n = 1 in the equation $s_i/s_j = (v_i/v_j)^n$. It becomes clear that the exponent n must decrease from ∞ to 1, as the number of seats (S) increases from 1 to V. The number of seats available matters.

It follows that the number of voters also matters. Suppose we have 100 single-seat districts for a million voters and an exponent larger than 1 is observed to apply, like the cube law. Now, for the same 100 districts, reduce the number of voters to 100. Perfect PR (n = 1) must set in, meaning that the exponent decreases from some larger value to 1. Thus, at constant S, the exponent n must decrease with decreasing number of voters (V). In sum, n is a function of both V and S: n = n(V, S). More specifically, n is an increasing function of V and a decreasing function of S.

Next, consider multistage elections, where V voters elect a larger electoral college of C electors, who elect the final body of S members. The exponents in the two stages are n(V, C) and n(C, S), respectively. For the total process it is n(V, S). Consistency requires that the final outcome must be the same, whether we calculate by stages or directly: V → C → S should be equivalent to V → S. Such consistency requires that n(V, C)n(C, S) = n(V, S). This is possible only when n(V, S) has the form n(V, S) = f(V)/f(S). Here a function of V is divided by the same function of S. It then results from $s_i = v_i^n/\sum_k v_k^n$ that

$$f(S)\,\log\frac{s_i}{s_j} = f(V)\,\log\frac{v_i}{v_j}.$$
This is the minority attrition law, except that we still have to specify the function f. When S = 1, n = f(V)/f(S) must tend toward ∞. Hence f(1) = 0 is required. Among the functions that satisfy this condition, the simplest are f(x) = x − 1 and f(x) = log x. Empirical data (cube law, in particular) do not agree with f(x) = x − 1 and strongly point in the direction of f(x) = log x. The theoretical argument to that effect, however, made in Taagepera (1973) and repeated in Taagepera and Shugart (1989: 187), remained debatable. This was a lasting weak link in the theory of the minority attrition law.

I now offer the following reasoning. When S increases from 1 to V, the exponent n decreases from ∞ to 1. What would be the mid-ranges for S and n? The mid-range for S is easy to define. Given that S and V can take only positive values and V can be much larger than 1, the meaningful mean of 1 and V is their geometric mean (cf. Taagepera 2008). Hence the mid-range of S is defined as $S = V^{1/2}$. For n, the mid-range is hard to define, because one of the limits tends to ∞. Observe, however, that the minority attrition law is symmetric in seats and votes. It corresponds not only to $s_i = v_i^n/\sum_k v_k^n$ but also to $v_i = s_i^m/\sum_k s_k^m$, where m = 1/n = f(S)/f(V). In contrast to n, this m has a finite range, from m = 0 (when S = 1) to m = 1 (when S = V). Given that m can become 0, the use of geometric mean is excluded, and arithmetic mean defines the mid-range in this case as m = 1/2.

Would the two mid-ranges, $S = V^{1/2}$ and m = 1/2, correspond to each other? In principle, $S = V^{1/2}$ could correspond to a value lower or higher than m = 1/2, but we have no reasons to prefer one direction to the other. Thus, in the absence of any further information, the best guess is that $f(V^{1/2})/f(V) = 1/2$. The function f(V) = log V satisfies this condition. It also satisfies f(1)/f(V) = 0 and f(V)/f(V) = 1.
Furthermore, by taking the mean of S = 1 and $S = V^{1/2}$, thereafter the mean of $S = V^{1/4}$ and S = 1, and so on, it becomes clear that f(V) = log V is the only acceptable function. Thus

$$n = \frac{\log V}{\log S}.$$
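The consistency and mid-range requirements used in this derivation can be illustrated numerically. In the Python sketch below, the sizes of the electorate, the electoral college, and the final body, as well as the vote shares, are hypothetical values chosen for round exponents; the function names are likewise illustrative.

```python
import math


def exponent(larger_body, smaller_body):
    """n = f(larger) / f(smaller) with f = log."""
    return math.log(larger_body) / math.log(smaller_body)


def attrition(shares, n):
    """One selection stage: share_i**n / sum_k share_k**n."""
    powered = [x ** n for x in shares]
    total = sum(powered)
    return [p / total for p in powered]


V, C, S = 1_000_000, 10_000, 100  # hypothetical voters, electors, seats
votes = [0.5, 0.3, 0.2]           # hypothetical vote shares

# Direct conversion V -> S versus conversion by stages V -> C -> S.
direct = attrition(votes, exponent(V, S))
staged = attrition(attrition(votes, exponent(V, C)), exponent(C, S))
stages_agree = all(abs(a - b) < 1e-9 for a, b in zip(direct, staged))
print(stages_agree)

# Mid-range condition f(V**0.5) / f(V) = 1/2 for the two simplest candidates.
log_midpoint = math.log(math.sqrt(V)) / math.log(V)
linear_midpoint = (math.sqrt(V) - 1) / (V - 1)
print(round(log_midpoint, 3), round(linear_midpoint, 3))
```

Any ratio f(V)/f(S) passes the stage test, since the intermediate f(C) cancels. It is the mid-range condition that separates the candidates: f(x) = log x gives exactly 1/2, while f(x) = x − 1 gives a value close to zero for any sizable electorate.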
In the proof of f(V) = log V, did I stack the cards in favor of log V the moment I took the geometric mean of 1 and V for S but the arithmetic mean of 0 and 1 for m? This was not an ad hoc choice but application of general principles enunciated in Taagepera (2008) and repeatedly applied in various chapters of the present book. Yes, the cards were stacked in favor of log x, but by nature, not by me.

Combining $n = \log V/\log S$ with $s_i/s_j = (v_i/v_j)^n$ leads to the format first enunciated around 1970 (Taagepera 1969, 1973):

$$\log S\,\log\frac{s_i}{s_j} = \log V\,\log\frac{v_i}{v_j} = \text{‘selection process constant’}.$$

At that stage, it was called the seat–vote equation. Its nature as a broader law of minority attrition was first pointed out in Taagepera and Shugart (1989: 184–5, 188) and Taagepera (1994). Such a format symmetric in seats and votes has esthetic appeal and brings into evidence a combination of party characteristics that remains unchanged as votes are converted into seats (and possibly into seats in an intervening electoral college also elected by FPTP). The expression ‘log(sum of all components) times log(ratio of two components)’ remains the same at different stages of the selection process. In physics, this constant would be called a ‘constant of motion’. Here, ‘selection process constant’ seems appropriate. This constant is what makes the symmetric form of the minority attrition law theoretically interesting. For most purposes, however, it is more convenient to plug $n = \log V/\log S$ into the format $s_i = v_i^n/\sum_k v_k^n$.
The law of minority attrition for multi-seat districts (and Two-Rounds elections)

With multi-seat districts, a crucial distinction must be made between seat allocation by plurality and PR, symbolized by the notation $M^F$, where F = 1 for PR and F = −1 for plurality. The law of minority attrition can be extended to multi-seat PR and plurality, but in quite different ways. No theory has been proposed for Two-Rounds election, but Dolez and Laurent (2005) have carried out empirical work that will be discussed.

The minority attrition relationship is extended to multi-seat PR (Taagepera 1986) by proposing that in this case

$$n = \left(\frac{\log V}{\log S}\right)^{1/M}. \quad \text{[PR]}$$

Combining it with $s_i/s_j = (v_i/v_j)^n$ leads to a single equation which is again symmetrical in seats and votes:

$$(\log S)^{1/M}\,\log\frac{s_i}{s_j} = (\log V)^{1/M}\,\log\frac{v_i}{v_j} = \text{constant}. \quad \text{[PR]}$$

I still cannot prove this formula, but it seems to fit. For M = 1, it reduces itself to the original minority attrition law for FPTP, as it should. When M increases
beyond 1, the Mth root quickly reduces the exponent n to values hardly above the n = 1.00 of perfect PR. Consider a 100-seat assembly that represents a population of 1 million, so that the cube root law of assembly size applies: $\log V/\log S = 3$. Even with a district magnitude as low as M = 5, the disproportionality exponent n becomes as low as $3^{1/5} = 1.25$. Taagepera and Shugart (1989: 168) show the degree of agreement for Finland’s nationwide pattern (where n = 1.07 is predicted) and individual districts (n = 1.14 predicted). For a 100-seat assembly elected nationwide (M = S = 100), the expression becomes $n = (\log V/\log S)^{1/S} = 3^{0.01} = 1.011$. This is close to perfect PR, but still not quite.

Now consider multi-seat plurality. Plurality rule applied to a single nationwide district allocates all seats to the same party, regardless of the number of seats. Hence, it is the number of electoral districts (E) rather than the number of seats that matters in this case, so that $n = \log V/\log S$ must be replaced by $n = \log V/\log E$. Combining it with $s_i/s_j = (v_i/v_j)^n$ leads to a single equation which no longer is symmetrical in seats and votes:

$$\log E\,\log\frac{s_i}{s_j} = \log V\,\log\frac{v_i}{v_j}. \quad \text{[Plurality]}$$
For single-seat plurality, S = E, so the distinction does not matter. So it turns out that we have two quite different expressions for the disproportionality exponent n, one for plurality and one for PR:

$$n = \frac{\log V}{\log E} \quad \text{[Plurality]}, \qquad n = \left(\frac{\log V}{\log S}\right)^{1/M} \quad \text{[PR]}.$$
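The two expressions for the exponent can be tried out numerically. The following is only a minimal sketch: the function names and the `rule` argument are illustrative choices, not notation from the text, and the ratio of logarithms does not depend on the base used.

```python
import math


def attrition_exponent(voters, seats, magnitude=1, districts=None, rule="PR"):
    """Disproportionality exponent n under the two rules discussed above.

    rule="PR":        n = (log V / log S) ** (1 / M)
    rule="plurality": n = log V / log E, with E the number of districts
    """
    if rule == "PR":
        return (math.log(voters) / math.log(seats)) ** (1.0 / magnitude)
    if rule == "plurality":
        if districts is None:
            raise ValueError("plurality rule needs the number of districts E")
        return math.log(voters) / math.log(districts)
    raise ValueError("rule must be 'PR' or 'plurality'")


def seat_shares(vote_shares, n):
    """Apply s_i / s_j = (v_i / v_j) ** n and normalize the shares to sum to 1."""
    powered = [v ** n for v in vote_shares]
    total = sum(powered)
    return [p / total for p in powered]


# The 100-seat, 1-million-voter example from the text:
n_m5 = attrition_exponent(1_000_000, 100, magnitude=5)      # about 1.25
n_m100 = attrition_exponent(1_000_000, 100, magnitude=100)  # about 1.011
```

At M = 1 with E = S the two branches return the same value, which matches the statement that the expressions coincide for single-seat districts.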
At M = 1, the two become identical. This M = 1 corresponds to the least proportional outcome one could have with List PR rules, and also to the most proportional outcome one could have with plurality rule. It would be nice to find a single unifying expression that includes both of them. Taagepera and Shugart (1989: 169) offered a form that still depends on arbitrary assignment of parameter values, and no progress has been made since.

The unified attrition law becomes asymmetrical in seats and votes the moment one introduces multi-seat plurality. In view of its fading popularity, one might as well forget about multi-seat plurality, if it were not for the US Electoral College. In the US Electoral College, all seats in the same state go to the party with the most votes (with few marginal exceptions). Hence the number of electoral districts where the plurality rule is applied is the number of states, which has gone from 23 in 1820 to the present 51 (including the District of Columbia). As the population also increased, the resulting value of $n = \log V/\log E$ has remained around 5, and this is close to the median of actual data (Taagepera and Shugart 1989: 162), which are extremely scattered. A complicating feature is the widely unequal number of seats per state.

The theory of the law of minority attrition has not been extended at all to electoral systems with ordinal ballot or Two-Rounds. However, Dolez and Laurent (2005) have carried out empirical work on the French legislative elections 1978–2002. They find that, when n = 4, the format $s_i/s_j = (v_i/v_j)^n$ fits for the Left and the Right taken as blocs. Given that France has S = 555 metropolitan single-seat districts, FPTP would be expected to lead to such a high exponent value only when $\log V = 4\log S = 4\log 555 = 11.0$, meaning an electorate of some 100 billion people! It remains to be tested whether all Two-Rounds systems lead to extra high disproportionality exponents. If so, then the mechanisms leading to it would have to be found. On the other hand, the high exponent could merely be a result of unusually high nationalization of electoral behavior since the 1960s, as Dolez and Laurent (2005) suggest. If so, it would be in striking contrast to the exponent decreasing during the same period in the UK (Blau 2004).
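The back-of-envelope argument about France can be reproduced in a few lines. This is a sketch under the FPTP relation n = log V / log S only; the helper names are hypothetical.

```python
import math


def fptp_exponent(voters, seats):
    """FPTP disproportionality exponent n = log V / log S."""
    return math.log10(voters) / math.log10(seats)


def electorate_for_exponent(seats, n):
    """Invert n = log V / log S: the electorate V = S ** n that FPTP would need."""
    return seats ** n


# Electorate implied by n = 4 with 555 single-seat districts.
implied_voters = electorate_for_exponent(555, 4)
print(f"implied electorate: {implied_voters:.2e}")
```

The result lands just under 10^11, the "some 100 billion people" of the text, which is why the observed exponent cannot be attributed to the FPTP relation alone.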
Volleyball, women, and the law of minority attrition

In volleyball scores the two stages that correspond to votes and seats are the number of points (P) and the number of games (G) won by the winner (W) and the loser (L). In contrast to elections, where more than two parties may be involved, each match involves only two teams, so that the symmetric form of the attrition equation becomes an elegant assertion that ‘The log of sums times the log of ratios is constant’:

$$\log(G_W + G_L)\,\log\frac{G_W}{G_L} = \log(P_W + P_L)\,\log\frac{P_W}{P_L} = c.$$

Here c is a constant, the selection process constant. This constant differs in almost every individual election, because seats can be divided in a large variety of ways. Volleyball tournament rules, in contrast, allow for only three outcomes, because the contest ends when one side has won 3 games. It can only be 3 to 2 or 3 to 1 or 3 to 0. Hence there are only three values for the selection process constant, when the latter is based on the games, $c_G = \log(G_W + G_L)\log(G_W/G_L)$. If the match is won 3 to 2, $c_G = \log(3 + 2)\log(3/2) = 0.123$. If it is won 3 to 1, $c_G = 0.287$. If it is won 3 to 0, $\log(3/0)$ tends toward ∞, and so does $c_G$. This case will need special discussion.

Now we can compare the games-based process constants $c_G$ to the median values of the selection process constant emerging from the points earned, $c_P = \log(P_W + P_L)\log(P_W/P_L)$. We expect that median $c_P = c_G$. The actual values of $c_P$ range widely. They even extend to negative values, because a team with fewer total points can win more games. But my unpublished calculations (Table 13.3) show that the medians of a large number of US championship matches are close to expectations.
Table 13.3. Volleyball scores and the law of minority attrition

Games ratio (G_W/G_L)   Number of matches   Expected constant (c_G)   Observed median c_P   Deviation
3:2                     119                 0.123                     0.098                 +26%
3:1                     162                 0.287                     0.224                 +28%
3:0                     96                  → infinity                0.390                 ??
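The games-based constants can be recomputed directly. The sketch below assumes base-10 logarithms, which is what reproduces the 0.123 and 0.287 quoted in the text; unlike a ratio of logarithms, a product of two logarithms does depend on the base. The function name is illustrative.

```python
import math


def selection_constant(winner, loser):
    """c = log10(winner + loser) * log10(winner / loser).

    Returns infinity when the loser has zero, mirroring the 3:0 case.
    """
    if loser == 0:
        return math.inf
    return math.log10(winner + loser) * math.log10(winner / loser)


for games in [(3, 2), (3, 1), (3, 0)]:
    print(games, selection_constant(*games))
```

The same function applies to points: passing the two teams' point totals gives c_P for a match, and it comes out negative whenever the match winner scored fewer total points than the loser.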
Given the wide range of actual outcomes, the degree of agreement of the medians is impressive. However, $c_P$ falls short of $c_G$ by about the same percentage in both finite cases, which suggests a possible systematic deviation. The constellation 3:0 offers us a clue. We are here applying an essentially continuous-variable model to a low number of integers, and this spells trouble. As long as the loser wins at least one point, the attrition equation technically predicts that the loser also wins at least a tiny fraction of a game. But games are counted in integer numbers. This means that a formal 0.03 games (or even 0.49 games) won by the loser would be rounded off to 0 games won. The observed median $c_P = 0.39$ for the 3:0 outcomes corresponds to what would be expected if the ratio of games won were about 2.8 to 0.2. Similarly, the observed medians for 3:1 and 3:2 matches correspond to what would be expected if the ratio of wins were about 2.9 to 1.1 and 2.9 to 2.1, respectively. The structure of the data is such that the larger figure is always rounded off upward and the lower one downward. Hence the gap we observe between $c_P$ and $c_G$. I have not yet found a way to refine the model so as to counteract this tendency.

One difference between elections and volleyball scores is that the total number of seats to be filled is predetermined (or almost so), while the number of volleyball games (3, 4, or 5) is known only retroactively, and the winner’s number is always 3. One has to review Theil’s (1969) and Taagepera’s (1973) possible hidden assumptions in light of these and other possible differences.

Women’s ‘rubber ceiling’ is more difficult to analyze, because it has been undergoing a change over the last 50 years. Should we compare women’s share in the US Senate to their share among city council members now or 30 years previously, when the present senators got their starts? Many numbers in Table 13.1 are only vague estimates, but let us see what we get when we calculate the selection process constant at various stages. The equation is $c = \log(P_W + P_M)\log(P_M/P_W)$, where W stands for women and M for men. For example, for city council members, the constant is around $c = \log 100{,}000 \times \log(80/20) = 3.1$. To the extent that the model fits, the selection constant should be the same at all other stages too. Table 13.4 shows the results, along with the previous data in Table 13.1. The selection constant does remain in the range 3.5 ± 0.4, which is encouraging. It would be worthwhile to gather and test more precise data, and for many countries.

Table 13.4. Selection constant for women’s attrition in US politics seems to be 3.5

Public office                   Number of positions   Women’s share (%)   Selection constant
City council members            ∼100,000              20                  ∼3.1
Mayors                          ∼10,000               10                  ∼3.8
State lower house members       ∼10,000               10                  ∼3.8
US House of Representatives     435                   5                   3.4
US Senate                       100                   2                   3.3
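The selection constant for each office can be recomputed from the rounded figures in Table 13.4. This is a sketch only: the inputs are the table's approximate counts and percentages, base-10 logarithms are assumed, and the names are illustrative. Because the inputs are rounded, the recomputed values need not match the table's entries to the last digit.

```python
import math

# (office, approximate number of positions, women's share in percent), from Table 13.4
OFFICES = [
    ("City council members", 100_000, 20),
    ("Mayors", 10_000, 10),
    ("State lower house members", 10_000, 10),
    ("US House of Representatives", 435, 5),
    ("US Senate", 100, 2),
]


def selection_constant(positions, minority_share_pct):
    """c = log10(total positions) * log10(majority share / minority share)."""
    majority_share_pct = 100.0 - minority_share_pct
    return math.log10(positions) * math.log10(majority_share_pct / minority_share_pct)


constants = {name: selection_constant(n, pct) for name, n, pct in OFFICES}
for name, c in constants.items():
    print(f"{name:30s} c = {c:.2f}")
```

With these inputs every office falls between roughly 3.0 and 3.9, in line with the observation that the constant stays in a narrow band across very different numbers of positions.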
14 The Institutional Impact on Votes and Deviation from PR
For the practitioner of politics:
• Using nothing but the product of assembly size and district magnitude, theory-based equations allow us to estimate the average effective number of parties based on votes.
• Deviation from proportional representation is typically 10–20 percent for first-past-the-post electoral systems. It can be estimated from the product of assembly size and district magnitude, within ±4 percentage points.
• These results refer to the averages for many elections carried out under the same electoral laws. In individual elections, the number of parties and deviation from PR can vary widely.
• When estimating the likely effect of changes in electoral laws on the number of parties and deviation from PR, also take into account the past tendencies in the given country.
Reversing the usual direction of the law of minority attrition enables us to deduce vote shares of parties from their seat shares, which themselves can be traced back to the impact of district magnitude and assembly size. This reversal is imperfect, and random noise increases. Still, in principle, we should be able to use this step to calculate the average distribution of all vote shares, including the largest, and hence the effective number of electoral parties. Combining the respective seat and vote shares, we should also be able to estimate deviation from PR on the basis of seat product.
The causal chain, however, becomes so extended that only limited predictability can be expected. In Chapter 13, calculation of deviation from PR on the basis of actual vote shares and the attrition equation led to wide scatter. The scatter can be expected to widen when we use theoretical seat shares, themselves imperfectly deduced from institutions. Up to now, this book has complemented predictive modeling with extensive testing, often reported in even more detail in previously published articles. Now we are in uncharted waters. This chapter indicates how to approach the problem of converting from seats to votes in theory. It offers illustrative examples, but testing with extensive data remains to be done. It is a research agenda where many researchers can get involved.
Relationships Between the Ways to Measure the Number of Electoral Parties

Recall that the various ways to express the number of legislative parties are related, on the average, as

$$N_\infty^4 = N_2^3 = N_0^2.$$

The number of seat-winning parties ($N_0$) was derived from M and S. Later derivations, however, followed from $N_0$ on logical grounds, with no further institutional input. Indeed, the inverse of the largest share ($N_\infty$) can be expected to be approximately the square root of the number of components, whether these components are seat-winning parties or Canadian provinces. The effective number is tied to $N_\infty$ in a similarly abstract way. What this means is that the logic that leads from $N_0$ to $N_\infty$ and $N_2$ applies to vote shares too. Without any further proof needed, we can expect that the relationship above connects the effective number of electoral parties and the inverse of the largest vote share, on the average:

$$N_{V\infty}^4 \approx N_{V2}^3.$$
The number of vote-getting parties ($N_{V0}$) is harder to define than the previous number of seat-winning parties. We are interested in the number of ‘serious’ electoral parties/candidates, rather than those who obtain a mere couple of votes. Various norms have been proposed: add one more to those parties/candidates that win a seat (Reed 1991; Cox 1997), or add those parties/candidates who obtain at least 70 percent of the votes needed to win a seat (Hsieh and Niemi 1999; Niemi and Hsieh 2002). One operational way would be to define the number of ‘serious’ vote-getting parties as

$$N_{V0} = N_{V\infty}^2 = \frac{1}{v_1^2}$$

and see whether the resulting values of such a phantom $N_{V0}$ make any sense. Chapter 15 returns to this issue.
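The average relationships among the three party numbers can be written as a small helper. This is a sketch of the stated average pattern only; individual elections scatter around it, and the function name is an illustrative choice.

```python
def party_numbers_from_largest_share(largest_share):
    """Average-pattern party numbers implied by the largest share alone.

    N_inf = 1 / share
    N_2   = N_inf ** (4/3)   (effective number)
    N_0   = N_inf ** 2       (number of components, or the 'phantom' N_V0 for votes)
    """
    n_inf = 1.0 / largest_share
    n_eff = n_inf ** (4.0 / 3.0)
    n_zero = n_inf ** 2
    return n_inf, n_eff, n_zero


# A largest share of 50 percent implies, on average:
n_inf, n_eff, n_zero = party_numbers_from_largest_share(0.5)
```

For a largest share of one-half this gives an inverse largest share of 2, an effective number of about 2.5, and four components, so that the fourth power of the first, the cube of the second, and the square of the third all equal 16.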
From Seats to Votes: The Attrition Equation Read Backward

The law of minority attrition was deduced going from votes to seats, but its symmetrical form indicates that one can go in the reverse direction, too:

$$v_i = \frac{s_i^m}{\sum_k s_k^m},$$

where, in the case of PR elections or FPTP,

$$m = \frac{1}{n} = \left(\frac{\log S}{\log V}\right)^{1/M}.$$

Once the average seat shares are estimated from institutional inputs (Chapter 9), we can use the equations above to estimate the average vote shares over many elections with simple electoral rules (Taagepera 2001).

This procedure looks simple, but we run into a problem of rounding to integers (as with volleyball scores in the appendix to Chapter 13). Seats come in integer numbers. When the seat–vote equation predicts 0.54 seats for one minor party and 0.44 seats for another, it is easy enough to round 0.54 seats to 1 seat and 0.44 seats to 0 seats. But when the rounded figures for seats are given, how can one guess at what decimal fractions they were rounded from? How can we get the ‘0.44’ back from ‘0’?

To illustrate the degree of error that can result, let us consider a vote distribution akin to that in many recent UK elections: 40-35-20-5. For simplicity, assume that 1 million people vote and 100 FPTP seats are at stake, so that n = 3.00. First, use the seat–vote equation to calculate the seat shares and round them to integer numbers of seats. Then feed these seats into the reversed seat–vote equation, so as to see how close we get to the initial vote shares. The results are shown in Table 14.1.

Table 14.1. From votes to seats, and back to votes: hypothetical vote shares, with S = 100 and n = 3.00

Party                     A        B        C        D        Eff. N
Vote shares (%)           40       35       20       5        3.08
Calculated seats (%)      55.65    37.28    6.96     0.11
Number of seats           56       37       7        0
Seat shares (%)           56       37       7        0        2.20
Calculated votes (%)      42.18    36.73    21.09    0        2.80
Difference                +2.18    +1.73    +1.09    −5.00

Seats (out of 100.25)     56       37       7        0.25
Calculated votes (%)      39.44    34.35    19.72    6.49     3.16
Difference                −0.56    −0.65    −0.28    +1.49

Based on zero seats for party D, the reversed seat–vote equation predicts zero votes for this party. All other vote shares are overestimated correspondingly, and the effective number of electoral parties is underestimated by 0.28 parties. How can we correct for this rounding down to zero? Knowing that zero seats for one or several minor parties could have been rounded off from as high as 0.49 seats or as low as 0.00 seats, a conservative estimate might assume a single unrepresented party, with a virtual seat share halfway between 0 and 0.50, meaning 0.25. This new starting point is shown at the bottom part of Table 14.1. Having a seat total of 100.25 presents no problem, because one could plug either seat shares or numbers into $v_i = s_i^m/\sum_k s_k^m$, without altering the outcome. Now the vote share of party D is overestimated, while the other vote shares are underestimated accordingly. The differences are reduced, though, compared to the previous approach, and the effective number of electoral parties is overestimated only by 0.08 parties.

This example suggests that the compound error in estimating the vote shares on the basis of seat shares might usually be only a few percent, but we cannot be sure. There may be more than one unrepresented party. Maybe we should assume one party with 0.25 seats and another at 0.25/2 = 0.125, and still another at 0.125/2 = 0.0625 seats. More systematic statistical approaches may be available to estimate the next values in a decreasing series such as 56-37-7- . . . One would have to carry out cyclical calculations (actual votes → seats → votes) in various countries, so as to obtain a better sense of the range of error.

This applies to FPTP systems. For List PR, the differences between vote and seat shares are bound to be small, but the low threshold of representation (see Chapter 15) sometimes tempts numerous tiny parties to run and fail. The number of votes wasted this way may add up and could be hard to estimate. We are here on untested grounds.
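The round trip of Table 14.1 (votes → seats → votes) can be reproduced with a short script. This is a sketch of the calculation as described in the text; the function names are illustrative, and plain rounding of each party's seats is an assumption that happens to sum to 100 for these inputs.

```python
def votes_to_seat_shares(vote_shares, n):
    """Forward attrition equation: s_i proportional to v_i ** n."""
    powered = [v ** n for v in vote_shares]
    total = sum(powered)
    return [p / total for p in powered]


def seats_to_vote_shares(seats, n):
    """Reversed attrition equation: v_i = s_i**m / sum(s_k**m), with m = 1/n.

    Seat counts or seat shares both work, since the total cancels out.
    """
    m = 1.0 / n
    powered = [s ** m for s in seats]
    total = sum(powered)
    return [p / total for p in powered]


def effective_number(shares):
    """Effective number of parties, 1 / sum(share ** 2)."""
    return 1.0 / sum(x * x for x in shares)


votes = [0.40, 0.35, 0.20, 0.05]
n = 3.0
seats = [round(100 * s) for s in votes_to_seat_shares(votes, n)]

back_from_zero = seats_to_vote_shares(seats, n)                 # party D treated as 0 seats
back_from_virtual = seats_to_vote_shares([56, 37, 7, 0.25], n)  # virtual 0.25 seats for D

print(seats)
print([round(100 * v, 2) for v in back_from_zero], round(effective_number(back_from_zero), 2))
print([round(100 * v, 2) for v in back_from_virtual], round(effective_number(back_from_virtual), 2))
```

Swapping in further virtual parties at 0.125 and 0.0625 seats, as the text suggests, only requires extending the seat list passed to `seats_to_vote_shares`.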
Predicting the Number of Electoral Parties from Institutional Inputs

Previously (Chapter 9), all seat shares could be calculated, starting from the largest seat share, itself calculated from the seat product MS. Then the effective number of legislative parties could be calculated from these seat shares. These calculations were messy, however, and so an approximate shortcut was devised, leading directly from the largest share to the effective number: $N = 1/s_1^{4/3}$. It involved a loss of precision of known extent, which was tolerable in comparison with variation caused by other factors.

We face a similar choice here. We could use all seat shares to convert into vote shares, but the connection to the seat product would depend on complex calculations. Instead, we can look for a way to connect the largest vote share ($v_1$) alone to the largest seat share ($s_1$). Assume that it can be approximated as $v_1 = s_1^k$, where k > 1, so that the largest vote share is smaller than the largest seat share. Then $v_1 = s_1^k = (MS)^{-k/8}$ and $N_{V\infty} = (MS)^{k/8} = (N_{S\infty})^k$. The effective number of electoral parties would result automatically:

$$N_V = \frac{1}{v_1^{4/3}} = (MS)^{k/6} = N_S^k.$$

This number is larger than the number of legislative parties, $N_S = (MS)^{1/6}$. A phantom number of ‘serious’ vote-getting parties also results:

$$N_{V0} = (MS)^{k/4} = N_{S0}^k.$$

For the usual range of the largest seat share in stable FPTP systems ($s_1 > 0.3$), the theoretical derivation for the conversion $s_1 \to v_1$ is given in the chapter appendix. A satisfactory approximation is

$$k = 1.28\,n^{0.21}. \qquad \text{[FPTP, } s_1 > 0.3\text{]}$$
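The chain from seat product to the vote-based quantities is short enough to sketch in code. The function and key names below are illustrative, and the approximation for k is used as stated, without checking that the inputs lie in the FPTP range for which it was derived.

```python
def conversion_exponent(n):
    """Approximation k = 1.28 * n ** 0.21, stated for FPTP with s1 > 0.3."""
    return 1.28 * n ** 0.21


def predicted_averages(seat_product, k):
    """Average quantities implied by the seat product MS and the exponent k."""
    largest_seat_share = seat_product ** (-1.0 / 8.0)
    return {
        "s1": largest_seat_share,
        "v1": largest_seat_share ** k,        # equals MS ** (-k/8)
        "N_S": seat_product ** (1.0 / 6.0),   # effective number of legislative parties
        "N_V": seat_product ** (k / 6.0),     # effective number of electoral parties
        "N_V0": seat_product ** (k / 4.0),    # phantom number of 'serious' vote-getters
    }


# Illustration: a 100-seat FPTP assembly (MS = 100) with n = 3.
example = predicted_averages(100, conversion_exponent(3.0))
```

Since k exceeds 1 whenever n exceeds 1, the sketch returns a largest vote share below the largest seat share and an effective number of electoral parties above the legislative one.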
Here n is the previous disproportionality exponent $n = 1/m = \log V/\log S$. Given that n > 1, it follows that k > 1, and $v_1 < s_1$, as one would expect.

Let us carry out a reality check to see whether any traces of an institutional impact on the electoral parties remain, at the level of a quantitative prediction. Consider the six Caribbean countries with an unusually high disproportionality exponent (responsiveness) of 3.5 to 4.0, as used in Chapter 13. The mean exponent n = 3.75 for these countries yields $k = 1.28\,(3.75)^{0.21} = 1.69$. Hence, for these FPTP systems,

$$v_1 = s_1^{1.69}. \qquad \text{[high-exponent FPTP]}$$

For the largest seat share, the geometric mean of all 30 elections in these countries (data from Nohlen 1993) is $s_1 = 72.0\%$. How well does it predict the largest vote share? It yields $v_1 = s_1^k = 0.720^{1.69} = 0.574 = 57.4\%$. The actual geometric mean of the largest vote shares is 55.9 percent. We are off by only 1.5 percentage points. The following connections to assembly size follow:

$$v_1 = S^{-1.69/8} = S^{-0.21} \qquad \text{[high-exponent FPTP]}$$

and hence

$$N_{V\infty} = S^{0.21}, \qquad N_V = S^{1.69/6} = S^{0.28}, \qquad N_{V0} = S^{1.69/4} = S^{0.42}. \qquad [M = 1,\ n = 3.75]$$

Table 14.2 tests for the impact of assembly size for individual countries, arranged in the order of increasing S. The overall geometric means differ slightly from the ones above because not all countries have the same number of elections. Going from assembly size to the largest seat share already involves a mean relative error of 6 percent. Surprisingly, the mean error does not increase as we go from legislative to electoral parties, although the causal chain is longer. For the effective number of electoral parties, the relative error reaches 20 percent for individual countries, but it remains below 6 percent for the mean of the 6 countries. Thus the model passes this first reality check. Full testing of the Duvergerian approach extended to electoral parties remains to be done.

Table 14.2. Predicting the number of electoral parties from assembly size, for high responsiveness FPTP systems

                                     Largest seat share      Largest vote share       Eff. no. el. parties
Country, no. of elections    S       S^(−1/8)   Act. s1      S^(−1.69/8)   Act. v1    S^(1.69/6)   Act. N_V
St. Vincent, 4               13.5    .722       .819         .577          .597       2.08         2.23
Grenada, 4                   15      .713       .690         .565          .499       2.14         2.40
St. Lucia, 6                 17      .702       .629         .550          .547       2.22         2.15
Antigua, 3                   17      .702       .859         .550          .631       2.22         1.99
Barbados, 6                  26      .665       .694         .502          .534       2.50         2.15
Trinidad, 7                  36      .639       .735         .469          .580       2.74         2.22
GEOM. MEAN                           .690       .734         .534          .563       2.31         2.19
Relative error                                  +6.4%                      +5.4%                   −5.2%
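The predicted columns of Table 14.2 follow from assembly size alone and can be recomputed as a check. In this sketch the actual values are copied from the table, k = 1.69 is the rounded exponent used in the text, and the helper names are illustrative.

```python
import math

K = 1.69  # rounded exponent for n = 3.75, as used in the text

# country: (S, actual s1, actual v1, actual N_V), taken from Table 14.2
CARIBBEAN = {
    "St. Vincent": (13.5, 0.819, 0.597, 2.23),
    "Grenada": (15, 0.690, 0.499, 2.40),
    "St. Lucia": (17, 0.629, 0.547, 2.15),
    "Antigua": (17, 0.859, 0.631, 1.99),
    "Barbados": (26, 0.694, 0.534, 2.15),
    "Trinidad": (36, 0.735, 0.580, 2.22),
}


def predict(assembly_size, k=K):
    """Predicted (s1, v1, N_V) for single-seat districts, where MS = S."""
    return (
        assembly_size ** (-1.0 / 8.0),
        assembly_size ** (-k / 8.0),
        assembly_size ** (k / 6.0),
    )


def geometric_mean(values):
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


predicted = {name: predict(row[0]) for name, row in CARIBBEAN.items()}
mean_pred_s1 = geometric_mean(p[0] for p in predicted.values())
mean_act_s1 = geometric_mean(row[1] for row in CARIBBEAN.values())
relative_error_s1 = mean_act_s1 / mean_pred_s1 - 1.0
```

The relative error here is defined as actual over predicted minus one, which is the convention that reproduces the signs in the table's bottom row.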
Predicting the Vote Shares of All Parties from Institutional Inputs

The average seat share distribution of all parties, for given largest seat share, was determined empirically in Chapter 9, and a logical model was offered and tested. Work in progress (Taagepera and Laatsit 2007) indicates that the empirical pattern, relative to the largest share, is the same for vote shares, within the range of random error. Small parties seem to lose about one-half of their inherent support to large parties, whether one goes by seats or by votes. The theoretical model presented seems to apply to seat and vote shares equally well.

However, Taagepera and Laatsit (2007) also graph PR and FPTP elections separately. The patterns for seats and votes are again similar, but they differ for the two seat allocation rules. With FPTP, small parties seem to lose about three quarters of their inherent support, while with PR they lose only one quarter. The heavier losses in systems subject to Duverger’s psychological effect come as no surprise. However, it is confirmed that appreciable loss of inherent support for small parties occurs in PR systems too.
Predicting Deviation From PR from Institutional Inputs

Deviation from PR ($D_2$, Gallagher’s measure) has previously been found to correlate with the logarithm of district magnitude (Anckar 1997b). No equation for the best fit line was given, so we cannot compare with the results of the following calculations. With this estimation I really go out on a limb, by assuming that $D_2 \approx (s_1 - v_1)$, which is not always the case. It is true that $D_2$ is often close to the largest seat–vote difference (cf. Chapter 5). But the largest difference also may come from third party losses, which can exceed the largest party gains when some gain goes to the second-largest party. Moreover, subtractions always boost error, as pointed out earlier. With these cautionary notes, we may try to see whether the following theoretical approximation comes anywhere close to reality:

$$D_2 \approx s_1 - v_1 = s_1 - s_1^k = (MS)^{-1/8} - (MS)^{-k/8}.$$

Table 14.3 shows the outcomes for the 6 Caribbean countries, using the calculated and actual largest shares in Table 14.2.

Table 14.3. Predicting the deviation from PR (Gallagher’s D_2) from assembly size, for high responsiveness FPTP systems

Country           S^(−1/8) − S^(−1.69/8) (%)   Actual D_2 (%)
St. Vincent       14.5                         17.9
Grenada           14.8                         15.9
St. Lucia         15.2                         9.5
Antigua           15.2                         23.3
Barbados          16.3                         14.0
Trinidad          17.0                         14.5
GEOM. MEAN        15.5                         15.3
Relative error                                 −1.2%

The geometric mean of the actual deviations from PR is within about 1 percent of the theoretical prediction. This is too good to hold upon more extensive testing, but it is encouraging nonetheless. For individual countries, the arithmetic mean difference between the expected and the actual is 3.9 percentage points, reaching 8.7 for Antigua, where only 3 elections are averaged. This result is better than one could have expected. It seems that, with a sufficient number of FPTP elections, we may actually start off with nothing but assembly size and still be able to predict the mean deviation from PR mostly within 4 percentage points. It remains to be seen to what extent this result is confirmed with FPTP in larger assemblies and with PR systems.

Figure 5.1 shows the empirical pattern of $D_2$ versus the effective number of electoral parties. Given that both $D_2$ and $N_V$ can be estimated theoretically from the seat product, we should be able to calculate the theoretical curves $D_2$ versus $N_V$ and compare them with the empirical relationships observed in Figure 5.1. This task remains to be done.

It can be seen that theoretical estimation of deviation from PR becomes complicated even for FPTP and may become even more intractable for multi-seat List PR. All this leaves unexplained a simple empirical regularity documented by Taagepera and Shugart (1989: 118, 141) for Loosemore–Hanby’s $D_1$. On the district level,

$$D_1 = \frac{50\%}{M^{1/2}},$$

and nationwide,

$$D_1 = \frac{25\%}{M^{1/2}}.$$

One may suspect that the latter might be affected by assembly size. Accounting theoretically for these empirical regularities remains a challenge.
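The prediction for deviation from PR and the two empirical rules for Loosemore–Hanby's index are easy to tabulate. This is a sketch only: the actual D2 values are those listed in Table 14.3, k = 1.69 is the rounded exponent from the text, and the function names are illustrative.

```python
def predicted_d2(seat_product, k):
    """D2 approximated as s1 - v1 = MS**(-1/8) - MS**(-k/8), as a fraction."""
    return seat_product ** (-1.0 / 8.0) - seat_product ** (-k / 8.0)


def empirical_d1(magnitude, nationwide=False):
    """Empirical rules quoted in the text: 50%/sqrt(M) per district, 25%/sqrt(M) nationwide."""
    return (0.25 if nationwide else 0.50) / magnitude ** 0.5


# country: (S, actual D2 in percent), from Tables 14.2 and 14.3
ACTUAL_D2 = {
    "St. Vincent": (13.5, 17.9),
    "Grenada": (15, 15.9),
    "St. Lucia": (17, 9.5),
    "Antigua": (17, 23.3),
    "Barbados": (26, 14.0),
    "Trinidad": (36, 14.5),
}

differences = {
    name: actual - 100.0 * predicted_d2(size, 1.69)
    for name, (size, actual) in ACTUAL_D2.items()
}
mean_abs_difference = sum(abs(d) for d in differences.values()) / len(differences)
```

Printing `differences` shows which countries drive the average gap between prediction and observation.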
Predicting Proportionality Profiles from Institutional Inputs

Proportionality profiles offer ‘snapshots’ of the actual electoral systems, as illustrated in Figures 5.2–5.5. Such profiles indicate at a glance the average impact of the given electoral system on large and small parties, and the degree of scatter around this average. If we truly understand the functioning of an electoral system, then we should be able to predict its proportionality profile without looking at any data, just on the basis of electoral rules, institutional givens, and possibly some information on political culture. The deductive chain that extends from institutions to advantage ratio is schematically the following (cf. Figure 7.2):

Seat product MS → largest seat share → other seat shares plus largest vote share → other vote shares → advantage ratios

How close are we to such a stage, even for the simplest electoral systems? An unpublished report (Taagepera 2002c) offers a few early examples. Agreement is satisfactory for New Zealand, Finland, and the Netherlands, while limited for the UK, as one might expect on the basis of Table 11.1, where the UK looks as if it applied plurality in 7-seat districts (rather than M = 1). This part of testing remains to be done. It would involve establishing criteria for goodness of prediction, comparing the predicted and actual profiles for many countries, and establishing the main foci where disagreement arises in the deductive chain that extends from institutions to advantage ratios.
Conclusions and Implications for Institutional Engineering

Does this chapter complete the Duvergerian macro-agenda, at least for the simplest electoral systems? Formally, the last blank corner in Figure 7.2 has been filled in: the vote shares, the effective number of electoral parties, and deviation from PR. The impact of institutions on votes remains to be fully tested for FPTP systems with large assemblies, and the very theory must still be fleshed out for PR systems, so as to determine the values of parameter k. Nonetheless, we have made marked headway, as no corner of the overall scheme remains completely blank and the path for completing the rest is outlined. At the very least, this chapter offers a research agenda for interested scholars.

One significant aspect not included in Figure 7.2 remains to be considered: the various kinds of thresholds of representation. They occasionally slipped into preceding discussions but need more systematic presentation. This is done in Chapter 15.

As for institutional engineering, the vote shares and the effective number of electoral parties may be of less interest than seat shares and the effective number of legislative parties, which directly impact politics in the assembly. But deviation from PR at times becomes a subject of political concern, as a large deviation may impinge on the perceived fairness of the system. The deviation from PR ($D_2$, Gallagher’s measure) is typically 10–20 percent for FPTP systems. It has been seen that, using nothing but the product of assembly size and district magnitude, theoretically based equations may allow us to estimate it within ±4 percentage points. This error refers to the averages for many elections carried out under the same electoral laws. In individual elections, the number of parties and deviation from PR can vary widely.

Thus the precision of prediction is limited, but we can still be more quantitative than merely predicting the expected direction of change, when a country contemplates a change in assembly size or mean district magnitude. Once more, when estimating the likely effect of changes in electoral laws on the number of parties and deviation from PR, one must also take into account the past tendencies in the given country. A country with deviation from PR higher than theoretically predicted and higher than the empirical world average at given seat product can be expected to maintain a correspondingly higher deviation from PR after an electoral reform. Other institutions, political culture, and past history matter.
Appendix to Chapter 14

Here the largest vote share is derived from the largest seat share. Also investigated is the gap between the effective numbers of legislative and electoral parties, and balance in party sizes is revisited.
Theoretical derivation of the largest vote share from the largest seat share for FPTP

The reversed attrition equation yields the largest vote share in terms of the seat shares of all parties: $v_1 = s_1^m/\sum_k s_k^m$. Calculations are simpler when one considers $1/v_1$ rather than $v_1$. Then

$$\frac{1}{v_1} = \frac{\sum_k s_k^m}{s_1^m}.$$

The first task is to approximate the sum $\sum_k s_k^m$ as a function of the largest seat share $s_1$ alone. Assume that the largest party faces N equal-sized parties. Then

$$\sum_k s_k^m = s_1^m + N\left(\frac{1 - s_1}{N}\right)^m$$

and

$$\frac{1}{v_1} = \frac{\sum_k s_k^m}{s_1^m} = 1 + N\left(\frac{1 - s_1}{N s_1}\right)^m.$$

What would be a realistic number of parties the largest party faces? Following a similar line of thought, Taagepera and Shugart (1989: 190) used the effective number of parties but subtracted one, so as to account for the party under consideration. This is a clear undercount, and it led to an overestimate of break-even points and distortions in the calculation of proportionality profiles (Taagepera and Shugart 1989: 88–91, 191–6). Rather, all seat-winning parties ($N_0$) should be taken into account. If so, then the largest party faces $N = N_0 - 1$ other parties. Moreover, we have seen that $N_0$ itself can be expressed in terms of the largest seat share: $N_0 = 1/s_1^2$. When this assumption is included, the previous equation becomes

$$\frac{1}{v_1} = 1 + \left(\frac{1}{s_1^2} - 1\right)^{1-m}\left(\frac{1 - s_1}{s_1}\right)^m.$$

Upon further extensive simplification (but without approximations!), it becomes

$$\frac{1}{v_1} = 1 + \left(\frac{1}{s_1} - 1\right)\left(\frac{1}{s_1} + 1\right)^{1-m}.$$

Graphing this function, $\log v_1$ versus $\log s_1$, for the actual range the largest seat shares take in most FPTP elections (0.30 to 0.80) leads to what are for practical purposes straight lines for all values of parameter m that occur in FPTP (n = 2.5 to 4.0, hence m = 0.4 to 0.25). This means an approximation $v_1 = s_1^k$, where the exponent k depends on n. Note that this expression fits two conceptual anchor points: when $s_1 = 0$ then $v_1 = 0$, and when $v_1 = 1$ then $s_1 = 1$. Graphing k versus n on logarithmic scales yields a straight line which corresponds to the equation

$$k = 1.28\,n^{0.21}. \qquad [2.5 \le n \le 4.0]$$
Is this a truly theoretical result, in view of the empirical-looking fit in this equation? It is theoretical, indeed, because no empirical data enter. Approximations for complex theoretical expressions still are theoretical, and there is nothing unusual in such simplification. In quantum mechanics, Schrödinger’s equation can be solved exactly only for the hydrogen atom; all quantum chemistry follows from judicious approximations. Of course, one must specify the range of input values for which a given approximation is valid. Here this range is the one that occurs for FPTP elections: 2.5 ≤ n ≤ 4.0. Within this range, the values of the largest vote share calculated from $v_1 = s_1^k$ and $k = 1.28\,n^{0.21}$ differ by no more than ±0.4 percentage points from those calculated directly from the exact equation $1/v_1 = 1 + (1/s_1 - 1)(1/s_1 + 1)^{1-m}$. The approximation would most likely be different for the PR systems, where 1.0 ≤ n ≤ 1.25. I have not calculated it as yet.

The model $v_1 = s_1^k$ runs into conceptual trouble for presidential elections. S = M = 1 leads to $s_1 = 1$ and hence $v_1 = 1$. Of course, this value $v_1 = 1$ could be the result of rounding off from as low as 0.51, but in plurality election a candidate can win with even less than $v_1 = 0.49$, which would round off to 0. This inconsistency also means that $v_1 = s_1^k$ cannot be extended to single FPTP electoral districts. Does it affect countrywide results? At least for hypothetical data in Table 14.1, $s_1 = 0.56$ predicts $v_1 = 39.3\%$, close to the initial 40 percent.
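The exact expression and its power-law approximation can be compared numerically. The following sketch evaluates both on a grid over the stated FPTP range; the grid spacing and the function names are arbitrary illustrative choices, and the size of the largest gap found will depend on the grid and on how the endpoints are treated.

```python
def exact_v1(s1, n):
    """Exact form: 1/v1 = 1 + (1/s1 - 1) * (1/s1 + 1) ** (1 - m), with m = 1/n."""
    m = 1.0 / n
    return 1.0 / (1.0 + (1.0 / s1 - 1.0) * (1.0 / s1 + 1.0) ** (1.0 - m))


def approx_v1(s1, n):
    """Power-law approximation v1 = s1 ** k, with k = 1.28 * n ** 0.21."""
    return s1 ** (1.28 * n ** 0.21)


def largest_gap(s1_percent=range(30, 81), n_tenths=range(25, 41)):
    """Largest absolute difference between the two forms on the grid.

    Defaults cover s1 = 0.30..0.80 in steps of 0.01 and n = 2.5..4.0 in steps of 0.1.
    """
    return max(
        abs(approx_v1(p / 100.0, t / 10.0) - exact_v1(p / 100.0, t / 10.0))
        for p in s1_percent
        for t in n_tenths
    )


gap = largest_gap()
v1_table_14_1 = approx_v1(0.56, 3.0)  # the s1 = 0.56 case quoted in the text
```

A reader can set the resulting `gap` against the ±0.4 percentage points quoted above, and narrow or widen the grid to see where the approximation is tightest.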
The gap between the effective numbers of electoral and legislative parties

It was noted in Chapter 4 that, on the average, $N_V - N_S \approx 0.4$. Can it be theoretically explained why this gap tends to be around 0.4 rather than much more or much less? Recall that $N_V = N_S^k$ and hence $N_S = N_V^{1/k}$. Thus theoretical expectation would be

$$N_V - N_S = N_V - N_V^{1/k} = (MS)^{k/6} - (MS)^{1/6}.$$

This expression is hard to reduce to a simpler form. For the 6 Caribbean countries, the theoretically predicted gap ranges from 0.54 to 0.92. It vastly exceeds the actual gap. Here the subtraction boosts the relatively small difference between expected and actual $N_V$, as reported in Table 14.2. It remains open whether the predictive ability of the model reaches here a limit, or whether a small adjustment could do the trick. For detailed worldwide checking of the relationship between the effective numbers of electoral and legislative parties, it might be easier to compare the ratios rather than the differences of the two numbers:

$$N_V/N_S = (MS)^{(k-1)/6} = N_V^{(1-1/k)}.$$
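The difference and the ratio of the two effective numbers can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, and the seat product of 30 and exponent k = 1.6 are made-up inputs, not values for any of the countries discussed.

```python
def effective_numbers(seat_product, k):
    """Return (N_V, N_S) from N_S = (MS)**(1/6) and N_V = N_S**k = (MS)**(k/6)."""
    n_s = seat_product ** (1.0 / 6.0)
    n_v = seat_product ** (k / 6.0)
    return n_v, n_s


if __name__ == "__main__":
    seat_product = 30  # hypothetical MS, for illustration only
    k = 1.6            # hypothetical exponent, for illustration only
    n_v, n_s = effective_numbers(seat_product, k)
    print(f"N_V = {n_v:.3f}, N_S = {n_s:.3f}")
    print(f"gap   N_V - N_S = {n_v - n_s:.3f}")
    print(f"ratio N_V / N_S = {n_v / n_s:.3f}")
    # The ratio can equally be written as (MS)**((k - 1)/6) or N_V**(1 - 1/k).
    print(f"(MS)^((k-1)/6)  = {seat_product ** ((k - 1) / 6):.3f}")
    print(f"N_V^(1 - 1/k)   = {n_v ** (1 - 1 / k):.3f}")
```

The last two printed values coincide with the ratio, which is the identity stated in the equation above.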
Balance in party sizes

Balance in party sizes is one characteristic that has not been connected to institutions. As a world average, B = 0.5 fits, and this was the basis for our estimates for the largest seat share and hence the effective number of parties and cabinet duration. But what causes deviations from this average balance?

Looking at Figure 4.1, we may ask which institutions might be common to countries with low balance (UK, Spain, Greece, Portugal, Canada, Ireland, and Italy) in contrast to countries with high balance (Malta, Belgium, Austria, Luxembourg, Iceland, and Finland)? One may note that the low balance group consists mainly of large countries (median population around 40 million), while the high balance group consists of small countries (median population around 1.5 million). But this is a happenstance of countries included in the data source (Mackie and Rose 1997). Inclusion of small Caribbean FPTP countries discussed in the present chapter would even out the population score.

Regardless of population, FPTP countries tend to have few parties and low balance between the largest party and the rest. But what about multi-seat electoral systems that also exhibit a dominant party (Japan, pre-1990 Italy, and Ireland)? And what about Greece, Portugal, and Spain, where major parties take turns in having lopsided majorities? Is it just path-dependent political development, or can we locate institutional features that favor the rise of fleetingly or durably predominant parties? It remains to be seen.
Part III Implications and Broader Agenda
Ask not what the electoral rules can do for your country, ask what your country can do to the electoral rules. A Wuffle
15 Thresholds of Representation and the Number of Pertinent Electoral Parties
For the practitioner of politics:
• Nationwide threshold of minimal representation is the average vote share needed to win one seat in the assembly. It is close to 38 percent divided by the square root of the seat product (assembly size times district magnitude).
• If greater inclusion of political minorities is desired, this threshold can be lowered by increasing district magnitude and/or assembly size.
Various thresholds of representation have entered previous discussion, and it is time to present them systematically. The break-even point (b) in Chapter 5 represents the threshold of nationwide seat share of a party breaking even with its vote share. District magnitudes at which parties with a given vote share obtain their first seat (Chapter 6) hint at a threshold of minimal representation in a district. Such a threshold visibly depends on the seat allocation formula used, and the differences can be appreciable, as illustrated in Table 6.1. For various usual PR formulas, an overall average estimate for the threshold of minimal representation in a district was invoked in Chapter 8: T = 75%/(M + 1). Legal thresholds of representation also have been mentioned, from Chapter 3 on.

There is considerable confusion between thresholds at district and national levels, as well as about the significance of minimal and other levels of representation. Some of these thresholds depend on the number of parties competing. Hence the issue of the number of ‘serious’ electoral parties arises again and will be addressed.
District-Level Thresholds of Minimal Representation

Thresholds of representation are vote shares a party must obtain so as to win a specified number or share of seats. Of most interest are the votes needed to win one seat (minimal representation), the votes needed for breaking even (seat share equals vote share), and the votes needed to win one-half of all seats (majority). These thresholds depend on the seat allocation formula, and also on the number of parties running. At the district level, they were worked out in the 1970s. At the nationwide level, which is of more interest, progress has been made only since the 1990s. This section deals with the thresholds for minimal representation at the district level.

The inclusion threshold ($T_I$) is defined as the minimum vote share a party needs to win its first seat, under the most favorable conditions. For a vote share v smaller than $T_I$, the probability of winning a seat is 0: P(1) = 0. Conversely, the exclusion threshold ($T_E$) is the maximum vote share with which a party still can fail to win its first seat, under the most adverse conditions. For v larger than $T_E$, the probability of winning a seat is 100 percent: P(1) = 1. In between these vote shares, the party may or may not win a seat, depending on how the other vote shares are distributed among its competitors. We may expect the probability to increase gradually with the party’s vote share. We might wish to designate as the mean threshold of minimal representation ($T_R$) the vote share at which a party has a 50-50 chance to win its first seat.

The inclusion and exclusion thresholds can be theoretically calculated, for given district magnitude (M) and number of parties running (p). For the d’Hondt allocation formula, Rokkan (1968) established the inclusion threshold, and Rae, Hanby, and Loosemore (1971) the exclusion threshold. Others followed, such as the inclusion threshold for Hare-LR (Laakso 1979).
Table 15.1 shows the general formulas for d’Hondt, Sainte-Laguë, and Hare-LR, and also the specific values when 6 or 8 parties run in a 6-seat district (the example used in Tables 3.3 and 3.4). For the Sainte-Laguë and Hare exclusion thresholds, many sources list the same value as for d’Hondt, $T_E = 1/(M + 1)$, but this is true only if more parties run than there are seats. If the number of parties is equal to or less than M, which is usually the case at high M, then the exclusion threshold is lower. At M = 1, all these thresholds boil down to the FPTP threshold, also shown.

Table 15.1. District-level thresholds of minimal representation (in fractional shares) for various seat allocation formulas: general formulas and sample values for a 6-seat district

General formulas
Formula         Inclusion, TI       Exclusion, TE
d’Hondt         1/(M + p − 1)       1/(M + 1)
Sainte-Laguë    1/(2M + p − 2)      1/(2M − p + 2) for p ≤ M; 1/(M + 1) for p > M
Hare-LR         1/(Mp)              (p − 1)/(Mp) for p ≤ M; 1/(M + 1) for p > M
FPTP            1/p                 1/2

Sample thresholds for M = 6, where 75%/(M + 1) = 10.7%
p    Threshold     d’Hondt    Sainte-Laguë    Hare-LR
2    TI = TE       14.286%    8.333%          8.333%
6    TI            9.091%     6.250%          2.778%
6    TE            14.286%    12.500%         13.889%
6    TR arithm.    11.64%     9.38%           8.33%
6    TR geom.      11.40%     8.84%           6.21%
8    TI            7.692%     5.556%          2.083%
8    TE            14.286%    14.286%         14.286%
8    TR arithm.    10.99%     9.92%           8.18%
8    TR geom.      10.48%     8.91%           5.46%

At which vote share would a party have a 50-50 chance to win its first seat? We have not found a way to calculate it, but it could be around the midpoint of the interval where 0 < P(1) < 1. This could mean the arithmetic mean, $T_R \approx (T_I + T_E)/2$, or the geometric mean, $T_R \approx (T_I T_E)^{1/2}$. Both are shown in Table 15.1, for the specific case of M = 6. The outcomes depend heavily on how many parties run. At a given number of parties, d’Hondt has the highest thresholds, and Hare-LR most often has the lowest (except for $T_E$ at p = 6). However, the inclusion threshold $T_I = 1/(Mp)$ for Hare-LR is rather artificial. Even much higher vote shares have a near-zero probability to result in a seat, because this is the range where the Alabama paradox occurs (cf. Chapter 6). Hence the actual 50-50 point for landing the first seat with Hare-LR is likely to be higher than the means of inclusion and exclusion thresholds.

What types of constellations lead to threshold outcomes? For Sainte-Laguë, Table 15.2 shows examples for a 6-seat district where a small party wins a seat with a vote share barely above the theoretical inclusion threshold or fails to win a seat with a vote share barely below the theoretical exclusion threshold.

Table 15.2. Sample constellations (in %) where the first-listed party (shown in bold in the original) narrowly wins or narrowly fails to win a seat in a 6-seat district, using the Sainte-Laguë seat allocation formula

p = 2    narrowly included    8.34 – 91.66
p = 2    narrowly excluded    8.33 – 91.67
p = 6    narrowly included    6.255 – 5 at 18.79, and 6.251 – 4 at 6.250 – 68.749
p = 6    narrowly excluded    12.45 – 4 at 12.50 – 37.55
p = 8    narrowly included    5.56 – 2 at 5.55 – 5 at 16.668
p = 8    narrowly excluded    14.27 – 6 at 14.28 – 0.05
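The general formulas of Table 15.1 and the two-party constellations of Table 15.2 can be checked with a short sketch. The function names and formula labels below are hypothetical, and the Sainte-Laguë allocator is the standard highest-quotient procedure with divisors 1, 3, 5, ..., written here for illustration rather than taken from the text.

```python
import math

FORMULAS = ("dhondt", "sainte-lague", "hare-lr")


def inclusion_threshold(formula, M, p):
    """Minimum vote share that can win a first seat (most favorable case)."""
    if formula == "dhondt":
        return 1 / (M + p - 1)
    if formula == "sainte-lague":
        return 1 / (2 * M + p - 2)
    if formula == "hare-lr":
        return 1 / (M * p)
    raise ValueError(f"unknown formula: {formula}")


def exclusion_threshold(formula, M, p):
    """Maximum vote share that can still fail to win a first seat."""
    if formula not in FORMULAS:
        raise ValueError(f"unknown formula: {formula}")
    if formula == "dhondt" or p > M:
        return 1 / (M + 1)
    if formula == "sainte-lague":
        return 1 / (2 * M - p + 2)
    return (p - 1) / (M * p)  # Hare-LR with p <= M


def sainte_lague(votes, seats):
    """Allocate seats one at a time to the highest quotient votes/(2*won + 1)."""
    won = [0] * len(votes)
    for _ in range(seats):
        quotients = [v / (2 * s + 1) for v, s in zip(votes, won)]
        winner = max(range(len(votes)), key=quotients.__getitem__)
        won[winner] += 1
    return won


if __name__ == "__main__":
    M = 6
    for p in (2, 6, 8):
        for formula in FORMULAS:
            t_i = inclusion_threshold(formula, M, p)
            t_e = exclusion_threshold(formula, M, p)
            print(f"p={p} {formula:13s} TI={100 * t_i:.3f}% TE={100 * t_e:.3f}% "
                  f"arithm.={50 * (t_i + t_e):.2f}% geom.={100 * math.sqrt(t_i * t_e):.2f}%")
    # The two p = 2 constellations of Table 15.2: the small party is listed first.
    print(sainte_lague([8.34, 91.66], M))
    print(sainte_lague([8.33, 91.67], M))
```

In the two-party case the small party takes the sixth seat with 8.34 percent but not with 8.33 percent, in line with the threshold of 1/12 = 8.333 percent.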
The Number of Pertinent Electoral Parties in a District

All these theoretical thresholds of inclusion, as well as some thresholds of exclusion, depend on the number of parties competing, which is not an institutional input. The sticky part is that, at this point, we do not know how many parties would typically run in a meaningful way, nor even what we should mean by ‘meaningful’. As pointed out in Chapter 14, this number is of interest for various other purposes too, but its likely value is hard to deduce from institutional inputs. Moreover, it is also hard to measure it empirically, even retroactively. While the number of seat-winning parties is fairly clear, determination of the ‘number of vote-getting parties’ is complicated by parties that do obtain a few votes but still cannot be considered serious or pertinent to the process.

Reed (1991, 2003) considered the number of candidates who win or narrowly fail to win in Japanese SNTV elections. He observed that the number of such ‘serious’ or ‘viable’ candidates that run in a district with M seats tends to be M + 1. Gallagher (2001) confirmed the findings for the Japanese second chamber elections of 1998, which used a mix of FPTP and PR. Cox (1997: 99) presented this M + 1 rule as a direct generalization of Duverger’s law and tested it in various ways.

Depending on the electoral system, M + 1 is meant here to be the number of viable candidates OR of viable lists: those winning at least one seat or coming close to winning. The distinction between candidates and lists is blurred at low M, where few parties expect to win more than one seat, anyway. In the Netherlands, however, where all 150 assembly seats are determined in a single nationwide district (i.e. M = 150), 151 viable candidates seems an understatement, while 151 viable parties is clearly an overstatement. Hence the M + 1 rule is not likely to hold for viable candidates and cannot possibly hold for viable parties in large multi-seat districts.
Reed’s argument is well grounded for SNTV, where parties are penalized for running too many candidates and every candidate is effectively competing against all others. With List PR, however, the argument might be presented in a different way. For every seat-winning party, its viable candidates could well be its seat-winning candidates ($S_i$) plus one that was close to winning: $S_i + 1$. With $p = M^{1/2}$ seat-winning parties (Chapter 8) this means a total of $M + p = M + M^{1/2}$ viable candidates for the seat-winning parties. To these, one must add an estimate for viable candidates in parties that failed to win a single seat.

As for the number of parties, it may well be that only two parties should run in a single-seat district on rational grounds. Then the average threshold of representation would be 50 percent. However, this is most often a gross overestimate, because more parties tend to run than rational choice theory proposes. The range of empirical estimates of the average threshold of representation in FPTP districts extends from 35 percent (Lijphart 1994) to possibly into the low 40s. The unusually pure two-party systems in the USA and in small island countries are exceptions. Parties beyond two may not be ‘viable’ in the sense of having a chance to win, but they do have a real impact by lowering the threshold of inclusion for the ‘serious’ parties. Therefore, this ‘irrational’ reality cannot be ignored. Maybe such parties that run (including independents) could be called ‘pertinent’ even when they are not ‘viable’.

The minimal baseline for the number of pertinent parties (p) is the number of seat-winning parties, $p = M^{1/2}$. My hunch is that $p = M^{1/2} + 2M^{1/4}$, although I cannot prove it. Here $M^{1/2}$ represents the number of seat-winning parties, and $2M^{1/4}$ is an estimate of the number of further parties enticed to run in an M-seat district. This formula would yield 3 parties running when M = 1. The corresponding threshold of inclusion would be 33 percent, the threshold of exclusion is definitely 50 percent, and the arithmetic and geometric means of the two are 41.7 and 40.8 percent, respectively, close to the observed average. The numbers of parties running in Finnish districts (Taagepera and Shugart 1989: 119) straddle the predictions by $p = M^{1/2} + 2M^{1/4}$: actual 7.4 versus the predicted 6.5 for M around 9; 7.9 versus 7.6 for M around 14; and 8.6 versus 9.0 for M around 22.
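The tentative formula for the number of pertinent parties is easy to evaluate. The sketch below uses a hypothetical function name and simply recomputes the single-seat case and the three Finnish comparisons quoted above.

```python
import math


def pertinent_parties(M):
    """Tentative number of pertinent parties, p = M**(1/2) + 2 * M**(1/4)."""
    return math.sqrt(M) + 2 * M ** 0.25


if __name__ == "__main__":
    # Single-seat district: 3 parties, so TI = 1/p and TE = 1/2 (FPTP thresholds).
    p = pertinent_parties(1)
    t_incl, t_excl = 1 / p, 0.5
    print(f"M=1: p={p:.2f}, TI={100 * t_incl:.1f}%, TE={100 * t_excl:.1f}%")
    print(f"     arithmetic mean={50 * (t_incl + t_excl):.1f}%, "
          f"geometric mean={100 * math.sqrt(t_incl * t_excl):.1f}%")
    # Finnish districts: (approximate M, actual number of parties running).
    for M, actual in ((9, 7.4), (14, 7.9), (22, 8.6)):
        print(f"M={M}: predicted p={pertinent_parties(M):.1f}, actual {actual}")
```

The predicted values come out as 6.5, 7.6, and 9.0 for M = 9, 14, and 22, matching the figures in the text.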
The formula would predict 19 pertinent electoral parties for the Netherlands. Returning now to the pertinent candidates, they would include the M seat-winning candidates plus one additional candidate per pertinent party:

$$c = M + M^{1/2} + 2M^{1/4}.$$

For the Netherlands (M = 150), it would suggest 169 pertinent candidates: the 150 winners, 12 near-winners from the 12 seat-winning parties and 7 near-winners from 7 parties failing to win a single seat. A reality check has not been carried out.

When investigating thresholds of inclusion, such as the ones in Table 15.1, one might as well plug in $p = M^{1/2} + 2M^{1/4}$, rather than using impressionistic figures for the number of parties. The results are shown in Table 15.3. For districts with up to 5 seats, the number of pertinent parties would exceed M, while for districts of 6 or more seats it would fall short of M. Regardless of how p is chosen, d’Hondt has the highest thresholds. Sainte-Laguë has the lowest exclusion thresholds. Hare-LR has the lowest inclusion thresholds (due to the Alabama paradox) and also the lowest $T_R$, taken as the geometric mean of $T_I$ and $T_E$. At low M, arithmetic means are higher than the geometric by up to 0.15 percentage points for d’Hondt, up to 1.2 for Sainte-Laguë, and up to 2.3 for Hare-LR.

Table 15.3. Number of ‘pertinent’ electoral parties (p) and resulting thresholds of representation (in %), if $p = M^{1/2} + 2M^{1/4}$ and $T_R = (T_I T_E)^{1/2}$

                 d’Hondt               Sainte-Laguë          Hare-LR
  M      p       TI     TE     TR      TI     TE     TR      TI     TE     TR
  1     3.00    33.3   50.0   40.8    33.3   50.0   40.8    33.3   50.0   40.8
  2     3.79    20.9   33.3   26.3    17.3   33.3   24.0    13.2   33.3   21.0
  3     4.36    15.7   25.0   19.8    12.0   25.0   17.3     7.6   25.0   13.8
  4     4.83    12.8   20.0   16.0     9.2   20.0   13.6     5.2   20.0   10.2
  5     5.23    10.8   16.7   13.4     7.6   16.7   11.2     3.8   16.7    8.0
  6     5.58     9.5   14.3   11.6     6.4   13.5    9.3     3.0   13.7    6.4
  7     5.90     8.4   12.2   10.5     5.6   11.0    7.8     2.4   11.9    5.4
  8     6.19     7.6   11.1    9.2     5.0   10.2    7.1     2.0   10.5    4.6
  9     6.46     6.5   10.0    8.0     4.5    8.7    6.2     1.7    9.4    4.0
 10     6.72     6.4    9.1    7.6     4.0    7.5    5.5     1.5    8.5    3.6
 20     8.70     3.6    4.8    4.2     2.1    3.2    2.6     0.6    4.4    1.6
 50    12.39     1.6    2.0    1.8     0.9    1.1    1.0     0.2    1.8    0.5
100    16.32     0.9    1.0    0.9     0.5    0.5    0.5     0.1    0.9    0.2

Note: Cases where p > M (M = 1 to 5) are shown in bold in the original.

Despite the variations in Table 15.3 at given district magnitude, the means of the inclusion and exclusion thresholds remain within ±3 percentage points for all usual List PR formulas, even at low magnitudes. An overall estimate for the mean threshold of minimal representation is the aforementioned

$$T = \frac{75\%}{M + 1} \qquad \text{[district-level approximation]}$$

The reasons for this choice are given in Taagepera (1998b). It does not include the number of parties and hence could be off the mark, if unusually many or few parties run, as seen in Table 15.1. It is approximately halfway between the values of $T_R$ for d’Hondt and Sainte-Laguë in Table 15.3, except at M = 1. For single-seat districts, the formula yields T = 37.5%. It may be too low. The empirical point of 50-50 probability for winning the seat was around 39 percent of votes in the UK in 1983 and as high as 49.5 percent in the 1970 US House elections (Taagepera 1998b). Unpublished work on districts in Canada suggests a figure around 41 percent.

Is T = 75%/(M + 1) theoretical or empirical? The starting point consisted of purely theoretical expressions for inclusion and exclusion thresholds, and the denominator (M + 1) harks back to the most widely occurring threshold of exclusion. The ‘75 percent’ in the numerator, however, came from largely empirical juggling of thresholds for various seat allocation formulas. So the expression is an empirical one, but with roots in theory.
Nationwide Threshold of Minimal Representation and Number of ‘Pertinent’ Electoral Parties

The above results apply to a single district. What about minimal representation, nationwide? What share of nationwide vote is likely to bring a party one seat in the assembly when the country is divided into districts of approximately equal magnitudes? The crucial factor is the number of electoral districts (E). Consider the extreme possibilities for distribution of votes among districts. The nationwide threshold could approach the district-level threshold, if the votes are uniformly dispersed. At the other extreme, it could be as low as district-level threshold divided by the number of districts, if all the nationwide votes are concentrated into one district. Thus the limits on the nationwide threshold are $T_{\text{district}}/E < T_{\text{nationwide}} < T_{\text{district}}$. Taking the geometric mean of the extremes suggests that the nationwide average threshold of minimal representation is the district-level threshold divided by the square root of the number of districts (Taagepera 2002b):

$$T = \frac{75\%}{(M + 1)E^{1/2}} \qquad \text{[nationwide approximation, parties]}$$

Here independent candidates differ from small parties, because they run in one district only, so that $T_{\text{nationwide}} = T_{\text{district}}/E$. Hence

$$T = \frac{75\%}{(M + 1)E} \qquad \text{[nationwide, independents]}$$
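The three expressions above can be placed side by side in a short sketch. The function names are hypothetical, and the example of 400 single-seat districts is a made-up input chosen only because its square root is a round number.

```python
import math


def district_threshold(M):
    """District-level approximation T = 75% / (M + 1), in percent."""
    return 75.0 / (M + 1)


def nationwide_threshold_party(M, E):
    """Geometric mean of the limits: T = 75% / ((M + 1) * E**0.5), in percent."""
    return 75.0 / ((M + 1) * math.sqrt(E))


def nationwide_threshold_independent(M, E):
    """Lower limit, all votes in one district: T = 75% / ((M + 1) * E), in percent."""
    return 75.0 / ((M + 1) * E)


if __name__ == "__main__":
    M, E = 1, 400  # hypothetical: 400 single-seat districts
    print(f"district level:          {district_threshold(M):.3f}%")
    print(f"nationwide, party:       {nationwide_threshold_party(M, E):.3f}%")
    print(f"nationwide, independent: {nationwide_threshold_independent(M, E):.5f}%")
```

For any E greater than 1 the party value falls between the independent value and the district-level value, which restates the inequality given in the text.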
This dependence on the number of districts suggests that a small party (or independent candidate) stands to win a seat with a lesser share of votes with FPTP than with List PR. This apparent paradox was first pointed out by Grofman (1999). The key word is ‘nationwide’. It takes a larger share of district votes to win by FPTP than by multi-seat PR, but a smaller share of nationwide votes.

How should we measure the actual vote level at which a party has a 50-50 chance of winning a seat? This empirical threshold of nationwide minimal representation (T) can be defined as follows (Taagepera 1989). At this vote share, the number of cases where parties have won at least one seat with vote shares lower than T equals the number of cases where they have failed to win seats with vote shares higher than T. For most countries, this empirical threshold agrees with the model above within a factor of 2, but UK, Spain, and Imperial Germany (1871–1917) have markedly lower thresholds than predicted (Taagepera 2002b).

A lower nationwide threshold would enable more parties to surmount it. In other words, the lower the threshold, the higher the number of seat-winning parties, $N_0$. Let us establish a more specific connection. In terms of assembly size and district magnitude, the number of districts is E = S/M. Plugging this value into the equation for parties above, we can express the nationwide threshold in terms of seat product MS:

$$T = \frac{75\%}{(MS)^{1/2}\left(1 + \frac{1}{M}\right)}.$$

The impact of (1 + 1/M) is quite limited. This factor equals 2 when M = 1, and tends toward 1 when M becomes very large (limited by M ≤ S). The extremes are

$$T = \frac{37.5\%}{(MS)^{1/2}} \qquad \text{[for FPTP]}$$

and

$$T = \frac{75\%}{(MS)^{1/2}} \qquad \text{[for very large } M\text{]}.$$

The mean number of seat-winning parties was previously shown to be $N_0 = (MS)^{1/4}$. Hence

$$T = \frac{75\%}{N_0^2\left(1 + \frac{1}{M}\right)}.$$

Reversing it yields

$$N_0 = \frac{(75\%/T)^{1/2}}{\left(1 + \frac{1}{M}\right)^{1/2}}.$$

Accordingly, the average number of seat-winning parties ranges from $N_0 = (37.5\%/T)^{1/2}$ for FPTP to $N_0 = (75\%/T)^{1/2}$ when the district magnitude is huge. This number can be determined empirically (with some practical difficulties, as pointed out in Chapter 8), so that this prediction can be tested.

Figure 15.1 (modified from Taagepera 2002b) graphs $N_0$ against T, both on logarithmic scales. The predicted zone is the area between the line marked as M = 1 and the curve marked as M = S. All actual data points are within a factor of 2 of the center of this zone. Overall, they are evenly dispersed on both sides of the zone, but there are differences among the types of electoral systems. Agreement with the model is excellent for simple multi-seat systems, as 10 of 14 points are within the predicted zone. For systems with legal thresholds the number of seat-winning parties always exceeds the prediction at given (legal) threshold. For the M = 1 systems, the reverse tends to be the case.

[Figure 15.1: log-log plot of the number of seat-winning parties ($N_0$, vertical axis, 1 to 100) against the threshold (T, in %, horizontal axis, 0.1 to 100). The predicted zone lies between the line $T = 75\%/(2N_0^2)$ for M = 1 and the curve $T = 75\%/(N_0^2 + 1)$ for M = S; further lines mark twice and one-half of the average prediction. Data points are distinguished as M > 1, M = 1, legal-threshold, and complex systems.]

Figure 15.1. Nationwide number of seat-winning parties vs average threshold of representation. Reprinted, with modified labels, from Electoral Studies, R. Taagepera, ‘Nationwide Threshold of Representation’, 383–401, © 2002, Elsevier Ltd., with permission from Elsevier.

In retrospect, the derivation of the predictive model involves a flaw. It was assumed that the nationwide threshold could approach the district-level threshold, if the votes are uniformly dispersed among the districts. This is true only to a point. In a single district, T = 75%/(M + 1) represents the average threshold of winning one seat with 50-50 probability. If the party has such a vote share in E districts, it is likely to win one seat in one-half of the districts, meaning a total of E/2 seats rather than a single one. For multi-seat PR the difference might be a minor one, but in the case of FPTP, this would mean winning a half of all the seats available! In this light, it is surprising that the M = 1 data in Figure 15.1 fit even as well as they do.

I have not found a way to refine the model so as to overcome this flaw. One might think of looking for nationwide thresholds of inclusion and exclusion and take their mean. These thresholds have been calculated (Taagepera 1998c), but they diverge so widely that their averages supply no clue about the location of the vote share with 50-50 probability for landing the first seat.

How many ‘pertinent’ parties would run nationwide? The tentative formula $p = M^{1/2} + 2M^{1/4}$ applies to districts. Extension to the nationwide level might involve the square root of the number of districts or the seat product, but the work remains to be done. A phantom number of vote-getting parties ($N_{V0}$) emerged in Chapter 14.
One might test how extensions of the formula proposed in the previous section compare with this phantom number. Both might be inserted into threshold formulas, to see whether they make sense.
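The links derived in this section between the nationwide threshold, the seat product, and the number of seat-winning parties can be sketched as follows. The function names are hypothetical, and the example assembly (625 single-seat districts) is used only as a convenient round number.

```python
import math


def threshold_from_seat_product(M, S):
    """Nationwide threshold T = 75% / ((MS)**0.5 * (1 + 1/M)), in percent."""
    return 75.0 / (math.sqrt(M * S) * (1 + 1 / M))


def seat_winning_parties(M, S):
    """Mean number of seat-winning parties, N0 = (MS)**(1/4)."""
    return (M * S) ** 0.25


def n0_from_threshold(T, M):
    """Reversed relation: N0 = (75%/T)**0.5 / (1 + 1/M)**0.5, with T in percent."""
    return math.sqrt(75.0 / T) / math.sqrt(1 + 1 / M)


if __name__ == "__main__":
    M, S = 1, 625  # illustrative FPTP assembly
    T = threshold_from_seat_product(M, S)
    print(f"T = {T:.3f}%")
    print(f"N0 from seat product: {seat_winning_parties(M, S):.3f}")
    print(f"N0 from threshold:    {n0_from_threshold(T, M):.3f}")
    # Same threshold written with the number of districts E = S/M.
    E = S / M
    print(f"75% / ((M + 1) * E^0.5) = {75.0 / ((M + 1) * math.sqrt(E)):.3f}%")
```

Both routes to $N_0$ give the same number, and the seat-product form of T agrees with the form that uses the number of districts.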
The Threshold of Breaking Even

The break-even point b was defined in Chapter 5 as the vote share for which fractional seat shares equal vote shares so that the advantage ratio (a) is 1. This point can be seen as another threshold. Attempts to build a logical model for the break-even point (Taagepera and Shugart 1989: 88–91, 270) yielded $b = 100\%/N_V$, but most actual values are below this prediction. The flaw in the model was that it used the effective number of electoral parties rather than some estimate of the total number of parties that run in a ‘serious’ way. The model could now be revised, replacing the effective number by the phantom number of vote-getting parties deduced in Chapter 14. This would connect the break-even point to the seat product, indirectly. Testing remains to be done.

However, the significance of the break-even point varies. It is clear for FPTP systems, with profiles such as shown in Figure 5.2. Here the average proportionality profile curve intersects the horizontal line a = 1 at a sharp angle. Immediately below the break-even point, a party is heavily shortchanged in terms of seats per votes, so this point matters. In contrast, the shift is more gradual and diffuse for multi-seat PR (e.g. Figure 5.4), so that falling somewhat below the formal break-even point is of little concern. The break-even point loses any meaning for highly dispersed profiles such as France (Two-Rounds, Figure 5.3). Still, at least for FPTP, better institutions-based prediction of the threshold of breaking even would be of interest.
The Threshold of Absolute Majority

In the calculation of inclusion and exclusion thresholds in districts, one can go beyond minimal representation and calculate such thresholds for winning two seats, and so on. One can also calculate the minimal vote shares at which a half of the seats could be won, and the maximal shares at which such majority of seats could still elude a party. Most thorough work in this direction has been carried out by Rubén Ruiz Rufino (2005). That study develops general equations that allow one to calculate the inclusion and exclusion thresholds ranging from minimal representation to absolute majority, for different allocation formulas. In line with the power and generality of the equations, the number of parameters to be fed in becomes large, and I hope a more user-friendly version can be worked out. Ruiz Rufino (2005) uses the effective number of parties as an input variable, but with the help of $N = (MS)^{1/6}$, purely institutional expressions could be established.

Extension to nationwide vote shares is conceivable. For FPTP, it has popped up spontaneously in the previous section. Here ‘minimal representation’ in a district implies exclusive representation in that district. To the extent that T = 75%/(M + 1) = 37.5% expresses the mean threshold of minimal representation in a single district, it also expresses the mean threshold of absolute majority nationwide.
The Paradoxical Relationships Between the Thresholds of Minimal Representation, Breaking Even, and Reaching Absolute Majority

It might seem evident that it would take fewer votes to achieve minimal representation than to break even, and fewer votes to break even than to achieve absolute majority, but it can be misleading. Actual relationships can be tricky at district level, and even more so at the district–nationwide interface. In one single-seat district, the minimal representation of one seat also means overrepresentation, absolute majority, and 100 percent majority. Relationships are less clear in multi-seat districts, and contrasts are less marked, but they still matter, and they are also harder to detect.

At district level, it may come as a surprise that not only inclusion thresholds but also exclusion thresholds of minimal representation mean overrepresentation, for the usual PR formulas. Indeed, the highest exclusion threshold we observed was for d’Hondt: 1/(M + 1). This vote share assures a share 1/M of the seats, meaning an advantage ratio a = (M + 1)/M = 1 + (1/M), which exceeds 1. This means that, within a single district, a party with only one seat is often overrepresented, unless we go to allocation formulas less proportional than d’Hondt.

Transfer from district to nationwide level is far from obvious. It was pointed out that it can take a smaller share of nationwide votes to win an assembly seat with FPTP than with PR. According to $N_0 = (MS)^{1/4}$, it is as easy to win a seat in a 625-seat assembly elected by FPTP as in an assembly of 25 elected by nationwide PR. There is a difference, however, in the degree of power that such minimal representation involves. Having 1 seat in an assembly of 25 may make a party a minor but serious player in cabinet formation, while having 1 seat of 625 amounts to very little. Thus the substantive meaning of minimal representation depends itself on assembly size.
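Two of the numerical points in this section can be verified in a few lines. The sketch below uses hypothetical function names; it checks that the two assemblies share the same seat product, and computes the advantage ratio of a party that wins one seat exactly at the d’Hondt exclusion threshold.

```python
def seat_winning_parties(M, S):
    """Mean number of seat-winning parties, N0 = (MS)**(1/4)."""
    return (M * S) ** 0.25


def advantage_ratio_at_dhondt_exclusion(M):
    """Seat share 1/M divided by vote share 1/(M + 1), i.e. 1 + 1/M."""
    vote_share = 1 / (M + 1)
    seat_share = 1 / M
    return seat_share / vote_share


if __name__ == "__main__":
    # 625 single-seat districts versus one nationwide 25-seat district.
    print(seat_winning_parties(M=1, S=625), seat_winning_parties(M=25, S=25))
    for M in (1, 2, 6, 25):
        print(f"M={M}: a = {advantage_ratio_at_dhondt_exclusion(M):.3f}")
```

The advantage ratio falls toward 1 as M grows but stays above it, which is the overrepresentation noted in the text.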
Legal Thresholds and their Concordance with Effective Thresholds

Some electoral systems impose legal thresholds on representation, such as requiring 5 percent of votes before a party can participate in allocation of seats. Transition is sharp, from no seats to nearly breaking even or even becoming overrepresented (see Figure 5.5). In contrast to the sharp ‘vertical’ barrier (step function) imposed by the legal threshold, the seat product imposes a gradual ‘tilted’ barrier zone that no single one of the aforementioned types of thresholds can characterize completely. For legal thresholds, a single number mostly says it all.

The tilted barrier raised by district magnitude is all too easily assumed to be analogous to a legal threshold, but one must exert caution. Confusion is enhanced by failing to distinguish between district-level and nationwide restrictions. Legal thresholds can be applied at district level (Spain) or nationwide (Germany), and the difference matters. Consider a party with 4.9 percent nationwide votes. If all seats are allocated in districts of more than 10 seats, then a district-level legal threshold of 5 percent most likely would allow it to win seats in some districts. In contrast, a nationwide 5 percent legal threshold would completely block it.

Previous analyses (including Taagepera and Shugart 1989; Lijphart 1994) often have confused the two levels, treating nationwide legal thresholds as equivalent to district-level effective thresholds imposed by district magnitude, such as T = 75%/(M + 1). Before one juxtaposes the two, one must correct for the effect of the number of districts (E). As noted earlier, the effective nationwide threshold set by a district-level effective threshold T is around $T/E^{1/2}$, and most likely even less for FPTP.
Conclusions and Implications for Institutional Engineering

This chapter has more questions, cautionary notes, and suggestions than firm answers. An operational definition of the number of parties that run seriously (or at least semi-seriously) is of interest in scholarly discussion. It impinges on various aspects of thresholds of representation. The world trend in democracies may be toward greater inclusion, as claimed by scholars such as Colomer (2004b) and Lijphart (1999). If so, then the threshold of minimal nationwide representation could be a formal yardstick of interest to political practitioners.
16 Seat Allocation in Federal Second Chambers and the Assemblies of the European Union
For the practitioner of politics:
• Here the implications for institutional engineering are major, because the institutional structure of the European Union is as yet unsettled. Making use of the logical models presented and tested here could save appreciable political wrangling and might improve the outcomes.
• The number of seats in the European Parliament seems to grope toward the cube root of total population, which is the empirical and logical norm for national assemblies. This ‘cube root law of assembly sizes’ could be made the official norm: The number of seats equals the cube root of the EU population.
• The total voting weights for qualified majority voting in the Council of the European Union seem to grope toward a logically founded formula for balanced representation of total population and of member states. This formula could be made the official norm: The total of voting weights equals the sixth root of the EU population × the square root of the number of member states.
• Allocation of EP seats and CEU voting weights among the members of EU has for 40 years closely approximated the one predicted by a ‘minority enhancement equation’ solely on the basis of the number and populations of member states plus the total number of seats or voting weights. This logically founded formula could be made the official norm. The formula is more complex than for total size of assemblies, but it is easily workable with a computer program or even a pocket calculator.
• All these models may be of use for some other supranational bodies and federal second chambers.
This chapter describes a promising spin-off from the law of minority attrition (Chapter 13), even while it is marginal for predicting party sizes as such. A predictive model is constructed and tested for allocation of seats among member countries in the EP and the Council of the European Union (CEU). The model is based on nothing but the constraints imposed by the total population, the number of seats (or voting weights) in the given body and the number and populations of member countries. Applying this logically based model could save on negotiation time when negotiators know that the eventual outcome will be close to fitting the model, anyway. This would be the major payoff for this chapter. The problem of seat allocation in the European bodies is part of a more general one, that of seat allocation in supranational entities such as the United Nations (UN) as well as in national second chambers, at least those that somehow reflect territorial subunits. It was recognized a long time ago that allocating seats to territorial units on the basis of their populations represents a problem mathematically similar to that of allocating seats to parties on the basis of votes. The latter is approximated by the law of minority attrition. However, mathematical similarity is limited in one respect. Small parties have no inherent right to even minimal representation. In contrast, territorial subunits in a federal second chamber may have such a right, simply by being a distinct subunit. The same goes for member countries in supranational entities such as the EU and international organizations. It follows that the number of constituent parts must be worked into the minority attrition equation, along with their populations and the total number of seats, before this equation can be applied to allocation of seats in such entities. Indeed, minority attrition must be reversed into minority enhancement. 
As applied to seats and votes, the law of minority attrition involves the total number of seats in the assembly as a given. The basis for determining this total number should be investigated in the first place. For the first or only chambers of national assemblies that represent individuals, the cube root law of assembly sizes applies, with some reservations (cf. Chapter 12). To the extent that the EP can be considered the analog of national parliaments, one may well ask how its total size relates to the cube root of population. Both the EP and the CEU could also be argued to have more in common with the second chambers of national parliaments. So the determinants of the sizes of second chambers should also be considered, especially those that are somehow tied to territorial subunits. We should know how many seats are to be allocated before trying to allocate them. Therefore, the size question will be addressed first. Thereafter, a minority enhancement equation will be developed on the basis of the law of minority attrition. It will be tested with EU data. Application to federal second chambers and some supranational bodies is briefly discussed.
The Size of Subunit-Based Second Chambers
Although the second chambers do not pretend to represent the population as such, larger countries may be expected to have larger second chambers. When the second chambers are based on territorial subunits, their size might also depend on the number of subunits. Taagepera and Recchia (2002) compiled the populations ($P$) and the sizes of first and second chambers ($F$ and $S$, respectively) for 28 contemporary countries where federal or other territorial subunits form the basis of election or appointment of at least part of the second chamber—this part being designated as $S'$. The sizes of assemblies and populations can take only positive values, and hence a linear relationship of logarithms is the simplest relationship to be expected (cf. Taagepera 2008), if there is a relationship at all. For these 28 countries, the first chamber sizes ($F$) roughly follow the aforementioned cube root law but fall to about one-half of that value at very low populations—the general pattern observed in Chapter 12. Linear regression of $\log F$ on $\log P$ yields $R^2 = 0.81$. For second chambers, the best-fit line $\log S$ versus $\log P$ corresponds to $S = 0.46P^{0.304}$ ($R^2 = 0.56$), or with some rounding off, $S = 0.48P^{0.30}$. The exponent 0.30 is slightly lower than the 0.33 in the cube root law (Taagepera and Recchia 2002). The best fit line of $\log S$ on $\log F$, the size of the first chamber, corresponds to $S = 1.00F^{0.786}$ ($R^2 = 0.68$). Thus, second chambers tend to be smaller than first chambers. The values of $R^2$ suggest that the second chamber sizes may be affected by population through the first chamber size rather than directly.
When second chambers are selected on the basis of $T$ territorial subunits, their sizes are subject to the following limiting constraints. (1) If all $T$ subunits are to be represented, the chamber must have at least $T$ seats. This is the lower limit on $S$. (2) If people were represented as individuals, the second chambers would be akin to first chambers and would be expected to have the same size ($F$). This is the upper limit on $S$. In the absence of any further knowledge, our best guess would be the geometric mean of the two limits, as there is no reason to stress the impact of one over the other (cf. Taagepera 2008). So the predictive model is $S = (FT)^{1/2}$. The actual best-fit exponent was found to be 0.506, very close to the expected 0.500. Thus the first chamber size and the number of subunits seem, indeed, to influence the size of the second chamber to roughly the same degree. For further testing, Taagepera and Recchia (2002) considered only the number of those second chamber seats allocated on the basis of subunits ($S'$), excluding nonregional appointments. Figure 16.1 shows the actual number of such seats graphed against the expected size $(FT)^{1/2}$. We expect $S = 1.00(FT)^{1/2} + 0$. The best-fit line is $S = 0.96(FT)^{1/2} + 12$, with $R^2 = 0.52$ between $S$ and $(FT)^{1/2}$. This line is seen to be extremely close to the expected one. Note that $R^2$ is not increased, compared to fitting with first chamber size alone. The difference is that the equation $S = 1.00F^{0.786}$ was purely postdictive, with no explanation given for why the exponent should be around 0.786 rather than something else. Here, in contrast, we have a predictive model, posited before any input of data, on very general grounds. It is confirmed by the agreement between the predicted and actual best-fit lines, regardless of the degree of scatter around them (cf. Taagepera 2008). Now combine the cube root law for first or only chambers, $F = P^{1/3}$, with $S = (FT)^{1/2}$. It leads to $S = P^{1/6}T^{1/2}$.
[Figure 16.1 plot: number of subunit-based second chamber seats (vertical axis, 0 to 400) against the geometric mean of F and the number of subunits (horizontal axis, 0 to 300), showing the model line and the best-fit line.]
Figure 16.1. Number of subunit-based second chamber seats vs. the geometric mean of first chamber size and the number of subunits Source: Reprinted from R. Taagepera and S. Recchia, ‘The Size of Second Chambers and European Assemblies’, European Journal of Political Research, 41: 185–205, © 2002 European Consortium for Political Research, with permission from ECPR.
If a second chamber or supranational assembly is selected purely on the basis of subunits or member states, balanced representation of total population as well as of member states may call for this total size: sixth root of the population × square root of the number of subunits. This model has not yet been fully tested.
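The combination of the two size models can be written out step by step. The lines below are only a worked check added for illustration: the symbols are those of the text, and the numerical inputs (a population of 172 million and 6 members) are the rounded 1958 figures listed in Table 16.2.

```latex
\begin{align*}
S &= (FT)^{1/2}, \qquad F = P^{1/3} \\
  &= \left(P^{1/3}\,T\right)^{1/2} = P^{1/6}\,T^{1/2}. \\[4pt]
\text{With } P &= 172 \times 10^{6},\ T = 6: \quad
  F \approx 556, \quad S \approx (556 \times 6)^{1/2} \approx 58.
\end{align*}
```

The result agrees with the predicted value of 58 shown for 1958 in Table 16.2.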
The Sizes of the European Parliament and the CEU
The patterns observed for the first and second chambers of national parliaments will now be applied, with obvious caution and reservations, to two-tiered supranational assemblies. Among the various ancillary bodies of the EU, the EP and the smaller CEU stand out. How would their sizes look, if these were the sizes of national first and second chambers, respectively? The CEU uses two forms of voting: qualified majority voting (QMV), where larger countries have larger voting weights, and unanimity voting, where all countries have equal votes. Only the QMV aspect will be considered, so that CEU size is shorthand for the total of voting weights.
Table 16.1. The number of seats in the European Parliament—prediction by the cube root law and the actual number

Year    Population (million)    P^(1/3)    Actual seats    Actual as % of P^(1/3)
1964    173                     557        142             25
1979    277                     652        410             63
1989    341                     699        518             74
1994    348                     703        567             81
1995    369                     717        626             87
2004    454                     769        785             102
Table 16.1 shows the population of the EU and the number of seats in the EP (based on Taagepera and Hosli 2006). The cube root of the population is compared to the assembly size. The initial size of the EP amounted to only one-quarter of the cube root of population represented, but it gradually approached the cube root with each expansion of the Union and reached it by 2004. This asymptotic growth is reminiscent of the growth of the US House, which started in 1790 at one-third of the cube root of the population but caught up with the cube root within 40 years (Taagepera and Shugart 1989: 175). The CEU followed a rather similar pattern of asymptotically approaching the level of $S = P^{1/6}T^{1/2}$, from 1958 to 1995, as seen in Table 16.2 (based on data in Taagepera and Hosli 2006). However, the Treaty of Nice proposed a sharp change. The figure for 2004 in Table 16.2 refers to the number proposed in the EU Constitutional Treaty, not yet ratified. It would send the size of the CEU through the ceiling, to more than three times $P^{1/6}T^{1/2}$ (326 percent of it).
Table 16.2. Total voting weights in the Council of the European Union—prediction by $S = P^{1/6}T^{1/2}$ and actual

Year      Population (P, million)    Members (T)    P^(1/6) T^(1/2)    Actual    Actual as % of P^(1/6) T^(1/2)
1958      172                        6              58                 17        29
1973      257                        9              76                 58        76
1981      288                        10             81                 63        74
1986      323                        12             91                 76        84
1995      369                        15             103                87        84
(2004)    454                        25             140                (456)     (326)
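The predicted columns of the two tables can be recomputed from the population and membership figures. The sketch below is an illustration added here, not the authors' own procedure: the function names are hypothetical, and the inputs are the rounded table entries. With those rounded inputs the Table 16.1 columns are reproduced; for Table 16.2 only the 1958 row is checked, since other rows may come out a unit or so away from the printed values, presumably because the printed values rest on unrounded populations.

```python
# EU population (millions) and actual EP seats, copied from Table 16.1.
EP_ROWS = [
    (1964, 173, 142),
    (1979, 277, 410),
    (1989, 341, 518),
    (1994, 348, 567),
    (1995, 369, 626),
    (2004, 454, 785),
]


def cube_root_size(population_millions):
    """Assembly size predicted by the cube root law, F = P^(1/3)."""
    return (population_millions * 1e6) ** (1.0 / 3.0)


def council_size(population_millions, members):
    """Total voting weights predicted by S = P^(1/6) * T^(1/2)."""
    return (population_millions * 1e6) ** (1.0 / 6.0) * members ** 0.5


def ep_table():
    """Return (year, predicted seats, actual as % of predicted) for each row."""
    rows = []
    for year, population, actual in EP_ROWS:
        predicted = cube_root_size(population)
        rows.append((year, round(predicted), round(100 * actual / predicted)))
    return rows


if __name__ == "__main__":
    for year, predicted, percent in ep_table():
        print(year, predicted, percent)
    # 1958 Council: 172 million people, 6 member states.
    print(round(council_size(172, 6)))
```

Populations are converted from millions to persons before taking roots, because the laws are stated for the population itself.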
The Minority Enhancement Equation for Seat Allocation in Federal Second Chambers and EU Bodies
Regardless of whether the total number of seats shows any regularity, those seats must be somehow allocated among the territorial subunits. Two distinct norms could be used. In national first or only chambers, the norm is ‘a person is a person’, and subunits receive seats in proportion to their populations (with possibly a minimum of one seat per subunit stipulated). In international bodies and some federal second chambers, the prevailing norm is ‘a state is a state’, and each of them receives the same number of seats. This is the case for the Assembly of the UN and for the US Senate. Some supranational bodies and federal second chambers try to accommodate both of these conflicting norms, allocating larger states more seats, but still short of their proportional due. This third alternative is used for the Canadian second chamber, for the EP, and also for the voting weights in the CEU. Can we make logical predictions for the outcome of a compromise that tries to accommodate both norms, representation of individuals and of states? It turns out we can.
The starting point is the law of minority attrition (Chapter 13) that, among other applications, relates the seats ($S_i$, $S_j$) of two parties, $i$ and $j$, to their votes ($V_i$, $V_j$). Its first component is $S_i = S V_i^n / \sum_k V_k^n$, where $S$ is the total number of seats ($\sum_k S_k = S$). Henri Theil (1969) offered formal proof that the transformation of vote shares into seat shares must follow this format, because, among all functions of the form $S_i/S_j = f(V_i/V_j)$, this is the only one that does not lead to inconsistencies in the presence of more than two parties. Theil (1969) also pointed out that the same general formula may be applied to seat allocation in international institutions whenever one wishes to overrepresent smaller states. It suffices to replace votes ($V_i$) by the populations ($P_i$) of the countries:
$$S_i = \frac{S P_i^n}{\sum_k P_k^n}.$$
Here the summation is over $N$ countries and an exponent smaller than 1 must be used. The value $n = 0$ would provide each country with the same number of seats, regardless of population size (‘a state is a state’), while $n = 1$ represents countries in direct proportion to their populations (‘a person is a person’).
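To make the two limiting norms concrete, the formula can be evaluated at $n = 0$, $n = 1$, and the halfway value $n = 0.5$. The lines below are an illustration added here; the two states with populations of 9 million and 1 million and the total of $S = 10$ seats are hypothetical round numbers, not data from the text.

```latex
\begin{align*}
n = 0:&\quad S_i = \frac{S \cdot 1}{\sum_k 1} = \frac{S}{N}
      &&\Rightarrow\ S_1 = S_2 = 5,\\
n = 1:&\quad S_i = \frac{S P_i}{\sum_k P_k} = S\,\frac{P_i}{P}
      &&\Rightarrow\ S_1 = 9,\ S_2 = 1,\\
n = 0.5:&\quad S_i = \frac{S \sqrt{P_i}}{\sum_k \sqrt{P_k}}
      &&\Rightarrow\ S_1 = \frac{10 \cdot 3}{3 + 1} = 7.5,\ S_2 = 2.5.
\end{align*}
```

Because the formula depends only on population ratios, the result is the same whether populations are entered in persons or in millions.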
The value $n = 0.5$, halfway between these extremes, may seem a balanced compromise between the two norms. It has a mathematical rationale, was recommended by Theil (1969) and roughly fits the distribution of CEU voting weights. However, the distribution of seats in the EP fits only when $n$ is around 0.7. Why would the EU use different criteria for the two bodies? The reason is that the total number of seats matters. Indeed, if the present EU, with 25 members (as of 2006), had an assembly of only 25 seats, the only acceptable way to allocate them would be to allocate each country one seat—which would correspond to $n = 0$ in the equation above. On the other hand, if the assembly were huge, appreciable proportionality to populations of countries could be afforded, while still giving representation even to the smallest member states. Taagepera and Hosli (2006) present a modification of the law of minority attrition where they express the disproportionality exponent $n$ in terms of not only total population ($P$) and total number of seats ($S$), but also the number of member states ($T$). The model is presented in the chapter appendix. The result is
$$n = \frac{\dfrac{1}{\log S} - \dfrac{1}{\log T}}{\dfrac{1}{\log P} - \dfrac{1}{\log T}}.$$
This is a more complex expression than what we obtained in Chapter 15 for seats and votes: $n = \log V/\log S$. If we applied the latter to member states, with populations replacing votes ($n = \log P/\log S$), the small states would be left with no representation, like small parties are. But here we want to overrepresent them, population-wise, because they are distinct members. This is why the number of members must be brought in, and it complicates the expression for $n$. The combination of this expression with $S_i = S P_i^n / \sum_k P_k^n$ can be called the minority enhancement equation. Once $P$, $S$, and $T$ are given, the number of seats for any country $i$ can be calculated on the basis of its population, $P_i$.
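The calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in Taagepera and Hosli (2006): the function names are hypothetical, the three member populations in the usage example are invented, and the rounding step is a plain Largest Remainders rule standing in for the rounding to integers that the testing section mentions.

```python
import math


def enhancement_exponent(total_seats, total_population, members):
    """Exponent n = (1/log S - 1/log T) / (1/log P - 1/log T).

    Needs S > 1, T > 1 and P != T so that no logarithm or denominator is zero.
    """
    inv_s = 1.0 / math.log(total_seats)
    inv_p = 1.0 / math.log(total_population)
    inv_t = 1.0 / math.log(members)
    return (inv_s - inv_t) / (inv_p - inv_t)


def real_valued_seats(populations, total_seats):
    """S_i = S * P_i^n / sum_k P_k^n, before rounding to integers."""
    n = enhancement_exponent(total_seats, sum(populations), len(populations))
    powered = [p ** n for p in populations]
    denominator = sum(powered)
    return [total_seats * w / denominator for w in powered]


def largest_remainders(quotas, total_seats):
    """Round quotas to integers that still add up to total_seats."""
    seats = [math.floor(q) for q in quotas]
    leftover = total_seats - sum(seats)
    # Hand the remaining seats to the members with the largest fractional parts.
    by_remainder = sorted(range(len(quotas)),
                          key=lambda i: quotas[i] - seats[i], reverse=True)
    for i in by_remainder[:leftover]:
        seats[i] += 1
    return seats


if __name__ == "__main__":
    # 1995 totals from Tables 16.1 and 16.2: 369 million people, 15 members.
    print(round(enhancement_exponent(626, 369e6, 15), 3))  # EP seats
    print(round(enhancement_exponent(87, 369e6, 15), 3))   # CEU voting weights
    members = [60e6, 10e6, 4e5]  # hypothetical populations, in persons
    print(largest_remainders(real_valued_seats(members, 30), 30))
```

Populations must be entered in persons rather than millions, because the exponent depends on the logarithm of the total population. The ratio of reciprocal logarithms is the same for any logarithm base, so natural logarithms are used here for convenience.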
Testing the Minority Enhancement Equation with the Parliament and the Council of the European Union
Taagepera and Hosli (2006) tabulate the predicted and actual voting weights or seats in the CEU and EP. Table 16.3 shows the considerable changes in these bodies, over 30 years. Yet the calculated values of exponent $n$ remained quite steady: 0.47 ± 0.06 for CEU and 0.69 ± 0.03 for EP.
Table 16.3. Characteristics of seat allocations in the Council of the European Union and the European Parliament

                                           CEU          EP
Time period                                1958–95      1964–95
Number of members                          6–15         Same
Population (million)                       172–387      Same
Seats/voting weights                       17–87        142–626
Range of exponent n                        0.41–0.52    0.67–0.72
Shares misallocated by predictive model    0–5.2%       2.3–5.1%

Source: As analyzed in Taagepera and Hosli (2006).
The predicted seats were rounded to integers, using the equivalent of the Largest Remainders approach, so as to preserve the total number of seats. The predictive model misallocated only 2.6 percent of the CEU voting weights, on the average, and 3.7 percent of the EP seats. Three empirical postdictive data fits by other authors, reviewed in Taagepera and Hosli (2006), could hardly do better than the predictive model.
Figure 16.2 shows the degree of fit in 1995, a year with average misallocations (2.3 percent for CEU, 3.4 for EP). The theoretical lines are shown, and they agree with data points so well that they might be mistaken for statistical best-fit lines. They are not! The lines shown are predictions by a logical model based solely on the number of countries, total seats (or voting weight units), and country populations. The only consistent deviation is an excess for the smallest member, Luxembourg—and it prevails throughout the entire period. Data and graphs for other time periods are shown in Taagepera and Hosli (2006). These are very robust results, over 30 years.
The contrast between EP and CEU confirms that, with the same number and total population of member states, the size of the assembly makes a difference: The smaller the body, the more disproportionate the representation of populations is, in favor of smaller member states. No arbitrary value of the disproportionality exponent, such as $n = 0.5$, can fit all sizes.
[Figure 16.2 plot: seats (voting weights) against member state population in millions, both on logarithmic scales, with data points for the 15 member states. European Parliament: theoretical slope = 0.672. Council of the EU: theoretical slope = 0.456.]
Figure 16.2. Seat and voting weight distribution in the European Parliament and the Council of the EU in 1995—predictive model and actual values Source: Taagepera and Hosli (2006). Reprinted from Political Studies, 54, R. Taagepera and M. Hosli, ‘National Representation in International Organizations’, 370–98, © 2006 Political Studies Association, with permission from the Political Studies Association.
But how come that the EU institutions conform to the minority enhancement equation when its decision-makers have not been aware of its existence? Could the negotiators or arbitrators in a supranational organization be mathematiciens malgré soi? To some extent, this is so, indeed. From the very beginning, the EU seemed to respect the following two ground rules. First, even the smallest member must have nonzero
representation. Second, a more populous state must not have less representation than a less populous one. These ground rules are close to two constraints of the three on which the exponent $n$ is based in the model (see chapter appendix). In other words, the predictive model is not artificial but derives largely from very simple and practical principles.
So the larger states should have more representation. But how much more should they have, compared to the smaller? Here the model starts out from the norm that, if the given institution offers only one position, it would go to the largest member. Somehow, this highly debatable and apparently irrelevant constraint is conducive to the same outcome as the complex haggling during 40 years of constructing the EU.
Table 16.4. Incongruent seat allocations for the European Parliament elections of 2004, compared to population in 2000

Country        Population (million)    Seats
Slovakia*      5.4                     14
Denmark        5.3                     16
Finland        5.2                     16
Ireland        3.8                     15
Lithuania*     3.7                     13
....
Estonia*       1.4                     6
Cyprus*        0.8                     6
Luxembourg     0.4                     6
Malta*         0.4                     4

* New member.
The Treaty of Nice and subsequent negotiations broke, for the first time in EU history, the ground rule that a more populous state must not have less representation than a less populous one. Table 16.4 shows examples of incongruence in seat allocations for the EP elections of 2004. The new members are marked with an asterisk, and several of them are clearly
underrepresented, compared to old members with similar populations. The populations changed little, from 2000 to 2004. Rather, the EU had implicitly introduced second-class citizenship for new members. Allocations of voting weights in the CEU underwent similar pressures. With some of the basic ground rules bent, it is not surprising that the gap between the predictions of the minority enhancement equation and the actual seats or voting weights has increased moderately since the Treaty of Nice—as they would for any postdictive curve fitting, too. In particular, the smallest member states (Cyprus, Luxembourg, and Malta) are heavily overrepresented, presumably thanks to the founding member status of Luxembourg that pulls the others along (see Taagepera and Hosli 2006: Figure 4). Rather than reducing the importance of minority enhancement equation as a logically based norm, the post-Nice developments show that the expanding EU needs to protect itself against ad hoc haggling. All countries have explicit seat allocation rules for parties in national parliaments, and federal countries have such rules for seat allocation among federal subunits. This seems so self-evident that some reviewers for Taagepera and Hosli (2006) found it hard to believe that EU does not have explicit rules for seat allocation among member states. If it wants to avoid further haggling, the EU needs firmer ground rules. The minority enhancement equation offers a consistent basis on which further stipulations could be grafted, depending on commonly accepted special needs.
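The second ground rule (a more populous state must not have less representation than a less populous one) is easy to check mechanically. The sketch below is an illustration added here; the function name is hypothetical and the data are the rows of Table 16.4 as printed. It flags only strict violations, so cases with nearly equal populations and unequal seats, such as Ireland and Lithuania, are not reported.

```python
# (country, population in millions, EP seats), copied from Table 16.4.
TABLE_16_4 = [
    ("Slovakia", 5.4, 14),
    ("Denmark", 5.3, 16),
    ("Finland", 5.2, 16),
    ("Ireland", 3.8, 15),
    ("Lithuania", 3.7, 13),
    ("Estonia", 1.4, 6),
    ("Cyprus", 0.8, 6),
    ("Luxembourg", 0.4, 6),
    ("Malta", 0.4, 4),
]


def rule_violations(rows):
    """List pairs (larger, smaller) where the more populous state has fewer seats."""
    found = []
    for name_a, pop_a, seats_a in rows:
        for name_b, pop_b, seats_b in rows:
            if pop_a > pop_b and seats_a < seats_b:
                found.append((name_a, name_b))
    return found


if __name__ == "__main__":
    for larger, smaller in rule_violations(TABLE_16_4):
        print(larger, "has fewer seats than the less populous", smaller)
```

On these rows, every strict violation involves Slovakia, a new member, set against an old member with a smaller population.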
Application of the Minority Enhancement Equation to Federal Second Chambers and United Nations
A systematic analysis of federal second chambers remains to be done, so as to distinguish those that follow pure principles of either ‘a state is a state’ (USA) or ‘a person is a person’ (Austria) from those that try to follow a middle course (Canada). The interesting question is how close the latter come to the minority enhancement equation, and whether they could profit from coming even closer. The UN has followed a two-pronged approach. Its General Assembly treats all countries as equals—as if it were a second chamber. In contrast, the Security Council assigns more representation to its permanent members, which are among the most populous. In view of formal veto power and realistic ability to use it, it could be argued that the USA wields power at least commensurate with its share of the world population. In this respect, the Security Council appears more akin to a national first chamber—contrary to what its relationship to the General Assembly might look like at first glance. One could play at determining the sizes of two assemblies that correspond to the world population, using the models $F = P^{1/3}$ and $S = P^{1/6}T^{1/2}$. One could then allocate these seats according to the populations of member countries of the UN, using the minority enhancement equation. This is left as an exercise for those interested.
Conclusions and Implications for Institutional Engineering
The models presented and tested here are of direct interest for institutional engineering in the EU, because its institutional structure is as yet unsettled. Making use of these models could save appreciable political wrangling and improve the outcomes. The EU needs firmer ground rules for allocation of seats in the EP and voting weights in the CEU. It could also profit from ground rules for the sizes of these bodies. The cube root law, its extension to subunit-based chambers, and the minority enhancement equation offer a consistent basis. This is not a take-it-or-leave-it proposition. The models presented could be taken as a starting point, on which further stipulations could be grafted, depending on perceived special needs. For instance, suppose one desires even stronger overrepresentation of the smallest members, such as Luxembourg-sized countries in the EU. It could be obtained by adding 0.5 or 1 million to the populations of all countries. The addition would hardly affect large or even median countries while boosting the tiniest.
All this is outside the central focus of this book—predicting party sizes. Yet it is connected methodologically. Assembly sizes affect party size distribution, through the seat product MS. Analogous determinants operate in sub- and supranational bodies. The law of minority attrition enables us to connect party seat shares to their vote shares. One can easily modify it to fit situations where even the smallest components are entitled to representation. Given the potential importance for institutional engineering, this is a major spin-off from electoral and party studies. The reverse can also come about. The additional insights gathered from analysis of supranational and federal bodies might, in turn, prove useful in electoral and party studies.
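The effect of adding a fixed amount to every population can be illustrated numerically. The sketch below is an assumption-laden illustration, not a calculation from the source: the function names and the three populations (80, 10, and 0.4 million) are hypothetical, and the exponent is simply held at 0.7, the approximate EP value mentioned in the chapter, rather than recomputed after the addition.

```python
def shares(populations, n):
    """Seat shares P_i^n / sum_k P_k^n for a given exponent n."""
    powered = [p ** n for p in populations]
    total = sum(powered)
    return [w / total for w in powered]


def boost_from_offset(populations, n, offset):
    """Ratio of each member's share after adding `offset` to every population
    to its share before the addition."""
    before = shares(populations, n)
    after = shares([p + offset for p in populations], n)
    return [a / b for a, b in zip(after, before)]


if __name__ == "__main__":
    # Hypothetical large, median, and Luxembourg-sized members, in persons.
    members = [80e6, 10e6, 0.4e6]
    for population, ratio in zip(members, boost_from_offset(members, 0.7, 1e6)):
        print(f"{population / 1e6:5.1f} million: share multiplied by {ratio:.2f}")
```

In this example the shares of the large and median members change by only a few percent, while the share of the smallest member more than doubles.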
Appendix to Chapter 16
Construction of the minority enhancement equation
Previous equation $n = \log V/\log S$ (Chapter 14) inspires us to look for logical constraints on the value of $n$ in the case of international bodies and federal second chambers. The idea behind this equation is that the total numbers of seats and votes determine the disproportionality exponent $n$. In the case of sub- and supranational bodies, the total number of seats ($S$) remains a factor, total population ($P$) easily substitutes for total votes, but the number of territorial units to be represented ($T$) also matters. The way $P$, $S$, and $T$ enter is subject to logical constraints. This means that the function $n = f(T, S, P)$ must apply to extreme or other special cases, if it is to be general. Three such logical constraints can be posited (Taagepera and Hosli 2006).
I. If the number of seats to be allocated matches the number of territorial units ($S = T$), then the only reasonable way to distribute them is to give each unit one seat. Mathematically, this means $f(S = T, P) = 0$. Indeed, if $n = 0$ is entered into $S_i = S P_i^n / \sum_k P_k^n$, it becomes $S_i = S/T = 1$, given that $a^0 = 1$ for any finite number $a$. Here the difference between party representation and territorial representation becomes important. When seats are scarce, one could easily leave a tiny party with no seat and provide a large party with several seats. In the case of territorial units, however, even the tiniest member state is entitled to one seat, before even the largest country can receive a second one. A state is a state.
II. If the number of seats equals total population ($S = P$), then every person should get a seat, and we would have perfect PR of populations. A person is a person. Mathematically, $f(T, S = P) = 1$, because $n = 1$ in $S_i = S P_i^n / \sum_k P_k^n$ leads to $S_i = S P_i/P$ and hence $S_i/S = P_i/P$.
III. If there is only one seat to be allocated, the largest member has the strongest claim. Equation $S_i = S P_i^n / \sum_k P_k^n$ yields this result when $n$ tends toward $\infty$. This implies that $f(T, S = 1, P) \to \infty$. This constraint is more debatable than the others. Why not rotate the seat among the members, as is the case for the presiding country of the EU? Rotation would mean return to the norm ‘a state is a state’. Visibly, this norm applies to the EU presidency, but not to the Council of the EU and the EP. We need a third stipulation, distinct from the two previous ones, to specify a model in three variables ($S$, $P$, and $T$)—and $f(T, S = 1, P) \to \infty$ works. But it would be nice if it could be grounded in a different way.
The two first constraints are satisfied whenever $n = f(T, S, P)$ has the form
$$n = \frac{g(S) - g(T)}{g(P) - g(T)},$$
where $g(x)$ means an identical transformation of the variables involved. More complex expressions also satisfy the constraints. Occam’s razor principle holds that the simplest form should be chosen, unless nature imposes more complex forms. Note that $n = 0.5$, the value proposed by Theil (1969), implies that $g(S)$ is the arithmetic mean of $g(T)$ and $g(P)$:
$$g(S) = \frac{g(T) + g(P)}{2} \;\rightarrow\; n = 0.5.$$
The third constraint is satisfied when $g(x)$ is a function such that $g(1) \to \infty$. Again, there are other ways to satisfy the three constraints, but this one is the simplest. The simplest functions $g(x)$ leading to $g(1) \to \infty$ are $g(x) = 1/(x - 1)$ and $g(x) = 1/\log x$. For reasons analogous to those presented previously in the appendix to Chapter 13, the logarithmic expression is to be preferred. Empirically, $g(x) = 1/(x - 1)$ would predict values of $n$ ranging from 0.69 to 0.95 for CEU, while the actual values of $n$ have been under 0.6 ever since 1958. For EP, the predictions would be above 0.96, while the actual values have remained under 0.75. Thus, the function $g(x) = 1/(x - 1)$ clearly cannot explain the actual seat or voting weight allocations, while the excellent fit resulting from $g(x) = 1/\log x$ can be seen in Figure 16.2.
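The comparison between the two candidate transformations can be reproduced with a short sketch. It is an illustration added here with hypothetical function names; the inputs are the rounded totals from Tables 16.1 and 16.2, so the outputs are approximate.

```python
import math


def exponent_from_g(g, total_seats, total_population, members):
    """n = (g(S) - g(T)) / (g(P) - g(T)) for a chosen transformation g."""
    g_s, g_p, g_t = g(total_seats), g(total_population), g(members)
    return (g_s - g_t) / (g_p - g_t)


def g_log(x):
    """Logarithmic choice, g(x) = 1 / log x."""
    return 1.0 / math.log(x)


def g_shift(x):
    """Rejected alternative, g(x) = 1 / (x - 1)."""
    return 1.0 / (x - 1.0)


if __name__ == "__main__":
    cases = {
        "CEU 1958": (17, 172e6, 6),
        "CEU 1995": (87, 369e6, 15),
        "EP 1995": (626, 369e6, 15),
    }
    for label, (seats, population, members) in cases.items():
        n_log = exponent_from_g(g_log, seats, population, members)
        n_shift = exponent_from_g(g_shift, seats, population, members)
        print(f"{label}: 1/log x gives {n_log:.3f}, 1/(x-1) gives {n_shift:.3f}")
```

Both choices satisfy the first two constraints by construction ($S = T$ gives 0 and $S = P$ gives 1). With these inputs, the $1/(x - 1)$ form gives about 0.69 for the 1958 CEU and about 0.98 for the 1995 EP, consistent with the ranges quoted above.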
17 What Can We Expect from Electoral Laws?
For the practitioner of politics:
• Expect electoral laws to have an effect on party system, government stability, and other features.
• Expect some ability to fine-tune simple electoral laws to desired goals, but be cautious.
• Do not expect any ability to tailor complex electoral laws to desired goals.
• Consider marginal adjustments rather than flipping to completely different electoral laws.
• Keep the same electoral laws for at least three elections before changing them.
What do we know about electoral systems worth conveying to political practitioners intent on creating or revising an electoral system? I will try to answer this question, keeping in mind that some change in the party system is usually at least one of the goals. The next question is how can we generate further usable knowledge? I will outline a broader agenda, going beyond macro-level models for simple legislative elections. This agenda involves micro-level models, more complex systems, and elections beyond the legislative. Chapter appendix offers data in a form where the effect of the seat product MS has been removed (‘controlled for’) so that detection of other factors may be easier.
How Much Do We Know?
If ethnic, religious, or social groups really insist on slaughtering each other, then no electoral system can prevent them. One cannot expect that much from electoral laws. But in borderline cases, some systems may work better than some others. In homogeneous societies too, electoral systems can have detectable effects on policy outputs and, over the long run, on political culture. Decision-makers try to choose an electoral system that fits the existing political culture or nudges it in a desired direction, but they do not always succeed. Fiji is a recent example where advice by political scientists clashed. Fiji is ethnically split among original Fijians and Indo-Fijians whom the British colonial rulers brought in as laborers. The conflict peaked with a military coup in 1987. In 1996, a Constitution Review Commission aimed at consensus-building among competing ethnic groups and proposed Alternative Vote in a mix of communal and ethnically heterogeneous single-seat districts. It was adopted with minor changes and was used in the elections of 1999 and 2001. Alternative Vote had been proposed by Donald Horowitz (1985, 2002, 2006), who maintained that it would lead to parties courting the second choice votes of centrist voters and hence to softening the extremist rhetoric. Fiji discarded the contrary advice by Arend Lijphart (1977, 2002), who proposed a ‘consociational’ approach, with closed-list PR, group autonomy, and power sharing in multiparty coalitions. Alternative Vote in Fiji is now considered a failure (Fraenkel and Grofman 2005, 2006a, 2006b; Stockwell 2005). Compared to multi-seat PR, single-seat districts are bound to increase the number of frustrated voters who do not get their first preferences elected. Alternative Vote could mitigate it only if the existing political culture favors compromise rather than seeing it as more dishonorable than defeat.
Contemporary Westerners may take the existence of a culture of compromise too much for granted. In Fiji, disproportionality between the seat and vote shares became huge. The voters’ second choices were manipulated by party leaders, giving an edge to extremist parties at the expense of the moderate ones. Like the Borda count (BC), Alternative Vote (AV) might be a good system ‘only for honest men’ (cf. Chapter 3), meaning those who do not play strategic games in a divided society. In nearby New Caledonia, ethnic strife also interrupted democratization with election boycotts and violent clashes from 1984 on. The existing List PR was complemented in 1998 by mandatory power sharing, decentralization, and improved access to voting outside the capital area. It is
What Can We Expect from Electoral Laws?
not an unqualified success, but leaders of opposing parties have served together in coalition governments and in 2004 centrist parties triumphed, by shifting the agenda away from ethnicity (Fraenkel and Grofman 2005; Maclellan 2005). A sample of two countries is too small to draw conclusions. New Caledonia may have been lucky and Fiji unlucky. The consociational approach still has to prove itself in Kosovo (Taylor 2005). But some systems may be more failure prone than others. Single-seat districts may provide fewer viable mechanisms for consolidation of democracy (Birch 2005). We have plenty of empirical data and precedents. Half a century ago, W.J.M. MacKenzie (1954: 54) maintained that ‘The only thing that can be predicted with certainty about the export of elections is that an electoral system will not work in the same way in its new settings as in its old.’ Is it still true? In Harry Eckstein’s terminology (1966, 1998), if institutions are not sufficiently congruent with the existing political culture, they fail or yield unexpected results. A country can do pretty unexpected things even to straightforward electoral laws (cf. A Wuffle, quote at the start of Part III). Extrapolation into the future is risky even regarding the same country. The ‘cube law’ fitted the British elections up to the 1960s, but then the disproportionality exponent shifted from 3 toward 2 and then 1.5 in the 1970s (cf. Chapter 13). The importance of third parties has also increased in the UK, without any marked change in the electoral laws. At the level of recipes based on single country precedents, political scientists have little more to offer than historians or journalists. They have an edge when they go comparative, detecting empirical regularities while also including cautionary case studies. Even so, empirically based regularities depend on ‘all other things being the same’. How can we know which things must remain the same, if we do not know what causes the observed relationship? 
How could we predict whether the cube rule continues to hold in Britain without knowing its cause? This is where predictive models based on logical considerations (ranging from general to specific) offer more certainty. They help us know which things must be the same for the observed regularities to hold, and to what extent changes in inputs alter the outputs. They help but do not guarantee. The law of minority attrition explains the exponent 3 in the cube rule by interaction between the number of voters and number of seats. But these numbers changed little in the UK at the time the exponent dropped. We might presume that the very knowledge about the seat–vote relationships enabled parties to counteract the natural tendencies
Implications and Broader Agenda
by concentration of resources into the most promising districts. But what is then left of the predictive ability of the minority attrition equation? Such questions must be faced. If they cannot be immediately answered, the choice is between reverting to pure empiricism—or even to an ‘anything can happen’ attitude—or keeping on looking for further logically grounded explanations. I favor the latter, and we are making headway. It may take more than a few exceptions to sink a logically grounded model. When physicists noticed that energy did not seem conserved in certain subatomic processes, they did not discard the principle of conservation of energy but posited the existence of an as yet undetectable particle, the neutrino. It took decades before more direct evidence for the existence of this phantom particle could be found.
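The cube rule and its shifting exponent, discussed above, can be illustrated numerically. The following is a minimal sketch, assuming the usual two-party form in which the ratio of seats equals the ratio of votes raised to an exponent n. The function name `seat_share` and the 55 percent vote share are illustrative assumptions, not figures taken from the text.

```python
def seat_share(vote_share: float, exponent: float) -> float:
    """Two-party seat share when seat ratio = (vote ratio) ** exponent.

    exponent = 3 corresponds to the cube rule; exponent = 1 is perfect PR.
    """
    if not 0.0 < vote_share < 1.0:
        raise ValueError("vote_share must lie strictly between 0 and 1")
    ratio = (vote_share / (1.0 - vote_share)) ** exponent
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    # Hypothetical party with 55% of the two-party vote.
    for n in (3.0, 2.0, 1.5, 1.0):
        share = seat_share(0.55, n)
        print(f"exponent {n}: 55.0% of votes -> {100 * share:.1f}% of seats")
```

As the exponent falls from 3 toward 1, the seat bonus of the larger party shrinks, which is what a drop in the disproportionality exponent means in practice.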
Designing Electoral Laws and Waiting for a Party System to Evolve
The devil is in the detail. If you clutter electoral laws with details that could be avoided, then the devils of unexpected consequences will have a field day. If you keep it simple, you will have some ability to predict. If simple electoral systems produce undesirable outcomes in the given cultural context, we may at least know in retrospect what caused them, and then we can try incremental changes. When electoral systems are made complex, any degree of rational predictability vanishes. Incremental adjustments to unwelcome surprises become impossible when we cannot even be sure which component is at fault. Hence attempts at correction may make matters worse. For simple electoral systems, the corrective ability should not be dismissed, nor should it be overestimated. Excessive optimism would only lead to disappointment and complete dismissal. Even when a newly democratizing country chooses a simple system, inspection of previous graphs shows that various outputs can be off by a factor of 2. We might expect to have 4 parties in the parliament, but can get as many as 8, or only 2. Most political science undergraduates should be able to tell you that few countries have more than 8 or fewer than 2 parties, so where is predictability? First, in most cases the outcome is likely to be closer to expectation. Second, gradual adjustment is possible, if you know what to adjust and by how much. If the given political culture, other institutions, and other factors combine to produce 8 parties in the assembly and one still
wants to have 4, one might cut district magnitude by a factor of $2^4 = 16$. This is so because the relationship $N_0 = (MS)^{1/4}$ in Chapter 8 indicates that cutting $M$ 16-fold would cut $N_0$ twofold. Actually, better err on the conservative side and cut $M$ by a factor of 8. Then allow at least three elections to take place, and see what happens (Taagepera 2002d). Indeed, it takes several elections with the same laws before their average long-term effects can be evaluated. Parties and voters need time to learn how to use a newly adjusted system to their best advantage. If the laws are continuously altered, no stable system and ways to handle it can emerge. Of course, some initial choices may be so disastrous as to be given up in a hurry, but they are rare. The laws may not be that badly dysfunctional, once people learn to use them. Moreover, if you truly botched it the first time, what guarantees that a total flip does not lead from flaws discovered to flaws as yet unknown? Fine-tuning may achieve the desired effects more safely. Sometimes the change needed may lie outside the electoral system as such. Party financing laws can affect the number of districts in which parties decide to run, which in turn affects party votes, seats-to-votes ratios, and possibly seats. Do parties strike election-time alliances but part ways once in the assembly? The gut reaction might be to set higher legal thresholds for alliances, but this would complicate the electoral system and be hard to police. In contrast, parliamentary rules that deny material benefits to parliamentary groupings that did not feature in elections may be self-policing.
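The adjustment arithmetic in this section follows from the fourth-root relationship between the seat product MS and the number of seat-winning parties. Here is a minimal sketch of that arithmetic. It assumes that assembly size and all other factors stay fixed while district magnitude is cut, and that the observed deviation from the model scales with the prediction. The function names are illustrative.

```python
def seat_winning_parties(magnitude: float, assembly_size: float) -> float:
    """Model expectation N0 = (M * S) ** (1/4)."""
    return (magnitude * assembly_size) ** 0.25


def parties_after_cut(current_parties: float, cut_factor: float) -> float:
    """Expected party count after dividing district magnitude by cut_factor.

    Assumes assembly size is unchanged, so N0 scales as cut_factor ** (-1/4).
    """
    if cut_factor <= 0:
        raise ValueError("cut_factor must be positive")
    return current_parties * cut_factor ** -0.25


if __name__ == "__main__":
    # Starting point from the text: 8 parties observed, 4 desired.
    print(parties_after_cut(8, 16))  # 16-fold cut in M halves the count
    print(parties_after_cut(8, 8))   # conservative 8-fold cut, somewhat above 4
```

The conservative 8-fold cut leaves the expected count somewhat above the target of 4, which leaves room for a further small adjustment after several elections rather than an overshoot.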
Simple Electoral and Party Systems
The predictions about party systems in this book have often sounded as if ‘a party is a party is a party’. Parties have been treated as beings of the same kind, differing only in the number of votes and seats they command. This is of course far from reality. Parties differ widely in their internal structure and cohesion, among other features, and electoral systems interact with these. Closed-list PR tends to reinforce central party leaderships, while open-list PR and STV enable the individual candidates to buck the leaders. Intraparty election or selection rules for leaders and for candidates in general elections are another aspect of electoral systems this book has not dealt with. Even for national elections, after briefly describing the variety of electoral systems, I have effectively reduced the range to ‘simple electoral systems’, meaning closed-list PR and FPTP as its limiting case.
So what is my excuse for treating parties like Democritean atoms, indivisible and lacking internal structure, and all electoral systems as simple? This question may be raised by those who argue that everything is so ‘richly’ interrelated with everything else that, if one cannot investigate everything at once, one is not entitled to investigate anything at all. Such a claim would restrict us to holistic approaches such as religion or art on the one hand, or to a grand linear regression equation on the other, where all the ingredients are thrown into the same pot on an equal basis. I have encountered such demands. But this is not how science proceeds. Science proceeds in stages, trying to go from the more general to the more detailed, yet not starting with generalities so broad and vague that connections remain vacuous. Call it the middle range theory approach, if you will. I am well aware that most actual electoral systems are not simple at all, that they cannot be reduced to assembly size, district magnitude, and seat allocation formula. Similarly, parties are not simple entities. But I have put this awareness and related factual knowledge on temporary hold, for the following reason. If we cannot decipher the relationships among institutions, seats, votes, and parties even in simple systems, how could we expect to do so in more complex ones? The scientific approach is to solve problems first in simple settings, and then gradually expand into complex ones. Thus, most of heat transfer is three-dimensional, but textbooks start with transfer in ideally one-dimensional rods and then proceed to ideally two-dimensional flat plates. This was also the order in which theory was worked out in the first place, although all empirical experience is bound to be three-dimensional. The approach is similar for elections and parties, except for one aspect.
I could define an ideally simple electoral system, but how does one define an ideally simple, generic party, to which the models presented here would apply foremost? If anything, it would be an ideally centralized party, a dictatorially run monolith. Correspondingly, the ideal electoral system would offer closed-list PR (or FPTP), where voters are forced to choose between monolithic parties, not individual candidates. If we cannot make predictive sense of the elections–parties interface for such a simple setup, then how could we expect to make sense of more complex setups? These simplifying assumptions are not normative. As founding chair of a political party in Estonia, I struggled hard for intraparty democracy and against the relentless hand of Michels’ iron law of oligarchy (Taagepera 2006) that pushed toward central control. Also, although I favor keeping electoral systems simple, I instinctively like personalized PR and STV, despite the resulting complexities for analysis. But let us face
it: Monolithic parties and closed-list PR (and FPTP) are easier to work into logical models, compared to situations where intraparty democracy and voters’ choice of individual candidates blur the simple picture. Putting such details on hold does not mean ignoring or disliking them. They are just relegated to the next level of analysis. There is one caveat. Such an ordering of priorities presumes that we have sensed correctly what comes first and is indispensable. It also presumes that what comes next does not produce random noise (from the viewpoint of the presumed basic model) so huge as to drown out the predictions of first-order models. This is hard to obtain under nonlaboratory conditions. For me, the major surprise regarding the models presented in this book is that they actually work as well as they do. Given the dearth of truly simple electoral systems, all these models have been tested with systems that present marked complexities—and yet these models hold, as averages of many electoral systems. They seem to express, indeed, the central features of the actual systems, even when there is no inherent reason why ‘ignorance-based’ models should hold upon further input of information. It is easy to poke holes into the generalizations presented, by pointing out deviant cases, but such an attitude does not advance the orderly quest for more structured understanding, if we want to go beyond encyclopedic compilation and cataloguing of factual knowledge. Yes, Switzerland diverges drastically from the prediction of the inverse square law of cabinet duration. So what? Feathers in the wind do not disprove gravity. A fairly narrow zone, with slope 2 on log–log graph, still fits all those systems where cabinets depend on parliamentary confidence. Switzerland shows that we do not yet understand everything about mean cabinet duration. It does not undo the fact that we already do understand something about why cabinets last as long as they do.
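The inverse square law of cabinet duration mentioned above can be sketched in a few lines. The sketch assumes the generic form C = k / N², where N is the effective number of parties and k is a constant in years. The value of k is left as a parameter because this section does not state it. On a log–log graph this form gives a straight line whose slope has magnitude 2; the sign is negative because duration falls as the number of parties rises.

```python
import math


def cabinet_duration(effective_parties: float, k_years: float) -> float:
    """Mean cabinet duration under an inverse square law, C = k / N**2."""
    if effective_parties <= 0:
        raise ValueError("effective_parties must be positive")
    return k_years / effective_parties ** 2


def loglog_slope(n_a: float, n_b: float, k_years: float = 1.0) -> float:
    """Slope of log C against log N between two party counts."""
    c_a = cabinet_duration(n_a, k_years)
    c_b = cabinet_duration(n_b, k_years)
    return (math.log(c_b) - math.log(c_a)) / (math.log(n_b) - math.log(n_a))


if __name__ == "__main__":
    # Doubling the effective number of parties quarters the expected duration,
    # whatever the (here arbitrary) constant k is.
    k = 10.0  # placeholder value, not an empirical estimate
    print(cabinet_duration(4, k) / cabinet_duration(2, k))
    print(loglog_slope(2, 4))
```

A deviant case such as Switzerland would show up as a point far from this line without changing the slope fitted to the other systems.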
Going Beyond the Simple Electoral Systems
Some aspects of simple electoral systems still need extensive testing. This applies to institutional inputs to vote shares and deviation from PR (Chapter 14) and to the varied impacts of population size on politics (Chapter 12). The conclusion ‘This model has not yet been fully tested’ also pops up in various other sections throughout the book. Why did I not carry out such testing before publishing the book? It would have taken many more years, the more so because success in thoroughly investigating
one aspect would bring up further questions. It is more efficient to publish the many existing findings now and point out the unfinished or new issues as challenges to anyone interested. The same applies to the weak theoretical links in well-tested models such as the third assumption for the minority enhancement equation (Chapter 16) or the embryonic model for the number of pertinent electoral parties (Chapter 15). I now turn to aspects of the electoral–party system nexus where the simple models presented here do not suffice, yet can help cast further light on more complex situations. Although the Duvergerian macro-agenda is not completed, it has been investigated to the point where meaningful spinoff has become possible toward systematic investigation of more complex electoral systems—‘second-order’ rules such as closed versus open lists, and intraparty effects of electoral rules. Much work in this direction has already been carried out. It is to be hoped that the mutually interlocking models presented in this book help advance such understanding. Any advances in the macro dimension also present new challenges to the micro dimension of the Duvergerian framework. An analogous list could be presented for second-order effects of party structures. Even if the electoral system were ideally simple, it could be expected to have a simple effect of the kind modeled here only on simple party systems, where all parties are monolithic. Here, I will limit myself to the broader agenda for the study of electoral systems. The following survey owes much to the recent overview of electoral systems by Matthew Shugart (2006). It proceeds from more general considerations toward some details of nationwide legislative elections, and then touches on some other levels. Among the many effects of such factors and their interactions, I focus on how they might affect the number and size distribution of parties.
The Micro Dimension of Duverger
In physics, macroscopic laws of thermodynamics, such as the ideal gas law, were developed first, by induction and extrapolation from macroscopic observation. These laws were useful in practice. Only much later did statistical mechanics ground them in the microscopic movement of particles. The relationships presented in this book are macroscopic. A major part of the micro-Duvergerian agenda would be to supply individual-level foundations for the system-level relationships observed in this book.
In principle, statistical mechanics could have developed from scratch, absent nineteenth-century thermodynamics. In reality, the macroscopic laws offered incentives, because they needed deeper explanation, and also a reality check on whether the micro-level calculations went in the right direction. The relationships presented here could play a similar role. As observed in the appendix to Chapter 8, geometric averages of logical limits need not work at all, but they often do, as worldwide averages. It is up to micro-Duvergerians to find out why they work, and this is of course also a path toward explaining deviations from these averages. The processes that lead to the Duvergerian average patterns of distribution of seats need to be made more explicit. The mechanical and psychological effects are entangled, and Benoit (2002) has argued that the strength of the mechanical effect has often been overstated due to ‘prefiltering’ by psychological considerations. The psychological effect itself risks being a catch-all term for strategic choices of varied types, by individual actors in individual elections. Actors include not only voters but also party leaders and campaign contributors. Cox (1997) achieved a major advance with his aforementioned notion of ‘strategic coordination’, which may or may not materialize so as to offer an optimal number of candidates or lists. From his testing of the aforementioned ‘M + 1’ rule at low district magnitudes, Cox concludes that the quality of voter information decreases with district magnitude. Blais (2000) has investigated the limits of rational choice approaches to the decision to vote or not to vote. I will not discuss here the still broader agenda of strategic considerations by candidates, voters, and parties which, in turn, depend on the ideological distribution of parties.
Political Culture
For given electoral system characteristics, political culture certainly should affect the outputs, including the number and size of parties. Political culture, however, is a broad term covering various aspects which are hard to operationalize. Furthermore, political culture is a major factor in determining the type of government and electoral system a country chooses in the first place. Consensual cultures are more likely to choose PR than winner-take-all cultures (Lijphart 1999). Thus, various aspects of political culture can act on the number and size of parties directly, by modifying the impact of the electoral system, or indirectly, through the electoral system itself.
This double impact was pointed out for mean cabinet duration (Chapter 10). With the same number of parties, consensual polities might be expected to have longer lasting cabinets than the majoritarian. But consensual polities are also more likely to choose PR, which leads to more parties. The two effects seem to cancel each other out. This is not a reason for giving up, but more refined work is needed both on measuring political culture and on connecting it to various aspects of political outputs through predictive models.
Ethnic and Geographical Determinants of the Size of Parties
In addition to institutions, the number of politicized social cleavages or ‘issue dimensions’ also affects the number and size of parties. In the absence of distinct issues, parties will not form even if the electoral systems offer few restraints; see Vatter (2003) for a recent test. However, impressionistic estimates of the number of issue dimensions risk becoming tautological, as they are affected by the known number of parties. To counteract this risk, Ordeshook and Shvetsova (1994) introduced ethnic heterogeneity as a measurable proxy for issues. As it overlooks nonethnic cleavages, it may underestimate the number of issues, but it represents an advance toward objective measurement. The interaction of cleavages and district magnitude has been confirmed by Ordeshook and Shvetsova (1994), Amorim Neto and Cox (1997), Cox (1997, 208–21), and Geys (2006). They find that interaction is multiplicative, while Lago Penas (2004) reports somewhat higher correlation for Spain when adding the two factors. Once one agrees on which ethnic or other interest groups are distinct, their effective number can be measured. Yet ethnic heterogeneity may not increase party system fragmentation when parties are not structured along ethnic lines, or when various minority groups form a single party (Madrid 2005). The location of these groups also matters. A group uniformly dispersed across the country may contribute less to heterogeneity than does a group of equal size concentrated in a border area where it forms the majority (Mozaffar, Scarritt, and Galaich 2003). More generally, geographical location of support for different parties interacts with the effect of electoral systems in determining their strength in the assembly (Gudgin and Taylor 1979; Johnston 1981; Eagles 1995; Park 2003), along with turnout differences and malapportionment (Grofman, Koetzle, and
Brunell 1997). Maybe we would need a simple index of geographical concentration to express the location of minorities.
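The effective number of groups referred to in this section can be computed once group shares are agreed upon. A minimal sketch follows, assuming the standard inverse-sum-of-squares (Laakso–Taagepera) form that is used for the effective number of parties. The group sizes in the example are hypothetical.

```python
def effective_number(sizes):
    """Effective number of components: 1 / sum of squared fractional shares.

    `sizes` may be raw counts or shares; they are normalized to sum to 1.
    """
    total = sum(sizes)
    if total <= 0:
        raise ValueError("sizes must sum to a positive number")
    shares = [s / total for s in sizes]
    return 1.0 / sum(share ** 2 for share in shares)


if __name__ == "__main__":
    # Hypothetical society with three groups of 50%, 30%, and 20%.
    print(round(effective_number([50, 30, 20]), 2))
    # Four equal groups give an effective number of exactly four.
    print(effective_number([1, 1, 1, 1]))
```

The index counts groups weighted by size, so a small minority adds much less than one full unit. It says nothing about where the groups live, which is why a separate index of geographical concentration would still be needed.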
Party Financing
The way parties are financed varies; see the special issue of Party Politics on party finance, edited by Fisher and Eisenstadt (2004). The state increasingly finances parties, and the way it is done can affect the number and nature of parties (Burnell and Ware 1998; van Biezen 2003). If funds are allocated by seats won, it freezes out small parties underpaid in terms of seats. If, on the contrary, funds are allocated by votes obtained, then tiny parties can survive even in FPTP districts, increasing the effective number of electoral parties and possibly that of the legislative parties, indirectly. Hence, ‘Rules on party finance should be integrated more fully in the future study of the results of electoral reforms’ (Hooghe, Maddens, and Noppe 2006). They could modify the mean outcomes based on the seat product alone. The quantitative impact of such refinements should be worked into the predictive model.
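The contrast between seat-based and vote-based public funding can be made concrete with a small calculation. The sketch below assumes a purely proportional split of a fixed fund. The party names, vote and seat figures, and fund size are all hypothetical.

```python
def allocate_funds(total_fund, weights):
    """Split total_fund among parties in proportion to their weights.

    `weights` maps party name to seats won or to votes obtained.
    """
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive number")
    return {party: total_fund * w / weight_sum for party, w in weights.items()}


# Hypothetical FPTP outcome: the small party has 15% of votes but 1% of seats.
votes = {"Large": 45, "Medium": 40, "Small": 15}
seats = {"Large": 55, "Medium": 44, "Small": 1}

by_seats = allocate_funds(1_000_000, seats)
by_votes = allocate_funds(1_000_000, votes)

if __name__ == "__main__":
    for party in votes:
        print(party, round(by_seats[party]), round(by_votes[party]))
```

With these made-up numbers the small party receives fifteen times more under vote-based allocation, which illustrates how financing rules can keep parties alive that the seat allocation alone would squeeze out.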
Two-Tier PR
As one extends the study beyond the simple electoral systems, systems with two (or more) tiers command attention in view of their widening spread. Recall that two-tier PR systems come in two forms: parallel and compensatory. The outcomes can be quite different, yet the two are often confused. Take the example where voters cast votes in 100 FPTP districts and also in a 100-seat nationwide district. With parallel rules, the FPTP seats may go to two major parties, while all parties win their proportional share in the nationwide tier. In total, third parties win about half of their proportional due in seats, while the two top parties are overpaid accordingly. With compensatory rules, in contrast, nationwide proportionality is restored (usually subject to a legal threshold of votes), which means that the major parties lose whatever advantage they obtained in the FPTP districts. Elklit and Roberts (1996) have stressed this ‘two-tier compensatory member’ electoral rule as a separate category, more often called MMP. The volume edited by Shugart and Wattenberg (2001) updates our knowledge
about the particularities of this approach that avoids malapportionment problems, yet preserves the benefits of local representation. Several countries have recently adopted two-tier PR, either as parallel allocation (e.g. Italy and Japan) or MMP (e.g. New Zealand and Scotland), offering political scientists equivalents of crucial experiments among and within countries. In New Zealand, the shift from FPTP to MMP has reduced disproportionality, as expected (Gallagher 1998), but may not have reduced the adversarial nature of politics characteristic of FPTP (Barker and McLeavy 2000). Note that reduction in disproportionality results directly from a softened mechanical effect, which is instantaneous, while political style is a cultural aspect that may need more time to set in. In Japan, the shift from SNTV to FPTP and PR in parallel arguably has made the system more disproportional (Gallagher 1998), and the dominant Liberal Democratic Party has maintained its grip. Italy’s shift from List PR to FPTP and PR in parallel highlights a little noted aspect of Duvergerian effects in FPTP districts: They tend to favor formation of two major blocks, but those blocks do not have to be unified parties. In Italy, parties form two blocks to present candidates in the single-seat districts, while maintaining their separate identities thanks to the nationwide part of elections (Katz 1996). Thus Duverger’s law is observed to work in Italy at the district level (Reed 2001), while the nationwide landscape remains almost as fractured as it was under List PR. How would two tiers affect the number and size distribution of parties? Colomer’s micro-mega rule suggests that more entry points with different rules would help small parties, but it depends on specific electoral rules at both levels (Cox and Schoppa 2002). Moreover, it still remains debatable how to determine an input-based effective magnitude (cf. 
Chapter 11) in the face of two separate or interacting tiers, so as to establish a basis for comparison. Frequent addition of legal thresholds in some tiers complicates the issue further. In Italy, interaction between the two tiers works out in the PR tier in such a way that an increase in district magnitude actually tends to reduce the effective number of parties (Ferrara 2004). This is so because the voters are motivated strategically to desert the strong party they have voted for in the FPTP district. Still, a 15-country study, which includes Italy (Moser and Scheiner 2004), finds no contamination between the two tiers. Nishikawa and Herron (2004) find that the overall effective number of legislative parties tends to fall in between pure FPTP and PR systems.
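The difference between parallel and compensatory rules in the 100 + 100 seat example of this section can be worked through explicitly. The sketch below makes several assumptions that are not in the text: the vote totals, an even split of the FPTP seats between the two major parties, largest-remainder allocation in the nationwide tier, and no legal threshold or overhang seats.

```python
def largest_remainder(votes, seats):
    """Allocate seats proportionally by simple quota and largest remainders."""
    total = sum(votes.values())
    quotas = {p: seats * v / total for p, v in votes.items()}
    alloc = {p: int(q) for p, q in quotas.items()}
    leftover = seats - sum(alloc.values())
    ranked = sorted(quotas, key=lambda p: quotas[p] - alloc[p], reverse=True)
    for party in ranked[:leftover]:
        alloc[party] += 1
    return alloc


def parallel_totals(votes, district_seats, list_seats):
    """Parallel rule: list tier is allocated independently and simply added."""
    tier = largest_remainder(votes, list_seats)
    return {p: district_seats.get(p, 0) + tier[p] for p in votes}


def compensatory_totals(votes, district_seats, list_seats):
    """Compensatory rule: list seats top parties up to their overall share.

    Simplification: overhang (district seats above the target) is not handled.
    """
    house = sum(district_seats.values()) + list_seats
    target = largest_remainder(votes, house)
    top_up = {p: max(0, target[p] - district_seats.get(p, 0)) for p in votes}
    return {p: district_seats.get(p, 0) + top_up[p] for p in votes}


# Hypothetical inputs: party C has 20% of votes but wins no FPTP district.
votes = {"A": 400_000, "B": 400_000, "C": 200_000}
districts = {"A": 50, "B": 50, "C": 0}

if __name__ == "__main__":
    print(parallel_totals(votes, districts, 100))
    print(compensatory_totals(votes, districts, 100))
```

Under the parallel rule party C ends up with 20 of 200 seats, half of its proportional due, while the compensatory rule restores its 40 seats at the expense of the two large parties.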
Preferential-List PR
It matters more than one may think whether voters vote for parties or for individual candidates (Grofman 1999). Shugart (2006) observes that the literature implicitly has equated PR with closed lists (which are also part of my definition of a simple electoral system), even while preferential (open) lists may be more prevalent in practice. In fact, preferential lists can be used even in the FPTP framework, as Uruguay has done for presidential elections (Shugart 2006). Several candidates, possibly belonging to separate but allied parties, form open lists, where voters vote for a specific candidate. The single seat goes to the list that achieves plurality and, within the list, to the candidate with the most votes. It combines, so to speak, primaries with general elections. As usually practiced, FPTP amounts to closed-list PR applied in single-seat districts, but it is more akin to SNTV in one respect: In both SNTV and standard FPTP, a party is penalized for presenting an excessive number of candidates. The study of preferential lists remains underdeveloped. They come in a bewildering number of subtly different forms, with possibly different consequences. The attempt at classification of various closed-list, preferential-list, quasi-list, and nonlist rules by Shugart (2006) offers a road map. The effect of preferential lists on the number and size distribution of parties remains to be investigated. Defection and new party formation is the only recourse for an ambitious dissenter who is ranked low in a closed list. By enabling independent-minded candidates to run and win, preferential lists may prevent such splits. Hence they may reduce the number of parties. On the other hand, preferential lists may loosen party discipline to the point where the meaning of party as a unit of analysis becomes questionable. This was certainly the effect of SNTV in Japan.
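The single-seat preferential-list rule described in this section (the seat goes to the plurality list and, within it, to the candidate with the most votes) can be sketched as a two-step count. The data structure, list names, and vote totals below are hypothetical, and ties are broken arbitrarily by `max`.

```python
def single_seat_open_list_winner(ballots):
    """Return (winning_list, winning_candidate) for a single-seat open-list race.

    `ballots` maps list name -> {candidate name -> votes for that candidate}.
    Step 1: pool candidate votes by list and pick the plurality list.
    Step 2: within that list, pick the candidate with the most votes.
    """
    list_totals = {name: sum(cands.values()) for name, cands in ballots.items()}
    winning_list = max(list_totals, key=list_totals.get)
    candidates = ballots[winning_list]
    winning_candidate = max(candidates, key=candidates.get)
    return winning_list, winning_candidate


# Hypothetical race: Cruz has the most personal votes of anyone, but list X
# pools more votes in total, so the seat goes to X's top candidate.
example = {
    "X": {"Alvarez": 30, "Benitez": 25},
    "Y": {"Cruz": 45},
}

if __name__ == "__main__":
    print(single_seat_open_list_winner(example))
```

The example shows why allied candidates are not penalized for running side by side under this rule, unlike in standard FPTP, where Alvarez and Benitez would split the vote and hand the seat to Cruz.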
Presidential and Prime Ministerial Elections
In form, presidential elections are akin to legislative elections in a single single-seat district, and the same alternatives offer themselves for choice of electoral rules (see Chapter 3). However, deviation from proportionality is higher, because there is no statistical evening out over many districts. At the same time, the stakes are higher than in a single parliamentary district, if the president has appreciable power. Therefore, some seat allocation
rules are used or proposed that would never be considered in assembly elections (Shugart and Taagepera 1994; Samuels and Shugart 2003). Two-Rounds elections are more frequent in presidential than in assembly elections, so as to give the winner a stronger mandate than mere plurality among many candidates. Indeed, among 91 countries with direct presidential elections, Blais, Massicotte, and Dobrzynska (1997) find a preponderance of majority runoffs (54 percent), followed by plurality (22 percent), and various other majority procedures (13 percent). However, when the eventual winner differs from the first round front runner, such an inversion can actually weaken democratic governability (Pérez-Liñán 2006). Presidential elections interact with assembly elections, especially if they precede the latter (Shugart and Carey 1992). Presidential coattails may boost a major party, thus reducing the effective number of parties and altering their size distribution. Some countries with symbolic heads of state have experimented or toyed with the idea of direct election of prime minister. These are largely untested grounds in practice and theory (see Diskin and Hazan 2002).
Subnational and Supranational Elections
Elections at subnational levels vary in the number of levels and in importance, which depend on the degree of formal federalization and actual decentralization of finances (Lijphart 1999). Information on subnational election rules and outcomes is harder to come by than for national elections. This is why they have been less studied, especially in a comparative way, but it is becoming an active field. If the federal or lower subunits have appreciable autonomy, then subnational elections could offer further entry points to small and new parties, unless these are blocked by electoral and party financing rules. Thus subnational elections may or may not increase the number of parties, compared to that in a fully unitary country. The only major supranational elections are those to the European Parliament (EP). While seat allocation to member states is centrally determined (and has been modeled in Chapter 16), election rules have been set by individual countries, and may differ from those used in national elections. The seat product is often appreciably lower than in national elections, because fewer seats are at stake, for the given country. Even so, Euroelections may still advantage smaller parties, because voters tend to consider these elections less important than the national ones. Hence some voters vote in Euroelections
for new protest parties with whom they would not want to take their chances in national elections. This may have an indirect effect on national elections, as these new or small parties achieve more visibility thanks to success in Euroelections.
The Intraparty Dimension
For given vote shares, electoral rules affect not only which parties win seats but also who gets those seats within the party; see the special issues of Party Politics on Party Democracy and Direct Democracy, edited by Scarrow (1999), and on Democratizing Candidate Selection, edited by Pennings and Hazan (2001). In List PR, parties may wish to appeal to various constituencies by including women and ethnic minorities, and some such candidates may win. In contrast, for the single candidate in standard FPTP, parties tend to prefer males of dominant ethnicity. Hence the percentage of women tends to be higher in assemblies elected by PR (Rule 1981). PR may also promote higher intraparty turnover (Darcy, Welch, and Clark 1994; Henig and Henig 2001). Matland and Taylor (1997) document a finer distinction: Even in multiseat closed-list PR, parties tend to place males at the top of the list when they expect to win only one seat. Preferential lists may enable women to win even when the party leadership does not expect them to win. Nationwide PR might be expected to even out representation, but this need not be so. Geographic representation is not uniform in Israel and the Netherlands, and underrepresented regions tend to elect relatively fewer women (Latner and McGann 2005). A candidate’s ability to win depends on party label and also on the ‘personal vote’ his or her own image can attract. Carey and Shugart (1995) reasoned that the incentive to cultivate a personal vote should increase with increasing district magnitude in open-list PR but decrease in closed-list PR. Indeed, the larger the district magnitude, the less incentive for personal activity closed lists can offer. The probability is low that personal activity by the nth ranked candidate on the list can increase the number of seats won by the party exactly from n − 1 to n.
In open-list PR (and also SNTV), personal activity can put a candidate ahead of fellow candidates, and the more so when more seats are at stake. The two contrary trends fuse at M = 1, where the single candidate is the party’s face in that district. Two indirect tests have confirmed this conjecture. As M increases, the frequency of initiating bills of a local character goes up for preferential-list
PR but down for closed list (Crisp et al. 2004). So does the probability that the candidate is born in the district and is experienced in elected office (Shugart, Valdini, and Suominen 2005). Further distinctions between candidate-centered politics and localism are pointed out by Grofman (2005). Further examples of the incidence of general election rules on intraparty politics are offered in Shugart (2006). Little is as yet known about them, because intraparty data are more voluminous and harder to come by than inter-party election data.
Conclusion: Are Electoral Systems a Rosetta Stone for Parts of Political Science?

Political science has been an intellectual field largely separate from politics. So was physics from civil engineering two centuries ago, and biology from medicine one century ago. Political science and politics may start to connect. It depends on how quickly political science complements postdictive statistical methods with predictive ones. In the study of electoral systems, we have made headway during the last half-century and during the last decade, even while we have to go beyond just ‘seats and votes’ (Powell 2006).

We already know something about electoral systems worth conveying to political practitioners. Our quantitatively predictive ability is largely restricted to the simplest electoral systems, where the seat product alone largely determines the number of access points for smaller parties. Hence the advice to practitioners is to keep electoral laws simple. In more complex systems, the number of access points is multiplied by ethnic and geographic variety, multilevel elections, and various second-order elections. Small party prospects may be affected by party financing rules and presidential elections.

As advances in sciences bring new answers, they also engender new questions. Hence the broader agenda for electoral studies that goes beyond the macro-Duvergerian. The study of micro-Duvergerian processes, complex electoral systems, and the intraparty impact of electoral rules is visibly at a stage where the territory is still being mapped and further intricacies are discovered. Here, predictive ability is spotty. In contrast, the macro-Duvergerian agenda that focuses on the simplest electoral systems has seen a breakthrough, since 1990, in quantitative prediction of the average impact of electoral systems on the distribution of seats among parties. Extension to the distribution of votes and prediction of
disproportionality is in progress. This breakthrough is based on logical quantitative models. To the extent that the theory of simple systems is completed, gradual extension to more and more complex systems can proceed.

Electoral systems are inextricably intertwined with party systems. The number and strength of parties is largely measured in terms of election results: votes and seats. It might be more meaningful to consider the cohesion of parties and their ability to get their way in negotiations, but these are harder to measure. Thus the effective number of parties, usually based on election figures, remains perhaps the most widely used single index in political science, despite its well-known shortcomings. It pops up whenever the party system is included as a possible factor in explaining or affecting any political phenomena. Such penetration of other subfields made Taagepera and Shugart (1989) ask whether electoral studies could offer some branches of political science the equivalent of what the Rosetta Stone did for the deciphering of hieroglyphs.

Compared to other political phenomena, electoral systems deal with fairly hard numbers: number of votes, seats, electoral districts, and so on. Thus these studies are especially amenable to methods used in more established scientific disciplines. . . . Votes might be to the quantitative development of political science what mass has been for physics and money for economics: a fairly measurable basic quantity. (Taagepera and Shugart 1989: 5)
In developed sciences, quantitative expressions interlock. The same quantities recur in various different equations. A constant measured in one context is used in a different one. These numerical values are stepping stones. In comparison, quantitative knowledge in political science has largely been fractured. The numerical values of coefficients found in a regression analysis are rarely used for further analysis. Such numerical values are end points, dead on arrival into printed pages. Figuratively, quantitative relations in physics are like railroads in Europe: they interlock. Those in political science are like many railroads in Africa: isolated tracks starting in port cities and ending in the hinterland. Simple electoral systems are an exception. Here the product of district magnitude and assembly size leads to the number of seat-winning parties, which leads to the largest seat share, which leads to the effective number of parties. Here we have the beginnings of an interlocking network of equations which, through the mean duration of cabinets, promises to extend beyond the realm of electoral and party systems.
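The interlocking chain described above can be sketched in a few lines of Python. This is only an illustration, not the author's own code: the exponents are the ones that appear in the residual definitions of Tables A.1 to A.3, the stepwise links are written as the power laws those exponents imply, and the function name and the two example systems are hypothetical.

```python
def predict_from_seat_product(magnitude, assembly_size):
    """Average expectations for a simple electoral system, chained step by step.

    Each quantity is computed from the previous one, so that the links
    interlock rather than each being fitted separately.
    """
    ms = magnitude * assembly_size      # seat product MS
    n0 = ms ** 0.25                     # seat-winning parties: N0 = (MS)^(1/4)
    s1 = n0 ** -0.5                     # largest seat share:   s1 = (MS)^(-1/8)
    n_eff = s1 ** (-4.0 / 3.0)          # effective number:     N  = (MS)^(1/6)
    c_years = 42.0 / n_eff ** 2         # cabinet duration:     C  = 42 yrs / (MS)^(1/3)
    return {"MS": ms, "N0": n0, "s1": s1, "N": n_eff, "C_years": c_years}


if __name__ == "__main__":
    # Hypothetical systems, chosen only to show the direction of change.
    for label, m, s in [("FPTP, S = 100", 1, 100), ("List PR, M = 10, S = 200", 10, 200)]:
        p = predict_from_seat_product(m, s)
        print(f"{label}: N0={p['N0']:.2f}  s1={p['s1']:.2f}  "
              f"N={p['N']:.2f}  C={p['C_years']:.1f} yrs")
```

These are expectations for the average of many systems with the same seat product; any single country may deviate, which is what the residuals in the appendix record.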
In addition to such ‘colonization’ potential, the success of quantitatively predictive logical models in electoral studies offers an inducement to other subfields of political studies to supplement their methodological approaches. There is more to the quantitative study of politics than just regression and factor analysis on the one side and rational choice on the other. Some other sciences have been served well by thought experiments based on the notions of boundary conditions and extreme cases, continuity of change between those limits, and elimination of logical inconsistencies. Such notions can have their uses in political studies too. They will not open all doors, but this is not needed. It suffices if they open some.
APPENDIX
DETECTING FACTORS OTHER THAN THE SEAT PRODUCT
This appendix offers data in a form where the effect of the seat product MS has been removed (‘controlled for’) so that detection of other factors may become easier. The predictive models based on seat product account for a fair part of the observed variation in the number of seat-winning parties, the largest seat share, the effective number of parties, and mean duration of cabinets. The pattern is the following. The part of variation accounted for is 51 percent both for the largest seat share (Figure 8.4) and for the effective number (Figure 9.3). This means that all factors not correlated with the seat product account together for less than the seat product does alone. Hence, seat product clearly is the most important single factor. Other factors include features of electoral systems apart from M and S, such as the ones mentioned in Chapter 7, other institutions, cultural features, and path-dependent historical developments. For mean cabinet duration, the seat product accounts for only 24 percent of the variation (Figure 10.2). It may thus look as if some other factor may account for more than does the seat product. However, the input of other factors must largely come through the effective number of parties, which single-handedly accounts for 77 percent of the variation in mean cabinet duration (Figure 10.1). Thus only 23 percent of the total variation is left for all other factors independent of N. While assembly size is heavily determined by country population, the determinants of district magnitude are wide open. It is conceivable that some as yet undetected factors largely determine M and, quite separately, also determine the mean duration of cabinets. Even more broadly, all too many variables are interdependent rather than one-directionally dependent. Sophisticated statistical procedures can point out all sorts of colinearities and covariations among variables—but causal linkages are another matter. 
So I have followed a different path, a more naive one, if you will, but one that has served well in the advances of physics—advances that eventually enabled us to construct computers, so that even people with little mathematical sophistication
Table A.1. Residuals of the number of seat-winning parties (N0)

| Country, period and no. of elections | Seat product (MS) | (MS)^(1/4) | Actual N0 | Residual, R0 = N0/(MS)^(1/4) |
|---|---|---|---|---|
| M = 1 | | | | |
| Germany 1871–1912, 13 | 396 | 4.5 | 13.6 | 3.02 |
| Netherlands 1888–1913, 8 | 100 | 3.2 | 6.5 | 2.03 |
| Norway 1906–18, 5 | 124 | 3.3 | 5.7 | 1.73 |
| France 1958–81, 7 | 470 | 4.7 | 6.7 | 1.43 |
| Denmark 1901–18, 7 | 118 | 3.3 | 4.7 | 1.42 |
| UK 1922–87, 19 | 628 | 5.0 | 6.4 | 1.28 |
| Australia 1901–17, 7 | 75 | 2.9 | 3.4 | 1.17 |
| New Zealand 1890–1987, 32 | 81 | 3.0 | 3.5 | 1.17 |
| Australia 1919–87, 28 | 106 | 3.2 | 3.7 | 1.16 |
| Canada 1878–1988, 31 | 247 | 4.0 | 4.4 | 1.10 |
| Italy 1895–1913, 6 | 508 | 4.8 | 5.1 | 1.06 |
| Norway 1882–1903, 8 | 114 | 3.3 | 2.9 | 0.88 |
| USA 1828–82, 28 | 240 | 3.9 | 3.0 | 0.77 |
| USA 1884–1936, 27 | 396 | 4.5 | 3.3 | 0.73 |
| Sweden 1887–1905, 8 | 226 | 3.9 | 2.8 | 0.72 |
| USA 1938–88, 26 | 435 | 4.6 | 2.5 | 0.54 |
| M > 1 | | | | |
| Spain 1977–86, 4 | 2,345 | 7.0 | 12.8 | 1.83 |
| Ireland 1922–89, 24 | 525 | 4.8 | 8.2 | 1.71 |
| Switzerland 1919–87, 19 | 1,521 | 6.3 | 10.5 | 1.67 |
| Japan 1928–86, 22 | 1,920 | 6.6 | 10.0 | 1.52 |
| Norway 1921–49, 8 | 1,125 | 5.8 | 6.5 | 1.12 |
| Luxembourg/2 1922–51, 7 | 359 | 4.3 | 4.7 | 1.09 |
| Norway 1953–85, 9 | 2,707 | 5.9 | 6.3 | 1.07 |
| Luxembourg 1919–89, 11 | 751 | 5.2 | 5.5 | 1.06 |
| Portugal 1975–87, 7 | 2,814 | 7.3 | 6.9 | 0.95 |
| Malta 1921–45, 6 | 99 | 3.2 | 3.0 | 0.94 |
| Finland 1907–87, 30 | 2,800 | 7.2 | 6.8 | 0.94 |
| Sweden 1952–68, 6 | 2,059 | 6.7 | 5.7 | 0.85 |
| Malta 1947–87, 11 | 250 | 4.0 | 3.2 | 0.80 |
| Sweden 1908–48, 14 | 1,886 | 6.6 | 5.2 | 0.79 |
Note: Calculated from data in Taagepera (2002b), as graphed in Figure 8.1. ‘Luxembourg/2’ indicates elections carried out in one-half of the country.
on their own part can instigate sophisticated statistical analyses. Once this machinery is available, should we abandon the simpler approaches? I do write this book on a computer, but when conceptual thinking becomes tense, I grab for a pencil. It makes sense to use the most appropriate technology for the given purpose, not the most advanced one in a technological sense. This approach has led to predictive models that connect the seat product, the number of parties, and the mean cabinet duration. Are these connections really causal, so that a change in district magnitude truly would lead to a change in cabinet duration? We are reminded that maybe the best measure of overall technical development of a country is the per capita number of telephones, but it would be risky to put all national resources into buying telephone sets and hope
that the rest would follow. I think predictions based on seat product are on safer causal grounds, but one has to maintain a healthy dose of skepticism. This dose is between blind acceptance and blind rejection.

Some of my colleagues may wish to follow up on this simple but time-honored approach of addressing only one or a few variables at a time, while carefully thinking through how (i.e. in what functional form) these variables might logically impinge on the number and size distribution of parties and other features that may derive from it. They may discover connections to which I may be blind. For this purpose, the following Tables A.1–A.3 are offered. They show the residuals, that is, what is left to be accounted for when the expected impact of the seat product has been removed (‘controlled for’).

Table A.1 shows such residuals for the number of seat-winning parties. The countries are listed in the order of decreasing residuals, which range from 3.0 to 0.54 for single-seat systems and from 1.83 to 0.79 for multi-seat systems. A residual value of 1.00 means that the prediction by MS fits exactly. A residual of 2 means that the actual number of seat-winning parties is twice the expected number, while a residual of 0.5 means that the actual number is one-half of the expected. Only two countries (Germany 1871–1912 and the Netherlands 1888–1913) fall outside this range of 0.5 to 2. This limited range makes detection of causal (or at least correlated) factors so much more difficult. As mentioned in Chapter 8, operational measurement of the number of seat-winning parties presents difficulties, and hence a large part of the residual may be measurement error, which makes discovering further causal factors even harder. I offer these data nonetheless, just in case.

Table A.2 shows analogous residuals for the largest seat share. The countries are listed in the order of decreasing residuals, which range from 1.7 to 0.58 for single-seat systems and from 1.30 to 0.67 for multi-seat systems.
Here measurement error is much smaller, which makes it more promising grounds for detecting further causal factors. On the other hand, the range of the residuals is even narrower than in the previous table.

Table A.3 shows the residuals for the largest seat share, the effective number of parties, and the mean cabinet duration, all for the same data-set. (For the largest seat share, overlap with the previous table is appreciable.) The countries are listed in the order of decreasing residuals for the effective numbers, which range from 2.7 to 0.72. The residuals for mean cabinet duration have a much wider range, from 3.1 down to 0.20, which should make detection of further factors easier. However, the high correlation with the effective number must be kept in mind. Here in particular, please note that the residuals refer to the theoretically expected relationship, not the empirical best fit. In the case of C versus MS, R² is 0.30 for the empirical fit but only 0.24 for the predictive model. This 24 percent is the part of variation that is not only accounted for in a statistical sense but also explained in a more substantive sense. Hence, this is the part other factors are to complement.
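The distinction between the empirical best fit and the predictive model can be made concrete with a short sketch. It assumes, purely for illustration, that R² is computed on logarithms of C and MS as one minus the ratio of residual to total sum of squares; the book's own fitting choices may differ, so the sketch need not reproduce the 0.30 and 0.24 quoted above. The data pairs are copied from Table A.3, with Botswana's lower bound of 39.6+ years entered as 39.6.

```python
import math

# (seat product MS, mean cabinet duration C in years), from Table A.3.
DATA = [
    (108, 1.65), (542, 2.4), (68, 2.1), (508, 3.1), (26, 9.5), (36, 10.0),
    (128, 9.9), (85, 6.3), (270, 8.0), (42, 14.9), (435, 7.7), (55, 9.2),
    (37, 39.6), (635, 8.6),
    (2940, 1.5), (809, 6.0), (1940, 3.9), (1190, 4.3), (538, 3.8),
    (14400, 1.75), (19600, 3.3), (2810, 3.2), (426, 4.9), (294, 10.6),
    (2330, 9.0),
]


def r_squared(observed, predicted):
    """1 - (residual sum of squares) / (total sum of squares)."""
    mean = sum(observed) / len(observed)
    ss_total = sum((y - mean) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_resid / ss_total


def compare_fits(data):
    x = [math.log10(ms) for ms, _ in data]
    y = [math.log10(c) for _, c in data]

    # Predictive model: no fitted parameters, log C = log 42 - (1/3) log MS.
    model = [math.log10(42.0) - xi / 3.0 for xi in x]

    # Empirical best fit: ordinary least squares on the same logged data.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    fitted = [intercept + slope * xi for xi in x]

    return {"r2_model": r_squared(y, model), "r2_fit": r_squared(y, fitted),
            "slope": slope, "intercept": intercept}


if __name__ == "__main__":
    for name, value in compare_fits(DATA).items():
        print(f"{name}: {value:.3f}")
```

By construction the least-squares line can never do worse than a line fixed in advance, so the gap between the two R² values measures what is gained only by letting the data choose the coefficients.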
Table A.2. Residuals of the largest seat shares (s1)

| Country, period and no. of elections | Seat product (MS) | 1/(MS)^(1/8) | Actual s1 | Residual, R1 = s1 (MS)^(1/8) |
|---|---|---|---|---|
| M = 1 | | | | |
| Italy 1895–1913, 6, TR | 508 | 0.46 | 0.78 | 1.70 |
| Botswana 1965–94, 7 | 33 | 0.65 | 0.83 | 1.29 |
| Antigua 1980–89, 3 | 17 | 0.70 | 0.86 | 1.22 |
| United States 1828–1994, 84 | 344 | 0.48 | 0.59 | 1.22 |
| United Kingdom 1885–1992, 29 | 643 | 0.45 | 0.53 | 1.18 |
| Bahamas 1972–87, 4 | 42 | 0.63 | 0.73 | 1.16 |
| Canada 1878–1993, 32 | 247 | 0.50 | 0.58 | 1.14 |
| Trinidad 1961–91, 7 | 35 | 0.64 | 0.73 | 1.14 |
| St. Vincent 1974–89, 4 | 14 | 0.72 | 0.82 | 1.12 |
| Jamaica 1944–89, 11 | 47 | 0.62 | 0.65 | 1.12 |
| Mauritius 1976–95, 6 | 68 | 0.59 | 0.65 | 1.10 |
| Norway 1882–1903, 8 | 114 | 0.55 | 0.59 | 1.07 |
| Barbados 1966–91, 6 | 26 | 0.67 | 0.69 | 1.04 |
| Dominica 1975–90, 4 | 21 | 0.68 | 0.69 | 1.01 |
| Sweden 1887–1905, 8 | 226 | 0.51 | 0.51 | 1.01 |
| New Zealand 1890–1993, 34 | 81 | 0.58 | 0.58 | 1.00 |
| Belize 1979–89, 3 | 24 | 0.67 | 0.66 | 0.99 |
| Grenada 1972–90, 4 | 15 | 0.71 | 0.69 | 0.97 |
| France 1958–93, 10, TR | 496 | 0.46 | 0.44 | 0.95 |
| Norway 1906–18, 5, TR | 124 | 0.55 | 0.50 | 0.91 |
| Cook Islands 1965–99, 10 | 23 | 0.68 | 0.61 | 0.90 |
| St. Lucia 1974–92, 6 | 17 | 0.70 | 0.63 | 0.90 |
| Australia 1919–96, 31, AV | 106 | 0.56 | 0.50 | 0.89 |
| Samoa 1979–2001, 7 | 47 | 0.62 | 0.51 | 0.86 |
| Denmark 1901–18, 7 | 117 | 0.55 | 0.46 | 0.84 |
| Australia 1901–17, 7 | 75 | 0.58 | 0.48 | 0.83 |
| Cuba 1901–54, 23 | 64 | 0.59 | 0.48 | 0.80 |
| St. Kitts & Nevis 1980–89, 3 | 10 | 0.75 | 0.51 | 0.68 |
| The Netherlands 1888–1913, 8, TR | 100 | 0.56 | 0.34 | 0.61 |
| Germany 1871–1912, 13, TR | 396 | 0.47 | 0.27 | 0.58 |
| M > 1 | | | | |
| Spain 1977–96, 7 | 2,360 | 0.38 | 0.49 | 1.30 |
| Japan 1928–93, 24, SNTV | 1,930 | 0.39 | 0.50 | 1.28 |
| Portugal 1975–95, 9 | 2,770 | 0.38 | 0.42 | 1.14 |
| Sweden 1908–68, 20 | 1,600 | 0.40 | 0.45 | 1.13 |
| Ireland 1922–92, 25, STV | 567 | 0.45 | 0.47 | 1.04 |
| Norway 1921–93, 19 | 1,200 | 0.41 | 0.42 | 1.03 |
| Malta 1921–92, 18, STV | 180 | 0.52 | 0.52 | 0.98 |
| Luxembourg 1919–94, 19 | 504 | 0.46 | 0.43 | 0.94 |
| Finland 1907–95, 32 | 2,860 | 0.37 | 0.33 | 0.89 |
| Switzerland 1919–95, 21 | 1,540 | 0.40 | 0.27 | 0.67 |
Note: Calculated from data in Taagepera and Ensch (2006), as graphed in Figures 8.3 and 8.4. M = 1 systems are FPTP, unless otherwise indicated: TR = Two-Rounds; AV = alternate vote. M > 1 systems are List PR, unless otherwise indicated: STV = single transferable vote; SNTV = single nontransferable vote. District magnitudes remained the same during the periods shown and variations in assembly size were relatively minor, except for USA (213–437), Malta (from 10 for Government Council, 1939 and 1945, to 65) and Luxembourg (from 25 in partial elections to 64).
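As a check on how the residual column is formed, the following sketch recomputes R1 = s1 (MS)^(1/8) for a handful of rows copied from Table A.2 and compares the result with the published value. The function name and the choice of rows are illustrative assumptions; small differences are expected because the table works from rounded intermediate figures.

```python
def seat_share_residual(seat_product, largest_share):
    """R1 = s1 * (MS)^(1/8); 1.00 means the seat-product prediction fits exactly."""
    return largest_share * seat_product ** (1 / 8)


# (country and period, MS, actual s1, published R1), copied from Table A.2.
ROWS = [
    ("Italy 1895–1913", 508, 0.78, 1.70),
    ("United Kingdom 1885–1992", 643, 0.53, 1.18),
    ("New Zealand 1890–1993", 81, 0.58, 1.00),
    ("Germany 1871–1912", 396, 0.27, 0.58),
    ("Spain 1977–96", 2360, 0.49, 1.30),
    ("Switzerland 1919–95", 1540, 0.27, 0.67),
]

if __name__ == "__main__":
    for name, ms, s1, published in ROWS:
        r1 = seat_share_residual(ms, s1)
        # Above 1: the largest party is bigger than the seat product alone suggests.
        direction = "above" if r1 > 1 else "below"
        print(f"{name}: R1 = {r1:.2f} (published {published:.2f}), {direction} expectation")
```

A residual above 1 flags a largest party that is stronger than expected from MS alone, and a residual below 1 flags the opposite; these are the cases in which other factors are worth looking for.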
Table A.3. Residuals of the largest seat shares (s1), effective numbers of parties (N), and mean cabinet durations (C): R1 = s1 (MS)^(1/8), RN = N/(MS)^(1/6), RC = C (MS)^(1/3)/42 yrs.

| Country and period | Seat product (MS) | s1 | R1 | N | RN | C (years) | RC |
|---|---|---|---|---|---|---|---|
| M = 1 | | | | | | | |
| Papua-NG 1977–97 | 108 | 0.40 | 0.71 | 5.98 | 2.74 | 1.65 | 0.19 |
| India 1977–96 | 542 | 0.55 | 1.21 | 4.11 | 1.44 | 2.4 | 0.47 |
| Mauritius 1976–97 | 68 | 0.62 | 1.06 | 2.71 | 1.34 | 2.1 | 0.20 |
| France 1959–2002, TR | 508 | 0.44 | 0.97 | 3.43 | 1.21 | 3.1 | 0.59 |
| Barbados 1966–94 | 26 | 0.70 | 1.05 | 1.76 | 1.02 | 9.5 | 0.67 |
| Trinidad 1961–2001 | 36 | 0.75 | 1.17 | 1.82 | 1.00 | 10.0 | 0.79 |
| Australia 1946–96, AV | 128 | 0.51 | 0.93 | 2.22 | 0.99 | 9.9 | 1.19 |
| New Zealand 1946–96 | 85 | 0.57 | 0.99 | 1.96 | 0.93 | 6.3 | 0.66 |
| Canada 1945–93 | 270 | 0.56 | 1.12 | 2.37 | 0.93 | 8.0 | 1.23 |
| Bahamas 1972–2002 | 42 | 0.73 | 1.17 | 1.68 | 0.90 | 14.9 | 1.23 |
| USA 1947–2000 | 435 | 0.62 | 1.32 | 2.40 | 0.87 | 7.7 | 1.39 |
| Jamaica 1962–89 | 55 | 0.76 | 1.25 | 1.62 | 0.83 | 9.2 | 0.83 |
| Botswana 1965–2004 | 37 | 0.75 | 1.18 | 1.35 | 0.74 | 39.6+ | 3.14+ |
| UK 1945–97 | 635 | 0.53 | 1.20 | 2.11 | 0.72 | 8.6 | 1.76 |
| M > 1 | | | | | | | |
| Finland 1945–2003 | 2,940 | 0.27 | 0.73 | 5.03 | 1.33 | 1.5 | 0.51 |
| Luxembourg 1945–99 | 809 | 0.41 | 0.95 | 3.36 | 1.10 | 6.0 | 1.33 |
| Japan 1946–96, SNTV | 1,940 | 0.54 | 1.39 | 3.71 | 1.05 | 3.9 | 1.16 |
| Norway 1945–97 | 1,190 | 0.47 | 1.13 | 3.35 | 1.03 | 4.3 | 1.08 |
| Ireland 1948–97, STV | 538 | 0.48 | 1.06 | 2.84 | 1.00 | 3.8 | 0.74 |
| Israel 1949–96 | 14,400 | 0.38 | 1.25 | 4.55 | 0.92 | 1.75 | 1.01 |
| The Netherlands 1946–2002 | 19,600 | 0.34 | 1.18 | 4.65 | 0.90 | 3.3 | 2.13 |
| Portugal 1976–2002 | 2,810 | 0.43 | 1.16 | 3.33 | 0.89 | 3.2 | 1.08 |
| Costa Rica 1953–98 | 426 | 0.52 | 1.12 | 2.41 | 0.88 | 4.9 | 0.88 |
| Malta 1966–87, STV | 294 | 0.53 | 1.08 | 1.99 | 0.77 | 10.6 | 1.68 |
| Spain 1977–2004 | 2,330 | 0.50 | 1.32 | 2.76 | 0.76 | 9.0 | 2.84 |
Note: Calculated from data in Taagepera and Sikk (2007), as graphed in Figures 9.4 and 10.2. M = 1 systems are FPTP, unless otherwise indicated: TR = Two-Rounds; AV = alternate vote. France had one PR election in 1986. M > 1 systems are List PR, unless otherwise indicated: STV = single transferable vote; SNTV = single non-transferable vote. District magnitudes remained the same during the periods shown and variations in assembly size were relatively minor. Countries are listed in the order of decreasing residual for N (which corresponds to locations in Figure 9.4).
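The caution about the high correlation between cabinet duration and the effective number of parties can be explored directly from the residual columns. The sketch below is one possible first step, not a procedure taken from the book: it correlates log RN with log RC for the rows of Table A.3 and reports the least-squares slope. If the expected link C = 42 yrs/N² (which follows from the two exponents in the table heading) also held country by country, that slope would be close to −2. Botswana's lower bound of 3.14+ is entered as 3.14.

```python
import math

# (RN, RC) pairs copied from Table A.3, M = 1 rows first, then M > 1.
RESIDUALS = [
    (2.74, 0.19), (1.44, 0.47), (1.34, 0.20), (1.21, 0.59), (1.02, 0.67),
    (1.00, 0.79), (0.99, 1.19), (0.93, 0.66), (0.93, 1.23), (0.90, 1.23),
    (0.87, 1.39), (0.83, 0.83), (0.74, 3.14), (0.72, 1.76),
    (1.33, 0.51), (1.10, 1.33), (1.05, 1.16), (1.03, 1.08), (1.00, 0.74),
    (0.92, 1.01), (0.90, 2.13), (0.89, 1.08), (0.88, 0.88), (0.77, 1.68),
    (0.76, 2.84),
]


def log_log_association(pairs):
    """Pearson correlation and least-squares slope of log RC on log RN."""
    xs = [math.log10(rn) for rn, _ in pairs]
    ys = [math.log10(rc) for _, rc in pairs]
    n = len(pairs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return {"r": sxy / math.sqrt(sxx * syy), "slope": sxy / sxx}


if __name__ == "__main__":
    result = log_log_association(RESIDUALS)
    print(f"correlation of log RC with log RN: {result['r']:.2f}")
    print(f"slope of log RC on log RN: {result['slope']:.2f}")
```

Whatever part of RC is not tied to RN in this way is the part where factors independent of the party system would have to be sought.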
References
Adams, James F., Merrill, Samuel, and Grofman, Bernard (2005). A Unified Theory of Party Competition. Cambridge: Cambridge University Press. Amorim Neto, Octavio and Cox, Gary W. (1997). ‘Electoral Institutions, Cleavage Structures, and the Number of Parties’, American Journal of Political Science, 41: 149–74. Anckar, Carsten (1997a). ‘Size and Democracy: Some Empirical Findings’, in Dag Anckar and L. Nilsson (eds.), Politics and Geography: Contributions to the Interface. Sundsvall: Mid-Sweden University Press. (1997b). ‘Determinants of Disproportionality and Wasted Votes’, Electoral Studies, 16: 501–15. (1998). Storlek och partisystem: En studie av 77 stater [Size and party system: A study of 77 states]. Åbo: Åbo Akademi University Press. (2000). ‘Size and Party System Fragmentation’, Party Politics, 6: 305–28. Anckar, Dag (1997). ‘Dominating Smallness: Big Parties in Lilliput Systems’, Party Politics, 3: 243–63. and Anckar, Carsten (1995). ‘Size, Insularity and Democracy’, Scandinavian Political Studies, 18: 211–29. Anderson, Christopher J. and Guillory, Christine A. (1997). ‘Political Institutions and Satisfaction with Democracy: A Cross-national Analysis of Consensus and Majoritarian Systems’, American Political Science Review, 91: 61–81. Andrews, Josephine and Jackman, Robert W. (2005). ‘Strategic Fools: Electoral Rule Choice under Extreme Uncertainty’, Electoral Studies, 24: 65–84. Balinski, Michael L. and Young, H. Peyton (2001). Fair Representation: Meeting the Ideal of One Man, One Vote. Washington, DC: Brookings Institution Press. Barker, Fiona and McLeavy, Elizabeth (2000). ‘How Much Change? An Analysis of the Initial Impact of Proportional Representation on the New Zealand Parliamentary Party System’, Party Politics, 6: 131–54. Benoit, Kenneth (2002). ‘The Endogeneity Problem in Electoral Studies: A Critical Re-examination of Duverger’s Mechanical Effect’, Electoral Studies, 21: 35–46. (2004). 
‘Models of Electoral System Change’, Electoral Studies, 23: 363–89. Birch, Sarah (2003). ‘Two Round Electoral Systems and Democracy’, Comparative Political Studies, 36: 319–44.
Birch, Sarah (2005). ‘Single-Member District Electoral Systems and Democratic Transition’, Electoral Studies, 24: 281–301. Blais, André (2000). To Vote or Not to Vote? The Merits and Limits of Rational Choice Theory. Pittsburgh, PA: University of Pittsburgh Press. Massicotte, Louis, and Dobrzynska, Agnieszka (1997). ‘Direct Presidential Elections: A World Summary’, Electoral Studies, 16: 441–55. Blau, Adrian (2004). ‘A Quadruple Whammy for First-Past-The-Post’, Electoral Studies, 23: 431–53. Blondel, Jean (1968). ‘Party Systems and Patterns of Government in Western Democracies’, Canadian Journal of Political Science, 1: 180–203. Boix, Carles (1999). ‘Setting the Rules of the Game: The Choice of Electoral Systems in Advanced Democracies’, American Political Science Review, 93: 609–24. Bowler, Shaun, and Grofman, Bernard (eds.) (2000). Elections in Australia, Ireland, and Malta under the Single Transferable Vote: Reflections on an Embedded Institution. Ann Arbor, MI: University of Michigan Press. Farrell, David, and Katz, Richard S. (eds.) (1999). Party Discipline and Parliamentary Government. Columbus, OH: Ohio State University Press. Brunell, Thomas L. (2006). ‘Rethinking Districts: How Drawing Uncompetitive Districts Eliminates Gerrymanders, Enhances Representation, and Improves Attitudes toward Congress’, PS: Political Science and Politics, 40: 77–85. Burnell, Peter and Ware, Alan (eds.) (1998). Funding Democratization. Manchester, UK: Manchester University Press. Carey, John M. and Shugart, Matthew S. (1995). ‘Incentives to Cultivate a Personal Vote: A Rank Ordering of Electoral Formulas’, Electoral Studies, 14: 417–39. Chhibber, Pradeep and Kollman, Ken (1998). ‘Party Aggregation and the Number of Parties in India and the United States’, American Political Science Review, 92: 329–42. (2004). The Formation of National Party Systems: Federalism and Party Competition in Canada, Great Britain, India, and the United States.
Princeton, NJ: Princeton University Press. Coleman, Stephen (2007). ‘Testing Theories with Qualitative and Quantitative Predictions’, European Political Science, forthcoming. Colomer, Josep M. (ed.) (2004a). Handbook of Electoral System Choice. Houndsmills and New York: Palgrave Macmillan. (2004b). ‘The Strategy and History of Electoral System Choice’, in Colomer (2004a), pp. 3–78. (2005). ‘It’s Parties That Choose Electoral Systems (or, Duverger’s Laws Upside Down)’, Political Studies, 53: 1–21. (2007). ‘What Other Sciences Look Like’, European Political Science, forthcoming. Cox, Gary W. (1996). ‘Is the Single Nontransferable Vote Superproportional? Evidence from Japan and Taiwan’, American Journal of Political Science, 40: 740–55.
(1997). Making Votes Count: Strategic Coordination in the World’s Electoral Systems. Cambridge: Cambridge University Press. and Shugart, Matthew S. (1991). ‘Comment on Gallagher’s “Proportionality, Disproportionality and Electoral Systems”’, Electoral Studies, 10: 348–92. Cox, Karen and Schoppa, Leonard J. (2002). ‘Interaction Effects in Mixed-Member Electoral Systems: Theory and Evidence from Germany, Japan and Italy’, Comparative Political Studies, 35: 1027–53. Crisp, Brian F., Escobar-Lemmon, Maria C., Jones, Bradford S., Jones, Mark P., and Taylor-Robinson, Michelle M. (2004). ‘Voter-Seeking Incentives and Legislative Representation in Six Presidential Democracies’, Journal of Politics, 66: 823–46. Cross, William (2004). Political Parties. Vancouver: University of British Columbia Press. Dahl, Robert A. (1961). Who Governs? New Haven, CT: Yale University Press. (1966). ‘Patterns of Opposition’, in Robert Dahl (ed.), Political Opposition in Western Democracies. New Haven, CT: Yale University Press, pp. 332–47. and Tufte, Edward R. (1973). Size and Democracy. Stanford, CA: Stanford University Press. Dalton, Russell J. and Wattenberg, Martin P. (2000). Parties without Partisans: Political Change in Advanced Industrial Democracies. Oxford: Oxford University Press. Darcy, Robert, Welch, Susan, and Clark, Janet (1994). Women, Elections, and Representation. Lincoln, NE: University of Nebraska Press. Diamond, Larry and Plattner, Mark (eds.) (2006). Electoral Systems and Democracy. Baltimore, MD: Johns Hopkins University Press. Diskin, Abraham and Hazan, Reuven Y. (2002). ‘The 2001 Prime Ministerial Election in Israel’, Electoral Studies, 21: 659–64. Dodd, Lawrence C. (1976). Coalitions in Parliamentary Government. Princeton, NJ: Princeton University Press. Dogan, Mattei (1989). ‘Irremovable Leaders and Ministerial Instability in European Democracies’, in Mattei Dogan (ed.), Pathways to Power: Selecting Rulers in Pluralist Democracies.
Boulder, CO: Westview, pp. 239–75. Dolez, Bernard and Laurent, Annie (2005). ‘The Seat–Vote Equation in French Legislative Elections (1978–2002)’, French Politics, 3: 124–41. Downs, Anthony (1957). An Economic Theory of Democracy. New York: Harper. Drummond, Andrew J. (2006). ‘Thinking Outside the (Ballot) Box: Gauging the Systemic and Cognitive Consequences of Electoral Rules for Parties, Partisans, and Partisanship’, Ph.D. thesis, University of California, Irvine. Dumont, Patrick and Caulier, Jean-François (2006). ‘The “Effective Number of Relevant Parties”: How Voting Power Improves the Laakso-Taagepera Index’, Unpublished. Dunleavy, Patrick and Boucek, Françoise (2003). ‘Constructing the Number of Parties’, Party Politics, 9: 291–315. Duverger, Maurice (1951). Les partis politiques. Paris: Le Seuil.
Duverger, Maurice (1954). Political Parties: Their Organization and Activity in the Modern State. London: Methuen. Eagles, Munroe (ed.) (1995). Spatial and Contextual Models in Political Research. London: Taylor and Francis. Eckstein, Harry (1966). Division and Cohesion in Democracy: A Study of Norway. Princeton, NJ: Princeton University Press. (1998). ‘Congruence Theory Explained’, in Harry Eckstein, Frederic J. Fleron, Erik P. Hoffman, and William Reisinger (eds.), Can Democracy Take Root in Post-Soviet Russia? Lanham, MD: Rowman & Littlefield. Elklit, Jørgen (ed.) (1997). Electoral Systems for Emerging Democracies: Experiences and Suggestions. Copenhagen: Danida. and Roberts, Nigel S. (1996). ‘A Category of Its Own: Four PR Two-Tier Compensatory Member Electoral Systems’, European Journal of Political Research, 30: 217–40. Farrell, David M. (2001). Electoral Systems: A Comparative Introduction. London: Palgrave. and McAllister, Ian (2006). The Australian Electoral System: Origins, Variations, and Consequences. Sydney: University of New South Wales Press. Mackerras, Malcolm, and McAllister, Ian (1996). ‘Designing Electoral Institutions: STV Systems and Their Consequences’, Political Studies, 44: 24–43. Ferrara, Federico (2004). ‘Electoral Coordination and the Strategic Desertion of Strong Parties in Compensatory Mixed Systems with Negative Vote Transfers’, Electoral Studies, 23: 391–413. (2006). ‘Two in One: Party Competition in the Italian Single Ballot Mixed System’, Electoral Studies, 25: 329–50. Fisher, Justin and Eisenstadt, Todd (2004). ‘Comparative Party Finance: Introduction: What Is to Be Done?’, Party Politics, 10: 619–26. Fraenkel, Jon and Grofman, Bernard (2005). ‘Introduction—Political Culture, Representation and Electoral Systems in the Pacific Islands’, Commonwealth & Comparative Politics, 43: 261–75. (2006a). ‘Does the Alternative Vote Foster Moderation in Ethnically Divided Societies?
The Case of Fiji’, Comparative Political Studies, 39: 623–51. (2006b). ‘The Failure of the Alternative Vote as a Tool for Ethnic Moderation in Fiji: A Rejoinder to Horowitz’, Comparative Political Studies, 39: 663–6. Gallagher, Michael (1991). ‘Proportionality, Disproportionality and Electoral Systems’, Electoral Studies, 10: 38–40. (1998). ‘The Political Impact of Electoral System Change in Japan and New Zealand’, Party Politics, 4: 203–28. (2001). ‘The Japanese House of Councilors Election 1998 in Comparative Perspective’, Electoral Studies, 20: 603–25. and Mitchell, Paul (eds.) (2006). The Politics of Electoral Systems. Oxford: Oxford University Press.
Laver, Michael, and Mair, Peter (2000). Representative Government in Modern Europe. New York: McGraw-Hill. Gambetta, Diego and Warner, Steven (2004). ‘Italy: Lofty Ambitions and Unintended Consequences’, in Colomer (2004b), pp. 237–52. Gerring, John (2005). ‘Minor Parties in Plurality Electoral Systems’, Party Politics, 11: 79–107. Geys, Benny (2006). ‘District Magnitude, Social Heterogeneity and Local Party System Fragmentation’, Party Politics, 12: 281–97. Golder, Matt (2005). ‘Democratic Electoral Systems around the World, 1946–2000’, Electoral Studies, 24: 103–21. Grofman, Bernard (1989). ‘The Comparative Analysis of Coalition Formation and Duration: Distinguishing Between-Country and Within-Country Effects’, British Journal of Political Science, 19: 291–302. (1999). ‘SNTV, STV, and Single-Member District Systems: Theoretical Comparisons and Contrasts’, in Grofman et al. (1999), pp. 317–33. (2003). ‘Rein Taagepera’s Approach to the Study of Electoral Systems’, Journal of Baltic Studies, 35: 167–85. (2005). ‘Comparisons among Electoral Systems: Distinguishing between Localism and Candidate-Centered Politics’, Electoral Studies, 24: 735–40. (2007). ‘Toward a Science of Politics?’, European Political Science, forthcoming. and Feld, Scott L. (2004). ‘If You Like the Alternative Vote (a.k.a. the Instant Runoff), then You Ought to Know about the Coombs Rule’, Electoral Studies, 23: 641–59. and Handley, Lisa (1989). ‘Black Representation: Making Sense of Electoral Geography at Different Levels’, Legislative Studies Quarterly, 14: 265–79. and Lijphart, Arend (eds.) (2002). The Evolution of Electoral and Party Systems in the Nordic Countries. New York: Agathon. Koetzle, William, and Brunell, Thomas (1997). ‘An Integrated Perspective on the Three Potential Sources of Partisan Bias: Malapportionment, Turnout Differences, and the Geographic Distribution of Party Vote Shares’, Electoral Studies, 16: 457–70.
Lee, Sung-Chull, Winckler, Edwin A., and Woodall, Brian (eds.) (1999). Elections in Japan, Korea, and Taiwan under the Single Non-Transferable Vote. Ann Arbor, MI: University of Michigan Press. Mikkel, Evald, and Taagepera, Rein (2000). ‘Fission and Fusion of Parties in Estonia, 1987–1999’, Journal of Baltic Studies, 31: 329–57. Chiaramonte, Alessandro, D’Alimonte, Roberto, and Feld, Scott L. (2004). ‘Comparing and Contrasting the Uses of Two Graphical Tools for Displaying Patterns of Multiparty Competition: Nagayama Diagrams and Simplex Representations’, Party Politics, 10: 273–99. Gudgin, Graham and Taylor, Peter J. (1979). Seats, Votes and the Spatial Organization of Elections. London: Pion.
Gunther, Richard and Diamond, Larry (2003). ‘Species of Political Parties: A New Typology’, Party Politics, 9: 167–99. Montero, José Ramón, and Linz, Juan J. (eds.) (2002). Political Parties: Old Concepts and New Challenges. Oxford: Oxford University Press. Hare, Thomas (1859). On the Election of Representatives: Parliamentary and Municipal. London: Longman, Roberts, Green. Henig, R. and Henig, S. (2001). Women and Political Power: Europe since 1945. London: Routledge. Hooghe, Marc, Maddens, Bart, and Noppe, Jo (2006). ‘Why Parties Adapt: Electoral Reform, Party Finance and Party Strategy in Belgium’, Electoral Studies, 25: 351–68. Horowitz, Donald (1985). Ethnic Groups in Conflict. Berkeley, CA: University of California Press. (2002). ‘Constitutional Design: Proposals versus Processes’, in Reynolds (2002), pp. 15–36. (2006). ‘Strategy Takes a Holiday: Fraenkel and Grofman on the Alternative Vote’, Comparative Political Studies, 39: 652–62. Hsieh, John F.-S. and Niemi, Richard G. (1999). ‘Can Duverger’s Law Be Extended to SNTV? The Case of Taiwan’s Legislative Yuan Elections’, Electoral Studies, 18: 101–16. Inglehart, Ronald (1997). Modernization and Postmodernization; Cultural, Economic and Political Change in 43 Societies. Princeton, NJ: Princeton University Press. Johnston, Ron J. (1981). Political, Electoral and Spatial Systems. Oxford: Oxford University Press. Jones, Mark P. (1995). Electoral Laws and the Survival of Presidential Democracies. Notre Dame, IN: Notre Dame University Press. Kaminski, Marek M. (2002). ‘Do Parties Benefit from Electoral Manipulation? Electoral Laws and Heresthetics in Poland, 1989–1993’, Journal of Theoretical Politics, 14: 325–58. Kaskla, Edgar and Taagepera, Rein (1988). ‘Effect of District Magnitude on the Number of Parties: A Quasi-Experiment’, Los Angeles Electoral Geography Conference, April 3–5. Katz, Richard S. (1996). ‘Electoral Reform and the Transformation of Party Politics in Italy’, Party Politics, 2: 31–53.
Katz, Richard S. (1997). Democracy and Elections. Oxford: Oxford University Press. Katz, Richard S. and Crotty, William (eds.) (2006). Handbook of Party Politics. London: Sage. Kendall, M. G. and Stuart, A. (1950). ‘The Law of Cubic Proportion in Election Results’, British Journal of Sociology, 1: 183–97. King, Gary, Tomz, Michael, and Wittenberg, Jason (2000). ‘Making the Most of Statistical Analyses: Improving Interpretation and Presentation’, American Journal of Political Science, 44: 347–61.
Klingemann, Hans-Dieter (1999). ‘Mapping Political Support in the 1990s: A Global Analysis’, in Pippa Norris (ed.), Critical Citizens: Global Support for Democratic Government. Oxford: Oxford University Press. Laakso, Markku (1979). ‘Thresholds for Proportional Representation Reanalyzed and Extended’, Munich Social Science Review, 1: 19–28. Laakso, Markku and Taagepera, Rein (1979). ‘Effective Number of Parties: A Measure with Application to West Europe’, Comparative Political Studies, 12: 3–27. Lago Penas, Ignacio (2004). ‘Cleavages and Thresholds: The Political Consequences of Electoral Laws in the Spanish Autonomous Communities, 1980–2000’, Electoral Studies, 23: 23–43. Latner, Michael and McGann, Anthony (2005). ‘Geographical Representation under Proportional Representation: The Cases of Israel and the Netherlands’, Electoral Studies, 24: 709–34. Laver, Michael (2003). ‘Government Termination’, Annual Review of Political Science, 6: 23–40. Laver, Michael and Shepsle, Kenneth A. (1996). Making and Breaking Governments: Cabinets and Legislatures in Parliamentary Democracies. Cambridge: Cambridge University Press. Lijphart, Arend (1977). Democracy in Plural Societies: A Comparative Exploration. New Haven, CT: Yale University Press. Lijphart, Arend (1984). Democracies: Patterns of Majoritarianism and Consensus Government. New Haven, CT: Yale University Press. Lijphart, Arend (1990). ‘Size, Pluralism, and the Westminster Model of Democracy: Implications for the Eastern Caribbean’, in Jorge Heine (ed.), A Revolution Aborted: The Lessons of Grenada. Pittsburgh, PA: University of Pittsburgh Press, pp. 321–40. Lijphart, Arend (1994). Electoral Systems and Party Systems. Oxford: Oxford University Press. Lijphart, Arend (1999). Patterns of Democracy: Government Forms and Performance in Thirty-Six Countries. New Haven, CT: Yale University Press. Lijphart, Arend (2002). ‘The Wave of Power-Sharing Democracy’, in Reynolds (2002), pp. 37–54. Loosemore, John and Hanby, Victor J. (1971).
‘The Theoretical Limits of Maximum Distortion: Some Analytic Expressions for Electoral Systems’, British Journal of Political Science, 1: 467–77. MacKenzie, W. J. M. (1954). ‘The Export of Electoral Systems’, Political Studies, 5: 240–57. Mackie, Thomas T. and Rose, Richard (1991). The International Almanac of Electoral History. London: Macmillan, and Washington, DC: Congressional Quarterly. Previous editions: 1974, 1982. Mackie, Thomas T. and Rose, Richard (1997). A Decade of Election Results: Updating the International Almanac. Glasgow, UK: Centre for the Study of Public Policy, University of Strathclyde. Maclellan, Nic (2005). ‘From Eloi to Europe: Interaction with the Ballot Box in New Caledonia’, Commonwealth & Comparative Politics, 43: 394–418.
Madrid, Raúl L. (2005). ‘Indigenous Voters and Party System Fragmentation in Latin America’, Electoral Studies, 24: 689–707. Mair, Peter (1997). Party System Change: Approaches and Interpretations. Oxford: Clarendon Press. Massicotte, Louis and Blais, André (1999). ‘Mixed Electoral Systems: A Conceptual and Empirical Survey’, Electoral Studies, 18: 341–66. Matland, Richard E. and Taylor, Michelle M. (1997). ‘Electoral System Effects on Women’s Representation: Theoretical Arguments and Evidence from Costa Rica’, Comparative Political Studies, 30: 186–210. Mill, John Stuart (1861). Considerations on Representative Government. New York: Harper. Molinar, Juan (1991). ‘Counting the Number of Parties: An Alternative Index’, American Political Science Review, 85: 1383–91. Monroe, Burt L. (1994). ‘Disproportionality and Malapportionment: Measuring Electoral Inequity’, Electoral Studies, 13: 132–49. Monroe, Burt L. (2007a). Electoral Systems in Theory: Rethinking Social Choice Theory and Democratic Institutions. Ann Arbor, MI: University of Michigan Press. Monroe, Burt L. (2007b). Electoral Systems in Practice: Understanding Distortions in Democratic Representation. Ann Arbor, MI: University of Michigan Press. Monroe, Burt L. and Rose, Amanda G. (2002). ‘Electoral Systems and Unimagined Consequences: Partisan Effects of Districted Proportional Representation’, American Journal of Political Science, 46: 67–89. Moser, Robert G. and Scheiner, Ethan (2004). ‘Mixed Electoral Systems and Electoral System Effects: Controlled Comparison and Cross-National Analysis’, Electoral Studies, 23: 575–99. Mozaffar, Shaheen and Schedler, Andreas (2002). ‘The Comparative Study of Electoral Governance: Introduction’, International Political Science Review, 23: 5–27. Mozaffar, Shaheen, Scarritt, James R., and Galaich, Glen (2003). ‘Electoral Institutions, Ethnopolitical Cleavages, and Party Systems in Africa’s Emerging Democracies’, American Political Science Review, 97: 379–90. Mukherjee, Bumba (2003).
‘Political Parties and the Size of Government in Multiparty Legislatures’, Comparative Political Studies, 36: 699–728. Nagayama, Masao (1997). ‘Shousenkyoku no kako to genzai’ [The past and present of single-member districts], Annual Conference of the Japan Political Science Association, Kyoto, September 4–6. Niemi, Richard and Hsieh, John F.-S. (2002). ‘Counting Candidates: An Alternative to the Effective N’, Party Politics, 8: 75–99. Nishikawa, Misa and Herron, Erik S. (2004). ‘Mixed Electoral Rules’ Impact on Party Systems’, Electoral Studies, 23: 753–68. Nohlen, Dieter (ed.) (1993). Handbuch der Wahldaten Lateinamerikas und der Karibik. Opladen: Leske + Budrich. Nohlen, Dieter (ed.) (2005). Elections in the Americas: A Data Handbook. Oxford: Oxford University Press.
Nohlen, Dieter and Kasapovic, Mirjana (1996). Wahlsysteme und Systemwechsel in Osteuropa. Opladen: Leske + Budrich. Nohlen, Dieter, Krennerich, Michael, and Thibaut, Bernhard (eds.) (1999). Elections in Africa: A Data Handbook. Oxford: Oxford University Press. Nohlen, Dieter, Gotz, Florian, and Hartmann, Christof (eds.) (2001). Elections in Asia and the Pacific: A Data Handbook, vols. 1 and 2. Oxford: Oxford University Press. Norris, Pippa (2004). Electoral Engineering: Voting Rules and Political Behavior. Cambridge: Cambridge University Press. Novák, Miroslav and Lebeda, Tomáš (2005). Electoral Laws and Party Systems: The Czech Experience. Dobrá Voda: Aleš Čeněk. Ordeshook, Peter C. and Shvetsova, Olga (1994). ‘Ethnic Heterogeneity, District Magnitude, and the Number of Parties’, American Journal of Political Science, 38: 101–23. Park, Myoung Ho (2003). ‘Sub-national Sources of Multipartism in Parliamentary Elections: Evidence from Korea’, Party Politics, 9: 503–22. Pedersen, Mogens N. (1979). ‘The Dynamics of European Party Systems: Changing Patterns of Electoral Volatility’, European Journal of Political Research, 7: 7–26. Pennings, Paul and Hazan, Reuven Y. (2001). ‘Democratizing Candidate Selection: Causes and Consequences’, Party Politics, 7: 267–75. Pérez-Liñán, Aníbal (2006). ‘Evaluating Presidential Runoff Elections’, Electoral Studies, 25: 129–46. Powell, G. Bingham (2000). Elections as Instruments of Democracy: Majoritarian and Proportional Visions. New Haven, CT: Yale University Press. Powell, G. Bingham (2006). ‘Election Laws and Representative Governments: Beyond Votes and Seats’, British Journal of Political Science, 36: 291–315. Przeworski, Adam (1975). ‘Institutionalization of Voting Patterns, or Is Mobilization a Source of Decay?’, American Political Science Review, 69: 49–67. Putnam, Robert D. (1976). The Comparative Study of Political Elites. Englewood Cliffs, NJ: Prentice-Hall. Rae, Douglas W. (1967). The Political Consequences of Electoral Laws. New Haven, CT: Yale University Press.
Rae, Douglas W., Hanby, Victor, and Loosemore, John (1971). ‘Thresholds of Representation and Thresholds of Exclusion: An Analytical Note on Electoral Systems’, Comparative Political Studies, 3: 479–88. Rapoport, Anatol (1960). Fights, Games, and Debates. Ann Arbor, MI: University of Michigan Press. Reed, Steven R. (1991). ‘Structure and Behavior: Extending Duverger’s Law to the Japanese Case’, British Journal of Political Science, 29: 335–56. Reed, Steven R. (1996). ‘Seats and Votes: Testing Taagepera in Japan’, Electoral Studies, 15: 71–81. Reed, Steven R. (2001). ‘Duverger’s Law Is Working in Italy’, Comparative Political Studies, 34: 312–27.
Reed, Steven R. (2003). ‘What Mechanism Causes the M + 1 Rule? A Simple Simulation’, Japanese Journal of Political Science, 4: 41–60. Reed, Steven R. and Bolland, John M. (1999). ‘The Fragmentation Effect of SNTV in Japan’, in Bernard Grofman et al. (1999), pp. 211–26. Reilly, Benjamin (2002). ‘Social Choice in the South Seas: Electoral Innovation and the Borda Count in the Pacific Island Countries’, International Political Science Review, 23: 355–72. Reynolds, Andrew (1999). Electoral Systems and Democratization in Southern Africa. Oxford: Oxford University Press. Reynolds, Andrew (ed.) (2002). The Architecture of Democracy: Constitutional Design, Conflict Management, and Democracy. Oxford: Oxford University Press. Reynolds, Andrew and Steenbergen, Marco (2006). ‘How the World Votes: The Political Consequences of Ballot Design, Innovation and Manipulation’, Electoral Studies, 25: 570–98. Reynolds, Andrew, Reilly, Ben, and Ellis, Andrew (2005). Electoral System Design: The New International IDEA Handbook. Stockholm: International Institute for Democracy and Electoral Assistance. Riker, William H. (1982). ‘The Two-Party System and Duverger’s Law: An Essay on the History of Political Science’, American Political Science Review, 76: 753–66. Roberts, Nigel S. (1997). ‘“A Period of Enhanced Surprise, Disappointment, and Frustration”? The Introduction of a New Electoral System in New Zealand’, in Elklit (1997), pp. 63–74. Rokkan, S. (1968). ‘Elections: Electoral Systems’, in International Encyclopedia of the Social Sciences. New York: Crowell, Collier, Macmillan. Ruiz Rufino, Rubén (2005). Aggregated Threshold Functions: A Characterization of the World Electoral Systems between 1945–2000. Madrid: Centro de Estudios Avanzados en Ciencias Sociales. Rule, Wilma (1981). ‘Why Women Don’t Run: The Critical Contextual Factors in Women’s Legislative Recruitment’, Western Political Quarterly, 34: 60–77. Rule, Wilma and Zimmerman, Joseph F. (eds.) (1994). Electoral Systems in Comparative Perspective: Their Impact on Women and Minorities.
Westport, CT: Greenwood. Rush, Mark E. and Engstrom, Richard L. (2001). Fair and Effective Representation? Debating Electoral Reform and Minority Rights. Lanham, MD: Rowman & Littlefield. Samuels, David J. and Shugart, Matthew S. (2003). ‘Presidentialism, Elections and Representation’, Journal of Theoretical Politics, 15: 33–60. Sartori, Giovanni (1976). Parties and Party Systems: A Framework for Analysis. Cambridge: Cambridge University Press. Scarrow, Susan E. (1999). ‘Democracy within—and without—Parties: Introduction’, Party Politics, 5: 275–82. Schiff, Leonard E. (1955). Quantum Mechanics. New York: McGraw-Hill. Schuster, Karsten, Pukelsheim, Friedrich, Drton, Mathias, and Draper, Norman (2003). ‘Seat Biases of Apportionment Methods for Proportional Representation’, Electoral Studies, 22: 651–76.
Shugart, Matthew S. (2006). ‘Comparative Electoral Systems Research: The Maturation of a Field and New Challenges Ahead’, in Gallagher and Mitchell (2006), pp. 25–55. Shugart, Matthew S. (2007). ‘Inherent and Contingent Factors in Reform Initiation in Plurality Systems’, Conference paper in preparation. Shugart, Matthew S. and Carey, John M. (1992). Presidents and Assemblies: Constitutional Design and Electoral Dynamics. New York: Cambridge University Press. Shugart, Matthew S. and Taagepera, Rein (1994). ‘Plurality versus Majority Election of Presidents: A Proposal for a “Double Complement Rule”’, Comparative Political Studies, 27: 323–48. Shugart, Matthew S. and Wattenberg, Martin P. (eds.) (2001). Mixed-Member Electoral Systems: The Best of Both Worlds? Oxford: Oxford University Press. Shugart, Matthew S., Valdini, Melody E., and Suominen, Kati (2005). ‘Looking for Locals: Voter Information Demands and Personal Vote-Earning Attributes of Legislators under Proportional Representation’, American Journal of Political Science, 49: 437–49. Siaroff, Alan (2000). Comparative European Party Systems: An Analysis of Parliamentary Elections since 1945. New York: Garland. Siaroff, Alan (2003). ‘Two-and-a-Half-Party Systems and the Comparative Role of the “Half”’, Party Politics, 9: 267–90. Sikk, Allan (2005). ‘How Unstable? Volatility and the Genuinely New Parties in Eastern Europe’, European Journal of Political Research, 44: 391–412. Soudriette, Richard W. and Ellis, Andrew (2006). ‘Electoral Systems Today: A Global Snapshot’, Journal of Democracy, 17: 78–88. Stockwell, Robert F. (2005). ‘An Assessment of the Alternative Vote System in Fiji’, Commonwealth & Comparative Politics, 43: 382–93. Strøm, Kaare (1990). Minority Government and Majority Rule. Cambridge: Cambridge University Press. Taagepera, Rein (1969). ‘The Seat–Vote Equation’, MA thesis, University of Delaware. Taagepera, Rein (1972). ‘The Size of National Assemblies’, Social Science Research, 1: 385–401. Taagepera, Rein (1973). ‘Seats and Votes: A Generalization of the Cube Law of Elections’, Social Science Research, 2: 257–75. Taagepera, Rein (1976).
‘Why the Trade/GNP Ratio Decreases with Country Size’, Social Science Research, 5: 385–404. Taagepera, Rein (1986). ‘Reformulating the Cube Law of Elections for Proportional Representation Elections’, American Political Science Review, 80: 489–504. Taagepera, Rein (1989). ‘Empirical Threshold of Representation’, Electoral Studies, 8: 105–16. Taagepera, Rein (1993). ‘Running for President of Estonia: A Political Scientist in Politics’, PS: Political Science and Politics, 26: 302–4. Taagepera, Rein (1994). ‘Beating the Law of Minority Attrition’, in Rule and Zimmerman (1994), pp. 233–45. Taagepera, Rein (1997a). ‘Expansion and Contraction Patterns of Large Polities: Context for Russia’, International Studies Quarterly, 41: 475–504.
Taagepera, Rein (1997b). ‘Effective Number of Parties for Incomplete Data’, Electoral Studies, 16: 145–51. Taagepera, Rein (1998a). ‘How Electoral Systems Matter for Democratization’, Democratization, 5: 68–91. Taagepera, Rein (1998b). ‘Effective Magnitude and Effective Threshold’, Electoral Studies, 17: 393–404. Taagepera, Rein (1998c). ‘Nationwide Inclusion and Exclusion Thresholds of Representation’, Electoral Studies, 17: 405–17. Taagepera, Rein (1999a). ‘Supplementing the Effective Number of Parties’, Electoral Studies, 18: 497–504. Taagepera, Rein (1999b). ‘Ignorance-Based Quantitative Models and Their Practical Implications’, Journal of Theoretical Politics, 11: 421–31. Taagepera, Rein (2001). ‘Party Size Baselines Imposed by Institutional Constraints: Theory for Simple Electoral Systems’, Journal of Theoretical Politics, 13: 331–54. Taagepera, Rein (2002a). ‘Implications of the Effective Number of Parties for Cabinet Formation’, Party Politics, 8: 227–36. Taagepera, Rein (2002b). ‘Nationwide Threshold of Representation’, Electoral Studies, 21: 383–401. Taagepera, Rein (2002c). ‘Limiting Frame of Political Games: Logical Quantitative Models of Size, Growth and Distribution’, Center for the Study of Democracy, University of California, Irvine. Research Monograph CSD 02-03, www.democ.uci.edu. Taagepera, Rein (2002d). ‘Designing Electoral Rules and Waiting for an Electoral System to Evolve’, in Reynolds (2002), pp. 248–64. Taagepera, Rein (2003). ‘Arend Lijphart’s Dimensions of Democracy: Logical Connections and Institutional Design’, Political Studies, 51: 1–19. Taagepera, Rein (2004). ‘Extension of the Nagayama Triangle for Visualization of Party Strengths’, Party Politics, 10: 301–6. Taagepera, Rein (2005). ‘Conservation of Balance in the Size of Parties’, Party Politics, 11: 283–98. Taagepera, Rein (2006). ‘Meteoric Trajectory: Res Publica Party in Estonia’, Democratization, 13: 78–94. Taagepera, Rein (2008). Beyond Regression: The Need for Predictive Models in Social Sciences. Oxford: Oxford University Press, forthcoming. Taagepera, Rein and Allik, Mirjam (2006). ‘Seat Share Distribution of Parties: Models and Empirical Patterns’, Electoral Studies, 25: 696–713. Taagepera, Rein and Ensch, John (2006).
‘Institutional Determinants of the Largest Seat Share’, Electoral Studies, 25: 760–75. Taagepera, Rein and Grofman, Bernard (2003). ‘Mapping the Indices of Seats–Votes Disproportionality and Inter-Election Volatility’, Party Politics, 9: 659–77. Taagepera, Rein and Hayes, James P. (1977). ‘How Trade/GNP Ratio Decreases with Country Size’, Social Science Research, 6: 108–32.
Taagepera, Rein and Hosli, Madeleine O. (2006). ‘National Representation in International Organizations: The Seat Allocation Model Implicit in the EU Council and Parliament’, Political Studies, 54: 370–98. Taagepera, Rein and Laakso, Markku (1980). ‘Proportionality Profiles of West European Electoral Systems’, European Journal of Political Research, 8: 423–46. Taagepera, Rein and Laatsit, Mart (2007). ‘Vote and Seat Share Distribution of Parties: How and Why’, Unpublished. Taagepera, Rein and Nurmia, Matti (1961). ‘On the Relations between Half-Life and Energy Release in Alpha-Decay’, Ann. Acad. Sci. Fennicae A. VI. 78. Taagepera, Rein and Recchia, Steven (2002). ‘The Size of Second Chambers and European Assemblies’, European Journal of Political Research, 41: 185–205. Taagepera, Rein and Shugart, Matthew S. (1989). Seats and Votes: The Effects and Determinants of Electoral Systems. New Haven, CT: Yale University Press. Taagepera, Rein and Shugart, Matthew S. (1993). ‘Predicting the Number of Parties: A Quantitative Model of Duverger’s Mechanical Effect’, American Political Science Review, 87: 455–64. Taagepera, Rein and Sikk, Allan (2007). ‘Institutional Determinants of Mean Cabinet Duration’, Unpublished. Taagepera, Rein and Williams, Ferd (1966). ‘Photoelectroluminescence of Single Crystals of Manganese-Activated Zinc Sulfide’, Journal of Applied Physics, 13: 3085–91. Taagepera, Rein, Storey, Robert S., and McNeill, Keith G. (1961). ‘Breakdown Strength of Caesium Iodide’, Nature, 190: 994. Tan, Alexander C. (1989). ‘The Impact of Party Membership Size: A Cross-National Analysis’, Journal of Politics, 60: 188–98. Tan, Alexander C. (1997). ‘Party Change as Party Membership Declines’, Party Politics, 3: 363–77. Taylor, Andrew (2005). ‘Electoral System and the Promotion of “Consociationalism” in a Multi-ethnic Society: The Kosovo Assembly Elections of November 2001’, Electoral Studies, 24: 435–63. Theil, Henri (1969). ‘The Desired Political Entropy’, American Political Science Review, 63: 21–5. Tufte, Edward R. (1973). ‘The Relationship between Seats and Votes in Two-party Systems’, American Political Science Review, 67: 540–47.
Van Biezen, Ingrid (2003). Financing Political Parties and Election Campaigns: Guidelines. Strasbourg: Council of Europe. Van Roozendaal, Peter (1992). ‘The Effect of Dominant and Central Parties on Cabinet Composition and Durability’, Legislative Studies Quarterly, 17: 5–36. Vatter, Adrian (2003). ‘Legislative Party Fragmentation in Swiss Cantons: A Function of Cleavage Structure or Electoral Institutions?’, Party Politics, 9: 445–61. Ware, Alan (1996). Political Parties and Party Systems. Oxford: Oxford University Press.
Warwick, Paul (1994). Government Survival in Parliamentary Democracies. Cambridge: Cambridge University Press. Webb, Paul and Farrell, David (2002). Political Parties in Advanced Industrial Democracies. Oxford: Oxford University Press. Weldon, Steven A. (2006). ‘Downsize My Polity? The Impacts of Size on Party Membership and Member Activism’, Party Politics, 12: 467–81. Wolinetz, Steven B. (2006). ‘Party Systems and Party System Types’, in Katz and Crotty (2006), pp. 51–62.
Index
A Wuffle x, 99, 239, 271 Adams, James F. x additional member, see electoral systems, composite adjustment seats, see electoral systems, composite advantage ratio 70–5 Alabama paradox 85–7, 95, 243, 246 Åland Islands 36 Allende, Salvador 2 Allik, Mirjam vii, 145–6, 148–51, 155, 157–8, 161 alternative vote (AV) 25–7, 36, 45–6, 270 agreement with simple models 117, 125, 129 Amorim Neto, Octavio 278 Anckar, Carsten 188–9, 192, 195–6, 231 Anckar, Dag 188 Anderson, Christopher J. 108 Andrews, Josephine 15 Antigua, 232 apparentement 24, 41, 73, 90, 127, 132 approval voting 28–9 assembly size 17–18, 23–4, 102, 110, 127, 184–6 cube root law of 110, 188–90, 197–200, 206 and openness to small parties 90–1 Australia 25, 35, 135–6 sister parties 53, 125, 129 Austria 14, 64, 182, 193, 237, 266 balance in party sizes, index of 48, 56, 62, 123–4, 138, 237 and effective number of parties 50–3 Balinski, Michael L. 87, 95 ballot structure 18, 23 Banzhaf power index 56 Barbados 25 Barker, Fiona 280 Belgium 39, 183, 237 Benoit, Kenneth 17, 277
Birch, Sarah 25, 271 Blais, André v, 40, 277, 282 Blau, Adrian 214, 221 block vote (BV), see party block vote Blondel, Jean 50–1 Boix, Carles 7, 110 Bolland, John 34, 50 Borda, Jean Charles de 26, 29, 102 Borda count (BC) 26–8, 45–6, 270 Botswana 52, 125, 172, 174, 180 Boucek, Françoise 56, 62 Bowler, Shaun v, x, 35–6 break-even point 70–3, 235, 250–1, 252 Britain, see United Kingdom British-heritage countries 15, 25, 45–6, 110, 270 Brunell, Thomas L. 43, 279 Burnell, Peter 279 cabinet duration vi, viii, 111, 165–75, 191–2 data 289–91 inverse cube law 170–2, 175 inverse square law 167–70, 174–5, 275 cabinet types 56, 167 Canada 135–6, 247 district level 215, 247 national level 64, 135–6, 144, 237 province level 104, 181, 209 second chamber 261, 266 Carey, John M. v, 282–3 Caribbean countries: with high disproportionality exponent 207, 209–12, 214, 229, 232, 237 categoric ballot 17–18 Caulier, Jean-François 56 causal direction 7, 103, 108, 110 chess rules and electoral rules 7–8 Chhibber, Pradeep 154, 215 Chile 2, 41, 50 China 190
Index Churchill, Winston 120 Clark, Janet vi, 203, 283 Cleavages 106 closed list 18, 29, 283 coalition cabinets 56, 58, 131, 202 break-up 166, 168 duration 17, 167 Coleman, Stephen vii Colomer, Josep M. v, vii, 7, 13–15, 22, 24, 26, 33, 102–3, 110, 132, 253 micro-mega rule 83–5, 92, 94, 102, 107, 110 communication channels, number of 111, 167–8, 170, 198–200 compensatory seats, see electoral systems, compensatory conceptual anchor points 69, 208 for number of parties 91, 117, 121, 134 conceptual limits 31, 162, 140 on number and size of parties 119–20, 125, 135, 148, 156, 157 see also quantitatively predictive logical models conservation laws 68, 124, 135–6 Costa Rica 30 Cox, Gary W. v, 27, 35, 49, 94, 104–5, 149, 226, 244, 277–8 Cox, Karen 280 Crisp, Brian 30, 284 Cross, William x Crotty, William x cube law of elections 112–13, 205, 214, 217, 218 cube root law of assembly sizes, see assembly size cumulative voting 28 Cyprus 265 Czech Republic 20 Dahl, Robert A. 1, 4, 51, 187 Dalton, Russell J. x, 195 Dalton’s principle 77–8, 82 Danish divisors 32–3, 86–8, 93, 95–6 Darcy, Robert vi, 203, 283 democracy and elections 1–2, 108 Denmark 6, 41, 51, 183 ‘deterministic’ prediction ix deviation from a norm 76 deviation from PR 65–82, 113 Gallagher index (D2 ) 66–9, 76–82, 88–9, 211, 231, 234 indices to suit any seat allocation formula 94 ‘law of conservation’ 68
Lijphart index (D∞ ) 67, 77–9 Loosemore–Hanby index (D1 ) 66–9, 76–82, 233 master equation for indices 78 prediction from institutional inputs 231–3 prediction from votes 211–13 relations between indices 79–82 relation to effective number of parties 68–9 d’Hondt divisors 31–2, 44 as sufficient quota 33, 94 formula exponent for 93–6, 128, 130 large party bias 85–9, 112 modified, in Estonia 32–3 openness to small parties 85, 89–90 presented as Hagenbach-Bischoff 43 threshold of representation 242–3, 246–7, 252 Diamond, Larry v, 51 Diskin, Abraham 282 disproportionality exponent 112, 205–8, 213–15, 217 district level alliances, see apparentement district magnitude 103 definition 2, 18–19, 23 determinants of 110 openness to small parties 85–9 unequal 37–8, 132 divisor formulas for seat allocation 31–4 see also Danish; d’Hondt; Imperiali; Jefferson; Sainte-Laguë divisors Dobrzynska, Agnieszka 282 Dodd, Lawrence C. 168, 172 Dogan, Mattei 166 Dolez, Bernard 219, 221 double ballot 40 Downs, Anthony 105 Droop quota 30, 33–4 Drummond, Andrew J. 108 Dumont, Patrick 56 Dunleavy, Patrick 56, 62 duration of cabinets, see cabinet duration Duverger, Maurice 101–3 mechanical effect 104, 205 psychological effect, 104–5, 231 two effects combined, 107, 112, 149, 231, 277, 280 Duverger’s law 7, 27, 41, 192, 280 generalization as M+1 rule 105, 244 Duverger’s statements (law and hypothesis) 103, 106, 107
Index Duvergerian agenda 101–14 degree of completion 177, 231, 234, 276, 284 macro-agenda 102, 106, 109–14 micro-agenda 102, 114, 141, 276–7 Eagles, Munroe 278 Eckstein, Harry 5, 271 effective magnitude 178–83 input based 178, 184–6 output based 178–84 effective number of components 48, 55 effective number of electoral parties 53–4, 113, 196, 215 prediction from institutional inputs 229–31 effective number of legislative parties (N, N2 ) 48–54, 56–9, 111 data 289–91 and index of balance 50–3 and N0 and N∞ 59–60, 62–3, 97, 154, 160–4, 226 and number needed for majority coalition 58 prediction from institutional inputs 152–3, 162 prediction from votes 211–12 and relevant parties 63–4 effective number of parties: gap between electoral and legislative 54, 236–7 power index based 56 effective number of polities 55 effective thresholds, see thresholds Eisenstadt, Todd 279 electoral design, see institutional engineering electoral districts, definition 2, 17, 21 electoral laws vs. electoral systems 22, 272–3 electoral reform, see electoral systems, choice and change electoral systems 23–46 aggregate 97–8, 154 choice and change 13–17, 20–1, 43–5, 131, 273 composite: compensatory vs. parallel 40–1, 45–6 definition 2, 5, 21 openness to small parties 83–98 pathologies 42–3, 46 research 8–9 Elklit, Jørgen vi, 279 Ellis, Andrew v, 14, 21–2, 24–5, 44–5
England, see United Kingdom Engstrom, Richard L. 43 Ensch, John 25, 124–7, 129–30 Estonia ix–x, 16, 20, 32, 39, 274 ethnic conflict and electoral systems 270–1 ethnic minority representation 14, 91, 202–4, 216, 278 European Parliament 43, 255–7, 266, 268 minor party vote 54n, 105, 282 seat allocation 261–5 size 259–60 European Union 190, 200, 255 Constitutional Treaty 260, 264–5 European Union, Council of 255–7, 259–66, 268 total voting weights 259–60 voting weight allocation 261–5 expectation values viii, 121, 148 Farrell, David M. v, x, 21–2, 24–5, 36 Fatah 1, 15 federal subunits: number and largest share 135–6 representation in second chamber 266 Feld, Scott L. 26 Ferrara, Frederico 40, 280 Fiji 270–1 financing of parties 104, 105, 279 Finland 105, 132, 245 as typical d’Hondt 32, 92–3 proportionality profile 73, 75, 233 unusual features 37, 181, 237 first-past-the-post (FPTP) 14, 18, 24–7 and deviation from PR 68–9 frequency of 44–6 proportionality profiles 69–71 as single-seat PR 19, 23, 33, 36, 92 Fisher, Justin 279 formula exponent (F ), 92–6, 128, 130 see also seat product Fraenkel, Jon 270–1 France 21, 193, 221 proportionality profile 72, 80 two-rounds majority-plurality 25–6 unusual features 55, 64, 69, 181 fraud, electoral 42 French-heritage countries 45–6, 110 Galaich, Glen 53, 278 Gallagher, Michael v, x, 94, 244, 280 index of deviation from PR, see deviation from PR Gambetta, Diego 41
geometric vs. arithmetic means 118, 137, 218 Germany, Federal Republic 40, 64, 182, 195, 253 CDU/CSU party 50, 53, 63 legal threshold 38–9, 253 proportionality profile 74 typical MMP 16 Germany, imperial 117, 125, 141, 144, 248 Gerring, John 197 Gerry, Elbridge 43 gerrymander 42–3, 214 bipartisan 42–3, 214 Geys, Benny 278 Golder, Matt 44 Gotz, Florian vi, 125–6 Greece 39, 174, 182, 237 Grenada 211 Grofman, Bernard v, vii, x, 6, 26, 35–6, 76–7, 91, 146, 167, 188, 248, 270–1, 278, 281, 284 Gudgin, Graham 278 Guillory, Christine A. 108 Gunther, Richard x, 51 Hagenbach-Bischoff quota 30, 33 basis for d’Hondt divisors 43 Hamas 1, 29 Hamilton, Alexander 33 Hamilton quota 30 Hanby, Victor J. 66, 242 Handley, Lisa 91, 188 Hare, Thomas 102 Hare quota 30 Hare quota and largest remainders (Hare-LR) 30, 33, 44, 85–6, 112 formula exponent 92–3, 95–6 openness to small parties 85–9 threshold of representation 242–3, 246 Hartmann, Christof vi, 125–6 Hayes, James P. 188 Hazan, Reuven Y. 282, 283 Henig, R. vi, 283 Henig, S. vi, 283 Herron, Erik S. 280 Hill, Steven 1n Hooghe, Marc 279 Horowitz, Donald 270 Hosli, Madeleine O. 260, 262–5, 269 Hsieh, John F.-S. 226 Iceland 2–3, 41, 69, 183, 193, 237 ignorance-based models 110–11, 119, 141–2
see also quantitatively predictive logical models Imperiali divisors 32–3, 86–8, 95–6 Imperiali quotas 30, 33 India 69, 209, 215 major FPTP system 44, 46 province level variety 104, 181 unusual features 55, 89, 126, 172 Inglehart, Ronald 174 institutional constraints 4 institutional engineering 9, 22, 112, 197–8, 272–3 European Union, United Nations and second chambers 255–6, 266–7 first or only chambers 130–3, 155–6, 172–4, 183–4, 234, 253 intra-list competition 29, 41, 283 inverse square law of cabinet duration, see cabinet duration Ireland 6, 35, 137, 181, 237 island countries 188, 190, 207 Israel 1, 283 Italy 30, 39, 41, 183, 237 number of relevant parties 63–4 parallel system 40, 280 two-rounds (1895–1913) 125, 144 Jackman, Robert W. 15 Jamaica 180, 195 Japan 69, 105, 139 Liberal Democratic Party dominance, 50, 53, 139, 237 shift to two-tier PR 280 typical SNTV 35, 117, 281 unusual features 69, 128, 137, 195 Jefferson, Thomas 33 Jefferson divisors 31 Johnston, Ron J. 278 Jones, Mark P. v Kaminski, Marek M. 15 Kasapovic, Mirjana vi Kaskla, Edgar 91, 95 Katz, Richard S. v, x, 280 Kendall, M. G. 205 King, Gary ix Kiribati 26 Klingemann, Hans-Dieter 108 Koetzle, William 278 Kollman, Ken 154, 215 Kosovo 271 Krennerich, Michael vi
Laakso, Markku 49, 60, 70, 103, 154, 242 Laatsit, Mart 146, 158–9, 231 Lago Penas, Ignacio 278 Lange, David 16 largest remainders 263 with Hare quota, see Hare quota largest seat share (s1) 110, 122–33, 135–9 data 289–91 inverse as measure of number of parties (N∞), see number of parties largest vote share for given largest seat share 229–30, 235–6 Latin America 207 Latner, Michael 283 Laurent, Annie 219, 221 Laver, Michael x, 166 Lebeda, Tomáš 21 legal majorities 39 legal thresholds, see thresholds legitimacy 14 Lijphart, Arend v, 21–2, 24, 39, 43, 49–50, 52, 54–6, 67–9, 79, 103, 108, 166–70, 174–5, 178–9, 181–2, 185, 188, 192, 207, 245, 253, 270, 277, 282 limited vote (LV) 29, 45–6 Linz, Juan J. x list PR 29 frequency 44–6 literacy as basis for assembly size 189–90 Loosemore, John 66, 242 Loosemore–Hanby index for deviation from PR, see deviation from PR Luxembourg 32, 42, 129, 237, 263, 265–6 M+1 rule 105, 244–5, 277 McAllister, Ian 25, 36 Maclellan, Nic 271 McGann, Anthony 283 MacKenzie, W. J. M. 271 Mackerras, Malcolm 36 McLeavy, Elizabeth 280 McNeill, Keith G. vii Mackie, Thomas T. vi, 51, 70, 116, 122, 125, 145, 237 Maddens, Bart 279 Madrid, Raúl L. 278 magnitude, see district magnitude; effective magnitude Mair, Peter x, 4, 51 majoritarian electoral systems 1, 24–6, 108 malapportionment 24, 42–3, 65, 69
Malta 92, 134–5, 193 typical STV 35 unusual features 39, 51, 181, 237, 265 Massicotte, Louis 40, 282 Matland, Richard E. 30, 283 Mauritius 172, 174, 181 mechanical effect, see Duverger Michels, iron law of 274 micro-mega rule, see Colomer Mill, John Stuart 103 Merrill, Samuel x Mexico 40, 195 Mikkel, Evald 4 minority attrition, law of 112–13, 201–23 derivation 216–21 FPTP 204–15 multi-seat districts 213–14, 219–21 two-rounds 221 women and ethnic minorities 202–4, 215, 222 minority cabinets 56 minority enhancement equation 261–8 Mitchell, Paul v mixed member proportional (MMP) 16, 40, 44 proportionality profile 74 Molinar, Juan 56, 62 Monroe, Burt L. v, 37–8, 76–7 Montero, José Ramón x Moser, Robert G. 280 Mozaffar, Shaheen 2, 53, 278 Mukherjee, Bumba 154 multi-seat districts 18–19, 28–36 Nagayama, Masao 145 Nagayama triangle 145, 148 Netherlands, The 38, 64, 144, 172, 233, 283 two-rounds (1888–1913) 117 as typical nationwide district 92–3, 119–20, 244–5 New Caledonia 270–1 New Hampshire 197 New Zealand 49, 185, 207 change in electoral system vi–vii, ix, 16, 280 proportionality profile 70–1, 233 Nice, Treaty of, see European Union Niemi, Richard G. 226 Nishikawa, Misa 280 Nohlen, Dieter vi, 91, 125–6, 207, 210–11, 230 Noppe, Jo 279 Norris, Pippa v
Norway 32, 64 Novák, Miroslav 21 number of parties 47–64 effective, see effective number of parties entropy-based (N1) 61, 169 largest seat share based (N∞) 48, 50, 57–60, 62–3, 97, 138, 149 master equation 60–1 NP index 62 registered 188, 192–3, 195–6 relevant 48, 57, 63–4 seat-winning (N0) 48, 50, 57–60, 97, 110–11, 116–22, 133–5; data 288–9 serious or pertinent 226–7, 244–6, 250 number of serious candidates 105 Nurmia, Matti vii Occam’s razor 141 open list 18, 29, 281, 283 Ordeshook, Peter C. 278 ordinal ballot 17–18 ‘Other parties’ 5, 122, 139, 141, 152 overpayment for largest party 104 Palestine 1–2, 15, 29 panachage 18, 42, 127, 181 Papua-New Guinea 153, 172, 181, 197 Park, Myoung Ho 278 parties: balance in size, see balance in party size internal structure x membership 188, 192–5 number, see number of parties, effective number regional 16 party block vote (PBV) 28, 44–6 party systems: definition 5–6 mapping with number and balance of parties 50–3 simple 273–5 path dependence 108, 183 Pedersen, Mogens N. 67 Pennings, Paul 283 Pérez-Liñán, Aníbal 282 physics 188 conserved quantities 135, 219 general structure 120, 214, 284, 285 methods 31, 272, 287 sequential approach to problems vii–viii, 274, 276–7 terminology 61, 121, 174 Plattner, Mark v
plurality allocation rule 18–19, 92–3, 97–8, 178–80 multi-seat plurality 19–20, 98, 178–9, 182, 183, 219–20 in single-seat districts, see first-past-the-post political culture 4–5, 10, 14, 131, 234 consensual vs. majoritarian 108, 174–5, 277–8 political engineering, see institutional engineering political practitioners, advice to 1, 23 advantages of simple electoral laws ix, 13, 101 allocating seats in second chambers and Europarliament 255 altering assembly size 187 altering the mean duration of cabinets 165 altering the number of parties 83, 115, 143, 225 increasing representation of women and minorities 201, 241 measuring the number of parties and disproportionality 47, 65 simplifying complex electoral systems 177, 269 population, effect on politics 110, 187–200 Portugal 32, 237 Powell, G. Bingham v, 284 predictive ability 10, 17, 120, 142, 212 models 112, 118 quantitative prediction vi–viii, 9, 140, 211 see also quantitatively predictive logical models preferential voting, see open list presidential elections 24, 125, 134, 206, 281–2 primary elections 42 prime ministerial elections 281–2 proportional representation (PR) 7, 14, 18–19 deviation from, see deviation from PR proportionality profiles 65, 70–5, 233 Przeworski, Adam 67 psychological effect, see Duverger Putnam, Robert 202 quantitatively predictive logical models 46, 110, 271, 285–6 vs. directional models 127, 159, 175 see also predictive ability
quota method for seat allocation 30–3 see also Droop; Hagenbach-Bischoff; Hamilton; Hare; Imperiali quotas R-squared, significance of 118, 126, 129, 171–2 Rae, Douglas W. 103, 242 ranges of variables 130, 140 Rapoport, Anatol 7–8 Recchia, Steven 257–9 Reed, Steven R. 34, 50, 105, 120, 146, 226, 244, 280 regression analysis: logging all variables 192 OLS on logged variables 117–18, 123, 125–6, 128, 153, 169, 171 overuse of vii–viii, 285 Reilly, Benjamin v, 14, 21–2, 24–6, 44–5 relative majority, see plurality representation thresholds, see thresholds responsiveness of FPTP, see disproportionality exponent Reynolds, Andrew v–vi, 2, 14, 21–2, 24–5, 44–5 Riker, William H. 103 Roberts, Nigel S. 16, 279 Rokkan, Stein 242 Rose, Amanda G. 37–8 Rose, Richard vi, 51, 70, 116, 122, 125, 145, 237 Rosetta stone 10, 285 Ruiz Rufino, Rubén 251 Rule, Wilma vi, 283 Rush, Mark E. 43 St Kitts and Nevis 46, 102, 125, 197 St Vincent 91 Sainte-Laguë divisors 32–3, 44, 85, 112 formula exponent for 92–6 modified 32–3 openness to small parties 85–90 thresholds of representation 242–3, 246–7 Samuels, David J. 282 Sartori, Giovanni 51, 56–7, 63–4 Scarritt, James R. 53, 278 Scarrow, Susan 283 Schedler, Andreas 2 Scheiner, Ethan 280 Schiff, Leonard E. 121 Schoppa, Leonard J. 280 Schrödinger’s equation 236 Schuster, Karsten 88, 95 Scotland 105, 280
seat allocation formulas 18, 22–3, 112 effect on openness to small parties 85–9 seat product (M^F S, MS) 92–8, 110–11, 139–40, 154, 179–82 effective 131, 155 seat shares of parties 144–52, 156–60 seat–vote equation 112, 204, 215, 216, 219 reversed 227 second chambers: seat allocation 261–8 size 257–9 selection process constant 219, 221–3 self-interest 5, 15–16, 110 Shepsle, Kenneth A. 166 Shugart, Matthew S. 30, 40, 94, 101–2, 106, 116, 120–1, 131, 161, 185, 209, 211, 217, 276, 279, 281–5 Shvetsova, Olga 278 Siaroff, Alan 51, 62 Sikk, Allan 75, 152–3, 168–72, 174 simple electoral systems 19–20, 103, 109–10, 178–9, 272 predictability 10, 46 simple quota, see Hare quota single-member districts, see single-seat districts single non-transferable vote (SNTV) 34–6, 44–6, 105, 117, 244 single-seat districts 18–19, 24–8 single transferable vote (STV) 35–6, 44–6, 117 sister party problem 50, 53, 63, 125, 129, 139 size and politics, see population small countries, politics in 191–6 Solomon Islands 197 Soudriette, Richard W. 45 South Korea 35, 40, 195 Soviet electoral rules 15 Spain 32, 38, 253, 278 deviations from predictions 38, 127, 172, 181, 248 low balance in party sizes 53, 237 malapportionment 55, 69 stability, political 15–16, 131, 166 Steenbergen, Marco 2 Stockwell, Robert F. 270 Storey, Robert S. vii strategic voting and sequencing 104–5, 112, 149 Strøm, Kaare 202 Stuart, R. 205 sub- and supranational elections 54n, 105, 282–3
Suominen, Kati 30, 289 Sweden 32, 64, 183 Switzerland 32, 34, 64, 181 apparentement and panachage 41–2, 127 long-lasting cabinets 169, 275 small largest share 128, 141 Taiwan 35, 40 Tan, Alexander C. 195 Taylor, Andrew 271 Taylor, Michelle 30, 283 Taylor, Peter J. 278 Theil, Henri 216–17, 222, 261–2, 268 Thibaut, Bernhard vi thought experiment 91–2, 147 see also quantitatively predictive logical models threshold of majority 251–2 thresholds of representation 24 confusion between district and nationwide 185, 252–3 effective, district level 38, 178 effective, nationwide 247–50 inclusion and exclusion 242–3, 246 legal, district level 90, 144 legal, nationwide 16, 38–9, 74, 127, 182 ticket splitting 75 tiers, multiple 39, 44, 279–80 see also electoral systems, composite Tomz, Michael ix Tufte, Edward R. 4, 187, 205 Turkey 195 two-rounds (TR) 21, 25–7, 42, 45–6 proportionality profile 72 unpredictable outcomes 117, 125, 129 two-tier, see electoral systems, composite Ukraine 15 United Kingdom (UK) 44, 91–2, 193, 215 cube law and its demise 205, 207, 214, 221, 271 district level threshold of representation 247–8 Liberal Party demise 2–3 proportionality profile 233 typical large country FPTP 1, 15, 20, 25, 104
unusual features 180, 190, 197, 237, 153 United Nations (UN) 256, 261, 266 United States (USA) 44, 135, 205, 215, 266 Electoral College 201, 213, 220 gerrymander 42–3, 214 House 86, 189, 260 large largest seat share 125, 141 proportionality profile 70–1 Senate 261, 266 states within 33, 91, 190, 200 unusually pure two-party system 104, 117, 197, 245, 247 women’s representation 203–4, 222 unlimited vote (UV) 29 Uruguay 41 Valdini, Melody E. 30, 284 Van Biezen, Ingrid 279 Van Roozendaal, Peter 170 Vatter, Adrian 278 Venezuela 183 volatility of votes 67, 75 volleyball scores 203–4, 221–2 Ware, Alan x, 279 Warner, Steven 41 Warwick, Paul 166 wasted votes 104 Wattenberg, Martin P. vi, x, 40, 195, 279 Webb, Paul x Webster, Daniel 32–3 Welch, Susan vi, 203, 283 Weldon, Steven A. 188–9, 192–4, 196 Williams, Ferd vii Wittenberg, Jason ix Wolinetz, Steven B. 50–1 women’s representation 14, 30, 108, 283 affected by law of minority attrition 202–4, 222–3 Young, H. Peyton 88, 95 Zambia 15 Zimmermann, Joseph F. vi