INTRODUCTION TO THE SERIES

The aim of the Handbooks in Economics series is to produce Handbooks for various branches of economics, each of which is a definitive source, reference, and teaching supplement for use by professional researchers and advanced graduate students. Each Handbook provides self-contained surveys of the current state of a branch of economics in the form of chapters prepared by leading specialists on various aspects of this branch of economics. These surveys summarize not only received results but also newer developments, from recent journal articles and discussion papers. Some original material is also included, but the main goal is to provide comprehensive and accessible surveys. The Handbooks are intended to provide not only useful reference volumes for professional collections but also possible supplementary readings for advanced courses for graduate students in economics.
EDITORS' INTRODUCTION

The field of Public Economics has been changing rapidly in recent years, and the sixteen chapters contained in this Handbook survey many of the new developments. As a field, Public Economics is defined by its objectives rather than its techniques, and much of what is new is the application of modern methods of economic theory and econometrics to problems that have been addressed by economists for over two hundred years. More generally, the discussion of public finance issues also involves elements of political science, finance and philosophy. These connections are evident in several of the chapters that follow.

Public Economics is the positive and normative study of government's effect on the economy. We attempt to explain why government behaves as it does, how its behavior influences the behavior of private firms and households, and what the welfare effects of such changes in behavior are. Following Musgrave (1959), one may imagine three purposes for government intervention in the economy: allocation, when market failure causes the private outcome to be Pareto inefficient; distribution, when the private market outcome leaves some individuals with unacceptably low shares in the fruits of the economy; and stabilization, when the private market outcome leaves some of the economy's resources underutilized. The recent trend in economic research has tended to emphasize the character of stabilization problems as problems of allocation in the labor market.

The effects that government intervention can have on the allocation and distribution of an economy's resources are described in terms of efficiency and incidence effects. These are the primary measures used to evaluate the welfare effects of government policy. The first chapter in this volume, by Richard Musgrave, presents an historical development of these and other concepts in Public Finance, dating from Adam Smith's discussion in The Wealth of Nations of the role of government and the principles by which taxes should be set.

The remaining chapters in the Handbook examine different areas of current research in Public Economics. Analyses of the efficiency and incidence of taxation, developed in Musgrave's chapter, are treated separately in Alan Auerbach's chapter in the first volume and Laurence Kotlikoff's and Lawrence Summers' chapter in the second volume, respectively. Auerbach surveys the literature on excess burden and optimal taxation, while Kotlikoff and Summers discuss various theoretical and empirical approaches that have been used to measure the distributional effects of government tax and expenditure policies. These general analyses of the effects of taxation form a basis for the consideration of tax policies in particular markets or environments, as is contained in the chapters by Jerry Hausman, Agnar Sandmo, Avinash Dixit, Harvey Rosen, John Helliwell and Terry Heaps, and Joseph Stiglitz.
Hausman discusses the effects of taxation on labor supply, including a treatment of how one empirically estimates such effects in the presence of tax and transfer programs. He also considers the incentive effects of social welfare programs such as unemployment compensation and social security. Sandmo focuses on the other major factor in production, capital, dealing with theory and evidence about the effects of taxation on private and social saving and risk-taking. Dixit shows how the basic results about the effects of taxation may be extended to the trade sector of the economy, casting results from the parallel trade literature in terms more familiar to students of Public Finance.

Rosen's chapter brings out the characteristics of housing that make it worthy of special consideration. He considers the special econometric problems involved in estimating the response of housing demand and supply to government incentives. Because of its importance in most family budgets and its relatively low income elasticity of demand, housing has been seen as a suitable vehicle for government programs to help the poor, and Rosen discusses the efficiency and incidence effects of such programs. Helliwell and Heaps consider the effects of taxation on output paths and factor mixes in a number of natural resource industries. By comparing their results for different industries, they expose the effects that technological differences have on the impact of government policies. Stiglitz treats the literature on income and wealth taxation.

The remaining chapters in the Handbook may be classified as being on the "expenditure" side rather than the "tax" side of Public Finance, though this distinction is probably too sharp to be accurate. In Volume I, Dieter Bös surveys the literature on public sector pricing, which is closely related both to the optimal taxation discussion in Auerbach's chapter and to Robert Inman's consideration, in Volume II, of models of voting and government behavior. The question of voting and, more generally, of public choice mechanisms is treated by Jean-Jacques Laffont in his chapter. The chapters by William Oakland and Daniel Rubinfeld focus on the provision of "public" goods, i.e., goods with sufficiently increasing returns to scale or lack of excludability that government provision is the normal mode. Oakland considers the optimality conditions for the provision of goods that fall between Samuelson's (1954) "pure" public goods and the private goods provided efficiently by private markets. Rubinfeld surveys the literature on a special class of such goods: local public goods. Since the work of Tiebout (1956), much research has been devoted to the question of whether localities can provide efficient levels of public goods.

The other two chapters in Volume II also deal with problems of public expenditures. Anthony Atkinson considers the effects of the range of social welfare programs common in Western societies aimed at improving the economic standing of the poor. Some of these policies are touched on in the chapters by Hausman and Rosen, but the coexistence of many different programs itself leads
to effects that cannot be recognized by examining such programs seriatim. Jean Dreze and Nicholas Stern present a unified treatment of the techniques of cost-benefit analysis, with applications to the problems of developing countries.

References
Musgrave, R.A., 1959, The theory of public finance (McGraw-Hill, New York).
Samuelson, P.A., 1954, The pure theory of public expenditure, Review of Economics and Statistics 36, 387-389.
Tiebout, C.M., 1956, A pure theory of local expenditures, Journal of Political Economy 64, 416-424.
CONTENTS OF THE HANDBOOK

VOLUME I

Editors' Introduction

Chapter 1
A Brief History of Fiscal Doctrine
R.A. MUSGRAVE

Chapter 2
The Theory of Excess Burden and Optimal Taxation
ALAN J. AUERBACH

Chapter 3
Public Sector Pricing
DIETER BÖS

Chapter 4
Taxes and Labor Supply
JERRY A. HAUSMAN

Chapter 5
The Effects of Taxation on Savings and Risk Taking
AGNAR SANDMO

Chapter 6
Tax Policy in Open Economies
AVINASH DIXIT

Chapter 7
Housing Subsidies: Effects on Housing Decisions, Efficiency, and Equity
HARVEY S. ROSEN

Chapter 8
The Taxation of Natural Resources
TERRY HEAPS and JOHN F. HELLIWELL

VOLUME II

Chapter 9
Theory of Public Goods
WILLIAM H. OAKLAND

Chapter 10
Incentives and the Allocation of Public Goods
JEAN-JACQUES LAFFONT

Chapter 11
The Economics of the Local Public Sector
DANIEL RUBINFELD

Chapter 12
Markets, Government, and the "New" Political Economy
ROBERT INMAN

Chapter 13
Income Maintenance and Social Insurance
ANTHONY B. ATKINSON

Chapter 14
The Theory of Cost-Benefit Analysis
JEAN DREZE and NICHOLAS STERN

Chapter 15
Pareto Efficient and Optimal Taxation and the New New Welfare Economics
JOSEPH STIGLITZ

Chapter 16
Tax Incidence
LAURENCE KOTLIKOFF and LAWRENCE SUMMERS
Chapter 9
THEORY OF PUBLIC GOODS
WILLIAM H. OAKLAND
Tulane University
1. Introduction

This chapter is concerned with the class of goods which has come to be known as public goods or collective goods. A distinctive characteristic of such goods is that they are not "used up" in the process of being consumed or utilized as an input in a production process. As Samuelson (1954, 1955) has so succinctly stated, public goods have the property that "one man's consumption does not reduce some other man's consumption". This contrasts sharply with private goods which, once consumed or utilized as an input, can no longer be of service to others.

To be considered public, a good must also be of interest to more than one consumer or firm. Otherwise, the fact that the consumption possibilities of others are undiminished is irrelevant. A fireworks display in the middle of an empty desert is a private good even though the same display in a city park would be a public good. A public good may also create disutility or reduced profits. A cigarette smoked in a crowded classroom is an example. The disutility suffered by one non-smoker does not reduce the disutility suffered by other non-smokers. Similarly, the reduced profits suffered by one fisherman because of water pollution do not reduce the loss to other fishermen.

Public goods are of particular relevance to public policy because they tend to be inefficiently provided by private arrangements such as the market mechanism. Consider first the category of public "goods" as opposed to public "bads". Because they are not used up in the act of consumption or production, the marginal cost of extending service to additional users is zero. Private provision of the good, however, will necessitate revenues from users in order to defray the cost of producing the good. Such charges will usually lead some potential users to forgo consumption, creating a deadweight efficiency loss.
A further source of difficulty is encountered if the public good has the property that potential users cannot be costlessly excluded from its services. In the event that exclusion costs are prohibitive, private sellers will be unable to use prices to finance production costs. Instead, the latter will have to be covered by voluntary contributions of users, a mechanism unlikely to provide an efficient outcome because one can enjoy the good whether or not one contributes. If exclusion is feasible, but costly, any outlays for exclusion are themselves a source of inefficiency. While the inability to exclude costlessly exacerbates the efficiency problems of private provision of public goods, it is not essential for market failure. The fact that the marginal cost of additional users is zero is itself sufficient to insure market failure.

This is not true for public "bads", however. If exclusion costs are zero, no one will avail themselves of the potential harm caused by the public bad. If exclusion costs are non-zero, private markets will generally provide a super-optimal supply of public bads. The supplier of the public bad will often have little incentive to consider the damages imposed on others, whether the damages arise from exclusion outlays or actual harm from the public bad. Nevertheless, the question of whether such public bads will be overprovided by private arrangements has been a source of contention in the literature. A full treatment of this issue is deferred until later in the chapter.

We will begin our analysis with the derivation and interpretation of the efficiency conditions for pure public goods. These conditions are then generalized to cover cases which are intermediate between pure public goods and pure private goods, such as externalities. The positive issues of how private market arrangements will provide for public goods are treated next. The analysis concludes with a discussion of alternative mechanisms for making public goods decisions within the public sector.
2. Pure public goods

Consider an economy consisting of N individuals. Let X be the quantity of a good which is produced and x^i be individual i's consumption of the good. Then a good is purely public if

x^i \le X,  i = 1, \dots, N,  (1)

with the possibility that the equality can hold for all i simultaneously. The weak inequality is used to allow for the possibility of exclusion; if exclusion costs are prohibitive, (1) will take the form of an equality.
On the other hand, a good is purely private if

\sum_{i=1}^{N} x^i \le X.  (1a)
That is, the sum of the individual consumptions cannot exceed total production. Given production, and assuming N = 2, the consumption possibilities for pure public and pure private goods are shown in Figure 2.1. For private goods, consumption possibilities are located within the triangle XOX, while for public goods they must fall within the rectangle AXOX if exclusion is costless. If exclusion is not possible, the consumption possibilities for the public good collapse to the point A.

Inspection of Figure 2.1, or comparison of equations (1) and (1a), quickly reveals the fundamental difference between pure private and pure public goods: the total absence of conflict between the consumption of different individuals when the good is public. In Figure 2.1, for example, person 1's consumption is irrelevant to person 2's consumption possibilities when the good is public. If the good is private, and total production is allocated to consumers, an increase in person 1's consumption must be accompanied by an equal decrease in person 2's possibilities. As Musgrave (1969) has aptly stated, public goods involve no "rivalryness" among consumers while private goods involve perfect "rivalryness".
Figure 2.1
Let us turn to the question of optimal resource allocation in a world with public goods. Without loss of generality we can consider a world with two goods, one purely public and the other purely private. Efficiency is defined by the following problem:

\max u^1(y^1, x^1)  subject to

B^j \equiv \bar{u}^j - u^j(y^j, x^j) \le 0,  j = 2, \dots, N,

x^i \le X,  i = 1, \dots, N,

\sum_{k=1}^{N} y^k \le Y,  F(X, Y) = 0,

where y^i is individual i's consumption of the private good, \bar{u}^j is a prescribed utility level for individual j, Y is total output of the private good, and F is the economy's transformation function.
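The first-order conditions of this problem (referred to later as conditions (2)-(6)) have as their content the familiar Samuelson rule: the sum of individual marginal rates of substitution between the public and the private good must equal the marginal rate of transformation. A minimal numerical sketch of the rule, under assumed quasilinear preferences and a constant marginal cost of the public good (all parameter values hypothetical):

```python
# Hypothetical two-consumer illustration of the efficiency problem above.
# Utilities are quasilinear: u_i = a_i * ln(X) + y_i, and the transformation
# frontier is linear, so the marginal rate of transformation is the unit cost c.
a = [3.0, 5.0]   # assumed marginal-valuation parameters, one per consumer
c = 2.0          # assumed constant marginal cost of the public good

# Samuelson rule: sum_i a_i / X = c  =>  X* = (a_1 + a_2) / c
x_star = sum(a) / c
print(f"efficient public good level: X* = {x_star}")        # 4.0
print("sum of MRS at X*:", sum(ai / x_star for ai in a))    # equals c = 2.0
```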
3.4. Externalities

As mentioned above, public goods are sometimes provided as a by-product of the provision of some private good. The public goods so provided may generate either positive or negative benefits. Cigarette smoke serves as an example of the latter, whereas inoculation against communicable diseases represents an example of the former. What appears to differentiate these cases from pure public goods is
that rivalry exists among consumers for the good which gives rise to the external benefits or costs. That is, an increase in one person's allocation of cigarettes must be at the expense of some other person's allocation. Nevertheless, such a distinction is more apparent than real. Goods with external effects fit the mold of pure public goods precisely.7

Consider a two-good world where one of the goods is strictly private, whereas the other has the property that, in addition to the benefits it provides the direct consumer, it confers positive benefits on other agents in the economy.8 Moreover, let the external benefits have the property of pure public goods in that one person's consumption of the externality does not reduce the consumption possibilities of others. Let z^{ij} denote individual i's consumption of the external benefits generated by individual j's consumption. The utility function of each individual can be written

U^i = u^i(y^i, x^i, z^{i1}, \dots, z^{i,i-1}, z^{i,i+1}, \dots, z^{iN}),  (30)
with the constraints on production and distribution being

\sum_{i=1}^{N} y^i \le Y,  (31)

\sum_{i=1}^{N} x^i \le X,  (32)

z^{ij} \le x^j,  i = 1, \dots, N,  j = 1, \dots, N,  j \ne i,  (33)

F(X, Y) = 0.  (34)
As before, y^i represents the private good. x^i now corresponds to an individual's allocation of the good, X, which gives rise to the external benefits z^{ji} (j \ne i). Note that from the standpoint of individual j, x^i (i \ne j) is a pure public good. On the other hand, from the vantage point of individual i, x^i is a pure private good, for i's allocation of x can only be increased by reducing someone else's allocation. The good x, therefore, appears to have properties of both private and public goods.

To consider the optimal resource allocation for such an economy, let us recognize that exclusion from external benefits can never be optimal. Also assume that individuals are not sated by their direct consumption of X. Hence, (32) and (33) will hold as equalities. Our problem, then, is to maximize

U^1(y^1, x^1, \dots, x^N)

7 See Samuelson (1969b) and Mishan (1969, 1976).
8 As with our discussion of pure public goods we will focus on the case of positive benefits. Negative benefits create no new issues that have not already been discussed.
subject to

B^j \equiv \bar{u}^j - u^j(y^j, x^1, \dots, x^N) \le 0,  j = 2, \dots, N,

\sum_{k=1}^{N} y^k \le Y,  \sum_{i=1}^{N} x^i = X,  F(X, Y) = 0.
f' > 0 and f'' < 0. Output can be divided among expenditures on a public good, X, and private goods, Y. If individuals are identical, ny = Y, and the region's budget constraint is

n y + C(X) = f(n).  (65)

Consider the problem of choosing the optimal level of public good for a fixed population, where the utility function is u(y, X). The allocation must satisfy the Samuelson rule,

n u_X / u_y = C'(X),  (66)

and the constraint (65). Together, these define the indirect utility function V(n), where

V'(n) = (u_y / n)(f'(n) - y).  (67)

As in the case of clubs, there is insufficient structure to assure that V(n) has a maximum or, if it does, that it is unique. As before, we will assume that (67) has a unique zero and that V'' < 0 at this point. It follows that the optimal population requires that workers be paid, after taxes, a wage equal to their marginal product. Moreover, it will be the case that

C(X) = f(n) - n f'(n).  (68)

That is, aggregate expenditure on public services is equal to aggregate rents. Moreover, if the underlying production function exhibits constant returns to scale in labor and land, public expenditures will equal land rents. Since the structure of this problem is so similar to the club model, there is little merit in pursuing a parallel discussion. Instead, we can conclude with a statement of the welfare optimum condition for the division of an identical population among two regions:
y^1 - y^2 = (f_1'(n_1) - f_2'(n_2)) + \Omega,  (69)

where \Omega is a term involving the social welfare function that vanishes when utility is equal in the two regions.
This condition is equivalent to (59'). Thus, differences in private income should reflect wage differences only if utility is the same in the two regions.23 If the high wage region is also the region with greater utility, it must be the case that differences in private consumption are less than the wage differentials. Here, a tax on the wages of the workers in the high wage region would be necessary to achieve an optimum. On the other hand, if the high wage region is the one with lower utility, we would need a subsidy to wages in that region. For efficiency, we only require that the interests of a potential migrant be in opposition to the interests of his community. Thus, a region which would prosper by a reduction in population would not contain any members who would voluntarily move out.
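Equation (68) is a version of what is sometimes called the Henry George theorem: at the optimal population, aggregate land rents exactly cover public expenditure. A small numerical check under an assumed Cobb-Douglas technology (all parameter values hypothetical):

```python
# A numerical check of (68) under an assumed technology f(n) = A * n**alpha,
# which is constant-returns in labor and (fixed) land taken together.
A, alpha = 10.0, 0.5

def f(n):
    return A * n ** alpha

def f_prime(n):
    return alpha * A * n ** (alpha - 1)

n = 25.0                                   # assumed optimal population
rents = f(n) - n * f_prime(n)              # aggregate land rents
print("aggregate rents f(n) - n f'(n):", rents)       # 25.0
print("rent share of output:", rents / f(n))          # = 1 - alpha = 0.5
# Equation (68) says the efficient budget sets C(X) equal to these rents.
```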
4. Private provision of public goods

We turn our attention now to the question of how, in the absence of governmental intervention, private markets would tend to provide for public goods. Without such information it is not possible to decide whether public intervention is warranted and, if it is, what form it should best take. The market's performance will depend critically upon the characteristics of the public good, such as the possibilities for exclusion, the number of individuals who derive utility from it, the size of the benefits to direct consumers, and the scale economies of public good production. Moreover, it may also depend upon the legal-institutional framework within which market transactions occur.

Because of these complexities, few generalizations can be made beyond the observation that private markets will tend to underprovide public goods and to overprovide public bads. Whether market performance is sufficiently inadequate to warrant public intervention will depend upon the particular circumstances. Indeed, there may be cases where private markets will perform efficiently. Owing to this diversity of possible outcomes we will only discuss some of the more important special cases.
4.1. No exclusion, small number of participants

4.1.1. "Public goods"

Consider a situation where the number of consumers deriving utility from a public good is small and where exclusion is impossible. An example of such a case would be a small island inhabited by two consumers, A and B.

23 Flatters et al. (1974) derive an expression similar to (69) but without the term involving the welfare function. This difference reflects their requirement that utility in the two regions must be equal.
Each is interested in the control of mosquitoes.24 If A eliminates mosquitoes it benefits B, and vice versa. To keep the argument simple, we assume that the resources available to each consumer are fixed and that it costs one unit of private good to obtain one unit of mosquito control. That is, for each consumer we have

x^i + y^i = Y^i,  i = A, B,  (70)

where x^i is mosquito control purchased by i, and y^i and Y^i are private good consumption and income, respectively. Each consumer has a utility function

U^i = u^i(x^A + x^B, y^i),  (71)

where we assume that each individual derives equal utility from his own and the other party's mosquito control. To begin with, let us assume that the individuals live at opposite ends of the island so that each is unaware of the other's mosquito control activity except as it is reflected in the size of the mosquito population. Hence, each will take the x of the other as given when making his own choices. The first-order conditions for utility maximization are then (70) and

u_x^i / u_y^i = 1,  i = A, B.  (72)

Solving equations (70) and (72) for each party yields the reaction functions:

x^A = x^A(x^B);  x^B = x^B(x^A).
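Before turning to the diagram, a minimal sketch of this game may help. It assumes log preferences, u^i = a log(x^A + x^B) + y^i, and incomes large enough for interior solutions; all numbers are hypothetical:

```python
# A hypothetical parameterization of the two-consumer game (70)-(72):
# u_i = a*log(xA + xB) + y_i, with income large enough for interior choices.
a = 1.0   # assumed preference weight on total mosquito control

def best_response(x_other):
    # (72): u_x / u_y = a / (x_own + x_other) = 1  =>  x_own = a - x_other
    return max(a - x_other, 0.0)

xA, xB = 0.2, 0.2
for _ in range(50):              # sequential best replies converge to point E
    xA = best_response(xB)
    xB = best_response(xA)

total = xA + xB
print("equilibrium total control:", total)           # = a = 1.0
print("sum of marginal benefits :", 2 * a / total)   # = 2, vs. unit cost 1
# Efficiency requires 2a / G = 1, i.e. G* = 2a: twice the equilibrium level,
# matching the text's point that marginal benefit is twice marginal cost.
```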
Assuming that an increase in the other's control activities leads to a reduction in one's own, these reaction functions will appear as in Figure 4.1. Equilibrium levels of x^A and x^B are given by point E in the diagram.

Figure 4.1

Clearly, at E the level of mosquito control is sub-optimal. For, by (72), the total benefit of an additional unit of mosquito control equals 2, while the cost of the unit is only 1. It would appear that the extent of market failure in this case is quite substantial, marginal benefits being twice marginal cost. But this result stems from the assumption that each party is indifferent between his direct consumption of mosquito control and that of the other consumer. This would be the case for a pure public good. If, however, we were considering a consumption externality, the inefficiency might be quite modest. This would be so if each consumer derives considerably more satisfaction from his own consumption than from that of the other party. Whether or not significant resource misallocation develops in this case turns upon the relative marginal utility of direct and indirect consumption of the good.

Some authors, however, would deny that any misallocation would result from market provision in this particular case.25 At points such as E in Figure 4.1, both parties will be aware of the mutual benefits of expanding mosquito control. For example, if B were to pay A for expanding his consumption, A would have the incentive to do so. As long as the aggregate benefits of expansion exceed production costs, it will be possible for the two parties to work out a cost-sharing scheme for additional output that will benefit both. Such possibilities are shown in Figure 4.2, where the indifference curves of A and B are superimposed on the reaction functions.26 Note that the reaction functions pass through the peaks of the indifference curves. Consequently, at the point of intersection, E, the indifference curves also intersect. By sharing costs of additional units of x, the parties could move to a point between P and P' on the "contract curve", the locus of efficient outcomes.

24 This example is drawn from Buchanan (1968).
25 See, for example, Buchanan (1968, pp. 45-46).
26 The indifference curves are arrived at by substituting for y from the budget constraint into the utility functions.
Figure 4.2
Since both parties are interested in higher levels of utility, the argument goes, the necessary agreement will be forthcoming.

While this logic is appealing, it flies in the face of reality. Consider defense expenditure. Clearly, all nations would be better off if such expenditure were eliminated. Nevertheless, it continues and has tended to escalate. A moment's reflection suggests that defense outlay is exactly analogous to the mosquito control example; one nation's outlay creates a negative externality for the other. The failure of bargaining to achieve an efficient outcome owes, in large measure, to the multiplicity of efficient outcomes.27 In Figure 4.2, all points along Z^A and Z^B are efficient and have the property that no person is worse off than he/she would be in isolation. Since the points Z^A and Z^B carry substantially different utility to the parties, it is not implausible that each should attempt to capture as large a share of the gains as possible, even if it means not achieving an efficient outcome. Indeed, the point E, itself an inefficient outcome, may not be achievable via bargaining. While E corresponds to the non-cooperative "Nash" solution, the subjective nature of the benefits gives both parties the incentive to under-represent their true preferences. Thus, the reaction curves revealed during the bargaining process may be considerably below those shown in Figure 4.2.

The preceding difficulties caused by bargaining are referred to as transactions costs by those who believe market outcomes tend towards efficiency.28 It is alleged that, when the number of parties is small, transactions costs will also be small. The remarks in the preceding two paragraphs suggest this view may be an act of faith rather than a scientifically derived hypothesis.

27 For a similar view about the limitations of bargaining, see Lipnowski and Maital (1983).

4.1.2. "Public bads"
The analysis of the previous subsection carries over with little modification to the instance where the externality generates negative benefits. Nevertheless, since this case has commanded wide attention in the literature it is useful to spend a little time on it.
In a celebrated article, Coase (1960) argued that as long as property rights were defined unambiguously, private markets would tend to allocate goods with negative externalities efficiently when the number of parties involved is small. A corollary to this hypothesis is that it makes little difference whether the property right is assigned to the person generating the externality or to the party affected by it.

When the externality confers positive benefits the issue of property rights does not arise: it is difficult to conceive of the benefactor having the legal right to force an individual to increase his/her consumption of the externality-generating activity.29 In the case of a negative externality, however, it is not unreasonable to confer on the party being damaged the right to enjoin the damaging activity. Consider the case of air pollution. If the pollutee has the right to clean air, the polluter will have to bribe him for permission to emit pollutants. The level of the bribe will reflect the benefits to the polluter from using the air to eliminate wastes. If damage is a continuous increasing function of emissions, the pollutee will allow emissions up to the point where the marginal damage to him is equal to the payment for the marginal unit of pollutant. The latter, in turn, will be equal to the marginal benefit to the polluter. Hence, pollution will be carried out at optimal levels. If the polluter could not be enjoined by the pollutee, it would be the pollutee who would have to do the bribing. By similar reasoning, efficiency would again be reached.

The reader will recognize the parallelism between this reasoning and that used to demonstrate efficiency for the case of positive externalities. Hence, it shares the same limitations. Namely, there are incentives for the parties to misrepresent their preferences so as to gain a disproportionate share of the gains from trade.

28 See, for example, Coase (1960).
29 A possible exception may be compulsory elementary education.
For example, if the polluter has the property right, he will have an incentive to overstate the level of pollution that he would generate in the absence of any monetary payment. This will have the effect of raising the willingness of the pollutee to pay for any level of reduction in emission levels. While such a strategy could still yield an optimal level of pollution, it would not do so if the pollutee misrepresented his true preferences. The latter has strong incentives to do so, in view of the fact that the total payment he will be asked to make will depend upon his revealed willingness to pay for inframarginal units.30 In other words, the polluter will attempt to extract the pollutee's entire consumer surplus from emission reduction. He will do so by making an all-or-nothing offer based upon his estimate of the pollutee's demand curve for pollution reduction. Recognizing that some consumer surplus is better than none, the pollutee will understate his demand for reduction. This tendency, together with the polluter's incentive to overstate base levels of pollution, will lead to super-optimal levels of pollution.31

The major distinction between external "bads" and external "goods" has to do with distributional outcomes rather than with efficiency properties. By varying the locus of property rights, society can affect the relative bargaining strength of the parties. But, in itself, this will not insure efficient behavior through the private market.

30 This incentive to free ride at the expense of others is discussed more thoroughly below.
31 If the externality is between two firms, a merger could eliminate the problem. Mergers, however, have costs of their own. See Coase (1937) and Williamson (1975).
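A schematic check of this bargaining logic, with assumed linear marginal benefit and marginal damage schedules (all parameters hypothetical):

```python
# Assumed linear schedules for emissions e:
b0, b1 = 10.0, 1.0      # polluter's marginal benefit:  MB(e) = b0 - b1*e
d1 = 1.5                # pollutee's marginal damage:   MD(e) = d1*e

# Efficient emissions equate marginal benefit and marginal damage; with
# truthful bargaining either assignment of the property right supports this
# level, and only the direction of payment differs.
e_star = b0 / (b1 + d1)
print("efficient emissions:", e_star)                          # 4.0
print("MB(e*):", b0 - b1 * e_star, "= MD(e*):", d1 * e_star)   # 6.0 = 6.0
# The misrepresentation incentives described in the text push e above e*.
```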
4.2. No exclusion, large number of participants

The inefficiency in the small number of participants case arises out of the bilateral monopoly nature of the problem. It is well known that similar problems would arise in bilateral bargaining over private goods. As such, the problem has more to do with game theoretic considerations than with the public nature of the good. When the number of participants is large, however, this similarity disappears. It can be demonstrated that by sufficiently increasing the number of actors in a market for private goods, the outcome would tend towards efficiency via the mechanism of perfect competition. With a public good, however, the outcome will diverge further from the efficient outcome.

In markets for private goods, when the number of participants is large, individual actors become price-takers. In public good markets, on the other hand, people become quantity-takers. Consequently, no agent feels that he/she can influence the amount of public good which is made available. The rational agent, therefore, will attempt to "free ride" on the public good supplies of others. The consequence will be gross underprovision of public goods and gross overprovision of public bads.
This result can be shown more rigorously by expanding the number of agents in the mosquito control example to N, a large number. Unlike the small numbers case, however, it is now rational for each agent to take the mosquito control activities of others as independent of his/her own choices. The result will be equation (70) and

u_x^i / u_y^i \le 1,  i = 1, \dots, N.  (72a)

The inequality is written in recognition of the fact that some agents may choose to devote no resources to mosquito control. This would be the case if the costs of the public good are large relative to any individual's direct benefit. Assume that (72a) holds as an equality for K < N agents. Then we have, in equilibrium,

\sum_{i=1}^{N} u_x^i / u_y^i \ge K,  (73)

with the strict inequality holding if at least one of the remaining (N - K) individuals places positive value on the last unit of mosquito control.

Just as it was inappropriate to measure the extent of market failure by expressions such as (73) in the small numbers case, it is inappropriate to use it for that purpose here. Even if K were zero there could be cases of serious underprovision. Thus, while no individual could afford a nuclear submarine, aggregate benefits could exceed aggregate costs by a large factor. The same would be true in a case of externality where one's direct marginal benefit from a good vastly exceeds the external marginal benefit of direct consumption by others. The important point is that the opportunity for bargaining which is present in the small numbers case virtually disappears when the number of actors is large. If bargaining costs are proportional to the number of agreements, they would increase with the square of the size of the bargaining group. It is unlikely, therefore, that we would obtain a level of public good which exceeds that indicated by (70) and (72a).
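Under the same assumed log preferences as in the earlier sketch, the gap between the equilibrium and efficient levels grows linearly with the number of agents:

```python
# A sketch of how free riding worsens as the group grows, assuming
# u_i = a*log(G) + y_i for every agent and a unit cost of provision.
a = 1.0

for n in (2, 10, 100, 1000):
    g_nash = a          # privately, no one buys beyond the point a/G = 1
    g_eff = n * a       # Samuelson rule: n*a/G = 1  =>  G* = n*a
    print(f"N={n:5d}  Nash G={g_nash:.1f}  efficient G*={g_eff:7.1f}"
          f"  provision ratio={g_nash / g_eff:.3f}")
# The equilibrium level is independent of N, so the shortfall grows without
# bound: the sum of marginal valuations at the Nash point is N times cost.
```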
4.3. Exclusion possible, large number of participants

We now consider the class of public goods for which exclusion is costless. This opens the possibility of market provision by private firms. If economies of scale are reached at low levels of output relative to the total output of public good, the number of firms who participate in the market will be large. For analytic convenience we assume that each firm produces one unit of the public good at
cost c.32 The total revenues of a firm will be given by p · n, where p is the price charged each customer and n is the number of customers served. Assuming free entry, it follows that for every firm

p = c/n.  (74)

Thus, the price of a particular unit will vary inversely with the number of individuals consuming it.

Now suppose that all consumers have the same demand function, D(p). Then D(c/N) units would be produced and each sold at the uniform price c/N, where N is the number of individuals in the economy. Lower priced units could not be sustained because they could not cover production costs. Higher priced units would not be available because people can obtain all they want at the prevailing price. Clearly, such an equilibrium is Pareto optimal, for we have N · D^{-1}(X) = c, where D^{-1} is the inverse demand function and corresponds to the consumer's marginal valuation of the public good.

In general, however, consumers will not have identical demand functions. At the lowest possible price, c/N, firms will provide sufficient output so as to satisfy the individual with the lowest demand. Thus, individuals with excess demand at the price c/N will have to pay higher prices to obtain additional public good consumption. If individual demands at each price are distinct, then, at the price c/(N - 1), firms will produce sufficient output so as to satisfy the individual with the second lowest demand, etc. Under these circumstances, the market will provide for public goods at a continuum of prices between c/N and c. At c, only the person with the largest demand will be purchasing that firm's output. Moreover, it will only be this individual who consumes the entire industry output.33

The equilibrium described above is characterized by variation in the prices at which individual units of public good are sold. Note, however, that all consumers of a given unit pay the same price. Also, arbitrage cannot be used to eliminate these price differentials as it would if the good were a private good. Those who purchase the higher priced units are consuming all lower priced units; hence, purchase for resale would not be possible.

It is obvious that the equilibrium described above is not Pareto optimal. Efficiency requires that all consumers have access to all units which are produced. But some individuals, facing prices above c/N, choose not to consume the higher priced units. It can also be seen that the market produces sub-optimal levels of public good. For the last unit produced we have

MRS^N = c,  (75)

where MRS^N corresponds to the largest demander's marginal valuation. If, after allowing all individuals to expand, at no charge, their consumption to the level of industry output, some individual other than N places any positive benefit on additional consumption, we have, by virtue of (75),

\sum_{i=1}^{N} MRS^i > c.  (76)

Thus, unless all but the Nth individual are satiated by the equilibrium output of public good, it would be worthwhile to expand production.

While the preceding characterization of market equilibrium has won some acceptance in the literature, there have been other formulations of the problem. One notable example has been set forth by Thompson (1968), who concludes that markets may actually overprovide public goods. In his scheme, the preferences of individual consumers are perfectly known to firms and to other consumers. Each firm, therefore, will attempt to extract from each consumer his/her maximum willingness to pay for that unit. But since each firm also sees itself as the marginal supplier to any particular individual, it will recognize that charging a person his/her marginal valuation will lead to consumer surplus on units already purchased. It will, therefore, attempt to capture the latter source of benefit as well. The result is that each firm charges the individual his/her average valuation for its unit of public good. This leaves the consumer with no surplus whatever. In effect, the economy is no better off with public goods than without. In terms of the Samuelson condition we have

\sum_{i=1}^{N} MRS^i < \sum_{i=1}^{N} AV^i = c,  (77)

where AV^i is the individual's average valuation of the public good and is greater than marginal valuation because the latter is decreasing.

While clearly a novel approach, its informational requirements strain the imagination. Indeed, it is precisely the problem of determining consumer preferences which gives rise to the efficiency problems associated with public goods. To assume it away is to render the question of private versus public provision moot. Efficiency could always be achieved by providing the public good in the government sector.

32 In essence, the minimum average cost of production is reached at a level which we normalize to be equal to one unit.
33 For a more detailed discussion of this equilibrium, see Oakland (1974).
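The following sketch constructs the price schedule implied by (74) for a hypothetical economy with three consumers and assumed linear demands D^i(p) = max(a_i - p, 0):

```python
# A stylized construction of the competitive price schedule implied by (74):
# each unit sells at c/m, where m is the number of consumers sharing it.
c = 1.0
a = [2.0, 3.0, 5.0]                 # assumed demand intercepts, ascending
N = len(a)

segments, start = [], 0.0
for j, m in enumerate(range(N, 0, -1)):     # m consumers share these units
    price = c / m
    end = max(a[j] - price, 0.0)            # j-th lowest demander stops here
    if end > start:
        segments.append((start, end, price, m))
        start = end

for s, e, p, m in segments:
    print(f"units ({s:.2f}, {e:.2f}] priced at c/{m} = {p:.3f}")
# The last units are bought only by the largest demander at price c.
# Consumers facing prices above c/N stop short of total output, the
# inefficiency summarized by (76).
```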
4.4. Exclusion possible, single seller

4.4.1. Monopoly seller - uniform price

Because of economies of scale or considerations of space, private provision of public goods will often involve monopolistic elements.34 Examples would be highways, airports, waterways, parks, zoos and museums. As long as the facility remains uncongested, each meets the criteria of an excludable pure public good.

Consider the case of a single seller of the public good. Depending upon the firm's information regarding the tastes of its clients, it will choose a pricing strategy to maximize profits. To begin with, assume that the firm is constrained to charge a single, uniform price to each of its customers and, for simplicity, that it produces under constant costs. The firm's problem is to maximize

p \sum_{i=1}^{N} x^i - cX  (78)

subject to

x^i \le D^i(p),  (79)

x^i \le X,  (80)

where x^i is the ith individual's consumption, D^i(p) is his demand function and X is the total production of the public good. Defining \lambda^i and \mu^i to be the shadow prices of constraints (79) and (80), respectively, the first-order conditions are

\sum_{i=1}^{N} x^i + \sum_{i=1}^{N} \lambda^i \, dD^i/dp = 0,  (81)

\sum_{i=1}^{N} \mu^i - c = 0,  (82)

p = \lambda^i + \mu^i,  (83)

where \lambda^i \ge 0 and \mu^i \ge 0. Consumers can be classified into two mutually exclusive sets: those for whom x^i < X and hence \mu^i = 0, and those for whom x^i < D^i(p) and hence \lambda^i = 0. Label the sets R and \bar{R}, respectively.

34 This case is discussed in Burns and Walsh (1981), Brennan and Walsh (1981) and Brito and Oakland (1980).
If the sets were not mutually exclusive, we would have, for some i, x^i < D^i(p) and x^i < X. But for such individuals the firm could let x^i rise to the smaller of D^i(p) and X without changing its costs, thereby increasing profits.35 Thus we can rewrite our first-order conditions as

\sum_{i=1}^{N} x^i + p \sum_{R} dD^i/dp = 0,  (84)

n(\bar{R}) p = c,  (85)

where n(\bar{R}) is the number of people in \bar{R}. Equation (85) is the profit-maximizing condition for X. Since only members of \bar{R} utilize the entire output, only they are relevant to the output decision. Moreover, since they have excess demand at X, price does not have to be lowered to induce them to consume marginal increases in output. Similarly, (84) is the profit-maximizing condition for price: for any given X, (84) determines the revenue-maximizing price. Since only those in set R are not capacity constrained, only they are sensitive to changes in price.

Let us examine the efficiency implications of this solution. First, as long as
consumers are not identical, some are excluded from consuming total output. Second, we have, by (85) and the nature of those belonging to \bar{R},

\sum_{\bar{R}} MRS^i > n(\bar{R}) p = c.  (86)
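The wedge in (86) can be exhibited numerically. The sketch below searches a grid for the profit-maximizing uniform price and output, using the same assumed linear demands as the competitive sketch above:

```python
# A brute-force sketch of the uniform-price problem (78)-(80), with the
# assumed linear demands D_i(p) = max(a_i - p, 0) used earlier.
c, a = 1.0, [2.0, 3.0, 5.0]

def profit(p, X):
    return p * sum(min(max(ai - p, 0.0), X) for ai in a) - c * X

best = max(((p / 100, X / 100) for p in range(1, 501) for X in range(1, 501)),
           key=lambda t: profit(*t))
p_m, X_m = best

mv = sum(max(ai - X_m, 0.0) for ai in a)   # sum of marginal valuations at X_m
print(f"monopoly price {p_m:.2f}, monopoly output {X_m:.2f}")   # ~1.83, ~3.17
print(f"sum of marginal valuations at X_m: {mv:.2f} > c = {c}")
# The capped consumer's marginal valuation exceeds c, as in (86), and output
# falls short of the competitive level (4.0 in the earlier sketch).
```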
Thus, the monopolist produces too little public good. While the same deficiencies are present under competition, they are less severe: there is less exclusion and more production under competition than under monopoly. While this proposition can be demonstrated for the general case, we confine our proof to the case where the public good is not an inferior good and where individuals can be uniquely ranked in terms of their intensity of demand for the public good.36

Clearly, those in set R consume more at lower prices under competition. For under monopoly they pay the uniform price c/n(\bar{R}), whereas under competition the prices they pay range from c/N up to c/(N - k) = c/n(\bar{R}), where k is the individual with the largest demand in the set R. Now consider those in \bar{R}. Under competition they were not capacity constrained at the price c/n(\bar{R}). If they had excess demand at c/n(\bar{R}) they were able to call forth additional production. Thus, they are better off. Moreover, since they demanded at least as much under competition at prices up to c/n(\bar{R}), output is greater under competition.

35 It is also clear that neither set could be empty. For if R were empty, (81) would be violated. If \bar{R} were empty it would imply that all people were on their demand curve at X, which is inconsistent.
36 This means that for any price list p = f(x), individuals' notional demands maintain the same ranking.

4.4.2. Monopoly provision - price discrimination
(87)
where I* and I** are the lower and upper bounds of income, respectively.
Confronted with this price schedule, each consumer maximizes utility, u(x, y), subject to the budget constraint R(x) + y = I, where the price of the private good is normalized to be unity. The consumer will behave such that
R'(x) = That . (88) is, consumption of the public good is carried out to the point where they That is, consumption of the public good is carried out to the point where the 3'The discussion here draws heavily on Brito and Oakland (1980).
Ch. 9: Theory of Public Goods
521
consumer's marginal valuation is equal to marginal price. If we substitute for y from the budget constraint into the utility function and totally differentiate with respect to income, we find, by virtue of (88), du d
=
(x(I),
(I)),
(89))
where x(I) and y(I) are the Engel curves obtained by solving (88) together with the budget constraint. If aggregate income in the economy, Y= fIf(I)dI, is taken to be fixed, the firm's maximand can be re-expressed as Y-JfI*y(i)f (I )dI- cX.
(90)
In effect, the maximization of the firm's revenue is equivalent to minimizing aggregate consumption of the private good. The last step is to substitute for y in terms of u and x from the definition of the utility function, y =y(u, x). Thus the firm's problem is to maximize -
y(x(I), u(I))f(I)dI- cX
(91)
subject to x(I) < X
(92)
and (89). Now this is a standard control problem with u as the state variable and x as the control variable. Its solution will yield the trajectories u(I) and x(I), which together with the utility function and the budget constraint can be used to construct R(x(I)), the profit maximizing price schedule. The mechanics of the solution will not be developed here.3 8 Instead, we shall summarize the results. First, little, in general, can be said about the shape of R(x) other than it will not be equal to px. However, if the utility function is separable and if f(I) is non-decreasing, R'(x) will be everywhere decreasingexactly the opposite of the competitive case. Second, at the optimum, we must have R'(X) n(X) = c,
(93)
where n(X) is the number of people who consume the entire industry output. This is the same result as in the uniform pricing case. Since uy/u, > R'(X) for 38
A formal derivation can be found in Brito and Oakland (1980).
522
W.. Oakland
members of n (X), it follows that the monopolist produces too little public good. If the public good is non-inferior, the monopolist will produce less than in the competitive case. Most striking, however, is the result that the monopolist's marginal prices will be somewhere above and nowhere below those obtained under competition. Thus, not only will production be higher but exclusion will be less under competition than for the monopolist. As with private goods, therefore, monopoly creates excess welfare losses. 5. Public sector provision of public goods Since the economics of public sector decision-making is covered extensively in other chapters in this Handbook, only the highlights will be discussed here. 39 Basically, the task confronting policy-makers is to insure that the Samuelson condition holds. This requires knowledge of how the private sector itself provides for public goods, how it will react to policy parameters, and information on costs and tastes associated with the public good. Obtaining this information is a formidable task - perhaps even unachievable. Nevertheless, even if policy-makers have complete information there are strong reasons to suspect that they will not always implement an efficient allocation. Public sector decision-makers may frequently have personal interests which conflict with an efficient allocation. Our discussion in this section treats these issues separately: (1) how to obtain the
information necessary to achieve efficiency; and (2) how public sector decisionmakers behave.
5.1. Pigouvian taxes and subsidies Before proceeding to these questions it is convenient to discuss an issue which arises in connection with externalities - the possibility of using corrective taxes or subsidies to remove inefficiencies arising from private behavior. Such an approach, if appropriate, is attractive because it minimizes the need for bureaucratic intervention into the resource allocation process, thus avoiding some of the potential wastes identified below. Moreover, this approach harnesses individual self-interest to achieve the desired outcome. Consider the conditions for Pareto efficiency when there is an external effect
from the consumption of commodity i by individual j: MRSjj. + E MRSjk = MC i,
(94)
ja-k 39 See, for example, Chapter 12 by Inman, Chapter 11 by Rubinfeld, Chapter 10 by Laffont, and Chapter 14 by Dreze and Stem in this Handbook.
Ch. 9: Theory of Public Goods
523
where MRSjk is the marginal rate of substitution between the numeraire commodity and j's consumption of the ith good for the kth individual; MC' is the marginal cost, in terms of the numeraire, of producing another unit of commodity i. According to Pigou (1947) economic efficiency can be achieved for such goods by placing a subsidy (tax) on the purchase of commodity i by individual j equal to the external benefits (costs) represented by the second term on the left-hand side of (94). This has the effect of lowering (increasing) the cost of an additional unit of commodity i by an amount equal to the marginal benefit (cost) conferred upon others. If the individual behaves so as to equate his/her direct marginal benefit of commodity i (i.e. the first term on the left-hand side of (94)) with his/her net marginal cost of procuring the unit, it follows that such a subsidy (tax) will result in the equality of the Samuelson condition (94). There have been several objections to this Pigouvian prescription. First, it ignores the possibility that individuals may engage in private transactions to account for external benefits and costs (cf. the discussion of mosquito control above). That is, individuals who enjoy external benefits may offer their own bribes to the person who generates the benefits. If this is indeed the case, the Pigouvian subsidy would lead to oversupply of the public good; for the individual would be doubly compensated for the external benefits.40 This problem is unlikely to arise if the number of participants in the external benefits of an individuals' consumption is large; for beneficiaries will attempt to free ride on one another. But if the number of participants is small, the Pigouvian approach is likely to be inappropriate-though there is also no guarantee that the private solution will prove optimal. An equally serious shortcoming of the Pigouvian prescription arises if externalities introduce non-convexities into the problem. For then the solution to condition (94) may not be unique.4 1 If this is the case, the marginal adjustments introduced by the Pigouvian tax on subsidy may drive the economy away from the global optimum toward a local optimum. While some improvement is necessarily achieved by the adjustment, the entire approach diverts attention from the fact that a fundamentally different approach is necessary to achieve global efficiency. Finally, even if these problems do not arise, the reader should not be lulled into believing that the Pigouvian approach offers a "solution" to the underlying problem of efficiently providing public goods. For its implementation requires knowledge of the marginal external benefits or costs generated by each person's 40This criticism was first made by Buchanan and Kafogolis (1963) although the argument is implicit in Coase's (1960) critique of the Pigouvian approach. 41See Baumol (1972) and Starrett (1972) for a discussion of the problems caused by non-convexities. Typically the problem is treated within the context of production externalities. Baumol (1964), however, argues that the problem can also arise because of consumption externalities.
524
W.H. Oakland
consumption. But if such information were available to policy-makers they could allocate the good directly - without having to resort to correcting market prices. 42 Moreover, the determination of individual preferences is the major obstacle to the achievement of efficiency. There is nothing in the Pigouvian approach which reduces the scope of this information problem.
5.2. Information Consider the simplest case of a pure public good for which exclusion is impossible and where costs of the public good are prohibitive relative to the resources available to any single agent. Thus, if the public good is to be provided at all it must be done through the public sector. Assuming that full information is available on costs of production, it remains to determine individuals' marginal valuations. Inspection of the efficiency conditions (2)-(6) suggests that the latter depend upon the distribution of welfare in the society. Hence, the efficient output of the public good is not unique, in general. This is not surprising since it also characterizes the efficient provision of private goods. Since the underlying income distribution can be modified by taxes and transfers we can take it as given for purposes of this discussion. Nevertheless, even with a given distribution of income the efficient output of the public good is not generally unique. 4 3 The problem here is one of distributing the inframarginal surplus arising from the public good. This issue does not arise with respect to private goods because market institutions have a unique method of distributing the gains from trade. It involves confronting each individual with constant prices for goods he/she consumes or sells on factor markets. Since public goods are not sold, this convenient device is not available. An explicit decision must be made as to how to distribute the surplus. This will, in turn, affect individual marginal valuations for the public good and hence the efficient level of provision. It appears, therefore, that considerations of income distribution are hopelessly entwined with the problem of efficiency. 5.2.1. Voluntary exchange Some economists have attempted to cope with this difficulty by adapting a convention on how inframarginal gains should be distributed. Specifically, they require that individuals be charged (taxed) an amount equal to their marginal 42
If individual consumers generate different marginal external benefits, the corrective tax or subsidy would vary across individuals, negating much of the simplicity of the Pigouvian approach. 43 See Aaron and McGuire (1969).
Specifically, they require that individuals be charged (taxed) an amount equal to their marginal valuation times the level of public good supplied.44 This is equivalent to the method used for allocating surpluses arising from private goods, and is usually referred to as benefit taxation. Since any desired distribution of welfare can be achieved by lump-sum taxes and transfers, this convention separates income distribution issues from those of efficiency.45

This procedure implies that the determination of efficient public good levels simultaneously carries with it a method for financing the public goods. As we shall demonstrate momentarily, it is this connection that renders the approach non-operational. But first note that the convention requires that the policy-maker discover the Marshallian demand curve of each individual for the public good. This demand curve shows the amount of public good desired at each constant tax price. Each point on this demand curve corresponds to the individual's marginal valuation for that level of public good supply. To determine the efficient level of public good, the policy-maker needs only to sum these demand curves vertically and find where the resultant curve intersects the marginal cost schedule for the public good.46 The vertical summation represents the aggregate marginal valuation for each particular level of public good. Since individual tastes and incomes vary, the marginal valuations of individuals will generally differ at the optimum. This means that each individual's tax price will vary, the highest price being charged to the individual with the greatest marginal valuation.

It must be emphasized that the preceding calculations cannot be considered a purely conceptual exercise with actual taxes determined by some other method. For the demand curves are valid indicators of marginal valuations only if taxes are indeed collected according to marginal benefits received.47 Thus, individuals must actually be confronted with such tax prices when the policy-maker obtains the data on individual demand curves.

We can now see why this approach breaks down. Because the policy-maker is not omniscient, the demand curve will have to be supplied by the individual. The threat of exclusion, however, cannot be used to obtain the demand data; hence, the individual will have to supply it voluntarily. If the number of individuals is large, an individual's tax payment will be an insignificant fraction of total taxes. But since total taxes determine the level of the public good supplied, it follows that the individual's response will have little or no impact on output. In this circumstance the rational individual will try to convince the government that his demand curve is identically zero, so as to minimize his tax payments.

44 This is the so-called Lindahl approach. See Johansen (1963).
45 However, if lump-sum taxes are not feasible and only distortionary taxes are open to the policy-maker, the dichotomization of distribution and efficiency is invalid. See Diamond (1974).
46 Strictly speaking, this procedure is correct only if the public good is produced under conditions of constant costs. Otherwise, other taxes or subsidies will be required which will affect the underlying demand curves.
47 An exception would arise if individual demands were independent of income.
W.H. Oakland
individuals will not voluntarily reveal their true preferences to decision-makers. They will instead try to become free riders at the expense of others. 5.22. Incentive compatible mechanisms
The problem of inducing people to reveal their preferences for public goods stems from the association of their stated demand curves with the taxes they are asked to pay.4 8 Some economists have suggested that honest revelation can be induced by making a person's tax payment approximately independent of his stated preferences. Basically, an individual would be asked to pay a two-part tax.4 9 The first part of the tax is used to cover the cost of the public good and is set in advance. Specifically, each person would be preassigned some share of the cost of the public good. The second part of the tax would depend upon how much an individual's revealed preferences would change the outcome. If the individual reports a demand curve which is identically zero, the second tax would be zero. If, on the other hand, the individual reported preferences which caused an increase in the supply of the public good, he would be liable for the additional cost, less the willingness to pay others for the additional units.5 0 The latter is measured by the area under the reported demand curves of others. In such circumstances, individuals would be induced to reveal their true demand curves irrespective of whether they believed others are telling the truth. Hence, all individuals are induced to truthfully reveal their demand curves and efficiency would appear to be in reach. There have been a number of criticisms of this and related mechanisms. 15 Some have argued that the proceeds of the second tax must be wasted if the mechanism is to work. Others have suggested that the mechanisms are vulnerable to the formation of coalitions which could exploit them. Moreover, the mechanisms produce valid measures of marginal valuations only if individual demand curves are independent of income. 52 The most serious shortcoming arises in the large number case. Here the chosen level of public good is approximately 48The late Leif Johansen (1977) argues that the importance of this problem is overstated. This seems, however, to be an act of faith. 49 See Tideman and Tullock (1976), Groves and Ledyard (1977), and Green and Laffont (1979) for examples of such mechanisms. Only simple mechanisms are discussed here. For a thorough discussion of this literature see Chapter 10 by Laffont in this Handbook. 50 This is the so-called "Clarke" tax, as discussed in Tideman and Tullock (1976). The Groves-Ledyard (1977) variant is somewhat different although it carries a similar content. See Fane and Sieper (1983). 5 1See, for example, the Special Supplement to the Spring 1977 issue of Public Choice, 27-2. 52 The Clarke mechanism is particularly sensitive to the point. The individual demand schedules are based upon the premise that the agent actually pays the tax indicated by his tax share. However, under the mechanism, tax shares are preassigned and fixed. Consequently, the revealed demand curves only indicate the agent's marginal valuation if the demand for the public goods is independent of income- a highly dubious condition.
Ch. 9: Theory of Public Goods
527
independent of the individual's response, i.e. the aggregate outcome is approximately the same whether or not the individual refuses to participate by reporting no demand whatever.5 3 Since the determination of one's own demand curve requires considerable thought, many people would simply not participate or would seriously understate their demand.5 4 Consequently, this mechanism is of little practical use when the number of taxpayers is large. And if the number is small it is subject to the waste, coalition and income effect criticisms mentioned above. 5.2.3. Voting
The ballot box is the principal method employed by democratic societies to elicit citizen's demands for public services.55 Wicksell (1896) has shown that, under certain idealized circumstances, the voting mechanism can generate an efficient supply of public goods. He demonstrated that if public good expenditures and tax shares were decided simultaneously, a unanimity voting rule would lead to efficiency. Assuming voters vote in favor of any program which increases their benefits by no less than their taxes, only programs which offer a Pareto improvement can be passed. As long as public expenditures are inefficient it will be possible to formulate a new program which could command unanimous consent. If voting is costless, this process would continue until Pareto optimality is achieved. Unfortunately, the conditions necessary to make a unanimity rule viable are not met in practice. For one thing, voting is not costless. The probability that an alternative tax-spending proposal does not harm anyone is virtually zero once some minimal level of public service is provided. Thus, the unanimity rule would favor the status quo, at sub-optimal levels of provision. Also, it is likely that some individuals will behave strategically, vetoing proposals that benefit them in the hope that some superior alternative will be proposed. Thus, the Wicksellian unanimity rule has little practical value. It does, however, offer insights into why alternative voting rules tend to be ineffective. For with any plurality short of unanimity, proposals could be adopted which redistribute income. Since voters cannot distinguish efficiency gains from redistribution gains, non-unanimous voting rules hopelessly entangle efficiency and redistributional considerations. It is not surprising, therefore, that voting has serious shortcomings as a preference revelation mechanism. Indeed, analysts have 53 In the context of the Groves-Ledyard mechanism, failure to participate would be manifested by asking for no change from what others demand. 54 This point has not received the attention in the literature it deserves. The mechanisms require sophisticated behavior on the part of the agents when formulating their own choices, but assume that taxpayers are naive concerning how their response will actually affect the outcome. 55For an excellent survey of the voting mechanism, see Chapter 12 by Inman in this Handbook. Also see Mueller (1979).
528
W..
Oakland
generally concluded that most voting processes are incapable of determining an equilibrium outcome - i.e. one that cannot be defeated by an alternative. Some analysts have attempted to overcome the lack of existence by assuming that tax shares are fixed - say by the constitution. In such cases, voters' preferences for a public good will usually be single peaked, i.e. there is a unique optimum from the viewpoint of each taxpayer.5 6 If the choice of public good supply is made by majority rule, then the outcome will be the personal optimum of the individual with the median preferences - i.e. the individual whose preferred outcome is considered "too high" by half the voters and "too low" by the other half. In general, it is not possible to determine whether such an outcome yields an oversupply or undersupply of the public good from an efficiency perspective. It will also only be by accident, with almost zero probability, that the outcome will be efficient. For example, given the tax rule, we may have 51 percent of the voters enjoying a net benefit of one penny with the other 49 percent suffering negative benefits of fifty dollars apiece; an obvious case of oversupply. Or the figures could be reversed, with undersupply being the case. These examples underscore the fundamental difficulty of the voting rule-people are unable to express the intensity of their preferences. Consequently, the outcome may be dominated by distributional considerations. Intensity of preference can be introduced into the voting mechanism by explicit or implicit vote trading. The latter is often referred to as log-rolling, while the former is the case of a multi-issue ballot. Here people can trade their support for public goods about which they have a marginal interest for support for goods for which they have a strong interest. Unfortunately, such mechanisms reintroduce the likelihood that an equilibrium will not exist. To a large measure this reflects their inability to eliminate distributional considerations from the decision mechanism. We can only conclude that voting is a highly suspect preference revelation mechanism, which under plausible conditions may be counter-productive. 5.2.4. Cost-benefit analysis57 Although people do not have incentives to voluntarily reveal their preferences they may occasionally do so unconciously through their behavior in markets for private goods. This would be the case if the public good is a close substitute in consumption for some private good traded in the market. For example, the benefit people derive from fire protection might be gleaned from their behavior with respect to fire insurance and fire protection materials. Or the benefits of pollution control may be estimated by comparing land values of plots subject to 56 All 57
that is required is that consumers' preference orderings be quasi-concave. For an excellent survey of this technique see Chapter 14 by Dr&ze and Stem in this Handbook. Also see Layard (1972) and Mishan (1976).
Ch. 9: Theory of Public Goods
529
different degrees of pollution. Another important example has to do with public services which affect the time at the disposal of an individual. Thus, a highway improvement which reduces travel time may be valued by observing how an individual makes choices among the transportation modes or transportation routes which offer different combinations of time and out-of-pocket costs. 58 The most important application of this technique, however, has to do with public intermediate goods which are used in the production of private goods. Here the relevant preference information is based in a production function as opposed to the utility function. As such the relevant information is objective as opposed to subjective. The government can obtain the services of the appropriate technical personnel to determine the productivity of public good inputs. The value of the service is obtained by weighting the additional output by the observed market prices of the private good. Cost-benefit analysis, then, can be a useful technique for ascertaining preferences when the public good has a close substitute in consumption or if is as input into the production of private goods. For such goods as defense, however, it can be of little help, meaning we must rely on the highly imperfect voting mechanism for obtaining preferences. 5.2.5. The Tiebout mechanism
If the public good has a spatial dimension it may be possible to use taxpayer mobility as a means of revealing preferences.59 For if communities differ in their supply of public goods and taxes, individuals can locate in that community which most closely matches their tastes. Under certain idealized conditions, such mobility can lead to efficient levels of spatial public goods. These conditions include those developed earlier for local public goods; that is, there must be a large number of communities for each taste class. In addition, it may be necessary to restrict individual mobility so that individuals do not base their locational choice upon distributional considerations. This would be the case if taxes were collected on some base which is positively correlated with income such as the property tax. For then it might be attractive for low income individuals to migrate to communities with high average income. For a given level of services, property tax rates would be lower in richer communities. Hence, some sort of mobility restriction, such as zoning, may be necessary to achieve efficiency. 6 0 58Bradford and Hildebrandt (1977) rigorously developed conditions under which it is possible to determine preferences for public goods from observed market demand functions for complementary private goods. 59 See Chapter 11 by Rubinfeld in this Handbook. 60 Some analysts have argued that richer communities would provide a higher level of services. Thus, taxes on the poor may not be lower in the rich communities. The issue, then, is whether the poor are willing to pay the higher taxes for the greater services. However, if the tastes of the rich are diverse, there would be rich communities which offered lower taxes and higher services to the poor resident. See Westhoff (1979) and Epple, Filimon and Romer (1984).
530
W.If. Oakland
While the Tiebout mechanism has attracted wide attention among economists its assumptions are so restrictive as to cast doubt on its validity. Recall it has not yet been possible to determine conditions for efficiency when the number of individuals of given tastes is not large relative to optimal community size. Without such conditions it is impossible to judge the efficiency of mobility models. Moreover, in reality the structure of local governments does not fit well with the Tiebout assumptions. The existence of large central cities and restrictions on mobility due to discrimination raise serious questions about the relevance of the Tiebout mechanism.
5.3. Behavior of public officials Efficient public goods supply may not be achieved even if decision-makers have full access to consumer tastes. Public officials cannot be supposed to ignore their own economic self-interests when formulating policy. They have strong incentives not only to retain their positions but to upgrade them, thereby augmenting their salaries and position of authority. In part this reflects the quasi-rent characteristics of an occupation - once selected they are difficult to change without a loss of earnings. It also reflects the opportunity open to a policy-maker to augment his/her income prospects once out of office by using the power of the office to confer rents upon particular constituents. For many officials this augmentation occurs directly through expanding the size of their organization because salary or prestige and organization size often go hand in hand. Two examples will nicely illustrate the problems which can arise here. The first is Niskannen's (1971) model of the monopoly bureau. The second is Rosenthal and Romer's (1978) agenda-setting model. We will treat them in turn. 61 Niskannen's model begins with the observation that a particular government bureau is often the sole provider of a public good. Because it is in the public sector, however, the bureau's objective is not to maximize profits but to maximize output, thereby giving the bureau's managers greater economic opportunity and power and prestige. Niskannen assumes that the sponsoring agency (e.g. the legislature) does not have knowledge of the cost function for the public good. Hence, it can only monitor the bureau's output and its budget. On the other hand, the bureau is assumed to know perfectly the benefits that the sponsor derives from the public good. Under the best circumstances these benefits would be an accurate assessment of the taxpayer's true benefits from the public good. It is assumed that the bureau makes an all-or-nothing proposal to the sponsor. This 61
These models are also discussed by Inman in Chapter 12 of this Handbook.
Ch. 9: Theory of Public Goods
531
offer will be that which maximizes the output of the public good, X, subject to
C(X) fiui(g(m*l(O'),... ,m*(i-)(Oi-1)
daO-i mi,
m*(i+l)(Oi+t),.,
m* (01)),Oi).~(O-7i0)dO- ', Vmi E M.
541
Ch. 10: Allocation of Public Goods
A Bayesian-Nash equilibrium is a Nash equilibrium of response functions m *'1(.), . . ., m * (.) such that, if each agent i believes that the others are using these responses functions, he should himself use m*'(.). His expected utility for
his message m*'(O') is at least as good as his expected utility for any other message. 2.3. Maximin equilibrium: c = m An I-tuple of messages (m*',..., m*') is a maximin equilibrium at 0 iff Vi E I, min - ui(g(m*', m-), 0')
M-'iM '
min
u'(g(m', m-'),O'), Vm'eM'.
m '-M-
For each agent i, m*' is the best strategy if he envisions the worst strategies by the others from his point of view. 2.4. Nash equilibrium: c
n
An I-tuple of messages (m*l,..., m* l ) is a Nash equilibrium at 0 iff Vi E I,
g(m*i, m*-i)Ri(6')g(mi, m*-'),
Vm'E M'.
Implementation in Nash equilibria is a very different notion of implementation than the other three concepts. For any of the other concepts, despite incomplete information, each agent is able to obtain his best strategy by pure thinking, in a completely decentralized way, even though each notion corresponds to a different attitude. On the other hand, a Nash equilibrium can be arrived at only with a dynamic procedure because of incomplete information. 3 Then, the question of manipulation along the procedure arises. Implementation in Nash equilibria can be viewed as a weakening of incentives requirements as follows. Assume that there is a dynamic procedure; at date t of the procedure agent i is told the answer of date t- 1 and believes that this is the final step and that the others will not change their answers. Under this myopic behavior, if the procedure converges, the outcome corresponds to Nash implementation. 3 An alternative interpretation is that when the game is played all agents have all the information. This interpretation is not, in general, appropriate in public economics.
542
J.-J. Laffont
The above discussion stresses the need for, among other things, myopic behavior. Why not adopt myopic behavior as an appropriate weakening of incentives requirement, but keep one of the other decentralized solution concepts discussed above? To formalize this idea we consider economies where agents have C' utility functions defined on R L , u'(x'), where L is the dimensionality of the commodity space. Let U' be the space of admissible utility functions for agent i. Mount and Hurwicz's diagram is here
U' x*xU'
U
,
A
/ M' x · ·. We mi(t), mi(t), which
MI
consider continuous time procedures; agent i's message is a function t 2>0, and let M'(-) M'(t), t > O, be an admissible space of functions t O. Myopic behavior consists in choosing m'(t) at instant t in a way maximizes the increase of utility
dU' dt =
aui
L
E
l
dmx
x~(x'(t)) dt
(m (t
1-1 ax;
),
m - (t ))
.
The outcome function associated with each I-tuple (ml(t),..., m'(t), t > O) is the solution of the differential system dx dx~ = G(mi(t) m i(t)),
i=,
of messages
I,1=1,.. L.
A procedure is an I-tuple of message spaces Mi(t), t 0, and a set of outcome revision functions Gi(.), i= 1,..., I, = 1,,.., L. Note that at each instant the objective function of an agent is characterized by his marginal rates of substitution, for example all are expressed with respect to commodity one. The local utility function of the agent is linear in these characteristics. So, at each instant the procedure defines a mechanism for linear utility functions. We refer to the induced local mechanism.
543
Ch. 10: Allocation of Public Goods
2.5. Local dominant equilibrium A set of messages (m*l(t),..., m*I(t), t > 0) is a local dominant equilibrium iff ViEI, Vt>0,
E> ix (x*i(t)) d (mi(t), m-i(t)), =1 tVmi(t)E
Mi(t);
Vm-i(t)
EM-i(t),
where dx,
dt (M*i(t),m
i(t))=
G(m*i(t), m-(t)),
1=,.
L,
,
and max-i(t))imin equilibrium can be Alocal Bayesi(man(t),-Nash equil(t))ibrium and a loG(m(t),cal
A local Bayesian-Nash equilibrium and a local maximin equilibrium can be defined in a similar way. In this Chapter we will use the three equilibrium concepts, a, b, and c, and their local versions. Incentives theory is greatly simplified by the revelation principle. A mechanism is direct if agent i's message space M' coincides with his space of characteristics O', i = 1,..., I. A direct mechanism is revealing if OEEgC(O), VGE9,
i.e. if the true characteristics are equilibrium messages. A procedure is direct if, at each instant, the induced local mechanism is A direct procedure is revealing if at each instant the induced local mechanism is revealing. We state only the revelation principle for implementation in dominant gies. It extends basically to all decentralized solution concepts [see Barbera and Dasgupta, Hammond and Maskin (1979)].
direct. direct strate(1982)
544
J.-J. Laffbnt
Theorem 2.1. Gibbard's Revelation Principle (1973) Let (M, g) be a mechanism which implements the social choice function f() in dominant equilibrium. Then there exists a direct revealing mechanism (, 4,) which implements f(.) in dominant equilibrium. Proof Let m*l,..., m * t an I-tuple of equilibrium messages for (M, g) when the agents' characteristics are respectively 0, ... , O1. Define, for every 0 E O, A(0)
g(Eg,d(0)).
By definition, (0) =f(0); (, 4) is a direct mechanism. Suppose it is not revealing. Then there exists an agent i and i'E O' such that
(Oi -)P'(0i'),(i, 0-i)
(2.1)
where P'(O') is the relation of strict preference associated with R'(i'). From the definitions, (2.1) can be rewritten as g(Eg,a(t, 0-'))P'(0')g(Egd(0i, 0 - i )).
(2.2)
By assumption, (m*1,..., m*i,..., m*') E Eg, d(0 ,i
- i
(2.3)
)
Expression (2.2) tells us that there exists (M*l...,
M.(-1),
,
m)Egd(,
-i),
such that g(m*l ..... m*('-',
which contradicts (2.3).
.,
m*')Pi(Oi)g(m*l...
, m*,
m*
Q.E.D.
Therefore, all the social choice functions which can be implemented in dominant equilibria by complex mechanisms can also be implemented by direct revealing mechanisms or are said to be self-implementable. We restrict the analysis to these mechanisms.
Ch. 0: Allocation of Public Goods
545
Finally, let us note that the above definitions can be extended to a social choice correspondence F: - A. Implementation means then that there exists (M, g) and c such that VO,
g(Eg c())
0
and
g(Eg c())
F(O).
To summarize, in a world of decentralized information a decision-maker is constrained to use social choice functions which can be implemented, i.e. which are compatible with individual incentives. A good appraisal of the induced limitations of government intervention is fundamental for public economics.
3. The Gibbard-Satterthwaite theorem The first point to understand is that if no restriction at all on agents' characteristics is available, implementation in dominant strategies is essentially unreachable. This is formalized by the Gibbard-Satterthwaite theorem. Let Z(A) be (to simplify the proof) the set of all total orderings (indifference is not allowed) pi admissible for agent i, i = 1,..., I; we say that the property of universal domain (UD) is satisfied. A social choice function (SCF) here is a mapping from [(A)]' into A. If a social choice function f has a range A' c A, then f is called a SCF with range A'. A social choice function with range A' is dictatorial if there exists an agent whose favorite announced alternative in A' is always the social choice. There exists i E (1,..., I} such that for any profile P = ( 1l,..., P) [2(A)'] and any a E A', such that a f(P), then f(P)P'a. We have: Theorem 3.1. The Gibbard(1973)-Satterthwaite (1975) impossibility theorem If A' has at least three elements, a social choice function with range A' which satisfies universal domain and is implementable in dominant equilibria is dictatorial. Proof From Theorem 2.1 we can restrict the analysis to direct revealing mechanisms. The first lemma shows that if an agent can by himself change the social choice, truth cannot be a dominant strategy. 4
This proof is due to Schmeidler and Sonnenschein.
546
J. J. Laffont
Lemma 3.1 If there exists i E (1,..., I} such that f(P 1 ,..., P',..., Pi) = a,, f(pl ... ,p
... P) = a,
aa
a 2,
and if a2 P'al or
alP'ia2,
then the truth is not a dominant strategy at either (P', (pi
. ., pi
,..., ..... P)
or
.....pi).
Proof If aLP"a i 2 f(P 1,... Pi,..., P )Plif(P 1,..., p P ,..., P), then if agent i has preferences P", it is in his interest to announce P. If a 2 P'al,f(P',..., P'i,..., P')P'f( 1 ....., pi,..., p'), then if agent i has preferences P', it is in his interest to announce P'i. Q.E.D. Lemma 3.2 If f has range A' and if B cA', if, for a profile P = (P ' ,..., P.), for any a, E B and any a2 E A, a2 B, alPia2 for any i, then f(P 1 ,..... P) c B. Proof Suppose, on the contrary, that f(Pl,..., Pi) = a 2 4 B. Since B c A', there exist (P', ... Pi) [(A)]1 such that f(P'1,..., P') = a, E B. Consider then the set of social states i=I defined by a =f(Pt
O....p,i, pi+l .... pl)
and let j be the first integer such that a E B. Then we have f(p"l,...,
p,, pJ+l
f (P/1'..'
p,(j-1), pi .
. , P') = a . P) = a
E B, - 1
q B,
Ch. 10: Allocation of Public Goods
547
and
Then, from Lemma 3.1, the truth is not a dominant strategy for agent j if he has preferences P'. Q.E.D. This lemma shows in particular that f() must select in A' a Pareto optimal outcome. The idea of this proof is then to construct, from the social choice function f, a social welfare function F which satisfies the assumptions of Arrow's impossibility theorem. First, let us see how an ordering on A' is constructed from f for each profile. For any pair (a,,a 2) a modified profile ('l,..., P')is constructed as follows: Pi is constructed from pi by moving up to the top of the ordering a, and a 2, keeping the ranking defined by P' on {a,, a 2} and on {a/a A, a al, a * a 2 }. Let us denote by laa2(Pi) the ordering Pi obtained in this way. From Lemma 3.2, we know that either a1 =f(Pl,..., P) or a2 =f(j l.., ). Then, we define
alPa2 ,
if a =f(P1,...,
a 2 Pal, if a 2 =f(l,
),
... , P).
This construction can be repeated for any pair of alternatives in A'; it yields, for the profile P, a binary relation on A'. Repeated for any profile P in [(A)]', it gives a potential social welfare function F which associates to any profile a binary relation. It remains to show that this binary relation is indeed an ordering, and that F satisfies all the assumptions of Arrow's theorem, universal domain, Pareto optimality and independence of irrelevant alternatives. By definition, F satisfies universal domain. Pareto optimality on A' follows immediately from Lemma 3.2. If F did not satisfy independence of irrelevant alternatives it would mean that there would exist two profiles, P and P', and two alternatives, a E A' and a2 EA', such that a lPia2
alP'ia 2,
Vi E I,
and (for example) alF(P)a2
and
a 2 F(P')al,
548
J.-J. Laffont
which means
f l(P
") I...Ia,.(P) )
a· .
As in Lemma 3.2 consider the alternatives: a3
=f(P,aa2(P
,.
)
12
)(PI)
9
la2
a3 = al, a3 = a 2, and, from Lemma 3.2, a
{(a,, a 2},
Vi.
The proof ends as in Lemma 3.2. Clearly, for any P, F(P) is total on A' and anti-symmetric (since f is single valued). It remains to show that F(P) is transitive. Suppose it is not. Then there exists (P',..., P'), a,, a2, a3 in A' such that al=f(caa2( ),..
a2 =f(a
a 3 =f(0
3,(p)
0 3(P'
.aa(P'))aaF(P)a
,...
a2 ,(P')) a3
... ,%,(P'))
2,
a 2 F(P)a3, a 3F(P)a.
Let (', ... , p') be the profile obtained by moving up (a,, a2 , a3 ) in each ordering, keeping the ranking defined by P' in {a, a 2, a3}) and (a/a eA, a=a,, a a 2, a a3 ). Assume then (without loss of generality) that a1 =f(PA ..., p).
Let (Pl',..., P"') be the profile obtained from (P",..., PI) by moving down a2 to the third position in each ordering P'i (to ensure by Lemma 3.2 that a 2
549
Ch. 10: Allocation of Public Goods
will not be chosen). Then alP'a3
-*
alP"'a3 ,
Vi
I.
Since a 3 F(P)al by assumption, the property of independence of irrelevant alternatives implies a3 =f(p
l,...,
Let a' =f(p,,1...
P",).
P"', p,('+l),..., pt'), i = 0,..., 1, and let j be the first
al. integer such that a If a = a3 , then the truth is not a dominant strategy for agent j (Lemma 3.1). If a = a2, then the truth is not a dominant strategy for agent j since
f(P"1,...,
p"ip'J+... P")= a2,
and aP"'a .2
Consequently, F(P) is transitive. From Arrow's theorem, F is dictatorial. The dictator for F is a dictator for f. Q.E.D. To obtain positive results it is then clear that we must either introduce a priori restrictions on agents' characteristics or weaken the notion of incentive compatibility. In economies with only public goods, the simple restriction to "economic environments", i.e. to an economy where agents have well-behaved utility functions on an n-dimensional space of continuous public projects, still yields an impossibility theorem. When private goods exist in the economy and production is possible, some mechanisms incentive compatible in dominant strategies (and even Pareto optimal) can be exhibited. Satterthwaite and Sonnenschein (1981) give the following example. Consider an economy with three consumers and two pure private goods. Commodity 2 is produced from commodity 1 according to the linear technology Y2 = y, and initial resources are three units of good 1, one for each agent. Agents two and three share the first unit of good 1 in proportions which depend or the mean curvature of one's utility function. If it is "very curved", agent two gets the unit of good 1, if it is "very flat", agent three gets the unit of good 1. Similarly, the second and third units are shared in the same way by
550
J. J. Laffont
permutating agents circularly. Then, each consumer receives his maximal point (from the point of view of this announced utility function) on the budget line x + x 2 = w, where w is the amount of good he has received by the above procedure. This mechanism is clearly incentive compatible in dominant strategy since each agent gets his best point in an exogenously given set.5 4. Public goods only; positional dictators Positive results can be obtained at the cost of an extremely restrictive assumption on preferences which is famous in social choice theory, namely single peakedness, and which is heavily used in the median voter paradigm. Consider an economy with an odd (for simplicity) number I of agents such that social states can be ordered on a line in such a way that all agents have single-peaked preferences (see Figure 4.1). agent 2 agent I
al
agent 3
02
03
04
05
i6
07
Figure 4.1
As noted by Dummett and Farquharson (1961) the social choice function which selects the "peak" alternative of the median agent (which would be selected by majority voting) is truthfully self-implementable in dominant strategies. Clearly, if an agent is the median agent he should tell the truth. If an agent has his peak to the right of the median agent's peak (agent 3 in Figure 4.1), he gains nothing by lying with an announced preference which still has its peak to the right of the median agent's peak (since nothing changes) and he moves the selected peak in the wrong direction if he lies by announcing a peak to the left of 5
As shown by Hammond (1979), at the individual level a dominant strategy mechanism is characterized by an exogenously given set on which the agent picks his best point. The difficulty is then to ensure social feasibility of the allocation. When Pareto optimality is not required, this feasibility condition takes the form of a set of inequalities which can be easily satisfied, at least when individual rationality is not required. If Pareto optimality is imposed, the mechanism must satisfy in addition some equalities and this is often generically impossible. In the above example, a good mileage is obtained from the production technology which basically suppresses one of the two equilibrium conditions. In a pure exchange economy, the conjecture is that "generically" there exists no Pareto optimal dominant strategy equilibrium. See below for the context of one private good and one public good.
551
Ch. 10: Allocation of Public Goods
the median agent's peak. (A similar reasoning applies to agents whose peaks are to the left of the median agent's peak.) Moulin (1980) has shown that, when messages are restricted to announcing peaks only, the family of efficient, anonymous, incentive compatible social choice functions can easily be described. It is obtained by adding less than I fixed ballots to the I agents' ballots and then choosing the median of this set of ballots. When transfers across agents are not feasible, this is a good justification of majority voting if preferences can be assumed single peaked. 6
5. The differential approach
In the sequel where parameterizations of utility functions will be frequently used, we take the so-called differentiable approach to construct incentive compatible mechanisms. Consider, in a two-commodities exchange economy, agent i's utility function (linear in the unknown parameter):
Ox
- X2 + X,
where ' is the true characteristic of agent i known only to agent i; 0' is known by the Center to belong to an open interval of R containing zero, 1'. Let o= i,=1'. Let i' denote his answer to the mechanism which is (at the individual level) a pair of functions, xi(9', 0i), xi(O', 9 - ), where -i'= (81, ... , -i l, 8 i+1 ... , 'O).Finally, let (wi, w') i = 1,..., I be initial endow-
ments. Let us restrict the analysis to differentiable7 mechanisms. Agent i will announce a 9' which satisfies the first-order condition of his maximization problem: i
max i 6
ix('
-
i)
2
0,i
2
-i)
2
+x
-)
See also Laffond (1980). 7The restriction to differentiable mechanisms is in fact not needed. A direct argument, without any assumptions, shows that the function x2 (') must be non-decreasing. Since a non-decreasing function is differentiable almost everywhere, the above characterization holds almost everywhere. Mechanisms which are infinitely non-differentiable are not interesting from an economic point of view. The right family to consider is the set of functions which are differentiable except at a finite number of points where they may even be discontinuous. The characterization is then analogous to the one given above in each interval where it is differentiable. Then we must add conditions to connect the pieces which are simply conditions making the utility functions of the agent continuous across the pieces.
552
J.-J. Laffont
The first-order condition is
a
('i,
a-i) - xi(0 , -)
a2'i'
a(x ( , -i) + aXi(i ao
aoi
'
) = 0.
A necessary condition for the truth to be a dominant strategy is therefore
0iaX i, aai2 (O
x-) x0 - i ) ai ( i 2( 20i,
-
- + ax,e,
0 - i) ,=
(5.1)
for any 0-'. Since we wish to obtain the truth for any 0' E O', (5.1) must hold for any E O'. 9 It is therefore an identity which we rewrite as O a2(, (i
aO'
-
Xl(O,
-i)_
-
i)') x(
-
s
ao'-')('O,i, O-i) + -aoi
(i', 0-) =
(5.2)
or -
2ixi0(si o-i]
X2 (Si , 0-') ds'+ hi(0-'),
VOt
.
(5.3) A second necessary condition is obtained from the local second-order condition. The above maximization problem must be concave around the truth. Let O(o ,i 'i) =
ixi(o',
- i
)
X2
The first-order condition can be written as
a (l i,)
=0.
The second-order condition is a02.
But, from identity (5.2),
ao (', o i) _ o.
+
i(i
-).
553
Ch. 10: Allocation of Public Goods
We therefore have a-
=°
+2 a
The second-order condition can then be rewritten
0')o, >0,
aO ( i.e. here
x(
0,
i O
VE0.
-i)0,
These necessary conditions are here sufficient, as can be checked O'x2,(o', 9-i)
-
22
X2
> ix (ofi
Xi) O+x( O-
X2 (
-i)
i
i) + xi(Oi ,-i)
(5.4)
Using (5.3), expression (5.4) becomes: (di - Oi)x (i,
fx2(si,
9i) x
i) ds',
which is true from the fact that x2 is non-decreasing in o'. The characterization of differentiable dominant strategy mechanisms at the individual level is then completed. We can choose a function x(-) non-decreasing in i' and then the function x(8) is determined by integration with an arbitrary function hi(-i). In a private goods economy the feasibility conditions are I
I
E
< E w2
x(o',o-')
i=l
i=1
and Ehi(0-i)-
Wi
• i=l
E
-(_
X-2(si,
i))
x(s',
i') dsi
554
J.-. Laffont
Suppose now that good 2 is a public good (x = x 2 for any i) and that its cost in units of good 1 is C(x 2). The feasibility condition is then C(x 2 (o)) + E h'(0-')- -E aX 2 (S i
)
f(si
2(Si
-i))
0.
6.Thasrk-GrovesVi
6. The Clarke-Groves-Vickrey mechanisms 8 A major intellectual event of the last decade in public economics has been the irruption of the Clarke-Groves-Vickrey (CGV) mechanisms claiming to be a solution for the long-standing free-rider problem. 9 In this section we show a possible intellectual origin of these mechanisms, discuss some of their properties and show that they are a special case of a more general class of incentive compatible mechanisms. The intuition of a CGV mechanism can be given from the Vickrey auction. Consider an indivisible good that a seller wants to sell to a group of I buyers. The Vickrey auction enables the seller to elicit the true willingnesses to pay of all buyers as follows. Let ' denote buyer i's true willingness to pay and i' denote his (sealed) bid. The Vickrey auction tells him that the buyer with the highest bid will receive the object and will pay the second highest bid. It is easy to see that truthful bidding is a dominant strategy. Consider, without loss of generality, buyer 1 and let us distinguish between two cases. In case 1, truthful bidding leads buyer 1 to be the highest bidder. He cannot gain by announcing 0' # 8'. Indeed, suppose he tries to lie. If he announces 1 * O1 but still larger than the second highest bid, nothing changes. If he announces 9' below the second highest bid, he loses the object and pays nothing. He has lost a gain of (1 _ 02) he would have had by being truthful. In case 2, he is not the highest bidder if he is truthful. He does not receive the object and pays nothing. Suppose he lies with a lie 81, still below the highest bid. Nothing changes for him. If his announced bid becomes the highest bid, he receives the object but pays the second highest bid which, by assumption, is here higher than his true willingness to pay, making a loss. Note that this mechanism yields a Pareto optimal outcome (since the agent with the highest bid gets the object) and that the payment made is the loss 'The early papers are Clarke (1971), Smets (1972), Groves and Loeb (1975) and Vickrey (1961). 9A thorough discussion is not feasible here [see, for example, Green and Laffont (1979), Laffont and Maskin (1982)].
Ch. 10: Allocation of Public Goods
555
inflicted by the agent who gets the object on the rest of society since he pays the second highest valuation. Consider now an economy with one private good and one public good in which each agent has a quasi-linear utility function, .,
i(xi, y)=xi + i(y),
I,
where xi is agent i's consumption of the private good and y is the quantity of the public good (which, for simplicity, has a zero cost). Finally, let i' be the initial resource of agent i in private good with x = Y=lxi. Consider first the simple case of a (0, 1} public project with the normalization: V(0) = 0;
i= 1 ... 1.
V (1) = ,
Excluding corner solutions where some agents are bankrupted, a Pareto optimal public decision (under perfect information) is defined by I
y=l*
0>0. i=l
Let 0' be the announced willingness to pay for the realization of the public good (y = 1), i = 1,..., I, and let 0=(01 . .. ,1), -i= (01 ... ,
i-
Oi +
,...,
),
0 = (0i, 0-i).
The Clarke mechanism can be defined as follows: I
'> O
y(O) = 1
(6.1)
i=l
ti(O) =:--
ji, joi
= 0,
if ( i) j;i
otherwise.
(
< 0, i=1
(6.2)
The decision is taken according to (6.1). Moreover, agent i receives a (nonpositive) transfer ti . If his answer changes the collective decision, i.e. ,j i0 has
J.-J. Laffont
556
a sign different from the sign of = 0i', he must pay the cost that he imposes on the rest of society, I ,* 1. Otherwise, he pays nothing. It is easy to check that 0'= 0i is a dominant strategy for agent i, i = 1,..., I. Let A' denote the difference between agent i's utilities if he announces 'i and agent i's utility if he announces 0'. Different cases must be distinguished.
Case : Y jiO
>0.
(a)
+
0i
i'> O0;
joi
E O + Oi20, j=i
bA= 0' -
' = O.
j=i
Ju~~jl
(c)
E j+ i 0, i=1
if E 0' O} has i elements,
otherwise,
ti(O) = hi(-i'), Vi. (b) The random dictatorship:
y()
= E yk(0), k=l
with yh(O) = O,
1 =
ti()
if
h < 0,
if
h>0,
= hi(O-i),
Vi,
where y(O) should now be interpreted as the probability of realizing the project. Laffont and Maskin (1982) give an example where, for a given optimization problem, the Clarke mechanism is dominated by the random dictatorship which clearly does not always take the Pareto optimal public good decision. Finally, we show how, in the case of divisible public goods, the differentiable approach yields constructively the CGV mechanisms. Suppose that the utility function of agent i is now xi + Vi(y, O
iEOi),
i
i=1 , ...
I.
We look for transfers which implement the Pareto optimal public decision, i.e. the function y*(O) which is implicitely defined by a (y, i) =O.
i= solves the program
Agent i solves the program max vi(y*(Oi, O-), ') + t'(9O, O-')),
eE Oi9
(6.3)
Ch. 10: Allocation of Public Goods
559
yielding, as a necessary condition:
ay*(oi, -i) + at(i ay
aO'
-i) = 0,
MI
which, by the argument of Section 5, is transformed into a differential equation:
) = -a
ati 8i
(y*)()
)ay*
-(Y7t~l ,9
(0),
which can be integrated, yielding:
ti(O)
i
(y * ( s i
a-i), si) Y-(s ,i
- i)
i= 1 ... , I.
dsi+ hi(-i) (6.4)
From (6.3):
aVi
a ,.
a (y*(),oi)= - E aY(*() joi
i).
Substituting into (6.4) we get ti(8) = E vi(y*(O), O) + hi(O-i), jii
which is the Groves class for continuous projects. 7. Weakening incentives requirements Let us remain with the case of a divisible public project, as at the end of the previous section. The major drawback of the class of transfers exhibited there, namely,
ti'() = E VJ(y*(O), oj ) + hi(O-i), joi
is that they do not balance in general, or equivalently that the dominant strategy mechanisms are not Pareto optimal.
560
J.-J. Laffont
D'Aspremont and G6rard-Varet (1979) have shown that weakening incentives requirements to the notion of truthful Bayesian-Nash implementation enables the construction of Pareto optimal mechanisms, i.e. here of mechanisms which balance the budget. Consider first the functions r'(O') which induce agent i to tell the truth if he maximizes his expected utility, assuming that the others tell the truth. Agent i solves the problem max f
,vi(y*('),
i) + r'(O'))
1(`-)dO-',
where agent i's expectation is taken to be independent of his type i', and where y*(O) is as in Section 6 the first-best choice of the public good. The differentiable approach yields the identity:
afo,iv .
aro=
ay*
Y*_ (O-i)dO-'+ ar' =O,
i
and integrating: r()= rl(°')= -
Jo JOay' i
i(O-`) ao'(.1
adv
ay*
dO-'ds
+
constant.
(7.1)
From Section 6 we know that
ay i ds
oa
= - E vi(y*(0), O') + hi( '),
so that reversing integrals in (7.1) gives r'(
)i =
d-i + constant.
vj(y*(0),i') i(i)
f
(7.2)
To obtain transfers which balance, it is enough to consider then -
ti(0) = r'(')
E rJ(0j). ji
The addition to r'(8 ') of a function of but, in addition, clearly I
E ti(O) i=1
0.
-'
does not perturb the above argument
Ch. 10: Allocation of Public Goods
561
The possibility of taking into account dependent beliefs, i.e. functions i(8-i/9i) which depend on i', has been the subject of further research by D'Aspremont and Gerard Varet (1982) and Laffont and Maskin (1979). Recently, Cremer and Riordan (1982) have observed that it is possible to construct balanced mechanisms in which one agent is attributed a Bayesian-Nash behavior, all the others having a dominant strategy. Let us conclude by observing that, as formula (7.2) makes clear, the design of ri(0') requires knowledge of agent i's expectations, 4i(9-'). Another way of weakening incentives has been to look for implementation in maximin equilibria [see Dubins (1979) and Thomson (1979)]. The introduction of random schemes is also helpful [Gibbard (1977, 1978)]. Finally, starting with the work of Groves and Ledyard (1977) it became clear that implementation in Nash equilibria opened great possibilities, such as the achievement of Pareto optimality. Adding the requirement of individual rationality, restricts essentially the mechanisms to the implementation of Lindahl equilibria [see Hurwicz's (1979) theorem]. Maskin (1977) provided a characterization of social choice functions implementable in Nash equilibria. However, as explained in Section 2, this is a very weak and unclear notion of implementation in games of incomplete information.
8. Planning procedures The idea of using planning procedures to provide public goods is an old one. The first rigorous formalizations appear in Tideman (1972), Dreze and de la Valle Poussin (1971) and Malinvaud (1971, 1972). We adopt here a presentation based on the logic of our theme. Planning procedures are presented by solution concept from the point of view of implementation.
8.1. Local dominant equilibrium
For simplicity, we consider an economy with two consumers and two commodities, a public good and a private good. The utility function of agent i, vi(x', y), is a smooth strictly concave utility function. Agent i starts the procedure with an endowment wi > 0 in the private good. For simplicity, again, we ignore boundary problems and assume that the public good is costless.10 10
It is straightforward to define imputed shares of the cost of the public good and define net willingnesses to pay.
562
J.-J. Laffont
Myopic behavior implies that agent i attempts to maximize at instant t: d v'
where §' denotes the true marginal rate of substitution. The revelation principle applied to the local game implies that we can restrict the analysis to direct planning procedures where agents announce marginal rates of substitution: = y(ol, 2), i= Xi(01, 02),
(8.1)
i=1,2.
A planning procedure is strongly locally individually incentive compatible (SLIIC) if revelation of the true marginal rate of substitution is a dominant strategy at each instant for every consumer. It is balanced if X'(O1
, O2)
+ X 2 (8',92) =
VO.
A Pareto optimal allocation (x l , x2 , y) is characterized here by Ol(x l , y) + 02 (x 2, y) =
(Samuelson condition),
x +X 2 =0. We say that a planning procedure is Pareto optimal iff its stationary points are Pareto optimal allocations: Y(Ol(x, y), 82(x2, y))=O XI(OI(x, y), 2 2 (X2 2, y)) X(O1(xl, y), (x , y)) = 0 X 2 (Ol(xl, y), 02(x2, y))=O
01(x',y) + 0 2 (X2 , y) = 0,
X +xX2 =0
It is convergent iff the solutions of the dynamic system (8.1) converge to its stationary points; it is dynamically efficient iff it is balanced, Pareto optimal and convergent. We say that it is individually rational iff d v' dt
0,
i=1,2.
A class of planning procedures is neutral or unbiased iff each individually
563
Ch. 10: Allocation of Public Goods
rational Pareto optimal allocation can be reached by a member of the class. We restrict the analysis to C2 planning procedure [i.e. where Y(-), X'(.) and X 2 (.) are C2]. At each instant, the objective function of agent i is linear in 0'. The method of Section 5 can be used to show that incentive compatibility requires
Y aeXl(
'
1
Y O;-
, 92) =
O' 1
Y (1,
X 2 (0, 02) = -,
y(
2)
dsl + hl (0 2 ),
s 2 ) ds2
+ h 2 (91 ).
(8.2)
We can then prove: Theorem 8.1
Under the above assumptions a two-person C2 planning procedure is SLIIC, Pareto optimal, balanced and individually rational iff (i) (ii)
= A(0 1 ) - A(-02), A strictly increasing and C 2,
Y()
XL(a)
=
0sl'1(s1)ds
-
+ f9 252A'(s2)ds2,
x 2(H) = -X 1 (8). Proof
As a first step we show that balancedness requires Y(O) = y 1 (81 ) + y 2(0 2 ).
From incentive compatibility, Xl + X 2 =
-
Y (51 82 )dsl + h1 (82 )
f1
o
-|o
as s
as
(l,
2)dS2+
2(Ol).
564
J-J. Laffont
Differentiating with respect to O8and 02, we obtain: (01 + 02)
182'
Therefore, balancedness implies
(0 +
a2Y
)
0=
and a2Y 8182 a00 02 O,
since Y is C2. Reciprocally, integrating back we show that 8 2 Y/80'82- 0 implies the possibility of constructing a balanced procedure. Pareto optimality requires: Yl(O 1 ) + Y 2 (-_
1
O0 for any 81 E 0.
)
Denoting A(-) - yl(.), we have: Y(o) = (o 1 ) -
(-o 2 ).
To avoid the existence of stationary points different from the Pareto optima, A( ) must be strictly increasing. From the above results, transfers can be rewritten as X=
-
X2 =
+ f s2A'(-s2)ds2 + K,
flSlA(sl)ds 2
sA'( _
2
)s ds 2 + fo fOs1A(s1)
ds 1
K.
It remains to show the implication of individual rationality, do'
-T cr 'Y+ Xi > O.
Integrating by parts 1'Y + X1, we have: f
0
A(s)ds-(90+
02)A(-9
2) + K
0.
Ch. 10: Allocation of Public Goods
565
From the fact that A(.) is increasing, the first two terms are always non-negative. Since they can become zero (if 01 = - 02), K must be non-negative. Reasoning symmetrically for agent 2 we show that K< 0, implying K= 0. Q.E.D. Thanks to individual rationality, the dynamic efficiency of the SLIIC procedures defined above is easily proved: U + U2 can be taken as a Lyapunov function. It is increasing and easily bounded above by the boundedness of the feasible set. Consequently, the procedures are quasi-stable in the Uzawa sense; any limit point of the trajectory is a stationary point. Stability then follows since, from the strict convexity of preferences and technology, stationary points are Pareto optimal and there is a unique Pareto optimum that dominates all points of the trajectory. Since the property of truthful revelation is a closed property we can take non-differentiable limits of increasing functions. Moreover, the above analysis is easily extended to piecewise/differentiable procedures. For example, Y()
= + 1,
=0,
= -1,
if 01 > 0 and 2 > 0 (one strictly at least); if 01 > 0 and 02 < 0, or 01 < 0 and 02 > 0, or 01 = 0 and 02 = 0; if 01 < 0 and 02 < 0 (one strictly at least),
defines an incentive compatible procedure. This procedure," which can be called the Bowen procedure, does not require any transfers since Y is piecewise constant. However, for more than two agents it does not have the property of Pareto optimality, a usual feature associated with majority voting. Finally, the neutrality of the class of SLIIC procedures, when all increasing functions Y(.) are considered, is easily shown. However, this property does not extend to the case of more than two agents. 12 Moreover, the property of neutrality as defined above is not very interesting when agents do not behave truthfully. A planning procedure should be able to use at date t only the information available at date t. The question of the ability of the Center to affect the distribution of resources when agents behave strategically still remains quite open. 1It 12
corresponds to majority voting. For more see Fugigaki and Sato (1981), Laffont and Maskin (1983), and Champsaur and Rochet (1983).
566
J.-J. Laffont
Since little is available on planning procedures based on local Bayesian equilibrium, we turn now to local maximin equilibrium.
8.2. Local maximin equilibrium The planning procedures introduced by Dreze and de la Valle Poussin and Malinvaud-the MDP procedures-turned out to be a special class of direct revealing procedures when the solution concept is the local maximin equilibrium, and the change. of public good is linear in marginal willingnesses to pay: = 9 +02,
= -_0 + x 2 = _--02 81 >
0
,
+
52 2
82>0,
2
, 81
+ 8
2=1 .
This procedure can be balanced since individually rational since
is additive in 01 and
2.
It is clearly
do'
dt cc Sip 2 .
When varying the weights (8 i') one generates a neutral class [Champsaur, 1976]. Finally, it is easy to check that truthful behavior is a local maximin strategy. Indeed, truthful behavior ensures that utility does not decrease (individual rationality). Any other strategy opens the possibility of net losses. One may wonder if there is a SLIIC procedure associated with p = + 02. There is one, as can be checked from formula (8.2). It corresponds to 1= 2 = 1/2, i.e. to an and equal sharing of the surplus .p 2 remaining after the compensations - 0' - 082) provided for the change in public good. (For I> 2 the two classes do not intersect.) There is no good reason to limit the analysis to linear Y(-) functions. A characterization of all procedures which are incentive compatible in the local maximin sense includes of course the MDP class and the SLIIC class [see Laffont (1985)]. The widespread interest in the MDP class has led to more work and results about this procedure, using the concept of local Nash equilibrium [see Roberts (1979), Henry (1979)] as well as results on the properties of the global game (i.e. when agents are not myopic) [Champsaur and Laroque (1982)].
Ch. 10: Allocation of Public Goods
567
9. Conclusion Many results described in this chapter appear fairly abstract for any type of application. However, some real progress has been achieved in the understanding of the difficulties raised by the free-rider problem and, more generally, by incentives constraints; the basic ideas leading to the construction of incentive compatible mechanisms have been laid out. In my view, two major issues remain. First, any real application will be made with methods which are crude approximations to the mechanisms obtained here. The rationale for these simplified schemes remains to be given. Clearly, transaction costs will play a major role in this explanation, but other considerations, such as simplicity and stability to encourage trust, goodwill and cooperation, will have to be taken into account. Second, the above analysis has taken the view that agents know their willingnesses to pay for public goods. However, it is often the case that transaction costs must be incurred by agents to find out correctly this information. Two different views can be held here. The optimistic view is that the practice of direct democracy should, in the same way as the practice of market activities for private goods, lead agents to become well informed about their tastes for public goods. The pessimistic view is that transaction costs will still have to be incurred by agents to provide accurate information; therefore, the incentives mechanisms must provide a strength of incentives which is strong enough to overcome these costs. This more subtle version of the free-rider problem cannot be considered as solved. 3 It can even be conjectured that it cannot be solved in a satisfactory way within the homo economicus paradigm. However, if moral considerations can be relied upon for a positive attitude towards becoming an informed citizen, it will be important to be able to use social decision mechanisms which, if they offer to agents only a weak incentive to be truthful, at least do not give any incentive to misrepresent their information.
References D'Aspremont C. and L.A. Gerard-Varet, 1979, Incentives and incomplete information, Public Economics 11, 25-45. D'Aspremont C. and L.A. Gtrard-Varet, 1982, Bayesian incentive compatible beliefs, Mathematical Economics 10, 83-103. Barbera S., 1982, Truthfulness and decentralized information, Mimeo (Bilbao). Border K. and J. Jordan, 1983, Straightforward elections, unanimity and plantom voters, Economic Studies 50, 153-170. Champsaur P., 1976, Neutrality of planning procedures in an economy with public goods, Economic Studies 43, 293-300. 13
See Green and Laffont (1979, ch. 13) and Muench and Walker (1979).
Chapter 11
THE ECONOMICS OF THE LOCAL PUBLIC SECTOR

DANIEL L. RUBINFELD*
University of California, Berkeley
1. Introduction
One of the dominant areas of study in public finance during the past several decades has been the provision of public goods. Public finance economists spend a good deal of time and effort attempting to determine individuals' preferences for public goods, to decide what the "efficient" level of provision is, and to analyze the actual production process that provides these goods. While the provision of public goods may occur at any of several levels of government, the local aspects of public economics have only recently become an area of substantial academic interest. What distinguishes the study of the local public sector from the general theory of public goods is the possibility of migration between local jurisdictions.

My analysis begins and ends with a focus on the Tiebout (1956) model. Tiebout suggested that it might be useful to view the provision of local public goods in a system of numerous jurisdictions as analogous to a competitive market for private goods. Competition among jurisdictions would allow a variety of bundles of public goods to be produced, and individuals would reveal their preferences for those public goods by moving ("voting with their feet"). Tiebout also argued that such a process, at least in the abstract, would lead to an efficient outcome in the sense that no one could be made better off without making someone else worse off. The Tiebout view of the world is an extreme and narrow one, but it does allow us to concentrate on what makes the local public sector of unusual interest. However, at present there really is no fully satisfactory model of the local public sector, so that the

*This chapter was completed in March 1983, with only minor updating of newer material. I apologize to those authors whose work has inadvertently been omitted. Henry Aaron, Alan Auerbach, Harvey Brazer, Jan Brueckner, Paul Courant, Gerald Goldstein, Edward Gramlich, Robert Inman, Helen Ladd, Mark Pauly, Thomas Romer, Susan Rose-Ackerman, Harvey Rosen, Suzanne Scotchmer and Richard Tresch were all kind enough to read and comment on an earlier draft of the paper. Edith Brashares provided valuable research assistance.
appropriate question is whether and under what assumptions the Tiebout model (or an alternative model with fixed population which focuses on the choice of public goods within a jurisdiction) provides a good "approximation" to the local public economy. I will begin with a simple and hopefully intelligible form of the model and then sequentially add a number of complicating, but relevant, assumptions which can be crucial in understanding both normative and positive aspects of local public service provision.

One of the concerns raised by Tiebout (often confused in the literature) is the determination of optimal public service provision. Optimality or efficiency (the terms are used interchangeably) in local public economics has two distinct meanings, which I will have occasion to discuss throughout the chapter. The more common use of the term applies to the provision of public services within a single jurisdiction. An optimal provision, which leads to intrajurisdictional efficiency, is one which maximizes the sum over all individuals within the jurisdiction of the net-of-costs willingness to pay for the public good, where the set of individuals within the jurisdiction is taken to be fixed. Optimality within a system of jurisdictions, i.e. interjurisdictional efficiency, applies to the provision of public services among jurisdictions when migration is possible. Efficiency is achieved when the existing number of jurisdictions results in the provision of a level of public goods which is sufficient to satisfy individuals' demands, and is produced at minimum cost. The notion of interjurisdictional efficiency, while clearly unimportant to the discussion of public goods at the federal level, is central to local public economics and will serve as a focal point of this chapter.

As a first step in discussing the question of jurisdictional optimality one might ask whether a given jurisdiction ought to be increased or decreased in size. In other words, is it efficient to have a small number of large jurisdictions or a large number of small jurisdictions? One immediate thought is that a small jurisdiction has an advantage from a political perspective. The smaller and more homogeneous is each of the communities in a system of local governments, the more likely it is that the services provided will be consistent with the desires of each and every member of the population. Thus, a small jurisdiction is likely to minimize the "political externalities" that individuals create when selecting a level of public goods provision that does not satisfy everyone's tastes. On the other hand, local redistributive goals that society might have are likely to be thwarted if communities are small and homogeneous. In addition, most public goods involve benefits that extend beyond jurisdictional boundaries; the obvious way to solve this externality problem is to internalize it by expanding the jurisdictional scope. Finally, jurisdictional scale may also be affected by the supply side of the local public sector. If there are economies of scale in the provision of local public services, it makes sense, for production purposes at least, to have larger jurisdictions. The problem of balancing these objectives is complex and will appear and reappear throughout the discussion.
The chapter is primarily a methodological one. I sketch examples of some of the kinds of models that local public finance economists have used, illustrating some of their advantages and difficulties, but without attempting complete coverage. A general discussion of public goods provision has been omitted here since it is treated in the opening chapter of this Handbook, while the discussion of public choice and political economy appears in a later chapter. Relatively few references are given in the text, but the reader is urged to consult the footnotes and the references suggested there when further detail is desired.

Section 2 begins the discussion of optimality among jurisdictions with an analysis of the club model of local provision of public services. This model allows one to see in a very simple way the conditions under which the provision of local public goods resembles a competitive market. It illustrates the trade-off between scale economies in the provision of public goods and the externality created when communities become congested or crowded.

Section 3 continues with a detailed description of the Tiebout model. It expands the earlier discussion to look at the allocation of individuals among communities when individuals have different incomes, and when a property tax is used to finance public goods. The primary concern is whether an equilibrium exists and, if so, whether it will be efficient. Local public economics is also concerned with the fact that both public spending and taxes can have a direct impact on land and property values. This additional consideration is added to the Tiebout framework in Section 3 as well. The possibility that public sector variables can affect land values alters the analysis of the efficiency of the Tiebout mechanism. How one might test the Tiebout model in such a framework is sketched out at the end of the section.

With the broad normative analysis in the background, the review then switches briefly to a more technical positive analysis of the local public sector. I ask whether information about individual demand functions for public goods can be obtained in a world of numerous jurisdictions, and if so how one might estimate those demand functions. To some extent the issues are similar to those in the general public finance literature, but information about interjurisdictional variation in the provision of public services adds greatly to the set of possible techniques for estimating demand functions. However, the possibility of migration in response to fiscal variables in a Tiebout-type world adds substantially to the difficulty of obtaining consistent demand parameter estimators. Section 4 illustrates several traditional approaches which can yield demand functions. Section 5 also focuses on demand functions, but looks at some non-traditional estimation methods: the first involves the analysis of survey data, and the second an examination of differentials in property values, the so-called "hedonic" approach. The discussion then continues by asking: if one knows about demand functions, is it possible to test whether the provision of local public services within jurisdictions is efficient? Finally, Section 6 presents a
broader overview of the question of the allocation of responsibilities among levels of government. The so-called "fiscal federalism" question asks whether local governments ought to balance their budgets, which public goods ought to be provided locally, and whether and how revenues ought to be shared between levels of government.
2. Community formation and group choice

The large number of jurisdictions providing public goods distinguishes local public economics from the rest of public economics. Because individuals are mobile and vary in tastes, the determination of the optimal level of provision of public goods among a series of jurisdictions is complex, as is the meaning of "optimality" itself. Tiebout provided a start by describing a model in which there is a mobile population and a sufficient set of either potential or existing communities offering varying bundles of public goods so that, with costless mobility, individuals can choose the community with the best package of public goods and taxes and in the process reveal their true preferences for public goods.

There is no private sector production in the Tiebout model. Individuals simply allocate some portion of their income (obtained outside the model as "dividend income") towards the consumption of the public good and consume the rest. The result is an equilibrium with individuals distributed among communities on the basis of public service demands, with each individual obtaining his own desired public service-tax bundle.

The Tiebout model is a demand side model, within which individuals choose locations on the basis of their preferences for public goods. Tiebout's paper says little if anything about the supply side and, in particular, nothing about the technology of production of public goods or the underlying political mechanism by which levels of public goods might be chosen. Average cost curves for public services are assumed to be U-shaped as a function of population size, and all jurisdictions are assumed to provide services at the point of minimum average cost. As a consequence of all of these assumptions, Tiebout suggests that the outcome of a process by which individuals select jurisdictions will be optimal or Pareto efficient, in the sense that no one can be made better off without making someone else worse off. Efficiency arises both because public goods are provided at minimum average cost and because each individual resides in a jurisdiction in which his demand is exactly satisfied. By revealed preference, individuals who could have moved chose not to, and thus cannot make themselves better off.

The Tiebout model has appealed to a large number of writers in public finance, in part because of its analytical convenience, and in part because of the analogy
between the Tiebout framework and the competitive model with which economists are so familiar. The number of variants of the Tiebout model is substantial. Here, as in previous analyses, I focus on the analogy between the many jurisdictions and the competitive market. The fundamental question to be answered is whether a competitive analog in the form of an equilibrium distribution of households among jurisdictions exists, and if so whether that equilibrium is efficient. In the simple version of the Tiebout model that arises out of the theory of "clubs",¹ the Tiebout equilibrium exists and yields an efficient outcome. As the model is generalized to incorporate more realistic assumptions, however, an existence problem arises. Perhaps more disturbing is the fact that when an equilibrium does exist, it may not be efficient.

Since my objective is to progress from the simple Tiebout model towards more complex and realistic models of the local public sector, a brief summary of the assumptions of the Tiebout model will serve as a good starting point. The reader should refer to this list in the following sections as many of the assumptions are relaxed. The Tiebout assumptions are:
(1) Individuals have perfect information.
(2) Mobility is costless and is responsive only to fiscal conditions.
(3) Public goods are provided at minimum average cost within each jurisdiction. Each new migrant to a community pays an access cost equal to the cost of providing public services to that migrant.
(4) There are no interjurisdictional externalities.
(5) There are a sufficient number of jurisdictions and a sufficient number of households of each type (in terms of tastes and incomes) so that each jurisdiction can contain identical individuals. Thus, new communities can be developed costlessly.
(6) There is no public choice mechanism other than the utility-maximizing decision of an identical set of individuals.
(7) All income is dividend income, not generated by private production. There is no labor market.
(8) Public goods are financed by lump sum taxes.
(9) There is no land and no housing, and therefore no capitalization.

Clearly, these restrictions are extremely strong. The hard question is whether the Tiebout model is sufficiently like the local public economy to make it useful for both normative and positive purposes. This is a question for which there is not an easy answer, but which motivates a good deal of the methodological discussion that follows.²

1 See, for example, Buchanan (1965). See also Sandler and Tschirhart (1980), Berglas (1982), Berglas and Pines (1984), and Scotchmer (1985a).
2 Of course, this is in part an empirical question, as is discussed in Oates (1981).
2.1. A starting point: the theory of clubs

Assume initially that all individuals are identical in tastes and in incomes. In such a world the level of public services provided will be the utility-maximizing level selected by any individual within the community. This most elementary "public choice" mechanism will be relaxed later in the chapter, at which point the competitive analogy is substantially weakened. Assume also that a pure public good is provided to all identical residents of a community or region which is isolated from other communities.³ The "pure" public good has the property that its consumption is unaffected by the number of people in the community. To produce the public good, the production of the private good, and thus private consumption, must be forgone. As a result, a trade-off exists, in both production and consumption, between public and private goods.

Following Stiglitz (1977), let G represent the level of provision of the public good, X private consumption per capita, W total income available to all members of the community, and N the number of individuals. Private and public goods are assumed to be measured in identical units and produced by a single production process. The single production function, f, for both public and private goods, is given in equation (1):

W = f(N),   f' positive, f'' negative.   (1)
With this functional form there are first increasing and then decreasing returns to scale (in population). Total income W is measured in units of output produced within the jurisdiction. The constraint

W = XN + G   (2)
reflects the fact that W must be fully allocated among the purely private good and the purely public good. As a result of these assumptions, the Stiglitz model is quite restrictive. It requires, in effect, that individuals work where they live and consume public goods on islands.

If N is fixed, and all individuals in a given jurisdiction have identical tastes and income, then each resident's problem is to choose X and G to maximize his utility function, U(X, G). The Lagrangian is (λ is the Lagrange multiplier):

maximize U(X, G) - λ(XN + G - f(N)),   (3)

and the first-order conditions are

∂U/∂X - λN = 0   (4)

and

∂U/∂G - λ = 0.   (5)

3 This model is a good starting point, but is somewhat limited in its application to the local public sector, since it assumes an isolated island economy. The assumption that all income is generated from local production is a very restrictive one.
Solving equations (4) and (5) yields:

N[(∂U/∂G)/(∂U/∂X)] = 1.   (6)
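(The intermediate step, for completeness: dividing (5) by (4) eliminates the multiplier, so (∂U/∂G)/(∂U/∂X) = λ/(λN) = 1/N, and multiplying through by N gives (6).)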
Equation (6) is the Samuelson condition for efficient provision of the public good. The expression in brackets is the marginal rate of substitution of public for private goods, and N is the population, so the left-hand side of the equation is the sum of the individual marginal rates of substitution between public and private goods. The right-hand side is the marginal rate of transformation, i.e. the marginal cost, in terms of forgone units of X, of producing an additional unit of G (since inputs and outputs are all measured in equivalent units). In Figure 2.1, f(N) represents maximal public good provision, while f(N)/N is the maximum possible per capita consumption of the private good. Since the technology is assumed to be linear, the production possibility frontier is linear and an interior optimum is attained at point E. Thus, for a given population, the optimum provision of the public good is easily determined, since the complications associated with the public choice process in a world of heterogeneous individuals are not present here.

[Figure 2.1. The linear production possibility frontier between G and X, with intercepts f(N) on the G axis and f(N)/N on the X axis; the interior optimum is at point E.]

Now consider the problem of allocating population among jurisdictions. If the level of public goods, G, is fixed in each community, how many people should reside in each? The "club" analogy applies here, since this is the kind of decision that would face an entrepreneur attempting to decide how many individuals to let into an exclusive private club. The entrepreneur's objective is to obtain the
highest entry fee possible (with identical individuals), which can be accomplished by choosing N to maximize private good consumption given a fixed G. From (1) and (2), private good consumption is given by

X = (f(N) - G)/N.   (7)

Maximizing X with respect to N yields:

X = f'(N).   (8)
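(The differentiation step behind (8): dX/dN = [f'(N)N - (f(N) - G)]/N² = 0, which gives f'(N) = (f(N) - G)/N = X.)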
This condition states that population is added up to the point at which the added income is equal to the per capita consumption of the private good. If the additional income were lower than X, for example, the newest member of the club would not have sufficient private income to assure that the current standard of living (private consumption) would be maintained.

With the level of G as well as N allowed to vary, the full club model can be analyzed. As population increases, aggregate income, and thus the maximum level of public good obtainable, increases, given that there is a positive marginal product of labor. However, the maximal level of private consumption decreases, since the marginal productivity of labor diminishes. In other words, as N increases, f(N) rises, but f(N)/N falls. An example of how the "opportunity sets" vary with N is given in Figure 2.2. Clearly, a high population allows for increased public good consumption, but at the cost of having less of the private good available (and greater congestion and other externalities in a more realistic model). In terms of the entrepreneur's decision to set the size of the club optimally, the locus of all possible opportunities is relevant, since any allocation of resources not on the locus cannot be utility maximizing.
[Figure 2.2. Opportunity sets in the (G, X) plane for three different community sizes N.]

[Figure 2.3. Three configurations of the opportunity locus and indifference curves, yielding an optimum at zero population, at the entire population, or in between.]
Figure 2.2 illustrates the situation in which there are only three sizes of jurisdictions, but the general point should be clear. Although the production possibilities set within each community is convex for a given N, the opportunity set may well be non-convex. As a consequence, a number of possibilities can arise, three of which are illustrated in Figure 2.3.⁴ Figure 2.3 shows that the optimal size of the community may be either zero, the entire population, or somewhere in between, depending upon the shape of the opportunity locus and the shape of the indifference curves of the identical individuals. The model is an extremely simple one, but the theoretical possibility of "corner solutions" arising from supply or demand conditions holds quite generally.⁵

The corner-solution possibility should not be particularly troubling to students concerned with policy, because the Tiebout model assumes away a number of "frictions", such as imperfect information, moving costs, and land-use restrictions, which make a corner outcome unlikely. However, to public finance theorists the prospect is more serious because it suggests that models without "interior" equilibria may arise in local public economics. If interior optima are to be studied (using the calculus), some further assumptions about the nature of utility and production functions, or about institutions such as land-use controls, must be made.⁶ To avoid this problem I assume in the next section that any solution to the optimization problem does not involve a corner solution.

4 See Stiglitz (1977) for additional details. See also Wooders (1980).
5 Multiple equilibria are also possible, but are not illustrated in the figure.
6 Courant and Rubinfeld (1981, pp. 297-298) describe a set of conditions sufficient to guarantee an interior optimum.
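To make the club trade-off concrete, the following is a minimal numerical sketch (mine, not the chapter's), assuming a hypothetical concave technology f(N) = A√N and a fixed public good level G. It traces the opportunity locus X(N) of equation (7) and locates the club size satisfying condition (8).

```python
# Sketch of the Section 2.1 club-size problem; the functional form and numbers
# are illustrative assumptions, not taken from the chapter.
from math import sqrt

A, G = 10.0, 12.0           # technology scale and the fixed public good level

def f(n):                   # aggregate output W = f(N): concave, f' > 0, f'' < 0
    return A * sqrt(n)

def x(n):                   # per capita private consumption, eq. (7)
    return (f(n) - G) / n

def f_prime(n):             # marginal product f'(N)
    return A / (2 * sqrt(n))

# Grid search for the N that maximizes X; by condition (8), X = f'(N) there.
n_star = max((k / 100 for k in range(1, 100001)), key=x)
print(f"N* = {n_star:.2f}")               # analytically, N* = (2G/A)**2 = 5.76
print(f"X(N*)  = {x(n_star):.4f}")        # 2.0833
print(f"f'(N*) = {f_prime(n_star):.4f}")  # 2.0833, confirming condition (8)
```

The coincidence of the last two printed numbers is just condition (8) at work: the entrepreneur expands membership until the marginal product of a new member equals the per capita private consumption.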
2.2. Optimum club size with congestion

The club model becomes more interesting and more realistic if one allows for congestion in the consumption of the public good. There are a number of ways in which congestion might be modeled, but one simple alternative is to account for congestion in the cost of providing the public good. To do so, I borrow from Henderson (1979) and denote the cost of providing a unit of output of the public good (e.g. a year of public education) by C(N). If the public good is a pure public good, then C'(N) = 0. If the public good has private good characteristics or is subject to congestion in consumption, then C'(N) is positive. I assume that consumers are able to purchase units of housing, H, at an exogenous price, p, and, for pedagogic reasons, that they are identical in both tastes and income; this assumption will be relaxed in a moment. The individuals solve the following optimization problem, where Y represents individual exogenous income and the price of X is one:

maximize U(X, H, G)   (9)

subject to X = Y - pH - [C(N)/N]G.   (10)
The first-order conditions are similar to those in (4) and (5), yielding:

N[(∂U/∂G)/(∂U/∂X)] = C(N)   (11a)

and

(∂U/∂H)/(∂U/∂X) = p.   (11b)
In (11a) the left-hand side represents the sum of the marginal rates of substitution of individuals' preferences for public versus private goods, while the right-hand side represents the marginal rate of transformation, or the cost of producing an additional unit of the public good when population is fixed. Equation (11b) represents the equivalent condition for the marginal rate of substitution between housing and the private good.
Now, return to the entrepreneurial decision about optimum club size. Maximizing private consumption with respect to N, with G fixed, yields the necessary condition given in (12):

C'(N) = C(N)/N.   (12)
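Equation (12) is interpreted just below; first, a quick numerical check under a purely illustrative assumption about costs, a hypothetical C(N) = a + bN² whose average cost C(N)/N is U-shaped:

```python
# Illustrative check of condition (12): C'(N) = C(N)/N at the optimum club size.
# The cost function C(N) = a + b*N**2 is a hypothetical choice, not from the text.
a, b = 100.0, 1.0

def avg_cost(n):    # average cost per resident, C(N)/N
    return (a + b * n**2) / n

def marg_cost(n):   # marginal cost of an additional resident, C'(N)
    return 2 * b * n

n_star = min((k / 100 for k in range(1, 10001)), key=avg_cost)
print(f"N* = {n_star:.2f}")                  # analytically, N* = sqrt(a/b) = 10
print(f"C(N*)/N* = {avg_cost(n_star):.2f}")  # 20.00
print(f"C'(N*)   = {marg_cost(n_star):.2f}") # 20.00: the two sides of (12) agree
```

The equality of the two printed costs is exactly condition (12): the average-cost minimizer is where marginal cost crosses average cost.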
Equation (12) states that the optimum size of the community⁷ is the point at which the average cost of the provision of public services is equal to the marginal cost of adding another individual to the club. For the standard U-shaped cost curve this is the point of minimum average cost. Thus, an entrepreneur wishing to maximize consumers' utility, and implicitly entrepreneurial rents, would choose a community size at which the public good is produced at minimum average cost. Models of this type, as applied to the study of local jurisdictions, assume that communities can control the entry of population as well as the provision of public goods, so that by choosing N to minimize the cost of providing public goods, they can maximize the price they can obtain for entry into the jurisdiction.⁸ Any entry charge is taken to be the same for everyone.

Assuming that the total population to be allocated among jurisdictions is a multiple of the optimum club size, the equilibrium solution would involve a host of identical clubs, all of equivalent optimal size. As shown by Henderson (1979), such an outcome would be efficient and stable. When consumers are identical, if any jurisdiction is above or below the optimum size, then the cost of public good provision in that jurisdiction will be higher than average, and hence a new club will be formed to provide the same public good bundle at lower cost. If the population were not a multiple of the optimal club size, the analysis would be greatly complicated because public goods would not be produced at minimum average cost in each community. Pauly (1970) argues that in general an equilibrium will not exist in such a case.

The club model provides a natural introduction to the Tiebout model because it describes optimal public good provision within communities as well as optimal jurisdiction (or club) size. The Tiebout model focuses primarily on interjurisdictional optimality in a world of varying tastes and incomes, and may be viewed simply as an analysis of the optimal provision of public goods in a series of clubs or jurisdictions.

7 Assuming that the optimum exists and is not a corner solution.
8 See, for example, McGuire (1974).

3. The Tiebout model

3.1. Allocating individuals with different incomes

Models of optimal community size are most interesting when the tastes and/or incomes of consumers, and hence their public good demands, vary. To get a sense of the
added complexity which results, I will treat the simplest case, in which there are two types of individuals with identical tastes but different incomes. This turns out to be slightly easier to analyze than the case of identical incomes and differing tastes, which is what the Tiebout model is about. I will also assume that each of these groups is sufficiently large to support jurisdictions of the optimal size and that individuals within jurisdictions pay equal tax shares.

A necessary condition for optimality within each jurisdiction is that the population must be divided into different types of clubs, each of which is homogeneous in income and satisfies the Samuelson condition for efficiency in the provision of public goods. A mixed club, on the other hand, would necessarily be inefficient. A typical possibility in a mixed club would be that the actual level of public provision would be an average of the demands of the two types of individuals. Neither the low nor the high income group would then be consuming its most preferred level of public services given the taxes paid. In this case both groups could improve their well-being by moving to homogeneous communities.⁹ Heterogeneity is inefficient when all individuals pay equal taxes within a jurisdiction. Of course, heterogeneity may make sense in a model in which individuals of differing skills are complementary factors in production, or if individuals simply have a taste for living among others of different tastes and/or incomes.

However, within the relatively simple club model we can ask a harder question. Does the heterogeneity result change when taxes paid are a function of income or of the benefits received from the consumption of public goods? To answer this question, consider a Lindahl pricing or tax scheme in which the tax paid per unit of public good is equal to the marginal benefit that the individual receives from the consumption of the last unit of the public good provided.¹⁰ By construction, the Lindahl taxation scheme has the property that both high and low income individuals will choose to consume the identical public service bundle given the "price" that they face. Although appealing in terms of the marginal conditions of utility maximization, this solution is nonetheless inefficient; both groups of individuals can increase their welfare by moving to jurisdictions with equal tax shares. To achieve the income redistribution implicit in the diversity of tax shares, it is better simply to redistribute income rather than to redistribute through induced overconsumption of public goods.¹¹

9 As Henderson (1979) argues, in a mixed club the Samuelson condition will provide for provision of public goods above that desired by the low demanders and below that desired by the high demanders. (There is a positive income elasticity of demand.) Such an arrangement lowers the potential utility obtained by all members of the community. The low demanders are forced to consume on the same production possibilities curve as in a homogeneous community, but at a point that is not a tangency with their indifference curves. The same is true for the high demanders, who underconsume rather than overconsume.
10 Lindahl prices here are set equal to each individual's marginal benefit from the public good at a given level of provision (not necessarily the efficient level).
Thus, heterogeneity is not efficient in this simple Tiebout model.

The argument developed here could be extended to the case in which there are a large number of types of individuals differing both in incomes and in tastes. Clearly, the optimal number of jurisdictions which provide public goods would increase. With costless mobility each individual could move to the community of his or her choice. In such a world the tax system would not be distorting, since public goods would be paid for with a head tax. The market analogy suggested by Tiebout would be complete: an equilibrium that existed would be efficient. While Tiebout did not describe how such an equilibrium might arise, one can easily imagine an entrepreneurial process that would lead to the creation of new communities.

It is at this point that the extreme limitation of the Tiebout model becomes clear. For the Tiebout model to generate an efficient outcome each community would have to produce at minimum average cost, yet there may not be enough people of each type to provide the appropriate scale for production efficiency. In addition, the assumption that new communities can be developed costlessly is crucial for the market analogy to be complete.¹² Without free entry individuals cannot be assured that opportunities will exist to consume the utility-maximizing level of the public good in a homogeneous community. Free entry is a very strong assumption, however. In urban areas and in urban models with labor markets and limited accessibility, the number of possible communities is restricted. A new community would either have to evolve at the urban fringe, with workers facing high costs of commuting, or new urban areas would have to develop - an unlikely prospect. Finally, the Tiebout competitive model does not allow for the property tax as a means of financing public goods - a complication to which I now turn.
11 See Henderson (1979) as well as Eberts and Gronberg (1981).
12 This point is developed in Courant and Rubinfeld (1982) and Epple and Zelenitz (1981).

3.2. The property tax

The property tax is the primary source of revenue for local governments in the United States. While its role as a revenue source has diminished substantially since the turn of the twentieth century, it still represents over 50 percent of total local revenues (54.7 percent in 1980). This figure includes only revenues that are raised locally; intergovernmental aid from the states and from the federal government today represents an additional major revenue source, a subject to which we turn in Section 6 of the chapter.

When the property tax is used to finance the local public good, and when tastes for housing are not perfectly correlated with tastes for public goods (i.e. when
housing and public goods are not perfect complements or substitutes), the Tiebout model becomes substantially more complex. In this case the entrepreneurs in each community must provide combinations of public goods and housing to suit all individuals. As a result, there must be a sufficient number of communities to provide the desired levels of public goods and of housing for all groups of individuals. The necessary number of such communities is likely to be extremely large.

To see exactly how the presence of a property tax alters the efficiency properties of the Tiebout model, I add a property tax on housing (assumed identical to land here to simplify matters) to the previous model. Again to simplify the discussion, I define the property tax rate, t, as the rate of taxation on annual housing expenditures. This rate is equal to the usual tax rate on property value divided by an appropriate discount rate (assuming an infinite stream of housing services). If p is the annual price of producing housing, and t is the tax rate, then p(1 + t) is the annual price of housing, gross of taxes, that the consumer faces when making his consumption decision. Utility maximization is now carried out subject to the budget constraint

X = Y - p(1 + t)H,   (13)
where ptH = C(N)(G/N), the per capita cost of providing the public good G. The first-order condition for optimal public service provision remains as before - the Samuelson condition that the sum of the marginal rates of substitution equals the marginal rate of transformation. However, consider the first-order condition given in (14), which provides the necessary condition for the optimum consumption of housing:

(∂U/∂H)/(∂U/∂X) = p(1 + t).   (14)

If there were no property tax, the efficient housing consumption condition would be given by (11b), reproduced here as (15), rather than (14):

(∂U/∂H)/(∂U/∂X) = p.   (15)
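A small numerical sketch of the wedge between (14) and (15), under purely illustrative assumptions (Cobb-Douglas preferences over X and H, with the public good bundle held fixed and the revenue side ignored): it shows housing demand falling as t rises, the underconsumption discussed next.

```python
# Illustrative sketch (not from the text): Cobb-Douglas utility U = X**0.7 * H**0.3,
# budget X + p*(1 + t)*H = Y (price of X is one).  Standard Cobb-Douglas demand
# gives H* = 0.3*Y / (p*(1 + t)), so the tax depresses housing consumption.
Y, p = 100.0, 1.0

def housing_demand(t, beta=0.3):
    """Housing demand at the gross-of-tax price p*(1 + t)."""
    return beta * Y / (p * (1 + t))

for t in (0.0, 0.1, 0.25):
    print(f"t = {t:.2f}: H* = {housing_demand(t):.2f}")
# H* = 30.00, 27.27, 24.00: housing consumption falls as t rises, which is the
# underconsumption implied by comparing (14) with (15).
```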
Clearly, the property tax alters the housing consumption choice, leading to underconsumption of housing (other things equal). This suboptimality in housing consumption will also lead to inefficiency in public good consumption, except when public good and housing choices are independent. In general, I would expect either complementarity or substitutability between housing and public service demand. To the extent that public goods and housing are complements, I would expect the underconsumption of housing caused by the property tax to lead to underconsumption of the public good. On the other hand, if public goods
and housing are substitutes, I would expect public goods consumption to be greater than would be efficient.¹³ The inefficiency of housing consumption is a crucial problem for the Tiebout models that follow. As long as the property tax distorts housing consumption, any conclusions about efficiency or inefficiency in the provision of public goods will be conditional on current housing consumption. Such analyses might be reasonable in a short-run model where housing consumption is fixed, but can lead to unreasonable results in a long-run model in which housing consumption is allowed to vary.
3.3. Production inefficiencies in the Tiebout model

Inefficiencies can also arise in Tiebout models when there are economies of scale in the provision of public goods and individuals assume that their migration will not alter the provision of public goods or taxes in either of the jurisdictions involved. Assume that land is a fixed factor in the production of the public good and that individuals cannot contract for public goods outside of their jurisdiction. Migration of a low income individual into the high income community reduces the per capita costs of providing the public good in the high income community, but it increases the per capita costs in the low income community. As a result, it is possible to have an equilibrium in which consumers live in a variety of small jurisdictions, each consumer optimizing by selecting the bundle of housing and public good that maximizes his utility, and each content, believing that there is no move that could make him better off. However, it may be possible to combine communities, taking advantage of the scale economies in the provision of public goods, and make everyone better off.¹⁴

Bewley (1981) illustrates this production inefficiency in a very special case where individuals have identical tastes for public goods and utility is separable

13 This is a standard result in the theory of the second best, whereby a failure in one market, the housing market, can create inefficiencies in other markets. For a discussion of the property tax in a general equilibrium setting, see Courant (1977). See also Dusansky et al. (1981). For additional studies which evaluate the question of efficiency of resource allocation in models with public goods and with various kinds of distortionary taxes, see, for example, Wildasin (1977), Hochman (1981) and Rufolo (1979). The possibility of efficient taxation has usually focused on the land tax, or other taxes that serve as head taxes. See, for example, Hochman (1981, 1982) and Bucovetsky (1981). The discussion of land taxation has occasionally focused on the "Henry George Theorem", which states in effect that with an optimal jurisdictional population the aggregate land rents will be identically equal to the optimal spending on the local public good. The obvious appeal of such a proposal is that one need not resort to distortionary taxation to finance the public sector. The result, unfortunately, holds only under rather extreme assumptions, and is therefore more of a curiosity than a fundamental result. For some discussion of the Henry George Theorem, see, for example, Flatters, Henderson and Mieszkowski (1974), Stiglitz (1977, 1983a, 1983b), Arnott (1979), Arnott and Stiglitz (1979), Berglas (1982), Hartwick (1980), Schweizer (1983) and Wildasin (1984).
14 In the language of game theory, the model would involve a game without a core.
586
D.L. Rubinfeld
between private and public goods. Consider an equilibrium in which there are two communities, each providing identical levels of the pure public good. Such an equilibrium exists because no individual has any motivation to switch communities, the two being identical. However, if the two communities were to combine, the new community could provide twice the level of public goods at no additional per capita cost, making everyone better off. Clearly, it is the inability of individuals to form coalitions, and the absence of land developers, that allows for an inefficient equilibrium.¹⁵

15 A more complex kind of inefficiency can arise when individuals have complementary tastes and are mismatched in their current jurisdictions, as discussed in Bewley (1981). To take a simple arithmetic example, assume that four individuals each supply one unit of labor and are each independently able to provide one unit of a public good with that labor. There are four distinct public goods, labeled GA, GB, GC and GD, and two communities, 1 and 2. Each individual's utility function is given as:

U(A) = 2GA + GB,
U(B) = GA + 2GB,
U(C) = 2GC + GD,
U(D) = GC + 2GD,

where GA + GB + GC + GD = 4. Thus, individuals vary in the extent to which they care about each of the four public goods that might be provided, and total production of public goods must be limited to the number of individuals living in each community. Now consider the following equilibria. In equilibrium I, A and C live in community 1, and B and D live in community 2. In community 1, GA = 1 = GC, while in community 2, GB = 1 = GD. Then,

U(A) = U(B) = U(C) = U(D) = 2.

In equilibrium II, A and B live in community 1, while C and D live in community 2. Then, in community 1, GA = 1 = GB, while in community 2, GC = 1 = GD, and

U(A) = U(B) = U(C) = U(D) = 3.

In equilibrium I, A and C both live in one community, but each is poorly matched and is therefore able to obtain a level of utility equal to only 2 units. The same is true for individuals B and D in community 2. There are no complementarities in consumption being exploited here, but the equilibrium is stable because any attempt by an individual to move to the alternative community (taking the public goods provided there as given) will result in a level of utility which falls from 2 to 1. The problem is the strong assumption that individuals do not take into account any shift in the production of public goods which might arise when they enter a community. Were entrepreneurs running the communities to encourage migration, aware of the option to change the level of public service provision, then equilibrium I, which is clearly inefficient, would no longer be stable. With proper entrepreneurial ability, equilibrium II would arise. There A and B live together in community 1, taking advantage of their complementarities, as do individuals C and D in community 2. In this case the utility level obtained by all is 3 and the equilibrium is efficient. These examples are simple, but they apply quite broadly. Even if the model allowed for the ability of governments to alter public service provision, the possibility of inefficiency would still exist.
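A quick computational check of the footnote's arithmetic (a restatement of the example above, not additional material from the chapter):

```python
# Verify the utilities in the two equilibria of the preceding footnote example.
WEIGHTS = {                 # each person weights one good at 2 and one at 1
    "A": {"GA": 2, "GB": 1},
    "B": {"GA": 1, "GB": 2},
    "C": {"GC": 2, "GD": 1},
    "D": {"GC": 1, "GD": 2},
}

def utility(person, goods_in_own_community):
    """Utility from the public goods provided in the person's own community."""
    return sum(w * goods_in_own_community.get(g, 0)
               for g, w in WEIGHTS[person].items())

# Equilibrium I: mismatched pairs (A with C, B with D), one unit of each good.
eq1 = {"A": {"GA": 1, "GC": 1}, "C": {"GA": 1, "GC": 1},
       "B": {"GB": 1, "GD": 1}, "D": {"GB": 1, "GD": 1}}
# Equilibrium II: matched pairs (A with B, C with D).
eq2 = {"A": {"GA": 1, "GB": 1}, "B": {"GA": 1, "GB": 1},
       "C": {"GC": 1, "GD": 1}, "D": {"GC": 1, "GD": 1}}

for label, assignment in (("I", eq1), ("II", eq2)):
    print(f"Equilibrium {label}:",
          {person: utility(person, goods) for person, goods in assignment.items()})
# Equilibrium I: all utilities equal 2; Equilibrium II: all utilities equal 3.
```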
3.4. Will an equilibrium exist?

Before proceeding much further it is important to ask whether there is an equilibrium consistent with any version of the Tiebout model. In fact, without additional institutional restrictions on mobility, one cannot be assured of this result.¹⁶ The non-existence problem can be seen most easily in a model with two income classes, all individuals having identical utility functions.¹⁷ A graphical demonstration is easier if I assume that the utility function is separable between the private good and the combined bundle of housing and the public good, as given in equation (16):

U(X, H, G) = V(X) + W(H, G).   (16)
Both low and high income individuals live in homogeneous communities, with no commercial and industrial tax base. Is such an outcome an equilibrium? Under property tax financing of the public good, the "tax-price" that an individual pays for a unit of public good is equal to the cost per unit of producing the public good times the individual's tax share, in this case the ratio of the individual's house value to the total house value (total tax base) in the community. With a head tax, everyone's tax share is 1/N, the reciprocal of the number of community residents. If the homogeneous community outcome is to be an equilibrium, no individual should have an opportunity to improve his welfare by moving to another jurisdiction.

As an example of why this might be a problem [following Henderson (1979)], consider the utility-maximizing calculus of one low income household. In both panels (a) and (b) of Figure 3.1 the low income individual is initially at point D, facing budget line AB in the low income community. However, by buying a small house in the high (rather than low) income jurisdiction, the low income individual effectively faces a lower tax-price for public goods than do the high income individuals with larger homes. The tax-price in dollars per unit of public good for the low income household can be represented as

PG = (t·p·HL)/G,   (17a)
where t = expenditures/tax base = cG/[p(HL + NHHH)] (recall that I am assuming only one low income entrant to the high income community).

16 A complete argument would also discuss stability, and would necessitate a description of the process by which public goods provision is determined, as well as a discussion of the adjustment of housing prices to the potential or actual movement of each of the income classes.
17 For a more complete discussion of this issue, see Buchanan and Goetz (1972) and Henderson (1979) for a theoretical perspective, and Inman and Rubinfeld (1979) for an empirical discussion. More general analyses of equilibria in models with public goods appear in Greenberg (1977, 1983), Richter (1978, 1982) and Foley (1967). See also Ellickson (1973, 1977) and Stahl and Varaiya (1983).
[Figure 3.1. (a) Low income household does not migrate; (b) low income household migrates.]
Here PG represents the tax-price, c the marginal cost of providing public goods (assumed equal to the average cost), and HL and HH the housing consumption of the low income and high income individuals (of whom there are NH), respectively.¹⁸ It then follows directly that

PG = c[pHL/p(HL + NHHH)] = c[HL/(HL + NHHH)].   (17b)
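To fix ideas, a small numerical illustration of (17a)-(17b) with hypothetical values (one low income entrant facing NH high income residents): the entrant's tax-price is well below the equal-shares head-tax price.

```python
# Illustrative tax-price computation for eq. (17b); all numbers are hypothetical.
c = 100.0    # cost per unit of public good (marginal = average)
H_L = 1.0    # housing consumed by the single low income entrant
H_H = 4.0    # housing consumed by each high income resident
N_H = 99     # number of high income residents

# Tax-price per unit of G for the low income entrant, eq. (17b):
p_G_low = c * H_L / (H_L + N_H * H_H)
# Under a head tax every resident would instead face an equal share:
p_G_head = c / (N_H + 1)

print(f"property-tax tax-price: {p_G_low:.3f}")   # ~0.252
print(f"head-tax tax-price:     {p_G_head:.3f}")  # 1.000
# The small-house buyer pays roughly a quarter of the equal-shares price,
# which is the migration incentive discussed in the text.
```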
Since the tax-price is a function of how much housing the low income person consumes, the budget constraint faced by the low income individual in the high income community varies with the level of housing consumption selected. In the high income community, the budget constraint faced by the low income individual¹⁸
is given by the line AC (in both panels), with housing consumption at H1 and public good provision at G1. The indifference curve U0 in panels (a) and (b) of Figure 3.1 shows the maximum level of utility obtainable given the prices that the consumer faces in the low income community. An equilibrium arises if the low income household would not find it advantageous to move to the high income community.

18 The price of housing is assumed to be exogenous and thus does not affect the tax-price.

In panel (a), the consumer is initially at point D in the low income community, facing budget constraint AB. If the low income individual were to move to the high income community, he would be required to buy the public services, G, offered there. G involves a low tax-price per unit of public services, but a high total expenditure on public goods. As a result, if G were purchased, the low income individual would have little income left to spend on housing and would consume housing H1, associated with point E on budget line AC. The benefits from the additional public good consumption are outweighed by the lost utility from diminished housing consumption. In this case the outcome is an equilibrium, since low income individuals have no desire to migrate elsewhere. Incidentally, it is here that the separability assumption is helpful. With separability the graphical analysis can focus on the trade-off between housing and the public good; private good expenditures are omitted since they are unaffected by changes in the price of the public good relative to housing.

In panel (b), however, the non-existence problem can arise. Here, the level of public good provided in the low income community is not as high as in the previous case, so that the low income individual faces a lower cost of the public good in terms of housing, even though the tax-price has fallen somewhat less than in the previous case. The move from point D to point F represents an increase in utility for the low income individual choosing to live in the high income community (even though the public good remains at G). If the result of calculations of this type is for low income individuals to continually move into high income communities, and for high income individuals to move to avoid the low income individuals, an equilibrium will not exist. Since there is nothing within the Tiebout framework which restricts individual mobility, it is perfectly reasonable, and even likely, to expect a non-existence problem of this kind.

How can existence be assured in such a Tiebout framework? One solution involves the use of various zoning and land use devices to limit the mobility of low income families. For the residents of the high income community, a given spending level per household can be obtained with a lower tax rate than in the low income community, since the high income tax base is larger. Admitting low income individuals, who purchase houses valued below the community average, lowers the tax base per family and raises the tax rate. This increased tax rate makes the high income community less attractive to future residents and lowers the value of remaining properties, causing tax rates to increase again. As a result, the incentive of current residents to keep out lower
income families may differ from the incentive of private developers, who may make greater profits by building high density housing units for lower income families than by constructing large homes.¹⁹

19 The arguments concerning the motivation of the various parties to use zoning as a restrictive device are given in White (1975) and Mills (1979). See also Rubinfeld (1979), Epple et al. (1978), Hamilton (1983a) and Zodrow and Mieszkowski (1983).

A low income entry barrier can be created if the high income community increases the level of public spending beyond the level preferred by high income individuals. This adds to the cost of entry for low income individuals, but it also lowers the utility of current residents. A more direct and effective approach is for the high income community to regulate the minimum size of houses. By fixing a minimum consumption level for housing greater than the housing demand of low income individuals, the cost of entry into the high income community is increased. If the minimum is set equal to the high income individuals' consumption levels, then the low income migrant will no longer face a lower tax-price for public services. By forcing equal housing consumption, the exclusionary zoning device changes the property tax into a "head" tax, with everyone in the community paying an equal share of the cost of the public good. Faced with a head tax, the low income individuals will not choose to move to the high income community, since they would have to buy the same level of housing and more of the public good.

The argument just made was developed by Hamilton (1975). In an attempt to rescue the Tiebout model, Hamilton shows that with perfectly elastic housing supply and zoning, the Tiebout equilibrium will exist and be efficient, despite property tax financing of the public good. If property taxes paid equal benefits received, there is no migration incentive (on that account); if they do not, there is. Of course, this result occurs only because the property tax has been effectively turned into a head tax and because the distortion in housing consumption caused by the property tax has been eliminated.

In the usual non-zoning case low income individuals desire to move into the high income community because the benefits from the increased consumption of public services outweigh the costs or taxes paid for those public services. Thus, the purchase of a small house generates a fiscal surplus or residuum for the low income individual. From the perspective of residents in the high income community, a "fiscal externality" has been created, since the new entrant does not fully pay for the additional public service costs. A fiscal externality is created in the low income community as well if the public good was originally provided at minimum average cost; in this case the departure of a resident could force the average cost of provision of public services to be higher for all remaining residents. Thus, the zoning device can simultaneously improve the welfare of high income households and eliminate the fiscal externality. Hamilton's argument shows that
to the extent that one can make the property tax a head tax, an efficient outcome can be ensured. However, without a head tax (even with Lindahl taxes which equate marginal benefit to marginal cost), migration incentives arise, generating externalities that are likely to lead to an inefficient outcome.²⁰ And there is reason to believe that actual zoning policies deviate substantially from the one which transforms a property tax into a head tax.

The analysis suggests that the pure Tiebout equilibrium will exist and be efficient if there are sufficient communities to match the tastes of all individuals for both housing and public service consumption, if the public good is produced at minimum average cost with respect to population, if profit-maximizing entrepreneurs are free to produce new communities to satisfy the demands of residents, and if a host of additional assumptions mentioned previously also hold.²¹

Even though the Tiebout model involves a series of strong and unrealistic assumptions, the insights that it gives about the manner in which individuals respond to fiscal variables are relevant for both theory and practice. The Tiebout framework is useful from a normative perspective because it defines a clear set of conditions which are necessary for optimality, while clarifying the intrajurisdictional and interjurisdictional aspects of efficiency. The "competitive" Tiebout world highlights the efficiency gains to be made when individuals are allowed to choose by "voting with their feet". It can be, and has been, useful from a positive perspective because it emphasizes the effect that variation in the supply of public goods can have on locational choice, as opposed to more traditional empirical analyses which do not take migration into account. However, both normative and positive aspects of the Tiebout model change somewhat when the land market is modeled explicitly. This additional complication is the subject of the discussion which follows.
20 The previous results are overstated somewhat, since they assume a positive correlation between tastes for public goods and income. If tastes and income are uncorrelated, efficiency and income mixing are not incompatible.
21 These conditions are sufficient, but not necessary, for an efficient equilibrium, and thus appear overly stringent. For example, Mark Pauly points out that one could have an efficient equilibrium at other than minimum average cost if there were economies of scale up to a point, but all communities were at the optimal size. See Pauly (1976), Pestieau (1983) and Zodrow (1983).

3.5. Capitalization of public spending and taxes in a Tiebout framework

I have assumed so far that the pre-tax price of housing in each of the communities is fixed. To the extent that housing prices respond to differences in public sector variables, the "capitalization" of public sector variables into housing prices may provide an adjustment mechanism that improves the efficiency of the Tiebout process. But even with capitalization, inefficiency, both in terms of the distribution
of population among communities and in terms of the appropriate level of public services within jurisdictions, is still possible.²²

Consider what capitalization means and how it fits into the Tiebout framework. To begin, one might ask: what are the effects of property taxes or public services on the value of land? For this discussion assume that when one purchases a house one buys simply a plot of land with a fixed structure on it, so that there is no distinction between land and the structures on it. Allowing for the substitution of capital for land in the production of housing changes the quantitative, but not the qualitative, nature of the analysis. The individual's willingness to pay for a plot of land will depend in part upon the bundle of public services that can be consumed once the land has been purchased and residence in the community established. However, the market price that is paid for the land will reflect not only the willingness to pay of the individual who eventually purchases the land, but also the willingness to pay of all potential purchasers. If all individuals have identical tastes for public goods - one interesting extreme - then the price of land will be bid up by competition to reflect the value of public goods to all, and complete capitalization will have occurred. At the other extreme, if the individual's demand for the public good is different from all others', then the price of land will not change and no capitalization will occur; the individual will be fortunate in being able to enjoy the consumer surplus associated with his unique tastes. Thus, with "sufficient" competition for land among individuals with "similar" tastes, one would expect higher taxes to lead to lower land values, and greater public services to lead to higher land values. Capitalization is the term which refers to the effect of fiscal variables on the capitalized or present value of the asset land, rather than on the annual rental price.

Because competition is a crucial determinant of the presence of capitalization, it is important to distinguish between capitalization in the short run, the intermediate run, and the long run. In the short run, with land (housing) in fixed supply, one would expect capitalization to occur with some frequency. In the intermediate run, with housing in variable supply, an increased supply of housing could diminish or conceivably eliminate capitalization. Finally, in the long run, in which the number of communities, and perhaps the number of metropolitan areas, is variable, capitalization is likely not to occur. The reason is that with an increased demand for public goods individuals would always have the option to

22 Capitalization has extremely important implications for policy since it can thwart or even counteract attempts to achieve equity through judicial reform. For an extensive discussion of this issue, see Inman and Rubinfeld (1979). Capitalization adds a complication because individuals moving into a community must pay what is in effect a two-part tariff. One part of the price of public services is contained in the portion of the price of housing which is associated with the capitalization of fiscal differences. The second part is the direct payment made in terms of property taxes. Any policy change which increases the cost of buying public services in an attempt to obtain equity will simultaneously lower land prices, compensating partially or wholly for the change in property taxes.
Ch. 11: Economics of the Public Sector
593
move to a newly developing jurisdiction rather than pay a higher price for housing in a developed jurisdiction. This distinction between the short, intermediate, and long run is crucial for understanding the more technical analysis which follows.

The Tiebout model can easily be modified to account for capitalization, beginning with the short run. Let H be the units of housing consumed, p the price of the stream of services available from that housing, and i the discount rate.23 Then the value of a house in the community, V, is given by

V = (pH - tV)/i.  (18)
Here the property tax rate, t, is defined as a tax on the value of housing. Equation (18) assumes complete or total capitalization of taxes, because total tax payments are deducted from the pre-tax value of the house to obtain the final after-tax value. Other things equal, an increased tax rate will lower the value of the house. Solving (18) for V expresses the house value in terms of the remaining variables:

V = pH/(i + t).  (19)
Equation (19) shows that higher tax rates lead to lower house values, assuming the level of public services provided is constant. The intermediate-run story would be similar, except that land price adjustments would differ from housing price adjustments because of the supply response. When there is imperfect capitalization, equation (19) becomes

V = pH/(i + at),  (20)

where a is less than one.24 Imperfect capitalization can occur in a short-run situation because individuals differ in their tastes and because the fixed number of communities limits the amount of competition for the variety of public services available. Partial capitalization may also arise if there are informational market imperfections.

23 This abstracts from the usual problems with selecting the appropriate discount rate.

24 There has been a substantial empirical literature dealing with the extent to which property taxes as well as public sector variables are capitalized. The most widely cited piece is by Oates (1969), but there has been an extended discussion, both empirical and conceptual, about the Oates work. For a review of this entire literature, see Bloom, Ladd and Yinger (forthcoming). See also Church (1973), Cushing (1984), Edelstein (1974), Heinberg and Oates (1970), King (1977), Levin (1982), McDougall (1976), Oates (1972), Polinsky and Rubinfeld (1978), Rosen and Fullerton (1977), Smith (1970), Wales and Weins (1974), Wicks, Little, and Beck (1968), Woodard and Brady (1965), Rosen (1982), Yinger (1982).
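To fix ideas, here is a minimal numerical sketch of equations (18)-(20); the rental flow, discount rate, tax rate, and capitalization parameter below are invented for illustration.

```python
# Illustrative house-value calculations for equations (18)-(20).
# All parameter values are hypothetical.

def house_value(pH: float, i: float, t: float, a: float = 1.0) -> float:
    """After-tax house value V = pH / (i + a*t).

    a = 1 gives full capitalization (eq. 19); 0 < a < 1 gives the
    imperfect-capitalization case (eq. 20).
    """
    return pH / (i + a * t)

pH = 12_000.0   # annual flow of housing services, in dollars
i = 0.05        # discount rate
t = 0.02        # property tax rate on house value

print(house_value(pH, i, t=0.0))       # 240,000: no-tax value
print(house_value(pH, i, t))           # ~171,429: full capitalization
print(house_value(pH, i, t, a=0.5))    # 200,000: half the tax capitalized
```

Consistent with the discussion above, the after-tax value falls as a larger share of the tax is capitalized.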
However, as a general rule, short-run and intermediate-run models are associated with capitalization, which can arise for a variety of reasons. Assume, for example, that public services are provided more efficiently in some communities than in others, in the sense that the tax cost of an equivalent bundle of services is lower. Then we would expect competition to lead some individuals to move into the more efficient community, raising the price of land there. Hamilton (1976), on the other hand, has argued that zoning restrictions on low income housing within a mixed community can lead to intrajurisdictional capitalization.25 In Hamilton's case capitalization occurs because the supply of housing of each type is not perfectly elastic, even in the intermediate run. In the long run, however, the prospect of new communities will be sufficient to eliminate capitalization, subject to the proviso that the tastes of individuals are sufficiently similar to allow all surpluses to disappear.26 To see this, imagine that some capitalization has occurred, i.e. that an additional unit of public service leads to a positive increment in house value, net of tax cost. Then a new community can be created costlessly that provides the higher level of service without increasing the price of land. The "rents" originally earned by the owner of the land whose value was high due to capitalization are dissipated when the new community is developed. However, since the assumptions associated with the Tiebout model - long-run assumptions - are not satisfied in the local public sector, the empirical presence of capitalization is not and should not be a surprise to public finance economists.27

25 In Hamilton's (1976) discussion of intrajurisdictional capitalization of public services, the possibility of efficiency with mixed communities is considered. Hamilton shows, under a set of strong conditions concerning mobility and the use of zoning to regulate housing consumption, that an efficient heterogeneous community can exist. The efficient solution occurs because everyone satisfies their public service demands, since their tastes are assumed to be indistinguishable. Housing demands vary, but these demands are also satisfied by a restrictive zoning ordinance. Those individuals with high incomes consume more housing but pay a higher price, while those with low incomes consume less housing and pay a lower price. The net effect of this full capitalization of the benefits and costs of public services is that no individual wishes to move to make himself better off. The efficiency of this solution arises only because everyone is assumed to have identical public service demands. The main point, however, does hold - stratification and homogeneity of individuals within jurisdictions is not a necessary condition for efficiency. See Bucovetsky (1981) for a more recent discussion of this and similar issues. See also Starrett (1981) and Scotchmer (1985b).

26 This is essentially the argument made by Edel and Sclar (1974).

27 See also Hamilton (1976), Epple, Zelenitz and Visscher (1978), Sonstelie and Portney (1980), Stiglitz (1983b), Topham (1982), and Wildasin (1983) for some methodological discussion. Brueckner's (1979) paper argues for capitalization, but does not consider long-run supply responses.

A complete analysis of how capitalization relates to the Tiebout model requires a framework which models both the decision to move among communities and the effect of the move on the tax base and the demand for public services. In addition, variations in individuals' preferences may influence the decision-making process within communities. Attempts to model such complex behavioral systems
have been made by a number of authors, yet none is completely satisfactory.28 Nevertheless, the discussion of capitalization by Yinger (1982), on which the following analysis relies, is illuminating.29

Capitalization depends on the willingness of potential or actual movers to pay for housing in a given community. For example, let P represent the amount of money per year per unit of housing services that a household is willing to pay to live in a house in a given community (not necessarily equal to p, the market price of housing); P is a function of the level of public services, G, and the property tax rate, t, in the community. The optimization problem of a household choosing among communities, with public services fixed, is:

maximize U(X, H, G)  subject to  Y = X + P(G, t)H + tP(G, t)H/i.  (21)
The expression tPH/i represents the tax rate times total house value, V. Thus, this model is essentially equivalent to the previous one, except that P is now a function of both G and t. Households' bids for housing vary with public services and taxes, as can be seen by differentiating this system not only with respect to X and H, but also with respect to G and t. The complete set of first-order conditions is given below:

(i)    ∂U/∂X = λ,
(ii)   ∂U/∂H = λP(1 + t/i),
(iii)  ∂U/∂G = λ(∂P/∂G)H(1 + t/i),
(iv)   (∂P/∂t)(i + t) + P = 0.    (22)

30 Condition (iv) is obtained by differentiating the budget constraint with respect to t, and solving.
First, assume that all households have identical tastes and income. In this case competition among households will eliminate all consumer surpluses, so that the function P(G, t) represents an equilibrium pattern of housing prices. To see whether capitalization occurs, fix G at some arbitrary level, say G*, and then solve equation (22, iv) for the bid price function P. The solution to this differential equation is

P = λ(G)/(i + t),  (23)

where λ(G) is a constant of integration whose value varies with G.

28 See, for example, Rose-Ackerman (1979, 1983), Portney and Sonstelie (1978), as well as Epple, Filimon and Romer (1983, 1984).

29 Yinger's model is limiting in part because there is no explicit public choice mechanism. Voter models are important in Epple et al. (1983, 1984), Martinez-Vasquez (1981), Wildasin (1984), and Rose-Ackerman (1979).
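Before moving on, a quick symbolic verification that (23) satisfies (22, iv); this sketch assumes the sympy library, with lam(G) standing in for the constant of integration.

```python
# Symbolic check that the bid function of eq. (23) solves the
# differential equation (22, iv).
import sympy as sp

i, t, G = sp.symbols("i t G", positive=True)
lam = sp.Function("lam")                  # constant of integration

P = lam(G) / (i + t)                      # eq. (23)
residual = sp.diff(P, t) * (i + t) + P    # left side of eq. (22, iv)
print(sp.simplify(residual))              # prints 0
```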
It is easy to see that the solution satisfies (22, iv) by differentiating (23) with respect to t and choosing the appropriate constant of integration. Note that if the tax rate is zero, the constant of integration is equal to the bid that would be made for housing in a no-tax world. Since a change in the level of public services, G* (through additional federally financed local goods), alters the constant of integration, equation (23) shows that the extent of capitalization depends upon the level of public services.31 More generally, the pattern of housing prices will depend upon the choice of housing and the public good when all four equations in (22) are solved simultaneously for H, G, and X in terms of the remaining variables. Yinger illustrates this solution in the case of a Cobb-Douglas utility function. The utility function is given in (24) and the solution for the bid price function in (25):

U(X, H, G) = ln X + α ln H + β ln G,  (24)

P(G, t) = γG^(β/α)/(i + t),  (25)
where γ is a constant of integration. With identical households there will be no differences in willingness to pay for the same public service bundle, and P will equal the equilibrium price of housing. Within a jurisdiction where G is fixed, any variation in tax rates is capitalized immediately into the price of housing. The dollar amount of capitalization depends upon the function P and, implicitly, the tastes of the individuals. For example, the extent of capitalization will vary directly with β, the strength of preference for the public good, and inversely with α, the strength of preference for housing. Equation (25) also describes the variation in housing prices reflecting different levels of G when t is fixed. Specifically, the change in housing prices is given by

∂P/∂G = (β/α)P(G, t)/G.  (26)

Here the percentage change in housing prices is a positive function of the strength of preference for the public good.
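A small numerical sketch of this Cobb-Douglas case may help; the preference parameters and the constant of integration below are purely hypothetical, and the numerical derivative simply confirms equation (26).

```python
# Cobb-Douglas bid-price function of eqs. (25)-(26).
# All parameter values are hypothetical illustrations.
alpha, beta, gamma = 0.6, 0.3, 50.0   # housing weight, public good weight, constant
i, t = 0.05, 0.02                     # discount rate, property tax rate

def bid_price(G: float) -> float:
    """P(G, t) = gamma * G**(beta/alpha) / (i + t), eq. (25)."""
    return gamma * G ** (beta / alpha) / (i + t)

G = 100.0
P = bid_price(G)
dP_dG = (beta / alpha) * P / G        # eq. (26)

# A centered numerical derivative agrees with the analytic expression:
h = 1e-6
print(dP_dG, (bid_price(G + h) - bid_price(G - h)) / (2 * h))
```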
Now consider what happens if either the tastes or the incomes of individuals differ. Recall that in a model with two income classes, low income and high income households are likely to segregate.32 In such a case the capitalization question becomes more complicated, since the willingness to pay of a low income household to live (in a given size house) in a high income community depends on the trade-off between the value of additional public services and the cost reflected in additional taxes. If individuals choose not to move between jurisdictions, then there may be two different tax and service bundles in two different communities with the same housing prices. This follows because, as implied by (26), differences in the dollar expenditure on public services may not be reflected dollar for dollar in differences in house values. How much individuals value public services affects capitalization, not the dollar expenditures. Thus, there may be little capitalization if differences in services are not valued by one of the income groups.33

31 And, in general, on the level of all commodities.

32 This segregation is not a necessary condition for Tiebout equilibrium. See Yinger (1982) for a discussion of this issue. See also Henderson (1985).

33 Finally, consider the decision among residents of the community to select a public service level. Because changes in the public budget affect the location decisions of potential community entrants, the prospect of capital gains and losses associated with capitalization can influence the public sector decision. In the simplest model, in which all individuals have identical tastes and incomes, the possibility of capital gains does not affect the choice of the public service bundle. By choosing public services that set marginal benefit equal to marginal cost (implicit in the Samuelson condition), the individual is paying the maximum amount for the house that he would be willing to pay if he moved into the community. Even if non-residents have different tastes, the individual in the community is still maximizing property values. However, if the resident is planning to leave the community, a different calculation might occur. As long as the political process is dominated by current residents, an equilibrium with property value maximization in a simple model (and with no business property) yields an efficient outcome. If movers do dominate or have a substantial impact, the analysis becomes more complex. The property value maximization argument has been utilized by a number of public finance researchers to build a simple structure for analyzing the local capitalization question. See, for example, Brueckner (1979). Note that the analysis becomes substantially more complex with heterogeneous communities, but that analysis is left for a later section.

To sum up, the analysis has shown that, as a general rule, the no-capitalization case can arise only in the very long run, in which entry of new communities is free - a necessary condition for the Tiebout model to mimic the usual competitive model. However, even in the long run, such a possibility is quite unrealistic, not only because of the cost of such development, but also because of the spatial nature of urban areas, which depend upon one or more centralized workplaces for employment. Despite some theoretical suggestions to the contrary, capitalization is likely to be an important phenomenon. Fortunately, this outcome is useful empirically: I will show in a later section how the presence of capitalization can be used to estimate the demand for local public goods empirically.

3.6. The efficiency of the Tiebout model with capitalization
How is our previous discussion about Tiebout modified in a model in which capitalization may occur? In particular, is the outcome efficient in the intrajurisdictional sense, conditional on the property-tax-distorted housing choice? Allowing for capitalization complicates the problem. Imagine, for example, two individuals, A and B, both with identical income and tastes. The individuals reside initially in separate jurisdictions providing different public goods, but with
housing prices equated. In the first jurisdiction individual A is consuming his desired level of public services, given the price of housing. Individual B, however, is underconsuming public services and is initially worse off than A. Since an equilibrium in such a model occurs only when utility levels are equal for both individuals, one possible outcome is for capitalization to occur, causing the price of housing in B's jurisdiction to fall. In the resulting equilibrium, B will face a lower price of housing and will consume more housing and fewer public services than A. Both, however, will achieve the same level of utility and will not be motivated to move. Thus, capitalization can compensate individuals, eliminating their incentive to move. However, this equilibrium will, in general, not be efficient, for the same kinds of reasons given previously. For example, it may be efficient from a production point of view to have one, rather than two, communities, yet individuals will not have the appropriate incentive to take advantage of economies of scale. With or without capitalization, the efficiency of the outcome in a Tiebout model depends upon the provision of public services and individual mobility. And capitalization cannot rule out either production inefficiencies or suboptimality in consumption within communities.34

34 For a model involving strong assumptions in which the efficient outcome will be obtained, see Brueckner (1979). See also Sonstelie and Portney (1980) and Westhoff (1979).

3.7. Testing the Tiebout model

Tests of the Tiebout model can be divided into two distinct types. First are the abstract tests of whether the Tiebout model yields an efficient outcome, with each individual obtaining his most preferred level of public services in a homogeneous community. The second type of test attempts to evaluate whether fiscal variables affect migration, so that there is at least a partial sorting of individuals into communities based on public sector variables. Some of these tests look at mobility explicitly, while others look indirectly by focusing on the capitalization of public sector variables into land values.

One conceptual test of the Tiebout model follows from our previous discussion of capitalization. In a model in which all residents of the community are homeowners, property value maximization can lead to efficiency in the Tiebout world. The test of Tiebout efficiency is then to see whether aggregate property values are maximized for the current distribution of individuals among jurisdictions. The argument is most fully developed in Brueckner (1979, 1982, 1983), and is illustrated in Figure 3.2. Brueckner suggests that the graph of the relationship between aggregate total property values in a community, APV, and public output, G, is likely to be an inverted U-shape. The most important assumption in his analysis is that the local public choice mechanism is one which selects public
goods to maximize property values. In this case the point G*, where public service provision is optimal, is the point at which property values are maximized.

Figure 3.2. Aggregate property value, APV, as a function of public output, G: an inverted U reaching its maximum, APV*, at the optimal output.

A test of Tiebout efficiency involves looking across communities to see whether they are near the peak of property values. Such a test can theoretically be made by estimating a multiple regression of the form given in equation (27):

APV = α + βG + γZ + ε.  (27)
Here Z is a vector of variables (other than public sector variables) that determine property values, and ε is a random disturbance term. Note that this multiple regression model is similar to what has been called a "hedonic" regression model, where Z includes a list of housing, accessibility, and neighborhood characteristics. The model accounts not only for G but also for other variables that affect property values, and thus is a kind of "reduced form" explanation of why property values vary across communities. If the model is correct and public goods are efficiently provided, then near the optimum small changes in public spending should have no effect on property values. Whether the coefficient β in equation (27) is significantly different from zero can be tested to evaluate whether property values respond to changes in public goods provision. Insignificance is at least consistent with property value maximization and efficiency. The test works because property value maximization implies that public goods will be provided up to the level at which a tax-financed increase in public goods yields a competitive return in terms of increased house value. The tax increase used to finance the public good takes into account the opportunity cost of the
public funds in terms of what they could earn in the private sector (all funds are assumed to come from investment). A coefficient of zero in the equation in which t is allowed to vary is the proper efficiency test. A more general test, applicable in a more general model than the one just described, has been suggested by Inman (1978). Inman suggests that we analyze a structural equation such as (27a), in which the tax variable is included:

APV = α + βG + γZ + φt + ε.  (27a)

In this case dAPV/dG = β + φ(dt/dG) = R, where R measures the return on the public investment. If the annual value of this return is R = 1 + i, where i is the competitive rate of return, then the level of public investment is efficient. Inman's empirical work supports this conclusion for a group of jurisdictions on Long Island, New York.
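To make the mechanics of the test concrete, the following sketch simulates data satisfying equation (27a) and computes the return R; all data-generating values are invented, and the exercise is purely illustrative rather than a replication of Inman's work.

```python
# A stylized version of the property-value test in eqs. (27)-(27a),
# run on simulated data; names and magnitudes are invented.
import numpy as np

rng = np.random.default_rng(0)
n = 200                                   # number of jurisdictions
G = rng.uniform(500, 1500, n)             # public output per capita
t = 0.01 + 1e-5 * G + rng.normal(0, 0.001, n)   # tax rate rises with G
Z = rng.normal(50, 10, n)                 # one control variable
APV = 100_000 + 20 * G - 900_000 * t + 1_000 * Z + rng.normal(0, 5_000, n)

# OLS estimate of APV = alpha + beta*G + phi*t + gamma*Z + eps
X = np.column_stack([np.ones(n), G, t, Z])
alpha, beta, phi, gamma = np.linalg.lstsq(X, APV, rcond=None)[0]

dt_dG = 1e-5                              # tax response to G (known here)
R = beta + phi * dt_dG                    # dAPV/dG as in eq. (27a)
print(f"estimated return per unit of G: {R:.2f}")
# Efficiency would then be judged by comparing the annualized return
# with the competitive rate 1 + i.
```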
There are, however, substantial methodological and empirical problems associated with the narrow property value maximization test. First, the specification of the hedonic regression affects the interpretation of the results; in particular, the regression is not likely to be linear. Second, it is difficult to measure all the appropriate control variables that ought to appear in the vector Z. Third, β may be insignificant even though there is no Tiebout efficiency. As Brueckner notes, one cannot be assured that some communities do not overprovide public services while others underprovide them, so that the net effect appears to be a zero coefficient on the G variable. In addition, any empirical test which relies on an insignificant result is a weak one, since we cannot rule out the possibility that the regression model is improperly specified.35 Thus, this test of Tiebout efficiency is not likely to be very useful from an empirical perspective.

35 Tests of the Tiebout hypothesis using estimating equations like (27) do have some additional deficiencies. As Portney and Sonstelie (1978) point out, a linear form assumes that differences in property tax rates between communities have the same effect on the value of homes independent of the size of those homes. This suggests that it is tax payments rather than tax rates that ought to be capitalized. Sonstelie and Portney suggest an alternative Tiebout test which they claim has the advantage of not depending on the presence of capitalization. There are additional methodological problems with this approach, but it is a conceptually appealing one.

Do public sector variables affect the decisions of individual households who move between jurisdictions? The best of the relatively scanty evidence, given in Reschovsky (1979), suggests that high income households are influenced by variations in public services, especially education. However, Reschovsky finds no support for this effect among low income movers. His study is particularly appealing because it focuses solely on the behavior of movers rather than on all individuals, and is a micro rather than an aggregated study.

Most other Tiebout studies focus on the implications that such movement may have on the distribution of spending and on the spending demands within communities, usually as proxied by income. For example, Hamilton, Mills and
Puryear's (1975) crude empirical work suggests that there is more homogeneity of income and public good provision within suburban jurisdictions than within the central city. Given the vast set of choices that exist in suburban areas, one expects such homogeneity in a world in which public sector variables matter. However, we are clearly a long way from having homogeneous suburbs, as is illustrated in the empirical analysis by Pack and Pack (1977).

Perhaps the most widely discussed test of the Tiebout hypothesis is the one described by Oates (1969). Oates suggested, along the lines of the argument spelled out by Brueckner, that if the Tiebout model is an accurate depiction of the local public sector, the effect (on the margin) of increased public services on property values, holding taxes constant, should be positive. Likewise, the tax variable, holding services constant, should have a negative sign, and the net effect of the two changing jointly should be zero, at least for the median or decisive voter. A good deal of discussion of the theory underlying the Oates paper has led to the view that the empirical work does not provide a direct test of Tiebout efficiency. This should not come as a surprise, since Tiebout equilibrium can occur without capitalization, and capitalization can arise without Tiebout efficiency. The Oates paper and other follow-up papers do suggest, however, that capitalization of both public spending and taxes is prevalent.

Finally, an alternative test of the Tiebout mechanism uses survey data. Gramlich and Rubinfeld (1982) analyzed responses by a random sample of Michigan voters to questions about the voters' desired levels of public expenditures. The responses were used to estimate demand functions, which in turn were used to evaluate statistically the variation in public sector demands within various kinds of communities across the state. If the Tiebout model is appropriate, then one would expect greater homogeneity of demands in suburban communities, where there is substantial locational and public sector choice. The data do support this view. Within the suburbs of the city of Detroit and several other large metropolitan areas, public service demands are relatively homogeneous, while demands are less homogeneous in cities with smaller suburbs or in rural areas.
4. Empirical determination of the demand for local public goods
4.1. Introduction

Empirical estimation of the demand for public goods is a major focus of public finance. Throughout the 1950s and 1960s many researchers attempted to explain variations in local public expenditures as reflecting socioeconomic characteristics of the population as well as the level of intergovernmental aid. In the past decade
or so, researchers advanced this literature substantially by concentrating on the optimizing behavior of the parties demanding public goods.36 The natural reaction was to borrow directly from the optimization literature of consumer theory, an approach which has a number of advantages over the previous methods. Specifically, it may be possible to distinguish between demand and supply characteristics, to specify the appropriate price and income variables, and to model the effect of grants-in-aid on local spending. However, a number of difficulties not present in standard consumer theory arise in modeling local public goods. I will treat both empirical and theoretical problems in this section, beginning with a discussion of the conceptual problems which underlie demand estimation.

36 For two examples, see Borcherding and Deacon (1972) and Deacon (1977).

The first conceptual problem involves the aggregation of preferences. Individual demands for the public good are translated into a community demand through some kind of political process. As a result, estimation of the demand for a publicly provided good must either explicitly or implicitly model this process. The usual resolution of this problem involves the use of the median voter model.37 The median voter model is a demand-side model which takes the community demand for a single-dimensional public good to be the median of the individual demands for the good. Of course, the actual public good provided will depend upon the supply side as well - whether and under what conditions public officials will provide the public good desired by the median voter. More about this later. For the moment, however, I assume that public provision comes about through a majority rule voting process. Under a set of reasonably strong assumptions (including single-peaked preferences, a single-dimensional public good, and no agenda setting), the median voter's demand will be the level of the public good actually provided: no alternative proposal can win a majority of the votes in an election against the level demanded by the median person.

37 The median voter model was proposed by Bowen (1943), and extended by Barr and Davis (1966) and Bergstrom and Goodman (1973). See also Mueller (1976) and Martinez-Vasquez (1981).

Under some political processes there is no guarantee that the level of community provision of the public good will represent the quantity demanded by any one individual in the community. If this possibility occurs, demand estimation using aggregate data becomes very difficult. The median voter model is appealing, among other reasons, because it avoids this problem. The paper by Bergstrom and Goodman (1973) serves a valuable function in that it clearly specifies a sufficient set of very restrictive assumptions that makes the use of aggregate data to estimate demand functions possible. Whether and to what extent the model is empirically validated is a subject for further discussion. First, however, it will be useful to discuss some additional conceptual problems
peculiar to public goods - the measurement of output and the determination of price.

Individual demand functions are derived as follows. Each individual is assumed to have a quasi-concave utility function, U(X, H, G). To allow for variation in tastes among individuals with different socioeconomic characteristics, these utility functions are conditioned on a set of characteristics, denoted by Z. The individual's problem is then:

maximize U(X, H, G; Z)  subject to  Y = X + pH + PG·G.  (28)
Here the price of the private good has been normalized to equal 1, and the price of the public good is denoted by PG. The public good need not be a pure public good. In fact, I will allow for "congestion" in the sense that the level of the public good will depend upon population in the jurisdiction. Nevertheless, it will be useful at this point to assume that all individuals within the jurisdiction consume the same level of G, so that G need not be subscripted by individual household.38

38 This could be accounted for by subscripting the public good, allowing within-jurisdiction benefits to vary among individuals.

The Tiebout model has no supply side in which the process by which the public good is produced is clearly specified. One relatively straightforward specification of supply, given by Inman (1979), is to describe G as the flow of public services obtained from a production function whose inputs include the public capital stock and the population of the community. The capital stock itself is obtained from another production process which uses both labor and materials as inputs. This supply-side specification allows for the treatment of a number of interesting local public finance issues which will not be taken up in this chapter. First, it introduces the local public labor market (as well as the private market) explicitly into the model. Second, it raises the question of what the appropriate level of the public capital stock ought to be, as well as questions concerning the rate of investment in the capital stock and the source of financing. Unfortunately, these issues cannot all be covered in this chapter and are left for further reading and thought.39

39 The estimation of production functions and, more generally, the study of the supply of local public goods has been undertaken in a substantial list of articles. A sample would include Carr-Hill and Stern (1973) and Fare et al. (1982). See Inman (1979) for a longer list.

Individual demand functions depend not only on individual budget constraints, but also on the budget constraint facing the public sector, which in turn determines the individual tax-price, PG. To incorporate this into the analysis I let
c equal the marginal and average cost of production of G (at least near the optimum). The municipal budget constraint is then

cG = T + A,  (29)

where T is the level of locally raised taxes and A is the total amount of aid given to the municipality by higher levels of government. This aid is assumed to be "non-matching", i.e. independent of the level of local tax effort. If matching aid were also included, the budget constraint would become

cG = (1 + m)T + A,  (29a)

where m is the matching rate as a fraction of local taxes T. If local revenues are raised by a property tax, then the tax-price of a dollar of local public spending to a homeowner is given by

PG = cpH/B,  (30)

where pH is the value of the individual's house and B is the total tax base in the community (the sum of piHi over households i).40 Here the tax-price represents the individual's cost of purchasing an additional dollar of local public services, with no commercial and industrial tax base present.41

40 This avoids all issues surrounding the distinction between assessed and market values for housing, as well as the questions of rental as opposed to owner-occupied housing and the shifting of the tax on non-residential property.

41 A more complex calculation of price would account for the deductibility of local taxes under the federal personal income tax and the possibility of local tax credits associated with the payment of property taxes.

Solving the maximization problem given by equation (28), subject to the individual budget constraint and the budget constraint of the local government (29), yields (in general) the individual's demand function for the public good. This demand function is given in its most general form in equation (31), where a superscript asterisk denotes the desired public good:

G* = g(Y, PG; Z).  (31)
Following Bergstrom and Goodman (1973), most theories of the demand for public goods assume that individuals demand a flow of services of uniform quality, G₀, from the publicly produced good. Because of "congestion" or crowding in the consumption of that good, G₀ is a function of the population of the community, N. For example, one might postulate that

G₀ = G/N^g,

where G is the observed level of the public good and g is thought to lie between 0 and 1. Note that when g is equal to 0, G is essentially a pure public good. When g equals 1, the consumption of the publicly provided good equals the individual's per capita share of the public good produced, so that G is a pure private good. Values of g between 0 and 1 allow for goods that are mixed, with partly private and partly public characteristics.

If the tax-price associated with G is given by (30), then the tax-price per unit of G₀ is PG₀ = PG·N^g. Note that when G is a pure public good (g = 0), PG₀ represents the price of one unit of the public good which, once provided, is available to all individuals. Thus, with the simplifying assumption that all houses are uniformly valued, the price would equal 1/Nth of the cost of producing a unit of the pure public good. When G is a pure private good (g = 1), PG₀ represents the cost of producing one unit of the good for each individual, which is N times the cost per individual of producing the good.

The most popular approach to estimating demand functions, in part because of its tractability, uses aggregate data on spending at the jurisdiction, county, or perhaps state level. Were everyone in the community to receive the level of public services that he or she demanded, the demand function might appear as follows:42
log G₀ = β₁ + β₂ log Y + β₃ log PG₀ + β₅ log Z + ε.  (32)
However, empirically one explains variation in actual output, G, not variation in G₀. If the substitutions

log G₀ = log G - g log N  and  log PG₀ = log PG + g log N

are made, then the actual output demand equation is

log G = β₁ + β₂ log Y + β₃ log PG + β₄ log N + β₅ log Z + ε,  (33)
where β₄ = g(1 + β₃). The observations for each variable differ by jurisdiction, but the jurisdiction subscript has been suppressed.

42 The log-log form is chosen for simplicity because the parameter estimates are elasticities. However, one should note that the only utility function globally consistent with such a demand function is the Cobb-Douglas, with unitary income and price elasticities.
With the log-log specification, N appears as a separate variable in the demand equation. Thus, an estimate of g can be obtained from the estimate of the coefficient β₄ and the estimate of the price elasticity of demand, β₃.43

43 Crowding can occur with respect to dimensions other than population. For example, Gramlich and Rubinfeld (1982) specify crowding in terms of both the income and the property tax bases. This allows for the possibility that the provision of public goods will vary among individuals of different income classes living in different size houses.
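To illustrate how g is recovered in practice, the following sketch generates data satisfying equation (33) and backs the congestion parameter out of the OLS coefficients; all parameter values are hypothetical.

```python
# Recovering the congestion parameter g = beta4/(1 + beta3) from the
# aggregate demand regression of eq. (33); simulated data.
import numpy as np

rng = np.random.default_rng(1)
n = 500
logY = rng.normal(10.0, 0.3, n)           # log median income
logPG = rng.normal(0.0, 0.2, n)           # log tax-price
logN = rng.normal(9.0, 0.5, n)            # log population

b1, b2, b3, g_true = 2.0, 0.6, -0.3, 0.8  # hypothetical parameters
b4 = g_true * (1 + b3)
logG = b1 + b2*logY + b3*logPG + b4*logN + rng.normal(0, 0.05, n)

X = np.column_stack([np.ones(n), logY, logPG, logN])
coef = np.linalg.lstsq(X, logG, rcond=None)[0]
g_hat = coef[3] / (1 + coef[2])           # implied crowding parameter
print(f"g_hat = {g_hat:.3f} (true value {g_true})")
```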
4.2. Empirical issues

Strong as the median voter assumptions are, they are not sufficient to allow for demand estimation using aggregate data unless we know the characteristics of the median voter. Bergstrom and Goodman (1973) showed that, under a certain set of rather strong conditions, an observation on expenditures and public goods in a community can be treated as if it were the amount demanded by a homeowner with the median income and the median house value of that community. Their rather strong set of conditions shows one possible set of sufficient conditions for demand estimation.

A number of the Bergstrom-Goodman assumptions are worthy of particular attention. First, all jurisdictions are assumed to have income distributions that are simple proportional shifts of each other. For example, if the distribution of income in one community is normal with mean $10,000 and variance $100,000, a second city with mean income $20,000 would have a normal income distribution with the same $100,000 variance. Second, each individual's tax-price is a constant elasticity function of individual family income. This is equivalent to assuming that all individuals have identical constant income elasticities of demand for housing.44 Finally, the income and price elasticities are assumed to be related so that the distribution of desired demands is a monotonic function of income.

44 In this case log H = C₀ + η log Y, where η is the income elasticity of demand for housing.

To see why the monotonicity assumption is crucial, consider the distributions of desired demands given in Figure 4.1. Panel (a) illustrates the monotonic case, in which the quantity demanded by the median-income person is the median quantity demanded. However, if the distribution is as in panel (b), the median desired level of spending is no longer the level desired by the median-income person. Instead, the individual with the median income demands Gmed, the highest demand, while individuals with incomes Y1 and Y2 demand G*, the median quantity demanded.

What constraints on the parameters of the model insure that the median voter result works? The answer can be derived by differentiating the demand function with respect to income, noting that the gross effect on demand of the change in income must be positive (or negative) everywhere, so that differentiating
Figure 4.1. Desired demand for the public good, G, as a function of income, Y. In panel (a) demand is monotonic in income and the median income, Ymed, demands the median quantity, Gmed; in panel (b) demand is non-monotonic, with the median income demanding Gmed while incomes Y1 and Y2 demand the median quantity G*.
(33) with respect to log Y yields:

d log G/d log Y = β₂ + β₃ d log PG/d log Y.  (34)
Now assume (as do Bergstrom and Goodman) that the public good is property tax financed and that there is a constant income elasticity of demand for housing, η (i.e. log H = C₀ + η log Y). Since log PG = log(cp/B) + log H from (30), it follows that log PG = C₁ + η log Y, where C₁ is a constant. Then, substituting into (34),

d log G/d log Y = β₂ + β₃η.  (35)

This suggests one useful empirical test of the median voter model: the income and price elasticities of demand for public goods, β₂ and β₃, together with the income elasticity of demand for housing, η, must make β₂ + β₃η positive. If this condition does not hold, then all empirical results will be suspect.45

45 To the extent that empirical estimates are available, they do tend to support this necessary condition for the median voter model.
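Once elasticity estimates are in hand, the condition is trivial to check; the two cases below use illustrative magnitudes (not estimates) to show that the condition can hold or fail.

```python
# Necessary condition for the median voter framework, eq. (35):
# beta2 + beta3*eta > 0. The elasticity values are illustrative.
for beta2, beta3, eta in [(0.7, -0.3, 0.8),   # condition holds (0.46 > 0)
                          (0.2, -0.4, 1.0)]:  # condition fails (-0.20 < 0)
    print(beta2 + beta3 * eta > 0)
```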
The "median voter" methodology has been used by a large number of authors to estimate demand functions for public goods. Results vary by type of data base and expenditure item studied.46 However, a few general conclusions for education may give the reader a flavor of the results. Income elasticities vary from about 0.4 to 1.3, but most are substantially less than 1. Price elasticity estimates are generally very low, in the range of -0.2 to -0.4. While income elasticity estimates are reasonably comparable across studies, price elasticity estimates are not, since empirical studies differ substantially in their definition of tax-price.47 Still, the estimates of the elasticities are consistent with the use of the median voter model.48 Additional support comes from the fact that most empirical evidence indicates that β₄ is reasonably close to 1, a result consistent with the fact that education involves congestion (given a low price elasticity of demand).

46 Reviews are given in Inman (1978) and Denzau (1975). See also the review of the literature on demand for education in Bergstrom, Rubinfeld, and Shapiro (1982). While most of the literature deals with demand in the United States, there has been similar evidence outside the United States. See Pommerehne and Frey (1976), Pommerehne (1978), and Pommerehne and Schneider (1978).

47 A useful summary of these comparative results is given in Bergstrom, Rubinfeld and Shapiro (1982).

48 Another troublesome problem with the median voter approach is that preferences must be not only single-peaked, but also one-dimensional. However, recent work suggests that majority rule outcomes may be unstable in a more general setting. See McKelvey (1976, 1979) for a discussion. Deviations and variations from median voter outcomes which might arise are discussed elsewhere by Comanor (1976) and Hinich (1977). The best review of the median voter model is given by Romer and Rosenthal (1979b). See also Filimon, Romer and Rosenthal (1982), Romer and Rosenthal (1979a), and Ott (1980).

Nevertheless, the primary criticism of the median voter model is that other political theories are better able to explain public expenditure levels. Since Inman treats these issues in greater detail elsewhere in this volume, only a few issues are mentioned here. In many states a jurisdiction's expenditures are not determined by referenda. As a result, the role of political parties, information, and voting becomes extremely important. Even in the context of referenda-determined expenditures, the median voter model requires strong assumptions. For example, it assumes that all eligible voters vote and that the suppliers of the public services - the boards of education or city councils - play an entirely passive role in determining the referenda agendas. Both Romer and Rosenthal (1978) and Denzau, Mackay and Weaver (1981) have suggested models in which agendas may not allow voters to choose between the median desired level and other alternatives. Romer and Rosenthal argue that if the alternative to passage of the proposal is a reversion to a substantially lower level of spending, then voters may opt for a higher than median desired level. This reversion agenda-setting argument is troublesome, since many communities offer a sequence of elections. In such a case, even if one or two referenda fail, a final passage can ensure continued expenditures. In addition, the literature on agenda setting does not fully resolve the question of whether electorates would,
when aware of the referenda-setting activity of the local governments, rise up in opposition and either move to another community or vote out the people in power. Further development is needed of the reasons that limited entry to the political process and/or the high costs of mobility among jurisdictions may result in agenda setting.

Romer and Rosenthal point out a number of empirical problems suggesting that demand function estimates may not provide strong support for the median voter model. One critique, which they call the "multiple fallacy", arises if, for example, all individuals desire twice the actual level of spending provided in the community. Given the model of equation (33), the only change in the empirical estimates of the demand parameters would be a change in the intercept. Since log 2G = log 2 + log G, the intercept in the equation with doubled demands would become log 2 + β₁, or log(2e^β₁), but all of the other coefficients would remain unchanged. Why and under what conditions voters may be coerced into proportionally under- or overconsuming public services is an unanswered, but nonetheless troubling, question.

A second empirical issue surrounds the use of median income as an explanatory variable. To the extent that income distributions are similar among communities, one cannot be sure that the significance of the median income variable in an estimated demand equation supports the median voter model any more than it would a mean voter model or, for that matter, a model associated with individuals in any "fractile" of the income distribution. Current results are consistent with the use of median income, but no one has provided (to my knowledge) a robust test which distinguishes between the median and the mean, for example. If the distribution of demands (or, more correctly, of marginal rates of substitution) for public goods is nearly symmetric, then the mean-median distinction becomes trivial. For asymmetric distributions, however, the difference can be important. As I discuss in Section 5.3, the median public good demanded would be the predicted outcome of majority rule voting, while the mean demand relates directly to the interjurisdictionally efficient provision of the good. With the mean substantially different from the median, majority rule is a political rule which leads to economically inefficient outcomes.
4.3. Some additional empirical issues

The demand function given in (33) specifies a level of real services provided. However, data on public goods are usually given in terms of dollars of expenditure. To account empirically for the difference between expenditures and output, G must be multiplied by an index of the price of public services. With the log-log demand form, this is equivalent to including price as a separate right-hand
variable. Since price is unavailable, most researchers use the average wage rate of public employees as a proxy. However, this proxy may not be a very good one, because it does not include non-labor inputs to production and because the average wage may not account for the mix of labor quality. Of course, attempts to control for quality are often made. For example, in studies of education, quality can be proxied, although imperfectly, by the number of years of teacher experience or by the educational training of the teacher.

Demand functions should be estimated for renters as well as homeowners. However, in the discussion to this point tax-price has been calculated under the assumption that the median voter lives in the median value home. In the unlikely case that the distribution of tax-prices facing renters is identical to that of homeowners, the median voter assumption may be satisfactory from an empirical point of view. Even if tax-prices differ, differences in other variables may be sufficient to compensate, so that demands are similar. For example, renters may face relatively low prices, causing them (other things equal) to demand more of the public good; but they may have fewer children or lower incomes, causing them (other things equal) to demand less. On net, it would be possible, although fortuitous, for renter demands and homeowner demands to be the same. However, most evidence suggests that renters' demands differ from homeowners' in a systematic way and that voter turnout in local referenda is substantially lower for renters than for homeowners.49 As a result, demand estimation that does not control for renters may lead to biased estimates of demand functions. Note that this problem is really due to aggregation, since with individual data one could estimate different demand functions for renters and homeowners. In any case, the usual solution to the renter problem is simply to include as an explanatory variable the percentage of renters in the community. This solution is suspect, since it implies that renters have price and income elasticities identical to those of homeowners and that their demands are multiples of one another.

49 See Rubinfeld (1977), for example.

By far the greatest econometric attention has been focused on the specification of the price term in the demand equation. One issue arises because the inclusion of the tax base in the denominator of the price term implicitly assumes that an increase in the tax base of the jurisdiction will lower the price of public services. Whether this is true for commercial and industrial property depends upon the assumptions one makes about both the nature of public service demands by industrial property and the ability of commercial and industrial property to shift the property tax - forward in terms of higher prices, backward in terms of lower wages, or onto consumers living outside the jurisdiction who purchase the products.
In response, Ladd (1975) attempted to allow for different impacts of changes in the tax base on spending. To see how her procedure might work, recall that PG = cpH/B, where B is the residential tax base. Now let C represent the commercial and industrial tax base. We can account for various treatments of commercial and industrial base by writing

PG = [cpH/B][1 - a(C/(B + C))].  (36)

If a equals 0, then commercial base does not alter the tax-price (there is no shifting), while if a is greater than 0, commercial base lowers the tax-price. For example, when a = 1, PG = cpH/(B + C). Ladd's empirical estimates of a are in the range of 0.6, suggesting that there is some forward shifting of the commercial and industrial base.
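A short sketch of equation (36), with invented cost and tax-base figures, reproduces these polar and intermediate cases.

```python
# Tax-price under Ladd's treatment of commercial/industrial base,
# eq. (36). All dollar figures are hypothetical.
c = 1.0            # marginal (= average) cost of a unit of G
pH = 80_000.0      # value of the individual's house
B = 40_000_000.0   # residential tax base
C = 10_000_000.0   # commercial and industrial tax base

def tax_price(a: float) -> float:
    """PG = [c*pH/B] * [1 - a*C/(B + C)], eq. (36)."""
    return (c * pH / B) * (1 - a * C / (B + C))

print(tax_price(0.0))   # 0.0020: no shifting, commercial base irrelevant
print(tax_price(1.0))   # 0.0016 = c*pH/(B + C): spread over the whole base
print(tax_price(0.6))   # Ladd's estimate: partial forward shifting
```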
The specification of the price variable also depends on the nature of grants from higher levels of government. If all grants are matching and open-ended (no budgetary limitation), then the price term becomes cpH/[(1 + m)B]: the price variable is simply recalculated and the demand equation estimated as before. However, some aid is non-matching, while a number of matching grants are closed-ended, with communities opting to receive all available aid. In either case the grant does not alter the price of the public good on the margin; rather, it serves to increase community wealth. As a result, the magnitudes of closed-ended matching grants and of non-matching grants should be viewed as additions to income, since they increase the consumption possibilities of the median voter. To the extent that the presence of these grants is capitalized into land values, however, both individual and community tax bases may change, as may the tax-price facing the median voter.50 While this specification is theoretically correct, empirical evidence suggests that grants of a non-matching nature may have a greater impact on spending than ordinary income. To account for this, many authors include a separate grants variable in the demand equation, so that the effect of an incremental dollar of income need not equal the effect of an additional dollar of non-matching grant.

50 This statement is not quite correct, as discussed in Gramlich (1977). See Section 6 for more detail about the grants variables.

Most empirical estimates of demand functions assume that the distribution of tastes of voters is no different from that of non-voters. Evidence on this issue is quite limited, despite the fact that a number of socioeconomic variables are known to affect turnout. However, one must control simultaneously for all voter attributes when looking at differences in demands. One attempt to do this [Rubinfeld (1980)] provided a weak test that supports the median voter hypothesis. Rubinfeld considered three alternative models of non-voter behavior. The
first hypothesis was that non-voters are alienated because their demands are either substantially lower or substantially higher than the level actually provided. A second hypothesis was that non-voters are indifferent, in the sense of being content with current spending levels and thus not finding voting worthwhile. The third was that the distributions of voter and non-voter demands are no different, so that the choice to vote is essentially random and independent of tastes. The study concluded that the random hypothesis could not be rejected, while the others could. But because the study was limited to one jurisdiction in the State of Michigan, the results may not generalize. In studying public employee turnout and voting behavior in Michigan, Gramlich, Rubinfeld and Swift (1981) found the contrary: their work suggested that non-voters are opposed to tax limits and generally support public spending relative to those who voted. In addition, Gramlich and Rubinfeld (1982) found that public employees with greater tastes for public goods voted at a much higher rate than private employees. Whether the turnout rates reflect differences in underlying tastes or the possibility of earning rents is difficult to say.
4.4. Estimating demand functions in a Tiebout world

The approach to estimating demand functions just described is based on a median voter approach to the within-jurisdiction determination of the provision of local public goods. No recognition has been given in the previous discussion, and in much of the literature, to the fact that Tiebout-related migration might be occurring at the same time. Goldstein and Pauly (1981) raise this important issue, suggesting that the presence of interjurisdictional migration can lead to biased parameter estimation if demand function estimation is carried out as suggested. The following relatively brief discussion of the issue borrows directly from their paper.

In Section 4.1 I suggested that it would be appropriate to estimate a demand function of the form given in equation (33), repeated here:

log G = β₁ + β₂ log Y + β₃ log PG + β₄ log N + β₅ log Z + ε.  (33)
In the classic normal regression model, the right-hand explanatory variables which determine public spending G are assumed either to be fixed (exogenous) or, more realistically, to follow a probability distribution that is independent of (or uncorrelated with) the disturbance term ε. The dependent variable, log G, and the disturbance, ε, are assumed to be normally distributed random variables. The Tiebout process which brought about the distribution of households and public spending demands, however, is one in which individuals search and move on the basis of the level of public spending. In general one would expect that migration
based on public sector demands would lead to changes in the distribution of income in the communities from which individuals leave and to which individuals move. If that change in distribution alters the income of the median voter then demand estimation will be biased. Broadly speaking this bias results from the fact that the demand and supply of public goods are being simultaneously determined. Properly, consistent estimation should take into account the fact that while changes in income cause changes in public spending, changes in public spending (through migration) cause changes in income. The same argument can be made with respect to tax-price as well, but the main point should be made clear. Except in unusual circumstances, if individuals sort themselves on the basis of public goods provision, least-squares single-equation estimation of demand functions will yield biased and inconsistent results. The obvious solution to the problem of bias is to correctly specify a two (or more) equation system which determines income and public spending jointly and hopefully allows the demand equation to be identified. However, this is easier said than done, since identification requires information about the supply or migration relationship either in the form of a variable that affects migration but not public spending determination (i.e. affects Y, but not G) or in the form of a restriction on the covariance of the cross-equation error term, a restriction that may have little or no economic interpretation. Research work in this area, both theoretically and empirically, is currently underway, as in Rubinfeld, Shapiro and Roberts (forthcoming). If one is willing to make further assumptions about the nature of the Tiebout process, then it is possible to determine the direction of the bias which results when least-squares estimation is used, and in the process to get some intuition about how an unbiased or at least consistent estimator can be determined. Assume (following the spirit of Goldstein-Pauly) that there are a number of communities containing individuals sorted on the basis of their demands for public goods. To simplify the analysis, assume also that income is the only determinant of the demand for public goods. The income elasticity of demand is also assumed to be positive. Suppose that the demand for public goods is given by G =a+fY+ e, and concentrate on the lowest income community. If all low income individuals resided in the community, then we would have reason to believe that omitted variables would cause the error term e to be symmetrically and perhaps normally distributed. However, consider what happens when individuals sort themselves on the basis of their demands for the public good. All those with relatively low demands, i.e. with less than zero, will choose to locate in the low income, low spending community, since there is no lower spending alternative. However, the choice is different with those with a positive disturbance. Some of them will choose (if the disturbance is large enough) to locate in the higher spending jurisdiction in which residents with somewhat higher incomes live. This suggests
that the error distribution associated with the low spending jurisdiction will become truncated, with the truncation determined by individual demands for public goods. Such a truncation or grouping of data leads to bias when the grouping is based on the values of the dependent variable, spending on public goods. In this particular case the bias arises because the error term is positively correlated with income: high error term individuals are more likely to live in higher income, higher spending jurisdictions than those with negative error terms. Of course, the opposite story would hold were we to discuss the sorting of individuals at the high end of the income distribution; there negative errors would be associated with lower incomes. On balance, least-squares estimation in this truncated model will lead to overestimates of the income elasticity of demand for public goods. The cause of the misestimation is the fact that the individual with the median income will not be correctly identified when demand estimation is done using aggregated data. For example, one is likely to incorrectly conclude that the individual whose demand determines the level of spending in the low income community is the individual with the median income in that community.

If the Tiebout process works along the lines just described, what then is wrong with the estimation procedure suggested by Bergstrom and Goodman? One answer is that the assumption of proportionality of income no longer holds. In the community providing the lowest level of G, the income of the individual with a zero disturbance value will be less than the median income of those in the community. This follows because the community will contain a number of individuals with relatively high incomes but low demands for the public good (since those with low incomes and relatively high demands moved out). The reverse will be true for a community providing a high level of the public good: the distribution of income in that community will be skewed to the left, rather than to the right as in the low income community. The distribution of income in the high income community will as a result not be a multiple of the distribution in the low income community; the third as well as the first moment of the distribution of income will have changed.
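To make the direction of the bias concrete, consider the following small simulation. It is my own illustrative sketch, not from the chapter: the parameter values, the lognormal income distribution, and the perfect-sorting rule are all assumptions. Individuals with demands G*_i = α + βY_i + e_i are sorted across communities on the basis of G*_i, and community spending is then regressed on community median income:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumptions, not from the chapter).
alpha, beta = 2.0, 0.5            # true demand: G* = alpha + beta*Y + e
n_people, n_communities = 10_000, 50

Y = rng.lognormal(mean=3.0, sigma=0.5, size=n_people)  # incomes
e = rng.normal(scale=2.0, size=n_people)               # demand disturbances
G_star = alpha + beta * Y + e                          # desired spending

# Perfect Tiebout sorting: rank individuals by desired spending and split
# them into equal-sized communities, lowest demands in the first community.
order = np.argsort(G_star)
groups = np.array_split(order, n_communities)

# Each community provides its median desired level; the analyst observes
# community spending and the median income of its residents.
G_comm = np.array([np.median(G_star[g]) for g in groups])
Y_med = np.array([np.median(Y[g]) for g in groups])

# Least-squares regression of community spending on median income.
X = np.column_stack([np.ones(n_communities), Y_med])
slope = np.linalg.lstsq(X, G_comm, rcond=None)[0][1]

print(f"true income coefficient:    {beta:.3f}")
print(f"OLS estimate under sorting: {slope:.3f}")  # biased upward
```

Because high-disturbance individuals end up in higher income communities, the community-level error is positively correlated with median income, and the estimated income coefficient exceeds the true β.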
5. Applications of the demand approach

5.1. Alternative methods for estimating the demand for local public goods

The median voter model is the most popular, but not the only, approach to estimating demand functions for public goods. While a number of alternatives have been tried, we will focus on two that seem particularly intriguing. The first approach uses micro (individual) survey data as opposed to macro or aggregated
data. It should be noted that voter surveys are subject to the criticism that the answers do not reflect true preferences for public goods because of the hypothetical nature of the questions asked.51 The ability to estimate demand functions using survey data depends upon the nature of the community or communities being studied. If there is no expenditure variation among the communities being studied, demand functions cannot be estimated. With expenditure variation, the estimation process becomes feasible. To see how this approach might work, consider a referendum voting example.52 The alternatives in a local millage election may be to increase local spending or to keep it at the current level. Assuming that the millage increase is very small, a "yes" vote is a revealed preference for greater spending and a "no" vote for lesser spending (it is assumed that everyone votes). Obviously, public spending, to which voters respond, is the same for all individuals in a jurisdiction. In Figure 5.1, panels (a) and (b) illustrate two distributions of desired demands for public goods within the same community. Under either distribution, survey studies of voter behavior in a millage election should show similar results: roughly half of the voters would state that they voted yes and half would state that they voted no, whether the distribution of demands had a high variance or a low variance. However, the two distributions are likely to be associated with different demand functions for public goods. For example, if demand is monotonically related to income and both communities have identical income distributions, then the demand function in panel (b) would have a higher income elasticity of demand than the demand function in panel (a) (given identical tastes). Unique income elasticities and other demand parameters cannot be estimated when actual expenditures do not vary. To see how one might estimate a demand function when expenditures do vary, examine Figure 5.2, in which panels (a) and (b) describe two different distributions of demands for public goods, panel (b)'s distribution having the greater variance. Suppose that survey data are available for two communities, one with spending of $1000 per pupil and the other with $1500 per pupil. In the first community 50 percent of the voters say that they want more spending and 50 percent say they want less. At this point either the distribution in panel (a) or panel (b) could represent the underlying demand for public goods. Now include information from the voting in the second community. Assume that 20 percent of the population say they want more spending and that the underlying characteristics of the population (income, price, etc.) are identical in the two jurisdictions. Then, given a normal distribution of desired demands, only the distribution in panel (b) is consistent with the survey responses in both communities.

51 A great deal of attention in local public finance has been placed on the notion of designing schemes to get people to reveal their preferences correctly. For some examples of attempts to derive demand-revealing preference techniques, see Tideman and Tullock (1976).
52 See Rubinfeld (1977) as well as Akin and Lea (1982), Fischel (1979), and Citrin (1979).
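The identifying power of expenditure variation can be seen with a back-of-the-envelope calculation based on the numbers in the text: 50 percent favoring more spending at $1000 per pupil and 20 percent at $1500. Assuming, as an illustration, that desired demands are normally distributed with the same mean and variance in both communities, the two survey responses pin down both parameters:

```python
from scipy.stats import norm

# Survey facts from the text: the share wanting *more* spending at each
# observed expenditure level (identical populations in both communities).
G1, share1 = 1000.0, 0.50
G2, share2 = 1500.0, 0.20

# If desired demand G* ~ Normal(mu, sigma), then
# P(G* > G) = 1 - Phi((G - mu)/sigma), so (G - mu)/sigma = Phi^{-1}(1 - share).
z1 = norm.ppf(1 - share1)  # 0.0
z2 = norm.ppf(1 - share2)  # roughly 0.842

# Two linear equations in (mu, sigma): G1 = mu + z1*sigma, G2 = mu + z2*sigma.
sigma = (G2 - G1) / (z2 - z1)
mu = G1 - z1 * sigma

print(f"mean desired spending: {mu:.0f}")    # 1000
print(f"std. dev. of demands:  {sigma:.0f}") # about 594
```

With responses from only one community, mu and sigma cannot be separately determined, which is why expenditure variation is essential.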
[Figure 5.1. Panels (a) and (b): two distributions of desired demands G* within the same community, with probability on the vertical axis; panel (b)'s distribution has the greater variance.]
Obtaining survey information from more than one community enables one to estimate a demand equation, because it gives information not only about the mean of the distribution of desired demands but also about its variance. One expects that the greater the variance in spending levels among the communities surveyed, the more accurately the demand for public goods can be estimated. Using survey data collected by Courant, Gramlich and Rubinfeld (1979), Bergstrom, Rubinfeld and Shapiro (1982) analyze school expenditure demands. Respondents were asked to state whether they preferred more, less, or the same spending on public schools, after being given information about the characteristics of schools, public spending on schools, and the tax costs of voting for higher spending. The qualitative responses are used to identify demand functions in the following manner.53

53 This approach is similar to an approach taken earlier by Rubinfeld (1977) and by Shapiro (1975), and also relies heavily on the qualitative choice literature of McFadden (1973).
[Figure 5.2. Panels (a) and (b): two distributions of demands for public goods with different variances; spending levels of 1000 and 1500 are marked on the G axis.]
Assume that the demand for public goods is given by:54

log G*_i = log D_i − log e_i.  (37)
Here D_i summarizes the non-random portion of the individual's demand function, G*_i is the desired level of the public good, and e_i is a random error term. (The negative sign is solely for convenience.) Let G_i equal the public output in the individual's jurisdiction, and let d, a number greater than 1, be a threshold parameter that accounts for substantial differences between desired and actual output and triggers an individual response of dissatisfaction.55 Responses for more spending (a vote for the millage), less spending (a vote against), or the "same" spending (a
decision not to vote) occur when:

(i) (more)  G*_i > dG_i,
(ii) (less)  G*_i < G_i/d,
(iii) (same)  G_i/d < G*_i < dG_i.  (38)

54 The fact that responses are made with error is not a problem as long as the random error is not systematically related to some of the explanatory variables.
55 d can vary by individual without substantially complicating the analysis.
Taking logarithms of (38) and substituting from (37) yields the following:

(i) (more)  log e_i < log D_i − log d − log G_i,
(ii) (less)  log e_i > log D_i + log d − log G_i,
(iii) (same)  log D_i + log d − log G_i > log e_i > log D_i − log d − log G_i.  (39)
The key to econometric estimation of the micro demand model is to realize that the probability that (39, i) holds is given by the cumulative distribution function of the density of log e_i, evaluated at (log D_i − log d − log G_i). Similarly, the probability that (39, ii) holds is equal to one minus the cumulative distribution function evaluated at (log D_i + log d − log G_i), and, finally, the probability that (39, iii) holds is equal to one minus the sum of the two previous values. The three numbers represent the probability or likelihood that an individual with a given demand function will respond more, less, or the same. Demand function parameters are estimated by maximizing the joint likelihood function obtained by multiplying together, over all individuals in the sample, the probability of each individual's response. A number of alternative statistical assumptions are possible, but Bergstrom, Rubinfeld and Shapiro chose the multiple logit or logistic model (the ordered logit, to be specific) for estimation purposes.56 To see how the approach works, assume that the systematic portion of the demand function is given by

log D_i = β_0 + β_1 log Y_i + β_2 log PG_i,

and that the variance of the error term is σ². Then each of the equations in (39) can be written in terms of the underlying demand parameters. For example, (39, i) now becomes

log e_i < (β_0 − log d) + β_1 log Y_i + β_2 log PG_i − log G_i.  (40)
Because of the normalization of the logistic distribution, the statistical output will usually include coefficients that are deflated by the standard deviation of the error term.57

56 For a detailed discussion of the logit model, see Pindyck and Rubinfeld (1981, ch. 8).
Since d is constant but G_i varies across jurisdictions, the right-hand side of the estimated equation, excluding the constant term, is as follows:

(β_1/σ) log Y_i + (β_2/σ) log PG_i − (1/σ) log G_i.  (41)
The dependent variable (not written) is the logarithm of the likelihood of being in a given group (more, the same, or less). The constant term will be a somewhat complex function of β_0, σ, and d, but it is not of immediate concern here. The price elasticity of demand, β_2, is simply the negative of the ratio of the coefficient of log PG_i to the coefficient of log G_i, while the income elasticity, β_1, is the negative of the ratio of the coefficient of log Y_i to the coefficient of log G_i. Note that the coefficient of log G_i provides an estimate of (the reciprocal of) the standard deviation of the error term and thus tells how accurately one can distinguish between alternative demand functions. If G_i does not vary, as when the data come from a single jurisdiction, one cannot estimate the error variance, and thus the parameters of the demand function cannot be determined.58

When this technique was applied to the Michigan survey data, the results were generally comparable to those obtained using aggregate data. Individual price elasticities were similar to those from median voter models, lending support to both approaches. Of course, survey data are more expensive to obtain than the data for cross-sectional median voter studies, but survey data contain more detail about voter characteristics than aggregate sources. In addition, the micro results were informative about the behavior and attitudes of individuals according to items such as race, income, occupation, religion, age, and family composition.

57 log e_i/σ is unit logistic, since it has mean zero and variance one.
58 The constant term is a bit more difficult and depends on the nature of the computer program. In general, however, we can obtain separate estimates of the constant β_0 and the threshold parameter d. See Bergstrom, Rubinfeld and Shapiro (1982) for details.
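To fix ideas, here is a minimal maximum-likelihood sketch of the estimator implied by (37)-(41), on simulated data. This is my own illustrative code, not that of Bergstrom, Rubinfeld and Shapiro: the parameter values and sample design are assumptions, and log e_i/σ is treated as a standard logistic variate rather than being normalized to unit variance.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # logistic CDF

rng = np.random.default_rng(1)

# Assumed "true" parameters for the simulation.
b0, b1, b2 = 1.0, 0.6, -0.4   # beta_0, income elasticity, price elasticity
log_d, sigma = 0.10, 0.5      # threshold d > 1 (in logs) and error scale
n = 5_000

logY = rng.normal(10.0, 0.5, n)        # log income
logPG = rng.normal(0.0, 0.3, n)        # log tax-price
logG = rng.normal(7.0, 0.4, n)         # log actual spending (it must vary!)
logD = b0 + b1 * logY + b2 * logPG     # systematic demand, as in (40)
loge = sigma * rng.logistic(size=n)    # log e_i, with scale sigma

# Responses per (39): more = 2, same = 1, less = 0.
y = np.where(loge < logD - log_d - logG, 2,
             np.where(loge > logD + log_d - logG, 0, 1))

def negloglik(theta):
    c0, c1, c2, cd, ls = theta
    s = np.exp(ls)                      # keep the error scale positive
    z = (c0 + c1 * logY + c2 * logPG - logG) / s
    p_more = expit(z - cd / s)          # probability of (39, i)
    p_less = 1.0 - expit(z + cd / s)    # probability of (39, ii)
    p_same = 1.0 - p_more - p_less      # probability of (39, iii)
    p = np.choose(y, [p_less, p_same, p_more])
    return -np.sum(np.log(np.clip(p, 1e-12, None)))

fit = minimize(negloglik, x0=[0.0, 0.5, 0.0, 0.2, 0.0], method="Nelder-Mead",
               options={"maxiter": 50_000, "maxfev": 50_000})
c0, c1, c2, cd, ls = fit.x
print("estimates (b0, b1, b2, log d, sigma):",
      [round(v, 2) for v in (c0, c1, c2, cd, np.exp(ls))])
```

Because log G_i enters the index with coefficient −1/σ, the error scale, and hence the structural demand parameters, are recoverable only because G_i varies across jurisdictions, exactly as the text notes.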
A very different approach to demand estimation uses the fact that with capitalization the price of housing and land reflects individuals' valuations of all of the attributes associated with housing, including the level of public goods. Assume that the supply of land and/or new communities is not perfectly elastic, so that capitalization is maintained even after supply responds. Then, under reasonable conditions, one can obtain information about preferences for public goods from variations in property values across communities. This hedonic approach has been emphasized by Rosen (1974) and evaluated empirically by a long list of authors, most of whom have focused on environmental attributes rather than on public goods.59

To develop the hedonic method I alter somewhat the utility maximization approach used earlier in this chapter. Rather than viewing housing and the public good as distinct commodities, I assume that housing is a commodity which itself consists of a series of attributes, one of which is the level of the public good available in the community. Other attributes might include structural characteristics of the house itself (house size, number of bathrooms, etc.), neighborhood characteristics (crime rate, socioeconomic status of the community, etc.), and accessibility characteristics (distance to work, distance to shopping areas, etc.). To distinguish this view of housing from the previous one, I let h(G, a) represent the housing bundle, where G is the level of the public good and a represents the vector of all other housing attributes. As before, p represents the price of housing, but now p(h) is a function of the housing attribute vector. The p(h) function might be linear, as has been assumed previously, but it might be non-linear as well, depending upon one's assumptions about the extent to which housing attributes can be combined into housing bundles. The utility-maximization problem is now:

maximize U(X, h(G, a)) subject to Y = X + p(h).  (42)

59 See, for example, Harrison and Rubinfeld (1978) and Nelson (1978).
The first-order conditions with respect to G and X are

(i) (∂U/∂h)(∂h/∂G) = λ(∂p/∂h)(∂h/∂G),
(ii) ∂U/∂X = λ,  (43)

where λ is the marginal utility of income (the multiplier on the budget constraint).
Combining (43, i) and (43, ii) yields:

W = (∂U/∂G)/(∂U/∂X) = ∂p(h)/∂G.  (44)
The left-hand side of (44) is the marginal rate of substitution between the public and private good, while the right-hand side is the incremental change in house value associated with an increase in the level of the public good. In other words, (44) tells us that willingness-to-pay for the public good, W, can be measured by differences in house values, assuming that all other differences in housing attributes are held constant. Once information about the willingness-to-pay is known empirically, the demand function will also be known. The direct link occurs because the marginal rate of substitution is itself the compensated (equal utility) demand curve for the public good, which differs from the usual uncompensated demand function solely because of income effects. The hedonic approach depends crucially upon the presence of capitalization. As a result it cannot be applied in the case of a long-run Tiebout model in which capitalization is not present. In the short run or intermediate run, however, with
housing and public goods choices somewhat limited, housing price differentials should reflect differences in the provision of public goods. There are a number of possible ways to proceed, but the most direct is first to estimate the hedonic housing price function p(h). In practice the capitalized value of the property (PV) will often be measured rather than the annual price p, so one must be careful to interpret the results as total rather than annual willingness to pay. Using PV as the dependent variable, a regression of the following form might be estimated:

log(PV) = β_0 + β_1 log(G) + β_2 log(a) + ε,  (45)
where ε is a random error term. With this particular formulation it follows directly that

∂(log PV)/∂(log G) = β_1.  (46)
Now, using the fact that ∂(log X) = (1/X) ∂X, (46) becomes

W = ∂(PV)/∂G = β_1(PV/G).  (47)
Thus, the marginal increment to house value from higher public service provision is equal to β_1 times the ratio of property value to public spending. If the market is in equilibrium, then this number represents individuals' marginal willingness to pay for incremental amounts of the public good. Furthermore, if all consumers are identical, W is the marginal rate of substitution of public for private goods and characterizes the demand function for public goods.

The hedonic approach has its difficulties. In general one must worry about the confounding effects of supply and demand. For example, assume that the level of the public good in a community is determined in part by the income of the median voter. In this case income serves as a demand variable. However, assume also that the median voter's income arises from his or her wage rate as a public employee. Then income also represents a supply variable, the cost of producing the public good. Unless the supply and demand explanations can be sorted out, both theoretically and econometrically, the income elasticity of demand for the public good cannot be determined.

To illustrate how the approach is applied, consider the easiest case, in which supply is fixed and not determined by income or other attributes. One can estimate equation (48):60

log W = α_0 + α_1 log G + α_2 log Y + ν,  (48)
60 The more general analysis that worries about supply and demand is suggested in Rosen (1974) and attempted by several other authors, such as Harrison and Rubinfeld (1978).
which represents an inverse demand function, since W is the marginal valuation or price associated with the public good and G is the level of the public good. The estimate of the parameter α_1 tells how individual valuations of the public good vary with the level of that good, while the parameter α_2 tells us how valuations vary with income. From these parameters, the standard price and income elasticities of demand can be calculated (assuming sufficient variation in G). For example, the income elasticity of demand (holding W fixed) would be

(∂G/G)/(∂Y/Y) = ∂(log G)/∂(log Y) = −[∂(log W)/∂(log Y)]/[∂(log W)/∂(log G)] = −α_2/α_1.  (49)
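A sketch of the first hedonic step, equations (45)-(47), on simulated data follows; the data-generating values below are illustrative assumptions, not estimates from the literature.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed true hedonic relationship: log PV = b0 + b1 log G + b2 log a + eps.
b0, b1, b2, n = 6.0, 0.15, 0.6, 2_000
logG = rng.normal(7.0, 0.4, n)   # log public output, varying across towns
loga = rng.normal(0.0, 0.5, n)   # composite index of other housing attributes
logPV = b0 + b1 * logG + b2 * loga + rng.normal(0.0, 0.1, n)

# Step 1: estimate the hedonic price function (45) by least squares.
X = np.column_stack([np.ones(n), logG, loga])
bhat = np.linalg.lstsq(X, logPV, rcond=None)[0]

# Step 2: equation (47) -- marginal willingness to pay at each observation.
PV, G = np.exp(logPV), np.exp(logG)
W = bhat[1] * PV / G

print(f"estimated b1: {bhat[1]:.3f} (true value {b1})")
print(f"median marginal willingness to pay: {np.median(W):.4f}")
```

A second regression of log W on log G and log Y, as in (48), would then deliver the elasticities in (49); in practice that second step inherits the supply-and-demand identification problems just discussed.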
The hedonic approach to valuing non-market attributes is questionable if people have limited information, if supply is somewhat elastic, or if markets do not equilibrate. In addition, proper specification of the functional form is crucial for empirical work.61 However, most of these problems are common to any demand estimation approach. And the hedonic method provides an interesting alternative approach to the estimation of demand functions, one which can be carried out and evaluated without unreasonable data collection needs. How this approach, as well as the previous ones, can be of value as a prescriptive as opposed to a descriptive device is the subject of the following section.

The micro approach can provide one solution to the Tiebout bias problem discussed earlier. With micro data one no longer needs the assumption of Bergstrom and Goodman concerning proportionality of the income distribution. Or, more generally, one need not worry about the problem of finding the income and other attributes of the individual whose demand for public spending is the median demand within the community. However, the Tiebout bias problem does not completely go away, even with micro data. If individuals sort themselves on G, then the simultaneity problem still exists, and a multiple-equation rather than single-equation approach may be appropriate.
5.2. Will the median level of services provided be efficient?

Demand functions for public goods have a number of uses, both positive (predictive) and normative. In this section one of the normative uses of the estimated demand functions is pursued: the question of the efficiency of the median-voter-determined level of public good provision.

61 Note also that I am not arguing that changes in property values themselves measure people's willingness to pay, but solely that information from property values can be used to infer demand functions. For a discussion of this issue, see Polinsky and Rubinfeld (1977). See also Brown and Rosen (1982) and Scotchmer (1986).
Whether the empirical approach is micro or macro, and whether one works with demand or utility functions, is not a central concern here. Of course the median voter determination of public output is a positive, not a normative, result. And, not surprisingly, the median voter demand is generally inefficient, for a number of reasons. First, if individual tastes and incomes vary, a Lindahl equilibrium is possible in which each individual's desired public good is equal to the level of public good provided; however, even under these conditions, sorting of individuals among jurisdictions can lead to interjurisdictional efficiency gains. Second, given the current distribution of individuals, the willingness-to-pay for public output, net of costs, is not necessarily maximized. It is this second, intrajurisdictional, inefficiency that is of concern here.

For an example of how this second type of inefficiency arises, recall that the marginal rate of substitution is a diminishing function of the level of public goods demanded (other things equal). If the utility function is Cobb-Douglas, i.e. U = ln X + α ln H + β ln G, then MRS = (∂U/∂G)/(∂U/∂X) = βX/G, which is monotonically decreasing in G. Now consider the distribution of marginal rates of substitution of public for private goods at the current level of spending. Individuals with an MRS less than that of the individual who desires the current level of spending will desire less spending, while individuals with an MRS greater than the current MRS of the median voter will desire more spending. If the distribution of MRS is not symmetric, then the median voter outcome is inefficient. Consider a distribution of MRS which is highly skewed to the right, as in Figure 5.3. The mean of such a distribution is greater than the median. However, the median represents the level of public services demanded by the median voter
[Figure 5.3. A distribution of MRS_i skewed to the right, with probability on the vertical axis; the median lies below the mean.]
and thus the outcome of a majority rule referendum, while the mean is the more efficient level of provision, since it weights preferences by willingness to pay. Of course, this formulation assumes that there are no side payments, no logrolling, and no package deals.

This efficiency requirement is derived from Samuelson's condition that ΣMRS_i = MRT. Dividing both sides of the equality by the number of individuals in the community, N, yields ΣMRS_i/N = mean MRS = MRT/N. The left-hand side of the equation is the mean level of the MRS, and the right-hand side is the cost of an additional unit of public goods divided by the community population. If public goods are paid for with a head tax, i.e. an equal tax on all individuals, then the cost to an individual is MRT/N. For the individual with the mean marginal rate of substitution, his or her valuation of public goods equals his or her cost. Thus, to attain an efficient solution the preferences of the "mean" voter, not the median, should determine the referendum outcome. The "median" and "mean" outcomes will be equivalent when the distribution of MRS is symmetric.62 When a property tax is substituted for a head tax, the efficiency conditions are more complex. However, the concept remains the same: efficient provision of public goods focuses on the mean of the marginal rates of substitution, not the median.63

Empirically, means and medians are difficult to distinguish, but there is a viable test. One fits a distribution to the marginal rates of substitution across the population and tests whether the mean equals the median, using a non-parametric test.64 This test can also be used to see whether government is too large. If the sum of the marginal rates of substitution is less than the marginal rate of transformation, then overspending must have occurred.

This discussion is based, however, on a rather simplistic view of government. A more complete analysis distinguishes at least three different behavioral models that result in non-optimal public spending levels. First is the model of Niskanen (1971), in which the level of public spending is greater than that desired by the median voter (and presumably greater than the efficient level) because budget decision-makers have bureaucratic power.

62 Note that a symmetric distribution of MRS is not the same as a symmetric distribution of demands, and that there will be a distribution of MRS for each possible level of spending. Consider the following example from Bergstrom (1979). If U = ln X_i + α ln G, c is the marginal (= average) cost of producing the public good, and a head tax is used, then G*_i = [α/(1 + α)](N/c)Y_i. The median demand is then G*_med = [α/(1 + α)](N/c)Y_med. The MRS of each individual, evaluated at G*_med, is MRS_i(G*_med) = α(Y_i − (c/N)G*_med)/G*_med. Note, however, that the person with income Y_med and the median MRS gets to consume his or her most preferred level of the public good, G*_med. Since he or she is optimizing, his or her MRS is equal to the relative price of public goods, which is c/N under the head-tax scheme. It follows that the median MRS (evaluated at G*_med) equals c/N. Finally, the mean MRS = (1/N)ΣMRS_i(G*_med) = (c/N)[(1 + α)(Ȳ/Y_med) − α], where Ȳ is mean income. If Y_med = Ȳ, the median outcome is efficient; otherwise it is not.
63 For a well-written extension of the head tax case, see Barlow (1970). For a detailed discussion of these efficiency conditions, including the case of an income or wealth tax, and their implications, see Bergstrom (1979). See also Akin and Youngday (1976).
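The example in footnote 62 is easy to verify numerically; the lognormal income distribution and the parameter values below are my own assumptions. With right-skewed incomes, mean income exceeds median income, so the mean MRS at the median voter's chosen output exceeds the tax-price c/N and the referendum underprovides the public good:

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed parameters for the footnote 62 example.
alpha, c, N = 0.5, 100.0, 1_000
Y = rng.lognormal(mean=3.0, sigma=0.4, size=N)  # right-skewed incomes

Y_med = np.median(Y)
G_med = alpha / (1 + alpha) * (N / c) * Y_med   # median voter's demand G*_med

# Individual MRS evaluated at the median voter's output: alpha * X_i / G.
X_i = Y - (c / N) * G_med                       # private consumption
MRS = alpha * X_i / G_med

mean_closed_form = (c / N) * ((1 + alpha) * Y.mean() / Y_med - alpha)
print(f"tax-price c/N:        {c / N:.4f}")
print(f"median MRS at G_med:  {np.median(MRS):.4f}")  # equals c/N
print(f"mean MRS at G_med:    {MRS.mean():.4f}")      # exceeds c/N
print(f"closed form check:    {mean_closed_form:.4f}")
```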
64 Such a test was used in a different situation, to test whether voter and non-voter demands were equivalent. See Rubinfeld (1980).
In a sense this argument is a generalization of the Romer-Rosenthal agenda-setting view. Bureaucrats want larger government output but are not particularly concerned about higher wages for themselves or their employees. The second model assumes that the wages of public employees are higher than they would be if they were determined competitively. This results in a tax-price which is above a competitively determined tax-price. Public budgets are higher than efficient budgets because public employees are earning rents. Presumably, these rents are the result of collective bargaining or political power on the part of public sector unions. This is the model considered in Courant, Gramlich and Rubinfeld (1979) and allowed for in the budgeting model of Inman (1981). In the papers by Courant, Gramlich and Rubinfeld, and by Inman, there is limited mobility. Public employee unions have some power to set both wages and output, but must trade off the advantages and disadvantages of each. Higher output, for example, may mean higher employment and greater bargaining power, while higher public wages may mean less output and less bargaining power. Courant and Rubinfeld (1982) show that even in the competitive world of the Tiebout model with unlimited mobility, the threat of public unions extracting rents is a real one.65

A third way in which government may be too large and inefficient is that it may not be producing on the production frontier. Except for Fiorina and Noll (1978), there has been essentially no literature on the mechanisms by which such inefficiency may arise. One possibility is that government is perceived to be too large because there is waste due to the use of an inferior technology.

What are the uses of these alternative models? They have important positive implications for our ability to predict the effects of policy changes on the local public sector, and normative implications about the appropriate use of constitutional and statutory limitations on government. As an example, Courant and Rubinfeld (1982) suggest that a limitation on the total spending level of the public budget may or may not be an effective policy, depending upon what causes the inefficiency in the public sector. They suggest that the "Niskanen-like" inefficiency can be removed, at least in part, by an expenditure limitation, but that the rent-earning ability of public employees is likely to be enhanced rather than diminished under such a reform.

6. Fiscal federalism

6.1. The responsibilities of local government

The analysis of the provision of local public goods has shown us that with fixed jurisdictional boundaries and interjurisdictional externalities an efficient outcome
65 The same general result is given in Epple and Zelenitz (1981).
cannot be assured. However, it is possible to eliminate the inefficiencies associated with these interjurisdictional externalities in a model which is generalized to allow for multiple levels of government. Thus, it is natural for us to complete our overview of local public economics by briefly reviewing some of the problems and concerns associated with a federalist system. The analysis of federalism includes both normative and positive issues, but our focus will be primarily on the normative. This choice is made solely because of space limitations; the positive analysis of federalism is one of the more interesting areas of recent research in public finance.

The central issue of federalism is whether certain goods and services or government responsibilities can best be provided and financed at the federal, state, or local level. This breakdown greatly simplifies reality, since there are numerous intermediate jurisdictions such as special districts, counties, and so on, and because many of these jurisdictions overlap spatially. In 1977, for example, there were over 3,000 county governments, 18,000 municipalities, 16,000 townships, 15,000 school districts, and 25,000 special districts in the United States. This actually represents a decrease in the number of jurisdictions from the 1950s. Since space is limited and this chapter is primarily methodological, it will be useful to simplify the discussion of the important normative issues by assuming only two levels of government: local and national. The question then becomes: Which level of government is most appropriate for handling which public function?66

The usual taxonomy for analyzing optimal fiscal federalism is due to Musgrave (1959), and has been elaborated by Oates (1968, 1972), among others. Since this taxonomy is analytically valuable, it is followed reasonably closely here. Musgrave distinguished between three "branches" of government: the stabilization, distribution, and allocation branches. Although not analytically independent, these branches are usually treated separately, an approach which I follow here.

6.1.1. Stabilization

One of the roles of any government is to use macroeconomic policy to stabilize prices and employment in an inherently cyclical national economy. Stabilization is not an important function of local governments, however, because local governments can use only fiscal rather than monetary tools and because they are relatively open to trade and migration. As a result, a good argument can be made that the federal government is in the best position to handle stabilization policy. One argument against stabilization being a local function is that local governments are required by law to balance their budgets. The budget balance constraint eliminates the major source of fiscal stabilization: deficits in times of
66 For a broader and more detailed discussion of these issues, see Oates (1972).
recession and surpluses when the economy is at or near full employment. Whether or not stabilization is the primary goal of assigning a government program to the federal or local level, that assignment can have stabilizing or destabilizing effects. Consider, for example, a recent (1982) Reagan Administration proposal to have state governments take over primary responsibility for managing welfare programs. Since welfare benefits are highest in times of recession, and state and local budgets must be balanced, the reassignment of functions can in itself be destabilizing, even though the goals of the program are primarily redistributive.

One possible role for stabilization at the local level involves situations in which there are substantial regional differences in income and unemployment.67 For instance, one might argue that the Northeast and Midwest public sectors should engage in policies to counter local unemployment problems, since they have first-hand knowledge about the nature of these problems. There are substantial problems with regional stabilization policies, however. First, vast leakages to imports and exports make attempts to stabilize local economies ineffective. Second, these policies may increase competition among regions and jurisdictions to attract business and other revenue-generating facilities. The net effect of such competition may leave the regional economy relatively unchanged and make all local governments worse off; this might arise because of the increased taxation needed to finance the tax breaks given to business, with the resulting deadweight loss. Third, even if competition does alter outcomes, it may be inefficient from a social point of view. If the Northeast economy is "moving" to a new long-run equilibrium in which real incomes are substantially lower than they were previously, attempts by the local regional government to counter this equilibrating move may be socially unproductive. This controversial point is one of substantial current interest which has not been analyzed thoroughly.68

6.1.2. Distribution
To the extent that government has a role in redistributing income, the redistribution branch will be seriously limited if it operates at the local level. For example, if the Tiebout model is correct, then there is likely to be substantial income homogeneity within jurisdictions, and with no disparities in income within jurisdictions, redistribution is simply not possible. Local redistribution is also limited by the prospect of migration. If one jurisdiction redistributes more income than another, otherwise identical, jurisdiction, then the poor have an incentive to move to the community with the more redistributive policy. This migration would
67 Tresch (1981) discusses these issues much more thoroughly than space allows here.
68 See Courant (1983) for an interesting analysis of the tax and expenditure subsidy question among regions.
add to the first jurisdiction's cost of achieving its distributive goal. As a consequence of incentives such as these, local jurisdictions are likely to undertake "too little" redistribution. All in all, these arguments suggest that a larger level of government is appropriate for redistribution, especially if redistributive goals are the result of a national consensus.

While the capacity for successful redistribution is greatest at the national level, there are some arguments for redistribution to be done by lower levels of government.69 These arguments focus on the limitations of the market model as well as the differences between municipal and federal decision-making processes. For example, a number of authors argue that at the metropolitan level the transactions costs of running government, including the ability to correctly evaluate distributional needs (who is poor, etc.), are less than they are at the federal level. Within limits, redistribution can be somewhat successful, since both individuals and businesses are not perfectly mobile. Whether a metropolitan government would choose to substantially redistribute income is problematic, however, since such a government may well be controlled by the middle and upper income suburban residents.70

A related, but somewhat different, argument has been made by Pauly (1973). Pauly assumes that income redistribution from rich to poor increases the utility of both parties. According to this view, income redistribution is a pure public good and its "provision" makes both parties better off. However, the public good is local, since individuals care more about helping the poor in their own community than elsewhere. As a result, a first-best solution requires differing redistribution programs in each jurisdiction. Such a goal clearly cannot be achieved with a uniform national program. And, except under very strong conditions, a decentralized solution in which each jurisdiction sets its own program is suboptimal.

A more interesting income redistribution issue focuses on the distributional consequences of the local provision of taxes and services. If all local taxes and public goods provision are the same irrespective of household income, then the local public sector will be distributionally "neutral" and will not conflict with federal policies. However, if the outcome of the local public sector spending and taxing pattern is not neutral, and differs from the norms of the federal level, efficiency problems can arise. As an example, assume that the outcome of a Tiebout-like process is for community tax bases to vary substantially. Then some communities will be able to finance a given level of public services at a lower tax rate than others. Should the federal, state, or even metropolitan government involve itself in redistributing to equalize the "fiscal capacity" of communities? A number of authors have argued that the equalization of tax base is an
69 See Tresch (1981, ch. 30) for details of this argument.
70 For an argument for metropolitan government and redistribution, see Burstein (1980). See also Rose-Ackerman (1981).
appropriate policy from the point of view of equity, at least as far as educational finance is concerned. However, equalization of tax bases can be inefficient, since it conflicts with what might have been the efficient outcome of a Tiebout sorting process. The matter becomes even more complex when we recall that differences in fiscal capacity are likely to be capitalized into land values as individuals from low tax base communities migrate into high tax base communities. As a result, capitalization will compensate, at least partially, for differences in fiscal capacity. In this case individuals who consume public services at a low tax-price are paying for part of that benefit in higher housing prices.

Despite the effects of capitalization, some equalization may generate desirable fiscal and income distributions. One can argue, for example, that differences in fiscal capacity per se violate a principle of horizontal equity, since individuals in different communities will not be treated equally. However, why one should worry about such a limited equity goal is not clear. Nevertheless, equality or near equality of fiscal capacity may be a means, albeit a costly one, of partially achieving a goal of equalizing expenditures on merit goods such as education. This issue is discussed in greater detail in the section on grants-in-aid which follows.

Empirical evidence on the fiscal equity aspects of the local public sector is quite limited and of questionable reliability. At this stage in the development of the subject area of local public finance, both the underlying methodology and the actual empirical estimates of fiscal impacts on the distribution of income are in need of substantial improvement. Space, however, limits my analysis to a brief and necessarily somewhat cryptic description of the state of the art. For my purposes it is the qualitative nature of the distributional differences among levels of government that is important.

Inman and Rubinfeld (1979) summarize previous empirical work by focusing on two measures of equity, one on the tax side and the other on the spending side. As shown in Figure 6.1, s, the tax-equity index, is the percentage increase in after-tax income that occurs in response to a 1-percent rise in before-tax income. A value of s equal to 1 indicates a proportional income tax, a value less than 1 a progressive tax, and a value greater than 1 a regressive tax. The other equity variable is the elasticity of local service expenditures, v, which measures the percentage change in local expenditures given a percentage increase in family income. If service outlays are independent of income, then v is 0; v is greater than 0 when service expenditures rise with family income, making the distribution of service expenditures pro-rich.

Based on a detailed study of a number of incidence papers, Inman and Rubinfeld conclude that the federal level of government is the most progressive in terms of its tax and spending program, since spending is essentially proportional while taxes are mildly progressive. Welfare is viewed here as a federal and not a local program. The state level relies on an essentially proportional tax and a mildly pro-rich spending system. However, evidence from the local level suggests
[Figure 6.1. Tax equity (s), on the vertical axis from 1.00 to 1.07, plotted against spending equity (v), on the horizontal axis from 0.0 to 0.6, with points marked for the private sector and for the federal, state, and local public sectors.]
that the tax system is somewhat regressive and that spending levels are somewhat pro-rich. In light of the analysis of the Tiebout model, this outcome is not surprising, since any attempt to be substantially progressive at the local level is likely to be thwarted by the migration of high income families.

Legal attempts to alter distribution have concentrated on zoning laws as well as the distribution of tax liabilities and expenditure benefits within jurisdictions. Inman and Rubinfeld (1979) suggest that, while a number of legal decisions appear to be equity-improving, the economic impact of those legal decisions has been relatively small. Likewise, legislative attempts to achieve equity by equalizing tax bases among jurisdictions have also had relatively small effects. These failures are due in part to the fact that individuals can migrate to avoid what they deem to be adverse distributional effects, and partly to the ability of jurisdictions to exercise their political power to thwart changes in spending and tax patterns in the first place. The inability to alter distribution follows from the fact that both legislative and judicial attempts to achieve equity do not confront the fundamental observation that there are substantial income differences among families within metropolitan areas. Attempts to achieve income redistribution through the local public sector are much less likely to be economically successful than a federal program of income redistribution through either the tax or the spending side of the budget.
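Since s and v are simply elasticities, they can be estimated directly from household data. Here is a minimal sketch on synthetic data; the data-generating assumptions below are mine, for illustration, and not the Inman-Rubinfeld procedure:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5_000

# Synthetic household data: before-tax income, taxes, service spending.
Y = rng.lognormal(10.5, 0.7, n)                       # before-tax income
tax = 0.3 * Y**0.9 * rng.lognormal(0, 0.05, n)        # mildly regressive tax
services = 50.0 * Y**0.25 * rng.lognormal(0, 0.2, n)  # mildly pro-rich services

def elasticity(x, z):
    """OLS slope of log z on log x."""
    X = np.column_stack([np.ones(len(x)), np.log(x)])
    return np.linalg.lstsq(X, np.log(z), rcond=None)[0][1]

s = elasticity(Y, Y - tax)   # tax-equity index: > 1 here, i.e. regressive
v = elasticity(Y, services)  # spending-equity index: > 0, i.e. pro-rich

print(f"s = {s:.3f}, v = {v:.3f}")
```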
6.1.3. Allocation

The third branch of government, the allocation branch, provides the best motivation for a multi-level federalist system. As the Tiebout model suggests, local
governments are likely to be most responsive to variations in demands for public goods. What makes the federalism question interesting is the spatial nature of the public goods produced. For example, the theory of clubs suggests that there is an allocatively efficient size of government, at which the marginal benefit from consumption of the public good is equal to the marginal cost of congestion. When jurisdictional boundaries are fixed, however, the benefits of certain public goods produced in one jurisdiction are likely to spill over to the residents of neighboring jurisdictions. In general, the externality results in an inefficient outcome if public spending and taxing decisions are entirely in the hands of the local governments. For example, if suburban residents use some of the services of the central city but do not pay either user fees or taxes on wages to compensate for the benefits received, then the city may underproduce those goods.

Economic externalities can be dealt with in two ways. First, the externality can be internalized by shifting the decision-making process to the metropolitan government level. Second, a system of taxes and subsidies can be designed that charges those generating the externalities for the costs and benefits they confer. For example, if the externality is generated by the community as a whole, a system of state and federal grants or taxes would work. Empirically, whether such externalities are important, and in which direction they flow, is relatively unknown. The usual argument, suggested by Neenan (1981) and others, is that suburbanites are subsidized by the central city, but Bradford and Oates (1974) have argued that the suburban labor force generates benefits to central city residents in its productive role.

Fiscal, as opposed to economic, externalities occur with respect to the industrial base of the central city and suburbs. Fiscal externalities alter the relative prices that jurisdictions face without necessarily changing the underlying production functions, and thus do not necessarily involve allocative inefficiencies. Assume, for example, that heavily taxed commercial and industrial properties can pass on or export some of these taxes by raising prices. If the products are sold in a national, or at least a non-local, market, then the property tax costs can be exported to those outside the jurisdiction. As a result, households residing in the jurisdiction may decide to increase public service output because of the low price they face. However, they impose external monetary costs on non-residents of the jurisdiction. These fiscal effects may or may not lead to inefficient outcomes, depending upon whether underlying productive relationships (e.g. scale economies and congestion) are altered. When economic externalities do arise, they can be remedied if revenues generated by a larger jurisdiction, say a metropolitan or statewide area, are shared with the smaller units of government.71

Obviously, the allocation branch has an important role in the assignment of taxes to the different levels of government. To minimize the inefficiencies which arise because people move to avoid the effects of taxes, one ought to tax mobile
71 This argument has been developed in extensive detail in Brazer (1961).
sources of revenue at the highest level of government and immobile sources at the local level. Thus, the decision to tax both personal and corporate income at the federal level in the United States is rational, particularly since the personal income tax is progressive. Similarly, since the property and sales taxes are perceived not to fall heavily on the mobile factors of production, these state and local taxes are regarded as those that minimize inefficiencies. However, the problem is much more complex than this brief summary suggests. For example, whether the property tax is an appropriate local tax depends upon one's view of the incidence of the tax. If the property tax is borne by owners of housing or land, then it is a desirable local tax from an efficiency perspective. However, if the tax is borne by all owners of capital or by businesses who pass it on by raising prices, then it is less desirable. In fact, the possibility of "tax-exporting" to non-residents is important at both the state and the local level. In an early study McLure (1967) estimated that roughly 25 percent of the taxes levied at the state level are shifted outside the state. Similarly, Roberts (1975) suggested that exporting (exclusive of deductibility under the federal personal income tax) ranged from under 20 percent for the property tax to over 25 percent for the sales tax in Michigan.

The spillover and tax exporting issues have been analyzed theoretically by a number of authors.72 The following model, based on Oates (1972), should help to clarify some of the issues. It illustrates why the larger level of government must intercede in some manner when there are interjurisdictional externalities. Assume that there are two communities, A and B, each with N identical individuals. Individual utility functions are U_A(X_A, G_A + αG_B) in the first community and U_B(X_B, G_B + αG_A) in the second. X is a private good, and G is a public good whose benefits spill over to the neighboring community; α ranges from 0 to 1 and represents the extent of the externality. The externality here is real, not just financial. In addition, the local public good is produced by a production function, F(X, G) = 0, within each community, and is financed by a property tax on housing, a portion of which is exported (the extent of exporting is captured below by the parameter k). The inefficiency of the decentralized model can be seen by comparing the necessary conditions for a welfare optimum to the necessary conditions for utility maximization within each community. Given that the two utility functions are weighted equally, the social optimum is obtained by maximizing

N U_A(X_A/N, G_A + αG_B) + N U_B(X_B/N, G_B + αG_A)  (50)

subject to

F(X_A + X_B, G_A + G_B) = 0.  (51)
72 See, for example, Arnott and Grieson (1981), Gordon (1984), Bird and Slack (1983), and Zimmerman (1983).
The first-order necessary conditions for optimality are

N(αMRS_A + MRS_B) = MRT

and

N(MRS_A + αMRS_B) = MRT,

where

MRS_A = (∂U_A/∂G_A)/(∂U_A/∂(X_A/N)),
MRS_B = (∂U_B/∂G_B)/(∂U_B/∂(X_B/N)),

and MRT = (∂F/∂G)/(∂F/∂X). If there are no spillovers (α = 0), then the standard Samuelson condition applies, so that ΣMRS_A = N·MRS_A = MRT = N·MRS_B = ΣMRS_B in each community. When α = 1, there is a single aggregate Samuelson efficiency condition, ΣMRS_A + ΣMRS_B = MRT. In the remaining cases, the efficiency conditions state that the joint benefit to individuals in both communities from an increase in G must equal the marginal cost in terms of foregone private consumption (the MRT).

What happens in the decentralized case when each community maximizes its own utility? For community A the level of G is determined by maximizing

N U_A(X_A/N, G_A + αG_B)  (52)

subject to

F(kX_A, G_A) = 0.  (53)
Here, the input necessary to produce a given level of the public good has been reduced by the factor k to reflect the lower cost to the community of using a partially exported tax to finance the public good. The first-order conditions for maximization yield:

N(∂U_A/∂G_A)/(∂U_A/∂(X_A/N)) = N·MRS_A = (∂F/∂G)/(k ∂F/∂X) = MRT/k,
or kN·MRS_A = MRT. If k = 1 and the production function exhibits constant returns to scale, then community A would increase G_A until MRS_A = MRT/N. However, if there are spillovers, the communities are underproducing the public good: community A should instead produce more G_A, until MRS_A = MRT/N − αMRS_B. When k is greater than one and α is less than one, the bias is in the other direction; local governments tend to oversupply the public good, for reasons analogous to the ones just given. Finally, in the general case, the direction of bias depends upon the magnitudes of the parameters. It is conceivable, but unlikely, that decentralized decision-making would result in A producing efficiently (MRT/N − αMRS_B = MRT/kN) and B producing efficiently (MRT/N − αMRS_A = MRT/kN). In fact, in a more general version of the model, decentralization never leads to a first-best outcome. Instead, state and/or federal grants must be used to remove the inefficiency.

One resolution to this externality problem is to have all production decisions made at a higher level of government. However, an alternative, decentralized approach is to allow the higher level of government to use grants-in-aid to stimulate the appropriate spending levels by the local governments. It is this approach to the externality problem to which I now turn.
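The under-provision result can be illustrated by computing the decentralized (Nash) and optimal outcomes for a symmetric special case of (50)-(53); the logarithmic utility, linear transformation frontier, and absence of tax exporting (k = 1) below are my simplifying assumptions:

```python
# Symmetric two-community spillover model: per-community payoff
#   U = log(X) + log(G_own + a * G_other),
# resources E per community and a linear frontier X + c * G = E (MRT = c).

def best_response(G_other, a, E, c):
    # Choose G to maximize log(E - c*G) + log(G + a*G_other).
    # FOC: c/(E - c*G) = 1/(G + a*G_other)  =>  G = (E/c - a*G_other)/2.
    return max((E / c - a * G_other) / 2.0, 0.0)

def nash(a, E, c, iters=200):
    G = 0.0
    for _ in range(iters):      # iterate best responses to a fixed point
        G = best_response(G, a, E, c)
    return G

a, E, c = 0.5, 100.0, 1.0
G_dec = nash(a, E, c)           # closed form: E / (c * (2 + a))
G_opt = E / (2.0 * c)           # symmetric social optimum: E / (2c)

print(f"decentralized provision per community: {G_dec:.2f}")  # 40.00
print(f"efficient provision per community:     {G_opt:.2f}")  # 50.00
```

Each community ignores the benefit its spending confers on its neighbor and therefore stops short of the efficient level; a matching grant that lowers the price each community perceives could restore the optimum, which is the role of grants taken up in Section 6.2.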
6.2. The role of grants in a federalist system

The various roles that federal and state grants might play in a federalist system are depicted in the four panels of Figure 6.2.73 The horizontal axis measures the level of government spending or output, G. The price of public output is initially chosen to equal 1, so that G can be represented in either real or money terms. The vertical axis represents total post-tax income in the community, and thus the budget line, AB, represents the constraint that the community decision-maker faces in trying to allocate goods between the public and private sectors. If there are no constraints, then the optimum is point E, where the "community indifference curve" is tangent to the budget line. This indifference curve reflects the preferences of the city manager or "decisive voter" and is assumed to be a function of the preferences of the citizens in the community. While this is a plausible assumption if everyone in the community is identical, it is clearly unrealistic and causes serious problems if there is heterogeneity.

73 This analysis builds on the early work of Wilde (1968). See also Thurow (1966) and Inman (1975).
[Figure 6.2. Four budget-line diagrams, each plotting after-tax income Y against public output G: (a) non-matching bloc grant; (b) matching grant; (c) specific non-matching grant; (d) specific matching grant.]
Panel (a) represents the case of the non-matching unconditional grant, or federal revenue sharing block grant. The non-matching grant adds to income and allows the community to expand both its public spending and its after-tax income, moving from point E to point F. Such a grant is logically used to redistribute income among jurisdictions, since the lack of restriction on spending allows each community to maximize its own welfare.

What if there are externalities, in which some of the benefits of local spending spill over into neighboring jurisdictions? In this case a matching grant, as shown in panel (b), is more effective, since the matching rate is chosen to induce the local government to spend more on the public good. If the correct rate is chosen, then decision-makers will equate the social marginal benefit to all individuals from the public good to the marginal cost.

If one assumes that the amount of money spent on the two grants is sufficient to allow the community in cases (a) and (b) to reach the same level of utility (not
the same amount of money), then the effects of matching and non-matching grants can be compared. Since the non-matching grant affects income without changing price, while the matching grant changes price and thus has both income and substitution effects, point H must be associated with a higher level of public spending than point F. This result follows from the fact that the substitution effect is always negative. Thus, the matching grant is more stimulative, and hence more cost efficient, than a non-matching grant.

Panel (c) describes a conditional non-matching grant, where money is received only if a minimum amount is spent on a particular public service. The grant acts like a pure non-matching grant, with only an income effect, if the community would spend more than the minimum amount required by the grant. Thus, the non-matching grant is effective in guaranteeing a minimum level of spending by a local government on a particular budgetary item. However, these grants do not have an effect different from a pure income grant unless the community is forced to spend more than it would have done otherwise.

Finally, consider a matching grant that is limited or "closed ended". As long as the community operates within the budget constraint set by the matching program, the grant has a price effect. However, if the community spends all the money available from the grant, then the analysis is the same as in panel (a).74 A closed-ended matching grant makes sense if there are some limited externalities and only a certain amount of additional stimulus is desired.75

This rather simplistic view of the effect of grants is flawed, since it assumes that the community indifference curve reflects interests of the decisive voter or manager which are identical to the interests of the constituents. For example, the manager's political goals may not coincide with the long-run interests of the members of the jurisdiction, since his period of election or office is likely to be relatively short. Thus, the model just described is likely to be a poor predictor of government spending behavior, at least in the short run. After all, the simple model predicts that a dollar increase in grants has the same effect as a dollar increase in community income. However, grants often lead to greater spending increases than would result from an equivalent increase in community income. This phenomenon is sometimes called the "flypaper effect", since it describes the situation in which money tends to stick where it initially hits.

A number of authors have attempted to model the effect of grants on local spending to explain the flypaper effect by focusing on the various kinds of budgetary and fiscal illusions that may arise in a federal system.

74 This has implications for the specification of the price and income terms in regression analysis, where grants can have both price and income effects. With a non-matching closed-ended grant, the closed-ended form of the grant acts simply as an addition to income, not as a price increment.
75 For an extended and substantial analysis of the effects of grants on local spending, both theoretical and empirical, see Gramlich (1977). See also Hamilton (1983b).
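The comparison between panels (a) and (b) can be reproduced with a simple parameterized example; the Cobb-Douglas community preferences and the numbers below are my assumptions, not from the chapter:

```python
import math
from scipy.optimize import brentq

# Community preferences over private consumption X and public output G
# (price of G normalized to 1): U = log(X) + g * log(G).
g, Y = 0.25, 100.0

def choice(income, price):
    # Cobb-Douglas demand: a share g/(1+g) of income is spent on G.
    G = g / (1 + g) * income / price
    X = income - price * G
    return X, G, math.log(X) + g * math.log(G)

# Matching grant at rate m: the community pays only (1 - m) per unit of G.
m = 0.5
_, G_match, U_match = choice(Y, 1 - m)

# Bloc (lump-sum) grant Z chosen so the community reaches the same utility.
Z = brentq(lambda z: choice(Y + z, 1.0)[2] - U_match, 0.0, 500.0)
_, G_bloc, _ = choice(Y + Z, 1.0)

print(f"matching grant:                        G = {G_match:.2f}")
print(f"equal-utility bloc grant (Z = {Z:.2f}): G = {G_bloc:.2f}")
```

At equal utility the matching grant buys more public output, as in the text. Note too that in this model a dollar of bloc grant is exactly equivalent to a dollar of community income, both simply shifting the budget line, which is the benchmark that the flypaper evidence rejects.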
Gramlich and Rubinfeld (1979), as well as Romer and Rosenthal (1980) and Hettich and Winer (1984), have suggested models of fiscal illusion that focus on individual utility-maximizing behavior. The economic rationale for the flypaper effect in many of these models hinges on the inability of voters to perceive the true marginal price of public expenditures when non-matching grants are present. Of course, these models do not preclude the more popular political rationale that I discussed previously. The analysis of how the political process might work to explain the flypaper effect is an area that should provide room for fruitful research. One reason for the limited predictive ability of the spending models is that each assumes that the grant program and the level of grants are exogenous to the model. In fact, the form, nature, and extent of grants are often the result of a complicated bargaining process between politicians and employees at each level of government involved. Empirically, these models may not fully reflect the specific income and price effects of the program. A number of authors have shown recently that better measures of the effects of grant programs can be obtained from models that account for these bureaucratic elements.76 The analysis of grants within a federal system is in many ways typical of the state of local public economics today. Normative analyses of the role of grants have been well developed within models that make few or no assumptions about the supply side - the politics and economics of the provision of local public goods. However, empirical work on the positive aspects of the grants side is somewhat less well developed. And the area of greatest need combines the normative and the positive. We need to utilize our empirical base of knowledge of the local public sector to build better normative and positive models. To be effective in terms of both proscription and prescription, these models must include reasonable assumptions about politics as well as economics.
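To make the income-versus-price-effect comparison drawn above concrete, the following sketch (my own illustration, not part of the original analysis; the preference weight a, income Y, and matching rate m are hypothetical) computes the spending response of a community that behaves like a single Cobb-Douglas decision-maker, under a matching grant and under a non-matching grant of equal cost to the grantor:

# Illustrative sketch (hypothetical numbers): community preferences
# U = G**a * X**(1 - a) over public spending G and after-tax income X,
# with community income Y and a unit price of the public good.
a, Y = 0.3, 100.0

def choice(income, price):
    # Cobb-Douglas demands: a share a of income goes to G, the rest to X.
    return a * income / price, (1 - a) * income

m = 0.25                               # matching rate: community's price of G is 1 - m
G_match, _ = choice(Y, 1.0 - m)
outlay = m * G_match                   # the grantor pays m per unit of G

G_lump, _ = choice(Y + outlay, 1.0)    # non-matching grant of equal grantor cost

print(f"matching grant:     G = {G_match:.1f} (grantor outlay {outlay:.1f})")
print(f"non-matching grant: G = {G_lump:.1f} (same outlay)")
# G_match (40.0) exceeds G_lump (33.0): the price reduction adds a
# substitution effect on top of the income effect.

The comparison in the text holds the community's utility, rather than the grantor's outlay, constant across the two grants, but the ranking of spending levels is the same. Note also that in this simple model the non-matching grant of 10 is equivalent to a 10-unit rise in community income - exactly the prediction that the flypaper evidence contradicts.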
7. Conclusions
This chapter has covered a substantial amount of ground, and yet the subject matter of local public economics has barely been scratched. Emphasis has been placed on the demand side of the public sector, and on normative rather than positive models. Hopefully, the models and issues sketched out here will whet the student's appetite to look further into the interesting and varied problems in local public finance. A detailed review of where the analysis has taken us is likely to be tedious at this point. In its place I will be somewhat more presumptuous in sketching out what I perceive to be some of the interesting areas of ongoing research and areas for future research as well. My apologies and thanks in
76See especially Johnson (1979) and Chernick (1979). See also Winer (1983), Fisher (1979, 1982), Filimon, Romer and Rosenthal (1982), and McGuire (1978, 1979).
advance to my colleagues and peers who have done so much to shape my thinking.
(1) The political economy of the local public sector has been studied by both political scientists and economists, but efforts to combine the two have had limited success. At this point the addition of supply-side political assumptions to Tiebout models has led largely to negative conclusions about the existence and efficiency of the local public competitive analogy. Positive models and a more careful specification of the political process by which public goods are provided may get us somewhere, however. My sketch of the arguments for inefficiency of public good provision and the need for tax limitation is one of many possible examples. Without a well-thought-out and careful description of how public goods are provided, one cannot hope to accurately assess either the normative or the positive aspects of the recent move towards state and local limitations on tax revenues and public spending.
(2) The Tiebout model has provided one useful, albeit somewhat artificial, model of a more or less competitive local public sector. However, the value and usefulness of the Tiebout model is likely to diminish in the future, and an alternative or alternatives are needed. Several monopoly models of local government have been suggested and do provide a useful counterpart to the Tiebout approach. However, these models have not been nearly as well developed in the sense of taking adequate account of the large number of local jurisdictions and the general equilibrium nature of public spending and locational choices. There may be a good deal of mileage in pursuing the market structure analogy further. Some of the recent advances (as well as some of the older work) in the field of industrial organization may help us to understand this aspect of the local public sector. One possibility might be to view local governments as providing public goods which vary spatially in the attributes that they provide to the consuming public. Public goods could be viewed as brands which differ in one or more of the attributes. The extent to which local governments compete among themselves in the supply of the public good becomes an issue that can be studied using modern techniques of industrial organization.
(3) Local public economics is a subject area that is rich with empirical work, but a good deal of high-quality work remains to be undertaken. The relationship between mobility of households and the provision of local public services is one area that typifies work in local public finance. We know a good deal about why and how often households move, but relatively little about the extent to which their moves influence and are influenced by public sector variables. Better empirical information about this subject is likely to suggest better lines of theoretical research. At the same time better theory is likely to substantially improve upon the empirical work that is undertaken and to make hypothesis testing possible. A well-specified and correctly estimated model may help us to evaluate the correct method by which the demand for local public goods can be determined.
(4) There has been a great deal of attention given to equilibrium models of the local public sector and to the question of whether or not equilibrium exists. This is natural, given the tools of microeconomics currently available. However, many of the problems of the local public sector may be best explained by modeling an economy which is in disequilibrium. Modern tools, both theoretical and empirical, have only recently begun to be applied to these problems. For example, consider the question of whether the capital infrastructure in our cities is being efficiently provided. Are the cities of the Northeast "undercapitalized" and if so, what should the federal government do about it? Is it efficient for adjustments of population and the capital stock to respond solely to market forces, or do substantial externalities warrant government intervention?
(5) The federalism question was only lightly treated in this chapter, but is an extremely important one. Recent proposals for a major restructuring of the federal system by President Reagan have encouraged economists and others to rethink the basic Musgravian view of federalism. Should redistribution be moved in part to the state and/or the local level? Can individuals and individual local governments internalize interjurisdictional externalities or is a move to consolidation of governments a necessary step?
These are only a few of the areas in which interesting work is being and is likely to be done in local public economics. Future research should only serve to prove how limited and how narrow this list of suggestions has been.
References
Aaron, H.J., 1975, Who pays the property tax? (Brookings, Washington, D.C.). Akin, J. and M. Lea, 1982, Microdata estimation of school expenditure levels: An alternative to the median voter approach, Public Choice 38, 113-128. Akin, J. and D. Youngday, 1976, The efficiency of local school finance, Review of Economics and Statistics 58. Arnott, R.J., 1979, Optimal taxation in a spatial economy with transport costs, Journal of Public Economics 11, 307-334. Arnott, R. and R.E. Grieson, 1981, Optimal fiscal policy for state or local government, Journal of Urban Economics 9, 23-48. Arnott, R.J. and J.E. Stiglitz, 1979, Aggregate land rents, expenditure on public goods, and optimal city size, Quarterly Journal of Economics 93, 471-500. Barlow, R., 1970, Efficiency aspects of local school finance, Journal of Political Economy 78, 1028-1039. Barr, J. and O.A. Davis, 1966, An elementary political and economic theory of expenditures of local governments, Southern Economic Journal 33, 149-165. Beck, J.H., 1983, Tax competition, uniform assessment, and the benefit principle, Journal of Urban Economics 13, 127-146. Berglas, E., 1982, User charges, local public services, and taxation of land rents, Public Finance 37, 178-188. Berglas, E. and D. Pines, 1984, Resource constraint, replicability and mixed clubs: A reply, Journal of Public Economics 23, 391-397. Bergstrom, T.C., 1979, When does majority rule supply public goods efficiently?, Scandinavian Journal of Economics 81, 216-226.
Bergstrom, T.C. and R.P. Goodman, 1973, Private demands for public goods, American Economic Review 63, 280-296. Bergstrom, T.C., D.L. Rubinfeld and P. Shapiro, 1982, Micro-based estimates of demand functions for local school expenditures, Econometrica 50, 1183-1205. Bewley, T.F., 1981, A critique of Tiebout's theory of local public expenditures, Econometrica 49, 713-739. Bird, R. and E. Slack, 1983, Urban public finance in Canada (Butterworth's, Toronto). Bloom, H.S., H.F. Ladd and J. Yinger, 1983, Are property taxes capitalized into house values?, in: G.R. Zodrow, ed., Local provision of public services: The Tiebout model after twenty-five years (Academic Press, New York) 145-164. Bloom, H.S., H.F. Ladd and J. Yinger, Property taxes and house values (Academic Press, New York) forthcoming. Borcherding, T.E. and R.T. Deacon, 1972, The demand for services of non-federal governments, American Economic Review 62, 891-901. Bowen, H.R., 1943, The interpretation of voting in the allocation of economic resources, Quarterly Journal of Economics 58, 27-48. Bradford, D. and W. Oates, 1974, Suburban exploitation of central cities and government structure, in: H. Hochman and G. Peterson, eds., Redistribution through public choice (Columbia University Press, New York). Brazer, H., 1961, The value of industrial property as a subject of taxation, Canadian Public Administration 4, 137-147. Brown, J.N. and H.S. Rosen, 1982, On the estimation of structural hedonic price models, Econometrica 50, 765-768. Brueckner, J.K., 1979, Property values, local public expenditure, and economic efficiency, Journal of Public Economics 11, 223-245. Brueckner, J.K., 1982, A test for allocative efficiency in the local public sector, Journal of Public Economics 14, 311-332. Brueckner, J.K., 1983, Property value maximization and public sector efficiency, Journal of Urban Economics 14, 1-16. Buchanan, J., 1965, An economic theory of clubs, Economica 32, 1-14. Buchanan, J.M. and C.J. Goetz, 1972, Efficiency limits of fiscal mobility: An assessment of the Tiebout model, Journal of Public Economics 14, 25-44. Bucovetsky, S., 1981, Optimal jurisdictional fragmentation and mobility, Journal of Public Economics 16, 177-192. Burstein, N.R., 1980, Voluntary income clustering and the demand for housing and local public goods, Journal of Urban Economics 7, 176-185. Carr-Hill, R.A. and N.H. Stern, 1973, An econometric model of the supply and control of recorded offenses in England and Wales, Journal of Public Economics 2, 289-318. Chernick, H.A., 1979, An economic model of the distribution of project grants, in: P. Mieszkowski and W. Oakland, eds., Fiscal federalism and grants-in-aid (The Urban Institute, Washington, D.C.) 81-103. Church, A.M., 1973, Capitalization of the effective property tax rate on single family homes, National Tax Journal 28, 113-122. Citrin, J., 1979, Do people want something for nothing: Public opinion on taxes and government spending, National Tax Journal 32, 113-130. Comanor, W.S., 1976, The median voter rule and the theory of political choice, Journal of Public Economics 5, 169-177. Courant, P.N., 1977, A general equilibrium model of heterogeneous local property taxes, Journal of Public Economics 8, 313-328. Courant, P.N., 1983, On the effects of federal capital taxation on growing and declining areas, Journal of Urban Economics 14, 242-261. Courant, P.N. and D.L. Rubinfeld, 1981, On the welfare effects of tax limitation, Journal of Public Economics 15, 289-316. Courant, P.N. and D.L.
Rubinfeld, 1982, On the measure of benefits in an urban context: Some general equilibrium issues, Journal of Urban Economics 13, 346-356.
Courant, P.N., E.M. Gramlich and D.L. Rubinfeld, 1979, Public employee market power and the level of government spending, American Economic Review 69, 806-817. Cushing, B.J., 1984, Capitalization of interjurisdictional fiscal differentials: An alternative approach, Journal of Urban Economics 15, 317-326. Deacon, R., 1977, Private choice and collective outcomes: Evidence from public sector demand analysis, National Tax Journal 30, 371-386. Denzau, A.T., 1975, An empirical survey of studies on public school spending, National Tax Journal 28, 241-249. Denzau, A.T., R.J. MacKay and C.L. Weaver, 1981, On the initiative-referendum option and the control of monopoly government, in: H.F. Ladd and T.N. Tideman, eds., Tax and expenditure limitations, Committee on Urban Public Economics, Papers on Public Economics (The Urban Institute, Washington, D.C.) 5, 191-222. Dusansky, R., M. Ingber and N. Karatjas, 1981, The impact of property taxation on housing values and rents, Journal of Urban Economics 10, 240-255. Eberts, R.W. and T.J. Gronberg, 1981, Jurisdictional homogeneity and the Tiebout hypothesis, Journal of Urban Economics 10, 227-239. Edel, M. and E. Sclar, 1974, Taxes, spending and property values: Supply adjustment in a Tiebout-Oates model, Journal of Political Economy 82, 941-954. Edelstein, R., 1974, The determinants of value in the Philadelphia housing market: A case study of the main line 1967-1969, The Review of Economics and Statistics 56, 319-327. Ellickson, B., 1973, A generalization of the pure theory of public goods, American Economic Review 63, 417-432. Ellickson, B., 1977, The politics and economics of decentralization, Journal of Urban Economics 4, 135-149. Epple, D. and A. Zelenitz, 1981, The implications of competition among jurisdictions: Does Tiebout need politics?, Journal of Political Economy 89, 1197-1217. Epple, D., R. Filimon and T. Romer, 1983, Housing, voting and moving: Equilibrium in a model of local public goods with multiple jurisdictions, in: J.V. Henderson, ed., Research in urban economics 3, 59-90. Epple, D., R. Filimon and T. Romer, 1984, Equilibrium among local jurisdictions: Toward an integrated treatment of voting and residential choice, Journal of Public Economics 17, 281-308. Epple, D., A. Zelenitz and M. Visscher, 1978, A search for testable implications of the Tiebout hypothesis, Journal of Political Economy 86, 405-425. Fare, R., S. Grosskopf and F.J. Yoon, 1982, A theoretical and empirical analysis of the highway speed-volume relationship, Journal of Urban Economics 12, 115-121. Filimon, R., T. Romer and H. Rosenthal, 1982, Asymmetric information and agenda control: The bases of monopoly power in public spending, Journal of Public Economics 17, 51-70. Fiorina, M.P. and R.G. Noll, 1978, Voters, bureaucrats and legislators: A rational choice perspective on the growth of bureaucracy, Journal of Public Economics 9, 239-254. Fischel, W., 1979, Determinants of voting on environmental quality: A study of a New Hampshire pulp mill referendum, Journal of Environmental Economics and Management 6, 107-118. Fisher, R.C., 1979, Theoretical view of revenue sharing grants, National Tax Journal 32, 173-184. Fisher, R.C., 1982, Income and grant effects on local expenditure: The flypaper effect and other difficulties, Journal of Urban Economics 12, 324-345. Flatters, F., V. Henderson and P. Mieszkowski, 1974, Public goods, efficiency and regional fiscal equalization, Journal of Public Economics 3, 99-112.
Foley, D., 1967, Resource allocation and the public sector, Yale Economic Essays 7, 43-98. Goldstein, G.S. and M.V. Pauly, 1981, Tiebout bias on the demand for local public goods, Journal of Public Economics 16, 131-144. Gordon, R.H., 1984, Taxation in federal systems: An optimal taxation approach, in: R. Matthews and C. McLure, eds., Taxation in federal systems (Australian National University, Canberra) 26-42. Gramlich, E.M., 1977, Intergovernmental grants: A review of the empirical literature, in: W.E. Oates, ed., The political economy of fiscal federalism (D.C. Heath, Lexington, Mass.) 219-240. Gramlich, E.M. and D.L. Rubinfeld, 1982, Micro estimates of public spending demand functions and
tests of the Tiebout and median-voter hypotheses, Journal of Political Economy 90, 536-560. Gramlich, E.M., D.L. Rubinfeld and D.A. Swift, 1981, Why voters turn out for tax limitation votes, National Tax Journal 34, 115-124. Greenberg, J., 1977, Existence of an equilibrium with arbitrary tax schemes for financing local public goods, Journal of Economic Theory 16, 137-150. Greenberg, J., 1983, Local public goods with mobility: Existence and optimality of a general equilibrium, Journal of Economic Theory 30, 17-33. Hamilton, A.D., E.S. Mills and D. Puryear, 1975, The Tiebout hypothesis and residential income segregation, in: E.S. Mills and W.E. Oates, eds., Fiscal zoning and land use controls (Lexington Books, Lexington, Mass.) 101-118. Hamilton, B.W., 1975, Zoning and property taxation in a system of local governments, Urban Studies 12, 205-211. Hamilton, B.W., 1976, Capitalization of intrajurisdictional differences in local tax prices, American Economic Review 66, 743-753. Hamilton, B.W., 1983a, A review: Is the property tax a benefit tax?, in: G.R. Zodrow, ed., Local provision of public services: The Tiebout model after twenty-five years (Academic Press, New York). Hamilton, B.W., 1983b, The flypaper effect and other anomalies, Journal of Public Economics 22, 347-362. Harrison, D.E. and D.L. Rubinfeld, 1978, Hedonic housing prices and the demand for clean air, Journal of Environmental Economics and Management 5, 81-102. Hartwick, J.M., 1980, The Henry George rule, optimal population, and interregional equity, Canadian Journal of Economics 18, 695-700. Heinberg, J.D. and W.E. Oates, 1970, The incidence of differential property taxes and urban housing: A comment and some further evidence, National Tax Journal 23, 92-98. Henderson, J.V., 1979, Theories of group, jurisdiction, and city size, in: P. Mieszkowski and M. Straszheim, eds., Current issues in urban economics (The Johns Hopkins University Press, Baltimore) 235-269. Henderson, J.V., 1985, The Tiebout model: Bring back the entrepreneurs, Journal of Political Economy 93, 248-264. Hettich, W. and S. Winer, 1984, A positive model of tax structure, Journal of Public Economics 23, 67-87. Hinich, M.J., 1977, Equilibrium in spatial voting: The median voter result is an artifact, Journal of Economic Theory 16, 208-219. Hochman, O., 1981, Land rents, optimal taxation and local fiscal independence in an economy with local public goods, Journal of Public Economics 15, 59-85. Hochman, O., 1982, Congestable local public goods in an urban setting, Journal of Urban Economics 11, 290-310. Inman, R.P., 1975, Grants in a metropolitan economy: A framework for policy, in: W. Oates, ed., Financing the new federalism (The Johns Hopkins University Press, Baltimore) 88-114. Inman, R.P., 1978, Testing political economy's "as if" proposition: Is the median income voter really decisive?, Public Choice 33, 45-65. Inman, R.P., 1979, The fiscal performances of local governments: An interpretive review, in: P. Mieszkowski and M. Straszheim, eds., Current issues in urban economics (The Johns Hopkins University Press, Baltimore) 270-321. Inman, R.P., 1981, Wages, pensions, and employment in the local public sector, in: P. Mieszkowski and G.E. Peterson, eds., Public sector labor markets, Committee on Urban Public Economics, Papers on Public Economics (The Urban Institute, Washington, D.C.) 4, 11-58. Inman, R.P. and D.L. Rubinfeld, 1979, The judicial pursuit of local fiscal equity, Harvard Law Review 92, 1662-1750.
Johnson, M.B., 1979, Community income, intergovernmental grants, and local school district fiscal behavior, in: P. Mieszkowski and W.H. Oakland, eds., Fiscal federalism and grants-in-aid, Committee on Urban Economics, Papers on Public Economics (The Urban Institute, Washington, D.C.) 1, 51-77. King, T.A., 1977, Estimating property tax capitalization: A critical comment, Journal of Political Economy 85, 425-431.
Ladd, H.F., 1975, Local education expenditures, fiscal capacity, and the composition of the property tax base, National Tax Journal 28, 145-158. Levin, S.G., 1982, Capitalization of local fiscal differentials: Inferences from a disequilibrium market model, Journal of Urban Economics 11, 180-189. Martinez-Vazquez, J., 1981, Selfishness versus public "regardingness" in voting behavior, Journal of Public Economics 15, 349-361. McDougall, G.S., 1976, Local public goods and residential property values: Some insights and extensions, National Tax Journal 29, 436-437. McFadden, D., 1973, Conditional logit analysis of qualitative choice behavior, in: P. Zarembka, ed., Frontiers in econometrics (Academic Press, New York). McGuire, M., 1974, Group segregation and optimal jurisdictions, Journal of Political Economy 82, 112-132. McGuire, M., 1978, A method for estimating the effects of a subsidy on the receiver's resource constraint, Journal of Public Economics 10, 25-44. McGuire, M., 1979, The analysis of federal grants into price and income components, in: P. Mieszkowski and W.H. Oakland, eds., Fiscal federalism and grants-in-aid (The Urban Institute, Washington, D.C.) 31-50. McKelvey, R.D., 1976, Intransitivities in multidimensional voting models and some implications for agenda control, Journal of Economic Theory 12, 472-482. McKelvey, R.D., 1979, General conditions for global intransitivities in formal voting models, Econometrica 47, 1085-1111. McLure, C.E., Jr., 1967, The interstate exporting of state and local taxes: Estimates for 1962, National Tax Journal 20, 49-77. Mills, E.S., 1979, Economic analysis of urban land-use controls, in: P. Mieszkowski and M. Straszheim, eds., Current issues in urban economics (The Johns Hopkins University Press, Baltimore) 511-541. Mueller, D.C., 1976, Public choice: A survey, Journal of Economic Literature 14, 395-433. Musgrave, R.A., 1959, The theory of public finance (McGraw-Hill, New York). Neenan, W.B., 1981, Local public economics (Markham, New York). Nelson, J.P., 1978, Residential choice, hedonic prices, and the demand for urban air quality, Journal of Urban Economics 5, 357-369. Niskanen, W., 1971, Bureaucracy and representative government (Aldine-Atherton, Chicago). Oates, W., 1968, The theory of public finance in a federal system, Canadian Journal of Economics 1, 37-54. Oates, W., 1969, The effects of property taxes and local spending on property values: An empirical study of tax capitalization and the Tiebout hypothesis, Journal of Political Economy 77, 957-971. Oates, W., 1972, Fiscal federalism (Harcourt Brace, New York). Oates, W., 1981, On local finance and the Tiebout model, American Economic Review 71, 93-97. Ott, M., 1980, Bureaucracy, monopoly, and the demand for municipal services, Journal of Urban Economics 8, 362-383. Pack, H. and J.R. Pack, 1977, Metropolitan fragmentation and suburban homogeneity, Urban Studies 14, 191-201. Pauly, M.V., 1970, Cores and clubs, Public Choice 9, 53-65. Pauly, M.V., 1973, Income distribution as a local public good, Journal of Public Economics 2, 35-58. Pauly, M.V., 1976, A model of local government expenditure and tax capitalization, Journal of Public Economics 6, 231-242. Pestieau, P., 1983, Fiscal mobility and local public goods: A survey of the empirical and theoretical studies of the Tiebout model, in: J.F. Thisse and H.G. Zoller, eds., Locational analysis of public facilities (North-Holland, Amsterdam) 11-41. Pindyck, R.S. and D.L.
Rubinfeld, 1981, Econometric models and economic forecasts (McGraw-Hill, New York). Polinsky, A.M. and D.L. Rubinfeld, 1977, Property values and the benefits of environmental improvements: Theory and measurement, in: L. Wingo and A. Evans, eds., Public economics and the quality of life (The Johns Hopkins University Press, Baltimore). Polinsky, A.M. and D.L. Rubinfeld, 1978, The long run effects of a residential property tax and local public services, Journal of Urban Economics 5, 241-262.
Pommerehne, W.W., 1978, Institutional approaches to public expenditures: Empirical evidence from Swiss municipalities, Journal of Public Economics 7, 255-280. Pommerehne, W.W. and B. Frey, 1976, Two approaches to estimating public expenditures, Public Finance Quarterly 4, 395-407. Pommerehne, W.W. and F. Schneider, 1978, Fiscal illusion, political institutions, and local public spending: Some neglected relationships, Kyklos 31, 381-408. Portney, P. and J. Sonstelie, 1978, Profit maximizing communities and the theory of local public expenditures, Journal of Urban Economics 5, 263-277. Reschovsky, A., 1979, Residential choice and the local public sector: An alternative test of the Tiebout hypothesis, Journal of Urban Economics 6, 501-521. Richter, D.K., 1978, Existence and computation of a Tiebout general equilibrium, Econometrica 46, 779-805. Richter, D.K., 1982, Weakly democratic regular tax equilibria in a local public goods economy with perfect consumer mobility, Journal of Economic Theory 27, 137-162. Roberts, D., 1975, Incidence of state and local taxes in Michigan, Ph.D. dissertation (Michigan State University). Romer, T. and H. Rosenthal, 1978, Political resource allocation, controlled agendas, and the status quo, Public Choice 33, 27-43. Romer, T. and H. Rosenthal, 1979a, Bureaucrats vs. voters: On the political economy of resource allocation by direct democracy, Quarterly Journal of Economics 93, 563-587. Romer, T. and H. Rosenthal, 1979b, The elusive median voter, Journal of Public Economics 12, 143-170. Romer, T. and H. Rosenthal, 1980, An institutional theory of the effect of intergovernmental grants, National Tax Journal 33, 451-458. Rose-Ackerman, S., 1979, Market models of local government: Exit, voting, and the land market, Journal of Urban Economics 7, 319-337. Rose-Ackerman, S., 1981, Does federalism matter: Political choice in a federal republic, Journal of Political Economy 89, 152-165. Rose-Ackerman, S., 1983, Beyond Tiebout: Modeling the political economy of local government, in: G.R. Zodrow, ed., Local provision of public services: The Tiebout model after twenty-five years (Academic, New York) 55-84. Rosen, H.S. and D.J. Fullerton, 1977, A note on local tax rates, public benefit levels and property values, Journal of Political Economy 85, 433-440. Rosen, K.T., 1982, The impact of Proposition 13 on house prices in northern California: A test of the interjurisdictional capitalization hypothesis, Journal of Political Economy 90, 191-200. Rosen, S., 1974, Hedonic prices and implicit markets, Journal of Political Economy 82, 34-55. Rubinfeld, D., 1977, Voting in a local school election: A micro analysis, Review of Economics and Statistics 59, 30-42. Rubinfeld, D., 1979, Judicial approaches to local public-sector equity: An economic analysis, in: P. Mieszkowski and M. Straszheim, eds., Current issues in urban economics (The Johns Hopkins University Press, Baltimore) 542-576. Rubinfeld, D., 1980, On the economics of voter turnout in local school elections, Public Choice 35, 315-331. Rubinfeld, D., P. Shapiro and J. Roberts, Tiebout bias and the demand for local public schooling, Review of Economics and Statistics, forthcoming. Rufolo, A.M., 1979, Efficient local taxation and local public goods, Journal of Public Economics 12, 351-376. Sandler, T. and J.T. Tschirhart, 1980, Mixed clubs: Further observations, Journal of Economic Literature 18, 1481-1521. Schweizer, U., 1983, Edgeworth and the Henry George theorem: How to finance local public projects, in: J.F. Thisse and H.G.
Zoller, eds., Locational analysis of public facilities (North-Holland, Amsterdam) 79-93. Scotchmer, S., 1985, Profit-maximizing clubs, Journal of Public Economics 27, 25-85. Scotchmer, S., 1985, Hedonic prices and cost/benefit analysis, Journal of Economic Theory 37, 55-75. Scotchmer, S., 1986, Local public goods in an equilibrium: How pecuniary externalities matter, Regional Science and Urban Economics 16, 463-481. Shapiro, P., 1975, A model of voting and the incidence of environmental policy, in: H.W. Gottinger,
ed., Systems approaches and environmental problems (Vandenhoeck and Ruprecht, Göttingen). Smith, A.S., 1970, Property tax capitalization in San Francisco, National Tax Journal 23, 177-193. Sonstelie, J. and P. Portney, 1980, Gross rents and market values: Testing the implications of Tiebout's hypothesis, Journal of Urban Economics 9, 102-108. Stahl, K. and P. Varaiya, 1983, Local collective goods: A critical re-examination of the Tiebout model, in: J.F. Thisse and H.G. Zoller, eds., Locational analysis of public facilities (North-Holland, Amsterdam) 45-53. Starrett, D.A., 1980, On the method of taxation and the provision of local public goods, American Economic Review 70, 380-392. Starrett, D.A., 1981, Land value capitalization in local public finance, Journal of Political Economy 89, 306-327. Stiglitz, J.E., 1977, The theory of local public goods, in: M. Feldstein and R. Inman, eds., The economics of public services (Macmillan, New York). Stiglitz, J.E., 1983a, Public goods in open economies with heterogeneous individuals, in: J.F. Thisse and H.G. Zoller, eds., Locational analysis of public facilities (North-Holland, Amsterdam) 55-78. Stiglitz, J.E., 1983b, The theory of local public goods twenty-five years after Tiebout: A perspective, in: G.R. Zodrow, ed., Local provision of public services: The Tiebout model after twenty-five years (Academic, New York) 17-54. Thurow, L., 1966, The theory of grants-in-aid, National Tax Journal 19, 373-377. Tideman, T.N. and G. Tullock, 1976, A new and superior process for making social choices, Journal of Political Economy 84, 1145-1159. Tiebout, C., 1956, A pure theory of local expenditures, Journal of Political Economy 64, 416-424. Topham, H., 1982, A note on testable implications of the Tiebout hypothesis, Public Finance 120-128. Tresch, R., 1981, Public finance: A normative theory (Business Publications, Inc., Plano, Texas). Wales, T.J. and E.G. Wiens, 1974, Capitalization of residential property taxes: An empirical study, Review of Economics and Statistics 56, 329-333. Westhoff, F., 1979, Policy implications from community choice models: A caution, Journal of Urban Economics 6, 535-549. White, M.J., 1975, Fiscal zoning in fragmented metropolitan areas, in: E. Mills and W. Oates, eds., Fiscal zoning and land use control (Lexington Books, D.C. Heath Company, Lexington, Mass.) 31-100. Wicks, J.J., R.A. Little and R.A. Beck, 1968, A note on capitalization of property tax changes, National Tax Journal 21, 263-265. Wildasin, D.E., 1977, Public expenditures determined by voting with one's feet and public choice, Scandinavian Journal of Economics 79, 889-898. Wildasin, D.E., 1983, The welfare effects of intergovernment grants in an economy with independent jurisdictions, Journal of Urban Economics 13, 147-164. Wildasin, D.E., 1984, Urban public finance, draft. Wilde, J., 1968, The expenditure effects of grant-in-aid programs, National Tax Journal 21, 340-348. Winer, S.L., 1983, Some evidence on the effect of the separation of spending and taxing decisions, Journal of Political Economy 91, 126-140. Woodard, F.O. and R.W. Brady, 1965, Inductive evidence of tax capitalization, National Tax Journal 18, 193-201. Wooders, M., 1980, The Tiebout hypothesis: Near optimality in local public good economies, Econometrica 48, 1467-1485. Yinger, J., 1982, Capitalization and the theory of local public finance, Journal of Political Economy 90, 917-943.
Zimmerman, D., 1983, Resource misallocation from interstate tax exportation: Estimates of excess spending and welfare loss in a median voter framework, National Tax Journal 36, 193-202. Zodrow, G.R., 1983, The Tiebout model after twenty-five years: An overview, in: G.R. Zodrow, ed., Local provision of public services: The Tiebout model after twenty-five years (Academic, New York) 1-16. Zodrow, G.R. and P. Mieszkowski, 1983, The incidence of the property tax: The benefit view versus the new view, in: G.R. Zodrow, ed., Local provision of public services: The Tiebout model after twenty-five years (Academic, New York) 109-130.
Chapter 12
MARKETS, GOVERNMENTS, AND THE "NEW" POLITICAL ECONOMY
ROBERT P. INMAN*
University of Pennsylvania and NBER
1. Introduction
[The] very real features of government behavior - and the wider political structure - must be taken into account in any realistic assessment of the prospects for (economic) reform. In this return to "political economy," a great deal remains to be done [A.B. Atkinson and J.E. Stiglitz (1980, p. 576), as the conclusion to Lectures on Public Economics].
If there has been a prevailing trend in the affairs of nations in the twentieth century it has been the increasing role of governments in the allocation of society's resources. It is a trend evident not only in the major industrial nations of the world, but in developing economies as well. Governments provide goods and services directly, redistribute income, and regulate economic relations. Today, none of the major industrial nations, apart from Switzerland, has less than 20 percent of its gross domestic product (GDP) allocated through governments. Denmark, Sweden, Norway, Great Britain, the Netherlands, and Italy all allocate more than a third of GDP through governments. There is little doubt that government is a - if not the - central institution in the allocation of social
resources. It is imperative that we understand the growth and performance of so important an economic institution. This is not a new task. Indeed, Adam Smith sought first and foremost to understand the role and limitations of governments
in the economic affairs of nations. Hobbes and Rousseau before him and Mills, Keynes, and Schumpeter after, wrote to the same agenda.1 Yet in the last three decades we have witnessed perhaps the greatest advances in our understanding of the role, the strengths and the weaknesses of government as an economic institution. It is the intention of this chapter to survey these advances. The survey will bring together five separate literatures, each of which addresses one aspect in our understanding of the role for and capabilities of governments as resource allocators. The market failures literature, and in particular the challenge of Coase, Tiebout, and Stigler to the Pigou-Samuelson tradition of governmental remedies, will help us to understand when markets "work" and when they do not (Section 2). It is here we shall find the potential role for governmental - i.e. collective - action. Whether this potential will be realized in actual performance is another matter, however. The voluntary-exchange theory of Wicksell and Lindahl and the democratic social choice literature from Condorcet to Arrow and his contemporaries address just this question. When will a collective decision-making process satisfy a set of minimal normative criteria - for example, Pareto efficiency and non-dictatorial choice - and do so with a minimal expenditure of social resources? The voluntary exchange process of Wicksell and Lindahl and its contemporary offspring, the demand revealing mechanism, hold the promise of being efficient and the threat of becoming dictatorial. The voting process can protect us against dictatorship but generally at the cost of social efficiency. I examine these alternatives in detail. In Western societies the decision has been made to protect democracy; the task then is to search for the most efficient collective choice process consistent with this norm. The public choice literature developed under the early guidance of Bowen, Downs, Buchanan, and Niskanen has focused on just this question. It is in the spirit of their early work that we have begun to examine the performance of democratic governments as resource allocators. The results of this examination emerge in the new quantitative political economy and the empirical examination of government spending, taxation, and regulation (Section 3). Yet the political-economic perspective not only addresses important questions of fact - why do governments do what they do? It also challenges in a most fundamental way our view of economic policy analysis. It is not enough to offer a policy recommendation and expect its adoption into law. Governments are endogenous social institutions with their own agendas. The fact that governments will not automatically adopt Pareto-improving policies and reject Pareto-inferior ones does not mean we should stop looking for good policies. It does mean, however, that an integral part of economic policy analysis must be to
*I wish to thank the National Science Foundation, grant SES-81-12001, for financial support for this project, and the many friends and colleagues who read and commented on an earlier version of this chapter. Particular thanks must be given to A. Auerbach, W. Baumol, D. Blair, J. Farnbach, E. Mills, R. Nelson, M. Olson, S. Peltzman, H. Rosenthal, M. Riordan, D. Rubinfeld, D. Sappington, K. Shepsle, and D. Yao, who read all or major portions of a first draft with great care and insight. I hope this revised version has done their comments justice.
1For an introduction to the writings of the first political economists, from Aristotle onward, see Schumpeter (1954).
search for institutional reforms which will facilitate the adoption of good policies and which will block the adoption of bad ones. I shall review recent recommendations for the institutional reform of the budgetary process (Section 4). The "new" political economy is concerned with that most basic question of economic policy: Under what circumstances are markets or governments the preferred institution for allocating societal resources? The problem is an old one and was the central issue in the work of Hobbes, Smith, Mills and Sidgwick before us. What is "new" in the new political economy is the sophistication of analytic technique which we can now bring to bear to answer this important question. The work of the last thirty years has given us new insight and understanding into the relative strengths and weaknesses of markets and governments as institutions for allocating the economic resources of society (Section 5).
2. Market failures and the role of government
According to the system of natural liberty, the sovereign has only three duties to attend to; three duties of great importance, indeed, but plain and intelligible to common understandings: first, the duty of protecting the society from the violence and invasion of other independent societies; secondly, the duty of protecting, as far as possible, every member of the society from the injustice or oppression of every other member of it, or the duty of establishing an exact administration of justice; and thirdly, the duty of erecting and maintaining certain public works and certain public institutions, which it can never be for the interest of any individual, or small number of individuals, to erect and maintain; because the profit could never repay the expense to any individual or small number of individuals, though it may frequently do much more than repay it to a great society (Adam Smith, The Wealth of Nations, Book IV, chapter IX).
At least in theory, markets and governments are compatible institutions. The case for markets as the mechanism for the allocation of social resources is well known. Utility-maximizing households and profit-maximizing firms, if joined by markets covering all desired (now and future) goods and services, will be able to achieve a Pareto-efficient allocation of social resources, an allocation from which no single economic agent can be made better off without at the same time harming another agent.2 Furthermore, if agents are small compared to the economy as a whole, then the competitive market equilibrium is as good as any other allocation which
2The details of the argument are developed in Arrow and Hahn (1971). The strict convexity assumptions for households and firms can be relaxed without upsetting the central results; see Starr (1969).
a society might achieve for an initial distribution of resources.3 If, in fact, competitive markets arise to cover all desired commodities for trade, then the need for government is minimal. However, if the needed competitive trades do not occur to produce an efficient allocation, then we can speak of a market failure. It is in these instances that we need to consider a potentially broader role for governmental activities. Indeed, I shall argue that there is a common problem which underlies all market failures, and that that common problem is uniquely handled by an institution which can enforce cooperative - that is, collective - behavior in a world where non-cooperative behavior is the preferred individual strategy. Government is one such institution.
2.1. The minimal role of government
2.1.1. Property rights
The establishment of individual rights to initial endowments and to the gains from trading those endowments is now understood as a necessary condition for the use of markets as an institution to achieve a Pareto-efficient allocation. Without rights to property, market trades cannot be sustained. For Buchanan (1975, p. 9) property rights are necessary if we are to escape the Hobbesian world of constant conflict and achieve the benefits of trade. Furthermore, one might well argue that the Hobbesian state of nature in fact induces individuals to coordinate their activities so as to ensure the existence and maintenance of property right institutions. Nozick (1974) advances the idea of an "invisible hand process" from which evolves a cooperative institution to enforce individual property rights. If the protection of an individual's property and human capital endowment is costly but is less costly than stealing another's endowment, the natural state of affairs will be to attack one's neighbors and to defend oneself. The expected outcome of such a world will equal the initial endowment (M) less protection costs (d) less the net costs of attack (c). This expected outcome is less than what could be achieved if the initial endowments of property and skills were protected through a system of property rights, provided the cost per person of such a system (θ) were less than the costs c + d (M - θ > M - c - d). If there are economies of scale in the protection of endowments [that is, it is cheaper per person to protect n endowments than (n - 1) endowments], Bush and Mayer (1974) and Schotter (1981, pp. 47-51) have each shown that there will arise a coalition of all agents who agree not to rob each other, and who grant the coalition judicial and enforcement powers to ensure cooperation. One outcome that may emerge from such a "game" is that every player retains his original endowments. Each player agrees not to rob another, so no resources need be spent on protection or on attack. This is not the only outcome, however. Players
3Again see Arrow and Hahn (1971) for details.
may agree to a property rights coalition which gives some members less than their initial endowment of resources and others more than their initial endowments. This unequal distribution of endowments through the property rights system will
still be a stable outcome if being outside the protection system makes each individual or subset of individuals worse off.4 Nozick calls this property rights coalition the "minimal state". All members of society will agree to participate within the rules established by the property rights system, with some members compensated for the costs, if any, they may incur in the role of enforcers.5 To enjoy the gains of markets, a minimal degree of cooperative activity is required. Government as the protector of property rights is necessary.
2.1.2. Distribution of endowments
With the establishment of property rights to initial endowments and the gains from trade, an economy in which markets span all current and future goods can then achieve a Pareto-efficient allocation. That final allocation is tied directly to initial endowments, however. Those who start with more may end up with more in Nozick's minimal-state, market economy.
4Schotter (1981, pp. 45-51) develops these arguments in detail, but the basic idea is simple enough. In a game where each player is endowed with M of resources, where it costs c to attack and d to defend, and where there is sufficient time to attack only one other player and to be attacked once, players will find it advantageous to form a grand protection coalition - that is, a government. The pay-off matrix for this game is represented by:

                          Attacked
                  Defended        Defeated
Attack   Win     2M - c - d      M - c - d
         Lose    M - c - d       -(c + d)

If a player attacks and is successful, then he captures M from the defeated opponent. If he is robbed and defeated, he loses M. The best outcome is to "attack and win" and "defend against attack": 2M - c - d. Schotter shows that if the probability of a successful attack against a defending opponent is 1/2, then the best strategy for each player is to attack and to defend, and each of the four outcomes above occurs with probability 1/4. The expected value for each player in this game is therefore:

V = (1/4)(2M - c - d) + (1/4)(M - c - d) + (1/4)(M - c - d) + (1/4)(-(c + d)) = M - c - d.

Note that this expected value of the "state of nature" game, M - c - d, is less than what would be earned if property rights were jointly protected at a cost θ, when c + d > θ: M - θ > M - c - d. Note too that the charge θi can be varied by each individual i and unanimity maintained as long as M - θi > M - c - d. If the charge θi is greater than the true average cost per person θ, then person i contributes more than the average share. If θi < θ, then person i contributes less than the average share. An unequal distribution of final endowments, M - θi, can therefore result.
5Becker (1968) describes such an efficient, though inequitable, system of enforcement.
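As a numerical check on the protection game of footnote 4 (a sketch of my own; the values chosen for M, c, d, and θ are hypothetical):

# The "state of nature" game: with probability 1/2 that an attack against a
# defending opponent succeeds, each of the four outcomes has probability 1/4.
M, c, d = 10.0, 2.0, 3.0        # endowment, cost of attack, cost of defense

payoffs = [2 * M - c - d,       # attack wins, defense holds
           M - c - d,           # attack wins, defense fails
           M - c - d,           # attack loses, defense holds
           -(c + d)]            # attack loses, defense fails

V = sum(p / 4 for p in payoffs)
assert abs(V - (M - c - d)) < 1e-9   # expected value M - c - d, as in the text

theta = 1.0                     # per-person cost of jointly protecting endowments
if c + d > theta:               # then every player prefers the coalition
    print(f"state of nature: {V:.1f} < joint protection: {M - theta:.1f}")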
For many social philosophers, however, this fact is a troublesome consequence of allocations by markets only. While libertarians such as Nozick view the market as a fair process and hence sufficient for a just society,6 those who embrace an outcome or end-state criterion of social justice will find Nozick's minimal-state market economy unacceptable on grounds of equity. If we embrace an end-state view of justice, government must do more than just ensure property rights. End-state theories of justice founded on utilitarianism, the Rawlsian maximin principle, or other notions of fairness, judge the outcomes of a social process by what people finally receive.7 Given such principles, there is no guarantee that markets will achieve a just allocation of resources. Markets alone cannot ensure fair outcomes; markets combined with an appropriately structured initial distribution of endowments are required.8 Yet the appropriate distribution of endowments will generally not happen by chance; it must be managed by some institution within society. Markets, designed to ensure that no individual is made worse off than what he or she could achieve from the initial endowment, are clearly not the appropriate means of redistribution. Reallocation of initial endowments will require a coercive mechanism. A government with the power to tax and transfer is such an institution. Those who embrace an end-state philosophy of social justice therefore grant to governments an additional function: the power to tax and transfer the initial endowments of societal resources.9
6Nozick (1974) argues that extensions of state power beyond the prevention of force or fraud - for example, to redistribute goods and services - are a violation of our basic rights to liberty, that is, our rights not to be interfered with. Nozick does sketch a theory of a just distribution of holdings. There are three parts to any such theory: (i) a description of how people legitimately acquire their endowments, (ii) a description of how people legitimately transfer their endowments, and (iii) a description of how past injustices are to be rectified. Nozick's theory of just distributions is a procedural theory: a distribution is just if it arises from another just distribution by legitimate means. "Legitimate means" are those means which do not violate an agent's rights not to be interfered with. A market process based on free trade is legitimate in Nozick's view. Lump-sum taxation and transfer of endowments by the state is not. Such a redistribution of endowments can well include slavery (e.g. debtors' prison), yet even more modest taxes will make the state, in Nozick's words, "a part owner in you". Your rights to liberty are thereby violated. For an extended treatment of Nozick's view that the "minimal state" will not include government-enforced redistributions of endowments, see Levin (1982).
7See, for example, Varian (1975), who criticizes Nozick's minimal-state economy from the perspective of various end-state theories of justice.
8This conclusion is the familiar second theorem of welfare economics: if an allocation is a point of maximum social welfare for some particular social welfare function, it can be achieved by a suitable reallocation of initial endowments followed by trading to a competitive market equilibrium.
9Redistribution via lump-sum taxes and transfers of initial endowments is not straightforward. If initial endowments are meant to include labor power as well as property, then a lump-sum tax and transfer scheme requires knowledge of worker abilities.
Abilities are not easily observed, however. We all have an incentive to understate our skills to avoid taxes. Thus, taxes and transfers which sort workers by ability are needed. Generally, we cannot find a perfect scheme [see Stiglitz (1982)]. End-state theories of justice necessarily must use "second-best" tools of redistribution, that is, tools which cost economic resources. The gains in justice must compensate for the lost efficiency according to the end-state welfare criterion.
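The second welfare theorem of footnote 8 can be illustrated in the simplest exchange setting. The sketch below is my own construction (the taste parameters and endowments are hypothetical): it computes the competitive equilibrium of a two-person Cobb-Douglas exchange economy before and after a lump-sum reallocation of endowments. Both equilibria are Pareto efficient; which efficient end state is reached is governed entirely by the prior transfer.

# Two-person, two-good exchange economy with u_i = x**a_i * y**(1 - a_i).
# Good y is the numeraire; p is the equilibrium price of good x.

def equilibrium(a, endow):
    # Competitive equilibrium from endowments endow = [(x_i, y_i), ...].
    Ex = sum(e[0] for e in endow)
    # Market clearing for x: sum_i a_i * (p * x_i + y_i) / p = Ex; solve for p.
    p = (sum(ai * e[1] for ai, e in zip(a, endow))
         / (Ex - sum(ai * e[0] for ai, e in zip(a, endow))))
    wealth = [p * ex + ey for ex, ey in endow]
    return p, [(ai * w / p, (1 - ai) * w) for ai, w in zip(a, wealth)]

a = (0.5, 0.5)
print(equilibrium(a, [(9.0, 9.0), (1.0, 1.0)]))  # efficient but very unequal
print(equilibrium(a, [(5.0, 5.0), (5.0, 5.0)]))  # after a lump-sum transfer of
                                                 # endowments: also efficient, equal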
2.2. Beyond the minimal state: When markets fail
Proponents of the minimal state, whether libertarian or utilitarian, assume that markets do arise to allocate all valued goods and services efficiently. In reality we observe many valued commodities which are not efficiently allocated by markets: clean air and water, basic scientific research, charity, and insurance against nuclear attack are examples.10 Where markets fail to achieve efficiency, governments may succeed. But it is important in such analyses to give markets a full hearing. Is government really necessary, or have we simply failed to use the market institution to its full potential? I shall argue below that in many important instances governments are necessary for economic efficiency, and that the central feature of those instances is the need for the coercive enforcement of cooperative behavior among self-seeking agents.
10The absence of a market for a valued good is not necessarily a sign of a market failure, however. It is possible that a good is not produced and sold in markets simply because the highest price anyone would pay for the good is less than the lowest cost at which the good can be produced. We do not have a private market in space travel today; we may tomorrow.
Samuelson's now classic 1954 paper on the theory of public goods explains, perhaps more clearly than any other single source, the central issue in the failure of markets to achieve allocative efficiency when cooperation is required. Samuelson argues that the technology of certain goods and services is such that increased consumption by one member of society in no way diminishes the consumption of the other members of society: "more for you means no less for me". Such commodities are called "pure public goods" and include as examples national defense, a lighthouse in a harbor, or basic medical research. Samuelson proves that a Pareto-efficient allocation will require that the sum of the marginal private benefits [agents' marginal rates of substitution (MRS) of income for the good] must just equal the marginal cost (MC) of an additional unit of the public good: YMRSi = MC. Will a private market process, where all individual agents seek to maximize only their own welfare, satisfy this efficiency requirement? Generally, no. If purchased privately, each consumer will provide no more than the level where his/her own marginal benefit (MRSi) equals marginal cost (MC). If others in society also benefit - and for these goods, their consumption will not detract from the benefits of the original consumer - then too little (ZMRSi > MC) of the good will be provided. One can even imagine instances in which none of the good is purchased privately, even though all benefit from its provision. When exclusion is not possible, the original consumer will soon realize that he/she has purchased the good while others benefit at no cost. But why not let others provide the good privately? Such "free-riding" behavior may dominate private l°The absence of a market for a valued good is not necessarily a sign of a market failure, however. It is possible that a good is not produced and sold in markets simply because the highest price anyone would pay for the good is less than the lowest cost at which the good can be produced. We do not have a private market in space travel today; we may tomorrow.
purchase. If so, then all consumers hope to free ride on their neighbors and no one provides the good. Clearly, the market process with self-seeking consumers has failed to achieve a Pareto-efficient allocation of these goods and services. But has it? Samuelson's original conclusion was challenged from two directions, both arguing, fundamentally, that we have not been clever enough in our design of a competitive market process for the allocation of such commodities. The first challenge, offered by Tiebout (1956), argued that most of the goods and services provided by governments could be efficiently provided, if only governments could be made competitive with each other for customers. The natural setting for such "market" competition is the local public economy, where many suburban governments within a single metropolitan area compete for residents. In Tiebout's view, such competition among residential suburbs (private firms are ignored by Tiebout) would force each government to be technically (i.e. production process) efficient as well as allocatively efficient. If a local government provided too much or too little of a public good or were too expensive, residents would exit to another community which did meet their demands at the lowest cost.
Tiebout's insight is a powerful one; it is not, however, a solution to Samuelson's "free-rider" critique of the market provision of pure public goods. Crucial to the existence of a Pareto-optimal equilibrium in the Tiebout system of competitive government is the assumption that the publicly provided good displays congestion - "more for you does mean less for me". As new residents enter a community, each additional resident reduces the consumption of the public good enjoyed by existing citizens. To restore consumed output of the service to its original chosen level, more of the public facility (labor, capital, materials) must be provided. When the marginal cost of the added facilities just equals the average cost of providing the chosen level of the good to all residents, then the community has achieved the optimal (efficient) population size for that level of output. This efficient population size must be small enough to permit a sufficient number of competitive communities. However, if the good or service is a pure public good with no congestion, then the marginal cost of new residents is always zero and the efficient population size is infinite. In this case, a Tiebout competitive solution will not be efficient, and an efficient solution cannot be Tiebout competitive.11
Demsetz (1970) offered a competitive market process which he argued would efficiently allocate even a pure public good. As long as consumers could be easily excluded from the consumption of the pure public good by the producer, then Demsetz argued that competitive private firms would be compelled to provide the
11This is not to argue that the Tiebout strategy does not solve the allocation problem for congestible public services. It well may. See Chapter 11 by Rubinfeld in this Handbook, and the review by Bewley (1981).
efficient (ΣMRSi = MC) level of the pure public good. If everyone in society has identical demands for the public good (i.e. the same MRS schedule), then Demsetz's process will work. A firm provides a given level of the public facility and charges each of the N customers a price of at least 1/N of the cost. If that facility is too small by the Samuelson criterion for efficient provision (ΣMRSi > MC), then there will be a profitable margin for an alternative firm to offer a larger facility and attract customers away from the initial producer. The competitive process continues until potential profits are bid to zero. At that point an efficient level of the pure public good is provided.12 The Demsetz argument breaks down, however, if customers do not have identical demands for the public good and if those different demands are not known by the competitive firms. Oakland (1974) provides the counter-example in this, the more relevant, case. Allow three equal-sized groups to have different levels of demand (MRS schedules) for the public good: low (MRS1), middle (MRS2), and high (MRS3). Everyone will wish to purchase the level of the public good demanded by the low group, and a Demsetz process will satisfy the low demanders through a firm providing a facility level up to the point where MRS1 = MC/N. The middle (MRS2 > MC/N) and high demanders (MRS3 > MC/N) will clearly desire more. The Demsetz process will meet the additional demands of these two groups by creating a second firm which provides additional output until MRS2 = MC/(2/3 N). (Note the low demanders do not buy this additional output; hence only 2/3 N consumers are available to share costs.) The middle and high demanders buy the public good from both firm 1 and firm 2. Yet the high demanders want still more output [MRS3 > MC/(2/3 N)]. A third firm will arise in the Demsetz process to satisfy this additional demand; output will be defined by MRS3 = MC/(1/3 N), where only the high demanders (1/3 N) buy from the third firm. We need to ask whether this competitive allocation is Pareto efficient. The answer is no. Reallocations can occur which will make some citizens better off without harming others. Simply, the low demanders will pay some small fee to join the middle and high demanders in consuming from firms 2 and 3, and the middle demanders will pay a small, reduced fee to join the high demanders in consuming from firm 3. Those additional payments can be used to expand output and all citizens can be made better off. The Demsetz competitive process will provide too little of the public good.

12A firm must charge a price per person (p) which at least covers the marginal cost per person of producing the facility: p ≥ MC/N. The price cannot exceed the maximum willingness to pay of a typical (identical) customer: MRS ≥ p. Hence MRS ≥ p ≥ MC/N. If the existing firm provides a level of the good such that ΣMRS = N·MRS > MC, or MRS > MC/N, then a new firm could provide a larger output, still make a profit by pricing the new output between MRS > p > MC/N, and attract all the customers from the initial firm. This profit potential exists as long as MRS > MC/N. Output expansion continues until MRS = MC/N. But at that point N·MRS = ΣMRS = MC, which implies efficient provision.
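Oakland's counter-example is easy to reproduce numerically. The sketch below assumes particular linear MRS schedules and parameter values (all invented for illustration; Oakland works with general schedules) and compares the three-firm Demsetz outcome with the Samuelson-efficient output.

```python
# A numerical sketch of Oakland's counter-example. The linear MRS
# schedules and all parameter values below are assumptions invented for
# illustration; Oakland (1974) works with general demand schedules.

N, n_group, b, mc = 30, 10, 0.01, 15.0     # consumers, group size, slope, MC
a = {"low": 1.0, "mid": 2.0, "high": 3.0}  # MRS intercepts of the 3 groups

def mrs(a_j, q):
    # marginal benefit of one consumer in the group with intercept a_j
    return max(a_j - b * q, 0.0)

# The Demsetz process: firm 1 sells to all N until MRS_low = MC/N;
# firm 2 adds output for the 2N/3 middle and high demanders; firm 3
# adds still more for the N/3 high demanders alone.
q1 = (a["low"] - mc / N) / b                     # 50 units
q2 = (a["mid"] - mc / (2 * n_group)) / b - q1    # 75 additional units
q3 = (a["high"] - mc / n_group) / b - (q1 + q2)  # 25 additional units
q_demsetz = q1 + q2 + q3                         # 150 units in total

# Samuelson efficiency: expand output while the sum of all consumers'
# MRS still exceeds MC.
q = 0.0
while n_group * sum(mrs(a_j, q) for a_j in a.values()) > mc:
    q += 0.01

print(f"Demsetz competitive output : {q_demsetz:.0f}")  # 150
print(f"Samuelson efficient output : {q:.0f}")          # 175
```

With these assumed schedules the Demsetz process stops 25 units short of the efficient level, which is exactly the underprovision described in the text.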
Demsetz (1970) is aware of this problem, and its solution. The private firms must be able to price discriminate between the low, middle, and high demanders. It is here, however, that the competitive market fails us. There is no incentive in the Demsetz process for consumers to reveal their true demands. If low demanders are given a price break by firms 2 and 3, what is to prevent the middle and high demanders from claiming to be low demanders too? The Demsetz process is incapable of price discrimination by tastes, yet that is exactly what is required for a market process to ensure a Pareto-efficient allocation. We are back then to Samuelson's original dilemma: each consumer has a private incentive to free ride on the public good provisions of others. The only way out of this problem is to design a mechanism which successfully reveals consumer preferences. For then, free riding is impossible. Unfortunately, no such competitive market mechanism generally exists when consumers are self-seeking utility maximizers. Those allocation mechanisms which can extract true preferences require a single, non-competitive organization for their implementation. One candidate institution is government.13

2.2.2. Externalities
Like pure public goods, externalities create a conflict between self-seeking, maximizing behavior and the attainment of a Pareto-efficient allocation of social resources, and for fundamentally the same reason. If a Pareto-efficient allocation is to result when an activity creates benefit or harm for individuals other than those performing the activity, then we require the performing agent to be rewarded or penalized according to the marginal benefits and costs created. Dirty or clean air affects all consumers equally; more for you means no less for me. The criterion for the efficient allocation of such activities is identical to that for pure public goods [see Oakland (1969)]. The sum of the marginal benefits of the external activity (which may be negative in the case of "bads" such as soot) must equal the marginal cost of providing (a "good") or curtailing (a "bad") the activity [see, for example, Baumol and Oates (1975, ch. 5)]. When there are many beneficiaries from changing the level of an externality-producing activity, we confront the same difficulty here in using a market process for allocation as we did with pure public goods. Unless all consumers have identical demands for the
13Thompson (1968) offers an alternative institution: a discriminating monopolist who confronts each consumer with all-or-nothing offers. Thompson's process, while valid, is an extremely expensive one-on-one bargaining game played with each citizen. Groves (1973) and Groves and Ledyard (1977) have devised more manageable demand-revealing processes. I will have more to say about their processes below (see Section 3.3). For a general review, see Chapter 10 by Laffont in this Handbook.
externality and the externality can be denied to non-payers, a market process will not achieve a Pareto-efficient allocation. The failure of the market to reveal correctly true consumer or producer preferences for changes in the external activity is the problem; all agents have an incentive to "free ride" against other similarly situated consumers or producers. What is required is a single organization encompassing producers and consumers which can extract from each agent their true costs and benefits and then tax or reward the externality-producing activity accordingly.14 Again, that coercive institution called government is a possibility.

There is a well-known exception to this conclusion, however. In the case of two agents linked to each other through an external (non-market) activity in a world of complete information and costless bargaining, the two agents will strike a mutually advantageous (Pareto-efficient) bargain for the level of the external activity. First articulated by Ronald Coase (1960), this argument has spawned an enormous literature by economists and legal scholars. The Coase Theorem challenges in a fundamental way the notion that governments are always necessary to achieve an efficient allocation of non-market, external activities. But Coase's world is the very special one of two-person bargaining and full information, and Coase's conclusion - that private bargaining will efficiently allocate what markets cannot - is limited thereto.15 The Coase theorem does not generalize to the many-agent case without assuming costless coalition formation so as to collapse the problem back into a two-party (e.g. consumer vs. producer) bargain. But costless coalition formation in essence assumes costless preference revelation and no "free riding", hardly a helpful assumption when analyzing the relative roles of markets and governments.16 Relaxing the assumption of full information raises the possibility that bargaining in a world with non-convexities will result in non-optimal allocations. Bargainers may reach and then remain at locally efficient (second-best) not globally efficient (first-best) allocations [Polinsky (1979)]. Only an organization capable of overcoming barriers to information can correct these errors. Government is again a candidate organization. While one must recognize the challenge of the Coase Theorem to the usual conclusion that collective activity is required for the efficient allocation of externalities, the Theorem is not a sufficient argument for no government at all. At least the Nozick "minimal state" is required to enforce property rights agreements, and in most real world cases a more activist governmental position towards externalities must be considered too.17

14For examples of how this might be done, see Dasgupta, Hammond and Maskin (1980).
15It must be noted that Coase's theorem still requires a "minimal state" for property right enforcement and income redistribution. In Coase's analysis, either lump-sum taxes and transfers can be used to transfer income, or the initial property rights can be adjusted to protect a preferred distribution of post-bargain resources. On the use of property rights for distributional targets, see Polinsky (1979).
16On what may happen when costless coalitions cannot be formed, see Schultz and d'Arge (1974).
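The two-agent case can be illustrated with a small computation. Everything in the sketch below - the quadratic profit and damage functions and the parameter values - is an assumption chosen for illustration and is not drawn from Coase (1960).

```python
# A sketch of a two-party Coase bargain under assumed functional forms.
# A factory chooses an activity level x; its profit is B(x) = 10x - x^2,
# and the activity imposes damage D(x) = x^2 on a single neighbor.
import numpy as np

x = np.linspace(0.0, 5.0, 501)       # candidate activity levels
B = 10 * x - x**2                    # factory's profit from activity x
D = x**2                             # neighbor's damage from activity x

# Costless two-party bargaining maximizes joint surplus B - D,
# whichever side holds the initial right.
x_eff = x[np.argmax(B - D)]

# Starting points if no bargain is struck:
x_factory_right = x[np.argmax(B)]    # factory ignores the damage: x = 5
x_neighbor_right = 0.0               # neighbor vetoes all activity

print(f"efficient bargain        : x = {x_eff:.2f}")            # 2.50
print(f"factory-right status quo : x = {x_factory_right:.2f}")  # 5.00
# From either status quo a mutually advantageous side payment moves x
# to 2.50; the rights assignment determines only how the gains from
# trade are divided, which is the Coase Theorem in this two-agent case.
```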
2.2.3. Increasing returns to scale

It has long been recognized that significant increasing returns to scale in the production of a commodity is a potential source of market failure [Dupuit (1844), Hotelling (1938)]. In instances in which production requires a large fixed cost for capital, "know-how", or natural resource development, average cost may exceed marginal cost over all relevant (i.e. demanded) levels of output. If so, Pareto-efficient marginal cost pricing cannot be sustained by the private market process. A private firm which prices at marginal cost will not cover average costs, and in the long run will be driven from the market. A sustainable market process will produce either of two Pareto-inefficient outcomes in this case. In the case where no threat of market entry exists, the established firm will price at the long-run profit-maximizing level where marginal revenue equals marginal cost. Price will exceed marginal cost, and output will be below the Pareto-efficient level. A social welfare loss results [Harberger (1954)]. In the case where a significant threat of entry exists and the established firm's monopoly position is "contestable", the existing firm will be forced to lower its price and its profits to deter entry. In the limit, the firm will be driven to price at average cost where monopoly profits are zero, the so-called "Ramsey price" in this simple case [Baumol, Bailey and Willig (1977)].18 Yet, price equal to average cost is not a Pareto-efficient outcome either; again output is less than when price equals marginal cost.

17Cooter (1982) offers an even stronger critique of the applicability of the Coase Theorem. Coase assumes that agents who can benefit from a bargain will in fact bargain until all the gains are exhausted. That is one view of human nature, but not the only one. Cooter offers the Hobbesian view as an alternative. Bargainers may adopt their worst threats - the threat not to bargain at all - unless there is a third party called government which forces them to negotiate. "The Coase Theorem represents extreme optimism about private cooperation and the Hobbes Theorem represents extreme pessimism." The two theorems have contradictory implications for the role of governments. "According to the Coase Theorem, there is no continuing need for government... According to the Hobbes Theorem, the coercive threats of government... are needed to achieve efficiency when externalities create bargaining situations... (T)he government continuously monitors private bargaining to insure its success" (p. 19). For some experimental evidence on the Coasian vs. Hobbesian view of bargaining, see Hoffman and Spitzer (1982). The evidence seems to favor the Coasian view of bargaining at least in the two-person, full information situation first analyzed by Coase.
18The original argument of Baumol, Bailey and Willig (1977) considered the more interesting case of a natural monopoly involving both increasing returns to scale ("economies of scale") and economies from multi-product production ("economies of scope"). In the simple case above, only economies of scale are considered. In the more general case of a multi-product firm, cross-product subsidies are possible and the resulting Ramsey prices may be above average cost for some commodities and below average cost for others. Nonetheless, these prices are still second best against alternatives which permit lump-sum (non-distortionary) taxes. See the argument in the following paragraph, which generalizes to the case of economies of scale/economies of scope as well.
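The welfare ranking of the three pricing outcomes can be checked directly. The linear demand curve and the cost figures below are assumptions made only for this illustration.

```python
# A check of the welfare ranking of the three pricing outcomes under
# increasing returns (all figures are assumptions for illustration).
# Inverse demand: p(Q) = 20 - Q; total cost: C(Q) = 50 + 4Q, so average
# cost exceeds the marginal cost of 4 at every output level.
from math import sqrt

A, F, c = 20.0, 50.0, 4.0   # demand intercept, fixed cost, marginal cost

def welfare(q):
    # total surplus: area under the demand curve minus total cost
    return A * q - 0.5 * q**2 - (F + c * q)

q_mc = A - c                                     # price = MC
q_ac = (A - c + sqrt((A - c)**2 - 4 * F)) / 2    # price = AC (zero profit)
q_m  = (A - c) / 2                               # monopoly: MR = MC

for name, q in [("p = MC", q_mc), ("p = AC", q_ac), ("monopoly", q_m)]:
    print(f"{name:9s} Q = {q:5.2f}  welfare = {welfare(q):6.2f}")
# p = MC    Q = 16.00  welfare =  78.00  (needs a lump-sum subsidy of 50)
# p = AC    Q = 11.74  welfare =  68.93  (zero profit, output still too low)
# monopoly  Q =  8.00  welfare =  46.00
```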
The first-best solution to this allocation problem is to price at marginal cost and then to raise the required revenue to cover the long-run losses from pricing below average cost through a system of lump-sum, non-distortionary charges on all consumers. Can a private firm adopt this first-best pricing strategy? Not unless it knows who its customers will be before it builds its facilities.19 If customers are known only after the facility is in place, a subtle game of preference revelation will emerge between consumers and the firm which may prevent the attainment of long-run efficiency. The firm needs to charge the full lump-sum fee, a fee which each consumer has an incentive to avoid. Consumers will agree to purchase the good for a small mark-up above marginal cost (hence contributing a token amount towards fixed cost) but can threaten not to consume if charged the full lump-sum fee. An existing firm may view a small contribution as better than none at all, but in the long run it suffers losses and is driven from the market. New entrants may fear the same fate. The market fails. First-best, Pareto efficiency can only result from the market process if the demands for the product are known before the production facilities are put in place. This requires an organization encompassing all consumers and producers which can extract from each agent their true costs and benefits of providing the commodity. Governments are a possible solution.20

2.2.4. Incomplete information

The competitive market process as a mechanism for achieving a first-best, Pareto-optimal allocation of resources requires that each consumer be fully informed about all attributes of the commodity being purchased. If this is not true, particularly if the seller knows more about the commodity (whether output or input) than the buyer, market transactions will be characterized by incomplete or asymmetric information. Either of two problems may arise in this instance. The first, called moral hazard, arises when the buyer cannot observe controllable
19For example, the firm may write long-term contracts with potential customers and then provide the service if a sufficient number of customers can be found to contribute to the cost of the overhead. Note, however, that all customers still have an incentive to free ride on the first-users. Once the facility is in place, the firm has an incentive to write second contracts with new customers and to charge fees which cover marginal costs and make only small contributions to overhead. Everyone will want to avoid the first-user contracts and sign a "new" customer agreement. Again the free-rider problem may undo a clever, private-market contracting scheme.
20On how governments might extract this needed information, see Baron and Myerson (1982). There is considerable discussion in the industrial organization literature on whether this informational argument for government provision of natural monopoly commodities means government should run monopolies or just license them. Posner (1971) and Demsetz (1968) argue in favor of licensing after bidding for the rights to provide the commodity. Bids should be awarded on the basis of lowest per unit price [Stigler (1968)]. Williamson (1976) argues that the information argument may still require government regulation or management of natural monopolies. Loeb and Magat (1979) offer another approach to natural monopoly regulation which conserves information about firm costs, but still requires information on consumer demand.
actions taken by the seller. The typical example is the level of care or effort taken by a seller in performing a service. If care cannot be monitored, the incentive for sellers is to underprovide quality, even though consumers would pay for a better product [Holmstrom (1984)]. The second, called adverse selection, arises when the buyer cannot observe important exogenous characteristics or contingencies affecting the seller. When health-care consumers cannot distinguish between competent and incompetent physicians, car buyers cannot tell a "cream puff" from a "lemon", and insurance companies cannot separate good and bad risks, then adverse selection may drive good physicians, good cars, and good risks from the marketplace. Product quality declines, or the market may disappear altogether [Akerlof (1970)]. Market institutions may arise to help overcome informational asymmetries. The more able sellers may be willing to offer contingent contracts whereby payment is made only if product quality meets a given standard. Warranties are an example of contingent contracts. Seller reputation is another potential device for overcoming the problems of informational asymmetries. Sellers planning to remain in the market (this fact must be known!) have an incentive to protect product quality over the long run [Klein and Leffler (1981)]. Furthermore, contracts may act as signals of seller characteristics or effort. Only the most able or those willing to work hard will risk a contingent contract for their services. Investment in education may signal the more able in the work force [Spence (1973)]. Finally, sellers may band together to form professional associations to certify the product quality of their members [Shaked and Sutton (1981)]. The issue here is whether these market responses are sufficient to overcome the difficulties originally created by incomplete information. The tentative answer appears to be no [Holmstrom (1984)]. While they are ingenious institutional responses which expand the range of beneficial trades, they do not exhaust all possibilities for improvement. Further trades would be possible if additional information about sellers could be provided to buyers. Is government necessary for the provision of this additional information? Several arguments, particularly favoring government certification, suggest that it is. If certification of sellers is costless, i.e. information is free, all but the least qualified sellers would allow themselves to be certified and the resulting market would be efficient. When certification is costly, however, markets will not be efficient and government intervention may be desirable. The preferred allocation will set the marginal benefits of additional certification (measured by the gains from now available trades) equal to the marginal costs of providing certification. There are likely to be significant externalities in such a process, however, and herein lies the case for possible government intervention. First, the value to a single seller of being certified may depend upon who else is certified; information about quality is often a statement of relative (better than, worse than) value. In such a case no one may signal because going first is not worthwhile. Only
cooperative action by all sellers can overcome this "network externality" [Wilson (1982)]. Second, while the process of certification confers benefits on the most able sellers, it also imposes an externality on the least able, who earn less with certification. This external cost of certification is ignored by the more able. Excessive information therefore may be provided by a market process [Jovanovic (1982)]. Again, a cooperative action by all sellers - how to limit certification - is necessary. Might we then rely upon private seller cooperatives to certify quality and ensure efficiency? No, for with the right to control certification goes the right to control entry. Shaked and Sutton (1981) show how professional groups which license entry will give out too few licenses, inflate prices, and earn monopoly rents. In summary, certification is a commodity - information about sellers - which has all the characteristics of a Samuelson public good. Individual consumers generally have no incentive to provide such a public service. Individual sellers may provide certification, but because of either positive or negative spill-overs the level is likely to be non-optimal. Government regulation of certification is one solution to these problems.21

2.2.5. Unemployment

A major, if not now the most prominent, role of government in a market economy is the management of cyclical fluctuations in output and employment. But not all observed unemployment requires government intervention. Some unemployment can be socially beneficial, allowing workers and firms to match skills and needs through market search. This so-called search or "natural" rate of unemployment is a structural feature of any economy with less than perfect information about possible labor market transactions. By itself it is not an imperfection in market economy performance which macro-economic fiscal or monetary policy should correct; indeed, such policies may only make matters worse [Friedman (1968)]. Market performance can be improved through a better matching of workers and firms, but it is not obvious that a government is necessary to provide that service. A deeper argument is needed. It is now generally appreciated that the case for government intervention to increase employment - whether through monetary and fiscal policies or through policies to improve labor market performance - must be grounded in a well specified micro-economic argument of market failure. Fundamentally, the problem of unemployment is a failure of self-seeking workers and firms to coordinate their actions to achieve mutually beneficial trades. In this sense unemployment is

21Yet again the government must have a mechanism which will induce suppliers to reveal their true qualities. Various signalling equilibria have been proposed and analyzed in the new literature on the economics of information; see Riley (1979).
"involuntary" (not "natural") and its correction therefore requires the intervention of an extra-market coordinating agent such as government. But why might labor markets fail to exhaust all beneficial trades?22 Either of two arguments seems potentially persuasive. The first builds from the theory of asymmetric information as applied to labor services. The labor market is marked by random shocks to firms' demands for labor, creating alternate periods of low and high wages and employment. Risk-averse workers will want to insure against such swings in their annual incomes. In the case of symmetric (i.e. identical) information regarding demand shocks between insured workers and the insurers (whether the firm or a third party), an optimal insurance contract can be written which will efficiently allocate the economic risk.23 However, if the information regarding demand is asymmetrically distributed to the advantage of either party (e.g. the firm knows demand and the worker does not), an inefficient insurance contract may result. The reasons have been outlined above - moral hazard or adverse selection - and have been carefully examined by Azariadis (1975), Baily (1974) and others [see Azariadis and Stiglitz (1983, particularly sections IV and V)]. The conclusions here parallel those of the asymmetric information literature; less than a full employment Pareto-optimal allocation may result, which can only be corrected through the truthful revelation of private information by workers or firms. Government intervention may be necessary in this case either (1) to collect and reveal relevant information to all parties, or (2) to provide unemployment insurance directly to workers if such insurance does not arise privately.24

The second market failure argument for government intervention to improve employment rests fundamentally on economy-wide increasing returns to scale in the production of private goods and services [Weitzman (1982)].25 Strict constant
22In essence the failure of the labor market is a failure of wages to adjust to clear the labor market, a point appreciated since the work of Keynes. If wages did adjust, a Walrasian equilibrium would result with no involuntary unemployment. As Drazen (1980) has emphasized in his review of the recent disequilibrium literature, the task before us now is not to explain what happens when wages are sticky, but rather why they are sticky in the first place. That is, why does the market fail?
23This efficient insurance contract may well involve lay-off unemployment. See Azariadis and Stiglitz (1983).
24It should be noted that any government policies which follow from the asymmetric information argument are micro-economic policies directed at improving the insurance of worker income against exogenous demand shocks. For an interesting asymmetric information argument which argues for macro-economic policies, see Woglom (1982). In Woglom's work the asymmetric information problem occurs in the product markets, not the labor markets. Note too that minimizing exogenous demand shock risks is also a policy alternative. But to argue for government as a necessary mechanism to reduce such shocks first requires a theory of how unanticipated demand shocks arise. I know of no theory which argues such shocks are a logical outcome of a market economy.
25Weitzman emphasizes that increasing returns to scale is a general description for all the economic advantages of size: physical economies, internalization of externalities, economizing on information, specialization of roles, access to capital, and so on.
returns to scale (and symmetric information) must imply full employment. With full divisibility of production provided by constant returns to scale, any unemployed factor can "hire itself and any other factors it needs and sell the resulting output directly.... The operational requirement is that the efficient minimum-cost scale of production be sufficiently small, relative to the size of the market, that any one firm or plant cannot affect prices appreciably" [Weitzman (1982, pp. 791, 792)]. Individuals can sell all that they produce at the market price. Full employment is therefore a logical consequence of the assumptions of constant returns to scale and price competition. Allowing increasing returns to scale upsets this happy balance of supply and demand. With increasing returns to scale, firms are large relative to their markets and thus face a downward sloping demand curve for their products. It is this fact which creates the possibility of a less than full employment equilibrium. Each existing firm, or new firm created by unemployed factors, is afraid of glutting its own market by increasing its output. New supply "spoils" its own market before it has a chance to create new demand. Only the simultaneous expansion by all firms in the economy can overcome this barrier to full employment. Yet there is no market mechanism that will insure such coordinated expansion of economic activity. An extra-market institution - e.g. government - will be needed to move the economy to the preferred full employment equilibrium. In this case the appropriate policies are those which can expand aggregate demand, for example, monetary [Woglom (1982)] and fiscal [Hart (1982)] policy.

The essence of each of the five market failures detailed above is the need for some institution to overcome the proclivity of individual, self-seeking agents to act uncooperatively when cooperation among agents is needed for a mutually beneficial trade. Markets as institutions for organizing economic activity often cannot insure the required cooperative behavior. When markets fail, they fail because they cannot be designed to prevent individual cheating against a mutually preferred cooperative outcome. In such cases, a cooperative agreement among all parties is required, and an extra-market institution - possibly a government - will be needed to enforce it. Understanding why an enforceable cooperative agreement is necessary, and hence why governments may sometimes be preferred to markets for allocating economic resources, is our next task.

2.3. Market failures as a Prisoner's Dilemma: The potential case for government
The essence of the problem of the failure of two individuals to cooperate, even when cooperation is in their joint interest, is captured by a game called the Prisoner's Dilemma.26 Imagine two agents deciding whether to provide a public good which yields each a benefit of $6 but costs a total of $8 to produce. The two agents can cooperate (strategy C) or not (strategy N) to provide the good. Once the good is provided the benefits are enjoyed by both parties. No communication or pre-contracting outside the game is allowed. If both agents cooperate, they share the costs equally and enjoy a net benefit of $2 per person [= $6 - ($8/2)]. If one agent produces the good but does not receive a contribution from the other, then the producing agent's net benefit is -$2 (= $6 - $8). In this case, the non-cooperative, free-riding agent enjoys all the benefits at no cost and receives $6 (= $6 - $0). If the good is not produced by either agent - that is, both act non-cooperatively - then each agent receives 0 net benefits. The best outcome for either player occurs when the other provides the good and he or she free rides, but if both parties attempt to free ride the good will not be produced. Pay-offs to each player for each combination of strategies are shown in Figure 2.1, with the pay-off to player 1 listed first in each cell.

                          Player 1
                     C              N
    Player 2   C   ($2, $2)      ($6, -$2)
               N   (-$2, $6)     ($0, $0)

    Figure 2.1. The Prisoner's Dilemma.

26The formal structure of the game was originally proposed by Merrill Flood (1958), though the issue of non-cooperative behavior when cooperation is jointly preferred has been discussed by
political economists as the basis of collective action at least since David Hume's example of the flooded meadow:
Two neighbours may agree to drain a meadow, which they possess in common; because 'tis easy for them to know each others mind; and each must perceive, that the immediate consequence of his failing in his part, is, the abandoning the whole project. But 'tis very difficult, and indeed impossible, that a thousand persons shou'd agree in any such action; it being difficult for them to concert so complicated a design, and still more difficult for them to execute it; while each seeks a pretext to free himself of the trouble and expense, and wou'd lay the whole burden on others.
David Hume, A Treatise of Human Nature, book 3, part 2, sec. 7, p. 538.
The joint cooperation strategy (C, C) yields players 1 and 2 $2 each. If player 1 cooperates but player 2 does not, the pay-offs are -$2 and $6, respectively. If player 2 cooperates but 1 does not, the pay-offs are reversed. If both players adopt the non-cooperative strategy, the pay-offs are $0 to each. Note that the dominant strategy for both agents is not to cooperate. If agent 2 cooperates (C), the best strategy for agent 1 is not to cooperate (N) as $6 > $2. If agent 2 does not cooperate (N), then again the best strategy for agent 1 is not to cooperate (N) as $0 > -$2. Either way, non-cooperation is preferred. The same logic applies to agent 2 deciding how to play against agent 1. Since both players find non-cooperation the preferred individual strategy, the outcome of this game is in the (N, N) cell where no public good is provided. Yet both agents would prefer the cooperative outcome to this result. Cooperation is preferred, but no agent will play cooperatively for fear of suffering large losses if their neighbor cheats. Just as the market may not provide a contract to ensure cooperative behavior when there is an incentive to act independently, so too do the players in the Prisoner's Dilemma game lack the means to enforce the cooperative solution. The pay-off matrix of the game illustrates the advantages to cooperation. The fact that the players cannot communicate or write binding contracts parallels our market failures. Without a cooperative contract and an enforcement mechanism, non-cooperation (N, N) is the inevitable result.27 Without too much effort one can construct a Prisoner's Dilemma story which captures the essential features of each of the market failures described above.28 In each instance there are gains to cooperation but a failure of private contracting to ensure the cooperative outcome. If government is to have a role to play as a mechanism for the allocation of social resources beyond its functions in a "minimal state", then it is in its ability to enforce a cooperative outcome in these market failure, Prisoner's Dilemma games. Before we rush to embrace government, however, perhaps we should step back for a moment and ask if there might not be reasonable instances in which agents will voluntarily select the cooperative allocation within the Prisoner's Dilemma setting. In an analogy to market failure, markets failed because explicit and enforceable contracts could not be written, but perhaps "implicit" (i.e. legally unenforceable) contracts for cooperation will be chosen voluntarily. If the answer is yes, then the need for government - coercive and expensive - is significantly reduced. Are there voluntary, cooperative solutions to the Prisoner's Dilemma game?29

27Both parties have adopted their Nash strategy in this non-cooperative, finite sum game and the outcome is a Nash equilibrium. The game is non-cooperative because no communication or extra-game interaction between the players is allowed. The game is finite sum because there is a finite gain to cooperation, if cooperation can be achieved.
28For public goods and externalities see Riker and Ordeshook (1973, chapter 9). Klein and Leffler (1981, p. 624) describe their market game with asymmetric information as a Prisoner's Dilemma. Woglom (1982) and Weitzman (1982) both mention the Prisoner's Dilemma character of their description of the unemployment problem. Sen (1967) and Warr and Wright (1981) stress the Prisoner's Dilemma character of non-optimal intertemporal market allocations.
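The dominance argument can be verified mechanically. The short sketch below encodes the Figure 2.1 pay-offs and checks every strategy pair for mutual best replies; the enumeration style is simply one convenient way to do the bookkeeping.

```python
# A mechanical check of the dominance argument for the Figure 2.1
# pay-offs.
from itertools import product

# payoff[(s1, s2)] = (pay-off to player 1, pay-off to player 2)
payoff = {("C", "C"): (2, 2), ("C", "N"): (-2, 6),
          ("N", "C"): (6, -2), ("N", "N"): (0, 0)}

def best_reply(player, other_strategy):
    # the strategy maximizing this player's pay-off, opponent's play fixed
    if player == 1:
        return max("CN", key=lambda s: payoff[(s, other_strategy)][0])
    return max("CN", key=lambda s: payoff[(other_strategy, s)][1])

for s1, s2 in product("CN", repeat=2):
    is_nash = best_reply(1, s2) == s1 and best_reply(2, s1) == s2
    print(s1, s2, payoff[(s1, s2)], "<- Nash equilibrium" if is_nash else "")

# Only (N, N) survives: N is each player's best reply to both C and N,
# even though both players prefer the (C, C) outcome.
```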
The Prisoner's Dilemma allocation problem described by the matrix in Figure 2.1 was a game played once between two agents. There was little to be gained by trusting your opponent and playing the cooperative strategy since you were sure to lose. Furthermore, since the game was played only once there was no way to penalize a cheating opponent. But our economic lives consist of many market failures, or Prisoner's Dilemma problems, and some - e.g. do we, or do we not, litter in the park - are repeated almost daily. Might not the better characterization of market failures be a repeated Prisoner's Dilemma game, and, if so, is the non-cooperative strategy dominant in this case as well? In the case in which the Prisoner's Dilemma game is to be played a known and finite number of times, the non-cooperative strategy does remain the dominant strategy for each player. Consider a game which is to be played 100 times. For the 100th play, both agents know this is the last play of the game. Therefore, this iteration is equivalent to a one-time Prisoner's Dilemma game, where the non-cooperative strategy is dominant. Thus, the outcome of the 100th play is known; both agents play non-cooperatively and receive $0 pay-offs. But if the outcome of the 100th iteration of the repeated game is known, so too will be the outcome of the 99th iteration. For now when players consider the 99th play of the game, they will realize the 100th play is already determined. What is the best strategy for the 99th iteration? Since nothing can be gained at the 100th iteration by playing cooperatively at the 99th, the dominant strategy for both players is again to play non-cooperatively. Thus, the outcome of the 99th play is known; it is the non-cooperative result with $0 pay-offs for both agents. This logic can be repeated for each preceding iteration of the game, until it is clear that the non-cooperative strategy is the dominant strategy for each iteration of the repeated Prisoner's Dilemma game. It appears that allowing for the fact that market failure-Prisoner's Dilemma situations are pervasive and repetitive will not, by itself, diminish the potential role for government to insure cooperative allocations.

29In the discussion which follows, I consider only those approaches which keep the assumption of self-seeking, utility-maximizing agents. We could of course obtain cooperative outcomes by assuming agents wish to behave cooperatively for reasons of friendship or the ethical importance of cooperative behavior. Such moral motivations are real enough. Indeed, there is a good deal of experimental evidence which indicates that friends, or parties told to be cooperative, do choose the cooperative allocation [e.g. Deutsch (1973)]. But that strategy significantly changes our problem; a high penalty is now attached to the non-cooperative strategy such that the problem is no longer a Prisoner's Dilemma. For example, subtract $5 from the pay-off for each non-cooperative play in Figure 2.1; it should be clear that cooperation is now the dominant strategy in this new game. [Though unselfishness will not always solve our problems of non-cooperation, and such motivations may even create non-cooperative behavior! See Weymark (1978).] Since my focus is on the market failure-Prisoner's Dilemma problem, I will not discuss these alternative approaches to cooperative behavior. Hardin (1982, chs. 6 and 7) provides a useful survey.
See also Kurz (1977) for a more formal analysis of resource allocations with altruistic agents.
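The unraveling argument itself can be checked in code. The sketch below, an illustration using the Figure 2.1 pay-offs rather than a general game solver, works backwards from the final round and confirms that non-cooperation remains dominant at every iteration.

```python
# A sketch of the unraveling argument for the finitely repeated game,
# using the Figure 2.1 pay-offs (an illustration, not a general solver).
payoff = {("C", "C"): (2, 2), ("C", "N"): (-2, 6),
          ("N", "C"): (6, -2), ("N", "N"): (0, 0)}

continuation = 0  # pay-off from all rounds after the current one
for rounds_left in range(1, 101):  # work backwards from the 100th play
    # Future play is pinned down at (N, N), so the continuation value is
    # the same whatever is played today; N then dominates C exactly as
    # in the one-shot game:
    n_dominates = all(
        payoff[("N", s2)][0] + continuation > payoff[("C", s2)][0] + continuation
        for s2 in "CN"
    )
    assert n_dominates                      # holds at every iteration
    continuation += payoff[("N", "N")][0]   # both play N: adds $0

print("pay-off to each player over 100 plays:", continuation)  # 0
```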
We can relax the structure of our analysis a bit further, however, and voluntary cooperative behavior will emerge, at least for some of the plays of a finite length, repeated Prisoner's Dilemma game. Axelrod (1981) drops the restriction that the finite Prisoner's Dilemma game has a known number of iterations and allows each player a degree of uncertainty that the next play will be the last. Axelrod asks whether in this uncertain setting cooperative behavior might emerge. The answer is yes, provided there is a sufficiently large probability that the game will continue after each iteration. The argument proceeds in two steps. First, it is shown that if the probability that the game will continue is high enough, then there is no best strategy against all strategies your opponent might play.30 Specifically, the non-cooperative strategy is no longer the dominant strategy.

30The matrix of the Prisoner's Dilemma game can be generalized to:

                          Agent 1
                     C              N
    Agent 2    C   (R, R)        (T, S)
               N   (S, T)        (P, P)

(the first entry in each cell is the pay-off to agent 1, the second the pay-off to agent 2), where the pay-offs are related as T > R > P > S and R ≥ (S + T)/2, so that the alternating strategy of N then C is not preferred to the outcome from mutual cooperation. If the pay-off in any period is uncertain because the future of the game is uncertain, then this uncertainty should be reflected in players' strategy selection. Assume players are risk neutral and discounting of future pay-offs is not relevant. (The conclusions generalize to risk aversion and discounting.) A player who receives the pay-off P in each period the game is played receives an expected pay-off over the life of the game of P + wP + w²P + w³P + ... = P/(1 - w), where w is the probability the game will continue after each play. In this setting there is no one best strategy independent of the strategy used by an opponent. To see this, suppose strategy A is best no matter what your opponent does. This means the value of playing A against any strategy B your opponent plays must dominate the value of some other strategy A' against B, that is, V(A|B) > V(A'|B). Consider the two general forms for strategy A: (1) cooperate until your opponent does not cooperate and then stop cooperating (the "nice" strategy), and (2) stop cooperating before your opponent does (the "not nice" strategy). First, if A is "nice", let A' be always play N and let the opponent's strategy (B) be always play C. Then V(A|B) = R/(1 - w). The value of A' is V(A'|B) = T/(1 - w), which is greater than V(A|B). Therefore when A is nice it is not always the best strategy. Second, let A be a "not nice" strategy, define A' to be to always play C, and define the opponent's strategy B to cooperate until the first player no longer cooperates and then to play N. Let strategy A be to defect from the first play. Then V(A|B) = T + wP/(1 - w) and V(A'|B) = R/(1 - w). In this case V(A'|B) > V(A|B) if w > (T - R)/(T - P). If the probability of the game continuing is high enough then no one strategy is always best [Axelrod (1981, theorem 1)]. Axelrod's theorem 1 is a formalization of the insight of Hardin's (1982, p. 146) critique of the non-cooperative outcome of the finite length Prisoner's Dilemma game.
Second, Axelrod proves that a cooperative strategy which punishes non-cooperators - called the "tit-for-tat" strategy - is a potentially stable strategy in the Prisoner's Dilemma game with uncertain repetition. Furthermore, if this strategy is adopted by some specified fraction of the players participating in a society of repeated Prisoner's Dilemma interactions, the "tit-for-tat" strategy will eventually dominate the pure non-cooperation strategy.31 With the tit-for-tat strategy each player plays cooperatively until his opponent acts non-cooperatively, and then he plays one round non-cooperatively in retaliation. With everyone playing tit-for-tat, cooperative allocations result. Has uncertainty about the number of iterations of a Prisoner's Dilemma game therefore given us the "out" we need? Surely, uncertainty is a reasonable relaxation and by itself does little violation to our view of market failures. The more relevant question is whether most market failure situations will meet the precise conditions of the Axelrod theorems. That is an empirical matter. Yet there remains an important logical gap in applying the Axelrod results to our question of market failures and the potential need for government. For the cooperative solution to emerge voluntarily, there must first be at least a small set
31Suppose a strategy B is being used by all members of society. We then introduce a new, small group into society who use a new strategy A. Those who play A interact with a probability p with another player of strategy A and a probability (1 - p) with someone who plays B. If the value of playing A against A is V(A|A) and the value of playing A against B is V(A|B), then the expected pay-off of playing A is pV(A|A) + (1 - p)V(A|B). Since there are so many B's, we can assume B's interact almost always with B's. The value of playing B is therefore V(B|B). Playing strategy A will be preferred to playing B if pV(A|A) + (1 - p)V(A|B) > V(B|B), or if

p > [V(B|B) - V(A|B)] / [V(A|A) - V(A|B)].

Let strategy A be the "tit-for-tat" strategy, and let strategy B be to always play non-cooperatively. Then for the generalized pay-off matrix in footnote 30, V(B|B) = P/(1 - w), V(A|B) = S + wP/(1 - w), and V(A|A) = R/(1 - w). Upon substitution, playing tit-for-tat will be preferred to always being non-cooperative if

p > [(P - S) + (S - P)w] / [(R - S) + (S - P)w].

If we let P = 0, S = -2, and R = 2 (the pay-offs in Figure 2.1) and w = 0.5 (a 0.5 probability of continuing), then p > 0.33 will insure tit-for-tat will dominate always behaving non-cooperatively. If w = 0.9, then p > 0.091. As long as tit-for-tat players can be sure of interacting with enough other tit-for-tat players, then the tit-for-tat strategy will dominate a pure non-cooperative strategy [Axelrod (1981, p. 315)]. In a similar vein, Kreps et al. (1982) argue that cooperative behavior can emerge from a Prisoner's Dilemma game if each player believes the other will play tit-for-tat some fraction δ of the time. The fraction is not motivated from either individual self-seeking behavior or the structure of the pay-offs; in a sense, players are simply willing to try cooperation now and then. Kreps et al. show that for δ > 0, there is an upper limit to the total number of plays which involve non-cooperation.
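Footnote 31's arithmetic can be confirmed with a few lines of code, using the Figure 2.1 pay-offs (R = 2, P = 0, S = -2) exactly as in the footnote.

```python
# A check of footnote 31's arithmetic with the Figure 2.1 pay-offs.
R, P, S = 2.0, 0.0, -2.0

def min_p(w):
    v_bb = P / (1 - w)           # always-N against a world of always-N
    v_ab = S + w * P / (1 - w)   # tit-for-tat suckered once, then mutual N
    v_aa = R / (1 - w)           # tit-for-tat against tit-for-tat
    # tit-for-tat invades when p*V(A|A) + (1 - p)*V(A|B) > V(B|B):
    return (v_bb - v_ab) / (v_aa - v_ab)

print(f"w = 0.5: p > {min_p(0.5):.2f}")    # p > 0.33
print(f"w = 0.9: p > {min_p(0.9):.3f}")    # p > 0.091
```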
of players willing to play cooperatively at the start. What motivates their behavior? Certainly no self-seeking, utility-maximizing player will adopt the tit-for-tat strategy on his own. If all others play non-cooperatively, a tit-for-tat strategy will give you a negative pay-off in the first period (you cooperate and your opponent does not) and 0 pay-offs (both non-cooperate) after that. A non-cooperative strategy at each play gives 0 pay-offs in each iteration and is preferred. Cooperative behavior can emerge, but only if a significant subset of agents are willing to risk a loss and act cooperatively first. Rather than an argument against the need for government to handle market failures, Axelrod's analysis may be seen as offering a case for government. Government protects and encourages those first cooperative agents. It is enforced cooperation by government in the early stages of the repeated Prisoner's Dilemma problem which permits the tit-for-tat strategy to take hold. Over time, voluntary cooperative behavior may replace the need for enforced cooperative behavior. But enforced cooperation via governments is still necessary initially to solve the problems posed by market failures. Radner (1980) has adopted a different approach for motivating voluntarily cooperative behavior in Prisoner's Dilemma games. In Radner's analysis players are greedy and self-seeking, but only up to some margin, epsilon. Radner proves that an equilibrium will result in finite, Prisoner's Dilemma games which have the characteristic that if each player adopts a strategy within epsilon in pay-off of being his best strategy, then some maximal fraction of iterations in the repeated Prisoner's Dilemma game will be played cooperatively. This fraction of cooperative plays increases as epsilon increases.3 2 Radner suggests this epsilon-equilibria approach to cooperative behavior can be motivated from either of two perspectives. First, decision-making is costly, and epsilon measures the percent of maximal returns beyond which further searching for an optimal strategy is no longer worthwhile. Alternatively, epsilon may be interpreted as that fraction of maximal returns which each player will willingly forgo to protect cooperative behavior. Neither motivation is persuasive by itself; theories of bounded rationality or charitable behavior are needed. Furthermore, Radner's results establish only an upper bound to the percent of iterations which are played cooperatively; 32 In the repeated Prisoner's Dilemma game considered by Radner the number of total iterations (T) which are played by a "nice", or cooperative strategy is denoted k. If N is the number of players in Radner's game and e is the epsilon margin tolerated by each player, then Radner shows for the most compelling form of player behavior that the upper limit to the fraction of cooperative plays (k/T) is given by
k/T ≤ 1 + [1 - (1/e){(N - 1)/4N}]/T

[Radner (1980, p. 153)]. As e rises, the upper bound to the fraction of cooperative plays increases.
non-cooperative behavior is still likely and may be pervasive.33 The need for government to enforce cooperative behavior thus remains as an institutional response to market failures. Smale (1980) explores yet a third approach to the motivation of voluntary cooperative behavior by self-seeking agents in Prisoner's Dilemma games. The analysis is of an infinitely repeated, two-person Prisoner's Dilemma game. Smale shows that cooperative behavior by both agents will eventually (i.e. asymptotically) emerge as the preferred strategies and that once mutual cooperation is in place the outcome is globally stable. Smale's results are based upon two assumptions. First, players have limited memories and therefore store information about past outcomes of the game through summary statistics which are updated after each play. For example, the average of past pay-offs to each player might be used. Decisions regarding how to play the next iteration are based on the value of these summary statistics. Smale's second assumption is that the agents are "nice" but not foolish or charitable. If my opponent's past pay-offs as measured by the summary statistic are greater than my past pay-offs so measured, then I play non-cooperatively. If the summary measure of my opponent's past pay-offs is less than mine, then I play cooperatively. Finally, if our summary measures are equal, then I again play cooperatively. Smale calls this decision rule a "good" strategy, and he proves that a two-person, infinite Prisoner's Dilemma game will voluntarily converge to the Pareto-optimal cooperative solution if both players adopt the "good" strategy. While of potential interest in other settings (e.g. duopoly market analysis), Smale's result does little to upset the conclusion that market failure problems will require enforced cooperation for their solution. First, the analysis is restricted to two-person games, while most market failures involve at least several players; as the number of players increases, cooperation generally becomes more difficult.34 Second, the infinite repetition structure of the game is not appropriate when
33Using Radner's formula for the estimate of the upper bound of his Prisoner's Dilemma game, we see that for a game with 50 repeated plays with the same 100 players (small town America) and e equal to 5 percent error, the upper bound on the fraction of cooperative plays is 0.996. For a game with five repeated plays with the same 100 or more players and e equal to 2 percent error (big city America with sophisticated mobile residents), the upper bound on the fraction of cooperative plays falls to a bit less than 0.6. With e equal to 1 percent error, the upper bound for the fraction of cooperative plays falls to zero. In a mobile society of sophisticated, self-seeking agents, non-cooperative behavior is the norm, not the exception.
34Smale's analysis may generalize to more than two players, but to date no results have been obtained. The analysis of N-person, finite Prisoner's Dilemma games suggests cooperation becomes more difficult as the number of players increases. See Radner (1980) and more generally the discussion in Olson (1965, ch. 2) and Hardin (1971). For some preliminary work in this direction, see Rubinstein (1979).
players have finite lives.35 In the same spirit, the result insures only asymptotic convergence to cooperative behavior, but finite-lived players are surely interested in the speed of convergence too. Enforced cooperation today at a cost may be preferable to voluntary cooperation at some very distant future date. Finally, the emergence of the cooperative result in Smale's analysis is extremely sensitive to the precise formulation of the "good" strategy. Specifically, suppose that the good strategy is followed except when equal values for the players' summary statistics of past pay-offs arise. Then one player may decide not to cooperate rather than to cooperate. If so, the game converges to the joint non-cooperative outcome rather than the joint cooperative outcome, and the joint non-cooperative outcome is globally stable. Hence, a willingness to play "nicely" is a key assumption behind Smale's result. While the analyses of Axelrod, Radner, and Smale suggest ways through the Prisoner's Dilemma to voluntary cooperation and implicit market contracts, no panacea is at hand. In those market failure "games" in which players are not well known to each other, the number of iterations is few, and the number of players is large, non-cooperation rather than cooperation is likely to be the equilibrium behavior. In such cases only enforced cooperation through an institution such as government can lead us from the non-cooperative outcome.36
35Smale appeals to Aumann (1959, p. 295) for justification of the infinite horizon structure of his game. What Aumann really seemed to have in mind was an argument similar to that which motivated Axelrod's analysis:
In the notion of supergame that will be used in this paper, each superplay consists of an infinite number of plays of the original game G. On the face of it, this would seem to be unrealistic, but actually it is more realistic than the notion in which each superplay consists of a fixed finite (large) number of plays of G... This is unnatural, because in the usual case, the players can always anticipate that there will still take place in the future an indefinite number of plays. This is especially true in political and economic applications, and even holds true for the neighborhood poker "club". Of course when looked at in the large, nobody really expects an infinite number of plays to take place; on the other hand, after each play we do expect that there will be more. [Italics added.] 36 Baumol (1952) emphasized the point in his now classic analysis of market failure and the role of the state: In those cases where the welfare of the various members of the economy is partly dependent on each others' activity, it is possible that persons in pursuit of their own immediate interests will be led to act in a manner contrary to the interests of others... It may be that the disadvantage to each of them of such activity is so great and of such a sort that arrangements can be made on a purely voluntary basis whereby it can be avoided altogether. Where such arrangements cannot be relied on, it becomes advantageous to the members of the economy to have activities restricted by coercive measures... The fundamental importance of this point to the theory of government should be noted, for in the very concept of government there is an element of coercion... (pp. 180-181).
2.4. Summary

While competitive markets are extremely attractive institutions for allocating societal resources among economic agents, they are not by themselves sufficient to insure best allocations against the minimal ethical criterion of Pareto optimality: that all trades be undertaken which can make at least one agent better off without at the same time harming another. At a minimum, government is required to protect and validate the trading process through the maintenance of a system of property rights. Those who embrace a stronger ethical criterion than Pareto optimality may also ask government to redistribute the initial endowments of agents to insure a more equitable allocation of final goods and services. In fact, governments may be asked to do a good deal more than protect property rights and transfer endowments. The motivation for an extended agenda lies in the important fact that competitive markets may not be capable of facilitating all beneficial trades between agents. Markets fail. They fail for the fundamental reason that the institution of market trading cannot enforce cooperative behavior on self-seeking, utility-maximizing agents, and cooperative behavior between agents is often required for beneficial trading. In each instance of market failure - public goods, externalities, increasing returns, asymmetric information, unemployment - agents were asked to reveal information about their benefits or costs from trades with no guarantee that that information would not be used against them. Without that guarantee, information is concealed, trades collapse, and the market institution fails. Only through cooperatively sharing valued information can the preferred, Pareto-optimal allocation be achieved. A careful analysis of agent behavior in market failure problems reveals non-cooperation to be the dominant strategy of each self-seeking, utility-maximizing player. The generic form of market failures is the Prisoner's Dilemma game repeated a finite number of times between many strangers. The equilibrium strategy in this game is to play non-cooperatively. Only by externally enforcing cooperative behavior upon individuals can the non-cooperative, market failure outcome be avoided. Government is one institution which has the potential to enforce the preferred cooperative outcome. Whether it can do so in fact, and at a reasonable cost, are our next questions.

3. The search for efficient government

In every case where such cooperation is indispensable and salutary it is necessary for society; and while the government exacts from the members of society part of their liberty and wealth, the well-being thus obtained for all may make them support without regret the sacrifice imposed on them by the establishment of a government [J.B. Say, as quoted in Baumol (1967a, p. 187)].
Ch. 12: The "New" Political Economy
673
As an allocator of social resources, the task of government is to find and enforce a cooperative, Pareto-improving allocation. The difficulty is that each member of society has an incentive to conceal his true benefits and costs from cooperation in hopes of exploiting those who do reveal their true gains or losses. If we all cheat, the outcome will generally be inefficient. We must search then for a way of running government which minimizes such non-cooperative behavior and which does so with the expenditure of as little of society's scarce resources as possible. The first to contend that government might be organized efficiently was Wicksell (1896). Wicksell's idea was simple: if net benefits are created by a cooperative allocation, then a means of redistributing those gains through taxation could be found which would ensure that all citizens preferred the new allocation to the non-cooperative status quo. The new allocation would be unanimously approved; no re-allocations would be allowed unless all citizens agreed to the change. Wicksell reasoned that unanimous approval would offer sufficient protection to encourage voluntary, cooperative participation on the part of all agents. Wicksell's ideas were developed further by Erik Lindahl and are now called the voluntary exchange theory of government. In Section 3.1 below I review this view of government and, using the work of Olson (1965) and Malinvaud (1971), argue that the voluntary exchange theory is fundamentally flawed. Government must be coercive if it is to guarantee the participation of all citizens. Once we admit to the necessity of coercion we must confront an additional concern: how can we protect the citizenry from oppression, discrimination, or the violation of basic rights? At a minimum we would want a government which is non-dictatorial. Can we design a governmental (coercive) decision-making process which is (1) non-dictatorial, (2) capable of finding and enforcing Pareto-efficient resource allocations, and (3) does so in a way which uses as few resources as possible in reaching its decisions? The answer, sadly, is no. Arrow (1963), Gibbard (1973) and Satterthwaite (1975) have provided us with two related theorems which show that there is no collective choice process which is non-dictatorial, always capable of finding Pareto-optimal allocations, and is decision-making efficient. Section 3.2 presents and discusses these important results. We must conclude that while markets are imperfect, so too are governments. Rather than searching for an ideal institution, we now see we must compare imperfect ones. Decentralized markets perform well against the standards of democratic choice and economy in decision-making, yet often fail against the standard of allocative efficiency. Since we are looking to government to improve upon the overall performance of markets, it is most useful at this stage to search for governments which are allocatively efficient. In light of the Arrow-Gibbard-Satterthwaite theorems, then, we must sacrifice either democracy or decision-making efficiency. In Section 3.3 I review the recent research on those
social choice processes, called planning or demand-revealing processes, which seek efficiency but are dictatorial. In Section 3.4, I review the recent work on those collective choice mechanisms, such as majority rule, which are democratic but often inefficient. Only through the careful delineation of the performance properties of specific governmental structures can we hope to make an informed choice on the important matter of markets vs. governments.
3.1. Collective action as voluntary exchange

The first to consider analytically the notion that collective (i.e. cooperative) activities might be provided voluntarily was Olson (1965) in his book, The Logic of Collective Action. Olson was particularly interested in the relationship between group size and the provision of the collective good. Voluntary groups form because individuals realize a congruence of interests in a particular activity, but individuals are always free to leave the group at any time. How might such groups behave in the provision of collective goods? Olson considered the archetypical case of a pure public good from which no individual can be denied its benefits. He hypothesized that individuals are more likely to act cooperatively when the group of potential beneficiaries is small than when it is large. Figures 3.1(a) and 3.1(b) outline Olson's arguments.37 Figure 3.1(a) examines the reaction of a typical agent to the provision of a non-excludable public service by other members of society. In Figure 3.1(a), X represents the provision by all others, while x is the individual's own contribution. The total level of services provided is therefore X + x. Initially, when X = 0, the individual will select that level of x which maximizes his or her own utility subject to an individual budget constraint. The budget constraint is line AA in Figure 3.1(a); the curved lines are the individual's indifference curves. The utility-maximizing outcome given AA is point 0, and the level x0 is the individual's own provision of the non-excludable public good. If all other agents behave identically, then an aggregate provision of (n - 1)x0 = X0 will be provided by all others in the group. Since exclusion is not possible, our typical individual will be able to consume X0 without charge. Like any free good, X0 shifts the individual's budget constraint outward, in this case to A'A'. With the new budget constraint the individual prefers a new level of the public good equal to x1, but only the amount (x1 - X0) will be provided personally. Note, however, that (x1 - X0) may be less than x0.

37 The discussion which follows is based on Chamberlain's (1974) development of Olson's analysis. The analysis assumes each agent acts as if his behavior will have no effect on the behavior of other agents. Cornes and Sandler (1984) relax this assumption and allow for agents to anticipate the behavior of others in response to their behavior. Our central conclusion - voluntary behavior underprovides public goods - is unaffected by this extension.
[Figure 3.1. Voluntary provision of public goods. Panel (a) plots private goods y against collective goods, showing the budget lines AA and A'A', the income expansion path, and the levels x0, X0, X1 and Xmax. Panel (b) plots individual provision x against provision by others (X), with the ray X = (n - 1)x.]
As group consumption rises, individual contributions may decline.38 When X moves beyond Xmax in Figure 3.1(a), individual contributions fall to zero. What is the equilibrium outcome of this adjustment process? Figure 3.1(b) presents the analysis for n identical individuals, where n is allowed to increase from 1 to infinity.

38 The case illustrated in Figure 3.1(a) is for a "normal" good where an increase in income or resources is allocated to all commodities. If the free level of X means an increase in the private good y as well as x, then the level of original expenditure on x must be reduced so the expenditure on y can increase. Thus for normal goods, as X increases, our contributions fall. This seems the most likely case.
The curve RR represents a typical agent's reaction to varying levels of the aggregate provision (X) by the other n - 1 agents. For the case in Figure 3.1(a), as X increases, x declines. This is true for each agent. In the case of identical agents, an equilibrium also requires that X = (n - 1)x. A line describing this relationship is drawn in Figure 3.1(b) as the ray from the origin with the slope of (n - 1). Equilibrium must be consistent with both individual utility-maximizing behavior, represented by the reaction curve RR, and with aggregate group production, defined by X = (n - 1)x. This is point E, where x* is provided individually and nx* (= (n - 1)x* + x*) is provided by the group. As the number of individuals in the group increases, (n - 1) increases and the equilibrium allocation shows declining individual contributions but increasing group provision. In this case, the increase in n more than compensates for the fall in x*, so nx* rises.
What is interesting from Olson's analysis is that at least some of the public good is provided voluntarily by members of the group, even when agents can "free ride" on their neighbors' provision. What is particularly important for this chapter, however, is the next question: Is the level of the collective good provided by voluntary provision Pareto optimal? The answer is no. For each individual acting voluntarily, x* (< x0) is defined by the tangency of his budget line, whose slope is the marginal cost of x, to his indifference curve, whose slope is his MRS evaluated at nx*: MRS(nx*) = MC. When agents are identical we can multiply both sides of the individual equilibrium condition by n to give: SUM(MRS) = n*MRS(nx*) = n*MC > MC. Olson's voluntary exchange theory of public good provision implies marginal social benefits (the sum of the MRSs) exceed marginal social cost (MC). Too little of the public good is voluntarily provided for any finite n.39

39 Pauly (1970) reached the same conclusion for the more general case in which consumers are not identical. There have been numerous experiments to test individual behavior in voluntary associations and the results generally confirm Olson's predictions; see Marwell and Ames (1981). Palfrey and Rosenthal (1983) provide a review of the evidence and a more general analysis of individual strategic behavior in voluntary associations.
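Olson's underprovision result can be checked numerically. The following is a minimal sketch, not from the chapter itself: it assumes n identical agents with Cobb-Douglas utility U = (1 - alpha)ln(y) + alpha*ln(X + x), a common income I, and a unit marginal cost of the collective good; all parameter values are illustrative.

    # Voluntary provision of a public good by n identical agents: a sketch
    # under assumed Cobb-Douglas preferences (not the chapter's own model).
    # Best response to others' provision X is x = max(0, alpha*I - (1 - alpha)*X).
    alpha, income = 0.4, 10.0

    def nash_equilibrium(n):
        # Symmetric equilibrium of the reaction curves in Figure 3.1(b):
        # x* = alpha*I / (1 + (1 - alpha)*(n - 1)).
        x_star = alpha * income / (1 + (1 - alpha) * (n - 1))
        return x_star, n * x_star           # individual and group provision

    def efficient_total(n):
        # Samuelson condition (sum of MRS = MC = 1) gives alpha*I per person here.
        return n * alpha * income

    for n in (1, 2, 5, 10, 100):
        x_star, group = nash_equilibrium(n)
        print(f"n={n:4d}  x*={x_star:6.3f}  nx*={group:7.3f}  efficient={efficient_total(n):7.1f}")
    # x* falls and nx* rises with n, as in the text, while nx* falls ever
    # further short of the Samuelson-efficient level once n > 1.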
Voluntary confederations of beneficiaries generally will not provide the efficient level of collective activities. What is required is a collective decision enforced across all beneficiaries. Yet, since everyone can benefit from a collective action, can we not design a collective choice outcome which everyone will approve and thus voluntarily support? Wicksell (1896) thought so and proposed governmental decision-making by unanimous consent. Wicksell argued that it ought to be possible to ensure that any Pareto-improving government decision had the unanimous approval of all citizens. The trick is to adjust taxes to benefits from a given public activity: "Provided the expenditure in question holds out any prospect at all of creating utility exceeding costs, it will always be theoretically possible and approximately so in practice, to find a distribution of costs such that
Ch. 12: The "New" PoliticalEconomy
677
all parties regard the expenditure as beneficial and may, therefore, approve it unanimously" [Wicksell (1896, pp. 89-90)]. Unanimity protects each citizen from exploitation and therefore, Wicksell reasoned, should encourage agents to reveal their true costs and benefits of collective activity. Furthermore, once a cooperative allocation is obtained, unanimity should minimize any enforcement costs required to maintain the cooperative outcome. It remained for Erik Lindahl (1919) to make Wicksell's insight precise. The adjustment process proposed by Lindahl was meant to describe how a government might work in practice to set the level of public spending.40 Two parties, A and B, are to determine the level of public expenditure (G) and the division of the required taxes between A and B. The tax share of expenditures paid by A is denoted h, while the remaining share (1 - h) is paid by B. Hoping to ensure that each party to the bargain will cooperate, Lindahl followed Wicksell's lead and required unanimous agreement to any final allocation. Lindahl viewed government as an auction process, not unlike a marketplace. A tax share is proposed to parties A and B who, in turn, respond by quoting their preferred levels of public spending. If the spending levels differ, new tax shares are proposed which are higher for the high demander and lower for the low demander. The new tax shares should increase the preferred public spending of the low demand agent and reduce the preferred spending of the high demand agent. The process continues until the preferred spending levels coincide and unanimity prevails. Johansen (1963) formalized the Lindahl process in a simple general equilibrium model. The Johansen framework, presented in Figure 3.2(a), allows us to see how the Wicksell-Lindahl process was designed to achieve the Pareto-efficient (i.e. cooperative) allocation of public goods. It will also help us to understand just why the process is likely to fail. The curve marked AA is the demand curve of person A and represents A's preferred level of public spending for each tax share, h, A could be asked to pay. More G is demanded by A as h falls. The dashed lines denoted a'a', a''a'' and a'''a''', which peak along AA, are three indifference curves for party A and indicate various combinations of h and G along each dashed line which are equally valued. For low levels of G, an increase in G is highly valued, so A will pay a higher tax share to remain indifferent (the indifference curve rises toward AA). After a point, however, private income is valued more highly than G and person A must be compensated by a lower tax share as G expands (the indifference curve falls away from AA). The peak of each indifference curve is on AA. Since player A is better off by receiving more G and paying a lower share, A wishes to move as far down the AA demand curve as possible.

40 While the Lindahl process is generally described for public goods, in principle it can also be applied to other communal or shared activities, including externalities, lumpy capital investments, or information. See Bergstrom (1971), Weitzman (1978), and Dasgupta, Hammond and Maskin (1980).
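The auctioneer's adjustment is straightforward to simulate. The following is a minimal sketch, not from the chapter: both parties are assumed to report truthfully, and their demand curves come from assumed Cobb-Douglas preferences, so that at tax share t a party with income I demands G = alpha*I/t; the incomes and alpha are illustrative.

    # A sketch of the Wicksell-Lindahl auctioneer (truthful reporting assumed).
    alpha, income_a, income_b = 0.3, 60.0, 40.0

    def demand(income, share):
        # Assumed Cobb-Douglas demand: preferred G at tax share 'share'.
        return alpha * income / share

    # Bisection on A's share h: if B (paying 1 - h) demands more G than A,
    # h is too high and must be lowered, exactly as in the text.
    lo, hi = 1e-9, 1 - 1e-9
    for _ in range(60):
        h = (lo + hi) / 2
        if demand(income_b, 1 - h) > demand(income_a, h):
            hi = h                      # lower h: raises A's demand, lowers B's
        else:
            lo = h
    print(f"Lindahl shares: h = {h:.3f}, 1 - h = {1 - h:.3f}")
    print(f"Agreed spending G_w = {demand(income_a, h):.2f}")
    # Closed form for this example: h* = I_a/(I_a + I_b) = 0.6 and
    # G_w = alpha*(I_a + I_b) = 30.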
[Figure 3.2. The Lindahl process. Panel (a) plots A's share (h) and B's share (1 - h) against public spending, showing the demand curves AA and BB intersecting at the agreed spending level Gw. Panel (b) repeats the construction with the understated demand curves A'A' and B'B' discussed in the text.]
A similar analysis describes player B's demand curve, BB, and corresponding indifference curves: b'b', b''b'' and b'''b'''. Note, however, that B's tax share is 1 - h, so B prefers to move as high up the BB schedule as possible and have A pay a large share (h) of public expenditures. The Wicksell-Lindahl process entails the use of a government auctioneer who announces a tax share h and asks each player to respond with their preferred level of G. If a high h is announced, the level of G demanded by
B exceeds that demanded by A. The auctioneer realizes that h must be lowered to increase A's demand and to lower B's. The auctioneer continues to lower h until A's demand rises and B's falls sufficiently so as to equalize their preferred G's. This occurs at point 1 in Figure 3.2(a), where both demand Gw and pay the respective tax shares hw and (1 - hw). This is Wicksell's point of unanimity, also known as a "Lindahl equilibrium." Note too that the two indifference curves a''a'' and b''b'' are tangent to each other at point 1; thus, there is no realignment of G and h which can make both parties better off. Point 1 is therefore Pareto optimal.41 The Wicksell-Lindahl government auctioneer has therefore succeeded in leading our two non-cooperative agents to a cooperative, Pareto-optimal outcome. The Wicksell-Lindahl process has apparently succeeded in finding and enforcing a Pareto-optimal cooperative allocation at the cost of a neutral government auctioneer cum tax collector. Matters, however, are not so easy. By requiring unanimity, the Wicksell-Lindahl process in effect grants a veto to the low demand agent. This agent can stop the process unless compensated by a lower tax share. It is therefore in the interest of both parties to position themselves so as to appear to be the low demand party. The low demand veto, rather than encouraging the truthful revelation of preferences, becomes a weapon for manipulating the auctioneer.

41 Any move from point 1 will necessarily push one party to a lower indifference curve, and hence such a move cannot be Pareto improving. We can also show that point 1 satisfies Samuelson's efficiency condition for pure public goods, as it should. At point 1 we know party A pays hw of government taxes and receives Gw in services. Further, this combination (hw, Gw) is on A's demand curve AA, and therefore Gw must be A's preferred spending level given a tax share of hw. That is, Gw is a solution to A's allocation problem: max Ua(ya, G) subject to the budget constraint Ia = hwG + ya, where ya are private goods and Ia is pre-tax income. Gw therefore satisfies the requirement that A's marginal rate of substitution of ya for G equals A's "price" of another unit of G, which is hw: (dUa/dG)/(dUa/dya) = hw. Point 1 is also on B's demand curve, and therefore B's marginal rate of substitution of yb for G equals B's "price" of another unit of G, which is (1 - hw): (dUb/dG)/(dUb/dyb) = 1 - hw. Finally, in the Lindahl framework, government spending equals a constant marginal (= average) cost per unit of public output times the level of output: G = (MC)X. Thus, dU/dG = (1/MC)(dU/dX). Substituting this expression into each party's condition gives: {(dUa/dX)/(dUa/dya)}(1/MC) = hw and {(dUb/dX)/(dUb/dyb)}(1/MC) = 1 - hw. Adding the two equalities together and noting that the expression in curly brackets is each agent's marginal rate of substitution of y for X (denoted earlier as MRS) implies: MRSa + MRSb = MC, which is Samuelson's condition for efficiency in the provision of public goods.
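The derivation in the footnote can be verified numerically. A minimal sketch, continuing the same illustrative Cobb-Douglas assumptions used above (the utilities, incomes and alpha are assumptions, with MC normalized to 1):

    # Check that the Lindahl point satisfies MRS_a + MRS_b = MC = 1
    # under assumed utilities U_i = (1 - alpha)*ln(y_i) + alpha*ln(G).
    alpha, income_a, income_b = 0.3, 60.0, 40.0
    h = income_a / (income_a + income_b)     # Lindahl share for A (= 0.6 here)
    G = alpha * (income_a + income_b)        # agreed spending G_w

    def mrs(income, share):
        y = income - share * G               # private income left after taxes
        return (alpha / G) / ((1 - alpha) / y)

    print(mrs(income_a, h), mrs(income_b, 1 - h))    # equal h and 1 - h
    print(mrs(income_a, h) + mrs(income_b, 1 - h))   # sums to MC = 1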
                    Player A
                  N          C
    Player B  N   Point 3    Point 2
              C   Point 4    Point 1

Figure 3.3. Pay-offs in the Wicksell-Lindahl allocation game (N = the non-cooperative lying strategy, C = the cooperative truth-telling strategy).
For example, if A knows B's demand curve and assumes B will tell the truth, then A can improve on his pay-off over that available at point 1 by understating his demand curve to be A'A'; see Figure 3.2(b). The resulting Wicksell-Lindahl equilibrium will then be point 4, where A achieves the pay-off on indifference curve a'''a'''. The false demand curve A'A' is just that curve which will maximize A's well-being given that B tells the truth; a'''a''' is tangent to BB at point 4 and is the best (lowest) indifference curve A can obtain given player B responds according to his true demand curve BB. Since A and B are equal players, however, B can play the same strategy and give an understated demand curve B'B' in hopes of exploiting A's true demand curve, AA. When B lies and A tells the truth, the resulting Wicksell-Lindahl equilibrium will be at point 2, giving B a pay-off on the preferred indifference curve b'''b'''. Yet when both players adopt the lying, non-cooperative strategy, the outcome will be a point such as 3: an outcome inferior to the cooperative outcome at point 1. Malinvaud (1971) first emphasized the inadequacies of the Wicksell-Lindahl process as a mechanism for discovering a Pareto-optimal cooperative outcome. The process suffers from just the same difficulties which plagued the market system as an allocator of public goods. The allocation game in Figure 3.2(b) can be described by the pay-off matrix of outcomes shown in Figure 3.3, where N is the non-cooperative lying strategy and C is the cooperative truth-telling strategy. From the players' indifference curves, we observe the ordering of outcomes for player A as 4 > 1 > 3 > 2 (where ">" indicates preferred) while that for player B is 2 > 1 > 3 > 4. With these pay-offs, both players find non-cooperation is the dominant strategy within the institutional structure of the Wicksell-Lindahl process. The game is a Prisoner's Dilemma game. Rather than ensuring the cooperative allocation, the Wicksell-Lindahl process has simply moved non-cooperative behavior from the marketplace into the house of government.42

42 These orderings of outcomes in Figure 3.2(b) (points 1, 2, 3, 4) are only one of several possible orderings. While point 4 is always best for A and point 2 is always best for B, and while 1 is preferred to 2 for A and 1 is preferred to 4 for B, point 3 may be overall worst (e.g. 4 > 1 > 2 > 3 for A and 2 > 1 > 4 > 3 for B) or 3 may be second best (4 > 3 > 1 > 2 for A and 2 > 3 > 1 > 4 for B). These alternative orderings turn the allocation game into a game of "chicken", which Lipnowski and Maital (1983) analyze and show will not have a stable, Pareto-optimal allocation. Our main result, that the Wicksell-Lindahl process cannot guarantee an efficient outcome, is valid generally.
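The dominance argument can be checked mechanically. A minimal sketch, not from the chapter, encoding only the ordinal pay-offs just described (a rank of 4 is best):

    # Outcomes of the allocation game by strategy pair (A's move, B's move);
    # N = lie about demand, C = tell the truth; points refer to Figure 3.2(b).
    outcome = {("N", "N"): 3, ("N", "C"): 4, ("C", "N"): 2, ("C", "C"): 1}

    # Ordinal pay-offs: A ranks 4 > 1 > 3 > 2, B ranks 2 > 1 > 3 > 4.
    rank_a = {4: 4, 1: 3, 3: 2, 2: 1}
    rank_b = {2: 4, 1: 3, 3: 2, 4: 1}

    def dominant_for_a(s):
        return all(rank_a[outcome[(s, b)]] > rank_a[outcome[(o, b)]]
                   for b in "NC" for o in "NC" if o != s)

    def dominant_for_b(s):
        return all(rank_b[outcome[(a, s)]] > rank_b[outcome[(a, o)]]
                   for a in "NC" for o in "NC" if o != s)

    print(dominant_for_a("N"), dominant_for_b("N"))   # True True: lying dominates
    # Both players therefore lie, yielding point 3, although both rank the
    # truth-telling outcome, point 1, above it: a Prisoner's Dilemma.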
Ch. 12: The "New" Political Economy
681
If government is to find and enforce a Pareto-superior cooperative allocation, we must provide the auctioneer-tax collector with an allocation process, other than the Wicksell-Lindahl scheme, which encourages participants to reveal their true benefits and costs from public activity.43
43 These comments are not meant as a criticism of the notion of the Lindahl equilibrium as a normative concept. A Lindahl equilibrium, if reached, is Pareto efficient, and as each party pays a marginal tax equal to their marginal benefit, the concept may have some appeal as a "fair" equilibrium as well [Varian (1978, p. 200)]. The concept of the Lindahl equilibrium is used extensively in models of public choice; see Roberts (1974). The criticism here is only of the Wicksell-Lindahl process, which generally fails to achieve its stated (and normatively attractive) objective, the Lindahl equilibrium.

3.2. Coercive mechanisms of collective choice: Democracy vs. efficiency

With the inability of the Olson or the Wicksell-Lindahl voluntary exchange mechanisms to guarantee an efficient allocation of social resources, we need to ask whether a coercive mechanism of collective choice might achieve efficiency. A coercive collective choice mechanism is one which requires individual agents to provide information about their preferences and/or their economic endowments and, when asked, to contribute resources to facilitate the collectively decided allocations. We need make no assumption about whether agents will act honestly and truthfully when participating in collective decision-making - indeed, that will turn out to be the central question - but we do require their participation. A coercive collective choice mechanism may be either democratic or dictatorial. A democratic mechanism respects the preferences of each individual, and the expressed preferences of all participants have bearing on the final resource allocation chosen by the mechanism. While democracy does not guarantee each agent will receive his favorite outcome, it does guarantee his preferences will be heard when reaching a collective choice. In contrast, a dictatorial collective choice mechanism is responsive to the preferences of only one individual, the dictator. All other participants can be ignored. As our task is to find a collective choice process which we can compare to a market process as a vehicle for achieving efficient resource allocations, it seems reasonable to ask that that mechanism be democratic rather than dictatorial. Markets clearly respect individual preferences, given initial endowments, and at least at this stage of our inquiry we should ask no less of a coercive collective choice mechanism. Our problem becomes this: Is there a democratic collective choice process capable of allocating societal resources efficiently, even when markets cannot? Arrow (1963) was the first to pose this question precisely, and within his framework the answer is, no, democratic processes are not generally efficient. More recent work by Gibbard (1973) and Satterthwaite (1975) has extended Arrow's analysis to a more general behavioral structure and they reach essentially
the same conclusion. We stand, therefore, in the face of a dilemma. Any collective choice mechanism which we might design must be imperfect: either efficient but dictatorial or democratic but inefficient. And we must choose. We shall examine the performance properties of specific dictatorial and democratic mechanisms in Sections 3.3 and 3.4 below. First it is important to understand just why it is we face this difficult choice. 3.2.1. Arrow's Possibility Theorem The Arrow Possibility Theorem clarifies in a most fundamental way what democracies can, and cannot, do. Arrow proposed five axioms which he argued might be reasonably required of any collective choice process; the theorem proves the inconsistency of the five axioms.4 4 They are: (i) Pareto optimality (P). If everyone prefers some alternative x to another alternative y (denoted xPiy, for all individuals i), the collective choice process should prefer x to y (denoted xPy). If we wish a collective choice process to find a Pareto-optimal allocation of social resources - in particular, to solve the problems posed by market failures - then we must impose the axiom of Pareto optimality. (ii) Non-dictatorship (ND). No individual has full control over the collective choice process such that that individual's preferences over alternatives are always decisive, even when everyone else prefers just the opposite. For example, there is no person i such that when i prefers x to y (denoted xPiy) but everyone else prefers y to x (yPjx, for all j, j i) the collective choice process prefers x to y (denoted xPy). Imposing non-dictatorship seems an important starting point for any analysis of collective choice processes which we hope will respect the preferences of all individuals in society. The last three axioms all have convenient interpretations as means of insuring an efficient decision-making process. They seek to ensure that societal decisions will in fact be made when individuals can provide the required information for collective choice, and that that information will be used as parsimoniously as possible. (iii) Unrestricted domain (U). The collective choice process is capable of reaching collective decisions for all possible combinations of individual prefer44Arrow originally proposed six axioms and proved their inconsistency. The axiom missing from the list here is the axiom of positive association. Positive association requires that if individual preferences over alternatives are changed for some reason so that an alternative x moves up in some individual's ordering and does not fall in others' orderings, then if the collective choice process originally preferred x, then x must remain preferred after the change in preferences. This axiom has been shown to be unnecessary to the proof with a slight strengthening of Arrow's other axioms. See Sen (1970a) and Vickrey (1960). Positive association still has a role to play in our analysis of collective choice, however. See the discussion below of the Gibbard-Satterthwaite theorem on strategy-proof voting processes and of May's theorem for majority rule.
Ch. 12: The "New" Political Economy
683
ences orderings of the alternatives. To exclude certain combinations of orderings from consideration by the collective choice mechanism clearly limits the mechanism's usefulness. Either no decisions can be reached when an excluded combination of orderings appears, or if we ignore those excluded combinations and do make a collective decision, we have denied all individuals with those rankings a chance to participate in society's decisions. (iv) Rationality (R). The collective choice process is required to be rational: first, in the sense that it can completely rank all alternatives presented for consideration by stating x is preferred to y (xPy), or y is preferred to x (yPx), or x and y are equally as desirable so we are indifferent (denoted xly); and, second, that all rankings display the property of transitivity. The requirement that the collective choice process give a complete ranking of all alternatives simply means we want our mechanism to be able to choose between alternatives. Completeness seems innocent enough. Requiring transitivity, however, is more controversial. Arrow required the collective choice mechanism be transitive with respect to both the strict preference relation (called P-transitive)-if x is preferred to y (xPy), and y preferred to z (yPz), then x is preferred to z (xPz) - and with respect to the indifference relation (called I-transitive)- if x is indifferent to y (xly) and y is indifferent to z (yIz), then x is indifferent to z (xlz). As we shall see, the Arrow transitivity requirement is most demanding but it has obvious decision-making advantages. Requiring P- and I-transitivity along with completeness ensures that the collective choice mechanism will give a complete ordering of all alternatives [Sen (1970, ch. 4)]. (This translation of individual preference orderings into a complete social ordering of alternatives Arrow called a "social welfare function.") Knowing alternatives are part of a complete ordering can save on decision-making costs. For example, with a complete social ordering, to rank any three alternatives we need to make only two comparisons. If we discover xPy and yPz or ylz, then we know without additional use of the collective choice mechanism that xPz. There is an additional advantage to requiring P- and I-transitivity. We avoid cycles of indecision often called voting cycles. If xPy and yPz, then if P-transitivity is violated zPx. In this instance, no one of the three alternatives clearly dominates the others. Which alternative is to be preferred is not at all clear. Arrow's transitivity requirement embodied in the axiom of rationality asks us to find a collective choice mechanism immune to the indecision of voting cycles. (v) Independence of Irrelevant Alternatives (I).
Requirement I states that the
selection of either of two alternatives by the social choice process must depend on the individual's orderings over only those two alternatives, and not on individual orderings over other alternatives. The central advantage of this requirement is again efficiency in decision-making: when axiom I applies, the comparison of alternatives x and y does not require us to consider the availability or the characteristics of some third alternative, z. While this axiom is appealing as a
source of economy in collective choice, it has its price. The axiom excludes the use of cardinal orderings in deciding social choices. Cardinal utility indicators permit us to introduce measures of preference intensity as relevant information in deciding collective choices. If one voter prefers x to y, while another prefers y to x, perhaps that voter who favors his first choice more intensely ought to have his way? Such arguments worried Arrow (1963, pp. 9-11), however. The choice of the cardinal utility indicator can be manipulated to affect the collectively chosen outcome. Yet there may be no compelling ethical or economic reason to prefer one cardinal utility indicator over another.45 It seems reasonable, at least at the start, to see if we might find a collective choice process whose decisions are not sensitive to the arbitrary choice of such an indicator. Requiring the axiom of independence of irrelevant alternatives removes this possibility.

Each of the five axioms above seems to be a plausible restriction as we search for a coercive, collective choice mechanism. We want the mechanism to be non-dictatorial (axiom ND), to find and enforce efficient allocations particularly when markets cannot (axiom P), to work for all possible combinations of individual preferences (axiom U), and to rank alternatives and to do so with an economy of process (axioms R and I). Does such a collective choice mechanism capable of ranking all alternatives exist? The answer is no. In his famous Possibility Theorem, Arrow showed that there is no collective choice process which can satisfy the five axioms P, ND, U, R, and I. The proof of Arrow's theorem has been examined, dissected, and refined since its original presentation in 1951, and one particularly straightforward presentation is given in Figure 3.4.46 The strategy of the proof is simple. In the first stage, a set of individuals, denoted as D, is defined as a "decisive set" over two alternatives x and y under a particular collective choice process when all members of D express a preference for x over y, all individuals not in D express the opposite preference, and the social choice process prefers x over y. It is then shown [steps (3)-(7) in Figure 3.4] that this group which is decisive for x vs. y is also decisive over any other pair of alternatives. In the second stage of the proof, the original decisive group is split into two subsets, A and B, and it is shown that one of these two groups is a decisive set [steps (9)-(14)]. The process of splitting the decisive group continues until we reach a decisive set of one individual whose preferences control the collective choice process [step (15)]. That person is a

45 Suppose person 1 prefers x to y and assigns a cardinal utility value of 10 to x and a value of 5 to y. Person 2 prefers y to x and assigns a value of 6 to y and a value of 2 to x. Person 1 values x "5 utils" more than y while person 2 values y "4 utils" more than x. Do we therefore conclude x is collectively preferred to y? What if person 2 adopted a different cardinal indicator and assigned a value of 12 to y and a value of 4 to x? Then person 2 prefers y by 8 "utils" over x, and should we then conclude y is collectively preferred to x? Who is to decide which cardinal indicator of utility is "correct", and must all individuals have the same cardinal indicator? Violating axiom I means we must answer these difficult questions.

46 The proof here follows the outline in Vickrey (1960). See Sen (1970a) or Fishburn (1973) for a review of the literature on the proof of the Arrow theorem.
Ch. 12: The "New" Political Economy
685
Theorem. The five axioms - the Pareto principle (P), non-dictatorship (ND), unrestricted domain (U), rationality (R), and independence of irrelevant alternatives (I) - cannot be simultaneously satisfied by a collective choice process.

(1) Definition: A set of individuals, denoted D, is called decisive over alternatives x and y if the collective choice process yields a collective preference for x over y whenever all members of D prefer x to y and everyone else prefers y to x.
(2) Assume a set of individuals D is decisive for x over y.
(3) Assume all members of D prefer xPDyPDz and all others not in D (denoted N) prefer yPNzPNx (use of axiom U).
(4) The collective choice process yields xPy (definition of decisive).
(5) The collective choice process yields yPz (use of axiom P).
(6) The collective choice process therefore yields xPz (use of axiom R).
(7) Only members of D prefer x to z. Furthermore, society's preference of x over z must hold no matter the ranking of y (use of axiom I). Therefore the set of individuals D is decisive for x over z (definition of decisive). By repeating (2)-(6), D can be shown to be decisive over all pairs of alternatives.
(8) D must contain two or more individuals (use of axiom ND).
(9) Divide D into two non-empty sets, A and B, and assume for all members of A, xPAyPAz, and for all members of B, zPBxPBy. The set of citizens not in D, set N, continues to prefer yPNzPNx (use of axiom U).
(10) There are three possible outcomes for society's choice between z and y: either zPy, or yPz, or z and y are indifferent (denoted zIy) (use of axiom R).
(11) If zPy, then as members of set N and of set A prefer y to z and only members of set B prefer z to y, set B must be decisive (definition of decisive).
(12) If yPz or yIz, then as xPy [step (4) above] the social choice process gives xPz by transitivity (use of axiom R).
(13) However, only members of set A prefer x to z. Thus set A must be a decisive set (definition of decisive).
(14) The original decisive set D can therefore be reduced to a smaller decisive set, either set B [step (11)] or set A [step (13)].
(15) Repeat steps (9)-(14) for the new decisive set. Repeat the argument again until the decisive set consists of only one member and cannot be subdivided. A decisive set of one member is a dictatorship. Thus, axiom ND is violated.

Figure 3.4. The Arrow Possibility Theorem.
dictator. The consequence is that the four Arrow axioms - P, U, R, and I - are inconsistent with the fifth, ND.

3.2.2. Can we escape from Arrow's Theorem?

The Arrow Possibility Theorem is true; there is no escaping that fact. The five axioms above cannot all be satisfied simultaneously by a collective choice process. There is only one way out, and that is to discard one or more of Arrow's original requirements. Perhaps one of the axioms asks too much. On closer examination we might wish to relax one or more of the requirements. Perhaps then we can prove there does exist a collective choice process which satisfies this new, less demanding, set of axioms. This is the strategy which has occupied social choice theorists for the past twenty years. The results are generally discouraging.47

47 For a more detailed review, see Sen (1983).
(i) Relaxing Pareto optimality. This axiom hardly seems like one we would want to sacrifice, but for completeness we should explore the implications if we do relax it. Wilson (1972) reconsidered Arrow's theorem without enforcing the Pareto principle and showed that either the resulting collective choice mechanism is always indifferent between alternatives regardless of individual rankings or the mechanism produces a "negative" dictator, someone whose preferences over alternatives are reversed to form the social ordering. Neither alternative is very appealing. Since the Pareto principle is such a crucial component of our comparison of markets and governments, and since virtually nothing is gained by dropping it, it seems wise to look elsewhere for a way around Arrow's challenge.

(ii) Relaxing non-dictatorship. The theorem itself shows that if we will tolerate a dictator we can satisfy the remaining four requirements. There are problems for the dictator, however. As Gibbard and Satterthwaite have shown, there is nothing in the remaining four requirements stated above - P, U, R, and I - that guarantees the other members of society will truthfully reveal their preferences to the dictator. An "optimal" allocation based on false preferences is a hollow victory. Yet obtaining true information cooperatively is surely a central problem for most dictatorships, even benevolent ones, seeking to maximize citizen welfare. How a dictator might overcome the problem of preference revelation is examined in detail in Section 3.3 below. For now we retain the axiom of non-dictatorship.

(iii) Relaxing unrestricted domain. The axiom of unrestricted domain requires that all possible combinations of preference orderings be allowed for consideration by the collective choice process. To do otherwise is to deny those individuals who happen to have excluded preferences a say in the collective decision. Either we ignore them, which is discriminatory, or we say that no decision can be made when those combinations of orderings are expressed. Neither discrimination nor indecision is attractive. There is one happy circumstance in which all our problems disappear, however. This is a case which we will consider in detail in Section 3.4 below and which involves the use of pair-wise voting by majority rule (satisfying ND and I) to search for efficient budgetary allocations (satisfying P) by comparison of alternatives which yields orderings which are both complete and transitive (satisfying R). Only axiom U is violated, through the imposition of a preference restriction on voters. Acceptable orderings are called "value restricted" and satisfy an axiom due to Sen (1966) in which voter preferences over every combination of three alternatives have the property that all agree some alternative is never worst, or some alternative is never best, or some alternative is never medium. Sen's condition of value-restricted preferences sets a minimum degree of voter compatibility to ensure the majority rule process is transitive. A prominent example of preferences which satisfy Sen's condition are preferences which, when considering alternatives that can be ordered along one dimension (e.g. the size of the budget), yield orderings which are "single-peaked". (See Figure 3.9 below.) We shall consider this case in detail in Section 3.4.
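The role of the domain restriction is easy to see in a small computation. The following is a minimal sketch, not from the chapter: pairwise majority voting is applied first to the classic unrestricted profile, which produces a voting cycle, and then to a single-peaked profile over an ordered budget dimension, where the majority relation is transitive; both profiles are illustrative.

    from itertools import permutations

    def majority(profile, x, y):
        # True if a strict majority of voters ranks x above y.
        return sum(r.index(x) < r.index(y) for r in profile) > len(profile) / 2

    def report(profile, alts):
        for x, y in permutations(alts, 2):
            if majority(profile, x, y):
                print(f"  {x} beats {y}")

    # Unrestricted domain: the classic cyclical profile (violates Sen's condition).
    cycle = [("x", "y", "z"), ("y", "z", "x"), ("z", "x", "y")]
    print("Unrestricted profile:")
    report(cycle, "xyz")                  # x beats y, y beats z, z beats x

    # Single-peaked preferences over budget sizes 1 < 2 < 3: each voter ranks
    # the alternatives by distance from an ideal point (peaks at 1, 2, 3).
    peaked = [tuple(sorted((1, 2, 3), key=lambda b: abs(b - peak)))
              for peak in (1, 2, 3)]
    print("Single-peaked profile:")
    report(peaked, (1, 2, 3))             # the median peak, 2, beats 1 and 3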
Ch. 12: The "New" Political Economy
687
When voting, say, on public spending for a single public good, a majority rule process has the ability to find Pareto-optimal allocations. As we shall see in Section 3.4, however, this result is of only limited usefulness. The plausible restrictions on preferences which gave unique, majority rule outcomes in one dimension are no longer sufficient for unique majority outcomes when policy assumes two or more dimensions - for example, taxes and spending. What is sufficient when policy alternatives have more than one dimension are limitations on preferences which are so restrictive as to make their realization extremely unlikely. (See Section 3.4 below.) Again what originally seemed a promising escape route has proven to be a dead end. If there is to be a generally satisfying way out of Arrow's paradox, it will have to be in relaxing one of the last two axioms.

(iv) Relaxing rationality. Perhaps the least defensible of Arrow's requirements is transitivity, the cornerstone of the rationality axiom. Arrow had sought a collective choice mechanism which would provide a complete ordering of all social alternatives. A complete ordering requires completeness and both P-transitivity (xPy & yPz imply xPz) and I-transitivity (xIy & yIz imply xIz).
While knowledge of a complete ordering of alternatives has its advantages for collective decision-making, we do not often need all the information such orderings provide. In most instances, we seek only a best alternative from a set of options, not a complete ranking of all the options. (Who cares what policy was tenth best?) It is important to ask what might happen to Arrow's Possibility Theorem if we no longer require our collective choice mechanism to provide a complete ordering of all alternatives but rather only require that it find a "best" alternative. Relaxing the Arrow requirement that the collective choice mechanism be both P- and I-transitive drops the need for the mechanism to give a complete ordering, but retains the possibility that the mechanism can find a best alternative. Such mechanisms are often called "social decision functions" or "social choice functions" (as distinct from Arrow's "social welfare" or "social ordering" functions). What happens, then, if we drop I-transitivity and require only P-transitivity? Plott (1976) has shown that one of Arrow's original arguments for requiring both P- and I-transitivity in fact needs only P-transitivity. Arrow wanted the collective choice mechanism to produce preferred outcomes which were independent of the "path" or agenda of alternatives used to decide the outcomes. Such path independence can be obtained when P- and I-transitivity hold, but only P-transitivity is really necessary. What happens if we permit intransitive indifference and require only strict preferences to be transitive? Gibbard (1969) provided the answer: we will have collective decisions by oligarchies. Gibbard proved that every collective choice process satisfying P-transitivity and the other Arrow axioms of P, U, and I will generate a select group of individuals with the following powers. Any time members within the group agree over a pair of alternatives (say, xPiy for all i in the group), then the society chooses the group's
preferred alternative (xPy) no matter what others prefer. Furthermore, when the members of the group disagree, each member still has the power to veto any alternative they dislike (if xPiy for one i in the group, then yPx cannot occur). No collective decision will be made in such instances. We can conclude that relaxing Arrow's original rationality axiom by asking for just P-transitivity, or equivalently path independence, buys us very little. We go from a dictatorship by one to either a dictatorship by committee or, if the committee is large and cannot agree, to instances where no collective choice is made. Blair and Pollak (1981, 1983) push the relaxation of rationality to its limit. They ask the collective choice process to satisfy what might be viewed as a minimal rationality restriction: acyclicity. Acyclicity is weaker still than P-transitivity; their hope is to escape rule by oligarchies. Acyclicity requires only that the social choice process produce a clear winner which is preferred to all others in pairwise comparisons. Whether the losing alternatives satisfy P- or I-transitivity is not crucial. We are looking only for a winning alternative. The results are discouraging. The Blair-Pollak theorem shows that collective choice rules which satisfy the usual Arrow axioms P, U, and I plus the minimal rationality requirement of acyclicity will always produce an individual who has veto power (i.e. can ensure no choice is made except his choice) over a possibly sizable fraction of all pairwise comparisons. When the number of alternatives (a) is at least four and the number of alternatives exceeds the number of voters (a > n), then the fraction of all pairwise comparisons which can be vetoed by at least one individual is (a - n + 1)/a. As a increases, this fraction approaches 1. The force of individual veto power is either to give to one individual unilateral power to block an alternative from adoption or, when two individuals who disagree have veto power over the same pair of alternatives, to force social indecision in the ranking of those alternatives. Unfortunately, the Blair-Pollak theorem is likely to apply to most budget allocation problems when we ask the collective choice rule to set the level and the distribution of many public goods and private income (via taxation). In the very simple case of three voters and three distinct bundles of a public service, there are six different distributions of services, each corresponding to a social alternative.48 With a = 6 and n = 3, one individual voter can veto at least two-thirds (= (6 - 3 + 1)/6) of all pairwise comparisons. Expanding the list of alternatives to include several public goods and private income only increases the reach of individual veto power. Of course, we could limit the number of alternatives to be less than the number of voters - Ferejohn and Grether (1974) have shown acyclic collective choice rules can avoid the veto result in this case - but then who is to decide which alternatives are to fall within the restricted set?

48 Suppose the three levels of the public good are low (L), middle (M) and high (H), and the three amounts can be allocated across the three voters. The six alternatives, ordered by person 1's allocation, then 2's allocation, and then 3's allocation, are: LMH, LHM, HML, HLM, MLH, and MHL.
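A quick computation shows how fast the Blair-Pollak bound grows; this is only a sketch of the arithmetic in the text:

    # Fraction of pairwise comparisons vetoable by at least one individual
    # when a > n and a >= 4 (Blair-Pollak): (a - n + 1)/a.
    n = 3
    for a in (6, 10, 100, 1000):
        print(f"a = {a:5d}: vetoable fraction = {(a - n + 1) / a:.3f}")
    # a = 6 reproduces the two-thirds figure in the text; the bound tends to 1.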
Borda points for alternatives:

                            w    x    y    z
    Person 1: wPxPyPz       4    3    2    1
    Person 2: yPzPwPx       2    1    4    3
    Person 3: xPyPzPw       1    4    3    2
    Total points            7    8    9    6

    Borda count winner: option y

Figure 3.5. Borda count voting.
Such agenda-setting power carries a veto or dictatorial implication of its own. We must conclude that the strategy of relaxing rationality, even down to the very weak condition of acyclicity, does not provide a very satisfying escape from the dilemma posed by Arrow's theorem.

(v) Relaxing independence of irrelevant alternatives. Relaxing axiom I therefore seems to be our last chance for finding a compelling way out of Arrow's paradox. Indeed, when the theorem was first published, axiom I received the most criticism. The central complaint was that the axiom ruled out the introduction of potentially useful information on individual preference intensities. In fact, many commonly used voting schemes, such as point-voting in which outcomes are chosen by the sum of scores from a ranking of alternatives, will violate axiom I.49 If we can drop axiom I, the whole problem, as Samuelson (1967, p. 41) says, "vanishes into thin air". The remaining four axioms - P, U, ND, and R - can be shown to be consistent. A typical collective choice mechanism used to illustrate this point is the so-called Borda count, named after its eighteenth-century inventor Jean-Charles de Borda. In the case of a alternatives and n voters, each voter assigns a points to their first-place alternative, a - 1 to their second-place alternative, down to 1 point for their last-place alternative. The Borda score for each alternative is the sum of points given to it by the n voters. The alternative with the highest score is the Borda winner. Figure 3.5 illustrates the use of the Borda count for 3 voters and 4 alternatives; alternative y is preferred. The Borda count is non-dictatorial; for example, person 2, who prefers alternative y, is not decisive in all pairwise comparisons. The Borda count also satisfies the axiom of unrestricted domain, works for all preference orderings, and satisfies Arrow's rationality (i.e. is complete and is P- and I-transitive). Finally, the Borda count can find Pareto-efficient allocations; a switch from a selected alternative (such as y) lowers the welfare of at least one person (such as person 2). The procedure does not satisfy axiom I, however. Axiom I requires that the ranking of any two

49 For a very interesting application of voting methods which violate axiom I, see the study by Dyer and Miles (1976) of the decision to determine the trajectory of the Mariner space probes.
alternatives by the collective choice mechanism be independent of the ranking of any third alternative. In Figure 3.5, option y is preferred to option x. If all individuals retain their original preferences between x and y, but person 1 changes his ordering from w > x > y > z to x > w > z > y (1 still prefers x to y), then under the Borda count alternative x will receive 9 points, y will receive 8 points, z will receive 7, and w will receive 6. Now the collective choice mechanism prefers x to y, only because the ranking of some alternative options (w and z) has been changed. Clearly, the rankings with the Borda procedure are not independent of irrelevant alternatives. Furthermore, the Borda count is also sensitive to which options are actually included in the decision set. If we simply remove the worst option (z) in Figure 3.5 from consideration, then suddenly all remaining options (w, x, y) are viewed as equally valued by the Borda procedure; all alternatives receive a score of 6 (where 3 points are awarded to the first choice, 2 points to the second choice, and 1 point to the last choice). How individuals report their rankings for all considered options, and which options are actually considered, can have a significant effect on the final outcome when axiom I is violated. Procedures which violate axiom I must gather preference information on all alternatives to rank any two, and, even more importantly, they are open to manipulation by the choice of the agenda and the stated preference orderings of the voters. Borda himself appreciated these problems and is said to have commented that "my scheme is intended only for honest men".50 Dropping axiom I, therefore, does not appear to be a satisfactory way out of the dilemma of Arrow's theorem. Again we face a hard choice between decision-making efficiency and democracy. Processes which violate axiom I are likely to be expensive to run, as all preference information must be considered. But matters are really worse than that. As we relaxed axiom I, a new problem emerged, one which significantly complicates our search for an attractive collective choice mechanism. It is the problem of preference manipulation or strategic voting. The central issue is well illustrated by our Borda count example. When all voters voted truthfully, the outcome (in Figure 3.5) was option y. Person 1, however, ranks y third and would prefer either w or x to y. Suppose voter 1 misrepresents his preferences and reports not his true preferences (w > x > y > z) but his "strategic" preferences (x > w > z > y). Using a Borda count, the chosen outcome will become option x (9 points) while option y (8 points) falls to second place. Voter 1 can strategically manipulate his stated preferences to achieve a preferred allocation. Similar strategies exist for the other voters as well. Suddenly, our search for an efficient and democratic voting process has been complicated by a new order of difficulty - what do we do about voters who lie?

50 Quoted in Black (1958, p. 182).
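The Figure 3.5 example and person 1's manipulation of it can be reproduced directly. A minimal sketch, not from the chapter:

    # Borda count for the Figure 3.5 example, truthful and manipulated.
    def borda(profile):
        a = len(profile[0])
        scores = {x: 0 for x in profile[0]}
        for ranking in profile:
            for place, x in enumerate(ranking):
                scores[x] += a - place     # a points for 1st place, ..., 1 for last
        return scores

    truthful = [("w", "x", "y", "z"),      # person 1
                ("y", "z", "w", "x"),      # person 2
                ("x", "y", "z", "w")]      # person 3
    print(borda(truthful))                 # w: 7, x: 8, y: 9, z: 6 -- y wins

    # Person 1 misreports x > w > z > y, still ranking x above y:
    manipulated = [("x", "w", "z", "y")] + truthful[1:]
    print(borda(manipulated))              # x: 9, w: 6, z: 7, y: 8 -- now x wins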
Ch. 12: The "New" Political Economy
691
Arrow's answer was to assume voters did not lie [Arrow (1963, p. 7)], and almost all of the subsequent discussion of the Arrow theorem followed Arrow's lead. The manipulation of voter preferences was simply not an issue in the early evaluation and analysis of Arrow's theorem. It remained for Farquharson (1969) to bring strategic voting to our attention. Gibbard (1973) and Satterthwaite (1975) proved how all-pervasive the problem of preference manipulation really is in voting processes. The Gibbard-Satterthwaite theorem proved that every voting scheme violates at least one of three conditions: (1) the scheme is non-dictatorial; (2) truth (non-manipulation) is a dominant (always best) strategy for voters; or (3) the scheme considers three or more alternative policies. Simply put, democratic processes considering three or more options are always open to manipulation. There are collective choice processes which are immune to the strategic manipulation of preferences (called "strategy-proof" processes), but such processes must satisfy yet another axiom of decision-making - the axiom of positive association.51 The axiom of positive association (denoted PA) says that if a collective choice process initially prefers an alternative x, and if some citizens then change their preference orderings so that x moves up relative to other alternatives, then x must remain the chosen alternative. When PA is joined with the Arrow axioms of rationality (R) and independence of irrelevant alternatives (I), then the collective choice process will be immune to manipulation by voters.52 Combining the results of the Arrow theorem with those of the Gibbard-Satterthwaite theorem yields the following, rather discouraging, conclusion: there is no collective choice process which is democratic (axiom ND), decision-making efficient (axioms R, I, U), allocatively efficient (axiom P), and immune to strategic manipulation (axioms PA, R, I). The following axioms of collective choice - ND, R, I, U, P and PA - cannot be satisfied simultaneously.

3.2.3. Conclusion
We have now exhausted all avenues of escape from the grip of Arrow's Possibility Theorem. Dropping any one of Arrow's requirements allows us to protect the others, but only at a price. We must sacrifice either efficient decision-making (including the truthful revelation of preferences), or allocative efficiency, or democracy. We stand, therefore, at a crossroads. The force of Arrow's theorem

51 It is interesting to note that Arrow originally included the axiom of positive association within his original set of axioms for describing an attractive social welfare function. Since he and subsequent analysts also assumed truthful voting, however, positive association was superfluous to establishing a conflict among the remaining axioms (U, I, P, R, and ND). See Sen (1970a, theorem 3.1).

52 See Blin and Satterthwaite (1978, theorem 2). Blin and Satterthwaite show the tight logical connection of the Arrow theorem and the Gibbard-Satterthwaite theorem, deriving one from the other.
and the Gibbard-Satterthwaite theorem has been to help clarify what lies down each path we might choose. Figure 3.6 is the roadmap. If we choose the democratic pathways which protect axiom ND (routes 1 and 2), we must realize that our collective choice mechanism will either be allocatively inefficient if we give up axiom P (route 1) or indecisive, informationally expensive, and manipulable if we give up axioms PA, R, U, or I (route 2). If we choose the third path of allocative and decision-making efficiency (retain P, PA, U, I, and R), then our collective choice mechanism is necessarily dictatorial. Sadly, there is no collective choice process which is democratic, decision-making efficient, and always capable of finding Pareto-optimal allocations. As markets are imperfect, so too are governments. Which path shall we follow? Our central concern here is to compare markets to governments as mechanisms for resource allocation. Since markets have failed us on the dimension of allocative efficiency, it seems most fruitful to explore those avenues where government can potentially succeed against at least this criterion. If governments can allocate economic resources efficiently when markets cannot, then perhaps the price we pay in lost democracy (route 3) or in decision-making inefficiency (route 2) is low enough to justify a coercive collective choice process for that allocation. The theorems of Arrow, Gibbard, and Satterthwaite tell us that down either routes 2 or 3 there lies one collective choice process which satisfies the required, "signpost" axioms. Our next task is to find such processes and to examine their relative performance in finding allocatively efficient outcomes. We begin the search by first looking for dictatorial processes which satisfy allocative and decision-making efficiency. We then turn to an examination of inefficient, but democratic, decision-making processes.

3.3. Dictatorial processes: The dictator as a benevolent planner
A variety of dictatorial regimes are possible, but only one - the benevolent regime in which the dictator seeks to maximize some non-zero index of citizen well-being - seems worth discussing in detail. Repressive regimes may be allocatively efficient in the sense that the repressed cannot be made better off without making the dictator worse off, and they may be decision-making efficient since the dictator need know only his own preferences to make collective choices (thus escaping Arrow's dilemma), but such societies are likely to sacrifice so much in other basic human values as to be unacceptable (see the discussion in Section 5 below). A benevolent dictator, on the other hand, respects citizen preferences and makes a serious effort to allocate societal resources so as to maximize some weighted index of those preferences. The Lange-Lerner socialist planner is such a benevolent dictator; see, for example, the survey in Heal (1973). Unfortunately, the same fundamental problem which plagues markets or democratic decision-
[Figure 3.6 is a roadmap with three routes leading down from the Arrow Possibility Theorem and the Gibbard-Satterthwaite Theorem. Route (1) satisfies ND, U, I, R, and PA but loses P: allocative inefficiency. Route (2) satisfies ND and P but loses U, I, R, or PA: decision-making inefficiency. Route (3) satisfies P, PA, U, I, and R but loses ND: dictatorial choice. Key: P = Pareto Optimality; ND = Non-dictatorship; U = Unrestricted Domain; R = Rationality (completeness, P- and I-transitivity); I = Independence of Irrelevant Alternatives; PA = Positive Association.]

Figure 3.6. The performance of alternative collective choice processes.
making processes - the incentive to conceal and deceive - plagues the benevolent planner as well. The dictator's mantle does not carry with it the power to read the minds of men. If the efficient level of a public good is to be provided, the planner must know the preferences (the MRSs) of the individual consumers of the good. If efficient taxes are to be used to pay for public services or to redistribute income, then the planner must know individual resource endowments and how individuals will behave in the face of taxation. What one gains in the use of a dictatorial planning process of collective choice is a well-defined social welfare function - that is, a clearly articulated set of objectives. What we do not have, yet, is the information needed to put those objectives into action. The task here is to see if we can design a planning mechanism for collecting the crucial information on tastes, endowments, or behavior.

3.3.1. The MDP process

Malinvaud (1971) and Drèze and de la Vallée Poussin (1971) were the first to suggest an allocation process which had attractive incentive properties for the participants. Called the MDP mechanism, it is a planning process run by the benevolent dictator (central government) to determine the allocatively efficient level of a public good. The incentives of the citizens in such allocation problems are to act as free riders and to understate their true willingness-to-pay for the public activity. The MDP process encourages the truthful revelation of preferences by ensuring that everyone who reveals their true willingness-to-pay cannot be hurt by such cooperative behavior. The worst that can happen is that they remain at their initial, pre-public good level of well-being. If everyone does cooperate, the MDP process converges to a Pareto-efficient allocation.

How does the MDP process work? In the simple two-person version public goods are provided at a constant marginal (equal to average) cost which can be normalized to $1 per unit of the public activity (X). Players are asked their willingness-to-pay for an increase in X, which equals their marginal rate of substitution of income for X, denoted π, evaluated at the current level of public activity (which may be zero). In the simple case of two players, A and B, they each respond with answers, π_a and π_b, which may or may not be their true marginal rates of substitution (MRS). The dictator-planner increases the provision of X if stated total marginal benefits, π_a + π_b, exceed marginal cost, $1. The increase in X, ΔX, is given by the spending rule:

ΔX = δ(π_a + π_b − 1),

where δ is a positive constant. To pay for the increase in X, players A and B must contribute taxes, which they do according to the tax rules:

Δy_a = (1/2)(π_b − π_a − 1)ΔX and Δy_b = (1/2)(π_a − π_b − 1)ΔX,

where Δy_a and Δy_b are the taxes paid (Δy_a < 0, Δy_b < 0). When services are reduced because π_a + π_b < 1, then subsidies are paid to each party (Δy_a > 0, Δy_b > 0) from the saved expenditure due to reduced public output (ΔX < 0). It should also be noted that these tax rules raise just enough money to pay for the increased public output, or distribute back to the voters no more than the resources released when public output is reduced. The MDP process is therefore economically feasible and always has a balanced public budget.53

We must ask two questions about the MDP process. First, if agents reveal their true marginal rates of substitution will the process find a Pareto-efficient allocation? Second, what are the incentives to tell the truth? The answer to the first is easy. Since the process is always economically feasible at each step - neither wasting nor borrowing resources - and since it produces the public activity (X) until δ(π_a + π_b − 1) = ΔX = 0, we know that in equilibrium π_a + π_b = 1, which is the requirement for allocative efficiency (ΣMRS = MC) for public goods when all agents tell the truth.54

But will agents reveal their true marginal benefits at each step? First, under the MDP process, agents are never made worse off by revealing their true marginal rates of substitution (π = MRS). Suppose party A tells the truth (π_a = MRS_a), while party B understates his or her true benefits (π_b < MRS_b) at each step of the MDP process. Then if ΔX > 0, π_a + π_b − 1 > 0, or π_b − π_a − 1 > −2π_a, or (1/2)(π_b − π_a − 1)ΔX > (−π_a)ΔX. The expression (1/2)(π_b − π_a − 1)ΔX is nothing more than the actual taxes paid (Δy_a < 0) by person A. The expression (−π_a)(ΔX) is person A's true willingness to pay for the public activity (−π_aΔX < 0). Thus, when the public good is increased, person A pays less in taxes than his or her maximum willingness to pay; person A is better off. What if ΔX < 0, perhaps because player B lied and stated a very low π_b? Then π_a + π_b − 1 < 0, or by the same reasoning as before (noting ΔX < 0), (1/2)(π_b − π_a − 1)ΔX > (−π_a)ΔX. Now (1/2)(π_b − π_a − 1)ΔX is person A's compensation because of the cutback in X (Δy_a > 0), while (−π_a)ΔX is the true amount A would demand as compensation (−MRS_aΔX) for the cutback. In this case, A is paid more than the amount required to keep him indifferent; again player A is better off. Finally, when π_a + π_b = 1, ΔX = 0, and both persons A and B remain at the status quo.

We conclude that with the MDP process, telling the truth will never hurt a player and it may make him better off. But is telling the truth always the best strategy to play in the MDP process? Sadly, no; there are advantages to lying. Consider the pay-off to player B in the above example where player A tells the truth and ΔX > 0. In this case, player B's

53 This is easily shown. Just add Δy_a + Δy_b and note that the total taxes raised just equal expenditures ($1 · ΔX) when ΔX > 0, while the total subsidy paid just equals released public resources ($1 · ΔX) when ΔX < 0.

54 Whether the process converges to an equilibrium is also an issue. Drèze and de la Vallée Poussin (1971, theorem 1) show that when both parties tell the truth convergence is assured.
taxes (= Δy_b = (1/2)(π_a − π_b − 1)ΔX) are reduced as π_b is made smaller; indeed, B's taxes can become a subsidy if π_a is large enough and π_b is small enough (Δy_b = (1/2)(π_a − π_b − 1)ΔX > 0 if π_a > 1 and π_b ≈ 0). Why not lie all the time? Because lying can make you worse off too. Cheating is a two-edged sword in the MDP process. If π_b is too low and π_a + π_b < 1 so that ΔX < 0, we can show, following the argument in the paragraph above, that (1/2)(π_a − π_b − 1)ΔX > −π_bΔX. The left-hand side of this expression is actual compensation paid to B (Δy_b > 0). In contrast to the case where B tells the truth (so π_b = MRS_b), the right-hand side (−π_bΔX) is not B's true required compensation; in fact, it may be much less than B's true required compensation (−π_bΔX < −MRS_bΔX, when π_b < MRS_b). The MDP process only promises that actual compensation will exceed stated compensation, not true compensation. If π_b is too low, it is possible for player B's actual compensation to fall short of true compensation when ΔX < 0; player B can be made worse off. We conclude, then, that the MDP process protects you from actually being harmed if you tell the truth, but does not guarantee that cooperation is always the best, or dominant, strategy. If you lie you can often do better than if you tell the truth, but at the risk (when ΔX < 0) of being made worse off relative to the status quo. In the MDP process, cooperative behavior is a minimax strategy; telling the truth minimizes your maximum loss. An extremely risk-averse agent will play cooperatively to avoid harm, but less risk-averse players may wish to take a chance on the cheating strategy. Thus, telling the truth is not a dominant strategy for citizens. There is no guarantee, therefore, that we will find the Pareto-optimal, cooperative solution we seek with the MDP process.55
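The mechanics are simple enough to simulate. A minimal sketch of the two-agent MDP adjustment under truthful reporting follows; the linear marginal-benefit schedules, step size, and starting point are illustrative assumptions. It confirms the two properties derived above: the process stops where π_a + π_b = 1, and the cumulative taxes exactly cover the cost of the public good.

# A minimal sketch of the two-agent MDP process with truthful reporting.
# The benefit parameters (b_i, s_i) and the step size delta are assumptions.

def mrs(b, s, X):
    """True marginal rate of substitution (marginal willingness-to-pay) at X."""
    return max(b - s * X, 0.0)

def run_mdp(b_a, s_a, b_b, s_b, delta=0.05, tol=1e-8, max_iter=200000):
    X, y_a, y_b = 0.0, 0.0, 0.0        # public good level, cumulative transfers
    for _ in range(max_iter):
        pi_a, pi_b = mrs(b_a, s_a, X), mrs(b_b, s_b, X)
        dX = delta * (pi_a + pi_b - 1.0)         # spending rule
        if abs(dX) < tol:
            break
        y_a += 0.5 * (pi_b - pi_a - 1.0) * dX    # tax rule for A
        y_b += 0.5 * (pi_a - pi_b - 1.0) * dX    # tax rule for B
        X += dX
    return X, y_a, y_b, pi_a, pi_b

X, y_a, y_b, pi_a, pi_b = run_mdp(b_a=0.9, s_a=0.02, b_b=0.8, s_b=0.03)
print(f"equilibrium X = {X:.3f}, pi_a + pi_b = {pi_a + pi_b:.4f}")   # -> 1.0
print(f"budget check (should be ~0): {y_a + y_b + X:.8f}")          # taxes = $1 * X

Because Δy_a + Δy_b = −ΔX at every step, the budget balances along the entire adjustment path, not merely in equilibrium.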
3.3.2. The Vickrey-Groves-Clark mechanism

The demand revealing (DR) process, first proposed by Vickrey (1961) and independently discovered by Groves (1973) and Clark (1971), is a planning mechanism which fully discourages cheating; telling the truth is a dominant strategy.56 In its basic form the DR process requires agents to submit a statement to the planner-dictator of their total benefits from the provision of all levels of the public activity, X. The stated total benefits from X for agent i are represented by the function P_i(X), which may or may not correspond to the

55 Roberts (1979) and Henry (1979) show the central difficulty that noncooperative behavior creates for the MDP process is not in finding an efficient allocation but in the speed of convergence to that equilibrium. Lying increases the time before convergence by a factor of N − 1, where N is the number of agents in the economy [Roberts (1979, p. 287)]. In economies with many agents who have finite lives, lying will effectively defeat the MDP search for a Pareto-efficient outcome. For an excellent overview of the MDP literature, see Tulkens (1978).

56 Green and Laffont (1977b) show all truth-dominant planning mechanisms must be a variant of Groves' mechanism.
individual's true benefit function from the provision of X.57 The planner then collects all the submitted benefit schedules, aggregates them (Σ_i P_i(X), summing over the N agents), subtracts the social cost of providing X (Σ_i P_i(X) − C(X)), and finally provides that level of X which maximizes the difference between stated social benefits and total social costs. The chosen X (= X̂) is defined by an expression which looks a good deal like Samuelson's original condition for the optimal provision of public goods, Σ_i π_i(X̂) = MC(X̂), except that π_i(X) (= dP_i/dX) may not equal agent i's true marginal benefit schedule [denoted, as before, as MRS_i(X)]. Agent i may have lied. To prevent cheating the DR process makes clever use of the tax system needed to pay for the public activity. While the DR tax structure seems complicated, it has a compelling intuitive interpretation. Each agent i pays a tax (T_i) equal to
T_i = C(X̂)/N + {S_i − Σ_{j≠i}[P_j(X̂) − C(X̂)/N]},

where X̂ is the planner's chosen level of the public activity, N is the number of citizens, C(X̂) is the total cost of the public activity, P_j(X̂) is the total benefit submitted by agent j, and S_i is defined by

S_i = Σ_{j≠i}[P_j(X̃) − C(X̃)/N],

where X̃ is that X which maximizes the expression Σ_{j≠i}[P_j(X) − C(X)/N]. Note that X̃ is the chosen level of X when agent i fails to submit his or her total benefit schedule. Thus, S_i is the social surplus earned by all citizens other than i when i does not submit a total benefit schedule. Finally, the expression Σ_{j≠i}[P_j(X̂) − C(X̂)/N] is the social surplus earned by all citizens other than i when i does submit a total benefit schedule. Thus, T_i consists of two components: (1) i's equal share of the total cost of paying for X̂ (= C(X̂)/N), and (2) the difference between the social benefits earned by everyone but i when i does not

57 The informational requirements of the DR process are to be contrasted with those of the Wicksell-Lindahl process and the MDP process. In the Wicksell-Lindahl process we needed just the preferred level of spending given tax shares, and in the MDP process just the marginal benefit from a given level of X. Here we need the total benefit schedule, that is, a function which relates total benefits to all possible levels of X. Yet the DR process asks for information from the agents only once. The Wicksell-Lindahl and MDP schemes are iterative processes which require new information at each step. The question of which scheme manages the required information more efficiently is an important one, but one which this literature in particular, and economists generally, have not considered with much success. Is it better to ask for a lot of information once, as in DR processes (yes, if collection is cheap and transmission expensive), or a little information often, as in the MDP scheme (yes, if collection is difficult and transmission is cheap)? Such considerations need to be made much more precise.
submit a benefit schedule for consideration by the planner and the social benefits earned by everyone but i when i does submit a benefit schedule. This second term measures the "externality" which i's statement about benefits imposes on the rest of society, and the DR tax scheme taxes this externality accordingly.

How does this tax scheme ensure the truthful revelation of benefits? Agent i knows the planner-dictator will choose the level of the public activity to maximize net social benefit, defined as

Σ_i P_i(X) − C(X),

which can also be written as

P_i(X) − C(X)/N + Σ_{j≠i}[P_j(X) − C(X)/N].
Agent i will manipulate his message, P_i(X), so that the planner provides i's preferred level of X. That level is defined by maximizing i's own net benefits, given i's true benefit, U_i(X), less i's tax payment:

U_i(X) − T_i = U_i(X) − C(X)/N − S_i + Σ_{j≠i}[P_j(X) − C(X)/N],

which can be rewritten as

U_i(X) − C(X)/N + Σ_{j≠i}[P_j(X) − C(X)/N] − S_i.
This specification of the individual's objective function corresponds to the rewritten version of the planner's objective function given above, except that (i) P_i(X) may not equal U_i(X), (ii) X̂ need not equal X, and (iii) the term S_i is added to the individual's maximand. But note that from the individual's point of view S_i is fixed - it depends only on the messages of the other agents - and is therefore irrelevant to agent i's decision about what message to send to the planner. What is relevant is the possible difference between U_i(X) and P_i(X) and the subsequently chosen level of X by the planner. How can an individual manipulate the planner to do his or her bidding? That is to say, what message should person i send to the planner so that the planner selects that level of X which maximizes the individual's net benefits? The answer is P_i(X) = U_i(X);
person i should always tell the truth no matter what the other members of society do. When you tell the truth, the planner's objective becomes identical to the individual's net benefit function: U_i(X) replaces P_i(X), and the relevant X for evaluating i's tax payment (T_i) is just that X preferred by agent i when his tax schedule is based on X̂ = X. It is the tax scheme of the DR process - specifically the "externality tax" on i's message - which ensures that cooperative behavior will be the dominant strategy. With all agents revealing their true total benefit schedules from the public activity, the planner-dictator can achieve a cooperative, Pareto-efficient outcome.

The DR process therefore seems to have solved our problem of how to structure government decision-making, at least for dictatorial processes. Unfortunately, there are operational problems, and they are serious.58 First, a very specific form of the true benefit function is required for the DR process to ensure that truthfulness is the dominant strategy; the benefit function must depend on the level of X only - that is, U_i(X). This is a very strong requirement. It says, in effect, that variations in how we value X cannot depend on how much of the private good we own. Wealth or income cannot determine our demand for public activities; the evidence is against such an assumption.59 Second, the DR process is not immune to manipulation by coalitions. Bennett and Conn (1976) show that it is not optimal for coalitions of agents to reveal their joint valuations for the public activity. If such coalitions form, the DR process cannot guarantee a cooperative outcome. Third, the tax scheme may take all the private income of some members of society. More generally, there is nothing in the DR mechanism which guarantees that all members of society will in fact be made better off in the new efficient outcome. The level of X is efficient, but the tax scheme may extract so much from some members of society that they are worse off than they were in the original, non-cooperative allocation. When this threat of bankruptcy is significant, we can expect possibly large enforcement costs to running government through a DR process. Finally, and perhaps most importantly, the DR mechanism runs a budget surplus which must be wasted. Tax revenues from the DR process exceed the cost of providing the public activity. The DR tax raises revenue to cover production costs and also imposes a tax on the "externality" effect of each agent's message to the dictator-planner. It is this "externality" tax on messages which raises the surplus.60
58 A much more detailed discussion of these issues is in Chapter 10 by Laffont in this Handbook.

59 The intuition behind this requirement is straightforward. If the level of our private goods determines our benefit from X, and the level of private goods depends on taxation, and in the DR process taxation depends on the messages P_j(X) of others in society, we must then conclude that our utility from X, and hence our message P_i(X) to the planner, now depends upon the messages of all others. We no longer have the simple correspondence of the agent's message to the planner's objective function which made truthfulness the dominant strategy. A review of the evidence on the effects of income or wealth on the demand for public services is found in Inman (1979).
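A numerical sketch may help fix ideas. In the code below the quadratic benefit schedules P_i(X) = a_iX − (1/2)b_iX² and the linear cost C(X) = X are illustrative assumptions; the tax is the "externality tax" defined above. Holding the other agents' (truthful) reports fixed, agent 0's payoff is maximized by reporting truthfully, and total taxes exceed the cost of the chosen activity.

# A minimal numerical sketch of the Groves demand-revealing tax for N agents
# with quadratic benefit schedules P_i(X) = a_i X - 0.5 b_i X^2 and cost
# C(X) = X. All parameter values are illustrative assumptions.

import numpy as np

a_true = np.array([0.8, 0.6, 0.9])     # true marginal benefits at X = 0
b_true = np.array([0.05, 0.04, 0.06])
N = len(a_true)

def chosen_X(a, b):
    """The X maximizing sum_i P_i(X) - C(X) for the reported schedules."""
    return max((a.sum() - 1.0) / b.sum(), 0.0)

def P(a, b, X):
    return a * X - 0.5 * b * X**2

def groves_tax(i, a, b):
    X_hat = chosen_X(a, b)
    mask = np.arange(N) != i
    # X_tilde: the X chosen when agent i submits no benefit schedule.
    X_tilde = max((a[mask].sum() - (N - 1) / N) / b[mask].sum(), 0.0)
    S_i = (P(a[mask], b[mask], X_tilde) - X_tilde / N).sum()
    others = (P(a[mask], b[mask], X_hat) - X_hat / N).sum()
    return X_hat / N + (S_i - others)

# Vary agent 0's reported a_0 and confirm the payoff peaks at the truth.
payoffs = {}
for a0 in np.linspace(0.4, 1.2, 81):
    a = a_true.copy(); a[0] = a0
    X_hat = chosen_X(a, b_true)
    payoffs[round(float(a0), 2)] = P(a_true[0], b_true[0], X_hat) - groves_tax(0, a, b_true)
best = max(payoffs, key=payoffs.get)
print(f"payoff-maximizing report a_0 = {best} (true value 0.8)")

surplus = sum(groves_tax(i, a_true, b_true) for i in range(N)) - chosen_X(a_true, b_true)
print(f"budget surplus = {surplus:.4f} (wasted by the DR process)")

Because S_i is, by construction, at least as large as the surplus others earn at X̂, each bracketed term is non-negative and the printed surplus is positive - the decision-making cost discussed in the text.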
Dropping the externality tax or returning these revenues to citizens will undo just what the tax has created: the incentive to truthfully reveal the benefits of the public activity.61 In effect, the resulting budget surplus is the decision-making cost of a government which is capable of ensuring that we will discover the allocatively efficient level of public activities.

3.3.3. Extensions
Is there any planning mechanism which cannot be manipulated by the citizenry, does not generate a budget surplus, and thus always finds the allocatively efficient level of a public good? That is, can we find a dictator-planner mechanism which dominates the Groves mechanism? Green and Laffont (1977b) and Walker (1980) provide an answer for a very general class of economic environments, and that answer is no. Walker's important analysis parallels and extends the Gibbard-Satterthwaite conclusions for manipulable voting systems to the more demanding setting of dictator-planner processes. If we have any chance of finding an efficient government it must surely be amongst dictatorial processes. Walker shows that even here we will necessarily fall short. Either the mechanism is decision-making efficient (i.e. no budget surplus) but open to strategic manipulation and thus unable to ensure the cooperative, allocatively efficient allocation, or (as with the Groves DR mechanism) it is capable of ensuring cooperative behavior but at the price of a wasted budget surplus. We can go either way.

60 The sum of tax revenues is given by

Σ_i T_i = Σ_i C(X̂)/N + Σ_i {S_i − Σ_{j≠i}[P_j(X̂) − C(X̂)/N]}.

The term Σ_i C(X̂)/N = C(X̂) is the production cost of the public activity. If the second term on the right-hand side is positive, then total tax revenue is greater than production costs, and a surplus is earned. The second term is positive if

Σ_i S_i > Σ_i Σ_{j≠i}[P_j(X̂) − C(X̂)/N].

But remember S_i is evaluated at X̃, which is the value of X which maximizes Σ_{j≠i}[P_j(X) − C(X)/N]. Therefore S_i is at least as great as Σ_{j≠i}[P_j(X̂) − C(X̂)/N], and may be larger. When S_i is larger, the expression in curly brackets - called here the "externality tax" on messages - will be positive, and a surplus above production costs results.

61 Since tax revenues depend on all agents' messages, including each agent's own message, returning any surplus revenues suddenly makes each agent's tax net of returned surplus equal to some function T_i = f(X, P_1,..., P_N). But now an added term appears in the agent's objective function which breaks down the correspondence of the societal and individual maximands. Telling the truth will no longer be a dominant individual strategy.
Groves and Ledyard (1977) advance a mechanism which has a balanced budget and is therefore decision-making efficient, but which is manipulable if citizens make allowances for how their messages to the dictator-planner might influence the messages given by others. If, however, everyone behaves as if their reported benefits or costs have no influence on the messages of others (called Cournot-Nash behavior), then Groves and Ledyard show that their mechanism will achieve the cooperative, Pareto-efficient allocation. Recent experimental work by Smith (1977) and Harstad and Marrese (1982) with a Groves-Ledyard process shows that the mechanism does succeed in finding a Pareto-efficient allocation in most cases. These experimental results are limited to the case of a few participants, however, and recent work by Muench and Walker (1983) shows that the Groves-Ledyard process may fail to achieve an optimal allocation when there are many individuals reporting benefits. Allocation with many individuals is surely an important, if not the prominent, case. However, d'Aspremont and Gerard-Varet (1979) have extended the Groves-Ledyard results by devising mechanisms where truthful reporting is not simply the preferred Nash strategy (preferred when each player assumes all others ignore his message) but the preferred strategy in a more behaviorally realistic structure in which citizens allow for the effects of their messages on others, and do so conditional upon a common expectation as to the distribution of preferences (and hence message behavior) in society. Recently, Cremer and Riordan (1984) have moved this line of research to its most promising limit by devising a mechanism where only one citizen, not everyone, need have truthful behavior as the expectationally rational strategy, that is, rational conditional on expectations of others' preferences. For all other citizens truthful behavior is the dominant (always preferred) strategy. Furthermore, both the d'Aspremont-Gerard-Varet and the Cremer-Riordan mechanisms are non-wasteful and thus achieve balanced public budgets. Unfortunately, the mechanisms are limited to the special case of separable, quasi-linear preferences - u_i(X) + y_i - where wealth or income (y_i) has no effect on public
good demands. Rob (1982) has pursued the second line of inquiry and has examined more carefully the size of the budget surplus likely to arise from use of the Groves mechanism. How large are the decision-making costs needed to guarantee a cooperative allocation? Rob's analysis shows that as the number of agents in the economy increases, the budget surplus per agent from the externality tax of the Groves mechanism tends to zero. In fact, the surplus per agent shrinks so fast that even the total budget surplus declines as the number of agents increases. What is not included in Rob's analysis, however, but what must be a central component of any analysis of decision-making costs, are the costs of sending and processing each message from the citizens to the planner. Transmission of messages will involve a fixed cost per agent. Thus, increasing the number of
agents may save on the budget surplus, but at the expense of an increase in transmission costs. A balance between the budget surplus and transmission costs can be achieved by selecting a cost-minimizing number of agents to sample from the national population. Green and Laffont (1977a) have explored the properties of such a sampling problem, and they conclude that for at least the Groves mechanism the optimal sample size (n*) is less than the full population (N), with N^(1/3) an upper bound to n*. When N equals one million potential beneficiaries, for example, the optimal sample size will be no larger than 100. We should emphasize, however, that such sampling procedures do not guarantee the true allocatively efficient outcome, for we have not collected information from all citizens. What we do have is the best allocation possible given the fact that collecting information is costly. Yet in a world with decision-making costs, that is exactly what we require.

3.3.4. Conclusion

We must conclude that in a world with sophisticated, rational agents, benevolent planner-dictator processes cannot guarantee economic efficiency. Processes which are capable of finding the preferred cooperative allocation are expensive to implement, while those processes which are relatively inexpensive to run are open to manipulation and thus may produce non-cooperative, allocatively inefficient outcomes. Planner-dictator processes suffer from a further defect: they are dictatorial and thus potentially unfair. In each of the broad class of mechanisms considered - MDP processes and Groves processes - the planner retains significant control over the distribution of the final economic surpluses which the mechanisms provide. How these surpluses are to be distributed will have an important effect upon the final well-being of all citizens and thus upon one's sense of the society as a just society. All the net benefits of cooperative activity may well be distributed to the dictator and his friends.62 Who, then, shall pick the dictator-planner, and what institutions are to ensure his benevolence? Perhaps a democratic collective choice process can be found which is preferred?

62 Yet if the distribution becomes too unfair, members of society may well decide not to play the planning game at all. Enforcement costs with dictatorships can be high and rebellion is a real possibility. MDP processes are more attractive than Groves processes in this regard because the MDP process itself guarantees that you can never be harmed if you tell the truth; utilities of all cooperating participants are increased (or not reduced) at each step in the search for the cooperative allocation. The Groves mechanism, however, cannot offer that guarantee. In the extreme, the Groves "tax" may even drive some respondents into bankruptcy. Again we need to choose between those planning processes which are likely to have high decision-making costs (now the cost of enforcement) but can guarantee cooperative allocations (Groves mechanisms) and those processes with low decision-making costs but which cannot ensure a cooperative allocation (MDP processes).
3.4. Democratic processes
Numerous democratic decision-making processes have been proposed for setting cooperative allocations, and almost all fall within one of the three strategies outlined under path (2) in Figure 3.6. Figure 3.7 provides a more detailed description of the alternatives. Majority rule processes (MRP) will protect axiom I - independence of irrelevant alternatives - and sacrifice either the axiom of
[Figure 3.7 details route (2), the democratic choice processes, which satisfy ND, P, and PA. Route 2(i) satisfies I and R but loses U: majority rule processes for single-peaked preferences (examples: simple majority rule; agenda-setter). Route 2(ii) satisfies U and I but loses R: majority rule processes with structure-induced equilibrium (examples: simple majority rule with special districts; agenda-setters). Route 2(iii) satisfies U and R but loses I: preference intensity processes (examples: vote-trading; logrolling with interest groups). All three routes accept decision-making inefficiency.]

Figure 3.7. Democratic choice processes.
universal domain (U, route 2, i) or the rationality axiom (R, route 2, ii). Alternatively, point voting, logrolling, and various vote-trading procedures - called "preference-intensity" processes (PIP) - will sacrifice axiom I (route 2, iii) but protect axioms R and U. The structure and performance of various majority rule and preference intensity processes are considered below for both one-dimensional and many-dimensional public policy decisions.

3.4.1. Majority rule processes

The intuitive case for a majority rule process (hereafter MRP) is its apparent administrative simplicity and the respect MRPs show for all individuals and their proposals. May (1952) was the first to formalize these intuitions. First, May required that the social decision process reach a decision for all possible preference orderings of alternatives. This axiom is Arrow's axiom of unrestricted domain (U). Second, May asked that the decision process consider only the actual orderings of alternatives and not who holds those orderings. If two voters, David and Larsen, swap ballots, then May requires the voting outcome to be unchanged. This requirement is called the axiom of anonymity (A). Third, May required that only the attributes of two alternatives be relevant when comparing those alternatives; if x beats y, and everyone who prefers x over y also prefers w over z, then w must beat z. This requirement is called the axiom of neutrality (N). Neutrality requires that no option - for example, the status quo - play a special role in the voting process. Finally, May imposed the axiom of positive association (PA); if x is tied with, or beats, option y, and if one person who was indifferent or preferred y now switches to prefer x, then x must become, or remain, the winner. Together, the axioms require that the democratic choice process allow all individual preference profiles (U), be responsive to those preference orderings (PA), and ignore any information not related to those orderings and the comparisons at hand (A and N). What social choice processes meet these requirements? May proves that only MRPs will qualify.63

The axioms of May are arguably attractive properties to impose on a democratic choice process,64 and Sen (1970a) shows that they are closely related to

63 Rae (1969) and Taylor (1969) motivate the MRP from a different perspective. Rae and Taylor show that in situations of conflict, where issues are unidimensional and randomly or impartially selected, and where individuals gain an amount if the issue passes in their favor and lose an equal amount if the issue loses (an equal intensity assumption), then a voter who wishes to maximize his or her expected utility will prefer majority rule as the voting rule. This specification of the social choice problem has some appeal as a pre-constitution decision problem, when voters do not know what issues will be considered or how they might react to the issues when they are considered. Voters select a voting rule behind a "veil of ignorance". See Buchanan and Tullock's (1962) discussion of constitutional design from this perspective.

64 But they are not the only axioms we might wish to impose on a voting system, and other axioms may give other voting systems. See the discussion of preference-intensity processes below, but particularly see Plott (1976, pp. 554-566).
Most preferred to least preferred:
Individual 1: x, y, z
Individual 2: z, x, y
Individual 3: y, z, x

MRP winner: xPy by 2 votes to 1 vote; yPz by 2 votes to 1 vote; zPx by 2 votes to 1 vote.

Figure 3.8. Voting cycles with MRP.
Arrow's requirements as well. A MRP which satisfies axioms U, A, N, and PA will also satisfy Arrow's axioms of U, P, ND, and I (the converse is not true). MRPs are not, however, rational (Arrow's axiom R). MRPs are prone to intransitive orderings of social allocations. Figure 3.8 illustrates the problem. In a pairwise comparison between alternatives x and y, x receives two votes, y one vote, and thus x beats y. In a comparison between y and z, y receives two votes, z one vote, and y beats z. If the majority rule process were rational in Arrow's sense, then xPy and yPz should imply xPz. In fact, when z opposes x, z receives two votes, x receives one vote, and z beats x. The MRP yields an intransitive social ordering of alternatives for the preference profile in Figure 3.8. When such intransitivities result, we have a voting cycle. No clear winner emerges in a sequence of pairwise comparisons and we simply "cycle" from one alternative to another.

Voting cycles pose a fundamental problem for MRPs. Either MRPs cycle forever and never reach a decision or, more likely, a stopping rule is imposed which ends the cycle but which exposes the MRP to strategic manipulation. For example, suppose we allow each alternative to be considered only as long as it continues to win in pairwise comparisons. If the order of voting is first x vs. y and then the winner vs. z, it is clear that with the preference profile of Figure 3.8 alternative z is the chosen alternative. But what if the agenda is instead x vs. z with the winner vs. y? Now the winner will be y. Finally, if the agenda is z vs. y with the winner vs. x, then the chosen option will be x. When voting cycles occur, the voting rules and the agenda will determine the outcome. Whoever is chosen to set the agenda can determine the outcome.65

MRPs are not problem-free. Because Arrow's axiom of rationality is necessarily violated (because U, P, I, and ND are met, R must go), MRPs are open to voting cycles and are either manipulable or indecisive. There are only two ways out of this problem. The first is to drop one of the other Arrow axioms and impose axiom R. The axiom generally dropped is U (see Figure 3.7).

65 The example above assumes truthful voting, but the argument, which illustrates the unique powers of an agenda-setter, also holds if voters vote strategically; see Shepsle and Weingast (1981).
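Both the cycle and the agenda-setter's power can be checked mechanically. The sketch below tallies the pairwise votes for the Figure 3.8 profile and then runs the three agendas just described, assuming sincere voting throughout.

# A minimal sketch of the Figure 3.8 voting cycle and of agenda control
# under sincere voting.

profile = [("x", "y", "z"), ("z", "x", "y"), ("y", "z", "x")]   # Figure 3.8

def pairwise_winner(a, b):
    votes_a = sum(r.index(a) < r.index(b) for r in profile)
    return a if votes_a > len(profile) - votes_a else b

for a, b in (("x", "y"), ("y", "z"), ("z", "x")):
    print(f"{a} vs {b}: {pairwise_winner(a, b)} wins")          # the cycle

def run_agenda(agenda):
    winner = agenda[0]
    for challenger in agenda[1:]:
        winner = pairwise_winner(winner, challenger)
    return winner

for agenda in (("x", "y", "z"), ("x", "z", "y"), ("z", "y", "x")):
    print(f"agenda {agenda} -> winner {run_agenda(agenda)}")

The three agendas deliver z, y, and x respectively: with a cycle in place, the agenda-setter can obtain whichever outcome he wishes.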
We are asking if there are certain preference configurations for which the MRP will satisfy R. If so, we then need to examine how plausible these preference configurations are in fact. Perhaps intransitivity and voting cycles are only a theoretical difficulty and not a serious problem in real societies. The second strategy retains the original four axioms (U, A, N, PA, or their Arrow counterparts U, P, I, ND) and seeks to impose an additional structure on a MRP to prevent voting cycles, or, if there are cycles, to minimize the adverse consequences of manipulation (see Figure 3.7). Such processes are called majority rule processes with a structure-induced equilibrium (MRP-SE). Both strategies are reviewed here.

Relaxing the axiom of universal domain (U), which allows for all possible profiles of voter preferences, seemed, at least initially, an attractive way out of Arrow's dilemma. In real societies, common cultural backgrounds may simply mean that some preference profiles will never occur. How likely are voting cycles (the violation of axiom R) when voters have "similar preferences"? Sen (1966) has made the notion of "similar preferences" precise and has defined conditions where no cycles occur at all. Sen advances the idea of value-restricted preferences and shows that if voter preferences over every combination of three alternatives have the property that all agree some alternative is not worst, or some alternative is not best, or some alternative is never medium, then a MRP will be transitive. Sen's condition of value restriction establishes the minimum degree of voter compatibility needed to avoid voting cycles in a MRP. On its face, Sen's condition does not appear overly restrictive.66 Voters need only agree that some alternative is always "pretty good" (i.e. never worst), or always "pretty bad" (i.e. never best), or never a reasonable compromise (i.e. never medium). There is one prominent case where the condition of value restriction holds and a MRP yields a clear winner. It is the case where alternatives can be ordered along one dimension (e.g. from conservative to radical, or from less to more public spending) and where preferences are "single-peaked" along this dimension. If, in Figure 3.8, x equals a "low budget", y a "middle budget", and z a "high budget", then the alternatives can be ordered along the one dimension, dollars, and we can plot the rank ordering of alternatives as solid lines in Figure 3.9. The rank orderings for persons I and III are called "single peaked" since there is only one "peak" to the plot of their preferences against the dimension of budget size. Person I might be called "conservative", while person III might be called "liberal". Person II has "double-peaked" preferences, however, and is the source of the voting cycle. If we change the preferences of person II to the ordering z > y > x, then Sen's value-restriction condition is satisfied - y is the one alternative which is never worst.

66 Sen's theorem (1966) does require the number of voters to be always odd, which is "disturbing and unattractive" [Sen (1970a, p. 169)]. Sen (1970a, pp. 169-171) discusses the various ways around this problem, which require slightly stronger restrictions on allowable preferences. In the end, Sen concludes, as we shall too, that the usefulness of a pure MRP is a "question for empirical investigation". Will preference structures actually satisfy the needed restrictions to insure an equilibrium?
[Figure 3.9 plots each person's preference ranking (most, middle, least) against the budget level, with x = "low", y = "middle", and z = "high". Persons I and III have single-peaked orderings (solid lines); person II's original ordering is double peaked, and his revised ordering is shown as a dashed line.]

Figure 3.9. Single-peaked preferences.
Furthermore, person II's revised preferences qualify as "single peaked" (the dashed line in Figure 3.9). Now a MRP of pairwise comparisons yields alternative y as the clear winner: y beats x, y beats z, and z beats x. This majority winner in pairwise comparisons is called the Condorcet winner. How reasonable is the requirement that voter preferences be single peaked? In simple, one-dimensional budgetary allocations the requirement is satisfied when voters have convex preferences and linear or concave budget constraints [see Denzau and Parks (1979)]. These are the usual restrictions in demand analysis and not likely to be often violated.67 Figure 3.10 illustrates this case for voter III in the example above; similar diagrams could be drawn for voter I and for voter II's new (single-peaked) ordering. The fact that majority rule processes are transitive (produce a Condorcet winner) when preferences are single peaked, and single-peaked preferences are reasonable in one-dimensional budget problems, explains the popularity of majority rule models of budgetary politics. Unfortunately, the general applicability of these models is severely limited; they are only valid for problems where alternatives can be ordered along one dimension.

67 But they may be, with the most plausible violation being increasing returns to scale, and hence convex budget constraints. If increasing returns are strong enough we may well have the case where voters prefer a large amount of spending to a small amount because of economies of scale, and yet both small and large spending levels are preferred to some medium amount which does not yet afford all the cost savings of increasing returns to scale. An interesting analysis of voting outcomes in such cases is developed by Kramer and Klevorick (1974), where they develop conditions for local equilibria. These results are applied to actual allocation problems in pollution control in Kramer and Klevorick (1973).
[Figure 3.10 plots private goods against the budget level, showing voter III's indifference curves and budget constraint at the alternatives x, y, and z.]

Figure 3.10. Budget preferences for voter III.
When we move to collective choice problems where alternatives are ordered on two or more dimensions - e.g. tax rates and spending, or two spending levels - single peakedness is no longer adequate. New, stronger restrictions on preferences are required. Plott (1967) was the first to clarify sufficient limitations on voter preferences to ensure a MRP would produce a Condorcet winner when the policy alternatives have two or more attributes. The Plott conditions require a finite number of voters whose preferences can be represented by a quasi-concave, continuously differentiable utility function. Furthermore, some voter is satiated at a point e in the policy space, and the remaining voters can be partitioned into pairs so that each voter who wants to change from point e is exactly offset by another voter who prefers an equal but opposite change from e. The Plott conditions, which amount to a symmetry condition on the distribution of voter preferences, are illustrated for the case of five voters in Figure 3.11. To simplify the presentation, I assume circular indifference curves around each voter's most preferred ("bliss") point or satiation point. The voter is worse off as we move in any direction from his or her point, and utility decreases monotonically with distance. [More formally, U(x) = f(−‖x − θ‖) for any point x, where θ defines the voter's bliss point.] In Figure 3.11, the bliss points for voters A-E are indicated by points a-e, with circular indifference curves for each voter drawn about his bliss point. The solid lines connecting the various bliss points are "contract lines" marking all points of tangency between the indifference curves. Note that point e beats any alternative in a majority rule pairwise comparison, and that point e satisfies Plott's conditions.
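The offsetting-pairs construction is easy to verify numerically. In the sketch below the coordinates are illustrative assumptions - voter E at the origin, with (A, C) and (B, D) as equal-and-opposite pairs - and randomly sampled challengers confirm that point e defeats every alternative put against it.

# A minimal sketch of Plott's symmetry condition: with Euclidean preferences
# and bliss points in offsetting pairs around voter E, the point e = (0, 0)
# wins every pairwise majority contest. Coordinates are assumptions.

import random

bliss = {"A": (1, 0), "C": (-1, 0), "B": (0, 1), "D": (0, -1), "E": (0, 0)}
e = (0.0, 0.0)

def dist2(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def votes_for(x, y):
    """Number of voters strictly preferring x to y."""
    return sum(dist2(b, x) < dist2(b, y) for b in bliss.values())

random.seed(7)
challengers = [(random.uniform(-3, 3), random.uniform(-3, 3)) for _ in range(20000)]
print("e defeats every sampled challenger:",
      all(votes_for(e, z) >= 3 for z in challengers))            # -> True

Voter E always votes for e, and for any challenger at least one member of each offsetting pair lies closer to e, so e collects at least three of the five votes.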
Figure 3.11. Majority rule equilibrium with two policies.
Voters A and C are always opposed to each other, as are voters B and D. Consider point δ vs. point e. Voting for e are persons E, A, and B; voting for δ are voters C and D. Consider point β vs. point e. Voting for e are E, D, and C, while A and B vote for β. Similarly, e beats α (E, B, and C for e, while A and D want α) and e beats γ (E, A, and D for e, while B and C want γ). A majority rule comparison of any point in the policy space to point e will show e to be the winner.

When the Plott conditions do not hold, however, voting cycles may result. This is shown in Figure 3.12, where for simplicity I draw only the case of three voters with circular preferences. The arguments generalize easily. The solid lines connecting the bliss points of voters A, B, and C are the "contract lines" marking the tangencies of the indifference curves between the various pairs of voters. As drawn, the Plott conditions do not hold; if they did, the bliss point of one of the voters would have to lie upon the contract line for the other two. Figure 3.12 illustrates two important points. First, the area within the three contract lines defines the set of Pareto points for this allocation problem. A move from an allocation within the Pareto set (e.g. point ε) to a point outside the set (e.g. point δ) will make at least one voter worse off (e.g. voter B). Conversely, there is always a point within the Pareto set (e.g. point γ) which will make all voters better off compared to its alternative outside the Pareto set (e.g. point δ). Second, there may be no stable majority rule winner when the Plott conditions are violated. As any point outside the Pareto set will be dominated by some point within the set, we can focus our analysis on alternatives such as α, β, γ, and ε. In pairwise comparisons by majority rule, β (favored by B and C) beats α (favored by A), α (favored by A and B) beats γ (favored by C), but (contrary to transitivity, which predicts β beats γ) γ (favored by A and C) beats β (favored by B).
Point ε, which is inside the Pareto set, is also caught in a voting cycle. Point ε wins over point α as voters B and C prefer ε; point α defeats γ as A and B prefer α; but (contrary to transitivity) γ beats ε as A and C prefer γ. There is no Condorcet winner amongst the alternatives in Figure 3.12.

How likely is it that we will violate the Plott conditions and have voting cycles? The fact that a small deviation from perfect symmetry of voter preferences about the equilibrium point (e, in Figure 3.11) upsets the equilibrium and allows a voting cycle suggests that the Plott conditions are quite restrictive. Kramer (1973) has shown that when voters' preferences are convex and continuous and issues are to be decided on more than one dimension, then voting cycles are the general case. Only when preferences are essentially identical can MRPs escape intransitivity [Kramer (1973)]. We must conclude that the emergence of a Condorcet winner is an unlikely occurrence in multi-dimensional, many-issue voting.68

Tullock (1967) conjectured that voting cycles are less likely as the number of voters is increased. The cycle itself would be limited to a smaller subset of alternatives in the policy space (called the "cycle set"), and this small subset would be within the Pareto set. The intuition of Tullock's conjecture was that as more and more voters covered the policy space, the chances of finding the offsetting pairs of voters about this small cycle set increase - the cycle set being defined by those voters who do not offset each other. A MRP should therefore tend to a decision within the cycle set (which decision we cannot say), and that decision will be Pareto optimal. In an important paper, McKelvey (1975) has shown this conjecture to be wrong. In multi-dimensional voting problems there will generally be majority rule voting cycles which cover the entire policy space. McKelvey showed that if voters have circular preferences (Tullock's assumption too),69 then for any two alternatives in the policy space, a voting sequence of pairwise comparisons can be found which begins with one alternative and ends with the other alternative such that each point along the way is preferred to its predecessor.

68 We might well ask now the question posed by Sargent and Williamson (1967): How likely are voting cycles if some, but not all, voters have identical preferences? Using simulation techniques, they conclude that even for a modest proportion of identical preferences the probability of a voting cycle is significantly reduced, at least for the case of many voters and three alternatives. The probability of a cycle falls from 0.0877 to less than 1 percent in this case. It is likely that the probability of a cycle rises significantly, however, as the number of alternatives increases, and approaches 1 as we cover the whole policy space [Kramer's (1973) result]. Fishburn and Gehrlein (1980) take another tack and examine the effect of voter indifference on the probability of voting cycles. They find that for three alternatives the probability of a cycle falls from 0.0877 without indifference to zero when everyone has at least one indifferent ranking in their ordering; for four and five alternatives the probability of a cycle falls from 0.1755 and 0.2513 to (≈) 0.11 and (≈) 0.20, respectively, with at least one indifferent ranking in each ordering [from fig. 1, Fishburn and Gehrlein (1980)].
Preference compatibility - whether defined by Sen's value restriction, by identical preferences for some voters, or by indifference - does seem to help, but only when the number of alternatives is small.

69 McKelvey's results have since been generalized to more general preference structures by Schofield (1978), Cohen (1979), McKelvey (1979), and Cohen and Matthews (1980).
[Figure 3.12 shows the bliss points of voters A, B, and C in the (x1, x2) policy space, the contract lines between them, and the alternatives α, β, γ, and ε.]

Figure 3.12. Majority rule disequilibrium.

[Figure 3.13 adds to the Figure 3.12 configuration a voting path through the points ε, δ, and θ.]

Figure 3.13. Majority rule voting paths.
Figure 3.13 illustrates this result for a point ε within the Pareto set and two points δ and θ outside the Pareto set. In a comparison of ε vs. δ, δ receives two votes (voters A and C) and ε one vote (voter B). If we then compare points δ and θ, both outside the Pareto set, alternative θ (voters B and C) beats δ. Finally, compare θ to ε. Alternative ε is preferred by voters A and B (θ by voter C) and therefore beats θ. We have found a voting cycle which travels from an arbitrary point inside the Pareto set to points outside the set and then back into the Pareto set. McKelvey's result proves this is no fluke; we can travel anywhere in the policy space by an appropriately chosen sequence of alternatives. Rather than unique occurrences, voting cycles are endemic to MRPs.

The analysis of Plott, Kramer, and McKelvey makes clear that dropping axiom U is not a particularly promising avenue to travel in our search for an attractive democratic social decision process. Only with truly one-dimensional policy choices does the relaxation of axiom U - to single-peaked preferences - buy us very much at all. We must seek, then, another route. If we wish to continue down the path of the MRP, yet retain axiom U, then the rationality axiom R must go. The loss of axiom R raises serious questions, however. They center fundamentally upon the susceptibility of MRPs to voting cycles. We know already from the work of Blair and Pollak (1981, 1983) that no democratic choice process is immune to voting cycles. If we impose the restriction that our constitution satisfy an axiom of acyclicity (no voting cycles), then some individual or group will have extensive veto power. Since Blair-Pollak's analysis applies to all constitutions, it applies to MRPs as well. Nonetheless, acyclic constitutions may be worth examining. Shepsle (1979), in an important paper on "structure-induced equilibrium" for MRPs, has begun this examination.70

To escape the voting cycles endemic to MRPs, legislatures often adopt rules on how policy proposals are to be presented, discussed, and finally approved. Shepsle (1979) has begun the formal analysis of such rules, which he calls institutional structures. Such structures have three dimensions: (1) who is allowed to vote on a proposal (the "committee structure"); (2) which proposals may be compared to the status quo (the "jurisdiction structure"); and (3) how proposals may be modified once they have been formally introduced for voter consideration (the "amendment structure"). The now classic example of a structure-induced equilibrium for a majority rule process is the use of local budget referenda to set educational spending. The "committee" includes all residents within a defined geographic boundary. The committee's "jurisdiction" is to set the aggregate level of the school budget.

70 In their review of the recent literature on equilibrium in MRPs, Fiorina and Shepsle (1982) conclude: "Indeed, the central constructive feature of the Cohen-McKelvey-Schofield theorems is precisely that 'other features,' not the majority rule mechanism, are decisive in democratic institutions."
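The endemic character of these cycles is easy to reproduce. The sketch below uses three voters with circular (Euclidean) preferences and three labeled alternatives; the coordinates are illustrative assumptions, not those of Figures 3.12 and 3.13, but the resulting pairwise tournament is intransitive in just the way described.

# A minimal sketch of majority-rule cycling in two dimensions when Plott's
# symmetry fails: three Euclidean voters, three alternatives, and an
# intransitive pairwise tournament. All coordinates are assumptions.

from itertools import combinations

voters = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]        # bliss points of A, B, C
points = {"alpha": (1.6, 0.2), "beta": (3.0, 1.5), "gamma": (1.1, 1.6)}

def dist2(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def beats(x, y):
    """True if a strict majority of voters is closer to x than to y."""
    return sum(dist2(v, x) < dist2(v, y) for v in voters) >= 2

for a, b in combinations(points, 2):
    if beats(points[a], points[b]):
        print(f"{a} beats {b}")
    else:
        print(f"{b} beats {a}")

The output - beta beats alpha, alpha beats gamma, gamma beats beta - is a majority rule cycle; McKelvey's theorem says such chains can be strung together to reach any point in the policy space.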
[Figure 3.14 plots the percent of voters against the budget level G, showing the density of voters' ideal points and the positions G°, G*, and G1.]

Figure 3.14. Median voter budget.
The "amendment" process may allow all spending levels to be considered-an "open" agenda-or it may allow no amendments and only two budgets to be considered - "closed" agenda. This structure reduces a complicated multi-dimensional policy issue - the level, the content, and the financing of education - to a one-dimensional issue, the size of the education budget. Along this one dimension, voter preferences for spending levels are likely to be single peaked [see Figure 3.10, and Denzau and Parks (1979)]. In the case of an "open" agenda structure, the outcome will be that budget preferred by the voter with the median (50th percentile) position in the distribution of most preferred (or "ideal") points [Black (1958)]. Figure 3.14 illustrates this result. The solid curve is the density function of the distribution of voters' ideal points along the single dimension (G) of allowed alternatives. G represents the level of the school budget. The median position in the distribution, denoted G*, will be the winning alternative. Since preferences are single peaked, voters always prefer alternatives closest to their ideal points. If G1 were proposed, all the voters whose ideal points are to the left of G* will favor G* and a few voters to the right of, but close to, G* will also favor G*. Since half the voters lie to the left of G*, G* defeats G1. If GO were proposed, it too would lose to G*. Half the voters whose ideal points lie to the right of, G * will favor G * over G° , and a few voters positioned just to the left of G* will also prefer G* to G° . It is clear that G* is a MRP-SE equilibrium outcome. In the case of a closed agenda MRP, voters face a one-time choice between only two alternatives. How are those alternatives chosen? They arise from an agenda-setting process. One alternative is a "reversion", or fall-back,
[Figure 3.15 shows the median voter's budget line and indifference curves in the (private goods, budget) space.]

Figure 3.15. Bureau behavior with alternative reversion budgets.
In the case of a closed agenda MRP, voters face a one-time choice between only two alternatives. How are those alternatives chosen? They arise from an agenda-setting process. One alternative is a "reversion", or fall-back, alternative - for example, last year's budget or the status quo. The other alternative must be specified by the agenda-setting process. In an important line of new research, Romer and Rosenthal (1979b, 1982a, 1982b, 1982c) have developed and tested a model of agenda-setting behavior in one-dimensional policy problems. The analysis extends the early work of Niskanen (1971, 1975) on bureaucratic agenda-setters. In both Niskanen's and Romer-Rosenthal's models it is assumed that the agenda-setter is the bureaucracy and that the professional bureaucrat's objective is to maximize the size of the budget. Figure 3.15, which shows the budget line and indifference curves for the decisive median voter, illustrates how the bureaucrat will select the agenda, given that 50 percent of the voters must approve one alternative or the other. The trick is to select the agenda so that the approved budget is as large as possible. Note that if G*, the median voter's preferred budget, is one alternative, then it will receive voter approval. Thus, the worst the bureaucrat can do is G*. But he can capture a higher budget. Consider the agenda pair G° (the "reversion" level) and a budget slightly smaller than Ĝ, where Ĝ is the largest budget the median voter would still accept over the reversion G°. Which will the median voter prefer? The answer is the budget near Ĝ, for it places him on a higher indifference curve (more satisfaction) than that available with budget G°. The bureaucrat can increase the budget from G* to nearly Ĝ with an appropriate agenda pair. Yet the approved budget can be made larger still. If the reversion level is set at G° = 0, then an alternative budget slightly smaller than the (now even larger) Ĝ will win voter approval in the pairwise comparison 0 vs. (Ĝ − 1). Romer and Rosenthal have shown how agenda manipulation can influence the final outcome in a MRP-SE process. Note too that such manipulation generally makes the agenda-setter better off (G increases) while making those outside the select circle (the median voter and other low demanders) worse off.
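The setter's leverage can be made concrete with a small calculation. Assume, purely for illustration, that the median voter has quadratic loss U(G) = −(G − G*)²; he is then indifferent between a reversion R and its mirror image 2G* − R, so the largest budget the setter can pass rises as the reversion falls.

# A minimal sketch of the Romer-Rosenthal closed-agenda setter with a
# quadratic-loss median voter. G* and the reversion levels are assumptions.

G_STAR = 100.0     # median voter's preferred budget

def largest_approvable_budget(reversion):
    """Largest proposal the median voter weakly prefers to the reversion."""
    return max(G_STAR, 2.0 * G_STAR - reversion)

for reversion in (100, 80, 50, 0):
    print(f"reversion = {reversion:5.1f} -> approved budget ~ "
          f"{largest_approvable_budget(reversion):5.1f}")

At a reversion of zero the setter can pass a budget of nearly twice the median voter's preferred level - a stylized analogue of the spending premium Romer and Rosenthal estimate below.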
An interesting series of empirical papers has in fact tested MRP-SE models for this special case of one-dimensional policy choices. Howard Bowen, in a remarkable paper published over forty years ago [Bowen (1943)], first proposed the empirical test of such MRP-SE models,71 but it remained for Bergstrom and Goodman (1973) and Inman (1978) to perform the test. Those results were generally supportive of the MRP-SE process as a description of local government budgetary choice. But Bergstrom and Goodman, and Inman, both assumed the amendment process was an open one - that is, any proposal could be considered. Romer and Rosenthal's work extends this analysis by allowing for no amendments and closed agendas. Within their more general structure they could formally test the proposition that the amendment structure of these MRP-SEs was "open" or "closed". The key variable added to the Bergstrom-Goodman and Inman open agenda model was the "reversion" level. The lower is the reversion level, the higher should be government spending under the closed agenda model (see Figure 3.15). With an open agenda, the median voter is decisive, G* is chosen, and the reversion level should be statistically insignificant. In their study of Oregon school districts, Romer and Rosenthal find the reversion level is significant and has the expected impact on local spending. As the reversion level passes below a critical lower bound, Romer and Rosenthal find that local school spending is 11-18 percent higher than that predicted by the open agenda median voter model. In Oregon school districts, at least, the monopoly agenda-setter has an important role to play in setting the level of government spending.

Shepsle (1979) has extended the analysis of MRP-SE to two or more policy dimensions. The structure involves committees of all voters (committees-of-the-whole), but each committee's jurisdiction is restricted to considering only one policy dimension. If there are two policy dimensions, then there must be two committees. Finally, the amendment process allows an open agenda, subject to the restriction that only amendments pertaining to each committee's policy dimension can be considered (called the germaneness rule). Furthermore, truthful voting is assumed.72 Figures 3.16(a)-(c) illustrate an equilibrium in the simple case of two policy dimensions (x1 and x2 spending) and three voters (A, B, and C). The jurisdiction and amendment rules restrict voters to consider one policy dimension at a time. Figure 3.16(a) plots the preferred spending on x1 for each voter, given an exogenously set level of x2. The curves are drawn as the dashed lines AA', BB', and CC' and are found by drawing lines parallel to the x1 axis

71 In this one paper Bowen (1943) has addressed the entire research agenda of the new political economy. First, he described the market failure of pure public goods and derived the conditions which must be satisfied if an efficient allocation is to result. Second, he proposed a behavioral model of government budgetary politics and derived the essential results of the median voter theorem. Third, he proposed an empirical test of whether governments, if behaving according to his model, would in fact provide the efficient level of the public good. His analysis precedes Samuelson's (1954) normative analysis of public goods by eleven years, Black's (1958) median voter results by fifteen years, and the new empirical political economy by at least twenty!

72 Extensions of the analysis to the case of strategic voting are found in Miller (1980) and Shepsle and Weingast (1984). The essential point - the structure-induced equilibrium result - remains intact.
Figure 3.16. Structure-induced equilibrium. [Panels (a) and (b), drawn in the (x1, x2) plane; graphic not reproduced.]
Figure 3.16 (continued). [Panel (c); graphic not reproduced.]
Figures 3.16(a)-(c) illustrate an equilibrium in the simple case of two policy dimensions (x1 and x2 spending) and three voters (A, B, and C). The jurisdiction and amendment rules restrict voters to consider one policy dimension at a time. Figure 3.16(a) plots the preferred spending on x1 for each voter, given an exogenously set level of x2. The curves, drawn as the dashed lines AA', BB', and CC', are found by drawing lines parallel to the x1 axis at given levels of x2 and noting the tangency of those lines to a voter indifference curve at a point (e.g. β, α, and γ) closest to the voter's ideal point (a, b, and c, respectively). Given the curves AA', BB', and CC' we can calculate the majority winning level of x1 for each level of x2. For example, at x2 = x2°, voter B prefers β, voter A prefers α, and voter C prefers γ. Since preferences for x1 are single peaked along the ray at x2 = x2°, proposal α is the median preferred alternative and the majority rule winner. We can repeat the calculation for each x2 level, and the result is the set of majority preferred x1 levels, drawn as the heavy line I in Figure 3.16(a). Figure 3.16(b) repeats this procedure, except here we determine the majority winning x2 levels, given a level of x1. For example, when x1 = x1°, each voter again has a preferred proposal; since preferences for x2 are single peaked along the ray at x1 = x1°, the median of these proposals is the majority winner. The heavy line II is the collection of the majority preferred x2 levels, given x1. The intersection of lines I and II in Figure 3.16(c) is the Shepsle structure-induced equilibrium for this simple case, denoted as point E. In the example drawn here, E is a stable equilibrium as well. If moved away from E, the voting process will eventually return to point E.73

73 One needs to impose a dynamical structure on the Shepsle model to discuss stability. A "cobweb" process seems reasonable. Given x2, select a majority preferred x1. Then, given that x1, select a new majority preferred x2. Does this sequence converge to E? In Figure 3.16(c), it does. However, it is not hard to construct an example with lines I and II (and underlying tastes) which does not converge to E. In that case, the equilibrium point where lines I and II intersect is unstable. For a more detailed discussion of stability with structure-induced equilibria, see Denzau and Mackay (1982).
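Footnote 73's cobweb process is easy to simulate. In the sketch below everything is an illustrative assumption on my part - three voters with quadratic utility -(x - a_i)'W_i(x - a_i), invented ideal points, and invented cross-weights that make each voter's preferred level on one dimension depend on the level fixed on the other:

```python
# A minimal sketch of the "cobweb" dynamics behind Shepsle's
# structure-induced equilibrium: alternate one-dimension majority votes
# until the (x1, x2) pair settles at the intersection point E.
import numpy as np

ideals = np.array([[2.0, 8.0],   # voter A's ideal point (x1, x2) - assumed
                   [5.0, 3.0],   # voter B - assumed
                   [9.0, 6.0]])  # voter C - assumed
# Positive-definite weight matrices; the off-diagonal terms tilt each
# voter's conditional preferences, so lines I and II are not flat.
weights = [np.array([[1.0, 0.3], [0.3, 1.0]]),
           np.array([[1.0, -0.2], [-0.2, 1.0]]),
           np.array([[1.0, 0.4], [0.4, 1.0]])]

def conditional_ideal(i, dim, other_value):
    """Voter i's preferred level on `dim`, holding the other dimension fixed
    (first-order condition of the quadratic utility)."""
    a, W = ideals[i], weights[i]
    other = 1 - dim
    return a[dim] - (W[dim, other] / W[dim, dim]) * (other_value - a[other])

def committee_choice(dim, other_value):
    """Median of the conditional ideals: the one-dimension majority winner."""
    return float(np.median([conditional_ideal(i, dim, other_value)
                            for i in range(3)]))

x1, x2 = 0.0, 0.0
for _ in range(50):            # cobweb: vote on x1 given x2, then x2 given x1
    x1 = committee_choice(0, x2)
    x2 = committee_choice(1, x1)
print(f"structure-induced equilibrium near E = ({x1:.3f}, {x2:.3f})")
```

With off-diagonal weights larger than the diagonal ones the iteration diverges, reproducing footnote 73's unstable case.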
In general, the structure-induced equilibrium is not a single point but a set of points [Shepsle (1979, p. 50)]. Nonetheless, the equilibrium set is likely to be a small subset of the whole policy space.

Shepsle's two-dimension MRP-SE analysis assumes an open agenda; Mackay and Weaver (1981, 1983) have extended the model to allow for closed agendas set by budget-maximizing bureaucrats. The structure of their model parallels Shepsle's special case of a committee-of-the-whole restricted to considering one policy dimension at a time [Figures 3.16(a) and 3.16(b)]. Mackay and Weaver (1981) examine both the case of a monopoly (closed) agenda-setter for one policy dimension but open agenda politics on the other dimensions and the case of monopoly (but non-collusive) agenda-setters on all dimensions. Results are then compared to a Shepsle open-agenda equilibrium, such as point E in Figure 3.16(c). They show that for two-dimensional policies (e.g. x1 and x2), monopoly power will increase the size of the budget allocated to the monopolized activity (from point E) and may reduce (if substitutes) or increase (if complements) the size of the budget to the openly decided public activity. If both activities are monopoly controlled and the monopoly bureaus are non-cooperating, it is possible for one bureau's budget to actually fall below that of the open-agenda outcome. The more likely case, however, is for both budgets to increase from the open-agenda equilibrium. Finally, if monopoly bureaus cooperate - specifically, if they can agree to transfer funds between each other - then it is possible to expand their individual budgets still further [Mackay and Weaver (1983)]. The basic theoretical results from Mackay and Weaver parallel those of Niskanen, and Romer and Rosenthal, for the single-dimension case, but as yet no empirical analyses have confirmed these predictions for the case of multi-dimensional policies.74

Although the effect of institutional structure on policy choice is important, and real, it is not by itself an answer to the question of how MRPs do, or should, select outcomes. For if a structure-induced equilibrium such as point E in Figure 3.16 is not a Condorcet equilibrium (a policy capable of beating all others in pairwise comparisons), then there is always a majority with an incentive, and the potential power, to change the institutional structure and hence the chosen outcome. When, why, and how do such institutional changes occur? It is this question which now stands as the central challenge to our understanding of majority rule processes.
74 An alternative to the Mackay-Weaver models is presented in Inman (1981). Inman hypothesizes a bureau - called a labor union - seeking to maximize an objective function over current and future labor income and the level of employment. Employment proxies for public good output, while labor income translates directly into taxpayer income through the government's budget constraint. The union negotiates with a coalition of taxpayers. The tendency is for negotiation to move to outcomes along a "contract curve" defined by mutually agreeable trades between the bargaining parties. Where on the contract line the final outcome will be will depend on the relative bargaining strengths of the two parties. The analysis predicts, and the empirical work confirms, a larger public budget with two-party bargaining than when workers are available at the competitive wage.
Ch. 12: The "New" PoliticalEconomy
719
Net benefits from roadway:

              Roadway I    Roadway II
  Farmer A     -$2000       -$2000
  Farmer B      $5000       -$2000
  Farmer C     -$2000        $5000

Figure 3.17. Vote-trading.
In an important sense we have moved the dilemma of disequilibrium and cycling from the space of policies to the space of institutions. As Riker (1982) concludes in his own recent survey of majority rule processes:

... rules or institutions are just more alternatives in the policy space and the status quo of one set of rules can be supplanted with another set of rules. (T)he only difference... is that the revelation of institutional disequilibrium is probably a longer process. ... If institutions are congealed tastes and if tastes lack equilibria, then so also do institutions, except for short-run events.

3.4.2. Preference intensity processes
Having now examined various dictatorial processes and then democratic decision processes which sacrifice Arrow's axioms of universal domain or rationality (MRPs), it remains to describe possible democratic choice processes which violate the last Arrow axiom of independence from irrelevant alternatives (I). We have seen that democratic mechanisms such as the Borda count which violate axiom I have one advantage - they allow voters to express the relative intensity of preference or value they place on alternatives - and two disadvantages - all alternatives must be considered jointly, and such mechanisms are open to strategic manipulation by voters. I will call such democratic mechanisms preference intensity processes (PIPs). While the Borda count is the common textbook example of a PIP, it is not prominently used by real legislatures. PIPs do exist, however. Any democratic mechanism which permits votes to be traded, bought, or sold will violate axiom I, permit voters to express the intensity of their preferences for particular policy positions, require voters to consider all feasible policies, and be susceptible to strategic manipulation.75 Vote-trading, logrolling, and interest group politics in which votes are for sale are all democratic preference intensity processes. How do they work?

Figure 3.17 illustrates the potential and hazards of vote-trading processes.76

75 See Wilson (1969) for a proof of these propositions.
76 The analysis is based on an example in Mueller (1979).
Persons A, B, and C are three farmers. Farmers B and C each need a roadway to their farm; each roadway will cost $6000, shared equally among the three farmers. The benefits of the access roads to farmers B and C will be $7000 each. If roadway I is built, farmer B gains a net increase of $5000 (= $7000 - $2000), while farmers A and C each lose $2000. If roadway II is built, farmer C gains $5000 (= $7000 - $2000), while farmers A and B each lose $2000. The construction of the roadways is to be decided by a democratic process. If simple majority rule is used, both roadways I and II are defeated by votes of 2-1. Note, however, that both B and C gain by trading votes; B agrees to vote for roadway II if C agrees to vote for roadway I, and both projects are then approved by 2-1. Furthermore, joint passage is potentially Pareto optimal in this case since social benefits ($7000) exceed social costs ($6000) for each roadway.

Two facts should be noted regarding this example. First, the existence of beneficial trades requires a non-uniform distribution of net benefits over the alternatives. If net benefits from the roadways for farmers B and C were only $2000, they would gain nothing by trading votes. For vote-trading to be worthwhile to individual participants, they must feel intensely toward one issue over another. Second, vote-trading need not always lead to a Pareto-optimal allocation. Suppose the gross benefits of each roadway were only $5000; now social benefits ($5000) are less than social costs ($6000). But a vote-trading equilibrium with $3000 in place of $5000 in Figure 3.17 (B and C still have an incentive to trade) will approve both roadways. The vote-trading outcome over-provides the public good. There is no assurance that vote-trading will achieve the Pareto-efficient allocation we seek; see Riker and Brams (1973).

More fundamentally, however, there is no guarantee that a vote-trading equilibrium will even exist. Both farmers B and C have an incentive to cheat on the agreement if voting occurs sequentially. Once one farmer has won approval of his roadway, he has an incentive to cheat on the agreement and vote to reject the roadway of his neighbor. Since the first farmer knows he will be cheated, he refuses to trade. The "game" they are playing is, of course, a Prisoner's Dilemma [Mueller (1979, p. 54)], and just as non-cooperative behavior in such games can undo trades in the market place, so too can it undo trades in the halls of government. We need a way to ensure that cheating cannot occur. The vehicle to foster cooperation in this case is the omnibus legislative package which requires that all bills (roadway I and roadway II) which have been vote-traded be decided simultaneously. The creation of such omnibus legislative packages is called "logrolling". Logrolling permits voters to achieve a stable vote-trading equilibrium.
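The arithmetic of the example can be checked mechanically. The following sketch uses exactly the numbers in Figure 3.17 and in the over-provision variant above; only the code framing is mine:

```python
# A small computational check of the Figure 3.17 vote-trading example:
# sincere majority voting defeats both roadways, while the B-C trade
# passes both.

net_benefits = {"A": (-2000, -2000), "B": (5000, -2000), "C": (-2000, 5000)}

def majority_passes(votes):
    return sum(votes.values()) >= 2   # two of the three farmers

for p, name in enumerate(("roadway I", "roadway II")):
    sincere = {f: b[p] > 0 for f, b in net_benefits.items()}
    traded = {"A": False, "B": True, "C": True}   # B backs II, C backs I
    print(f"{name}: sincere vote passes -> {majority_passes(sincere)}, "
          f"after trade passes -> {majority_passes(traded)}")

# Efficiency check: with a sponsor's net benefit of 5000 (gross 7000 vs.
# cost 6000) joint passage is efficient; with net 3000 (gross 5000) the
# trade still pays each trader 1000, yet total net benefits turn negative,
# so the trade over-provides the public good.
for sponsor_net in (5000, 3000):
    total = sponsor_net - 2000 - 2000
    print(f"sponsor net = {sponsor_net}: total net benefit = {total}, "
          f"gain to each trader = {sponsor_net - 2000}")
```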
Ch. 12: The "New" Political Economy
721
The logrolling game has been analyzed by Becker (1983) and Weingast, Shepsle and Johnson (1981) under the assumptions that each voter represents a particular interest group (e.g. roadway I or II) and that all voters adopt a Cournot-Nash strategy when bargaining - that is, all voters behave as if their votes do not influence the benefits and costs, and thus the behavior, of other voters. Becker and Weingast-Shepsle-Johnson show that such games will have an equilibrium allocation even over multi-dimensional policy choices. What has not yet been established is the existence of a logrolling/interest group equilibrium under more sophisticated, rational expectations behavior by voters. Yet sophisticated, strategic voter behavior is generally viewed as a central element of all preference intensity processes. The analysis of logrolling/interest group models must be considered incomplete until sophisticated, strategic voter behavior is allowed [Kramer (1972)].

3.4.3. The performance of democratic processes - allocative efficiency

Having established that MRPs and PIPs may, in the right behavioral and institutional settings, lead to well-defined and predictable equilibria, we can now turn to an evaluation of those equilibria against the normative criterion of allocative efficiency. When will a democratic process satisfy Samuelson's condition (ΣMRS = MC) for a cooperative, allocatively efficient provision of a collective activity? We know from Arrow's theorem that democratic processes cannot give stable, allocatively efficient outcomes all the time, but perhaps, for a certain configuration of alternatives and preferences, they will succeed.

Bowen (1943) was the first to consider systematically the efficiency properties of a democratic process. Analysis was limited to a MRP in the case of single-dimension policies with single-peaked preferences. Bowen showed that if each of N voters pays an equal share of the marginal (= average) costs of the public good x [= MC(x)/N] and if preferences for the public good are distributed symmetrically [so the median MRS(x) corresponds to the mean MRS(x)], then the MRP selects the median voter's preferred alternative [defined by MRS(x)_median = MC(x)/N], and from the symmetry of preferences [ΣMRS(x)/N = MRS(x)_mean = MRS(x)_median] we can conclude that majority rule will satisfy Samuelson's condition for allocative efficiency [MC(x)/N = MRS(x)_median = ΣMRS(x)/N, so MC(x) = ΣMRS(x)]. These conditions are obviously very special. Bergstrom (1979) has generalized the Bowen results for MRP to a more plausible tax system and structure of voter preferences. Specifically, Bergstrom proves that if
(i) voter preferences are log-linear in the private good (Y_i for voter i) and the public good (x), i.e. U_i = ln Y_i + a_i ln x; and if,
(ii) the public good is produced at a constant cost per unit of x (= c); and if,
(iii) each voter's share of total taxes (τ_i) is proportional to voter income (τ_i = I_i/ΣI_i); and if,
(iv) the "taste parameter" which determines a voter's willingness to pay for the public good (parameter a) is distributed symmetrically across all voters and is independent of income, then a Bowen median voter equilibrium will be Pareto optimal as well.77 The assumption (i)-(iv) of the Bowen-Bergstrom theorem are not implausible, 77 The proof of the Bergstrom-Bowen theorem is straightforward. From assumption (i) and the voter's linear budget constraint, I, = Y + icx, we know the issue is single-dimensioned (what level of x?) and that voter preferences for alternative levels of x will be single peaked. The public sector equilibrium will therefore be determined by the preferences of the median voter. The maximization of U subject to the budget constraint defines the median voter's preferred level of x implicitly as
MRS =
=a(x )6=· TC,(1)
where a is the preference parameter for the median voter. Rearranging gives aY, = Ticx and substituting into the budget constraint yields I, = Y(l + a) or Y = I/(1 + a). Substituting this specification for Y/ into the above equilibrium condition and again rearranging terms allow us to specify the median voter's preferred level of x = as = (a/l + a)(I/i,c) =
+a)(EJ,/c),
(/
(2)
using Ti = (I,/YI). i is the level approved by the MRP. But does it satisfy Samuelson's conditions for efficiency? The typical -voter's MRS evaluated at x = x is equal to MRS = ai ( Y/)
=a ( -
which upon the substitution of ( i/l MRS = (ai/a)c
c)/X,
(3)
as defined in (2) equals: i).
(4)
Summing over all i voters including the median voter gives: , MRS, = ( c/a
I )
ai I.
(5)
If a is independently distributed from I, within the community, then EaiI,= aY,, where a is the mean value of the preference parameters. Upon substitution, EMRS = c((/a),
(6)
which, if the ai's are symmetrically distributed so a = a, reduces to _ MRS = c. Equation (7) is Samuelson's condition for the allocatively efficient provision of x.
(7)
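The proof in footnote 77 can also be verified numerically. In the sketch below the log-normal income distribution, the uniform taste distribution, and all parameter values are my illustrative assumptions; the point is simply that, under assumptions (i)-(iv), the median voter's choice makes ΣMRS approximately equal to the marginal cost c:

```python
# A numerical check of the Bowen-Bergstrom result: with log-linear
# preferences, proportional tax shares, and tastes independent of income,
# the median voter's preferred x approximately satisfies sum(MRS) = MC.
import numpy as np

rng = np.random.default_rng(0)
N, c = 501, 10.0                                     # assumed community size, unit cost
income = rng.lognormal(mean=3.0, sigma=0.5, size=N)  # I_i (assumed)
taste = rng.uniform(0.2, 0.6, size=N)                # a_i: symmetric, indep. of I_i
shares = income / income.sum()                       # tau_i = I_i / sum(I_i)

# Each voter's preferred x solves a_i*Y_i/x = tau_i*c with Y_i = I_i/(1+a_i):
preferred_x = (taste / (1.0 + taste)) * (income.sum() / c)
x_hat = np.median(preferred_x)                       # the Bowen (median voter) choice

# Sum of MRS_i = a_i*Y_i/x_hat with Y_i = I_i - tau_i*c*x_hat:
Y = income - shares * c * x_hat
sum_mrs = float(np.sum(taste * Y / x_hat))
print(f"sum of MRS at the median choice = {sum_mrs:.2f}, marginal cost c = {c:.2f}")
# With tastes (approximately) symmetric and independent of income, the two
# are close: Samuelson's condition holds at the Bowen equilibrium.
```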
Ch. 12: The "New" Political Economy
723
The assumptions (i)-(iv) of the Bowen-Bergstrom theorem are not implausible, but of course they must be tested. Assumption (i) is a common specification in empirical public good demand analysis, where a_i is often specified to depend upon age, religion (for educational spending), or race. Assumption (ii) is likely to hold for most public goods, while assumption (iii) is valid for any tax system where taxes are proportional to income.78 Assumption (iv) is yet to be examined: are age, religion, or race or their combined effects on demand symmetrically distributed?

Barlow (1970) has actually performed a variant of this test for a sample of Michigan school districts. At issue is the level of school spending. The median voter is assumed decisive (a Bowen equilibrium) and local spending is set by the condition MRS(x̂)_median = τ̂c, where τ̂ is the median voter's share of total local taxes, c is the constant cost per unit of x, and x̂ is the median voter's preferred level of x. The median voter's willingness to pay for schooling is related to the average willingness to pay by the relationship MRS(x̂)_median = φMRS(x̂)_mean = φ[ΣMRS(x̂)/N], where φ is a constant relating median to mean benefit. Hence, τ̂c = MRS(x̂)_median = φ[ΣMRS(x̂)/N], or

c/ΣMRS(x̂) = φ/(τ̂N).

Note that if the ratio (φ/τ̂N) equals one, then c = ΣMRS(x̂) and Samuelson's efficiency condition is satisfied. If (φ/τ̂N) > 1 (< 1), then c > ΣMRS(x̂) [c < ΣMRS(x̂)] and local school services are over-provided (under-provided) relative to the allocatively efficient level. The key ratios are the ratio of median to mean benefits [φ = MRS(x̂)_median/MRS(x̂)_mean] and median to mean tax payments [τ̂N, the ratio of the median tax share to the mean share of 1/N]. If we assume benefits are symmetrically distributed and everyone pays equal taxes, then mean and median benefits are equal and mean and median tax payments are equal, so (φ/τ̂N) = 1, and Samuelson's efficiency condition holds. This is Bowen's (1943) special case analyzed earlier. Barlow has actually estimated (φ/τ̂N), however, and he concludes that for the majority of local school districts in his sample (φ/τ̂N) < 1. Thus, local school spending is too low relative to the efficient level. For this single-dimension choice, the MRP is allocatively inefficient.79

Kramer (1977) has provided the basis for evaluating the efficiency performance of a MRP in the case of multi-dimensional policies. A dynamic sequence of elections between an incumbent who must defend the status quo and a challenger who selects a vote-maximizing alternative to the existing policy is shown to converge to the set of Pareto-efficient policies.
78 A tax system based on a proportional income tax will define each voter's tax payment as T_i = tI_i, where t is the proportional tax rate. Total taxes equal ΣT_i = tΣI_i. The voter's share of taxes (τ_i) will therefore equal T_i/ΣT_i = tI_i/(tΣI_i) = I_i/ΣI_i. Note, too, that any proportional tax system which taxes a commodity which is proportional to income also satisfies assumption (iii). For example, a proportional property tax, where housing value is a constant times income, will satisfy assumption (iii).
79 See Edelson (1973) and Bergstrom (1973) for critical comments and extensions of Barlow's work.
The incumbent must run on his or her performance during office, which is assumed to be identical to the proposal the incumbent offered when elected. However, the challenger can offer any proposal he or she thinks can beat the incumbent's platform. The challenger seeks to maximize the number of votes received against the incumbent's position. Since in general there is no single policy which beats all others in majority rule comparisons, the challenger, if he knows voter preferences, can always find an alternative policy which beats the incumbent's. Thus, the "out" candidate can always beat the incumbent, and we should observe a series of one-term candidates.

Where does this process lead? Perhaps not too surprisingly, it takes us into the set of Pareto-optimal policies. When politicians seek to please the maximal number of voters with any policy change (the vote-maximization hypothesis), and when an incumbent's policy is outside the Pareto set, then there will always be a policy within the Pareto set which is unanimously preferred to the incumbent's policy and which the vote-maximizing challenger will select. Unfortunately, once inside the Pareto set, there is no guarantee that the two-candidate competitive process will keep us there. Figure 3.13 above illustrated a case where a proposal in the Pareto set could be defeated by a challenger's proposal outside the Pareto set, which in turn could lose to another proposal outside or inside the Pareto set. The equilibria of Kramer's dynamical MRP are not stable. However, vote-maximization behavior, which assumes challengers are always looking for unanimously preferred alternatives, keeps pushing outcomes back into the Pareto set. Kramer's analysis predicts that policies will eventually be in or near the Pareto set, but exactly where is not known.80

What drives this MRP to the Pareto set is the assumption of vote-maximizing behavior. Just as the Bowen-Bergstrom structure required empirical testing, so too must we test the validity of the Kramer structure. While there is evidence that elected officials are responsive to constituent preferences [Jackson (1974)], there is also ample evidence that non-voters - industry associations and other special interest groups - can also have a significant impact on policy selection. See Estey and Caves (1983), Kau, Keenan and Rubin (1982), Darden and Silberman (1976), Chappell (1982) and, most recently, Kalt and Zupan (1984). Non-voters can only influence policy if voters are imperfectly informed and/or politicians have objectives other than simply maximizing their election pluralities. Both alternative hypotheses seem plausible

80 Actually, Kramer's analysis gives more precise predictions in certain cases. The Kramer two-candidate, dynamic model drives outcomes into a subset of the Pareto set, called the minmax set. The minmax set is composed of all proposals having the property that the maximum number of voters wishing to move in any policy direction is a minimum. Proposals outside the Pareto set cannot be minmax points because there is always some proposal in the Pareto set which will be unanimously preferred to them. Kramer (1977, p. 321) gives examples of minmax sets which are significantly smaller than the Pareto set. However, whenever the policy issue is purely redistributive (e.g. the division of a fixed sum of money) and voters are selfish, the minmax set will correspond to the entire Pareto set [Kramer (1977, p. 330)].
Ch. 12: The "New" Political Economy
725
and sufficient to drive final policy selection away from the Pareto set based on voter preferences.

If, therefore, there is no strong evidence that MRPs have achieved allocatively efficient allocations of public goods, then how about preference intensity processes? Buchanan and Tullock (1962, p. 186) and Becker (1983) have argued that logrolling will succeed in finding Pareto-efficient packages of public goods and taxes, but both studies ignore the difficulties of finding cooperative allocations when vote-traders have incentives to conceal their true preferences through strategic behavior. Creating a pseudo-market in the legislature generally will not solve a market failure. The recent analysis of Weingast, Shepsle and Johnsen (1981) illustrates this important point. In their legislative model of logrolling, politicians are elected to represent local voter interests. If local voter welfare is defined to include not only direct project benefits but also the pecuniary benefits of project outlays within the district paid for by taxpayers living outside the district, then Weingast et al. show that each individual legislator will prefer a project size larger than the allocatively efficient size and, furthermore, that all legislators together have an incentive to approve an omnibus public works package which gives each legislator his favored project.81 The result is "pork-barrel" politics and inefficiently large government spending; see Ferejohn (1974), Plott (1968) and Mayhew (1974) for some recent evidence.82

81 If x is the scale of the public project in an individual district, b(x) is the aggregate benefit to members of the district, and c(x) is the total cost of the project, then efficiency requires marginal benefits to equal marginal costs: b'(x_E) = c'(x_E), where x_E is the efficient level of the project. Total project costs can be divided into production costs spent within the district, denoted c_1(x), production costs spent outside the district, denoted c_2(x), and non-expenditure resource costs (externalities, for example) within the district, c_3(x): c(x) = c_1(x) + c_2(x) + c_3(x). Benefits are limited to district residents without loss of generality. Politicians are assumed to receive political credit for all project benefits and for all public expenditures which occur within their district. They lose political credit for all tax payments by residents and for the external costs which residents bear from a public project. The politician's maximand is specified as
P_j(·) = [b(x_j) + c_1(x_j) + Σ_{i≠j} c_2j(x_i)] − [c_3(x_j) + τ_j Σ_i(c_1(x_i) + c_2(x_i))]

(the first bracketed term is political benefits, the second political costs),
where P_j(·) is the net reward to the elected representative from district j (re-election prospects, bribery income, etc.), b(x_j) is direct project benefits, c_1(x_j) + Σ_{i≠j} c_2j(x_i) is all public expenditure in district j, c_3(x_j) is external project costs (no spillovers across districts are assumed), τ_j is district j's share of total taxes, and Σ_i(c_1(x_i) + c_2(x_i)) is the total budget cost of all projects across all districts. The politician's reward depends upon the difference between political benefits and political costs. In deciding on the level of public spending, the politician maximizes P_j(·). P_j(·) can be re-written as
P_j(·) = {b(x_j) + c_1(x_j) − c_3(x_j) − τ_j(c_1(x_j) + c_2(x_j))} + Σ_{i≠j}[c_2j(x_i) − τ_j(c_1(x_i) + c_2(x_i))].
barrel" politics and inefficiently large government spending; see Ferejohn (1974), Plott (1968) and Mayhew (1974) for some recent evidence. 8 2 Stigler (1971) and Peltzman (1976) have offered a variant of a vote-selling model to explain the apparent over-provision of government regulation; see Joskow and Noll (1981).83 Regulatory inefficiencies are the outcome of a game of redistribution in which the "owner" of the rule-setting process - either bureaucrat or legislator-can "sell" favorable regulations to the highest bidder.84 Such regulations create favored, monopoly positions within the private economy which produce allocative inefficiencies as an unwanted by-product. Recent papers by Krueger (1974), Brock and Magee (1978) and Bhagwati and Srinivasan (1980) The expression in curly brackets x,'s except x. The politician's spending elsewhere) is therefore the argument x. The first-order
depends only upon x,, while the second term is a summation over all preferred level of own district spending (which is independent of defined by the maximization of the expression in curly brackets over condition is given by
b'(x_j^N) + c_1'(x_j^N) = τ_j[c_1'(x_j^N) + c_2'(x_j^N)] + c_3'(x_j^N),

where x_j^N is the politically preferred level of district j's project. How does x_j^N compare to x_j^E? Note that
b'(x_j^E) = c_1'(x_j^E) + c_2'(x_j^E) + c_3'(x_j^E) > τ_j[c_1'(x_j^E) + c_2'(x_j^E)] + c_3'(x_j^E) − c_1'(x_j^E),

which holds for all x since τ_j < 1; or alternatively,

b'(x_j^E) + c_1'(x_j^E) > τ_j[c_1'(x_j^E) + c_2'(x_j^E)] + c_3'(x_j^E) = b'(x_j^N) + c_1'(x_j^N),

where the last equality is the politician's first-order condition evaluated at x_j^N.
If we assume b''(x) < 0 and c''(x) > 0, then the efficient outcome, x_j^E, must be less than the politically preferred outcome, x_j^N: x_j^N > x_j^E. Because projects in district j do not interact with projects elsewhere to influence net benefits in j, each politician can set his or her preferred x_j^N without coordinating with politicians elsewhere. Each politician comes into the budget process with a preferred x_j^N, and the sum of those project levels over all districts sets the aggregate budget. The resulting equilibrium budget is a Nash equilibrium.
82 It is instructive to compare the experiences of Presidents Carter and Reagan in their separate attempts to control government spending. Carter attacked selected individual district projects as inefficient - primarily water resource projects - and made no progress at all in limiting "pork-barrel" spending. Reagan, on the other hand, gave Congress a reduced budget as a package and emphasized that everyone's projects were reduced to everyone's benefit. This is exactly the strategy required by the analysis of Weingast, Shepsle and Johnson (1981, particularly pp. 656-658) to promote efficiency.
83 The recent book edited by Fromm (1981) in which the Joskow-Noll paper appears contains several of the better studies of the causes and consequences of economic regulation. Other recent studies on the consequences of regulation are Sloan's (1981) review and study of hospital cost inflation and Masson and DeBrock's (1980) study of milk price regulation.
84 For a critique of these models using the recent history of regulatory reform, see Weingast (1981) and Wilson (1980).
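Footnote 81's comparison of x_j^E and x_j^N can be illustrated numerically. The functional forms below - quadratic benefits and linear cost components - are my own assumptions, chosen so that both first-order conditions solve in closed form; they are not taken from Weingast, Shepsle and Johnsen:

```python
# An illustrative parameterization of the district project model in
# footnote 81. Assumed forms: b(x) = B*x - x**2/2 and c_k(x) = s_k*x.

B = 100.0                      # benefit scale in district j (assumed)
s1, s2, s3 = 30.0, 20.0, 5.0   # marginal costs: in-district, out-of-district, external
tau = 1.0 / 50.0               # district j's share of total taxes (one of 50 districts)

# Efficiency: b'(x_E) = c1' + c2' + c3'  =>  B - x_E = s1 + s2 + s3
x_eff = B - (s1 + s2 + s3)

# Politician: b'(x_N) + c1' = tau*(c1' + c2') + c3'
# (full credit for in-district spending, but only a tau-share of the tax bill)
x_pol = B + s1 - tau * (s1 + s2) - s3

print(f"efficient scale       x_E = {x_eff:6.1f}")
print(f"politically preferred x_N = {x_pol:6.1f}   (x_N > x_E: pork-barrel bias)")
```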
Ch. 12: The "New" Political Economy
727
Stigler (1971) and Peltzman (1976) have offered a variant of a vote-selling model to explain the apparent over-provision of government regulation; see Joskow and Noll (1981).83 Regulatory inefficiencies are the outcome of a game of redistribution in which the "owner" of the rule-setting process - either bureaucrat or legislator - can "sell" favorable regulations to the highest bidder.84 Such regulations create favored, monopoly positions within the private economy which produce allocative inefficiencies as an unwanted by-product. Recent papers by Krueger (1974), Brock and Magee (1978) and Bhagwati and Srinivasan (1980) have argued that a similar process is at work in setting quota or tariff barriers to international trade. Not only is the politically-created economic monopoly wasteful, but the process of logrolling or "rent-seeking" is itself inefficient. Since an explicit market in political favors is often illegal, favorable rulings must appear to have been openly considered by the policy process. Hearings must be held, studies completed, and rules written, all of which consume scarce time and resources.85 The sum of the economic losses from the misallocation of private resources and the resources consumed by rent-seeking behavior may be considerable; Krueger (1974), for example, has estimated the inefficiencies from rent-seeking in import licenses at 7 percent of GNP for India and as much as 15 percent of GNP for Turkey!86

While it is safe to conclude from this brief review of the efficiency performance of known MRPs and PIPs that democratic processes do not generally guarantee an efficient allocation of social resources, we cannot go the next step and conclude that collectively-decided allocations using MRPs or PIPs are inferior to individually-decided market allocations. At the moment we know too little about the economic performance of democratic policy choice to clearly favor either markets or democratic governments as the more efficient resource allocator.

3.4.4. The performance of democratic processes - equity
The second, and, it is often argued, the most important function of government is setting the distribution of economic resources across members of society. From the perspective of those who argue the market process may generate an ethically unacceptable distribution of final goods and services, an extra-market institution called government, with the coercive powers to tax and transfer, is required to correct this market "failure"; see, for example, Rawls (1971). It is important to examine the distributional performance of governments. Indeed, such leading scholars as Stigler (1971) and Riker (1982) have argued that understanding the distributional performance of government is the central task of political economy.

The essence of the redistribution activity is its zero-sum nature: what is given to one agent is necessarily taken from another. There are no net gains from the activity, only transfers. When this redistribution game is played amongst equally

85 But Bhagwati (1980) makes the familiar point that one distortion - lobbying - may be welfare improving in a world with other market distortions. It is a valid point and cautions us to do empirical analysis before judging our political institutions as always inefficient.
86 Bhagwati and Srinivasan (1980) emphasize that Krueger's estimates are an outer estimate of these inefficiencies, as she assumes rents are sought competitively, thereby drawing many resources into the rent-seeking process. If quotas are set as a bilateral bargaining game between the monopoly bureau and the to-be-favored private party, then resources consumed by rent-seeking itself will be much less. This is not to say that the resulting market inefficiencies will be insignificant, however.
powerful agents, it generally has no stable, or undominated, outcome.87 Whenever one group forms with sufficient power to take resources from another group, the weaker group has an incentive to bribe a few members of the stronger group to abandon that group, join with the weaker to make it strong, and then defeat the remaining members of the once strong group. Around it goes. Stability is achieved only through the outside imposition of rules of how groups may form and how groups can take and allocate resources. Understanding the distributional performance of governments therefore requires us to understand how these rules influence the outcome of the distribution game.

Democratic solutions to the distribution game may take either of two forms: (1) as a majority rule solution to a non-cooperative game in which players do not communicate, collude, or form coalitions (MRP solutions), or (2) as a solution to a cooperative game in which coalitions, collusion and bargains are allowed (PIP solutions). The contemporary political economy literature contains examples of both.

The use of a majority rule process (MRP) to describe distribution policy faces a serious initial hurdle. If tax and transfer policy is multi-dimensional, then the MRP will not produce a stable policy outcome, unless the very restrictive Plott-Kramer conditions on voter preferences - essentially, that preferences are identical - are satisfied. The redistribution game is hardly interesting if everyone agrees! Thus, if a simple MRP is to be used as a basis for an analysis with stable outcomes, the redistribution game must be reduced to a single-dimension policy problem.88 Romer (1975), Roberts (1977), and Meltzer and Richard (1981) - the major studies of redistribution through a MRP - do just that. Voters are assumed to have similar preferences over the primary goods of after-tax consumption (c) and work effort (e) but to differ in their basic ability (n) to earn income. It will be this difference in ability or wages which generates the conflict over tax and transfer policies. The final, and crucial, restriction is to limit government tax-transfer policy to a linear income tax of the form
T(y) = ty − a, where y is earned income, t is the marginal tax rate, and a is a lump-sum subsidy (a > 0) or tax (a < 0).
87 More formally, we can say that there are no undominated allocations in zero-sum games, or that the "core" of the game - i.e. the set of undominated allocations - is empty. The proof of this statement is straightforward. Let the game involve N players, indexed by i = 1,...,n, who own total resources valued at v(N). For an allocation to be in the core it must satisfy three requirements: (1) no individual's allocation, x_i, is less than his worst performance on his own, v(i): x_i ≥ v(i); (2) the total of all individual allocations must equal the value of the grand coalition, v(N): Σ_N x_i = v(N); and (3) the value of the allocations to members of some coalition smaller than the grand coalition (S < N), denoted Σ_S x_i, must be at least as great as the worst that coalition could do on its own, v(S): Σ_S x_i ≥ v(S). Assume an allocation x (= x_1,...,x_n) is in the core. Then by condition (3) all members in S earn Σ_S x_i ≥ v(S) and those not in S earn Σ_{-S} x_i ≥ v(-S). Since S and -S together equal the grand coalition N, Σ_N x_i ≥ v(S) + v(-S). In a zero-sum redistribution game v(N) = 0, while for a winning coalition v(S) + v(-S) > 0; thus Σ_N x_i > 0. But this result violates requirement (2), which says Σ_N x_i = 0 for zero-sum games. Thus, core requirements cannot be satisfied for zero-sum games; the core is empty.
88 Craig and Inman (1986) have recently analyzed a redistribution game in two dimensions - an income guarantee and a marginal tax rate - using a MRP-SE model.
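The emptiness of the core asserted in footnote 87 can be confirmed by brute force for a small example. The three-player majority game below is an illustrative special case of my own construction, not one from the text:

```python
# A tiny check that the core is empty in a three-player, zero-sum majority
# redistribution game: any two-player coalition can take one unit from the
# remaining player, so v(pair) = 1, v(singleton) = -1, v(N) = 0. A grid
# search finds no allocation meeting all the core requirements at once.
from itertools import combinations
import numpy as np

def v(S):
    return {1: -1.0, 2: 1.0, 3: 0.0}[len(S)]   # assumed coalition values

grid = np.linspace(-1.0, 1.0, 41)
core = []
for x1 in grid:
    for x2 in grid:
        x = {1: x1, 2: x2, 3: -x1 - x2}        # requirement (2): sums to v(N) = 0
        ok = all(sum(x[i] for i in S) >= v(S) - 1e-9
                 for r in (1, 2)
                 for S in combinations((1, 2, 3), r))
        if ok:
            core.append((round(float(x1), 2), round(float(x2), 2)))
print("core allocations found:", core)          # -> [] ; the core is empty
```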
Ch. 12: The "New" Political Economy
729
tax (a < 0). Earned income equals effort times ability -y = ne - while after-tax consumption equals earned income minus net taxes - c = (1 - t)y + a. Individuals behave in the labor market so as to maximize their utility function u(c, e) which can also be written as u((1 - t)ne + a, e}. Households select work effort, e, to maximize u }. The resulting labor supply schedule is a function of the "net wage", (1 - t)n, and the lump-sum tax or subsidy, a: e = e[(l - t)n, a]. Once e has been chosen, y and c are also determined. The linear tax-transfer schedule has two policy variables: the marginal tax rate (t) and the lump-sum subsidy (a). The government's budget constraint which requires tax revenues to cover all (exogenously set) public good spending (G) and all lump-sum subsidies reduces this two-dimensional policy problem in a and t to a decision in one dimension, t. Total net taxes raised over all N agents must equal total spending: ZNT(y) = G, or N(ty - a) = G, where y is average before tax income. The budget constraint can also be written as a = ty - G/N. If leisure is a normal good (the usual assumption) so that work effort and earned income decline as the lump-sum subsidy a rises, then for any value of t (given G) there will be a unique value of a which satisfies the government's budget constraint. This is shown in Figure 3.18, where y(a) shows the fall in income as a rises, and where ty - G/N is the government's budget constraint for a chosen t. When
Figure 3.18. Determination of lump-sum subsidy. [The y(a) schedule falls as a rises; its intersection with the budget line tȳ - G/N determines the balanced-budget subsidy, with a_2 > a_1 when t_2 > t_1. Graphic not reproduced.]
When t = t_1, the required a which balances the budget constraint is a_1. When t rises to t_2, we can offer a higher subsidy for each value of y, and a rises to a_2. When the tax-transfer schedule is linear (two parameters, t and a) and the government must balance its budget, the redistribution game is reduced to a single-dimension policy problem: the selection of t.

How does a MRP select a winning tax rate, t*? Since the selection of a value of t defines a corresponding value of a, and t and a together determine each agent's preferred labor supply (e), earned income (y), and final consumption (c), each household's welfare will ultimately depend upon the chosen tax rate, t*. Each voter will have a preferred tax rate, t_i*, set as that rate which maximizes u_i{·}. If voters can be ordered by their preferred tax rates and if an individual's preferred rate is independent of the preferred rates of the other agents, so there are no advantages to collusion, then the MRP can select a unique t*. Roberts (1977, p. 334) has made these requirements precise. The key assumption is called "hierarchical adherence" and requires that there exist an ordering of individuals such that pre-tax income is increasing along that ordering irrespective of which tax schedule is in operation. Meltzer and Richard (1981) show that if consumption (c) is a normal good, then in this simple model incomes can be ordered by worker productivity (n) and the assumption of hierarchical adherence will be satisfied; the more able workers earn more income no matter what value of t is finally chosen [see also Romer (1977)]. Hierarchical adherence seems to be a mild additional restriction in the context of this simple model.

If hierarchical adherence holds, Roberts (1977, lemma 1) has shown that higher income individuals prefer lower tax rates and, hence, lower lump-sum subsidies. They favor less redistribution. Since individually preferred tax rates can be ordered along a single dimension, the collectively preferred tax rate (t*) under MRP will be the rate (t_m*) preferred by the voter with the median pre-tax income or productive ability. Voters with economic abilities less than the median will prefer higher rates and subsidies, while the more economically able will prefer lower rates and lower subsidies.

This general analysis allows us to make only weak predictions about t*, however - namely, 1 > t* = t_m*. We know t* will be less than 1 since (i) the least able person prefers to make a, the subsidy, as large as possible; (ii) a will be zero if the chosen rate is 1 or larger, since no one works when t ≥ 1; and (iii) the more able median voter prefers a lower rate than the least able person, who clearly prefers a rate less than 1. Additional structure gives us additional predictions. If we further assume that the distribution of abilities and income is positively skewed so that average income (ȳ) exceeds median income (y_m), then the median voter will always prefer a positive marginal tax, t* > 0. There is always some advantage in taxing income as long as your income is not higher than average.89

89 The proof of this result follows an analysis of the median voter's preferred tax rate. A typical individual chooses effort (e) so as to maximize the preference relationship u{(1 - t)ne + a, e}.
Ch. 12: The "New" Political Economy
731
Unfortunately, while we can predict 1 > t* = t_m* > 0, we cannot say anything about the value of the lump-sum tax or subsidy (a) without knowing the value of non-transfer government spending (G). And knowledge of a is crucial for judging the progressivity of a society's tax-transfer system. A progressive system has marginal tax rates which exceed the average tax rate, so the average tax rate rises as income rises, i.e. the rich pay proportionally more. A regressive system has marginal rates below the average tax rate, so the average rate falls as income rises. A proportional tax system has a constant average (equal to marginal) tax rate. In this model the average tax rate equals T(y)/y = t - a/y, which will be less than (greater than) the marginal tax rate (∂T(y)/∂y = t) as a > 0 (a < 0). A progressive tax-transfer system has a > 0. If G = 0, then as 1 > t* > 0 we know from the government's budget constraint (a* = t*ȳ - G/N) that a* > 0. However, if G > 0, we can no longer predict the sign of a* a priori. Once again, further structure is required. Romer (1975) and Meltzer and Richard (1981) have examined changes in the MRP preferred t* and a* under alternative, more fully specified, models of the economy.
The first-order condition for a maximum is given by

(1 - t)n·u_c + u_e ≤ 0,     (i)

where u_c = ∂u/∂c and u_e = ∂u/∂e, and the condition holds as an equality when e > 0. Small changes in the tax schedule, denoted by dt and da, will affect household utility as

du = u_c[da - ne·dt] + de[(1 - t)n·u_c + u_e],     (ii)

which reduces to du = u_c[da - ne·dt], or

du/dt = u_c[da/dt - ne],     (iii)

when households respond to the changes in a and t so as to maximize utility [à la (i)]. Any changes in a and t must satisfy the government's budget constraint, however, where a = tȳ - G/N. Thus,

da/dt = ȳ + t(dȳ/dt).     (iv)

Substituting (iv) into (iii) for the median ability voter (y_m = n_m e_m) gives:

du_m/dt = u_c[ȳ + t(dȳ/dt) - y_m].     (v)

Since the median voter sets t* = t_m*, we know du_m/dt = 0, or (ȳ - y_m) + t(dȳ/dt) = 0. As (ȳ - y_m) > 0 and as (dȳ/dt) < 0 (see Figure 3.20), t must be positive. If (ȳ - y_m) < 0, then t must be negative.
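The condition in footnote 89 can be solved numerically for a concrete economy. The sketch below re-uses the illustrative Cobb-Douglas economy assumed in the previous sketch (nothing here comes from the cited papers) and simply searches over t:

```python
# A sketch of the MRP-chosen tax rate: grid-search the median voter's
# preferred t, letting a adjust to balance the budget.
import numpy as np

rng = np.random.default_rng(1)
n = np.sort(rng.lognormal(mean=1.0, sigma=0.6, size=1001))  # abilities (assumed)
n_med, theta, G_per_capita = float(n[len(n) // 2]), 0.6, 0.1

def effort(t, a, ability):
    return np.clip(theta - (1 - theta) * a / ((1 - t) * ability), 0.0, 1.0)

def balanced_a(t):
    """Bisect on a: f(a) = t*ybar(a) - G/N - a is decreasing in a."""
    lo, hi = -G_per_capita - float(n.mean()), t * float(n.mean()) + 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        ybar = float(np.mean(n * effort(t, mid, n)))
        lo, hi = (mid, hi) if t * ybar - G_per_capita - mid > 0 else (lo, mid)
    return 0.5 * (lo + hi)

def median_utility(t):
    a = balanced_a(t)
    e = float(effort(t, a, n_med))
    c = (1 - t) * n_med * e + a
    return -np.inf if c <= 0 else theta * np.log(c) + (1 - theta) * np.log(1 - e)

ts = np.linspace(0.0, 0.9, 91)
t_star = float(ts[np.argmax([median_utility(t) for t in ts])])
print(f"median voter's preferred rate t* = {t_star:.2f}, a* = {balanced_a(t_star):+.3f}")
# Because the log-normal is skewed (mean ability > median ability), t* > 0,
# as the text predicts; raising G_per_capita eventually drives a* negative.
```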
They conclude that as leisure time becomes relatively more important in voter preferences - or, alternatively, as the elasticity of labor supply increases - the marginal tax rate t* and the level of the subsidy a* decline. Furthermore, when non-transfer government expenditure as a fraction of ȳ is less than 10 percent, the preferred tax structure is generally progressive (a* > 0). However, as G approaches 25 percent of ȳ, the MRP preferred a* is negative and the tax-transfer system becomes regressive. The logic behind these results is straightforward. Large government spending for services means that t* must be very high (generally > 0.4 in Romer's simulations) before a becomes positive. At high marginal tax rates we suffer large losses in average income as workers reduce labor effort (see Figure 3.18). These efficiency losses dominate the median voter's desire for redistribution. The final value of t* is too small to both pay for G and offer positive subsidies: thus, a* < 0. These results highlight an important tension in democratic societies between the twin goals of efficiency and equity. As government spending rises to correct market failures, the democratically chosen level of lump-sum subsidies declines and may become zero or negative. Seater (1983) finds evidence for this trade-off in the historical U.S. data. The present struggle in the U.S. Congress between a desire to increase defense spending and the political necessity to simultaneously reduce transfers is a concrete example of this tension.

These simple MRP models have made clear the potentially important role that agents' abilities and preferences play in setting a government's tax and transfer policies. Despite this important contribution, however, the MRP models are extremely limited in what they can tell us about redistribution. To ensure analytically stable outcomes, we have been forced to reduce our discussion of distribution policy to one dimension. In fact, in most societies redistribution policies are multi-dimensional. Nations assist their poor who are elderly, sick, permanently disabled, unemployed, homeless, or fatherless, and each group is helped at a different level of subsidy and taxed at a different rate on earned income. If we are to capture the essential features of such policies we must move beyond one policy dimension to an analysis of multi-dimensional policies. Recent cooperative bargaining models of the redistribution game give us some results.

Aumann and Kurz (1977, 1978) and Gardner (1981) have applied a new approach to the income redistribution game based upon the theory of n-person cooperative games. The implicit political process is a preference intensity process (PIP) where agents are allowed to communicate, to make side-payments, and to form coalitions. In their analysis the n-person game is first reduced to a two-party bargaining game between a winning or "taking" coalition (denoted as S) and a losing or "paying" coalition (-S). The two coalitions then resolve their differences in the redistribution game according to Nash's (1953) bargaining theory of two-player cooperative games. What makes the two-coalition game interesting is the fact that the "paying" coalition can destroy some or all of its initial resource endowments. It is this threat which induces the "taking" coalition to bargain and not just "take". Nash's theory predicts a unique outcome to the redistribution
Ch. 12: The "New" Political Economy
733
game.90 To apply Nash's bargaining theory, however, the analysis must first specify the value to the two coalitions and their members of playing the redistribution game. The value concept chosen is the Shapley value, based on the work of Shapley (1953) and applied in numerous studies of political power [see Riker and Ordeshook (1973, pp. 154-175) and Nagel (1975)]. The Shapley value measures the value to an individual agent as the expectation of his or her contribution to the worth of the agents preceding him or her in a random ordering of all the players in a coalition.91 The Aumann-Kurz analysis proves that the resulting solution to the redistribution game is that allocation of social resources which ("as if") maximizes the Shapley-Nash value of the game for all agents in society.92

The key assumptions which define the structure of the Aumann-Kurz redistribution game are: (i) majority rule determines redistribution policy - any coalition with more than half the agents controls taxes and transfers; (ii) agents are individually unimportant and can influence policy only through coalitions; (iii)
90 Nash's (1953) model of two-party bargaining with threats specifies that an acceptable solution must be (i) feasible, (ii) Pareto optimal, (iii) invariant with respect to the choice of utility scaling, (iv) independent of the names of the players, and (v) independent of irrelevant alternatives. Furthermore, (vi) when a player's choice of threat strategies is restricted while his opponent's choices remain unchanged, the restricted player cannot obtain a higher utility than he achieved when playing with the unrestricted set. Finally, (vii) the value to a player of his available threats should not be dependent upon the collective, or sequentially reinforcing, property of threats as he plays with his opponent. Nash then shows that the only solution to the bargaining game which satisfies these seven axioms is one which divides the pay-offs equally beyond the players' chosen threat points.
91 The Shapley value splits the value of a coalition amongst the individual members of the coalition according to

φ_i(v) = Σ_{S∋i} γ_n(S)[v(S) − v(S − {i})],
where φ_i(v) is the Shapley value for player i, v(S) is the aggregate value of coalition S with i as a member, and v(S − {i}) is the value of coalition S when i is not a member. Thus the term in square brackets measures player i's marginal contribution from joining coalition S. The term γ_n(S) is the probability that player i will in fact be the marginal (last) member to join coalition S when there are n players in the game: γ_n(S) = (s − 1)!(n − s)!/n!, where s is the size of S. Thus, the expression φ_i(v) is the expected value of the marginal contributions of player i to all coalitions of which he is a member. Shapley (1953) proves that the value φ_i(v) is the only value expression which satisfies the three requirements that (i) the value of a player should not depend on the label used to describe the player; (ii) the individual values shall partition the value of the whole game [Σ_{i=1}^n φ_i(v) = v(N)]; and (iii) the value of a game composed of two games shall be the sum of the separate values in the two component games.
92 There is also reason to believe that a stable, winning coalition in these zero-sum redistribution games will be of the minimum size necessary to remain a winning coalition; see Riker (1962), Riker and Ordeshook (1973, ch. 7) and Shepsle (1974).
agents can, if they wish, destroy part or all of their initial resource endowments (denoted e); (iv) agents are assumed to have von Neumann-Morgenstern utility for final resources (x), u(x), with the properties u'(x) > 0, u''(x) < 0, and ∞ > u(x) > 0 for x > 0, but u(0) = 0; and (v) total social resources are positive but finite and equal the sum of all the initial endowments of the many infinitesimally small agents (where dt denotes one small agent and T the set of all agents): ∞ > ∫_T e(t)dt > 0.

The game is played in two steps. First, agents maneuver to form majority (S) and minority coalitions (-S = T - S). Only two coalitions are likely in such games, since a majority coalition is needed for redistribution, and the minority coalition will not subdivide since anything a smaller coalition can achieve can also be reached by the larger minority coalition, and the larger minority coalition will often be able to do more. Once the coalitions have formed, they play a two-party cooperative bargaining game in which threats are allowed. The majority threatens to take all of the minority's endowments, while the minority threatens to destroy some or all of that endowment. The two coalitions adopt strategically preferred threat points which guarantee the winning coalition at least f*(S) in aggregate utility and the losing coalition at least g*(-S) in aggregate utility, where f*(S) > g*(-S). The Nash (1953) solution to a bargaining game with threats yields a compromise in which the aggregate utility which remains after the threat points are guaranteed is divided evenly between the two coalitions. The Shapley value is then computed for each player in each coalition to determine the individual agent's share of his coalition's aggregate utility.

The major conceptual difficulty in the analysis is defining utility values for coalitions in such a way that the aggregate utility can be redistributed amongst the players. Only dollars, not utility, can be re-allocated. Following re-allocations, it may be that the sum of the individual values exceeds the total value allocated to the coalition. The "adding-up" requirement is satisfied if each individual's utility is weighted by λ(t) = 1/u'(x(t)); then utility is additively transferable through transfers of income.93

93 See Aumann and Kurz (1977, pp. 1144-1145). The intuition behind this requirement is as follows. If the notion of aggregate value for a coalition is to have meaning for any coalition, it must be valid for a coalition of all players as well. The aggregate utility for the all-player coalition is given by
v_λ(T) = ∫_T λ(t)u_t[x(t)]dt,
where (t) is the "weight" on player t's utility which allows us to sum utilities. The maximum value of v(T) is defined by a value allocation x(t), where the weighted marginal utilities are equal across all agents. Thus, X(t)u'[x(t)] = k, where k is some constant. With no loss of generality, we can set k 1, which implies X(t) = 1/u'[x(t)]. The weighted value lost or gained with the transfer of dollars is therefore X(t)u'[x(t)]dx = l(dx), where dx is the change in income. Transferring dollars is therefore equivalent to transferring utility, when X(t) = 1/u[x(t)].
The result of this income redistribution game is a final, unique, value allocation which satisfies the formula [Aumann and Kurz (1977, p. 1145)]:

x(t) = e(t) − {λ(t)u_t[x(t)] − V̄},
where x(t) is the income finally received by player t, e(t) is the player's initial endowment, and the expression in curly brackets is player t's tax (if positive) or transfer (if negative). V̄ is a constant and equals the average value of the weighted utilities from the chosen allocation: V̄ = ∫_T λ(t)u_t[x(t)]dt. Agents whose weighted utility value of money - V_t = λ(t)u_t[x(t)] - is greater than the average (V̄) pay a tax. Taxpayers will be agents whose total utility is large and whose marginal utility, u'_t[x(t)], is small - that is, the richer people in society. Agents who receive a subsidy have a utility value of income (V_t) less than V̄ - that is, the poorer people in society. Redistribution is pro-poor in the Aumann-Kurz model.

What is most striking about the Aumann-Kurz result is the high marginal tax rate implied by the final value allocation. The rate exceeds 0.5 for all players. Total taxes are defined by e(t) - x(t). The marginal tax rate is therefore given by94
m(t) = d[e(t) − x(t)]/de(t) = 1 − 1/{2 + (−u''_t/u'_t)/(u'_t/u_t)},
from which we must conclude, since u''_t < 0, that 1 > m(t) > 0.5.
94 Calculation of the marginal tax rate follows from

d[e(t) − x(t)]/de(t) = 1 − dx(t)/de(t),     (i)

where dx(t)/de(t) is evaluated at the equilibrium value allocation, defined by

x(t) = e(t) − u_t[x(t)]/u'_t[x(t)] + V̄.     (ii)

As V̄ is a constant, differentiating (ii) with respect to e(t) gives

dx(t)/de(t) = 1 − [1 − u_t[x(t)]u''_t[x(t)]/(u'_t[x(t)])²]·(dx(t)/de(t)),     (iii)

which implies:

dx(t)/de(t) = 1/{2 − u_t[x(t)]u''_t[x(t)]/(u'_t[x(t)])²}.     (iv)

Substituting (iv) into (i) gives the definition of marginal tax rates in the text.
The definition of m(t) contains a further insight. The marginal tax rate rises as the value of the term (−u''_t/u'_t)/(u'_t/u_t) (= δ_t) in m(t) rises. The numerator of δ_t is the Arrow-Pratt measure of absolute risk aversion, while Aumann and Kurz (1977, p. 114) label the denominator player t's "boldness" in risking total ruin. Players who are averse to risk even on small bets and who are less "bold" when faced with total ruin will have higher values of δ_t and therefore larger values of m(t). The aggressive agent, one willing to risk everything, does better in the Aumann-Kurz redistribution game than one who plays timidly. This result marks an important extension over the non-cooperative, median voter results reviewed above. Not only is redistribution tied to differences in agents' incomes (or productive abilities), but now it is also related to differences in voter preferences and/or skills in playing the game.

As a descriptive model of real tax and transfer policies, however, the Aumann-Kurz framework fails; few nations have marginal tax rates which exceed 0.5 for all players. While marginal rates greater than 0.5 are observed in many welfare programs [see Hausman (1981)], most taxpayers are taxed at significantly lower rates. Aumann and Kurz (1977, p. 1559) conjecture that if voting power in their model is related to the initial wealth of the agent, then marginal tax rates tend to fall. In the extreme case where votes are distributed in direct proportion to initial wealth, there are no taxes or transfers at all. These results suggest the need for another dimension to the model of the income distribution game: the need to specify correctly the initial distribution of political power.

Gardner (1981) has extended the Aumann-Kurz analysis in just this direction. He introduces two new political elements which move the analysis away from Aumann-Kurz's simple majority rule model. First, Gardner postulates the existence of an agent with veto power over all redistributive schemes. This veto agent cannot ensure his preferred outcome (i.e. he is not a dictator), but he must be included in any winning coalition. Second, Gardner specifies that the winning coalition must attract not only the veto agent(s) but also some variable fraction (α) of the remaining agents. If we normalize the measure of all agents to be μ(T) = 2 and the measure of the veto agent(s) to be μ(0) = 1 (0 = veto agent), then a coalition S is winning if it includes enough members so that μ(S) > 1 + α, is indecisive if 1 + α > μ(S) > 1 − α, and is losing if 1 − α > μ(S). As α → 0, the veto agent comes closer and closer to being a dictator. As α → 1, the social choice process tends to unanimity.

Gardner's analysis follows Aumann-Kurz from this point, except that Gardner, having generalized the political structure, must place an additional restriction on agent preferences if he is to get results. The restriction is not trivial; the veto agent and some subset of other voters must be risk neutral (i.e. u_t[x(t)] = x(t) for t = 0 and for some subset t ∈ S'). In terms of our earlier discussion of the Aumann-Kurz results, we require the veto agent and some other voters to have no fear of ruin (δ_t → 0) when playing the redistribution game.
This is not an implausible restriction on the veto agent as nothing happens without his approval; whether other risk-neutral agents exist is an empirical question best tested by examining the plausibility of Gardner's results. Gardner (1981, pp. 360-362) shows that the equilibrium value allocation of this re-specified redistribution game with the power parameter a is given by
x(0) = e(0) + { ... }/(1 + 2a - a^2)

for the veto agent, and by

x(t) = e(t) - {(1 - a^2)λ(t)u_t[x(t)] - ( ... )V}/((1 + a^2)(1 + 2a - a^2))
for the remaining members of society. Endowments are again measured as e(t) and V is again the average value of the weighted utilities from the allocation. Note, first, that the veto agent is always a net recipient: x(0) > e(0). But as in Aumann and Kurz, poor agents who do not have veto power may also receive a net subsidy if the term within the curly brackets is negative. Second, as a → 1 and the political process tends to unanimity, x(0) → e(0) and x(t) → e(t). Not surprisingly, redistribution disappears as we require unanimous consent. Third, as a → 0 and we move to a dictatorship by the veto agent, the veto agent receives his or her maximum transfer and the remaining agents pay a tax almost equal to the utility value of their final allocation [λ(t)u_t(x(t))]. Redistribution is maximized in favor of the ruling coalition in this case. Finally, as with Aumann and Kurz, we can calculate the schedule of marginal tax rates for the agents outside the veto group:
m(t) ≡ d(e(t) - x(t))/de(t) = [(1 - a^2)/(1 + a^2)]{1 - u_t[x(t)]u_t''[x(t)]/(u_t'[x(t)])^2}/{2 - u_t[x(t)]u_t''[x(t)]/(u_t'[x(t)])^2}.
Now m(t) can be less than 0.5 depending on the value of a. Specifically, as a → 1, m(t) → 0. Conversely, as a → 0, the marginal tax rates tend to the Aumann-Kurz values. As the political power of the agents outside the veto group increases (a → 1), those agents are better able to withstand the redistributive pressure from the veto group itself. The work of Aumann, Kurz, and Gardner on the income redistribution game is promising. Of course, one can be critical. Allowing all agents to be risk averse would be an important extension of Gardner's analysis. There are important difficulties with the Shapley-Nash solution concept itself, but what
one might propose for an alternative is not obvious.95 Exactly how coalitions form is never specified, yet this is a central issue for the cooperative game approach to collective choice [Kramer (1972)]. Finally, and perhaps most importantly, both studies assume when specifying the final value allocation that redistribution involves no waste of social resources; total resources after redistribution equal total endowments before redistribution: ∫_T x(t)dt = ∫_T e(t)dt. While players threaten to destroy endowments when the game is played, the final contract is such that no resources are in fact sacrificed. In effect, all preferences and abilities are assumed to be revealed truthfully during the game so that a first-best tax and transfer policy can be specified and enforced. This is a very strong assumption. The extension of these PIP models to accommodate the incentive effects of taxation remains to be done.96 Conceptual and empirical analyses as to why societies redistribute resources have just begun. While the early work is promising, we are far from a compelling model of collective resource redistribution. The effects of agent preferences, abilities, and, most importantly, political institutions must be considered. Until such analysis is completed we must reserve final judgement as to the redistributive performance of alternative collective choice and market institutions.
3.5. Summary

The failure of market institutions to achieve an economically efficient or fair allocation of social resources has raised the possibility that a government might somehow do it better. Voluntary collective organizations where individuals are free to stay or go were seen to suffer from the same fundamental problems which plagued markets (Section 3.1). Individuals will not cooperate unless it is in their private interest to do so. A coercive government with the power to tax and spend is necessary if we are to achieve cooperative allocations. Yet we do not wish our

95 A review and critique of the Nash bargaining theory is found in Luce and Raiffa (1957, pp. 128-134; 140-142) or Cross (1969). A review and critique of the Shapley value can be found in Luce and Raiffa (1957, pp. 245-252) or Riker and Ordeshook (1973, pp. 154-163). Alternative solution concepts for n-person cooperative games are reviewed in Shubik (1982, pp. 336-367).
96 A beginning on this question can be found in Ordover and Schotter (1981). Their analysis does not address transfer policy but simply concentrates on the determination of tax rates. They show that in a model where the politician is effectively a monopoly agenda-setter and seeks to maximize his side-payments ("campaign contributions"), the offered tax structure will be the "best" tax structure available after allowing for the disincentives of taxation. The voters and the agenda-setter write an implicit contract - enforced by reputation - in which the politician promises the "best" tax structure in exchange for the excess of the dollar value to the voters of this "best" structure over the dollar value of the structure which an opponent might propose (in effect, the reversion level). The opposing proposal may be the status quo. If political competition is allowed, however, no voting equilibrium is generally possible and the "best" tax structure is no longer a stable outcome.
governments to be oppressive. Nor do we want collective decision-making to be too administratively cumbersome and expensive. Can we find, therefore, a government which respects individual preferences for collective activities, is administratively efficient, and, finally, is capable of finding economically efficient allocations of social resources? The answer, as shown by Arrow, is no (Section 3.2). We therefore stand at a crossroads, and we must choose. One route takes us to dictatorial, often unfair, but potentially efficient, allocation processes (Section 3.3). The other roads lead to democratic, potentially more equitable, but generally inefficient mechanisms (Section 3.4). The decision as to how to organize the allocation of social resources is a difficult one; there are no clear answers. Setting a preferred mix of market and governmental - voluntary and coercive - institutions requires a delicate balancing of pluses and minuses, for neither institutional form clearly dominates the other. Before beginning the task of weighing the relative virtues and vices of markets and governments, however, we need to examine some recent attempts to improve the efficiency performance of governments.
4. Public policies for the public sector

The very principle of constitutional government requires it to be assumed that political power will be abused to promote the particular interests of the holder; ... such is the natural tendency of things, to guard against which is the especial use of free institutions (J.S. Mill, Considerations on Representative Government, 1861).

Accepting the view that government performance can be analyzed systematically and that successful predictions as to future public choices can be made raises the next questions: Can we alter that performance and, if so, how? To political economists, governments are self-contained organizations making choices and allocating resources. Governments have their own structure and their own dynamics; one cannot simply tell a government to change. To alter performance, we must alter the structure of government. The central elements in that structure of public choice are the underlying preferences and endowments of the participating agents (voters, bureaucrats, elected officials), the technologies of service production, and the rules (including the rules to change the rules) of government decision-making. If tastes and endowments of agents are assumed fixed (the usual assumption), then the only structural elements which can be changed are production technologies and the rules of choice. It is not unfair, I think, to characterize the task of expanding the set of feasible outcomes, given choice rules, as the central mission of the policy economist/analyst. The task of the political economist/analyst, on the other hand, is to understand the effects of, and to
recommend changes in, the rules of public choice given the production technology. To do so we must understand how public organizations behave. That has been the central aim of the studies reviewed in Section 3 above. We must also think about taking the next step: changing the rules which govern public sector behavior. Section 4.1 discusses how we might address the problem of recommending changes in the rules of public choice. Section 4.2 reviews some recent studies which have proposed changes in the rules of fiscal choice to improve the efficiency performance of governments. Section 4.3 offers concluding remarks.
4.1. On designing policies for government reform

The design and implementation of public policies for public organizations must move through each of three steps: (1) the evaluation of the current performance of the organization against a set of normatively compelling criteria; (2) the behavioral description of the present public organization which allows us to predict organization performance and propose a remedy; and (3) the creation of a legitimate, extra-organization institution which can enforce a change in the public organization's environment and thereby alter its performance. The evaluation of government performance must proceed within the same normative system which leads us to propose its existence in the first place. For example, within the framework of utilitarianism, government as a coercive organization is desired because it permits transactions which increase the aggregate well-being of society as reflected in a social welfare function of the general form, W{U_1, ..., U_N}. Such a normative structure allows us to justify, and criticize, government performance on grounds of economic efficiency or distributive justice. Normative systems as different as those of Bentham or Rawls can be included within this framework; see Arrow (1973) or Alexander (1974). Alternatively, one might well deny the ethical validity of such "end-state", utilitarian evaluations of political performance and concentrate instead upon the "process" of collective actions. Nozick (1974) and Buchanan (1975) develop such normative positions, giving primary moral emphasis to notions of liberty. The social contract, including the appropriate role of government, is based upon the central notion that individuals should be allowed to develop their initial endowments to their fullest potential. Any activity, institution, or social process which discourages or prevents that development is to be prohibited. It is not necessary here to argue for or against any particular evaluative structure - utilitarian, libertarian, or an alternative - but only to understand that such structures have been proposed and that government performance can, and should, be judged against such criteria. Actual governments can be inefficient, unfair, or in violation of some basic human right. What if the performance of government is found wanting? The next task is to find out why government has "failed" and to seek a remedy. At this step, we turn
to the positive analyses of government behavior as reviewed in Section 3. Particularly important in the context of policy design is to understand how the technological and institutional environments influence the behavior of the government organization. Technology defines the aggregate volume of resources potentially available for government allocation as well as the marginal costs (including any excess burden costs) associated with turning private dollars into public activities. The institutional environment comprises the rules - the constitution - under which government allocations are decided. Included here are the voting rules which determine how choices will be made as well as any prohibitions against, or requirements to perform, explicit activities. Changes in the technological environment through grants-in-aid (or foreign aid) from one, "higher" level of government to another, "lower" level of government will influence the tax and spending decisions of the lower levels.97 New production technologies or management techniques will alter the relative prices of public activities and thereby influence government allocations.98 The institutional environment can also be changed. Voting rules can be altered, or governments may be prohibited from using certain taxes, from exceeding a spending limit, or from enforcing laws which discriminate by age, sex, race, or religious preference. Having (i) discovered a government "failure", and (ii) knowing the effects of changes in the technological or institutional environments on government performance, we can then (iii) alter those environments and possibly correct or reduce the observed failure. Crucial to this view of policy, however, is the assumption that we have a legitimate - i.e. accepted - extra-governmental institution through which these technological and institutional reforms can be effected. In a democratic society, two such extra-government institutions suggest themselves: the courts and citizen-initiated constitutional amendments. In most Western democracies the judicial system stands, at least in constitutional theory, as a body largely independent of the legislative and the administrative-executive branches of government. Whether this independence holds in fact is a disputed question. Citing as evidence the recent advances by the U.S. courts into social policy-making, Shapiro (1964) and, more recently, Horowitz (1977) have argued that the judiciary does provide an independent review of legislative activity. Law professor Alexander Bickel (1970, p. 134) wrote: "All too many federal judges have been induced to view themselves as holding roving commissions as problem-solvers, and as charged with a duty to act when majoritarian institutions do not." In these analyses the courts are seen to respond to a potentially different set of criteria than their legislative counterparts when evaluating and, in many instances, requiring policy initiatives. At least the recent U.S. legal history suggests an emphasis on equity over the goal of efficiency

97 See, for example, Inman (1978); for a general review of the grants-in-aid literature, see Inman (1979).
98 For example, Baumol (1967b) argues that technological improvements in the provision of private goods will eventually bankrupt the less technologically advanced public sector.
[Shapiro (1964), Horowitz (1977), Inman and Rubinfeld (1979)]. The fact that legal counsel for welfare and civil rights groups and for environmental and consumer groups prefer the judicial to the legislative route to reform is evidence on this point [Brill (1973)]. What is lacking in all of these studies, however, is a compelling institutional theory of how judicial decisions are made. Without such a theory one could well assert the alternative view and claim the courts do not act as an independent check on legislative performance. Landes and Posner (1975) take just such a position when they argue that the judiciary enforces rather than challenges the decisions of the legislature. The courts' enforcement of past legislative decisions is just what is required to give legislation the stability needed to ensure its passage in the first place. In the legislative market place only stable laws enforced by the courts promise sufficient long-term value to the various special interest groups to make the fight for passage worthwhile. In this role, the judiciary must be free of the wishes of the current legislature, who may wish to overturn those laws, yet subservient to the will of past legislatures. The use of lifetime appointments for judges, appointments generally made by prior legislatures, is seen by Landes and Posner as one means to ensure this historical obligation. Whether in fact the courts play a role independent of the legislative process therefore remains an open issue.99 For those who do observe an independent judiciary there are the further questions: How well do the courts perform their review of legislative activities, and do we want them to perform that function at all?100 Horowitz (1977, p. 298) for one is skeptical: "Over the long run, augmenting judicial capacity (for review) may erode the distinctive contribution the courts make to the social order. The danger is that courts, in developing the capacity to improve on the work of (legislative or administrative) institutions, may become altogether too much like them." If not the judicial system as an overseer of legislative action, then what? Citizen-initiated referenda have emerged in the United States as the most prominent alternative. The states of California, Massachusetts, and Michigan have all passed tax or spending limitations to restrict state budget growth. A constitutional amendment to require a balanced federal budget has also been discussed. Mueller (1973) has argued for just such a review process of legislative action. The legislature is viewed as responding to short-run vested interests, while

99 The empirical evidence of Landes and Posner (1975, appendix) on this question is not conclusive. Looking only at judicial judgments as to the constitutionality of a legislative act, Landes and Posner find that longer tenured judges are more likely to nullify old laws (contrary to expectations) and that older judges (holding judicial tenure fixed) are more likely to nullify more laws (consistent with expectations). The number of nullifications is very few, however; only 97 of the 38,000 laws passed by Congress from 1789 to 1972 have been overturned by the courts. What is needed is a careful look at how the courts monitor administrative agencies - how laws are enforced - for it is here that the judiciary is most often influential. See, for example, MacIntyre (1983).
100 For an analysis of the logic and consequences of judicially-imposed reforms on government fiscal choice, see Inman and Rubinfeld (1979).
citizens voting at the "constitutional stage" are assumed to consider the long-run implications of choices which may have no direct bearing on their own lives. The existing legislature responds to individual gains and losses, given present circumstances and tastes. Citizens and their representatives vote in their self-interest. Constitutional restrictions, however, are institutional reforms which will influence current and future legislative choices. Voters must necessarily take a long-term view of such decisions, but in the long run the position of each citizen is uncertain. Mueller argues (1973, p. 62) that in the face of such uncertainty voters may well adopt as a working hypothesis that they or their heirs can well be anyone: rich or poor, liberal or conservative. Individuals may then make their constitutional choices from a wider perspective than their own current self-interests - for example, by maximizing expected welfare assuming there is an equal probability of occupying any position in the future distribution of outcomes [à la Harsanyi (1955)]. The point is not that Mueller's precise construction of citizen oversight is correct [see Pattanaik (1968) or Sen (1970) for a critique of the Harsanyi logic], but that it is plausible to view citizens as stepping outside the current rush of political activity to evaluate and then to alter the flow of those events. The logic of such evaluation is prospective, forward-looking, and necessarily more inclusive of others' interests than would be a decision on current legislative activity; today's smokers can vote to ban smoking in public places, the rich can favor income redistribution, men can favor the extension of the franchise to women. To be effective, the institutional reforms must carry a measure of permanence, however; today's reform cannot be easily overruled tomorrow. And it may be necessary to postpone the enactment of reform, or to allow for "grandfathering", to ensure a more socially inclusive view of policy and a sense of legitimacy. Thus constitutional amendments or their repeal often require two-thirds majorities rather than simple majorities, and enactment may be postponed from two to five years after passage. In recent years there have been several efforts (notably in the United States) to improve the process of government decision-making. Each has used the constitutional or the judicial route to reform. In each case there has been an evaluation of government performance, a diagnosis of the problem based upon a behavioral model of that performance, and finally, a proposed remedy established and enforced by an extra-legislative institution. In the next section I review the logic behind one of these reforms: the use of constitutional amendments to improve the efficiency performance of government allocations.
4.2. An example: Constitutional limitations for government efficiency

The past ten years have seen the rise of the citizen-initiated referendum to impose constitutional limitations upon government budgetary activities. In June 1978 the voters of California in a statewide referendum approved Proposition 13, restricting
the rate of local property taxation to no more than 1 percent of 1975 market values. In effect, the voters elected to cap local government spending. Similar referenda with similar effects have been approved in Michigan (a limitation on state income taxation) and Massachusetts (a limitation on local taxation). Surveys of voters in each state following the approval of these constitutional limits to government spending indicated that the voters felt government was inefficiently too big.101 Reforms of bureaucratic and legislative behavior were required. The chosen means of limiting what was felt to be inefficient government performance was to constrain total spending through a constitutional amendment. While the initial reaction of public finance economists to such limitations was one of distrust - the reforms seemed excessively restrictive - there are in fact good economic arguments for their use. The economic problem which lies behind the perceived desirability of spending limitations is government inefficiency. The source of this potential inefficiency is the self-seeking, utility-maximizing behavior of the imperfectly controlled political agent. This politician-bureaucrat is assumed to have a well-defined preference relationship over personal income, y, and work effort, e, specified here as V(y, e) = u(y) + v(e); the marginal utility of income is non-negative and declining [u'(y) > 0, u''(y) < 0] and the marginal utility of effort is non-positive and declining [v'(e) ≤ 0, v''(e) < 0]. [...]

Ch. 13: Income Maintenance and Social Insurance

[...] Y*, but where we express poverty negatively, in that social welfare is decreasing in the poverty index, so that minus the head count is represented by the integral of -1 from 0 to Y*, with the discontinuity at Y*. This is indicated by the heavy line in Figure 2.1. [See Atkinson (1985b) for further details.]
obtained as c tends to zero. There are other possibilities, such as the class of measures suggested by Foster et al. (1984) and shown in Figure 2.1, based on some power of the (normalised) poverty gap. Sen, on the other hand, took an axiomatic approach and this has been followed in a number of subsequent articles. In the construction of the actual index that he proposes, the key axiom is that which rejects the equal weighting of poverty gaps implied by the poverty deficit, on the grounds that this is insensitive to the distribution of income amongst the poor, and proposes that the poverty gap be weighted by the person's rank in the ordering of the poor. The Sen assumptions lead to an index for the measurement of the extent of poverty which includes as one of its elements the Gini coefficient of the distribution of income among the poor. At the same time, he emphasises the "arbitrary" nature of the rank order assumption; and it is not surprising that the subsequent literature has sought to develop alternatives. Reference should be made to the work of Anand (1977, 1983), Blackorby and Donaldson (1980), Chakravarty (1981), Clark et al. (1981), Hamada and Takayama (1977), Kakwani (1981), Takayama (1979), and Thon (1979, 1981, 1983). A valuable survey is provided by Foster (1984). There is yet another approach that has not been much followed in the literature. This considers a class of poverty measures satisfying certain general properties and seeks conditions under which all members of that class will give the same ranking. It tries to establish common ground, so that we may be able to say that poverty has increased, or decreased, for all poverty measures in a class. In Atkinson (1985b) it is shown that a sufficient condition for poverty to have decreased for all poverty measures which are additively separable, non-decreasing and (weakly) concave functions of income is that the poverty gap be no greater taking all poverty lines from zero to the maximum in the range Y). The head count does not satisfy these conditions, but all the other measures shown in Figure 2.1 do satisfy them. So that if for 1950 the poverty gap curve, drawn taking each value of Y as a possible poverty line up to Y2*, lies everywhere below that for 1980, then we can conclude that poverty has increased. [This is in effect a second-degree stochastic dominance result, and related to that given in portfolio theory by Fishburn (1977)]. The implications of the concavity assumption, which embodies the Dalton transfer principle, are discussed in Atkinson (1985b). It may be noted that the result can be strengthened to include measures which are s-concave but not necessarily additively separable, such as the variation on the Sen index proposed by Thon (1979), where the weight attached to the poverty gaps is the ranking in the whole population and not just that among the poor. 2.3. Welfare economics and social welfare functions
2.3. Welfare economics and social welfare functions

The study of poverty over the last half century has proceeded largely independently from the main body of theoretical welfare economics. Whereas Pigou's
The Economics of Welfare contained seven chapters dealing with the poor and measures for the alleviation of poverty, such considerations received short shrift from the "new welfare economics" [see, for example, Little (1950) and Graaff (1957)]. In the present context, it is clearly necessary to bring together the two fields, since the public economics treatment of social insurance lies firmly in the welfare economic domain. In making this link, we can distinguish two main approaches. The first sees concern for poverty as stemming from individual interests or what Harsanyi (1955) calls a person's "subjective" preferences; that is, a person supports transfers to the poor because this makes him better off in some way. The second sees concern for poverty stemming from considerations of social welfare, or what Harsanyi calls "ethical" preferences; that is, the person supports transfers because that accords with his notion as to what constitutes a good society. The relation with individual interests has received attention in recent years in terms of "Pareto-improving" transfers, or voluntary redistribution, following the article by Hochman and Rodgers (1969). At the heart of the approach is the assumption that individual welfares are interdependent. The individual's utility is a function of his consumption and that of others, where we allow both for the possibility that it is the consumption of particular goods by others with which he is concerned and for the possibility that it is their utility level which enters his judgement. The first of these possibilities is illustrated by the case where he is concerned with how much others have to eat; the second by the case where he is concerned with their sense of well-being. These have rather different implications, the former leading to concern with "specific poverty" and the latter to assessment in terms of standards of living or even income. The utility interdependence justification for redistribution has been extensively explored [see, for instance, Hochman and Peterson (1974)]. Two points should, however, be noted. First, it has been taken for granted that the consumption of others does not enter negatively in the form of envy. Envy many people would want to ignore; as it was put by Sir Dennis Robertson, we should "call in the Archbishop of Canterbury to smack people over the head if they are stupid enough to allow the increased happiness... to be eroded by the gnawings of the green-eyed monster" [Robertson (1954, p. 677)]. Secondly, we have to consider whether the interdependence is symmetric. In the original example of Hochman and Rodgers (1969), Mutt on $10,000 a year is concerned about Jeff on $3,000, but for much of the analysis Jeff is not, at the margin, concerned with Mutt's income. The asymmetric case where the rich are concerned about the welfare of the poor, whereas the poor are not concerned about the welfare of the rich, is open to the objection that we are replacing the subjective evaluations of the poor by those of the rich. A number of treatments of interdependence, such as that of Arrow (1981), have, however, assumed symmetric utility functions.
The approach to redistribution via interdependent preferences takes on particular significance when we consider the relation between private and social transfers in Section 3. The same applies to the interpretation in terms of individual subjective preferences based on the "insurance" motive, where individuals favour redistribution on the grounds that they may themselves later be in need of assistance. This motive has been described by Buchanan and Tullock (1962, p. 193) as follows: "to prevent the possibility of his falling into dire poverty in some unpredictable periods in the future, the individual may consider collective organisation which will, effectively, force him to contribute real income during periods of relative affluence. Such collective redistribution of real income among individuals, viewed as the working out of this sort of 'income insurance' plan, may appear rational to the utility-maximizing individual." The notion of insurance is closely allied to the contractarian justification which has been advanced by several authors for the adoption of particular sets of ethical preferences. Harsanyi (1953, 1955) suggested that one's ethical preferences should be governed by the condition of "impersonality", such that "they indicate what social situation he would choose if he did not know what his personal position would be - but rather had an equal chance of obtaining any of the social positions" (1955, p. 315). The idea has been developed in a somewhat similar way by Buchanan and his colleagues [see, for example, Buchanan (1974) and Hochman (1983)]. They distinguish between the "constitutional choice" assumed to be made by people in the absence of knowledge about their own position in society, which corresponds to the ethical preferences, and the "post-constitutional choice" made by people in the light of full knowledge of their rights and resources. According to Buchanan (1974, p. 36): "at the stage of genuine 'constitutional' choice, it is not appropriate to include the standard arguments in individual utility functions, nor is it at all appropriate to introduce explicit utility interdependence. [Individuals] remain uncertain about their own income-wealth positions in subsequent periods during which the rules to be adopted will remain in force." Harsanyi [and Vickrey (1960)] argues that the condition of impersonality would lead to the resulting social welfare function taking the form of the sum of utilities, in this way providing a rationale for a utilitarian objective function. A quite different conclusion is reached by Rawls (1971), arguing on the basis of a similar hypothetical situation, which he describes as the "original position". He argues that in this original position rational individuals will agree to two principles. The first is that liberty has priority over other considerations, no quantity of material goods being sufficient to compensate for loss of liberty. The second is the principle that everything should be distributed equally unless
unequal distribution is to the benefit of the least advantaged. It is this second principle, that of "maxi-min", and its lexicographic extension, that has had most influence on economics. The mental experiment suggested by Rawls is the same as that used by Harsanyi, Vickrey, and others to justify utilitarianism, but Rawls reaches an apparently quite different conclusion. Indeed, he is concerned to emphasise the differences. Following Arrow (1973), we may view both approaches as special cases of the social welfare function:
Σ_i {[U_i]^(1-q) - 1}/(1 - q),    q ≥ 0,    (2.2)
where social welfare is assumed to be an iso-elastic function of individual welfares, the latter being denoted by U_i. If, as in the Vickrey-Harsanyi story, people assume that they may be any member of society with equal probability, then expected utility maximisation would lead them to adopt this social welfare function with q = 0. The concavity of the function U_i then takes account of risk aversion. On the other hand, the Rawlsian social welfare function may be obtained either by allowing U_i to express infinite risk aversion or by letting q tend to infinity. Rawls himself suggests that "the parties will surely be very considerably risk-averse" (1974, p. 143). But his argument goes wider than this, and he adduces several further reasons for adopting the maxi-min objective. The attractions of this objective include "sharpness", or the fact that "citizens generally should be able to understand it and have some confidence that it is realised" (1974, p. 143). The social welfare function (2.2) is formulated in terms of individual welfares. If the function is formulated in terms of individual incomes, Y, then we can derive an index of income inequality, like that described in Atkinson (1970). In empirical studies, such as those of the redistributive impact of the welfare state discussed in Section 4, it is commonly income which is the variable studied, and measures of income inequality are widely used. There is of course little agreement about the appropriate value of q [see Okun (1975) for the "leaky bucket" experiment which helps people think about the implications of different values]. For this reason, the less demanding requirement of non-intersecting Lorenz curves is often used. If one is unhappy [see Kolm (1976)] about the implicit assumption that the measure of inequality is invariant with respect to proportional shifts in income (the Lorenz curve remains the same if all incomes are doubled or halved), then it may be better to use the "generalised" Lorenz curve, where it is the cumulative total income, not the share in income, which is shown on the vertical axis [Shorrocks (1983)]. [For results that allow some conclusions to be drawn in cases where the Lorenz curves cut, see Atkinson (1973b) and Shorrocks (1984).]
because the distribution of income, or resources, enters individual utility functions. This notion has been described by Thurow (1971, p. 327): "individuals may simply want to live in societies with particular distributions of income and economic power. There may be no externalities; the individual is simply exercising an aesthetic taste for equality or inequality similar in nature to a taste for paintings." The same idea is discussed by Breit (1974), who suggests that individual utility functions may be treated as embodying an index of inequality such as the Gini coefficient.
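The link between the social welfare function (2.2) and summary inequality measures can be illustrated numerically. The sketch below (all income values invented, function names our own) computes the Atkinson (1970) index for several values of the aversion parameter, together with generalised Lorenz ordinates of the kind proposed by Shorrocks (1983):

```python
import numpy as np

def atkinson_index(y, q):
    """Atkinson (1970) index: one minus the equally-distributed-equivalent
    income as a fraction of the mean, with inequality aversion q."""
    y = np.asarray(y, dtype=float)
    if q == 1.0:
        ede = np.exp(np.log(y).mean())                  # geometric mean
    else:
        ede = np.mean(y ** (1.0 - q)) ** (1.0 / (1.0 - q))
    return 1.0 - ede / y.mean()

def generalised_lorenz(y):
    """Cumulative income per head at each rank: the vertical axis of the
    generalised Lorenz curve."""
    return np.cumsum(np.sort(np.asarray(y, dtype=float))) / len(y)

incomes = [5, 10, 20, 45]
for q in (0.5, 1.0, 2.0):                               # index rises with q
    print(f"q = {q:.1f}  A(q) = {atkinson_index(incomes, q):.3f}")
print("generalised Lorenz ordinates:", generalised_lorenz(incomes))
```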
2.4. Concluding comment

In several of the explanations of redistributional objectives, a key role is played by uncertainty. The contractarian theories posit a state of uncertainty as a mental experiment. The insurance model sees uncertainty as a real possibility that motivates individual choices. As it has been put by Dryzek and Goodin, people may act impartially "not because they are acting as if they were uncertain, but rather because they really are uncertain" [Dryzek and Goodin (forthcoming, p. 8)]. In more applied writing on the welfare state, too, the notion of uncertainty is prominent. Haveman (1985, p. 8) has recently argued, for example, that "the primary economic gain from the welfare state is the universal reduction in the uncertainty faced by individuals". In this context it is, however, important to distinguish between "risk" and "uncertainty". Industrial accidents may be a risk, whose probability and cost may be assessed by people with reasonable accuracy, and for which insurance may be provided on an actuarial basis. On the other hand, fears that there will be widespread recession, or a breakdown of family support for the aged, are not easily translated into the expected utility calculus. They are sources of uncertainty for which insurance in a narrow sense may be inapplicable. What is required is mutual support of a kind described by the French term "solidarity" or by Beveridge's phrase that "men should stand together with their fellows" [Beveridge (1942, p. 13)].
3. The design of state income maintenance
3.1. A simple model

What is the role of state income redistribution? How does it relate to private philanthropy and to private insurance possibilities? What form should it take? To explore these questions, I use a highly simplified model, based on strong assumptions, particularly those about behaviour. It should however be stressed that the model does include behavioural responses. One of the major advances in
recent years in the analysis of the design of income maintenance, and public policy more generally, has been the explicit treatment of the effect of taxes and benefits on family behaviour. The "fixed cake" model which featured prominently in many discussions of distributional justice has been replaced by one in which labour supply, or other decisions, may affect the size of the cake. Indeed, this happened before supply side considerations became fashionable elsewhere in economics. The model used is similar in key respects to that investigated in the literature on the design of income taxation, notably following the article by Mirrlees (1971), which considered the optimum income tax structure when people varied their labour supply in response to changes in the tax schedule. His treatment concerned the working, or potentially working, population, but allowed for the possibility that the after tax income would exceed before tax income (a negative income tax) and that at low levels of wages people might choose zero hours of work, living solely on the guaranteed minimum income. The income transfer aspect was dealt with independently by Zeckhauser (1968, 1971). He distinguished two main groups, characterised as the representative citizen and the representative poor. The latter were assumed to vary their labour supply in response to the transfer. The simplified model combines elements of both those outlined above. In contrast to the Mirrlees formulation, two different groups are distinguished, but unlike the Zeckhauser model the second group is assumed to consist of people unable to work. [A two group example is also used in numerical calculations by Cooter and Helpman (1974).] The assumptions made concerning the first group are similar to those of Mirrlees. They work a fraction L of the day, consume C of a consumption good which is taken as the numeraire, and are identical in all respects except their wage rate per hour, w, which is distributed with density f(w) over the range from W to infinity, such that the integral of f(w) over that range is equal to unity. Their utility functions are identical and of the form:

U(C, L) = log(C + C*) + b log(1 - L),    (3.1)
where b and C* are constants and are assumed strictly positive. The assumption about the sign of C* has implications for the behavioural responses, as discussed below, and it should be stressed that it is made purely for purposes of illustration, not for its empirical relevance. The first term in (3.1) is the utility derived from consumption and the second is that from leisure [a fraction (1 - L) of the day being available for leisure]. The budget constraint in the absence of taxes is given by C = wL, there being no non-labour (lump-sum) income. The indirect utility function, denoted by V(w, 0), gives the level of utility attained as a function of the wage rate and lump-sum income (in this case zero).
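As a check on the algebra that follows, the household problem can be solved numerically. The sketch below maximises (3.1) on a grid, with illustrative parameter values of our own choosing, and compares the result with the closed form that appears as equation (3.2) below, evaluated at t = 0:

```python
import numpy as np

# Household problem under (3.1): choose L to maximise
#   log(C + C*) + b*log(1 - L),  with C = wL (no taxes yet).
# Parameter values are illustrative assumptions.
b, C_star, w = 1.0, 0.25, 1.0

L_grid = np.linspace(0.0, 0.999, 100_000)
U = np.log(w * L_grid + C_star) + b * np.log(1.0 - L_grid)
L_hat = L_grid[np.argmax(U)]
L_closed = (1.0 - b * C_star / w) / (1.0 + b)   # equation (3.2) with t = 0
print(f"grid optimum L = {L_hat:.4f}, closed form L = {L_closed:.4f}")
```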
The second group consists of those who are unable to work and who have zero income in the absence of transfers. This group is assumed to have the same utility function (3.1), where the positive value of C* implies that their welfare level has a finite lower bound. The level of indirect utility is given by v(R), where R is the transfer received. They are referred to as group A (for aged); the working population is referred to as group B. The ratio of the number in group A to that in group B - the dependency ratio - is denoted by D. Suppose that we consider first the traditional role of income maintenance: that of providing relief for those who would otherwise have no income. The government levies a proportional tax on the earnings of group B at rate t, with the proceeds being transferred as an equal cash payment to group A. This means that group B is faced with a budget constraint C = w(1 - t)L, and the utility-maximising labour supply function may be calculated to be

L = [1 - bC*/w(1 - t)]/(1 + b),    (3.2)
where

t < 1 - bC*/W    (3.3)
(ensuring that labour supply remains strictly positive even for the person with the lowest wage, this being W). Where (3.3) holds, the revenue collected per head of group A is given by

R = t[w̄ - bC*/(1 - t)]/[D(1 + b)],    (3.4)
where w̄ denotes the average level of the wage rate. We can trace out a transfer possibility locus, as in Figure 3.1. The extent of the transfer will be subject to several conditions, including (3.3). If these do not intervene, then the attainable transfer reaches a maximum at the value given by [from (3.4)]

(1 - t)^2 = bC*/w̄.    (3.5)
From the expression for the labour supply function [equation (3.2)], we can see that the supply of labour in this case is a declining function of the tax rate where C* is positive. Whether this is so in the real world is of course an important empirical question (see Chapter 4 by Hausman in Volume I of this Handbook). As the literature on optimum taxation has shown, one of the ingredients in the determination of the level of taxation and transfer is the elasticity of labour supply. [See, for example, Atkinson and Stiglitz (1980, lecture 13). The model used here differs in that the tax is proportional without an exemption level, with the transfer only being made to those not in the labour force.]
Figure 3.1. Transfer possibility locus. [The locus plots the feasible transfer against the tax rate, for the basic model and for the case where people in group B can opt to join A.]
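The locus in Figure 3.1 is easy to reproduce numerically. Under illustrative parameter values of our own (chosen, with a lowest wage of W = 0.5, so that condition (3.3) holds over the grid), the revenue function (3.4) rises and then falls, peaking at the tax rate given by (3.5):

```python
import numpy as np

# Transfer possibility locus, equation (3.4), with illustrative parameters
# (assuming a lowest wage W = 0.5, condition (3.3) requires t < 0.68 here).
b, C_star, w_bar, D = 1.0, 0.16, 1.0, 0.2

def transfer(t):
    """Revenue per head of group A at tax rate t, equation (3.4)."""
    return t * (w_bar - b * C_star / (1.0 - t)) / (D * (1.0 + b))

for t in np.linspace(0.05, 0.65, 13):
    print(f"t = {t:.2f}  R = {transfer(t):.3f}")
t_star = 1.0 - np.sqrt(b * C_star / w_bar)   # maximising rate, equation (3.5)
print(f"peak at t* = {t_star:.2f}, R* = {transfer(t_star):.3f}")
```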
The limiting form of disincentive is when a person withdraws totally from the labour force. The assumption (3.3) precludes labour supply from falling to zero, but we need to consider more generally the relation between the two groups A and B. If L were to fall to zero, then the resulting level of utility for group B would be lower than that achieved by group A, given that the latter receive a positive transfer. This raises issues of fairness and incentives. If it were open to the low paid worker to opt out of the labour force and join group A (for example, by retiring early), then we should have to allow for the possibility that increases in the transfer and in t would lead to an increase in the size of the dependent population. (In this respect, the model would be closer to that discussed in the optimum income tax literature.) The working population would then become those with wage rates in excess of w*, where w* is defined by

log(R + C*) = log[(w*(1 - t) + C*)/(b + 1)] + b log[(1 + C*/w*(1 - t))/(1 + 1/b)].    (3.6)
The left-hand side is the level of utility achieved in group A; the right-hand side is that achieved in group B with a wage w* [calculated by substituting from (3.2) and the budget constraint]. This condition is illustrated in Figure 3.2, where the indifference curve indicating the choice of hours on the budget line AB is such as to pass through the point where L = 0 and the transfer is R. When w* > W, then the possible transfer is reduced, as illustrated by the dashed line in Figure 3.1. The extent to which social security leads to withdrawal from the labour force will be discussed in Section 5.
Figure 3.2. Person indifferent between work and categorical benefit.
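Condition (3.6) defines the reservation wage w* only implicitly, but it is straightforward to solve by bisection. In the sketch below (the parameter values and the transfer level R are invented for illustration), workers with w below the computed w* would prefer the categorical benefit:

```python
import math

# Reservation wage w* from equation (3.6): the wage at which a worker is
# indifferent between working at net wage w*(1 - t) and taking the
# categorical transfer R. All parameter values are illustrative assumptions.
b, C_star, t, R = 1.0, 0.25, 0.3, 0.15

def utility_gap(w):
    lhs = math.log(R + C_star)                          # utility in group A
    rhs = (math.log((w * (1 - t) + C_star) / (b + 1))   # utility in group B
           + b * math.log((1 + C_star / (w * (1 - t))) / (1 + 1 / b)))
    return lhs - rhs

lo, hi = 0.2, 5.0        # bracket chosen so the relevant root is interior
for _ in range(60):      # simple bisection
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if utility_gap(mid) > 0 else (lo, mid)
print(f"reservation wage w* = {0.5 * (lo + hi):.3f}")
```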
It is assumed above that the option of withdrawing from the labour force is one freely open to the worker, and this is indeed the assumption made in the optimum tax literature. In practice, group A is typically defined by certain conditions; that is, the transfer is a categorical one. In the case of old age, the test is a relatively straightforward one (and presumably only a minority of the working population would qualify). But in the case of disability or unemployment, the application of the criteria is less straightforward, and it is possible that some degree of discretion exists. Nonetheless, it is important to remember that the receipt of transfers such as unemployment insurance is in practice subject to constraints: benefit claims may be rejected on the grounds that the administrators do not consider the person "genuinely" qualified. We have set out above a menu of choices open to the government, in the context of a highly simplified model. In this respect, it is similar to the analysis by Diamond (1968) of the negative income tax. Using a function like (3.1), with b = 1 and C* = 0, he calculated how the possible level of transfer depended on the rate of withdrawal of benefit (the marginal tax rate), showing how the feasible transfer fell below that estimated ignoring labour supply responses (in this case the responses of the recipients).
3.2. Income support - individualistic preferences

The choice from the menu described above will evidently depend on the objectives pursued, and these may take different forms, as discussed in Section 2. First,
we consider the case where income maintenance is justified on an individualistic basis, according to the preferences of the working population for redistribution. As it was put by Zeckhauser (1971, p. 324): "poor and non-poor alike seek to institute mechanisms that give something to the poor beyond what they get from the natural workings of the 'competitive' system. That the non-poor share in this desire indicates that an externality is generated when the economic welfare of the poor is improved." This leads us directly to ask why state redistribution is required. If the interdependence of individual utilities is such that individuals would like to see transfers carried out, why is private charity not sufficient? In order to examine this question, let us suppose that, associated with the N members of the dependent group A, there are N/D workers who regard the support of the dependent as their shared responsibility. Their preference for redistribution may be represented by extending the utility function of group B to include the level of welfare per person in group A, "discounted" by a factor r (where this is a constant assumed to lie between zero and unity):

log(C + C*) + b log(1 - L) + r log(R + C*).    (3.7)
Within this framework, we may identify two rather different concerns. The first is for what we may call "family" redistribution, where a member of the working population is, say, concerned with the welfare of his or her parents. Such altruism within the family has been studied by Becker (1974, 1981), and the intertemporal aspects, particularly as they affect bequests, have been emphasised by Barro (1974, 1978). In the Becker analysis of the family, there is in general one donor, whose transfer then determines the living standard of the recipient. For example, with one donor and one recipient, the level of utility-maximising private transfer, denoted by S, is given by [where (3.3) is assumed to hold]

S = [rw - (1 + b - r)C*]/(1 + r + b),    (3.8)
where this is non-negative; otherwise S = 0. From examination of the corner condition for S = 0, it may be seen that, holding b and C* constant, it is those with too low wages, or insufficient concern (too low r) who do not make transfers. As Barro has stressed, where such private transfers are being made, the introduction of state transfers will simply lead to a replacement of private transfers (this is leaving out of account any distortionary effect from the financing of the transfer). If the donor is at a corner, so that S = 0, then the introduction of state transfers will have an effect, but it will not be preferred by the donor (although it evidently will be preferred by the recipient). The family model takes no account of the real-world fact that some people have no descendants, so that no provision would be made for them. This brings
us to the second interpretation of the model, which is where people have a more general concern for the aged or other deserving groups. In this "large group" model, assumptions have to be made about the perceived relation between the individual transfer and the level of income received by group A. The most common assumption is that each individual takes the contributions of others as fixed, the "Nash assumption":

R = [S + (N/D - 1)S̄]/N,    (3.9)
where S̄ denotes the average transfer by the others. In other words, the individual assumes that the total divided among the N dependents will be his transfer S plus S̄ from each of the remaining (N/D - 1) workers who are contributing to their support. An equilibrium is characterised by a set of transfers such that each person is happy with the amount of his donation given the transfers of others. This model of private redistribution is similar to that treated by Arrow (1981), Collard (1978), and Sugden (1982, 1984). The main conclusion drawn from this model is that the private level of transfers will often be inefficient in a Pareto sense, the reason being that such transfers have a public good aspect: "the reason why the charity game so rarely leads to a Pareto-optimal allocation is that the income of others is a public good. If there are two or more givers, then each would benefit by the other's giving; but the charity game does not permit mutually advantageous arrangements" [Arrow (1981, p. 287)]. These large group models of income transfers have provided formal underpinning for the conclusion reached long ago that private charity on its own is unlikely to be adequate. As was noted by Friedman in Capitalism and Freedom: "it can be argued that private charity is insufficient because the benefits from it accrue to people other than those who make the gifts ... we might all of us be willing to contribute to the relief of poverty, provided everyone else did. We might not be willing to contribute the same amount without such assurance" [Friedman (1962, p. 191)]. The scope for state intervention to overcome this problem has been extensively discussed in the series of papers on "Pareto-optimal redistribution", including Hochman and Rodgers (1969), Goldfarb (1970), and von Furstenberg and Mueller (1971). The form of such intervention is however important. As noted by Warr (1982), the lump-sum taxes on donors discussed by these authors could simply lead to the replacement of private transfers by public transfers dollar for dollar (where donors are not at a corner solution), just as in the case of family redistribution. He argues that an additional net transfer may be achieved by measures which provide an increased incentive to donate at the margin. This could take the form of a subsidy to charitable giving, and this is indeed provided
in many countries through income tax deductions. The theoretical arguments for such deductions are examined in Feldstein (1980b) and Atkinson (1976). (It should be noted that it is not just the aggregate level of transfers which is of concern, but also its distribution. Tax relief may be an indiscriminate policy, not distinguishing, for example, between different types of charitable body.) The provision of state income maintenance goes beyond the subsidy of private charity in that it provides direct income support. As noted above, this may have the effect of reducing the extent of private transfers, but there are several reasons why it may be adopted, including the possibility that state income support has lower administrative costs, particularly where there are economies of scale, and that state income support involves redistribution between different groups. As we have seen in Section 2, such redistribution may be desired even on an individualistic basis for reasons of insurance. Suppose that we consider first the position of people uncertain whether they are going to be workers with wage w or in group A with zero wage, attaching a probability P to the first and (1 - P) to the second event. In such a situation they may be concerned with the expected utility, which may be written in the same form as (3.7), where r = (1 - P)/P. Even without the workers attaching any weight ex post to the welfare of the disadvantaged group A, the ex ante objective function would be the same as before, and transfers would raise the expected utility of all. Why cannot private insurance provide for such contingencies? Recent theoretical work on insurance markets has revealed the problems which may arise even in the absence of considerations such as the fixed costs of administration (favouring a single state insurance scheme covering multiple contingencies): for example, Rothschild and Stiglitz (1976), and C. Wilson (1977). The first difficulty is that the risk of being in group A typically varies across the population, and while the individual may know P, this information may not be available to (may be concealed from) the insurer. If such differences are not observed, then there will be a problem of "adverse selection". If the private company offers insurance at a premium based on the expected claims for the whole population, then it will find that those with low risks will tend not to take out insurance and that its clientele will be weighted towards the higher risks, rendering it unprofitable. Secondly, and related, is the fact that people may place different weight on the value of insurance. We would therefore expect to see a menu of policies, but as Spence (1978, p. 428) has described:
803
Ch. 13: Income Maintenance and Social Insurance
Thirdly, there is the "moral hazard" problem that the probability P may be influenced by decisions by the individual which cannot be monitored by the insurance company. The most obvious example would be voluntary unemployment.

Social insurance may be able to provide options which cannot be sustained by private markets, and so improve on the private outcome. However, as noted by Spence (1978), there are two important qualifications: the scope for improvement may depend on the prohibition of private insurance, since otherwise the latter would cream off all but the highest cost groups, and the social insurance scheme may provide insufficient diversity of options.

Social insurance also redistributes between people with different wages, and here we are entering territory which private insurance is unlikely to cover. In its extreme form, such insurance would have to provide for the person completely uncertain about earnings possibilities. This would be analogous to the hypothetical situations envisaged in the contractarian theories of ethical preference, to which I now turn.
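Before doing so, the adverse-selection mechanism described above can be made concrete with a minimal numerical sketch. All parameter values below are my own illustrative choices, not figures from the text: two risk types face the same possible income loss, and an insurer offers full cover at a premium based on expected claims for the whole population.

# Adverse selection sketch: full cover priced at the population-average
# expected claim. All numbers are illustrative assumptions.
import math

income, loss = 100.0, 50.0
p_low, p_high = 0.1, 0.4       # loss probabilities of the two types
share_low = 0.5                # population share of low-risk individuals

def expected_utility(p, premium=None):
    """Expected log utility, either uninsured or with full cover."""
    if premium is None:                        # no insurance
        return (1 - p) * math.log(income) + p * math.log(income - loss)
    return math.log(income - premium)          # full cover: certain outcome

pooled_premium = (share_low * p_low + (1 - share_low) * p_high) * loss
for name, p in [("low risk", p_low), ("high risk", p_high)]:
    buys = expected_utility(p, pooled_premium) > expected_utility(p)
    print(f"{name}: buys pooled cover? {buys}")
# With these numbers the low risks decline while the high risks accept,
# so the clientele is weighted towards the higher risks and the pooled
# premium no longer covers expected claims: the spiral described above.

With these numbers the pooled contract attracts only the high risks, which is the negative externality Spence describes.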
3.3. Ethical preferences

The utilitarian function represents the longest-standing approach to the assessment of social welfare, having influenced, explicitly or implicitly, much of thinking about social policy for the past century and a half. As we have seen, it may be represented in terms of a person having an equal chance of being anyone in the population, so that expected welfare is

[1/(1 + D)] ∫ V[w(1 - t), 0] dF(w) + [D/(1 + D)] v(R),    (3.10)
where D is the ratio of the number in group A to that in group B, and R denotes the transfer received per head by those in group A. The implications of this objective for the design of income support in our simple model may be seen by differentiating equation (3.10) with respect to t and obtaining the first-order condition [on the assumption that neither the constraint (3.3) nor that embodied in (3.6) is binding]:

-∫ V_M[w(1 - t), 0] wL dF(w) + D v_M(R) dR/dt = 0,    (3.11)
where we have multiplied by (1 + D) and used the fact that the derivative of the indirect utility function with respect to the tax rate is equal to (-wL) times the marginal utility of income, which is denoted by V_M, with a corresponding expression v_M for group A. This condition may be re-written as

D dR/dt = ∫ {V_M[w(1 - t), 0]/v_M(R)} wL dF(w).    (3.12)
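Before turning to the interpretation, condition (3.12) can be checked numerically. The following minimal sketch rests on illustrative assumptions that are mine rather than the chapter's: labour supply is fixed at L = 1 (so revenue is t times average wages and dR/dt does not fall with t), the marginal utility of income is c^(-2), and wages are lognormal.

# Solve the first-order condition (3.12) numerically under illustrative
# assumptions: L = 1 (fixed), marginal utility of income c**(-2), wages
# lognormal. Budget: D*R = t*E[w], so dR/dt = E[w]/D with fixed labour.
import numpy as np

rng = np.random.default_rng(0)
w = rng.lognormal(mean=0.0, sigma=0.5, size=200_000)   # wage draws
D = 0.25                                               # dependents per worker

def foc_gap(t):
    """Left minus right side of (3.12) at tax rate t."""
    R = t * w.mean() / D                     # transfer per head of group A
    lhs = D * (w.mean() / D)                 # D * dR/dt
    rhs = np.mean((R**2 / (w * (1 - t))**2) * w)   # {V_M/v_M} weighted by wL
    return lhs - rhs

# Bisection on t in (0, 1): foc_gap falls from positive to negative.
lo, hi = 1e-6, 0.999
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if foc_gap(mid) > 0 else (lo, mid)
print(f"optimal tax rate ~ {0.5*(lo+hi):.3f}")

With these parameters the condition picks out an interior tax rate (about 18 percent here); a faster-declining marginal utility shifts the weights in the curly brackets and raises the optimal transfer, as now discussed.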
The transfer is not therefore carried to the point where the marginal increase, dR/dt, falls to zero, but stops short to an extent which depends on the relative marginal utilities of income of group A and of the working population, the latter weighted by their earnings. The magnitude of the right-hand side depends on the form of the marginal utility of income function and on the distribution of earnings. The way in which the distribution enters the calculation is illustrated by the examples given in Atkinson (1973c) and Deaton (1983).

When we move from the strictly Benthamite sum of utilities case to consider more general social welfare functions, we have to allow for the effect of a greater degree of aversion to inequality, i.e. that the social marginal valuation of income declines more rapidly with income. From (3.12) we can see that the weights in the curly brackets are subject to two influences. To the extent that the working population receive less weight relative to group A, this reduces the right-hand side and, other things equal, leads the transfer to be higher. To the extent that those with high earnings (wL) within the working population receive less weight, it will further reduce the right-hand side. The limiting case of greater inequality aversion is the Rawlsian objective function, and the solution in this case is given by setting dR/dt = 0, since the working population, being better off than group A, receive no weight in this case.

The model considered so far has focused on transfers to those out of the labour force. A number of authors have taken the case where there are two groups both in the labour force, in order to explore the extent to which such transfers should be income related. Let us now assume that group A are at work, with a wage rate w_A, and that group B are all identical, with wage rate w_B, where w_A < w_B.
The simplest form of income support would be that provided by a social dividend, or credit income tax, scheme, where everyone receives a payment R and all income is taxed at rate t. This is identical to the linear income tax which has been treated in the optimum income tax literature, for example, by Sheshinski (1972), Atkinson (1973c), Feldstein (1973), and Stern (1976). The social dividend has been contrasted with more complex negative income tax schemes which have two, or more, marginal tax rates. The tax may be levied first at rate t_1 and then at t_2; and typically it has been suggested that t_1 > t_2, with the aim of allowing the guaranteed income to be kept up while at the same time preventing the benefit from spilling over too far to those higher up the income scale. The version illustrated in Figure 3.3 by a dashed line has been described as a "fully integrated" negative income tax, which means that the kink occurs on the 45° line. In other words, the break-even point is also that at which the
marginal tax rate changes. Alternatively, we may have an overlapping system, where the kink occurs after the break-even point, as shown in Figure 3.3 by the dotted line. The different types of negative income tax schemes are described in Green (1967).

Figure 3.3. Negative income tax schedules (net income plotted against gross income).

The choice between a social dividend and a two-tier negative income tax is examined by Kesselman and Garfinkel (1978) in the context of the two-group model. They point out that an unconstrained choice of the two tax rates can always improve on a single rate, making use of the result obtained by Sadka (1976) and others that the marginal tax rate at the top can be set to zero, with the kink at the point chosen under the flat-rate tax, and hence raise the welfare of the better-off group without reducing the revenue available to finance the transfer to the poorer group. Kesselman and Garfinkel restrict their attention to cases where the kink occurs above (or at) the break-even point and strictly below the point chosen by the better-off group. They compare different structures, holding the utility level of the poor constant, and examining the implications for the better-off group.

The comparison of a fully integrated negative income tax (line GXR) with a social dividend (line SD) is shown in Figure 3.4. The indifference curves are obtained from those between consumption (which is equal to net income) and leisure, gross income being a decreasing linear function of leisure for a given wage rate. The latter means that the indifference curves of the rich and poor have different slopes. The fact that the tax rate t_1 on the segment GX is higher
than t_2 on the segment XR is taken by the authors to indicate that there is income testing (this is discussed further below). The poor group are in both cases on the same indifference curve; but the better off are on a higher indifference curve with the social dividend. This does not necessarily happen, but is possible, as Kesselman and Garfinkel point out, because the social dividend requires a smaller transfer to achieve that specified level of utility.

Figure 3.4. Comparison of social dividend and integrated negative income tax.

The authors consider how the comparison depends on labour supply elasticities, examining the implications of empirical estimates, and introduce administrative factors. They conclude that "contrary to the conventional wisdom, the universal payment, flat-marginal-rate form may be more efficient than the income-tested form" [Kesselman and Garfinkel (1978, p. 211)].

The comparison described above did not consider the optimal two-tier structure; rather, the aim was to show that a specified negative income tax need not necessarily be preferable to a single rate structure. The analysis has been taken further by Sadka et al. (1982), who introduce an explicit social welfare function, with differing degrees of inequality aversion, and calculate the optimum two-tier structure with the restriction that it be of the fully integrated form. They also extend the model to five groups of workers. They conclude that the results "are broadly consistent" [Sadka et al. (1982, p. 309)] with those of Kesselman and Garfinkel. They note that higher elasticities are less favourable to the social dividend, but point out that the welfare losses are not very great if one fails to adopt a non-income-tested regime when it would be optimal.
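The schedules being compared are simple piecewise-linear functions, and it may help to write them down explicitly. The following minimal sketch uses parameter values invented for illustration; it encodes the social dividend and a fully integrated two-tier negative income tax of the kind described above.

# Net income schedules: social dividend versus a fully integrated
# two-tier negative income tax. Parameter values are illustrative.

def social_dividend(y, R=40.0, t=0.35):
    """Everyone receives R; all gross income y is taxed at rate t."""
    return R + (1 - t) * y

def two_tier_nit(y, G=40.0, t1=0.5, t2=0.25):
    """Guarantee G withdrawn at rate t1 up to the break-even point
    y* = G/t1 (fully integrated: the kink lies on the 45-degree line);
    beyond y*, additional income is taxed at the lower rate t2."""
    y_star = G / t1
    if y <= y_star:
        return G + (1 - t1) * y
    return y_star + (1 - t2) * (y - y_star)

for y in (0.0, 40.0, 80.0, 120.0, 160.0):
    print(y, round(social_dividend(y), 1), round(two_tier_nit(y), 1))

At the break-even point y* = G/t_1 net income equals gross income; below it the guarantee is withdrawn at the higher rate t_1, and above it income is taxed at t_2, which is what generates the different marginal rates and revenue costs compared in the studies above.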
The choice between income-tested and other programmes raises a number of issues not encompassed in the model we have been considering. These include the stigma associated with the income-tested benefits, to which I return later, and the administrative cost.
3.4. Social insurance and intertemporal allocation

Up to this juncture the analysis has been essentially static. The group A have been referred to as the aged, but no account has been taken of the fact that they were alive at an earlier period. We have not therefore treated the - evidently important - intertemporal aspects of social insurance. In order to investigate these, let us go back to the formulation of Section 3.2, where the individual's concern for the income of group A now becomes concern for his own income in the second period of his life. (We work throughout with a two-period model, which is clearly a gross simplification, but the main theoretical points may be seen.) The difference is now that the transfer S is one which the person himself makes from an earlier period. If the interest rate is denoted by i, and it is assumed that the person can borrow or lend as much as desired at that interest rate, then we have the intertemporal budget constraint, giving consumption in the second period, R:

[w(1 - t)L - C](1 + i) + T = R,    (3.13)
where T denotes the state transfer in old age and t is the rate of contributions levied on wage-earners. The square bracket on the left-hand side denotes savings. Unlike the earlier formulation, there are no public good aspects, since each person is making provision for his or her own old age. This again raises the question as to why state intervention is necessary.

We must, however, start by observing that the transition to a dynamic economy has affected the welfare-economic basis of the argument. In a static economy, with a full set of markets, and full information, the private market allocation is Pareto efficient. In an economy with an infinite sequence of overlapping generations, there is no guarantee that the no-intervention allocation will necessarily be efficient. As illustrated by the examples of Diamond (1965) and Starrett (1972), a competitive economy may be pursuing a path on which the capital accumulation is "excessive" and all generations could be made better off by intervention. We may have "Phelps-Koopmans inefficiency" [see Phelps (1962) and Koopmans (1965)], where the rate of interest is below the rate of growth. In this situation, it is possible that state intervention, for example through creation of a national debt, may be unanimously preferred.
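A minimal numerical sketch may make this concrete; the parameter values are illustrative assumptions of mine. With population growing at rate n, a pay-as-you-go transfer scheme (introduced formally below) pays each pensioner (1 + n) times the contribution, while private saving earns the interest rate i.

# Pay-as-you-go transfers in a growing population: contribute c while
# young, receive c*(1+n) when old (there are 1+n workers per pensioner).
# Compare the lifetime present value with saving c at interest rate i.
# Parameter values are illustrative only.
n, i = 0.03, 0.01          # population growth exceeds the interest rate
c = 10.0                   # contribution while working

payg_gain = -c + c * (1 + n) / (1 + i)     # PV change, equal to c*(n-i)/(1+i)
funded_gain = -c + c * (1 + i) / (1 + i)   # saving c itself has zero PV gain
print(f"PAYG change in lifetime PV:   {payg_gain:+.3f}")
print(f"Funded change in lifetime PV: {funded_gain:+.3f}")
# With i < n every generation gains c*(n - i)/(1 + i) > 0: the economy
# is dynamically inefficient and the transfer is a Pareto improvement.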
The introduction of a state pension scheme is one such instrument for securing a desired intertemporal allocation. The state, unlike private pension schemes, can operate on a different basis from conventional borrowing and lending: it can operate a pay-as-you-go scheme. Suppose that the population is growing at rate n. Then a contribution of $1 each by the working population generates a pension payment of $(1 + n) per pensioner. Comparing this with the return i on private savings, it appears that the state scheme can offer a "better deal" where i < n, which is precisely the situation of dynamic inefficiency. This aspect was analysed by Aaron (1966). (Where there is labour-augmenting technical progress, the term n includes the rate of increase in labour productivity.)

The welfare consequences of social security can, however, only be assessed in a general equilibrium context. The fact that a generation is offered a pension scheme with a positive lifetime present value, or what is typically referred to as positive "net social security wealth", does not necessarily imply a net welfare gain [Tesfatsion (1984)]. Samuelson (1975) discussed the optimum social security system in the two-period overlapping generations model. Comparing steady states, he observes that a fully-funded state scheme will simply displace private savings, but that a pay-as-you-go scheme can achieve a "golden-rule" steady state. As he notes, a full analysis must take account of the approach to the steady state; and the inefficiency argument only applies where there is initially excess capital. Moreover, consideration has to be paid to the paths of public debt and public capital, which are not always considered explicitly. This aspect is discussed by Burbidge (1983), who distinguishes between the Samuelson-Gale and Diamond versions of the overlapping generations model. The introduction, or expansion, of social security may lead to changes in public indebtedness or public ownership of productive assets, which will be relevant to the general equilibrium consequences, and hence to the welfare impact.

This analysis does not allow for the existence of bequests, which provide a link between generations; and Barro (1974) argues that where there are bequests then any state intervention would be offset by a family re-adjustment. (The presence of a bequest motive may in addition affect the kinds of steady state which may be obtained.) This does, however, assume that no one is constrained in their bequests. It is possible that people would like to pass on less wealth, but are limited in the extent to which they can leave negative bequests. It also depends on the absence of uncertainty. Sheshinski and Weiss (1981) set up a model with overlapping generations where the duration of life is uncertain. They show that, starting at the optimum level, an increase in the level of a fully funded social security system is only partially compensated by a decrease in private savings.

The second reason why a state pension scheme may be justified is that people may exhibit myopia and not save enough for their old age if it is left to voluntary decisions. As is noted by Diamond (1977), there are several possible reasons for this: (i) people may lack the information necessary to judge their needs in
retirement; (ii) people may be unable to make effective decisions about such long-term issues, not being willing to confront the fact that one day they will be old and grey; and (iii) they may simply fail to give sufficient weight to the future when making decisions, that is, they may act "myopically" (a fact of which they may themselves be aware but not be able to withstand). As a result, it may be desirable for the government to act paternalistically or for people to pre-commit themselves (tie themselves to the mast to prevent over-spending). The myopia argument could be used to justify compulsory saving in either private or state form, or indeed the kind of contracting-out scheme operated in Britain, where people with at least a prescribed minimum of private cover can opt out of part of the state pension.

In assessing the impact of a state pension scheme we have to take into account its interaction with other elements. First, there is the possible effect on other savings. As is noted by Kotlikoff et al. (1982), the introduction of social security pensions may lead to an offsetting reduction in private saving if people foresee the state pensions and if they are not constrained in the capital market. This latter condition is an important one: a person may wish to borrow on the strength of the future pension but be unable to do so in an imperfect capital market. Secondly, in the absence of a state pension there is likely to be some form of income support for the elderly, the need for which would be reduced by the state pension scheme. Forced savings through social insurance may be a method of reducing dependence on public assistance. Or it may be the children of the elderly who would otherwise have had to support them, so that the younger generation are in fact the beneficiaries.

The third set of arguments concerns uncertainty. In few other personal decisions can uncertainty play a greater role. Those entering the labour force in the 1980s are having to plan savings to yield fruit in the second quarter of the twenty-first century. The uncertainties surround the rate of return, the age-income path, the number of children, the length of life, the length of working life, the need for medical and other expenses, and the degree of support from children and other relatives. The existence of such uncertainties does not imply that state intervention is essential. The capital market may offer appropriate instruments: for example, indexed bonds to guarantee purchasing power and annuities to insure against the risk of longevity. The contribution of such markets is however limited in reality. One reason is the adverse selection problem discussed earlier; another is moral hazard with regard to decisions such as early retirement. The design of an optimum social insurance scheme taking account of the problem of moral hazard is examined by Diamond and Mirrlees (1978). Their analysis is extended by Whinston (1983) to allow for a diversity of types, and he notes that this may enhance the case for public provision. Alternatively, similar functions may be performed by family arrangements. Kotlikoff and Spivak (1981) have examined the extent to which families can self-insure against uncertain death
through implicit or explicit family agreements with respect to transfers between family members, arguing that even small families can substitute to a large degree for a complete annuity market. One important difference between the family and market arrangements is that the family may be better placed to cope with uncertainty rather than risk. The same may also be true of social, in contrast to private, insurance. The "solidarity", which is an essential part of social insurance, implies a willingness to allow flexibility in contracts where there is agreement that conditions have changed in some way which could not have been foreseen initially.

These arguments are all concerned with the role of state pensions in raising individual welfare, seen on a lifetime basis. The redistribution of lifetime income between individuals may be a major objective. Suppose, for example, that we take the simple two-period model. The present value of pre-transfer lifetime income measured in terms of first-period consumption is w. The effect of the pay-as-you-go pension scheme, collecting tw and then paying a pension of tw(1 + n), is to change the present value of lifetime income by -tw + tw(1 + n)/(1 + i), that is, by tw(n - i)/(1 + i). This generates a uniform gain or loss for everyone as a percentage of pre-transfer lifetime income. On the other hand, a scheme which collected proportional contributions and paid equal cash pensions of tw̄(1 + n), where w̄ denotes the average wage, would change the present value by t[w̄(1 + n)/(1 + i) - w], and this would be progressive relative to the proportional system. The desirability of such a "tilt" in the benefit formula would depend on the distributional objectives pursued.

State pension schemes may be redistributive between people in the same cohort in two major ways. The first is where the benefit formula is progressive with respect to past earnings, as just discussed; the second is where the pension is tailored to current income, through an income test. We have referred earlier to income-testing as part of a negative income tax, but there are specifically intertemporal aspects. Suppose that there are no differences in earnings records. Should the state pension be uniform, or related negatively to current income, as it would be if the pension were income tested or if it were supplemented by income-related assistance? The desirable extent of income testing depends on the degree of variation in returns to saving, on the responsiveness of savings to the means test, and on the extent of dispersion of past earnings. It depends also on the extent of stigma associated with the income testing, which may deter claims being made, and on the administrative costs of its operation. The possibility of greater income testing of income support for the aged in the United States is discussed by Berry et al. (1981).

Redistribution may be desired on account of the different earnings which people have received while working, but a further important consideration is the difference in access to the capital market. The rates of interest at which people can borrow or lend differ both randomly and systematically across the population. There are limits to lending and borrowing, with people subject to borrowing
constraints to differing degrees. One of the aims of social insurance is to provide an asset which generates a guaranteed rate of return and which is typically rationed in favour of lower income groups.
3.5. The form of income support

There are several dimensions to income support that have not so far been examined in this chapter. I take up two of these: the choice between categorical and non-categorical programmes, and benefits in cash versus benefits in kind. There are others - such as the roles of different levels of government in a federal system [see, for example, Pauly (1973), Buchanan (1974), Oates (1977), and Ladd and Doolittle (1982)] and the choice between income and wage subsidies [see, for example, Zeckhauser (1971), Kesselman (1971, 1973), Garfinkel (1973a) and, in the context of uncertainty, Cowell (1981)] - which should be considered, but space does not permit.

In the simple models used in Sections 3.1-3.3, we had in effect a categorical programme providing support to members of group A, entitled to receive it by virtue of their membership in that category, and a non-categorical programme in the form of the social dividend or negative income tax, where the benefit was governed solely by income. The latter appeared in the model where the only difference between people was in their wage rates. In the model of Section 3.4, it was taken for granted that there should be a categorical programme for pensions, in the sense of conditions for eligibility other than income, such as reaching a minimum age.

The gains from use of categorical programmes are that one can tailor the benefits more closely to categories of need, such as old age, and that one can take account of differences in labour supply (or other) responses. This may be illustrated here in the context of the static model of the earlier sections, with the modification that we are now concerned both with the position of the group A who are out of the labour force and with that of low paid workers.

The relation between the two policies, categorical and non-categorical, may be seen if we move in stages from the simple redistributive scheme with which we began, where the uniform payment to group A was financed by a proportional tax on group B. Suppose first that the income tax is converted to a progressive linear tax, with an exemption level, and that the exemption level is then cashed out to yield a negative tax payment below the exemption level. In other words, the after-tax income of those in work becomes G + (1 - t)wL, where G represents the effect of the negative income tax (G is equal to t times the exemption level). The fact that the transfer to group A is now financed in a progressive fashion affects the determination of the optimum level of transfer. It also raises the question as to whether G and R should be set equal. If they are equal, then we have a pure
system of negative income tax; if R is set above G, then the system would be a categorical one, in that those classified as in group A would receive more than if they were in group B with zero earnings.

The choice in this simple model is relatively straightforward. If one asks whether it would be desirable to raise R above G, holding revenue constant (so R rises and G falls), then it would appear obvious that this would be preferred on distributional grounds, since R is better "targeted" on those with low incomes. The fall in G would hit low paid workers, but it would also affect the higher paid. We have, however, also to bear in mind the impact on labour supply. In the present case, this is likely to reinforce the argument. A fall in G will have an income effect on the working population, and on the assumption that leisure is a normal good will lead to an increase in revenue, allowing R to be increased still further.

The model just described is similar to Akerlof's (1978) discussion of "tagging". Akerlof considers a situation where there are two classes of workers, one skilled and the other unskilled, each with a specified wage rate. There is no decision about hours of work, but the government's scope to redistribute from the better paid skilled workers to the less well paid unskilled workers is limited by the requirement that skilled workers do not switch to take unskilled jobs. This latter constraint may however be eased if it is possible to identify unskilled workers and present them with a different tax/benefit schedule. Akerlof supposes that a proportion β of the unskilled workers can be "tagged" and shows that such tagging allows a higher transfer to be made: "total welfare could be raised by giving increased subsidies to the tagged poor" [Akerlof (1978, p. 17)]. It should be noted that where, as in this model, the recipient group have other sources of income, the categorical programme may involve a higher level of payment and a different rate of tax. (This differs from a two-tier negative income tax in that a person must meet the qualifying condition in order to benefit from this branch of the budget constraint.)

These examples show the potential gains from categorical programmes, but there are also disadvantages. Akerlof notes (i) the administrative costs of the categorical system, including those to the recipients, (ii) the inequity which arises where there are errors in classifying the eligible (the 1 - β who do not receive the preferential treatment in Akerlof's model), and (iii) the incentive for people to misrepresent their circumstances so as to qualify. The implications of errors in classification have been examined in greater depth by Stern (1982), who compares the benefits from the use of categorical information and the costs of erroneous classification, including that of horizontal inequity.

The second question considered here is that of benefits in cash versus benefits in kind. The models employed in the analysis have not allowed for differentiated goods, but may readily be extended. Let X_1 be the commodity in question (for example, housing or food). The effect of provision in kind may be represented by
the addition of an element T[X_1, p_1, Y] to the individual budget constraint, where p_1 is the price of the good and Y denotes income. We may distinguish several forms of provision: (i) provision of a specified quantity in kind, G_1, so that T = p_1 min{X_1, G_1}; (ii) a subsidy to the purchase of the good, which may be specific or ad valorem, so that T = tX_1 or tp_1X_1; and (iii) income-related provision of the good, or subsidy, so that G_1 and/or t are functions of Y. In practice, these schemes are of considerable complexity. In Britain, the housing benefit scheme (in 1985) involves a rebate on rent for tenants:
T = min{p_1X_1, max[h_1p_1X_1 - h_2(Y - h_3), 0]},    (3.14)

where 0 < h_1, h_2 < 1, and h_3 depends on family size. The reader may well find this confusing, but in fact the representation (3.14) is a simplification of the way in which the scheme operates in practice.
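As a concrete reading of (3.14), the following minimal sketch implements the rebate formula directly. The parameter values are invented for illustration and are not the actual 1985 rates.

# Housing benefit rebate of the form (3.14): the rebate T cannot exceed
# the rent p1*x1, and is a fraction h1 of the rent, tapered away at rate
# h2 as income Y rises above the needs allowance h3. Parameter values
# below are invented for illustration; they are not the 1985 rates.

def rebate(rent, income, h1=0.6, h2=0.25, needs_allowance=80.0):
    gross = h1 * rent - h2 * (income - needs_allowance)
    return min(rent, max(gross, 0.0))

for y in (60.0, 100.0, 140.0, 180.0):
    print(f"income {y:5.1f}: rebate {rebate(50.0, y):5.2f}")

Note that over the tapered range each extra £1 of income reduces the rebate by h_2, so the taper adds directly to the claimant's effective marginal tax rate, a point taken up in Section 4.1.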
The provision of support in kind, that is, support tied to the purchase of particular commodities and assumed not to be resaleable, runs counter to the typical liberal economist's recommendation that redistribution take the form of money income. The welfare economic basis for such a recommendation in a perfectly competitive economy without external effects or non-convexities and with full information has been set out by Arrow (1963, p. 943):

    social policy can confine itself to steps taken to alter the distribution of purchasing power. For any given distribution of purchasing power, the market will, under the assumptions made, achieve a competitive equilibrium which is necessarily optimal; and any optimal state is a competitive equilibrium corresponding to some distribution of purchasing power.

There are several reasons why this may not apply. First, as observed by Foldes (1967a, 1967b, 1968), there may be difficulties when the competitive equilibrium is not unique. As he shows in a two-good example, if there are multiple equilibria, it may not be possible to attain a desired allocation simply by redistributing one good (money). Redistribution of all goods may be necessary [see also Samuelson (1974)]. This argument is valid within the confines of the assumptions made in the traditional model. The second argument introduces considerations not present in that model. Foldes supposes that the government is uncertain as to individual preferences and argues that again the government can attain the highest level of social welfare only by redistributing all goods. For comments on Foldes' argument, including the possibility of a multi-stage process, see Buchanan (1968).

A third reason why in-kind provision may be necessary is that there may be interdependent preferences, where these preferences cover the form in which
income is spent and not just the total level of utility. The goal of eradicating specific poverty discussed in Section 2 is an example. Suppose that the consumption of one good by the recipient group enters the utility functions of others. In this situation, lump-sum transfers of purchasing power cannot ensure a Pareto-efficient allocation [Daly and Giertz (1972)]. Moreover, voluntary transfers in kind leave scope for Pareto improvements [Daly and Giertz (1976)]. The same will in general be true if the level of state transfer in kind is determined by regard to the objectives only of the taxpayers, or what Browning refers to as "paternalistic" transfers. Garfinkel (1973b) points to the need to take account of the preferences of the beneficiaries. As is noted by Brennan and Walsh (1977), the situation described above differs from that where the welfare of the poor enters the donors' utility functions, in that the change in the consumption pattern of the poor could be achieved without making the poor any better off.

This analysis has largely been based on the assumption that the economy is otherwise distortion-free and [with the exception of Foldes (1967a)] that the government is fully informed. Relaxing both of these assumptions has featured prominently in the agenda of the optimum taxation literature. Nichols and Zeckhauser (1982) draw attention to the way in which in-kind transfers may be used to discriminate between those genuinely eligible for support and those who are "impostors". Suppose that there are two groups, one eligible and one declaring the same income but in fact better off. If there is a good with a negative income elasticity of demand (Nichols and Zeckhauser refer to "lower quality housing"), then provision in kind of the amount chosen by the eligible group would act as a deterrent to the impostors.

The interrelation of a general subsidy with other instruments of policy has been investigated as part of the literature on the design of indirect taxation [for example, Diamond and Mirrlees (1971) and Atkinson and Stiglitz (1976)] and on public sector pricing [for example, Feldstein (1972) and L. S. Wilson (1977)]. The argument for a subsidy depends on the use which can be made of other tools for redistribution and on the extent to which people differ in the prices they face for the good and in their tastes for its consumption. In an analysis of housing subsidies, Atkinson (1977) shows that where people differ in their wage rates, but not in the prices they face, then there is no case for a subsidy that varies with housing outlay if the income tax schedule may be freely varied and there is separability between housing and leisure in the individual utility function. Where these conditions do not hold, then there may be a case for a subsidy. (For further discussion of housing subsidies, see Chapter 7 by Rosen in Volume I of this Handbook.) The general question of public price discrimination where the price can differ between households is discussed by Friedman and Hausman (1974) and Le Grand (1975).

The interrelation between in-kind provision of goods and other instruments is examined in papers by Guesnerie (1981), Roberts (1978), and Guesnerie and
Roberts (1984). These have brought out the interrelation between the desirability of in-kind redistribution and the existence of distortionary taxes. In some cases, the results are counter-intuitive: for example, that it may be "luxuries" which should be provided in kind [Guesnerie and Roberts (1984, p. 79)]. Such results are a warning that, once we enter the territory of the second best, intuition may be highly misleading; but they may also lead us to question whether important features have been excluded from the analysis.
3.6. Concluding comment

In this section I have examined some of the theoretical issues underlying the design of income support, and in the next section I go on to consider some of the ways in which it works out in practice. In both cases, it is important to remember that it is the way in which the system is conceived to work that may be crucial to its effects and effectiveness. Behavioural responses may be based on perceptions which bear little relation to reality. Support for a programme of income maintenance may derive from a social mythology in which notions such as that of a contributory principle play a potent role in providing "the cement which holds together a social system" [Hollister (1974, p. 39)].
4. The impact in practice

The theoretical literature reviewed in the previous section is intended to illuminate policy decisions, and in this respect it has made valuable contributions, but there remains a large gulf between the problem as posed in that literature and that viewed by those actually concerned with the development of policy. An example is the treatment of income-tested benefits. Most of the models described above ignore the aspect of means testing which has generated most concern: the stigmatising effect of receipt, which deters people from claiming. This is one of the features of reality which is considered in this section. I begin, however, by considering in more detail the actual form of benefit and tax schedules, and the complex budget constraints to which they give rise.
4.1. Complex budget constraints

The budget constraint considered in the theoretical section is typically of a straightforward form, with one or two linear segments. In reality, the constraint is likely to be considerably more complicated, as illustrated by the account of the U.S. federal income tax given by Hausman in Chapter 4, Volume I, of this
Handbook. On the face of it, the tax schedule has a steady progression of marginal rates, from 14 percent on the first band of taxable income, rising through 16, 18, 21, up to 50 percent at the top (all figures relate to 1980). But for families with dependent children, the earned income tax credit introduced in 1975 modifies the picture. There is first a subsidy of 10 percent at the margin on earnings and then an additional marginal tax, via the withdrawal of the credit, until the benefit from the credit is extinguished. At this point, the marginal tax rate becomes lower than on the earlier tranche of earnings. As a result, there is a region where the marginal tax rate falls with earnings, creating a non-convexity in the budget constraint, as is illustrated by Figures 2.5 and 2.6 in Chapter 4 of this Handbook. This is in contrast to the situation typically shown in public finance textbooks, where the marginal rate of tax rises with gross income (usually referred to as a "progressive tax system"), leading to a convex budget constraint, as illustrated in Figure 2.2 in Chapter 4 of this Handbook. The earned income tax credit applies only to those with dependents and to the relatively low paid, but a non-convexity is also generated by the form of the contributions to finance social security under the Federal Insurance Contributions Act (FICA), which are levied on earnings up to an upper limit, causing the marginal tax rate to fall at that point.

Further sources of non-convexities in the budget constraint are the income transfer programmes designed to help families on low incomes. This is illustrated by the operation of the Aid to Families with Dependent Children (AFDC) programme in the United States [see Williams et al. (1982, pp. 482-497) for a description]. This scheme is administered at the state level and the provisions vary widely, but as shown in Chapter 4 of this Handbook, recipients may be faced with a higher marginal tax rate on earnings than that levied on any taxpayer under the income tax. In Britain, the income-tested programme designed to help working families with children involves (in 1985) a 50 percent rate of withdrawal of benefit, which is substantially higher than the 30 percent basic rate of income tax (and not far short of the top rate of income tax of 60 percent).

The complexity is considerably increased by the interaction and overlap between different elements in the system. Receipt of one benefit may preclude receipt of another; alternatively, beneficiaries may be able to cumulate entitlements, just as French politicians cumulate elected offices. Where benefits are cumulative, there are complications introduced by the fact that one benefit may take account of the amounts received under another scheme. The procedure by which the benefits from one programme are counted as income for the purpose of eligibility and benefit determination for a second programme is referred to as "sequencing" by Barth et al. (1974, p. 116). They give the example in the United States of AFDC payments being counted as income for food stamps and public housing benefits. In Britain, the decline in the income tax threshold, relative to average earnings, and the rise in the limits of eligibility for income-tested
benefits has meant that there is quite a wide range of potential overlap between paying income tax and receiving income-tested benefits. Since income tax (in 1985) is payable at the full basic rate of 30 percent from the first £1 of taxable income, this leads to a marginal tax rate for recipients of 80 percent (the 50 percent benefit withdrawal plus the 30 percent basic rate), plus National Insurance contributions of at least 6.85 percent. If, in addition, the family is eligible for and claims housing benefit, then the marginal rate can (in 1985) exceed 100 percent (the system is being reviewed).

The situation in 1985 is illustrated in Figure 4.1, which shows the relation between gross earnings and net income (including benefits) for a hypothetical family consisting of a man with a wife, who is not in paid work, and three children, living in a rented house. There is a range of gross earnings where there is little increase in net income as gross income increases, and indeed net income falls at certain points. (In considering the marginal tax rates, one has to take account of the lagged adjustment of benefits to changes in income, so that benefit is not withdrawn immediately.)

Figure 4.1. Gross and net income (£ per week) for family with three children in Britain at different earnings of husband (upper diagram) and wife (lower diagram). Note: FIS denotes Family Income Supplement; the difference between the solid and dashed lines in the upper diagram shows the effect of this and housing rebates (also a means-tested benefit).

The possibility of such a "poverty trap" or "poverty plateau" was noted in Britain by Lynes (1971), Piachaud (1972), and Prest (1970). In the United States it was analysed by Hausman (1971), Aaron (1973), and Lampman (1975). Its actual importance depends on the number of people in the relevant income ranges (and those who would choose to be in that range if it were not for the trap). This number may not in practice be large, since the benefits are often confined to the lowest income levels, but the poverty trap has played an important part in debate about income support and its reform. Aaron (1973) describes how the high marginal rates implicit in different reform proposals advanced in the early 1970s, including Nixon's Family Assistance Plan, were one of the main reasons why the proposals failed to pass Congress. As Aaron brings out, it is not just the actual but also the apparent disincentive which has to be avoided.

The effect of such complex budget constraints on individual work decisions is discussed by, among others, Diamond (1968), Hall (1973), Burtless and Hausman (1978), Hanoch and Honig (1978), Heckman and MaCurdy (1981), and Brown (1980). Where the budget constraint is convex, the choice may be analysed in terms of the slope (which depends on the marginal tax rate) and the intercept with the vertical axis (the "virtual income"). This is illustrated in Figure 2.2 in Chapter 4 of this Handbook, where the individual's choice of hours is that which would be generated by a linear budget constraint with slope w_2 and virtual income Y_2. It is, however, important to note that the marginal tax rate faced by the person is endogenous. This means that in econometric work one cannot treat the marginal tax rate as a parameter; one needs to take the full budget set (which is exogenous) into account. Where the budget constraint is non-convex, the resulting labour supply curves may be discontinuous. A person may be indifferent between two distinct levels of work, as illustrated in Figure 2.8 in Chapter 4
of this Handbook, where it is noted that "small changes in ... any parameter of the tax system can then lead to large changes in desired hours of work" (p. 221).

Attention has tended to focus on the implications of the marginal tax rate on additional earnings by the family head. This is not, however, the only, nor indeed the most important, margin at which labour supply decisions may be made. Quantitatively more important are likely to be decisions by other family members, particularly those with regard to participation. [For studies of labour force participation, a subject not covered in Section 5, see Chapter 4 of this Handbook, and Hausman (1980), Heckman and MaCurdy (1980), Zabalza (1983) and Blundell et al. (forthcoming).] The tax and social security system typically treats earnings by other workers in a different fashion. In Britain, there is a separate wife's earned income allowance for income tax, and a separate earnings exemption for housing benefit. This means that £1 of earnings by the wife may bear a significantly lower marginal rate. This is illustrated in the lower part of Figure 4.1, which shows the implications for net family income if it is the wife, rather than the husband, who receives the additional earnings. There is a "notch", but in general the curve rises more steeply for the wife, indicating a lower marginal tax rate. This means that we have to consider the family labour supply decision as a whole [see Smith (1979) and Hausman and Ruud (1984)].

The work decisions made by families have many dimensions: for example, hours of work, choice of job, education and training, and retirement. Of particular significance here are decisions which may lead to an entitlement to categorical benefits. A person may retire and hence become eligible for a pension; a person may claim the status of disability and become eligible for disability benefits. These possibilities may generate a further non-convexity in the budget constraint, as illustrated in Figure 3.2, where a person may be either on the segment AB or, if they satisfy the retirement condition, at the point R. The retirement condition may be enforced less sharply. It is implemented in Britain and the United States by an earnings test, which reduces the pension if earnings rise above a certain level, until the pension is extinguished. In other words, there is a segment joining R to the line AB with a slope steeper than that of AB. In considering the date of retirement, we need to bear in mind the intertemporal aspects. To continue after the minimum retirement age may provide an increment to the pension and, with an earnings-related pension scheme, staying at work may allow a person to improve his or her earnings record. These aspects are discussed further in Section 5, when we come to the empirical evidence about retirement decisions.

The intertemporal aspects may be important for other work decisions. A person who is unemployed and considering accepting a job offer (or, conversely, who is employed and is considering quitting) may consider the implications not just for current income but also for the income stream over several periods. A decision to turn down a job offer may jeopardise future benefit receipt, and continued dependence on benefit may reduce future entitlement (as where there is a maximum duration of benefit).
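The stacking of withdrawal rates described in this subsection is easy to make concrete. The sketch below uses the rates quoted earlier (30 percent basic income tax, 6.85 percent National Insurance, a 50 percent FIS withdrawal, and, as an assumption of mine, a 25 percent housing benefit taper); the thresholds and benefit amounts are invented for illustration. It computes the effective marginal tax rate as one minus the gain in net income from an extra £1 of gross earnings.

# Effective marginal tax rates when income tax, National Insurance and
# benefit withdrawals stack. Rates are those quoted in the text (plus an
# assumed housing benefit taper); thresholds and benefit amounts are
# invented for illustration.

def net_income(gross):
    tax = 0.30 * max(gross - 40.0, 0.0)            # basic-rate income tax
    ni = 0.0685 * gross                            # National Insurance
    fis = max(30.0 - 0.50 * max(gross - 50.0, 0.0), 0.0)   # FIS withdrawal
    hb = max(20.0 - 0.25 * max(gross - 50.0, 0.0), 0.0)    # housing benefit taper
    return gross - tax - ni + fis + hb

for g in (60.0, 80.0, 100.0, 140.0):
    emtr = 1.0 - (net_income(g + 1.0) - net_income(g))
    print(f"gross {g:5.0f}: EMTR {emtr:.4f}")

Over the range where both benefits are being withdrawn, the family loses more than £1 of net income for each extra £1 earned, which is the poverty trap illustrated in Figure 4.1; once the benefits are exhausted, the marginal rate falls back to tax plus National Insurance, producing the non-convexity discussed above.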
Finally, eligibility for benefits may depend not just on income but also on wealth. Public assistance typically has an upper limit on assets or imputes an above-market return. Asset tests in the United States are described by Moon (1977b) for the aged and disabled, by Lurie (1977) for cash benefits for the non-elderly, and by Hausman (1977) for housing, medical care, and higher education.

A further important factor is that hypothetical calculations of the poverty trap do not always take full account of the detailed provisions of the benefits, including the exemptions for certain types and amounts of income and the exercise of local discretion. The implications for the effective marginal tax rates under public assistance in the United States are discussed by Lurie (1974), Rowlatt (1972), Barr and Hall (1975), Hutchens (1978), and Moffitt (1979).
4.2. Non-receipt of benefits

For taxes, there is an incentive for the taxpayer to try to opt out and for the tax authorities to make sure that he does not succeed in evading their net. Applying the same argument in reverse to benefits, we may expect potential recipients to have every incentive to opt in, and the responsibility of the authorities to be the detection of ineligible claimants. Experience in several countries has shown, however, that non-receipt by eligible families is a serious problem, at least for income-tested benefits. Families may be unaware of the existence of the benefit, a factor which may be especially important with new programmes. Families may be aware of the benefit but believe that they are not eligible, as may happen if they have previously applied in different circumstances and been turned down. They may be aware that they are eligible but not claim because of the costs of doing so, such as the time spent queuing in offices and in completing forms, as may particularly be the case where the amount of benefit involved is small. They may not claim because they do not wish to be identified as receiving a benefit which they feel to be stigmatising. And they may have claimed but been rejected as a result of administrative error.

Eligibility for a means-tested benefit typically depends on a range of conditions in addition to income. In some cases these are easy to check, such as age, but others may involve a substantial element of judgement, such as whether two people are deemed to be co-habiting. This means that it may not be easy to identify those people who are eligible but not claiming. One possible approach is to screen out potential recipients from a sample survey and then invite them to apply for the benefit, in order to confirm their eligibility according to the administrative procedures. However, this is unlikely to yield a correct estimate of eligible non-recipients since, unless the main reason for non-claiming was lack of information, many are still going to refuse. Studies of non-receipt have therefore usually been based on survey information supplied directly by respondents, and the need to rely on this source must qualify
the conclusions drawn. The studies also typically involve secondary analysis of survey data not explicitly collected for this purpose and hence may not contain all the details required. In Britain, much of the evidence is derived from the Family Expenditure Survey (FES), which is carried out primarily to collect information for the annual adjustment of the weights in the Retail Prices Index. The problems which arise in the use of the survey to estimate the extent of take-up are discussed in Atkinson (1984).

Concern about non-take-up in Britain has been particularly directed at the means-tested Supplementary Benefit (SB), which replaced National Assistance in 1966. One of the reasons for the change was the evidence, confirmed in an official enquiry [Ministry of Pensions and National Insurance (1966)], that a sizeable number of pensioners were not claiming the assistance to which they were entitled. The changes in administration accompanying the introduction of SB were intended to improve the take-up rate. However, the analysis in Atkinson (1969) showed that more than half of the ensuing increase in the number of recipients could be attributed to a more generous scale rate rather than to improvement in take-up. Later analyses based on the FES for the 1970s confirmed that a sizeable minority were not claiming. The estimated percentage of non-claimants varied between 24 and 35 percent for families over pension age, and between 21 and 30 percent for those below pension age, over the period 1973-1981 [Atkinson (1984, table 1)]. In terms of the proportion of benefit not claimed, the record is better, but still not impressive: in 1981 the estimated amount of benefit not claimed was 15 percent of the total [Atkinson (1984)]. Evidence about non-take-up of public assistance is found in other countries. For example, there are studies for West Germany by Hauser et al. (1981), Kortmann (1982) and, in German, Hartmann (1981).

Take-up of FIS in Britain provides an interesting case study, since it was a new benefit, having no direct association with the old poor law, and the government attempted to secure a high take-up (the target was 85 percent). From the outset, it was clear that this was not being achieved; and a special survey of low income families in 1978-79 [Knight (1981)] estimated that the take-up was 51 percent in Great Britain. The time series of the number of recipients has been analysed by Atkinson and Champion (1981), who find a significant relationship with the variation in the generosity of the scale relative to the earnings of the lowest decile of male manual workers, but no evidence of the separate upward trend over time which might indicate improved take-up.

In the United States, attention has focused on the take-up of AFDC, of Supplemental Security Income (SSI) for the aged, blind and disabled, and of food stamps. In the case of AFDC, the evidence shows the participation rate in the late 1960s as being low [45 percent for the basic family portion and 27 percent for the unemployed parent (UP) portion in 1967], but as having increased in the early 1970s, reflecting the outreach efforts of administrators and the activities of rights workers. By 1977 the estimated take-up rates were 94 and 72 percent,
respectively. These figures are taken from Warlick (1981a), who gives an assessment of their accuracy and draws attention to the sensitivity of the estimates to the methods applied [see also Moffitt (1983, p. 1028, fn. 7)]. She notes, for instance, that adjustment for ineligible participants (those estimated by the researchers not to be eligible but who were in receipt) had the effect of reducing the estimated participation rate by 7 percentage points. The official study by the United States Department of Health, Education and Welfare [Projector and Murray (1978)] confirmed that take-up was low in 1970. They concluded that only one in three of the apparently eligible families reported receipt of public assistance in the Current Population Survey, although this was qualified by the fact that receipt may be incompletely reported and that there was only limited information to assess eligibility. Warlick (1981a) quotes estimates for the take-up of other income-tested benefits, including the figure of 37.5 percent for participation in the food stamp programme in 1974 given by MacDonald (1977) and her own estimate of 47 percent for SSI in 1975. In the latter case, she notes the wide range of estimates which can be obtained by applying different methods, but in all cases the estimated participation is well below 100 percent.

Incomplete take-up of income-tested benefits has two important consequences. The first is that the poverty trap described in Section 4.1 may be less serious in fact than it appears in theory. Calculations based on hypothetical families are misleading in that they assume that benefit entitlement is taken up. The second important consequence of non-take-up is that the means-tested benefits fail to provide a fully satisfactory safety net. We examine next the evidence about the effectiveness of income support programmes in preventing poverty.
4.3. Benefits, taxes, and poverty

In their assessment of the first decade of the War on Poverty in the United States, Plotnick and Skidmore estimate that, before transfers, 24.8 percent of households were below the official SSA poverty line in 1972, with a total pre-transfer income gap of $34 billion per year [Plotnick and Skidmore (1975, table 6.1)]. The transfers studied included OASDI, public assistance, unemployment insurance, workmen's compensation, veterans' pensions, and government employee pensions; and the total cost was $62 billion in 1972. The effect of these transfers was estimated to be to reduce the poverty figure to 14.0 percent and the income gap to $12.5 billion (Plotnick and Skidmore, table 6.2). In other words, the transfers reduced the poverty population by 44 percent and the poverty gap by some 64 percent. These estimates have to be qualified, since transfer payments are believed to be under-reported and since in-kind transfers are excluded. Plotnick and Skidmore quote the estimates of Smeeding (1975), which allow also for the impact of taxes. These adjusted figures show the reduction in the poverty
population to be 72 percent in 1972 and the fall in the poverty gap to be 83 percent. More recent estimates by Danziger et al. (1984) show 24.2 percent of people (as opposed to households in the figures quoted above) being below the official poverty line in 1983 before transfers, with the post-transfer figure being 15.2 percent, or between 10 and 13 percent if in-kind transfers are included. It should be stressed that these, and the calculations that follow, assume that neither taxes nor transfers affect the pre-transfer income of the poor. The possible impact on a range of decisions is the subject of Section 5.

In Britain, corresponding estimates have been made by Beckerman (1979a) for 1975 using data from the FES. He takes as the poverty line the Supplementary Benefit standard and calculates that before benefits 22.7 percent of the population were below this level, with an income gap of £5.9 billion per year. The benefits covered are cash transfers via National Insurance, child benefit, SB and other programmes, with a total cost of £8.0 billion. The proportion below the poverty line after benefits is estimated to be 3.3 percent and the total income gap £0.25 billion. The percentage reduction in the number in poverty is therefore 85 percent and the percentage reduction in the poverty gap is over 95 percent.

Calculations of this kind have been used in a variety of ways to assess the effectiveness of income support. The simplest is the "success rate" in terms of reducing the measure of poverty, such as the headcount and the poverty gap used above. The shortcomings of the former measure, discussed in Section 2.2, seem particularly important here, since a reduction in the number in poverty could well be achieved at the expense of deepening poverty for others. For example, the conversion of a universal benefit to a means-tested basis could allow the level to be raised, thus benefiting those low income families who claim but making considerably worse off those low income families who do not claim.

Why do the present systems not have 100 percent success? Here we must distinguish between the British and U.S. calculations. In the British case, the poverty line is the same as the income floor which SB seeks to provide. People therefore fall below only insofar as they do not claim or they are not fully covered by SB. The latter applies to those families where the head is in full-time work, and one of the main discoveries of the 1960s, notably by Abel-Smith and Townsend (1965), was the extent of poverty amongst working families. Non-take-up applies, for instance, to the elderly, where the basic National Insurance pension has typically been insufficient on its own to raise people above the SB scale, and those without occupational pensions or substantial savings have been dependent on means-tested assistance. For discussion of the composition of the low income group and the proximate causes of poverty in Britain, see Atkinson (1969), Fiegehen et al. (1977), Townsend (1979), Berthoud et al. (1981), and Beckerman and Clark (1982).

In the United States, the poverty line is not related to social security scales, and there is no presumption that benefits will raise all families above it. It was not
therefore surprising that the studies in the 1960s laid considerable stress on groups such as the aged who were in receipt of transfers. While the Council of Economic Advisers (1964) correctly drew attention to the pervasiveness of poverty ("the poor are found among all major groups in the population", p. 62), the incidence of poverty was considerably higher for the aged, for families with a female head, and for the disabled. The position has however changed over the past twenty years. In particular, the transfers to the elderly appear to have become more effective in preventing poverty, but transfers for certain other groups have become less effective. Danziger et al. (1986) show that the proportion of the aged removed from poverty has risen over the period 1965-1982 from 57 to 78 percent in the case of whites and from 26 to 50 percent for blacks. But the corresponding percentages for families headed by a woman with children have fallen from 30 to 14 percent for whites and from 11 to 8 percent for blacks.

A second criterion in assessing income support programmes has been target efficiency, where it should be stressed that the word "efficiency" is being used in a specific sense which does not take account of the aspects more usually included under this heading in welfare economics. Target efficiency was introduced by Weisbrod (1969), who noted that there are two different elements to efficiency in this context: the accuracy of a programme in assisting only those in the target group and the comprehensiveness of the programme in assisting all of the target group. The latter, which he refers to as "horizontal" efficiency, is related to the criterion of reducing the poverty count that we have already discussed. The former, or "vertical" efficiency, concerns the proportion of the total expenditure which goes to the pre-transfer poor and its relation to the amount necessary to raise them to the poverty line. A programme may be less than 100 percent efficient on this basis if part of the benefit goes to those above the poverty line or if the size of the transfer exceeds the amount required to fill the poverty gap. From the calculations given above, it may be seen that the reduction in the poverty gap was some 35 percent of the total transfer spending in 1972 in the United States and some 70 percent in 1975 in Britain. [For further analysis of the target efficiency of different programmes in the United States, see Garfinkel and Haveman (1977, ch. 5).] In the case of Britain, Beckerman notes that the high level of efficiency is surprising in view of the fact that most benefits are not income tested, but he points out that the subtraction of state transfers would leave the vast majority of pensioners with incomes below the poverty line.

This of course takes no account of the incidence of such transfers. If, for example, in the absence of state pensions the children of the elderly would provide them with income support, then the reduction in poverty achieved by the state transfers may be quite a lot less. The calculations of pre-transfer incomes assume that earned income and savings would be the same if there were no income maintenance provisions. Given the scale of the transfers, this seems an heroic assumption; and there are good reasons to expect, for example, that the
incomes of the elderly would be different in the absence of state pensions, as would the number of elderly people living on their own. The prediction of such changes is, however, a difficult matter. They will depend on the behavioural changes by the recipients, and some of the most important are considered in Section 5.

The contribution to the abolition of poverty depends also on the valuation placed on the transfers. Reference has been made to in-kind transfers, and in the United States there has been considerable controversy surrounding their treatment, not least because they have increased faster in amount than cash assistance. Some authors [for example, Browning (1976) and Paglin (1980)] have argued that in-kind transfers should be valued at cost, so that $1 of Medicare is valued at $1, and Paglin has even suggested that their value should exceed $1 since they are tax free and automatically indexed. Others argue that the provision of the transfer in kind restricts its value to the recipient. Smolensky et al. (1977), Smeeding (1977), Murray (1980), and Schwab (1985) have examined the general question of valuation; Kraft and Olsen (1977) and Murray (1975) deal with housing, and Clarkson (1976) and MacDonald (1977) deal with food stamps. The implications for the measurement of poverty of different approaches to the valuation of medical care benefits are investigated by Smeeding and Moon, who conclude that in aggregate there is little difference, but that for the measurement of poverty among the elderly the choice is quite crucial.

The valuation of cash benefits has been less discussed, but the earlier analysis of non-take-up suggests that there may be grounds for differential treatment of means-tested and universal benefits. To the extent that non-receipt arises on account of stigma, then this may reduce the value of the benefit (at the margin the extra income would be exactly counter-balanced by the cost in terms of loss of dignity, etc.). This is of particular importance if one adopts a "rights" approach to poverty. More generally, there is the question as to whether "income" is the correct basis for a welfare measure; for groups such as the elderly or the disabled, it may be particularly inappropriate.

Finally, we have not considered the unit of receipt and the distribution of income within that unit. In Britain, Fiegehen and Lansley (1976) compared the measures of poverty obtained when the household was taken as the unit with those where the tax unit was the basis. The tax unit is narrower, consisting of a single adult, or a married couple, together with any dependent children, and the assessment of poverty on this basis has to make assumptions about the extent of income transfers between tax units in the same household (for example, help by sons and daughters to their elderly parents). Taking only the external income, Fiegehen and Lansley calculate from the 1971 Family Expenditure Survey that on a household basis there are 3.4 percent of persons below the Supplementary Benefit line and on a tax unit basis there are 6.4 percent (in both cases those receiving Supplementary Benefit were excluded). They conclude that "the choice
of income-receiving unit is crucial in estimating the numbers of people in poverty" [Fiegehen and Lansley (1976, p. 517)].

A closely related question is that of the distribution of income within the unit. In both the household and the tax unit calculations, it is assumed that all family members share the same living standard, being all above or all below the poverty line, but this may not be the case. The intrafamily distribution is a subject about which we know relatively little. Pahl (1983) quotes evidence in Britain that a sizeable proportion of women, up to a third, whose marriages have broken up report being "better off" on Supplementary Benefit than living with their husbands, but this can be interpreted in several ways. In the United States, evidence from the Michigan Panel Study of Income Dynamics has been used, for example, by Morgan (1984, and the references given there) to calculate intrafamily transfers, including the time that is contributed. Lazear and Michael (1984) have attempted to apportion consumption, taking account of the public good aspects within the family, and conclude that allowance for inequality within the family can make a significant difference to measured poverty. Given the limited information, a natural approach is to examine the effect of different assumptions about the intrafamily distribution. This is the approach of Moon (1977a). Her central assumption is that the highest priority is to ensure a subsistence minimum for all members but that once this is achieved transfers rise less than proportionately with family income. This is compared with alternatives under which transfers are limited to providing the subsistence minimum or where equivalent income is equalised. This subject warrants further investigation; and in the present context it should be noted that income transfer policy itself may have an important impact (for example, if child benefits are paid to the mother rather than to the father).

4.4. Overall redistributive effect

Measures intended to relieve poverty may well have effects which spill over to those above the poverty line. Categorical benefits, for example, may go to those who qualify under the conditions but whose income is well above the poverty cut-off. While this may be considered "vertically inefficient" in the sense discussed in the previous section, the income support measures may nonetheless contribute to a broader goal of reducing income inequality. As noted by Danziger et al. (1984), the impact on the non-poor of the War on Poverty has received little attention; but there have been a large number of studies of the overall redistributive impact of the government budget, from which we can draw conclusions about the role of the income transfer component.

In the United States, the study by Danziger and Plotnick (1977) of the distribution among families and unrelated individuals found that in 1974 the
Gini coefficient for the distribution of pre-transfer income was 47.65 percent, whereas that for post-transfer income was 40.77 percent. These figures relate only to cash transfers. When we consider transfers in kind, then the same problems of valuation arise as were discussed above. The importance of this aspect in assessing the degree of inequality, and its trend over time, has been brought out by Browning (1976) and the subsequent discussion by Smeeding (1979). In a recent study, Smeeding (1984) has shown that there may be a large difference in the distributional consequences of medical care and housing subsidies between the conventional market value/government cost approach and that based on valuing the benefit to recipients and non-recipients. This issue is also considered at length by Smolensky et al. (1977). Their estimates for 1970 show cash transfers as reducing the Gini coefficient from 44.4 to 39.8 percent, and that adding in-kind transfers at taxpayer cost reduces it further to 37.1 percent, but that the reduction would be smaller if the transfers were valued at their cash equivalent to the beneficiaries and if account is taken of the benefit derived by the donors (via utility interdependence). More recently, the position may have changed with the advent of the Reagan Administration. For discussion of the changes in the 1980s, see Burtless (1986), Danziger and Haveman (1981), Danziger (1982), Danziger et al. (1984) and Meyer (1986).

In Britain, the impact of cash and in-kind transfers is included in the official Economic Trends studies of the effects of taxes and benefits on household incomes [Central Statistical Office (1983)]. The estimates for 1982 show the share of the bottom 20 percent rising from 0.4 percent to 5.7 percent as a result of cash transfers, and the authors noted that "the reduction in the Gini values from 48.2% to 36.2% ... clearly confirm that cash benefits produce the largest reduction in income inequality [compared to other elements in the budget]" [Central Statistical Office (1983, p. 87)]. The treatment of in-kind transfers in these studies is based on taxpayer cost, and shows their value as falling from 35 percent of final income for the bottom 20 percent to 12 percent for the top 20 percent. For discussions of these studies, and criticisms of the methods employed, see Nicholson (1974), O'Higgins (1980), Stephenson (1980), and Le Grand (1982, forthcoming).

The extent of redistribution between different groups in the population is also important. Particular interest has attached to the position of the elderly. In its 1985 report, the U.S. Council of Economic Advisers devoted a chapter to the economic status of the elderly, arguing that "thirty years ago the elderly were a relatively disadvantaged group in the population. That is no longer the case" (1985, p. 160). The report recognises that the elderly are not a homogeneous group, but notes that the average per capita income for families (but not for unrelated individuals) is virtually the same for the elderly and non-elderly. In the United Kingdom, the Department of Health and Social Security (1984) show the
disposable income per head of pensioners as having increased from 41 percent of that of non-pensioners in 1951 to 69 percent in 1984/85.

The assessment of the position of the elderly calls into question the use of current income as a measure of economic status. Moon (1977a) develops an indicator of potential consumption which adds to current income adjustments for net worth, income in kind and family transfers (as discussed above). Using data from the Survey of Economic Opportunity in 1966 and 1967, Moon shows that such calculations can make a noticeable difference to the measured inequality among the aged and to the ranking of different families. Danziger et al. (1984) use information for 1973 and pay particular attention to the adjustments made for differing family size. The conclusion regarding the relative status of the elderly appears to be much more sensitive to this latter aspect than to the choice of consumption or income as the measure of economic status. In monitoring the economic circumstances of the aged, panel data are of especial value, and Hurd and Shoven (1983) have used the U.S. Longitudinal Retirement History Survey to track changes in status over six years.

The analysis of the distributional impact of transfers raises in a particularly acute form the problem of treating the intertemporal aspects of redistribution. In the case of pensions, we observe a transfer to people in lower income ranges from those in the working population, but this does not necessarily imply redistribution in a lifetime sense. One method of allowing for this is to estimate the amount of the state pension which represents an actuarial return to past contributions. In the United States, Burkhauser and Warlick (1981) use the 1973 Current Population Survey merged with OASI earnings and benefit records (the Social Security Exact Match File) to compare the "actuarially fair" OASI pension with that actually received, and conclude that it would only be some 27 percent of the total. The major part represented an intergenerational transfer. This reflects the tendency of past legislators to "blanket in" groups who have not previously been covered or who could only attain a limited period of coverage before retirement. The same approach is used by Burkhauser (1979) to compare the return to single persons and married couples, showing that women married to men with larger pensions are making "redundant" (i.e. less than actuarially fair) contributions.

A calculation which is forward rather than backward looking is that of the internal rate of return which prospective pensioners may expect, as was calculated for the (never introduced) National Superannuation scheme in Britain in Atkinson (1971). In the United States, Aaron (1977) has calculated benefit-cost ratios under the OASI system, bringing out the different treatment of different groups, such as two-earner and one-earner families, and that the scheme may be regressive rather than progressive, reflecting such factors as differential survival rates. This may explain some of the political support for the social security scheme. [See also Brittain (1972, ch. VI) and Ozawa (1974).] In contrast, Creedy (1982) concludes in his study of the earnings-related pension scheme in Britain
that there is relatively little systematic redistribution of lifetime income for a cohort. Calculations of lifetime redistribution under Canadian pension plans are given by Wolfson (1983). Typically such calculations of lifetime redistribution have been based on simulated earnings patterns, but in some cases actual earnings histories are available. These may cover the entire lifespan, as with the data on workers in Germany born between 1908 and 1913 used by Schmähl (1983). They may cover a substantial fraction of the lifespan, as with the U.S. Continuous Work History Sample used in the Report of the Consultant Panel on Social Security (1976), which discusses the problems of extrapolation to lifetime earnings.
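The mechanics of such an internal-rate-of-return calculation can be shown with a minimal sketch in Python. The contribution and benefit streams below are entirely hypothetical (a real calculation, as in Burkhauser and Warlick (1981) or Atkinson (1971), would use actual earnings histories and weight benefits by survival probabilities); the point is only that the internal rate of return is the discount rate at which the present value of the combined stream of contributions and benefits is zero:

    # Hypothetical illustration of the internal-rate-of-return calculation
    # discussed above; all magnitudes are invented for the example.

    def net_present_value(rate, cash_flows):
        # Present value of a stream of annual cash flows at the given rate.
        return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))

    def internal_rate_of_return(cash_flows, lo=-0.5, hi=1.0, tol=1e-8):
        # Bisection search for the rate at which the NPV of the stream is zero
        # (the stream has one sign change, so the root is unique).
        while hi - lo > tol:
            mid = (lo + hi) / 2.0
            if net_present_value(mid, cash_flows) > 0:
                lo = mid  # NPV still positive: the rate must rise
            else:
                hi = mid
        return (lo + hi) / 2.0

    # A worker contributes 1,000 a year for 40 years, then draws a pension of
    # 4,000 a year for 15 years of retirement (no survival uncertainty here).
    contributions = [-1000.0] * 40
    benefits = [4000.0] * 15
    print(internal_rate_of_return(contributions + benefits))

Comparing the rate so computed with a market rate of interest then indicates whether, for that stylised worker, the scheme is more or less than actuarially fair.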
4.5. Political economy of income support
The reasons why income maintenance policies obtain political support have been the subject of considerable debate, not least in view of the stated intention of governments in the United States and Britain, and other countries, to push back the frontier of the welfare state. [The cutting back of the welfare state is discussed in, among others, Aaron (1983), Danziger (1982), Danziger and Haveman (1981), Le Grand and Robinson (1984), Meyer (1986), and, from a historical perspective, Johnson (1985).] In this debate, one can discern several types of contribution, but the three most important are those which may be labelled - somewhat imprecisely - the "public choice", "institutional", and "Marxist" approaches.

Public choice has been a major research area in recent years. It is surveyed by Inman in Chapter 12 of this Handbook, and I make no attempt to cover the ground comprehensively, simply referring to studies that deal explicitly with income maintenance. The essence of the public choice approach is that government decisions are influenced by voter preferences, by the goals of politicians, and by the bureaucratic machinery, in varying proportions. Many of the applications of this approach in the field of explaining government expenditure decisions have been based on the median voter model, where the preferences of the median voter are assumed to be decisive in a direct election. In a representative democracy, too, the platform chosen by the parties would be designed to occupy the decisive middle ground. This line of explanation takes us back to the reasons why state income support may be desired, for example, because voters' utility functions exhibit interdependencies or because they are guided by a contractarian notion of justice.

The median voter model has been applied by Orr (1976) to the explanation of redistribution at the level of local governments in the United States. Under the AFDC programme, the levels of assistance payments are set by the various states
who administer the programme; and the cost is shared between the states and the federal government according to a formula. He sets out a simple two-group model, similar to that used in Section 3 with recipients and taxpayers, with the incomes of the recipients being an argument in the utility function of the taxpayers [it is an asymmetric model of the type developed by Hochman and Rodgers (1969)]. The level of transfers is determined by majority vote in each jurisdiction, with the cost being shared with the federal authority via a matching grant. The theoretical argument suggests that the generosity of the transfer per AFDC recipient will depend on the level of taxpayer income, the recipient/taxpayer ratio, the absolute number of recipients, the federal matching rate, taxpayer preferences, and the composition of the recipient group. Using proxies for the last two of these variables, Orr estimates equations which account for a sizeable part of the variation in AFDC spending across states. There are difficulties with the analysis, including the fact that the number of recipients is treated as exogenous, whereas we may expect its size to depend on the level of assistance, both via migration and via transfer from the taxpayer group. But Orr correctly recognises that the position of the local authority is similar to that of an individual taxpayer in that it faces a budget constraint, determined by the federal matching formula. As discussed earlier, the slope of this constraint may be endogenous, depending on the local decision, and this is allowed for in the estimation. Moffitt (1984) gives a fuller treatment of the intergovernmental budget constraint, as applied to AFDC, and draws conclusions about the design of the matching grant formula.

The median voter approach does not require that redistribution be based on the notion of interdependent utilities. Peltzman (1980) bases his theory of the growth of government on voters being concerned solely with their own welfare, the critical features being the degree of inequality and the ability to perceive and articulate self-interest. Dryzek and Goodin (forthcoming) in their analysis of the foundations of the post-1945 welfare state in Britain argue that people will support major redistribution at a time when there is major uncertainty. They suggest that the pervasive uncertainty generated by the Second World War led to a growth of risk-sharing arrangements. The government took over the task of providing universal war risks and war damage insurance in response to public desire for risk-spreading, and this in turn provided a major impetus to the growth of the post-war welfare state.

The median voter explanation implies that political parties as such are irrelevant, in that no variation in policy would typically be observed according to the political party in office. The political ideology of the government would only become relevant when it had some degree of monopoly power. This led Kasper (1971) to argue that the dispersion of policies will be greater in states where there is little or no political competition than where there is competition and the politicians are constrained by the preferences of the median voter. He tests this
empirically in a study of welfare programmes in the United States and their variation by state. He divides states into four categories: one-party Democratic, modified one-party Democratic, two-party competitive, and modified one-party Republican. The results indicate that the variation in welfare payments is generally less in the two-party competitive group, although he recognises that there are alternative explanations.

The variation by states in the United States was also used in the study by Davis and Jackson (1974) of Senate voting on the Family Assistance Plan proposed by Nixon but defeated by the Senate in 1970. They note that the plan was supported by equal proportions of Republicans and non-Southern Democrats but that all Southern Democrats voted against. Using survey information on the views of families, they show that the degree of support for the Plan varied significantly with income and region (the latter not according to the apparent self-interest in that the South could be expected to obtain more benefit but gave less support). This variation in constituency support could in turn explain a substantial part of the Senate voting patterns.

The failure of the Family Assistance Plan has been taken by several authors as a case study of the political process. Marmor and Rein (1973) note that when the plan was announced most commentators were in favour, but the "euphoria" of 1969 gave way to a congressional stalemate. One account of the events is that by an active participant, Moynihan (1973). This book, together with those of Bowler (1974) and Burke and Burke (1974), is critically reviewed by Williams (1975), who discusses among other aspects the role of economic analysis in the political process. He expresses the view that "the long struggle for [the negative income tax] is surely one of the finest hours of social policy analysis. Never have policy analysts been so prominent for so long in a social program area" [Williams (1975, p. 434)].

Politicians are one set of actors in the public choice model; pressure groups [whose role is analysed by Becker (1983)] are another, as are special interest groups. Lindbeck (1985) has argued that the more recent expansion of the redistributive activities of the public sector has taken the form of "fragmented horizontal redistributions", in contrast to the broader redistribution which had dominated earlier policy. These reflect politicians "buying votes" from various minority groups. He suggests that one by-product is that the size of gross transfers is much larger than that of net transfers.

The public choice approach is explicitly grounded in the theory of the political process. The second set of explanations, which I have grouped together under the broad heading of "institutional", are more concerned with the development of the welfare state as a social and historical process. In Britain, this is typified by the writing of Titmuss (1958, 1968), in which he sees the evolution of income support as part of a wider process of changing social structure. He attaches particular significance to the role of occupational and fiscal welfare, an aspect
which has been developed by Sinfield (1978). An important example is provided by the relation between private pension schemes and state pensions. Thus, Abel-Smith (1984, p. 173) has recently argued for the retention of state earnings-related pensions on the grounds that "if the social security system does not incorporate earnings-related benefits, these are bound to develop independently. And it is these occupational provisions, both public and private, that have created the most glaring inequalities and inequities."

The development of "fiscal welfare" was stressed by Titmuss long before the concept of tax expenditures was widely discussed: "since the introduction of progressive taxation in 1907 there has been a remarkable development of social policy operating through the medium of the fiscal system" [Titmuss (1958, p. 45)]. Recognition of the parallel function of cash transfers and tax allowances has however failed to influence the political debate. As noted by Marmor (1971, p. 85n), "it is extraordinary how difficult it is to convince the skeptics that tax exemptions are functional equivalents of direct government expenditures". He quotes the example of Aaron as to how absurd it would be if the double exemption for the aged under income tax were proposed as a cash programme: "yesterday on the floor of Congress, Senator Blimp introduced legislation to provide cash allowances for most of the aged. Senator Blimp's plan is unique, however, in that it excludes the poor. The largest benefits are payable to aged couples whose real income exceeds $200,000 per year" [Aaron (1969, pp. 4-6)].

The institutional approach provides one way of rationalising some of the differences between countries in their income support programmes. Aaron (1967) examined the per capita social security expenditure of 22 countries in 1957, and found a significant relationship with the age of the system (in addition to per capita income), concluding in fact that this is the single most important determinant. This has been more fully developed by Wilensky (1975). Using data for 60 countries in 1966, he concludes that the ratio of social security spending to GNP is related significantly to the age of the system, to GNP per capita, and to the proportion of the population aged 65 or older; and that the type of political system had little independent influence. Wilensky discusses in addition the evidence from time series; on this, see also Heidenheimer and Layson (1982), who consider the balance between universal and selective programmes.

The third type of research, that labelled "Marxist", is also historical in its approach, and it may be seen as offering one explanation of the political and other forces which have been observed in the determination of welfare policy. At the same time, a central insight of this approach is that the welfare state has a contradictory position in a modern capitalist economy. As it is put by Gintis and Bowles (1982, p. 341), "the contradictory nature of the welfare state stems from its location at the interface of two distinct sets of 'rules of the game'". Whereas the welfare state is founded in a liberal-democratic manner on claims on resources being related to citizenship, capitalism recognises claims on resources based on the ownership of property. "Where the games are overlapping, as in all
modern capitalist welfare states, there is [increasingly] conflict between the regime of citizen rights and the regime of property rights" [Gintis and Bowles (1982, p. 341)]. On the other hand, the welfare state is seen by some Marxists as an instrument of capital. Gough (1979, p. 11) describes the analysis by radicals and Marxists of "the welfare state as a repressive mechanism of social control: social work, the schools, housing departments, the probation service, social security agencies - all were seen as means of controlling and/or adapting rebellious and non-conforming groups in society to the needs of capitalism".

4.6. Concluding comment

In seeking to explain why income maintenance programmes have enjoyed less than complete success, two aspects should be stressed. The first is the problem of sheer complexity. The complex provisions and interactions pose serious barriers for recipients, administrators, and legislators. Economists and others have an important role to play in ensuring that the implications of present policies, and proposed reforms, are fully understood. The second is the failure of political will which prevented proposals for income support being enacted in their entirety (such as the Beveridge plan largely to do away with means-tested assistance) and which has led in recent years to the cutting back of the social insurance system. Account has to be taken of the background of shifting political objectives and support, and programmes for reform have to be designed with a view to the political processes necessary for their introduction.

5. Effects on economic behaviour

Much of the recent growth of interest in social security amongst economists has stemmed from an anxiety that the existence of social security has had significant effects on behaviour. The supposed impact of social security on labour supply, on welfare dependence, on retirement, and on savings, has therefore come to play a central role in discussion of reforms in income support.

In this survey, I concentrate on the influences on family decisions. The analysis is therefore a partial one, and the effects on, for example, the employment decisions of firms are not considered. (Thus, when discussing unemployment insurance, the incentive to lay off workers is not addressed.) The theoretical models employed in the analysis of the effects of social security, and the econometric techniques used in the estimation of empirical relationships, are in many respects similar to those used in the analysis of taxation. Indeed, concern for the possible disincentive effects of a negative income tax was one of the main reasons for the resurgence of empirical work on taxation and labour supply. There is therefore considerable overlap with Chapter 4 in this Handbook by Hausman on labour supply and, to a lesser extent,
Chapter 5 in this Handbook by Sandmo on savings. In view of this, I have chosen to concentrate on those aspects less fully covered by these authors. In particular, I have not considered the influence of social security on the hours of work by family heads or on the participation by married women in the labour force. Nor will I devote space to the negative income tax experiments. In addition to the survey in Chapter 4 of this Handbook, there is extensive coverage of these topics in Keeley (1981), Killingsworth (1983), and Pencavel (1984).

At the same time, it has to be remembered that there are important interrelationships between different family decisions. The labour supply decisions with which I begin, dealing with retirement in Section 5.1, disability in Section 5.2, and duration of unemployment in Section 5.3, all are closely related to the decision to claim benefits. This is treated explicitly in Sections 5.2 and 5.3, where some studies focus on the probability of claiming disability insurance or receiving unemployment benefit, and it forms the subject of Section 5.4, which is concerned with the probability of claiming welfare/public assistance. Receipt of welfare may in turn have an impact on decisions about family formation, and the evidence about this is considered in Section 5.5. Decisions about participation during the prime working years have an influence on pension entitlements and hence on retirement decisions. This is affected too by the level of savings, which is the subject of Section 5.6, and over the life cycle, savings and retirement decisions are likely to be made simultaneously.
5.1. Income support and its effect on retirement decisions

In the United States there has been a steady decline since 1900 in the participation rate of men aged 65 or over, with an acceleration after the Second World War [Aaron (1982, fig. 4)]. This trend has been seen by some as coinciding with the expansion of social security and the conclusion drawn that there must be a causal connection. However, such a conclusion cannot be drawn so easily. The trend to earlier retirement appears to pre-date widespread eligibility for state pensions, and it is far from evident that the cuts in social security in the 1980s will reverse the trends. In this context, it should be noted that this review is concerned with state pensions, not covering occupational, or other private, pensions, and that disentangling the effects of the two systems is no easy task. Moreover, it is evident that any correlation must be interpreted with considerable care, a care which is not always exhibited by those anxious to draw empirical conclusions. We need throughout to bear in mind that both the labour supply responses and the expansion of social security may be the product of other economic changes.

In Section 3, the simple theoretical model portrayed the retirement decision in a static manner, in which the introduction of a categorical benefit conditional on retirement would tend to reduce participation. This remains true if the benefit is
tapered with earnings, through an earnings test, rather than removed completely if the person works (as shown in Figure 3.2), in the sense that the existence of the pension reduces the probability of participation. It is of course possible that variations in the size of the pension about its present level have no effect because all those qualifying (over the minimum pension age) have decided to retire, i.e. there is a corner maximum. Variations in other parameters of the pension may also have no effect on the participation decision. As noted by Gordon and Blinder (1980), the tax rate implicit in the earnings test in the United States has no effect on the corner condition since the first tranche of earnings does not affect the pension. If people are free to vary their hours of work, and there are no fixed costs of working or other sources of non-convexity in the budget constraint, then the decision to increase hours above zero will be unaffected by the earnings test. If working 1 hour a week is not attractive, then no amount of work will be, and the earnings test is not relevant.

A static model is of course unsatisfactory for a phenomenon such as retirement. Models of life-cycle choice of retirement date are described by, among others, Feldstein (1976a), Hemming (1977), Sheshinski (1978), and Hu (1979). In this context, the explicit treatment of uncertainty appears especially important: people experience considerable uncertainty about wealth accumulation, financial needs, health, and job opportunities. These uncertainties imply that people are continuously reconsidering their plans for retirement and wealth accumulation as their economic and health positions develop [Diamond and Hausman (1984b, p. 97)].

The importance of the intertemporal aspects may arise on account of the rules of the pension scheme and the complex interdependencies that are induced. A person in Britain who reaches the minimum retirement age may choose to defer retirement, in which case the pension ultimately received is increased, acting as an incentive to defer retirement in certain cases. In the United States, Blinder et al. (1980) have argued that the increase in pension through deferred retirement may be more than actuarially fair for those aged 62-64, and that this may be enhanced through the formula that relates benefits to the earnings of a subset of years in the person's career. Through the automatic benefit recomputation, earnings at age 62 may raise the average on which pensions are calculated. The details of the calculations have been the subject of debate [see the comment by Burkhauser and Turner (1981) and the reply by Blinder et al. (1981)], but Blinder et al. (1980) have made an important contribution in drawing attention to the complexities of the trade-offs with which people may be faced under the social security system. How far people are in fact well informed about the provisions is a matter about which little is known. The perceptions which govern behaviour may be quite different from the reality.

As in all areas of applied economics, the extent of the link between theoretical models and econometric specification varies considerably. Table 5.1 summarises
the main features of studies of the retirement decision, beginning in the first column with the underlying theoretical model. The table, like those that follow, draws heavily on the valuable survey by Danziger et al. (1981), but since this survey deals almost exclusively with the United States, I have supplemented it with studies from Britain and certain other countries (as well as including U.S. studies that have appeared since 1981). In this, and subsequent tables, I have not attempted to provide exhaustive coverage of all relevant studies, rather taking a selection that is broadly representative of recent work.

[Table 5.1. Studies of the retirement decision: the theoretical model, data source, sample, estimation method, and main findings of each study. The tabular detail is not legible in this copy.]

The evidence about retirement, like that about the other decisions discussed, may come from one of four different types of source: (1) cross-country comparisons, (2) time series, (3) cross-section data, and (4) panel data. The last of these is where information is collected from individuals, or from their records, at a series of dates. As may be seen from Table 5.1, most of the studies are based on cross-section data, but a feature of the recent research has been the increased availability and use of panel data.

The different types of evidence pose different problems of interpretation. In the case of cross-country studies, there may be economic, social or cultural factors which lead to differences in retirement ages and which also lead to differences in the provision for that retirement, without there necessarily being any direct causal connection between benefits and retirement. In the case of time series, there is the same problem in a different dimension. We observe an upward trend in both retirement and in pensions, but find it difficult to isolate that element which is a causal relationship from that part associated with other trends in the society. The use of cross-section data allows us to explore the relation with individual benefit, which may be less highly correlated with other factors, but it is the relative behaviour which is being modelled. We may find that people with high pensions tend to retire earlier than those with low pensions but this is still consistent with there being no causal effect. The existence of pensions may influence who retires but not the average retirement age.

The different types of evidence have different strengths and weaknesses. In the case of the cross-country studies, one obvious shortcoming, recognised by the authors, is that there is a small number of observations. Pechman et al. (1968) caution the reader that for this reason, and on account of the difficulty of obtaining comparable data, their results need to be treated with care. The same applies to the time-series studies, where annual observations for the post-war period are unlikely to offer a firm basis for separating the different influences in operation. For these reasons, most weight has been given to the cross-section investigations.
The cross-section studies have a number of features in common. With one exception, they relate to the United States. In several cases, they make use of the same data set: the Longitudinal Retirement History Survey, referred to as RHS. [This data source is a panel, but the studies do not typically make explicit use of this aspect; where more than one year is taken, as in Hanoch and Honig (1983) for example, the observations are treated as independent.] The sample size is typically at least a thousand, and this is one respect in which research has altered: the early study by Boskin (1977), for example, used only 131 observations. The estimation methods employed are typically fairly sophisticated, with allowance for the limited range of dependent variables and for sample selection [see Maddala (1983) for a useful review of the estimation problems in general].

At the same time, despite these similarities, the results reached in the different studies vary considerably. If we consider both the cross-section and panel studies for the United States listed in Table 5.1, then we can see that the following conclude that pensions have a significant influence on retirement: Burkhauser (1980), Hall and Johnson (1980), Hanoch and Honig (1983), Pellechio (1979), Quinn (1977), Boskin and Hurd (1978), and Diamond and Hausman (1984a and 1984b). See also Hurd and Boskin (1984), not shown in Table 5.1. On the other hand, the following conclude that the effect of pensions is either statistically insignificant or economically unimportant: Gordon and Blinder (1980), Hamermesh (1984), Kotlikoff (1979b), Burtless and Moffitt (1984), and Mitchell and Fields (1984). This is rather confusing. The first lesson to be drawn is that it is important for researchers to explain the relationship between their results and those of other investigators; and best practice applied econometric work now includes such a comparative analysis of findings. There are several reasons for these differences. Here three are considered: model specification, treatment of benefits and taxes, and choice of sample.

5.1.1. Model specification

The first point to be made about the specification is that the variables to be explained differ considerably between the various studies. Some treat retirement as an either/or situation (not always with the same empirical definition). This applies to Burkhauser (1980), Pellechio (1979), and Quinn (1977), for example, in whose studies the variable to be explained is the state of retirement or not. The same discrete approach may be modelled in terms of the planned or actual retirement age, as in the studies by Hall and Johnson (1980), Kotlikoff (1979b), and Mitchell and Fields (1984). Such a treatment does not allow for partial retirement, and Boskin and Hurd (1978) [and Zabalza et al. (1980) in the United Kingdom] widen the choice to three states, including semi-retirement where the
person continues to work part time. The hours of work may however be treated as a continuous variable, as in the theoretical formulation of Gordon and Blinder (1980), in which case they may be combined with participation to yield a total hours variable, as in Hamermesh (1984). There are, however, good reasons for treating the participation and post-retirement hours separately, and this is the practice of Hanoch and Honig (1983) and Burtless and Moffitt (1984).

The second aspect of the specification concerns the underlying model. The earlier studies tended to write down directly the equation to be estimated rather than derive it from any more basic model. This raises the problem of relating the results of different studies. Thus we might model the probability, f(t)dt, of a person retiring in the interval [t, t + dt] conditional on not having previously retired, and treat f(t) as a function of, for example, the benefit rate. Or we might model the probability that the person has not retired by date t, giving G(t), the "survivor" function, which is related to f(t) by

G(t) = exp[-∫_0^t f(v) dv].    (5.1)
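As a minimal numerical sketch of what equation (5.1) implies (in Python, with an invented baseline hazard and benefit coefficient; nothing here is taken from the studies under discussion):

    import math

    # Hazard of retiring at duration v, assumed linear in the benefit level b.
    # The coefficients 0.02, 0.001 and 0.01 are purely illustrative.
    def hazard(v, b):
        return 0.02 + 0.001 * v + 0.01 * b

    # Survivor function G(t) from equation (5.1): the probability of not yet
    # having retired by t, computed by numerical integration of the hazard.
    def survivor(t, b, steps=1000):
        dv = t / steps
        return math.exp(-sum(hazard(i * dv, b) * dv for i in range(steps)))

    for b in (0.0, 1.0, 2.0):
        print(b, round(survivor(10.0, b), 3))
    # Equal increments in b shift the hazard linearly, but G(10) falls
    # non-linearly: each step multiplies the survivor probability by exp(-0.1).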
It is clear that a linear influence of benefits on f(v) at time v will have a (non-linear) influence on G(t) for all t > v. (The same kind of model is used in a number of studies of unemployment duration - see below.)

The choice-theoretic basis for the retirement decision is considered by several authors. One approach is to represent this choice in terms of a person's reservation wage, whereby he participates if the offered wage exceeds this reservation wage. This provides a pair of equations, with the market wage being observed where the person participates, and this is the route followed by Gordon and Blinder (1980). A second, and more basic, approach is in terms of explicit utility maximisation, as is exemplified by the work of Zabalza et al. (1980). They posit a constant elasticity utility function defined over income and leisure, and use this to evaluate the utility levels at three points on the budget constraint. This technique is the same as that used for labour supply by Burtless and Hausman (1978). As these latter authors argued, one of the virtues of the approach is that it makes explicit the source of variation across individuals. In the Zabalza et al. (1980) study, people differ in their marginal rate of substitution of income for leisure in a way that reflects both observed characteristics and a stochastic term.

5.1.2. Treatment of benefits and taxes

The treatment in terms of explicit utility maximisation allows one to handle the quite complex budget constraints which may be generated by benefit systems and tax systems and their interaction. It is this complexity which casts serious doubt on any attempt to use an aggregate index, as is necessary when using time-series
or cross-country data. It is not like studying the demand for cigarettes, where the price moves over time in a broadly similar manner for all consumers and we can therefore reasonably take an overall index. In the case of retirement, the "price", in terms of the income difference, varies in different ways for different people, depending for example on their past earnings histories. The indicator of pensions relevant to those at the margin of retirement may move quite differently over time, or vary quite differently over countries, from the average levels of benefits typically employed.

This is an argument for the use of individual data, but the complexity of the budget constraint raises problems here too. The relevant parameters of the benefit and tax system may not be immediately apparent. This is illustrated by the ingenious argument of Gordon and Blinder (1980), referred to above, that the relevant consideration for participation is the return to the first hour of work, leading them to suggest that the earnings test is irrelevant. It is, however, not just a question of determining the relevant marginal incentive or disincentive but also a question of how the system is perceived by the people concerned. Evidence from the OPCS Older Workers and Retirement survey in Britain shows limited understanding of the provisions for earning additional pension by deferring retirement or of the earnings rule. In the latter case, it is noted that "even among those subject to the earnings rule, only a minority knew how it operated" [Parker (1980, p. 45)]. As a result, it may be better to try and model the perceived budget constraint than to refine the treatment of the theoretical trade-offs.

The treatment of the amount of pension entitlement differs across the studies both because of the differences in the available data and because authors take different views as to the appropriate choice of independent variable. It is in the nature of the exercise that the actual pension is not observed for those who have not retired, so that a calculation must be made of the potential pension. In order to calculate the potential pension, one needs access to data on earnings records similar to that collected by the social security administration. Where these data are not available, then it may be necessary to proxy the entitlement, perhaps the simplest such proxy being the 0/1 eligibility variable used by Quinn (1977). Most of the studies have, however, used sources which contain past earnings records for at least part of the career.

There are two issues which arise in the use of such records to construct potential pensions. First, some authors adopt the practice of using the actual earnings where available and otherwise the calculated earnings. To the extent that the calculations systematically under- or over-state the pension, this is a potential source of bias in the estimated coefficient on the pension variable. Second, several authors have drawn attention to the fact that the actual earnings may be related to unobserved variables which also enter into the determination of retirement behaviour. A person who has a taste for work may both have acquired a higher pension entitlement, through having worked longer hours and earned more, and
for the same reason be less likely to retire early. Thus, Boskin and Hurd (1978) use a two-stage estimation procedure to allow for this endogeneity of benefits, and Gordon and Blinder (1980) use an instrument constructed by assuming specified annual hours and using the estimated individual wage rate to calculate earnings and hence pension entitlement.

5.1.3. Choice of sample

No fewer than seven of the studies listed in Table 5.1 make use of the same data set: the U.S. Longitudinal Retirement History Survey. Yet the samples chosen for analysis differ materially. The survey is a continuing survey and the first point of difference is in the wave or waves taken: Quinn (1977) and Hall and Johnson (1980) take the data for 1969; Boskin and Hurd (1978) use those for 1969 and 1971; Gordon and Blinder (1980) those for 1969, 1971, and 1973; Hamermesh (1984) and Hanoch and Honig (1983) take the story up to 1975; and Burtless and Moffitt (1984) are the most recent, using information up to 1977. Secondly, there is the difference between those studies which make use of the panel nature of the data, as with the studies shown towards the end of Table 5.1, and those which do not. The value of panel data is well illustrated by the work of Diamond and Hausman (1984a, 1984b). We have, however, to bear in mind the problems which arise with such data from attrition. This may arise because some people did not respond to all waves of the survey, or it may reflect mortality of the respondent or spouse (where attention is confined to intact couples). Diamond and Hausman assume that attrition is independent of the causes of retirement, but in earlier work Hausman [with Wise (1979)] has examined the consequences of attrition being correlated with behavioural responses. As is noted by Diamond and Hausman (1984b), non-independent attrition could be allowed for via a "competing risks" model [see Lawless (1982, ch. 10)].

The bias which may be introduced through sample attrition is an example of a wider problem. In the use of any subsample, there is the danger that the sample inclusion criteria may generate "selection bias", the implications of which have been extensively studied by Heckman (1976, 1979) and others. The simple procedure proposed by Heckman for the case of normal distributions is to include for each observation the inverse Mills ratio calculated from an equation explaining sample inclusion, and this procedure is followed in some of the studies listed in Table 5.1. For example, when Kotlikoff (1979b) takes the subsample with expected retirement age greater than 62 he includes the inverse Mills ratio in the equation for the expected retirement age. The same procedure is used where wage equations are estimated (to calculate wages for those already retired) using data on those who are working. For example, Hanoch and Honig (1983) estimate wage equations for men and women aged 58-69, finding that the selectivity effect (inverse Mills ratio) is significant for men but not for women.
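As an illustration of the Heckman procedure, here is a schematic two-step estimator in Python on simulated data; the data-generating process, the coefficient values, and the use of statsmodels are all choices made for this illustration, not features of any of the studies cited:

    import numpy as np
    import statsmodels.api as sm
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    n = 5000

    # Simulated data in which the errors of the selection (participation) rule
    # and the wage equation are correlated, so OLS on workers alone is biased.
    z = rng.normal(size=n)
    u, e = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=n).T
    works = 0.5 + 1.0 * z + u > 0          # sample inclusion rule
    wage = 1.0 + 0.8 * z + e               # wage, observed only if works

    # Step 1: probit for sample inclusion, then the inverse Mills ratio
    # evaluated at the fitted index for each observation.
    X_sel = sm.add_constant(z)
    probit = sm.Probit(works.astype(float), X_sel).fit(disp=False)
    index = X_sel @ probit.params
    mills = norm.pdf(index) / norm.cdf(index)

    # Step 2: wage equation on the selected sample with the Mills ratio added.
    X_out = sm.add_constant(np.column_stack((z[works], mills[works])))
    print(sm.OLS(wage[works], X_out).fit().params)
    # The coefficient on z is close to its true value 0.8 once the Mills ratio
    # controls for selection; omitting the ratio biases it downwards here.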
In fact, from Table 5.1 it may be seen that most investigators limit their attention to a subsample of the survey. Broadly, if you are a white employee, married and male, and responded to all questions, then you stand a good chance of being included. But if you were black, female or self-employed, then your retirement decisions are unlikely to be modelled. This of course raises questions about the extent to which the conclusions may be extrapolated to the population as a whole. It also raises questions about the validity of the resulting estimates. The issues raised by the deletion of observations - that is, the deliberate limitation of the sample by the investigator - are discussed by MacDonald and Robinson (1985).

The elements described above are some of those which may lead to the differences in results. In some cases, authors have noted the differences between their findings and those of others and sought to explain them [an example is Diamond and Hausman (1984b)]. There is, however, need for further work reconciling the different findings.

5.2. Disability benefits and participation
Concern for the position of the disabled has undoubtedly grown in the past 20 years, not least because people have increasingly seen that there are common problems shared by war veterans, those born with handicaps, and those with industrial injuries. The problems may be common, but the financial and other provisions have differed considerably, and there have been moves to ensure a more coherent and effective system of support, or disability benefit, for those whose earning power is diminished. At the same time, the expansion of financial support has attracted criticism from those who have argued that it has reduced the incentive to work. Parsons, for example, documents the changing participation rate among men under 55, and argues that "the sharp decline in adult male labour force participation among both blacks and whites has coincided with a rapid expansion of the Social Security disability program" [Parsons (1980b, p. 912)]. The provision of disability benefits raises many issues, and these have been discussed in a U.S. context by Burkhauser and Haveman (1982) and in Britain in Walker and Townsend (1981). Haveman et al. (1984) provide a recent account of policy towards disabled workers, covering West Germany, France, Israel, Italy, Netherlands, Sweden, and the United Kingdom. Here I concentrate on the incentive issue. Table 5.2 summarises six studies from the United States, all based on cross-section or longitudinal data. [See Wolfe et al. (1984) for a study which covers both the United States and Netherlands.] The research in this area has been much less extensive than that on retirement but the same range of findings is evident. On the one hand, Parsons, in an analysis which goes beyond the
[Table 5.2 - a summary of six U.S. studies of disability benefits and labour force participation - is not legible in the source.]

On the one hand, Parsons, in an analysis which goes beyond the
simple time-series point of the previous paragraph, concludes that there is a significant relationship between non-participation at the individual level and disability income, with an estimated elasticity of 0.63. On this basis, much of the observed increase in non-participation can be explained. In the subsequent paper [Parsons (1980b)] he reports an estimated elasticity nearly three times as high (1.8), although the figures are corrected to 0.49 and 0.93, respectively, in Parsons (1984, p. 544n). On the other hand, Haveman and Wolfe (1984a) find a significant but small coefficient, and conclude that more generous disability benefits account for only a small part of the increase. In the middle comes the study by Leonard (1979), who attributes about half of the increase to the growth in transfers.

These findings illustrate again that one can draw quite different conclusions even when applying similar methods and using similar types of data. As noted, Parsons himself obtains estimates that differ substantially, and this difference appears to arise from (i) consideration of labour force status in 1966 rather than 1969, (ii) augmenting the sample by including those without wage information for 1966, and (iii) use of an estimated wage for those not reporting this variable [Parsons (1980b, p. 919)]. In their critique of the study by Parsons, Haveman and Wolfe (1984a, 1984b) provide a useful table showing how they arrive at different results. Their replication of the model of Parsons on the data set employed, the Michigan Panel Study of Income Dynamics, yields a similar coefficient for the social security variable. They then vary the specification, showing that the introduction of benefits and wages separately, rather than in ratio form, leads to neither being significant, and that a correction for the selectivity bias associated with using only data on those with reported wages also reduces the coefficient on the social security variable. This lack of robustness of Parsons' results may be due to the small sample size used by Haveman and Wolfe, as suggested in Parsons (1984), but this kind of systematic comparison of results is clearly necessary if we are to understand the source of the discrepancies.

In the case of disability insurance, there are similar problems to those in the study of retirement. First, the specification of the model needs elaboration. The recent work by Halpern and Hausman (1986) provides an example of how the participation decision can be given a more explicit theoretical structure. This applies both to the underlying decision framework, based on expected utility maximisation, and to the institutional context, where Halpern and Hausman take account of the probability of acceptance, assuming that people form an unbiased estimate of their probability of acceptance. Consideration of the decision made by applicants has also to allow for the other possibilities which may be open. Haveman et al. (1984) examine the position of older workers who can choose between applying for disability insurance, working, or taking early retirement.

Secondly, the treatment of benefits raises several issues. In the present case, the problems of predicting benefit entitlement seem especially acute. The method
used [for example, Parsons (1980b)] is to take recorded wages, to extrapolate these to a career earnings variable, and then to apply the benefit rules. Not only is this a rough approximation to the benefit which would be received, but also it takes no account of the probability of acceptance. Eligibility depends on evidence of "total and permanent" disability and that no substantial gainful activity is possible; as a result, only between a third and a half of applications are successful [Lando et al. (1982)]. Modelling the probability of acceptance is one of the features of Halpern and Hausman (1986). They find that a variable based on nine questions about functional limitations is highly significant, as are certain health conditions.

Thirdly, there is the choice of sample. As emphasised by Parsons (1984), the quality of the information on health status is of central importance, and this limits the data sets which can be used. At the same time, relatively little is known about the incidence of disability in the national population, and the use of special surveys of the disabled raises the problem of extrapolating the findings to the population as a whole.

Finally, it should be noted that this discussion has been confined to disability; we should also consider the relation between benefits and short-term sickness absence. This has been examined in the United Kingdom by Fenn (1981).
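The entitlement-prediction method just described can be sketched in a few lines. In the sketch below, the wage-growth assumption, the benefit formula, and the acceptance probability are all invented for illustration; actual social security rules are far more intricate.

# Illustrative sketch: predicting disability benefit entitlement from recorded
# wages. The extrapolation rule and benefit formula are hypothetical.
from dataclasses import dataclass

@dataclass
class Worker:
    recorded_wages: list[float]  # observed annual earnings
    years_to_cover: int          # career length required by the (stylised) rules

def career_average_earnings(w: Worker, growth: float = 0.02) -> float:
    """Extrapolate observed wages to a full career at an assumed growth rate."""
    wages = list(w.recorded_wages)
    while len(wages) < w.years_to_cover:
        wages.append(wages[-1] * (1 + growth))
    return sum(wages) / len(wages)

def benefit(avg_earnings: float) -> float:
    """Stylised progressive benefit formula (purely illustrative brackets)."""
    if avg_earnings <= 10000:
        return 0.9 * avg_earnings
    return 9000 + 0.32 * (avg_earnings - 10000)

w = Worker(recorded_wages=[12000, 12500, 13100], years_to_cover=10)
entitlement = benefit(career_average_earnings(w))
# An expected benefit would further weight by the probability of acceptance,
# which Halpern and Hausman (1986) model explicitly; e.g. 0.4 * entitlement.
print(round(entitlement, 2))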
5.3. Unemployment insurance and work decisions

The rise in unemployment in the 1970s and 1980s has led to renewed interest in unemployment insurance. This might seem scarcely surprising, in that the massive increase in unemployment has put the insurance system to a test which it had never previously faced in the post-war period, and there might be reasons to doubt that it had been fully effective in providing income support. It is not, however, these aspects which have received prominence in recent economic debate. It is the incentive rather than the income support aspects that have been the focus. How far has the existence of unemployment insurance reduced the incentive to work? Are improvements in unemployment insurance themselves in part responsible for the rise in unemployment? The studies which have been made to throw light on this question - summarised in Table 5.3 - illustrate the strengths and weaknesses of different types of evidence about incentives.
[Table 5.3 - a summary of studies of unemployment insurance and work decisions - is not legible in the source.]

First, there are time-series studies, favoured by those who have argued that unemployment benefits have led to higher unemployment [a recent example in the United Kingdom being the work of Minford et al. (1983)]. Time series has the merit that it allows one to follow the consequences of changes in unemployment insurance. In the United Kingdom, one of the best-known of such "experiments" was the introduction in 1966 of the earnings-related supplement (ERS), which added substantially to the benefits which certain
workers could receive. The initial impetus for studying the consequences of this change came from the observation that it had been followed by a significant rise in registered unemployment.

The problem with the use of time-series evidence is that it is difficult to isolate the effect of policy changes, such as the introduction of ERS, from other influences. It is for this reason that the word "experiment" is in inverted commas. The early attempt to examine the relationship in the United Kingdom, by Maki and Spindler (1975), introduced into the equation for unemployment a set of variables to control for the structural and cyclical factors affecting unemployment. Allowing for these, they conclude that increases in unemployment benefit had a "substantial" effect on unemployment. The same conclusion is reached by Grubel and Maki (1976) for the United States, although their estimate of the elasticity is around ten times that in the United Kingdom - see Table 5.3.

Grubel and Maki (1976, p. 293) themselves suggest that the time series may lead to an over-estimate of the response, and feel that the elasticity estimated in a cross-state analysis is a more accurate reflection of the unemployment induced by benefits. The argument given as to why the time series may over-state the elasticity is that there may have been other variables changing over time in the same direction [the examples given by Grubel and Maki (1976) include changes in public attitudes towards work]; and there are reasons to doubt whether the control variables are sufficient to represent what would have happened in the absence of changes in benefits. In the United Kingdom, it has been shown that the estimated effect of benefits is sensitive to the introduction of a time trend [Taylor (1977)] or of a permanent income variable [Cubbin and Foley (1977)]. The study by Junankar (1981) shows that the findings are not robust with respect to the choice of estimation period.

Consideration of the specification of the equation to be estimated leads one to ask how the model relates to the treatment of the labour market in general. In the studies by Maki and Spindler (1975) and others, the precise status of the equation is not made clear; presumably it is a reduced form, obtained from a fuller set of underlying relationships. The time-series approach needs in effect a fully-specified labour market model. The work of Minford et al. (1983) provides one version of such a model, although its structure is not easy to understand. A more explicit model is that of Nickell (1982), who models separately the outflow from unemployment and the inflow from employment, finding from time-series evidence in the United Kingdom that benefits have no significant impact on inflows, but that the coefficient in the outflow equation is significantly negative. Rather different results are given by Junankar and Price (1984).

In the development of a full labour market model, it is necessary to distinguish the different labour market states (this aspect has been stressed in the work of Hamermesh and of Clark and Summers). A person may be employed (including here self-employment), unemployed, or not in the labour force. Transitions are
possible between all three states. If unemployment is defined in terms of registration for benefit (this is discussed below), then a rise in benefits may induce higher unemployment through transitions both from employment and from the "not in the labour force" category, in addition to those who remain unemployed. In other words, part of the measured increase may come about because people choose to register who previously were not recorded.

The operation of the benefit system is more complex than simply signing-on, and this is a further source of problems with the time-series approach. The typical practice is to take the benefits which are received by some "representative" worker: for example, Maki and Spindler take those for a worker with average earnings in the relevant contribution year and no spells of unemployment in that year. The problem is that benefit entitlements vary a great deal across the population, and they may have changed at different rates. As is shown in Atkinson and Micklewright (1985), different indicators of benefits followed quite different time paths over the period 1970-1983. Evidence about replacement rates in the United States, and their variation across individuals, is discussed by Hamermesh (1977).

The advantage of studies based on individual data is that they can take account of the diversity of unemployment benefits; and this type of evidence has been extensively analysed. We should, however, note at the outset that the extrapolation from individual experience to the population as a whole poses a serious problem, in that individual decisions may be interdependent, but the analysis usually treats each individual's decision as independent. Suppose, for example, that we observe that people with higher benefits delay returning to work. This does not necessarily mean that aggregate unemployment will be higher, since the refusal of jobs by one group may mean that the jobs are offered to others, increasing their re-employment probability. The pattern of unemployment benefits may determine who is unemployed but not affect the aggregate. As it is put by Albrecht and Axell (1984, p. 839):

the typical cross-section or panel analysis addresses the question of whether individuals who receive relatively generous unemployment compensation search more than others who receive relatively less generous compensation, holding other differences between individuals constant. But the relevant policy question is whether economies characterised by relatively more generous unemployment compensation have more search unemployment than economies with less generous compensation. Such a question requires an equilibrium answer.

There is therefore a basic problem of interpretation once the problem is set in a general equilibrium context.

Table 5.3 shows a wide range of cross-section studies, differing methodologically in several respects. There is the distinction between reduced form models (the majority of those listed) and structural models [such as those of Kiefer and
Neumann (1979a, 1981) and Narendranathan and Nickell (1984)]. The studies may also be classified according to the aspect of the employment decision which is studied. I described earlier three possible labour market states. In some cases, the transition probabilities between these states are modelled directly [Barron and Mellow (1981), and Clark and Summers (1982)]. Other studies have been concerned with one part of the transition matrix. For example, in the United Kingdom, the probability of leaving unemployment (which could include exit from the labour force) has been the subject of work by Lancaster (1979), Nickell (1979a, 1979b), Atkinson et al. (1984), Lynch (1983), and Narendranathan et al. (1984). They have estimated this exit probability using a relationship with survival among the unemployed similar to that shown in equation (5.1) for retirement. The probability of moving from employment to unemployment has been less studied, but see Marston (1980) and Nickell (1980), as well as Stern (forthcoming) on repeat spells of unemployment.

In other cases, the variable studied is the result of the transition process: for example, the length of time spent in a particular state. The length of an unemployment spell is examined in Classen (1979), Ehrenberg and Oaxaca (1976), Kiefer and Neumann (1979a, 1981), Newton and Rosen (1979), and Hills (1982). Recognising that there is the alternative of leaving the labour force, other studies have taken the length of employment, as in Hamermesh (1979), Solon (1979), and Moffitt and Nicholson (1982).

The emphasis in some studies is on probing the factors underlying the decision, and the job search model has been commonly chosen as a vehicle for this purpose. People are seen as determining their reservation wage, any job offering this or more being acceptable, in order to maximise the expected present value of the income stream. In a simple formulation, where there is an infinite horizon, a stationary wage offer distribution, a constant rate of arrival of job offers, and constant wages and benefits, the choice of reservation wage, w*, may be shown to satisfy

w* = b + P(W - w*)/r,  (5.2)

where b is the benefit, r the rate of discount, P the probability of receiving an acceptable job offer, and W the expected wage associated with such an acceptable offer. This is not of course an explicit expression for w*, since both P and W depend on w*. If one is willing to posit a functional form for the distribution of wage offers, then it may be possible to solve for w*, and this is exploited ingeniously by Lancaster and Chesher (1983), who use information on reservation wages and expected wages in accepted jobs to fit the model exactly for each group of observations in their sample. Other authors have used the theoretical framework less tightly, not relying on the strong assumptions necessary to reach explicit results, and have treated as functions of benefits and other variables the reservation wage [Feldstein and Poterba (1984) and Fishe (1982)] or
the post-unemployment wage [Classen (1979), Ehrenberg and Oaxaca (1976), Kiefer and Neumann (1979a, 1981)]. Or they have treated the exit probability [f(v) in equation (5.1)] as a function of the benefit level [Lancaster (1979), Nickell (1979a, 1979b), Atkinson et al. (1984), and Narendranathan et al. (1985)].

Despite the differences in approaches, the findings of the cross-section studies appear at first sight to be relatively close. In a number of cases, it is reported that a 1 percent rise in benefits tends to be associated with rather less than a 1 percent increase in unemployment duration or decrease in the probability of leaving unemployment: see, for example, Lancaster and Nickell's (1980) summary of their evidence. In 1977, Hamermesh summarised his review of twelve U.S. studies by saying that "the best estimate - if one chooses a single figure - is that a 10-percentage-point increase in the gross replacement rate leads to an increase in the duration of insured unemployment of about half a week when labour markets are tight" [Hamermesh (1977, p. 37)]. If all that one wishes to conclude is that the effect of unemployment benefits is rather modest, then that is probably sufficient. However, if one investigates more closely the different studies in the hope of reaching a more precise conclusion, then the picture becomes more blurred. We should also note the rather different finding with regard to flows into unemployment: for example, the study by Stern (forthcoming) of repeat unemployment in the United Kingdom finds no significant effect of benefits.

The first source of doubt is that the reported statistical significance of the findings is not in many cases particularly impressive [an exception is Narendranathan et al. (1985)]. Taking, for example, nine estimates from the work of Ehrenberg and Oaxaca (1976), Kiefer and Neumann (1979a), and Lancaster (1979), using samples of around 500 cases, we find estimated t-statistics ranging from 1.7 to 3.8 with a mode of 2.3. Although t-statistics of 2.0 are often treated as having attained respectability, the resulting estimates are clearly not very precise; and their lack of robustness has been brought out in studies which have varied the specification, the treatment of benefits, and the sample employed.

The specification in terms of the search model has already been discussed. Not only the detailed assumptions necessary to arrive at the form to be estimated, but also the basic presumptions of the model may be questioned. The role given to demand considerations is limited, whereas for many people this may be the crucial factor: people may be willing to accept any job which is offered. The shortage of microdata covering both supply and demand sides of the market is perhaps one reason why this has not been further explored.

The aspect of specification which has been most fully investigated is the treatment of individual differences ("heterogeneity"). If people differ in their willingness to return to work, for example, then any cohort of unemployed will contain an increasing proportion of those reluctant to return; if this is not allowed for, then it may contaminate the estimates of the relationship with other variables [see Heckman and Singer (1984a, 1984b, 1984c)].
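To see how a functional form for the offer distribution pins down w* in equation (5.2), the following numerical sketch solves the fixed point by bisection. The lognormal offer distribution and all parameter values are illustrative assumptions, not estimates from any of the studies discussed.

# Minimal numerical sketch of the stationary search model in equation (5.2):
#   w* = b + P(W - w*)/r.
# The lognormal offer distribution and parameter values are hypothetical.
import numpy as np

b, r, arrival = 60.0, 0.05, 0.5            # benefit, discount rate, offer rate
wages = np.linspace(1.0, 1000.0, 40000)    # wage grid for numerical integrals
dw = wages[1] - wages[0]
mu, sigma = np.log(100.0), 0.4             # lognormal offer distribution
pdf = (np.exp(-(np.log(wages) - mu) ** 2 / (2 * sigma ** 2))
       / (wages * sigma * np.sqrt(2 * np.pi)))
pdf /= pdf.sum() * dw                      # normalise on the grid

def excess(w_star: float) -> float:
    """w* - b - P(W - w*)/r; the reservation wage is its root."""
    accept = wages >= w_star
    p_accept = float(pdf[accept].sum() * dw)
    if p_accept <= 1e-12:                  # no acceptable offers at all
        return w_star - b
    P = arrival * p_accept                 # probability of an acceptable offer
    W = float((wages[accept] * pdf[accept]).sum() * dw) / p_accept
    return w_star - b - P * (W - w_star) / r

lo, hi = b, float(wages[-1])               # bracket the root and bisect
for _ in range(100):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if excess(mid) < 0 else (lo, mid)
print(round(0.5 * (lo + hi), 2))           # implied reservation wage

Raising b in this sketch raises the implied reservation wage and hence lengthens expected duration, which is the comparative static the cross-section studies seek to estimate.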
The treatment of benefits raises similar questions to those discussed in the context of retirement decisions. The complexity of the benefit formulae and their interaction with other benefits and the income tax system mean that the calculation of the true incentive or disincentive to work may be difficult. The interdependence of decisions over time is a crucial factor not captured by the stationary job search model. The decision to reject a job offer today may have implications for future benefit entitlement, either because it may lead to future disqualification or because entitlement depends on contribution conditions which can only be satisfied if the person returns to work. The positive incentive to enter employment which arises from such contribution conditions has been emphasised in the work of Hamermesh (1979, 1980).

The interrelation with the tax system was discussed in Atkinson and Flemming (1978), where it was noted that with an annual tax assessment and tax exemption for benefits, the marginal cost of not working for a week was represented by earnings net of the marginal tax rate, not the average tax rate as commonly assumed. Furthermore, we have to allow for the fact that part of the income support may be provided by income-tested schemes for which the take-up rate is less than 100 percent, as is discussed further in Section 5.4.

It is therefore difficult for the investigator to calculate the benefit to which a person is entitled, and the consequences for net income of the decision to refuse a job offer. It may be difficult too for the unemployed themselves, and the complexity of the system leads one to suspect that actual behaviour may be governed not by an exact calculation of the trade-off, but by an approximate perception of the replacement rate.
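The Atkinson and Flemming (1978) point about marginal versus average tax rates can be seen in a small worked example; the figures are invented for illustration.

% Illustrative figures: weekly earnings w = 200, benefit b = 100 (untaxed),
% marginal tax rate t_m = 0.3, average tax rate t_a = 0.2 over the year.
% With an annual assessment, a week out of work costs earnings net of the
% marginal rate:
\[
w(1 - t_m) = 200 \times 0.7 = 140,
\qquad\text{not}\qquad
w(1 - t_a) = 200 \times 0.8 = 160 .
\]
% The true replacement rate is therefore b/[w(1 - t_m)] = 100/140, about 0.71,
% higher than the 100/160, about 0.63, implied by the average-rate calculation.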
In practice, two main approaches have been used. First, studies have calculated for each person the benefit entitlement by applying the rules of the schemes in question. In the United Kingdom, this approach is illustrated by the work of Nickell (1979a, 1979b); in the United States, examples are Ehrenberg and Oaxaca (1976) and Clark and Summers (1982), who provide a useful account of their UISIM programme. The problem with this approach is that it takes no account of the factors which cause actual receipt of benefit to differ from that calculated by applying the rules. In the United Kingdom, there are several reasons, including disqualification (for example, for quitting employment "without just cause"), incomplete contribution records, and the linking of spells. These are discussed in Micklewright (1984) and Atkinson and Micklewright (1985). From the evidence contained in the records of the unemployment insurance system, it appears that calculated benefit is likely to over-state the proportion in receipt and the average amount paid. The modelling of benefit receipt is discussed in the United Kingdom by Micklewright (1985) and in the United States by Barron and Mellow (1981).

The second approach is to make use of information on actual benefit receipt and on the individual's past career. The objection to this approach is that discussed in the case of pensions: that benefits may be influenced by unobservable individual characteristics which also enter into the determination of the re-employment probability. A person with a bad previous employment record (not observed) would both get a lower benefit and be less keen to get a new job. The use of the calculated benefit as an instrument would avoid such contamination. For this reason, results based on actual benefits need to be interpreted with caution.

What is clear, however, is that variation in the benefit variable can make a sizeable difference to the conclusions drawn. Atkinson et al. show that in the United Kingdom the use of the calculated benefit variable leads to results similar to those of Lancaster and Nickell (1980), but that if benefit is related more closely to actual receipt then its effect ceases to be significantly different from zero. The conclusion is reached that the results are far from robust. The results of Narendranathan et al., using information from the benefit records, are different in that they find a well-defined elasticity, but it is quite small (0.28-0.36).

The lack of robustness is also illustrated by the variation of results with the choice of sample and dependent variable. In the United States, Hills (1982, p. 462) has shown the sensitivity of the findings to the choice of sample: "we find that changes in sample definition produce quite different estimates of the relation between UI and the duration of unemployment". Moreover, the choice of sample is important when we seek to extrapolate the findings to the whole population.

As in other fields, the initial simplicity of findings with regard to the disincentive effects tends to disappear when the issue is investigated in more detail. The flow of new studies has tended less to confirm the earlier findings than to raise new problems and to point to subtleties not previously taken into account. One cannot, moreover, forget that the provision of unemployment insurance is an issue charged with emotion. The academic debate has been successful in excluding the more obvious bar-room prejudices, but there is considerable scope for the conclusions drawn to be influenced by prior beliefs.
5.4. The take-up of welfare/public assistance

A number of major income-tested benefits are characterised by take-up rates apparently well below 100 percent. The reasons for non-take-up are various, but considerable importance has been attached to the stigmatising effect of receipt. The role of stigma in general is discussed by Cowell (1985), Deacon and Bradshaw (1983), and Spicker (1984). Some writers are sceptical: Klein (1975, p. 5), for example, describes stigma as "the phlogiston of social theory", and argues that low take-up can be explained just as well by other factors.

That content can be given to the role of stigma is, however, illustrated by the theoretical analysis of Weisbrod (1970), who concentrated on what he called "external" stigma, which is associated with the number of people who identify the recipient as poor. He argued that take-up would be an increasing function of
the expected duration of benefits and of the level of benefits, and a decreasing function of the proportion of participants who are poor. Similarly, participation would be greater if the aid were provided in money than if it were in goods and services which can be identified as being for the poor.

If we accept that stigma may influence participation, then it is only one of the factors at work. People may not claim on account of the transaction costs involved. People may not claim because they do not know of their eligibility. Evidence about the relative importance of different reasons may be sought from interviews with claimants and non-claimants. Non-take-up of Supplementary Benefit in Britain by pensioners is examined by Townsend (1979, p. 845), who concludes that "among the elderly, the predominant impression about their failure to obtain supplementary benefit was one of ignorance and inability to comprehend complex rules and pride in such independence as was left to them". [See also Altmann (1981).] Evidence about attitudes towards welfare in the United States is examined by Handler and Hollingsworth (1969). The Michigan Panel Study of Income Dynamics in 1976 asked those not participating in the food stamp programme whether they thought they were eligible and, if so, why they did not take up the stamps. In 60 percent of cases, eligible non-participants did not think or did not know that they were eligible [Duncan (1984, p. 85)].

Interview evidence is suggestive, but does not readily lead to quantification. A second type of evidence is that which seeks to relate participation rates to observed variables. For example, Strauss (1977) set out to measure the contribution made by the availability of information to take-up rates for Supplemental Security Income (SSI) in the United States, using for this purpose observations on receipt in different North Carolina counties, distinguishing between those with and without federal eligibility determination offices. He concludes that the presence of an office was indeed associated with increased participation. Participation in SSI has also been examined by Warlick (1981b), using evidence from the Current Population Survey. The limitations of the available data mean, however, that she has to proxy factors such as "awareness of availability of SSI" by variables like educational attainment or receipt of OASI. The only variable which is statistically significant for all types of family unit is the amount of entitlement; and the results as a whole are inconclusive.

A second programme which has been examined in the United States is food stamps. MacDonald (1977) uses data for 1971 from the Michigan Panel Study of Income Dynamics to examine the determinants of participation. He finds that households in receipt of welfare are more likely to claim, reasoning that they are more likely to be informed about the existence of stamps and that they have already overcome the stigma costs associated with claiming welfare. Secondly, the amount of the food stamp bonus was found by MacDonald to be a significant factor. This conclusion does, however, run counter to that reached by Coe (1979) using data for 1974 from the same source, although this study agrees about the importance of welfare recipiency.
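A hedged sketch of this second type of evidence follows: a probit of take-up on the amount of entitlement and on proxies for information, estimated on simulated data. Variable names, coefficients, and the data-generating process are all hypothetical and do not reproduce the studies cited above.

# Illustrative sketch: a probit relating take-up of an income-tested benefit
# to the entitlement amount and proxies for information/stigma. Data are
# simulated; all variable names and coefficients are hypothetical.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5000
entitlement = rng.gamma(shape=2.0, scale=20.0, size=n)   # dollars per week
on_welfare = rng.binomial(1, 0.3, n)                     # already claiming?
education = rng.uniform(8, 16, n)                        # information proxy

# "True" take-up process: entitlement and prior welfare receipt raise take-up.
latent = -1.0 + 0.03 * entitlement + 0.8 * on_welfare + rng.normal(0, 1, n)
takes_up = (latent > 0).astype(float)

X = sm.add_constant(np.column_stack([entitlement, on_welfare, education]))
probit = sm.Probit(takes_up, X).fit(disp=False)
print(probit.summary())  # education is insignificant here by construction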
In the case of welfare payments, the participation decision has been seen in conjunction with that concerning work. Brehm and Saving (1964) use an income-leisure trade-off in explaining variation in general assistance payments by state in the United States; see in this context the comment by Stein and Albin (1967). More recently, the income-leisure framework has been combined with the explicit treatment of stigma. Ashenfelter (1983) derives a relationship between participation in an income-tested programme and the break-even level of the programme, the level of unearned income, and the tax rate [as in Rea (1974)], plus an unobserved variable which represents the non-pecuniary costs of participation. Using data from the Seattle/Denver Income Maintenance Experiment, and assuming that the unobserved variable is normally distributed, Ashenfelter estimates the mean and variance of this distribution. The presence of a control group allows him to identify the parameters, and he concludes that receipt of benefits was not significantly affected by welfare stigma or other non-pecuniary programme participation costs.

As was clearly identified in the original analysis by Weisbrod (1970), we need to distinguish different ways in which stigma may operate. The cost may be a fixed cost associated with any claim, or there may be a marginal cost arising from more extensive participation or the claiming of more than one benefit. This has been investigated in the context of AFDC in the United States by Moffitt (1983), who allows for the possibility that there is both a fixed and a variable component reducing the value of income. Using data for female family heads in 1976 from the Michigan Panel Study, he estimates jointly the participation decision and the choice of hours of work. The determinants of participation and hours are obtained using a specific form for the utility function and assumptions about the distribution of the fixed stigma component. Moffitt concludes that there is a significant influence of stigma, but that it "appears to arise mainly from the act of welfare recipiency per se, and not to vary with the amount of the benefit once on welfare" [Moffitt (1983, pp. 1033-1034)].

The intertemporal aspects of the cost may also be important. There may be a set-up cost to claiming. Or, as modelled by Plant (1984), there may be a cost to transition between states, inhibiting claiming by those not previously in receipt and inducing welfare dependence by those on the benefit register.

Analyses of the kind just described are a valuable addition to our knowledge, but they concentrate exclusively on the demand side of the relationship. The behaviour of those administering income-tested benefits is, however, crucial to their operation; and much of the concern about stigma derives from the way in which the benefit is administered. The choice of participation is not solely that of the applicant. The potential recipient faces a variety of hurdles designed to ensure that benefits are confined to those who are truly eligible. Receipt may depend on availability for work or on demonstration that one is actively searching for work; it may require proof that the person is not co-habiting. The demand for assistance may be "constrained", as emphasised by Albin and Stein (1968).
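The distinction between fixed and variable stigma costs discussed above can be written compactly. The following is a stylised sketch in the spirit of Moffitt (1983); the notation is invented here for exposition.

% Stylised participation condition with fixed and variable stigma costs, in
% the spirit of Moffitt (1983); the notation is invented for exposition.
% With non-welfare income Y, benefit B, a fixed stigma cost phi_0 and a cost
% phi_1 per unit of benefit received, the individual participates (P = 1) iff
\[
U(Y + B) - \phi_0 - \phi_1 B \;>\; U(Y).
\]
% Moffitt's finding that stigma attaches to recipiency per se, and not to the
% amount of benefit, corresponds to \phi_0 > 0 with \phi_1 \approx 0.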
5.5. Income support and family formation

The reference to co-habitation brings us to an aspect of income support which has caused considerable controversy - its possible effect on family formation. This has been particularly discussed in the United States with regard to the welfare system. Since receipt of AFDC is conditional on the absence of one parent (until 1979, the father), the payment of benefits may influence decisions to marry or to separate. It has often been argued that the welfare system is one important factor in the growth of female-headed families. The percentage of female-headed families (with no husband present) rose in the United States from 10.0 percent in 1960 to 15.4 percent in 1983, with the growth being especially marked for black families, increasing from 21.7 percent to 41.9 percent [Wilson and Neckerman (1986, table 2)].

The effects of income support on family formation may go much wider. The payment of pensions to the elderly may affect their ability to continue living on their own rather than joining the household of a son or daughter, or entering a residential home. The age of marriage may be influenced by the level of support provided to families, as may be the completed family size. It is, however, the effect of welfare on single parenthood which has received most notice in the United States, and it is on this that attention is focussed here.

The changes over time in the aggregate level of household formation in the United States are examined by Hendershott and Smith (1984), who show that a significant part of the change is due to changes in age-specific headship rates, proportionately greatest for those under 35 and those over 75. Their estimated equations show a significant relation with the number of families on welfare, and they attribute part of the increased headship amongst the young to increased real AFDC benefits and lower eligibility standards. Such evidence, however, needs to be considered alongside the finding of Darity and Myers (1983), using annual data from 1955 to 1980, that the proportion of female-headed families amongst black families is not significantly related to AFDC payments.

Cross-section data on receipt of AFDC by area (SMSAs) in the United States are employed by Honig (1974) - see also Minarik and Goldfarb (1976) and Honig (1976). Using data from 1960, she finds a positive relationship between average AFDC payments and the proportion of adult females who headed families with children. The relationship is less strong for whites than for non-whites, and is found to be smaller when data for 1970 are used.

Consideration of individual data on household formation has been accompanied by a more explicit treatment of the individual decision. Hutchens (1979) has drawn on the work of Becker (1973) on the theory of marriage to deduce a number of possible determinants of the rate of re-marriage. Using a subsample of data for 1970 from the Michigan Panel Study of Income Dynamics, covering 438 women with children, he estimates a logit model of the probability of re-marriage
within two years. (He notes in this context the problems caused by sample attrition, which may be particularly serious for this purpose.) He concludes that AFDC transfers do have a significant negative influence on the probability of re-marriage.

The model of Danziger et al. (1982) compares the income when married with that when single, the decision regarding household formation being taken by the woman in the light of this comparison. (The decision of the man is treated as irrelevant. They note that abandonment by the husband "creates a short-run disequilibrium, but we assume that in the long run she will have found a new husband" [Danziger et al. (1982, p. 533)]. Some people might think that Keynes' view of the long run is nearer the mark.) The findings of their model, estimated using data from the 1975 Current Population Survey, are that AFDC increases the headship proportion by only a small amount.

The effect of income support on family formation has also been studied as part of the negative income tax experiments. The evidence is reviewed by Bishop (1981a, 1980b); see also MacDonald and Sawhill (1978) and Groeneveld et al. (1980).

The subject of family formation is clearly one of considerable complexity, where the role of economic factors should not be exaggerated. The mechanisms at work are more subtle than any simple relation between income support and marriage or separation.
5.6. Pensions and savings

Since the early days of state pensions it has been argued that they discourage private saving. In 1907, a leading member of the Charity Organisation Society said that if the poor law could be swept away, "the condition of the masses would in every way be vastly improved, a further impetus would be given to the growth of the Friendly Societies, ... the necessity for providence and thrift would bring about less expenditure on drink" [quoted by Johnson (1984, p. 332)].

In recent years, the debate about the effect of pensions on savings has re-opened, largely because of the research in the United States of Feldstein, starting with the article in which he reached the striking conclusion that "the social security program approximately halves the personal savings rate [which] implies that it substantially reduces the stock of capital and the level of national income" [Feldstein (1974, p. 922)]. This article stimulated a great deal of research, including attempts to replicate his approach in other countries [see references in the survey by Cartwright (1984)].

Not surprisingly, the principal conclusion has been challenged, and other authors have come to different conclusions. It has been pointed out that any reduction in personal savings does not necessarily imply a fall in the capital stock; as we saw in Section 3, account has to be taken of the consequences for public debt and public capital. The impact of a rise in social security might, for instance, be offset by a decline in government
870
A.B. Atkinson
4
0
°4 -4
0w
L4
4
I
:4 >
-s
,3:4
:
:4 V
-'O
-
.g
o °L 5-
c CC
C4 ~0
-s 9 ;r~ :4
:4
-4
:
.
d !
' :4:4
*u :4 O a: @-· :4 :4*
:
:E
·:
:
0
.oo
0
^ E
*.: C
.5 4
rr; :4
~3 oI~ O CC
0~~~~~~ 0 :n
:4
,
:_,-x
;-~
~
bCCC o'4-C:
o
V).=
C-.
:4
:4 .O ,5 CA
:4
CC
I:4:
:4:4w-:4:
ay~~~g
0·:4
o%~~~~~~~~~~~~~~:
r~:4 C~·E.
-:
.?-P
-:4~ :425 :4
-:4
:4
r
-
o·-
* o~~~~~~ct~~~~~~~~
:4
4:
:4C
~
5 o~~~: 54
4
>
5-
Ch. 13: Income Maintenanceand Social Insurance
871
)
=
a .4M>
E
a
10
0)'0)1
PP
.U
= U
00
na a8~
0
rn 010>
CO._
0 )
n
-~
a
I -a
a
a'~~~~~~~~i
m
9~~~~~~~~~~~~~~~~~~a
B CC~~~
00
0)
'
00~
0' -
~
0
0~~~~~~~~~;
12
n
a
~0
'a3
00.Cj OE
-> ci a,3;
0 To
0
a r. I
9~~~~~~00
M
900
U-a
,1~~~~~~~'-
a
~.~ .~4 a ar 06
-
.~120
E
0)
ga3
0
en
0
B
-a up .Id C
0
C
0,~a
M en
z
c
)0-
O 0'~0
X >U
a~oa
o
A.B. Atkinson
872
I
-
,2a
- ::,
. E
u Z
-'
r-a a;
, vaor.E C
-C
4 u1 04
I
3
I
'
E a-a Q
r.t Cr
E
-''m
3
a
.aO0n
a- -a 'I
E!~
i3 =
D
a
ut
a
-a
.5 a a-a Ia a CQ
-
~c~
t
oN
la'
r
CE ~~~~~-I "
'-'~~~~~~~~
a'
O~ ' .
=~m ~~~0`
0I..'a 'ae
-'a
'aa ._
V-a
:,
o
vl'
°' j-a
O
I
3
0
a~~~~a~N"
'a
O
a' -
-
-aCa
-
'as
' a
0
a a :s
a3
N" 'a-~a'
-o ,. I
._
-a
=0m
a
a
i·
a'a
D 0
873
Ch. 13: Income Maintenance and Social Insurance JY a 0
_ '8
(
0
D
D
as · Pa ', -:
°00a, 0W a -v sC C pt
>0F
-0$ .,;
m L 0
: ra0.g O
0
o N C4f b7 '0
0' 0
0 c0 c
gE
0',
d O 0 rn0' C7 I
0'
r-'4
Ri
I8- .
r0
-
=F 't (d a, 'S
'd. _ cX
Ct
~00;a
I 0W
a Ct CIO
Gt
00 t4 at
0e Id
0r-
-;C
C ao *0
I Id tD - 7 o, la
Ct
:O -t
U
0x =-
._
0 t
at
v) -
.0
U
10 I U
0
00'
E Ct a
roo
3
a~o
0
0~
r-'
0
~a0.0ra ~0'0 0
hC~
ol 0
(
'Ct P0
0
o~0 V v2 00
C
°O la0_7-
~
0,
ν · dx^h < 0.  (2.38)

Consider the perturbed allocation z_ε = z^0 + ε dx^h. Then:

V*(z_ε) ≥ W(U^1(x^10), ..., U^(h-1)(x^(h-1)0), U^h(x^h0 + ε dx^h), ..., U^H(x^H0))  (2.39)

= W(U^1(x^10), ..., U^h(x^h0), ..., U^H(x^H0)) + ε (∂W/∂U^h)(∂U^h/∂x^h) · dx^h + O(ε^2)  (2.40)

> V*(z^0), for small ε > 0.  (2.41)

On the other hand,

V*(z_ε) = V*(z^0) + ν · (z_ε - z^0) + O(ε^2)  (2.42)

= V*(z^0) + ε ν · dx^h + O(ε^2)  (2.43)

< V*(z^0), for small ε > 0.  (2.44)

This contradiction establishes the result.
Notice that this result is not concerned with how first-best allocations are actually achieved; in particular, it says nothing about market prices. However, we know that under certain conditions, a first-best allocation may be achieved as a competitive equilibrium with optimal lump-sum transfers; in such a case, (relative) shadow prices coincide with (relative) market prices, both being equal to the common marginal rates of substitution. The reader may verify that, where producer prices and lump-sum transfers are unrestricted control variables in problem (P) (2.10), while taxes and quotas are absent, and the economy is closed, a first-best allocation is achieved, shadow prices coincide with market prices, and welfare weights are equal for all consumers.
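A minimal example, with functional forms invented for exposition, may help fix ideas.

% Illustrative example (functional forms invented for exposition): one
% consumer, two goods, welfare W = U, utility
%   U(x_1, x_2) = a ln x_1 + (1 - a) ln x_2,
% and feasibility x = w + z, where w is the endowment and z the net supply
% controlled by the planner. Then V*(z) = U(w + z), and the shadow prices are
\[
\nu_i = \frac{\partial V^*}{\partial z_i}
      = \frac{\partial U}{\partial x_i}\Big|_{x = w + z^0},
\qquad\text{so that}\qquad
\frac{\nu_1}{\nu_2} = \frac{a\,x_2^0}{(1-a)\,x_1^0},
\]
% which is the consumer's marginal rate of substitution and hence, at a
% competitive equilibrium with this consumer, the relative market price p_1/p_2.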
2.3. Some important second-best theorems
Some of the early contributions to second-best theory were pessimistic regarding the possibility of defining operational rules for public decision-making outside the perfectly competitive model; they stressed, for instance, that distortions occurring anywhere in the economy would lead to a disruption of the standard conditions for Pareto efficiency everywhere else [Lipsey and Lancaster (1956-57)]. Since then, much work has been directed at establishing more positive results, and at exhibiting the conditions under which reasonably simple and operational rules for decision-making (e.g. the desirability of aggregate production efficiency