CONTENTS

LIST OF CONTRIBUTORS
REVIEWER ACKNOWLEDGMENTS
EDITORIAL POLICY AND SUBMISSION GUIDELINES

TAX COMPLIANCE INTENTIONS OF LOW-INCOME INDIVIDUAL TAXPAYERS
Henry Efebera, David C. Hayes, James E. Hunton and Cherie O’Neil

DETERMINANTS OF TAX PROFESSIONALS’ ADVICE AGGRESSIVENESS AND FEES
Donna D. Bobek and Richard C. Hatfield

BEHAVIORAL IMPLICATIONS OF ALTERNATIVE GOING CONCERN REPORTING FORMATS
Chantal Viger, Asokan Anandarajan, Anthony P. Curatola and Walid Ben-Amar

MANAGEMENT FRAUD RISK FACTORS: AN EXAMINATION OF THE SELF-INSIGHT OF AND CONSENSUS AMONG FORENSIC EXPERTS
Sally A. Webber, Barbara Apostolou and John M. Hassell

BUDGETARY SLACK CREATION AND TASK PERFORMANCE: COMPARING INDIVIDUALS TO COLLECTIVE UNITS
James M. Kohlmeyer III and James E. Hunton

BUDGET TEAM GOALS AND PERFORMANCE ANTECEDENT AND MEDIATING EFFECTS
Peter Chalos, Margaret Poon, Dean Tjosvold and W. J. Dunn III

PERFORMANCE EVALUATIONS, WITH OR WITHOUT DATA FROM A FORMAL ACCOUNTING REPORTING SYSTEM
Yin Xu and Brad Tuttle

UNRAVELING THE EXPECTATIONS GAP: AN ASSURANCE GAPS MODEL AND ILLUSTRATIVE APPLICATION
Kimberly Gladden Burke, Stacy E. Kovar and Penelope J. Prenshaw
LIST OF CONTRIBUTORS

Asokan Anandarajan
School of Management, New Jersey Institute of Technology, Rutgers University, USA
Barbara Apostolou
Ourso College of Business, Louisiana State University, USA
Walid Ben-Amar
School of Management, University of Ottawa, Canada
Donna D. Bobek
School of Accounting, University of Central Florida, USA
Kimberly Gladden Burke
Else School of Management, Millsaps College, USA
Peter Chalos
College of Business, University of Illinois at Chicago, USA
Anthony P. Curatola
Department of Accounting, Drexel University, USA
W. J. Dunn III (Retired)
University of Illinois at Chicago, USA
Henry Efebera (Deceased)
University of Akron, USA
John M. Hassell
Indiana University Kelley School of Business Indianapolis, Indiana University-Purdue University Indianapolis, USA
Richard C. Hatfield
College of Business, University of Texas at San Antonio, USA
David C. Hayes
Ourso College of Business, Louisiana State University, USA
James E. Hunton
Bentley College, USA and Maastricht University, NL
James M. Kohlmeyer III
School of Business, East Carolina University, USA
Stacy E. Kovar
College of Business Administration, Kansas State University, USA
Cherie O’Neil
College of Business, Colorado State University, USA
Margaret Poon
Department of Accounting, City University of Hong Kong, China
Penelope J. Prenshaw
Else School of Management, Millsaps College, USA
Dean Tjosvold
School of Business, Lingnan University, China and Simon Fraser University, Canada
Brad Tuttle
Moore School of Business, University of South Carolina, USA
Chantal Viger
Accounting Department, University of Quebec at Montreal, Canada
Sally A. Webber
Department of Accountancy, Northern Illinois University, USA
Yin Xu
College of Business & Public Administration, Old Dominion University, USA
REVIEWER ACKNOWLEDGMENTS

The Editor and Associate Editors at AABR would like to thank the many excellent reviewers who have volunteered their time and expertise to make this an outstanding publication. Publishing quality papers in a timely manner would not be possible without their efforts.

Elizabeth Almer Portland State University, USA
Christie L. Comunale Long Island University-C.W. Post Campus, USA
John Anderson San Diego State University, USA
Charles Cullinan Bryant College, USA
Philip Beaulieu University of Calgary, Canada
William N. Dilla Iowa State University, USA
Jean Bedard Northeastern University, USA
Craig Emby Simon Fraser University, Canada
James Bierstaker University of Massachusetts Boston, USA
Glen Gray California State University Northridge, USA
Dennis M. Bline Bryant College, USA
Clark Hampton University of Connecticut, USA
Rich Brody University of New Haven, USA
Rick Hatfield University of Texas San Antonio, USA
Robert H. Chenhall Monash University, Australia
Mary Callahan Hill Kennesaw State University, USA
Vincent Chong University of Western Australia, Australia
Karen L. Hooks Florida Atlantic University, USA
Freddie Choo San Francisco State University, USA
James E. Hunton Bentley College, USA and Maastricht University, NL
Mike Kirschenheiter Columbia University, USA
Robert J. Parker University of New Orleans, USA
James M. Kohlmeyer III East Carolina University, USA
Michael Roberts University of Alabama, USA
Stacy Kovar Kansas State University, USA
Jacob Rose Montana State University, USA
Theresa Libby Wilfrid Laurier University, Canada
Andrew J. Rosman University of Connecticut, USA
Daryl Lindsay University of Saskatchewan, Canada
Georgia Smedley University of Nevada – Las Vegas, USA
Elaine Mauldin University of Missouri, USA
James Maroney Northeastern University, USA
John Sweeney Washington State University, USA
Venky Nagar University of Michigan, USA
Stan Veliotis University of Connecticut, USA
Andreas Nikolaou Bowling Green State University, USA
Sally A. Webber Northern Illinois University, USA
Hossein Nouri College of New Jersey, USA
Kristin Wentzel La Salle University, USA
Ed O’Donnell Arizona State University, USA
John Wermert Drake University, USA
Laurie Pant Suffolk University, USA
Patrick Wheeler University of Missouri, USA
EDITORIAL POLICY AND SUBMISSION GUIDELINES

Advances in Accounting Behavioral Research (AABR) publishes articles encompassing all areas of accounting that incorporate theory from and contribute new knowledge and understanding to the fields of applied psychology, sociology, management science, and economics. The journal is primarily devoted to original empirical investigations; however, literature review papers, theoretical analyses, and methodological contributions are welcome. AABR is receptive to replication studies, provided they investigate important issues and are concisely written. The journal especially welcomes manuscripts that integrate accounting issues with organizational behavior, human judgment/decision making, and cognitive psychology.

Manuscripts will be blind-reviewed by two reviewers and an associate editor. The recommendations of the reviewers and associate editor will be used to determine whether to accept the paper as is, accept the paper with minor revisions, reject the paper, or invite the authors to revise and resubmit the paper.
MANUSCRIPT SUBMISSION

Manuscripts should be forwarded to the editor, Vicky Arnold, at Vicky.[email protected] via e-mail. All text, tables, and figures should be incorporated into a Word document prior to submission. The manuscript should also include a title page containing the name and address of all authors and a concise abstract. Also, include a separate Word document with any experimental materials or survey instruments. If you are unable to submit electronically, please forward the manuscript along with the experimental materials to the following address:

Vicky Arnold, Editor
Advances in Accounting Behavioral Research
Department of Accounting
U41A School of Business
University of Connecticut
Storrs, CT 06269-2041, USA
References should follow the APA (American Psychological Association) standard. References should be indicated by giving (in parentheses) the author’s name followed by the date of the journal or book; or with the date in parentheses, as in “suggested by Earley (2000).” In the text, use the form Rosman et al. (1995) where there are more than two authors, but list all authors in the references. Quotations of more than one line of text from cited works should be indented and citation should include the page number of the quotation; e.g. (Dunbar, 2001, p. 56). Citations for all articles referenced in the text of the manuscript should be shown in alphabetical order in the reference list at the end of the manuscript. Only articles referenced in the text should be included in the reference list. Format for references is as follows:
For Journals
Dunn, C. L., & Gerard, G. J. (2001). Auditor efficiency and effectiveness with diagrammatic and linguistic conceptual model representations. International Journal of Accounting Information Systems, 2(3), 1–40.

For Books
Ashton, R. H., & Ashton, A. H. (1995). Judgment and decision-making research in accounting and auditing. New York, NY: Cambridge University Press.

For a Thesis
Smedley, G. A. (2001). The effects of optimization on cognitive skill acquisition from intelligent decision aids. Unpublished doctoral dissertation, University.

For a Working Paper
Thorne, L., Massey, D. W., & Magnan, M. (2000). Insights into selection-socialization in the audit profession: An examination of the moral reasoning of public accountants in the United States and Canada. Working Paper, York University, North York, Ontario.

For Papers From Conference Proceedings, Chapters From Book, etc.
Messier, W. F. (1995). Research in and development of audit decision aids. In: R. H. Ashton & A. H. Ashton (Eds), Judgment and Decision Making in Accounting and Auditing (pp. 207–230). New York: Cambridge University Press.
TAX COMPLIANCE INTENTIONS OF LOW-INCOME INDIVIDUAL TAXPAYERS*

Henry Efebera, David C. Hayes, James E. Hunton and Cherie O’Neil

ABSTRACT

Prior tax compliance research has largely ignored low-income individual taxpayers, as they have historically been viewed as having an immaterial impact on Federal tax revenues. However, the earned income tax credit (EITC) program has altered the Federal tax revenue landscape in this regard. The Internal Revenue Service (IRS) investigated the magnitude of EITC overpayments for tax year 1999 and concluded that between 27 and 31% of EITC filings were overstated, resulting in overpayments of between $8.5 and $9.9 billion (IRS, 2002). These excessive payments represented about 0.5% of total Federal revenues and 2.8% of the total tax gap. Thus, to the extent that low-income individual taxpayers intentionally under-report their incomes in order to receive higher EITCs, the Federal budget is noticeably affected.

This study extends and complements extant tax research by examining the compliance intentions of low-income individual taxpayers. Relying on the theory of planned behavior, we examine the extent to which perceived tax equity (vertical, horizontal and exchange), normative expectations, and legal sanctions affect tax compliance intentions. Consistent with the hypotheses, the results indicate a significant positive relationship between compliance intentions and: (1) equity perceptions of the tax system; (2) normative expectations of compliance; and (3) penalty magnitude. Additionally, the findings suggest two-way interactions between penalty magnitude and exchange equity, and between penalty magnitude and normative expectations. Research results reported herein hold important policy implications related to the Federal government’s efforts to reduce tax cheating and increase compliance among low-income individual taxpayers.

* This paper is based on Henry Efebera’s dissertation, which he completed at the University of South Florida. Upon graduation, Henry became an assistant professor at the University of Akron. Henry unexpectedly died on March 25th, 2002, leaving behind his wife, Yvonne, and three children, Omotade, Yvette, and Ebiyemi. Henry was a kind soul with a heart of gold. We miss him dearly and publish this article in his loving memory.

Advances in Accounting Behavioral Research, Volume 7, 1–25
Copyright © 2004 by Elsevier Ltd.
All rights of reproduction in any form reserved
ISSN: 1474-7979/doi:10.1016/S1474-7979(04)07001-2
INTRODUCTION

Deliberate tax non-compliance is believed to be relatively widespread and represents a serious problem for the U.S. fiscal system (Smith & Kinsey, 1987; Worsham, 1996). Although prior research has provided useful insight into factors associated with tax compliance in general, it has largely ignored low-income individual taxpayers and instead focused almost exclusively on middle- and upper-income individual taxpayers and business taxpayers. Low-income individual taxpayers have been understudied in the tax literature possibly because they are typically offered little or no tax-cheating incentives, rarely encounter tax avoidance opportunities, and yield a relatively small effect on tax revenue. However, with the significant expansion of the earned income tax credit (EITC),¹ the incentive, opportunity and impact of non-compliance have increased markedly for this group of taxpayers in the last decade. For example, the General Accounting Office (1997) reported that the cost of the EITC program increased by 40% between 1993 and 1994 to $21 billion and attributed a significant part of this increase (25%) to fraud. The IRS concluded that between 27 and 31% of EITC filings were overstated in tax year 1999, which resulted in overpayments of between $8.5 and $9.9 billion, and represented about 0.5% of the total Federal revenues and 2.8% of the total tax gap (IRS, 2002).² The high rate of fraud and overstatement motivated Congress to appropriate $145 million of additional funding and personnel for the EITC compliance effort in the 2001 budget. Furthermore, the Welfare Reform Act of 1997 added millions of new low-wage workers to the workforce, potentially increasing the financial impact of the program.
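To give a sense of the scale behind the IRS (2002) figures quoted above, the short sketch below backs out the totals they imply. It is illustrative arithmetic only: the text does not say which end of the $8.5 to $9.9 billion range the two percentages refer to, so the sketch assumes each percentage may apply to either end and reports the resulting bracket. The function and constant names are hypothetical.

```python
# Figures quoted in the text (IRS, 2002), in billions of dollars.
OVERPAYMENT_LOW, OVERPAYMENT_HIGH = 8.5, 9.9
SHARE_OF_REVENUE = 0.005  # "about 0.5%" of total Federal revenues
SHARE_OF_TAX_GAP = 0.028  # "2.8%" of the total tax gap


def implied_total(amount, share):
    """Back out the total that `amount` would be a given `share` of."""
    return amount / share


def implied_ranges():
    """Bracket the implied totals using both ends of the overpayment range.

    Assumption: the quoted percentages could refer to either endpoint,
    so both are evaluated rather than picking one.
    """
    ends = (OVERPAYMENT_LOW, OVERPAYMENT_HIGH)
    return {
        "federal_revenue": tuple(implied_total(x, SHARE_OF_REVENUE) for x in ends),
        "tax_gap": tuple(implied_total(x, SHARE_OF_TAX_GAP) for x in ends),
    }


for label, (low, high) in implied_ranges().items():
    print(f"{label}: roughly ${low:,.0f} to ${high:,.0f} billion")
```

Because "about 0.5%" is itself rounded, the implied revenue bracket should be read as an order-of-magnitude check rather than a precise estimate.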
In light of such concerns and increasing Congressional and IRS interest in low-income individual taxpayers’ compliance behavior, research that provides insights into the factors associated with their compliance intentions could have important policy implications. While prior research has advanced our understanding of compliance behavior in general, this study contributes to extant literature in three important ways. First, while most studies have examined the effect of overall equity perceptions of the tax system on tax compliance intentions, the present study extends this literature by examining the effects of vertical, horizontal, and exchange dimensions of equity perceptions (Jackson & Milliron, 1986; Moser et al., 1995). Vertical equity refers to the perceived tax burden of lower, as compared to higher, income taxpayers; horizontal equity concerns the perceived tax burden of one taxpayer relative to other taxpayers with comparable economic means; and exchange equity refers to the perceived benefits taxpayers receive relative to the taxes they pay. Second, this study expands the usual conceptualization of external social influences (subjective norms), for it also includes an internal personal dimension (moral norms), as suggested by Beck and Ajzen (1991). While subjective norms refer to the perceived social influence to engage in a given behavior, moral norms reflect the internalized moral conscience that guides an individual’s behavior (Beck & Ajzen, 1991). Third, this study examines the tax compliance intentions of an under-represented but increasingly important segment of the population – low-income individual taxpayers – within the framework of a comprehensive tax compliance model.

A sample of low-income individual taxpayers was presented with a hypothetical tax scenario involving two sources of income: salary income subject to IRS reporting and self-employment income not subject to third party reporting. Each participant was also presented with a table indicating the amount of refundable credit corresponding to a given level of reported income. Participants were then asked to make a compliance decision. Prior to receiving the compliance decision task, participants indicated their perceptions of vertical, horizontal, and exchange equity. After the decision task was completed, perceptions of normative influences and legal sanctions were obtained.

Consistent with the hypotheses, the results indicate a significant positive relationship between compliance intentions and equity perceptions of the tax system, normative expectations of compliance, and penalty magnitude. There were interactive effects between penalty magnitude and exchange equity, and between penalty magnitude and normative expectations. Although we theorized that normative influences would be comprised of social and moral norms, factor analysis indicated that normative influences also included detection risk – the likelihood that the IRS will detect deliberate underreporting of income.
The remainder of this paper is organized as follows. The next section develops the theoretical framework that links tax compliance intentions to perceptions of equity, normative influences and legal sanctions and presents the study hypotheses. This is followed by a presentation of the research methodology and analysis of the research findings. The last section discusses the study’s limitations and contributions.
BACKGROUND AND THEORY

Despite a long history of taxation in the U.S., how tax evasion behavior is formed is still not fully understood. Only recently have researchers begun to examine such basic issues as the role of normative influences, equity perceptions, legal complexity and rule ambiguity in determining tax compliance (or non-compliance) behavior (Klepper & Nagin, 1989). Prior tax compliance research has used several competing approaches and theoretical frameworks such as deterrence theory, equity theory (Adams, 1965), equity-control model (Maroney et al., 1998), fiscal psychology, exchange equity theory (Moser et al., 1995) and procedural justice (Worsham, 1996). Although these models have provided useful insights into our understanding of compliance behavior in general, empirical evidence from this stream of research has not converged (Alm, 1991; Cowell, 1992; Webley et al., 1985). Most prior tax compliance studies share common limitations: they omit the social situations and environments in which taxpayers are embedded (Smith & Kinsey, 1987), they are overly simple and omit taxpayers’ morality (Kaplan et al., 1997), and they rely on inappropriate experimental contexts (Christensen & Hite, 1993). The inconclusive results from prior research based on existing theoretical frameworks suggest the need for a more comprehensive theory of behavioral choice.
The Theory of Planned Behavior

The theory of planned behavior provides an alternative theoretical framework that could provide a more complete understanding of compliance behavior by integrating the central provisions of existing theories (Ajzen, 1991). The theory of planned behavior was derived from the theory of reasoned action (Fishbein & Ajzen, 1975), which posits that individuals’ intentions toward a behavior are determined by their: (1) attitude toward the behavior; (2) subjective norms with regard to the behavior; and (3) perceived behavioral control over engaging in
the behavior. An individual’s attitude toward a behavior refers to the degree to which the individual has a favorable or unfavorable evaluation of the behavior, and is shaped by the individual’s beliefs about the consequences associated with performing the behavior. Subjective norms refer to individuals’ perceptions of social expectations to perform or not to perform the behavior, and their motivation to comply with those perceived expectations. Generally, the more an individual perceives that important social peers would approve of certain behavior the stronger is the intention to perform that target behavior. Perceived behavioral control refers to the perceived ease or difficulty of performing a given behavior. Intention is defined as an individual’s subjective probability to perform a behavior (Ajzen, 1991) or a composite behavioral inclination toward a behavior (Smith & Kinsey, 1987). As a general rule, a stronger intention to perform a behavior is associated with a greater likelihood of engaging in the behavior. The theory of planned behavior posits that individuals’ intentions, together with their perceived control over the behavior, determine whether they will actually engage in the behavior. Unlike the theory of reasoned action whose range of applicability was restricted to willful behaviors that an individual can decide to perform or not to perform, the theory of planned behavior applies to behaviors that are not under the individual’s complete volitional control (Ajzen, 1991). While the theory of planned behavior has been used extensively in the social science literature to predict or explain several deviant behaviors, few studies have explicitly examined the attitudes and beliefs associated with tax non-compliance behaviors. 
Smith and Kinsey (1987) make a significant contribution to this stream of research by presenting a suggested framework for examining the theory of planned behavior in the context of tax compliance, as they propose a model of specific socio-psychological factors that taxpayers consider in making their compliance decisions. We extend and test their suggested model in the current study.
A Social-Psychological Model of Tax Compliance Behavior

Fig. 1. Research Model of Tax Compliance Behavior.

The research model used in this study (see Fig. 1) depicts a conceptual model of relevant social-psychological factors that taxpayers theoretically consider in forming their tax compliance intention. The model, based on the theory of planned behavior framework, proposes three sets of general constructs that shape taxpayers’ attitudes and beliefs in the tax compliance context: (1) equity perceptions of the tax system; (2) normative expectations of “important others” and moral conscience; and (3) the perceived legal sanctions associated with the particular
non-compliant behavior. The model further posits that tax compliance intentions, together with perceived legal sanctions, ultimately determine whether the taxpayer actually engages in a deliberate non-compliant behavior. Central to the model is the taxpayer’s compliance intention. In the context of this research, intention is defined as a taxpayer’s subjective probability of engaging in a target behavior. As a general rule, the stronger the compliance intention, the greater the likelihood that the taxpayer will actually be compliant in their tax reporting decision. Each of the factors in the model is also conceptualized as consisting of several variables which involve analytically similar dimensions. The following sections describe the extant research on the elements of the model (perceived equity, subjective norms, and legal sanctions) and develop the research hypotheses.

Equity Perceptions

In establishing the linkage between equity perceptions and attitude formation, Ajzen (1985, p. 85) proposed that perception of equity is “capable of influencing people’s attitudes toward their positions in a relationship, toward their partners in the relationship, toward the relationship as a whole, toward the tasks they are to perform, and toward the person or agent responsible for the inequity.” Prior research into the role of equity perceptions on tax compliance intentions and behavior has been based primarily on equity theory (Adams, 1965), which focuses on fairness judgments of outcomes and the behavioral effect of such judgments. The theory suggests that when individuals perceive their outcomes to be inequitable, they will try to restore equity by altering their input-to-output ratio, either by reducing their input or increasing their output. Despite the intuitive appeal of this reasoning, empirical evidence linking equity perceptions of the overall tax system to compliance decisions has been mixed.
While McGraw and Scholz (1991) did not find a significant relationship between the general perceptions of equity and compliance decisions, studies by Hite and Roberts (1992) and Roberts (1994) suggest that equity perceptions do influence tax compliance attitudes. For example, McGraw and Scholz (1991) examined the relationship between perceived fairness and a compliance decision in the context of a communication that either emphasized social or personal consequences of tax reform. No significant difference in the compliance decisions between the two groups was found. In contrast, Roberts (1994) reported that public service announcements significantly improved attitudes toward compliance. Results from a different stream of research (e.g. Hite & Roberts, 1992; Maroney et al., 1998) examining the fairness perceptions of specific tax provisions, have consistently found a positive relationship between perceived fairness and taxpayer compliance. Overall, tax research dealing with fairness perceptions suggests that the nature of
equity perceptions, as well as the psychological processes that serve to form such perceptions, are still not well understood. The current study further expands the compliance literature by considering equity perception along several dimensions that taxpayers potentially use to weigh the equity of their tax burden, as indicated by Jackson and Milliron (1986). One dimension, exchange equity, involves the perceived equity of the exchange relationship between the taxpayer and the government, or the perceived benefits that the taxpayer receives for the tax dollars given. Perceived exchange inequity occurs when the taxpayers’ inputs (taxes) are perceived to be greater than their outputs (benefits). The second dimension, horizontal equity, refers to the taxpayers’ perceived equity of their tax burden as compared to other taxpayers with equivalent economic means. Horizontal inequity arises when taxpayers perceive that their share of the tax burden is disproportionately larger than other taxpayers in similar economic circumstances. The third dimension, vertical equity, refers to the taxpayers’ equity perception of their tax burden in relation to other taxpayers with more income. Vertical inequity arises when lower-income taxpayers perceive that their share of the tax burden is greater than higher-income taxpayers. Given the relatively few studies that have examined the effect of the different dimensions of perceived equity on tax compliance intentions or behavior, little is known about the specific equity dimensions on which taxpayers base their intentions in tax compliance contexts. Although Moser et al. (1995) provide preliminary evidence suggesting that horizontal and exchange equity are important factors in compliance decisions, their experimental study was based on economic conception of equity rather than taxpayers’ perceptions of equity. This study examines what types of equity perceptions, if any, influence low-income individual taxpayers’ compliance intentions. 
Since there is no empirical evidence indicating what dimensions of equity low-income taxpayers use in making their compliance decisions, we rely on three different perceptions of equity as a basis for our hypothesis. Equity theory would assert that higher perceptions of equity should lead to stronger intentions to engage in the behavior of interest. Thus, the following multi-part hypothesis is offered (all hypotheses are listed in the alternate form):

H1. There will be a positive relationship between tax compliance intentions and perceptions of (H1a) vertical equity, (H1b) horizontal equity, and (H1c) exchange equity.

Normative Influences

The second component of the research model is the taxpayers’ normative expectations toward compliance. Many studies that have examined the effects of
normative influences on behavioral choices have conceptualized such influences in terms of social comparison. In the social psychology literature, this social comparison is referred to as subjective norms. Thus, subjective norms refer to perceived social pressure from referent others to engage (not engage) in a specific behavior (Beck & Ajzen, 1991). It indicates an individual’s perception of the expectations of people who are important to the individual (spouse, family, social peers, co-workers, etc.) with respect to the target behavior (Randall & Gibson, 1991). As a general rule, the more an individual perceives that important social peers would approve of a behavior, the stronger is the individual’s intention to engage in the behavior. Many studies that have investigated this relationship between behavioral choices and subjective norms have generally found that they explain a significant proportion of individual’s intention towards risky driving behavior (Parker et al., 1992), shoplifting and lying (Beck & Ajzen, 1991), and ethical intentions (Randall & Gibson, 1991); although the findings have not been unanimous (Beck & Ajzen, 1991). Applying this reasoning to the tax context would suggest that as the perceived subjective norm toward compliance increases, compliance intentions and related behaviors would also increase. Conversely, when taxpayers perceive that their referent others (peers, spouses, family) would encourage or at least approve of deviant behavior, the taxpayers’ intentions toward non-compliance would also be expected to increase. In spite of the intuitive appeal of this proposition, very few studies have examined this relationship in a tax compliance context. For those studies that have examined this relationship, social norms were not the focus of the research and social norms were conceptualized very narrowly, i.e. peer influence (Maroney et al., 1998). The current study incorporates several facets of social norms (i.e. 
family, significant others and friends) and proposes the next hypothesis.

H2a. There will be a positive relationship between tax compliance intentions and social norms.

Regarding the role of normative influences in tax compliance intentions, other researchers have focused on the external social dimension to the exclusion of the equally important internal dimension of moral norm. Ajzen (1991) suggests that the inclusion of personal feelings of moral obligation or responsibility toward compliance would significantly increase the explanatory power of most models of behavioral choice in socially sensitive circumstances. Although prior research has separately considered the roles of subjective norms and morality on compliance to varying degrees, no study has examined both factors in the same framework, therefore precluding the examination of the relationship between them. The current study expands the literature by considering the independent and collaborative roles of moral norms.
Although research on the role of morality on tax compliance intentions is sparse, the results suggest that moral norms are important in understanding taxpayers’ compliance intentions and behavior. For example, Kaplan et al. (1997) found that individuals’ moral reasoning capacity moderated the effect of the IRS’s compliance strategies on their compliance behavior. They found that legal sanctions were effective in reducing non-compliance intentions for taxpayers who have a low sense of moral responsibility, while “appeals to moral conscience” communications were more effective for taxpayers with a high sense of moral responsibility (Schwartz & Orleans, 1967); however, the results have not been unanimous (McGraw & Scholz, 1991). This study examines a similar proposition that morality affects low-income individual taxpayers’ compliance intentions with the following hypothesis:

H2b. There will be a positive relationship between tax compliance intentions and moral norms.

Legal Sanctions

Prior tax compliance research has generally examined the effect of legal sanctions on tax compliance along two dimensions – perceptions of detection risk and the penalty magnitude associated with the behavior. Detection risk refers to the likelihood that the IRS will detect the tax non-compliant behavior. Penalty magnitude, on the other hand, refers to the perceived magnitude or severity of the penalty in terms of fines and jail terms associated with the detection of the tax non-compliant behavior. Studies examining the effect of detection risk on non-compliance intentions and behaviors have generally found mixed results (Roth & Witte, 1985; Webley et al., 1985). In further exploring this issue, Fischer et al. (1992) suggest that the mixed results from prior research are due to problems with the way that detection risk is conceptualized, and recommend that future research focus on perceptions of detection risk rather than objective measures of audit probabilities. In contrast, research examining the effect of sanctions magnitude on non-compliant behavior has generally been more positive (Carnes & Englebrecht, 1995; Witte & Woodbury, 1985), although the results have not been unanimous (e.g. Klepper & Nagin, 1989). This latter group of researchers argues that an increase in sanctions magnitude may reduce tax non-compliance only under specific conditions, for example, if the personal cost resulting from the increase is significant. The current study contributes to the literature by examining both detection risk and penalty magnitude, which leads to the final hypothesis.

H3. There will be a positive relationship between tax compliance intentions and (H3a) perceived detection risk and (H3b) perceived penalty magnitude.
RESEARCH METHOD
Participants
Prior behavioral tax research studies have been criticized for not paying significant attention to the appropriateness of the participants used in their research (O’Neil & Samelson, 2001). To obtain suitable participants for a study of the tax compliance intentions of low-income individual taxpayers, residents of a large government housing project in a metropolitan area in the Southeast were recruited. Participants were asked to review a hypothetical tax scenario that involved two components of income (salary and self-employment income). As an incentive to participate in the study, each participant received a coupon from a local fast-food restaurant.3 Of the 197 questionnaires collected, there were 146 usable responses.4 Sample demographics, presented in Table 1, indicate that most participants (93.8%) had received an EITC credit in one or more of the past five years. Additionally, the majority of the respondents (96.6%) failed to report all income in at least one of the past five years.
Table 1. Demographics. Sample Size = 146
Characteristic                               Percentage of Total
Earned income
  Less than $5,000                           11.0
  $5001–$10,000                              11.0
  $10,001–$15,000                            13.0
  $15,001–$20,000                            14.4
  $20,001–$25,000                            15.8
  $25,001–$30,000                            20.5
  $30,001–$40,000                            14.3
Age
  15–20                                      11.7
  21–30                                      44.5
  31–40                                      30.1
  41–50                                      11.6
  51–60                                       2.1
Gender
  Male                                       40.4
  Female                                     59.6
Education
  Less than high school                      19.9
  High school                                41.1
  Two years of college                       36.3
  Four years of college                       0.7
  Graduate                                    2.0
Job description
  Unskilled                                  38.4
  Semi-skilled                                7.5
  Skilled                                    34.9
  Professional                               19.2
Marital status
  Married                                    32.2
  Single/head of household                   67.8
Number of years received EIC
  0                                           6.2
  1                                           7.5
  2                                          13.7
  3                                          14.4
  4                                          11.0
  5                                          47.2
Number of years failed to report all income in the past five years
  0                                           3.4
  1                                           2.1
  2                                           4.8
  3                                           8.2
  4                                          25.3
  5                                          56.2
Race/ethnicity
  White                                      49.9
  Black                                      33.6
  Hispanic                                   11.0
  American-Indian                             0.7
  Asian-Pacific Islander                      4.8
Task and Procedures
Participants responded to a three-part questionnaire. First, respondents’ perceptions of the equity of the U.S. Federal tax system were assessed. This was done before the decision task, so that the task itself did not bias their equity perceptions. Next, participants were presented with a hypothetical tax scenario involving the reporting of salary and self-employment income. Participants were told that the employer sends salary income information to the IRS at the end of the year, but that
reporting self-employment income is the responsibility of the taxpayer. In order to remind participants of the incentives for under-reporting their self-employment income, they were presented with a table that showed the amount of refundable EITC they would receive based on the amount of self-employment income reported. After reading the scenario and EITC table, participants indicated how much of the $12,000 self-employment income they would report. The final section of the questionnaire was used to collect demographic data. Variable measures were randomized and often reverse-coded in order to increase the internal validity of the measurement scales. The compliance decision environment was structured in accordance with the phase-out range of the EITC program, where additional reported income reduces the amount of refundable credit until a complete phase-out level is reached.5 In the phase-out range, low-income individual taxpayers have an incentive to understate their income in order to increase the EITC refund and reduce their self-employment taxes. This scenario was selected because preliminary interviews with low-income taxpayers suggested that many of them supplement their income by engaging in self-employment activities (street vending, music and performing art, cosmetology, etc) where they are typically paid on a cash basis with no reporting by the payer to the IRS.
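The incentive created by the phase-out range can be illustrated with a minimal sketch. The 20% phase-out rate is the study's rate as stated in Note 5; the function name, the phase-out starting point, the maximum credit, and the salary figure below are hypothetical values chosen only for illustration and are not taken from the EITC table shown to participants.

```python
def credit_in_phase_out(total_income, phase_out_start, max_credit, phase_out_rate=0.20):
    """Refundable credit once income is at or beyond the start of the phase-out range.

    Each dollar of reported income above phase_out_start reduces the credit by
    phase_out_rate dollars, until the credit reaches zero (complete phase-out).
    """
    excess = max(0.0, total_income - phase_out_start)
    return max(0.0, max_credit - phase_out_rate * excess)


# Hypothetical illustration values (assumptions, NOT the study's figures).
PHASE_OUT_START = 15_000.0
MAX_CREDIT = 2_500.0
SALARY = 15_000.0  # salary is reported by the employer, so it is always included

for reported_self_employment in (0, 4_000, 8_000, 12_000):
    credit = credit_in_phase_out(SALARY + reported_self_employment,
                                 PHASE_OUT_START, MAX_CREDIT)
    print(f"reported self-employment income ${reported_self_employment:>6,}: "
          f"credit ${credit:,.0f}")
```

Under these assumed parameters, every additional $1,000 of self-employment income that is reported lowers the refundable credit by $200, which is the trade-off the scenario was designed to make salient.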
Measured Variables
This section discusses the research metrics. Table 2 presents the item wordings, reliability estimates, means and standard deviations for each variable.
Tax Equity
Participants indicated their perceptions of vertical equity (three items), horizontal equity (three items) and exchange equity (three items). The vertical and horizontal items were adapted from Jackson and Milliron (1986) and the exchange equity items were adapted from Yankelovich et al. (1984). The scales were oriented such that 1 equals very unfair and 7 equals very fair.
Social Norms
Social norm perceptions were measured using three items reflecting the perceived expectations of “important others.”6 A high score suggests that one would feel social pressure to properly report the additional income.
Moral Norms
Moral norms were assessed using three items that were adapted from Ajzen (1991). A high score suggests that one would feel guilty if income were under-reported.
Table 2. Descriptive Statistics and Reliability Estimates. [a]
Item #   Mean   Std. Dev.   Item Wording
Vertical equity (Alpha [b] = 0.73)
VE1      4.77   1.75        How fair or unfair is the amount of federal income taxes that you pay when compared to people who make more money than you? (1 = Very Unfair, 7 = Very Fair)
VE2      4.58   1.82        People like me pay a larger share of our incomes in federal taxes than do rich taxpayers. (1 = Strongly Agree, 7 = Strongly Disagree)
VE3      4.37   1.99        Rich taxpayers pay a larger share of their incomes in federal taxes than do taxpayers like me. (1 = Strongly Agree, 7 = Strongly Disagree)
Horizontal equity (Alpha [b] = 0.75)
HE1      3.70   1.56        I pay about the same amount of federal income taxes as other people who make about the same income as I do. (1 = Strongly Disagree, 7 = Strongly Agree)
HE2      4.66   1.45        Most people who earn about the same income as I do pay more taxes than I do. (1 = Strongly Agree, 7 = Strongly Disagree)
HE3      3.64   1.35        I pay more taxes compared to most people who make about the same income as I do. (1 = Strongly Agree, 7 = Strongly Disagree)
Exchange equity (Alpha [b] = 0.79)
EE1      4.75   1.82        How fair or unfair is the amount of federal income taxes that you pay when compared to the amount of services you get back from the federal government? (1 = Very Unfair, 7 = Very Fair)
EE2      4.73   2.01        I pay more in federal income taxes than I receive in services from the federal government. (1 = Strongly Agree, 7 = Strongly Disagree)
EE3      4.74   1.93        I am satisfied with the amount of benefits I receive from the federal government compared to the amount of taxes I pay. (1 = Strongly Disagree, 7 = Strongly Agree)
Social norms (Alpha [b] = 0.78)
SN1      2.84   1.79        My family (father/mother/brother/sister) will expect me to report the additional $12,000 income from my part-time business in my tax return. (1 = Strongly Disagree, 7 = Strongly Agree)
SN2      3.48   1.90        My significant other (wife/husband/boyfriend/girlfriend) will expect me to report the additional $12,000 income from my part-time business in my tax return. (1 = Strongly Disagree, 7 = Strongly Agree)
SN3      3.51   1.92        My friends will expect me to report the additional $12,000 income from my part-time business in my tax return. (1 = Strongly Disagree, 7 = Strongly Agree)
Moral norms (Alpha [b] = 0.60)
MN1      3.39   2.23        I will feel guilty if I do not report the additional $12,000 income from my part-time business in order to receive a larger tax refund (1 = Strongly Disagree, 7 = Strongly Agree)
MN2      3.99   2.11        It is against my personal principles NOT to report the additional $12,000 part-time business income in order to receive a larger tax refund. (1 = Strongly Disagree, 7 = Strongly Agree)
MN3      2.91   1.91        It is wrong NOT to report the additional $12,000 part-time business income in order to receive a larger tax refund. (1 = Strongly Disagree, 7 = Strongly Agree)
Detection risk (Alpha [b] = 0.85)
DR1      4.26   1.76        How likely would the IRS find out if you don’t report the additional $12,000 part-time business income in your tax return (1 = Very Unlikely, 7 = Very Likely)
DR2      4.02   1.81        In this age of computers, the IRS will find out if I don’t report the additional $12,000 part-time business income in my tax return (1 = Strongly Disagree, 7 = Strongly Agree)
DR3      4.16   1.73        The chance that I will be caught if I don’t report the additional $12,000 additional income from my part-time business is (1 = Very Low, 7 = Very High)
Penalty magnitude (Alpha [b] = 0.86)
PM1      2.54   1.55        I would be in serious trouble if the IRS found out that I did not report the additional $12,000 business income in order to receive a larger tax refund. (1 = Strongly Disagree, 7 = Strongly Agree)
PM2      2.59   1.56        The IRS would severely punish me if they found out that I did not report some or all of the $12,000 additional business income in my tax return in order to receive a larger tax refund. (1 = Strongly Disagree, 7 = Strongly Agree)
PM3      2.52   1.45        How serious would the punishment be if the IRS found out that you did not report some or all of the additional $12,000 income from your part-time business? (1 = Very Mild, 7 = Very Serious)
Tax compliance intentions (0.76 [c])
TCI1     4.20   2.38        If you were in Bobby’s situation, how much of the $12,000 additional business income would you report? (1 = $0, 2 = $1–$2,000, 3 = $2,001–$4,000, 4 = $4,001–$6,000, 5 = $6,001–$8,000, 6 = $8,001–$10,000, 7 = $10,001–$12,000)
TCI2     4.08   2.21        If you were Bobby, how likely is it that you would report the additional $12,000 income from the part-time business? (1 = Very Unlikely, 7 = Very Likely)
[a] Each variable was assessed using a 7-point Likert scale.
[b] Reliability estimates reflect Cronbach’s alpha.
[c] Pearson correlation.
Detection Risk
Three items, developed and pilot-tested specifically for this study, were used to measure the likelihood of getting caught under-reporting income. A high score suggests a high likelihood of getting caught or being detected.
Penalty Magnitude
Penalty magnitude was assessed using three items developed and pilot-tested specifically for this study. A high score suggests a strong or tough penalty for under-reporting income.
Dependent Variable – Tax Reporting Intentions
The decision task dealt with how much of the $12,000 in self-employment income participants would report. The pilot study revealed that participants might not respond truthfully if they felt that the researcher would inform the IRS of their non-compliant intentions. Despite numerous attempts to convince the pilot participants that their responses could not be specifically identified to them and that the researchers would maintain complete confidentiality, the participants nevertheless indicated their reluctance to respond truthfully. Hence, tax compliance intention was indirectly measured by asking participants to assume the role of a fictitious person, Bobby Smith, and to indicate the amount of self-employment income that Bobby would report. In addition, participants were asked the likelihood they would report the additional income if they were Bobby. Thus, the tax compliance items approximate asking the participants how much they would report without triggering any fear of retribution from first-person truthful reporting.
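Table 2 reports Cronbach’s alpha as the reliability estimate for each multi-item scale. As a minimal sketch of how such an estimate is computed, the following uses the standard formula alpha = k/(k − 1) × (1 − sum of item variances / variance of the summed scale). The function names and the five response rows are hypothetical and are not the study’s data; reverse-worded items are assumed to be recoded before the calculation.

```python
from statistics import variance


def reverse_code(score, scale_max=7):
    """Recode a reverse-worded item on a 1..scale_max scale (1 <-> 7, 2 <-> 6, ...)."""
    return scale_max + 1 - score


def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    responses: list of rows, one row per respondent, each row holding that
    respondent's scores on the k items of the scale (already reverse-coded
    where needed).
    """
    k = len(responses[0])
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from five respondents to a three-item scale.
example_responses = [
    [5, 4, 5],
    [2, 3, 2],
    [6, 6, 7],
    [3, 3, 4],
    [4, 5, 4],
]
print(f"alpha = {cronbach_alpha(example_responses):.2f}")
```

Sample variances (n − 1 denominator) are used for both the items and the summed scale, so the ratio is unaffected by that choice as long as it is applied consistently.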
RESEARCH RESULTS
Factor Analysis
The inter-item reliability estimates shown in Table 2 were at or above the recommended level of 0.60 (Carmines & Zeller, 1979), indicating acceptable convergent validity. To test for divergent validity, the response items were factor analyzed. Results of the principal components analysis (varimax rotation) are presented in Table 3. Only factors with eigenvalues greater than or equal to one were retained. The factor analysis accounted for 67.7% of the overall variance among response items.
Factor one included the social norm items, detection risk items, and one moral norm item. One way to interpret this construct is that the social norms and detection risk sub-constructs reflect normative expectations from two perspectives – one from important others and the other from the Federal government. The moral norm item represents a personal normative expectation. Accordingly, the first factor was labeled “normative expectations,” as originally conceived (see Fig. 2). The next four factors mostly reflect exchange equity (factor 2), penalty magnitude (factor 3), vertical equity (factor 4) and horizontal equity (factor 5). The sixth factor is a combination of horizontal equity and moral norms. Since horizontal equity deals with the perceived equity of the tax system among peers, the last construct is labeled “moral norm toward peers.”
Table 3. Results of Factor Analysis on Measured Variables.
Principal components analysis – varimax rotated component matrix
Subconstruct         Item #   Factor 1   Factor 2   Factor 3   Factor 4   Factor 5   Factor 6
Vertical equity      VE1      −0.04      0.59       0.04       0.49       0.03       −0.02
                     VE2      −0.03      0.23       0.09       0.82       −0.04      −0.04
                     VE3      0.13       0.15       −0.03      0.85       −0.05      0.09
Horizontal equity    HE1      0.10       0.25       0.04       −0.05      0.76       0.22
                     HE2      −0.06      0.31       0.02       0.08       0.76       0.26
                     HE3      −0.18      0.54       0.01       0.03       0.35       0.40
Exchange equity      EE1      0.05       0.86       0.02       0.09       −0.10      −0.05
                     EE2      0.18       0.65       0.11       0.05       −0.01      −0.13
                     EE3      −0.01      0.83       −0.02      0.20       −0.01      0.06
Social norms         SN1      0.67       −0.02      0.17       0.02       0.26       −0.03
                     SN2      0.77       0.02       0.13       −0.05      0.06       0.12
                     SN3      0.74       0.11       0.10       −0.05      0.12       −0.16
Moral norms          MN1      0.74       0.11       0.22       0.24       0.15       0.18
                     MN2      0.28       −0.06      0.12       −0.02      −0.06      0.77
                     MN3      0.24       −0.18      0.15       0.24       0.37       0.44
Detection risk       DR1      −0.75      −0.04      −0.15      0.05       −0.02      −0.09
                     DR2      −0.80      −0.02      −0.06      −0.04      0.12       −0.14
                     DR3      −0.81      0.09       −0.15      −0.16      0.16       −0.20
Penalty magnitude    PM1      0.30       −0.04      0.84       0.07       0.07       0.09
                     PM2      0.29       0.10       0.84       −0.01      0.06       0.10
                     PM3      0.12       0.07       0.84       0.02       −0.05      −0.01
Eigenvalue                    5.45       3.23       1.67       1.56       1.32       1.06
Percent of variance           25.95      15.38      7.80       7.41       6.30       4.84
Cumulative variance           25.95      41.33      49.13      56.54      62.84      67.68
Note: Factor Interpretations: Factor 1: Normative Expectations; Factor 2: Exchange Equity; Factor 3: Penalty Magnitude; Factor 4: Vertical Equity; Factor 5: Horizontal Equity; Factor 6: Moral Norm Toward Peers.
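The retention rule and the cumulative-variance row of Table 3 can be traced with a short sketch. The eigenvalues and percent-of-variance figures are transcribed from Table 3; the variable names are illustrative. The eigenvalue-of-at-least-one cutoff is commonly known as the Kaiser criterion, a label the text itself does not use.

```python
from itertools import accumulate

# (eigenvalue, percent of variance) for Factors 1-6, transcribed from Table 3.
factors = [
    (5.45, 25.95),
    (3.23, 15.38),
    (1.67, 7.80),
    (1.56, 7.41),
    (1.32, 6.30),
    (1.06, 4.84),
]

# Retention rule described in the text: keep factors with eigenvalue >= 1.
retained = [(eigenvalue, pct) for eigenvalue, pct in factors if eigenvalue >= 1.0]

# Running total of the percent-of-variance row, rounded to two decimals.
cumulative = [round(total, 2) for total in accumulate(pct for _, pct in retained)]

for number, ((eigenvalue, pct), running) in enumerate(zip(retained, cumulative), start=1):
    print(f"Factor {number}: eigenvalue {eigenvalue:.2f}, "
          f"variance {pct:.2f}%, cumulative {running:.2f}%")
```

The last running total is the share of overall item variance captured by the six retained factors.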
Fig. 2. Research Model Results.
Multiple Regression Results (H1, H2, H3)
Factor scores were used in the multiple regression analysis rather than construct indices composed of the sum or average of item responses within each construct. Because scores on the varimax-rotated principal components are mutually uncorrelated, using them eliminates multicollinearity among the independent variables. The hypotheses predicted positive relationships between tax compliance intentions and perceived equity of the tax system (H1a, H1b, and H1c), normative expectations (H2a and H2b) and legal sanctions (H3a and H3b). To test these hypotheses, tax compliance intention was regressed on the six factors arising from the principal components analysis. Additionally, all possible interactions were tested. The significant results are presented in Table 4 and illustrated in Fig. 2.
As indicated in Table 4, all six constructs arising from the factor analysis are significantly positively related to tax compliance intentions. However, exchange equity and normative expectations cannot be interpreted in a straightforward manner, as they interact with penalty magnitude (see Fig. 2).
Table 4. Regression Results of Factor Scores on Tax Compliance Intentions (Dependent Variable = Tax Compliance Intention [a]).
Model                                          Coefficient   Std. Error   Beta Coefficient   t-Value   Significance (p-Value)
Intercept                                      6.17          0.71                            8.67      0.01
Main factors
  Normative expectations                       1.28          0.12         0.621              10.63     0.01
  Exchange equity                              0.39          0.12         0.187              3.33      0.01
  Penalty magnitude                            0.36          0.12         0.175              3.10      0.01
  Vertical equity                              0.28          0.12         0.137              2.43      0.02
  Horizontal equity                            0.25          0.12         0.120              2.12      0.04
  Moral norm toward peers                      0.21          0.12         0.103              1.78      0.08
Significant interactions [b]
  Exchange equity by penalty magnitude         0.29          0.11         0.150              2.55      0.01
  Normative expectations by penalty magnitude  0.21          0.12         0.102              1.70      0.09
Significant covariates
  Income bracket                               −0.13         0.06         −0.122             −2.11     0.04
  Previous reporting failures                  −0.20         0.10         −0.123             −1.98     0.05
Note: R² = 0.59, Adjusted R² = 0.56, Overall F-ratio = 19.46 (p < 0.01).
[a] The Tax Compliance Intention items (TCI1 and TCI2) were averaged to form a single compliance index.
[b] All other two-way and three-way interactions were non-significant at p < 0.10.
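The summary statistics in Table 4 are related by standard identities, and a short sketch can be used to check their internal consistency. The coefficients, standard errors, t-values and R² are transcribed from Table 4. The sketch assumes that all 146 usable responses entered the regression and that the model has 10 predictors (six factors, two interactions, two covariates); neither assumption is stated explicitly in the text.

```python
# (coefficient, standard error, reported t-value), transcribed from Table 4.
table4 = {
    "Intercept": (6.17, 0.71, 8.67),
    "Normative expectations": (1.28, 0.12, 10.63),
    "Exchange equity": (0.39, 0.12, 3.33),
    "Penalty magnitude": (0.36, 0.12, 3.10),
    "Vertical equity": (0.28, 0.12, 2.43),
    "Horizontal equity": (0.25, 0.12, 2.12),
    "Moral norm toward peers": (0.21, 0.12, 1.78),
    "Exchange equity x penalty magnitude": (0.29, 0.11, 2.55),
    "Normative expectations x penalty magnitude": (0.21, 0.12, 1.70),
    "Income bracket": (-0.13, 0.06, -2.11),
    "Previous reporting failures": (-0.20, 0.10, -1.98),
}

# A t-value is the coefficient divided by its standard error; small gaps are
# expected because the table rounds both inputs to two decimals.
for name, (coef, se, t_reported) in table4.items():
    print(f"{name:45s} coef/se = {coef / se:6.2f}   reported t = {t_reported:6.2f}")

r_squared = 0.59
n_obs = 146        # assumption: all usable responses were analyzed
n_predictors = 10  # assumption: 6 factors + 2 interactions + 2 covariates

adjusted_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)
f_ratio = (r_squared / n_predictors) / ((1 - r_squared) / (n_obs - n_predictors - 1))

print(f"adjusted R^2 = {adjusted_r_squared:.2f}")
print(f"F-ratio      = {f_ratio:.2f}")
```

Under these assumptions the recomputed adjusted R² rounds to the reported 0.56, and the F-ratio lands close to the reported 19.46; the remaining gap is consistent with R² being rounded to two decimals.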
The positive relationship between tax compliance intentions and both vertical equity (H1a) and horizontal equity (H1b) partially supports H1. The interaction between exchange equity and penalty magnitude also partially supports H1. That is, holding penalty magnitude constant, greater perceptions of exchange equity are associated with higher levels of tax compliance intentions (H1c). The combined results indicate that H1 is supported.
Evaluation of the second hypothesis (H2) is conditioned on the following interpretation. Holding penalty magnitude constant, social norms are positively related to tax compliance intentions (H2a). However, only one of the moral norm items loaded on the “normative expectations” construct; hence, the expected positive relationship between moral norms and compliance intentions is weakly supported (H2b), with the caveat that it too must be interpreted in light of the interaction. Additionally, the construct entitled “moral norm toward peers” also appears to be a subcomponent of normative expectations, as it deals with a consciousness to comply with the Federal tax code in a similar manner as peers. The positive relationship between “moral norm toward peers” and tax compliance intentions further supports H2b, again when interpreted in light of the interaction. Unexpectedly, the three detection risk items loaded on the normative expectations factor. Detection risk can be viewed as a form of normative expectations or oversight from an external party (the Federal government), even though this relationship was not hypothesized. Overall, research findings suggest that H2 is partially supported.
The third hypothesis is only partially supported as well, since the detection risk items (H3a) loaded on the normative expectations factor, and the penalty magnitude items (H3b) loaded on their own factor. Yet, penalty magnitude must be interpreted in light of exchange equity and normative expectations; that is, holding exchange equity constant, more severe penalties are associated with higher tax compliance intentions, and, holding normative expectations constant, greater penalties also suggest higher compliance intentions.
All demographic variables were originally included in the regression model as possible covariates, but only two were significant (p ≤ 0.10): income bracket and previous reporting failures. Interestingly, higher income brackets indicated less intention to comply. Not surprisingly, greater numbers of previous income reporting failures suggested lower tax compliance intentions.
Overall, the sub-construct items did not load on the constructs precisely as expected (compare Fig. 1 to Fig. 2). Nevertheless, factor analysis and regression model results indicate that the constructs articulated in the theory of planned behavior (attitudes, subjective norms and perceived behavioral control) are predictive of low-income individual taxpayers’ compliance intentions.
DISCUSSION
This study examines the extent to which low-income individual taxpayers’ compliance intentions are influenced by perceptions of tax equity (vertical, horizontal, and exchange equity), normative expectations (social and moral norms) and legal sanctions (detection risk and penalty magnitude). The research model (Fig. 1) is based on the theory of planned behavior (Ajzen, 1985, 1991). While prior tax compliance research has largely ignored low-income individual taxpayers, continued significant growth of the EITC program has created a situation where non-compliance among low-income individual taxpayers is becoming a serious problem with growing fiscal implications (IRS, 2002). Hence, any insight into ways to increase tax compliance among these taxpayers can be helpful in setting tax policy.
The research findings suggest at least two policy implications for the Federal government to consider. First, the results indicate significant positive relationships between tax compliance intentions and perceptions of vertical and horizontal equity. This implies that any attempts made by the Federal government to ensure that lower income individual taxpayers feel as though they are not paying proportionately more taxes than upper income individual taxpayers (vertical equity) or peer taxpayers (horizontal equity) can positively impact tax compliance intentions.
Second, legal sanctions (penalty magnitude) reveal an interactive effect with exchange equity and normative expectations. With respect to exchange equity, low-income individual taxpayers must feel as though the benefits they receive from the Federal government are proportionate to the taxes they pay. Positive perceptions in this regard coupled with strong penalties for non-compliance should help to improve taxpayer compliance. Regarding normative expectations, there are two aspects to consider – important others and governmental oversight. The expectations of important others are clearly important, but social expectations of this nature are socio-cultural and outside the spectrum of legislation. However, the Federal government can take steps to increase the monitoring process via tax return audits. Naturally, attempts to increase the percentage of audits for low-income individual taxpayers must be cost effective, and aligned with the audit rates applied to higher-income taxpayers and to peers (the vertical and horizontal comparison groups). As with exchange equity, increased governmental oversight coupled with stronger penalties should lead to greater taxpayer compliance.
This study is limited by certain validity threats common to survey research. First, although efforts were made to maximize the realism of the task, subjects responded to a contrived scenario involving a tax compliance opportunity. To the extent that participants respond differently when making such decisions on their
own returns, results from the present study may have limited external validity. Second, due to the sensitive nature of tax compliance issues, the possibility exists that respondents were not honest in reporting their intentions. To mitigate this possibility, the questionnaire solicited compliance intentions indirectly by asking participants to report their intentions as if they were a hypothetical taxpayer, and participants were also assured that their responses would remain anonymous. Third, the current study measures taxpayers’ compliance intentions but not the actual compliance behavior. However, comparing participant demographics to their reported intentions offers some degree of external validity to the link between tax non-compliance intentions and behaviors in the current study, as nearly 97% of respondents reported that they have understated their taxable income at least once during the past five years; thus, their responses to the experimental script are likely indicative of how they would actually behave in a similar circumstance. Future research should continue to address the link between compliance intentions and behavior. Fourth, the maximum amount of tax refund that could be obtained from under-reporting income was assumed to be $2,200 for all participants in this study. Additional research is needed to examine how taxpayers’ decisions may be influenced by larger tax items, because Christensen and Hite (1993) suggest that taxpayers are more conservative with items involving larger liabilities than with items involving smaller liabilities. This indicates that the tax compliance intentions captured in this study may be influenced by the maximum amount of refundable credit that was made available to the taxpayer. Finally, this study dealt with an under-reporting of income scenario in order to receive a higher EITC. 
It is also possible for individual taxpayers to over- or under-report deductions in order to maximize their qualifying EITC income.7 Presumably, the measured variables used in this study would also apply to over-reporting of deductions scenarios. However, taxpayers may perceive that there is less detection risk when deductions are intentionally over-reported. This issue should be addressed in future research on the tax compliance behavior of low-income individual taxpayers.
NOTES 1. “The Earned Income Tax Credit (EITC), sometimes called the Earned Income Credit (EIC), is a refundable Federal income tax credit for low-income working individuals and families. Congress originally approved the tax credit legislation in 1975 in part to offset the burden of social security taxes and to provide an incentive to work. The credit reduces the amount of Federal tax owed and can result in a refund check. When the EITC exceeds the amount of taxes owed, it results in a tax refund to those who claim and qualify for the credit. Income and family size determine the amount of the EITC. To qualify for the credit, both
the earned income and the adjusted gross income for 2003 must be less than $29,666 for a taxpayer with one qualifying child ($30,666 for married filing jointly), $33,692 for a taxpayer with more than one qualifying child ($34,692 for married filing jointly), and $11,230 for a taxpayer with no qualifying children ($12,230 for married filing jointly). The EITC Eligibility Checklist on the last page of IRS’ Publication 596, Earned Income Credit, may be used to quickly determine eligibility for the credit.” (Source: http://www.irs.gov/individuals/).
2. “Several years ago the Internal Revenue Service developed the concept of the ‘tax gap’ as a way to measure voluntary federal income tax compliance. The gross tax gap is the difference between taxes owed (the ‘true’ tax liability) and taxes paid voluntarily and timely for any given tax year. The net tax gap is the gross tax gap minus taxes collected through various IRS enforcement programs for the same tax year. Both gross and net individual income tax gaps consist of three main components: non-filing, underreporting and underpayment. The non-filing gap is the amount of tax liability owed by taxpayers who do not voluntarily and timely file returns. The underreporting gap is the amount of tax liability not voluntarily reported by taxpayers who do file returns. The underpayment tax gap is the amount of tax liability individuals report on returns but do not pay voluntarily and timely.” (Source: http://www.unclefed.com/Tax-News/1997/).
3. Several incentives were tested, such as state lottery tickets and small cash payments. After several pilot test trials, the food coupon was deemed to be the best received and most appreciated incentive.
4. Of the 197 responses, 51 were deleted because they reported income that would not qualify as low-income taxpayers based on the 2002 EITC guidelines (e.g. workers raising two or more children with household income of $34,000 (married) or $33,000 (single)).
5. The 20% phase-out rate for the study compares to the 15.98% phase-out rate for the EITC program.
6. In the pilot study, subjects identified spouses, family and friends in that order as the strongest influences on their ethical behaviors.
7. The IRS has recently implemented an initiative aimed at reducing cheating behavior related to the over-reporting of deductions in order to receive a higher EITC. Specifically, the IRS now requires taxpayers to include their dependents’ social security numbers on tax returns in an attempt to halt the fraudulent listing of non-existent dependent children.
REFERENCES
Adams, J. S. (1965). Inequity in social exchange. Advances in Experimental Social Psychology, 2, 267–299.
Ajzen, I. (1985). From intentions to action: A theory of planned behavior. In: J. Kuhl & J. Beckmann (Eds), Action Control from Cognition to Behavior. New York: Springer Verlag.
Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50, 179–211.
Alm, J. (1991). A perspective on the experimental analysis of tax reporting. The Accounting Review, 66(July), 577–593.
Beck, L., & Ajzen, I. (1991). Predicting dishonest actions using the theory of planned behavior. Journal of Research in Personality, 25, 285–301.
Carmines, E. G., & Zeller, R. A. (1979). Reliability and validity assessment. Beverly Hills: Sage.
Carnes, G. A., & Englebrecht, T. D. (1995). An investigation of the effect of detection risk perceptions, penalty sanctions, and income visibility on tax compliance. The Journal of the American Taxation Association (Spring), 26–41.
Christensen, A. L., & Hite, P. A. (1993). A study of the effect of taxpayer risk perceptions on ambiguous compliance decisions. The Journal of the American Taxation Association (Spring), 1–18.
Cowell, F. A. (1992). Tax evasion and inequity. Journal of Economic Psychology, 13, 521–543.
Fishbein, M., & Ajzen, I. (1975). Belief, attitude, intention and behavior: An introduction to theory and research. Reading, MA: Addison-Wesley.
Fischer, C. M., Wartick, M., & Mark, M. (1992). Detection probability and taxpayer compliance: A literature review. Journal of Accounting Literature, 11, 1–46.
General Accounting Office Reports (GAO) (1997). GAO Reports on EIC Usage. TNT 97, 119–179. (GAO/GCD-97-69). Release Date: May 16, 1997 (Doc 97-17840).
Hite, P. A., & Roberts, M. L. (1992). An analysis of the tax reform based on taxpayers’ perceptions of fairness and self-interest. Advances in Taxation, 4, 115–137.
Internal Revenue Service (IRS) (2002). Compliance Estimates for Earned Income Tax Credit claimed on 1999 Returns, February, IRS Publications.
Jackson, B. R., & Milliron, V. C. (1986). Tax compliance research: Findings, problems, and prospects. Journal of Accounting Literature, 5, 125–161.
Kaplan, S. E., Newberry, K. J., & Reckers, P. M. (1997). The effect of moral reasoning and educational communications on tax evasion intentions. The Journal of the American Taxation Association, 19(Fall), 38–54.
Klepper, S., & Nagin, D. (1989). Tax compliance and perceptions of the risks of detection and criminal prosecution. Law and Society Review, 23(2), 209–240.
Maroney, J. J., Rupert, T. M., & Anderson, B. H. (1998). Taxpayer reaction to perceived inequity: An investigation of indirect effects and the equity control model. The Journal of the American Taxation Association, 20(Spring), 60–77.
McGraw, K. M., & Scholz, J. T. (1991). Appeals to civic virtue vs. attention to self-interest: Effect on tax compliance. Law and Society Review, 25, 471–498.
Moser, D. V., Evans, J. H., III, & Kim, C. K. (1995). The effects of horizontal and exchange inequity on tax reporting decisions. The Accounting Review (October), 619–634.
O’Neil, C. J., & Samelson, D. P. (2001). Behavioral research in taxation: Recent advances and future prospects. Advances in Accounting Behavioral Research, 4, 103–139.
Parker, D., Manstead, A. S. R., Stradling, S. G., & Reason, J. T. (1992). Intention to commit driving violations: An application of the theory of planned behavior. Journal of Applied Psychology, 77(1), 94–101.
Randall, D. M., & Gibson, A. M. (1991). Ethical decision making in the medical profession: An application of the theory of planned behavior. Journal of Business Ethics, 10(2), 111–122.
Roberts, M. L. (1994). An experimental approach to changing taxpayers’ attitudes towards fairness and compliance via television. The Journal of the American Taxation Association (Spring), 67–86.
Roth, J., & Witte, A. (1985). Understanding taxpayer compliance: Major factors and perspectives. Conference on Tax Administration. Internal Revenue Service (January), 57–78.
Schwartz, R., & Orleans, S. (1967). On legal sanctions. University of Chicago Law Review, 34, 282–300.
Smith, K. W., & Kinsey, K. A. (1987). Understanding taxpaying behavior: A conceptual framework with implications for research. Law and Society Review, 21(4), 639–663.
Tax Compliance Intentions of Low-Income Individual Taxpayers
25
Webley, P., Morris, I., & Amstutz, F. (1985). Tax evasion during a small business simulation. In: H. Brandstatter & E. Kirchler (Eds), Economic Psychology (pp. 233–242). Linz: Trauner. Witte, A. D., & Woodbury, D. F. (1985). The effect of tax laws and tax administration on tax compliance: The case of the U.S. individual income tax. National Tax Journal, 38, 1–14. Worsham, R. G. (1996). The effect of tax authority behavior on tax compliance: A procedural justice approach. The Journal of American Tax Association, 18(Fall), 19–39. Yankelovich, S., & White, Inc. (1984). Taxpayer attitudes study: Final report. Internal Revenue Service.
DETERMINANTS OF TAX PROFESSIONALS’ ADVICE AGGRESSIVENESS AND FEES
Donna D. Bobek and Richard C. Hatfield
ABSTRACT
Prior research has identified a number of variables that influence tax professionals’ judgments. However, these variables have usually been examined in isolation. This study has two main findings. First, using a structured questionnaire that allows for the collection of variables related to actual tax planning engagements, this study validates the findings of numerous laboratory studies using factor and regression analysis. Factors representing risks and rewards associated with the client and the IRS, along with task characteristics and client aggressiveness, significantly affect the aggressiveness of tax advice given to clients. Second, tax professionals do not appear to charge a premium for aggressive tax advice. However, regarding the fee charged, a significant gender effect is found even after controlling for time spent on the engagement, experience, firm size and education.
Advances in Accounting Behavioral Research, Volume 7, 27–51
Copyright © 2004 by Elsevier Ltd. All rights of reproduction in any form reserved
ISSN: 1474-7979/doi:10.1016/S1474-7979(04)07002-4
INTRODUCTION
Roberts (1998) articulated a model of tax accountants’ judgment and decision-making (JDM) processes based on prior research. Included in this model are economic environmental factors representing the risks and rewards associated
with the IRS, the client, and the tax accountant’s firm. He also includes two other sets of factors, individual-psychological factors (e.g. experience, advocacy) and task inputs (e.g. ambiguity and authority). These factors are presumed to affect a tax accountant’s cognitive processing, which in turn leads to some output (e.g. a recommendation on a tax planning issue). This model is based primarily on research findings from experimental studies using mostly “Big 5” accountants as research subjects. The present study has two objectives. The first objective is to test the validity of Roberts’ model using a different data collection technique that allows for the collection of a comprehensive set of variables. In addition, the subjects are tax professionals from smaller accounting firms rather than the large firms which were more typical of the studies examined in Roberts’ (1998) review. The advantage of this approach is to generalize prior results to the type of accounting firms that do most of the tax work (Russell, 2002), and to provide external validity to results found primarily in experimental settings. Further, this approach considers variables that have not been adequately addressed by prior research. While the risks and rewards associated with the IRS and the client have received a great deal of attention, the risks and rewards associated with the tax professionals’ firm have received less attention (Roberts, 1998). The second objective of this study is to determine whether tax professionals “price” aggressiveness. Increased aggressiveness may lead to an increased risk of malpractice claims (Bandy, 1996) and taxpayer and/or tax preparer penalties. Although CPAs are generally barred from charging contingency fees (Department of Treasury, 1994), it would be economically rational to assume that some premium may be associated with advice the tax professional deems particularly aggressive. 
CPAs from primarily small firms responded to a structured questionnaire regarding their last tax planning engagement. Factor analysis revealed that statistically developed factors are consistent with the factors articulated in Roberts’ model. Regression analysis using the factor scores as independent variables identified four factors as influential to the aggressiveness of the tax professionals’ advice: client characteristics (e.g. size, importance), task characteristics (e.g. ambiguity, tax dollars at stake), risks and rewards associated with the IRS (e.g. concern for IRS audit, taxpayer penalties) and, especially, client aggressiveness. The only factor that was not significant was a factor representing risks and rewards associated with the tax professional’s firm (e.g. concern for client loss, and concern for professional liability). Regression analysis was also performed with fees as the dependent variable. After controlling for time spent by the tax professional and others in the firm (the biggest influence on fees), firm size and gender significantly affected fees. The larger the firm (measured as number of professionals), the higher were the
fees. Also, male accountants charged more than female accountants. Experience, education and advice aggressiveness were not significantly related to fees, leading to the tentative conclusion that tax professionals do not charge a premium for aggressive advice. The remainder of the paper is organized as follows: in the next section prior research is discussed, and the study’s research objectives are articulated. In the following section, the method of testing the research objectives is described, followed by a discussion of the results obtained. Finally, the last section provides a summary and suggestions for future research.
LITERATURE REVIEW AND RESEARCH OBJECTIVES
Variables from Roberts’ Model of Tax Judgment and Decision Making
There has been a great deal of experimental research investigating the aggressiveness of tax professionals’ advice (e.g. Ayers et al., 1989; Bandy et al., 1994; Carnes et al., 1996; Cloyd & Spilker, 1999; Cuccia, 1994; Cuccia et al., 1995; Duncan et al., 1989; Helleloid, 1989; LaRue & Reckers, 1989; McGill, 1990; Pei et al., 1992; Reckers et al., 1991; Schisler, 1994). Roberts (1998) synthesized these prior results and articulated a model of tax accountants’ judgment and decision-making. This model is depicted in Fig. 1.1
Fig. 1. Economic Psychology-Processing Model of Tax Accountants’ Judgment/Decision Making.
Roberts (1998) identified three sets of inputs that affect the tax accountant’s cognitive processing (e.g. problem identification, information search, alternative evaluation), which in turn leads to some output (e.g. aggressive tax advice). The three sets of inputs he identified were: individual-psychological factors (e.g. experience, advocacy, risk preferences), economic environmental factors (i.e. risks and rewards associated with IRS, client, and/or firm), and task inputs (e.g. ambiguity, complexity, documentation). He also identified a variety of areas that required additional research. For example, he called for additional research exploring the role that the economic environment has on tax accountants’ judgment and decision-making, particularly the effect of liability concerns.
The present study focuses on the effect of these inputs on the aggressiveness of the tax advice provided by tax professionals. Individual-psychological factors of interest in this study that have been found to be related to advice aggressiveness include issue experience, years of experience, firm size, and gender. In general, experience has been correlated with increased aggressiveness (e.g. Cloyd, 1995; LaRue & Reckers, 1989; Roberts & Klersey, 1996), although it is suggested that experience is merely a proxy for other variables such as better knowledge of IRS audit probabilities (Roberts, 1998). Firm
size has sometimes been associated with more aggressive advice, although Roberts points out that there is no theoretical support for this effect (Roberts, 1998). Results regarding the effect of gender on advice aggressiveness are mixed. While McGill (1990) found males to provide more aggressive advice than females, Ashton (2000) reported weakly significant results that females provide more aggressive advice.2 Task factor variables of interest include law ambiguity and issue complexity. Law ambiguity relates to the clarity of the legal precedents, while issue complexity relates to the clarity of the facts and circumstances of the tax issue. The greater the ambiguity in the tax law, the more aggressive the advice is expected to be (Helleloid, 1989; Kaplan et al., 1988; Klepper & Nagin, 1989). The effect of complexity on advice aggressiveness has not been addressed by prior research. However, similar to law ambiguity, there should be more opportunity to be aggressive as the complexity of the tax issue increased. Economic environmental factors that have been considered by prior research include concerns about IRS action, and characteristics of the client. Concern for IRS involvement has had the expected effect of reducing advice aggressiveness (see e.g. Newberry et al., 1993). Client characteristics such as size, importance, strength, and the aggressiveness of the client have been found to increase the aggressiveness of tax professionals’ advice (Roberts, 1998). Firm factors, such as concern for client loss and concern for professional liability, have not been adequately addressed by prior research, and thus are discussed in more detail. Currently, tax engagements give rise to the majority of malpractice claims filed against CPAs (Anderson & Wolfe, 2001; Wladis, 1995; Yancey, 1996). The AICPA reported that 60% of all accountant malpractice claims in the AICPA Professional Liability Insurance program arose from tax engagements (Anderson & Wolfe, 2001). 
This is up from 43% ten years ago. For many tax firms, the direct cost of malpractice protection is their single largest expense after employee compensation, approximating 10% of total expenses (Bandy, 1996). Professional liability as a risk factor is becoming increasingly important in the tax professional’s work environment. Practitioners are aware that an effective way of limiting malpractice liability is to carefully regulate tax return aggressiveness (Bandy, 1996). Therefore, professional liability pressures may lead to general reductions in the aggressiveness of tax positions. A second risk associated with the tax professional’s firm is the possibility that the firm will lose the client. Roberts and Cargile (1994) examine the effect that concern for client loss has on the aggressiveness of both auditors and tax professionals. They find a significant main effect for the risk of client loss. When the risk of client loss is viewed as high, their subjects’ advice was more aggressive. Their client loss variable was dichotomous and was manipulated by telling the participant that the
perceived risk of losing the client if an expenditure is capitalized is either high or low. Further, they find that the effect of client loss is stronger in an audit context than in a tax context. Therefore, we would expect that concern for client loss will be related to advice aggressiveness.
To summarize, the first objective of this study is to perform a fairly comprehensive test of Roberts’ model of tax professionals’ judgment and decision-making, using advice aggressiveness as the dependent variable. Data are collected on the three sets of inputs identified by Roberts: individual-psychological factors, task characteristics and the economic environment. However, the “black box” portion (i.e. cognitive processing) of the model illustrated in Fig. 1 is not addressed.
Tax Professionals’ Fees
There have been numerous studies that have investigated the determinants of audit fees (e.g. Behn et al., 1999; Francis & Simon, 1987; O’Keefe et al., 1994; Simunic, 1980). In addition to identifying client attributes that are related to fees (e.g. size, foreign operations), these studies have also addressed whether or not audit quality and client satisfaction are related to the level of audit fees.
There are only a few studies that have discussed tax preparer fees. Christensen (1992) identified fees as a determinant of taxpayers’ perceptions of quality service. In an analytical study, Phillips and Sansing (1998) investigated whether the ban on contingent fees serves to increase compliance (their conclusion was that it does not). Frischmann and Frees (1999), in an archival study using tax return data, determined that tax return preparation fees were associated with tax savings and time savings, but not uncertainty reduction. Ashton (2000), in a unique study that examined the results of a Money magazine tax return preparation contest, found that the fee that the participants said they would have charged was not related to the aggressiveness or the accuracy of their services. There were differences, however, between CPAs and non-CPAs, and between males and females (CPAs and males would have charged more). Although not specifically addressing fees, client importance has been characterized as a surrogate for “high future compensation” (e.g. Reckers et al., 1991), and was found to be related to advice aggressiveness.
Other than the Ashton (2000) study, we know of no study that has directly investigated whether there is a link between fees and advice aggressiveness. Nevertheless, providing aggressive advice is not without cost to the tax professional. Providing aggressive advice increases the risk of IRS audit, preparer penalties and taxpayer penalties, and may also translate into a higher risk of malpractice liability (Bandy, 1996). 
However, there is a lack of consensus within the extant literature as to whether or not taxpayers are even seeking aggressive
advice from tax professionals. Schisler (1995) found that taxpayers in his study were more aggressive than tax professionals. On the other hand, Hite (1992) reported that taxpayers do not demand aggressive advice. With this review of the literature as a backdrop, the second objective of the present study is to determine whether the fees charged by tax professionals are related to the aggressiveness of their advice. In addition to data about fees, we collected data regarding the amount of time spent on the engagement by both the tax professional and others in the tax professional’s firm, firm size, gender, education and years of experience.
METHOD
Questionnaire
We developed a questionnaire which asked respondents to recall their most recent tax planning engagement. The questionnaire is reproduced in the Appendix. All of the questions relate to one particular client and engagement. This method is similar to one used by Gibbins et al. (2001). They asked auditors to recall their last negotiation process and respond to a “structured questionnaire” about their experience. While experiments obviously provide the greatest degree of internal validity as well as allow for causality conclusions, they are subject to external validity concerns. This structured questionnaire approach allowed us to “observe” a naturally occurring behavior (i.e. we obtain a “sample” of actual tax planning engagements), and to collect a wide range of variables for a variety of different tax engagements.
Our first research objective considered whether or not Roberts’ model, developed from the results of experimental research using primarily Big-5 CPAs as subjects, could be validated using a different methodology and a sample of tax professionals from small firms. Since the particular tax professional output we focused on was the aggressiveness of the advice given in a tax planning engagement, the dependent measure was the subject’s response to the question, “how aggressive was the advice that you gave the client on this specific issue,” answered on a 7-point Likert scale (Pei et al., 1990; Roberts & Klersey, 1996; Spilker et al., 1999).
We also collected measures designed to correspond to Roberts’ individual-psychological factors,3 task factors and environmental factors. The individual tax professional variables collected were issue experience (7-point scale), years of experience, firm size (measured as number of professionals in the office), position in the firm (e.g. partner, manager, senior, staff), gender and education (e.g. Masters Degree, Bachelors Degree, etc.). The remaining variables used to
test Roberts’ model were measured on either a 7-point or 10-point scale. Task variables collected were law ambiguity and issue complexity. Environmental variables fall into three categories: risks and rewards associated with the IRS, the firm and the client. The IRS variables collected were concern for IRS audit, concern for taxpayer penalties and concern for preparer penalties. The firm variables collected were concern for professional liability and concern for client loss. The client variables collected were client size, client importance, client strength, client relationship, client aggressiveness and tax dollars at stake. Finally, in order to test the relationship between advice aggressiveness and the fee charged the client, we asked the subjects to estimate the fee (in dollars) they charged, and the amount of time (in hours) spent by themselves and others in the firm.
Subjects
Data were collected from non-Big 5 accountants.4 While these participants are not completely comparable to subjects from large firms in many prior studies, they are more representative of the population of tax professionals.5 They were contacted through the mail. Five hundred tax professionals in the Central Florida area were identified as potential participants. To encourage participation, participants were allowed to enter a drawing for free admission to a Professional Education Conference. They entered the drawing by sending a separate email message in order to preserve anonymity. Table 1 presents the sample demographics. The respondents were primarily male (72%), CPAs (83%), partners (88%), in small firms (94%) with fewer than 5 (66%) professionals in the office. The response rate (detailed in Table 2) was approximately 20%.

Table 1. Sample Demographics.
Years of Experience
  0–10 years                                    18%
  10–20 years                                   45%
  20–30 years                                   30%
  Over 30 years                                  7%
Education
  No college                                     0%
  AA/AS                                          7%
  BA/BS                                         56%
  MA/MS                                         36%
  PhD                                            1%
Profession
  CPA                                           83%
  Enrolled agent                                 9%
  Other                                          8%
Position in Firm
  Staff                                          0%
  Senior                                         3%
  Manager                                        8%
  Partner/Owner                                 89%
Firm Size
  Big 5/national/regional                        5%
  Not national or regional                      95%
# of Professionals in Office
  Less than 5                                   66%
  5–20                                          19%
  More than 20                                  15%
Gender
  Male                                          72%
  Female                                        28%
% Who Experienced Previous Malpractice Claim    13%

Table 2. Detail of Response Rate.
Initial questionnaires mailed                          509
Less:
  Returned as undeliverable                            (45)
  Deceased                                              (2)
  Not eligible (a)                                      (4)
Adjusted sample size                                   458
Returned questionnaires (b)                             93
Response rate (based on adjusted sample size)        20.3%
(a) The instructions to the questionnaire stated that it should only be completed if tax planning was a “significant” part (defined as at least 10%) of the tax professional’s practice. Four subjects contacted us to let us know they did not meet the criteria. It is likely that this number is understated; therefore the true response rate of eligible participants is likely higher than what is reported here.
(b) All returned questionnaires were useable.
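The response-rate arithmetic in Table 2 can be retraced in a few lines. The following is a minimal sketch using only the counts reported in the table; the variable names are illustrative and not part of the study.

```python
# Counts copied from Table 2; variable names are hypothetical.
mailed = 509
undeliverable = 45
deceased = 2
not_eligible = 4
returned = 93

# Adjusted sample size = mailed questionnaires minus those that could not
# or should not have been answered.
adjusted_sample = mailed - undeliverable - deceased - not_eligible
response_rate = returned / adjusted_sample

print(adjusted_sample)           # 458
print(f"{response_rate:.1%}")    # 20.3%
```

Because the count of ineligible recipients is probably understated (see note (a) to Table 2), a larger `not_eligible` value would shrink the denominator and raise the computed rate.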
RESULTS
Descriptive Statistics
Table 3 presents the mean response to each variable. The mean of the dependent variable, advice aggressiveness, indicates that, on their last engagement, respondents gave advice that was slightly more aggressive than average. They also indicated that the client was more aggressive than the average client (4.49 on a 7-point scale). The tax issue was somewhat complex and the tax dollars at stake were more than average. In general, the subjects were not particularly concerned with the risks associated with the firm or the IRS. Concern for client loss averaged just over three on a 10-point scale (where 1 = didn’t even think about it, and 10 = very concerned about it). The biggest concern was for taxpayer penalties, and that mean was still less than five.
Table 3. Descriptive Statistics.
Variable                                     Mean      Standard Deviation
Complexity (a)                               4.84      1.22
Ambiguity (a)                                3.58      1.66
Issue experience (a)                         4.83      1.63
Tax dollars at stake (a)                     4.60      1.52
Client size (a)                              4.22      1.40
Client importance (a)                        4.22      1.48
Client strength (a)                          5.28      1.27
Client relationship (a)                      5.58      1.10
Client aggressiveness (a)                    4.49      1.32
Advice aggressiveness (a)                    4.14      1.16
Concern for preparer penalty (b)             2.96      2.82
Concern for IRS audit (b)                    3.74      2.65
Concern for taxpayer penalty (b)             4.68      3.26
Concern for professional liability (b)       4.31      3.27
Concern for client loss (b)                  3.11      2.87
Gender (% male)                              72%       0.45
Years of experience                          19.75     7.75
Firm size (# of professionals in office)     5.46      5.80
Fee charged                                  $1,085    2,237
Hours spent by self                          7.66      12.61
Hours spent by others                        2.32      7.41
Note: See appendix for actual questions asked.
(a) Measured on a 7-point scale.
(b) Measured on a 10-point scale.
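One way to read the means in Table 3 is to compare each with the midpoint of its response scale. The sketch below does this under two assumptions that are not stated in the table itself: that the 7-point items run from 1 to 7, and that the scale midpoint is a fair stand-in for "average". The 1-to-10 range for the concern items follows the anchors quoted in the text.

```python
# Means copied from Table 3.
SCALE_7 = (1, 7)    # assumed anchors for the 7-point items
SCALE_10 = (1, 10)  # anchors given in the text for the concern items

means_7pt = {
    "Complexity": 4.84, "Ambiguity": 3.58, "Issue experience": 4.83,
    "Tax dollars at stake": 4.60, "Client size": 4.22,
    "Client importance": 4.22, "Client strength": 5.28,
    "Client relationship": 5.58, "Client aggressiveness": 4.49,
    "Advice aggressiveness": 4.14,
}
means_10pt = {
    "Concern for preparer penalty": 2.96, "Concern for IRS audit": 3.74,
    "Concern for taxpayer penalty": 4.68,
    "Concern for professional liability": 4.31,
    "Concern for client loss": 3.11,
}

def midpoint(scale):
    """Return the arithmetic midpoint of a (low, high) response scale."""
    low, high = scale
    return (low + high) / 2

below_mid_7 = [name for name, m in means_7pt.items() if m < midpoint(SCALE_7)]
below_mid_10 = [name for name, m in means_10pt.items() if m < midpoint(SCALE_10)]

print(below_mid_7)                            # ['Ambiguity']
print(len(below_mid_10) == len(means_10pt))   # True
```

Under these assumptions, only law ambiguity falls below the midpoint among the 7-point items, while every concern item falls below the midpoint of the 10-point scale, which matches the description of respondents as not particularly concerned about firm or IRS risks.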
Test of Roberts’ Model
Factor Analysis Results
Factor analysis of the responses was performed to examine whether these measures loaded consistently with the factor descriptions provided by Roberts (1998). Table 4 reports the results of this factor analysis. Varimax rotation factor loadings are reported. Five factors were retained by the procedure. In total, the five factors explained 65.8% of the variance.

Table 4. Factor Analysis Results. (a)
Variable                               Factor 1   Factor 2   Factor 3   Factor 4   Factor 5
Client strength                        0.836
Client importance                      0.768
Client size                            0.761
Client relationship                    0.633
Issue complexity                                  0.708
Issue experience                                  0.642
Tax dollars at stake                              0.628
Law ambiguity                                     0.548      0.466
Concern for IRS audit                                        0.801
Concern for preparer penalty                                 0.798
Concern for taxpayer penalty                                 0.559
Concern for client loss                                                 0.779
Concern for professional liability                                      0.774
Client aggressiveness                                                              0.899
% Variance explained                   21.4%      16.9%      12.3%      8.0%       7.0%
Factor description: Factor 1 = Client risks & rewards; Factor 2 = Task factors; Factor 3 = IRS risks & rewards; Factor 4 = Firm risks & rewards; Factor 5 = Client aggressiveness.
(a) Varimax rotation factor loadings are reported. One variable, Law Ambiguity, had relatively high loading scores on two factors, therefore both are reported.

Two of the five factors (Factors One and Five in Table 4) represent client characteristics (or in Roberts’ vernacular, “risks and rewards associated with the client”). Factor One includes client strength, importance, size and relationship. This factor explained 21.4% of the variance. Client aggressiveness loaded as its own factor (Factor Five), explaining 7% of the variance. Factor Two represents task characteristics. Issue complexity, issue experience, tax dollars at stake and law ambiguity loaded on this factor and explained 16.9%
of the variance. Factor Three represents risks and rewards associated with the IRS. Concern for IRS audit, concern for preparer penalties and concern for taxpayer penalties loaded on this factor and explained 12.3% of the variance. Finally, Factor Four represents risks and rewards associated with the CPA’s firm. Concern for professional liability and concern for client loss loaded on this factor and explained 8% of the variance. The only variable that did not load in a manner consistent with Roberts’ model was tax dollars at stake. Roberts viewed this variable as a risk/reward associated with the client. However, our subjects merely viewed it as a characteristic of the task.6 There were five individual variables collected: gender, education, firm size, years of experience and issue experience. Only issue experience was included in the factor analysis because the other four items are independent of the particular client engagement. Issue experience loaded with the task characteristics, which suggests that the level of specific experience influences the perceptions of the task itself. In summary, the factor analysis results appear to validate (with very few exceptions) Roberts’ characterization of the inputs to a tax professional’s cognitive
processing of a judgment and decision-making task. It is interesting to note that the tax professionals viewed client aggressiveness as distinct from other client characteristics. As will be discussed below, when we examine the regression results, client aggressiveness appears to be a particularly salient characteristic.
Regression Results
The factor scores generated from the factor analysis procedure were used as independent variables, along with gender, firm size, education and years of experience, in a regression with advice aggressiveness as the dependent variable. The results of this regression are reported in Table 5. The regression model was significant (p < 0.001). The model R2 was 0.338 (adjusted R2 was 0.259). Four of the five factors were significant in explaining advice aggressiveness. Only the factor representing risks and rewards associated with the firm was not significant (p-value = 0.219). None of the separately considered individual characteristics (i.e. gender, education, firm size, and years of experience) were significant in explaining advice aggressiveness.7

Table 5. Regression Results: Advice Aggressiveness.
Independent Variables        Parameter Estimates   Standardized Coefficients   p-Value
Intercept                    4.252                                             0.000
Factors (a)
  Client characteristics     0.269                 0.231                       0.020
  Task characteristics       0.282                 0.250                       0.011
  IRS risks and rewards      −0.220                −0.189                      0.055
  Firm risks and rewards     0.149                 0.126                       0.219
  Client aggressiveness      0.433                 0.377                       0.000
Other variables
  Gender                     0.097                 0.037                       0.729
  Firm size                  0.003                 0.015                       0.884
  Years of experience        0.001                 0.004                       0.971
  Education                  −0.190                −0.090                      0.384
Model statistics
  Model mean square          4.300
  F-statistic                4.259
  Model p-value              0.000
  Model R2                   0.338
  Model adjusted R2          0.259
Note: Dependent variable: advice aggressiveness.
(a) Table 4 reports complete factor loadings.

Examination of the
variance inflation factors (VIF) indicates that multicollinearity is not affecting these results. The standardized coefficients reported in Table 5 indicate that client aggressiveness was the most important influence, followed by task characteristics, client characteristics, and, finally, risks and rewards associated with the IRS. All of the factors positively affected advice aggressiveness, except, of course, for risks and rewards associated with the IRS, which had a negative coefficient. Based on the variable coding, the direction of these coefficients implies that large, important, financially strong, and aggressive clients received more aggressive advice. Also, the more ambiguous and complex the issue and the larger the dollar amount, the more aggressive was the advice. However, the more concerned the tax professional was about IRS involvement via audit, or penalties, the less aggressive was the advice provided.
Discussion
It was somewhat surprising that the factor representing concern for client loss and professional liability8 was not significant. One possible explanation is that the CPAs in our sample are not particularly concerned about either of these “risks.” Table 3 gives some support to this idea. Concern for client loss and concern for professional liability were both measured on a 10-point scale with 1 indicating that the respondent didn’t even consider the outcome when formulating his/her advice and 10 indicating that the respondent was very concerned with the outcome. The mean response for client loss was only 3.11, and more than half of the respondents rated it at a “1.” The mean response for professional liability was only 4.31. Therefore, for both of these “risks,” the mean response did not even reach the midpoint of the scale. This lack of concern may be a result of the tax professionals in our sample. Our sample was made up almost exclusively of CPAs from small firms. 
Cox and Radtke (2000) found that Big 5 CPAs feel more pressure from their firms than CPAs from smaller firms. Therefore, it may be that if Big 5 CPAs were included in the sample, the risks and rewards associated with the firm would have influenced advice aggressiveness. In addition, the mean fee for the engagement in our sample is $1,085. It may be that loss of such a relatively small fee does not represent much risk to the tax preparer.9 In summary, the regression results are generally consistent with the model proposed by Roberts. Client aggressiveness had the largest influence on the aggressiveness of the tax professionals’ advice. However, task characteristics, risks and rewards associated with the IRS, and other client characteristics were also significantly related to advice aggressiveness. The only surprising “non” result was that neither concern for client loss nor professional liability appeared to significantly influence the aggressiveness of the professionals’ advice.
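The ranking of influences by standardized coefficient can be illustrated with the usual conversion, standardized beta = b × sd(x) / sd(y). The sketch below is not the authors' computation. It assumes the factor scores have a standard deviation of about 1 (a common property of factor scores, but an assumption here) and takes the standard deviation of advice aggressiveness, 1.16, from Table 3.

```python
# Unstandardized estimates and reported standardized coefficients from Table 5.
SD_ADVICE = 1.16   # standard deviation of advice aggressiveness (Table 3)
SD_FACTOR = 1.0    # ASSUMPTION: factor scores have roughly unit variance

unstandardized = {
    "Client characteristics": 0.269,
    "Task characteristics": 0.282,
    "IRS risks and rewards": -0.220,
    "Firm risks and rewards": 0.149,
    "Client aggressiveness": 0.433,
}
reported = {
    "Client characteristics": 0.231,
    "Task characteristics": 0.250,
    "IRS risks and rewards": -0.189,
    "Firm risks and rewards": 0.126,
    "Client aggressiveness": 0.377,
}

def standardize(b, sd_x, sd_y):
    """Convert an unstandardized slope to a standardized coefficient."""
    return b * sd_x / sd_y

approx = {k: standardize(b, SD_FACTOR, SD_ADVICE) for k, b in unstandardized.items()}

# Rank factors by the absolute size of the approximate standardized coefficient.
ranking = sorted(approx, key=lambda k: abs(approx[k]), reverse=True)

for name in ranking:
    print(f"{name:26s} approx={approx[name]: .3f} reported={reported[name]: .3f}")
```

Under these assumptions the approximations land within about 0.01 of the reported values and reproduce the ordering described in the text, with client aggressiveness first. Small gaps are to be expected if the factor scores do not have exactly unit variance or if the regression used fewer observations than the descriptive statistics.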
Test of Fee Determinants
Regression Results
To investigate whether tax professionals receive a premium for providing aggressive advice, we asked respondents to estimate the fee they charged as well as the amount of time they spent on the engagement. As reported in Table 3, the mean fee was $1,085, and the average number of hours spent was 7.66 by the tax professional, and 2.32 by others in the firm. As was expected, time spent and fees were highly correlated.10 We regressed fee on time spent, advice aggressiveness, and four other variables that may be related to fee: gender (Ashton, 2000), years of experience, size of firm (measured as number of professionals in the office), and education. The regression results are reported in Table 6.
Table 6 reports a high model R2 of 0.89 (adjusted R2 of 0.88), primarily due to the inclusion in the regression of a measure of time spent. Advice aggressiveness was not significant (p-value = 0.879). However, firm size and gender were significantly related to fees. The coefficient for firm size had a t-statistic that was significant at the 0.063 level (two-sided test). The larger the firm, the higher were the fees. Gender was significant at the 0.014 level, and after time spent, was the most influential variable (based on the standardized regression coefficient). Males charged significantly more than females. Years of experience (p-value = 0.116) and education (p-value = 0.214) were not significantly related to the fee charged.

Table 6. Regression Results: Fee Charged for Engagement.
Independent Variables        Parameter Estimates   Standardized Coefficients   p-Value
Intercept                    −414.66                                           0.299
Time spent by self           116.99                0.862                       0.000
Time spent by others         46.76                 0.119                       0.014
Advice aggressiveness        9.15                  0.006                       0.879
Gender                       442.10                0.108                       0.014
Firm size                    23.48                 0.080                       0.063
Years of experience          −16.13                −0.071                      0.116
Education                    162.54                0.052                       0.214
Model statistics
  Model mean square          30,716,477
  F-statistic                84.62
  Model p-value              0.000
  Model R2                   0.890
  Model adjusted R2          0.880
Note: Dependent variable: fee charged.
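The parameter estimates in Table 6 can be read as a prediction equation for the fee. The sketch below applies them to a hypothetical engagement profile to show how the gender coefficient behaves as an indicator variable. Two points are assumptions: gender is taken to be coded 1 for male and 0 for female (consistent with the positive coefficient and the finding that males charged more), and the education code of 2 is purely illustrative because the coding of that variable is not given here.

```python
# Parameter estimates copied from Table 6; key names are hypothetical.
COEF = {
    "intercept": -414.66,
    "hours_self": 116.99,
    "hours_others": 46.76,
    "advice_aggressiveness": 9.15,
    "male": 442.10,            # ASSUMPTION: 1 = male, 0 = female
    "firm_size": 23.48,
    "years_experience": -16.13,
    "education": 162.54,
}

def predicted_fee(profile):
    """Linear prediction: intercept plus coefficient times value for each input."""
    return COEF["intercept"] + sum(COEF[name] * value for name, value in profile.items())

# Illustrative profile built from the Table 3 sample means; education=2 is a
# made-up code used only to hold that variable constant.
base = {
    "hours_self": 7.66,
    "hours_others": 2.32,
    "advice_aggressiveness": 4.14,
    "firm_size": 5.46,
    "years_experience": 19.75,
    "education": 2,
}

gap = predicted_fee(dict(base, male=1)) - predicted_fee(dict(base, male=0))
print(round(gap, 2))   # 442.1
```

Whatever values are chosen for the other inputs, the predicted male-female difference equals the gender coefficient, which is why that coefficient can be read directly as a dollar amount. By the same logic, a one-point increase in advice aggressiveness changes the predicted fee by only about $9.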
Examination of the VIFs indicates that multicollinearity is not affecting these results.
Discussion
As expected, fees were strongly related to the amount of time devoted to the engagement. However, we found no direct evidence that aggressiveness is priced. Aggressiveness could be priced indirectly, if aggressive advice leads to more time spent on the engagement. This could occur either because the tax professional “charges” for more time, or because they actually spend more time researching the issue in order to gain comfort with providing aggressive advice. A significant correlation between time spent and advice aggressiveness would be consistent with this notion. However, the Pearson correlation coefficient between time spent and advice aggressiveness is only 0.134, which is not significant (p-value = 0.202). Thus, we have no evidence, either direct or indirect, that tax professionals charge a premium for aggressive advice.
The most interesting result regarding the determinants of fees, however, is the significant gender effect. Since gender is an indicator variable, the coefficient on gender can be interpreted as a dollar amount. As shown in Table 6, even after controlling for time spent, firm size, experience and education, females, on average, charged $442 less than males.11 This result is consistent in magnitude with the findings in Ashton (2000), who reported that males charged a 57% premium compared with females. In a study focusing on the profitability of small CPA firms, Fasci and Valdez (1998) found that female-owned CPA firms were significantly less profitable than male-owned CPA firms, even after controlling for age of the business, education, experience, time spent on the business and motivation for owning the business. While we cannot explain why females appear to undervalue their work product, evidence continues to indicate that they do. 
Fasci and Valdez (1998) identify several “disadvantages” faced by women that may result in lower productivity including socialization practices, family roles and lack of networks or contacts. The extent to which any or all of these factors influence the amount of fees female tax professionals charge remains unknown. Future research is necessary to understand this result.
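The indirect-pricing check described in the Discussion rests on whether a correlation of 0.134 differs reliably from zero. The sketch below retraces that test using only the Python standard library. It assumes n = 93, the overall sample size reported under Limitations; the number of observations actually used for this correlation is not stated, so the computed p-value should only land near the reported 0.202 and need not match it exactly. The function names are illustrative and are not part of the study.

```python
import math


def t_stat_from_r(r, n):
    """t statistic for H0: rho = 0, given a sample Pearson r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log(1 + x * x / df))


def two_tailed_p(t, df, steps=20000):
    """Two-tailed p-value, integrating the t density over [0, |t|] with Simpson's rule."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3  # P(0 < T < |t|)
    return 1.0 - 2.0 * central_mass


if __name__ == "__main__":
    r = 0.134  # correlation reported in the Discussion
    n = 93     # ASSUMPTION: overall sample size; the n for this test is not stated
    t = t_stat_from_r(r, n)
    p = two_tailed_p(t, n - 2)
    print(f"t = {t:.3f}, df = {n - 2}, two-tailed p = {p:.3f}")
```

Under these assumptions the t statistic is about 1.29, well below the value of roughly 1.99 needed for two-tailed significance at the 0.05 level with 91 degrees of freedom, which is in line with the conclusion that time spent and advice aggressiveness are not significantly related.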
Limitations

The advantages of using the structured questionnaire approach, such as the collection of a variety of variables and the focus on actual tax planning engagements, do not come without cost. For example, asking CPAs to select the engagement on which to base their responses may have introduced bias (Gibbins et al., 2001),
although we did ask them to recall their last tax planning engagement. There is also the possibility that the subjects’ memories of the factors that influenced them are not correct, or that important influences were not included in the questionnaire. Use of factor analysis mitigates some of the potential multicollinearity concerns. Non-response bias is also a concern. An analysis of late respondents was performed to investigate this concern. The results were qualitatively similar to those reported here when only the late respondents were considered. However, since the number of observations was lower, the significance of the variables (particularly the task characteristics factor) was somewhat lower. In addition, this study is limited by the number and homogeneity of participants. While the results do not seem affected by a lack of power (with the possible exception of firm risks and rewards in the main regression), the sample size (93) is somewhat low for a regression with this number of variables. Removal of nonsignificant variables does not improve the significance levels of the other variables. Also, since the CPAs sampled were from one area of the country, generalization of the results to other geographical areas must be done with caution. Finally, our test of the relationship between advice aggressiveness and fees may have been underpowered given the possible lack of complexity (room to be aggressive) in the engagements in our sample.
SUMMARY AND FUTURE RESEARCH

This study had two specific objectives for extending prior research. The first objective was to provide a comprehensive test of the validity of Roberts’ (1998) tax professional judgment and decision-making model. A second objective was to determine whether tax professionals charge a premium for aggressive advice.

Regarding the first objective, we performed a two-step analysis. The first step consisted of a factor analysis of many of the inputs to tax professionals’ cognitive processing regarding a tax judgment and decision making task. The second step consisted of regression analysis with advice aggressiveness as the dependent variable, and the factor scores from the first step, along with four other individual-psychological factors as independent variables. The factor analysis results revealed that, consistent with Roberts’ (1998) model, the variables loaded nicely on factors that represented client concerns, IRS concerns, firm concerns and task characteristics with two exceptions. Tax dollars at stake, a variable that Roberts identified as a risk/reward associated with the client, loaded with task characteristics. Also, issue experience, which Roberts viewed as an individual-psychological factor, appeared to be related to how the tax professional viewed the task.
The regression analysis with advice aggressiveness as the dependent variable and the factor scores as independent variables, along with four individual-psychological factors, namely years of experience, education, gender and firm size, revealed that it was the client concerns (for example, client aggressiveness, client size), task characteristics (for example, law ambiguity), and risks and rewards associated with the IRS that influenced advice aggressiveness. Client aggressiveness appeared to be particularly influential. It loaded as a separate factor and had the largest standardized regression coefficient. The only factor that was not significant was the factor that represented risks and rewards associated with the tax professionals’ firm. The two variables that loaded on this factor were concern about professional liability and concern for client loss. We offer two possible reasons for this lack of effect. First, these concerns may be less of an issue with CPAs from small firms than they would be with Big 5 CPAs. This is consistent with the results of Cox and Radtke (2000). Therefore, additional research regarding these variables should be carried out with a population of Big 5 CPAs. Second, concerns about professional liability and concern for client loss may affect CPAs at other points in the judgment and decision-making process. For example, these concerns may cause them to spend more time performing their information search or analyzing alternatives. Roberts (1998) called for additional research in general on the cognitive processing by tax professionals. We echo this advice.

Regarding the second objective, we performed regression analysis with fees as the dependent variable, and time spent on the engagement, advice aggressiveness, firm size, gender, education and years of experience as explanatory variables. Time spent was highly significant and was, by far, the biggest determinant of fees.
Advice aggressiveness was not related to fees, or even to the amount of time spent. Therefore, we tentatively conclude that tax professionals, at least from small firms, do not charge a premium for aggressive advice. A possible explanation could be that tax professionals spread risk among all of their clients by incorporating a risk factor into their hourly billing rates.12 This particular explanation, as well as the general relationship between fees and advice, warrants additional research. The variables firm size and gender were significantly related to fees, with larger firms and males charging more. The gender result is of particular concern. Even after controlling for education, time spent, years of experience and firm size, it appears that males charge significantly more than females. This result is consistent with prior research results reported in Ashton (2000) and Fasci and Valdez (1998). None of these studies, including ours, provides an adequate explanation for why females appear to undervalue (relative to males) their work product. We urge future researchers to investigate the cause of this undervaluing so that prescriptive advice can be provided to female accounting professionals.
NOTES

1. This figure was taken from an earlier version of Roberts (1998).

2. Ashton’s results are inconclusive for at least two reasons. First, his results were significant at the 0.10 level. Second, he defined aggressive in terms of the direction and magnitude of error from a predetermined “correct” tax due. Thus, females may have been more aggressive, or may have just been wrong more often in a tax-decreasing direction.

3. Roberts’ individual-psychological category included both characteristics of the tax professional (e.g. experience, knowledge, professional status) and psychological attributes of the tax professional (e.g. advocacy, aggressiveness). In this study, we focus on tax professional characteristics as opposed to psychological attributes and thus from this point forward we will describe these variables as “individual” factors instead of individual-psychological factors.

4. All but six of the subjects reported that they worked in a firm that was “not Regional/National.” Of those six, three did not respond to the question, one was from a Big-5 firm, one from a regional firm and one from a national firm. When those six subjects were eliminated from the analyses, the reported results did not change.

5. The average firm focused on tax work has about four licensed professionals and the median firm has just one or two professionals (Russell, 2002).

6. The loading of tax dollars at stake with client characteristics was 0.321, compared to 0.628 with task characteristics.

7. When all of the individual factors (gender, firm size, years of experience and education) were removed, the only significant change in the results is that the factor representing firm risks and rewards became marginally significant (p = 0.07).

8. We also collected data regarding whether or not the tax professional had previous experience with a malpractice claim. Inclusion of this variable does not change the reported results.

9. We also considered that, while concern about professional liability and client loss were positively correlated with each other, they might have differing effects on advice aggressiveness. However, inspection of the correlation coefficients between each variable and advice aggressiveness indicated that neither was significantly correlated with the dependent variable and replacing the Factor 4 score from the model in Table 5 with the individual variables did not produce significant results for either variable. Finally, the effect of client loss concerns on advice aggressiveness might differ depending upon how aggressive the client was. In other words, concern for client loss should only increase advice aggressiveness when the client was also aggressive. Thus we added an interaction term between Factors 4 and 5 to the model in Table 5. This interaction was not significant. However, since the interaction effect does not necessarily relate to professional liability, we considered a client loss × client aggressiveness interaction separately by dichotomizing the two variables at the median. This analysis also did not produce a significant interaction effect between concern for client loss and client aggressiveness.

10. The Pearson correlation coefficient between fee and time spent is 0.841, and that between fee and time spent by others is 0.787.

11. We also collected the tax professional’s position in the firm (e.g. partner, manager, etc). Position was correlated with gender; however, when it was included in the regression it was not significant (and did not cause multicollinearity problems). It did, however, reduce the coefficient on gender to 365 (which was still significant).

12. Another possible explanation is that the client assumes the risk for aggressive advice, and thus the tax professional does not find it necessary to charge more for this type of advice. This may be the case if the tax professional adequately informs his/her client of the risks associated with taking an aggressive position and lets the client decide whether or not to take the position.
ACKNOWLEDGMENTS

We appreciate helpful comments from Dale Bandy, Peggy Dwyer, Andrew Judd, Lois Mahoney, Robin Roberts, participants at a University of Central Florida research workshop, and two anonymous reviewers. The first author is grateful to the PriceWaterhouseCoopers Foundation for financial assistance.
REFERENCES

Anderson, S., & Wolfe, J. (2001). Accountants’ liability: Where are claims coming from? The Ohio CPA Journal, 60(4), 21–24.
Ashton, R. H. (2000). Accuracy, agreement, and aggressiveness in tax reporting: Evidence from the money magazine contests. Advances in Taxation, 12, 1–21.
Ayers, F. L., Jackson, B. R., & Hite, P. A. (1989). The economic benefits of regulation: Evidence from professional tax preparers. The Accounting Review, 64(2), 78–87.
Bandy, D. (1996). Limiting tax practice liability. The CPA Journal, 66(5), 46–50.
Bandy, D., Betancourt, L., & Kelliher, C. (1994). An empirical study of the objectivity of CPAs’ tax work. Advances in Taxation, 6, 1–23.
Behn, B. K., Carcello, J. V., Hermanson, D. R., & Hermanson, R. H. (1999). Client satisfaction and big 6 audit fees. Contemporary Accounting Research, 16(4), 587–608.
Carnes, G. A., Harwood, G. B., & Sawyers, R. B. (1996). The determinants of tax professionals’ aggressiveness in ambiguous situations. Advances in Taxation, 8, 1–26.
Christensen, A. L. (1992). Evaluation of tax services: A client and preparer perspective. Journal of the American Taxation Association, 14(Fall), 60–87.
Cloyd, C. B. (1995). Prior knowledge, information search behaviors, and performance in a tax research task. The Journal of the American Taxation Association, 17(Suppl.), 82–107.
Cloyd, C. B., & Spilker, B. C. (1999). The influence of client preferences on tax professionals’ search for judicial precedents, subsequent judgments and recommendations. The Accounting Review, 74(3), 299–322.
Cox, S. R., & Radtke, R. R. (2000). The effects of multiple accountability pressures on tax return preparation decisions. Advances in Taxation, 12, 23–50.
Cuccia, A. D. (1994). The effects of increased sanctions on paid preparers: Integrating economic and psychological factors. The Journal of the American Taxation Association, 16(1), 41–66.
Cuccia, A., Hackenbrack, K., & Nelson, M. W. (1995). The ability of professional standards to mitigate aggressive reporting. The Accounting Review, 70(2), 227–248.
Department of the Treasury (1994). Treasury Department Circular 230. 31 CFR, subtitle A, sections 10.0–10.98 and 10.100–10.101. Washington, DC: Department of the Treasury.
Duncan, W. A., LaRue, D. W., & Reckers, P. M. J. (1989). An empirical examination of the influence of selected economic and noneconomic variables in decision making by tax professionals. Advances in Taxation, 2, 91–106.
Fasci, M. A., & Valdez, J. (1998). A performance contrast of male- and female-owned small accounting practices. Journal of Small Business Management, 36(3), 1–7.
Francis, J. R., & Simon, D. T. (1987). A test of audit pricing in the small-client segment of the U.S. audit market. The Accounting Review, 62(1), 145–157.
Frischmann, P. J., & Frees, E. W. (1999). Demand for services: Determinants of tax preparation fees. Journal of the American Taxation Association, 21(Suppl.), 1–23.
Gibbins, M., Salterio, S., & Webb, A. (2001). Evidence about auditor-client management negotiation concerning client’s financial reporting. Journal of Accounting Research, 39(3), 535–563.
Helleloid, R. T. (1989). Ambiguity and the evaluation of client documentation by tax professionals. The Journal of the American Taxation Association, 11, 22–36.
Hite, P. A. (1992). An examination of taxpayer preference for aggressive tax advice. National Tax Journal, 45(4), 389–403.
Kaplan, S., Reckers, P. M. J., West, S., & Boyd, J. (1988). An examination of tax reporting recommendations of professional tax preparers. Journal of Economic Psychology, 9(4), 427–443.
Klepper, S., & Nagin, D. (1989). The role of tax preparers in tax compliance. Policy Sciences, 22, 167–194.
LaRue, D., & Reckers, P. M. J. (1989). An empirical examination of the influence of selected factors on professional tax preparers’ decision process. Advances in Accounting, 7, 37–50.
McGill, G. A. (1990). The CPA’s aggressive position recommendation decision: Situational, attitudinal, and personality factors. Working Paper, University of Florida, Gainesville, Florida.
Newberry, K. J., Reckers, P. M. J., & Wyndelts, R. W. (1993). An examination of tax practitioner decisions: The role of preparer sanctions and framing effects associated with client condition. The Journal of Economic Psychology, 11(1), 119–146.
O’Keefe, T. B., King, R. D., & Gaver, K. M. (1994). Audit fees, industry specialization, and compliance with GAAS reporting standards. Auditing: A Journal of Practice and Theory (Fall), 41–54.
Pei, B. K. W., Reckers, P. M. J., & Wyndelts, R. W. (1992). Tax professionals belief revision: The effects of information presentation sequence, client preference, and domain experience. Decision Sciences, 23(1), 175–199.
Phillips, J., & Sansing, R. C. (1998). Contingent fees and tax compliance. The Accounting Review, 73(1), 1–18.
Reckers, P. M. J., Sanders, D. L., & Wyndelts, R. W. (1991). An empirical investigation of factors influencing tax practitioner compliance. The Journal of the American Taxation Association, 13(2), 30–46.
Roberts, M. L. (1998). Tax accountants’ judgment/decision-making research: A review and synthesis. The Journal of the American Taxation Association, 20(1), 78–121.
Roberts, M. L., & Cargile, B. R. (1994). Impartiality vs. advocacy: CPA’s responses to conflict in auditing and tax situations. Working Paper, University of Alabama, Tuscaloosa, Alabama.
Roberts, M. L., & Klersey, G. F. (1996). Effects of authoritative guidelines and experience on tax decision making. Working Paper, University of Alabama, Tuscaloosa, Alabama.
Russell, R. (2002). Independent practitioners make over half their earnings from tax preparation. Accounting Today, 16(Fall), 6–7.
Schisler, D. L. (1994). An experimental examination of factors affecting tax preparers’ aggressiveness – A prospect theory approach. The Journal of the American Taxation Association, 16(2), 124–142.
Schisler, D. L. (1995). Equity, aggressiveness, consensus: A comparison of taxpayers and tax preparers. Accounting Horizons, 9(4), 76–87.
Simunic, D. A. (1980). The pricing of audit services: Theory and evidence. Journal of Accounting Research, 18(Spring), 161–190.
Spilker, B. C., Worsham, R. G., & Prawitt, D. F. (1999). Tax professionals’ interpretations of ambiguity in compliance and planning-decision contexts. Journal of the American Taxation Association, 21(2), 75–89.
Wladis, R. (1995). Professional liability survey. Pennsylvania CPA Journal, 66(5), 30–33.
Yancey, W. F. (1996). Managing a tax practice to avoid malpractice claims: Learning from past disasters. The CPA Journal, 66(2), 12–18.
APPENDIX: TAX ADVISOR QUESTIONNAIRE AVERAGE RESPONSES

Please recall the most recent issue on which you provided tax planning advice to one of your clients. Please answer the following questions with that client and issue in mind. You will never be asked to reveal yourself or your client (circle a number to indicate your response to each question).

(1) How complicated was the tax issue for this specific client, relative to other tax issues which you have advised this or other clients?
Scale: 1 = Very Simple, 4 = Average, 7 = Very Complicated. Average: 4.84

(2) How clear was the authority regarding this tax issue, relative to other tax issues which you have advised this or other clients?
Scale: 1 = Very Clear, 4 = Average, 7 = Very Ambiguous. Average: 3.58

(3) How much experience do you have with this tax issue, relative to other tax issues which you have advised this or other clients?
Scale: 1 = Less than Average, 4 = Average, 7 = More than Average. Average: 4.83
(4) Was the dollar amount of tax savings at stake for this issue large or small, relative to other tax issues you have advised clients?
Scale: 1 = Very Small, 4 = Average, 7 = Very Large. Average: 4.6

(5) Was the advice given about this issue covered by an engagement letter?
Yes / No. Yes: 15.0%

(6) How large (e.g. total revenues, total assets, etc.) is this client, relative to your other clients?
Scale: 1 = Very Small, 4 = Average, 7 = Very Large. Average: 4.22

(7) How important is this client, relative to other clients (e.g. total billing, other accounting services provided, referrals, etc.)?
Scale: 1 = Not Important, 4 = Average, 7 = Very Important. Average: 4.22

(8) How financially strong (e.g. solvency, net worth, etc.) is this client, relative to other clients?
Scale: 1 = Very Weak, 4 = Average, 7 = Very Strong. Average: 5.28

(9) How would you rate your working relationship with this client, relative to other clients?
Scale: 1 = Well Below Average, 4 = Average, 7 = Well Above Average. Average: 5.58
(10) How aggressive, regarding tax issues, would you say this client is, relative to other clients? That is, does this client prefer aggressive or conservative advice?
Scale: 1 = Very Conservative, 4 = Average, 7 = Very Aggressive. Average: 4.49

(11) How aggressive was the advice that you gave the client on this specific issue?
Scale: 1 = Very Conservative, 4 = Average, 7 = Very Aggressive. Average: 4.14

(12) How certain are you that the advice you gave your client would hold up in court if challenged by the IRS?
Scale: 1 = Very Certain, 4 = Average, 7 = Very Uncertain. Average: 2.71

(13) If the planned transaction was made by the client, challenged by the IRS, and the client lost, how likely would it be that the client would bring suit against you or your firm?
Scale: 1 = Very Unlikely, 4 = Not Sure, 7 = Very Likely. Average: 2.56

(14) Below are five negative outcomes that can result from providing tax planning advice. These outcomes may constrain the tax professional when providing advice to a client. Please rate the extent to which these outcomes influenced the advice you provided your client on this issue. Please enter a number between 1 and 10 below (with 10 indicating that you were very concerned with the specific outcome and 1 indicating that you didn’t even consider the outcome when formulating your advice).
Outcome: Average
Taxpayer Penalties: 4.68
Professional Liability (i.e. client sues firm for failed tax advice): 4.31
IRS Audit of Client: 3.74
Preparer Penalties: 2.95
Loss of Client: 3.11

To help us categorize your responses, please answer some demographic questions.

(1) How many years experience do you have as a tax accountant? 19.75 years

(2) What position do you hold in your firm? (% responding)
Staff: 0%
Manager: 7.5%
Senior: 2%
Partner/Owner: 88%

(3) What is your highest degree achieved?
High School: 0%
Associate Degree: 6.5%
Bachelors Degree: 56%
Masters Degree: 35.5%
PhD: 1%

(4) What is your gender?
Male: 72%
Female: 28%

(5) What % of your chargeable time is spent providing tax advice that requires some amount of research? 19%

(6) Are you a(n):
CPA: 83.7%
Enrolled Agent: 8.7%
Attorney: 0%
Other: 7.6%

(7) What size CPA firm do you work for?
Big 5: 1.1%
National Firm: 1.1%
Regional Firm: 1.1%
Not Regional/National: 96.7%
(How many professionals in office? 5.46)

(8) Please estimate how many hours were spent on this issue for your client by:
Yourself: 7 2/3 hours
Others: 2 1/3 hours
(9) Please estimate the amount of the fee you charged this client which was allocated to advice on this issue (enter dollar amount)? $1,085

(10) In general, how big of a concern is professional liability to you in your tax planning activities?
Scale: 1 = Not a Concern at All, 4 = Somewhat of a Concern, 7 = Very Much of a Concern. Average: 4.6
BEHAVIORAL IMPLICATIONS OF ALTERNATIVE GOING CONCERN REPORTING FORMATS

Chantal Viger, Asokan Anandarajan, Anthony P. Curatola and Walid Ben-Amar

ABSTRACT

The generally accepted method of presentation with respect to going-concern reporting in a global context is to modify the auditor’s report with an explanatory paragraph in addition to having a separate note to the financial statements. In Canada, however, the auditor’s report is clean, and the going concern uncertainty is restricted to the endnotes. This research, conducted as a between-subjects experiment with Canadian students as subjects, examines unsophisticated investors’ responses to the signal conveyed by different auditor reporting formats (U.S. versus Canadian). The results indicate that the form of the auditor’s report does significantly influence subjects’ decisions to invest and their perception of risk.
Advances in Accounting Behavioral Research, Volume 7, 53–73. Copyright © 2004 by Elsevier Ltd. All rights of reproduction in any form reserved. ISSN: 1474-7979/doi:10.1016/S1474-7979(04)07003-6

INTRODUCTION

A number of studies have examined the information content of different financial statement formats. Some studies examined the influence of words vs. numbers and graphics (Frownfelter & Fulkerson, 1998; Stocks & Tuttle, 1998). While
the general findings of these studies indicated that numbers and graphics convey information more clearly than words, the reality is that the auditor’s report, which is a frequently analyzed form of report, is stated in words. Hence, the presentation of information by management of the company and the auditor’s attestation potentially influences the financial statement users. The importance of the auditor’s report may take on greater meaning when the client faces financial distress that could threaten its going concern status. This issue has been recognized and addressed in most industrialized countries by means of the auditor’s report, which is modified with an explanatory paragraph (also referred to as “emphasis of matter”) that details the going concern uncertainty. Much research, therefore, has focused on the informational content of the going concern report and whether this report significantly influences the decision-making behavior of lenders and investors. The preponderant view is that descriptions of the going concern contingency in the endnotes to the financial statements suffice to warn the reader of potential problems (e.g. Elias & Johnston, 2001; Libby, 1979; Pringle et al., 1990). Although the general consensus is that the explanatory paragraph in the auditor’s report should not significantly influence users’ behavior (Elias & Johnston, 2001), it is required in the United States as well as other industrialized nations. In Canada, however, Section 5150 of the Canadian Institute of Chartered Accountants (CICA) Handbook does not require the auditor to modify his/her auditor’s report if the going concern contingency is described as a note in the financial statements (CICA, 2003). As a result, the auditor’s report is clean and the financial statement user has to read the disclosure of financial distress in the financial statements with no acknowledgement of said issue by the auditors. 
Canada issued exposure drafts in 1995–1996 that sought to change this reporting position. Under these exposure drafts (CICA, 1995, 1996), the auditor’s report would remain unqualified with no reference to a going concern uncertainty; however, the going concern contingency would be highlighted on the face of the Balance Sheet and Income Statement (in addition to a separate note clearly labeled as going concern assumptions). The CICA assumed that “commonality” or redundancy in the form of repetition would sufficiently accentuate the signal to financial statement users. While these exposure drafts were rescinded in 1999, the Accounting Standards Board of the CICA is revisiting the need to “minimize the differences between Canadian and United States’ GAAP in reporting the going concern contingency” (CICA, 1999, p. 2). Even if the exposure drafts had been adopted, a difference would still have existed between the Canadian and United States’ audit report in the presence of going concern uncertainties. More specifically, the United States’ method of presentation for a going concern uncertainty requires auditors to modify the auditor’s report with an explanatory paragraph in addition to management’s
separate note to the financial statements.1 The explanatory paragraph describes the events that cast doubt about the entity’s ability to remain as a going concern. The auditor’s report is technically referred to as an unqualified modified report because the explanatory paragraph serves as a “red flag” to financial statement users. Other countries have adopted a similar view to going-concern reporting requirements. In fact, the International Federation of Accountants (IFAC) requires auditors to modify their report by adding “an emphasis of matter paragraph” that highlights the going concern problem (IFAC, 1999). Thus, the Auditing Standards Board (ASB) of the United States and the IFAC positions are in stark contrast to the CICA, which posits that a note in the financial statements alone is a sufficient warning to investors. The contribution of this research is twofold. First, the results have implications for standard setters in Canada because of doubts cast on the adequacy of the current reporting requirements for a going concern issue. As mentioned above, Canada differs from the United States and other Western nations in that the auditor’s report in the presence of going concern uncertainties is “clean” rather than modified. 
Since this topic is a work in process by the Canadian Auditing Standards Board, the ongoing discussion entails whether the Standards Board: (a) should maintain the present method (a method criticized by Boritz (1991) as being too passive and sending a “mute” signal); (b) adopt an in-between position, for example still keep the report clean, but highlight the going concern uncertainty on the face of the Balance Sheet and Income Statement while referencing the going concern contingency footnote (the criticism above could hold here too, though to a lesser degree); or (c) adopt the approach used by the United States and other Western nations (that is, modify the audit report with a fourth explanatory paragraph detailing the going concern uncertainty and referencing the appropriate footnote). While this topic has been temporarily “shelved,” any research in this area may provide some insight to the Standards Board in their eventual deliberations. Recently, Anandarajan et al. (2002) examined this issue with respect to Canadian loan officers, clearly a sophisticated financial statement user group. This research expands on their findings by considering a less sophisticated financial statement user group. This study also seeks to contribute to the extant literature on the impact presentation formats have on individuals’ judgment. We examine whether various methods of presentation differentially affect the extent to which non-professional investors incorporate going concern information in investment decision judgments. Another reason for selecting non-professional investors is that regulators such as the Securities and Exchange Commission in the U.S. (e.g. Levitt, 1997) have
expressed an interest in understanding how financial reporting standards affect this investor group. Maines and McDaniel (2000) note that non-professional investors, due to their relatively limited understanding of financial information, would be more influenced by presentation formats than professional analysts. Further, Maines and McDaniel (2000) and Hunton and McEwen (1997), both of whom used students as surrogates, state that in comparison to analysts, non-professional investors: (a) generally have ill-defined valuation models; (b) fail to identify specific data needed for financial analysis; and (c) assimilate information in a relatively unstructured manner. They also note that non-professional investors read the financial statements in the order presented, suggesting that they have few preconceived ideas of the importance of and/or relations among various financial statement items. Given this sequential information processing, non-professional investors are likely to consider all information regardless of its location.
THEORY DEVELOPMENT AND HYPOTHESES FORMULATION

In the accounting literature, the organization of information has been shown to affect auditors’ going concern judgments (Ricchiute, 1992). While some studies warn that increasing the number of cues often “overloads” decision makers, leading to judgments of lower quality (Chewning & Harrell, 1990; Iselin, 1993), most researchers posit that if there is no information overload, then repetition and/or commonality influences judgment. Tversky and Kahneman (1973) indicate that availability of information, and, by extension, multiple redundancies result in greater understanding of the message conveyed. Slovic and MacPhillamy (1974) state that decision makers place greater weight on common measures and, unconsciously, on measures that are repeated. Slovic and MacPhillamy demonstrate that when two alternatives have a common attribute, along with unique attributes, the common attribute is weighted more. Payne et al. (1993) theorize that people choose simplifying strategies when making decisions. They note that reliance on common attributes or an attribute that is repeated is one such simplifying strategy; and more importantly, this form of decision making is not deliberate, but done subconsciously. These findings, especially the conclusions from cognitive and judgment research, indicate that multiple reinforcements of the going concern contingency would accentuate the signal and influence decision-making. Further, the placement of particular items within the financial statements has also been shown to affect users’ judgments (though not in the context of going concern reporting). Hopkins (1996), for example, indicates that the location or placement of securities had an effect on financial analysts’ stock price judgments.
Hirst and Hopkins (1998) show that presenting comprehensive income in the Income Statement affects financial analysts’ stock price judgments differently than presenting the information in the Statement of Changes in Equity. The conclusion is that specific placement of particular pieces of information affected judgment and use of the information by financial statement users. In this context, an upfront reporting of the going concern uncertainty in the form of an explanatory (emphasis of matter) paragraph should accentuate the signal relative to a more subtle form of reporting, such as merely referencing on the face of the Income Statement and Balance Sheet and not reporting upfront on the auditor’s report. In this section we present a framework for evaluating how different formats for presenting going concern information affect investors’ judgments. This framework proposes that greater degrees of redundancy with reference to the presentation of going concern information may serve to focus readers’ attention more clearly on the matters raised. Based on the theory generated by the research cited above, we conclude that financial statement information incorporating varying levels of redundancy can influence investment decisions. Higher levels of redundancy accentuate the signal conveyed by the message. Hirst and Hopkins (1998) note that presentation format may affect analysts’ judgments partly because of the failure to sufficiently record information in memory. This shortcoming can be rectified by redundancy in information presentation. Similarly, Lipe and Salterio (2000) provide evidence that incorporating reinforcement and providing direct links between information may help decision makers mentally “chunk” these items and thus increase the emphasis on these items in forming judgments. In this research the “items” represent the going concern information. 
The “direct links” are the referencing of the going concern uncertainty in the explanatory (emphasis of matter) paragraph of the modified auditor’s report (United States format) and the referencing of the going concern uncertainty on the face of the Balance Sheet and Income Statement (proposed Canadian exposure draft). In Fig. 1, information acquisition is interpreted using the definition of Maines and McDaniel (2000, p. 183) as “an investor reading a specific financial statement item and storing the item in memory sufficiently well to recall where it appeared in the financial statements.” In the case of the control group, there is no information acquisition since the going concern uncertainty is neither discussed in a standalone note in the financial statements (the going concern uncertainty is disclosed like any other contingency) nor discussed in the explanatory paragraph. In the case of the format proposed by the Canadian exposure draft, there is information acquisition (as the going concern uncertainty is highlighted as a stand-alone note in the financial statements and referenced on the face of the Balance Sheet and Income Statement); and, as a result, an evaluation and a weighting given to this
Fig. 1. Framework for Examining Effects of Different Formats of Going Concern Information on Subjects’ Risk Assessment.
realization when evaluating overall investment risk. Similarly, as shown in the figure, the United States format should also result in information acquisition and evaluation (as the going concern uncertainty is discussed in a stand-alone note in the financial statements and discussed in an explanatory paragraph in the auditor’s report). This paper postulates that information evaluation and hence “weighting” will be greater in the presence of a modified report with an explanatory (emphasis of matter) paragraph upfront detailing and referencing a going concern uncertainty relative to the formatting in the proposed Canadian exposure drafts. Similarly, “weighting” of the going concern contingency would be greater for the proposed exposure-draft format (unqualified report with no upfront reference but going concern contingency highlighted on the face of the Balance Sheet and Income Statement referencing a stand-alone footnote) relative to a situation where there is no signal whatsoever as to a contingency. Based on the above, the following hypothesis (stated in the alternative form) is tested:
H1. Investors’ decisions to invest will be significantly lower when the reference to the contingency is in the form of a modified report relative to an unmodified report in the presence of going concern uncertainties, given full disclosure in the notes to the financial statements.
The literature reveals that investors’ perceptions can be measured using certain criteria (Bertholdt, 1979; Gul, 1987; LaSalle & Anandarajan, 1997). These criteria are discussed below. Gul (1987), for example, notes that increased levels of disclosure about an uncertainty, including a going concern uncertainty, may be expected to increase the estimated effect of the uncertainty on the results and position disclosed in the financial statements and hence on the variance of expected cash flows. Thus, more disclosure may result in a greater perception of risk.
Based on this, the second hypothesis is proposed and stated as follows:
H2. Investors’ perceptions of risk will be significantly higher when the reference to the contingency is in the form of a modified report relative to an unmodified report in the presence of going concern uncertainties, given full disclosure in the notes to the financial statements.
Anandarajan et al. (2002) and LaSalle and Anandarajan (1997) found that the method of disclosure of the going concern uncertainty in the auditor’s report impacts users’ (loan officers in both studies) perceptions of the likelihood (or lack thereof) of a company improving its profitability. Based on this research, the third hypothesis is proposed and stated as follows:
H3. Investors’ perceptions that the company can improve its profitability will be significantly lower when the reference to the contingency is in the form of a modified report relative to an unmodified report in the presence of going concern uncertainties, given full disclosure in the notes to the financial statements.
Libby (1979) found that uncertainty qualifications cause users to search for additional information in order to estimate the effects of the uncertainty. Bertholdt (1979) and Gul (1987) suggested that there might be an increase in information search by financial statement users with increasing levels of disclosure of the uncertainty. The theory is that redundant information in the form of additional disclosure (e.g. explanatory paragraph in addition to a note in the financial statements) may exacerbate the perception of risk. This perception, in turn, may give users a greater need to assess the financial impact of that uncertainty on the company. As a result, the fourth hypothesis is proposed and stated as follows:
H4. Investors’ need for additional information will be significantly greater when the reference to the contingency is in the form of a modified report relative to an unmodified report in the presence of going concern uncertainties, given full disclosure in the notes to the financial statements.
Bamber and Stratton (1997) note that an explanatory paragraph on the audit report detailing the uncertainty may have the consequence of focusing a reader’s attention on financial statement elements particularly important for their task. The implication is that the explanatory paragraph will highlight and magnify the impact of any notes the auditor chooses to emphasize. If so, investors should rate a note in the financial statements that is highlighted in the auditor’s report as more important to their investment decision relative to a scenario where the note is not highlighted.
Consequently, the following two hypotheses are examined:
H5. Investors will weight the financial statement disclosure in the form of the going concern assumption footnote higher when it is referred to in the modified report format in their determination of factors entering into the investment decision.
H6. Investors will weight the modified audit report higher than the standard report in their determination of factors entering into the investment decision.
RESEARCH METHODOLOGY
Research Design
The research design was a 3 × 3 × 2 between-subject design, displayed in Fig. 2. This design was selected because subjects in a within-subjects design would have
Fig. 2. Experimental Design.
insight into the variable being manipulated. Such insight may have sensitized the subjects with respect to their responses to the questions in the instrument. Subjects were randomly assigned to one of two experimental groups or to a control group. Each participant received a case scenario relating to a fictitious company that presented a set of financial statements including a two-year corporate Balance Sheet, Statements of Earnings and Retained Earnings, and Statement of Changes in Financial Position for three years. In addition, notes to the financial statements were provided for the current year only. Participants assigned to Group 1 were provided going concern information based on current United States reporting standards, namely a separate going concern uncertainty note in the financial statements and a modified auditor’s report with a fourth explanatory paragraph detailing the uncertainty. Participants assigned to Group 2 (as shown in Fig. 2) were provided going concern information based on the proposed (now rescinded) Canadian exposure draft, where the going concern uncertainty was separately highlighted in a stand-alone note combined with referencing on the face of the
Table 1. Differences Between Experimental Groups (1 and 2) and Control Group (3).

Decision to invest (Investing in the Company)
Group                                Yes           No            Statistic (p-Value)
1. Experimental (groups 1 and 2)     28 (48.3%)    30 (51.7%)    χ² = 18.993 (0.000)
2. Control (group 3)                 27 (96.4%)    1 (3.6%)

Group                                Mean    Standard Deviation    Statistic (p-Value)
Perception of investment risk
1. Experimental (groups 1 and 2)     3.59    0.80                  t = 6.951 (0.000)
2. Control (group 3)                 2.39    0.63
Perception of likelihood that the company can improve its profitability
1. Experimental (groups 1 and 2)     3.24    0.90                  t = −2.917 (0.005)
2. Control (group 3)                 3.82    0.77
Perception of need to search for additional information (a)
1. Experimental (groups 1 and 2)     3.18    1.31                  t = 79.395 (0.002)
2. Control (group 3)                 2.43    0.79

(a) The endpoints of the five point scales are as follows:
(1) Investment risk: Low risk to High risk
(2) Likelihood that company improve its profitability: Very unlikely to Very likely
(3) Search for additional information: Very unlikely to Very likely
Balance Sheet, Statement of Earnings and Retained Earnings, and an unqualified auditor’s report. The comparisons between groups 1 and 2 were of primary interest to test our hypotheses; however, it was essential to ensure that the numbers in the financial statements did not drive the study results. Consequently, a control group (Group 3) was included in the research design. Subjects assigned to the control group were provided with identical information to that received by subjects of the experimental groups except that the going concern uncertainty was not disclosed separately from other contingencies and no reference was made to the going concern contingency in the financial statements and auditor’s report. A comparison was made between the decisions and perceptions of the two experimental groups (Groups 1 and 2) with those of the control group (Group 3). As shown in Table 1, there were significant differences between the control group and the experimental groups on whether to invest in the company (p-value = 0.0001). While 96% of the respondents in the control group indicated that they would invest in the company, only 48% of the respondents in the experimental groups indicated the same. Similarly, the differences between the control and experimental groups were all significant with respect to their perception of investment risk (p = 0.0000), perception of likelihood that the company can improve profitability (p < 0.005), and perception of the need to search for additional information (p < 0.002). These variations provide preliminary evidence that results are attributable to the reference to the going concern uncertainty rather than the numbers in the financial statements.
Sample
The subjects selected for this research were master’s level students at the University of Quebec at Montreal.
These students were considered appropriate surrogates for non-professional investors because they have limited knowledge of accounting rules (Pringle et al., 1990); and, as such, they are less likely to be familiar with foreign country reporting rules. In contrast, other potential subjects such as Chartered Accountants (CA), CPAs or financial analysts might have been sensitized to changes in the standard auditor’s report; as a result, their decisions may be biased by their expectations of a particular audit report format. In addition, Walters-York and Curatola (1998, 2000) have concluded that the use of experienced students, as in this study, can provide meaningful results. Walters-York and Curatola (2000) note that research relying on student subjects is likely no less valid than research relying on non-student subject groups; student samples provide no greater threat to external validity than typical real-world samples. The customary real-world sample can be placed under the same scrutiny for lack of formal representativeness and atypicality as the customary student sample (p. 258).
The students who participated in this study were resuming their course work requirements to be eligible to sit for the National Chartered Accountants examination while working full-time. The average age of these students was 24.3 years, and they had completed an average of 8.33 accounting courses and almost two auditing courses in their university studies. A total of 86 students participated in the study and were randomly assigned to the three groups (29 in Group 1, 29 in Group 2, and 28 in Group 3). Statistical tests were conducted on the personal characteristics of the students in the three groups; the groups were not statistically different in terms of age, number of courses in management and financial accounting, or number of auditing courses.
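With the group sizes given above (58 subjects in the two experimental groups combined, 28 in the control group), the t statistics in Table 1 can be approximately re-derived from the printed means and standard deviations. The sketch below is illustrative only: the paper does not say which t-test variant was used, so both the pooled-variance and the Welch (unequal-variance) forms are assumptions, and the function names are hypothetical.

```python
import math

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Student's t for two independent groups, assuming equal variances."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (m1 - m2) / se

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's t and its approximate degrees of freedom (unequal variances)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Group sizes: Groups 1 and 2 combined versus Group 3 (from the Sample section).
N_EXP, N_CTRL = 58, 28

# Means and standard deviations as printed in Table 1.
risk_t = pooled_t(3.59, 0.80, N_EXP, 2.39, 0.63, N_CTRL)
profit_t = pooled_t(3.24, 0.90, N_EXP, 3.82, 0.77, N_CTRL)
search_t, search_df = welch_t(3.18, 1.31, N_EXP, 2.43, 0.79, N_CTRL)

print(f"investment risk:        t = {risk_t:.3f}")
print(f"improve profitability:  t = {profit_t:.3f}")
print(f"search for information: t = {search_t:.3f}, Welch df = {search_df:.1f}")
```

Because the inputs are rounded to two decimals, the recomputed values only approximate the published ones. Under these assumptions the first two statistics land within a few hundredths of the printed 6.951 and −2.917. For the third row the recomputed t is about 3.3 and the Welch degrees of freedom are just under 80, so the printed 79.395 may be a degrees-of-freedom value rather than a t statistic; that reading is a conjecture, not something the paper states.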
Task
The experimental instrument provided to the students was developed from research instruments previously used (Bamber & Stratton, 1997; LaSalle & Anandarajan, 1997). It consisted of a covering letter, descriptive information about a hypothetical company, auditor’s report, Balance Sheet (2 years), Statements of Earnings and Retained Earnings (3 years), Statements of Changes in Financial Position (3 years), and notes to financial statements (including the note highlighting the going concern problem for the current year only). All subjects received the above information. The difference among the three groups was the auditor’s report and the manner by which the going concern uncertainty was disclosed (please refer to Fig. 2). Specifically, participants within each experimental group received only one type of auditor’s going concern report, namely, the report with the explanatory paragraph (United States format) for participants assigned to experimental group 1 or the standard report (Canadian format) for participants assigned to experimental group 2 and the control group. For participants in the control group, the going concern uncertainty was not disclosed separately from other contingencies (and no reference was made to the going concern contingency in the financial statements); it was integrated with the other footnotes and not highlighted. No reference was made to the going concern uncertainty in the auditor’s report.
Response Instrument
The response portion of the experimental instrument was effectively broken into three sections and is shown in the appendix. The first question in the instrument
sought to obtain background information about the respondents. Question 2 gauged the strength of the respondents’ interest in investing in the company. Questions 3, 4, and 6 aimed to examine respondents’ perceptions of risk.² Subjects were asked to circle an answer on a Likert scale ranging from 1 to 5. The scales were set up so that a high score for questions 3 and 6 indicated a high perception of risk and a low score, a low perception of risk. In contrast, question 4 was set up so that a low score indicated a high perception of risk, and a high score, a low perception of risk. Finally, question 5 (assigned to those in the experimental groups only) requested the subjects to assign 100 points across the different financial statement items according to the items’ relative importance to the investment decision. Question 5 was only assigned to the experimental groups because only those groups received the going concern contingency manipulation. Subjects were requested to assign points based on the perceived importance of each item in the decision making process. The items of information were presented in the same order for participants of groups 1 and 2. Once the experiment was complete, appropriate manipulation checks were conducted to ensure that the subjects fully understood the meaning of the questions that were asked.³
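The response format described above (five-point scales for questions 3, 4, and 6, and a 100-point allocation for question 5) lends itself to simple consistency checks before analysis. The following is a minimal sketch of such checks; the function names, the dictionary layout, and the sample response are hypothetical and are not taken from the study’s instrument.

```python
from typing import Dict

LIKERT_MIN, LIKERT_MAX = 1, 5   # five-point scales used for questions 3, 4, and 6
TOTAL_POINTS = 100              # points to be distributed in question 5

def check_likert(answers: Dict[str, int]) -> None:
    """Raise ValueError if any scale answer falls outside 1..5."""
    for question, score in answers.items():
        if not LIKERT_MIN <= score <= LIKERT_MAX:
            raise ValueError(f"{question}: score {score} is outside the 1-5 scale")

def check_allocation(points: Dict[str, float]) -> None:
    """Raise ValueError unless the weights are non-negative and sum to 100."""
    if any(p < 0 for p in points.values()):
        raise ValueError("allocated points must be non-negative")
    total = sum(points.values())
    if abs(total - TOTAL_POINTS) > 1e-9:
        raise ValueError(f"allocated points sum to {total}, expected {TOTAL_POINTS}")

# Hypothetical response from one subject in an experimental group.
likert_answers = {"Q3": 4, "Q4": 2, "Q6": 3}
allocation = {
    "Balance sheet": 20,
    "Footnote 1A (Going concern assumption)": 25,
    "Footnote 9 (Contingencies)": 30,
    "Statement of earnings and retained earnings": 25,
}

check_likert(likert_answers)
check_allocation(allocation)
print("response passes both checks")
```

Note that question 4 is scored in the opposite direction from questions 3 and 6, so any pooling of the three items would need to account for that; the checks here only validate ranges and totals.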
RESULTS
The first hypothesis (H1) related to the investment decision. A Chi-square test was performed to assess whether the subjects’ willingness to invest was affected by the difference in presentation in the auditor’s report. The results, as shown in Table 2, indicate that the difference between the two experimental groups is statistically significant (χ² = 9.943; p-value = 0.002). More specifically, only eight of the 29 subjects who received the unqualified modified report were willing to invest (28.6% of all subjects willing to invest), while twenty of the 29 subjects who received the standard report were willing to invest (71.4%). Overall, the results of H1 suggest that the explanatory
Table 2. Cross-Tabulation of Audit Report and Investment Decision.

                       Experimental Groups
Investment Decision    Group 1 Actual U.S.    Group 2 Proposed Cd    Total
Yes                    8 (28.6%)              20 (71.4%)             28
No                     21 (70%)               9 (30%)                30
Total                  29                     29                     58

Note: χ² = 9.943, p-value = 0.002.
Table 3. Tests of Differences in Perceptions of Investors.

Group                  Mean    Standard Deviation    z-Statistic (p-Value) One Tail Test
Investment risk
  Standard             3.31    0.71                  −2.64 (0.004)
  Modified             3.86    0.79
Likelihood that the company improve its profitability
  Standard             3.52    0.78                  −2.15 (0.015)
  Modified             2.97    0.94
Search for additional information
  Standard             2.93    1.36                  −1.37 (0.084)
  Modified             3.41    1.24

Note: The endpoints of the five point scales are as follows:
Investment risk: Low risk to High risk
Likelihood that company improve its profitability: Very unlikely to Very likely
Search for additional information: Very unlikely to Very likely
paragraph included in the unqualified modified report (Group 1) did impact the investment decision. Table 3 displays the distributions of the mean response and standard deviation of the responses to questions 3, 4, and 6. The results from the Mann-Whitney test indicate that the difference in presentation of the auditors’ report significantly influenced the investors’ assessment of the investment risk (H2 ) and the likelihood that the company can improve its profitability (H3 ) (p-value = 0.004 and 0.015, respectively). The difference in presentation format (H4 ), however, only marginally influenced the search for additional information about the going concern uncertainty (p-value = 0.084). In summary these results indicate that, relative to the type of standard auditor’s report issued in Canada, the inclusion of the explanatory paragraph in the auditor’s report detailing the going concern uncertainty significantly influences the perception of risk associated with the company and significantly decreases the perception of the likelihood that the company can improve its profitability. Table 4 reveals the relative importance of the different items in the financial statements to the investment decision for groups 1 and 2. On average (as given in column 3 of Table 4), subjects weighted the contingencies footnote (footnote 9) most heavily as impacting their investment decision (with a weight of 17.56). Next, the Statement of Earnings and Retained Earnings, the Balance Sheet and the Statement of Changes in Financial Position were also considered important in the investment decision with weights of 15.58, 13.87 and 13.50, respectively. Finally the going concern assumption footnote (footnote 1A) was also considered to be important with a weight of 9.60.
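The chi-square statistics reported for the investment decision (Table 2, and the control comparison in Table 1) can be checked directly from the printed cell counts. The sketch below computes Pearson’s chi-square without a continuity correction; that choice is an assumption, since the paper does not say whether a correction was applied, and the helper names are hypothetical. For a 2 × 2 table (one degree of freedom) the p-value follows from the complementary error function.

```python
import math

def pearson_chi_square(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Table 2: rows = decision (Yes, No), columns = (Group 1, Group 2).
TABLE_2 = [[8, 20],
           [21, 9]]

# Table 1: rows = (experimental groups, control group), columns = (Yes, No).
TABLE_1 = [[28, 30],
           [27, 1]]

for name, table in (("Table 2", TABLE_2), ("Table 1", TABLE_1)):
    chi2 = pearson_chi_square(table)
    print(f"{name}: chi-square = {chi2:.3f}, p = {p_value_df1(chi2):.4f}")
```

Run on the printed counts, the uncorrected statistic reproduces the reported 9.943 and 18.993 to three decimals, which is consistent with (though it does not prove) the authors having used the uncorrected form.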
Table 4. Descriptive Statistics: Mean (Standard Deviation) of Decision Weights by Report Type.

                                                            Auditor Report
Item                                                        Standard         Modified         Overall Mean (Std. Dev.)
Balance sheet                                               16.30 (12.15)    11.60 (10.36)    13.87 (11.40)
Footnote 1A (Going concern assumption)                      4.38 (6.71)      14.44 (14.03)    9.60 (12.13)
Footnote 1B (Summary of significant accounting policies)    0.15 (0.46)      1.80 (3.51)      1.01 (2.66)
Footnote 2 (Restructuring charge)                           2.92 (3.36)      4.76 (6.95)      3.87 (5.55)
Footnote 3 (Income taxes)                                   1.15 (2.46)      1.62 (2.68)      1.39 (2.56)
Footnote 4 (Accounts receivable)                            4.61 (4.42)      4.55 (5.84)      4.58 (5.16)
Footnote 5 (Inventories)                                    5.38 (8.11)      2.50 (3.57)      3.88 (6.30)
Footnote 6 (Capital assets)                                 2.88 (4.05)      2.32 (3.77)      2.59 (3.88)
Footnote 7 (Accounts payable)                               2.88 (3.70)      3.12 (4.12)      3.05 (3.89)
Footnote 8 (Long term debt)                                 6.30 (6.41)      4.50 (5.44)      5.37 (5.94)
Footnote 9 (Contingencies)                                  15.11 (15.31)    19.83 (15.73)    17.56 (15.57)
Audit report                                                2.96 (6.23)      3.67 (4.72)      3.33 (5.46)
Statement of earnings and retained earnings                 17.80 (10.85)    14.03 (12.92)    15.58 (12.01)
Statement of changes in financial position                  16.53 (8.98)     10.67 (8.10)     13.50 (8.05)
With respect to the auditor report, subjects gave different weights to the financial statements’ items depending on the audit report assigned. Those subjects who received a standard report rated the Statement of Earnings and Retained Earnings (17.80), the Statement of Changes in Financial Position (16.53), and the Balance Sheet (16.30) as the three most important items to their investment decision. By contrast, those subjects who received a modified report ranked the contingencies footnote (19.83), the going concern assumption note (14.44), and the Statement of Earnings and Retained Earnings (14.03) as the three most important items to their decision. One might suspect that the explanatory paragraph of the modified report directed the subjects’ attention to the information disclosed in the two notes related to going concern uncertainties. H5 and H6 examined whether the uncertainty modification affected the weight investors attached to the going concern assumption footnote (H5) and the audit report (H6) in their investment decision. The results of the Mann-Whitney test, as given in Table 5, indicate that the type of report has a significant effect on the weight given to footnote 1A entitled Going concern assumption (z = −3.48 and p-value = 0.000), providing support for H5. The difference due to the type of audit report, however, provided only marginally significant support for H6 (2.96 vs. 3.67; z = −1.41; p-value = 0.078). These results suggest that the uncertainty
Table 5. Results Related to the Nature of the Effect of the Uncertainty Modification.

Group                  Mean     Standard Deviation    z-Statistic (p-Value) One Tail Test
Footnote 1A (Going concern assumption) weight
  Standard             4.38     6.71                  −3.48 (0.000)
  Modified             14.44    14.03
Audit report weight
  Standard             2.96     6.23                  −1.41 (0.078)
  Modified             3.67     4.72
modification primarily operates to direct investors’ attention to the going concern assumption footnote.
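Tables 3 and 5 report z-statistics from Mann-Whitney tests. Those values cannot be recomputed from the published means and standard deviations, because the test works on ranks of the individual responses. As a purely illustrative sketch of how such a z is obtained, the code below applies the normal approximation with a tie correction (ties are common on five-point scales) to made-up scores. The data, the function names, and the omission of a continuity correction are all assumptions, not details taken from the study.

```python
import math
from collections import Counter

def midranks(values):
    """Map each distinct value to its average rank; tied values share a mid-rank."""
    counts = Counter(values)
    ranks, start = {}, 1
    for value in sorted(counts):
        c = counts[value]
        ranks[value] = start + (c - 1) / 2  # mean of ranks start .. start + c - 1
        start += c
    return ranks, counts

def mann_whitney_z(x, y):
    """z for the Mann-Whitney U of sample x (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, counts = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    tie_term = sum(c**3 - c for c in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        raise ValueError("all observations are identical; z is undefined")
    return (u_x - mean_u) / math.sqrt(var_u)

# Hypothetical five-point risk ratings (NOT the study's data).
standard_scores = [3, 3, 4, 2, 3, 4]
modified_scores = [4, 5, 4, 3, 4, 5]

z = mann_whitney_z(standard_scores, modified_scores)
print(f"z = {z:.3f}")  # negative when the first sample tends to rank lower
```

The sign of z depends only on which group is passed first, so a negative value here simply means the first sample tends to have the lower ranks. A one-tailed p-value would follow from the standard normal distribution.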
DISCUSSIONS, CONCLUSIONS, AND IMPLICATIONS
The auditor’s report for Canadian companies currently is unqualified in the presence of going concern uncertainties as long as the auditor is satisfied with financial statement disclosure. The only requirement is that the going concern contingency be described in a note integrated with the other notes to the financial statements. The present method has been criticized as being passive and the signal mute (Boritz, 1991). In an attempt to adopt a more positive stance and accentuate the signal about the going concern contingency, the CICA considered and then withdrew two exposure drafts dealing with going concern uncertainties. The recently rescinded accounting standards proposed a separate stand-alone going concern note combined with reference to the going concern note on the face of the Balance Sheet and Income Statement. But the proposed auditing standard did not require the auditor’s report to be modified. Although the proposed standards have been withdrawn, this issue remains under consideration by the CICA because one of the principal objectives adopted by the AcSB is “to work toward the elimination of significant differences in accounting standards internationally” (CICA, 2000, p. 1). The findings of this study provide evidence that the provision of the explanatory (emphasis of matter) paragraph in the presence of going concern uncertainties influenced the non-professional investor subjects’ decision to invest and their perceptions of the riskiness of the company. These results are consistent with judgment and decision making theory, which holds that multiple reinforcements accentuate the signal, and they contradict the argument that the audit
report is redundant. Although the notes to the financial statements already disclose the same information as in the fourth paragraph of the auditor’s report, it appears that they do not provide the same level of warning to the financial statements’ users. Hence, as part of the CICA’s reassessment of the current going concern reporting and auditing standards in Canada, serious consideration should be given to require auditors to modify the audit report when facing going concern uncertainties. Such reporting, if adopted by the CICA, would also provide closer harmonization of Canadian standards with those of other countries. This study was not without limitations. First, the experimental design did not consider all the costs and benefits associated with real investors’ decision making. Second, the information provided to participants was less than the amount of information usually available to investors. Third, one question asked respondents about risk. Although the research attempted to measure “investment” risk, it is possible that the respondents could have interpreted it as “business” risk. While the extent of this misinterpretation is unknown, it could have impacted the results of the study. Finally, the wording of our cover letter, due to our ethical obligations, may have alerted the readers to the subject matter of this study. The general finding in this study is that the modification of the auditor report with a fourth explanatory paragraph appears, in the Canadian context, to cause investors to focus more on the going concern contingency and therefore sends a stronger signal. This finding is relevant since the issue of the auditor’s role and responsibility in communicating information on uncertainties (including the going concern status of client) is still an open debate in Canada. In a recently published paper, Anandarajan et al. 
(2002) examined loan officers’ reaction to the format based on the current Canadian standard, proposed exposure draft, and the United States standard. They concluded that bankers did not perceive a difference between the current Canadian standard and proposed Canadian exposure draft; but they did perceive a difference between the Canadian reporting formats and the format adopted by the United States and other countries. One limitation of that study was the selection of only one sophisticated financial statement user group (bank loan officers). This study extends those results by looking at the reaction of a non-professional investors group and found that the non-professional investors focused more on the going concern contingency under the United States format. In conclusion, this study makes two important contributions. First, investors are more likely to invest in a company experiencing a going concern problem when it is reported under the Canadian format than under the United States format. These results are even more pronounced than those found in Anandarajan et al. (2002) for loan officers, which suggested that the information may be misleading. From a practical viewpoint, the findings add to the debate on whether Canadian standard setters should change Canadian reporting in the presence of going
concern uncertainties to converge with the format adopted by the United States and other Western countries. Second, from an academic viewpoint, this study contributes to the literature on the incremental information provided by alternate forms of going concern audit reports. Overall, the preponderant view is that the modification or explanatory paragraph in the auditor’s report does not have incremental information content for the financial statement reader (Abdel-Khalik et al., 1986; Elias & Johnston, 2001; Houghton, 1983; LaSalle & Anandarajan, 1997; Libby, 1979; among others). In this study, we find that, in the Canadian context, the format of going concern presentation, especially when information is reinforced by repetition, has incremental information content for a reader. In addition, these results add to and corroborate judgment and decision making theory, which holds that multiple reinforcements accentuate the signal provided that there is no information overload.
NOTES
1. The two relevant auditing standards in the U.S. are Statement on Auditing Standards (SAS) No. 58 titled Reports on Audited Financial Statements and SAS No. 59 entitled The Auditor’s Consideration of an Entity’s Ability to Continue as a Going Concern (see American Institute of Certified Public Accountants, 1988).
2. The risk that the study is attempting to measure is “investment risk.” It is defined as the investors’ perceptions of the financial viability of an entity based on their reading of the financial statements and notes to the financial statements.
3. A number of procedures were followed to ensure that the students took the task seriously. For example, the experiment was officially conducted as part of the requirements of a class. After the students completed the study, one of the researchers held a discussion with the students about the study. Students were requested to justify their answers with respect to the decision to invest. From the level of participation, we concluded that the students not only understood the material but also took their participation seriously. In fact, the majority of students who declined to invest cited the going concern uncertainty footnote as the primary reason for their decision. These responses were further corroborated by the evidence gathered from respondents’ reaction to question 5 in the response instrument, which requested the participants to allocate points among the information given to them based on the item’s relative importance to their investment decision. Students placed greater weight on items that were generally relevant to the investment decision with the heaviest weight on the contingency footnote. This suggests that an overall general understanding of the material was present among the participants in the study.
ACKNOWLEDGMENTS

The authors thank the editor, the associate editor, and the two anonymous reviewers for their insightful and constructive comments.
REFERENCES Abdel-Khalik, A. R., Graul, P. R., & Newton, J. D. (1986). Reporting uncertainty and assessment of risk: Replication and extension in a Canadian setting. Journal of Accounting Research, 24, 372–382. American Institute of Certified Public Accountants (AICPA) (1988). Reports on audited financial statements. In: Statement on Auditing Standards (Nos. 58 and 59). New York, NY: AICPA. Anandarajan, A., Viger, C., & Curatola, A. P. (2002). An experimental investigation of alternative going-concern reporting formats: A Canadian experience. Canadian Accounting Perspectives, 1(2), 141–162. Bamber, E. M., & Stratton, R. A. (1997). The information content of the uncertainty-modified audit report: Evidence from bank loan officers. Accounting Horizons, 11(2), 1–11. Bertholdt, R. H. (1979). Discussion of the impact of uncertainty reporting on the loan decision. Journal of Accounting Research (Suppl.), 58–63. Boritz, J. E. (1991). The going concern assumption. In: Canadian Institute of Chartered Accountants Research Report. Toronto: CICA. Canadian Institute of Chartered Accountants (CICA) (1995). Proposed auditing recommendations auditor’s responsibility to evaluate the going concern assumption. Auditing Standards Board (September). Canadian Institute of Chartered Accountants (CICA) (1996). Proposed accounting recommendations – Going concern. Accounting Standards Board (January). Canadian Institute of Chartered Accountants (CICA) (1999). Department digest: A summary of current CICA projects and initiatives. The Canadian Accountant (Fall), 2. Canadian Institute of Chartered Accountants (CICA) (2000). New era begins for the Accounting Standards Board. The Canadian Account (Winter), 1. Canadian Institute of Chartered Accountants (CICA) (2003). CICA handbook. Toronto: CICA. Chewning, E., & Harrell, A. (1990). The effect of information overload on decision makers’ cue utilization levels and decision quality in a financial distress decision task. 
Accounting Organizations & Society, 15(6), 527–542. Elias, R. Z., & Johnston, J. G. (2001). Is there incremental information content in the going concern explanatory paragraph? Advances in Accounting, 18, 105–117. Frownfelter, C. A., & Fulkerson, C. L. (1998). Linking the incidence and quality of graphics in annual reports to corporate performance: An international comparison. Advances in Accounting Information Systems, 6, 129–152. Gul, F. A. (1987). The effects of uncertainty reporting on lending officers’ perception of risk and additional information required. ABACUS, 23(2), 172–179. Hirst, E., & Hopkins, P. (1998). Comprehensive income reporting and analysts’ valuation judgments. Journal of Accounting Research, 36, 47–75. Hopkins, P. (1996). The effect of financial statement classification of hybrid financial instruments on financial analysts’ stock price judgments. Journal of Accounting Research (Suppl.), 33–50. Houghton, K. A. (1983). Audit reports: Their impact on the loan decision process and outcome: An experiment. Accounting and Business Research, 66, 15–20. Hunton, J. E., & McEwen, R. A. (1997). An assessment of the relation between analysts’ earnings forecast accuracy, motivational incentives, and cognitive information search strategy. The Accounting Review, 72(October), 497–516. International Federation of Accountants (IFAC) (1999). International statement on auditing 570. Going concern. In: IFAC Handbook Technical Pronouncements. New York: IFAC.
Iselin, E. (1993). The effects of the information and data properties of financial ratios and statements on managerial decision quality. Journal of Business Finance and Accounting, 20(2), 249–266. LaSalle, R. E., & Anandarajan, A. (1997). Bank loan officers’ reactions to audit reports issued to entities with litigation and going concern uncertainties. Accounting Horizons, 11(2), 33–40. Levitt, A. (1997, September 26). The importance of high quality accounting standards. Speech to the Inter-American Development Bank, Washington, DC. Libby, R. (1979). The impact of uncertainty reporting on the loan decision. Journal of Accounting Research (Suppl.), 35–57. Lipe, M., & Salterio, S. (2000). Balanced scorecard: Judgmental effects of common and unique performance measures. The Accounting Review (July), 283–298. Maines, L. A., & McDaniel, L. (2000). Effects of comprehensive-income characteristics on nonprofessional investors’ judgments: The role of financial-statement presentation format. The Accounting Review, 75(April), 179–207. Payne, J., Bettman, J., & Johnson, E. (1993). The adaptive decision maker. Cambridge: Cambridge University Press. Pringle, L. M., Crum, R. P., & Swetz, R. J. (1990). Do SAS No 59 format changes affect the outcome and the quality of investment decisions. Accounting Horizons (September), 68–76. Ricchiute, D. (1992). Working paper order effects and auditors’ going concern decisions. The Accounting Review (January), 46–58. Slovic, P., & MacPhillamy, D. (1974). Dimensional commensurability and cue utilization in comparative judgment. Organizational Behavior and Human Performance, 11, 172–194. Stocks, M. H., & Tuttle, B. (1998). An examination of information presentation effects on financial distress predictions. Advances in Accounting Information Systems, 18, 107–128. Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5, 207–232. Walters-York, L. M., & Curatola, A. P. (1998). 
Recent evidence on the use of students as surrogate subjects. Advances in Accounting Behavioral Research, 1, 123–143. Walters-York, L. M., & Curatola, A. P. (2000). Theoretical reflections on the use of students as surrogate subjects in behavioral experimentation. Advances in Accounting Behavioral Research, 3, 243–264.
APPENDIX

1 How many university courses have you had in management and financial accounting? How many university courses have you had in auditing? What is your age?

2 Would you be willing to invest in this company? (Check “Yes” or “No”).
YES    NO

3 Please circle on the scale shown below your perception of risk of the company.
LOW RISK    1    2    3    4    5    HIGH RISK
4 Please circle on the scale shown below your perception of the likelihood that the company can improve its profitability.
VERY UNLIKELY    1    2    3    4    5    VERY LIKELY

5 Assign 100 points across the following financial statements items according to the items’ relative importance to your investment decision:
(1) Balance sheet
(2) Footnote 1A (Going Concern Assumption)
(3) Footnote 1B (Summary of Significant Accounting Policies)
(4) Footnote 2 (Restructuring Charge)
(5) Footnote 3 (Income Taxes)
(6) Footnote 4 (Accounts receivable)
(7) Footnote 5 (Inventories)
(8) Footnote 6 (Capital assets)
(9) Footnote 7 (Accounts payable)
(10) Footnote 8 (Long term debt)
(11) Footnote 9 (Contingencies)
(12) Audit report
(13) Statement of earnings and retained earnings
(14) Statement of changes in financial position

6 Please circle on the scale shown below the extent to which you are likely to search for additional information about matters addressed in the auditor’s report.
VERY UNLIKELY    1    2    3    4    5    VERY LIKELY
END OF QUESTIONNAIRE. THANK YOU FOR YOUR HELP
MANAGEMENT FRAUD RISK FACTORS: AN EXAMINATION OF THE SELF-INSIGHT OF AND CONSENSUS AMONG FORENSIC EXPERTS

Sally A. Webber, Barbara Apostolou and John M. Hassell

Advances in Accounting Behavioral Research, Volume 7, 75–96 Copyright © 2004 by Elsevier Ltd. All rights of reproduction in any form reserved ISSN: 1474-7979/doi:10.1016/S1474-7979(04)07004-8

ABSTRACT

Over the past two years, fraudulent financial reporting has become a major concern of both the Securities and Exchange Commission and investors. These concerns have been spurred by evidence that several high-profile companies such as Enron, Tyco, WorldCom, and HealthSouth have published false and/or misleading financial reports. Statement on Auditing Standards (SAS) No. 82 specifies that auditors have a responsibility to assess the likelihood of management fraud and identifies specific risk factors that should be considered when making that assessment. Apostolou et al. (2001b) examined how internal and external auditors rate the relative importance of these factors. This study extends Apostolou et al. (2001b) by examining how forensic experts at four Big 5 professional service firms assess the factors specified in SAS No. 82. These assessments produced two different models of relative importance: (a) a statistical model (produced by the Analytic Hierarchy Process); and (b) a subjective model (based on subjects’ assessment of the relative weights). These models are then used to assess
the self-insight of and the degree of agreement among the forensic experts. The results indicate that forensic experts have a moderately high degree of self-insight. A moderate to high degree of consensus among experts’ judgments about the relative importance of fraud risk factors was noted.
INTRODUCTION

Statement on Auditing Standards (SAS) No. 82, “Consideration of Fraud in a Financial Statement Audit” (AICPA, 1997), identifies 25 risk factors that an auditor should consider when making a management fraud risk assessment.1 SAS No. 82 provides specific guidance to auditors regarding how they should fulfill the responsibility to obtain reasonable assurance about whether the financial statements are free of material misstatement, whether due to error or fraud. Auditors are required to consider SAS No. 82 risk factors that “are discriminating and have been found to be present frequently in actual instances of fraud” (Mancino, 1997, p. 32). Based upon identification of the presence of risk factors, the auditor is required to assess the risk of material misstatement due to fraud and to document the nature of the response. The fraud risk assessment should evolve throughout the course of the audit, and is not expected to be assessed at a level (i.e. high, medium, low), as is the case with inherent or control risk.

Since late 2001, the auditing profession has come under scrutiny as a result of highly publicized corporate frauds (e.g. Enron, WorldCom, HealthSouth) and the corresponding apparent failure of the auditors to detect and report the wrongdoing in a timely manner. Significant shareholder losses and the decline in investor confidence led Congress to enact the Sarbanes-Oxley Act in 2002, which created the Public Company Accounting Oversight Board to increase accountability of the public accounting profession. Meanwhile, the growth of the specialized practice of forensic accounting has been tremendous as a result of the increased concern about corporate fraud and its impact on the integrity of the financial markets. Forensic experts are individuals in professional services firms with specific training in and experience with fraud investigative techniques.
These individuals work with fraud risk and its effects on an ongoing basis, while internal and external auditors may encounter it rarely. Thus, it seems appropriate to consider how forensic experts model the SAS No. 82 management fraud risk factors (AICPA, 1997). Apostolou et al. (2001b) studied how three groups of auditors (50 regional/local firm external auditors, 43 Big 5 firm external auditors, and 47 internal auditors) rated the relative importance of 25 management fraud risk factors in SAS No. 82 (AICPA, 1997). The AHP was used to model the judgment of each subject, and the individual models were then combined to produce mean decision models for
each auditor group. Statistical analysis showed no significant differences between the auditor groups. This study replicates and extends Apostolou et al. (2001b) and investigates the degree of agreement (self-insight and consensus) among 35 practicing Big 5 firm forensic experts regarding the relative importance of the 25 management fraud risk factors. The forensic experts who participated serve crucial roles in their firms because they are summoned whenever fraud is suspected or discovered. Management fraud (as opposed to employee fraud) is the topic of interest in the current study primarily because its effects tend to be more severe, as evidenced by the impact of the highly publicized failures in the Enron era. If the knowledge held by forensic experts is to be passed on to and aid in the training of less experienced auditors, these experts must understand how they personally make decisions regarding risk (i.e. demonstrate a relatively high level of self-insight into their decisions about relative risk (Colbert, 1988)). The results of prior research regarding the self-insight of experts, which has been assessed in several different ways, have been mixed with results documenting low to high self-insight (Ashton & Brown, 1980; Ashton & Kramer, 1980; DeZoort, 1998; Hamilton & Wright, 1982; Mear & Firth, 1987; Slovic et al., 1972). Reilly and Doherty (1992) note that seemingly high correlations reported in some prior studies should be viewed with caution because it is easy to obtain high correlations in studies with few attributes. Many prior accounting studies that have shown auditors have relatively high self-insight have used six or fewer cues (Ashton & Kramer, 1980; Colbert, 1988; Hamilton & Wright, 1982). This study includes a very large number of attributes and an expert population where self-insight has not been previously documented. Thus we are unable to predict a level of self-insight from prior studies. 
If this research is to assist the profession in understanding the relative importance of these 25 fraud risk factors, it is important that the experts surveyed have the ability to describe how they view the importance of the factors (i.e. exhibit self-insight). Since we are unable to assess the accuracy of the subjects’ models in predicting fraud, consensus is measured as a surrogate. Although consensus does not guarantee accuracy, the relative weights produced from this study would be of little assistance if our experts showed little agreement. Two different measures of relative importance of the 25 risk factors were obtained from the forensic experts: (1) statistical weights derived using a mathematical decision model based on paired comparison data provided by the subjects; and (2) subjective weights assigned by the experts. The experts’ degree of agreement was assessed in two separate ways: (1) computing measures of self-insight across models; and (2) computing measures of consensus across experts. Self-insight was computed in two ways: (1) computing the correlations between the statistical and subjective model weights; and (2) after using simulation analysis
to compute predictions from each expert’s statistical and subjective models, computing correlations between the predictions of the two models for each individual. Finally, the degree of consensus among the experts for both statistical and subjective models was assessed in two ways: (1) computing correlations between the model weights; and (2) after using simulation analysis to compute predictions from each expert’s statistical and subjective models, computing correlations between the predictions of the two models for all possible pairs of individuals. The results demonstrate moderately high self-insight and moderate to high consensus. Agreement (self-insight) between an individual expert’s models was moderately high for decision weights and high for predictions, and agreement (consensus) across individuals was higher than that found in prior studies. The remainder of the paper is organized as follows. Prior research is briefly summarized in the next section. The research method is then described, followed by a summary of the results and conclusions.
PRIOR RESEARCH

Nieschweitz et al. (2000) review literature that addresses the fraud detection responsibility of external auditors. This prior research emphasizes various approaches to the study of fraud and fraud risk. Some researchers examined the efficacy of decision aids (Eining et al., 1997; Green & Choi, 1997; Pincus, 1989). Others examined how auditors make the fraud risk assessment or can improve upon it (Bernardi, 1994; Hooks et al., 1994; Reckers & Schultz, 1993; Zimbelman, 1997). Another avenue of research is the use of surveys to determine relevant risk factors (Apostolou & Hassell, 1993; Apostolou et al., 2001a; Hackenbrack, 1993; Loebbecke et al., 1989). Finally, the analysis of archival data sources to discover factors identified with management fraud also has been explored (Beasley et al., 1999; Palmrose, 1987).

Landsittel and Bedard (1997, p. 3) noted that research can contribute to the ongoing improvement of auditing standards to guide the fraud risk assessment process, including: (1) identifying the most important fraud risk factors; and (2) weighting the risk factors as to their relative importance to the fraud risk assessment. Eining et al. (1997, p. 4) observed that checklists, which are used by auditors to assist in making a fraud risk assessment, give “no mechanical assistance for weighting and combining the red flag cues into an overall assessment.” Heiman-Hoffman et al. (1996) surveyed Big 6 auditors to obtain an explicit ranking of 30 fraud risk factors, but the ranking method used precluded determination of the relative importance of the risk factors. Apostolou et al. (2001a) measured the relative importance of SAS No. 82 risk factors to three groups of experts (Big
5 field auditors, regional/local field auditors, and internal auditors). The current study extends these studies by assessing the relative importance of SAS No. 82 risk factors to the most senior forensic auditors in Big 5 firms.2
RESEARCH METHOD

Data Collection

The Analytic Hierarchy Process

Developed by Thomas Saaty, the Analytic Hierarchy Process (AHP) is a mathematical tool used to model complex judgments (Saaty, 1986, 1988, 1994). Apostolou and Hassell (1993) review AHP research in accounting, and Apostolou et al. (2001b) provide an extended illustration showing the application of AHP. The AHP is useful when qualitative criteria (e.g. management fraud risk factors) enter into judgments, as is the case of assessing the risk of material misstatement due to management fraud. The technique requires that the judgment criteria be organized in a hierarchical fashion, from most general at the top of the hierarchy to most specific at the bottom. An AHP model is constructed by having each participant consider all possible pairs of risk factors within each hierarchical grouping and then rate the factors within each pair as to their importance to the immediately superior level in the hierarchy using a nine-point intensity rating scale.

SAS No. 82 identifies management fraud risk factors that should enter into the assessment of the risk of material misstatement due to management fraud. The same AHP hierarchy for the management fraud risk factors (see Fig. 1) that was developed and described in Apostolou et al. (2001b, p. 8) was used. The hierarchy consists of three levels, with level one the most general and level three the most specific. Level one represents the goal of the decision process: assess risk of material misstatement due to management fraud in a financial statement audit. Level two consists of the SAS No. 82 categories of fraud risk factors: (a) management characteristics and influence over the control environment; (b) industry conditions; and (c) operating and financial stability characteristics. Level three includes a total of 25 specific risk factors (AICPA, 1997, ¶17) that fall within the three level-two categories.
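To make the pairwise-comparison idea concrete, the following is a minimal sketch of how AHP priority weights can be derived from a reciprocal comparison matrix and then carried down one level of the hierarchy. It is not the procedure used in the study, which relied on the Expert Choice software. The function name `ahp_priorities`, the use of power iteration to approximate the principal eigenvector, and every judgment value in the two matrices are assumptions made purely for illustration.

```python
def ahp_priorities(matrix, iterations=100):
    """Approximate the principal eigenvector of a reciprocal
    pairwise-comparison matrix by power iteration.
    The returned weights are normalized to sum to 1."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    return weights


# Hypothetical level-two judgments on the nine-point scale.
# Row/column order: management, industry, operating.
# Entry [i][j] says how much more important category i is than j;
# entry [j][i] is its reciprocal.
level_two = [
    [1.0, 4.0, 2.0],
    [1 / 4, 1.0, 1 / 2],
    [1 / 2, 2.0, 1.0],
]
category_weights = ahp_priorities(level_two)

# Hypothetical level-three judgments for the four industry factors.
industry_matrix = [
    [1.0, 1 / 2, 1.0, 1 / 2],
    [2.0, 1.0, 2.0, 1.0],
    [1.0, 1 / 2, 1.0, 1.0],
    [2.0, 1.0, 1.0, 1.0],
]
industry_local = ahp_priorities(industry_matrix)

# A factor's overall weight is its local weight multiplied by the
# weight of the category immediately above it in the hierarchy.
industry_global = [category_weights[1] * w for w in industry_local]

print([round(w, 3) for w in category_weights])
print([round(w, 3) for w in industry_global])
```

Because each local weight vector sums to one, the overall weights of the factors in a category sum to that category's weight, so the overall weights of all level-three factors sum to one.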
Model Weights of Relative Importance

Two methods to measure the relative importance of the management fraud risk factors to forensic experts were used. First, statistical model weights of relative
Fig. 1. Management Fraud Risk Assessment Hierarchy. Note: (a) The risk factors are condensed versions of the definitions used in SAS No. 82 (AICPA, 1997). (b) The hierarchy is consistent with Apostolou et al. (2001b, p. 8).
importance were obtained using the AHP to assess the relative importance of the management fraud risk factors to forensic experts from Big 5 firms. Second, subjective weights of relative importance were obtained by asking the forensic experts to allocate 100 points (100%) of relative importance across the 25 management fraud risk factors. Allocating 100 points across cues or factors is a commonly employed technique in behavioral research.

The Research Instrument

Data were collected from forensic experts with a three-part survey-type instrument designed to facilitate the calculation of AHP models. Part one asked the subjects to allocate 100 points over the risk factors within each of the three categories and among the three categories. The risk factors and descriptions were reproduced verbatim from SAS No. 82 (AICPA, 1997, ¶17). Part two presented the pairwise comparisons within and among the three categories. Again, the subjects were provided with both the verbatim SAS No. 82 risk factor description and the AHP scale definitions on each page on which comparisons were required. Part three consisted of demographic questions.

In part one, each participant allocated: (1) 100 points over the three risk categories; and (2) 100 points across the individual risk factors within each category. These responses produced subjective model weights of relative importance. Because the 25 factors appear in three different categories, two ways exist to allocate 100 points across the factors. One is to ask the subjects to allocate 100 points across all 25 factors. Another is to allocate 100 points across the factors in each category, and then to use the subjective allocations within each category to prepare an allocation across all 25 factors. The latter method was used because it is consistent with the SAS No. 82 categorical presentation (AICPA, 1997, ¶17).
It is possible, however, that this process would result in a different allocation than if the 100 points had been allocated across all 25 factors at once. In part two of the instrument, participants made 52 paired-comparisons, which were used to produce statistical model weights of relative importance for each risk factor. An AHP model is produced by having each subject make all possible pairwise comparisons of the level 2 categories (3 comparisons) and the level 3 risk factors (126 comparisons), which would require a total of 129 pairwise comparisons. Subject fatigue is a significant concern when the number of comparisons is so large. When hierarchies contain a large number of comparisons, a function in the AHP’s Expert Choice™ (1998) software used for the computations called “link elements” allows the researcher to reduce the number of comparisons while generating sufficient redundancy among comparisons to produce an AHP model. The model used consists of a total of 52 comparisons: level 2 (3 comparisons) and level 3 (49 comparisons). The only grouping in which the “link elements” feature
was used is the operating and financial stability characteristics category, wherein the number of required comparisons was 28 instead of 105. In part three, demographic information was collected. Questions related to professional experience, involvement with fraud risk assessment, and the instrument were asked to assist in understanding the data.

Forensic Expert Participants

This research study was sponsored by the AICPA as part of its research into the effectiveness of SAS No. 82. The AICPA acted as an intermediary to solicit participation by forensic experts from Big 5 professional service firms. AICPA staff and representatives of each Big 5 firm reviewed the research instrument before it was administered. Four of the five firms agreed to participate by distributing the research instrument to their most senior forensic experts. Thirty-five individuals returned usable research instruments.

A forensic expert is an individual with training and experience in assessing fraud risk, conducting fraud investigations, and evaluating the impact of actual fraud. By definition, a forensic expert is most likely to be consulted when fraud is alleged or suspected. Thus, an audit team member may notice indicators of fraud during a routine audit and call upon the forensic expert to investigate. An employee of a client may make an allegation of fraud (i.e. act as a whistleblower), to which a forensic expert may respond. In a scenario such as Enron, a forensic expert will be employed to assess the damage from a known fraud. Forensic techniques extend beyond those used in a routine audit, and are not typically included in a usual audit program. Thus, individuals with the designation of forensic expert may be accountants, attorneys, or former FBI agents who have obtained training and experience beyond the routine.
For this study, all participants were designated by their respective firms as experts in the management fraud area, with mean auditing experience of 12 years, and were at least managers in their firms. Most participants had been members of an audit team when fraud was discovered (66%) and had personally been involved in making a fraud risk assessment (71%). Thirteen participants had specific certification in fraud investigation (e.g. Certified Fraud Examiner).
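The arithmetic behind the instrument design described above can be checked with a short sketch. The category sizes (6, 4, and 15 factors) are taken from Table 1, and the reduced count of 28 comparisons for the operating category is taken from the text. The point allocations in the second half are hypothetical values chosen only to show how within-category allocations combine into an allocation across all 25 factors; they are not data from the study.

```python
from math import comb

# Number of level-three risk factors in each level-two category.
factors_per_category = {"management": 6, "industry": 4, "operating": 15}

# All possible pairs among the three categories, and within each category.
level_two_pairs = comb(len(factors_per_category), 2)
level_three_pairs = {name: comb(n, 2)
                     for name, n in factors_per_category.items()}
full_total = level_two_pairs + sum(level_three_pairs.values())

# With "link elements", only the operating category is reduced (105 -> 28).
reduced_pairs = dict(level_three_pairs, operating=28)
reduced_total = level_two_pairs + sum(reduced_pairs.values())

print(level_three_pairs)  # {'management': 15, 'industry': 6, 'operating': 105}
print(full_total)         # 129
print(reduced_total)      # 52

# Hypothetical part-one responses: 100 points across the categories,
# and 100 points across the factors within each category.
category_points = {"management": 50, "industry": 20, "operating": 30}
within_points = {
    "management": [30, 25, 15, 10, 10, 10],
    "industry": [20, 30, 25, 25],
    "operating": [10, 10, 5, 10, 10, 5, 5, 5, 10, 5, 5, 5, 5, 5, 5],
}

# Overall subjective weight = category share x within-category share.
overall = [
    (category_points[name] / 100) * (points / 100)
    for name in factors_per_category
    for points in within_points[name]
]
print(len(overall), round(sum(overall), 6))
```

The counts reproduce the 126 level-three comparisons, the 129 total, and the 52 comparisons actually used (3 at level two and 49 at level three).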
Data Analysis

Self-Insight

Two different measures of self-insight were used to assess the degree of agreement between forensic experts’ decision models. In their 1971 review of the judgment and decision-making literature, Slovic and Lichtenstein concluded that subjects have poor self-insight and that they overestimate the importance placed on minor
cues and underestimate reliance on a few major cues. This conclusion was based upon a review of judgment and decision-making literature that primarily compared statistical and subjective model weights. Reilly and Doherty (1992, p. 286) note that although “there have been doubts expressed about the generalization that people have little or no insight regarding their judgment policies,” this conclusion seems to have become generally accepted. Subjective weights most commonly are derived by asking subjects to allocate 100 points (or 100%) across the judgment criteria. The statistical model most commonly used is regression analysis, although other models such as the AHP have been used (Apostolou & Hassell, 1993; Apostolou et al., 2001a, b; Hassell & Arrington, 1989). Reilly and Doherty (1992) indicate that four common ways of computing self-insight include the following: (1) correlation of the statistical model weights directly with the subjective weights; (2) R2 values from predictions based upon statistical model and subjective weights; (3) correlation statistics of predictions using statistical model and subjective weights; and (4) the judgment predictions for holdout samples based upon statistical model and subjective weights.3 Schmitt and Levine (1977) argue that although the statistical weights and subjective weights appear entirely different, the two sets of predicted values resulting from applications of those weights may be highly correlated. Thus, Schmitt and Levine suggest that the correlation between predicted values (method 3 identified by Reilly and Doherty) may reflect more self-insight than is apparent when comparing statistical and subjective weights, and this correlation provides a more reasonable method of assessing self-insight. Based on Schmitt and Levine’s argument, Surber (1985) reevaluated several prior studies by comparing correlations between predicted values rather than between statistical and subjective weights. 
Surber found that the correlation between the predicted values showed higher self-insight than the previous studies that correlated statistical and subjective weights had suggested. Two within-subject analysis methods were used to assess self-insight in this study: (1) Spearman correlations between the AHP and subjective model weights for each expert, which reflects Reilly and Doherty’s suggestion number one; and (2) Spearman correlations for each expert’s AHP-model predictions and subjective-model predictions from a simulation analysis, which reflects Reilly and Doherty’s suggestion number three. Because measures of actual risk are required to calculate R2 values or to use as a holdout sample, methods 2 and 4 could not be conducted.

Degree of Agreement Among Forensic Experts’ Decision Models (Consensus)

Two between-subject analyses were conducted to measure the degree of agreement across experts’ decision models. First, for AHP and subjective models separately, the Spearman correlations between each possible pair of experts’ decision model
weights were computed (i.e. for all possible pairs of 35 experts, the correlations between the 25 decision model weights). The average of all of these Spearman paired correlations is a measure of consensus-in-principle (Einhorn, 1974). Second, simulation analysis was used to calculate predictions from each expert’s AHP and subjective model weights to determine the level of agreement among the predictions that result from the subjects’ models. The best way to test each expert’s model would be to observe actual cases of management fraud and the factors or combinations of factors associated with the fraud, and then use those to compute how well an individual expert’s decision model predicted the fraud. Unfortunately, data from actual fraud risk assessments are not available because they are proprietary. As an alternative to considering actual fraud risk assessments, simulation analysis was used to construct hypothetical cases and to determine how similarly the experts’ AHP and subjective decision models predicted outcomes. A total of 1,000 randomly generated hypothetical cases (trials) were constructed and two different models were used in the simulation. First, a 3-values model was used. For each of the 25 factors in each case, a risk level was randomly assigned to that factor: 0.8 = high risk, 0.5 = medium risk, and 0.2 = low risk. These numerical values simulate an assessment of the level of risk for a particular risk factor. The simulated risk values were then input into each subject’s AHP and subjective decision models where they were weighted using the relative weights from each technique. Although this process does not correspond to that actually used by any auditing firm, these values are useful to simulate model predictions. The randomly assigned risk level was multiplied by the appropriate decision weight for each participant’s 25 AHP factor weights and the results summed. The final numerical score theoretically could range from 0.2 to 0.8.
This process was repeated using subjective weights. In a second 2-values model, the simulation analysis was replicated by randomly assigning a value of either zero (no risk) or one (high risk) to each of the 25 risk factors over 1,000 trials. Then, for AHP and subjective models separately, Spearman correlations between each possible pair of experts’ predictions were computed. The average of all of these Spearman correlations is a measure of consensus-in-predicted values (Webber & Hassell, 1997).
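The simulation and correlation steps described in this section can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the five "experts" and their weights are random placeholders rather than study data, the helper names (`ranks`, `spearman`, `predict`, `random_weights`) are hypothetical, and Spearman's coefficient is computed as the Pearson correlation of average ranks. Only the 3-values model is shown; the 2-values model would replace `RISK_LEVELS` with `(0, 1)`.

```python
import random
from itertools import combinations

RISK_LEVELS = (0.8, 0.5, 0.2)  # high, medium, low (3-values model)
N_FACTORS = 25
N_CASES = 1000


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    return pearson(ranks(x), ranks(y))


def predict(weights, case):
    """Weighted sum of simulated risk levels for one hypothetical case."""
    return sum(w * r for w, r in zip(weights, case))


def random_weights(rng):
    """Placeholder weights that sum to 1 (stand-in for an expert's model)."""
    raw = [rng.random() for _ in range(N_FACTORS)]
    total = sum(raw)
    return [value / total for value in raw]


rng = random.Random(0)
cases = [[rng.choice(RISK_LEVELS) for _ in range(N_FACTORS)]
         for _ in range(N_CASES)]

# Each placeholder expert has an (AHP weights, subjective weights) pair.
experts = [(random_weights(rng), random_weights(rng)) for _ in range(5)]

# Self-insight: within-expert agreement between the two models.
insight_weights = [spearman(ahp, subj) for ahp, subj in experts]
insight_predictions = [
    spearman([predict(ahp, c) for c in cases],
             [predict(subj, c) for c in cases])
    for ahp, subj in experts
]

# Consensus-in-predicted-values: mean correlation over all expert pairs.
ahp_predictions = [[predict(ahp, c) for c in cases] for ahp, _ in experts]
pairwise = [spearman(p, q) for p, q in combinations(ahp_predictions, 2)]
consensus_in_predicted_values = sum(pairwise) / len(pairwise)
```

With random placeholder weights the correlations carry no substantive meaning; the sketch only shows the mechanics. Replacing the prediction vectors with the weight vectors in the last step gives the corresponding consensus-in-principle measure.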
RESULTS

Forensic Experts’ AHP and Subjective Model Weights

Table 1 presents the mean (median) AHP and subjective decision model weights, computed across subjects as an average of the 35 forensic experts’ models. These
Management Fraud Risk Factors
85
Table 1. Aggregate AHP and Subjective Decision Model Weights. Risk Factor Category
AHP Mean (Median)
Subjective Mean (Median)
Management Characteristics and Influence Over the Control Environment Significant compensation tied to aggressive accounting 0.180(0.160) practices Management’s failure to display appropriate attitude about 0.165(0.140) internal control Nonfinancial management’s influence over GAAP 0.068(0.048) principles or estimates High turnover of senior management 0.042(0.031) Strained management/auditor relationship 0.050(0.043) Known history of securities law violations 0.080(0.039)
0.049(0.050) 0.065(0.060) 0.060(0.050)
Subtotala
0.585
0.495
0.027(0.020)
0.045(0.040)
0.046(0.034)
0.061(0.060)
0.034(0.020) 0.038(0.028)
0.050(0.050) 0.051(0.050)
0.145
0.207
0.018(0.011) 0.023(0.014) 0.019(0.012) 0.029(0.018) 0.028(0.021) 0.006(0.005) 0.010(0.006) 0.016(0.010) 0.024(0.014)
0.024(0.025) 0.028(0.026) 0.020(0.017) 0.028(0.024) 0.026(0.024) 0.011(0.009) 0.014(0.015) 0.020(0.018) 0.025(0.020)
0.020(0.008) 0.011(0.007) 0.012(0.007) 0.024(0.013) 0.016(0.011) 0.014(0.011)
0.018(0.015) 0.014(0.013) 0.014(0.015) 0.018(0.017) 0.019(0.017) 0.019(0.017)
Industry Conditions Effect of new accounting requirements on financial stability/profitability High degree of competition/market saturation and declining margins Company in declining industry Rapid changes in industry, vulnerability to changing technology & product obsolescence Subtotala Operating and Financial Stability Characteristics Significant accounts based on estimates Significant related-party transactions “Substance over form” questions Presence of aggressive incentive programs Potential adverse consequences of poor financial results High vulnerability to interest rates Unusually high dependence on debt Threat of imminent bankruptcy Poor/deteriorating financial position with management guarantee of firm’s debt Bank accounts or operations in tax-haven jurisdictions Overly complex organization Difficulty in determining organizational control Negative operating cash flow but reported earnings Significant pressure to obtain capital Unusually rapid growth/profitability relative to industry
0.120(0.100) 0.130(0.100) 0.071(0.064)
Subtotala
0.270
0.298
Total
1.000
1.000
86
SALLY A. WEBBER ET AL.
weights could be interpreted as two alternative presentations of the relative importance of each management fraud risk factor. Generally, the results reported throughout do not differ based upon the participant’s firm (i.e. no firm effect), although the power of statistical tests is weak due to small sample sizes. The weights shown in Table 1 are consistent with prior research which has shown that the distribution of subjective weights tends to be more “even” (i.e. flatter) than those of statistical weights (Mear & Firth, 1987; Reilly & Dougherty, 1992). Although the relative ranking of the three major risk factor categories is consistent between the two methods, the subjective weights for the categories are more “even.” The highest category weighting (management characteristics and influence over the control environment) is lower and the lowest category weighting (industry conditions) is higher for the subjective method than for the AHP method. Because of the more even distribution of rankings throughout the subjective method, all of the weights in the industry conditions category are higher for the subjective method than for the AHP method. For both the management characteristics and influence over the control environment and operating and financial stability categories, the range of weights for the AHP method is larger than for the subjective method.4
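The category subtotals and the "flatter" pattern of the subjective weights can be checked directly from the Table 1 means. The short sketch below is only an arithmetic illustration; the dictionary keys and helper names are invented for this example, and the values are the mean weights reported in Table 1, grouped by category.

```python
# Mean weights from Table 1, grouped by category (keys are illustrative labels).
ahp = {
    "management": [0.180, 0.165, 0.068, 0.042, 0.050, 0.080],
    "industry": [0.027, 0.046, 0.034, 0.038],
    "operating": [0.018, 0.023, 0.019, 0.029, 0.028, 0.006, 0.010, 0.016,
                  0.024, 0.020, 0.011, 0.012, 0.024, 0.016, 0.014],
}
subjective = {
    "management": [0.120, 0.130, 0.071, 0.049, 0.065, 0.060],
    "industry": [0.045, 0.061, 0.050, 0.051],
    "operating": [0.024, 0.028, 0.020, 0.028, 0.026, 0.011, 0.014, 0.020,
                  0.025, 0.018, 0.014, 0.014, 0.018, 0.019, 0.019],
}


def subtotals(weights):
    """Sum of the mean weights within each category."""
    return {cat: round(sum(vals), 3) for cat, vals in weights.items()}


def spread(weights):
    """Range (largest minus smallest mean weight) within each category."""
    return {cat: round(max(vals) - min(vals), 3) for cat, vals in weights.items()}


print(subtotals(ahp))         # {'management': 0.585, 'industry': 0.145, 'operating': 0.27}
print(subtotals(subjective))  # {'management': 0.495, 'industry': 0.207, 'operating': 0.298}
print(spread(ahp))
print(spread(subjective))
```

The subtotals match those reported in the table, and the within-category range is wider under AHP than under the subjective method for the management and operating categories, which is the pattern the text describes.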
Self-Insight

Self-insight was analyzed using two different measures, both of which are within-subject comparisons. The first is the correlation between each expert’s AHP and subjective model weights; the second is the correlation between the predictions of the AHP model and the predictions of the subjective model.

Correlations Between Individual Expert’s Model Weights (Within Subjects)

Table 2 reports a Spearman correlation for the AHP and subjective model weights for each of the 35 experts: 33 of 35 correlations are statistically significant at the p ≤ 0.05 level (Note 5). Further, the mean (median, minimum, maximum) Spearman correlation across the 35 participants is 0.694 (0.688, 0.266, 0.913), which is significant at the p ≤ 0.001 level. These results reflect a fairly high level of self-insight for the forensic experts.

Table 2. Degree of Agreement in Model Weights Across Models (Within-Subjects): Mean Spearman Correlation of Experts’ AHP and Subjective Weights.

| Subject | Spearman |
|---|---|
| 1 | 0.580 |
| 2 | 0.870 |
| 3 | 0.864 |
| 4 | 0.788 |
| 5 | 0.882 |
| 6 | 0.836 |
| 7 | 0.885 |
| 8 | 0.853 |
| 9 | 0.661 |
| 10 | 0.619 |
| 11 | 0.740 |
| 12 | 0.688 |
| 13 | 0.661 |
| 14 | 0.740 |
| 15 | 0.363 |
| 16 | 0.623 |
| 17 | 0.655 |
| 18 | 0.799 |
| 19 | 0.589 |
| 20 | 0.843 |
| 21 | 0.650 |
| 22 | 0.719 |
| 23 | 0.625 |
| 24 | 0.624 |
| 25 | 0.769 |
| 26 | 0.502 |
| 27 | 0.764 |
| 28 | 0.616 |
| 29 | 0.913 |
| 30 | 0.266 |
| 31 | 0.551 |
| 32 | 0.612 |
| 33 | 0.791 |
| 34 | 0.894 |
| 35 | 0.457 |
| Mean | 0.694 |
| Median | 0.688 |
| Minimum | 0.266 |
| Maximum | 0.913 |

Note: Table 2 reports the Spearman correlation between the 25 AHP model weights and 25 subjective model weights for each participant (n = 35). Pearson correlations are essentially identical. All correlations are statistically significant (p ≤ 0.05) except for participants 15 and 30.

Correlations Between Predictions from AHP and Subjective Models (Within Subjects)

The second self-insight analysis, as described previously, is based on a suggestion by Schmitt and Levine (1977). The Spearman correlation between the AHP predicted values for each simulated case and the subjective predicted values for the simulated case was computed as the second means of evaluating self-insight. This result describes consensus-in-predicted values because the correlation measures the degree to which the two models predict the same outcome. Table 3 reports the mean Spearman correlation (the mean consensus-in-predicted values) for each participant for both the 3-values and 2-values models. For both models, each expert’s correlation across the 1,000 trials is statistically significant (p ≤ 0.001). The mean (median, minimum, maximum) Spearman correlation for the 3-values model is 0.837 (0.842, 0.629, 0.963) and is statistically significant (p ≤ 0.001). In the 2-values model, the mean (median, minimum, maximum) Spearman correlation is 0.831 (0.841, 0.616, 0.963), which also is statistically significant (p ≤ 0.001). The high correlations reflect high degrees of consensus-in-predicted values. The similarity of the mean, median, minimum, and maximum Spearman correlations in both sets of simulations indicates that the results are not sensitive to the two different ways that the model was operationalized (Note 6).

Table 3. Degree of Agreement in Predictions Across Models (Within-Subjects): Mean Spearman Correlation of Predictions from Forensic Experts’ AHP and Subjective Model Weights.

| Subject | Spearman Correlation (a), Model 1: 3-Values (b) | Spearman Correlation (a), Model 2: 2-Values (c) |
|---|---|---|
| 1 | 0.710 | 0.721 |
| 2 | 0.890 | 0.890 |
| 3 | 0.917 | 0.899 |
| 4 | 0.812 | 0.781 |
| 5 | 0.914 | 0.912 |
| 6 | 0.825 | 0.833 |
| 7 | 0.911 | 0.900 |
| 8 | 0.858 | 0.855 |
| 9 | 0.924 | 0.925 |
| 10 | 0.799 | 0.774 |
| 11 | 0.896 | 0.879 |
| 12 | 0.912 | 0.905 |
| 13 | 0.629 | 0.616 |
| 14 | 0.783 | 0.796 |
| 15 | 0.829 | 0.820 |
| 16 | 0.764 | 0.747 |
| 17 | 0.903 | 0.894 |
| 18 | 0.963 | 0.958 |
| 19 | 0.903 | 0.891 |
| 20 | 0.955 | 0.963 |
| 21 | 0.802 | 0.789 |
| 22 | 0.821 | 0.806 |
| 23 | 0.764 | 0.774 |
| 24 | 0.677 | 0.691 |
| 25 | 0.867 | 0.868 |
| 26 | 0.753 | 0.747 |
| 27 | 0.935 | 0.940 |
| 28 | 0.823 | 0.806 |
| 29 | 0.901 | 0.909 |
| 30 | 0.678 | 0.665 |
| 31 | 0.842 | 0.836 |
| 32 | 0.898 | 0.892 |
| 33 | 0.838 | 0.841 |
| 34 | 0.911 | 0.912 |
| 35 | 0.681 | 0.650 |
| Mean | 0.837 | 0.831 |
| Median | 0.842 | 0.841 |
| Minimum | 0.629 | 0.616 |
| Maximum | 0.963 | 0.963 |

Note: Each individual correlation is significant (p ≤ 0.001).
(a) The mean Spearman correlation between the predicted outcomes of 1,000 trials, with predictions based upon 25 AHP model weights and 25 subjective model weights. Pearson correlations are essentially identical.
(b) 3-Values model. The 1,000 random trials assign one of three values to each of the 25 weights: 0.8, 0.5, and 0.2, to represent high, medium, and low risk.
(c) 2-Values model. The 1,000 random trials assign one of two values to each of the 25 weights: 1.0 and 0.0, to represent presence or absence of perceived risk.
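As a quick check on the within-subject results, the summary statistics quoted for Table 2 can be recomputed from the 35 individual correlations. The sketch below simply copies the Table 2 values into a list (the variable names are illustrative) and uses Python's standard `statistics` module.

```python
from statistics import mean, median

# Within-subject Spearman correlations for subjects 1-35, copied from Table 2.
table2 = [
    0.580, 0.870, 0.864, 0.788, 0.882, 0.836, 0.885, 0.853, 0.661, 0.619,
    0.740, 0.688, 0.661, 0.740, 0.363, 0.623, 0.655, 0.799, 0.589, 0.843,
    0.650, 0.719, 0.625, 0.624, 0.769, 0.502, 0.764, 0.616, 0.913, 0.266,
    0.551, 0.612, 0.791, 0.894, 0.457,
]

summary = {
    "mean": round(mean(table2), 3),
    "median": round(median(table2), 3),
    "minimum": min(table2),
    "maximum": max(table2),
}
print(summary)
# {'mean': 0.694, 'median': 0.688, 'minimum': 0.266, 'maximum': 0.913}

# Subject numbers with the two lowest correlations (1-based).
lowest_two = sorted(range(1, len(table2) + 1), key=lambda s: table2[s - 1])[:2]
print(lowest_two)  # [30, 15]
```

The two lowest correlations belong to participants 30 and 15, the same two participants whose correlations the Table 2 note identifies as not statistically significant.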
Consensus Among Experts

Consensus of Model Weights Across Experts (Between Subjects)

Individual correlations for all possible pairs of the 35 participants’ AHP weights were computed, resulting in 595 correlations; then, the process was repeated for the subjective weights. As shown in Table 4, the mean Spearman correlation for the 595 correlations is 0.459 for the AHP weights and 0.634 for the subjective weights, which are both statistically significant at the p = 0.001 level. These mean correlations reflect a moderate to moderately high degree of consensus-in-principle among the forensic experts’ decision models. Further, note that for 595 total subjective model correlations, 96% (569) are statistically significant, which indicates a high degree of agreement. For AHP correlations, 68% of individual correlations are statistically significant, which reflects a moderately high degree of agreement. Prior AHP studies typically have reported low levels of consensus-in-principle (Apostolou & Hassell, 1993; Webber & Hassell, 1997).

Table 4. Degree of Agreement in Model Weights Across Experts (Between Subjects).

| Decision Model | Mean Spearman Correlation (a) | Number of Correlations (b) | Number of Significant Correlations (c) | Number of Positive Correlations |
|---|---|---|---|---|
| AHP | 0.459 | 595 | 405 (68%) | 567 (95%) |
| Subjective | 0.634 | 595 | 569 (96%) | 595 (100%) |

(a) A correlation measure for the set of 25 decision weights is calculated for each possible pair of participants. The mean Spearman correlation is computed as the average of the correlation measures for the 595 pairs of participants. It indicates the degree of agreement among the set of raters. Both Spearman correlations are significant at the p = 0.001 level. The results are essentially identical if Kendall’s W is used as an alternative interrater reliability measure.
(b) The number of possible pairings of the 35 participants is calculated as n!/[k!(n − k)!], with n = 35 and k = 2.
(c) The number of significant individual correlations (p ≤ 0.05).

Consensus in Predictions Across Experts (Between Subjects)

Spearman correlations were computed for all possible 595 pairs of the 35 participants’ AHP model predictions, and the process was repeated using predictions based upon the subjective models. Table 5 reports the mean Spearman correlation of the AHP model predictions for both the 3-values (0.637) and 2-values (0.662) models. Corresponding Spearman correlations based upon subjective weights are 0.779 and 0.790, respectively. Further, Table 5 reports that all 595 individual correlations are positive and statistically significant (p ≤ 0.05) for both the 3-values and 2-values models. These results reflect moderately high to high consensus across subjects. Note that the power of the tests reported in Table 5 is greater than that of the tests reported in Table 4 because the results in Table 5 reflect simulated responses over 1,000 trials.
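The pair count and the percentages reported in Table 4 follow from simple counting. The sketch below is an arithmetic check only; the `share` helper and dictionary labels are illustrative names, and the counts are those reported in Table 4.

```python
from math import comb

n_experts = 35
n_pairs = comb(n_experts, 2)  # n! / [k! (n - k)!] with n = 35, k = 2
print(n_pairs)  # 595


def share(count, total=n_pairs):
    """Percentage of the pairwise correlations, rounded to a whole percent."""
    return round(100 * count / total)


shares = {
    "AHP significant": share(405),
    "Subjective significant": share(569),
    "AHP positive": share(567),
    "Subjective positive": share(595),
}
print(shares)
# {'AHP significant': 68, 'Subjective significant': 96, 'AHP positive': 95, 'Subjective positive': 100}
```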
Table 5. Degree of Agreement in Predictions Across Experts (Between Subjects).

| Decision Model | Simulation Model | Mean Spearman Correlation (a) | Number of Correlations (b) | Number of Significant Correlations (c) | Number of Positive Individual Correlations |
|---|---|---|---|---|---|
| AHP | 3-Values | 0.637 | 595 | 595 (100%) | 595 (100%) |
| AHP | 2-Values | 0.662 | 595 | 595 (100%) | 595 (100%) |
| Subjective | 3-Values | 0.779 | 595 | 595 (100%) | 595 (100%) |
| Subjective | 2-Values | 0.790 | 595 | 595 (100%) | 595 (100%) |

(a) The mean Spearman correlation across predictions based upon experts’ model weights. The results are essentially identical if Pearson correlations are computed.
(b) The number of correlations for every pair of 35 participants’ 25 model weights.
(c) The number of statistically significant individual correlations at p ≤ 0.05 level.

Comparison of Within-Subjects and Between-Subjects Correlations

In a final attempt to assess self-insight, the (within-subjects) mean self-insight Spearman correlations of AHP and subjective model weights (Table 2) were compared to the (between-subjects) mean consensus Spearman correlations for the AHP weights and the subjective weights (Table 4). If subjects were responding based on an internal decision model, the average within-subjects correlation would be expected to be higher than the average between-subjects correlation because two different models from the same individual who possesses self-insight should be more highly related than those from different individuals. Table 6 presents the results of a t-test (Wilcoxon rank sum test) of the null hypothesis of no difference in mean (median) Spearman correlations. The self-insight correlation is statistically significantly greater than both the consensus subjective-weight correlation (t-test p-value = 0.0075; Wilcoxon p-value = 0.0061) and the consensus AHP-weight correlation (both t-test and Wilcoxon p-values = 0.0001). These results reflect a higher degree of agreement between two different models for the same subject than the degree of agreement across subjects for the same model. This finding also supports the assessment that the experts have a high degree of self-insight.

Table 6. Comparison of Within-Subjects and Between-Subjects Correlations.

| Model Weights | Mean Spearman Correlation | t-Test | Wilcoxon Rank Sum Test |
|---|---|---|---|
| Within-subjects (self-insight correlations): AHP and subjective weights (a) | 0.694 | | |
| Between-subjects (consensus-in-principle): Subjective weights (b) | 0.634 | p = 0.0075 | p = 0.0061 |
| Between-subjects (consensus-in-principle): AHP weights | 0.459 | p = 0.0001 | p = 0.0001 |

(a) Mean of 35 correlations, see Table 2.
(b) Mean of 595 correlations, see Table 4.

Table 7 repeats the analysis using consensus-in-predicted values, and reflects the same results as in Table 6. The Table 6 and Table 7 results are interpreted as reinforcing the conclusion that forensic experts have high self-insight. The degree of agreement between the experts’ AHP and subjective models was higher than the degree of agreement across experts for either the AHP or subjective models.

Table 7. Comparison of Within-Subjects and Between-Subjects Correlations.

| Model Predictions (Consensus-in-Predicted Values) | 3-Values Model: Mean Spearman Correlation | 3-Values Model: t-Test | 3-Values Model: Wilcoxon Rank Sum Test | 2-Values Model: Mean Spearman Correlation | 2-Values Model: t-Test | 2-Values Model: Wilcoxon Rank Sum Test |
|---|---|---|---|---|---|---|
| Within-subjects: AHP and subjective model predictions (a) | 0.837 | | | 0.831 | | |
| Between-subjects: AHP model predictions (b) | 0.637 | p = 0.0001 | p = 0.0001 | 0.662 | p = 0.0001 | p = 0.0001 |
| Between-subjects: Subjective model predictions (b) | 0.779 | p = 0.002 | p = 0.003 | 0.790 | p = 0.003 | p = 0.004 |

(a) Mean of 35 correlations, see Table 3.
(b) Mean of 595 correlations, see Table 5.
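The comparison of within-subjects and between-subjects correlations rests on a two-sample t-test and a Wilcoxon rank sum test. The sketch below shows how such test statistics could be computed with the standard library. It makes several assumptions that are not in the paper: the t statistic is the Welch (unequal-variance) variant, the rank sum test uses a normal approximation without a tie correction, the function names are invented, and the between-subjects values are made up because the 595 pairwise correlations are not reported. Only the within-subjects values (the first ten subjects in Table 2) come from the source.

```python
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic for a difference in means (unequal variances)."""
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    return (mean(x) - mean(y)) / se


def mann_whitney_u(x, y):
    """U statistic: number of pairs where x exceeds y, counting ties as one half."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def rank_sum_z(x, y):
    """Normal approximation to the rank sum test (no tie correction)."""
    n1, n2 = len(x), len(y)
    mu = n1 * n2 / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    return (mann_whitney_u(x, y) - mu) / sigma


# Within-subjects self-insight correlations: the first ten subjects in Table 2.
within = [0.580, 0.870, 0.864, 0.788, 0.882, 0.836, 0.885, 0.853, 0.661, 0.619]
# Between-subjects correlations: MADE-UP values for illustration only.
between = [0.21, 0.35, 0.40, 0.44, 0.46, 0.48, 0.52, 0.55, 0.61, 0.70]

print(welch_t(within, between), rank_sum_z(within, between))
```

Positive statistics indicate that the within-subjects correlations tend to exceed the between-subjects correlations, which is the direction reported in Tables 6 and 7; the magnitudes printed here depend entirely on the made-up values.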
CONCLUSION

In this study, Apostolou et al. (2001b) was replicated and extended. Apostolou et al. (2001b) provided results showing that three auditor group decision models assessing SAS No. 82 risk factors (internal, external Big 5, and external regional/local) were not statistically significantly different. The replication is that a different subject pool was used with the same AHP modeling approach, although direct comparisons of the AHP models in Apostolou et al. (2001b) were not made. However, the forensic experts and the subjects in Apostolou et al. (2001b) ranked the risk factor categories in the same order, with management characteristics and influence over the control environment most important. The extension is that additional analysis was conducted to examine the subjects’ degree of self-insight related to the AHP modeling and consensus among the subjects.

Descriptive information about forensic experts’ AHP and subjective models related to the relative importance of management fraud risk factors is provided. The extant literature on self-insight and consensus is expanded by demonstrating that forensic experts possess relatively high degrees of both. Because self-insight and consensus measures are used to evaluate judgment quality, it appears that forensic experts provide useful data for research into the quality of the management fraud risk assessment. The task in this study differs from that used in many other expert judgment assessments because factors that are explicitly defined by professional standards are used, which may be why higher levels of self-insight and consensus are found.

How the AHP and subjective model weights regarding the relative importance of SAS No. 82 factors can be obtained is also described. The research technique is relatively easy to apply, and the time commitment of subjects
is reasonable. While it is not appropriate to generalize the results of a single study and make sweeping, global statements, some speculation about what these results could mean for auditing practice is offered. First, because the results support the conclusion that Big 5 forensic experts possess high self-insight, it might be possible for a firm to use the decision weights of its experts for audit training purposes or as the basis for analytical procedures. The decision models of forensic experts could be used for making predictions about the likelihood of management fraud. If a model suggests a high likelihood of fraud, the forensic experts could be brought in as consultants to manage risk exposure on a job. Models could be validated and refined by applying them to actual audit situations. Also, these decision models could be used to design audit expert systems and decision support systems. This process could be used in an actual auditing setting as an analytical procedures tool. Over time, a firm could gather evidence about the a priori risk ratings and compare them to actual results, although this task could be difficult because of the low occurrence rate of material management fraud. The prediction scores could then be used more formally in assessments about the likelihood of management fraud.

A limitation of the research method used is that the factors were considered individually. It is possible that factors interact such that certain factor groupings may signal an increased likelihood of management fraud. Other limitations result from certain design choices (e.g. allocating 100 points to factors within each factor category rather than 100 points across all 25 factors, or the way the simulation analysis was operationalized). DeZoort (1998) argues that using the 100-point allocation measure for subjective weights unnecessarily creates measurement error.
Another limitation is that the data cannot show what an auditor will do with information even if he or she determines that fraud risk is high (i.e. will the follow-up be appropriate to the circumstances). These limitations suggest the need for additional research.
NOTES

1. Statement on Auditing Standards No. 99, Consideration of Fraud in a Financial Statement Audit, was issued by the AICPA’s Auditing Standards Board in October 2002. SAS No. 99 emphasizes professional skepticism, tests for management override of controls, the use of unpredictable audit tests, and requires discussions with management about fraud awareness. A new feature of SAS No. 99 is that the management fraud risk factors are categorized as Incentives/Pressures, Opportunities, and Attitudes/Rationalizations in explicit recognition that these three conditions exist when fraud is present. However, the relative importance of the risk factors is not expressed, which means that the findings of this research continue to be relevant. SAS No. 82 is referenced in this discussion because the research was conducted prior to the issuance of SAS No. 99.

2. Because this is a replication and extension of Apostolou et al. (2001b), the literature review is abbreviated; see that paper for a more extensive literature review.

3. Reilly and Doherty (1989) asked student subjects to provide subjective weights in a judgment task. Self-insight was operationalized as the ability of a subject to choose his or her own decision model from an available set of all subjects’ decision models. Seven of 11 subjects selected their own decision models, and two other subjects were able to narrow their selection to two decision models, one of which was their model. The authors concluded that subjects had a high degree of self-insight.

4. This study is a follow-up to Apostolou et al. (2001b), which investigated the relative importance of the 25 SAS No. 82 risk factors to (1) 43 field auditors at four of the Big 5 professional service firms; (2) 50 field auditors at four regional/local accounting firms; and (3) 47 practicing internal auditors. Apostolou et al. (2001b) report the AHP weights of relative importance for those three groups. In this paper, the AHP weights of the 35 forensic auditors are not compared to the AHP weights of the three auditor groups in Apostolou et al. (2001b) because the focus is on the forensic auditors. The AHP category weights for the forensic auditors and the auditor groups reported in Apostolou et al. (2001b) are not statistically significantly different, although several significant differences are associated with individual risk factors.

5. Pearson correlation statistics were also computed but are not presented. Inferences throughout the paper do not change if Pearson correlations are used instead of Spearman correlations.

6. To test Schmitt and Levine’s (1977) suggestion that correlations between predicted values resulting from using statistical and subjective weights might reflect more self-insight than the comparison of the statistical and subjective weights, a t-test (Wilcoxon signed rank test) comparing the mean (median) of the two distributions of correlation statistics was conducted. The mean (median) of the distribution of predicted value correlations was statistically significantly greater than the mean (median) of the distribution of self-insight correlations based on comparisons of the decision weights (p = 0.0001 for both tests). Mear and Firth (1987) also found that self-insight measures based upon predictions were higher than self-insight measures based upon model weights for financial analyst subjects.
ACKNOWLEDGMENTS

We thank the AICPA, which provided financial support and help in securing the participation of forensic experts, the forensic experts who provided data, and colleagues at the 2001 ABO Conference.
REFERENCES

American Institute of Certified Public Accountants (AICPA) (1997). Consideration of Fraud in a Financial Statement Audit. Statement on Auditing Standards No. 82. New York, NY: AICPA.
American Institute of Certified Public Accountants (AICPA) (2002). Consideration of Fraud in a Financial Statement Audit. Statement on Auditing Standards No. 99. New York, NY: AICPA.
Apostolou, B., & Hassell, J. M. (1993). An overview of the analytic hierarchy process and its use in accounting research. Journal of Accounting Literature, 12, 1–28.
Apostolou, B., Hassell, J. M., & Webber, S. A. (2001a). The importance of management fraud risk factors: Ratings by forensic experts. The CPA Journal (October), 2–7.
Apostolou, B., Hassell, J. M., Webber, S. A., & Sumners, G. E. (2001b). The relative importance of management fraud risk factors. Behavioral Research in Accounting, 13, 1–24.
Ashton, R. H., & Brown, P. R. (1980). Descriptive modeling of auditors’ internal control judgments: Replication and extension. Journal of Accounting Research, 18, 269–277.
Ashton, R. H., & Kramer, S. S. (1980). Students as surrogates in behavioral accounting research: Some evidence. Journal of Accounting Research, 18, 1–15.
Beasley, M. S., Carcello, J. V., & Hermanson, D. R. (1999). Fraudulent financial reporting: 1987–1997. Committee of Sponsoring Organizations of the Treadway Commission.
Bernardi, R. A. (1994). Fraud detection: The effect of client integrity and competence and auditor cognitive style. Auditing: A Journal of Practice & Theory, 13(Supp.), 68–84.
Colbert, J. L. (1988). Inherent risk: An investigation of auditors’ judgments. Accounting, Organizations and Society, 13, 111–121.
DeZoort, F. T. (1998). An analysis of experience effects on audit committee members’ oversight judgments. Accounting, Organizations and Society, 23, 1–21.
Einhorn, H. J. (1974). Expert judgment: Some necessary conditions and an example. Journal of Applied Psychology, 59, 562–571.
Eining, M. M., Jones, D. R., & Loebbecke, J. K. (1997). Reliance on decision aids: An examination of auditors’ assessment of management fraud. Auditing: A Journal of Practice & Theory, 16(Fall), 1–19.
Expert Choice, Inc. (1998). Team expert choice™ user manual. Pittsburgh, PA.
Green, B. P., & Choi, J. H. (1997). Assessing the risk of management fraud through neural network technology. Auditing: A Journal of Practice & Theory, 16(Spring), 14–28.
Hackenbrack, K. (1993). The effect of experience with different sized clients on auditor evaluations of fraudulent financial reporting indicators. Auditing: A Journal of Practice & Theory, 15(Spring), 99–110.
Hamilton, R. E., & Wright, W. F. (1982). Internal control judgments and effects of experience: Replications and extensions. Journal of Accounting Research, 20, 756–766.
Hassell, J. M., & Arrington, C. E. (1989). A comparative analysis of the construct validity of coefficients in paramorphic models of accounting judgments: A replication and extension. Accounting, Organizations and Society, 14, 527–537.
Heiman-Hoffman, V. B., Morgan, K. P., & Patton, J. M. (1996). The warning signs of fraudulent financial reporting. Journal of Accountancy (October), 75–77.
Hooks, K. L., Kaplan, S. E., & Schultz, J. J., Jr. (1994). Enhancing communication to assist in fraud prevention and detection. Auditing: A Journal of Practice & Theory (Fall), 86–117.
Landsittel, D. L., & Bedard, J. C. (1997). Fraud and the auditor: Current developments and ongoing challenges. The Auditor’s Report (Fall), 3–4.
Loebbecke, J. K., Eining, M. M., & Willingham, J. J. (1989). Auditors’ experience with material irregularities: Frequency, nature, and detectability. Auditing: A Journal of Practice & Theory, 9(Fall), 1–28.
Mancino, J. (1997). The auditor and fraud. Journal of Accountancy (April), 32–36.
Mear, R., & Firth, M. (1987). Cue usage and self-insight of financial analysts. The Accounting Review, 62, 176–182.
Nieschweitz, R. J., Schultz, J. J., Jr., & Zimbelman, M. F. (2000). Empirical research on external auditors’ detection of financial statement fraud. Journal of Accounting Literature, 19, 190–246.
Palmrose, Z. V. (1987). Litigation and independent auditors: The role of business failures and management fraud. Auditing: A Journal of Practice & Theory (Spring), 90–103.
Pincus, K. V. (1989). The efficacy of a red flags questionnaire for assessing the possibility of fraud. Accounting, Organizations and Society, 14(1/2), 153–163.
Reckers, P. M. J., & Schultz, J. J., Jr. (1993). The effects of fraud signals, evidence order, and group-assisted counsel on independent auditor judgment. Behavioral Research in Accounting, 5, 124–144.
Reilly, B. A., & Doherty, M. E. (1989). A note on the assessment of self-insight in judgment research. Organizational Behavior and Human Decision Processes, 44, 123–131.
Reilly, B. A., & Doherty, M. E. (1992). The assessment of self-insight in judgment policies. Organizational Behavior and Human Decision Processes, 53, 285–309.
Saaty, T. L. (1986). Decision making for leaders. Pittsburgh, PA: RWS Publications.
Saaty, T. L. (1988). The analytic hierarchy process. Pittsburgh, PA: RWS Publications.
Saaty, T. L. (1994). The analytic hierarchy process: Some observations on the paper by Apostolou and Hassell. Journal of Accounting Literature, 13, 212–219.
Schmitt, N., & Levine, R. L. (1977). Statistical and subjective weights: Some problems and proposals. Organizational Behavior and Human Performance, 20, 15–30.
Slovic, P., Fleissner, D., & Bauman, W. S. (1972). Analyzing the use of information in investment decision making: A methodological proposal. The Journal of Business, 45, 283–301.
Slovic, P., & Lichtenstein, S. (1971). Comparison of Bayesian and regression approaches to the study of information processing in judgment. Organizational Behavior and Human Performance, 6, 649–744.
Surber, C. F. (1985). Measuring the importance of information in judgment: Individual differences in weighting ability and effort. Organizational Behavior and Human Decision Processes, 35, 156–178.
Webber, S. A., & Hassell, J. M. (1997). A comparison of AHP and ANOVA decision modeling techniques in internal control procedure evaluation. Advances in Accounting, 15, 209–242.
Zimbelman, M. F. (1997). The effects of SAS no. 82 on auditors’ attention to fraud risk factors and audit planning decisions. Journal of Accounting Research, 35(Supp.), 75–104.
BUDGETARY SLACK CREATION AND TASK PERFORMANCE: COMPARING INDIVIDUALS TO COLLECTIVE UNITS

James M. Kohlmeyer III and James E. Hunton

ABSTRACT

The purpose of this study is to investigate differences between individual and collective budgeting decisions with respect to budgetary slack creation and task performance. While a great deal of research exists in the area of budgeting, to our knowledge, no prior studies have dealt with budget settings in a collective (e.g. small group or cross-functional team) environment. Accordingly, the current study examines differences in slack creation and task performance using a two (decision mode: individual vs. collective decision) by two (incentive contract: slack-inducing vs. truth-inducing) between-subjects experimental design. A total of 295 students participated in the experiment (79 individuals and 72 three-person collective units). As expected, individuals and collective decision-makers created significantly more slack under a slack-inducing contract than a truth-inducing contract. Additionally, as anticipated, collective decision-makers created more slack than individuals under a slack-inducing contract. Unexpectedly, however, collective decision-makers created more slack than individuals using a truth-inducing contract. Task performance was significantly different between individuals and collective unit members, such that the performance of individuals exceeded that of collective unit members, as hypothesized. Finally, preliminary analysis indicated that choice shift occurred in the collective units, such that the units became more cautious in setting budget goals than individuals under both incentive contract conditions.

Advances in Accounting Behavioral Research, Volume 7, 97–122
Copyright © 2004 by Elsevier Ltd. All rights of reproduction in any form reserved
ISSN: 1474-7979/doi:10.1016/S1474-7979(04)07005-X
INTRODUCTION Budgets are widely used tools in planning, organizing, directing and controlling the operations of business and governmental entities. While budgets reflect quantitative expressions of predicted performance results, they also serve to motivate subordinate performance (Chow et al., 1988). However, allowing employees to participate in setting their own budget targets provides them with an opportunity to ‘game’ the process and negotiate more easily achievable goals. Budget distortion of this nature often is referred to as budgetary slack (Merchant & VanderStede, 2000). Historically, budgetary slack research has focused on individual decisionmaking (Chow, 1983; Dunk, 1993; Young, 1985). Even though there has been an increased emphasis on teamwork in the workplace (Siegel & Sorensen, 1999), researchers have not examined the effects of collective decisions on the budgeting process. In an attempt to bridge this research gap, the purpose of this study is to examine differences between individual and collective decisions with respect to budgetary slack creation and actual task performance. In social psychology, there are conflicting views regarding the effectiveness of collective vs. individual decision-making. While some research has found that collective units perform better than individuals, other studies have found opposite effects (Isenberg, 1986; Janis, 1982; Rutledge & Harrell, 1994). The impact of incentive contracts on collective decision-making units has not been examined in social psychology or accounting, and the efficacy of a truth-inducing contract as a debiasing mechanism for collective units is uncharted territory as well. Toward this end, this study investigates differential reactions and behaviors of individuals and collective units to slack-inducing and truth-inducing incentive contracts. The current study operationalized a two (decision mode: individual vs. collective decision) by two (incentive contract: slack-inducing vs. 
truth-inducing) fully-crossed, between-subjects experimental design. A total of 295 students participated in the study. As anticipated, under a slack-inducing, as compared to truth-inducing, contract, individuals and collective units created more budgetary slack. Also as expected, under a slack-inducing contract collective units created more budgetary slack than individuals. Under a truth-inducing contract, however,
individuals unexpectedly created significantly less budgetary slack than collective units. Additionally, as hypothesized, the mean performance of collective unit members was significantly less than the mean performance of individuals. Finally, post hoc analysis suggests evidence of choice shift in the collective units, which helps to explain observed budgetary slack differences between individuals and collective units. The current study contributes to extant participative budgeting and slack research along several avenues. First, the differential impact of individuals versus collective decision-makers with regard to budgetary slack creation and actual task performance is examined for the first time in a budgeting context. Second, the mitigating impact of truth-inducing contracts on collective budgeting units is investigated. Finally, this study provides preliminary evidence that choice shift in a collective budgeting environment leans toward caution, not risk. The next section reviews relevant literature and presents study hypotheses. The following sections describe the research method and analyze study results. The final section summarizes the research findings and suggests future research ideas in the area of collective participation in the budgeting process.
THEORY AND HYPOTHESES
Over the last four decades, management accounting researchers have been concerned with the issue of budgetary slack in organizations. Managers and their subordinates create budgetary slack when they purposefully build excess resources into their budgets or knowingly understate their productive capabilities (Baiman & Evans, 1983; Young & Lewis, 1995). Budgetary slack has also been described as the express incorporation of budget amounts that are easier to attain (Lukka, 1988; Merchant, 1985; Young, 1985). In this study, budgetary slack is defined as the extent to which participants understate their true productive capabilities. Many studies have found evidence of budgetary slack in organizations (e.g. Cammann, 1976; Kamin & Ronen, 1981; Kirby et al., 1991; Leibenstein, 1979; Lukka, 1988; Merchant, 1985; Merchant & Manzoni, 1989; Onsi, 1973; Schiff & Lewin, 1968; Umapathy, 1987). Having recognized that budgetary slack regularly occurs in business and governmental entities, researchers have focused on understanding why slack occurs, what factors affect its creation and how to minimize its effect. Principally, extant studies have investigated the creation of budgetary slack by individuals engaged in participative budgeting contexts. Shields and Shields (1998) define participative budgeting as a process whereby subordinates are involved with and have influence on the determination of their budgets. Agency theory assumes that subordinates know more than their superiors
about their tasks and task environments; thus, agency theorists characterize participative budgeting as a means by which superiors attempt to gain private information from subordinates and, as a consequence, reduce uncertainty (Baiman & Evans, 1983; Kirby et al., 1991; Shields & Shields, 1998). Information sharing via participative budgeting allows superiors to design and offer subordinates efficient goal-congruent incentive contracts aimed at realizing the subordinates’ true productive capabilities.
Incentive Contracts
The study of incentive contracts within the framework of budgetary slack relies primarily on agency theory as its theoretical foundation (Douglas & Wier, 2000). Since agency theory focuses on the relationship between agents and principals, researchers have suggested that a properly designed reward system may induce subordinates to supply more accurate budgets (Chow, 1983). Unfortunately, setting an appropriate budget can be a major problem when subordinates have better information than their superior concerning their true productive capabilities, and when the subordinates’ pay is based on budgeted performance (Chow et al., 1988). While planning benefits may arise from participative budgeting, an incentive problem is often created. For instance, when a subordinate’s pay increases as budget difficulty decreases, ceteris paribus, the subordinate may bias the communication of private information such that a relatively easy budget is set, thereby creating slack. Without proper incentives to induce truthful communication and motivate best performance, some benefits of participative budgeting to an organization will be lost (Chow et al., 1988). To deal with these problems, researchers have designed and tested two types of incentive contracts – slack-inducing and truth-inducing. Prior studies demonstrate that when subordinates hold private information about their productive capabilities, participate in setting production targets and receive bonuses based on exceeding the production targets, they have an incentive to underestimate their productive capabilities in order to receive larger bonuses (Kirby et al., 1991). Thus, subordinates will typically build slack into budgets that provide opportunities to earn extra compensation, shirk responsibility on the job or both (Young, 1985). Hence, the type of contract that provides a fixed wage plus a bonus for performance exceeding the production target or budget has been defined as slack-inducing.
Analytical research in accounting and economics has examined another form of incentive contract, called truth-inducing, which helps to alleviate incentive problems regarding subordinate motivation and communication of private
information (Christensen, 1982; Weitzman, 1976). Under certain assumptions, truth-inducing schemes induce subordinates to prefer budgets that are closely aligned with their ‘true’ expected performance. In addition, such schemes provide incentives to maximize performance regardless of the budget (Chow et al., 1988). Truth-inducing contracts have been constructed to solve the problem of misrepresentation of subordinates’ private information because such contracts impose a penalty for such distortion (Libby, 2002). Several empirical tests have compared the amount of budgetary slack created by subordinates rewarded under slack-inducing or truth-inducing incentive contracts. Chow et al. (1988) observed that when information asymmetry was present, slack was significantly lower under the truth-inducing scheme than under the slack-inducing scheme. Waller (1988) found that when the truth-inducing scheme was introduced, slack created by risk-neutral participants decreased significantly while slack created by risk-averse participants did not change. Waller and Bishop (1990) reported that overall firm profits were lower and misrepresentations higher under a slack-inducing incentive contract when compared to the truth-inducing scheme. Finally, Chow et al. (2000) tested five different mechanisms for encouraging truthful, upward communication of information within decentralized organizations. Results of the Chow et al. (2000) study indicated that the truth-inducing contract led to significantly less misrepresentation by subordinates than did a slack-inducing, linear profit-sharing pay scheme. As previously stated, the budgetary slack literature provides empirical evidence that under conditions of information asymmetry, truth-inducing incentive contracts appear to reduce the amount of budgetary slack when compared to slack-inducing contracts.
Accordingly, we do not offer a hypothesis related to how individuals respond to slack-inducing and truth-inducing contracts, as prior findings in this regard are quite robust. However, we include these two conditions in our experimental design for they serve as a baseline against which to compare slack creation in a collective decision-making environment.
Collective Decision Making
In recent years there has been increased interest in collective decision-making processes (e.g. small groups and teams). Organizations are making more decisions in the context of collective environments rather than placing such responsibilities on individuals. Accordingly, recent changes in business practices are influencing the traditional managerial accounting environment and necessitating a re-examination of prior research efforts (Young & Lewis, 1995). We suggest that similarities in how individuals and collective units respond to incentive contracts
are rooted in motivational incentives (Geen, 1991). That is, a slack-inducing contract encourages the creation of budgetary slack and a truth-inducing contract does not. In this light, we expect that budgetary slack creation by collective units under both types of incentive contracts should be consistent with individual participation research. However, we expect differences between individuals and collective decision-makers in the magnitude of budgetary slack creation due, in part, to the social cognition that takes place during collective discussion of performance targets (Hackman, 1993; Hunton, 2001; Stasson & Bradshaw, 1995), which enhances all members’ understanding of how to “game” the budgeting process through socially-enhanced procedural knowledge (Wittenbaum & Stasser, 1996). Also, as discussed in an upcoming section, judgment differences between individuals and collective units are linked to varying risk propensities (Rutledge & Harrell, 1994). We begin by hypothesizing how collective decision-makers will react differently from individuals with respect to slack-inducing and truth-inducing contracts.
Slack Differences Between Individuals and Collective Units
One of the basic assumptions of agency theory is that individuals will act to maximize their self-interests (Baiman & Evans, 1983). While agency theory does not explicitly discuss collective decision-makers, it suggests that during collective discussion budget participants will also attempt to maximize their self-interests. Schopler et al. (1993) tested the degree to which collective units, as compared to individuals, pursued self-interest acts. They found that collective unit members tended to be more focused on self-interest acts than individuals, and that groups often provided a social support structure that encouraged members to maximize their self-interests.
Hence, collective units should “game” the budget process to a greater extent than individuals in an attempt to capitalize on each member’s self-interest. Prior research also suggests that collective units are more competitive than individuals, even in the absence of collective extrinsic rewards (Insko et al., 1990; Schopler & Insko, 1992; Schopler et al., 1991, 1993). Such competitive feelings build during intra-group social processes, particularly collective discussions, which foster an atmosphere of beating the perceived competition (Morgan & Tindale, 2002). Hence, collective discussion of budgetary incentive contracts should engender a collectively-oriented competitive spirit, thereby resulting in collective units attempting to “game” the budgetary process more aggressively than individuals in an effort to win. Further, accountability theory suggests that individuals, as compared to collective units, may be more reluctant to advocate extreme positions or make risky decisions because individuals inherently feel more accountable for their
decisions, even in the absence of external accountability mechanisms (Fandt & Ferris, 1990; Frink & Klimoski, 1998; Kroon et al., 1992; Tetlock, 1985). Individual decision-makers’ heightened feelings of accountability arise, in part, because there is no collective unit within which individuals can hide or shirk responsibility (Linden et al., 1999). Collective unit members, on the other hand, tend to feel less personal accountability than individuals; consequently, they often advocate more aggressive collective positions because each member feels that (s)he can hide within the group should (s)he fall short of pulling his/her share of the load (BarNir, 1998; Bateman et al., 1987). Accordingly, the extent of slack that individuals build into their budgets should be less extreme than that of collective units due to heightened feelings of personal accountability. Lastly, group discussion should result in collective unit members being more fully informed than individuals regarding the merits of the incentive contract condition to which they are assigned, primarily because of the enhanced social cognition of collective unit members as compared to individuals (Hunton, 2001; Ono & Davis, 1988; Stasson & Bradshaw, 1995). While there may be wide discrepancies in understanding the advantages of different types of contracts at the individual level, collective units should provide a more accurate, unified understanding of the incentive contract because collective members bring different skills and capabilities to the decision process. Thus, on average, collective units should better understand the contractual conditions and how to “game” the process to achieve maximum gain. Hence, under a slack-inducing contract, collective units should create more budgetary slack than individuals due to heightened aggressiveness to maximize self-interests, greater propensity to be competitive, lower feelings of personal accountability and enhanced social cognition of how to game the budget-setting process.
Regarding the latter issue, collective discussion should acutely reveal that the collective unit members could maximize their compensation by setting a budget well below their actual performance capability. Accordingly, the following hypothesis is presented (alternative form):
H1. The percentage of budgetary slack will be higher in the collective condition, as compared to the individual condition, under a slack-inducing contract.
Conversely, collective discussion should clearly reveal that under the truth-inducing incentive contract, collective unit members could best maximize their compensation by setting their budget as close as possible to their true productive capabilities. As discussed, collective group members are expected to be more risky or aggressive than individuals in this regard due to higher levels of self-interest, competition and cognition, and lower perceptions of personal accountability. As a result, while both individuals and collective units will attempt to estimate their
true productive capabilities, individuals should err on the side of under-estimation due to relative conservatism and collective unit members should err on the side of over-estimation due to relative aggression. The expectation of collective units is further supported by choice shift research, which indicates that collective unit members might over-estimate their “true” capability, as the momentum of shift during collective discussion is often more extreme than the initial predisposition of individuals prior to discussion (BarNir, 1998; Blumberg, 1994; Butler & Crino, 1992). Therefore, the following hypothesis is presented (alternative form):
H2. The percentage of budgetary slack will be lower in the collective condition, when compared to the individual condition, under a truth-inducing contract.
Performance Differences between Individuals and Collective Units
While the central focus of the current study is on the creation of budgetary slack, we also examine performance differences between individuals and collective units. Small group theory posits that, on the whole, collective members will perform better than individuals when the group accepts a common goal and has a history of working with each other (e.g. Carless & DePaola, 2000; Littlepage et al., 1997; Mennecke et al., 1992; Stasson & Bradshaw, 1995). However, for ad hoc groups, lack of social attraction and collective responsibility among members can lead to shirking or social loafing (e.g. Geen, 1991; George, 1995; Shepperd & Taylor, 1999). That is, some group members will perform below their true capabilities and “hide” within the group. In the current study, the collective units were comprised of individuals who, prior to the experiment, had no history of working with each other in a collective environment.
Hence, we would expect that an ad hoc group of this nature would, on average, display lower performance than individuals due to shirking and social loafing – actions which arise from fairly low intrinsic feelings of personal accountability toward the other group members. Accordingly, the final hypothesis is offered (alternative form):
H3. The mean performance of individuals will be significantly higher than the mean performance of collective unit members.
EXPERIMENTAL METHOD
Research Design
This study employed a 2 × 2 factorial design, wherein incentive contract (truth-inducing & slack-inducing) and decision mode (individual & collective) were manipulated. The two dependent variables were the percentage of budgetary slack created and actual performance on the assigned task.
Measurement of Budgetary Slack
The operational definition of budgetary slack in this study is the difference between the participants’ self-reported expected performance after a practice period but before reading the incentive contract manipulation (i.e. Best Estimate of Production) and the participants’ self-set budget (i.e. Final Individual Budget) or the collective units’ jointly set budget (Final Collective Unit Budget) after the incentive contract manipulation (see Fig. 1). The difference is divided by expected performance (i.e. Best Estimate of Production) to normalize the measure and make it comparable across participants with different production capabilities (Stevens, 2000). The resulting percentage of budgetary slack is then compared among treatment conditions.
Fig. 1. Summary of Experimental Procedures.
Measurement of Actual Performance
The production task was adapted from Chow (1983), Chow et al. (1988), and Libby (2002). Study participants were provided with a decoding key wherein symbols were randomly assigned to each letter of the alphabet. Then, they were supplied with a list of words that had been coded in accordance with the symbols. The task was to decode as many seven-letter words as possible during a five-minute performance trial. Performance was measured as the number of correctly decoded words.
Covariates
Two covariates were examined in this study. First, individual differences in participants’ ability to correctly decode symbols into alphabetic characters were assessed. This performance capability was assessed using a five-minute practice session (before participants were exposed to the incentive contract). Chow, Cooper and Waller (1988) included performance capability as a covariate because a priori performance capability may affect the amount of budgetary slack. Second, the participants’ risk propensity was assessed because the risk attitude of individuals could affect the amount of budgetary slack they build into their final budgets. Risk propensity was measured by asking participants to respond to the following statement: “I would have to be ____% sure that I would receive $10 before I would willingly choose the gamble over receiving $5 for sure” (0–100%) (Young, 1985).
Description of Incentive Contracts
Participants in both individual and collective conditions were given either a slack-inducing or a truth-inducing incentive contract. Each contract provided a payoff table (Appendix) adapted from Libby (2002). The payoff table allowed the participant to calculate the number of raffle tickets she could earn toward the drawings of five $100 cash prizes; as such, performance rewards were directly related to performance outcomes.
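The budgetary slack measure defined under Measurement of Budgetary Slack can be illustrated with a minimal Python sketch. The function names and the example numbers are hypothetical and are not taken from the study; the collective version sums members' Best Estimates of Production, as the notes to Table 1 describe.

```python
def slack_percentage(best_estimate: float, final_budget: float) -> float:
    """Budgetary slack as a percentage of expected performance.

    best_estimate: Best Estimate of Production (set before the contract is read).
    final_budget:  Final Budget (set after the contract is read).
    """
    if best_estimate <= 0:
        raise ValueError("best_estimate must be positive")
    return 100.0 * (best_estimate - final_budget) / best_estimate


def collective_slack_percentage(member_estimates, unit_budget: float) -> float:
    """Slack for a collective unit: members' best estimates are summed
    and compared with the jointly set unit budget."""
    return slack_percentage(sum(member_estimates), unit_budget)


# Hypothetical illustrations (not data from the study):
individual = slack_percentage(best_estimate=25, final_budget=20)  # budget below estimate
negative = slack_percentage(best_estimate=20, final_budget=22)    # budget above estimate
unit = collective_slack_percentage([20, 25, 30], unit_budget=60)

print(individual, negative, unit)
```

A positive value means the budget was set below expected performance (slack); a negative value means the budget was set above it, which is how negative slack arises in the results.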
Both conditions read the following instructions: The vertical axis of the table labeled “Budgeted Number of Words to be Decoded” is your Final Budget. The horizontal axis of the table labeled “Actual Number of Words Decoded” is the actual number of words decoded in the upcoming five-minute work period. You will be given 500 tickets to start the period. In prior periods, your co-workers have been able to decode a minimum of 10 words in the five-minute period. Your supervisor will not accept a Final Budget of less than 10 words.
To give you an example of how this earnings contract works, if you chose your budget to be 25 words and then you actually decoded 30 words (actual > budget), the number of tickets that you would earn would be:
Slack-inducing contract: Tickets earned = 500 (started period with) + 875 (payoff from table) for a total of 1375 tickets.
Truth-inducing contract: Tickets earned = 500 (started period with) + 450 tickets (payoff from table) for a total of 950 tickets.
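The arithmetic of the worked example in the instructions can be restated as a short Python sketch. The payoffs (875 and 450) are the Appendix table values quoted in the instructions for a budget of 25 and actual output of 30; the function name is hypothetical, and no formula for the payoff table itself is assumed here.

```python
STARTING_TICKETS = 500  # endowment stated in the instructions


def tickets_earned(table_payoff: int, starting_tickets: int = STARTING_TICKETS) -> int:
    """Total tickets = starting endowment + payoff read from the contract's table."""
    return starting_tickets + table_payoff


# Payoffs quoted in the instructions for budget = 25, actual = 30:
slack_total = tickets_earned(875)  # slack-inducing contract
truth_total = tickets_earned(450)  # truth-inducing contract

print(slack_total, truth_total)
```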
A minimum budget of 10 words was required so that participants could not set an unrealistically low budget of zero. Ten words were chosen as the minimum budget because two pilot studies indicated that all participants were able to decode at least ten words in the performance period. In both the individual and collective conditions, the participants earned raffle tickets toward the five $100 cash prizes based on their performance.
Materials and Experimental Procedures
Experimental sessions were conducted in 75-minute periods for individuals and 90-minute periods for the collective units. The collective unit sessions took 15 minutes longer to allow for group discussion. Participants were given the opportunity to decline to participate in the study. The four treatments were randomly assigned to twenty experimental sessions. A summary of the procedures is described next (see Fig. 1 for a graphical representation). At the beginning of each session, all participants were assembled in a large room designed for behavioral experimentation. There was sufficient space between participants such that they could only see and focus on their experimental packet. An experimenter was in the room at all times to ensure that no interaction took place among participants. After reading a brief introduction and signing consent forms, all participants engaged in a 5-minute practice period. Afterward, participants in both conditions (individual and collective) self-evaluated their performance with a decoding answer key given to them by the experimenter. The experimenter later verified the performance of each participant. Then, the participants gave their Best Estimate of Production regarding the number of words they believed they could accurately decode during another 5-minute session. At this point participants in the individual condition received the incentive contract manipulation (slack-inducing and truth-inducing contracts were randomly assigned to participants during each session).
After reading the incentive contract, the individual participants submitted their Final Individual Budget. Afterward, the participants were asked to participate in a 5-minute performance task.
Participants in the collective condition were randomly assigned to three-person collective units after giving their personal Best Estimate of Production. Controls were built into the experiment to assure that collective unit members could not change their Best Estimate of Production during or after collective discussion. Next, each collective unit assembled in separate rooms that were under visual control of the experimenter. However, the experimenter was not present in the rooms in order to allow for free exchange of information and opinions among collective unit members. Once the participants gathered into their collective units, they read their incentive contract, which was randomly assigned and provided to the collective unit in an envelope while they assembled in their discussion rooms. After a 15-minute collective discussion of the incentive contract, the collective unit jointly set a Final Collective Unit Budget. Then members of collective units returned to their seats and completed the task assignment during a 5-minute performance period. After the 5-minute performance period the participants completed an exit questionnaire before a short debriefing. During the debriefing, participants were informed of the following: (1) do not discuss the study with others; (2) sign an agreement to not discuss the study with others; and (3) the specific date when the winners of the $100 cash prizes would be announced.
RESULTS
Sample
The participants chosen for this study were undergraduate business students enrolled in multiple sections of Principles of Accounting at a large state university located in the southeastern portion of the United States. The students received extra credit points for participation. Additionally, to encourage hard work and maintain a reasonably high level of interest in the study, the students were able to earn raffle tickets based on their performance. Importantly, the number of raffle tickets earned was directly related to the task performance outcome. Five separate prizes of $100 each were awarded to the five raffle drawing winners. A total of 298 students volunteered to participate in the experiment (177 females, 121 males), with a mean (standard deviation) age of 21.55 (2.85) years. Three participants failed to complete one or more necessary experimental measures, leaving a final usable sample of 295 participants. Statistical tests indicated no significant differences across treatment conditions based on age (F = 1.39, p = 0.18) or gender (χ² = 10.09, p = 0.18). In the individual condition, there were 79 participants, of which 40 performed under a slack-inducing contract and 39 performed under a truth-inducing contract.
In the collective condition, 216 individuals comprised 72 three-person collective units. Each three-person collective unit was considered as a single sample observation. Thus, there were 36 observations in each of the two collective treatment conditions (i.e. slack-inducing and truth-inducing incentive contracts). In total, there were 151 independent sample observations in the study.
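As a quick consistency check on the sample description, the following Python sketch recomputes the counts from the figures reported above. The variable names are illustrative only; every number comes from the text.

```python
volunteers = 298
incomplete = 3
usable = volunteers - incomplete  # final usable sample

individuals = {"slack-inducing": 40, "truth-inducing": 39}
n_individuals = sum(individuals.values())

unit_size = 3
collective_participants = usable - n_individuals
units, leftover = divmod(collective_participants, unit_size)

# Each individual and each three-person unit counts as one observation
# in the budgetary slack analysis.
observations = n_individuals + units

print(usable, n_individuals, collective_participants, units, observations)
```

The 151 observations apply to the slack analysis, where a unit is one observation; the performance analysis in Table 2 instead uses all 295 participants.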
Manipulation Checks
Participants responded to three statements concerning the incentive contract (truth-inducing or slack-inducing) manipulation. One statement read: “The incentive contract I worked under would motivate workers to set their budget at the number of words they could actually decode” (1 = Strongly Disagree, 7 = Strongly Agree). The mean (standard deviation) was 3.47 (2.12) in the slack-inducing condition and 5.18 (1.58) in the truth-inducing condition (t = 7.80, p < 0.01). The second statement read: “The incentive contract that I worked under would motivate workers to set their budget below the number of words they could actually decode” (1 = Strongly Disagree, 7 = Strongly Agree). The mean (standard deviation) was 5.11 (2.01) in the slack-inducing condition and 3.71 (1.88) in the truth-inducing condition (t = 6.19, p < 0.01). The final statement asked: “To what extent did your incentive contract affect how you set your Final Budget in comparison to what you really thought you could do?” (1 = Much lower than my expected performance, 5 = Equal to my expected performance, 9 = Much higher than my expected performance). The mean (standard deviation) was 3.85 (2.37) in the slack-inducing condition and 5.38 (1.52) in the truth-inducing condition (t = 6.57, p < 0.01). Based on test results, the incentive contract manipulation was considered successful. Regarding the decision mode manipulation (individual versus collective unit), as discussed earlier, these two conditions were run in separate sessions due to logistical issues and internal validity concerns. Even though an experimenter was present to ensure that individuals did not interact with each other during the sessions and that collective unit members were placed into small rooms where they engaged in collective discussion, all participants responded to a manipulation check item in this regard.
Specifically, participants responded to a statement asserting that they were able to interact with other participants during the experimental session (1 = Strongly Disagree, 7 = Strongly Agree). The mean (standard deviation) response was 1.42 (1.13) in the individual condition and 6.53 (1.08) in the collective condition (t = 35.78, p < 0.01). Based on the efficacy of the experimental controls and results of the manipulation check item, the decision mode manipulation was deemed successful.
Preliminary Analyses
A MANCOVA model was used to determine statistically significant differences between treatment conditions. The dependent variables were percentage budgetary slack created and number of words decoded during the 5-minute performance period. The independent variables were decision mode (individual vs. collective unit) and incentive contract (slack-inducing vs. truth-inducing). The covariates were the number of words decoded during the 5-minute practice session and the individuals’ (mean collective units’) risk propensity. MANCOVA tests revealed significant results for decision-maker (F = 1160.97, p < 0.01) and incentive contract (F = 17.13, p < 0.01). The two-way interaction was non-significant (F = 0.61, p = 0.54). The first covariate (5-minute practice period performance) was significant with respect to the percentage of budgetary slack (F = 22.03, p < 0.01) and the number of words decoded during the 5-minute performance period (F = 124.76, p < 0.01). The second covariate (risk propensity) was non-significant with respect to budgetary slack (F = 2.62, p = 0.108) and actual performance (F = 1.55, p = 0.15). Thus, only the first covariate was included in the upcoming ANCOVA models.
Hypothesis Testing
Hypothesis One
The first hypothesis (H1) anticipated that under a slack-inducing contract, budgetary slack would be higher for collective units than for individuals. The mean percentages of budgetary slack created by collective units (m = 26.83%) and individuals (m = 10.71%) were significantly different based on parametric (F = 30.36, p < 0.01) and non-parametric (t = 5.04, p < 0.02) test results.1 Accordingly, the first hypothesis was supported.
Hypothesis Two
The second hypothesis (H2) stated that under a truth-inducing contract collective units should create lower budgetary slack than individuals. Contrary to expectations, the results shown in Table 1 indicate that budgetary slack was higher for collective units (m = 6.61%) than individuals (m = −4.91%). The planned comparison (F = 14.02, p < 0.01) and non-parametric test (t = 11.12, p < 0.01) indicate that the two means were significantly different. Interestingly, the individuals were more aggressive in setting their budget targets, as indicated by negative slack, than were the collective units. Thus, the second hypothesis was not supported.
Table 1. Percentage of Budgetary Slack Created.

Panel A: Results of ANCOVA Testing on the Percentage of Budgetary Slack^a

| Source | d.f. | Sum-Squares | F-Ratio | p-Value |
|---|---|---|---|---|
| Covariate^b | 1 | 0.75 | 20.71 | 0.001 |
| Decision-mode^c | 1 | 0.71 | 19.57 | 0.001 |
| Incentive contract^d | 1 | 1.21 | 33.27 | 0.001 |
| Interaction term | 1 | 0.02 | 0.55 | 0.461 |
| Error | 146 | 5.30 | | |
| Total (adj.) | 150 | 7.71 | | |

Panel B: Mean (Standard Deviation) Percentage Budgetary Slack^a and [Sample Size]

| | Collective | Individual | Main Effects |
|---|---|---|---|
| Slack-inducing | 26.83% (20.42%) [36] | 10.71% (29.93%) [40] | 18.77% (26.70%) [76] |
| Truth-inducing | 6.61% (8.39%) [36] | −4.91% (14.93%) [39] | 0.85% (13.15%) [75] |
| Main effects | 16.72% (18.32%)^c [72] | 2.90% (24.76%) [79] | 9.81% (22.68%) [151] |

^a Budgetary slack for the individual [collective] condition is calculated as the expected performance (Best Estimate of Production) of each participant [sum of the expected performances for all collective unit members] prior to receiving the incentive contract [and prior to engaging in collective discussion] minus the Final Budget [Group Budget] of each participant [collective unit] after receiving the incentive contract [and after engaging in collective discussion] divided by the Best Estimate of Production. Least square means are shown.
^b For the individual [collective] condition, the covariate reflects the number of words decoded [the sum of words decoded for all collective unit members] during the 5-minute practice period.
^c Decision-maker reflects Individual versus Collective Unit.
^d Incentive Contract reflects Slack-Inducing versus Truth-Inducing.
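The slack measure defined in note a of Table 1 can be illustrated with a short calculation. The following is a minimal Python sketch, not the authors' code: the function name `budgetary_slack` and all input values are hypothetical and were chosen only to show the arithmetic, including the negative slack that arises when a final budget exceeds the best estimate of production.

```python
def budgetary_slack(best_estimate: float, final_budget: float) -> float:
    """Slack as a fraction of the best estimate of production.

    Follows the definition in note a of Table 1:
    (Best Estimate of Production - Final Budget) / Best Estimate of Production.
    """
    if best_estimate <= 0:
        raise ValueError("best_estimate must be positive")
    return (best_estimate - final_budget) / best_estimate


# Hypothetical individual: expects 20 words, budgets 16 words.
individual_slack = budgetary_slack(best_estimate=20, final_budget=16)

# Hypothetical collective unit: members' best estimates are summed first,
# then compared with the single group budget.
member_estimates = [20, 22, 18]
unit_slack = budgetary_slack(best_estimate=sum(member_estimates), final_budget=45)

# Hypothetical aggressive budget: the budget exceeds the best estimate,
# so slack is negative.
aggressive_slack = budgetary_slack(best_estimate=20, final_budget=21)

print(f"individual: {individual_slack:.2%}")
print(f"collective unit: {unit_slack:.2%}")
print(f"aggressive: {aggressive_slack:.2%}")
```

A negative value here corresponds to the kind of aggressive target setting reported for individuals under the truth-inducing contract.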
Table 2. Number of Words Decoded During the 5-Minute Performance Period.

Panel A: Results of ANCOVA Testing on Decoding Performance^a

| Source | d.f. | Sum-Squares | F-Ratio | p-Value |
|---|---|---|---|---|
| Covariate^b | 1 | 5,666.79 | 462.99 | 0.001 |
| Decision-mode^c | 1 | 61.69 | 5.04 | 0.032 |
| Incentive contract^d | 1 | 0.02 | 0.01 | 0.971 |
| Interaction term | 1 | 21.02 | 1.72 | 0.191 |
| Error | 290 | 3,537.22 | | |
| Total (adj.) | 294 | 9,389.99 | | |

Panel B: Mean (Standard Deviation) Decoding Performance^a and [Sample Size]^e

| | Collective | Individual | Main Effects |
|---|---|---|---|
| Slack-inducing | 22.19 (4.75) [108] | 23.40 (6.23) [40] | 22.52 (5.19) [148] |
| Truth-inducing | 22.22 (6.21) [108] | 24.41 (5.61) [39] | 22.80 (6.11) [147] |
| Main effects | 22.21 (5.51) [216] | 23.90 (5.91) [79] | 22.66 (5.66) [295] |

^a Independence among experimental participants is maintained, even for collective unit members, since all participants performed the decoding task in a supervised setting where interaction among participants was not allowed during the 5-minute practice and performance periods. Hence, for individuals and collective unit members, performance reflects the number of words correctly decoded during the 5-minute performance period. Least square means are shown.
^b For individuals and collective unit members, the covariate reflects the number of words decoded during the 5-minute practice period.
^c Decision-maker reflects Individual versus Collective Unit.
^d Incentive Contract reflects Slack-Inducing versus Truth-Inducing.
^e The expected mean squares of the ANCOVA were adjusted in accordance with Neter et al. (1990) for the unequal sample sizes between treatment conditions.
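As a rough check on Panel A of Table 2, each F-ratio in a standard ANCOVA table is the effect mean square divided by the error mean square. The sketch below is in Python with a hypothetical helper named `f_ratio`, and it applies that relation to the rounded sums of squares printed in the table. Because the inputs are rounded and, per note e, the expected mean squares were adjusted for unequal sample sizes, the recomputed values should only approximate the published F-ratios.

```python
def f_ratio(ss_effect: float, df_effect: int, ss_error: float, df_error: int) -> float:
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    if df_effect <= 0 or df_error <= 0 or ss_error <= 0:
        raise ValueError("degrees of freedom and error sum of squares must be positive")
    return (ss_effect / df_effect) / (ss_error / df_error)


# Rounded values as printed in Table 2, Panel A.
SS_ERROR, DF_ERROR = 3537.22, 290
effects = {
    "Covariate": (5666.79, 1),
    "Decision mode": (61.69, 1),
    "Incentive contract": (0.02, 1),
    "Interaction term": (21.02, 1),
}

recomputed = {
    name: f_ratio(ss, df, SS_ERROR, DF_ERROR) for name, (ss, df) in effects.items()
}

for name, value in recomputed.items():
    print(f"{name}: F = {value:.2f}")
```

The decision-mode and interaction terms land within a few hundredths of the printed values, while the covariate differs by somewhat more, which is consistent with rounding and the adjustment described in note e.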
Hypothesis Three The last hypothesis (H3) indicated that the mean performance of individuals would be higher than the mean performance of collective unit group members. Results from ANCOVA testing in this regard are shown in Table 2 (Panel A), as are treatment means (least square), standard deviations and sample sizes (Panel B). As indicated, there is a significant main effect for decision mode, such that the mean performance of individuals (m = 23.90) is significantly greater than the mean performance of collective unit members (m = 22.21). A planned comparison (F = 5.04, p < 0.02) and non-parametric test (t = 4.70, p < 0.02) support the significance of this finding. Accordingly, the third hypothesis is supported. While a difference of 1.69 words (23.90 − 22.21) may not seem large from a practical standpoint, it is important to remember that the performance period lasted only five minutes. If a difference of this nature (7.7% productivity loss) were extrapolated over a longer time period, the performance decrease in the collective condition could have considerable efficiency implications.
Supplemental Observation The experimental design was not conducive to collecting a Final Individual Budget from collective unit members (prior to collective discussion) because the incentive contract was revealed to the collective unit as a whole. Hence, asking collective unit members to assess their expected performance after reading the incentive contract (as with participants in the individual decision mode condition) was not possible because independence among participants was violated as soon as they gathered into their collective units. Accordingly, collective unit members were not able to submit pre-discussion budget estimates. However, participants in the individual condition did submit a Final Individual Budget in the absence of collective discussion. Therefore, a precursory analysis of choice shift can be conducted in the current study. Choice shift was measured as the difference between the Final Individual Budget submitted by participants in the individual condition and the average Final Collective Unit Budget of collective unit members (i.e. the Final Collective Unit Budget divided by three members per group). Naturally, this method assumes that the Final Individual Budget serves as a proxy for the collective members’ pre-discussion budget. We recognize the inherent limitations of such an assumption. However, all participants were undergraduate students from the same university whose mean ages and gender proportions were not significantly different across treatment conditions. Additionally, the collective unit participants were randomly assigned to ad hoc groups. Thus, comparing the Final Individual Budget to the average Final Collective Unit Budget may not be entirely inappropriate. The analysis revealed that choice shift seemed to occur within both collective conditions (see Table 3). 
Within the individual slack-inducing contract condition, there was a mean Final Individual Budget of 16.45 words, whereas the average collective unit budget (per group member) was 12.94 words (t = −3.10, p < 0.01). Within the individual truth-inducing contract condition, there was a mean Final Individual Budget of 19.41 words, while the collective mean budget (per member) was 17.66 words (t = −1.75, p = 0.08). Based on this precursory analysis, there appeared to be a significant cautious choice shift when collective units performed under a slack-inducing contract, such that the collective unit reduced the number of budgeted words to be decoded per unit member, presumably to receive more compensation and reduce performance risk. Additionally, there was a marginally significant cautious choice shift under a truth-inducing contract, although choice shift theory would predict a risky shift in this circumstance. Hence, there is some indication that cautious choice shift occurred during collective budgeting in this study in both incentive contract conditions. Future studies should more rigorously investigate the choice shift phenomenon in the context of collective budget participation.
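The choice shift measure described above reduces to a simple difference, which the following minimal Python sketch illustrates. The function names are hypothetical; the four means are those reported in the text and in Table 3, the group size of three comes from the text, and the single group budget of 39 words is an invented value used only to show the per-member conversion.

```python
GROUP_SIZE = 3  # members per collective unit, as stated in the text


def per_member_budget(group_budget: float, group_size: int = GROUP_SIZE) -> float:
    """Average collective budget per member: group budget / group size."""
    return group_budget / group_size


def choice_shift(individual_budget: float, collective_budget_per_member: float) -> float:
    """Final Individual Budget minus average collective budget per member.

    A positive value means the collective budget is lower than the
    individual proxy, i.e. a cautious shift.
    """
    return individual_budget - collective_budget_per_member


# Means reported in the text and in Table 3 (words budgeted).
reported_means = {
    "Slack-inducing contract": (16.45, 12.94),
    "Truth-inducing contract": (19.41, 17.66),
}

shifts = {
    contract: round(choice_shift(individual, collective), 2)
    for contract, (individual, collective) in reported_means.items()
}

# Hypothetical single group budget of 39 words for a three-member unit.
example_per_member = per_member_budget(39)

for contract, shift in shifts.items():
    print(f"{contract}: choice shift = {shift:.2f} words")
```

Both differences are positive, matching the cautious direction of shift described in the text.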
Table 3. Mean Choice Shift in Collective Condition.

| Treatment | Final Individual Budget^a | Average Collective Budget^b | Choice Shift^c | t-Statistic | p-Value |
|---|---|---|---|---|---|
| Slack-inducing contract | 16.45 | 12.94 | 3.51 | 3.10 | < 0.01 |
| Truth-inducing contract | 19.41 | 17.66 | 1.75 | 1.75 | 0.08 |

δprovider and user < provider). Thus, while there is little difference in magnitude between the expectations of providers and users, the difference in influence contributes to the assurance gap. Differences in other subjective factors that influence disconfirmation – denoted by in Fig. 2 – may also influence the gap in a similar way. These differences in the sources of the gap may result from differences in the nature of the service or the nature of the users of the service. While the magnitude of the assurance gap ultimately reduces to the difference between the user’s and provider’s levels of satisfaction, it is important to conceptualize the gap in more comprehensive terms as in Fig. 2. In other words, the gap results from differences in magnitude and influence for each component of satisfaction. The source of the gap is important because it may influence how users respond to the gap and, consequently, the course of action providers must take to reduce the gap. For example, if, as often conceptualized, the audit assurance gap results from differences in the magnitude of expectations between public users and providers, the result may be high levels of disconfirmation for users only when performance – which is normally highly unobservable – becomes observable, as in the case of a bankruptcy. In this case, very high levels of dissatisfaction and a large and obvious assurance gap result only in isolated circumstances, but the result is very costly litigation. To mitigate this problem, careful attention to expectations may be needed. On the other hand, the problem for new assurance services, as we will hypothesize in our illustrative study later in this paper, may deal more with the influence of individual factors. If this is so, the implication may be a more stable and enduring gap that must be managed by careful attention to those factors, such as performance, that are most influential for users.
Components of the model depicted in Fig. 2 have been tested in previous research. However, a systematic, comprehensive analysis of the differences between users’ and providers’ satisfaction and their satisfaction formation is not available. Auditing research has established a difference in magnitude of many of these satisfaction-related variables between users and providers with respect to auditing. Accounting research has not, however, specifically examined the relative influence of these factors on satisfaction. No research related to any aspect of the gap is available for new assurance services, which are likely to differ from auditing services for reasons described in the next section. Consequently, to better illustrate the value of the model for thoroughly evaluating a variety of services and to put forth a comprehensive approach to testing it, the next section of this paper describes an illustrative study conducted based on the model.
Applying the Model to New Assurance Service In response to a mature market for traditional audit services, in 1993, the American Institute of Certified Public Accountants (AICPA) charged the Special Committee on Assurance Services (SCAS) with identifying ways to reposition the profession for the future. Despite some caution about expanding services after the passage of the Sarbanes-Oxley Act, the SCAS report, released in 1997, is still shaping the future of the accounting profession, affecting both the services provided and the manner in which they are provided, especially for mid-sized firms. With respect to services provided, the Committee recommended and the AICPA has now begun developing a series of assurance services that are “independent professional services that improve the quality of information, or its context, for decision makers” (AICPA, 1997). Specific services developed and offered by practitioners to date include WebTrust, ElderCare, SysTrust and PerformanceView. These services have many similarities to audit services. They often involve similar sampling and evaluation of transactions or activities; they involve forming an opinion based on the evidence; they involve communicating that opinion to a user; and their success relies heavily on the credibility of the provider. However, these services are also very different from traditional audit services. While the individual services are also very different from one another, each of the new assurance services attempts to provide highly useful information to a very well-defined user. Audits, on the other hand are designed to appeal to a much broader, more ambiguous group of users. In addition, users of new assurance services often play an integral role in shaping the nature and scope of the assurance engagement through their interaction with the provider. Conversely, users of audited financial statements typically have no direct influence on the nature and scope of the audit and rarely
interact with the auditor. Recognizing these differences from traditional auditing services, the AICPA was prompted to simultaneously promote a new emphasis on “customer focus,” including understanding and responding to users’ needs and challenges, as a necessary competency for the future of the accounting profession. Thus, where users might once have been seen as only passive users of information, with the advent of new assurance services, the AICPA recognizes users as active decision makers who play a crucial role in determining the nature and scope of the engagement. Because of this shift in focus for these new services, and the important need for information related to potential assurance gaps for these services, we have chosen one of them as a good venue for illustrating the application of our Assurance Gaps Model from Fig. 2. Below, we further examine the different components of the potential assurance gap – magnitude and influence – for these new services. Differences in Magnitude Implicit in the audit research involving the assurance gap is the assumption that differences in magnitude of satisfaction and its influences, particularly expectations, have created the assurance gap. To date, several auditing researchers (Epstein & Geiger, 1994; Humphrey, Moizer & Turley, 1993; Porter, 1993) have reported significant differences between the expectations of users and CPAs. With respect to the performance component of satisfaction, the results are mixed. Humphrey, Moizer and Turley (1993) reported significant differences between users’ and CPAs’ assessments of performance, but Porter (1993) found that while auditors have generally higher assessments of performance than users, these differences were not statistically significant. While it is tempting to use these results to hypothesize similar differences in magnitude for new assurance services, one factor may prevent an effective analogy. 
By its very nature, the auditing research described thus far involves a traditional, mature, and recognized service with which both providers and users have at least some familiarity and experience. This is not the case for new assurance services such as ElderCare, where neither the provider nor the user has any experience with the service and typically little, if any, knowledge of the service to help form their satisfaction judgments. As a result, hypotheses related to the variables of interest – expectations, performance, disconfirmation and satisfaction – are viewed as exploratory and are stated in null form as follows: H1 –H4 . There will be no difference between expectations, disconfirmation, performance evaluations or satisfaction for users and providers for new, nontraditional services so that
$$E_{\text{user}} = E_{\text{provider}}, \quad D_{\text{user}} = D_{\text{provider}}, \quad P_{\text{user}} = P_{\text{provider}}, \quad S_{\text{user}} = S_{\text{provider}}$$
Differences in Influence Auditing researchers to date have not explored the relative influence of the factors that impact satisfaction for the providers vs. users. Although not studied in auditing, significant marketing research exists to provide a basis for anticipating and hypothesizing these effects. That expectations play a key role in determining satisfaction for existing products and services is virtually unchallenged. The very newness of assurance services, however, may alter that role for users. One line of past research (Halstead et al., 1994; Oliver, 1997) suggests that the availability of internal sources of information, especially prior experiences, is important for forming salient and reliable expectations. This research implies that providers, who have a greater store of knowledge about accountants and the services they provide as well as more experience with different accounting services, will be able to form stronger expectations than users, who must rely on external sources of information – in particular marketing communications from the provider – to form their expectations of new accounting services. In short, because of their heightened familiarity with accountants and other accounting services, providers will perceive that they have a better idea of what to expect from a new service than will users, thus encouraging providers to rely more heavily on their expectations. Users may also have difficulty developing expectations because of the inherent intangibility of the assurance service. Most services are described as high in experience qualities, so that evaluation of the service offering can only be discerned after purchase or during consumption (Zeithaml & Bitner, 2000). So, even though users may have some information available to them to acquire knowledge about a service, as in the case of this study, their lack of experience with the specific service may make forming an expectation even more difficult. 
Taken together, this evidence suggests the following hypothesis: H5. Expectations will have greater influence in the satisfaction process for CPAs than for users for new assurance services, such that $\delta_{\text{provider}} > \delta_{\text{user}}$ and $\alpha_{\text{provider}} > \alpha_{\text{user}}$. Additional research suggests that individuals who do not wish to perceive discrepancies between expectations and performance for ego-defensive reasons will be less likely to engage in disconfirmation judgments (Martin, Seta & Crelia,
1990; Oliver, 1997). In the current situation, providers who have a significant stake in performance outcomes are expected to have this ego-defensive motivation and are, therefore, expected to be influenced less by disconfirmation than users who have no such motivation. Additionally, as described earlier, users have little prior experience to form the basis for reliable expectations. As a result, they must rely on disconfirmation, a psychological and holistic assessment of whether they got what they expected, (and performance as described in the next section) in developing satisfaction judgments. This leads to the following hypothesis: H6 . Disconfirmation will have greater influence in the satisfaction process for users than for CPAs for new assurance services, such that user > provider Providers’ ego-defensive motivation, coupled with the lack of stable expectations for users, supports the idea of performance having a greater influence for users than providers. Additionally, based on assimilation-contrast theory, performance is expected to have a smaller influence on satisfaction judgments for providers because their strong expectations will result in performance simply being assimilated toward expectations (Oliver, 1997). These predictions lead to the following hypothesis: H7 . Performance will have greater influence in the satisfaction process for users than for providers for new assurance services so that user > provider and user > provider
TESTING THE MODEL A multi-stage, online survey was conducted to test the hypotheses developed above and summarized in Fig. 2, Panel B, for a representative new assurance service, ElderCare. The online survey was constructed to mirror the sequence of communications with potential users in a typical ElderCare engagement, as described below, and to allow us to measure subjects’ expectations, perceptions of performance, disconfirmation and satisfaction at the appropriate times. Figure 3 describes the time line of the study. After providing demographic data, subjects accessed a web site advertisement for ElderCare services created based on the promotional materials published by the AICPA (1999b) and sponsored by the fictitious Taylor CPA Group. Next, subjects read a scenario describing a new client and his interactions with the Taylor CPA Group in contracting for the service. The scenario incorporated a
Fig. 3. Study Time Line.
detailed description of the service and excerpts from the engagement letter, both of which were prepared based on guidance provided by the AICPA (1997) and Practitioners Publishing Corporation (Lewis et al., 1998). Though written prior to issuance of the Alert, the service described was very similar to the one described in the AICPA’s Assurance Services Alert related to Eldercare (1999a). The service involved an agreed-upon procedures engagement including both non-financial and financial services. For the non-financial services, Taylor CPA Group agreed to regularly review daily log sheets maintained by a care provider, randomly observe the activities of the care provider and conduct biweekly discussions of the quality of the care provider’s work with a geriatric care manager. The care provider’s duties included administering medication for the client’s elderly mother, providing transportation and planning and preparing meals. For the financial services, the Taylor Group maintained the checking account for the elderly mother, paid her routine bills less than $300 and provided a monthly accounting of these activities to the client. Accounting for deposits, bank reconciliations and approval of expenses larger than $300 were performed by the
client. To avoid confounding, fees for services were described as a fixed monthly fee, but no amounts were specified. Additionally, a standard provision precluding the client’s mother from including the CPA firm in her will was included with the engagement letter excerpt. After subjects read the service description, their expectations of the service were measured. Then, they viewed a description of the actual service provided, including excerpts from Taylor CPA Group’s report. After viewing this information, subjects were asked to evaluate the performance of Taylor CPA Group and describe their level of disconfirmation of expectations and their level of satisfaction relative to the service. The entire survey required an average of 30 minutes to complete. The following sections describe in more detail the subjects in the study and measurements used for the variables in the model in Fig. 1.
Subjects Subjects were a convenience sample; the user group consisted of 370 adults aged 24–65 and the provider group consisted of 62 accountants. Subjects were motivated through a $5 donation to one of several charitable or non-profit organizations. Subjects were contacted through a variety of means. Some were contacted through e-mail lists or at events held by the benefiting charitable organizations. Other subjects were alumni or employees of the researchers’ employing institutions. These individuals were contacted by the researchers via e-mail or advertisement in the university newspaper. All subjects were asked to forward information regarding the study to their family, friends and co-workers who met the criteria for participation; hence, some subjects were contacted in this manner as well. Demographic information about the subjects is provided in Table 1. Though a convenience sample, the relatively well-educated, middle-aged subjects included in the user group reasonably represent the target population of interest for the ElderCare service.
Variable Measures Because of the unique nature of the service, many measures used in the study were either developed especially for the study or adapted to the ElderCare service from previous research. Given the challenge associated with obtaining enough suitable subjects to test the entire model, initial pilot testing of the instrument focused on readability, reasonableness and understandability. Several representative subjects,
faculty members and ElderCare providers, provided feedback during the pilot test. Each of the variable measures is described individually below.

Table 1. Demographic Data.

| | Users: Count | Users: % | Providers: Count | Providers: % |
|---|---|---|---|---|
| Total number of subjects | 370 | | 62 | |
| Gender | | | | |
| Male | 161 | 44 | 42 | 68 |
| Female | 208 | 56 | 20 | 32 |
| Previously provided care to an elderly parent | | | | |
| Yes | 82 | 22 | 9 | 15 |
| No | 287 | 78 | 52 | 85 |
| Educational Level | | | | |
| High school | 30 | 8 | 1 | 2 |
| Associates degree | 14 | 4 | 1 | 2 |
| Bachelors degree | 136 | 37 | 40 | 64 |
| Masters degree | 104 | 28 | 16 | 26 |
| Doctoral/medical/law degree | 86 | 23 | 4 | 6 |
| Average age | 43 years | Std. 9 yrs | 41 years | Std. 10 yrs |

Prepurchase Expectations and Evaluation of Performance Outcomes Prepurchase expectations and evaluation of performance outcomes were both measured using the same twenty-one items developed for the study. Each item consisted of a statement regarding a potential outcome of the ElderCare service. Subjects were asked to indicate their beliefs that the service provided by Taylor CPA Group would result in each outcome using a seven-point Likert scale anchored by “strongly agree” and “strongly disagree.” The twenty-one items measured were developed to reflect four potential dimensions of expectations and performance associated with the service described. The items measuring each dimension are shown in Table 2, and each of the dimensions is described below:
(1) Care Provided. These items related to the services provided by the care provider and observed by the CPA.
(2) Evaluation. These items addressed providing, or assisting with, evaluations of care and the care provider, as well as assurances regarding the quality of care.
(3) Quality of Life. These items dealt with the quality of life and general level of well-being provided to the client and his mother.
(4) Financial Services. These items dealt with the financial services provided by Taylor CPA Group.

Table 2. Expectations and Performance Measures.

| Scale/Item | Final Reliability/Status of Item |
|---|---|
| Care provided | Expectations α = 0.9565, Performance α = 0.9490 |
| Ensure that my mother receives appropriate physical care. | Included |
| Ensure that my mother has reliable transportation to her physical therapy appointments. | Included |
| Ensure that my mother takes her medication. | Included |
| Ensure that my mother has 3 meals a day. | Included |
| Evaluation | Expectations α = 0.9477, Performance α = 0.9386 |
| Help me to evaluate the competence of care providers. | Included |
| Provide me with an evaluation of the competence of my mother’s care providers. | Included |
| Guarantee a high standard of quality care. | Included |
| Help me evaluate the performance of my mother’s care providers. | Included |
| Provide me with an evaluation of the performance of my mother’s care providers. | Included |
| Provide a comprehensive evaluation of my mother’s actual quality of care. | Included |
| Quality of life | Expectations α = 0.9267, Performance α = 0.9440 |
| Make my life a little easier. | Included |
| Improve the quality of my mother’s life. | Included |
| Let me know everything is ok. | Excluded (poor item remainder coefficient) |
| Be valuable to me. | Included |
| Allow me to focus on spending quality time with my mother when I visit. | Included |
| Financial services | Expectations α = 0.8054, Performance α = 0.8523 |
| Protect my mother from being taken advantage of financially. | Included |
| Assure that my mother’s money is invested wisely. | Excluded (poor item remainder coefficient) |
| Assure all of my mother’s bills are paid when due. | Included |
| Result in all of my mother’s money being spent unnecessarily. | Excluded (did not load from any factor) |
| Provide a report of my mother’s monthly expenses. | Included |
| Provide an accurate record of the services provided to my mother.^a | Included |

^a This item was originally part of the Care Provided subscale.

In assessing the reliability and validity of the item measures, all twenty-one items in both the expectations and performance scales were first analyzed using exploratory common factor analyses with varimax rotation to ascertain the dimensionality of the scales (Hair, Anderson, Tatham & Black, 1992). The same measurement model was used for both user and provider groups, as there was no reason to expect differences in the factor structure of these variables for the two groups. Examination of eigenvalues as well as examination of rotated factor patterns suggested that a four-factor solution offered the most meaningful interpretation of the underlying factors. The four-factor solution also explained a significantly large proportion (99%) of the underlying common variability in scores for both expectations and performance measures. Generally, each item loaded on the same factor for both measures, suggesting a common dimensionality for the prepurchase expectations and evaluation of performance outcomes measures, as expected. Additionally, the dimensions were generally consistent with a priori expectations of the four dimensions as described above. Following the factor analysis, an item analysis was utilized to select the items that most reliably measured the latent construct (Spector, 1992). As shown in Table 2, three items were eliminated from the subscales, one because it did not have a significant loading for any of the underlying factors in the factor analysis, and two other items because the item analysis revealed unfavorable item remainder coefficients. A final item “Provide an accurate record of the services provided to my mother” loaded on the financial services rather than the care provided factor as expected. 
Further consideration suggests that this item could better reflect an outcome resulting directly from the activities of the CPA as opposed to the care provider, making its inclusion in the financial services dimension reasonable. These analyses, which are summarized in Table 2, resulted in four different subscales each for prepurchase expectations and evaluation of performance outcomes. Each of these subscales possessed high degrees of reliability as measured by Cronbach’s alphas, with values ranging from 0.8054 to 0.9566 (reliabilities for each scale are shown in Table 2). As a result, items representing each subscale were averaged, and the resulting subscale measures were used as indicators of the latent constructs, prepurchase expectations and evaluation of performance outcomes, in the structural model, consistent with the recommendation of Landis, Beal and Tesluk (2000). The goal of this process was to capture as much of the content of each of the underlying constructs, expectations and performance, as possible while still maintaining a reasonable number of indicators in the structural model, given the sample size.
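The two steps described above, checking subscale reliability with Cronbach's alpha and then averaging the items in a subscale into a single indicator, can be sketched as follows. This is a minimal illustration in Python using only the standard library. It is not the authors' procedure or software, the function names are hypothetical, and the response data are invented seven-point ratings for a four-item subscale.

```python
from statistics import mean, variance


def cronbach_alpha(item_scores: list[list[float]]) -> float:
    """Cronbach's alpha for k items.

    item_scores holds one list per item, each containing that item's
    scores across the same respondents, in the same order.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are required")
    sum_item_variances = sum(variance(scores) for scores in item_scores)
    totals = [sum(per_respondent) for per_respondent in zip(*item_scores)]
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


def subscale_scores(item_scores: list[list[float]]) -> list[float]:
    """Average the items of a subscale for each respondent."""
    return [mean(per_respondent) for per_respondent in zip(*item_scores)]


# Invented ratings: four items (rows) answered by five respondents (columns).
hypothetical_items = [
    [7, 5, 3, 6, 2],
    [6, 5, 4, 6, 3],
    [7, 6, 3, 6, 2],
    [6, 5, 4, 7, 2],
]

alpha = cronbach_alpha(hypothetical_items)
indicators = subscale_scores(hypothetical_items)

print(f"alpha = {alpha:.4f}")
print("subscale indicators:", indicators)
```

Averaging items in this way yields one indicator per subscale per respondent, which keeps the number of indicators in a structural model small relative to the sample size, the rationale given in the text.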
Disconfirmation Disconfirmation was measured using a three-item, five-point, Likert-type scale suggested by Oliver (1993), Oliver and Swan (1989) and Westbrook (1987). The items required subjects to indicate whether the benefits, outcomes and service overall were relatively better or worse than expected. Exploratory factor analysis indicated that these three items reflected a single, unidimensional construct. Coefficient alpha for the resulting construct was 0.8382. An item analysis was utilized to select the items that most reliably measured the latent construct (Spector, 1992). Based on the item analysis, all three items were included in the structural model. Satisfaction Satisfaction was measured using a four-item, semantic differential scale. The items included in the scale are based on generalized satisfaction measures previously developed by Oliver (1993), Oliver and Swan (1989) and Crosby and Stephens (1987). In contrast to the disconfirmation items, these asked subjects to provide absolute, rather than relative, perceptions of the service. The items utilized bipolar adjective scales that required the subjects to describe their feelings about the service received. The positive adjectives on the scales were pleased, contented, satisfied and made a good choice. Coefficient alpha for the resulting scores was 0.9497. An item analysis was utilized to select the items that most reliably measured the latent construct (Spector, 1992). Based on the item analysis, all four items were included in the structural model.
RESULTS

Table 3. Descriptive Statistics and Mean Comparisons for Study Variables.

                    Accountants         Clients
Variable            Mean^a    Std.      Mean^a    Std.      t-Statistic   p-Value
Expectations        5.27      1.00      5.17      1.22      0.65          0.52
Disconfirmation     4.76      1.27      4.68      1.29      0.47          0.64
Performance         5.41      1.01      5.17      1.25      1.46          0.14
Satisfaction        5.26      1.29      5.11      1.45      0.78          0.43

^a All variables measured on 7-point scales.

Table 3 provides descriptive statistics for both the accountant and user groups for the study variables. The t-test results shown in the table indicate no significant
differences between the accountant and non-accountant groups for any of the satisfaction component variables including expectations, performance, disconfirmation and satisfaction. Consequently, the null hypothesis of no difference between the two groups for Hypotheses 1–4 cannot be rejected. This implies that, based on the information provided in the study materials, both potential providers and potential users had similar perceptions of the ElderCare service. While the mean values for all of the variables were above the mean scale value of 3.5, suggesting that these perceptions were somewhat positive, comments from both participant groups might suggest that both were actually quite skeptical of the service. Failure to support Hypotheses 1–4 also suggests that any gaps between accountants and non-accountants for new assurance services might better be defined in terms of gaps in influence as suggested by Hypotheses 5–7, rather than gaps in magnitude.

Hypotheses 5–7 were tested using structural equation modeling. The theoretical model to be examined is shown in Fig. 2, Panel A, with the hypotheses to be tested in Fig. 2, Panel B. Covariances between corresponding indicators of expectations and performance were estimated as part of the measurement model since the subscales used as indicators included the same items examined at different points in time, which fails to satisfy assumptions of independence (Anderson & Gerbing, 1988; Bagozzi, 1980; Gerbing & Anderson, 1984). Group comparisons were performed using Lisrel 8 as described by Joreskog and Sorbom (1993). First, the analysis was run assuming that the parameters of the models were identical for both groups. This model serves as a benchmark for evaluating subsequent models used for hypothesis testing where the structural parameters are allowed to vary between groups. This baseline analysis initially revealed a mediocre fit.
Modification indices suggested a strong correlation between the error terms for two of the satisfaction measures. Examination of the items revealed strong similarities in their wording. Accordingly, one of these measures was deleted from the model before even considering it as a baseline (Hair et al., 1992). With this improved measurement model in place, the baseline structural model was reestimated and, consistent with Hair et al. (1992), a sample of commonly used goodness of fit measures was examined. The results reveal good fit, with some room for improvement. Chi-square is 260 with 172 degrees of freedom and p < 0.05. While a large p-value is desirable, indicating that the observed covariance matrix of the variables is similar to the one implied by the model, small p-values are common in samples of this size (Hair et al., 1992). Root mean square error of approximation (RMSEA), which shows the error per degree of freedom and exhibits good fit at values less than 0.05, is 0.049. The comparative fit index (CFI), which indicates good fit at values above 0.90, is 0.99. The normed χ², which has a recommended level between 1 and 2, is 1.51.
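As a quick arithmetic check on the last statistic, the normed chi-square is the model chi-square divided by its degrees of freedom. Using the baseline values reported above:

```latex
\[
\frac{\chi^2}{df} = \frac{260}{172} \approx 1.51
\]
```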
The next step in the analysis was to free the parameters referred to in the hypotheses, allowing them to differ between groups. The new fit statistics were then compared with the fit statistics from the baseline model to determine whether the new model is significantly better than the baseline. In particular, because the models are nested models (Hair et al., 1992), the difference between the chi-squares of the two models is distributed as chi-square, with degrees of freedom equal to the difference between the degrees of freedom of the two models. To get an overall view of the hypotheses regarding differences in influence for expectations (H5), disconfirmation (H6) and performance (H7) for users versus providers, first, individual structural coefficients were freed and tests were performed to identify any differences in each of the individual parameters. These tests did not reveal any significant differences in the coefficients for any of the individual paths in the structural model. (The largest chi-square was 2.30, with a p-value of 0.13 for the expectations → satisfaction path; all other chi-squares were much smaller.) Based on this information, Hypothesis 6 suggesting a greater influence of disconfirmation in determining satisfaction for users than providers is not supported.
Fig. 4. Results. Note: ∗ indicates coefficient is significant at p = 0.05. All measurement model coefficients are significant. χ² for this model is 254 with 170 df. Compared to the baseline model with χ² of 260 with 172 df, the difference in χ² is 6 with 2 degrees of freedom, which is significant at p = 0.05.
However, a more careful evaluation of Hypotheses 5 and 7 suggests a difference between users and providers in the relative influence of expectations vs. performance in determining satisfaction. In short, Hypotheses 5 and 7 suggest that providers will be influenced more by expectations than performance and users will be influenced more by performance than expectations. To test this effect more directly, the paths between expectations and satisfaction and performance and satisfaction were simultaneously freed. This new model exhibited significantly better fit than the baseline model, providing support for Hypotheses 5 and 7. χ² for the new model is 254 with 170 df, for a difference in χ² of 6 with 2 degrees of freedom, which is significant at p = 0.05. Figure 4 shows the resulting structural models and coefficients for both users and providers. The difference in influence for the two determinants of satisfaction is as hypothesized – accountants are influenced more by expectations, with a structural coefficient of 0.35 (t = 2.61), than users, with a structural coefficient of 0.00 (t = 0.01); users are influenced more by performance, with a structural coefficient of 0.47 (t = 5.39), than providers, with a structural coefficient of 0.19 (t = 1.33). The practical implications of these results are discussed in the next section.
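The nested-model comparison described above can be illustrated with a short sketch. It is not the authors' LISREL procedure, only the arithmetic of the chi-square difference test applied to the rounded fit statistics reported in the text. The function names are hypothetical, and the tail probability is implemented only for 1 or 2 degrees of freedom, where simple closed forms exist.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate (df = 1 or 2 only)."""
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    if df == 1:
        # Chi-square with 1 df is a squared standard normal.
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # Chi-square with 2 df is exponential with mean 2.
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def chi2_difference_test(chi2_restricted, df_restricted, chi2_free, df_free):
    """Compare a restricted (baseline) model with a nested, freer model."""
    d_chi2 = chi2_restricted - chi2_free
    d_df = df_restricted - df_free
    return d_chi2, d_df, chi2_sf(d_chi2, d_df)


# Baseline model (260, 172 df) versus the model with both paths freed (254, 170 df).
d_chi2, d_df, p = chi2_difference_test(260, 172, 254, 170)
print(d_chi2, d_df, round(p, 4))

# Single-path test reported for the expectations -> satisfaction path.
print(round(chi2_sf(2.30, 1), 2))
```

Because the reported chi-squares are rounded to whole numbers, the joint-test p-value from this sketch is only approximate; it falls just under 0.05, in line with the borderline significance reported. For arbitrary degrees of freedom, `scipy.stats.chi2.sf` provides the same tail probability.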
IMPLICATIONS FOR PRACTICE

In terms of new assurance services, the results of this application of the Assurance Gaps Model provide some limited evidence of an emerging assurance gap. This gap, however, does not seem to stem from differences in magnitude, as the study finds no difference in the magnitude of the components of satisfaction – expectations, performance assessments, disconfirmation and satisfaction – between users and providers of ElderCare services. Instead, this gap seems to result from differences in influence, where sample user subjects already emphasize performance more heavily while sample providers emphasize expectations in forming their satisfaction judgments. Thus, while providers may rely on their beliefs about CPAs’ reputations or their past experience, users and potential users will require more evidence or indicators of quality performance of the particular service they have purchased.

One possible explanation for the lack of differences in magnitude between users and providers may be that these perceptions reflect the newness of the service – with little knowledge of or experience with the service, both might be more generous about the service. If this is the case, providers should focus on managing user expectations and user satisfaction proactively as the service is provided, avoiding the problems associated with assurance gaps that might result
as user perceptions evolve over time. However, comments from both groups of participants suggest a more interesting explanation for the similarity between user and provider perceptions. These comments, voluntarily provided by 47% of the potential users and 41% of the potential providers, point to equal degrees of skepticism regarding the service. In particular, both groups are skeptical of the value of the service, as well as the ability of accountants to provide the service effectively. Of those responding, 33% of providers and 18% of users voiced concerns about the quality of care not being assessed. Some 22% of both groups indicated concerns about difficulties verifying the services provided. In addition, 36% of users and 33% of providers expressed concern about the degree of value/expertise provided by the CPA in this area.

Taken together, these comments suggest that a significant amount of work is needed if ElderCare and other new, non-traditional assurance services are to be successful. In particular, revisions to the service may be needed as well as significant efforts to convince both potential providers and users about the value of the new assurance services and to provide more evidence regarding the quality of the services provided. Additionally, practitioners must recognize that the process of managing assurance gaps is ongoing. This study captures important perceptions of users and providers of a service in the early stages of development, providing guidance that can contribute to the initial success of the new service. However, these perceptions may change over time as both groups develop experience with the service, leading to new gaps. Finally, the subjects in this study were largely inexperienced with providing care to elderly parents. This provides an opportunity for expanding the study to develop a large enough sample to examine the perceptions of users who have more firsthand experience with the topic of the assurance.
CONCLUSIONS AND OPPORTUNITIES FOR FUTURE RESEARCH

The primary objective of this study was to develop a model to more completely describe the sources of assurance gaps in a broad context, recognizing that assurance gaps are important to the profession not only in the traditional context of audits, but also in the context of a variety of new services. The second objective of this study was to illustrate application of the model and a method for testing the model in the context of new assurance services. The Assurance Gaps Model shown in Fig. 2 provides a more thorough framework within which academic researchers can examine the nature, scope
and implications of assurance gaps in many different areas of accounting. By adapting research from the marketing literature, the Assurance Gaps Model allows accountants to better articulate and specify the components of what has traditionally been referred to as an expectations gap. By extending the research of the marketing literature, this model allows researchers and practitioners to better examine the nature of the assurance gap by identifying two types of contributors: (1) differences in the magnitude of satisfaction and its components – expectations, performance assessments and disconfirmation; and (2) differences in the influence of each of these variables between users and providers. With increased emphasis on marketing of services on the one hand and focusing on users’ needs on the other, careful examination of the model in the context of all of the services provided by accountants would be advisable and can provide a wealth of information to enable practitioners and researchers to better understand and perhaps anticipate users’ responses to professional services. In an audit context, a thorough investigation of the assurance gap components for a wide variety of different users could provide useful information as the profession engages in rebuilding confidence in the integrity of the audit process. While limitations to the illustrative study provided in the paper exist, it provides a useful illustration of the Assurance Gaps Model. First, as demonstrated through development of Hypotheses 1–7, the model provides a basis for systematic evaluation of the sources of assurance gaps. Second, as described in the results and implications for practice, the application of the model can point to very specific courses of action. As with most experimental research, the limitations for these findings must be acknowledged. For the most part, these limitations arise because of the newness of the service. 
For example, because ElderCare is a new service, there were too few actual users or providers of the service to provide a sample adequate for the purposes of this study. Accordingly, we simulated the communications and activities in a typical ElderCare engagement, relying on a sample of representative providers and representative users. While this approach limits the external validity of the study, it allowed us to demonstrate the model’s application in a new context and to address questions of internal validity and timeliness in a significant way. More importantly, the online survey methodology illustrates a very useful method for testing all of the variables in the model and comparing their effects. Future research should continue to examine assurance services in both internally and externally valid contexts. This research can provide insight into both users’ and providers’ perceptions of each service and the source of assurance gaps. Consequently, the potential effects of these gaps may be understood more thoroughly and the appropriate courses of action can be more completely identified.
REFERENCES

American Institute of Certified Public Accountants (1997). Report of the special committee on assurance services [online]. Available: http://www.aicpa.org/.
American Institute of Certified Public Accountants (1999a). Assurance services alert: CPA ElderCare services – 1999. New York: American Institute of Certified Public Accountants.
American Institute of Certified Public Accountants (1999b). CPA ElderCare services marketing tool kit. New York: American Institute of Certified Public Accountants.
Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin, 103(3), 411–423.
Anderson, J. C., Lowe, D. J., & Reckers, P. M. J. (1993). Evaluation of auditor decisions: Hindsight bias effects and the expectation gap. Journal of Economic Psychology, 14(4), 711–738.
Bagozzi, R. P. (1980). Causal models in marketing. New York: Wiley.
Cronin, J. J., & Taylor, S. A. (1992). Measuring service quality: A reexamination and extension. Journal of Marketing, 56(July), 55–68.
Crosby, L. A., & Stephens, N. (1987). Effects of relationship marketing on satisfaction, retention, and prices in the life insurance industry. Journal of Marketing Research, 24(November), 404–411.
Epstein, M. J., & Geiger, M. A. (1994). Investor views of audit assurance: Recent evidence of the expectation gap. Journal of Accountancy (January), 60–66.
Gerbing, D. W., & Anderson, J. C. (1984). On the meaning of within-factor correlated measurement errors. Journal of Consumer Research, 11(June), 572–580.
Gramling, A. A., Schatzberg, J. W., & Wallace, W. A. (1996). The role of undergraduate auditing coursework in reducing the expectations gap. Issues in Accounting Education, 4(1), 131–161.
Hair, J. F., Anderson, R. E., Tatham, R. L., & Black, W. C. (1992). Multivariate data analysis. New York, NY: Macmillan.
Halstead, D., Hartman, D., & Schmidt, S. L. (1994). Multisource effects on the satisfaction formation process. Journal of the Academy of Marketing Science, 22(2), 114–129.
Humphrey, C., Moizer, P., & Turley, S. (1993). The audit expectations gap in Britain: An empirical investigation. Accounting and Business Research (Winter), 395–411.
Joreskog, K., & Sorbom, D. (1993). Lisrel 8: Structural equation modeling with SIMPLIS command language. Chicago: Scientific Software International.
Kell, W. G., & Boynton, W. C. (1992). Modern auditing. New York: Wiley.
Landis, R. S., Beal, D. J., & Tesluk, P. E. (2000). A comparison of approaches to forming composite measures in structural equation models. Working Paper.
Lewis, G. A., Thompson, C. T., Ecklund, K. J., Popovitch, R. L., Blanco-Best, M., Roeder, C. A., Lovelace, T. W., & Hart, P. I. (1998). Guide to providing ElderCare services. Fort Worth, TX: Practitioners Publishing Company.
Liggio, C. D. (1974). The expectation gap: The accountant’s Waterloo. Journal of Contemporary Business, 3(3), 27–44.
Lowe, D. J. (1994). The expectation gap in the legal system: Perception differences between auditors and judges. Journal of Applied Business Research, 10(3), 39–44.
Martin, L. L., Seta, J. J., & Crelia, R. A. (1990). Assimilation and contrast as a function of people’s willingness and ability to expend effort in forming an impression. Journal of Personality and Social Psychology, 59, 27–37.
Miller, J. R., Reed, S. A., & Strawser, R. H. (1991). The new auditor’s report: Will it close the expectation gap in communication? CPA Journal, 60(5), 68–72.
Oliver, R. L. (1993). Cognitive, affective and attribute bases of the satisfaction response. Journal of Consumer Research, 20(December), 418–430.
Oliver, R. L. (1997). Satisfaction: A behavioral perspective on the consumer. New York: McGraw-Hill.
Oliver, R. L., & Bearden, W. O. (1985). Disconfirmation processes and consumer evaluations in product usage. Journal of Business Research, 13(June), 235–246.
Oliver, R. L., & Swan, J. E. (1989). Equity and disconfirmation perceptions as influences on merchant and product satisfaction. Journal of Consumer Research, 16(December), 372–383.
Porter, B. (1993). An empirical study of the audit-expectation-performance gap. Accounting and Business Research, 24(93), 49–68.
Spector, P. E. (1992). Summated rating scale construction: An introduction. Newbury Park, CA: Sage.
Tse, D. K., & Wilton, P. C. (1988). Models of consumer satisfaction formation: An extension. Journal of Marketing Research, 25(May), 204–212.
Westbrook, R. A. (1987). Product/consumption-based affective responses and postpurchase processes. Journal of Marketing Research, 24(August), 258–270.
Zeithaml, V. A., & Bitner, M. J. (2000). Services marketing. New York: McGraw-Hill.