Guest Editors' Introduction
Global Software Development
James D. Herbsleb and Deependra Moitra, Lucent Technologies
The last several decades have witnessed a steady, irreversible trend toward the globalization of business, and of software-intensive high-technology businesses in particular. Economic forces are relentlessly turning national markets into global markets and spawning new forms of competition and cooperation that reach across national boundaries. This change is having a profound impact not only on marketing and distribution but also on the way products are conceived, designed, constructed, tested, and delivered to customers.1
Software engineers have recognized the profound influence of business globalization for some time, generating alarmist reactions in some quarters and moving others to try to capture and emulate models that have met with success. More recently, attention has turned toward trying to understand the factors that enable multinationals and virtual corporations to operate successfully across geographic and cultural boundaries.

Globalization and software
Over the last few years, software has become a vital component of almost every business. Success increasingly depends on using software as a competitive weapon. More than a decade ago, seeking lower costs and access to skilled resources, many organizations began to experiment with remotely located software development facilities and with outsourcing. Several factors have accelerated this trend:

■ the need to capitalize on the global resource pool to successfully and cost-competitively use scarce resources, wherever located;
■ the business advantages of proximity to the market, including knowledge of customers and local conditions, as well as the good will engendered by local investment;
■ the quick formation of virtual corporations and virtual teams to exploit market opportunities;
■ severe pressure to improve time-to-market by using time zone differences in “round-the-clock” development; and
■ the need for flexibility to capitalize on merger and acquisition opportunities wherever they present themselves.
As a result, software development is increasingly a multisite, multicultural, globally distributed undertaking. Engineers, managers, and executives face numerous, formidable challenges on many levels, from the technical to the social and cultural.

Is working at a distance really such a problem? Nearly everyone with GSD experience, it seems, has anecdotes illustrating difficulties and misunderstandings. While these stories are compelling, they do not give us a clear picture of the cumulative effects of distance. However, we have strong evidence,2 based both on statistical modeling of development interval and on survey results, that multisite development tasks take much longer than comparable colocated tasks and that communication and coordination play major roles in this delay.

While we focus primarily on the problems of GSD, we should not neglect the potential benefits of geographic dispersion. For example, if an organization can manage daily handoffs of work between remote sites and focus attention around the clock on critical-path tasks, it is possible to take advantage of widely dispersed time zones.3 We could theoretically extend the productive hours of the day from the current 8- to 10-hour norm to somewhere near the limit of 24. This is perhaps a distant goal as a general model for development, but occasional benefits—for example, accelerated problem investigation or a distributed daily test-and-fix cycle—are possible.

Moreover, we probably think of “distributed” work in much too limited a way. Distances need not be global to be important.4,5 In fact, being in another building, on a different floor of the same building, or even at the other end of a long corridor severely reduces communication. Solutions that help globally distributed colleagues work together more effectively will often help those in the same zip code as well.

Dimensions of the problem
Physical separation among project members has diverse effects on many levels.

Strategic issues
Once a particular set of project sites has been determined (a decision outside the scope of this issue), deciding how to divide up the work across sites is difficult. Solutions are constrained by the resources available at the sites, their levels of expertise in various technologies, the infrastructure, and so on. An ideal arrangement would let the sites operate as independently as possible while providing for easy, flexible, and effective communication across sites. A number of models are possible and appropriate under different circumstances, and they require different coordination mechanisms.6

Another fundamental challenge is the organization’s resistance to GSD. This resistance often surfaces because of misalignment between senior and middle management on the intent and perceived benefits of GSD. Many individuals might believe their jobs are threatened, experience a loss of control, and fear the possibility of relocation and the need for extensive travel.
Cultural issues
GSD requires close cooperation of individuals with different cultural backgrounds. Cultures differ on many critical dimensions, such as the need for structure, attitudes toward hierarchy, sense of time, and communication styles.7 While many people find such differences enriching, they can also lead to serious and chronic misunderstandings, especially among people who do not know each other well. An email, for example, from someone in a culture where communication tends to be direct might seem abrupt or even rude to someone from a different background. A different sense of time can lead to acrimony over the interpretation and seriousness of deadlines. Cultural differences often exacerbate communication problems as well. When people are puzzled as to how to respond to odd-sounding messages, they often just ignore them or make uncharitable attributions about the sender’s intentions or character.

Inadequate communication
Software development, particularly in the early stages, requires much communication.8 In fact, software projects have two complementary communication needs. First, the more formal, official communications need a clear, well-understood interface. For crucial tasks like updating project status, escalating project issues, and determining who has responsibility for particular work products, a fuzzy or poorly specified interface loses time and lets problems fall through the cracks.

Disruption to a second, vital communication channel can be surprisingly crippling:9 developers not located together have very little informal, spontaneous conversation across sites. Informal “corridor talk” helps people stay aware of what is going on around them, what other people are working on, what states various parts of the project are in, who has expertise in what area, and many other essential pieces of background information that enable developers to work together efficiently. One result is that the issues, big and small, that crop up on a nearly daily basis in any software project can go unrecognized or lie dormant and unresolved for extended periods. The absence of ongoing conversation can also lead to surprises from distant sites, potentially resulting in misalignment and rework. The more uncertain the project, the more important this communication channel is.10

These issues are even more complex in outsourcing arrangements. The fear of loss of intellectual property or other proprietary information about products or schedules leads to restricted or filtered communication, often seriously impairing this critical channel.

Knowledge management
Without effective information- and knowledge-sharing mechanisms, managers cannot exploit GSD’s benefits. For example, they might fail to promptly and uniformly share information from customers and the market among the development teams. When project leaders disseminate status information inadequately, teams cannot determine what tasks are currently on the critical path. Needed expertise might be available
but cannot be located and hence is not exploited. Also, owing to poor knowledge and information management, teams miss many reuse opportunities that otherwise would have saved cost and time.

Poor documentation can also cause ineffective collaborative development. The resistance to documentation among developers is well known and needs no emphasis. In GSD, however, in addition to documenting the various artifacts, updating and revising the documentation is equally important. To prevent assumptions and ambiguity and to support maintainability, documentation must be current and reflect what various teams are using and working on.

Project and process management issues
When teams hand off processes between sites, the lack of synchronization can be particularly critical—for example, if the development team at one site and the test group at another site define “unit-tested code” differently. Synchronization requires commonly defined milestones and clear entry and exit criteria. Though concurrent development process models have been suggested in the literature and used,11-13 effectively implementing concurrent engineering principles in GSD often becomes difficult because of volatile requirements, unstable specifications, the unavailability of good tools that support collaboration across time and space, and the lack of informal communication. Some groups practice risk management in a traditional fashion, without taking into account the possible impacts of diverse cultures and attitudes.

Technical issues
Since networks spanning globally dispersed locations are often slow and unreliable, tasks such as configuration management that involve transmission of critical data and multisite production must be meticulously planned and executed. The need to control product changes and to ensure that all concerned hear about them is much greater in GSD. Other common issues include using incompatible data formats and different versions of the same tools.

The articles in this issue
Each article in this issue provides some answers to managers and engineers navigating these difficult waters. They describe new tactics and techniques as well as hard-won practical lessons from experience.

As we noted earlier, distance is a major issue in GSD, leading to coordination, communication, and management problems. In “Tactical Approaches for Alleviating Distance in Global Software Development,” Erran Carmel and Ritu Agarwal provide several approaches that can be applied across a range of geographically distributed projects.

Audris Mockus and David M. Weiss, in “Globalization by Chunking: A Quantitative Approach,” offer a method for using code change history to compute the degree of “relatedness” of the work items at two sites. They further propose a method for distributing work in a way that minimizes the need for coordination across sites.

In “Using Components for Rapid Distributed Software Development,” Alexander Repenning, Andri Ioannidou, Michele Payton, Wenming Ye, and Jeremy Roschelle describe their experiences using a component architecture to support work distribution across sites. Based on their large,
geographically distributed testbed for publishing educational software applications, the authors outline a rapid production process using components suited for distributed development. This enables each site to take ownership of particular components and work on them independently without much need for intersite communication and coordination.

In our experience, training programmers to think and behave like software engineers is an uphill task. Many educational programs now include team-oriented work, and with globalization so pervasive, they also need to train their students with geographically distributed development in mind. Jesús Favela and Feniosky Peña-Mora, in “Geographically Distributed Collaborative Software Development,” describe a project-oriented software engineering course and show how students in two different countries collaborated using an Internet-based groupware environment. While their objective in designing the project was educational, their experiences are significant to the business community.
About the Authors

James D. Herbsleb is currently a member of the Software Production Research Department and leader of the Bell Labs Collaboratory project. For the last three years, his work has focused on collaboration technology to support large, globally distributed projects. For the past 10 years, he has conducted research in collaborative software engineering, human–computer interaction, and computer-supported cooperative work. He holds an MS in computer science from the University of Michigan and a PhD in psychology from the University of Nebraska. Contact him at [email protected].

Deependra Moitra is currently general manager of engineering at the Lucent Technologies India R&D Program. His interests are in software engineering management, management of technology and innovation, new-product innovation, R&D globalization, and entrepreneurship in software and high-tech industries. He serves on the editorial boards of Research-Technology Management, Technology Analysis and Strategic Management, International Journal of Entrepreneurship and Innovation, Journal of Small Business Management, Journal of Knowledge Management, and IEEE Software. He is a member of IEEE, IEEE Computer Society, IEEE Engineering Management Society, and ACM. Contact him at [email protected].
Software outsourcing is increasingly popular among corporations as an economically and strategically attractive business model. As companies outsource their software needs to software houses across national borders, the two independent organizations must interact—causing the dynamics of globally distributed software development to surface rather strongly. In a thought-provoking article, “Synching or Sinking: Global Software Outsourcing Relationships,” Richard Heeks, S. Krishna, Brian Nicholson, and Sundeep Sahay report on three interesting case studies and capture their successful outsourcing strategies to maximize business value.

Finally, we offer three excellent articles summarizing real organizational experiences, lessons learned, and good practices. In “Surviving Global Software Development,” Christof Ebert and Philip De Neve narrate Alcatel’s experience in globally distributed software development and synthesize the good practices they observed in a large operation involving 5,000 engineers. Robert Battin, Ron Crocker, Joe Kreidler, and K. Subramanian, in “Leveraging Resources in Global Software Development,” report on their experiences and approaches while working on Motorola’s 3G Trial Systems project, which spans six countries. In “Outsourcing in India,” Werner Kobitzsch, Dieter Rombach, and Raimund Feldmann capture their experiences and lessons learned with distributed software development at Tenovis, a German company. Interestingly, while these articles are from different companies operating in different cultural settings, there is a marked commonality in experiences gained and approaches that worked.
Given the global reach of today’s large corporations and the global market for software products, few software engineers will remain unaffected as the globalization trend surges forward. As we increasingly work in virtual, distributed team environments, we will more and more face formidable problems of miscommunication, lack of coordination, infrastructure incompatibility, cultural misunderstanding, and conflicting expectations—not to mention the technical challenges of architecting
products for distributed development. We hope the advances in collaborative tools and multimedia, Web technology,14,15 and a refined understanding of concurrent-engineering principles will help us address these challenges.
Acknowledgment
We thank the reviewers who contributed their valuable time and expertise toward development of this special issue. Our sincere thanks also to Dawn Craig at IEEE Software for her meticulous efforts helping us put together the issue.
References
1. M. O’Hara-Devereaux and R. Johansen, Globalwork: Bridging Distance, Culture, and Time, Jossey-Bass, San Francisco, 1994.
2. J.D. Herbsleb et al., “An Empirical Study of Global Software Development: Distance and Speed,” to be published in Proc. Int’l Conf. Software Eng. 2001, IEEE CS Press, Los Alamitos, Calif., 2001.
3. E. Carmel, Global Software Teams, Prentice Hall, Upper Saddle River, N.J., 1999.
4. T.J. Allen, Managing the Flow of Technology, MIT Press, Cambridge, Mass., 1977.
5. R.E. Kraut, C. Egido, and J. Galegher, “Patterns of Contact and Communication in Scientific Research Collaborations,” Intellectual Teamwork: Social Foundations of Cooperative Work, J. Galegher, R.E. Kraut, and C. Egido, eds., Lawrence Erlbaum Assoc., Hillsdale, N.J., 1990, pp. 149–172.
6. R.E. Grinter, J.D. Herbsleb, and D.E. Perry, “The Geography of Coordination: Dealing with Distance in R&D Work,” Proc. Int’l ACM SIGGROUP Conf. Supporting Group Work, ACM Press, New York, 1999, pp. 306–315.
7. G.H. Hofstede, Cultures and Organizations: Software of the Mind—Intercultural Cooperation and Its Importance for Survival, revised ed., McGraw-Hill, New York, 1997.
8. D.E. Perry, N.A. Staudenmayer, and L.G. Votta, “People, Organizations, and Process Improvement,” IEEE Software, vol. 11, no. 4, July/Aug. 1994, pp. 36–45.
9. J.D. Herbsleb and R.E. Grinter, “Architectures, Coordination, and Distance: Conway’s Law and Beyond,” IEEE Software, vol. 16, no. 5, Sept./Oct. 1999, pp. 63–70.
10. R.E. Kraut and L.A. Streeter, “Coordination in Software Development,” Comm. ACM, vol. 38, no. 3, Mar. 1995, pp. 69–81.
11. M. Aoyama, “Managing the Concurrent Development of Large-Scale Software Development,” Int’l J. Technology Management, vol. 14, nos. 6/7/8, 1997, pp. 739–765.
12. J.D. Blackburn, G. Hoedemaker, and L.N. van Wassenhove, “Concurrent Software Engineering: Prospects and Pitfalls,” IEEE Trans. Eng. Management, vol. 43, May 1996, pp. 179–188.
13. F. Rafii and S. Perkins, “Internationalizing Software with Concurrent Engineering,” IEEE Software, vol. 12, no. 5, Sept./Oct. 1995, pp. 39–46.
14. S. Murugesan, “Leverage Global Software Development and Distribution Using the Internet and Web,” Cutter IT J., vol. 12, no. 3, Mar. 1999, pp. 57–63.
15. M. Aoyama, “Web-Based Agile Software Development,” IEEE Software, vol. 15, no. 6, Nov./Dec. 1998, pp. 56–65.
Tactical Approaches for Alleviating Distance in Global Software Development
Erran Carmel, American University
Ritu Agarwal, University of Maryland
The authors describe three tactical approaches to reducing intensive collaboration, national and organizational cultural differences, and temporal distance.

To overcome the problem of distance in global software development, various managers are experimenting and quickly adjusting their tactical approaches. We discuss some emerging approaches and explain their motivations from conceptual and practical perspectives. The most intuitive approach for alleviating distance is to apply
communication technologies, but this is not our focus. Rather, we examine tactics that go beyond communication technologies—tactics aimed at reducing intensive collaboration, national and organizational cultural differences, and temporal distance.

The phenomenon of global software development
Only a decade ago, the number of entities engaging in global software development was small—but this has rapidly changed. Today, 203 of the US Fortune 500 engage in offshore outsourcing.1 In a recent study, we found that within the largest US firms, the median ratio of IT work outsourced—but consumed largely in the US—is 6.5 percent.2 Meanwhile, in the much smaller Netherlands, about 250 Dutch companies of various sizes engage in some kind of offshore work.3
Upwards of 50 nations are currently participating—at least minimally—in collaborative software development internationally. In India, there are now 800 IT service firms competing for work globally.3 Many American firms are in the process of a radical push to send their key software processes offshore, and critical centers of software R&D are growing outside the traditional centers (such as the US)—in Ireland, Israel, Singapore, Finland, and many other nations. Finally, the marketplace is responding to the increased demand for IT labor through the construction of new commercial mechanisms. IT business-to-business intermediaries, such as ITsquare.com and IT-radar.com, serve as exchanges between worldwide IT services vendors and small- and medium-size businesses with IT needs.

There are two critical, strategic reasons for moving to offshore software development: cost advantage and a large labor pool. Furthermore, the radical rethinking in our collective concept of a firm enables software globalization. Transaction-cost economics posits that if it is cheaper to transact internally within the corporation, the organization will grow larger. This occurred historically because of coordination requirements.4 Now that we can more effectively coordinate over long distances, organizations are either staying small or shrinking and conducting transactions externally (externalizing sales, distribution, manufacturing, management information systems, legal, payroll, and so forth). The growth of contract manufacturing is another manifestation of this trend.5 Whereas 25 years ago, 20 percent of US workers worked in the Fortune 500, today the figure stands at 10 percent and is falling.4

The critical challenge of distance
We need to examine how distance contributes to heightened complexity in organizational processes. An organizational unit cannot function without coordination and control; unfortunately, distance creates difficulties in both. Coordination is the act of integrating each task with each organizational unit, so the unit contributes to the overall objective. Orchestrating the integration often requires intense and ongoing communication. Control is the process of adhering to goals, policies, standards, or quality levels. Controls can be formal (such as budgets and explicit guidelines) or informal (such as peer pressure). We recognize today that, for knowledge workers, coordination and control have in many ways blended together. Communication is a mediating factor affecting both coordination and control. It is the exchange of complete and unambiguous information—that is, the sender and receiver can reach a common understanding. Gary Anthes presents a telling example of poor communication in a global software development project, when a tester interpreted a spacebar instruction as a “b-l-a-n-k,” clearly not the intended message of the sender.6

Distance exacerbates coordination and control problems directly or, as Figure 1 shows, indirectly through its negative effects on communication. The bold arrows in Figure 1 represent the main challenge of global software development. Distance negatively affects communication, which in turn reduces coordination effectiveness.

Figure 1. Impacts of distance. (Distance has direct negative effects on communication, coordination, and control; its negative effect on communication in turn compounds the coordination and control problems.)

We briefly discuss the array of approaches for alleviating distance problems in the “Other Solutions to Alleviating Distance” sidebar. Given the critical role of effective communication in the successful orchestration of a global software project, it is not surprising that new tactics for addressing distance are being adopted. Using data from different software organizations around the world, the literature on global software development, and our current fieldwork, we extended and refreshed these approaches to develop our three tactics.

Sidebar: Other Solutions to Alleviating Distance

The literature on virtual teams addresses a wide array of approaches for alleviating the problems of distance in global software development. Erran Carmel organized1 (and later refined2) these approaches into six centripetal forces that exert inward pressure on the team for more effective performance: collaborative technology, team building, leadership, product architecture and task allocation, software development methodology, and telecommunications infrastructure. Of these, technology-based solutions are perhaps the most obvious. For example, a distributed Software Configuration Management system (for managing the system component versions) reduces miscommunication because it enforces a common work process and a common view of the project.1,3 Another use of technology is the team Web site that encompasses all facets of individual- and task-related information—from the programmers’ photos to the test documentation.4

Other nontechnology approaches are more subtle. Martha Maznevski and Kathy Chudoba found that effective virtual teams had a “deep rhythm” of regular team meetings, both face-to-face and over distance.5 Of course, a meeting is but a communication formalism. Communication need not always occur within a formal, hierarchical configuration. When global software teams collaborate on innovative projects, informal channels of coordination—or lateral channels—are critical. They “help developers fill in the details in the work, handle exceptions, correct mistakes….”6

Sidebar references:
1. E. Carmel, Global Software Teams: Collaborating across Borders and Time Zones, Prentice Hall, Upper Saddle River, N.J., 1999.
2. E. Carmel, “Global Software Teams: A Framework for Managerial Problems and Solutions,” Global Information Technology and Electronic Commerce: Issues for the New Millennium, Ivy League Publishing, Marietta, Ga., to be published, 2001.
3. R.E. Grinter, “Doing Software Development: Occasions for Automation and Formalisation,” Proc. Fifth European Conf. Computer Supported Cooperative Work (ECSCW ’97), Kluwer Academic Publishers, Dordrecht, The Netherlands, 1997, pp. 173–188.
4. S. McConnell, Software Project Survival Guide, Microsoft Press, Redmond, Wash., 1998.
5. M.L. Maznevski and K. Chudoba, “Bridging Space over Time: Global Virtual Team Dynamics and Effectiveness,” Organization Science, vol. 11, no. 5, Sept./Oct. 2000, pp. 473–492.
6. J.D. Herbsleb and R.E. Grinter, “Splitting the Organization and Integrating the Code: Conway’s Law Revisited,” Proc. 21st Int’l Conf. Software Engineering (ICSE ’99), ACM Press, New York, 1999, pp. 85–95.
Tactic 1: Reduce intensive collaboration
Researchers have depicted a global software development maturity function that increases over time to higher levels of knowledge work. Figure 2 shows the typical maturity function Carmel developed,7 but with some augmentations.

Figure 2. A software task maturity function. (The tasks performed by the Foreign Entity mature over time from structured tasks (port, maintenance/Y2K, new programming) through low-level design, high-level design, and functional specs toward unstructured tasks (ownership, visioning).)

We deal with the transition of tasks between the Center and Foreign Entity. We use these terms as shorthand for the dyad typically involved in global software development. The Center is usually (but not always) a firm (or unit) in one of the triad areas of North America, the European Union, or Japan. The Foreign Entity might be in another triad nation or in a newly industrialized or developing nation. Other collaborations do take place, primarily in R&D, some of which involve a web of several national sites.

The Foreign Entity engages in tasks that range from those that are well-defined and structured, with easy-to-use methods and unambiguous outcomes, to tasks that are
hard to define and unstructured, with an iterative solution method and unclear solution. We have a tendency to associate more unstructured tasks with increased levels of coordination complexity between the Center and the Foreign Entity, but this is not always the case, as Figure 3 shows. Furthermore, firms have become more adept at parsing out tasks (and functions) that require low levels of coordination.

The lower-left region in Figure 3 represents relatively straightforward tasks with low complexity, such as Y2K remediation. For many companies in the Center, these tasks often represent their first offshore activities. Projects such as Y2K remediation also reflect a very small part of the individual system’s life-cycle effort. Such projects are more manageable over distance, because the need to communicate and clarify requirements is relatively limited and tasks are relatively stable.

The region at the top of Figure 3, at the maximum, is characterized by a very dense web of coordination that is needed to transfer knowledge and collaborate on tasks. In such an environment, two or more development units are working together on the same project. At the extreme, this is done using the follow-the-sun approach, in which work passes from site to site on a daily basis. For example, when developers in California finish the day’s tasks, they pass their work (such as code) to their colleagues in India, who will shortly be arriving at work. In turn, at the end of the day, the Indian developers pass their work back to California. Coordination complexity is high because, on a daily basis, each side must package its added value so that the other side can quickly proceed to add its own value without further clarifications. Near this extreme of complex coordination lie all varieties of innovative R&D activities, because they tend to be unstructured.8

As a consequence of the complexity of coordination, many organizations are moving in one of two directions (depicted by the arrows in Figure 3)—to either the graph’s far left or far right. In both cases, coordination complexity remains relatively low. The key difference is the relative life-cycle effort of the tasks the Foreign Entities perform. On the left-hand side of the graph, for a large corporation, this move often represents transferring maintenance activities (also known as sustaining work), help desks, and data centers.
The lower-right side of the graph shows a relationship in which the Foreign Entity takes full responsibility for (or ownership of) a system, product, or corporate process. This alleviates many of the distance problems, because the Foreign Entity is not using links with the Center as frequently. In other words, ongoing collaboration is not as intense. For software product companies (packaged software), ownership can be over individual software components, individual modules, releases, or entire products. Consider the following case in point. One major American product software firm opened an Indian development center and, within the span of a few years, transitioned from the lower left of the graph to the lower right. Although tasks were initially structured maintenance activities, the Indian center soon migrated to complete ownership of point releases (for example, from release 2.1 to 2.2), including enhancements. Meanwhile, R&D in the US could devote nearly all its resources to major release cycles (for example, 3.0).

An instance of firms operating on the left-hand side of Figure 3 and moving up the Y axis are two of the largest US megatechnology companies we recently studied, who are in a continuous process of migrating internal IT processes offshore, one chunk at a time, to their rapidly growing Indian IT centers. These processes include support at various levels of criticality as well as data centers. Of course, this movement is facilitated in no small part by the nature of IT today, which permits increasingly larger levels of modularity in both the development process as well as the final product.

Figure 3. Alternative paths to alleviating intensive collaboration. (The vertical axis runs from straightforward to complex coordination; the horizontal axis is the share of the life-cycle effort performed by the Foreign Entity, up to 100 percent. Follow-the-sun work sits at the complex top; data centers and help desks at the left; port, Y2K, and contract programming at the lower left; full product ownership at the lower right.)

Tactic 2: Reduce cultural distance
Cultural distance stems from the degree of difference between the Center and the Foreign Entity. This difference typically manifests itself in one of two forms: organizational culture and national culture. Organizational culture encompasses the unit’s norms and values, where the unit could range from a small technology company to a multinational enterprise. Organizational culture includes the culture of systems development, such as the use of methodologies and project management practices. For example, an Indian source told the authors that Korean customers recently accused an Indian company outsourcing work from Korea of becoming “too American” in that they were devoting too much attention to documentation and were too stringent about deadlines.

National culture encompasses an ethnic group’s norms, values, and spoken language, often delineated by political boundaries of the nation-state. It stems from the relative distance between the Center and Foreign Entity. American firms generally prefer to situate development units in foreign locations where cultural distance is smaller—for example, in Ireland—or where language barriers are minimal, such as in India or the Philippines.

Figure 4 organizes common structural arrangements for global software development plotted along the cultural distance continua. The two axes represent national cultural distance and distance from the core organizational culture. At one extreme, the least problematic arrangement is to build IT and R&D work teams domestically and inside the firm. At the other extreme, cultural distance is greatest when a foreign outsourcing or contracting company performs the work. These two extremes appear, respectively, in the upper-left and lower-right sections of Figure 4.

Figure 4. A taxonomy of structural arrangements for software development. (One axis is distance from the Center’s national culture, from domestic to foreign; the other is distance from the Center’s organizational culture, from intra-firm to external to the firm. The cells hold domestic internal software work; domestic software work with outsourcers, contractors, and partners; foreign subsidiary; foreign acquisition; offshore development center; joint venture or alliance with a foreign firm; and foreign outsourcing, contracting. Arrows labeled bridgehead, internal foreign entity, cultural liaison, and language depict the four distance-reducing approaches discussed next.)

Five types of foreign structures are common, with large multinational enterprises often making use of a combination of all five. Beginning with the upper-right quadrant of Figure 4, three common arrangements are
possible. An existing foreign subsidiary is typical for the internal IT of a multinational enterprise. A foreign acquisition is typical for technology or software firms. For example, in 1999, European firms acquired 229 US technology firms, while US firms acquired 405 European technology firms.9 The software professionals in these firms were then compelled to work together. An offshore development center is usually set up from the ground up.

At the lower-right quadrant of Figure 4, two arrangements are common. A foreign outsourcing or contracting firm might be India-based Tata Consultancy Services, Mexico-based Dextra Technologies, or Pakistan-based Askari Information Systems, to name three of several thousand such firms. A joint venture or alliance with a foreign firm is an arrangement in which a designated group of software developers from the Center and Foreign Entity collaborate. This is typical in technology companies undertaking innovative product development or product extensions.

With these structural arrangements in mind, companies and vendors have devised four approaches to alleviate distance. These are depicted graphically in Figure 4 as the arrows that “pull” left (reducing national culture distance), up (reducing organizational culture distance), or both.
Bridgehead
The first of these arrangements is the offshore–onshore bridgehead (depicted by the bridgehead arrow, suggesting a reduction in both national culture and organizational culture distance). Some have begun labeling this as the 75/25 rule of thumb: Essentially, 75 percent of personnel work occurs offshore, while 25 percent occurs onshore (usually at the customer site—for example, in the US). This arrangement optimizes cost savings (offshore) while maintaining closeness to the customer. The individuals assigned to work onshore are typically the more experienced and culturally assimilated. They act to understand the customer’s requirements specifications and translate them to the offshore programmers. Perhaps more importantly, the face-to-face interaction reduces miscommunication between customer and vendor. The 25 percent of project personnel that are onshore effectively serve as a bridge—reducing cultural distance.

Our field work suggests that the 75/25 arrangement has now become part of the marketing pitch by offshore (primarily Indian) firms reassuring their customers that they have found the right solution to distance. Many of the largest Indian firms (such as Tata Consultancy Services) now have considerable corporate presence in the US, further enhancing the cultural buffer and reducing cultural distance. In an approach that is conceptually similar, large US IT services firms, such as EDS, maintain IT outsourcing professionals at the customer’s site to act as a bridgehead between a US-based customer and some offshore workforce.

Internalization of Foreign Entity
Some American and European companies, primarily technology firms, are opening internal-to-the-firm foreign software centers. We see this phenomenon in large global corporations, as well as in rather small companies, such as dotcoms that acquire a small Indian professional services firm to support their internal needs. Thus, some of these units are greenfields (created from scratch) while others are acquisitions or conversions from other functions. By internalizing global software development and shunning collaboration with external foreign partners (such as outsourcers, contractors, or joint venture partners), these firms are reducing cultural
distance. This is depicted by the Internal Foreign Entity arrow in Figure 4, suggesting a reduction in organizational culture distance. When employees are inside the firm, it reduces organizational distance, operationalized as coordination complexity and organizational culture. These IT workers are within the corporate network—inside the firewall—with access to all knowledge bases, calendars, Web pages, and so forth. They are also trained in the corporate methodologies, policies, and systems.

The cultural liaison
The cultural liaison might be a project manager or key executive who travels back and forth between the key stakeholder sites. The liaison’s informal role is to facilitate the cultural, linguistic, and organizational flow of communication and to bridge cultures, mediate conflicts, and resolve cultural miscommunications. This is depicted in Figure 4 by the cultural liaison arrow, suggesting a reduction in both national and organizational culture distance. Cultural liaisons are usually expatriates, repatriates, or well-traveled individuals with broader global perspectives. For instance, a native of Ireland settled in the US might serve as the cultural liaison with an offshore site in Ireland. Carmel found that 47 percent of global software teams had someone who fit the characteristics of a cultural liaison.7

Language
Spoken language is an important component of national cultural distance. Many decision-makers at the executive level hesitate to engage in international alliances and express reservations about collaborating with nations in which the command of English is weak. The language factor is one of the reasons for the success of offshore IT work in countries with strong English language capabilities such as the Philippines and Singapore. The language arrow of Figure 4 suggests a reduction in national culture distance. In some instances, US companies invest in English as a Foreign Language courses for those who are not fluent in English to improve professional communication. We have found this to be common in Russia.

Tactic 3: Reduce temporal distance
Despite the considerable power of today’s asynchronous technologies for dispersed
work—email, voice mail, online discussion groups, project management tools, Software Configuration Management, and issue and defect-tracking databases—there are still powerful reasons for synchronous—if not face-to-face—communication. Synchronous communication includes telephone, audio conferencing, videoconferencing, application sharing, and sometimes synchronous online code walkthroughs. The advantages of synchronous communication include resolving miscommunications, misunderstandings, and small problems before they become bigger problems. A small issue can take days of back-and-forth discussions over email to resolve, but a brief conversation can quickly clarify the problem. Thus, asynchronous communication often delays or complicates problem resolution. In addition, synchronous communication can also help improve quality of life. IT professionals involved in global work frequently complain about the need to compromise personal life to speak to colleagues far away, many time zones removed.

Working within time-zone bands facilitates effective synchronous communication. The goal is to minimize time-zone differences, with zero being best (gradually degrading as the two sites approach an eight-hour difference and beyond). Figure 5 shows typical collaboration pairings in such time-zone bands.

Figure 5. Centers and foreign sites in time-zone bands (time differences use daylight-savings time). (The centers shown are New York, London, and Tokyo; foreign sites include Monterrey, Barbados, Sao Paulo, St. Petersburg, Cape Town, Beijing, Manila, and Bangalore, at time differences between 0 and 4.5 hours.)

For the US East Coast, minimal temporal distance means Canada, Mexico, and the Caribbean nations. In fact, Mexican and Caribbean providers market themselves as being “near shore” (as opposed to offshore) to capitalize on this. The potential for synchronous work with Latin America is also great. For Europe, this means intra-EU collaboration or collaboration with South Africa or low-cost nations such as Romania, Hungary, Russia, and Ukraine. The Japanese can work synchronously with Singapore, China, Philippines, and India. Another example is Israel, which, along with most industrialized nations, suffers from an undersupply of software professionals. One of its largest software firms, Amdocs, opened an offshore development center in Cyprus (+0 Hr), now staffed with 400 IT professionals.

Reducing temporal distance presents a trade-off between the advantages and disadvantages of synchronous and asynchronous
communication. For example, minimizing time-zone differences eliminates the advantage of follow-the-sun work, which requires large differences in time zones. Synchronous communication is also mediated by cultural distance. Japanese managers use less synchronous communication than do managers in North America or Europe (because the lingua franca of IT is English). Asian IT professionals working with Japanese firms in the Center will typically conduct almost all of their communication using asynchronous technologies unless at least one key person on each side has fluency in English or Japanese.

Caveats and hybrids
Not all companies practice all or even some of these tactical approaches. First, and most important, there are other tactical approaches to alleviate the problems of distance, the most significant of which is technology (as mentioned earlier). Second, decision criteria for tactic usage are based in part on task and organization type. Organizations developing and supporting their internal IT (as opposed to software R&D) are more likely to be interested in Tactics 1 and 2. Third, the three tactics can be traded off against cost. For example, a firm that has reduced intensive collaboration (Tactic 1) by spinning off structured tasks is unlikely to invest heavily in reducing cultural
distance (Tactic 2). Finally, it is rational to engage in only a subset of these tactics. For example, the typical US–Indian (Center–Foreign Entity) collaboration fails the temporal criterion (India is 10.5 hours apart from the Silicon Valley workday) but does well on the other two distance-alleviation tactical approaches (Tactics 1 and 2). Similarly, if a firm invests in a foreign strategic alliance (Tactic 2) that is also in the same time-zone band (Tactic 3), then intensive collaboration (the opposite of Tactic 1) might actually be desirable to reap the advantages of the reduced temporal distance.

Sometimes, rather than reduce intensive collaboration as prescribed in Tactic 1, intensive collaboration is desirable for new application development and innovative R&D, despite its coordination difficulties. There are many cases in which a geographically dispersed team might be the only way to bring together talented IT professionals from around the world. It is clearly desirable if the company can use time-zone differences to decrease time to market (for example, Ford’s design centers or IBM’s development of the VisualAge JavaBeans product suite), although very few companies have done it successfully.

Paradoxically, as opposed to reducing cultural and organizational distance as prescribed in Tactic 2, increasing cultural distance from the Center can be advantageous—allowing companies to tap diverse know-how and ideas. Eric von Hippel found that successful companies assimilate innovations that emanate from their most important customers.10 For example, Japanese firms, which dominate the computer game market, develop some of their game software in the US to be close to the large community of game players and designers.

Although Tactic 3 prescribed reducing temporal distance by using synchronous communication over distance, this is no panacea because synchronous communication has its limitations. Many IT professionals shun videoconferencing because of various behavioral problems such as the awkwardness of interrupting or the inability to see all the participants. Moreover, while face-to-face work often uses multiple applications and channels, today’s synchronous application-sharing tools unnaturally restrict collaboration to a single application informational space.11 And even small time-zone differences can be disruptive: James Herbsleb and Rebecca Grinter found that in a British–German collaboration (with only a one-hour time-zone difference), after deductions for lunch and different work hours, few windows of common (synchronized) time were left.12 Synchronous communication can even be inferior to asynchronous communication. After all, a telephone conversation (synchronous) does not address such communication problems as, “I already told you that,” whereas email and fax (asynchronous) automatically leave a written communication history.
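The shrinking windows that Herbsleb and Grinter observed are easy to quantify. The sketch below is our own illustration, not drawn from any of the cited studies: the UTC offsets and 9-to-5 schedules are assumptions, and it ignores lunch breaks and date-line wraparound, both of which shrink the window further.

```python
def overlap_hours(utc_offset_a, utc_offset_b,
                  workday_a=(9, 17), workday_b=(9, 17)):
    """Hours per day when both sites are at work.

    Offsets are hours relative to UTC; workdays are (start, end) in
    local time. Ignores lunch breaks and date-line wraparound, so the
    result is an upper bound on the real synchronized window.
    """
    # Express both workdays in UTC, then intersect the two intervals.
    start_a, end_a = (h - utc_offset_a for h in workday_a)
    start_b, end_b = (h - utc_offset_b for h in workday_b)
    return max(0, min(end_a, end_b) - max(start_a, start_b))

# A one-hour separation (e.g., Britain and Germany) still leaves most
# of the day in common, before lunch and schedule deductions...
print(overlap_hours(0, 1))     # 7
# ...whereas a California-India pairing (offsets -8 and +5.5) leaves none.
print(overlap_hours(-8, 5.5))  # 0
```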
Our view of the future is that software development projects will increasingly look like a global virtual archipelago with several separate clusters of colocated professionals sprinkled with dispersed individuals working remotely. To overcome the increased burdens that realizing this vision presents, IT managers will continue to experiment with new tactics that overcome some of the liabilities of distance.

References
1. “Offshore’s New Horizons,” Global Technology Business, vol. 3, no. 3, Mar. 2000, pp. 12–15.
2. E. Carmel and R. Agarwal, Offshore Sourcing of Information Technology Work by America’s Largest Firms, tech. report, Kogod School of Business, American Univ., Washington, D.C., 2000.
3. GPI Consultancy and P. Tjia, Computer Software and IT Services: A Survey of the Netherlands and Other Major Markets in the European Union, The Center for the Promotion of Imports from Developing Countries, Netherlands, 1999.
4. T.W. Malone and R.J. Laubacher, “The Dawn of the E-Lance Economy,” Harvard Business Rev., vol. 76, no. 5, Sept./Oct. 1998, pp. 145–152.
5. “Have Factory, Will Travel,” The Economist, vol. 354, no. 8157, 12 Feb. 2000, pp. 61–62.
6. G.H. Anthes, “Software Development Goes Global,” Computerworld Online, 26 June 2000; www.computerworld.com/cwi/story/0,1199,NAV47_STO46187,00.html (current 13 Feb. 2001).
7. E. Carmel, Global Software Teams: Collaborating Across Borders and Time Zones, Prentice Hall, Upper Saddle River, N.J., 1999.
8. R.E. Grinter, J.D. Herbsleb, and D.E. Perry, “The Geography of Coordination: Dealing with Distance in R&D Work,” Proc. Int’l ACM SIGGROUP Conf. Supporting Group Work (GROUP ’99), ACM Press, New York, 1999, pp. 306–315.
9. Broadview Technology M&A Report, 1999; www.broadview.com (current 13 Feb. 2001).
10. E. von Hippel, The Sources of Innovation, 2nd ed., Oxford Univ. Press, New York, 1994.
11. C. Geisler and E.H. Rogers, “Technological Mediation for Design Collaboration,” working paper, Rensselaer Polytechnic Inst., New York, June 2000.
12. J.D. Herbsleb and R.E. Grinter, “Splitting the Organization and Integrating the Code: Conway’s Law Revisited,” Proc. 21st Int’l Conf. Software Engineering (ICSE ’99), ACM Press, New York, 1999, pp. 85–95.

About the Authors

Erran Carmel is an associate professor at the Kogod School of Business at American University in Washington, D.C., where he cofounded and shares leadership of the program in Management of Global Information Technology. His research focuses on global software development, and he wrote the 1999 book Global Software Teams. Contact him at the Program in Management of Global Information Technology, Kogod School of Business, American Univ., Washington, D.C. 20016-8044; [email protected].

Ritu Agarwal is an associate professor of information systems at the Robert H. Smith School of Business, University of Maryland. Her recent work has focused extensively on information technology workforce issues. She is the coauthor of Coping with Labor Scarcity in Information Technology, published in 1999. She is an associate editor for MIS Quarterly and the International Journal of Human Computer Studies. Contact her at Decision and Information Technologies, Robert H. Smith School of Business, Univ. of Maryland, College Park, MD 20742-1815; [email protected].
Globalization by Chunking: A Quantitative Approach
Audris Mockus, Bell Labs
David M. Weiss, Avaya Laboratories
Distributing development over many sites, often in different countries, can cause productivity-reducing coordination difficulties. This article introduces methods for assessing and minimizing coordination problems by identifying tightly coupled work items, or chunks, as candidates for independent development.
Because of economic, political, and practical needs, businesses regularly distribute their software production globally.1 Participants at the different development sites often suffer inhibited communication and coordination because they are remote from each other. One result of the affected communication and coordination might be reduced productivity and an increased production interval.
In this article, we look for technical solutions to accommodate the business needs for distributed software development. In doing so, we investigate quantitative approaches to distributing work across geographic locations to minimize communication and synchronization needs. Our main premise is inspired by Melvin Conway’s work, which suggests that a software product’s structure reflects the organizational structure of the company that produced it,2 and David Parnas’s work suggesting software modularity should reflect the division of labor.3 Here, we introduce ways to quantify the three-way interactions between an organization’s reporting structure, its geographic distribution, and the structure of its source code. We based our analysis on records of work items (for this analysis, a work item is the assignment of developers to a task, usually to make changes to the software).
Work items
For software development to be most efficient, the organization’s geographic distribution and reporting structure should match the division of work in software development. Tightly coupled work items that require frequent coordination and synchronization should be performed within one site and one organizational subdivision. We try to identify such work items empirically by analyzing the changes made to software. When a set of work items all change the same set of code, we refer to the set of code as a chunk. We use the term module to mean a set of code contained in a directory of files, which follows the usage of the development projects whose software we analyzed.

Our main contribution is to define an analysis process for identifying candidate chunks for distributed development across several locations based on quantitative evidence. As part of this process, we have defined
■ a method to quantify the impact on development time and effort of work items spanning development sites;
■ a method to identify work-item-induced chunks in software systems;
■ a process to identify chunks that could be developed independently in different organizations or in different development sites, including a way to define quantitative measures that describe chunks; and
■ an algorithm to find chunks (in terms of independent changeability).
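To give a feel for what finding chunks can mean operationally, here is a toy sketch in Python. It is our own naive illustration, not the authors' algorithm: it simply merges modules into one chunk whenever enough work items change both of them.

```python
from collections import defaultdict
from itertools import combinations

def find_chunks(module_sets, threshold=2):
    """Toy chunk finder: two modules land in the same chunk when at
    least `threshold` work items changed both of them.

    module_sets holds one set per work item, listing the modules the
    work item touched. Chunks are merged with a union-find structure.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    shared = defaultdict(int)  # (module, module) -> co-change count
    for mods in module_sets:
        for a, b in combinations(sorted(mods), 2):
            shared[a, b] += 1
            if shared[a, b] >= threshold:
                parent[find(a)] = find(b)  # merge their chunks

    chunks = defaultdict(set)
    for mods in module_sets:
        for m in mods:
            chunks[find(m)].add(m)
    return list(chunks.values())

# Modules never changed together stay in separate chunks; separate
# chunks are candidates for independent, single-site development.
print(find_chunks([{"db", "ui"}, {"db", "ui"}, {"net"}]))
# e.g., [{'db', 'ui'}, {'net'}]
```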
Work items range in size from very large changes, such as releases, to very small changes, such as individual deltas (modifications) to a file. A release, also called a customer delivery, is a set of features and problem fixes. A feature is a group of modification requests associated with new software functionality. And, an MR is an individual request for changes. Put another way, each release can be characterized as a base system that a set of MRs modifies and extends. Figure 1 shows a hierarchy of work items with associated attributes. The source code of large software products is typically organized into subsystems according to major functionality (database, user interface, and so on). Each subsystem contains source code files and documentation. A version control system maintains the source code and documentation versions. Common VCSs are the Concurrent Versioning System,4 which is commonly used for open source software projects, and commercial systems, such as ClearCase, Continuus Change Management Suite, or Visual SourceSafe. We frequently deal with Source Code Control System and its descendants.5 VCSs operate over a set of source code files. An atomic change or delta to the program text comprises the deleted lines and the lines added to make the change. Deltas are usually computed by a file-differencing algorithm (such as Unix diff), invoked by the VCS, which compares an older version of a file with the current version. Included with every delta is information such as the time it was made, the person making it, and a short comment describing it. In addition to a VCS, most projects employ a change request management system that tracks MRs. Whereas deltas track
Change management system
Feature
Description
Modification request
Time, date
Delta
File, module No. of lines added, deleted
Developer
Version control system
Whereas deltas track changed lines of code, MRs are collections of deltas made for a single purpose—for example, to fix a simple defect. Some commonly used problem-tracking systems include ClearDDTS from Rational and the Extended Change Management System (ECMS).6 Most commercial VCSs also support problem tracking. Usually, such systems associate a list of deltas with each MR. There are several reasons for MRs, including fixing previous changes that caused a failure during testing or in the field and introducing new features to the existing system. Some MRs restructure the code to make it easier to understand and maintain. The latter activity is more common in heavily modified code, such as in legacy systems.

Based on informal interviews in several software development organizations within Lucent, we obtained the following guidelines for dividing work into MRs:

■ The software-development team splits work items that affect several subsystems (the largest building blocks of functionality) into distinct MRs so that each MR affects one subsystem.
■ The team further organizes a work item in a subsystem that is too much for one person into several MRs, each suitable for one person.
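This containment hierarchy maps naturally onto a small data model. The sketch below is our illustration rather than anything from the projects studied; the class and field names are hypothetical and simply mirror the attributes Figure 1 associates with each level.

import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// One atomic change to a single file, as recorded by the VCS.
class Delta {
    String file;                  // the file (and thus module) it touches
    int linesAdded, linesDeleted;
    Date date;                    // when the delta was made
    String developer;             // who made it
    String comment;               // short description of the change
}

// An MR: the collection of deltas made for a single purpose.
class ModificationRequest {
    String description;
    List<Delta> deltas = new ArrayList<Delta>();
}

// A feature: a group of MRs implementing new functionality.
class Feature {
    String description;
    List<ModificationRequest> mrs = new ArrayList<ModificationRequest>();
}

// A release: a base system that a set of features and fixes extends.
class Release {
    String name;
    List<Feature> features = new ArrayList<Feature>();
}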
For practical reasons, organizations avoid strictly enforcing these guidelines, so some MRs cross subsystem boundaries and some have several people working on them.

Work-item-based measures of coordination needs
Changes performed both within and outside of work items require coordination.
For a software release, all coordination happens within the release, whereas for an individual delta on a file, coordination is between the file's other deltas. Changes made as part of an MR require tight internal coordination and are preferably done by a single developer. For example, a change to a function's parameters would require a change in the function declaration, the function definition, and all the places where the function is called. Conversely, coordination between MRs, although needed, typically does not represent as much coordination as do changes within one MR.

The tight coordination needed within MRs suggests that they're the smallest work items that can be done independently of each other. In particular, MRs can be assigned to distinct development sites or distinct organizations. This hypothesis is supported by the evidence that MRs involving developers distributed across geographic locations take much longer to complete.7 Based on the guidelines for dividing work into MRs described previously, the work items encompassing several MRs might reflect only a weak coupling among the parts of the code that they modify. Consequently, such work items might be divided among several developers.

The tight coupling of work in an MR suggests the following measure of work-item-based coupling between entities in a software project. For two entities A and B, the number of MRs that result in changes to or activity by both A and B defines the measure of absolute coupling. For example, if A and B represent two subsystems of the source code, the absolute measure of work-item coupling would be the number of MRs such that each MR changes the code in both subsystems. The coupling for two groups of developers would be represented by the number of MRs such that each MR has at least one developer from each group assigned to it. A coupling between the code and a group of developers is defined in a similar fashion. To adjust for A and B's size, dividing the absolute measure by the total number of MRs that relate to A or B provides measures of relative coupling.

Coordination needed to accomplish MRs is also embodied in other activities and in ways that aren't reflected in the preceding coupling measures. Examples are coordination among MRs in a feature or during system integration and testing.
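To make these definitions concrete, the following sketch (ours; the input representation is hypothetical) computes both measures for two subsystems A and B, given each MR reduced to the set of subsystems its deltas touch:

import java.util.List;
import java.util.Set;

class CouplingMeasures {
    // Absolute coupling: the number of MRs that change both A and B.
    static int absoluteCoupling(List<Set<String>> mrSubsystems, String a, String b) {
        int both = 0;
        for (Set<String> touched : mrSubsystems)
            if (touched.contains(a) && touched.contains(b)) both++;
        return both;
    }

    // Relative coupling: absolute coupling divided by the number of MRs
    // that relate to A or B, adjusting for the entities' sizes.
    static double relativeCoupling(List<Set<String>> mrSubsystems, String a, String b) {
        int both = 0, either = 0;
        for (Set<String> touched : mrSubsystems) {
            boolean inA = touched.contains(a), inB = touched.contains(b);
            if (inA || inB) either++;
            if (inA && inB) both++;
        }
        return either == 0 ? 0.0 : (double) both / either;
    }
}

The same two methods apply unchanged to groups of developers: replace the set of subsystems an MR touches with the set of groups whose members are assigned to it.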
Empirical evaluation of work-coupling measures
Let's now examine the work-coupling measures of a major telecommunications software project to investigate their relationship to the development interval (also known as project lead time or project development time). The project involved work in a complex area of telephony, where market requirements and standards are changing rapidly. Such conditions make coordinating the development work extremely difficult and subject to continuous change. Additionally, the product competes in an aggressive market—a situation that brings extreme time pressures to development work. You can find a detailed study of the project elsewhere.7 Here, we focus on coupling measures between different project sites (located in Germany, England, and India) and their relationship with development interval.

We define work interval as the difference between the date of the last delta and the first delta for an MR. Such a measure is a good approximation of the period of time, or interval, that implementing the change requires. For each MR, we determined whether the individuals associated with it were colocated or resided at more than one site. MRs that have individuals from more than one site are classified as multisite; the rest are classified as single-site. The ratio of multisite MRs to total MRs is a relative measure of MR coupling between sites; the ratio provides an approximation of the coordination needs between the sites.

The work-interval comparison in Table 1 considers MRs done over two years—July 1997 to July 1999—that were nontrivial (required more than one delta). Because multisite MRs involve at least two people, to avoid bias we excluded single-person MRs from the first comparison.

Table 1. Comparison of Work Interval Measured in Calendar Time

Modification requests   Sites      Average interval (days)   Number of modification requests
Multiperson             Single     8.2                       979
Multiperson             Multiple   15.8                      216
All                     Single     6.8                       1408

Table 1 shows that approximately 18 percent of multiperson MRs are multisite and incur an average penalty of 7.6 days with a 95-percent confidence interval of [3, 13] days. We can use the MR-coupling measure between sites (216 MRs) together with the average interval penalty (7.6 days) and the average number of participants in multiperson MRs (2.6) to obtain the total delay of
11.7 (7.6 × 216 × 2.6/365) person-years (PY). These data suggest that multisite MRs carry a significant penalty of increased work interval and that reducing the number of multisite MRs could reduce work interval by eliminating these delays. Work done by James Herbsleb and others indicates that communication inefficiencies primarily caused the longer development intervals.7

Globalization: A problem of distributing software development
Our main goal is to help project management make better-informed decisions through quantitative evaluation of possible consequences. We start by asking, "What work could be transferred from a primary site with resource shortages to a secondary site that has underutilized development resources?" To answer, we evaluate a particular transfer approach's costs and benefits and use an algorithm to find the best possible transfer. In studying such transfers in Lucent Technologies, we observed the following approaches being considered or used:

■ Transfer by functionality, in which the ownership of a subsystem or set of subsystems is transferred. This was the most commonly applied approach in the software organizations we studied. Distributing development among different sites by functional area ensures that each site will have its own domain expertise and therefore require only a small- to medium-sized development group that could be trained relatively quickly. The main disadvantage is that adding new functionality might require using experts from several sites, thereby increasing the need to coordinate feature work between sites.
■ Transfer by localization, in which developers modify the software product locally for a local market. An example of such a modification is translating the documentation and user interface into a local language. An advantage of such an approach is that the local development team is highly aware of its customers' needs and knows the local language and the nature of locality-specific features. A disadvantage is the requirement to maintain experts in all the domains that might require change when adapting the system to the local market. Often, such an adaptation requires expertise in virtually all of the system's domains.
■ Transfer by development stage, in which developers perform different activities at different locations (for example, developers might perform design and coding at a different site than system testing). The advantages include having development-stage experts at a single site, but the disadvantages include a need to communicate and coordinate between sites to proceed to the next development stage.
■ Transfer by maintenance stage, in which developers transfer older releases primarily for the maintenance phase, when they no longer expect to add new features to the release. This makes more resources available for developing new functionality at the site uninvolved in the maintenance phase. The disadvantages include a potential decrease in quality and an increase in problem-resolution intervals because the site maintaining the product hasn't participated in the design and implementation of the functionality it maintains. Communication needs between the original site and the maintaining site might increase when difficult maintenance problems require the original site's expertise.
The globalization problem—the difficulty of distributing development among several sites—is multifaceted, involving trade-offs in training needs, utilization of available expertise, and risk assessment, as well as a number of social and organizational factors.8
Figure 2. Two globalization candidates. For both candidates, the solid line shows the yearly trend of the relative measure of work-item-based coupling between the candidate and its complement, the dashed line shows the trend of the fraction of multisite modification requests within the candidate, and the dotted line shows the difference between the two trends. Candidate 1 (a) appears to be significantly better for distribution than candidate 2 (b).
Although we focus on ways to minimize the need for coordination and communication among sites, in practice it is equally important to use documentation, practices, and tools to enable better communication and coordination. This includes maintaining systems documentation, user manuals, and design and requirements documents; providing good email, telephone, and videoconference facilities; and using presence-awareness tools such as instant messengers and electronic message boards. (A discussion of communication and coordination needs for globally distributed software development and a list of promising tools are available elsewhere.1,7)

Qualitative factors
Globalization might lead to the transfer of work that is in some way undesirable to the primary site. The last three globalization approaches noted in the preceding section reflect different types of undesirable work, such as localization, maintenance (often referred to as current engineering), testing, and tools support. We observed several instances of functionality transfer (the first approach) in which the areas undesirable to the primary site were transferred. (Of course, they might have been transferred for other reasons as well.)

The decision to transfer work might involve informal risk-management strategies, especially if the transfer is to a secondary site that hasn't worked with the primary site before or had problems working with the primary site in the past. The risk-management strategies consist of identifying work that isn't critical to the overall project in general and to the primary site in particular, so that the completion of the project
(especially the work in the primary location) would not be catastrophically affected by potential delays or quality problems at the secondary site. Examples of such "noncritical" work include simulation environments, development-tool enhancements, current-engineering work, and parts of regression testing. To some extent, risk management can be done by transferring a functional area, such as a part of operations, administration, and management.

For the work transfer to be successful, the receiving location needs appropriate training. If the work involves knowing the fine points of legacy systems, the primary site must expect to offer significant training. Such a situation is likely to arise if the maintenance or testing stages are transferred. The amount of training might be especially high if the secondary location has high programmer turnover and therefore must continuously retrain personnel. The training needs vary depending on how specialized the work is.

Quantitative evaluation criteria
Although there are a number of dimensions to costs and benefits, we focus on quantifying several aspects of the globalization problem. We propose two globalization scenarios:

■ when an organization evaluates the competing globalization factors, and
■ when an organization generates globalization solutions.
The most common globalization approach that we have seen is to divide functionality among the locations. We quantify a number of factors for that approach.
Work coupling. Work items spanning multiple locations tend to introduce coordination overheads and associated delays. Consequently, having as few such work items as possible is desirable. This criterion can be measured by the number of MRs that modify both the candidate and the complementary parts of the software. That number is the measure of absolute coupling between the candidate and the rest of the system. Chunks are the candidates that minimize this measure, because they have the minimal amount of coupling to the rest of the code base.

In addition to predicting future coordination needs, it is important to assess the candidate part of the software's current coordination overhead. Organizations can make that assessment by counting the MRs that involve participants from multiple locations.

Figure 2 compares two globalization candidates. Each candidate is represented by a list of files that people involved in the globalization decision review. Both candidates start with approximately the same degree of relative coupling, but Candidate 1's relative coupling tends to decrease over time whereas Candidate 2's tends to increase. Additionally, Candidate 1 involves considerably more multisite MRs than Candidate 2. Consequently, Candidate 1 appears to be significantly better for distribution than Candidate 2.

Amount of effort. When assigning a part of the code to a remote location, it is important to ensure that the effort needed on that part of the code matches the candidate location's development-resource capacity. It is also important that the candidate embodies some minimal amount of work; transferring a candidate that requires only a trivial amount of effort might not be worthwhile. The organization can estimate the amount of work that a candidate needs by assessing that candidate's historical effort trends. Assuming that a developer spends roughly equal amounts of effort on each delta, adding the proportions of deltas each developer completed on the candidate during that year gives an approximation of the total effort spent during a year. For example, a developer who completed 100 deltas in a year, 50 of which apply to a particular candidate, would contribute 0.5 technical head-count years to the candidate. (The scale of effort is thus in terms of PY.)
In our experience, resources of between 10 and 20 PY were available in the remote locations, roughly corresponding to a group reporting to a technical manager. The assumption that each delta (done by the same programmer) carries an equal amount of effort is only a rough approximation. In fact, in a number of software projects, a delta that fixes a bug requires more effort than a delta that adds new functionality.9 However, in our problem the approximation of equal effort per delta is reasonable because there is fairly large prediction noise (the effort spent on a candidate might vary over time). Furthermore, each programmer is likely to have a mixture of different deltas in the candidate, averaging out the distinctions in effort among the different types of deltas. In cases where managers need more precise estimates, models are available9 that could help find a more precise effort for each delta.
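As a sketch of this proportional-effort calculation (our illustration; the record type is hypothetical), each developer contributes one PY per year, split across candidates in proportion to his or her deltas:

import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EffortEstimator {
    // One delta record: who made it and whether it falls in the candidate.
    static class DeltaRecord {
        String developer;
        boolean inCandidate;
        DeltaRecord(String developer, boolean inCandidate) {
            this.developer = developer;
            this.inCandidate = inCandidate;
        }
    }

    // Estimate the candidate's yearly effort in PY: each developer contributes
    // the fraction of his or her deltas that fall in the candidate.
    static double candidateEffortPY(List<DeltaRecord> yearOfDeltas) {
        Map<String, int[]> counts = new HashMap<String, int[]>(); // {total, in candidate}
        for (DeltaRecord d : yearOfDeltas) {
            int[] c = counts.get(d.developer);
            if (c == null) counts.put(d.developer, c = new int[2]);
            c[0]++;
            if (d.inCandidate) c[1]++;
        }
        double py = 0.0;
        for (int[] c : counts.values()) py += (double) c[1] / c[0];
        return py; // e.g., 50 of a developer's 100 deltas contribute 0.5 PY
    }
}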
Learning curves. When a chunk of code is transferred to developers who are unfamiliar with the product, the developers might need to substantially adjust their effort. In one of the projects we studied, a typical rule of thumb was that the new remote team would reach full productivity in 12 months. Figure 3 presents an empirical estimate of such a curve. Productivity is measured by the number of deltas a developer completes in a month. We shifted the time for each developer to show their first delta occurring in month one, which let us calculate productivity based on the developer's experience with the transferred code. The figure shows that the time to reach full productivity (where the learning curve flattens) is approximately 15 months. Because developers in this project train for three months before starting work, the total time to reach full productivity is 18 months.
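The curve itself can be recomputed from the same delta data. Here is a minimal sketch of the alignment-and-averaging step, assuming the input maps each developer to the project month of each of his or her deltas (our illustration, not the authors' code):

class LearningCurve {
    // deltaMonths[d] holds the project month of each delta by developer d.
    // Shift each developer so the first delta falls in experience month 1,
    // then average delta counts across developers by months of experience.
    static double[] averageByExperience(int[][] deltaMonths, int horizon) {
        double[] sum = new double[horizon + 1];
        for (int[] months : deltaMonths) {
            int first = Integer.MAX_VALUE;
            for (int m : months) first = Math.min(first, m);
            for (int m : months) {
                int experience = m - first + 1;      // month 1 = first delta
                if (experience <= horizon) sum[experience]++;
            }
        }
        double[] avg = new double[horizon + 1];
        for (int e = 1; e <= horizon; e++) avg[e] = sum[e] / deltaMonths.length;
        return avg;  // avg[e]: mean deltas per developer in experience month e
    }
}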
Figure 3. Learning curve. The horizontal axis shows a developer’s experience on the project in months and the vertical axis shows the average number of deltas for 50 developers who started working on the project between 1995 and 1998. The jagged curve represents monthly averages, and the smooth curve illustrates the trend by smoothing the monthly data.
Algorithm to find the best candidates
We also investigated ways to generate candidates that optimize a desired criterion. Organizations can compare such automatically generated alternatives to existing candidates using qualitative and quantitative evaluations. Based on the previous analysis, we have the following criteria for evaluating candidates:

■ The number of MRs that modify both the candidate and the rest of the system should be minimized.
■ The number of MRs within the candidate that involve participants from several sites should be maximized.
■ The effort needed to work on the candidate should approximately match the spare development resources at the proposed remote site.
Because the first two criteria both measure the number of undesirable MRs, we can minimize the difference between them. In other words, let A be the number of multisite MRs at present, and let B be the number of multisite MRs after the organization transfers the candidate to a remote site. The increase in multisite MRs because of such a transfer can be expressed by the difference B − A. The number B can be approximated by the number of MRs that cross the candidate's boundary (the first criterion). The number A represents multisite MRs that are entirely within the candidate and, presumably, will become single-site MRs once the organization transfers the candidate to a new location (the second criterion).

The algorithm generates possible candidates and selects the best according to the desired criterion. We use a variation of simulated annealing,10,11 whereby new candidates are generated iteratively from a current candidate. The algorithm accepts the generated candidate as the current candidate with a probability that depends on whether the evaluation criteria for the generated candidate are better than those for the current candidate. As input to the algorithm, we provide a set of files or modules; each file is associated with an effort in PY for the prior year (calculated as described previously). Another input consists of a set of MRs, in which each MR is associated with the list of files it modifies and with an indicator of whether it is a multisite MR. Finally, we provide a range of effort in PY for the candidate.

Initially, the algorithm generates a candidate by randomly selecting modules until it gets within the bounds of the specified effort. The new candidate is generated iteratively, where each iteration involves randomly choosing one of three steps:
■ Add a module to the candidate set by randomly selecting modules from the system's complement until one emerges that does not violate the effort boundary conditions.
■ Delete a module from the candidate set by randomly selecting modules to delete from the candidate until one emerges that does not violate the effort boundary conditions.
■ Exchange modules by randomly selecting one module from the candidate and one from the complement until the exchange does not violate the effort boundary conditions.
Once the new candidate is generated, the algorithm evaluates the criterion of interest (coupling to the rest of the system) and compares it to the current candidate's value. If the criterion is improved, the algorithm accepts the new candidate as the current candidate; if not, the algorithm accepts the new candidate as the current candidate with a probability p < 1. This probability p is related to the annealing temperature, which can be decreased with the number of iterations to speed convergence. Because computation speed wasn't a challenge for us, we chose to keep p always above 1/3 to make sure that the algorithm explores the entire solution space without getting stuck in a local minimum. If the current criterion improves upon the criterion value obtained in any step before the iteration, the current candidate and the criterion are recorded as the best solution. This ends the iteration. In practice, we use a slight modification of the algorithm, in which we record optimal candidates for different effort bounds during a single run. Of the two candidates in Figure 2, the first is optimal among candidates consuming approximately 10 PY per year, and the second is optimal among candidates consuming approximately 20 PY per year.
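For concreteness, here is a compact Java rendering of the search loop as we read it from the description above. It is our reconstruction, not the authors' implementation: it approximates B − A as boundary-crossing MRs minus multisite MRs wholly inside the candidate, accepts worse candidates with a fixed probability of 1/3 instead of cooling the temperature, and rejects infeasible neighbors rather than re-drawing within a step.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

class ChunkSearch {
    double[] moduleEffort;     // effort in PY per module, from the prior year
    List<int[]> mrModules;     // for each MR, the modules it changed
    boolean[] mrMultisite;     // for each MR, whether it was multisite
    double minPY, maxPY;       // effort bounds for the candidate
    Random rnd = new Random();

    // Criterion to minimize, approximating B - A.
    double criterion(Set<Integer> candidate) {
        int crossing = 0, insideMultisite = 0;
        for (int i = 0; i < mrModules.size(); i++) {
            boolean in = false, out = false;
            for (int m : mrModules.get(i))
                if (candidate.contains(m)) in = true; else out = true;
            if (in && out) crossing++;
            else if (in && mrMultisite[i]) insideMultisite++;
        }
        return crossing - insideMultisite;
    }

    boolean feasible(Set<Integer> candidate) {
        double e = 0;
        for (int m : candidate) e += moduleEffort[m];
        return e >= minPY && e <= maxPY;
    }

    // Randomly add, delete, or exchange one module (the three steps above).
    void mutate(Set<Integer> candidate) {
        int n = moduleEffort.length, step = rnd.nextInt(3);
        if (step == 0 && candidate.size() < n) {          // add from complement
            int m;
            do { m = rnd.nextInt(n); } while (candidate.contains(m));
            candidate.add(m);
        } else if (step == 1 && candidate.size() > 1) {   // delete from candidate
            candidate.remove(pickFrom(candidate));
        } else if (candidate.size() < n) {                // exchange one in, one out
            int in = pickFrom(candidate), out;
            do { out = rnd.nextInt(n); } while (candidate.contains(out));
            candidate.remove(in);
            candidate.add(out);
        }
    }

    int pickFrom(Set<Integer> s) {
        List<Integer> list = new ArrayList<Integer>(s);
        return list.get(rnd.nextInt(list.size()));
    }

    Set<Integer> search(int iterations) {
        Set<Integer> current = new HashSet<Integer>();
        while (!feasible(current)) {                      // random initial candidate
            current.add(rnd.nextInt(moduleEffort.length));
            if (!current.isEmpty() && !feasible(current) && current.size() == moduleEffort.length)
                current.clear();                          // restart if overshooting
        }
        Set<Integer> best = new HashSet<Integer>(current);
        double currentScore = criterion(current), bestScore = currentScore;
        for (int i = 0; i < iterations; i++) {
            Set<Integer> next = new HashSet<Integer>(current);
            mutate(next);
            if (!feasible(next)) continue;                // reject and try again
            double score = criterion(next);
            // Accept improvements always; accept worse candidates with p = 1/3
            // so the search keeps exploring instead of stalling in a local minimum.
            if (score < currentScore || rnd.nextDouble() < 1.0 / 3.0) {
                current = next;
                currentScore = score;
                if (score < bestScore) {
                    bestScore = score;
                    best = new HashSet<Integer>(current);
                }
            }
        }
        return best;
    }
}

Recording the best candidate separately from the current one matters here: because worse candidates are sometimes accepted, the final current candidate is not necessarily the best one visited.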
Software-development teams could use similar techniques to compare the chunks and the modularity of the code—that is, to check if the work items match the source code directory structure, which typically represents the software's modularity. In large software systems, the alignment between work items and organizational and software structures answers several important practical questions:

■ What is the current work structure, and does it match the initial architecture?
■ Do the current work and software structures match the organizational structure?
■ Does the current work structure match the organization's geographic distribution?
■ How do we define a piece of software so that it is and remains an independent chunk that developers could develop or change independently: Is it a file, a directory, or some other entity?
Our approach applies to any project in which change data have been accumulated. Even in so-called greenfield projects, development proceeds by incremental change, so once the project has produced a substantial amount of code, the algorithm can be applied to the change data. The same technique applies to other areas, including distributing work to contractors in the same country or assessing an existing distribution. Because of our strong emphasis on independent changeability, we think of what we have done as exposing the empirical information-hiding structure of a software system. As a system evolves, decisions that are embodied in the code's structure become intertwined such that they are dependent on each other; a change to one usually means a change to the others. Evolution of the system drives the formation of chunks. The challenge for the software architect is to construct a modular design in which the modules and the chunks closely correspond to each other throughout the system's lifetime.

Acknowledgments
We thank the many software developers at Lucent who have assiduously used the configuration control systems that provided us with the data we needed to perform this and other studies. We also thank development managers Mary Zajac, Iris Dowden, and Daniel Owens for sharing their globalization experiences and opinions.
References
1. E. Carmel, Global Software Teams, Prentice-Hall, Upper Saddle River, N.J., 1999.
2. M.E. Conway, "How Do Committees Invent?," Datamation, vol. 14, no. 4, Apr. 1968, pp. 28–31.
3. D.L. Parnas, "On the Criteria To Be Used in Decomposing Systems into Modules," Comm. ACM, vol. 15, no. 12, 1972, pp. 1053–1058.
4. CVS—Concurrent Versions System, www.cvshome.org/docs/manual/index.html (current 9 Feb. 2001).
5. M.J. Rochkind, "The Source Code Control System," IEEE Trans. Software Eng., vol. 1, no. 4, 1975, pp. 364–370.
6. A.K. Midha, "Software Configuration Management for the 21st Century," Bell Labs Technical J., vol. 2, no. 1, Winter 1997, pp. 154–165.
7. J.D. Herbsleb et al., "Distance, Dependencies, and Delay in a Global Collaboration," Proc. ACM 2000 Conf. Computer-Supported Cooperative Work, ACM Press, New York, 2000, pp. 319–328.
8. R.E. Grinter, J.D. Herbsleb, and D.E. Perry, "The Geography of Coordination: Dealing with Distance in R&D Work," Proc. GROUP '99, ACM Press, New York, 1999, pp. 306–315.
9. T.L. Graves and A. Mockus, "Inferring Change Effort from Configuration Management Data," Proc. Metrics 98: Fifth Int'l Symp. Software Metrics, IEEE CS Press, Los Alamitos, Calif., 1998, pp. 267–273.
10. N. Metropolis et al., "Equation of State Calculations by Fast-Computing Machines," J. Chemical Physics, vol. 21, 1953, pp. 1087–1092.
11. S. Kirkpatrick, C.D. Gelatt Jr., and M.P. Vecchi, "Optimization by Simulated Annealing," Science, vol. 220, May 1983, pp. 671–680.
About the Authors
Audris Mockus is a member of the technical staff in the Software Production Research Department of Bell Labs, where he designs data-mining methods to summarize and augment system-evolution data; Web-based interactive visualization techniques to inspect, present, and control the systems; and statistical models and optimization techniques to understand the systems. He is investigating properties of software changes in large software systems. He received a BS and an MS in applied mathematics from the Moscow Institute of Physics and Technology, and an MS and a PhD in statistics from Carnegie Mellon University. He is a member of the IEEE and the American Statistical Association. Contact him at Bell Labs, 263 Shuman Blvd., Rm. 2F-319, Naperville, IL 60566;
[email protected]; www.bell-labs.com/~audris. David M. Weiss is the director of software technology research at Avaya Laboratories,
where he performs and guides research into ways of improving the effectiveness of software development. Formerly, he was the director of software production research at Bell Labs. He has also served as CTO of PaceLine Technologies and as the director of reuse and measurement at the Software Productivity Consortium. At the Congressional Office of Technology Assessment, he was coauthor of an assessment of the Strategic Defense Initiative, and he was a visiting scholar at the Wang Institute. He originated the GQM approach to software measurement, was a member of the A-7 project at the Naval Research Laboratory, and devised the FAST process for product-line engineering. He has also worked as a programmer and a mathematician. He received a PhD in computer science from the University of Maryland. He is a member of the IEEE and ACM. Contact him at Avaya Communication, 2C-555, 600-700 Mountain Ave., PO Box 636, Murray Hill, NJ 07974;
[email protected], www.avayalabs.com/~weiss.
focus
global software development
Using Components for Rapid Distributed Software Development Alexander Repenning, Andri Ioannidou, and Michele Payton, University of Colorado, Boulder Wenming Ye and Jeremy Roschelle, SRI International
A large, geographically distributed testbed consisting of domain experts, component framework coordinators, developers, publishers, and users produces educational applications using a rapid production pipeline process.

Software development has not reached the maturity of other engineering disciplines; it is still challenging to produce software that works reliably, is easy to use and maintain, and arrives within budget and on time.1 In addition, relatively small software systems for highly specific applications are in increasing demand. This need requires a significantly different approach to software development from that used by their large, monolithic, general-purpose software
counterparts such as Microsoft Word. These microapplications (µApps) require very fast cycle time; that is, relatively small teams of domain experts and developers must build them quickly and iteratively. The organization might allocate these people on the fly, and they are likely to be dispersed geographically. One software development solution that has a long tradition of advocates,2 is recommended by leading experts,3 and is quickly gaining support is component-based development. Components (for example, platform-independent JavaBeans and Windows-only ActiveX controls) are highly reusable units of software functionality4; they let developers conceptualize software as interconnectable building blocks.5 These software components support modular engineering practices, just as integrated circuits support modular design of hardware.6
At least in theory, building large projects out of well-defined and well-behaved building blocks can reduce the complexity of software development, because building on stable substrates is faster than building from scratch.7 The same organization assembling the components might produce and maintain them in complete applications, or acquire them from third-party developers producing so-called commercial off-the-shelf software packages.8

Distributed software development
The component-based approach to software development is generally attractive and has exceptional appeal in distributed software development. One downfall of traditional distributed software development approaches is that software projects are often insufficiently decomposed. This results in overlapping or misunderstood responsibilities, which can lead to significant communication breakdowns and complete project failure.
Table 1. The Stakeholders, Their Roles, Responsibilities, and Products in µApp Development

Stakeholder: San Diego State University and Queens University, Kingston, Ontario, Canada
Role: Domain experts
Responsibilities: Design math activities
Products: HTML activity mockups including explanatory text and pictures

Stakeholder: SRI International, Menlo Park, CA
Roles: Integration team workshop organizers; component framework coordinator
Responsibilities: Organize workshop bringing together all the stakeholders to analyze and design activities; design and evolve component framework; provide component authoring tools; organize component repository, including components and use stories; accumulate and delegate change requests to component authors
Products: Set of paper-based low-fidelity activity mockups; implementation schedule for all activities; component repository; component generator tools; design guidelines; design and implementation services

Stakeholder: University of Colorado, Boulder
Role: Developer
Responsibilities: Develop simulation components; combine library, hand-coded, and generated components into complete µApp
Products: Simulation component prototypes; µApp

Stakeholder: The MathForum, Swarthmore, PA
Roles: Publisher; producer
Responsibilities: Critique µApp design in terms of pedagogy; test µApp for cross-platform compatibility, performance, and clarity of documentation; publish and support µApp users through mentoring service; hold teams accountable for their responsibilities, including content, adherence to guidelines, and scheduling
Products: Final µApp; support materials for users; schedules; reminders; feedback

Stakeholders: Math teachers, students
Role: Users
Responsibilities: Participate in analysis and design; provide feedback to developer; use µApp
Products: Ideas leading to µApps; feedback
The nature of components forces designers and developers to better encapsulate functionality into cohesive, reasonably well-documented chunks of software. This article reports on the experience of a large testbed called Educational Software Components of Tomorrow (www.escot.org), supported by the US National Science Foundation. ESCOT is building a digital library containing educational software focused on middle-school mathematics.9 The ESCOT goals include building interactive JavaBean-based content for educational purposes and exploring the distributed software development process with the specific objective of building and deploying reliable software rapidly. A large pool of geographically distributed stakeholders in the US and Canada participate in the project.

The CORD process
The Component-Oriented Rapid Development process happens in parallel production cycles involving different subsets of stakeholders aligned in a pipeline fashion to output completed software at a weekly rate. It resembles extreme programming.10 XP replaces the four coarse sequential steps of the
waterfall model found in most object-oriented software engineering approaches—including object-oriented design, object-oriented system analysis, the object modeling technique, hierarchical object-oriented design, object-oriented structured design, and responsibility-driven design—with an extremely large number of parallel steps, including analysis, design, implementation, and testing.11 CORD, like XP, employs parallelism, but the granularity of its parallelism relates not only to the process but also to the software components used and the number of distributed teams involved in the development process. The CORD approach involves a number of parallel threads representing distributed teams working on sets of components. The components' relatively small size and CORD's distributed nature suggest development activities that are relatively small in scope, highly parallel, and highly iterative. CORD differs from XP in several crucial ways. In CORD,
■ The project start includes centralized analysis and design. A large group of users, domain experts, designers, and developers analyzes project requirements and creates application mockups and interoperability specifications. These mockups serve as design blueprints for the project's further development. The interoperability specifications attempt to anticipate connectivity issues that will arise during later component-based development and create a robust framework to handle them.
■ Development is distributed. After the initial centralized analysis and design, the group distributes development to independent teams, coordinated through regular builds. Each build assembles all or some components into a testable application or applet.
■ Development is component-centered. This approach is well suited for distributed teams using heterogeneous sets of tools and platforms. In the ESCOT project, we find teams producing components using component generators, retrofitting off-the-shelf components, or programming them from scratch. Some teams use low-end programming environments such as Sun's Java JDK, and others use more sophisticated integrated development environments such as CodeWarrior.
■ Development and delivery are cross-platform. Individual developers and users can use the platform of their choice.
Figure 1. A small part of the design mockup showing simulation (top), buttons (middle), and spreadsheet (bottom) components. If you randomly throw darts at two concentric circles, what is the chance of hitting the smaller circle?
This article describes the CORD process in the context of ESCOT, in which teams collaboratively produce µApps that they publish on the Web. Middle-school students explore them interactively, solve mathematical puzzles, and submit answers to a mentoring service. The Web site posts a new problem each week, and the problems relate to the same theme for a month, with each week's problems becoming progressively more difficult. The particular µApp we describe focuses on the geometry of circles. The purpose of this µApp was to develop a sense of spatial relationships and probability and to explain the derivation of the mathematical constant π using experimental probabilities.
Stakeholders: Collaborators and roles
In the CORD process, several stakeholders from geographically dispersed locations work together. These integration teams, which are formed based on the requirements for the µApp's core components, design and build math µApps. Creating these µApps requires more knowledge than any single person possesses. For instance, developers are technologically savvy but might not sufficiently understand the requirements of end users. End users, on the other hand, are experts in their application domains but might not adequately comprehend technological limitations or opportunities. We can accommodate this "symmetry of ignorance" by combining collaborative design with distributed development in the integration teams.12 Table 1 lists the integration team members and the roles they play in the creation of the component-based distributed π µApp.

Phase 1: Centralized analysis and design
In phase 1 of the CORD process, a large subset of the stakeholders meets to hash out analysis and design issues.

Brainstorming through low-fidelity design media. Distributed software development
often disintegrates when the process is insufficiently decomposed, the stakeholders’ roles are overlapping or poorly defined, or communication breaks down. In the CORD process, even if we distribute the software development process, we gather all the participants together, at least in the initial analysis and design phase. Despite scheduling issues, gathering the entire group is important. When group ideas are first emerging, face-to-face
interaction is essential, enabling us to resolve issues such as role definition and task distribution early in the process. Another important part of the process in the initial meeting is designing activities with low-fidelity media such as paper and Post-It notes, which are accessible to everyone and let all the stakeholders participate in the design process.13 The result of this step is a storyboard, a list of components, and a sense of the interaction between the components. The component list feeds a stepwise refinement procedure gradually leading from the identification of the necessary components to their implementation and integration.

In a five-day integration team workshop held at Swarthmore College, Pennsylvania, in August 1999, a group of domain experts, component framework coordinators, developers, publishers, and users brainstormed ideas for µApps, one of which was the π µApp. We analyzed a suite of about 20 such ideas and created mockups for each one. The original mockup of the π µApp consisted of poster-size paper sheets with Post-It notes representing components (see Figure 1). The design became clearer when participants returned to their home organizations. It was, therefore, absolutely essential to capture the design representation (the poster boards) and additional discussions through a dedicated scribe.

Formalizing design with HTML mockups.
Transforming the initial design from the low-fidelity mockup to a more formal medium is an important next step. We turn rough sketches of text and images into more explicit representation, forcing designers to fill in conceptual gaps. A geographically dispersed group of stakeholders can easily share HTML documents to trigger design discussions. These documents can also be good starting points for soliciting feedback—not only from the internal group, but also from actual users. The publishers and the component framework coordinator scheduled the development of this µApp to begin in February 2000. The domain expert had expertise in designing curricula for math education. She created a preliminary design (see Figure 2) based on the paper mock-up and posted it on the Web as a blueprint for the final application. The Web page prompted a lot of
feedback from developers. The simulation part of the activity (the blue circle in the red box) consisted of throwing darts at a target and counting the darts hitting the blue versus the red part as a means to approximate the value of π. This clarified what kind of information would go into the tables, but the static Web pages shed little light on the way in which the simulation would work.
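The mathematics the activity teaches is a standard Monte Carlo argument: a uniformly random dart thrown at a circle of radius r inscribed in a square of side 2r lands inside the circle with probability πr²/(2r)² = π/4, so four times the observed hit ratio estimates π. A minimal sketch of the calculation (ours, not ESCOT code):

import java.util.Random;

class DartPiEstimate {
    public static void main(String[] args) {
        Random rnd = new Random();
        int darts = 100000, hits = 0;
        for (int i = 0; i < darts; i++) {
            double x = 2 * rnd.nextDouble() - 1;   // uniform in [-1, 1]
            double y = 2 * rnd.nextDouble() - 1;
            if (x * x + y * y <= 1) hits++;        // inside the unit circle
        }
        // The hit ratio approaches pi/4, so 4 * ratio estimates pi.
        System.out.println("pi is approximately " + 4.0 * hits / darts);
    }
}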
Figure 2. Other members of the design team can easily access and critique the HTML mockup of a part of the µApp.
Specifying interoperability design patterns.
The component framework coordinator analyzes the set of low-fidelity requirements, identifying interoperability requirements that all component-based µApps will need to solve. In our specific case, important common requirements include

■ synchronizing data values across components,
■ dynamic publishing and subscribing to data,
■ writing the state of an assemblage of components to persistent storage (for example, XML files),
■ overcoming component mismatches, and
■ submitting user responses to the educational activity (for example, answers to challenges) to a server.

To address these requirements, the component framework coordinator extends standards already well supported by the individual component vendors. In our case, ESCOT extended a de facto standard, the design patterns underlying JavaBeans, with additional design patterns and conventions. Furthermore, ESCOT provided utility classes that implement these patterns, off-loading common interoperability tasks to a centralized software development effort while leaving the details of µApp design and implementation to the integration teams.
Figure 3. We prototyped the central component of each project, called the anchor tenant component, and made it available to other teams as Java applets.
The developers use these patterns to instrument component generator tools. In spirit, these tools are similar to the rapid application development systems, such as JBuilder, commonplace in integrated development environments. However, unlike RAD systems, which focus on general-purpose GUI assembly, component generator tools focus on a narrow vertical application area. ESCOT uses two component generators, AgentSheets and the Geometer's Sketchpad. AgentSheets (described in more detail later) enables authors to design multiagent simulations quickly. The Geometer's Sketchpad lets them design animated sketches that obey Euclidean geometric constraints. Once an author creates a specific design, either tool can output a new JavaBean that conforms to the interoperability design patterns. This facilitates just-in-time production of new components that are highly targeted to a particular domain problem.

In addition, the component framework coordinator maintains a set of end-user (scripting) programming languages, which fill the gaps among existing components in the repository. ESCOT allows pluggable scripting languages and presently supports JavaScript and Logo (a common educational programming language). By having scripting languages, we avoided the need to generate new components that only one specific µApp required, and we implemented the behavior quickly in a script without resorting to the slower cycle time required by our primary programming language, Java. Thus, by using component generators and scripting languages, CORD circumvents the common component-based development problem in which existing components are "never quite right" for the task at hand.
Phase 2: Distributed analysis, design, implementation, and testing
Phase 2 represents a fundamental shift from a centralized mode of operation to a distributed one. Independent, geographically separate teams now work in parallel at the project and local team level. Team selection is based on the requirements for the core components, called anchor tenant components. We use the term anchor tenant in analogy to the large department stores that create the primary organization of shopping malls. We typically design each µApp around one major component, such as a simulation, with many supporting components such as control widgets, data displays, and so forth.
Building anchor tenant components. Developers create prototypes of the anchor tenant component. This is an important part of the process, as it provides the stakeholders involved in the CORD process with concrete representations close to the final application. In February 2000, we selected AgentSheets as our component generator tool to build the anchor tenant for the π µApp. AgentSheets is an agent-based simulation component authoring tool. It applies to many kinds of applications, including mathematics, sociology, physics, chemistry, and art.14 We authored the simulation in the Visual AgenTalk end-user programming language and rendered it into JavaBeans with the Ristretto component generator built into AgentSheets.15 We made the simulation available to the other team members as a Ristretto-generated, Java-enabled Web page for critique (see Figure 3). With the availability of an executable prototype, email discussion increased sharply. Based on the feedback, the design changed significantly through multiple iterations. In other cases, in which it was impossible or too costly to build interactive prototypes, we sometimes built animations instead.
While some consider the cost of prototyping generally too high, we found that building executable prototypes was essential for stakeholders' discussions of crucial design and implementation issues.
Assembling components. Developing the anchor tenant component is an essential part of the development process, but a complete application also requires choosing peripheral components such as databases, charting tools, and buttons. In the CORD process, the developers evaluate the initial choices of components made at the analysis and design stages and accept or reject the choices based on component functionality, usability, and interoperability with the anchor tenant component.

In February 2000, we assembled the complete µApp, including the simulation, buttons to control the simulation, a data table to collect output from the simulation, and an input text field to define the number of samples. Components are tightly coupled through events and values. For instance, the simulation component simultaneously writes statistical information to a table component and reads simulation control parameters from text field components (see Figure 4). Developers assemble components visually by laying them out in a work area and connect components semantically through a wiring metaphor. In simple cases, component outputs are directly connected to other component inputs. More complex arrangements require adapter components or scripts. Guidelines and patterns stipulated by the component framework coordinator support the compatibility of components. Furthermore, we note compatibility problems and use them to guide framework revision.

For distributed component-based software development to work, it is necessary to manage component collections. This management includes maintaining component repositories and collecting component use stories. Who has used which components in what kind of context and how? Did they use them successfully, or did they have to work around issues or even modify a component? In our case, an ethnographer (from the µApp's publishers) captured use stories, which we shared among the extended group as text. We passed the use stories on to the component framework coordinator, who archived them. The plan is for the component framework coordinator to
maintain these use stories by connecting them to the component catalog. This management should help create a cumulative organizational memory that informs the distributed and changing teams participating in the overall testbed.

In addition, the component framework coordinator helps guide the improvement of components in the repository based on specific µApp needs. This is a delicate matter of balancing generality and functionality, as well as granularity, of components. Developers often want highly specific functionality, but components can quickly become unwieldy if they implement behaviors that are only infrequently used. Furthermore, if we add too many variant versions of components to the repository, the collection becomes hard to comprehend. Inheritance hierarchies might seem to be a solution to this problem. However, in our experience, ad hoc growth of an inheritance tree (for example, driven by the needs of individual µApps) leads to clutter. We might use inheritance, but it should be driven by decisions at a product-line level, where we can design branches to conform to stable, recurrent niches in market requirements. Hence, to partly resolve tensions arising in a single µApp cycle, the component framework coordinator often advises the µApp team to use a component generator tool or scripting language to solve nongeneral implementation problems, so as not to clutter the repository. The repository stays focused on highly general and fairly large-grain components, as these are considerably easier to comprehend and use.
Figure 4. We use the ESCOT builder tool to assemble simulation, spreadsheet, text, slider, and button components into a complete µApp. We specify the components used, their parameters and position, and the wiring scheme between them with the builder tool and capture them as XML files.
Figure 5. A finished µApp. Middle-school students first predict the probability of the fish’s being in a certain part of the sphere. They then use the fish bowl simulation to track the position of randomly moving fish and, using the data gathered, verify or reject their predictions.
version, we replaced the idea of throwing darts at a board (see Figure 4) with a fish swimming around randomly in a fish bowl. We modified the layout a number of times in reaction to users’ tests and replaced the crude artwork previously showing in the simulation component with a more sophisticated rendered image (see Figure 5). Publishing and using a µApp Before we publish a µApp in the ESCOT context, the publisher, the MathForum, must test and approve it. Testing the µApp and its components consists of a mix of formal test cases and less formal, functional tests. For the formal portion, the component framework coordinator has been unit-testing individual components in the repository. We maintain unit tests in the repository and update them as we discover bugs, so they become more complete over time. The publisher also tests the candidate µApp at a functional level by asking a representative sample of users to put it through the steps needed to solve the educational problem that the µApp will pose. Considerations such as cross-platform compatibility, 44
The publisher also tests the candidate µApp at a functional level by asking a representative sample of users to put it through the steps needed to solve the educational problem that the µApp will pose. Considerations such as cross-platform compatibility, performance, and clarity of documentation must pass through the publisher's quality control before the µApp is made available to users (teachers and students, in our case) through the MathForum Electronic Problem of the Week Web site.

After the µApp went "live" in March 2000, students interacted with the activities assigned for each week and submitted their answers to MathForum mentors, who guided them through the process. A significant number of users experienced problems loading the µApp. Part of the problem was the use of ESCOT-specific runner software (a tool developed specifically for running these µApps), which we later replaced with a browser-only solution.
At the surface level, component-based approaches appear to be ideally suited for distributed software development. We have found CORD to be effective in building educational applications, enabling aggressive project scheduling. CORD's component-based nature enables a high degree of parallelism involving distributed teams of domain experts, component framework coordinators, developers, publishers, and users in the software's development process. We advocate the use of increasingly formal design representations progressing from informal Post-It notes toward working applications. Increasingly formal design representations enable essential
communication between the distributed team members and avoid premature design commitments by allowing the right degree of design elasticity at different points in the development process. The use of JavaBeans enabled our distributed, heterogeneous developer community to build components using different tools (basic JDKs, IDEs, and generators) and platforms (Windows, Mac, and Unix). However, to make sure that we properly integrated components, we needed a component framework coordinator.

The CORD approach's scalability is bound by the number of components involved in a design. Typical CORD µApps have fewer than 20 components. However, a component's external complexity (that is, the complexity of its API) does not indicate its internal complexity. Interactive simulations, componentized legacy code, and database components often have simple interfaces but complex implementations.

On the negative side, we have a number of issues with the JavaBean platform. While JavaBeans have enabled true cross-platform development, we frequently found platform- and virtual-machine-dependent implementation discrepancies that required a significant number of additional development cycles for debugging and work-around implementation. We recommend that project managers prepare themselves for extremely low development- versus testing-time ratios.
About the Authors
Alexander Repenning is CEO and president of AgentSheets and an assistant professor and member of the Center of LifeLong Learning & Design at the University of Colorado at Boulder. His research interests include end-user programming, visual programming, computers and education, human–computer interaction, and artificial intelligence. He has a PhD in computer science from the University of Colorado at Boulder. He is a member of the IEEE and ACM and the creator of AgentSheets. Contact him at the Dept. of Computer Science and Center of LifeLong Learning & Design, Univ. of Colorado at Boulder, Campus Box 430, Boulder, CO 80309-0430;
[email protected]. Andri Ioannidou is a PhD candidate in computer science and a member of the Center of LifeLong Learning & Design at the University of Colorado, Boulder and a senior project manager at AgentSheets. Her research interests include educational uses of technology, interactive simulations, end-user programming, and end-reuse. She has a BS and an MS in Computer Science from the University of Colorado. She is a member of the ACM. Contact her at the Department of Computer Science and Center of LifeLong Learning & Design, University of Colorado at Boulder, Campus Box 430, Boulder, CO 80309-0430;
[email protected].
Acknowledgments
NSF grants REC 9804930 and SBIR DMI 9761360 supported this work. We would like to thank Janet Bowers, Natalie Sinclair, Chris DiGiano, and Jody Underwood for their valuable contributions to this article.
References
1. B. Boehm and V.R. Basili, "Gaining Intellectual Control of Software Development," Computer, vol. 33, no. 5, May 2000, pp. 27–33.
2. M.D. McIlroy, "Mass Produced Software Components," Software Engineering, P. Naur and B. Randell, eds., NATO Scientific Affairs Division, Garmisch, Germany, 1968, pp. 138–155.
3. PITAC Report, "Information Technology Research: Investing in Our Future," 24 Feb. 1999, www.ccic.gov/ac/report (current 2 Feb. 2001).
4. I. Jacobson, M. Griss, and P. Jonsson, Software Reuse: Architecture, Process, and Organization for Business Success, ACM Press, New York, 1997.
5. J. Udell, "Componentware," Byte, May 1994, pp. 46–56.
6. B.J. Cox, "Planning the Software Industrial Revolution," IEEE Software, Nov. 1990, pp. 25–33.
7. H.A. Simon, The Sciences of the Artificial, 2nd ed., MIT Press, Cambridge, Mass., 1981.
8. J.M. Voas, "Certifying Off-the-Shelf Software Components," Computer, vol. 31, no. 6, June 1998, pp. 53–59.
9. J. Roschelle et al., "Developing Educational Software Components," Computer, vol. 32, no. 9, Sept. 1999, pp. 50–58.
10. K. Beck, "Embracing Change with Extreme Programming," Computer, vol. 32, no. 10, Oct. 1999, pp. 70–77.
11. I. Jacobson, Object-Oriented Software Engineering: A Use Case Driven Approach, Addison-Wesley, Wokingham, UK, 1992.
12. H. Rittel, "Second-Generation Design Methods," Developments in Design Methodology, John Wiley, New York, 1984, pp. 317–327.
13. M. Rettig, "Prototyping for Tiny Fingers," Comm. ACM, vol. 37, no. 4, Apr. 1994, pp. 21–26.
14. A. Repenning and T. Sumner, "AgentSheets: A Medium for Creating Domain-Oriented Visual Languages," Computer, vol. 28, no. 3, Mar. 1995, pp. 17–25.
15. A. Repenning and A. Ioannidou, "Behavior Processors: Layers between End-Users and Java Virtual Machines," Proc. 1997 IEEE Symp. Visual Languages, IEEE CS Press, Piscataway, N.J., 1997, pp. 402–409.
Michele Payton is a database administrator at IBM and a former graduate research assistant in the computer science department at the University of Colorado, Boulder, where she is finishing her master's degree in computer science. She is a member of Tau Beta Pi and the ACM. Contact her at the Department of Computer Science, University of Colorado at Boulder, Campus Box 430, Boulder, CO 80309-0430; [email protected].
Wenming Ye is a software engineer at the Center for Technology in Learning at SRI International. His interests include component-based software design, open source software, and Java development. He has BS and MS degrees in computer science from the University of Colorado at Boulder. Contact him at the Center for Technology in Learning, SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025-3493;
[email protected].
Jeremy Roschelle is a senior cognitive scientist in the Center for Technology in Learning at SRI International. His research interests include interoperable and reusable components for math learning; the design of digital libraries for mathematical applets; and networked, handheld learning devices. He has a PhD in education from the University of California, Berkeley, and a BS in computer science and electrical engineering from the Massachusetts Institute of Technology. Contact him at the Center for Technology in Learning, SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025-3493;
[email protected].
focus
global software development
An Experience in Collaborative Software Engineering Education Jesús Favela, Centro de Investigación Científica y de Educación Superior de Ensenada, Mexico Feniosky Peña-Mora, Massachusetts Institute of Technology
Development of large-scale systems often requires interaction among geographically distributed specialists. Highlighting tool use and team integration, the authors describe a project-oriented course in which students in two countries collaborate on a software project in an Internet-based groupware environment.
Large-scale software development requires the interaction of specialists from different fields who must communicate their decisions and coordinate their activities. As global software development becomes mainstream, software engineers face new challenges for which they have received little or no training.
To help a new generation of software developers better understand the industry's globalization and familiarize them with distributed, collaborative development, we designed a course entitled the Distributed Software Engineering Laboratory. In the class, pairs of students from different countries work as a virtual organization overseeing the whole software development process. We describe the lessons we have learned in this course and propose a framework useful in dealing with some of the difficulties participants face.

The Distributed Software Engineering Laboratory
The Capability Maturity Model (CMM) states that, as organizations reach higher levels of maturity, individual activities become team activities.1,2 Academia has recognized this in making project-based software engineering courses part of the software engineering curriculum.3 As software development becomes global, we should enrich these educational experiences
to incorporate multisite, multicultural, and multilanguage projects, which most students will certainly face in professional life. To address this issue as we taught the class over the last three years, we tracked the students' team-based development of three medium-sized software systems. The projects lasted 32 weeks and included upper undergraduate and graduate students from institutions in two countries: the Massachusetts Institute of Technology in the US and the Centro de Investigación Científica y de Educación Superior de Ensenada in Mexico. In the class, each student must assume a role in the project. One acts as the project manager, another is in charge of quality control, two or three design the system, and so on. All participants meet once a week, through videoconferencing, to report advances and coordinate activities. At other times, they communicate using email, messaging systems, and desktop videoconferencing. To emphasize the need for collaboration, we distribute all development activities. That
is, rather than having one site come up with the requirements and the other produce the design, each phase of the software development process involves participants from both sites. The first class consisted of seven MIT and ten CICESE students and the second of ten MIT and four CICESE students.

The course involves lectures and lab sessions. For the class's first four months, instructors from both sites hold weekly 90-minute lectures using videoconferencing equipment. Because audio and video quality are not always as desired, we also use text-based chat to reinforce any potentially confusing content. WebPresenter, a system developed at CICESE, lets us share HTML presentations.4

The first two labs feature computer-mediated activities requiring pairs of students to develop questions for a client and create use-case diagrams collaboratively. These sessions familiarize students with requirements gathering and analysis, collaboration technology, and their remote colleagues. Starting with the third lab, we leave the meeting agenda to the students. The students used these labs for progress reports, walk-throughs on technical documents, design discussions, and other project-related activities.

All three projects developed software products in the field of computer-mediated collaboration. The first project's group received requirements in very general terms ("develop a tool to support collaborative distributed design"), and the group had to devise an innovative solution. In the second year, the project's objective was to develop a tool to support casual interaction. For the last course, students developed a collaborative environment based on an application service provider architecture. We asked students to define a market for the project they developed and to draft a business plan to present to the MIT Entrepreneurship Forum's $1K Competition (http://50k.mit.edu), designed to encourage students and researchers in the MIT community to produce tomorrow's leading firms. In fact, some students in the second course won the MIT $1K Competition in the software category and were finalists in the $50K Competition.

The group responsible for requirements analysis had to gather product requirements from interviews with experts and potential users, analyses of commercial products, and
their own experience using similar tools. The 32 weeks of the course defined the development time. The analysts and the project manager had to work together to establish a feature set that they could deliver on time. In the first and third courses, the development team worked additional hours to deliver a working prototype that satisfied most original requirements. The second course had to cut down requirements considerably because the group, encouraged by winning the $1K Competition, moved some resources from development to writing the business plan required for the $50K Competition. In all three cases, the students delivered a working prototype on time, but the development effort was higher than originally anticipated. The group underestimated the coordination and communication overhead and overestimated their productivity (disregarding the instructors' repeated warnings). This is not surprising, however, because most had no prior practical experience in software estimation and very limited experience developing similar products.

We evaluated the course through questionnaires, student interviews, and reports from the project management and quality control groups. Additionally, we categorized all email exchanges and submissions to the project's newsgroup.

Tools used and issues raised
There are several types of tools supporting software development collaboration, coordination, and communication. Although some have proven effective in distributed projects, we need ongoing research as to the type of tools and services needed for the additional difficulties that surface in such environments.5

The course aims to create a flexible environment incorporating a variety of loosely integrated tools that take advantage of shared resources. These tools could be more or less valuable depending on the project's group size, group location, and so on. Rather than defining a complete and complex environment from the start, we have followed an evolutionary approach. We first identify the most pressing communication and coordination problems in a software development project and then design the processes and tools that could better address these problems. To make our development environment more realistic, we take into
consideration the supporting software being used in industry for these purposes. We then deploy the tools and test them in a new project; if we identify new collaboration problems, we modify or replace the tools. Every month we conduct a survey among all participants to measure a number of issues ranging from team harmony to tool use. We also analyze the electronic messages sent to all participants and the issues discussed in the meetings. The results presented in this section refer to the first project unless explicitly indicated.

Communication tools
Everyday communications between remote team members rely mostly on email and the instant-messaging system ICQ (www.icq.com). Formal meetings also use desktop videoconferencing equipment. Participants seldom use telephone, snail mail, or fax. All project documents, including project plans, diagrams, test cases, and code, are available in a central repository that all project members can access using a Web browser.

Of particular interest was the participants' ability to communicate with other group members and follow the team's progress. Both groups' perception was that the project started with poor communication and coordination, which improved as the project progressed and hit a new low toward the project's end. This was partly because, by then, most participants concentrated on their own work, which was quite clearly defined and required little close collaboration. Interestingly, their overall satisfaction with the experience followed the opposite trend: starting out well, it diminished (as they found difficulties coordinating their work) and went up again in the final months (as they learned to deal with the project's issues) (see Figure 1).

Participants rated highly the value of a repository for all project information. They could all access the different document versions to review them or use them for their own work. Diagram and document sharing through the Web solved some of the distance problems, accelerated the analysis and design stages, and allowed the programmers to begin their work while the designers or design team corrected the design.

One of our findings was a very high overhead for contacting participants in remote
locations; 41% of the more than 500 email interactions during the first project related to group coordination. Even if a student required only simple information, he or she had to send an email or even schedule a meeting. This was particularly true during the first project, which didn't use an instant-messaging system.

Figure 1. Students rate their experience with the distributed software project. [Line chart: average ratings, from 1 (poor) to 5 (excellent), over five project periods.]

We also recorded a mismatch in the types of interactions between developers in the same site and those in distributed sites. In the second project's first phases, approximately half the interactions with local peers were informal, compared to only 20% of those held with distributed colleagues. In the project's final stages, local informal interactions rose to nearly 80% of all interactions, while those with remote peers remained almost constant.

To provide additional support for this type of informal encounter, the importance of which previous studies have highlighted,6 we developed an environment that informs participants of their teammates' presence when they access project information.7 The system operates as follows. When a user requests a Web page from the repository, the client sends the user's ID and current URL to an awareness server that holds the locations of all current users. Every time a user moves to a different page, the server propagates this information to all other users, and each client displays it in a separate window. This way, all users know what information other project participants are browsing in the project repository. By selecting a person from the awareness window, a user can initiate textual communication with him or her.
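The bookkeeping such an awareness server performs is small enough to sketch. The following Java fragment is our own hypothetical illustration (the class and callback names are ours, not the course system's actual code): it records each user's current repository URL and pushes every change to all registered awareness windows.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// Hypothetical sketch of an awareness server's core state: it maps each
// logged-in user to the repository URL he or she is viewing and notifies
// all connected sessions whenever a location changes.
public class AwarenessServer {
    private final Map<String, String> locations = new ConcurrentHashMap<>();
    private final Map<String, BiConsumer<String, String>> sessions = new ConcurrentHashMap<>();

    // Called when a user's browser requests a repository page.
    public void reportLocation(String userId, String url) {
        locations.put(userId, url);
        // Push the update to every connected awareness window.
        sessions.forEach((id, callback) -> callback.accept(userId, url));
    }

    // A client session registers a callback for location updates.
    public void register(String userId, BiConsumer<String, String> onUpdate) {
        sessions.put(userId, onUpdate);
        // Bring the new session up to date with everyone's current location.
        locations.forEach(onUpdate);
    }

    public void unregister(String userId) {
        sessions.remove(userId);
        locations.remove(userId);
    }
}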
Coordination tools
Several tools have been used to facilitate project coordination, mostly to support project and configuration management. The following are tasks that we have found to be critical in a distributed software development environment.
Project scheduling and tracking. Collaborative Project Planning (CPP) is a synchronous collaborative tool for project planning that lets the project manager specify and schedule activities. The user can display these activities using a Gantt chart and analyze them using the critical path method. In contrast to other project planning tools, CPP can be used simultaneously by two or more users, who can collaborate on defining activities, estimating their duration, analyzing delays, and rescheduling. In a typical scenario, the project manager might start a session with a participant to discuss the status of an activity for which that participant is responsible. Through discussion and negotiation, and by analyzing the effects on the whole project, they might decide to give this activity additional time and adjust the project plan accordingly.

To support project tracking, we implemented a set of tools to help the project administrator with such tasks as sending project status updates to all members and sending weekly lists of assigned activities to individual members or groups.
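The critical path analysis underlying a tool like CPP is straightforward to sketch. The following Java fragment is our own minimal illustration (the names, sample activities, and the assumption of an acyclic, well-formed plan are ours, not CPP's actual code): each activity's earliest finish time is its duration plus the latest finish among its predecessors, and the project duration is the maximum over all activities.

import java.util.*;

// Hypothetical sketch of critical-path scheduling: activities form a DAG,
// and the earliest finish time of each activity is its duration plus the
// latest finish among its predecessors. Slack computation (activities on
// the critical path have zero slack) is omitted for brevity.
public class CriticalPath {
    record Activity(String name, int duration, List<String> predecessors) {}

    static Map<String, Integer> earliestFinish(List<Activity> activities) {
        Map<String, Activity> byName = new HashMap<>();
        activities.forEach(a -> byName.put(a.name(), a));
        Map<String, Integer> finish = new HashMap<>();
        for (Activity a : activities) compute(a, byName, finish);
        return finish;
    }

    private static int compute(Activity a, Map<String, Activity> byName,
                               Map<String, Integer> finish) {
        Integer cached = finish.get(a.name());
        if (cached != null) return cached;
        int start = 0;  // earliest start = latest predecessor finish
        for (String p : a.predecessors())
            start = Math.max(start, compute(byName.get(p), byName, finish));
        int f = start + a.duration();
        finish.put(a.name(), f);
        return f;
    }

    public static void main(String[] args) {
        List<Activity> plan = List.of(
            new Activity("requirements", 4, List.of()),
            new Activity("design", 6, List.of("requirements")),
            new Activity("testPlan", 3, List.of("requirements")),
            new Activity("code", 8, List.of("design")),
            new Activity("test", 4, List.of("code", "testPlan")));
        System.out.println(earliestFinish(plan));  // "test" finishes at week 22
    }
}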
Configuration management. Configuration management combines tools and techniques to control the software development process; it controls the different versions of documents generated through the project's life cycle to avoid update and simultaneous-maintenance problems. We developed a software tool named Web Configuration Manager (WCM) for the course. WCM is a workflow automation system that lets project participants request a change by filling out and submitting a Web form. Every time a participant makes a change request, WCM automatically generates a Web page with information related to it and sends a message to the configuration manager. The configuration manager uses WCM to track all requests and their status.
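The workflow WCM automates reduces to a small amount of state. The following Java fragment is a hypothetical sketch of ours (not the actual WCM implementation): a submitted Web form becomes a tracked change-request record, the configuration manager is notified, and status changes are recorded against the request's ID.

import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a change-request workflow like WCM's: submissions
// become tracked records, the configuration manager is notified, and the
// manager drives each request through its status transitions.
public class ChangeRequestLog {
    enum Status { SUBMITTED, APPROVED, REJECTED, IMPLEMENTED }

    record ChangeRequest(int id, String document, String description,
                         String requester, LocalDate date, Status status) {}

    private final List<ChangeRequest> requests = new ArrayList<>();
    private int nextId = 1;

    // Called when a participant submits the Web form.
    public ChangeRequest submit(String document, String description, String requester) {
        ChangeRequest cr = new ChangeRequest(nextId++, document, description,
                                             requester, LocalDate.now(), Status.SUBMITTED);
        requests.add(cr);
        notifyConfigurationManager(cr);  // e.g., generate a status page and send email
        return cr;
    }

    // The configuration manager updates a request's status by ID.
    public void setStatus(int id, Status status) {
        requests.replaceAll(cr -> cr.id() == id
            ? new ChangeRequest(cr.id(), cr.document(), cr.description(),
                                cr.requester(), cr.date(), status)
            : cr);
    }

    public List<ChangeRequest> openRequests() {
        return requests.stream().filter(cr -> cr.status() == Status.SUBMITTED).toList();
    }

    private void notifyConfigurationManager(ChangeRequest cr) {
        System.out.println("Notify configuration manager: CR-" + cr.id()
                           + " on " + cr.document());
    }
}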
Technical reviews. We developed an additional tool to support technical review meetings, including inspections and walkthroughs, of local or distributed groups of reviewers.8 First, the system registers the document for review along with the participants in the review process. It electronically notifies the reviewers of their participation and gives them a link to the document. In an inspection, participants use the tool asynchronously to review the document and register the defects they find (see Figure 2). Then the reviewers meet (physically or virtually) and use the tool to navigate through the document and describe the defects found. For this task, it synchronizes the Web browsers so that all participants see the same information under the control of the person presenting.

Figure 2. We developed a tool to support technical document inspections.

This change control tool helps the participants respond better and more quickly to changes in documents, diagrams, and code, minimizing idle time and accelerating the development process. They use it mostly during the design phase, to propose changes to the analysis document, and during implementation, when change requests increase. We primarily use its synchronous version, because most reviews were walkthroughs rather than inspections. In this case, we use one tool client from each side and a projection system so that all participants can follow the issues being discussed.

Lessons learned
Below are the main challenges that the project manager identified during the first class project, which took place from September 1997 to April 1998, as well as the changes they led to in subsequent projects.

Team dynamics and integration
The fact that most team members didn't know each other before the project started made efforts to integrate the team much more challenging. All the team-building efforts had to take place in cyberspace, so the participants from the two sides had no opportunity to run into each other in the hallways or chat informally in the dormitory. To address this issue during the second and third courses, we organized integration sessions at the project's beginning. For instance, we asked participants to publish their own Web pages with personal and professional information and to review those of other participants. In a lab session, we asked each student to present to the whole group another student from the remote site based on a chat session and a review of his or her Web page.
The language barrier
All project meetings and communications were in English. This became a problem for several CICESE members whose native
tongue was not English and who clearly had reservations about impromptu English conversations. Throughout the project's first phase, the CICESE team members were hesitant to speak English and lacked confidence in working with their MIT counterparts. In reality, their English was perfectly understandable. Despite reassurance from the MIT team, this pattern of behavior continued for quite some time before the students reached a sufficient comfort level. Once the two sides established a working relationship, the sensitivity to language seemed to lessen.

The language problem can be partially managed through constant encouragement and establishing trust. One method for ensuring that everyone understands the main points of a meeting is to post the salient points on the Web. The documentation specialist published each meeting's minutes along with the agreements and action items. This practice of publishing everything might seem tedious, but it is vital to ensure that the entire team is on the same wavelength.

Cultural differences
The two institutions, MIT and CICESE, have very different cultures. MIT has an atmosphere of individuality and entrepreneurship. The faculty expects students to perform tasks and make quick decisions on their own. It is not an environment in which
students expect or often receive praise. It has a free and casual attitude toward decision making. Students are very assertive and outspoken, seldom withholding their honest opinion. The drawback is that opinions can sometimes be blunt and might offend someone not used to such directness.

The environment at CICESE is more structured and group oriented than MIT's. Students reinforce each other's work with support and praise and seek to build a consensus among the entire team before proceeding with a decision. They are reluctant to take individual responsibility. They are very respectful of authority and less outspoken than their MIT counterparts. For example, a designer at CICESE wanted to criticize the work of an MIT designer but, instead of telling the person directly, asked the CICESE project manager to relay the information to the MIT project manager to handle the situation.

These cultural differences created several problems at the project's beginning. The first crisis arose when the CICESE team felt that the MIT team didn't appreciate their efforts. They had translated much of the Web site from Spanish to English but left a key part, the discussion threads system, in Spanish. The MIT team focused on the need to translate this remaining part, while the CICESE team expected positive feedback and support.
Figure 3. Capacity to track the distributed team's progress as reported by students. The figure shows the average student response, over the duration of the project, to the prompt: "Measure your ability to track the progress of the distributed team." [Line chart: average responses, from 1 (poor) to 5 (excellent), across four periods.]
As a result, a rift grew between the two teams, until the MIT side became aware of the problem. After this incident, the MIT team displayed heightened sensitivity to this issue and always prefaced subsequent critiques with encouraging remarks.

The team contract
A few weeks into the project, it became apparent that no standard expectation existed for many of the project's daily processes. Some students missed meetings scheduled through email because there was not enough advance notice. Some didn't act for several days on emails requesting information. In an attempt to formally document agreements on these and other issues, the teams drafted a contract establishing certain rules on communication, response time, and authority. For example, they decided that participants should respond to all emails within 24 hours and notify the project manager of any expected absences from meetings at least 24 hours beforehand. The team contract proved useful in streamlining communication; although students sometimes failed to reply to email within 24 hours, the frequency of such events lessened noticeably.

Collaboration technology
We developed several of the collaborative tools described earlier to address specific concerns that the previous projects raised. For example, the students' failure to track team progress during the first project motivated development of the project tracking tool (see Figure 3). Most of these tools support communication and coordination, areas hampered by a distributed development environment. Not surprisingly, the tools focus on supporting processes that are critical to achieving level 2 maturity according to the CMM.2

The Web facilitates the loose integration of these tools. By simply adding appropriate links, one can easily move from one tool to
another and to the project's documentation as required. This ease proved particularly useful in the heterogeneous computing environment in which the students worked. On the other hand, the Web raises users' expectations for up-to-the-minute project information. This can lead them to think that other sources of information are unnecessary, and it forces the group to continually update the Web server.

The Five Critical Moves framework
Based on our experience in previous projects, we developed the Five Critical Moves framework for the 1999–2000 lab to help students better understand and deal with distributed development issues. This framework is especially helpful during the first few weeks of a project; it brings the major features of virtual team interaction to the fore. Following are the five moves.

1. Articulating a shared project vision: reflection and collaboration
Negotiating a shared vision can be a complicated process and should be ongoing throughout a course. To begin, students need a chance to develop and articulate their personal vision of what they will get from the course, identify risks, and hear others' ideas. A general outline of the project goals should be available. A document with the group's vision, accessible to all participants, should emerge in such a setting.

2. Building cooperation and trust
When we meet someone face to face, many things can greatly influence our perceptions, trust, and expectations of each other: voice, body language, appearance, gender, age, and even the encounter's location or context. How can we create a context for ourselves on the Internet to establish substantial relationships of cooperation and trust? Getting to know each other, finding ways to communicate feedback to each other, and recognizing the need to cooperate help build cooperation and trust in distributed working environments. Once we have established trust between colleagues, put reward systems in place, and implemented reliable technology, we must facilitate good communication and cooperation in the distance environment. In virtual teams, this means developing protocols for interaction. A clear, predictable approach to communicating feedback can ease tensions around criticism and concerns. By approaching feedback in an open and accepting way, team members can also use each other as resources and mentors. This is especially important in a distributed environment, because one bad communication can lead to significant mistrust that cannot easily be resolved.

3. Establishing responsibilities and power
Next, we develop a responsibility chart based on what each participant sees as his or her responsibility. Each participant should consider these questions:

■ What are the concrete results expected of the participant inside and outside the team?
■ For which task will the participant take responsibility in the work? Who will share the responsibility?
■ Who inside and outside the team will need or want information about the task? Who can benefit from this work?
4. Organizing and using resources
We then identify our resources and who has access to them. We developed a student handbook with a list of such resources so that all students could find them efficiently. In a virtual team, knowing what is available at each location also keeps communication between team members flexible because they can determine whether email, chats, or threaded discussions are the best ways to stay in touch.

5. Coaching
Finally, we use focus groups, interviews, meetings, and surveys to get feedback from the instructors and colleagues. While focus groups encourage reflection on the teamwork and address how well team members collaborate, interviews bring the individuals' experience to the fore.
In job interviews after graduating, students receive positive feedback about the course concept and their experience. As one student explained, "They [the companies] seemed to be impressed with the teamwork environment and the project itself. They focused on the teamwork, the technology, and the software engineering processes that are involved." Thus, the Distributed Software Engineering Laboratory familiarizes students with issues they will face during their professional work in an increasingly global software development industry.

Acknowledgments
We thank Karim Hussein, Josefina Rodríguez, Rhonda Struminger, and Robin Losey, as well as all students involved in the courses, for their support on this project. Consejo Nacional de Ciencia y Tecnología Grant No. 29729A, the 1999 White House Presidential Early Career Award for Scientists and Engineers, the 1999 US National Science Foundation Faculty Early Career Development Award CMS-9875557, and NSF Award III-963002 partially funded this work.
About the Authors

Jesús Favela is a professor at Centro de Investigación Científica y de Educación Superior de Ensenada in Mexico, where he leads the Collaborative Systems Laboratory. He has a BSc from Universidad Nacional Autónoma de México and an MSc and a PhD in computer-aided engineering from the Massachusetts Institute of Technology, where he worked as a research assistant at the Intelligent Engineering Systems Laboratory. His research interests include computer-supported cooperative work, multimedia information retrieval, and software engineering. Contact him at
[email protected].
Feniosky Peña-Mora is an associate professor of information technology and project management in the Civil and Environmental Engineering Department's Intelligent Engineering Systems Group at the Massachusetts Institute of Technology. His current research interests are in information technology support for collaboration, conflict mitigation, change management, negotiation management, and process integration during design and development of large-scale engineering systems. He holds a 1999 National Science Foundation Faculty Early Career Development Award and a 2000 White House Presidential Early Career Award for Scientists and Engineers. Contact him at
[email protected].
References
1. R. Patnayakuni and N. Patnayakuni, "A Socio-Technical Approach to CASE and the Software Development Process," Proc. 1st Am. Conf. Information Systems, Association for Information Systems, Atlanta, Ga., pp. 6–8.
2. M.C. Paulk et al., "Capability Maturity Model for Software v.1.1," tech. report CMU/SEI-93-TR-24, Software Eng. Inst., Carnegie Mellon Univ., Pittsburgh, Feb. 1993.
3. D. Bagert et al., "1999 Guidelines for Software Engineering Education Version 1.0," tech. report CMU/SEI-99-TR-032, Software Eng. Inst., Carnegie Mellon Univ., Pittsburgh, 1999.
4. J.A. Aguilar et al., "An Environment for Supporting Interactive Presentations to Distributed Audiences over the World Wide Web," Proc. 3rd Int'l Workshop Groupware, Universidad Politécnica de Madrid, Oct. 1997, pp. 61–70.
5. J. Herbsleb et al., "An Empirical Study of Global Software Development: Distance and Speed," to be published in Proc. 2001 Int'l Conf. Software Eng., IEEE CS Press, Los Alamitos, Calif., 2001.
6. R. Kraut and L. Streeter, "Coordination in Software Development," Comm. ACM, vol. 38, no. 3, 1995, pp. 69–81.
7. J. Contreras et al., "Improving Collaborative Learning by Supporting Casual Encounters in Distance Learning," Proc. SITE 2000 Int'l Conf., Association for the Advancement of Computing in Education, Norfolk, Va., 2000, pp. 136–142.
8. M. Fagan, "Design and Code Inspections to Reduce Errors in Program Development," IBM Systems J., vol. 15, no. 3, 1976, pp. 182–211.
focus
global software development
Synching or Sinking: Global Software Outsourcing Relationships Richard Heeks, Manchester University S. Krishna, Indian Institute of Management, Bangalore Brian Nicholson, Manchester University Sundeep Sahay, Oslo University
Clients and software developers need to move their global outsourcing relationships up the value chain to reap greater benefits. Yet such moves bring costs and risks. The authors investigate the strategies that differentiate successful and unsuccessful value chain moves.
Global software outsourcing is the outsourcing of software development to subcontractors outside the client organization's home country. India is the leading GSO subcontractor, registering average annual growth of more than 40 percent over the last decade and developing nearly US$4 billion in software for foreign clients in FY 1999.1 Indian firms now develop software for nearly one-third of the Fortune 500.2
Advice for potential GSO clients recommends starting small, at home, and with programmers.3 Many client organizations have followed this advice, putting a toe into the GSO waters through small-scale body shopping—for example, having Indian subcontractor staff come over to the client site to complete a minor, noncritical piece of coding or conversion work. Although this minimizes risk, it also minimizes benefits. In the US, onsite costs of India-related GSO undercut those for hiring US staff by only some 10 to 20 percent, but sending development work offshore to India typically undercuts onsite costs by some 50 percent. Large projects offer a greater potential for savings than small ones. Likewise, the cost savings for hiring Indian analysts or project managers are typically some $1,000 to $2,000 per month greater than those for hiring programmers.4 Clients are therefore keen to move up the
value chain to increase the cost savings of outsourcing while gaining access to local labor and markets. However, moving up the value chain in this way brings additional costs and risks. GSO of any type imposes communication and coordination costs that local outsourcing does not have.5 It also imposes risks of total or partial project failure. Clients and developers are therefore seeking routes through the cost/risk minefield to the benefits that higher-value GSO promises. The research reported here investigates these routes. We selected India as the location for the developer half of the relationships to be studied, and North America, Europe, and East Asia as the client half. In the discussion that follows, we use fictitious names to preserve anonymity.

Synching or sinking
In seeking to understand successful strategies in GSO relationships, we were
struck by contradictions from prescriptive recommendations. Techniques that worked well for one relationship could cause friction and failure in another. For one client, Gowing, the perceived deference and compliance of Indian developers was a key element in the client–developer relationship's success. However, for another client, Sierra, it was a major problem that contributed to the closure of its Indian operations. Most North American clients found outsourcing of highly structured work to be effective, particularly in reducing transaction costs. But in the Japanese client contracts we studied, outsourcing less structured yet more creative tasks helped build meaningful interactions and relationships. Therefore, a contingent perspective had to inform our analysis of field data.

Classic ideas on contingency and organization relate to the match or mismatch between organizations and their environments.6 Such ideas are familiar from the business alliance literature,7 and they also seemed to help explain the dynamics of the GSO relationships we investigated. Successful relationships were those in which a high degree of congruence occurred between developer and client: we call this synching. Unsuccessful relationships were those in which a low degree of congruence was achieved between developer and client: we call this sinking.

Congruence might exist in relation to any number of contextual dimensions. We developed a dimensional framework on the basis of our initial research and other studies of GSO and of user–developer relationships.8,9 Using this, synching was more precisely defined as the minimization of gaps between client and subcontractor along the six Cocpit dimensions: coordination/control systems, objectives and values, capabilities, processes, information, and technology. Armed with this framework, we set out to discover the practical realities of synching.

Synching in practice
Those who fail to synch, sink. Their projects run into problems, including outright failures, and they fail to move up the value chain. Take, for example, Pradsoft, a leading Indian software house, which entered into a software development relationship with Ameriten, a large US client. Although there was some sharing of infor-
mation and overlaps in technology platforms, significant congruence did not develop. The gap arose particularly because of different objectives. Ameriten saw global software outsourcing as a means to cut costs at all costs. Pradsoft, however, saw GSO as a partnership arrangement that should incorporate capacity building and knowledge/technology transfer. Because Ameriten disagreed, transfers did not take place, and the values, processes, and management systems of the two parties significantly differed. Capabilities, too, did not develop as intended. Less than two years after some initial contracts, the relationship was terminated by mutual disagreement, having failed to create synergy or move up the value chain. We did not find complete congruence in any of the GSO relationships we studied. In the more successful ones, though, synching took place to a significant degree along most of the Cocpit dimensions. In general terms, it was the more congruent relationships that delivered more projects closer to deadline and budget. They also built the preconditions for higher-value outsourcing. In particular, congruence fostered trust between client and developer, and this trust progressed the relationship to larger, more highly skilled projects with more offshore components. Nevertheless, synching in practice ran into barriers and problems, particularly around certain issues and Cocpit dimensions, as described in the following case studies.
Global and Shiva
Global is a large North American multinational telecommunications company that took a strategic decision in the late 1980s to become involved in global software outsourcing to India.10 It began to build relationships with a number of developers, primarily with Shiva, one of India's leading software exporters. Involving just 10 Indian staff in 1991, the partnership grew to involve nearly 400 by the late 1990s, mainly based in Mumbai. The relationship deepened to the extent that Shiva worked on some of Global's core technology developments and was designated a full partner lab. This included transfer of ownership of some software products from Global to Shiva, and direct contact between Shiva and some of Global's own customers.
Global worked to achieve congruence on all Cocpit dimensions. It succeeded quite well on what can be considered the harder dimensions such as technology and processes. It steadily upgraded its network links until Shiva gained the same highspeed links and access to the corporate WAN as any of Global’s own departments. In addition, Global’s software development environment was mirrored in India, and the company set aside a dedicated group to ensure that matching technical resources were available at Shiva. For each project, there was a detailed process of project definition and specification development. This ensured that project methodologies, scope, schedule, and deliverables were unambiguously defined and understood by both parties, helping synch information and processes. Contract negotiations similarly served as a mechanism for building shared understanding, even attempting to address some of the softer components of synching, such as objectives and values. At both senior and project levels, Global made extensive use not only of the communications facilities available, but also of physical movement to provide for face-toface meetings. Such meetings proved to be more effective at synching values and informal information, in a way that IT-mediated communication could not. Global also ensured that traffic was two-way, with Global staff visiting Shiva’s offices in India as well as having Indian staff come to North America. Global instituted a comprehensive program of training for Shiva staff to bring capabilities into line with its requirements. This also addressed the process dimension—for example, enabling Indian staff to follow Global’s software development methodologies. It covered objectives and values as well by endeavoring to transmit Global’s corporate culture and value system. Periods of work for Shiva staff in Global’s North American office attempted to do the same thing. Finally, Global tried to address the coordination/control systems dimension. It supported the introduction of North American HR management systems. Despite barriers, Shiva introduced performance appraisal and career management sys-
tems like those operating at Global headquarters. These techniques built a significant degree of congruence, especially for the capabilities, processes, information, and technology dimensions. Shiva’s operations ran quite like those in Global’s home country. The result was an average annual growth of more than 15 percent in Shiva–Global business. By the late 1990s, Global was a major Shiva client, accounting for some $8 million of outsourcing contracts per year. However, Shiva remained an Indian organization based in India, and limits to synching emerged. Perhaps not surprisingly, the limits centered on the soft issues of objectives and values, because they encompass deep sociocultural differences that are not easily erased. Synching coordination/control systems involved the attempted displacement of Shiva’s systems by Global’s own. This did not run smoothly because of the cultural underpinnings of those systems. Shiva’s systems were deep-rooted and based on a relatively personalized and subjective management culture that had a long tradition. Global’s systems were drawn from a culture of objectivity and accountability. Forcing one set of values onto the other was hard, and Shiva’s values proved quite resilient. It took enormous efforts before the Shiva project leader would produce a standardized monthly progress report, and Shiva staff refused to participate in Global’s employee satisfaction survey. Shiva therefore came to be seen as an outlier in the Global family because its coordination/control systems, objectives, and values could not be pushed fully into synch. Congruence of this client–developer relationship has also remained subject to factors in the external environment. During the 1990s, Shiva made itself significantly congruent with Global and its expectations and requirements. In the late 1990s, Global’s IS strategy took a sudden left turn into Internet-based models and applications. Shiva was in synch with legacy-based models and applications—not with this new world. Synching on every one of the Cocpit dimensions, especially capabilities and processes, began to fall apart. At the time of this writing, the relationship was struggling to get back into synch.
Sierra
Sierra is a small but rapidly expanding UK software house, specializing in high-tech customized projects.11 By 2000, it had a turnover of roughly $12 million and employed 80 staff worldwide. Sierra began outsourcing development work to India in 1996 using a body-shopping model. Congruence was poor: capabilities were not up to expectations, and, more importantly, Indian staff were outside Sierra's HR systems and company culture, creating factional divisions between Indian and UK staff.

To bring the developers more in synch, Sierra set up an overseas development center in Bangalore in 1998. The center was created on the basis of a rather hastily conducted business plan but with a high level of optimism and expectation about what could be achieved. The main intention was to produce "a little bit of Sierra in India." Sierra put firm synching strategies into place: it introduced its home processes and coordination/control systems, installed high-speed communications links with videoconferencing functionality, shared information from the UK office over the corporate intranet, and encouraged a youthful and unstructured work environment, like that of its UK headquarters.

Blinded by its enthusiasm, Sierra thought it saw congruence on all Cocpit dimensions and shot the relationship up the value chain. It outsourced whole projects, including virtual liaison with Sierra customers, to the Indian team, envisaging that distance in all its connotations (geographical, cultural, and linguistic) would be invisible. It deemed some projects successful, especially those that involved members of the Indian development team traveling to the customer site for requirements capture.

However, distance could not be made invisible, and limits to synching emerged. As might be expected, the videoconferencing link (when not disrupted by bad weather) could not substitute for face-to-face interaction. It failed to transmit the informal information that personal contact provides, creating a barrier to information synching. As with Global and Shiva, most of the persistent differences clustered along the objectives and values dimension, arguably the dimension about which Sierra as-
sumed most and did least to achieve congruence. These differences, in turn, undermined other Cocpit dimensions. One example of cultural dissonance stemming from these differences was rooted in authority. In Sierra UK, authority came more from technical knowledge rather than position. Processes of creative discussion were legitimized, meetings were open and confrontational, and, during problem-solving discussions, junior staff could and did openly contradict more senior staff. In Sierra’s Indian office, despite the congruence achieved on some dimensions, the opposite attitude to authority and confrontation prevailed. Another example of cultural dissonance was project leakage: slippage in the amount of time dedicated to a project. In the UK, leakage was typically 2 to 3 percent; in India, it was typically 25 to 30 percent. The problem arose over definitions of leakage. From the UK perspective, it was measured not in terms of deadlines, but in terms of how much time was spent on a task. From the Indian staff’s perspective, deadlines mattered but time spent did not, and they felt that UK values were insensitive to Indian conditions. To some degree, coordination/control systems, capabilities, and processes in the Bangalore office were all constrained from synching with the realities or expectations at UK headquarters. These limits to synching within an overseas development center framework were combined in the late 1990s with an external shock to synching similar to that suffered by Global and Shiva. Sierra moved aggressively into the e-commerce market, where contracts had short life cycles and a strong requirement to understand a lot of specific customer needs. The limits and shocks threw the client– developer relationship severely out of gear on several dimensions, and, in 1999, Sierra’s UK managers closed their Bangalore operation. Having failed to “culturally flood” developers in India, they decided to force synching by repatriating development activities, along with Indian staff, to Britain. This was an attempt to dis-embed that staff from its cultural infrastructure and enforce congruence on the foundational but stubborn objectives and values dimension.
Table 1. Client–developer relationships, synching, and outcomes.

                                    Degree of synching
Relationship                        C    O    C    P    I    T     Outcome
Pradsoft–Ameriten                   X    XX   ✓    X    ✓    ✓     Insufficient synching leads to relationship failure.
Global–Shiva (pre-shock)            ✓    —    ✓✓   ✓    ✓    ✓✓    Synching just sufficient for relationship development.
Sierra UK–Bangalore (pre-shock)     —    —    ✓    ✓    —    ✓✓    Synching sufficient for relationship survival but not development.

(The C O C P I T columns are the Cocpit dimensions: coordination/control systems, objectives and values, capabilities, processes, information, and technology. ✓✓: very congruent; ✓: fairly congruent; —: partly congruent; X: slightly congruent; XX: not congruent.)
Overall results
There are many causes for GSO success or failure. However, from our experience with the cases described earlier and others, synching lays the foundation for higher project success rates in GSO and for higher-value GSO. It is a prerequisite for moves up the trust curve. Both client and developer managers must understand what synching means in their particular context. They must recognize which dimensions are most important in that context and look for strategies that bring about synching. But they must also recognize synching's serious practical difficulties. For example, while it proved relatively easy to synch technology and capabilities, it proved hard to synch some aspects of information and coordination/control systems and very hard to synch objectives and values (see Table 1).

Limits to synching
Western clients seem to fall too easily for the argument that, in a globalized world, distance, borders, and place no longer matter. The experience of these cases suggests otherwise. Distance matters, and it interferes with synching because of the difference between GSO's formal/tangible and informal/intangible aspects. Synching strategies in practice have been good at dealing with the former and poor at dealing with the latter, particularly in dealing with three overlapping issues: tacit knowledge, informal information, and culture.
Figure 1. The influence of tacit knowledge, informal information, and culture. [Diagram: tacit knowledge, informal information, and culture each indirectly affect the Cocpit dimensions of coordination/control systems, objectives and values, capabilities, processes, information, and technology.]
In direct terms, the first two are part of the information dimension, and the third is part of the objectives and values dimension. However, their importance derives from the way in which each indirectly affects all Cocpit dimensions other than its own, as Figure 1 shows.

Tacit knowledge. Technologies, specifica-
tions, processes, methodologies, skills, objectives, and management systems can transfer from client to developer. But they all have informational components consisting of two parts: the explicit knowledge that can be laid out formally and the tacit knowledge that cannot. Global, for example, went with GSO best practice and used clear, formalized requirements specifications. Although necessary, these were not sufficient because they incorporated a whole set of tacit assumptions and understandings that were not transferred about the nature of the customer, design and programming choices, and working practices. Clients especially must therefore pay more attention to tacit knowledge. They must recognize its existence within all Cocpit dimensions, identify its content, and look for ways to synch it between themselves and their developers. Techniques might include those described later: use of physical meetings, straddlers, and bridging relationships.
Informal information. Ready transfer of formal procedures also runs into problems because real software development requires constant divergence from formal guidelines and constant improvisation. Informality and improvisation require informal information, which thus remains a critical resource for global software development. Trying to focus on well-structured, stable projects helped some case study clients push a lot of information exchange into the formal realm that IT-mediated distance can handle relatively well. But informal information cannot be handled this way. Sierra found the cost, fragility, and artificiality of videoconferencing excluded informal conversations and restricted interaction to formal exchange of progress toward milestones. Email, too, only operated at an informal level once participants had physically met and built personal relationships. Travel and direct meetings are therefore a continuous and crucial element in GSO relationships to help fully synch the information dimension.

Culture. Players in this global game still re-
tain cultural values rooted in a particular locale. The overseas development center is a powerful tool for synching and, thus, for raising project success rates and moving up the GSO value chain, but it has its limits. Western processes, systems, capabilities, and so forth can all be imposed. However, some cultural “stains” underpinning these dimensions are hard for this global tide to wash away. Clients must learn to live with this. Living with limits to synching How can Western clients retain the benefits of GSO and yet live with the limits to synching? Some pointers were noted earlier, which are specific examples of a more general need to create buffering mechanisms between the distant worlds of client and developer, and bridging mechanisms that allow intangibles to be exchanged between those worlds. In addition to their synching strategies, clients and developers must therefore also identify the buffering and bridging mechanisms that will help them deal with the limits to synching that still exist within global software outsourcing for some key Cocpit dimensions. Sample mechanisms follow.
Management of expectations. Sierra’s over-
seas development center foundered, in part, because UK expectations were out of synch with Indian realities on several Cocpit dimensions. Those expectations were built on too much media hype and too little cold, hard analysis. As the overly positive expectations crashed down to reality, they left in their wake disillusionment with GSO. A key, then, to successful GSO is a realistic expectation of what can be achieved given the level of attrition, the limitations of technological infrastructure, the cultural differences, and so forth. Global and Shiva built up a good in-house understanding of what can and cannot be achieved over a long period with a slow-growing expansion and trust curve. This was more successful than Sierra’s too-much, too-soon approach.
Using straddlers. Straddlers have one foot in the client's world and one in the developer's world. Global, for example, used some of its India-born managers in GSO relationships. These staff members proved adept at understanding what can and cannot be brought into synch. They managed the dimensional gaps that remained, often acting as a buffer between the Indian developers and senior client managers. Shiva, too, despite high attrition rates, had staff at all levels with significant experience of working in North America who were better able to see things from the client's perspective. These straddlers have also acted as conduits for transfer of tacit knowledge and informal information. Sierra, by contrast (perhaps because it was a smaller and more recent entrant), had no effective straddlers in either client or developer groups.

Building bridging relationships. Straddlers
bridge gaps within one individual, but an alternative is to bridge gaps by building strong one-to-one relationships between individual clients and developers. Global and Shiva were active in this, facilitating meetings and other informal contact in which such relationships could develop. They even proactively identified pairs of individuals to bring together. These relationships were a valuable means by which to cope with continuing client–developer mismatch. They created a channel for translating between client and developer and for passing informal information and tacit knowledge. They also created the synch of mutual trust and understanding that can help overcome other dimensional differences.
Figure 2. The GSO relationship environment. [Diagram: the client–developer relationship sits within client and developer environments, each containing policies/laws, customers, competitors, the labor market, and new technologies.]
Understanding the synching environment
Both main cases exposed the danger of external shocks to synching, and they are a reminder that the client–developer relationship does not sit in a vacuum. Instead, it sits within an environment of which some components are illustrated in Figure 2. As well as synching with each other,
clients and developers also seek to synch with other components of their environment. This creates pressures and tensions for the client–developer relationship, which might be acute (the negative impact of e-business developments) or chronic (the problems of developers in meeting demands of staff in the local labor market). There is no magic solution to meeting these environmental pressures, especially when they are unexpected. Explicit discussion of the pressures will be one step; diversification of both parties into other relationships will be another. Managers must recognize that there is a seventh dimension to synching: time. Congruence must be actively sustained because, once created, it cannot ensure a permanently synched relationship in a context of continuous environmental change. Clients and developers must therefore continuously reexamine their relationship and proactively move to address emerging mismatches.
About the Authors

Richard Heeks is a senior lecturer in information systems at the Institute for Development Policy and Management, Univ. of Manchester. His research interests focus on the role of IT in governance and in international development. His PhD researched the Indian software industry. Contact him at IDPM, Univ. of Manchester, Precinct Centre, Manchester, M13 9GH, UK;
[email protected].
S. Krishna is a professor at the Indian Institute of Management, Bangalore. His research
interests concern global software work arrangements. He holds a PhD in software engineering and chairs IIMB’s software enterprise management program, focusing on research and management education in partnership with the local software industry. Contact him at the Indian Inst. of Management, Bannerghatta Rd., Bangalore 560 076, India;
[email protected].
Brian Nicholson is a lecturer in information systems at the School of Accounting and
Finance, University of Manchester. His research interests focus on understanding the complexities of software development between the UK and India, which was also the topic of his PhD. Contact him at the School of Accounting and Finance, Univ. of Manchester, Oxford Rd., M13 9PL, UK;
[email protected].
Sundeep Sahay is an associate professor at the Department of Informatics, University of Oslo. His research interests concern globalization, IT, and work arrangements. Over the past four years, he has been involved in an extensive research program analyzing processes of global software development using distributed teams. Contact him at the Dept. of Informatics, Univ. of Oslo, PO Box 1080, Blindern, N-0316, Norway;
[email protected].
focus
global software development
Surviving Global Software Development Christof Ebert and Philip De Neve, Alcatel
Although there are many good reasons to globally distribute development activities, success is not guaranteed by just opening a development center in another region of the world. This article summarizes and distills best practices from true global software development.
Software development involves teamwork and a lot of communication. It seems rational to put all your engineers in one place, encourage them to share objectives, and let the project run. Why use distributed sites when it's easier to work in one location without the overhead of remote communication and planning? How is it possible to survive (and succeed with) globally dispersed projects?
Working in a global context has its advantages, but it also has drawbacks. On the plus side, you gain time-zone effectiveness and reduced costs in various countries. On the minus side, a globally distributed project incurs overhead for planning and managing people and faces language and cultural barriers. It can also create jealousy, as the more expensive engineers (who are afraid of losing their jobs) are forced to train their much cheaper counterparts. In this case study, we summarize experiences and share best practices from projects of different types and sizes that involve several locations on different continents and in many cultures.

Case study setting
Alcatel is a global telecommunications supplier with a multitude of development projects that typically involve several countries. In this article, we focus on the company's Switching and Routing business division.
Software development in this business division is handled in a central R&D group of several thousand software engineers who are distributed throughout the world in more than 15 development centers in Europe, the US, Asia, and Australia. Strong functional and project organizations—interacting in a classic matrix—facilitate both project focus and long-term skill and technology evolution.

Telecommunication suppliers have worked in a global context for years. The primary drivers in the past were the need to be locally present for customization and after-sales service and to show local customers how many new jobs were created, which in turn could justify more contracts. A second source for internationally dispersed software development is the growing number of acquisitions and mergers, which add new markets, products, engineers, and creativity to the existing team. A third reason for starting new development activities in countries where neither the market nor the acquisitions would justify such
evolution is that it's often impossible to hire young engineers with the necessary skills at a reasonable cost at the existing site. The answer in such cases is to start business in areas such as Eastern Europe or India, which Alcatel has done.

Although development centers have operated autonomously in the past, aligned R&D teams have replaced such local activities with globally managed product development lines. This effectively avoids overhead in terms of redundant skills and resource buffers in various locations. Having no option to act locally, product development lines organize projects to achieve the best possible efficiency—by optimizing the trade-offs between temporarily collocating teams, reducing overheads, and having the right skills available in due time. This, of course, directly relates to the organization's overall profit-and-loss layout, which is adjusted to have R&D provide engineering results internally to business units. Decisions about work allocation are R&D's full responsibility. The parameters by which the department is measured are sales and R&D cost, which lets it internally manage quality, productivity, and lead time. If, for instance, a specific location is too expensive, it's R&D's decision to move engineering toward a higher-productivity location.

This study describes projects within Alcatel's Switching and Routing Business Division. The product spectrum ranges from the proprietary S12 switching system to Corba/Java middleware and front ends. Alcatel is registered for the ISO 9001 standard. The majority of development locations are ranked at CMM Level 2; a few are at CMM Level 3. In terms of effort and cost, the software's share is increasing continuously and currently occupies 80 to 90 percent of R&D's budget. Our focus in this study is on switching projects because they are typically developed in at least two or three sites, sometimes on several continents. The projects vary in size between a few person-years and several hundred person-years, depending on how much new development is involved.

Lessons learned from global development
There are many ways to organize and manage global development. Let's examine some
of Alcatel's experiences. We elaborate on several practices, but due to space restrictions, we can't cover all of them. Nevertheless, we welcome email regarding specific questions.

Organization and allocation
Work organization highly impacts globally distributed software development. Although some research recommends building virtual teams,1 we strongly advise building coherent and collocated teams of fully allocated engineers. Coherence means splitting the work during development according to feature content and assembling a team that can implement a set of related functionality. Collocation means that engineers working on such a set of coherent functionality should sit in the same building, perhaps within the same room. Full allocation implies that engineers working on a project should not be distracted by tasks in other projects.

At their kick-off, projects are already split into pieces of coherent functionality that will grow incrementally. Functional entities are allocated to development teams, which are often based in different locations. Architecture decisions, decision reviews at major milestones, and tests are done in one location. Experts from countries with minor contributions relocate for however long the team needs them. This allows effective project management, independent of how the project is globally allocated. Team members must communicate whenever necessary to make the team efficient.2

At Alcatel, we studied projects over five years in which we could distinguish the degree of collocation and allocation.3 Collocated teams achieved an efficiency improvement during initial validation activities of over 50 percent: with the same number of defects in design and code, teams sitting in one location needed less than half the time for defect detection. Allocation directly impacts overall project efficiency. We found in the same long-term study that small projects with highly scattered resources showed less than half the productivity of projects with fully allocated staff. Cycle time is similarly impacted—people switching between tasks need time to adjust to the new job. We found an impact of a factor of 2 to 3, meaning it would take 2 to 3 times the original effort if people work on several assignments in parallel.
To clarify different organizational needs, we distinguish different roles. The key roles that facilitate allocation in global development include

■ Core competence: Highly experienced senior developers who decide on architecture evolution, specify features, and review critical design decisions across the entire product line.
■ Engineering: The majority of resources, responsible for designing and integrating new functionality for all software.
■ Service: Specific functions for a group of projects with short or repetitive assignments, including industrialization, documentation, and maintenance activities.
These roles are then allocated to various development teams, which constitute a project. Our objectives for improved allocation were as follows:

■ Allocate the majority of people contributing to the project (those in the Engineering role) almost full-time. This is measured with a scatter factor that relates the number of people contributing to a task to the task's total effort; see the sketch after this list. The scatter factor should be at most 1.5 (on average, 1.5 engineers contributing to a one-engineer task), with a clear tendency to reduce it further.
■ Ensure reliable allocation, with agreements on start and end dates. Having time fixed means that, with clear quality and cost targets, the only variable factor is content. Content thus serves as a buffer to mitigate unexpected overruns and is facilitated by incremental development and continuous build.4-6
■ Ensure team collocation, even if the project is distributed across sites. Teams that are spread across several locations face many challenges that can impact their ability to work as a team.
■ Distinguish development (new functionality) from maintenance activities (such as defect correction). They should be organized as separate projects.
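The scatter factor lends itself to a simple computation. The following Python sketch is only an illustration: the assignment records, task names, and field layout are invented, since the article does not describe Alcatel's actual planning tooling.

    from collections import defaultdict

    # Hypothetical assignment records: (task, engineer, fraction of full time).
    assignments = [
        ("call-routing", "eng-a", 1.0),
        ("call-routing", "eng-b", 0.5),
        ("billing-if",   "eng-b", 0.5),
        ("billing-if",   "eng-c", 0.25),
    ]

    def scatter_factors(records):
        """Scatter factor = distinct contributors / full-time-equivalent effort."""
        heads = defaultdict(set)      # distinct contributors per task
        effort = defaultdict(float)   # summed full-time-equivalent effort
        for task, engineer, fraction in records:
            heads[task].add(engineer)
            effort[task] += fraction
        return {task: len(heads[task]) / effort[task] for task in heads}

    for task, factor in scatter_factors(assignments).items():
        print(f"{task}: {factor:.2f} ({'ok' if factor <= 1.5 else 'too scattered'})")

Run on the sample data, the fully staffed task passes (1.33), while the task staffed with two part-timers fails the 1.5 threshold (2.67), which is exactly the scattering the objective is meant to flag.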
Such changes in allocation are a big cultural change—the clear target is to replace isolated expertise with skill-broadening tasks and effective teamwork. This implies a clear individual responsibility for overall project
results. Such simple yet effective rules demand a sufficiently detailed project plan at a project's start that breaks down resource needs into skills and duration and provides a feature development breakdown into teams and increments. Alcatel's Switching and Routing business division institutionalized these changes starting in 1998. Enriching jobs also requires more training and coaching. We saw, however, that coaching pays off the most. Looking only at the engineering cost of nonquality—the time needed to detect and correct defects—we found that projects with intensive coaching (1 to 2 percent of accumulated phase effort) reduced the cost of nonquality in the phase by over 20 percent.3 Break-even is typically reached at around 5 percent coaching effort. This means there are natural limits when involving too many inexperienced engineers.

Concurrent engineering
Previously, engineers lost sight of how their own contributions affected the overall project. Several project postmortems indicated that activities and work product quality were extremely isolated. The effect was that whenever we tried to build the complete product or iteration, we required huge overhead to bring the pieces together. This held true for individual work products as well. Without a product perspective, engineers tended to handle work products inefficiently. Results were moved forward to the next link in the chain, and the cost of nonquality (and the number of delays) accumulated. For example, inspections typically didn't follow the defined process, which involves checkers, inspection leaders, and a maximum reading speed. Instead of applying reasonable exit criteria, many inspections were considered finished when they reached their respective milestone dates—before continuing defect detection with the next, more expensive activity. Testing was conducted with a rather static set of test cases that were not dynamically filtered and adjusted to reliability growth models. The root causes were obvious but so embedded in the culture that the situation required a complete process reengineering to facilitate global development at a competitive cost. The major changes that we implemented involved concurrent engineering and
teamwork. For instance, we assembled cross-functional teams at the beginning of the project—before project kick-off, we called in an expert team to ensure a complete impact analysis, which is a prerequisite to defining increments. Concurrent engineering means that, for example, a tester is also part of such a team—experience shows that designers and testers look at the same problem very differently. Testability is only ensured with a focus on test strategy and on the potential impacts of design decisions made during the project's initial phases.

Teamwork was reinforced to the degree that a team had sole responsibility for realizing a set of functionally related customer requirements. A designer would no longer leave the team when her work product was coded—she would stay to test the product in the context of changes provided by other team members. Feature orientation clearly dominates artificial architectural splits.4,5 The team's targets are based on project targets, and all the team members share them. The team follows up on the targets on the basis of delivered value, such as feature content.3,7 Periodic reviews of team progress with the project lead are necessary to follow up and to help when risks arise that the team cannot mitigate.

We evaluated the effects of this reengineered process carefully over the past two years. We see two effects: we could reduce response time and overall cycle time with earlier defect detection (see Figure 1), and the engineering cost of nonquality in the overall project, as well as field defects, could be reduced significantly due to earlier defect detection. We tested both hypotheses on a set of 68 projects over the past four years (before and after the change). As a result, we can accept, at a significance level of more than 95 percent in a t-test, that the change toward feature-oriented development positively impacts both cycle time and cost.

Figure 1. Effective team management scales directly to faster reaction time: the percentage of defects detected is plotted against the percentage of relative project time (start until handover). The upper curve represents teamwork and feature-oriented development, the middle curve the waterfall approach to defect detection, and the lower curve the original behavior.

Product line concept
Global products with different local markets both stimulate and hinder true global software development. The broad flexibility of modern software can easily result in variants and local evolutions that make it impossible to manage synchronization of any kind of maintenance activity, be it corrective or additive. The absence of clear linkages to
business value invites gold plating—that is, implementing functions that might be rarely used or adding excessive functions that are not necessary to attain the desired business results. This trend had to stop, and the approach was a rigid introduction of a coherent worldwide product line concept. The product line concept is based on few core releases that are further customized according to specific market requirements around the world. The structuring of a system into product families allows the sharing of design effort within the family and, as such, counters the impact of ever-growing complexity. Based on a mapping of customer requirements to architectural units (such as modules, databases, subsystems, and production tools), we achieved a clustering of activities that allowed for splitting activities into three parts:
1. Small independent architectural units that we could separate and leave out of any customization. Typically, they are candidates for moving into separate servers, and development is collocated at one place.
2. Big chunks that any project would impact and that thus need a global focus to facilitate simple customization (such as different signaling types captured with generic protocol descriptions and translation mechanisms). Development happens in multiskilled teams, and these skills are replicated in almost all locations.
3. Market- or customer-specific functional clusters based on the requirement analysis, ultimately forming the project team responsible for a customer project. This type of requirement must be the exception and calls for a dedicated pricing strategy because it creates the most overhead.
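The clustering that produces this three-way split can be pictured as a mapping from customer requirements to the architectural units they touch. The Python sketch below is a minimal, hypothetical model; the requirement IDs, unit names, and mapping are invented, since the article gives no schema.

    # Sketch: classify each requirement by the architectural units it impacts,
    # yielding the three-way split described above. All data is illustrative.
    impact = {
        "REQ-A": {"charging-db"},                 # touches a separable unit only
        "REQ-B": {"call-control", "signaling"},   # touches the global core
        "REQ-C": {"signaling"},
        "REQ-D": {"operator-x-frontend"},         # market-specific functionality
    }
    core_units = {"call-control", "signaling"}
    separable_units = {"charging-db"}

    def classify(req):
        units = impact[req]
        if units <= separable_units:
            return "independent unit (collocated development)"
        if units & core_units:
            return "global core (multiskilled teams, all locations)"
        return "market-specific cluster (customer project team)"

    for req in impact:
        print(req, "->", classify(req))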
Such separation of architectural units is also the necessary precondition for splitting a global project into teams that we can individually collocate.

Change management
Improving the allocation of engineers to only one project naturally means that there is no global owner of a specific work product across projects. Instead, many developers in different places simultaneously share the responsibility of enhancing functionality within one product. Often a distinct work product (or a file with source code) is replicated as variants that are concurrently updated and synchronized to allow the centralized and global evolution of distinct functionality.6 Effective tools and work environments are thus the glue of successful global software development. Most commercial tools cause problems when used in sites around the globe. Almost no tool seamlessly synchronizes or backs up database contents without disturbing engineers who are logged on 24 hours a day, seven days a week. Performance rapidly decreases when multisite use is involved, due to heterogeneous server and network infrastructures.

Managing corrections perfectly illustrates the challenges we observed and the solutions we implemented. Defects impact globally distributed product line architectures, and the risk is high that the same defects will occur again and again. The product line concept implies that engineering teams must align and synchronize feature roadmaps and deliveries of both new and changed (or corrected) functionality. However, synchronization of deliveries adds complexity to the development process. Engineers cannot easily copy corrections from one code branch to the other because they impact ongoing development. Therefore, effective synchronization of the individual corrections involves global visibility of all defects, impacts of defects, correction availability, and evaluation of correction impacts. To facilitate easier communication of appropriate corrections, we introduced a new synchronization mechanism into our worldwide defect database. Based on the detected failure and the originating fault, a list of files in different projects is automatically prepopulated, telling the engineer which variants of a given file need to be corrected.
Although this is rather simple with a parent-and-variant tree at the macroscopic level, careful manual analysis is still needed because of localized small changes at the code-procedure and database-content level. The change management system then automatically triggers those variants (within customization projects). Depending on the project manager's trade-off analysis of failure risk and stability impacts, the developer responsible for the specific customization corrects these defects. This approach immediately helped us focus on major field problems and ensure that they would be avoided in other markets. However, it also revealed the cost of the applied product line roadmap: too many variants create overhead. Obviously, variants must be aligned to allow for better synchronization of contents (both new functionality and corrections) while still preserving the desired functional flavors necessary in a specific market. Global development thus pushes product line architecture heavily toward fewer and simpler threads of design variants.
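The prepopulation step can be pictured as a walk over the parent-and-variant tree. The Python sketch below models it minimally; the file names and tree structure are hypothetical, and the real system's defect-database schema is not disclosed in the article.

    # Sketch: given a fault in one file, prepopulate the list of variant files
    # in other (customization) projects that may need the same correction.
    # Real analysis would still be manual for localized code changes.
    variant_tree = {
        # parent file              variants in customization projects
        "core/signaling.c":     ["cust_de/signaling.c", "cust_in/signaling.c"],
        "cust_de/signaling.c":  [],
        "cust_in/signaling.c":  [],
    }

    parents = {child: parent
               for parent, children in variant_tree.items()
               for child in children}

    def affected_variants(faulty_file):
        """All files in the same parent/variant family, except the faulty one."""
        root = faulty_file
        while root in parents:           # climb to the family root
            root = parents[root]
        family, stack = [root], [root]
        while stack:                     # walk the whole family
            for child in variant_tree.get(stack.pop(), []):
                family.append(child)
                stack.append(child)
        return [f for f in family if f != faulty_file]

    print(affected_variants("cust_de/signaling.c"))
    # -> ['core/signaling.c', 'cust_in/signaling.c']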
Incremental development
Although developers have known about and applied incremental development and related life-cycle models for many years,1,4 it's not so obvious how to implement an entirely incremental approach to a legacy architecture that is primarily driven by heavily interacting subsystems instead of small add-on functionality or independent components. In such architectures, which are typical of legacy systems, top-level (or architectural) design agrees not only on interfaces and impacts on various subsystems but also on a work split that aligns with subsystems. The clash comes when these subsystems must be integrated with all new functionality. Such processes are characterized by extremely long integration cycles that don't show any measurable progress in terms of feature content. The changes we introduced at Alcatel to achieve real incremental development were to

■ Analyze requirements from the beginning in view of how they could be clustered into related functionality.
■ Analyze the context impacts of all increments (interfaces and data structures that are common to several modules) up front, before starting development. The elaboration phase is critical to making real incremental development and a stable test line feasible. Obviously, not all context impacts can be addressed immediately without extending the elaboration phase beyond an acceptable range. It is thus necessary to measure context stability and follow up with root cause analysis of why certain context impacts were overlooked. As a target, elaboration should take no longer than one third of total elapsed time; the remainder of the project duration is devoted to development activities.
■ Provide a project plan that is based on these sets of clustered customer requirements and allocates development teams to each set. Depending on their impact, increments can be delivered to the test line more or less frequently. For instance, if a context impact is detected too late, a new production cycle is necessary, which takes more effort and lead time than regular asynchronous increments of additional code within the originally defined context.
■ Develop each increment within one dedicated team, although a team might be assigned to several increments in a row. Increments must be completed through the end of the unit and feature integration test, to avoid discovering that various components cannot be accepted to the test line. A key criterion for the quality of increments is that they don't break the build.
■ Base the progress tracking of development and test primarily on the integration and testing of single customer requirements (see the sketch following Figure 2). This gives visibility into real progress, because a requirement can only be checked off once it is successfully integrated in the test line. Traceability is improved because each customer requirement links to the related work products.
■ Extensively feature-test increments by using the independent test line.

Increments toward a stable build proved one of the key success factors in global development. We realized that a project's cycle time is heavily affected by whether continuous build is globally applied. Figure 2 shows the results from an overview of projects conducted in the past three years. Especially with the described reengineering efforts, we could continuously reduce cycle time.

Figure 2. Coordinated global development improves project cycle time: relative cycle time by year (1997–2000) for projects of 5, 10, and 30 person-years (PY) of development effort.
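The progress-tracking rule (a requirement counts only once it is integrated and feature-tested, and an increment that breaks the build is rejected) can be stated compactly. A minimal Python sketch, with invented requirement IDs and status fields:

    # Sketch: earned value is counted per customer requirement, not per line
    # of code; an increment that would break the build is not accepted.
    requirements = {
        "REQ-017": {"integrated": True,  "feature_tested": True},
        "REQ-023": {"integrated": True,  "feature_tested": False},
        "REQ-031": {"integrated": False, "feature_tested": False},
    }

    def accept_increment(build_ok, delivered_reqs):
        """Gate an increment: never break the build; count only tested work."""
        if not build_ok:
            return []                    # increment rejected outright
        return [r for r in delivered_reqs
                if requirements[r]["integrated"]
                and requirements[r]["feature_tested"]]

    done = accept_increment(build_ok=True, delivered_reqs=list(requirements))
    print(f"progress: {len(done)}/{len(requirements)} requirements done")
    # -> progress: 1/3 requirements done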
Culture at work
In a company like Alcatel, handling change effectively is mandatory to remain a key player in the industry. Without change, a company will stagnate and eventually disappear. What is crucial in a global development organization is that all development locations working in one product line use the same processes, methodology, and terminology, even when changes occur.8 This looks obvious, but in an organization with several thousand engineers, separated by thousands of kilometers and having different languages and cultures, it is quite a challenge. Convincing top management of that need is not so difficult. The real challenge is
to spread the awareness, communication, and knowledge to all levels in real time—from the different levels of management to the many engineers who want simple yet efficient processes. We thus focused on integrated workflow and online documentation of process and project contents. Today each team and project has globally accessible homepages that allow easy browsing of all internal work products. Work products and roles allow direct navigation to process descriptions and related workflow tools (see Figure 3).
Figure 3. Integrated workflow environment linking process, tools, and work products on Alcatel’s intranet.
Key factors in continuous change are not challenging new tools or processes, but the ability of people (management and developers) to be open to change. This attitude is an integral part of the company's culture. Large efforts have helped create the right attitude and spirit and a company culture based on common goals. One of the major milestones in creating such a common culture, and thus facilitating our status as a global company, was to choose English as the common language within Alcatel (not so obvious for a company based in France). English is mandatory, and language classes are provided in most non-native-speaking countries to raise skills. A common syntactical language does not necessarily mean the same semantics and pragmatics, however. This is obvious with commitments and negotiations in different cultures. We still face some barriers, especially with countries that closely follow their own culture. One way to improve is a heavy exchange of teams and management, to face and live within the different cultures and thus gradually build mutual understanding.

Different development locations and cultures react differently to change. The longer a location works with similar tools and processes in a culture well protected from the outside, the more difficult the change is. This is not surprising, but we experienced direct proof of it. During the past three years, we opened two new and fast-growing development centers, one in Eastern Europe and one in India. By using the greenfield approach—starting over—for these two centers, we successfully introduced our processes without friction. During the past three years, methodologies and processes changed several times without difficulty in these new centers. After three years in operation, one of these centers operates in a CMM Level 3 mode and plays a piloting role within Alcatel to become a CMM Level 5 location.

Process maturity positively impacts global development. CMM Level 2 is an absolute minimum for sound project management (which is a prerequisite for distributed development). However, only a standardized organizational process framework allows a seamless integration of different development centers and ensures that interfaces for
various work products—including workflow management—facilitate an exchange of results and shared contributions. Such continuous learning improves motivation and thus reduces turnover rates. Keeping turnover rates low is clearly a key objective in dealing with legacy software; otherwise, nobody has the necessary long-term view.
Managing global software development is not easy and risks lowering overall productivity. Still, the positive impacts should not be forgotten. A major positive effect is innovation. Engineers with all types of cultural backgrounds actively participate to continuously improve the product, innovate new products, and make processes more effective. Achievements are substantial when engineers of entirely different educations and cultures try to solve problems. Best practices can be shared, and sometimes small changes within the global development community can have big positive effects. One example in Alcatel was the introduction of quiet rooms throughout the world for use in code reviews. Here are some of those best practices, which we identified over the past few years as clearly supporting global software development:

■ Agree on and communicate at project start the respective project targets, such as quality, milestones, content, or resource allocation.
■ Ensure that commitments exist in written and controlled form.
■ Have one project leader who is fully responsible for achieving project targets, and assign her a project management team that represents the major cultures within the project.
■ Within each project, follow up continuously on the top 10 risks, which in a global project are typically less technical than organizational.
■ Define at a project's beginning which teams are involved and what they will do in each location.
■ Set up a project homepage that summarizes project content, progress metrics, planning information, and team-specific information.
■ Provide an interactive process model based on accepted best practices that allows tailoring processes to the specific needs of a project or even a team.
■ Rigorously enforce use of the agreed standard process within a product line, corresponding to a CMM Level 3 organization pattern.
■ Provide sufficient communication means, such as videoconferencing, shared workspaces, and global software libraries.
■ Rigorously enforce CM and build management rules (such as branching, merging, and synchronization frequency) and provide the necessary tool support.
■ Rotate management across locations and cultures to create the necessary awareness of cultural diversity and how to cope with it.
■ Set up mixed teams from different countries to integrate individual cultural backgrounds toward a corporate and project-oriented spirit.
■ Make teams fully accountable and responsible for their results.
Global software development is not the target per se, but rather the result of a conscious business-oriented trade-off. The guiding principles are to optimize the cost of engineering while still achieving the best feasible integration of all R&D centers worldwide. These needs must be carefully balanced against additional costs that might occur only at a later point. These include staff turnover rates, which in other countries might be higher than in Europe; cost overheads related to traveling, relocation, communication, or redundant development and test equipment; unavailability of dedicated tools that support globally distributed work environments; impacts on the learning curve, which slows down as more locations become involved; cultural differences that can affect the work climate; insufficient language skills; different legal constraints related to work time, organization, or participation of unions; and the building up of redundant skills and resource buffers to be prepared for collocated teams and unforeseen maintenance activities. We faced all these obstacles, and even the best training cannot substitute for
extremely cooperative engineers and highly effective management. To be successful in a global market, a company should manage the risks of global software development, but can use the positive aspects as input to shape the development process in detail and the culture in general. History has shown us time and again that mixing blood is the best thing that can be done in the path of evolution. Globalization achieves exactly that.
References
1. D.W. Karolak, Global Software Development, IEEE CS Press, Los Alamitos, Calif., 1998.
2. T. DeMarco and T. Lister, Peopleware, 2nd ed., Dorset House, New York, 1999.
3. C. Ebert, "Improving Validation Activities in a Global Software Development," Proc. Int'l Conf. Software Eng., IEEE CS Press, Los Alamitos, Calif., 2001.
4. S. McConnell, Software Project Survival Guide, Microsoft Press, Redmond, Wash., 1998.
5. E.A. Karlsson et al., "Daily Build and Feature Development in Large Distributed Projects," Proc. Int'l Conf. Software Eng., IEEE CS Press, Los Alamitos, Calif., 2000, pp. 649–658.
6. D.E. Perry et al., "Parallel Changes in Large-Scale Software Development: An Observational Case Study," Proc. Int'l Conf. Software Eng., IEEE CS Press, Los Alamitos, Calif., 1998, pp. 251–260.
7. W. Royce, Software Project Management, Addison-Wesley, Reading, Mass., 1998.
8. D. Bunnell, Making the Cisco Connection, John Wiley & Sons, New York, 2000.
About the Authors
Christof Ebert is Alcatel's director of software coordination and process improvement.
He is also IEEE Software's associate editor in chief for requirements engineering. His research interests include requirements engineering, software metrics, software quality assurance, real-time software development, and CASE support for such activities. He studied electrical engineering and software engineering at the University of Stuttgart and Kansas State University. He received his PhD from the University of Stuttgart. He is a member of the IEEE, GI, and the Alpha Lambda Delta honor society. Contact him at Alcatel, Fr.-Wellesplein 1, B-2018 Antwerpen, Belgium;
[email protected]. Philip De Neve is the design center manager of Alcatel’s Chennai Development Center
in India. His interests include human responses in multicultural software organizations, real-time software development, and artificial intelligence. He studied electrical engineering at KIHO Ghent. Contact him at Alcatel SRD, 94/95 Thiruvika Industrial Estate, Guindy, 600 032 Chennai, India;
[email protected].
focus
global software development
Leveraging Resources in Global Software Development Robert D. Battin, Ron Crocker, and Joe Kreidler, Motorola; K. Subramanian, Motorola India Electronics
Developing business-critical software with engineers from six countries— all contributing high-quality components— expands the definition of teamwork. Having resolved global development issues on a major project, the authors suggest solutions to others facing similar challenges.
Leveraging global resources for software development is rapidly becoming the norm at Motorola, which has over 25 software development centers worldwide. Our project, called the 3G Trial (Third Generation Cellular System), was the first of its scope and significance developed by a global engineering team at Motorola.
Staffing was the most significant issue we encountered in the 3G Trial. We had only about 20 percent of the required staff available at our division headquarters in Arlington Heights, Ill., US, and needed to find the other 80 percent to successfully complete the project. Early on, we concluded that our only means to staff the project was to rely on software development engineers from Motorola's worldwide software centers. We developed the system with staffing from six different countries, as Figure 1 shows. Next, we had to integrate the people into a team. While addressing this challenge, we identified key risk factors and developed approaches to reduce them. We separated the project risk factors into the five categories Erran Carmel describes as the centrifugal forces that pull global projects apart.1 Table 1 lists the issues under these categories and the solution strategies we used to resolve them. To pass on the lessons we learned from this project, this article sets out the global
development issues we faced, our approaches to resolving them, and our findings compared with other research.

Loss of communication richness
Communication is the key to team-based success. Dale Walter Karolak2 devotes an entire chapter to the subject, and two of Carmel's centrifugal forces1 are communication related. Let's look first at the specific communication issues we faced in our project and our approach to managing them.

Issue: physical distances and time zones. We designed and developed the 3G Trial in six countries across three continents. The distances between development centers varied from 1,300 miles to 9,900 miles. The lack of direct flights between many of the centers further complicated travel. The geographic separation made it impractical to ever get the entire team together, so distance became an issue for communication.
Figure 1. Worldwide development centers: the US, China, Japan, India, Singapore, and Australia.
Each development center maintained local work shifts (9 a.m. to 6 p.m.), so development occurred at some location 21.5 hours per day (see Figure 2). However, maintaining the local work shifts provided no common overlapping work hours for all the development sites, adding to our communication issues. Solution: liaisons from the non-US sites When we committed to working with the worldwide development teams, our first
step was to identify liaisons from each site. The liaisons were engineers who moved to Arlington Heights for as long as three months. Their responsibility during this period was to meet the Arlington Heights developers, learn the system, help complete the system-level requirements and specifications, and communicate this information back to the development staffs at their home offices. Having these engineers in Arlington Heights turned out to be a key factor in successfully completing the 3G Trial project.
Table 1. Global development issues and solution strategies.

Loss of communication richness: physical distance, time zone disparity, domain expertise.
Coordination breakdown: architecture, software integration, software configuration management.
Geographical dispersion: vendor support, governmental issues.
Loss of "teamness": development process.
Cultural differences: local impression of remote teams.

Solution strategies (the table's columns): liaisons, architectural principles, incremental integration, rational task assignment, common work products, common tools, contracts, complete life cycle, centralized bug reporting, don't impose a common process, communication, and experience. An X in the table marks each strategy applied to each issue.
[Figure 2 charts each site's 9 a.m.–6 p.m. local shift on a UTC time line (0000–2400 h) for Tokyo, Beijing, Singapore, Bangalore, Arlington Heights, and Adelaide, showing a 2.5-hour daily downtime when no site is working.]
Figure 2. Work shifts by location.
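The 21.5-hour coverage and 2.5-hour downtime in Figure 2 can be recomputed from the local shifts. The Python sketch below assumes the sites' standard-time UTC offsets, which the article does not list.

    # Sketch: recompute daily coverage from Figure 2. Local shifts are
    # 9:00-18:00; the UTC offsets are an assumption (standard time, no DST).
    utc_offset = {
        "Tokyo": 9.0, "Beijing": 8.0, "Singapore": 8.0,
        "Bangalore": 5.5, "Arlington Heights": -6.0, "Adelaide": 9.5,
    }

    def covered_halfhours():
        slots = set()
        for offset in utc_offset.values():
            start = (9.0 - offset) % 24          # shift start, in UTC hours
            for i in range(18):                  # 9 hours = 18 half-hours
                slots.add((start * 2 + i) % 48)  # half-hour slots, mod one day
        return slots

    slots = covered_halfhours()
    print(f"coverage: {len(slots) / 2:.1f} h/day, "
          f"downtime: {(48 - len(slots)) / 2:.1f} h/day")
    # -> coverage: 21.5 h/day, downtime: 2.5 h/day

Under these assumed offsets, the computed coverage matches the 21.5 hours per day stated in the text, with the 2.5-hour gap falling between Bangalore's evening and Arlington Heights' morning.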
Not only did the liaisons learn the system; more importantly, they developed relationships with the Arlington Heights teams. Although none of our sources defines this role explicitly, Jaclyn Kostner3 and James Herbsleb and Rebecca Grinter4 point out the value of building trust among the teams. The key contribution of the liaisons was to build the "trusting shared future" for our project.3

Solution: continuous communication. Once the liaisons returned home, we lost our local link to those sites. To close this gap, we reinforced our communication mechanisms among the teams in three ways.

Intranet connectivity. When the 3G Trial
began, the software configuration management (SCM) tools were not in place. The team decided that an intranet publication would be the most expedient means to communicate in a non-real-time mode, for activities such as sharing documents.

Conference calls. Although face-to-face communication is still the best means of exchanging ideas,1-3 it was not practical for our project. We met our real-time communications needs by teleconferencing, and it became a critical component of our communication strategy. In developing the 3G Trial, we had to inspect over 40 system-level specifications. We accomplished these inspections
using conference calls that often started at 10 p.m. Arlington Heights time to allow full participation from the non-US sites. We also used conference calls to resolve problems that required a complete information exchange between engineers at various sites. Although we could report a problem by email almost instantaneously to all teams, the resolution often required detailed discussions. These discussions were scheduled in an impromptu manner during the night from the site requesting the conference call. For example, if the engineers in India decided they needed a conference call with the other engineers, they would email a notice, and the call would take place that evening in India and in the morning in Arlington Heights. Travel. We reserved travel for those cases where physical presence was absolutely required. Most trips fell into either liaison visits (early) or integration support (later). The team took over 150 trips and logged 1.4 million air miles. We used travel judiciously because each trip was expensive and time-consuming.
Issue: domain expertise. Because people from outside the traditional product organization largely staffed this project, we saw the lack of problem domain expertise as a significant risk. The time spent riding the learning curve to become knowledgeable in the problem space was time directly removed from delivering the project. Because the non-US engineers were unable to learn directly from the domain experts, we needed another mechanism to provide the learning.

Solution: liaisons to share the knowledge. The liaisons played an integral role in sharing the architectural vision with their teams. By working the liaisons into the definition phase of the project, they understood both the intent of the architecture and the overall theme of the project. They provided the link between the development team and the architecture team and resolved the majority of the development team's architectural and domain-related questions locally.

Coordination breakdowns
Carmel relates the coordination issues to the complexity issue. In this project, complexity was a given.1 The technology was new and sophisticated, and the team structure only
added to the complexity. We identified several issues related to coordinating the teams.

Issue: architecture. The system we were developing was an experimental one designed to prove the 3G concepts to both our customer and ourselves; as such, it exhibited emergent properties. Additionally, as the non-US teams were not very familiar with 3G concepts, we had the challenge of coming up with an architecture that everyone could understand. Finally, we had to avoid a structure that put any single portion of the system in every critical path.

Solution: strongly held architectural principles. Our principal technical complexity management tool was to actively work an "architecture lite"5 with a few strongly held key principles. Abstractly, these were:
■ low coupling among network elements,
■ well-defined interfaces, and
■ concise semantics for the network element behaviors.
An architecture lite fully describes the system architecture but at a high level of abstraction, including both static and dynamic views. We increased the detail in the interelement interfaces and reduced the detail in the specifics of the elements, allowing us to quickly understand our system. We saved time and resources in areas where we would only have been making guesses because of the system's emergent properties. We developed a relatively simple system metaphor6 embracing these principles, and we relied on the principles to guide us when a behavior emerged. Because the network elements' behavior and interfaces were well defined, the elements could be developed in a context shared among the team members. Finally, as the network elements had low coupling, each team could run independently of both any other team and the architects. By managing complexity at this level, we were free to distribute the development of the network elements to any group.

Solution: rational task distribution. Both Carmel1 and Karolak2 discuss various ways of dividing the work; our approach was a combination of these, which incorporated a rational method of assigning
tasks to locations. First, a location was only on the list based on available staff—real people, not simply authorization to hire new staff. Next, we considered expertise. For the teams with domain experience, the task assigned to them was within their area of expertise. Then we leveraged any particular advantage of a location. For example, we chose both the Japan and Arlington Heights teams for their tasks because they were physically close to the companies they would interact with. We assigned the remaining tasks to locations based on their ability to support the staffing requirements of the tasks.
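That assignment order (available staff first, then domain expertise, then local advantage, then staffing capacity) amounts to a ranking rule. A minimal Python sketch with invented site data — the article does not describe any actual tooling for this:

    # Sketch: rank candidate sites for a task. Sites with no available staff
    # are excluded outright; the rest sort by expertise, local advantage,
    # and whether they can cover the task's staffing needs.
    sites = [
        {"name": "Tokyo",     "available": 12, "expert": True,  "local_edge": True},
        {"name": "Bangalore", "available": 40, "expert": False, "local_edge": False},
        {"name": "Adelaide",  "available": 0,  "expert": True,  "local_edge": False},
    ]

    def candidates(task_size):
        eligible = [s for s in sites if s["available"] > 0]  # real people only
        return sorted(eligible,
                      key=lambda s: (s["expert"],
                                     s["local_edge"],
                                     s["available"] >= task_size),
                      reverse=True)

    print([s["name"] for s in candidates(task_size=20)])
    # -> ['Tokyo', 'Bangalore']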
Issue: software integration. Although working with a large number of independent teams let us address several problems simultaneously, it increased the amount of effort required to integrate the parts. We had multiple levels of integration: shared-library, subsystem, cluster, and system levels. Each offered unique challenges as the intersite interactions increased.

Solution: incremental integration. We used an incremental integration strategy to integrate the software at multiple levels.6,7 Our incremental integration plan, based on clusters and shared milestones, let us build confidence in both the individual network element software and the system over time. We grew an incremental understanding of pair-wise network element interactions and never faced "big bang" integration.

Clusters. Early on, we determined that we
needed to work at end-to-end layers of the system. When we laid out the original project schedule, everything funneled through one work item. To break this dependency, we defined clusters that divided the system horizontally (by end-to-end functionality) instead of vertically (by subsystem). Each cluster denoted some significant portion of the system, allowing each to progress independently as far as possible. When intercluster integration came about, it tended to be within a subsystem and therefore within a team.

Shared milestones. To integrate and coordinate
the worldwide development, we defined a set of incremental milestones. At the system level, we selected and scheduled certain key deliverable features (such as packet data sessions).
IEEE SOFTWARE
73
The liaisons provided the key link between the architecture team and the development teams, as well as providing the US management team with a face to put with the non-US centers.
data sessions). We broke down these system deliverables into cluster-level milestones (for example, originate the control session). The cluster milestones decomposed into subsystem deliverables (for example, handle the origination message). Each subsystem was responsible for informing the other subsystems of any changes within its milestones. These changes could then roll up into the cluster and system milestones. As the teams finished a given deliverable’s development, we rolled them into the lab and integrated the network elements. Issue: software configuration management SCM is challenging enough, but it is even more challenging for the globally distributed project. Like any large project, we had multiple versions of modules used in multiple builds by multiple developers, with problems reported and tracked against these items. Each development center had an installed SCM system, but there was no common SCM tool being used at all centers. Solution: a common SCM tool with multisite replication and a centralized bug repository We chose a SCM tool with multisite data replication for use at all the development centers. Similarly, we chose a problemtracking tool shared by the entire team, residing on servers in Arlington Heights but accessible from anywhere using a Webbased interface. It is less important to focus on the particular tools we chose than to understand the functions these tools allowed us. The multisite capability let each team run independently while still having visibility to changes from the other teams. The Web integration of the problem-tracking tool allowed a lightweight interface with full visibility, further decoupling the teams. Geographic dispersion In addition to the issues we’ve already discussed, we had to deal with some other practical issues related to using global resources for our project. Issue: vendor support We developed the 3G Trial using a variety of tools and products from third-party vendors. Obtaining consistent support from these vendors for global development
74
IEEE SOFTWARE
March/April 2001
proved far more difficult than we originally thought. Although most vendors were based in North America, providing acceptable sales and technical support for Arlington Heights, the situation for the other sites was not as good. Obtaining the same version of a product from multiple sales teams proved quite difficult. While the latest version of most products was readily available in the US, the vendors were often still introducing previous versions in other countries. Local technical support was absent in many countries; when available, we were plagued with inconsistent answers. In addition, although all vendors provided email support, some did not have 24-hour phone technical support. Solution: international support contracts For several vendors, the level of support offered did not match our needs; we required a custom support contract. For the non-US development locations, there was typically a premium for such support due to the required time coverage. For this project, we decided it was worthwhile to have this extra service. Paying a support premium to a vendor was worth the reduction in risk related to these vendors. Solution: centralized bug reports Even with standard maintenance agreements and premium support contracts, occasionally we needed to improve on the support available from our vendors. One effective means was to assign the Arlington Heights site the principal contact point for the vendors. When a non-US site would find a bug with a vendor product, Arlington Heights would duplicate the bug, report the bug to the vendor, and track it for the entire team. The bugs were real; their reporting location was centralized to improve vendor responsiveness. Centralizing the bug reports alleviated our issues with version management and the lack of local vendor support. Issue: government issues We also experienced difficulties in dealing with various governments we encountered during the project. We had to work with the immigration work law of each development center country and take the time and effort to obtain proper visas for the engineers who might be traveling. We had to
understand each country’s import and export rules and regulations and procedures for getting customs clearance. We also had to understand and comply with the US export law. We discovered that some of the tools we had originally selected for the project were not cleared for export to all the countries involved in development. Solution: experience and planning In all of these issues, we learned that understanding and respecting the time and documentation requirements imposed by the laws of the various countries was the key success factor. For example, we had a standard PC camera stuck in customs for over a month, not because it was a banned product in the destination country, but rather because the rounded up declared cost did not match the shipping invoice. The mistake was ours, but the time and effort to correct the problem was substantial and, of course, unplanned. As another example, obtaining proper visas for non-US developers to visit other non-US sites was more difficult initially because we lacked an understanding of the countries’ rules and regulations. After we obtained the first few visas, we had a time gauge to use that helped us effectively plan travel in advance. Loss of “teamness” As Carmel points out, it is often difficult to integrate separate independent teams into a coherent team.1 Our project faced this same difficulty. Issue: differing development processes Each development center had its own development process. Within each center’s process there was a unique set of named deliverables with expected content. We faced the challenge of coordinating among the various organizations that follow different processes. One issue was aligning the expected delivery sequence of deliverables among the different processes. Another was aligning the content of deliverables across the teams. For example, the Arlington Heights center calls a particular item a Software Functional Specification while the Adelaide center calls it a Software Requirements Specification. These documents contained overlapping but not identical content.
Solution: don’t impose a common process We resisted the temptation to impose a common process across the teams. Although this did not solve any of our project management problems, it did let each team begin producing results immediately, using a process they were familiar with. If the teams had been forced into a common process, the learning curve would have impacted the delivery of the system. Solution: common work products We understood the inconsistency in notations and terminology in the beginning of the project and came up with a set of common “work products” and vocabulary. The common work products were then mapped to specific deliverables of the individual centers. We also put these work product vocabularies into the problem-tracking database to let the engineers submit their documents under the proper work product name.
We understood the inconsistency in notations and terminology in the beginning of the project and came up with a set of common “work products” and vocabulary.
Solution: complete development life cycle Each development center had slightly different process definitions, but all were complete life-cycle processes. We decided that each development center would be responsible for delivering a tested subsystem. Thus, during the initial architecture definition and subsequent assignment of network elements, we took great care to specify and assign a black box to a development center. Cultural differences Although Carmel devotes an entire chapter to cultural differences, it was less of an issue for us.1 Motorola has vast experience working with global customers, and even this project had a non-US customer. Most of our experience was with in-country sales teams and “back home” engineering. Issue: US impression of international development We noticed at the beginning of the project some reluctance among the Arlington Heights management team to use international developers. It’s one thing to be able to say to management that enough engineers have been allocated to work on a project, but these engineers must be able to deliver functioning software. The management team had a genuine concern that the international engineering teams would not be able to produce as needed. March/April 2001
IEEE SOFTWARE
75
Table 2. Development statistics.

    Development location     Code size (KLOC, C/C++)   Defect density (defects/KLOC)
    Beijing, China                    57                        0.7
    Arlington Heights, US             54                        0.8
    Arlington Heights, US             74                        1.3
    Tokyo, Japan                      56                        0.2
    Bangalore, India                 165                        0.5
    Singapore                         45                        0.1
    Adelaide, Australia               60                        0.5
Solution: familiarity brought a change in impression
We had no plan for removing skepticism in Arlington Heights; we hoped it would go away as the project progressed. By having the liaisons, the US team could have face-to-face discussions with these international developers. Through attending requirements and design meetings, the local team learned that all of the non-US teams understood software engineering concepts. The interaction made the staff comfortable with the non-US engineers' qualifications to build their network elements.

Our impression of global development
Our experiences with global development have been mostly positive. The team developed 511,000 lines of code for the project. Each center contributed a significant portion of the system. There would have been no system to deliver without the contributions of all of the distributed teams. When the project was completed, we collected metrics to find the code size distribution and defect density. Capers Jones reports that the US industry average for defect density is 2.6 defects per 1,000 lines of code.8 As Table 2 shows, our defect density results demonstrate that globally distributing software development did not increase defect density. (Weighted by code size, the figures in Table 2 work out to roughly 0.6 defects per KLOC overall.) Our experience shows several key actions are necessary for success with global development. We list them in order of importance.

Liaisons. We cannot overemphasize the role of liaisons as a critical success factor for global development. The liaisons provided the key link between the architecture team and the development teams, as well as providing the US management team with a face to put with the non-US center. The relationships formed between the liaisons and the local teams further cemented the cohesion among the teams. The liaisons were not passive. They actively contributed to the system definition phase of the project. They worked on the system-level requirements and interface specification documents, which provided them with hands-on experience in the behavior of the system. We had fewer problems because they were involved in the early phases of the system. They understood our system and knew the right person to talk to when there was an issue.
Distribute entire things for the entire life cycle. In our project, we sent complete subsystems to the development centers, which successfully reduced our communication issues. Each team could run in loose synchrony with the other teams, meeting them at the cluster integration points. Because each team had full life-cycle responsibility, they were available to support their parts of the system through integration.

Plan to accommodate time and distance. At first, we did not fully understand the impact of using global development resources on normal development activities. Because there were only a few overlapping hours each day when engineers from all sites were awake, we needed to use inconvenient timeslots to coordinate and communicate. Our advice is to "share the pain," with some teams working late and other teams working early, when necessary. We took conference calls at home or shifted our working time to accommodate other development teams. We also became aware that the time it takes to move people and things from one country to another is more than the flight time. Obtaining visas and dealing with customs are largely beyond anyone's control. We learned the rules for each country and added extra time to deal with inconsistencies.
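To make the scheduling problem concrete, here is a small sketch that computes the shared 9-to-5 window across sites (Python). The UTC offsets are standard-time values for three of the project's locations; the zero-hour result is exactly why "sharing the pain" was necessary.

    # Daily overlap of 9-to-5 office hours across sites, computed in UTC.
    # Offsets are standard-time values; day wraparound is ignored for
    # simplicity in this sketch.
    SITES = {
        "Arlington Heights": -6.0,  # US Central
        "Bangalore": +5.5,
        "Adelaide": +9.5,
    }

    def shared_office_hours(sites, start=9.0, end=17.0):
        windows = [(start - off, end - off) for off in sites.values()]
        latest_start = max(w[0] for w in windows)
        earliest_end = min(w[1] for w in windows)
        return max(0.0, earliest_end - latest_start)

    print(shared_office_hours(SITES))  # 0.0 -- no common office hours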
After completing the 3G Trial, our company began a similar project requiring new features and compliance with different standards. Motorola assembled a globally distributed team for the new project and applied the experiences from our project again. We had liaisons present for the requirements specification phase, assigned entire subsystems to the distributed teams, used conference calls to complete specifications, and had non-US team members travel to Arlington Heights to participate in integration and test activities. This second project also resulted in the successful delivery of the product to our customer.
References
1. E. Carmel, Global Software Teams: Collaborating Across Borders and Time Zones, Prentice-Hall, Upper Saddle River, N.J., 1999.
2. D.W. Karolak, Global Software Development: Managing Virtual Teams and Environments, IEEE Computer Soc. Press, Los Alamitos, Calif., 1998.
3. J. Kostner, Virtual Leadership: Secrets from the Round Table for the Multisite Manager, Warner Books, New York, 1994.
4. J.D. Herbsleb and R.E. Grinter, "Architectures, Coordination, and Distance: Conway's Law and Beyond," IEEE Software, vol. 16, no. 5, Sept./Oct. 1999, pp. 63–70.
5. D. Coleman, "Architecture for Planning Software Product Platforms," tutorial presented at the First Software Product Line Conf., Denver, Colo., Aug. 30–Sept. 1, 2000.
6. K. Beck, eXtreme Programming Explained, Addison-Wesley, Reading, Mass., 2000.
7. S.C. McConnell, Rapid Development: Taming Wild Software Schedules, Microsoft Press, Redmond, Wash., 1996.
8. C. Jones, Applied Software Measurement, McGraw-Hill, New York, 1997.

About the Authors
Robert D. Battin is a principal staff engineer in the Global Telecom Solutions Sector of Motorola. He is currently leading a research project on IP header adaptation methods. He has a BS in computer science from North Central College. Contact him at [email protected].

Ron Crocker is a senior member of the technical staff in the Global Telecom Solutions Sector of Motorola. He was the overall 3G Trial architect and technical leader. He has a BS in computer science from the University of Illinois and an MS from the University of Michigan. Contact him at [email protected].

Joe Kreidler is a principal staff engineer in the Global Telecom Solutions Sector of Motorola. He is currently leading Third Generation CDMA trial network element development. He has a BS in electrical engineering from the University of Illinois and an MS in computer science from the Illinois Institute of Technology. Contact him at [email protected].

K. Subramanian is a program manager in the Global Software Group of Motorola. He is currently leading 3G Trial development at Motorola, India. He has a PhD in computer science from the Indian Institute of Science in Bangalore. Contact him at [email protected].
focus
global software development
Outsourcing in India Werner Kobitzsch, Tenovis GmbH & Co. KG Dieter Rombach, University of Kaiserslautern and Fraunhofer IESE Raimund L. Feldmann, University of Kaiserslautern
Companies are increasingly looking to outsource their software development to India. After examining various possible outsourcing models, this article reports on a German telecommunication company's experience setting up a satellite operation in India.

Starting in the early 1990s and motivated initially by the desire to cut personnel costs, many companies have explored multisite, multicountry software development approaches. India and Eastern Europe, in particular, have drawn attention. Approaches have ranged from subcontracting portions of software development projects to third-party companies or subsidiaries, to establishing truly virtual development teams.

Early on, these innovators found that distributed development incurred its own overhead costs, such as high communication and quality assurance costs and the sometimes intensive training needed for developers at remote sites. Consequently, most companies today distribute their development primarily to access human resources and competencies not available at home and only secondarily to cut labor costs. To illustrate these changes, this article reports on the experiences of Tenovis GmbH & Co. KG, a German company in the private (in-house) telecommunication domain (see the "Tenovis in Germany" sidebar), and its software development partner in Bangalore, India.

Tenovis in Germany
Tenovis GmbH & Co. KG (www.tenovis.com) was founded in April 2000 as a spin-off from Bosch Telecom's Private Network branch. Tenovis's business covers more than 200,000 customers throughout Europe, from startups and mid-size companies to such giants as Bertelsmann, HypoVereinsbank, BMW, and Volkswagen. Inside Europe, Tenovis accounts for an 8,000-person workforce at 82 sites. Tenovis offers communication solutions and has core competencies in designing, planning, and implementing communication solutions based on data–voice integration. Tenovis also offers software application packages for special services as well as outsourcing solutions.

How distributed development works
Multisite and multicountry software development can take several forms according to several different models, each promising different benefits and implying different challenges. While other factors do exist, our experience shows that the legal relation among the participating companies and the setup of the teams are the most important distinguishing features. We identify four major cooperation models (see Figure 1).

1. Separate teams in basically independent companies—the normal contractor–subcontractor relationship. Legal, knowledge-transfer, development and project management, and quality management issues apply here. If the distribution crosses cultural boundaries, these implications might be amplified, and language, time, and infrastructure issues might join the list of challenges.
2. Separate teams in legally related companies—the specialized contractor–subcontractor relationship between mother and daughter companies. This situation has implications similar to Model 1, except that legal, knowledge-transfer, and project and quality management issues are easier to solve because the mother company owns the subcontractor.
3. One team distributed across multiple sites of legally related companies—development and project and quality management issues are paramount here. If the sites are distributed across different countries, language, time, and infrastructure also become challenges.
4. One team distributed across multiple sites of several basically independent companies—the most general mode of a globally operating company. Teams are distributed across multiple, legally independent sites. This model has all the implications of Model 3; in addition, the legal implications can become major challenges.

Why global development?
In setting up a global software development environment, a company's reasons could range from lack of qualified personnel at its home site to lower development costs or access to new markets in other countries. Once the company has decided to globalize its software development, it must answer the following questions to select the appropriate cooperation model:
■ Are there historical relations that could ease some of the challenges inherent in globalization? Historical relations might exist between two companies considering global development. Perhaps former colleagues are working in the foreign company or have built their own startup, or perhaps the two companies are already cooperating in a field other than software development. Also, there might be strong historical relations between the two countries involved. Perhaps they share a language or have other cultural ties, or perhaps there is simply a favorable political climate for investing in the country under consideration. Existing historical relations will ease any form of cooperation—especially the close-cooperation Models 3 and 4.
■ Are there cultural boundaries to consider? Country-specific traditions, beliefs, or religions, for instance, might affect a cooperative venture. Some Asian cultures, for instance, are traditionally more teamwork oriented than some Western countries, or so it is commonly thought. In such countries, establishing a cooperation could be easier according to Model 3 or 4.
■ Can we distribute the development process for an entire software system across sites? Should responsibilities be split along system requirements (different sites developing different system partitions but covering the entire life cycle for these partitions) or along development roles (one site doing the application engineering, including requirements analysis, design, and integration and acceptance testing; the other site implementing system components)? Or should they be split according to qualification? The choice may favor Model 1, 2, or 3 and 4, respectively.

Figure 1. The four cooperation models according to company relationship and team setup:

                      Participating companies
  Team setup          Basically independent    Legally related
  Separate teams      Model 1                  Model 2
  One team            Model 4                  Model 3
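Figure 1 amounts to a two-key lookup, which the following minimal sketch (Python; purely illustrative) encodes directly:

    # The four cooperation models of Figure 1, keyed by team setup and
    # the legal relation among the participating companies.
    COOPERATION_MODEL = {
        ("separate teams", "basically independent"): 1,
        ("separate teams", "legally related"): 2,
        ("one team", "legally related"): 3,
        ("one team", "basically independent"): 4,
    }

    print(COOPERATION_MODEL[("one team", "legally related")])  # Model 3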
Following these general choices, companies must undertake a more detailed analysis to identify the most suitable cooperation model. Factors to consider include legal issues, knowledge transfer, development and project management, quality management, language and time, and infrastructure.

Legal
Legal issues become most important when distributed software development follows Models 1 and 4, which involve legally different companies. Here, contracts must precisely fix liability and intellectual ownership of the developed product. Consequently, the two companies should precisely formulate requirements for the externally developed software and include the minimally required quality standards. Otherwise, rework will lead to additional project costs. In addition, requirements engineers from both companies must check the resulting contracts, as should contract lawyers and, if applicable, even international business lawyers approved by the different legal systems. Sometimes, the contracts must integrate even seemingly small differences, such as the different definitions of the official business year in Germany (1 January to 31 December) and India (1 April to 31 March), for proper payment and accounting. Such differences in legal systems sometimes also affect Model 2, where the companies are legally related but are embedded in different cultures. Settings with distributed teams (Models 3 and 4) create additional challenges regarding the individual team members. Exchanging experts between different sites, for example, or sometimes even short business trips for emergency meetings, raises questions regarding visas, working permits, or different social security systems, and thus sometimes must be planned months in advance.
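As a small illustration of the business-year difference just mentioned, the sketch below labels the same date under each country's accounting year (Python; dates only—real contract accounting is obviously richer):

    # Germany's business year is the calendar year; India's runs
    # 1 April to 31 March. The same date gets a different label.
    from datetime import date

    def business_year(d, country):
        if country == "Germany":          # 1 January to 31 December
            return str(d.year)
        if country == "India":            # 1 April to 31 March
            start = d.year if d.month >= 4 else d.year - 1
            return "%d-%d" % (start, start + 1)
        raise ValueError(country)

    d = date(2001, 2, 15)
    print(business_year(d, "Germany"), business_year(d, "India"))
    # -> 2001 2000-2001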
Knowledge transfer
The most effective transfer of knowledge relies on the exchange of people—either for performing application-specific tasks off site or for training. However, this physical exchange is costly and difficult, so the companies need to employ other means of knowledge transfer as well. They must transfer the following kinds of knowledge:

■ Application knowledge. Depending on the type of software to be developed, a company might need to transfer its knowledge regarding the application domain to the remote site. For instance, when a company has specialized for several years in developing control software for a very specific domain—such as for nuclear power plants or cardiac pacemakers—it must usually transfer the knowledge it has gained about the specific control algorithms to the partners. Such transfer usually causes no problems for distributed development organized according to Models 2 or 3 because the sites belong to the same company. However, such transfer becomes crucial for Models 1 and 4, where the independent partner company might also be developing software for competitors. Legal issues are sometimes involved when political considerations prevent the companies from freely transferring certain technologies, such as encryption techniques.
■ Quality management knowledge. Quality requirements sometimes mean that a company must transfer its entire quality management to its partner. Otherwise, for instance, it might not be possible to guarantee or certify development according to certain CMM levels or the ISO 9000 standard. The company thus must adjust knowledge about processes, techniques, or methods to a common level, for instance, through in-house training or temporarily exchanging experts. More often, this kind of knowledge transfer applies to distributed development according to Models 2, 3, and 4. Under Model 1 development, a company could choose a partner who offers and guarantees the required quality standards.
■ Development (standards) knowledge. This is usually a minor issue because of existing, and widely documented, international (quasi) standards. However, if specialized company standards apply, this issue might cause problems for Models 1 and 4, but it is not critical for development according to Models 2 or 3.
■ Company culture knowledge. As we'll discuss later, this last dimension often addresses the challenge of overcoming cultural boundaries. Employee benefits (health coverage and holidays, for instance) also apply here. Transferring company culture is most important within Models 2, 3, and 4.
Development and project management
Companies interested in distributed development processes must address several technical and managerial issues. One is the coordination challenge.1,2 In distributed teams, coordinating and sharing become more difficult, whether it is the latest version of design documents (data availability), necessary interface changes between related modules (change control and configuration management), or questions to the team's expert on a certain topic (knowledge transfer). Other issues include collaborative editing and process modeling and enactment.3–5

Several tools and development environments are available to support these development and project management activities. Among them are BSCW, MetaEDIT+, ClearCase MultiSite, and more general-purpose technologies such as Corba. In addition, several repository systems offer distributed and transparent access to project and experience data. However, these tools usually address only a small set of the stated issues and often lack the functionality required to fully support geographically dispersed development sites. Consequently, companies typically use a diversity of tools. Sometimes, because suitable solutions are missing or have not yet been accepted in industry, they even use tools that were not designed to support cooperation across organizational boundaries.6 While coordination challenges affect distributed software development according to all four models, the other issues we address here apply primarily to the team-based Models 3 and 4.

Quality management
While development according to Model 1 lets a company freely choose a potential partner with the needed reputation (perhaps based on ISO 9000 or CMM quality standards), that's not always the case with the other models. In these cases, the cooperating companies must first develop a common understanding of quality issues. Then, they can develop a suitable quality model that considers cultural particularities, such as traditional team orientation or resistance to foreign control. As part of these activities, intensive training and exchange of existing knowledge might be necessary. Once a common quality model is established, the next challenges arise in obtaining and interpreting the needed (measurement) data—as well as transferring the results back to the developers at the different sites. Transparent tracking of project data spread over different locations worldwide (according to Models 3 and 4) requires an advanced communication infrastructure. The same holds for the sharing of common experiences. In addition, only a few tools (such as EPG7 and WebME8) offer the needed capabilities to support quality management issues for collaborative software development across physically separated locations. The same challenges that accompany development and project management apply—even more so because of the sensitive issues of quality data handling and lack of adequate tool support.

Language and time
Developing software at sites in another country usually means that developers at these locations speak different mother tongues. With Models 1 and 2, this language difference affects only the exchange of contracts, requirements, and products and joint meetings such as reviews. With Models 3 and 4, however, daily communication and technical documentation are involved as well. Usually, English is the language chosen for global communication purposes. However, all models might be affected if the software to be developed requires a language-specific user interface (for example, if all dialogs must be in German or must use Asian language symbols). Then, only a location in a certain country or distribution according to Model 1 offers suitable solutions for distributed software development.

Differing time zones might present another challenge. At first glance, this might seem inconsequential, especially with one- or two-hour time differences. Major US companies, for instance, surmount this challenge with distributed sites within their own country every day. However, because the overlap of normal office hours becomes smaller, the issue takes on a different quality as the time difference grows. Sites become temporally dispersed.2 Arranging a video conference between a site in New York and one in Europe during normal 9-to-5 office hours, for example, with a typical time difference of six hours, offers only a two-hour time slot daily (between 3 and 5 p.m. European time and 9 and 11 a.m. in New York). Even with smaller time-zone differences, such as the 4.5-hour offset between Germany and India, teams that work according to Models 3 or 4 are sometimes affected by temporal dispersion. The two countries usually have different off-days and religious or national holidays. India celebrates its Independence Day on 15 August, for instance, while the US celebrates on 4 July. Other Indian holidays—Republic Day (26 January), Gandhi Jayanti (2 October), or Holi and Diwali (flexible according to the Gregorian calendar, as they are based on the Hindu calendar and the lunar cycle)—do not even have German or American counterparts. Consequently, when developing in another country, a company must consider possible temporal dispersion when setting delivery deadlines or meeting appointments. Ignoring such cultural specialties could produce resentment and damage morale.

Time-zone differences also offer opportunities for a virtually 24-hour development process. When night falls in Asia, for instance, the results of today's work might transfer to a site in Europe where the workday has just begun. As the European workday ends, the documents go to America for further processing, before they return to Asia, arriving just in time for the new workday. This kind of distributed development works best in Models 3 and 4. In any case, implementing such distributed software development processes requires advanced infrastructure support.

Infrastructure
An advanced communication infrastructure is a key component for distributed software development, because team communication becomes much more difficult when the participants are geographically dispersed (according to Models 3 or 4):9,10
■ The team members must communicate with each other to discuss problems and possible solutions: "Email communication addresses this to a certain degree, but email does not allow developers to carry on the rich conversations possible when they are physically collocated."2 Other forms of communication are needed, such as teleconferences, chat rooms, videophones, or virtual conference rooms. In addition, time-zone differences still come into play.
■ Also, the shared project database—the developed product and its documentation, process and measurement data, reuse repositories, and so forth—must be available to all team members. Developers cannot simply attach this usually huge amount of data to an email message or transfer it over standard telephone lines. Capabilities for large data transfer must be available (a back-of-the-envelope sketch follows this list).
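The arithmetic shows why: even over a dedicated leased line like the 256-Kbps link described later in this article, a modest snapshot of shared project data takes the better part of an hour to move. A small sketch (Python; the snapshot size is invented for illustration):

    # Transfer time for a project-data snapshot over a leased line.
    size_mbytes = 100         # illustrative snapshot of shared data
    link_kbps = 256           # leased-line bandwidth (kilobits/second)
    seconds = size_mbytes * 8000.0 / link_kbps
    print("%.1f minutes" % (seconds / 60))  # ~52 minutes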
Security is often a concern here. Are the chosen communication links safe? Or must common trust models such as PGP or X.509 be used to secure communication?11,12 Even when security is not a critical issue, the regular communication systems sometimes simply cannot cope with the requested high standards, especially in countries with less developed infrastructures. This might force a company to install its own private communication links, perhaps through expensive satellite links. Besides these communication issues, other infrastructure can also cause problems. An unpredictable loss of electricity, for instance, might cause lost data, destroyed hardware, or other problems with technical equipment. Insufficient office space, missing security services, incompatible technical standards, or restricted water supply, to name a few, can cause problems as well. Obviously, not all these issues are the contractor company's responsibility when organizing development according to Model 1; the subcontracting company must guarantee the infrastructure. Nevertheless, even under this model, lack of appropriate infrastructure can affect the overall cooperation.

Table 1 shows how the four cooperation models relate to the criteria we have just discussed. This taxonomy lets us classify collaborations, helping us identify the relevance of lessons learned. A collaboration according to our Model 3, for instance, might use lessons learned in a Model 2 cooperation that arose mainly because the participating companies were legally related. A Model 1 collaboration might use some Model 2 lessons learned that arose because separate teams were chosen for the team setup.

Table 1
Issues Regarding Distributed Software Development according to the Four Cooperation Models

Issue                                                          Model 1   Model 2   Model 3   Model 4
Legal:
  Product issues (liability, intellectual ownership, IPs)      H         M         L         H
  Personal issues (visas, working permits, and so forth)       L         L         [M]       [M]
  Contracting in general                                       H         L         L         H
Knowledge transfer:
  Application knowledge                                        H         [L]       [L]       H
  Quality management knowledge                                 L         M         H         H
  Development standards                                        M         L         L         M
  Company culture                                              ?         [?]       [?]       ?
Development and project management:
  Coordination challenges                                      M         M         H         H
  Tool support (missing)                                       L         M         H         H
Quality management:
  Definition of a common quality model                         L         M         H         H
  Realization / applying the model in daily practice           M         M         H         H
Language and time:
  Communication / documentation language                       [M]       [M]       [H]       [H]
  Time-zone differences                                        L         L         [H]       [H]
  Different off-days / holidays                                L         L         H         H
Infrastructure:
  Communication issues                                         M         M         H         H
  Housing, safety, basic supply                                L         [H]       [H]       L

H: high impact. M: medium impact. L: low impact. [ ]: depends on location of site. ?: indicates no general trend.

Cooperation mode
Cooperation between Robert Bosch India (RBIN) and Tenovis (formerly the Private Network part of Bosch Telecom) began in early 1997 (see the "Indian Software Company" sidebar). A team of about six Indian software engineers located in Bangalore, India's Silicon Valley, started its development work under the supervision of a French colleague, who stayed in Bangalore for 18 months. Because Robert Bosch owned RBIN, this collaboration was initially run according to Model 2. When Tenovis became an independent company in early 2000, the cooperation changed formally to Model 1. However, due to the long previous history of collaboration according to Model 2, we avoided some of Model 1's more serious challenges.

Indian Software Company
Robert Bosch India (RBIN) in Bangalore, India, was founded in 1998 as a 100-percent subsidiary of Robert Bosch GmbH of Germany. Located in Koramangala, Bangalore, the company has provided software services for Robert Bosch units worldwide, including Europe, North America, Asia, and Australia. It is currently diversifying toward software development for non-Bosch customers. RBIN accounts for about 650 employees, with nearly 600 belonging to the software division. It operates as an ISO 9001 company and has earned CMM Level 3. RBIN's areas of specialization include software development for automotive applications, software development for telecommunications systems, business applications, and industrial automation. See www.boschindia.com/rbin for more information.

The first topic addressed was development work of PC tools running on the ISM (integral system management), a tool to administer PABXs (private automatic branch exchanges) of various sizes. Initial activities covered training of PABX functionalities, maintenance tools, software tools, and style guides for GUI design. The common understanding was to build up an independent software engineering department to take over any development work related to PABXs as soon as this startup work could mature.

IT specialists from both companies installed the infrastructure—a 256-Kbps link for communication and transfer of software packages. In early 1998, further service tools, PABX features, terminal equipment, and server applications came online. This growth in the responsibility and importance of RBIN's development work made it necessary to install a lab facility. The teams in Bangalore thus could feature-test their development work before delivering it to the German teams, which handled final integration. Currently, 72 Indian colleagues are working for Tenovis in PABX software development.

Legal
A signed, mutually agreed-upon contract between RBIN and Tenovis governs all development work. This contract covers the responsibilities of both sides related to the software development work, including maintenance activities during the software product life cycle (product care). The contract includes hardware needs such as workstations, PABXs, software tools, and licenses. Tenovis must pay special charges covering installations such as servers that do not belong to standard office equipment. Because all work is undertaken on work package descriptions, payment relates to the hours estimated during the planning phase. (Along with each package description comes the estimated time in hours needed for the package. Payment is based on a fixed amount of money per hour; if the company needs more hours than estimated, it will not necessarily earn more money. This then
forms a fixed-price–basis aspect of the contracts.) Budget allocation for the following year must always take place at year's end.

Each engineer responsible for a work package acts as the point of contact to counterparts at Tenovis. To control and supervise the tasks at RBIN, monthly reports cover progress, effort, time to complete, hours spent, and problems and potential solutions. These monthly (or more frequent) reports often trigger corrective actions.
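A rough sketch of the fixed-price scheme just described (Python; the package name, hours, and rate are invented for illustration): payment tracks the estimate agreed at planning time, not the hours actually spent, which makes overruns the developing side's risk.

    # Hypothetical work-package record under the fixed-price scheme:
    # payment is estimated hours times the agreed rate, independent of
    # the hours actually spent.
    class WorkPackage:
        def __init__(self, name, estimated_hours, actual_hours, rate):
            self.name = name
            self.estimated_hours = estimated_hours
            self.actual_hours = actual_hours
            self.rate = rate  # agreed money per hour

        def payment(self):
            return self.estimated_hours * self.rate

    wp = WorkPackage("ISM service tool", 400, 460, 35.0)
    print(wp.payment())  # 14000.0 -- the 60-hour overrun is not paid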
Knowledge transfer
Knowledge transfer issues include product- and project-related concerns. (On its own, RBIN has always dealt with basic knowledge-transfer issues such as programming languages.)

■ Planning issues. The German and Indian teams exchanged project-planning issues to reach a common understanding of what is planned, which releases to use, and how to reach the delivery dates. They addressed and solved important considerations such as dependencies on each other's development results. Meetings always take place in Germany because the necessary marketing and project-planning resources are there.
■ Programming issues. Developing switching software for PABXs is a very challenging task. To properly design software modules, programmers require detailed knowledge. They must be trained in such areas as interfaces, database design, and tooling, both for the software development as well as for version control. The knowledge transfer took several routes: German instructors coached several teams; training courses explained software tooling, software build processes, and infrastructure usage; on-the-job training took place in Germany, lasting up to nine months; and Indian colleagues conducted training courses in India.
■ Project management issues. Running offsite projects successfully requires that a tight project management program be put in place at the earliest possible moment. Its modules—such as reporting, risk management, time schedules, problem reporting, error correction, and version control—need training and continual supervision to detect failures that might cause schedule slippages or project and work package failures. After almost three years of cooperation with RBIN, Tenovis can now give full project responsibility for tooling to our Indian partners, indicating significant confidence in our colleagues' quality of work.

Development and project management
As we've discussed, all work assigned to RBIN comes with a work package description and an initial estimation of effort. To run the development properly, the overall project leader at Tenovis must know who is responsible for each work package and its target delivery date. Both the hardware and software development environments play an important role and must be settled before development begins. During the cooperative development project, we generated numerous software packages, making version control an issue. A multisite configuration control tool running at Frankfurt and Bangalore addressed this issue. The backbone of our common development is the 256-Kbps communication link that guarantees a smooth transfer of data and voice. It is operated by a non-Tenovis-owned carrier specializing in intercontinental communication.

Quality management
This multisite software development cooperation follows the ISO 9000 quality standard at RBIN. We conducted extensive training on quality issues—tools, procedures, specifications, know-how, reviews, and measurements—at both sites to reach acceptable quality assurance levels. Ongoing status review meetings, videoconferences involving teams from both ends, quality reviews, and sessions on lessons learned served as tools to continuously improve the quality of the deliverables.

Language and time
As a global telecommunications manufacturer, we made English the language of our development teams. Communication and documentation are kept in English, easing the exchange of information and shortening the time needed for responding to the originating team or engineer. Engineering time thus can concentrate on solving technical issues and keeping delivery schedules. However, it took quite some time initially for each side to understand the English terms of the other side. The time difference between India and Germany created no major problems because collaboration took place between two legally separate teams with rather well-defined interfaces.

Infrastructure
India's infrastructure is very volatile. To guarantee stable operation, the RBIN plant had to be technically self-supportive, which we achieved through
■ captive 1,000-KVA power generators to meet 100 percent of RBIN's power requirement, along with 100-percent backup redundancy; and
■ independent communication, as RBIN links to its customers through the Robert Bosch Corporate Network (RBN). The RBN consists of one 256-Kbps and one 384-Kbps link to Germany, which carry communications of data, voice, fax, and video conferencing.
Lessons learned
We gained the lessons learned described here mostly while the cooperation was organized according to Model 2 (before Tenovis became independent in early 2000, at which point the cooperation changed to Model 1). While these lessons are specific to the Tenovis–RBIN relationship, they might well be useful to many organizations, especially if they operate according to Model 2. Actually, many large companies (Lucent, Siemens, and Nokia, for example) have set up development sites in India, legally related to their own company (Model 2). If they operate according to one of the other models, the classification scheme for cooperation modes lets them identify clearly the differences between the Tenovis–RBIN situation and their own, and thereby judge certain lessons as more relevant to their own situation than others. For example, a company operating according to Model 3 might use those lessons learned from a Model 2 cooperation that relate mainly to the legal relationship between the participating companies. Conversely, a collaboration according to Model 1 might use some Model 2 lessons learned that related to the fact that separate teams were chosen for the team setup.

The initial motivation to form the RBIN–Tenovis cooperation was to create a team of software developers of equal competence but at much lower engineering cost. Even after spending much time training and familiarizing the Indian colleagues and coping with a very high attrition rate, we still see a high financial payback. You can reduce attrition by using PC-based applications and advanced technologies such as Java and object-oriented design. However, if you are willing to invest the high initial startup cost for planning, supervision, training, and personal communication, the cooperation will turn out to be long lasting. Our decision to assign a liaison engineer at RBIN in India proved to be a valuable investment because it provided for quicker problem solving. Both teams wanted a fruitful working relationship, which we achieved by forming stable personal relationships. However, as experience showed, this is sometimes hard to achieve because personal career planning can interfere (such as when people are not willing to stay with a company for a longer period of time).

Legal problems, language, and time issues never arose during this cooperative arrangement. English as the common language was an easy choice in this case. Also, according to Models 1 and 2, we had very clear contractual procedures, and the time difference was not an issue. Our experience showed that knowledge transfer must be a continuous activity. To inform, train, and motivate our teams in India, we initially train key Indian personnel in Germany and conduct continuous training programs onsite in India. We found that project schedules, timely deliverables, and problem solving should receive intensive attention. Videoconferences, frequent onsite visits and inspections, and teleconferencing are musts. Reporting, risk management, and risk assessment must receive attention. Our experience shows that tight management control and personal relationships are keys to project success.

Common quality management standards and procedures are essential. Most difficulties arose in the necessary transparency of development practices through capturing the needed measurement data, such as errors that occurred. Different sites view the sensitivity of data differently. Only the trust built up over several years let us establish a smoothly operating process-tracking and quality-assessment system based on data shared across development sites. Finally, companies looking to outsource development should not underestimate infrastructure issues. The different standards concerning availability of utility services and communication in different countries could upset project schedules, for example. Establishing some independence from such local problems would seem essential, based on our experience. The investments in independent power and communication facilities are more than worthwhile.

Currently, most German and European companies prefer to create their own subsidiaries. Such arrangements limit the legal challenges they might expect to encounter. Because Germany's personnel shortage will not be solved anytime soon, distributed development of this kind will remain the standard. The different forms of all the cooperation models we described will be used to establish long-lasting relationships. But this cooperation will not be limited to direct software development services. Companies that are not primarily devoted to software development will increasingly turn to the Indian subcontinent to outsource regular data-processing services, such as entering, editing, or compiling data (timetables, phone books, or account information, for example) for company-specific systems. A number of large companies already use these services. For such companies, the cooperation models we've described for distributed software development will be useful in establishing their cooperative ventures with companies in India.

Over time, this development will conflict with Indian interests. If it limits itself to performing simple data-processing tasks, India itself will be vulnerable to even less expensive competition in the future. Performing outsourced software development for foreign parent companies bears the same danger. Once the emergence of more qualified personnel in other countries or the development of more efficient development methods solves the labor shortage problem, India could become obsolete as a major global software provider. Therefore, India's policy must be—and actually is—to move into product development and the development of entire systems, combining software and application competence. Such development would ensure the long-term importance of the Indian subcontinent as a major global software provider, but it will put India in fiercer competition with traditional software countries in the US and Europe.

References
1. J. Grundy, "Distributed Component Engineering Using a Decentralised, Internet-Based Environment," Proc. Third Int'l Conf. Software Eng., Workshop Software Eng. over the Internet, 2000; http://sern.cpsc.ucalgary.ca/~maurer/ICSE2000ws/submissions/Grundy.pdf (current Mar. 2001).
2. S.E. Dossick and G.E. Kaiser, "Distributed Software Development with CHIME," Proc. Third Int'l Conf. Software Eng., Workshop Software Eng. over the Internet, 1999; http://sern.cpsc.ucalgary.ca/~maurer/ICSE99WS/Submissions/Dossik/Dossik.html (current Mar. 2001).
3. J.C. Grundy, "Interaction Issues for User-Configurable Collaborative Editing Systems," Proc. Asian Pacific Computer & Human Interaction Conf. (APCHI), IEEE CS Press, Los Alamitos, Calif., 1998, pp. 145–150.
4. R. Conradi et al., "EPOS: Object-Oriented Cooperative Process Modeling," Software Process Modeling & Technology, A. Finkelstein, J. Kramer, and B. Nuseibeh, eds., Research Studies Press, Hertfordshire, UK, 1994, pp. 33–70.
5. F. Bendeck et al., "Coordinating Management Activities in Distributed Software Development Projects," Proc. IEEE Workshop Enabling Technologies: Infrastructures for Collaborative Enterprises (WETICE '98), IEEE CS Press, Los Alamitos, Calif., 1998, pp. 33–38.
6. Z. Haag, R. Foley, and J. Newman, "Software Process Improvement in Geographically Distributed Software Engineering: An Initial Evaluation," Proc. 23rd Euromicro Conf., IEEE Press, Piscataway, N.J., 1997, pp. 134–141.
7. M.I. Kellner et al., "Process Guides: Effective Guidance for Process Participants," Proc. Fifth Int'l Conf. Software Process (ICSP), IEEE CS Press, Los Alamitos, Calif., 1998.
8. R. Tesoriero and M. Zelkowitz, "A Web-Based Tool for Data Analysis and Presentation," IEEE Internet Computing, vol. 2, no. 5, Sept./Oct. 1998, pp. 63–69.
9. J. Suzuki and Y. Yamamoto, "Leveraging Distributed Software Development," Computer, vol. 32, no. 9, Sept. 1999, pp. 59–65.
10. S.F. Li and A. Hopper, "A Framework to Integrate Synchronous and Asynchronous Collaboration," Proc. IEEE Workshop Enabling Technologies: Infrastructures for Collaborative Enterprises (WETICE '98), IEEE CS Press, Los Alamitos, Calif., 1998, pp. 96–101.
11. P. Zimmermann, The Official PGP User's Guide, MIT Press, Cambridge, Mass., 1995.
12. ISO/IEC Standard 9594-8:1993, 1993; www.iso.ch (current 1 Mar. 2001).

About the Authors
Werner Kobitzsch is vice president for applications at Tenovis, in charge of PABX service tools, accounting systems, and vertical applications such as hospitals and hotels. He has 30 years of experience in the communications industry, working for Alcatel SEL, Raynet, DeTeWe-Berlin, Bosch Telecom, and Tenovis. He received an MS in transmission and high-frequency technology from Stuttgart University. Contact him at Tenovis GmbH, Leitung Entwicklung Applikationen, Kleyerstrasse 94, D-60326 Frankfurt, Germany; [email protected].

Dieter Rombach is a full professor in the Department of Computer Science at the University of Kaiserslautern. He is also executive director of the Fraunhofer Institute for Experimental Software Engineering. His research interests include software methodologies, modeling and measurement of software processes and resulting products, software reuse, quality management, and technology transfer. He received a BS in mathematics and MS degrees in mathematics and computer science from the University of Karlsruhe, and a PhD in computer science from the University of Kaiserslautern. He is an associate editor of Empirical Software Engineering, an editor of Computer, a member of the German Computer Society (GI) and the ACM, and a senior member of the IEEE. Contact him at the Fraunhofer Institut (IESE), Sauerwiesen 6, D-67661 Kaiserslautern, Germany; [email protected]; www.iese.fhg.de/Staff/rombach.

Raimund L. Feldmann is a researcher in the Software Engineering Research Group at the University of Kaiserslautern. His research interests include experience and reuse repositories and software process improvement through software measurement and comprehensive reuse. He received a BS and an MS in computer science from the University of Kaiserslautern. He is a member of the IEEE Computer Society. Contact him at the Univ. of Kaiserslautern, AG Software Eng. (AGSE), Postfach 3049, D-67653 Kaiserslautern, Germany; [email protected]; wwwagse.informatik.uni-kl.de/Personalia/rf.html.
feature open source
Open Source Software Adoption: A Status Report Huaiqing Wang, City University of Hong Kong Chen Wang, StockSmart
Open source software has emerged from the hacker community, but because of many misgivings and myths regarding its maturity, making informed adoption decisions is hard. Systematically applying requirements-oriented criteria to open source software offers a practical roadmap for navigating this new landscape.
Using the right software is increasingly critical to project success,1,2 but the choices keep getting wider and more confusing. Open source software has entered the mix, leaving the traditional confines of the hacker community and entering large-scale, well-publicized applications.3 However, although some argue that it is ready for wide-scale commercial adaptation and deployment,4 the myriad OSS packages make actual adoption a real challenge. This article presents a straightforward and practical roadmap to navigate your OSS adoption considerations.

We do not have a universally accepted definition of OSS. For instance, Netscape, Sun Microsystems, and Apple recently introduced what they call "community-source" versions of their popular software—the Mozilla project, Solaris, and MacOS X, respectively.5,6 Such efforts, while validating the OSS concept, also make their inclusion into the OSS community a potential topic for contention. Here, we will use the loose definition of OSS that includes publicly available source code and community-source software.

Requirements-oriented considerations
Commercial IT development today is vastly different from that of 10 years ago; all-encompassing, proprietary in-house software development has effectively disappeared. Many efforts now focus on integrating off-the-shelf software packages to achieve particular software implementation goals. You must consider many requirements when choosing a suitable software package, regardless of whether the candidate is open source or commercial.5–7 Most of these criteria are common and have been extensively studied. We will not cover the following important adoption criteria because they do not distinguish between OSS and commercial-software candidates: functional capability, efficiency, speed of execution, and organizational standards and preferences. Specifically, our criteria apply a product-oriented evaluation framework in which we can compare and analyze distinctive features of OSS candidates.5,7 We will emphasize the technical and managerial
requirements in which the nature of OSS is particularly relevant. These two aspects correspond to the two classes of stakeholders in commercial IT efforts. However, our aim is to outline the various requirement considerations rather than present their relative importance—you must prioritize the criteria with respect to your particular project. For instance, in the case of a legacy-system data conversion project, future upgradability might be moot but high reliability would be critical.

Technical requirements
A potential OSS would have to be evaluated according to several technical requirements involving architectural, development, and operational issues.

Availability of technical support
To adopt an OSS candidate in a commercial IT effort, you must have commercial-grade technical support available (at reasonable cost). This includes training, documentation, real-time support, bug fixes, and professional consulting as needed. To enable the development team to get off to a quick and smooth start, having a binary distribution of the OSS widely available is preferable so that the initial familiarization process can occur seamlessly.

Future functional upgradability
If the target application is to be operational, maintained, and extendable, the new software must be upgradable to provide additional capabilities. As a result, the current and future status of your OSS's development becomes a significant factor, because continuous development and bug fixing enable future upgrade capabilities. In addition, backward compatibility is important so that future versions of the OSS require minimal recoding and reintegration with existing system functionality.

Open-standard compatibility
For a large and complex IT project, all the components must adhere to a particular open standard or protocol. It is insufficient that the OSS adhere solely to the various open standards at any point in time. It must also have continuous development momentum to adhere to future revisions of the standards as they evolve.
Customizability and extensibility
For an OSS candidate to be adopted, it must be flexible enough to be customized or integrated in widely different technical environments. The package might also have to be extended to include extra, potentially proprietary functionality. While OSS is generally considered highly customizable and extensible—as the source code is publicly available—you must take into account the complexity of the effort to make such modifications at the source level. Also, you must consider the OSS package's dependency on operating systems, development tools, and other software packages that might significantly affect OSS extensibility. Whether or not you can integrate the OSS with commercial software is also an important factor, because in most projects every component must integrate with other software packages.

High reliability
For an OSS candidate to be considered operationally robust and highly reliable, it must have been operational in a large number of applications and its performance evaluated and reviewed. For critical systems, you would be prudent to adopt software that has been widely used commercially instead of one that has yet to gain sufficient operational data and use analysis.
Management requirements
From a project management standpoint, a potential open source or commercial candidate would have to meet various resource allocation, licensing, and maintenance requirements to be adopted.

Budgetary
For the most part, OSS is considered free in the sense that generally no or minimal costs (for example, shipping and handling) are involved. However, there are indirect costs, including development, technical support, and maintenance efforts. For most IT projects, indirect costs can grow larger than the original package purchase cost.

Development team expertise
It is critical to consider the development team's existing expertise with Unix, Perl, or other OSS technologies. Lack of familiarity here would require extensive team retraining and the adoption of not only new software but a new development philosophy as a whole—resulting in significant cost and resource consumption.

Licensing and project scope
Adopting OSS is not free from the terms set forth by software licenses. OSS products have several different types of license, each of which imposes a different set of restrictions that could potentially impede critical project capabilities such as internal reuse, proprietary custom extensions, and resale. Table 1 lists the following common types of OS license: GPL (GNU Public License), perhaps the most common one; LGPL (Library GPL), a modified version of GPL applying specifically to software libraries; BSD (Berkeley Software Development), applying mostly to derivatives and variants of BSD Unix; and CPL (Community Public License), a type of license typically found in community versions of commercial software. The licensing terms of your chosen software will affect your current and future project scope, such as internal use versus resale.

Table 1
Open Source Software Licenses and Their Effects

License                          Can be mixed with   Proprietary modifications   Can be       Allows proprietary
                                 nonfree software    can be made private         relicensed   licensing
GNU Public License               Y                   N                           N            N
Library GPL                      Y                   Y                           N            N
Berkeley Software Development    Y                   Y                           N            N
Community Public License         Y                   N                           N            Y
Commercial                       Y                   Y                           N            N

Long-term maintainability
Almost all operational IT projects must be maintained over time, so it is important to consider the complexity of maintaining the software you adopt. OSS characteristics such as development status, standard adherence, and the availability of support all affect the long-term manageability of your project.

Analyzing OSS characteristics
The following list describes 10 OSS characteristics and the possible values we can assign. By assigning a value to each of these characteristics for a particular OSS, we can specify that software's capability to meet its requirements. We did this with a representative collection of OSS that is either widely used or widely noted in technical periodicals, including the community-source versions of Sun's Solaris and Apple's Mac OS X. Table 2 presents the resulting chart.
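One way to act on such a chart is a weighted score that combines the assigned characteristic values with project-specific priorities, as discussed earlier. A minimal sketch (Python); the weights and the candidate's ratings are invented for illustration, not taken from Table 2:

    # Weighted scoring of an OSS candidate. SYMBOL maps the chart's
    # ratings onto numbers; WEIGHTS reflect one project's priorities
    # (here, a reliability-critical system).
    SYMBOL = {"--": -2, "-": -1, "+": 1, "++": 2}

    WEIGHTS = {
        "technical support": 3,
        "backward compatibility": 1,
        "reliability": 5,
    }

    def score(candidate):
        return sum(WEIGHTS[k] * SYMBOL[v] for k, v in candidate.items())

    candidate = {"technical support": "+",
                 "backward compatibility": "++",
                 "reliability": "++"}
    print(score(candidate))  # 3*1 + 1*2 + 5*2 = 15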
1. Technical support: the amount of available support for the OSS.
■ – Support is limited to direct, ad hoc individual developer support.
■ + Support is based on community-oriented group support.
■ ++ Support is tied to one or more commercial entities providing comprehensive support for the OSS (for example, Red Hat provides complete support for Linux, and Cygnus supports all GNU packages; interestingly, Red Hat acquired Cygnus in November 1999).
■ –– The OSS is no longer being developed or supported.

2. Backward compatibility: the effort required by an existing system to maintain compatibility with the OSS.
■ – The OSS is either in its first stable release or its functionality has been modified such that systems using a previous version would require significant effort to upgrade to the current one.
■ + A moderate effort is required to upgrade to the current version.
■ ++ Virtually no effort is required to upgrade to the current version.

3. Standard compatibility: the open standard that the OSS adheres to and that multiple vendors have agreed on.
■ OSF (Open Software Foundation).
■ DNS (Domain Name System).
■ ANSI (American National Standards Institute).
■ LDAP (Lightweight Directory Access Protocol).
■ SSL (Secure Sockets Layer).
■ SMTP (Simple Mail Transfer Protocol).
■ X11 (X-Windows protocol).
■ HTTP (Hypertext Transfer Protocol).
■ HTML (Hypertext Markup Language).
■ SQL (Structured Query Language).
■ MIME (Multipurpose Internet Mail Extensions).
■ N/A: does not follow any open standard.
4. Binary availability: whether official or unofficial binary distributions are available. Even when an official distribution is widely available, there might be extensive unofficial binary packages that do not receive the same level of support and release upgrades as the official source-level and binary packages.
■ Yes.
■ No.
5. Integration with commercial software: the extent to which the OSS has been integrated with commercial software.
■ – Virtually no widely used commercial software can be integrated with the OSS.
■ + A moderate number of commercial packages can be integrated with the OSS, but no commercial installation history exists.
■ ++ Many commercial software integration possibilities are available and have been deployed in commercial environments.

6. Commercial adoption: the extent to which the OSS has been commercially adopted.
■ – Virtually no commercial entity has adopted the OSS.
■ + A few commercial entities have selected and installed the OSS.
■ ++ The OSS has a large installed user base.

7. OS dependency: the specific operating systems on which the OSS depends; if available for virtually all major ones, it is designated an open platform. Although no OSS operating system is compatible with applications designed for commercial operating systems, almost all the OSS environments, libraries, and applications have been ported to commercial operating systems, except for the packages still under development (KDE, Gnome, Gimp) and Unix-specific applications (Bind, Pine).
■ Unix.
■ Linux.
■ BSD.
■ Open platform: available for virtually all major operating systems, including the various flavors of Unix (Linux, BSD, Solaris, and others), Windows, and Mac OS.

8. Software license: the OSS's licensing format. The differences between the following types of licenses lie in the types of modifications and integrations an implementing party is permitted to perform on the OSS (see Table 1).
■ GPL (GNU Public License): applies to all OSS applications developed by the GNU organization.
■ LGPL (Library GPL): covers the various libraries developed by the GNU organization.
■ BSD: includes all derivatives of the BSD license, such as the X-Windows license "X."
■ CPL: includes various community-source projects.

9. Current development status.
■ Development release: The OSS is still being actively developed and features added.
■ Stable: A stable, widely installed version of the OSS exists, with ongoing development efforts underway.
■ Discontinued: OSS development efforts have effectively stopped.

10. Commercial substitutes: whether commercial substitutes exist for the OSS.
■ Yes.
■ No.
■ N/A: commercial vendors offer community-source versions of the software, so there is a corresponding commercial-software flavor, such as the commercial Netscape browsers and the community-source version of Mozilla.

For More Information
opensource.oreilly.com: Although the O'Reilly & Associates Web site aims to provide an overview of available books, it is also an excellent central location for general information regarding the state of OSS.
www.redhat.com: Red Hat is a great centralized source of all OSS information related to Linux (an open source version of Unix that is rapidly gaining popularity in commercial and noncommercial environments). The site is geared to all technical backgrounds.
www.sourceforge.net: SourceForge is a free service for technology-savvy users. The source code for almost all current OSS is available here, except for well-established OSS such as Apache, PHP, Linux, and the like. Be prepared to dive directly into source code and source-related documentation.
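A minimal Java sketch, with illustrative names only (this is one hypothetical way a project team might record the ratings above, not tooling from the study):

    import java.util.EnumMap;
    import java.util.Map;

    public class OssEvaluation {
        public enum Characteristic {
            TECHNICAL_SUPPORT, BACKWARD_COMPATIBILITY, STANDARD_COMPATIBILITY,
            BINARY_AVAILABILITY, COMMERCIAL_INTEGRATION, COMMERCIAL_ADOPTION,
            OS_DEPENDENCY, SOFTWARE_LICENSE, DEVELOPMENT_STATUS, COMMERCIAL_SUBSTITUTES
        }

        private final String packageName;
        private final Map<Characteristic, Integer> ratings =
            new EnumMap<Characteristic, Integer>(Characteristic.class);

        public OssEvaluation(String packageName) { this.packageName = packageName; }

        // Record a rating: -2 for "--", -1 for "-", 1 for "+", 2 for "++".
        // Named values (licenses, standards, platforms) would be kept as notes.
        public void rate(Characteristic c, int value) { ratings.put(c, value); }

        // A crude aggregate for ranking candidates; a real project would
        // weight each characteristic by its own requirements.
        public int score() {
            int total = 0;
            for (int v : ratings.values()) total += v;
            return total;
        }

        public String name() { return packageName; }
    }

For example, a team might write new OssEvaluation("Apache"), call rate() once per characteristic, and compare candidates by score(), while still reading the individual ratings for any characteristic the project cannot compromise on.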
If made a mere five years ago, Table 2 would have contained virtually no commercial adoption or commercial-grade technical support for almost any of the OSS reviewed. Over the last five years, OSS has made giant strides in improving overall stability, support, and compatibility (for more information, see the related sidebar). Nevertheless, only a minority of the representative OSS set now have commercial-grade support and commercial adoption. Continued improvement in these areas will no doubt make other OSS candidates competitive for adoption in commercial IT projects.
Open source software has become a legitimate choice for commercial adoption, as its use in Internet applications shows. As OSS continues to mature, it will play an increasing role in the software industry. What does this mean for the OSS development community? Besides jumping for joy, we hope that our work reflects some of the areas that require improvement for more rapid adoption of OSS by commercial IT entities. Announcements such as the funding of Covalent Technologies,8 a commercial venture targeted specifically at supporting commercial users of the Apache Web server, show that the OSS community is paying increasing attention to improving support, licensing, reliability, and other areas. We believe that such efforts will ensure the continuing success and innovation of OSS in the future.

References
1. C. DiBona, S. Ockman, and M. Stone, Open Sources: Voices from the Open Source Revolution, O'Reilly & Associates, Cambridge, Mass., 1999.
2. E.S. Raymond, The Cathedral & the Bazaar, O'Reilly & Associates, Cambridge, Mass., 1999.
3. "The Netcraft Web Server Survey," Sept. 2000; www.netcraft.com/survey (current 21 Feb. 2001).
4. T. O'Reilly, "Lessons from Open Source Software Development," Comm. ACM, vol. 42, no. 4, Apr. 1999, pp. 33–37.
5. A. Brown and K. Wallnau, "A Framework for Evaluating Software Technology," IEEE Software, vol. 13, no. 5, Sept. 1996, pp. 39–49.
6. A. Schamp, "CM-Tool Evaluation and Selection," IEEE Software, vol. 12, no. 4, July 1995, pp. 114–118.
7. E.A. Giakoumakis and G. Xylomenos, "Evaluation and Selection Criteria for Software Requirements Specification Standards," Software Eng. J., Sept. 1996, pp. 307–319.
8. "The Venture Capital Report," Forbes, 15 Dec. 1999; www.forbes.com/1999/12/17/mu4_print.html (current 22 Feb. 2001).
About the Authors
Huaiqing Wang is an associate professor of information systems at the City University of Hong Kong. He specializes in research and development of intelligent systems and Web-based intelligent agents and their e-business applications (such as multiagent-supported risk-monitoring systems, intelligent-agent-based knowledge management systems, modeling, and intelligent Web-based educational systems). He received his PhD in computer science from the University of Manchester. Contact him at the Dept. of Information Systems, City University of Hong Kong, Kowloon, Hong Kong; [email protected].

Chen Wang is chief technology officer for StockSmart, which provides aggregated real-time financial information. He was previously a cofounder and CTO of FirstCircle. His primary industry-related research interests include cryptography, Internet commerce, open source software, privacy, and agent-based technologies. He has a BS in computer science and has completed graduate work in information systems at the University of Toronto. Contact him at StockSmart, 116 John St., Suite 801, New York, NY 10005; [email protected].
CALL FOR ARTICLES AND REVIEWERS

IEEE Software
Software Security: Building Systems Securely from the Ground Up
Submission deadline: 31 July 2001. Publication: January/February 2002.

Fragile and insecure software continues to threaten a society increasingly reliant on complex software systems, because most security breaches are made possible by software flaws. Engineering secure and robust software systems can break the penetrate-and-patch cycle of software releases all too common today. Topics of interest for this special issue include:

• Case studies that help quantify common security risks
• Security implications of programming languages and development tools
• Techniques for balancing security with other design goals
• Extracting security requirements from software projects
• Design for security
• Aspect-oriented programming for security
• Analyzing programs for vulnerabilities
• Testing for vulnerabilities
• Secure configuration and maintenance
• Developing trusted environments for running untrusted mobile code
• Secure mobile code programming paradigms
• Analyzing unknown software for malicious logic
• Intrusion-tolerant software architectures
• Application-based intrusion detection
• Quantifying trade-offs in adding security during development

Articles must not exceed 5,400 words including figures and tables, which count for 200 words each. Submissions within the theme's scope will be peer-reviewed and edited. Be sure to include the name of the theme for which you are submitting. Please contact a guest editor for more information about the focus or to discuss a potential submission, and contact the magazine assistant at [email protected] for author guidelines and submission details.

Guest Editors: Anup K. Ghosh, [email protected]; Chuck Howell, [email protected]; and James Whittaker, [email protected].

design
Editor: Martin Fowler ■ ThoughtWorks ■ [email protected]

Separating User Interface Code
Martin Fowler
The first program I wrote on a salary was scientific calculation software in Fortran. As I was writing, I noticed that the code running the primitive menu system differed in style from the code carrying out the calculations. So I separated the routines for these tasks, which paid off when I was asked to create higher-level tasks that did several of the individual menu steps: I could just write a routine that called the calculation routines directly without involving the menus. Thus, I learned for myself a design principle that's served me well in software development: Keep your user interface code separate from everything else. It's a simple rule, embodied in more than one application framework, but it's often not followed, which causes quite a bit of trouble.

Stating the separation
Any code that does anything with a user interface should only involve user interface code. It can take input from the user and display information, but it should not manipulate the information other than to format it for display. Calculations, validations, and communications should be done by a clearly separated piece of code: separate routines, modules, or classes, depending on your language's organizing structure. For the rest of the article, I'll refer to the user interface code as presentation code and the other code as domain code.
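As a hypothetical illustration (not the original Fortran, and with made-up names), the separation might look like this in Java: the domain class knows nothing about how its results will be shown, so a menu, a WIMP screen, or a command line can all call it.

    // Domain code: pure calculation, no user interface dependencies.
    public class InterestCalculator {
        public double compoundedValue(double principal, double annualRate, int years) {
            return principal * Math.pow(1.0 + annualRate, years);
        }
    }

    // Presentation code: gathers input and formats output, but delegates
    // every calculation to the domain class.
    public class ConsoleUi {
        public static void main(String[] args) {
            InterestCalculator calc = new InterestCalculator();
            double result = calc.compoundedValue(1000.0, 0.05, 10);
            System.out.println("Value after 10 years: " + result);
        }
    }

A graphical front end would contain different presentation code but would call the same compoundedValue routine.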
IEEE SOFTWARE
March/April 2001
When separating the presentation from the domain, make sure that no part of the domain code makes any reference to the presentation code. So, if you write an application with a WIMP (windows, icons, mouse, and pointer) GUI, you should be able to write a command-line interface that does everything you can do through the WIMP interface, without copying any code from the WIMP into the command line.

Why do this?
Following this principle leads to several good results. First, the separation divides the code into different areas of complexity. Any successful presentation requires a fair bit of programming, and the complexity inherent in that presentation differs in style from the domain with which you work. Often it uses libraries that are only relevant to that presentation. A clear separation lets you concentrate on each aspect of the problem separately, and one complicated thing at a time is enough. It also lets different people work on the separate pieces, which is useful when people want to hone more specialized skills.

Making the domain independent of the presentation also lets you support multiple presentations on the same domain code, as suggested by the WIMP versus command-line example, and also by my higher-level Fortran routine. Multiple presentations
are the reality of software. Domain code is usually easy to port from platform to platform, but presentation code is more tightly coupled to the operating system or hardware. Even without porting, it's common to find that demands for changes in the presentation occur with a different rhythm than changes in the domain functionality.

Pulling away the domain code also makes it easier to spot, and avoid, duplication in domain code. Different screens often require similar validation logic, but when it's hidden among all the screen handling, it's difficult to spot. I remember a case where an application needed to change its date validation. The application had two parts that used different languages. One part had date validation copied over its date widgets and required over 100 edits. The other did its validation in a single date class and required just a single change. At the time, this was hyped as part of the massive productivity gain you could get with object-oriented software, but the former non-object system could have received the same benefit by having a single date validation routine. The separation yielded the benefit.

Presentations, particularly WIMPs and browser-based presentations, can be very difficult to test. While tools exist that capture mouse clicks, the resulting macros are very tricky to maintain. Driving tests through direct calls to routines is far easier, so separating the domain code makes it much easier to test. Testability is often ignored as a criterion for good design, but a hard-to-test application becomes very difficult to modify.
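A hypothetical sketch (the names are mine; neither original system's code appears in this column) shows why the single-class design needed only one edit, and why direct-call testing needs no mouse-click capture:

    // One shared validation routine: a rule change is a change in one place,
    // and the logic can be exercised without driving any screens.
    public class DateValidator {
        public boolean isValid(int year, int month, int day) {
            if (month < 1 || month > 12 || day < 1) return false;
            int[] daysInMonth = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
            boolean leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
            int max = (month == 2 && leap) ? 29 : daysInMonth[month - 1];
            return day <= max;
        }
    }

    // A test is a direct call; for example:
    //   assert new DateValidator().isValid(2000, 2, 29);
    //   assert !new DateValidator().isValid(2001, 2, 29);

Every date widget then delegates to this one class instead of carrying its own copy of the rules.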
The difficulties
So why don't programmers separate their code? Much of the reason lies in tools that make it hard to maintain the separation. In addition, the examples for those tools don't reveal the price of ignoring the separation. In the last decade or so, the biggest presentation tools have been the family of platforms for developing WIMP interfaces: Visual Basic, Delphi, PowerBuilder, and the like. These tools were designed for putting WIMP interfaces onto SQL databases and were very successful. The key to their success was data-aware widgets, such as a pop-up menu bound to a SQL query. Such tools are very powerful, letting you quickly build a WIMP interface that operates on a database, but the tools don't provide any place to extract the domain code. If straight updates to data and view are all you do, then this is not a problem. Even as a certified object-bigot, I always recommend these kinds of tools for these kinds of applications. However, once domain logic gets complicated, it becomes hard to see how to separate it. This problem became particularly obvious as the industry moved to Web interfaces: if the domain logic is stuck inside a WIMP interface, it's not possible to use it from a Web browser. However, Web interfaces often encourage the same problems. Now we have server page technologies that let you embed code into HTML. As a way of laying out how generated information appears on the page, this makes plenty of sense. The structure starts breaking down when the code inside the server page is more complicated. As soon as code starts making calculations, running queries, or doing validations, it runs into that same trap of
mixing presentation with domain code. To avoid this, make a separate module that contains the domain code and make only simple calls from the server page to that module (the short sketch at the end of this column illustrates the idea). For a simple set of pages there is an overhead, although I would call it a small one, but as the set gets more complicated, the value of the separation grows.

This same principle, of course, is at the heart of using XML. I built my Web site, www.martinfowler.com, by writing XML and converting it to HTML. It lets me concentrate on the structure of what I'm writing in one place, so I can think about its formatting later (not that I do any fancy formatting). Those who use content-oriented styles in word processors are doing much the same thing. I've reached the point where that kind of separation seems natural, though I've had to become a bit of an XSLT whiz, and the tools for that aren't even adolescent yet.

The general principle here is that of separating concerns, but I find such a general principle hard to explain and follow. After all, which concerns should you separate? Presentation and domain are two separable concerns I've found straightforward to explain, although that principle isn't always easy to follow. I think it's a key principle in well-engineered software. If we ever see engineering codes for software, I'd bet that separation of presentation and domain will be in there somewhere.
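Here is that server-page advice as a minimal, hypothetical Java sketch (class and method names are illustrative):

    // Domain module: queries and validations live here, callable from a
    // server page, a command-line tool, or a test with equal ease.
    public class OrderStatus {
        // Domain decision: is an order past its due date?
        public boolean isOverdue(java.util.Date dueDate, java.util.Date today) {
            return dueDate.before(today);
        }
    }

The server page then reduces to a simple call, something like <%= status.isOverdue(due, today) ? "Overdue" : "On time" %>, keeping the code embedded in the HTML down to formatting.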
Martin Fowler is the chief scientist for ThoughtWorks, an Internet systems delivery and consulting company. For a decade, he was an independent consultant pioneering the use of objects in developing business information systems. He’s worked with technologies including Smalltalk, C++, object and relational databases, and Enterprise Java with domains including leasing, payroll, derivatives trading, and health care. He is particularly known for his work in patterns, UML, lightweight methodologies, and refactoring. He has written four books: Analysis Patterns, Refactoring, Planning Extreme Programming, and UML Distilled. Contact him at
[email protected].