COMP 6761 Advanced Computer Graphics
Lecture Notes
Peter Grogono
These notes may be photocopied for students
taking COMP 6761 at Concordia University.
© Peter Grogono 2002, 2003
Department of Computer Science
Concordia University
Montreal, Quebec
Contents

1 Introduction
  1.1 Getting Started
  1.2 Callbacks
    1.2.1 Display
    1.2.2 Reshaping Events
    1.2.3 Keyboard Events
    1.2.4 Mouse Events
    1.2.5 Idle
  1.3 OpenGL Naming Conventions
    1.3.1 Type Names
    1.3.2 Function Names
  1.4 General Features of OpenGL
    1.4.1 States
    1.4.2 Coordinates
  1.5 Drawing Objects
    1.5.1 Primitive Objects
    1.5.2 GLUT Objects
    1.5.3 Quadric Objects
  1.6 Hidden Surface Elimination
  1.7 Animation
2 Transformations and Projections
  2.1 Matrices in OpenGL
  2.2 Projection Matrices
    2.2.1 Orthogonal Projections
    2.2.2 Perspective Projections
  2.3 Model View Transformations
3 Building Models and Scenes
  3.1 A Digression on Global Variables
  3.2 Matrix Stacks
    3.2.1 Pops and Pushes Don't Cancel!
    3.2.2 Animation with Stacks
  3.3 Viewing the Model
4 Lighting
  4.1 Lighting Basics
    4.1.1 Hiding Surfaces and Enabling Lights
    4.1.2 Kinds of Light
  4.2 Material Properties
  4.3 Light Properties
  4.4 Lighting Model
  4.5 Lighting in Practice
  4.6 Normal Vectors
5 Special Effects
  5.1 Blending
  5.2 Fog
  5.3 Reflection
  5.4 Display Lists
  5.5 Bézier Curves and Surfaces
    5.5.1 Curves
    5.5.2 Surfaces
  5.6 Menus
  5.7 Text
  5.8 Other Features of OpenGL
    5.8.1 Textures
    5.8.2 NURBS
    5.8.3 Antialiasing
    5.8.4 Picking
    5.8.5 Error Handling
  5.9 Program Development
6 Organization of a Graphics System
  6.1 The Graphics Pipeline
    6.1.1 Per Vertex Operations
    6.1.2 Primitive Assembly
    6.1.3 Rasterization
    6.1.4 Pixel Operations
    6.1.5 Fragment Operations
  6.2 Rasterization
    6.2.1 Drawing a Straight Line
    6.2.2 Drawing a Circle
    6.2.3 Clipping
7 Transformations — Again
  7.1 Scalar Spaces
  7.2 Vector Spaces
  7.3 Affine Spaces
  7.4 Transformations
    7.4.1 Translation
    7.4.2 Scaling
    7.4.3 Rotation
  7.5 Non-Affine Transformations
    7.5.1 Perspective Transformations
    7.5.2 Shadows
    7.5.3 Reflection
  7.6 Working with Matrices
8 Rotation
  8.1 Groups
  8.2 2D Rotation
    8.2.1 Representing 2D Rotations with Matrices
    8.2.2 Representing 2D Rotations with Complex Numbers
  8.3 3D Rotation
    8.3.1 Representing 3D Rotations with Matrices
    8.3.2 Representing 3D Rotations with Quaternions
    8.3.3 A Proof that Unit Quaternions Represent Rotations
    8.3.4 Quaternions and Matrices
    8.3.5 Quaternion Interpolation
  8.4 Quaternions in Practice
    8.4.1 Imitating a Trackball
    8.4.2 Moving the Camera
    8.4.3 Flying
9 Theory of Illumination
  9.1 Steps to Realistic Illumination
    9.1.1 Intrinsic Brightness
    9.1.2 Ambient Light
    9.1.3 Diffuse Lighting
    9.1.4 Attenuation of Light
    9.1.5 Coloured Light
    9.1.6 Specular Reflection
    9.1.7 Specular Colours
    9.1.8 Multiple Light Sources
  9.2 Polygon Shading
    9.2.1 Flat Shading
    9.2.2 Smooth Shading
10 The Theory of Light and Colour
  10.1 Physiology of the Eye
  10.2 Achromatic Light
  10.3 Coloured Light
  10.4 The CIE System
    10.4.1 Using Gamuts
  10.5 Other Colour Systems
    10.5.1 RGB
    10.5.2 CMY
    10.5.3 YIQ
  10.6 Gamma Correction
11 Advanced Techniques
  11.1 Ray-Tracing
    11.1.1 Recursive Ray Tracing
    11.1.2 Summary
  11.2 Radiosity
    11.2.1 Computing Form Factors
    11.2.2 Choosing Patches
    11.2.3 Improvements
  11.3 Bump Mapping
  11.4 Environment Mapping
  11.5 The Accumulation Buffer
References

List of Figures

1 A simple display function
2 OpenGL Types
3 A simple reshape function
4 Primitive specifiers
5 Drawing primitives
6 A coloured triangle
7 Perspective projection using gluPerspective()
8 An OpenGL program with a perspective transformation
9 Translation followed by rotation
10 Rotation followed by translation
11 Programming with global variables
12 Programming with fewer global variables
13 Programming with fewer global variables, continued
14 Drawing an arm — first version
15 Drawing an arm — improved version
16 Drawing a Maltese Cross
17 Pushing and popping
18 Zooming
19 Using gluLookAt() and gluPerspective()
20 Parameters for glMaterialfv()
21 Using glMaterial()
22 Parameters for glLightfv()
23 Parameters for glLightModel()
24 Computing normals
25 Computing average normals on a square grid
26 Fog Formulas
27 Parameters for glFog
28 Two three-point Bézier curves with their control points
29 Control parameters for Bézier curves
30 Using Bézier surfaces for the body of a plane
31 Points generated by the code of Figure 30
32 Functions for menu callback and creation
33 A C function for lines with slope less than 1
34 Drawing a circle
35 Computing points in the first octant
36 Plotting eight symmetrical points
37 Sutherland-Hodgman Polygon Clipping
38 Labelling the regions
39 Varieties of Transformation
40 Perspective
41 Quaternion multiplication
42 Mouse callback function for trackball simulation
43 Projecting the mouse position
44 Updating the trackball quaternion
45 Translating the camera
46 Unit vectors
47 Auxiliary function for translating the camera
48 Rotating the camera
49 Callback function for flying the plane
50 Illuminating an object
51 Calculating R
52 Gouraud Shading
53 Phong Shading
54 Dynamic range and perceptible steps for various devices
55 CIE Chromaticity Coordinates
56 Ray Tracing
57 Lighting in the ray-tracing model
58 Effect of glAccum(op, val)
1 Introduction
The course covers both practical and theoretical aspects of graphics programming at a fairly
advanced level. It starts with practice: specifically, writing graphics programs with OpenGL.
During the second part of the course, we will study the theory on which OpenGL and other
graphics libraries are based.
OpenGL is an industry-standard graphics library. OpenGL programs run on most platforms.
All modern graphics cards contain hardware that handles OpenGL primitive operations, which
makes OpenGL programs run fast on most platforms. OpenGL provides a high-level interface,
making it easy to learn and use. However, graphics libraries have many common features and,
having learned OpenGL, you should find it relatively easy to learn another graphics system.
OpenGL is the basic library: it includes GL (Graphics Library) and GLU (Graphics Library
Utilities). GLU does not contain primitives; all of its functions make use of GL functions.
GL does not know anything about the windows system of the computer on which it is running.
It has a frame buffer, which it fills with appropriate data, but displaying the frame buffer on
the screen is the responsibility of the user.
GLUT (Graphic Library Utility Toolkit) provides the functionality necessary to transfer the
OpenGL frame buffer to a window on the screen. GLUT programs are platform-independent:
a GLUT program can be compiled and run on a unix workstation, a PC running Windows,
or a Mac, with substantially the same results. Lectures in this course will be based on GLUT
programming.
If you don’t use GLUT, you will have to understand how to program using windows. On a
PC, this means learning either the Windows API or MFC (although interfacing OpenGL and
MFC is not particularly easy). A good source of information for the Windows API is The
OpenGL SuperBible by Richard S. Wright, Jr. and Michael Sweet (Waite Group Press, 2000).
On a unix workstation, you will need a good understanding of X Windows.
The OpenGL hardware accelerators on graphics cards require special drivers. At Concordia,
these drivers have been installed on Windows systems but not on linux systems (usually
because linux drivers have not been written for the newest graphics cards). Consequently,
OpenGL programs run much faster (often 5 or 10 times faster) under Windows than under
linux.
1.1 Getting Started
These notes follow roughly the same sequence as Getting Started with OpenGL by Peter Grogono, obtainable from the university Copy Centre. As mentioned above, we assume the use
of GLUT. The programming language is C or C++ (OpenGL functions are written in C but
can be called from a C++ program).
Any program that uses GLUT must start with the directive
#include <GL/glut.h>
This assumes that header files are stored in .../include/GL, which is where they are supposed
to be. In Concordia labs, the header files may be in .../include, in which case you should
use the directive
#include <glut.h>
In most GLUT programs, the main function consists largely of GLUT “boiler plate” code and
will look something like this:
int main (int argc, char *argv[])
{
   glutInit(&argc, argv);
   glutInitDisplayMode(GLUT_SINGLE | GLUT_RGBA);
   glutInitWindowSize(800, 600);
   glutInitWindowPosition(100, 50);
   glutCreateWindow("window title");
   glutDisplayFunc(display);
   glutMainLoop();
}
All of these functions come from GLUT, as indicated by the prefix glut-. Their effects are
as follows:
• glutInit initializes GLUT state. It is conventional to pass the command line arguments to this function. This is useful if you are using X but rather pointless for Windows. An X user can pass parameters for window size, etc., to a GLUT program.
• glutInitDisplayMode initializes the display mode by setting various bits. In this example, the bit GLUT_SINGLE requests a single buffer (the alternative is double buffers, which are needed for animation) and the bit GLUT_RGBA requests colours (red, green, blue, and "alpha", which we will discuss later). Note the use of | (not ||) to OR the bits.
• glutInitWindowSize sets the width and height of the graphics window in pixels. If this call is omitted, GLUT will use the system default values.
• glutInitWindowPosition sets the position of the window. The arguments are the position of the left edge and the top edge, in screen coordinates. Note that, in screen coordinates, 0 is the top of the screen, not the bottom. If this call is omitted, GLUT will use the system default values.
• glutCreateWindow creates the window but does not display it. The window has the size and position specified by the previous calls and a title given as an argument to this function.
• glutDisplayFunc registers a callback function to update the display. Callbacks are explained below.
• glutMainLoop enters a loop in which GLUT handles events generated by the user and the system and responds to them. Events include: key strokes, mouse movements, mouse clicks, and window reshaping operations.
As soon as the “main loop” starts, GLUT will respond to events and call the functions that
you have registered appropriately. The only callback registered above is display. In order
for this program to compile, you must have written a function of the form shown in Figure 1
which GLUT will call whenever it needs to update the display.
The function display calls OpenGL functions that we will look at in more detail later. For
now we note that OpenGL functions have the prefix gl- and:
• glClear clears various bits. This call sets all pixels in the frame buffer to the default value.
• glColor3f sets the current colour. The parameters are the values for red, green, and blue. This call asks for bright red.
void display()
{
   glClear(GL_COLOR_BUFFER_BIT);
   glColor3f(1.0, 0.0, 0.0);
   glBegin(GL_LINES);
   glVertex2f(-1.0, 0.0);
   glVertex2f(1.0, 0.0);
   glEnd();
   glFlush();
}
Figure 1: A simple display function
• glBegin starts a block in which OpenGL expects calls to functions that construct primitives. In this case, the mode GL_LINES specifies line drawing, and we provide two vertexes for each line.
• glVertex2f specifies the position of a vertex in 2D.
• glFlush forces the window to be refreshed. This call typically has no effect if you are running OpenGL on a PC. It is needed when the program is running on a server and the client screen must be refreshed.
1.2 Callbacks
For each callback function, you need to know: how to register the callback, how to declare
it, and what it does. The following sections provide this information. The callback functions
can have any name; the names used here (e.g., display) are typical.
1.2.1 Display
Registration
glutDisplayFunc(display);
Declaration
void display();
Use The display function is called by GLUT whenever it thinks the graphics window needs
updating. Since no arguments are passed, the display function often uses global variables or
calls other functions to obtain its data.
1.2.2 Reshaping Events
Registration
glutReshapeFunc(reshape);
Declaration
void reshape(int width, int height);
Use The reshape function is called whenever the user reshapes the graphics window. The
arguments give the width and height of the reshaped window in pixels.
1.2.3 Keyboard Events
GLUT provides two callback functions for keyboard events: one for "ordinary" keys (technically, ASCII graphic characters) and one for "special" keys, such as function (F) keys and arrow keys.
Registration
glutKeyboardFunc(keyboard);
Declaration
void keyboard(unsigned char key, int x, int y);
Use The keyboard function is called when the user presses a "graphic" key. These are the keys for characters that are visible on the screen: letters, digits, symbols, and space. The Esc key is also recognized (with code 27).
The values of x and y give the position of the mouse cursor at the time when the key was
pressed.
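For example, a keyboard callback might quit on Esc and toggle an application flag on 'p'. This is only a sketch: the global paused is a hypothetical variable consulted elsewhere in the program.

#include <stdlib.h>          // for exit()

bool paused = false;         // hypothetical flag used by other callbacks

void keyboard(unsigned char key, int x, int y)
{
   switch (key)
   {
   case 27:                  // Esc
      exit(0);
   case 'p':                 // toggle the pause flag
      paused = !paused;
      glutPostRedisplay();
      break;
   }
}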
Registration
glutSpecialFunc(special);
Declaration
void special(int key, int x, int y);
Use The special function is similar to the keyboard function but is called when the user presses
a non-graphic character key. The key is identified by comparing it to a GLUT constant. The
constants are:
GLUT_KEY_F1 GLUT_KEY_F8 GLUT_KEY_LEFT
GLUT_KEY_F2 GLUT_KEY_F9 GLUT_KEY_RIGHT
GLUT_KEY_F3 GLUT_KEY_F10 GLUT_KEY_UP
GLUT_KEY_F4 GLUT_KEY_F11 GLUT_KEY_DOWN
GLUT_KEY_F5 GLUT_KEY_F12 GLUT_KEY_PAGE_UP
GLUT_KEY_F6 GLUT_KEY_HOME GLUT_KEY_PAGE_DOWN
GLUT_KEY_F7 GLUT_KEY_END GLUT_KEY_INSERT
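As a sketch, a special callback might use the arrow keys to rotate the model; the global angle is a hypothetical variable used by the display function:

GLfloat angle = 0;           // hypothetical rotation used by display()

void special(int key, int x, int y)
{
   switch (key)
   {
   case GLUT_KEY_LEFT:
      angle -= 5;            // rotate 5 degrees one way
      break;
   case GLUT_KEY_RIGHT:
      angle += 5;            // and 5 degrees the other way
      break;
   }
   glutPostRedisplay();      // the model has changed: redraw
}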
1.2.4 Mouse Events
Registration
glutMouseFunc(mouse);
Declaration
void mouse(int button, int state, int x, int y);
Use This function is called when the user presses or releases a mouse button. The value
of button is one of GLUT_LEFT_BUTTON, GLUT_MIDDLE_BUTTON, or GLUT_RIGHT_BUTTON. The
value of state is GLUT_UP or GLUT_DOWN. The values of x and y give the position of the mouse cursor at the time of the event.
The values x and y are measured in pixels and are relative to the graphics window. The top left corner of the window gives x = 0 and y = 0; the bottom right corner gives x = width and y = height, where width and height are the values given during initialization or by reshaping. Note that y values increase downwards.
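For example, a mouse callback might record left-button clicks for the display function to use. The globals xClick and yClick in this sketch are hypothetical:

int xClick, yClick;          // hypothetical globals read by display()

void mouse(int button, int state, int x, int y)
{
   if (button == GLUT_LEFT_BUTTON && state == GLUT_DOWN)
   {
      xClick = x;            // remember where the user clicked
      yClick = y;
      glutPostRedisplay();
   }
}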
Registration
glutMotionFunc(motion);
glutPassiveMotionFunc(passiveMotion);
Declaration
void motion(int x, int y);
void passiveMotion(int x, int y);
Use These functions are called when the mouse is moved within the graphics window. If any
mouse button is pressed, motion is called. If no buttons are pressed, passiveMotion is called.
The values of x and y are the same as for the mouse callback. However, if you press a button
down while the mouse is in the graphics window, and then drag the mouse out of the window,
the values of x and y may go outside their respective ranges — that is, x may become negative
or greater than width, and y may become negative or greater than height.
1.2.5 Idle
Registration
glutIdleFunc(idle);
Declaration
void idle();
Use The idle function is called whenever OpenGL has nothing else to do. It is a very important
callback, because it enables animated graphics. A typical idle callback function looks like this:
void idle()
{
   // Update model values
   ....
   glutPostRedisplay();
}
The effect of glutPostRedisplay() is to inform GLUT that the graphics window needs
refreshing — that is, that GLUT should invoke the display callback function.
You could call the display function from within the idle function. Although this usually
works, it is not recommended. The reason is that GLUT handles many events and can
postpone refreshing the display until there are no outstanding events. For example, if the user
is dragging the mouse or reshaping the window during an animation, the graphics window
should not be redisplayed until the operation is completed.
This section includes all of the GLUT functions that you need to get started. Later, we will
look at GLUT functions for more advanced applications, such as menu management.
Suffix Data type C Type OpenGL Type
b 8-bit integer signed char GLbyte
s 16-bit integer short GLshort
i 32-bit integer int or long GLint, GLsizei
f 32-bit floating point float GLfloat, GLclampf
d 64-bit floating point double GLdouble, GLclampd
ub 8-bit unsigned integer unsigned char GLubyte, GLboolean
us 16-bit unsigned integer unsigned short GLushort
ui 32-bit unsigned integer unsigned int GLuint, GLenum, GLbitfield
Nothing void GLvoid
Figure 2: OpenGL Types
1.3 OpenGL Naming Conventions
1.3.1 Type Names
OpenGL uses typedefs to give its own names to C types, as shown in Figure 2 (see also Table 1
of Getting Started with OpenGL on page 7). You don't have to use these types, but using them makes your programs portable. The suffixes in the left column are used in function names, as
described in the next section.
1.3.2 Function Names
OpenGL provides an API for C, not C++. This means that function names cannot be overloaded and, consequently, naming conventions are required to distinguish similar functions with different parameter types. The structure of an OpenGL function name is as follows:
• The name begins with the prefix gl (primitive library functions) or glu (utility library functions).
• The prefix is followed by the function name. The first letter of the function name is in upper case.
• There may be a digit to indicate the number of parameters required. E.g., 3 indicates that 3 parameters are expected.
• There may be a letter or letter pair indicating the type of the parameters. The codes are given in the left column of Figure 2.
• There may be a v indicating that the arguments are pointers to arrays rather than the actual values.
For example, in the call
glVertex3f(x, y, z);
there must be three arguments of type GLfloat. The same effect could be achieved by calling
glVertex3fv(pc);
provided that pc is a pointer to an array of (at least) three floats.
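For example, assuming the following array (defined here for illustration), the two calls below have the same effect; both must appear between glBegin() and glEnd():

GLfloat pc[3] = { 0.5, -0.5, 0.0 };
glVertex3f(0.5, -0.5, 0.0);   // three separate GLfloat arguments
glVertex3fv(pc);              // one pointer to an array of three GLfloats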
The official OpenGL documentation sometimes lists all the allowed forms (the OpenGL Refer-
ence Manual does this) and sometimes uses an abbreviated form (the OpenGL Programming
Guide does this). The form
void glVertex{234}{sifd}[v](TYPE coords);
stands for 24 different function prototypes. The functions may have 2, 3, or 4 parameters of
type short, integer, float, or double, and they may be passed as individual values or as a
pointer to a suitable array.
In these notes, we will normally use particular functions that are appropriate for the appli-
cation. However, you should always be aware that a different form might be better for your
application.
1.4 General Features of OpenGL
1.4.1 States
It is best to think of OpenGL as a Finite State Machine (with a lot of states!). The effect
of many functions is to change a state variable. The state is restored or changed again by
another call, not by default.
For example, if you call glColor3f(0,0,1), then everything you draw will be coloured bright
blue until you call glColor again with different arguments.
The state-machine concept seems simple enough, but it can give a lot of trouble in program
development. It often happens that your program is doing something unexpected because
OpenGL is in the wrong state, but it is hard to find out which state variable is wrong. Problems
are even worse when you work with multiple windows, because some parts of the OpenGL state apply to individual windows and other parts apply to all windows.
A partial solution is to modify state in a systematic and structured way. For example, if a
feature is turned on somewhere, it is a good idea to turn it off in the same scope. You may
use lighting for some parts of a scene and not others. In this case, your display function should
look like this:
void display()
{
   // initialization
   ....
   // display parts of the scene that require lighting
   glEnable(GL_LIGHTING);
   ....
   // display parts of the scene that do not require lighting
   glDisable(GL_LIGHTING);
   ....
}
If the calls to glEnable and glDisable had been hidden in functions called by display, we
could not tell when lighting was in effect by looking at the display function.
1.4.2 Coordinates
Graphics programs make heavy use of coordinate systems and it is easy to get confused. There
are two important sets of coordinates that are fundamental to all applications.
Window Coordinates The window in which the graphics scene is displayed has width w and height h, measured in pixels. Coordinates are given as pairs (x, y). The top left corner of the window is (0, 0) and the bottom right corner is (w, h). X values increase from left to right and Y values increase from top to bottom.
Model Coordinates The model, or scene, that we are displaying is three-dimensional. (We
can use OpenGL for 2D graphics but most of the applications in this course assume 3D.) By
default:
• the origin is at the centre of the window, in the plane of the window
• the X axis points to the right of the window
• the Y axis points to the top of the window
• the Z axis points towards the viewer
Note that:
• The Y axis of the model is inverted with respect to the Y axis of the window. In the model, Y points upwards, in accordance with engineering and mathematical conventions.
• The model coordinates are right-handed. Since the X and Y directions are fixed by convention, the Z axis must point towards the viewer. (Hold your right hand so that your thumb, first finger, and second finger are at right-angles to one another. Point your thumb (X axis) to the right and your first finger (Y axis) upwards; then your second finger (Z axis) is pointing towards you.)
Knowing the direction of the coordinates is not enough: OpenGL displays things only if they are in the viewing volume. By default, the viewing volume contains points (x, y, z) such that −1 ≤ x ≤ 1, −1 ≤ y ≤ 1, and −1 ≤ z ≤ 1. (We will see later how to alter these values. Note that the coordinates in Figure 1 satisfy the conditions for visibility.) Objects outside the viewing volume are not visible on the screen.
OpenGL has to transform model coordinates to window coordinates. The mapping takes (x, y, z) in the model to (w(x + 1)/2, h(1 − y)/2). The origin of the model, (0, 0, 0), is mapped to the centre of the window, (w/2, h/2). Z coordinates in the model are ignored (this is not always true, as we will see later).
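As a sketch, the default mapping can be written as a C function (the function name is ours, not part of OpenGL):

// Map default model coordinates (|x| <= 1, |y| <= 1) to window
// coordinates in a window of w x h pixels.
void modelToWindow(GLfloat x, GLfloat y, int w, int h, int *wx, int *wy)
{
   *wx = (int) (w * (x + 1) / 2);   // x = -1 maps to 0, x = +1 to w
   *wy = (int) (h * (1 - y) / 2);   // y = +1 maps to 0, y = -1 to h
}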
Figure 3 shows a simple reshape callback function. It assumes that the model is contained in a box bounded by |x| ≤ 5, |y| ≤ 5, and |z| ≤ 5. It is designed so that, whatever the shape of the new window, the whole of the model is visible and the model is not distorted. This implies that the shorter side of the window must be 10 units long.
1.5 Drawing Objects
A scene in a graphics program is composed of various objects. At the lowest level, there are
primitive objects, such as vertexes, lines, and polygons. From primitive objects, we can
void reshape (int width, int height)
{
   GLfloat w, h;
   if (width >= height)
   {
      w = (5.0 * width) / height;
      h = 5.0;
   }
   else
   {
      w = 5.0;
      h = (5.0 * height) / width;
   }
   glViewport(0, 0, width, height);
   glMatrixMode(GL_PROJECTION);
   glLoadIdentity();
   glOrtho(-w, w, -h, h, -5, 5);
   glutPostRedisplay();
}
Figure 3: A simple reshape function
build common objects, such as cubes, cones, and spheres. We can also construct special
objects such as Bézier curves and surfaces.
All drawing (the technical term is “rendering”) is performed inside the display function.
1.5.1 Primitive Objects
The code for rendering OpenGL primitives has the form:
glBegin(mode );
....
glEnd();
where mode has one of the values shown in Figure 4. The explanations in Figure 4 are
sufficient for most modes but the three modes shown in Figure 5 need some care for correct
use.
Note that, for GL_QUAD_STRIP, the order in which the vertexes are given is not the same as
the order that is used to determine the front face. In the program, the vertexes appear in the
sequence v0, v1, v2, . . .. For display purposes (see Figure 5(c)), the quadrilaterals generated
are v0 v2 v3 v1, v2 v4 v5 v3, and so on.
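For example, the following sketch draws a strip of two quadrilaterals in the plane z = 0; the vertexes alternate between the bottom and top edges of the strip:

glBegin(GL_QUAD_STRIP);
glVertex2f(0, 0);   // v0
glVertex2f(0, 1);   // v1
glVertex2f(1, 0);   // v2
glVertex2f(1, 1);   // v3
glVertex2f(2, 0);   // v4
glVertex2f(2, 1);   // v5
glEnd();

The quadrilaterals displayed are v0 v2 v3 v1 and v2 v4 v5 v3.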
There are some obvious constraints on the number of vertexes in a sequence of calls. However,
OpenGL is tolerant: it simply ignores extra vertexes. For example, if you set the mode to
GL_TRIANGLES and provide eight vertexes, OpenGL will draw two triangles and ignore the last
two vertexes.
A polygon with more than three sides may be convex or concave. OpenGL can draw convex
polygons only. In practice, surfaces are usually constructed using polygons with a small
number of edges, either triangles or quadrilaterals. For triangles, the problem of complexity
does not arise; if the quadrilaterals are approximately rectangular, they will be convex.
Another problem that arises with more than four vertexes is planarity. Whereas a set of three
points defines a unique plane, a set of four or more points may or may not lie in a plane.
Since the calculations that OpenGL performs assume planarity, you may get funny results if
you provide non-planar polygons.
A polygon has a front face and a back face. The faces can have different colours, and the
distinction between front and back is important for lighting. The rule for deciding which is
the front face is:
If the order in which the vertexes are displayed appears to be counter-clockwise
in the viewing window, we are looking at the front of the polygon.
There are two functions that affect the way in which polygons are displayed.
glPolygonMode(face, mode );
accepts the arguments shown in the following table.
face mode
GL_FRONT_AND_BACK (default) GL_FILL (default)
GL_FRONT GL_LINE
GL_BACK GL_POINT
GL_FILL means that the entire polygon will be shaded with the current colour; GL_LINE means
that its outline will be drawn; and GL_POINT means that only the vertexes will be drawn.
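For example, to fill polygons that face the viewer but draw those that face away as outlines:

glPolygonMode(GL_FRONT, GL_FILL);   // shade front faces
glPolygonMode(GL_BACK, GL_LINE);    // outline back faces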
If the mode is GL_FILL, then the function
glShadeModel(shading);
Mode Value          Effect
GL_POINTS           Draw a point at each of the n vertices.
GL_LINES            Draw the unconnected line segments v0 v1, v2 v3, ..., v(n−2) v(n−1).
GL_LINE_STRIP       Draw the connected line segments v0 v1, v1 v2, ..., v(n−2) v(n−1).
GL_LINE_LOOP        Draw a closed loop of lines v0 v1, v1 v2, ..., v(n−2) v(n−1), v(n−1) v0.
GL_TRIANGLES        Draw the triangle v0 v1 v2, then the triangle v3 v4 v5, and so on.
GL_TRIANGLE_STRIP   Draw the triangle v0 v1 v2, then use v3 to draw a second triangle, and so on (see Figure 5 (a)).
GL_TRIANGLE_FAN     Draw the triangle v0 v1 v2, then use v3 to draw a second triangle, and so on (see Figure 5 (b)).
GL_QUADS            Draw the quadrilateral v0 v1 v2 v3, then the quadrilateral v4 v5 v6 v7, and so on.
GL_QUAD_STRIP       Draw the quadrilateral v0 v1 v3 v2, then the quadrilateral v2 v3 v5 v4, and so on (see Figure 5 (c)).
GL_POLYGON          Draw a single polygon using v0, v1, ..., v(n−1) as vertices (n ≥ 3).
Figure 4: Primitive specifiers
[Diagrams omitted: vertex orderings for (a) GL_TRIANGLE_STRIP, (b) GL_TRIANGLE_FAN, and (c) GL_QUAD_STRIP]
Figure 5: Drawing primitives
glBegin(GL_TRIANGLES);
glColor3f(1, 0, 0);
glVertex3f(0, 0.732, 0);
glColor3f(0, 1, 0);
glVertex3f(-0.5, 0, 0);
glColor3f(0, 0, 1);
glVertex3f(0.5, 0, 0);
glEnd();
Figure 6: A coloured triangle
determines how the face will be coloured. By default, the shading mode is GL_SMOOTH: OpenGL colours each vertex as specified by the user, and then interpolates colours in between. Figure 6 shows code that will draw a triangle which has red, green, and blue vertexes, and intermediate colours in between. If the shading mode is GL_FLAT, the entire triangle will be given a single colour; for independent triangles, this is the colour in effect when its last vertex is drawn.
1.5.2 GLUT Objects
The GLUT library provides several objects that are rather tedious to build “by hand”. These
objects are fully defined, with normals, and they are useful for experimenting with lighting.
A “wire” object displays as a wire frame; this is not very pretty but may be useful during
debugging. A “solid” object looks like a solid, but you should define its material properties
yourself, as we will see later. For a sphere, “slices” are lines of longitude and “stacks” are
lines of latitude. Cones are analogous. For a torus, “sides” run around the torus and “rings”
run around the “tube”. The prototypes for the GLUT objects are:
void glutWireSphere (double radius, int slices, int stacks);
void glutSolidSphere (double radius, int slices, int stacks);
void glutWireCube (double size);
void glutSolidCube (double size);
void glutWireTorus (double inner, double outer, int sides, int rings);
void glutSolidTorus (double inner, double outer, int sides, int rings);
void glutWireIcosahedron ();
void glutSolidIcosahedron ();
void glutWireOctahedron ();
void glutSolidOctahedron ();
void glutWireTetrahedron ();
void glutSolidTetrahedron ();
void glutWireDodecahedron ();
void glutSolidDodecahedron ();
void glutWireCone (double radius, double height, int slices, int stacks);
void glutSolidCone (double radius, double height, int slices, int stacks);
void glutWireTeapot (double size);
void glutSolidTeapot (double size);
1.5.3 Quadric Objects
Quadrics are surfaces that can be generated by an equation of the second degree: that is, by
choosing values of ai in
a1 x² + a2 y² + a3 z² + a4 yz + a5 zx + a6 xy + a7 x + a8 y + a9 z + a10 = 0.
Quadrics provided by OpenGL include spheres, cylinders, and disks. Since the top and bottom
diameters of a cylinder can be set independently, cones can be drawn as well. Drawing a
quadric requires several steps. These steps usually occur during initialization:
• Declare a pointer to a quadric descriptor:
GLUquadricObj *pq;
• Allocate a default descriptor object:
pq = gluNewQuadric();
• Set the drawing style for the quadric:
gluQuadricDrawStyle(pq, GLU_FILL);
Possible values for the second argument are: GLU_POINT, GLU_LINE, GLU_SILHOUETTE, and GLU_FILL.
Within the display function:
• Draw the quadric using one of:
gluSphere(pq, radius, slices, stacks);
gluCylinder(pq, baseRadius, topRadius, height, slices, stacks);
gluDisk(pq, innerRadius, outerRadius, slices, rings);
gluPartialDisk(pq, innerRadius, outerRadius, slices, rings, startAngle, sweepAngle);
The first argument, pq, is the pointer returned by gluNewQuadric. The dimensions have type double. The arguments slices, stacks, and rings indicate the number of segments used to draw the figure, and are integers. Larger values mean slower displays: values from 15 to 25 are adequate for most purposes. The angles for a partial disk must be given in degrees, not radians.
A sphere has its centre at the origin. The base of the cylinder is at the origin, and the cylinder points in the +Z direction. A disk has its centre at the origin and lies in the plane z = 0.
When you have finished using quadrics:
• Delete the quadric descriptor:
gluDeleteQuadric(pq);
The quadric descriptor contains the information about how to draw the quadric, as set by gluQuadricDrawStyle, etc. Once you have created a descriptor, you can draw as many quadrics as you like with it.
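The steps can be combined as in the following sketch; the function names initQuadric and display are ours:

GLUquadricObj *pq;                  // global quadric descriptor

void initQuadric()                  // call once, during initialization
{
   pq = gluNewQuadric();
   gluQuadricDrawStyle(pq, GLU_FILL);
}

void display()
{
   glClear(GL_COLOR_BUFFER_BIT);
   gluSphere(pq, 0.5, 20, 20);      // radius 0.5, 20 slices, 20 stacks
   glFlush();
}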
1.6 Hidden Surface Elimination
To obtain a realistic view of a collection of primitive objects, the graphics system must display
only the objects that the viewer can see. Since the components of the model are typically
surfaces (triangles, polygons, etc.), the step that ensures that invisible surfaces are not ren-
dered is called hidden surface elimination. There are various ways of eliminating hidden
surfaces; OpenGL uses a depth buffer.
The depth buffer is a two-dimensional array of numbers; each component of the array corre-
sponds to a pixel in the viewing window. In general, several points in the model will map to
a single pixel. The depth-buffer is used to ensure that only the point closest to the viewer is
actually displayed.
To enable hidden surface elimination, modify your graphics program as follows:
• When you initialize the display mode, include the depth buffer bit:
glutInitDisplayMode(GLUT_RGBA | GLUT_DEPTH);
• During initialization and after creating the graphics window, execute the following statement to enable the depth-buffer test:
glEnable(GL_DEPTH_TEST);
• In the display() function, modify the call to glClear() so that it clears the depth buffer as well as the colour buffer:
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
1.7 Animation
If your program has an idle callback function that changes the values of some global variables,
OpenGL will display your model repeatedly, giving the effect of animation. The display will
probably flicker, however, because images will be alternately drawn and erased. To avoid
flicker, modify your program to use double buffering. In this mode, OpenGL renders the
image into one buffer while displaying the contents of the other buffer.
• When you initialize the display mode, include the double buffer bit:
glutInitDisplayMode(GLUT_RGBA | GLUT_DOUBLE);
• At the end of the display() function include the call
glutSwapBuffers();
If you want to eliminate hidden surfaces and animate, you will have to use this call during initialization:
glutInitDisplayMode(GLUT_RGBA | GLUT_DEPTH | GLUT_DOUBLE);
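As a sketch (assuming the display mode above, and that idle and display have been registered with glutIdleFunc and glutDisplayFunc), a minimal animation looks like this; the global angle and its step size are arbitrary choices:

GLfloat angle = 0;                 // current rotation of the model

void idle()
{
   angle += 0.2;                   // advance the animation
   glutPostRedisplay();
}

void display()
{
   glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
   glMatrixMode(GL_MODELVIEW);     // explained in Section 2
   glLoadIdentity();
   glRotatef(angle, 0, 1, 0);      // rotate about the Y axis
   glutWireCube(1);
   glutSwapBuffers();              // show the buffer just drawn
}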
2 Transformations and Projections
Graphics programming makes heavy use of coordinate transformations. Suppose we want to
display a scene with two houses. It doesn’t make sense to specify the coordinates of all of the
vertices of one house and then repeat the process all over again for the other house. Instead,
we would define a house, translate all of its coordinates to one part of the scene, and then
translate the same set of points to another part of the scene. We might also rotate or scale the
house. In fact, we might translate, rotate, or scale the entire scene. All of these operations
require coordinate transformations.
Coordinate transformations change the coordinates of a point from (x, y, z) to (x', y', z'). The
common kinds of transformation are:
• Translation:
x' = x + a
y' = y + b
z' = z + c
• Scaling:
x' = r x
y' = s y
z' = t z
• Rotation:
x' = x cos θ − y sin θ
y' = x sin θ + y cos θ
z' = z
(This is a rotation about the Z axis. Equations for rotations about the other axes are similar.)
Scaling and rotation can be represented as matrix transformations. For example, the rotation
above can be written
| x' |   | cos θ  −sin θ  0 | | x |
| y' | = | sin θ   cos θ  0 | | y |
| z' |   | 0       0      1 | | z |
We cannot represent translation as a matrix transformation in this way. However, if we use 4 × 4 matrices, we can represent all three transformations because
| 1  0  0  a | | x |   | x + a |
| 0  1  0  b | | y | = | y + b |
| 0  0  1  c | | z |   | z + c |
| 0  0  0  1 | | 1 |   | 1     |
We can view the use of 4 × 4 matrices simply as a trick for making translation work or as a move to four-dimensional affine space. The graphics programming is the same in either case.
2.1 Matrices in OpenGL
OpenGL maintains several matrices, of which the most important are the projection matrix
and the model view matrix. Since the transformation from model coordinates to window
coordinates is achieved by multiplying these two matrices together, one matrix would actually
suffice. Splitting the transformation into two parts is convenient because
• the projection determines our view of the model, and
• the model view matrix determines our position and orientation within the model.
Thus we can think of the projection matrix as applying to the entire model and the model
view matrix as applying to parts of the model.
To manipulate matrices:
• Call glMatrixMode(mode) to choose a matrix. The value of mode is either GL_PROJECTION or GL_MODELVIEW.
• Call glLoadIdentity() to set this matrix to the identity matrix.
• Call a projection function to set the value of the projection matrix.
• Call transformation functions to change the value of the model view matrix. The most frequently used transformation functions are glTranslatef(x, y, z), glRotatef(angle, x, y, z), and glScalef(r, s, t). (A typical sequence is sketched below.)
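A typical sequence is sketched below; the particular numbers are arbitrary:

glMatrixMode(GL_MODELVIEW);   // choose the model view matrix
glLoadIdentity();             // start from the identity
glTranslatef(0, 0, -10);      // move the frame 10 units along -Z
glRotatef(45, 0, 1, 0);       // rotate 45 degrees about the Y axis
glScalef(2, 2, 2);            // double the size of subsequent objects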
We consider projection matrices first and then model view matrices.
2.2 Projection Matrices
2.2.1 Orthogonal Projections
The simplest projection matrix gives an orthogonal projection. This is done very simply by
ignoring Z values and scaling X and Y values to fit the current viewport. The call
glOrtho(left, right, bottom, top, near, far )
defines an orthogonal transformation for which the point at (x, y, z) in the model will be visible if left ≤ x ≤ right, bottom ≤ y ≤ top, and near ≤ z ≤ far. These positions define the viewing volume: a point is visible only if it is within the viewing volume. Note that values of Z are constrained even though they do not affect the position of the projected point in the window. The matrix constructed by this call is
        | 2/(right−left)    0                 0                −(right+left)/(right−left)  |
ortho = | 0                 2/(top−bottom)    0                −(top+bottom)/(top−bottom)  |
        | 0                 0                 −2/(far−near)    −(far+near)/(far−near)      |
        | 0                 0                 0                 1                          |
Note that if any pair of arguments are equal, the matrix will have infinite elements and your
program will probably crash.
The default orthogonal projection is equivalent to
glOrtho(-1, 1, -1, 1, -1, 1);
[Diagram omitted: the viewer at O looks along the −Z axis; the viewing volume is a frustum of vertical angle α, with window height h and width w, bounded by planes at distances near and far from the viewer.]
Figure 7: Perspective projection using gluPerspective()
In other words, the default viewing volume is defined by |x| ≤ 1, |y| ≤ 1, and |z| ≤ 1.
A blank window is a common problem during OpenGL program development. In many
cases, the window is blank because your model is not inside the viewing volume.
2.2.2 Perspective Projections
A perspective transformation makes distant objects appear smaller. One way to visualize a
perspective transformation is to imagine yourself looking out of a window. If you copy the
scene outside onto the glass, without moving your head, the image on the glass will be a
perspective transformation of the scene.
The simplest way to obtain a perspective transformation in OpenGL is to call
gluPerspective(angle, aspectRatio, near, far );
The effect of this call is shown in Figure 7. The angle between the top and bottom of the
scene, as seen by the viewer, is angle. The value of aspectRatio is the width of the window
divided by its height. The values of near and far determine the closest and furthest points in
the viewing volume.
Here is a typical reshape function using gluPerspective:
void reshape(int width, int height)
f
glViewport(0, 0, width, height);
glMatrixMode(GL PROJECTION);
glLoadIdentity();
gluPerspective(30, (GLfloat)width / (GLfloat)height, 5, 40);
glutPostRedisplay();
g
The new window has size width × height. The first line of the function makes the viewport fit the window. The second and third lines set the projection matrix to the identity matrix. The fourth line turns it into a perspective projection with a vertical angle of 30° and the same aspect ratio as the new viewport. The viewing volume is bounded in the Z direction by −40 ≤ z ≤ −5.
The arguments near and far must satisfy far > near > 0. The model coordinates are negative because the Z axis is pointing towards the viewer.
A perspective projection will appear correct only if the window subtends the given angle at the viewer's eye. If the value of angle is θ, the height of the window is h, and the distance from the viewer to the window is d, then

θ = 2 arctan(h / 2d)

or, equivalently,

2 tan(θ/2) = h / d

For example, the value of angle in the reshape function above is 30°. If the scene is viewed in a window 6 inches high, the viewer should place his or her eyes about 11 inches from the screen.
Changing the value of angle gives the same effect as zooming a camera lens: the viewpoint
remains the same, but the angle of view changes.
The function gluPerspective is not an OpenGL primitive (its name begins with glu, so it is an OpenGL utility function). The OpenGL primitive function for perspective projection is
glFrustum, which we provide with the boundaries of the viewing volume:
glFrustum(left, right, bottom, top, near, far);
The arguments look (and, in fact, are) the same as those for glOrtho. The difference is that
near determines the distance of the viewer from the near plane and the shape of the viewing
volume is a “frustum” (truncated rectangular pyramid) rather than a cuboid (brick-shaped
solid). The matrix constructed by the call is shown below; compare it to the matrix generated by glOrtho in Section 2.2.1.
          | 2 near/(right−left)   0                     (right+left)/(right−left)   0                      |
frustum = | 0                     2 near/(top−bottom)   (top+bottom)/(top−bottom)   0                      |
          | 0                     0                     −(far+near)/(far−near)      −2 far near/(far−near) |
          | 0                     0                     −1                          0                      |
Let v = [X, Y, Z, 1]^T. When we multiply this vector by ortho, the transformed X and Y coordinates are independent of Z: with an orthogonal transformation, distance from the viewer does not affect size. (We have abbreviated right to r, left to l, etc.)
            | (2X − (r+l)) / (r−l)   |
ortho · v = | (2Y − (t+b)) / (t−b)   |
            | (−2Z − (f+n)) / (f−n)  |
            | 1                      |
However, if we multiply frustum by v, we obtain the transformed coordinates below after normalization (that is, scaling so that the fourth component is 1). Note that the X and Y coordinates depend on the value of Z, moving towards the origin as Z gets larger.

              | (2n X + Z(r+l)) / (−Z(r−l))  |
frustum · v = | (2n Y + Z(t+b)) / (−Z(t−b))  |
              | (2fn + Z(f+n)) / (Z(f−n))    |
              | 1                            |
It might seem easiest to put the near plane very close and the far plane very far away, because this reduces the chance that the model will be outside the viewing volume. The drawback is that precision may be lost in depth comparison calculations. The number of bits lost is about log2(far/near). For example, if you set near = 1 and far = 1000, you will lose about 10 bits of precision.
Figure 8 shows a very simple OpenGL program that uses a perspective transformation. The display function includes a translation that moves the model 10 units in the negative Z direction, placing it comfortably in the viewing volume, which extends from z = −5 to z = −20. The output statement in function reshape reveals when and how often the function is called; when I ran this program under Windows, reshape was called once, with width = height = 300.
2.3 Model View Transformations
Getting the projection right is the easy part. Manipulating the model view matrix is harder
because there is more work to do.
As we have seen, there are transformations for translating, rotating, and scaling. The order
in which these transformations are applied is important.
The graphics software simulates a camera, transforming a three-dimensional object, viewed
from a certain angle, into a rectangular, two-dimensional image. There are two ways of
thinking about a viewing transformation, and it is helpful to be able to think using both.
˘ A viewing transformation has the effect of moving the model with respect to the
camera.
˘ A viewing transformation has the effect of moving the camera with respect to the
model.
#include <GL/glut.h>
#include <iostream>
using namespace std;

void display ()
{
   glClearColor(0, 0.1, 0.4, 0);
   glClear(GL_COLOR_BUFFER_BIT);
   glMatrixMode(GL_MODELVIEW);
   glLoadIdentity();
   glTranslatef(0, 0, -10);
   glutWireCube(1);
   glFlush();
}

void reshape(int width, int height)
{
   cout << "Reshape " << width << " " << height << endl;
   glViewport(0, 0, width, height);
   glMatrixMode(GL_PROJECTION);
   glLoadIdentity();
   gluPerspective(30, (GLfloat)width / (GLfloat)height, 5, 20);
   glutPostRedisplay();
}

int main (int argc, char **argv)
{
   glutInit(&argc, argv);
   glutCreateWindow("Perspective Transformation");
   glutDisplayFunc(display);
   glutReshapeFunc(reshape);
   glutMainLoop();
}
Figure 8: An OpenGL program with a perspective transformation
Naturally, the two approaches are inverses. We can think of the transformation
glTranslatef(0, 0, -10);
used in the program of Figure 8 as either moving the model 10 units in the −Z direction or moving the camera 10 units in the +Z direction.
Initially, the camera and the model are both situated at the origin, (0, 0, 0), with the camera looking in the −Z direction. If we want to see the model, we have either to move it away from the camera, or move the camera away from the model.
For most purposes, it is easiest to visualize transformations like this: the camera remains
fixed, and the transformation moves the origin to a new location. All drawing takes place
at the current origin. For example, when we call glutWireCube, the cube is drawn with its
centre at the current origin. Viewed in this way, the translation above moves the origin to (0, 0, −10) and continues drawing there.
Physicists use the term frame of reference, or frame for short, for a coordinate system
with an origin and axes. With this terminology, the effect of a transformation is to move the
void display ()
{
   glClearColor(0, 0.1, 0.4, 0);
   glClear(GL_COLOR_BUFFER_BIT);
   glMatrixMode(GL_MODELVIEW);
   glLoadIdentity();
   glTranslatef(0, 0, -10);
   glRotatef(15, 0, 1, 0);
   glutWireCube(1);
   glFlush();
}
Figure 9: Translation followed by rotation
void display ()
{
   glClearColor(0, 0.1, 0.4, 0);
   glClear(GL_COLOR_BUFFER_BIT);
   glMatrixMode(GL_MODELVIEW);
   glLoadIdentity();
   glRotatef(15, 0, 1, 0);
   glTranslatef(0, 0, -10);
   glutWireCube(1);
   glFlush();
}
Figure 10: Rotation followed by translation
frame, leaving the camera where it is, and to draw objects with respect to the new frame.
Although particular kinds of transformations commute, transformations in general do not
commute.
Consider the two versions of the display function shown in Figures 9 and 10.
• In Figure 9, we first translate the frame of reference 10 units in the −Z direction and
then rotate it 15° about the Y axis. Finally, we draw the cube. The effect is that the
cube appears in the middle of the window, rotated 15° about the vertical axis.
• In Figure 10, we first rotate the frame 15° about the Y axis and then translate it 10
units in the −Z direction. Since the axes have been rotated, the direction of the Z axis
has changed, and the cube moves to the side of the window.
3 Building Models and Scenes
3.1 A Digression on Global Variables
A common criticism of OpenGL is that programs depend too much on global variables. The
criticism is valid in the sense that most small OpenGL programs, especially example programs,
do make heavy use of global variables. To some extent, this is inevitable, because the call-
back functions have fixed parameter lists and do not return results: the only way they can
communicate is with global variables.
For example, suppose we want the mouse callback function to affect the display. Since the
mouse callback function receives the mouse coordinates and returns void, and the display
function receives nothing and returns void, these functions can communicate only by means
of global variables.
It is impossible to avoid having a few global variables. However, the number of global variables
can be made quite small by following standard encapsulation practices. For example, the
current GLUT context (current window, its width and height, position of the mouse, etc.)
can be put inside an object. Then there needs to be only one global variable, probably a
pointer to this object, and callback functions reference this object.
Figure 11 shows a small program written using global variables. It is typical of small OpenGL
example programs. Figures 12 and 13 show an equivalent program written with fewer global
variables. In fact, the only global variable in the second version is ps, which is a pointer
to an object that contains all the state necessary for this application. The display, reshape,
and mouse functions communicate by “sending messages” to this unique object. This tech-
nique extends nicely to larger programs. For example, multiple windows can be handled by
associating one object with each window.
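As a sketch of this multi-window idea (not part of the example programs; it assumes the State class of Figure 12 and uses glutGetWindow to identify the active window):

#include <map>

std::map<int, State*> states;   // one State object per GLUT window id

State *current ()
{
   // glutGetWindow returns the id of the window whose callback is running
   return states[glutGetWindow()];
}

void display ()
{
   State *ps = current();
   // ... draw using ps->xPos and ps->yPos, as in Figure 12 ...
}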
int mwWidth;
int mwHeight;
GLfloat xPos;
GLfloat yPos;
void display ()
{
glClearColor(0, 0.1, 0.4, 0);
glClear(GL_COLOR_BUFFER_BIT);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glTranslatef(0, 0, -10);
glRotatef(360 * xPos, 0, 1, 0);
glutWireCube(1);
glFlush();
}
void reshape(int width, int height)
{
mwWidth = width;
mwHeight = height;
glViewport(0, 0, width, height);
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluPerspective(30, (GLfloat)width / (GLfloat)height, 5, 20);
glutPostRedisplay();
}
void mouse(int x, int y)
{
xPos = (GLfloat)x / (GLfloat)mwWidth;
yPos = (GLfloat)y / (GLfloat)mwHeight;
glutPostRedisplay();
}
int main (int argc, char **argv)
{
glutInit(&argc, argv);
glutCreateWindow("Perspective Transformation");
glutDisplayFunc(display);
glutReshapeFunc(reshape);
glutMotionFunc(mouse);
glutMainLoop();
}
Figure 11: Programming with global variables
class State
{
public:
State(int w, int h) : mwWidth(w), mwHeight(h) {}
int mwWidth;
int mwHeight;
GLfloat xPos;
GLfloat yPos;
};
State *ps;
void display ()
{
glClearColor(0, 0.1, 0.4, 0);
glClear(GL_COLOR_BUFFER_BIT);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glTranslatef(0, 0, -10);
glRotatef(360 * ps->xPos, 0, 1, 0);
glutWireCube(1);
glFlush();
}
void reshape(int width, int height)
{
ps->mwWidth = width;
ps->mwHeight = height;
glViewport(0, 0, width, height);
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluPerspective(30, (GLfloat)width / (GLfloat)height, 5, 20);
glutPostRedisplay();
}
void mouse(int x, int y)
{
ps->xPos = (GLfloat)x / (GLfloat)ps->mwWidth;
ps->yPos = (GLfloat)y / (GLfloat)ps->mwHeight;
glutPostRedisplay();
}
Figure 12: Programming with fewer global variables
int main (int argc, char **argv)
{
const int WIDTH = 600;
const int HEIGHT = 400;
glutInit(&argc, argv);
glutCreateWindow("Perspective Transformation");
glutInitWindowSize(WIDTH, HEIGHT);
glutDisplayFunc(display);
glutReshapeFunc(reshape);
glutMotionFunc(mouse);
ps = new State(WIDTH, HEIGHT);
glutMainLoop();
}
Figure 13: Programming with fewer global variables, continued
glTranslatef(0, 0, LEN / 2);
glScalef(DIAM, DIAM, LEN);
glutWireCube(1);
glScalef(1/DIAM, 1/DIAM, 1/LEN);
glTranslatef(0, 0, LEN / 2 + RAD);
glutWireSphere(RAD, 15, 15);
glTranslatef(0, 0, - LEN - RAD);
Figure 14: Drawing an arm — first version
3.2 Matrix Stacks
In this section, we develop a program that draws a Maltese Cross. The cross has six arms,
pointing in the directions ±X, ±Y, and ±Z. Each arm has a square cross-section and ends
with a sphere.
Figure 14 shows the code for one arm of the cross. The arm has diameter DIAM and length
LEN; the sphere has radius RAD. The origin is at the base of the arm. The arm is obtained
by changing the scale, drawing a cube, and resetting the original scale. When this code
has finished, the origin is restored by a translation that reverses the effect of the earlier
translations.
There are two undesirable features of the code in Figure 14. First, we have to “undo” the
scaling transformation. (In particular, note that the user might want to obtain “flat” arms
by setting one diameter to zero; in this case, the scaling cannot be reversed.) Second, the
same problem applies to the frame of reference: we have to reverse the effect of the translations
to maintain the position of the origin.
OpenGL matrices are implemented as matrix stacks. To avoid reversing transformations, we
can stack the matrices that we need using glPushMatrix and restore them when we have
finished using glPopMatrix. Figure 15 is the revised code for drawing an arm. This code
leaves the frame of reference unchanged. The indentation is not required, of course, but it
helps the reader to understand the effect of the transformations.
glPushMatrix();
glTranslatef(0, 0, LEN / 2);
glPushMatrix();
glScalef(DIAM, DIAM, LEN);
glutWireCube(1);
glPopMatrix();
glTranslatef(0, 0, LEN / 2 + RAD);
glutWireSphere(RAD, 15, 15);
glPopMatrix();
Figure 15: Drawing an arm — improved version
void display ()
{
   glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
   glMatrixMode(GL_MODELVIEW);
   glLoadIdentity();
   arm();
   glRotatef(90, 1, 0, 0);
   arm();
   glRotatef(180, 1, 0, 0);
   arm();
   glRotatef(270, 1, 0, 0);
   arm();
   glRotatef(90, 0, 1, 0);
   arm();
   glRotatef(270, 0, 1, 0);
   arm();
   glFlush();
}
Figure 16: Drawing a Maltese Cross
We can put the code of Figure 15 into a function called arm. This function has an important
property that should be respected by all drawing functions: it leaves the reference frame
unchanged. Drawing scenes with functions that do not have this property can be very
confusing! The display function calls arm six times to draw the Maltese Cross: see Figure 16.
3.2.1 Pops and Pushes Don’t Cancel!
When pushing and popping matrices, it is important to realize that the sequence
glPopMatrix();
glPushMatrix();
does have an effect: the two calls do not cancel each other out. To see this, look at Figure 17.
The left column contains line numbers for identification, the second column contains code, the
third column shows the matrix at the top of the stack after each function has executed, and
# Code Stack
1 glLoadIdentity(); I
2 glPushMatrix(); I I
3 glTranslatef(1.0, 0.0, 0.0); T I
4 glPushMatrix(); T T I
5 glRotatef(10.0, 0.0, 1.0, 0.0); T·R T I
6 glPopMatrix(); T I
7 glPushMatrix(); T T I
Figure 17: Pushing and popping
the other columns show the matrices lower in the stack. The matrices are shown as I (the
identity), T (the translation), and R (the rotation). At line 4, there are three matrices on
the stack, with T occupying the top two places. Line 5 post-multiplies the top matrix by R.
Line 6 pops the product T·R off the stack, restoring the stack to its value at line 3. Line 7
pushes the stack and copies the top matrix. Note the difference between the stack entries at
line 5 and line 7.
3.2.2 Animation with Stacks
The ability to stack matrices is particularly important for animation. Suppose that we want
to draw a sequence of images showing a robot waving its arms. There will be at least two
angles that change with time: let us say that θ is the angle between the body and the
upper arm, and φ is the angle between the upper arm and the forearm. Then the animation
will include roughly the following steps.
1 Push frame 1.
1.1 Draw the body.
1.2 Translate to the left shoulder.
1.3 Push frame 2.
1.3.1 Rotate through θ.
1.3.2 Draw the upper arm and translate along it.
1.3.3 Rotate through φ.
1.3.4 Draw the forearm and hand.
1.4 Pop, restoring frame 2 (the left shoulder).
1.5 Translate to the right shoulder.
1.6 Push frame 3.
1.6.1 Rotate through θ.
1.6.2 Draw the upper arm and translate along it.
1.6.3 Rotate through φ.
1.6.4 Draw the forearm and hand.
1.7 Pop, restoring frame 3 (the right shoulder).
2 Pop, restoring frame 1 (the body).
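In code, the outline might look like the following sketch. The drawing functions, the shoulder offset SHOULDER, and the choice of rotation axis are assumptions for illustration, not part of the notes.

void robot (GLfloat theta, GLfloat phi)
{
   glPushMatrix();                     // push frame 1
   drawBody();
   glTranslatef(-SHOULDER, 0, 0);      // translate to the left shoulder
   glPushMatrix();                     // push frame 2
   glRotatef(theta, 0, 0, 1);          // rotate through theta
   drawUpperArm();                     // draws the upper arm and translates along it
   glRotatef(phi, 0, 0, 1);            // rotate through phi
   drawForearmAndHand();
   glPopMatrix();                      // pop, restoring frame 2
   glTranslatef(2 * SHOULDER, 0, 0);   // translate to the right shoulder
   glPushMatrix();                     // push frame 3
   glRotatef(theta, 0, 0, 1);
   drawUpperArm();
   glRotatef(phi, 0, 0, 1);
   drawForearmAndHand();
   glPopMatrix();                      // pop, restoring frame 3
   glPopMatrix();                      // pop, restoring frame 1
}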
3.3 Viewing the Model
So far, we have used a convention that is quite common in OpenGL programs: the reshape
callback function sets the projection matrix and the display callback function sets the model
view matrix. The partial program shown in Figure 18 shows a variation on this theme: the
display function sets both the projection and the model view matrices. The reshape function
updates the viewport and sets the global variables width and height, which are needed for
gluPerspective. The Y mouse coordinate is used to set the angle of view in gluPerspective;
this has the effect that moving the mouse down (up) in the window zooms the image towards
(away from) the viewer.
The function gluLookAt() defines a model-view transformation that simulates viewing (“looking
at”) the scene from a particular viewpoint. It takes nine arguments of type GLdouble. The
first three arguments define the camera (or eye) position with respect to the origin; the next
three arguments are the coordinates of a point in the model towards which the camera is
directed; and the last three arguments are the components of a vector pointing upwards. In
the call
gluLookAt ( 0.0, 0.0, 10.0,
0.0, 0.0, 0.0,
0.0, 1.0, 0.0 );
the point of interest in the model is at (0, 0, 0), the position of the camera relative to this
point is (0, 0, 10), and the vector (0, 1, 0) (that is, the Y -axis) is pointing upwards.
Although the idea of gluLookAt() seems simple, the function is tricky to use in practice.
Sometimes, introducing a call to gluLookAt() has the undesirable effect of making the image
disappear altogether! In the following code, the effect of the call to gluLookAt() is to move
the origin to (0, 0, −10); but the near and far planes defined by gluPerspective() are at
z = −1 and z = −5, respectively. Consequently, the cube is beyond the far plane and is
invisible.
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluPerspective(30, 1, 1, 5);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt ( 0, 0, 10, 0, 0, 0, 0, 1, 0 );
glutWireCube(1.0);
Figure 19 demonstrates how to use gluLookAt() and gluPerspective() together. The two
important variables are alpha and dist. The idea is that the extension of the object in the Z-
direction is less than 2; consequently, it can be enclosed completely by planes at z = dist − 1
and z = dist + 1. To ensure that the object is visible, gluLookAt() sets the camera position
to (0, 0, dist).

Changing the value of alpha in Figure 19 changes the size of the object (see Section 2.2.2).
The height of the viewing window is 2 (dist − 1) tan(alpha/2); increasing alpha makes the
viewing window larger and the object smaller.
Changing the value of dist also changes the size of the image in the viewport, but in a
different way. The perspective changes, giving the effect of approaching (if dist gets smaller
and the object gets larger) or going away (if dist gets larger and the object gets smaller).
const int WIDTH = 600;
const int HEIGHT = 400;
int width = WIDTH;
int height = HEIGHT;
GLfloat xMouse = 0.5;
GLfloat yMouse = 0.5;
GLfloat nearPlane = 10;
GLfloat farPlane = 100;
GLfloat distance = 80;
void display ()
{
   glMatrixMode(GL_PROJECTION);
   glLoadIdentity();
   gluPerspective(20 + 60 * yMouse, GLfloat(width) / GLfloat(height),
                  nearPlane, farPlane);
   glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
   glMatrixMode(GL_MODELVIEW);
   glLoadIdentity();
   glTranslatef(0, 0, -distance);
   // Display scene
   glutSwapBuffers();
}

void mouseMovement (int mx, int my)
{
   xMouse = GLfloat(mx) / GLfloat(width);
   yMouse = 1 - GLfloat(my) / GLfloat(height);
   glutPostRedisplay();
}

void reshapeMainWindow (int newWidth, int newHeight)
{
   width = newWidth;
   height = newHeight;
   glViewport(0, 0, width, height);
}
Figure 18: Zooming
const int SIZE = 500;
float alpha = 60.0;
float dist = 5.0;
void display (void)
{
glClear(GL_COLOR_BUFFER_BIT);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt ( 0.0, 0.0, dist,
0.0, 0.0, 0.0,
0.0, 1.0, 0.0 );
glutWireCube(1.0);
}
int main (int argc, char *argv[])
{
glutInit(&argc, argv);
glutInitWindowSize(SIZE, SIZE);
glutInitWindowPosition(100, 50);
glutCreateWindow("A Perspective View");
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluPerspective(alpha, 1.0, dist - 1.0, dist + 1.0);
glutDisplayFunc(display);
glutMainLoop();
}
Figure 19: Using gluLookAt() and gluPerspective()
It is possible to change alpha and dist together in such a way that the size of a key object
in the model stays the same while the perspective changes. This is a rather simple technique
in OpenGL, but it is an expensive effect in movies or television because the zoom control of
the lens must be coupled to the tracking motion of the camera. Hitchcock used this trick to
good effect in his movies Vertigo and Marnie.
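A sketch of the idea in code follows. The window height W and the simplified formula height = 2 · dist · tan(alpha/2), measured at the object rather than at the near plane, are assumptions for illustration.

#include <math.h>

const float PI = 3.14159265f;
const float W  = 2.0f;   // desired height of the viewing window at the object

// Given a new field of view alpha (degrees), return the camera distance
// that keeps a plane through the object at the same apparent size.
float zoomDist (float alpha)
{
   return W / (2.0f * tanf(0.5f * alpha * PI / 180.0f));
}

As alpha decreases (a narrower lens), zoomDist increases, so the camera tracks backwards while the object's size on screen stays fixed; only the perspective changes.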
The code extracts in the following example are taken from a program called viewpoints.cpp.
You can obtain the complete source code for this program from
http://www.cs.concordia.ca/~faculty/grogono/viewpoints.cpp
The display function of this program consists mainly of a switch statement for which the
cases are determined by keys pressed by the user. There is also an idle function callback that
performs the following computation:
carDirection += 0.05f;
if (carDirection > TWOPI)
carDirection -= TWOPI;
carXPos = TRACKMIDDLE * sin(carDirection);
carYPos = TRACKMIDDLE * cos(carDirection);
In the simplest case, the camera stands still and we watch the car going around the track.
case DISTANT:
gluLookAt(
250.0, 0.0, 20.0 * height,
0.0, 0.0, 0.0,
0.0, 0.0, 1.0 );
drawScenery();
glTranslatef(carXPos, carYPos, carZPos);
glRotatef(RAD2DEG * carDirection, 0.0, 0.0, -1.0);
drawCar();
break;
In the next mode, the camera stays in the same position but pans to follow the car. This is
easily done by using the car’s position as the point of interest of gluLookAt.
case INSIDE:
gluLookAt(
85.0, 0.0, height,
carXPos, carYPos, 0.0,
0.0, 0.0, 1.0 );
drawScenery();
glTranslatef(carXPos, carYPos, carZPos);
glRotatef(RAD2DEG * carDirection, 0.0, 0.0, -1.0);
drawCar();
break;
In the next mode, we see the scene from the driver’s point of view. The call to gluLookAt
establishes an appropriate point of view, assuming the car is at the origin and the car is
drawn without any further transformation. We then apply inverse transformations to show
the scenery moving with respect to the car.
case DRIVER:
gluLookAt(
2.0, 0.0, height,
12.0, 0.0, 2.0,
0.0, 0.0, 1.0 );
drawCar();
glRotatef(RAD2DEG * carDirection, 0.0, 0.0, 1.0);
glTranslatef(- carXPos, - carYPos, carZPos);
drawScenery();
break;
The next mode is the hardest to get right. We are in the car but looking at a fixed object in
the scene. The first rotation counteracts the rotation of the car. Then we call gluLookAt to
look from the driver’s position to the house at (40, 120). We then draw the car, which is
now rotating with respect to the camera; finally, we reverse the rotation and translation
transformations and draw the scenery.
case HOUSE:
glRotatef(RAD2DEG * carDirection, 0.0, -1.0, 0.0);
gluLookAt(
2.0, 0.0, height,
40.0 - carXPos, 120.0 - carYPos, carZPos,
0.0, 0.0, 1.0 );
drawCar();
glRotatef(RAD2DEG * carDirection, 0.0, 0.0, 1.0);
glTranslatef(- carXPos, - carYPos, carZPos);
drawScenery();
break;
4 Lighting
The techniques that we have developed so far enable us to draw various shapes, to view them
in perspective, and to move them around. With these techniques alone, however, it is hard to
create the illusion of reality. The missing dimension is lighting: with skillful use of lighting,
we can turn a primitive computer graphic into a realistic scene.
OpenGL provides simple but effective facilities for lighting. It compromises between realism
and efficiency: there are more sophisticated algorithms for lighting than those that OpenGL
uses, but they require significantly more processing time. OpenGL is good enough for most
purposes and fast enough to animate fairly complex scenes on today’s PCs.
4.1 Lighting Basics
4.1.1 Hiding Surfaces and Enabling Lights
Since lighting does not make much sense without hidden surface removal, we will assume in
this section that initialization includes these calls:
glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH);
glEnable(GL_DEPTH_TEST);
and that the display function clears the depth buffer bit:
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
OpenGL performs lighting calculations only if GL_LIGHTING is enabled. There are eight lights
and their names are GL_LIGHTn, where n = 0, 1, ..., 7. For simple applications, we need to
use only the first light. During initialization, we execute
glEnable(GL_LIGHTING);
glEnable(GL_LIGHT0);
As usual, OpenGL works with states: we can turn lighting on or off at any time by calling
glEnable(GL_LIGHTING) and glDisable(GL_LIGHTING). This is useful for scenes which are
partly lit and partly unlit.
4.1.2 Kinds of Light
Nature provides only one kind of light, in the form of photons. We can simulate photons, but
only with large amounts of computation. Research has shown that we can obtain realistic
illumination in a reasonably efficient way by dividing light into four categories. Note that we
never see the light itself, but only the surfaces that are illuminated by it. Even when we see
a “beam” of light, for example in a movie theatre, we are actually seeing illuminated dust
particles, not light.
Ambient light is light that pervades every part of the scene. Ambient light has no direction
and it illuminates every object in the same way, regardless of its position or orientation. If
there is no ambient light, the scene has a harsh, “outer space” quality, in which a surface
that is not directly illuminated appears completely black, and therefore invisible.
Diffuse light comes from a particular direction but is scattered in all directions by the surface
it hits. The brightness of a diffusely lit surface varies with the direction of the light: a simple
assumption is that the brightness is proportional to cos θ, where θ is the angle between the
light rays and the normal to the surface. The colour of a surface lit by diffuse light is the
colour of the surface: a red object appears red, etc.
Specular light comes from a particular direction and is reflected in a cone. A mirror is an
almost perfect specular surface: a light ray is reflected as a light ray (the angle of the cone
is zero). Glossy objects reflect specular light in a cone whose angle depends on the shininess
of the material. If you think of a sphere illuminated by a small, bright light, it will have a
circular highlight: the size of the highlight will be small for a very shiny sphere and larger for
a matte sphere. Unlike diffuse light, the colour of specular light depends more on the light
source than on the illuminated surface. A red sphere lit by a white light has a white highlight.
Emissive light is light that appears to be coming from the object. In computer graphics,
we cannot actually construct objects that emit light; instead, we create the illusion that they
are emitting light by making their colour independent of the other lighting in the scene.
4.2 Material Properties
The function that we have been using, glColor, does not provide enough information for
lighting. In fact, it has no effect when lighting is enabled. Instead, we must define the
material properties of each surface by calling glMaterial{if}[v].
Each call to glMaterial has three arguments: a face, a property, and a value for that property.
Vector properties are usually passed by reference, as in this example:
GLfloat deepBlue[] = { 0.1, 0.5, 0.8, 1.0 };
glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, deepBlue);
The first argument must be one of GL_FRONT, GL_BACK, or GL_FRONT_AND_BACK. It determines
which face of each polygon the property will be applied to. The most common and
efficient form is GL_FRONT. Figure 20 gives possible values of the second argument and
corresponding default values of the third argument. Figure 21 provides examples of the use of
glMaterial.
The default specular colour is black, which means that objects have matte surfaces. To create
a shiny object, set GL_SPECULAR to white and specify the shininess as a single number between
0 and 128. The object will look as if it is made of plastic; creating other materials, such as
metals, requires a careful choice of values.
Parameter Meaning Default
GL_DIFFUSE diffuse colour of material (0.8, 0.8, 0.8, 1.0)
GL_AMBIENT ambient colour of material (0.2, 0.2, 0.2, 1.0)
GL_AMBIENT_AND_DIFFUSE ambient and diffuse
GL_SPECULAR specular colour of material (0.0, 0.0, 0.0, 1.0)
GL_SHININESS specular exponent 0.0
GL_EMISSION emissive colour of material (0.0, 0.0, 0.0, 1.0)
Figure 20: Parameters for glMaterialfv()
/* Data declarations */
GLfloat off[] = { 0.0, 0.0, 0.0, 0.0 };
GLfloat white[] = { 1.0, 1.0, 1.0, 1.0 };
GLfloat red[] = { 1.0, 0.0, 0.0, 1.0 };
GLfloat deep_blue[] = { 0.1, 0.5, 0.8, 1.0 };
GLfloat shiny[] = { 50.0 };
GLfloat dull[] = { 0.0 };
/* Draw a small, dark blue sphere with shiny highlights */
glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, deep_blue);
glMaterialfv(GL_FRONT, GL_SPECULAR, white);
glMaterialfv(GL_FRONT, GL_SHININESS, shiny);
glutSolidSphere(0.2, 10, 10);
/* Draw a large, red cube made of non-reflective material */
glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, red);
glMaterialfv(GL_FRONT, GL_SPECULAR, off);
glMaterialfv(GL_FRONT, GL_SHININESS, dull);
glutSolidCube(10.0);
/* Draw a white, glowing sphere */
glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, off);
glMaterialfv(GL_FRONT, GL_SPECULAR, off);
glMaterialfv(GL_FRONT, GL_SHININESS, dull);
glMaterialfv(GL_FRONT, GL_EMISSION, white);
glutSolidSphere(10.0, 20, 20);
Figure 21: Using glMaterial()
4.3 Light Properties
As mentioned above, OpenGL provides up to eight lights, each with its own properties. Since
lighting calculations must be performed for each light, using a large number of lights will slow
down the program. For most applications, it is best to use one or two lights only, to obtain
acceptable performance. However, the realism of a scene can be greatly enhanced by multiple
lights and there are occasions where a rich image is more important than fast animation.
Light properties are set by calling glLight{if}[v] with three arguments: the light (GL_LIGHT0,
GL_LIGHT1, etc.); the property name; and the property value. Figure 22 describes each prop-
erty and gives the default value for GL_LIGHT0. The default values for other lights are all
zero. This means that if you enable GL_LIGHT0 and do nothing else, you will see something;
but if you enable any other light, you won’t see anything unless you specify its properties.
A light has three of the four colour components: diffuse, ambient, and specular, but not
emissive. (We have seen why a surface should have these colours but it is not obvious why
a light needs them as well. We will discuss this later, in the theory part of the course.)
A light has a position specified in four-dimensional coordinates (x, y, z, w). The fourth coor-
dinate, w, has a special significance: if it is zero, the light is at infinity and the other three
Parameter Meaning Default
GL_DIFFUSE diffuse colour (1.0, 1.0, 1.0, 1.0)
GL_AMBIENT ambient colour (0.0, 0.0, 0.0, 1.0)
GL_SPECULAR specular colour (1.0, 1.0, 1.0, 1.0)
GL_POSITION position (0.0, 0.0, 1.0, 0.0)
GL_CONSTANT_ATTENUATION constant attenuation 1.0
GL_LINEAR_ATTENUATION linear attenuation 0.0
GL_QUADRATIC_ATTENUATION quadratic attenuation 0.0
GL_SPOT_CUTOFF cutoff angle of spotlight 180.0
GL_SPOT_DIRECTION direction of spotlight (0.0, 0.0, 1.0)
GL_SPOT_EXPONENT exponent of spotlight 0.0
Figure 22: Parameters for glLightfv()
coordinates give its direction; if it is 1, the light is at the position specified. For example,
the default position is (0, 0, 1, 0), which specifies a light in the positive Z direction (behind
the viewer in the default coordinate system) and infinitely far away. The position (−5, 1, 0, 1)
defines a light on the left, slightly raised, and in the Z plane of the viewer. If w = 0, we have
a directional light and, if w = 1, we have a positional light.
Once again, the choice of light position trades realism and efficiency. If the light is at infinity,
its rays are parallel and lighting computations are fast. If the light is local, its rays hit each
object at a different angle and lighting computations take longer.
The attenuation factors determine how the brightness of the light decreases with distance.
OpenGL computes attenuation with the formula

   a = 1 / (c + ℓ d + q d²)

in which a is the attenuation factor, d is the distance from the light to the object, and c,
ℓ, and q are the constant, linear, and quadratic attenuation coefficients, respectively. The
default values are c = 1, ℓ = 0, and q = 0. Clearly, ℓ and q must be zero for a directional
light because d = ∞; in practice, OpenGL ignores these values for directional lights.
Physics tells us that the intensity of light decreases as the inverse square of the distance
from the source and therefore suggests setting c = ℓ = 0 and giving q some non-zero value.
However, the inverse-square law applies only to point sources of light, which are rather rare
in everyday life. For most purposes, the default values, c = 1 and ℓ = q = 0, are adequate.
Giving non-zero values to ℓ and q may give somewhat more realistic effects but will make
your program slower.
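For example, a positional light with mild linear attenuation might be set up like this (the values are illustrative assumptions):

GLfloat pos[] = { 0, 2, 0, 1 };                     // w = 1: positional light
glLightfv(GL_LIGHT0, GL_POSITION, pos);
glLightf(GL_LIGHT0, GL_CONSTANT_ATTENUATION, 1.0);
glLightf(GL_LIGHT0, GL_LINEAR_ATTENUATION, 0.05);   // a = 1 / (1 + 0.05 d)
glLightf(GL_LIGHT0, GL_QUADRATIC_ATTENUATION, 0.0);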
Every light is a spotlight that emits light in a cone. The angle of the cone is set by
GL_SPOT_CUTOFF and its default value is 180°, which means that the light emits in all
directions. Changing the value of GL_SPOT_CUTOFF gives the effect of a light that emits in a
particular direction. For example, a value of 5° simulates a highly directional beam such as
the headlight of a car.
If you give GL_SPOT_CUTOFF a value other than 180°, you should also give appropriate values
to GL_SPOT_DIRECTION and GL_SPOT_EXPONENT. The direction is simply a vector specified by
the values (x, y, z). The exponent determines how focused the beam is. The light intensity at
an angle θ from the centre of the beam is cos^x θ, where x is the exponent value. The default
value of the exponent is 0; since cos^0 θ = 1, the illumination is even across the cone.
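As an illustration (the values are assumptions, not from the notes), a narrow headlight-style beam could be specified by:

GLfloat spotDir[] = { 0, 0, -1 };
glLightf(GL_LIGHT1, GL_SPOT_CUTOFF, 5);             // narrow cone
glLightfv(GL_LIGHT1, GL_SPOT_DIRECTION, spotDir);   // pointing along -Z
glLightf(GL_LIGHT1, GL_SPOT_EXPONENT, 2);           // concentrate light near the axis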
Parameter Meaning Default
GL_LIGHT_MODEL_AMBIENT ambient light intensity (0.2, 0.2, 0.2, 1.0)
GL_LIGHT_MODEL_LOCAL_VIEWER simulate a close viewpoint GL_FALSE
GL_LIGHT_MODEL_TWO_SIDE select two-sided lighting GL_FALSE
GL_LIGHT_MODEL_COLOR_CONTROL colour calculations GL_SINGLE_COLOR
Figure 23: Parameters for glLightModel()
4.4 Lighting Model
The lighting model determines overall features of the lighting. The lighting model is selected
by glLightModel{if}[v] which takes two arguments: a property and a value. Figure 23
shows the property names and their default values.
The ambient light in a scene is light that pervades the scene uniformly, coming from all
directions. A moderate quantity of ambient light, such as provided by the default setting,
makes the quality of light softer and ensures that every object is visible.
The other three properties allow you to choose between realism and speed. The local viewer
property determines the way in which OpenGL calculates specular reflections. (Roughly,
specular reflections come from shiny or glossy objects. We discuss it in more detail later.) If
the viewer is very distant, rays coming from the scene to the eye are roughly parallel, and
specular reflection calculations can be simplified by assuming that they actually are parallel
(GL_FALSE, the default setting). If we want to model lighting accurately from the point of
view of a close viewer, we have to make more detailed calculations by choosing GL_TRUE for
this parameter.
Two-sided lighting illuminates both sides of each polygon; single-sided lighting, the default,
illuminates only front surfaces and is therefore much faster. Suppose we are lighting a sphere:
all of the polygons face outwards and single-sided lighting is all we need. Suppose, however,
that we cut a hole in the sphere so that we can see inside. The inside of the sphere consists
of the back faces of the polygons and, in order to see them, we would need two-sided lighting.
As Figure 23 shows, the default colour calculation is GL_SINGLE_COLOR and it causes
OpenGL to calculate a single colour for each vertex. The call
glLightModel(GL_LIGHT_MODEL_COLOR_CONTROL, GL_SEPARATE_SPECULAR_COLOR);
makes OpenGL calculate two colours for each vertex. The two colours are used when texturing,
to ensure that textured objects are illuminated realistically.
4.5 Lighting in Practice
Lighting a scene is fairly straightforward; the hardest part is to get the light(s) in the right
position. Positioning is done in two steps. First, the position is defined as a value:
GLfloat pos[] = { 0, 0, 3, 1 };
Second, glLight is called with the property GL_POSITION:
glLightfv(GL_LIGHT0, GL_POSITION, pos);
When this call is executed, the position given by pos is transformed by the current model
view matrix. This can be a bit confusing, because the coordinate frame is moved by the model
view matrix and the light is positioned with respect to the new coordinate frame. You may
find it easier to set pos to (0, 0, 0, 1) and then move the frame to wherever you want the light.
Assuming that you have set pos as above, the following code in the display function will draw
a stationary object with a fixed light.
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt(0, 0, 5, 0, 0, 0, 0, 1, 0);
glLightfv(GL_LIGHT0, GL_POSITION, pos);
// Draw model
In the next version, the light rotates around the stationary object. Assume that angle is
continuously updated by the idle function.
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt(0, 0, 5, 0, 0, 0, 0, 1, 0);
glPushMatrix();
glRotatef(angle, 1, 0, 0);
glLightfv(GL_LIGHT0, GL_POSITION, pos);
glPopMatrix();
// Draw model
A third possibility is that you want the light to move with the viewer, as if you were watching
the scene with a miner’s light attached to your hardhat. For this purpose, it’s best to put
the light at the origin (where the camera is) and to set its position before doing any other
viewing transformation. The position is set by
GLfloat pos[] = { 0, 0, 0, 1 };
and the display function contains the code
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glLightfv(GL_LIGHT0, GL_POSITION, pos);
gluLookAt(0, 0, 5, 0, 0, 0, 0, 1, 0);
// Draw model
In each case, changing the fourth component of the light position to zero will give directional
(rather than positional) light, which is faster to compute but less realistic.
4.6 Normal Vectors
In order to perform lighting calculations, OpenGL needs to know the direction of the normal
at each point of the surface that it is rendering. The normal is a vector — usually a unit
vector — that “sticks out” of the surface at right angles to it. The normals of a sphere, for
example, would pass through the centre of the sphere if extended far enough backwards.
OpenGL requires a normal to be associated with each vertex. This might seem odd, because
normals belong to surfaces, not to vertexes. There are two ways of using normals.
• If we want to draw an object with clearly-distinguished faces, such as a cube, then each
vertex will have several normals associated with it. In the case of a cube, each corner
vertex will have three normals, one for each face.

• In order to create the illusion of a smooth surface, we compute the average of the normals
of the surfaces that meet at a vertex. For example, if three surfaces meet at a vertex,
and their normals are v1, v2, and v3, then the normal for that vertex is calculated as

   (v1 + v2 + v3) / ‖v1 + v2 + v3‖

The effect of this calculation is to smooth out the corners and edges of the object.
The function that sets a normal is glNormal3{bsidf}[v]() and it must be called before the
vertex it applies to. Once the normal is set, it can be applied to any number of vertexes. For
example, to draw a flat triangle in the XY -plane, we could execute:
glBegin(GL_TRIANGLES);
glNormal3i(0, 0, 1);
glVertex3f(-0.5, 0, 0);
glVertex3f(0.5, 0, 0);
glVertex3f(0, 0.866, 0);
glEnd();
Here are three ways of computing normals.
1. Normal to a triangle. The vector normal to a triangle with vertices (x1, y1, z1), (x2, y2, z2),
and (x3, y3, z3) is (a, b, c), where

   a = + | y2 − y1   z2 − z1 |
         | y3 − y1   z3 − z1 |

   b = − | x2 − x1   z2 − z1 |
         | x3 − x1   z3 − z1 |

   c = + | x2 − x1   y2 − y1 |
         | x3 − x1   y3 − y1 |
Figure 24 shows a simple function that computes the vector normal to the plane defined by
three points p, q, and r, chosen from an array of points.
2. Normal to a polygon. There is a simple algorithm, invented by Martin Newell, for finding
the normal to a polygon with N vertexes. For good results, the vertexes should lie
approximately in a plane, but the algorithm does not depend on this. If the vertexes have
coordinates (xi, yi, zi) for i = 0, 1, 2, ..., N−1, the normal n = (nx, ny, nz) is computed as

   nx = Σ0≤i<N (yi − yi+1)(zi + zi+1)
   ny = Σ0≤i<N (zi − zi+1)(xi + xi+1)
   nz = Σ0≤i<N (xi − xi+1)(yi + yi+1)

The subscript i+1 is computed “mod N”: if i = N−1, then i+1 = 0. The result n
must be divided by ‖n‖ = √(nx² + ny² + nz²) to obtain a unit vector. (A sketch of this
algorithm in C follows the list below.)
enum { X, Y, Z };
typedef float Point[3];
typedef float Vector[3];
Point points[MAX_POINTS];
void find_normal (int p, int q, int r, Vector v)
{
float x1 = points[p][X];
float y1 = points[p][Y];
float z1 = points[p][Z];
float x2 = points[q][X];
float y2 = points[q][Y];
float z2 = points[q][Z];
float x3 = points[r][X];
float y3 = points[r][Y];
float z3 = points[r][Z];
v[X] = + (y2-y1)*(z3-z1) - (z2-z1)*(y3-y1);
v[Y] = - (x2-x1)*(z3-z1) + (z2-z1)*(x3-x1);
v[Z] = + (x2-x1)*(y3-y1) - (y2-y1)*(x3-x1);
}
Figure 24: Computing normals
3. Normals for a square grid. A general-purpose formula can often be simplified for special
cases. Suppose that we are constructing a terrain using squares and that the X and
Y coordinates are integer multiples of the grid spacing, d. The height of the terrain at
x = i and y = j is zi,j. Figure 25 shows 9 points of the terrain, centered at zi,j. The
X coordinates in this view are i−1, i, and i+1, and the Y coordinates are j−1, j,
and j+1. The appropriate normal for the point (xi, yi) is the average of the normals
to the quadrilaterals A, B, C, and D. Using Newell’s formula to compute these four
normals and adding the resulting vectors gives a vector n with components:

   nx = d (zi−1,j+1 − zi+1,j+1 + 2zi−1,j − 2zi+1,j + zi−1,j−1 − zi+1,j−1)
   ny = d (−zi−1,j+1 − 2zi,j+1 − zi+1,j+1 + zi−1,j−1 + 2zi,j−1 + zi+1,j−1)
   nz = 8d²

Note that we do not need to include the factor d in the calculation of n, since a scalar
multiple does not affect the direction of a vector. The correct normal vector is then
obtained by normalizing n.
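Here is a sketch of Newell’s method (item 2 above) in C, using the Point and Vector types declared in Figure 24; the function itself is an illustration, not part of the original program.

void newellNormal (Point poly[], int n, Vector v)
{
   v[X] = v[Y] = v[Z] = 0;
   for (int i = 0; i < n; i++)
   {
      int j = (i + 1) % n;   // the subscript i+1, computed mod N
      v[X] += (poly[i][Y] - poly[j][Y]) * (poly[i][Z] + poly[j][Z]);
      v[Y] += (poly[i][Z] - poly[j][Z]) * (poly[i][X] + poly[j][X]);
      v[Z] += (poly[i][X] - poly[j][X]) * (poly[i][Y] + poly[j][Y]);
   }
   // v must still be normalized to obtain a unit vector
}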
The average of n normal vectors can be calculated by adding them and dividing by n. In
practice, the division by n is not usually necessary, because we can simply add the normal
vectors at a vertex and then normalize the resulting vector. Formally, the normalized average
vector of a set of vectors { (xi, yi, zi) | i = 1, 2, ..., n } is

   (X/S, Y/S, Z/S)

where X = x1 + ... + xn, Y = y1 + ... + yn, Z = z1 + ... + zn, and S = √(X² + Y² + Z²).
[The original figure shows nine terrain points on a 3 × 3 grid with spacing d, centered on the
point (i, j), and the four surrounding quadrilaterals labelled A, B, C, and D.]
Figure 25: Computing average normals on a square grid
Normalizing Vectors If you include the statement glEnable(GL_NORMALIZE) in your ini-
tialization code, OpenGL will normalize vectors for you. You have to do this if, for example,
you import a model with vertexes and normals pre-calculated and you then scale this model.
However, it is usually more efficient to normalize vectors yourself if you can.
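If you prefer to normalize vectors yourself, a small helper like the following sketch (using the Vector type and X, Y, Z constants of Figure 24) is all that is needed:

#include <math.h>

void normalize (Vector v)
{
   float s = sqrtf(v[X]*v[X] + v[Y]*v[Y] + v[Z]*v[Z]);
   if (s > 0)
   {
      v[X] /= s;
      v[Y] /= s;
      v[Z] /= s;
   }
}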
5 Special Effects
5.1 Blending
Blending is a very powerful and general feature of OpenGL and we will describe only a few
special cases of it. For complete details, see the “red book” (OpenGL Programming Guide,
Third Edition, by Mason Woo et al.).
The fourth component of a colour vector, usually referred to as “alpha” (α), makes the colour
partially transparent, allowing it to be “blended” with another colour.
Normally, when OpenGL has computed the colour of a vertex it stores the colour at the corre-
sponding pixel location unless the depth buffer information says that the vertex is invisible, in
which case the pixel is left unchanged. When blending is being used, the computed colour is
combined with the colour that is already at the pixel, and the new colour is stored. The new
colour is called the source and the existing colour at the pixel is called the destination. If

   source colour                = (Rs, Gs, Bs, As)
   source blending factors      = (Sr, Sg, Sb, Sa)
   destination colour           = (Rd, Gd, Bd, Ad)
   destination blending factors = (Dr, Dg, Db, Da)

then the final colour of the pixel is

   (Rs Sr + Rd Dr, Gs Sg + Gd Dg, Bs Sb + Bd Db, As Sa + Ad Da)
The order in which we draw opaque objects does not usually matter much, because the depth
buffer takes care of hidden surfaces. With blending, however, the order is important, because
the order in which OpenGL processes the source and destination colours affects the result.
Blending is enabled by calling glEnable(GL_BLEND) and disabled by calling glDisable(GL_BLEND).
The blending process is determined by calls to
glBlendFunc(GLenum src, GLenum dst );
There are many possible values for these two arguments. Their uses are suggested by the
following examples.
• To blend two images: draw the first image with src = GL_ONE and dst = GL_ZERO.
Then set α = 0.5 and draw the second image with src = GL_SRC_ALPHA and dst =
GL_ONE_MINUS_SRC_ALPHA.

• To achieve a “painting” effect, in which each brush stroke adds a little more colour,
use α = 0.1 and draw each brush stroke with src = GL_SRC_ALPHA and dst =
GL_ONE_MINUS_SRC_ALPHA.
Here are some extracts from a program that achieves a glass-like effect by blending. During
initialization, call
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
Set up various colours. Note the use of α values less than 1. The vector glass is not
const because the program allows the transparency of glass to be changed.
GLfloat glass[] = { 0.4f, 0.4f, 0.4f, 0.6f };
const GLfloat blue[] = { 0.2f, 0.2f, 1.0f, 0.8f };
const GLfloat white[] = { 1.0, 1.0, 1.0, 1.0 };
const GLfloat polished[] = { 100.0 };
In the display function, display the more opaque object first, then the transparent object.
glPushMatrix();
glTranslatef(1.0, 0.0, 0.0);
glRotatef(45.0, 1.0, 0.0, 0.0);
glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, blue);
glMaterialfv(GL_FRONT, GL_SPECULAR, white);
glMaterialfv(GL_FRONT, GL_SHININESS, polished);
glutSolidIcosahedron();
glPopMatrix();
glPushMatrix();
glRotatef(30.0, 0.0, 1.0, 0.0);
glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, glass);
glMaterialfv(GL_FRONT, GL_SPECULAR, white);
glMaterialfv(GL_FRONT, GL_SHININESS, polished);
glutSolidCube(3.0);
glPopMatrix();
See also (Hill Jr. 2001, pages 545–549).
5.2 Fog
Fog is an easy effect to create and it can be quite useful. A common problem that occurs in
creating landscapes is that the edge of the terrain looks like the end of the world rather than
a smooth horizon; we can use fog to hide such anomalies.
Fog is actually a special case of blending. The fog effect is obtained by blending the desired
colour of a vertex with the fog colour. The degree of blending is determined by the distance
of the vertex from the viewer. OpenGL provides three modes: in Figure 26, the left column
shows the modes and the right column shows f, the “fog factor”.
To use fog, you have to call glEnable(GL_FOG) and then set the parameters in the formulas
by calling glFog:
glFog{if}[v](GLenum param, TYPE value);
Figure 27 shows the values of the arguments of glFog. As the formulas show, you set
GL_FOG_DENSITY for modes GL_EXP and GL_EXP2, and you set GL_FOG_START and GL_FOG_END
for mode GL_LINEAR. The default mode is GL_EXP.
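For example, grey linear fog between 10 and 100 units from the viewer could be set up as follows (the colour and distances are illustrative assumptions):

GLfloat fogColour[] = { 0.7, 0.7, 0.7, 1.0 };
glEnable(GL_FOG);
glFogi(GL_FOG_MODE, GL_LINEAR);
glFogfv(GL_FOG_COLOR, fogColour);
glFogf(GL_FOG_START, 10.0);
glFogf(GL_FOG_END, 100.0);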
You can control the efficiency of fog generation by providing hints. If you call
glHint(GL_FOG_HINT, GL_NICEST);
then OpenGL will calculate fog for every pixel. If you call
glHint(GL_FOG_HINT, GL_FASTEST);
GL_LINEAR   f = (end − z) / (end − start)
GL_EXP      f = e^(−d·z)
GL_EXP2     f = e^(−(d·z)²)
Figure 26: Fog Formulas
param value
GL_FOG_MODE GL_LINEAR, GL_EXP, or GL_EXP2
GL_FOG_DENSITY d
GL_FOG_START start
GL_FOG_END end
GL_FOG_COLOR colour
Figure 27: Parameters for glFog
then OpenGL will calculate fog for every vertex, which is usually faster but doesn’t look as
nice. If you want OpenGL to decide which mode to use by itself, you write
glHint(GL_FOG_HINT, GL_DONT_CARE);
Naturally, you can call glDisable(GL_FOG) to turn the fog effect off.
5.3 Reflection
Reflection is one of several effects that you can obtain with the stencil buffer. A “stencil”
is a plastic sheet with holes cut into it. The holes have particular shapes — for example, an
architect’s stencil has shapes of furniture — and the stencil is used as a guide when drawing
those shapes. A stencil in a graphics program is an area of the window that is used to draw
something different from the main image.
Stencils can be used for a variety of effects. The following extracts are from a program that
draws a scene and its reflection in a mirror. Using a stencil for the mirror allows us to draw
a scene in which objects in the mirror are transformed differently from other objects.
During initialization:
glClearStencil(0);
glEnable(GL_STENCIL_TEST);
Define a mirror in a plane normal to the X-axis. There are two ways of drawing the mirror:
if p is true, draw a filled quadrilateral or, if p is false, draw a hollow outline.
void mirror (bool p)
{
if (p)
glBegin(GL_QUADS);
else
glBegin(GL_LINE_LOOP);
glVertex3f(cmx, cmy - 0.5, cmz - 2.0);
glVertex3f(cmx, cmy - 0.5, cmz + 2.0);
glVertex3f(cmx, cmy + 0.5, cmz + 2.0);
glVertex3f(cmx, cmy + 0.5, cmz - 2.0);
glEnd();
}
Display the scene like this. First, store the shape of the mirror in the stencil buffer.
glClear(GL_STENCIL_BUFFER_BIT);
glStencilFunc(GL_ALWAYS, 1, 1);
glStencilOp(GL_REPLACE, GL_REPLACE, GL_REPLACE);
mirror(true);
As usual, clear the colour buffer and depth buffer bits:
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
Draw the mirror frame:
glColor3f(0.7f, 0.7f, 0.7f);
mirror(false);
Draw the scene outside the mirror:
glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);
glStencilFunc(GL_NOTEQUAL, 1, 1);
scene();
Finally, draw the reflected scene in the mirror. To obtain the mirror image, translate to the
centre of the mirror, reflect the scene in the X-axis (remember that the plane of the mirror is
normal to the X axis), and then reverse the translation.
glStencilFunc(GL_EQUAL, 1, 1);
glTranslatef(cmx, cmy, cmz);
glScalef(-1.0, 1.0, 1.0);
glTranslatef(-cmx, -cmy, -cmz);
scene();
5.4 Display Lists
Display lists do not provide any new graphical features; they are used to improve the per-
formance of graphics programs. The idea is to perform as many calculations as possible and
store them, instead of performing the calculations every time the scene is displayed.
The following code illustrates the use of display lists. To create a list:
GLuint pic = glGenLists(1);
glNewList(pic, GL_COMPILE);
// draw the picture
glEndList();
The argument given to glGenLists specifies the number of lists, N, that we want. glGenLists
returns the first number, f, in a range. When we call glNewList, the first argument must be
a number ℓ in the range f ≤ ℓ < f + N.
The second argument given to glNewList is either GL_COMPILE, if we want to do the cal-
culations only, or GL_COMPILE_AND_EXECUTE if we want to do the calculations and draw the
picture.
After creating a list, we draw the picture that it contains by calling
glCallList(pic);
where pic is the same number that we gave to glNewList.
A display list can include transformations (translate, rotate, scale) and drawing functions. As
a rule of thumb, you can do something in a display list if it would make sense to do the same
thing in the display function. However, the values used when the display list is created are
fixed. For example, you could include a rotation
glRotatef(angle, 0, 1, 0);
in a display list, but the current value of angle would be “frozen” in the list; changing the
value of angle would have no effect on the stored image.
A single list can be used many times. For example, if you created a display list with index
person, you could create a crowd scene like this (assume that rdm returns a random floating
point value in a suitable range):
for (p = 0; p < CROWDSIZE; p++)
{
   glPushMatrix();
   glTranslatef(rdm(), 0, rdm());
   glRotatef(rand() % 360, 0, 1, 0);
   glCallList(person);
   glPopMatrix();
}
Each person would be translated to some point in the XZ plane — but not levitated, since
there is no Y component — and rotated by a random amount. A problem with this simple
algorithm is that some people might overlap with others.
You can create hierarchical display lists. That is, you can use glCallList between glNewList
and glEndList, provided that the list you call has already been defined. The following code
is valid provided that the index couple has been allocated by glGenLists:
glNewList(couple, GL_COMPILE);
glCallList(person);
glTranslatef(5, 0, 0);
glCallList(person);
glEndList();
As usual, it is highly advisable to use glPushMatrix and glPopMatrix to ensure that calling
a list does not change the frame of reference.
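For example, the couple list above could be made safe by bracketing its contents with a push and a pop, so that calling it leaves the frame of reference unchanged:

glNewList(couple, GL_COMPILE);
glPushMatrix();
glCallList(person);
glTranslatef(5, 0, 0);
glCallList(person);
glPopMatrix();
glEndList();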
Figure 28: Two three-point Bézier curves with their control points
5.5 Bézier Curves and Surfaces
We can do a limited amount of modelling with the built-in objects provided by GLUT (Section 1.5.2)
and quadrics (Section 1.5.3), but we need more general techniques to build complex
models. Bézier formulas¹ are an example of such a technique.
It might seem that we can use any polynomial (or even function) to generate a curve. This is
not so, for the following reason. We would like to represent the modelling function using only
a small number of parameters. For example, we can represent a second-degree polynomial
ax² + bx + c with just three numbers, a, b, and c. More practically, we could represent the
function by a small collection of points that it passes through or that control its direction.
We also want to perform graphics transformations on these points without changing the curve
or surface generated. Consequently, Bézier and other formulas have the following important
property: the relationship between the control points and the generated curve or
surface is unaffected by affine transformations.
5.5.1 Curves
We consider curves in 2D space first because they are easier to understand than surfaces in
3D space. A Bézier curve is defined by a set of control points. The curve starts at the
first point and ends at the last point, but it does not pass through the intermediate points.
Figure 28 shows two Bézier curves, each with three control points. Tangents to the curve at
its end points pass through the middle control point.
A general Bézier curve can have any number of control points. OpenGL provides evaluators
to draw Bézier curves. The minimal steps are as follows:
• Define an array of control points.
• During initialization, pass information about control points to an evaluator and enable
the evaluator.
• In the display function, compute the points that you need.
The evaluator defines a parametric curve: that is, points on the curve depend on values of
a parameter, u. It is often convenient to allow u to vary from 0 to 1 but OpenGL does not
require this.
Here is a simple example that generates a four-point Bézier curve.
¹Bézier formulas were in fact invented by two researchers, both car designers: Pierre Bézier (1910–1999)
at Renault found the formulas and Paul de Casteljau at Citroën developed an algorithm for calculating the
coefficients.
Parameter Meaning
GL_MAP1_VERTEX_3 (x, y, z) coordinates
GL_MAP1_VERTEX_4 (x, y, z, w) coordinates
GL_MAP1_COLOR_4 (r, g, b, a) colour values
GL_MAP1_NORMAL (x, y, z) normal direction
Figure 29: Control parameters for Bézier curves
• Define an array with 4 3D points. The points lie in the XY plane.

   GLfloat pts[4][3] = { -4, -4, 0, -2, 4, 0, 2, -4, 0, 4, 4, 0 };

• Define the evaluator:

   glMap1f(GL_MAP1_VERTEX_3, 0, 1, 3, 4, &pts[0][0]);

The first argument determines the type of the control points (see below). The next two
arguments specify the range of the parameter u: in this example, 0 ≤ u ≤ 1. The
argument “3” is the stride: it tells the function how many floating-point values to step
over between each point. The argument “4” is the order of the spline, which is equal
to the number of points specified. The last argument is the address of the first control
point.

• Enable the evaluator:

   glEnable(GL_MAP1_VERTEX_3);

• In the display function, draw the curve as a sequence of line segments:

   glBegin(GL_LINE_STRIP);
   for (int i = 0; i <= 50; i++)
      glEvalCoord1f((GLfloat) i / 50.0);
   glEnd();

• The same effect can be achieved by calling:

   glMapGrid1f(50, 0.0, 1.0);
   glEvalMesh1(GL_LINE, 0, 50);
An evaluator can generate vertex coordinates, normal coordinates, colour values, or texture
coordinates. Figure 29 shows some of the possible values.
5.5.2 Surfaces
Generating a Bézier surface is similar, but two parameters, u and v, are needed for the surface
coordinates, and a rectangular array of control points is required. The functions are specified
below in a general way rather than by specific example as above.
glMap2{fd}(target, u1, u2, ustride, uorder, v1, v2, vstride, vorder, points);
where

   target  = the control parameter: as Figure 29 but with MAP2
   u1      = minimum value for u
   u2      = maximum value for u
   ustride = address difference between successive u values
   uorder  = number of u values
   v1      = minimum value for v
   v2      = maximum value for v
   vstride = address difference between successive v values
   vorder  = number of v values
   points  = address of the first point
For example, suppose we define the control points with

   GLfloat pts[4][4][3] = { ... };

and the points specify vertices. Assume that we want 0 ≤ u ≤ 1 and 0 ≤ v ≤ 1. Then we
would call:

   glMap2f(GL_MAP2_VERTEX_3, 0, 1, 3, 4, 0, 1, 12, 4, &pts[0][0][0]);

because successive u entries are 3 floats apart and successive v entries are 12 floats apart.
To obtain a point on the surface, call
glEvalCoord2f(u, v);
where (u, v) are the surface coordinates. The vertexes can be calculated four at a time and
drawn as GL_QUADS to obtain a surface.
Alternatively, you can use the grid and mesh functions to draw the entire surface. An impor-
tant advantage of using these functions is that they generate normals.
glMapGrid2{fd}(nu, u1, u2, nv, v1, v2);

where

   nu = number of u control points to evaluate
   u1 = minimum value for u
   u2 = maximum value for u
   nv = number of v control points to evaluate
   v1 = minimum value for v
   v2 = maximum value for v

glEvalMesh2(mode, i1, i2, j1, j2);

where mode is one of GL_POINT, GL_LINE, or GL_FILL; i1 and i2 specify the range of u values;
and j1 and j2 specify the range of v values.
Although there can be any number of control points in principle, using very large numbers
can be problematic. The functions generate polynomials of high degree that require time to
compute and may be unstable.
Figure 30 shows a concrete example of Bézier surface generation. The shape generated is
one side of the body of an aircraft; the surface is rendered twice to obtain both sides of the
aircraft. Figure 31 shows the 3D coordinates generated by this code.
See also (Hill Jr. 2001, Chapter 11).
glEnable(GL_MAP2_VERTEX_3);
glEnable(GL_AUTO_NORMAL);
setMaterial(METAL);
// Fuselage
const int fuWidth = 4;
const int fuLength = 6;
const int fuLoops = 20;
const int fuSlices = 20;
const GLfloat fuShapeFactor = 0.9f;
GLfloat fuPoints[fuLength][fuWidth][3];
struct { GLfloat len; GLfloat size; } fuParameters[fuLength] =
{
{ -10, 0 },
{ -9.6f, 1.4f },
{ -9, 1.6f },
{ 8, 1.4f },
{ 9.9f, 1 },
{ 10, 0 }
};
for (int p = 0; p < fuLength; p++)
{
for (int y = 0; y < fuWidth; y++)
fuPoints[p][y][2] = fuParameters[p].len;
fuPoints[p][0][0] = 0;
fuPoints[p][1][0] = fuParameters[p].size;
fuPoints[p][2][0] = fuParameters[p].size;
fuPoints[p][3][0] = 0;
fuPoints[p][0][1] = - fuShapeFactor * fuParameters[p].size;
fuPoints[p][1][1] = - fuShapeFactor * fuParameters[p].size;
fuPoints[p][2][1] = fuShapeFactor * fuParameters[p].size;
fuPoints[p][3][1] = fuShapeFactor * fuParameters[p].size;
}
glMap2f(GL_MAP2_VERTEX_3,
   0, 1, 3, fuWidth,
   0, 1, 3 * fuWidth, fuLength,
   &fuPoints[0][0][0]);
glMapGrid2f(fuLoops, 0, 1, fuSlices, 0, 1);
glEvalMesh2(GL_FILL, 0, fuLoops, 0, fuSlices);
glScalef(-1, 1, 1);
glEvalMesh2(GL_FILL, 0, fuLoops, 0, fuSlices);
Figure 30: Using Bézier surfaces for the body of a plane
0.00 0.00 10.00 0.00 0.00 10.00 0.00 0.00 10.00 0.00 0.00 10.00
0.00 1.26 9.60 1.40 1.26 9.60 1.40 1.26 9.60 0.00 1.26 9.60
0.00 1.44 9.00 1.60 1.44 9.00 1.60 1.44 9.00 0.00 1.44 9.00
0.00 1.26 8.00 1.40 1.26 8.00 1.40 1.26 8.00 0.00 1.26 8.00
0.00 0.90 9.90 1.00 0.90 9.90 1.00 0.90 9.90 0.00 0.90 9.90
0.00 0.00 10.00 0.00 0.00 10.00 0.00 0.00 10.00 0.00 0.00 10.00
Figure 31: Points generated by the code of Figure 30
void menu(int code)
{
cout << "Menu selection: " << code << endl;
}
void initMenus()
{
int sub = glutCreateMenu(menu);
glutAddMenuEntry("Orange", 5);
glutAddMenuEntry("Pear", 6);
glutAddMenuEntry("Quince", 7);
glutAddMenuEntry("Raspberry", 8);
glutCreateMenu(menu);
glutAddMenuEntry("Apple", 1);
glutAddMenuEntry("Banana", 2);
glutAddMenuEntry("Carrot", 3);
glutAddMenuEntry("Damson", 4);
glutAddSubMenu("More...", sub);
glutAttachMenu(GLUT_RIGHT_BUTTON);
}
Figure 32: Functions for menu callback and creation
5.6 Menus
GLUT provides menus: the menus are not very beautiful but they have the advantage that
they are easy to create. Figure 32 shows the idea. The function initMenus is called once only
and sets up the menus. The callback function menu is called whenever the user makes a menu
selection. The argument passed to menu depends on the selection: if the user selects Orange,
the value passed is 5, and so on.
It is also easy to create sub-menus and to attach them to the main menu. In Figure 32, the
sub-menu sub displays the four entries with codes 5, 6, 7, and 8, and it appears as the fifth
option on the main menu, which handles the codes 1, 2, 3, and 4.
5.7 Text
GLUT provides text in two forms:
• Bit-mapped characters are displayed in the plane of the screen in a fixed orientation; and
• stroked characters are 3D objects that can be drawn anywhere in the model and can be
scaled and rotated.
The call
glRasterPos(x, y);
sets the initial position for a bit-mapped string. The coordinates (x, y) are model coordinates
and transformations (e.g., glTranslatef) are applied to them.
To write one character, c, and set the raster position for the next character, call
glutBitmapCharacter(GLUT_BITMAP_TIMES_ROMAN_24, c);
The call
glutStrokeCharacter(font, c);
draws a character, c, in the given font, which should be either GLUT_STROKE_MONO_ROMAN (fixed
width) or GLUT_STROKE_ROMAN (proportional width). Since the height of a character is about 120
units, you may want to scale them down to suit your model. glutStrokeCharacter applies
a translation to the right, so that a sequence of calls displays a string with correct spacing
between letters.
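As a sketch of how these calls combine (the string, raster position, and scale factor below are arbitrary examples, not values from the notes):

void drawLabels(void)
{
    const char *msg = "Hello";
    const char *p;

    /* Bit-mapped text: fixed size, drawn at the current raster position. */
    glRasterPos2f(-0.9f, 0.9f);
    for (p = msg; *p; p++)
        glutBitmapCharacter(GLUT_BITMAP_TIMES_ROMAN_24, *p);

    /* Stroked text: 3D geometry; scale down because glyphs are about 120 units high. */
    glPushMatrix();
    glScalef(0.01f, 0.01f, 0.01f);
    for (p = msg; *p; p++)
        glutStrokeCharacter(GLUT_STROKE_ROMAN, *p);  /* moves right after each glyph */
    glPopMatrix();
}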
5.8 Other Features of OpenGL
There are a number of other features of OpenGL that will not be discussed in detail here.
Instead, we describe the idea and omit details of the implementation.
5.8.1 Textures
A texture is a 1D or, more usually, a 2D pattern that is “wrapped” onto an object. The object
may be a plane, such as the side of a cube, or a curved surface such as a cylinder or sphere.
OpenGL needs two pieces of information in order to apply a texture: the texture image, and
the mapping of texture coordinates to object coordinates. The coordinate mapping is simple
in the case of a plane or cylinder, but more complicated in the case of a sphere or other object.
We have seen that when an object is drawn by the low-level functions, the user must provide
the coordinates and normal direction of each vertex. When textures are used, texture coordi-
nates must be provided as well. For the “standard objects”, texture coordinates are provided
by the corresponding functions. Consequently, if you want to texture a sphere or a teapot,
all you have to do is to tell OpenGL where to find the texture data. This data must be stored
as an RGB array: OpenGL does not handle BMP or JPG files directly.
OpenGL provides a lot of support for textures: the chapter on textures in the “Red Book”
(Woo, Neider, Davis, and Shreiner 2000) occupies 77 pages! In this section, we provide just
a brief overview of simple texturing techniques. Here are the key steps.
1. Create a texture object. This is simply an array of data corresponding to a 1D, 2D,
or 3D image. Each pixel is represented by anything from one to four bytes. The most
common case is a 2D texture with RGBA values (1 byte each) at each pixel. The
dimensions of the array must be powers of 2. Consequently, a 2^m × 2^n image will occupy
4 × 2^m × 2^n bytes of memory.
You can generate the texture yourself, by calculation, or you can read data from a
file. Pictures usually come in an encoded format, such as .jpg or .bmp, and must
be converted to raw binary form before OpenGL can use them. Utility programs for
performing the conversions can be downloaded from the internet.
2. You have to tell OpenGL how the texture is to be applied. The most common cases are:
Replace: the final colour in the scene is the texture colour.
Modulate: the texture provides a value that changes the current colour. This mode is
suitable if the texture is being used as a shadow, for example.
Blend: the original colour and the texture colour are combined according to a blending
function.
3. Enable texture mapping by executing
glEnable(GL_TEXTURE_2D);
The available constants are GL_TEXTURE_1D, GL_TEXTURE_2D, and GL_TEXTURE_3D, de-
pending on the dimensionality of the texture. If more than one dimension is enabled,
the higher one is used.
4. Draw the scene, providing texture coordinates for each vertex:
glTexCoord3f(....);
glVertex3f(....);
OpenGL uses 0 and 1 as the limits of the texture in each direction. This means that,
if the texture coordinates are between 0 and 1, OpenGL will use a value within the
texture. If the texture coordinates are outside this range, then what happens depends
on the texturing mode. If you use GL_REPEAT, then the texture will be repeated as often
as necessary. For example, if you texture a square using coordinates that run from 0 to
5, you will obtain 5 × 5 = 25 copies of the texture. This technique is useful for rendering
tiles on a floor, for example.
If you use GL_CLAMP, the borders of the texture will be extended as far as necessary.
The following example shows the steps required to apply a simple texture. It is taken from
the “red book” (Woo, Neider, Davis, and Shreiner 2000). Like all programs in the “red book”,
this program is written in C rather than C++.
The first step is to create a texture. This texture is computed; the more common case is
for the texture to be read from a file and perhaps converted to the appropriate format for
OpenGL. Note that the size of the image is 2^6 × 2^6 = 64 × 64.
#define checkImageWidth 64
#define checkImageHeight 64
static GLubyte checkImage[checkImageHeight][checkImageWidth][4];
void makeCheckImage(void)
{
int i, j, c;
for (i = 0; i < checkImageHeight; i++) {
for (j = 0; j < checkImageWidth; j++) {
c = ((((i&0x8)==0)^((j&0x8))==0))*255;
checkImage[i][j][0] = (GLubyte) c;
checkImage[i][j][1] = (GLubyte) c;
checkImage[i][j][2] = (GLubyte) c;
checkImage[i][j][3] = (GLubyte) 255;
}
}
}
The texture object must have a “name”, which is actually a small integer. The initialization
function constructs the texture image and then:
• glPixelStorei tells OpenGL that the texture data is aligned on a one-byte boundary.
• glGenTextures obtains one name for the texture and stores it in texName.
• glBindTexture tells OpenGL that texName will be a 2D texture.
• The position on the texture is defined by coordinates (s, t). The various calls to
glTexParameteri instruct OpenGL to repeat the texture in both dimensions and to use
GL_NEAREST for both magnification and minification filters.
• glTexImage2D passes information about the texture to OpenGL, including the address
of the texture itself.
static GLuint texName;
void init(void)
{
glClearColor (0.0, 0.0, 0.0, 0.0);
glShadeModel(GL_FLAT);
glEnable(GL_DEPTH_TEST);
makeCheckImage();
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
glGenTextures(1, &texName);
glBindTexture(GL_TEXTURE_2D, texName);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, checkImageWidth, checkImageHeight,
0, GL_RGBA, GL_UNSIGNED_BYTE, checkImage);
}
The display function enables 2D texturing and specifies GL_DECAL mode for the texture. It then
displays two squares, one in the XY plane and the other at an angle to it.
void display(void)
{
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
glEnable(GL_TEXTURE_2D);
glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_DECAL);
glBegin(GL_QUADS);
glTexCoord2f(0.0, 0.0); glVertex3f(-2.0, -1.0, 0.0);
glTexCoord2f(0.0, 1.0); glVertex3f(-2.0, 1.0, 0.0);
glTexCoord2f(1.0, 1.0); glVertex3f(0.0, 1.0, 0.0);
glTexCoord2f(1.0, 0.0); glVertex3f(0.0, -1.0, 0.0);
glTexCoord2f(0.0, 0.0); glVertex3f(1.0, -1.0, 0.0);
glTexCoord2f(0.0, 1.0); glVertex3f(1.0, 1.0, 0.0);
glTexCoord2f(1.0, 1.0); glVertex3f(2.41421, 1.0, -1.41421);
glTexCoord2f(1.0, 0.0); glVertex3f(2.41421, -1.0, -1.41421);
glEnd();
glDisable(GL_TEXTURE_2D);
}
The remainder of the program is conventional OpenGL.
This basic pattern can be varied in a number of ways.
• The function glTexImage2D should be replaced by glTexImage1D for 1D textures or by
glTexImage3D for 3D textures.
• The internal format is GL_RGBA in the call to glTexImage2D, indicating that each
texel consists of four bytes containing red, green, blue, and alpha values. There are 37
other constants, each specifying different storage conventions.
• In the call to glTexEnvf, the final argument determines how the texture is applied. It
can be GL_DECAL, GL_REPLACE, GL_MODULATE, or GL_BLEND. When GL_BLEND is used, the
blending function must also be set to an appropriate value.
• It is inefficient to use the full detail of a texture if the image of the texture on the
screen is very small. To avoid this inefficiency, a number of texture images are stored at
different levels of detail, and OpenGL selects the appropriate image for the application
(this is called mipmapping). You can ask OpenGL to compute mipmaps or you can
provide your own (a sketch follows this list).
5.8.2 NURBS
NURBS are Non-Uniform Rational B-Splines: curves generated from sets of control points.
“Non-uniform” means that the points do not have to be evenly spaced; “rational” means that
the equations have the form P(x, y)/Q(x, y), where P and Q are polynomials; a “spline” is a
continuous curve formed from a set of curves; and “B” stands for “basis”, where the “basis”
is a set of functions that is suitable for building spline curves.
In OpenGL, NURBS are built from Bézier curves; in fact, the NURBS functions form a high-
level interface to the functions that we have already seen in Section 5.5.
5.8.3 Antialiasing
You have probably noticed at one time or another that lines on a display do not have smooth
edges. The effect is particularly pronounced for lines that are almost parallel to one of the
axes: such lines look like staircases instead of straight lines. This phenomenon is called
aliasing and it is due to the finite size of the pixels on the screen. The same phenomenon
makes the edges of squares and rectangles look jagged when the edges are not parallel to the
axes.
The avoidance of aliasing effects is called antialiasing. There are various ways of imple-
menting antialiasing, and OpenGL provides a few of them.
Antialiasing Lines and Points A simple method of reducing aliasing effects is to adjust
the colour of pixels close to the line by blending. The default way of rendering a line is to
divide pixels into two classes: pixels “on” the line and pixels “not on” the line. A better way
is to decide, for each pixel, how much it contributes to the line, and to use this quantity for
blending. To achieve antialiasing in this way, your program should execute the statements
glEnable(GL_LINE_SMOOTH);
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
before drawing lines or points.
Antialiasing Polygons The above technique is not very helpful because we are usually
more interested in drawing polygons than in drawing points and lines. We can use a similar
technique, but it works well only if the polygons are drawn in order of their distance from
the viewer, with close polygons first and distant polygons last. Sorting the polygons, unfortunately,
is a non-trivial task. Sorting can be omitted, but the results are not as good. Execute
the code
glEnable(GL_POLYGON_SMOOTH);
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA_SATURATE, GL_ONE);
before drawing polygons.
Jittering Another way of reducing aliasing effects is to render the scene several times in
slightly different places. The movements should be very small, usually less than one pixel,
and the technique is called jittering. A sketch of jittering with the accumulation buffer
follows.
As usual, OpenGL coding involves trade-offs. An antialiased scene will look better than a scene
with aliasing effects, but it will take longer to render. If jittering is used, it may take much
longer to render.
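A minimal sketch of jittering, assuming the program requested an accumulation buffer (GLUT_ACCUM in glutInitDisplayMode) and that drawScene and jitteredProjection are the application's own functions (both names are ours):

void displayJittered(void)
{
    const int N = 8;                       /* number of jittered passes */
    int i;
    glClear(GL_ACCUM_BUFFER_BIT);
    for (i = 0; i < N; i++)
    {
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        jitteredProjection(i);             /* projection offset by less than one pixel */
        drawScene();
        glAccum(GL_ACCUM, 1.0f / N);       /* add this pass to the accumulation buffer */
    }
    glAccum(GL_RETURN, 1.0f);              /* copy the average back to the colour buffer */
    glutSwapBuffers();
}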
5.8.4 Picking
A question that is often asked is “how do I select an object with the mouse?” In general,
this is hard to do, because the coordinates we use to draw the model have a complicated
relationship with the coordinates of the mouse relative to the window. Furthermore, in a
3D scene, there may be several objects that correspond to a single mouse position, one being
behind another. The solution that OpenGL provides is a special rendering mode.
Here is what happens (a sketch follows the list):
• The user provides each object of interest with a “name” (in fact, the name is an integer).
• The user defines a small region of the screen. Typically, this is a rectangle that includes
the mouse position and a few pixels either side.
• OpenGL then renders the scene in selection mode. Whenever a named object is displayed
inside the selected region, a “hit record”, containing its name and some other information,
is added to the “selection list”.
• The user examines the hit records to determine the object — or objects — selected.
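The following sketch shows the usual sequence of calls, assuming mouse coordinates (x, y) from a GLUT mouse callback and an application function drawNamedScene that calls glLoadName(id) before drawing each pickable object (both are assumptions, not code from the notes):

#define PICK_BUFFER_SIZE 64

GLuint pickBuffer[PICK_BUFFER_SIZE];

int pick(int x, int y)
{
    GLint viewport[4];
    glGetIntegerv(GL_VIEWPORT, viewport);

    glSelectBuffer(PICK_BUFFER_SIZE, pickBuffer);
    glRenderMode(GL_SELECT);               /* switch to selection mode */
    glInitNames();
    glPushName(0);                         /* dummy name, replaced by glLoadName */

    glMatrixMode(GL_PROJECTION);
    glPushMatrix();
    glLoadIdentity();
    /* restrict drawing to a 5 x 5 pixel region around the mouse */
    gluPickMatrix((GLdouble) x, (GLdouble) (viewport[3] - y), 5.0, 5.0, viewport);
    gluPerspective(30.0, 1.0, 1.0, 100.0); /* must match the normal projection */
    glMatrixMode(GL_MODELVIEW);

    drawNamedScene();                      /* calls glLoadName(id) for each object */

    glMatrixMode(GL_PROJECTION);
    glPopMatrix();
    glMatrixMode(GL_MODELVIEW);

    return glRenderMode(GL_RENDER);        /* returns the number of hit records */
}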
5.8.5 Error Handling
OpenGL never prints a message, raises an exception, or causes your program to crash. However,
that doesn’t mean that nothing can go wrong: many operations cause errors and it is your
responsibility to discover that they have occurred. If you are using an OpenGL feature and
it doesn’t seem to be working properly, or is not working at all, it is possible that you have
performed an invalid operation. To find out, call glGetError with no arguments: the value
returned is the current error code. You can pass the value to gluErrorString to obtain an
intelligible message:
GLenum error = glGetError();
if (error != GL_NO_ERROR)
cout << "GL error: " << gluErrorString(error) << endl;
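Since OpenGL records error flags until they are read, and more than one flag may be set, it can be worth draining them in a loop (a small sketch, not from the notes):

GLenum e;
while ((e = glGetError()) != GL_NO_ERROR)
    cout << "GL error: " << gluErrorString(e) << endl;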
5.9 Program Development
Most of the techniques for program development work for graphics programs but there are a
few additional points to note.
• Incremental development works best. Start by getting something simple in the graphics
window and then work on refining it.
• The “blank window” problem occurs often. To avoid it, start with a simple model at
the origin and straightforward projections. When you can see something, elaborate the
program in small steps.
• Don’t do everything at once. Get the shapes right before working on colour and lighting.
• Use “graphical debugging”. For example, a function that draws a set of axes (e.g., as
three coloured lines) can be very useful for finding out where you are in the model.
• During the early stages, having a mouse function that applies simple movements — for
example, rotations about two axes — can be very helpful.
6 Organization of a Graphics System
6.1 The Graphics Pipeline
The path from vertex definition to visible pixel is long and complicated. When we are using
a high-level API such as OpenGL, it is not necessary to understand all of the details of this
path. Nevertheless, a rough idea of the process is helpful because it enables us to avoid obvious
mistakes in graphics programming.
There is a diagram called The OpenGL Machine that describes the way in which OpenGL
processes data. A particular version of OpenGL does not have to implement the machine pre-
cisely, but it must provide the same effect. You can obtain a diagram of the OpenGL Machine
either directly as www.3dlabs.com/support/developer/state.pdf or from the course web
page (in either case you will have to magnify it to make it readable).
The pipeline has two inputs: vertexes and pixels. Most of the information is typically in
the form of vertexes; pixels are used for special-purpose applications such as displaying text
in particular positions in the viewing window.
A unit of geometric data is called a primitive and consists of an object type and a list
of vertexes. For example, a triangle object has three vertexes and a quad object has four
vertexes. Vertex data includes:
• 4D coordinates, (x, y, z, w)
• normal vector
• texture coordinates
• colour values, (r, g, b, a) (or a colour index)
• material properties
• edge-flag data
The data associated with a vertex is assembled by a call to glVertex. This implies that
all of the information that OpenGL needs for a vertex must be established before the call to
glVertex. All of the data except the (x, y, z, w) coordinates have default values, and OpenGL
will use these if you have not provided any information. The default values are usually fairly
sensible, and this enables you to get a rough picture with a minimum of work and refine it
afterwards.
6.1.1 Per Vertex Operations
A number of operations are performed on each vertex: these are called per vertex operations
and they are important because the time that OpenGL requires to render a scene is the product
of the time taken to process one vertex and the number of vertexes in the scene.
• The position of each vertex is transformed by the model view matrix
• The normal vector at each vertex is transformed by the inverse transpose of the model
view matrix
• The normal is renormalized (made a unit vector) if specified by the user
• Texture coordinates are transformed by the texture matrix
• The normal vectors, material properties, and lighting model are used to perform lighting
calculations that determine the final colour of the vertex
6.1.2 Primitive Assembly
When all of the vertexes have been processed, the primitives are assembled. At this stage,
there is an object — triangle, quad, or general polygon — and information about each of its
vertexes. Each object of this kind is called a primitive.
The primitive is clipped by any clipping planes defined by the user. The clipping planes are
transformed by the model view matrix and determine whether a primitive is removed, wholly
or partially, by the clipping plane. A primitive that is partially clipped may acquire new
vertexes: for example, a clipped triangle becomes a quadrilateral.
The spatial coordinates of each vertex are then transformed by the projection matrix. A
second round of clipping occurs: this time, anything outside the viewing volume is clipped.
At this stage, the viewing volume is usually bounded by x = ±1, y = ±1, z = ±1, regardless
of the kind of projection (orthogonal or perspective). Again, primitives at the edges of the
viewing volume may gain new vertexes.
Each primitive has a front face and a back face. Culling is applied, eliminating back face (or
front face) data if it is not required.
6.1.3 Rasterization
The primitives are then converted to fragments in a process called rasterization. The
graphics window is considered to be a collection of small squares called pixels. For example,
a typical window on a high resolution screen might have 800 × 600 = 480,000 pixels. A
primitive usually occupies several pixels, but some primitives (for example, small triangles in
a part of the model that is distant from the viewer) might occupy only part of a pixel.
If the primitive occupies one pixel or more, the shading model determines how the pixels
are coloured. In flat shading, the colour at one vertex determines the colour of all pixels; in
smooth shading, the pixel colours are obtained by interpolating between the vertex colours.
In all cases, the boundaries between primitives must be considered: a pixel may be part of
more than one primitive, in which case its colour must be averaged.
6.1.4 Pixel Operations
Meanwhile, some data has been specified directly in terms of pixels. Such data is usually in the
wrong format for OpenGL and must be packed, unpacked, realigned, or otherwise converted.
Pixel data is then rasterized and converted into fragments and combined with vertex data.
6.1.5 Fragment Operations
Several further operations are performed on the fragments:
• Texture data is mapped from the texture image source to the fragment
• If fog is enabled, it is applied to the fragment
• Antialiasing may be applied to reduce the jaggedness of lines and boundaries
• Scissor tests that may exclude part of the viewing window are applied
• Alpha computations are performed for overlapping fragments
• Stencils are applied to eliminate some parts of the view
• The depth-buffer test is applied to choose the closest fragment at each pixel
• Blending, dithering, and logical operations are performed
• Colour masking is applied, if necessary, to reduce the amount of colour information to
the number of bits provided by the frame buffer
• The pixels for the fragment are written to the frame buffer
6.2 Rasterization
Rasterization, which is the process of turning the scene into pixels, is not a particularly
glamorous part of graphics programming but it is nonetheless one of the most important.
However many strange and wonderful effects your graphics engine can create, they will all be
spoiled by poor rasterization.
It is worth noting that, at this level, operations may be performed by either software or
hardware. One of the factors which distinguishes a high-performance graphics workstation
from a simple PC is that the workstation has more sophisticated hardware. For example,
the following quotation is taken from a description of the new Silicon Graphics Onyx 3000
workstation:
The new graphics system is built on the shared memory SGI NUMAflex archi-
tecture of the SGI Onyx 3000 series systems, which allows it to deliver industry-
leading interactive graphics performance of up to 283 million triangles per second
of sustained performance and 7.7 billion pixels per second.
A “triangle” in this context means that the system can take three vertexes and produce a
smooth-shaded triangle in 3.5 nanoseconds.
Consider the edge between two fragments. Mathematically, it is a perfectly straight line with
no width. In practice, it is formed of pixels. A pixel either belongs to one fragment or the
other, or is shared by both fragments. The pixel must be coloured accordingly; if the colour is
wrong, our extremely sensitive eyes will detect imperfections in the image, even if we cannot
see precisely what causes them.
It follows that the low-level primitives of the graphics system must be extremely robust and
reliable. For example, the pixels that form a line must not depend on the direction in which
we draw the line. Furthermore, operations at this level must be extremely efficient, because
they are used very frequently.
Finally, at this stage, we are working with discrete entities (pixels), not smoothly changing
quantities. Not only are the pixels themselves discrete, but colour is quantized into a fixed
and limited number of bits. Consequently, the best algorithms will avoid floating-point calculations
as much as possible and work with integers only.
We will consider just two of the problems that arise during rasterization: drawing a straight
line (which is fundamental) and drawing a circle (which is useful but not quite so fundamental).
6.2.1 Drawing a Straight Line
The midpoint algorithm scan converts a straight line using only integer addition. Bresen-
ham (1965) had the original idea, Pitteway (1967) gave the midpoint formulation, and the
version given here is due to Van Aken (1984).
The problem is to scan convert a line with end points (x0, y0) and (x1, y1). We assume that
the end points have integer coordinates (that is, they lie on the pixel grid). Let
dx = x1 − x0
dy = y1 − y0
We assume that dx ≥ 0, dy ≥ 0, and dy/dx ≤ 1. (Note that dx and dy are integers, not
differentials.) Since the slope is less than one, we will need a pixel at every X-ordinate. The
problem is to choose the Y coordinates of the pixels.
Assume that we have plotted a pixel at (xp, yp). We have two choices for the pixel at xp + 1:
it should be at either H or L in the diagram below. Suppose that M is midway between L
and H. If the line passes below M, we plot pixel L; if the line passes above M (as in the
diagram), we plot pixel H.
[Diagram: the current pixel is at (xp, yp); the candidates for the next pixel are L at
(xp + 1, yp) and H at (xp + 1, yp + 1); the midpoint M lies between them at (xp + 1, yp + ½),
and M′ is the next decision point.]
The equation of the line is
y = x (dy/dx) + B    (1)
where B = y0 − x0 (dy/dx) is the intercept with the axis x = 0 (we do not actually need B in the
subsequent calculations). We can rewrite (1) as x dy − y dx + B dx = 0 and we define
F(x, y) ≡ x dy − y dx + B dx.    (2)
If P is a point on the line, clearly F(P) = 0. We can also show that
F(P) < 0, if P is above the line;
F(P) > 0, if P is below the line.
Consequently, we can use F to decide which pixel to plot. If F(M) < 0, then M is above the
line and we plot L; if F(M) > 0, then M is below the line and we plot H.
We can easily compute F(M), using definition (2), as
F(M) = (xp + 1) dy − (yp + ½) dx + B dx.
What happens next? Suppose we plot pixel L. Then the next midpoint, M′, is one step
“east” of M at (xp + 2, yp + ½). We have
F(M′) = F(xp + 2, yp + ½)
= (xp + 2) dy − (yp + ½) dx + B dx
= F(M) + dy.
If, instead, we plot pixel H, the next midpoint is one step “northeast” of M at (xp + 2, yp + 3/2),
as in the diagram. In this case,
F(M′) = F(xp + 2, yp + 3/2)
= (xp + 2) dy − (yp + 3/2) dx + B dx
= F(M) + dy − dx.
Using these results, we need to compute d = F(M) only once, during initialization. For the
first point on the line, xp = x0 and yp = y0, and B dx = y0 dx − x0 dy. Consequently:
F(M) = (xp + 1) dy − (yp + ½) dx + B dx
= (x0 + 1) dy − (y0 + ½) dx + y0 dx − x0 dy
= dy − ½ dx
In subsequent iterations, we:
increment x;
if d < 0, add dy to d;
otherwise (d ≥ 0), increment y and add dy − dx to d.
There are three points to note:
• In the last step, dy ≤ dx, and so dy − dx ≤ 0 and d gets smaller.
• We have implicitly dealt with the case F(M) = 0 in the same way as F(M) > 0. In
some situations, we might need a more careful choice.
• The algorithm still has fractions (with denominator 2). Since we need only the sign of
F, not its value, we can use 2F(M) instead of F(M).
Figure 33 shows a simple C version of the algorithm. Remember that this handles only the
case 0 ≤ dy ≤ dx: a complete function would have code for all cases.
6.2.2 Drawing a Circle
We can use the ideas that we used to draw a straight line to draw a circle. First, we use
symmetry to reduce the amount of work eightfold. Assume that the centre of the circle is at
the origin (0, 0). Then, if (x, y) is a point on the circle, the following seven points are
also on the circle: (−x, y), (x, −y), (−x, −y), (y, x), (−y, x), (y, −x), and (−y, −x).
The equation of a circle with radius R and centre at the origin is x² + y² = R². Let
F(x, y) = x² + y² − R².
Then, for any point P:
F(P) < 0, if P is inside the circle;
F(P) = 0, if P is on the circle; and
F(P) > 0, if P is outside the circle.
void line (int x0, int y0, int x1, int y1)
{
int dx = x1 - x0;
int dy = y1 - y0;
int d = 2 * dy - dx;
int L = 2 * dy;
int H = 2 * (dy - dx);
int x = x0;
int y = y0;
for (; x < x1; x++)
{
pixel(x, y);
if (d < 0)
d += L;
else
{
d += H;
y++;
}
}
pixel(x1, y1);
}
Figure 33: A C function for lines with slope less than 1.
[Diagram: the current pixel is at (xp, yp); the candidates for the next pixel are H at
(xp + 1, yp) and L at (xp + 1, yp − 1); the midpoint M is at (xp + 1, yp − ½); after
plotting L the next decision point is M1, and after plotting H it is M2; the circle arc
passes between H and L.]
Figure 34: Drawing a circle
Assume we have plotted a pixel at (xp, yp) (see Figure 34). The decision variable d is given
by
d = F(xp + 1, yp − ½)
= (xp + 1)² + (yp − ½)² − R²
If d > 0, as in Figure 34, we plot L, the next decision point is M1, and
d′ = F(xp + 2, yp − 3/2)
= (xp + 2)² + (yp − 3/2)² − R²
= d + 2xp − 2yp + 5.
If d < 0, we plot H, the next decision point is M2, and
d′ = F(xp + 2, yp − ½)
= (xp + 2)² + (yp − ½)² − R²
= d + 2xp + 3.
For the first pixel, x0 = 0, y0 = R, and
M = (x0 + 1, R − ½)
= (1, R − ½),
and
F(M) = F(1, R − ½)
= 1² + (R − ½)² − R²
= 5/4 − R.
From this algorithm, it is straightforward to derive the code shown in Figure 35. For each
coordinate computed by the algorithm, the function circlepoints in Figure 36 plots eight
pixels at the points of symmetry.
6.2.3 Clipping
Clipping means removing part or all of an object because it is invisible. The simplest example
is a straight line joining two points A and B. The easy cases are when A and B are both
inside or both outside the window; the harder case is when one point is inside and the other is
outside, because we have to find out where the line meets the edge of the window and “clip”
it at that point.
We consider just one clipping technique: the Sutherland-Hodgman polygon-clipping algo-
rithm. The general algorithm can clip a polygon in 3D against a polyhedral volume defined
by planes. We will consider the simple case of clipping against planes parallel to the principal
axes, such as the boundaries of the viewing volume (VV).
The algorithm moves around the polygon, considering the vertexes one at a time, and adding
vertexes to its output. We assume that it has processed vertex u and is moving to vertex v.
The following cases must be considered:
void circle (int radius)
{
int x = 0;
int y = radius;
double d = 1.25 - radius;
circlepoints(x, y);
while (y > x)
{
if (d < 0)
d += 2.0 * x + 3.0;
else
{
d += 2.0 * (x - y) + 5.0;
y--;
}
x++;
circlepoints(x, y);
}
}
Figure 35: Computing points in the first octant
void circlepoints (int x, int y)
{
pixel(x, y);
pixel(-x, y);
pixel(x, -y);
pixel(-x, -y);
pixel(y, x);
pixel(-y, x);
pixel(y, -x);
pixel(-y, -x);
}
Figure 36: Plotting eight symmetrical points
1. u and v are both inside the VV: output the vertex v.
2. u is inside the VV and v is outside the VV: find the intersection w of the line uv with
the VV and output w.
3. u and v are both outside the VV: no output.
4. u is outside the VV and v is inside: find the intersection w of the line uv with the VV;
output w and then v.
Sutherland and Hodgman showed how to implement this algorithm recursively so that it can
be performed in hardware without intermediate storage. Figure 37 gives C-like pseudocode
void PolyClip(
Vertex inVertexArray[],
Vertex outVertexArray[],
int inLength,
int *outLength,
Edge clipBoundary
)
{
Vertex s, p, i;
int j;
*outLength = 0;
s = inVertexArray[inLength - 1];
for (j = 0; j < inLength; j++)
{
p = inVertexArray[j];
if (Inside(p, clipBoundary))
{
if (Inside(s, clipBoundary))
Output(p);
else
{
i = Intersect(s, p, clipBoundary);
Output(i);
Output(p);
}
}
else
if (Inside(s, clipBoundary))
{
i = Intersect(s, p, clipBoundary);
Output(i);
}
s = p;
}
}
Figure 37: Sutherland-Hodgman Polygon Clipping
for the algorithm. Note that PolyClip is called four times: once for each side of the enclosing
rectangle.
In Figure 37, Output(p) is a kind of macro with the effect:
outVertexArray[(*outLength)++] = p;
The function Inside checks whether the given point is inside the clipping boundary and
returns true if it is. The function Intersect returns the vertex where the line joining the
two given vertexes crosses the clipping boundary.
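For clipping against an axis-aligned boundary, Inside and Intersect are simple. Here is a sketch for the special case of a vertical boundary x = xmin with the inside on the right; the Vertex fields and the use of a plain double in place of the Edge type are our simplifications, since Figure 37 leaves them abstract:

typedef struct { double x, y; } Vertex;

/* Inside: true if p is on the visible side of the boundary x = xmin. */
int Inside(Vertex p, double xmin)
{
    return p.x >= xmin;
}

/* Intersect: the point where the segment from s to p crosses x = xmin. */
Vertex Intersect(Vertex s, Vertex p, double xmin)
{
    Vertex i;
    double t = (xmin - s.x) / (p.x - s.x);  /* p.x != s.x: the ends straddle the edge */
    i.x = xmin;
    i.y = s.y + t * (p.y - s.y);
    return i;
}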
1001 | 1000 | 1010
-----+------+-----
0001 | 0000 | 0010
-----+------+-----
0101 | 0100 | 0110
[The central region, coded 0000, is the rectangle ABCD.]
Figure 38: Labelling the regions
To make algorithms like this one efficient, it is important that low-level operations such as
Inside and Intersect are performed quickly. For example, we need a fast way of deciding
whether a line is partly or fully outside the VV. Here is one way of doing this for rectangles.
We extend the rectangle ABCD to define nine regions, as shown in Figure 38.
Suppose that the bottom left corner of the rectangle ABCD is at (Xmin, Ymin) and the top
right hand corner is at (Xmax, Ymax). Then the four coding bits have the following values:
• Bit 1: y > Ymax
• Bit 2: y < Ymin
• Bit 3: x > Xmax
• Bit 4: x < Xmin
We can assign these bits quickly. Now assume that we have assigned bits to the end points of
a line, giving values A and B.
• If A = B = 0, the line is within the rectangle.
• If A & B ≠ 0 (bitwise AND), the line is entirely outside the rectangle.
In the other cases, the line is partly inside and partly outside the rectangle. Note, however,
that the bit values tell us which edge it crosses; this information speeds up the calculation of
the point of intersection. A sketch of the bit assignment follows.
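A minimal sketch of the bit assignment and the two quick tests (the names are illustrative, not from the notes):

enum { ABOVE = 8, BELOW = 4, RIGHT = 2, LEFT = 1 };  /* bits 1 to 4 */

int outcode(double x, double y,
            double Xmin, double Ymin, double Xmax, double Ymax)
{
    int code = 0;
    if (y > Ymax) code |= ABOVE;
    if (y < Ymin) code |= BELOW;
    if (x > Xmax) code |= RIGHT;
    if (x < Xmin) code |= LEFT;
    return code;
}

/* Trivial accept: both codes zero.  Trivial reject: codes share a bit. */
int triviallyAccepted(int a, int b) { return (a | b) == 0; }
int triviallyRejected(int a, int b) { return (a & b) != 0; }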
7 Transformations — Again
This section provides the mathematical background for 3D graphics systems. We begin by
motivating the discussion: what is wrong with “ordinary” 3D Euclidean space?
It turns out that there are several problems:
• In physics and mechanics, we use a coordinate system with a fixed origin. This is convenient
for simple problems but, in graphics, the choice of origin is often not obvious. We
need a more flexible kind of geometry.
• Graphics requires several kinds of transformations (translation, scaling, rotation, etc.)
and projections. Although these can all be performed in 3D Euclidean space, there is no
simple and uniform technique that works for all of the transformations that we need.
The appropriate mathematical tools have existed for a long time: they are scalar, vector, and
affine spaces.
7.1 Scalar Spaces
A scalar space, also known as a field, is a set of values and operations on those values. There
are two special values, the zero and the unit, and the operations are addition, subtraction,
multiplication, and division. Although there are many scalar spaces, we need only one: the
real numbers. In this section, we will use lower case Greek letters (α, β, γ, . . .) to denote
real numbers and R to denote the set of all real numbers. Naturally, the zero of this system
is 0 and the unit is 1.
7.2 Vector Spaces
A vector space is a collection of vectors that satisfies a set of axioms. We write vectors as
bold face Roman letters such as u, v, w, etc. The axioms of a vector space are as follows; we
denote the vector space itself by V.
• There is a zero vector that we will write as 0.
• Vectors can be multiplied by scalars. If α ∈ R and v ∈ V then αv ∈ V.
• Vectors can be added and subtracted. If u ∈ V and v ∈ V then u + v ∈ V and u − v ∈ V.
• The following identities hold for all α, β ∈ R and u, v ∈ V:
0 v = 0    (3)
1 v = v    (4)
(α β) v = α (β v)    (5)
(α + β) v = α v + β v    (6)
α (u + v) = α u + α v    (7)
There are a number of important properties of vector spaces that we will not explore here
because they can be found in any book on linear algebra. In particular, a basis for a vector
space V is a set of vectors u1, u2, . . . , un such that every vector v ∈ V can be put in the form
α1 u1 + α2 u2 + · · · + αn un. The vector space has dimension d if it has no basis consisting
of fewer than d vectors.
Vector spaces have a particular vector, 0, that plays a special role. Affine spaces, which we
discuss next, are a way of avoiding the existence of an element with special properties.
Example The standard model for a vector space is the set of n-tuples of real numbers. A
vector (v1, v2, . . . , vn) is a tuple with n real components. This vector space is usually denoted
by Rⁿ.
For example, R² consists of pairs like (x, y). The zero vector 0 is represented by (0, 0).
Multiplication by a scalar, and addition and subtraction of vectors, are defined by
α (x, y) = (α x, α y)
(x, y) + (x′, y′) = (x + x′, y + y′)
(x, y) − (x′, y′) = (x − x′, y − y′)
7.3 Affine Spaces
An affine space consists of: a set of points; an associated vector space; and two operations
(in addition to the operations of the vector space). The two operations are difference of
points (giving a vector) and addition of a vector and a point (giving a point). In the
following formal definitions, we write P for the set of points and V for the associated vector
space.
• If P ∈ P and Q ∈ P then P − Q ∈ V.
• If P ∈ P and v ∈ V then P + v ∈ P.
The affine operations must satisfy certain properties:
P − P = 0    (8)
(P + u) + v = P + (u + v)    (9)
(P − Q) + v = (P + v) − Q    (10)
P + v = P if and only if v = 0    (11)
Consider the expression L ≡ P + α (Q − P). We note first of all that this expression is
well-formed: since P and Q are points, Q − P is a vector. We can multiply the vector Q − P
by a scalar, α, obtaining the vector α (Q − P), which we can add to the point P. If we think
of P and Q as fixed points and α as a variable scalar, then L corresponds to a set of points.
This set includes the points P and Q; clearly, when α = 0, we have L ≡ P. Less obviously,
when α = 1, we have L ≡ P + (Q − P) = Q.
We define { P + α (Q − P) | α ∈ R } to be the line joining the points P and Q. Similarly,
we define
{ R + β (P + α (Q − P)) | α ∈ R, β ∈ R }
to be the plane through the points P, Q, and R.
Two lines, L ≡ P + α (Q − P) and L′ ≡ P′ + α (Q′ − P′), are parallel if Q − P = Q′ − P′
(vector equality).
The expression α P + β Q is currently undefined, because we cannot multiply a point by a
scalar. If α + β = 1, however, we define this expression to mean P + β (Q − P) and we refer
to α P + β Q as an affine combination. For example, the midpoint of P and Q is the affine
combination ½ P + ½ Q = P + ½ (Q − P).
                        Euclidean  similarity  affine  projective
Transformations
rotation                    ✓          ✓          ✓         ✓
translation                 ✓          ✓          ✓         ✓
uniform scaling                        ✓          ✓         ✓
nonuniform scaling                                ✓         ✓
shear                                             ✓         ✓
perspective projection                                      ✓
Invariants
length                      ✓
angle                       ✓          ✓
ratio of lengths            ✓          ✓
parallelism                 ✓          ✓          ✓
incidence                   ✓          ✓          ✓         ✓
cross-ratio                 ✓          ✓          ✓         ✓
Figure 39: Varieties of Transformation (adapted from Birchfield (1998))
Example A typical member of R⁴ is the 4-tuple (x, y, z, w). Define a point to be a member
of R⁴ with the form (x, y, z, 1). Then the difference of two points (computed by vector
subtraction in R⁴) is
(x, y, z, 1) − (x′, y′, z′, 1) = (x − x′, y − y′, z − z′, 0)
If we interpret (x − x′, y − y′, z − z′, 0) as a vector in R³, then the set of points is an affine
space with R³ as its associated vector space. It is easy to see that the axioms (8)–(11) are
satisfied.
The space we have defined is called the standard affine 3-space in R⁴. We will use this
affine space for the rest of this section, referring to it as S. The four coordinates used to
describe a point in S are called homogeneous coordinates.
We can take an arbitrary member of R⁴, such as (x, y, z, w), and transform it to the point
(x/w, y/w, z/w, 1) in S; this transformation is called homogenization. We can perform
calculations in R⁴ but, before we interpret the results, we must homogenize all of the points.
7.4 Transformations
There are many kinds of transformations. Transformations are classified according to the
properties that they preserve. For example, a rotation is a Euclidean transformation
because it does not distort objects in Euclidean space, but a projection is non-Euclidean
because it loses information about one dimension and distorts objects in the other dimensions.
An important feature of a transformation is the properties that it preserves: these are called
the invariants of the transformation. Figure 39 shows various kinds of transformations and
their properties.
An affine transformation is a function that maps every point of an affine space to another
point in the same affine space. An affine transformation T must satisfy certain properties:
• Affine combinations are preserved:
T(α P + β Q) = α T(P) + β T(Q)
• If L is a line, then T(L) is a line. (Note that the line L is a set of points; T(L) is an
abbreviation for { T(P) | P ∈ L }.)
• If L ∥ L′ (L and L′ are parallel lines) then T(L) ∥ T(L′).
• If M is a plane, then T(M) is a plane.
• If M ∥ M′ (M and M′ are parallel planes) then T(M) ∥ T(M′).
There is a convenient representation of affine transformations on S: we can use 4 × 4 matrices
of the form
\[
M = \begin{bmatrix}
\cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot \\
0 & 0 & 0 & 1
\end{bmatrix}
\]
where the dots indicate any value. To transform a point, we treat the point as a 4 × 1 matrix
(a column vector) and premultiply it by the transformation matrix.
7.4.1 Translation
A translation transforms the point (x, y, z, 1) to (x + a, y + b, z + c, 1). The matrix and its
effect are described by the following equation:
\[
\begin{bmatrix}
1 & 0 & 0 & a \\
0 & 1 & 0 & b \\
0 & 0 & 1 & c \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} x + a \\ y + b \\ z + c \\ 1 \end{bmatrix}
\]
7.4.2 Scaling
A scaling transformation scales the coordinates of the point by given factors along each of the
principal axes. The matrix and its effect are described by the following equation:
\[
\begin{bmatrix}
r & 0 & 0 & 0 \\
0 & s & 0 & 0 \\
0 & 0 & t & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} r x \\ s y \\ t z \\ 1 \end{bmatrix}
\]
7.4.3 Rotation
Rotations about the principal axes are defined by the equations below. We assume that the
rotations are counter-clockwise in a right-handed coordinate system. (To visualize a right-
handed coordinate system, extend the thumb and first two fingers of your right hand so they
are roughly at right-angles to one another. Then your thumb points along the X-axis, your
first finger points along the Y -axis, and your second finger points along the Z-axis.)
\[
R_x = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & \cos\theta & -\sin\theta & 0 \\
0 & \sin\theta & \cos\theta & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\qquad
R_y = \begin{bmatrix}
\cos\theta & 0 & \sin\theta & 0 \\
0 & 1 & 0 & 0 \\
-\sin\theta & 0 & \cos\theta & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\qquad
R_z = \begin{bmatrix}
\cos\theta & -\sin\theta & 0 & 0 \\
\sin\theta & \cos\theta & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]
A general rotation through an angle θ about an axis in the direction of the unit vector
u = (ux, uy, uz) is given by the matrix
\[
R_u(\theta) = \begin{bmatrix}
c + (1-c)u_x^2 & (1-c)u_y u_x - s\,u_z & (1-c)u_z u_x + s\,u_y & 0 \\
(1-c)u_x u_y + s\,u_z & c + (1-c)u_y^2 & (1-c)u_z u_y - s\,u_x & 0 \\
(1-c)u_x u_z - s\,u_y & (1-c)u_y u_z + s\,u_x & c + (1-c)u_z^2 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]
where s = sin θ and c = cos θ.
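As a sketch, this is the matrix that a call such as glRotatef constructs internally. The function below (the name is ours, and we assume the axis is already a unit vector) fills a column-major 16-element array in the layout used by glLoadMatrixf and glMultMatrixf:

#include <math.h>

/* Fill m (column-major) with the rotation through `angle` radians
   about the unit axis (ux, uy, uz). */
void rotationMatrix(float m[16], float angle, float ux, float uy, float uz)
{
    float c = cosf(angle), s = sinf(angle), k = 1.0f - c;

    m[0] = c + k*ux*ux;    m[4] = k*uy*ux - s*uz; m[8]  = k*uz*ux + s*uy; m[12] = 0;
    m[1] = k*ux*uy + s*uz; m[5] = c + k*uy*uy;    m[9]  = k*uz*uy - s*ux; m[13] = 0;
    m[2] = k*ux*uz - s*uy; m[6] = k*uy*uz + s*ux; m[10] = c + k*uz*uz;    m[14] = 0;
    m[3] = 0;              m[7] = 0;              m[11] = 0;              m[15] = 1;
}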
7.5 Non-Affine Transformations
We can use matrices to define transformations that are not affine but are nevertheless useful
in graphics. The bottom row of such a matrix is not [0, 0, 0, 1], and we can exploit this
to achieve interesting results.
7.5.1 Perspective Transformations
The first non-affine transformation that we will consider is the perspective transformation.
During the middle ages, the concept of perspective was developed by imagining a ray of light
passing through a window; the ray originates at a point in the scene and arrives at the painter’s
eye. The point at which it passes through the window is the point where that part of the
scene should be painted.
Figure 40 shows the origin at O, with the Z axis extending to the right and the Y axis
extending upwards (the X axis, which comes out of the paper towards you, is not shown).
The “window” on which the scene is to be projected is at z = n. The point in the scene is
at P, and its projection P′ is obtained by drawing a line (corresponding to a light ray) from
P to O. By similar triangles,
y′ = y (n/z)
[Diagram: origin O with the Z axis extending to the right and the Y axis upwards; the
projection plane (“window”) at distance n from O; a ray from the scene point P, at depth z
and height y, to O crosses the plane at P′, at height y′.]
Figure 40: Perspective
and, in the XZ plane,
x′ = x (n/z)
It might appear that we cannot achieve a transformation of this kind with a matrix, because
matrix transformations are supposed to be linear. But consider the following equation:
\[
\begin{bmatrix}
n & 0 & 0 & 0 \\
0 & n & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} n x \\ n y \\ 0 \\ z \end{bmatrix}
\]
If we homogenize the transformed point by dividing each of its components by z, we obtain
P′ = (x n/z, y n/z, 0, 1), with the same X and Y values as above.
A transformation, as explained above, is a mapping from a space into the same space. A
projection is a mapping from a space to a space with fewer dimensions. If we discard the
last two components of P′, we obtain the projection
(x, y, z, 1) ↦ (x n/z, y n/z)
We have mapped the point P in S to a point in a two-dimensional plane using a perspective
transformation followed by a projection.
In practical situations, it is useful to have a value for the Z coordinate: for example, we can
use this value for depth buffer comparisons. To obtain a suitable value for Z, we apply the
following transformation
\[
\begin{bmatrix}
n & 0 & 0 & 0 \\
0 & n & 0 & 0 \\
0 & 0 & a & b \\
0 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} n x \\ n y \\ a z + b \\ z \end{bmatrix}
\]
After homogenization, we obtain the 3D point
( x n/z, y n/z, (a z + b)/z ).
In this expression, the Z value is called the pseudodepth. It increases with distance, and
can be used for depth buffer comparisons.
For clipping purposes, it is convenient to restrict the pseudodepth values to the range ±1, as in
the other directions. Assume the near and far planes are at z = −n and z = −f respectively.
Then we have
(a (−n) + b)/(−n) = −1
(a (−f) + b)/(−f) = +1
and therefore
a = −(n + f)/(n − f)
b = −2 n f/(n − f)
and the perspective transformation that OpenGL uses is indeed
\[
\begin{bmatrix}
n & 0 & 0 & 0 \\
0 & n & 0 & 0 \\
0 & 0 & -\dfrac{n+f}{n-f} & -\dfrac{2nf}{n-f} \\
0 & 0 & 1 & 0
\end{bmatrix}
\]
The pseudodepth is
z′ = (a z + b)/z = −(z(n + f) + 2fn)/(z(n − f)).
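As a quick check of these values (our verification, not part of the original notes), substitute the near and far planes into the pseudodepth formula:
\[
z = -n: \quad z' = -\frac{-n(n+f) + 2fn}{-n(n-f)} = -\frac{n(f-n)}{n(f-n)} = -1
\]
\[
z = -f: \quad z' = -\frac{-f(n+f) + 2fn}{-f(n-f)} = -\frac{f(n-f)}{-f(n-f)} = +1
\]
so the pseudodepth runs from −1 at the near plane to +1 at the far plane, increasing with distance as claimed.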
7.5.2 Shadows
Suppose that there is a light source at L, a point at P, and a plane given by the equation
Ax + By + Cz + D = 0. The point P will cast a shadow on the plane. To find the position of
the shadow, we have to find where the line through L and P meets the plane. The parametric
equation of a line through L and P is given by:
Qx = Lx + (Px − Lx) t    (12)
Qy = Ly + (Py − Ly) t    (13)
Qz = Lz + (Pz − Lz) t    (14)
To find where this line meets the plane, we solve the equation
A Qx + B Qy + C Qz + D = 0
for t. The result is
t = −(A Lx + B Ly + C Lz + D)/(A Px − A Lx + B Py − B Ly + C Pz − C Lz)
  = (A Lx + B Ly + C Lz + D)/((A Lx + B Ly + C Lz) − (A Px + B Py + C Pz))
Substituting this value into (12)–(14) gives the coordinates for the shadow point as
Qx = (Lx B Py + Lx C Pz − Px B Ly − Px C Lz − Px D + Lx D)/(A Px − A Lx + B Py − B Ly + C Pz − C Lz)
   = (Lx(B Py + C Pz + D) − Px(B Ly + C Lz + D))/((A Px + B Py + C Pz) − (A Lx + B Ly + C Lz))
Qy = (Ly A Px + Ly C Pz − Py A Lx − Py C Lz − Py D + Ly D)/(A Px − A Lx + B Py − B Ly + C Pz − C Lz)
   = (Ly(A Px + C Pz + D) − Py(A Lx + C Lz + D))/((A Px + B Py + C Pz) − (A Lx + B Ly + C Lz))
Qz = (Lz A Px + Lz B Py − Pz A Lx − Pz B Ly − Pz D + Lz D)/(A Px − A Lx + B Py − B Ly + C Pz − C Lz)
   = (Lz(A Px + B Py + D) − Pz(A Lx + B Ly + D))/((A Px + B Py + C Pz) − (A Lx + B Ly + C Lz))
The matrix that takes a point P onto its projection Q on the plane is
\[
\begin{bmatrix}
-(B L_y + C L_z + D) & L_x B & L_x C & L_x D \\
L_y A & -(A L_x + C L_z + D) & L_y C & L_y D \\
L_z A & L_z B & -(A L_x + B L_y + D) & L_z D \\
A & B & C & -(A L_x + B L_y + C L_z)
\end{bmatrix}
\]
As an example, we can choose y = 0 as the plane and (0, 1, 0) as the position of the light
source. Then B = 1, A = C = D = 0, Lx = 0, Ly = 1, and Lz = 0. The matrix is
\[
\begin{bmatrix}
-1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 1 & 0 & -1
\end{bmatrix}
\]
If we use this matrix to transform the point (x, y, z, 1), we obtain (−x, 0, −z, y − 1). After
homogenization, this point becomes (x/(1 − y), 0, z/(1 − y)). We can see that this is correct for
points with 0 ≤ y < 1. First, note that the shadow lies entirely in the plane y = 0. Next,
points with y = 0 transform to themselves, because they are on the plane and so is their
shadow. Points on the plane y = ½ have their coordinates doubled. As a point moves closer
to the plane y = 1, its shadow approaches infinity.
7.5.3 Reflection
Again, we consider a point P at (Px, Py, Pz) and the plane Ax + By + Cz + D = 0. Here are
the parametric equations of a line through P perpendicular to the plane:
x = Px + A t    (15)
y = Py + B t    (16)
z = Pz + C t    (17)
To find where this line meets the plane, we solve this equation and Ax + By + Cz + D = 0
for t, giving
t0 = −(A Px + B Py + C Pz + D)/(A² + B² + C²)    (18)
The reflection Q of P in the plane is the point obtained by substituting t = 2 t0 in equations
(15)–(17):
Qx = Px + 2 A t0
Qy = Py + 2 B t0
Qz = Pz + 2 C t0
Using the value of t0 given by (18), we have:
Qx = Px − 2 A (A Px + B Py + C Pz + D)/Δ
Qy = Py − 2 B (A Px + B Py + C Pz + D)/Δ
Qz = Pz − 2 C (A Px + B Py + C Pz + D)/Δ
where Δ = A² + B² + C². The matrix which maps P to Q is
\[
\begin{bmatrix}
\Delta - 2A^2 & -2AB & -2AC & -2AD \\
-2AB & \Delta - 2B^2 & -2BC & -2BD \\
-2AC & -2BC & \Delta - 2C^2 & -2CD \\
0 & 0 & 0 & \Delta
\end{bmatrix}
\]
As a test of this matrix, we consider reflections in the YZ plane, which has equation x = 0.
The coefficients in the plane equation are A = 1 and B = C = D = 0. The following matrix
is clearly correct:
\[
\begin{bmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]
7.6 Working with Matrices
Finding the equations for projections is fairly straightforward: it is usually a matter of solving
linear equations. The algebra can get a bit heavy, but problems of this kind are easily solved
by a package such as Maple, Matlab, or Mathematica.
The hard part is converting the solution from a set of equations into a matrix. The packages
are not much use here. The following notes may help.
The following equation illustrates the effect of terms in the bottom row of a 4 × 4 transformation
matrix:
\[
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
a & b & c & d
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} x \\ y \\ z \\ a x + b y + c z + d \end{bmatrix}
\]
After normalizing and dropping the fourth coordinate, we obtain the 3D point
( x/(a x + b y + c z + d), y/(a x + b y + c z + d), z/(a x + b y + c z + d) ).
That is, entries in the fourth row act as divisors for the coordinates of the output point. The
first three columns divide by factors proportional to x, y, and z respectively, and the fourth
column can be used to divide by a constant factor.
As usual, entries in the right column correspond to translations:
\[
\begin{bmatrix}
1 & 0 & 0 & r \\
0 & 1 & 0 & s \\
0 & 0 & 1 & t \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} x + r \\ y + s \\ z + t \\ 1 \end{bmatrix}
\]
Thus these entries in the matrix can be used to add constant quantities (that is, quantities
independent of x, y, and z) to the output point.
We can apply these ideas to the equations for the shadow point Q derived in Section 7.5.2.
The denominator of each coordinate is (A Px + B Py + C Pz) − (A Lx + B Ly + C Lz). From
this, we can immediately infer that the matrix must have the form
\[
\begin{bmatrix}
\cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot \\
A & B & C & -(A L_x + B L_y + C L_z)
\end{bmatrix}
\]
where the dots indicate values we don’t know yet: this bottom row makes the homogeneous
coordinate of the result equal to the denominator.
Looking at the numerator of
Qx = (Lx(B Py + C Pz + D) − Px(B Ly + C Lz + D))/((A Px + B Py + C Pz) − (A Lx + B Ly + C Lz))
the Px term tells us that the top-left corner of the matrix must be −(B Ly + C Lz + D).
Similarly, the components of Lx(B Py + C Pz + D) give the other entries of the first row
of the matrix. We now have:
\[
\begin{bmatrix}
-(B L_y + C L_z + D) & L_x B & L_x C & L_x D \\
\cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot \\
A & B & C & -(A L_x + B L_y + C L_z)
\end{bmatrix}
\]
It is now safe to make a guess about the other entries based on symmetry. Filling them in
gives:
\[
\begin{bmatrix}
-(B L_y + C L_z + D) & L_x B & L_x C & L_x D \\
L_y A & -(A L_x + C L_z + D) & L_y C & L_y D \\
L_z A & L_z B & -(A L_x + B L_y + D) & L_z D \\
A & B & C & -(A L_x + B L_y + C L_z)
\end{bmatrix}
\]
The final step is to see if this works. First, try some very simple examples. For instance,
put the light source at (0, 1, 0) and use the plane y = 0. This gives the matrix shown at the
end of Section 7.5.2. Then we can try more complicated examples and, finally, try it out with
OpenGL.
8 Rotation
Rotation in three dimensions is quite complicated but is easier to understand in relation
to rotation in two dimensions. Consequently, we discuss rotation in general first; then 2D
rotation, although much of the material should be revision; and finally 3D rotation.
8.1 Groups
A group G = (S, ∘) is an algebraic structure consisting of a set S and a binary operation ∘
on elements of the set. A group must have the following properties:
Closure: The set S is closed under the operation ∘: if x ∈ S and y ∈ S, then x ∘ y ∈ S.
Associative: The operation ∘ is associative: for all x, y, z ∈ S, x ∘ (y ∘ z) = (x ∘ y) ∘ z.
Unit: There is a unit element u ∈ S with the property that, for any x ∈ S, x ∘ u = u ∘ x = x.
Inverse: For every element x ∈ S, there is an inverse element y ∈ S such that x ∘ y = u.
Note that the group properties do not include commutativity. In general, x ∘ y ≠ y ∘ x. A
group with a commutative operator is called a commutative group or an Abelian group.
We will write x⁻¹ for the inverse of x.
Since S is a set, it has subsets. If the elements of a subset and the group operation form a
group H, then H is a subgroup of G. Here is the formal definition of subgroup:
Let G = (S, ∘) be a group and suppose the set T ⊆ S has the following properties (note that
we do not need to mention associativity):
Closure: The set T is closed under the group operation ∘ of G: if x ∈ T and y ∈ T, then
x ∘ y ∈ T.
Unit: T contains the unit element u of G.
Inverse: If x ∈ T, then x⁻¹ ∈ T.
Then H = (T, ∘) is a subgroup of G.
Groups are often used to model operations on a set of objects. For example, graphics trans-
formations operate on vertex coordinates.
Assume that there is a group G and a set of objects O and that it is meaningful to apply a
member of G to a member of O. If f ∈ G and x ∈ O, we write f(x) for this operation. We
require:
Closure: the group operations are closed over O: if p ∈ G and x ∈ O, then p(x) ∈ O;
Unit element: if e is the unit element of G and x ∈ O, then e(x) = x.
A binary relation ∼ on a set S is an equivalence relation iff for all x, y, z ∈ S:
Reflexivity: x ∼ x;
Symmetry: x ∼ y if and only if y ∼ x;
Transitivity: if x ∼ y and y ∼ z, then x ∼ z.
Do not confuse the application of a group element to another group element (p ∘ q ∈ G) with
the application of a group element to a member of O (p(x) ∈ O).
The familiar concept of “symmetry” is formally defined in terms of subgroups and equivalence
relations.
Lemma: Let ∼ be an equivalence relation on O and let H be the subset of G defined by
p ∈ H ⟺ ∀ x ∈ O . p(x) ∼ x.
Then H is a subgroup of G.
Proof: We assume that ∼ is an equivalence relation and show that H is a subgroup of G.
Subset: H is a subset of G by definition.
Unit element: for any x ∈ O:
x ∼ x (reflexivity)
e(x) ∼ x (unit element)
e ∈ H (definition of H)
Closure: Assume p, q ∈ H and q(x) = y and p(y) = z. Then (p ∘ q)(x) = p(y) = z and
x ∼ y (q ∈ H and q(x) = y)
y ∼ z (p ∈ H and p(y) = z)
x ∼ z (transitivity of ∼)
x ∼ (p ∘ q)(x) (by construction of z)
p ∘ q ∈ H (definition of H)
Inverse: Assume p ∈ H and let y = p(x).
p(x) ∼ x (p ∈ H)
y ∼ x (y = p(x))
x ∼ y (symmetry of ∼)
p⁻¹(y) ∼ y (p⁻¹(y) = p⁻¹p(x) = x)
p⁻¹ ∈ H (definition of H)
As an example, suppose that the elements of S are images of squares and the elements of G
are rotations pθ. If x ∈ S, then pθ(x) is the image x rotated by θ degrees, where θ is a whole
number satisfying 0 ≤ θ < 360. We define
pθ ∘ pφ = p[θ+φ]
pθ⁻¹ = p[−θ]
where [θ] stands for θ adjusted by adding or subtracting multiples of 360 so that 0 ≤ [θ] < 360.
Suppose that x ∼ y if “x looks the same as y”. Since x and y are images of squares, they
will look the same if they are rotated through a multiple of 90°. The subgroup of G induced
by ∼ consists of the operators pθ with θ ∈ { 0, 90, 180, 270 }.
8.2 2D Rotation
Rotations in 2D are rotations in a plane about a fixed point. A rotation has one parameter,
which is an angle: we rotate through an angle θ.
If we rotate through θ and then through φ, we have rotated through θ + φ. Rotating through
0 has no effect. Rotating through θ and then through −θ also has no effect.
Thus we can summarize the rotation group in 2D as follows:
• The elements are angles. We will assume that angles are computed mod 360° — that is,
if an angle is greater than 360° (or less than zero), we subtract (or add) 360° to it until
it is in the range [0, 360).
• The binary operation is addition of angles: given θ and φ, we can calculate θ + φ.
• The unit is 0°.
• The inverse of θ is −θ.
The rotation group in 2D is called SO2, short for special orthogonal group in two
dimensions. We note that it is a commutative group.
We need a way of calculating how points on the plane are transformed by elements of SO2.
With simple coordinate geometry, we can show that, if a point P has coordinates (x, y) then,
after rotation through θ, it has coordinates (x′, y′) where
x′ = x cos θ + y sin θ
y′ = −x sin θ + y cos θ
8.2.1 Representing 2D Rotations with Matrices
We can use 2 × 2 matrices to represent these rotations. The matrix for a rotation through θ
is
\[
\begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
\]
and the unit is obtained by substituting 0 for θ:
\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
The group operation with this representation is matrix multiplication. For example:
\[
\begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{bmatrix}
=
\begin{bmatrix}
\cos\theta\cos\phi - \sin\theta\sin\phi & \cos\theta\sin\phi + \sin\theta\cos\phi \\
-(\sin\theta\cos\phi + \cos\theta\sin\phi) & -\sin\theta\sin\phi + \cos\theta\cos\phi
\end{bmatrix}
=
\begin{bmatrix} \cos(\theta+\phi) & \sin(\theta+\phi) \\ -\sin(\theta+\phi) & \cos(\theta+\phi) \end{bmatrix}
\]
Although matrix multiplication in general is not commutative, these particular operations are
commutative.
8.2.2 Representing 2D Rotations with Complex Numbers
There is an alternative representation for 2D rotations: we can use complex numbers.
A complex number has the form x + i y, in which i = √−1. The norm² of a complex number z = x + i y is

‖z‖ = x² + y²

We are interested only in complex numbers z with ‖z‖ = 1. If we write z = x + i y, then x² + y² = 1 and so these numbers lie on the unit circle in the complex plane. We can write all such numbers in the form cos θ + i sin θ for some value 0 ≤ θ < 2π.

If we multiply two numbers on the unit circle, we get another number on the unit circle:

(cos θ + i sin θ)(cos φ + i sin φ) = cos θ cos φ − sin θ sin φ + i (sin θ cos φ + cos θ sin φ)
                                  = cos(θ + φ) + i sin(θ + φ)

Thus the effect of multiplying by cos θ + i sin θ is to rotate through an angle θ. In this system, the rotation group is represented as follows:

• Group elements are complex numbers of the form cos θ + i sin θ.
• The group operation is complex multiplication.
• Complex multiplication is associative.
• The unit element is 1 + i 0, corresponding to θ = 0.
• The inverse of cos θ + i sin θ is cos θ − i sin θ because

(cos θ + i sin θ)(cos θ − i sin θ) = cos² θ + sin² θ + i (sin θ cos θ − cos θ sin θ)
                                  = 1
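This representation maps directly onto C++'s std::complex; a small sketch (our own code, not part of the course software):

#include <complex>
#include <cstdio>

int main()
{
    const double PI = 3.14159265358979;
    // A unit complex number cos(theta) + i sin(theta) represents rotation by theta.
    std::complex<double> r45 = std::polar(1.0, PI / 4);   // rotation through 45 degrees
    std::complex<double> p(1.0, 0.0);                     // the point (1, 0)
    std::complex<double> q = r45 * r45 * p;               // two 45-degree rotations = 90 degrees
    std::printf("(%.2f, %.2f)\n", q.real(), q.imag());    // prints (0.00, 1.00), up to rounding
}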
8.3 3D Rotation
Although there are analogies between 2D rotations and 3D rotations, 3D rotations are consid-
erably more complicated. We first note a couple of non-intuitive trivia involving 3D rotation.
1. Rotations interfere with one another in unexpected ways. Let Rx(θ) stand for a rotation through θ about the X axis, with similar conventions for the Y and Z axes. Then you can verify by drawing pictures that

Ry(180°) ∘ Rx(180°) = Rz(180°)
2. Although you cannot rotate an object through 360° without letting go of it, you can rotate it through 720°. Hold something flat on the palm of your right hand, facing upwards. Begin turning it in a counter-clockwise direction. After turning it about 360°, your arm will be twisted but you can continue turning, raising your hand above your head. Eventually, you will return to the starting position but the object will have rotated through 720°. (This is called "the plate trick".)
²The norm of a vector is the same as its length, or √(x1² + x2² + ⋯), where x1, x2, . . . are the components of the vector. In algebraic work, the square root is usually omitted. Thus we will use the squared norm for complex numbers here and for quaternions in Section 8.3.2.
The rotation group in 3D is called SO3. As with SO2, there are two important representations,
the first using matrices and the second using a generalized form of complex numbers called
quaternions (Shoemake 1994a; Shoemake 1994d; Shoemake 1994b; Shoemake 1994c). We
discuss each in turn.
8.3.1 Representing 3D Rotations with Matrices
We need a 3 × 3 matrix to represent a 3D rotation. In graphics, we use a 4 × 4 matrix with the form

[ ?  ?  ?  0 ]
[ ?  ?  ?  0 ]
[ ?  ?  ?  0 ]
[ 0  0  0  1 ]

in which the ? entries form the 3D rotation matrix. From now on, we will ignore the 0 and 1 entries of the 4 × 4 matrix because they do not contribute anything useful.
When we try to use matrices to represent rotations, we run into several difficulties.
1. The first difficulty is that the general form of the matrix, representing a rotation through an arbitrary angle about an arbitrary axis, is rather complicated. However, Leonhard Euler³ discovered that any rotation can be expressed as the product of three rotations about the principal axes. (The form of the matrix for any one of these rotations is rather simple.)

Using the notation above, an arbitrary rotation R can be written Rx(θ) ∘ Ry(φ) ∘ Rz(ψ), in which θ, φ, and ψ are called Euler angles. Because 3D rotations do not commute, we must use a consistent order: Rx(θ) ∘ Ry(φ) ∘ Rz(ψ) is not in general the same as Ry(φ) ∘ Rx(θ) ∘ Rz(ψ) or any other permutation.
2. Euler angles do not solve all of our problems. Suppose φ = 90° and the rotation has the form Rx(θ) ∘ Ry(90°) ∘ Rz(ψ). In this situation, Rx(θ) and Rz(ψ) have the same effect and there is no way of rotating about the third axis! We have lost one degree of freedom. This phenomenon is called gimbal lock because it occurs in gyroscopes: they are supported by three sets of bearings (called "gimbals") so that they can rotate in any direction. However, if one set of bearings is rotated through 90°, the other two sets of bearings become parallel and the gyroscope can no longer rotate freely.
3. In computer graphics, we frequently want to animate a rotational movement. Suppose
R1 and R2 are matrices representing rotations that describe the initial and final orien-
tations of an object. What are the intermediate orientations? One obvious idea is to
compute the matrix
R = (1 − λ) R1 + λ R2

because R = R1 when λ = 0 and R = R2 when λ = 1. Unfortunately, the values of R do not produce a smooth rotation and, even worse, R may not even be a rotation matrix!
³Euler, a Swiss mathematician, lived from 1707 to 1783. The name is pronounced roughly as "Oiler", following the German pronunciation of "eu". "We may sum up Euler's work by saying that he created a good deal of analysis, and revised almost all the branches of pure mathematics which were then known, filling up the details, adding proofs, and arranging the whole in a consistent form. Such work is very important, and it is fortunate for science when it falls into hands as competent as those of Euler."
Actually, this is quite easy to see. Suppose R1 is the identity matrix and R2 represents a rotation of 180° about the X axis. Then

R1 = [ 1  0  0  0 ]        R2 = [ 1   0   0  0 ]
     [ 0  1  0  0 ]             [ 0  −1   0  0 ]
     [ 0  0  1  0 ]             [ 0   0  −1  0 ]
     [ 0  0  0  1 ]             [ 0   0   0  1 ]

The intermediate matrix corresponding to λ = ½ is

[ 1  0  0  0 ]
[ 0  0  0  0 ]
[ 0  0  0  0 ]
[ 0  0  0  1 ]

which reduces any 3D object to a 1D line.
4. Given two rotations described by Euler angles, how do you find a rotation that corresponds to their difference? Put concretely, suppose

R1 = Rx(θ1) ∘ Ry(φ1) ∘ Rz(ψ1)
R2 = Rx(θ2) ∘ Ry(φ2) ∘ Rz(ψ2);

how do we find a rotation

R = Rx(θ) ∘ Ry(φ) ∘ Rz(ψ)

such that R1 ∘ R = R2? This is a difficult problem to solve with Euler angles.
5. Since rotation matrices contain more information than is necessary to define a rotation, rounding errors create problems. After a sequence of operations on rotation matrices, accumulated errors may give a matrix that is not exactly a rotation. The consequence, in a graphics program, is distortion of the image.
Another potential problem is that we often need inverses of rotations. Computing the
inverse of a matrix is expensive and subject to rounding errors. Fortunately, however, the
inverse of a rotation matrix is its transpose and is therefore easy to calculate. Although
we can exploit this fact when we are doing our own rotation calculations, a graphics
package such as OpenGL cannot distinguish rotation matrices from other matrices and
must use general techniques for inversion.
In summary, although it is possible to use matrices to represent 3D rotations, there are a
number of problems and a better representation is highly desirable. Fortunately, there is one.
8.3.2 Representing 3D Rotations with Quaternions
One solution of the problem of representing 3D rotations was discovered by Hamilton⁴ when he introduced quaternions.

⁴Sir William Rowan Hamilton (1805–1865), Irish mathematician who also introduced vectors and matrices and made important contributions to geometrical optics, dynamics (including the "Hamiltonian"), geometry, complex numbers, the theory of equations, real analysis, and linear operators.
      |   i    j    k
   ---+---------------
    i |  −1    k   −j
    j |  −k   −1    i
    k |   j   −i   −1

Figure 41: Quaternion multiplication: when a quaternion is represented as s + i x + j y + k z, multiplication uses this table for products of i, j, and k. Note that i² = j² = k² = −1.
Hamilton reasoned that, since 2D rotations can be represented by two numbers (x + i y, see above), it should be possible to represent 3D rotations with three numbers. For eight years, he experimented with 3D vectors but was unsuccessful, because the rotation group cannot be represented as a vector space. Eventually, he tried four numbers and succeeded very quickly: he called the new objects quaternions ("quater" is Latin for "four").
We can write a quaternion in several different ways:
• As a tuple of four real numbers: (s, x, y, z);
• As s + i x + j y + k z, by analogy to x + i y (see Figure 41);
• As a scalar/vector pair: (s, v).
We will use the last of these representations, which is also the most modern.
If the vector part of a quaternion is the zero vector (so we have (s, 0)) the quaternion behaves
exactly like a real number. If the scalar part is zero (so we have (0, v)), the quaternion
behaves exactly like a vector. We use this fact to make implicit conversions:
• the vector v can be converted to the quaternion (0, v);
• the quaternion (0, v) can be converted to the vector v.
In particular, the unit quaternion (1, 0) is essentially the same as the real number 1.
The quaternions are a number system in which all of the standard operations (addition, sub-
traction, multiplication, and division) are defined. We do not need addition and subtraction
and we will ignore them for now (there is one application which comes later).
Multiplication is defined like this:

(s1, v1)(s2, v2) = (s1 s2 − v1·v2, s1 v2 + s2 v1 + v1 × v2)

in which v1·v2 is the dot product or inner product of the vectors v1 and v2 and v1 × v2 is their outer product or cross product.

The unit (1, 0) behaves as it should:

(1, 0)(s, v) = (1·s − 0·v, 1·v + s·0 + 0 × v)
             = (s, v)
The conjugate of the quaternion q = (s, v) is the quaternion q* = (s, −v). (Compare: the conjugate of the complex number x + i y is the complex number x − i y.)

The norm of the quaternion q = (s, v) is

‖q‖ = q q*
    = (s, v)(s, −v)
    = (s² + v·v, s(−v) + s v + v × (−v))
    = (s² + v·v, 0)
    = s² + v·v

Recall that, for any vector v, v × v = 0. The norm of a quaternion is a real number. If we write out the components of the vector part of the quaternion, we have

‖(s, (x, y, z))‖ = s² + x² + y² + z²

(Compare: the norm of a complex number z = x + i y is the real number x² + y².)
We can rearrange

‖q‖ = q q*

into the form

q (q* / ‖q‖) = 1

which suggests

q⁻¹ = q* / ‖q‖

Of all the quaternions, only the zero quaternion, (0, 0), does not have an inverse.
A unit quaternion is a quaternion q with ‖q‖ = 1. Note that:

• the unit (1, 0) is an example of a unit quaternion but is not the only one;
• if q is a unit quaternion, then q⁻¹ = q*.

Let Q be the set of unit quaternions. Then Q, with multiplication as the operation, is a group:

• multiplication is closed and associative (easy to prove, although we haven't done so here);
• there is a unit, (1, 0) = 1;
• every unit quaternion has an inverse.
Consider the quaternion q = (cos θ, u sin θ) in which u is a unit vector, so that u·u = 1. Since

‖q‖ = cos² θ + (u·u) sin² θ
    = 1

q is a unit quaternion. In general, we can write any unit quaternion in the form (cos θ, u sin θ). (Compare: if z is a complex number and ‖z‖ = 1, we can write z in the form cos θ + i sin θ.)

Consider the product of unit quaternions with their vector components in the same direction (recall that u × u = 0 for any vector u):

(cos θ, u sin θ)(cos φ, u sin φ) = (cos θ cos φ − (u·u) sin θ sin φ,
                                    u cos θ sin φ + u sin θ cos φ + (u × u) sin θ sin φ)
                                 = (cos(θ + φ), u sin(θ + φ))
Multiplying unit quaternions is the same as adding angles. (Compare:

(cos θ + i sin θ)(cos φ + i sin φ) = cos(θ + φ) + i sin(θ + φ).)

We have at last reached the interesting part. Suppose q = (cos θ, u sin θ) is a unit quaternion and v is a vector. Then

v′ = q v q⁻¹

is the vector v rotated through 2θ about an axis in the direction u. This is the sense in which quaternions represent 3D rotations.
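To make this concrete, here is a minimal, self-contained C++ sketch (our own types and names, not the CUGL API used later in this section) of quaternion multiplication and the rotation v′ = q v q⁻¹:

#include <cmath>
#include <cstdio>

struct Vec { double x, y, z; };

double dot(Vec a, Vec b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
Vec cross(Vec a, Vec b) {
    return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
}

struct Quat {
    double s;  // scalar part
    Vec v;     // vector part
};

// (s1, v1)(s2, v2) = (s1 s2 - v1.v2, s1 v2 + s2 v1 + v1 x v2)
Quat mul(Quat a, Quat b) {
    Vec c = cross(a.v, b.v);
    return { a.s*b.s - dot(a.v, b.v),
             { a.s*b.v.x + b.s*a.v.x + c.x,
               a.s*b.v.y + b.s*a.v.y + c.y,
               a.s*b.v.z + b.s*a.v.z + c.z } };
}

// For a unit quaternion, the inverse is the conjugate (s, -v).
Quat conj(Quat q) { return { q.s, { -q.v.x, -q.v.y, -q.v.z } }; }

// Rotate v through 2*theta about the unit axis u, where q = (cos theta, u sin theta).
Vec rotate(Quat q, Vec v) {
    Quat p = { 0, v };                    // the pure quaternion (0, v)
    return mul(mul(q, p), conj(q)).v;     // q v q^-1
}

int main() {
    const double PI = 3.14159265358979;
    double theta = PI / 4;                // half-angle: this quaternion rotates by 90 degrees
    Quat q = { std::cos(theta), { 0, 0, std::sin(theta) } };   // axis = Z
    Vec v = rotate(q, { 1, 0, 0 });
    std::printf("(%.2f, %.2f, %.2f)\n", v.x, v.y, v.z);        // prints (0.00, 1.00, 0.00)
}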
8.3.3 A Proof that Unit Quaternions Represent Rotations
Define R(θ, u) as an operation on vectors: its effect is to rotate a vector through an angle θ about an axis defined by the unit vector u. We compute the vector R(θ, u) v. Below, we shorten this expression to R v.

The first step is to resolve v into components parallel to and orthogonal to u:

vp = (u·v) u
vo = v − (u·v) u

Since R does not affect vp, we have

R v = R(vp + vo)
    = vp + R vo

Let w be a vector perpendicular to vo and lying in the plane of the rotation. Then w must be orthogonal to u and vo and:

w = u × vo
  = u × (v − (u·v) u)
  = u × v − (u·v)(u × u)
  = u × v

since u × u = 0.

We can resolve R vo into components parallel to vo and w. In fact:

R vo = vo cos θ + w sin θ

and hence

R v = vp + R vo
    = vp + vo cos θ + w sin θ
    = (u·v) u + (v − (u·v) u) cos θ + (u × v) sin θ
    = v cos θ + u (u·v)(1 − cos θ) + (u × v) sin θ     (19)
The next step is to see how quaternions achieve the same effect. Let p = (0, v) be a pure quaternion and q = (cos θ, u sin θ). Then

q p = (cos θ, u sin θ)(0, v)
    = (−(u·v) sin θ, v cos θ + (u × v) sin θ)

and

q p q⁻¹ = (−(u·v) sin θ, v cos θ + (u × v) sin θ)(cos θ, −u sin θ)                        (20)
        = (−(u·v) sin θ cos θ − (v cos θ + (u × v) sin θ)·(−u sin θ),                     (21)
           u (u·v) sin² θ + v cos² θ + (u × v) sin θ cos θ
           − (v cos θ + (u × v) sin θ) × (u sin θ))
        = (−(u·v) sin θ cos θ + (u·v) sin θ cos θ + ((u × v)·u) sin² θ,                   (22)
           u (u·v) sin² θ + v cos² θ + (u × v) sin θ cos θ
           + (u × v) sin θ cos θ − ((u × v) × u) sin² θ)
        = (0, (cos² θ − sin² θ) v + 2 sin² θ (u·v) u + 2 sin θ cos θ (u × v))             (23)
        = (0, v cos 2θ + u (u·v)(1 − cos 2θ) + (u × v) sin 2θ)                            (24)

In (23), the scalar part becomes zero because the first two terms cancel and the third term is zero: (u × v)·u = 0 because u × v is orthogonal to u. In the vector part, we use the general fact that (b × c) × d = (d·b) c − (d·c) b which, in this case, gives (u × v) × u = (u·u) v − (u·v) u.

Comparing (19) and (24), we see that they are the same if we replace θ in (19) by 2θ: the quaternion (cos θ, u sin θ) rotates vectors through 2θ.
To gain familiarity with unit quaternions, we consider a few simple examples. We will use the form (cos θ, u sin θ) for the general unit quaternion.

• First, assume θ = 0. Then the quaternion is (1, 0·u) or simply (1, 0). The direction of the unit vector makes no difference if the amount of rotation is zero.

• Next, suppose θ = 90°. The quaternion then has the form (0, u). Since the angle of rotation is 2θ, a pure unit quaternion represents a rotation through 180° about the unit vector component of the quaternion.
8.3.4 Quaternions and Matrices
We can think of the quaternion product q q′ as an operation: q is applied to q′. The operation q can be represented as a matrix. We call it a "left operation" because q is on the left of q′ (this is important because quaternion multiplication is not commutative). If q = (s, (x, y, z)), we can calculate the matrix from the definition of the quaternion product and obtain:

Lq = [  s  −z   y   x ]
     [  z   s  −x   y ]
     [ −y   x   s   z ]
     [ −x  −y  −z   s ]

Symmetrically, we can consider q′ q*, in which q* is a right operator acting on q′; the corresponding matrix is:

Rq = [  s  −z   y  −x ]
     [  z   s  −x  −y ]
     [ −y   x   s  −z ]
     [  x   y   z   s ]
Since matrix multiplication is associative, the matrix Lq Rq represents the effect of q v q* on the vector v. In other words, it is the rotation matrix corresponding to the quaternion q:

Lq Rq = [ s² + x² − y² − z²   2(xy − sz)          2(xz + sy)          0                  ]
        [ 2(xy + sz)          s² − x² + y² − z²   2(yz − sx)          0                  ]
        [ 2(xz − sy)          2(yz + sx)          s² − x² − y² + z²   0                  ]
        [ 0                   0                   0                   s² + x² + y² + z²  ]

In the cases we are interested in, ‖q‖ = 1, and this matrix simplifies to

Q = [ 1 − 2(y² + z²)   2(xy − sz)        2(xz + sy)        0 ]
    [ 2(xy + sz)       1 − 2(x² + z²)    2(yz − sx)        0 ]
    [ 2(xz − sy)       2(yz + sx)        1 − 2(x² + y²)    0 ]
    [ 0                0                 0                 1 ]
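As an illustration (our own helper, not part of OpenGL or CUGL), the simplified matrix Q can be written into a column-major array of the kind that glMultMatrixf expects:

#include <cmath>
#include <cstdio>

// Build the 4x4 rotation matrix Q for a unit quaternion q = (s, (x, y, z)).
// OpenGL uses column-major order: m[col*4 + row].
void quatToMatrix(double s, double x, double y, double z, float m[16])
{
    m[0] = float(1 - 2*(y*y + z*z)); m[4] = float(2*(x*y - s*z));     m[8]  = float(2*(x*z + s*y));     m[12] = 0;
    m[1] = float(2*(x*y + s*z));     m[5] = float(1 - 2*(x*x + z*z)); m[9]  = float(2*(y*z - s*x));     m[13] = 0;
    m[2] = float(2*(x*z - s*y));     m[6] = float(2*(y*z + s*x));     m[10] = float(1 - 2*(x*x + y*y)); m[14] = 0;
    m[3] = 0;                        m[7] = 0;                        m[11] = 0;                        m[15] = 1;
}

int main()
{
    float m[16];
    quatToMatrix(std::cos(0.5), 0, 0, std::sin(0.5), m);   // rotation about Z
    std::printf("m[0] = %.3f\n", m[0]);
    // In a display function, the matrix could be applied with glMultMatrixf(m).
}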
It is also possible, of course, to convert a rotation matrix into a quaternion. The algebra is
rather heavy and will not be presented here.
8.3.5 Quaternion Interpolation
One of the advantages of the quaternion representation is that we can interpolate smoothly between two orientations.

The problem that we wish to solve is this: given two rotations, R and R′, how do we construct a sequence of rotations R1, R2, . . . , Rn such that R1 = R and Rn = R′, in such a way that, when we apply these rotations to a graphical object, it appears to rotate smoothly? Note that if R and R′ are represented by matrices, this is not an easy problem to solve.

If we represent the rotations R and R′ as quaternions q and q′, respectively, there is a fairly simple solution — although its derivation is tricky.

Let θ = cos⁻¹(q·q′) and define

slerp(t, q, q′) = (q sin((1 − t)θ) + q′ sin(tθ)) / sin θ

in which "slerp" stands for spherical linear interpolation. Then

slerp(0, q, q′) = q
slerp(1, q, q′) = q′

and, for 0 ≤ t ≤ 1, slerp(t, q, q′) is a quaternion that smoothly interpolates from q to q′.
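A sketch of slerp in C++ (our own code; a production version would also negate q′ when q·q′ < 0, to interpolate along the shorter arc, and guard against sin θ ≈ 0):

#include <cmath>
#include <cstdio>

struct Quat { double s, x, y, z; };

// slerp(t, q, r) = (q sin((1-t)theta) + r sin(t theta)) / sin theta,
// where theta = acos(q . r) and q, r are unit quaternions.
Quat slerp(double t, Quat q, Quat r)
{
    double d = q.s*r.s + q.x*r.x + q.y*r.y + q.z*r.z;   // q . r
    double theta = std::acos(d);
    double a = std::sin((1 - t) * theta) / std::sin(theta);
    double b = std::sin(t * theta) / std::sin(theta);
    return { a*q.s + b*r.s, a*q.x + b*r.x, a*q.y + b*r.y, a*q.z + b*r.z };
}

int main()
{
    Quat q = { 1, 0, 0, 0 };                             // identity
    Quat r = { std::cos(0.5), 0, 0, std::sin(0.5) };     // a rotation about Z
    Quat h = slerp(0.5, q, r);                           // halfway orientation
    std::printf("%.3f %.3f %.3f %.3f\n", h.s, h.x, h.y, h.z);
}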
8.4 Quaternions in Practice
The following three programs (available on the web site) illustrate applications of quaternions
to graphics programming. In each case, the same effect is hard to achieve with matrices
although, of course, anything is possible.
void mouseMovement (int xNew, int yNew)
{
const int MSTART = -10000;
static int xOld = MSTART;
static int yOld = MSTART;
if (xOld == MSTART && yOld == MSTART)
{
xOld = xNew + 1;
yOld = yNew + 1;
}
quat.trackball(
float(2 * xOld - width) / float(width),
float(height - 2 * yOld) / float(height),
float(2 * xNew - width) / float(width),
float(height - 2 * yNew) / float(height) );
xOld = xNew;
yOld = yNew;
glutPostRedisplay();
}
Figure 42: Mouse callback function for trackball simulation
8.4.1 Imitating a Trackball
A trackball is a device used for motion input in professional CAD workstations and also for
some games. It is a sphere with only the top accessible; by moving your hand over the sphere,
you send rotational information to the computer. The purpose of this program is to use the
mouse to simulate a trackball: when you move the mouse, it is as if you were moving your
hand over a trackball.
The hard part is performed by CUGL. We describe the API first and then the underlying
implementation. The first step is to declare a quaternion:
Quaternion quat;
In the display function, the quaternion is used to rotate the model:
quat.rotate();
buildPlane();
Most of the work is done by the mouse callback function, shown in Figure 42. This func-
tion stores the previous mouse position, (xOld,yOld), and the most recent mouse position,
(xNew,yNew).
To avoid a sudden jump when the program is started, the first block of code ensures that the
old and new values are close together initially.
double project(double x, double y)
{
double dsq = x * x + y * y;
double d = sqrt(dsq);
if (d < BALLRADIUS * 0.5 * sqrt(2))
return sqrt(BRADSQ - dsq);
else
return BRADSQ / (2 * d);
}
Figure 43: Projecting the mouse position
The function passes transformed values of the old and new mouse positions to the function Quaternion::trackball. The transformation ensures that the arguments are all in the range [−1, 1] (assuming that the mouse stays inside the window).
Finally, the old mouse position is updated to become the new mouse position and the function
calls glutPostRedisplay to refresh the view.
From the user’s point of view, that’s all there is to do. We can look behind the scenes and
see what CUGL is doing.
The implementation uses a couple of constants: the radius, r, of the simulated trackball and r²:
const double BALLRADIUS = 0.8f;
const double BRADSQ = BALLRADIUS * BALLRADIUS;
The mouse position (x, y) is projected onto a sphere to obtain a 3D point (x, y, z) such that x² + y² + z² = r². In practice, the effect of a mouse movement becomes too extreme as the mouse approaches the edge of the ball, and we project onto a hyperboloid instead if x² + y² ≥ ½ r². All of this is done by the auxiliary function project shown in Figure 43.
The real work is done by Quaternion::trackball, shown in Figure 44. First, the vectors v1
and v2 are initialized to the projected positions of the mouse on the sphere or hyperboloid.
The idea is to compute a quaternion r that represents the rotation corresponding to these
two points and to multiply the current quaternion by r.
The function computes the vectors a = v2 × v1 and d = v1 − v2 and the real number t = ‖d‖ / (2r). The value of t is then clipped if necessary to ensure that −1 ≤ t ≤ 1, and then θ is set to sin⁻¹ t.

We require a rotation through 2θ about an axis parallel to a. The corresponding quaternion is (cos θ, â sin θ), where â = a / ‖a‖.
8.4.2 Moving the Camera
The problem solved by the next program is to provide a set of commands that move the
camera in a consistent way. The program provides six ways of translating the camera (left,
right, up, down, forwards, and backwards) and four ways of rotating it (pan left, pan right,
void Quaternion::trackball(double x1, double y1, double x2, double y2)
{
Vector v1(x1, y1, project(x1, y1));
Vector v2(x2, y2, project(x2, y2));
Vector a = cross(v2, v1);
Vector d = v1 - v2;
double t = d.length() / (2 * BALLRADIUS);
if (t > 1) t = 1;
if (t < -1) t = -1;
double theta = asin(t);
(*this) *= Quaternion(cos(theta), a.normalize() * sin(theta));
};
Figure 44: Updating the trackball quaternion
tilt up, and tilt down). We know that this is difficult to do with Euler angles because of
the “gimbal lock” problem described in Section 8.3.1. Moreover, the matrix solution requires
computing matrix inverses, as we will see. The program shows how to do it with quaternions.
The program uses a vector to store the position of the camera. The initial value of the vector is (0, −h, 0), where h is the height of the camera. The height is negative because OpenGL translates the scene, not the camera.
const Vector STARTPOS = Vector(0, - INITIAL_HEIGHT, 0);
Vector pos = STARTPOS;
The orientation of the camera is stored in a quaternion. The initial value of the quaternion
is (1, 0), which is established by the default constructor.
const Quaternion STARTQUAT;
Quaternion quat = STARTQUAT;
In the display function, the quaternion and the vector are used to rotate and position the
camera:
quat.rotate();
pos.translate();
scene();
The user translates the camera by pressing one of the keys 'f', 'b', 'l', 'r', 'u', or 'd'. Each key invokes the function move with a unit vector, as shown in Figure 45. The unit vectors are defined in Figure 46.
Figure 47 shows the function move. We cannot simply update the position vector, because the direction of movement would depend on the orientation, which is not what we want. Suppose that the quaternion controlling the orientation is q. Then we apply q⁻¹ to the current position, obtaining the vector w giving the initial orientation of the camera. We then apply the translation u to this vector and apply q to the result.
void graphicKeys (unsigned char key, int x, int y)
{
switch (key)
{
case 'f':
move(K);
break;
case 'b':
move(- K);
break;
case 'l':
move(I);
break;
case 'r':
move(- I);
break;
case 'u':
move(- J);
break;
case 'd':
move(J);
break;
case 's':
pos = STARTPOS;
quat = STARTQUAT;
break;
case 27:
exit(0);
default:
break;
}
glutPostRedisplay();
}
Figure 45: Translating the camera
const Vector I = Vector(1, 0, 0);
const Vector J = Vector(0, 1, 0);
const Vector K = Vector(0, 0, 1);
Figure 46: Unit vectors
void move(Vector u)
{
pos += quat.apply(u);
}
Figure 47: Auxiliary function for translating the camera
void functionKeys (int key, int x, int y)
{
const double DELTA = radians(5);
switch (key)
{
case GLUT_KEY_UP:
quat *= Quaternion(I, DELTA);
break;
case GLUT_KEY_DOWN:
quat *= Quaternion(I, - DELTA);
break;
case GLUT_KEY_LEFT:
quat *= Quaternion(J, DELTA);
break;
case GLUT_KEY_RIGHT:
quat *= Quaternion(J, - DELTA);
break;
}
glutPostRedisplay();
}
Figure 48: Rotating the camera
Figure 48 shows the callback function for rotating the camera. The arrow keys ↑ and ↓ tilt the camera up and down, and the arrow keys ← and → pan left and right. In each case, the current quaternion is multiplied by a quaternion that rotates 5° about the given axis — I for tilts and J for pans. Note that the call radians(5) converts 5° to radians, as required by the quaternion constructor.
8.4.3 Flying
The final program allows the user to “fly” a plane by using the arrow keys. The plane moves
forwards with uniform speed and the arrow keys change its orientation. The problem is to
update the position of the plane in a way that is consistent with its orientation. As before,
this is difficult to do with matrices and Euler angles could lock up. The following solution
uses quaternions.
The speed of the plane is constant and its velocity is in the direction −Z, because this is the way the plane's coordinates are set up.
const Vector VEL(0, 0, -100);
The plane has a current velocity, position, and orientation. Note that velocity is not in
general equal to VEL but rather VEL rotated by orientation.
Vector velocity;
Vector position;
Quaternion orientation;
When the user presses 'r', the velocity, position, and orientation are reset to their initial values. The order of the statements is important because orientation is used to set velocity.
void reset()
{
orientation = Quaternion(J, radians(90));
velocity = orientation.apply(VEL);
position = Vector();
}
The display function translates the plane and then uses the inverse of the orientation quater-
nion to set its direction. (The inversion could be avoided by reversing the effect of the arrow
keys in the special key callback function.)
position.translate();
orientation.inv().rotate();
glCallList(plane);
The idle callback function performs a simple integration, adding v dt to the current position.
void idle ()
{
position += velocity * DELTA_TIME;
glutPostRedisplay();
}
Figure 49 shows the callback function that handles the arrow keys for controlling the plane.
Each key changes the orientation by a small amount and re-computes the velocity by applying
the new orientation to the initial velocity VEL. The small orientation changes are defined by
constant quaternions:
const Quaternion climb(I, DELTA_TURN);
const Quaternion left(J, DELTA_TURN);
const Quaternion roll(K, DELTA_TURN);
const Quaternion climbInv = climb.inv();
const Quaternion leftInv = left.inv();
const Quaternion rollInv = roll.inv();
void functionKeys (int key, int x, int y)
{
switch (key)
{
case GLUT_KEY_UP:
orientation *= climb;
velocity = orientation.apply(VEL);
break;
case GLUT_KEY_DOWN:
orientation *= climbInv;
velocity = orientation.apply(VEL);
break;
case GLUT_KEY_LEFT:
orientation *= left;
velocity = orientation.apply(VEL);
break;
case GLUT_KEY_RIGHT:
orientation *= leftInv;
velocity = orientation.apply(VEL);
break;
case GLUT_KEY_END:
orientation *= roll;
velocity = orientation.apply(VEL);
break;
case GLUT_KEY_PAGE_DOWN:
orientation *= rollInv;
velocity = orientation.apply(VEL);
break;
}
}
Figure 49: Callback function for flying the plane
9 Theory of Illumination
This section is an expanded version of Appendix B from Getting Started with OpenGL.
To obtain realistic images in computer graphics, we need to know not only about light but also
what happens when light is reflected from an object into our eyes. The nature of this reflection
determines the appearance of the object. The general problem is to use the properties of the
light sources and the materials to compute the apparent colour at each pixel that corresponds
to part of an object on the screen.
9.1 Steps to Realistic Illumination
We discuss various techniques for solving this problem, increasing the realism at each step.
In each case, we define the intensity, I, of a pixel in terms of a formula. The first few
techniques ignore colour.
9.1.1 Intrinsic Brightness
We assume that each object has an intrinsic brightness ki. Then

I = ki
This technique can be used for simple graphics, and is essentially the technique that OpenGL
uses when lighting is disabled, but it is clearly unsatisfactory. There is no attempt to model
properties of the light or its effect on the objects.
9.1.2 Ambient Light
We assume that there is ambient light (light from all directions) with intensity Ia and that each object has an ambient reflection coefficient ka. This gives

I = Ia ka
In practice, the ambient light technique looks a lot like the intrinsic brightness technique.
9.1.3 Diffuse Lighting
We assume that there is a single, point source of light and that the object has diffuse or
Lambertian reflective properties. This means that the light reflected from the object depends
only on the incidence angle of the light, not the direction of the viewer.
More precisely, suppose that: N is a vector normal to the surface of the object; L is a vector corresponding to the direction of the light; and V is a vector corresponding to the direction of the viewer. Figure 50 shows these vectors: note that V is not necessarily in the plane defined by L and N. Assume that all vectors have unit length (‖N‖ = ‖L‖ = ‖V‖ = 1). Then

I = N·L
Note that:
[Diagram: a point on the object's surface with the unit vectors N (surface normal), L (direction of the light), R (reflection of L about N), and V (direction of the viewer).]

Figure 50: Illuminating an object
• V does not appear in this expression and so the brightness does not depend on the viewer's position;
• the brightness is greatest when N and L are parallel (N·L = 1); and
• the brightness is smallest when N and L are orthogonal (N·L = 0).
We can account for Lambertian reflection in the following way. Suppose that the beam of light has cross-section area A and it strikes the surface of the object at an angle θ. Then the area illuminated is approximately A / cos θ. After striking the object, the light is scattered uniformly in all directions. The apparent brightness to the viewer is inversely proportional to the area illuminated, which means that it is proportional to cos θ, the inner product of the light vector and the surface normal.
We introduce Ip, the incident light intensity from a point source, and kd, the diffuse reflection coefficient of the object. Then

I = Ip kd (N·L)

The value of N·L can be negative: this will be the case if the light is underneath the surface of the object. We usually assume that such light does not contribute to the illumination of the surface. In calculations, we should use max(N·L, 0) to keep negative contributions out of our results.

If we include some ambient light, this equation becomes

I = Ia ka + Ip kd (N·L)
9.1.4 Attenuation of Light
Light attenuates (gets weaker) with distance from the source. The theoretical rate of attenuation for a point source of light is quadratic. In practice, sources are not true points and there is always some ambient light from reflecting surfaces (although ambient light is very weak in outer space). Consequently, we assume that attenuation, f, is given by

f = 1 / (C + L d + Q d²)
where:

• d is the distance between the light and the object;
• C (constant attenuation) ensures that a close light source does not give an infinite amount of light;
• L (linear term) allows for the fact that the source is not a point; and
• Q (quadratic term) models the theoretical attenuation from a point source.

Then we have

I = Ia ka + f Ip kd (N·L)
9.1.5 Coloured Light
The previous calculations ignore colour. In this section, we assume that:
• the object has diffuse colour factors Odr (red), Odg (green), and Odb (blue);
• the light has intensity colour factors corresponding to ambient sources (Iar, Iag, and Iab) and to point sources (Ipr, Ipg, and Ipb).

All of these numbers are in the range [0, 1]. We now have three intensity equations (with λ = r, g, b) of the form

Iλ = Iaλ ka Odλ + f Ipλ kd Odλ (N·L)
9.1.6 Specular Reflection
Lambertian reflection is a property of dull objects such as cloth or chalk. Many objects
exhibit degrees of shininess: polished wood has some shininess and a mirror is the ultimate in
shininess. The technical name for shininess is specular reflection. A characteristic feature
of specular reflection is that it has a colour closer to the colour of the light source than the
colour of the object. For example, a brown table made of polished wood that is illuminated
by a white light will have specular highlights that are white, not brown.
Specular reflection depends on the direction of the viewer as well as the light. We introduce a new vector, R (the reflection vector), which is the direction in which the light would be reflected if the object was a mirror (see Figure 50). The brightness of specular reflection depends on the angle between R and V (the angle of the viewer). For Phong shading (developed by Bui-Tuong Phong), we assume that the brightness is proportional to (R·V)ⁿ, where n = 1 corresponds to a slightly glossy surface and n = ∞ corresponds to a perfect mirror. We now have

I = Ia ka Od + f Ip (kd Od (N·L) + ks (R·V)ⁿ)

where ks is the specular reflection coefficient and n is the specular reflection exponent.
[Diagram: the incident vector L and its mirror image R on either side of the normal N, with S the common component orthogonal to N and N cos θ the projection of L onto N.]

Figure 51: Calculating R
Calculating the Reflection Vector The reflection vector R is the mirror image of the incident vector L relative to the normal vector N. We assume that L and N are unit vectors. Consequently, the projection of L onto N is a vector with length cos θ in the direction of N, or N cos θ. As we can see from the right side of Figure 51:

R = N cos θ + S

Similarly, from the left side of Figure 51:

S = N cos θ − L

Adding these equations gives

R + S = N cos θ + S + N cos θ − L

which simplifies to

R = 2 N cos θ − L

Since cos θ = N·L, we can calculate R from

R = 2 N (N·L) − L
To calculate specular reflection, we actually need R·V. The time needed for this calculation depends on the assumptions made about the light source and the viewer:

• If the light source and the viewer are assumed to be at infinity, R and V are both constant across the polygon and it is necessary to calculate R·V only once for the polygon.
• If the light source is assumed to be at infinity (a directional light) but the viewer is nearby, R is constant across the polygon but V varies, and we must calculate V and R·V for each pixel.
• If the light source is nearby (a positional light) and the viewer is nearby, both R and V vary across the polygon, and we must calculate R, V, and R·V for each pixel.
9.1.7 Specular Colours
In practice, the colour of specular reflection is not completely independent of the colour of the object. To allow for this, we can give the object a specular colour Os. Then we have

I = Ia ka Od + f Ip (kd Od (N·L) + ks Os (R·V)ⁿ)

This equation represents our final technique for lighting an object and is a close approximation to what OpenGL actually does.
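As an illustration, the following C++ sketch (our own names; one colour channel and a single point light) evaluates this equation, using max(N·L, 0) as recommended above:

#include <cmath>
#include <algorithm>
#include <cstdio>

struct Vec3 { double x, y, z; };
double dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
Vec3 scale(double k, Vec3 v) { return { k*v.x, k*v.y, k*v.z }; }
Vec3 sub(Vec3 a, Vec3 b) { return { a.x-b.x, a.y-b.y, a.z-b.z }; }

// Evaluate I = Ia ka Od + f Ip (kd Od (N.L) + ks Os (R.V)^n) for one
// colour channel. N, L, V are unit vectors; d is the distance to the
// light; C, Lc, Q are the attenuation coefficients (Lc to avoid a
// clash with the vector L).
double lighting(Vec3 N, Vec3 L, Vec3 V,
                double Ia, double Ip, double ka, double kd, double ks,
                double Od, double Os, double n,
                double d, double C, double Lc, double Q)
{
    double f = 1.0 / (C + Lc * d + Q * d * d);       // attenuation
    double NL = std::max(dot(N, L), 0.0);            // ignore light below the surface
    Vec3 R = sub(scale(2.0 * dot(N, L), N), L);      // R = 2 N (N.L) - L
    double RV = std::max(dot(R, V), 0.0);
    return Ia * ka * Od + f * Ip * (kd * Od * NL + ks * Os * std::pow(RV, n));
}

int main()
{
    Vec3 N = { 0, 1, 0 }, L = { 0, 1, 0 }, V = { 0, 1, 0 };
    double I = lighting(N, L, V, 0.2, 1.0, 0.3, 0.7, 0.5, 0.8, 1.0, 20.0, 2.0, 1.0, 0.0, 0.0);
    std::printf("I = %.3f\n", I);
}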
9.1.8 Multiple Light Sources
If there are several light sources, we simply add their contributions. If the sum of the con-
tributions exceeds 1, we can either “clamp” the value (that is, use 1 instead of the actual
result) or reduce all values in proportion so that the greatest value is 1. Clamping is cheaper
computationally and usually sufficient.
The actual calculation performed by OpenGL is:

Vλ = Oeλ + Ma Oaλ + Σ(i = 0 to n−1) [ 1 / (kc + kl d + kq d²) ]ᵢ sᵢ [ Ia Oa + (N·L) Id Od + (R·V)^σ Is Os ]λᵢ

where

V = Vertex brightness
Ma = Ambient light model
kc = Constant attenuation coefficient
kl = Linear attenuation coefficient
kq = Quadratic attenuation coefficient
d = Distance of light source from vertex
si = Spotlight effect
Ia = Ambient light
Id = Diffuse light
Is = Specular light
Oe = Emissive brightness of material
Oa = Ambient brightness of material
Od = Diffuse brightness of material
Os = Specular brightness of material
σ = Shininess of material

and the subscript λ indicates colour components, and the subscript i denotes one of the lights.
9.2 Polygon Shading
The objects in graphical models are usually defined as many small polygons, typically triangles
or rectangles. We must choose a suitable colour for each visible pixel of a polygon: this is
called polygon shading.
9.2.1 Flat Shading
In flat shading, we compute a vector normal to the polygon and use it to compute the colour
for every pixel of the polygon. The computation implicitly assumes that:
• the polygon is really flat (not an approximation to a curved surface);
• N·L is constant (the light source is infinitely far away); and
• N·V is constant (the viewer is infinitely far away).
Flat shading is efficient computationally but not very satisfactory: the edges of the polygons
tend to be visible and we see a polyhedron rather than the surface we are trying to approx-
imate. (The edges are even more visible than we might expect, due to the subjective Mach
effect, which exaggerates a change of colour along a line.)
9.2.2 Smooth Shading
In smooth shading, we compute normals at the vertices of the polygons, averaging over the
polygons that meet at the vertex. If we are using polygons to approximate a smooth surface,
these vectors approximate the true surface normals at the vertices. We compute the colour
at each vertex and then colour the polygons by interpolating the colours at interior pixels.
Smooth shading of coloured objects requires interpolating colour values. (Suppose we have
a line AB and we know the colour at A and the colour at B. Then interpolation enables us
to calculate all the colours between A and B.) It is not clear that interpolation of colours is
even possible. However, in Section 10 we will discover that interpolation is indeed possible
and not even very difficult.
There are several varieties of smooth shading. The most important are Gouraud shading⁵ and Phong shading.⁶
Gouraud Shading is a form of smooth shading that uses a particular kind of interpolation
for efficiency.
1. Compute the normal at each vertex of the polygon mesh. For analytical surfaces, such
as spheres and cones, we can compute the normals exactly. For surfaces approximated
by polygons, we use the average of the surface normals at each vertex.
2. Compute the light intensity for each colour at each vertex using a lighting model (e.g.,
Section 9.1.7 above).
3. Interpolate intensities along the edges of the polygons.
4. Interpolate intensities along scan lines within the polygons.
Figure 52 illustrates Gouraud shading. We assume a rectangular viewing window, W , and
a polygon with vertices v1, v2, v3 to be displayed. The first step calculates the colours at
the vertices; the second step interpolates between the vertices to obtain the colours along
the edges of the polygon. When a scan line s crosses the polygon, find the points p1 and
⁵Henri Gouraud. Continuous Shading of Curved Surfaces. IEEE Trans. Computers, C–20(6), June 1971, 623–9.
⁶Bui-Tuong Phong. Illumination for Computer Generated Pictures. Comm. ACM, 18(6), June 1975, 311–7.
[Diagram: a viewing window W containing a polygon with vertices v1, v2, and v3; a scan line s crosses the polygon's edges at the points p1 and p2.]

Figure 52: Gouraud Shading
[Diagram: a scan line s crossing a polygon at the points p1 and p2, with the averaged normal vectors v1 and v2 drawn at those points.]

Figure 53: Phong Shading
p2 where it crosses the edges of the polygon and find the colours there. Finally, interpolate
colours between p1 and p2 to obtain the correct colour for each pixel on the scan line.
Phong Shading is similar to Gouraud shading but interpolates the normals rather than
the intensities. Phong shading requires more computation than Gouraud shading but gives
better results, especially for specular highlights.
Figure 53 illustrates Phong shading. The scan line is s and the edges of the polygon are at p1
and p2. The averaged normal vectors at these points are v1 and v2. The algorithm moves to
each pixel between p1 and p2 and computes the normal vector there by interpolating between
v1 and v2.
10 The Theory of Light and Colour
It is possible to include diagrams in these notes (e.g., the CIE chromaticity diagram)
but this has the effect of making the files much larger (megabytes rather than kilo-
bytes). Consequently, these notes are mainly text and the diagrams can be downloaded
from the course website.
The purpose of this section is to provide partial answers to the questions:
˘ What is light?
˘ How do we perceive light?
˘ How do we create the illusion of light and colour on a computer screen?
The first two questions have simple answers that are not very useful: light consists of photons
with various energy levels; and we perceive light when photons cause chemical changes in the
retinas of our eyes. We need to know a bit more in order to understand how it is possible to
create quite good illusions with relatively simple equipment.
10.1 Physiology of the Eye
The eye has many parts; the most important for this discussion are:
• The lens and cornea at the front of the eye, which focus light onto the retina at the back of the eye.
• The iris diaphragm, which enables the eye to control the size of the lens aperture and hence the amount of light reaching the retina.
• The retina, which consists of light-sensitive cells.
The light sensitive cells of the retina are of two kinds: rods and cones. (The names “rod”
and “cone” come from the shape of the cells.)
• Rods are very sensitive to intensity but do not distinguish colours well. Most rods are positioned away from the centre of the retina. At low levels of illumination (e.g., at night), we see mainly with our rods: we don't see much colour and very dim lights are best seen by looking to one side of them.
• Cones are about a thousand times less sensitive to light than rods, but they do perceive colours. The centre of the retina has the highest density of cones, and that is where our vision is sharpest and we are most aware of colour. There is a small region, called the fovea, where the density of cones is highest. When we "look at" an object, we have adjusted our eyes so that the image of that object falls onto the fovea.
• There are three kinds of cones: roughly speaking, they respond to red, green, and blue light. In reality, there is a lot of overlap, and each cone responds to some extent to all colours.
If ears were like eyes, we could distinguish only low-, medium-, and high-pitched sounds; we
could not use speech to communicate and we could not enjoy music. The ear achieves this
with the aid of approximately 16,000 receptors (the “hair cells”), each responding to a slightly
different frequency. The trade-off, of course, is that ears cannot determine the direction of
the source of sound precisely.
Device               Dynamic Range   Perceptible levels
Cathode-ray tube     50 – 200        400 – 550
Photographic print   100             450
Photographic slide   1000            700
Newsprint            10              200

Figure 54: Dynamic range and perceptible steps for various devices
10.2 Achromatic Light
Before considering coloured light, we will have a brief look at achromatic light — literally,
light without colour. More precisely, in this section, we will ignore the coloured components
that all light has and consider only the brightness, or intensity, of the light.
The eye responds to a wide range of brightnesses. The dimmest light that we can respond to is about 10⁻⁶ cd/m² (where 'cd' stands for candelas and 'm' stands for metres). At this intensity, each visual receptor in the eye is receiving about one photon every 10 minutes; the reason we can see anything at all is that the eye can integrate the responses of many receptors. The brightest light that we can respond to without damaging the retina is about 10⁸ cd/m² — or 10¹⁴ times as much light.
Not surprisingly, given this range, our response to light intensity is not linear but logarithmic.
A trilight typically can be set to emit 50, 100, or 150 watts. We see the step from 50 to 100
watts as being greater than the step from 100 to 150 watts. To achieve even steps, the trilight
should have settings of 50, 100, and 200 watts — so that each setting doubles the intensity.
Suppose that we have a device, such as a computer monitor, that can emit light at various intensities. There will be a minimum intensity Imin and a maximum intensity Imax, where Imin should be set so that we can just see the effect and Imax is probably the highest intensity that the device is capable of. The intermediate intensities should be

Imin rⁿ

where 0 ≤ n ≤ N and r is chosen so that Imin r^N = Imax, or r = (Imax / Imin)^(1/N). The ratio Imax / Imin is called the dynamic range of the device.

Ideally, the steps should be small enough that we cannot detect the step from Imin rⁿ to Imin r^(n+1). The number of steps needed depends on the dynamic range. Figure 54 shows the dynamic range and number of perceptible steps for some common devices. If we use one byte (8 bits) to encode intensity, we have 2⁸ = 256 levels. From Figure 54, we can see that this is enough for newsprint but not for the other devices, CRTs in particular. However, if we use three bytes (24 bits) to encode three colours, we have 2²⁴ ≈ 16 million distinct codes, which should provide enough levels of brightness for most purposes.
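A small sketch (our own code, with made-up device values) of how the intermediate levels might be generated:

#include <cmath>
#include <cstdio>

int main()
{
    // Hypothetical device: dynamic range Imax/Imin = 100, quantized into
    // N steps so that each step increases intensity by the same factor r.
    const double Imin = 1.0, Imax = 100.0;
    const int N = 8;
    double r = std::pow(Imax / Imin, 1.0 / N);    // so that Imin * r^N = Imax
    for (int n = 0; n <= N; ++n)
        std::printf("level %d: %.3f\n", n, Imin * std::pow(r, n));
}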
10.3 Coloured Light
As mentioned in Section 10.1, the cones in our eyes distinguish colours. “Colour” is a sensation;
the physical cause of colour is the wavelength of photons. Most light consists of a large number
of photons with different wavelengths. (An important exception is the photons emitted from
a laser, which all have the same wavelength; laser light is called monochromatic.) We see photons as light if they have a wavelength between 400 nm and 700 nm ('nm' stands for nanometre and 1 nm = 10⁻⁹ metre). Photons with long wavelengths look red and photons with short wavelengths look blue. We can detect photons with longer wavelengths than red light, but we call the effect "heat" rather than "light".
A light source has a power spectrum which associates a power, or intensity, with each wavelength. It is common practice to use λ to stand for a wavelength. We describe a source of light as a function, P(λ), which gives the power of the source at each wavelength.

Similarly, the response of a receptor is also a function of the wavelength of the light. We can write R(λ), G(λ), and B(λ) for the response of the red, green, and blue receptors respectively. The corresponding curves have a single hump that corresponds to the wavelength of greatest sensitivity.
Our response to a light source is a triple of three real numbers (r, g, b), where

r = ∫[λ=400..700] R(λ) P(λ) dλ
g = ∫[λ=400..700] G(λ) P(λ) dλ
b = ∫[λ=400..700] B(λ) P(λ) dλ
It looks from these equations as if the analysis of colour vision would be alarmingly complex. Fortunately, the eye has interesting properties that enable us to make significant simplifications. The first thing to notice is that although there are very many possible power spectra — essentially any set of values defined between λ = 400 and λ = 700 defines a power spectrum — our perception is confined to the three numbers (r, g, b). This means that if two different sources of light, with different power spectra, give the same values of (r, g, b), then we cannot distinguish them.
Two light sources that appear to have the same colour are called metamers.
When we use the common expression “adding red and green gives yellow”, what we are really
saying is that a light source with red and green components, and another light source with a
single yellow component, are metamers.
The following results are based on experiments performed by asking people what they see. To
obtain objectivity, the subjects are not asked questions like “Does this look green or blue to
you?” but instead are asked “Are these colours the same or different?” Since people provide
highly consistent answers to this question, colour theory has become an objective science.
The first phenomenon is this: suppose X and Y are metamers. Then, for any colour Z, X + Z and Y + Z are metamers. The "+" sign here stands for adding light. For example, we might have two projectors projecting pools of light X and Y onto a screen. Viewers state that the two pools of light have the same colour. A third projector, emitting light Z, is then switched on. The viewers will agree that the areas where light from both projectors X and Z hits the screen are indistinguishable from the areas where light from projectors Y and Z hits the screen.
Next, light of any colour can be obtained by mixing light from three colours in the right
proportions. (This statement is not precisely true: the exceptions will be discussed shortly.)
The three colours are called primaries.
Suppose that we use colours R, G, and B as primaries (although the names suggest red, green, and blue, we do not have to use these colours as primaries). Then the claim is that, for any colour X, we can choose factors α, β, and γ such that

X = α R + β G + γ B

This in itself is not very interesting. The interesting part is that the linear formula is not just a convenient way of writing: colour mixing really is linear. Suppose we have two colours X and X′ and we have found appropriate weightings of the primary colours for them:

X = α R + β G + γ B
X′ = α′ R + β′ G + γ′ B

Then we can obtain X + X′ simply by summing the primary weights:

X + X′ = (α + α′) R + (β + β′) G + (γ + γ′) B

This implies that colour has the properties of a three-dimensional vector space and that any three colours form a basis for this space.
As mentioned, there are problems with this representation. The first one is obvious: the three primaries must be linearly independent. The system would not work if we could find α and β such that B = α R + β G.

There is a more serious problem. It is true that we can represent all colours in the form X = α R + β G + γ B, where now R, G, and B actually do stand for red, green, and blue. The problem is that some of the values of α are negative! In fact, if we use any three visible colours as a basis, we will need negative coefficients to obtain some colours.

"Negative colours", of course, do not exist in reality. We can interpret an equation of the form

C = 0.2 R − 0.1 G + 0.8 B

which apparently defines a colour C with "negative green" as

C + 0.1 G = 0.2 R + 0.8 B.

That is, we must add green to C to match the equivalent colour 0.2 R + 0.8 B.
There are three technical terms that we use when discussing colours.
1. The brightness of a colour is called its luminance. Colours that we consider different
may in fact be the “same” colour with a different luminance. For example, olive brown
is dark yellow.
2. The “colour” of a colour is called its hue. Red and green are different hues, but yellow
and olive brown (see above) have the same hue and differ only in luminance.
3. Suppose that we compare a colour with a grey with the same luminance. Their closeness
depends on the saturation of the colour. A highly saturated colour, such as pure red,
is far from grey, but an unsaturated colour, such as pink, is close to a light shade of
grey. The deep blue that we see in Finnish pottery and Islamic mosques is saturated,
whereas sky blue is unsaturated.
10.4 The CIE System
The problem of negative coefficients was recognized many years ago and, in 1931, the Com-
mission Internationale de l’Eclairage (CIE for short) introduced a system of tristimulus
values for defining colour. The CIE primaries are called X, Y, and Z. Their properties are:

• They are super-saturated colours that cannot actually be perceived.
• All of the colours that we can perceive can be expressed as x X + y Y + z Z with positive values of x, y, and z.
• The Y curve matches the sensitivity of the eye: it is low at the ends of the spectrum and has a peak corresponding to the dominant colour of the sun, to which our eyes are adapted. Consequently, the CIE Y component of a light source is equivalent to the luminance of the source.
The CIE System enables us to describe a light source in two ways.
Tristimulus values are the actual values of the three components: X, Y , and Z.
Chromaticity values are the normalized versions of the tristimulus values:

x = X / (X + Y + Z)
y = Y / (X + Y + Z)
z = Z / (X + Y + Z)

The tristimulus values describe what we see (hence the term "tristimulus"): the colour of the light and its intensity. The chromaticity values are normalized and do not describe the brightness. However, since x + y + z = 1, only two are needed and, in practice, we describe a light source using (x, y, Y) coordinates. From x, y, and Y, we can recover the other values as

z = 1 − x − y
X = x Y / y
Z = z Y / y
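A sketch of this recovery in C++ (our own function name; it assumes y ≠ 0):

#include <cstdio>

struct XYZ { double X, Y, Z; };

// Recover tristimulus values from chromaticity (x, y) and luminance Y:
// z = 1 - x - y, X = x Y / y, Z = z Y / y.
XYZ fromxyY(double x, double y, double Y)
{
    double z = 1.0 - x - y;
    return { x * Y / y, Y, z * Y / y };
}

int main()
{
    XYZ w = fromxyY(0.3127, 0.3290, 1.0);   // an illustrative white point
    std::printf("X=%.4f Y=%.4f Z=%.4f\n", w.X, w.Y, w.Z);
}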
Now imagine all of the values of (x, y, z) that correspond to visible colours. (Note that “every
visible colour can be represented by suitable values of (x, y, z)” is not the same as the converse
“every value of (x, y, z) corresponds to a visible colour”.) These values form a cone with very
dim colours near the origin and brighter colours further out. The chromaticity values, for
which x C y C z D 1, form a plane that intersects the cone. The appearance of the visible
colours on this plane is called the CIE Chromaticity Diagram.⁷
The linearity property says that, if we take two points (i.e., colours) on the CIE diagram, we
can obtain the points (i.e., colours) on the line between them by mixing the colours. If we
take three colours forming a triangle, we can obtain any colour inside the triangle by mixing
the colours. The triangle is called a gamut.
⁷The course web site has links to the chromaticity diagram and an applet that allows you to experiment with it.
Since the sides of the CIE diagram are curved, we cannot find three points (corresponding to
visible colours) that enclose the entire area. Consequently, any physical device that relies
on three primary colours cannot generate all perceptible colours. Nevertheless, some
“devices”, such as slide film, cover a large proportion of the CIE diagram.
Corresponding to each wavelength in the visible spectrum, there is a pure colour. CIE Chro-
maticity Coordinate tables give the XYZ components for discrete wavelengths: Figure 55
shows a typical table. For a single wavelength, we can read the XYZ components directly.
For a general light source, we compute a sum. (Strictly, we should use continuous functions
and integrate; the summation provides a good enough approximation and is much easier to
compute.)
Suppose that our light source has a power spectrum P(λ) for values of λ (the wavelength) in the visible spectrum. In practice, we would measure P(λ) at discrete values, such as λ = 380, 385, . . . , 825 if we were using the table in Figure 55. To obtain the tristimulus values corresponding to this light source, we compute

X = Σλ P(λ) x(λ)
Y = Σλ P(λ) y(λ)
Z = Σλ P(λ) z(λ)

where x, y, and z are taken from the table in Figure 55.
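As an illustration, here is a C++ sketch of the summation (only three rows of the table are included, and the flat power spectrum is made up; a real program would use all the rows and measured values of P(λ)):

#include <cstdio>

// A few rows of the CIE table (wavelength, x, y, z) from Figure 55.
struct Row { double lambda, x, y, z; };
const Row cie[] = {
    { 550, 4.363500e-001, 9.949500e-001, 8.782300e-003 },
    { 555, 5.151300e-001, 1.000100e+000, 5.857300e-003 },
    { 560, 5.974800e-001, 9.950000e-001, 4.049300e-003 },
};

// Hypothetical power spectrum of the source, sampled at the same
// wavelengths; here a flat spectrum of 1.0 for illustration.
double P(double) { return 1.0; }

int main()
{
    double X = 0, Y = 0, Z = 0;
    for (const Row& r : cie) {          // X = sum of P(lambda) x, and so on
        X += P(r.lambda) * r.x;
        Y += P(r.lambda) * r.y;
        Z += P(r.lambda) * r.z;
    }
    std::printf("X=%.4f Y=%.4f Z=%.4f\n", X, Y, Z);
}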
10.4.1 Using Gamuts
Devices for printing or displaying colour use three or more sources of colour in combination.
We will consider only computer monitors, which have three guns firing electrons at phosphors
(it is the phosphors that create the colours, not the electrons) or use LCDs to display colour.
(Colour television uses the same principle.) Both technologies are based on red, green, and
blue (RGB) primaries.
We can represent the colours of the primaries (without their luminance) as coordinates on the CIE Chromaticity Diagram. Suppose that the coordinates are (xr, yr), (xg, yg), and (xb, yb). Let

Ci = Xi + Yi + Zi
zi = 1 − xi − yi
Xi = xi Ci
Yi = yi Ci
Zi = zi Ci

where i ∈ { r, g, b } and (Xi, Yi, Zi) are the XYZ coordinates of the colours that the monitor can display. Then the relationship between the XYZ coordinates and RGB values (that is, the signals we send to the device) is:
[ X ]   [ Xr  Xg  Xb ]   [ R ]
[ Y ] = [ Yr  Yg  Yb ] · [ G ]
[ Z ]   [ Zr  Zg  Zb ]   [ B ]
λ    x            y            z
380 2.689900e-003 2.000000e-004 1.226000e-002
385 5.310500e-003 3.955600e-004 2.422200e-002
390 1.078100e-002 8.000000e-004 4.925000e-002
395 2.079200e-002 1.545700e-003 9.513500e-002
400 3.798100e-002 2.800000e-003 1.740900e-001
405 6.315700e-002 4.656200e-003 2.901300e-001
410 9.994100e-002 7.400000e-003 4.605300e-001
415 1.582400e-001 1.177900e-002 7.316600e-001
420 2.294800e-001 1.750000e-002 1.065800e+000
425 2.810800e-001 2.267800e-002 1.314600e+000
430 3.109500e-001 2.730000e-002 1.467200e+000
435 3.307200e-001 3.258400e-002 1.579600e+000
440 3.333600e-001 3.790000e-002 1.616600e+000
445 3.167200e-001 4.239100e-002 1.568200e+000
450 2.888200e-001 4.680000e-002 1.471700e+000
455 2.596900e-001 5.212200e-002 1.374000e+000
460 2.327600e-001 6.000000e-002 1.291700e+000
465 2.099900e-001 7.294200e-002 1.235600e+000
470 1.747600e-001 9.098000e-002 1.113800e+000
475 1.328700e-001 1.128400e-001 9.422000e-001
480 9.194400e-002 1.390200e-001 7.559600e-001
485 5.698500e-002 1.698700e-001 5.864000e-001
490 3.173100e-002 2.080200e-001 4.466900e-001
495 1.461300e-002 2.580800e-001 3.411600e-001
500 4.849100e-003 3.230000e-001 2.643700e-001
505 2.321500e-003 4.054000e-001 2.059400e-001
510 9.289900e-003 5.030000e-001 1.544500e-001
515 2.927800e-002 6.081100e-001 1.091800e-001
520 6.379100e-002 7.100000e-001 7.658500e-002
525 1.108100e-001 7.951000e-001 5.622700e-002
530 1.669200e-001 8.620000e-001 4.136600e-002
535 2.276800e-001 9.150500e-001 2.935300e-002
540 2.926900e-001 9.540000e-001 2.004200e-002
545 3.622500e-001 9.800400e-001 1.331200e-002
550 4.363500e-001 9.949500e-001 8.782300e-003
555 5.151300e-001 1.000100e+000 5.857300e-003
560 5.974800e-001 9.950000e-001 4.049300e-003
565 6.812100e-001 9.787500e-001 2.921700e-003
570 7.642500e-001 9.520000e-001 2.277100e-003
575 8.439400e-001 9.155800e-001 1.970600e-003
580 9.163500e-001 8.700000e-001 1.806600e-003
585 9.770300e-001 8.162300e-001 1.544900e-003
590 1.023000e+000 7.570000e-001 1.234800e-003
595 1.051300e+000 6.948300e-001 1.117700e-003
600 1.055000e+000 6.310000e-001 9.056400e-004
605 1.036200e+000 5.665400e-001 6.946700e-004
610 9.923900e-001 5.030000e-001 4.288500e-004
615 9.286100e-001 4.417200e-001 3.181700e-004
620 8.434600e-001 3.810000e-001 2.559800e-004
625 7.398300e-001 3.205200e-001 1.567900e-004
630 6.328900e-001 2.650000e-001 9.769400e-005
635 5.335100e-001 2.170200e-001 6.894400e-005
640 4.406200e-001 1.750000e-001 5.116500e-005
645 3.545300e-001 1.381200e-001 3.601600e-005
650 2.786200e-001 1.070000e-001 2.423800e-005
655 2.148500e-001 8.165200e-002 1.691500e-005
660 1.616100e-001 6.100000e-002 1.190600e-005
665 1.182000e-001 4.432700e-002 8.148900e-006
670 8.575300e-002 3.200000e-002 5.600600e-006
675 6.307700e-002 2.345400e-002 3.954400e-006
680 4.583400e-002 1.700000e-002 2.791200e-006
685 3.205700e-002 1.187200e-002 1.917600e-006
690 2.218700e-002 8.210000e-003 1.313500e-006
695 1.561200e-002 5.772300e-003 9.151900e-007
700 1.109800e-002 4.102000e-003 6.476700e-007
705 7.923300e-003 2.929100e-003 4.635200e-007
710 5.653100e-003 2.091000e-003 3.330400e-007
715 4.003900e-003 1.482200e-003 2.382300e-007
720 2.825300e-003 1.047000e-003 1.702600e-007
725 1.994700e-003 7.401500e-004 1.220700e-007
730 1.399400e-003 5.200000e-004 8.710700e-008
735 9.698000e-004 3.609300e-004 6.145500e-008
740 6.684700e-004 2.492000e-004 4.316200e-008
745 4.614100e-004 1.723100e-004 3.037900e-008
750 3.207300e-004 1.200000e-004 2.155400e-008
755 2.257300e-004 8.462000e-005 1.549300e-008
760 1.597300e-004 6.000000e-005 1.120400e-008
765 1.127500e-004 4.244600e-005 8.087300e-009
770 7.951300e-005 3.000000e-005 5.834000e-009
775 5.608700e-005 2.121000e-005 4.211000e-009
780 3.954100e-005 1.498900e-005 3.038300e-009
785 2.785200e-005 1.058400e-005 2.190700e-009
790 1.959700e-005 7.465600e-006 1.577800e-009
795 1.377000e-005 5.259200e-006 1.134800e-009
800 9.670000e-006 3.702800e-006 8.156500e-010
805 6.791800e-006 2.607600e-006 5.862600e-010
810 4.770600e-006 1.836500e-006 4.213800e-010
815 3.355000e-006 1.295000e-006 3.031900e-010
820 2.353400e-006 9.109200e-007 2.175300e-010
825 1.637700e-006 6.356400e-007 1.547600e-010
Figure 55: CIE Chromaticity Coordinates
Expanding each column in terms of the chromaticities, the same relationship can be written

    [ X ]   [ xr  xg  xb ] [ Cr  0   0  ] [ R ]
    [ Y ] = [ yr  yg  yb ] [ 0   Cg  0  ] [ G ]
    [ Z ]   [ zr  zg  zb ] [ 0   0   Cb ] [ B ]
(To see how this relationship is derived, consider the RGB signals (1, 0, 0) (pure red), (0, 1, 0)
(pure green), and (0, 0, 1) (pure blue).)
The characteristics of a particular device are defined by its values of Cr , Cg, and Cb. These
can be obtained in either of two ways:
1. We can use a photometer to measure the luminance levels Yr, Yg, and Yb directly with
   the monitor set to maximum brightness for the corresponding colour. Then:

       Cr = Yr / yr
       Cg = Yg / yg
       Cb = Yb / yb
2. A more common method is to measure the XYZ coordinates (Xw, Yw, Zw) of the
   monitor's white (that is, RGB coordinates (1, 1, 1)) and then solve the following
   equation for the Ci's:

       [ Xw ]   [ xr  xg  xb ] [ Cr ]
       [ Yw ] = [ yr  yg  yb ] [ Cg ]
       [ Zw ]   [ zr  zg  zb ] [ Cb ]
In most cases, the value we know is actually (xw, yw, Yw): the (x, y) position on the
Chromaticity Diagram and the luminance of white. In this case, the equation above has
the following solution:

    Cr = k [ xw(yg − yb) − yw(xg − xb) + xg yb − xb yg ]
    Cg = k [ xw(yb − yr) − yw(xb − xr) − xr yb + xb yr ]
    Cb = k [ xw(yr − yg) − yw(xr − xg) + xr yg − xg yr ]

where

    k = Yw / ( yw [ xr(yg − yb) + xg(yb − yr) + xb(yr − yg) ] ).
The International Electrotechnical Commission has a standard (IEC 61966-2-1) that defines
the "D65 white point" (so called because it corresponds to a black body radiating at 6500 K)
with chromaticity coordinates (0.3127, 0.3290, 0.3583). The following table shows the
corresponding chromaticity coordinates for a typical monitor:
Colour   Coordinates      x       y       z
Red      (xr, yr, zr)     0.628   0.346   0.026
Green    (xg, yg, zg)     0.268   0.588   0.144
Blue     (xb, yb, zb)     0.150   0.070   0.780
The RGB/XYZ mappings for this monitor are:

    [ R ]   [  3.240479  −1.537150  −0.498535 ] [ X ]
    [ G ] = [ −0.969256   1.875992   0.041556 ] [ Y ]
    [ B ]   [  0.055648  −0.204043   1.057311 ] [ Z ]

and

    [ X ]   [ 0.412453  0.357580  0.180423 ] [ R ]
    [ Y ] = [ 0.212671  0.715160  0.072169 ] [ G ]
    [ Z ]   [ 0.019334  0.119193  0.950227 ] [ B ]
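These conversions are just 3 × 3 matrix products. Here is a minimal sketch (the helper
function and names are illustrative, not part of any library):

void mat3mul(const double m[3][3], const double in[3], double out[3])
{
    // out = m * in, a 3x3 matrix times a 3-vector
    for (int i = 0; i < 3; ++i)
        out[i] = m[i][0]*in[0] + m[i][1]*in[1] + m[i][2]*in[2];
}

// The matrices quoted above for the typical monitor.
const double XYZtoRGB[3][3] = {
    {  3.240479, -1.537150, -0.498535 },
    { -0.969256,  1.875992,  0.041556 },
    {  0.055648, -0.204043,  1.057311 },
};
const double RGBtoXYZ[3][3] = {
    { 0.412453, 0.357580, 0.180423 },
    { 0.212671, 0.715160, 0.072169 },
    { 0.019334, 0.119193, 0.950227 },
};

For example, mat3mul(RGBtoXYZ, rgb, xyz) converts an RGB triple to XYZ coordinates.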
10.5 Other Colour Systems
There are several other systems for representing colour. All of those that we describe here are
linearly related to the CIE system. In other words, we can transform from one system to any
other system using a 3 × 3 matrix.
10.5.1 RGB
The RGB system uses red, green, and blue as primaries, with coefficients between 0 and
1. We can visualize RGB colours as a cube with black (0, 0, 0) at one corner and white
(1, 1, 1) at the opposite corner. Other corners are coloured red, green, blue, cyan (green +
blue), magenta (red + blue), and yellow (red + green). RGB is used for computer graphics
because cathode-ray monitors have red, green, and blue phosphors and LCD monitors have
been designed for compatibility.
10.5.2 CMY
The CMY system is the inverse of RGB and it is used for printing, where colours are subtrac-
tive rather than additive. The letters stand for cyan, magenta, and yellow. The relationship
is simply

    [ C ]   [ 1 ]   [ R ]
    [ M ] = [ 1 ] − [ G ]
    [ Y ]   [ 1 ]   [ B ]
CMY is often extended to CMYK, where K is black. This is simply an economy, because
CMYK is used for printers and it is cheaper to print black with black ink than to achieve an
approximate black by mixing cyan, magenta, and yellow.
High quality colour printers extend the range still further, usually by adding light cyan and
light magenta inks, giving a total of six different inks.
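In code, both CMY and a simple form of CMYK are easy to compute. The sketch below
shows one common naive scheme, in which K takes out the grey component shared by the
three inks; it is illustrative only, not a description of how any particular printer driver works:

#include <algorithm>

void rgbToCmyk(double R, double G, double B,
               double& C, double& M, double& Y, double& K)
{
    C = 1.0 - R;
    M = 1.0 - G;
    Y = 1.0 - B;
    K = std::min(C, std::min(M, Y));   // the part printable with black ink
    C -= K;  M -= K;  Y -= K;          // remove it from the coloured inks
}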
10.5.3 YIQ
When colour television was introduced, there were many users with monochrome (so-called
"black and white" or b/w) receivers. Broadcast engineers had to solve three problems:

• Transmit a colour signal

• Ensure compatibility: a b/w receiver must be able to produce a reasonable picture from
  a colour signal

• Ensure recompatibility: a colour receiver must be able to reproduce a b/w signal (e.g.,
  an old movie)
Transmitting an RGB signal does not work because it is not compatible: a b/w re-
ceiver shows RGB as shades of grey, with no deep blacks or bright whites. The YIQ
system was adopted by the US National Television System Committee (NTSC)
for colour TV. The YIQ colour solid is a linear transformation of the RGB cube.
Its purpose is to exploit certain characteristics of the human eye to maximize the
utilization of a fixed bandwidth. The human visual system is more sensitive to
changes in luminance than to changes in hue or saturation, and thus a wider band-
width should be dedicated to luminance than to colour information. Y is similar
to perceived luminance; I and Q carry colour information and some luminance
information. The Y signal usually has 4.2 MHz of bandwidth in a 525-line system.
Originally, I and Q had different bandwidths (1.5 and 0.6 MHz), but now
they commonly have the same bandwidth of 1 MHz. [Adapted from information
on Nan C. Schaller's web page.]
The CIE values for the standard NTSC phosphors are (0.67, 0.33) for red, (0.21, 0.71) for
green, and (0.14, 0.08) for blue. The white point is at (xw, yw, Yw) = (0.31, 0.316, 1.0). The
equations for converting between YIQ and RGB are

    [ Y ]   [ 0.299   0.587   0.114 ] [ R ]
    [ I ] = [ 0.596  −0.275  −0.321 ] [ G ]
    [ Q ]   [ 0.212  −0.523   0.311 ] [ B ]

and

    [ R ]   [ 1   0.956   0.621 ] [ Y ]
    [ G ] = [ 1  −0.272  −0.647 ] [ I ]
    [ B ]   [ 1  −1.105   1.702 ] [ Q ]
The ranges are 0 ≤ Y ≤ 1, −1 ≤ I ≤ 1, and −1 ≤ Q ≤ 1.
10.6 Gamma Correction
The response of a monitor is nonlinear. To a good approximation,

    I = k V^γ

where I is the light intensity seen by the viewer, k is a constant, V is the voltage at the electron
gun, and γ is another constant. Since it is γ that causes the nonlinearity, we need gamma
correction to compensate for it. Typical values are 2.3 ≤ γ ≤ 2.6, with γ = 2.2 often being
assumed.
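In code, gamma correction simply raises each colour component to the power 1/γ before it
is sent to the display, so that the monitor's response I = k V^γ delivers the intensity we
intended. A minimal sketch:

#include <cmath>

// Pre-distort an intensity (assumed to lie in 0..1) to compensate
// for the monitor's power-law response.
double gammaCorrect(double intensity, double gamma = 2.2)
{
    return std::pow(intensity, 1.0 / gamma);
}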
11 Advanced Techniques
Previous sections have focused mainly on rendering techniques that are provided by OpenGL.
In this section, we look briefly at two techniques that are not provided directly by OpenGL,
although they can be simulated. Amongst serious graphics programmers, OpenGL is consid-
ered to be a rather simple system with rather limited capabilities. The important advantage
of OpenGL, and the reason for its popularity, is that it is simple and fast. The techniques
described in this section are very slow by comparison. Although they are acceptably fast
with appropriate simplifications and modern hardware, the earliest implementations required
hundreds, or in some cases, thousands of hours of mainframe computer time to produce good
images.
11.1 Ray-Tracing
OpenGL computes the colour of each vertex of the scene. This is a potentially wasteful process,
since many vertexes never reach the viewing window: they may be clipped because they are
outside the viewing volume or invisible because there is an object between them and the
viewer.
Ray-tracing avoids this source of inefficiency by computing the colour of each pixel on the
screen. This avoids wasting time by computing the colour of invisible objects, but ray-tracing
introduces inefficiencies of its own.
[Diagram: the viewer at V, a pixel P on the screen, and a ray from V through P meeting
objects in the scene at O1, O2, O3, and O4]

Figure 56: Ray Tracing
Figure 56 illustrates the basic ideas of ray-tracing. The viewer is at V and P is a pixel on the
screen. A line drawn from V to P and extended into the scene meets objects in the scene at
points O1, O2, O3, and O4. The colour at pixel P, from the point of view of the observer at
V , must come from point O1, since the other points on the line are hidden.
The line PV is called a ray. Although the light actually travels from the object O1 to the
viewer V , the calculation traces the ray backwards, from the viewer to the object. Thus the
basic ray-tracing algorithm, in pseudocode, is:
for each pixel, P:
    Construct the line from the viewer, VP.
    Find the points O1, O2, O3, ..., On where VP meets surfaces in the scene.
    Find Omin, the closest Oi to the viewer.
    Compute the colour at Omin and set the colour of pixel P to this value.
To make this more precise, we introduce some vectors:
    e = the displacement of the viewer's eye from the origin
    n = the direction of the viewer with respect to the centre of the scene
    v = the vertical direction ("up vector")
    u = the direction right on the viewing screen
The unit vectors u, v, and n form a right-handed coordinate system corresponding to XYZ
in OpenGL camera coordinates.
Suppose the screen has width W , with number of columns Nc , and height H with number of
rows Nr . The horizontal displacement depends only on the column number and the vertical
displacement depends only on the row number. Thus for the pixel at column c and row r, we
have
    uc = W (2c/Nc − 1)
    vr = H (2r/Nr − 1)

If we assume that the screen lies at distance N from the eye in the −n direction (as in
OpenGL) then the pixel with screen coordinates (r, c) has position

    p = e − N n + uc u + vr v
and the parametric equation of the line joining the viewer's eye to this point is

    L(t) = e (1 − t) + (e − N n + uc u + vr v) t                    (25)

In this equation, t = 0 corresponds to the eye position, e, and t = 1 corresponds to the pixel
on the screen. Consequently, points in the scene on this line will have values t ≥ 1, and larger
values of t correspond to greater distance from the viewer.
We can write (25) in the form

    L(t) = e + d t                                                  (26)

where

    d = −N n + W (2c/Nc − 1) u + H (2r/Nr − 1) v

Since e is fixed, it is necessary only to calculate d to find the line equation for each pixel.
We can now write the ray-tracing algorithm more precisely:
    for (r = 0; r < Nr; r++)
        for (c = 0; c < Nc; c++)
            find L(t)
            for each surface S in the scene such that L(tS) ∈ S
                store min{tS}
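As a concrete illustration, here is a minimal C++ sketch of this loop (not from the original
notes). The Vec3 type and its operators, defined here and reused by the later intersection
sketch, are illustrative helpers; e, u, v, and n are the camera vectors defined above:

struct Vec3 { double x, y, z; };

Vec3 operator*(double s, const Vec3& a) { return { s*a.x, s*a.y, s*a.z }; }
Vec3 operator+(const Vec3& a, const Vec3& b) { return { a.x+b.x, a.y+b.y, a.z+b.z }; }

void traceAll(const Vec3& e, const Vec3& u, const Vec3& v, const Vec3& n,
              double N, double W, double H, int Nc, int Nr)
{
    for (int r = 0; r < Nr; ++r) {
        for (int c = 0; c < Nc; ++c) {
            // d = -N n + W(2c/Nc - 1) u + H(2r/Nr - 1) v
            Vec3 d = (-N)*n + (W*(2.0*c/Nc - 1.0))*u + (H*(2.0*r/Nr - 1.0))*v;
            // The ray is L(t) = e + d t: intersect it with each surface
            // in the scene and keep the smallest t >= 1 (see below).
        }
    }
}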
Finding the Intersections   The next problem is computing the intersections. If the surface
is a simple mathematical object, such as a cube (or general rectangular shape), a cylinder, or
a sphere (or general ellipsoid), we can compute the intersection using a formula, as we will
show below. This means that, for simple objects, ray-tracing is an exact method, in contrast
to OpenGL, in which objects are modelled by a polygonal mesh. This is one reason why
ray-traced scenes look sharper and clearer than scenes rendered by OpenGL.

Here is one approach to computing the intersections. We assume that the surface is the set
of points that satisfy an equation of the form F(p) = 0. That is, the surface S consists of the
points S = { p | F(p) = 0 }.

To find the intersection of the ray L(t) with the surface F(p) = 0, we solve the equation

    F(L(t)) = 0

for t. From (26) above, this is equivalent to solving

    F(e + d t) = 0

for t.
Transformations   It is only on rare occasions that we will need to draw a unit sphere
at the origin. How should we handle the case of a general sphere with an equation like
(x − a)² + (y − b)² + (z − c)² = r²? We could set up the equation and solve it as above, but
there is an alternative way. Just as in OpenGL, we can use translating, rotating, and scaling
transformations to transform the unit sphere into a general ellipsoid.

Suppose that we have a canonical surface (e.g., a unit sphere at the origin) F(p) = 0 and
a transformation T that transforms it into something more general (e.g., a football flying
through a goal mouth). Suppose also that q is a point on the transformed object, so that
T(p) = q. Given q, we can find p by inverting the transformation: p = T⁻¹q. Since
F(p) = 0, it follows that F(T⁻¹q) = 0. In other words, the transformed surface is the set of
points { q | F(T⁻¹q) = 0 }.

The method for finding the intersection of a ray with a canonical object F that has been
transformed by T is therefore to solve

    F(T⁻¹(e + d t)) = 0.

As an optimization we note that, since the transform and its inverse are usually linear, the
equation can be written

    F(T⁻¹(e) + T⁻¹(d) t) = 0

in which T⁻¹(e) is a constant (that is, independent of t).
Examples

• Suppose that the object we are viewing is a plane. The general equation of a plane is
  Ax + By + Cz + D = 0 for particular values of the constants A, B, C, and D. Thus

      F(p) = A px + B py + C pz + D

  and, with p = L(t) = e + d t, the equation F(L(t)) = 0 becomes

      A(ex + dx t) + B(ey + dy t) + C(ez + dz t) + D = 0.

  This is a linear equation and its solution is

      t = −(A ex + B ey + C ez + D) / (A dx + B dy + C dz).

  (A code sketch of this calculation, and of the sphere intersection below, follows these
  examples.)

• If p has components (x, y, z) and

      F(p) = x² + y² + z² − 1

  then the set { p | F(p) = 0 } is a unit sphere centered at the origin. We call this the
  canonical sphere and we obtain other spheres by scaling and translating the canonical
  sphere.

  For example, suppose we want a sphere with radius 3 at (2, 4, 6). The required trans-
  formations are

      T = [ 1  0  0  2 ]        S = [ 3  0  0  0 ]
          [ 0  1  0  4 ]            [ 0  3  0  0 ]
          [ 0  0  1  6 ]            [ 0  0  3  0 ]
          [ 0  0  0  1 ]            [ 0  0  0  1 ]

  Their product is

      T · S = [ 3  0  0  2 ]
              [ 0  3  0  4 ]
              [ 0  0  3  6 ]
              [ 0  0  0  1 ]

  and the inverse of this matrix is

      (T · S)⁻¹ = [ 1/3   0    0   −2/3 ]
                  [  0   1/3   0   −4/3 ]
                  [  0    0   1/3  −2   ]
                  [  0    0    0    1   ]

  Applying this matrix to the (homogenized) point e + d t gives

      q = ( (1/3)ex + (1/3)dx t − 2/3,
            (1/3)ey + (1/3)dy t − 4/3,
            (1/3)ez + (1/3)dz t − 2 )
[Diagram: a ray from the viewer V through pixel P meets a surface at O; light source L1 is
blocked by an intervening object, while light source L2 illuminates O]

Figure 57: Lighting in the ray-tracing model
and the equation F(q) = 0 is obtained as

    ((1/3)ex + (1/3)dx t − 2/3)² + ((1/3)ey + (1/3)dy t − 4/3)²
        + ((1/3)ez + (1/3)dz t − 2)² − 1 = 0

which is a quadratic in t:

    ((1/9)dx² + (1/9)dy² + (1/9)dz²) t²
        + ((2/9)ex dx − (4/9)dx + (2/9)ey dy − (8/9)dy + (2/9)ez dz − (4/3)dz) t
        + ((1/9)ex² + (1/9)ey² + (1/9)ez² − (4/9)ex − (8/9)ey − (4/3)ez + 47/9) = 0.
This equation may have no roots (the ray misses the sphere), one root (the ray is tangent
to the sphere), or two roots (the ray intersects the sphere twice and we take the smallest
root).
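Here is a minimal sketch of both intersection routines (not from the original notes), using the
Vec3 type of the ray-generation sketch above. hitPlane solves the linear equation for a general
plane; hitUnitSphere solves the quadratic for the canonical sphere, so for a transformed sphere
the caller should first map e and d through the inverse transform as described above:

#include <cmath>

bool hitPlane(double A, double B, double C, double D,
              const Vec3& e, const Vec3& d, double& t)
{
    double denom = A*d.x + B*d.y + C*d.z;
    if (denom == 0.0)
        return false;                    // the ray is parallel to the plane
    t = -(A*e.x + B*e.y + C*e.z + D) / denom;
    return t >= 1.0;                     // accept only hits beyond the screen
}

bool hitUnitSphere(const Vec3& e, const Vec3& d, double& t)
{
    double a = d.x*d.x + d.y*d.y + d.z*d.z;
    double b = 2.0 * (e.x*d.x + e.y*d.y + e.z*d.z);
    double c = e.x*e.x + e.y*e.y + e.z*e.z - 1.0;
    double disc = b*b - 4.0*a*c;         // the discriminant decides hit or miss
    if (disc < 0.0)
        return false;                    // no real roots: the ray misses
    double root = std::sqrt(disc);
    t = (-b - root) / (2.0*a);           // smaller root: nearer intersection
    if (t < 1.0)
        t = (-b + root) / (2.0*a);       // nearer hit is behind the screen
    return t >= 1.0;
}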
We have computed the point where the ray meets the object, but we still have to compute the
illumination at that point. This means that we must be able to find the normal to the surface
of the object at the point of intersection with the ray. This normal is in fact the normal to
the generic surface transformed by the inverse transpose of T . If the generic surface has a
simple normal calculation — as many of them do — the normal is easy to find.
11.1.1 Recursive Ray Tracing
For detailed lighting calculations, including shadows, reflection, and refraction, we can apply
the ray-tracing algorithm recursively.
In Figure 57, the ray meets a surface at O. There are two light sources, L1 and L2. The
source L1 does not in fact illuminate the surface at O because there is another object between
L1 and O. The source L2, however, does illuminate O. If we take into account objects that
block the light, we will achieve the effect of shadows without further effort.
In order to find out whether a light source is blocked by an obstacle, we apply the ray-tracing
algorithm recursively. We emit a ray from O in the direction of the source L1 and determine
whether it meets any surfaces before L1. Repeating this for each light source enables us to
calculate the contribution of each source to the colour at O. If there is a possibility that the
obstacle might be reflective, we can recurse again to find out how much light it contributes.
We can use a similar technique for refraction by a transparent object. When the ray emitted
from O meets a transparent surface, it sends out two further rays, one for reflected and one
for refracted light. (As usual, the rays are going backwards, in the opposite direction to the
simulated light.) In a complete ray-tracing calculation, the intensity at a point is the sum of:

• ambient light,
• diffuse light,
• specular light,
• reflected light, and
• refracted light.
Here is simplified pseudocode for a complete ray-tracing system. The function shade is passed
a Ray and returns the colour of a single pixel. The function hit finds the first surface that
the ray intersects.
Colour shade(Ray r)
    obj = r.hit()
    Colour col
    col.set(emissive light)
    col.add(ambient light)
    for each light source
        if the source is not blocked by an obstacle (shadow ray test)
            col.add(diffuse and specular contributions of the source)
    if obj is shiny
        ref = reflected ray
        col.add(shininess * shade(ref))
    if obj is transparent
        trn = transmitted ray
        col.add(transparency * shade(trn))
    return col
11.1.2 Summary

We can use the techniques that we have seen before (Gouraud and Phong shading), but the
precision of ray-tracing makes it desirable to use more sophisticated lighting models. We will
describe briefly one of these models, due to Cook and Torrance and usually called the
Cook-Torrance lighting model.

The Cook-Torrance model assumes that a rough surface consists of many small facets and
that each facet is almost mirror-like. Facets at an angle δ reflect light back to the viewer.
The distribution of δ is given by

    D(δ) = e^(−(tan δ / m)²) / (4 m² cos⁴ δ)

where m is a roughness factor.
It is easy to apply textures in a ray-tracing system: the hit coordinates are mapped directly
to texture coordinates.
Ray tracing is good for:

• specular reflection
• refraction
• transparency
• ambient lighting
• shadows
It is not so good for diffuse lighting. However, radiosity does a good job of diffuse lighting.
11.2 Radiosity
There are some similarities between radiosity and ray-tracing. Both models are based on phys-
ical principles; both are based on simple ideas, yet a good implementation is quite complex;
both require large amounts of computation. In other ways they are quite different: radiosity
handles diffuse light best, whereas ray-tracing is best for specular light; radiosity computes
illumination independently of the point of view, whereas ray-tracing is completely determined
by the point of view.
The basic assumption of radiosity is that all surfaces emit light. If this seems surprising, look
around you: nearly every object in view is sending light to your eye (otherwise you could not
see it), which implies that they are all emitting light.
Radiosity techniques divide the scene into patches. A patch is simply an area of a surface:
it can be large or small, flat or curved. It is helpful to think of a patch as being a small, flat
area, but this is not necessarily the case. We assume that each patch is an opaque, diffuse
emitter and reflector of light.
Then:

    Bi = Ei + ρi Σj Bj Fji (Aj / Ai)                                (27)

In this equation:

    Bi  = the radiosity of patch i
        = light reflected by patch i (W/m²)
    Ei  = the emissivity of patch i
        = light directly emitted by patch i (W/m²)
    ρi  = reflection coefficient of patch i (a dimensionless number)
    Ai  = the area of patch i
    Fji = the form factor for patch i, defined below
Equation (27) says that the amount of light emitted by patch i consists of the amount it emits
directly, Ei, plus the sum of the amount of light it receives from other patches and reflects.
For most patches, Ei = 0. If Ei > 0, then patch i is a light source.

The amount of light that patch i reflects is proportional to its reflection coefficient, ρi. Then,
for each other patch j, the amount of light depends on: the emission from patch j, Bj; the
form factor, Fji; and the relative areas, Ai and Aj.

Bj Fji is defined to be the amount of light leaving a unit area of patch j that reaches all
of patch i. What we actually need is the amount of light arriving at a unit area of patch i
from all of patch j. This explains the factor Aj/Ai.

Fji expresses the optical "coupling" between patches j and i. It is high if the patches are
close together and parallel, and small if they are far apart or not parallel. If one patch faces
away from the other, the coupling is zero. In most cases, Fii = 0 (a patch is not coupled to
itself) but, if a patch is a concave curved surface, a small amount of self-coupling is possible.
For diffuse light, we can show that Ai Fij = Aj Fji, or

    Fji (Aj / Ai) = Fij                                             (28)

Substituting (28) into (27) gives

    Bi = Ei + ρi Σj Bj Fij

which we can rearrange to give the radiosity equation:

    Bi − ρi Σj Bj Fij = Ei                                          (29)

The important feature of the radiosity equation (29) is that it is simply a set of linear simul-
taneous equations in B1, B2, ..., Bn. If we know the constants ρi, Ei, and Fij, we can solve
these equations and determine the radiosity of each patch.
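Because (29) is linear, even a very simple iterative solver works. The sketch below is
illustrative, not from the original notes: starting from B = E (emitters only), each sweep
gathers light into every patch, which corresponds roughly to adding one more bounce of
reflected light (compare the incremental method described in Section 11.2.3):

#include <vector>

std::vector<double> solveRadiosity(const std::vector<double>& E,
                                   const std::vector<double>& rho,
                                   const std::vector<std::vector<double>>& F,
                                   int sweeps = 20)
{
    int n = (int) E.size();
    std::vector<double> B = E;                // first approximation: emitters only
    for (int s = 0; s < sweeps; ++s) {
        for (int i = 0; i < n; ++i) {
            double gathered = 0.0;
            for (int j = 0; j < n; ++j)
                gathered += F[i][j] * B[j];   // light arriving from patch j
            B[i] = E[i] + rho[i] * gathered;  // equation (29), rearranged
        }
    }
    return B;
}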
So far, we have considered only the intensity of light. In practice, the constants ρi and Ei
will have different values at different wavelengths, and we will have to perform the calculation
at several (typically three) wavelengths to obtain a result with colours. (Note that Fji does
not depend on the wavelength but only on the geometry of the scene.)
11.2.1 Computing Form Factors

The hardest part of radiosity is the calculation of the form factors Fij. We consider a small
part of each patch, dAi and dAj, which we can assume to be flat. Let:

    L   = the line joining these areas
    r   = the length of L
    θi  = the angle between L and the normal to dAi
    θj  = the angle between L and the normal to dAj
    Hij = 1 if dAi is visible from dAj, and 0 otherwise

Then

    dF(di,dj) = (cos θi cos θj / (π r²)) Hij dAj

Integrating over Aj gives

    F(di,j) = ∫Aj (cos θi cos θj / (π r²)) Hij dAj

and a second integration over dAi gives

    Fij = (1/Ai) ∫Ai ∫Aj (cos θi cos θj / (π r²)) Hij dAj dAi
A naive implementation is likely to be inefficient. If there are N patches, we have to compute
N² double integrals. Since N may be in the hundreds or even thousands, this will be time-
consuming.

The Cohen-Greenberg algorithm computes fairly good approximations to Fij with much less
work. The idea of the algorithm is to pre-compute values for small squares on a hemicube
(half a cube) that encloses the patch and to use these values in the computation of Fij.
11.2.2 Choosing Patches
Since the computation time increases as N² for N patches, choosing the right patches is a
crucial step. To obtain reasonable efficiency, we would like to have large patches where the
light is uniform and smaller patches where it is non-uniform. Once again, recursion comes to
the rescue. A practical radiosity algorithm works like this:
1. Divide the scene into a small number of large patches.
2. Calculate the radiosity of each patch.
3. Estimate the radiosity variation across each patch (e.g., by looking at the radiosity of
its neighbours).
4. If the radiosity gradient across a patch is larger than a given threshold, split the patch
into two smaller patches.
5. If any patches were split in step 4, repeat from step 2.
In practice, the efficiency of the algorithm is improved by not repeating all the calculations
in step 2, because some values will not have been changed significantly by splitting patches.
11.2.3 Improvements
Practical versions of radiosity algorithms are incremental. Initially, all Bi are assumed to be
zero. Equation (29) is used to calculate the first approximation to Bi, using only the non-zero
Eis. The second iteration takes into account one reflection, and the third iteration takes
into account two reflections, and so on. In practice, the equations stabilize fairly quickly,
because third and higher order reflections have very little energy. The procedure stops when
an iteration produces very little change.
Radiosity does a good job of diffuse lighting, and it automatically provides ambient lighting.
It is possible to account for specular lighting by including directional effects in the calculation
of Fij , but the computational overhead usually makes this impractical.
However, as we have seen, ray-tracing does a good job of specular light. It is possible to
combine radiosity and ray-tracing to get the best of both models. A typical calculation goes
like this:
1. Compute radiosity values for the scene, giving ambient and diffuse lighting that is in-
dependent of the view point.
2. Project the scene onto a viewing window.
3. Perform ray-tracing to obtain specular lights, reflection, and refraction.
11.3 Bump Mapping
Consider an orange. It is approximately a sphere, but has an irregular surface. It would be
expensive to model an orange exactly, because a very large number of polygons would be
needed to model the surface accurately. It is possible, however, to obtain the illusion of an
orange by drawing a sphere with incorrect normals. The normals interact with the lighting
calculations to give the effect of a dimpled surface.
The technique is called bump mapping and it was introduced by James Blinn in 1978.
Suppose that we have a surface defined with parametric coordinates. Points on the surface are
p(u, v) for various values of u and v (which are essentially the same as texture coordinates).
For example, we can define a sphere with radius r centered at the origin as
    p(u, v) = (r sin u cos v, r sin u sin v, r cos u).
Bump mapping perturbs the true surface by adding a bump function to it:

    p′(u, v) = p(u, v) + b(u, v) n

where

    N = ∂p/∂u × ∂p/∂v

and

    n = N / |N|
is the unit normal vector at p. The perturbed normal vector is

    N′ = ∂p′/∂u × ∂p′/∂v.

We have

    ∂p′/∂u = ∂(p + b n)/∂u
           = ∂p/∂u + n ∂b/∂u + b ∂n/∂u.

If we assume that b is small (that is the idea of a "bump" mapping), we can neglect the last
term, giving for u and v:

    ∂p′/∂u ≈ ∂p/∂u + n ∂b/∂u
    ∂p′/∂v ≈ ∂p/∂v + n ∂b/∂v.
The perturbed surface normal vector is

    N′ = ∂p′/∂u × ∂p′/∂v
       = ∂p/∂u × ∂p/∂v + (∂b/∂v)(∂p/∂u × n) + (∂b/∂u)(n × ∂p/∂v)
           + (∂b/∂u)(∂b/∂v)(n × n).

But n × n = 0 and so

    N′ = N + (∂b/∂v)(∂p/∂u × n) + (∂b/∂u)(n × ∂p/∂v).

We obtain the perturbed unit normal by normalizing N′.
Although it is possible to do all the calculations analytically, the usual practice is to approxi-
mate the bump function b with a look-up table and to estimate the derivatives with finite
differences:

    ∂b/∂u ≈ b(i,j) − b(i−1,j)
    ∂b/∂v ≈ b(i,j) − b(i,j−1)

Values of ∂b/∂u and ∂b/∂v can be tabulated, and the values of ∂p/∂u and ∂p/∂v are needed
anyway for normal calculation. Consequently, bump mapping is quite an efficient operation.
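A sketch of this table-driven calculation follows (illustrative, not from the original notes;
Vec3 is the type used in the ray-tracing sketches, and cross and normalize are small helpers
defined here):

#include <cmath>
#include <vector>

Vec3 cross(const Vec3& a, const Vec3& b)
{
    return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
}

Vec3 normalize(const Vec3& a)
{
    double len = std::sqrt(a.x*a.x + a.y*a.y + a.z*a.z);
    return { a.x/len, a.y/len, a.z/len };
}

// b is the tabulated bump function, (i, j) the current entry (i, j >= 1),
// pu and pv the partial derivatives of p, and n the unit normal.
Vec3 bumpNormal(const std::vector<std::vector<double>>& b, int i, int j,
                const Vec3& pu, const Vec3& pv, const Vec3& n)
{
    double bu = b[i][j] - b[i-1][j];          // finite difference for db/du
    double bv = b[i][j] - b[i][j-1];          // finite difference for db/dv
    Vec3 N = cross(pu, pv);                   // unperturbed normal
    Vec3 Np = N + bv*cross(pu, n) + bu*cross(n, pv);
    return normalize(Np);                     // perturbed unit normal
}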
11.4 Environment Mapping
Real-life scenes often contain highly reflective objects. If these objects are to look realistic,
they should reflect the scenery around them. In general, this is hard to do, because it requires
calculating the reflected ray for each point on the surface, and then finding out where that
ray comes from in the scene. (This is a problem that is solved neatly by recursive ray-tracing,
of course.)
There are two ways of simplifying the rendering of reflective objects:
1. We can work with simple objects, such as cylinders and spheres. It is easier to compute
reflections from these objects than from, say, a shiny car.
2. We can use texture mapping to render the object.
The second method, which is called environment mapping, was introduced by Blinn and
Newell. We will consider a simple and standard case that happens to have direct support
from OpenGL: the problem of rendering a reflecting sphere.
Environment mapping depends to some extent on the fact that people are rather like raccoons:
we recognize shiny or reflecting objects easily, and we are not too fussy about the precise details
of the reflection. Imagine a reflecting sphere that is moving around. Strictly, the reflection on
the sphere should exactly match the surroundings; in practice, if the match is fairly good, our
eyes accept it as a reflection.
Environment mapping works in two steps: the first step is to obtain a suitable reflected image,
and the second step is to use that image to texture a sphere. A photograph taken with a
fish-eye lens provides a usable image. Alternatively, we can take a regular image and distort
it.
The following code is extracted from a program that performs texturing in sphere-map mode.
This part is used during initialization. PixelMap is a class defined in CUGL. The
image should be a fish-eye view, as described above, but it does not have to be.

PixelMap tex;
tex.read("image.bmp");      // load the texture image
GLuint name;
glGenTextures(1, &name);    // glGenTextures expects a pointer
tex.setTexture(name);
glTexGenf(GL_S, GL_TEXTURE_GEN_MODE, GL_SPHERE_MAP);
glTexGenf(GL_T, GL_TEXTURE_GEN_MODE, GL_SPHERE_MAP);
The following code is used in the display function. The displayed object must have texture
coordinates 0 ≤ s ≤ 1 and 0 ≤ t ≤ 1.
glEnable(GL_TEXTURE_2D);
glEnable(GL_TEXTURE_GEN_S);
glEnable(GL_TEXTURE_GEN_T);
// Display the object
glDisable(GL_TEXTURE_GEN_S);
glDisable(GL_TEXTURE_GEN_T);
glDisable(GL_TEXTURE_2D);
The mathematical justification of sphere mapping follows.

Let u be a unit vector from the eye to a vertex on the sphere. Let r be the corresponding
reflection vector, computed as

    r = u − 2 (n · u) n

Then the texture coordinates are calculated as

    s = (rx/p + 1) / 2
    t = (ry/p + 1) / 2

where

    p = √( rx² + ry² + (rz + 1)² ).
11.5 The Accumulation Buffer
Although we have seen various ways of making graphics images realistic, it is usually easy to
distinguish a computer graphic image from a photograph or a real scene. There are several
reasons for this: one important reason is that a graphics image is sharp and bright everywhere.
We are used to seeing real scenes in which distant objects are less brightly coloured than nearby
objects and photographs in which one part is in sharp focus and the rest is blurred. We can
use OpenGL fog to give the effect of distance. This section discusses blurring.
Blurring can be simulated by drawing the scene several times in slightly different positions.
OpenGL provides the accumulation buffer for this and other purposes. This buffer is used
to “accumulate” several different images before displaying a final image. Most applications of
the accumulation buffer are quite slow because the image must be rendered several times.
Here are some of the applications of the accumulation buffer:

• A camera focuses at a particular distance. In theory, there is a plane that is sharp
  and everything else is blurred. In practice, there is a depth of field, defined by two
  distances between which everything is sharp enough (for example, the blurring might
  be less than the grain size of the film).

  To achieve a photographic effect, we can render the scene several times into the accu-
  mulation buffer. One point in the scene, the centre of the image at the focal plane, is
  kept fixed, and the scene is randomly rotated through a very small angle around this
  point. The effect is that this point in the scene is sharp and everything else is blurred.

• If a fast-moving object is photographed with a slow shutter speed, it appears blurred.
  Skilled photographers sometimes pan the camera to follow the object, in which case the
  object is sharp but the background is blurred. In either case, the effect is called motion
  blur. It is used (often with exaggeration) in comics and animated films to emphasize
  the feeling of motion.

  To achieve the effect in OpenGL, the scene is again rendered several times into the
  accumulation buffer. Stationary objects stay in the same place, and moving objects are
  moved slightly. The "exposure" (explained below) can be varied. For example, a moving
  object might be rendered in five different positions with exposures 1/2, 1/4, 1/8, 1/16,
  and 1/32, to give the effect of fading.

• Jagged edges and other artefacts of polygonal decomposition can be smoothed by ren-
  dering the image several times in slightly different positions; this is a form of anti-
  aliasing. The movements should be random and very small, typically less than a pixel.
The function that controls the accumulation buffer is glAccum(). It requires two arguments,
an operation and a value. Figure 58 explains the effect of the various operations. The “buffer
currently selected for reading” is set by glReadBuffer() and the “buffer currently selected
for writing” is selected by glDrawBuffer(). By default, the current colour buffer is used for
reading and writing, so it is not actually necessary to call these functions.
Operation (op)   Effect (val)

GL_ACCUM     Read each pixel of the buffer currently selected for reading, multiply
             the RGB values by val, and add the result to the accumulation buffer.
GL_LOAD      Read each pixel of the buffer currently selected for reading, multiply
             the RGB values by val, and store the result in the accumulation buffer.
GL_RETURN    Take each pixel from the accumulation buffer, multiply the RGB values
             by val, and store the result in the colour buffer currently selected for
             writing.
GL_ADD       Add val to each pixel in the accumulation buffer.
GL_MULT      Multiply each pixel in the accumulation buffer by val.

Figure 58: Effect of glAccum(op, val)
In a typical application, the accumulation buffer is used as follows:

• Call glClear(GL_ACCUM_BUFFER_BIT) to clear the accumulation buffer.

• Render the image n times into the colour buffer (as usual). After each rendering, call
  glAccum(GL_ACCUM, x) with x = 1/n.

• Call glAccum(GL_RETURN, 1.0) to copy the accumulated information back to the
  colour buffer.
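For example, an antialiasing display function might look like the following sketch, in which
drawScene() and jitterCamera() stand for the application's own code (they are not OpenGL
calls):

void displayAntialiased(int n)
{
    glClear(GL_ACCUM_BUFFER_BIT);
    for (int i = 0; i < n; ++i) {
        jitterCamera(i);                     // shift the view by less than a pixel
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        drawScene();
        glAccum(GL_ACCUM, 1.0f / n);         // add this image, scaled by 1/n
    }
    glAccum(GL_RETURN, 1.0f);                // copy the result to the colour buffer
    glutSwapBuffers();
}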

More Related Content

PDF
Cliff sugerman
PDF
Computer Graphics Notes.pdf
PDF
Computer graphics notes
PDF
OpenGL Spec 4.4 Core
PDF
Computer graphics lecturenotes_torontouniv
PDF
PDF
Real-Time Non-Photorealistic Shadow Rendering
PDF
Cliff sugerman
Computer Graphics Notes.pdf
Computer graphics notes
OpenGL Spec 4.4 Core
Computer graphics lecturenotes_torontouniv
Real-Time Non-Photorealistic Shadow Rendering

Similar to cs notes for the syudents of computer science (20)

PDF
Corel draw graphics suite x3 Notes
PDF
User manual of feko
PDF
thesis
PPTX
Beginning direct3d gameprogramming07_lightsandmaterials_20161117_jintaeks
PDF
Tutorial sketch up_bonnie_roske
PDF
Computer Graphics Programming Problem Solving And Visual Communication Dr Ste...
PDF
Poser 7 Tutorial Manual
PDF
Poser 7 Tutorial Manual
PDF
Game Programming in C++_ Creating 3D Games ( PDFDrive ).pdf
PDF
Computer Graphics Part1
PPTX
Computer Graphics
PPT
Computer Graphics involves technology to access. The Process transforms and p...
PPT
PDF
Gettingstartedmaya2010 A1pdf
PDF
Maya 2010 Getting Started
PDF
Jiu manual
PDF
Hw1 updated
PDF
Fundamentals of Multimedia - Vector Graphics.pdf
PDF
The Day You Finally Use Algebra: A 3D Math Primer
PDF
3D Math Primer: CocoaConf Chicago
Corel draw graphics suite x3 Notes
User manual of feko
thesis
Beginning direct3d gameprogramming07_lightsandmaterials_20161117_jintaeks
Tutorial sketch up_bonnie_roske
Computer Graphics Programming Problem Solving And Visual Communication Dr Ste...
Poser 7 Tutorial Manual
Poser 7 Tutorial Manual
Game Programming in C++_ Creating 3D Games ( PDFDrive ).pdf
Computer Graphics Part1
Computer Graphics
Computer Graphics involves technology to access. The Process transforms and p...
Gettingstartedmaya2010 A1pdf
Maya 2010 Getting Started
Jiu manual
Hw1 updated
Fundamentals of Multimedia - Vector Graphics.pdf
The Day You Finally Use Algebra: A 3D Math Primer
3D Math Primer: CocoaConf Chicago
Ad

More from RavinderKSingla (20)

PPTX
aiineducation-part2.pptx It is on artificial intelligence
PDF
AI-edication-impact.pdf It is on artificial intelligence
PDF
document.pdf It is on artificial intelligence
PDF
201811-AI-in-Higher-Education-TLH.pdf the role of AI
PDF
IF10937.pdf education with AI role to play
PDF
artificial-intelligence-in-higher-education.pdf
PDF
Becker_Artificial_Intelligence_in_Education.pdf
PDF
ten_facts_about_artificial_intelligence_0.pdf
PPTX
Artificial-Intelligence-Academic-Integrity-Participant-Slides.pptx
PDF
computer graphic and multimedia for the students of MCA
PDF
graphics notes on computer science students to study various algorithmic-054-...
PDF
Lecture for the students of CS for interactive graphical techniques
PDF
interactive computer graphics for the students of computer science
PDF
InTech-Utilising_virtual_environments_to_research_ways_to_improve_manipulatio...
PDF
main.pdf java programming practice for programs
DOC
sdlc-Lecture2.doc System analsysis and design
DOC
Information Systems, Design and MIS(MS-07).doc
PDF
codeblocks-instructions.pdf
PDF
Software_Engineering__8th_Ed.pdf
PDF
glu1_3.pdf
aiineducation-part2.pptx It is on artificial intelligence
AI-edication-impact.pdf It is on artificial intelligence
document.pdf It is on artificial intelligence
201811-AI-in-Higher-Education-TLH.pdf the role of AI
IF10937.pdf education with AI role to play
artificial-intelligence-in-higher-education.pdf
Becker_Artificial_Intelligence_in_Education.pdf
ten_facts_about_artificial_intelligence_0.pdf
Artificial-Intelligence-Academic-Integrity-Participant-Slides.pptx
computer graphic and multimedia for the students of MCA
graphics notes on computer science students to study various algorithmic-054-...
Lecture for the students of CS for interactive graphical techniques
interactive computer graphics for the students of computer science
InTech-Utilising_virtual_environments_to_research_ways_to_improve_manipulatio...
main.pdf java programming practice for programs
sdlc-Lecture2.doc System analsysis and design
Information Systems, Design and MIS(MS-07).doc
codeblocks-instructions.pdf
Software_Engineering__8th_Ed.pdf
glu1_3.pdf
Ad

Recently uploaded (20)

PDF
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
PDF
01-Introduction-to-Information-Management.pdf
PDF
RMMM.pdf make it easy to upload and study
PDF
BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH 9 GLOBAL SUCCESS - CẢ NĂM - BÁM SÁT FORM Đ...
PPTX
Cell Types and Its function , kingdom of life
PDF
Microbial disease of the cardiovascular and lymphatic systems
PPTX
Renaissance Architecture: A Journey from Faith to Humanism
PPTX
Pharma ospi slides which help in ospi learning
PDF
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
PDF
O5-L3 Freight Transport Ops (International) V1.pdf
PDF
102 student loan defaulters named and shamed – Is someone you know on the list?
PDF
Sports Quiz easy sports quiz sports quiz
PDF
Anesthesia in Laparoscopic Surgery in India
PDF
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
PPTX
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
PPTX
1st Inaugural Professorial Lecture held on 19th February 2020 (Governance and...
PPTX
Microbial diseases, their pathogenesis and prophylaxis
PDF
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
PDF
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
01-Introduction-to-Information-Management.pdf
RMMM.pdf make it easy to upload and study
BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH 9 GLOBAL SUCCESS - CẢ NĂM - BÁM SÁT FORM Đ...
Cell Types and Its function , kingdom of life
Microbial disease of the cardiovascular and lymphatic systems
Renaissance Architecture: A Journey from Faith to Humanism
Pharma ospi slides which help in ospi learning
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
O5-L3 Freight Transport Ops (International) V1.pdf
102 student loan defaulters named and shamed – Is someone you know on the list?
Sports Quiz easy sports quiz sports quiz
Anesthesia in Laparoscopic Surgery in India
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
1st Inaugural Professorial Lecture held on 19th February 2020 (Governance and...
Microbial diseases, their pathogenesis and prophylaxis
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student

cs notes for the syudents of computer science

  • 1. COMP 6761 Advanced Computer Graphics Lecture Notes Peter Grogono These notes may be photocopied for students taking COMP 6761 at Concordia University. c Peter Grogono 2002, 2003 Department of Computer Science Concordia University Montreal, Quebec
  • 2. CONTENTS CONTENTS Contents 1 Introduction 1 1.1 Getting Started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Callbacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2.1 Display . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2.2 Reshaping Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2.3 Keyboard Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.2.4 Mouse Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.2.5 Idle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.3 OpenGL Naming Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.3.1 Type Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.3.2 Function Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.4 General Features of OpenGL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.4.1 States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.4.2 Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1.5 Drawing Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1.5.1 Primitive Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 1.5.2 GLUT Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 1.5.3 Quadric Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 1.6 Hidden Surface Elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 1.7 Animation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2 Transformations and Projections 14 2.1 Matrices in OpenGL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.2 Projection Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.2.1 Orthogonal Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.2.2 Perspective Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.3 Model View Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 3 Building Models and Scenes 21 3.1 A Digression on Global Variables . . . . . . . . . . . . . . . . . . . . . . . . . . 21 3.2 Matrix Stacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 3.2.1 Pops and Pushes Don’t Cancel! . . . . . . . . . . . . . . . . . . . . . . . 25 3.2.2 Animation with Stacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 3.3 Viewing the Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 ii
  • 3. CONTENTS CONTENTS 4 Lighting 32 4.1 Lighting Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 4.1.1 Hiding Surfaces and Enabling Lights . . . . . . . . . . . . . . . . . . . . 32 4.1.2 Kinds of Light . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 4.2 Material Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 4.3 Light Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 4.4 Lighting Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 4.5 Lighting in Practice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 4.6 Normal Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 5 Special Effects 41 5.1 Blending . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 5.2 Fog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 5.3 Reflection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 5.4 Display Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 5.5 Bézier Curves and Surfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 5.5.1 Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 5.5.2 Surfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 5.6 Menus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 5.7 Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 5.8 Other Features of OpenGL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 5.8.1 Textures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 5.8.2 NURBS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 5.8.3 Antialiasing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 5.8.4 Picking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 5.8.5 Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 5.9 Program Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 6 Organization of a Graphics System 57 6.1 The Graphics Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 6.1.1 Per Vertex Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 6.1.2 Primitive Assembly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 6.1.3 Rasterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 6.1.4 Pixel Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 6.1.5 Fragment Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 6.2 Rasterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 6.2.1 Drawing a Straight Line . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 6.2.2 Drawing a Circle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 iii
  • 4. CONTENTS CONTENTS 6.2.3 Clipping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 7 Transformations — Again 67 7.1 Scalar Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 7.2 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 7.3 Affine Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 7.4 Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 7.4.1 Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 7.4.2 Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 7.4.3 Rotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 7.5 Non-Affine Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 7.5.1 Perspective Transformations . . . . . . . . . . . . . . . . . . . . . . . . . 71 7.5.2 Shadows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 7.5.3 Reflection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 7.6 Working with Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 8 Rotation 77 8.1 Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 8.2 2D Rotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 8.2.1 Representing 2D Rotations with Matrices . . . . . . . . . . . . . . . . . 79 8.2.2 Representing 2D Rotations with Complex Numbers . . . . . . . . . . . 80 8.3 3D Rotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 8.3.1 Representing 3D Rotations with Matrices . . . . . . . . . . . . . . . . . 81 8.3.2 Representing 3D Rotations with Quaternions . . . . . . . . . . . . . . . 82 8.3.3 A Proof that Unit Quaternions Represent Rotations . . . . . . . . . . . 85 8.3.4 Quaternions and Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . 86 8.3.5 Quaternion Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . 87 8.4 Quaternions in Practice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 8.4.1 Imitating a Trackball . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 8.4.2 Moving the Camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 8.4.3 Flying . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 9 Theory of Illumination 95 9.1 Steps to Realistic Illumination . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 9.1.1 Intrinsic Brightness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 9.1.2 Ambient Light . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 9.1.3 Diffuse Lighting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 9.1.4 Attenuation of Light . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 9.1.5 Coloured Light . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 iv
9.1.6 Specular Reflection . . . 97
9.1.7 Specular Colours . . . 99
9.1.8 Multiple Light Sources . . . 99
9.2 Polygon Shading . . . 99
9.2.1 Flat Shading . . . 100
9.2.2 Smooth Shading . . . 100
10 The Theory of Light and Colour 102
10.1 Physiology of the Eye . . . 102
10.2 Achromatic Light . . . 103
10.3 Coloured Light . . . 103
10.4 The CIE System . . . 106
10.4.1 Using Gamuts . . . 107
10.5 Other Colour Systems . . . 110
10.5.1 RGB . . . 110
10.5.2 CMY . . . 110
10.5.3 YIQ . . . 110
10.6 Gamma Correction . . . 111
11 Advanced Techniques 112
11.1 Ray-Tracing . . . 112
11.1.1 Recursive Ray Tracing . . . 116
11.1.2 Summary . . . 117
11.2 Radiosity . . . 118
11.2.1 Computing Form Factors . . . 119
11.2.2 Choosing Patches . . . 120
11.2.3 Improvements . . . 121
11.3 Bump Mapping . . . 121
11.4 Environment Mapping . . . 122
11.5 The Accumulation Buffer . . . 124
References 126

List of Figures

1 A simple display function . . . 3
2 OpenGL Types . . . 6
3 A simple reshape function . . . 9
4 Primitive specifiers . . . 10
5 Drawing primitives . . . 11
6 A coloured triangle . . . 11
7 Perspective projection using gluPerspective() . . . 16
8 An OpenGL program with a perspective transformation . . . 19
9 Translation followed by rotation . . . 20
10 Rotation followed by translation . . . 20
11 Programming with global variables . . . 22
12 Programming with fewer global variables . . . 23
13 Programming with fewer global variables, continued . . . 24
14 Drawing an arm — first version . . . 24
15 Drawing an arm — improved version . . . 25
16 Drawing a Maltese Cross . . . 25
17 Pushing and popping . . . 26
18 Zooming . . . 28
19 Using gluLookAt() and gluPerspective() . . . 29
20 Parameters for glMaterialfv() . . . 33
21 Using glMaterial() . . . 34
22 Parameters for glLightfv() . . . 35
23 Parameters for glLightModel() . . . 36
24 Computing normals . . . 39
25 Computing average normals on a square grid . . . 40
26 Fog Formulas . . . 43
27 Parameters for glFog . . . 43
28 Two three-point Bézier curves with their control points . . . 46
29 Control parameters for Bézier curves . . . 47
30 Using Bézier surfaces for the body of a plane . . . 49
31 Points generated by the code of Figure 30 . . . 50
32 Functions for menu callback and creation . . . 50
33 A C function for lines with slope less than 1 . . . 62
34 Drawing a circle . . . 62
35 Computing points in the first octant . . . 64
36 Plotting eight symmetrical points . . . 64
37 Sutherland-Hodgman Polygon Clipping . . . 65
38 Labelling the regions . . . 66
39 Varieties of Transformation . . . 69
40 Perspective . . . 72
41 Quaternion multiplication . . . 83
42 Mouse callback function for trackball simulation . . . 88
43 Projecting the mouse position . . . 89
44 Updating the trackball quaternion . . . 90
45 Translating the camera . . . 91
46 Unit vectors . . . 91
47 Auxiliary function for translating the camera . . . 92
48 Rotating the camera . . . 92
49 Callback function for flying the plane . . . 94
50 Illuminating an object . . . 96
51 Calculating R . . . 98
52 Gouraud Shading . . . 101
53 Phong Shading . . . 101
54 Dynamic range and perceptible steps for various devices . . . 103
55 CIE Chromaticity Coordinates . . . 108
56 Ray Tracing . . . 112
57 Lighting in the ray-tracing model . . . 116
58 Effect of glAccum(op, val) . . . 125
1 Introduction

The course covers both practical and theoretical aspects of graphics programming at a fairly advanced level. It starts with practice: specifically, writing graphics programs with OpenGL. During the second part of the course, we will study the theory on which OpenGL and other graphics libraries are based.

OpenGL is an industry-standard graphics library. OpenGL programs run on most platforms. All modern graphics cards contain hardware that handles OpenGL primitive operations, which makes OpenGL programs run fast on most platforms. OpenGL provides a high-level interface, making it easy to learn and use. However, graphics libraries have many common features and, having learned OpenGL, you should find it relatively easy to learn another graphics system.

OpenGL is the basic library: it includes GL (Graphics Library) and GLU (Graphics Library Utilities). GLU does not contain primitives; all of its functions make use of GL functions. GL does not know anything about the windowing system of the computer on which it is running. It has a frame buffer, which it fills with appropriate data, but displaying the frame buffer on the screen is the responsibility of the user.

GLUT (OpenGL Utility Toolkit) provides the functionality necessary to transfer the OpenGL frame buffer to a window on the screen. GLUT programs are platform-independent: a GLUT program can be compiled and run on a unix workstation, a PC running Windows, or a Mac, with substantially the same results. Lectures in this course will be based on GLUT programming.

If you don't use GLUT, you will have to understand how to program using windows. On a PC, this means learning either the Windows API or MFC (although interfacing OpenGL and MFC is not particularly easy). A good source of information for the Windows API is The OpenGL SuperBible by Richard S. Wright, Jr. and Michael Sweet (Waite Group Press, 2000). On a unix workstation, you will need a good understanding of X Windows.

The OpenGL hardware accelerators on graphics cards require special drivers. At Concordia, these drivers have been installed on Windows systems but not on linux systems (usually because linux drivers have not been written for the newest graphics cards). Consequently, OpenGL programs run much faster (often 5 or 10 times faster) under Windows than under linux.

1.1 Getting Started

These notes follow roughly the same sequence as Getting Started with OpenGL by Peter Grogono, obtainable from the university Copy Centre. As mentioned above, we assume the use of GLUT. The programming language is C or C++ (OpenGL functions are written in C but can be called from a C++ program).

Any program that uses GLUT must start with the directive

    #include <GL/glut.h>

This assumes that header files are stored in .../include/GL, which is where they are supposed to be. In Concordia labs, the header files may be in .../include, in which case you should use the directive

    #include <glut.h>
In most GLUT programs, the main function consists largely of GLUT "boiler plate" code and will look something like this:

    int main (int argc, char *argv[]) {
       glutInit(&argc, argv);
       glutInitDisplayMode(GLUT_SINGLE | GLUT_RGBA);
       glutInitWindowSize(800, 600);
       glutInitWindowPosition(100, 50);
       glutCreateWindow("window title");
       glutDisplayFunc(display);
       glutMainLoop();
    }

All of these functions come from GLUT, as indicated by the prefix glut-. Their effects are as follows:

• glutInit initializes GLUT state. It is conventional to pass the command line arguments to this function. This is useful if you are using X but rather pointless for Windows. An X user can pass parameters for window size, etc., to a GLUT program.

• glutInitDisplayMode initializes the display mode by setting various bits. In this example, the bit GLUT_SINGLE requests a single buffer (the alternative is double buffers, which are needed for animation) and the bit GLUT_RGBA requests colours (red, green, blue, and "alpha", which we will discuss later). Note the use of | (not ||) to OR the bits.

• glutInitWindowSize sets the width and height of the graphics window in pixels. If this call is omitted, GLUT will use the system default values.

• glutInitWindowPosition sets the position of the window. The arguments are the position of the left edge and the top edge, in screen coordinates. Note that, in screen coordinates, 0 is the top of the screen, not the bottom. If this call is omitted, GLUT will use the system default values.

• glutCreateWindow creates the window but does not display it. The window has the size and position specified by the previous calls and a title given as an argument to this function.

• glutDisplayFunc registers a callback function to update the display. Callbacks are explained below.

• glutMainLoop enters a loop in which GLUT handles events generated by the user and the system and responds to them. Events include: key strokes, mouse movements, mouse clicks, and window reshaping operations.

As soon as the "main loop" starts, GLUT will respond to events and call the functions that you have registered appropriately. The only callback registered above is display. In order for this program to compile, you must have written a function of the form shown in Figure 1, which GLUT will call whenever it needs to update the display.

The function display calls OpenGL functions that we will look at in more detail later. For now we note that OpenGL functions have the prefix gl- and:

• glClear clears various bits. This call sets all pixels in the frame buffer to the default value.

• glColor3f sets the current colour. The parameters are the values for red, green, and blue. This call asks for bright red.
    void display() {
       glClear(GL_COLOR_BUFFER_BIT);
       glColor3f(1.0, 0.0, 0.0);
       glBegin(GL_LINES);
       glVertex2f(-1.0, 0.0);
       glVertex2f(1.0, 0.0);
       glEnd();
       glFlush();
    }

    Figure 1: A simple display function

• glBegin starts a block in which OpenGL expects calls to functions that construct primitives. In this case, the mode GL_LINES specifies line drawing, and we provide two vertexes for each line.

• glVertex2f specifies the position of a vertex in 2D.

• glFlush forces the window to be refreshed. This call typically has no effect if you are running OpenGL on a PC. It is needed when the program is running on a server and the client screen must be refreshed.

1.2 Callbacks

For each callback function, you need to know: how to register the callback, how to declare it, and what it does. The following sections provide this information. The callback functions can have any name; the names used here (e.g., display) are typical.

1.2.1 Display

Registration:  glutDisplayFunc(display);
Declaration:   void display();

Use: The display function is called by GLUT whenever it thinks the graphics window needs updating. Since no arguments are passed, the display function often uses global variables or calls other functions to obtain its data.

1.2.2 Reshaping Events

Registration:  glutReshapeFunc(reshape);
Declaration:   void reshape(int width, int height);
Use: The reshape function is called whenever the user reshapes the graphics window. The arguments give the width and height of the reshaped window in pixels.

1.2.3 Keyboard Events

GLUT provides two callback functions for keyboard events: one for "ordinary" keys (technically: ASCII graphic characters); and one for "special" keys, such as function (F) keys and arrow keys.

Registration:  glutKeyboardFunc(keyboard);
Declaration:   void keyboard(unsigned char key, int x, int y);

Use: The keyboard function is called when the user presses a "graphic" key. These are the keys for characters that are visible on the screen: letters, digits, symbols, and space. The esc character is also recognized (with code 27). The values of x and y give the position of the mouse cursor at the time when the key was pressed.

Registration:  glutSpecialFunc(special);
Declaration:   void special(int key, int x, int y);

Use: The special function is similar to the keyboard function but is called when the user presses a non-graphic character key. The key is identified by comparing it to a GLUT constant. The constants are:

    GLUT_KEY_F1    GLUT_KEY_F8     GLUT_KEY_LEFT
    GLUT_KEY_F2    GLUT_KEY_F9     GLUT_KEY_RIGHT
    GLUT_KEY_F3    GLUT_KEY_F10    GLUT_KEY_UP
    GLUT_KEY_F4    GLUT_KEY_F11    GLUT_KEY_DOWN
    GLUT_KEY_F5    GLUT_KEY_F12    GLUT_KEY_PAGE_UP
    GLUT_KEY_F6    GLUT_KEY_HOME   GLUT_KEY_PAGE_DOWN
    GLUT_KEY_F7    GLUT_KEY_END    GLUT_KEY_INSERT
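As an illustration, here is a minimal keyboard callback of the kind described above. This is a sketch: the global angle variable and the quit-on-esc behaviour are assumptions for illustration, not part of the notes.

    static GLfloat angle = 0;     // illustrative global, controlled by the keys

    void keyboard(unsigned char key, int x, int y) {
       switch (key) {
       case 27:                   // esc (code 27): quit the program
          exit(0);
       case '+':                  // rotate the model one way ...
          angle += 5;
          break;
       case '-':                  // ... or the other
          angle -= 5;
          break;
       }
       glutPostRedisplay();       // ask GLUT to redraw with the new state
    }

    // Registered during initialization with: glutKeyboardFunc(keyboard);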
1.2.4 Mouse Events

Registration:  glutMouseFunc(mouse);
Declaration:   void mouse(int button, int state, int x, int y);

Use: This function is called when the user presses or releases a mouse button. The value of button is one of GLUT_LEFT_BUTTON, GLUT_MIDDLE_BUTTON, or GLUT_RIGHT_BUTTON. The value of state is GLUT_UP or GLUT_DOWN. The values of x and y give the position of the mouse cursor at the time when the button was pressed.

The values x and y are measured in pixels and are relative to the graphics window. The top left corner of the window gives x = 0 and y = 0; the bottom right corner gives x = width and y = height, where width and height are the values given during initialization or by reshaping. Note that y values increase downwards.

Registration:  glutMotionFunc(motion);  glutPassiveMotionFunc(passiveMotion);
Declaration:   void motion(int x, int y);  void passiveMotion(int x, int y);

Use: These functions are called when the mouse is moved within the graphics window. If any mouse button is pressed, motion is called. If no buttons are pressed, passiveMotion is called. The values of x and y are the same as for the mouse callback. However, if you press a button down while the mouse is in the graphics window, and then drag the mouse out of the window, the values of x and y may go outside their respective ranges — that is, x may become negative or greater than width, and y may become negative or greater than height.

1.2.5 Idle

Registration:  glutIdleFunc(idle);
Declaration:   void idle();

Use: The idle function is called whenever OpenGL has nothing else to do. It is a very important callback, because it enables animated graphics. A typical idle callback function looks like this:

    void idle() {
       // Update model values
       ....
       glutPostRedisplay();
    }

The effect of glutPostRedisplay() is to inform GLUT that the graphics window needs refreshing — that is, that GLUT should invoke the display callback function. You could call the display function from within the idle function. Although this usually works, it is not recommended. The reason is that GLUT handles many events and can postpone refreshing the display until there are no outstanding events. For example, if the user is dragging the mouse or reshaping the window during an animation, the graphics window should not be redisplayed until the operation is completed.

This section includes all of the GLUT functions that you need to get started. Later, we will look at GLUT functions for more advanced applications, such as menu management.
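Before leaving callbacks, here is a sketch showing how the mouse and motion callbacks can cooperate to track a drag. The global variables and their names are illustrative assumptions, not from the notes.

    static int dragging = 0;       // is the left button currently held down?
    static int lastX, lastY;       // last mouse position, in window pixels

    void mouse(int button, int state, int x, int y) {
       if (button == GLUT_LEFT_BUTTON) {
          dragging = (state == GLUT_DOWN);
          lastX = x;
          lastY = y;
       }
    }

    void motion(int x, int y) {
       if (dragging) {
          int dx = x - lastX;      // movement since the last event;
          int dy = y - lastY;      // remember that y increases downwards
          lastX = x;
          lastY = y;
          // ... use dx and dy to update the model ...
          glutPostRedisplay();
       }
    }

    // Registered with: glutMouseFunc(mouse); glutMotionFunc(motion);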
    Suffix  Data type                C Type          OpenGL Type
    b       8-bit integer            signed char     GLbyte
    s       16-bit integer           short           GLshort
    i       32-bit integer           int or long     GLint, GLsizei
    f       32-bit floating point    float           GLfloat, GLclampf
    d       64-bit floating point    double          GLdouble, GLclampd
    ub      8-bit unsigned integer   unsigned char   GLubyte, GLboolean
    us      16-bit unsigned integer  unsigned short  GLushort
    ui      32-bit unsigned integer  unsigned int    GLuint, GLenum, GLbitfield
            Nothing                  void            GLvoid

    Figure 2: OpenGL Types

1.3 OpenGL Naming Conventions

1.3.1 Type Names

OpenGL uses typedefs to give its own names to C types, as shown in Figure 2 (see also Table 1 of Getting Started with OpenGL on page 7). You don't have to use these types, but using them makes your programs portable. The suffixes in the left column are used in function names, as described in the next section.

1.3.2 Function Names

OpenGL provides an API for C, not C++. This means that function names cannot be overloaded and, consequently, naming conventions are required to distinguish similar functions with different parameter types. The structure of an OpenGL function name is as follows:

• The name begins with the prefix gl (primitive library functions) or glu (utility library functions).
• The prefix is followed by the function name. The first letter of the function name is in upper case.
• There may be a digit to indicate the number of parameters required. E.g., 3 indicates that 3 parameters are expected.
• There may be a letter or letter pair indicating the type of the parameters. The codes are given in the left column of Figure 2.
• There may be a v indicating that the arguments are pointers to arrays rather than the actual values.

For example, in the call

    glVertex3f(x, y, z);

there must be three arguments of type GLfloat. The same effect could be achieved by calling

    glVertex3fv(pc);

provided that pc is a pointer to an array of (at least) three floats.
The official OpenGL documentation sometimes lists all the allowed forms (the OpenGL Reference Manual does this) and sometimes uses an abbreviated form (the OpenGL Programming Guide does this). The form

    void glVertex{234}{sifd}[v](TYPE coords);

stands for 24 different function prototypes. The functions may have 2, 3, or 4 parameters of type short, integer, float, or double, and they may be passed as individual values or as a pointer to a suitable array.

In these notes, we will normally use particular functions that are appropriate for the application. However, you should always be aware that a different form might be better for your application.

1.4 General Features of OpenGL

1.4.1 States

It is best to think of OpenGL as a Finite State Machine (with a lot of states!). The effect of many functions is to change a state variable. The state is restored or changed again by another call, not by default. For example, if you call glColor3f(0,0,1), then everything you draw will be coloured bright blue until you call glColor again with different arguments.

The state-machine concept seems simple enough, but it can give a lot of trouble in program development. It often happens that your program is doing something unexpected because OpenGL is in the wrong state, but it is hard to find out which state variable is wrong. Problems are even worse when you work with multiple windows, because some parts of the OpenGL state apply to individual windows and other parts apply to all windows.

A partial solution is to modify state in a systematic and structured way. For example, if a feature is turned on somewhere, it is a good idea to turn it off in the same scope. You may use lighting for some parts of a scene and not others. In this case, your display function should look like this:

    void display() {
       // initialization
       ....
       // display parts of the scene that require lighting
       glEnable(GL_LIGHTING);
       ....
       // display parts of the scene that do not require lighting
       glDisable(GL_LIGHTING);
       ....
    }

If the calls to glEnable and glDisable had been hidden in functions called by display, we could not tell when lighting was in effect by looking at the display function.
1.4.2 Coordinates

Graphics programs make heavy use of coordinate systems and it is easy to get confused. There are two important sets of coordinates that are fundamental to all applications.

Window Coordinates  The window in which the graphics scene is displayed has width w and height h, measured in pixels. Coordinates are given as pairs (x, y). The top left corner of the window is (0, 0) and the bottom right corner is (w, h). X values increase from left to right and Y values increase from top to bottom.

Model Coordinates  The model, or scene, that we are displaying is three-dimensional. (We can use OpenGL for 2D graphics but most of the applications in this course assume 3D.) By default:

• the origin is at the centre of the window, in the plane of the window
• the X axis points to the right of the window
• the Y axis points to the top of the window
• the Z axis points towards the viewer

Note that:

• The Y axis of the model is inverted with respect to the Y axis of the window. In the model, Y points upwards, in accordance with engineering and mathematical conventions.

• The model coordinates are right-handed. Since the X and Y directions are fixed by convention, the Z axis must point towards the viewer. (Hold your right hand so that your thumb, first finger, and second finger are at right-angles to one another. Point your thumb (X axis) to the right and your first finger (Y axis) upwards; then your second finger (Z axis) is pointing towards you.)

Knowing the direction of the coordinates is not enough: OpenGL displays things only if they are in the viewing volume. By default, the viewing volume contains points (x, y, z) such that −1 ≤ x ≤ 1, −1 ≤ y ≤ 1, and −1 ≤ z ≤ 1. (We will see later how to alter these values. Note that the coordinates in Figure 1 satisfy the conditions for visibility.) Objects outside the viewing volume are not visible on the screen.

OpenGL has to transform model coordinates to window coordinates. The mapping takes (x, y, z) in the model to (w(x + 1)/2, h(1 − y)/2). The origin of the model, (0, 0, 0), is mapped to the centre of the window, (w/2, h/2). Z coordinates in the model are ignored (this is not always true, as we will see later).
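The default mapping is easy to express in code. The following sketch (plain C, not an OpenGL call) computes the window pixel corresponding to a model point:

    // Default mapping from model coordinates (x, y in [-1, 1]) to window
    // pixel coordinates, for a window of w x h pixels; z is ignored here.
    void modelToWindow(float x, float y, int w, int h, int *wx, int *wy) {
       *wx = (int)(w * (x + 1) / 2);   // x = -1 -> left edge, x = 1 -> right edge
       *wy = (int)(h * (1 - y) / 2);   // y = 1 -> top edge (window y grows down)
    }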
Figure 3 shows a simple reshape callback function. It assumes that the model is contained in a box bounded by |x| ≤ 5, |y| ≤ 5, and |z| ≤ 5. It is designed so that, whatever the shape of the new window, the whole of the model is visible and the model is not distorted. This implies that the shorter side of the window must be 10 units long.

    void reshape (int width, int height) {
       GLfloat w, h;
       if (width > height) {
          w = (5.0 * width) / height;
          h = 5.0;
       }
       else {
          w = 5.0;
          h = (5.0 * height) / width;
       }
       glViewport(0, 0, width, height);
       glMatrixMode(GL_PROJECTION);
       glLoadIdentity();
       glOrtho(-w, w, -h, h, -5, 5);
       glutPostRedisplay();
    }

    Figure 3: A simple reshape function

1.5 Drawing Objects

A scene in a graphics program is composed of various objects. At the lowest level, there are primitive objects, such as vertexes, lines, and polygons. From primitive objects, we can build common objects, such as cubes, cones, and spheres. We can also construct special objects such as Bézier curves and surfaces. All drawing (the technical term is "rendering") is performed inside the display function.

1.5.1 Primitive Objects

The code for rendering OpenGL primitives has the form:

    glBegin(mode);
    ....
    glEnd();

where mode has one of the values shown in Figure 4. The explanations in Figure 4 are sufficient for most modes but the three modes shown in Figure 5 need some care for correct use. Note that, for GL_QUAD_STRIP, the order in which the vertexes are given is not the same as the order that is used to determine the front face. In the program, the vertexes appear in the sequence v0, v1, v2, .... For display purposes (see Figure 5(c)), the quadrilaterals generated are v0 v2 v3 v1, v2 v4 v5 v3, and so on; a code sketch follows below.

There are some obvious constraints on the number of vertexes in a sequence of calls. However, OpenGL is tolerant: it simply ignores extra vertexes. For example, if you set the mode to GL_TRIANGLES and provide eight vertexes, OpenGL will draw two triangles and ignore the last two vertexes.
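For example (an illustrative fragment, not from the notes), six vertexes in GL_QUAD_STRIP mode produce the two quadrilaterals described above:

    // A strip of two unit squares in the plane z = 0. The vertexes
    // alternate between the bottom (even) and top (odd) edges.
    glBegin(GL_QUAD_STRIP);
    glVertex2f(0, 0);   // v0
    glVertex2f(0, 1);   // v1
    glVertex2f(1, 0);   // v2
    glVertex2f(1, 1);   // v3
    glVertex2f(2, 0);   // v4
    glVertex2f(2, 1);   // v5
    glEnd();            // quadrilaterals: v0 v2 v3 v1 and v2 v4 v5 v3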
A polygon with more than three sides may be convex or concave. OpenGL can draw convex polygons only. In practice, surfaces are usually constructed using polygons with a small number of edges, either triangles or quadrilaterals. For triangles, the problem of complexity does not arise; if the quadrilaterals are approximately rectangular, they will be convex.

Another problem that arises with more than four vertexes is planarity. Whereas a set of three points defines a unique plane, a set of four or more points may or may not lie in a plane. Since the calculations that OpenGL performs assume planarity, you may get funny results if you provide non-planar polygons.

A polygon has a front face and a back face. The faces can have different colours, and the distinction between front and back is important for lighting. The rule for deciding which is the front face is: If the order in which the vertexes are displayed appears to be counter-clockwise in the viewing window, we are looking at the front of the polygon.

There are two functions that affect the way in which polygons are displayed. The call

    glPolygonMode(face, mode);

accepts the arguments shown in the following table.

    face                           mode
    GL_FRONT_AND_BACK (default)    GL_FILL (default)
    GL_FRONT                       GL_LINE
    GL_BACK                        GL_POINT

GL_FILL means that the entire polygon will be shaded with the current colour; GL_LINE means that its outline will be drawn; and GL_POINT means that only the vertexes will be drawn.

    Mode Value          Effect
    GL_POINTS           Draw a point at each of the n vertices.
    GL_LINES            Draw the unconnected line segments v0 v1, v2 v3, ..., v(n-2) v(n-1).
    GL_LINE_STRIP       Draw the connected line segments v0 v1, v1 v2, ..., v(n-2) v(n-1).
    GL_LINE_LOOP        Draw a closed loop of lines v0 v1, v1 v2, ..., v(n-2) v(n-1), v(n-1) v0.
    GL_TRIANGLES        Draw the triangle v0 v1 v2, then the triangle v3 v4 v5, and so on.
    GL_TRIANGLE_STRIP   Draw the triangle v0 v1 v2, then use v3 to draw a second triangle, and so on (see Figure 5(a)).
    GL_TRIANGLE_FAN     Draw the triangle v0 v1 v2, then use v3 to draw a second triangle, and so on (see Figure 5(b)).
    GL_QUADS            Draw the quadrilateral v0 v1 v2 v3, then the quadrilateral v4 v5 v6 v7, and so on.
    GL_QUAD_STRIP       Draw the quadrilateral v0 v1 v3 v2, then the quadrilateral v2 v3 v5 v4, and so on (see Figure 5(c)).
    GL_POLYGON          Draw a single polygon using v0, v1, ..., v(n-1) as vertices (n ≥ 3).

    Figure 4: Primitive specifiers

If the mode is GL_FILL, then the function

    glShadeModel(shading);
determines how the face will be coloured. By default, the shading mode is GL_SMOOTH. OpenGL colours each vertex as specified by the user, and then interpolates colours in between. Figure 6 shows code that will draw a triangle which has red, green, and blue vertexes, and intermediate colours in between. If the shading mode is GL_FLAT, the entire triangle will be given the colour that is in effect when its last vertex is drawn.

    [Figure 5: Drawing primitives — diagrams of the vertex orderings for (a) GL_TRIANGLE_STRIP, (b) GL_TRIANGLE_FAN, and (c) GL_QUAD_STRIP]

    glBegin(GL_TRIANGLES);
    glColor3f(1, 0, 0);
    glVertex3f(0, 0.732, 0);
    glColor3f(0, 1, 0);
    glVertex3f(-0.5, 0, 0);
    glColor3f(0, 0, 1);
    glVertex3f(0.5, 0, 0);
    glEnd();

    Figure 6: A coloured triangle
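As a small illustration of the two display-mode functions just described (an assumed fragment, not from the notes), the following calls fill and flat-shade front faces while outlining back faces:

    glPolygonMode(GL_FRONT, GL_FILL);   // front faces are filled ...
    glPolygonMode(GL_BACK, GL_LINE);    // ... back faces are outlined
    glShadeModel(GL_FLAT);              // one colour per polygon, no interpolation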
1.5.2 GLUT Objects

The GLUT library provides several objects that are rather tedious to build "by hand". These objects are fully defined, with normals, and they are useful for experimenting with lighting. A "wire" object displays as a wire frame; this is not very pretty but may be useful during debugging. A "solid" object looks like a solid, but you should define its material properties yourself, as we will see later.

For a sphere, "slices" are lines of longitude and "stacks" are lines of latitude. Cones are analogous. For a torus, "sides" run around the torus and "rings" run around the "tube". The prototypes for the GLUT objects are:

    void glutWireSphere (double radius, int slices, int stacks);
    void glutSolidSphere (double radius, int slices, int stacks);
    void glutWireCube (double size);
    void glutSolidCube (double size);
    void glutWireTorus (double inner, double outer, int sides, int rings);
    void glutSolidTorus (double inner, double outer, int sides, int rings);
    void glutWireIcosahedron ();
    void glutSolidIcosahedron ();
    void glutWireOctahedron ();
    void glutSolidOctahedron ();
    void glutWireTetrahedron ();
    void glutSolidTetrahedron ();
    void glutWireDodecahedron ();
    void glutSolidDodecahedron ();
    void glutWireCone (double radius, double height, int slices, int stacks);
    void glutSolidCone (double radius, double height, int slices, int stacks);
    void glutWireTeapot (double size);
    void glutSolidTeapot (double size);

1.5.3 Quadric Objects

Quadrics are surfaces that can be generated by an equation of the second degree: that is, by choosing values of \(a_i\) in

\[ a_1 x^2 + a_2 y^2 + a_3 z^2 + a_4 yz + a_5 zx + a_6 xy + a_7 x + a_8 y + a_9 z + a_{10} = 0. \]

Quadrics provided by OpenGL include spheres, cylinders, and disks. Since the top and bottom diameters of a cylinder can be set independently, cones can be drawn as well.

Drawing a quadric requires several steps. These steps usually occur during initialization:

• Declare a pointer to a quadric descriptor:

    GLUquadricObj *pq;

• Allocate a default descriptor object:

    pq = gluNewQuadric();

• Set the drawing style for the quadric:

    gluQuadricDrawStyle(pq, GLU_FILL);

  Possible values for the second argument are: GLU_POINT, GLU_LINE, GLU_SILHOUETTE, and GLU_FILL.

Within the display function:

• Draw the quadric using one of:

    gluSphere(pq, radius, slices, stacks);
    gluCylinder(pq, baseRadius, topRadius, height, slices, stacks);
    gluDisk(pq, innerRadius, outerRadius, slices, rings);
    gluPartialDisk(pq, innerRadius, outerRadius, slices, rings,
                   startAngle, sweepAngle);

  The first argument, pq, is the pointer returned by gluNewQuadric. The dimensions have type double. The arguments slices, stacks, and rings indicate the number of segments used to draw the figure, and are integers. Larger values mean slower displays: values from 15 to 25 are adequate for most purposes. The angles for a partial disk must be given in degrees, not radians.

A sphere has its centre at the origin. The base of the cylinder is at the origin, and the cylinder points in the +Z direction. A disk has its centre at the origin and lies in the plane z = 0.

When you have finished using quadrics:

• Delete the quadric descriptor:

    gluDeleteQuadric(pq);

The quadric descriptor contains the information about how to draw the quadric, as set by gluQuadricDrawStyle, etc. Once you have created a descriptor, you can draw as many quadrics as you like with it.
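Collecting these steps, here is a minimal sketch of quadric usage; the dimensions and segment counts are illustrative values, and the function names initQuadric, drawQuadrics, and finished are assumptions.

    GLUquadricObj *pq;                      // quadric descriptor

    void initQuadric() {
       pq = gluNewQuadric();                // allocate a default descriptor
       gluQuadricDrawStyle(pq, GLU_FILL);   // set the drawing style
    }

    void drawQuadrics() {
       // A cylinder with its base at the origin, pointing in the +Z direction.
       gluCylinder(pq, 1.0, 1.0, 2.0, 20, 20);
       glTranslatef(0, 0, 2);
       // Setting the top radius to zero draws a cone as a cap.
       gluCylinder(pq, 1.0, 0.0, 1.0, 20, 20);
    }

    void finished() {
       gluDeleteQuadric(pq);                // delete the descriptor
    }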
1.6 Hidden Surface Elimination

To obtain a realistic view of a collection of primitive objects, the graphics system must display only the objects that the viewer can see. Since the components of the model are typically surfaces (triangles, polygons, etc.), the step that ensures that invisible surfaces are not rendered is called hidden surface elimination. There are various ways of eliminating hidden surfaces; OpenGL uses a depth buffer.

The depth buffer is a two-dimensional array of numbers; each component of the array corresponds to a pixel in the viewing window. In general, several points in the model will map to a single pixel. The depth buffer is used to ensure that only the point closest to the viewer is actually displayed.

To enable hidden surface elimination, modify your graphics program as follows:

• When you initialize the display mode, include the depth buffer bit:

    glutInitDisplayMode(GLUT_RGBA | GLUT_DEPTH);

• During initialization and after creating the graphics window, execute the following statement to enable the depth-buffer test:

    glEnable(GL_DEPTH_TEST);

• In the display() function, modify the call to glClear() so that it clears the depth buffer as well as the colour buffer:

    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

1.7 Animation

If your program has an idle callback function that changes the values of some global variables, OpenGL will display your model repeatedly, giving the effect of animation. The display will probably flicker, however, because images will be alternately drawn and erased. To avoid flicker, modify your program to use double buffering. In this mode, OpenGL renders the image into one buffer while displaying the contents of the other buffer.

• When you initialize the display mode, include the double buffer bit:

    glutInitDisplayMode(GLUT_RGBA | GLUT_DOUBLE);

• At the end of the display() function include the call

    glutSwapBuffers();

If you want to eliminate hidden surfaces and animate, you will have to use this call during initialization:

    glutInitDisplayMode(GLUT_RGBA | GLUT_DEPTH | GLUT_DOUBLE);
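Putting the last two sections together, the skeleton of a double-buffered, depth-buffered animation might look like the following sketch; the variable spin and the teapot stand in for a real model.

    #include <GL/glut.h>

    static GLfloat spin = 0;       // illustrative animation state

    void display(void) {
       glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
       glMatrixMode(GL_MODELVIEW);
       glLoadIdentity();
       glRotatef(spin, 0, 1, 0);
       glutWireTeapot(0.5);        // stands in for the real model
       glutSwapBuffers();          // show the buffer just drawn
    }

    void idle(void) {
       spin += 1;
       if (spin >= 360) spin -= 360;
       glutPostRedisplay();
    }

    int main(int argc, char *argv[]) {
       glutInit(&argc, argv);
       glutInitDisplayMode(GLUT_RGBA | GLUT_DEPTH | GLUT_DOUBLE);
       glutCreateWindow("Animation");
       glEnable(GL_DEPTH_TEST);    // after window creation, as noted above
       glutDisplayFunc(display);
       glutIdleFunc(idle);
       glutMainLoop();
       return 0;
    }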
2 Transformations and Projections

Graphics programming makes heavy use of coordinate transformations. Suppose we want to display a scene with two houses. It doesn't make sense to specify the coordinates of all of the vertices of one house and then repeat the process all over again for the other house. Instead, we would define a house, translate all of its coordinates to one part of the scene, and then translate the same set of points to another part of the scene. We might also rotate or scale the house. In fact, we might translate, rotate, or scale the entire scene. All of these operations require coordinate transformations.

Coordinate transformations change the coordinates of a point from (x, y, z) to (x', y', z'). The common kinds of transformation are:

• Translation:  x' = x + a,  y' = y + b,  z' = z + c

• Scaling:  x' = r x,  y' = s y,  z' = t z

• Rotation:  x' = x cos θ − y sin θ,  y' = x sin θ + y cos θ,  z' = z
  (This is a rotation about the Z axis. Equations for rotations about the other axes are similar.)

Scaling and rotation can be represented as matrix transformations. For example, the rotation above can be written

\[
\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]

We cannot represent translation as a matrix transformation in this way. However, if we use 4 × 4 matrices, we can represent all three transformations because

\[
\begin{bmatrix} 1 & 0 & 0 & a \\ 0 & 1 & 0 & b \\ 0 & 0 & 1 & c \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} =
\begin{bmatrix} x + a \\ y + b \\ z + c \\ 1 \end{bmatrix}
\]

We can view the use of 4 × 4 matrices simply as a trick for making translation work or as a move to four-dimensional affine space. The graphics programming is the same in either case.
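To make the 4 × 4 trick concrete, here is a sketch (plain C, not an OpenGL call) that applies a homogeneous transformation to a point, using the same column-vector convention as the equations above:

    // Multiply a 4x4 matrix m (row-major) by the column vector p = (x, y, z, 1).
    // With m set to the translation matrix above, result is (x+a, y+b, z+c, 1).
    void transformPoint(const float m[4][4], const float p[4], float result[4]) {
       for (int row = 0; row < 4; row++) {
          result[row] = 0;
          for (int col = 0; col < 4; col++)
             result[row] += m[row][col] * p[col];
       }
    }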
2.1 Matrices in OpenGL

OpenGL maintains several matrices, of which the most important are the projection matrix and the model view matrix. Since the transformation from model coordinates to window coordinates is achieved by multiplying these two matrices together, one matrix would actually suffice. Splitting the transformation into two parts is convenient because

• the projection determines our view of the model and
• the model view matrix determines our position and orientation within the model.

Thus we can think of the projection matrix as applying to the entire model and the model view matrix as applying to parts of the model. To manipulate matrices:

• Call glMatrixMode(mode) to choose a matrix. The value of mode is either GL_PROJECTION or GL_MODELVIEW.
• Call glLoadIdentity() to set this matrix to the identity matrix.
• Call a projection function to set the value of the projection matrix.
• Call transformation functions to change the value of the model view matrix. The most frequently used transformation functions are glTranslatef(x, y, z), glRotatef(angle, x, y, z), and glScalef(r, s, t).

We consider projection matrices first and then model view matrices.

2.2 Projection Matrices

2.2.1 Orthogonal Projections

The simplest projection matrix gives an orthogonal projection. This is done very simply by ignoring Z values and scaling X and Y values to fit the current viewport. The call

    glOrtho(left, right, bottom, top, near, far)

defines an orthogonal transformation for which the point at (x, y, z) in the model will be visible if left ≤ x ≤ right, bottom ≤ y ≤ top, and near ≤ z ≤ far. These positions define the viewing volume: a point is visible only if it is within the viewing volume. Note that values of Z are constrained even though they do not affect the position of the projected point in the window. The matrix constructed by this call is

\[
\text{ortho} =
\begin{bmatrix}
\dfrac{2}{right - left} & 0 & 0 & -\dfrac{right + left}{right - left} \\
0 & \dfrac{2}{top - bottom} & 0 & -\dfrac{top + bottom}{top - bottom} \\
0 & 0 & \dfrac{-2}{far - near} & -\dfrac{far + near}{far - near} \\
0 & 0 & 0 & 1
\end{bmatrix}
\]

Note that if any pair of arguments are equal, the matrix will have infinite elements and your program will probably crash. The default orthogonal projection is equivalent to

    glOrtho(-1, 1, -1, 1, -1, 1);
In other words, the default viewing volume is defined by |x| ≤ 1, |y| ≤ 1, and |z| ≤ 1. A blank window is a common problem during OpenGL program development. In many cases, the window is blank because your model is not inside the viewing volume.

2.2.2 Perspective Projections

A perspective transformation makes distant objects appear smaller. One way to visualize a perspective transformation is to imagine yourself looking out of a window. If you copy the scene outside onto the glass, without moving your head, the image on the glass will be a perspective transformation of the scene.

The simplest way to obtain a perspective transformation in OpenGL is to call

    gluPerspective(angle, aspectRatio, near, far);

The effect of this call is shown in Figure 7. The angle between the top and bottom of the scene, as seen by the viewer, is angle. The value of aspectRatio is the width of the window divided by its height. The values of near and far determine the closest and furthest points in the viewing volume.

    [Figure 7: Perspective projection using gluPerspective() — diagram of the viewing frustum, showing the viewing angle at the eye, the near and far planes, the window of width w and height h, and the X, Y, Z axes]

Here is a typical reshape function using gluPerspective:

    void reshape(int width, int height) {
       glViewport(0, 0, width, height);
       glMatrixMode(GL_PROJECTION);
       glLoadIdentity();
       gluPerspective(30, (GLfloat)width / (GLfloat)height, 5, 40);
       glutPostRedisplay();
    }
The new window has size width × height. The first line of the function makes the viewport fit the window. The second and third lines set the projection matrix to the identity matrix. The fourth line turns it into a perspective projection with a vertical angle of 30° and the same aspect ratio as the new viewport. The viewing volume is bounded in the Z direction by −40 ≤ z ≤ −5. The arguments near and far must satisfy far > near > 0. The model coordinates are negative because the Z axis is pointing towards the viewer.

A perspective projection will appear correct only if the window subtends the given angle at the viewer's eye. If the value of angle is θ, the height of the window is h, and the distance from the viewer to the window is d, then

\[ \theta = 2 \tan^{-1}\!\left(\frac{h}{2d}\right) \qquad\text{or}\qquad 2 \tan\frac{\theta}{2} = \frac{h}{d} \]

For example, the value of angle in the reshape function above is 30°. If the scene is viewed in a window 6 inches high, the viewer should place his or her eyes about 11 inches from the screen. Changing the value of angle gives the same effect as zooming a camera lens: the viewpoint remains the same, but the angle of view changes.

The function gluPerspective is not an OpenGL primitive (its name begins with glu, so it is an OpenGL utility function). The OpenGL primitive function for perspective projection is glFrustum, which we provide with the boundaries of the viewing volume:

    glFrustum(left, right, bottom, top, near, far);

The arguments look (and, in fact, are) the same as those for glOrtho. The difference is that near determines the distance of the viewer from the near plane and the shape of the viewing volume is a "frustum" (truncated rectangular pyramid) rather than a cuboid (brick-shaped solid). The matrix constructed by the call is shown below — compare this to the matrix generated by glOrtho in Section 2.2.1.

\[
\text{frustum} =
\begin{bmatrix}
\dfrac{2\,near}{right - left} & 0 & \dfrac{right + left}{right - left} & 0 \\
0 & \dfrac{2\,near}{top - bottom} & \dfrac{top + bottom}{top - bottom} & 0 \\
0 & 0 & -\dfrac{far + near}{far - near} & -\dfrac{2\,far\,near}{far - near} \\
0 & 0 & -1 & 0
\end{bmatrix}
\]
Let \(v = [X, Y, Z, 1]^T\). When we multiply this vector by ortho, the transformed X and Y coordinates are independent of Z: with an orthogonal transformation, distance from the viewer does not affect size. (We have abbreviated right to r, left to l, etc.)

\[
\text{ortho} \cdot v =
\begin{bmatrix}
\dfrac{2X - (r + l)}{r - l} \\[4pt]
\dfrac{2Y - (t + b)}{t - b} \\[4pt]
\dfrac{-2Z - (f + n)}{f - n} \\[4pt]
1
\end{bmatrix}
\]

However, if we multiply frustum by v, we obtain the transformed coordinates below after normalization (that is, scaling so that the fourth component is 1). Note that the X and Y coordinates depend on the value of Z, moving towards the origin as Z gets larger.

\[
\text{frustum} \cdot v =
\begin{bmatrix}
-\dfrac{2nX + Z(r + l)}{Z(r - l)} \\[4pt]
-\dfrac{2nY + Z(t + b)}{Z(t - b)} \\[4pt]
\dfrac{2fn + Z(f + n)}{Z(f - n)} \\[4pt]
1
\end{bmatrix}
\]

It might seem easiest to put the near plane very close and the far plane very far away because this reduces the chance that the model will be outside the viewing volume. The drawback is that precision may be lost in depth comparison calculations. The number of bits lost is about log₂(far/near). For example, if you set near = 1 and far = 1000, you will lose about 10 bits of precision.

Figure 8 shows a very simple OpenGL program that uses a perspective transformation. The display function includes a translation that moves the model 10 units in the negative z direction, placing it comfortably in the viewing volume, which extends from z = −5 to z = −20. The output statement in function reshape reveals when and how often the function is called; when I ran this program under Windows, reshape was called once, with width = height = 300.

2.3 Model View Transformations

Getting the projection right is the easy part. Manipulating the model view matrix is harder because there is more work to do. As we have seen, there are transformations for translating, rotating, and scaling. The order in which these transformations are applied is important.

The graphics software simulates a camera, transforming a three-dimensional object, viewed from a certain angle, into a rectangular, two-dimensional image. There are two ways of thinking about a viewing transformation, and it is helpful to be able to think using both.

• A viewing transformation has the effect of moving the model with respect to the camera.
• A viewing transformation has the effect of moving the camera with respect to the model.
    void display () {
       glClearColor(0, 0.1, 0.4, 0);
       glClear(GL_COLOR_BUFFER_BIT);
       glMatrixMode(GL_MODELVIEW);
       glLoadIdentity();
       glTranslatef(0, 0, -10);
       glutWireCube(1);
       glFlush();
    }

    void reshape(int width, int height) {
       cout << "Reshape " << width << " " << height << endl;
       glViewport(0, 0, width, height);
       glMatrixMode(GL_PROJECTION);
       glLoadIdentity();
       gluPerspective(30, (GLfloat)width / (GLfloat)height, 5, 20);
       glutPostRedisplay();
    }

    int main (int argc, char **argv) {
       glutInit(&argc, argv);
       glutCreateWindow("Perspective Transformation");
       glutDisplayFunc(display);
       glutReshapeFunc(reshape);
       glutMainLoop();
    }

    Figure 8: An OpenGL program with a perspective transformation

Naturally, the two approaches are inverses. We can think of the transformation

    glTranslatef(0, 0, -10);

used in the program of Figure 8 as either moving the model 10 units in the −Z direction or moving the camera 10 units in the +Z direction.

Initially, the camera and the model are both situated at the origin, (0, 0, 0), with the camera looking in the −Z direction. If we want to see the model, we have either to move it away from the camera, or move the camera away from the model.

For most purposes, it is easiest to visualize transformations like this: the camera remains fixed, and the transformation moves the origin to a new location. All drawing takes place at the current origin. For example, when we call glutWireCube, the cube is drawn with its centre at the current origin. Viewed in this way, the translation above moves the origin to (0, 0, −10) and continues drawing there.

Physicists use the term frame of reference, or frame for short, for a coordinate system with an origin and axes. With this terminology, the effect of a transformation is to move the
frame, leaving the camera where it is, and to draw objects with respect to the new frame. Although particular kinds of transformations commute, transformations in general do not commute. Consider the two versions of the display function shown in Figures 9 and 10.

    void display () {
       glClearColor(0, 0.1, 0.4, 0);
       glClear(GL_COLOR_BUFFER_BIT);
       glMatrixMode(GL_MODELVIEW);
       glLoadIdentity();
       glTranslatef(0, 0, -10);
       glRotatef(15, 0, 1, 0);
       glutWireCube(1);
       glFlush();
    }

    Figure 9: Translation followed by rotation

    void display () {
       glClearColor(0, 0.1, 0.4, 0);
       glClear(GL_COLOR_BUFFER_BIT);
       glMatrixMode(GL_MODELVIEW);
       glLoadIdentity();
       glRotatef(15, 0, 1, 0);
       glTranslatef(0, 0, -10);
       glutWireCube(1);
       glFlush();
    }

    Figure 10: Rotation followed by translation

• In Figure 9, we first translate the frame of reference 10 units in the −Z direction and then rotate it 15° about the Y axis. Finally, we draw the cube. The effect is that the cube appears in the middle of the window, rotated 15° about the vertical axis.

• In Figure 10, we first rotate the frame 15° about the Y axis and then translate it 10 units in the −Z direction. Since the axes have been rotated, the direction of the Z axis has changed, and the cube moves to the side of the window.
3 Building Models and Scenes

3.1 A Digression on Global Variables

A common criticism of OpenGL is that programs depend too much on global variables. The criticism is valid in the sense that most small OpenGL programs, especially example programs, do make heavy use of global variables. To some extent, this is inevitable, because the callback functions have fixed parameter lists and do not return results: the only way they can communicate is with global variables. For example, suppose we want the mouse callback function to affect the display. Since the mouse callback function receives the mouse coordinates and returns void, and the display function receives nothing and returns void, these functions can communicate only by means of global variables.

It is impossible to avoid having a few global variables. However, the number of global variables can be made quite small by following standard encapsulation practices. For example, the current GLUT context (current window, its width and height, position of the mouse, etc.) can be put inside an object. Then there needs to be only one global variable, probably a pointer to this object, and callback functions reference this object.

Figure 11 shows a small program written using global variables. It is typical of small OpenGL example programs. Figures 12 and 13 show an equivalent program written with fewer global variables. In fact, the only global variable in the second version is ps, which is a pointer to an object that contains all the state necessary for this application. The display, reshape, and mouse functions communicate by "sending messages" to this unique object. This technique extends nicely to larger programs. For example, multiple windows can be handled by associating one object with each window.
    int mwWidth;
    int mwHeight;
    GLfloat xPos;
    GLfloat yPos;

    void display () {
       glClearColor(0, 0.1, 0.4, 0);
       glClear(GL_COLOR_BUFFER_BIT);
       glMatrixMode(GL_MODELVIEW);
       glLoadIdentity();
       glTranslatef(0, 0, -10);
       glRotatef(360 * xPos, 0, 1, 0);
       glutWireCube(1);
       glFlush();
    }

    void reshape(int width, int height) {
       mwWidth = width;
       mwHeight = height;
       glViewport(0, 0, width, height);
       glMatrixMode(GL_PROJECTION);
       glLoadIdentity();
       gluPerspective(30, (GLfloat)width / (GLfloat)height, 5, 20);
       glutPostRedisplay();
    }

    void mouse(int x, int y) {
       xPos = (GLfloat)x / (GLfloat)mwWidth;
       yPos = (GLfloat)y / (GLfloat)mwHeight;
       glutPostRedisplay();
    }

    int main (int argc, char **argv) {
       glutInit(&argc, argv);
       glutCreateWindow("Perspective Transformation");
       glutDisplayFunc(display);
       glutReshapeFunc(reshape);
       glutMotionFunc(mouse);
       glutMainLoop();
    }

    Figure 11: Programming with global variables
    class State {
    public:
       State(int w, int h) : mwWidth(w), mwHeight(h) {}
       int mwWidth;
       int mwHeight;
       GLfloat xPos;
       GLfloat yPos;
    };

    State *ps;

    void display () {
       glClearColor(0, 0.1, 0.4, 0);
       glClear(GL_COLOR_BUFFER_BIT);
       glMatrixMode(GL_MODELVIEW);
       glLoadIdentity();
       glTranslatef(0, 0, -10);
       glRotatef(360 * ps->xPos, 0, 1, 0);
       glutWireCube(1);
       glFlush();
    }

    void reshape(int width, int height) {
       ps->mwWidth = width;
       ps->mwHeight = height;
       glViewport(0, 0, width, height);
       glMatrixMode(GL_PROJECTION);
       glLoadIdentity();
       gluPerspective(30, (GLfloat)width / (GLfloat)height, 5, 20);
       glutPostRedisplay();
    }

    void mouse(int x, int y) {
       ps->xPos = (GLfloat)x / (GLfloat)ps->mwWidth;
       ps->yPos = (GLfloat)y / (GLfloat)ps->mwHeight;
       glutPostRedisplay();
    }

    Figure 12: Programming with fewer global variables
    int main (int argc, char **argv) {
       const int WIDTH = 600;
       const int HEIGHT = 400;
       glutInit(&argc, argv);
       glutInitWindowSize(WIDTH, HEIGHT);
       glutCreateWindow("Perspective Transformation");
       glutDisplayFunc(display);
       glutReshapeFunc(reshape);
       glutMotionFunc(mouse);
       ps = new State(WIDTH, HEIGHT);
       glutMainLoop();
    }

    Figure 13: Programming with fewer global variables, continued

3.2 Matrix Stacks

In this section, we develop a program that draws a Maltese Cross. The cross has six arms, pointing in the directions ±X, ±Y, and ±Z. Each arm has a square cross-section and ends with a sphere. Figure 14 shows the code for one arm of the cross. The arm has diameter DIAM and length LEN; the sphere has radius RAD. The origin is at the base of the arm. The arm is obtained by changing the scale, drawing a cube, and resetting the original scale. When this code has finished, the origin is restored by a translation that reverses the effect of the earlier translations.

    glTranslatef(0, 0, LEN / 2);
    glScalef(DIAM, DIAM, LEN);
    glutWireCube(1);
    glScalef(1/DIAM, 1/DIAM, 1/LEN);
    glTranslatef(0, 0, LEN / 2 + RAD);
    glutWireSphere(RAD, 15, 15);
    glTranslatef(0, 0, - LEN - RAD);

    Figure 14: Drawing an arm — first version

There are two undesirable features of the code in Figure 14. First, the need to "undo" the scaling transformation. (In particular, note that the user might want to obtain "flat" arms by setting one diameter to zero. In this case, the scaling cannot be reversed.) Second, the same problem applies to the frame of reference: we have to reverse the effect of translation to maintain the position of the origin.

OpenGL matrices are implemented as matrix stacks. To avoid reversing transformations, we can stack the matrices that we need using glPushMatrix and restore them when we have finished using glPopMatrix. Figure 15 is the revised code for drawing an arm. This code leaves the frame of reference unchanged. The indentation is not required, of course, but it helps the reader to understand the effect of the transformations.
    glPushMatrix();
       glTranslatef(0, 0, LEN / 2);
       glPushMatrix();
          glScalef(DIAM, DIAM, LEN);
          glutWireCube(1);
       glPopMatrix();
       glTranslatef(0, 0, LEN / 2 + RAD);
       glutWireSphere(RAD, 15, 15);
    glPopMatrix();

    Figure 15: Drawing an arm — improved version

    void display () {
       glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
       glMatrixMode(GL_MODELVIEW);
       glLoadIdentity();
       arm();
       glRotatef(90, 1, 0, 0);
       arm();
       glRotatef(180, 1, 0, 0);
       arm();
       glRotatef(270, 1, 0, 0);
       arm();
       glRotatef(90, 0, 1, 0);
       arm();
       glRotatef(270, 0, 1, 0);
       arm();
       glFlush();
    }

    Figure 16: Drawing a Maltese Cross

We can put the code of Figure 15 into a function called arm. This function has an important property that should be respected by all drawing functions: it leaves the reference frame unchanged. Drawing scenes with functions that do not have this property can be very confusing! The display function calls arm six times to draw the Maltese Cross: see Figure 16.

3.2.1 Pops and Pushes Don't Cancel!

When pushing and popping matrices, it is important to realize that the sequence

    glPopMatrix();
    glPushMatrix();

does have an effect: the two calls do not cancel each other out. To see this, look at Figure 17. The left column contains line numbers for identification, the second column contains code, the third column shows the matrix at the top of the stack after each function has executed, and the other columns show the matrices lower in the stack. The matrices are shown as I (the identity), T (the translation), and R (the rotation).
    #   Code                              Stack (top first)
    1   glLoadIdentity();                 I
    2   glPushMatrix();                   I    I
    3   glTranslatef(1.0, 0.0, 0.0);      T    I
    4   glPushMatrix();                   T    T    I
    5   glRotatef(10.0, 0.0, 1.0, 0.0);   T R  T    I
    6   glPopMatrix();                    T    I
    7   glPushMatrix();                   T    T    I

    Figure 17: Pushing and popping

At line 4, there are three matrices on the stack, with T occupying the top two places. Line 5 post-multiplies the top matrix by R. Line 6 pops the product T R off the stack, restoring the stack to its value at line 3. Line 7 pushes the stack and copies the top matrix. Note the difference between the stack entries at line 5 and line 7.

3.2.2 Animation with Stacks

The ability to stack matrices is particularly important for animation. Suppose that we want to draw a sequence of images showing a robot waving its arms. There will be at least two angles that change with time: let us say that θ is the angle between the body and the upper arm, and φ is the angle between the upper arm and the forearm. Then the animation will include roughly the following steps (a code sketch follows the outline):

    1 Push frame 1.
      1.1 Draw the body.
      1.2 Translate to the left shoulder.
      1.3 Push frame 2.
        1.3.1 Rotate through θ.
        1.3.2 Draw the upper arm and translate along it.
        1.3.3 Rotate through φ.
        1.3.4 Draw the forearm and hand.
      1.4 Pop, restoring frame 2 (the left shoulder).
      1.5 Translate to the right shoulder.
      1.6 Push frame 3.
        1.6.1 Rotate through θ.
        1.6.2 Draw the upper arm and translate along it.
        1.6.3 Rotate through φ.
        1.6.4 Draw the forearm and hand.
      1.7 Pop, restoring frame 3 (the right shoulder).
    2 Pop, restoring frame 1 (the body).
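The outline above translates almost directly into code. The sketch below is illustrative only: drawBody, drawUpperArm, and drawForearmAndHand are assumed helper functions, and the shoulder offsets and rotation axes are placeholder values, not taken from the notes.

    // Illustrative sketch of the arm-waving robot outlined above.
    void drawBody(void);              // assumed helpers, not defined here
    void drawUpperArm(void);
    void drawForearmAndHand(void);

    void drawRobot(GLfloat theta, GLfloat phi) {
       glPushMatrix();                     // push frame 1 (the body)
          drawBody();
          glTranslatef(-1, 2, 0);          // translate to the left shoulder
          glPushMatrix();                  // push frame 2
             glRotatef(theta, 0, 0, 1);    // rotate through theta
             drawUpperArm();
             glTranslatef(0, -1, 0);       // translate along the upper arm
             glRotatef(phi, 0, 0, 1);      // rotate through phi
             drawForearmAndHand();
          glPopMatrix();                   // pop, restoring frame 2
          glTranslatef(2, 0, 0);           // translate to the right shoulder
          glPushMatrix();                  // push frame 3
             glRotatef(-theta, 0, 0, 1);
             drawUpperArm();
             glTranslatef(0, -1, 0);
             glRotatef(-phi, 0, 0, 1);
             drawForearmAndHand();
          glPopMatrix();                   // pop, restoring frame 3
       glPopMatrix();                      // pop, restoring frame 1
    }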
3.3 Viewing the Model

So far, we have used a convention that is quite common in OpenGL programs: the reshape callback function sets the projection matrix and the display callback function sets the model view matrix. The partial program shown in Figure 18 shows a variation on this theme: the display function sets both the projection and the model view matrices. The reshape function updates the viewport and sets the global variables width and height, which are needed for gluPerspective. The Y mouse coordinate is used to set the angle of view in gluPerspective; this has the effect that moving the mouse down (up) in the window zooms the image towards (away from) the viewer.

The function gluLookAt() defines a model-view transformation that simulates viewing ("looking at") the scene from a particular viewpoint. It takes nine arguments of type GLdouble. The first three arguments define the camera (or eye) position with respect to the origin; the next three arguments are the coordinates of a point in the model towards which the camera is directed; and the last three arguments are the components of a vector pointing upwards. In the call

    gluLookAt ( 0.0, 0.0, 10.0,
                0.0, 0.0, 0.0,
                0.0, 1.0, 0.0 );

the point of interest in the model is at (0, 0, 0), the position of the camera relative to this point is (0, 0, 10), and the vector (0, 1, 0) (that is, the Y-axis) is pointing upwards.

Although the idea of gluLookAt() seems simple, the function is tricky to use in practice. Sometimes, introducing a call to gluLookAt() has the undesirable effect of making the image disappear altogether! In the following code, the effect of the call to gluLookAt() is to move the origin to (0, 0, −10); but the near and far planes defined by gluPerspective() are at z = −1 and z = −5, respectively. Consequently, the cube is beyond the far plane and is invisible.

    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(30, 1, 1, 5);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    gluLookAt ( 0, 0, 10, 0, 0, 0, 0, 1, 0 );
    glutWireCube(1.0);

Figure 19 demonstrates how to use gluLookAt() and gluPerspective() together. The two important variables are alpha and dist. The idea is that the extension of the object in the Z direction is less than 2; consequently, it can be enclosed completely by planes at distances dist − 1 and dist + 1 from the camera. To ensure that the object is visible, gluLookAt() sets the camera position to (0, 0, dist).

Changing the value of alpha in Figure 19 changes the size of the object (see Section 2.2.2). The height of the viewing window is 2 (dist − 1) tan(alpha/2); increasing alpha makes the viewing window larger and the object smaller. Changing the value of dist also changes the size of the image in the viewport, but in a different way. The perspective changes, giving the effect of approaching (if dist gets smaller and the object gets larger) or going away (if dist gets larger and the object gets smaller).
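One way to repair the invisible-cube fragment above is to choose near and far so that they bracket the camera-to-model distance of 10. The following variant is a sketch of that repair; the values 5 and 15 are illustrative, not from the notes.

    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(30, 1, 5, 15);   // near = 5 < 10 < 15 = far
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    gluLookAt ( 0, 0, 10, 0, 0, 0, 0, 1, 0 );
    glutWireCube(1.0);              // now inside the viewing volume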
    const int WIDTH = 600;
    const int HEIGHT = 400;
    int width = WIDTH;
    int height = HEIGHT;
    GLfloat xMouse = 0.5;
    GLfloat yMouse = 0.5;
    GLfloat nearPlane = 10;
    GLfloat farPlane = 100;
    GLfloat distance = 80;

    void display () {
       glMatrixMode(GL_PROJECTION);
       glLoadIdentity();
       gluPerspective(20 + 60 * yMouse, GLfloat(width) / GLfloat(height),
          nearPlane, farPlane);
       glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
       glMatrixMode(GL_MODELVIEW);
       glLoadIdentity();
       glTranslatef(0, 0, -distance);
       // Display scene
       glutSwapBuffers();
    }

    void mouseMovement (int mx, int my) {
       xMouse = GLfloat(mx) / GLfloat(width);
       yMouse = 1 - GLfloat(my) / GLfloat(height);
       glutPostRedisplay();
    }

    void reshapeMainWindow (int newWidth, int newHeight) {
       width = newWidth;
       height = newHeight;
       glViewport(0, 0, width, height);
    }

    Figure 18: Zooming
    const int SIZE = 500;
    float alpha = 60.0;
    float dist = 5.0;

    void display (void) {
       glClear(GL_COLOR_BUFFER_BIT);
       glMatrixMode(GL_MODELVIEW);
       glLoadIdentity();
       gluLookAt ( 0.0, 0.0, dist,
                   0.0, 0.0, 0.0,
                   0.0, 1.0, 0.0 );
       glutWireCube(1.0);
    }

    int main (int argc, char *argv[]) {
       glutInit(&argc, argv);
       glutInitWindowSize(SIZE, SIZE);
       glutInitWindowPosition(100, 50);
       glutCreateWindow("A Perspective View");
       glMatrixMode(GL_PROJECTION);
       glLoadIdentity();
       gluPerspective(alpha, 1.0, dist - 1.0, dist + 1.0);
       glutDisplayFunc(display);
       glutMainLoop();
    }

    Figure 19: Using gluLookAt() and gluPerspective()

It is possible to change alpha and dist together in such a way that the size of a key object in the model stays the same while the perspective changes. This is a rather simple technique in OpenGL, but it is an expensive effect in movies or television because the zoom control of the lens must be coupled to the tracking motion of the camera. Hitchcock used this trick to good effect in his movies Vertigo and Marnie.

The code extracts in the following example are taken from a program called viewpoints.cpp. You can obtain the complete source code for this program from

    http://www.cs.concordia.ca/~faculty/grogono/viewpoints.cpp

The display function of this program consists mainly of a switch statement for which the cases are determined by keys pressed by the user. There is also an idle function callback that performs the following computation:

    carDirection += 0.05f;
    if (carDirection > TWOPI)
       carDirection -= TWOPI;
    carXPos = TRACKMIDDLE * sin(carDirection);
    carYPos = TRACKMIDDLE * cos(carDirection);

In the simplest case, the camera stands still and we watch the car going around the track.

    case DISTANT:
       gluLookAt( 250.0, 0.0, 20.0 * height,
                  0.0, 0.0, 0.0,
                  0.0, 0.0, 1.0 );
       drawScenery();
       glTranslatef(carXPos, carYPos, carZPos);
       glRotatef(RAD2DEG * carDirection, 0.0, 0.0, -1.0);
       drawCar();
       break;

In the next mode, the camera stays in the same position but pans to follow the car. This is easily done by using the car's position as the point of interest (the middle three arguments) of gluLookAt.

    case INSIDE:
       gluLookAt( 85.0, 0.0, height,
                  carXPos, carYPos, 0.0,
                  0.0, 0.0, 1.0 );
       drawScenery();
       glTranslatef(carXPos, carYPos, carZPos);
       glRotatef(RAD2DEG * carDirection, 0.0, 0.0, -1.0);
       drawCar();
       break;

In the next mode, we see the scene from the driver's point of view. The call to gluLookAt establishes an appropriate point of view, assuming the car is at the origin and the car is drawn without any further transformation. We then apply inverse transformations to show the scenery moving with respect to the car.

    case DRIVER:
       gluLookAt( 2.0, 0.0, height,
                  12.0, 0.0, 2.0,
                  0.0, 0.0, 1.0 );
       drawCar();
       glRotatef(RAD2DEG * carDirection, 0.0, 0.0, 1.0);
       glTranslatef(- carXPos, - carYPos, carZPos);
       drawScenery();
       break;

The next mode is the hardest to get right. We are in the car but looking at a fixed object in the scene. The first rotation counteracts the rotation of the car. We then call gluLookAt to look from the driver's position towards the house at (40, 120). We then draw the car — which is now rotating with respect to the camera — reverse the rotation and translation transformations, and draw the scenery.

    case HOUSE:
       glRotatef(RAD2DEG * carDirection, 0.0, -1.0, 0.0);
       gluLookAt(
          2.0, 0.0, height,
          40.0 - carXPos, 120.0 - carYPos, carZPos,
          0.0, 0.0, 1.0 );
       drawCar();
       glRotatef(RAD2DEG * carDirection, 0.0, 0.0, 1.0);
       glTranslatef(- carXPos, - carYPos, carZPos);
       drawScenery();
       break;
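For reference, the idle computation quoted earlier fits naturally into a GLUT idle callback. This is a sketch, not the actual code of viewpoints.cpp; it assumes the globals carDirection, carXPos, carYPos and the constants TWOPI and TRACKMIDDLE that the program defines:

    #include <cmath>

    void idle ()
    {
       // Advance the car around the track, keeping the angle in [0, 2*pi).
       carDirection += 0.05f;
       if (carDirection > TWOPI)
          carDirection -= TWOPI;
       carXPos = TRACKMIDDLE * sin(carDirection);
       carYPos = TRACKMIDDLE * cos(carDirection);
       glutPostRedisplay();    // request a redraw with the new car position
    }

    // Registered once during initialization:
    //    glutIdleFunc(idle);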
4 Lighting

The techniques that we have developed so far enable us to draw various shapes, to view them in perspective, and to move them around. With these techniques alone, however, it is hard to create the illusion of reality. The missing dimension is lighting: with skillful use of lighting, we can turn a primitive computer graphic into a realistic scene.

OpenGL provides simple but effective facilities for lighting. It compromises between realism and efficiency: there are more sophisticated algorithms for lighting than those that OpenGL uses, but they require significantly more processing time. OpenGL is good enough for most purposes and fast enough to animate fairly complex scenes on today's PCs.

4.1 Lighting Basics

4.1.1 Hiding Surfaces and Enabling Lights

Since lighting does not make much sense without hidden surface removal, we will assume in this section that initialization includes these calls:

    glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB);
    glEnable(GL_DEPTH_TEST);

and that the display function clears the depth buffer bit:

    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

OpenGL performs lighting calculations only if GL_LIGHTING is enabled. There are eight lights and their names are GL_LIGHTn, where n = 0, 1, 2, ..., 7. For simple applications, we need to use only the first light. During initialization, we execute

    glEnable(GL_LIGHTING);
    glEnable(GL_LIGHT0);

As usual, OpenGL works with states: we can turn lighting on or off at any time by calling glEnable(GL_LIGHTING) and glDisable(GL_LIGHTING). This is useful for scenes which are partly lit and partly unlit.

4.1.2 Kinds of Light

Nature provides only one kind of light, in the form of photons. We can simulate photons, but only with large amounts of computation. Research has shown that we can obtain realistic illumination in a reasonably efficient way by dividing light into four categories. Note that we never see the light itself, but only the surfaces that are illuminated by it. Even when we see a "beam" of light, for example in a movie theatre, we are actually seeing illuminated dust particles, not light.

Ambient light is light that pervades every part of the scene. Ambient light has no direction and it illuminates every object in the same way, regardless of its position or orientation. If there is no ambient light, the scene has a harsh, "outer space" quality, in which a surface that is not directly illuminated appears completely black — and therefore invisible.

Diffuse light comes from a particular direction but is scattered in all directions by the surface it hits. The brightness of a diffusely lit surface varies with the direction of the light: a simple assumption is that the brightness is proportional to cos θ, where θ is the angle between the
light rays and the normal to the surface. The colour of a surface lit by diffuse light is the colour of the surface: a red object appears red, etc.

Specular light comes from a particular direction and is reflected in a cone. A mirror is an almost perfect specular surface: a light ray is reflected as a light ray (the angle of the cone is zero). Glossy objects reflect specular light in a cone whose angle depends on the shininess of the material. If you think of a sphere illuminated by a small, bright light, it will have a circular highlight: the size of the highlight will be small for a very shiny sphere and larger for a matte sphere. Unlike diffuse light, the colour of specular light depends more on the light source than on the illuminated surface. A red sphere lit by a white light has a white highlight.

Emissive light is light that appears to be coming from the object. In computer graphics, we cannot actually construct objects that emit light; instead, we create the illusion that they are emitting light by making their colour independent of the other lighting in the scene.

4.2 Material Properties

The function that we have been using, glColor, does not provide enough information for lighting. In fact, it has no effect when lighting is enabled. Instead, we must define the colour material properties of each surface by calling glMaterial{if}[v]. Each call to glMaterial has three arguments: a face, a property, and a value for that property. Vector properties are usually passed by reference, as in this example:

    GLfloat deepBlue[] = { 0.1, 0.5, 0.8, 1.0 };
    glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, deepBlue);

The first argument must be one of GL_FRONT, GL_BACK, or GL_FRONT_AND_BACK. It determines which face of each polygon the property will be applied to. The most common and efficient form is GL_FRONT. Figure 20 gives possible values of the second argument and corresponding default values of the third argument. Figure 21 provides examples of the use of glMaterialfv().

The default specular colour is black, which means that objects have matte surfaces. To create a shiny object, set GL_SPECULAR to white and specify the shininess as a single number between 0 and 128. The object will look as if it is made of plastic; creating other materials, such as metals, requires a careful choice of values.

    Parameter                 Meaning                        Default
    GL_DIFFUSE                diffuse colour of material     (0.8, 0.8, 0.8, 1.0)
    GL_AMBIENT                ambient colour of material     (0.2, 0.2, 0.2, 1.0)
    GL_AMBIENT_AND_DIFFUSE    ambient and diffuse
    GL_SPECULAR               specular colour of material    (0.0, 0.0, 0.0, 1.0)
    GL_SHININESS              specular exponent              0.0
    GL_EMISSION               emissive colour of material    (0.0, 0.0, 0.0, 1.0)

    Figure 20: Parameters for glMaterialfv()
    /* Data declarations */
    GLfloat off[] = { 0.0, 0.0, 0.0, 0.0 };
    GLfloat white[] = { 1.0, 1.0, 1.0, 1.0 };
    GLfloat red[] = { 1.0, 0.0, 0.0, 1.0 };
    GLfloat deep_blue[] = { 0.1, 0.5, 0.8, 1.0 };
    GLfloat shiny[] = { 50.0 };
    GLfloat dull[] = { 0.0 };

    /* Draw a small, dark blue sphere with shiny highlights */
    glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, deep_blue);
    glMaterialfv(GL_FRONT, GL_SPECULAR, white);
    glMaterialfv(GL_FRONT, GL_SHININESS, shiny);
    glutSolidSphere(0.2, 10, 10);

    /* Draw a large, red cube made of non-reflective material */
    glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, red);
    glMaterialfv(GL_FRONT, GL_SPECULAR, off);
    glMaterialfv(GL_FRONT, GL_SHININESS, dull);
    glutSolidCube(10.0);

    /* Draw a white, glowing sphere */
    glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, off);
    glMaterialfv(GL_FRONT, GL_SPECULAR, off);
    glMaterialfv(GL_FRONT, GL_SHININESS, dull);
    glMaterialfv(GL_FRONT, GL_EMISSION, white);
    glutSolidSphere(10.0, 20, 20);

    Figure 21: Using glMaterial()

4.3 Light Properties

As mentioned above, OpenGL provides up to eight lights, each with its own properties. Since lighting calculations must be performed for each light, using a large number of lights will slow down the program. For most applications, it is best to use one or two lights only, to obtain acceptable performance. However, the realism of a scene can be greatly enhanced by multiple lights and there are occasions where a rich image is more important than fast animation.

Light properties are set by calling glLight{if}[v] with three arguments: the light (GL_LIGHT0, GL_LIGHT1, etc.); the property name; and the property value. Figure 22 describes each property and gives the default value for GL_LIGHT0. The default values for other lights are all zero. This means that if you enable GL_LIGHT0 and do nothing else, you will see something; but if you enable any other light, you won't see anything unless you specify its properties.

A light has three of the four colour components: diffuse, ambient, and specular, but not emissive. (We have seen why a surface should have these colours but it is not obvious why a light needs them as well. We will discuss this later, in the theory part of the course.)

A light has a position specified in four-dimensional coordinates (x, y, z, w). The fourth coordinate, w, has a special significance: if it is zero, the light is at infinity and the other three
coordinates give its direction; if it is 1, the light is at the position specified. For example, the default position is (0, 0, 1, 0), which specifies a light in the positive Z direction (behind the viewer in the default coordinate system) and infinitely far away. The position (−5, 1, 0, 1) defines a light on the left, slightly raised, and in the Z plane of the viewer. If w = 0, we have a directional light and, if w = 1, we have a positional light.

    Parameter                   Meaning                      Default
    GL_DIFFUSE                  diffuse colour               (1.0, 1.0, 1.0, 1.0)
    GL_AMBIENT                  ambient colour               (0.0, 0.0, 0.0, 1.0)
    GL_SPECULAR                 specular colour              (1.0, 1.0, 1.0, 1.0)
    GL_POSITION                 position                     (0.0, 0.0, 1.0, 0.0)
    GL_CONSTANT_ATTENUATION     constant attenuation         1.0
    GL_LINEAR_ATTENUATION       linear attenuation           0.0
    GL_QUADRATIC_ATTENUATION    quadratic attenuation        0.0
    GL_SPOT_CUTOFF              cutoff angle of spotlight    180.0
    GL_SPOT_DIRECTION           direction of spotlight       (0.0, 0.0, −1.0)
    GL_SPOT_EXPONENT            exponent of spotlight        0.0

    Figure 22: Parameters for glLightfv()

Once again, the choice of light position trades realism and efficiency. If the light is at infinity, its rays are parallel and lighting computations are fast. If the light is local, its rays hit each object at a different angle and lighting computations take longer.

The attenuation factors determine how the brightness of the light decreases with distance. OpenGL computes attenuation with the formula

    a = 1 / (c + ℓ d + q d²)

in which a is the attenuation factor, d is the distance from the light to the object, and c, ℓ, and q are the constant, linear, and quadratic attenuation coefficients, respectively. The default values are c = 1, ℓ = 0, and q = 0. Clearly, ℓ and q must be zero for a directional light because d = ∞; in practice, OpenGL ignores these values for directional lights.

Physics tells us that the intensity of light decreases as the inverse square of the distance from the source and therefore suggests setting c = ℓ = 0 and giving q some non-zero value. However, the inverse-square law applies only to point sources of light, which are rather rare in everyday life. For most purposes, the default values, c = 1 and ℓ = q = 0, are adequate. Giving non-zero values to ℓ and q may give somewhat more realistic effects but will make your program slower.

Every light is a spotlight that emits light in a cone. The angle of the cone is set by GL_SPOT_CUTOFF and its default value is 180°, which means that the light emits in all directions. Changing the value of GL_SPOT_CUTOFF gives the effect of a light that emits in a particular direction. For example, a value of 5° simulates a highly directional beam such as the headlight of a car.

If you give GL_SPOT_CUTOFF a value other than 180°, you should also give appropriate values to GL_SPOT_DIRECTION and GL_SPOT_EXPONENT. The direction is simply a vector specified by the values (x, y, z). The exponent determines how focused the beam is. The light intensity at
an angle θ from the centre of the beam is proportional to cos^x θ, where x is the exponent value. The default value of the exponent is 0; since cos^0 θ = 1, the illumination is even across the cone.

4.4 Lighting Model

The lighting model determines overall features of the lighting. The lighting model is selected by glLightModel{if}[v], which takes two arguments: a property and a value. Figure 23 shows the property names and their default values.

    Parameter                       Meaning                       Default
    GL_LIGHT_MODEL_AMBIENT          ambient light intensity       (0.2, 0.2, 0.2, 1.0)
    GL_LIGHT_MODEL_LOCAL_VIEWER     simulate a close viewpoint    GL_FALSE
    GL_LIGHT_MODEL_TWO_SIDE         select two-sided lighting     GL_FALSE
    GL_LIGHT_MODEL_COLOR_CONTROL    colour calculations           GL_SINGLE_COLOR

    Figure 23: Parameters for glLightModel()

The ambient light in a scene is light that pervades the scene uniformly, coming from all directions. A moderate quantity of ambient light, such as provided by the default setting, makes the quality of light softer and ensures that every object is visible. The other three properties allow you to choose between realism and speed.

The local viewer property determines the way in which OpenGL calculates specular reflections. (Roughly, specular reflections come from shiny or glossy objects. We discuss it in more detail later.) If the viewer is very distant, rays coming from the scene to the eye are roughly parallel, and specular reflection calculations can be simplified by assuming that they actually are parallel (GL_FALSE, the default setting). If we want to model lighting accurately from the point of view of a close viewer, we have to make more detailed calculations by choosing GL_TRUE for this parameter.

Two-sided lighting illuminates both sides of each polygon; single-sided lighting, the default, illuminates only front surfaces and is therefore much faster. Suppose we are lighting a sphere: all of the polygons face outwards and single-sided lighting is all we need. Suppose, however, that we cut a hole in the sphere so that we can see inside. The inside of the sphere consists of the back faces of the polygons and, in order to see them, we would need two-sided lighting.

As Figure 23 shows, the default colour calculation is GL_SINGLE_COLOR, which causes OpenGL to calculate a single colour for each vertex. The call

    glLightModeli(GL_LIGHT_MODEL_COLOR_CONTROL, GL_SEPARATE_SPECULAR_COLOR);

makes OpenGL calculate two colours for each vertex. The two colours are used when texturing, to ensure that textured objects are illuminated realistically.

4.5 Lighting in Practice

Lighting a scene is fairly straightforward; the hardest part is to get the light(s) in the right position. Positioning is done in two steps. First, the position is defined as a value:

    GLfloat pos[] = { 0, 0, 3, 1 };

Second, glLight is called with the property GL_POSITION:
    glLightfv(GL_LIGHT0, GL_POSITION, pos);

When this call is executed, the position given by pos is transformed by the current model view matrix. This can be a bit confusing, because the coordinate frame is moved by the model view matrix and the light is positioned with respect to the new coordinate frame. You may find it easier to set pos to (0, 0, 0, 1) and then move the frame to wherever you want the light.

Assuming that you have set pos as above, the following code in the display function will draw a stationary object with a fixed light.

    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    gluLookAt(0, 0, 5, 0, 0, 0, 0, 1, 0);
    glLightfv(GL_LIGHT0, GL_POSITION, pos);
    // Draw model

In the next version, the light rotates around the stationary object. Assume that angle is continuously updated by the idle function.

    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    gluLookAt(0, 0, 5, 0, 0, 0, 0, 1, 0);
    glPushMatrix();
    glRotatef(angle, 1, 0, 0);
    glLightfv(GL_LIGHT0, GL_POSITION, pos);
    glPopMatrix();
    // Draw model

A third possibility is that you want the light to move with the viewer, as if you were watching the scene with a miner's light attached to your hardhat. For this purpose, it's best to put the light at the origin (where the camera is) and to set its position before doing any other viewing transformation. The position is set by

    GLfloat pos[] = { 0, 0, 0, 1 };

and the display function contains the code

    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glLightfv(GL_LIGHT0, GL_POSITION, pos);
    gluLookAt(0, 0, 5, 0, 0, 0, 0, 1, 0);
    // Draw model

In each case, changing the fourth component of the light position to zero will give directional (rather than positional) light, which is faster to compute but less realistic.

4.6 Normal Vectors

In order to perform lighting calculations, OpenGL needs to know the direction of the normal at each point of the surface that it is rendering. The normal is a vector — usually a unit vector — that "sticks out" of the surface at right angles to it. The normals of a sphere, for example, would pass through the centre of the sphere if extended far enough backwards.

OpenGL requires a normal to be associated with each vertex. This might seem odd, because normals belong to surfaces, not to vertexes. There are two ways of using normals.
• If we want to draw an object with clearly-distinguished faces, such as a cube, then each vertex will have several normals associated with it. In the case of a cube, each corner vertex will have three normals, one for each face.

• In order to create the illusion of a smooth surface, we compute the average of the normals of the surfaces that meet at a vertex. For example, if three surfaces meet at a vertex, and their normals are v1, v2, and v3, then the normal for that vertex is calculated as

    (v1 + v2 + v3) / ‖v1 + v2 + v3‖

The effect of this calculation is to smooth out the corners and edges of the object.

The function that sets a normal is glNormal3{bsidf}[v]() and it must be called before the vertex it applies to. Once the normal is set, it can be applied to any number of vertexes. For example, to draw a flat triangle in the XY-plane, we could execute:

    glBegin(GL_TRIANGLES);
    glNormal3i(0, 0, 1);
    glVertex3f(-0.5, 0, 0);
    glVertex3f(0.5, 0, 0);
    glVertex3f(0, 0.866, 0);
    glEnd();

Here are three ways of computing normals.

1. Normal to a triangle. The vector normal to a triangle with vertices (x1, y1, z1), (x2, y2, z2), and (x3, y3, z3) is (a, b, c), where

    a = + | y2 − y1   z2 − z1 |        b = − | x2 − x1   z2 − z1 |        c = + | x2 − x1   y2 − y1 |
        | y3 − y1   z3 − z1 |              | x3 − x1   z3 − z1 |              | x3 − x1   y3 − y1 |

Figure 24 shows a simple function that computes the vector normal to a plane defined by three points p, q, and r, chosen from an array of points.

2. Normal to a polygon. There is a simple algorithm, invented by Martin Newell, for finding the normal to a polygon with N vertexes. For good results, the vertexes should lie approximately in a plane, but the algorithm does not depend on this. If the vertexes have coordinates (xi, yi, zi) for i = 0, 1, 2, ..., N − 1, the normal n = (nx, ny, nz) is computed as

    nx = Σ (yi − yi+1)(zi + zi+1)
    ny = Σ (zi − zi+1)(xi + xi+1)
    nz = Σ (xi − xi+1)(yi + yi+1)

where each sum runs over 0 ≤ i < N and the subscript i + 1 is computed "mod N": if i = N − 1, then i + 1 = 0. The result n must be divided by ‖n‖ = √(nx² + ny² + nz²) to obtain a unit vector.
    enum { X, Y, Z };
    typedef float Point[3];
    typedef float Vector[3];
    Point points[MAX_POINTS];

    void find_normal (int p, int q, int r, Vector v)
    {
       float x1 = points[p][X];
       float y1 = points[p][Y];
       float z1 = points[p][Z];
       float x2 = points[q][X];
       float y2 = points[q][Y];
       float z2 = points[q][Z];
       float x3 = points[r][X];
       float y3 = points[r][Y];
       float z3 = points[r][Z];
       v[X] = + (y2-y1)*(z3-z1) - (z2-z1)*(y3-y1);
       v[Y] = - (x2-x1)*(z3-z1) + (z2-z1)*(x3-x1);
       v[Z] = + (x2-x1)*(y3-y1) - (y2-y1)*(x3-x1);
    }

    Figure 24: Computing normals

3. Normals for a square grid. A general purpose formula can often be simplified for special cases. Suppose that we are constructing a terrain using squares and that the X and Y coordinates are integer multiples of the grid spacing, d. The height of the terrain at x = i and y = j is z_{i,j}. Figure 25 shows 9 points of the terrain, centered at z_{i,j}. The X coordinates in this view are i − 1, i, and i + 1, and the Y coordinates are j − 1, j, and j + 1. The appropriate normal for the point (xi, yj) is the average of the normals to the quadrilaterals A, B, C, and D. Using Newell's formula to compute these four normals and adding the resulting vectors gives a vector n with components:

    nx = d (z_{i−1,j+1} − z_{i+1,j+1} + 2 z_{i−1,j} − 2 z_{i+1,j} + z_{i−1,j−1} − z_{i+1,j−1})
    ny = d (−z_{i−1,j+1} − 2 z_{i,j+1} − z_{i+1,j+1} + z_{i−1,j−1} + 2 z_{i,j−1} + z_{i+1,j−1})
    nz = 8 d²

Note that we do not need to include the factor d in the calculation of n, since a scalar multiple does not affect the direction of a vector. The correct normal vector is then obtained by normalizing n.

The average of n normal vectors can be calculated by adding them and dividing by n. In practice, the division by n is not usually necessary, because we can simply add the normal vectors at a vertex and then normalize the resulting vector. Formally, the normalized average vector of a set of vectors { (xi, yi, zi) | i = 1, 2, ..., n } is (X/S, Y/S, Z/S), where

    X = Σ_{i=1}^{n} xi,
    Y = Σ_{i=1}^{n} yi,
    Z = Σ_{i=1}^{n} zi,

and S = √(X² + Y² + Z²).

[Figure 25: Computing average normals on a square grid — a 3 × 3 block of grid points with X coordinates i − 1, i, i + 1 and Y coordinates j − 1, j, j + 1, spacing d, and the four quadrilaterals A, B, C, and D meeting at the centre point.]

Normalizing Vectors  If you include the statement glEnable(GL_NORMALIZE) in your initialization code, OpenGL will normalize vectors for you. You have to do this if, for example, you import a model with vertexes and normals pre-calculated and you then scale this model. However, it is usually more efficient to normalize vectors yourself if you can.
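Newell's formula translates directly into code. The following sketch (not from the original notes) reuses the Point and Vector types of Figure 24 and computes a unit normal for a polygon with n vertexes:

    #include <math.h>

    /* Compute the unit normal of a polygon with n vertexes using
       Newell's formula; assumes the Point and Vector types of Figure 24. */
    void newell_normal (Point poly[], int n, Vector v)
    {
       v[X] = v[Y] = v[Z] = 0.0f;
       for (int i = 0; i < n; i++) {
          int j = (i + 1) % n;             /* next vertex, computed mod n */
          v[X] += (poly[i][Y] - poly[j][Y]) * (poly[i][Z] + poly[j][Z]);
          v[Y] += (poly[i][Z] - poly[j][Z]) * (poly[i][X] + poly[j][X]);
          v[Z] += (poly[i][X] - poly[j][X]) * (poly[i][Y] + poly[j][Y]);
       }
       float s = sqrtf(v[X]*v[X] + v[Y]*v[Y] + v[Z]*v[Z]);
       if (s > 0.0f) {                     /* guard against degenerate polygons */
          v[X] /= s;
          v[Y] /= s;
          v[Z] /= s;
       }
    }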
5 Special Effects

5.1 Blending

Blending is a very powerful and general feature of OpenGL and we will describe only a few special cases of it. For complete details, see the "red book" (OpenGL Programming Guide, Third Edition, by Mason Woo et al.).

The fourth component of a colour vector, usually referred to as "alpha" (α), makes the colour partially transparent, allowing it to be "blended" with another colour.

Normally, when OpenGL has computed the colour of a vertex it stores the colour at the corresponding pixel location unless the depth buffer information says that the vertex is invisible, in which case the pixel is left unchanged. When blending is being used, the computed colour is combined with the colour that is already at the pixel, and the new colour is stored. The new colour is called the source and the existing colour at the pixel is called the destination. If

    source colour                  = (Rs, Gs, Bs, As)
    source blending factors        = (Sr, Sg, Sb, Sa)
    destination colour             = (Rd, Gd, Bd, Ad)
    destination blending factors   = (Dr, Dg, Db, Da)

then the final colour of the pixel is

    (Rs Sr + Rd Dr, Gs Sg + Gd Dg, Bs Sb + Bd Db, As Sa + Ad Da)

The order in which we draw opaque objects does not usually matter much, because the depth buffer takes care of hidden surfaces. With blending, however, the order is important, because the order in which OpenGL processes the source and destination colours affects the result.

Blending is enabled by calling glEnable(GL_BLEND) and disabled by calling glDisable(GL_BLEND). The blending process is determined by calls to

    glBlendFunc(GLenum src, GLenum dst);

There are many possible values for these two arguments. Their uses are suggested by the following examples.

• To blend two images: draw the first image with src = GL_ONE and dst = GL_ZERO. Then set α = 0.5 and draw the second image with src = GL_SRC_ALPHA and dst = GL_ONE_MINUS_SRC_ALPHA.

• To achieve a "painting" effect, in which each brush stroke adds a little more colour, use α = 0.1 and draw each brush stroke with src = GL_SRC_ALPHA and dst = GL_ONE_MINUS_SRC_ALPHA.

Here are some extracts from a program that achieves a glass-like effect by blending. During initialization, call

    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

Set up various colours. Note the use of α values less than 1. The vector glass is not const because the program allows the transparency of glass to be changed.
    GLfloat glass[] = { 0.4f, 0.4f, 0.4f, 0.6f };
    const GLfloat blue[] = { 0.2f, 0.2f, 1.0f, 0.8f };
    const GLfloat white[] = { 1.0, 1.0, 1.0, 1.0 };
    const GLfloat polished[] = { 100.0 };

In the display function, display the more opaque object first, then the transparent object.

    glPushMatrix();
    glTranslatef(1.0, 0.0, 0.0);
    glRotatef(45.0, 1.0, 0.0, 0.0);
    glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, blue);
    glMaterialfv(GL_FRONT, GL_SPECULAR, white);
    glMaterialfv(GL_FRONT, GL_SHININESS, polished);
    glutSolidIcosahedron();
    glPopMatrix();

    glPushMatrix();
    glRotatef(30.0, 0.0, 1.0, 0.0);
    glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, glass);
    glMaterialfv(GL_FRONT, GL_SPECULAR, white);
    glMaterialfv(GL_FRONT, GL_SHININESS, polished);
    glutSolidCube(3.0);
    glPopMatrix();

See also (Hill Jr. 2001, pages 545–549).

5.2 Fog

Fog is an easy effect to create and it can be quite useful. A common problem that occurs in creating landscapes is that the edge of the terrain looks like the end of the world rather than a smooth horizon; we can use fog to hide such anomalies.

Fog is actually a special case of blending. The fog effect is obtained by blending the desired colour of a vertex with the fog colour. The degree of blending is determined by the distance of the vertex from the viewer. OpenGL provides three modes: in Figure 26, the left column shows the modes and the right column shows f, the "fog factor".

To use fog, you have to call glEnable(GL_FOG) and then set the parameters in the formulas by calling glFog:

    glFog{if}[v](GLenum param, TYPE value);

Figure 27 shows the values of the arguments of glFog. As the formulas show, you set GL_FOG_DENSITY for modes GL_EXP and GL_EXP2, and you set GL_FOG_START and GL_FOG_END for mode GL_LINEAR. The default mode is GL_EXP.

You can control the efficiency of fog generation by providing hints. If you call

    glHint(GL_FOG_HINT, GL_NICEST);

then OpenGL will calculate fog for every pixel. If you call

    glHint(GL_FOG_HINT, GL_FASTEST);
then OpenGL will calculate fog for every vertex, which is usually faster but doesn't look so nice. If you want OpenGL to decide which mode to use by itself, you write

    glHint(GL_FOG_HINT, GL_DONT_CARE);

Naturally, you can call glDisable(GL_FOG) to turn the fog effect off.

    GL_LINEAR    f = (end − z) / (end − start)
    GL_EXP       f = e^(−d·z)
    GL_EXP2      f = e^(−(d·z)²)

    Figure 26: Fog Formulas

    param             value
    GL_FOG_MODE       GL_LINEAR, GL_EXP, or GL_EXP2
    GL_FOG_DENSITY    d
    GL_FOG_START      start
    GL_FOG_END        end
    GL_FOG_COLOR      colour

    Figure 27: Parameters for glFog

5.3 Reflection

Reflection is one of several effects that you can obtain with the stencil buffer. A "stencil" is a plastic sheet with holes cut into it. The holes have particular shapes — for example, an architect's stencil has shapes of furniture — and the stencil is used as a guide when drawing those shapes. A stencil in a graphics program is an area of the window that is used to draw something different from the main image. Stencils can be used for a variety of effects.

The following extracts are from a program that draws a scene and its reflection in a mirror. Using a stencil for the mirror allows us to draw a scene in which objects in the mirror are transformed differently from other objects.

During initialization:

    glClearStencil(0);
    glEnable(GL_STENCIL_TEST);

Define a mirror in a plane normal to the X-axis. There are two ways of drawing the mirror: if p is true, draw a filled quadrilateral or, if p is false, draw a hollow outline.

    void mirror (bool p)
    {
       if (p)
          glBegin(GL_QUADS);
       else
          glBegin(GL_LINE_LOOP);
       glVertex3f(cmx, cmy - 0.5, cmz - 2.0);
       glVertex3f(cmx, cmy - 0.5, cmz + 2.0);
       glVertex3f(cmx, cmy + 0.5, cmz + 2.0);
       glVertex3f(cmx, cmy + 0.5, cmz - 2.0);
       glEnd();
    }

Display the scene like this. First, store the shape of the mirror in the stencil buffer.

    glClear(GL_STENCIL_BUFFER_BIT);
    glStencilFunc(GL_ALWAYS, 1, 1);
    glStencilOp(GL_REPLACE, GL_REPLACE, GL_REPLACE);
    mirror(true);

As usual, clear the colour buffer and depth buffer bits:

    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

Draw the mirror frame:

    glColor3f(0.7f, 0.7f, 0.7f);
    mirror(false);

Draw the scene outside the mirror:

    glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);
    glStencilFunc(GL_NOTEQUAL, 1, 1);
    scene();

Finally, draw the reflected scene in the mirror. To obtain the mirror image, translate to the centre of the mirror, reflect the scene along the X direction (remember that the plane of the mirror is normal to the X-axis), and then reverse the translation.

    glStencilFunc(GL_EQUAL, 1, 1);
    glTranslatef(cmx, cmy, cmz);
    glScalef(-1.0, 1.0, 1.0);
    glTranslatef(-cmx, -cmy, -cmz);
    scene();

5.4 Display Lists

Display lists do not provide any new graphical features; they are used to improve the performance of graphics programs. The idea is to perform as many calculations as possible and store them, instead of performing the calculations every time the scene is displayed.

The following code illustrates the use of display lists. To create a list:

    GLuint pic = glGenLists(1);
    glNewList(pic, GL_COMPILE);
    // draw the picture
    glEndList();

The argument given to glGenLists specifies the number of lists, N, that we want. glGenLists returns the first number, f, in a range. When we call glNewList, the first argument must be a number ℓ in the range f ≤ ℓ < f + N. The second argument given to glNewList is either GL_COMPILE, if we want to do the calculations only, or GL_COMPILE_AND_EXECUTE if we want to do the calculations and draw the picture.

After creating a list, we draw the picture stored in it by calling

    glCallList(pic);

where pic is the same number that we gave to glNewList.

A display list can include transformations (translate, rotate, scale) and drawing functions. As a rule of thumb, you can do something in a display list if it would make sense to do the same thing in the display function. However, the values used when the display list is created are fixed. For example, you could include a rotation

    glRotatef(angle, 0, 1, 0);

in a display list, but the current value of angle would be "frozen" in the list; changing the value of angle would have no effect on the stored image.

A single list can be used many times. For example, if you created a display list with index person, you could create a crowd scene like this (assume that rdm returns a random floating point value in a suitable range):

    for (p = 0; p < CROWDSIZE; p++)
    {
       glPushMatrix();
       glTranslatef(rdm(), 0, rdm());
       glRotatef(rand() % 360, 0, 1, 0);
       glCallList(person);
       glPopMatrix();
    }

Each person would be translated to some point in the XZ plane — but not levitated, since there is no Y component — and rotated by a random amount. A problem with this simple algorithm is that some people might overlap with others.

You can create hierarchical display lists. That is, you can use glCallList between glNewList and glEndList, provided that the list you call has already been defined. The following code is valid provided that the index couple has been allocated by glGenLists:

    glNewList(couple, GL_COMPILE);
    glCallList(person);
    glTranslatef(5, 0, 0);
    glCallList(person);
    glEndList();

As usual, it is highly advisable to use glPushMatrix and glPopMatrix to ensure that calling a list does not change the frame of reference.
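The overall pattern, then, is to build lists once during initialization and replay them from the display function. A minimal sketch, assuming a hypothetical drawPerson() routine that issues the actual drawing calls:

    GLuint person;          // display list index (global)

    void init ()
    {
       person = glGenLists(1);
       glNewList(person, GL_COMPILE);   // record the drawing calls...
       drawPerson();                    // ...without executing them yet
       glEndList();
    }

    void display ()
    {
       glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
       glCallList(person);              // replay the recorded calls
       glutSwapBuffers();
    }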
[Figure 28: Two three-point Bézier curves with their control points.]

5.5 Bézier Curves and Surfaces

We can do a limited amount of modelling with the built-in objects provided by GLUT (Section 1.5.2) and quadrics (Section 1.5.3), but we need more general techniques to build complex models. Bézier formulas¹ are an example of such a technique.

It might seem that we can use any polynomial (or even function) to generate a curve. This is not so, for the following reason. We would like to represent the modelling function using only a small number of parameters. For example, we can represent a second degree polynomial a x² + b x + c with just three numbers, a, b, and c. More practically, we could represent the function by a small collection of points that it passes through or that control its direction. We also want to perform graphics transformations on these points without changing the curve or surface generated. Consequently, Bézier and other formulas have the following important property: the relationship between the control points and the generated curve or surface is unaffected by affine transformations.

5.5.1 Curves

We consider curves in 2D space first because they are easier to understand than surfaces in 3D space. A Bézier curve is defined by a set of control points. The curve starts at the first point and ends at the last point, but it does not pass through the intermediate points. Figure 28 shows two Bézier curves, each with three control points. Tangents to the curve at its end points pass through the middle control point. A general Bézier curve can have any number of control points.

OpenGL provides evaluators to draw Bézier curves. The minimal steps are as follows:

• Define an array of control points.

• During initialization, pass information about control points to an evaluator and enable the evaluator.

• In the display function, compute the points that you need.

The evaluator defines a parametric curve: that is, points on the curve depend on values of a parameter, u. It is often convenient to allow u to vary from 0 to 1 but OpenGL does not require this. Here is a simple example that generates a four-point Bézier curve.

¹Bézier formulas were in fact invented by two researchers, both car designers: Pierre Bézier (1910–1999) at Renault found the formulas and Paul de Casteljau at Citroën developed an algorithm for calculating the coefficients.
    Parameter           Meaning
    GL_MAP1_VERTEX_3    (x, y, z) coordinates
    GL_MAP1_VERTEX_4    (x, y, z, w) coordinates
    GL_MAP1_COLOR_4     (r, g, b, a) colour values
    GL_MAP1_NORMAL      (x, y, z) normal direction

    Figure 29: Control parameters for Bézier curves

• Define an array of four 3D points. The points lie in the XY plane.

      GLfloat pts[4][3] = { -4, -4, 0, -2, 4, 0, 2, -4, 0, 4, 4, 0 };

• Define the evaluator:

      glMap1f(GL_MAP1_VERTEX_3, 0, 1, 3, 4, &pts[0][0]);

  The first argument determines the type of the control points (see below). The next two arguments specify the range of the parameter u: in this example, 0 ≤ u ≤ 1. The argument "3" is the stride: it tells the function how many floating-point values to step over between each point. The argument "4" is the order of the spline, which is equal to the number of points specified. The last argument is the address of the first control point.

• Enable the evaluator:

      glEnable(GL_MAP1_VERTEX_3);

• In the display function, draw the curve as a sequence of line segments:

      glBegin(GL_LINE_STRIP);
      for (int i = 0; i <= 50; i++)
         glEvalCoord1f((GLfloat) i / 50.0);
      glEnd();

• The same effect can be achieved by calling:

      glMapGrid1f(50, 0.0, 1.0);
      glEvalMesh1(GL_LINE, 0, 50);

An evaluator can generate vertex coordinates, normal coordinates, colour values, or texture coordinates. Figure 29 shows some of the possible values.

5.5.2 Surfaces

Generating a Bézier surface is similar, but two parameters, u and v, are needed for the surface coordinates and a rectangular array of control points is required. The functions are specified below in a general way rather than by specific example as above.

    glMap2{fd}(target, u1, u2, ustride, uorder,
               v1, v2, vstride, vorder, points);

where

    target  = the control parameter: as Figure 29 but with MAP2
    u1      = minimum value for u
    u2      = maximum value for u
    ustride = address difference between successive u values
    uorder  = number of u values
    v1      = minimum value for v
    v2      = maximum value for v
    vstride = address difference between successive v values
    vorder  = number of v values
    points  = address of first point

For example, suppose we define the control points with

    GLfloat pts[4][4][3] = { ... };

and the points specify vertices. Assume that we want 0 ≤ u ≤ 1 and 0 ≤ v ≤ 1. Then we would call:

    glMap2f(GL_MAP2_VERTEX_3, 0, 1, 3, 4, 0, 1, 12, 4, &pts[0][0][0]);

because successive u entries are 3 floats apart and successive v entries are 12 floats apart (each row of four points contains 4 × 3 = 12 floats).

To obtain a point on the surface, call

    glEvalCoord2f(u, v);

where (u, v) are the surface coordinates. The vertexes can be calculated four at a time and drawn as GL_QUADS to obtain a surface.

Alternatively, you can use the grid and mesh functions to draw the entire surface. An important advantage of using these functions is that they generate normals.

    glMapGrid2{fd}(nu, u1, u2, nv, v1, v2);

where

    nu = number of u values to evaluate
    u1 = minimum value for u
    u2 = maximum value for u
    nv = number of v values to evaluate
    v1 = minimum value for v
    v2 = maximum value for v

    glEvalMesh2(mode, i1, i2, j1, j2);

where mode is one of GL_POINT, GL_LINE, or GL_FILL; i1 and i2 specify the range of u values; and j1 and j2 specify the range of v values.

Although there can be any number of control points in principle, using very large numbers can be problematic. The functions generate polynomials of high degree that require time to compute and may be unstable.

Figure 30 shows a concrete example of Bézier surface generation. The shape generated is one side of the body of an aircraft; the surface is rendered twice to obtain both sides of the aircraft. Figure 31 shows the 3D coordinates generated by this code. See also (Hill Jr. 2001, Chapter 11).
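Before the aircraft example, it may help to see the steps of Section 5.5.1 assembled into one program. This is a sketch rather than code from the notes; the window setup and the orthographic projection bounds are arbitrary choices:

    #include <GL/glut.h>

    // Four control points in the XY plane, as in Section 5.5.1.
    static GLfloat pts[4][3] = { {-4, -4, 0}, {-2, 4, 0}, {2, -4, 0}, {4, 4, 0} };

    void display (void)
    {
       glClear(GL_COLOR_BUFFER_BIT);
       glBegin(GL_LINE_STRIP);
       for (int i = 0; i <= 50; i++)
          glEvalCoord1f((GLfloat) i / 50.0);   // point on the curve at u = i/50
       glEnd();
       glFlush();
    }

    int main (int argc, char *argv[])
    {
       glutInit(&argc, argv);
       glutInitWindowSize(500, 500);
       glutCreateWindow("Bezier curve");
       glMatrixMode(GL_PROJECTION);
       glLoadIdentity();
       glOrtho(-5, 5, -5, 5, -1, 1);           // the control points span [-4, 4]
       glMap1f(GL_MAP1_VERTEX_3, 0, 1, 3, 4, &pts[0][0]);
       glEnable(GL_MAP1_VERTEX_3);
       glutDisplayFunc(display);
       glutMainLoop();
    }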
    glEnable(GL_MAP2_VERTEX_3);
    glEnable(GL_AUTO_NORMAL);
    setMaterial(METAL);

    // Fuselage
    const int fuWidth = 4;
    const int fuLength = 6;
    const int fuLoops = 20;
    const int fuSlices = 20;
    const GLfloat fuShapeFactor = 0.9f;
    GLfloat fuPoints[fuLength][fuWidth][3];
    struct
    {
       GLfloat len;
       GLfloat size;
    } fuParameters[fuLength] =
    {
       { -10, 0 }, { -9.6f, 1.4f }, { -9, 1.6f },
       { 8, 1.4f }, { 9.9f, 1 }, { 10, 0 }
    };

    for (int p = 0; p < fuLength; p++)
    {
       for (int y = 0; y < fuWidth; y++)
          fuPoints[p][y][2] = fuParameters[p].len;
       fuPoints[p][0][0] = 0;
       fuPoints[p][1][0] = fuParameters[p].size;
       fuPoints[p][2][0] = fuParameters[p].size;
       fuPoints[p][3][0] = 0;
       fuPoints[p][0][1] = - fuShapeFactor * fuParameters[p].size;
       fuPoints[p][1][1] = - fuShapeFactor * fuParameters[p].size;
       fuPoints[p][2][1] = fuShapeFactor * fuParameters[p].size;
       fuPoints[p][3][1] = fuShapeFactor * fuParameters[p].size;
    }

    glMap2f(GL_MAP2_VERTEX_3, 0, 1, 3, fuWidth,
            0, 1, 3 * fuWidth, fuLength, &fuPoints[0][0][0]);
    glMapGrid2f(fuLoops, 0, 1, fuSlices, 0, 1);
    glEvalMesh2(GL_FILL, 0, fuLoops, 0, fuSlices);
    glScalef(-1, 1, 1);
    glEvalMesh2(GL_FILL, 0, fuLoops, 0, fuSlices);

    Figure 30: Using Bézier surfaces for the body of a plane
    0.00  0.00 -10.00    0.00  0.00 -10.00    0.00  0.00 -10.00    0.00  0.00 -10.00
    0.00 -1.26  -9.60    1.40 -1.26  -9.60    1.40  1.26  -9.60    0.00  1.26  -9.60
    0.00 -1.44  -9.00    1.60 -1.44  -9.00    1.60  1.44  -9.00    0.00  1.44  -9.00
    0.00 -1.26   8.00    1.40 -1.26   8.00    1.40  1.26   8.00    0.00  1.26   8.00
    0.00 -0.90   9.90    1.00 -0.90   9.90    1.00  0.90   9.90    0.00  0.90   9.90
    0.00  0.00  10.00    0.00  0.00  10.00    0.00  0.00  10.00    0.00  0.00  10.00

    Figure 31: Points generated by the code of Figure 30

    void menu (int code)
    {
       cout << "Menu selection: " << code << endl;
    }

    void initMenus ()
    {
       int sub = glutCreateMenu(menu);
       glutAddMenuEntry("Orange", 5);
       glutAddMenuEntry("Pear", 6);
       glutAddMenuEntry("Quince", 7);
       glutAddMenuEntry("Raspberry", 8);
       glutCreateMenu(menu);
       glutAddMenuEntry("Apple", 1);
       glutAddMenuEntry("Banana", 2);
       glutAddMenuEntry("Carrot", 3);
       glutAddMenuEntry("Damson", 4);
       glutAddSubMenu("More...", sub);
       glutAttachMenu(GLUT_RIGHT_BUTTON);
    }

    Figure 32: Functions for menu callback and creation

5.6 Menus

GLUT provides menus: the menus are not very beautiful but have the advantage that they are easy to create. Figure 32 shows the idea. The function initMenus is called once only and sets up the menus. The callback function menu is called whenever the user makes a menu selection. The argument passed to menu depends on the selection: if the user selects Orange, the value passed is 5, and so on.

It is also easy to create sub-menus and to attach them to the main menu. In Figure 32, the sub-menu sub displays the four entries with codes 5, 6, 7, and 8, and it appears as the fifth option on the main menu, which handles the codes 1, 2, 3, and 4.
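The one step Figure 32 does not show is registration: initMenus must be called after a window has been created, because GLUT attaches menus to the current window. A minimal sketch (the window title and display function are placeholders):

    int main (int argc, char *argv[])
    {
       glutInit(&argc, argv);
       glutCreateWindow("Menu demo");   // menus attach to the current window
       initMenus();                     // create menus, bind to the right button
       glutDisplayFunc(display);        // display() as defined elsewhere
       glutMainLoop();
    }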
5.7 Text

GLUT provides text in two forms:

• Bit-mapped characters are displayed in the plane of the screen in a fixed orientation; and

• stroked characters are 3D objects that can be drawn anywhere in the model and can be scaled and rotated.

The call

    glRasterPos(x, y);

sets the initial position for a bit-mapped string. The coordinates (x, y) are model coordinates and transformations (e.g., glTranslatef) are applied to them. To write one character, c, and set the raster position for the next character, call

    glutBitmapCharacter(GLUT_BITMAP_TIMES_ROMAN_24, c);

The call

    glutStrokeCharacter(font, c);

draws a character, c, in the given font, which should be either GLUT_STROKE_MONO_ROMAN (fixed width) or GLUT_STROKE_ROMAN (proportional width). Since the height of a character is about 120 units, you may want to scale them down to suit your model. glutStrokeCharacter applies a translation to the right, so that a sequence of calls displays a string with correct spacing between letters.

5.8 Other Features of OpenGL

There are a number of other features of OpenGL that will not be discussed in detail here. Instead, we describe the idea and omit details of the implementation.

5.8.1 Textures

A texture is a 1D or, more usually, a 2D pattern that is "wrapped" onto an object. The object may be a plane, such as the side of a cube, or a curved surface such as a cylinder or sphere. OpenGL needs two pieces of information in order to apply a texture: the texture image, and the mapping of texture coordinates to object coordinates. The coordinate mapping is simple in the case of a plane or cylinder, but more complicated in the case of a sphere or other object.

We have seen that when an object is drawn by the low-level functions, the user must provide the coordinates and normal direction of each vertex. When textures are used, texture coordinates must be provided as well. For the "standard objects", texture coordinates are provided by the corresponding functions. Consequently, if you want to texture a sphere or a teapot, all you have to do is to tell OpenGL where to find the texture data. This data must be stored as an RGB array: OpenGL does not handle BMP or JPG files directly.

OpenGL provides a lot of support for textures: the chapter on textures in the "Red Book" (Woo, Nedier, Davis, and Shreiner 2000) occupies 77 pages! In this section, we provide just a brief overview of simple texturing techniques. Here are the key steps.

1. Create a texture object. This is simply an array of data corresponding to a 1D, 2D, or 3D image. Each pixel is represented by anything from one to four values. The most common case is a 2D texture with RGBA values (1 byte each) at each pixel. The
dimensions of the array must be powers of 2. Consequently, the array will occupy 4 × 2^m × 2^n bytes of memory. You can generate the texture yourself, by calculation, or you can read data from a file. Pictures usually come in an encoded format, such as .jpg or .bmp, and must be converted to raw binary form before OpenGL can use them. Utility programs for performing the conversions can be downloaded from the internet.

2. You have to tell OpenGL how the texture is to be applied. The most common cases are:

   Replace: the final colour in the scene is the texture colour.

   Modulate: the texture provides a value that changes the current colour. This mode is suitable if the texture is being used as a shadow, for example.

   Blend: the original colour and the texture colour are combined according to a blending function.

3. Enable texture mapping by executing

       glEnable(GL_TEXTURE_2D);

   The available constants are GL_TEXTURE_1D, GL_TEXTURE_2D, and GL_TEXTURE_3D, depending on the dimensionality of the texture. If more than one dimension is enabled, the higher one is used.

4. Draw the scene, providing texture coordinates for each vertex:

       glTexCoord3f(....);
       glVertex3f(....);

OpenGL uses 0 and 1 as the limits of the texture in each direction. This means that, if the texture coordinates are between 0 and 1, OpenGL will use a value within the texture. If the texture coordinates are outside this range, then what happens depends on the texturing mode. If you use GL_REPEAT, then the texture will be repeated as often as necessary. For example, if you texture a square using coordinates that run from 0 to 5, you will obtain 5 × 5 = 25 copies of the texture. This technique is useful for rendering tiles on a floor, for example. If you use GL_CLAMP, the borders of the texture will be extended as far as necessary.

The following example shows the steps required to apply a simple texture. It is taken from the "red book" (Woo, Nedier, Davis, and Shreiner 2000). Like all programs in the "red book", this program is written in C rather than C++.

The first step is to create a texture. This texture is computed; the more common case is for the texture to be read from a file and perhaps converted to the appropriate format for OpenGL. Note that the size of the image is 2^6 × 2^6 = 64 × 64.

    #define checkImageWidth 64
    #define checkImageHeight 64
    static GLubyte checkImage[checkImageHeight][checkImageWidth][4];

    void makeCheckImage(void)
    {
       int i, j, c;
       for (i = 0; i < checkImageHeight; i++) {
          for (j = 0; j < checkImageWidth; j++) {
             c = ((((i & 0x8) == 0) ^ ((j & 0x8) == 0)) * 255);
             checkImage[i][j][0] = (GLubyte) c;
             checkImage[i][j][1] = (GLubyte) c;
             checkImage[i][j][2] = (GLubyte) c;
             checkImage[i][j][3] = (GLubyte) 255;
          }
       }
    }

The texture object must have a "name", which is actually a small integer. The initialization function constructs the texture image and then:

• glPixelStorei tells OpenGL that the texture data is aligned on a one-byte boundary.

• glGenTextures obtains one name for the texture and stores it in texName.

• glBindTexture tells OpenGL that texName will be a 2D texture. The position on the texture is defined by coordinates (s, t).

• The various calls to glTexParameteri instruct OpenGL to repeat the texture in both dimensions and to use GL_NEAREST for both magnification and minification filters; these calls set the wrapping and filtering behaviour explicitly rather than relying on the defaults.

• glTexImage2D passes information about the texture to OpenGL, including the address of the texture itself.

    static GLuint texName;

    void init(void)
    {
       glClearColor(0.0, 0.0, 0.0, 0.0);
       glShadeModel(GL_FLAT);
       glEnable(GL_DEPTH_TEST);
       makeCheckImage();
       glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
       glGenTextures(1, &texName);
       glBindTexture(GL_TEXTURE_2D, texName);
       glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
       glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
       glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
       glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
       glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, checkImageWidth,
                    checkImageHeight, 0, GL_RGBA, GL_UNSIGNED_BYTE, checkImage);
    }
The display function enables 2D texturing and specifies GL_DECAL mode for the texture. It then displays two squares, one in the XY plane, and the other at an angle.

    void display(void)
    {
       glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
       glEnable(GL_TEXTURE_2D);
       glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_DECAL);
       glBegin(GL_QUADS);
       glTexCoord2f(0.0, 0.0); glVertex3f(-2.0, -1.0, 0.0);
       glTexCoord2f(0.0, 1.0); glVertex3f(-2.0, 1.0, 0.0);
       glTexCoord2f(1.0, 1.0); glVertex3f(0.0, 1.0, 0.0);
       glTexCoord2f(1.0, 0.0); glVertex3f(0.0, -1.0, 0.0);

       glTexCoord2f(0.0, 0.0); glVertex3f(1.0, -1.0, 0.0);
       glTexCoord2f(0.0, 1.0); glVertex3f(1.0, 1.0, 0.0);
       glTexCoord2f(1.0, 1.0); glVertex3f(2.41421, 1.0, -1.41421);
       glTexCoord2f(1.0, 0.0); glVertex3f(2.41421, -1.0, -1.41421);
       glEnd();
       glDisable(GL_TEXTURE_2D);
    }

The remainder of the program is conventional OpenGL. This basic pattern can be varied in a number of ways.

• The function glTexImage2D should be replaced by glTexImage1D for 1D textures or by glTexImage3D for 3D textures.

• The internal format is GL_RGBA in the call to glTexImage2D, indicating that each texel consists of four bytes containing red, green, blue, and alpha values. There are 37 other constants, each specifying different storage conventions.

• In the call to glTexEnvf, the final argument determines how the texture is applied. It can be GL_DECAL, GL_REPLACE, GL_MODULATE, or GL_BLEND. When GL_BLEND is used, the blending function must also be set to an appropriate value.

It is inefficient to use the full detail of a texture if the image of the texture on the screen is very small. To avoid this inefficiency, a number of texture images are stored at different levels of detail, and OpenGL selects the appropriate image for the application (this is called mipmapping). You can ask OpenGL to compute mipmaps or you can provide your own.

5.8.2 NURBS

NURBS are Non-Uniform Rational B-SplineS, or curves generated from sets of points. "Non-uniform" means that the points do not have to be evenly spaced; "rational" means that the equations have the form P(x, y)/Q(x, y), where P and Q are polynomials; a "spline" is a
continuous curve formed from a set of curves; and "B" stands for "basis", where the "basis" is a set of functions that is suitable for building spline curves.

In OpenGL, NURBS are built from Bézier curves; in fact, the NURBS functions form a high-level interface to the functions that we have already seen in Section 5.5.

5.8.3 Antialiasing

You have probably noticed at one time or another that lines on a display do not have smooth edges. The effect is particularly pronounced for lines that are almost parallel to one of the axes: such lines look like staircases instead of straight lines. This phenomenon is called aliasing and it is due to the finite size of the pixels on the screen. The same phenomenon makes the edges of squares and rectangles look jagged when the edges are not parallel to the axes. The avoidance of aliasing effects is called antialiasing. There are various ways of implementing antialiasing, and OpenGL provides a few of them.

Antialiasing Lines and Points  A simple method of reducing aliasing effects is to adjust the colour of pixels close to the line by blending. The default way of rendering a line is to divide pixels into two classes: pixels "on" the line and pixels "not on" the line. A better way is to decide, for each pixel, how much it contributes to the line, and to use this quantity for blending. To achieve antialiasing in this way, your program should execute the statements

    glEnable(GL_LINE_SMOOTH);
    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

before drawing lines or points.

Antialiasing Polygons  The above technique is not very helpful because we are usually more interested in drawing polygons than in drawing points and lines. We can use a similar technique, but it works best only if the polygons are drawn in order of their distance from the viewer, with close polygons first and distant polygons last. Sorting the polygons, unfortunately, is a non-trivial task. Sorting can be omitted, but the results are not so good. Execute the code

    glEnable(GL_POLYGON_SMOOTH);
    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA_SATURATE, GL_ONE);

before drawing polygons.

Jittering  Another way of reducing aliasing effects is to render the scene several times in slightly different places. The movements should be very small, usually less than one pixel, and the technique is called jittering.

As usual, OpenGL coding involves trade-offs. An antialiased scene will look better than a scene with aliasing effects, but it will take longer to render. If jittering is used, it may take much longer to render.
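Jittering is commonly implemented with the accumulation buffer: the scene is rendered several times with sub-pixel offsets and the passes are averaged. The following sketch is not from the notes; it assumes that an accumulation buffer was requested (GLUT_ACCUM in glutInitDisplayMode), that drawScene() draws the model, and that pixelWidth and pixelHeight hold the world-space size of one pixel:

    const int PASSES = 4;
    const GLfloat jitter[PASSES][2] = {    // sub-pixel offsets, in pixel units
       { 0.25f, 0.25f }, { 0.75f, 0.25f },
       { 0.25f, 0.75f }, { 0.75f, 0.75f }
    };

    glClear(GL_ACCUM_BUFFER_BIT);
    for (int i = 0; i < PASSES; i++) {
       glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
       glPushMatrix();
       // Shift the whole scene by a fraction of a pixel.
       glTranslatef(jitter[i][0] * pixelWidth, jitter[i][1] * pixelHeight, 0.0f);
       drawScene();                        // hypothetical scene-drawing routine
       glPopMatrix();
       glAccum(GL_ACCUM, 1.0f / PASSES);   // add this pass, weighted
    }
    glAccum(GL_RETURN, 1.0f);              // copy the average to the colour buffer
    glutSwapBuffers();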
5.8.4 Picking

A question that is often asked is "how do I select an object with the mouse?" In general, this is hard to do, because the coordinates we use to draw the model have a complicated relationship with the coordinates of the mouse relative to the window. Furthermore, in a 3D scene, there may be several objects that correspond to a single mouse position, one being behind another. The solution that OpenGL provides is a special rendering mode. Here is what happens:

• The user provides each object of interest with a "name" (in fact, the name is an integer).

• The user defines a small region of the screen. Typically, this is a rectangle that includes the mouse position and a few pixels either side.

• OpenGL then renders the scene in selection mode. Whenever a named object is displayed inside the selected region, a "hit record", containing its name and some other information, is added to the "selection list".

• The user examines the hit records to determine the object — or objects — selected.

5.8.5 Error Handling

OpenGL never prints a message, raises an exception, or causes your program to crash. However, that doesn't mean that nothing can go wrong: many operations cause errors and it is your responsibility to discover that they have occurred. If you are using an OpenGL feature and it doesn't seem to be working properly, or is not working at all, it is possible that you have performed an invalid operation. To find out, call glGetError with no arguments: the value returned is the current error code. You can pass the value to gluErrorString to obtain an intelligible message:

    GLenum error = glGetError();
    if (error != GL_NO_ERROR)
       cout << "GL error: " << gluErrorString(error) << endl;

5.9 Program Development

Most of the techniques for program development work for graphics programs but there are a few additional points to note.

• Incremental development works best. Start by getting something simple in the graphics window and then work on refining it.

• The "blank window" problem occurs often. To avoid it, start with a simple model at the origin and straightforward projections. When you can see something, elaborate the program in small steps.

• Don't do everything at once. Get the shapes right before working on colour and lighting.

• Use "graphical debugging". For example, a function that draws a set of axes (e.g., as three coloured lines) can be very useful for finding out where you are in the model; a sketch of such a function follows this list.

• During the early stages, having a mouse function that applies simple movements — for example, rotations about two axes — can be very helpful.
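A minimal sketch of such an axis-drawing helper (the axis length of 10 units is an arbitrary choice; lighting is disabled temporarily so that glColor3f takes effect):

    void drawAxes (void)
    {
       glDisable(GL_LIGHTING);            // use plain colours for the axes
       glBegin(GL_LINES);
       glColor3f(1, 0, 0);                // X axis in red
       glVertex3f(0, 0, 0);  glVertex3f(10, 0, 0);
       glColor3f(0, 1, 0);                // Y axis in green
       glVertex3f(0, 0, 0);  glVertex3f(0, 10, 0);
       glColor3f(0, 0, 1);                // Z axis in blue
       glVertex3f(0, 0, 0);  glVertex3f(0, 0, 10);
       glEnd();
       glEnable(GL_LIGHTING);             // restore lighting for the model
    }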
6 Organization of a Graphics System

6.1 The Graphics Pipeline

The path from vertex definition to visible pixel is long and complicated. When we are using a high-level API such as OpenGL, it is not necessary to understand all of the details of this path. Nevertheless, a rough idea of the process is helpful because it enables us to avoid obvious mistakes in graphics programming.

There is a diagram called The OpenGL Machine that describes the way in which OpenGL processes data. A particular version of OpenGL does not have to implement the machine precisely, but it must provide the same effect. You can obtain a diagram of the OpenGL Machine either directly as www.3dlabs.com/support/developer/state.pdf or from the course web page (in either case you will have to magnify it to make it readable).

The pipeline has two inputs: vertexes and pixels. Most of the information is typically in the form of vertexes; pixels are used for special-purpose applications such as displaying text in particular positions in the viewing window. A unit of geometric data is called a primitive and consists of an object type and a list of vertexes. For example, a triangle object has three vertexes and a quad object has four vertexes. Vertex data includes:

• 4D coordinates, (x, y, z, w)
• normal vector
• texture coordinates
• colour values, (r, g, b, a) (or a colour index)
• material properties
• edge-flag data

The data associated with a vertex is assembled by a call to glVertex. This implies that all of the information that OpenGL needs for a vertex must be established before the call to glVertex; an example follows below. All of the data except the (x, y, z, w) coordinates have default values, and OpenGL will use these if you have not provided any information. The default values are usually fairly sensible, and this enables you to get a rough picture with a minimum of work and refine it afterwards.
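To illustrate: per-vertex attributes are current state, and each call to glVertex captures whatever normal, colour, and texture coordinate were most recently set. A small fragment (the values are arbitrary):

    glBegin(GL_TRIANGLES);
    glNormal3f(0, 0, 1);                         // normal used by all three vertexes
    glColor3f(1, 0, 0);
    glTexCoord2f(0, 0); glVertex3f(0, 0, 0);     // captures red and (0, 0)
    glColor3f(0, 1, 0);
    glTexCoord2f(1, 0); glVertex3f(1, 0, 0);     // captures green and (1, 0)
    glColor3f(0, 0, 1);
    glTexCoord2f(0, 1); glVertex3f(0, 1, 0);     // captures blue and (0, 1)
    glEnd();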
6.1.2 Primitive Assembly

When all of the vertexes have been processed, the primitives are assembled. At this stage, there is an object (triangle, quad, or general polygon) and information about each of its vertexes. Each object of this kind is called a primitive.

The primitive is clipped by any clipping planes defined by the user. The clipping planes are transformed by the model view matrix and determine whether a primitive is removed, wholly or partially, by the clipping plane. A primitive that is partially clipped may acquire new vertexes: for example, a clipped triangle becomes a quadrilateral.

The spatial coordinates of each vertex are then transformed by the projection matrix. A second round of clipping occurs: this time, anything outside the viewing volume is clipped. At this stage, the viewing volume is usually bounded by x = ±1, y = ±1, z = ±1, regardless of the kind of projection (orthogonal or perspective). Again, primitives at the edges of the viewing volume may gain new vertexes.

Each primitive has a front face and a back face. Culling is applied, eliminating back face (or front face) data if it is not required.

6.1.3 Rasterization

The primitives are then converted to fragments in a process called rasterization. The graphics window is considered to be a collection of small squares called pixels. For example, a typical window on a high resolution screen might have 800 × 600 = 480,000 pixels. A primitive usually occupies several pixels, but some primitives (for example, small triangles in a part of the model that is distant from the viewer) might occupy only part of a pixel.

If the primitive occupies one pixel or more, the shading model determines how the pixels are coloured. In flat shading, the colour at one vertex determines the colour of all pixels; in smooth shading, the pixel colours are obtained by interpolating between the vertex colours. In all cases, the boundaries between primitives must be considered: a pixel may be part of more than one primitive, in which case its colour must be averaged.

6.1.4 Pixel Operations

Meanwhile, some data has been specified directly in terms of pixels. Such data is usually in the wrong format for OpenGL and must be packed, unpacked, realigned, or otherwise converted. Pixel data is then rasterized, converted into fragments, and combined with vertex data.

6.1.5 Fragment Operations

Several further operations are performed on the fragments:

• Texture data is mapped from the texture image source to the fragment.
• If fog is enabled, it is applied to the fragment.
• Antialiasing may be applied to reduce the jaggedness of lines and boundaries.
• Scissor tests that may exclude part of the viewing window are applied.
• Alpha computations are performed for overlapping fragments.
• Stencils are applied to eliminate some parts of the view.
• The depth-buffer test is applied to choose the closest fragment at each pixel.
• Blending, dithering, and logical operations are performed.
• Colour masking is applied, if necessary, to reduce the amount of colour information to the number of bits provided by the frame buffer.
• The pixels for the fragment are written to the frame buffer.

6.2 Rasterization

Rasterization, which is the process of turning the scene into pixels, is not a particularly glamorous part of graphics programming, but it is nonetheless one of the most important. However many strange and wonderful effects your graphics engine can create, they will all be spoiled by poor rasterization.

It is worth noting that, at this level, operations may be performed by either software or hardware. One of the factors which distinguishes a high-performance graphics workstation from a simple PC is that the workstation has more sophisticated hardware. For example, the following quotation is taken from a description of the new Silicon Graphics Onyx 3000 workstation:

    The new graphics system is built on the shared memory SGI NUMAflex architecture of the SGI Onyx 3000 series systems, which allows it to deliver industry-leading interactive graphics performance of up to 283 million triangles per second of sustained performance and 7.7 billion pixels per second.

A "triangle" in this context means that the system can take three vertexes and produce a smooth-shaded triangle in 3.5 nanoseconds.

Consider the edge between two fragments. Mathematically, it is a perfectly straight line with no width. In practice, it is formed of pixels. A pixel either belongs to one fragment or the other, or is shared by both fragments. The pixel must be coloured accordingly; if the colour is wrong, our extremely sensitive eyes will detect imperfections in the image, even if we cannot see precisely what causes them. It follows that the low-level primitives of the graphics system must be extremely robust and reliable. For example, the pixels that form a line must not depend on the direction in which we draw the line. Furthermore, operations at this level must be extremely efficient, because they are used very frequently. Finally, at this stage, we are working with discrete entities (pixels), not smoothly changing quantities. Not only are the pixels themselves discrete, but colour is quantized into a fixed and limited number of bits. Consequently, the best algorithms will avoid floating-point calculations as much as possible and work with integers only.

We will consider just two of the problems that arise during rasterization: drawing a straight line (which is fundamental) and drawing a circle (which is useful but not quite so fundamental).

6.2.1 Drawing a Straight Line

The midpoint algorithm scan converts a straight line using only integer addition. Bresenham (1965) had the original idea, Pitteway (1967) gave the midpoint formulation, and the version given here is due to Van Aken (1984).
The problem is to scan convert a line with end points (x0, y0) and (x1, y1). We assume that the end points have integer coordinates (that is, they lie on the pixel grid). Let

    dx = x1 − x0
    dy = y1 − y0

We assume that dx > 0, dy > 0, and dy/dx ≤ 1. (Note that dx and dy are integers, not differentials.) Since the slope is at most one, we will need a pixel at every X ordinate. The problem is to choose the Y coordinates of the pixels.

Assume that we have plotted a pixel at (xp, yp). We have two choices for the pixel at xp + 1: it should be at either H = (xp + 1, yp + 1) or L = (xp + 1, yp). Suppose that M is midway between L and H, at (xp + 1, yp + 1/2). If the line passes below M, we plot pixel L; if the line passes above M, we plot pixel H.

The equation of the line is

    y = x (dy/dx) + B    (1)

where B = y0 − x0 (dy/dx) is the intercept with the axis x = 0 (we do not actually need B in the subsequent calculations). We can rewrite (1) as x dy − y dx + B dx = 0, and we define

    F(x, y) = x dy − y dx + B dx.    (2)

If P is a point on the line, clearly F(P) = 0. We can also show that

    F(P) < 0, if P is above the line;
    F(P) > 0, if P is below the line.

Consequently, we can use F to decide which pixel to plot. If F(M) < 0, then M is above the line and we plot L; if F(M) > 0, then M is below the line and we plot H. We can easily compute F(M), using definition (2), as

    F(M) = (xp + 1) dy − (yp + 1/2) dx + B dx.

What happens next? Suppose we plot pixel L. Then the next midpoint, M′, is one step "east" of M at (xp + 2, yp + 1/2). We have

    F(M′) = F(xp + 2, yp + 1/2)
          = (xp + 2) dy − (yp + 1/2) dx + B dx
          = F(M) + dy.
If, instead, we plot pixel H, the next midpoint is one step "northeast" of M at (xp + 2, yp + 3/2). In this case,

    F(M′) = F(xp + 2, yp + 3/2)
          = (xp + 2) dy − (yp + 3/2) dx + B dx
          = F(M) + dy − dx.

Using these results, we need to compute d = F(M) only once, during initialization. For the first point on the line, xp = x0 and yp = y0. Consequently:

    F(M) = (xp + 1) dy − (yp + 1/2) dx + B dx
         = (x0 + 1) dy − (y0 + 1/2) dx + y0 dx − x0 dy
         = dy − dx/2

In subsequent iterations, we: increment x; if d < 0, add dy to d; otherwise, increment y and add dy − dx to d. There are three points to note:

• In the last step, dy < dx, and so dy − dx < 0 and d gets smaller.
• We have implicitly dealt with the case F(M) = 0 in the same way as F(M) > 0. In some situations, we might need a more careful choice.
• The algorithm still has fractions (with denominator 2). Since we need only the sign of F, not its value, we can use 2F(M) instead of F(M).

Figure 33 shows a simple C version of the algorithm. Remember that this handles only the case 0 ≤ dy ≤ dx: a complete function would have code for all cases.

6.2.2 Drawing a Circle

We can use the ideas that we used to draw a straight line to draw a circle. First, we use symmetry to reduce the amount of work eightfold. Assume that the centre of the circle is at the origin (0, 0). Then, if (x, y) is a point on the circle, the following seven points are also on the circle: (−x, y), (x, −y), (−x, −y), (y, x), (−y, x), (y, −x), and (−y, −x).

The equation of a circle with radius R and centre at the origin is x² + y² = R². Let

    F(x, y) = x² + y² − R².

Then, for any point P:

    F(P) < 0, if P is inside the circle;
    F(P) = 0, if P is on the circle; and
    F(P) > 0, if P is outside the circle.
    void line (int x0, int y0, int x1, int y1)
    {
       int dx = x1 - x0;
       int dy = y1 - y0;
       int d = 2 * dy - dx;
       int L = 2 * dy;
       int H = 2 * (dy - dx);
       int x = x0;
       int y = y0;
       for (; x < x1; x++)
       {
          pixel(x, y);
          if (d < 0)
             d += L;
          else
          {
             d += H;
             y++;
          }
       }
       pixel(x1, y1);
    }

Figure 33: A C function for lines with slope less than 1

(Figure 34: Drawing a circle. The figure shows a circular arc passing between the candidate pixels H = (xp + 1, yp) and L = (xp + 1, yp − 1), with the decision point M midway between them and the subsequent decision points M1 and M2.)
Assume we have plotted a pixel at (xp, yp) (see Figure 34). The decision variable d is given by

    d = F(xp + 1, yp − 1/2) = (xp + 1)² + (yp − 1/2)² − R²

If d ≥ 0, as in Figure 34, we plot L, the next decision point is M1, and

    d′ = F(xp + 2, yp − 3/2)
       = (xp + 2)² + (yp − 3/2)² − R²
       = d + 2xp − 2yp + 5.

If d < 0, we plot H, the next decision point is M2, and

    d′ = F(xp + 2, yp − 1/2)
       = (xp + 2)² + (yp − 1/2)² − R²
       = d + 2xp + 3.

For the first pixel, x0 = 0, y0 = R, and M = (x0 + 1, R − 1/2) = (1, R − 1/2), so that

    F(M) = F(1, R − 1/2) = 1² + (R − 1/2)² − R² = 5/4 − R.

From this algorithm, it is straightforward to derive the code shown in Figure 35. For each coordinate computed by the algorithm, the function circlepoints in Figure 36 plots eight pixels at the points of symmetry.
    void circle (int radius)
    {
       int x = 0;
       int y = radius;
       double d = 1.25 - radius;
       circlepoints(x, y);
       while (y > x)
       {
          if (d < 0)
             d += 2.0 * x + 3.0;
          else
          {
             d += 2.0 * (x - y) + 5.0;
             y--;
          }
          x++;
          circlepoints(x, y);
       }
    }

Figure 35: Computing points in the first octant

    void circlepoints (int x, int y)
    {
       pixel(x, y);
       pixel(-x, y);
       pixel(x, -y);
       pixel(-x, -y);
       pixel(y, x);
       pixel(-y, x);
       pixel(y, -x);
       pixel(-y, -x);
    }

Figure 36: Plotting eight symmetrical points

6.2.3 Clipping

Clipping means removing part or all of an object because it is invisible. The simplest example is a straight line joining two points A and B. The easy cases are when A and B are both inside or both outside the window; the harder case is when one point is inside and the other is outside, because we have to find out where the line meets the edge of the window and "clip" it at that point.

We consider just one clipping technique: the Sutherland-Hodgman polygon-clipping algorithm. The general algorithm can clip a polygon in 3D against a polyhedral volume defined by planes. We will consider the simple case of clipping against planes parallel to the principal axes, such as the boundaries of the viewing volume (VV).

The algorithm moves around the polygon, considering the vertexes one at a time, and adding vertexes to its output. We assume that it has processed vertex u and is moving to vertex v. The following cases must be considered:

1. u and v are both inside the VV: output the vertex v.
2. u is inside the VV and v is outside the VV: find the intersection w of the line uv with the VV and output w.
3. u and v are both outside the VV: no output.
4. u is outside the VV and v is inside: find the intersection w of the line uv with the VV; output w and then v.

Sutherland and Hodgman showed how to implement this algorithm recursively so that it can be performed in hardware without intermediate storage. Figure 37 gives C-like pseudocode for the algorithm.
    void PolyClip (
       Vertex inVertexArray[],
       Vertex outVertexArray[],
       int inLength,
       int outLength,
       Edge clipBoundary )
    {
       Vertex s, p, i;
       int j;
       outLength = 0;
       s = inVertexArray[inLength - 1];
       for (j = 0; j < inLength; j++)
       {
          p = inVertexArray[j];
          if (Inside(p, clipBoundary))
          {
             if (Inside(s, clipBoundary))
                Output(p);
             else
             {
                i = Intersect(s, p, clipBoundary);
                Output(i);
                Output(p);
             }
          }
          else if (Inside(s, clipBoundary))
          {
             i = Intersect(s, p, clipBoundary);
             Output(i);
          }
          s = p;
       }
    }

Figure 37: Sutherland-Hodgman Polygon Clipping

Note that PolyClip is called four times: once for each side of the enclosing rectangle. (In real C, outLength would have to be passed by reference so that the caller receives the count.) In Figure 37, Output(p) is a kind of macro with the effect:

    outVertexArray[outLength++] = p;

The function Inside checks whether the given point is inside the clipping boundary and returns true if it is. The function Intersect returns the vertex where the line joining the two given vertexes crosses the clipping boundary.
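The notes do not give bodies for Inside and Intersect. Here is a hedged sketch of what they might look like when clipping against an axis-aligned boundary; the Vertex and Edge types below are my assumptions, not the ones used in Figure 37.

    typedef struct { double x, y; } Vertex;
    typedef enum { LEFT, RIGHT, BOTTOM, TOP } Side;
    typedef struct { Side side; double value; } Edge;  /* e.g. the line x = value */

    int Inside (Vertex p, Edge b)
    {
       switch (b.side)
       {
          case LEFT:   return p.x >= b.value;
          case RIGHT:  return p.x <= b.value;
          case BOTTOM: return p.y >= b.value;
          case TOP:    return p.y <= b.value;
       }
       return 0;
    }

    /* PolyClip only calls Intersect when s and p are on opposite sides
       of the boundary, so the divisions below are safe.  Because the
       boundary is parallel to an axis, one coordinate of the result is
       known and the other is found by linear interpolation. */
    Vertex Intersect (Vertex s, Vertex p, Edge b)
    {
       Vertex i;
       double t;
       if (b.side == LEFT || b.side == RIGHT)
       {
          t = (b.value - s.x) / (p.x - s.x);
          i.x = b.value;
          i.y = s.y + t * (p.y - s.y);
       }
       else
       {
          t = (b.value - s.y) / (p.y - s.y);
          i.y = b.value;
          i.x = s.x + t * (p.x - s.x);
       }
       return i;
    }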
(Figure 38: Labelling the regions. The rectangle ABCD sits in the central region of a 3 × 3 grid of regions whose four-bit codes are:

    1001  1000  1010
    0001  0000  0010
    0101  0100  0110

)

To make algorithms like this one efficient, it is important that low-level operations such as Inside and Intersect are performed quickly. For example, we need a fast way of deciding whether a line is partly or fully outside the VV. Here is one way of doing this for rectangles. We extend the rectangle ABCD to define nine regions, as shown in Figure 38. Suppose that the bottom left corner of the rectangle ABCD is at (Xmin, Ymin) and the top right hand corner is at (Xmax, Ymax). Then the four coding bits have the following values:

• Bit 1: y > Ymax
• Bit 2: y < Ymin
• Bit 3: x > Xmax
• Bit 4: x < Xmin

We can assign these bits quickly. Now assume that we have assigned bits to the end points of a line, giving values A and B.

• If A = B = 0, the line is within the rectangle.
• If A ∧ B ≠ 0 (bitwise AND), the line is entirely outside the rectangle.

In the other cases, the line is partly inside and partly outside the rectangle. Note, however, that the bit values tell us which edge the line crosses; this information speeds up the calculation of the point of intersection.
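A small sketch of these region codes in C may make the tests concrete. The function and constant names are mine; Xmin, Ymin, Xmax, and Ymax are assumed to be the window bounds.

    #define CODE_TOP    8   /* bit 1: y > Ymax */
    #define CODE_BOTTOM 4   /* bit 2: y < Ymin */
    #define CODE_RIGHT  2   /* bit 3: x > Xmax */
    #define CODE_LEFT   1   /* bit 4: x < Xmin */

    int outcode (double x, double y,
                 double Xmin, double Ymin, double Xmax, double Ymax)
    {
       int code = 0;
       if (y > Ymax) code |= CODE_TOP;
       if (y < Ymin) code |= CODE_BOTTOM;
       if (x > Xmax) code |= CODE_RIGHT;
       if (x < Xmin) code |= CODE_LEFT;
       return code;
    }

    /* Quick tests for a segment whose end points have codes a and b. */
    int triviallyInside (int a, int b)  { return (a | b) == 0; }
    int triviallyOutside (int a, int b) { return (a & b) != 0; }

If both quick tests fail, the set bits of a and b identify exactly which window edges the segment may cross, which is where the speed-up for Intersect comes from.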
7 Transformations — Again

This section provides the mathematical background for 3D graphics systems. We begin by motivating the discussion: what is wrong with "ordinary" 3D Euclidean space? It turns out that there are several problems:

• In physics and mechanics, we use a coordinate system with a fixed origin. This is convenient for simple problems but, in graphics, the choice of origin is often not obvious. We need a more flexible kind of geometry.
• Graphics requires several kinds of transformations (translation, scaling, rotation, etc.) and projections. Although these can all be performed in 3D Euclidean space, there is no simple and uniform technique that works for all of the transformations that we need.

The appropriate mathematical tools have existed for a long time: they are scalar, vector, and affine spaces.

7.1 Scalar Spaces

A scalar space, also known as a field, is a set of values and operations on those values. There are two special values, the zero and the unit, and the operations are addition, subtraction, multiplication, and division. Although there are many scalar spaces, we need only one: the real numbers. In this section, we will use lower case Greek letters (α, β, γ, ...) to denote real numbers and R to denote the set of all real numbers. Naturally, the zero of this system is 0 and the unit is 1.

7.2 Vector Spaces

A vector space is a collection of vectors that satisfies a set of axioms. We write vectors as bold face Roman letters such as u, v, w, etc. The axioms of a vector space are as follows; we denote the vector space itself by V.

• There is a zero vector that we will write as 0.
• Vectors can be multiplied by scalars. If α ∈ R and v ∈ V, then α v ∈ V.
• Vectors can be added and subtracted. If u ∈ V and v ∈ V, then u + v ∈ V and u − v ∈ V.
• The following identities hold for all α, β ∈ R and u, v ∈ V:

    0 v = 0    (3)
    1 v = v    (4)
    (α β) v = α (β v)    (5)
    (α + β) v = α v + β v    (6)
    α (u + v) = α u + α v    (7)

There are a number of important properties of vector spaces that we will not explore here because they can be found in any book on linear algebra. In particular, a basis for a vector space V is a set of vectors u1, u2, ..., un such that every vector v ∈ V can be put in the form α1 u1 + α2 u2 + ··· + αn un. The vector space has dimension d if it has a basis of d vectors but no basis consisting of fewer than d vectors.
Vector spaces have a particular vector, 0, that plays a special role. Affine spaces, which we discuss next, are a way of avoiding the existence of an element with special properties.

Example. The standard model for a vector space is the set of n-tuples of real numbers. A vector (v1, v2, ..., vn) is a tuple with n real components. This vector space is usually denoted by Rⁿ. For example, R² consists of pairs like (x, y). The zero vector 0 is represented by (0, 0). Multiplication by a scalar, and addition and subtraction of vectors, are defined by

    α (x, y) = (α x, α y)
    (x, y) + (x′, y′) = (x + x′, y + y′)
    (x, y) − (x′, y′) = (x − x′, y − y′)

7.3 Affine Spaces

An affine space consists of: a set of points; an associated vector space; and two operations (in addition to the operations of the vector space). The two operations are difference of points (giving a vector) and addition of a vector and a point (giving a point). In the following formal definitions, we write P for the set of points and V for the associated vector space.

• If P ∈ P and Q ∈ P, then P − Q ∈ V.
• If P ∈ P and v ∈ V, then P + v ∈ P.

The affine operations must satisfy certain properties:

    P − P = 0    (8)
    (P + u) + v = P + (u + v)    (9)
    (P − Q) + v = (P + v) − Q    (10)
    P + v = P if and only if v = 0    (11)

Consider the expression L ≡ P + α (Q − P). We note first of all that this expression is well-formed: since P and Q are points, Q − P is a vector. We can multiply the vector Q − P by a scalar α, obtaining the vector α (Q − P), which we can add to the point P. If we think of P and Q as fixed points and α as a variable scalar, then L corresponds to a set of points. This set includes the points P and Q: clearly, when α = 0, we have L = P. Less obviously, when α = 1, we have L = P + (Q − P) = Q. We define

    { P + α (Q − P) | α ∈ R }

to be the line joining the points P and Q. Similarly, we define

    { R + β ((P + α (Q − P)) − R) | α ∈ R, β ∈ R }

to be the plane through the points P, Q, and R. Two lines, L ≡ P + α (Q − P) and L′ ≡ P′ + α (Q′ − P′), are parallel if Q − P = Q′ − P′ (vector equality).

The expression α P + β Q is currently undefined, because we cannot multiply a point by a scalar. If α + β = 1, however, we define this expression to mean P + β (Q − P), and we refer to α P + β Q as an affine combination.
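As a small worked example (mine, not from the notes): taking α = β = 1/2 in this definition gives the midpoint of P and Q, and the result does not depend on which point we call P:

    \tfrac{1}{2}P + \tfrac{1}{2}Q
      \;=\; P + \tfrac{1}{2}(Q - P)
      \;=\; Q + \tfrac{1}{2}(P - Q)

Only point-vector operations appear on the right-hand sides, so the affine combination is meaningful even though points themselves cannot be scaled.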
    Class        Transformations                Invariants
    Euclidean    rotation, translation          length, angle
    similarity   + uniform scaling              ratio of lengths
    affine       + nonuniform scaling, shear    parallelism
    projective   + perspective projection       incidence, cross-ratio

Figure 39: Varieties of Transformation (adapted from Birchfield (1998)). Each row adds transformations to those of the row above.

Example. A typical member of R⁴ is the 4-tuple (x, y, z, w). Define a point to be a member of R⁴ with the form (x, y, z, 1). Then the difference of two points (computed by vector subtraction in R⁴) is

    (x, y, z, 1) − (x′, y′, z′, 1) = (x − x′, y − y′, z − z′, 0)

If we interpret (x − x′, y − y′, z − z′, 0) as a vector in R³, then the set of points is an affine space with R³ as its associated vector space. It is easy to see that the axioms (8)–(11) are satisfied. The space we have defined is called the standard affine 3-space. We will use this affine space for the rest of this section, referring to it as S.

The four coordinates used to describe a point in S are called homogeneous coordinates. We can take an arbitrary member of R⁴, such as (x, y, z, w), and transform it to the point (x/w, y/w, z/w, 1) in S; this transformation is called homogenization. We can perform calculations in R⁴ but, before we interpret the results, we must homogenize all of the points.

7.4 Transformations

There are many kinds of transformations. Transformations are classified according to the properties that they preserve. For example, a rotation is a Euclidean transformation because it does not distort objects in Euclidean space, but a projection is non-Euclidean because it loses information about one dimension and distorts objects in the other dimensions. An important feature of a transformation is the set of properties that it preserves: these are called the invariants of the transformation. Figure 39 shows various kinds of transformations and their properties.

An affine transformation is a function that maps every point of an affine space to another point in the same affine space. An affine transformation T must satisfy certain properties:
• Affine combinations are preserved: T(α P + β Q) = α T(P) + β T(Q).
• If L is a line, then T(L) is a line. (Note that the line L is a set of points; T(L) is an abbreviation for { T(P) | P ∈ L }.)
• If L ∥ L′ (L and L′ are parallel lines), then T(L) ∥ T(L′).
• If M is a plane, then T(M) is a plane.
• If M ∥ M′ (M and M′ are parallel planes), then T(M) ∥ T(M′).

There is a convenient representation of affine transformations on S: we can use 4 × 4 matrices of the form

    M = | ·  ·  ·  · |
        | ·  ·  ·  · |
        | ·  ·  ·  · |
        | 0  0  0  1 |

where the dots indicate any value. To transform a point, we treat the point as a 4 × 1 matrix (a column vector) and premultiply it by the transformation matrix.

7.4.1 Translation

A translation transforms the point (x, y, z, 1) to (x + a, y + b, z + c, 1). The matrix and its effect are described by the following equation:

    | 1 0 0 a | | x |   | x + a |
    | 0 1 0 b | | y | = | y + b |
    | 0 0 1 c | | z |   | z + c |
    | 0 0 0 1 | | 1 |   |   1   |

7.4.2 Scaling

A scaling transformation scales the coordinates of the point by given factors along each of the principal axes. The matrix and its effect are described by the following equation:

    | r 0 0 0 | | x |   | r x |
    | 0 s 0 0 | | y | = | s y |
    | 0 0 t 0 | | z |   | t z |
    | 0 0 0 1 | | 1 |   |  1  |
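The following minimal sketch (mine, not from the notes) shows the premultiplication convention in C, applying the translation matrix of Section 7.4.1 to a homogeneous point. Matrices are stored row-major, m[row][col].

    #include <stdio.h>

    typedef double Mat4[4][4];

    /* q = M p, treating p as a 4x1 column vector */
    void transform (const Mat4 m, const double p[4], double q[4])
    {
       for (int i = 0; i < 4; i++)
       {
          q[i] = 0.0;
          for (int j = 0; j < 4; j++)
             q[i] += m[i][j] * p[j];
       }
    }

    int main (void)
    {
       /* translation by (a, b, c) = (2, 3, 4) */
       Mat4 T = {{1, 0, 0, 2},
                 {0, 1, 0, 3},
                 {0, 0, 1, 4},
                 {0, 0, 0, 1}};
       double p[4] = {1, 1, 1, 1};   /* the point (1, 1, 1) */
       double q[4];
       transform(T, p, q);
       printf("(%g, %g, %g, %g)\n", q[0], q[1], q[2], q[3]);  /* (3, 4, 5, 1) */
       return 0;
    }

Since the bottom row is [0, 0, 0, 1], the w component stays 1 and no homogenization step is needed; that changes for the perspective matrices of Section 7.5.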
7.4.3 Rotation

Rotations about the principal axes are defined by the equations below. We assume that the rotations are counter-clockwise in a right-handed coordinate system. (To visualize a right-handed coordinate system, extend the thumb and first two fingers of your right hand so they are roughly at right-angles to one another. Then your thumb points along the X axis, your first finger points along the Y axis, and your second finger points along the Z axis.)

    Rx(θ) = | 1     0        0      0 |
            | 0   cos θ   −sin θ   0 |
            | 0   sin θ    cos θ   0 |
            | 0     0        0      1 |

    Ry(θ) = |  cos θ   0   sin θ   0 |
            |    0     1     0     0 |
            | −sin θ   0   cos θ   0 |
            |    0     0     0     1 |

    Rz(θ) = | cos θ   −sin θ   0   0 |
            | sin θ    cos θ   0   0 |
            |   0        0     1   0 |
            |   0        0     0   1 |

A general rotation through an angle θ about an axis in the direction of the unit vector u = (ux, uy, uz) is given by the matrix

    Ru(θ) = | c + (1−c)ux²        (1−c)uy ux − s uz   (1−c)uz ux + s uy   0 |
            | (1−c)ux uy + s uz   c + (1−c)uy²        (1−c)uz uy − s ux   0 |
            | (1−c)ux uz − s uy   (1−c)uy uz + s ux   c + (1−c)uz²        0 |
            |        0                   0                   0            1 |

where s = sin θ and c = cos θ.

7.5 Non-Affine Transformations

We can use matrices to define transformations that are not affine but are nevertheless useful in graphics. The bottom row of such a matrix is not [0, 0, 0, 1] and, in fact, we can use this fact to achieve interesting results.

7.5.1 Perspective Transformations

The first non-affine transformation that we will consider is the perspective transformation. During the Renaissance, the concept of perspective was developed by imagining a ray of light passing through a window; the ray originates at a point in the scene and arrives at the painter's eye. The point at which it passes through the window is the point where that part of the scene should be painted.

Figure 40 shows the origin at O, with the Z axis extending to the right and the Y axis extending upwards (the X axis, which comes out of the paper towards you, is not shown). The "window" on which the scene is to be projected is at z = n. The point in the scene is at P, and its projection P′ is obtained by drawing a line (corresponding to a light ray) from P to O. By similar triangles,

    y′ = y n / z
(Figure 40: Perspective. The projection plane z = n lies between the origin O and the scene point P; the ray from P to O crosses the plane at P′, with y′/n = y/z.)

and, in the XZ plane,

    x′ = x n / z

It might appear that we cannot achieve a transformation of this kind with a matrix, because matrix transformations are supposed to be linear. But consider the following equation:

    | n 0 0 0 | | x |   | n x |
    | 0 n 0 0 | | y | = | n y |
    | 0 0 0 0 | | z |   |  0  |
    | 0 0 1 0 | | 1 |   |  z  |

If we homogenize the transformed point by dividing each of its components by z, we obtain P′ = (x n / z, y n / z, 0, 1), with the same X and Y values as above.

A transformation, as explained above, is a mapping from a space into the same space. A projection is a mapping from a space to a space with fewer dimensions. If we discard the last two components of P′, we obtain the projection

    (x, y, z, 1) ↦ (x n / z, y n / z)

We have mapped the point P in S to a point in a two-dimensional plane using a perspective transformation followed by a projection.

In practical situations, it is useful to have a value for the Z coordinate: for example, we can use this value for depth buffer comparisons. To obtain a suitable value for Z, we apply the following transformation:

    | n 0 0 0 | | x |   |   n x   |
    | 0 n 0 0 | | y | = |   n y   |
    | 0 0 a b | | z |   | a z + b |
    | 0 0 1 0 | | 1 |   |    z    |

After homogenization, we obtain the 3D point

    ( x n / z, y n / z, (a z + b) / z ).

In this expression, the Z value is called the pseudodepth. It increases with distance, and can be used for depth buffer comparisons.
For clipping purposes, it is convenient to restrict the pseudodepth values to the range ±1, as in the other directions. In OpenGL, the camera looks along the negative Z axis, so we assume that the near and far planes are at z = −n and z = −f respectively and that the homogenizing component is w′ = −z. Requiring pseudodepth −1 at the near plane and +1 at the far plane gives

    (a (−n) + b) / n = −1
    (a (−f) + b) / f = 1

and therefore

    a = (n + f) / (n − f)
    b = 2 n f / (n − f)

and the perspective transformation that OpenGL uses is indeed

    | n   0        0               0        |
    | 0   n        0               0        |
    | 0   0   (n + f)/(n − f)   2nf/(n − f) |
    | 0   0       −1               0        |

The pseudodepth is

    z′ = (a z + b) / (−z) = −(z (n + f) + 2 n f) / (z (n − f)).

It is easy to check that z′ = −1 when z = −n and z′ = +1 when z = −f, so the pseudodepth does increase with distance.
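A short sketch in C of this construction, following the derivation above (the function name and the row-major storage convention are mine):

    /* Perspective matrix with pseudodepth, for near and far planes at
       z = -n and z = -f.  Row-major storage: m[row][col]. */
    void perspectiveMatrix (double n, double f, double m[4][4])
    {
       double a = (n + f) / (n - f);
       double b = 2.0 * n * f / (n - f);
       for (int i = 0; i < 4; i++)
          for (int j = 0; j < 4; j++)
             m[i][j] = 0.0;
       m[0][0] = n;
       m[1][1] = n;
       m[2][2] = a;      /* pseudodepth: z' = (a z + b) / (-z) */
       m[2][3] = b;
       m[3][2] = -1.0;   /* w' = -z, so homogenization divides by -z */
       /* sanity check: z = -n gives z' = -1, z = -f gives z' = +1 */
    }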
7.5.2 Shadows

Suppose that there is a light source at L, a point at P, and a plane given by the equation

    A x + B y + C z + D = 0.

The point P will cast a shadow on the plane. To find the position of the shadow, we have to find where the line through L and P meets the plane. The parametric equation of the line through L and P is given by:

    Qx = Lx + (Px − Lx) t    (12)
    Qy = Ly + (Py − Ly) t    (13)
    Qz = Lz + (Pz − Lz) t    (14)

To find where this line meets the plane, we solve the equation A Qx + B Qy + C Qz + D = 0 for t. The result is

    t = (A Lx + B Ly + C Lz + D) / ((A Lx + B Ly + C Lz) − (A Px + B Py + C Pz))

Substituting this value into (12)–(14) gives the coordinates of the shadow point as

    Qx = (Lx (B Py + C Pz + D) − Px (B Ly + C Lz + D)) / Δ
    Qy = (Ly (A Px + C Pz + D) − Py (A Lx + C Lz + D)) / Δ
    Qz = (Lz (A Px + B Py + D) − Pz (A Lx + B Ly + D)) / Δ

where Δ = (A Px + B Py + C Pz) − (A Lx + B Ly + C Lz). The matrix that takes a point P onto its projection Q on the plane is

    | −(B Ly + C Lz + D)      Lx B                  Lx C                  Lx D                  |
    |    Ly A              −(A Lx + C Lz + D)       Ly C                  Ly D                  |
    |    Lz A                 Lz B               −(A Lx + B Ly + D)       Lz D                  |
    |    A                    B                     C                  −(A Lx + B Ly + C Lz)    |

As an example, we can choose y = 0 as the plane and (0, 1, 0) as the position of the light source. Then B = 1, A = C = D = 0, Lx = 0, Ly = 1, and Lz = 0. The matrix is

    | −1  0   0   0 |
    |  0  0   0   0 |
    |  0  0  −1   0 |
    |  0  1   0  −1 |

If we use this matrix to transform the point (x, y, z, 1), we obtain (−x, 0, −z, y − 1). After homogenization, this point becomes

    ( x / (1 − y), 0, z / (1 − y) ).

We can see that this is correct for points with 0 ≤ y < 1. First, note that the shadow lies entirely in the plane y = 0. Next, points with y = 0 transform to themselves, because they are on the plane and so is their shadow. Points on the plane y = 1/2 have their coordinates doubled. As a point moves closer to the plane y = 1, its shadow approaches infinity.
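Here is a hedged sketch in C that builds the shadow matrix just derived; the function name is mine. Note that if the result is to be loaded into OpenGL with glMultMatrixd, the array must be transposed first, because OpenGL stores matrices in column-major order.

    /* Planar-shadow matrix for a light at L = (lx, ly, lz) and the
       plane A x + B y + C z + D = 0.  Row-major: m[row][col]. */
    void shadowMatrix (double lx, double ly, double lz,
                       double A, double B, double C, double D,
                       double m[4][4])
    {
       m[0][0] = -(B * ly + C * lz + D);
       m[0][1] = lx * B;
       m[0][2] = lx * C;
       m[0][3] = lx * D;

       m[1][0] = ly * A;
       m[1][1] = -(A * lx + C * lz + D);
       m[1][2] = ly * C;
       m[1][3] = ly * D;

       m[2][0] = lz * A;
       m[2][1] = lz * B;
       m[2][2] = -(A * lx + B * ly + D);
       m[2][3] = lz * D;

       m[3][0] = A;
       m[3][1] = B;
       m[3][2] = C;
       m[3][3] = -(A * lx + B * ly + C * lz);
    }

With the light at (0, 1, 0) and the plane y = 0, this reproduces the example matrix in the text.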
7.5.3 Reflection

Again, we consider a point P at (Px, Py, Pz) and the plane A x + B y + C z + D = 0. Here are the parametric equations of a line through P perpendicular to the plane:

    x = Px + A t    (15)
    y = Py + B t    (16)
    z = Pz + C t    (17)

To find where this line meets the plane, we solve these equations and A x + B y + C z + D = 0 for t, giving

    t0 = −(A Px + B Py + C Pz + D) / (A² + B² + C²)    (18)

The reflection Q of P in the plane is the point obtained by substituting t = 2 t0 in equations (15)–(17):

    Qx = Px + 2 A t0
    Qy = Py + 2 B t0
    Qz = Pz + 2 C t0

Using the value of t0 given by (18), we have:

    Qx = Px − 2 A (A Px + B Py + C Pz + D) / Δ
    Qy = Py − 2 B (A Px + B Py + C Pz + D) / Δ
    Qz = Pz − 2 C (A Px + B Py + C Pz + D) / Δ

where Δ = A² + B² + C². The matrix which maps P to Q is

    | Δ − 2A²    −2AB       −2AC      −2AD |
    | −2AB      Δ − 2B²     −2BC      −2BD |
    | −2AC       −2BC      Δ − 2C²    −2CD |
    |   0          0          0         Δ  |

As a test of this matrix, we consider reflections in the YZ plane, which has equation x = 0. The coefficients of the plane equation are A = 1 and B = C = D = 0. The following matrix is clearly correct:

    | −1 0 0 0 |
    |  0 1 0 0 |
    |  0 0 1 0 |
    |  0 0 0 1 |

7.6 Working with Matrices

Finding the equations for projections is fairly straightforward: it is usually a matter of solving linear equations. The algebra can get a bit heavy, but problems of this kind are easily solved by a package such as Maple, Matlab, or Mathematica. The hard part is converting the solution from a set of equations into a matrix. The packages are not much use here. The following notes may help.

The following equation illustrates the effect of terms in the bottom row of a 4 × 4 transformation matrix:

    | 1 0 0 0 | | x |   | x |
    | 0 1 0 0 | | y | = | y |
    | 0 0 1 0 | | z |   | z |
    | a b c d | | 1 |   | a x + b y + c z + d |

After normalizing and dropping the fourth coordinate, we obtain the 3D point

    ( x / (a x + b y + c z + d), y / (a x + b y + c z + d), z / (a x + b y + c z + d) ).
That is, entries in the fourth row act as divisors for the coordinates of the output point. The first three columns divide by factors proportional to x, y, and z respectively, and the fourth column can be used to divide by a constant factor. As usual, entries in the right column correspond to translations:

    | 1 0 0 r | | x |   | x + r |
    | 0 1 0 s | | y | = | y + s |
    | 0 0 1 t | | z |   | z + t |
    | 0 0 0 1 | | 1 |   |   1   |

Thus these entries in the matrix can be used to add constant quantities (that is, quantities independent of x, y, and z) to the output point.

We can apply these ideas to the equations for the shadow point Q in Section 7.5.2. The denominator of each coordinate is

    (A Px + B Py + C Pz) − (A Lx + B Ly + C Lz).

From this, we can immediately infer that the matrix must have the form

    | ·   ·   ·   ·                       |
    | ·   ·   ·   ·                       |
    | ·   ·   ·   ·                       |
    | A   B   C   −(A Lx + B Ly + C Lz)   |

where the dots indicate values we don't know yet. Looking at the numerator of

    Qx = (Lx (B Py + C Pz + D) − Px (B Ly + C Lz + D)) / ((A Px + B Py + C Pz) − (A Lx + B Ly + C Lz))

the Px term tells us that the top-left corner of the matrix must be −(B Ly + C Lz + D). Similarly, the components of Lx (B Py + C Pz + D) give the other entries of the first row of the matrix. We now have:

    | −(B Ly + C Lz + D)   Lx B   Lx C   Lx D                 |
    | ·                    ·      ·      ·                    |
    | ·                    ·      ·      ·                    |
    | A                    B      C      −(A Lx + B Ly + C Lz)|

It is now safe to make a guess about the other entries based on symmetry. Filling them in gives:

    | −(B Ly + C Lz + D)      Lx B                  Lx C                  Lx D                  |
    |    Ly A              −(A Lx + C Lz + D)       Ly C                  Ly D                  |
    |    Lz A                 Lz B               −(A Lx + B Ly + D)       Lz D                  |
    |    A                    B                     C                  −(A Lx + B Ly + C Lz)    |

The final step is to see if this works. First, try some very simple examples. For instance, put the light source at (0, 1, 0) and use the plane y = 0. This gives the matrix shown at the end of Section 7.5.2. Then we can try more complicated examples and, finally, try it out with OpenGL.
8 Rotation

Rotation in three dimensions is quite complicated but is easier to understand in relation to rotation in two dimensions. Consequently, we discuss rotation in general first; then 2D rotation, although much of that material should be revision; and finally 3D rotation.

8.1 Groups

A group G = (S, ∘) is an algebraic structure consisting of a set S and a binary operation ∘ on elements of the set. A group must have the following properties:

Closure: The set S is closed under the operation ∘: if x ∈ S and y ∈ S, then x ∘ y ∈ S.
Associative: The operation ∘ is associative: for all x, y, z ∈ S, x ∘ (y ∘ z) = (x ∘ y) ∘ z.
Unit: There is a unit element u ∈ S with the property that, for any x ∈ S, x ∘ u = u ∘ x = x.
Inverse: For every element x ∈ S, there is an inverse element y ∈ S such that x ∘ y = u.

Note that the group properties do not include commutativity. In general, x ∘ y ≠ y ∘ x. A group with a commutative operator is called a commutative group or an Abelian group. We will write x⁻¹ for the inverse of x.

Since S is a set, it has subsets. If the elements of a subset and the group operation form a group H, then H is a subgroup of G. Here is the formal definition of subgroup. Let G = (S, ∘) be a group and suppose the set T ⊆ S has the following properties (note that we do not need to mention associativity):

Closure: The set T is closed under the group operation of G: if x ∈ T and y ∈ T, then x ∘ y ∈ T.
Unit: T contains the unit element u of G.
Inverse: If x ∈ T, then x⁻¹ ∈ T.

Then H = (T, ∘) is a subgroup of G.

Groups are often used to model operations on a set of objects. For example, graphics transformations operate on vertex coordinates. Assume that there is a group G and a set of objects O and that it is meaningful to apply a member of G to a member of O. If f ∈ G and x ∈ O, we write f(x) for this operation. We require:

Closure: the group operations are closed over O: if p ∈ G and x ∈ O, then p(x) ∈ O;
Unit element: if e is the unit element of G and x ∈ O, then e(x) = x.

A binary relation ∼ on a set S is an equivalence relation iff for all x, y, z ∈ S:

Reflexivity: x ∼ x;
Symmetry: x ∼ y if and only if y ∼ x;
Transitivity: if x ∼ y and y ∼ z, then x ∼ z.
Do not confuse the application of a group element to another group element (p ∘ q ∈ G) with the application of a group element to a member of O (p(x) ∈ O).

The familiar concept of "symmetry" is formally defined in terms of subgroups and equivalence relations.

Lemma: Let ∼ be an equivalence relation on O and let H be the subset of G defined by

    p ∈ H ⟺ ∀ x ∈ O . p(x) ∼ x.

Then H is a subgroup of G.

Proof: We assume that ∼ is an equivalence relation and show that H is a subgroup of G.

Subset: H is a subset of G by definition.

Unit element: for any x ∈ O:

    x ∼ x              (reflexivity)
    e(x) ∼ x           (unit element)
    e ∈ H              (definition of H)

Closure: Assume p, q ∈ H and q(x) = y and p(y) = z. Then (p ∘ q)(x) = p(y) = z and

    x ∼ y              (q ∈ H and q(x) = y)
    y ∼ z              (p ∈ H and p(y) = z)
    x ∼ z              (transitivity of ∼)
    x ∼ (p ∘ q)(x)     (by construction of z)
    p ∘ q ∈ H          (definition of H)

Inverse: Assume p ∈ H and let y = p(x).

    p(x) ∼ x           (p ∈ H)
    y ∼ x              (y = p(x))
    x ∼ y              (symmetry of ∼)
    p⁻¹(y) ∼ y         (p⁻¹(y) = p⁻¹(p(x)) = x)
    p⁻¹ ∈ H            (definition of H)

As an example, suppose that the elements of O are images of squares and the elements of G are rotation operators p_θ. If x ∈ O, then p_θ(x) is the image x rotated by θ degrees, where θ is a whole number satisfying 0 ≤ θ < 360. We define

    p_θ ∘ p_φ = p_[θ+φ]
    p_θ⁻¹ = p_[−θ]

where [θ] stands for θ adjusted by adding or subtracting multiples of 360 so that 0 ≤ [θ] < 360. Suppose that x ∼ y if "x looks the same as y". Since x and y are images of squares, they will look the same if they are rotated through a multiple of 90°. The subgroup of G induced by ∼ consists of the operators p_θ with θ ∈ { 0, 90, 180, 270 }.
8.2 2D Rotation

Rotations in 2D are rotations in a plane about a fixed point. A rotation has one parameter, which is an angle: we rotate through an angle θ. If we rotate through θ and then through φ, we have rotated through θ + φ. Rotating through 0 has no effect. Rotating through θ and then through −θ also has no effect. Thus we can summarize the rotation group in 2D as follows:

• The elements are angles. We will assume that angles are computed mod 360°; that is, if an angle is greater than 360° (or less than zero), we subtract (or add) 360° to it until it is in the range [0, 360).
• The binary operation is addition of angles: given θ and φ, we can calculate θ + φ.
• The unit is 0°.
• The inverse of θ is −θ.

The rotation group in 2D is called SO2, short for special orthogonal group in two dimensions. We note that it is a commutative group.

We need a way of calculating how points of the plane are transformed by elements of SO2. With simple coordinate geometry, we can show that, if a point P has coordinates (x, y), then, after rotation through θ, it has coordinates (x′, y′), where

    x′ = x cos θ − y sin θ
    y′ = x sin θ + y cos θ

8.2.1 Representing 2D Rotations with Matrices

We can use 2 × 2 matrices to represent these rotations. The matrix for a rotation through θ is

    | cos θ  −sin θ |
    | sin θ   cos θ |

and the unit is obtained by substituting 0 for θ:

    | 1 0 |
    | 0 1 |

The group operation with this representation is matrix multiplication. For example:

    | cos θ  −sin θ | | cos φ  −sin φ |
    | sin θ   cos θ | | sin φ   cos φ |

       = | cos θ cos φ − sin θ sin φ   −(cos θ sin φ + sin θ cos φ) |
         | sin θ cos φ + cos θ sin φ     cos θ cos φ − sin θ sin φ  |

       = | cos(θ + φ)  −sin(θ + φ) |
         | sin(θ + φ)   cos(θ + φ) |

Although matrix multiplication in general is not commutative, these particular operations are commutative.
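A small numerical check of this identity in C (my example, not from the notes): composing two rotation matrices and comparing the result with the matrix for the summed angle.

    #include <math.h>
    #include <stdio.h>

    typedef struct { double m[2][2]; } Rot2;

    Rot2 rot2 (double theta)
    {
       Rot2 r = {{{ cos(theta), -sin(theta) },
                  { sin(theta),  cos(theta) }}};
       return r;
    }

    Rot2 mul (Rot2 p, Rot2 q)
    {
       Rot2 r;
       for (int i = 0; i < 2; i++)
          for (int j = 0; j < 2; j++)
             r.m[i][j] = p.m[i][0] * q.m[0][j] + p.m[i][1] * q.m[1][j];
       return r;
    }

    int main (void)
    {
       double a = 0.3, b = 0.5;   /* angles in radians */
       Rot2 lhs = mul(rot2(a), rot2(b));
       Rot2 rhs = rot2(a + b);
       /* the differences are zero up to rounding error */
       printf("%g %g\n", lhs.m[0][0] - rhs.m[0][0],
                         lhs.m[1][0] - rhs.m[1][0]);
       return 0;
    }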
8.2.2 Representing 2D Rotations with Complex Numbers

There is an alternative representation for 2D rotations: we can use complex numbers. A complex number has the form x + i y, in which i = √−1. The norm [see note below] of a complex number z = x + i y is

    ‖z‖ = x² + y²

We are interested only in complex numbers z with ‖z‖ = 1. If we write z = x + i y, then x² + y² = 1, and so these numbers lie on the unit circle in the complex plane. We can write all such numbers in the form cos θ + i sin θ for some value 0 ≤ θ < 2π. If we multiply two numbers on the unit circle, we get another number on the unit circle:

    (cos θ + i sin θ)(cos φ + i sin φ)
       = cos θ cos φ − sin θ sin φ + i (sin θ cos φ + cos θ sin φ)
       = cos(θ + φ) + i sin(θ + φ)

Thus the effect of multiplying by cos θ + i sin θ is to rotate through an angle θ. In this system, the rotation group is represented as follows:

• Group elements are complex numbers of the form cos θ + i sin θ.
• The group operation is complex multiplication.
• Complex multiplication is associative.
• The unit element is 1 + i 0, corresponding to θ = 0.
• The inverse of cos θ + i sin θ is cos θ − i sin θ, because

    (cos θ + i sin θ)(cos θ − i sin θ) = cos² θ + sin² θ + i (sin θ cos θ − cos θ sin θ) = 1

8.3 3D Rotation

Although there are analogies between 2D rotations and 3D rotations, 3D rotations are considerably more complicated. We first note a couple of non-intuitive trivia involving 3D rotation.

1. Rotations interfere with one another in unexpected ways. Let Rx(θ) stand for a rotation through θ about the X axis, with similar conventions for the Y and Z axes. Then you can verify by drawing pictures that

    Ry(180°) Rx(180°) = Rz(180°)

2. Although you cannot rotate an object through 360° without letting go of it, you can rotate it through 720°. Hold something flat on the palm of your right hand, facing upwards. Begin turning it in a counter-clockwise direction. After turning it about 360°, your arm will be twisted, but you can continue turning, raising your hand above your head. Eventually, you will return to the starting position, but the object will have rotated through 720°. (This is called "the plate trick".)

[Note: The norm of a vector is the same as its length, √(x1² + x2² + ···), where x1, x2, ... are the components of the vector. In algebraic work, the square root is usually omitted. Thus we will use the squared norm for complex numbers here and for quaternions in Section 8.3.2.]
The rotation group in 3D is called SO3. As with SO2, there are two important representations: the first uses matrices and the second uses a generalized form of complex numbers called quaternions (Shoemake 1994a; Shoemake 1994d; Shoemake 1994b; Shoemake 1994c). We discuss each in turn.

8.3.1 Representing 3D Rotations with Matrices

We need a 3 × 3 matrix to represent a 3D rotation. In graphics, we use a 4 × 4 matrix of the form

    | ? ? ? 0 |
    | ? ? ? 0 |
    | ? ? ? 0 |
    | 0 0 0 1 |

in which the ? entries form the 3D rotation matrix. From now on, we will ignore the 0 and 1 entries of the 4 × 4 matrix because they do not contribute anything useful. When we try to use matrices to represent rotations, we run into several difficulties.

1. The first difficulty is that the general form of the matrix, representing a rotation through an arbitrary angle about an arbitrary axis, is rather complicated. However, Leonhard Euler [see note below] discovered that any rotation can be expressed as the product of three rotations about the principal axes. (The form of the matrix for any one of these rotations is rather simple.) Using the notation above, an arbitrary rotation R can be written

    R = Rx(θ) Ry(φ) Rz(ψ)

in which θ, φ, and ψ are called Euler angles. Because 3D rotations do not commute, we must use a consistent order: Rx(θ) Ry(φ) Rz(ψ) is not in general the same as Ry(φ) Rx(θ) Rz(ψ) or any other permutation.

2. Euler angles do not solve all of our problems. Suppose φ = 90° and the rotation has the form Rx(θ) Ry(90°) Rz(ψ). In this situation, Rx(θ) and Rz(ψ) have the same effect, and there is no way of rotating about the third axis! We have lost one degree of freedom. This phenomenon is called gimbal lock because it occurs in gyroscopes: they are supported by three sets of bearings (called "gimbals") so that they can rotate in any direction. However, if one set of bearings is rotated through 90°, the other two sets of bearings become parallel and the gyroscope can no longer rotate freely.

3. In computer graphics, we frequently want to animate a rotational movement. Suppose R1 and R2 are matrices representing rotations that describe the initial and final orientations of an object. What are the intermediate orientations? One obvious idea is to compute the matrix

    R = (1 − α) R1 + α R2

because R = R1 when α = 0 and R = R2 when α = 1. Unfortunately, the values of R do not produce a smooth rotation and, even worse, R may not even be a rotation matrix!

[Note: Euler, a Swiss mathematician, lived from 1707 to 1783. The name is pronounced roughly as "Oiler", following the German pronunciation of "eu". "We may sum up Euler's work by saying that he created a good deal of analysis, and revised almost all the branches of pure mathematics which were then known, filling up the details, adding proofs, and arranging the whole in a consistent form. Such work is very important, and it is fortunate for science when it falls into hands as competent as those of Euler."]
Actually, this is quite easy to see. Suppose R1 is the identity matrix and R2 represents a rotation of 180° about the X axis. Then

    R1 = | 1 0 0 0 |        R2 = | 1  0  0  0 |
         | 0 1 0 0 |             | 0 −1  0  0 |
         | 0 0 1 0 |             | 0  0 −1  0 |
         | 0 0 0 1 |             | 0  0  0  1 |

The intermediate matrix corresponding to α = 1/2 is

    | 1 0 0 0 |
    | 0 0 0 0 |
    | 0 0 0 0 |
    | 0 0 0 1 |

which reduces any 3D object to a 1D line.

4. Given two rotations described by Euler angles, how do you find a rotation that corresponds to their difference? Put concretely, suppose

    R1 = Rx(θ1) Ry(φ1) Rz(ψ1)
    R2 = Rx(θ2) Ry(φ2) Rz(ψ2)

how do we find a rotation R = Rx(θ) Ry(φ) Rz(ψ) such that R1 R = R2? This is a difficult problem to solve with Euler angles.

5. Since rotation matrices contain more information than is necessary to define a rotation, rounding errors create problems. After a sequence of operations on rotation matrices, accumulated errors may give a matrix that is not exactly a rotation. The consequence, in a graphics program, is distortion of the image.

Another potential problem is that we often need inverses of rotations. Computing the inverse of a matrix is expensive and subject to rounding errors. Fortunately, however, the inverse of a rotation matrix is its transpose and is therefore easy to calculate. Although we can exploit this fact when we are doing our own rotation calculations, a graphics package such as OpenGL cannot distinguish rotation matrices from other matrices and must use general techniques for inversion.

In summary, although it is possible to use matrices to represent 3D rotations, there are a number of problems, and a better representation is highly desirable. Fortunately, there is one.

8.3.2 Representing 3D Rotations with Quaternions

One solution to the problem of representing 3D rotations was discovered by Hamilton [see note below] when he introduced quaternions.

[Note: Sir William Rowan Hamilton (1805-1865), Irish mathematician who also introduced vectors and matrices and made important contributions to geometrical optics, dynamics (including the "Hamiltonian"), geometry, complex numbers, theory of equations, real analysis, and linear operators.]
       |  i    j    k
    ---+---------------
     i | −1    k   −j
     j | −k   −1    i
     k |  j   −i   −1

Figure 41: Quaternion multiplication: when a quaternion is represented as s + i x + j y + k z, multiplication uses this table for products of i, j, and k. Note that i² = j² = k² = −1.

Hamilton reasoned that, since 2D rotations can be represented by two numbers (x + i y, see above), it should be possible to represent 3D rotations with three numbers. For eight years, he experimented with 3D vectors but was unsuccessful, because the rotation group cannot be represented as a vector space. Eventually, he tried four numbers and succeeded very quickly: he called the new objects quaternions ("quater" is Latin for "four").

We can write a quaternion in several different ways:

• As a tuple of four real numbers: (s, x, y, z);
• As s + i x + j y + k z, by analogy to x + i y (see Figure 41);
• As a scalar/vector pair: (s, v).

We will use the last of these representations, which is also the most modern. If the vector part of a quaternion is the zero vector (so we have (s, 0)), the quaternion behaves exactly like a real number. If the scalar part is zero (so we have (0, v)), the quaternion behaves exactly like a vector. We use this fact to make implicit conversions:

• the vector v can be converted to the quaternion (0, v);
• the quaternion (0, v) can be converted to the vector v.

In particular, the unit quaternion (1, 0) is essentially the same as the real number 1.

The quaternions are a number system in which all of the standard operations (addition, subtraction, multiplication, and division) are defined. We do not need addition and subtraction, and we will ignore them for now (there is one application which comes later). Multiplication is defined like this:

    (s1, v1) (s2, v2) = (s1 s2 − v1 · v2, s1 v2 + s2 v1 + v1 × v2)

in which v1 · v2 is the dot product or inner product of the vectors v1 and v2, and v1 × v2 is their outer product or cross product. The unit (1, 0) behaves as it should:

    (1, 0) (s, v) = (1 s − 0 · v, 1 v + s 0 + 0 × v) = (s, v)

The conjugate of the quaternion q = (s, v) is the quaternion q* = (s, −v). (Compare: the conjugate of the complex number x + i y is the complex number x − i y.)
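A hedged sketch of this multiplication rule in C (the types and names below are mine, not CUGL's):

    typedef struct { double x, y, z; } Vec3;
    typedef struct { double s; Vec3 v; } Quat;

    static double dot (Vec3 a, Vec3 b)
    {
       return a.x * b.x + a.y * b.y + a.z * b.z;
    }

    static Vec3 cross (Vec3 a, Vec3 b)
    {
       Vec3 c = { a.y * b.z - a.z * b.y,
                  a.z * b.x - a.x * b.z,
                  a.x * b.y - a.y * b.x };
       return c;
    }

    /* (s1, v1)(s2, v2) = (s1 s2 - v1.v2, s1 v2 + s2 v1 + v1 x v2) */
    Quat qmul (Quat p, Quat q)
    {
       Quat r;
       Vec3 w = cross(p.v, q.v);
       r.s   = p.s * q.s - dot(p.v, q.v);
       r.v.x = p.s * q.v.x + q.s * p.v.x + w.x;
       r.v.y = p.s * q.v.y + q.s * p.v.y + w.y;
       r.v.z = p.s * q.v.z + q.s * p.v.z + w.z;
       return r;
    }

    /* conjugate: (s, -v) */
    Quat qconj (Quat q)
    {
       Quat r = { q.s, { -q.v.x, -q.v.y, -q.v.z } };
       return r;
    }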
The norm of the quaternion q = (s, v) is

    ‖q‖ = q q* = (s, v) (s, −v)
        = (s² + v · v, −s v + s v + v × (−v))
        = (s² + v · v, 0)
        = s² + v · v

Recall that, for any vector v, v × v = 0. The norm of a quaternion is a real number. If we write out the components of the vector part of the quaternion, we have

    ‖(s, (x, y, z))‖ = s² + x² + y² + z²

(Compare: the norm of a complex number z = x + i y is the real number x² + y².) We can rearrange ‖q‖ = q q* into the form

    q (q* / ‖q‖) = 1

which suggests

    q⁻¹ = q* / ‖q‖

Of all the quaternions, only the zero quaternion, (0, 0), does not have an inverse.

A unit quaternion is a quaternion q with ‖q‖ = 1. Note that:

• the unit (1, 0) is an example of a unit quaternion but is not the only one;
• if q is a unit quaternion, then q⁻¹ = q*.

Let Q be the set of unit quaternions. Then Q, with multiplication as the operation, is a group:

• multiplication is closed and associative (easy to prove, although we haven't done so here);
• there is a unit, (1, 0) = 1;
• every unit quaternion has an inverse.

Consider the quaternion

    q = (cos θ, u sin θ)

in which u is a unit vector, so that u · u = 1. Since

    ‖q‖ = cos² θ + (u · u) sin² θ = 1

q is a unit quaternion. In general, we can write any unit quaternion in the form (cos θ, u sin θ). (Compare: if z is a complex number and ‖z‖ = 1, we can write z in the form cos θ + i sin θ.)

Consider the product of unit quaternions with their vector components in the same direction (recall that u × u = 0 for any vector u):

    (cos θ, u sin θ) (cos φ, u sin φ)
       = (cos θ cos φ − (u · u) sin θ sin φ, u cos θ sin φ + u sin θ cos φ + (u × u) sin θ sin φ)
       = (cos(θ + φ), u sin(θ + φ))
Multiplying unit quaternions is the same as adding angles. (Compare: (cos θ + i sin θ)(cos φ + i sin φ) = cos(θ + φ) + i sin(θ + φ).)

We have at last reached the interesting part. Suppose q = (cos θ, u sin θ) is a unit quaternion and v is a vector. Then

    v′ = q v q⁻¹

is the vector v rotated through 2θ about an axis in the direction u. This is the sense in which quaternions represent 3D rotations.

8.3.3 A Proof that Unit Quaternions Represent Rotations

Define R(φ, u) as an operation on vectors: its effect is to rotate a vector through an angle φ about an axis defined by the unit vector u. We compute the vector R(φ, u) v. Below, we shorten this expression to R v.

The first step is to resolve v into components parallel to and orthogonal to u:

    vp = (u · v) u
    vo = v − (u · v) u

Since R does not affect vp, we have

    R v = R (vp + vo) = vp + R vo

Let w be a vector perpendicular to vo and lying in the plane of the rotation. Then w must be orthogonal to u and vo, and:

    w = u × vo
      = u × (v − (u · v) u)
      = u × v − (u · v)(u × u)
      = u × v

since u × u = 0. We can resolve R vo into components parallel to vo and w. In fact:

    R vo = vo cos φ + w sin φ

and hence

    R v = vp + R vo
        = vp + vo cos φ + w sin φ
        = (u · v) u + (v − (u · v) u) cos φ + (u × v) sin φ
        = v cos φ + u (u · v)(1 − cos φ) + (u × v) sin φ    (19)
The next step is to see how quaternions achieve the same effect. Let p = (0, v) be a pure quaternion and q = (cos θ, u sin θ). Then

    q p = (cos θ, u sin θ) (0, v)
        = (−(u · v) sin θ, v cos θ + (u × v) sin θ)

and

    q p q⁻¹
      = (−(u · v) sin θ, v cos θ + (u × v) sin θ) (cos θ, −u sin θ)    (20)
      = ( −(u · v) sin θ cos θ + ((v cos θ + (u × v) sin θ) · u) sin θ,    (21)
          u (u · v) sin² θ + v cos² θ + (u × v) sin θ cos θ − (v cos θ + (u × v) sin θ) × (u sin θ) )
      = ( 0,    (22)
          u (u · v) sin² θ + v cos² θ + 2 (u × v) sin θ cos θ − ((u × v) × u) sin² θ )
      = ( 0, (cos² θ − sin² θ) v + 2 sin² θ (u · v) u + 2 sin θ cos θ (u × v) )    (23)
      = ( 0, v cos 2θ + u (u · v)(1 − cos 2θ) + (u × v) sin 2θ )    (24)

In (21), the scalar part is zero because the first two terms cancel and the third term is zero: (u × v) · u = 0, because u × v is orthogonal to u. In the vector part, we use the general fact that (b × c) × d = (d · b) c − (d · c) b, which, in this case, gives (u × v) × u = (u · u) v − (u · v) u = v − (u · v) u.

Comparing (19) and (24), we see that they are the same if we substitute φ = 2θ.

To gain familiarity with unit quaternions, we consider a few simple examples. We will use the form (cos θ, u sin θ) for the general unit quaternion.

• First, assume θ = 0. Then the quaternion is (1, 0 u), or simply (1, 0). The direction of the unit vector makes no difference if the amount of rotation is zero.
• Next, suppose θ = 90°. The quaternion then has the form (0, u). This means that a pure unit quaternion represents a rotation through 180° about the unit vector component of the quaternion.

8.3.4 Quaternions and Matrices

We can think of the quaternion product q q′ as an operation: q is applied to q′. The operation q can be represented as a matrix. We call it a "left operation" because q is on the left of q′ (this is important because quaternion multiplication is not commutative). If q = (s, (x, y, z)), we can calculate the matrix from the definition of the quaternion product and obtain:

    Lq = |  s  −z   y   x |
         |  z   s  −x   y |
         | −y   x   s   z |
         | −x  −y  −z   s |

Symmetrically, we can consider q′ q, in which q is a right operator acting on q′; the corresponding matrix is:

    Rq = |  s   z  −y   x |
         | −z   s   x   y |
         |  y  −x   s   z |
         | −x  −y  −z   s |

(These matrices act on q′ = (s′, (x′, y′, z′)) written as the column vector (x′, y′, z′, s′).)
Since matrix multiplication is associative, the matrix product Lq Rq* represents the effect of q v q* on the vector v. For a unit quaternion, q* = q⁻¹, so this product is the rotation matrix corresponding to the quaternion q:

    Lq Rq* = | s² + x² − y² − z²   2(xy − sz)          2(xz + sy)          0                 |
             | 2(xy + sz)          s² − x² + y² − z²   2(yz − sx)          0                 |
             | 2(xz − sy)          2(yz + sx)          s² − x² − y² + z²   0                 |
             | 0                   0                   0                   s² + x² + y² + z² |

In the cases we are interested in, ‖q‖ = 1, and this matrix simplifies to

    Q = | 1 − 2(y² + z²)   2(xy − sz)       2(xz + sy)       0 |
        | 2(xy + sz)       1 − 2(x² + z²)   2(yz − sx)       0 |
        | 2(xz − sy)       2(yz + sx)       1 − 2(x² + y²)   0 |
        | 0                0                0                1 |

It is also possible, of course, to convert a rotation matrix into a quaternion. The algebra is rather heavy and will not be presented here.

8.3.5 Quaternion Interpolation

One of the advantages of the quaternion representation is that we can interpolate smoothly between two orientations. The problem that we wish to solve is this: given two rotations, R and R′, how do we construct a sequence of rotations R1, R2, ..., Rn such that R1 = R and Rn = R′, in such a way that, when we apply these rotations to a graphical object, it appears to rotate smoothly? Note that if R and R′ are represented by matrices, this is not an easy problem to solve.

If we represent the rotations R and R′ as quaternions q and q′, respectively, there is a fairly simple solution, although its derivation is tricky. Let ψ = cos⁻¹(q · q′) and define

    slerp(t, q, q′) = (q sin((1 − t) ψ) + q′ sin(t ψ)) / sin ψ

in which "slerp" stands for spherical linear interpolation. Then

    slerp(0, q, q′) = q
    slerp(1, q, q′) = q′

and, for 0 ≤ t ≤ 1, slerp(t, q, q′) is a quaternion that smoothly interpolates from q to q′.
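A sketch of slerp as defined above, using the Quat type from the earlier sketch (the implementation details, including the fallback for nearly equal quaternions, are mine; q and q2 are assumed to be unit quaternions):

    #include <math.h>

    Quat slerp (double t, Quat q, Quat q2)
    {
       /* the angle psi comes from the 4D dot product q . q2 */
       double c = q.s * q2.s + q.v.x * q2.v.x
                + q.v.y * q2.v.y + q.v.z * q2.v.z;
       if (c >  1.0) c =  1.0;   /* guard acos against rounding */
       if (c < -1.0) c = -1.0;
       double psi = acos(c);
       double s = sin(psi);
       double a, b;
       if (s < 1e-6)
       {
          a = 1.0 - t;           /* quaternions nearly equal: a  */
          b = t;                 /* linear blend is adequate     */
       }
       else
       {
          a = sin((1.0 - t) * psi) / s;
          b = sin(t * psi) / s;
       }
       Quat r = { a * q.s + b * q2.s,
                  { a * q.v.x + b * q2.v.x,
                    a * q.v.y + b * q2.v.y,
                    a * q.v.z + b * q2.v.z } };
       return r;
    }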
8.4 Quaternions in Practice

The following three programs (available on the web site) illustrate applications of quaternions to graphics programming. In each case, the same effect is hard to achieve with matrices although, of course, anything is possible.

8.4.1 Imitating a Trackball

A trackball is a device used for motion input in professional CAD workstations and also for some games. It is a sphere with only the top accessible; by moving your hand over the sphere, you send rotational information to the computer. The purpose of this program is to use the mouse to simulate a trackball: when you move the mouse, it is as if you were moving your hand over a trackball. The hard part is performed by CUGL. We describe the API first and then the underlying implementation.

The first step is to declare a quaternion:

    Quaternion quat;

In the display function, the quaternion is used to rotate the model:

    quat.rotate();
    buildPlane();

Most of the work is done by the mouse callback function, shown in Figure 42. This function stores the previous mouse position, (xOld, yOld), and the most recent mouse position, (xNew, yNew). To avoid a sudden jump when the program is started, the first block of code ensures that the old and new values are close together initially.

    void mouseMovement (int xNew, int yNew)
    {
       const int MSTART = -10000;
       static int xOld = MSTART;
       static int yOld = MSTART;
       if (xOld == MSTART || yOld == MSTART)
       {
          xOld = xNew + 1;
          yOld = yNew + 1;
       }
       quat.trackball(
          float(2 * xOld - width) / float(width),
          float(height - 2 * yOld) / float(height),
          float(2 * xNew - width) / float(width),
          float(height - 2 * yNew) / float(height) );
       xOld = xNew;
       yOld = yNew;
       glutPostRedisplay();
    }

Figure 42: Mouse callback function for trackball simulation
The function passes transformed values of the old and new mouse positions to the function Quaternion::trackball. The transformation ensures that the arguments are all in the range [−1, 1] (assuming that the mouse stays inside the window). Finally, the old mouse position is updated to become the new mouse position, and the function calls glutPostRedisplay to refresh the view.

From the user's point of view, that's all there is to do. We can look behind the scenes and see what CUGL is doing. The implementation uses a couple of constants: the radius r of the simulated trackball, and r²:

    const double BALLRADIUS = 0.8f;
    const double BRADSQ = BALLRADIUS * BALLRADIUS;

The mouse position (x, y) is projected onto a sphere to obtain a 3D point (x, y, z) such that x² + y² + z² = r². In practice, the effect of a mouse movement becomes too extreme as the mouse approaches the edge of the ball, and we project onto a hyperboloid instead if x² + y² > r²/2. All of this is done by the auxiliary function project shown in Figure 43.

    double project (double x, double y)
    {
       double dsq = x * x + y * y;
       double d = sqrt(dsq);
       if (d < BALLRADIUS * 0.5 * sqrt(2.0))
          return sqrt(BRADSQ - dsq);
       else
          return BRADSQ / (2 * d);
    }

Figure 43: Projecting the mouse position

The real work is done by Quaternion::trackball, shown in Figure 44. First, the vectors v1 and v2 are initialized to the projected positions of the mouse on the sphere or hyperboloid. The idea is to compute a quaternion r that represents the rotation corresponding to these two points and to multiply the current quaternion by r. The function computes the vectors a = v2 × v1 and d = v1 − v2 and the real number t = ‖d‖ / 2r. The value of t is clipped, if necessary, to ensure that −1 ≤ t ≤ 1, and then θ is set to sin⁻¹(t). We require a rotation through 2θ about an axis parallel to a, so the corresponding quaternion is (cos θ, â sin θ), where â = a / ‖a‖.

    void Quaternion::trackball (double x1, double y1, double x2, double y2)
    {
       Vector v1(x1, y1, project(x1, y1));
       Vector v2(x2, y2, project(x2, y2));
       Vector a = cross(v2, v1);
       Vector d = v1 - v2;
       double t = d.length() / (2 * BALLRADIUS);
       if (t > 1) t = 1;
       if (t < -1) t = -1;
       double theta = asin(t);
       (*this) *= Quaternion(cos(theta), a.normalize() * sin(theta));
    }

Figure 44: Updating the trackball quaternion

8.4.2 Moving the Camera

The problem solved by the next program is to provide a set of commands that move the camera in a consistent way. The program provides six ways of translating the camera (left, right, up, down, forwards, and backwards) and four ways of rotating it (pan left, pan right,
void Quaternion::trackball(double x1, double y1, double x2, double y2)
{
   Vector v1(x1, y1, project(x1, y1));
   Vector v2(x2, y2, project(x2, y2));
   Vector a = cross(v2, v1);
   Vector d = v1 - v2;
   double t = d.length() / (2 * BALLRADIUS);
   if (t > 1) t = 1;
   if (t < -1) t = -1;
   double theta = asin(t);
   (*this) *= Quaternion(cos(theta), a.normalize() * sin(theta));
}

Figure 44: Updating the trackball quaternion

tilt up, and tilt down). We know that this is difficult to do with Euler angles because of the "gimbal lock" problem described in Section 8.3.1. Moreover, the matrix solution requires computing matrix inverses, as we will see. The program shows how to do it with quaternions.

The program uses a vector to store the position of the camera. The initial value of the vector is (0, -h, 0), where h is the height of the camera. The height is negated because OpenGL translates the scene, not the camera.

   const Vector STARTPOS = Vector(0, - INITIAL_HEIGHT, 0);
   Vector pos = STARTPOS;

The orientation of the camera is stored in a quaternion. The initial value of the quaternion is (1, 0), which is established by the default constructor.

   const Quaternion STARTQUAT;
   Quaternion quat = STARTQUAT;

In the display function, the quaternion and the vector are used to rotate and position the camera:

   quat.rotate();
   pos.translate();
   scene();

The user translates the camera by pressing one of the keys 'f', 'b', 'l', 'r', 'u', or 'd'. Each key invokes the function move with a unit vector, as shown in Figure 45. The unit vectors are defined in Figure 46. Figure 47 shows the function move.

We cannot simply update the position vector, because the direction of movement would depend on the orientation, which is not what we want. Suppose that the quaternion controlling the orientation is q. Then we apply q⁻¹ to the current position, obtaining the vector w giving the initial orientation of the camera. We then apply the translation u to this vector and apply q to the result.
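CUGL hides the algebra behind quat.apply, but it is worth seeing what such an operation must compute. The following sketch is not CUGL source (the names Vec and Quat are invented for illustration); it shows one standard way to rotate a vector by a unit quaternion, that is, to compute q v q⁻¹:

   // Rotating a vector with a unit quaternion q = (s, (x, y, z)).
   struct Vec { double x, y, z; };

   struct Quat {
       double s, x, y, z;

       // Treat v as the pure quaternion (0, v) and conjugate: q (0, v) q^-1.
       // Expanding the product gives the closed form below, which avoids
       // building temporary quaternions.
       Vec apply(const Vec& v) const {
           // t = 2 (q_vec x v)
           double tx = 2 * (y * v.z - z * v.y);
           double ty = 2 * (z * v.x - x * v.z);
           double tz = 2 * (x * v.y - y * v.x);
           // v' = v + s t + q_vec x t
           return Vec{ v.x + s * tx + (y * tz - z * ty),
                       v.y + s * ty + (z * tx - x * tz),
                       v.z + s * tz + (x * ty - y * tx) };
       }
   };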
void graphicKeys (unsigned char key, int x, int y)
{
   switch (key)
   {
      case 'f':
         move(K);
         break;
      case 'b':
         move(- K);
         break;
      case 'l':
         move(I);
         break;
      case 'r':
         move(- I);
         break;
      case 'u':
         move(- J);
         break;
      case 'd':
         move(J);
         break;
      case 's':
         pos = STARTPOS;
         quat = STARTQUAT;
         break;
      case 27:
         exit(0);
      default:
         break;
   }
   glutPostRedisplay();
}

Figure 45: Translating the camera

const Vector I = Vector(1, 0, 0);
const Vector J = Vector(0, 1, 0);
const Vector K = Vector(0, 0, 1);

Figure 46: Unit vectors
void move(Vector u)
{
   pos += quat.apply(u);
}

Figure 47: Auxiliary function for translating the camera

void functionKeys (int key, int x, int y)
{
   const double DELTA = radians(5);
   switch (key)
   {
      case GLUT_KEY_UP:
         quat *= Quaternion(I, DELTA);
         break;
      case GLUT_KEY_DOWN:
         quat *= Quaternion(I, - DELTA);
         break;
      case GLUT_KEY_LEFT:
         quat *= Quaternion(J, DELTA);
         break;
      case GLUT_KEY_RIGHT:
         quat *= Quaternion(J, - DELTA);
         break;
   }
   glutPostRedisplay();
}

Figure 48: Rotating the camera

Figure 48 shows the callback function for rotating the camera. The arrow keys ↑ and ↓ tilt the camera up and down, and the arrow keys ← and → pan left or right. In each case, the current quaternion is multiplied by a quaternion that rotates 5° about the given axis: I for tilts and J for pans. Note that the call radians(5) converts 5° to radians as required by the quaternion constructor.

8.4.3 Flying

The final program allows the user to "fly" a plane by using the arrow keys. The plane moves forwards with uniform speed and the arrow keys change its orientation. The problem is to update the position of the plane in a way that is consistent with its orientation. As before, this is difficult to do with matrices, and Euler angles could lock up. The following solution uses quaternions.

The speed of the plane is constant and in the direction -Z, because this is the way the plane's coordinates are set up.
   const Vector VEL(0, 0, -100);

The plane has a current velocity, position, and orientation. Note that the velocity is not in general equal to VEL, but rather VEL rotated by orientation.

   Vector velocity;
   Vector position;
   Quaternion orientation;

When the user presses 'r', the velocity, position, and orientation are reset to their initial values. The order of the statements is important because orientation is used to set velocity.

   void reset()
   {
      orientation = Quaternion(J, radians(90));
      velocity = orientation.apply(VEL);
      position = Vector();
   }

The display function translates the plane and then uses the inverse of the orientation quaternion to set its direction. (The inversion could be avoided by reversing the effect of the arrow keys in the special key callback function.)

   position.translate();
   orientation.inv().rotate();
   glCallList(plane);

The idle callback function performs a simple integration, adding v Δt to the current position.

   void idle ()
   {
      position += velocity * DELTA_TIME;
      glutPostRedisplay();
   }

Figure 49 shows the callback function that handles the arrow keys for controlling the plane. Each key changes the orientation by a small amount and re-computes the velocity by applying the new orientation to the initial velocity VEL. The small orientation changes are defined by constant quaternions:

   const Quaternion climb(I, DELTA_TURN);
   const Quaternion left(J, DELTA_TURN);
   const Quaternion roll(K, DELTA_TURN);
   const Quaternion climbInv = climb.inv();
   const Quaternion leftInv = left.inv();
   const Quaternion rollInv = roll.inv();
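These declarations rely on two operations: an axis-angle constructor and an inverse. Under the usual convention, a rotation through angle ψ about a unit axis a is the quaternion (cos ½ψ, a sin ½ψ), and the inverse of a unit quaternion is its conjugate. The following is an illustrative sketch of that convention, not CUGL's actual implementation:

   #include <cmath>

   struct Vec { double x, y, z; };

   struct Quaternion {
       double s, x, y, z;    // q = (s, (x, y, z))

       // Rotation through 'angle' radians about the (assumed unit) vector
       // 'axis': q = (cos(angle/2), axis * sin(angle/2)).
       Quaternion(Vec axis, double angle)
           : s(cos(angle / 2)),
             x(axis.x * sin(angle / 2)),
             y(axis.y * sin(angle / 2)),
             z(axis.z * sin(angle / 2)) {}

       // For a unit quaternion the inverse is the conjugate: negate the
       // vector part and keep the scalar part. No division is needed
       // because the norm is 1.
       Quaternion inv() const {
           Quaternion q(*this);
           q.x = -q.x; q.y = -q.y; q.z = -q.z;
           return q;
       }
   };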
void functionKeys (int key, int x, int y)
{
   switch (key)
   {
      case GLUT_KEY_UP:
         orientation *= climb;
         velocity = orientation.apply(VEL);
         break;
      case GLUT_KEY_DOWN:
         orientation *= climbInv;
         velocity = orientation.apply(VEL);
         break;
      case GLUT_KEY_LEFT:
         orientation *= left;
         velocity = orientation.apply(VEL);
         break;
      case GLUT_KEY_RIGHT:
         orientation *= leftInv;
         velocity = orientation.apply(VEL);
         break;
      case GLUT_KEY_END:
         orientation *= roll;
         velocity = orientation.apply(VEL);
         break;
      case GLUT_KEY_PAGE_DOWN:
         orientation *= rollInv;
         velocity = orientation.apply(VEL);
         break;
   }
}

Figure 49: Callback function for flying the plane
9 Theory of Illumination

This section is an expanded version of Appendix B from Getting Started with OpenGL.

To obtain realistic images in computer graphics, we need to know not only about light but also what happens when light is reflected from an object into our eyes. The nature of this reflection determines the appearance of the object. The general problem is to use the properties of the light sources and the materials to compute the apparent colour at each pixel that corresponds to part of an object on the screen.

9.1 Steps to Realistic Illumination

We discuss various techniques for solving this problem, increasing the realism at each step. In each case, we define the intensity, I, of a pixel in terms of a formula. The first few techniques ignore colour.

9.1.1 Intrinsic Brightness

We assume that each object has an intrinsic brightness ki. Then

   I = ki

This technique can be used for simple graphics, and is essentially the technique that OpenGL uses when lighting is disabled, but it is clearly unsatisfactory. There is no attempt to model properties of the light or its effect on the objects.

9.1.2 Ambient Light

We assume that there is ambient light (light from all directions) with intensity Ia and that each object has an ambient reflection coefficient ka. This gives

   I = Ia ka

In practice, the ambient light technique looks a lot like the intrinsic brightness technique.

9.1.3 Diffuse Lighting

We assume that there is a single, point source of light and that the object has diffuse or Lambertian reflective properties. This means that the light reflected from the object depends only on the incidence angle of the light, not the direction of the viewer. More precisely, suppose that: N is a vector normal to the surface of the object; L is a vector corresponding to the direction of the light; and V is a vector corresponding to the direction of the viewer. Figure 50 shows these vectors: note that V is not necessarily in the plane defined by L and N. Assume that all vectors have unit length (‖N‖ = ‖L‖ = ‖V‖ = 1). Then

   I = N · L

Note that:
[Figure 50: Illuminating an object. The diagram shows the unit vectors at a point on the object's surface: the normal N, the light direction L, the reflection direction R, and the viewer direction V.]

• V does not appear in this expression and so the brightness does not depend on the viewer's position;
• the brightness is greatest when N and L are parallel (N · L = 1); and
• the brightness is smallest when N and L are orthogonal (N · L = 0).

We can account for Lambertian reflection in the following way. Suppose that the beam of light has cross-section area A and it strikes the surface of the object at an angle θ. Then the area illuminated is approximately A / cos θ. After striking the object, the light is scattered uniformly in all directions. The apparent brightness to the viewer is inversely proportional to the area illuminated, which means that it is proportional to cos θ, the inner product of the light vector and the surface normal.

We introduce Ip, the incident light intensity from a point source, and kd, the diffuse reflection coefficient of the object. Then

   I = Ip kd (N · L)

The value of N · L can be negative: this will be the case if the light is underneath the surface of the object. We usually assume that such light does not contribute to the illumination of the surface. In calculations, we should use max(N · L, 0) to keep negative contributions out of our results. If we include some ambient light, this equation becomes

   I = Ia ka + Ip kd (N · L)

9.1.4 Attenuation of Light

Light attenuates (gets weaker) with distance from the source. The theoretical rate of attenuation for a point source of light is quadratic. In practice, sources are not true points and there is always some ambient light from reflecting surfaces (although ambient light is very weak in outer space). Consequently, we assume that attenuation, f, is given by

   f = 1 / (C + L d + Q d²)
where:

• d is the distance between the light and the object;
• C (constant attenuation) ensures that a close light source does not give an infinite amount of light;
• L (linear term) allows for the fact that the source is not a point; and
• Q (quadratic term) models the theoretical attenuation from a point source.

Then we have

   I = Ia ka + f Ip kd (N · L)

9.1.5 Coloured Light

The previous calculations ignore colour. In this section, we assume that:

• the object has diffuse colour factors Odr (red), Odg (green), and Odb (blue);
• the light has intensity colour factors corresponding to ambient sources (Iar, Iag, and Iab); and
• point sources (Ipr, Ipg, and Ipb).

All of these numbers are in the range [0, 1]. We now have three intensity equations (with λ = r, g, b) of the form

   Iλ = Iaλ ka Odλ + f Ipλ kd Odλ (N · L)

9.1.6 Specular Reflection

Lambertian reflection is a property of dull objects such as cloth or chalk. Many objects exhibit degrees of shininess: polished wood has some shininess and a mirror is the ultimate in shininess. The technical name for shininess is specular reflection.

A characteristic feature of specular reflection is that it has a colour closer to the colour of the light source than the colour of the object. For example, a brown table made of polished wood that is illuminated by a white light will have specular highlights that are white, not brown.

Specular reflection depends on the direction of the viewer as well as the light. We introduce a new vector, R (the reflection vector), which is the direction in which the light would be reflected if the object was a mirror (see Figure 50). The brightness of specular reflection depends on the angle between R and V (the direction of the viewer). For Phong shading (developed by Bui Tuong Phong), we assume that the brightness is proportional to (R · V)ⁿ, where n = 1 corresponds to a slightly glossy surface and n → ∞ corresponds to a perfect mirror. We now have

   I = Ia ka Od + f Ip (kd Od (N · L) + ks (R · V)ⁿ)

where ks is the specular reflection coefficient and n is the specular reflection exponent.
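Before adding the reflection vector, it may help to see the model so far as code. This is a minimal sketch for one colour channel, with invented names; it implements exactly the formula above, including the max(N · L, 0) clamp and the attenuation factor f:

   #include <algorithm>

   struct Vec3 { double x, y, z; };

   double dot(const Vec3& a, const Vec3& b) {
       return a.x * b.x + a.y * b.y + a.z * b.z;
   }

   // I = Ia*ka + f*Ip*kd*max(N.L, 0), with f = 1/(C + L*d + Q*d^2).
   double diffuseIntensity(const Vec3& N, const Vec3& L, double d,
                           double Ia, double ka, double Ip, double kd,
                           double C, double Lc, double Q) {
       double f = 1.0 / (C + Lc * d + Q * d * d);      // attenuation
       double lambert = std::max(dot(N, L), 0.0);      // discard back-facing light
       return Ia * ka + f * Ip * kd * lambert;
   }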
[Figure 51: Calculating R. The diagram shows the incident vector L, the normal N, the reflection vector R, and the horizontal component S, with the projection N cos θ marked.]

Calculating the Reflection Vector

The reflection vector R is the mirror image of the incident vector L relative to the normal vector N. We assume that L and N are unit vectors. Consequently, the projection of L onto N is a vector with length cos θ in the direction of N, or N cos θ. As we can see from the right side of Figure 51:

   R = N cos θ + S

Similarly, from the left side of Figure 51:

   S = N cos θ - L

Adding these equations gives

   R + S = N cos θ + S + N cos θ - L

which simplifies to

   R = 2 N cos θ - L

Since cos θ = N · L, we can calculate R from

   R = 2 N (N · L) - L

To calculate specular reflection, we actually need R · V. The time needed for this calculation depends on the assumptions made about the light source and the viewer:

• If the light source and the viewer are assumed to be at infinity, R and V are both constant across the polygon and it is necessary to calculate R · V only once for the polygon.
• If the light source is assumed to be at infinity (a directional light) but the viewer is nearby, R is constant across the polygon but V varies, and we must calculate V and R · V for each pixel.
• If the light source is nearby (a positional light) and the viewer is nearby, both R and V vary across the polygon, and we must calculate R, V, and R · V for each pixel.
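The reflection formula translates directly into code. This sketch reuses the illustrative Vec3 and dot helpers from the previous sketch and adds the Phong specular term; the clamp on R · V is the usual guard against light reflected away from the viewer:

   #include <algorithm>
   #include <cmath>

   Vec3 scale(const Vec3& v, double s) { return Vec3{v.x * s, v.y * s, v.z * s}; }
   Vec3 sub(const Vec3& a, const Vec3& b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }

   // R = 2 N (N.L) - L, assuming N and L are unit vectors.
   Vec3 reflect(const Vec3& N, const Vec3& L) {
       return sub(scale(N, 2.0 * dot(N, L)), L);
   }

   // Phong specular contribution ks * (R.V)^n.
   double specular(const Vec3& N, const Vec3& L, const Vec3& V,
                   double ks, double n) {
       double rv = std::max(dot(reflect(N, L), V), 0.0);
       return ks * std::pow(rv, n);
   }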
9.1.7 Specular Colours

In practice, the colour of specular reflection is not completely independent of the colour of the object. To allow for this, we can give the object a specular colour Os. Then we have

   I = Ia ka Od + f Ip (kd Od (N · L) + ks Os (R · V)ⁿ)

This equation represents our final technique for lighting an object and is a close approximation to what OpenGL actually does.

9.1.8 Multiple Light Sources

If there are several light sources, we simply add their contributions. If the sum of the contributions exceeds 1, we can either "clamp" the value (that is, use 1 instead of the actual result) or reduce all values in proportion so that the greatest value is 1. Clamping is cheaper computationally and usually sufficient.

The actual calculation performed by OpenGL is:

   Vλ = Oeλ + Ma Oaλ + Σ (i = 0 to n-1) [ 1 / (kc + kl d + kq d²) ]i si [ Ia Oa + (N · L) Id Od + (R · V)^σ Is Os ]λi

where

   Vλ = Vertex brightness
   Ma = Ambient light model
   kc = Constant attenuation coefficient
   kl = Linear attenuation coefficient
   kq = Quadratic attenuation coefficient
   d  = Distance of light source from vertex
   si = Spotlight effect
   Ia = Ambient light
   Id = Diffuse light
   Is = Specular light
   Oe = Emissive brightness of material
   Oa = Ambient brightness of material
   Od = Diffuse brightness of material
   Os = Specular brightness of material
   σ  = Shininess of material (the specular exponent)

and the subscript λ indicates colour components, and the subscript i denotes one of the lights.

9.2 Polygon Shading

The objects in graphical models are usually defined as many small polygons, typically triangles or rectangles. We must choose a suitable colour for each visible pixel of a polygon: this is called polygon shading.
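Before looking at particular shading strategies, here is a hedged sketch of the per-vertex calculation above for one colour channel. It reuses the illustrative Vec3, dot, and reflect helpers from the earlier sketches; spotlight effects are omitted (si = 1), and all structure names are invented:

   #include <algorithm>
   #include <cmath>
   #include <vector>

   struct Light {
       Vec3 L;                  // unit vector from vertex towards the light
       double d;                // distance from vertex to light
       double Ia, Id, Is;       // ambient, diffuse, specular intensities
       double kc, kl, kq;       // attenuation coefficients
   };

   struct Material {
       double Oe, Oa, Od, Os;   // emissive, ambient, diffuse, specular
       double shininess;        // the exponent written as sigma above
   };

   double vertexBrightness(const Vec3& N, const Vec3& V, double Ma,
                           const Material& m, const std::vector<Light>& lights) {
       double sum = m.Oe + Ma * m.Oa;                   // emissive + global ambient
       for (const Light& lt : lights) {
           double att = 1.0 / (lt.kc + lt.kl * lt.d + lt.kq * lt.d * lt.d);
           double diff = std::max(dot(N, lt.L), 0.0);
           double spec = std::pow(std::max(dot(reflect(N, lt.L), V), 0.0),
                                  m.shininess);
           sum += att * (lt.Ia * m.Oa + diff * lt.Id * m.Od + spec * lt.Is * m.Os);
       }
       return std::min(sum, 1.0);                       // clamp, as described above
   }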
9.2.1 Flat Shading

In flat shading, we compute a vector normal to the polygon and use it to compute the colour for every pixel of the polygon. The computation implicitly assumes that:

• the polygon is really flat (not an approximation to a curved surface);
• N · L is constant (the light source is infinitely far away); and
• N · V is constant (the viewer is infinitely far away).

Flat shading is efficient computationally but not very satisfactory: the edges of the polygons tend to be visible and we see a polyhedron rather than the surface we are trying to approximate. (The edges are even more visible than we might expect, due to the subjective Mach effect, which exaggerates a change of colour along a line.)

9.2.2 Smooth Shading

In smooth shading, we compute normals at the vertices of the polygons, averaging over the polygons that meet at the vertex. If we are using polygons to approximate a smooth surface, these vectors approximate the true surface normals at the vertices. We compute the colour at each vertex and then colour the polygons by interpolating the colours at interior pixels.

Smooth shading of coloured objects requires interpolating colour values. (Suppose we have a line AB and we know the colour at A and the colour at B. Then interpolation enables us to calculate all the colours between A and B.) It is not clear that interpolation of colours is even possible. However, in Section 10 we will discover that interpolation is indeed possible and not even very difficult.

There are several varieties of smooth shading. The most important are Gouraud shading⁵ and Phong shading.⁶

Gouraud Shading is a form of smooth shading that uses a particular kind of interpolation for efficiency.

1. Compute the normal at each vertex of the polygon mesh. For analytical surfaces, such as spheres and cones, we can compute the normals exactly. For surfaces approximated by polygons, we use the average of the surface normals at each vertex.
2. Compute the light intensity for each colour at each vertex using a lighting model (e.g., Section 9.1.7 above).
3. Interpolate intensities along the edges of the polygons.
4. Interpolate intensities along scan lines within the polygons.

Figure 52 illustrates Gouraud shading. We assume a rectangular viewing window, W, and a polygon with vertices v1, v2, v3 to be displayed. The first step calculates the colours at the vertices; the second step interpolates between the vertices to obtain the colours along the edges of the polygon. When a scan line s crosses the polygon, find the points p1 and

⁵ Henri Gouraud. Continuous Shading of Curved Surfaces. IEEE Trans. Computers, C-20(6), June 1971, 623-9.
⁶ Bui-Tuong Phong. Illumination for Computer Generated Pictures. Comm. ACM, 18(6), June 1975, 311-7.
p2 where it crosses the edges of the polygon and find the colours there. Finally, interpolate colours between p1 and p2 to obtain the correct colour for each pixel on the scan line.

[Figure 52: Gouraud Shading. A polygon with vertices v1, v2, v3 inside a viewing window W; a scan line s crosses the polygon's edges at points p1 and p2.]

Phong Shading is similar to Gouraud shading but interpolates the normals rather than the intensities. Phong shading requires more computation than Gouraud shading but gives better results, especially for specular highlights.

Figure 53 illustrates Phong shading. The scan line is s and the edges of the polygon are at p1 and p2. The averaged normal vectors at these points are v1 and v2. The algorithm moves to each pixel between p1 and p2 and computes the normal vector there by interpolating between v1 and v2.

[Figure 53: Phong Shading. A scan line s crossing a polygon at p1 and p2, with the averaged normal vectors v1 and v2 at those points and interpolated normals between them.]
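Both methods reduce to linear interpolation along edges and scan lines. A sketch (reusing the earlier illustrative Vec3 helpers): Gouraud interpolates colours directly, while Phong interpolates normals, which must be renormalized before lighting each pixel:

   #include <cmath>

   struct Colour { double r, g, b; };

   // Linear interpolation: t = 0 gives a, t = 1 gives b.
   double lerp(double a, double b, double t) { return a + (b - a) * t; }

   Colour lerpColour(const Colour& a, const Colour& b, double t) {
       return Colour{ lerp(a.r, b.r, t), lerp(a.g, b.g, t), lerp(a.b, b.b, t) };
   }

   Vec3 lerpNormal(const Vec3& a, const Vec3& b, double t) {
       Vec3 v{ lerp(a.x, b.x, t), lerp(a.y, b.y, t), lerp(a.z, b.z, t) };
       double len = std::sqrt(dot(v, v));   // interpolated normals are not
       return scale(v, 1.0 / len);          // unit vectors: renormalize
   }

   // Along a scan line from p1 (colour c1, normal n1) to p2 (colour c2,
   // normal n2), a Gouraud shader emits lerpColour(c1, c2, t) at each pixel,
   // whereas a Phong shader evaluates the lighting model at each pixel
   // using lerpNormal(n1, n2, t).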
10 The Theory of Light and Colour

It is possible to include diagrams in these notes (e.g., the CIE chromaticity diagram) but this has the effect of making the files much larger (megabytes rather than kilobytes). Consequently, these notes are mainly text and the diagrams can be downloaded from the course website.

The purpose of this section is to provide partial answers to the questions:

• What is light?
• How do we perceive light?
• How do we create the illusion of light and colour on a computer screen?

The first two questions have simple answers that are not very useful: light consists of photons with various energy levels; and we perceive light when photons cause chemical changes in the retinas of our eyes. We need to know a bit more in order to understand how it is possible to create quite good illusions with relatively simple equipment.

10.1 Physiology of the Eye

The eye has many parts; the most important for this discussion are:

• The lens and cornea at the front of the eye, which focus light onto the retina at the back of the eye.
• The iris diaphragm, which enables the eye to control the size of the lens aperture and hence the amount of light reaching the retina.
• The retina, which consists of light-sensitive cells.

The light-sensitive cells of the retina are of two kinds: rods and cones. (The names "rod" and "cone" come from the shapes of the cells.)

• Rods are very sensitive to intensity but do not distinguish colours well. Most rods are positioned away from the centre of the retina. At low levels of illumination (e.g., at night), we see mainly with our rods: we don't see much colour, and very dim lights are best seen by looking to one side of them.
• Cones are about a thousand times less sensitive to light than rods, but they do perceive colours. The centre of the retina has the highest density of cones, and that is where our vision is sharpest and we are most aware of colour. There is a small region, called the fovea, where the density of cones is highest. When we "look at" an object, we have adjusted our eyes so that the image of that object falls onto the fovea.
• There are three kinds of cones: roughly speaking, they respond to red, green, and blue light. In reality, there is a lot of overlap, and each cone responds to some extent to all colours.

If ears were like eyes, we could distinguish only low-, medium-, and high-pitched sounds; we could not use speech to communicate and we could not enjoy music. The ear achieves its much finer discrimination with the aid of approximately 16,000 receptors (the "hair cells"), each responding to a slightly different frequency. The trade-off, of course, is that ears cannot determine the direction of the source of sound precisely.
   Device               Dynamic Range   Perceptible levels
   Cathode-ray tube     50 - 200        400 - 550
   Photographic print   100             450
   Photographic slide   1000            700
   Newsprint            10              200

Figure 54: Dynamic range and perceptible steps for various devices

10.2 Achromatic Light

Before considering coloured light, we will have a brief look at achromatic light: literally, light without colour. More precisely, in this section we will ignore the coloured components that all light has and consider only the brightness, or intensity, of the light.

The eye responds to a wide range of brightnesses. The dimmest light that we can respond to is about 10⁻⁶ cd/m² (where 'cd' stands for candelas and 'm' for metres). At this intensity, each visual receptor in the eye is receiving about one photon every 10 minutes; the reason we can see anything at all is that the eye can integrate the responses of many receptors. The brightest light that we can respond to without damaging the retina is about 10⁸ cd/m², or 10¹⁴ times as much light.

Not surprisingly, given this range, our response to light intensity is not linear but logarithmic. A trilight typically can be set to emit 50, 100, or 150 watts. We see the step from 50 to 100 watts as being greater than the step from 100 to 150 watts. To achieve even steps, the trilight should have settings of 50, 100, and 200 watts, so that each setting doubles the intensity.

Suppose that we have a device, such as a computer monitor, that can emit light at various intensities. There will be a minimum intensity Imin and a maximum intensity Imax, where Imin should be set so that we can just see the effect and Imax is probably the highest intensity that the device is capable of. The intermediate intensities should be Imin rⁿ, where 0 ≤ n ≤ N and r is chosen so that Imin r^N = Imax, or r = (Imax / Imin)^(1/N). The ratio Imax / Imin is called the dynamic range of the device.

Ideally, the steps should be small enough that we cannot detect the step from Imin rⁿ to Imin r^(n+1). The number of steps needed depends on the dynamic range. Figure 54 shows the dynamic range and number of perceptible steps for some common devices.

If we use one byte (8 bits) to encode intensity, we have 2⁸ = 256 levels. From Figure 54, we can see that this is enough for newsprint but not for the other devices, CRTs in particular. However, if we use three bytes (24 bits) to encode three colours, we have 2²⁴ ≈ 16 million distinct codes, which should provide enough levels of brightness for most purposes.
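Computing the levels is straightforward. The following small, self-contained program sketches the calculation for an invented dynamic range of 50 with N = 8 steps:

   #include <cmath>
   #include <cstdio>

   int main() {
       const double Imin = 0.02, Imax = 1.0;   // illustrative dynamic range of 50
       const int N = 8;                        // number of steps
       double r = std::pow(Imax / Imin, 1.0 / N);
       for (int n = 0; n <= N; ++n)
           std::printf("level %d: %.4f\n", n, Imin * std::pow(r, n));
       // Each level is r times brighter than the previous one, so the steps
       // appear equal to the eye's logarithmic response.
       return 0;
   }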
10.3 Coloured Light

As mentioned in Section 10.1, the cones in our eyes distinguish colours. "Colour" is a sensation; the physical cause of colour is the wavelength of photons. Most light consists of a large number of photons with different wavelengths. (An important exception is the photons emitted from a laser, which all have the same wavelength; laser light is called monochromatic.) We see photons as light if they have a wavelength between 400 nm and 700 nm ('nm' stands for nanometre and 1 nm = 10⁻⁹ metre). Photons with long wavelengths look red and photons with short wavelengths look blue. We can detect photons with longer wavelength than red light, but we call the effect "heat" rather than "light".

A light source has a power spectrum which associates a power, or intensity, with each wavelength. It is common practice to use λ to stand for a wavelength. We describe a source of light as a function, P(λ), which gives the power of the source at each wavelength. Similarly, the response of a receptor is also a function of the wavelength of the light. We can write R(λ), G(λ), and B(λ) for the responses of the red, green, and blue receptors respectively. The corresponding curves have a single hump that corresponds to the wavelength of greatest sensitivity. Our response to a light source is a triple of three real numbers (r, g, b), where

   r = ∫ R(λ) P(λ) dλ
   g = ∫ G(λ) P(λ) dλ
   b = ∫ B(λ) P(λ) dλ

with each integral taken over the visible range, λ = 400 to λ = 700.

It looks from these equations as if the analysis of colour vision would be alarmingly complex. Fortunately, the eye has interesting properties that enable us to make significant simplifications. The first thing to notice is that although there are very many possible power spectra (essentially any set of values defined between λ = 400 and λ = 700 defines a power spectrum), our perception is confined to the three numbers (r, g, b). This means that if two different sources of light, with different power spectra, give the same values of (r, g, b), then we cannot distinguish them. Two light sources that appear to have the same colour are called metamers. When we use the common expression "adding red and green gives yellow", what we are really saying is that a light source with red and green components, and another light source with a single yellow component, are metamers.

The following results are based on experiments performed by asking people what they see. To obtain objectivity, the subjects are not asked questions like "Does this look green or blue to you?" but instead are asked "Are these colours the same or different?" Since people provide highly consistent answers to this question, colour theory has become an objective science.

The first phenomenon is this: suppose X and Y are metamers. Then, for any colour Z, X + Z and Y + Z are metamers. The "+" sign here stands for adding light. For example, we might have two projectors projecting pools of light X and Y onto a screen. Viewers state that the two pools of light have the same colour. A third projector, emitting light Z, is then switched on. The viewers will agree that the areas where light from both projectors X and Z hits the screen are indistinguishable from the areas where light from projectors Y and Z hits the screen.
Next, light of any colour can be obtained by mixing light from three colours in the right proportions. (This statement is not precisely true: the exceptions will be discussed shortly.) The three colours are called primaries. Suppose that we use colours R, G, and B as primaries (although the names suggest red, green, and blue, we do not have to use these colours as primaries). Then the claim is that, for any colour X, we can choose factors α, β, and γ such that

   X = α R + β G + γ B

This in itself is not very interesting. The interesting part is that the linear formula is not just a convenient way of writing: colour mixing really is linear. Suppose we have two colours X and X′ and we have found appropriate weightings of the primary colours for them:

   X  = α R + β G + γ B
   X′ = α′ R + β′ G + γ′ B

Then we can obtain X + X′ simply by summing the primary weights:

   X + X′ = (α + α′) R + (β + β′) G + (γ + γ′) B

This implies that colour has the properties of a three-dimensional vector space and that any three colours form a basis for this space.

As mentioned, there are problems with this representation. The first one is obvious: the three primaries must be linearly independent. The system would not work if we could find α and β such that B = α R + β G.

There is a more serious problem. It is true that we can represent all colours in the form X = α R + β G + γ B, where now R, G, and B actually do stand for red, green, and blue. The problem is that some of the values of α are negative! In fact, if we use any three visible colours as a basis, we will need negative coefficients to obtain some colours. "Negative colours", of course, do not exist in reality. We can interpret an equation of the form

   C = 0.2 R - 0.1 G + 0.8 B

which apparently defines a colour C with "negative green", as

   C + 0.1 G = 0.2 R + 0.8 B

That is, we must add green to C to match the equivalent colour 0.2 R + 0.8 B.

There are three technical terms that we use when discussing colours.

1. The brightness of a colour is called its luminance. Colours that we consider different may in fact be the "same" colour with a different luminance. For example, olive brown is dark yellow.
2. The "colour" of a colour is called its hue. Red and green are different hues, but yellow and olive brown (see above) have the same hue and differ only in luminance.
3. Suppose that we compare a colour with a grey of the same luminance. Their closeness depends on the saturation of the colour. A highly saturated colour, such as pure red, is far from grey, but an unsaturated colour, such as pink, is close to a light shade of grey. The deep blue that we see in Finnish pottery and Islamic mosques is saturated, whereas sky blue is unsaturated.
10.4 The CIE System

The problem of negative coefficients was recognized many years ago and, in 1931, the Commission Internationale de l'Eclairage (CIE for short) introduced a system of tristimulus values for defining colour. The CIE primaries are called X, Y, and Z. Their properties are:

• They are super-saturated colours that cannot actually be perceived.
• All of the colours that we can perceive can be expressed as x X + y Y + z Z with positive values of x, y, and z.
• The Y curve matches the sensitivity of the eye: it is low at the ends of the spectrum and has a peak corresponding to the dominant colour of the sun, to which our eyes are adapted. Consequently, the CIE Y component of a light source is equivalent to the luminance of the source.

The CIE system enables us to describe a light source in two ways. Tristimulus values are the actual values of the three components: X, Y, and Z. Chromaticity values are the normalized versions of the tristimulus values:

   x = X / (X + Y + Z)
   y = Y / (X + Y + Z)
   z = Z / (X + Y + Z)

The tristimulus values describe what we see (hence the term "tristimulus"): the colour of the light and its intensity. The chromaticity values are normalized and do not describe the brightness. However, since x + y + z = 1, only two are needed and, in practice, we describe a light source using (x, y, Y) coordinates. From x, y, and Y, we can recover the other values as

   z = 1 - x - y
   X = x Y / y
   Z = z Y / y

Now imagine all of the values of (x, y, z) that correspond to visible colours. (Note that "every visible colour can be represented by suitable values of (x, y, z)" is not the same as the converse "every value of (x, y, z) corresponds to a visible colour".) These values form a cone with very dim colours near the origin and brighter colours further out. The chromaticity values, for which x + y + z = 1, form a plane that intersects the cone. The appearance of the visible colours on this plane is called the CIE Chromaticity Diagram.⁷

The linearity property says that, if we take two points (i.e., colours) on the CIE diagram, we can obtain the points (i.e., colours) on the line between them by mixing the colours. If we take three colours forming a triangle, we can obtain any colour inside the triangle by mixing the colours. The triangle is called a gamut.

⁷ The course web site has links to the chromaticity diagram and an applet that allows you to experiment with it.
Since the sides of the CIE diagram are curved, we cannot find three points (corresponding to visible colours) that enclose the entire area. Consequently, any physical device that relies on three primary colours cannot generate all perceptible colours. Nevertheless, some "devices", such as slide film, cover a large proportion of the CIE diagram.

Corresponding to each wavelength in the visible spectrum, there is a pure colour. CIE Chromaticity Coordinate tables give the XYZ components for discrete wavelengths: Figure 55 shows a typical table. For a single wavelength, we can read the XYZ components directly. For a general light source, we compute a sum. (Strictly, we should use continuous functions and integrate; the summation provides a good enough approximation and is much easier to compute.) Suppose that our light source has a power spectrum P(λ) for values of λ (the wavelength) in the visible spectrum. In practice, we would measure P(λ) at discrete values, such as λ = 380, 385, ..., 825 if we were using the table in Figure 55. To obtain the tristimulus values corresponding to this light source, we compute

   X = Σ P(λ) x̄(λ)
   Y = Σ P(λ) ȳ(λ)
   Z = Σ P(λ) z̄(λ)

where x̄, ȳ, and z̄ are taken from the table in Figure 55.

10.4.1 Using Gamuts

Devices for printing or displaying colour use three or more sources of colour in combination. We will consider only computer monitors, which have three guns firing electrons at phosphors (it is the phosphors that create the colours, not the electrons) or use LCDs to display colour. (Colour television uses the same principle.) Both technologies are based on red, green, and blue (RGB) primaries.

We can represent the colours of the primaries (without their luminance) as coordinates on the CIE Chromaticity Diagram. Suppose that the coordinates are (xr, yr), (xg, yg), and (xb, yb). Let

   Ci = Xi + Yi + Zi
   zi = 1 - xi - yi
   Xi = xi Ci
   Yi = yi Ci
   Zi = zi Ci

where i ∈ {r, g, b} and (Xi, Yi, Zi) are the XYZ coordinates of the colours that the monitor can display. Then the relationship between the XYZ coordinates and RGB values (that is, the signals we send to the device) is:

   [X]   [Xr Xg Xb] [R]   [xr xg xb] [Cr 0  0 ] [R]
   [Y] = [Yr Yg Yb] [G] = [yr yg yb] [0  Cg 0 ] [G]
   [Z]   [Zr Zg Zb] [B]   [zr zg zb] [0  0  Cb] [B]

(To see how this relationship is derived, consider the RGB signals (1, 0, 0) (pure red), (0, 1, 0) (pure green), and (0, 0, 1) (pure blue).)
   λ (nm)  x̄              ȳ              z̄
   380     2.689900e-003  2.000000e-004  1.226000e-002
   385     5.310500e-003  3.955600e-004  2.422200e-002
   390     1.078100e-002  8.000000e-004  4.925000e-002
   395     2.079200e-002  1.545700e-003  9.513500e-002
   400     3.798100e-002  2.800000e-003  1.740900e-001
   405     6.315700e-002  4.656200e-003  2.901300e-001
   410     9.994100e-002  7.400000e-003  4.605300e-001
   415     1.582400e-001  1.177900e-002  7.316600e-001
   420     2.294800e-001  1.750000e-002  1.065800e+000
   425     2.810800e-001  2.267800e-002  1.314600e+000
   430     3.109500e-001  2.730000e-002  1.467200e+000
   435     3.307200e-001  3.258400e-002  1.579600e+000
   440     3.333600e-001  3.790000e-002  1.616600e+000
   445     3.167200e-001  4.239100e-002  1.568200e+000
   450     2.888200e-001  4.680000e-002  1.471700e+000
   455     2.596900e-001  5.212200e-002  1.374000e+000
   460     2.327600e-001  6.000000e-002  1.291700e+000
   465     2.099900e-001  7.294200e-002  1.235600e+000
   470     1.747600e-001  9.098000e-002  1.113800e+000
   475     1.328700e-001  1.128400e-001  9.422000e-001
   480     9.194400e-002  1.390200e-001  7.559600e-001
   485     5.698500e-002  1.698700e-001  5.864000e-001
   490     3.173100e-002  2.080200e-001  4.466900e-001
   495     1.461300e-002  2.580800e-001  3.411600e-001
   500     4.849100e-003  3.230000e-001  2.643700e-001
   505     2.321500e-003  4.054000e-001  2.059400e-001
   510     9.289900e-003  5.030000e-001  1.544500e-001
   515     2.927800e-002  6.081100e-001  1.091800e-001
   520     6.379100e-002  7.100000e-001  7.658500e-002
   525     1.108100e-001  7.951000e-001  5.622700e-002
   530     1.669200e-001  8.620000e-001  4.136600e-002
   535     2.276800e-001  9.150500e-001  2.935300e-002
   540     2.926900e-001  9.540000e-001  2.004200e-002
   545     3.622500e-001  9.800400e-001  1.331200e-002
   550     4.363500e-001  9.949500e-001  8.782300e-003
   555     5.151300e-001  1.000100e+000  5.857300e-003
   560     5.974800e-001  9.950000e-001  4.049300e-003
   565     6.812100e-001  9.787500e-001  2.921700e-003
   570     7.642500e-001  9.520000e-001  2.277100e-003
   575     8.439400e-001  9.155800e-001  1.970600e-003
   580     9.163500e-001  8.700000e-001  1.806600e-003
   585     9.770300e-001  8.162300e-001  1.544900e-003
   590     1.023000e+000  7.570000e-001  1.234800e-003
   595     1.051300e+000  6.948300e-001  1.117700e-003
   600     1.055000e+000  6.310000e-001  9.056400e-004
   605     1.036200e+000  5.665400e-001  6.946700e-004
   610     9.923900e-001  5.030000e-001  4.288500e-004
   615     9.286100e-001  4.417200e-001  3.181700e-004
   620     8.434600e-001  3.810000e-001  2.559800e-004
   625     7.398300e-001  3.205200e-001  1.567900e-004
   630     6.328900e-001  2.650000e-001  9.769400e-005
   635     5.335100e-001  2.170200e-001  6.894400e-005
   640     4.406200e-001  1.750000e-001  5.116500e-005
   645     3.545300e-001  1.381200e-001  3.601600e-005
   650     2.786200e-001  1.070000e-001  2.423800e-005
   655     2.148500e-001  8.165200e-002  1.691500e-005
   660     1.616100e-001  6.100000e-002  1.190600e-005
   665     1.182000e-001  4.432700e-002  8.148900e-006
   670     8.575300e-002  3.200000e-002  5.600600e-006
   675     6.307700e-002  2.345400e-002  3.954400e-006
   680     4.583400e-002  1.700000e-002  2.791200e-006
   685     3.205700e-002  1.187200e-002  1.917600e-006
   690     2.218700e-002  8.210000e-003  1.313500e-006
   695     1.561200e-002  5.772300e-003  9.151900e-007
   700     1.109800e-002  4.102000e-003  6.476700e-007
   705     7.923300e-003  2.929100e-003  4.635200e-007
   710     5.653100e-003  2.091000e-003  3.330400e-007
   715     4.003900e-003  1.482200e-003  2.382300e-007
   720     2.825300e-003  1.047000e-003  1.702600e-007
   725     1.994700e-003  7.401500e-004  1.220700e-007
   730     1.399400e-003  5.200000e-004  8.710700e-008
   735     9.698000e-004  3.609300e-004  6.145500e-008
   740     6.684700e-004  2.492000e-004  4.316200e-008
   745     4.614100e-004  1.723100e-004  3.037900e-008
   750     3.207300e-004  1.200000e-004  2.155400e-008
   755     2.257300e-004  8.462000e-005  1.549300e-008
   760     1.597300e-004  6.000000e-005  1.120400e-008
   765     1.127500e-004  4.244600e-005  8.087300e-009
   770     7.951300e-005  3.000000e-005  5.834000e-009
   775     5.608700e-005  2.121000e-005  4.211000e-009
   780     3.954100e-005  1.498900e-005  3.038300e-009
   785     2.785200e-005  1.058400e-005  2.190700e-009
   790     1.959700e-005  7.465600e-006  1.577800e-009
   795     1.377000e-005  5.259200e-006  1.134800e-009
   800     9.670000e-006  3.702800e-006  8.156500e-010
   805     6.791800e-006  2.607600e-006  5.862600e-010
   810     4.770600e-006  1.836500e-006  4.213800e-010
   815     3.355000e-006  1.295000e-006  3.031900e-010
   820     2.353400e-006  9.109200e-007  2.175300e-010
   825     1.637700e-006  6.356400e-007  1.547600e-010

Figure 55: CIE Chromaticity Coordinates
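The summation over a table like Figure 55 is easy to implement. A sketch, with invented names, assuming the power spectrum has been sampled at the same wavelengths as the table:

   #include <cstddef>

   struct CieRow { double lambda, xbar, ybar, zbar; };

   struct XYZ { double X, Y, Z; };

   // 'table' holds the rows of Figure 55; 'P' holds the measured power at
   // the matching wavelengths.
   XYZ tristimulus(const CieRow* table, const double* P, std::size_t n) {
       XYZ t{0, 0, 0};
       for (std::size_t i = 0; i < n; ++i) {
           t.X += P[i] * table[i].xbar;
           t.Y += P[i] * table[i].ybar;
           t.Z += P[i] * table[i].zbar;
       }
       return t;
   }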
The characteristics of a particular device are defined by its values of Cr, Cg, and Cb. These can be obtained in either of two ways:

1. We can use a photometer to measure the luminance levels Yr, Yg, and Yb directly, with the monitor set to maximum brightness for the corresponding colour. Then:

      Cr = Yr / yr
      Cg = Yg / yg
      Cb = Yb / yb

2. A more common method is to measure the XYZ coordinates (Xw, Yw, Zw) of the monitor's white (that is, RGB coordinates (1, 1, 1)) and then solve the following equation for the Ci's:

      [Xw]   [xr xg xb] [Cr]
      [Yw] = [yr yg yb] [Cg]
      [Zw]   [zr zg zb] [Cb]

   In most cases, the value we know is actually (xw, yw, Yw): the (x, y) position on the Chromaticity Diagram and the luminance of white. In this case, the equation above has the following solution:

      Cr = k [xw (yg - yb) - yw (xg - xb) + xg yb - xb yg]
      Cg = k [xw (yb - yr) - yw (xb - xr) - xr yb + xb yr]
      Cb = k [xw (yr - yg) - yw (xr - xg) + xr yg - xg yr]

   where

      k = Yw / ( yw [xr (yg - yb) + xg (yb - yr) + xb (yr - yg)] )

The International Electrotechnical Commission has a standard (IEC 61966-2-1) that defines the "D65 white point" (so called because it corresponds to a black body radiating at 6500 K) with chromaticity values (0.3127, 0.3290, 0.3583).

The following table shows chromaticity coordinates for a typical monitor:

   Colour   Coordinates     x      y      z
   Red      (xr, yr, zr)    0.628  0.346  0.026
   Green    (xg, yg, zg)    0.268  0.588  0.144
   Blue     (xb, yb, zb)    0.150  0.070  0.780

The RGB/XYZ mappings for this monitor are:

   [R]   [ 3.240479  -1.537150  -0.498535] [X]
   [G] = [-0.969256   1.875992   0.041556] [Y]
   [B]   [ 0.055648  -0.204043   1.057311] [Z]
and

   [X]   [0.412453  0.357580  0.180423] [R]
   [Y] = [0.212671  0.715160  0.072169] [G]
   [Z]   [0.019334  0.119193  0.950227] [B]

10.5 Other Colour Systems

There are several other systems for representing colour. All of those that we describe here are linearly related to the CIE system. In other words, we can transform from one system to any other system using a 3 × 3 matrix.

10.5.1 RGB

The RGB system uses red, green, and blue as primaries, with coefficients between 0 and 1. We can visualize RGB colours as a cube with black ((0, 0, 0)) at one corner and white ((1, 1, 1)) at the opposite corner. Other corners are coloured red, green, blue, cyan (green + blue), magenta (red + blue), and yellow (red + green). RGB is used for computer graphics because cathode-ray monitors have red, green, and blue phosphors, and LCD monitors have been designed for compatibility.

10.5.2 CMY

The CMY system is the inverse of RGB and it is used for printing, where colours are subtractive rather than additive. The letters stand for cyan, magenta, and yellow. The relationship is simply

   [C]   [1]   [R]
   [M] = [1] - [G]
   [Y]   [1]   [B]

CMY is often extended to CMYK, where K is black. This is simply an economy, because CMYK is used for printers and it is cheaper to print black with black ink than to achieve an approximate black by mixing cyan, magenta, and yellow. High-quality colour printers extend the range still further, usually by adding light cyan and light magenta inks, giving a total of six different inks.

10.5.3 YIQ

When colour television was introduced, there were many users with monochrome (so-called "black and white" or b/w) receivers. Broadcast engineers had to solve three problems:

• Transmit a colour signal.
• Ensure compatibility: a b/w receiver must be able to produce a reasonable picture from a colour signal.
• Ensure recompatibility: a colour receiver must be able to reproduce a b/w signal (e.g., an old movie).
Transmitting an RGB signal does not work because it is not compatible: a b/w receiver shows RGB as shades of grey, with no deep blacks or bright whites. The YIQ system was adopted by the US National Television System Committee (NTSC) for colour TV. The YIQ colour solid is a linear transformation of the RGB cube. Its purpose is to exploit certain characteristics of the human eye to maximize the utilization of a fixed bandwidth. The human visual system is more sensitive to changes in luminance than to changes in hue or saturation, and thus a wider bandwidth should be dedicated to luminance than to colour information. Y is similar to perceived luminance; I and Q carry colour information and some luminance information. The Y signal usually has 4.2 MHz bandwidth in a 525-line system. Originally, I and Q had different bandwidths (1.5 and 0.6 MHz), but now they commonly have the same bandwidth of 1 MHz. [Adapted from information on Nan C. Schaller's web page.]

The CIE values for the standard NTSC phosphors are (0.67, 0.33) for red, (0.21, 0.71) for green, and (0.14, 0.08) for blue. The white point is at (xw, yw, Yw) = (0.31, 0.316, 1.0). The equations for converting between YIQ and RGB are

   [Y]   [0.299   0.587   0.114] [R]
   [I] = [0.596  -0.275  -0.321] [G]
   [Q]   [0.212  -0.523   0.311] [B]

and

   [R]   [1   0.956   0.621] [Y]
   [G] = [1  -0.272  -0.647] [I]
   [B]   [1  -1.105   1.702] [Q]

The ranges are 0 ≤ Y ≤ 1, -1 ≤ I ≤ 1, and -1 ≤ Q ≤ 1.

10.6 Gamma Correction

The response of a monitor is nonlinear. To a good approximation,

   I = k V^γ

where I is the light intensity seen by the viewer, k is a constant, V is the voltage at the electron gun, and γ is another constant. Since it is γ that causes the nonlinearity, we need gamma correction to compensate for it. Typical values are 2.3 ≤ γ ≤ 2.6, with γ = 2.2 often being assumed.
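In software, gamma correction amounts to raising each intensity to the power 1/γ before sending it to the display, so that the display's own power law cancels out. A minimal sketch, assuming γ = 2.2:

   #include <cmath>

   // 'linear' is the desired intensity in [0, 1]; the returned value is the
   // signal to send to the display, which will then produce k * V^gamma.
   double gammaCorrect(double linear, double gamma = 2.2) {
       return std::pow(linear, 1.0 / gamma);
   }

   // Example: to display 50% intensity on a gamma = 2.2 monitor we must send
   // gammaCorrect(0.5) = 0.5^(1/2.2), which is roughly 0.73.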
11 Advanced Techniques

Previous sections have focused mainly on rendering techniques that are provided by OpenGL. In this section, we look briefly at two techniques that are not provided directly by OpenGL, although they can be simulated. Amongst serious graphics programmers, OpenGL is considered to be a rather simple system with rather limited capabilities. The important advantage of OpenGL, and the reason for its popularity, is that it is simple and fast. The techniques described in this section are very slow by comparison. Although they are acceptably fast with appropriate simplifications and modern hardware, the earliest implementations required hundreds, or in some cases thousands, of hours of mainframe computer time to produce good images.

11.1 Ray-Tracing

OpenGL computes the colour of each vertex of the scene. This is a potentially wasteful process, since many vertexes never reach the viewing window: they may be clipped because they are outside the viewing volume or invisible because there is an object between them and the viewer. Ray-tracing avoids this source of inefficiency by computing the colour of each pixel on the screen. This avoids wasting time by computing the colour of invisible objects, but ray-tracing introduces inefficiencies of its own.

[Figure 56: Ray Tracing. A ray from the viewer V through a pixel P on the screen extends into the scene and meets objects at the points O1, O2, O3, and O4.]

Figure 56 illustrates the basic ideas of ray-tracing. The viewer is at V and P is a pixel on the screen. A line drawn from V to P and extended into the scene meets objects in the scene at points O1, O2, O3, and O4. The colour at pixel P, from the point of view of the observer at V, must come from point O1, since the other points on the line are hidden. The line VP is called a ray. Although the light actually travels from the object O1 to the viewer V, the calculation traces the ray backwards, from the viewer to the object. Thus the basic ray-tracing algorithm, in pseudocode, is:

   for each pixel, P:
      Construct the line from the viewer, VP.
      Find the points O1, O2, O3, ..., On where VP meets surfaces in the scene.
      Find Omin, the closest Oi to the viewer.
      Compute the colour at Omin and set the colour of pixel P to this value.

To make this more precise, we introduce some vectors:

   e = the displacement of the viewer's eye from the origin
   n = the direction of the viewer with respect to the centre of the scene
   v = the vertical direction ("up vector")
   u = the direction right on the viewing screen

The unit vectors u, v, and n form a right-handed coordinate system corresponding to XYZ in OpenGL camera coordinates. Suppose the screen has width W, with number of columns Nc, and height H, with number of rows Nr. The horizontal displacement depends only on the column number and the vertical displacement depends only on the row number. Thus for the pixel at column c and row r, we have

   uc = W (2c / Nc - 1)
   vr = H (2r / Nr - 1)

If we assume that the position of the screen with respect to the origin is N in the -n direction (as in OpenGL), then the pixel with screen coordinates (r, c) has position

   p = e - N n + uc u + vr v

and the parametric equation of the line joining the viewer's eye to this point is

   L(t) = e (1 - t) + (e - N n + uc u + vr v) t                    (25)

In this equation, t = 0 corresponds to the eye position, e, and t = 1 corresponds to the pixel on the screen. Consequently, points in the scene on this line will have values t ≥ 1, and larger values of t correspond to greater distance from the viewer. We can write (25) in the form

   L(t) = e + d t                                                  (26)

where

   d = -N n + W (2c / Nc - 1) u + H (2r / Nr - 1) v

Since e is fixed, it is necessary only to calculate d to find the line equation for each pixel. We can now write the ray-tracing algorithm more precisely:

   for (r = 0; r < Nr; r++)
      for (c = 0; c < Nc; c++)
         find L(t)
         for each surface S in the scene such that L(ts) ∈ S
            store min{ts}
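The inner computation of this loop, constructing d for a pixel, follows equation (26) directly. A sketch, reusing the earlier illustrative Vec3 helpers, with e, u, v, and n assumed to be set up as described above:

   Vec3 add(const Vec3& a, const Vec3& b) {
       return Vec3{a.x + b.x, a.y + b.y, a.z + b.z};
   }

   Vec3 rayDirection(int r, int c, int Nr, int Nc,
                     double W, double H, double N,
                     const Vec3& u, const Vec3& v, const Vec3& n) {
       double uc = W * (2.0 * c / Nc - 1.0);   // horizontal offset on the screen
       double vr = H * (2.0 * r / Nr - 1.0);   // vertical offset on the screen
       // d = -N n + uc u + vr v; the ray is then L(t) = e + d t.
       return add(add(scale(n, -N), scale(u, uc)), scale(v, vr));
   }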
Finding the Intersections

The next problem is computing the intersections. If the surface is a simple mathematical object, such as a cube (or general rectangular shape), a cylinder, or a sphere (or general ellipsoid), we can compute the intersection using a formula, as we will show below. This means that, for simple objects, ray-tracing is an exact method, in contrast to OpenGL, in which objects are modelled by a polygonal mesh. This is one reason why ray-traced scenes look sharper and clearer than scenes rendered by OpenGL.

Here is one approach to computing the intersections. We assume that the surface is the set of points that satisfy an equation of the form F(p) = 0. That is, the surface S consists of the points S = { p | F(p) = 0 }. To find the intersection of the ray L(t) with the surface F(p) = 0, we solve the equation F(L(t)) = 0 for t. From (26) above, this is equivalent to solving F(e + d t) = 0 for t.

Transformations

It is only on rare occasions that we will need to draw a unit sphere at the origin. How should we handle the case of a general sphere with an equation like (x - a)² + (y - b)² + (z - c)² = r²? We could set up the equation and solve it as above, but there is an alternative way. Just as in OpenGL, we can use translating, rotating, and scaling transformations to transform the unit sphere into a general ellipsoid.

Suppose that we have a canonical surface (e.g., a unit sphere at the origin) F(p) = 0 and a transformation T that transforms it into something more general (e.g., a football flying through a goal mouth). Suppose also that q is a point on the transformed object, so that T(p) = q. Given q, we can find p by inverting the transformation: p = T⁻¹ q. Since F(p) = 0, it follows that F(T⁻¹ q) = 0. In other words, the transformed surface is the set of points { q | F(T⁻¹ q) = 0 }.

The method for finding the intersection of a ray with a canonical object F that has been transformed by T is therefore to solve F(T⁻¹(e + d t)) = 0. As an optimization, we note that, since the transform and its inverse are usually linear, the equation can be written F(T⁻¹(e) + T⁻¹(d t)) = 0, in which T⁻¹(e) is a constant (that is, independent of t).
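One way to organize this in code is to store each object as a canonical surface together with a precomputed inverse transform, and to pull the ray back through T⁻¹ before intersecting. The sketch below is one possible design under those assumptions, not a definitive implementation; note that a point transforms with the translation component of the matrix, while a direction does not:

   struct Matrix {
       double m[4][4];                          // row-major homogeneous matrix

       Vec3 applyPoint(const Vec3& p) const {   // w = 1: includes translation
           return Vec3{ m[0][0]*p.x + m[0][1]*p.y + m[0][2]*p.z + m[0][3],
                        m[1][0]*p.x + m[1][1]*p.y + m[1][2]*p.z + m[1][3],
                        m[2][0]*p.x + m[2][1]*p.y + m[2][2]*p.z + m[2][3] };
       }
       Vec3 applyVector(const Vec3& v) const {  // w = 0: no translation
           return Vec3{ m[0][0]*v.x + m[0][1]*v.y + m[0][2]*v.z,
                        m[1][0]*v.x + m[1][1]*v.y + m[1][2]*v.z,
                        m[2][0]*v.x + m[2][1]*v.y + m[2][2]*v.z };
       }
   };

   struct Ray { Vec3 e; Vec3 d; };              // L(t) = e + d t

   struct Surface {
       // Smallest useful t where the canonical surface meets the ray,
       // or a negative value if there is no hit.
       virtual double hitCanonical(const Ray& ray) const = 0;
       virtual ~Surface() {}
   };

   struct TransformedSurface {
       const Surface* canonical;
       Matrix inverse;                          // T^-1, precomputed once

       double hit(const Ray& ray) const {
           // Because T^-1 is linear, L'(t) = T^-1(e) + T^-1(d) t, and the
           // parameter t of a hit on the canonical surface is also the
           // parameter of the hit on the transformed surface.
           Ray local{ inverse.applyPoint(ray.e), inverse.applyVector(ray.d) };
           return canonical->hitCanonical(local);
       }
   };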
Examples

Suppose that the object we are viewing is a plane. The general equation of a plane is

   A x + B y + C z + D = 0

for particular values of the constants A, B, C, and D. Thus

   F(p) = A px + B py + C pz + D

and, with p = L(t) = e + d t, the equation F(L(t)) = 0 becomes

   A (ex + dx t) + B (ey + dy t) + C (ez + dz t) + D = 0.

This is a linear equation and its solution is

   t = - (A ex + B ey + C ez + D) / (A dx + B dy + C dz).

If p has components (x, y, z) and

   F(p) = x² + y² + z² - 1

then the set { p | F(p) = 0 } is a unit sphere centred at the origin. We call this the canonical sphere and we obtain other spheres by scaling and translating the canonical sphere. For example, suppose we want a sphere with radius 3 at (2, 4, 6). The required transformations are

   T = [1 0 0 2]        S = [3 0 0 0]
       [0 1 0 4]            [0 3 0 0]
       [0 0 1 6]            [0 0 3 0]
       [0 0 0 1]            [0 0 0 1]

their product is

   T S = [3 0 0 2]
         [0 3 0 4]
         [0 0 3 6]
         [0 0 0 1]

and the inverse of this matrix is

   (T S)⁻¹ = [1/3  0    0   -2/3]
             [0    1/3  0   -4/3]
             [0    0    1/3 -2  ]
             [0    0    0    1  ]

Applying this matrix to the (homogenized) point e + d t gives

   q = ( (ex + dx t)/3 - 2/3,  (ey + dy t)/3 - 4/3,  (ez + dz t)/3 - 2 )
[Figure 57: Lighting in the ray-tracing model. A ray from the viewer V through pixel P meets a surface at O. Light source L2 illuminates O, but the light from source L1 is blocked by another object.]

and the equation F(q) = 0 is obtained as

   ( (ex + dx t)/3 - 2/3 )² + ( (ey + dy t)/3 - 4/3 )² + ( (ez + dz t)/3 - 2 )² - 1 = 0

which is a quadratic in t:

   (1/9)(dx² + dy² + dz²) t²
   + ( (2/9) ex dx - (4/9) dx + (2/9) ey dy - (8/9) dy + (2/9) ez dz - (4/3) dz ) t
   + ( (1/9) ex² + (1/9) ey² + (1/9) ez² - (4/9) ex - (8/9) ey - (4/3) ez + 47/9 ) = 0.

This equation may have no roots (the ray misses the sphere), one root (the ray is tangent to the sphere), or two roots (the ray intersects the sphere twice and we take the smallest root).

We have computed the point where the ray meets the object, but we still have to compute the illumination at that point. This means that we must be able to find the normal to the surface of the object at the point of intersection with the ray. This normal is in fact the normal to the generic surface transformed by the inverse transpose of T. If the generic surface has a simple normal calculation (as many of them do), the normal is easy to find.
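For the canonical sphere itself, the quadratic is especially simple: substituting e + d t into x² + y² + z² - 1 = 0 gives (d · d) t² + 2 (e · d) t + (e · e - 1) = 0. A sketch, reusing the earlier illustrative Vec3 helpers:

   #include <cmath>

   // Returns the smallest t >= 1 where the ray hits the unit sphere, or -1.
   double hitUnitSphere(const Vec3& e, const Vec3& d) {
       double a = dot(d, d);
       double b = 2.0 * dot(e, d);
       double c = dot(e, e) - 1.0;
       double disc = b * b - 4.0 * a * c;
       if (disc < 0.0) return -1.0;            // no roots: the ray misses
       double s = std::sqrt(disc);
       double t1 = (-b - s) / (2.0 * a);       // smaller root first
       double t2 = (-b + s) / (2.0 * a);
       if (t1 >= 1.0) return t1;               // t >= 1 means beyond the screen
       if (t2 >= 1.0) return t2;
       return -1.0;                            // both hits behind the screen
   }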
11.1.1 Recursive Ray Tracing

For detailed lighting calculations, including shadows, reflection, and refraction, we can apply the ray-tracing algorithm recursively.

In Figure 57, the ray meets a surface at O. There are two light sources, L1 and L2. The source L1 does not in fact illuminate the surface at O because there is another object between L1 and O. The source L2, however, does illuminate O. If we take into account objects that block the light, we will achieve the effect of shadows without further effort.

In order to find out whether a light source is blocked by an obstacle, we apply the ray-tracing algorithm recursively. We emit a ray from O in the direction of the source L1 and determine whether it meets any surfaces before L1. Repeating this for each light source enables us to calculate the contribution of each source to the colour at O. If there is a possibility that the obstacle might be reflective, we can recurse again to find out how much light it contributes.

We can use a similar technique for refraction by a transparent object. When the ray emitted from O meets a transparent surface, it sends out two further rays, one for reflected and one for refracted light. (As usual, the rays are going backwards, in the opposite direction to the simulated light.)

In a complete ray-tracing calculation, the intensity at a point is the sum of: ambient light, diffuse light, specular light, reflected light, and refracted light. Here is simplified pseudocode for a complete ray-tracing system. The function shade is passed a Ray and returns the colour of a single pixel. The function hit finds the first surface that the ray intersects.

   Colour shade(Ray r)
      obj = r.hit()
      Colour col
      col.set(emissive light)
      col.set(ambient light)
      for each light source
         if the source is not blocked
            col.add(diffuse and specular light from the source)
      if obj is shiny
         ref = reflected ray
         col.add(shininess * shade(ref))
      if obj is transparent
         trn = transmitted ray
         col.add(transparency * shade(trn))
      return col

11.1.2 Summary

We can use the techniques that we have seen before (Gouraud and Phong shading) but the precision of ray-tracing makes it desirable to use more sophisticated lighting models. We will describe briefly one of these models, due to Cook and with a later improvement by Torrance, and usually called the Cook-Torrance lighting model.

The Cook-Torrance model assumes that a rough surface consists of many small facets and that each facet is almost mirror-like. Facets at an angle δ to the surface normal reflect light back to the viewer.
The distribution of δ is given by

   D(δ) = exp( -(tan δ / m)² ) / (4 m² cos⁴ δ)

where m is a roughness factor.

It is easy to apply textures in a ray-tracing system: the hit coordinates are mapped directly to texture coordinates.

Ray tracing is good for:

• specular reflection
• refraction
• transparency
• ambient lighting
• shadows

It is not so good for diffuse lighting. However, radiosity does a good job of diffuse lighting.

11.2 Radiosity

There are some similarities between radiosity and ray-tracing. Both models are based on physical principles; both are based on simple ideas, yet a good implementation is quite complex; both require large amounts of computation. In other ways they are quite different: radiosity handles diffuse light best, whereas ray-tracing is best for specular light; radiosity computes illumination independently of the point of view, whereas ray-tracing is completely determined by the point of view.

The basic assumption of radiosity is that all surfaces emit light. If this seems surprising, look around you. There are very few objects in direct view that you cannot see, which implies that they are all emitting light. Radiosity techniques divide the scene into patches. A patch is simply an area of a surface: it can be large or small, flat or curved. It is helpful to think of a patch as being a small, flat area, but this is not necessarily the case. We assume that each patch is an opaque, diffuse emitter and reflector of light. Then:

   Bi = Ei + ρi Σj Bj Fji (Aj / Ai)                                (27)

In this equation:

   Bi = the radiosity of patch i = light reflected by patch i (W/m²)
   Ei = the emissivity of patch i = light directly emitted by patch i (W/m²)
   ρi = reflection coefficient of patch i (a dimensionless number)
11.2 Radiosity

There are some similarities between radiosity and ray-tracing. Both models are based on physical principles; both are based on simple ideas, yet a good implementation is quite complex; and both require large amounts of computation. In other ways they are quite different: radiosity handles diffuse light best, whereas ray-tracing is best for specular light; and radiosity computes illumination independently of the point of view, whereas ray-tracing is completely determined by the point of view.

The basic assumption of radiosity is that all surfaces emit light. If this seems surprising, look around you: you can see almost every surface near you, and a surface is visible precisely because light leaves it and reaches your eye. In effect, every visible surface is an emitter.

Radiosity techniques divide the scene into patches. A patch is simply an area of a surface: it can be large or small, flat or curved. It is helpful to think of a patch as being a small, flat area, but this is not necessarily the case. We assume that each patch is an opaque, diffuse emitter and reflector of light. Then:

    B_i = E_i + \rho_i \sum_j B_j F_{ji} \frac{A_j}{A_i}     (27)

In this equation:

    B_i    = the radiosity of patch i = light leaving patch i (W/m²)
    E_i    = the emissivity of patch i = light directly emitted by patch i (W/m²)
    ρ_i    = the reflection coefficient of patch i (a dimensionless number)
    A_i    = the area of patch i
    F_{ji} = the form factor coupling patch j to patch i, defined below

Equation (27) says that the amount of light leaving patch i consists of the amount it emits directly, E_i, plus the sum of the amounts of light it receives from other patches and reflects. For most patches, E_i = 0; if E_i > 0, then patch i is a light source. The amount of light that patch i reflects is proportional to its reflection coefficient, ρ_i. The contribution of each other patch j depends on: the radiosity of patch j, B_j; the form factor, F_{ji}; and the relative areas, A_i and A_j.

B_j F_{ji} is defined to be the amount of light leaving a unit area of patch j that reaches all of patch i. What we actually need is the amount of light arriving at a unit area of patch i from all of patch j; this explains the factor A_j / A_i.

F_{ji} expresses the optical "coupling" between patches j and i. It is high if the patches are close together and parallel, and small if they are far apart or not parallel. If one patch faces away from the other, the coupling is zero. In most cases, F_{ii} = 0 (a patch is not coupled to itself), but if a patch is a concave curved surface, a small amount of self-coupling is possible.

For diffuse light, we can show that A_i F_{ij} = A_j F_{ji}, or

    F_{ji} \frac{A_j}{A_i} = F_{ij}     (28)

Substituting (28) into (27) gives

    B_i = E_i + \rho_i \sum_j B_j F_{ij}

which we can rearrange to give the radiosity equation:

    B_i - \rho_i \sum_j B_j F_{ij} = E_i     (29)

The important feature of the radiosity equation (29) is that it is simply a set of linear simultaneous equations in B_1, B_2, ..., B_n. If we know the constants ρ_i, E_i, and F_{ij}, we can solve these equations and determine the radiosity of each patch.

So far, we have considered only the intensity of light. In practice, the constants ρ_i and E_i will have different values at different wavelengths, and we will have to perform the calculation at several (typically three) wavelengths to obtain a result with colours. (Note that F_{ij} does not depend on the wavelength but only on the geometry of the scene.)
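To make equation (29) concrete, here is a small worked example with invented numbers (they are illustrative only): two equal-area patches facing each other, with F_{12} = F_{21} = 0.5, F_{11} = F_{22} = 0, ρ_1 = ρ_2 = 0.5, E_1 = 1, and E_2 = 0.

    % Equation (29) written out for two patches (illustrative values).
    \begin{align*}
    B_1 - \rho_1 (F_{11} B_1 + F_{12} B_2) = E_1
      &\quad\Longrightarrow\quad B_1 - 0.25\, B_2 = 1 \\
    B_2 - \rho_2 (F_{21} B_1 + F_{22} B_2) = E_2
      &\quad\Longrightarrow\quad B_2 - 0.25\, B_1 = 0
    \end{align*}

Solving gives B_1 = 1/0.9375 ≈ 1.067 and B_2 ≈ 0.267. Note that B_1 exceeds E_1: some of patch 1's own light comes back to it by reflection from patch 2.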
11.2.1 Computing Form Factors

The hardest part of radiosity is the calculation of the form factors F_{ij}. We consider a small part of each patch, dA_i and dA_j, which we can assume to be flat. Let:

    L    = the line joining these areas
    r    = the length of L
    θ_i  = the angle between L and the normal to dA_i
    θ_j  = the angle between L and the normal to dA_j
    H_ij = 1 if dA_i is visible from dA_j, and 0 otherwise

Then

    dF_{di,dj} = \frac{\cos\theta_i \cos\theta_j}{\pi r^2} H_{ij} \, dA_j

Integrating over A_j gives

    F_{di,j} = \int_{A_j} \frac{\cos\theta_i \cos\theta_j}{\pi r^2} H_{ij} \, dA_j

and a second integration, over A_i, gives

    F_{ij} = \frac{1}{A_i} \int_{A_i} \int_{A_j} \frac{\cos\theta_i \cos\theta_j}{\pi r^2} H_{ij} \, dA_j \, dA_i

A naive implementation is likely to be inefficient: if there are N patches, we have to compute N² double integrals. Since N may be in the hundreds or even thousands, this will be time consuming. The Cohen-Greenberg algorithm (the "hemicube" method) computes fairly good approximations to F_{ij} with much less work. The idea of the algorithm is to pre-compute values for small squares on a half-cube placed over the patch and to use these values in the computation of F_{ij}.
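The double integral can also be approximated directly by sampling. The sketch below is illustrative (the Vec3 and Patch types are assumptions) and ignores occlusion, that is, it takes H_ij = 1 everywhere; it estimates F_ij by a midpoint double sum over sample points spread across the two patches.

    #include <cmath>
    #include <vector>

    struct Vec3 {
        double x, y, z;
        Vec3 operator-(Vec3 b) const { return {x - b.x, y - b.y, z - b.z}; }
        double dot(Vec3 b) const { return x*b.x + y*b.y + z*b.z; }
        double length() const { return std::sqrt(dot(*this)); }
    };

    // A patch described by sample points, a unit normal, and its area.
    struct Patch {
        std::vector<Vec3> samples;   // sample points spread over the patch
        Vec3 normal;                 // unit normal
        double area;
    };

    // Midpoint estimate of F_ij, assuming H_ij = 1 (no occluders).
    double formFactor(const Patch& from, const Patch& to) {
        const double PI = 3.14159265358979;
        double dAi = from.area / from.samples.size();   // area of one sample cell
        double dAj = to.area / to.samples.size();
        double sum = 0.0;
        for (Vec3 a : from.samples)
            for (Vec3 b : to.samples) {
                Vec3 L = b - a;                     // line joining the two cells
                double r = L.length();
                double ci = from.normal.dot(L) / r; // cos(theta_i)
                double cj = -to.normal.dot(L) / r;  // cos(theta_j)
                if (ci <= 0.0 || cj <= 0.0) continue;  // faces away: no coupling
                sum += ci * cj / (PI * r * r) * dAi * dAj;
            }
        return sum / from.area;                     // the 1/A_i factor in F_ij
    }

With a modest grid of samples on each patch this gives a serviceable estimate when the patches are well separated; accuracy degrades when they are very close together, because 1/r² then varies rapidly across a sample cell.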
11.2.2 Choosing Patches

Since the computation time increases as N² for N patches, choosing the right patches is a crucial step. To obtain reasonable efficiency, we would like to have large patches where the light is uniform and smaller patches where it is non-uniform. Once again, recursion comes to the rescue. A practical radiosity algorithm works like this:

1. Divide the scene into a small number of large patches.
2. Calculate the radiosity of each patch.
3. Estimate the radiosity variation across each patch (e.g., by looking at the radiosity of its neighbours).
4. If the radiosity gradient across a patch is larger than a given threshold, split the patch into two smaller patches.
5. If any patches were split in step 4, repeat from step 2.

In practice, the efficiency of the algorithm is improved by not repeating all the calculations in step 2, because some values will not have been changed significantly by splitting patches.

11.2.3 Improvements

Practical versions of radiosity algorithms are incremental. Initially, all B_i are assumed to be zero. Equation (29) is used to calculate the first approximation to B_i, using only the non-zero E_i's. The second iteration takes into account one reflection, the third iteration takes into account two reflections, and so on. In practice, the equations stabilize fairly quickly, because third and higher order reflections have very little energy. The procedure stops when an iteration produces very little change. (A sketch of this iteration appears at the end of the section.)

Radiosity does a good job of diffuse lighting, and it automatically provides ambient lighting. It is possible to account for specular lighting by including directional effects in the calculation of F_{ij}, but the computational overhead usually makes this impractical. However, as we have seen, ray-tracing does a good job of specular light. It is possible to combine radiosity and ray-tracing to get the best of both models. A typical calculation goes like this:

1. Compute radiosity values for the scene, giving ambient and diffuse lighting that is independent of the view point.
2. Project the scene onto a viewing window.
3. Perform ray-tracing to obtain specular lights, reflection, and refraction.
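The incremental scheme described above is, in effect, simple fixed-point iteration on equation (29). Here is a minimal sketch, assuming the form factors F_ij, reflection coefficients ρ_i, and emissivities E_i are already known (for one wavelength); it is an illustration, not production code.

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Iteratively solve B_i - rho_i * sum_j F_ij B_j = E_i (equation 29).
    // F is the N x N form-factor matrix, indexed F[i][j].
    std::vector<double> solveRadiosity(const std::vector<std::vector<double>>& F,
                                       const std::vector<double>& rho,
                                       const std::vector<double>& E,
                                       double tolerance = 1e-6) {
        const std::size_t n = E.size();
        std::vector<double> B(n, 0.0), next(n);   // all radiosities start at zero
        double change;
        do {
            change = 0.0;
            for (std::size_t i = 0; i < n; ++i) {
                double gathered = 0.0;            // light arriving at patch i
                for (std::size_t j = 0; j < n; ++j)
                    gathered += F[i][j] * B[j];
                next[i] = E[i] + rho[i] * gathered;   // equation (29) rearranged
                change = std::max(change, std::abs(next[i] - B[i]));
            }
            B = next;
        } while (change > tolerance);             // stop when a pass changes little
        return B;
    }

The first pass leaves only the emitters lit (B_i = E_i); each subsequent pass adds one more bounce of reflected light, and the loop stops when a pass changes nothing appreciably, exactly as described above.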
11.3 Bump Mapping

Consider an orange. It is approximately a sphere, but it has an irregular surface. It would be expensive to model an orange exactly, because a very large number of polygons would be needed to model the surface accurately. It is possible, however, to obtain the illusion of an orange by drawing a sphere with incorrect normals. The normals interact with the lighting calculations to give the effect of a dimpled surface. The technique is called bump mapping, and it was introduced by James Blinn in 1978.

Suppose that we have a surface defined with parametric coordinates: points on the surface are p(u, v) for various values of u and v (which are essentially the same as texture coordinates). For example, we can define a sphere with radius r centered at the origin as

    p(u, v) = (r \sin u \cos v, \; r \sin u \sin v, \; r \cos u).

Bump mapping perturbs the true surface by adding a bump function b to it:

    p'(u, v) = p(u, v) + b(u, v) \, n

where

    N = \frac{\partial p}{\partial u} \times \frac{\partial p}{\partial v}
    \quad \text{and} \quad
    n = \frac{N}{|N|}

is the unit normal vector at p. The perturbed normal vector is

    N' = \frac{\partial p'}{\partial u} \times \frac{\partial p'}{\partial v}.

We have

    \frac{\partial p'}{\partial u}
      = \frac{\partial}{\partial u}(p + b n)
      = \frac{\partial p}{\partial u} + n \frac{\partial b}{\partial u} + b \frac{\partial n}{\partial u}.

If we assume that b is small (that is the idea of a "bump" mapping), we can neglect the last term, giving for u and v:

    \frac{\partial p'}{\partial u} \approx \frac{\partial p}{\partial u} + n \frac{\partial b}{\partial u}
    \qquad
    \frac{\partial p'}{\partial v} \approx \frac{\partial p}{\partial v} + n \frac{\partial b}{\partial v}.

The perturbed surface normal vector is

    N' = \frac{\partial p'}{\partial u} \times \frac{\partial p'}{\partial v}
       = \frac{\partial p}{\partial u} \times \frac{\partial p}{\partial v}
         + \frac{\partial b}{\partial v} \left( \frac{\partial p}{\partial u} \times n \right)
         + \frac{\partial b}{\partial u} \left( n \times \frac{\partial p}{\partial v} \right)
         + \frac{\partial b}{\partial u} \frac{\partial b}{\partial v} \, (n \times n).

But n × n = 0, and so

    N' = N + \frac{\partial b}{\partial v} \left( \frac{\partial p}{\partial u} \times n \right)
           + \frac{\partial b}{\partial u} \left( n \times \frac{\partial p}{\partial v} \right).

We obtain the perturbed unit normal by normalizing N'.

Although it is possible to do all the calculations analytically, the usual practice is to approximate the bump function b with a look-up table and to estimate the derivatives with finite differences:

    \frac{\partial b}{\partial u} \approx b_{i,j} - b_{i-1,j}
    \qquad
    \frac{\partial b}{\partial v} \approx b_{i,j} - b_{i,j-1}

Values of ∂b/∂u and ∂b/∂v can be tabulated, and the values of ∂p/∂u and ∂p/∂v are needed anyway for the normal calculation. Consequently, bump mapping is quite an efficient operation.
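As an illustration (not from the original notes), here is how the perturbed normal might be computed at one grid point from a tabulated bump function, using the backward differences above; the Vec3 type and the table layout are assumptions.

    #include <cmath>
    #include <vector>

    struct Vec3 {
        double x, y, z;
        Vec3 operator+(Vec3 b) const { return {x + b.x, y + b.y, z + b.z}; }
        Vec3 operator*(double k) const { return {k * x, k * y, k * z}; }
        Vec3 cross(Vec3 b) const {
            return {y * b.z - z * b.y, z * b.x - x * b.z, x * b.y - y * b.x};
        }
        Vec3 normalized() const {
            double len = std::sqrt(x * x + y * y + z * z);
            return {x / len, y / len, z / len};
        }
    };

    // Perturbed unit normal at grid cell (i, j), with i, j >= 1.
    // bump[i][j] holds sampled values of the bump function b(u, v);
    // pu and pv are the tangents dp/du and dp/dv at the same point.
    Vec3 bumpedNormal(const std::vector<std::vector<double>>& bump,
                      int i, int j, Vec3 pu, Vec3 pv) {
        Vec3 n = pu.cross(pv).normalized();          // true unit normal
        double bu = bump[i][j] - bump[i - 1][j];     // backward difference in u
        double bv = bump[i][j] - bump[i][j - 1];     // backward difference in v
        // N' = N + (db/dv)(pu x n) + (db/du)(n x pv)
        Vec3 perturbed = pu.cross(pv) + pu.cross(n) * bv + n.cross(pv) * bu;
        return perturbed.normalized();
    }

The tangents pu and pv come from the surface parameterization; for the sphere above they are ∂p/∂u and ∂p/∂v, which the normal calculation needs anyway.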
11.4 Environment Mapping

Real-life scenes often contain highly reflective objects. If these objects are to look realistic, they should reflect the scenery around them. In general, this is hard to do, because it requires calculating the reflected ray for each point on the surface and then finding out where that ray comes from in the scene. (This is a problem that is solved neatly by recursive ray-tracing, of course.) There are two ways of simplifying the rendering of reflective objects:

1. We can work with simple objects, such as cylinders and spheres. It is easier to compute reflections from these objects than from, say, a shiny car.
2. We can use texture mapping to render the object.

The second method, which is called environment mapping, was introduced by Blinn and Newell. We will consider a simple and standard case that happens to have direct support from OpenGL: the problem of rendering a reflecting sphere.

Environment mapping depends to some extent on the fact that people are rather like raccoons: we recognize shiny or reflecting objects easily, but we are not too fussy about the precise details of the reflection. Imagine a reflecting sphere that is moving around. Strictly, the reflection on the sphere should exactly match the surroundings; in practice, if the match is fairly good, our eyes accept it as a reflection.

Environment mapping works in two steps: the first step is to obtain a suitable reflected image, and the second step is to use that image to texture the sphere. A photograph taken with a fish-eye lens provides a usable image; alternatively, we can take a regular image and distort it.

The following code is extracted from a program that performs texturing in sphere-map mode. This part is used during initialization; PixelMap is a class defined in CUGL. The image should be a fish-eye view, as described above, but it does not have to be.

    PixelMap tex;
    tex.read("image.bmp");
    GLuint name;
    glGenTextures(1, &name);
    tex.setTexture(name);
    glTexGenf(GL_S, GL_TEXTURE_GEN_MODE, GL_SPHERE_MAP);
    glTexGenf(GL_T, GL_TEXTURE_GEN_MODE, GL_SPHERE_MAP);

The following code is used in the display function. The displayed object must have texture coordinates 0 ≤ s ≤ 1 and 0 ≤ t ≤ 1.

    glEnable(GL_TEXTURE_2D);
    glEnable(GL_TEXTURE_GEN_S);
    glEnable(GL_TEXTURE_GEN_T);
    // Display the object
    glDisable(GL_TEXTURE_GEN_S);
    glDisable(GL_TEXTURE_GEN_T);
    glDisable(GL_TEXTURE_2D);

The mathematical justification of sphere mapping follows.
Let u be a unit vector from the eye to a vertex on the sphere, and let r be the corresponding reflection vector, computed as

    r = u - 2 (n \cdot u) \, n.

Then the texture coordinates are calculated as

    s = \frac{1}{2} \left( \frac{r_x}{p} + 1 \right)
    \qquad
    t = \frac{1}{2} \left( \frac{r_y}{p} + 1 \right)

where p = \sqrt{r_x^2 + r_y^2 + (r_z + 1)^2}.
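The same calculation is easy to perform by hand or in code; the sketch below (illustrative only) computes (s, t) from a unit view vector and a unit normal, exactly as the sphere-map mode does.

    #include <cmath>

    struct Vec3 { double x, y, z; };

    static double dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

    // Sphere-map texture coordinates for unit view vector u (eye to vertex)
    // and unit normal n: r = u - 2(n.u)n, then project r onto the map.
    void sphereMapCoords(Vec3 u, Vec3 n, double& s, double& t) {
        double k = 2.0 * dot(n, u);
        Vec3 r{u.x - k * n.x, u.y - k * n.y, u.z - k * n.z};  // reflection vector
        double p = std::sqrt(r.x * r.x + r.y * r.y + (r.z + 1) * (r.z + 1));
        s = 0.5 * (r.x / p + 1.0);
        t = 0.5 * (r.y / p + 1.0);
    }

As a check: with u = (0, 0, -1) and n = (0, 0, 1), the reflection is r = (0, 0, 1), p = 2, and (s, t) = (1/2, 1/2), the centre of the map, which is what we expect for a head-on reflection.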
11.5 The Accumulation Buffer

Although we have seen various ways of making graphics images realistic, it is usually easy to distinguish a computer graphics image from a photograph or a real scene. There are several reasons for this; one important reason is that a graphics image is sharp and bright everywhere. We are used to seeing real scenes in which distant objects are less brightly coloured than nearby objects, and photographs in which one part is in sharp focus and the rest is blurred. We can use OpenGL fog to give the effect of distance; this section discusses blurring.

Blurring can be simulated by drawing the scene several times in slightly different positions. OpenGL provides the accumulation buffer for this and other purposes. This buffer is used to "accumulate" several different images before displaying a final image. Most applications of the accumulation buffer are quite slow, because the image must be rendered several times. Here are some of the applications of the accumulation buffer:

- A camera focuses at a particular distance. In theory, there is a single plane that is sharp and everything else is blurred. In practice, there is a depth of field, defined by two distances between which everything is sharp enough (for example, the blurring might be less than the grain size of the film). To achieve a photographic effect, we can render the scene several times into the accumulation buffer. One point in the scene, the centre of the image at the focal plane, is kept fixed, and the scene is randomly rotated through a very small angle around this point. The effect is that this point in the scene is sharp and everything else is blurred.

- If a fast-moving object is photographed with a slow shutter speed, it appears blurred. Skilled photographers sometimes pan the camera to follow the object, in which case the object is sharp but the background is blurred. In either case, the effect is called motion blur. It is used (often with exaggeration) in comics and animated films to emphasize the feeling of motion. To achieve the effect in OpenGL, the scene is again rendered several times into the accumulation buffer. Stationary objects stay in the same place, and moving objects are moved slightly between renderings. The "exposure" (explained below) can be varied: for example, a moving object might be rendered in five different positions with exposures 1/2, 1/4, 1/8, 1/16, and 1/32, to give the effect of fading.

- Jagged edges and other artefacts of polygonal decomposition can be smoothed by rendering the image several times in slightly different positions; this is a form of antialiasing. The movements should be random and very small, typically less than a pixel.

The function that controls the accumulation buffer is glAccum(). It requires two arguments, an operation and a value. Figure 58 explains the effect of the various operations. The "buffer currently selected for reading" is set by glReadBuffer(), and the "buffer currently selected for writing" is set by glDrawBuffer(). By default, the current colour buffer is used for reading and writing, so it is not actually necessary to call these functions.

    Operation (op)   Effect (val)
    GL_ACCUM         Read each pixel of the buffer currently selected for reading,
                     multiply the RGB values by val, and add the result to the
                     accumulation buffer.
    GL_LOAD          Read each pixel of the buffer currently selected for reading,
                     multiply the RGB values by val, and store the result in the
                     accumulation buffer.
    GL_RETURN        Take each pixel from the accumulation buffer, multiply the RGB
                     values by val, and store the result in the colour buffer
                     currently selected for writing.
    GL_ADD           Add val to each pixel in the accumulation buffer.
    GL_MULT          Multiply each pixel in the accumulation buffer by val.

    Figure 58: Effect of glAccum(op, val)

In a typical application, the accumulation buffer is used as follows:

1. Call glClear(GL_ACCUM_BUFFER_BIT) to clear the accumulation buffer.
2. Render the image n times into the colour buffer (as usual). After each rendering, call glAccum(GL_ACCUM, x) with x = 1/n.
3. Call glAccum(GL_RETURN, 1.0) to copy the accumulated information back to the colour buffer.
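As a concrete (and illustrative) example, here is what an antialiasing display function might look like. drawScene() is an assumed callback, the window is assumed to have been created with GLUT_ACCUM in its display mode, and jittering the model-view matrix is a simplification: production code usually jitters the projection frustum by a sub-pixel amount instead.

    #include <GL/glut.h>
    #include <cstdlib>

    void drawScene();   // assumed: draws the scene with the current matrices

    // Antialias by accumulating n jittered renderings (the recipe above).
    // The window must be created with glutInitDisplayMode(... | GLUT_ACCUM).
    void display() {
        const int n = 8;
        glClear(GL_ACCUM_BUFFER_BIT);
        for (int i = 0; i < n; ++i) {
            glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
            glMatrixMode(GL_MODELVIEW);
            glPushMatrix();
            // Small random offsets, chosen to stay well under a pixel
            // for a typical viewport and projection.
            double dx = 0.002 * (std::rand() / (double) RAND_MAX - 0.5);
            double dy = 0.002 * (std::rand() / (double) RAND_MAX - 0.5);
            glTranslated(dx, dy, 0.0);
            drawScene();
            glPopMatrix();
            glAccum(GL_ACCUM, 1.0f / n);   // add this image, weighted by 1/n
        }
        glAccum(GL_RETURN, 1.0f);          // copy the average back and display it
        glutSwapBuffers();
    }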