Template
Fundamentals of Accounting
Instructions
Accounts to be used:
· Cash.
· Prepaid insurance.
· Land.
· Buildings.
· Equipment.
· Accounts payable.
· Unearned service revenue.
· Owner's capital.
· Owner's drawings.
· Service revenue.
· Advertising expense.
· Salaries and wages expense.
Leave a space between each dated transaction.
May 1
Invested $20,000 cash in the golf course business.
May 3
Purchased Hampstead Golf Land for $15,000 cash. The price
includes land $12,000, shed $2,000, and equipment $1,000.
May 5
Paid advertising expenses of $700.
May 6
Paid cash $600 for a one-year insurance policy.
May 10
Purchased golf discs and other equipment for $1,050 from Discs
Are Us, payable in 30 days.
May 18
Received $1,100 in cash for golf fees earned (service revenue).
May 19
Sold 150 coupon books for $10 each. Each book contains four
coupons that enable the holder to play one round of disc golf.
May 25
Withdrew $800 cash for personal use.
May 30
Paid $250 in salaries for part-time employees.
May 30
Paid Discs Are Us the full amount due.
May 31
Received $2,100 cash for fees earned.
Date | Accounts | Debit | Credit
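The journal can be checked mechanically before it is written into the template. The Python sketch below is one plausible reading of the May transactions, not an official answer key. It assumes the shed is recorded under Buildings and the coupon books under Unearned service revenue, since the rounds have not yet been played. The names JOURNAL, entry_is_balanced, account_balances and format_journal are illustrative only.

```python
from collections import defaultdict

# Each entry: (date, [(account, debit, credit), ...]).
# The account assignments are an assumed reading of the transactions.
JOURNAL = [
    ("May 1",  [("Cash", 20_000, 0), ("Owner's capital", 0, 20_000)]),
    ("May 3",  [("Land", 12_000, 0), ("Buildings", 2_000, 0),
                ("Equipment", 1_000, 0), ("Cash", 0, 15_000)]),
    ("May 5",  [("Advertising expense", 700, 0), ("Cash", 0, 700)]),
    ("May 6",  [("Prepaid insurance", 600, 0), ("Cash", 0, 600)]),
    ("May 10", [("Equipment", 1_050, 0), ("Accounts payable", 0, 1_050)]),
    ("May 18", [("Cash", 1_100, 0), ("Service revenue", 0, 1_100)]),
    # 150 coupon books x $10 each, cash received before the service is given
    ("May 19", [("Cash", 150 * 10, 0),
                ("Unearned service revenue", 0, 150 * 10)]),
    ("May 25", [("Owner's drawings", 800, 0), ("Cash", 0, 800)]),
    ("May 30", [("Salaries and wages expense", 250, 0), ("Cash", 0, 250)]),
    ("May 30", [("Accounts payable", 1_050, 0), ("Cash", 0, 1_050)]),
    ("May 31", [("Cash", 2_100, 0), ("Service revenue", 0, 2_100)]),
]


def entry_is_balanced(lines):
    """Double-entry rule: total debits equal total credits in every entry."""
    return sum(d for _, d, _ in lines) == sum(c for _, _, c in lines)


def account_balances(journal):
    """Net each account as debits minus credits (credit balances are negative)."""
    balances = defaultdict(int)
    for _, lines in journal:
        for account, debit, credit in lines:
            balances[account] += debit - credit
    return dict(balances)


def format_journal(journal):
    """Lay the entries out as Date / Accounts / Debit / Credit columns,
    indenting credited accounts and leaving a blank line after each entry."""
    rows = [f"{'Date':<8}{'Accounts':<32}{'Debit':>8}{'Credit':>8}"]
    for date, lines in journal:
        for i, (account, debit, credit) in enumerate(lines):
            label = account if debit else "    " + account
            rows.append(f"{date if i == 0 else '':<8}{label:<32}"
                        f"{debit or '':>8}{credit or '':>8}")
        rows.append("")
    return "\n".join(rows)


if __name__ == "__main__":
    print(format_journal(JOURNAL))
    print("All entries balanced:",
          all(entry_is_balanced(lines) for _, lines in JOURNAL))
```

Running the script prints the journal in the four-column layout. Calling account_balances(JOURNAL) then gives a quick cross-check of each account's net position under these assumed assignments.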
Data Visualisation
A Handbook for Data Driven Design
Andy Kirk
SAGE Publications Ltd
1 Oliver’s Yard
55 City Road
London EC1Y 1SP
SAGE Publications Inc.
2455 Teller Road
Thousand Oaks, California 91320
SAGE Publications India Pvt Ltd
B 1/I 1 Mohan Cooperative Industrial Area
Mathura Road
New Delhi 110 044
SAGE Publications Asia-Pacific Pte Ltd
3 Church Street
#10-04 Samsung Hub
Singapore 049483
© Andy Kirk 2016
First published 2016
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act, 1988, this publication may be reproduced, stored or transmitted in any form, or by any means, only with the prior permission in writing of the publishers, or in the case of reprographic reproduction, in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.
Library of Congress Control Number: 2015957322
British Library Cataloguing in Publication data
A catalogue record for this book is available from the British Library
ISBN 978-1-4739-1213-7
ISBN 978-1-4739-1214-4 (pbk)
Editor: Mila Steele
Editorial assistant: Alysha Owen
Production editor: Ian Antcliff
Marketing manager: Sally Ransom
Cover design: Shaun Mercier
Typeset by: C&M Digitals (P) Ltd, Chennai, India
Printed and bound in Great Britain by Bell and Bain Ltd, Glasgow
Contents
List of Figures with Source Notes
Acknowledgements
About the Author
INTRODUCTION
PART A FOUNDATIONS
1 Defining Data Visualisation
2 Visualisation Workflow
PART B THE HIDDEN THINKING
3 Formulating Your Brief
4 Working With Data
5 Establishing Your Editorial Thinking
PART C DEVELOPING YOUR DESIGN SOLUTION
6 Data Representation
7 Interactivity
8 Annotation
9 Colour
10 Composition
PART D DEVELOPING YOUR CAPABILITIES
11 Visualisation Literacy
References
Index
List of Figures with Source Notes
1.1 A Definition for Data Visualisation 19
1.2 Per Capita Cheese Consumption in the U.S., by Sarah Slobin
(Fortune magazine) 20
1.3 The Three Stages of Understanding 22
1.4–6 Demonstrating the Process of Understanding 24–27
1.7 The Three Principles of Good Visualisation Design 30
1.8 Housing and Home Ownership in the UK, by ONS Digital
Content Team 33
1.9 Falling Number of Young Homeowners, by the Daily Mail
33
1.10 Gun Deaths in Florida (Reuters Graphics) 34
1.11 Iraq’s Bloody Toll, by Simon Scarr (South China Morning
Post)
34
1.12 Gun Deaths in Florida Redesign, by Peter A. Fedewa
(@pfedewa) 35
1.13 If Vienna would be an Apartment, by NZZ (Neue Zürcher
Zeitung) [Translated] 45
1.14 Asia Loses Its Sweet Tooth for Chocolate, by Graphics
Department (Wall Street Journal) 45
2.1 The Four Stages of the Visualisation Workflow 54
3.1 The ‘Purpose Map’ 76
3.2 Mizzou’s Racial Gap Is Typical On College Campuses, by
FiveThirtyEight 77
3.3 Image taken from ‘Wealth Inequality in America’, by
YouTube
user ‘Politizane’ (www.youtube.com/watch?v=QPKKQnijnsM)
78
3.4 Dimensional Changes in Wood, by Luis Carli (luiscarli.com)
79
3.5 How Y’all, Youse and You Guys Talk, by Josh Katz (The
New
York Times) 80
3.6 Spotlight on Profitability, by Krisztina Szücs 81
3.7 Countries with the Most Land Neighbours 83
3.8 Buying Power: The Families Funding the 2016 Presidential
Election, by Wilson Andrews, Amanda Cox, Alicia DeSantis,
Evan
Grothjan, Yuliya Parshina-Kottas, Graham Roberts, Derek
Watkins
and Karen Yourish (The New York Times) 84
3.9 Image taken from ‘Texas Department of Criminal Justice’
Website
(www.tdcj.state.tx.us/death_row/dr_executed_offenders.html)
86
3.10 OECD Better Life Index, by Moritz Stefaner, Dominikus
Baur,
Raureif GmbH 89
3.11 Losing Ground, by Bob Marshall, The Lens, Brian Jacobs
and
Al Shaw (ProPublica) 89
3.12 Grape Expectations, by S. Scarr, C. Chan, and F. Foo
(Reuters
Graphics) 91
3.13 Keywords and Colour Swatch Ideas from Project about
Psychotherapy Treatment in the Arctic 92
3.14 An Example of a Concept Sketch, by Giorgia Lupi of
Accurat 92
4.1 Example of a Normalised Dataset 99
4.2 Example of a Cross-tabulated Dataset 100
4.3 Graphic Language: The Curse of the CEO, by David Ingold
and
Keith Collins (Bloomberg Visual Data), Jeff Green (Bloomberg
News) 101
4.4 US Presidents by Ethnicity (1789 to 2015) 114
4.5 OECD Better Life Index, by Moritz Stefaner, Dominikus
Baur,
Raureif GmbH 116
4.6 Spotlight on Profitability, by Krisztina Szücs 117
4.7 Example of ‘Transforming to Convert’ Data 119
4.8 Making Sense of the Known Knowns 123
4.9 What Good Marathons and Bad Investments Have in
Common,
by Justin Wolfers (The New York Times) 124
5.1 The Fall and Rise of U.S. Inequality, in Two Graphs Source:
World Top Incomes Database; Design credit: Quoctrung Bui
(NPR)
136
5.2–4 Why Peyton Manning’s Record Will Be Hard to Beat, by Gregor Aisch and Kevin Quealy (The New York Times) 138–140
C.1 Mockup Designs for ‘Poppy Field’, by Valentina D’Efilippo
(design); Nicolas Pigelet (code); Data source: The Polynational
War
Memorial, 2014 (poppyfield.org) 146
6.1 Mapping Records and Variables on to Marks and Attributes
152
6.2 List of Mark Encodings 153
6.3 List of Attribute Encodings 153
6.4 Bloomberg Billionaires, by Bloomberg Visual Data (Design
and
development), Lina Chen and Anita Rundles (Illustration) 155
6.5 Lionel Messi: Games and Goals for FC Barcelona 156
6.6 Image from the Home page of visualisingdata.com 156
6.7 How the Insane Amount of Rain in Texas Could Turn Rhode
Island Into a Lake, by Christopher Ingraham (The Washington
Post)
156
6.8 The 10 Actors with the Most Oscar Nominations but No
Wins
161
6.9 The 10 Actors who have Received the Most Oscar
Nominations
162
6.10 How Nations Fare in PhDs by Sex Interactive, by
Periscopic;
Research by Amanda Hobbs; Published in Scientific American
163
6.11 Gender Pay Gap US, by David McCandless, Miriam Quick
(Research) and Philippa Thomas (Design) 164
6.12 Who Wins the Stanley Cup of Playoff Beards? by Graphics
Department (Wall Street Journal) 165
6.13 For These 55 Marijuana Companies, Every Day is 4/20, by
Alex
Tribou and Adam Pearce (Bloomberg Visual Data) 166
6.14 UK Public Sector Capital Expenditure, 2014/15 167
6.15 Global Competitiveness Report 2014–2015, by Bocoup and
the
World Economic Forum 168
6.16 Excerpt from a Rugby Union Player Dashboard 169
6.17 Range of Temperatures (°F) Recorded in the Top 10 Most
Populated Cities During 2015 170
6.18 This Chart Shows How Much More Ivy League Grads
Make
Than You, by Christopher Ingraham (The Washington Post) 171
6.19 Comparing Critics Scores (Rotten Tomatoes) for Major
Movie
Franchises 172
6.20 A Career in Numbers: Movies Starring Michael Caine 173
6.21 Comparing the Frequency of Words Used in Chapter 1 of
this
Book 174
6.22 Summary of Eligible Votes in the UK General Election
2015
175
6.23 The Changing Fortunes of Internet Explorer and Google
Chrome
176
6.24 Literacy Proficiency: Adult Levels by Country 177
6.25 ‘Political Polarization in the American Public’, Pew Research Center, Washington, DC (February, 2015) (http://www.people-press.org/2014/06/12/political-polarization-in-the-american-public/) 178
6.26 Finviz (www.finviz.com) 179
6.27 This Venn Diagram Shows Where You Can Both Smoke
Weed
and Get a Same-Sex Marriage, by Phillip Bump (The
Washington
Post) 180
6.28 The 200+ Beer Brands of SAB InBev, by Maarten
Lambrechts
for Mediafin: www.tijd.be/sabinbev (Dutch),
www.lecho.be/service/sabinbev (French) 181
6.29 Which Fossil Fuel Companies are Most Responsible for
Climate
Change? by Duncan Clark and Robin Houston (Kiln), published
in
the Guardian, drawing on work by Mike Bostock and Jason
Davies
182
6.30 How Long Will We Live – And How Well? by Bonnie
Berkowitz, Emily Chow and Todd Lindeman (The Washington
Post)
183
6.31 Crime Rates by State, by Nathan Yau 184
6.32 Nutrient Contents – Parallel Coordinates, by Kai Chang
(@syntagmatic) 185
6.33 How the ‘Avengers’ Line-up Has Changed Over the Years,
by
Jon Keegan (Wall Street Journal) 186
6.34 Interactive Fixture Molecules, by @experimental361 and
@bootifulgame 187
6.35 The Rise of Partisanship and Super-cooperators in the U.S.
House of Representatives. Visualisation by Mauro Martino,
authored
by Clio Andris, David Lee, Marcus J. Hamilton, Mauro Martino,
Christian E. Gunning, and John Armistead Selde 188
6.36 The Global Flow of People, by Nikola Sander, Guy J. Abel
and
Ramon Bauer 189
6.37 UK Election Results by Political Party, 2010 vs 2015 190
6.38 The Fall and Rise of U.S. Inequality, in Two Graphs.
Source:
World Top Incomes Database; Design credit: Quoctrung Bui
(NPR)
191
6.39 Census Bump: Rank of the Most Populous Cities at Each
Census, 1790–1890, by Jim Vallandingham 192
6.40 Coal, Gas, Nuclear, Hydro? How Your State Generates
Power.
Source: U.S. Energy Information Administration, Credit:
Christopher
Groskopf, Alyson Hurt and Avie Schneider (NPR) 193
6.41 Holdouts Find Cheapest Super Bowl Tickets Late in the
Game,
by Alex Tribou, David Ingold and Jeremy Diamond (Bloomberg
Visual Data) 194
6.42 Crude Oil Prices (West Texas Intermediate), 1985–2015
195
6.43 Percentage Change in Price for Select Food Items, Since
1990,
by Nathan Yau 196
6.44 The Ebb and Flow of Movies: Box Office Receipts 1986–2008, by Mathew Bloch, Lee Byron, Shan Carter and Amanda Cox (The New York Times) 197
6.45 Tracing the History of N.C.A.A. Conferences, by Mike
Bostock,
Shan Carter and Kevin Quealy (The New York Times) 198
6.46 A Presidential Gantt Chart, by Ben Jones 199
6.47 How the ‘Avengers’ Line-up Has Changed Over the Years,
by
Jon Keegan (Wall Street Journal) 200
6.48 Native and New Berliners – How the S-Bahn Ring Divides
the
City, by Julius Tröger, André Pätzold, David Wendler (Berliner
Morgenpost) and Moritz Klack (webkid.io) 201
6.49 How Y’all, Youse and You Guys Talk, by Josh Katz (The
New
York Times) 202
6.50 Here’s Exactly Where the Candidates Cash Came From, by
Zach
Mider, Christopher Cannon, and Adam Pearce (Bloomberg
Visual
Data) 203
6.51 Trillions of Trees, by Jan Willem Tulp 204
6.52 The Racial Dot Map. Image Copyright, 2013, Weldon
Cooper
Center for Public Service, Rector and Visitors of the University
of
Virginia (Dustin A. Cable, creator) 205
6.53 Arteries of the City, by Simon Scarr (South China Morning
Post) 206
6.54 The Carbon Map, by Duncan Clark and Robin Houston
(Kiln)
207
6.55 Election Dashboard, by Jay Boice, Aaron Bycoffe and
Andrei
Scheinkman (Huffington Post). Statistical model created by
Simon
Jackman 208
6.56 London is Rubbish at Recycling and Many Boroughs are
Getting
Worse, by URBS London using London Squared Map © 2015
www.aftertheflood.co 209
6.57 Automating the Design of Graphical Presentations of
Relational
Information. Adapted from McKinlay, J. D. (1986). ACM
Transactions on Graphics, 5(2), 110–141. 213
6.58 Comparison of Judging Line Size vs Area Size 213
6.59 Comparison of Judging Related Items Using Variation in
Colour
(Hue) vs Variation in Shape 214
6.60 Illustrating the Correct and Incorrect Circle Size Encoding
216
6.61 Illustrating the Distortions Created by 3D Decoration 217
6.62 Example of a Bullet Chart using Banding Overlays 218
6.63 Excerpt from What’s Really Warming the World? by Eric
Roston and Blacki Migliozzi (Bloomberg Visual Data) 218
6.64 Example of Using Markers Overlays 219
6.65 Why Is Her Paycheck Smaller? by Hannah Fairfield and
Graham
Roberts (The New York Times) 219
6.66 Inside the Powerful Lobby Fighting for Your Right to Eat
Pizza,
by Andrew Martin and Bloomberg Visual Data 220
6.67 Excerpt from ‘Razor Sales Move Online, Away From
Gillette’,
by Graphics Department (Wall Street Journal) 220
7.1 US Gun Deaths, by Periscopic 225
7.2 Finviz (www.finviz.com) 226
7.3 The Racial Dot Map: Image Copyright, 2013, Weldon
Cooper
Center for Public Service, Rector and Visitors of the University
of
Virginia (Dustin A. Cable, creator) 227
7.4 Obesity Around the World, by Jeff Clark 228
7.5 Excerpt from ‘Social Progress Index 2015’, by Social
Progress
Imperative, 2015 228
7.6 NFL Players: Height & Weight Over Time, by Noah
Veltman
(noahveltman.com) 229
7.7 Excerpt from ‘How Americans Die’, by Matthew C. Klein
and
Bloomberg Visual Data 230
7.8 Model Projections of Maximum Air Temperatures Near the
Ocean and Land Surface on the June Solstice in 2014 and 2099:
NASA Earth Observatory maps, by Joshua Stevens 231
7.9 Excerpt from ‘A Swing of Beauty’, by Sohail Al-Jamea,
Wilson
Andrews, Bonnie Berkowitz and Todd Lindeman (The
Washington
Post) 231
7.10 How Well Do You Know Your Area? by ONS Digital
Content
team 232
7.11 Excerpt from ‘Who Old Are You?’, by David McCandless
and
Tom Evans 233
7.12 512 Paths to the White House, by Mike Bostock and Shan
Carter
(The New York Times) 233
7.13 OECD Better Life Index, by Moritz Stefaner, Dominikus
Baur,
Raureif GmbH 233
7.14 Nobel Laureates, by Matthew Weber (Reuters Graphics)
234
7.15 Geography of a Recession, by Graphics Department (The
New
York Times) 234
7.16 How Big Will the UK Population be in 25 Years Time? by
ONS
Digital Content team 234
7.17 Excerpt from ‘Workers’ Compensation Reforms by State’,
by
Yue Qiu and Michael Grabell (ProPublica) 235
7.18 Excerpt from ‘ECB Bank Test Results’, by Monica
Ulmanu,
Laura Noonan and Vincent Flasseur (Reuters Graphics) 236
7.19 History Through the President’s Words, by Kennedy
Elliott, Ted
Mellnik and Richard Johnson (The Washington Post) 237
7.20 Excerpt from ‘How Americans Die’, by Matthew C. Klein
and
Bloomberg Visual Data 237
7.21 Twitter NYC: A Multilingual Social City, by James
Cheshire,
Ed Manley, John Barratt, and Oliver O’Brien 238
7.22 Killing the Colorado: Explore the Robot River, by Abrahm
Lustgarten, Al Shaw, Jeff Larson, Amanda Zamora and Lauren
Kirchner (ProPublica) and John Grimwade 238
7.23 Losing Ground, by Bob Marshall, The Lens, Brian Jacobs
and
Al Shaw (ProPublica) 239
7.24 Excerpt from ‘History Through the President’s Words’, by
Kennedy Elliott, Ted Mellnik and Richard Johnson (The
Washington
Post) 240
7.25 Plow, by Derek Watkins 242
7.26 The Horse in Motion, by Eadweard Muybridge. Source:
United
States Library of Congress’s Prints and Photographs division,
digital
ID cph.3a45870. 243
8.1 Titles Taken from Projects Published and Credited
Elsewhere in
This Book 248
8.2 Excerpt from ‘The Color of Debt: The Black Neighborhoods
Where Collection Suits Hit Hardest’, by Al Shaw, Annie
Waldman
and Paul Kiel (ProPublica) 249
8.3 Excerpt from ‘Kindred Britain’ version 1.0 © 2013 Nicholas
Jenkins – designed by Scott Murray, powered by SUL-CIDR
249
8.4 Excerpt from ‘The Color of Debt: The Black Neighborhoods
Where Collection Suits Hit Hardest’, by Al Shaw, Annie
Waldman
and Paul Kiel (ProPublica) 250
8.5 Excerpt from ‘Bloomberg Billionaires’, by Bloomberg
Visual
Data (Design and development), Lina Chen and Anita Rundles
(Illustration) 251
8.6 Excerpt from ‘Gender Pay Gap US?’, by David McCandless,
Miriam Quick (Research) and Philippa Thomas (Design) 251
8.7 Excerpt from ‘Holdouts Find Cheapest Super Bowl Tickets
Late
in the Game’, by Alex Tribou, David Ingold and Jeremy
Diamond
(Bloomberg Visual Data) 252
8.8 Excerpt from ‘The Life Cycle of Ideas’, by Accurat 252
8.9 Mizzou’s Racial Gap Is Typical On College Campuses, by
FiveThirtyEight 253
8.10 Excerpt from ‘The Infographic History of the World’,
Harper
Collins (2013); by Valentina D’Efilippo (co-author and
designer);
James Ball (co-author and writer); Data source: The
Polynational War
Memorial, 2012 254
8.11 Twitter NYC: A Multilingual Social City, by James
Cheshire,
Ed Manley, John Barratt, and Oliver O’Brien 255
8.12 Excerpt from ‘US Gun Deaths’, by Periscopic 255
8.13 Image taken from Wealth Inequality in America, by
YouTube
user ‘Politizane’ (www.youtube.com/watch?v=QPKKQnijnsM)
256
9.1 HSL Colour Cylinder: Image from Wikimedia Commons
published under the Creative Commons Attribution-Share Alike
3.0
Unported license 265
9.2 Colour Hue Spectrum 265
9.3 Colour Saturation Spectrum 266
9.4 Colour Lightness Spectrum 266
9.5 Excerpt from ‘Executive Pay by the Numbers’, by Karl
Russell
(The New York Times) 267
9.6 How Nations Fare in PhDs by Sex Interactive, by
Periscopic;
Research by Amanda Hobbs; Published in Scientific American
268
9.7 How Long Will We Live – And How Well? by Bonnie
Berkowitz, Emily Chow and Todd Lindeman (The Washington
Post)
268
9.8 Charting the Beatles: Song Structure, by Michael Deal 269
9.9 Photograph of MyCuppa mug, by Suck UK
(www.suck.uk.com/products/mycuppamugs/) 269
9.10 Example of a Stacked Bar Chart Based on Ordinal Data
270
9.11 Rim Fire – The Extent of Fire in the Sierra Nevada Range
and
Yosemite National Park, 2013: NASA Earth Observatory
images, by
Robert Simmon 270
9.12 What are the Current Electricity Prices in Switzerland
[Translated], by Interactive things for NZZ (the Neue Zürcher
Zeitung) 271
9.13 Excerpt from ‘Obama’s Health Law: Who Was Helped
Most’,
by Kevin Quealy and Margot Sanger-Katz (The New York
Times) 272
9.14 Daily Indego Bike Share Station Usage, by Randy Olson (@randal_olson) (http://www.randalolson.com/2015/09/05/visualizing-indego-bike-share-usage-patterns-in-philadelphia-part-2/) 272
9.15 Battling Infectious Diseases in the 20th Century: The
Impact of
Vaccines, by Graphics Department (Wall Street Journal) 273
9.16 Highest Max Temperatures in Australia (1st to 14th
January
2013), Produced by the Australian Government Bureau of
Meteorology 274
9.17 State of the Polar Bear, by Periscopic 275
9.18 Excerpt from Geography of a Recession by Graphics
Department (The New York Times) 275
9.19 Fewer Women Run Big Companies Than Men Named John,
by
Justin Wolfers (The New York Times) 276
9.20 NYPD, Council Spar Over More Officers by Graphics
Department (Wall Street Journal) 277
9.21 Excerpt from a Football Player Dashboard 277
9.22 Elections Performance Index, The Pew Charitable Trusts ©
2014
278
9.23 Art in the Age of Mechanical Reproduction: Walter
Benjamin by
Stefanie Posavec 279
9.24 Casualties, by Stamen, published by CNN 279
9.25 First Fatal Accident in Spain on a High-speed Line
[Translated],
by Rodrigo Silva, Antonio Alonso, Mariano Zafra, Yolanda
Clemente
and Thomas Ondarra (El Pais) 280
9.26 Lunge Feeding, by Jonathan Corum (The New York
Times);
whale illustration by Nicholas D. Pyenson 281
9.27 Examples of Common Background Colour Tones 281
9.28 Excerpt from NYC Street Trees by Species, by Jill Hubley
284
9.29 Demonstrating the Impact of Red-green Colour Blindness
(deuteranopia) 286
9.30 Colour-blind Friendly Alternatives to Green and Red 287
9.31 Excerpt from ‘Psychotherapy in The Arctic’, by Andy Kirk 289
9.32 Wind Map, by Fernanda Viégas and Martin Wattenberg 289
10.1 City of Anarchy, by Simon Scarr (South China Morning
Post)
294
10.2 Wireframe Sketch, by Giorgia Lupi for ‘Nobels no degree’
by
Accurat 295
10.3 Example of the Small Multiples Technique 296
10.4 The Glass Ceiling Persists Redesign, by Francis Gagnon
(ChezVoila.com) based on original by S. Culp (Reuters
Graphics)
297
10.5 Fast-food Purchasers Report More Demands on Their
Time, by
Economic Research Service (USDA) 297
10.6 Stalemate, by Graphics Department (Wall Street Journal)
297
10.7 Nobels No Degrees, by Accurat 298
10.8 Kasich Could Be The GOP’s Moderate Backstop, by
FiveThirtyEight 298
10.9 On Broadway, by Daniel Goddemeyer, Moritz Stefaner,
Dominikus Baur, and Lev Manovich 299
10.10 ER Wait Watcher: Which Emergency Room Will See You
the
Fastest? by Lena Groeger, Mike Tigas and Sisi Wei
(ProPublica) 300
10.11 Rain Patterns, by Jane Pong (South China Morning Post)
300
10.12 Excerpt from ‘Psychotherapy in The Arctic’, by Andy Kirk 301
10.13 Gender Pay Gap US, by David McCandless, Miriam Quick
(Research) and Philippa Thomas (Design) 301
10.14 The Worst Board Games Ever Invented, by
FiveThirtyEight
303
10.15 From Millions, Billions, Trillions: Letters from
Zimbabwe,
2005−2009, a book written and published by Catherine Buckle
(2014), table design by Graham van de Ruit (pg. 193) 303
10.16 List of Chart Structures 304
10.17 Illustrating the Effect of Truncated Bar Axis Scales 305
10.18 Excerpt from ‘Doping under the Microscope’, by S. Scarr
and
W. Foo (Reuters Graphics) 306
10.19 Record-high 60% of Americans Support Same-sex
Marriage,
by Gallup 306
10.20 Images from Wikimedia Commons, published under the
Creative Commons Attribution-Share Alike 3.0 Unported
license 308
11.1–7 ‘The Pursuit of Faster’, by Andy Kirk and Andrew Witherley 318–324
Acknowledgements
This book has been made possible thanks to the unwavering support of my incredible wife, Ellie, and the endless encouragement from my Mum and Dad, the rest of my brilliant family and my super group of friends.
From a professional standpoint I also need to acknowledge the fundamental role played by the hundreds of visualisation practitioners (no matter under what title you ply your trade) who have created such a wealth of brilliant work from which I have developed so many of my convictions and formed the basis of so much of the content in this book. The people and organisations who have provided me with permission to use their work are heroes and I hope this book does their rich talent justice.
About the Author
Andy Kirk is a freelance data visualisation specialist based in Yorkshire, UK. He is a visualisation design consultant, training provider, teacher, researcher, author, speaker and editor of the award-winning website visualisingdata.com.
After graduating from Lancaster University in 1999 with a BSc (hons) in Operational Research, Andy held a variety of business analysis and information management positions at organisations including West Yorkshire Police and the University of Leeds.
He discovered data visualisation in early 2007 just at the time when he was shaping up his proposal for a Master’s (MA) Research Programme designed for members of staff at the University of Leeds. On completing this programme with distinction, Andy’s passion for the subject was unleashed. Following his graduation in December 2009, to continue the process of discovering and learning the subject he launched visualisingdata.com, a blogging platform that would chart the ongoing development of the data visualisation field. Over time, as the field has continued to grow, the site too has reflected this, becoming one of the most popular in the field. It features a wide range of fresh content profiling the latest projects and contemporary techniques, discourse about practical and theoretical matters, commentary about key issues, and collections of valuable references and resources.
In 2011 Andy became a freelance professional focusing on data visualisation consultancy and training workshops. Some of his clients include CERN, Arsenal FC, PepsiCo, Intel, Hershey, the WHO and McKinsey. At the time of writing he has delivered over 160 public and private training events across the UK, Europe, North America, Asia, South Africa and Australia, reaching well over 3000 delegates.
In addition to training workshops Andy also has two academic teaching positions. He joined the highly respected Maryland Institute College of Art (MICA) as a visiting lecturer in 2013 and has been teaching a module on the Information Visualisation Master’s Programme since its inception. In January 2016, he began teaching a data visualisation module as part of the MSc in Business Analytics at the Imperial College Business School in London.
Between 2014 and 2015 Andy was an external consultant on a research project called ‘Seeing Data’, funded by the Arts & Humanities Research Council and hosted by the University of Sheffield. This study explored the issues of data visualisation literacy among the general public and, among many things, helped to shape an understanding of the human factors that affect visualisation literacy and the effectiveness of design.
Introduction
I.1 The Quest Begins
In his book The Seven Basic Plots, author Christopher Booker investigated the history of telling stories. He examined the structures used in biblical teachings and historical myths through to contemporary storytelling devices used in movies and TV. From this study he found seven common themes that, he argues, can be identified in any form of story.
One of these themes was ‘The Quest’. Booker describes this as revolving around a main protagonist who embarks on a journey to acquire a treasured object or reach an important destination, but faces many obstacles and temptations along the way. It is a theme that I feel shares many characteristics with the structure of this book and the nature of data visualisation.
You are the central protagonist in this story in the role of the data visualiser. The journey you are embarking on involves a route along a design workflow where you will be faced with a wide range of different conceptual, practical and technical challenges. The start of this journey will be triggered by curiosity, which you will need to define in order to accomplish your goals. From this origin you will move forward to initiating and planning your work, defining the dimensions of your challenge. Next, you will begin the heavy lifting of working with data, determining what qualities it contains and how you might share these with others. Only then will you be ready to take on the design stage. Here you will be faced with the prospect of handling a spectrum of different design options that will require creative and rational thinking to resolve most effectively.
The multidisciplinary nature of this field offers a unique opportunity and challenge. Data visualisation is not an especially difficult capability to acquire; it is largely a game of decisions. Making better decisions will be your goal but sometimes clear decisions will feel elusive. There will be occasions when the best choice is not at all visible and others when there will be many seemingly equally viable choices. Which one to go with? This book aims to be your guide, helping you navigate efficiently through these difficult stages of your journey.
You will need to learn to be flexible and adaptable, capable of shifting your approach to suit the circumstances. This is important because there are plenty of potential villains lying in wait looking to derail progress. These are the forces that manifest through the imposition of restrictive creative constraints and the pressure created by the relentless ticking clock of timescales. Stakeholders and audiences will present complex human factors through the diversity of their needs and personal traits. These will need to be astutely accommodated. Data, the critical raw material of this process, will dominate your attention. It will frustrate and even disappoint at times, as promises of its treasures fail to materialise irrespective of the hard work, love and attention lavished upon it.
Your own characteristics will also contribute to a certain amount of the villainy. At times, you will find yourself wrestling with internal creative and analytical voices pulling against each other in opposite directions. Your excitably formed initial ideas will be embraced but will need taming. Your inherent tastes, experiences and comforts will divert you away from the ideal path, so you will need to maintain clarity and focus.
The central conflict you will have to deal with is the notion that there is no perfect in data visualisation. It is a field with very few ‘always’ and ‘nevers’. Singular solutions rarely exist. The comfort offered by the rules that instruct what is right and wrong, good and evil, has its limits. You can find small but legitimate breaking points with many of them. While you can rightly aspire to reach as close to perfect as possible, the attitude of aiming for good enough will often indeed be good enough and fundamentally necessary.
In accomplishing the quest you will be rewarded with competency in data visualisation, developing confidence in being able to judge the most effective analytical and design solutions in the most efficient way. It will take time and it will need more than just reading this book. It will also require your ongoing effort to learn, apply, reflect and develop. Each new data visualisation opportunity poses a new, unique challenge. However, if you keep persevering with this journey the possibility of a happy ending will increase all the time.
I.2 Who is this Book Aimed at?
The primary challenge one faces when writing a book about data visualisation is to determine what to leave in and what to leave out. Data visualisation is big. It is too big a subject even to attempt to cover it all, in detail, in one book. There is no single book to rule them all because there is no one book that can cover it all. Each and every one of the topics covered by the chapters in this book could (and, in several cases, do) exist as whole books in their own right.
The secondary challenge when writing a book about data visualisation is to decide how to weave all the content together. Data visualisation is not rocket science; it is not an especially complicated discipline. Lots of it, as you will see, is rooted in common sense. It is, however, certainly a complex subject, a semantic distinction that will be revisited later. There are lots of things to think about and decide on, as well as many things to do and make. Creative and analytical sensibilities blend with artistic and scientific judgments. In one moment you might be checking the statistical rigour of your calculations, in the next deciding which tone of orange most elegantly contrasts with an 80% black. The complexity of data visualisation manifests itself through how these different ingredients, and many more, interact, influence and intersect to form the whole.
The decisions I have made in formulating this book’s content have been shaped by my own process of learning about, writing about and practising data visualisation for, at the time of writing, nearly a decade. Significantly – from the perspective of my own development – I have been fortunate to have had extensive experience designing and delivering training workshops and postgraduate teaching. I believe you only truly learn about your own knowledge of a subject when you have to explain it and teach it to others.
I have arrived at what I believe to be an effective and proven
pedagogy
that successfully translates the complexities of this subject into
accessible,
practical and valuable form. I feel well qualified to bridge the
gap between
the large population of everyday practitioners, who might
identify
themselves as beginners, and the superstar technical, creative
and
academic minds that are constantly pushing forward our
understanding of
the potential of data visualisation. I am not going to claim to
belong to that
latter cohort, but I have certainly been the former – a beginner –
and most
of my working hours are spent helping other beginners start
their journey.
I know the things that I would have valued when I was starting
out and I
know how I would have wished them to be articulated and
presented for
me to develop my skills most efficiently.
There is a large and growing library of fantastic books offering
many
different theoretical and practical viewpoints on the subject of
data
visualisation. My aim is to bring value to this existing
collection of work
by taking on a particular perspective that is perhaps under-
represented in
other texts – exploring the notion and practice of a visualisation
design
process. As I have alluded to in the opening, the central premise
of this
book is that the path to mastering data visualisation is achieved
by making
better decisions: effective choices, efficiently made. The book’s
central
goal is to help develop your capability and confidence in facing
these
decisions.
Just as a single book cannot cover the whole of this subject, it
stands that a
single book cannot aim to address directly the needs of all
people doing
data visualisation. In this section I am going to run through
some of the
characteristics that shape the readers to whom this book is
primarily
targeted. I will also put into context the content the book will
and will not
cover, and why. This will help manage your expectations as the
reader and
establish its value proposition compared with other titles.
Domain and Duties
The core audiences for whom this book has been primarily
written are
undergraduate and postgraduate-level students and early career
researchers
from social science subjects. This reflects a growing number of
people in
higher education who are interested in and need to learn about
data
visualisation.
Although aimed at social sciences, the content will also be
relevant across
the spectrum of academic disciplines, from the arts and
humanities right
through to the formal and natural sciences: any academic duty
where there
is an emphasis on the use of quantitative and qualitative
methods in studies
will require an appreciation of good data visualisation practices.
Where
statistical capabilities are relevant so too is data visualisation.
Beyond academia, data visualisation is a discipline that has
reached
mainstream consciousness with an increasing number of
professionals and
organisations, across all industry types and sizes, recognising
the
importance of doing it well for both internal and external
benefit. You
might be a market researcher, a librarian or a data analyst
looking to
enhance your data capabilities. Perhaps you are a skilled
graphic designer
or web developer looking to take your portfolio of work into a
more data-
driven direction. Maybe you are in a managerial position and
not directly
involved in the creation of visualisation work, but you need to
coordinate
or commission others who will be. You require awareness of the
most
efficient approaches, the range of options and the different key
decision
points. You might be seeking generally to improve the
sophistication of
the language you use around commissioning visualisation work
and to
have a better way of expressing and evaluating work created for
you.
Basically, anyone who is involved in whatever capacity with the
analysis
and visual communication of data as part of their professional
duties will
need to grasp the demands of data visualisation and this book
will go some
way to supporting these needs.
Subject Neutrality
One of the important aspects of the book will be to emphasise
that data
visualisation is a portable practice. You will see a broad array
of examples
of work from different industries, covering very different
topics. What will
become apparent is that visualisation techniques are largely
subject-matter
neutral: a line chart that displays the ebb and flow of favourable
opinion
towards a politician involves the same techniques as using a
line chart to
show how a stock has changed in value over time or how peak
temperatures have changed across a season in a given location.
A line
chart is a line chart, regardless of the subject matter. The
context of the
viewers (such as their needs and their knowledge) and the
specific
meaning that can be drawn will inevitably be unique to each
setting, but
the role of visualisation itself is adaptable and portable across
all subject
areas.
Data visualisation is an entirely global concern, not focused on
any defined
geographic region. Although the English language dominates
the written
discourse (books, websites) about this subject, the interest in it
and visible
output from across the globe are increasing at a pace. There are
cultural
matters that influence certain decisions throughout the design
process,
especially around the choices made for colour usage, but
otherwise it is a
discipline common to all.
Level and Prerequisites
The coverage of this book is intended to serve the needs of
beginners and
those with intermediate capability. For most people, this is
likely to be as
far as they might ever need to go. It will offer an accessible
route for
novices to start their learning journey and, for those already
familiar with
the basics, there will be content that will hopefully contribute to
fine-
tuning their approaches.
For context, I believe the only distinction between beginner and
intermediate is one of breadth and depth of critical thinking
rather than any
degree of difficulty. The more advanced techniques in
visualisation tend to
be associated with the use of specific technologies for handling
larger,
complex datasets and/or producing more bespoke and feature-
rich outputs.
This book is therefore not aimed at experienced or established
visualisation practitioners. There may be some new perspectives
to enrich
their thinking, some content that will confirm and other content
that might
constructively challenge their convictions. Otherwise, the
coverage in this
book should really echo the practices they are likely to be
already
observing.
As I have already touched on, data visualisation is a genuinely
multidisciplinary field. The people who are active in this field
or
profession come from all backgrounds – everyone has a
different entry
point and nobody arrives with all constituent capabilities. It is
therefore
quite difficult to define just what are the right type and level of
pre-
existing knowledge, skills or experiences for those learning
about data
visualisation. As each year passes, the savvy-ness of the type of
audience
this book targets will increase, especially as the subject
penetrates more
into the mainstream. What were seen as bewilderingly new
techniques
several years ago are now commonplace to more people.
That said, I think the following would be a fair outline of the
type and
shape of some of the most important prerequisite attributes for
getting the
most out of this book:
· Strong numeracy is necessary as well as a familiarity with basic
statistics.
· While it is reasonable to assume limited prior knowledge of data
visualisation, there should be a strong desire to learn it. The
demands of learning a craft like data visualisation take time and
effort; the capabilities will need nurturing through ongoing learning
and practice. They are not going to be achieved overnight or acquired
from reading this book alone. Any book that claims to be able
magically to inject mastery through just reading it cover to cover is
over-promising and likely to under-deliver.
· The best data visualisers possess inherent curiosity. You should be
the type of person who is naturally disposed to question the world
around them or can imagine what questions others have. Your instinct
for discovering and sharing answers will be at the heart of this
activity.
· There are no expectations of your having any prior familiarity with
design principles, but a desire to embrace some of the creative aspects
presented in this book will heighten the impact of your work. Unlock
your artistry!
· If you are somebody with a strong creative flair you are very
fortunate. This book will guide you through when and crucially when
not to tap into this sensibility. You should be willing to increase the
rigour of your analytical decision making and be prepared to have
your creative thinking informed more fundamentally by data rather
than just instinct.
· A range of technical skills covering different software applications,
tools and programming languages is not expected for this book, as I
will explain next, but you will ideally have some knowledge of basic
Excel and some experience of working with data.
I.3 Getting the Balance
Handbook vs Tutorial Book
The description of this book as being a ‘handbook’ positions it
as being of
practical help and presented in accessible form. It offers
direction with
comprehensive reference – more of a city guidebook for a
tourist than an
instruction manual to fix a washing machine. It will help you to
know what
things to think about, when to think about them, what options
exist and
how best to resolve all the choices involved in any data-driven
design.
Technology is the key enabler for working with data and
creating
visualisation design outputs. Indeed, apart from a small
proportion of
artisan visualisation work that is drawn by hand, the reliance on
technology to create visualisation work is an inseparable
necessity. For
many there is an understandable appetite for step-by-step
tutorials that help
them immediately to implement data visualisation techniques
via existing
and new tools.
However, writing about data visualisation through the lens of
selected
tools is a bit of a minefield, given the diversity of technical
options out
there and the mixed range of skills, access and needs. I greatly
admire
those people who have authored tutorial-based texts because
they require
astute judgement about what is the right level, structure and
scope.
The technology space around visualisation is characterised by
flux. There
are the ongoing changes with the enhancement of established
tools as well
as a relatively high frequency of new entrants offset by the
decline of
others. Some tools are proprietary, others are open source; some
are easier
to learn, others require a great deal of understanding before you
can even
consider embarking on your first chart. There are many recent
cases of
applications or services that have enjoyed fleeting exposure
before
reaching a plateau: development and support decline, the
community of
users disperses and there is a certain expiry of value.
Deprecation of
syntax and functions in programming languages requires the
perennial
updating of skills.
All of this perhaps paints a rather more chaotic picture than is
necessarily
the case but it explains why this book does not offer
teaching in
the use of any tools. While tutorials may be invaluable to some,
they may
also only be mildly interesting to others and possibly of no
value to most.
Tools come and go but the craft remains. I believe that creating
a practical,
rather than necessarily a technical, text that focuses on the
underlying craft
of data visualisation with a tool-agnostic approach offers an
effective way
to begin learning about the subject in appropriate depth. The
content
should be appealing to readers irrespective of the extent of their
technical
knowledge (novice to advanced technicians) and specific tool
experiences
(e.g. knowledge of Excel, Tableau, Adobe Illustrator).
There is a role for all book types. Different people want
different sources
of insight at different stages in their development. If you are
seeking a text
that provides in-depth tutorials on a range of tools or pages of
programmatic instruction, this one will not be the best choice.
However, if
you consult only tutorial-related books, the chances are you will
fall
short on the fundamental critical thinking that will be needed in
the longer
term to get the most out of the tools with which you develop
strong skills.
To substantiate the book’s value, the digital companion
resources to this
book will offer a curated, up-to-date collection of visualisation
technology
resources that will guide you through the most common and
valuable tools,
helping you to gain a sense of what their roles are and where
these fit into
the design workflow. Additionally, there will be recommended
exercises
and many further related digital materials available for
exploring.
Useful vs Beautiful
Another important distinction to make is that this book is not
intended to
be seen as a beauty pageant. I love flicking through those glossy
‘coffee
table’ books as much as the next person; such books offer great
inspiration
and demonstrate some of the finest work in the field. This book
serves a
very different purpose. I believe that, as a beginner or relative
beginner on
this learning journey, the inspiration you need comes more from
understanding what is behind the thinking that makes these
amazing works
succeed and others not.
My desire is to make this the most useful text available, a
reference that
will spend more time on your desk than on your bookshelf. To
be useful is
to be used. I want the pages to be dog-eared. I want to see
scribbles and
annotated notes made across its pages and key passages
underlined. I want
to see sticky labels peering out above identified pages of note. I
want to
see creases where pages have been folded back or a double-page
spread
that has been weighed down to keep it open. In time I even want
its cover
reinforced with wallpaper or wrapping paper to ensure its
contents remain
bound together. There is every intention of making this an
elegantly
presented and packaged book but it should not be something
that invites
you to ‘look, but don’t touch’.
Pragmatic vs Theoretical
The content of this book has been formed through many years of
absorbing
knowledge from all manner of books, generations of academic
papers,
thousands of web articles, hundreds of conference talks, endless
online and
personal discussions, and lots of personal practice. What I
present here is a
pragmatic translation and distillation of what I have learned
down the
years.
It is not a deeply academic or theoretical book. Where
theoretical context
and reference is relevant it will be signposted as I do want to
ground this
book in as much evidence-based content as possible; it is about
judging
what is going to add most value. Experienced practitioners will
likely have
an appetite for delving deeper into theoretical discourse and the
underlying
sciences that intersect in this field but that is beyond the scope
of this
particular text.
Take the science of visual perception, for example. There is no
value in
attempting to emulate what has already been covered by other
books in
greater depth and quality than I could achieve. Once you start
peeling back
the many different layers of topics like visual and cognitive
science the
boundaries of your interest and their relevance to data
visualisation never
seem to arrive. You get swallowed up by the depth of these
subjects. You
realise that you have found yourself learning about what the
very concept
of light and sight is and at that point your brain begins to ache
(well, mine
does at least), especially when all you set out to discover was if
a bar chart
would be better than a pie chart.
An important reason for giving greater weight to pragmatism is
people: people are the makers, the stakeholders, the audiences
and the
critics in data visualisation. Although there are a great many
valuable
research-driven concepts concerning data visualisation, their
practical
application can be occasionally at odds with the somewhat
sanitised and
artificial context of the research methods employed. To
translate them into
real-world circumstances can sometimes be easier said than
done as the
influence of human factors can easily distort the significance of
otherwise
robust ideas.
I want to remove the burden from you as a reader of having to
translate
relevant theoretical discourse into applicable practice. Critical
thinking
will therefore be the watchword, equipping you with the
independence of
thought to decide rationally for yourself what the solutions are
that best fit
your context, your data, your message and your audience. To do
this you
will need an appreciation of all the options available to you (the
different
things you could do) and a reliable approach for critically
determining
what choices you should make (the things you will do and why).
Contemporary vs Historical
This book is not going to look too far back into the past. We all
respect the
ancestors of this field, the great names who, despite primitive
means,
pioneered new concepts in the visual display of statistics to
shape the
foundations of the field being practised today. The field’s
lineage is
decorated by the influence of William Playfair’s first ever bar
chart,
Charles Joseph Minard’s famous graphic about Napoleon’s
Russian
campaign, Florence Nightingale’s Coxcomb plot and John
Snow’s cholera
map. These are some of the totemic names and classic examples
that will
always be held up as the ‘firsts’. Of course, to many beginners
in the field,
this historical context is of huge interest. However, again, this
kind of
content has already been superbly covered by other texts on
more than
enough occasions. Time to move on.
I am not going to spend time attempting to enlighten you about
how we
live in the age of ‘Big Data’ and how occupations related to
data are or
will be the ‘sexiest jobs’ of our time. The former is no longer
news; the
latter claim emerged from a single source. I do not want to bloat
this book
with the unnecessary reprising of topics that have been covered
at length
elsewhere. There is more valuable and useful content I want you
to focus
your time on.
The subject matter, the ideas and the practices presented here
will
hopefully not date a great deal. Of course, many of the graphic
examples
included in the book will be surpassed by newer work
demonstrating
similar concepts as the field continues to develop. However,
their worth as
exhibits of a particular perspective covered in the text should
prove
timeless. As more research is conducted in the subject, without
question
there will be new techniques, new concepts, new empirically
evidenced
principles that emerge. Maybe even new rules. There will be
new thought-
leaders, new sources of reference, new visualisers to draw
insight from.
New tools will be created, existing tools will expire. Some
things that are
done and can only be done by hand as of today may become
seamlessly
automated in the near future. That is simply the nature of a fast-
growing
field. This book can only be a line in the sand.
Analysis vs Communication
A further important distinction to make concerns the subtle but
significant
difference between visualisations which are used for analysis
and
visualisations used for communication.
Before a visualiser can confidently decide what to communicate
to others,
he or she needs to have developed an intimate understanding of
the
qualities and potential of the data. This is largely achieved
through
exploratory data analysis. Here, the visualiser and the viewer
are the same
person. Through visual exploration, different interrogations can
be pursued
‘on the fly’ to unearth confirmatory or enlightening discoveries
about what
insights exist.
Visualisation techniques used for analysis will be a key
component of the
journey towards creating visualisation for communication but
the practices
involved differ. Unlike visualisation for communication, the
techniques
used for visual analysis do not have to be visually polished or
necessarily
appealing. They are only serving the purpose of helping you to
truly learn
about your data. When a data visualisation is being created to
communicate to others, many careful considerations come into
play about
the requirements and interests of the intended or expected
audience. This
has a significant influence on many of the design decisions you
make, decisions that
do not arise with visual analysis alone.
Exploratory data analysis is a huge and specialist subject in and
of itself. In
its most advanced form, working efficiently and effectively with
large
complex data, topics like ‘machine learning’, using self-
learning
algorithms to help automate and assist in the discovery of
patterns in data,
become increasingly relevant. For the scope of this book the
content is
weighted more towards methods and concerns about
communicating data
visually to others. If your role is in pure data science or
statistical analysis
you will likely require a deeper treatment of the exploratory
data analysis
topic than this book can reasonably offer. However, Chapter 4
will cover
the essential elements in sufficient depth for the practical needs
of most
people working with data.
Print vs Digital
The opportunity to supplement the print version of this book
with an e-
book and further digital companion resources helps to cushion
the
agonising decisions about what to leave out. This text is
therefore
enhanced by access to further digital resources, some of which
are newly
created, while others are curated references from the endless
well of
visualisation content on the Web. Included online
(book.visualisingdata.com) will be:
· a completed case-study project that demonstrates the workflow
activities covered in this book, including full write-ups and all
related digital materials;
· an extensive and up-to-date catalogue of over 300 data visualisation
tools;
· a curated collection of tutorials and resources to help develop your
confidence with some of the most common and valuable tools;
· practical exercises designed to embed the learning from each chapter;
· further reading resources to continue learning about the subjects
covered in each chapter.
I.4 Objectives
Before moving on to an outline of the book’s contents, I want to
share four
key objectives that I hope to accomplish for you by the final
chapter.
These are themes that will run through the entire text:
challenge, enlighten,
equip and inspire.
To challenge you I will be encouraging you to recognise that
your current
thinking about visualisation may need to be reconsidered, both
as a creator
and as a consumer. We all arrive in visualisation from different
subject and
domain origins and with that comes certain baggage and prior
sensibilities
that can distort our perspectives. I will not be looking to
eliminate these,
rather to help you harness and align them with other traits and
viewpoints.
I will ask you to relentlessly consider the diverse decisions
involved in this
process. I will challenge your convictions about what you
perceive to be
good or bad, effective or ineffective visualisation choices:
arbitrary
choices will be eliminated from your thinking. Even if you are
not
necessarily a beginner, I believe the content you read in this
book will
make you question some of your own perspectives and
assumptions. I will
encourage you to reflect on your previous work, asking you to
consider
how and why you have designed visualisations in the way that
you have:
where do you need to improve? What can you do better?
It is not just about creating visualisations; I will also challenge
your
approach to reading visualisations. This is not something you
might
usually think much about, but there is an important role for
more tactical
approaches to consuming visualisations with greater efficiency
and
effectiveness.
To enlighten you will be to increase your awareness of the
possibilities in
data visualisation. As you begin your discovery of data
visualisation you
might not be aware of the whole: you do not entirely know what
options
exist, how they are connected and how to make good choices.
Until you
know, you don’t know – that is what the objective of
enlightening is all
about.
As you will discover, there is a lot on your plate, much to work
through. It
is not just about the visible end-product design decisions.
Hidden beneath
the surface are many contextual circumstances to weigh up,
decisions
about how best to prepare your data, choices around the
multitude of
viable ways of slicing those data up into different angles of
analysis. That
is all before you even reach the design stage, where you will
begin to
consider the repertoire of techniques for visually portraying
your data – the
charts, the interactive features, the colours and much more
besides.
This book will broaden your visual vocabulary to give you more
ways of
expressing your data visually. It will enhance the sophistication
of your
decision making and of visual language for any of the
challenges you may
face.
To equip is to ensure you have robust tactics for managing your
way
through the myriad options that exist in data visualisation. The
variety it
offers makes for a wonderful prospect but, equally, introduces
the burden
of choice. This book aims to make the challenge of undertaking
data
visualisation far less overwhelming, breaking down the overall
prospect
into smaller, more manageable task chunks.
The structure of this book will offer a reliable and flexible
framework for
thinking, rather than rules for learning. It will lead to better
decisions.
With an emphasis on critical thinking you will move away from
an over-
reliance on gut feeling and taste. To echo what I mentioned
earlier, its role
as a handbook will help you know what things to think about,
when to
think about them and how best to resolve all the thinking
involved in any
data-driven design challenge you meet.
To inspire is to give you more than just a book to read. It is the
opening of
a door into a subject to inspire you to step further inside. It is
about helping
you to want to continue to learn about it and expose yourself to
as much
positive influence as possible. It should elevate your ambition
and broaden
your capability.
It is a book underpinned by theory but dominated by practical
and
accessible advice, including input from some of the best
visualisers in the
field today. The range of print and digital resources will offer
lots of
supplementary material including tutorials, further reading
materials and
suggested exercises. Collectively this will hopefully make it one
of the
most comprehensive, valuable and inspiring titles out there.
I.5 Chapter Contents
The book is organised into four main parts (A, B, C and D)
comprising
eleven chapters and preceded by the ‘Introduction’ sections you
are
reading now.
Each chapter opens with an introductory outline that previews
the content
to be covered and provides a bridge between consecutive
chapters. In the
closing sections of each chapter the most salient learning points
will be
summarised and some important, practical tips and tactics
shared. As
mentioned, online there will be collections of practical
exercises and
further reading resources recommended to substantiate the
learning from
the chapter.
Throughout the book you will see sidebar captions that will
offer relevant
references, aphorisms, good habits and practical tips from some
of the
most influential people in the field today.
Introduction
This introduction explains how I have attempted to make sense
of the
complexity of the subject, outlining the nature of the audience I
am trying
to reach, the key objectives, what topics the book will be
covering and not
covering, and how the content has been organised.
Part A: Foundations
Part A establishes the foundation knowledge and sets up a key
reference of
understanding that aids your thinking across the rest of the
book. Chapter 1
will be the logical starting point for many of you who are new
to the field
to help you understand more about the definitions and attributes
of data
visualisation. Even if you are not a complete beginner, the
content of the
chapter forms the terms of reference that much of the remaining
content is
based on. Chapter 2 prepares you for the journey through the
rest of the
book by introducing the key design workflow that you will be
following.
Chapter 1: Defining Data Visualisation
Defining data visualisation: outlining the components of
thinking
that make up the proposed definition for data visualisation.
The importance of conviction: presenting three guiding
principles of
good visualisation design: trustworthy, accessible and elegant.
Distinctions and glossary: explaining the distinctions and
overlaps
with other related disciplines and providing a glossary of terms
used
in this book to establish consistency of language.
Chapter 2: Visualisation Workflow
The importance of process: describing the data visualisation
design
workflow, what it involves and why a process approach is
required.
The process in practice: providing some useful tips, tactics and
habits that transcend any particular stage of the process but will
best
prepare you for success with this activity.
Part B: The Hidden Thinking
Part B discusses the first three preparatory stages of the data
visualisation
design workflow. ‘The hidden thinking’ title refers to how these
vital
activities, which have a huge influence over the eventual design
solution, are
somewhat out of sight in the final output; they are hidden
beneath the
surface but completely shape what is visible. These stages
represent the
often neglected contextual definitions, data wrangling and
editorial
challenges that are so critical to the success or otherwise of any
visualisation work – they require a great deal of care and
attention before
you switch your attention to the design stage.
Chapter 3: Formulating Your Brief
What is a brief?: describing the value of compiling a brief to
help
initiate, define and plan the requirements of your work.
Establishing your project’s context: defining the origin
curiosity or
motivation, identifying all the key factors and circumstances
that
surround your work, and defining the core purpose of your
visualisation.
Establishing your project’s vision: early considerations about
the
type of visualisation solution needed to achieve your aims and
harnessing initial ideas about what this solution might look like.
Chapter 4: Working With Data
Data literacy: establishing a basic understanding of this
critical
literacy, providing some foundation understanding about
datasets and
data types and some observations about statistical literacy.
Data acquisition: outlining the different origins of and methods
for
accessing your data.
Data examination: approaches for acquainting yourself with the
physical characteristics and meaning of your data.
Data transformation: optimising the condition, content and form
of
your data fully to prepare it for its analytical purpose.
Data exploration: developing deeper intimacy with the potential
qualities and insights contained, and potentially hidden, within
your
data.
Chapter 5: Establishing Your Editorial Thinking
What is editorial thinking?: defining the role of editorial
thinking in
data visualisation.
The influence of editorial thinking: explaining how the different
dimensions of editorial thinking influence design choices.
Part C: Developing Your Design Solution
Part C is the main part of the book and covers progression
through the data
visualisation design and production stage. This is where your
concerns
switch from hidden thinking to visible thinking. The individual
chapters in
this part of the book cover each of the five layers of the data
visualisation
anatomy. They are treated as separate affairs to aid the clarity
and
organisation of your thinking, but they are entirely interrelated
matters and
the chapter sequences support this. Within each chapter there is
a
consistent structure beginning with an introduction to each
design layer, an
overview of the many different possible design options,
followed by
detailed guidance on the factors that influence your choices.
The production cycle: describing the cycle of development
activities
that take place during this stage, giving a context for how to
work
through the subsequent chapters in this part.
Chapter 6: Data Representation
Introducing visual encoding: an overview of the essentials of
data
representation looking at the differences and relationships
between
visual encoding and chart types.
Chart types: a detailed repertoire of 49 different chart types,
profiled
in depth and organised by a taxonomy of chart families:
categorical,
hierarchical, relational, temporal, and spatial.
Influencing factors and considerations: presenting the factors
that
will influence the suitability of your data representation
choices.
Chapter 7: Interactivity
The features of interactivity:
Data adjustments: a profile of the options for interactively
interrogating and manipulating data.
View adjustments: a profile of the options for interactively
configuring the presentation of data.
Influencing factors and considerations: presenting the factors
that will
influence the suitability of your interactivity choices.
Chapter 8: Annotation
The features of annotation:
Project annotation: a profile of the options for helping to
provide
viewers with general explanations about your project.
Chart annotation: a profile of the annotated options for helping
to
optimise viewers’ understanding of your charts.
Influencing factors and considerations: presenting the factors
that will
influence the suitability of your annotation choices.
Chapter 9: Colour
The features of colour:
Data legibility: a profile of the options for using colour to
represent
data.
Editorial salience: a profile of the options for using colour to
direct
the eye towards the most relevant features of your data.
Functional harmony: a profile of the options for using colour
most
effectively across the entire visualisation design.
Influencing factors and considerations: presenting the factors
that will
influence the suitability of your colour choices.
Chapter 10: Composition
The features of composition:
Project composition: a profile of the options for the overall
layout and
hierarchy of your visualisation design.
Chart composition: a profile of the options for the layout and
hierarchy of the components of your charts.
Influencing factors and considerations: presenting the factors
that will
influence the suitability of your composition choices.
Part D: Developing Your Capabilities
Part D wraps up the book’s content by reflecting on the range of
capabilities required to develop confidence and competence
with data
visualisation. Following completion of the design process, the
multidisciplinary nature of this subject will now be clearly
established.
This final part assesses the two sides of visualisation literacy –
your role as
a creator and your role as a viewer – and what you need to
enhance your
skills with both.
Chapter 11: Visualisation Literacy
Viewing: Learning to see: learning about the most effective
strategy
for understanding visualisations in your role as a viewer rather
than a
creator.
Creating: The capabilities of the visualiser: profiling the skill
sets,
mindsets and general attributes needed to master data
visualisation
design as a creator.
Part A Foundations
1 Defining Data Visualisation
This opening chapter will introduce you to the subject of data
visualisation, defining what data visualisation is and is not. It
will outline
the different ingredients that make it such an interesting recipe
and
establish a foundation of understanding that will form a key
reference for
all of the decision making you are faced with.
Three core principles of good visualisation design will be
presented that
offer guiding ideals to help mould your convictions about
distinguishing
between the effective and the ineffective in data visualisation.
You will also see how data visualisation sits alongside or
overlaps with
other related disciplines, and some definitions about the use of
language in
this book will be established to ensure consistency in meaning
across all
chapters.
1.1 The Components of Understanding
To set the scene for what is about to follow, I think it is
important to start
this book with a proposed definition for data visualisation
(Figure 1.1).
This definition offers a critical term of reference because its
components
and their meaning will touch on every element of content that
follows in
this book. Furthermore, as a subject that has many different
proposed
definitions, I believe it is worth clarifying my own view before
going
further:
Figure 1.1 A Definition for Data Visualisation
At first glance this might appear to be a surprisingly short
definition: isn’t
there more to data visualisation than that, you might ask? Can
nine words
sufficiently articulate what has already been introduced as an
eminently
complex and diverse discipline?
I have arrived at this after many years of iterations attempting
to improve
the elegance of my definition. In the past I have tried to force
too many
words and too many clauses into one statement, making it
cumbersome
and rather undermining its value. Over time, as I have
developed greater
clarity in my own convictions, I have in turn managed to
establish greater
clarity about what I feel is the real essence of this subject. The
definition
above is, I believe, a succinct and practically useful description
of what the
pursuit of visualisation is truly about. It is a definition that
largely informs
the contents of this book. Each chapter will aim to enlighten
you about
different aspects of the roles of and relationships between each
component
expressed. Let me introduce and briefly examine each of these
one by one,
explaining where and how they will be discussed in the book.
Firstly, data, our critical raw material. It might appear a
formality to
mention data in the definition for, after all, we are talking about
data
visualisation as opposed to, let’s say, cheese visualisation
(though
visualisation of data using cheese has happened, see Figure
1.2), but the core role that data has in the design process
needs to be made clear.
Without data there is no visualisation; indeed there is no need
for one.
Data plays the fundamental role in this work, so you will need
to give it
your undivided attention and respect. You will discover in
Chapter 4 the
importance of developing an intimacy with your data to
acquaint yourself
with its physical properties, its meaning and its potential
qualities.
Figure 1.2 Per Capita Cheese Consumption in the US
Data is names, amounts, groups, statistical values, dates,
comments,
locations. Data is textual and numeric in format, typically held
in datasets
in table form, with rows of records and columns of different
variables.
This tabular form of data is what we will be considering as the
raw form of
data. Through tables, we can look at the values contained to
precisely read
them as individual data points. We can look up values quite
efficiently,
scanning across many variables for the different records held.
However,
we cannot easily establish the comparative size and relationship
between
multiple data points. Our eyes and mind are not equipped to
translate
easily the textual and numeric values into quantitative and
qualitative
meaning. We can look at the data but we cannot really see it
without the
context of relationships that help us compare and contrast them
effectively
with other values. To derive understanding from data we need to
see it
represented in a different, visual form. This is the act of data
representation.
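The contrast between reading a table and seeing a representation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the category names and the helper functions `look_up` and `text_bars` are invented here and do not come from the book.

```python
# Hypothetical records: the categories and amounts are invented for illustration.
records = [
    {"category": "A", "amount": 12},
    {"category": "B", "amount": 30},
    {"category": "C", "amount": 7},
]


def look_up(rows, category):
    """Read one precise value from the table, as you would by scanning a row."""
    for row in rows:
        if row["category"] == category:
            return row["amount"]
    raise KeyError(category)


def text_bars(rows, width=30):
    """Represent each amount as a bar whose length is proportional to its value."""
    largest = max(row["amount"] for row in rows)
    lines = []
    for row in rows:
        length = round(row["amount"] / largest * width)
        lines.append(f'{row["category"]} | {"#" * length} {row["amount"]}')
    return lines


print(look_up(records, "B"))           # precise reading of a single data point
print("\n".join(text_bars(records)))   # relative sizes become visible at a glance
```

The table lookup answers "what is the value?" efficiently, whereas the crude text bars make the comparative sizes apparent without any mental arithmetic.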
This word representation is deliberately positioned near the
front of the
definition because it is the quintessential activity of data
visualisation
design. Representation concerns the choices made about the
form in which
your data will be visually portrayed: in lay terms, what chart or
charts you
will use to exploit the brain’s visual perception capabilities
most
effectively.
When data visualisers create a visualisation they are
representing the data
they wish to show visually through combinations of marks and
attributes.
Marks are points, lines and areas. Attributes are the appearance
properties
of these marks, such as the size, colour and position. The recipe
of these
marks and their attributes, along with other components of
apparatus, such
as axes and gridlines, form the anatomy of a chart.
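One way to picture this anatomy is as a small data structure. The sketch below is a hedged illustration, not a description of any charting library: the class names `Mark` and `Chart`, the helper `encode_bar` and the sample values are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Mark:
    """A visual mark plus the attributes that govern its appearance."""
    kind: str        # "point", "line" or "area"
    position: tuple  # (x, y) placement of the mark
    size: float
    colour: str


@dataclass
class Chart:
    """Marks and their attributes, together with the chart apparatus."""
    marks: list = field(default_factory=list)
    apparatus: list = field(default_factory=list)  # e.g. axes and gridlines


def encode_bar(index, value, colour):
    """Encode one value as an area mark: position from its index, size from its value."""
    return Mark(kind="area", position=(index, 0), size=float(value), colour=colour)


values = [4, 9, 6]  # invented values for illustration
chart = Chart(
    marks=[encode_bar(i, v, "purple") for i, v in enumerate(values)],
    apparatus=["x-axis", "y-axis", "gridlines"],
)
print(chart.marks[1])
```

Changing which attribute carries the value (size, colour or position) while keeping the data fixed is, in effect, choosing a different representation.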
In Chapter 6 you will gain a deeper and more sophisticated
appreciation of
the range of different charts that are in common usage today,
broadening
your visual vocabulary. These charts will vary in complexity
and
composition, with each capable of accommodating different
types of data
and portraying different angles of analysis. You will learn about
the key
ingredients that shape your data representation decisions,
explaining the
factors that distinguish the effective from the ineffective
choices.
Beyond representation choices, the presentation of data
concerns all the
other visible design decisions that make up the overall
visualisation
anatomy. This includes choices about the possible applications
of
interactivity, features of annotation, colour usage and the
composition of
your work. During the early stages of learning this subject it is
sensible to
partition your thinking about these matters, treating them as
isolated
design layers. This will aid your initial critical thinking.
Chapters 7–10
will explore each of these layers in depth, profiling the options
available
and the factors that influence your decisions.
However, as you gain in experience, the interrelated nature of
visualisation
will become much more apparent and you will see how the
overall design
anatomy is entirely connected. For instance, the selection of a
chart type
intrinsically leads to decisions about the space and place it will
occupy; an
interactive control may be included to reveal an annotated
caption; for any
design property to be even visible to the eye it must possess a
colour that is
different from that of its background.
The goal expressed in this definition states that data
visualisation is about
facilitating understanding. This is very important and some
extra time is
required to emphasise why it is such an influential component
in our
thinking. You might think you know what understanding means,
but when
you peel back the surface you realise there are many subtleties
that need to
be acknowledged about this term and their impact on your data
visualisation choices. Understanding ‘understanding’ (still with
me?) in
the context of data visualisation is of elementary significance.
When consuming a visualisation, the viewer will go through a
process of
understanding involving three stages: perceiving, interpreting
and
comprehending (Figure 1.3). Each stage is dependent on the
previous one
and in your role as a data visualiser you will have influence but
not full
control over these. You are largely at the mercy of the viewer –
what they
know and do not know, what they are interested in knowing and
what
might be meaningful to them – and this introduces many
variables outside
of your control: where your control diminishes, the influence of
and reliance
on the viewer increases. Achieving an outcome of understanding
is
therefore a collective responsibility between visualiser and
viewer.
Perceiving, interpreting and comprehending are not just synonyms
for the same word; rather they
carry
important distinctions that need appreciating. As you will see
throughout this book, the subtleties and semantics of language
in data
visualisation will be a recurring concern.
Figure 1.3 The Three Stages of Understanding
Let’s look at the characteristics of the different stages that form
the process
of understanding to help explain their respective differences and
mutual
dependencies.
Firstly, perceiving. This concerns the act of simply being able
to read a
chart. What is the chart showing you? How easily can you get a
sense of
the values of the data being portrayed?
Where are the largest, middle-sized and smallest values?
What proportion of the total does that value hold?
How do these values compare in ranking terms?
To which other values does this have a connected relationship?
The notion of understanding here concerns our attempts as
viewers to
efficiently decode the representations of the data (the shapes,
the sizes and
the colours) as displayed through a chart, and then convert them
into
perceived values: estimates of quantities and their relationships
to other
values.
Interpreting is the next stage of understanding following on
from
perceiving. Having read the charts the viewer now seeks to
convert these
perceived values into some form of meaning:
Is it good to be big or better to be small?
What does it mean to go up or go down?
Is that relationship meaningful or insignificant?
Is the decline of that category especially surprising?
The viewer’s ability to form such interpretations is influenced
by their pre-
existing knowledge about the portrayed subject and their
capacity to utilise
that knowledge to frame the implications of what has been read.
Where a
viewer does not possess that knowledge it may be that the
visualiser has to
address this deficit. They will need to make suitable design
choices that
help to make clear what meaning can or should be drawn from
the display
of data. Captions, headlines, colours and other annotated
devices, in
particular, can all be used to achieve this.
Comprehending involves reasoning the consequence of the
perceiving and
interpreting stages to arrive at a personal reflection of what all
this means
to them, the viewer. How does this information make a
difference to what
was known about the subject previously?
Why is this relevant? What wants or needs does it serve?
Has it confirmed what I knew or possibly suspected beforehand
or
enlightened me with new knowledge?
Has this experience impacted me in an emotional way or left me
feeling somewhat indifferent as a consequence?
Does the context of what understanding I have acquired lead me
to
take action – such as make a decision or fundamentally change
my
behaviour – or do I simply have an extra grain of knowledge the
consequence of which may not materialise until much later?
Over the page is a simple demonstration to further illustrate this
process of
understanding. In this example I play the role of a viewer
working with a
sample isolated chart (Figure 1.4). As you will learn throughout
the design
chapters, a chart would not normally just exist floating in
isolation like this
one does, but it will serve a purpose for this demonstration.
Figure 1.4 shows a clustered bar chart that presents a
breakdown of the
career statistics for the footballer Lionel Messi during his
career with FC
Barcelona.
The process commences with perceiving the chart. I begin by
establishing
what chart type is being used. I am familiar with this clustered
bar chart
approach and so I quickly feel at ease with the prospect of
reading its
display: there is no learning for me to have to go through on
this occasion,
which is not always the case as we will see.
I can quickly assimilate what the axes are showing by
examining the labels
along the x- and y-axes and by taking the assistance provided by
the colour
legend at the top. I move on to scanning, detecting and
observing the
general physical properties of the data being represented. The
eyes and
brain are working in harmony, conducting this activity quite
instinctively
without awareness or delay, noting the most prominent features
of
variation in the attributes of size, shape, colour and position.
Figure 1.4 Demonstrating the Process of Understanding
I look across the entire chart, identifying the big, small and
medium values
(these are known as stepped magnitude judgements), and form
an overall
sense of the general value rankings (global comparison
judgements). I am
instinctively drawn to the dominant bars towards the
middle/right of the
chart, especially as I know this side of the chart concerns the
most recent
career performances. I can determine that the purple bar –
showing goals –
has been rising pretty much year-on-year towards a peak in
2011/12 and
then there is a dip before recovery in his most recent season.
My visual system is now working hard to decode these
properties into
estimations of quantities (amounts of things) and relationships
(how
different things compare with each other). I focus on judging
the absolute
magnitudes of individual bars (one bar at a time). The
assistance offered
by the chart apparatus, such as the vertical axis (or y-axis)
values and the
inclusion of gridlines, is helping me more quickly estimate the
quantities
with greater assurance of accuracy, such as discovering that the
highest
number of goals scored was around 73.
I then look to conduct some relative higher/lower comparisons.
In
comparing the games and goals pairings I can see that three out
of the last
four years have seen the purple bar higher than the blue bar, in
contrast to
all the rest. Finally I look to establish proportional relationships
between
neighbouring bars, i.e. by how much larger one is compared
with the next.
In 2006/07 I can see the blue bar is more than twice as tall as
the purple
one, whereas in 2011/12 the purple bar is about 15% taller.
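The higher/lower and proportional judgements described above are, underneath, simple arithmetic that the eye estimates from bar heights. A minimal sketch follows, using invented games and goals values rather than the data behind Figure 1.4; the season labels and the function names `taller_bar` and `proportional_difference` are assumptions for illustration.

```python
# Invented values for illustration only; these are NOT the figures behind Figure 1.4.
seasons = {
    "Season A": {"games": 40, "goals": 18},
    "Season B": {"games": 60, "goals": 69},
}


def taller_bar(row):
    """Relative higher/lower comparison between the two measures."""
    if row["goals"] == row["games"]:
        return "equal"
    return "goals" if row["goals"] > row["games"] else "games"


def proportional_difference(row):
    """How many times taller the larger bar is than the smaller one.

    Assumes both values are greater than zero.
    """
    larger = max(row["games"], row["goals"])
    smaller = min(row["games"], row["goals"])
    return larger / smaller


for name, row in seasons.items():
    ratio = proportional_difference(row)
    print(f"{name}: the {taller_bar(row)} bar is taller by a factor of {ratio:.2f}")
```

With these invented values, the games bar in the first season is more than twice as tall as the goals bar, while in the second season the goals bar is about 15% taller, mirroring the two kinds of reading described above.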
By reading this chart I now have a good appreciation of the
quantities
displayed and some sense of the relationship between the two
measures,
games and goals.
The second part of the understanding process is interpreting. In
reality, it
is not so consciously consecutive or delayed in relationship to
the
perceiving stage but you cannot get here without having already
done the
perceiving. Interpreting, as you will recall, is about converting
perceived
‘reading’ into meaning. Interpreting is essentially about
orientating your
assessment of what you’ve read against what you know about
the subject.
As I mentioned earlier, often a data visualiser will choose to –
or have the
opportunity to – share such insights via captions, chart overlays
or
summary headlines. As you will learn in Chapter 3, the
visualisations that
present this type of interpretation assistance are commonly
described as
offering an ‘explanatory’ experience. In this particular
demonstration it is
an example of an ‘exhibitory’ experience, characterised by the
absence of
any explanatory features. It relies on the viewer to handle the
demands of
interpretation without any assistance.
As you will read about later, many factors influence how well
different
viewers will be able to interpret a visualisation. Some of the
most critical
include the level of interest shown towards the subject matter,
its relevance
and the general inclination, in that moment, of a viewer to want
to read
about that subject through a visualisation. It is also influenced
by the
knowledge held about a subject or the capacity to derive
meaning from a
subject even if a knowledge gap exists.
Returning to the sample chart, in order to translate the
quantities and
relationships I extracted from the perceiving stage into
meaning, I am
effectively converting the reading of value sizes into notions of
good or
bad and comparative relationships into worse than or better than
etc. To
interpret the meaning of this data about Lionel Messi I can tap
into my
passion for and knowledge of football. I know that for a player
to score
over 25 goals in a season is very good. To score over 35 is
exceptional. To
score over 70 goals is frankly preposterous, especially at the
highest level
of the game (you might find plenty of players achieving these
statistics
playing for the Dog and Duck pub team, but these numbers have
been
achieved for Barcelona in La Liga, the Champions League and
other
domestic cup competitions). I know from watching the sport,
and poring
over statistics like this for 30 years, that it is very rare for a
player to score
remotely close to a ratio of one goal per game played. Those
purple bars
that exceed the height of the blue bars are therefore remarkable.
Beyond
the information presented in the chart I bring knowledge about
the periods
when different managers were in charge of Barcelona, how they
played the
game, and how some organised their teams entirely around
Messi’s talents.
I know which other players were teammates across different
seasons and
who might have assisted or hindered his achievements. I also
know his age
and can mentally compare his achievements with the traditional
football
career arcs that will normally show a steady rise, peak, plateau,
and then
decline.
Therefore, in this example, I am not just interested in the
subject but can
bring a lot of knowledge to aid me in interpreting this analysis.
That helps
me understand a lot more about what this data means. Other people
might be passingly interested in football and know how to read what
is being presented, but they might not possess the domain
knowledge to go
deeper into the interpretation. They also just might not care.
Now imagine
this was analysis of, let’s say, an NHL ice hockey player
(Figure 1.5) –
that would present an entirely different challenge for me.
The numbers in this chart are irrelevant: it is just the same
chart as before
with different labels. Assuming this was real analysis, as a
sports fan in
general I would have the capacity to understand the notion of a
sportsperson’s career statistics in terms of games played and
goals scored:
I can read the chart (perceiving) that shows me this data and
catch the gist
of the angle of analysis it is portraying. However, I do not have
sufficient
domain knowledge of ice hockey to determine the real meaning
and
significance of the big–small, higher–lower value relationships.
I cannot
confidently convert ‘small’ into ‘unusual’ or ‘greater than’ into
‘remarkable’. My capacity to interpret is therefore limited, and
besides I
have no connection to the subject matter, so I am insufficiently
interested
to put in the effort to spend much time with any in-depth
attempts at
interpretation.
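The dependence of interpreting on domain knowledge can be made concrete with a small sketch. The football thresholds below are the ones stated in the text (over 25 goals in a season is very good, over 35 exceptional, over 70 preposterous); the `DOMAIN_KNOWLEDGE` dictionary, the `interpret` function and the "unremarkable" label are assumptions introduced purely for illustration.

```python
# Thresholds reflect the football judgements stated in the text.
# Everything else (names, structure, the "unremarkable" label) is hypothetical.
DOMAIN_KNOWLEDGE = {
    "football": [
        (70, "preposterous"),
        (35, "exceptional"),
        (25, "very good"),
    ],
    # No entry for "ice hockey": the viewer lacks knowledge of that domain.
}


def interpret(goals, domain, knowledge=DOMAIN_KNOWLEDGE):
    """Convert a perceived value into meaning, if the viewer knows the domain."""
    thresholds = knowledge.get(domain)
    if thresholds is None:
        return None  # the value can be perceived but not interpreted
    for minimum, meaning in thresholds:  # checked from the highest threshold down
        if goals > minimum:
            return meaning
    return "unremarkable"


print(interpret(73, "football"))
print(interpret(73, "ice hockey"))
```

The same perceived value yields a meaning in one domain and nothing in the other, which is the gap a visualiser may try to close with captions or other annotation.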
Figure 1.5 Demonstrating the Process of Understanding
Imagine this is now no longer analysis about sport but about the
sightings
in the wild of Winglets and Spungles (completely made up
words). Once
again I can still read the chart shown in Figure 1.6 but now I
have
absolutely no connection to the subject whatsoever. No
knowledge and no
interest. I have no idea what these things are, no understanding
about the
sense of scale that should be expected for these sightings, I
don’t know
what is good or bad. And I genuinely don’t care either. In
contrast, for
those who do have a knowledge of and interest in the subject,
the meaning
of this data will be much more relevant. They will be able to
read the chart
and make some sense of the meaning of the quantities and
relationships
displayed.
To help with perceiving, viewers need the context of scale. To
help with
interpreting, viewers need the context of subject, whether that is
provided
by the visualiser or the viewer themself. The challenge for you
and me as
data visualisers is to determine what our audience will know
already and
what they will need to know in order to possibly assist them in
interpreting
the meaning. The use of explanatory captions, perhaps
positioned in that
big white space top left, could assist those lacking the
knowledge of the
subject, possibly offering a short narrative to make the
interpretations – the
meaning – clearer and immediately accessible.
We are not quite finished: there is one stage left. The third part
of the
understanding process is comprehending. This is where I
attempt to form
some concluding reasoning that translates into what this
analysis means for
me. What can I infer from the display of data I have read? How
do I relate
and respond to the insights I have drawn out through
interpretation?
Does what I’ve learnt make a difference to me? Do I know
something
more than I did before? Do I need to act or decide on anything?
How does
it make me feel emotionally?
Figure 1.6 Demonstrating the Process of Understanding
Through consuming the Messi chart, I have been able to form an
even
greater appreciation of his amazing career. It has surprised me
just how
prolific he has been, especially having seen his ratio of goals to
games, and
I am particularly intrigued to see whether the dip in 2013/14
was a
temporary blip or whether the bounce back in 2014/15 was the
blip. And
as he reaches his late 20s, will injuries start to creep in as they
seem to do
for many other similarly prodigious young talents, especially as
he has
been playing relentlessly at the highest level since his late
teens?
My comprehension is not a dramatic discovery. There is no
sudden
inclination to act nor any need – based on what I have learnt. I
just feel a
heightened impression, formed through the data, about just how
good and
prolific Lionel Messi has been. Barcelona fanatics who watch him play
every week will likely have already formed this
understanding. This
kind of experience would only have reaffirmed what they
already probably
knew.
And that is important to recognise when it comes to managing
expectations about what we hope to achieve amongst our
viewers in terms
of their final comprehending. One person’s ‘I knew that
already’ is another
person’s ‘wow’. For every ‘wow, I need to make some changes’
type of
reflection there might be another ‘doesn’t affect me’. A
compelling
visualisation about climate change presented to Sylvie might
affect her significantly, prompting changes in her lifestyle
choices that might reduce her carbon footprint. For Robert, who
is already
familiar with the significance of this situation, it might have
substantially
less immediate impact – not indifference to the meaning of the
data, just
nothing new, a shrug of the shoulders. For James, the hardened
sceptic,
even the most indisputable evidence may have no effect; he
might just not
be receptive to altering his views regardless.
What these scenarios try to explain is that, from your perspective as the
visualiser, this final stage of understanding is something you
will have
relatively little control over because viewers are people and
people are
complex. People are different and as such they introduce
inconsistencies.
You can lead a horse to water but you cannot make it drink: you
cannot
force a viewer to be interested in your work, to understand the
meaning of
a subject or get that person to react exactly how you would
wish.
Visualising data is just an agent of communication and not a
guarantor for
what a viewer does with the opportunity for understanding that
is
presented. There are different flavours of comprehension,
different
consequences of understanding formed through this final stage.
Many
visualisations will be created with the ambition to simply
inform, like the
Messi graphic achieved for me, perhaps to add just an extra
grain to the
pile of knowledge a viewer has about a subject. Not every
visualisation
results in a Hollywood moment of grand discoveries, surprising
insights or
life-saving decisions. But that is OK, so long as the outcome
fits with the
intended purpose, something we will discuss in more depth in
Chapter 3.
Furthermore, there is the complexity of human behaviour in how
people
make decisions in life. You might create the most compelling
visualisation, demonstrating proven effective design choices,
carefully
constructed with a very specific audience type and need in
mind. This
might clearly show how a certain decision really needs to be
taken by
those in the audience. However, you cannot guarantee that the
decision
maker in question, while possibly recognising that there is a
need to act,
will be in a position to act, and indeed will know how to act.
It is at this point that one must recognise the ambitions and –
more
importantly – realise the limits of what data visualisation can
achieve.
Going back again, finally, to the components of the definition,
all the
reasons outlined above show why the term to facilitate is the
most a
visualiser can reasonably aspire to achieve.
It might feel like a rather tepid and unambitious aim, something
of a cop-
out that avoids scrutiny over the outcomes of our work: why not
aim to
‘deliver’, ‘accomplish’, or do something more earnest than just
‘facilitate’?
I deliberately use ‘facilitate’ because as we have seen we can
only control
so much. Design cannot change the world, it can only make it
run a little
smoother. Visualisers can control the output but not the
outcome: at best
we can expect to have only some influence on it.
1.2 The Importance of Conviction
The key structure running through this book is a data
visualisation design
process. By following this process you will be able to decrease
the size of
the challenge involved in making good decisions about your
design
solution. The sequencing of the stages presented will help
reduce the
myriad options you have to consider, which makes the prospect
of arriving
at the best possible solution much more likely to occur.
Often, the design choices you need to make will be clear cut. As
you will
learn, the preparatory nature of the first three stages goes a long
way to
securing that clarity later in the design stage. On other
occasions, plain old
common sense is a more than sufficient guide. However, for
more nuanced
situations, where there are several potentially viable options
presenting
themselves, you need to rely on the guiding value of good
design
principles.
‘I say begin by learning about data visualisation’s “black and
whites”,
the rules, then start looking for the greys. It really then becomes
quite a
personal journey of developing your conviction.’ Jorge Camoes,
Data
Visualization Consultant
For many people setting out on their journey in data visualisation,
their early beliefs about data visualisation design tend to be
shaped by the first authors they come across. Names
like Edward Tufte, unquestionably one of the most important
figures in
this field whose ideas are still pervasive, represent a common
entry point
into the field, as do people like Stephen Few, David
McCandless, Alberto
Cairo, and Tamara Munzner, to name but a few. These are
authors of
prominent works that typically represent the first books
purchased and
read by many beginners.
Where you go from there – from whom you draw your most
valuable
enduring guidance – will be shaped by many different factors:
taste, the
industry you are working in, the topics on which you work, the
types of
audiences you produce for. I still value much of what Tufte
extols, for
example, but find I can now more confidently filter out some of
his ideals
that veer towards impractical ideology or that do not necessarily
hold up
against contemporary technology and the maturing expectations
of people.
‘My key guiding principle? Know the rules, before you break
them.’
Gregor Aisch, Graphics Editor, The New York Times
The key guidance that now most helpfully shapes and supports
my
convictions comes from ideas outside the boundaries of
visualisation
design in the shape of the work of Dieter Rams. Rams was a
German
industrial and product designer who was most famously
associated with
the Braun company.
In the late 1970s or early 1980s, Rams was becoming concerned
about the
state and direction of design thinking and, given his prominent
role in the
industry, felt a responsibility to challenge himself, his own
work and his
own thinking against a simple question: ‘Is my design good
design?’. By
dissecting his response to this question he conceived 10
principles that
expressed the most important characteristics of what he
considered to be
good design. They read as follows:
1. Good design is innovative.
2. Good design makes a product useful.
3. Good design is aesthetic.
4. Good design makes a product understandable.
5. Good design is unobtrusive.
6. Good design is honest.
7. Good design is long lasting.
8. Good design is thorough down to the last detail.
9. Good design is environmentally friendly.
10. Good design is as little design as possible.
Inspired by the essence of these principles, and considering
their
applicability to data visualisation design, I have translated them
into three
high-level principles that similarly help me to answer my own
question: ‘Is
my visualisation design good visualisation design?’ These
principles offer
me a guiding voice when I need to resolve some of the more
seemingly
intangible decisions I am faced with (Figure 1.7).
Figure 1.7 The Three Principles of Good Visualisation Design
In the book Will it Make the Boat Go Faster?, co-author Ben
Hunt-Davis
provides details of the strategies employed by him and his team
that led to
their achieving gold medal success in the Men’s Rowing Eight
event at the
Sydney Olympics in 2000. As the title suggests, each decision
taken had to
pass the ‘will it make the boat go faster?’ test. Going back to
the goal of
data visualisation as defined earlier, these design principles
help me judge
whether any decision I make will better aid the facilitation of
understanding: the equivalent of ‘making the boat go faster’.
I will describe in detail the thinking behind each of these
principles and
explain how Rams’ principles map onto them. Before that, let
me briefly
explain why there are three principles of Rams’ original ten that
do not
entirely fit, in my view, as universal principles for data
visualisation.
‘I’m always the fool looking at the sky who falls off the cliff. In
other
words, I tend to seize on ideas because I’m excited about them
without
thinking through the consequences of the amount of work they
will
entail. I find tight deadlines energizing. Answering the question
of
“what is the graphic trying to do?” is always helpful. At
minimum the
work I create needs to speak to this. Innovation doesn’t have to
be a
wholesale out-of-the box approach. Iterating on a previous idea,
moving
it forward, is innovation.’ Sarah Slobin, Visual Journalist
Good design is innovative: Data visualisation does not need
always
to be innovative. For the majority of occasions the solutions
being
created call upon the tried and tested approaches that have been
used
for generations. Visualisers are not conceiving new forms of
representation or implementing new design techniques in every
project. Of course, there are times when innovation is required
to
overcome a particular challenge; innovation generally
materialises
when faced with problems that current solutions fail to
overcome.
Your own desire for innovation may be aligned to personal
goals
about the development of your skills or through reflecting on
previous
projects and recognising a desire to rethink a solution. It is not
that
data visualisation is never about innovation, just that it is not
always
and only about innovation.
Good design is long lasting: The translation of this principle to
the
context of data visualisation can be taken in different ways.
‘Long
lasting’ could be related to the desire to preserve the ongoing
functionality of a digital project, for example. It is quite
demoralising
how many historic links you visit online only to find a project
has
now expired through a lack of sustained support or is no longer
functionally supported on modern browsers.
Another way to interpret ‘long lasting’ is in the durability of the
technique. Bar charts, for example, are the old reliables of the
field –
always useful, always being used, always there when you need
them
(author wipes away a respectful tear). ‘Long lasting’ can also
relate to
avoiding the temptation of fashion or current gimmickry and
having a
timeless approach to design. Consider the recent design trend
moving
away from skeuomorphism and the emergence of so-called flat
design. By the time this book is published there will likely be a
new
movement. ‘Long lasting’ could apply to the subject matter.
Expiry in
the relevance of certain angles of analysis or out-of-date data is
inevitable in most of our work, particularly with subjects that
concern
current matters. Analysis about the loss of life during the
Second
World War is timeless because nothing is now going to change
the
nature or extent of the underlying data (unless new discoveries
emerge). Analysis of the highest grossing movies today will
change
as soon as new big movies are released and time elapses. So,
once
again, this idea of long lasting is very context specific, rather
than
being a universal goal for data visualisation.
Good design is environmentally friendly: This is, of course, a
noble
aim but the relevance of this principle has to be positioned
again at
the contextual level, based on the specific circumstances of a
given
project. If your work is to be printed, the ink and paper usage
immediately removes the notion that it is an environmentally
friendly
activity. Developing a powerful interactive that is being
hammered
constantly and concurrently by hundreds of thousands of users
puts an
extra burden on the hosting server, creating more demands on
energy
supply. The specific judgements about issues relating to the
impact of
a project on the environment realistically reside with the
protagonists
and stakeholders involved.
A point of clarity is that, while I describe them as design
principles, they
actually provide guidance long before you reach the design
thinking at the
final stage of this workflow. Design choices encapsulate the
critical
thinking undertaken throughout. Think of it like an iceberg: the
design is
the visible consequences of lots of hidden preparatory thinking
formed
through earlier stages.
Finally, a comment is in order about something often raised in
discussions
about the principles for this subject: that is, the idea that
visualisations
need to be memorable. This is, in my view, not relevant as a
universal
principle. If something is memorable, wonderful, that will be a
terrific by-
product of your design thinking, but in itself the goal of
achieving
memorability has to be isolated, again, to a contextual level
based on the
specific goals of a given task and the capacity of the viewer. A
politician
or a broadcaster might need to recall information more readily
in their
work than a group of executives in a strategy meeting with
permanent
access to endless information at the touch of a button via their
iPads.
Principle 1: Good Data Visualisation is
Trustworthy
The notion of trust is uppermost in your thoughts in this first of
the three
principles of good visualisation design. This maps directly onto
one of
Dieter Rams’ general principles of good design, namely that
good design
is honest.
Trust vs Truth
This principle is presented first because it is about the
fundamental
integrity, accuracy and legitimacy of any data visualisation you
produce.
This should always exist as your primary concern above all else.
There
should be no compromise here. Without securing trust the entire
purpose
of doing the work is undermined.
There is an important distinction to make between trust and
truth. Truth is
an obligation. You should never create work you know to be
misleading in
content, nor should you claim something presents the truth if it
evidently
cannot be supported by what you are presenting. For most
people, the
difference between a truth and an untruth should be beyond
dispute. For
those unable or unwilling to be truthful, or who are ignorant of
how to
differentiate, it is probably worth putting this book away now:
my telling
you how this is a bad thing is not likely to change your
perspective.
If the imperative for being truthful is clear, the potential for
there being
multiple different but legitimate versions of ‘truth’ within the
same data-
driven context muddies things. In data visualisation there is
rarely a
singular view of the truth. The glass that is half full is also half
empty.
Both views are truthful, but which to choose? Furthermore,
there are many
decisions involved in your work whereby several valid options
may
present themselves. In these cases you are faced with choices
without
necessarily having the benefit of theoretical influence to draw
out the right
option. You decide what is right. This creates inevitable biases
– no matter
how seemingly tiny – that ripple through your work. Your
eventual
solution is potentially comprised of many well-informed, well-
intended
and legitimate choices – no doubt – but they will reflect a
subjective
perspective all the same. All projects represent the outcome of
an entirely
unique pathway of thought.
You can mitigate the impact of these subjective choices you
make, for
example, by minimising the number of assumptions applied to
the data you
are working with or by judiciously consulting your audience to
best ensure
their requirements are met. However, pure objectivity is not
possible in
visualisation.
‘Every number we publish is wrong but it is the best number
there is.’
Andrew Dilnot, Chair of the UK Statistics Authority
Rather than view the unavoidability of these biases as an
obstruction, the
focus should instead be on ensuring your chosen path is
trustworthy. In the
absence of an objective truth, you need to be able to
demonstrate that your
truth is trustable.
Trust has to be earned but this is hard to secure and very easy to
lose. As
the translation of a Dutch proverb states, ‘trust arrives on foot
and leaves
on horseback’. Trust is something you can build by eliminating
any sense
that your version of the truth can be legitimately disputed. Yet,
visualisers
only have so much control and influence in the securing of
trust. A
visualisation can be truthful but not viewed as trustworthy. You
may have
done something with the best of intent behind your decision
making, but it
may ultimately fail to secure trust among your viewers for
different
reasons. Conversely a visualisation can be trustworthy in the
mind of the
viewer but not truthful, appearing to merit trust yet utterly
flawed in its
underlying truth. Neither of these is satisfactory: the latter
scenario is a
choice we control, the former is a consequence we must strive
to
overcome.
‘Good design is honest. It does not make a product appear more
innovative, powerful or valuable than it really is. It does not
attempt to
manipulate the consumer with promises that cannot be kept.’
Dieter
Rams, celebrated Industrial Designer
Let’s consider a couple of examples to illustrate this notion of
trustworthiness. Firstly, think about the trust you might attach
respectively
to the graphics presented in Figure 1.8 and Figure 1.9. For the
benefit of
clarity both are extracted from articles discussing issues about
home
ownership, so each would be accompanied with additional
written analysis
at their published location. Both charts are portraying the same
data and
the same analysis; they even arrive at the same summary
finding. How do
the design choices make you feel about the integrity of each
work?
Figure 1.8 Housing and Home Ownership in the UK (ONS)
Both portrayals are truthful but in my view the first
visualisation, produced
by the UK Office for National Statistics (ONS), commands
greater
credibility and therefore far more trust than the second
visualisation,
produced by the Daily Mail. The primary reason for this begins
with the
colour choices. They are relatively low key in the ONS graphic:
colourful
but subdued, yet conveying a certain assurance. In contrast, the
Daily
Mail’s colour palette feels needy, like it is craving my attention
with
sweetly coloured sticks. I don’t care for the house key imagery
in the
background but it is relatively harmless. Additionally, the
typeface, font
size and text colour feel more gimmicky in the second graphic.
Once
again, it feels like it is wanting to shout at me in contrast to the
more polite
nature of the ONS text. Whereas the Daily Mail piece refers to
the ONS as
the source of the data, it fails to include further details about
the data
source, which is included on the ONS graphic alongside other
important
explanatory features such as the subtitle, clarity about the
yearly periods
and the option to access and download the associated data. The
ONS
graphic effectively ‘shows all its workings’ and overall earns,
from me at
least, significantly more trust.
Figure 1.9 Falling Number of Young Homeowners (Daily Mail)
Another example about the fragility of trust concerns the next
graphic,
which plots the number of murders committed using firearms in
Florida
over a period of time. This frames the time around the
enactment of the
‘Stand your ground’ law in Florida. The area chart in Figure
1.10
shows the number of murders over time and, as you can see, the
chart uses
an inverted vertical y-axis with the red area going lower down
as the
number of deaths increases, with peak values at about 1990 and
2007.
However, some commentators felt the inversion of the y-axis
was
deceptive and declared the graphic not trustworthy based on the
fact they
were perceiving the values as represented by an apparent rising
‘white
mountain’. They mistakenly observed peak values around 1999
and 2005
based on them seeing these as the highest points. This confusion
is caused
by an effect known as figure-ground perception whereby a
background
form (white area) can become inadvertently recognised as the
foreground
form, and vice versa (with the red area seen as the background).
Figure 1.10 Gun Deaths in Florida
Figure 1.11 Iraq’s Bloody Toll
The key point here is that there was no intention to mislead.
Although the
approach to inverting the y-axis may not be entirely
conventional, it was
technically legitimate. Creatively speaking, the effect of
dribbling blood
was an understandably tempting metaphor to pursue. Indeed, the
graphic
attempts to emulate a notable infographic from several years
ago showing
the death toll during the Iraq conflict (Figure 1.11). In the case
of the
Florida graphic, on reflection maybe the data was just too
‘smooth’ to
convey the same dribbling effect achieved in the Iraq piece.
However,
being inspired and influenced by successful techniques
demonstrated by
others is to be encouraged. It is one way of developing our
skills.
Figure 1.12 Reworking of ‘Gun Deaths in Florida’
Unfortunately, given the emotive nature of the subject matter –
gun deaths
– this analysis would always attract a passionate reaction
regardless of its
form. In this case the lack of trust expressed by some was an
unintended
consequence of a single, innocent design choice: by reverting the
y-axis to an
upward direction, as shown in the reworked version in Figure
1.12, you
can see how a single subjective design choice can have a huge
influence
on people’s perception.
The creator of the Florida chart will have made hundreds of
perfectly
sound visualisations and will make hundreds more, and none of
them will
ever carry the intent of being anything other than truthful.
However, you
can see how vulnerable perceived trust is when disputes about
motives can
so quickly surface as a result of the design choice made. This is
especially
the case within the pressured environment of a newsroom where
you have
only a single opportunity to publish a work to a huge and
widespread
audience. Contrast this setting with a graphic published within
an
organisation that can be withdrawn and reissued far more easily.
Trust Applies Throughout the Process
Trustworthiness is a pursuit that should guide all your
decisions, not just
the design ones. As you will see in the next chapter, the
visualisation
design workflow involves a process with many decision
junctions – many
paths down which you could pursue different legitimate options.
Obviously, design is the most visible result of your decision
making, but
you need to create and demonstrate complete integrity in the
choices made
across the entire workflow process. Here is an overview of some
of the key
matters where trust must be at the forefront of your concern.
‘My main goal is to represent information accurately and in
proper
context. This spans from data reporting and number crunching
to
designing human-centered, intuitive and clear visualizations.
This is my
sole approach, although it is always evolving.’ Kennedy Elliott,
Graphics Editor, The Washington Post
Formulating your brief: As mentioned in the discussion about
the
‘Gun Deaths in Florida’ graphic, if you are working with
potentially
emotive subject matter, this will heighten the importance of
demonstrating trust. Rightly or wrongly, your topic will be more
exposed to the baggage of prejudicial opinion and trust will be
precarious. As you will learn in Chapter 3, part of the thinking
involved in ‘formulating your brief’ concerns defining your
audience,
considering your subject and establishing your early thoughts
about
the purpose of your work, and what you are hoping to achieve.
There
will be certain contexts that lend themselves to exploiting the
emotive
qualities of your subject and/or data but many others that will
not.
Misjudge these contextual factors, especially the nature of your
audience’s needs, and you will jeopardise the trustworthiness of
your
solution. As I have shown, matters of trust are often outside of
your
immediate influence: cynicism, prejudice or suspicion held by
viewers through their beliefs or opinions is a hard thing to
combat or
accommodate. In general, people feel comfortable with
visualisations
that communicate data in a way that fits with their world view.
That
said, at times, many are open to having their beliefs challenged
by
data and evidence presented through a visualisation. The
platform and
location in which your work is published (e.g. website or source
location) will also influence trust. Visualisations encountered in
already-distrusted media will create obstacles that are hard to
overcome.
Working with data: As soon as you begin working with data you
have a great responsibility to be faithful to this raw material. To
be
transparent to your audience you need to consider sharing as
much
relevant information about how you have handled the data that
is
being presented to them:
How was it collected: from where and using what criteria?
What calculations or modifications have you applied to it?
Explain your approach.
Have you made any significant assumptions or observed any
special counting rules that may not be common?
Have you removed or excluded any data?
How representative is it? What biases may exist that could
distort interpretations?
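One way to make these disclosures routine is to keep a small record of how the data was handled and publish it with the chart. The sketch below is purely illustrative: `DataNotes`, `apply_threshold`, the field names and the sample values are all hypothetical, chosen only to mirror the questions above.

```python
from dataclasses import dataclass, field


@dataclass
class DataNotes:
    """Hypothetical record of how a dataset was handled, to share with viewers."""
    source: str  # where the data came from and the criteria used to collect it
    calculations: list[str] = field(default_factory=list)
    assumptions: list[str] = field(default_factory=list)
    exclusions: list[str] = field(default_factory=list)
    known_biases: list[str] = field(default_factory=list)

    def footnote(self) -> str:
        """Render the non-empty notes as plain text for a chart footnote."""
        parts = [f"Source: {self.source}"]
        for label, items in [
            ("Calculations", self.calculations),
            ("Assumptions", self.assumptions),
            ("Excluded", self.exclusions),
            ("Possible biases", self.known_biases),
        ]:
            if items:
                parts.append(f"{label}: " + "; ".join(items))
        return "\n".join(parts)


def apply_threshold(rows: dict[str, float], minimum: float,
                    notes: DataNotes) -> dict[str, float]:
    """Drop values below `minimum` and log the exclusion so it can be disclosed."""
    kept = {name: value for name, value in rows.items() if value >= minimum}
    dropped = len(rows) - len(kept)
    if dropped:
        notes.exclusions.append(f"{dropped} item(s) below {minimum} removed")
    return kept


# Made-up data, for illustration only.
notes = DataNotes(source="Example survey (made-up data)")
visible = apply_threshold({"A": 120.0, "B": 3.0, "C": 45.0}, minimum=10, notes=notes)
print(notes.footnote())
```

The point of the design is that the exclusion is recorded at the moment it is applied, so the footnote cannot drift out of step with what was actually removed.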
Editorial thinking: Even with the purest of intent, your role as
the
curator of your data and the creator of its portrayal introduces
subjectivity. When you choose to do one thing you are often
choosing
to not do something else. The choice to focus on analysis that
shows
how values have changed over time is also a decision to not
show the
same data from other viewpoints such as, for example, how it
looks
on a map. A decision to impose criteria on your analysis, like
setting
date parameters or minimum value thresholds, in order to reduce
clutter, might be sensible and indeed legitimate, but is still a
subjective choice.
‘Data and data sets are not objective; they are creations of
human
design. Hidden biases in both the collection and analysis stages
present
considerable risks [in terms of inference].’ Kate Crawford,
Principal
Researcher at Microsoft Research NYC
Data representation: A fundamental tenet of data visualisation is
to
never deceive the receiver. Avoiding possible
misunderstandings,
inaccuracies, confusions and distortions is of primary concern.
There
are many possible features of visualisation design that can lead
to
varying degrees of deception, whether intended or not. Here are
a few
to list now, but note that these will be picked up in more detail
later:
The size of geometric areas can sometimes be miscalculated
resulting in the quantitative values being disproportionately
perceived.
When data is represented in 3D, on the majority of occasions
this
represents nothing more than distracting – and distorting –
decoration. 3D should only be used when there are legitimately
three dimensions of data variables being displayed and the
viewer is able to change his or her point of view to navigate to
see different 2D perspectives.
The bar chart value axis should never be ‘truncated’ – the origin
value should always be zero – otherwise this approach will
distort the bar size judgements.
The aspect ratio (height vs width) of a line chart’s display is
influential as it affects the perceived steepness of connecting
lines which are key to reading the trends over time – too narrow
and the steepness will be embellished; too wide and the
steepness is dampened.
When portraying spatial analysis through a thematic map
representation, there are many different mapping projections to
choose from as the underlying apparatus for presenting and
orienting the geographical position of the data. There are many
different approaches to flatten the spherical globe, translating it
into a two-dimensional map form. The mathematical treatment
applied can alter significantly the perceived size or shape of
regions, potentially distorting their perception.
Sometimes charts are used in a way that is effectively corrupt,
like using pie charts for percentages that add up to more, or
less,
than 100%.
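Several of the distortions listed above come down to simple arithmetic, so they can be checked before a chart is published. The following is a minimal sketch, not a definitive method: the function names, the sample numbers and the 0.5 rounding tolerance are assumptions made for illustration.

```python
import math


def radius_for_value(value: float, max_value: float, max_radius: float) -> float:
    """Radius of a circle whose AREA (not its radius) is proportional to value."""
    return max_radius * math.sqrt(value / max_value)


def bar_exaggeration(a: float, b: float, axis_start: float) -> float:
    """Factor by which a truncated axis inflates the drawn ratio of bar a to bar b.

    Returns 1.0 when the axis starts at zero, i.e. drawn ratio equals true ratio.
    """
    true_ratio = a / b
    drawn_ratio = (a - axis_start) / (b - axis_start)
    return drawn_ratio / true_ratio


def apparent_angle(dy: float, dx: float, y_range: float, x_range: float,
                   height: float, width: float) -> float:
    """On-screen angle, in degrees, of a line segment for a given chart size."""
    rise = dy / y_range * height
    run = dx / x_range * width
    return math.degrees(math.atan2(rise, run))


def pie_is_valid(percentages: list[float], tolerance: float = 0.5) -> bool:
    """A pie chart only makes sense when its parts sum to (about) 100%."""
    return abs(sum(percentages) - 100.0) <= tolerance


# A quarter of the maximum value gets half the radius, not a quarter of it.
print(radius_for_value(25, 100, 40))

# Starting the value axis at 40 instead of 0 inflates a 60-vs-50 comparison.
print(bar_exaggeration(60, 50, axis_start=0))
print(bar_exaggeration(60, 50, axis_start=40))

# The same data segment looks steeper on a narrow chart than on a wide one.
for width in (200, 400, 800):
    print(width, apparent_angle(10, 1, 100, 10, height=400, width=width))

# Percentages that do not sum to 100 should not be drawn as a pie.
print(pie_is_valid([50, 30, 20]), pie_is_valid([45, 35, 30]))
```

Scaling a circle by radius rather than area is the usual source of the area miscalculation described in the first item; taking the square root keeps the perceived size in proportion to the value.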
Data presentation: The main rule here is: if it looks significant,
it
should be, otherwise you are either misleading or creating
unnecessary obstacles for your viewer. The undermining of trust
can
also be caused by what you decline to explain: restricted or non-
functioning features of interactivity.
Absent annotations such as introduction/guides, axis titles and
labels, footnotes, data sources that fail to inform the reader of
what is going on.
Inconsistent or inappropriate colour usage, without explanation.
Confusing or inaccessible layouts.
Thoroughness in delivering trust extends to the faith you create
through reliability and consistency in the functional experience,
especially for interactive projects. Does the solution work and,
specifically, does it work in the way it promises to do?
Principle 2: Good Data Visualisation is Accessible
This second of the three principles of good visualisation design
helps to
inform judgements about how best to facilitate your viewers
through the
process of understanding. It is informed by three of Dieter
Rams’ general
principles of good design:
2 Good design makes a product useful.
4 Good design makes a product understandable.
5 Good design is unobtrusive.
Reward vs Effort
The opening section of this chapter broke down the stages a
viewer goes
through when forming their understanding about, and from, a
visualisation.
This process involved a sequence of perceiving, interpreting and
then
comprehending. It was emphasised that a visualiser’s control
over the
viewer’s pursuit of understanding diminishes after each stage.
The
objective, as stated by the presented definition, of ‘facilitating’
understanding reflects the reality of what can be controlled.
You can’t
force viewers to understand, but you can smooth the way.
To facilitate understanding for an audience is about delivering
accessibility. That is the essence of this principle: to remove
design-related
obstacles faced by your viewers when undertaking this process
of
understanding. Stated another way, a viewer should experience
minimum
friction between the act of understanding (effort) and the
achieving of
understanding (reward).
This ‘minimising’ of friction has to be framed by context,
though. This is
key. There are many contextual influences that will determine
whether
what is judged inaccessible in one situation could be seen as
entirely
accessible in another. When people are involved, diverse needs
exist. As I
have already discussed, varying degrees of knowledge emerge
and
irrational characteristics come to the surface. You can only do
so much: do
not expect to get all things right in the eyes of every viewer.
‘We should pay as much attention to understanding the project’s
goal in
relation to its audience. This involves understanding principles
of
perception and cognition in addition to other relevant factors,
such as
culture and education levels, for example. More importantly, it
means
carefully matching the tasks in the representation to our
audience’s
needs, expectations, expertise, etc. Visualizations are human-
centred
projects, in that they are not universal and will not be effective
for all
humans uniformly. As producers of visualizations, whether
devised for
data exploration or communication of information, we need to
take into
careful consideration those on the other side of the equation,
and who
will face the challenges of decoding our representations.’ Isabel
Meirelles, Professor, OCAD University (Toronto)
That is not to say that attempts to accommodate the needs of
your audience
should just be abandoned, quite the opposite. This is hard but it
is
essential. Visualisation is about human-centred design,
demonstrating
empathy for your audiences and putting them at the heart of
your decision
making.
There are several dimensions of definition that will help you
better
understand your audiences, including establishing what they
know, what
they do not know, the circumstances surrounding their
consumption of
your work and their personal characteristics. Some of these you
can
accommodate, others you may not be able to, depending on the
diversity
and practicality of the requirements. Again, in the absence of
perfection
optimisation is the name of the game, even if this means that
sometimes
the least worst is best.
The Factors Your Audiences Influence
Many of the factors presented here will occur when you think
about your
project context, as covered in Chapter 3. For now, it is helpful
to introduce
some of the factors that specifically relate to this discussion
about
delivering accessible design.
Subject-matter appeal: This was already made clear in the
earlier
illustration, but is worth logging again here: the appeal of the
subject
matter is a fundamental junction right at the beginning of the
consumption experience. If your audiences are not interested in
the
subject – i.e. they are indifferent towards the topic or see no
need or
relevance to engage with it there and then – then they will not
likely
stick around. They will probably not be interested in putting in
the
effort to work through the process of understanding for
something
that might be ultimately irrelevant. For those to whom the
subject
matter is immediately appealing, they are significantly more
likely to
engage with the data visualisation right the way through.
‘Data visualization is like family photos. If you don’t know the
people
in the picture, the beauty of the composition won’t keep your
attention.’
Zach Gemignani, CEO/Founder of Juice Analytics
Many of the ideas for this principle emerged from the Seeing
Data
visualisation literacy research project (seeingdata.org) on which
I
collaborated.
Dynamic of need: Do they need to engage with this work or is it
entirely voluntary? Do they have a direct investment in having
access
to this information, perhaps as part of their job and they need
this
information to serve their duties?
Subject-matter knowledge: What might your audiences know
and
not know about this subject? What is their capacity to learn or
potential motivation to develop their knowledge of this subject?
A
critical component of this issue, blending existing knowledge
with the
capacity to acquire knowledge, concerns the distinctions
between
complicated, complex, simple and simplified. This might seem
to be
more about the semantics of language but is of significant
influence
in data visualisation – indeed in any form of communication:
Complicated is generally a technical distinction. A subject
might
be difficult to understand because it involves pre-existing – and
probably high-level – knowledge and might be intricate in its
detail. The mathematics that underpinned the Moon landings are
complicated. Complicated subjects are, of course, surmountable
– the knowledge and skill are acquirable – but only achieved
through time and effort, hard work and learning (or
extraordinary talent), and, usually, with external assistance.
Complex is associated with problems that have no perfect
conclusion or maybe even no end state. Parenting is complex;
there is no rulebook for how to do it well, no definitive right or
wrong, no perfect way of accomplishing it. The elements of
parenting might not be necessarily complicated – cutting
Emmie’s sandwiches into star shapes – but there are lots of
different interrelated pressures always influencing and
occasionally colliding.
Simple, for the purpose of this book, concerns a matter that is
inherently easy to understand. It may be so small in dimension
and scope that it is not difficult to grasp, irrespective of prior
knowledge and experience.
Simplified involves transforming a problem context from either
a
complex or complicated initial state to a reduced form, possibly
by eliminating certain details or nuances.
Understanding the differences in these terms is vital. When
considering
your subject matter and the nature of your analysis you will
need to assess
whether your audience will be immediately able to understand
what you
are presenting or have the capacity to learn how to understand
it. If it is a
subject that is inherently complex or complicated, will it need
to be
simplified? If you are creating a graphic about taxation, will
you need to
strip it down to the basics or will this process of simplification
risk the
subject being oversimplified? The final content may be
obscured by the
absence of important subtleties. Indeed, the audience may have
felt
sufficiently sophisticated to have had the capacity to work out
and work
with a complicated topic, but you denied them that opportunity.
You might
reasonably dilute/reduce a complex subject for kids, but
generally my
advice is don’t underestimate the capacity of your audience.
Accordingly,
clarity trumps simplicity as the most salient concern about data
visualisation design.
‘Strive for clarity, not simplicity. It’s easy to “dumb something
down,”
but extremely difficult to provide clarity while maintaining
complexity.
I hate the word “simplify.” In many ways, as a researcher, it is
the bane
of my existence. I much prefer “explain,” “clarify,” or
“synthesize.” If
you take the complexity out of a topic, you degrade its
existence and
malign its importance. Words are not your enemy. Complex
thoughts
are not your enemy. Confusion is. Don’t confuse your audience.
Don’t
talk down to them, don’t mislead them, and certainly don’t lie
to them.’
Amanda Hobbs, Researcher and Visual Content Editor
What do they need to know? The million-dollar question. Often,
the
most common frustration expressed by viewers is that the
visualisation ‘didn’t show them what they were most interested
in’.
They wanted to see how something changed over time, not how
it
looked on a map. If you were them what would you want to
know?
This is a hard thing to second-guess with any accuracy. We will
be
discussing it further in Chapter 5.
Unfamiliar representation: In the final chapter of this book I will cover the issue of visualisation literacy, discussing the capabilities that go into being the most rounded creator of visualisation work and the techniques involved in being the most effective consumer. Many people will perhaps be unaware of a deficit in their visualisation literacy with regard to consuming certain chart types. The bar, line and pie chart are very common and broadly familiar to all. As you will see in Chapter 6, there are many more ways of portraying data visually. This deficit in knowing how to read a new or unfamiliar chart type is not a failing on the part of the viewer; it is simply a result of their lack of prior exposure to these different methods. For visualisers, a key challenge arises when deploying an uncommon chart may be an entirely reasonable and appropriate choice – indeed perhaps even the ‘simplest’ chart that could have been used – but it is likely to be unfamiliar to the intended viewers. Even if you support it with plenty of ‘how to read’ guidance, if a viewer is overwhelmed or simply unwilling to make the effort to learn how to read a different chart type, you have little control in overcoming this.
Time: At the point of consuming a visualisation, is the viewer in a pressured situation with a lot at stake? Are viewers likely to be impatient and intolerant of the need to spend time learning how to read a display? Do they need quick insights or is there some capacity for them to take on exploring or reading in more depth? If it is the former, the immediacy of the presented information will be a paramount requirement. If they have more time to work through the process of perceiving, interpreting and comprehending, this could be a more conducive situation for presenting complicated or complex subject matter – maybe even using different, unfamiliar chart types.
Format: In what format will your viewers need to consume your work? Are they going to need work created for a print output or a digital one? Does this need to be compatible with a small display, as on a smartphone or a tablet? If what you create is consumed away from its intended native format, such as viewing a large infographic with small text on a mobile phone, that will likely result in a frustrating experience for the viewer. However, how and where your work is consumed may be beyond your control. You can’t mitigate for every eventuality.
Personal tastes: Individual preferences towards certain colours, visual elements and interaction features will often influence (enabling or inhibiting) a viewer’s engagement. The semiotic conventions that visualisers draw upon play a part in determining whether viewers are willing to spend time and expend effort looking at a visualisation. Be aware, though, that accommodating the preferences of one person may not cascade, with similar appeal, to all, and might indeed create a rather negative reaction.
Attitude and emotion: Sometimes we are tired, in a bad mood, feeling lazy, or having a day when we are just irrational. And the prospect of working on even the most intriguing and well-designed project sometimes feels too much. I spend my days looking at visualisations and can sympathise with the narrowing of mental bandwidth when I am tired or have had a bad day. Confidence is an extension of this. Sometimes our audiences may just not feel sufficiently equipped to embark on a visualisation if it is about an unknown subject or might involve pushing them outside their comfort zone in terms of the demands placed on their interpretation and comprehension.
The Factors You Can Influence
Flipping the coin, let’s look at the main ways we, as visualisers, can influence (positively or negatively) the accessibility of the designs created. In effect, this entire book is focused on minimising the likelihood that your solution demonstrates any of these negative attributes. Repeating the mantra from earlier, you must avoid doing anything that will cause the boat to go slower.
‘The key difference I think in producing data visualisation/infographics in the service of journalism versus other contexts (like art) is that there is always an underlying, ultimate goal: to be useful. Not just beautiful or efficient – although something can (and should!) be all of those things. But journalism presents a certain set of constraints. A journalist has to always ask the question: How can I make this more useful? How can what I am creating help someone, teach someone, show someone something new?’
Lena Groeger, Science Journalist, Designer and Developer at ProPublica
As you saw listed at the start of this section, the selected, related design principles from Dieter Rams’ list collectively include the aim of ensuring our work is useful, unobtrusive and understandable. Thinking about what not to do – focusing on the likely causes of failure across these aims – is, in this case, more instructive.
Your

TemplateFundamentals of AccountingInstructionsAcco.docx

  • 1. TemplateFundamentals of Accounting Instructions Accounts to be used: · Cash. · Prepaid insurance. · Land. · Buildings. · Equipment. · Accounts payable. · Unearned service revenue. · Owner's capital. · Owner's drawings. · Service revenue. · Advertising expense. · Salaries and wages expense.
  • 2. Leave a space between each dated transaction. May 1 Invested $20,000 cash in the golf course business. May 3 Purchased Hampstead Golf Land for $15,000 cash. The price includes land $12,000, shed $2,000, and equipment $1,000. May 5 Paid advertising expenses of $700. May 6 Paid cash $600 for a one-year insurance policy. May 10 Purchased golf discs and other equipment for $1,050 from Discs Are Us, payable in 30 days. May 18 Received $1,100 in cash for golf fees earned (service revenue). May 19 Sold 150 coupon books for $10 each. Each book contains four coupons that enable the holder to play one round of disc golf. May 25 Withdrew $800 cash for personal use. May 30 Pay $250 as salaries for part-time employees. May 30 Paid Discs Are Us the full amount due. May 31 Received $2,100 cash for fees earned. Date Accounts Debit Credit
  • 8. 1 3 Data Visualisation 2 3 Data Visualisation A Handbook for Data Driven Design Andy Kirk 4 SAGE Publications Ltd 1 Oliver’s Yard
  • 9. 55 City Road London EC1Y 1SP SAGE Publications Inc. 2455 Teller Road Thousand Oaks, California 91320 SAGE Publications India Pvt Ltd B 1/I 1 Mohan Cooperative Industrial Area Mathura Road New Delhi 110 044 SAGE Publications Asia-Pacific Pte Ltd 3 Church Street #10-04 Samsung Hub Singapore 049483 5 © Andy Kirk 2016 First published 2016 Apart from any fair dealing for the purposes of research or private study,
  • 10. or criticism or review, as permitted under the Copyright, Designs and Patents Act, 1988, this publication may be reproduced, stored or transmitted in any form, or by any means, only with the prior permission in writing of the publishers, or in the case of reprographic reproduction, in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers. Library of Congress Control Number: 2015957322 British Library Cataloguing in Publication data A catalogue record for this book is available from the British Library ISBN 978-1-4739-1213-7 ISBN 978-1-4739-1214-4 (pbk) Editor: Mila Steele Editorial assistant: Alysha Owen Production editor: Ian Antcliff Marketing manager: Sally Ransom Cover design: Shaun Mercier Typeset by: C&M Digitals (P) Ltd, Chennai, India
  • 11. Printed and bound in Great Britain by Bell and Bain Ltd, Glasgow 6 Contents List of Figures with Source Notes Acknowledgements About the Author INTRODUCTION PART A FOUNDATIONS 1 Defining Data Visualisation 2 Visualisation Workflow PART B THE HIDDEN THINKING 3 Formulating Your Brief 4 Working With Data 5 Establishing Your Editorial Thinking PART C DEVELOPING YOUR DESIGN SOLUTION 6 Data Representation 7 Interactivity 8 Annotation 9 Colour 10 Composition PART D DEVELOPING YOUR CAPABILITIES 11 Visualisation Literacy References Index 7
  • 12. List of Figures with Source Notes 1.1 A Definition for Data Visualisation 19 1.2 Per Capita Cheese Consumption in the U.S., by Sarah Slobin (Fortune magazine) 20 1.3 The Three Stages of Understanding 22 1.4–6 Demonstrating the Process of Understanding 24–27 1.7 The Three Principles of Good Visualisation Design 30 1.8 Housing and Home Ownership in the UK, by ONS Digital Content Team 33 1.9 Falling Number of Young Homeowners, by the Daily Mail 33 1.10 Gun Deaths in Florida (Reuters Graphics) 34 1.11 Iraq’s Bloody Toll, by Simon Scarr (South China Morning Post) 34 1.12 Gun Deaths in Florida Redesign, by Peter A. Fedewa (@pfedewa) 35 1.13 If Vienna would be an Apartment, by NZZ (Neue Zürcher Zeitung) [Translated] 45 1.14 Asia Loses Its Sweet Tooth for Chocolate, by Graphics Department (Wall Street Journal) 45 2.1 The Four Stages of the Visualisation Workflow 54 3.1 The ‘Purpose Map’ 76 3.2 Mizzou’s Racial Gap Is Typical On College Campuses, by FiveThirtyEight 77 3.3 Image taken from ‘Wealth Inequality in America’, by YouTube user ‘Politizane’ (www.youtube.com/watch?v=QPKKQnijnsM) 78 3.4 Dimensional Changes in Wood, by Luis Carli (luiscarli.com) 79 3.5 How Y’all, Youse and You Guys Talk, by Josh Katz (The New
  • 13. York Times) 80 3.6 Spotlight on Profitability, by Krisztina Szücs 81 3.7 Countries with the Most Land Neighbours 83 3.8 Buying Power: The Families Funding the 2016 Presidential Election, by Wilson Andrews, Amanda Cox, Alicia DeSantis, Evan Grothjan, Yuliya Parshina-Kottas, Graham Roberts, Derek Watkins and Karen Yourish (The New York Times) 84 3.9 Image taken from ‘Texas Department of Criminal Justice’ Website (www.tdcj.state.tx.us/death_row/dr_executed_offenders.html) 86 8 3.10 OECD Better Life Index, by Moritz Stefaner, Dominikus Baur, Raureif GmbH 89 3.11 Losing Ground, by Bob Marshall, The Lens, Brian Jacobs and Al Shaw (ProPublica) 89 3.12 Grape Expectations, by S. Scarr, C. Chan, and F. Foo (Reuters Graphics) 91 3.13 Keywords and Colour Swatch Ideas from Project about Psychotherapy Treatment in the Arctic 92 3.14 An Example of a Concept Sketch, by Giorgia Lupi of Accurat 92 4.1 Example of a Normalised Dataset 99 4.2 Example of a Cross-tabulated Dataset 100 4.3 Graphic Language: The Curse of the CEO, by David Ingold and Keith Collins (Bloomberg Visual Data), Jeff Green (Bloomberg
  • 14. News) 101 4.4 US Presidents by Ethnicity (1789 to 2015) 114 4.5 OECD Better Life Index, by Moritz Stefaner, Dominikus Baur, Raureif GmbH 116 4.6 Spotlight on Profitability, by Krisztina Szücs 117 4.7 Example of ‘Transforming to Convert’ Data 119 4.8 Making Sense of the Known Knowns 123 4.9 What Good Marathons and Bad Investments Have in Common, by Justin Wolfers (The New York Times) 124 5.1 The Fall and Rise of U.S. Inequality, in Two Graphs Source: World Top Incomes Database; Design credit: Quoctrung Bui (NPR) 136 5.2–4 Why Peyton Manning’s Record Will Be Hard to Beat, by Gregor Aisch and Kevin Quealy (The New York Times) 138– 140 C.1 Mockup Designs for ‘Poppy Field’, by Valentina D’Efilippo (design); Nicolas Pigelet (code); Data source: The Polynational War Memorial, 2014 (poppyfield.org) 146 6.1 Mapping Records and Variables on to Marks and Attributes 152 6.2 List of Mark Encodings 153 6.3 List of Attribute Encodings 153 6.4 Bloomberg Billionaires, by Bloomberg Visual Data (Design and development), Lina Chen and Anita Rundles (Illustration) 155 6.5 Lionel Messi: Games and Goals for FC Barcelona 156 6.6 Image from the Home page of visualisingdata.com 156 6.7 How the Insane Amount of Rain in Texas Could Turn Rhode Island Into a Lake, by Christopher Ingraham (The Washington Post) 156
  • 15. 9 6.8 The 10 Actors with the Most Oscar Nominations but No Wins 161 6.9 The 10 Actors who have Received the Most Oscar Nominations 162 6.10 How Nations Fare in PhDs by Sex Interactive, by Periscopic; Research by Amanda Hobbs; Published in Scientific American 163 6.11 Gender Pay Gap US, by David McCandless, Miriam Quick (Research) and Philippa Thomas (Design) 164 6.12 Who Wins the Stanley Cup of Playoff Beards? by Graphics Department (Wall Street Journal) 165 6.13 For These 55 Marijuana Companies, Every Day is 4/20, by Alex Tribou and Adam Pearce (Bloomberg Visual Data) 166 6.14 UK Public Sector Capital Expenditure, 2014/15 167 6.15 Global Competitiveness Report 2014–2015, by Bocoup and the World Economic Forum 168 6.16 Excerpt from a Rugby Union Player Dashboard 169 6.17 Range of Temperatures (°F) Recorded in the Top 10 Most Populated Cities During 2015 170 6.18 This Chart Shows How Much More Ivy League Grads Make Than You, by Christopher Ingraham (The Washington Post) 171 6.19 Comparing Critics Scores (Rotten Tomatoes) for Major Movie Franchises 172 6.20 A Career in Numbers: Movies Starring Michael Caine 173 6.21 Comparing the Frequency of Words Used in Chapter 1 of
  • 16. this Book 174 6.22 Summary of Eligible Votes in the UK General Election 2015 175 6.23 The Changing Fortunes of Internet Explorer and Google Chrome 176 6.24 Literarcy Proficiency: Adult Levels by Country 177 6.25 Political Polarization in the American Public’, Pew Research Center, Washington, DC (February, 2015) (http://www.people- press.org/2014/06/12/political-polarization-in-the-american- public/) 178 6.26 Finviz (www.finviz.com) 179 6.27 This Venn Diagram Shows Where You Can Both Smoke Weed and Get a Same-Sex Marriage, by Phillip Bump (The Washington Post) 180 6.28 The 200+ Beer Brands of SAB InBev, by Maarten Lambrechts for Mediafin: www.tijd.be/sabinbev (Dutch), 10 www.lecho.be/service/sabinbev (French) 181 6.29 Which Fossil Fuel Companies are Most Responsible for Climate Change? by Duncan Clark and Robin Houston (Kiln), published in the Guardian, drawing on work by Mike Bostock and Jason Davies
  • 17. 182 6.30 How Long Will We Live – And How Well? by Bonnie Berkowitz, Emily Chow and Todd Lindeman (The Washington Post) 183 6.31 Crime Rates by State, by Nathan Yau 184 6.32 Nutrient Contents – Parallel Coordinates, by Kai Chang (@syntagmatic) 185 6.33 How the ‘Avengers’ Line-up Has Changed Over the Years, by Jon Keegan (Wall Street Journal) 186 6.34 Interactive Fixture Molecules, by @experimental361 and @bootifulgame 187 6.35 The Rise of Partisanship and Super-cooperators in the U.S. House of Representatives. Visualisation by Mauro Martino, authored by Clio Andris, David Lee, Marcus J. Hamilton, Mauro Martino, Christian E. Gunning, and John Armistead Selde 188 6.36 The Global Flow of People, by Nikola Sander, Guy J. Abel and Ramon Bauer 189 6.37 UK Election Results by Political Party, 2010 vs 2015 190 6.38 The Fall and Rise of U.S. Inequality, in Two Graphs. Source: World Top Incomes Database; Design credit: Quoctrung Bui (NPR) 191 6.39 Census Bump: Rank of the Most Populous Cities at Each Census, 1790–1890, by Jim Vallandingham 192 6.40 Coal, Gas, Nuclear, Hydro? How Your State Generates Power. Source: U.S. Energy Information Administration, Credit: Christopher Groskopf, Alyson Hurt and Avie Schneider (NPR) 193 6.41 Holdouts Find Cheapest Super Bowl Tickets Late in the Game,
  • 18. by Alex Tribou, David Ingold and Jeremy Diamond (Bloomberg Visual Data) 194 6.42 Crude Oil Prices (West Texas Intermediate), 1985–2015 195 6.43 Percentage Change in Price for Select Food Items, Since 1990, by Nathan Yau 196 6.44 The Ebb and Flow of Movies: Box Office Receipts 1986– 2008, by Mathew Bloch, Lee Byron, Shan Carter and Amanda Cox (The New York Times) 197 6.45 Tracing the History of N.C.A.A. Conferences, by Mike Bostock, 11 Shan Carter and Kevin Quealy (The New York Times) 198 6.46 A Presidential Gantt Chart, by Ben Jones 199 6.47 How the ‘Avengers’ Line-up Has Changed Over the Years, by Jon Keegan (Wall Street Journal) 200 6.48 Native and New Berliners – How the S-Bahn Ring Divides the City, by Julius Tröger, André Pätzold, David Wendler (Berliner Morgenpost) and Moritz Klack (webkid.io) 201 6.49 How Y’all, Youse and You Guys Talk, by Josh Katz (The New York Times) 202 6.50 Here’s Exactly Where the Candidates Cash Came From, by Zach Mider, Christopher Cannon, and Adam Pearce (Bloomberg Visual Data) 203
  • 19. 6.51 Trillions of Trees, by Jan Willem Tulp 204 6.52 The Racial Dot Map. Image Copyright, 2013, Weldon Cooper Center for Public Service, Rector and Visitors of the University of Virginia (Dustin A. Cable, creator) 205 6.53 Arteries of the City, by Simon Scarr (South China Morning Post) 206 6.54 The Carbon Map, by Duncan Clark and Robin Houston (Kiln) 207 6.55 Election Dashboard, by Jay Boice, Aaron Bycoffe and Andrei Scheinkman (Huffington Post). Statistical model created by Simon Jackman 208 6.56 London is Rubbish at Recycling and Many Boroughs are Getting Worse, by URBS London using London Squared Map © 2015 www.aftertheflood.co 209 6.57 Automating the Design of Graphical Presentations of Relational Information. Adapted from McKinlay, J. D. (1986). ACM Transactions on Graphics, 5(2), 110–141. 213 6.58 Comparison of Judging Line Size vs Area Size 213 6.59 Comparison of Judging Related Items Using Variation in Colour (Hue) vs Variation in Shape 214 6.60 Illustrating the Correct and Incorrect Circle Size Encoding 216 6.61 Illustrating the Distortions Created by 3D Decoration 217 6.62 Example of a Bullet Chart using Banding Overlays 218 6.63 Excerpt from What’s Really Warming the World? by Eric Roston and Blacki Migliozzi (Bloomberg Visual Data) 218 6.64 Example of Using Markers Overlays 219 6.65 Why Is Her Paycheck Smaller? by Hannah Fairfield and
  • 20. Graham Roberts (The New York Times) 219 12 6.66 Inside the Powerful Lobby Fighting for Your Right to Eat Pizza, by Andrew Martin and Bloomberg Visual Data 220 6.67 Excerpt from ‘Razor Sales Move Online, Away From Gillette’, by Graphics Department (Wall Street Journal) 220 7.1 US Gun Deaths, by Periscopic 225 7.2 Finviz (www.finviz.com) 226 7.3 The Racial Dot Map: Image Copyright, 2013, Weldon Cooper Center for Public Service, Rector and Visitors of the University of Virginia (Dustin A. Cable, creator) 227 7.4 Obesity Around the World, by Jeff Clark 228 7.5 Excerpt from ‘Social Progress Index 2015’, by Social Progress Imperative, 2015 228 7.6 NFL Players: Height & Weight Over Time, by Noah Veltman (noahveltman.com) 229 7.7 Excerpt from ‘How Americans Die’, by Matthew C. Klein and Bloomberg Visual Data 230 7.8 Model Projections of Maximum Air Temperatures Near the Ocean and Land Surface on the June Solstice in 2014 and 2099: NASA Earth Observatory maps, by Joshua Stevens 231 7.9 Excerpt from ‘A Swing of Beauty’, by Sohail Al-Jamea, Wilson Andrews, Bonnie Berkowitz and Todd Lindeman (The
  • 21. Washington Post) 231 7.10 How Well Do You Know Your Area? by ONS Digital Content team 232 7.11 Excerpt from ‘Who Old Are You?’, by David McCandless and Tom Evans 233 7.12 512 Paths to the White House, by Mike Bostock and Shan Carter (The New York Times) 233 7.13 OECD Better Life Index, by Moritz Stefaner, Dominikus Baur, Raureif GmbH 233 7.14 Nobel Laureates, by Matthew Weber (Reuters Graphics) 234 7.15 Geography of a Recession, by Graphics Department (The New York Times) 234 7.16 How Big Will the UK Population be in 25 Years Time? by ONS Digital Content team 234 7.17 Excerpt from ‘Workers’ Compensation Reforms by State’, by Yue Qiu and Michael Grabell (ProPublica) 235 7.18 Excerpt from ‘ECB Bank Test Results’, by Monica Ulmanu, Laura Noonan and Vincent Flasseur (Reuters Graphics) 236 7.19 History Through the President’s Words, by Kennedy Elliott, Ted 13 Mellnik and Richard Johnson (The Washington Post) 237
  • 22. 7.20 Excerpt from ‘How Americans Die’, by Matthew C. Klein and Bloomberg Visual Data 237 7.21 Twitter NYC: A Multilingual Social City, by James Cheshire, Ed Manley, John Barratt, and Oliver O’Brien 238 7.22 Killing the Colorado: Explore the Robot River, by Abrahm Lustgarten, Al Shaw, Jeff Larson, Amanda Zamora and Lauren Kirchner (ProPublica) and John Grimwade 238 7.23 Losing Ground, by Bob Marshall, The Lens, Brian Jacobs and Al Shaw (ProPublica) 239 7.24 Excerpt from ‘History Through the President’s Words’, by Kennedy Elliott, Ted Mellnik and Richard Johnson (The Washington Post) 240 7.25 Plow, by Derek Watkins 242 7.26 The Horse in Motion, by Eadweard Muybridge. Source: United States Library of Congress’s Prints and Photographs division, digital ID cph.3a45870. 243 8.1 Titles Taken from Projects Published and Credited Elsewhere in This Book 248 8.2 Excerpt from ‘The Color of Debt: The Black Neighborhoods Where Collection Suits Hit Hardest’, by Al Shaw, Annie Waldman and Paul Kiel (ProPublica) 249 8.3 Excerpt from ‘Kindred Britain’ version 1.0 © 2013 Nicholas Jenkins – designed by Scott Murray, powered by SUL-CIDR 249 8.4 Excerpt from ‘The Color of Debt: The Black Neighborhoods Where Collection Suits Hit Hardest’, by Al Shaw, Annie Waldman and Paul Kiel (ProPublica) 250
  • 23. 8.5 Excerpt from ‘Bloomberg Billionaires’, by Bloomberg Visual Data (Design and development), Lina Chen and Anita Rundles (Illustration) 251 8.6 Excerpt from ‘Gender Pay Gap US?’, by David McCandless, Miriam Quick (Research) and Philippa Thomas (Design) 251 8.7 Excerpt from ‘Holdouts Find Cheapest Super Bowl Tickets Late in the Game’, by Alex Tribou, David Ingold and Jeremy Diamond (Bloomberg Visual Data) 252 8.8 Excerpt from ‘The Life Cycle of Ideas’, by Accurat 252 8.9 Mizzou’s Racial Gap Is Typical On College Campuses, by FiveThirtyEight 253 8.10 Excerpt from ‘The Infographic History of the World’, Harper Collins (2013); by Valentina D’Efilippo (co-author and designer); 14 James Ball (co-author and writer); Data source: The Polynational War Memorial, 2012 254 8.11 Twitter NYC: A Multilingual Social City, by James Cheshire, Ed Manley, John Barratt, and Oliver O’Brien 255 8.12 Excerpt from ‘US Gun Deaths’, by Periscopic 255 8.13 Image taken from Wealth Inequality in America, by YouTube user ‘Politizane’ (www.youtube.com/watch?v=QPKKQnijnsM) 256 9.1 HSL Colour Cylinder: Image from Wikimedia Commons published under the Creative Commons Attribution-Share Alike
  • 24. 3.0 Unported license 265 9.2 Colour Hue Spectrum 265 9.3 Colour Saturation Spectrum 266 9.4 Colour Lightness Spectrum 266 9.5 Excerpt from ‘Executive Pay by the Numbers’, by Karl Russell (The New York Times) 267 9.6 How Nations Fare in PhDs by Sex Interactive, by Periscopic; Research by Amanda Hobbs; Published in Scientific American 268 9.7 How Long Will We Live – And How Well? by Bonnie Berkowitz, Emily Chow and Todd Lindeman (The Washington Post) 268 9.8 Charting the Beatles: Song Structure, by Michael Deal 269 9.9 Photograph of MyCuppa mug, by Suck UK (www.suck.uk.com/products/mycuppamugs/) 269 9.10 Example of a Stacked Bar Chart Based on Ordinal Data 270 9.11 Rim Fire – The Extent of Fire in the Sierra Nevada Range and Yosemite National Park, 2013: NASA Earth Observatory images, by Robert Simmon 270 9.12 What are the Current Electricity Prices in Switzerland [Translated], by Interactive things for NZZ (the Neue Zürcher Zeitung) 271 9.13 Excerpt from ‘Obama’s Health Law: Who Was Helped Most’, by Kevin Quealy and Margot Sanger-Katz (The New York Times) 272 9.14 Daily Indego Bike Share Station Usage, by Randy Olson (@randal_olson) (http://www.randalolson.com/2015/09/05/visualizing-indego-
  • 25. bike- share-usage-patterns-in-philadelphia-part-2/) 272 9.15 Battling Infectious Diseases in the 20th Century: The Impact of Vaccines, by Graphics Department (Wall Street Journal) 273 9.16 Highest Max Temperatures in Australia (1st to 14th January 2013), Produced by the Australian Government Bureau of 15 Meteorology 274 9.17 State of the Polar Bear, by Periscopic 275 9.18 Excerpt from Geography of a Recession by Graphics Department (The New York Times) 275 9.19 Fewer Women Run Big Companies Than Men Named John, by Justin Wolfers (The New York Times) 276 9.20 NYPD, Council Spar Over More Officers by Graphics Department (Wall Street Journal) 277 9.21 Excerpt from a Football Player Dashboard 277 9.22 Elections Performance Index, The Pew Charitable Trusts © 2014 278 9.23 Art in the Age of Mechanical Reproduction: Walter Benjamin by Stefanie Posavec 279 9.24 Casualties, by Stamen, published by CNN 279 9.25 First Fatal Accident in Spain on a High-speed Line [Translated], by Rodrigo Silva, Antonio Alonso, Mariano Zafra, Yolanda Clemente and Thomas Ondarra (El Pais) 280 9.26 Lunge Feeding, by Jonathan Corum (The New York
  • 26. Times); whale illustration by Nicholas D. Pyenson 281 9.27 Examples of Common Background Colour Tones 281 9.28 Excerpt from NYC Street Trees by Species, by Jill Hubley 284 9.29 Demonstrating the Impact of Red-green Colour Blindness (deuteranopia) 286 9.30 Colour-blind Friendly Alternatives to Green and Red 287 9.31 Excerpt from, ‘Pyschotherapy in The Arctic’, by Andy Kirk 289 9.32 Wind Map, by Fernanda Viégas and Martin Wattenberg 289 10.1 City of Anarchy, by Simon Scarr (South China Morning Post) 294 10.2 Wireframe Sketch, by Giorgia Lupi for ‘Nobels no degree’ by Accurat 295 10.3 Example of the Small Multiples Technique 296 10.4 The Glass Ceiling Persists Redesign, by Francis Gagnon (ChezVoila.com) based on original by S. Culp (Reuters Graphics) 297 10.5 Fast-food Purchasers Report More Demands on Their Time, by Economic Research Service (USDA) 297 10.6 Stalemate, by Graphics Department (Wall Street Journal) 297 10.7 Nobels No Degrees, by Accurat 298 10.8 Kasich Could Be The GOP’s Moderate Backstop, by FiveThirtyEight 298 16 10.9 On Broadway, by Daniel Goddemeyer, Moritz Stefaner,
  • 27. Dominikus Baur, and Lev Manovich 10.10 ER Wait Watcher: Which Emergency Room Will See You the Fastest? by Lena Groeger, Mike Tigas and Sisi Wei (ProPublica) 10.11 Rain Patterns, by Jane Pong (South China Morning Post) 10.12 Excerpt from ‘Psychotherapy in The Arctic’, by Andy Kirk 10.13 Gender Pay Gap US, by David McCandless, Miriam Quick (Research) and Philippa Thomas (Design) 10.14 The Worst Board Games Ever Invented, by FiveThirtyEight 10.15 From Millions, Billions, Trillions: Letters from Zimbabwe, 2005−2009, a book written and published by Catherine Buckle (2014), table design by Graham van de Ruit (pg. 193) 10.16 List of Chart Structures 10.17 Illustrating the Effect of Truncated Bar Axis Scales 10.18 Excerpt from ‘Doping under the Microscope’, by S. Scarr and W. Foo (Reuters Graphics) 10.19 Record-high 60% of Americans Support Same-sex Marriage, by Gallup 10.20 Images from Wikimedia Commons, published under the Creative Commons Attribution-Share Alike 3.0 Unported license 11.1–7 ‘The Pursuit of Faster’ by Andy Kirk and Andrew Witherley
  • 28. Acknowledgements This book has been made possible thanks to the unwavering support of my incredible wife, Ellie, and the endless encouragement from my Mum and Dad, the rest of my brilliant family and my super group of friends. From a professional standpoint I also need to acknowledge the fundamental role played by the hundreds of visualisation practitioners (no matter under what title you ply your trade) who have created such a wealth of brilliant work from which I have developed so many of my convictions and formed the basis of so much of the content in this book. The people and organisations who have provided me with permission to use their work are heroes and I hope this book does their rich talent justice. About the Author Andy Kirk is a freelance data visualisation specialist based in Yorkshire, UK. He is a visualisation design consultant, training provider, teacher, researcher, author, speaker and editor of the award-winning website visualisingdata.com.
  • 29. After graduating from Lancaster University in 1999 with a BSc (hons) in Operational Research, Andy held a variety of business analysis and information management positions at organisations including West Yorkshire Police and the University of Leeds. He discovered data visualisation in early 2007 just at the time when he was shaping up his proposal for a Master’s (MA) Research Programme designed for members of staff at the University of Leeds. On completing this programme with distinction, Andy’s passion for the subject was unleashed. Following his graduation in December 2009, to continue the process of discovering and learning the subject he launched visualisingdata.com, a blogging platform that would chart the ongoing development of the data visualisation field. Over time, as the field has continued to grow, the site too has reflected this, becoming one of the most popular in the field. It features a wide range of fresh content profiling the latest projects and contemporary techniques, discourse about practical and theoretical matters, commentary about key issues, and collections of valuable references and resources. In 2011 Andy became a freelance professional focusing on data visualisation consultancy and training workshops. Some of his clients include CERN, Arsenal FC, PepsiCo, Intel, Hershey, the WHO and McKinsey. At the time of writing he has delivered over 160 public and private training events across the UK, Europe, North
  • 30. America, Asia, South Africa and Australia, reaching well over 3000 delegates. In addition to training workshops Andy also has two academic teaching positions. He joined the highly respected Maryland Institute College of Art (MICA) as a visiting lecturer in 2013 and has been teaching a module on the Information Visualisation Master’s Programme since its inception. In January 2016, he began teaching a data visualisation module as part of the MSc in Business Analytics at the Imperial College Business School in London. Between 2014 and 2015 Andy was an external consultant on a research project called ‘Seeing Data’, funded by the Arts & Humanities Research Council and hosted by the University of Sheffield. This study explored the issues of data visualisation literacy among the general public and, among many things, helped to shape an understanding of the human factors that affect visualisation literacy and the effectiveness of design. Introduction I.1 The Quest Begins
  • 31. In his book The Seven Basic Plots, author Christopher Booker investigated the history of telling stories. He examined the structures used in biblical teachings and historical myths through to contemporary storytelling devices used in movies and TV. From this study he found seven common themes that, he argues, can be identified in any form of story. One of these themes was ‘The Quest’. Booker describes this as revolving around a main protagonist who embarks on a journey to acquire a treasured object or reach an important destination, but faces many obstacles and temptations along the way. It is a theme that I feel shares many characteristics with the structure of this book and the nature of data visualisation. You are the central protagonist in this story in the role of the data visualiser. The journey you are embarking on involves a route along a design workflow where you will be faced with a wide range of different conceptual, practical and technical challenges. The start of this journey will be triggered by curiosity, which you will need to define in order to accomplish your goals. From this origin you will move forward to initiating and planning your work, defining the dimensions of your
  • 32. challenge. Next, you will begin the heavy lifting of working with data, determining what qualities it contains and how you might share these with others. Only then will you be ready to take on the design stage. Here you will be faced with the prospect of handling a spectrum of different design options that will require creative and rational thinking to resolve most effectively. The multidisciplinary nature of this field offers a unique opportunity and challenge. Data visualisation is not an especially difficult capability to acquire; it is largely a game of decisions. Making better decisions will be your goal but sometimes clear decisions will feel elusive. There will be occasions when the best choice is not at all visible and others when there will be many seemingly equal viable choices. Which one to go with? This book aims to be your guide, helping you navigate efficiently through these difficult stages of your journey. You will need to learn to be flexible and adaptable, capable of shifting your approach to suit the circumstances. This is important
  • 33. because there are plenty of potential villains lying in wait looking to derail progress. These are the forces that manifest through the imposition of restrictive creative constraints and the pressure created by the relentless ticking clock of timescales. Stakeholders and audiences will present complex human factors through the diversity of their needs and personal traits. These will need to be astutely accommodated. Data, the critical raw material of this process, will dominate your attention. It will frustrate and even disappoint at times, as promises of its treasures fail to materialise irrespective of the hard work, love and attention lavished upon it. Your own characteristics will also contribute to a certain amount of the villainy. At times, you will find yourself wrestling with internal creative and analytical voices pulling against each other in opposite directions. Your excitably formed initial ideas will be embraced but will need taming. Your inherent tastes, experiences and comforts will divert you away from the ideal path, so you will need to maintain clarity and focus. The central conflict you will have to deal with is the notion that there is no perfect in data visualisation. It is a field with very few ‘always’ and ‘nevers’. Singular solutions rarely exist. The comfort offered by
  • 34. the rules that instruct what is right and wrong, good and evil, has its limits. You can find small but legitimate breaking points with many of them. While you can rightly aspire to reach as close to perfect as possible, the attitude of aiming for good enough will often indeed be good enough and fundamentally necessary. In accomplishing the quest you will be rewarded with competency in data visualisation, developing confidence in being able to judge the most effective analytical and design solutions in the most efficient way. It will take time and it will need more than just reading this book. It will also require your ongoing effort to learn, apply, reflect and develop. Each new data visualisation opportunity poses a new, unique challenge. However, if you keep persevering with this journey the possibility of a happy ending will increase all the time. I.2 Who is this Book Aimed at? The primary challenge one faces when writing a book about data visualisation is to determine what to leave in and what to leave out. Data visualisation is big. It is too big a subject even to attempt to cover it all, in
  • 35. detail, in one book. There is no single book to rule them all because there is no one book that can cover it all. Each and every one of the topics covered by the chapters in this book could (and, in several cases, do) exist as whole books in their own right. The secondary challenge when writing a book about data visualisation is to decide how to weave all the content together. Data visualisation is not rocket science; it is not an especially complicated discipline. Lots of it, as you will see, is rooted in common sense. It is, however, certainly a complex subject, a semantic distinction that will be revisited later. There are lots of things to think about and decide on, as well as many things to do and make. Creative and analytical sensibilities blend with artistic and scientific judgments. In one moment you might be checking the statistical rigour of your calculations, in the next deciding which tone of orange most elegantly contrasts with an 80% black. The complexity of data visualisation manifests itself through how these different ingredients, and many more, interact, influence and intersect to form the whole. The decisions I have made in formulating this book’s content have been shaped by my own process of learning about, writing about and practising data visualisation for, at the time of writing, nearly a decade.
  • 36. Significantly – from the perspective of my own development – I have been fortunate to have had extensive experience designing and delivering training workshops and postgraduate teaching. I believe you only truly learn about your own knowledge of a subject when you have to explain it and teach it to others. I have arrived at what I believe to be an effective and proven pedagogy that successfully translates the complexities of this subject into accessible, practical and valuable form. I feel well qualified to bridge the gap between the large population of everyday practitioners, who might identify themselves as beginners, and the superstar technical, creative and academic minds that are constantly pushing forward our understanding of the potential of data visualisation. I am not going to claim to belong to that latter cohort, but I have certainly been the former – a beginner – and most of my working hours are spent helping other beginners start their journey. I know the things that I would have valued when I was starting out and I know how I would have wished them to be articulated and
  • 37. presented for me to develop my skills most efficiently. There is a large and growing library of fantastic books offering many different theoretical and practical viewpoints on the subject of data visualisation. My aim is to bring value to this existing collection of work by taking on a particular perspective that is perhaps under-represented in other texts – exploring the notion and practice of a visualisation design process. As I have alluded to in the opening, the central premise of this book is that the path to mastering data visualisation is achieved by making better decisions: effective choices, efficiently made. The book’s central goal is to help develop your capability and confidence in facing these decisions. Just as a single book cannot cover the whole of this subject, it stands that a single book cannot aim to address directly the needs of all people doing data visualisation. In this section I am going to run through some of the characteristics that shape the readers to whom this book is primarily targeted. I will also put into context the content the book will and will not cover, and why. This will help manage your expectations as the reader and establish its value proposition compared with other titles.
  • 38. Domain and Duties The core audiences for whom this book has been primarily written are undergraduate and postgraduate-level students and early career researchers from social science subjects. This reflects a growing number of people in higher education who are interested in and need to learn about data visualisation. Although aimed at social sciences, the content will also be relevant across the spectrum of academic disciplines, from the arts and humanities right through to the formal and natural sciences: any academic duty where there is an emphasis on the use of quantitative and qualitative methods in studies will require an appreciation of good data visualisation practices. Where statistical capabilities are relevant so too is data visualisation. Beyond academia, data visualisation is a discipline that has reached mainstream consciousness with an increasing number of professionals and organisations, across all industry types and sizes, recognising the importance of doing it well for both internal and external
  • 39. benefit. You might be a market researcher, a librarian or a data analyst looking to enhance your data capabilities. Perhaps you are a skilled graphic designer or web developer looking to take your portfolio of work into a more data-driven direction. Maybe you are in a managerial position and not directly involved in the creation of visualisation work, but you need to coordinate or commission others who will be. You require awareness of the most efficient approaches, the range of options and the different key decision points. You might be seeking generally to improve the sophistication of the language you use around commissioning visualisation work and to have a better way of expressing and evaluating work created for you. Basically, anyone who is involved in whatever capacity with the analysis and visual communication of data as part of their professional duties will need to grasp the demands of data visualisation and this book will go some way to supporting these needs. Subject Neutrality One of the important aspects of the book will be to emphasise that data visualisation is a portable practice. You will see a broad array of examples of work from different industries, covering very different
  • 40. topics. What will become apparent is that visualisation techniques are largely subject-matter neutral: a line chart that displays the ebb and flow of favourable opinion towards a politician involves the same techniques as using a line chart to show how a stock has changed in value over time or how peak temperatures have changed across a season in a given location. A line chart is a line chart, regardless of the subject matter. The context of the viewers (such as their needs and their knowledge) and the specific meaning that can be drawn will inevitably be unique to each setting, but the role of visualisation itself is adaptable and portable across all subject areas. Data visualisation is an entirely global concern, not focused on any defined geographic region. Although the English language dominates the written discourse (books, websites) about this subject, the interest in it and visible output from across the globe are increasing at a pace. There are cultural matters that influence certain decisions throughout the design process, especially around the choices made for colour usage, but otherwise it is a discipline common to all.
  • 41. Level and Prerequisites The coverage of this book is intended to serve the needs of beginners and those with intermediate capability. For most people, this is likely to be as far as they might ever need to go. It will offer an accessible route for novices to start their learning journey and, for those already familiar with the basics, there will be content that will hopefully contribute to fine-tuning their approaches. For context, I believe the only distinction between beginner and intermediate is one of breadth and depth of critical thinking rather than any degree of difficulty. The more advanced techniques in visualisation tend to be associated with the use of specific technologies for handling larger, complex datasets and/or producing more bespoke and feature-rich outputs. This book is therefore not aimed at experienced or established visualisation practitioners. There may be some new perspectives to enrich their thinking, some content that will confirm and other content that might constructively challenge their convictions. Otherwise, the coverage in this book should really echo the practices they are likely to be already observing.
  • 42. As I have already touched on, data visualisation is a genuinely multidisciplinary field. The people who are active in this field or profession come from all backgrounds – everyone has a different entry point and nobody arrives with all constituent capabilities. It is therefore quite difficult to define just what are the right type and level of pre-existing knowledge, skills or experiences for those learning about data visualisation. As each year passes, the savvy-ness of the type of audience this book targets will increase, especially as the subject penetrates more into the mainstream. What were seen as bewilderingly new techniques several years ago are now commonplace to more people. That said, I think the following would be a fair outline of the type and shape of some of the most important prerequisite attributes for getting the most out of this book: Strong numeracy is necessary as well as a familiarity with basic statistics. While it is reasonable to assume limited prior knowledge of data visualisation, there should be a strong desire to want to learn it. The
  • 43. demands of learning a craft like data visualisation take time and effort; the capabilities will need nurturing through ongoing learning and practice. They are not going to be achieved overnight or acquired alone from reading this book. Any book that claims to be able magically to inject mastery through just reading it cover to cover is over-promising and likely to under-deliver. The best data visualisers possess inherent curiosity. You should be the type of person who is naturally disposed to question the world around them or can imagine what questions others have. Your instinct for discovering and sharing answers will be at the heart of this activity. There are no expectations of your having any prior familiarity with design principles, but a desire to embrace some of the creative aspects presented in this book will heighten the impact of your work. Unlock your artistry! If you are somebody with a strong creative flair you are very fortunate. This book will guide you through when and crucially when not to tap into this sensibility. You should be willing to increase the rigour of your analytical decision making and be prepared to have your creative thinking informed more fundamentally by data rather than just instinct. A range of technical skills covering different software applications,
  • 44. tools and programming languages is not expected for this book, as I will explain next, but you will ideally have some knowledge of basic Excel and some experience of working with data. I.3 Getting the Balance Handbook vs Tutorial Book The description of this book as being a ‘handbook’ positions it as being of practical help and presented in accessible form. It offers direction with comprehensive reference – more of a city guidebook for a tourist than an instruction manual to fix a washing machine. It will help you to know what things to think about, when to think about them, what options exist and how best to resolve all the choices involved in any data-driven design. Technology is the key enabler for working with data and creating visualisation design outputs. Indeed, apart from a small proportion of artisan visualisation work that is drawn by hand, the reliance on technology to create visualisation work is an inseparable necessity. For many there is an understandable appetite for step-by-step tutorials that help
  • 45. them immediately to implement data visualisation techniques via existing and new tools. However, writing about data visualisation through the lens of selected tools is a bit of a minefield, given the diversity of technical options out there and the mixed range of skills, access and needs. I greatly admire those people who have authored tutorial-based texts because they require astute judgement about what is the right level, structure and scope. The technology space around visualisation is characterised by flux. There are the ongoing changes with the enhancement of established tools as well as a relatively high frequency of new entrants offset by the decline of others. Some tools are proprietary, others are open source; some are easier to learn, others require a great deal of understanding before you can even consider embarking on your first chart. There are many recent cases of applications or services that have enjoyed fleeting exposure before reaching a plateau: development and support decline, the community of users disperses and there is a certain expiry of value. Deprecation of syntax and functions in programming languages requires the perennial updating of skills.
  • 46. All of this perhaps paints a rather more chaotic picture than is necessarily the case but it justifies the reasons why this book does not offer teaching in the use of any tools. While tutorials may be invaluable to some, they may also only be mildly interesting to others and possibly of no value to most. Tools come and go but the craft remains. I believe that creating a practical, rather than necessarily a technical, text that focuses on the underlying craft of data visualisation with a tool-agnostic approach offers an effective way to begin learning about the subject in appropriate depth. The content should be appealing to readers irrespective of the extent of their technical knowledge (novice to advanced technicians) and specific tool experiences (e.g. knowledge of Excel, Tableau, Adobe Illustrator). There is a role for all book types. Different people want different sources of insight at different stages in their development. If you are seeking a text that provides in-depth tutorials on a range of tools or pages of programmatic instruction, this one will not be the best choice. However, if you consult only tutorial-related books, the chances are you will
  • 47. likely fall short on the fundamental critical thinking that will be needed in the longer term to get the most out of the tools with which you develop strong skills. To substantiate the book’s value, the digital companion resources to this book will offer a curated, up-to-date collection of visualisation technology resources that will guide you through the most common and valuable tools, helping you to gain a sense of what their roles are and where these fit into the design workflow. Additionally, there will be recommended exercises and many further related digital materials available for exploring. Useful vs Beautiful Another important distinction to make is that this book is not intended to be seen as a beauty pageant. I love flicking through those glossy ‘coffee table’ books as much as the next person; such books offer great inspiration and demonstrate some of the finest work in the field. This book serves a very different purpose. I believe that, as a beginner or relative beginner on this learning journey, the inspiration you need comes more from understanding what is behind the thinking that makes these amazing works succeed and others not. My desire is to make this the most useful text available, a
  • 48. reference that will spend more time on your desk than on your bookshelf. To be useful is to be used. I want the pages to be dog-eared. I want to see scribbles and annotated notes made across its pages and key passages underlined. I want to see sticky labels peering out above identified pages of note. I want to see creases where pages have been folded back or a double-page spread that has been weighed down to keep it open. In time I even want its cover reinforced with wallpaper or wrapping paper to ensure its contents remain bound together. There is every intention of making this an elegantly presented and packaged book but it should not be something that invites you to ‘look, but don’t touch’. Pragmatic vs Theoretical The content of this book has been formed through many years of absorbing knowledge from all manner of books, generations of academic papers, thousands of web articles, hundreds of conference talks, endless online and personal discussions, and lots of personal practice. What I present here is a pragmatic translation and distillation of what I have learned
  • 49. down the years. It is not a deeply academic or theoretical book. Where theoretical context and reference is relevant it will be signposted as I do want to ground this book in as much evidence-based content as possible; it is about judging what is going to add most value. Experienced practitioners will likely have an appetite for delving deeper into theoretical discourse and the underlying sciences that intersect in this field but that is beyond the scope of this particular text. Take the science of visual perception, for example. There is no value in attempting to emulate what has already been covered by other books in greater depth and quality than I could achieve. Once you start peeling back the many different layers of topics like visual and cognitive science the boundaries of your interest and their relevance to data visualisation never seem to arrive. You get swallowed up by the depth of these subjects. You realise that you have found yourself learning about what the very concept of light and sight is and at that point your brain begins to ache (well, mine does at least), especially when all you set out to discover was if a bar chart would be better than a pie chart.
  • 50. An important reason for giving greater weight to pragmatism is because of people: people are the makers, the stakeholders, the audiences and the critics in data visualisation. Although there are a great many valuable research-driven concepts concerning data visualisation, their practical application can be occasionally at odds with the somewhat sanitised and artificial context of the research methods employed. To translate them into real-world circumstances can sometimes be easier said than done as the influence of human factors can easily distort the significance of otherwise robust ideas. I want to remove the burden from you as a reader having to translate relevant theoretical discourse into applicable practice. Critical thinking will therefore be the watchword, equipping you with the independence of thought to decide rationally for yourself what the solutions are that best fit your context, your data, your message and your audience. To do this you will need an appreciation of all the options available to you (the different things you could do) and a reliable approach for critically determining what choices you should make (the things you will do and why).
  • 51. Contemporary vs Historical This book is not going to look too far back into the past. We all respect the ancestors of this field, the great names who, despite primitive means, pioneered new concepts in the visual display of statistics to shape the foundations of the field being practised today. The field’s lineage is decorated by the influence of William Playfair’s first ever bar chart, Charles Joseph Minard’s famous graphic about Napoleon’s Russian campaign, Florence Nightingale’s Coxcomb plot and John Snow’s cholera map. These are some of the totemic names and classic examples that will always be held up as the ‘firsts’. Of course, to many beginners in the field, this historical context is of huge interest. However, again, this kind of content has already been superbly covered by other texts on more than enough occasions. Time to move on. I am not going to spend time attempting to enlighten you about how we live in the age of ‘Big Data’ and how occupations related to data are or will be the ‘sexiest jobs’ of our time. The former is no longer news, the latter claim emerged from a single source. I do not want to bloat this book
  • 52. with the unnecessary reprising of topics that have been covered at length elsewhere. There is more valuable and useful content I want you to focus your time on. The subject matter, the ideas and the practices presented here will hopefully not date a great deal. Of course, many of the graphic examples included in the book will be surpassed by newer work demonstrating similar concepts as the field continues to develop. However, their worth as exhibits of a particular perspective covered in the text should prove timeless. As more research is conducted in the subject, without question there will be new techniques, new concepts, new empirically evidenced principles that emerge. Maybe even new rules. There will be new thought-leaders, new sources of reference, new visualisers to draw insight from. New tools will be created, existing tools will expire. Some things that are done and can only be done by hand as of today may become seamlessly automated in the near future. That is simply the nature of a fast-growing field. This book can only be a line in the sand. Analysis vs Communication
  • 53. A further important distinction to make concerns the subtle but significant difference between visualisations which are used for analysis and visualisations used for communication. Before a visualiser can confidently decide what to communicate to others, he or she needs to have developed an intimate understanding of the qualities and potential of the data. This is largely achieved through exploratory data analysis. Here, the visualiser and the viewer are the same person. Through visual exploration, different interrogations can be pursued ‘on the fly’ to unearth confirmatory or enlightening discoveries about what insights exist. Visualisation techniques used for analysis will be a key component of the journey towards creating visualisation for communication but the practices involved differ. Unlike visualisation for communication, the techniques used for visual analysis do not have to be visually polished or necessarily appealing. They are only serving the purpose of helping you to truly learn about your data. When a data visualisation is being created to communicate to others, many careful considerations come into play about the requirements and interests of the intended or expected
  • 54. audience. This has a significant influence on many of the design decisions you make that do not exist alone with visual analysis. Exploratory data analysis is a huge and specialist subject in and of itself. In its most advanced form, working efficiently and effectively with large complex data, topics like ‘machine learning’, using self-learning algorithms to help automate and assist in the discovery of patterns in data, become increasingly relevant. For the scope of this book the content is weighted more towards methods and concerns about communicating data visually to others. If your role is in pure data science or statistical analysis you will likely require a deeper treatment of the exploratory data analysis topic than this book can reasonably offer. However, Chapter 4 will cover the essential elements in sufficient depth for the practical needs of most people working with data. Print vs Digital The opportunity to supplement the print version of this book with an e-book and further digital companion resources helps to cushion the agonising decisions about what to leave out. This text is therefore
  • 55. enhanced by access to further digital resources, some of which are newly created, while others are curated references from the endless well of visualisation content on the Web. Included online (book.visualisingdata.com) will be: a completed case-study project that demonstrates the workflow activities covered in this book, including full write-ups and all related digital materials; an extensive and up-to-date catalogue of over 300 data visualisation tools; a curated collection of tutorials and resources to help develop your confidence with some of the most common and valuable tools; practical exercises designed to embed the learning from each chapter; further reading resources to continue learning about the subjects covered in each chapter. I.4 Objectives Before moving on to an outline of the book’s contents, I want to share four key objectives that I hope to accomplish for you by the final chapter. These are themes that will run through the entire text: challenge, enlighten, equip and inspire. To challenge you I will be encouraging you to recognise that your current
  • 56. thinking about visualisation may need to be reconsidered, both as a creator and as a consumer. We all arrive in visualisation from different subject and domain origins and with that comes certain baggage and prior sensibilities that can distort our perspectives. I will not be looking to eliminate these, rather to help you harness and align them with other traits and viewpoints. I will ask you to relentlessly consider the diverse decisions involved in this process. I will challenge your convictions about what you perceive to be good or bad, effective or ineffective visualisation choices: arbitrary choices will be eliminated from your thinking. Even if you are not necessarily a beginner, I believe the content you read in this book will make you question some of your own perspectives and assumptions. I will encourage you to reflect on your previous work, asking you to consider how and why you have designed visualisations in the way that you have: where do you need to improve? What can you do better? It is not just about creating visualisations; I will also challenge your approach to reading visualisations. This is not something you
  • 57. might usually think much about, but there is an important role for more tactical approaches to consuming visualisations with greater efficiency and effectiveness. To enlighten you will be to increase your awareness of the possibilities in data visualisation. As you begin your discovery of data visualisation you might not be aware of the whole: you do not entirely know what options exist, how they are connected and how to make good choices. Until you know, you don’t know – that is what the objective of enlightening is all about. As you will discover, there is a lot on your plate, much to work through. It is not just about the visible end-product design decisions. Hidden beneath the surface are many contextual circumstances to weigh up, decisions about how best to prepare your data, choices around the multitude of viable ways of slicing those data up into different angles of analysis. That is all before you even reach the design stage, where you will begin to consider the repertoire of techniques for visually portraying your data – the charts, the interactive features, the colours and much more besides.
  • 58. This book will broaden your visual vocabulary to give you more ways of expressing your data visually. It will enhance the sophistication of your decision making and of visual language for any of the challenges you may face. To equip is to ensure you have robust tactics for managing your way through the myriad options that exist in data visualisation. The variety it offers makes for a wonderful prospect but, equally, introduces the burden of choice. This book aims to make the challenge of undertaking data visualisation far less overwhelming, breaking down the overall prospect into smaller, more manageable task chunks. The structure of this book will offer a reliable and flexible framework for thinking, rather than rules for learning. It will lead to better decisions. With an emphasis on critical thinking you will move away from an over-reliance on gut feeling and taste. To echo what I mentioned earlier, its role as a handbook will help you know what things to think about, when to think about them and how best to resolve all the thinking involved in any data-driven design challenge you meet.
  • 59. To inspire is to give you more than just a book to read. It is the opening of a door into a subject to inspire you to step further inside. It is about helping you to want to continue to learn about it and expose yourself to as much positive influence as possible. It should elevate your ambition and broaden your capability. It is a book underpinned by theory but dominated by practical and accessible advice, including input from some of the best visualisers in the field today. The range of print and digital resources will offer lots of supplementary material including tutorials, further reading materials and suggested exercises. Collectively this will hopefully make it one of the most comprehensive, valuable and inspiring titles out there. I.5 Chapter Contents The book is organised into four main parts (A, B, C and D) comprising eleven chapters and preceded by the ‘Introduction’ sections you are reading now. Each chapter opens with an introductory outline that previews the content to be covered and provides a bridge between consecutive chapters. In the closing sections of each chapter the most salient learning points
  • 60. will be summarised and some important, practical tips and tactics shared. As mentioned, online there will be collections of practical exercises and further reading resources recommended to substantiate the learning from the chapter. Throughout the book you will see sidebar captions that will offer relevant references, aphorisms, good habits and practical tips from some of the most influential people in the field today. Introduction This introduction explains how I have attempted to make sense of the complexity of the subject, outlining the nature of the audience I am trying to reach, the key objectives, what topics the book will be covering and not covering, and how the content has been organised. Part A: Foundations Part A establishes the foundation knowledge and sets up a key reference of understanding that aids your thinking across the rest of the book. Chapter 1 will be the logical starting point for many of you who are new to the field to help you understand more about the definitions and attributes
  • 61. of data visualisation. Even if you are not a complete beginner, the content of the chapter forms the terms of reference that much of the remaining content is based on. Chapter 2 prepares you for the journey through the rest of the book by introducing the key design workflow that you will be following. Chapter 1: Defining Data Visualisation Defining data visualisation: outlining the components of thinking that make up the proposed definition for data visualisation. The importance of conviction: presenting three guiding principles of good visualisation design: trustworthy, accessible and elegant. Distinctions and glossary: explaining the distinctions and overlaps with other related disciplines and providing a glossary of terms used in this book to establish consistency of language. Chapter 2: Visualisation Workflow The importance of process: describing the data visualisation design workflow, what it involves and why a process approach is required. The process in practice: providing some useful tips, tactics and habits that transcend any particular stage of the process but will best prepare you for success with this activity. Part B: The Hidden Thinking
  • 62. Part B discusses the first three preparatory stages of the data visualisation design workflow. ‘The hidden thinking’ title refers to how these vital activities, which have a huge influence over the eventual design solution, are somewhat out of sight in the final output; they are hidden beneath the surface but completely shape what is visible. These stages represent the often neglected contextual definitions, data wrangling and editorial challenges that are so critical to the success or otherwise of any visualisation work – they require a great deal of care and attention before you switch your attention to the design stage. Chapter 3: Formulating Your Brief What is a brief?: describing the value of compiling a brief to help initiate, define and plan the requirements of your work. Establishing your project’s context: defining the origin curiosity or motivation, identifying all the key factors and circumstances that surround your work, and defining the core purpose of your visualisation. Establishing your project’s vision: early considerations about the type of visualisation solution needed to achieve your aims and
  • 63. harnessing initial ideas about what this solution might look like. Chapter 4: Working With Data Data literacy: establishing a basic understanding with this critical literacy, providing some foundation understanding about datasets and data types and some observations about statistical literacy. Data acquisition: outlining the different origins of and methods for accessing your data. Data examination: approaches for acquainting yourself with the physical characteristics and meaning of your data. Data transformation: optimising the condition, content and form of your data fully to prepare it for its analytical purpose. Data exploration: developing deeper intimacy with the potential qualities and insights contained, and potentially hidden, within your data. Chapter 5: Establishing Your Editorial Thinking What is editorial thinking?: defining the role of editorial thinking in data visualisation. The influence of editorial thinking: explaining how the different dimensions of editorial thinking influence design choices. Part C: Developing Your Design Solution
  • 64. Part C is the main part of the book and covers progression through the data visualisation design and production stage. This is where your concerns switch from hidden thinking to visible thinking. The individual chapters in this part of the book cover each of the five layers of the data visualisation anatomy. They are treated as separate affairs to aid the clarity and organisation of your thinking, but they are entirely interrelated matters and the chapter sequences support this. Within each chapter there is a consistent structure beginning with an introduction to each design layer, an overview of the many different possible design options, followed by detailed guidance on the factors that influence your choices.
  • 65. The production cycle: describing the cycle of development activities that take place during this stage, giving a context for how to work through the subsequent chapters in this part. Chapter 6: Data Representation Introducing visual encoding: an overview of the essentials of data representation looking at the differences and relationships between visual encoding and chart types. Chart types: a detailed repertoire of 49 different chart types, profiled in depth and organised by a taxonomy of chart families: categorical, hierarchical, relational, temporal, and spatial. Influencing factors and considerations: presenting the factors that will influence the suitability of your data representation choices. Chapter 7: Interactivity
  • 66. The features of interactivity: Data adjustments: a profile of the options for interactively interrogating and manipulating data. View adjustments: a profile of the options for interactively configuring the presentation of data. Influencing factors and considerations: presenting the factors that will influence the suitability of your interactivity choices. Chapter 8: Annotation The features of annotation: Project annotation: a profile of the options for helping to provide viewers with general explanations about your project. Chart annotation: a profile of the annotated options for helping to
  • 67. optimise viewers’ understanding of your charts. Influencing factors and considerations: presenting the factors that will influence the suitability of your annotation choices. Chapter 9: Colour The features of colour: Data legibility: a profile of the options for using colour to represent data. Editorial salience: a profile of the options for using colour to direct the eye towards the most relevant features of your data. Functional harmony: a profile of the options for using colour most effectively across the entire visualisation design. Influencing factors and considerations: presenting the factors that will influence the suitability of your colour choices. Chapter 10: Composition
  • 68. The features of composition: Project composition: a profile of the options for the overall layout and hierarchy of your visualisation design. Chart composition: a profile of the options for the layout and hierarchy of the components of your charts. Influencing factors and considerations: presenting the factors that will influence the suitability of your composition choices. Part D: Developing Your Capabilities Part D wraps up the book’s content by reflecting on the range of capabilities required to develop confidence and competence with data visualisation. Following completion of the design process, the multidisciplinary nature of this subject will now be clearly established.
  • 69. This final part assesses the two sides of visualisation literacy – your role as a creator and your role as a viewer – and what you need to enhance your skills with both. Chapter 11: Visualisation Literacy Viewing: Learning to see: learning about the most effective strategy for understanding visualisations in your role as a viewer rather than a creator. Creating: The capabilities of the visualiser: profiling the skill sets, mindsets and general attributes needed to master data visualisation design as a creator. Part A: Foundations
  • 70. 1 Defining Data Visualisation This opening chapter will introduce you to the subject of data visualisation, defining what data visualisation is and is not. It will outline the different ingredients that make it such an interesting recipe and establish a foundation of understanding that will form a key reference for all of the decision making you are faced with. Three core principles of good visualisation design will be presented that offer guiding ideals to help mould your convictions about distinguishing between effective and ineffective in data visualisation. You will also see how data visualisation sits alongside or overlaps with other related disciplines, and some definitions about the use of language in
  • 71. this book will be established to ensure consistency in meaning across all chapters. 1.1 The Components of Understanding To set the scene for what is about to follow, I think it is important to start this book with a proposed definition for data visualisation (Figure 1.1). This definition offers a critical term of reference because its components and their meaning will touch on every element of content that follows in this book. Furthermore, as a subject that has many different proposed definitions, I believe it is worth clarifying my own view before going further: Figure 1.1 A Definition for Data Visualisation
  • 72. At first glance this might appear to be a surprisingly short definition: isn’t there more to data visualisation than that, you might ask? Can nine words sufficiently articulate what has already been introduced as an eminently complex and diverse discipline? I have arrived at this after many years of iterations attempting to improve the elegance of my definition. In the past I have tried to force too many words and too many clauses into one statement, making it cumbersome and rather undermining its value. Over time, as I have developed greater clarity in my own convictions, I have in turn managed to establish greater clarity about what I feel is the real essence of this subject. The definition above is, I believe, a succinct and practically useful description of what the pursuit of visualisation is truly about. It is a definition that largely informs the contents of this book. Each chapter will aim to enlighten
  • 73. you about different aspects of the roles of and relationships between each component expressed. Let me introduce and briefly examine each of these one by one, explaining where and how they will be discussed in the book. Firstly, data, our critical raw material. It might appear a formality to mention data in the definition for, after all, we are talking about data visualisation as opposed to, let’s say, cheese visualisation (though visualisation of data using cheese has happened, see Figure 1.2), but the core role that data has in the design process needs to be made clear. Without data there is no visualisation; indeed there is no need for one. Data plays the fundamental role in this work, so you will need to give it your undivided attention and respect. You will discover in Chapter 4 the importance of developing an intimacy with your data to acquaint yourself
  • 74. with its physical properties, its meaning and its potential qualities. Figure 1.2 Per Capita Cheese Consumption in the US Data is names, amounts, groups, statistical values, dates, comments, locations. Data is textual and numeric in format, typically held in datasets in table form, with rows of records and columns of different variables. This tabular form of data is what we will be considering as the raw form of data. Through tables, we can look at the values contained to precisely read them as individual data points. We can look up values quite efficiently, scanning across many variables for the different records held. However, we cannot easily establish the comparative size and relationship
  • 75. between multiple data points. Our eyes and mind are not equipped to translate easily the textual and numeric values into quantitative and qualitative meaning. We can look at the data but we cannot really see it without the context of relationships that help us compare and contrast them effectively with other values. To derive understanding from data we need to see it represented in a different, visual form. This is the act of data representation. This word representation is deliberately positioned near the front of the definition because it is the quintessential activity of data visualisation design. Representation concerns the choices made about the form in which your data will be visually portrayed: in lay terms, what chart or charts you will use to exploit the brain’s visual perception capabilities most effectively.
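The contrast between looking up values in a table and seeing them in a visual form can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names and the `text_bar` and `render` helpers are hypothetical and do not come from the book or its companion materials.

```python
# Hypothetical tabular data: each dict is a record (row),
# each key is a variable (column). Values are invented for illustration.
records = [
    {"category": "A", "amount": 12},
    {"category": "B", "amount": 30},
    {"category": "C", "amount": 21},
]


def text_bar(value, scale=1):
    """Represent a quantity as a bar whose length encodes the value."""
    return "#" * round(value * scale)


def render(rows):
    """Return one line per record: the label followed by a bar sized by its amount."""
    return [f"{row['category']} {text_bar(row['amount'])}" for row in rows]


# Reading the table gives exact values; the bars make relative size visible at a glance.
for line in render(records):
    print(line)
```

The numbers are the same in both forms; only in the second can the eye compare lengths directly instead of comparing digits.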
  • 76. When data visualisers create a visualisation they are representing the data they wish to show visually through combinations of marks and attributes. Marks are points, lines and areas. Attributes are the appearance properties of these marks, such as the size, colour and position. The recipe of these marks and their attributes, along with other components of apparatus, such as axes and gridlines, forms the anatomy of a chart. In Chapter 6 you will gain a deeper and more sophisticated appreciation of the range of different charts that are in common usage today, broadening your visual vocabulary. These charts will vary in complexity and composition, with each capable of accommodating different
  • 77. types of data and portraying different angles of analysis. You will learn about the key ingredients that shape your data representation decisions, explaining the factors that distinguish the effective from the ineffective choices. Beyond representation choices, the presentation of data concerns all the other visible design decisions that make up the overall visualisation anatomy. This includes choices about the possible applications of interactivity, features of annotation, colour usage and the composition of your work. During the early stages of learning this subject it is sensible to partition your thinking about these matters, treating them as isolated design layers. This will aid your initial critical thinking. Chapters 7–10 will explore each of these layers in depth, profiling the options available and the factors that influence your decisions.
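The idea introduced earlier, that a chart is built from marks (points, lines, areas) and their attributes (size, colour, position), could be modelled roughly as follows. This is a minimal sketch, not the book's own notation: the `Mark` class, the `encode_bars` function and the decision to treat each bar as an area mark are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Mark:
    """A visual mark plus the appearance attributes that encode data."""
    kind: str        # "point", "line" or "area"
    position: tuple  # (x, y) placement of the mark
    size: float      # the attribute used here to encode the quantity
    colour: str


def encode_bars(values, colour="purple", width=1.0):
    """Encode each value as one mark: position from its index, size from its value.

    Treating a bar as an "area" mark is a simplification chosen for this sketch.
    """
    return [
        Mark(kind="area", position=(i * width, 0.0), size=float(v), colour=colour)
        for i, v in enumerate(values)
    ]


marks = encode_bars([3, 7, 5])
for m in marks:
    print(m)
```

Separating the mark from its attributes mirrors the way the design layers are treated as separate concerns while learning, even though they are connected in a finished chart.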
  • 78. However, as you gain in experience, the interrelated nature of visualisation will become much more apparent and you will see how the overall design anatomy is entirely connected. For instance, the selection of a chart type intrinsically leads to decisions about the space and place it will occupy; an interactive control may be included to reveal an annotated caption; for any design property to be even visible to the eye it must possess a colour that is different from that of its background. The goal expressed in this definition states that data visualisation is about facilitating understanding. This is very important and some extra time is required to emphasise why it is such an influential component in our thinking. You might think you know what understanding means, but when you peel back the surface you realise there are many subtleties that need to
  • 79. be acknowledged about this term and their impact on your data visualisation choices. Understanding ‘understanding’ (still with me?) in the context of data visualisation is of elementary significance. When consuming a visualisation, the viewer will go through a process of understanding involving three stages: perceiving, interpreting and comprehending (Figure 1.3). Each stage is dependent on the previous one and in your role as a data visualiser you will have influence but not full control over these. You are largely at the mercy of the viewer – what they know and do not know, what they are interested in knowing and what might be meaningful to them – and this introduces many variables outside of your control: where your control diminishes the influence
  • 80. and reliance on the viewer increases. Achieving an outcome of understanding is therefore a collective responsibility between visualiser and viewer. These are not just synonyms for the same word, rather they carry important distinctions that need appreciating. As you will see throughout this book, the subtleties and semantics of language in data visualisation will be a recurring concern. Figure 1.3 The Three Stages of Understanding Let’s look at the characteristics of the different stages that form the process of understanding to help explain their respective differences and mutual dependencies. Firstly, perceiving. This concerns the act of simply being able to read a chart. What is the chart showing you? How easily can you get a sense of
  • 81. the values of the data being portrayed? Where are the largest, middle-sized and smallest values? What proportion of the total does that value hold? How do these values compare in ranking terms? To which other values does this have a connected relationship? The notion of understanding here concerns our attempts as viewers to efficiently decode the representations of the data (the shapes, the sizes and the colours) as displayed through a chart, and then convert them into perceived values: estimates of quantities and their relationships to other values. Interpreting is the next stage of understanding following on from perceiving. Having read the charts the viewer now seeks to
  • 82. convert these perceived values into some form of meaning: Is it good to be big or better to be small? What does it mean to go up or go down? Is that relationship meaningful or insignificant? Is the decline of that category especially surprising? The viewer’s ability to form such interpretations is influenced by their pre-existing knowledge about the portrayed subject and their capacity to utilise that knowledge to frame the implications of what has been read. Where a viewer does not possess that knowledge it may be that the visualiser has to address this deficit. They will need to make suitable design choices that help to make clear what meaning can or should be drawn from the display of data. Captions, headlines, colours and other annotated devices, in particular, can all be used to achieve this. Comprehending involves reasoning the consequence of the
  • 83. perceiving and interpreting stages to arrive at a personal reflection of what all this means to them, the viewer. How does this information make a difference to what was known about the subject previously? Why is this relevant? What wants or needs does it serve? Has it confirmed what I knew or possibly suspected beforehand or enlightened me with new knowledge? Has this experience impacted me in an emotional way or left me feeling somewhat indifferent as a consequence? Does the context of what understanding I have acquired lead me to take action – such as make a decision or fundamentally change my behaviour – or do I simply have an extra grain of knowledge the consequence of which may not materialise until much later? Over the page is a simple demonstration to further illustrate this process of understanding. In this example I play the role of a viewer working with a sample isolated chart (Figure 1.4). As you will learn throughout
  • 84. the design chapters, a chart would not normally just exist floating in isolation like this one does, but it will serve a purpose for this demonstration. Figure 1.4 shows a clustered bar chart that presents a breakdown of the career statistics for the footballer Lionel Messi during his career with FC Barcelona. The process commences with perceiving the chart. I begin by establishing what chart type is being used. I am familiar with this clustered bar chart approach and so I quickly feel at ease with the prospect of reading its display: there is no learning for me to have to go through on this occasion, which is not always the case as we will see.
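The perceiving questions posed earlier (where the largest and smallest values are, how values rank, what proportion of the total a value holds) amount to simple calculations that the viewer's visual system estimates from a chart. The sketch below spells them out for a clustered bar chart with two measures. The season labels and every number are hypothetical placeholders, not real career statistics, and the function names are invented for illustration.

```python
# Hypothetical placeholder values for two measures in a clustered bar chart.
# These are NOT real statistics; they only give the functions something to read.
games = {"season 1": 40, "season 2": 50, "season 3": 60}
goals = {"season 1": 16, "season 2": 45, "season 3": 69}


def largest(series):
    """Label of the tallest bar in a series."""
    return max(series, key=series.get)


def ranking(series):
    """Labels ordered from tallest to shortest bar."""
    return sorted(series, key=series.get, reverse=True)


def relative_height(a, b):
    """How many times taller bar a is than bar b."""
    return a / b


def share_of_total(series, label):
    """Proportion of the series total held by one bar."""
    return series[label] / sum(series.values())


print(largest(goals))
print(ranking(games))
print(relative_height(games["season 1"], goals["season 1"]))
print(share_of_total(games, "season 2"))
```

A viewer reading a chart arrives at rough versions of these answers by eye; axis values and gridlines help bring the estimates closer to the computed ones.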
  • 85. I can quickly assimilate what the axes are showing by examining the labels along the x- and y-axes and by taking the assistance provided by the colour legend at the top. I move on to scanning, detecting and observing the general physical properties of the data being represented. The eyes and brain are working in harmony, conducting this activity quite instinctively without awareness or delay, noting the most prominent features of variation in the attributes of size, shape, colour and position. Figure 1.4 Demonstrating the Process of Understanding I look across the entire chart, identifying the big, small and medium values (these are known as stepped magnitude judgements), and form
  • 86. an overall sense of the general value rankings (global comparison judgements). I am instinctively drawn to the dominant bars towards the middle/right of the chart, especially as I know this side of the chart concerns the most recent career performances. I can determine that the purple bar – showing goals – has been rising pretty much year-on-year towards a peak in 2011/12 and then there is a dip before recovery in his most recent season. My visual system is now working hard to decode these properties into estimations of quantities (amounts of things) and relationships (how different things compare with each other). I focus on judging the absolute magnitudes of individual bars (one bar at a time). The assistance offered by the chart apparatus, such as the vertical axis (or y-axis) values and the inclusion of gridlines, is helping me more quickly estimate the quantities
  • 87. with greater assurance of accuracy, such as discovering that the highest number of goals scored was around 73. I then look to conduct some relative higher/lower comparisons. In comparing the games and goals pairings I can see that three out of the last four years have seen the purple bar higher than the blue bar, in contrast to all the rest. Finally I look to establish proportional relationships between neighbouring bars, i.e. by how much larger one is compared with the next. In 2006/07 I can see the blue bar is more than twice as tall as the purple one, whereas in 2011/12 the purple bar is about 15% taller. By reading this chart I now have a good appreciation of the quantities displayed and some sense of the relationship between the two measures, games and goals. The second part of the understanding process is interpreting. In
  • 88. reality, it is not so consciously consecutive or delayed in relationship to the perceiving stage but you cannot get here without having already done the perceiving. Interpreting, as you will recall, is about converting perceived ‘reading’ into meaning. Interpreting is essentially about orientating your assessment of what you’ve read against what you know about the subject. As I mentioned earlier, often a data visualiser will choose to – or have the opportunity to – share such insights via captions, chart overlays or summary headlines. As you will learn in Chapter 3, the visualisations that present this type of interpretation assistance are commonly described as offering an ‘explanatory’ experience. In this particular demonstration it is
  • 89. an example of an ‘exhibitory’ experience, characterised by the absence of any explanatory features. It relies on the viewer to handle the demands of interpretation without any assistance. As you will read about later, many factors influence how well different viewers will be able to interpret a visualisation. Some of the most critical include the level of interest shown towards the subject matter, its relevance and the general inclination, in that moment, of a viewer to want to read about that subject through a visualisation. It is also influenced by the knowledge held about a subject or the capacity to derive meaning from a subject even if a knowledge gap exists. Returning to the sample chart, in order to translate the quantities and relationships I extracted from the perceiving stage into
  • 90. meaning, I am effectively converting the reading of value sizes into notions of good or bad and comparative relationships into worse than or better than etc. To interpret the meaning of this data about Lionel Messi I can tap into my passion for and knowledge of football. I know that for a player to score over 25 goals in a season is very good. To score over 35 is exceptional. To score over 70 goals is frankly preposterous, especially at the highest level of the game (you might find plenty of players achieving these statistics playing for the Dog and Duck pub team, but these numbers have been achieved for Barcelona in La Liga, the Champions League and other domestic cup competitions). I know from watching the sport, and poring over statistics like this for 30 years, that it is very rare for a player to score remotely close to a ratio of one goal per game played. Those purple bars
  • 91. that exceed the height of the blue bars are therefore remarkable. Beyond the information presented in the chart I bring knowledge about the periods when different managers were in charge of Barcelona, how they played the game, and how some organised their teams entirely around Messi’s talents. I know which other players were teammates across different seasons and who might have assisted or hindered his achievements. I also know his age and can mentally compare his achievements with the traditional football career arcs that will normally show a steady rise, peak, plateau, and then decline. Therefore, in this example, I am not just interested in the subject but can bring a lot of knowledge to aid me in interpreting this analysis. That helps me understand a lot more about what this data means. Other people might be passingly interested in football and know how to
  • 92. read what is being presented, but they might not possess the domain knowledge to go deeper into the interpretation. They also just might not care. Now imagine this was analysis of, let’s say, an NHL ice hockey player (Figure 1.5) – that would present an entirely different challenge for me. In this chart the numbers are irrelevant: it simply reuses the same chart as before with different labels. Assuming this was real analysis, as a sports fan in general I would have the capacity to understand the notion of a sportsperson’s career statistics in terms of games played and goals scored: I can read the chart (perceiving) that shows me this data and catch the gist of the angle of analysis it is portraying. However, I do not have sufficient
  • 93. domain knowledge of ice hockey to determine the real meaning and significance of the big–small, higher–lower value relationships. I cannot confidently convert ‘small’ into ‘unusual’ or ‘greater than’ into ‘remarkable’. My capacity to interpret is therefore limited, and besides I have no connection to the subject matter, so I am insufficiently interested to put in the effort to spend much time with any in-depth attempts at interpretation. Figure 1.5 Demonstrating the Process of Understanding Imagine this is now no longer analysis about sport but about the sightings in the wild of Winglets and Spungles (completely made up words). Once again I can still read the chart shown in Figure 1.6 but now I have
  • 94. absolutely no connection to the subject whatsoever. No knowledge and no interest. I have no idea what these things are, no understanding about the sense of scale that should be expected for these sightings, I don’t know what is good or bad. And I genuinely don’t care either. In contrast, for those who do have a knowledge of and interest in the subject, the meaning of this data will be much more relevant. They will be able to read the chart and make some sense of the meaning of the quantities and relationships displayed. To help with perceiving, viewers need the context of scale. To help with interpreting, viewers need the context of subject, whether that is provided by the visualiser or the viewer themself. The challenge for you and me as data visualisers is to determine what our audience will know already and
  • 95. what they will need to know in order to possibly assist them in interpreting the meaning. The use of explanatory captions, perhaps positioned in that big white space top left, could assist those lacking the knowledge of the subject, possibly offering a short narrative to make the interpretations – the meaning – clearer and immediately accessible. We are not quite finished: there is one stage left. The third part of the understanding process is comprehending. This is where I attempt to form some concluding reasoning that translates into what this analysis means for me. What can I infer from the display of data I have read? How do I relate and respond to the insights I have drawn out through interpretation? Does what I’ve learnt make a difference to me? Do I know something more than I did before? Do I need to act or decide on anything? How does it make me feel emotionally?
  • 96. Figure 1.6 Demonstrating the Process of Understanding Through consuming the Messi chart, I have been able to form an even greater appreciation of his amazing career. It has surprised me just how prolific he has been, especially having seen his ratio of goals to games, and I am particularly intrigued to see whether the dip in 2013/14 was a temporary blip or whether the bounce back in 2014/15 was the blip. And as he reaches his late 20s, will injuries start to creep in as they seem to do for many other similarly prodigious young talents, especially as he has been playing relentlessly at the highest level since his late teens? My comprehension is not a dramatic discovery. There is no
  • 97. sudden inclination to act nor any need – based on what I have learnt. I just feel a heightened impression, formed through the data, about just how good and prolific Lionel Messi has been. For Barcelona fanatics who watch him play every week, they will likely have already formed this understanding. This kind of experience would only have reaffirmed what they already probably knew. And that is important to recognise when it comes to managing expectations about what we hope to achieve amongst our viewers in terms of their final comprehending. One person’s ‘I knew that already’ is another person’s ‘wow’. For every ‘wow, I need to make some changes’ type of reflection there might be another ‘doesn’t affect me’. A compelling visualisation about climate change presented to Sylvie might affect her
  • 98. significantly about the changes she might need to make in her lifestyle choices that might reduce her carbon footprint. For Robert, who is already familiar with the significance of this situation, it might have substantially less immediate impact – not indifference to the meaning of the data, just nothing new, a shrug of the shoulders. For James, the hardened sceptic, even the most indisputable evidence may have no effect; he might just not be receptive to altering his views regardless. What these scenarios try to explain is that, from your perspective as the visualiser, this final stage of understanding is something you will have relatively little control over because viewers are people and people are complex. People are different and as such they introduce
  • 99. inconsistencies. You can lead a horse to water but you cannot make it drink: you cannot force a viewer to be interested in your work, to understand the meaning of a subject or get that person to react exactly how you would wish. Visualising data is just an agent of communication and not a guarantor for what a viewer does with the opportunity for understanding that is presented. There are different flavours of comprehension, different consequences of understanding formed through this final stage. Many visualisations will be created with the ambition to simply inform, like the Messi graphic achieved for me, perhaps to add just an extra grain to the pile of knowledge a viewer has about a subject. Not every visualisation results in a Hollywood moment of grand discoveries, surprising insights or life-saving decisions. But that is OK, so long as the outcome
  • 100. fits with the intended purpose, something we will discuss in more depth in Chapter 3. Furthermore, there is the complexity of human behaviour in how people make decisions in life. You might create the most compelling visualisation, demonstrating proven effective design choices, carefully constructed with a very specific audience type and need in mind. This might clearly show how a certain decision really needs to be taken by those in the audience. However, you cannot guarantee that the decision maker in question, while possibly recognising that there is a need to act, will be in a position to act, and indeed will know how to act. It is at this point that one must recognise the ambitions and – more importantly – realise the limits of what data visualisation can achieve. Going back again, finally, to the components of the definition, all the
  • 101. reasons outlined above show why the term to facilitate is the most a visualiser can reasonably aspire to achieve. It might feel like a rather tepid and unambitious aim, something of a cop-out that avoids scrutiny over the outcomes of our work: why not aim to ‘deliver’, ‘accomplish’, or do something more earnest than just ‘facilitate’? I deliberately use ‘facilitate’ because as we have seen we can only control so much. Design cannot change the world, it can only make it run a little smoother. Visualisers can control the output but not the outcome: at best we can expect to have only some influence on it. 1.2 The Importance of Conviction The key structure running through this book is a data visualisation design
  • 102. process. By following this process you will be able to decrease the size of the challenge involved in making good decisions about your design solution. The sequencing of the stages presented will help reduce the myriad options you have to consider, which makes the prospect of arriving at the best possible solution much more likely to occur. Often, the design choices you need to make will be clear cut. As you will learn, the preparatory nature of the first three stages goes a long way to securing that clarity later in the design stage. On other occasions, plain old common sense is a more than sufficient guide. However, for more nuanced situations, where there are several potentially viable options presenting themselves, you need to rely on the guiding value of good design principles. ‘I say begin by learning about data visualisation’s “black and
  • 103. whites”, the rules, then start looking for the greys. It really then becomes quite a personal journey of developing your conviction.’ Jorge Camoes, Data Visualization Consultant For many people setting out on their journey in data visualisation, the major influences that shape their early beliefs about data visualisation design tend to come from the first authors they come across. Names like Edward Tufte, unquestionably one of the most important figures in this field whose ideas are still pervasive, represent a common entry point into the field, as do people like Stephen Few, David McCandless, Alberto Cairo, and Tamara Munzner, to name but a few. These are authors of prominent works that typically represent the first books purchased and
  • 104. read by many beginners. Where you go from there – from whom you draw your most valuable enduring guidance – will be shaped by many different factors: taste, the industry you are working in, the topics on which you work, the types of audiences you produce for. I still value much of what Tufte extols, for example, but find I can now more confidently filter out some of his ideals that veer towards impractical ideology or that do not necessarily hold up against contemporary technology and the maturing expectations of people. ‘My key guiding principle? Know the rules, before you break them.’ Gregor Aisch, Graphics Editor, The New York Times The key guidance that now most helpfully shapes and supports
  • 105. my convictions comes from ideas outside the boundaries of visualisation design in the shape of the work of Dieter Rams. Rams was a German industrial and product designer who was most famously associated with the Braun company. In the late 1970s or early 1980s, Rams was becoming concerned about the state and direction of design thinking and, given his prominent role in the industry, felt a responsibility to challenge himself, his own work and his own thinking against a simple question: ‘Is my design good design?’. By dissecting his response to this question he conceived 10 principles that expressed the most important characteristics of what he considered to be good design. They read as follows: 1. Good design is innovative. 2. Good design makes a product useful.
  • 106. 3. Good design is aesthetic. 4. Good design makes a product understandable. 5. Good design is unobtrusive. 6. Good design is honest. 7. Good design is long lasting. 8. Good design is thorough down to the last detail. 9. Good design is environmentally friendly. 10. Good design is as little design as possible. Inspired by the essence of these principles, and considering their applicability to data visualisation design, I have translated them into three high-level principles that similarly help me to answer my own question: ‘Is my visualisation design good visualisation design?’ These principles offer me a guiding voice when I need to resolve some of the more seemingly
  • 107. intangible decisions I am faced with (Figure 1.7). Figure 1.7 The Three Principles of Good Visualisation Design In the book Will it Make the Boat Go Faster?, co-author Ben Hunt-Davis provides details of the strategies employed by him and his team that led to their achieving gold medal success in the Men’s Rowing Eight event at the Sydney Olympics in 2000. As the title suggests, each decision taken had to pass the ‘will it make the boat go faster?’ test. Going back to the goal of data visualisation as defined earlier, these design principles help me judge whether any decision I make will better aid the facilitation of understanding: the equivalence of ‘making the boat go faster’. I will describe in detail the thinking behind each of these principles and explain how Rams’ principles map onto them. Before that, let me briefly explain why there are three principles of Rams’ original ten that do not
  • 108. entirely fit, in my view, as universal principles for data visualisation. ‘I’m always the fool looking at the sky who falls off the cliff. In other words, I tend to seize on ideas because I’m excited about them without thinking through the consequences of the amount of work they will entail. I find tight deadlines energizing. Answering the question of “what is the graphic trying to do?” is always helpful. At minimum the work I create needs to speak to this. Innovation doesn’t have to be a wholesale out-of-the box approach. Iterating on a previous idea, moving it forward, is innovation.’ Sarah Slobin, Visual Journalist Good design is innovative: Data visualisation does not need always to be innovative. For the majority of occasions the solutions being created call upon the tried and tested approaches that have been used
  • 109. for generations. Visualisers are not conceiving new forms of representation or implementing new design techniques in every project. Of course, there are times when innovation is required to overcome a particular challenge; innovation generally materialises when faced with problems that current solutions fail to overcome. Your own desire for innovation may be aligned to personal goals about the development of your skills or through reflecting on previous projects and recognising a desire to rethink a solution. It is not that data visualisation is never about innovation, just that it is not always and only about innovation. Good design is long lasting: The translation of this principle to the context of data visualisation can be taken in different ways.
  • 110. ‘Long lasting’ could be related to the desire to preserve the ongoing functionality of a digital project, for example. It is quite demoralising how many historic links you visit online only to find a project has now expired through a lack of sustained support or is no longer functionally supported on modern browsers. Another way to interpret ‘long lasting’ is in the durability of the technique. Bar charts, for example, are the old reliables of the field – always useful, always being used, always there when you need them (author wipes away a respectful tear). ‘Long lasting’ can also relate to avoiding the temptation of fashion or current gimmickry and having a timeless approach to design. Consider the recent design trend moving away from skeuomorphism and the emergence of so-called flat design. By the time this book is published there will likely be a new movement. ‘Long lasting’ could apply to the subject matter. Expiry in the relevance of certain angles of analysis or out-of-date data is
  • 111. inevitable in most of our work, particularly with subjects that concern current matters. Analysis about the loss of life during the Second World War is timeless because nothing is now going to change the nature or extent of the underlying data (unless new discoveries emerge). Analysis of the highest grossing movies today will change as soon as new big movies are released and time elapses. So, once again, this idea of long lasting is very context specific, rather than being a universal goal for data visualisation. Good design is environmentally friendly: This is, of course, a noble aim but the relevance of this principle has to be positioned again at the contextual level, based on the specific circumstances of a given project. If your work is to be printed, the ink and paper usage immediately removes the notion that it is an environmentally friendly activity. Developing a powerful interactive that is being hammered
  • 112. constantly and concurrently by hundreds of thousands of users puts an extra burden on the hosting server, creating more demands on energy supply. The specific judgements about issues relating to the impact of a project on the environment realistically reside with the protagonists and stakeholders involved. A point of clarity is that, while I describe them as design principles, they actually provide guidance long before you reach the design thinking at the final stage of this workflow. Design choices encapsulate the critical thinking undertaken throughout. Think of it like an iceberg: the design is the visible consequence of lots of hidden preparatory thinking formed
  • 113. through earlier stages. Finally, a comment is in order about something often raised in discussions about the principles for this subject: that is, the idea that visualisations need to be memorable. This is, in my view, not relevant as a universal principle. If something is memorable, wonderful, that will be a terrific by-product of your design thinking, but in itself the goal of achieving memorability has to be isolated, again, to a contextual level based on the specific goals of a given task and the capacity of the viewer. A politician or a broadcaster might need to recall information more readily in their work than a group of executives in a strategy meeting with permanent access to endless information at the touch of a button via their iPads. Principle 1: Good Data Visualisation is Trustworthy
  • 114. The notion of trust is uppermost in your thoughts in this first of the three principles of good visualisation design. This maps directly onto one of Dieter Rams’ general principles of good design, namely that good design is honest. Trust vs Truth This principle is presented first because it is about the fundamental integrity, accuracy and legitimacy of any data visualisation you produce. This should always exist as your primary concern above all else. There should be no compromise here. Without securing trust the entire purpose of doing the work is undermined. There is an important distinction to make between trust and
  • 115. truth. Truth is an obligation. You should never create work you know to be misleading in content, nor should you claim something presents the truth if it evidently cannot be supported by what you are presenting. For most people, the difference between a truth and an untruth should be beyond dispute. For those unable or unwilling to be truthful, or who are ignorant of how to differentiate, it is probably worth putting this book away now: my telling you how this is a bad thing is not likely to change your perspective. If the imperative for being truthful is clear, the potential for there being multiple different but legitimate versions of ‘truth’ within the same data-driven context muddies things. In data visualisation there is rarely a singular view of the truth. The glass that is half full is also half empty. Both views are truthful, but which to choose? Furthermore,
  • 116. there are many decisions involved in your work whereby several valid options may present themselves. In these cases you are faced with choices without necessarily having the benefit of theoretical influence to draw out the right option. You decide what is right. This creates inevitable biases – no matter how seemingly tiny – that ripple through your work. Your eventual solution is potentially comprised of many well-informed, well-intended and legitimate choices – no doubt – but they will reflect a subjective perspective all the same. All projects represent the outcome of an entirely unique pathway of thought. You can mitigate the impact of these subjective choices you make, for example, by minimising the number of assumptions applied to the data you are working with or by judiciously consulting your audience to best ensure
  • 117. their requirements are met. However, pure objectivity is not possible in visualisation. ‘Every number we publish is wrong but it is the best number there is.’ Andrew Dilnot, Chair of the UK Statistics Authority Rather than view the unavoidability of these biases as an obstruction, the focus should instead be on ensuring your chosen path is trustworthy. In the absence of an objective truth, you need to be able to demonstrate that your truth is trustable. Trust has to be earned but this is hard to secure and very easy to lose. As the translation of a Dutch proverb states, ‘trust arrives on foot and leaves
  • 118. on horseback’. Trust is something you can build by eliminating any sense that your version of the truth can be legitimately disputed. Yet, visualisers only have so much control and influence in the securing of trust. A visualisation can be truthful but not viewed as trustworthy. You may have done something with the best of intent behind your decision making, but it may ultimately fail to secure trust among your viewers for different reasons. Conversely a visualisation can be trustworthy in the mind of the viewer but not truthful, appearing to merit trust yet utterly flawed in its underlying truth. Neither of these is satisfactory: the latter scenario is a choice we control, the former is a consequence we must strive to overcome. ‘Good design is honest. It does not make a product appear more innovative, powerful or valuable than it really is. It does not attempt to
  • 119. manipulate the consumer with promises that cannot be kept.’ Dieter Rams, celebrated Industrial Designer Let’s consider a couple of examples to illustrate this notion of trustworthiness. Firstly, think about the trust you might attach respectively to the graphics presented in Figure 1.8 and Figure 1.9. For the benefit of clarity both are extracted from articles discussing issues about home ownership, so each would be accompanied with additional written analysis at their published location. Both charts are portraying the same data and the same analysis; they even arrive at the same summary finding. How do the design choices make you feel about the integrity of each work? Figure 1.8 Housing and Home Ownership in the UK (ONS)
  • 120. Both portrayals are truthful but in my view the first visualisation, produced by the UK Office for National Statistics (ONS), commands greater credibility and therefore far more trust than the second visualisation, produced by the Daily Mail. The primary reason for this begins with the colour choices. They are relatively low key in the ONS graphic: colourful but subdued, yet conveying a certain assurance. In contrast, the Daily Mail’s colour palette feels needy, like it is craving my attention with sweetly coloured sticks. I don’t care for the house key imagery in the background but it is relatively harmless. Additionally, the typeface, font size and text colour feel more gimmicky in the second graphic. Once again, it feels like it is wanting to shout at me in contrast to the more polite nature of the ONS text. Whereas the Daily Mail piece refers to the ONS as
  • 121. the source of the data, it fails to include further details about the data source, which is included on the ONS graphic alongside other important explanatory features such as the subtitle, clarity about the yearly periods and the option to access and download the associated data. The ONS graphic effectively ‘shows all its workings’ and overall earns, from me at least, significantly more trust. Figure 1.9 Falling Number of Young Homeowners (Daily Mail) Another example about the fragility of trust concerns the next graphic, which plots the number of murders committed using firearms in Florida over a period of time. This frames the time around the enactment of the ‘Stand your ground’ law in Florida. The area chart in Figure
  • 122. 1.10 shows the number of murders over time and, as you can see, the chart uses an inverted vertical y-axis with the red area going lower down as the number of deaths increases, with peak values at about 1990 and 2007. However, some commentators felt the inversion of the y-axis was deceptive and declared the graphic not trustworthy based on the fact they were perceiving the values as represented by an apparent rising ‘white mountain’. They mistakenly observed peak values around 1999 and 2005 based on them seeing these as the highest points. This confusion is caused by an effect known as figure-ground perception whereby a background form (white area) can become inadvertently recognised as the foreground form, and vice versa (with the red area seen as the background). Figure 1.10 Gun Deaths in Florida
  • 123. Figure 1.11 Iraq’s Bloody Toll The key point here is that there was no intention to mislead. Although the approach to inverting the y-axis may not be entirely conventional, it was technically legitimate. Creatively speaking, the effect of dribbling blood was an understandably tempting metaphor to pursue. Indeed, the graphic attempts to emulate a notable infographic from several years ago showing the death toll during the Iraq conflict (Figure 1.11). In the case of the
  • 124. Florida graphic, on reflection maybe the data was just too ‘smooth’ to convey the same dribbling effect achieved in the Iraq piece. However, being inspired and influenced by successful techniques demonstrated by others is to be encouraged. It is one way of developing our skills. Figure 1.12 Reworking of ‘Gun Deaths in Florida’ Unfortunately, given the emotive nature of the subject matter – gun deaths – this analysis would always attract a passionate reaction regardless of its form. In this case the lack of trust expressed by some was an unintended consequence of a single, innocent design: by reverting the y-
  • 125. axis to an upward direction, as shown in the reworked version in Figure 1.12, you can see how a single subjective design choice can have a huge influence on people’s perception. The creator of the Florida chart will have made hundreds of perfectly sound visualisations and will make hundreds more, and none of them will ever carry the intent of being anything other than truthful. However, you can see how vulnerable perceived trust is when disputes about motives can so quickly surface as a result of the design choice made. This is especially the case within the pressured environment of a newsroom where you have only a single opportunity to publish a work to a huge and widespread audience. Contrast this setting with a graphic published within an organisation that can be withdrawn and reissued far more easily.
  • 126. Trust Applies Throughout the Process Trustworthiness is a pursuit that should guide all your decisions, not just the design ones. As you will see in the next chapter, the visualisation design workflow involves a process with many decision junctions – many paths down which you could pursue different legitimate options. Obviously, design is the most visible result of your decision making, but you need to create and demonstrate complete integrity in the choices made across the entire workflow process. Here is an overview of some of the key matters where trust must be at the forefront of your concern. ‘My main goal is to represent information accurately and in proper context. This spans from data reporting and number crunching to designing human-centered, intuitive and clear visualizations. This is my sole approach, although it is always evolving.’ Kennedy Elliott, Graphics Editor, The Washington Post
  • 127. Formulating your brief: As mentioned in the discussion about the ‘Gun Deaths in Florida’ graphic, if you are working with potentially emotive subject matter, this will heighten the importance of demonstrating trust. Rightly or wrongly, your topic will be more exposed to the baggage of prejudicial opinion and trust will be precarious. As you will learn in Chapter 3, part of the thinking involved in ‘formulating your brief’ concerns defining your audience, considering your subject and establishing your early thoughts about the purpose of your work, and what you are hoping to achieve. There will be certain contexts that lend themselves to exploiting the emotive qualities of your subject and/or data but many others that will not. Misjudge these contextual factors, especially the nature of your
  • 128. audience’s needs, and you will jeopardise the trustworthiness of your solution. As I have shown, matters of trust are often outside of your immediate influence: cynicism, prejudice or suspicion held by viewers through their beliefs or opinions is a hard thing to combat or accommodate. In general, people feel comfortable with visualisations that communicate data in a way that fits with their world view. That said, at times, many are open to having their beliefs challenged by data and evidence presented through a visualisation. The platform and location in which your work is published (e.g. website or source location) will also influence trust. Visualisations encountered in already-distrusted media will create obstacles that are hard to overcome. Working with data: As soon as you begin working with data you have a great responsibility to be faithful to this raw material. To be transparent to your audience you need to consider sharing as much relevant information as possible about how you have handled the data that
  • 129. is being presented to them: How was it collected: from where and using what criteria? What calculations or modifications have you applied to it? Explain your approach. Have you made any significant assumptions or observed any special counting rules that may not be common? Have you removed or excluded any data? How representative is it? What biases may exist that could distort interpretations? Editorial thinking: Even with the purest of intent, your role as the curator of your data and the creator of its portrayal introduces subjectivity. When you choose to do one thing you are often choosing to not do something else. The choice to focus on analysis that shows how values have changed over time is also a decision to not show the same data from other viewpoints such as, for example, how it looks on a map. A decision to impose criteria on your analysis, like setting
  • 130. date parameters or minimum value thresholds, in order to reduce clutter, might be sensible and indeed legitimate, but is still a subjective choice. ‘Data and data sets are not objective; they are creations of human design. Hidden biases in both the collection and analysis stages present considerable risks [in terms of inference].’ Kate Crawford, Principal Researcher at Microsoft Research NYC Data representation: A fundamental tenet of data visualisation is to never deceive the receiver. Avoiding possible misunderstandings, inaccuracies, confusions and distortions is of primary concern. There are many possible features of visualisation design that can lead to varying degrees of deception, whether intended or not. Here are
  • 131. a few to list now, but note that these will be picked up in more detail later: The size of geometric areas can sometimes be miscalculated resulting in the quantitative values being disproportionately perceived. When data is represented in 3D, on the majority of occasions this represents nothing more than distracting – and distorting – decoration. 3D should only be used when there are legitimately three dimensions of data variables being displayed and the viewer is able to change his or her point of view to navigate to see different 2D perspectives. The bar chart value axis should never be ‘truncated’ – the origin value should always be zero – otherwise this approach will distort the bar size judgements. The aspect ratio (height vs width) of a line chart’s display is influential as it affects the perceived steepness of connecting lines which are key to reading the trends over time – too narrow and the steepness will be embellished; too wide and the steepness is dampened. When portraying spatial analysis through a thematic map representation, there are many different mapping projections to choose from as the underlying apparatus for presenting and
  • 132. orienting the geographical position of the data. There are many different approaches to flatten the spherical globe, translating it into a two-dimensional map form. The mathematical treatment applied can alter significantly the perceived size or shape of regions, potentially distorting their perception. Sometimes charts are used in a way that is effectively corrupt, like using pie charts for percentages that add up to more, or less, than 100%. Data presentation: The main rule here is: if it looks significant, it should be, otherwise you are either misleading or creating unnecessary obstacles for your viewer. The undermining of trust can also be caused by what you decline to explain: restricted or non-functioning features of interactivity. Absent annotations such as introduction/guides, axis titles and labels, footnotes, data sources that fail to inform the reader of what is going on.
  • 133. Inconsistent or inappropriate colour usage, without explanation. Confusing or inaccessible layouts. Thoroughness in delivering trust extends to the faith you create through reliability and consistency in the functional experience, especially for interactive projects. Does the solution work and, specifically, does it work in the way it promises to do? Principle 2: Good Data Visualisation is Accessible This second of the three principles of good visualisation design helps to inform judgments about how best to facilitate your viewers through the process of understanding. It is informed by three of Dieter Rams’ general principles of good design: 2 Good design makes a product useful. 4 Good design makes a product understandable. 5 Good design is unobtrusive. Reward vs Effort The opening section of this chapter broke down the stages a viewer goes through when forming their understanding about, and from, a
  • 134. visualisation. This process involved a sequence of perceiving, interpreting and then comprehending. It was emphasised that a visualiser’s control over the viewer’s pursuit of understanding diminishes after each stage. The objective, as stated by the presented definition, of ‘facilitating’ understanding reflects the reality of what can be controlled. You can’t force viewers to understand, but you can smooth the way. To facilitate understanding for an audience is about delivering accessibility. That is the essence of this principle: to remove design-related obstacles faced by your viewers when undertaking this process of understanding. Stated another way, a viewer should experience minimum friction between the act of understanding (effort) and the
  • 135. achieving of understanding (reward). This ‘minimising’ of friction has to be framed by context, though. This is key. There are many contextual influences that will determine whether what is judged inaccessible in one situation could be seen as entirely accessible in another. When people are involved, diverse needs exist. As I have already discussed, varying degrees of knowledge emerge and irrational characteristics come to the surface. You can only do so much: do not expect to get all things right in the eyes of every viewer. ‘We should pay as much attention to understanding the project’s goal in relation to its audience. This involves understanding principles of perception and cognition in addition to other relevant factors, such as culture and education levels, for example. More importantly, it means
  • 136. carefully matching the tasks in the representation to our audience’s needs, expectations, expertise, etc. Visualizations are human-centred projects, in that they are not universal and will not be effective for all humans uniformly. As producers of visualizations, whether devised for data exploration or communication of information, we need to take into careful consideration those on the other side of the equation, and who will face the challenges of decoding our representations.’ Isabel Meirelles, Professor, OCAD University (Toronto) That is not to say that attempts to accommodate the needs of your audience should just be abandoned; quite the opposite. This is hard but it is essential. Visualisation is about human-centred design, demonstrating empathy for your audiences and putting them at the heart of your decision making.
  • 137. There are several dimensions of definition that will help you better understand your audiences, including establishing what they know, what they do not know, the circumstances surrounding their consumption of your work and their personal characteristics. Some of these you can accommodate, others you may not be able to, depending on the diversity and practicality of the requirements. Again, in the absence of perfection, optimisation is the name of the game, even if this means that sometimes the least worst is best. The Factors Your Audiences Influence Many of the factors presented here will occur when you think about your project context, as covered in Chapter 3. For now, it is helpful
  • 138. to introduce some of the factors that specifically relate to this discussion about delivering accessible design. Subject-matter appeal: This was already made clear in the earlier illustration, but is worth logging again here: the appeal of the subject matter is a fundamental junction right at the beginning of the consumption experience. If your audiences are not interested in the subject – i.e. they are indifferent towards the topic or see no need or relevance to engage with it there and then – then they will not likely stick around. They will probably not be interested in putting in the effort to work through the process of understanding for something that might be ultimately irrelevant. For those to whom the subject matter is immediately appealing, they are significantly more likely to engage with the data visualisation right the way through.
  • 139. ‘Data visualization is like family photos. If you don’t know the people in the picture, the beauty of the composition won’t keep your attention.’ Zach Gemignani, CEO/Founder of Juice Analytics Many of the ideas for this principle emerged from the Seeing Data visualisation literacy research project (seeingdata.org) on which I collaborated. Dynamic of need: Do they need to engage with this work or is it entirely voluntary? Do they have a direct investment in having access to this information, perhaps as part of their job and they need this information to serve their duties? Subject-matter knowledge: What might your audiences know and not know about this subject? What is their capacity to learn or potential motivation to develop their knowledge of this subject? A critical component of this issue, blending existing knowledge
  • 140. with the capacity to acquire knowledge, concerns the distinctions between complicated, complex, simple and simplified. This might seem to be more about the semantics of language but is of significant influence in data visualisation – indeed in any form of communication: Complicated is generally a technical distinction. A subject might be difficult to understand because it involves pre-existing – and probably high-level – knowledge and might be intricate in its detail. The mathematics that underpinned the Moon landings are complicated. Complicated subjects are, of course, surmountable – the knowledge and skill are acquirable – but only achieved through time and effort, hard work and learning (or extraordinary talent), and, usually, with external assistance. Complex is associated with problems that have no perfect conclusion or maybe even no end state. Parenting is complex; there is no rulebook for how to do it well, no definitive right or
  • 141. wrong, no perfect way of accomplishing it. The elements of parenting might not be necessarily complicated – cutting Emmie’s sandwiches into star shapes – but there are lots of different interrelated pressures always influencing and occasionally colliding. Simple, for the purpose of this book, concerns a matter that is inherently easy to understand. It may be so small in dimension and scope that it is not difficult to grasp, irrespective of prior knowledge and experience. Simplified involves transforming a problem context from either a complex or complicated initial state to a reduced form, possibly by eliminating certain details or nuances. Understanding the differences in these terms is vital. When considering your subject matter and the nature of your analysis you will need to assess whether your audience will be immediately able to understand what you are presenting or have the capacity to learn how to understand it. If it is a subject that is inherently complex or complicated, will it need to be simplified? If you are creating a graphic about taxation, will
  • 142. you need to strip it down to the basics or will this process of simplification risk the subject being oversimplified? The final content may be obscured by the absence of important subtleties. Indeed, the audience may have felt sufficiently sophisticated to have had the capacity to work out and work with a complicated topic, but you denied them that opportunity. You might reasonably dilute/reduce a complex subject for kids, but generally my advice is don’t underestimate the capacity of your audience. Accordingly, clarity trumps simplicity as the most salient concern about data visualisation design. ‘Strive for clarity, not simplicity. It’s easy to “dumb something down,” but extremely difficult to provide clarity while maintaining
  • 143. complexity. I hate the word “simplify.” In many ways, as a researcher, it is the bane of my existence. I much prefer “explain,” “clarify,” or “synthesize.” If you take the complexity out of a topic, you degrade its existence and malign its importance. Words are not your enemy. Complex thoughts are not your enemy. Confusion is. Don’t confuse your audience. Don’t talk down to them, don’t mislead them, and certainly don’t lie to them.’ Amanda Hobbs, Researcher and Visual Content Editor What do they need to know? The million-dollar question. Often, the most common frustration expressed by viewers is that the visualisation ‘didn’t show them what they were most interested in’. They wanted to see how something changed over time, not how it looked on a map. If you were them what would you want to know? This is a hard thing to second-guess with any accuracy. We will
  • 144. be discussing it further in Chapter 5. Unfamiliar representation: In the final chapter of this book I will cover the issue of visualisation literacy, discussing the capabilities that go into being the most rounded creator of visualisation work and the techniques involved in being the most effective consumer also. Many people will perhaps be unaware of a deficit in their visualisation literacy with regard to consuming certain chart types. The bar, line and pie chart are very common and broadly familiar to all. As you will see in Chapter 6, there are many more ways of portraying data visually. This deficit in knowing how to read a new or unfamiliar chart type is not a failing on the part of the viewer, it is simply a result of their lack of prior exposure to these different methods. For visualisers a key challenge lies with situations when the deployment of an uncommon chart may be an entirely reasonable and
  • 145. appropriate choice – indeed perhaps even the ‘simplest’ chart that could have been used – but it is likely to be unfamiliar to the intended viewers. Even if you support it with plenty of ‘how to read’ guidance, if a viewer is overwhelmed or simply unwilling to make the effort to learn how to read a different chart type, you have little control in overcoming this. Time: At the point of consuming a visualisation is the viewer in a pressured situation with a lot at stake? Are viewers likely to be impatient and intolerant of the need to spend time learning how to read a display? Do they need quick insights or is there some capacity for them to take on exploring or reading in more depth? If it is the
  • 146. former, the immediacy of the presented information will therefore be a paramount requirement. If they have more time to work through the process of perceiving, interpreting and comprehending, this could be a more conducive situation to presenting complicated or complex subject matter – maybe even using different, unfamiliar chart types. Format: What format will your viewers need to consume your work? Are they going to need work created for a print output or a digital one? Does this need to be compatible with a small display as on a smartphone or a tablet? If what you create is consumed away from its intended native format, such as viewing a large infographic with small text on a mobile phone, that will likely result in a frustrating experience for the viewer. However, how and where your work is consumed may be beyond your control. You can’t mitigate for every
  • 147. eventuality. Personal tastes: Individual preferences towards certain colours, visual elements and interaction features will often influence (enabling or inhibiting) a viewer’s engagement. The semiotic conventions that visualisers draw upon play a part in determining whether viewers are willing to spend time and expend effort looking at a visualisation. Be aware though that accommodating the preferences of one person may not cascade, with similar appeal, to all, and might indeed create a rather negative reaction. Attitude and emotion: Sometimes we are tired, in a bad mood, feeling lazy, or having a day when we are just irrational. And the prospect of working on even the most intriguing and well-designed project sometimes feels too much. I spend my days looking at visualisations and can sympathise with the narrowing of mental bandwidth when I am tired or have had a bad day. Confidence is an extension of this. Sometimes our audiences may just not feel
  • 148. sufficiently equipped to embark on a visualisation if it is about an unknown subject or might involve pushing them outside their comfort zone in terms of the demands placed on their interpretation and comprehension. The Factors You Can Influence Flipping the coin, let’s look at the main ways we, as visualisers, can influence (positively or negatively) the accessibility of the designs created. In effect, this entire book is focused on minimising the likelihood that your solution demonstrates any of these negative attributes. Repeating the mantra from earlier, you must avoid doing anything that will cause the boat to go slower.
  • 149. ‘The key difference I think in producing data visualisation/infographics in the service of journalism versus other contexts (like art) is that there is always an underlying, ultimate goal: to be useful. Not just beautiful or efficient – although something can (and should!) be all of those things. But journalism presents a certain set of constraints. A journalist has to always ask the question: How can I make this more useful? How can what I am creating help someone, teach someone, show someone something new?’ Lena Groeger, Science Journalist, Designer and Developer at ProPublica As you saw listed at the start of this section, the selected, related design principles from Dieter Rams’ list collectively include the aim of ensuring our work is useful, unobtrusive and understandable. Thinking about what not to do – focusing on the likely causes of failure across these
  • 150. aims – is, in this case, more instructive. Your