The primary challenge one faces when writing a book about data visualisation is to determine what to leave in and what to leave out. Data visualisation is big. It is too big a subject to even attempt to cover it all, in detail, in one book. There is no single book to rule them all because there is no one book that can cover it all. Each and every one of the topics covered by the chapters in this book could, and indeed do, exist as whole books in their own right.
The secondary challenge is to decide how to weave all the content together. Data visualisation is not rocket science; it is not an especially complicated discipline: lots of it is rooted in common sense. It is, however, certainly a complex subject. There are lots of things to think about and decide on, as well as many things to do and make. Creative and analytical sensibilities blend with artistic and scientific judgments. In one moment you might be checking the statistical rigour of your calculations, in the next deciding which tone of orange most elegantly contrasts with an 80% black. The complexity of data visualisation manifests itself through how these different ingredients, and many more, interact, influence and intersect to form the whole.
I have arrived at what I believe to be an effective and proven pedagogy that successfully translates the complexities of this subject into accessible, practical and valuable form. I feel well qualified to bridge the gap between the large population of everyday practitioners, who might identify themselves as beginners, and the superstar technical, creative and academic minds that are constantly pushing forward our understanding of the potential of data visualisation. I am not going to claim to belong to that latter cohort, but I have certainly been the former – a beginner – and most of my working hours are spent helping other beginners start their journey. I know the things that I would have valued when I was starting out and I know how I would have wished them to be articulated and presented to me.
There is a large and growing library of fantastic books offering many different theoretical and practical viewpoints on the subject of data visualisation. My aim is to add value to this existing collection of work by taking on a particular perspective that is perhaps under-represented in other texts – exploring the notion and practice of a visualisation design process. It is my belief that the path to mastering data visualisation is achieved by making better decisions: effective choices, efficiently made. The aim of the book is therefore to help readers elegantly navigate through the process of what things to think about, when to think about them, what options exist and how to make the best choices.
** A note about the hardback pricing: many have asked why the hardback costs so much more than the paperback equivalent. This is because the hardback copies are intended for libraries, where they need to be more robust to protect the condition of the book over long-term use. Far fewer hardback copies are printed, which means that the individual unit cost is higher than would normally be found with the higher-volume print runs of the paperback version. **
Just as a single book cannot cover the whole of this subject, it stands that a single book cannot aim to address directly the needs of all people doing data visualisation. Here are some of the characteristics that shape the readers to whom this book is primarily targeted. This will help manage your expectations as a potential reader and establish its value proposition compared with other titles:
The core audiences for whom this book has been primarily written are undergraduate and postgraduate-level students and early career researchers from social science subjects. This reflects a growing number of people in higher education who are interested in and need to learn about data visualisation. Although aimed at social sciences, the content will also be relevant across the spectrum of academic disciplines, from the arts and humanities right through to the formal and natural sciences: any academic duty where there is an emphasis on the use of quantitative and qualitative methods in studies will require an appreciation of good data visualisation practices. Where statistical capabilities are relevant so too is data visualisation.
Beyond academia, data visualisation is a discipline that has reached mainstream consciousness with an increasing number of professionals and organisations, across all industry types and sizes, recognising the importance of doing it well for both internal and external benefit. You might be a market researcher, a librarian or a data analyst looking to enhance your visual communication capabilities. Perhaps you are a skilled graphic designer or web developer looking to take your portfolio of work into a more data-driven direction. Maybe you are in a managerial position and not directly involved in the creation of visualisation work, but you might be seeking generally to improve the sophistication of the language you use around commissioning visualisation work and to have a better way of expressing and evaluating work created for you. Anyone who is involved, in any capacity, with the analysis and visual communication of data as part of their professional duties will need to grasp the demands of data visualisation and this book will go some way to supporting these needs.
The pitch of the book's content is intended to serve the needs of beginners and those with intermediate capabilities. For most people, this is likely to be as far as they might ever need to go. It will offer an accessible route for novices to start their learning journey and, for those already familiar with the basics, there will be content that will hopefully contribute to fine-tuning their approaches. It is therefore not a book aimed necessarily at experienced or established visualisation practitioners. There may be some new perspectives to enrich their thinking, some content that will confirm and other content that might constructively challenge their convictions.
Technology is the key enabler for working with data and creating visualisation design outputs. Indeed, apart from a small proportion of artisan visualisation work that is drawn by hand, the reliance on technology to create visualisation work is an inseparable necessity. For many beginners there is an understandable appetite for step-by-step tutorials that help them immediately to implement data visualisation techniques via existing and new tools. However, it is important to be clear that this book will not offer teaching in the use of any tools.
Writing about data visualisation through the lens of selected tools is a bit of a minefield, given the diversity of technical options out there and the mixed range of skills, access and needs. I believe creating a practical, rather than necessarily a technical, text that focuses on the underlying craft of data visualisation with a tool-agnostic approach offers an effective way to begin learning about the subject in appropriate depth. The content should be appealing to readers irrespective of the extent of their technical knowledge (novice to advanced technicians) and specific tool experiences (e.g. knowledge of Excel, Tableau, Adobe Illustrator).
I love flicking through those glossy ‘coffee table’ books as much as the next person; such books offer great inspiration and demonstrate some of the finest work in the field. This book serves a very different purpose, however. I believe that, as a beginner or relative beginner on this learning journey, the inspiration you need comes more from understanding what is behind the thinking that makes these amazing works succeed and others not. There is every hope that it will be seen as an elegantly presented and packaged book but my desire is to make this the most useful text available, a reference that will spend more time on your desk than on your bookshelf or your coffee table.
The content of this book has been formed through many years of absorbing knowledge from all manner of books, generations of academic papers, thousands of web articles, hundreds of conference talks, endless online and personal discussions, and lots of personal practice. What I present here is a pragmatic translation and distillation of what I have learned down the years. It is not a deeply academic or theoretical book. Where theoretical context and reference are relevant they will be signposted, as I naturally wish to provide evidence-based content wherever possible; it is about judging what is going to add most value. Experienced practitioners will likely have an appetite for delving deeper into theoretical discourse and the underlying sciences that intersect in this field but that is beyond the scope of this particular text. A digital companion site will accompany the book, with many more references, articles, books and papers to substantiate readers' knowledge of the subject.
The subject matter, the ideas and the practices presented here will hopefully not date a great deal. Of course, many of the graphic examples included in the book will be surpassed by newer work demonstrating similar concepts as the field continues to develop. However, their worth as exhibits of a particular perspective covered in the text should prove timeless. As more research is conducted in the subject, without question there will be new techniques, new concepts, new empirically evidenced principles that emerge. There will be new thought-leaders, new sources of reference, new visualisers to draw insight from. New tools will be created, existing tools will expire. Some things that are done and can only be done by hand as of today may become seamlessly automated in the near future. That is simply the nature of a fast-growing field. This book can only ever be a line in the sand.
Looking back, we all respect the ancestors of this field, the great names who, despite primitive means, pioneered new concepts in the visual display of statistics to shape the foundations of the field being practised today. The field’s lineage is decorated by names and classic examples that frequent most books that have been written to date about this subject. Of course, to many beginners in the field, this historical context is of huge interest. However, again, this kind of content has already been superbly covered by other texts on more than enough occasions. Time to move on.
A final important distinction to make concerns the subtle but significant difference between visualisations which are used for exploratory analysis and visualisations used for communication. Exploratory analysis is a huge and specialist subject in and of itself. In its most advanced form, working efficiently and effectively with large complex data, topics like ‘machine learning’ become increasingly relevant. For the scope of this book the content is weighted more towards methods and concerns about communicating data visually to others. That said, Chapter 4 will cover the essential elements of the approaches to exploratory analysis in sufficient depth for the practical needs of most people working with data.
The book is organised into four main parts (A, B, C and D) comprising eleven chapters and preceded by an ‘Introduction’ section to provide some initial context about the book’s content and structure.
Part A establishes the foundation knowledge and sets up a key reference of understanding that aids your thinking across the rest of the book. Chapter 1 will be the logical starting point for many of you who are new to the field to help you understand more about the definitions and attributes of data visualisation. Even if you are not a complete beginner, the content of the chapter forms the terms of reference that much of the remaining content is based on. Chapter 2 prepares you for the journey through the rest of the book by introducing the key design workflow that you will be following.
Chapter 1: Defining Data Visualisation
Chapter 2: Visualisation Workflow
Part B discusses the first three preparatory stages of the data visualisation design workflow. ‘The hidden thinking’ title refers to how these vital activities, that have a huge influence over the eventual design solution, are somewhat out of sight in the final output; they are hidden beneath the surface but completely shape what is visible. These stages represent the often neglected contextual definitions, data wrangling and editorial challenges that are so critical to the success or otherwise of any visualisation work – they require a great deal of care and attention before you switch your attention to the design stage.
Chapter 3: Formulating Your Brief
Chapter 4: Working With Data
Chapter 5: Establishing Your Editorial Thinking
Part C is the main part of the book and covers progression through the data visualisation design and production stage. This is where your concerns switch from hidden thinking to visible thinking. The individual chapters in this part of the book cover each of the five layers of the data visualisation anatomy. They are treated as separate affairs to aid the clarity and organisation of your thinking, but they are entirely interrelated matters and the chapter sequences support this. Within each chapter there is a consistent structure beginning with an introduction to each design layer, an overview of the many different possible design options, followed by detailed guidance on the factors that influence your choices.
Chapter 6: Data Representation
Chapter 7: Interactivity
Chapter 8: Annotation
Chapter 9: Colour
Chapter 10: Composition
Part D wraps up the book’s content by reflecting on the range of capabilities required to develop confidence and competence with data visualisation. Following completion of the design process, the multidisciplinary nature of this subject will now be clearly established. This final part assesses the two sides of visualisation literacy – your role as a creator and your role as a viewer – and what you need to enhance your skills with both.
Chapter 11: Visualisation Literacy
A special digital companion site has been constructed to provide readers with additional references and resources to supplement their learning from the book’s content.
There are three strands of additional content for readers to work through: ‘Reading’, ‘Exercises’ and ‘Case Study’:
The Reading sections offer a curated collection of recommended links to web articles, papers and texts, as well as suggested visualisation projects that are associated with the topics covered by each chapter. They might not always directly relate to a specific passage in the book; rather, they will tend to add further perspectives to build up your knowledge. There are also links to the source locations of all images included, which is especially useful for those images relating to interactive projects and any original images that only appeared in the book as a cropped excerpt. In addition, this strand of supplementary content includes a selection of hand-picked tutorials to help you learn techniques and tools relevant to the coverage of the chapter. The collection of links will be constantly refined as and when new, relevant material is discovered.
The Exercises sections offer a range of conceptual and practical tasks for you to consider undertaking to help substantiate your learning from the book. The nature of the challenges posed varies considerably: some involve sketching, others require you to assess visualisation work, and several relate directly to raw datasets. These challenges aren’t based around quizzes, for which the correct answers are revealed elsewhere: these are exercises for you to undertake and experience alongside the topics covered in related sections of the book.
The Case Study content provides a stream of narrative detailing the behind-the-scenes process that went into the development of the ‘Filmographics’ project, which was created exclusively to accompany the publication of this book. The purpose of this case study is to help demonstrate the practical application of the design process proposed through this book. (There are no case study sections for Chapters 1 and 11.)
This is a list of any acknowledged errors included in the book's text:
The passage that says 'This distinction will be explained in context during Chapter 5' is incorrect; this should be Chapter 4.
"A matrix chart can be used to display quantitative or categorical values at the intersection between two categorical dimensions. The chart comprises two categorical axes with each possible value presented across the row and column headers of a table layout. To display quantitative values, each corresponding cell is marked by a geometric shape with its area sized to represent a quantitative value and colour often used visually to distinguish a further categorical dimension. While they are most commonly seen using circles, you can of course use other symbols or shape attributes, especially if you are displaying different categorical values at the intersections, rather than quantitative values."
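The sizing rule in this passage, where a shape's area (not its radius) represents the quantitative value, can be illustrated with a small sketch. This is only an illustration under stated assumptions: the function name `circle_radius`, the row and column labels and the data values are all hypothetical and do not come from the book.

```python
import math


def circle_radius(value, max_value, max_radius=1.0):
    """Return a radius whose circle AREA is proportional to `value`.

    Area grows with the square of the radius, so the radius must scale
    with the square root of the value. Scaling the radius linearly would
    visually exaggerate the larger values.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    return max_radius * math.sqrt(value / max_value)


# Hypothetical matrix: two categorical dimensions (region x year),
# with one quantitative value at each intersection.
rows = ["North", "South"]
cols = ["2019", "2020", "2021"]
values = {
    ("North", "2019"): 25, ("North", "2020"): 100, ("North", "2021"): 50,
    ("South", "2019"): 0, ("South", "2020"): 4, ("South", "2021"): 64,
}

max_value = max(values.values())

# One radius per cell; the largest value fills the cell (radius 1.0).
radii = {cell: circle_radius(v, max_value) for cell, v in values.items()}

for row in rows:
    print(row, [round(radii[(row, col)], 2) for col in cols])
```

A value one quarter of the maximum therefore gets half the maximum radius, which keeps the circle areas in the same ratio as the values they represent.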