Creative visualisations of the UK

Inspired by the great work done by Paul Butler to visualise Facebook friendship connections across the globe, London-based web developer Jim Holden has produced some great mapping visuals of the UK using postcode, bus stop and road network data.

The road network map can be viewed here.

All three pieces do a great job of portraying the population patterns around the country. The postcode map, in particular, which is based on more than two million data points, reveals the dominance of London, the northern clustering of Liverpool, Leeds, Sheffield and Manchester, the somewhat isolated world of Newcastle and the dominating Scottish corridor of Glasgow and Edinburgh. Jim points out that Scotland is difficult to map given the lack of postcodes and bus routes in the upper north west side of the country.

As with the Facebook map, the data ‘silences’ are just as fascinating, with large gaps depicting the sparsely populated national parks of Dartmoor and Exmoor in the South West, the Brecon Beacons and Snowdonia in Wales, and the Peak District, Yorkshire Dales, North York Moors and Lake District in the English North.

The maps were produced using detailed locations from publicly open Ordnance Survey data, available from the website. For the bus stop map, data was acquired from the government’s open release of NaPTAN. Jim describes the technical process:

I wrote a PHP script to read in the data sets and cycle through each point and translate that to an image width/height of my choice. If you want to maintain the correct aspect ratio of the UK then you can use the maximum GB grid points of the UK to determine a horizontal/vertical ratio. Once the plotting (+2 million postcodes) is processed you can use some functions within PHP to filter and slightly blur the resulting graphics such that it brings out the “heat map” style. I didn’t use any other products or GIS software.
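The steps Jim describes can be sketched in a few lines. This is not his PHP script: it is a minimal Python illustration of the same idea, and the grid bounds (700 km by 1,300 km), the function names, the three sample points and the 3x3 box blur are all assumptions made for the example.

```python
# Assumed extent of the British National Grid in metres (illustrative bounds).
GRID_WIDTH_M = 700_000
GRID_HEIGHT_M = 1_300_000


def image_size(width_px):
    """Pick an image height that keeps the grid's width/height ratio."""
    height_px = round(width_px * GRID_HEIGHT_M / GRID_WIDTH_M)
    return width_px, height_px


def to_pixel(easting, northing, width_px, height_px):
    """Translate a grid reference to (column, row); row 0 is the top of the image."""
    col = int(easting / GRID_WIDTH_M * (width_px - 1))
    row = int((1 - northing / GRID_HEIGHT_M) * (height_px - 1))
    return col, row


def plot_density(points, width_px):
    """Count how many points land on each pixel."""
    width_px, height_px = image_size(width_px)
    grid = [[0] * width_px for _ in range(height_px)]
    for easting, northing in points:
        # Skip anything that falls outside the assumed grid bounds.
        if 0 <= easting <= GRID_WIDTH_M and 0 <= northing <= GRID_HEIGHT_M:
            col, row = to_pixel(easting, northing, width_px, height_px)
            grid[row][col] += 1
    return grid


def box_blur(grid):
    """Average each pixel with its neighbours: a simple stand-in for a blur filter."""
    height, width = len(grid), len(grid[0])
    blurred = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            total, count = 0, 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < height and 0 <= cc < width:
                        total += grid[rr][cc]
                        count += 1
            blurred[r][c] = total / count
    return blurred


# Three made-up grid references, two of them close together.
sample_points = [(530_000, 180_000), (530_500, 180_200), (325_000, 673_000)]
density = plot_density(sample_points, width_px=70)
heat = box_blur(density)
```

The blurred grid could then be mapped to colours and written out as an image. With millions of points, the pixels where many postcodes pile up are what produce the “heat map” effect.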

Next up, Jim is planning to produce a similar piece based on the train network data, also provided by NaPTAN, as well as one compiled from data on the UK’s mobile infrastructure.

Visualisation Insights: #7 Executive Director of Information Services

This is the seventh article in my Visualisation Insights series. The purpose of this series is to provide readers with unique insights into the field of visualisation from the different perspectives of those in the roles of designer, practitioner, academic, blogger, journalist and all sorts of other visual thinkers. My aim is to bring together these interviews to create a greater understanding and appreciation of the challenges, approaches and solutions emerging from these people – the visualisation world’s cast and crew.

This article is based on an interview I held with Brian Derry, the Executive Director of Information Services at the NHS Information Centre, based in Leeds (UK).

I first came across Brian when I discovered an article in the Health Service Journal (subscription required) entitled ‘Demystifying data’ in which he was quoted about the importance of the visual display of data. His opening statement that “3D charts are the first refuge of scoundrels” was music to my ears.

Quite apart from this appealing standpoint, his role as custodian of so much valuable statistical information relating to health and care matters makes him one of the most important information professionals in the UK, so I was delighted when he invited me over to meet him for a chat about his role, the work of the Information Centre and the importance of clearly communicated data.

Brian is a chartered statistician and chartered IT professional. Having studied Statistics at degree level, he began his working life as a Government statistician for the Ministry of Agriculture, Fisheries and Food (MAFF, now known as DEFRA) working on significant analytical projects such as the National Household Survey.

After leaving MAFF he moved on to the Home Office, taking up a key role with the remit of trying to develop information products that help explain and improve the management of prison parole, a particularly complex system which has many flows, branches, stocks, decision points and feedback loops.

A whole new level of complexity faced him when he took up the role of Head of Performance Analysis at the Department of Health in the early 1990s. This commenced his distinguished career in or around the National Health Service (NHS) which has encompassed a succession of senior and director-level informatics posts at the Leeds Health Authority, the Leeds Teaching Hospitals NHS Trust and NHS Connecting for Health.

His most recent career advancement took him to the position of Executive Director of Information Services at the NHS Information Centre in Leeds. He recently completed his term of office as chair of the National Council of The Association for Informatics Professionals in Health and Social Care (ASSIST) and is currently the Chair of the British Computer Society health informatics Professional Development Board.

The Information Centre was created in April 2005 as a result of the bringing together of a range of historically discrete information agencies, as well as key parts of the Department of Health and the Information Authority.

It is responsible for running major information collections and statistical publications relating to a broad spectrum of health and social care service topics such as measures of care quality, hospital activity, prescribing, primary and community care, mental health, adult social services, population health, lifestyle surveys, alcohol, obesity, the NHS workforce and NHS estate.

The centre employs around 500 people, of whom about 70% are based in Leeds and the remainder in Southport. The staff in the Leeds office comprise a diverse range of skills and expertise with a mixture of IT people (managing sophisticated systems, data warehousing, survey processes), statistical analysts, business analysts, information governance specialists, HR people, communication professionals, project and programme managers, and health care specialists. The Southport ‘Central Register’ office has the unfortunate label of ‘National Back Office’ (“a terrible name”, as Brian suggests) but is recognised as providing an incredibly important service dealing with the management of the national register of NHS patients and their NHS numbers (the national unique patient identifier) and therefore has access to a huge amount of vital data.

A large proportion of the Information Centre’s workload involves producing official statistics for central government to help develop and account for health policy. However, as Brian points out, since its inception the organisation has been moving more and more towards a focus on what is most useful for the NHS and Local Government in order to help improve their services. This is a hugely positive trend.

Having recently been the subject of review, and given the political climate under the new Government, it is great news to hear that the Information Centre continues to be recognised as a vital national repository of information with the review concluding it provided a valuable service. It will now become a non-departmental public body operating at arm’s-length from the Department of Health.

The Information Centre is now looking to enhance its service across several key areas with particular emphasis on facilitating an increased centralised collection of information, continuing to increase the range of information useful for patients and the public, especially around the quality of service measures, and aspiring to sustain the quality and quantity of provision but with greater efficiency (in terms of both cost and access). As Brian points out:

“Information is not a free good so there is an increasing need to make best use of it, making it work in combination rather than in isolation and joining up a single cohesive story.”

This creates greater value and impact.

At an early stage in his career, Brian was aware that much of the output of Government statistical work was impenetrable, produced by civil servants for civil servants and not presented in ways that would make it meaningful to everyday people. His keen interest in identifying effective visual methods for information displays can be traced back to these early experiences.

He is careful, though, to recognise the acute complexities of many of the service environments and organisational systems he has worked in, especially in some of his earlier roles where the people making key judgments were not particularly technical or information-minded people. No system is more complex than the NHS and the challenges faced by information professionals are becoming greater as the desire for improving the understanding of quality of care grows.

“Trying to explain the NHS in numbers is hard. How do you effectively describe how good a service may be?”

Brian points to a specific difficulty with how you effectively present mortality rates, where you are trying to convey notions of wide variation and constant changes in trends in an accessible way:

“We now live in a world where patient choice is terribly important, with people choosing how and where to receive care. But choice without information is no choice at all.”

The key challenge is about communicating such information to people in ways that reduce the complexity and enhance their ability to choose.

Continuing the issue of the civil servants being responsible for communicating statistical work, Brian points to the specific demands involved in dealing with non-specialists in the NHS context. He refers to his past experience working with senior NHS managers where the people he dealt with were typically highly intelligent, in key positions but had no statistical training and were not always particularly numerate. Yet these people operate in an environment where everything is judged through complex indicators and so this pressure in combination with a lack of expertise creates a culture focused on dissecting short term differences in numbers without fully considering the bigger picture.

As a service which is generally renowned as being obsessed with chasing targets, in the NHS data can be subjected to scrutiny at such a micro level of detail, and observed within a random distribution, that it creates a narrative fallacy:

“If you present a set of numbers, people will spend hours trying to explain reasons for those numbers. But too many fail to understand variation very well – you need a run of numbers to include a context. You cannot establish any idea of performance history when you only see two data points in isolation.”
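Brian’s point about needing a run of numbers can be illustrated with a small sketch. The monthly figures below are made up, and the “within two standard deviations” rule is just a common rule of thumb assumed for the example, not a method used by the Information Centre.

```python
from statistics import mean, stdev


def within_usual_variation(history, latest, limit=2.0):
    """Check whether the latest value sits inside the spread of the earlier run.

    The `limit` of two standard deviations is an assumed rule of thumb.
    """
    centre = mean(history)
    spread = stdev(history)
    return abs(latest - centre) <= limit * spread


# Hypothetical monthly values for some indicator.
history = [52, 48, 50, 51, 49, 47, 53, 50]
latest = 53

# Seen in isolation, 50 -> 53 looks like a 6% rise that needs explaining.
# Seen against the run, 53 is no different from what the series has done before.
print(within_usual_variation(history, latest))
```

With only the last two points there is nothing to compare the change against; the run supplies the context that tells you whether a difference is worth explaining at all.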

So given this strong culture, how do professionals in the Information Centre deal with clinical colleagues who may commission or consume information projects? How do they manage to endorse their own expertise, particularly in circumstances where there is a client wishing to dictate matters?

Brian describes how, when dealing with colleagues, they approach the engagement as if it were a standard consultancy project, with a focus on clearly separating wants from needs and establishing the specific problem to be solved. This is about professionally managing the relationship, ensuring there is respect for the mutual areas of expertise.

“We are open to challenge, and should be too, but we can rightly say that we may know of ways of doing this better, so it depends on the opportunity for dialogue.”

Situations naturally get most anxious when analysis is being presented on information which directly relates to clinical colleagues, so it is about building trust and ensuring they fully understand the data and the analytical treatment on which they may be judged.

“There can be nothing worse than being judged on information you didn’t know existed.”

A process challenge facing the Information Centre, and indeed any large public sector body responsible for information management, concerns the need to effectively balance the production and analysis of information. So much time can be absorbed by the gathering, handling and statistical analysis of data to the detriment of the opportunity to derive meaning from the information results. This can be a symptom of over-accountability, a situation where the recording and production of information is relentlessly ‘feeding the beast’ that is a public service organisation and its myriad stakeholders. The cost of capturing and processing data is huge and so it is important to be reminded that information should be seen as a by-product of activity, not the activity itself.

“The information tail should not wag the service dog”, remarks Brian.

One of the most challenging components of NHS performance concerns the assessment of quality of care and a key dimension of quality is the patient experience. But how do you capture the type of data you need to monitor patient perception of the care they have received?

A structured source of evidence comes from sites like ‘I want great care’, which provide important channels for capturing and analysing information about patient experiences, their views on the quality of care they have received, perceptions of GP performance etc. On top of this, excellent data is gathered via a large number of surveys which record patient attitudes about the care they have received.

A significant growth area in the information management and statistical world is sentiment analysis. The catalyst has been the ubiquity of social networking services like Twitter and Facebook which create incredible volumes of accessible qualitative data. The challenge is to identify the most effective uses, algorithms and means of mining this data from a health service ‘attitudes’ perspective.

Brian notes that a key barrier to the maturity of analysis around this matter concerns access to computer technologies: the heaviest use of health and social care is among the very young and the elderly. Additionally, health is typically linked to deprivation so patients from lower social backgrounds are also restricted from having their voices heard through some of these contemporary, technologically driven channels. This represents a significant problem for securing a comprehensive and efficient picture of perceptions around this quality of care. Identifying an effective solution to resolve this particularly complex and elusive challenge is a priority for information professionals across the NHS.

I asked Brian about his views around the visualisation and communication side of his organisation’s work. In terms of the challenge in explaining statistical information to the public, he points a finger of blame towards the media’s role in failing to promote clear communication.

“Large portions of the media are largely more concerned with graphic design over content which can confuse viewers and readers, missing the message completely.

“The only purpose of a chart is to convey a message accurately and clearly but most of the information you see in the media fails these basic tests. 3D charts are a particular bugbear. Whilst they are technically clever pieces of design, they fail to communicate accurately and sometimes this is intentional. Conveying genuine meaningful information is our priority, not getting distracted by graphic design.”

Brian also believes the media seem to amplify a society-wide shortcoming with regard to numeracy. He observes how many people struggle with the nuances around statistical concepts such as percentages, percentage changes and percentage point changes, all particularly common devices used to support the presentation of news items.
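The difference between a percentage change and a percentage point change is easy to blur, so here is a small worked sketch. The rates are hypothetical and the function names are only for illustration.

```python
def percentage_change(old, new):
    """Relative change from old to new, expressed as a percentage of old."""
    return (new - old) / old * 100


def percentage_point_change(old_rate, new_rate):
    """Absolute difference between two rates that are already percentages."""
    return new_rate - old_rate


# Hypothetical example: a rate rises from 10% to 12%.
old_rate, new_rate = 10.0, 12.0
print(f"{percentage_point_change(old_rate, new_rate):.1f} percentage points")
print(f"{percentage_change(old_rate, new_rate):.1f}% change")
```

The same movement is a rise of 2 percentage points but a 20% increase, which is why a headline quoting only one of the two figures can leave readers with very different impressions.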

People are also becoming less patient: there is a society-wide culture that demands information immediately, without having to plough through masses of detail. People don’t want to have to work for information, so it should work for them. This makes the challenge of presentation so critical – if you cannot convey your message instantly, forget it.

On a broader basis, there is a failure to keep things simple. Brian cites a quote attributed to the French mathematician Blaise Pascal, who said something to the effect of:

“I am sorry I have had to write you such a long letter, but I did not have time to write you a short one.”

It is the same with charts – I’ve not had time to do a simple chart so here’s a complex one. However, there is a balance to be struck: sometimes there is such a drive to “simplify, simplify, simplify” that the key content is abandoned in doing so.

I asked Brian about the advice or guidance provided to his analysts about approaching the visualisation stage of their work. Many of the professional statisticians belong to the Government Statistical Service, which spans all government departments, and this body has clear standards and principles to follow about the presentation of data. Beyond the specifics of these standards there is a general house rule that good statistical practice means focusing on the content first, design second. When it comes to design, analysts need to constantly ask themselves “will this accurately convey the message?”

Brian also draws a key distinction between the activity of analysing and communicating data and the difference each entails with regards to design:

“It is really important you are clear about what you are trying to do. Are you describing a set process or system? Are you answering a question about a hypothesis? Or are you using it to make a particular point?”

Despite the presence of clear house rules around statistical and presentation practices, the Information Centre’s analysts are actively encouraged to ensure they are taking full advantage of the latest technological developments and contemporary thinking. Indeed this is an area of business they invest a lot of time and effort in. By way of example, to support this culture of ongoing development, a member of staff is currently doing a PhD around statistics at the University of Leeds and a strategic relationship has been formed with the same institution. This provides mutual access to key seminars and events; Information Centre staff routinely teach on some undergraduate courses, and the two organisations collaborate in many areas of research activity.

The software used by staff includes many of the typical tools you would expect within such an information-rich environment. They use Microsoft Office products, SQL Server databases, SAS analytical tools, and various GIS applications – as Brian observes, “health is very concerned with patterns of geography”. They also have access to extremely large data warehouses operated by commercial partner organisations such as BT, for example the ‘Secondary Uses Service’, which takes continuous records of patient care amounting to millions of lines of data.

There are some concerns around the extensive use of Excel and Access. They are still popular tools but are considered difficult to control at scale and this can lead to errors concerning issues like cell linkage inconsistencies and ‘cut and paste’ mistakes, particularly when utilised in shared processes.

The general approach in the Information Centre is to standardise methodologies, creating a sophisticated environment of working that is portable and generic and not overly reliant on a single technology to achieve everything. To support this approach they have begun moving more towards a data management environment built around SAS, which enables shared references, macros and access to data sources, is considered much more efficient and, importantly, reduces the risk of errors.

One of the significant developments within information management in recent years has been the emergence of the open data movement. I asked Brian how he viewed this in the context of NHS-related information – is it a positive development, does it pose a risk or opportunity, does it create added pressure/demands on his organisation?

Emphatically he thought it was a positive development and believed a certain amount of risk-taking should be encouraged, but he was mindful of certain challenges that accompany this environment.

“We need to move away from the state control – the public have come a long way so the more information that is available, and in a form for anyone to use it, the better.”

The adverse risks he recognises concern issues of understanding, appreciating that with certain information contexts results can be misinterpreted, so supporting features like reference files and meta-data are vital.

The boundaries of transparency are also changing and the classic public service view of confidentiality is slowly diminishing, which is very healthy. For example, clinicians are now routinely named in certain packages of analysis – that would never have been the case not so long ago but reveals a far greater willingness to embrace transparency. It is still important, however, to be very careful in many circumstances when patient confidentiality is concerned. Issues of potential identification can crop up within subjects that involve small numbers, particularly when presented in a geographical context. Whilst the analysis may not immediately reveal individuals it does open up the possibility for people to triangulate data and arrive at an identity.

As a public body it is more important than ever to be able to demonstrate effectiveness and value for money. I asked Brian how success is judged in the Information Centre.

Much of the immediately available evidence comes through standard web analytics, with extensive information available on typical measures around visitors, page visits, and which datasets are viewed and downloaded. They also have activity from registered users via a service on the Information Centre website called ‘My IC’. This helps to collect information about what subjects users are most interested in and provides a convenient channel for running survey campaigns across these regular customer groups.

Whilst having a fairly advanced understanding of the usage of registered users, the biggest gap surrounds the use of their information services by patients and the wider public. More and more data is getting out there, and there is good research about what is generally helpful to the public, but it is still difficult to track how the information is being used to drive action or insight by these groups:

“We need to keep pushing to establish a great understanding of what it is you [the public] want, how do you want to use it, how could it be done better – a level of intelligence greater than can be derived from simply running surveys. This is a great challenge.”

Brian cites the ever increasing sophistication of people and their constantly changing behaviour and tastes around consuming information, driven by technology. In response, one of the more non-traditional methods employed to understand how people are engaging with the site and its services has been work such as that by City University using eye-tracking to identify the ergonomic factors around the web layout and design.

My final question was to ask Brian to project himself five years into the future and consider what will represent success for the Information Centre over that period.

“If a lot more of the information around the NHS was more readily available and was being used and understood, especially by the public. That would be a success. We will have also developed better measures for the quality of care. But overall, we will have succeeded if the Health system was beginning to use the information effectively to improve its services. Data is for information which is for improving public services.”

I’m extremely grateful to Brian for inviting me down to meet him for this interview and the energetic and informative discussion that ensued. I took a great deal out of this, unearthing more and more nuggets of wisdom every time I listened back to the audio recording. I wish him and his colleagues at the Information Centre all the very best for the future in their undertaking of this hugely valuable activity.

Best of the visualisation web… December 2010

At the end of each month I pull together a collection of links to some of the most relevant, interesting and useful articles I’ve come across during the previous month. If you follow me on Twitter you will see many of these items tweeted as soon as I find them. Here’s the latest collection from December 2010:

Wired | “.WWF” The tree-friendly file format that can’t be printed | Link

Smashing Magazine | “What Font Should I Use?” Five principles for choosing and using typefaces | Link

The Guardian | Best apps: our experts pick 50 of the most dazzling, useful and novel | Link

Design Shack | Design discussion: brand advertising vs. promotional marketing | Link

David B Sparks | Demonstration of electoral ‘Marimekko’ (or tree map) plots | Link

Wall Street Journal | Everything the Internet knows about me (because I asked it to) | Link

Loose Wire Blog | Facebook’s ‘Locality of Friendship’ | Link

Impure | Gapminder, redeveloped using Impure | Link

In Graphics | In Graphics – ‘A magazine for visual people’ | Link

O’Reilly Radar | Six months after the publication of the “What is data science?” paper | Link

O’Reilly Radar | Strata Gems: Quick starts for charts | Link

O’Reilly Radar | Strata Gems: Where to find data | Link

O’Reilly Radar | Strata Gems: Write your own visualizations | Link

Core 77 | Thinking of doing a design PhD? | Link

xPlane | Why the office is still a great place to work | Link

Flowing Data | Amanda Cox (NYT) on data graphics and stuff | Link

Creative Review | Amnesty’s guerrilla campaign makes the invisible visible | Link

Online Journalism Blog | Visualising data with the Datapress WordPress plugin | Link

Impure | Impure second video-tutorial: Workspaces and Impure code | Link

IBM | Data visualization with Processing, Part 1: An introduction to the language and environment | Link

New York Times | Interactive puzzles to test your insight | Link

New York Times | Designing election results on the iPad | Link

Discover Magazine | Live not by visualization alone… | Link

Flowing Data | 10 best data visualization projects of the year – 2010 | Link

New York Times | ‘Mapping America: Every City, Every Block’ – showing household income distribution | Link

41 Latitude | Why do Google Maps’s city labels seem much more “readable” than those of its competitors? | Link

Wall Street Journal | “What They Know” series – the data collected and shared by 101 popular apps on iPhone and Android phones | Link

The Google Books Ngram Viewer | Link

Junk Charts | Handling multi-level data in multiple charts | Link

Visualisation Reflections: #6 Data Journalist & Information Designer

This is a follow-up post to my sixth article in the Visualisation Insights series which I published earlier this week. The purpose of this companion series is to optimise the learning opportunities from each insights article, reflecting on the ideas, issues and observations to emerge.

Why did I choose this subject?

Recalling the purpose of this series, I’m looking to provide readers with unique insights around visualisation from the perspective of its active and prominent participants. As I said in my article, over the past 12-24 months there have been very few higher-profile individuals in the visualisation field than David McCandless.

The launch of his ultra-successful book Information is Beautiful, his accompanying website, appearances on British TV, a regular platform on the Guardian data blog, an invitation to present at TED and many other public appearances have led him to become the face of UK visualisation.

I was naturally delighted when David agreed to an interview.

Impressions prior to the interview?

Having followed David’s emergence closely from here in the UK, I have witnessed the high praise he has received as well as a certain amount of negativity that automatically accompanies such exposure. Interestingly, both acclaim and criticism seem to centre on the matter of beauty. The supporters love the visual appeal and engage-ability (is that even a word?) portrayed in his work; the critics argue that this style leans too heavily on aesthetic appeal to the detriment of perceptual accuracy.

This argument is not unique to the visualisation field; it is the argument across all of design.

Despite the worthy efforts of the academic community, the paradox of form vs function shows no signs of being resolved – the perception of design continues to be inextricably linked to the volatility of human taste.

Whilst there are occasionally elements in some of the designs in ‘Information is Beautiful’ that emphasise beauty over enabling accurate interpretation, my personal view of David’s vast portfolio is that it is a hugely impressive, interesting and insightful contribution to the field. The creativity displayed is fabulous and he deploys innovative visuals that greatly expand the range of design solutions most of us typically resort to.

I believe he has contributed enormously to the burgeoning appeal of the subject, helping it to reach a broader audience. This is a vital achievement, as otherwise the popularity of a subject can end up existing within an isolated bubble. To witness a glimpse of how popular David’s work has been, start typing the word ‘information’ into Google, and see how quickly the recommendation for ‘Information is Beautiful’ appears…

Impressions after the interview?

There are many fantastic nuggets contained within David’s interview and it should provide a great reference for anyone in the field, whether you are new to it or already well established in visualisation design.

The first highlight concerns his career background. Here is a wonderful exhibition of the many different education and career routes by which you can end up arriving in the visualisation field.

His passion for programming and computer games dominates the early part of his career (presumably inspired by War Games and Weird Science if the timeline is anything to go by!) and through this he gradually migrated towards the world of journalism.

His visual CV perfectly demonstrates the pace, direction and maturity of capabilities and junctions his career has followed and it is fascinating to observe the sudden expansion of his interests into design around 2007, as described by his discovery of a visual solution to make sense of, and distil, complex information around evolutionism and creationism.

The visualisation field is characterised by its convergence of principles, practices and theories across a wide range of traditionally diverse subjects. David’s career seems to demonstrate this convergence perfectly – the strong competence with technology and a mind geared towards programming, the journalistic nose for a story, pursuing a hypothesis, the copywriter’s ability to eliminate waste and reduce a matter to its essential core, and the designer’s skill at pulling all this together into an engaging communication. He is hack and hacker rolled up into one:

I’m always going for simplicity and clarity and space. Partly as a response to information overload. But also because journalism and design share a common goal – they want to make things clearer.

His rapid transition from novice designer (with all due respect) to the conception and production of his successful book is remarkable. His journalistic senses seem to have been a major catalyst:

After starting to play with infographics, I suffered a good six months of doubt. My agent wanted me to do another book and asked me what I had. Meekly, I presented this idea of a “mapbook of ideas”. In terms of identifying the visualisation subjects included, I just followed my ignorance and frustration mostly! I set about trying to answer questions I wanted answered. And to fix things that frustrated me about information – the reporting of abstract billion-dollar amounts, patterns in media. But I think frustration, ignorance, boredom, bewilderment can be as inspiring as joy or curiosity or conversation. Those ‘negative’ feelings are a sign that something isn’t working. And an opportunity to design a solution.

I’m particularly interested by his observation of how negative emotions can help push you forward and how they have a similar effect to the way positive emotions pull you towards a goal.

Once again, his journalism background strongly influences his thoroughness of preparation and research, as he describes the process of doing a design as being “80% research, 20% design”. This highlights the importance of undertaking and persevering with the often boring and arduous background work of pursuing data to construct a visualisation. When done properly this work mostly goes unnoticed; when it is not, the lack of rigour is glaring.

I mentioned earlier the issue of ‘volatility of human taste’. This was a strong theme in the debate (if you can call it a debate) that David took part in with Neville Brody on BBC’s Newsnight programme. What should have been a healthy and welcome debate on the increasing popularity of information design was greatly let down by the weak and seemingly unbalanced facilitation of the discussion. It is really interesting to hear David’s thoughts on this appearance:

Specifically, I forgot how TV journalism reduces debate down to two opposing polarities: for and against. Which I think for a topic like information design is a lame approach. How can you be against information design? It’s just a technique! So I was caught on the hop a bit and felt quite bemused by what was going on. I thought we might have a debate about its potential and its limitations. But no.

The final observation from David’s interview comes from my question relating to the perceived benefits of visualisation. In what might be seen by some as a surprising standpoint David remarks that visualisation is a means but not the end. This is absolutely true and can be evidenced by the proliferation of needless, un-cohesive infographics that continue to fly around the web:

…I think businesses hungry for new insights and innovations may be disappointed by what this field can offer. I think data visualisations can provide insights – sometimes. But the data has to be worked and moulded and played with journalistically – to reveal those insights. I’m not sure visualisation per se or automated tools can do this. You need a brain. Data needs humans to be interesting.

Many thanks again to David for finding time to take part in this interview. I happened to approach him about the article at probably the most high profile period of his year so it is hugely appreciated that he was able to provide such a comprehensive and interesting set of responses to my questions. I wish him continued success during 2011. You can follow David on his Information is Beautiful website and via his Twitter account.

Look out for future insights articles, with many interesting interviews and interviewees lined up…

Twitter raffle for free conference pass

Following my recent visualisation contest I am delighted to announce that, thanks to the kind folks at O’Reilly, I have a second free pass to give away for one lucky person to attend the Strata ‘Making Data Work’ Conference in February.

With time running out before the event kicks off on the 1st February I’m going to hold a fairly simple quickfire raffle via Twitter.

To enter, all you have to do is retweet, reply or mention this post on Twitter.

The raffle is open as soon as this post is published and will close at midday on Monday 10th January.

I will then compile a list of Twitter users eligible for the prize, randomly drawing out one person’s name. The winner will be announced on Monday and I will send them details of the registration code immediately so they can get on and register.
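For anyone curious about what a draw like this involves, here is a minimal sketch in Python. It is illustrative only and not necessarily how the draw will actually be run; the handles are made up, and the rule that each user counts once however many times they retweet or mention the post is an assumption.

```python
import random


def draw_winner(entrants, seed=None):
    """Pick one winner uniformly at random from the de-duplicated entrants."""
    # Assumption: each Twitter user gets one entry, so normalise the handles
    # ("@Alice" and "alice" count as the same person) and drop blanks.
    unique = sorted(
        {name.strip().lstrip("@").lower() for name in entrants if name.strip()}
    )
    if not unique:
        raise ValueError("no eligible entrants")
    # A seed is optional; passing one makes the draw repeatable for checking.
    rng = random.Random(seed)
    return rng.choice(unique)


# Hypothetical handles, for illustration only.
entrants = ["@alice", "bob", "@Alice", "carol"]
print(draw_winner(entrants))
```

Sorting the de-duplicated list before choosing means that the same seed always gives the same winner, which makes the draw easy to verify afterwards.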

For those who don’t win the prize you can still benefit from a 25% discount off the registration fee by using code str11vsd or by clicking here.

Visualisation Insights: #6 Data Journalist & Information Designer

This is the sixth article in my Visualisation Insights series. The purpose of this series is to provide readers with unique insights into the field of visualisation from the different perspectives of those in the roles of designer, practitioner, academic, blogger, journalist and all sorts of other visual thinkers. My aim is to bring together these interviews to create a greater understanding and appreciation of the challenges, approaches and solutions emerging from these people – the visualisation world’s cast and crew.

David McCandless is a data journalist and information designer based in London. Over the past 2 years David has emerged as one of the most prominent visualisation designers, largely through his exceptionally popular book “Information is Beautiful” and accompanying website.

Other than Hans Rosling, it would be fair to say that, through his book, his ongoing design work and several high profile public appearances, David was the most talked about star of the visualisation scene in 2010. It is therefore self-evident that I would wish to invite him to take part in an Insights article and I’m extremely grateful that he was able to find time from his extremely demanding schedule to offer his thoughts on a range of issues.

You’ve had a varied career to date – can you provide a brief narrative of the key milestones in your education/training/career background that led up to your current data journalist/information design career you enjoy today?

Sure. I’m 38 years old and have been a professional journalist for 23 years (since I was 15).

1986 (aged 14) – hacking column for cult ZX Spectrum magazine, Your Sinclair. I would break into games and work out how to give players infinite lives.

1989 – Studied English at University Of London. Dropped out aged 19.

1990s (20s) – Wrote for video games magazines Zero, PC Zone – UK Doom Champion

1997 (26) – Freelance feature writer on technology, web and game culture. For Wired Magazine US & UK, The Daily Telegraph, The Evening Standard.

1999 (28) – Wrote & edited, part of an interactive drama on BBC 2 in the UK. One of the first websites to feature something called a “blog”.

2001 (age 30) – Freelance reporter for The Guardian, Telegraph and Independent. Writing about web culture and anything “strange and interesting”.

2003 (32) – Freelance conceptual copywriter specialising in digital advertising. Clients included Orange, Amnesty, BBC. Won 2 webby awards, one silver pencil and over 14 separate industry awards.

2006 (35) – Wrote The Internet Now In Handy Book Form!, a spoof of the internet. Self-orchestrated viral campaign.

2007 (36) – Creative Director working in advertising. Started designing infographics. Had the idea to do a book of information maps and charts called “The Information Atlas”

2008 (37) – Wrote and designed ‘Information Is Beautiful’. The book took 1 year of solid work to design and refine. Plus 6 months of research. 6 months of doubt.

When/what was your “aha!” moment in terms of information design – when were your eyes opened to the world of visualisation and what defined some of the techniques you employ today?

In 2007 I was a working journalist and having to stay on top of loads of different subjects, keep copious notes and track the developments and ideas in the various fields I was interested in and writing about.

This was fun but hard work. I’m into loads of different subjects: tech, philosophy, science, culture, sub culture, history etc. So I was generating a lot of material.

I was researching a piece about Evolutionism and Creationism. It was generating a huge amount of notes as Evolutionary Theory is made up of loads of different camps, all with slightly varying perspectives. At the same time, Creationism is made up of loads of different camps, all with slightly varying perspectives. I wanted to try to depict them all.

So, to keep track of all these different camps, I drew a visual map, and tried to sum up, distill, each camp into the minimum words.

In the end, I had this pretty interesting diagram. I looked at it and thought: “Hmmmm, I don’t really need to write the article now. It’s done the job of delivering the information.”

Then I thought: “Maybe I could do this to loads of subjects? Instead of writing an article, this diagram could be the article?”

I then started looking at my notes and article ideas with that in mind. That was it really.

How would you define your visualisation/design style?

I’m always going for simplicity and clarity and space. Partly as a response to information overload. But also because journalism and design share a common goal – they want to make things clearer.

Who would you describe as being the most influential authors, writers, designers or academics behind your visualisation techniques and design identity?

Key influences: Edward Tufte, Ben Fry, Robert Horn, Judith Donath, Marshall McLuhan, Wurman, Otto Neurath, Gerd Arntz, Bureau D’Etudes, Stamen.

Um, I look at everything – I subscribe to design blogs and look at them every day. So I would say I’m influenced by information design as a whole, and by the web. Alas, a lot of the blogs I look at don’t necessarily have back-links to the designers, so I don’t often know whose work I’m looking at (that’s a reason why I always put a prominent signature on my work).

Many readers will probably own or will have read a copy of your book “Information is Beautiful”/“The Visual Miscellaneum”. What motivated you to write the book and how did you identify the visualisation subjects included?

After starting to play with infographics, I suffered a good six months of doubt. My agent wanted me to do another book and asked me what I had. Meekly, I presented this idea of a “mapbook of ideas”.

In terms of identifying the visualisation subjects included, I just followed my ignorance and frustration mostly! I set about trying to answer questions I wanted answered. And to fix things that frustrated me about information – the reporting of abstract billion-dollar amounts, patterns in media.

I also had some cool ideas – like Time Travel or Wikipedia Edit Wars – that I couldn’t wait to visualise.

But I think frustration, ignorance, boredom, bewilderment can be as inspiring as joy or curiosity or conversation. Those ‘negative’ feelings are a sign that something isn’t working. And an opportunity to design a solution.

How long did it take you to construct the collection of published designs?

It took a year of solid work.

Any plans for a follow-up?

Yep. I’m going to be announcing something soon.

What was the motivation/curiosity that sparked the research behind this piece of work? Was it simply the topical nature of the digital economy bill or was it something you had been working on for a while?

A friend (thanks Neilon) sent me a link to a blog post by the Cynical Musician called ‘The Paradise That Never Was’.

In it, he lamented the fact that the potential of the web as a golden, global marketplace for ALL musicians, not just those on major labels, had never really appeared. He felt record companies had quickly colonised and contracted the space. So that musicians ended up being screwed. As evidence, he produced his pitiful micro earnings from etc and compared it to the minimum wage.

I thought it was a fantastic insight. But also the linking of the data to a real world figure that we can all relate to (the minimum wage) really de-abstracted the numbers.

I felt it was a great approach.

As a user of Spotify and other music services, I felt appalled that musicians were being so ripped off. I wanted to find out the truth about the figures.

So I set about applying the same metric to all the main online services (US, UK & Global) and double-checking the figures.

(Incidentally, I had to get the Spotify figure leaked to me by an industry insider. It’s a very closely guarded secret).
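The metric David describes, how many units a musician must sell to reach the minimum wage, comes down to one division, rounded up. The sketch below shows that arithmetic in Python. Every figure in it is a hypothetical placeholder chosen for easy mental checking, not a number from David’s research or from any real service, and the function and variable names are mine.

```python
from decimal import Decimal
import math


def units_for_target(target_income, earnings_per_unit):
    """Return the whole number of units needed to earn at least target_income."""
    # Decimal avoids binary floating-point surprises with tiny per-unit amounts.
    target = Decimal(target_income)
    per_unit = Decimal(earnings_per_unit)
    if per_unit <= 0:
        raise ValueError("earnings per unit must be positive")
    # Round up: a fraction of a sale or play cannot be sold.
    return math.ceil(target / per_unit)


# Hypothetical monthly income target and per-unit earnings, illustration only.
MONTHLY_TARGET = "1000.00"
EARNINGS_PER_UNIT = {
    "self-pressed CD": "8.00",
    "album download": "1.00",
    "single stream": "0.001",
}

for fmt, per_unit in EARNINGS_PER_UNIT.items():
    units = units_for_target(MONTHLY_TARGET, per_unit)
    print(f"{fmt}: {units:,} units per month")
```

Tying every format to the same real-world target is what makes the rows comparable, even though the per-unit amounts differ by several orders of magnitude.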

You always provide a detailed background behind your/your team’s data research – presumably your journalistic leanings show a strong ethic around accuracy and thoroughness – what proportion of the overall project duration was spent gathering data vs. the actual design preparation and execution?

There was a huge amount of research that went into this. It was very difficult to keep on top of the figures and to decode the various systems and proportions music services use to calculate revenues. Also there were several different categories of earnings (physical sales, streaming, downloads etc) so I had to juggle apples, oranges and pears.

I would say it was 80% research. 20% design.

What software/techniques do you use to handle the data/ analysis and prepare it for the visualisation?

I just use the spreadsheet function in Google Docs.

My approach is always to create well-designed spreadsheets – using colours, filled bars, separators, bolds, emphasising key figures, de-emphasising less vital numbers – so the spreadsheet is as legible and understandable as the final design. You should be able to understand the issue if you look at the spreadsheet.

Also, the spreadsheet often acts as the ‘skeleton’ for the design. If you’ve structured the data well, so that the meaning is clear, then the visual design is often pretty straightforward. You essentially just design what’s in the spreadsheet – perhaps using visualisation techniques to lead the eye, emphasise data and make it attractive.

In terms of the design, how did you arrive at the concept of your display? What sort of alternative solutions were you considering? Were there any constraints that restricted the solutions you could arrive at (eg. time, space, layout etc.)? Can you explain some of the deliberate decisions you made around the visual properties in the design? I’m particularly interested in your decision process around the growing pink bubbles representing the variable ‘need to sell…’ and the common difficulty of representing hugely diverse ranges of values like this on the same graph.

The image went through 5 main drafts. That’s unusually low for me actually.

I wanted to run it as a blog post, rather than a larger visual. So the width had to be 550 pixels wide. I wanted people to be able to mount it in their own blogs.

Running stuff at 550 width presents some problems, mainly that the fonts have to be small. Legibility etc becomes an issue.

Draft 1 – The key stand-out, easiest-to-understand metric in the whole dataset is ‘how many a solo musician must sell to hit minimum wage’. So this acted as the central spine of the design. This section was given a key colour to further emphasise it.

It worked well because there’s basically a massive visual punchline when you get to the bottom. In the early drafts the punchline is even more massive because I got my calculations wrong on Last.Fm

Draft 2 – I also wanted to convey the proportions of each type of sale a musician makes. Which I felt was interesting and added visual variety. So I started using pie charts.

In the meantime, I started adding visual elements or representations for each sell. I thought it created more visual variety.

It was a fairly simple design – essentially it’s a table. But I was refining and re-calculating the data right from the start.
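On the question of showing hugely different values as bubbles, one common convention is to make a bubble’s area, rather than its radius, proportional to the value, so that large values do not look even larger than they are. The sketch below illustrates that convention in Python. It is an assumption on my part, not something David describes doing, and the function name and example values are hypothetical.

```python
import math


def bubble_radius(value, max_value, max_radius):
    """Radius for a bubble whose AREA is proportional to value.

    The largest value gets max_radius; everything else scales by the
    square root of its share of the maximum.
    """
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    if value < 0:
        raise ValueError("value must be non-negative")
    # Area grows with the square of the radius, so take the square root
    # of the ratio to keep area (not radius) in proportion to the value.
    return max_radius * math.sqrt(value / max_value)


# Hypothetical values spanning several orders of magnitude.
values = [100, 10_000, 1_000_000]
for v in values:
    radius = bubble_radius(v, max(values), max_radius=200.0)
    print(f"value {v:>9,} -> radius {radius:.1f} px")
```

Even with area scaling, the smallest bubbles in a range like this end up tiny, which is one reason the largest bubble at the bottom of a design can work as a visual punchline.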

From a process point of view, do you tend to sketch ideas out on paper first or go straight into ‘playing’ with the data, exploring different compositions and iterations?

I sometimes sketch. Sometimes the data shapes the image (as in this case). Other times, you just try different approaches until one works.

Generally, I think if you move into digital too early, you can get stuck in the ‘purity’ of it. I’d say stay on paper and sketches for as long as possible. You can debug a sketch pretty well. So if it works on paper, especially when you show it to other people, it’ll probably work onscreen.

Other than using (presumably) Illustrator, are there any other software tools that you use to help create the final designs?


Are there any particularly intricate or advanced features of this software that you could share?

Not really, sorry.

Of course any design brief could be delivered in many different ways. On reflection, is there anything you would wish to change, add or remove from this design? Have you received any suggestions or comments that you thought represented good ideas?

Looking back on it now, I would probably de-clutter it a bit more. Maybe lose the $ column. And perhaps centre the final bubble. It annoys me that it’s not centered. Otherwise, not bad!

I think one commenter said I should add an infinite sized bubble for all the radio plays an artist gets that nobody pays for. Hmmm, good point.

How do you see the state of the visualisation field right now (ie. encompassing data journalism/information design) both in the UK and worldwide?

Pretty exciting. Expanding rapidly. All the infodesigners I know are busy, busy, busy.

I despair a little at the infographic dreck that has appeared. But I think the good pieces are still rising to the top and being seen.

There still seems to be a boundary between journalism and information design though. Still, I think many designers and journalists are falling into the trap of thinking that just visualising something makes it good or interesting enough. I’d love to see more designers getting some journalistic training – and more journalists getting some design training.

Do you see any regional differences in techniques or approaches to visualisation across Europe compared to the UK, the US and other parts of the world? Alternatively, does visualisation exist as a genuine global community progressing and evolving collectively?

Not sure really. As I said, I look at design pretty agnostically. I don’t often know where it comes from.

You recently accompanied David Cameron to India, how did this come about and what were the key insights that struck you from your experiences on this trip?

My name got on a list! I was as surprised as you. It was quite a mad trip. The Indians we met were as hungry for data visualisation and information design as we are.

You were also recently invited to go on BBC’s Newsnight programme, debating the issue of data visualisation and in particular the issue of aesthetics. Do you have any reflections about how the subject was presented and how the subsequent discussions went?

I was flattered and amazed to be asked to appear. I haven’t done telly for a few years. So I’d forgotten what it was like. Specifically, I forgot how TV journalism reduces debate down to two opposing polarities: for and against. Which I think for a topic like information design is a lame approach. How can you be against information design? It’s just a technique!

So I was caught on the hop a bit and felt quite bemused by what was going on. I thought we might have a debate about its potential and its limitations. But no.

One of the criticisms I hear (and generally agree with) about the visualisation field is that it is good at celebrating and promoting itself inside the community but struggles somewhat to effectively penetrate beyond into the wider business community.

That could be true, though I’ve had many people approach me from the business world wanting visualisations, tools and insights.

There are clear benefits for the communications industries: journalism, advertising, marketing, web.

But I think businesses hungry for new insights and innovations may be disappointed by what this field can offer.

I think data visualisations can provide insights – sometimes. But the data has to be worked and moulded and played with journalistically – to reveal those insights. I’m not sure visualisation per se or automated tools can do this. You need a brain.

Data needs humans to be interesting.

How do you go about articulating and selling the benefits of data visualisation to those new to the subject, especially given that a direct return on investment can be somewhat tenuous?

It’s self-evident isn’t it? Information design either works or it doesn’t. If it does, it’ll convey its message and be understandable to virtually anyone. Like a road sign. Or an iPod. No selling of benefits required.

I’m extremely grateful to David for taking part in this interview especially as his time is unquestionably in very high demand right now. He has offered a great insight into his life as a prominent visualisation designer and readers will find some invaluable advice contained within this article, especially in terms of career pathways and design process. I wish him all the best in his ongoing quest to conquer the world of visualisation in 2011 and look forward to his announcement of a follow up to his book. For those of you who don’t already follow David in any shape or form, you can buy his book here, visit his blog and see his relentless stream of new designs here and follow him on twitter via @infobeautiful.

Visualisation Reflections: #5 Head of Data Visualisation

This is a follow-up post to my fifth article in the Visualisation Insights series which I published just before Christmas. The purpose of this companion series is to optimise the learning opportunities from each insights article, reflecting on the ideas, issues and observations to emerge.

Why did I choose this subject?

I first came across Alan when I stumbled upon a couple of his presentation decks on Google (here and here). I was very excited to learn about the existence of the Data Visualisation Centre and also very drawn to the perspectives and messages Alan was sharing in these presentation snippets.

His unique position as Head of the Data Visualisation Centre, for the Office for National Statistics, offers an outstanding platform from which he can effect best practice visualisation techniques and so I was very keen to explore this world with him.

Impressions prior to the interview?

As I’ve just mentioned, my first and only exposure to Alan, to date, was via a series of presentation documents. Whilst the contents of these slides only provided a very brief insight into his views, you could immediately pick up the sense that here was somebody who had a very balanced and principle-driven expertise in visualisation.

It was more than enough evidence for me to track him down for an interview.

Impressions after the interview?

Responding to the large volume of questions and topic areas I threw at Alan, he has provided a fantastically insightful article that absolutely encapsulates the aim of this series.

There are too many highlights to mention them all, so I’d simply urge everyone to have a thorough read through. However, there are a few things I would pick out.

It’s fascinating to discover how people have arrived into the world of visualisation. In Alan’s case this journey commenced from his background in cartography. He typifies many of the people I have come across in the visualisation field who discovered it by challenging conventions, asking questions about why things are done the way they are, ‘surely there is a better way’…

Alan talks about “Rosling’s last 6cm” and this is a great reference to capture the traditionally abandoned importance of effective design of communication. He goes on to mention that the remit of the Data Visualisation Centre reflects how “ONS takes the communication of its statistics seriously” – this is hugely encouraging given the vast array of statistical data this organisation is sat on.

Central to Alan’s appeal, a theme that stands out in the presentation slides I mentioned above, is his exceptional appreciation of the purpose of visualisation and its principles. There is so much information to support this but a couple of passages especially reinforce it:

I also admire Hans Rosling who has been very keen to push visualisation to support decision-making and public policy. I am always inclined to support the people who think of visualisation in these terms rather than as a pure ‘beautification’ exercise.

The information in the data is beautiful, not the graphic itself.

A particularly interesting insight comes in response to my question about the opportunities and threats that exist with the era of open data, especially in respect of non-experts. Alan offers a very articulate and reasonable response, one that challenges my own perceptions:

…I find the issue of non-expertise interesting and potentially patronising. Traditional notions of statistical literacy, based on numeracy, are ripe for challenging, I think. For example, here in the UK, Camelot had to withdraw a lottery scratchcard a couple of years ago (‘cool cash’) because it required people to compare negative numbers and too many people found that difficult. Poor numeracy was acting as a barrier to problem-solving. However, it would have been perfectly feasible to present the same information in a different way (for example, to ask them to identify a warmer or colder temperature, rather than a higher or lower number) and many of the same people would have been able to solve the problem. We are sometimes lazy in that we are happy to let an issue like poor numeracy act as a barrier when there are ways around it. I’m not saying we should ignore numeracy issues – but improving numeracy is something with a very, very large turning circle. 15 million adults in the UK lack Level 1 numeracy skills – but these are people – intelligent people – who need to handle data now, so there’s a challenge up front for people working within the data/information visualisation.

This final sentence acts as a highly motivating ‘call to arms’ for us in the visualisation field to keep pushing for new ways of helping to engage with those who struggle with numeracy matters.

Looking forward, the 2011 census will offer an intensely deep data event for Alan’s team to tackle in pursuit of their principal motto “making numbers meaningful”. This will be a huge challenge but also a great opportunity for them to deliver access and facilitate understanding for a new generation. As the following paragraph suggests, this particular data visualisation task appears to be in safe hands:

Visualisation gives us a chance to let people see official data in a more personal, meaningful way. They can visualise their own inflation patterns based on their own spending habits, compare their neighbourhoods and cities with others, see how long they are expected to live, all over a cup of coffee, using just a web browser. That’s an exciting thought and I think we are only just scratching the surface. The real power of all this convergence is going to be what happens when we bring different data sources together – as long as it’s done skillfully, rather than just for the sake of it.

Finally, my favourite passage in this interview, when I asked Alan to describe his favoured learning resources:

There are plenty. Edward Tufte’s books are probably the ones you would keep on your coffee table if you had normal folk coming round for a coffee. Stephen Few’s books are the safety-first manuals we keep around next to the First Aid kit.

The question is what do you put on your coffee table when the weird folk come round??

Many thanks again to Alan for his generous time, effort and exceptionally thorough and interesting responses to my many questions. I’ve re-read this article many times and keep finding something new each time. I wish him and his colleagues at the Data Visualisation Centre all the best for 2011. You can keep track of Alan’s discoveries via his graphboy del.icio.us account.

Look out for future insights articles, with many interesting interviews and interviewees lined up…

Welcome to 2011!

I hope all readers and visitors have had a pleasant break over the past week or so.

Things have been pretty quiet on this site and recently across the blogosphere (hate that word but it does the job) but judging by the increased number of feeds and tweets I’ve been picking up today, it seems people are starting to gear up for another hard slog.

The purpose of this first post of 2011 is really just to mark the launch of the new year and to specifically announce a few things that will be coming up in the next month or two.

I wanted to decorate this post with an apt image of a calendar, one that embodied the theme of visualisation in some way. I decided to use this example of a niko-niko calendar. It is an application of kanban, a Japanese concept related to lean/just-in-time, used in some agile project management approaches to visibly capture the mood of a team and the feelings of individual members.

In the week that saw Nicholas Felton launch the Daytum iPhone App, I thought it was a timely alternative idea to share with those possibly seeking a less sophisticated method of tracking and visualising personal progress through the coming year. It is only the 4th of January so you still have a chance to remember your feelings from 3 days ago – though I’m not sure a green, hungover-looking face is available with this concept…

Strata Conference

The first item is a reminder of the upcoming O’Reilly Strata Conference, taking place between 1st and 3rd February in Santa Clara, CA.

I’m delighted to say that Visualising Data readers can now benefit from a 25% discount off the registration fee by using code str11vsd or by clicking here. For those who can make it I would highly recommend you get yourself registered, particularly if you can do so before 6th January as the early-bird $250 discount is still available.

Since I last posted about this event, a further set of great speakers and topics have been announced and squeezed in to an already densely packed schedule. I’ll be going through in more detail next week about the specific subjects and sessions that I would really recommend visiting.

Unfortunately, I won’t be able to make it over in person but my recent visualisation contest winner, Jan Willem Tulp, will be reporting about his experiences and key insights.

Coming Posts

I’ve a great set of visualisation insights posts lined up for the next week or two, with more still in the pipeline. Amongst these will be the David McCandless interview I mentioned a while ago.

I’ve been really flattered by the willingness and effort that everybody I’ve asked to take part so far has demonstrated and I hope to be able to continue to share these (even if I say so myself) really interesting perspectives throughout the coming year.

I’m also finally getting closer to publishing my ultimate resources series. I’ve been talking about these for ages but should now be looking at a February release, having backed out of a structural blind alley that was proving hugely inefficient for compiling the pieces.

Besides a few visualisation design projects, I’ve also got plans for a number of training sessions which will probably be around late spring/summer, the possibility of starting work on a book and also a big ongoing decision I have to make about whether to embark on a PhD…

2011 should prove to be a busy and interesting year, I wish you all the best.