This is the fifth article in my Visualisation Insights series. The purpose of this series is to provide readers with unique insights into the field of visualisation from the different perspectives of those in the roles of designer, practitioner, academic, blogger, journalist and all sorts of other visual thinkers. My aim is to bring together these interviews to create a greater understanding and appreciation of the challenges, approaches and solutions emerging from these people – the visualisation world’s cast and crew.
Alan Smith is Head of Data Visualisation at the Data Visualisation Centre for the UK Office for National Statistics (ONS). He is a visualisation expert with an extensive portfolio of innovative data visualisation solutions and is a well-known champion of the importance of graphics in data analysis.
My reasons for approaching Alan about conducting a Visualisation Insights interview are quite straightforward. His position as head of this important unit gives him unique influence and the opportunity to effect best-practice visualisation techniques across the critical hub of data and statistical analysis provided by the ONS.
Just to be clear, the views Alan expresses in this interview are his own and not necessarily those of the ONS.
Please can you give me some background about the Data Visualisation Centre? How long has it been established? What motivated its creation?
We started the Centre in summer 2007. ONS has a central Methodology Directorate – comprised mainly of small teams with expertise in the various elements of statistical production (designing surveys, analysis, estimation etc) and the one thing that seemed to be missing was a dedicated team for getting information back into people’s heads – what Hans Rosling calls ‘the final 6cm’. For some years prior to this, I had been running a GIS and mapping team in ONS’ Geography division, where I already had a cartographic visualisation remit. But it became pretty clear that the distinction between mapping and other forms of visualisation was becoming increasingly blurred, both in terms of tools and principles, so establishing the Centre was a great step forward.
What would you describe as being its principal remit?
On a broad level, I would say it’s to make sure that, as a producer of an enormous range of statistics, ONS takes the communication of its statistics seriously. What we’ve found over the last few years is that there are a lot of people producing statistics in ONS, and across the wider Government Statistics Service (GSS), who really care about their outputs but have lacked any formal support on how to implement the things they were thinking about. Our unofficial motto is ‘making numbers meaningful’ and that is a principal goal – to allow users of our statistics to bridge the gap between data and meaningful information.
What is the make-up of the organisation?
We are a small team, currently just 4 of us. Our backgrounds vary, which is what I wanted from the outset – a multi-disciplinary skill set is much more flexible than 4 clones would be. So we have myself with a cartographic background, one colleague has a PhD in Visual Perception, one is a psychology graduate with a background in data collection and finally, we have a GIS/spatial analysis graduate.
Who are the key clients?
Key clients, inevitably, are internal to ONS, though we are keen to offer as much help as we can across the wider GSS and beyond. ONS delivers a broad range of outputs right across the economic and social spectrums. With many of our web-based visualisation projects, we’ve been keen to share them across the public sector for reuse (with other data sets etc). There’s been some enthusiasm for that too, which has been tremendous. Looking forward, the forthcoming 2011 Census will deliver an avalanche of rich data which we are looking forward to working with.
How would you describe the profile of its work (advice, research, design work)?
It’s a mixture of everything really – which is what keeps it fresh. For a small team, we have a lot of hats to wear, so yes, there is an advisory/consultancy role, training, we develop and maintain standards on data presentation and also get involved with design and production work. This includes programming and more traditional DTP-style work.
What is your own background in Data Visualisation?
I can really trace this back to my time at the University of Colorado in the early 1990s, where I studied cartography – using ink and vellum. At the time, GIS was beginning to gather momentum and I remember being simultaneously excited and frightened by what was going to happen to the cartographic field going forward from there.
How long have you been aware of Data Visualisation as a subject field? What was your ‘eureka’ moment?
One of my major complaints about early attempts to produce any kind of maps on the Internet was how clunky, awkward and ugly they tended to be – essentially because software vendors were trying to replicate full blown GIS on the web. I began to miss the control over design you get with pen and paper. While I was fishing around for a dissertation topic for my MSc (in GIS), it became clear that the solution was not going to come from the traditional GIS community, because, at the time, they just didn’t ‘get it’. I began searching further afield and ended up finding many more kindred spirits in the visualisation community rather than in straight-laced GIS.
Who would you describe as being the most influential (to you) practitioners/authors around visualisation?
There have been plenty. For me, Andreas Neumann was the first to demonstrate that you could deliver elegant, interactive data visualisations, combining maps, graphs and other visual forms, using nothing more than a web browser and open standards. His early work – such as the social patterns and structure of Vienna map – was tremendously influential. In the cartographic field, Jason Dykes, Mark Harrower and Danny Dorling spring to mind as people whose work make me sit up and think. I also admire Hans Rosling who has been very keen to push visualisation to support decision-making and public policy. I am always inclined to support the people who think of visualisation in these terms rather than as a pure ‘beautification’ exercise.
There’s a new wave of statistically literate journalists who are producing some great work – for example, Mark Easton, Ben Goldacre, Michael Blastland and Simon Rogers. If we can help people like them get to the information in our data quicker, then that’s a key objective of ours met. In my own team, Dr Steven Rogers, who has a phenomenal understanding of perceptual systems, is a continuing source of entertaining discussions and new ways of looking at familiar things.
More recently, I’ve also spent more time looking backwards than forwards. So people like William Playfair, who invented modern data graphics as we know them, Charles Booth, Willard Brinton, Jacques Bertin. It’s tremendous to look at what they did and think what they would make of where we are now (probably, simultaneous horror and excitement).
What learning resources (eg. websites, blogs, journals, books etc.) do you most commonly refer to or immerse yourself in?
There are plenty. Edward Tufte’s books are probably the ones you would keep on your coffee table if you had normal folk coming round for a coffee. Stephen Few’s books are the safety-first manuals we keep around next to the First Aid kit. ‘Significance’, a magazine little known outside the statistical world and now published jointly by the Royal Statistical Society and the American Statistical Association, has some lovely articles which I would seriously encourage people to read. I’m always coming across interesting web content, which I try to bookmark on my delicious.com account. The New York Times website is great, there’s some tremendous stuff there. For people involved in communicating about statistics, there’s the BlogAboutStats. My favourite visualisation blogs are information aesthetics – and yours, of course.
Are there any software developments that particularly excite you?
I am particularly excited by the second-coming of SVG (Scalable Vector Graphics), which is now embedded in the nascent HTML5 specification. It is becoming exactly what I first hoped it would be – HTML for pictures. With proper support across most browsers, we should see some really exciting things coming from that area now.
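To make the ‘HTML for pictures’ idea concrete, here is a minimal sketch (my own illustration, not ONS code) of what inline SVG amounts to: a tiny bar chart built as plain markup, assembled here with a short Python function. The data values and function name are invented for the example.

```python
# A minimal sketch of SVG as "HTML for pictures": one <rect> per value,
# scaled against the tallest bar. Data values are invented.

def svg_bar_chart(values, bar_width=40, gap=10, height=120):
    """Return an SVG string drawing one <rect> per value."""
    peak = max(values)
    bars = []
    for i, v in enumerate(values):
        bar_h = round(height * v / peak)   # scale bar to tallest value
        x = i * (bar_width + gap)
        y = height - bar_h                 # SVG's y-axis runs downwards
        bars.append(f'<rect x="{x}" y="{y}" width="{bar_width}" '
                    f'height="{bar_h}" fill="steelblue"/>')
    width = len(values) * (bar_width + gap)
    return (f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{width}" height="{height}">' + "".join(bars) + "</svg>")

print(svg_bar_chart([3, 7, 5]))
```

Paste the printed markup straight into an HTML5 page and any modern browser renders it – no plug-ins, just open standards.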
In terms of software, web-based tools like MapShaper and IndieMapper are very exciting. Also, the profound influence that something like Google Maps has made on the public subconscious. Although there are many concerns over the degradation of traditional geography skills, you can’t argue that things like Google Maps and Location-Based Services in general haven’t made geographic information meaningful to people.
Regardless of the fight between Adobe and Apple, I am still a big admirer of both companies. They make great hardware and software that makes our jobs (usually) easier and more enjoyable.
Are there any particular trends in the field that particularly excite you?
The move to open standards for web visualisation. Even Microsoft are now on board. A web browser has now got to be the most ubiquitous piece of software in the world and they now all have built-in support for sophisticated visualisation techniques like animation and interactivity. It’s a real pity when web content fails to take real advantage of this. Free libraries like Processing.js should encourage some real innovation in visualisation too.
More broadly, the fact that people now expect the web to be engaging means there are real opportunities for bringing information to people who previously wouldn’t have gone near official statistics. Visualisation gives us a chance to let people see official data in a more personal, meaningful way. They can visualise their own inflation patterns based on their own spending habits, compare their neighbourhoods and cities with others, see how long they are expected to live, all over a cup of coffee, using just a web browser. That’s an exciting thought and I think we are only just scratching the surface. The real power of all this convergence is going to be what happens when we bring different data sources together – as long as it’s done skillfully, rather than just for the sake of it.
Are there any matters that particularly disappoint or frustrate you?
Yes – but a lot of the things that generate frustration also mean we are never short of work! I get frustrated by all of these things at various times, but have to recognise we have had minor and major victories in those areas too.
What is your perception of the era of open data and transparent access – do you feel this is a positive move or is it somewhat superficial given we are essentially making data available to non-experts (in both analysis and domain knowledge)?
Overall, I think it’s a very positive step, though it’s not without issues and challenges. If I ever needed to demonstrate that numbers alone, without context, do not equal meaningful information, then the enormous amount of content dumped onto things like data.gov.uk is almost a self-authored case study. Just making data available, by itself, solves nothing.
However, I find the issue of non-expertise interesting and potentially patronising. Traditional notions of statistical literacy, based on numeracy, are ripe for challenging, I think. For example, here in the UK, Camelot had to withdraw a lottery scratchcard a couple of years ago (‘cool cash’) because it required people to compare negative numbers and too many people found that difficult. Poor numeracy was acting as a barrier to problem-solving. However, it would have been perfectly feasible to present the same information in a different way (for example, to ask them to identify a warmer or colder temperature, rather than a higher or lower number) and many of the same people would have been able to solve the problem. We are sometimes lazy in that we are happy to let an issue like poor numeracy act as a barrier when there are ways around it. I’m not saying we should ignore numeracy issues – but improving numeracy is something with a very, very large turning circle. 15 million adults in the UK lack Level 1 numeracy skills – but these are people – intelligent people – who need to handle data now, so there’s a challenge up front for people working within data/information visualisation.
What role can the Data Visualisation Centre play in safeguarding the positives of this shift in attitude towards data access?
On a simple level, we’d like to see ourselves as just one place where people can come to for ideas on how to visualise data, recognising that there are many others too. I am particularly keen that we play a role in what at ONS we call ‘Wider Public Reporting’, which involves how we engage with people beyond traditional statistics users. There is definitely a need to unlock the expertise and insight that is acquired during the process of making official data. One option for this is syndication of one sort or another. I recognise that not many people are going to browse ‘statistics.gov.uk‘ over a sandwich at lunch (2 of their least favourite words in a URL!) – but they might look at the same content if it was hosted on the BBC, The Guardian, or The 10 O’clock News for example. The Met Office is a good example, I think, of a recognised authoritative source of official data, which is served to many different channels in many different forms for a variety of audiences.
Could you provide a brief outline of how a typical visualisation project may come about and evolve?
Normally, a spark – and buy-in – from the parts of ONS that produce the data is the starting point. How things proceed from there depends on what that spark is. I remember presenting the first animated population pyramid as a fait accompli to our demography area, mainly because we were playing with their data and wanted something less complex than a map to test out our ideas. But most projects end up being a collaboration based on their knowledge of their data and how it is used together with our understanding of how we can exploit it.
Which visualisation tools do you mainly use in the DVC? Are there any that you don’t currently use that you would like to?
Our general approach has always been to figure out what we wanted to achieve and then work out a way of implementing it – so we never let the tool determine the way forward. Having said that, we very often end up with the usual suspects – Adobe Illustrator, Photoshop, Flash, various open web technologies (HTML/SVG/CSS) because they are so flexible and between them cover most of our requirements. Excel is never far away either, as most data at one stage or other seems to pass through it. We tend to avoid tools that have proven difficult to deploy across the web, sometimes restricted by our own infrastructure or the ability of users to access it.
How would you describe your progress so far in promoting and succeeding in the better practice of visualisation/design?
It is a never-ending job. But more people across official statistics now take notice, so that’s a partial success.
Do you have a particular design style/standard that you use for reference in handling visualisation project work?
Yes – to try and make the designs unobtrusive and place the visual emphasis firmly on data and information. We are gradually evolving a house style that reflects that. As with all things, we don’t want the design style to be a straitjacket, but a little consistency is a good thing for the end user.
How do you handle the issue of aesthetic design with functional performance of a visualisation?
Carefully. It is not a crime to make a data graphic look attractive and engaging – but if it is over-egged, it can actually be counter-productive to your overall goal. So we tend not to go for gratuitous visual effects (drop shadows, 3D treatments, shimmering reflections) as these can lead you down a slippery slope. We’re also very careful in our use of colour – both from a web accessibility perspective and from an aesthetic angle. Another thing we have to be careful of in the statistical arena is natural colour associations with certain topics. Often, the simplest symbology is the best. For example, it is easier for the eye to estimate quantity from length than from area – so lines, not blobs, are often more effective if you want people to interpret quantity correctly.
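The length-versus-area point can be made concrete with a little arithmetic. A quick sketch of my own (not an ONS guideline, values invented): to show one value four times another with circles, the radius must scale with the square root of the ratio; naively scaling the radius by the value itself inflates the perceived quantity quadratically.

```python
# Why area encodings are easy to get wrong, shown with two values
# where one is 4x the other.
import math

value_a, value_b = 1.0, 4.0

# Length encoding: a bar 4x longer reads directly as 4x the quantity.
length_ratio = value_b / value_a

# Correct area encoding: the radius must scale with sqrt(value) ...
correct_radius_ratio = math.sqrt(value_b / value_a)

# ... but scaling the radius by the value itself gives a circle with
# 16x the area (area grows with the square of the radius).
naive_area_ratio = (value_b / value_a) ** 2

print(length_ratio, correct_radius_ratio, naive_area_ratio)
```

The bar is read correctly at a glance; the mis-scaled circle overstates the difference fourfold, which is one reason lines so often beat blobs.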
How do you articulate the true benefits of data visualisation?
When people use visualisation as a route into data that would otherwise pass them by – so, catching a wider audience, or revealing insight that would otherwise be hidden. These are things that data visualisation brings – to experts and non-experts alike. Playfair said that ‘…no study is less alluring or more dry and tedious than statistics…unless the mind and imagination are set to work‘ – that’s the role of data visualisation. The information in the data is beautiful, not the graphic itself.
Do you ever come under particular pressure from clients/customers claiming to know what they want in terms of presentation of information? How do you handle this?
Yes – and many of them are right! The “ageing map of the UK” we produced was a reflection of a very well developed sketch from Shayla Goldring in our demography team, who had a clear idea of what she wanted to achieve. Having said that, there have also been occasions where we have had to step in and advise our clients of best practice and, if necessary, enforce it. But it is very rare for things to turn into a stand-off.
Would you be able to identify an example of a particularly effective visualisation project you have worked on?
In terms of sheer market penetration, the animated population pyramids, which have proliferated into many different variants, have been a favourite – and a good example of starting out with something basic and rapidly iterating it based on customer feedback. We have also had plenty of people from around the world reuse these templates with their own data, which has been very gratifying. On a personal level, satisfying my cartographic DNA, the flow-mapping in our CommuterView product was a lovely project to work on, I want to revisit that one in the near future with a modern toolkit.
Have you any examples of particularly creative or innovative approaches beyond the standard graphical approaches (eg. bar/line graphs) that you have recommended?
Boxplots. We’re big fans of boxplots for multiple distributions. They were devised by John Tukey, the father of Exploratory Data Analysis, who championed the use of graphics for spotting things you’re not expecting to see. There have been a lot of recent innovations – technology’s been great for encouraging it – but my view is that there’s a lot of mileage in applying new techniques to the standard displays – the best of both worlds!
How do you assess the success of your visualisation services?
Repeat demand! We do monitor feedback, both good and bad, and hope it will help us make better products in the future.
Finally, how would you like to see the Data Visualisation Centre evolve/progress over the next 3-5 years?
In the current climate, survival is success! Beyond that, I’d like to feel that success is making the Centre recognised as an integral part of ONS’ workflow, delivering core outputs, not just an ‘added value’ thing. If it can help draw attention to the wealth of information in ONS data, then so much the better.
I’m extremely grateful to Alan for his wonderfully candid and detailed responses to the many questions I posed him! He has perfectly encapsulated the purpose of these insights articles, providing some rich perspectives from his unique role in this field. I wish him and his colleagues at the Data Visualisation Centre all the very best for the future. You can keep track of Alan’s bookmarked discoveries via his graphboy delicious.com account.
It’s been a great start to the week in the world of visualisation with Paul Butler’s terrific design showing some fascinating patterns of Facebook friends and Nathan Yau reminding us of some of the best work that has taken place this year. Now, the New York Times graphics team have joined the party with the stunning project ‘Mapping America‘.
Inspired by the excellent Chicago Boundaries project by Bill Rankin, Matthew Bloch and Shan Carter have created an interactive mapping interface which layers a range of demographic information for every block in every city of America. This creates an amazingly detailed and revealing picture of the diversity of America’s population.
You can choose to display a range of different layers of information relating to race and ethnic distribution, income patterns, housing and domestic matters as well as education subjects.
The brilliance of this work is particularly demonstrated by the dexterity of its interactivity which allows you to smoothly navigate across the country, search for zip codes, zoom in and out to enhance the levels of detail. Helpfully, hovering over individual blocks brings up a dialogue box with specific data for that area.
Congratulations to all involved in this work, it really is exceptional.
At the end of each month I pull together a collection of links to some of the most relevant, interesting and useful articles I’ve come across during the previous month. If you follow me on Twitter you will see many of these items tweeted as soon as I find them. Here’s the latest collection from November 2010:
Fathom | Video of Ben Fry talk at UX Week in San Francisco | Link
Visualology | The Gestalt of Slides | Link
Data Pointed | ‘Measuring The Universe’ – Roman Ondák’s living infographic | Link
Imperica | Stefanie Posavec on the process of visualising data | Link
Dynamic Diagrams | Science visualizations way small and way big | Link
BBC | Diagrams that changed the world | Link
Guardian | Analysing data is the future for journalists, says Tim Berners-Lee | Link
Flowing Data | Tutorial on how to make bubble charts | Link
Visual Eyes | Introducing Visualeyes – a web-based authoring tool to weave images, maps, charts, video and data into highly interactive and compelling dynamic visualizations | Link
Fell In Love With Data | Video Interview: JD Fekete talks about Jacques Bertin | Link
New York Times | The Opinionator Blog – Stories vs. Statistics | Link
Infosthetics | Visualizing data using long-time exposure photos | Link
Perceptual Edge | Unit charts are for kids | Link
Infosthetics | Impure: a new visualization programming language for non-programmers | Link
10,000 Words | 8 creative ways to use RSS feeds | Link
Wired | 80 gigapixels of London’s skyline | Link
Excel Charts | Consistent dashboard design: write a simple sentence | Link
Creative Review | New York magazine: data done right (Ed – or is it?) | Link
Don Norman’s JND | Looking Back, Looking Forward | Link
Drawar | Don’t call it minimalism | Link
Flowing Data | Format and clean your data with Google Refine | Link
Nieman Journalism Lab | Mario Garcia’s path to better designed newspapers | Link
Datavisualization.ch | How We Visualized 4.3 Million Votes | Link
Lars’ Notes | Slides of InfoVis 2010 presentation “How Information Visualization Novices Construct Visualizations” | Link
LSR Online | How visualization is being used in Leicestershire to inform policy | Link
Flowing Data | Open thread: How do you start working on a data graphic? | Link
Eager Eyes | Part two of Robert Kosara’s interview with Swivel “Part 2: Solving A Single Problem” | Link
Flowing Data | Guest post by Joan DiMicco “Telling Stories with Data, A VisWeek 2010 Workshop” | Link
Noisy Decent Graphics | The history of the colour wheel | Link
Drawar | The solo freelance designer’s plateau dilemma | Link
On Friday I announced the two finalists battling it out in my ‘visualisations in the wild’ contest. Over the weekend and right up to an hour ago I’ve been inundated with email messages, tweets and post comments from readers casting their vote for which visualisation they believe best captures the essence of this competition.
The contest is now closed and I’m delighted to announce that Visualisation B, the Duracell Powercheck, has been declared the winner with 75.2% of the votes.
Congratulations to Jan Willem Tulp who wins a Full Conference Pass to the O’Reilly Strata ‘Making Data Work’ conference taking place in Santa Clara, CA in February 2011.
Commiserations go to Naomi B Robbins who was a very respectable runner-up and received a lot of support for her excellent submission but ultimately just not enough to beat Jan.
Many thanks to all who voted and also, once again, to those who entered the competition.
The theme of this contest was visualisations in the wild and the challenge was to submit a photo of a great example of best practice information design being used in everyday life. I was looking for unique examples that demonstrated the value and power of good visualisation practice, designs that help to make everyday life go by that little bit more smoothly.
The contest has now closed and I have narrowed it down to two entrants who now go forward as the finalists. One of these was submitted very early in the contest and was the clear leader until late yesterday evening when I received the other submission.
Originally I was going to judge the winner myself but I think it will be more fun and more democratic to open this process up. I would therefore like to invite Visualising Data readers, subscribers, followers and casual visitors to submit their choice for which is the best visualisation in the wild example (see below for instructions).
(Click on image above for larger view or click here for the original AFSC design in pdf)
Entrant’s Description: 3ft Quaker appeal to spend on peace – not war. Shows % discretionary budget on military (red) and other categories. Powerful help to visualize and understand big numbers. The data is also available on http://www.oneminuteforpeace.org/budget in a pie chart, but I find the long strip to be much more powerful.
(Click on image for larger view or click here for original demo of Duracell Powercheck)
Entrant’s Description: Duracell’s interactive animated single bar chart shows remaining battery charge in yellow when holding the battery on the dots.
Your assessment for choosing your preferred visualisation should be based on two criteria:
The only other rules are that the entrants themselves cannot submit a score and I also won’t accept any input from Sepp Blatter or anyone else at FIFA.
The closing date/time for all votes is 12:00 UK time on Monday 6th December. I’ll announce the winner shortly after.
Good luck to the finalists and thanks for all the entries!
Fascinated to read in the mainstream press about Tableau’s decision to remove a series of visualisations published by WikiLeaks using Tableau Public, following pressure from Senator Joe Lieberman. Elissa Fink of Tableau has published an explanation of the background behind this decision.
The key excerpts from this explanation are:
Our terms of service require that people using Tableau Public do not upload, post, email, transmit or otherwise make available any content that they do not have the right to make available. Furthermore, if we receive a complaint about a particular set of data, we retain the right to investigate the situation and remove any offending data, if necessary.
Our decision to remove the data from our servers came in response to a public request by Senator Joe Lieberman, who chairs the Senate Homeland Security Committee, when he called for organisations hosting WikiLeaks to terminate their relationship with the website.
This is clearly a very difficult matter which has caused a great deal of reaction in support of and in opposition to Tableau’s decision as you will read in the comments section.
My view is that it is very unfortunate that Tableau has had to come to this decision. I don’t necessarily think they believe their ‘terms of service’ reason, especially given this data is now so widely cascaded on the Internet. Unfortunately, it does create a tricky precedent that will be interesting to keep an eye on over time, with Tableau users likely to test the boundaries of this policy approach on other subject matters and data contexts.
I generally support what WikiLeaks stand for (though I find the cablegate release less about whistle-blowing and more about diplomatic gossiping) but you cannot ignore or be complacent about the massive baggage that comes with even the slightest association with the organisation. The reactions this organisation stirs around the world are quite incredible. I have had limited but first-hand experience of dealing with the same potential difficulties of association.
Do we actually expect a private company like Tableau to be waving the flags on the frontline of free speech activism? Let’s be realistic, no. But I’m sure their organisational values are generally sound, so let’s not go over the top on the possible shortcomings of principles they hold dear and instead take a look at the bigger picture.
The action Tableau has taken is regrettable, but it has been taken in the context of pressure being mounted by the American government and Joe Lieberman, in particular, on organisations associated with WikiLeaks, no matter how tenuously. You have to ask, though: had the pressure really become that heated, or could they have held firm a bit longer? Unfortunately for them, and probably us too, Tableau are likely to be bloodied and bruised by taking this decision.
A couple of items of news have popped up this week which give further evidence of the growing mainstream appreciation and appeal of visualisation. The fact both involve the BBC is very encouraging and continues their recent coverage of this subject following their Newsnight programme featuring David McCandless back in August.
The Joy of Stats
The first item regards the announcement of a programme titled ‘The Joy of Stats‘ which is going out next Tuesday 7th at 21:00 on BBC Four:
A documentary which takes viewers on a rollercoaster ride through the wonderful world of statistics to explore the remarkable power they have to change our understanding of the world, presented by superstar boffin Professor Hans Rosling, whose eye-opening, mind-expanding and funny online lectures have made him an international internet legend.
As many have already remarked, we in the UK are very fortunate to be able to see this though no doubt it will be made accessible in some way, shape or form to non-UK followers in due course. I’m sure it will prove to be a great show if the preview clip below is anything to go by.
A couple of grumbles though. Firstly, I always struggle to get my visuals around (if indeed visuals can get around) log scales so I’m a bit concerned that your average punter watching this might be a bit misguided as to how values are progressing along the x-axis. It could also be slightly misleading to the untrained eye that the y-axis does not begin at zero.
Secondly, why can’t descriptions of programmes like this avoid using lazy phrases like “superstar boffin” and observations like “Rosling is a man who revels in the glorious nerdiness of statistics“? This is such patronising rubbish, it really winds me up. Why is it nerdy to be passionate about something? Especially when it is around a subject as important as global health and poverty. It’s not like these are particularly advanced examples of statistics anyway. Rant over.
BBC Backstage DataArt
The next item to grab my attention was discovering the BBC’s Data Art project which is an interesting sub-site focused on developing and showcasing examples of innovative visualisations relating to BBC data. The project is a collaboration between BBC Learning Innovation and the Centre for Research in Education, Art and Media (CREAM) at the University of Westminster.
The site contains a range of useful resources, guides and data sources as well as full details behind the development of all the visualisation projects. One example project currently being profiled allows you to compare mentions of certain terms in the news coverage on BBC Americas. For example comparing terms such as Obama vs. Bush produces the display shown below and navigating around this interface leads you to explore the database of specific news items.