Visualisation Insights: #1 The visualisation designer

This post is the first in a series I am commencing called Visualisation Insights. The purpose of this series is to provide readers with unique insights into the field of visualisation from the different perspectives of those in the roles of designer, practitioner, academic, blogger, journalist and all sorts of other visual thinkers. My aim is to bring together these interviews to create a greater understanding and appreciation of the challenges, approaches and solutions that exist in the worlds of this collection of people – the visualisation field’s ‘cast and crew’. I will be following each interview with a post reflecting on some of the key insights to emerge.

The World Cup is now but a distant memory to most of us, unless you happen to come from Spain of course, but one of the key aspects of the tournament from my point of view was the array of innovative infographics and visualisation devices used by media organisations and websites to enhance their coverage (see my collections part 1, part 2 and part 3).

One of the most shared and discussed visualisation designs was conceived, designed and produced by New York designer Michael Deal on behalf of Umbro, the UK sportswear brand. Michael designed a novel, elegant and hugely insightful approach to visualising each of the 64 games based around three important statistics that help describe the ebb and flow of the game: completed passes, attempts on goal and goals scored. These designs were neatly compiled on to a single poster layout.



Hugely impressed by this design solution, which has strong themes of the layout and data density of Tufte’s sparklines and Few’s bullet graph, I invited Michael to impart his thoughts about the process he had gone through, the design decisions he had taken and the visualisation execution methods he had employed.

You can see more of Michael’s personal work at www.mikemake.com.



Michael, can you give me a brief outline of your background as a designer? Also, if possible, how would you describe your particular focus or expertise as a designer (in terms of the sort of project you typically undertake/enjoy)?

I’m 23 years old and I’ve been practicing graphic design in New York City for almost a year, working now at Pentagram. I’m still figuring out what my dream projects might be but recently I’ve been enjoying visualizing data in custom, “hand-crafted” designs.

How would you describe your interest in football?

I’ve been playing on teams since I was in elementary school. It’s a sport I love to play and deeply respect.


How did you get involved in the Umbro project?

Umbro had seen my work and approached me a few months ago with the idea of creating infographics for the World Cup.

What was the brief you were given? Was it fairly open (“we need some visualisations about the World Cup”) or more specific (“we need visualisations about each match showing passing, attempts and goals”)?

It was much more the former. I wish all clients could be as fantastic as Umbro has been, specifically my contact there, Aaron Lavery. The brief was very open and we had a lot of great dialogue. They showed full trust in me as the designer, allowing me to explore any option I thought might work.

How did you arrive at the decision to visualise the variables you chose (i.e. passing, attempts and goals)?

One of the main challenges of this project was to design the entire graphic before having the actual data, so that when the matches were played, I could plug the data in and get the graphic out the door right away. This approach ruled out taking more editorial angles where one might assess the matches that had been played and use infographics to highlight interesting aspects unique to this tournament.  So the practical direction was to create an infographic that could potentially receive data from any given World Cup, which offered the exciting possibility to experiment with a graphic language for visualizing any given football match.

I was interested in presenting possession, and I realized that mapping successful passes would be an easy solution for conveying the ebb and flow of possession purely with data points. The charting of shots on goal functions in a similar way, while suggesting more about where the ball was on the field.


You mention the data is supplied by Opta. Was this an arrangement already in place for you, or did you source the data yourself?

Umbro had worked with Opta before, so this was a conveniently established relationship for me.

Did you have the opportunity to explore other aspects of the vast data sets Opta hold?

There were XML files of data for every match, which could sometimes exceed 10,000 lines of code, tracking nearly every detail of the gameplay, including x/y field coordinates and a timestamp for each piece of action. This degree of granularity can really make you geek out, with rabbit holes left and right you have to make sure to step over.

Can you briefly talk me through your data preparation process? What software/techniques did you use to handle the data and produce the analysis behind the visualisation?

I’m still experimenting with more automated means of handling and visualizing data like this, so in the interest of time, it was important for me to just get it all into the more comfortable form of a spreadsheet for each game. There seemed to be a number of PC applications that converted XML feeds to Excel spreadsheets (I used Advanced XML Converter), but I found nothing for Mac, so if anybody knows of something, let me know!

I then spent a good bit of time preparing a multi-table template in iWork’s Numbers that would isolate and reorder just the data I needed upon pasting the Excel output for each match.
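As an editorial aside, the XML-to-spreadsheet step Michael describes could also be sketched in a few lines of Python, which runs on a Mac. This is only a rough illustration and not the method he used: the element and attribute names (`event`, `type`, `team`, `minute`, `outcome`) are invented for the example and are not Opta's actual feed format, and the sample events are made up.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical feed layout: these element and attribute names are invented
# for illustration and are NOT Opta's real schema. The events are made up.
SAMPLE_XML = """
<match home="Team A" away="Team B">
  <event type="pass" team="Team A" minute="3" x="41.2" y="55.0" outcome="1"/>
  <event type="pass" team="Team B" minute="4" x="30.5" y="20.1" outcome="0"/>
  <event type="shot" team="Team A" minute="5" x="88.0" y="47.3" outcome="0"/>
  <event type="foul" team="Team B" minute="28" x="52.0" y="60.0" outcome="1"/>
  <event type="goal" team="Team A" minute="116" x="90.1" y="42.8" outcome="1"/>
</match>
"""

# The three event types charted in the poster.
KEEP = {"pass", "shot", "goal"}


def extract_rows(xml_text):
    """Return sorted (minute, team, type) tuples for completed passes, shots and goals."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for event in root.iter("event"):
        kind = event.get("type")
        if kind not in KEEP:
            continue
        # Only completed passes are kept; shots and goals are kept regardless.
        if kind == "pass" and event.get("outcome") != "1":
            continue
        rows.append((int(event.get("minute")), event.get("team"), kind))
    return sorted(rows)


def to_csv(rows):
    """Serialise the rows as CSV text that any spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["minute", "team", "type"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract_rows(SAMPLE_XML)
print(to_csv(rows))
```

The filtering here plays the same role as the Numbers template: isolate and reorder just the columns needed for the timelines, leaving the rest of the feed behind.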



How did you arrive at the concept of the display? What sort of alternative solutions were you considering?

I was almost certain that mini-timelines for each game would be the way to go. I wanted compact match visualizations that would fit together in as small a space as a legible reading of the data would allow. At first it was tempting to arrange the timelines in more of a tournament bracket formation, but they needed to share a space within the same eye-span to allow for immediate ease of comparison. This was a much higher priority than it was to highlight the paths of each team as easily as a tournament bracket diagram does.

I was particularly interested in clearly presenting games with discrepancies between the frequencies of the three variables: for example, games where one team dominates possession but just can’t get a score on the board.

From a process point of view, do you tend to sketch ideas out on paper first or go straight to the data/technology and try out different compositions/iterations?

In all other types of design, I sketch a lot by hand before anything else; with data visualization, the framework of the design is usually fleshed out enough in my head to jump right in and get a proof of concept to make sure the imagined design works. In this case, that framework was generally a set of small, stackable timelines that contained different variables from a football match in the same shared space. I knew from the beginning that it would eventually work, but the number of Adobe Illustrator sketches I made to refine the final shapes for the visualization language is almost comical, with a file filled with dozens of iterations of these little timelines.

Can you explain some of the deliberate decisions you made around the visual properties in the design?

For the three event types charted in the timelines, a higher frequency of each generally indicates one team outplaying another for that respective variable, but there’s a hierarchy to the importance of each variable (the frequency ultimately being more crucial for goals than for shots than for passes). The shapes and colors representing these three variables needed to be appropriately weighted to maintain this hierarchical balance between the events.

The pass rectangles needed to fall back as the main texture of possession in the match, over which the more heavily weighted shot triangles were mapped in a darker color. The triangle was chosen for this variable because it conveys a sharp strike, the impact of a shot on goal. Successful goal circles were mapped at a higher position above the upward-pointing triangles to allow them to stand out, while conveying the upward progression in the importance of these gameplay events.

I chose a buoyant red circle for this variable to suggest (as far as one reasonably could in this case) the fiery triumph and ecstasy that comes packaged with any goal scored in the World Cup. The circle has its universal connotations of transcendental perfection and all that, and I gave it as saturated a red as possible.

I never would have guessed that such semi-philosophical abstractions would enter my design process for data visualizations, but I will welcome them anytime they feel like showing up again in the future. It was fun to think about Kandinsky’s Bauhaus ideal of the role of the circle, triangle, and square as part of a reductive universal visual script. It was also nice to get lost in the timelines as little landscapes painted with game data, with green trees, blue hills, and red suns. It definitely had me craving some Super Mario Bros action.

…and my answer to this question has officially become way too self-indulgent.

One of the key aspects of any creative process is knowing when to stop adding or subtracting from the design: how did you handle this delicate stage of the project?

It was tempting to include things like subtle time measures and tick marks throughout the piece, but I didn’t want the viewer getting hung up on that. It’s all meant to be more about comparing the general patterns within and between the games rather than being able to discern the exact minute that a goal was scored.

The shape of the timelines had enough order in their grid layout that I didn’t need any kind of framing lines or boxes, which was very liberating. If the timelines within each column don’t appear to align properly, it only appears this way because I left out all extra framing elements. It’s purely data points and country labels.

At some point in the process I was designing the timelines with data for fouls and penalties, and it was a tough decision at first to exclude it because it was definitely interesting stuff. However, as a gameplay event, it was in a separate category from the stepwise set of passes, shots, and goals, and it was steering the graphic in a different direction from what I was intending. Also, the immediate decipherability of a three-variable chart just isn’t as strong with a fourth element introduced.

What software did you use to create the designs? What features of this software did you use to create the intricate displays? For example, did you use scripting in Illustrator to automate the task of plotting the instance of passes?

I actually didn’t use any scripting in Illustrator. I copy/pasted scatterplots from Numbers into Illustrator, used the “convert to shape” effect a lot, and relied heavily on Illustrator’s Smart Guides. It was essentially built “by hand” with the spreadsheet-created charts as precise blueprints for locking the shapes in place. This is obviously a time-consuming approach, but it offered me absolute control over the entire graphic.



Of course any design brief could be delivered in many different ways, but, on reflection, is there anything you would wish to change, add or remove to this design?

I’m actually pretty satisfied with the way this piece came out. If I hadn’t been working under such time constraints, however, it would have been very interesting to see how it would have looked with the height of the green passing bars representing the length of the pass. This might have revealed some insights about the varying playing and passing styles of different teams.

I’d also be curious to see an alternate version with “unfilled” outlines of the blue shot triangles to represent off-target shots, differentiating them from solid triangles for on-target shots. Again, it was a matter of time constraints and keeping it simple.

Like football, in the way every fan thinks he can do better than the manager (like me), design naturally has its own armchair experts (probably, again, like me!) – have you received any suggestions or comments that you thought represented good ideas?

Flaneuse from contexts.org wrote a very thorough and insightful review of my visualization, and had a great suggestion about including the win/loss point values accumulated by each team advancing past the Group Stage. It can be a confusing system to keep track of if your team is in a tight situation and you want to look ahead at what hypothetical tournament outcomes would be necessary for your team to advance. There would have been a way to make this work, it just didn’t make it into the final piece.

As a result of this work, do you have any “unfinished business”/burning ambition to pursue further football related visualisations?

I would love to continue to create visualizations for football, and other sports too. There’s so much to be done.

There are often limits to what you can deduce through numbers and charts, but with the rich datasets in sports, those borders are more distant. With the universal passion you see for sports, the continuing development of in-game data-tracking methods, and the broadening mainstream application of data visualization, it won’t be long before the presence of infographics will be ingrained in our conception of how sports footage and analysis is presented in the media—much more than they are right now. We’ll be seeing more infographics in-game and on SportsCenter.

How challenging did you find the task of creating a visualisation for football compared to some of the other design subjects/briefs you have been involved in?

Being so familiar with the subject, while brainstorming ideas I could imagine myself on the field and in the action, which is an advantage you don’t always have with projects where you’re learning a lot about the subject along the way. It’s a bit cheesy, but it definitely helped. Another advantage was that a lot of other World Cup visualizations were coming out as I was designing this, and it was easy to see what was and wasn’t working.


*************************
Many thanks to Michael for his detailed, candid and insightful thoughts and also for his time in participating in this post. I wish him all the success for the future.

Visualising the Wikileaks war logs using Tableau Public

Further to yesterday’s post about the Wikileaks Afghanistan War Logs, the Guardian datablog has published a post today describing how their data journalism operation worked. This reveals some interesting insights into the way the investigation team went about handling, analysing and interpreting all this data in order to unearth and present the key stories.

They have also made available a series of spreadsheets containing the data they have used for their various visualisations: summary of casualty data, full list of IED explosions and detailed data behind 300 of the key incidents (needs accompanying glossary of military terms).

I’ve played about with some of this data in Tableau Public to see if I can unearth some interesting visualisations and also to test out the data/software in this environment. I’ve embedded a sample of them below for sharing (please note they do take a while to load up):



This first graph simply plots all types of casualty on a common scale across the 6 year period so that you get a feel for the relative levels for each category as well as any particular patterns within each year. As with all these graphs, the context of the timeline of military strategy, troop numbers and other milestone information would help explain or inform some of these patterns.



In contrast to the first graph, this second one plots all casualties on a single line graph. The approach here is to accept the noise created by the largely overlapping lines towards the bottom of the graph because you can then easily identify unusual peaks such as the huge increase in Taliban casualties particularly during Aug/Sep 2006 and Aug/Sep 2007. It is also clear to see the overall increasing bloodiness of the war over time.



This third graph plots a cumulative picture of casualty numbers by type, clearly revealing the far greater numbers of Taliban casualties. It is really interesting to see the close proximity of Civilian and Afghan forces casualties throughout the course of the war – this graph shows this far better than the individual monthly patterns of the second graph.
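For anyone wanting to reproduce the cumulative view outside Tableau, the calculation is just a running total per category. The sketch below uses made-up monthly counts and placeholder category names purely for illustration; they are not the War Logs figures.

```python
from itertools import accumulate

# Made-up monthly counts for illustration only; NOT the War Logs figures.
monthly = {
    "Category A": [4, 10, 7, 12],
    "Category B": [3, 5, 9, 6],
}

# Running total per category: each value is the sum of all months so far.
cumulative = {name: list(accumulate(counts)) for name, counts in monthly.items()}

for name, totals in cumulative.items():
    print(name, totals)
```

Because every line can only rise, two categories that track each other closely stay visibly close, which is much harder to judge from noisy month-by-month lines.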



This final graph is a heat map used to try and draw out seasonal patterns behind casualty numbers. I decided to use dual encoding for casualty levels with the size of the square and its colour both representing the data count. I felt this helped emphasise patterns more clearly than having just one. Note that the colour and size scales represent different values/maximums in each graph – these ranges are normalised to help comparison of the intensity levels rather than the absolute counts of casualty under each category. As shown in the second graph, you can clearly see an upsurge in activity around the late summer/early autumn periods, particularly in recent years. Casualty levels seem strangely low for the winter and spring months?
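The normalisation mentioned above can be sketched as follows: each category is scaled against its own maximum, and the same 0 to 1 intensity then drives both the size and the colour of a square. The counts and category names are invented for illustration, and Tableau's own scaling may well differ in detail.

```python
def normalise(counts):
    """Scale counts to the range 0..1 relative to the largest value in the same category."""
    peak = max(counts)
    if peak == 0:
        return [0.0 for _ in counts]
    return [value / peak for value in counts]


# Made-up monthly counts for illustration only; NOT the War Logs figures.
monthly = {
    "Category A": [20, 50, 100, 40],
    "Category B": [2, 4, 8, 1],
}

# Dual encoding: the same normalised intensity feeds both visual channels.
encoded = {
    name: [{"size": level, "colour": level} for level in normalise(counts)]
    for name, counts in monthly.items()
}

for name, cells in encoded.items():
    print(name, [cell["size"] for cell in cells])
```

Both categories peak at an intensity of 1.0 even though their absolute counts differ by an order of magnitude, which is why the seasonal patterns can be compared across graphs but the raw levels cannot.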

IED Explosions



These initial graphs above (the map boundaries haven’t come out particularly well compared to how they looked when created) show firstly the total deaths and secondly the total woundings by location during the 6 year period. It’s particularly interesting to see the prominence of locations of deaths or woundings around the highways of Afghanistan, as you would expect given the roadside IED tactics.



The second lot of graphs are small multiples of the death and wounding incident plots, shown (1) across the 6 year period and (2) by the nature of the event.



This final graph presents a monthly and yearly plot of deaths and woundings by category of victim and I think it does offer some interesting patterns. Overall these visualisations probably don’t bring a great deal of added insight compared to the original Guardian visualisations, although I think the exercise has served as a good test of the tool as a means for exploring such data.

300 Key Incidents
I tried a few combinations using this data but nothing can really improve on the map interface the Guardian created for this data and, given this is just a manual selection of key events, any statistical or trend analysis will be flawed. I did a Wordle word cloud analysis to see if there were any interesting trends in repeated terms but the data contains so much coded language and so many references that such analysis is distorted.

*************************

Paul Bradshaw on the Online Journalism Blog reports that “French data journalism outfit Owni have put together an impressive app (also in English) that attempts to put a user-friendly interface on the intimidating volume of War Logs documents.”

*************************

The Atlantic website has joined in the task of visualising the data, presenting a range of map based analysis for the IED data.

*************************

Nathan at FlowingData has published a guest post by Alastair Dant, interactive lead at the Guardian, describing the efforts that went into designing the war logs map of incidents revealed by Wikileaks.

Wikileaks War Logs: A triumph in ‘data journalism’?

The New York Times, the Guardian (UK) and Der Spiegel (Ger) have published details of a huge set of war logs from the whistleblowing website Wikileaks, detailing the war in Afghanistan.

There are many aspects to the journalistic challenges facing these organisations who have been selected and trusted by Wikileaks to handle the responsibility of publicising this story and making sense of the data.

The volume and breadth of data to work through, the complexity of the emerging stories, the responsibility for accurate and sensitive reporting (NYT’s note to readers) and visualisation devices required to effectively present these stories are just some of the demands being placed on the various journalists and design experts who will have been carefully deployed to handle this story.

I concede that I know little about Der Spiegel’s track record with interactives, infographics and other visualisations but the first two are well established (and repeatedly recognised by awards) as being two of the most capable media organisations in respect of the use of information displays to enhance their reporting. I’ve yet to see the reasoning behind Wikileaks selecting these three – I’m sure it’s based on their political leanings – but I would imagine their reputation for data journalism, their ability to draw insight and impart robust data-stories will have been prominent factors.

A New Age of Journalism

Stepping away from the subject matter reveals a fascinating demonstration of the practice of data journalism, as acknowledged by the Guardian themselves here and in their live blog, an excerpt of which states:

With the number of documents released, many have, understandably, struggled to reach an instant verdict on how significant the information is. But media commentators have been quick to see the significance of the way the information was released for future models of journalism.

One example is Jay Rosen’s blog for New York University and another is Alexis Madrigal’s blog on the Atlantic. Madrigal describes the publication of the documents as “a milestone in the new news ecosystem”, and writes of:

“new conduits … opened into the most highly regarded newsrooms in the country. In the new asymmetrical journalism, it’s not clear who is on what side or what the rules of engagement actually are. But the reason WikiLeaks may have just changed the media is that we found out that it doesn’t really matter. Their data is good, and that’s what counts.”

Visualisations (Last updated 27/07/10)

Here are the visualisations produced so far by The Guardian and Der Spiegel. It is interesting that there is nothing yet to emerge from the New York Times. There could be three reasons for this: 1) they are taking their time to produce a fantastic interactive, 2) they have decided to keep their coverage limited to written prose, or 3) they have an agreement with the other organisations to leave the infographics work to them. Who knows?

Guardian – IED attacks on civilians, coalition and Afghan troops

Guardian – IED attacks by location by year

Guardian – selection of significant incidents

Der Spiegel – Deaths as a result of insurgent bomb attacks

Der Spiegel – No peaceful end in sight

* note that I’ve returned the ‘?’ to the end of the post’s title – it was unintentionally missing from the original post.

Radial plot of FTSE 100

A series of charts by Jeremy Christopher depicting the history, worth and composition of the FTSE 100 share index have been doing the rounds over the past 24 hours (first seen via DataVisualization.ch). The image below is probably the most discussed and shared of this short series of 3 charts. It visualises the FTSE’s index rate for every month over the past 26 years and also shows the changing profile of the industries comprising the 100 most highly capitalised UK companies.

Radial plots are certainly a popular visual device right now (there were several examples in my ‘Visualising the World Cup‘ series) but they are also one of the more contentious visual design choices, certainly amongst purists.

The focus of debate normally surrounds the suitability of plotting a variable (or variables in the case of the similar radar graphs) about a circular axis. Time, however, is a measure that does lend itself to a radial presentation. A logical alternative could have been to disconnect the circle and stretch out the sequence along a traditional left-to-right horizontal layout. However, the compatible layout and proportions offered by a circular display like this (especially when you are trying to comply with the dimensions of ISO paper size formats like A1) is a perfectly understandable choice – it does provide the means to achieve a great deal of data density.

I think this is a wonderfully elegant design: it appeals aesthetically, engages me for a prolonged period and fundamentally communicates the message it was intending – that of the shifting story of the FTSE. There are just a few minor design choices I would have taken differently:

  1. The colour scheme used to depict the different industries is pleasing on the eye but offers only very subtle changes between sector which makes it difficult to accurately judge the shifting profile. There is clearly a designer’s desire to minimise the spectrum of colours to sustain the look and feel this diagram achieves but an increase in the distinction between shades would be good.
  2. The bars going around the outer band, depicting index rate, create a bit of a moire effect – unintended visual noise. A solution to this would be to reduce the gaps between each bar. Currently, this space looks to be set at 100% of the bar width, so something closer to 25% might prevent the effect.
  3. The gradient grey shading in the background of these outer bars is the only ink on the design not encoding a piece of data and, in my view, reduces some of the clarity of these bars. It appears the main function of this shading is to emphasise white major/minor grid lines to help with reading the index value. Perhaps the shading could be removed entirely or at least reduced to a very soft shade of grey and, instead, a subtle but stronger shade of grey used for the grid lines.
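To put rough numbers on the second suggestion, here is a small sketch of the bar and gap geometry. It assumes one bar per month over 26 years spread evenly round the full circle; the 100% gap is my estimate from looking at the image rather than a measured value, and the function name is just for illustration.

```python
def bar_geometry(n_bars, gap_ratio):
    """Return (bar_degrees, gap_degrees, bar_share) for bars spaced evenly round a circle.

    gap_ratio is the gap expressed as a fraction of the bar width,
    so 1.0 means the gap is exactly as wide as the bar.
    """
    slot = 360.0 / n_bars           # angle available to one bar plus one gap
    bar = slot / (1.0 + gap_ratio)  # angle taken by the bar itself
    gap = bar * gap_ratio           # angle left as white space
    return bar, gap, bar / slot


N_BARS = 26 * 12  # assumption: one bar per month over 26 years

for ratio in (1.0, 0.25):
    bar, gap, share = bar_geometry(N_BARS, ratio)
    print(f"gap at {ratio:.0%} of bar width: bar {bar:.3f} deg, gap {gap:.3f} deg, bars fill {share:.0%}")
```

With a gap equal to the bar width the bars cover only half of the band, so bar and gap alternate at equal weight, which is the kind of regular striping that tends to shimmer. At 25% the bars cover about four fifths of the band and read more as a continuous profile.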

Twitter visualisation of happiness

The San Francisco Chronicle presents a visualisation developed by Alan Mislove, a researcher at Northeastern University. His study looks at 300 million tweets, measuring mood based on the sentiment of the language used. The results are then plotted according to location and time of day. Finally, the volume of tweets is represented by the area of the state in question:

I’ve not seen many visualisations attempted like this before, with the size of regions pulsing to reflect the changing size of a given variable, in this case the volume of tweets. However, I don’t think it works that well at communicating the results or allowing us to explore some of the patterns of data. The coastal bulges are quite interesting but the constant distortion of the country and the states within vastly reduces the chance of drawing insight.

This work prompted me to consider some of the alternative approaches to plotting multivariate data around issues of geography and time. The following recent examples show possible ideas though, in some cases, they would involve sacrificing the volume of tweets variable.

3D Elevated Maps (Doug McCune)

This technique maps 3D elevations to represent the location and prevalence of various crimes in San Francisco. A similar approach could have been applied to the entire US map and the peaks colour coded to represent the mood language and sized to represent volume of tweets. Animated over time these peaks would then grow and shrink accordingly.

Tweetography (Urban Tick & Digital Urban)

This work plots concepts of new city landscapes based on terrain altitudes representing the volume of twitter activity and styled using a classic cartography look and feel. The colour coding here represents the peaks and so this would in principle change the visualisation of happiness, with volume having to be sacrificed and the height used to represent extent of happiness.

Gapminder (Hans Rosling)

The much celebrated work of Hans Rosling, where multiple variables are plotted over time and by geographical position could be an alternative option, still allowing the use of colour to represent the mood and size of bubbles to show volume of tweets.

Bubble Map (New York Times)

Similar to the concept of Gapminder but rather than bubbles being plotted over the mid point of countries, here the bubbles are the representations of the geography of each country, growing and shrinking in size over time. This may result in a similarly unsatisfying solution though, as the true geography becomes distorted and therefore hard to interpret.

Twitter Chatter Map (New York Times)

The second representation idea to come from the NYT is the interactive visualisation of the Superbowl and the twitter ‘chatter’ that took place leading up to, during and after the game. This approach plots selected twitter terms across the country over time, growing the size of the text depending on its volume. This perhaps wouldn’t be most suited for the happiness plot for which the values are derived from syntax but not represented by it.

Choropleth/Heat Map (New York Times)

The final technique to consider, once again inspired by an example in the NYT, would sacrifice the volume of tweets variable and simply plot the happiness ‘index’ directly on to the specific locations across the map. The animation would show the changing levels across the states without distorting the geographical accuracy.

It is always nice to see experimental visualisation approaches being adopted to find new ways of communicating qualitative data, especially multivariate data of this nature. Regardless of whether they succeed, it is still extremely valuable work to help further the field overall through better understanding and experience of technique application.

Follow up to ‘Worst graph design ever?’

In my recent post ‘Worst graph design ever?’ I provided a very brief review of ‘The Little Book of Shocking Global Facts’. As the title suggests, this took a critical view of the appropriateness and effectiveness of the design choices made to represent the facts covered.

Clarifying my position

Over the past week there has been a great buzz about this book’s release and it’s probably fair to say a majority have followed a similar opinion to my own; indeed, my post has been referred to on quite a few sites. To be fair there have also been staunch defenders of the work so it is important to acknowledge the presence of two sides of the debate. In some cases my opinions have been classified as representing ‘anti-Barnbrook’ sentiment and so I wanted to clarify my position.

As far as I’m concerned I have absolutely no anti-Barnbrook feelings, I am simply anti-poor visualisation. I’ve only had passing exposure to Jonathan Barnbrook but it is clear he polarises opinion. That is perfectly normal and expected in a graphic design context where judgements are formed with greater emphasis on the creativity and novelty (by which I mean ‘newness’ rather than gimmickry) of visual presentation. Issues of taste and aesthetic are largely the currency of the debate in this field.

I see things from the slightly different visualisation world rather than the pure graphic design perspective. I therefore judge designs on their ability to effectively and clearly communicate or to enable a reader to explore data in an engaging and useful way. Forming opinions based on the function of clear communication should not be taken to imply that innovation and creativity have no place in visualisation. That is simply not the case. Some supporters of the Barnbrook designs have drawn fearful comparison with a dreaded alternative design approach based on dense numbers and formal statistics and the use of dry, mainstream fonts like Arial or Times New Roman. Are these the only design possibilities for presenting data? To offer such a binary ‘this or that’ view of the potential of visualisation design is extremely blinkered. But let’s remember that this book is about communicating facts and so creativity in design must aid the communication effect, not impair it.

A response from the designers

DigitalArts have published an interview with Barnbrook designer, Jon Abbott, who offers his views on the research and creative processes as well as the design solutions produced.

The article’s introduction presents an interesting context, particularly in light of the resulting design:

The sheer scale of some of the environmental and social challenges facing the world today can be difficult to wrap your mind around – the amount of data and the massive numbers involved can often seem impenetrable. In The Little Book of Shocking Global Facts, Barnbrook… has cut an experimental line between type art and infographics that allows you to engage with them without robbing them of their power.

With these observations in mind I maintain my belief that the designs fail to achieve such objectives. I’ll leave you to draw your own conclusions from here but will present what I consider to be the most interesting exchange in this interview:

DA: How do you use design and typography to make statistics engaging and make sense of incomprehensibly large numbers?

JA: “Aesthetically, our intention was not to create an almanac of information design, but a diverse, fun, richly illustrated book that the reader would want to pick up again and again. We thought that 192 pages of numbers and statistics presented in a formal way could have been slightly dull, so we set about creating something which we thought would be visually stimulating for the reader. We wanted to avoid any visual clichés; some of the numbers are presented in a seemingly abstract way, but all the illustrations have their root in the fact which they are representing.”

DA: How do you stop the design from detracting from the information?

JA: “There have been accusations levelled at the book that our experimental approach impedes the communication of information. Most of the statistics are supported with some explanatory information and it is presented clearly albeit sometimes in an unusual fashion.

“Our belief is that a book which is quickly digested is quickly forgotten. I would argue the reader deserves more credit than that, and I think people are willing to take the time to delve into a complex illustration again and again.”

Best of the visualisation web… June 2010

At the end of each month I pull together a collection of links to some of the most relevant, interesting and useful articles I’ve come across. If you follow me on Twitter you will see these items tweeted as soon as I find them. Here’s the latest collection from June 2010:

Online Journalism Review (from May 2007) | An interview with Alberto Cairo, El Mundo’s former infographics expert, about his (then) upcoming book on visual journalism | Link

Smashing Magazine | Interesting background behind the design of a “World Of Programming” Infographic | Link

If It Was My Home | Visualisation which plots the spill area of the BP Oil Disaster over your chosen location, helping to contextualise the extent of this disaster | Link

Juice Analytics | Summary page of Juice’s excellent ’30 days to context connection’ series about the world of visualisation | Link

The Guardian | From the Guardian’s News & Media Sustainability Report 2010 – “our data journalism is opening up a world of information” | Link

The Guardian (from Oct 2009) | Portugal’s new paper points to print’s future | Link

Online Journalism Blog | Video with Olympics Reporter Ollie Williams to find out what the BBC are planning for 2012 | Link

Perceptual Edge | Business Intelligence Industry – Get to Know Your Real Customers | Link

blprnt.blg (from Apr 2010) | Your Random Numbers – Getting Started with Processing and Data Visualization | Link

Logo My Way | Logo redesign challenge for BP in light of the oil disaster | Link

Online Journalism Blog | Alternative critique of The Guardian’s crusade for liberating access to all data, looking at the economics of data in the news environment | Link

Haha.nu | Good Magazine | There have been a few lows recently in terms of bad design, here are a couple more to digest | Example 1 and Example 2

The Guardian | Why Minority Report was spot on | Link

Data Mining | Visualizing wireless signals in Augmented Reality | Link

Visuale | 8 simple rules on brainstorming around a visualization design | Link

Smashing Magazine | Applying Interior Design principles to the web | Link

Eager Eyes | Article reflecting on the demise of the visualisation website ‘Verifiable’ | Link

Wired | Article to launch the availability of UK tube data – ‘Live Tube map shows the power of TfL’s data’ | Link

Infosthetics | Recreating Minard’s Napoleon graph using modern day tools: Swivel, Tableau and Many Eyes | Link

Infosthetics | Interview with Karim Rashid about Information Aesthetics in Industrial Design | Link

New York Times | The New York Times visualisation which tracks the extent and spread of the Gulf of Mexico oil spill | Link

Core77 | Don Norman’s article about design thinking being a useful myth | Link

CBS Money Watch | The 10 hottest careers in the US | Link

Kickstarter | A new way to fund and follow creativity | Link

Moritz Stefaner | Open source release of Moritz Stefaner’s brilliant Elastic Lists code | Link

Design Mind | How designers are introducing the idea of “simple” to a group of high school students | Link

Perceptual Edge | ‘Circle lust continues’ – article bemoaning the ongoing love affair some visualisation designers have with the use of the circle | Link

Infosthetics | Interview with Nicholas Felton from feltron.com | Link

Eager Eyes | Caroline Ziemkiewicz and Robert Kosara’s paper on ‘Implied Dynamics in Information Visualization’ | Link

Infosthetics | Social Visualization Software Review: Tableau Public | Link

Juice Analytics | Can familiarity trump usability? | Link

Infographics News | Three infographics and 200 years of Argentinian history | Link

Infographics News | Four visual communicators give advice on infographics | Link

Flowing Data | Review of the book Data Flow 2 | Link

Infosthetics | What is the best arrow representation in visualisations? | Link

Flowing Data | Ridiculous but good fun, a Twitter parade in your honour | Link

Flowing Data | Turning information into action – 10 Tactics | Link

Smashing Magazine | Why design-by-committee should die | Link

O’Reilly Radar | Data is not binary: Why open data requires credibility and transparency | Link

Still awaiting Tufte’s influence?

Access to Data

I have spoken recently about the status of the world of visualisation and how the juxtaposition of a number of factors is really facilitating its growth and popularity, but not yet widespread best practice. One of the key elements in this emergence has been the modern-day pursuit of transparency and open access to data, which is growing at a rapid pace and is being rewarded with positive action.

We are seeing great progress in this area, and where once we had a trickle, there is now something of a deluge. Government data sites like data.gov (US) and data.gov.uk (UK), the COINS database of UK treasury spending, the Guardian’s datablog, the imminent release of data around UK local government spending and the highly popular Transport for London river, bus and tube data are just some of the examples of this movement.

Accompanying this growing access to important and interesting data comes a great opportunity, responsibility and demand to provide effective visualisation services that act as a window for any interested person or party to be able to explore, make sense of and make use of these great resources.

I recently came across some more of the visualisation services launched in the US such as Recovery.gov, USAspending.gov and the Federal IT Portfolio. Yet, whilst they are a great step in the right direction making the information accessible to a much wider audience, there are things nagging away at me about some of the basic flaws that exist in the design of these services.

Great Expectations

Back in March it was announced that Edward Tufte had been appointed to the Recovery Independent Advisory Panel. This was met with great anticipation within the visualisation community, since giving greater prominence to someone of his standing could only serve to promote and enhance the profile and good practice of the discipline:

Presumably, Tufte will be using his expertise to find charts that illustrate how the stimulus is being used, and what effect it’s having on the economy. That’s brilliant news, for anyone overwhelmed by the blather surrounding political debates. And it’s not just a token appointment. Tufte says that he’ll be going to Washington several days a month, and teleworking regularly. “Infographics Win! Obama Appoints Data-Viz Demigod to Chart the Stimulus”, Fast Company, 8th March 2010

I concede that my knowledge of the US government is largely informed by The West Wing and so I don’t fully understand the relationships between and alignment of the various councils, departments, panels and bodies. Furthermore, I’m unclear about both the role of the department that appears responsible for delivering the spending sites (Chief Information Officer’s Council) and the potential scope or reach of Tufte’s appointment. However, whilst there are certainly good things amongst these services (e.g. the use of Gapminder works well), there is a distinct lack of evidence that he has had any design input or influence:

Recovery.gov

USASpending

IT.USASpending

The inelegant design of much of the above (especially the ugly tree map and the indecipherable USASpending categorical colour scheme) and the presence of devices like 3D pie charts and gauges do not evoke the Tufte principles from which many of us take great influence.

Furthermore, when you hear that the development of the visualisation service for the IT spending portfolio cost a total of $8M you start to see it in a new light, especially when you see this summary comment from the same people who were applauding the Data-Viz demigod’s appointment:

Of course, a sophisticated Web site of this magnitude doesn’t come cheap, but it is certainly easy to use… In this case, USASpending.gov’s IT Dashboard is $0.1 million under budget, ahead of schedule by 1.3%, and has an overall rating of 7.5. And the cost to create the USASpending.gov? $8 million, with the dashboard itself costing $1.3 million to build. For this kind of high-tech transparency, I think it’s absolutely worth it. Fast Company article, 14th July 2010

I’m sure I’ll receive feedback criticising my pedantry with this post but surely we would have expected better?

Worst graph design ever?

Yesterday, I came across a graphic which I believe to be possibly the worst graph I have ever seen. I’ve seen some stinkers but this has cleared out the room. Now hold your nose…

It comes from a book titled “The Little Book of Shocking Global Facts”. I’m sure I won’t be the first to offer it the alternate title of “Little Book of Shocking Graphs”. Interestingly the tag line on this article describes it as a “new book of astounding charts and stats” which does work on an unintended level.

Accompanying this are many other examples of bad practice in visualisation design – the fundamental purpose of which, let’s remember, is to effectively communicate information about a subject.

It is almost impossible to know where to start in critiquing a design like this, so many are its faults. At best they are bad album art.

I discussed in a previous post how it is important to maintain balance and consideration when commenting on the work of designers trying to produce visualisation designs – often they contain flaws but have good intent. This is not one of them. Furthermore, the purpose and intention behind this book, which is to inform and engage people about significant and important global issues, deserves so much more than what has been produced. I think it is somewhat irresponsible of the publishers to have commissioned or approved the work that has been delivered here.

[Edit: coincidentally I've just seen that Andrew at Infosthetics has also published a post about this topic, as has Nicholas Felton]

Visualising the World Cup: Reflections

On the excellent DataVisualization.ch site, Benjamin has posted an interview that shares some of his thoughts about the various World Cup visualisations, infographics and interactives we have seen. This is for an article by Jeremy Wagstaff for the Jakarta Post and the BBC World Service. Benjamin has invited others to join the conversation with further feedback, and so I decided to reflect on the collections I’ve posted (part 1, part 2, part 3) and offer my additional thoughts based around the same questions:

Do you happen to know where all this World Cup data is coming from?

As Benjamin suggests, the sources are often difficult to identify, but in general much of the data appears to come from sites like Twitter, statistical analysis and data providers like Opta, and probably in-house databases held by the likes of UEFA and media outlets such as the BBC.

Has anything changed since the last World Cup? It seems there’s a lot more visualization being done this time. Is that so, and if it is, is it because there’s more or better data, it’s cheaper, or because our devices or palettes have changed? Any other reasons?

The increase in visualisations around the World Cup simply reflects the wider growth and emergence of the subject. A number of factors are facilitating this growth:

Technology as a creative enabler: The primary factor I would identify is technology. The creative data-driven software, array of online tools, APIs and advanced programming languages that now exist are creating sophisticated and accessible opportunities for more people to get creative with information designs.

Technology as a consumer device: The rise of the smartphone, the growth of mobile web, the launch of the iPad and emergence of ‘apps’, the increased prevalence and speed of broadband access are all key areas of growth over the past few years that have transformed the way we consume media. These contemporary platforms create new challenges and opportunities for designers to find interesting and effective ways of presenting interactive information.

Technology for recording data: The emergence of the social web and in particular Twitter has created staggering volumes of recorded data, both quantitative and qualitative: the opportunities this presents for analysis are amazing. One of the emerging terms you’ll come across is ‘digital exhaust’ which describes the incredible trail of data our usage of the Internet has created. Once again, the prospects for information designers to mine this data to feed visualisation products are only restricted by imagination and curiosity.

Open data movement: A further contextual factor is the open source movement and the campaign for open, transparent access to data. The opening up of access to government data, in particular, has gathered incredible pace over the past 24 months and is now well beyond the tipping point. This has once again created high profile opportunities for analysts and designers to use data to draw insight and encourage greater accountability.

Data journalism: There is a noticeable shift in the scope of 21st century journalism, influenced by the online publishing of newspapers, which has evolved the scope of typical roles to reflect the evolving nature of information sources and news consumption.

Data Events: The growth of the visualisation field over the past few years has been characterised by what I’d describe as massive ‘data events’: historical events that capture a worldwide interest and are accompanied by substantial data recording. I’m talking about events like the global financial crisis, the US and UK elections, climate change and several natural disasters. The appetite for making sense of these events and communicating them in visible, understandable designs has facilitated a rapidly increasing awareness and interest in the discipline. The World Cup was just the latest in the line of these data events, one that has an enormous potential captive audience already in place.

Information design as a marketing device: If you can create the ‘killer’ World Cup graphic that creates a massive buzz around the web you can attract a huge amount of traffic to your site. Take the example of Marca of Spain. The infographic design team there created probably the most ‘shared’ interactive Wallchart and the stats reveal the incredible amount of exposure this created for the newspaper (a post revealing these stats can be found in Spanish and Google translated into English). As a strategy for generating users, viewers or readers the emphasis on interesting, interactive infographics appears to have been taken up by Murdoch as a way of getting people to pay for The Times.

I wanted to suggest that this World Cup seems to have been something of a tipping point for data visualization. Any truth to that?

I agree with Benjamin’s response here. I think a claim of ‘tipping point’ could be valid for the awareness of and appetite for innovative ways of presenting information, particularly in the media. However, we are a long way off experiencing a tipping point in the best practice of information design/visualisation across all aspects of life and particularly business.

Generally speaking, there seems to be a great appetite for this kind of thing. Is that the case, and if so, what’s driving it?

Once again, Benjamin’s response is spot on. I would add that with great appetite comes responsibility. The user experience side has to be delicately handled because this is the arena in which much of the potential bad practice can be deployed. Devices that appeal, entertain and engage users need to do so without sacrificing the fundamental purpose of the display – that is to communicate data or allow people to explore data. The path that leads to the temptation to go beyond effective function and into worthless gimmickry is paved with danger!

What’s the business model for all the time spent on this kind of visualization for media stuff?

The business models should be based around the dual purpose of (a) attracting readers/viewers and (b) making them better informed, with the emphasis on the latter. The rewards for this are self-explanatory. Of course, great time and effort will go into the creation of these designs. However, the exceptional skills, embedded practice and wonderful tools that exist out there should make this an increasingly sustainable strategy.

Any other thoughts–about what’s happening, the business model, the future, etc?

I’ve written recently about the status of visualisation and some of the factors that will help it continue to move and grow in the right direction. The focus of these thoughts concerns awareness and education. For every great visualisation there are probably three terrible ones that entirely fail to serve their purpose – to make the viewer feel smarter about that topic. The key elements of any successful visualisation are message, data and display. Taking the first of these, you have to have a reason for communicating: it has to have a message, a purpose for informing. Too often graphics fail because they are a mishmash of all sorts of random snippets of information; they lack the cohesion that comes from having a core purpose or reason for production. The second aspect, data, is certainly moving in the right direction as outlined above. It is now more freely available, in better quality and is growing in volume at an incredible rate. The potential for new knowledge, greater insights and interesting stories to emerge from this is boundless. The final aspect, display, requires potential practitioners to be better informed about the foundation principles of design, how the visual system works, statistics, communication principles and, of course, technology. We need to embed best practice in all aspects of our lives and business, otherwise this ratio of good:bad practice will likely shift in favour of the bad and the bubble will burst.