Andy Kirk about to appear in some new places…

Just wanted to announce that, over the next few weeks, you may see my name and my postings pop up in a couple of extra places.

Firstly, I’m excited to have been invited by Andrew Vande Moere to perform a guest editor role on his immense Infosthetics site whilst he takes a vacation. I’m immensely grateful to Andrew for entrusting me with this treasured responsibility and I promise to showcase the very best 3D pie charts and tower infographics that emerge during his break…

Secondly, and with equal pleasure, I will soon be commencing a regular slot on the excellent O’Reilly Radar site, compiling occasional posts that showcase the latest trends in data visualisation.

These developments won’t have any negative impact on the frequency and standard of posts on visualisingdata.com, however, so do stay tuned for the usual mix of essential resources, contemporary examples and critiques.

I’ve also got a great contest coming up (don’t worry, it’s not another data visualisation challenge!) where the prize will be a brand spanking new, cutting-edge 3M Pocket Projector. More on this next week…

New visualisation design project: UN Global Pulse challenge

I’m delighted to share with you details of my entry for the Visualizing.org UN Global Pulse visualisation challenge. The title of the work is “Giving Voices to the Vulnerable: The Economic Crisis” and it explores survey data gathered by UN Global Pulse about perceptions of economic impact across five countries: India, Iraq, Mexico, Uganda and Ukraine.

If the Visualizing.org Player window doesn’t work, you can view the image via this closr.it upload

The aim of this post is to share with you the design process that was pursued, explaining some of the key decisions made and the design choices that formed the finished work. I am going to structure this around the three key themes that shape any visualisation project: message, data and design. A fourth theme, which relates to the constraints and restrictions around a project, runs throughout and so is incorporated within the others.

 

Message

The UN Global Pulse survey (conducted during May-August 2010) was undertaken using mobile phones/SMS and asked two multiple choice and three open-ended questions focusing on economic perceptions.

  1. In the past year, meeting your household needs has been: Easier, Same, More difficult, Very difficult
  2. In the past year, how has the (insert country) economic situation changed?: Better, Same, Worse, Much Worse
  3. What has been the greatest change you had to make to meet your household needs this past year?
  4. How has your quality of life changed over the past year?
  5. In one word, how do you feel about your future?

The purpose of this exercise was to discover perceptions about the impact of the ongoing global economic crisis:

Visualizing and UN Global Pulse challenge you to visualize the voices of vulnerable populations in times of global crisis. We’re looking for clear, informative, and creative visualizations that tackle one or more of the following: How do people in different nations describe their quality of life? What types of changes do people make in order to cope with economic uncertainty? How do individuals perceive their future outlook?

My intention was to absolutely maximise the benefit readers could derive from this work. It couldn’t be about a temptation towards generating data art or an emphasis on interactive novelty, just a clear and accessible piece of analysis that would help any reader feel smarter as a result of engagement with it.

 

Data

Clearly, selecting participants via mobile phones would eliminate the involvement of those likely to be less privileged, and so the approach had to be considered very unscientific. However, the survey data did not represent, and was not intended to represent, a statistically significant sample.

The data was presented in five separate tables (one for each country) showing the responses to each of the five questions.

The first job was to become familiar with the data and to identify any data cleaning requirements. A combination of Excel filtering/sorting and Tableau helped to identify data quality issues as well as the range and distribution of values. A number of illegible and erroneous values were removed or resolved but the biggest task was making sense of and dealing with the open-ended data captured in questions 3-5.

At an early stage I made the decision to work with just one of these fields, as there was so much variability in the content that each would require a significant manual cleaning and classification process. For the purpose of focus, only the last of these questions (Q5) was explored in order to identify the most important and valuable insights.

This fifth question, focusing on perceptions of future prospects and (supposedly) based on a single term, appealed as the most interesting of the three free-text fields and also the least challenging classification task. I manually went through the 3794 records and formed a more concise and representative selection of single-word terms and, whilst I was able to deploy a range of automated processes to assist with this, it was still a significant task. In addition to establishing a more manageable list of terms I also wanted to assess the polarity of these terms (was each a negative or a positive sentiment?) and so embarked on scoring the terms with a positive (1, 2) or negative (-1, -2) value depending on their nature and strength. I also noticed a number of terms that were of a neutral nature so assigned these a separate category so as not to lose or dilute their intent.
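As a rough illustration of the scoring step described above, here is a minimal Python sketch. The lexicon, its terms and the scores attached to them are hypothetical examples (the actual term list used in the project is not reproduced here), and the original work was done in Excel and Tableau rather than in code.

```python
from collections import Counter

# Hypothetical polarity lexicon: terms and scores are illustrative assumptions.
# Positive terms score 1 or 2, negative terms -1 or -2, neutral terms 0.
POLARITY = {
    "hopeful": 1,
    "great": 2,
    "worried": -1,
    "hopeless": -2,
    "uncertain": 0,
}


def polarity_category(score):
    """Map a numeric polarity score to a named category."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarise_polarity(terms):
    """Count responses per polarity category, keeping unknown terms separate."""
    counts = Counter()
    for term in terms:
        key = term.strip().lower()  # basic cleaning of free-text input
        if key in POLARITY:
            counts[polarity_category(POLARITY[key])] += 1
        else:
            counts["unclassified"] += 1
    return counts


# Made-up responses, including one that cannot be classified
responses = ["Hopeful", "worried ", "uncertain", "hopeless", "great", "???"]
summary = summarise_polarity(responses)
```

Keeping neutral and unclassified terms in their own buckets mirrors the decision not to lose or dilute their intent.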

The analysis load was shared between Excel and Tableau. Initially, I was looking for patterns within each variable but then moved onto exploring combinations to identify potentially interesting or prominent relationships. Tableau, in particular, is such an outstanding tool for engaging with and exploring a dataset, so easily allowing you to flow from one analytical hypothesis to the next, opening up possibilities and quickly, efficiently closing off dead-ends.

 

Design

At an early stage I made the decision that I wanted to pursue my design as a static, print-compatible piece. There is something elegant and actually more challenging about seeking a solution that successfully communicates within a single view, without the need for animation or interactivity. Like the beauty of a photo in contrast to moving image. Furthermore, a static format would seem to connect with the nature of this global task and the need to make the solution accessible to all, regardless of format and technology platform. An interactive would have been very interesting to work on and to this extent I was interested in the potential development of a solution influenced by the inspirational work of the New York Times.

The early concept I was arriving at was a sequenced view that took the reader through some contextual information about the economic situations in each country (to help form judgments on subsequent perceptions) and then revealed the patterns of the survey responses. Each country’s analysis would then sit side-by-side to facilitate comparison across each subject area. I thought about the sequence of the countries and eventually settled on alphabetical order, just because I couldn’t decide on a meaningful ordering basis.

For the economic data I settled on a combination of an area chart showing the Gross National Income per Capita and a bar chart for the GDP % growth/decline. Two issues emerged here: firstly, the absolute GNI values being very different from country to country meant the small multiples could only be used to reveal patterns rather than aid direct comparison; secondly, the Iraqi GDP values had three outliers that I didn’t want to have to accommodate as they would skew the axis range and diminish the visibility of other values.

The next decision was to combine some values in both Q1 and Q2, merging the ‘Very difficult’ and ‘More difficult’ values for the former, and the ‘Much Worse’ and ‘Worse’ perceptions for the latter. It made more sense to assess matters on a single, combined view, both analytically and presentation-wise.

The best display for the Q1 and Q2 analysis was a horizontal bar, based on standardised % values rather than absolute number of responses for each category. This would also allow comparison across each country.
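To show why standardised percentages allow comparison across countries with different response volumes, here is a small Python sketch. The country names and counts are invented for illustration and are not the survey figures.

```python
def to_percentages(counts):
    """Convert absolute response counts into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        return {category: 0.0 for category in counts}
    return {
        category: round(100 * value / total, 1)
        for category, value in counts.items()
    }


# Invented counts: country A has five times as many responses as country B
country_a = {"Easier": 50, "Same": 150, "Difficult": 300}
country_b = {"Easier": 10, "Same": 20, "Difficult": 70}

pct_a = to_percentages(country_a)
pct_b = to_percentages(country_b)
```

Although the raw counts differ widely, the two percentage breakdowns sit on the same 0 to 100 scale, so the bars can be compared directly.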

To aid intuition, red colours would be established to always represent a negative view (strong red = very negative), green to reflect a positive view (strong green = very positive) and dark yellow would be for a neutral or ‘same’ status view. All descriptive text would be Georgia font and numeric data in Gill Sans MT font.

The next display was a combination of variables from Q1 and Q2 into a heat map in order to see how different cohorts of respondents were answering each question. The darker the section, the more dominant that pairing of perceptions. This approach justified the decision to reduce the variable range down to three values rather than four.
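The numbers behind a heat map of this kind are a cross-tabulation of answer pairs. The following Python sketch assumes each respondent is represented as a hypothetical (Q1 answer, Q2 answer) tuple; the sample data is invented.

```python
from collections import Counter


def cross_tabulate(pairs):
    """Return the share of respondents giving each (Q1, Q2) answer pairing."""
    counts = Counter(pairs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: count / total for pair, count in counts.items()}


# Invented responses: three respondents chose the same negative pairing
pairs = [("Difficult", "Worse")] * 3 + [("Same", "Same")]
shares = cross_tabulate(pairs)
```

Each share would then drive the depth of colour of one cell, so the most common pairing appears darkest.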

To communicate the analysis of terms used for Q5, I initially tried some bubble charts based on the spread of polarity but they simply weren’t working in a way that aided comparison, rendering them pretty but fairly useless. Instead, a stacked horizontal bar showing the balance of positivity vs. negativity seemed to be working nicely. Presenting the frequency of usage for the various terms was problematic as I really wanted to avoid using something like a word cloud. Word clouds are a very obvious option for dealing with textual analysis but my view is that they offer limited functional and visualisation value – they are just for decoration. Eventually I arrived at a comparison of the 10 most prevalent terms, using a horizontal bar chart with colours to indicate the polarity of each sentiment. This would show the general viewpoint across each country. Finally, an aggregated % total formed by these ten terms would show how strongly they represented the consensus.
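The top-terms comparison and its aggregated % total can be sketched in Python as below. The term list is invented and `n=2` is used only to keep the example short; the piece itself used the ten most prevalent terms.

```python
from collections import Counter


def top_terms_with_coverage(terms, n=10):
    """Return the n most frequent terms and the % of all responses they cover."""
    counts = Counter(terms)
    top = counts.most_common(n)
    total = sum(counts.values())
    covered = sum(count for _, count in top)
    coverage = 100 * covered / total if total else 0.0
    return top, coverage


# Invented cleaned single-word responses
terms = ["hopeful"] * 5 + ["worried"] * 3 + ["uncertain"] * 2
top, coverage = top_terms_with_coverage(terms, n=2)
```

A high coverage figure suggests the listed terms represent the consensus well, while a low one signals a long tail of less common responses.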

The final visualisation display was a pair of heatmaps for each country comparing the responses for views of the future with the Q1 and Q2 perceptions. I decided to use the same blue colour scheme on these heatmaps as the one above so that there were no visual clashes across the piece overall and to let the eye get used to interpreting the colour range values.

The last task was to incorporate some personal insights in a column of comments to aid understanding and add all the necessary annotation and explanatory text (intro, data treatment, how to read this visualisation, data sources) to help viewers understand every aspect of the project and the final piece.

You can see the final piece published on the Visualizing.org site.

Best of the visualisation web… June 2011

At the end of each month I pull together a collection of links to some of the most relevant, interesting or thought-provoking articles I’ve come across during the previous month. If you follow me on Twitter – and now Google+ too – you will see many of these items tweeted as soon as I find them. Here’s the latest collection from June 2011:

The Numbers Guy (WSJ) | A Full Plate of suggestions for USDA

blprnt | All The Names: Algorithmic design and the 9/11 memorial

Software and Art | Amanda Cox: Making illustrations better with an annotation layer

Abhishek Tiwari | Bubble sets: Revealing set relationships with isocontours over existing visualisations

Buzzdata | Noah Iliinsky on good visualizations

Excel Charts | Change bad charts in Wikipedia

Toronto.ca | Wellbeing Toronto – a new web-based measurement and visualization tool that helps evaluate community wellbeing across the city’s 140 neighbourhoods

CR Blog | Cannes round-up

Fell in Love with Data | Data visualization is NOT useful. It’s indispensable.

Drawar | Dieter Rams and the 10 Principles for Good Design: Part 1, Part II

Eyeo Collection | An unofficial, community-driven collection of inspirations and discoveries from the Eyeo Festival 2011

Flowing Data | Find everywhere you can go in 15 minutes or less

Carlacasilli’s media psychology | Concluding part 8 of the excellent series of posts about information visualisation, a new visual language.

Fell in Love with Data | Guest post: From killer questions come powerful visualisations

FlowingData | GeoCommons 2.0, now with more mapping features

Design Mind | Google Health’s failure to bring meaning to data

Fast Company | Google helps journalists make data more informative, and beautiful

Meaning in Communication | The science of persuasion

10,000 Words | Newspaper Map: The coolest way to visually surf newspapers

Vimeo | Video: OECD Better Life Index: Design Variations

Rev Dan Catt | Of Data Scientists, Big Data, the City and Dancers

Idea Transplant | Why designers cannot work for free…

Line 25 | Showcase of impressive design process explanations

Target Point Consulting | Seeing is Remembering

Eager Eyes | The Camera metaphor of visualization use

Bissantz | The eight commandments of good visualization

Till Nagel | TileMill for Processing

Datavisualization.ch | Travel time and housing prices map

Infosthetics | Twitter visualizes the geographical spreading of information

Very Small Array | Overlap between Academy Award Nominees, Best Picture and the Top 10 Highest Grossing Films per year (1928-2010)

Visualizing.org | Q&A with Enrico Bertini

Perceptual Edge | Does GE Think We’re Stupid?

O’Reilly Strata Conference, New York 2011 (20% reader discount)

You may recall my promotion of the inaugural O’Reilly Strata “Making Data Work” conference, which took place in January of this year and was, by many accounts, a really successful event. I’m excited to share details of the follow-up event which is taking place in New York, between September 19 and 23. The first Strata conference was a sell-out event and this next one promises to offer an even bigger and more fascinating line-up of topics, sessions and speakers.

The event is split up into three distinct sessions, each offering a different contemporary perspective on the power and opportunities that exist in this data rich era:

 

Strata JumpStart | 19th September | Get your 20% discount (code = DATA)

Providing a crash course for managers, strategists, and entrepreneurs on how to manage the data deluge that’s transforming traditional business practices across the board–in finance, marketing, sales, legal, privacy/security, operations, and HR. Join us for an intense, day-long deep dive. You can see an evolving list of confirmed speakers but some of the topics covered in this intense one-day session will include:

 

Strata Summit | 20th – 21st September | Request an invitation

The Strata Summit is an invitation-only event (though you can request an invite), providing two days on the essential high-level strategies for thriving in “the harsh light of data”, delivered by the battle-tested business and technology pioneers who are leading the way. The Summit will explore how the application of data is reshaping industries, destroying incumbents, and crowning new kings. Delegates will also get glimpses into the fringes of data science that point to opportunities ahead.

You can see the impressive list of confirmed speakers here, which includes Simon Rogers (Guardian), Cory Doctorow (novelist) and James Powell (Thomson Reuters).

 

Strata Conference | 22nd – 23rd September | Get your 20% discount (code = DATA)

THE main event. Taking place at the New York Hilton, these two days will form the main part of the conference schedule, providing a comprehensive deep dive into the nuts-and-bolts needed for building a data-driven business: the latest on skills, tools, and technologies you need to make data work. Strata Conference covers the latest and best tools and technologies for this new discipline, along the entire data supply chain, from gathering, cleaning, analyzing, and storing data to communicating data intelligence effectively. With hardcore technical sessions on parallel computing, machine learning, and interactive visualizations; case studies from finance, media, healthcare, and technology; and provocative reports from the leading edge, Strata Conference showcases the people, tools, and technologies that make data work.

Confirmed speakers include Drew Conway (New York University), Jer Thorp (blprnt) and Pete Warden (OpenHeatMap).

To register for any of these events simply click through on any of the links or banners above and you will be led through to the conference registration page which should have your 20% discount pre-completed – if this is not the case, the code is simply DATA. Early registration runs through to 1st August.

Visualizing Player 1.0 launch

Much of last week’s attention was taken up by the launch of visual.ly, but today sees another great development in the visualisation field with the Visualizing.org launch of the Visualizing Player 1.0 - the first-ever media player designed specifically for data visualisation.

The main features of the Visualizing Player are that it is widely portable, enabling easy embedding and sharing, and is a universal platform through its support of countless creative formats, including hi-res graphics, HTML5, Java, Flash and video. It is especially useful for those who publish content on the web such as bloggers and designers themselves.

One of our core missions here at Visualizing is to build you the best possible platform and the most powerful tools for sharing those creations. We think the Visualizing Player — built from your feedback — is the best possible way to get your work out across the web.

The result is an ideal presentation and publishing format finally worthy of the work visualisation designers are creating, unleashing the pure aesthetic, full interactivity and insightful experiences as originally intended.

It sounds like the development and evolution of the Player will not stop here: “We’re already developing new features to make the Visualizing Player even more useful to you so please send us any comments or feedback you might have.”

Check it out!

Find out about becoming a freelance visualisation designer

Just a quick post to point you in the direction of a great piece Enrico Bertini at Fell in Love with Data is working on. He is asking readers to suggest questions to ask top visualisation designer Moritz Stefaner in a forthcoming interview. The focus of the interview is to find out about Moritz’s experiences as a freelance data visualisation designer.

The interview is likely to take place next week so don’t miss this opportunity to find out more about this exciting and emerging vocation. You can send your questions to Enrico and/or Moritz either by adding a comment on Enrico’s blog post or by contacting them on Twitter @FILWD and @moritz_stefaner.

Visual.ly public beta is live!

Today we welcome the latest notable development in the visualisation field. Just over three months after launching a teaser preview video of their offering, inspiring 60,000 people to sign up for invites, gaining 8,500 Twitter followers and attracting an impressive list of partners, visual.ly, a platform for showcasing the diversity and beauty of visualisation, will launch into the public arena.

Promising to “connect public to publishers, designers and data”, Visual.ly will offer the largest web resource of indexed, searchable visualisations (over 2,000) and a powerful data warehouse pulling together a vast array of datasets.

The website will also eventually offer an intriguing, automated facility for registered users to create their own web-based infographics and visualisations, with a low-cost membership available for individuals and a more advanced version for larger companies. I understand the infographic creation feature will arrive later in 2011, signalling the transformation from beta to full-launch.

You can find out a bit more (but not much!) about visual.ly from this FastCoDesign article and here in the New Brunswick Business Journal. You can also watch the pre-launch video below:

 

One to watch…

I am really fascinated to see how the Visual.ly offering evolves and how well it takes off. I genuinely wish the team well and, given the impressive ‘interest’ statistics, they appear to be off to a great start.

Just to share a couple of observations that occurred to me when I first heard about this site.

I am particularly intrigued by the concept of automating infographics. Now this could be semantics, but my personal definition of infographics, and one shared and articulated far better by Robert Kosara, is that they are essentially produced manually, bringing together separate visualisation, imagery and textual elements to form a single, cohesive explanatory visual. To hear that a creation engine has been conceived that could help to automate this is a significant proposition and I look forward to giving it a go.

I will also be interested to see the design style and target of the visualisations and infographics. The issue of form vs function rumbles on, but this relationship (and let’s see it as a relationship rather than a battle) was articulated better than most in a nailed-on article by Zach Gemignani of Juice Analytics. The talented designers on board at Visual.ly appear to be predominantly from a marketing background, so will this lead to a tendency towards designs that focus more on the emotional “wow”, serving the beauty-seeking user, or will it also manage to feed those seeking more sober, but elegant, analytical communications?

Of course, these observations are made pre-launch and with all the ignorance of fact that goes with that status. As the site evolves towards full maturity I would love to explore these further with the guys at Visual.ly.

Bottom line? Check it out for yourself.

Update #1: Fascinating to hear from Robert Kosara of EagerEyes that he has been, and still is, involved in an advisory capacity to support the development of Visual.ly, which is great news. He will be posting a news item on his site giving more details about this in the next 24 hours.

Update #2: Here is Robert’s encouraging post detailing more about what Visual.ly is and will be.

Part 6: The essential collection of visualisation resources

The contents of this post are now published on the interactive Resources page

Part 5: The essential collection of visualisation resources

The contents of this post are now published on the interactive Resources page

Interview with data visualisation contest winner

A few weeks ago I published details of a data visualisation contest I had been invited to judge on. This was in relation to the then-topical issue of the population of black students at the UK’s elite universities.

The winner of this contest was announced last week and I caught up with Swiss designer Raphaël Halloran, the lucky winner of an iPad 2 and other goodies, to explore his design process in more detail.


(Click on image for interactive version)

 

Raphaël, congratulations on your victory! How does it feel to have designed the winning entry?

Thank you Andy! It feels great and is very encouraging! But I was actually really surprised when I discovered the results. I wasn’t expecting to win, since this was for me more of an HTML 5 exploration.

Furthermore, I was astonished by the quality of many of the entries. The participants seemed to have put great effort and time into their work.

 

What has been your journey into the field of data visualisation? Do you have a design or computer science background? When did you first discover it as an active subject area/growing field?

Data visualisation has interested me since my early years in university. I often had to handle data and information during my “Geoscience and environment” studies, although I wasn’t really trying to develop any specific design skills.

My passion for it grew bigger as I started a thesis on information design in environmental communication, which I am trying to finish at the moment.

I have also been working part time for about a year in an environmental engineering company as an infographic & interface designer, which has helped me improve my design skills. And as for my computer skills, I am self-taught.

As we are more and more confronted with an information overload, I truly see data visualisation as a promising and emerging field.

 

What motivated you to enter this competition? What did you think about the brief and the challenge set?

The iPad 2 prize motivated me of course! No, seriously, since I have pretty little experience in the field, it seemed to me like a great opportunity to test my skills and be judged by a panel of renowned designers/editors, notably David McCandless who has been a great inspiration for me.

Honestly, the data wasn’t the most interesting I have seen, but making something cool out of it seemed like a good challenge to take up.

 

Can you talk me through the early stages of how you went about creating your entry?

Firstly from reading the brief, to forming ideas, to analysing the data, conceiving the initial design possibilities? (any images, screenshots, sketches etc. would be great!)

I tried to keep it simple and mainly stick to the data that was given, as well as to the title (black students at UK universities).

The first stage was to understand the data. At first sight, I thought: wow! There are so few black students at UK universities! It is important to contextualize the data to avoid any bias. Therefore, it seemed essential to me to indicate on the graph the proportion of black population in England. In the end, you will observe that overall, UK universities have a higher proportion of black people than England itself.

I also wanted to be able to easily compare the different universities, since the elitism seemed to be the point. So the success rate by ethnicity was a key factor for this comparison.

I decided to simply go with these two sets of data to give two different views on UK universities and emphasize the comparison between the universities.

Histograms were my first choice and although they can often seem boring, they remain in my opinion one of the most effective ways for human eyes and brains to compare data. I tried to give it a cool look by using flashy colors and a minimalistic design conceived in Illustrator.

 

What key decisions did you make during this process? For example, did you come up with any other potential solutions that you eventually decided not to follow? Your solution was an elegantly simple one, did you have other more complex/busy ideas?

My first idea of the graph was pretty much the final version. I believe that “less is more” and maybe that is what rewarded me in this competition. Albert Einstein’s quote: “Make everything as simple as possible, but not simpler” definitely defines, in my opinion, data visualisation and information design.

It should remain understandable, and even though the data was in this case quite basic, there was no need to make it even more complicated by overloading with information people would never remember. You have to go straight to the point.

 

Can you explain the choices you made on your final piece? eg. the colours, the background, the icons, the interactive motion etc.?

As I said before, the objective was to stay minimalistic in every way, without reducing the relevance of the information. The aesthetics I chose weren’t inspired by any particular work I have seen before. I designed the icons from pictures of the universities. I also wanted this graph to stand out and be catchy, which explains my choice of fresh colours on a dark background.

Animation-wise, simple but elegant animation on a sober interface was the way to go.

 

Despite your victory, are there any things you would now do differently with the benefit of more time or having seen other entries? Are there other sets of data you would have liked to have access to in order to enhance the work (eg. other uni’s, more years of data, other contextual info)?

I don’t think I would have done anything differently but I would have liked to have access to the data on success rates in UCL and the average success rate in UK universities, which are missing on my graph.

I didn’t take the risk of making any hypothetical correlations with other types of data as it can lead the viewer to a biased truth, although I could have maybe compared with the ethnic diversity in the different regions of the UK.

 

Your project was developed using HTML5. Why did you choose this? Do you have any views on the future of HTML5 as a potential programming environment for creating visualisation (especially compared to other languages, Flash etc.)?

I think HTML 5 has enormous potential and you can already find mind-blowing animations all over the web. It will certainly grow quickly and spread across the web in the next few years. It brings web development to another level and will probably open up huge possibilities in data visualisation. Plus, it is iPad/smartphone-compatible, unlike Flash technology.

 

An effective visualisation enables people to draw insight from the data, can you tell me what key observations or conclusions you were able to make yourself from your work?

Very often, just one type of information isn’t enough to make a correct statement. In this case, one would prima facie assume that the University of Durham is the most elitist or “racist”. But when you take a look at the success rate, you will see that black students have a higher success rate than the average. This could probably simply be explained by the fact that County Durham is one of the least ethnically diverse places in the UK.

Overall, Oxford and Cambridge universities seem to have a tendency to accept fewer black students than the average. That said, I don’t want to draw any wrong conclusions. You would have to look at the admission success rates over a longer period of time to make a more accurate observation, I guess.

 

Now that you have achieved this victory, what future plans or ambitions do you have with data visualisation work in the future?

Continue developing my data visualisation skills, especially with interactive content and why not make a career out of it.