10 significant visualisation developments: July to December 2011

Back in July I published a collection of the 10 most significant visualisation developments from the first half of 2011. These were a very personal view of the most prominent, memorable, significant, progressive and appealing developments of the year so far.

As we prepare to bid farewell to 2011 I am now looking back over the latter half of the year with a follow-up collection of developments that I perceive to have had most significance during the period July to December.

As I made clear in the previous post, there will be selections here that won’t or wouldn’t make other people’s top 10s but I wouldn’t expect them to. These are just things that struck a chord with me and fulfil my basic criteria that they further the progress of data visualisation in their own particular way.

And so, in no particular order…

1. “We’re hiring!” – The increasing prevalence of data visualisation careers

Within the context of the worst economic conditions in many generations, data visualisation not only seems to be bucking the trend but actually continues to thrive against the odds. Over the past few years we have seen the field move from the fringes to become an increasingly mainstream discipline and this is being reflected in what I perceive to be a steady increase in the prevalence of attractive jobs, postings and project opportunities in organisations and visualisation agencies for people with the key skills and knowledge. In recent times we have seen openings with David McCandless, Periscopic, Twitter and Interactive Things, to name but a few. Data science in particular keeps being discussed as an emerging discipline and there is clearly a much better understanding about the value of these roles within businesses and organisations. There is perhaps one aspect which will hopefully see significant progress during 2012. Nathan’s FlowingData jobs forum is an excellent stream of opportunities both for postings and projects, and there are always avenues to pursue via LinkedIn. Beyond those, however, there is still much demand for a better-established data visualisation B2B marketplace to connect designers and consultants with the businesses out there who are clearly, and increasingly, interested in this activity.

2. The relentless stream of amazing online content

One of the trends I’ve been increasingly aware of during 2011, in general, but specifically the latter six months, has been the incredible, increasing volume of magnificent content available across the web. As visitors to this site will know, I compile a monthly collection of the best data visualisation content I find during the previous month. I have found the size of this task has grown considerably and I seem to save or bookmark a greater volume of superb material month-on-month, meaning I now need to present my collection over two posts to make it remotely digestible. I have been especially blown away by the relentless stream of inspirational video content with so many hours’ worth of invaluable footage of leading practitioners discussing great projects or imparting their considerable wisdom. Examples would be conferences such as the O’Reilly Strata and the Eyeo Festival which captured, compiled and shared superb collections of speakers. There is, of course, always something amazing worth watching on TED and you should check out Benjamin’s excellently curated Vimeo channel, with around 350 videos available.

3. The rise of the ‘Truth and Beauty Operator’

It would be difficult to argue against Moritz Stefaner being recognised as the most prolific, prominent and celebrated visualisation designer of 2011. I included his OECD Better Life Index project (which also received the gold seal of approval from Stephen Few!) in my previous ‘significant developments‘ post and since then we have seen a number of great projects from or involving Moritz, such as the World Economic Forum Global Agenda Survey and the Max Planck Research Networks project. On top of that, in one of my favourite blog posts of the year, he generously shared his thoughts about ‘How to Become a Data Visualization Freelancer‘ in an interview with Enrico Bertini. He has probably done more speaking events this year than Al Gore and you can now read his thoughts in this recent interview with Benjamin Wiederkehr for the excellent Substratum series. Putting my forecaster’s hat on, and maintaining the European theme, my tip(s) for the next 6 months would be that Gregor Aisch or Jan Willem Tulp will occupy one of these top 10 spots…

4. Some great new titles on the bookshelf

The second half of 2011 was decorated by the release of several important data visualisation books, including Nathan Yau’s ‘Visualize This’ (my review here), Noah Iliinsky and Julie Steele’s ‘Designing Data Visualizations‘ (review coming soon) and Manuel Lima’s ‘Visual Complexity‘ (review coming soon). News has also just reached me that the Guardian datastore is publishing an e-book about its work in time for Christmas so that’s another to look forward to. If you want something a little bit different, and harmless infographics are your thing, then the Lonely Planet’s ‘How to Land a Jumbo Jet‘ book of travel-related visualisations has now also been published. With regard to 2012, it is great news that Tamara Munzner is working on a book (preview chapters) and I really look forward to reading that, as I do the English translation of Alberto Cairo’s El Arte Funcional, due next summer.

5. Forbes American Migration

Published in November, this interactive map, created by Jon Bruner for Forbes, illustrated the patterns of migration within the US – involving almost 40 million Americans annually – by tracing the movement between each county over a 5 year period. As Bruner observes, this piece reflects the shifting “geographical marketplace of the States during the boom and bust of the last decade”. Based on IRS data, the selection of a county will display all the migration connections with other counties: those coloured blue send more migrants to the selected county; those in red take more than they send. The connecting arrows were perhaps somewhat redundant (in certain situations, like LA, they hide the county detail) but I found it an incredibly engaging solution, one that lured me into spending a significant amount of time interacting with it. The site’s social media metrics suggest many people have explored the data and reveal that this interactive has successfully made the subject matter accessible to all, allowing people to unlock the myriad stories that exist about this period of American social and economic history. I also appreciated the accompanying narratives from several writers who offered their expert perspectives on the key insights.
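To make the blue/red rule concrete, here is a minimal sketch in Python of how such a colouring could be derived from county-to-county flow records. This is not the code behind the Forbes map: the function name `classify_counties`, the `(origin, destination, migrants)` record format and the sample numbers are all assumptions made purely for illustration.

```python
from collections import defaultdict


def classify_counties(flows, selected):
    """Colour every county connected to `selected` by its net migration.

    `flows` is an iterable of (origin, destination, migrants) tuples
    (a hypothetical record format, not the actual IRS layout).
    Returns a dict mapping county -> "blue", "red" or "neutral".
    """
    inbound = defaultdict(int)   # migrants arriving in `selected` from each county
    outbound = defaultdict(int)  # migrants leaving `selected` for each county

    for origin, destination, migrants in flows:
        if destination == selected and origin != selected:
            inbound[origin] += migrants
        elif origin == selected and destination != selected:
            outbound[destination] += migrants

    colours = {}
    for county in set(inbound) | set(outbound):
        net = inbound[county] - outbound[county]
        if net > 0:
            colours[county] = "blue"     # sends more migrants than it takes
        elif net < 0:
            colours[county] = "red"      # takes more migrants than it sends
        else:
            colours[county] = "neutral"  # flows balance out
    return colours


# Made-up sample flows, for illustration only
sample_flows = [
    ("A", "LA", 120), ("LA", "A", 80),
    ("LA", "B", 200), ("B", "LA", 50),
    ("C", "LA", 10), ("LA", "C", 10),
]
print(classify_counties(sample_flows, "LA"))
```

The "neutral" case is my own addition for counties whose flows balance exactly; the original piece only describes blue and red.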

6. Hurricane Irene trackers – New York Times & Stamen

Towards the end of August, the focus of much news coverage across the western world was on the imminent Hollywood-esque arrival of the powerful Hurricane Irene along the east coast of the US. The main angle of media attention concerned its potential threat and impact on densely populated areas such as New York City. There were two prominent live visualisations used to track the progress of the hurricane, one developed by Stamen for MSNBC and the other for the New York Times by the usual suspects Matthew Bloch, Shan Carter and Matthew Ericson. I couldn’t split the two so decided to include both, especially as I feel the context of these tools and the story overall reflected the growing maturity of the visualisation discipline. Not only were they deployed so prominently in the coverage of this developing story (as opposed to a post-event analysis) but both of these red-hot graphics groups were capable of successfully mobilising wonderfully effective, informative and elegant visualisation solutions in limited time.

7. Occupy George

“By circulating dollar bills stamped with fact-based infographics, Occupy George informs the public of America’s daunting economic disparity one bill at a time”. A very straightforward application of visualisation but one that seemed to strike a real chord with the sentiment of those involved in or sympathetic to the Occupy movement, whether active or passive. The idea was that people could download the templates of various dollar bill infographics and then print them to use as props in the various protest movements that were spreading across the globe.

8. Are we learning to get along better?

A few articles over the past 6 months seem to encapsulate the sense I’ve picked up on that we might be learning to live more cordially alongside one another. By that, I mean infographics people and data visualisation people, pragmatic visualisation people and aesthetic visualisation people. In one of my favourite posts of the year, Zach Gemignani observed the relationship between artists and practitioners in data visualisation and rationalised how they can and should co-exist in a healthy discipline. Jorge Camoes added to this theme with an interesting piece about how one person can find and experience patterns that the next person may not: it is a truly individual experience. Enrico Bertini discussed how we should seek to help and advise infographic designers, in whose work we may identify significant flaws, rather than just dismiss or discount their contributions. Nathan discussed how we should come to acknowledge the importance of low quality infographics in the visualisation eco-system and, rather than chase them out of town, we should worry more if they start to disappear (in fact, maybe we should just come clean and confess our love for them). Finally, in a slightly different context, Robert Kosara made the observation from the VisWeek conference that visualisation was ‘growing up’ and had demonstrated evidence of it actually working (!). Without this sort of real-world evidence, and continuing attempts to define and clarify the different roles of visualisation, we would continue to be treading water and doing battle with one another, allowing the traditional fault lines to dominate coverage of the subject. Instead, I think we’re starting to see some real shape and harmony forming. Now, who fancies a group hug?

9. So many great, animated geospatial visualisations

I wrote at length about the power of animated geospatial visualisations in telling stories in my O’Reilly Radar piece and it is fair to say that the latter half of 2011 has been notable for a number of fantastic exhibitions of this popular method, including, but not limited to: Derek Watkins’ visualisation of US post office expansion, the Stanford visualisation of the spread of newspapers across the US, Mahir Yavuz’s ‘Sense of Patterns‘, NASA’s Visual Tour of Earth’s Fires and Michael Kriel’s iPhone movement project.

10. The launch of Visual.ly

The most high profile launch of the year came in the shape of Visual.ly, the field’s latest platform for showcasing the popularity of visualisation and infographics. Within just three months of launching a teaser preview video of their offering, 60,000 people had been sufficiently inspired to sign up for invites and Visual.ly had attracted an impressive array of partners, backers, and advisors. They now have nearly 50,000 Twitter followers and recently announced that they have raised $2M to build their business further. Recently, they launched a blog section of the site and if the early articles are anything to go by, this could be a very promising source of subject and knowledge content. Exciting stuff, but where does it all lead? Well, the most eagerly anticipated development within the field is still yet to launch – the enigmatic automated facility for registered users to create their own web-based infographics and visualisations, which was due at the end of 2011 but is yet to see the light of day. We wait to see how successful this tool will prove to be with casual and experienced designers alike, and there is still much concern and contempt for the T&Cs associated with the potential exploitation of content published on the site but, regardless, it has certainly been a significant arrival on the scene.

Special mentions…

Having once again struggled to keep this list down to only 10, here are special mentions for a few further developments that deserve recognition:

Experimental isarithmic maps visualise electoral data - David B. Sparks, a fifth-year PhD candidate at Duke University, published a fascinating set of experiments using ‘Isarithmic’ maps to visualise US party identification.

Guardian Hacking Timeline – I included this because of the multi-dimensional representation of the Hacking story as it evolves over time. You have the basic device of the timeline which plots the number of tweets per hour, but you have the accompanying, pulsing bubble chart reflecting the shifting focus of the story, the key event snippets, the most prominent tweets and dynamic keyword analysis. Every angle was comprehensively covered.

Eric Fischer – Eric has been responsible for a number of excellent visualisation projects during 2011, with Flickr data proving to be his particular choice of raw material – two of his most notable projects included London’s Twitter And Flickr Traffic and Visualising seasonal and temporal patterns in Flickr photos.

Cinemetrics - There’s not much practical insight emerging from this project but I really loved Frederic Brodbeck’s innovative work to extract, analyse and then visually re-package many different metrics associated with movies, including duration of scenes, average colour and scene motion.

Visualizing.org Marathons - over the past few months Visualizing.org have been running a series of 24 hour visualisation marathons for student teams to compete in. These have been excellent events, providing the potential next generation of visualisation designers with a unique opportunity to practise and develop their skills and experiences. Hopefully we will see more events like this repeated in 2012.

Honoured to be a judge and speaker at Malofiej 20

Just a quick announcement to share that I am absolutely thrilled and honoured to have been invited to attend Malofiej 20, the 20th edition of the ‘most important journalistic infographics event in the world’, as a judge and speaker.

I will be providing updates on this site and via the Malofiej 20 blog, sharing my experiences as a bright-eyed, first time attendee during the build up and the event itself.

My sincere thanks to Javier Errea, chairman of the Malofiej World Infographics Summit and Competition, and also to Alberto Cairo who I look forward greatly to meeting along with all the other members, speakers, judges and delegates in March!

Visualizing.org video tells the visualisation story of 2011

Here’s a nice little video wrap-up of 2011 from the splendid people at Visualizing.org, providing a quick look through the year’s events using their visualisation contests and uploaded visualisation projects to tell the story…

You can navigate through a gallery of all the projects displayed in the video here.

The worst visualisation I’ve seen this year? Google Zeitgeist 2011

The Google Zeitgeist report for 2011 has been published reviewing the most popular news, events, people and stories according to search queries typed into Google during the year. This year’s report contains a baffling 3D visualisation to present the top 10 lists.

Here you have a 3D, multi-series bar chart with a slider device to help you navigate through the 12 months of the year. Not only do you have to contend with having to adjust your eyes to the 3D plane, but for some reason only the tops of the bars are actually visually distinguishable with the bars themselves ghostly transparent.

If you wish, you can attempt to compare the top 10 by selecting two search terms and following the progress of each query through the year, with the slider now moving you through chunks of 4 weeks, and the display showing 3 lots of 4 week periods. Good luck with that. I would suggest a ruler, a sharpie pen and a defaced monitor may be your only hope of pulling that off.

Still struggling? Don’t worry because at least there is a ‘help’ prompt to guide us through this impenetrable design…

Outstanding, thanks for that.

I’m inclined to give this a ‘Worst visualisation of 2011’ badge because of its score on a disappointment index (an index I’ve just launched in my head today). Sure we’ve had an endless stream of shocking infographics but you don’t expect anything from them, they are an easy target. You just genuinely expect more from Google. For such a data and technology rich, innovative organisation this project just fails on so many levels.

Hive plots bid farewell to hairball visualisations

For those of you interested in network visualisations I strongly recommend you have a look at Hive Plots. This is a site set up to publish the findings and technical solutions from research by Martin Krzywinski (et al.) from the Genome Sciences Center, Vancouver. The project’s tag line, “Rational Network Visualization – Farewell to Hairballs”, should also give you a sense of the occasionally light-hearted approach to the project’s report, which makes it a very engaging read, much more so than many academic papers.

Most traditional network visualisations produced echo the complexity of a data context rather than make it particularly accessible for the average viewer to make sense of. One can form impressions of particular clusters and popularly-connected nodes, but more specific interpretations are somewhat limited due to the visual complexity displayed.

As the author articulates below, the focus of this hive plot study was to find a way to make such visualisations geared towards facilitating much deeper and more efficient visual interpretation:

The hive plot is a rational visualization method for drawing networks. Nodes are mapped to and positioned on radially distributed linear axes — this mapping is based on network structural properties. Edges are drawn as curved links. Simple and interpretable.

The purpose of the hive plot is to establish a new baseline for visualization of large networks — a method that is both general and tunable and useful as a starting point in visually exploring network structure.
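To illustrate the idea in that description, here is a rough Python sketch of the layout step only: each node is assigned to one of three radial axes and placed along it. It is not Krzywinski’s implementation. The axis rule (nodes with only outgoing links, only incoming links, or both), the use of total degree for the position along the axis, and the function name `hive_layout` are all my own assumptions for illustration; drawing the curved links is left out.

```python
import math


def hive_layout(edges, inner_radius=1.0, axis_length=10.0):
    """Place nodes of a directed network on three radial axes.

    `edges` is a list of (source, target) pairs.
    Returns {node: (axis_index, x, y)}.
    Assumed rules: axis 0 = nodes with no incoming links,
    axis 1 = nodes with no outgoing links, axis 2 = everything else;
    distance along the axis grows with the node's total degree.
    """
    out_deg, in_deg = {}, {}
    for source, target in edges:
        out_deg[source] = out_deg.get(source, 0) + 1
        in_deg[target] = in_deg.get(target, 0) + 1

    nodes = set(out_deg) | set(in_deg)
    if not nodes:
        return {}

    total = {n: out_deg.get(n, 0) + in_deg.get(n, 0) for n in nodes}
    max_total = max(total.values())

    layout = {}
    for node in nodes:
        if in_deg.get(node, 0) == 0:
            axis = 0
        elif out_deg.get(node, 0) == 0:
            axis = 1
        else:
            axis = 2
        angle = 2 * math.pi * axis / 3  # three axes, 120 degrees apart
        radius = inner_radius + axis_length * total[node] / max_total
        layout[node] = (axis, radius * math.cos(angle), radius * math.sin(angle))
    return layout


# Tiny made-up network, for illustration only
for node, position in sorted(hive_layout([("a", "b"), ("b", "c"), ("a", "c")]).items()):
    print(node, position)
```

Because both the axis and the position come from structural properties rather than a force simulation, the same network always produces the same picture, which is the point of the "rational" label in the quote above.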

To read more about Hive Plots either visit the site or download a copy of the same information captured in a pdf slide deck.

London Transport Museum historical visualisation exhibition

The Creative Review blog reports on a new poster show (‘Painting by Numbers‘) for early 2012 at the London Transport Museum which will be exhibiting some vintage data visualisations from the 1930s onwards.

The works relate to a number of designs which were used to present the quality and benefits of using London’s public transport.

Often thought of as a 21st century phenomenon, data visualisation – the presentation of information abstracted into a visual form – has been in use since the 2nd century when Egyptian people used tables to organise astronomical information. It was not until the 1920s that the importance and power of data visualisation for examining and making sense of data and information became more widely used.

The exhibition includes 20 posters from artists such as Hans Schleger, Theyre Lee-Elliott and James Fitton and runs from the 6th January to 18th March 2012. You can buy replicas of the posters from the London Transport Museum’s own poster collection shop.

(Acknowledging @Periscopic for spotting it first!)

Top economists reveal their graphs of 2011

To reflect on a particularly volatile past 12 months economically-speaking, the BBC has published a collection of graphs nominated by some of the top economists around the world as being “the graph which had the greatest impact on them, which they thought best explains the current financial situation, or which tells us something significant about what lies ahead”. Accompanying each graph in the slide show is a description from the economists explaining their choices.

It’s really interesting to see the array of designs that have had such an impact on these influential people, who are making such important policy decisions on the back of these displays (amongst other things, of course).

I know their selections are going to be based on the stories each chart tells (rather than a judgment of its accordance with best visualisation practices) but it is still amazing to see so many examples of poor colour choices, ugly type and, above all else, dominating, intruding chart apparatus.

Here’s a small sample of the chosen graphs; you can see the full slideshow here:

VICKY PRYCE, SENIOR MANAGING DIRECTOR FTI CONSULTING

RICHARD KOO, CHIEF ECONOMIST AT THE NOMURA RESEARCH INSTITUTE

KENNETH ROGOFF, FORMER CHIEF ECONOMIST AT THE IMF

Google Chrome’s helpful scrollbar ‘find’ visualisation

I spend most of my browsing life in Google Chrome these days and I’m a less angry, more satisfied person because of it. But only today have I noticed this handy little built-in visualisation feature.

When you run a ‘find’ operation for a given term, the vertical scroll bar highlights where, on the current page, instances of the keyword you searched for appear, allowing you to quickly navigate around the search results.

Reminds me of the great New York Times State of the Union address keyword visualisation feature.
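For the curious, the scroll bar idea boils down to mapping each match’s position in the page onto the height of the scroll track. The Python sketch below is only an approximation of that principle, not how Chrome actually does it: it uses character offsets in plain text as a stand-in for rendered vertical position, and the function name `scrollbar_marks` and the 400-pixel track height are assumptions for illustration.

```python
def scrollbar_marks(text, term, track_height=400):
    """Return pixel offsets on a scroll track where `term` occurs in `text`.

    Assumption: a match's character offset is proportional to its
    vertical position on the page (a real browser would use layout
    coordinates instead). Matching is case-insensitive.
    """
    if not term or not text:
        return []

    haystack, needle = text.lower(), term.lower()
    marks = set()
    start = haystack.find(needle)
    while start != -1:
        # Scale the offset into the track: 0 = top, track_height = bottom
        marks.add(round(start / len(haystack) * track_height))
        start = haystack.find(needle, start + len(needle))
    return sorted(marks)


# Made-up page text, for illustration only
page = "data " * 4
print(scrollbar_marks(page, "data"))
```

Matches that land on the same pixel collapse into a single mark, which mirrors the way dense clusters of hits read as one band on the scroll bar.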

Data visualisation training course locations (Jan-Jun 2012)

Last month I asked people to send me details of the locations where they would ideally be interested in attending one of my one-day ‘Introduction to Data Visualisation‘ public training courses. I have now combined the recent requests with all the previous expressions of interest and have collated a clear picture of where the most prominent clusters of interest exist.

I am therefore thrilled to share details of the confirmed locations for training courses between January and June 2012. The schedule of dates, venue details and the registration arrangements for each event will be finalised as soon as possible and published on my permanent training information page:

Derry (26th January)

London (9th February)

Bristol

Edinburgh

Amsterdam

Copenhagen

Barcelona

New York

Chicago

Washington DC

Baltimore

Toronto

Of course more locations may be arranged in due course should sufficient volume of interest emerge. If you don’t see your ideal location listed above, drop me a line and let me know where you want me to come – I will do my best to get to you somehow, sometime!

Don’t forget, you can also arrange private, bespoke training courses – just get in touch with me to discuss your requirements.

O’Reilly Strata Conference, Santa Clara 2012 (20% reader discount)

Once again it’s my pleasure to be supporting the third O’Reilly Strata Conference, which will be taking place in Santa Clara, California between February 28 and March 1 (don’t forget it’s a leap year!). Described as ‘the home of data science’, the conference brings together some of the very best developers, data scientists, data analysts, CIOs and IT professionals who are driving the data revolution.

As with previous events (in September and the launch event in February) Visualising Data readers can benefit from a 20% discount on their registration fees, using the code VIZDATA.

So why should you attend? Quite simply there is a fantastic line up of speakers across all topic tracks but especially the ‘Visualization and Interface’ one, with names like Noah Iliinsky, Simon Rogers, Jock Mackinlay, Ben Goldacre, Pete Warden, Hal Varian and Max Gadney found within the excellent schedule. Here’s my proposed timetable…

 

Tuesday, 28th February

 

9:00 Designing Data Visualizations Workshop

Noah Iliinsky (Complex Diagrams)
We will discuss how to figure out what story to tell, select the right data, and pick appropriate layout and encodings. The goal is to learn how to create a visualization that conveys appropriate knowledge to a specific audience (which may include the designer).

We’ll briefly discuss tools, including pencil and paper. No prior technology or graphic design experience is necessary. An awareness of some basic user-centered design concepts will be helpful.

Understanding of your specific data or data types will help immensely. Please do bring data sets to play with.

13:30 (1) Hands-on Visualization with Tableau

Jock Mackinlay (Tableau Software), Ross Perez (Tableau Software)

Data has always been a second class citizen on the web. As images, then audio, then video made their way onto the internet, data was always left out of the party, forced into dusty Excel files and ugly HTML tables. Tableau Public is one of the tools aiming to change that by allowing anyone to create interactive charts, maps and graphs and publishing to the web—no programming required.

In this tutorial you will learn why data is vital to the future of the web, how Tableau Public works, and gain hands-on experience with taking data from numbers to the web.

Through three different use cases, you will learn the capabilities of the Tableau Public product. The tutorial will conclude with an extended hands-on session covering the visualization process from data to publishing. Topics covered will include:

 

13:30 (2) The Craft of Data Journalism

Simon Rogers (Guardian)
Learn first hand from award-winning Guardian journalists how they mix data, journalism and visualization to break and tell compelling stories: all at newsroom speeds.

 

 

Wednesday, 29th February

 

8:45 Plenary

Welcome - Edd Dumbill (O’Reilly Media, Inc.), Alistair Croll (Bitcurrent)
The Apache Hadoop Ecosystem - Doug Cutting (Cloudera)
Decoding the Great American ZIP myth - Abhishek Mehta (Tresata)
A Big Data Imperative: Driving Big Action - Avinash Kaushik (Market Motive)
Keynote by Ben Goldacre - Ben Goldacre (Bad Science)

10:40 (1) Dealing With Messy Data

Q Ethan McCallum (independent)
Welcome to data science’s dirty little secret: data is messy. And it’s your problem.

It’s bad enough that data comes from myriad sources and in a dizzying variety of formats. Malformed files, missing values, inconsistent and arcane formats, and a host of other issues all conspire to keep you away from your intended purpose: getting meaningful insight out of your data. Before you can touch any algorithms, before you feed any regressions, you’re going to have to roll up your sleeves and whip that data into shape.

Q Ethan McCallum, technology consultant and author of Parallel R (O’Reilly), will explore common pitfalls of this data munging and share solutions from his personal playbook. Most of all, he’ll show you how to do this quickly and effectively, so you can get back to the real work of analyzing your data.

10:40 (2) Science of Visualization

Jock Mackinlay (Tableau Software)

Visual analysis is an iterative process for working with data that exploits the power of the human visual system. The formal core of visual analysis is the mapping of data to appropriate visual representations.

In this talk, you’ll learn:
• What years of research by psychologists, statisticians and others have taught us about designing great visualizations
• Fundamental principles for designing effective data views for yourself and others
• How to systematically analyze data using your visual system

11:30 Effective Data Visualization

Hjalmar Gislason (DataMarket)

Data visualization is often where people realize the real value in underlying data. Good data visualization tools are therefore vital for many data projects to reach their full potential.

Many companies have realized this and are looking for the best solutions to address their data visualization needs. There are plenty of tools to choose from, but even for relatively simple charting, many have found themselves with limited options. As the requirements pile up, options become limited: Cross-browser compatibility, server-side rendering, iOS support, interactivity, full control of branding, look and feel … and you’ll find yourself compromising, or – worse yet – building your own visualization library!

Building our data publishing platform – DataMarket.com – we’ve certainly been faced with the aforementioned challenges. In this session we’ll share our findings and approach for others to avoid our mistakes and learn from our – sometimes hard – lessons learned.

We’ll also share what we see the future of online data visualization holding: the technologies we’re betting on and how things will become easier, visualizations more effective, code easier to maintain and applications more user friendly as these technologies mature and develop.

13:30 Building a Data Narrative: Discovering Haight Street

Jesper Andersen (Bloom Studios)

Data isn’t just for supporting decisions and creating actionable interfaces. Data can create nuance, giving new understandings that lead to further questioning—rather than just actionable decisions. In particular, curiosity, and creative thinking can be driven by combining different data sets and techniques to develop a narrative around a set of data sets that tells the story of a place—the emotions, history, and change embedded in the experience of the place.

In this session, we’ll see how far we can go in exploring one street in San Francisco, Haight Street, and see how well we can understand its geography, ebbs and flows, and behavior by combining as many data sources as possible. We’ll integrate basic public data from the city, street and mapping data from OpenStreetMap, real estate and rental listings data, data from social services like Foursquare, Yelp and Instagram, and analyze photographs of streets from mapping services to create a holistic view of one street and see what we can understand from this. We’ll show how you can summarize this data numerically, textually, and visually, using a number of simple techniques.

We’ll cover how traditional data analysis tools like R and NumPy can be combined with tools more often associated with robotics like OpenCV (computer-vision) to create a more complete data set. We’ll also cover how traditional data visualization techniques can be combined with mapping and augmented reality to present a more complete picture of any place, including Haight Street.

14:20 Crafting Meaningful Data Experiences

Bitsy Bentley (GfK Custom Research)

I am frequently asked for advice about using data visualization to solve communication problems that are better served through improved information architecture. A nicely formatted bar chart won’t rescue you from a poorly planned user interface. When designing meaningful data experiences it’s essential to understand the problems your users are trying to solve.

In this case, I was asked to take a look at a global data-delivery platform with a number of issues. How do we appeal to a broad cross-section of business users? How do we surface information to our clients in a useful way? How do we facilitate action, beyond information sharing? How do we measure success?

A user-centered approach allowed us to weave together a more meaningful experience for our business users and usability testing revealed helpful insights about how information sharing and data analysis flows within large organizations.

Data visualization is a powerful tool for revealing simple answers to complex questions, but context is key. User-centered design methods ensure that your audience receives the information they need in a usable and actionable way. Data visualization and user experience practices are not mutually exclusive. They work best when they work together.

16:00 (1) Netflix recommendations: beyond the 5 stars

Xavier Amatriain (Netflix)

Netflix is known for pushing the envelope of recommendation technologies. In particular, the Netflix Prize put a focus on using explicit user feedback to predict ratings. This kind of recommendation showed its value in the time when Netflix’s business was primarily mailing DVDs. Nowadays Netflix has moved into the streaming world and this has spurred numerous changes in the way people use the service. The service is now available on dozens of devices and in more than 40 countries.

Instead of spending time deciding what to add to a DVD queue to watch later, people now access the service and watch whatever appeals to them at that moment. Also, Netflix now has richer contextual information such as the time and day when people are watching content, or the device they are using.

In this talk I will describe some of the ways we use implicit and contextualized information to create a personalized experience for Netflix users.

16:00 (2) Roll Your Own Front End: A Survey of Creative Coding Frameworks

Michael Edgcumbe (Columbia University), Eric Mika (The Department of Objects)

Many options exist when choosing a framework to build a custom data explorer on top of your company’s stack. With a brief nod to out-of-the-box business intelligence solutions, the presenters will offer an overview of the creative coding frameworks that lend themselves to data visualization on and across web browsers and native apps written for Mac OS X, iOS, Windows, and Android. Evaluation of the strengths and weaknesses of libraries such as Processing, OpenFrameworks, Cinder, Polycode, Nodebox, d3.js, PhiloGL, Raphael.js, Protovis, and WebGL will be explored through visual examples and code. The audience should come away with a sense of what investments into education will return a high value product that serves unique design goals.

16:50 Sketching With Data

Fabien Girardin (Lift Lab)

Since the early days of the data deluge, Lift Lab has been helping many actors of the ‘smart city’ transform the accumulation of network data (e.g. cellular network activity, aggregated credit card transactions, real-time traffic information, user-generated content) into products or services. Because of their innovative and cross-disciplinary nature, our projects generally involve a wide variety of professionals, from physicists and engineers to lawyers, decision makers and strategists.

Our innovation methods bring these different stakeholders on board with fast prototyped tools that promote the processing, recompilation, interpretation, and reinterpretation of insights. For instance, our experience shows that the multiple perspectives extracted from the use of exploratory data visualizations are crucial to quickly answer some basic questions and provoke many better ones. Moreover, the ability to quickly sketch an interactive system or dashboard is a way to develop a common language amongst varied and different stakeholders. It allows them to focus on tangible opportunities for products or services that are hidden within their data. In this form of rapid visual business intelligence, an analysis and its visualization are not the results, but rather the supporting elements of a co-creation process to extract value from data.

We will exemplify our methods with tools that help engage a wide spectrum of professionals in the innovation path in data science. These tools are based on a flexible data platform and visual programming environment that make it possible to go beyond the limited design possibilities of industry standards. Additionally, they reduce the prototyping time necessary to sketch interactive visualizations that allow the different stakeholders of an organization to take an active part in the design of services or products.

 

 

Thursday, 1st March

 

8:45 Plenary

Welcome Alistair Croll (Bitcurrent), Edd Dumbill (O’Reilly Media, Inc.)
Democratization of Data Platforms Jonathan Gosier (metaLayer Inc.)
Embrace the Chaos Pete Warden (Jetpac)
Keynote by Usman Haque Usman Haque (Pachube.com)
Using Google Data for Short-term Economic Forecasting Hal Varian (Google)
Keynote by Coco Krumme Coco Krumme (MIT Media Lab)

10:40 Video Graphics – Engaging and Informing

Max Gadney (After The Flood)

Videographics achieve the two most important criteria of the visualizer.

They engage attention and they inform.

I am currently working with the BBC to define a new format – that of the ‘Video Data Graphic’. Some of these exist online to varying degrees of success but we are codifying best practice, auditing current activity and can show our work in the market context.

I will discuss how video is an information-rich medium, drawing on a survey of data resolution across media, and how these videos can complement the BBC online offering as a whole.

Some subjects to cover will be: storytelling principles; what actually works in 2 minutes; scripting and storyboarding; drafting a plan; timescales, costs and resources; designing for cognition; how video needs to understand how we perceive.

I’ll be showing many examples in addition to our work.

This is a high paced session, with lots to look at and an excellent mix of storytelling and information design ideas. There is an excellent balance between theory and practical advice.

11:30 Rich Sports Data and Augmented Reality

Ryan Ismert (Sportvision, Inc)

Our presentation will cover the nascent fusion of automatically-collected live Digital Records of sports Events (DREs) with Augmented Reality (AR), primarily for television broadcast.

AR has long been used in broadcast sports to show elements of the event that are otherwise difficult to see – the canonical examples are the virtual yellow “1st and 10” line for American Football and ESPN’s KZone™ strike zone graphics. Similarly, sports leagues and teams have historically collected large amounts of data on events, often expending huge amounts of manual effort to do so. Our talk will discuss the evolution of data-driven AR graphics and the systems that make them possible. We’ll focus on systems for automating the collection of huge amounts of event data/metadata, such as the race car tracking technology used by NASCAR and the MLB’s PitchFX™ ball tracking system. We provide a rubric for thinking about classes of sports event data that encompasses scoring, event and action semantics metadata, and participant motion.

We’ll briefly discuss the history of these sports data collection technologies, and then take a deeper look at how the current first generation of automated systems are being leveraged for increasingly sophisticated analyses and visualizations, often via AR, but also through virtual worlds renderings from viewpoints unavailable or impossible from broadcast cameras. The remainder of the talk will examine two case studies highlighting the interplay between rich, live sports data and augmented reality visualization.

The first case study will describe one of the first of the next-gen digital records systems to come online and track players – Sportvision’s FieldFX™ system for baseball. Although exceedingly difficult to collect, robust player motion data promises to revolutionize areas such as coaching and scouting performance analysis, fantasy sports and wagering, broadcast TV graphics and commentary, and sports medicine. We’ll show examples of some potential applications, and also cover data quality challenges in some detail, in order to examine the impact that these challenges have on the applications using the data.

The second case study will examine the rise of automated DRE collection as an answer to that nagging question about AR – ‘what sort of things do people want to see that way?’ Many of the latest wave of AR startups are banking huge amounts of venture money that the answer is in user-generated or crowd-sourced content. While this may end up being true for some consumer-focused mobile applications, our experience in the notoriously tight-fisted rights and monetization environment of sports has led directly to the requirement to create owned, curated data sources. This came about from four realizations that we think are more generally applicable to AR businesses…

  1. Cool looking isn’t a business, even in sports.
  2. It must be best shown in context, over video, or it won’t be shown at all.
  3. The ability to technically execute AR is no longer a barrier to entry. Cutting edge visualization will only seem amazing for the next six seconds.
  4. We established impossibly high quality expectations, and now the whole industry has to live with them.

 

13:30 Visualizing Geo Data

Jason Sundram (Where, Inc.)

In an increasingly mobile world, we are each generating tons of geo-tagged data. Photo uploads to Instagram, tweets, Foursquare check-ins, local searches, and even real-time public-transportation feeds are commonplace. The companies that gather this data make a lot of it freely available. The people who work for these companies have many opportunities to learn from this data. But in order to learn, we must first figure out what questions to ask. Visualization is a tool that helps us think of questions and begin to answer them.

There are 3 different major ways to think about geodata:

  1. Over time
  2. Aggregated spatially (e.g. by county)
  3. Aggregated by density (e.g. heatmap)

Additionally, creating tools that allow users to explore data on multiple scales (i.e. zoom) is important, but adds complexity: you have to find a tile source and perhaps even render your data to tiles.
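As a rough illustration of the tiling step, zoomable web maps commonly cut the world into a square grid of tiles at each zoom level. The sketch below assumes the widely used XYZ (Web Mercator) tile scheme, which the talk description does not name, and `tile_for` is a hypothetical helper rather than anything from the session.

```python
import math

# Assumption: the common XYZ / Web Mercator tile scheme.
MAX_LAT = 85.0511  # approximate latitude limit of the Web Mercator square

def tile_for(lat_deg, lon_deg, zoom):
    """Return the (x, y) index of the tile containing a point at a zoom level."""
    n = 2 ** zoom  # number of tiles along each side at this zoom
    lat = max(-MAX_LAT, min(MAX_LAT, lat_deg))
    lat_rad = math.radians(lat)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 and the extreme latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)
```

Each extra zoom level quadruples the number of tiles, which is one reason pre-rendering data to tiles adds complexity.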

Choice of projection is key. Most of us grew up with the Mercator projection, but an equal-area projection is often a better choice.
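To make the projection point concrete, here is a minimal sketch comparing Mercator’s area distortion with one simple equal-area alternative. The Lambert cylindrical equal-area projection is chosen here purely for illustration; the talk description does not say which equal-area projection is meant, and both function names are hypothetical.

```python
import math

def mercator_area_inflation(lat_deg):
    """Factor by which Mercator inflates areas at a latitude, relative to the equator."""
    # Mercator stretches both axes by 1 / cos(latitude), so areas grow by its square.
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2

def lambert_equal_area(lon_deg, lat_deg):
    """Lambert cylindrical equal-area projection on a unit sphere (illustrative choice)."""
    # x grows linearly with longitude; y = sin(latitude) preserves relative areas.
    return math.radians(lon_deg), math.sin(math.radians(lat_deg))
```

At 60° latitude the inflation factor is 4, so a Mercator map draws regions there four times larger than equally sized regions on the equator, which can mislead when shading areas by value.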

I will take one data set and walk through visualizing it using the 3 approaches described above.

The first example will use Processing and TileMill to generate a zoomable animated map, playing back a month’s worth of data. I’ll show how to render the map to a movie for easy distribution.

The second example will use d3.js to show the same data at a county level in a choropleth map. I’ll discuss color schemes and interaction, and compare what can be done with d3.js to Fathom’s Stats of the Union project.

The last example will talk about how to make a heatmap with millions of data points.
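The talk description does not say how the heatmap is built. One common approach, sketched here as an assumption, is to bin points into a fixed grid in a single pass so that memory depends on the number of occupied cells rather than the number of points. `density_grid` and its `cell_deg` parameter are hypothetical names.

```python
import math
from collections import Counter

def density_grid(points, cell_deg=0.01):
    """Count (lat, lon) points per grid cell of cell_deg degrees on each side."""
    counts = Counter()
    for lat, lon in points:  # works with any iterable, including a generator
        row = math.floor(lat / cell_deg)
        col = math.floor(lon / cell_deg)
        counts[(row, col)] += 1
    return counts
```

The resulting counts can then be mapped to a color ramp, with one colored cell drawn per grid square instead of one mark per point.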

14:20 (1) Beautiful Vectors: Emerging Geospatial technologies in the browser

Mano Marks (Google, Inc.), Chris Broadfoot (Google)

Beautiful, useful and scalable techniques for analysing and displaying spatial information are key to unlocking important trends in geospatial and geotemporal data. Recent developments in HTML 5 enable rendering of complex visualisations within the browser, facilitating fast, dynamic user interfaces built around web maps. Client-side visualization allows developers to forgo expensive server-side tasks to render visualisations. These new interfaces have enabled a new class of application, empowering any user to explore large, enterprise-scale spatial data without requiring specialised geographic information technology software. This session will examine existing enterprise-scale, server-side visualization technologies and demonstrate how cutting edge technologies can supplement and replace them while enabling additional capabilities.

14:20 (2) It’s Not “Junk” [Data] Anymore

Kaitlin Thaney (Digital Science), Mark Hahnel (FigShare), Ben Goldacre (Bad Science)

In a research environment, under the current operating system, most data and figures collected or generated during your work are lost, intentionally tossed aside or classified as “junk”, or at worst trapped in silos or locked behind embargo periods. This stifles and limits scientific research at its core, making it much more difficult to validate experiments, reproduce experiments or even stumble upon new breakthroughs that may be buried in your null results.

Changing this reality not only takes the right tools and technology to store, sift and publish data, but also a shift in the way we think of and value data as a scientific contribution in the research process. In the digital age, we’re not bound by the physical limitations of an analog medium such as the traditional scientific journal or research paper, nor should our data be locked into understandings based off that medium.

This session will look at the socio-cultural context of data science in the research environment, specifically at the importance of publishing negative results through tools like FigShare – an open data project that fosters data publication, not only for supplementary information tied to publication, but all of the back end information needed to reproduce and validate the work, as well as the negative results. We’ll hear about the broader cultural shift needed in how we incentivise better practices in the lab and how companies like Digital Science are working to use technology to push those levers to address the social issue. The session will also include a look at the real-world implications in clinical research and medicine from Ben Goldacre, an epidemiologist who has been looking at not only the ethical consequences but issues in efficacy and validation.

16:00 From Big Data to Big Insights

Robbie Allen (Automated Insights)

With recent advances in linguistic algorithms, data processing capabilities and the availability of large structured data sets, it is now possible for software to create long form narratives that rival humans in quality and depth. This means content development can take advantage of many of the positive attributes of software, namely, continuous improvement, collaborative development and significant computational processing.

Robbie Allen, the CEO of Automated Insights, and his team have done this to great effect by automatically creating over 100,000 articles covering College Basketball, College Football, NBA, MLB, NFL in a 10 month period. Automated Insights is now branching out beyond sports into finance, real-estate, government, and healthcare.

In this talk, Robbie will share the lessons his company has learned about the viability of automated content and where the technology is headed. It all started with short sentences of uniform content and has expanded to the point where software can generate several paragraphs of unique prose highlighting the important aspects of an event or story.

16:50 Exploring the Stories Behind the Data

Cheryl Phillips (The Seattle Times)

A story on the U.S. Census will tell the broad themes behind the data and use people to exemplify those themes. But every reader also wants answers to more specific questions: How did my community change? What happened where I live, in my neighborhood? Being able to provide those answers through an interactive visualization is what story-telling through the data is all about. A story or report on a subject by its very nature summarizes the underlying data. But readers may have questions specific to a time, date or place. Visualizing the data and providing effective, targeted ways to drill deeper is key to giving the reader more than just the story. The visualization can enhance and deepen the experience. Cheryl Phillips will discuss data visualization strategies to do just that, providing examples from The Seattle Times and other journalism organizations.