YouTube ‘Map My Summer’

Kristian Saliba, a digital art director for Three Drunk Monkeys creative agency (Sydney, Australia), has sent me details of an interesting project that has just been launched for YouTube titled ‘Map My Summer’.

This is a YouTube experiment that asks “what does an Australian summer really look like?”. Users are encouraged to upload clips of their many and varied summer experiences. The clips are located on an (initially) blank map, building up an abstract representation of Australia’s geography based on an emerging canvas of summer experiences.

Users are able to browse the map and video content through an info-graphic styled user interface which segments the data into colour coded categories.

Take a look at the great activities people get up to during the Australian summer, which runs from December to March and is still under way. Those of you based down under, or who have spent time down there (any English cricket fans?!), get your videos uploaded and help build up the map.

Visualising travel facts, figures and ephemera

Lonely Planet are working on an exciting new visual guide to the world, titled “How to Land a Jumbo Jet” based on a unique collection of cultural and travel-inspired visualisations. To help create this book they want to crowd-source innovative ideas and creations from across the visualisation and design community.

As noted on Infosthetics, visualisation contests are becoming more prevalent every week. However, this particular venture should be viewed in a different light. It is not about a single submission winning a cash prize and the remaining entries disappearing into the abyss of an online gallery. Rather, it is about a significant volume of submissions being selected to be published in a book.

The Brief

The book will feature a number of infographics that present stories and insights around the many varied experiences that come from world travel. It is clearly a broad brief giving you a wonderful blank canvas on which to arrive at a unique visualisation-based perspective about your experiences of different countries, cultures, people, modes of travel, local customs, souvenirs, political histories, cuisine etc.

We want infographics that illuminate, entertain and inform. We want them to be great examples of information design, and we want the information in them to be true and for them to have a good degree of integrity. That doesn’t mean they can’t be light-hearted though!

Lonely Planet will select approximately 70 of the best, most innovative pitches and commission them to be created, with a fee of US$300 being paid for those commissioned. You might not think this is a great amount, but the real incentive is that the selected designers will receive guidance and feedback from the editor of the book – infographic designer, Nigel Holmes – and of course you will have the considerable pride of seeing your work published in the book, to be launched at the end of the year.

How to submit

If you have an idea for a visualisation or infographic you need to construct a compelling pitch. A pitch does not have to be complete – sketches, mock-ups or even a written pitch for your concept are fine. Then all you have to do is email through your submission by March 20, 2011 (keeping the file size to less than 2MB).

Pitches may be in the form of scanned rough sketches, vector artwork, or a verbal description of the project. The more information (visual and verbal) that is supplied in the pitch, the better – we want to be convinced to choose your work. Additional information supporting the nature and quality of your work – e.g. CV, relevant folio, website – would also be helpful.


Lonely Planet have published some useful guidelines for the specific qualities they will be looking for in the published information graphics:

What do they not want to see?

We do not want data dumps that might look visually attractive, but that make no point. Instead, edit the data rigorously, and ask yourself the question: “what’s the point I’m making in this graphic?”. Then make that point shine.

Hopefully these guidelines will help ensure we don’t have a repeat of last year’s infamous ‘Little Book of Shocking Global Facts’…

Visualisation Reflections: #8 Visualisation Designer

This is a follow-up post to my eighth article in the Visualisation Insights series which I published earlier this week. The purpose of this companion series is to optimise the learning opportunities from each insights article, reflecting on the ideas, issues and observations to emerge.

Why did I choose this subject?
I first came across Nathaniel’s Timeplots designs a couple of years ago and was instantly charmed by the intricate detail and beautiful execution of the designs he and his collaborators produce. In an era when we are (rightly) excited by the potential of interactivity and (also rightly) fatigued by the proliferation of bad-practice infographics, it is reassuring to see such a pure visualisation art form shining out.

I approached Nathaniel for this interview to discover more about his background, his design influences and the methods he employs.

Impressions prior to the interview?
Without wishing to repeat myself, my main impressions of Nathaniel’s work were formed around the careful and deliberate design practices he deploys. The subtle use of colour, the absence of attention-grabbing visuals, the complete attention to detail, the dedication and depth of research, the perfect balance of composition – all the classic visualisation and information design methods are on show here, and every visual element adds value to the information exchange.

I knew from information on his site that Nathaniel had a keen interest and background in politics that clearly influenced his focus on political subject matter. I also knew that he had experienced a highly successful and privileged education at some of the most outstanding establishments in the US and had enjoyed a successful career to date. I was keen to hear more about what motivated him to translate this passion into these detailed infographic masterpieces.

Impressions after the interview?

I get the impression that Timeplots represents something of a ‘labour of love’ that provides Nathaniel with a perfect vehicle through which to pursue long held interests and highly tuned capabilities:

I wanted… to produce something tangible with an aesthetic component. I had some ideas (like a visual history of the senate) that had been something I had wanted to do for 20 years. I wanted to learn new things and acquire new skills. I had a long-standing interest in data visualization and I wanted an excuse to pursue it. I wanted to meet people in that space and learn from them.

He has a deep history in programming/computer science and a long-time passion for data/stats which led into, and beyond, political science and an interest in statistical methodologies. He is clearly a naturally gifted analyst and somebody with a keen eye for conceiving visualisation solutions to communicate complex issues. Coupled with his considerable technical capabilities this makes him the ideal visualisation all-rounder.

One of the most interesting elements of the interview comes when he references an old blog post, published back in 2005, in which he describes being reminded of a class he took at Yale with a certain Professor Edward Tufte:

[The class] was an introduction to an essential tool for learning how to think, it was also an appeal to numerical honesty in marshalling an argument, and thirdly (an especially distinguishing matter for me), it was informed with an aesthetic sense. If I had not taken that class, I would certainly not (for better or worse) have gone on to graduate study in the field that I chose.

A striking observation is that it wasn’t necessarily Tufte’s principles that left the greatest impact on him, but rather his admiration for the quality of Tufte’s beautifully crafted books.

If you are not familiar with Tufte’s superb books on the visual display of information, you ought to be. A review that he cites called them “cognitive art” and that’s precisely how I think of them. If you are like me, you will wish you had written those books, or books of similar quality in whatever happens to be your subject.

A common theme that runs throughout this interview is the respect Nathaniel has for people, especially those involved in craft, who pursue their passion, have the conviction to do things their own way and achieve great standards in their work.

Another interesting remark is his recognition of a personal desire to find the time and resources to do things really well. This is especially important to him as he observes “as much as anything, I wanted these posters for myself”.

Another section that really struck me in my interview with Nathaniel was his clear motivation for these pieces of work to have an enduring quality, something that commands engagement beyond the initial reactive glance.

I like seeing people stand in front of one of the posters, trying to puzzle it out.  My work is not generally meant to be absorbed in one passing.

Every product is created to be lasting information art that reveals new patterns and details upon repeated viewings.

This is something I feel is very important in visualisation. We live in an age where consumers want things in an instant and don’t want to work to obtain something. People occasionally, and wrongly, measure the effectiveness of a visualisation by the immediacy of interpretation. In certain contexts that is clearly an objective (such as in Air Traffic Control or in the environment of Stock Exchanges), but otherwise complex issues and stories represented in visual form shouldn’t be ‘dumbed down’ for immediate consumption, rather just made more accessible irrespective of how long it takes to draw insight.

It’s also important to acknowledge the point Nathaniel makes about showing some flexibility in his design execution, experimenting with the inclusion of features such as imagery to test the reaction. This encapsulates, once again, the curious difficulty of judging the response to visualisation work:

I wanted them to be a bit more accessible than the previous prints we’ve done, so I’ve allowed myself to put in more images and photos, which I generally steer away from. I’m curious to see if this makes them more or less popular.

Finally, credit to Nathaniel for generously mentioning the likes of Wallstats and Historyshots, because I would also recognise these as being exceptional examples of a similar art to Timeplots.

Many thanks again to Nathaniel for agreeing to take part in this interview – it has proven to be a great insight into the world of a visualisation designer. I wish him and his colleagues at Timeplots all the best for the future.

Look out for future insights articles, with many interesting interviews and interviewees lined up…

Eye Magazine, Issue 78 on Information Design

I’d recommend you try and get hold of a copy of the winter issue (number 78) of Eye magazine, the international review of graphic design, which is dedicated to the subject of Information Design and includes a special 26 page pull-out feature.

Information Design – Not so much a specialist genre, more an essential fact of life.

This winter issue features the following articles:

As well as an interesting opinion piece “Understand, visualise, survive” by Max Gadney

Remember, this is the international review of graphic design and so inevitably will approach coverage of the subject from that perspective.

Best of the visualisation web… January 2011

At the end of each month I pull together a collection of links to some of the most relevant, interesting and useful articles I’ve come across during the previous month. If you follow me on Twitter you will see many of these items tweeted as soon as I find them. Here’s the latest collection from January 2011:

Data Market | 13 thousand data sets, 100 million time series, 600 million facts | Link

A List Apart | Design Criticism and the Creative Process | Link

Junk Charts | A smarter word cloud: likes and not likes | Link

CNN Money | Best Companies To Work For 2011 | Link

Steph Abegg | Supercenters, Hamburgers, and Coffee: Using density-equalizing cartograms to display the distribution of Walmarts, McDonalds, and Starbucks in the US | Link

CERN Document Server | New trends in data analysis and visualization on the web | Link

Excel Charts | Data visualization hierarchy of needs | Link

Fell In Love With Data | Demystifying cargo cult visualization: You cannot visualize 3 variables by mixing 3 colors | Link

Scientific American | Words, pictures, and the visual display of scientific information: Getting back to the basics of information design | Link

David B Sparks | High dimension visualization in Political Science | Link

R&D Mag | How can data visualization change technology? | Link

Fell In Love With Data | How do you visualize too much data? | Link

GE Blogs | How much CO2 is created by… | Link

Flowing Data | In investing, timing is everything | Link

AIGA | Video of Jonathan Harris “Cold:Bold” at ‘Gain: AIGA Business and Design Conference’ | Link

Noah Brier | On Infographics | Link

Huffington Post | Who gives the best info? A short history of Information Design | Link

GE Blogs | Powering the kitchen | Link

Eager Eyes | Research: How to tell stories with data? | Link

TedTalks | TEDxGoteborg – Anders Ynnerman: Visualizing the medical data explosion | Link

TedTalks | TEDMED – Thomas Goetz: It’s time to redesign medical data | Link

UX Magazine | Social Seen: Analyzing and visualizing data from social networks | Link

Statistical Graphics and More | Data analysis of yesteryear | Link

Core 77 | Subaru on how to design a mediocre car | Link

Zero Intelligence Agents | Swallowing the Academic “Red Pill” | Link

Dashboard Spy | The insidious infographic | Link

TYPOGRAPH | Scale and rhythm | Link

Perceptual Edge | Designing with the Mind in Mind: A brief book review | Link

Perceptual Edge | Simplicity vs. Complexity: Design goals | Link

Well-Formed Data | Notablia – Visualising deletion discussions on Wikipedia | Link

Harvard Vision Lab | Silencing is a new illusion that shows it’s hard to notice when moving objects change | Link

ReadWriteWeb | How a science journalist created a data visualization to show the magnitude of the Haiti earthquake | Link

Inspirational vintage infographics | Link

10,000 Words | 7 Innovative online maps | Link

Smart Data Collective | For data visualization, circles don’t cut it | Link

Cool Hunting | Daytum iPhone App | Link

Wired | Stories that work in 150 seconds | Link

Infosthetics | Let’s debate the issue of aesthetics in data visualization… on television | Link

O’Reilly Radar | Visualization deconstructed: New York Times “Mapping America” | Link

Jonathan MacDonald | The fallacy of data bubble ignorance | Link

Visualisation Insights: #8 Visualisation Designer

This is the eighth article in my Visualisation Insights series. The purpose of this series is to provide readers with unique insights into the field of visualisation from the different perspectives of those in the roles of designer, practitioner, academic, blogger, journalist and all sorts of other visual thinkers. My aim is to bring together these interviews to create a greater understanding and appreciation of the challenges, approaches and solutions emerging from these people – the visualisation world’s cast and crew.

Nathaniel Pearlman is the President of Timeplots, a Washington DC based design venture dedicated to making the visual display of information more comprehensible and more aesthetic.

The aim of Timeplots is to tell complex stories in visual form. Their design offering is split into ‘public’ projects, which are typically based on subject matter relating to political history, and private ‘Timeplots on Demand’ for individuals/organisations seeking to map their specific stories or subjects.

The resulting visual forms are wonderful, intricate, enlightening pieces of information artwork that are available for purchase as high quality prints. They are more than simply informative posters; they are incredibly dense, beautiful pieces of visualisation that demonstrate the craft at its finest. Every visual element deployed adds value to the information exchange:

Timeplots are carefully crafted to provide a clear, comprehensive perspective of a specific subject. Every product is created to be lasting information art that reveals new patterns and details upon repeated viewings.

I greatly admire Nathaniel and his collaborators’ work and so was delighted when he agreed to take part in an interview for this series, allowing me to find out more about his background, the Timeplots venture and the design process behind these amazing works.

Can you give me a brief outline of your career/educational background leading up to your current life as a visualisation designer?

As a kid I was an ardent follower of politics and sports, and I think that’s where my interest in data and its visualization began. I learned to program on an Apple II Plus and experimented with graphing all kinds of things; as a high schooler I was teaching computer programming to both adults and children. Then I got more formal training with a degree in computer science at Yale. I worked for a few years as a professional programmer before spending four years in the doctoral program in Political Science at MIT, where I concentrated in American politics and methodology (mainly statistics). I did everything but write my dissertation there, instead leaving to found NGP Software, Inc. (now NGP VAN, Inc.), which has grown into a 130+ person political software company. Along the way I conceived of a lot of visualization projects, but never had the time to tackle them until recently when I founded Timeplots as a vehicle to do it.

I notice your undergraduate studies took place at Yale and you were a student of Edward Tufte. That must have been a fabulous experience to receive such an esteemed grounding in the subject at the beginning of your career?

Definitely the class with Tufte was influential.  I once blogged about it here:

Would you say, therefore, that Tufte’s principles have had the strongest influence on your design style/principles? Are there any other authors/designers/practitioners/academics that have had a profound impact on your work?

I think less about Tufte principles in particular, though some are quite influential with me, and more about the quality of his books, in substance, design, layout and prose when working on Timeplots. I like that he did it his own way and I try to do mine my way. I think the biggest challenge is to find the time and resources to do something really well.  I’d say I’m learning a craft and have a long way to go. I think highly of other people I studied with at Yale, MIT and Harvard, including Professors Mayhew, Perlis, Ansolabehere, King, Stewart, Snyder. I am inspired by people with high standards for themselves and others. I’d include the potter Marguerite Wildenhain, woodworkers like Sam Maloof and Wharton Esherick.  I like people who have found a way to make a living doing their own thing.  I also am a reader and I have a couple of shelves of books on design, information graphics, data visualization.

What is the best piece of advice you have received or would wish to impart to other visualisation designers?


What is your business model for Timeplots: Where did the motivation come from? What are your aims? Do you offer services for private projects alongside your public work? Where would you like to see it progress over the coming years?

I hope that Timeplots settles on a good business model. Right now we are selling posters and doing some one-off contracting work, but I imagine it will evolve over time. The motivation came from a bunch of things.  I had the chance to step away from day-to-day management of my software company and I wanted to revisit some of the challenges of bootstrapping another business.  I wanted the new thing to produce something tangible with an aesthetic component. I had some ideas (like a visual history of the senate) that had been something I had wanted to do for 20 years. I wanted to learn new things and acquire new skills. I had a long-standing interest in data visualization and I wanted an excuse to pursue it. I wanted to meet people in that space and learn from them.  So we will see how it goes. I’d like to find some more good folks to work with – anyone reading this interested?

Your first project involved a three-part series visualising the history of major US political institutions. What was the inspiration behind undertaking this work, and what inspired the subject matter?

I’m a student of American political history and as much as anything, I wanted these posters for myself. I should note here the hard work of the folks who have worked with me along the way on the research, programming, and design of these posters.  The prints are time-consuming and my main roles are idea generator, knowledge repository, editor, and funder.

What has been the most pleasing feedback or comment you have received about this series/project?

I am really honoured when someone actually takes their hard-earned money, buys my work among all the possible things available in this world, and puts it on their wall.  I love seeing them in someone’s home or office.  Right now we have a little show at the coffee shop at Politics and Prose bookstore in Washington, DC.  I like seeing people stand in front of one of the posters, trying to puzzle it out.  My work is not generally meant to be absorbed in one passing.

You have recently launched a fantastic new Timeplots project with two works on the visual history of the Democratic and Republican parties.

Can you outline some of the design process and decisions that lie behind these pieces?

Well I keep meaning to get away from politics, but I wanted to get a history of the American political parties in first. And I wanted them to be a bit more accessible than the previous prints we’ve done, so I’ve allowed myself to put in more images and photos, which I generally steer away from. I’m curious to see if this makes them more or less popular.

How do you conceive the designs? Do you sketch ideas/compositions out on paper first or get into exploratory analysis looking for insights and stories?

We start with a set of questions to answer. We use a whiteboard. The party posters may look like reasonably straightforward decorated timelines, but we struggled for a long time over the design and what would be the central story, finally settling on a measure of party strength to work around.

How do you decide on the dimensions of information that you will seek to incorporate into the design?

I guess that is an intersection of what is available, what is interesting, what is meaningful, and what fits.

How do you involve collaborators in the project?

I reach out to folks who are highly knowledgeable (professors, practitioners, experts) in the subject matter at hand, and email them early drafts of the prints for feedback.

How long does each piece typically take?

Too many months to admit here. I hope we can get better and faster at it.

What software/technical resources do you use to develop the works?

These are programmed in R and cleaned up in Illustrator.  I’m open to other ways, would love to hear how others do it.
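As an aside, the workflow Nathaniel describes – generating the dense base graphic programmatically, then polishing it by hand in Illustrator – generalises beyond R. Here is a minimal sketch of the idea (purely my own toy illustration with made-up data, not Timeplots’ actual code), rendered in Python for brevity: the program writes a small SVG timeline that can then be opened and refined in Illustrator.

```python
# Sketch of a "program the chart, polish by hand" workflow:
# generate the dense base graphic in code as vector SVG,
# then refine typography and layout by hand in Illustrator.

terms = [(1789, 1797, "Washington"),
         (1797, 1801, "J. Adams"),
         (1801, 1809, "Jefferson")]  # toy timeline data

def timeline_svg(rows, x0=1789, scale=20):
    """Return an SVG string with one horizontal bar per term."""
    parts = []
    for i, (start, end, name) in enumerate(rows):
        x = (start - x0) * scale          # bar position from start year
        w = (end - start) * scale         # bar width from term length
        y = i * 30                        # one row per entry
        parts.append(f'<rect x="{x}" y="{y}" width="{w}" height="20"/>')
        parts.append(f'<text x="{x + 4}" y="{y + 14}">{name}</text>')
    body = "\n".join(parts)
    return f'<svg xmlns="http://www.w3.org/2000/svg">\n{body}\n</svg>'

svg = timeline_svg(terms)
# Saved as a .svg file, this opens in Illustrator for hand-finishing.
```

Because the output is a vector format, every bar and label remains an individually editable object downstream, which is what makes the manual clean-up stage practical.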

Can you explain some of the most challenging design decisions you made about any of the visual properties in these designs?

The amount of data on the U.S. Senate poster (every Senator in history) was very difficult to fit into a poster format. We went a long way down several other paths before settling on the current design.

One of the key aspects of any creative process is knowing when to stop adding or subtracting from the design: how did you/do you handle this delicate stage of the project?

I want to keep adding and subtracting still to everything I’ve done.  I know it could make them all better. It is very hard to stop. Subtracting is probably more important and I probably need to get better at that.

What plans do you have for future Timeplots projects? Will they continue to be paper based ‘artworks’ or have you ever considered developing them as digital/interactive pieces?

I’d like to do interactive projects with optional prints associated with them.  I’m thinking about a baseball print, maybe, next.  You can do amazing things interactively, but as someone who has been in software for a long time, it seems so much less tangible.  My prints are far better on a nice big sheet of heavy paper than trying to scroll around on a digital device.

How do you see the state of the visualisation field right now?

Booming, and all over the place. It appears to be moving fast. But it is still hard to find really great stuff, or stuff to my taste, at least.

What are the things that excite you/keep you positive about the way the visualisation is advancing?

I like when something that is very difficult to understand can be made clear by a new method of presentation. I wish someone would tackle redesigning my Carefirst healthcare billing statement, for example. It is dreadful and I cannot make heads or tails of it. At some point I would love to be able to contribute to public policy by helping to clarify important matters.

Are there any aspects that frustrate or disappoint you?

I have not yet learned how to market my own stuff very well. I’m a bit shy and I find that area frustrating.  And finding the time to do all the things I want to do.

Finally, a chance for you to recommend or promote other designers/practitioners in the field – are there any people or work you would strongly recommend for readers to take a look at?

I like the Wallstats poster on the U.S. budget. Historyshots also produces good work. I like the folks at Juice Analytics who do dashboarding and such. I’m envious of all the folks with active visualization blogs with substantial readerships; I am considering starting my own and if I do it will be at

I’m extremely grateful to Nathaniel for taking part in this interview, offering some really interesting and candid insights into his world as a designer. Thanks also for the amazing speed with which he responded to my questions! I wish him and his Timeplots collaborators all the best in their future success with these exquisite visualisations. You can buy these prints direct from the Timeplots site, follow Nathaniel’s twitter updates via @timeplots and keep up with his blog updates here.

Guest post: Day 3 at the O’Reilly Strata Conference

This is the final guest/cross-post by Jan Willem Tulp, winner of my recent contest to win a full pass to the O’Reilly Strata conference. The conference has been taking place this week and Jan has kindly offered to share a short summary on each day of the conference. You can find out more about Jan’s work via his blog and follow him on Twitter @JanWillemTulp.

Day 3 at O’Reilly Strata Conference

The third day of the Strata Conference was again packed with great sessions. The day started off with numerous keynotes. The first was by Simon Rogers of The Guardian. Simon is not just a fabulous presenter; the examples of his work were also great demonstrations of how to tell stories with data, and of how The Guardian has actually enhanced its news stories by sharing data with the public. Next up was an interesting panel discussion with Toby Segaran (Google), Amber Case (Geoloqi) and Bradford Cross (Flightcaster), moderated by Alistair Croll (Bitcurrent); the topic of discussion was Posthumus, Big Data and New Interfaces. After this we had good presentations by Ed Boyajian (EnterpriseDB) and then Barry Devlin (9sight consulting). Next was a very lively talk by DJ Patil (LinkedIn), who showed very convincingly that the success of working with big data at LinkedIn is only possible with a good team of talented people. Scott Yara (EMC) came next, and also gave a lively talk full of humour on how Your Data Rules The World. The closing keynote came from Carol McCall (Tenzing Health), who tackled a serious problem with humour, showing how big data analytics can be used to improve US healthcare and turn it ‘from sickcare into healthcare’.

As my first session I chose a talk on Data Journalism, Applied Interfaces. Marshall Kirkpatrick (ReadWriteWeb) showed some really useful tools, like NeedleBase, that he uses for discovering stories on the Internet. He was followed by Simon Rogers of The Guardian again, who more or less continued his keynote, showing very compelling examples of how The Guardian uses data to tell stories and how they use, for instance, Google Fusion Tables to publish much of their data. The last speaker of this session was Jer Thorp, and he absolutely blew me away with a beautiful interface he has created in Processing as an R&D project together with the New York Times. It’s called Cascade, and it shows a visual representation of how Twitter messages cascade across various followers and links.

My next session was on ‘Real-Time Analytics at Twitter’, where Kevin Weil mainly explained Rainbird, a project Twitter uses for various counting applications so that realtime analytics can easily be applied. The project will be open-sourced in the near future.
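Few technical details of Rainbird were public at the time, but the core idea behind such realtime counting systems – incrementing a counter in several time-granularity buckets at write time, so that reads at any granularity are a single cheap lookup – can be sketched roughly as follows. This is only my own toy, single-process illustration of the concept; Twitter's actual service is a distributed, Cassandra-backed system.

```python
# Toy sketch of time-bucketed counting for realtime analytics.
# One write updates a minute, hour and day bucket, so a query at
# any of those granularities is a single dictionary lookup.
from collections import defaultdict

class TimeBucketCounter:
    GRANULARITIES = {"minute": 60, "hour": 3600, "day": 86400}

    def __init__(self):
        # (key, granularity, bucket index) -> count
        self.buckets = defaultdict(int)

    def incr(self, key, timestamp, n=1):
        # Deliberate write amplification: one event touches one
        # bucket per granularity, making reads trivially cheap.
        for gran, secs in self.GRANULARITIES.items():
            self.buckets[(key, gran, timestamp // secs)] += n

    def get(self, key, gran, timestamp):
        secs = self.GRANULARITIES[gran]
        return self.buckets[(key, gran, timestamp // secs)]

c = TimeBucketCounter()
c.incr("url:example.com", timestamp=3600)
c.incr("url:example.com", timestamp=3660)  # one minute later
# Both events land in the same hour bucket but different minute buckets.
```

The trade-off is classic pre-aggregation: extra work and storage on every write in exchange for constant-time reads, which is what makes "realtime" dashboards over high-volume event streams feasible.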

After the break I saw a session on AnySurface: Bringing Agent-based Simulation and Data Visualization to All Surfaces by Stephen Guerin (Santa Fe Complex). He showed how a projector and a table of sand can be used to enhance a data visualization for simulation purposes. As an example, he showed us how projecting agent-based models and emergent phenomena from complex system dynamics can help firefighters simulate bottlenecks in escape routes. It was also very cool to see that many of his simulations are built in Processing. Next up was a session by Creve Maples (Event Horizon), and I really liked the first part of his talk, because he had a very good story on how we should keep the human brain’s capacity for processing information in mind when designing products and tools. It was really good to hear such a strong emphasis on this. The last part of his talk was mainly about some of the 3D visualizations he has done in the past that were very successful for his company, but it didn’t strike me as much as the first half.

The session on Data as Art by J.J. Toothman (NASA Ames Research Center) was a good and fun talk with many examples of infographics and visualizations. I had already seen most of them myself, though some were new; it was a great talk with lots of eye-candy. The final talk of the conference I saw was about Predicting the Future: Anticipating the World with Analytics. Three speakers gave their visions of how they do that. Christopher Ahlberg (Recorded Future) showed how his company uses time-related hints (like mention of the word ‘tomorrow’) in existing content on the Internet to more or less predict the future. Robert McGrew (Palantir Technologies) showed how analyzing many large datasets in combination with human analysis can be used to perform effective fraud and crime prediction. Finally, Rion Snow (Twitter) showed that research has demonstrated that analyzing tweets can be used effectively for stock market prediction (3 days ahead!), flu and virus spread prediction, and UK election result prediction (more accurate than exit polls). The predictive power of analyzing the Twitter crowd was really stunning.

This concluded the O’Reilly Strata Conference. The conference was fantastic, the sessions were great, and most of all, meeting all these people was probably the best part of it all!


I’d like to thank Jan for sharing his thoughts on the conference; I really appreciate his time and effort in writing these at the end of each day, after what sound like long and energetic days!

Congratulations also to O’Reilly who appear to have organised and executed a fantastic new event. I look forward to New York in September.

Where have we seen this before?

Those of you who followed my South Africa 2010 World Cup posts during last summer (part 1, part 2 and part 3) will possibly remember one interactive visualisation design in particular, a World Cup schedule by Spanish daily Marca.

Amongst the many different visualisation approaches from the world's media outlets to bringing the World Cup to readers, this was one of the most shared, discussed, liked and retweeted projects, as the figures below show (as at July 2010).

Sharp-eyed reader Sachin Rajpal has pointed me in the direction of a remarkably similar concept for the ICC Cricket World Cup 2011 Schedule on the CricBuzz website, and judging by the initial stats it is experiencing the same sort of popularity.

Is this a case of the same design team replicating their successful formula on a different media platform or evidence of imitation being “the sincerest form of flattery”?

Guest post: Day 2 at the O’Reilly Strata Conference

This is a guest/cross-post by Jan Willem Tulp, the winner of my recent contest to win a full pass to the O’Reilly Strata conference. The conference is taking place this week and Jan has kindly offered to share a short summary on each day of the conference. You can find out more about Jan’s work via his blog and follow him on Twitter @JanWillemTulp.

Day 2 at O’Reilly Strata Conference

After a day of tutorials, the second day at Strata was the first of two conference days, packed with fascinating sessions. The day kicked off with a plenary session featuring a long list of top speakers in the field of data science: Edd Dumbill of O’Reilly Media, Alistair Croll of Bitcurrent, Hilary Mason, James Powell of Thomson Reuters, Mark Madsen of Third Nature, Werner Vogels, Zane Adam of Microsoft Corp, Abhishek Mehta of Tresata, Mike Olson of Cloudera, Rod Smith of IBM Emerging Internet Technologies and, last but not least, Anthony Goldbloom of Kaggle. Various topics were covered in ten-minute presentations, such as data without limits, the data marketplace, and the mythology of big data. The shortest presentation struck me most: “the $3 Million Heritage Health Prize”, presented by Anthony Goldbloom: people are challenged to create a predictive application that uses healthcare data to predict which people are most likely to go to hospital, so that ‘US healthcare becomes healthcare instead of sickcare’. The prize is $3 million for whoever solves it!

Next up were the individual sessions, and I was very much looking forward to the talk “Telling Great Data Stories Online” by Jock MacKinlay of Tableau. Though the talk itself was excellent, it covered material I already knew; it is highly recommended for those unfamiliar with Visual Analytics or Tableau. Being biased towards visualization-related sessions, my next session was “Designing for Infinity” by Dustin Kirk of Neustar. Dustin showed eight design patterns of user interface design, like infinite scrolling, which were really good. It reminded me of an updated version of the material in Steve Krug’s book Don’t Make Me Think.

Next up was the best talk of the day: “Small is the New Big: Lessons in Visual Economy”. Kim Rees of Periscopic showed us very good examples of effective information visualizations. I was really blown away by this presentation, mostly because she showed how creatively removing clutter and distractions can make a visualization very effective. The creative interactions that help the user navigate the visualization were also compelling. Next was Philip Kromer of Infochimps on “Big Data, Lean Startup: Data Science on a Shoestring”. Though I expected Philip to explain the Lean Startup principles evangelized by Eric Ries, the talk was more about Infochimps’ approach to doing business. Some remarkable comments from Philip: “everything we do is for the purpose of programmer joy”, and “Java has many many virtues, but joy is not one of them”. Great presentation and inspiring insights!

My next session was “Visualizing Shared, Distributed Data” by Roman Stanek (GoodData), Pete Warden (OpenHeatMap) and Alon Halevy (Google). After short presentations from each, the three held a panel discussion where the audience could ask questions. The discussion revolved mostly around the fact that all three deal with data that is created and uploaded by users, and how you deal with that: do you clean it, what’s the balance between complex query functionality and ease of use, etc. My final session was “Wolfram Alpha: Answering Questions with the World’s Factual Data” by Joshua Martell. Half the talk was a demonstration of the features of WolframAlpha; the other half was a more or less high-level talk about how WolframAlpha handles user input, how data is stored, how user analytics is performed, and more.

The day ended with a Science Fair where students, researchers and companies were showing new advancements in the field of data science. There were really interesting showcases, like a simulation tool for system dynamics. But, again biased towards visualization, the one that struck me most was Impure by Bestiario. Impure is a visual programming language that allows users to easily create their own visualizations, both simple and very advanced. It was also great to see Bestiario’s passion for their own product.

Finally, one of the best things of the conference so far has been meeting people, some of whom I had only known virtually until now. I especially enjoyed meeting all the visualization people today. It’s really great to meet so many of the online visualization community in person.

So again, a fantastic day at Strata, and I am looking forward to tomorrow!


Jerome Cukier, the second of my Strata Conference contest winners, submitted the following summaries in the comments of this post, and I decided they warranted adding to this overall Day 2 summary. You can also follow Jerome’s conference updates via his Twitter account. Thanks Jerome!

I was really inspired by Hilary Mason’s opening keynote. Watch it for yourself here: in 10 minutes Hilary manages both to explain what’s going on in the field and to get us excited about what is to come.

I was extremely thrilled by Kim Rees’ talk. Take it from someone who has spent the last 3 years watching every datavis that went viral on the internet: I had not seen 90% of the examples she showed. She should post the slides online.

I really liked Dustin Kirk’s talk as well, which was extremely practical. The issue he tackles is: now that applications (especially web applications) have to let users handle a huge amount of data, how is that affecting interface design? He showed us the contrast between the “1995 way” of, say, selecting one item in a list of 500 – which would be the standard HTML drop-down list – and the state of the art, such as selecting labels in Gmail or finding contacts on an iPhone. Selecting items in a big list is just one of the many problems where endless data calls for an immediate change, and Dustin did a great job of illustrating that with examples. Slides can be found online.

There were 2 exciting sessions I attended without Jan Willem. The first was a talk from Peter Skomoroch of LinkedIn, who talked about exploring the “data exhaust”, or the byproduct of our digital activities. After explaining general principles, he demoed this with a project his crack team of data scientists/visualizers came up with in a couple of days: a mashup between the Strata attendee directory listing and their LinkedIn profiles, complete with skills and connections. The result is a thought-provoking network map of the skills of the Strata people.

Lastly, I saw Matthew Russell give an impressive demonstration of what he explains in his book, “Mining the Social Web”, specifically how to use Python to extract gems of information from Twitter. I’m having a hard time deciding whether the code was more interesting than the actual questions Matthew was asking of a popular Twitter account; what I do know is that I’m getting the book on my way home, and so should you.

Follow these speakers on Twitter: @hmason, @krees, @dustin_kirk, @peteskomoroch, @ptwobrussell

Crime and policing data for England and Wales

Yesterday saw the high profile launch of the Government’s portal for exploring local crime and policing information for England and Wales, a huge step forward in the ongoing movement towards open data.

The site fundamentally acts as a police information portal, providing contact details for local policing teams, Twitter feeds relating to local issues involving the nearby force, details of how to get involved in neighbourhood watch schemes, as well as instructions for how to report a crime.

But the main feature of the site is a mapping tool allowing users to enter a postcode and discover the volume of ‘street-level’ crimes and anti-social behaviour incidents (ASB) in that area over the previous full month.

This default display reveals the aggregate number of crimes and ASBs in that postcode area, with the option to narrow down the display to show information about Burglaries, ASBs, Robberies, Vehicle Crimes, Violent Crimes and a broad category for all Other Crimes.

To protect privacy, individual addresses are not pinpointed on the map; instead, crimes are mapped to an anonymous point on or near the road where they occurred, with marker size modified to reflect the volume of records at each location. You can modify the display to show summaries by specific streets or neighbourhoods.
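As a rough illustration of the anonymisation approach described above (the records, coordinates and field names here are entirely hypothetical, not the site's actual data model), the display amounts to aggregating crimes by their anonymised snap point and scaling each map marker by the volume of records there:

```python
from collections import Counter

# Hypothetical records: each crime has already been snapped to an
# anonymised point on or near the road where it occurred, so no
# individual address is ever exposed.
crimes = [
    {"snap_point": (51.501, -0.142), "category": "burglary"},
    {"snap_point": (51.501, -0.142), "category": "vehicle"},
    {"snap_point": (51.507, -0.128), "category": "asb"},
]

# Aggregate by snap point: one marker per anonymised location.
counts = Counter(c["snap_point"] for c in crimes)

# Scale each marker's radius by the volume of records at that point.
BASE_RADIUS = 4
markers = [
    {"location": point, "radius": BASE_RADIUS + 2 * volume, "label": volume}
    for point, volume in counts.items()
]
```

The two burglaries/vehicle crimes at the same snap point collapse into a single, larger marker, which is exactly the trade-off discussed later: the marker's position no longer tells you where the crimes actually happened.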

There is also the ability to obtain a full data download for each Force area and an API facility to encourage people to use the data and develop additional facilities as apps.


The 24 hours since the launch of this service have seen a great deal of negativity, which is unfortunate and largely over the top, but some genuinely constructive feedback has emerged from the noise.

The main issue concerned widespread problems accessing the site. It is no surprise, given the amount of news and press publicity in the lead up to the site going live, that it struggled to cope with demand. As the BBC reports, at its peak yesterday there were 5 million hits per hour/75,000 per minute causing it to be unavailable for long periods. That is a staggering amount of interest and demonstrates the appetite for this sort of information.

The second matter has been the decision to geographically display the location of crimes at an anonymous geometric centre point rather than at the location of the recorded crime/incident. This graphical ambiguity has been criticised by the British Cartographic Society, which cited it as a frustrating example of ineffective mapping techniques, one which enrages cartographers and ultimately provides the end user with little information. The Guardian article Too Much Information also touches on the issue of how people are supposed to use this aggregated, geographically averaged data.

The main issue concerns the use of centralised dots to represent data when it is unclear what area that data is supposed to represent – which road boundaries, which villages, which properties are included or excluded? In the absence of detailed geographical plots at property level, the more effective method would have been to use shaded areas to represent the geographical region covered by each aggregated data point, encoding the quantities through shade or hue. Expressing crimes/incidents as a proportion of population would also contextualise the levels better than absolute volumes.
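To sketch what that normalisation means in practice (with made-up area figures and an arbitrary four-band colour scale, purely for illustration): absolute counts alone favour populous areas, so a shaded-area display would first convert counts to a rate per 1,000 residents before choosing a shade.

```python
# Hypothetical area figures: Area A has more crimes in absolute terms,
# but Area B has a far higher rate once population is accounted for.
areas = {
    "Area A": {"crimes": 120, "population": 60_000},
    "Area B": {"crimes": 45, "population": 9_000},
}

def rate_per_1000(crimes, population):
    """Crime rate per 1,000 residents."""
    return crimes / population * 1000

rates = {name: rate_per_1000(a["crimes"], a["population"])
         for name, a in areas.items()}

def shade(rate, max_rate):
    """Map a rate onto a small light-to-dark colour scale (choropleth-style)."""
    bands = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]  # light -> dark
    index = min(int(rate / max_rate * len(bands)), len(bands) - 1)
    return bands[index]

max_rate = max(rates.values())
shades = {name: shade(r, max_rate) for name, r in rates.items()}
```

Note how Area B, despite fewer crimes in absolute terms, ends up with the darkest shade – exactly the contextualisation that a dot sized by raw volume cannot provide.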

When I worked for the Police several years ago I was fairly close to the development of one of the earliest pioneering attempts to make police data more transparent using a similar map-based approach. I therefore appreciate the difficulties associated with presenting crime or incident data down to property level – there are significant potential political and legal ramifications in doing this.

Ultimately, a certain safety-first approach takes precedence, but this can have perverse consequences – a geographical centre-point still lands somewhere on a map, possibly on a crime-free property, which could give the false impression that it is the crime hotspot!

I suspect the strategy with the displays has been to launch an initial ‘safe’ approach to provoke reaction, test out opinions and trigger debate. Then, hopefully, when there is greater confidence in the project’s scope, the full detailed potential of this data will be unleashed.