Last week there was an article on Wired profiling an upcoming tool from Tableau called Elastic, which drew my ire. The tool looks fine; I haven’t seen a great deal about it but I’m sure it will find a user base.
What initially caused my Roger Moore eyebrow to spring into action was the way the article framed the tool. Check out this tweet from Wired.
“Spreadsheets are awful”. Just plain ignorance.
Spreadsheets are incredibly valuable tools for handling data, undertaking calculations and analysing it. They are not the most powerful of statistical analysis tools but they often provide enough. They are not the most potent charting packages, but they often provide enough. They are a fundamentally useful ally.
When someone emails a spreadsheet to your iPad, the app will open it up – but not as a series of rows and columns… The hope is that this will make it easier for anyone to read a digital spreadsheet – an age-old computer creation that still looks like Greek to so many people.
Sure, people do produce and share some really impenetrable workbooks. They dress up tables of data with the most horrendous shading and bordered decorations. However, as with bad PowerPoint slide decks, it is so lazy and easy to blame the tool and not the creator. What if I wanted the table of data un-visualised? Maybe I want to use the raw data as it comes, maybe I want to perform a lookup-and-reference type of interpretation? We don’t have to nor do we want to visualise everything. Let’s be more discerning than that.
That was the first thing. The second thing that caused my beef was the angle used to substantiate this type of tool as a kind of panacea that will automate and (that most dreadful of words) democratise the role of visual analysis.
So many companies aim to democratize access to online data, but for all the different data analysis tools out on the market, this is still the domain of experts – people schooled in the art of data analysis. These projects aim to put the democracy in democratize.
I don’t even know what that last bit means. Surely that would lead to democracytize?
These kinds of confused articles bluntly reduce the craft of data visualisation, data science and data journalism into the most simplified of disciplines, something that an automaton should operate. They smooth over the complexities of working with data in a way that only exists in the idealised scenario offered by Microsoft’s practice ‘Northwind’ database.
The hope is anyone can become a kind of data scientist — using data in ways that echo so many journalists these days, from Nate Silver on down.
A “kind of data scientist”. “Nate Silver on down”. Wonderful, sign me up, I’m sold.
The Seattle-based company has been massively successful selling software that helps big businesses “visualize” the massive amount of online data they generate
I was immensely grateful to be invited to speak at yesterday’s excellent Visualized.io conference event in London. As I previewed last week, the title of my talk was ‘The Design of Time’. With only a 15-20 minute slot I couldn’t possibly fit in everything that I wanted to profile (indeed I probably shouldn’t have attempted everything that I DID profile) and so here is a director’s cut version of yesterday’s talk.
I’ve had this issue on my mind for a while now but haven’t really found a way of expressing a cohesive post about it. I still haven’t, as you’ll find by the time you reach the bottom. Let me state from the outset: today, I am the problem guy, not the solution guy. However, I felt I’d pondered for long enough and so decided to put this out there to trigger some further thought and discussion.
As we will all know, the work emerging from the contemporary data visualisation field is dominated by digital output. Of course, there is still a significant amount produced for print consumption but, ever-increasingly, data visualisation is a digital – made-by and made-for – pursuit.
The history of the field preceding this recent era had a legacy of work that was easily archived and replicated for viewing in books or libraries. But how do we preserve the incredible array of digital data visualisation work being produced by this and future generations? It is an issue that goes beyond just safeguarding URLs and certainly goes beyond just the field of data visualisation.
Last evening, there was a terrifically astute stream of tweets from The Upshot’s Derek Willis, discussing web/data journalism, that articulated the concerns perfectly
As I perused some of the many tremendous web-visualisations tracking the recent US mid-term elections I was struck by the fleeting status of a graphic being fed by live data updates as they occur. As the story of an election night unfolds there will be all sorts of interesting ebbs and flows, different points where the story arc seems to be heading in different directions (maybe not in this particular election but you get the point).
As soon as the new data comes in, the composition and content of that live graphic has changed.
This is not unique to elections, of course: it applies to any real-time or frequently updated visualisation.
Over on Bloomberg they have the excellent Billionaires project, with a daily update on the fortunes (absolute and changing) of the world’s rich.
What is interesting about this project, as Lisa Strausfeld discussed in Data Stories episode #41, is that Bloomberg has journalist resources assigned to stories around billionaires. It’s a matter of common interest and intrigue so why not. Perhaps because of this dedicated resource there is a daily archive of the status of the billionaires’ rankings project for any given date (e.g. 11th April). So it is very easy to revisit a point in time and see how person x did on that day.
Not every real time project will have that resource, nor will it have a subject matter that has levels of potential interest that endure on an ongoing basis. So what can be done for those projects?
Another example of the preservation challenges. It sounds like soon (or even already) parts of the US will be getting very cold. I saw this tweet with a still snapshot of the live ‘Earth‘ weather map visualisation by Cameron Beccario.
Seduced by these patterns I also took a look at the display on the hint.fm ‘Wind Map‘ and took my own screenshot to preserve that data ‘moment’.
I’d forgotten that the Wind Map project does have an archive gallery of previously interesting or noteworthy weather events.
However, that gallery has not been updated since Hurricane Sandy in October 2012.
As soon as this latest weather system passes, those interesting patterns are gone forever. Unless someone archives them.
As the final members of the graphics teams (1, 2, 3, 4 etc.) across the news media finally shut down their machines after a long night of mid-term election coverage, I am reminded of a great article by Matt Ericson from 2010 titled ‘When maps shouldn’t be maps’. (Addition: A very helpful ‘Map or Don’t Map‘ flowchart from John Nelson at IDV)
In this article Matt describes the need to be more challenging in our natural assumption that simply by having spatial data we should map that data: “the impulse is since the data CAN be mapped, the best way to present the data MUST be a map”. If the interesting patterns are not spatial then a mapped display is fairly redundant. We may learn more from a location-categorical display comparing quantities or how values for those locations have changed over time, ranked by the largest to smallest changes, for example.
However, on the flip side, when the interesting patterns ARE spatial, then of course, the layering of a data display on to the apparatus of a ‘map’ makes complete sense. Over the past week I have come across two different but very effective examples that demonstrate this.
Firstly, a very revealing visualisation (Alberto’s viz of the week, no less) about ‘Obama’s Health Law‘ by the New York Times. The map displays the percentage point increases, county by county, of Americans with health insurance under the Affordable Care Act.
I don’t know a great deal about the Affordable Care Act, particularly the political mechanisms that make it available or otherwise, but from looking at the display you can immediately see regional discrepancies that MUST be reflective of state level policies. Reading the accompanying article explains this observation in further detail:
That state boundaries are so prominent in the map attests to the power of state policy in shaping health insurance conditions. The most important factor in predicting whether an American who had no insurance in 2013 signed up this year was whether the state that person lives in expanded its Medicaid program in 2014.
By way of illustration, the piece draws a contrast between Kentucky, which expanded Medicaid, and Tennessee, which didn’t. This was something highlighted by Lena Groeger on Twitter.
There are many other spatially significant differences that support the benefit of displaying this data (albeit, just one view or one slice of analysis about that data) via a map: It reveals interesting patterns that would not have been as effectively or efficiently portrayed using other approaches.
The second example I came across concerns a different idea of mapping, this time the mapping of the geography of the human body. The graphic ‘Bumps, Bruises and Breaks’ by the Wall Street Journal – originally found on Junk Charts – shows how NFL players have sustained over 1300 injuries this season and where these injuries occurred on the body.
Plotting the quantitative displays of injury totals across the different parts of the body makes complete sense. It is more concrete, you can see the distribution more instantly. By having the illustrated player in the background you can also draw conclusions about the sufficiency (or otherwise) of the protection they get from their kit. Incidentally, Kaiser does a great job of offering up some further enhancement ideas for the graphic.
So in conclusion, just because you can map your data, doesn’t mean to say you should. Have the discipline and sense to challenge your natural impulses but, when it does make sense to do so, plotting spatial data on a map can really illuminate the inherent patterns.
Below you will find an embedded slideshare version of the slides used in last week’s talk at the Data Visualization Group in the Bay Area Meetup at the University of San Francisco. As usual, the quality of the slide images hasn’t quite been preserved in the upload but you’ll get the idea at least.
I always say this and will say it again: presentation slides are just visual props for a talk so you won’t be able to necessarily decipher the exact narrative that accompanied each subject. For meetup members (and maybe those not present too, possibly) the video of the talk should be released soon. I might also trot out the same talk at another future opportunity so do have a look through/watch but don’t memorise it, just in case!
I have edited one or two of the slides for the purpose of sharing this deck publicly. For instance, this was my original slide 2, capturing the idea that I regretted that my talk title was a bit too Troy McClure-esque.
That might seem like rather a pompous section header – after all it is just a new site design – but for me it feels like a really significant milestone. The new version of visualisingdata.com was launched yesterday without too many bumps in the road, thankfully. I want to share a little bit more information about the thinking behind this new site’s design and functionality. I appreciate all the feedback and comments that have been aired so far and hope this responds to some of the curiosities that were expressed.
Visualisingdata.com was launched in February 2010 shortly after I graduated from a Masters research programme that had enthused me to want to continue to learn and discover much more about the data visualisation field. I decided that setting up a quick and lightweight blogging platform like WordPress and writing publicly about the subject was a great way to continue learning. You are forced to research, think and carefully establish your convictions.
At that time the field was experiencing a very evident increase in popularity and mainstream coverage. (I will always look back as being very fortunate to arrive in it when I did). Over time, as the field has continued to mature and spread, hopefully, the site’s content has reflected a similar development providing value to new enthusiasts and seasoned pros along the way.
I was always mindful that the look, feel and functionality of the site was somewhat stuck in the vertical, scrolling wilderness of the particular WordPress theme I’d chosen. New stuff gets seen, older stuff gathers dust. It did an entirely worthy job for a long time, particularly prior to me being a full-time freelance professional, but over the past couple of years I’ve felt an increasing friction between the style of the site and the things I’m teaching, preaching and practicing. As a shop-window to what I’m about it was no longer a good enough fit.
I don’t have sufficient web programming literacy to do much more than tweaking, so any significant development has struggled to find space to move up the priority list. However, at the end of 2013 I decided the time was right to draw together a brains trust and start working towards an entirely fresh design. Not only a website reboot but something that extended to a whole new visual design and brand identity across all outputs.
Here’s an outline of the contents and features that you will now find on the site.
Creating a home page was one of the most important additions I wanted to introduce in this new version. I wanted to move away from the home page being the beginning of a stream of blog posts. This very final single image of what was essentially the previous home page view is illustrative of the old experience. I was also seeking a more dynamic front landing page that would encourage people to explore different parts of the site and offer a slightly different experience each day they come back to the site. I accept many people still tend to consume site content through RSS, google discoveries and links from social media but it is still nice to be able to offer a home page experience for those new arrivals and/or those who stick around for a browse.
So, on the home page you will find a convenient profile of the latest and most relevant content on the site.
We are still exploring ways to ensure the layout and functionality is optimised for large screen vs. small screen, for desktop vs. mobile/tablet. More on this further down.
The blog posts on this site date back to February 2010 and are really the heart and soul of this site. As I’ve already described, one of my motivations was to make the older content more immediately visible and accessible, getting away from that very linear, vertical journey of the previous WordPress template. Now you can move around nearly 5 years of content in just one or two clicks of the button. You also have the featured images to better inform your browsing choices and preview text to decide whether to continue to visit the full article. The blog page takes a couple of seconds to load up the content of the database – we’re going to continue looking at ways to shave time off this loading. The content is presented in reverse chronology and is categorised by seven content groups.
For the individual blog posts there is now a cleaner design with a more striking banner image for each blog (with the flanking colours auto generated from the dominant image colours to blend in) and a more integrated ‘related posts’ feature. The comments are tidied away until you wish to view them and you can browse backwards and forwards, as usual, to proximate posts. The social media buttons enable the sharing of the post links to the main 4 destinations. The tally counts for each post have hopefully been pulled through in most cases but some have not.
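For the curious, here is a rough idea of how a 'dominant colour' for the flanking panels could be derived. This is only a minimal Python sketch under my own assumptions, not the code running on the site: the function names, the bucket size and the sample pixel values are all hypothetical, and it assumes the banner's pixels are already available as (r, g, b) tuples.

```python
from collections import Counter


def dominant_colour(pixels, bucket=32):
    """Return the centre of the most common colour bucket as an (r, g, b) tuple.

    Hypothetical helper: near-identical shades are grouped into coarse
    buckets so that they are counted together.
    """
    if not pixels:
        raise ValueError("need at least one pixel")
    counts = Counter(
        (r // bucket, g // bucket, b // bucket) for r, g, b in pixels
    )
    (rb, gb, bb), _ = counts.most_common(1)[0]
    half = bucket // 2
    # Clamp to 255 in case the bucket size does not divide the range evenly.
    return tuple(min(255, c * bucket + half) for c in (rb, gb, bb))


def to_hex(colour):
    """Format an (r, g, b) tuple as a CSS-style hex string."""
    return "#{:02x}{:02x}{:02x}".format(*colour)


# Made-up pixel sample: mostly reds, with a couple of near-black pixels.
sample = [(200, 30, 40)] * 5 + [(205, 28, 44)] * 3 + [(10, 10, 10)] * 2
flank = to_hex(dominant_colour(sample))
print(flank)
```

Bucketing is one simple choice among several; a clustering approach would likely give smoother results on photographic banners, at the cost of more code.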
In my current blog post database there are 529 published posts. In the process of migrating to this new site design I have had to go through each post to reconfigure it slightly, set up excerpt text, update to the new categories and also assign a featured image. I have managed to get through over half of these but still have 200+ to complete so you will see some blank tiles in the blog page and in the blog posts themselves a ‘Page Under Construction’ banner image. I’ll be working through these as quickly as possible to tidy up all the content.
The collection of resources has been one of the most popular content items published on the site down the years. This page now provides an interactive navigable database of over 200 tools, applications and programmes that have an important role to play in data visualisation design. As new tools arrive on the scene, this collection will be kept entirely up-to-date to maintain the latest catalogue of options: I will shortly be finally getting round to the extra 70/80 or so outstanding items that need adding. The categories are a best fit grouping, though some tools inevitably do cut across several in scope. The preview text wording is often drawn from a tool’s native site, rather than being my own. Also, I may not have actually used all the tools, I may not even personally think all of them are particularly great, but I will have seen evidence of others who have endorsed them or found them useful in different contexts to make them worthy of including in this collection. I would also like to take the opportunity to thank Tableau Software who are the exclusive sponsors of this resources page.
The references collection provides further useful resources for data visualisation enthusiasts. As above I will endeavour to add all the outstanding content that I’ve been bookmarking as soon as possible and keep it more up-to-date going forward. Some of the background widgets we have created in the WordPress admin dashboard will make this task much easier than before.
This page provides an overview of the data visualisation training workshops I run on behalf of Visualising Data Ltd. There is an overview of the training content, a profile of the types of training available, a list of the current training schedule, an interactive map to explore the location and type of previous events, a selection of participant testimonials (still needs another 100+ adding) and a form for interested parties to make a request for a future event. I am still exploring the potential use of a new training event registration tool (as an alternative to Eventbrite) that will be more integrated into WordPress.
The services page presents the range of professional services I offer on behalf of Visualising Data Ltd covering consultancy, research, teaching, speaking and writing. There is also a gallery showing the logos of previous clients.
This page promotes details of my first book ‘Data Visualization: A Successful Design Process’ and, as the development progresses, will also eventually show information about my second title.
The about page profiles me, provides an outline of the workings and content of this site, some brief details about Visualising Data Ltd and a contact details page with links to my more prominent social media profiles.
The type used in the new site includes Exo 2, used mainly for titles and headings, and Raleway, mainly for the body text. The colour palette was developed by Matt (see below) and comprises the following:
As with any worthwhile site development, we have sought a solution that works as well as possible across all platforms. There is a view that modern web site design should have a ‘mobile first’ objective. I recognise and agree that content is increasingly shared and accessed from smaller, mobile devices. However, my motivation for the redesign of this site has been focused foremost on an enhanced desktop experience, translatable where possible to tablet and then stripped back to be as accessible and light as possible for mobile. You will not see the same level of functionality available on a mobile browser compared to desktop because the screen real-estate and scope for interaction doesn’t lend itself to achieve this. This tweet and comment from Al Shaw expresses my views perfectly: “Instead of ‘mobile first’ I like ‘to every screen according to its needs’”.
This does not mean that there is not more work to be done to maximise the effectiveness of the design across the different platforms; we have several things to fix and possible avenues to explore to make the site as responsive, accessible and well-performing as possible.
Matt led the way in facilitating the new design thinking: developing the new branding ideas around logos, typography and colour palette and helping to formulate the initial mockup sketch ideas of the site. Matt will be contributing to a future article about the thinking and design process specifically behind the new logo.
Andrew led on the development side, bringing his considerable technical talent and sharp eye to entirely translate my hopes for the site’s structure and capability in to reality. Over the past few months especially he has gone above and beyond to help me get over the line with this new launch, particularly as my often limited spare capacity has led to a very stochastic pattern of progress.
It’s been an immense pleasure working with both of these splendid chaps and I am hugely grateful for their contributions.
One of the (many) things that impresses me most about the quality of data visualisation and infographic output from the leading journalist organisations is the continued variety and innovation of their techniques. Rather than just being constrained by a limited visual vocabulary, each new work published tends to feature a solution uniquely suited to the data, the analysis and the subject matter involved. Given the pervasive time constraints involved, the work we see created, day in day out, is quite incredible.
Of course, on special occasions, there is a compelling reason to potentially re-cycle previously used graphic archetypes and there was an example of this last week that was both astute and highly impactive.
Firstly, to explain what I mean by re-using custom graphic archetypes. I’m not talking about the repeated use of an off-the-shelf chart on numerous separate occasions, like the bar or the line chart, I’m referring to more bespoke solutions that have been used for multiple projects.
The New York Times, for example, utilised this interactive and participative matrix to assess the public’s reaction to the death of Osama Bin Laden back in May 2011.
They then, quite correctly, used the same underlying graphic approach to assess the reaction to Barack Obama’s stance on same-sex marriage. Why reinvent the wheel when you’ve got a perfectly applicable solution on your shelf?
In another example, The Guardian launched their innovative interactive timeline in March 2011 to outline the key milestones and sequence of events related to the Arab Spring.
In June of the same year it was re-imagined as a timeline to show the evolution of modern music…
…and again in October 2012 to show the events leading to the Eurozone crisis. An entirely reasonable, appropriate and – importantly – effective choice.
Whilst it is not purely the same archetype being re-used, last week’s ‘How high can a missile reach?‘ graphic published by the Washington Post (by Bonnie Berkowitz, Julie Tate and Richard Johnson) had a more profound effect. This was a vertical-scrollable graphic that aimed to show the scale of the height involved in the tragic shooting down of the Malaysia Flight MH17. I’ve recorded a brief video of it below.
As I mentioned on Twitter last week, there is something so haunting about the juxtaposition of the ‘how high’ graphic considering just three months earlier, the Post produced a ‘how deep’ graphic, ‘The depth of the problem‘ (by Richard Johnson and Ben Chartoff). This was a very similar graphic device used to show the scale of the depth involved in the search for the Malaysia Flight MH370 black box.
Once again, the appropriateness of the same graphic approach being used is without question. The switch from height to depth, from upwards scrolling to downwards scrolling, to visually capture the essence of two so-closely-linked tragedies was very cleverly conceived and had a big impact on me.
I’ve had this short post sat in my draft folder for weeks now, awaiting the right context before publishing. I’m finally motivated to post it having seen a few discussions on Twitter last week whilst on holiday (when the hotel pool has wifi, what can you do but look now and again…).
The Twitter discussions involved comments along the lines of “good, but would have been nice if…”. This is something I’ve uttered and written hundreds of times before: it is an inevitable reaction of somebody assessing what is in front of them. (Whisper it, sometimes it will also be a comment shared with the wider world to help others understand just how super astute you are!).
I didn’t bookmark the conversations nor is it about pinpointing individuals, indeed I can’t even remember who was involved. Furthermore, this isn’t another criticism-soapbox piece but a simple reminder that data visualisation – and frankly any creative endeavour – is a pursuit of optimisation.
Firstly, it is important to remember that the “it would have been nice if…” observation (usually in relation to absence of a certain design feature) is more than likely a view also shared by the creator. Just because a piece of work doesn’t include something that would have added value doesn’t necessarily mean that it wasn’t both considered and desired by the designer him/herself.
This short exchange between Elijah Meeks and Hannah Fairfield in relation to a New York Times graphic about the Affordable Care Act demonstrates the reality of the circumstances in which projects are created. I’m not picking on Elijah’s query – the hover/click feature was something I remember also instinctively wanting – because it was an entirely valid point, rather I’m struck by Hannah’s quick reply ‘ran outta time for tooltips’.
Secondly – and mainly – we rarely, if ever, have perfect conditions for creating visualisation work. It is a game of compromise shaped by factors like resource limitations, time constraints, client interference, format restrictions, market pressures etc. It is sometimes about the skill of judging when ‘good enough’ has been achieved. Indeed, on some occasions it is not even about settling for ‘second best’ but realising there is a viable path represented by a least worst solution.
So, don’t stop critiquing work and querying whether something had been considered. Don’t stop commenting on what you think would be good to make something even better. But do remember that there is likely a good reason why certain things couldn’t be achieved in the context of its creation.
Occasionally I invite folks to contribute guest posts to profile their work, ideas or knowledge. This guest post comes from Benn Stancil from a startup called Mode who have created a really interesting tool that allows you to reverse engineer analysis/visualisations in order to potentially take them in new directions. The product was opened to the public yesterday, so you can check it out, along with a few examples of the visualisations that people have built with it.
Like so many others, I’ve long been fascinated by learning from data–and as a result, been an avid consumer of data visualizations. The explosion of data in recent years has fueled a similar explosion of beautiful and insightful visualizations, created by everyone from industry leaders like the New York Times and Guardian to undiscovered brilliance hidden in obscure corners of the internet.
Even the best visualizations, however, rarely answer all of a viewer’s questions. We often want to understand how the data was collected, how it would look if considered from a different angle, what story it would tell if combined with other data, or how the visualization was built. In other words, great visualizations not only answer questions, but inspire more.
Unfortunately, it’s often difficult to document and share enough information to answer these follow-up questions. Creators carry the burden of sharing their data sources, their analysis that aggregated and combined data, their visualization code, and many other details. And piecing this information together after the fact is equally burdensome for consumers. The bit of knowledge someone new could add by remixing the analysis–or the bit they could learn by better understanding the original–often hits a dead-end, no matter how inspiring the visualization.
The above is a screenshot of a finished visualization. You can see the query, visualization code, and previous versions by clicking on the Query, Presentation, and Run History tabs above the graphic.
By organizing all of this information together in a simple package, people can immediately understand and add to visualizations without having to rebuild the work themselves. We’ve made this possible in one click–simply click clone on the screen above, and you’ll be working with the same visualization published by the original author, exactly where they left off.
When a piece of work is cloned, the original author not only maintains credit, but also sees who cloned their work and what they’re doing with it. This allows the community to push an analysis forward, without ever losing sight of the creator and without the creator losing sight of how their work is evolving.
Others can then work with the analysis and visualization in their own workspaces. They can even add their own data–Mode allows multiple creators’ data to be combined in a single visualization. Because all of this work happens in the browser, Mode doesn’t require setting up a development environment or finding a place to host the visualization.
Here is a screenshot of the presentation editor, where you can add custom visualization code and preview it.
Finally, we want people to be able to easily share their work. All visualizations in Mode can be shared via URL, or can be embedded anywhere on the internet, just like a YouTube video. The embedded visualizations, like the one below, can be fully interactive, and link back to all of the data and work.
Our approach to making data visualizations more accessible is largely influenced by our own experiences as data analysts. Surely others, who have had different experiences and objectives, face other challenges or have other ideas for solutions.
We’d love to hear what you think of our direction and how we can tailor it to your needs. What problems have you had when collaborating on data visualizations? What are your biggest struggles, and how would you solve them? If you’d like to check out our approach, Mode is free to use and you can sign up here.
We’re looking forward to seeing what great work people can build with Mode – and perhaps more importantly, what we can learn from each other. The world is producing fascinating data at an unprecedented pace, on subjects ranging from air quality in Chicago, to taxi traffic in Seattle, to the tattoo trends in the NBA. Great technologies for producing visualizations, like D3, Raphaël, and R, are constantly improving. And we have many giants in the data visualization community to look up to. At Mode, our hope is to help all of us stand on their shoulders.
People might seek teaching in data visualisation because they find themselves doing this…
So you’ve got to find an accessible way to communicate this…
Without overly reducing it to this…
You know that some people might be wanting to do this…
But they really need to appreciate how and when to do this…
Whilst you want to acknowledge the classics like this…
You’re also keen to give people a glimpse into this…
You have to be respectful of this…
But if you overly prescribe the rule book, everyone will end up like this…
When really you want to encourage flexibility to do this…
Ultimately, you want people to leave with the confidence, know-how and aspiration to create this…
Want to know how I balance these demands? Experience it for yourself…