Should you ‘trust’ data visualisations?
There has been much discussion over the past couple of days about John Burn-Murdoch’s article in the Guardian, ‘Why you should never trust a visualisation’, which was itself a response to an earlier article by Pete Warden, ‘Why you should never trust a data scientist’.
The gist of John’s piece (for clarity, his title isn’t being sensationalist but rather just copying the original’s wording structure for effect) is expressed in this passage:
Data presented in any medium is a powerful tool and must be used responsibly, but it is when information is expressed visually that the risks are highest.
However, it is also a balanced argument, acknowledging that the attributes of data visualisation that create potential ‘risks’ of misrepresentation or misinterpretation are, conversely, also its great strengths. The points that John makes about the importance of including sources, workings/treatments applied to data, assumptions and caveats are entirely right, and doing so should be the default practice for any serious designer. I’m a great believer in the idea of ‘good enough’ being a suitable threshold in many situations. The pursuit of perfection is idealistic and also likely to result in inertia, yet if we have accepted any shortcomings in our data or understanding we should make this clear to our audiences.
My overall response to John’s article, shared by a few I’ve seen discussing this issue, is that in isolation it is a legitimate perspective to take. However, there is fundamentally a wider issue to contemplate: should you trust a newspaper? Should you trust a research paper? Should you trust any form of communication? Data visualisation is a form of communication, one that is exposed to the same potential weaknesses, biases, prejudices and subjectivity of its creator as any other form might be. This bias might be overt, or it might be more subtle and subconscious, but it still exists.
Every visualisation takes a ‘stance’ of some nature. The popular concept of seeing visualisation as being akin to taking a photograph of data implies an angle, a framing and interrogation of a subject matter, context that is included and wider context that is not. Other views of the same subject matter are possible but one makes a specific judgment about their own choice of angle, be it provocative and biased, neutral and detached, big picture or localised snapshot.
Similarly, data visualisation, in its role as a window to facilitate interpretation of a subject, also potentially exposes an audience’s own pre-conditioned opinions and personal takes on that subject. This is not just data visualisation’s problem; it is communication’s problem as a whole. Otherwise, why would this lot exist?
To provide a somewhat tangential illustration of the issues with imperfections in written or visual communication and its interpretation, I am reminded of this hugely discussed, controversial photograph, one that has been interpreted across the entire spectrum of views (here, here, here and here, for example).
I’ve included below a short video from Andy Cotgreave, a name I’m sure most of you will recognise from Tableau. Andy has put together a neat little response to one aspect he felt was missing from John’s argument, including a short, simple demo to support his point.