This is a guest post from Jon Schwabish, an economist and data visualization creator. You can reach him at jschwabi@yahoo.com or by following him on Twitter @jschwabish. Many thanks to Jon for his efforts to keep those of us who aren't able to make it to this great flagship event informed.


So welcome to (my) day 2 of Visweek! Today was even more exciting than yesterday (day 1 review here)! As the BioVis folk filtered out and the InfoVis people filtered in, there was a palpable change in energy in the conference area. It could have been more people, or the soda served at the first break, or the break from the rain, but there is definitely a higher energy here today.

In the Morning

Today marked my first Visweek Keynote address, given this morning by Mary Czerwinski of Microsoft Research. All three ballrooms were open, the room was packed, and people seemed excited. The talk was interesting, but I was not inspired. She spoke about a number of interesting projects Microsoft is working on (and has worked on) and showed some entertaining videos of projects that they worked on back in the day of IE4, as well as various survey design techniques. Judging by the number of questions at the end, many people seemed to find it interesting, but I was looking for something a bit broader, a sentiment that was shared by a few other conference attendees I spoke with.

I then turned to my first InfoVis session (in case you’re confused about the differences between all of these “Vis” sessions, I’ll spend some time discussing them in my final post). This session included the “best paper” of the conference by Steve Haroz and David Whitney about how to use color to encode information. Steve did a great job presenting—the authors made three general conclusions, which I adapt below to be a bit more practical:

  1. Minimize the use of bins (i.e., color) for categorical data;
  2. If possible, require interactivity in your design; and
  3. Group like items together.

I also enjoyed the paper by Steven Gomez and his co-authors, in which the authors asked students in various fields to present research from other academic disciplines. They then visualized the “average” presentation slide and, based on my own experience, I wasn’t totally surprised that they found that students from the social sciences (e.g., economics) were more likely to have slides filled with text and bullet points than students in the humanities. The paper by Michael Sedlmair and his co-authors was also a nice paper about design studies. I talked to a few people afterwards who thought the lessons from their work were well presented and discussed, and could be used as an outline for others undertaking design studies.

Overall, I don’t want to spend too much time dissecting these papers or others in the session (I’m not really qualified to do so without reading the papers in detail), but, in general, I thought the quality of both the presentations and the slides was really quite excellent. The slides were clear of clutter, text was large and sparse, the color choices were simple, and the stories were clear.

In the Afternoon

I had a great lunch with some graduate students, Jerome Cukier, and Drew Skau (from Tableau), and followed that up with some more interesting sessions. I will say that the afternoon InfoVis sessions (I did not try the SciVis or VAST sessions today) were a bit more technical than the morning session, so I got a bit less out of them. The paper by Anastasia Bezerianos and Petra Isenberg, which tested how well people perceive visual variables from different perspectives in front of a large wall display, was really interesting, and I enjoyed the description of the actual experiment.

I also spent a bit of time in Marek Kultys’ tutorial on Good Practice of Visual Communication Design in Scientific and Data Visualization. Marek fit a really nice basic-level data visualization course into his 4 hours or so of lecture, including a practical design period in which students were given a design challenge and asked to do some sketches (unfortunately, I didn’t stay to see how it turned out). I always find it interesting to see how different people try to teach basic data visualization principles, and though I found his slides hard to read, I think his “6 Principles of Information Design” (borrowed from Edward Tufte’s book “The Visual Display of Quantitative Information”) are well worth repeating:

  1. Have a properly chosen format;
  2. Show comprehensive information in true context—that is, do not lie;
  3. Use words, numbers, and drawing together;
  4. Avoid content-free information, including chart junk (in other words, avoid redundant visual elements);
  5. Display an accessible complexity of detail; and
  6. Have a narrative quality—tell a story about your data.

In the Evening

So, a little personal plug here. I spent the first part of my evening with about 70 other people doing a “Poster Fast Forward”, which was really fun. Each of us got up for 30 seconds to give a preview of our poster; the posters themselves are being showcased in a neighboring room (mine is on the data visualization and infographic efforts at the Congressional Budget Office). I followed that up with an Ignite Talk (a 5-minute talk in which your 20 slides automatically advance every 15 seconds) at an event run by Noah Iliinsky and sponsored by Tableau. A lot of people from the conference showed up and everyone seemed to have a good time.

The Bad - or Perhaps Just My Confusion

I freely admit that the academic side of data visualization is not my area of expertise. But I saw a number of different presentations today and here are some of the reported sample sizes in each: 6, 10, 12, 15, 20, 5, 20, and 31. With such small sample sizes, I find it hard to believe the statistical results in any of these studies. I have to go to the back of my nearest statistics book to find the critical values of t-statistics when sample sizes are that small, and I am almost entirely certain that none of these authors can claim that their samples were truly random. (Someone did try to point out that perhaps the respondents were randomly seated around the computers; that doesn’t count!) It also appears that many of the participants in these studies are students at the universities where the experiments are taking place. I just find it hard to see how results from studies using such small samples can be applied to the population at large. Even if cost is an issue, conclusive results should be based on good statistics, and such small sample sizes do not, to me, inspire confidence.
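To put rough numbers on that worry, here is a minimal Python sketch of what those sample sizes mean for a 95% confidence interval around a mean. It is only an illustration under my own assumptions: a simple one-sample setting (which may not match the designs the authors actually used), critical values copied by hand from a standard two-tailed t-table rather than computed, and hypothetical helper names (`ci_half_width`, `summarize`).

```python
import math

# Two-tailed 5% critical values of Student's t, keyed by degrees of freedom.
# Copied from a standard t-table (assumption: not computed here).
T_CRIT_95 = {4: 2.776, 5: 2.571, 9: 2.262, 11: 2.201,
             14: 2.145, 19: 2.093, 30: 2.042}

# Sample sizes reported in the talks mentioned above.
SAMPLE_SIZES = [6, 10, 12, 15, 20, 5, 20, 31]


def ci_half_width(n, sd=1.0):
    """Half-width of a 95% confidence interval for a mean with n observations.

    The result is in the same units as sd, so with sd=1.0 it reads as
    "how many sample standard deviations wide is each side of the interval".
    """
    t_crit = T_CRIT_95[n - 1]  # degrees of freedom = n - 1
    return t_crit * sd / math.sqrt(n)


def summarize(sizes):
    """Return (n, critical t, half-width) for each distinct sample size."""
    return [(n, T_CRIT_95[n - 1], ci_half_width(n)) for n in sorted(set(sizes))]


for n, t_crit, half_width in summarize(SAMPLE_SIZES):
    print(f"n={n:2d}  t*={t_crit:.3f}  95% CI half-width = {half_width:.2f} sd")
```

Under these assumptions, each side of the interval is roughly 1.24 standard deviations wide with 5 participants and still about 0.37 standard deviations wide with 31, which is why small effects are hard to pin down at these sample sizes. None of this addresses the separate question of whether the participants are a random sample in the first place.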

There look to be a number of great talks tomorrow and I’m excited to see what others are doing.
