Data that speaks for itself: part 2
4 March 2014
Today I want to focus on applying yesterday’s key principles by walking through some good and bad examples.
We are often tempted to include a number of different variables in a single data visualisation. While this can be useful for comparison, it can also confuse readers, especially when the variables use different scales. Figure 1 shows an example of this: with two variables and two scales, it is difficult to work out which variable is plotted against which scale. Where possible, a data series should be displayed in its simplest form, and if you are including multiple factors in a visualisation, try to ensure there is only one scale.
Pie charts remain a popular choice, but data visualisation experts strongly warn against using them. Stephen Few, a world leader in information design and data visualisation, argues that pie charts make it difficult for readers to compare values. He concedes that they can be useful in assessing part-to-whole relationships, but makes a strong case for avoiding them in most situations.
Figure 2 illustrates just how hard it can be to read pie charts successfully. Aside from the lack of data labels and values, it is impossible to tell whether segment A or B is bigger, or how much smaller segment C is than segment D. In contrast, the bar chart in figure 3 is much easier to read. The chart shows data values on the Y axis and the layout makes it easy to compare data points.
Presenting data in a numerical progression is another way of making it easy for readers to interpret. Some data series have a natural order, for example weeks, years or degrees Celsius. Other data does not, and in this case you should think about ordering your data from the largest to the smallest (or vice versa) to help readers compare it with ease. Take the bar chart in figure 3. If we sort the data by its values, it is even easier to identify high and low scores. Figure 4 illustrates the same data as figure 3, but with a numerical progression.
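As a rough illustration of the sorting step, here is a minimal Python sketch. The category labels and scores are made up for the example (they are not the values behind figures 3 and 4), and the helper names `sort_by_value` and `text_bar_chart` are hypothetical. It assumes all values are positive.

```python
# Hypothetical scores: these are NOT the values behind figures 3 and 4.
scores = {"A": 23, "B": 25, "C": 9, "D": 14, "E": 31}


def sort_by_value(data, descending=True):
    """Return (label, value) pairs ordered by value."""
    return sorted(data.items(), key=lambda item: item[1], reverse=descending)


def text_bar_chart(pairs, width=40):
    """Render pairs as horizontal text bars scaled to the largest value.

    Assumes every value is positive.
    """
    largest = max(value for _, value in pairs)
    lines = []
    for label, value in pairs:
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<3}{bar} {value}")
    return "\n".join(lines)


# Largest first, so high and low scores sit at opposite ends of the chart.
ordered = sort_by_value(scores)
print(text_bar_chart(ordered))
```

Passing `descending=False` gives the smallest-to-largest version. Either way, the reader no longer has to hunt across the chart to find the extremes.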
Most data visualisation experts are clear about one thing: 3D charts should be avoided at all costs. They are difficult to read, and the graphics often detract from the data. Both Stephen Few and Edward Tufte discourage the use of 3D visualisations, arguing that important information may be hidden or distorted.
For example, compare the bar charts in figure 5. Apart from being an example of how not to colour your charts, the chart on the left is incredibly difficult to read: with 3D bars, it is almost impossible to tell at which point on the Y axis the data should be read. In contrast, the bar chart on the right has a much higher “data-ink ratio” (Tufte would be proud!). Instead of gridlines, the bars have dashes through them indicating key points on the Y axis and, all in all, the chart is cleaner to look at, more stylish and easier to read.
Charts, graphs, data visualisations: whatever you want to call them, they are a key way of presenting data. Choosing how to display your data is often an intuitive decision and everyone has their own preferences about how it should be done. That said, thinking carefully about how you present your data can increase the impact of your work and is something we should all be doing.
There are many ways to ‘make your numbers talk’ and different people have different preferences. What works for you?