Data Visualization: How to Make a Really Bad Graph

As a data lover I have a particular nonpolitical aversion to Fox News. Their travesties in representing data have been well documented on other sites.

I enjoy poking fun at bad graphs, but over time one comes to recognize common traps many fall into when designing a graph. Recently I came across this particularly bad graph on a social network. If you know the source I would appreciate the ability to cite where it came from, but perhaps it is just as well that we let the origin remain anonymous.

Let's take this graph one piece at a time and let it instruct us on how to make our own bad graphs.

Use Multiple Y Axes when Displaying Similar Data

You'll note that the chart plots two continuous measures of exactly the same type of data (employment/population ratio), yet each is plotted against its own Y axis. It can be difficult to add more than two axes, but if you can manage it, extra axes really aid incomprehensibility, so people can't compare the data well.

Use Different, Unnatural X-Intercepts for each Y axis

Your readers expect each Y axis to start at zero, so make sure yours don't. Pick an arbitrary starting value for each axis that suits the message you wish your data to give and allows you to exaggerate its effect.
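To see how much a chopped axis can do for you, here is a minimal Python sketch. The values 70 and 66 are made up for illustration (they are not read from the graph), and `apparent_ratio` is a hypothetical helper name. The baseline of 62 is borrowed from the bottom of the male scale quoted later in this post.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn heights of a and b on an axis that starts at baseline."""
    if min(a, b) <= baseline:
        raise ValueError("both values must sit above the axis baseline")
    return (a - baseline) / (b - baseline)

# Hypothetical data points, chosen only to illustrate the effect.
honest = apparent_ratio(70, 66)         # axis starts at zero
dishonest = apparent_ratio(70, 66, 62)  # axis starts at 62

print(f"zero-based axis: {honest:.2f}x")
print(f"axis from 62:    {dishonest:.2f}x")
```

With these assumed numbers, a difference of about 6% is drawn as a doubling, which is exactly the kind of help your message needs.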

Use Different Scales for Similar Data

The author of this chart was very clever. Look at the difference in scale from the top to the bottom of the chart. On the male trendline the graph measures 78-62=16 units, and the female trendline measures 60-38=22 units. The more axes with misaligned scales the better. Your readers will try to compare the relative scales, and you will have them making incorrect inferences in no time.
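The arithmetic above can be turned into a rough measure of the distortion. This is a sketch that assumes both axes share the same plot height and run from 62 to 78 and from 38 to 60, as the figures above suggest. The name `height_per_point` is hypothetical.

```python
# Axis ranges as quoted in the text (assumed to be the axis limits).
MALE_AXIS = (62, 78)
FEMALE_AXIS = (38, 60)

def height_per_point(axis):
    """Fraction of the plot height that one unit of data occupies on this axis."""
    low, high = axis
    return 1.0 / (high - low)

male = height_per_point(MALE_AXIS)      # 1/16 of the plot height per unit
female = height_per_point(FEMALE_AXIS)  # 1/22 of the plot height per unit

# How much steeper the same change looks on the male axis.
exaggeration = male / female
print(f"same change looks {exaggeration:.3f}x steeper on the male axis")
```

Under those assumptions, an identical one-unit change is drawn 22/16, or roughly 1.4 times, steeper for one series than for the other, even though both measure the same kind of ratio.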

Use Ratios as your Data

It is well known that people do not mentally grasp ratios well. People understand absolute numbers, so whenever you need to shroud information, put it in the form of a ratio over or under some other number. In this particular case the ratio might actually make sense, though it would make more sense if it were quoted as a percentage of a certain population.

Add a Trendline, preferably Linear

Here's a big one. Graph consumers love trendlines. Adding a trendline to the graph is like adding a puppy under the Christmas tree. Trendlines reinforce the message that everyone should assume the data you're displaying is linear. All data is linear, right? A trendline lets your readers draw wonderfully ridiculous extrapolations off into eternity on either end, and lets you make a strong policy case that change is needed to modify the graph. Trendlines are also excellent for reinforcing the idea in your readers' minds that there is just one relevant variable in the function, usually time. Everyone remembers "y=mx+b". Give your readers an intercept b (see above regarding how to choose your intercepts), a comfortable slope m in the form of a trendline, and a nice range of x values along the bottom, and they will solve for truth on their own. This will keep your readers from considering other reasons you didn't have time to research that might be affecting the lines.
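For readers who want to extrapolate into eternity themselves, here is a minimal ordinary least squares sketch in plain Python. The years and ratios are invented for illustration and are not taken from the graph. `fit_line` and `year_of_doom` are hypothetical names.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

# Made-up data: a ratio that drifts down 2 points per decade.
years = [1970, 1980, 1990, 2000, 2010]
ratio = [76.0, 74.0, 72.0, 70.0, 68.0]

m, b = fit_line(years, ratio)

# Solve m*x + b = 0 to find when the trendline says nobody is employed.
year_of_doom = -b / m
print(f"slope {m:.2f} per year, ratio hits zero around {year_of_doom:.0f}")
```

On this invented data the line insists that employment vanishes entirely around the year 2350, which is the sort of conclusion a straight line will happily hand to anyone who asks.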

Hopefully this has been a valuable lesson in data-obfuscation. I'll continue to bring more tips as I find good examples of bad graphs.