Wednesday, August 1, 2007

I Still Hate Statistics

Anyone who knows me well is aware that I don’t have much use for statistics and infographics. Okay, I’ll just come out and say it—I hate statistics and infographics.

I find little or nothing that is informative about either. The fact is, all you have to do is read about the Enron corporation to realize that you can make numbers say whatever you want.

Case in point: John Gruber’s Daring Fireball: Linked List recently pointed to this blog entry:

Pixels Are The New Pies

An interesting infographic trend: Square blocks of color are now being used to represent percentage-based statistics instead of the traditional pie chart.

[Anil Dash: Unsolicited Opinions Since 1999]

I completely agree that pie charts are not a superb way of representing data, but the square block examples shown on Anil’s site aren’t the answer, either. Just look at the single column of yellow blocks in the first sample image that represent 10%. If you instead made those 10 blocks occupy a corner of the full region, your eye would perceive the “volume” occupied by the blocks a lot differently.

Also notice the white space in the middle of the first sample, which presumably represents 8% of something such as nonrespondents or no data. Not only does my eye want to tell me it’s more than 8% due to its centered location, but it also increases the perception that the dark green area is a lot more than 48%. In fact, if there were no legend and I didn’t count, but only glanced at the infographic, I would guess that the dark green area is about 75%, which is completely wrong.

Even if you don’t use infographics, you might think that showing raw numbers paints the perfect picture. I’m sorry to say it doesn’t. A chart of raw numerical statistics tells you nothing without some sort of legend or context. And it’s in that context where you can twist up what the numbers represent, making bad numbers paint something in a far better light than it deserves.

I do not know what the answer is for accurately representing large amounts of statistical data. I hate to complain about how much I hate something without having an idea on how to fix it, but statistics is something that might be impossible to fix.

