With all of the new faces that are likely to show up in light of the recent NPD news, I thought I'd post a thread giving an example of how statistical sampling works. (I also want to have a post to link back to whenever someone insists that the numbers on this site are "made up.") We'll keep this pretty simple, leaving out things like confidence intervals and normal distributions, while focusing on one example.
The premise of statistics is that we can look at a small portion of some large group, and use it to make very accurate predictions about the group as a whole. While there will always be discrepancies with the "real" number in the larger group, we can get an extremely good idea of the overall picture so long as we have a true random sample. Let's look at an example.
Probably everyone here has heard of gamefaqs.com, which conducts a poll each and every day. I snapped a picture of one of their polls right at the start of one day:
This question on favorite puzzle series is a good one to use, because there shouldn't be any fanboy bias to deal with. (This helps us get a random sample.) Notice that there are only 139 responses so far. Think of these numbers as the estimates produced by VGChartz. We have a small random sample of a much, much larger total - basically the same relationship that VGChartz has with retailers and sales.
So how accurate a picture did this particular small sample end up producing? Here's the same poll at the end of the day:
We now have over 67,000 votes on the same topic. Think of this as the "real" sales data VGChartz is trying to track. The initial sample tracked only 0.2% of the total - that's 1 out of 500! But surprise! The overall picture turns out to be extremely accurate. Our tiny sample correctly indicated that Tetris is by far the most popular game, with all the others trailing behind.
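For anyone who wants to check the "1 out of 500" figure, here's the arithmetic as a quick Python sketch. The 67,000 is a stand-in, since the poll only shows "over 67,000" votes, and the variable names are my own.

```python
# Sample size from the first screenshot of the poll.
sample_size = 139

# Assumption: the post only says "over 67,000", so 67,000 is a stand-in.
total_votes = 67_000

# Fraction of the final vote count that the early sample covered.
fraction = sample_size / total_votes
one_in = round(total_votes / sample_size)

print(f"Sample covers {fraction:.2%} of the final total")
print(f"That is about 1 vote out of every {one_in}")
```

So "0.2%" and "1 out of 500" are both fair roundings of the real ratio.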
Now if you look closely, you'll see that there are some errors in the sample. Tetris is overestimated (72% to 64%), Bust a Move and Lemmings are undertracked, and Pokemon Puzzle League is noticeably too high. In fact, our sample incorrectly had Pokemon Puzzle League ranked higher than Bust a Move and Lemmings. Pointing to errors like these is exactly the sort of cherry picking that doubters use to "disprove" VGChartz. But to do so is to miss the forest for the trees; individual elements in a statistical sample can definitely be off, especially when the numbers are close together. Clearly, however, the numbers from our sample were not "made up"; even though it was tiny, the sample nailed three games almost exactly (Columns, Mr. Driller, Puyo Pop) and was reasonably close on two more (Tetris, Pokemon Puzzle League). Most importantly, the overall shape of the group is extremely clear from our sample. The three-tier structure (Tetris alone, followed by close numbers for Bust a Move/Lemmings/Pokemon Puzzle League, and then a trailing group of the other three) immediately jumps out from both graphs.
So while the sampling methods VGChartz uses will often be wrong on the micro level (due to margin of error/confidence interval reasons), they will very rarely be wrong on the macro level. For those who would continue to doubt, try looking at how other samples are put together. You'll find that it's very possible indeed to look at a couple hundred responses and draw inferences about tens of thousands, or even millions, of pieces of data.
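If you'd rather test the micro vs. macro point yourself than take my word for it, here's a rough Python simulation. It is only a sketch under stated assumptions: the Tetris count is set to match the roughly 64% share mentioned above, but every other vote count is invented for illustration, and the function and variable names are my own. It pulls 139 random votes out of a 67,000-vote poll over and over, then reports how often Tetris still comes out on top and how far off the Tetris percentage is on average.

```python
import random
from collections import Counter

# Hypothetical final vote counts. Only the Tetris share (about 64%) comes
# from the post; every other figure is invented for illustration.
FINAL_VOTES = {
    "Tetris": 42_880,
    "Bust a Move": 6_700,
    "Lemmings": 6_700,
    "Pokemon Puzzle League": 4_690,
    "Columns": 2_010,
    "Mr. Driller": 2_010,
    "Puyo Pop": 2_010,
}


def simulate(sample_size=139, trials=1000, seed=0):
    """Draw many random samples and compare each one to the full poll."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # One list entry per vote, so a random draw is a true random sample.
    population = [game for game, n in FINAL_VOTES.items() for _ in range(n)]
    true_share = FINAL_VOTES["Tetris"] / len(population)

    leader_hits = 0    # samples where Tetris is still ranked first
    total_error = 0.0  # running sum of the miss on the Tetris share
    for _ in range(trials):
        counts = Counter(rng.sample(population, sample_size))
        if counts.most_common(1)[0][0] == "Tetris":
            leader_hits += 1
        total_error += abs(counts["Tetris"] / sample_size - true_share)

    return leader_hits / trials, total_error / trials


leader_rate, mean_error = simulate()
print(f"Tetris led in {leader_rate:.0%} of samples")
print(f"Average miss on the Tetris share: {mean_error:.1%}")
```

The individual percentages wobble by a few points from sample to sample (the micro level), while the overall ranking of the clear leader holds up (the macro level). Try changing `sample_size` to see how the wobble shrinks or grows.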
End of 2008 totals: Wii 42m, 360 24m, PS3 18.5m (made Jan. 4, 2008)