
Forums - Gaming Discussion - ioi speaks out about ergh "VGC analysts"

Shadow1980 said:
ioi said:

To answer you and Carl directly - our lifetime hardware data is obviously a special case as we get shipment data and can make regular adjustments to our formulas (to represent changing trends as buying habits change over time) and correct older data to stay within a certain range of the shipment data. There is obviously increased emphasis on the lifetime hardware data as it means so much to everyone in the "console wars", and as such it is regularly adjusted to be as accurate as possible, but if shipment data wasn't released then we would probably be looking at a +/-5% range - which would be acceptable for most normal uses. Let's be honest here - the only reason people care is because the PS3 and 360 sales are quite close; if there was a 20-30m gap then nobody would be moaning about 200k here and there.

As for official shipment data - none of us know exact stock levels, exact date ranges used and so on. Manufacturers have been known to play tricks in the past with their numbers to make them look better in one way or another, so even these figures shouldn't be taken as gospel. I try to use shipment data as a guide to make sure we are in the right area, but with stock being anything from 100k to maybe 2-3m at different points in time, and with discrepancies in the actual dates used, the shipment data is only useful up to a certain point. With all of that said - what is the specific PS3 situation that you are referring to? I will take a look into it.

To make a general point - we don't just go in and start changing numbers to match someone else. I don't trust what some "insider" posts on Twitter and neither should anyone else. Similarly, most news sites that publish or leak data usually have no idea what they are actually talking about and often mis-report figures or don't compare the correct time ranges and so on. We have our own data, our own contacts, sources, insiders and I prefer to trust our info as I know where it is coming from and exactly what it represents.

It is quite infuriating when someone insists that our data is wrong because so-and-so said such-and-such when we have 100 times as much information on the subject as they ever will!


I'm fine with recent numbers being estimates as they can always be corrected and adjusted when better data comes in. However, there's one thing that's always bugged me. The numbers for older systems (sixth-gen and earlier) are pretty spot-on as we've had good data for many years, and we obviously won't be getting any better data than that. There's one exception to that rule, though: the original PlayStation. While the lifetime sales of the NES, SNES, Xbox, Saturn, etc., are all almost exactly what their lifetime shipments are, VGC's list of best-selling hardware has the PS1 as having sold 104.25 million, whereas Sony has stated that they shipped 102.49 million. Given how VGC's figures for other older systems are congruent with the official lifetime shipment numbers, the PS1's tally being about 1.76 million higher than the official count really sticks out. So, how'd that 104.25M figure come about? It's not really a big deal or anything, but I am curious as to why it alone amongst older major systems has a noticeable difference between lifetime shipments and lifetime sales.

And the PS2 figure here is a lot lower than its shipment figure... but not many bother with it... those systems are old and dominated, so even a 20M error wouldn't make a difference...

And about the other post about other sources... say one that is free and has weekly figures... and a funny forum.



duduspace11 "Well, since we are estimating costs, Pokemon Red/Blue did cost Nintendo about $50m to make back in 1996"

http://gamrconnect.vgchartz.com/post.php?id=8808363

Mr Puggsly: "Hehe, I said good profit. You said big profit. Frankly, not losing money is what I meant by good. Don't get hung up on semantics"

http://gamrconnect.vgchartz.com/post.php?id=9008994

Azzanation: "PS5 wouldn't sold out at launch without scalpers."

MaskedBandit2 said:
ioi said:
MaskedBandit2 said:

I don't understand the post about error percentages, bell curve analysis, and probability. To me, it seems like a weak attempt at an excuse for publishing bad numbers. You say you're not "wrong" if you publish 600k and VGC says 485k. That's a 24% error, and a difference of 115k units! How is that even acceptable? As mentioned, it only gets worse as numbers scale higher. A 20% error of 2M is 400k units. That is definitely not meaningless. Your validation for this is that NPD has a margin of error as well, and does the same thing that VGC does? Naturally there's error, but it's going to be much smaller.

And when I look at actual charts, if you want to say the numbers can't be wrong and you're just using probabilities, why are you even publishing these ridiculously precise numbers? What's the difference between saying one game sold 238,854 and another selling 241,913? Heck, what's the point of even publishing the high figures this site does if you're saying they can fall in such a large range? It doesn't take much thought to know a game like GTA is going to sell in the multi-millions. If you're saying the numbers can't be wrong because they fall within a decent portion of a standard distribution, despite being off by a couple million, what's the point?


It's not an excuse, it is an explanation. Read my last post before this one - we take data from a sample population and scale it up to represent data from the whole population. Given variances in what the sample does compared to the whole population, there will be a bell-curve probability of the real values around our estimated one. The further you go from the estimate, the less likely you are to get that value.

Roughly speaking for the USA, we are using data from ~2m people to represent what the entire population are doing. Now a sample of 2 million people is enormous but even so it is less than 1% of the entire population and if for some reason we have bias towards particular regions, ethnic groups, age ranges, household incomes, genders and so on then our data will be an imperfect sample.

As for publishing data to the nearest unit - that is common practice. 238,854 doesn't mean that we have personally tracked exactly 238,854 sales of something - it means in reality that we may have tracked 1571 sales of something and via various scaling methods and adjustments have arrived at that figure as our best estimate of the sales of that product - which represents the centre of the bell curve.

Then why even publish 238,854? Your original post says if you put a number (600k) that doesn't mean it sold 600k, but rather, it's an estimate, thought of as a probability. If you have two close numbers like the ones I mentioned, why would you not report both as 240k, as they basically have the same probability of being off, especially since the small differences are likely just statistical noise? It comes across as misleading. Why even rank the sales?

Cumulative numbers will be quite a bit off if you round the numbers at every turn.
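
To illustrate the scaling ioi describes above, here is a minimal Python sketch. It assumes the tracked count behaves like a Poisson variable (so its one-sigma sampling error is its square root) and it ignores any sample bias. The function name `scale_up` and the scale factor are hypothetical, chosen only so that 1,571 tracked sales map to 238,854; this is not VGChartz's actual method.

```python
import math

def scale_up(tracked_sales: int, scale_factor: float) -> tuple[float, float]:
    """Return (estimate, one-sigma error) for a scaled-up sample count.

    Assumption: tracked sales follow a Poisson distribution, so the
    sampling error of the raw count is sqrt(tracked_sales). Sample bias
    (region, age, income and so on) is not modelled.
    """
    estimate = tracked_sales * scale_factor
    sigma = math.sqrt(tracked_sales) * scale_factor
    return estimate, sigma

# Hypothetical factor that maps 1,571 tracked sales to 238,854.
factor = 238_854 / 1_571
estimate, sigma = scale_up(1_571, factor)

# Gap between the two chart figures quoted in the thread.
gap = 241_913 - 238_854

print(f"estimate: {estimate:,.0f} +/- {sigma:,.0f} (one sigma)")
print(f"gap between the two titles: {gap:,}")
```

Under these assumptions the one-sigma error comes out at roughly 6,000 units, so the 3,059-unit gap between 238,854 and 241,913 falls well inside it.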



kitler53 said:
if only we were arguing about 5%...

I monitor the numbers really, really closely; the adjustments I see are more like:
Japan : 0%
America: 20 - 100%
Europe: 50 - 200%


How exactly are we even verifying European numbers?

For Japan we have Media Create Sales/Famitsu, for North America we have NPD... What's the equivalent for Europe?



Zkuq said:
Cumulative numbers will be quite a bit off if you round the numbers at every turn.

Not if you balance the adjustments... some up, some down, and you can always adjust when you have more data, like they do now... but in math, it's wrong to report more significant figures than your margin of error supports...

Let's say you have an analog scale that measures in 1-pound increments... So you would measure someone at 150 lbs or 151... it would be acceptable to use 150.5 +- 0.5 as well... but it would be totally wrong to use 150.56738 lbs because you have no way to ascertain that level of precision... but I don't know much about statistical methods, or whether it is acceptable to use this many significant figures.
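
The disagreement about rounding can be sketched with made-up weekly figures (every number below is hypothetical, as is the helper `round_to`). Rounding each week before adding lets the errors pile up, while keeping full precision internally and rounding only the displayed total does not.

```python
# Hypothetical weekly sales estimates for one title.
weekly = [23_885, 24_191, 14_420, 9_310, 6_480]

def round_to(value: int, step_digits: int) -> int:
    """Round an integer to the nearest 10**step_digits."""
    return round(value, -step_digits)

# Option 1: round each week to the nearest 10k, then add.
sum_of_rounded = sum(round_to(w, 4) for w in weekly)

# Option 2: add the unrounded figures, then round once for display.
exact_total = sum(weekly)
rounded_total = round_to(exact_total, 4)

print(sum_of_rounded)  # 70000
print(exact_total)     # 78286
print(rounded_total)   # 80000
```

In this example the per-week rounding loses more than 8,000 units against the exact total, which is the drift Zkuq is pointing at, while rounding only at display time keeps the running total intact.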




DietSoap said:

How exactly are we even verifying European numbers?

For Japan we have Media Create Sales/Famitsu, for North America we have NPD... What's the equivalent for Europe?


Chart-Track and other sales trackers for software; Nintendo's bar charts for hardware. They don't give any numbers, though, so it's mostly guesswork and making sure the numbers at least match the proportions we're given.




Personally, I think one of the best practical demonstrations of the point ioi is making is in the Japanese sales threads. We get numbers from 3 different trackers every week (Famitsu, Dengeki and Media Create) and all of them report different numbers, sometimes significantly, with ranges of as much as 10% even on bigger titles.

Add to this differences in methodology, and simply in what they choose or are able to track, and you can arrive at very different numbers even from 3 trackers which are all considered very reputable, some of which are used by publishers in the same way as NPD is in the USA.

Ultimately, no matter how much we would love to be able to say with 100% accuracy that game X sold Y number of units in week Z, we can't and realistically never will be able to.



I personally am just happy I have a site with a great community that provides (IMO) the most accurate numbers in one easy location. Things are never perfect, and never will be. I have been coming to the site for a long time and will continue to do so.

As always ioi, thank you for providing a place for me to get sales, news, reviews, and (for the most part) intelligent gamer conversation. Don't let the rabid fans of (insert console of choice) get to you...😉.



TheTruthHurts! said:
I personally am just happy I have a site with a great community that provides (IMO) the most accurate numbers in one easy location. Things are never perfect, and never will be. I have been coming to the site for a long time and will continue to do so.

As always ioi, thank you for providing a place for me to get sales, news, reviews, and (for the most part) intelligent gamer conversation. Don't let the rabid fans of (insert console of choice) get to you...😉.

OUCH!

...

Sorry, the truth hurts!

(I'll let myself out the door now...)



I think supply scarcity throws a major wrench into VGC's projections, which is why I don't know how accurate that PS4 number is. For instance, this holiday there was a week when Walmart was reported to have received a huge shipment while few other retailers got any; if VGC does not track Walmart in their data, they would not reflect those sales accurately. There was another week when Best Buy, Target and Amazon got huge shipments, and the same applies in reverse: if VGC had data on those retailers but not on Walmart, which did not get many units that week, they would overtrack, because they would extrapolate those sales to other retailers that did not get supply.
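
A rough sketch of that effect, with entirely hypothetical numbers: suppose a tracker sees only some retailers and assumes they always account for a fixed share of the market. The function `extrapolate` and the 50% share are illustrative assumptions, not a description of how VGC works.

```python
def extrapolate(tracked_sales: int, assumed_share: float) -> float:
    """Scale tracked retailers' sales up to a whole-market estimate."""
    return tracked_sales / assumed_share

# Hypothetical: tracked retailers normally make half of all sales.
ASSUMED_SHARE = 0.5

# (tracked retailers' sales, untracked retailer's sales) per week.
weeks = {
    "untracked retailer gets the stock": (20_000, 100_000),
    "tracked retailers get the stock": (100_000, 10_000),
}

for label, (tracked, untracked) in weeks.items():
    estimate = extrapolate(tracked, ASSUMED_SHARE)
    actual = tracked + untracked
    print(f"{label}: estimate {estimate:,.0f} vs actual {actual:,}")
```

In the first week the estimate (40,000) undertracks the actual 120,000; in the second the estimate (200,000) overtracks the actual 110,000, even though the scaling rule never changed.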



End of 2009 Predictions (Set, January 1st 2009)

Wii- 72 million   3rd Year Peak, better slate of releases

360- 37 million   Should trend down slightly after 3rd year peak

PS3- 29 million  Sales should pick up next year, 3rd year peak and price cut

While I understand where you're coming from Brett, I think the discrepancy between what readers think is accurate and what you'd call accurate is partially down to how the numbers are reported. Is it common in statistics to report numbers to such an apparently high precision even though you know the error to be larger than that? In a scientific journal the standard is to report the estimated error in the results and to round the number to the first significant digit, which with a 5% error would probably be the third digit, so 282,965 would become 283,000. Is there any particular issue with reporting the numbers this way, with the estimated error included, versus what you do now?
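
As a sketch of that reporting style: the helper below (the name `round_with_error` is hypothetical) rounds the error to a chosen number of significant figures and then rounds the value to the same decimal place. Keeping two significant figures in the error is one common convention and is an assumption here; keeping one is also widely used and gives a coarser result.

```python
import math

def round_with_error(value: float, rel_error: float, error_digits: int = 2) -> tuple[int, int]:
    """Round value and its error to the precision the error supports.

    Assumes the absolute error is at least 1 unit, which holds for
    sales figures of this size.
    """
    error = abs(value) * rel_error
    # Decimal position of the last digit kept in the error.
    exponent = math.floor(math.log10(error)) - (error_digits - 1)
    step = 10 ** exponent
    return round(value / step) * step, round(error / step) * step

print(round_with_error(282_965, 0.05))                  # (283000, 14000)
print(round_with_error(282_965, 0.05, error_digits=1))  # (280000, 10000)
```

With two significant figures in the error, 282,965 at 5% becomes 283,000 +/- 14,000, matching the figure in the post; with one significant figure it becomes 280,000 +/- 10,000.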



...