Forums - Gaming Discussion - ioi speaks out about ergh "VGC analysts"

kitler53 said:
if only we were arguing about 5%...

i monitor the numbers really, really closely; the adjustments i see are more like:
Japan : 0%
America: 20 - 100%
Europe: 50 - 200%


How exactly are we even verifying for European numbers?

 

For Japan we have Media Create Sales/Famitsu, For North America we have NPD... What's the equivalent for Europe?



Zkuq said:
MaskedBandit2 said:
ioi said:
MaskedBandit2 said:

I don't understand the post about error percentages, bell curve analysis, and probability. To me, it seems like a weak attempt at an excuse for publishing bad numbers. You say you're not "wrong" if you publish 600k and VGC says 485k. That's a 24% error, and a difference of 115k units! How is that even acceptable? As mentioned, it only gets worse as numbers scale higher. A 20% error of 2M is 400k units. That is definitely not meaningless. Your validation for this is that NPD has a margin of error as well, and does the same thing that VGC does? Naturally there's error, but it's going to be much smaller.

And when I look at actual charts, if you want to say the numbers can't be wrong and you're just using probabilities, why are you even publishing these ridiculously precise numbers? What's the difference between saying one game sold 238,854 and another sold 241,913? Heck, what's the point of even publishing the high figures this site does if you're saying they can fall in such a large range? It doesn't take much thought to know a game like GTA is going to sell in the multi-millions. If you're saying the numbers can't be wrong because they fall within a decent portion of a standard distribution, despite being off by a couple million, what's the point?


It's not an excuse, it is an explanation. Read my last post before this one - we take data from a sample population and scale it up to represent data from the whole population. Given variances in what the sample does compared to the whole population, there will be a bell-curve probability of the real values around our estimated one. The further you go from the estimate, the less likely you are to get that value.

Roughly speaking for the USA, we are using data from ~2m people to represent what the entire population are doing. Now a sample of 2 million people is enormous but even so it is less than 1% of the entire population and if for some reason we have bias towards particular regions, ethnic groups, age ranges, household incomes, genders and so on then our data will be an imperfect sample.

As for publishing data to the nearest unit - that is common practice. 238,854 doesn't mean that we have personally tracked exactly 238,854 sales of something - it means in reality that we may have tracked 1571 sales of something and via various scaling methods and adjustments have arrived at that figure as our best estimate of the sales of that product - which represents the centre of the bell curve.
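To make the scaling idea above concrete, here is a minimal sketch in Python. It is not VGC's actual method. It assumes the tracked count behaves like a simple Poisson count with no sampling bias, and the scale factor is a hypothetical one chosen only so that 1571 tracked sales map to the 238,854 figure mentioned above.

```python
import math

def scale_up(tracked_sales: int, scale_factor: float) -> tuple[float, float]:
    """Return (estimate, standard_error) for a scaled-up sample count.

    Assumption: the tracked count is Poisson-like, so its standard
    deviation is sqrt(tracked_sales). A biased sample would have a
    larger real uncertainty than this.
    """
    estimate = tracked_sales * scale_factor
    standard_error = math.sqrt(tracked_sales) * scale_factor
    return estimate, standard_error

# Hypothetical scale factor, picked to reproduce the example in the post.
SCALE = 238_854 / 1571

estimate, se = scale_up(1571, SCALE)
# Roughly 95% of a bell curve lies within 1.96 standard errors of the centre.
low, high = estimate - 1.96 * se, estimate + 1.96 * se
print(f"estimate {estimate:,.0f}, ~95% range {low:,.0f} to {high:,.0f}")
```

Under these assumptions the relative standard error is 1/sqrt(1571), about 2.5%, so even an unbiased sample leaves a range of several thousand units around the published figure.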

Then why even publish 238,854? Your original post says that if you put a number (600k), that doesn't mean it sold 600k; rather, it's an estimate, thought of as a probability. If you have two close numbers like the ones I mentioned, why would you not report both as 240k, since they basically have the same probability of being off, especially as the small differences are likely just statistical noise? It comes across as misleading. Why even rank the sales?

Cumulative numbers will be quite a bit off if you round the numbers at every turn.

Not if you balance the adjustments... some up, some down, and you can always adjust when you have more data, like they do now... but in math, it's wrong to report more significant figures than your margin of error supports...

Let's say you have an (analog) scale that measures in 1-pound increments... So you would measure someone at 150 lbs or 151... it would be acceptable to use 150.5 +- 0.5 as well... but it would be totally wrong to use 150.56738 lbs because you have no way to ascertain that precision... but I don't know much about statistical methods or whether it is acceptable to use this many significant figures.
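As a rough illustration of the "statistical noise" point in this exchange, the sketch below checks whether two estimates can be told apart once each carries a margin of error. The 5% margin is an assumption for illustration only, not a figure VGC has published, and comparing overlapping ranges is a crude test rather than a proper significance test.

```python
def distinguishable(a: float, b: float, rel_margin: float) -> bool:
    """True if the +/- rel_margin ranges around a and b do not overlap."""
    low_a, high_a = a * (1 - rel_margin), a * (1 + rel_margin)
    low_b, high_b = b * (1 - rel_margin), b * (1 + rel_margin)
    return high_a < low_b or high_b < low_a

# The two close chart figures from the discussion: the ranges overlap.
print(distinguishable(238_854, 241_913, 0.05))  # False

# The 485k vs 600k example: the ranges do not overlap.
print(distinguishable(485_000, 600_000, 0.05))  # True
```

With that assumed margin, ranking 238,854 above or below 241,913 says very little, while a 485k vs 600k gap is larger than the margin alone can explain.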



duduspace11 "Well, since we are estimating costs, Pokemon Red/Blue did cost Nintendo about $50m to make back in 1996"

http://gamrconnect.vgchartz.com/post.php?id=8808363

Mr Puggsly: "Hehe, I said good profit. You said big profit. Frankly, not losing money is what I meant by good. Don't get hung up on semantics"

http://gamrconnect.vgchartz.com/post.php?id=9008994

Azzanation: "PS5 wouldn't sold out at launch without scalpers."

DietSoap said:

How exactly are we even verifying for European numbers?

 

For Japan we have Media Create Sales/Famitsu, For North America we have NPD... What's the equivalent for Europe?


Chart-Track and other sales trackers for software; Nintendo's bar-charts for hardware.  Doesn't give any numbers, though, so it's mostly guesswork and making sure the numbers at least match the proportions we're given.



Personally, I think one of the best practical demonstrations of the point ioi is making is in the Japanese sales threads. We get numbers from 3 different trackers every week (Famitsu, Dengeki and Media Create) and all of them report different numbers, sometimes significantly, with ranges of as much as 10% even on bigger titles.

Add to this differences in methodology and simply in what they choose / are able to track, and you can arrive at very different numbers even from 3 trackers which are all considered very reputable, some of whom are used by publishers in the same way as NPD is in the USA.

Ultimately, no matter how much we would love to be able to say with 100% accuracy that game X sold Y number of units in week Z, we can't and realistically never will be able to.



I personally am just happy I have a site with a great community that provides (IMO) the most accurate numbers in one easy location. Things are never perfect, and never will be. I have been coming to the site for a long time and will continue to do so.

As always ioi, thank you for providing a place for me to get sales, news, reviews, and (for the most part) intelligent gamer conversation. Don't let the rabid fans of (insert console of choice) get to you...😉.



TheTruthHurts! said:
I personally am just happy I have a site with a great community that provides (IMO) the most accurate numbers in one easy location. Things are never perfect, and never will be. I have been coming to the site for a long time and will continue to do so.

As always ioi, thank you for providing a place for me to get sales, news, reviews, and (for the most part) intelligent gamer conversation. Don't let the rabid fans of (insert console of choice) get to you...😉.

OUCH!

...

Sorry, the truth hurts!

(I'll let myself out the door now...)



I think supply scarcity throws a major wrench into VGC's projections, which is why I don't know how accurate that PS4 number is. For instance, this holiday there was a week when Walmart was reported to get a huge shipment but not many other retailers got any. If VGC does not track Walmart in their data, they would not reflect those sales accurately. There was another week when Best Buy, Target and Amazon got huge shipments, and the reverse applies: if VGC had data on those retailers but not on Walmart, which did not get many units that week, they would overtrack, because they would extrapolate those sales to other retailers that did not get supply.
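The effect described above can be sketched with made-up numbers. Everything here is hypothetical: the unit counts, the 75% coverage figure, and the idea that the tracker simply divides tracked sales by its assumed market coverage. It only shows the direction of the error, not how VGC actually extrapolates.

```python
def extrapolate(tracked_units: list[int], coverage: float) -> float:
    """Scale tracked retailer sales up by the assumed market coverage."""
    return sum(tracked_units) / coverage

# Hypothetical: the tracker sees 3 retailers and assumes they are 75% of the market.
COVERAGE = 0.75

weeks = {
    "untracked retailer gets the shipment": {
        "tracked": [10_000, 10_000, 10_000],
        "untracked": 90_000,
    },
    "tracked retailers get the shipment": {
        "tracked": [40_000, 40_000, 40_000],
        "untracked": 10_000,
    },
}

for label, week in weeks.items():
    estimate = extrapolate(week["tracked"], COVERAGE)
    actual = sum(week["tracked"]) + week["untracked"]
    print(f"{label}: estimate {estimate:,.0f} vs actual {actual:,}")
```

In the first week the estimate (40,000) falls far short of the actual 120,000; in the second it overshoots (160,000 vs 130,000), because a fixed coverage assumption breaks down when supply is uneven across retailers.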



End of 2009 Predictions (Set, January 1st 2009)

Wii- 72 million   3rd Year Peak, better slate of releases

360- 37 million   Should trend down slightly after 3rd year peak

PS3- 29 million  Sales should pick up next year, 3rd year peak and price cut

While I understand where you're coming from Brett, I think the discrepancy between what readers think is accurate and what you'd call accurate is partially down to how the numbers are reported. Is it common in statistics to report numbers to such an apparently high precision even though you know the error to be higher than that? In a scientific journal the standard is to report the estimated error in the results and to round the number to the first significant digit of that error, which with a 5% error would probably be the third digit, so 282,965 would become 283,000. Is there any particular issue with reporting the numbers this way, with the estimated error included, versus what you do now?



...

supernihilist said:
ninetailschris said:
To be fair, vgchartz is not the most reliable source for numbers out of all the well known ones. Vgchartz makes like lot of amateur mistakes.


Really?

just name me one


Almost everything during the holidays, where they always adjust to NPD later on. The UK for 3DS is overtracked by hundreds of thousands. PS3 titles like God of War are greatly undertracked while others are greatly overtracked. Their numbers seem to be more of a guessing game based on preorders than anything actually based on sales. I find it strange that the tracked sales of the game Knack in the USA were almost exactly what they had predicted it would sell. The actual sales based on NPD were much lower. This is all, of course, just personal observation, but I don't believe any company takes the sales from this site seriously.



"Excuse me sir, I see you have a weapon. Why don't you put it down and let's settle this like gentlemen"  ~ max

Though I absolutely agree with vgc methodology and ioi's explanation of that, in my opinion it doesn't help that vgc sticks with reporting numbers that are higher precision than the source data can support. The precision of a number should be based on the margin of error of the underlying data, otherwise you are simply making up a level of precision that isn't supported. If the margin is 5%, then sales of 101428 should be reported as 101428 +/- 5071, or simply 100k because you've got a 4-digit margin of error (leaving the 2 most significant numbers as the actual supported data). That's basic statistical and scientific practice: not reporting data at a higher precision than the data can support. The difficulty obviously is working out your error margin.
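A minimal sketch of that rounding rule, assuming a known relative margin of error. The function name and the choice to round at the leading digit of the margin are illustrative; other conventions keep one digit more or fewer.

```python
import math

def round_to_margin(value: float, rel_margin: float) -> tuple[float, float]:
    """Round value and its margin at the margin's leading digit.

    Returns (rounded_value, rounded_margin).
    """
    margin = abs(value) * rel_margin
    if margin <= 0:
        raise ValueError("margin of error must be positive")
    # Position of the margin's leading digit, e.g. 5071 -> 3 (thousands).
    exponent = math.floor(math.log10(margin))
    step = 10 ** exponent
    return round(value / step) * step, round(margin / step) * step

# The example from the post: 101428 with an assumed 5% margin.
print(round_to_margin(101_428, 0.05))  # (101000, 5000)
```

This reports 101,000 +/- 5,000; dropping one more digit gives the 100k suggested in the post.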