
Forums - Gaming Discussion - ioi speaks out about ergh "VGC analysts"

This is a sales site. When the numbers are completely wrong then it becomes hard for anyone to trust this site's numbers. Sometimes the estimates are correct but usually they are not close.



    

NNID: FrequentFlyer54

DonFerrari said:
Alby_da_Wolf said:
To people suggesting that a number like 370022 for a given platform in a given week be rounded to 370000: it wouldn't be correct. If the calculations give a result of 370022 ± error margin, rounding it to 370000 ± error margin isn't the same thing. Look at the first message of this thread, where it's explained quite clearly: that 370022 value is the midpoint of the probability curve of the extrapolation of the sales data collected that week. BTW, a rounding to the closest unit has most probably already happened. Sales are integer numbers, so the start values are integers and the final values must be too, but intermediate values very often won't be. Calculations are made keeping all the available decimals in every intermediate result, and rounding happens only at the end, to limit the growth of rounding error, which would just be added to the error already present in the extrapolation.
I'm puzzled. I thought some basic rules about rounding and measurement errors were also taught in high-school physics courses all around the world, not just at university.

And apparently you haven't learned them, since you want to use more significant figures than the tolerance permits.

Just no. When you receive sales data, you collect integer numbers, and you aren't yourself introducing an additional measurement error as you would if you were measuring a length with, say, a ruler graduated in 1 mm steps. Those integer numbers can be exact, if the store can give them to you for a given week, or they'll be approximations plus or minus an error margin. You'll take that into account, along with the precision available for your internal calculations, the accumulation of approximation error, and obviously your estimate of the error in the formula you devised to extrapolate your data. But even then, rounding the final central value to an integer other than the closest one (or the closest greater or lower one) would be an error in every case.
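
To make the point about the shifted midpoint concrete, here is a minimal Python sketch. The ±5% margin and the helper name `covering_margin` are assumptions made up for illustration; nothing here is ioi's actual method. It only shows that if the central value is moved by rounding, the margin has to grow by the size of the shift to still cover the original interval.

```python
def covering_margin(center, margin, step):
    """Round `center` to the nearest multiple of `step` and return the
    rounded center together with the margin needed so that the new
    interval still contains [center - margin, center + margin]."""
    rounded = round(center / step) * step
    shift = abs(center - rounded)      # how far the midpoint moved
    return rounded, margin + shift

# The 370022 figure from the thread, with a hypothetical ±5% margin.
center = 370022
margin = 0.05 * center

rounded, widened = covering_margin(center, margin, 1000)
print(rounded)            # rounded central value
print(widened - margin)   # extra margin paid for the rounding
```

Keeping the old margin around the rounded value would describe a slightly different interval, which is why the two statements aren't equivalent.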



Stwike him, Centuwion. Stwike him vewy wuffly! (Pontius Pilate, "Life of Brian")
A fart without stink is like a sky without stars.
TGS, Third Grade Shooter: brand new genre invented by Kevin Butler exclusively for Natal WiiToo Kinect. PEW! PEW-PEW-PEW! 
 



Not sure if you know metrology, but you can also decentralize your "mean": instead of saying 10.0 ± 0.05 you could say 9.95 + 0.1. I don't know whether that would be acceptable in statistics, but I never saw an election poll reported as 22.5 ± 0.5% of voting intention (and they use integer numbers even though they state results as percentages), and then they give the margin of error as 5 pp (percentage points). By your logic they should state it as 22.5341% (if that was the exact value their method produced with, let's say, 1430 people polled).



duduspace11 "Well, since we are estimating costs, Pokemon Red/Blue did cost Nintendo about $50m to make back in 1996"

http://gamrconnect.vgchartz.com/post.php?id=8808363

Mr Puggsly: "Hehe, I said good profit. You said big profit. Frankly, not losing money is what I meant by good. Don't get hung up on semantics"

http://gamrconnect.vgchartz.com/post.php?id=9008994

Azzanation: "PS5 wouldn't sold out at launch without scalpers."

Thanks ioi for your hard work. 




I know you can do it, and yes, if you decentralize the way you describe, taking into account the additional error you introduce by doing so, then it's correct. But this way you are unnecessarily increasing the error margin and unnecessarily worsening the error accumulation in the lifetime total. And yes, for quick readability it also makes sense to round to an integer percent value, or to just one decimal, when you communicate a percent result, like ioi does in the bar chart on VGC's front page. But you don't want to do the same for data that you'll reuse for further calculations, like adding weekly absolute numbers to get monthly, yearly, LTD and lifetime totals. Likewise, when using those numbers to draw a curve, rounding them to the resolution available is again correct, but only for FINAL data that you won't reuse. If you want to draw another graph, you'll start again from the unrounded values; you won't recover them by scanning the graph you previously made from rounded values. If you round data that must be reused too much and too early, the error on further results could skyrocket. BTW, rounding to the closest hundred or thousand can add a significant percent error for small weekly numbers.
Then there is a method issue. ioi devised his extrapolation formulas and he needs to constantly refine, update and possibly improve them, or at least not let them worsen as the market changes. To achieve that he cannot arbitrarily round numbers. He could do it for some samples if he found that a given source is systematically above or below the average, so that its error isn't random but biased up or down. Outside of these cases, though, he'll want to keep all his numbers as centered as possible, and he'll only round data that he just presents to the public without reusing it, like the percent values and rounded totals in the front page bar charts. Another thing: real sales numbers are integers, so after applying extrapolation formulas, which will mostly give results with decimals, rounding to the closest integer is right and necessary. But once that's done, an integer like 370022 has the same "dignity" as 370000. Rounding it to the closest hundred or thousand when presenting the results is just cosmetic. You'll do it regularly if, for example, you present the data as 370k, 0.37M or 0.4M and so on, but again, you won't add up these quick representations of weekly totals to calculate yearly totals. The 0.37M example gives a hint: if you write, for example, 10.1M for a lifetime total, your rounding error is at most ±0.5%, but if you decide to use just ONE decimal and write 0.1M for a weekly total that, before cosmetic rounding, was somewhere between 50000 and 150000, your rounding error can be up to 50% of the displayed figure. Even when doing cosmetic rounding, you have to consider the range of actual values before deciding what rounding is acceptable.
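
A rough Python sketch of the two effects described above. All figures are invented for illustration (a steady 10,400 units per week is not real VGC data), and the helper names are hypothetical. The relative error is measured against the displayed value, which is how the ±0.5% and 50% figures come out.

```python
def display_round(value, step):
    """Cosmetic rounding: nearest multiple of `step`."""
    return round(value / step) * step

def error_vs_displayed(value, step):
    """Rounding error as a fraction of the displayed (rounded) figure.
    Assumes the displayed figure is not zero."""
    shown = display_round(value, step)
    return abs(shown - value) / shown

# Same one-decimal-in-millions format, very different relative error.
err_lifetime = error_vs_displayed(10_149_999, 100_000)  # shown as 10.1M
err_weekly = error_vs_displayed(50_001, 100_000)        # shown as 0.1M

# Reusing rounded weekly figures versus rounding once at the end.
weekly = [10_400] * 52                                   # hypothetical year
yearly_exact = sum(weekly)
yearly_from_displayed = sum(display_round(w, 1_000) for w in weekly)
yearly_rounded_once = display_round(yearly_exact, 1_000)

print(f"{err_lifetime:.2%} vs {err_weekly:.2%}")
print(yearly_exact, yearly_from_displayed, yearly_rounded_once)
```

In this made-up case the total built from the displayed weekly figures ends up 20,800 units short, while rounding the exact total once is off by only 200.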


EDIT: I completed this post after a long pause, and in the meantime ioi wrote another comment on this issue. Read it too, as he's enormously more skilled than me in statistics and probability. (Despite not liking them very much, except for robotics, I found that I'm better at automatic controls, and, for example, I have many times been tempted to represent ioi's adjustments as a linear system with a feedback loop.)



 


MaskedBandit2 said:
"Bye then"? Ridiculous.. When did I ever say it was easy? When did I ever say anything about tracking with 100% accuracy...

Not so ridiculous. Are you a VGChartz pro member? If so, you have a valid reason to complain, because you're paying for information that is often way off the mark. If not, then you're getting it free like most of us.

How about appreciating the fact that this site exists at all, and has provided us with a good place to debate and vent our worst fanboy instincts. It doesn't matter if the vgchartz team pretends like they are the holy grail of sales data (which I'm not saying they do), you should have enough common sense to know that these numbers won't always be that accurate.

I'm sure the mods all have day-jobs and personal lives to attend to as well.



ioi said:
ninetailschris said:

I don't believe any company themselves take the sales from this site seriously.

 

I guess these guys are all stupid then:

 

http://www.vgchartz.com/pro/

Others like Pachter and 343i have represented VGC before as well.

Anyway, I'd like to thank you for answering a lot of questions in this thread, where some people have been outright disrespectful towards you. I know a few others like Kowenicki thanked you as well, but it is great seeing you reach out to the community to answer these questions.

So I will thank you on behalf of VGC! Not just for this thread, but for all the work you've put into the site over the years. Thanks :)



@brett,

I call bullshit on your first bullet point. No corporation will give inside information. That's definitely against policy. If you do have that info then someone could get fired. Now maybe you have someone like Ben leaking information, or an employee at a local retailer willing to talk about sales, but I can't see much more. As for consumer data, well... I'd like to know where that comes from. I don't see VGC being willing to pay a survey group to pay consumers to collect legit data. Especially on a global scale. Where are you being misleading? Well... (and this is my biggest gripe) how about on the front page. Anyone who visits the site isn't going to know that someone named Brett runs the site, is a user named ioi and has made numerous posts explaining his methodology (or... why else do you think you run into the same issues time and again? It's because the site is set up like a sales tracker, along with vgcpro).

 

Now, I won't argue the accuracy of the numbers, because that's your work and I'm sure you will feel like I'm attacking your work, which would piss most people off.

I just can't see the ad revenue from the site being enough to support an office and a team. But hell, if it works then good for you. So, you've befriended someone from the NPD Group then? Again, I'm calling bullshit. I'm sure NPD has policies they conduct their business under, but if you knew a worker personally then I'm sure they would gladly tell you more. Now, I'm not saying they would spill the beans and give up everything.

To be brutally honest, I'm here for the community and the sales reports that users collect and post here from the companies themselves.




Understood, all of it... but as I said (maybe it would be too troublesome and unnecessary work to be worth it), you could round but keep a history, so you can track how much you have been rounding up and down, and so keep clean data for the accumulated totals. Maybe you would increase the error on weekly figures, but we were talking about rounding inside the error margin... so rounding 10,000,100 to 10M is OK but rounding 10,100 to 10k isn't... but it doesn't really matter.
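
The "round but keep a history" idea could look something like the Python sketch below. The weekly figures and the function name `round_with_history` are made up for illustration. Each published figure is rounded, and the signed remainder is logged so the accumulated total can still be rebuilt exactly.

```python
def round_with_history(weekly, step):
    """Round each weekly figure to the nearest multiple of `step` and
    log the signed remainder (positive = rounded down)."""
    published, remainders = [], []
    for value in weekly:
        shown = round(value / step) * step
        published.append(shown)
        remainders.append(value - shown)
    return published, remainders

weekly = [10_100, 9_870, 10_440]          # hypothetical weekly estimates
published, remainders = round_with_history(weekly, 1_000)

# The accumulated total is recovered from the rounded figures plus the log.
accumulated = sum(published) + sum(remainders)
print(published, remainders, accumulated)
```

Note that storing the remainders carries the same information as storing the unrounded numbers, so this only changes what gets displayed, not what has to be kept.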

Anyway, that is of no importance, since all the "errors" (the cases where ioi is systematically off by more than 5% of the value) aren't caused by rounding or significant figures... I see the figures on the site more as ballpark figures (and I round them in my mind to get an order of magnitude), and I don't mind too much if they are 1%, 5% or 30% wrong; I can still draw useful conclusions from them...

The ones that care more about it are the warriors using it for console wars. They use small differences to declare a winner and a loser, and if ioi's margin of error inverts that glorious winning stance they get upset... so whenever the numbers are close, one side will call them overtracked or undertracked depending on its interest, without knowing the method or the usage. That doesn't seem to be ioi's concern with his precision.




Max King of the Wild said:

@brett,

I call bullshit on your first bullet point. No corporation will give inside information. That's definitely against policy. If you do have that info then someone could get fired. Now maybe you have someone like Ben leaking information, or an employee at a local retailer willing to talk about sales, but I can't see much more. As for consumer data, well... I'd like to know where that comes from. I don't see VGC being willing to pay a survey group to pay consumers to collect legit data. Where are you being misleading? Well... (and this is my biggest gripe) how about on the front page. Anyone who visits the site isn't going to know that someone named Brett runs the site, is a user named ioi and has made numerous posts explaining his methodology (or... why else do you think you run into the same issues time and again? It's because the site is set up like a sales tracker, along with vgcpro).

 

Now, I won't argue the accuracy of the numbers, because that's your work and I'm sure you will feel like I'm attacking your work, which would piss most people off.

I just can't see the ad revenue from the site being enough to support an office and a team. But hell, if it works then good for you. So, you've befriended someone from the NPD Group then? Again, I'm calling bullshit. I'm sure NPD has policies they conduct their business under, but if you knew a worker personally then I'm sure they would gladly tell you more. Now, I'm not saying they would spill the beans and give up everything.

If you say no corporation would give info to Brett because of policies (and you are mostly right), how would NPD collect the same data?? Here is a tip: they collect data from individual stores across the country (there is nothing stopping 100 different employees from different retail chains from saying how many of each item they have sold). About customer data, I would like to know too, but it certainly isn't as absurd as you say... You talk about Ben, but you have been freely tossing numbers around here for PS4 and Xbone sales.

You have already argued the accuracy and almost personally attacked the owner, who seems much more level-headed than most users, since in any other thread you would be banned or worse for this kind of stance. Maybe you saw it coming and toned it down.

If you can't see how much they earn from running VGC, then you don't even have to discuss it... And about NPD, you were the one who said that if you met some NPD worker they would happily tell all (now you add "if you know him personally", and guess what, we don't know Brett personally). He would tell only if he weren't professional... That is a matter of work ethics and privileged information, and the guy should be fired and prosecuted if he leaks the methodology to you.

The only problem I would have with VGC's numbers is cases like the PS2 numbers (which are way off from the shipment figures, but I imagine most don't care because it has been discontinued) and GT5P, because it's 25% of PD's figures (we don't know if PD stated shipped figures including digital sales as well, but I doubt it)... Surely there are other issues like that, data that has had more than 5 years of iteration and should be really close but hasn't been adjusted, but it doesn't really bother me.

@ioi, any particular reason why some of those really off numbers aren't corrected? Is it because it isn't worth it, because they're old, or because not enough people have brought it to your attention?


