| freedquaker said: You have given the perfect example of why market share percentages need to be used instead of sheer numbers. As you have suggested, two rows of numbers, both increasing, might be negatively correlated in terms of market shares; this is exactly what we are looking for. First of all, regression is not merely meant to measure raw numbers. In my lifetime, both as a student and as a teacher, I applied and was told to apply a lot of regression analysis like this, based on ratios. Secondly, using the raw sales numbers here will yield completely disastrous results, since a lot of external factors will compromise the net effect. For example, during an economic recession, all variables might shrink, leading us to believe they are all positively related. Or in an economic expansion, or in holiday seasons, sales will all go up in the same direction, leading us to believe they are positively related. Likewise, if you try to apply the correlation analysis to the raw numbers in pairs, you will get all positive numbers, meaning they are complementary goods, that one increases the sales of the other, and that there is no competition between them, which is NONSENSE and completely misleading. Of course we know that, because none of them are actually "complementary goods", so they've got to have zero or negative correlation; if not, there is something wrong with your data or assumptions. Finally, suppose that wii had a market share of around 50% before, with 360 at 28% and PS3 at 22% (just made-up numbers). If a PS3 price drop has caused the wii share to decrease to 35%, with PS3 at 40% and 360 at 25%, you can easily argue that PS3 has a much greater effect on wii than on 360, as shown very recently. For those who want to check the data and the procedure, here is the excel : http://rapidshare.com/files/288962580/console_correlation.xls |
Again: take 100 triads of totally random numbers generated by a computer. Calculate the "market shares" from them, then calculate the correlations between the share columns: the results will be negative.
This does not prove that the raw numbers represent competing goods: the numbers were random, and thus independent by construction. And quite obviously, if you compute the correlations between the random columns themselves, you'll get something very close to zero.
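The experiment described above can be sketched in a few lines of Python. This is only an illustration under assumptions that are mine, not part of the original argument: the raw numbers are drawn uniformly between 1 and 100, and the names `pearson` and `simulate` are made up for this example.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate(n_rows=100, seed=0):
    """Draw independent random triads, convert each row to shares, and
    return the pairwise correlations of the raw and of the share columns."""
    rng = random.Random(seed)
    # Assumption: uniform draws in [1, 100]; any independent positive
    # numbers would serve the same purpose.
    raw = [[rng.uniform(1.0, 100.0) for _ in range(3)] for _ in range(n_rows)]
    # "Market share" of each column within its row: x / (x + y + z).
    shares = [[value / sum(row) for value in row] for row in raw]

    raw_cols = list(zip(*raw))
    share_cols = list(zip(*shares))
    pairs = [(0, 1), (0, 2), (1, 2)]
    raw_corr = {p: pearson(raw_cols[p[0]], raw_cols[p[1]]) for p in pairs}
    share_corr = {p: pearson(share_cols[p[0]], share_cols[p[1]]) for p in pairs}
    return raw_corr, share_corr


if __name__ == "__main__":
    raw_corr, share_corr = simulate()
    for pair in raw_corr:
        print(pair, "raw: %+.2f" % raw_corr[pair], "shares: %+.2f" % share_corr[pair])
```

The raw columns should come out close to uncorrelated, while every pair of share columns should come out clearly negative, even though nothing in the generated data "competes" with anything else.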
Frankly, I would have to dust off my statistics books, but you can probably calculate on paper how much correlation you're introducing by using x/(x+y+z) instead of x.
I would expect shares to be viable if you had many more than 3 consoles, so that the negative correlation each console induces in the others via the normalizing denominator is much smaller than the "real" correlation between the raw data.
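For the simplest case, the on-paper calculation can be sketched as follows. The assumption here (mine, for illustration) is that the k raw columns are independent, positive, and identically distributed, so that all shares have the same variance and every pair of shares has the same covariance; real console sales will not satisfy this exactly.

```latex
% Shares: s_i = x_i / (x_1 + ... + x_k), so they always sum to 1.
\[
\sum_{i=1}^{k} s_i = 1
\quad\Longrightarrow\quad
0 = \operatorname{Var}\Big(\sum_{i=1}^{k} s_i\Big)
  = \sum_{i=1}^{k} \operatorname{Var}(s_i)
  + \sum_{i \neq j} \operatorname{Cov}(s_i, s_j).
\]
% With a common variance \sigma^2 and a common covariance c:
\[
0 = k\,\sigma^2 + k(k-1)\,c
\quad\Longrightarrow\quad
c = -\frac{\sigma^2}{k-1},
\qquad
\rho = \frac{c}{\sigma^2} = -\frac{1}{k-1}.
\]
% k = 3 gives \rho = -1/2; k = 10 gives \rho = -1/9 \approx -0.11.
```

So under these assumptions, three independent columns produce share correlations of about -0.5 from the normalization alone, and the induced correlation shrinks toward zero as the number of columns grows.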







