http://www.technologyreview.com/Energy/13830/
In the scientific and political debate over global warming, the latest wrong piece may be the hockey stick, the famous plot (shown below) published by University of Massachusetts geoscientist Michael Mann and colleagues. This plot purports to show that we are now experiencing the warmest climate in a millennium, and that the earth, after remaining cool for centuries during the medieval era, suddenly began to heat up about 100 years ago, just at the time that the burning of coal and oil led to an increase in atmospheric levels of carbon dioxide.
But now a shock: Canadian scientists Stephen McIntyre and Ross McKitrick have uncovered a fundamental mathematical flaw in the computer program that was used to produce the hockey stick. In his original publications of the stick, Mann purported to use a standard method known as principal component analysis, or PCA, to find the dominant features in a set of more than 70 different climate records.
But it wasn't so. McIntyre and McKitrick obtained part of the program that Mann used, and they found serious problems. Not only does the program not do conventional PCA, but it handles data normalization in a way that can only be described as mistaken.
Now comes the real shocker. This improper normalization procedure tends to emphasize any data that do have the hockey stick shape, and to suppress all data that do not. To demonstrate this effect, McIntyre and McKitrick created some meaningless test data that had, on average, no trends. This method of generating random data is called Monte Carlo analysis, after the famous casino, and it is widely used in statistical analysis to test procedures. When McIntyre and McKitrick fed these random data into the Mann procedure, out popped a hockey stick shape!
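For readers who want to see what such test data might look like, here is a minimal sketch of the Monte Carlo step only. It is not McIntyre and McKitrick's code: the choice of persistent "red noise" (an AR(1) process), the persistence value PHI, the seed, and every function name are assumptions made for illustration. It builds 70 random series covering 1400-1980 and checks that their average fitted trend is close to zero.

```python
import random

START_YEAR, END_YEAR = 1400, 1980   # interval quoted in the article
N_SERIES = 70                       # roughly the number of records mentioned
PHI = 0.9                           # assumed year-to-year persistence (not from the article)


def red_noise(n_years, phi, rng):
    """One AR(1) series: each value is phi * previous value + fresh Gaussian noise."""
    series = [rng.gauss(0.0, 1.0)]
    for _ in range(n_years - 1):
        series.append(phi * series[-1] + rng.gauss(0.0, 1.0))
    return series


def slope(series):
    """Least-squares slope of a series against its index (units per year)."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(series))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def make_test_data(seed=0):
    """Generate N_SERIES independent trendless series, reproducibly."""
    rng = random.Random(seed)
    n_years = END_YEAR - START_YEAR + 1
    return [red_noise(n_years, PHI, rng) for _ in range(N_SERIES)]


data = make_test_data()
mean_slope = sum(slope(s) for s in data) / len(data)
print(f"{len(data)} series of {len(data[0])} years, mean slope {mean_slope:+.5f} per year")
```

Individual series wander up and down, but nothing in the generator prefers a rise at the end; any hockey stick that emerges from such data has to come from the procedure applied to it.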
...
--
...In PCA and similar techniques, each of the (in this case, typically 70) different data sets has its average subtracted (so it has a mean of zero), and is then multiplied by a number that makes its average variation around that mean equal to one; in technical jargon, we say that each data set is normalized to zero mean and unit variance. In standard PCA, each data set is normalized over its complete data period; for key climate data sets that Mann used to create his hockey stick graph, this was the interval 1400-1980. But the computer program Mann used did not do that. Instead, it forced each data set to have zero mean for the time period 1902-1980, and to match the historical records for this interval. This is the time when the historical temperature is well known, so this procedure does guarantee the most accurate temperature scale. But it completely screws up PCA. PCA is mostly concerned with the data sets that have high variance, and the Mann normalization procedure tends to give very high variance to any data set with a hockey stick shape. (Such data sets have zero mean only over the 1902-1980 period, not over the longer 1400-1980 period, so their values before 1902 sit far from zero and register as extra variance.)
The net result: the principal component will have a hockey stick shape even if most of the data do not.
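The variance effect described above can be checked with a few lines of arithmetic. The sketch below is a simplified stand-in, not Mann's program: it assumes the nonstandard step amounts to centering and scaling each series by its 1902-1980 statistics, and the two made-up series (a trendless wiggle, and the same wiggle with a 20th-century rise), the 0.05-per-year rise, and all function names are hypothetical. It compares how much spread around zero each series shows under the two normalizations, which is the quantity PCA weighs.

```python
from statistics import mean, pstdev

YEARS = list(range(1400, 1981))   # full interval from the article
CAL_START = 1902                  # start of the 1902-1980 sub-period


def standard_normalize(series):
    """Zero mean and unit variance over the complete period."""
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series]


def short_centered_normalize(series, years):
    """Zero mean and unit variance over 1902-1980 only (assumed simplification)."""
    recent = [x for x, y in zip(series, years) if y >= CAL_START]
    m, s = mean(recent), pstdev(recent)
    return [(x - m) / s for x in series]


def mean_square(series):
    """Average squared distance from zero: the spread PCA sees after normalization."""
    return sum(x * x for x in series) / len(series)


# Made-up inputs: a trendless wiggle, and the same wiggle plus a rise after 1902.
wiggle = [(-1.0) ** i for i in range(len(YEARS))]
hockey = [w + 0.05 * max(0, y - CAL_START) for w, y in zip(wiggle, YEARS)]

for name, series in (("wiggle", wiggle), ("hockey stick", hockey)):
    full = mean_square(standard_normalize(series))
    short = mean_square(short_centered_normalize(series, YEARS))
    print(f"{name:>12}: standard {full:.2f}, 1902-1980 centering {short:.2f}")
```

Under standard normalization both series end up with the same spread by construction. Under the 1902-1980 centering, the trendless series is barely affected, while the hockey-stick-shaped one is pushed well away from zero for the five centuries before 1902, so it carries more weight in any variance-seeking analysis.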