> What's your understanding of "statistical significance"?


Posted at: 2015-03-12 
No links please, just your own words; in general and in respect of the global temperature trend; what's your understanding of "statistical significance"?

Global temperature trend (1) can be interpreted 2 ways:

A) I think it is probably warmer than it used to be.

B) I think that it is getting warmer.

It is not unusual for temperatures to jump around 0.75 kelvins in only 5 years, so how can we know if it has actually warmed 0.43 Kelvins over the last 34 years in the lower troposphere (2), or that it is now warming? How do we know it is not just an aberration? We do not, but we have statistical tools that help us to understand the odds of whether or not we are making a good bet: We calculate the probability that we are right (confidence level, P) or wrong (1 - P) (3).

Thus, for case 'A', where we attempt to establish that the lower troposphere has warmed over the last 34 years, we can calculate the average temperature over 4 years 34 years ago, and compare it with the average temperature over the last 4 years (1A). Since, in reality, the temperature bounced around like crazy during these sets of 4 years, http://www.woodfortrees.org/plot/gistemp...

we have to take an average, and then account for the uncertainty that the averages represent the actual temperatures of the lower troposphere at that time by taking the standard deviation of each data set. http://en.wikipedia.org/wiki/Standard_de...

We can then use the error function to figure out the probability that we are off a certain amount. http://en.wikipedia.org/wiki/File:Standa...
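To make case 'A' concrete, here is a minimal Python sketch of that comparison. The two sets of annual anomalies are made-up numbers for illustration, not real satellite data, and the function name `prob_difference_is_chance` is my own. It also assumes the yearly values are independent and roughly normal, which real temperature series only approximately satisfy, and with only 4 points per period a proper t-distribution would give a more cautious answer.

```python
import math
import statistics

def prob_difference_is_chance(early, late):
    """Two-sided probability of a gap this large between the two period
    means if both periods really had the same underlying temperature
    (normal approximation)."""
    diff = statistics.mean(late) - statistics.mean(early)
    # Standard error of each mean: sample standard deviation / sqrt(n).
    se_early = statistics.stdev(early) / math.sqrt(len(early))
    se_late = statistics.stdev(late) / math.sqrt(len(late))
    se_diff = math.sqrt(se_early**2 + se_late**2)
    sigmas = diff / se_diff
    # erfc gives the two-tailed area of a normal curve beyond |sigmas|.
    return diff, sigmas, math.erfc(abs(sigmas) / math.sqrt(2))

# Hypothetical anomalies in kelvins (NOT real measurements).
early_years = [-0.10, 0.05, -0.05, 0.10]
late_years = [0.35, 0.50, 0.40, 0.55]

diff, sigmas, p = prob_difference_is_chance(early_years, late_years)
print(f"difference = {diff:.2f} K, {sigmas:.1f} sigma, chance = {p:.2g}")
```

The more the years within each period bounce around, the larger the standard errors get and the fewer sigmas the same difference is worth.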

From there it is a philosophical argument how much doubt we should be willing to live with. Generally speaking, people have found it convenient to use standard deviations (sigmas) as follows:

1 sigma of certainty = 68.2% probability we got something, and 31.8% that it was just an accidental result. This means keep watching and collecting more data so that later we can make a more certain evaluation.

2 sigma = 95.4% correct result and 4.6% that it is an accident. This means the result is interesting, but not convincing.

3 sigma = 99.7% correct result and 0.3% that it is bogus. Most people will be reasonably convinced by that.
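These percentages come straight from the normal curve, and a quick way to check them is with the error function. A minimal sketch (the function name `coverage` is mine, and it assumes normally distributed errors):

```python
import math

def coverage(n_sigma):
    """Fraction of a normal distribution within +/- n_sigma of the mean."""
    return math.erf(n_sigma / math.sqrt(2))

for k in (1, 2, 3):
    inside = coverage(k)
    print(f"{k} sigma: {100 * inside:.2f}% inside, {100 * (1 - inside):.2f}% outside")
```

To two decimals this gives about 68.27%, 95.45% and 99.73% inside for 1, 2 and 3 sigma.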

It takes about 30 years of data collection to get to 2 standard deviations in just about anything for climatology, or just about somebody's entire career just to get something interesting, but not convincing. Thus, climatologists mean 2 sigma, or sometimes just 95% confidence (real 19 times out of 20) when they say "significant".

To put that into perspective, here is an incident that happened to me {not really} when I was an undergraduate:

* I asked my Math prof whether or not all odd numbers were prime. He looked at me like I was a moron, said "NO!" and stomped off.

* Being unsatisfied, I asked my Climatology prof. He responded by taking a sample of 10 odd numbers: {1,3,5,7,9,11,13,15,17,19}. 9 and 15 are not prime numbers, so he applied Student's t-test to see if they were significant outliers, and determined that they were. Using "sampling error" as a justification, he threw out the 2 bad data points. Then he concluded that, indeed, all odd numbers are prime.

* Being completely confused, I asked my Petroleum Engineering prof. He responded: "Let's see.... 1,3,5. Yep! they're all prime!" At last, I had an answer that I could relate to.

CAUTION: 1 is not a prime number!

Edit @Pegminer: Nuts! I can never put one past you. Okay, I made the corrections.

Edit @Lee: The "obvious" thing you are missing is:

Statistical Significance in Climatology generally means a Statistical Probability of 95% or more that the result is valid. It can be defined as more than 95% if an author wishes for discussion.

Use in context: http://news.bbc.co.uk/2/hi/8511670.stm

In that link Phil Jones is 95% confident that the rate of warming for each of the 4 periods he points out is greater than zero. (He is at least 95% confident that it actually warmed some over the period.)

Edit @Darwins: "there has been no statistically significant warming for the last 15 years!"

Here are the last 15 years: http://www.woodfortrees.org/plot/gistemp...

RSS shows Earth cooled 0.01 Kelvins since 1997.

GISS shows Earth warmed 0.14 Kelvins since 1997.

The period is not long enough to know with 95% confidence that it has warmed. This is normal for such short time periods. However, Warmists like Hansen have acknowledged that there may be periods of cooling between future warming periods. That is new. So the data has prompted Warmist leaders to be more open about some things. 30 years is the minimum period for a climate trend.

What is significant is its failure to warm as much as the IPCC computer models based on a CO2 sensitivity of 3 kelvins per doubling predicted. http://www.sciencebits.com/IPCC_nowarmin...

Statistical Significance is a mathematical term with a precise meaning. The user of the term should always specify what level they are talking about; when it is used alone, most readers will assume it means 95.45%.

It means this: if the observed outcomes are random, then the actual observed outcome would happen no more than 4.55% of the time. This is why statistical calculations come with confidence ranges: the higher the level of confidence you want, the broader that range. You can be 95% sure that reality lies in a narrow range and 99% sure that reality lies in a broader range. When you calculate a warming trend (or any linear trend in any category) to the 95% confidence level, you are saying there is a 95.45% chance that reality is in the range, a 2.275% chance reality is higher and a 2.275% chance that reality is lower.
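A small sketch of that trade-off, using Python's standard library. It assumes the estimate has normally distributed errors, and the helper name `half_width_in_sigmas` is only illustrative:

```python
from statistics import NormalDist

def half_width_in_sigmas(confidence):
    """How many standard errors either side of the estimate are needed
    to cover the given two-sided confidence level (normal errors assumed)."""
    tail = (1 - confidence) / 2          # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.95, 0.9545, 0.99):
    tail = (1 - level) / 2
    print(f"{level:.2%} confidence: +/- {half_width_in_sigmas(level):.2f} "
          f"standard errors, {tail:.3%} in each tail")
```

Asking for 99% instead of 95% widens the range from roughly 1.96 to roughly 2.58 standard errors either side of the estimate.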

This is a horrible description and probably a better example of why I am not a high school teacher. I know. But people who have taken high school statistics understand statistical significance; those who haven't, don't. (In fairness, since not everybody takes stats in high school, this might be undergraduate stats to many.) When someone says "there has been no statistically significant warming for the last 15 years!" the meaning is that they are an ignorant argumentative boob. This is not a statement that an educated person makes.

It's just like odds at the track...you can have 10 to 1 odds (and higher) or dead even, or anything in between. If you have, for example, three horses, one a 21 year old nag that ran the last race at three minutes, another horse who ran the same track in 9.7 seconds, and a third that ran it in 9.6, the odds are astronomical that the 21 year old would place, let alone win. However, depending on the course and each horse's record, the odds on the other two might be considered even: the difference between them is statistically insignificant.

With respect to the global temperature trend, statistical insignificance addresses the current short term trend, and scientific bookmakers then need to look at the statistical probabilities already established. Does this mean, for example, that the predictions, in terms of probability, need to be revised? While the recent measurements are still within the range of the probability established, it's like the horse race odds above: what if the old nag suddenly had a great race and ran in 11.7? Is this enough to make book on?

No, of course not, but it means you keep an eye on the old nag, since the odds may change in her favor if the trend continues long enough.

Statistical significance is the degree to which the scatter in the data is smaller than the trend.

The statement is true, because the scatter in the data is actually larger than the trend -- over 15 years. But over a longer term, the trend is stronger than the scatter. So I'd give the statement no weight, whatsoever.
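Here is a rough sketch of why the record length matters so much. It assumes one reading per year, a straight-line trend, and independent (white) noise from year to year; the trend of 0.015 K/yr and scatter of 0.15 K are illustrative guesses, not fitted values. Real temperature series are autocorrelated, which makes the required record longer than this suggests.

```python
import math

def trend_in_sigmas(trend_per_year, noise_sd, n_years):
    """Ratio of a linear trend to the standard error of a least-squares
    slope fitted to n_years of annual data with independent noise."""
    # For x = 0, 1, ..., n-1 the sum of squared deviations of x from its
    # mean is n * (n**2 - 1) / 12.
    sxx = n_years * (n_years**2 - 1) / 12
    slope_se = noise_sd / math.sqrt(sxx)
    return trend_per_year / slope_se

TREND = 0.015   # K per year, hypothetical
SCATTER = 0.15  # K, hypothetical year-to-year standard deviation

for years in (15, 30):
    print(f"{years} years: trend is {trend_in_sigmas(TREND, SCATTER, years):.1f} sigma")
```

With these numbers, 15 years falls short of 2 sigma while 30 years clears it comfortably, even though the underlying trend and scatter are exactly the same in both cases.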

Statistical significance (aka statistical test of significance) is a measure of whether some set(s) of observed data are "unusual" under some hypothesized condition. P-values (aka observed significance levels) are measures of "how unusual" the observed data are - assuming the null hypothesis is true. By tradition / convention / whatever, p-value cutoffs of 0.05 or 0.01 are most common (however, this can be something of a judgment call depending on what you are doing and what you are looking for).

**[Strictly speaking, p-values tell us about the "unusual-ness" of the test statistic and not the actual data - but I can feel this explanation slipping away and there is no point quibbling over technical Bullshlt.]
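A toy example may help pin this down. Suppose the null hypothesis is "the coin is fair" and the test statistic is the number of heads in 10 flips. The sketch below (the function name is my own) computes how often a fair coin would give a result at least as lopsided as the one observed:

```python
from math import comb

def two_sided_p_value(heads, flips):
    """Exact probability, under a fair coin, of a head count at least as
    far from flips/2 as the one observed."""
    observed_gap = abs(heads - flips / 2)
    extreme = sum(comb(flips, k) for k in range(flips + 1)
                  if abs(k - flips / 2) >= observed_gap)
    return extreme / 2**flips

print(two_sided_p_value(9, 10))  # 22/1024, about 0.021
print(two_sided_p_value(6, 10))  # 772/1024, about 0.754
```

At the usual 0.05 cutoff, 9 heads out of 10 would count as "unusual" for a fair coin, while 6 heads out of 10 would not.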

It is important to remember that statistical significance only gives us an idea of how unusual something is - and "statistical significance" is (most often) not exactly the same as "practical significance" - "operational significance" - actual significance, etc. There are a lot of things going on here, and just because statistical tests generate "precise" answers to some number of decimal places, that does not mean that they are perfectly exact or accurate answers.

For this reason, there are a lot of situations where confidence intervals are more accurate and more practically useful.

-------

edit --

>>How much weight would you give to the statement "there has been no statistically significant warming for the last 15 years!"<<

It is mathematically and statistically weak, and otherwise completely irrelevant. The entire basis for the Denier claim (yet another that they are too stupid to understand) consists of unfortunate casual statements made by honest scientists who overestimated the integrity of AGW-Deniers while underestimating their stupidity.

I'm on record as being of the opinion that it was/is a mistake for any climate scientist to comment on the statistical properties of climate data over short time periods because (1) the words scientists use in describing the tests have different meanings among the non-technical public and (2) it gives Deniers the chance to bltch and whine about it because - even if they were not math-morons - they are pathological liars who are not only uninterested in the science and the truth - but actively oppose them.

=====

Sagebrush ---

>>Regarding the follow up question: I think James Hansen said it best, "Skeptics will be all over us – the world is really cooling, the models are no good.”<<

James Hansen never said that. In fact, those statements do not represent the personal opinions of the person who DID write the email, Mike MacCracken.

The quote is from an email written by Mike MacCracken to Phil Jones and Chris Folland.

http://foia2011.org/index.php?id=502

Here is the relevant part:

> > …just so you might have a quantified explanation in case the prediction is wrong. Otherwise,

> > the Skeptics will be all over us--the world is really cooling, the models are no good, etc.

> > And all this just as the US is about ready to get serious on the issue.

> > >

> > > We all, and you all in particular, need to be prepared.

> > >

> > > Best, Mike MacCracken

As any fool can see, Mike MacCracken is not stating that he personally believes that “…the world is really cooling, the models are no good, etc.” (also NOTE the “etc.”) ---- he is warning Phil and Chris about the kinds of falsehoods that stupid, lying Deniers might say.

So, let’s review your answer:

Your answer is based on information in the public domain that can be obtained in less than 10 minutes by anyone with minimal education who is functionally literate and has computer access to the internet - and, yet, (1) you quote a quote that is not really a quote; (2) attribute the non-quote to a person who had no part at all in the discussion where the non-quote occurred; (3) misrepresented the non-quote because you never actually read it - and never cared whether it was authentic or true; and (4) furthermore, you presented yourself as an informed adult who had honestly researched the information and was truthfully and faithfully providing it as educational and informational resource in a public forum for the benefit of all interested stakeholders and members of the general public.

In summation, therefore, I have demonstrated “beyond a reasonable doubt” (as that concept is legally defined and applied by every city, county, municipality, state, and federal court in the US judicial system) that your answer is – in its every word, structure, and statement – by intent and execution – a complete and deliberate lie containing not a single word of truth.

In terms of trends, it is a measure of whether the data supports a genuine trend, or whether there is a significant chance that the apparent trend is due to random noise.

Denialists love to talk about the warming since 1995, 1996, 1997 or 1998 not being statistically significant, as if something happened in one of those years to make it statistically significant. But it is just a matter of not having enough data in these time frames. It would make sense to conclude that the data does not disprove the null hypothesis that Earth has not warmed since 1996 if 1996 were the year the thermometer was invented. But the lack of statistical significance of a trend that starts in a year cherry picked by denialists is not important, since we have instrumental data from long before any of these cherry picked years.

My most general understanding is that it is a measure of support for a particular hypothesis, or of grounds for rejecting a null hypothesis. I also consider it to be a subjective assessment that is open to discussion and critique.

As an example, and I do need a link here, Santer et al. (2011), "Separating signal and noise in atmospheric temperature changes: The importance of timescale", state:

"A single decade of observational TLT data is therefore inadequate for identifying a slowly evolving anthropogenic warming signal. Our results show that temperature records of at least 17 years in length are required for identifying human effects on global-mean tropospheric temperature." http://onlinelibrary.wiley.com/doi/10.10...

That is an example of a subjective assessment: statistical significance is used to draw conclusions from a temperature trend by settling on a minimum record length as the determining factor.

So a statement like "there has been no statistically significant warming for xx years" is a subjective statement, and conclusions drawn from that statement are also subjective and open to debate and discussion.

It refers to the probability of getting a particular result by random variation, rather than a true relationship. When someone says that it's significant at the 95% level, then there would be less than a 5% chance of seeing that result from random variation alone. The higher the significance level, the smaller the chance of seeing that result occur randomly. If someone does not give a number, just saying something is or is not statistically significant, they are usually referring to the 95% level, although that's being a bit sloppy and that may vary from one field to another.

Note that these estimates always depend on a knowledge of the population that they're drawn from (such as its being a normal distribution); if that assumption is wrong, the estimate of statistical significance is wrong.

EDIT for NW Jack: (1) Your anecdote, while amusing, is clearly fiction--you should not say that it actually happened, (2) the climatology and petroleum engineering professors would have known that 1 is not a prime number.

Another EDIT: Thanks NW Jack! Actually, I've heard a related joke about a geologist, an engineering geologist, and a geophysicist, where they are each asked what 2 x 2 is.

My off-the-cuff response is that "statistical significance" generally means that one thing (or action or phenomenon) is clearly important in causing some other thing. Usually, but I think not necessarily always, "statistically significant" implies a demonstrated or otherwise proven high probability of the causal relationship being actually true rather than coincidental or the result of random "noise" (fluctuations).

The "noise" element clearly applies heavily in the case of global temperature trend, because the variation in just one location between night and day is already vastly greater than the change of the global average from one century to the next. Seasonal, cyclical and geographic variation add tremendous additional "noise."

Edit - The Sagebrush revision of the old adage reads:

Lies

Damn Lies

Statistics

Stupid lies by people too stupid to understand what they are lying about

Not necessarily anything important. I don't believe statistical significance has sides, even though this forum does; it is a stand-alone concept.

"What does "statistical significance" really mean?

Many researchers get very excited when they have discovered a "statistically significant" finding, without really understanding what it means. When a statistic is significant, it simply means that you are very sure that the statistic is reliable. It doesn't mean the finding is important or that it has any decision-making utility."

No links please, just your own words; in general and in respect of the global temperature trend; what's your understanding of "statistical significance"?

Enough empirical data to adequately predict a trend or a pattern.

Regarding the follow up question: I think James Hansen said it best, "Skeptics will be all over us – the world is really cooling, the models are no good.”

I have no understanding of it. I do think we are doomed doomed.