# Postmodern Conservative Archive

Friday, January 16, 2009, 2:35 AM

Over at Upturned Earth, John Schwenkler has asked for an eighth-grade-level refresher course on what an r-squared value means.

Given how prone to misinterpretation the correlation coefficient is, it’s a little bit easier to talk about what it doesn’t mean. This is also a useful exercise in understanding the limits of science:

1) It doesn’t mean the probability that something causes something else.

Correlation does NOT imply causation.

2) It doesn’t even mean the probability that something and something else are correlated.

Things can be correlated in all kinds of ways. The r-squared value only measures (in a weird way that we’ll discuss soon) the probability that two things are linearly correlated. Once upon a time, physicists wrought havoc upon the sciences by writing papers claiming all kinds of correlations that didn’t actually exist. It’s rather easy to ascribe correlations to things that are not, in fact, correlated. Don’t succumb to that temptation.
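To make the "linear only" caveat concrete, here is a minimal sketch in plain Python. The helper `pearson_r` and the toy data are made up for illustration; nothing here comes from Schwenkler's original question. The point is that y can be completely determined by x while the linear correlation, and hence r-squared, comes out as zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical toy data: y depends on x exactly, but not along a straight line.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
r_squared = r ** 2
print(f"r = {r:.3f}, r-squared = {r_squared:.3f}")
```

A symmetric parabola is the friendliest possible case for this demonstration; lopsided data would give a nonzero r-squared that still badly understates how tightly y tracks x.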

3) It doesn’t even mean the probability that something and something else are linearly correlated.

Statistics can’t actually tell you the probability of something being the case without additional assumptions. The oft-abused p-values are not, as most people interpret them, equivalent to one minus the probability that a given relationship exists. Rather, a p-value is the probability that, assuming nothing but chance is at work, a result at least as extreme as the one observed would turn up. This common misconception naturally extends to r-squared numbers: just consider Anscombe’s quartet, four data sets that look nothing alike when plotted yet share nearly the same r-squared.
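For anyone who wants to check the Anscombe claim rather than take it on faith, here is a rough sketch in plain Python. The numbers are the standard published values of the quartet; the helper `pearson_r` and the variable names are my own choices for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Anscombe's quartet: the first three sets share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# r-squared for each set; the four values come out nearly identical.
r_squared = {name: pearson_r(xs, ys) ** 2 for name, (xs, ys) in quartet.items()}
for name, value in r_squared.items():
    print(f"Set {name}: r-squared = {value:.3f}")
```

Plot the four sets and you get a noisy line, a clean curve, a tight line with one outlier, and a vertical stack with one stray point. The summary number cannot tell them apart.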

So what on earth does it mean?

In as few words as possible, the r-squared value represents the fraction of the variability in a data set that can be accounted for by the statistical model (in a drearily frequentist way). As for what that actually means, statisticians aren’t really able to come to any agreement. Welcome to the wonderful world of Damned Lies.
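Here is one way to read "fraction of the variability accounted for", sketched in plain Python under the assumption that the model is an ordinary least-squares straight line. The function names and the five data points are hypothetical, chosen only to illustrate the arithmetic: r-squared is one minus the ratio of the leftover (residual) variation to the total variation around the mean.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys):
    """Fraction of the variation in ys accounted for by the fitted line."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    # Total variation of y around its own mean.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    # Variation left over after the line has had its say.
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_residual / ss_total

# Hypothetical data lying close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r2 = r_squared(xs, ys)
print(f"r-squared = {r2:.4f}")
```

For a straight-line fit this quantity equals the square of the correlation coefficient, which is where the name comes from.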

January 16th, 2009 | 2:55 am

Correlation does NOT imply causation

…right, but causation does imply correlation.

January 17th, 2009 | 3:43 pm

Correct.
The classic example in undergrad stat courses is the Philippine Toaster Test.
In the 1950s, the Philippine government wanted a means of birth control to reduce family size. The strongest negative correlate of family size was the number of electrical appliances the family owned.
So if correlation implied causation, birth control in the Philippines could be implemented by handing out toasters.
It is the old hidden variable problem, like IQ, religious belief, and conservatism.
;)

January 18th, 2009 | 3:14 am

…right, but causation does imply correlation.

…and thus lack of correlation implies lack of causation! (Probably.)

January 18th, 2009 | 3:19 am

Bayesian reasoning is the Devil’s logic. [ref.]