The Correlation Coefficient: Its Values Range Between +/- 1, or Do They? (KDnuggets News 08:11, item 16, Publications)

Publications

From: Bruce Ratner
Date: 22 May 2008
Subject: The Correlation Coefficient: Its Values Range Between +/- 1, or Do They?

The term "correlation coefficient" was coined by Karl Pearson in 1896. This statistic is therefore over a century old, and it is still going strong. It is one of the most used statistics today, second only to the mean. The correlation coefficient's weaknesses, and the warnings against its misuse, are well documented. As a consulting statistician with fifteen years of practice, who also teaches continuing and professional studies to statisticians in the Database Marketing/Data Mining industry, I see too often that these weaknesses and warnings are not heeded. Among the weaknesses, there is one that is rarely mentioned: the correlation coefficient interval [-1, +1] is restricted by the individual distributions of the two variables being correlated. The purpose of this article is: 1) to introduce the effects that the distributions of the two individual variables have on the correlation coefficient interval; and 2) to provide a procedure for calculating an adjusted correlation coefficient, whose realized correlation coefficient interval is often shorter than the original one.

www.geniq.net/res/correlation-coefficient-max-values-less-than-one.html
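
To make the restriction concrete, here is a minimal Python sketch. It is not taken from the article: the function names pearson_r and realized_interval and the simulated data are illustrative assumptions. It relies on a standard fact: for a fixed set of x values and a fixed set of y values, the largest attainable r comes from pairing both sets in ascending order, and the smallest from pairing one ascending with the other descending. This is only an illustration of the restricted interval and may differ from the adjustment procedure described in the full article.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def realized_interval(x, y):
    """Smallest and largest r attainable by re-pairing the observed values."""
    xs, ys = sorted(x), sorted(y)
    r_min = pearson_r(xs, ys[::-1])  # ascending paired with descending
    r_max = pearson_r(xs, ys)        # ascending paired with ascending
    return r_min, r_max


# Assumed example data: one symmetric variable, one right-skewed variable.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(1000)]
y = [random.expovariate(1.0) for _ in range(1000)]

r_min, r_max = realized_interval(x, y)
print(f"realized interval: [{r_min:.3f}, {r_max:.3f}]")
```

Because the two simulated variables have different distribution shapes, no re-pairing of these values can reach either -1 or +1, so both endpoints fall strictly inside the theoretical interval.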

Basics of the Correlation Coefficient
The correlation coefficient, denoted by r, is a measure of the strength of the straight-line, or linear, relationship between two variables. The well-known correlation coefficient is often misused because its linearity assumption is not tested. By definition (that is, theoretically), the correlation coefficient can assume any value in the interval from -1 to +1, including the end values -1 and +1.

The following points are the accepted guidelines for interpreting the correlation coefficient:

0 indicates no linear relationship.
+1 indicates a perfect positive linear relationship: as one variable increases in its values, the other variable also increases in its values via an exact linear rule.
-1 indicates a perfect negative linear relationship: as one variable increases in its values, the other variable decreases in its values via an exact linear rule.
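
The three guidelines above can be checked with a small Python sketch. The helper name pearson_r and the toy data are assumptions chosen for illustration, not part of the original article. The last case also shows why the linearity assumption matters: y is completely determined by x, yet r is 0 because the rule is not linear.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]

# Exact increasing linear rule y = 2x + 3: perfect positive relationship.
r_pos = pearson_r(x, [2 * v + 3 for v in x])

# Exact decreasing linear rule y = -2x + 3: perfect negative relationship.
r_neg = pearson_r(x, [-2 * v + 3 for v in x])

# Exact but nonlinear rule y = x^2 on values symmetric about 0:
# no linear relationship, even though y depends entirely on x.
x_sym = [-2, -1, 0, 1, 2]
r_curve = pearson_r(x_sym, [v * v for v in x_sym])

print(r_pos, r_neg, r_curve)
```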