KDnuggets : News : 2008 : n11 : item16 < PREVIOUS | NEXT >

Publications

From: Bruce Ratner
Date: 22 May 2008
Subject: The Correlation Coefficient: Its Values Range Between +/- 1, or Do They?

The term "correlation coefficient" was coined by Karl Pearson in 1896. The statistic is thus over a century old and still going strong. It is one of the most used statistics today, second only to the mean. The correlation coefficient's weaknesses and the warnings against its misuse are well documented. As a consulting statistician with fifteen years of practice, who also teaches continuing and professional studies to statisticians in the Database Marketing/Data Mining industry, I see too often that the weaknesses and warnings are not heeded. Among the weaknesses, there is one that is rarely mentioned: the correlation coefficient interval [-1, +1] is restricted by the individual distributions of the two variables being correlated. The purpose of this article is: 1) to introduce the effects the distributions of the two individual variables have on the correlation coefficient interval; and 2) to provide a procedure for calculating an adjusted correlation coefficient, whose realized correlation coefficient interval is often shorter than the original one.

www.geniq.net/res/correlation-coefficient-max-values-less-than-one.html

Basics of the Correlation Coefficient
The correlation coefficient, denoted by r, is a measure of the strength of the straight-line, or linear, relationship between two variables. The well-known correlation coefficient is often misused because its linearity assumption is not tested. By definition (that is, theoretically), the correlation coefficient can assume any value in the interval between -1 and +1, including the end values -1 and +1.

The following points are the accepted guidelines for interpreting the correlation coefficient:

  1. 0 indicates no linear relationship.
  2. +1 indicates a perfect positive linear relationship: as one variable increases in its values, the other variable also increases in its values via an exact linear rule.
  3. -1 indicates a perfect negative linear relationship: as one variable increases in its values, the other variable decreases in its values via an exact linear rule.
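
The guidelines above, and the claim that the interval [-1, +1] is restricted by the two individual distributions, can be illustrated with a small sketch. It is not the author's procedure, which this excerpt does not spell out. It relies only on the standard fact that, for fixed sets of values, pairing both variables in the same sorted order gives the largest attainable r, and pairing them in opposite sorted order gives the smallest. The function names `pearson_r` and `realized_bounds` and the sample data are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient r of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def realized_bounds(x, y):
    """Smallest and largest r attainable by re-pairing the given values.

    Hypothetical helper: same-order pairing maximizes r and
    opposite-order pairing minimizes it, for fixed values of x and y.
    """
    xs = sorted(x)
    r_max = pearson_r(xs, sorted(y))
    r_min = pearson_r(xs, sorted(y, reverse=True))
    return r_min, r_max


# Illustrative data only, not taken from the article.
x = [1, 2, 3, 4, 5]
y_linear = [2 * v + 1 for v in x]   # exact linear rule, so r is +1
y_skewed = [1, 1, 1, 2, 50]         # shaped very differently from x

r_linear = pearson_r(x, y_linear)
r_min, r_max = realized_bounds(x, y_skewed)
print(f"r for the exact linear rule: {r_linear:.3f}")
print(f"realized interval for x and y_skewed: [{r_min:.3f}, {r_max:.3f}]")
```

With these made-up values, no pairing of x with y_skewed reaches +1 or -1, because the two distributions have different shapes. This is the restriction on the interval that the article sets out to address.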




Copyright © 2008 KDnuggets.