Big Data makes it all too easy to find spurious "patterns" in data. A new approach helps avoid overfitting by using two key ideas: validation should reveal as little information as possible about the holdout data, and a small amount of noise should be added to any validation result.
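As a rough illustration of how these two ideas might fit together, here is a minimal sketch, not the published algorithm. The function name `noisy_holdout_check`, the `threshold` and `noise_scale` defaults, and the choice of Gaussian noise are all assumptions made for this example.

```python
import random

def noisy_holdout_check(train_score, holdout_score,
                        threshold=0.04, noise_scale=0.01, rng=random):
    """Return a validation estimate that leaks little about the holdout set.

    Hypothetical helper: names, defaults, and Gaussian noise are
    illustrative assumptions, not taken from the original approach.
    """
    gap = abs(train_score - holdout_score)
    # Idea 1: if training and holdout scores agree (up to a noisy
    # threshold), answer with the training score alone, so the reply
    # carries no new information about the holdout data.
    if gap <= threshold + rng.gauss(0.0, noise_scale):
        return train_score
    # Idea 2: otherwise report the holdout score, but only after
    # adding a small amount of noise to it.
    return holdout_score + rng.gauss(0.0, noise_scale)

# Usage: a large gap between the two scores suggests overfitting.
estimate = noisy_holdout_check(train_score=0.95, holdout_score=0.70)
print(round(estimate, 2))
```

Because the answer is either the training score or a perturbed holdout score, an analyst who queries the holdout set repeatedly learns less about it than they would from exact scores.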
In analytics, it is common practice to start by understanding the basic statistical properties of a dataset's variables, viz. range, mean, and deviation. Centrality measures are the most important of these properties; explore how to use them.
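A minimal sketch of computing these properties with Python's standard `statistics` module follows. The sample values and the `summarize` helper are hypothetical, chosen only to show how the measures behave when a variable contains an outlier.

```python
import statistics

def summarize(values):
    """Return the range, centrality measures, and deviation of a variable."""
    return {
        "range": max(values) - min(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

# Hypothetical sample with one large outlier (95).
values = [12, 15, 15, 18, 22, 30, 95]
summary = summarize(values)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this sample the single outlier pulls the mean well above the median, which is one reason to look at more than one centrality measure before choosing which to report.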