LIONbook Chapter 15: Dimensionality reduction
The LIONbook on machine learning and optimization, written by the co-founders of LionSolver software, is provided free for personal and non-profit use. Chapter 15 looks at dimensionality reduction by linear transformations (projections).
Here is the latest chapter from LIONbook, a new book dedicated to the "LION" combination of Machine Learning and Intelligent Optimization, written by the developers of LionSolver software, Roberto Battiti and Mauro Brunato.
This book is freely available on the web.
Here are the previous chapters:
- Chapters 1-2: Introduction and nearest neighbors
- Chapter 3: Learning requires a method
- Chapter 4: Linear models
- Chapter 5: Mastering generalized linear least-squares
- Chapter 6: Rules, decision trees, and forests
- Chapter 7: Ranking and selecting features
- Chapter 8: Specific nonlinear models
- Chapter 9: Neural networks, shallow and deep
- Chapter 10: Statistical Learning Theory and Support Vector Machines (SVM)
- Chapter 11: Democracy in machine learning: how to combine different methods
- Chapter 12: Top-down clustering: K-means
- Chapter 13: Bottom-up (agglomerative) clustering
- Chapter 14: Self-organizing maps
You can also download the entire book here.
The latest chapter is Chapter 15: Dimensionality reduction.
In exploratory data analysis, one relies on the unsupervised learning capabilities of the human brain to identify interesting patterns and relationships in the data. It is often useful to map entities to two dimensions so that they can be inspected visually.
The mapping has to preserve, as much as possible, the relevant information present in the original data, in particular the similarities and differences between entities.
For example, think about a marketing manager analyzing similarities and differences between customers, so that different campaigns can be tuned to the different groups, or think about the head of a human resources department who aims to classify the competencies possessed by different employees. We would like to organize entities in two dimensions so that similar objects are near each other and dissimilar objects are far from each other.
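To make the idea of mapping entities to two dimensions concrete, here is a minimal sketch of one possible linear projection: principal component analysis computed with NumPy's SVD. The choice of PCA, the function name `project_to_2d`, and the synthetic "customer" data are assumptions made for illustration only; they are not taken from the chapter, which may develop the topic differently.

```python
import numpy as np

def project_to_2d(X):
    """Project the rows of X onto the two directions of largest variance."""
    X = np.asarray(X, dtype=float)
    # Center the data so the projection captures variance, not the mean offset.
    Xc = X - X.mean(axis=0)
    # Rows of Vt are orthonormal directions, sorted by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Coordinates of each entity along the first two directions.
    return Xc @ Vt[:2].T

# Hypothetical data: two groups of 20 customers, each described by 5 features.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.5, size=(20, 5))
group_b = rng.normal(loc=4.0, scale=0.5, size=(20, 5))
customers = np.vstack([group_a, group_b])

coords = project_to_2d(customers)  # one (x, y) pair per customer
```

In this sketch, customers from the same synthetic group should land near each other in the plane, while the two groups should end up far apart along the first coordinate. The resulting `coords` array could then be passed to any scatter-plot routine for visual inspection.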