LIONbook Chapter 13: Bottom-up (agglomerative) clustering
The LIONbook on machine learning and optimization, written by the co-founders of LionSolver software, is provided free for personal and non-profit use. Chapter 13 looks at bottom-up (agglomerative) clustering.
Here is the latest chapter from LIONbook, a new book dedicated to the "LION" combination of Machine Learning and Intelligent Optimization, written by the developers of LionSolver software, Roberto Battiti and Mauro Brunato.
This book is freely available on the web.
Here are the previous chapters:
- Chapters 1-2: Introduction and nearest neighbors
- Chapter 3: Learning requires a method
- Chapter 4: Linear models
- Chapter 5: Mastering generalized linear least-squares
- Chapter 6: Rules, decision trees, and forests
- Chapter 7: Ranking and selecting features
- Chapter 8: Specific nonlinear models
- Chapter 9: Neural networks, shallow and deep
- Chapter 10: Statistical Learning Theory and Support Vector Machines (SVM)
- Chapter 11: Democracy in machine learning: how to combine different methods
- Chapter 12: Top-down clustering: K-means
You can also download the entire book here.
The latest chapter is Chapter 13: Bottom-up (agglomerative) clustering.
Clustering methods generally require some parameters to be set up front, such as the number of clusters in K-means (see Chapter 12). One way to avoid fixing the number of clusters at the outset is to build progressively larger clusters in a hierarchical manner, leaving the choice of the most appropriate number and size of clusters to a subsequent analysis phase. This approach is called bottom-up, or agglomerative, clustering.
Hierarchical algorithms find successive clusters by building on previously established ones: they begin with each element as a separate cluster and merge clusters into successively larger ones. At each step, the two most similar clusters are merged.
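The merging procedure can be illustrated with a minimal Python sketch. It is not the book's implementation: the function names (`single_linkage`, `agglomerate`, `cut`), the one-dimensional toy data, and the choice of single linkage (distance between the closest members) as the similarity measure are all assumptions made for illustration. The sketch keeps every intermediate clustering, so the number of clusters can be chosen afterwards.

```python
from itertools import combinations


def single_linkage(a, b, points):
    """Cluster distance: smallest pairwise distance (an assumed choice of measure)."""
    return min(abs(points[i] - points[j]) for i in a for j in b)


def agglomerate(points):
    """Return every level of the hierarchy, from n singleton clusters down to one."""
    # Start with each element (stored by index) as a separate cluster.
    clusters = [(i,) for i in range(len(points))]
    history = [list(clusters)]
    while len(clusters) > 1:
        # Find the two most similar (closest) clusters.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: single_linkage(pair[0], pair[1], points))
        # Replace them with their union.
        merged = tuple(sorted(a + b))
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
        history.append(list(clusters))
    return history


def cut(history, k):
    """Pick the level of the hierarchy that has exactly k clusters."""
    return history[len(history) - k]


# Hypothetical toy data: five points on a line.
points = [1.0, 1.2, 5.0, 5.1, 9.0]
history = agglomerate(points)
for k in (3, 2):
    print(k, sorted(cut(history, k)))
```

Because the whole merge history is kept, asking for three clusters or two clusters is just a lookup and does not require rerunning the algorithm. This naive version recomputes all pairwise cluster distances at every step, so it is only suitable for small examples.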