World Cup 2022: Clustering Explained Step By Step

10 min read · Jan 4, 2023

Clustering is a method that consists of grouping data points (clients, texts, images…) based on their similarities. It is an unsupervised machine learning problem: the goal is to find similar structure in a dataset that has no target values (no labels).
Clusters are groups of similar elements that differ from the elements in other clusters.
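
To make the idea concrete, here is a minimal sketch of grouping unlabeled points by similarity, written in plain Python. It uses a small K-means loop; the function name `kmeans`, the toy `points`, and the farthest-point initialization are assumptions made for this illustration, not something taken from a particular library.

```python
import math

def kmeans(points, k, n_iter=10):
    """Assign each point to one of k clusters; returns one cluster index per point."""
    # Deterministic initialization (assumption for reproducibility): start from the
    # first point, then repeatedly pick the point farthest from the chosen centroids.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels

# Six unlabeled 2-D points forming two obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels = kmeans(points, k=2)
print(labels)  # → [0, 0, 0, 1, 1, 1]
```

No labels were provided: the two groups emerge only from the distances between points.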

The benefits of clustering are many and vary depending on the field:

Client clustering: optimize and adapt strategy based on client behavior
Increase company productivity: deal with groups of clients rather than individual clients (reduces workload)

Clustering Types

Hierarchical Clustering (e.g., CAH, the French acronym for agglomerative hierarchical clustering)
Centroid-based Clustering (e.g., K-means)
Density-based Clustering (e.g., DBSCAN)
Distribution-based Clustering (e.g., DBCLASD)
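
As a rough illustration of the four families above, the sketch below runs one representative of each on the same toy data. It assumes scikit-learn and NumPy are installed. DBCLASD is not available in scikit-learn, so `GaussianMixture` is swapped in as a stand-in for the distribution-based family. The synthetic blobs and every parameter value (`eps`, `min_samples`, seeds) are assumptions chosen for this example.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, DBSCAN, KMeans
from sklearn.mixture import GaussianMixture

# Toy data (assumption): two well-separated 2-D blobs of 50 points each.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=(5.0, 5.0), scale=0.3, size=(50, 2))
X = np.vstack([blob_a, blob_b])

# One model per clustering family.
models = {
    "hierarchical (CAH)": AgglomerativeClustering(n_clusters=2),
    "centroid-based (K-means)": KMeans(n_clusters=2, n_init=10, random_state=0),
    "density-based (DBSCAN)": DBSCAN(eps=1.0, min_samples=5),
    # Stand-in for the distribution-based family; this is not DBCLASD.
    "distribution-based (Gaussian mixture)": GaussianMixture(n_components=2, random_state=0),
}

results = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    # DBSCAN marks noise points with -1, which is not a cluster.
    n_clusters = len(set(labels.tolist()) - {-1})
    results[name] = n_clusters
    print(f"{name}: {n_clusters} clusters")
```

Note the difference in inputs: the hierarchical, centroid-based, and mixture models are told how many groups to look for, while DBSCAN infers the number of clusters from the density parameters `eps` and `min_samples`.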

Clustering Workflow

In this article, we will cover different clustering algorithms: K-means, CAH, OPTICS, DBSCAN…

Written by Elfao