One of the challenges I regularly face in my data science work is high dimensionality, which means dealing with what we call the curse of dimensionality. This phenomenon refers to the fact that as the number of dimensions increases, the volume of the space grows so fast that the available data become sparse. To obtain a reliable result, the amount of data needed often grows exponentially with the dimensionality.
Basically, if you face this kind of problem, you have two options for optimizing your machine learning solution:
- Increase the data available
- Use dimensionality reduction to avoid this phenomenon
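To make the sparsity effect concrete, here is a minimal sketch (my own illustration, not from the article) that samples uniform points in the unit hypercube and measures how few of them remain "close" to the center as the dimension grows:

```python
import numpy as np

def fraction_near_center(n_points, dim, radius=0.5, seed=0):
    """Fraction of uniform samples in [0, 1]^dim that lie within
    `radius` of the cube's center -- a rough proxy for data density."""
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, dim))
    dists = np.linalg.norm(points - 0.5, axis=1)
    return float(np.mean(dists <= radius))

# The same ball covers less and less of the space as dim grows
for d in (2, 5, 10):
    print(f"dim={d:2d}  fraction near center = {fraction_near_center(100_000, d):.4f}")
```

In 2 dimensions roughly 78% of the points fall inside the ball; by 10 dimensions almost none do, so the same sample size leaves most of the space empty. That is the sparsity the curse of dimensionality describes.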
In this article, we will show you how to use PCA to reduce dimensions based on World Cup data, and we will introduce a second method that we will detail in the next article.
PCA Goals
PCA, short for principal component analysis, is a popular technique for analyzing large datasets, meaning datasets with a high number of features (variables/dimensions).
PCA is a statistical method for reducing the dimensionality of your dataset. It allows you to create new features (called principal components) as linear combinations of the initial ones, and each new feature is built with the objective of maximizing the variance it captures from the data.
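The idea above can be sketched with scikit-learn's `PCA`. Since the article's World Cup dataset is not reproduced here, this example uses synthetic data whose 10 features are driven by 2 underlying factors, so 2 principal components should capture most of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the article's dataset: 200 samples,
# 10 correlated features generated from 2 latent factors.
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 2))              # 2 underlying factors
mixing = rng.normal(size=(2, 10))               # how factors mix into features
X = latent @ mixing + 0.1 * rng.normal(size=(200, 10))

# Project the 10-dimensional data onto 2 principal components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                          # (200, 2)
print(pca.explained_variance_ratio_.sum())      # near 1.0 for this data
```

Each column of `X_reduced` is a principal component: a linear combination of the original 10 features, chosen so the first component captures the largest possible variance and each subsequent one the largest remaining variance, orthogonal to the previous ones.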