Creating a Football Pizza Chart Using R for Player Analysis
Introduction to Football Data Analytics
The landscape of football has transformed significantly in recent years, particularly with the increased incorporation of data analytics. Utilizing data allows us to assess player performances throughout the season, highlighting advanced metrics that typical statistics fail to convey. In this guide, we will create a pizza chart, also known as a percentile chart, to evaluate a player's performance. We will specifically examine the statistics of N’Golo Kante from the past year. Let’s dive in!
Implementation Steps
To create the pizza chart, we will follow several steps:
- Gather the data
- Select the appropriate metrics
- Preprocess the data
- Visualize the data
Gathering the Data
We will source our data from FBRef, a website that aggregates player statistics from various leagues, especially the top five European leagues. FBRef offers extensive metrics for these leagues thanks to StatsBomb, its third-party data provider. You can learn more about FBRef through the provided link.
To retrieve the data, we will utilize the worldfootballR library, which provides access to several data sources, including FBRef, Understat, and Transfermarkt. Make sure to install the library first using the following command:
install.packages('worldfootballR')
Once installed, load the library with:
library(worldfootballR)
We will focus on profiling N’Golo Kante's performance, so we will obtain the specific FBRef page link for him. Use the link as an argument for the fb_player_scouting_report function:
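The call itself might look like the sketch below. It assumes the player_url and pos_versus arguments of fb_player_scouting_report; the URL is a placeholder rather than Kante's actual page address, and fetch_scouting_report is a hypothetical helper name introduced only for illustration.

```r
# Placeholder URL (an assumption): replace it with the address of the
# player's FBRef page copied from your browser.
player_url <- "https://fbref.com/en/players/<player-id>/<player-name>"

# Hypothetical helper wrapping the worldfootballR call.
# pos_versus = "primary" compares the player against others in his
# primary position.
fetch_scouting_report <- function(url) {
  worldfootballR::fb_player_scouting_report(
    player_url = url,
    pos_versus = "primary"
  )
}

# Uncomment once player_url points at a real page (needs an internet connection):
# df <- fetch_scouting_report(player_url)
```

The result is stored in df, the name used by the preprocessing code in the next section.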
Please allow some time for the data retrieval process to complete.
Preprocessing the Data
Next, we will preprocess the data to filter it for the last year of Kante's playing time. Here’s how to do it:
library(tidyverse)
player <- df %>%
  filter(scouting_period == 'Last 365 Days')
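If the filter returns no rows, the period label in your copy of the data may be worded differently. A minimal sketch of how to check, using a made-up data frame (df_mock and its values are assumptions, not real FBRef output):

```r
# Mock data standing in for the scouting report; values are illustrative only.
df_mock <- data.frame(
  Statistic       = c("Assists", "Assists", "Tackles"),
  scouting_period = c("Last 365 Days", "2020-2021 Premier League", "Last 365 Days")
)

# List the period labels that are actually present before filtering.
periods <- unique(df_mock$scouting_period)
print(periods)

# Keep only the rows for the chosen period and confirm something is left.
last_year <- df_mock[df_mock$scouting_period == "Last 365 Days", ]
stopifnot(nrow(last_year) > 0)
```

Running unique(df$scouting_period) on the real data shows which label to pass to filter.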
We will focus on a few key statistics relevant to Kante's position as a midfielder:
- Attacking Metrics: Non-Penalty Goals, Non-Penalty xG, Shots on Target, Assists, Key Passes
- Possession Metrics: Progressive Passes, Through Balls, Passes Under Pressure, Dribbles Completed, Progressive Carries
- Defensive Metrics: Tackles, Blocks, Interceptions, Pressures
To filter the data according to these metrics, we need to identify the indices for each statistic:
player$Statistic
Next, we can filter the data by running the following lines of code:
mid_cols <- c(3, 9, 14, 2, 43, 47, 52, 53, 87, 102, 106, 117, 125, 96)
player <- player[mid_cols,]
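Hard-coded row numbers stop working if the order of the report changes. As an alternative sketch (not the method used above), the rows can be looked up by name with match. The data frame and the names in it are made up for illustration:

```r
# Mock scouting report; statistic names and percentiles are illustrative only.
report <- data.frame(
  Statistic  = c("Goals", "Assists", "Tackles", "Blocks", "Interceptions"),
  Percentile = c(40, 55, 97, 88, 93)
)

# Names of the statistics we want, in the order we want them.
wanted <- c("Assists", "Tackles", "Interceptions")

# match() gives the row index of the first occurrence of each name,
# or NA when a name is absent.
idx <- match(wanted, report$Statistic)
stopifnot(!anyNA(idx))  # fail early if a name is missing or misspelled

selected <- report[idx, ]
print(selected)
```

Because match returns only the first occurrence, check the result by eye if the same statistic name appears more than once in your data.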
To categorize the statistics into three groups, we will create a new column titled stat:
player <- player %>%
  mutate(stat = case_when(
    # Attacking metrics
    Statistic %in% c('Non-Penalty Goals', 'Non-Penalty xG', 'Shots on target',
                     'Assists', 'Key Passes') ~ 'Attacking',
    # Possession metrics
    Statistic %in% c('Passes Completed', 'Progressive Passes', 'Through Balls',
                     'Passes Under Pressure', 'Progressive Carries',
                     'Dribbles Completed') ~ 'Possession',
    # Everything else is treated as a defensive metric
    TRUE ~ 'Defending'))
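Because the final TRUE branch is a catch-all, a statistic whose name does not match exactly (for example, a difference in capitalization) is labeled as Defending without any warning. A small sanity check can catch this. The sketch below uses made-up rows; player_check and expected_defending are hypothetical names:

```r
# Mock result of the grouping step. "Shots on Target" (capital T) stands in
# for a name that failed to match and fell through to the catch-all.
player_check <- data.frame(
  Statistic = c("Assists", "Tackles", "Shots on Target"),
  stat      = c("Attacking", "Defending", "Defending")
)

# The statistics we actually intend to treat as defensive.
expected_defending <- c("Tackles", "Blocks", "Interceptions", "Pressures")

# Rows labeled Defending that are not in the intended list.
is_defending <- player_check$stat == "Defending"
is_expected  <- player_check$Statistic %in% expected_defending
misfiled     <- player_check$Statistic[is_defending & !is_expected]

print(misfiled)
```

An empty result means every row landed in the group you intended.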
To enhance visualization, we will format the whitespace in each statistic’s name to introduce new lines:
player$Statistic <- gsub(" ", "\n", player$Statistic)
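To see what this replacement does, here is a short sketch on made-up labels. The replacement string is the newline escape sequence, a backslash followed by n:

```r
# Illustrative labels, not taken from the real data.
labels <- c("Progressive Passes", "Passes Under Pressure", "Tackles")

# Replace every space with a line break so each word sits on its own line.
wrapped <- gsub(" ", "\n", labels)

# cat() renders the line breaks; print() would show them as \n.
cat(wrapped[2], "\n")
```

Single-word labels such as Tackles are left unchanged.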
Visualizing the Pizza Chart
Now, let's visualize the pizza chart. If you're familiar with ggplot, this should be straightforward. Here’s the code to create the chart:
ggplot(player, aes(fct_reorder(Statistic, stat), Percentile)) +
  # Background wedges drawn at 100 so every slice shows its full range
  geom_bar(aes(y=100), fill="#131313", stat="identity", width=1, color="#797979", alpha=0.5, show.legend=FALSE) +
  # Percentile wedges, colored by metric group
  geom_bar(aes(fill=stat), stat='identity', width=1, color='#044a9e', alpha=1) +
  coord_polar(clip='off') +
  # Reference rings at the 25th, 50th, and 75th percentiles
  geom_hline(yintercept=25, color="#021e3f", linetype='longdash', alpha=0.5) +
  geom_hline(yintercept=50, color="#021e3f", linetype='longdash', alpha=0.5) +
  geom_hline(yintercept=75, color="#021e3f", linetype='longdash', alpha=0.5) +
  scale_fill_manual(values=c("Possession" = "#1ADA89",
                             "Attacking" = "#0F70BF",
                             "Defending" = "#EC313A")) +
  geom_label(aes(label=Percentile, fill=stat), size=2, color='white', show.legend=FALSE) +
  # The negative lower limit leaves a hole in the middle of the chart
  scale_y_continuous(limits=c(-20,100)) +
  # glue is not attached by library(tidyverse), so call it with its namespace;
  # Player repeats on every row, so take the first value
  labs(fill="",
       title=glue::glue("{player$Player[1]} | Chelsea"),
       subtitle="Last 365 Days | Stats Per 90",
       caption="Data Source: StatsBomb via FBRef | Created by <WRITE YOUR NAME HERE>") +
  theme_minimal()
The result will resemble the following chart:
In essence, this is a bar chart drawn in polar coordinates, so each bar becomes a wedge, like a slice of pizza. The geom_label function then prints the percentile value on each wedge.
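The idea is easier to see on a stripped-down example. The sketch below builds an ordinary bar chart from three made-up values and then redraws it in polar coordinates; the data frame toy and its columns are assumptions for illustration only:

```r
library(ggplot2)

# Made-up percentiles for three hypothetical metrics.
toy <- data.frame(
  metric     = c("A", "B", "C"),
  percentile = c(90, 60, 30)
)

# Step 1: an ordinary bar chart. width = 1 removes the gaps between bars,
# and the negative lower limit reserves empty space below zero.
bars <- ggplot(toy, aes(metric, percentile)) +
  geom_col(width = 1) +
  scale_y_continuous(limits = c(-20, 100))

# Step 2: the same bars in polar coordinates become wedges, and the
# reserved space below zero becomes the hole in the middle.
pizza <- bars + coord_polar()

print(pizza)
```

Printing bars and pizza one after the other shows that only the coordinate system differs between the two plots.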
To improve the overall visual appeal, we can customize the chart further:
ggplot(player, aes(fct_reorder(Statistic, stat), Percentile)) +
  geom_bar(aes(y=100), fill="#131313", stat="identity", width=1, color="#797979", alpha=0.5, show.legend=FALSE) +
  geom_bar(aes(fill=stat), stat='identity', width=1, color='#044a9e', alpha=1) +
  coord_polar(clip='off') +
  geom_hline(yintercept=25, color="#021e3f", linetype='longdash', alpha=0.5) +
  geom_hline(yintercept=50, color="#021e3f", linetype='longdash', alpha=0.5) +
  geom_hline(yintercept=75, color="#021e3f", linetype='longdash', alpha=0.5) +
  scale_fill_manual(values=c("Possession" = "#1ADA89",
                             "Attacking" = "#0F70BF",
                             "Defending" = "#EC313A")) +
  geom_label(aes(label=Percentile, fill=stat), size=2, color='white', show.legend=FALSE) +
  scale_y_continuous(limits=c(-20,100)) +
  # glue is not attached by library(tidyverse), so call it with its namespace;
  # Player repeats on every row, so take the first value
  labs(fill="",
       title=glue::glue("{player$Player[1]} | Chelsea"),
       subtitle="Last 365 Days | Stats Per 90",
       caption="Data Source: StatsBomb via FBRef | Created by <WRITE YOUR NAME HERE>") +
  theme_minimal() +
  # Dark background, centered titles, and no grid lines or axis titles
  theme(plot.background = element_rect(fill = "#021e3f", color = "#021e3f"),
        panel.background = element_rect(fill = "#021e3f", color = "#021e3f"),
        legend.position = "bottom",
        axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        axis.text.y = element_blank(),
        axis.text.x = element_text(size=6, color='#BBBBBB'),
        text = element_text(family="Geneva", color="#FFFFFF"),
        plot.title = element_text(hjust=0.5),
        plot.subtitle = element_text(hjust=0.5, size=8),
        plot.caption = element_text(hjust=0.5, size=6),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        plot.margin = margin(5, 4, 2, 4))
Here’s the final visualization:
Analyzing the Chart
From this chart, we can draw several conclusions. Kante, a defensive midfielder, operates effectively in both his team's and the opponent's half. His statistics indicate he is among the best in the top five European leagues. He excels in regaining possession through his defensive skills, including pressures, blocks, tackles, and interceptions. Additionally, he is capable of making progressive passes and carries, bringing the ball closer to the attacking zone and contributing to goal-scoring opportunities.
Final Thoughts
Congratulations on completing this tutorial! Your pizza chart should now look polished and presentable. I encourage you to experiment and create your own version with a unique style! If you found this guide helpful, consider following me on Medium and LinkedIn for more insights into football analytics and data science.
Thank you for reading!