By Brian Maurer and Tyler Trent
Christian Lattanzio tries to organize his team from the sideline (Photo courtesy of MLS)
When discussing statistics and analytics in soccer, an excellent place to start is asking the question, “Why analyze soccer?”
As an independent writer, I find that analytics and statistics enhance my ability to communicate what is happening during a game or over the course of a season.
Players and coaches have started adopting video analysis and other data to help their teams improve more efficiently. Scouting departments use data to try to find players who are affordable and match their teams’ playing styles.
Regardless of one’s role in soccer, data analysis should supplement the skills already in place. It does not replace observing, discussing, or coaching. For me as an observer and a writer, statistics can validate and strengthen an opinion when I am trying to make a point, or they can offer the opportunity to re-evaluate what I think I am seeing.
For example, if I see Ashley Westwood as an exceptional passer, that observation can be further validated by looking at some of his passing stats (6.62 progressive passes per 90 minutes). Other passing stats from Westwood’s season can make me ask questions and re-evaluate how good a passer he is (78.2% pass completion rate).
Analytics should provide a path toward a more complete understanding of what is observed in the beautiful game. It can do this by providing answers that are complemented by questions that lead us to further discovery.
Here is a look at some of the analytical methods most commonly used by pundits and fans, the answers they provide, and the questions they leave open to interpretation.
xG – Expected Goals
xG is probably the most commonly mentioned term in soccer analytics. It puts a numerical value on the quality of a shot. For example, a shot with an xG of 0.6 had a 60% chance of resulting in a goal. Each stat provider calculates xG with its own model, so you might see different numbers reported for the same game. For example, American Soccer Analysis’ xG model uses shot location, goal mouth placement, and passing.
xG can provide a few different answers when looking at a game, player, team, or season. These models can give us a sense of who is getting chances (quantity) and who is getting good chances (quality).
For example, according to FBref, Enzo Copetti had 1 xG in his match against NYCFC. However, he had only two shots in that game, which works out to 0.5 xG per shot (low quantity, but high quality). On the other hand, the rest of Charlotte FC had nine shots that added up to 0.4 xG (high quantity, low quality).
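The quantity-versus-quality split above is just total xG divided by the number of shots. Here is a minimal sketch of that arithmetic using the figures quoted from FBref; the function name `xg_per_shot` is my own and is not part of any provider's tooling.

```python
def xg_per_shot(total_xg: float, shots: int) -> float:
    """Average chance quality: total xG divided by the number of shots."""
    if shots <= 0:
        raise ValueError("shots must be a positive number")
    return total_xg / shots


# Figures quoted in the article (FBref, Charlotte FC vs. NYCFC)
copetti = xg_per_shot(1.0, 2)        # few shots, high quality
rest_of_team = xg_per_shot(0.4, 9)   # many shots, low quality

print(f"Copetti: {copetti:.2f} xG per shot")
print(f"Rest of Charlotte FC: {rest_of_team:.2f} xG per shot")
```

Seen this way, Copetti's two shots were each worth roughly ten times as much as the average shot from his teammates.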
What xG does not provide is a sense of who is controlling a game (it is often mistaken for a measure of control, much as possession percentage is). xG models also don’t account for the role of the defense in preventing chances from reaching the goal. For example, in the same Charlotte FC match referenced earlier, Charlotte FC players not named Copetti had nine shots. Four of those shots were blocked. Those four shots accumulated 0.17 xG, but the defense blocked them and snuffed out the chances that the xG figure represents.
xG models generally do not consider the skill of the shot taker. For example, a 0.05 xG shot taken by Karol Świderski is not equal to a 0.05 xG shot taken by Derrick Jones.
Ways to Dive Deeper with xG
xG can also be used to calculate expected assisted goals (xAG) for passers, goals minus xG (G-xG) for finishing quality, and post-shot expected goals (PSxG) for finishing and for goalkeepers.
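Of these, G-xG is the simplest to work out by hand: add up the xG of a player's shots and subtract that from the goals actually scored. The sketch below assumes a made-up player and made-up shot values purely for illustration; none of the numbers come from a real match.

```python
def goals_minus_xg(goals: int, shot_xgs: list[float]) -> float:
    """G-xG: goals scored minus the summed xG of the shots taken."""
    return goals - sum(shot_xgs)


# Hypothetical player: four shots with these xG values, two goals scored
hypothetical_shots = [0.6, 0.05, 0.1, 0.25]
hypothetical_goals = 2

difference = goals_minus_xg(hypothetical_goals, hypothetical_shots)
print(f"G-xG: {difference:+.2f}")
```

A positive number suggests the player scored more than the chances were worth, and a negative number suggests the opposite. Over a small sample, either can simply be luck.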
Predictive models such as xG are still in the adolescent stage, and many new methods will be created and used in the coming years.
Key Passes
A key pass is typically defined as a pass that leads to a teammate having an attempt on goal that does not result in a goal. This is an important metric to consider when looking at how a player supports their team offensively, even when the pass is not recorded as an assist in their stats.
Key passes are useful for showing the quantity of shot assists, but they do not highlight the quality or impact of the passer in a given play. For instance, a player can make a short sideways pass and earn a key pass if the receiver can take a shot. However, that pass will likely be far more routine and less impactful at breaking down a defense.
On the other hand, a passer can play a through ball that eliminates three defenders from the play and puts a striker in on goal one-on-one with the keeper. These two passes are quite different in quality and in their potential impact on a game. Yet the key pass metric counts them as the same.
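One way to see the gap between quantity and quality is to set the key pass count beside the xG of the shots those passes created, which is the idea behind expected-assist metrics. The sketch below uses two hypothetical passers with invented shot values; the function name is mine and not taken from any stat provider.

```python
def summarize_passer(shot_xgs: list[float]) -> tuple[int, float]:
    """Return (number of key passes, total xG of the shots they set up)."""
    return len(shot_xgs), sum(shot_xgs)


# Hypothetical xG values of the shots that followed each key pass
sideways_passer = [0.03, 0.04, 0.02]      # routine passes, low-value shots
through_ball_passer = [0.45, 0.30, 0.12]  # defense-splitting passes

for name, shots in [("Sideways passer", sideways_passer),
                    ("Through-ball passer", through_ball_passer)]:
    count, total = summarize_passer(shots)
    print(f"{name}: {count} key passes, {total:.2f} xG created")
```

Both players finish with three key passes, but the xG they created is nowhere near equal.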
Progressive Carries & Progressive Passes
A progressive carry is typically defined as a carry in the attacking half that moves the ball at least 10 yards forward. A progressive pass is defined the same way, but the ball is passed rather than dribbled forward. Exact definitions vary by stat provider. These metrics indicate a player’s tendency to move the ball forward.
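To make the definition concrete, here is a rough sketch of how a carry or pass might be flagged as progressive. It rests on several assumptions of my own: positions are measured in yards from a team's own goal line, the pitch is 115 yards long, and "in the attacking half" is read as the action ending past the halfway line. Real providers use more detailed rules.

```python
from dataclasses import dataclass

PITCH_LENGTH_YD = 115.0  # assumed pitch length; real pitches vary
MIN_FORWARD_YD = 10.0    # the 10-yard threshold quoted above


@dataclass
class BallMove:
    """A carry or pass, described only by where it starts and ends."""
    start_x: float  # yards from the team's own goal line
    end_x: float


def is_progressive(move: BallMove) -> bool:
    """Apply the simplified definition: 10+ yards forward, ending in the attacking half."""
    forward_gain = move.end_x - move.start_x
    ends_in_attacking_half = move.end_x > PITCH_LENGTH_YD / 2
    return ends_in_attacking_half and forward_gain >= MIN_FORWARD_YD


print(is_progressive(BallMove(start_x=50, end_x=70)))  # long move into the attacking half
print(is_progressive(BallMove(start_x=20, end_x=45)))  # long move, but ends in own half
print(is_progressive(BallMove(start_x=80, end_x=85)))  # attacking half, but only 5 yards
```

Note that the check says nothing about whether the move beat a defender or led to a chance, only that the ball traveled far enough forward.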
The shortcomings of progressive carry and pass stats are similar to the shortcomings in key pass stats. They do a good job of identifying the quantity but do not necessarily identify the quality or the impact of the play.
Possession
This one may seem obvious, but many people don't know how possession percentage is actually calculated. A team’s possession is its number of passes divided by the total number of passes made by both teams in the match.
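The pass-share calculation described above takes only a few lines. In this sketch the pass counts are invented so that they produce a 42% and 58% split; they are not the real totals from any match.

```python
def possession_share(team_passes: int, opponent_passes: int) -> float:
    """Possession percentage as a team's share of all passes in the match."""
    total = team_passes + opponent_passes
    if total == 0:
        raise ValueError("no passes recorded")
    return 100 * team_passes / total


# Hypothetical pass counts chosen to give a 42% / 58% split
team_a_passes = 336
team_b_passes = 464

team_a_share = possession_share(team_a_passes, team_b_passes)
team_b_share = possession_share(team_b_passes, team_a_passes)
print(f"Team A: {team_a_share:.0f}%  Team B: {team_b_share:.0f}%")
```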
On their own, general possession stats, such as Charlotte FC having 42% possession against NYCFC’s 58%, tell us very little. However, if you break possession down into five-minute increments throughout a match, or look at where on the field possession takes place, these stats become far more informative about how the game state is shifting between teams.
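Breaking possession into five-minute windows is the same pass-share calculation applied to smaller slices of the match. The sketch below assumes each pass is logged as a (minute, team) pair, and the sample events and team labels are invented for illustration.

```python
from collections import defaultdict


def possession_by_window(passes, team, window=5):
    """Return {window start minute: team's pass share (%)} for each window with passes."""
    team_counts = defaultdict(int)
    total_counts = defaultdict(int)
    for minute, passing_team in passes:
        start = (int(minute) // window) * window  # e.g. minute 7 falls in the 5-9 window
        total_counts[start] += 1
        if passing_team == team:
            team_counts[start] += 1
    return {start: 100 * team_counts[start] / total_counts[start]
            for start in sorted(total_counts)}


# Hypothetical pass log: (minute, team)
sample_passes = [(1, "CLT"), (2, "NYC"), (3, "NYC"), (4, "NYC"),
                 (6, "CLT"), (7, "CLT"), (8, "CLT"), (9, "NYC")]

windows = possession_by_window(sample_passes, team="CLT")
for start, share in windows.items():
    print(f"Minutes {start}-{start + 4}: {share:.0f}% CLT")
```

In this toy log the two teams split the passes evenly overall, yet one team dominates each window in turn, which is exactly the kind of swing a single match-long percentage hides.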
Heat Maps
Heat maps are a graphic representation of a player’s position on the field over a match or season. Below is an example for Atlanta United’s Thiago Almada from the 2023 MLS season. Areas shown in red are where Almada is most often found. Heat maps are an excellent way to assess a player’s average position. For example, looking at the heat maps of an entire starting eleven can sometimes provide a sense of how a coach actually lined up their team compared to the pregame lineup release.
Thiago Almada's 2023 MLS heat map (Courtesy of SofaScore)
Heat maps can be useful as a positional complement to other stats, such as the overall actions of a player.
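Underneath the colors, a heat map is a count of how often a player was recorded in each area of the field. The sketch below shows that idea with a coarse grid. The pitch dimensions, grid size, and sample positions are all assumptions of mine, and this is not how SofaScore or any other provider builds its graphics.

```python
def heat_map_grid(positions, pitch_length=115.0, pitch_width=75.0, cols=6, rows=4):
    """Count recorded (x, y) positions, in yards, in each cell of a rows x cols grid."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in positions:
        # Clamp so positions on or past a boundary land in the nearest edge cell
        col = max(0, min(int(x / pitch_length * cols), cols - 1))
        row = max(0, min(int(y / pitch_width * rows), rows - 1))
        grid[row][col] += 1
    return grid


# Hypothetical recorded positions: three in one attacking zone, one deep and wide
sample_positions = [(80, 40), (82, 38), (85, 41), (30, 10)]

grid = heat_map_grid(sample_positions)
for row in grid:
    print(row)
```

The cell with the highest count is the one a heat map would shade most intensely. As with the graphic itself, the counts show where a player was, not what he did there.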
Below is Brandon Cambridge’s heat map from his most recent substitute performance against the Chicago Fire. The heat map provides a sense of where he was positioned on the field but does not demonstrate what happened there.
Brandon Cambridge's heat map versus the Chicago Fire (Graphic courtesy of SofaScore)
Once Cambridge’s passing and shooting actions (below) are added, his role and performance are easier to observe.
Brandon Cambridge's passing and shooting actions* against the Chicago Fire (Graphic courtesy of MLSsoccer.com)
Explore and Question
Data analytics is a valuable tool that allows one to explore more specific questions. When used alongside other techniques, it can help enhance the way we discuss and explore soccer. It is not a definitive source of truth, but when interpreted carefully, it can help guide one toward a more expansive way of understanding the sport we all love.
If you are interested in more statistics and analytics, some good free options are FBref, the American Soccer Analysis App, or FotMob. All these sites pull data from sources such as Opta and StatsBomb.
Inverting the Pyramid: The History of Soccer Tactics, Jonathan Wilson
Football Hackers: The Science and Art of a Data Revolution, Christoph Biermann
Net Gains: Inside the Beautiful Game’s Analytics Revolution, Ryan O’Hanlon
*Squares symbolize passes (black squares are completed passes); circles represent shots.