Data Analytics Practice: Comparing Players Fairly



Imagine that you want to judge two middle-distance runners based on the distance they have run. The first runner has 12 minutes to complete his run, while the second has 15 minutes to do so.


Unfair? Yeah sure, but that's what we do when we don't adjust the quantitative performance data in our data analysis.

It is impressive that PSG's "passing monster" Marco Verratti records around 90 touches of the ball per game. Yet PSG has a possession share of over 60%. It would not be fair to compare him, without adjustment, to a player like Jean Onana, who records 47 touches per game at Bordeaux, a team with 48% possession. This practical issue of fair comparability and adjustment of stats is what we want to address in this blog.


In order to compare the players' performances fairly, it is important to put the absolute, quantitative values into the right context. In other words, the values must first be adjusted so that they can be compared fairly and are as reliable and meaningful as possible.


Adjusting metrics is not new and has been used for many years. Specifically, we were inspired for this blog by our esteemed DataAnalytics colleague Ben Griffis from America.



For Ben:

If your actions inspire others to dream more, learn more, do more and become more, you are a true #leader

In this blog, we highlight several ways to adjust defensive and offensive quantitative metrics.



Per 90 minutes metrics (p90)

Per-90-minute (p90) metrics are a good starting point for making player performance more tangible and comparable. But they are no more than a starting point.
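For reference, a per-90 value simply scales a season total to 90 minutes of playing time. The snippet below is a minimal sketch of that calculation; the function name `per_90` and the sample numbers are hypothetical and only serve as an illustration.

```python
def per_90(total: float, minutes_played: float) -> float:
    """Scale a raw count to a per-90-minutes rate."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return total / minutes_played * 90.0


# Hypothetical example: 30 interceptions in 1350 minutes played.
interceptions_p90 = per_90(30, 1350)
print(round(interceptions_p90, 2))
```

Note that p90 only corrects for playing time. It says nothing yet about how much of that time the team actually had the ball.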




Making defensive metrics more meaningful

Let's first look at the defensive quantitative, or volume-based, metrics, such as the number of defensive duels, interceptions, tackles, blocked shots, etc.

These are all actions that are performed when the opponent has the ball.


But if we only use the number per 90 minutes, we only see the absolute number. On the face of it, Centre Back A with 9 interceptions in the game has a better value than the opponent's Centre Back B with 7 interceptions. But this is only half the truth.


If A's team had only 40% possession and B's team had 60%, the rating changes. The opponent had the ball 60% of the time against A, so A had that much time to make his 9 interceptions. B made his 7 interceptions while the opponent had the ball only 40% of the time.


If we adjust the interception values to a fair and equal share of opponent possession (50%), then all of a sudden B has the better value:

A (9/60*50) = 7.5 adjusted interceptions

B (7/40*50) = 8.75 adjusted interceptions
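The arithmetic above can be sketched in a few lines of Python. This is a minimal sketch, not a definitive implementation: the function name `possession_adjust_defensive` and its parameters are our own illustrative choices, and it assumes that opponent possession is simply 100% minus the team's own possession.

```python
def possession_adjust_defensive(
    value: float,
    own_possession_pct: float,
    baseline_pct: float = 50.0,
) -> float:
    """Scale a defensive count to a common share of opponent possession.

    Assumption: opponent possession = 100 - own possession.
    """
    opponent_possession_pct = 100.0 - own_possession_pct
    if opponent_possession_pct <= 0:
        raise ValueError("opponent possession must be above 0%")
    return value / opponent_possession_pct * baseline_pct


# Centre Back A: 9 interceptions, own team had 40% possession.
adjusted_a = possession_adjust_defensive(9, own_possession_pct=40)
# Centre Back B: 7 interceptions, own team had 60% possession.
adjusted_b = possession_adjust_defensive(7, own_possession_pct=60)

print(round(adjusted_a, 2), round(adjusted_b, 2))
```

The baseline of 50% is a convention that keeps the adjusted numbers on a familiar scale. Any fixed baseline would produce the same ranking.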



Unadjusted defensive quantitative values (p90) favor players from teams with little possession, as they had more time to execute their defensive actions.

Of course, the adjustment does not apply to qualitative values. Success rates of dribbles, passes, etc. are not adjusted.


Adjusting values can bring inconspicuous players or teams into the light and make them really shine.



Two charts from the 2021-22 Swiss Super League follow: the standard chart with the p90 values and the adjusted chart with the possession-adjusted values (PAdj). To follow the effect of the adjustment, you can track W. Burger, C. Zesiger and L. Zuffi.



Defensive Stats per 90 Min, Data: Wyscout

Defensive Stats PAdj, Data: Wyscout

W. Burger and C. Zesiger play for Basel and Young Boys, teams with a lot of possession. Their defensive stats have benefited from the PAdj adjustment. For L. Zuffi, it's the other way around. He slipped below the median after the adjustment due to Sion's low possession.



Statistical adjustment is a ubiquitous practice in all quantitative fields, used to correct for irregularities or limitations in observed data, to remove the influence of confounding variables, or to convert observed correlations into causal inferences.




Making Offensive Metrics More Meaningful

When it comes to offensive metrics, or more precisely, actions with the ball, the reverse is true. Players on teams with a lot of possession have more playing time available to execute their ball actions.



Unadjusted metrics of quantitative ball actions (p90) favor players of teams with a lot of possession, as they have more time to execute their ball actions.


Let's look at Key Passes and Expected Assists. Both metrics have one thing in common: they can only arise from passes. Earlier, we adjusted the defensive metrics using team possession. That is the best available option there, but it is not perfect, because it adjusts a player-level value with a team-level value. It is still better than p90 with no quantitative adjustment at all.



Here we have the numbers at player level, so we can use each player's number of passes for the adjustment. It does say something about the quality of a player whether he generates his "Expected Assists" value with 50 or with 30 passes per game.




The following are two charts from the Super League 2021-22: the standard chart with the p90 values and the adjusted chart with the pass-adjusted values. For this purpose we used the unit "per 40 passes". The value used should be as close as possible to the mean value. To follow the effect of the adjustment, you can track M. Stevanović, J. von Moos and T. Aiyegun on the charts.


Key Passes & xA per 90, Data: Wyscout

Key Passes & xA Pass adjusted, Data: Wyscout

The performance of M. Stevanović is still good, but it is put into perspective, as he had 35.8 passes per game. The performances of J. von Moos (19.5 passes per game) and T. Aiyegun (17.1 passes per game), on the other hand, appear in a new light. They deliver their output with about half as many passes as M. Stevanović.


"Using the number of touches for adjustment is dangerous" is something I've heard several times: it would not take into account how well a player makes himself available or gets free. That's absolutely correct, but when it comes to offensive output, we only want to evaluate that output, measured against the input. The fact is that the player generates more output with fewer balls. If you want to evaluate a player's involvement in the game, his availability and his movement off the ball, you can compare his number of passes with other players in the same position.


It is always important to have the full focus on the question you want to answer.


The metrics that come from passes become fairer and more meaningful with pass adjustment.
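The pass adjustment can be sketched as follows. This is only an illustration under stated assumptions: the function name `per_n_passes` is our own, the pass volumes (35.8 and 17.1) are the ones quoted above and are treated as per-90 values, and the xA figures are hypothetical placeholders, not real player data.

```python
PASS_BASELINE = 40  # the "per 40 passes" unit used for the charts


def per_n_passes(value_p90: float, passes_p90: float, n: float = PASS_BASELINE) -> float:
    """Express a pass-based metric per n passes instead of per 90 minutes."""
    if passes_p90 <= 0:
        raise ValueError("passes_p90 must be positive")
    return value_p90 / passes_p90 * n


# Pass volumes taken from the text; xA values are HYPOTHETICAL placeholders.
high_volume = {"passes_p90": 35.8, "xa_p90": 0.30}
low_volume = {"passes_p90": 17.1, "xa_p90": 0.18}

xa_adj_high = per_n_passes(high_volume["xa_p90"], high_volume["passes_p90"])
xa_adj_low = per_n_passes(low_volume["xa_p90"], low_volume["passes_p90"])

print(f"high-volume passer: {xa_adj_high:.3f} xA per 40 passes")
print(f"low-volume passer:  {xa_adj_low:.3f} xA per 40 passes")
```

With these placeholder numbers, the high-volume passer leads on xA per 90, while the low-volume passer leads once the value is expressed per 40 passes. That is the kind of shift the adjusted chart makes visible.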



Data-driven personal coaching for players

By the way, we are in the process of offering data-driven player coaching.

The player gets a smart view of his performance data from outside his own system, and together we work out measures in individual tactics coaching sessions to improve his performance.


The player is to a large extent responsible for his own development. He has become a kind of self-entrepreneur.


External coaching for players and coaches by specialists is becoming more and more important to achieve excellence and to use all possibilities and resources. Whether #mentalcoach, #athleticscoach, #tacticscoach or #dataanalysiscoach.... In the end, it's not #talent but #development that pays the rent.




Ball Progression (Progressive Passes & Progressive runs)

We follow the same procedure if we want to make statements about ball progression using the values "Progressive Passes" and "Progressive Runs".


"Progressive Passes" arise from passes, so again we adjust with "per 40 passes". "Progressive Runs", however, do not arise from passes, but from touches or received passes. Thus, we use "per 30 passes received" to adjust this metric.
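As a rough sketch, the two ball progression adjustments could look like this in Python. The class and field names are hypothetical, as are the sample values. Only the two units ("per 40 passes" and "per 30 passes received") come from the text.

```python
from dataclasses import dataclass


@dataclass
class ProgressionStats:
    """Per-90 inputs for one player (field names are illustrative)."""
    progressive_passes_p90: float
    passes_p90: float
    progressive_runs_p90: float
    passes_received_p90: float

    def progressive_passes_adj(self, per_passes: float = 40.0) -> float:
        # Progressive passes arise from passes: scale by pass volume.
        return self.progressive_passes_p90 / self.passes_p90 * per_passes

    def progressive_runs_adj(self, per_received: float = 30.0) -> float:
        # Progressive runs arise from received balls: scale by passes received.
        return self.progressive_runs_p90 / self.passes_received_p90 * per_received


# Hypothetical player: 8 progressive passes from 50 passes,
# 3 progressive runs from 20 passes received (all per 90).
player = ProgressionStats(8.0, 50.0, 3.0, 20.0)
print(round(player.progressive_passes_adj(), 2))
print(round(player.progressive_runs_adj(), 2))
```

Each metric is divided by the input it actually depends on, so a player is not rewarded merely for seeing a lot of the ball.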



On the following two charts, we selected H. Mahou, T. Coyle, L. Millar and G. Clichy to track.


Ball Progression per 90, Data: Wyscout

Ball Progression Adjusted, Data: Wyscout

We again see significant changes in the adjusted values.

H. Mahou and T. Coyle delivered their output with fewer passes and passes received per game, so their performances are rated more highly.

The performances of L. Millar and G. Clichy, on the other hand, are put somewhat into perspective by their many passes and passes received.





Advanced Quantitative Adjustments

The examples used in this blog are about basic adjustments to the quantitative values, which provide a more realistic and fair view of the data.


More advanced adjustments are possible. For the defensive values, instead of the opponent's possession, you can also use the opponent's number of touches. If you have the data, you can even use only the number of touches in the attacking third or the number of passes into the final third.


For the ball actions, instead of the total number of passes or passes received, you can also selectively use only the passes in a certain area of the field.


Adjusting the quantitative data changes the view of the data as we have seen it so far. That creates a tension that makes for exciting conversations, and many insights are waiting to be discovered.

We have become accustomed to always analyzing both views (p90 & adjusted) when scouting. In any case, unadjusted quantitative values can lead to wrong assumptions, distortions and ultimately wrong decisions.


The stronger the correlation between the quantitative metric and the unit used for the adjustment, the more robust and meaningful the data analysis will be.
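One way to check this before choosing an adjustment unit is to compute the Pearson correlation between the metric and the candidate unit across players. The snippet below is a minimal sketch: the helper `pearson` is our own, and the two sample series are hypothetical values, not Wyscout data.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL per-90 values for five players.
passes_p90 = [18.0, 25.0, 33.0, 41.0, 52.0]
key_passes_p90 = [0.6, 0.9, 1.1, 1.6, 1.9]

r = pearson(passes_p90, key_passes_p90)
print(f"correlation between passes and key passes: {r:.2f}")
```

A value close to 1 would suggest that the metric largely moves with the chosen unit, which supports using that unit for the adjustment. A weak correlation would be a hint to look for a better-suited unit.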







footballytics - we know how to make data talk

We combine the competences of football tactics, scouting and DataAnalytics and advise and support clubs in interpreting and using data to make better decisions in scouting and match analysis in a data-validated way.

 

Blog from www.footballytics.ch

About football Data Analytics - improve the game - change the ǝɯɐƃ.
