Team Style Profiles and Similarity Scores

 

 

How do you find the most English team? You could count English internationals, home-grown players, or fans, or simply refer to the picture above and declare that game the most English game of recent years. I took a different approach to find out which team played closest to the English style this last season. To do so, we need a way of profiling teams by their style. For this we will use a number of metrics, listed below:

 

Both offense and defense

-Possession

 

Offense

-field tilt (ratio of attacking third/own third completions)

-shot tempo (shots per pass)

-intrabox success rate (completion % on passes that begin and end inside the box)

-pass length

-centrality (% of passes toward the center of pitch in final third)

-box attacks (passes into the box)

-forward play (% of passes that are forward)

 

Defense

-field tilt

-high press rate (% of passes completed that are 60+ yards away from the goal)

-shot tempo

-intrabox success rate

-centrality

-box attacks

-forward play
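A few of the metrics above can be sketched in code. This is only a rough illustration: the `Pass` record, the 0 to 100 pitch scale, and the choice to assign a completion to a third by where the pass ends are all assumptions made for the example, not the definitions used to build the profiles.

```python
from dataclasses import dataclass

@dataclass
class Pass:
    # Hypothetical event record. x runs from 0 (own goal line)
    # to 100 (opponent goal line); this scale is an assumption.
    start_x: float
    end_x: float
    completed: bool

def field_tilt(passes):
    """Ratio of attacking-third completions to own-third completions."""
    attacking = sum(1 for p in passes if p.completed and p.end_x >= 200 / 3)
    own = sum(1 for p in passes if p.completed and p.end_x < 100 / 3)
    return attacking / own if own else float("inf")

def shot_tempo(shots, passes):
    """Shots per pass attempted."""
    return shots / len(passes)

def forward_play(passes):
    """Percentage of passes that travel toward the opponent's goal."""
    forward = sum(1 for p in passes if p.end_x > p.start_x)
    return 100 * forward / len(passes)

# Invented sample data, for illustration only.
sample = [
    Pass(10, 20, True),
    Pass(50, 70, True),
    Pass(70, 80, True),
    Pass(80, 60, False),
]
print(field_tilt(sample), shot_tempo(1, sample), forward_play(sample))
```

The defensive versions of these metrics would run the same functions over the passes a team allows rather than the passes it makes.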

 

For each metric, a team’s rate was compared to the European average and standard deviation to get a z-score, which was then used to make a team profile. For example, Villarreal allows 31% of intrabox passes to be completed. The European average is 40.4% with a standard deviation of 5.4%, so Villarreal sits about 1.74 standard deviations below average. This puts Villarreal in the 4th percentile for ease of intrabox passing against. This is done for each metric to create a team profile (Villarreal shown again):
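The Villarreal arithmetic can be reproduced in a few lines. This sketch assumes the percentile is read off a normal curve, which matches the 4th-percentile figure quoted here but is a guess about the exact conversion used; the function names are made up for the example.

```python
from statistics import NormalDist

def z_score(value, mean, std):
    """Number of standard deviations a value sits from the mean."""
    return (value - mean) / std

def percentile_from_z(z):
    """Percentile implied by a z-score, assuming a normal distribution."""
    return 100.0 * NormalDist().cdf(z)

# Villarreal: 31% intrabox completion allowed,
# European average 40.4%, standard deviation 5.4%.
z = z_score(31.0, 40.4, 5.4)
pct = percentile_from_z(z)
print(f"z = {z:.2f}, percentile = {pct:.1f}")
```

A rank-based percentile (where the team sits among all European teams) would be a reasonable alternative and would not need the normality assumption.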

 

 

Two things jump out: they shut down the box, and they force teams to the flanks more than any other team in Europe.

 

If you do this for each team in a league you begin to see some significant stylistic differences. I’ve looked at differences in shooting across leagues before and Colin Trainor and others have written about it on this site. Others have written very well about defensive differences from league to league (two are here and here). These profiles are another way of looking at league differences through how they play the ball. Spanish, Italian, and English teams have significantly higher field tilt than German and French teams. England and France are well ahead in intra-box pass % with Spain and Germany significantly behind. Box passes can be seen below:

 

 

 

Putting it all together, here is the composite England style profile (average of each team):

 

 

 

To find the most English team we need to use another tool in its early stages: the Style Similarity Score. It’s a simple tool that compares percentile differences across the different categories and gives us a number summing up all of those differences. The categories are weighted slightly differently; they are ordered by importance in the list at the start of the article. If a team had exactly the same numbers as another, their Style Similarity Score would be 0, and the higher the score, the more different the teams’ playing styles theoretically are. Here are two quick examples:
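One way such a score could be computed is a weighted sum of absolute percentile differences. This is a minimal sketch under that assumption: the actual formula and weights are not spelled out here, so the function name, the default weight of 1.0, and the sample numbers are all hypothetical.

```python
def style_similarity(profile_a, profile_b, weights=None):
    """Weighted sum of absolute percentile differences between two profiles.

    Profiles map metric name -> percentile (0 to 100). Identical profiles
    score 0; larger scores mean more different styles. The weights are
    placeholders, not the ones used for the results in this article.
    """
    if profile_a.keys() != profile_b.keys():
        raise ValueError("profiles must cover the same metrics")
    weights = weights or {}
    return sum(
        weights.get(metric, 1.0) * abs(profile_a[metric] - profile_b[metric])
        for metric in profile_a
    )

# Invented percentiles, for illustration only.
team_a = {"possession": 80, "field_tilt": 60}
team_b = {"possession": 70, "field_tilt": 65}

print(style_similarity(team_a, team_a))                       # identical profiles
print(style_similarity(team_a, team_b))                       # unweighted
print(style_similarity(team_a, team_b, {"possession": 2.0}))  # possession counts double
```

Absolute differences keep the score in percentile points, which makes it easy to read; squaring the differences instead would punish one very large gap more than several small ones.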

 

 

Nothing in these scores completely contradicts the eye test, which makes me think this is a good first step. I wanted to use this new tool to find the real essence of each league. The glitz and glamour of Arsenal, Bayern, Barcelona and PSG are well known but certainly aren’t representative of the average team in each of those leagues. So I put the English profile from above through the similarity score to find the two teams most similar to it, so I’d know what game to watch if I wanted to find the true heart of Premier League football. I did this for each of the top 5 leagues.
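The search for a league's most typical fixture can be sketched as follows: average every team's percentiles into a league profile, then rank teams by how far they sit from it. The unweighted distance, the function names, and the four teams with their numbers are invented for illustration.

```python
def composite_profile(team_profiles):
    """Average each metric's percentile across all teams in a league."""
    metrics = next(iter(team_profiles.values())).keys()
    n = len(team_profiles)
    return {m: sum(p[m] for p in team_profiles.values()) / n for m in metrics}

def distance(profile_a, profile_b):
    """Unweighted sum of absolute percentile differences (an assumption)."""
    return sum(abs(profile_a[m] - profile_b[m]) for m in profile_a)

def most_typical(team_profiles, k=2):
    """Return the k teams whose profiles sit closest to the league average."""
    league = composite_profile(team_profiles)
    ranked = sorted(team_profiles, key=lambda t: distance(team_profiles[t], league))
    return ranked[:k]

# Invented percentiles for a four-team league, for illustration only.
league = {
    "Team A": {"possession": 90, "pass_length": 10},
    "Team B": {"possession": 55, "pass_length": 45},
    "Team C": {"possession": 40, "pass_length": 60},
    "Team D": {"possession": 15, "pass_length": 85},
}
print(most_typical(league))  # ['Team B', 'Team C']
```

In this toy league the two extreme sides cancel out in the average, so the composite lands between the two middling teams, and their meeting is the "typical" match.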

 

Results

 

England: Stoke City v Aston Villa

Italy: Palermo v Sassuolo

France: Lorient v St Etienne

Germany: Frankfurt v Stuttgart

Spain: Deportivo de La Coruna v Valencia

 

If you had sat down and watched all 11 of the matches between these sides this season, I think you would have a good taste of the differences between the leagues. Just looking at the results, you can see that Frankfurt and Stuttgart played a 5-4 classic and a 3-1 as well, while St Etienne beat Lorient three times, by scores of 2-0, 1-0, and 1-0, without a first-half goal.

 

The EPL is an interesting case as it has way fewer teams that “look” like the average side. This is because the league is more stratified in the way its teams pass. Burnley, QPR, Palace, Stoke, West Ham, and Hull are all in the top 15% for long balls, while Arsenal, City, Swansea, Liverpool, Spurs, Chelsea, and Everton are in the bottom 15%. This wide split means there isn’t a big group of teams playing near the average English style (like there is in Germany, France and Spain), but Stoke-Villa is as close as it gets.

 

 

Where do we go from here? 

With more work, team profiles and similarity scores could be used to look at how teams and styles match up against one another. If we can see that Dortmund struggle more against teams who press them back than against teams who sit back and force play wide, you can alter your tactics (if you are a manager) or alter your bets. It’s another piece of information on top of shot data like expG: if Villarreal and Marseille had the same expG rating, you would know Dortmund was a better bet against Marseille’s style of expG than against Villarreal’s. Maybe teams that sit back and play long balls do great against teams that have high final third possession numbers, like the conventional wisdom says; maybe they don’t. Game-to-game and month-to-month changes in tactics and style could be tracked much more clearly. Similar styles could be mapped together to see if their shots or shots allowed differ from the norm, which could improve expG models.

One early example of this involves Swansea. I wrote about how expG models do not properly capture what Gladbach has been doing, so I was interested to see who was similar to them. They turned out to have a rather unique profile with not many similar teams, but the closest team was Swansea. Despite having a poor intra-box defense, the Swans track well with Gladbach. When I checked their goal numbers relative to expected goals, sure enough, they have been over-performing for 3 straight seasons now in my model. I haven’t done a deep dive into that yet, but it’s something I might not have seen without the similarity score.

These Team Style Profiles and Style Similarity Scores are good first steps, but there is lots of room for improvement, and without tracking data there are limitations. Should different metrics be chosen? There are pretty strong relationships between possession, field tilt, and box attacks, for example, so should they all go into the mix? Should the weight assigned to each metric when comparing with other teams be adjusted? What about teams who change styles often throughout games and seasons, like Thomas Tuchel’s Mainz? At the end of the year the stats only look one way, but that covers up a ton of variance; there needs to be a metric for flexibility for sure. Certainly changes will be made, one of the first being to improve field tilt so that it includes all completions and is not just a simple ratio of attacking/own third.

 

 

 

 

 
