A Family Tree of European Offenses

 

In my previous two posts on StatsBomb, I used passing data to create team profiles and find similarity scores, and then built on that to create a family tree relating how teams defend all across Europe. Today brings the offensive side of the ball to the forefront. You can read the full process of how these metrics are created and related in the previous two pieces, but I will give a quick run-through here.

 

The metrics used are:

Shot tempo: shots per pass (3 highest tempos: QPR, Leverkusen, Crystal Palace. Lowest: PSG, Bayern, Manchester United)

Box activity: how often a team passes into the box per game (3 highest: Bayern, Dortmund, Man City. Lowest: Nantes, Bastia, Cordoba)

Intra-box success rate: completion % of passes that start and end inside the box (3 highest: Man City, Lyon, Bordeaux. Lowest: Hertha, Koln, Athletic Bilbao)

Centrality: % of completions in middle of pitch when in final third (3 highest: Torino, Dortmund, Hoffenheim. Lowest: Real Sociedad, Atletico Madrid, Levante)

Possession: share of possession (3 highest: Bayern, Barcelona, PSG. Lowest: Palace, Hertha, Eibar)

Forward play: % of completions that are forward (3 highest: Paderborn, Marseille, Hoffenheim. Lowest: Manchester United, Roma, Manchester City)

Field tilt: how far up the pitch the average pass is completed at (Highest: Man City, Barcelona, Chelsea. Lowest: Augsburg, Torino, Rennes)

 

 

 

There are also a few new ones for offense.

Instead of a simple long ball %, two metrics replace it:

Penalty box entry length: the average length of the passes with which a team enters the box (Shortest: Barcelona, Arsenal, Manchester City. Longest: Eibar, Evian, Levante)

Playout length: average length of completions from deep in own half (shortest: PSG, Cagliari, Inter. Longest: Burnley, Eibar, QPR)

 

Also new are:

Red Zone%: % of passes that are completed within 20 yards of the opposition goal (highest: Man City, Leverkusen, Burnley. Lowest: Cordoba, Elche, Levante)

Diagonals: added thanks to this treatise from Adin Osmanbasic, it is a measure of what % of a team’s passes are long and diagonal (highest: Bayern, Lyon, Lazio. Lowest: Palace, Palermo, Sunderland)
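To make the metric definitions above concrete, here is a minimal sketch of how a few of them could be computed from raw pass records. Everything about the data layout is an assumption for illustration: the `Pass` record, the 120 x 80 yard pitch with the attack running toward x = 120, the penalty box dimensions, and the choice to count shot tempo against attempted passes are mine, not a description of the actual data pipeline. Diagonals are left out because the thresholds for "long" and "diagonal" are not stated here.

```python
import math
from dataclasses import dataclass
from typing import List

# Assumed pitch model in yards; the attacking team shoots toward x = PITCH_LENGTH.
PITCH_LENGTH = 120.0
PITCH_WIDTH = 80.0
BOX_DEPTH = 18.0
BOX_WIDTH = 44.0
RED_ZONE_RADIUS = 20.0  # "within 20 yards of the opposition goal"


@dataclass
class Pass:
    """Hypothetical pass record: start point, end point, completion flag."""
    x0: float
    y0: float
    x1: float
    y1: float
    completed: bool


def completions(passes: List[Pass]) -> List[Pass]:
    return [p for p in passes if p.completed]


def in_box(x: float, y: float) -> bool:
    """True if the point lies inside the opposition penalty box."""
    return x >= PITCH_LENGTH - BOX_DEPTH and abs(y - PITCH_WIDTH / 2) <= BOX_WIDTH / 2


def shot_tempo(shots: int, passes: List[Pass]) -> float:
    """Shots per attempted pass (assumption: attempts, not completions)."""
    return shots / len(passes)


def forward_play(passes: List[Pass]) -> float:
    """% of completions that end further up the pitch than they started."""
    done = completions(passes)
    return 100.0 * sum(1 for p in done if p.x1 > p.x0) / len(done)


def field_tilt(passes: List[Pass]) -> float:
    """Average distance up the pitch at which passes are completed."""
    done = completions(passes)
    return sum(p.x1 for p in done) / len(done)


def box_entry_length(passes: List[Pass]) -> float:
    """Average length of completed passes that start outside the box and end inside it."""
    entries = [p for p in completions(passes)
               if in_box(p.x1, p.y1) and not in_box(p.x0, p.y0)]
    return sum(math.hypot(p.x1 - p.x0, p.y1 - p.y0) for p in entries) / len(entries)


def red_zone_pct(passes: List[Pass]) -> float:
    """% of completions that end within RED_ZONE_RADIUS of the centre of the goal."""
    done = completions(passes)
    goal_x, goal_y = PITCH_LENGTH, PITCH_WIDTH / 2
    near = [p for p in done
            if math.hypot(goal_x - p.x1, goal_y - p.y1) <= RED_ZONE_RADIUS]
    return 100.0 * len(near) / len(done)
```

Each function takes one team's passes, so building a full profile would just mean calling them once per team and collecting the results in a table.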

 

Are these the best metrics for judging a style of play? Almost certainly not; finding those will be a long process full of tweaking and testing. Right now, I feel satisfied this gives us good groupings, as the variables are generally measuring different things (none correlate above .5 with each other) and each measures some reasonably distinctive part of the game. I’ve weighted some metrics more (shot tempo, box attacks, possession, intra-box success rate) and some less (diagonals, forward play) before running these analyses.
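The two checks described above, that no pair of metrics correlates above .5 and that some metrics count for more than others, can be sketched roughly as follows. This is an illustration under assumptions: standardizing each metric to z-scores and multiplying by a weight is one common way to weight variables before clustering, not necessarily what was done here, and the function names and any weight values are hypothetical.

```python
import math


def zscores(values):
    """Standardize a metric to mean 0 and standard deviation 1.

    Assumes the metric is not constant across teams (sd > 0).
    """
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]


def pearson(a, b):
    """Pearson correlation between two metrics measured on the same teams."""
    za, zb = zscores(a), zscores(b)
    return sum(x * y for x, y in zip(za, zb)) / len(a)


def max_abs_correlation(columns):
    """Largest absolute pairwise correlation among the metrics.

    columns maps metric name -> list of values, one per team, same team order.
    """
    names = list(columns)
    return max(abs(pearson(columns[m], columns[n]))
               for i, m in enumerate(names) for n in names[i + 1:])


def weighted_profiles(columns, weights):
    """Return one weighted z-score vector per team.

    Metrics missing from weights get weight 1.0. Scaling a metric by w
    scales its contribution to squared Euclidean distance by w ** 2.
    """
    scaled = {name: [weights.get(name, 1.0) * z for z in zscores(vals)]
              for name, vals in columns.items()}
    n_teams = len(next(iter(columns.values())))
    return [[scaled[name][i] for name in columns] for i in range(n_teams)]
```

With the real table, a check like `max_abs_correlation(columns) < 0.5` would confirm the claim above before the weighted profiles are handed to the clustering step.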

 

The first analysis was a k-means cluster analysis using those metrics to group similar teams together. “K” is the number of groups and there is always a debate as to how many you should choose. I ran the analysis with k ranging from 12 to 35 and then looked at how much variance was explained by each one. 20 was where the unexplained variance seemed to stop decreasing consistently and, since I used 20 groups in my defensive piece, I was happy to go with 20 again. If you choose a different k, you will get teams shuffled around a bit within groups, as certain teams are barely part of one group and could be moved to another without much concern. Once the teams had been grouped, I ran an agglomerative hierarchical clustering on those group metrics to create a tree graph relating all the groups of teams across Europe. The tree graph as a whole is below, then I will go through each branch for a quick overview.
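For readers who want to try the k selection step themselves, here is a small self-contained sketch of k-means plus the "variance explained for each k" comparison. It is not the code behind this post: it is plain Python, and it swaps the usual random initialization for a deterministic farthest-point initialization so that results are reproducible. The function names are mine.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two profile vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic start: first point, then repeatedly the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm. Returns (labels, centers, within-cluster sum of squares)."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each team goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # converged: no team changed group
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a group empties out
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    wcss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, wcss


def explained_variance_by_k(points, ks):
    """Share of total variance explained by the grouping, for each candidate k."""
    total = kmeans(points, 1)[2]  # with one group, WCSS is the total sum of squares
    return {k: 1.0 - kmeans(points, k)[2] / total for k in ks}
```

Calling `explained_variance_by_k(profiles, range(12, 36))` on the weighted team profiles and looking for where the gains flatten out mirrors the comparison described above. Because k-means only finds a local optimum, a different initialization can shuffle borderline teams between groups.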

 

 

 

 

We will start with the 5 groups at the top.

 

From top to bottom:

Group 6: Lazio, Sampdoria, Torino

is closely related to

Group 5: Genoa, Frankfurt, Newcastle, Parma, Malaga, Villarreal

 

We start out with an enormous yawn. A bunch of solidly midtable teams along with Newcastle and Parma (whose defenses were historically awful, offenses weren’t nearly as bad). These teams have few standout characteristics either way. They do have well above average shot tempos and are good at passing inside the box. The main difference is Group 6 plays extremely centrally and plays a lot of diagonal balls while Group 5 plays higher up the pitch and spends a high % of their time in the “red zone” (within approximately 25 yards of goal).

 

Group 8: Palermo, Atletico Madrid

Don’t play in the center and essentially never play diagonal passes. Have a very high field tilt, yet are well below average at time spent in the red zone and box attacks. I do wonder which metrics are mainly manager-related and which are player-related and if it’s even possible to separate them satisfactorily. If you have Messi and Neymar, you will hit a lot of diagonals and have a great intra-box passing % no matter what, even if they play for Diego Simeone, right? I tend to think Atletico rarely play diagonal passes because so many attacks go through the wings, where there is only one way to play a diagonal ball and that’s into the teeth of the defense. The fact Atletico don’t commit many men to attack and seem to set up defense first means there will be fewer options to hit across the field. Hopefully a piece on variance of styles throughout the season can help find manager effect.

Group 3: Cesena, Hull City, Guingamp, Metz, Montpellier, Almeria

and

Group 1: Atalanta, Chievo, Burnley, Palace, Leicester, QPR, West Ham, Bremen

 

These teams rarely play with the ball and play it long repeatedly. Group 1’s shot tempo is by some distance the highest of any group, and they generally spend a lot of time in the red zone and are above average at putting balls into the box. Group 3 has neither of those last two positive metrics, providing the difference.

 

West Ham at some point this season were being mentioned as a team that might break into the top 5 and were good enough to qualify for Europe (I guess they did, but I doubt many of those pundits were eyeing the Fair Play Table at the time). They wound up as a pretty poor team playing similarly to a lot of other poor teams.

 

 

 

 

 

Group 13:  Stoke, Toulouse, Mainz, Augsburg, Hamburg, Freiburg, Stuttgart

and

Group 9: Sassuolo, Udinese, Verona, Koln, Hertha, Paderborn

 

Here we find most of the bottom of the German table. We find teams in this group play normal length passes deep in their own half but play long balls into the box. They have low possession rates, low field tilt and very low intra-box success rates. Group 13 has the ball and tests the box a lot more than group 9.

Group 16: West Brom, Caen, Lens, Athletic, Espanyol, Real Sociedad

and

Group 14: Sunderland, Bastia, Cordoba, Deportivo, Eibar, Getafe, Granada, Levante

These teams are even worse at passing inside the box and couple it with rarely getting the ball to the box. They generally play long balls throughout the entire field, don’t play centrally, and don’t have a large share of the ball. Group 16 has a higher share of the ball, plays shorter passes into the box, and has a significantly higher field tilt. Last year David Moyes was in the Champions League managing Manchester United against Bayern Munich and he ends this season lumped in with Caen and West Brom by some guy on StatsBomb. What a fall.

Group 17: Evian, Nantes, Reims, Elche

and

Group 15: Swansea, Bordeaux, Lille, Lorient, Nice, St Etienne, Valencia

Here we have the patient, pick-your-spots teams. These teams breach the box at very low rates, but break into the box using short passes (group 15 significantly shorter) and complete a very high rate of their intra-box passes. Group 15 sees more of the ball while Group 17 plays mainly through the wings.

 

Group 11: Aston Villa, Rennes

Low possession teams generally play directly and shoot quickly when they get the ball. They don’t have the quality to hold the ball and play intricately so seem to rush the ball up the pitch and fire. Aston Villa and Rennes do not:

They couple their low shot tempo with the longest average pass length of any group when entering the ball into the box and below average intra-box pass success rates. So they aren’t picking and choosing prime spots, but seem to simply have a lot of useless completions that don’t get them closer to the box or a shot. Not pretty.

Group 18: Marseille, Wolfsburg, Rayo Vallecano

An interesting group here with two high-pressing defenses in Marseille and Rayo. These teams have the lowest average field tilt of any group and a high rate of forward play. They play short passes at both ends of the field and tend to play through the wings in the attacking third. It’s a strange profile as they almost play counter-attack football with very high possession rates. I am guessing the fact the game has become very stretched in Marseille and Rayo’s case leads to many of these numbers. When the other team is wide open you can play forward and don’t spend a lot of time passing it around against a set box (which would raise your field tilt rating).

Group 7: Milan, Schalke

If nothing else comes from this, I think the grouping absolutely got this one right. On a gut level this just feels perfect. Two big-budget teams who performed absolutely dreadfully this season (barring maybe the most bizarre game of the season in Madrid). Slow, ponderous play that rarely gets the ball near the goal or tests the box does not make for good watching. For good measure they are atrocious at passing inside the box. At least they don’t hit a lot of long balls, right?

 

After this group there is a big gap. Look back up at the main tree and you will see there isn’t much similarity between groups 7 and 18 and then 20.
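The "big gap" here is the height at which two branches of the tree finally join: the later and higher the merge, the less alike the groups. A minimal sketch of agglomerative clustering that records those merge heights is below. The linkage rule (average linkage on Euclidean distance) is an assumption, since the linkage used for the tree is not stated in this post, and the function names are mine.

```python
import math
from itertools import combinations


def euclid(a, b):
    """Euclidean distance between two group profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, profiles):
    """Mean distance over all pairs with one member in each cluster."""
    total = sum(euclid(profiles[i], profiles[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(profiles):
    """Naive agglomerative clustering.

    profiles maps group name -> profile vector (for example the k-means centers).
    Returns the merges in order as (members_a, members_b, height).
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        # Find the two closest clusters and join them.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], profiles))
        merges.append((a, b, average_linkage(a, b, profiles)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges
```

Reading the returned list from first to last gives the tree from its leaves to its root; a large jump in height between consecutive merges is exactly the kind of gap visible in the graph.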

 

 

 

Group 20: Bayern, Barcelona, Celta Vigo

and

Group 4: Empoli, Inter, Roma, Spurs, Juventus

Now we start to get to the high possession, highly effective offenses clustered here at the bottom of the tree. Celta Vigo kind of stand out here and while they certainly don’t reach the heights of Bayern or Barca, they style themselves similarly. They are good inside the box, hold the ball at very high rates, attack the box a lot and play short passes to enter the box. When you combine this offensive style with the crazy Bielsa pressing tactics and the fact that he took the impeccably named Chilean team O’Higgins to their first league title ever back in 2013, Celta manager Eduardo Berizzo should at least be looked at for bigger and better jobs in the coming years.

 

Inter and AC Milan finished near each other in the table and are linked together in my generally EPL-centric mind but actually played very differently (a high box entry pass length bar here actually refers to a short pass in what was an astoundingly poor design choice):

 

Another team who is interesting in how they profile with these metrics is Empoli. Their defense was mixed in with Fiorentina, PSG, and Manchester United and now their offense reaches high class company as well. Kind of strange for a team that won 8 games all year and finished 15th a year after promotion from Serie B. Without knowing much else about him except these profiles, I’d wager that Maurizio Sarri will be another manager to watch going forward. And as soon as I typed that sentence I scrolled down on his Wiki page to find out that he has been confirmed as the new Napoli manager. Another instance of the profiles running ahead of my knowledge. To get a team with no players on more than $300,000 yearly salary to play like this is quite an achievement.

Group 12: Everton, Manchester United, Lyon, Monaco, PSG, Hannover, Gladbach

 

An imaging error has led to the Gladbach logo being left off. Any Foals fans feeling left out, please go read my long investigation into the entirety of Gladbach and get back to me. These teams have very slow developing attacks that don’t get up the field at high rates. They are very good at passing inside the box.

 

 

 

 

 

Group 10: Arsenal, Man City, Liverpool, Chelsea, Southampton

Saints and Liverpool just barely make this group, but it shows the serious stratification of the EPL once again. These teams pepper the box (Saints 62nd percentile, all others 78+) from central areas (all above 80th percentile), with short passes (all above 72nd percentile) and are great at completing passes once inside the box (each team above 80th percentile). Chelsea and especially Arsenal and City are near the top of Europe at all of these things but Liverpool and Saints are like the little brothers who are doing what their big brothers do, just not quite as well.
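The percentiles quoted above are just each team's rank on a metric relative to every other team in the sample. A quick sketch, assuming "percentile" means the share of teams with a strictly lower value (other conventions exist, and the exact one used for these figures is not stated):

```python
def percentile_rank(value, population):
    """Percentage of the population with a strictly lower value."""
    below = sum(1 for v in population if v < value)
    return 100.0 * below / len(population)


def team_percentiles(metric_by_team):
    """Map each team to its percentile rank on one metric.

    metric_by_team maps team name -> metric value.
    """
    values = list(metric_by_team.values())
    return {team: percentile_rank(v, values) for team, v in metric_by_team.items()}
```

For a metric where shorter is the interesting direction, such as box entry length, negating the values before ranking would make a high percentile mean a short pass.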

Group 2: Cagliari, Fiorentina, Napoli, Real Madrid

Only PSG played out of the back using shorter passes than Cagliari. As they play in Serie B next season, they can take some solace in that stat and in the fact their offense was grouped with these 3 teams. The main problem was that they allowed 68 goals and their defense was grouped with QPR, Burnley, and Chievo.

Group 19: Dortmund, Leverkusen, Sevilla, Hoffenheim

 

The crazy uncles who seem to have little relation to anyone else. Usually shot tempo is inversely correlated with possession, as the chart with Aston Villa and Rennes showed earlier. Here we see it flipped the other way: teams with high possession who still fire a lot of shots per pass.

 

This group also has a higher % of their passes in the “red zone” (the 25 yards or so nearest the goal) than any other group. They pepper the box more than anyone bar the Bayern/Barca group and play extremely centrally.

 

Going forward

I think there is great potential with this type of analysis, especially once we begin to drill down into style vs style or game to game analysis. Tom Worville thought it could be used for transfers for teams looking for flexible players or possibly players with experience playing the style they wanted to. I am not sure if I would feel comfortable basing player analyses on this broad, team-level data right now but certainly it could be good for a starting point. For example, if Sunderland is looking to fix their offense maybe they would study what Athletic Bilbao does differently. Since Bilbao does a lot of things similarly to Sunderland, the differences might be easier to reach than studying Arsenal or Barcelona. At the very least, a quick glance at these graphs can make anyone much more knowledgeable about the game across Europe and then decide what to look at further from that. For example, I had no idea about Empoli or Celta Vigo’s style of play before a week or so ago. Now I will keep my eye on Maurizio Sarri and Eduardo Berizzo going forward despite having not watched more than 30 minutes of those two teams total in the previous year.

And again, this is a rough guideline. If you change any of the metrics or the number of clusters you would get slightly different results. Southampton and Liverpool were close to breaking off into a separate group from the big 3 English offenses, Swansea was close to joining group 17, Bremen and Hannover are only loosely attached to their groups and several more things might have changed. These groups are not set in stone at all.

 

Discussion

If you have any questions, criticisms, comments or want to discuss this further you can reach me on twitter @SaturdayonCouch or post a comment on my blog. I’d love to discuss.

 

 


5 thoughts on “A Family Tree of European Offenses”

  1. Hey, these were very interesting, both defence and attack posts. I wonder if you could share the full defensive and attacking profiles of all teams? I mean the data you share here for some teams: field tilt, shot tempo, box attack etc. I would like to see that data for each team and not only your final conclusion that certain team has “high possession and low shot tempo”, for example, so I could maybe take my own look at that.


  2. And yet, you’d probably have the data to run this same analysis on previous seasons. That would be interesting, because it would show if there are continuities in the way clubs, managers, or even perhaps dominant players set their teams to play. Mostly, the managers. For example, did any significant change of style in Aston Villa’s game happen when Lambert was replaced, or is it just that Villa’s resources and club culture really force that weak style of game. And it would be perhaps even more interesting to see whether the manager of Villa’s play style cousins Rennes, Philippe Montanier, played the same football at Sociedad, or whether it is the resources or lack thereof at Rennes that have got him playing that way. And how he played at Sociedad when he got better resources. Why that is particularly interesting is that they may be getting Gourcuff at Rennes. So, assuming he stays reasonably fit (big presumption), would he allow them to change the style of game, or would a player of his calibre just go to waste in a team like that.

    Indeed, Villa’s and Rennes’ game looks like a type of game that tries to do something but fails at it. How would they maybe play if they were successful at it? In my honest opinion, these are interesting questions, and I assume you’d have the data and the tools to give at least suggestive and approximative answers.


    1. you bring up some great points and great topics to study. certainly changing managers or managers moving will be one thing on my to-do list to look at (I do not have full old data yet, but hopefully can get it soon).

      And I think you are right about Villa and Rennes, they don’t want to completely give themselves over to the rush it forward strategy but wind up essentially accomplishing nothing good offensively. They either struggle to even find spots to hoist long balls into the box, or are trying to play the ball slowly up the field so they can move it closely to the box and just fail at doing so. Maybe if you are trying to build a quality team for the future it is worth struggling like this to build good fundamentals and thought patterns in your players and when you sign one or two new players all of a sudden you are more able to play how you want and don’t have to change everything, I don’t know. It will be interesting to see if one really good player will allow them to change, or if it takes more.

      It is very hard to know the answers to the questions of who is responsible for how a team is playing, the manager, the players, what specific players, etc. There are essentially unlimited things to study in soccer and at least publicly statistics are so far away from providing us with any sort of statement on player value. The coming years should bring so much advancement in these areas, it will be a great time to be a soccer fan.

      Thanks for your post, hopefully this summer I can look at in-season managerial changes and see how teams changed.


      1. You make a good point. Both Villa and Rennes are clubs where the manager was given time to rebuild the team, so maybe they have calculated it as a worthy risk to try something that will be great if it succeeds even if it fails first. Villa had patience with Lambert for several seasons because he was supposed to build a new young team, and when Montanier first arrived to Rennes, they rebuilt the squad almost completely and have been patient for the first two seasons at least. Both teams have shown glimpses of some good football in individual matches during last two seasons. And both of them try to strengthen their teams now with important signings. The question about how much and in which ways they can change their direction with new players is indeed a tricky one, as is evaluating transfers on this basis, but maybe a general look at these trees can give us some idea.

        From the little I see in your family trees and in play in general, I’d be cautiously more positive about Rennes than Villa, assuming that Montanier is trying to create a sort of patient build up and a defence in the mould of Monaco or Lorient. In fact, how much would you have to tweak their stats to make it Lorient’s game? I think they may be failing at being Lorient at the moment, but may have a reason to try to be a sort of better version of Lorient, or to play a similar game with better players, seeing that Rennes is a sort of big brother for Lorient and definitely a team with better resources. But it may well be that at the moment they even have worse players. Regarding that, it would make sense to get a playmaking, technically apt central midfielder – like Gourcuff, and if they really get him, it would be interesting to check it as a hypothesis next season if Gourcuff enables them to play that “pick your spot” attack (and if that’s indeed what they try to do).

        Villa at the moment just seems to be having the worst of both worlds: the defence of QPR and the attack of, well, Villa. They’re rumoured to be signing Micah Richards after his strong season at Fiorentina, but if we look at your family trees, Fiorentina’s defence where Richards fit in well seems to be almost the exact opposite to the defence that Villa has played. If Villa continues to play like that, can we really expect to see the best of Richards? Or rather to see him flop? Or is this a sign that Sherwood wants to develop his team towards a more aggressive, attacking and proactive direction? But can a ball playing, aggressive defender really be enough to turn the team’s attacking and defensive ways? A central midfielder would again make more sense and rumour has it that they may sign Huddlestone. If their direction was slow build up à la Swans, I’d expect it to require a more creative and attacking playmaker, whereas if the solution to their attacking problems is Huddlestone, it would suggest they actually want to resort to a long ball game à la Hull. But what are they doing with Micah then? Maybe bigger transfers are coming, or Sherwood builds the attack around Grealish?


  3. Like, for example, you can show that Gladbach actually has played this same strange and strangely efficient style under Favre for several seasons.

