Those of you who've suffered through my fanposts know I'm a fan of "top-down" analytics (which encouragingly is starting to pick up steam in the analytics community).
I've done them with a variety of twists, usually adding on a layer of complexity each time. The last time out I added a "Points Above Replacement" (PAR) calculation to the With/Without statistics.
This time I'm taking a step back from the "With/Without" part and replacing it with a weighted score, where the goals for and against are modified based on the player's position in each match. Put simply, a player's goals against "score" will be affected more if they are a defender than if they are a forward. Conversely, forwards get a bigger piece of the "credit" for the goals scored when they are on the pitch. Strictly speaking this sort of weighting violates the spirit of a true "top-down" metric, but it has been suggested before by the commentariat and I kind of wanted to see what it might look like anyway.
So first the basics. For each match we keep track of how many goals are scored and conceded while a given player is on the pitch. Goals are not all equal though - each one is normalized on two factors. The first is how many goals the opponent normally concedes or scores per match, and the second is whether that opponent is playing at home or away. Together those give the "expected" goals. The expected goals are then compared to what actually happened to create a "goals for" and a "goals against" score. In the table below those are the "gf_net" and "ga_net" columns.
As a simple example, if Aston Villa normally concede 3 goals a match on the road and LFC only manage to net 1, that means everyone on the field's gf_net will be (-2) for that game - they scored 2 fewer goals than expected. Goals against is just the opposite. If Aston Villa normally score 1 goal on the road and LFC concede 0, then ga_net is (-1) for that match, because we allowed one fewer goal than expected.
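For anyone who wants to see that arithmetic spelled out, here is a minimal sketch in Python. The function name `net_goals` is my own for illustration, and the inputs are just the Aston Villa figures from the example above.

```python
def net_goals(actual, expected):
    """Actual goals minus the opponent's expected figure for that venue."""
    return actual - expected


# Aston Villa on the road: normally concede 3 and score 1 (example figures).
gf_net = net_goals(actual=1, expected=3)  # LFC scored 1 where 3 were expected
ga_net = net_goals(actual=0, expected=1)  # LFC conceded 0 where 1 was expected

print(gf_net, ga_net)  # -2 -1
```

Every player on the pitch for that match picks up the same raw gf_net and ga_net; the position weighting only comes in later.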
So the twist this time around is we're weighting those scores. The weights I chose are frankly arbitrary - but I think they make sense. The nice thing is they are easy to change so if you have suggestions let me know.
Here's how I decided to weigh things:
For goals against:
Forwards: 0.05 (i.e. Sturridge is "responsible" for 5% of any goal we concede)
For goals scored:
You'll note the weights are mirror images of one another and they add up to 1.0 for both sides of the ball. One of the regrettable things (to me anyway) is how coarsely the positions are broken down: F, M, D, G (forward, midfielder, defender, goalkeeper).
Obviously this is not ideal - for example an attacking fullback should probably get a bigger piece of the credit for goals scored than your average CB whereas here they are lumped into 'D.' But this is what I'm limited to, having chosen ESPN for my data source.
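To make the weighting idea concrete, here is a rough sketch of how it could be applied per match. Only the 0.05 for forwards on goals against comes from the list above; the other three numbers are placeholders I picked so the weights sum to 1.0, and reading "mirror image" as "reverse the order" is also an assumption. The names `GA_WEIGHTS`, `GF_WEIGHTS` and `adjusted_gd` are hypothetical.

```python
# Share of each conceded goal charged to a position.
# Only F = 0.05 is from the post; M, D and G are PLACEHOLDER values.
GA_WEIGHTS = {"F": 0.05, "M": 0.20, "D": 0.35, "G": 0.40}

# Assumption: "mirror image" means the same values in reverse order,
# so forwards get the largest share of the credit for goals scored.
GF_WEIGHTS = dict(zip(GA_WEIGHTS, reversed(list(GA_WEIGHTS.values()))))


def adjusted_gd(appearances):
    """Sum weighted gf_net minus weighted ga_net over a player's matches.

    appearances: iterable of (position, gf_net, ga_net), one per match.
    """
    total = 0.0
    for position, gf_net, ga_net in appearances:
        total += GF_WEIGHTS[position] * gf_net - GA_WEIGHTS[position] * ga_net
    return total


# Same match (gf_net = -2, ga_net = -1) seen from two positions.
forward_score = adjusted_gd([("F", -2, -1)])
defender_score = adjusted_gd([("D", -2, -1)])
```

With these placeholder weights the forward takes most of the blame for the missing goals, while the defender is nearly bailed out by the clean sheet. Swap in different weights and the picture changes, which is kind of the point.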
So without delay (ha!), here's the 2015/16 table after 28 EPL matches for Liverpool:
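| name | games | minutes | gf_net | ga_net | gd | adj_gd | adj_gdpg | adj_ppg | adj_pps | multiplier | par |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Joao Carlos Teixeira | 1 | 3 | -0.04 | -0.06 | 0.02 | 0 | 0.02 | 0.017 | 0.63 | 0.001 | 0 |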
name, games, minutes: hopefully self-explanatory
gf_net: The sum of the goals scored minus the expected goals scored for each match for which this player was on the pitch. This is the raw number (no adjustments)
ga_net: the sum of the goals conceded minus the expected goals conceded for each match for which this player was on the pitch. This is the raw number (no adjustments). A negative number is good!
gd: gf_net minus ga_net
adj_gd: adjusted goal differential. For each match, gf_net and ga_net are multiplied by the weights discussed above for the position the player was playing, and the weighted ga_net is then subtracted from the weighted gf_net.
adj_gdpg: adj_gd per 90 minutes
adj_ppg: adjusted goal differential per 90 minutes (adj_gdpg) converted to points per game using the historical relationship between goal difference and points (0.779 * adj_gdpg).
adj_pps: adj_ppg extrapolated to a full 38 game season (adjusted points per season)
multiplier: The fraction of the total possible minutes that the player has actually played.
par: Points Above Replacement - figured by: multiplier * adj_pps
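The chain from adj_gd down to par can be sketched like this. The 0.779 factor and the 38-game season come from the definitions above. Treating "total possible minutes" as 90 times the number of matches the team has played is my assumption, and the function name `season_metrics` and the sample inputs are made up for illustration.

```python
POINTS_PER_GOAL = 0.779  # conversion factor quoted in the definitions
SEASON_MATCHES = 38      # full league season


def season_metrics(adj_gd, minutes, team_matches):
    """Turn an adjusted goal differential into the per-season columns.

    Assumption: total possible minutes = 90 * team_matches
    (stoppage time is ignored).
    """
    if minutes <= 0:
        raise ValueError("player has no minutes")
    adj_gdpg = adj_gd / minutes * 90             # per 90 minutes
    adj_ppg = POINTS_PER_GOAL * adj_gdpg         # points per game
    adj_pps = adj_ppg * SEASON_MATCHES           # points per season
    multiplier = minutes / (team_matches * 90)   # share of possible minutes
    return {
        "adj_gdpg": adj_gdpg,
        "adj_ppg": adj_ppg,
        "adj_pps": adj_pps,
        "multiplier": multiplier,
        "par": multiplier * adj_pps,
    }


# Made-up player: +2.0 adjusted goal differential over 1800 of 28 * 90 minutes.
example = season_metrics(adj_gd=2.0, minutes=1800, team_matches=28)
```

The multiplier is what keeps a three-minute cameo from topping the table: a big per-90 number gets scaled right back down if the player has barely featured.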
So there you have it. A top-down measurement of Liverpool's players, weighted by position. Some interesting things to note: Coutinho would have been ranked much higher by doing things the old way - weighting the scores hurts him a lot since he gets little "credit" now for the fact that fewer goals are conceded when he's on the pitch. Lallana too.
I looked at a lot of other teams using this method and things actually look about the way you'd expect. So much so that I decided to create a league "table" using these scores. To do that I took the average "par" for each team and multiplied it by 11 (because it goes to 11) (OK actually because there are 11 players on the field) and added that number to 52 (a historically meaningful number that I won't go into here) to get the "expected points" for each team. Here's what that table looks like:
| team | average par | expected points |
|---|---|---|
| West Ham United | 0.96 | 62.56 |
| West Bromwich Albion | -0.91 | 41.99 |
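If you want to check my math on the table, the expected points calculation boils down to one line. Here is a quick sketch; the constant and function names are my own, and the inputs are the two teams shown above.

```python
BASELINE_POINTS = 52   # the "historically meaningful number"
PLAYERS_ON_PITCH = 11  # because it goes to 11


def expected_points(avg_par):
    """Baseline plus eleven players' worth of average par."""
    return BASELINE_POINTS + PLAYERS_ON_PITCH * avg_par


print(round(expected_points(0.96), 2))   # West Ham United: 62.56
print(round(expected_points(-0.91), 2))  # West Bromwich Albion: 41.99
```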
Aside from the fact that the expected points are higher in nearly every case than what most models predict, it looks pretty good. Geez Everton, do you not like winning?
Thanks for reading.