When Columbus and Portland face off this evening in MLS Cup, it will be a clash between two of the better passing teams in the league. Both feature midfields heavy on ball control, with international-caliber players pulling the strings supported by a back line that likes to get forward.
In preparation for this game, I collected player-by-player passing summaries for each game the two teams played, starting from the last time they faced each other in late September. What I found indicates that fans of all stripes could be in for quite a treat.
Start with the basics
Including their game on September 26th, Portland has played ten games while Columbus has played only eight. Each team has played half of these games in league play, while the other half have come in the playoffs (Portland had a single knockout-round game against Sporting Kansas City, while Columbus enjoyed a bye through that round).
Columbus has used 19 players over their eight games, while Portland has used 20 over their ten games. The following two plots illustrate how those rosters have been used.
Portland’s lineup has shifted slightly more than Columbus’s, although each team has varied its look at times. The Timbers have been generally consistent in their defense, with Ridgewell and Borchers in the center flanked by Villafana and Powell. Kwarasey has played all but one game in goal. In the attack, Fanendo Adi is an ever-present name, although he is occasionally joined by Wallace, Asprilla, or Melano.
The Timbers’ formation, working from the graphics provided by MLS, shifts between a 4-3-3 and a 4-5-1, depending on how far up the wing midfielders (Wallace, Asprilla, or Melano) play. My intent to go back and watch more of the Timbers’ games went unrealized, unfortunately.
For Columbus, the shifts to their lineup have almost all been due to injury or suspension. Most Columbus observers could predict their starters easily. Sauro and Parkhurst anchor the defense, with Francis on the left and Afful on the right. Steve Clark will be in goal. In front of this back line stand Tchani and Trapp, linking everyone else on the field with accurate short- and medium-range passes. Federico Higuain roams the attacking half of the field, typically between Meram on the left and Finlay on the right. Kei Kamara is the lone forward.
Stepping into passing data
The heart of my explorations comes from passing data extracted from WhoScored. For each game played, I looked at the number of passes each player attempted and the number he completed. Dividing completed passes by attempted passes gives us a measure of passing accuracy.
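For anyone who wants to reproduce the accuracy figure, here is a minimal Python sketch of that calculation. The `PassingLine` record, its field names, and the sample rows are my own illustrative assumptions; they are not the WhoScored format or real match numbers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PassingLine:
    """One player's passing totals in one game (hypothetical record layout)."""
    player: str
    team: str
    attempted: int
    completed: int


def accuracy(line: PassingLine) -> Optional[float]:
    """Completed passes as a fraction of attempts; None when nothing was attempted."""
    if line.attempted == 0:
        return None
    return line.completed / line.attempted


# Invented rows for illustration only, not real match data.
sample = [
    PassingLine("Player A", "Columbus", 60, 54),
    PassingLine("Player B", "Portland", 40, 30),
    PassingLine("Player C", "Portland", 0, 0),
]

accuracies = {row.player: accuracy(row) for row in sample}
```

Returning `None` for a player with no attempts keeps an unused substitute from showing up as a 0% passer.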
One of my first steps was to generate a scatterplot comparing the number of passes attempted with their accuracy.
Each dot on the above plot represents a player’s performance in a single game. Yellow dots are Columbus players, and green are Portland. A few things immediately stand out from this exercise:
- Both Portland and Columbus feature some very accurate passers. Even after ignoring late-game substitutes who complete each of their very few passes, both teams feature players who complete more than 90% of their attempts.
- The most active Columbus players tend to attempt more passes than their counterparts from Portland. If you look at only those players who attempt at least 60 passes in a game, most of them come from Columbus.
- Portland has more representation among those players who are less accurate. In the lower half of the above plot, more of the players with under 60% accuracy seem to come from Portland, although Columbus also has some dots there.
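The observations above come down to filtering the dots by a threshold and counting what is left per team. A rough Python sketch of that counting follows; the `count_by_team` helper and every number in `performances` are invented for illustration and do not come from the actual data set.

```python
from collections import Counter

# (team, passes attempted, passes completed): invented numbers, not real match data.
performances = [
    ("Columbus", 72, 65),
    ("Columbus", 64, 55),
    ("Columbus", 18, 10),
    ("Portland", 61, 56),
    ("Portland", 25, 14),
    ("Portland", 30, 17),
]


def count_by_team(rows, keep):
    """Count the performances per team for which keep(attempted, completed) is true."""
    return Counter(team for team, att, comp in rows if keep(att, comp))


# Performances with at least 60 attempted passes.
high_volume = count_by_team(performances, lambda att, comp: att >= 60)

# Performances under 60% accuracy (zero-attempt rows are skipped).
low_accuracy = count_by_team(
    performances, lambda att, comp: att > 0 and comp / att < 0.60
)
```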
When faced with a scatterplot of this complexity, I’ve found that a contour plot can be extremely useful. These are helpful for looking at general structures and ignoring potentially distracting outliers. Following are two plots that present each team’s passing data as contours.
In this context, the narrative of two accurate-passing teams seems to survive intact. Both teams’ activity is concentrated at about the same accuracy, although both teams present outliers of greater activity (stretching to the right), and of lower accuracy (stretching down).
Overlaying these plots atop each other makes this point even clearer.
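Under the hood, a contour plot is drawn over some estimate of how densely the dots are packed. The sketch below is a crude, binned stand-in for that idea rather than the smoothing a real plotting tool applies; the bin widths, the `density_grid` helper, and the sample rows are all assumptions made for illustration.

```python
from collections import Counter


def density_grid(rows, attempt_bin=20, accuracy_bin=10):
    """Count performances per (attempts, accuracy %) cell of a coarse grid."""
    grid = Counter()
    for attempted, completed in rows:
        if attempted == 0:
            continue  # no accuracy to place on the grid
        pct = completed * 100 // attempted  # whole-number accuracy percentage
        cell = (
            attempted // attempt_bin * attempt_bin,  # lower edge of the attempts bin
            pct // accuracy_bin * accuracy_bin,      # lower edge of the accuracy bin
        )
        grid[cell] += 1
    return grid


# (passes attempted, passes completed): invented numbers, not real match data.
rows = [(45, 38), (52, 44), (48, 41), (70, 63), (12, 6)]

grid = density_grid(rows)
densest_cell, count = grid.most_common(1)[0]
```

The cell with the highest count is where the innermost contour would sit, and the sparse cells around it are the outliers that a contour view plays down.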
Looking at individual players
With team-level trends identified, the next step was to look within each roster for smaller-scale trends. Here a tool like Data Voyager can come in really handy, but we can also see some information from Excel and R.
Kwarasey for Portland is a much less accurate passer than Clark for Columbus, although the New York series was brutal for Clark in particular. Both teams now feature a very accurate passing defender, with Gaston Sauro matching the performances seen from Nat Borchers. Both teams enjoy very accurate passing from their holding midfielders, with Diego Chara approaching perfection and holding a slight edge over Wil Trapp. At the forward position, Fanendo Adi does a better job bringing his teammates into the game than does Kei Kamara.
These trends are even more evident when we return to the scatterplot, sorting by position.
(mental note – I need to figure out how to control dot color in Data Voyager)
There are some nice symmetries here. The advantage held by Columbus in passes from their goalkeeper is similar to that enjoyed by Portland at forward. A similar flip-flop can be seen between the midfield and defense.
Two last points
One thing that analysts love to look at is the impact that a manager has during a game, and an immediate target for this inspection is the role and performance of substitutes. This is an area where Columbus came in for heavy criticism early in the season, as the team didn’t get a goal from a substitute until late in the year.
In terms of passing specifically, however, Columbus seems to have an edge over their opponent. The following chart breaks down passing accuracy by team and role.
The box plot suggests that Gregg Berhalter has generally improved his team’s performance with his substitutes, bringing on players who are more likely to complete their passes (and thus retain possession), while Caleb Porter at Portland has had to sacrifice possession. Neither team is bad in this regard, but given the ability each team has shown in holding the ball, it is a bit of a surprise to see such a significant divergence here.
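A box plot is a picture of five numbers per group: the minimum, the quartiles, and the maximum. Here is a small Python sketch of how those could be computed by team and role. The group labels and accuracy percentages are invented, and the "inclusive" quartile method is just one of several conventions, so other tools may draw slightly different boxes.

```python
from collections import defaultdict
from statistics import quantiles

# (team, role, passing accuracy in %): invented numbers, not real match data.
rows = [
    ("Team A", "starter", 74), ("Team A", "starter", 78),
    ("Team A", "starter", 82), ("Team A", "starter", 86),
    ("Team A", "starter", 90),
    ("Team A", "substitute", 60), ("Team A", "substitute", 70),
    ("Team A", "substitute", 80), ("Team A", "substitute", 90),
    ("Team A", "substitute", 100),
]


def box_summary(values):
    """Minimum, quartiles, and maximum: the five numbers a box plot draws."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": q2, "q3": q3, "max": max(values)}


# Collect the accuracy values for each (team, role) pair, then summarize each group.
groups = defaultdict(list)
for team, role, pct in rows:
    groups[(team, role)].append(pct)

summaries = {key: box_summary(values) for key, values in groups.items()}
```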
The final inspection looks at the difference between regular season and playoff games. It has often been said in sports that the playoffs are an entirely different experience – intensity is increased, the games are faster and harder, and the quality of opponents is certainly better. Given this conventional wisdom, I wanted to look at how each team’s regular starters have changed their performance in the postseason.
This scatterplot includes the 23 players from both teams who have started at least three games. Each player’s passing accuracy in the final games of the regular season is plotted on the horizontal axis, while their accuracy in the playoffs is on the vertical axis. The diagonal blue line marks equal accuracy in both phases: players above it have improved in the postseason, and players below it have declined.
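The comparison behind a plot like this can be sketched in a few lines of Python. I am assuming here that each player's accuracy is pooled across games (total completed over total attempted) within each phase; averaging per-game accuracies would be an equally reasonable choice. The `phase_points` helper and the sample rows are invented for illustration.

```python
from collections import defaultdict

# (player, phase, passes attempted, passes completed): invented, not real match data.
rows = [
    ("Player A", "regular", 50, 40),
    ("Player A", "regular", 50, 42),
    ("Player A", "playoff", 60, 51),
    ("Player B", "regular", 40, 36),
    ("Player B", "playoff", 50, 40),
    ("Player B", "playoff", 30, 24),
]


def phase_points(rows):
    """Map each player to (regular-season accuracy, playoff accuracy)."""
    totals = defaultdict(lambda: {"regular": [0, 0], "playoff": [0, 0]})
    for player, phase, attempted, completed in rows:
        totals[player][phase][0] += attempted
        totals[player][phase][1] += completed
    points = {}
    for player, phases in totals.items():
        reg_att, reg_comp = phases["regular"]
        po_att, po_comp = phases["playoff"]
        if reg_att and po_att:  # skip players with no passes in either phase
            points[player] = (reg_comp / reg_att, po_comp / po_att)
    return points


points = phase_points(rows)

# A player sits above the diagonal when playoff accuracy (y) exceeds regular-season accuracy (x).
improved = sorted(player for player, (x, y) in points.items() if y > x)
```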
There is so much to unpack in this plot that I almost ditched everything else and just focused here. Both teams’ forwards have gotten better in the playoffs – Adi and Kamara are two of only six players to be able to make that claim.
Similarly, both teams’ goalkeepers have seen their performance suffer. In the case of Clark specifically, however, it should be remembered that the New York series is entirely the cause of his decline. His distribution was ~80% accurate in the eight games leading into the semifinals, and only ~25% accurate in those two games.
Ultimately, the four names I’m most drawn to are two midfielders from each team. Federico Higuain and Wil Trapp are two of the linchpins of the Columbus style, while the same can be said of Darlington Nagbe and Diego Chara in green. While Nagbe and Chara, like many other players, have seen their performance dip, they are still the most accurate passers on either team heading into the game. Trapp is typically the most-connected, most-accurate passer on the field for Columbus, but he may need to put on something special this evening. Higuain’s performance has been better over the last four games, and that may help relieve some pressure on the Columbus youngster as well.
Many of these plots got posted to Twitter this week as I wound my way through the data. Several others never saw the light of day, or got left on the editing table in the writing of this piece. If you’re interested in seeing all of the gory details, I set up a gallery on my Facebook page with everything.
The data used has been posted to GitHub. Tools used to produce these plots include R, Excel, Data Voyager, Plot.ly, and GIMP.