The third home game of the Columbus Crew’s 2014 season is coming up this weekend. The team will play DC United – one of the team’s rivals in the early days of MLS, but a rivalry that has significantly faded in recent years.
There are a number of on-field questions heading into this game, but I wanted to spend a bit of time revisiting the team’s attendance. To help me in this effort, I’ve started to dust off some of the bigger guns in my arsenal – OpenRefine and R. Using these tools I’ve assembled an attendance dataset covering the Crew’s 280 home regular season games from 1996 – 2013. The dataset is on GitHub – you can download it here.
Same process again
Before I get too far into the statistical weeds, however, I first put together a quick chart examining the Crew’s attendance during the month of April:
This chart is similar in structure to one I prepared before the second home game. A simple linear relationship (blue line) through these 40 data points indicates that the team gains (or loses) about 140 fans for each degree of temperature change this early in the season. Furthermore, if the current forecast of 58 degrees proves accurate (the vertical yellow line above), we might expect an attendance figure just below 15 thousand people.
One detail worth noting is that the relationship shown here is weaker than the one I published before game two. That earlier figure came with an R² value of 0.24 – not great, but much stronger than the 0.14 that emerges from this week’s data.
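For anyone who wants to try this kind of fit themselves, here is a minimal Python sketch of an ordinary least-squares line and its R² value. It is not the R code behind the chart; the function name `fit_line` and the sample numbers are placeholders for illustration, not values from the attendance dataset.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With a single predictor, R^2 is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Placeholder temperatures (degrees) and crowds, NOT real Crew data.
temps = [50, 60, 70]
crowds = [14000, 15400, 16800]
slope, intercept, r_squared = fit_line(temps, crowds)
print(f"{slope:.0f} fans per degree, R^2 = {r_squared:.2f}")
```

The placeholder points sit exactly on a line, so the fit is perfect. Real attendance data is far noisier, which is what an R² of 0.14 reflects.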
Which brings me to the new data set. For this exploration I’ve included several data points for each home game in addition to the attendance:
- Match date
- Average daily temperature
- Days since the last home game
- Number of games in this stadium
For now, I’ve looked at three potential factors that could impact attendance: the opponent, the days since the last home game, and the home stadium.
Of these factors, the Crew’s opponent seems to have one of the strongest impacts on attendance. This isn’t surprising given the strength of the Beckham effect. To see why, look at this box plot showing the range of attendance for each opponent:
Looking at this chart, it should be clear that certain opponents draw better than others. However, there is still significant overlap between some opponents – particularly clubs like Kansas City and Philadelphia.
Games against DC United have drawn pretty well – certainly better than the times when Chivas USA or Houston have visited. This tends to argue for a larger crowd than might otherwise be expected this early in the season, but I doubt that it will be enough for the Crew to completely break from the pattern of lower attendance over the first few games.
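A box plot boils each opponent's crowds down to a handful of numbers. As a rough sketch of that summary step, assuming the data arrives as (opponent, attendance) pairs, the hypothetical helper `box_stats` below computes the minimum, quartiles and maximum per opponent. The sample rows are placeholders, not real attendance figures.

```python
from collections import defaultdict
from statistics import quantiles


def box_stats(games):
    """Group (opponent, attendance) pairs and return box-plot statistics.

    Result maps opponent -> (minimum, q1, median, q3, maximum).
    Opponents with fewer than two games are skipped, since quartiles
    are not meaningful for a single point.
    """
    by_opponent = defaultdict(list)
    for opponent, attendance in games:
        by_opponent[opponent].append(attendance)

    stats = {}
    for opponent, crowds in by_opponent.items():
        if len(crowds) < 2:
            continue
        q1, median, q3 = quantiles(crowds, n=4, method="inclusive")
        stats[opponent] = (min(crowds), q1, median, q3, max(crowds))
    return stats


# Placeholder rows, NOT real attendance figures.
sample = [("Team A", 10), ("Team A", 20), ("Team A", 30),
          ("Team A", 40), ("Team A", 50), ("Team B", 25)]
print(box_stats(sample))
```

Two opponents overlap, in the sense used above, when the q1-to-q3 range of one intersects that of the other.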
Days since last home game
One additional factor that could boost the crowd size comes from the fact that the team hasn’t played at home in two weeks. Call it the “absence makes the heart grow fonder” phenomenon. Looked at another way, perhaps this is more accurately summarized as the idea that playing too many home games in quick succession cannibalizes interest.
In order to cut down on noise, I’ve grouped the Crew’s home games into six general categories, roughly by the number of weeks between home games. Visual inspection of the above violin plot indicates that the team does in fact draw better as home games are spaced out more – but that the threshold for significant gains is three or more weeks.
From this, it appears that this weekend’s game – coming two weeks after the last home game – is unlikely to receive a significant bounce because of the gap between games. Home games after a one-game road trip still draw around 15,000. There’s also a pretty significant spread under these conditions, suggesting that a number of other factors are more significant.
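The grouping step could be sketched along these lines. The exact bucket edges used for the plot are not spelled out here, so the six categories in the hypothetical `gap_category` function below are assumptions chosen only to show the idea.

```python
def gap_category(days):
    """Map days since the last home game to one of six week-based buckets.

    ASSUMPTION: the bucket edges are illustrative, not the ones used
    for the violin plot.
    """
    if days < 0:
        raise ValueError("days must be non-negative")
    weeks = round(days / 7)  # nearest whole number of weeks
    if weeks == 0:
        return "under 1 week"
    if weeks >= 5:
        return "5+ weeks"
    return f"{weeks} week" + ("s" if weeks > 1 else "")


# This weekend's game comes two weeks after the last home game.
print(gap_category(14))
```

Rounding to the nearest week keeps a Saturday-to-Sunday shift (13 or 15 days) in the same bucket as an exact two-week gap.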
The Crew’s home stadium has the weakest relationship to the Crew’s attendance figures – indicating that whatever factors influence attendance did not noticeably change when the club moved into Crew Stadium in 1999.
While the Crew drew some notably large crowds in Ohio Stadium, and has had some weaker crowds since moving to Crew Stadium, the majority of crowd sizes are roughly comparable in each stadium. In terms of making a prediction for this weekend, there isn’t a particular reason to exclude those early years from the model.
(As an aside, this is one of the reasons why I have grown to appreciate violin plots – for their ability to illustrate the distribution of data points among categories.)
Some Concluding Thoughts
- Out of all of these approaches comes the impression that this weekend’s attendance will be somewhere between 14,000 and 15,000. This would be a modest improvement over last season, when the third game of the season drew 14,090.
- Should the weather forecast continue to improve, there is the potential that the crowd could grow even larger. When I first started this article on Monday night, predictions were for a game-time temperature in the high 50s. Recently I’ve seen some reports that it could be in the mid-60s, which – looking at the first chart in this article – could mean that even 16,000 may not be out of reach.
- Coming into this game, the Crew’s (two-game) average attendance is 14,781. That is about 1,300 higher than last season, so the team has some breathing room if the goal is simply to improve year-over-year.
- Interestingly, last year’s third home game was also against DC United, and the temperature that day – April 27 – was 56 degrees.
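The temperature arithmetic behind the 16,000 figure can be sketched as below. The slope (about 140 fans per degree) and the 58-degree forecast come from the April chart; the baseline of 14,900 is an assumed stand-in for “just below 15 thousand”, and 65 degrees is an assumed reading of “mid-60s”.

```python
FANS_PER_DEGREE = 140    # slope quoted for the April chart
FORECAST_TEMP = 58       # degrees, the original forecast
BASELINE_CROWD = 14_900  # ASSUMPTION: stand-in for "just below 15 thousand"


def projected_crowd(temp):
    """Shift the baseline crowd along the fitted line to a new temperature."""
    return BASELINE_CROWD + FANS_PER_DEGREE * (temp - FORECAST_TEMP)


# ASSUMPTION: 65 degrees stands in for "mid-60s".
print(projected_crowd(65))  # 14,900 + 140 * 7 = 15,880
```

Seven extra degrees add roughly 1,000 fans under these assumptions, which is how a crowd near 16,000 comes into view. Given the weak R² behind the line, this is a rough guide rather than a forecast.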