Playing with models.

It’s true: Cubs choke while Yankees surge.

Posted in Sports, Statistics by Alexander Lobkovsky Meitiv on March 24, 2010

1969 Chicago Cubs Photo with autographs

The infamous chokers: 1969 Chicago Cubs

Fans of the Chicago Cubs know this scenario all too well. The Cubs, a great team, have a decent season only to slump and choke at the end. You only have to google the words “Cubs choke” to come up with dozens of websites lamenting the numerous heartbreaks. Cubs fans have developed a kind of fatalistic gloom as a coping strategy.

Then there are the Yankees everyone loves to hate. They seem to elevate their game at the end of the season and transform from a good team into one of hall of fame greatness. It’s as if they weren’t giving it all they’ve got during the regular season. It seems that they play just well enough to get into the postseason, only to turn on the afterburners and blow everyone away.

Are these notions fiction perpetrated by fans or fact based on evidence?

We are in a position to test these hypotheses using a scientifically sound ELO rating system. Using publicly available match data, I compiled ELO ratings for all baseball teams going back to 1874. To refresh your memory, an ELO rating is a number that estimates the true strength of a team based on all of its previous games. It is supposed to track the current strength accurately and in an unbiased manner.

The ratings of the 1977 Yankees and the 1969 Cubs

Variation of the ELO rating of the Cubs and Yankees during the 1969 and 1977 seasons.


The graph of the infamous 1969 Cubs choke and of the Yankees’ 1977 season, in which they won the World Series after being 51-44 in July and ranked #3, seems to support the “Cubs choke, Yankees surge” hypothesis.

Is this true in general or is it just a lucky or unlucky break?

Well, here is where data analysis can fully demonstrate its magic. The numbers don’t lie, as long as whoever crunches them doesn’t either.

I computed the difference between the rating of each team at the end of the season and its rating on September 1st of the same season. I then averaged this late season rating change over the last 47 years (since the 1961 expansion of the leagues from 16 to 20 teams). Finally, I tested the result against the hypothesis that the rating change is purely random. This test weeded out the teams whose late season rating change could have resulted from purely random rating fluctuations. The remaining teams’ late season change is statistically significant and therefore unlikely to be a fluke.
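
The post does not say which statistical test was used, so here is only a minimal sketch of one possible approach: a sign-flip permutation test of the hypothesis that a team's average late season change is zero. The function names, the number of resamples, and the sample ratings are all made up for illustration.

```python
import random
from statistics import mean


def late_season_changes(seasons):
    """seasons: list of (rating_on_sept_1, rating_at_season_end) pairs."""
    return [end - sept1 for sept1, end in seasons]


def sign_flip_p_value(changes, n_resamples=10_000, seed=0):
    """Two-sided p-value for 'the mean change is zero' via random sign flips.

    Under the null hypothesis a change of +x is as likely as -x, so we
    flip signs at random and count how often the resampled mean is at
    least as extreme as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(changes))
    hits = 0
    for _ in range(n_resamples):
        resampled = mean(c if rng.random() < 0.5 else -c for c in changes)
        if abs(resampled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_resamples + 1)


# Made-up ratings for illustration only, not real team data.
example = [(1520.0, 1508.0), (1495.0, 1489.0), (1510.0, 1497.0),
           (1530.0, 1524.0), (1502.0, 1490.0), (1515.0, 1511.0)]
changes = late_season_changes(example)
print(mean(changes), sign_flip_p_value(changes))
```

A team would be kept in the final chart only if its p-value fell below whatever significance threshold one chooses; the threshold is another choice the post leaves unstated.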

The result clearly supports the “Cubs choke, Yankees surge” hypothesis.

Bargraph of the average late season rating change.

The average late season rating change for 9 teams whose rating change is statistically significant.

Legend:
ANA: Angels
CHW: White Sox
OAK: Athletics
STL: Cardinals
ATL: Braves
TEX: Rangers
CHC: Cubs
NYY: Yankees
DET: Tigers

Basketball Time Machine

Posted in Sports, Statistics by Alexander Lobkovsky Meitiv on March 10, 2010
1986 Celtics

1997 Bulls

What would you do with a time machine? I bet some people would be chomping at the bit to pit two dominant teams from different eras against each other and have a grand old spectacle!
But alas, it is safe to say that a time machine will remain for the foreseeable future in the realm of magic.

Can we get a glimpse of what the outcome of such a magical game might be? Is there a scientifically sound way to rate sports teams that judges their true strength? Most importantly, we need a method that yields ratings whose scale does not change with time, so that a team that got a rating of 2000 thirty years ago is as strong (in some sense) as a team that gets a rating of 2000 today.

We are indeed in luck! Such a system exists. It was proposed in the 1950’s by Arpad Elo, a Hungarian-born American physicist (read about him on Wikipedia), and bears his name. His system is based on sound mathematical theory, and since then dozens upon dozens of mathematical papers have examined how reliable and reasonable it is. Although Elo originally proposed his system to rate chess players, it has been adopted by a number of sports bodies including FIDE, FIFA, MLB, EGF and others.

At the core of the ELO system is the rating update scheme, which adjusts the ratings of the two teams (or players) after each match depending on the result. Given the ratings before the game, one can compute the probability of each outcome, assuming that the actual performance follows a certain probability distribution. If the stronger team wins, its rating rises by less than the weaker team’s rating would rise after an upset. There are many different specific incarnations of the system. While some are more accurate than others, the system is quite useful even in its simplest form. In fact, using publicly available match data we can resolve the question:
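
To make the update scheme concrete, here is a minimal sketch of the simplest textbook form of the system. The logistic curve with a scale of 400 and the step size K = 20 are assumptions borrowed from common chess practice; the post does not state which variant or constants it uses, so this will not necessarily reproduce the ratings quoted here.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that A beats B under the logistic Elo curve (assumed form)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def update(rating_a, rating_b, a_won, k=20.0, scale=400.0):
    """Return the new (rating_a, rating_b) after one game.

    k is the step size and is an assumed value, not one taken from the post.
    """
    exp_a = expected_score(rating_a, rating_b, scale)
    score_a = 1.0 if a_won else 0.0
    # The surprise (actual minus expected result) drives the adjustment.
    delta = k * (score_a - exp_a)
    # Whatever A gains, B loses, so the total rating is conserved.
    return rating_a + delta, rating_b - delta


favorite, underdog = 2100.0, 1900.0
print(update(favorite, underdog, a_won=True))   # favorite gains a little
print(update(favorite, underdog, a_won=False))  # favorite loses much more
```

Because the adjustment is proportional to how surprising the result was, an expected win moves the ratings only slightly while an upset moves them a lot.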

If the 1997 Chicago Bulls played a best of 7 series against the 1986 Boston Celtics, what are the chances of each team winning?

After downloading the match data (56,467 games over 64 years involving a total of 53 franchises, some of which changed names and cities a number of times) and computing the rating history, I came up with the top ten highest rated franchises:

Rank  Team                 Year achieved  Rating
1     Chicago Bulls        1997           2233.7
2     Boston Celtics       1986           2184.9
3     Los Angeles Lakers   1988           2163.3
4     Philadelphia 76ers   1983           2149.2
5     Detroit Pistons      1990           2137.4
6     Utah Jazz            1999           2129.9
7     Dallas Mavericks     2007           2126.5
8     San Antonio Spurs    2007           2089.4
9     Milwaukee Bucks      1971           2081.6
10    Seattle Supersonics  1996           2076.5
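
A table like this comes from replaying every game in date order and remembering each franchise's highest rating along the way. The sketch below shows the idea on a toy schedule; the starting rating of 1500, the step size K = 20, the logistic curve, and the (year, winner, loser) input format are all assumptions for illustration, not details taken from the post.

```python
from collections import defaultdict


def peak_ratings(games, start=1500.0, k=20.0, scale=400.0):
    """Replay games in date order and return each team's peak post-game rating.

    games: iterable of (year, winner, loser) tuples, oldest first.
    Returns {team: (peak_rating, year_achieved)}.
    """
    rating = defaultdict(lambda: start)
    peak = {}
    for year, winner, loser in games:
        # Winner's expected score before the game (assumed logistic curve).
        expected = 1.0 / (1.0 + 10.0 ** ((rating[loser] - rating[winner]) / scale))
        delta = k * (1.0 - expected)
        rating[winner] += delta
        rating[loser] -= delta
        for team in (winner, loser):
            if team not in peak or rating[team] > peak[team][0]:
                peak[team] = (rating[team], year)
    return peak


# Toy schedule with hypothetical teams, not real results.
games = [(2001, "A", "B"), (2001, "A", "C"), (2002, "B", "A"), (2002, "C", "B")]
top = sorted(peak_ratings(games).items(), key=lambda kv: kv[1][0], reverse=True)
for rank, (team, (rating, year)) in enumerate(top, start=1):
    print(rank, team, year, round(rating, 1))
```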

It is a telling sign of the NBA’s competitive health that the ten highest rated teams of all time are pretty close to each other in rating. Also, it seems, at least superficially, that there is no historical bias: the meaning of a rating does not change with time.

So, what would happen if the 1997 Bulls played a best of 7 series against the 1986 Celtics?
Home field advantage aside (the rating I am using does not take that into account), the probability of the Bulls winning any particular game is p = 0.53. The probability of the Bulls winning a best of 7 series, with q = 1 - p, is

\displaystyle{p^4(1 + 4q + 10q^2 + 20q^3) \approx 0.565}

where the coefficients 1, 4, 10 and 20 count the orders in which the Bulls can lose 0, 1, 2 or 3 games before clinching their fourth win.

The Bulls would have about a 56.5% chance of winning the series: an exciting spectacle indeed!
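
The series probability can also be computed directly by summing over the number of games the winner drops along the way. This is a minimal sketch, assuming that games are independent and that the per-game probability p stays the same throughout the series; the function name series_win_probability is just for illustration.

```python
from math import comb


def series_win_probability(p: float, wins_needed: int = 4) -> float:
    """Probability of winning a first-to-`wins_needed` series.

    Assumes independent games with a fixed per-game win probability p.
    The series is clinched by a win that follows exactly k losses,
    and those k losses can fall anywhere among the earlier games.
    """
    q = 1.0 - p
    return sum(
        comb(wins_needed - 1 + k, k) * p**wins_needed * q**k
        for k in range(wins_needed)
    )


print(series_win_probability(0.53))  # the Bulls' chances at p = 0.53
```

Setting wins_needed to 3 or 2 gives the same calculation for best of 5 or best of 3 series.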

Finally, I leave you with a graph of the historical ratings of six teams from large metropolitan areas from 1980 to the present day. It seems that it is extremely difficult to maintain a dominant team for more than a few seasons (although the Lakers managed to do so in the 1980’s).

NBA ELO ratings graph

Historical season ending ELO ratings for six NBA teams from large metropolitan areas