First release of Ultimate Elo Ratings
Hi, this is the first release of a project I’ve been working on for a while now. Live-updating Elo ratings have been conspicuously missing from the Smash professional (and general) scene.
They exist in other fields, from the official FIDE Chess Ratings to FiveThirtyEight’s NBA and NFL models, and they perform admirably for a single metric. In fact, FiveThirtyEight’s old NBA model, which was purely based on Elo ratings, historically beat the Vegas lines 53% of the time: not enough to beat the rake, but a good performance for a single number.
So, here it is. I will write more on methodology later, but the model is a weighted variant of the original Elo algorithm. In theory, given enough time and enough matches between players, the original Elo scores converge; in reality, Smash just doesn’t have enough data for that.
The adjusted algorithm takes into account some information that we know a priori. For instance, results at a Super Major with 2000 entrants should be given more weight than results at a 100-person local, and a 3-0 in a Bo5 should count for more than a 3-2.
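To make the idea of weighting concrete, here is a minimal sketch of how event size and set score could scale a standard Elo update. The methodology has not been written up yet, so this is not the actual model: `match_weight`, the log scaling, the margin bonus, and `base_k` are all assumptions chosen purely for illustration.

```python
import math


def expected_score(rating_a, rating_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def match_weight(entrants, winner_games, loser_games, base_k=32.0):
    """Hypothetical K-factor that grows with event size and margin of victory."""
    # Assumption: log scaling, so a 2000-entrant event outweighs a
    # 100-entrant local without dwarfing it (100 -> 2.0, 2000 -> ~3.3).
    size_factor = math.log10(max(entrants, 10))
    # Assumption: a sweep earns a bigger bonus than a last-game win
    # (3-0 -> 1.5, 3-2 -> 1.1).
    total_games = winner_games + loser_games
    margin_factor = 1.0 + 0.5 * (winner_games - loser_games) / total_games
    return base_k * size_factor * margin_factor / 2.0


def update(winner, loser, entrants, winner_games, loser_games):
    """Return the new (winner, loser) ratings after one set."""
    k = match_weight(entrants, winner_games, loser_games)
    delta = k * (1.0 - expected_score(winner, loser))
    return winner + delta, loser - delta


# The same 3-0 moves more points at a 2000-entrant event than at a local.
print(update(1500, 1500, entrants=2000, winner_games=3, loser_games=0))
print(update(1500, 1500, entrants=100, winner_games=3, loser_games=0))
```

The update stays zero-sum here: whatever the winner gains, the loser gives up. Only the size of the exchange changes with the weight.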
What happened to Tweek?
Undoubtedly, one of the areas where these ratings deviate from player-made rankings is Tweek. The player usually ranked #2 does not have his spot, and this is mainly an artifact of how Elo works.
Basically, losing to TheGreatGonzales tanked his score. Furthermore, Samsora, Marss and Ally all gained significant Elo from their runs at Smash’n’Splash.
Given some more matches, if Tweek performs, his ranking will go back up, as you’d expect from a player of his caliber. But the numbers are what they are, even if they disagree with what we consider his skill to be.
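To see why one upset can sink a top player, here is a rough sketch using the plain, unweighted Elo update. The ratings (1900 and 1500) and the K-factor of 32 are made-up numbers for illustration, not anyone's actual rating or the weights this model uses.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rating_change(rating, opponent, score, k=32.0):
    """Points gained (or lost) for a result: score is 1.0 for a win, 0.0 for a loss."""
    return k * (score - expected_score(rating, opponent))


# Hypothetical favorite (1900) against a hypothetical underdog (1500).
win = rating_change(1900, 1500, 1.0)
loss = rating_change(1900, 1500, 0.0)
print(f"favorite wins:  {win:+.1f}")
print(f"favorite loses: {loss:+.1f}")
```

With these numbers the favorite is expected to win about 91% of the time, so a win earns roughly 3 points while a loss costs roughly 29. One upset wipes out about ten expected wins, which is why it takes a string of good results to climb back.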
What happened to Japan?
Protobanham, Kameme, Abandango and friends are probably lower than they should be. This is mostly due to the lack of tournament results at majors. Before CEO, for instance, the only large tournament Protobanham had entered was Umebra, and that run just isn’t going to net you the same Elo points as the people making top 8s at every major.
As they play more internationally and have success, their rankings will grow.

Contact me at email@example.com or @stu2b50 on Twitter.