Recently, Amazon and Riot ran a hackathon that asked people to globally rank teams. Unfortunately, I don't know Python and I have no idea how to use AWS, so I wasn't able to submit anything.
But I am doing a PhD in biostats, so I used R for data management and analytics and got some findings I wanted to share (that way my efforts weren't in vain).
For context: The data I used is from Oracle's Elixir. Similar to the hackathon, I'm only going to use 2020-2023 data.
Application of Elo
I decided to create a modified elo formula to rank teams. Quick recap: the generic Elo formula gives the expected probability of player A beating player B, and the ratings of A and B are then updated based on the actual result.
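In symbols, with ratings $R_A$, $R_B$ and update factor $K$, the standard Elo expectation and update are:

$$E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}}, \qquad R_A' = R_A + K\,(S_A - E_A)$$

where $S_A$ is 1 if A wins and 0 if A loses (and symmetrically for B).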
My modified elo formula works off the difference in features (kills, assists, etc.) between blue side and red side, denoted delta. The idea behind delta is that it captures how close or how one-sided the game was, so teams get rewarded/penalized for that. The update itself is then weighted by region, split, tournament, and time of match (earlier matches have lower weight, to reflect patch changes).
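To make that concrete, here is a minimal R sketch of the kind of update I mean. The logistic squash of delta and the 50/50 blend of outcome and margin are illustrative assumptions, not my exact formula:

```r
# Standard Elo expectation: probability that A beats B
expected_score <- function(r_a, r_b) {
  1 / (1 + 10^((r_b - r_a) / 400))
}

# Modified update for team A: the win/loss outcome is blended with a
# margin term built from delta (blue-side minus red-side feature
# differences), and the whole step is scaled by a match weight w
# (region x split x tournament x time of match).
# NOTE: the logistic squash and the 50/50 blend are illustrative
# placeholders, not the exact formula.
update_elo <- function(r_a, r_b, won, delta, w, k = 32) {
  margin  <- 1 / (1 + exp(-delta))      # map delta into (0, 1)
  outcome <- 0.5 * won + 0.5 * margin   # reward/penalize by game closeness
  r_a + w * k * (outcome - expected_score(r_a, r_b))
}
```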
First, I naively fit the formula at the team level, and the results came out like this:
Rank | Team | Elo |
---|---|---|
1 | DWG KIA | 1668.87 |
2 | T1 | 1656.81 |
3 | Gen.G | 1642.70 |
4 | Royal Never Give Up | 1620.54 |
5 | JD Gaming | 1613.45 |
6 | EDward Gaming | 1608.96 |
7 | Top Esports | 1605.18 |
8 | G2 Esports | 1584.90 |
9 | PSG Talon | 1581.88 |
10 | GAM Esports | 1580.11 |
I think we can all agree that this is not accurate. I suspect DWG KIA is so high because they won Worlds in 2020, which pushed their match weights up.
We also know that we should really consider the players on a team to determine its elo, so next I applied the elo formula at the player level, by role. Here delta is the difference in features within the role. In theory, if a player performed really well compared to his counterpart but his team lost the match, he wouldn't be penalized as much. The team elo is then calculated by naively averaging all five player elos (we can argue that teams place different emphasis on how they play through their players, but I kept it simple); a short sketch of this aggregation is below.
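A minimal R sketch of the aggregation step; the `player_elo` data frame and its column names are assumed, with one row per player holding that player's final rating:

```r
library(dplyr)

# player_elo: one row per player, with columns team, position, elo
# Team elo is the naive, unweighted mean over the five roles.
team_elo <- player_elo %>%
  group_by(team) %>%
  summarise(elo = mean(elo), .groups = "drop") %>%
  arrange(desc(elo))
```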
Here are the results for the top 10 players in each role and their elos.
Rank | Top | Elo | Jungle | Elo | Mid | Elo | ADC | Elo | Support | Elo |
---|---|---|---|---|---|---|---|---|---|---|
1 | 369 | 1724.46 | Oner | 1721.28 | Faker | 1746.72 | Ruler | 1718.66 | Keria | 1728.66 |
2 | Zeus | 1698.00 | Kanavi | 1709.42 | Chovy | 1719.33 | Gumayusi | 1706.42 | Lehends | 1699.42 |
3 | Doran | 1680.12 | Peanut | 1704.80 | knight | 1710.53 | Hope | 1673.56 | BeryL | 1695.11 |
4 | Bin | 1639.76 | Canyon | 1672.95 | Yagao | 1686.82 | JackeyLove | 1642.44 | Meiko | 1640.20 |
5 | Canna | 1605.51 | Clid | 1614.24 | ShowMaker | 1678.49 | Deft | 1608.72 | SwordArt | 1629.78 |
6 | Kiaya | 1601.33 | Karsa | 1611.76 | Scout | 1663.86 | huanfeng | 1592.67 | yuyanjia | 1624.72 |
7 | Wayward | 1592.10 | Tian | 1610.72 | Bdd | 1626.69 | Light | 1589.97 | ON | 1609.01 |
8 | BrokenBlade | 1591.26 | Tarzan | 1605.76 | Caps | 1606.64 | Aiming | 1580.85 | Kellin | 1607.47 |
9 | Hanabi | 1584.06 | Levi | 1604.23 | Kati | 1606.41 | GALA | 1579.83 | Bie | 1592.34 |
10 | Ale | 1581.92 | Cuzz | 1603.10 | Xiaohu | 1604.56 | Viper | 1578.70 | MISSING | 1588.56 |
Relatively speaking, this is surprisingly accurate at the player level. Now, averaging the player elos to get the team elo:
Rank | Team | Elo |
---|---|---|
1 | T1 | 1720.22 |
2 | JD Gaming | 1657.90 |
3 | Gen.G | 1649.78 |
4 | Bilibili Gaming | 1614.64 |
5 | KT Rolster | 1614.45 |
6 | Dplus KIA | 1613.10 |
7 | LNG Esports | 1590.90 |
8 | Frank Esports | 1586.31 |
9 | Weibo Gaming | 1582.13 |
10 | G2 Esports | 1577.30 |
Here we get something OK. Intuitively, T1 is highest because of their performance at the past three Worlds. But interestingly, a wild Frank Esports from Hong Kong shows up in the top 8, so my formula definitely needs some adjustments. Overall, not too shabby for a second attempt.
Machine Learning on Winning Probabilities for Worlds 2023
Switching gears a little bit, I wanted to look at the probability of team A beating team B at Worlds.
With the Swiss Stage, this was a good opportunity to look at all possible team match-ups. I trained XGBoost, a gradient-boosted tree algorithm, on the 2023 season up to the start of Worlds, and tested the model on all possible pairings of the Worlds teams. One of the training features was each team's elo at that point in time under my formula.
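Roughly, the setup looks like the sketch below; the feature matrices `train_x`/`worlds_x` and the hyperparameters are placeholders rather than my exact pipeline:

```r
library(xgboost)

# train_x: numeric matrix of match-up features (elo difference,
#          per-role stats, region indicators, ...)
# train_y: 1 if team A won the match, 0 otherwise
dtrain <- xgb.DMatrix(data = train_x, label = train_y)

model <- xgb.train(
  params  = list(objective = "binary:logistic",
                 eval_metric = "logloss",
                 max_depth = 4, eta = 0.1),
  data    = dtrain,
  nrounds = 200
)

# worlds_x: one row per ordered pair of Worlds teams,
#           built with the same feature columns as train_x
p_win <- predict(model, xgb.DMatrix(worlds_x))
```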
Here is the heatmap. It is read across the rows as the probability of the team on the Y axis beating the team on the X axis; the lighter the color, the higher the probability. We can see that the model heavily favors JDG and Gen.G to beat pretty much every team. It also surprisingly overvalues GAM Esports for some reason, and undervalues Bilibili Gaming.
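For completeness, a heatmap like this can be drawn from the predicted probabilities with ggplot2; `pred` here is a hypothetical long-format data frame of the model output:

```r
library(ggplot2)

# pred: data frame with columns team_a (Y axis), team_b (X axis), p_win
ggplot(pred, aes(x = team_b, y = team_a, fill = p_win)) +
  geom_tile() +
  scale_fill_gradient(low = "navy", high = "white") +  # lighter = higher P(win)
  labs(x = "Opponent", y = "Team", fill = "P(win)")
```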
If you read all this, thanks for reading. If anyone from Amazon or Riot reads this, sorry for not being able to submit anything in time. If there are any questions about the methodology, I'll try to answer them as best I can. Thanks again for reading, and enjoy Worlds!
TL;DR: Created a modified elo formula to rank all teams, then fed the elo ratings into a machine learning model to get winning probabilities for all teams competing at Worlds.