Original Post

Recently, Amazon and Riot had a hackathon that asked people to globally rank teams. Unfortunately I don't know Python and I have no idea how to use AWS, so I wasn't able to submit anything.

But I am doing a PhD in biostats, so I used R for data management and analytics and got some findings I wanted to share (that way my efforts weren't in vain).

For context: The data I used is from Oracle's Elixir. Similar to the hackathon, I'm only going to use 2020-2023 data.
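For anyone who wants to follow along in R, the data step is basically just reading the Oracle's Elixir export and filtering by year (the file and column names below are placeholders, not the exact schema):

```r
library(readr)
library(dplyr)

# Placeholder file name; use whatever export you download from Oracle's Elixir
matches <- read_csv("oracles_elixir_matches.csv") |>
  mutate(year = as.integer(format(as.Date(date), "%Y"))) |>
  filter(year >= 2020, year <= 2023)
```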

Application of Elo

I decided to create a modified Elo formula to rank teams. Quick recap: the generic Elo formula gives the expected probability of player A beating player B as E_A = 1 / (1 + 10^((R_B - R_A) / 400)), and the ratings are then updated with R_A' = R_A + K(S_A - E_A), where S_A is the actual result (1 for a win, 0 for a loss) and K controls how much a single match can move a rating.
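In R, the generic version is only a few lines (K = 32 below is the conventional default, not necessarily the value I ended up using):

```r
# Expected probability that A beats B under standard Elo
expected_score <- function(r_a, r_b) {
  1 / (1 + 10^((r_b - r_a) / 400))
}

# Standard rating update: s_a is the actual result (1 = win, 0 = loss)
update_rating <- function(r_a, r_b, s_a, k = 32) {
  r_a + k * (s_a - expected_score(r_a, r_b))
}
```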

My modified Elo formula uses the difference in features (kills, assists, etc.) between blue side and red side, denoted by delta. The idea behind delta is that it captures how close or how one-sided the game was, and rewards/penalizes teams accordingly. The updating formula then weights each update by region, split, tournament, and time of match (earlier matches have lower weight, to reflect patches).
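As a rough sketch of what that looks like in R (the delta scaling and the weight values here are simplified stand-ins, not the precise forms I fitted):

```r
# Modified update: delta measures how one-sided the game was, and the
# match weight w combines region, split, tournament, and recency.
# The (1 + abs(delta)) scaling is illustrative, not the exact form I used.
modified_update <- function(r_a, r_b, s_a, delta,
                            w_region, w_split, w_tourney, w_time, k = 32) {
  e_a <- 1 / (1 + 10^((r_b - r_a) / 400))
  w <- w_region * w_split * w_tourney * w_time  # overall match weight
  r_a + k * w * (s_a - e_a) * (1 + abs(delta))  # bigger blowout, bigger update
}
```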

First, I naively just fitted the formula at the team level, and the results came out like this:

| Rank | Team | Elo |
|------|------|--------|
| 1 | DWG KIA | 1668.87 |
| 2 | T1 | 1656.81 |
| 3 | Gen.G | 1642.70 |
| 4 | Royal Never Give Up | 1620.54 |
| 5 | JD Gaming | 1613.45 |
| 6 | EDward Gaming | 1608.96 |
| 7 | Top Esports | 1605.18 |
| 8 | G2 Esports | 1584.90 |
| 9 | PSG Talon | 1581.88 |
| 10 | GAM Esports | 1580.11 |

I think we can all agree that this is not accurate. I suspect DWG KIA is so high because they won Worlds in 2020, which pushed their match weights up.

We also know that we should really consider the players on a team to determine its Elo, so next I applied the same formula at the player level, by role. Here delta is the difference in features within the role (mid vs. mid, and so on). In theory, if a player performed really well compared to his counterpart but his team lost the match, he wouldn't be penalized as much. The team Elo is then calculated by naively averaging all five player Elos (you could argue that teams emphasize different players in how they play, but I kept it simple).
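The aggregation step is then just an average (the data frame and column names below are placeholders for however you store the player ratings):

```r
library(dplyr)

# player_elo: one row per player, with columns team, role, elo
team_elo <- player_elo |>
  group_by(team) |>
  summarise(elo = mean(elo), .groups = "drop") |>
  arrange(desc(elo))
```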

Here are the results for the top 10 players in each role and their Elo:

| Rank | Top | Jungle | Mid | ADC | Support |
|------|-----|--------|-----|-----|---------|
| 1 | 369 (1724.46) | Oner (1721.28) | Faker (1746.72) | Ruler (1718.66) | Keria (1728.66) |
| 2 | Zeus (1698.00) | Kanavi (1709.42) | Chovy (1719.33) | Gumayusi (1706.42) | Lehends (1699.42) |
| 3 | Doran (1680.12) | Peanut (1704.80) | knight (1710.53) | Hope (1673.56) | BeryL (1695.11) |
| 4 | Bin (1639.76) | Canyon (1672.95) | Yagao (1686.82) | JackeyLove (1642.44) | Meiko (1640.20) |
| 5 | Canna (1605.51) | Clid (1614.24) | ShowMaker (1678.49) | Deft (1608.72) | SwordArt (1629.78) |
| 6 | Kiaya (1601.33) | Karsa (1611.76) | Scout (1663.86) | huanfeng (1592.67) | yuyanjia (1624.72) |
| 7 | Wayward (1592.10) | Tian (1610.72) | Bdd (1626.69) | Light (1589.97) | ON (1609.01) |
| 8 | BrokenBlade (1591.26) | Tarzan (1605.76) | Caps (1606.64) | Aiming (1580.85) | Kellin (1607.47) |
| 9 | Hanabi (1584.06) | Levi (1604.23) | Kati (1606.41) | GALA (1579.83) | Bie (1592.34) |
| 10 | Ale (1581.92) | Cuzz (1603.10) | Xiaohu (1604.56) | Viper (1578.70) | MISSING (1588.56) |

Relatively speaking, this is surprisingly accurate at the player level. Now applying the average to get the team Elo:

| Rank | Team | Elo |
|------|------|--------|
| 1 | T1 | 1720.22 |
| 2 | JD Gaming | 1657.90 |
| 3 | Gen.G | 1649.78 |
| 4 | Bilibili Gaming | 1614.64 |
| 5 | KT Rolster | 1614.45 |
| 6 | Dplus KIA | 1613.10 |
| 7 | LNG Esports | 1590.90 |
| 8 | Frank Esports | 1586.31 |
| 9 | Weibo Gaming | 1582.13 |
| 10 | G2 Esports | 1577.30 |

Here we get something OK. Intuitively, T1 is highest because of their performance at the past three Worlds. But interestingly, a wild Frank Esports from Hong Kong shows up in the top 8, so my formula definitely needs some adjustments. Overall, not too shabby for a second attempt.

Machine Learning on Winning Probabilities for Worlds 2023

Switching gears a little bit, I wanted to look at the probability of team A beating team B at Worlds.

With the Swiss Stage, this was a good opportunity to look at all possible team match-ups. I trained XGBoost, a gradient-boosting machine learning algorithm, on the 2023 season up to the beginning of Worlds, and tested the model on all possible pairings of the Worlds teams. Each team's Elo at that point in time, from my formula, was also one of the training features.
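In R it looks roughly like this (the feature columns are placeholders; the actual feature set was larger):

```r
library(xgboost)

# train_df: one row per 2023 match up to Worlds; feature names are placeholders
feat_cols <- c("elo_a", "elo_b", "kills_diff", "gold_diff", "dragons_diff")
dtrain <- xgb.DMatrix(as.matrix(train_df[, feat_cols]), label = train_df$a_won)

model <- xgb.train(
  params = list(objective = "binary:logistic", eta = 0.1, max_depth = 4),
  data = dtrain,
  nrounds = 300
)

# matchup_df: every pairing of the Worlds teams, built with the same columns
p_win <- predict(model, as.matrix(matchup_df[, feat_cols]))
```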

Here is the heatmap. It is read across the rows: each cell is the probability of the team on the Y axis beating the team on the X axis, and the lighter the color, the higher the probability. We can see that the model rates JDG and Gen.G to beat pretty much all the other teams. The model also surprisingly overvalues GAM Esports for some reason, and undervalues Bilibili Gaming.

If you read all this, thanks for reading. If anyone from Amazon or Riot reads this, sorry for not being able to submit anything in time. If there are any questions about the methodology, I'll try to answer them the best I can. Thanks again for reading, and enjoy Worlds!

TL;DR: Created a modified Elo formula to rank all the teams, then applied machine learning with the Elo as a feature to get winning probabilities for all teams competing at Worlds.

Reply — /u/soudle_noop, over 1 year ago

The use of additional statistics to complement the model is a pretty smart idea, especially when dealing with this kind of problem, where very few games are available.

Off the top of my head, I think another interesting thing to try is a model that can propagate through time. Say this one (warning: I've never personally used it), or any of the other TTT (TrueSkill Through Time) implementations that are out there. It might lead to more accurate estimates.