over 4 years ago - /u/RiotAugust - Direct link

First, I should apologize to the people over at u.gg. Using hyperbolic terms like "garbage" isn't very useful. U.GG HAS GOOD DATA if you're looking in the right spots, but you have to consider sample size before trusting it.

To clarify the issue that I'm seeing: Sites like lolalytics and u.gg are great for determining relative balance (how good is champ x vs. the rest of the roster) in plat+ for champs who aren't critically unpopular. They're a lot worse at determining exact winrate/power levels of a given champ, especially at Diamond+ or Master+ levels of play. The sample size just isn't large enough, and it gets even worse early in a patch when only a few days of data have been collected (at that point even plat+ data is unreliable).

IMO it feels off when data sites are presenting things with low sample size as "real." I'll have people telling me "look at how broken 59% winrate Ivern is at master+" and then I see the data they're referencing has only 120 games. Not sure why it's being shown at all when the sample size is that low.
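
To put a rough number on the sample-size point above, here is a minimal sketch that computes a 95% Wilson score interval for a win rate. The function name `wilson_interval` is hypothetical, the 71 wins are an assumption (59% of 120 games rounds to about 71), and the 10,000-game case is made up purely for comparison. It also treats games as independent coin flips, which real ranked games are not, so read it as an illustration rather than how any stats site actually works.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Return (low, high) of the Wilson score interval for a win rate.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    z2 = z * z
    denom = 1 + z2 / games
    center = (p + z2 / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z2 / (4 * games * games)) / denom
    return center - half, center + half


# Assumed counts: ~59% winrate over 120 games is about 71 wins.
small_low, small_high = wilson_interval(71, 120)

# Hypothetical larger sample with the same 59% winrate.
large_low, large_high = wilson_interval(5900, 10000)

print(f"120 games:    {small_low:.3f} to {small_high:.3f}")
print(f"10,000 games: {large_low:.3f} to {large_high:.3f}")
```

With 120 games the interval runs from roughly 50% to 68%, so the same data is consistent with a perfectly average champ and with a wildly overpowered one. With 10,000 games at the same winrate it narrows to about a point on either side of 59%.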

TLDR: U.GG and sites like it are GREAT for general comparisons between champs in plat+ or lower. They're less reliable when looking at higher MMRs or trying to find exact winrates.

over 4 years ago - /u/GreaterBelugaWhale - Direct link

Originally posted by invisible_face_

Just because you can have other inputs doesn't mean discussion can't be data driven as well.

I believe this is a terminology misunderstanding. For us, "data driven" means data leads the charge. We are "data informed," meaning we lead the charge, with data informing us. A rough parallel from my experience would be letting a self-improving algorithm that uses data exclusively dictate design changes, similar to how some companies use them to present products to users (e.g., YouTube's black-box mystery algorithm), versus some human who is actually in control looking at the data and making their own decisions.

Generally I prefer data informed approaches since I don't think we should ever surrender control of work direction to data itself. We have become more data driven though over the last year, with our more solid commitments to winrate thresholds on balance.

over 4 years ago - /u/RiotAugust - Direct link

Originally posted by ShinggoLu

disclaimer: I am one of two co-founders of U.GG.

Thanks to u/RiotAugust for providing the context. I and the rest of the team obviously don't think U.GG is garbage but I understand the perspective RiotAugust presents. We do our best to gather as much data as we can get, display it and allow the player to draw their own conclusions from the data. The great thing about data is one number can be used to tell multiple stories. For example, the Lt. Gov of Texas (we're based in Austin, TX) can look at 500 deaths in Texas and come to the conclusion that stay at home order is overblown and it is time to re-open Texas, whereas someone else looks at 500 deaths and concludes that the strict stay at home order is exactly why the death toll isn't substantially higher.

People drawing conclusions from a small sample size in my opinion is part of what makes League of Legends fun and keeps the game fresh. A champion designer/game balancer might add that it also makes their work a living hell. At the end of the day I think it's great for everyone when we're all talking about League. It sure as hell is better than talking about Covid.

Edit 1: To explain drawing conclusions from a small sample size, there are situations when the entirety of the sample is "small," like the example RiotAugust gave where maybe a couple of people play Ivern at master+. We display exactly as much data as we can gather. Like what u/wertache said below, scouring these relatively obscure builds and champions for something op to climb with is a fun and fresh part of the game. If it works, it gets picked up by more players and there is more data, and with the larger sample we get a better understanding of whether the build is truly op or just something a onetrick is able to find success on. When it does work, the meta shifts and the game stays fresh.

Edit 2: Some people are curious why our total matches analyzed is low for patch 10.8. It is low for this patch. A lot of people believe that if it ain't broke, don't fix it. I personally believe that if we aren't constantly striving to improve our systems, someone else will eventually come along with nextleaguesite.gg and I'll be out of a job. We built U.GG on a fundamental belief in speed: speed in how quickly we can grab data from Riot's API within its rate limits, how quickly we can aggregate the data from our databases, and how quickly we can serve the data to players around the world. For patch 10.8, we made a change to our aggregation algo and missed an edge-case bug that, one week into the patch, compiled an "empty" file, which reset our tier list and some builds to 0 games analyzed. We don't lie at U.GG, so we fixed the bug and restarted mid-patch. The matches analyzed count reflects exactly as much data as we have. This bug is fixed. I can't guarantee that we won't have other bugs that cause issues in the future, but I do guarantee that we will continue to work on improving our systems to make them faster.

Thanks for all the work you do and the great website you maintain for all of us. Sorry again for being hyperbolic.