TL;DR: There is no statistical evidence in the considered dataset that champion selection in ARAM is not random.
Hello all
This post is made as an answer to ChaosRay3's post found here. He noted down for around a year which champions he got in ARAM. According to his statistics, he played a total of 1229 ARAM games in that timespan. You can find the complete statistics about which champions he got how many times in his post.
In the comments of the original post, Kdog122025 asked for some statistical analysis, so here it is. I will try to explain the methods and discuss the data first, then show the results. As with the last post I made here, the code can be found here (I was asked in my last post why there was no statistical hypothesis testing, so here we go). Everything was calculated using R 3.5.3.
As for me, I'm a statistician/data scientist working in the retail business. I'm currently in the military service, so there is some free time I need to fill somehow and this dataset looked interesting. I have no affiliation with Riot Games.
Overview:
- Relevant questions which we try to answer
- Overview and discussion of the present dataset
- Statistical hypothesis testing
- Binomial and multinomial distribution
- Results
Relevant questions which we try to answer
The first thing we have to do is define the questions we want to answer. The overall question is IS ARAM RANDOM, which we shall split into two parts, as we need different methods to answer them:
- Is the distribution between your currently owned champion pool and the free rotation random?
- Within your currently owned champion pool and the free rotation champion pool, is the selection random?
In the first question, we try to answer whether you are more likely to get a champion which you own or one you don't. This is a two-group problem (a champion is either owned or not), for which a binomial distribution is appropriate.
In the second question, we try to answer whether the selection is random within a given group. We have to separate the two groups because the free rotation changes every 2 weeks, so the two groups of champions (owned/not owned) are generated in different ways.
Before the question comes up: it is absolutely valid to split the data this way. If the selection is random, it will also be random within a subset of champions. The subsets have to be defined in such a way that the selection algorithm always treats all members of a subset the same. This will become clearer in the next section.
Overview and discussion of the present dataset
The provided dataset contains three groups:
- Champions owned at the beginning (33).
- Champions bought or released within the observation period (13).
- Champions not owned for the whole observation period (94).
I will discard the second group of champions as they cannot be cleanly analysed. This leaves me with the group of owned champions (608 games played in total) and the group of not owned champions (468 games played in total). Discarding a group is valid: if the mechanism generating the data is random, this will still hold for the selected subsets, and if it is not random, this will be detectable within the subsets.
It is worth mentioning that we can expect a small bias in the data towards champions which are owned by fewer of the people who play ARAM. Think about it this way: everyone has to get a champion they own or one that is in the free rotation. The probability of getting a popular (widely owned) champion is then a bit smaller than for an unpopular (rarely owned) champion, as you have to "share" those champions (or the possibility of getting them) with the other players.
The one shortcoming in the present dataset is that rerolls (if applicable) were written down, for which the effect described above is even stronger. However, you do not use your rerolls randomly but when you have a bad champion for ARAM or for the composition, which will somewhat lower the presence of these "bad" picks. It is not clear to me how to account for this bias correctly.
I can elaborate more in the comments on the last few paragraphs if they are not clear. However, given that there are so many champions available, I do not think that these effects lead to a large bias, and I therefore ignore them.
Statistical hypothesis testing
Now we come to a very important point about statistical testing: statistical testing does not prove anything. What we do instead is define a null hypothesis H0 together with the distribution the data would follow if H0 were true. We then use this distribution and the actual data to calculate our evidence for/against the null hypothesis.
For our two cases, these will be:
- Distribution between owned/not owned champion pool:
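- H0: The distribution between the two pools is random. Then the number of games with a champion from the owned pool, nOwned, out of all n = nOwned + nNotOwned games follows a binomial distribution with success probability p = cOwned / (cOwned + cFree), where cOwned is the number of owned champions and cFree is the number of not owned champions available through the free rotation. (A ratio of game counts such as nOwned / nNotOwned cannot be used here, as it is not a probability and can exceed 1.)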
- H1: The alternative hypothesis is that the distribution is not random.
- Distribution within the subsets of owned/not owned champion pools:
- H0: The distribution within the champions of a pool is random. Then the numbers of played games per champion ni, i from 1:np (np being the number of champions in the pool), follow a multinomial distribution with np classes and equal probabilities pi = 1/np. The observed frequencies ni/nobs, with nobs being the number of observations (in this case the number of games played within the chosen champion pool), are what we compare against these probabilities.
- H1: The alternative hypothesis is that the distribution is not random.
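To make the null hypothesis for the second question a bit more concrete, here is a small Python sketch (not my original R code) of what "random within a pool" implies if it is read as every champion in a pool being equally likely. The pool sizes and game counts are the ones from the dataset section above; the function name `uniform_null` is just made up for this illustration.

```python
# Pool sizes and game counts as reported in the dataset section of this post.
pools = {
    "owned": {"champions": 33, "games": 608},
    "not owned": {"champions": 94, "games": 468},
}

def uniform_null(champions, games):
    """Return (p_i, expected games per champion) if every champion is equally likely."""
    p_i = 1.0 / champions
    return p_i, games * p_i

for name, pool in pools.items():
    p_i, expected = uniform_null(pool["champions"], pool["games"])
    print(f"{name}: p_i = {p_i:.4f}, expected games per champion = {expected:.1f}")
```

Under this reading you would expect roughly 18.4 games per owned champion and roughly 5.0 games per not owned champion. The test then asks whether the observed counts scatter around these values more than chance alone would explain.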
Given the distribution, we then calculate the p-value: the probability of observing the actual data or more extreme data, given the null hypothesis. This value is then compared against a predefined significance level, usually chosen as 5%.
Please note that the p-value is calculated under the assumption that H0 is true. It expresses how compatible the data are with H0, not how likely any alternative hypothesis is. I emphasize this because it is not very intuitive and a lot of people (including people who study math or statistics) get it wrong.
Usually, one rejects the null hypothesis if the p-value is below 5%. For our second question, as we test two groups simultaneously, we also have the multiple testing problem. To keep an overall significance level of 5%, each of the two p-values must be below 5% / 2 = 2.5% for us to reject the null hypothesis.
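Splitting the 5% evenly over the two tests is what is usually called a Bonferroni correction. Here is a minimal Python sketch of the decision rule (again not my R code; the function names and the two example p-values are made up purely for illustration):

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the overall significance level at alpha."""
    return alpha / n_tests

def reject_null(p_values, alpha=0.05):
    """Return one decision per test (True = reject H0 for that test)."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]

print(bonferroni_threshold(0.05, 2))  # 0.025
# Hypothetical p-values, not results from this analysis:
print(reject_null([0.01, 0.04]))      # [True, False]
```

Note that the second made-up p-value (0.04) would count as significant for a single test at 5%, but not once two tests are run side by side.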
Binomial and multinomial distribution
For more details and graphs, read the Wikipedia articles here and here. The binomial distribution describes the outcome of a binary experiment (Bernoulli experiment) repeated n times. Imagine a coin being tossed n times: the binomial distribution describes how likely each possible number of heads is. For this, the distribution also needs the probability p of the coin landing on heads.
Imagine a fair coin (p = 0.5) being tossed 10 times. Then the binomial distribution will tell us the probability of getting 0, 1, ..., 9, 10 heads. But we can also use this to describe how sure we are that the coin is fair. For this we do an experiment (toss the coin 10 times) and get e.g. 6 heads. We can then calculate, using the null hypothesis that the coin is fair, the probability of getting the observed data or a value more extreme. This probability is the p-value, which we then compare to the significance level.
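The coin example can be worked through in a few lines. This is a minimal Python sketch of an exact two-sided binomial test, not my original R code. "More extreme" is taken here to mean "every outcome that is at most as likely as the observed one", which is one common definition; the function names are made up for this illustration.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed in log space to avoid overflow."""
    if p == 0.0:
        return 1.0 if k == 0 else 0.0
    if p == 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)

def binom_test_two_sided(k, n, p):
    """Sum the probabilities of all outcomes that are at most as likely as k."""
    probs = [binom_pmf(i, n, p) for i in range(n + 1)]
    # Small relative tolerance so that ties (e.g. 4 and 6 heads) are both counted.
    cutoff = probs[k] * (1.0 + 1e-7)
    return min(1.0, sum(q for q in probs if q <= cutoff))

# 6 heads in 10 tosses of a supposedly fair coin.
print(round(binom_test_two_sided(6, 10, 0.5), 4))  # 0.7539
```

With 6 heads out of 10, every outcome except exactly 5 heads is at least as extreme, so the p-value is about 0.75, which is far above 5%. Six heads are no reason to doubt the coin. The same function works for the ARAM question if you pass the number of games with owned champions, the total number of games and the p implied by the null hypothesis.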
The multinomial distribution is a generalization of the binomial distribution to more than two classes. I will not go into the details here. How I calculate the p-value for the multinomial test is described here.
Results
- The binomial test on the number of games with owned/not owned champions resulted in a p-value of 0.14, above our chosen significance level of 5%. Therefore, we do not reject the null hypothesis of the selection between owned/not owned champions being random.
Take note here that I only estimated the number of champions in the free rotation (14 champions over three rotations minus the ratio of owned champions to the total number of champions). One should either webscrape the champions of the free rotations and get the correct numbers (this has its own problems: the data would have to be aggregated over a whole year, and doing that correctly requires the number of games played per week), or put a beta distribution on the probability p so that p itself becomes variable.
- The likelihood-based multinomial tests for the two pools owned/not owned resulted in p-values of 0.90 and 0.15. Both are above the 2.5% threshold required by our significance level of 5%. Again, we do not reject the null hypothesis of the selection of champions within the pools being random.
Note here that I calculated the likelihood ratios and did the chi-squared test as described in the Wikipedia article on multinomial testing. The exact multinomial test is infeasible, as you run out of memory very quickly (I have 192 GB of RAM): the number of possible outcomes that need to be enumerated grows extremely rapidly with both the number of available champions per pool and the number of played games.
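For those who want to try the likelihood-ratio approach on their own counts, here is a minimal Python sketch. It is not my R code and it makes two assumptions you should be aware of: the null hypothesis is taken to be "all champions in the pool are equally likely", and the chi-squared tail probability is computed with the Wilson-Hilferty normal approximation so that only the standard library is needed (a proper chi-squared function from a statistics package is preferable). The counts at the bottom are made up and are NOT the data from the original post.

```python
import math

def g_statistic(counts):
    """Likelihood-ratio statistic G = 2 * sum(n_i * ln(n_i / e_i)) against a uniform H0."""
    n_obs = sum(counts)
    expected = n_obs / len(counts)
    # Champions with zero games contribute nothing to the sum.
    return 2.0 * sum(n * math.log(n / expected) for n in counts if n > 0)

def chi2_sf_approx(x, df):
    """Approximate P(Chi2_df > x) with the Wilson-Hilferty normal approximation."""
    x = max(x, 0.0)  # guard against tiny negative rounding errors
    scale = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - scale)) / math.sqrt(scale)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def multinomial_lr_test(counts):
    """Return (G, degrees of freedom, approximate p-value)."""
    g = g_statistic(counts)
    df = len(counts) - 1
    return g, df, chi2_sf_approx(g, df)

# Made-up counts for a pool of 5 champions (NOT the data from the original post).
toy_counts = [18, 22, 25, 17, 18]
g, df, p_value = multinomial_lr_test(toy_counts)
print(f"G = {g:.2f}, df = {df}, approximate p-value = {p_value:.2f}")
```

The approximation gets better with more degrees of freedom, so it should behave reasonably for pools of 33 or 94 champions. Unlike the exact test, this only needs one pass over the counts, which is why memory is not an issue here.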
Thank you for reading this far and hopefully you got a grasp on statistical testing. In conclusion, there was no evidence found in this dataset that the champion selection algorithm is not random.
Have a good day :)