Original Post

There doesn’t seem to be any established avenue for this sort of thing, so I’m hoping one of you may be able to help me out!

I’m working on a rule extraction algorithm which will detect trends and associations between a multitude of features in data, but it requires historical data to train. I’d like to avoid scraping the data from the OSRS site, so that I don’t suck up server bandwidth, get mistaken for a DDoS attack, etc.

Should this research be successful, it will provide a starting point for a wide array of applications which could be of immense use to the OSRS team and the community at large. And the best part is that it’s through my university, and therefore 100% open source!

Anyone who can help will have my undying gratitude. Accessing this data without scraping may mean the difference between having an awesome RuneScape-related thesis and settling for a lame normie thesis.

almost 6 years ago - /u/ModMatK

What data are you looking for?

almost 6 years ago - /u/ModMatK

Originally posted by olthatremain

You planning another data stream? Always been my favourite to watch mate.

Yup. Watch this space - well not this space, watch the website for news.

almost 6 years ago - /u/ModMatK

Originally posted by BasicFail

How long do we need to watch?

I have been watching that space for the past hour without losing eye contact...

Forever.

almost 6 years ago - /u/ModMatK

Originally posted by TridomKing

Wow, thanks for replying!

Primarily I’m looking for historic GE pricing data, as is available on the OSRS website, albeit not in a form I can use.

Other “would be nice” data items would be things like historic cumulative player engagement at different times of day, most common tasks being engaged in, and other similar game-related demographic information.

I’d like to make it clear that I am NOT looking for any information related to players themselves or their individual characters, or any information that Jagex may find sensitive in regard to their platform.

The cool thing about this kind of rule extraction research is that it provides you with information about trends and relationships in data that you wouldn’t even know to begin looking for, which is why I believe it could provide utility to both Jagex and the computationally inclined player base.

Edit: whoops, this reply was meant for you u/ModMatK

Were there specific items that you have in mind for the GE data, and how far back would you need it to go?

As for player engagement, do you need the number of concurrent users logged into the game?

As for the tasks most commonly engaged in, that is a very difficult question. The data sets are huge, as is the spread of where different tasks sit within them. To manage and read this data ourselves, we have a team of data scientists and data engineers using the latest machine learning technology to put it into a usable form. So I don't think I can get that one.

The problem we have with data is that we collect huge amounts (measured in multiple terabytes a day, I believe), so it is very difficult to display and share in a usable format. The other problem is that a lot of it is confidential, so anything other than concurrent players (which is on the website) is probably not going to be possible.

Let me know on the first two points and I shall see what I can do.