over 4 years ago - /u/Penrif - Direct link

Originally posted by Masalar

I always like these explanation articles. I may not fully understand everything talked about, but I find it fascinating anyway.

Glad you liked it! If you have any questions, I'm happy to help clarify anything.

over 4 years ago - /u/Penrif - Direct link

Originally posted by enyaliustv

As someone working in the field it is nice to see what happened, what did you do to fix the problem and what is being done to make the chances of it happening again close to zero. I have had my fair share of moments where I've thought the work was done by workers with way too little experience and knowledge to deal with the European servers but I was wrong. Thanks for the article!

We're nearly done moving to a different container scheduler entirely, and that one has much more expressive constraints that will allow us to programmatically ensure host exclusivity across shards with a bit of configuration. Until the edge services can be transitioned over, however, we've set up alerting that prompts us to do a manual remap if problematic crowding occurs. That's been working well for the past couple of weeks since this incident.
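To make the interim alerting idea a bit more concrete, here is a minimal Python sketch of a crowding check. It is not Riot's tooling: the `Placement` record, the `find_crowded_hosts` function, and the rule itself (a host counts as crowded when it runs containers from more than one shard) are all assumptions made for illustration, since the reply doesn't spell out the actual alert condition or name the scheduler.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set


class Placement(NamedTuple):
    """One running container. Hypothetical record, not a real scheduler API."""
    host: str
    shard: str
    service: str


def find_crowded_hosts(placements: Iterable[Placement]) -> Dict[str, Set[str]]:
    """Return hosts that run containers from more than one shard.

    Assumption: "problematic crowding" means shards sharing a host.
    The real alert condition may well be different.
    """
    shards_by_host: Dict[str, Set[str]] = defaultdict(set)
    for placement in placements:
        shards_by_host[placement.host].add(placement.shard)

    # Keep only the hosts where host exclusivity across shards is violated.
    return {
        host: shards
        for host, shards in shards_by_host.items()
        if len(shards) > 1
    }


if __name__ == "__main__":
    # Made-up placements purely for demonstration.
    current = [
        Placement("host-a", "euw", "chat"),
        Placement("host-a", "eune", "chat"),
        Placement("host-b", "euw", "parties"),
    ]
    for host, shards in find_crowded_hosts(current).items():
        print(f"ALERT: {host} is shared by shards {sorted(shards)}; consider a manual remap")
```

A check like this would only cover the detection half. With a scheduler that supports placement constraints, the same rule could presumably be declared in configuration, so the crowding never happens in the first place.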

over 4 years ago - /u/Penrif - Direct link

Originally posted by FrenchLyfe

You mentioned that a large amount of containers go into making league work, I was wondering just how spread out it actually is? Is a single champ select lobby running on multiple containers at once, each responsible for different things like Champs, runes, Chat, etc? I remember reading about those being highly separate client side, so it would be interesting to know if the server side mirrors the client in that sense.

It's very spread out - the examples you call out in a single champ select are spot on correct. There's a few more but I'm fuzzy on the exact splits so I hope you'll accept a bit of handwave there. In the core game loop we also have other services like the one that manages parties, one for matchmaking, and on the flip side there's separate services for processing mission results and storing your match history, for example. Very much in the microservices style of architecture.

over 4 years ago - /u/Penrif - Direct link

Originally posted by imArsenals

Almost 90% sure it's happening (on a smaller scale) in NA as well. I was playing with some Texas friends last night (all DFW) and all of us had massive packet loss despite our MS/FPS not changing.

That'd have to be something else entirely - the game servers don't run in the container cluster presently. I didn't see any major network issues with NA last night - possible that y'all are on the same ISP and they were having issues?

over 4 years ago - /u/Penrif - Direct link

Originally posted by yamfarmer1

Is it something homegrown or one of the popular open source orchestration frameworks? (k8s, etc)

It's a popular open source framework. Reckon rolling your own clustering is quickly becoming like rolling your own crypto - just don't.

over 4 years ago - /u/Penrif - Direct link

Originally posted by JanEric1

should i talk about my master thesis so i dont feel as dumb for not understanding a single word of this. :(

http://matt.might.net/articles/phd-school-in-pictures/

We may be at different angles off the center, but that doesn't make our specialties any more or less "smart" than the other.

over 4 years ago - /u/Penrif - Direct link

Originally posted by SeizeTheKills

Always love reading your technical stuff, both for Riot and in the past for EVE Online!

o7 m8