Like other client/server products, our game has a complex, distributed infrastructure made up of many different servers: authorization servers, servers that store user profiles, battle servers, a squad server and a voice communication server. None of these runs on a single machine; they are spread across dozens! On top of that, there is the matching server, which consists of many physical machines acting as entrance gateways (proxies that eliminate single points of failure) and a server that actually creates battles from the players in the queue.
Now, a little more detail on what happened this past weekend. One of the gateway services had been running for 375 days, and during a scheduled reboot a misconfigured version was loaded. It used only a single core (we discovered this only on Sunday) and proxied IP addresses incorrectly. Initially, it seemed to us that the server was overloaded due to the newly introduced vehicles and in-game nation, as well as ...