over 2 years ago - CCP_Explorer - Direct link

If you are a Deutsche Telekom AG customer and have been experiencing more disconnects since 1-2 August, then please create a support ticket. If you are playing from Germany and have been experiencing disconnects since the beginning of August even if your ISP is not DTAG, then we want to hear from you. We are trying to understand what is happening in this graph:

[Image: graph]

over 2 years ago - CCP_Explorer - Direct link

First off: Please DO NOT post any of your details as a reply in this thread as they will contain private information. Rather send us a support ticket with those details; but most importantly open a support ticket with Deutsche Telekom AG about network route instability (if DTAG is your ISP).

Earlier this year, customers of Virgin Media UK experienced an issue very similar to the one described in this thread and to what we are now seeing in metrics for Germany. See this forum thread: 20220225 - Connection Issues (UK).

See also these posts from me for that incident:

When we were debugging this with Virgin Media UK customers, @Alaska posted this helpful advice: 20220225 - Connection Issues (UK) - #382 by Alaska

Open a blank text file. Copy the following into your file:

:START
@echo Started: %date% %time%
tracert tranquility.servers.eveonline.com
@echo Completed: %date% %time%
timeout 60
GOTO START

Save your file (e.g. filename.bat) to your desired folder. The extension MUST be .bat.

Open a Command prompt window. Drag your saved file into the command prompt window (it will show the location when dragged) and hit enter/return to run it.

When disconnected, switch to that CMD window and allow a traceroute to run. Then stop it, and scroll through the latest traceroutes and see if any of them changed; i.e. did the network traffic suddenly go a different route on its way to Tranquility (which is tranquility.servers.eveonline.com // 172.65.201.188). If there is a route change at the time of the disconnect, then we want to hear about it (in a support ticket).
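Comparing many traceroutes by eye is tedious. The following is a minimal Python sketch, not an official CCP tool, that splits a saved copy of the batch file's output into individual runs and reports consecutive runs whose hops differ. It assumes the output was saved to a text file (for example by copying it out of the CMD window), that each run begins with the `Started:` line printed by the batch file, and that hop lines start with the hop number as in Windows `tracert` output. The function names are made up for illustration.

```python
import re

# A hop line in Windows tracert output starts with the hop number.
HOP_LINE = re.compile(r"^\s*(\d{1,2})\s+(.*)$")
# An IPv4 address that is not part of a longer dotted name or number.
IPV4 = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")


def parse_runs(log_text):
    """Return a list of (timestamp, hops) tuples, one per traceroute run."""
    runs = []
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Started:"):
            runs.append((stripped[len("Started:"):].strip(), []))
            continue
        match = HOP_LINE.match(line)
        if match and runs:
            addresses = IPV4.findall(match.group(2))
            # The hop address is the last one on the line; "*" marks a timeout.
            runs[-1][1].append(addresses[-1] if addresses else "*")
    return runs


def hops_differ(previous, current):
    """Compare two hop lists; a timed-out hop ("*") matches anything."""
    if len(previous) != len(current):
        return True
    return any(a != b for a, b in zip(previous, current) if "*" not in (a, b))


def route_changes(runs):
    """Return pairs of consecutive runs whose routes differ."""
    return [(prev, curr) for prev, curr in zip(runs, runs[1:])
            if hops_differ(prev[1], curr[1])]
```

For example, `route_changes(parse_runs(open("trace.txt").read()))` would list the pairs of runs to look at, where `trace.txt` is a hypothetical file name. A differing number of hops is treated as a change here, which may produce false positives when a trace ends in timeouts.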

BTW, this issue can’t be fixed in a patch and can’t be fixed on our side. This needs to be fixed by DTAG and Cloudflare, possibly with Level3’s involvement since we have seen Level3 in Frankfurt in traces that have been sent to us.

over 2 years ago - CCP_Explorer - Direct link

That is indeed a bit of a mystery.

A sudden route change will cause stateful network devices on the new path to reply with a TCP RST to reset the connection. At that point the EVE Client needs to reestablish the connection to Tranquility through the new Cloudflare location that the network connection is going through. Normally this will cause all running EVE Clients to reset, since normally they would all be transmitting and receiving messages.
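From a client's point of view, a TCP RST shows up as a connection-reset error on the socket. The following Python sketch is not the EVE Client's actual code; it only illustrates the generic reset-then-reconnect pattern described above. The `connect` and `session` callables and the retry parameters are hypothetical.

```python
import time


def run_with_reconnect(connect, session, max_attempts=3, delay_seconds=0.0):
    """Run session(conn); on a TCP reset, open a new connection and retry.

    connect: callable returning a new connection object (hypothetical).
    session: callable that uses the connection until it finishes or fails.
    Returns the number of connections that had to be opened.
    """
    for attempt in range(1, max_attempts + 1):
        conn = connect()
        try:
            session(conn)
            return attempt
        except ConnectionResetError:
            # A RST arrived (for example from a stateful device on a new
            # path); the old connection is unusable and must be replaced.
            time.sleep(delay_seconds)
        finally:
            conn.close()
    raise ConnectionError("gave up after %d resets" % max_attempts)
```

With real sockets, `connect` could be something like `lambda: socket.create_connection((host, port))`, where `host` and `port` are placeholders.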

There is an edge case: if the network route flips back and forth very quickly and one of the clients happens not to be transmitting or receiving at that moment, then that client would stay connected (or there would be packet loss, but a re-transmission would be requested and succeed, and no RST packets would be sent).

Hence we suspect some other network equipment failure at Deutsche Telekom AG where only some of the connections are terminated (for reasons unknown to us).

We are not seeing this pattern from other countries starting 1-2 August; their connectivity is either the same or better (e.g. connectivity in Australia improved significantly at the tail end of July, so we just came off that investigation to this one). So far the only German players that have contacted us are with Deutsche Telekom AG.

over 2 years ago - CCP_Explorer - Direct link

No.

The issues were happening in ISPs’ network systems before reaching the local Cloudflare locations in Australia.

See also my next post with more details about Deutsche Telekom AG as that may be relevant to players in Australia and New Zealand.

over 2 years ago - CCP_Explorer - Direct link

Some EVE players who are customers of Deutsche Telekom AG (DTAG) and have been affected recently by random disconnects have reported to us that they have resolved the issue by installing https://cloudflarewarp.com/, which is software from Cloudflare, and using it in full WARP mode. You can read more about this software here https://blog.cloudflare.com/warp-for-desktop/ and here Windows desktop client · Cloudflare WARP client docs. Please note that I’m not requiring, asking or recommending that you install any software on your computer and take no responsibility for this particular software; I’m only relaying the experience of other DTAG customers.

This software is essentially a Cloudflare-specific VPN: “The WARP application uses BoringTun to encrypt all the traffic from your device and send it directly to Cloudflare’s edge, ensuring that no one in between is snooping on what you’re doing. If the site you are visiting is already a Cloudflare customer, the content is immediately sent down to your device.”

Tranquility is serviced by Cloudflare’s network. If customers of DTAG are able to bypass the disconnect issues by using a Cloudflare VPN-style application, then the root of the disconnect issue is clearly at DTAG.

If you do decide to try this out, please let us know if the disconnects stop.

over 2 years ago - CCP_Explorer - Direct link

Further, DTAG customers may want to consider adding their complaint to BGP Flaps, Long-lived TCP Connections | Telekom hilft Community on DTAG’s forums.

I want to comment on that thread:

It’s simply not a Telekom problem … they use the peering that the provider specifies.

We have seen this happen at ISPs, where they do get the correct BGP information from Cloudflare on the route to Tranquility but the ISPs’ equipment either has out-of-memory issues or a prefix limit and throws the correct routing information away.
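As a rough illustration of the prefix-limit case, here is a toy Python model, not a description of any real router or of DTAG's equipment. It assumes a table that simply discards routes once a limit is reached, so traffic falls back to a less specific route. The class, the prefixes other than Tranquility's address, and the next-hop labels are all made up.

```python
import ipaddress


class RoutingTable:
    """Toy routing table that discards routes beyond a prefix limit."""

    def __init__(self, prefix_limit):
        self.prefix_limit = prefix_limit
        self.routes = {}  # ip_network -> next-hop label

    def learn(self, prefix, next_hop):
        """Store a route; return False if it was thrown away."""
        network = ipaddress.ip_network(prefix)
        if network not in self.routes and len(self.routes) >= self.prefix_limit:
            return False  # table is full: the correct route is lost
        self.routes[network] = next_hop
        return True

    def lookup(self, address):
        """Longest-prefix match; None if no route covers the address."""
        ip = ipaddress.ip_address(address)
        matches = [n for n in self.routes if ip in n]
        if not matches:
            return None
        return self.routes[max(matches, key=lambda n: n.prefixlen)]


table = RoutingTable(prefix_limit=2)
table.learn("0.0.0.0/0", "default-transit")
table.learn("198.51.100.0/24", "some-peer")
# Hypothetical specific route towards Tranquility; dropped due to the limit.
kept = table.learn("172.65.201.0/24", "direct-peering")
chosen = table.lookup("172.65.201.188")
```

In this sketch `kept` is `False` and `chosen` is `"default-transit"`: the correct routing information arrived but was not retained, so the traffic takes another path. Real equipment behaves in vendor-specific ways.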

Level3 can be a problem, since alongside a very well-built highway to the Telekom network they also have a cheap gravel track.

We have seen both versions of this issue. Firstly, we actually disable routing via Level 3 as much as we can on and from our network equipment, as many EVE players have had Level 3 related issues in the past. In particular, EVE players at certain ISPs in South America and Mexico were unable to even connect to Tranquility earlier this year before we cut Level 3 off as much as we could by not peering with them in London. Secondly, some ISPs may choose to ignore BGP routing if they have a cheaper exit point off their network.

Telekom is a Tier-1 provider … if CCP wants to reach the largest customer base in Europe (the Telekom AS also serves its group units in the other countries), they will just have to spend a bit of money.

Such as hosting Tranquility at Cloudflare? We have already done that, but even then we have seen large ISPs fail at keeping steady connections to Cloudflare.

Everything is fine in the Telekom network. CCP, or rather their provider, needs to look at what they are doing, or ordering, in their peering area.

I can’t make any claims about what is wrong in this particular case, but experience from similar incidents tells us that issues such as this one are always in the ISPs’ networks. I’ve also repeatedly had network admins at large ISPs tell me that all is fine in their network, but when presented with data collected over the course of days or even weeks they ultimately find the cause in their networks. Which is why I always start by looking for a possible cause at the ISP before moving on to looking for a cause at Cloudflare or in our own network equipment. Issues in our own equipment manifest very differently: normally multiple users spread over the world are having issues. With high probability, localised issues have localised root causes.

Besides, for “long lived” TCP connections (what a buzzword, they are perfectly ordinary TCP connections) it does not matter at all whether the route changes in between; the packets do not notice any of that.

All stateful devices on the route will notice and send RST in response. No one connects directly to Tranquility but rather to firewalls, DDoS protection and TCP proxies on Cloudflare’s edge. You can read more about the setup here https://www.cloudflare.com/case-studies/ccp-games/.
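The point about stateful devices can be shown with a toy Python model. This is a sketch only: it assumes a device that keeps a table of connections it has seen being opened and answers any mid-stream packet it has no state for with a RST, as described above. The class name, the flow tuple and the client address are hypothetical, and real devices may behave differently (for example by dropping silently).

```python
class StatefulDevice:
    """Toy model of a firewall or TCP proxy that tracks connections."""

    def __init__(self, name):
        self.name = name
        self.known_flows = set()

    def handle(self, flow, flags):
        """Return "FORWARD" or "RST" for one packet.

        flow: (src_ip, src_port, dst_ip, dst_port)
        flags: set of TCP flag names carried by the packet
        """
        if "SYN" in flags:
            self.known_flows.add(flow)  # new connection: create state
            return "FORWARD"
        if flow in self.known_flows:
            return "FORWARD"            # established connection
        return "RST"                    # mid-stream packet without state


old_path = StatefulDevice("edge-A")
new_path = StatefulDevice("edge-B")
flow = ("203.0.113.7", 50000, "172.65.201.188", 26000)

opened = old_path.handle(flow, {"SYN"})        # connection set up via edge-A
before_flip = old_path.handle(flow, {"ACK"})   # data flows normally
after_flip = new_path.handle(flow, {"ACK"})    # route changed to edge-B
```

Here `after_flip` is `"RST"`: the packet itself is unchanged, but the device on the new path has never seen the connection being opened, so the client has to reconnect.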

over 2 years ago - CCP_Explorer - Direct link

We are moving all discussion on disconnect issues affecting customers of Deutsche Telekom AG to this new thread:

over 2 years ago - CCP_Explorer - Direct link

A web page is not a real-time system that keeps on ticking. It is stateless and will just stay there until you come back. If you disconnect from EVE, the simulation carries on and we must re-connect you with your session. But the simulation won’t be in the same state as when you left. In fact, as soon as we detect that something is amiss, you are automatically emergency-warped away. Your web page doesn’t do that. There isn’t a comparison here.