Further, DTAG customers may want to consider adding their complaint to the thread “BGP Flaps, Long-lived TCP Connections | Telekom hilft Community” on DTAG’s forums.
I want to comment on that thread:
It’s simply not a Telekom problem … they use the peering that the provider specifies.
We have seen this happen at ISPs: they do receive the correct BGP information from Cloudflare for the route to Tranquility, but the ISP’s equipment either runs out of memory or hits a prefix limit and throws the correct routing information away.
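To illustrate the prefix-limit case, here is a minimal Python sketch of a toy routing table. It assumes a device that silently discards any new prefix once a configured limit is reached; real routers may instead tear down the BGP session or only log a warning. The class `LimitedRouteTable`, the next-hop names and the documentation prefixes are all hypothetical and do not describe any particular ISP or Cloudflare.

```python
import ipaddress


class LimitedRouteTable:
    """Toy routing table that refuses new prefixes beyond a fixed limit."""

    def __init__(self, max_prefixes):
        self.max_prefixes = max_prefixes
        self.routes = {}  # ip_network -> next-hop label

    def learn(self, prefix, next_hop):
        """Store a route; return False if it was discarded due to the limit."""
        net = ipaddress.ip_network(prefix)
        if net not in self.routes and len(self.routes) >= self.max_prefixes:
            return False  # assumption: the excess prefix is simply thrown away
        self.routes[net] = next_hop
        return True

    def lookup(self, address):
        """Longest-prefix match; returns the next-hop label or None."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


# Hypothetical scenario: the table is already full when the specific route arrives.
table = LimitedRouteTable(max_prefixes=2)
table.learn("0.0.0.0/0", "cheap-transit")
table.learn("198.51.100.0/24", "peer-a")
accepted = table.learn("203.0.113.0/24", "cloudflare-peering")

# The correct, more specific route was discarded, so traffic follows the default.
next_hop = table.lookup("203.0.113.10")
print(accepted, next_hop)
```

In this sketch the announcement was correct and was received, yet traffic to `203.0.113.10` still leaves via the default route because the specific prefix never made it into the table.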
Level3 can be a problem, since alongside a superbly built motorway to the Telekom network they also have a cheap gravel track.
We have seen both versions of this issue. Firstly, we actively disable routing via Level 3 as much as we can, both to and from our network equipment, because many EVE players have had Level 3 related issues in the past. In particular, EVE players at certain ISPs in South America and Mexico were unable to even connect to Tranquility earlier this year, until we cut Level 3 off as much as we could by not peering with them in London. Secondly, some ISPs may choose to ignore BGP routing if they have a cheaper exit point off their network.
Telekom is a Tier-1 provider … if CCP wants to reach the largest customer base in Europe (the Telekom AS also serves its group subsidiaries in other countries), they will just have to spend a bit of money.
Such as hosting Tranquility at Cloudflare? We have already done that, but even then we have seen large ISPs fail to keep steady connections to Cloudflare.
Everything is fine in the Telekom network. CCP or their provider needs to look at what they are doing, or ordering, in their peering setup.
I can’t make any claims about what is wrong in this particular case, but experience from similar incidents tells us that issues such as this one are always in the ISPs’ networks. I’ve also repeatedly had network admins at large ISPs tell me that all is fine in their network, but when presented with data collected over days or even weeks they ultimately find the cause in their own networks. That is why I always start by looking for a possible cause at the ISP before moving on to Cloudflare or our own network equipment. Issues in our own equipment manifest very differently: normally, multiple users spread over the world are having issues. With high probability, localised issues have localised root causes.
Besides, for “long-lived” TCP connections (what a buzzword, they are perfectly ordinary TCP connections) it makes no difference whatsoever if the route changes along the way; the packets don’t notice any of it.
All stateful devices on the route will notice and send RST in response. No one connects directly to Tranquility but rather to firewalls, DDoS protection and TCP proxies on Cloudflare’s edge. You can read more about the setup here: https://www.cloudflare.com/case-studies/ccp-games/.
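As a rough illustration of why a route change matters to stateful devices, here is a minimal Python sketch. It assumes each device keeps a table of flows it saw being opened and answers mid-stream packets for unknown flows with RST; some real devices drop such packets silently instead. The class `StatefulDevice`, the device names and the addresses and ports are hypothetical placeholders, not a description of Cloudflare’s actual edge.

```python
class StatefulDevice:
    """Toy firewall/proxy that only forwards packets for flows it has seen open."""

    def __init__(self, name):
        self.name = name
        self.flows = set()  # flows whose SYN passed through this device

    def handle(self, flow, flags):
        """Return 'FORWARD' for known flows, 'RST' for unknown mid-stream packets."""
        if "SYN" in flags:
            self.flows.add(flow)  # new connection: create state
            return "FORWARD"
        if flow in self.flows:
            return "FORWARD"  # established connection this device knows about
        return "RST"  # assumption: no state for this flow, so reset it


# Placeholder 4-tuple: (client IP, client port, server IP, server port).
flow = ("192.0.2.10", 50000, "203.0.113.10", 443)

old_path = StatefulDevice("edge-a")  # device on the original route
new_path = StatefulDevice("edge-b")  # device on the route after the BGP change

handshake = old_path.handle(flow, {"SYN"})
before_flap = old_path.handle(flow, {"ACK"})
after_flap = new_path.handle(flow, {"ACK"})  # same connection, different device

print(handshake, before_flap, after_flap)
```

The packets themselves are unchanged by the new route, but the device that now receives them has never seen the handshake, which is the situation described above.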