I'm confused about how this is part of any algorithm that you use, though. All I can see is "I f**ked" potentially being flagged by an automated check because it could be followed by something vulgar. And yet people will call each other some of the worst possible things in the game chat, telling each other that they're f**king shit and they should go kill themselves and... The algorithms don't catch that. So what is even the point of this system? It misses real offenders much of the time, and it hands out punishments to people who did nothing wrong.
Yeah, it's really not my department, but I can't imagine that a human being looked at "I f*cked up" and smashed the big HARASSMENT button. But what do I know?