
@thinkjson
Last active October 12, 2017 00:24
This is a proposal for identifying and retaliating against abusers on the Twitter platform. Using a combination of sentiment analysis and mechanical-turk-style human review, the system can flag abusive tweets and report repeat offenders in real time. This is a proposal for the abuse system that should have been built into Twitter. Comments welcome.

Abusive tweet storms and DDoS

An abusive tweet storm, brought on by a Twitter user with a large follower count using their influence to intimidate or verbally abuse another user, bears striking similarities to a DDoS attack. If you frame the problem as a DDoS attack, then many mitigation techniques used against DDoS attacks can also be applied to these Twitter attacks.

Identification of abusive tweets

By opting in, users can submit their replies for analysis by the system. Using sentiment analysis, abusive tweets can be identified programmatically, then confirmed using a mechanical-turk-like system. To broaden the scope of the analysis, the abusive user's other replies are analyzed and catalogued, being flagged along the way to protect other users. Their victims' replies will also be analyzed so other abusive users can be identified, and the abuser's following list can be used to find further potential abusers. In this way the pool of people whose tweets are being analyzed will grow organically, allowing other bad actors to be identified.
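The identification step above can be sketched roughly as follows. This is a minimal illustration, not a real sentiment model: the lexicon, scoring function, and threshold are all hypothetical stand-ins for whatever classifier and mechanical-turk queue the real system would use.

```python
# Illustrative lexicon; a production system would use a trained model.
ABUSIVE_LEXICON = {"idiot", "trash", "pathetic"}

def score_sentiment(text: str) -> float:
    """Crude lexicon-based score in [-1, 1]; stands in for a real model."""
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in ABUSIVE_LEXICON)
    return -hits / len(words) if hits else 1.0

def triage(reply: str, threshold: float = 0.0) -> str:
    """Queue potentially abusive replies for human (mechanical-turk-style)
    review; pass everything else through."""
    return "review" if score_sentiment(reply) < threshold else "ok"
```

Replies routed to "review" would be confirmed or cleared by human volunteers before any reporting action is taken.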

Initial server-side mitigation

In phase one of a mitigation solution, abusive tweets can be catalogued on the home page of the system's public-facing web site. This will allow volunteers to click the link and report the tweet as abusive. This is necessary because Twitter exposes no public API for reporting abusive tweets. In addition, the front page will catalog the top 10 worst abusers in the system, with links to their profiles allowing visitors to the web site to report the users themselves for abuse.
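The "top 10 worst abusers" ranking described above amounts to a simple count of confirmed abusive tweets per author. A sketch, assuming confirmed reports arrive as (author handle, tweet id) pairs:

```python
from collections import Counter

def worst_offenders(confirmed_abusive, n=10):
    """Rank authors by number of confirmed abusive tweets.

    confirmed_abusive: iterable of (author_handle, tweet_id) pairs,
    i.e. tweets already verified by the mechanical-turk-style review.
    """
    counts = Counter(author for author, _ in confirmed_abusive)
    return counts.most_common(n)
```

The front page would render this list with profile links so visitors can file their own abuse reports against the ranked users.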

Client-side mitigation

The most effective mitigation against this kind of abuse is to filter replies on the client. I propose four modes in which the client can operate to protect the user, with escalation between modes triggered either by the volume of abusive tweets or by the user:

Passive filtering

In this mode, only tweets identified by the system in a reactive manner are removed from the replies feed. For most users experiencing only occasional abusive replies, this mode may be sufficient.
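Passive filtering is just set membership against the server-side catalog. A sketch, assuming replies are dicts with a hypothetical `id` field and that the client periodically syncs a set of flagged tweet IDs:

```python
def passive_filter(replies, flagged_ids):
    """Drop replies the server-side system has already confirmed as abusive.

    replies: list of reply dicts with an 'id' field (illustrative shape).
    flagged_ids: set of tweet IDs confirmed abusive after review.
    """
    return [r for r in replies if r["id"] not in flagged_ids]
```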

Active filtering

In this mode, sentiment analysis is run on the client, and any replies with a potentially negative message are hidden from the replies feed until mechanical turk style verification can take place. This can be triggered by a certain number of abusive tweets in a specified period of time.
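Active filtering splits incoming replies into a shown list and a held-for-review list based on a client-side sentiment score. The reply shape and score function are illustrative; the client would ship whatever sentiment model the system uses:

```python
def active_filter(replies, score, threshold=0.0):
    """Partition replies by client-side sentiment.

    replies: list of reply dicts with a 'text' field (illustrative shape).
    score: callable mapping text to a sentiment score.
    Returns (shown, held_for_review); held replies stay hidden until
    mechanical-turk-style verification clears them.
    """
    shown, held = [], []
    for r in replies:
        (shown if score(r["text"]) >= threshold else held).append(r)
    return shown, held
```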

Reply whitelisting

In this mode, only replies from users who have already been whitelisted, either manually by the user or by a consistently positive sentiment analysis score, will get through. This is akin to blocking the CIDR ranges in which attack traffic clusters during a DDoS attack. As mechanical-turk verification flags tweets as safe, replies can start to trickle in again.
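The whitelist check combines the user's manual list with historical sentiment scores. Field names and the minimum-score threshold below are hypothetical:

```python
def allowlisted(reply, manual_allow, scores, min_score=0.8):
    """A reply gets through if its author was whitelisted by the user,
    or has a consistently positive historical sentiment score.

    reply: dict with an 'author' field (illustrative shape).
    manual_allow: set of handles the user whitelisted directly.
    scores: dict mapping handle -> rolling average sentiment score.
    """
    author = reply["author"]
    return author in manual_allow or scores.get(author, 0.0) >= min_score
```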

Replies only from following

In the most restrictive mode, replies from any user that the user is not following will be filtered, and never shown to the user. This can be used in extreme cases where even reply whitelisting is allowing abusive tweets through, or the volume of abusive tweets exceeds a particular threshold.
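The most restrictive mode, plus the volume-based escalation between all four modes, can be sketched together. The per-hour thresholds are purely illustrative; a real client would tune them per user or let the user escalate manually:

```python
def following_only(replies, following):
    """Most restrictive mode: keep only replies from accounts the user
    follows. replies: dicts with an 'author' field (illustrative shape)."""
    return [r for r in replies if r["author"] in following]

def choose_mode(abusive_per_hour: int) -> str:
    """Escalate through the four proposed modes as abusive volume grows.
    Thresholds are hypothetical examples, not tuned values."""
    if abusive_per_hour < 5:
        return "passive"
    if abusive_per_hour < 50:
        return "active"
    if abusive_per_hour < 500:
        return "whitelist"
    return "following_only"
```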
