@ryanblock
Created August 29, 2012 02:27
How Tweetbot and regex made my Twitter replies usable again (filed under: #wrongryan)

How Tweetbot and regex made my Twitter replies usable again

As it turns out, most normal humans are incapable of learning to use Twitter @ replies. And in case you don't follow me on Twitter: yes, my handle (@ryan) gets a lot of erroneous mentions. (The most amusing, random ones I've even taken to retweeting under the #wrongryan hashtag.)

Then Tweetbot -- and its ability to use regex as Twitter filters -- came along. Here's how the Tapbots guys and some regular expressions single-handedly made my Twitter replies usable again.

Notes and caveats

  • I'm not a regex expert. Far from it. I suck at regex, actually. If you have suggestions for improvements, please leave them below!
  • Some of the regex may look a little sloppy, but it was written that way because Tweetbot for Mac's regex filter support is very early, and things like repeats (expression{3,}) are buggy. Everything below should work without crashing Tweetbot for iPhone, iPad, and Mac.
  • Obvious, but not everyone should make use of every filter below. These filters are all tuned for reply spam on my particular account, and many of these filters may actually result in false positives. So be careful!

Lots and lots of user mentions

The top #wrongryan offenders are, believe it or not, teenage Twitter users in Malaysia who blast out tweets with up to a dozen of their friends' first names. There's no pattern, and no awareness that the replies are all pointing to the wrong people. And as it happens, Ryan is apparently a popular first name. There are a couple of ways to filter these.

First, the long, rambling list (with, or without commas):

(@\w+,?){3,}@ryan

That one kind of broke down for me pretty quickly, though, so I tried mentions (at least 3) scattered all over the place:

(@\w+.*){3,}(@ryan)

I've found that wasn't very effective, though, because @ryan might be mentioned anywhere. Also, the {3,} has been crashing Tweetbot for Mac. So I created this set of four expressions to nuke any tweet where @ryan appears alongside any three other mentions:

(?s)(?i:@ryan.*)(@\w+.*)(@\w+.*)(@\w+)
(?s)(@\w+.*)(?i:@ryan.*)(@\w+.*)(@\w+)
(?s)(@\w+.*)(@\w+.*)(?i:@ryan.*)(@\w+)
(?s)(@\w+.*)(@\w+.*)(@\w+.*)(?i:@ryan)
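
To sanity-check the four expressions outside Tweetbot, here's a rough Python sketch. It assumes Python's re module treats these particular patterns the same way Tweetbot's regex engine does, which hasn't been verified. The helper name is_mention_spam and the sample tweets are made up for illustration.

```python
import re

# The four filters from above: one per slot that @ryan can occupy
# among three other mentions. (?s) lets "." cross line breaks.
MENTION_SPAM = [
    re.compile(r"(?s)(?i:@ryan.*)(@\w+.*)(@\w+.*)(@\w+)"),
    re.compile(r"(?s)(@\w+.*)(?i:@ryan.*)(@\w+.*)(@\w+)"),
    re.compile(r"(?s)(@\w+.*)(@\w+.*)(?i:@ryan.*)(@\w+)"),
    re.compile(r"(?s)(@\w+.*)(@\w+.*)(@\w+.*)(?i:@ryan)"),
]


def is_mention_spam(tweet: str) -> bool:
    """Hypothetical helper: True if any of the four filters matches."""
    return any(p.search(tweet) for p in MENTION_SPAM)


if __name__ == "__main__":
    for tweet in [
        "@amy @ben @ryan @cat hi",    # @ryan plus three others
        "@ryan thanks, cc @amy",      # only one other mention
        "hey @ryan nice post",        # a normal reply
    ]:
        print(is_mention_spam(tweet), "|", tweet)
```

Each expression needs four @-mentions in total, so a normal reply that cc's one or two other people should pass through untouched.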

Multi-RTs

The second worst offender: basically the same mention spam as above, but a variant where these kids are using lots of old school retweets instead of all those first names. This regex kills anything with multiple RTs in a single post. [Updated to be much more greedy, watch out!]

(?s)(RT).*?(RT)
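
A quick Python sketch of why "watch out" applies here: the pattern matches any two capital "RT" substrings, including ones buried inside all-caps words. The second pattern, multi_rt_strict, is a hypothetical tighter variant using word boundaries. It is not one of the original filters and hasn't been tried in Tweetbot.

```python
import re

# The filter as written above: any two "RT" substrings, anywhere.
multi_rt = re.compile(r"(?s)(RT).*?(RT)")

# Hypothetical stricter variant (an assumption, untested in Tweetbot):
# each RT has to stand alone as its own word.
multi_rt_strict = re.compile(r"(?s)\bRT\b.*?\bRT\b")

if __name__ == "__main__":
    for tweet in [
        "RT @amy: RT @ben: so true",  # a chained old school retweet
        "RT @amy: so true",           # a single retweet
        "SMART START",                # all caps words containing "RT"
    ]:
        print(
            bool(multi_rt.search(tweet)),
            bool(multi_rt_strict.search(tweet)),
            "|",
            tweet,
        )
```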

Quotes immediately preceding the reply

For some reason a LOT of erroneous tweets to me look something like "@ryan (note the preceding quotation mark). Don't ask me why. (Updated to prevent false positives on quote-style retweets.)

^.+[“"](?i:@ryan)
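
Here's a small Python sketch of what the leading ^.+ buys you: the quotation mark has to have at least one character in front of it, so a quote-style retweet that opens with "@ryan is left alone. The sample tweets are invented, and the sketch assumes Python's re behaves like Tweetbot's engine for this pattern.

```python
import re

# Straight or curly opening quote directly before @ryan, with at
# least one character earlier on the line.
quote_before_reply = re.compile(r'^.+[“"](?i:@ryan)')

if __name__ == "__main__":
    for tweet in [
        'omg "@ryan where are you',       # quote mid-tweet: filtered
        '"@ryan: regex is hard" agreed',  # quote-style retweet: kept
        '@ryan "nice" post',              # quotes elsewhere: kept
    ]:
        print(bool(quote_before_reply.search(tweet)), "|", tweet)
```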

Empty replies

For tweets that only contain "@ryan" and nothing else. Sorry, @Hodgman.

^(?i:@ryan)$

All caps

A lot of wrongryans come in the form of all caps. As usual, don't ask me why. But I'm pretty sure I've never seen an intentional reply come in this way.

(@RYAN)

Truncated mentions of other Ryans

I see this one a lot specifically when retweets of TechCrunch articles by @RyanLawler get truncated. This version uses a single ellipsis character or standard three periods (thanks @noahhendrix!).

(?i:@ryan)(…|\.{3,})$
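
As a rough check, this Python sketch runs the pattern against a few invented tweets. It only fires when the truncation lands exactly after "@ryan" at the very end of the tweet. Same caveat as elsewhere: Python's re standing in for Tweetbot's engine is an assumption.

```python
import re

# @ryan followed by a single ellipsis character or three or more
# periods, anchored to the end of the tweet.
truncated = re.compile(r"(?i:@ryan)(…|\.{3,})$")

if __name__ == "__main__":
    for tweet in [
        "RT @TechCrunch: Some headline by @Ryan…",  # cut off mid-handle
        "Some headline by @ryan...",                # three periods
        "Some headline by @RyanLawler",             # full handle survived
        "thanks @ryan.",                            # one period only
    ]:
        print(bool(truncated.search(tweet)), "|", tweet)
```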

Trailing mentions

Again, chalk it up to "I don't get it", but a lot of wrongryans just feature a "@ryan" right at the end of the tweet for seemingly no good reason. But the problem with doing a simple expression like @ryan$ is that sometimes people write real tweets with a trailing mention, like a hat tip, cc, via, etc. (e.g. "Blah blah blah /via @ryan").

We can prevent those crediting mentions from getting accidentally blocked, though! For that we'll use a negative lookbehind:

(?i)(?<!cc|ht|via|/|:) @ryan$
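
If you want to try this one in Python, note that Python's re module rejects a lookbehind whose alternatives have different lengths, so the expression above won't compile there as-is. The sketch below chains one lookbehind per credit word instead, which should be logically equivalent. The helper name is_trailing_spam and the sample tweets are made up.

```python
import re

# Same idea as the filter above, rewritten for Python: a trailing
# " @ryan" is flagged unless it directly follows cc, ht, via, "/" or ":".
trailing = re.compile(r"(?i)(?<!cc)(?<!ht)(?<!via)(?<!/)(?<!:) @ryan$")


def is_trailing_spam(tweet: str) -> bool:
    """Hypothetical helper: True if the tweet would be filtered."""
    return trailing.search(tweet) is not None


if __name__ == "__main__":
    for tweet in [
        "so bored in class today @ryan",  # filtered
        "Blah blah blah /via @ryan",      # credit: kept
        "Great read, cc @Ryan",           # credit: kept
        "what a night @ryan",             # kept, see note below
    ]:
        print(is_trailing_spam(tweet), "|", tweet)
```

One quirk worth knowing: the lookbehind only looks at the characters right before the space, so any word ending in "ht" or "cc" (like "night") also shields the tweet from the filter.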

Anti-Seacrest, Sheckler, Reynolds, etc.

Duh. Also, his name is occasionally typoed as "seacret". Okay.

(?i:seacres?t)

Apparently the kids these days also try to tweet at someone named Ryan Sheckler, who, Wikipedia tells me, is a professional skateboarder with an MTV show. Many also typo his name "shecler" or "shekler".

(?i:shec?k?ler)

And then there's Ryan "Renolds" (sigh).

(?i:rey?nolds)
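
The three name filters can be exercised together with a short Python sketch. The dictionary, the helper matched_names, and the sample tweets are all illustrative and not part of the original filter setup.

```python
import re

# The three name filters from above. Each "?" makes one letter
# optional, which is what catches the common typos.
NAME_FILTERS = {
    "seacrest": re.compile(r"(?i:seacres?t)"),
    "sheckler": re.compile(r"(?i:shec?k?ler)"),
    "reynolds": re.compile(r"(?i:rey?nolds)"),
}


def matched_names(tweet: str) -> list[str]:
    """Hypothetical helper: which name filters hit this tweet."""
    return [name for name, p in NAME_FILTERS.items() if p.search(tweet)]


if __name__ == "__main__":
    for tweet in [
        "I love Ryan Seacret",
        "@ryan SHECLER is so cool",
        "Ryan Renolds movie tonight",
        "hi @ryan, loved the article",
    ]:
        print(matched_names(tweet), "|", tweet)
```

Because both the "c" and the "k" are optional in the Sheckler pattern, it would also catch "sheler", which seems like an acceptable trade.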

Anti-Paul Ryan

Fun tip: left and right alike, people who wrongryan on Paul Ryan's name tend to be nut jobs. And most make the same mistake. I also pair this with any mentions of @romney, which really does it. This one is just a regular filter, though, not regex:

paul @ryan

Anti-Ryan Murphy

This guy does the Glee show, right? Gleeks aren't very excellent at Twittering.

@ryan murphy

Anti-other Ryans

Oh, there are so, so many. I've just completely filtered out anything including:

  • Gosling
  • Lochte
  • Boyce
  • Beatty
  • Higa (with preceding space, so as not to filter out words like "Michigan")
  • Giggs
  • The list goes on...
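
To see why the leading space on Higa matters, here's a tiny Python sketch. It assumes a plain keyword filter behaves like a case-insensitive substring match, which is a guess about how Tweetbot works and not something documented here. The names KEYWORDS and is_filtered are invented.

```python
# Plain keyword filters from the list above. Note the leading space
# in " higa". ASSUMPTION: a keyword filter is a case-insensitive
# substring match; Tweetbot's real behavior may differ.
KEYWORDS = ["gosling", "lochte", "boyce", "beatty", " higa", "giggs"]


def is_filtered(tweet: str) -> bool:
    """Hypothetical helper: True if any keyword appears in the tweet."""
    text = tweet.lower()
    return any(keyword in text for keyword in KEYWORDS)


if __name__ == "__main__":
    for tweet in [
        "ryan higa is hilarious",          # filtered
        "Greetings from Michigan, @ryan",  # kept thanks to the space
        "Higa uploaded a new video",       # kept: no space before Higa
    ]:
        print(is_filtered(tweet), "|", tweet)
```

Under that assumption, the trade-off is that a tweet starting with "Higa" slips through, since nothing precedes the name.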

Hat tip!

A big ass tip of the hat to Githubbers Justin and imathis for their awesome filters, as well as Logan Bailey for some input.

@jcornelius

Thank you, Ryan. This should be a big help. Now if I could only filter non-English tweets as that's a good percentage of the mis-tweets to my @jc account. Any ideas on that? You can have a look at http://twitter.com/jc_mistweeted to see some of the garbage I've tolerated.
