The t command-line Twitter tool is a great way to work with Twitter information in a spreadsheet.
Its homepage with good installation instructions is here:
https://github.com/sferik/t
And I've written some related instructions about how to get an authentication token from Twitter:
http://www.compjour.org/tutorials/twitter-app-authentication-process/
Once you have it installed and you're authenticated, you can do a basic search for Tweets like this:
$ t search all 'nicar16'
The default behavior is to present the tweets in a human-readable format:
@mailbackwards
Good morning Denver, I'm at #NICAR16. Find me and say hi (and then come to
our talk on Sunday)
@tbtprojx
RT @MarshallProj: And about building your own criminal justice data w
@ultracasual @gabrieldance, @kenandavis + more at 3:30
https://t.co/mTNK1a1Xox #NICAR16
@sickmund
RT @MarshallProj: And about building your own criminal justice data w
@ultracasual @gabrieldance, @kenandavis + more at 3:30
https://t.co/mTNK1a1Xox #NICAR16
@tbtprojx
RT @MarshallProj: #NICAR16: Learn how to keep those news apps skills sharp at
11:30, with @gabrieldancehttp://bit.ly/1nwH4Zd
@rdmurphy
RT @A_L: Want to learn how to work with satellite data? @esagara and I will
be sharing our secrets today at 11:30 #NICAR16
But you can get them in CSV format using the --csv flag:
$ t search all 'nicar16' --csv
ID | Posted at | Screen name | Text |
---|---|---|---|
707951040855982080 | 2016-03-10 15:27:35 +0000 | MaiAndy | RT @nkhensley: Saturday. #NICAR16 https://t.co/IBbqmP8KIo |
707950349508739072 | 2016-03-10 15:24:51 +0000 | ashlynstill | RT @Lindzcook: Join @ashlynstill and me in Denver 4 at 9am to learn programming concepts using fun games! Great place to start for newcomers #NICAR16 |
707950090355216384 | 2016-03-10 15:23:49 +0000 | karanormal | It's a beautiful day to live in Denver... Because #NICAR16. |
707949741179428864 | 2016-03-10 15:22:26 +0000 | HBCompass | Starting off #NICAR16 by tilting off a bench just in case everyone didn't know I'm awkward as hell. https://t.co/9HJ1Z6lvFT |
707949606831665153 | 2016-03-10 15:21:53 +0000 | nkhensley | Saturday. #NICAR16 https://t.co/IBbqmP8KIo |
707949340040548352 | 2016-03-10 15:20:50 +0000 | AlexSecanove | RT @biologypartners: Investigative journalists & data miners: welcome to Colorado. There are some exciting data analytics startups here for you to meet. #NICAR16 |
707949060238344193 | 2016-03-10 15:19:43 +0000 | natecarlisle | And @TonySemerad and I just landed at DEN. Next stop: #NICAR16 |
707949028881731585 | 2016-03-10 15:19:36 +0000 | michelleminkoff | Let #nicar16 officially begin -- my uniform is on! It's go time! https://t.co/K2Z2DIfu04 |
707948651151122433 | 2016-03-10 15:18:06 +0000 | ryanngro | My sixth NICAR conf and the first where I fell asleep before midnight on the first night. Losing my touch. #NICAR16 |
707948445131268096 | 2016-03-10 15:17:17 +0000 | 1GKh | RT @FerretScot: If you're interested in investigative journalism it's worth keeping an eye on #NICAR16 as it unfolds |
707948358275444736 | 2016-03-10 15:16:56 +0000 | cjsinner | SUPER excited for my first #NICAR16 😁😁😁 |
By default, the 20 most recent tweets are returned. You can change this with the -n flag; I believe the maximum number of results is capped at 3200, or at however many tweets containing the queried term have been posted in the last 7 days, whichever is smaller.
And of course, you'll most likely want to redirect this output straight into a text file that you can open in Excel or what have you:
$ t search all 'nicar16' --csv -n 3200 > nicar16tweets.csv
The t search subcommand lets you narrow the query to just your own timeline (t search timeline 'nicar16') or even to a specific list. Run t search help to see the descriptions:
t search all QUERY # Returns the 20 most recent Tweets that match the specified query.
t search favorites [USER] QUERY # Returns Tweets you've favorited that match the specified query.
t search help [COMMAND] # Describe subcommands or one specific subcommand
t search list [USER/]LIST QUERY # Returns Tweets on a list that match the specified query.
t search mentions QUERY # Returns Tweets mentioning you that match the specified query.
t search retweets [USER] QUERY # Returns Tweets you've retweeted that match the specified query.
t search timeline [USER] QUERY # Returns Tweets in your timeline that match the specified query.
t search users QUERY # Returns users that match the specified query.
This is also a good time to try out csvkit, rather than using a spreadsheet.
Use csvcut with the -n flag to see the headers:
$ csvcut -n nicar16tweets.csv
1: ID
2: Posted at
3: Screen name
4: Text
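If csvkit isn't installed yet, a rough stand-in for the header listing can be put together from standard Unix tools. This is only a sketch: it assumes the header is the first line of the file and that no column name contains a comma, which holds for the four headers shown above. The list_headers function and the sample_tweets.csv file are names I made up for illustration; the sample rows are copied from the output earlier in this post.

```shell
# Stand-in for nicar16tweets.csv: the header plus two rows from the output above.
cat > sample_tweets.csv <<'EOF'
ID,Posted at,Screen name,Text
707951040855982080,2016-03-10 15:27:35 +0000,MaiAndy,RT @nkhensley: Saturday. #NICAR16 https://t.co/IBbqmP8KIo
707950090355216384,2016-03-10 15:23:49 +0000,karanormal,It's a beautiful day to live in Denver... Because #NICAR16.
EOF

# list_headers FILE (hypothetical helper): take the header line, put each
# column name on its own line, and number the lines.
list_headers() {
  head -n 1 "$1" | tr ',' '\n' | awk '{ printf "%d: %s\n", NR, $0 }'
}

list_headers sample_tweets.csv
```

For anything beyond peeking at the header, csvcut is the safer choice, since it actually parses quoted fields.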
Here's how to get the most frequent users (by screen name) of the hashtag in the set of tweets you've downloaded:
$ csvcut -c 'Screen name' nicar16tweets.csv | sort | uniq -c | sort -rn
82 BizJournalism
20 MacDiva
19 ultracasual
18 Jeremy_CF_Lin
17 IRE_NICAR
15 tbtprojx
15 RajneeshB
14 palewire
13 brentajones
13 KateReports
13 DanielleAlberti
12 seecmb
12 benlkeith
12 KarrieKehoe
12 HacksHackersCO
11 livlab
11 dougfisher
10 wjchat
10 harrisj
9 onyxfish
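One wrinkle with that pipeline: the header row gets counted too, so somewhere near the bottom of the full list there will be a line for a "user" called Screen name. Here's a minimal sketch that drops the header and keeps only the top few names. The screen_names and top_tweeters functions and the sample_tweets.csv file are hypothetical names for illustration, and the fallback branch (used only when csvcut isn't on the PATH) assumes that no tweet contains a line break and that the first three columns never contain commas.

```shell
# Stand-in for nicar16tweets.csv: the header plus three rows from the output above.
cat > sample_tweets.csv <<'EOF'
ID,Posted at,Screen name,Text
707951040855982080,2016-03-10 15:27:35 +0000,MaiAndy,RT @nkhensley: Saturday. #NICAR16 https://t.co/IBbqmP8KIo
707950090355216384,2016-03-10 15:23:49 +0000,karanormal,It's a beautiful day to live in Denver... Because #NICAR16.
707949606831665153,2016-03-10 15:21:53 +0000,nkhensley,Saturday. #NICAR16 https://t.co/IBbqmP8KIo
EOF

# screen_names FILE (hypothetical helper): print the screen name column
# without its header row.
screen_names() {
  if command -v csvcut >/dev/null 2>&1; then
    csvcut -c 'Screen name' "$1" | tail -n +2
  else
    # Fallback assumption: one tweet per line, no commas before column 4.
    tail -n +2 "$1" | cut -d, -f3
  fi
}

# top_tweeters FILE [N] (hypothetical helper): the N most frequent screen
# names, printed as "count name". N defaults to 10.
top_tweeters() {
  screen_names "$1" | sort | uniq -c | sort -rn | head -n "${2:-10}" |
    awk '{ print $1, $2 }'
}

top_tweeters sample_tweets.csv 2
```

Against the real download, that would be something like top_tweeters nicar16tweets.csv 20.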
If you need yet another example of why you should stay away from Excel (and any other spreadsheet, but mostly Excel on OS X) until you absolutely need a spreadsheet, you will get an inexplicable error, complaining that the file is not a valid SYLK file, when opening up the CSV file provided by t if you're on OS X.
The reason? Because when the first letters in a file are ID, this causes Excel to shit itself: it decides the file must be in the SYLK format rather than plain CSV. It's hard to imagine the logic that went into that decision to hardcode ID as a magic word: https://support.microsoft.com/en-us/kb/215591
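If the file really does have to go into Excel, one workaround is to make sure it no longer starts with the uppercase letters ID. As far as I know, lowercasing that first header cell is enough. Here's a minimal sketch using sed; the excel_safe_csv function and both filenames are hypothetical, and the sample row is copied from the output earlier in this post.

```shell
# Stand-in for nicar16tweets.csv: the header plus one row from the output above.
cat > sample_tweets.csv <<'EOF'
ID,Posted at,Screen name,Text
707949060238344193,2016-03-10 15:19:43 +0000,natecarlisle,And @TonySemerad and I just landed at DEN. Next stop: #NICAR16
EOF

# excel_safe_csv IN OUT (hypothetical helper): copy IN to OUT, lowercasing a
# leading "ID" header cell. Only line 1 is touched, and only when it begins
# with exactly "ID,".
excel_safe_csv() {
  sed '1s/^ID,/id,/' "$1" > "$2"
}

excel_safe_csv sample_tweets.csv sample_tweets_excel.csv
head -n 1 sample_tweets_excel.csv
```

Keep in mind that the column is then named id rather than ID, so any csvkit commands run against the converted file need to use the lowercase name.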