@howdydoody123
Last active November 1, 2023 19:51
'''
Author: Sean Beck
A script for downloading a Twitter user's
tweet archive. Note that it can only retrieve
a user's 3200 most recent tweets, per a Twitter
limitation mentioned on their documentation
page: https://dev.twitter.com/rest/reference/get/statuses/user_timeline
'''
import csv
import os
import argparse

import tweepy

# Directory containing this script; the config file is expected alongside it
PATH = os.path.dirname(os.path.realpath(__file__))


def get_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('-u', '--username', type=str, required=True,
                        help='The user you would like to get an archive of')
    return parser.parse_args()


def get_api():
    '''
    Creates an instance of the tweepy API class
    '''
    # The config file holds one credential per line, in this order
    with open(os.path.join(PATH, 'config')) as f:
        api_key = f.readline().strip()
        api_secret = f.readline().strip()
        access_token = f.readline().strip()
        access_token_secret = f.readline().strip()
    auth = tweepy.OAuthHandler(api_key, api_secret)
    auth.set_access_token(access_token, access_token_secret)
    return tweepy.API(auth)


def get_tweets(username):
    api = get_api()
    tweets = []
    current = api.user_timeline(screen_name=username, count=200)
    # Keep paging until a request comes back empty (also covers users with no tweets)
    while len(current) > 0:
        tweets.extend(current)
        # max_id is inclusive, so step past the oldest tweet fetched so far
        last_id = tweets[-1].id - 1
        current = api.user_timeline(screen_name=username, count=200, max_id=last_id)
    return [[tweet.id_str, tweet.created_at, tweet.text.encode('utf-8')] for tweet in tweets]


if __name__ == '__main__':
    args = get_args()
    tweets = get_tweets(args.username)
    filename = '%s_tweets.csv' % args.username
    # Python 2 print statements; Python 3 requires print(...)
    print 'Got %d tweets from user %s' % (len(tweets), args.username)
    print
    print 'Writing to CSV file named %s' % filename
    with open(filename, 'wb') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(['id', 'created_at', 'text'])
        writer.writerows(tweets)
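
The max_id paging that get_tweets relies on can be tried without Twitter credentials. What follows is only a minimal Python 3 sketch, not part of the original script: fake_timeline and fetch_all are hypothetical names, and fake_timeline is an assumed stand-in for api.user_timeline that returns plain integer ids, newest first. It illustrates why 1 is subtracted from the last id: max_id is inclusive, so reusing the id unchanged would fetch the oldest tweet a second time.

```python
def fake_timeline(all_ids, count=200, max_id=None):
    """Hypothetical stand-in for api.user_timeline.

    Returns up to `count` ids, newest first, none greater than max_id.
    """
    eligible = [i for i in all_ids if max_id is None or i <= max_id]
    return sorted(eligible, reverse=True)[:count]


def fetch_all(all_ids, page_size=200):
    """Collect every id by repeatedly asking for items older than the last one seen."""
    collected = []
    page = fake_timeline(all_ids, count=page_size)
    while page:
        collected.extend(page)
        # max_id is inclusive, so subtract 1 to avoid refetching the oldest item
        last_id = collected[-1] - 1
        page = fake_timeline(all_ids, count=page_size, max_id=last_id)
    return collected


ids = list(range(1, 451))  # 450 made-up tweet ids
result = fetch_all(ids)
print(len(result), result[0], result[-1])  # prints: 450 450 1
```

Starting the loop with the first page already in hand means an empty timeline simply skips the loop body and yields an empty list.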

ghost commented May 25, 2015

Hello! I'm currently working on my MA thesis, and this script promises everything that I need. Unfortunately (and due to my limited programming skills) I can't get it to work. Currently I get an error message saying there is invalid syntax in this line: "print 'Got %d tweets from user %s' % (len(tweets), args.username)".

As said, I'm completely new to this, but getting it to work on my pc would be extremely beneficial for my research!

@howdydoody123
Author

@MutsFM 2 years late to this since Gist doesn't give me notifications about comments, but my guess is you installed Python 3, which uses a different syntax for print. In the future, make sure to copy the entire error output you get so others can better help you.

@miltonmateus

@MutsFM I had the same error using it today. Put parentheses around what each print statement prints and it will work.
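
For anyone hitting the same error, here is a rough Python 3 sketch of the script's output step. It is an illustration rather than the author's code: write_tweets_csv is a hypothetical helper, and the sample row and username are made up to stand in for the result of get_tweets. Besides the print parentheses, it assumes two further changes that Python 3 needs: the CSV file is opened in text mode with newline='' instead of 'wb', and the tweet text is left as a str rather than encoded to bytes, since the csv module would otherwise write it as b'...'.

```python
import csv


def write_tweets_csv(rows, filename):
    """Write [id, created_at, text] rows to a CSV file under Python 3."""
    # Text mode with newline='' replaces Python 2's 'wb' for the csv module
    with open(filename, 'w', newline='', encoding='utf-8') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(['id', 'created_at', 'text'])
        writer.writerows(rows)


# Hypothetical sample data standing in for the output of get_tweets
rows = [['1', '2015-05-25 12:00:00', 'hello wörld']]
username = 'example_user'
filename = '%s_tweets.csv' % username

# print is a function in Python 3, so its arguments go in parentheses
print('Got %d tweets from user %s' % (len(rows), username))
print()
print('Writing to CSV file named %s' % filename)
write_tweets_csv(rows, filename)
```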
