@bonzanini
Last active May 20, 2025 02:55
Twitter Stream Downloader
consumer_key = 'your-consumer-key'
consumer_secret = 'your-consumer-secret'
access_token = 'your-access-token'
access_secret = 'your-access-secret'
# To run this code, first edit config.py with your configuration, then:
#
# mkdir data
# python twitter_stream_download.py -q apple -d data
#
# It will produce the list of tweets for the query "apple"
# in the file data/stream_apple.json
import tweepy
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import time
import argparse
import string
import config
import json


def get_parser():
    """Get parser for command line arguments."""
    parser = argparse.ArgumentParser(description="Twitter Downloader")
    parser.add_argument("-q",
                        "--query",
                        dest="query",
                        help="Query/Filter",
                        default='-')
    parser.add_argument("-d",
                        "--data-dir",
                        dest="data_dir",
                        help="Output/Data Directory")
    return parser


class MyListener(StreamListener):
    """Custom StreamListener for streaming data."""

    def __init__(self, data_dir, query):
        query_fname = format_filename(query)
        self.outfile = "%s/stream_%s.json" % (data_dir, query_fname)

    def on_data(self, data):
        try:
            with open(self.outfile, 'a') as f:
                f.write(data)
                print(data)
                return True
        except BaseException as e:
            print("Error on_data: %s" % str(e))
            time.sleep(5)
        return True

    def on_error(self, status):
        print(status)
        return True


def format_filename(fname):
    """Convert file name into a safe string.
    Arguments:
        fname -- the file name to convert
    Return:
        String -- converted file name
    """
    return ''.join(convert_valid(one_char) for one_char in fname)


def convert_valid(one_char):
    """Convert a character into '_' if invalid.
    Arguments:
        one_char -- the char to convert
    Return:
        Character -- converted char
    """
    valid_chars = "-_.%s%s" % (string.ascii_letters, string.digits)
    if one_char in valid_chars:
        return one_char
    else:
        return '_'


@classmethod
def parse(cls, api, raw):
    status = cls.first_parse(api, raw)
    setattr(status, 'json', json.dumps(raw))
    return status


if __name__ == '__main__':
    parser = get_parser()
    args = parser.parse_args()
    auth = OAuthHandler(config.consumer_key, config.consumer_secret)
    auth.set_access_token(config.access_token, config.access_secret)
    api = tweepy.API(auth)
    twitter_stream = Stream(auth, MyListener(args.data_dir, args.query))
    twitter_stream.filter(track=[args.query])

@bonzanini
Author

@markgillis0 Unfortunately, exact phrase matching is not supported by the Twitter streaming API yet: https://dev.twitter.com/streaming/overview/request-parameters#track
On the other hand, it is supported by the search API.
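
One possible workaround, sketched here under assumptions: track the individual words with the stream, then keep only the payloads whose text contains the exact phrase. `contains_phrase` is a hypothetical helper, not part of tweepy, and it assumes each payload is a JSON object with a `text` field.

```python
import json


def contains_phrase(raw, phrase):
    """Return True if the tweet JSON in `raw` contains `phrase` verbatim.

    Hypothetical helper: the match is case-insensitive and only looks at
    the "text" field, which is an assumption about the payload layout.
    """
    try:
        tweet = json.loads(raw)
    except ValueError:
        return False  # not valid JSON, e.g. an empty keep-alive line
    if not isinstance(tweet, dict):
        return False
    text = tweet.get("text") or ""
    return phrase.lower() in text.lower()
```

Inside `on_data`, the write could then be guarded with `if contains_phrase(data, "apple pie"):`, so only matching tweets reach the output file.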

@shannonwho

Hi! Thank you very much for sharing.

The code works fine when I query for apple, but no other keyword seems to work. Do you happen to know why that is?

Any suggestions will be really helpful!

@ajax-jones

I find that the 401 is what you get before you set up your config.py with the Twitter app credentials. I get the None error if -d is not specified. So I create a sub-directory and use that, and then it works fine:
sudo mkdir mydir
sudo python tweet.py -q apple -d mydir

@Parth-Vader

If I want to store just the "text" portion, how can I do it?
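
A minimal sketch of one way to do this, assuming the `data` argument passed to `on_data` is a JSON string and that tweets carry a `text` field. `extract_text` is a hypothetical helper name; payloads without a `text` key (the stream may also deliver other kinds of messages) yield `None`.

```python
import json


def extract_text(raw):
    """Return the "text" field of a raw payload, or None if it has none."""
    try:
        payload = json.loads(raw)
    except ValueError:
        return None  # not valid JSON
    if isinstance(payload, dict):
        return payload.get("text")
    return None
```

In `MyListener.on_data`, the line `f.write(data)` could then become `f.write(text + "\n")` after checking that `text = extract_text(data)` is not `None`.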

@kmrsatish17

I'm getting this error. Please help!!
Error on_data: [Errno 2] No such file or directory: 'data/stream_apple.json'

@kjoth

kjoth commented Dec 4, 2016

How do I get a list of my followers?

for friends in tweepy.Cursor(api.followers).items():
    fw.write('Friends: ' + str(follower_ids) + "\n")

follower_ids is not found
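
The error most likely comes from the loop binding each item to `friends` while the body refers to `follower_ids`, a name that is never defined. The sketch below uses the loop variable itself. `write_followers` is a hypothetical helper, and the stand-in objects only mimic the `screen_name` attribute of a user object so the example runs without credentials.

```python
import io
from types import SimpleNamespace


def write_followers(followers, fw):
    """Write one line per follower, using the loop variable directly."""
    count = 0
    for follower in followers:
        fw.write("Follower: " + follower.screen_name + "\n")
        count += 1
    return count


# Stand-in data (assumption): real code would pass the cursor's items.
sample = [SimpleNamespace(screen_name="alice"),
          SimpleNamespace(screen_name="bob")]
buffer = io.StringIO()
write_followers(sample, buffer)
```

With real credentials, the iterable from the snippet above, `tweepy.Cursor(api.followers).items()`, would take the place of `sample`, and an open file would take the place of `buffer`.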

@rsathishr

Am getting an error!! pls help me out

Failed on data: %s '_io.TextIOWrapper' object has no attribute 'Write'
ERROR: execution aborted

@GabrielYe

GabrielYe commented Mar 14, 2017

@bonzanini Thanks for your code. How can I get all of the tweets of a specific user? For example, I want to get the tweets of Kobe.
Thank you.

@yuchenQ

yuchenQ commented Mar 19, 2017

@bonzanini Hi, thanks for your great example. May I ask what the use of the following is?
@classmethod
def parse(cls, api, raw):

Thanks

@baoyanpeng

Thanks a lot. I have a question: can I obtain data about some keywords from before today?

@Dixith-Reddy-Nayeni

Thank u very much...it worked for me..:)

@vibhuti1990

Hi, could you please help me with the error below?

Error on_data: [Errno 2] No such file or directory: 'None/stream_-.json'

@zeanong

zeanong commented Sep 12, 2017

Works well. Thank you!

@L-Kov

L-Kov commented Sep 22, 2017

I get the error:
line 96, in
auth = OAuthHandler(config.consumer_key, config.consumer_secret)
AttributeError: 'module' object has no attribute 'consumer_key'

what config module do you use?

@Kanishk-Anand

Kanishk-Anand commented Oct 31, 2017

I keep getting 401 as output. I have set up the config.py file with my credentials, still it gives 401. Any help?

@m-abubakar-saddique

How to limit the tweets?
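
One approach, sketched under the assumption of the Tweepy 3.x listener interface used in the gist: a traceback further down this thread shows the read loop checking `if self.listener.on_data(data) is False:`, so returning `False` from `on_data` after a fixed number of payloads should end the stream. `LimitedCounter` and `handle` are hypothetical names, and the class deliberately does not import tweepy so it can be run on its own.

```python
class LimitedCounter:
    """Counts payloads and returns False once `limit` has been reached."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0

    def on_data(self, data):
        self.count += 1
        self.handle(data)
        # False is the stop signal; True keeps the stream running.
        return self.count < self.limit

    def handle(self, data):
        """Override to store the payload; the default does nothing."""


counter = LimitedCounter(3)
results = [counter.on_data("payload") for _ in range(3)]
```

The same counter could be folded into `MyListener.on_data`, incrementing after each successful write and returning `False` once the limit is hit.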

@salsaeede

I keep getting the error below. Can someone help me import config successfully?

import config
Traceback (most recent call last):

File "C:\Users\salman\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2862, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)

File "", line 1, in
import config

File "C:\Users\salman\Anaconda3\lib\site-packages\config.py", line 733
except Exception, e:
^
SyntaxError: invalid syntax

@SjorsG

SjorsG commented Jan 15, 2018

I'm sorry to bother you after you have already written this beautiful piece of code.
I think the answer lies in the comment section of your code, but I just can't seem to get it right.
How do I edit config.py with my configuration, and then run:
mkdir data
python twitter_stream_download.py -q apple -d data

I get this error:

line 96, in
auth = OAuthHandler(config.consumer_key, config.consumer_secret)
AttributeError: 'module' object has no attribute 'consumer_key'

Thank you for your time

@ericdorsey

@SjorsG
Are you sure you have a file called "config.py" in the same folder, that has a variable in it that's called "consumer_key", that has your key assigned to it?

consumer_key = 'YOURCONSUMERKEYHERE'
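
A quick way to check this, as a sketch: list which of the four names from the gist's config.py are missing on whatever module `import config` actually resolved to. An earlier traceback in this thread shows a different `config.py` living in site-packages, so printing `config.__file__` alongside may reveal that the wrong module was picked up. `missing_credentials` is a hypothetical helper, and the stand-in object below only imitates such an unrelated module.

```python
from types import SimpleNamespace

REQUIRED = ("consumer_key", "consumer_secret",
            "access_token", "access_secret")


def missing_credentials(module):
    """Return the names from REQUIRED that `module` does not define."""
    return [name for name in REQUIRED if not hasattr(module, name)]


# Stand-in (assumption) for an unrelated module that happens to be
# named `config` and has none of the credentials.
wrong_config = SimpleNamespace(__file__="site-packages/config.py")
missing = missing_credentials(wrong_config)
```

In the real script this would be `import config` followed by `print(config.__file__, missing_credentials(config))`; an empty list means all four names are present.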

@pbajpai2

I'm using Python 3.7.0 and downloaded Tweepy 3.6.0

After running config.py (which ends successfully) and doing the mkdir data step, I get the following error when running twitter_stream_download.py:

C:\Users\pbajp\Git\datasci_course_materials\assignment1\alternate>python twitter_stream_download.py -q apple -d data
Traceback (most recent call last):
  File "twitter_stream_download.py", line 9, in
    import tweepy
  File "C:\Users\pbajp\AppData\Local\Programs\Python\Python37\lib\site-packages\tweepy\__init__.py", line 17, in
    from tweepy.streaming import Stream, StreamListener
  File "C:\Users\pbajp\AppData\Local\Programs\Python\Python37\lib\site-packages\tweepy\streaming.py", line 358
    def start(self, async):
                    ^
SyntaxError: invalid syntax

Can anyone guide me on next steps to debug?
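
A likely cause, offered as a sketch and not a confirmed diagnosis: `async` became a reserved keyword in Python 3.7, so a library file that uses it as a parameter name no longer compiles. A later traceback in this thread shows tweepy calling `self._start(is_async)`, which suggests that newer Tweepy releases renamed the parameter, so upgrading tweepy may resolve it. The snippet below only demonstrates the syntax issue; `compiles` is a hypothetical helper.

```python
import keyword
import sys


def compiles(source):
    """Return True if `source` is syntactically valid on this interpreter."""
    try:
        compile(source, "<sketch>", "exec")
        return True
    except SyntaxError:
        return False


old_style = "def start(self, async):\n    pass\n"
new_style = "def start(self, is_async):\n    pass\n"

# Show the interpreter version and whether `async` is reserved on it.
print(sys.version_info[:2], keyword.iskeyword("async"))
print(compiles(old_style), compiles(new_style))
```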

@agcala

agcala commented Jul 12, 2018

@rsathishr
It is "write" not "Write"

@Germain94

Hello everyone.
First of all, thank you for your work, @bonzanini!
I'm trying to search for tweets from two weeks ago until now. Can I adapt your code to do that?

@AreRex14

Works fine. Thank you for your work, @bonzanini

@arnabghose997

For those who are facing the following error:

Error on_data: [Errno 2] No such file or directory: 'None/stream_-.json'

You have to create a folder named "data" in the same directory, for the code to work. Hope this helps.
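
The path in this error, `None/stream_-.json`, suggests something more than a missing folder: `data_dir` has no default in the gist's parser, so it is `None` when `-d` is not passed, and `-` is the default query. Running the script from an IDE without arguments would produce exactly this path. A minimal sketch of a stricter variant follows. Making both options required and creating the directory are changes to the gist, not its original behaviour, and `prepare_output` is a hypothetical helper.

```python
import argparse
import os


def get_parser():
    """Variant of the gist's parser where both options are mandatory."""
    parser = argparse.ArgumentParser(description="Twitter Downloader")
    parser.add_argument("-q", "--query", dest="query", required=True,
                        help="Query/Filter")
    parser.add_argument("-d", "--data-dir", dest="data_dir", required=True,
                        help="Output/Data Directory")
    return parser


def prepare_output(data_dir, query_fname):
    """Create `data_dir` if needed and return the output file path."""
    os.makedirs(data_dir, exist_ok=True)
    return os.path.join(data_dir, "stream_%s.json" % query_fname)
```

With this variant, omitting `-d` or `-q` makes argparse exit with a usage message instead of silently building a `None/...` path.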

@Carpintonto

Maybe I am totally missing something, but it sure seems to me that the script is fully functional without import json or the @classmethod.

@Benasir1

For those who are facing the following error:

Error on_data: [Errno 2] No such file or directory: 'None/stream_-.json'

You have to create a folder named "data" in the same directory, for the code to work. Hope this helps.

@arnabghose997 I still face the same problem after creating the folder 'data' in the same directory.

@PranjalShekhawat

Any idea how to resolve this error, please?

runfile('C:/Users/chhaj/OneDrive/Desktop/test4 tweet search.py', wdir='C:/Users/chhaj/OneDrive/Desktop')
Error on_data: [Errno 2] No such file or directory: 'None/stream_-.json'
Traceback (most recent call last):

File "", line 1, in
runfile('C:/Users/chhaj/OneDrive/Desktop/test4 tweet search.py', wdir='C:/Users/chhaj/OneDrive/Desktop')

File "C:\Users\chhaj\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
execfile(filename, namespace)

File "C:\Users\chhaj\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "C:/Users/chhaj/OneDrive/Desktop/test4 tweet search.py", line 95, in
twitter_stream.filter(track=[args.query])

File "C:\Users\chhaj\Anaconda3\lib\site-packages\tweepy\streaming.py", line 453, in filter
self._start(is_async)

File "C:\Users\chhaj\Anaconda3\lib\site-packages\tweepy\streaming.py", line 368, in _start
self._run()

File "C:\Users\chhaj\Anaconda3\lib\site-packages\tweepy\streaming.py", line 269, in _run
self._read_loop(resp)

File "C:\Users\chhaj\Anaconda3\lib\site-packages\tweepy\streaming.py", line 331, in _read_loop
self._data(next_status_obj)

File "C:\Users\chhaj\Anaconda3\lib\site-packages\tweepy\streaming.py", line 303, in _data
if self.listener.on_data(data) is False:

File "C:/Users/chhaj/OneDrive/Desktop/test4 tweet search.py", line 50, in on_data
time.sleep(5)

KeyboardInterrupt

@valdassukevicius

Worked just fine from cmd with Python 3.8.5; I just needed to create a data sub-folder within the assignment.

@lognguyen

@pbajpai2 I don't know if you've fixed that one yet. If you use a different IDE or interpreter when editing and running the two files, that might be the problem. In my case, I used Anaconda, so I had to use the Anaconda Prompt to run it properly.

@Wandering-Mind

@markgillis0 unfortunately exact phrase matching is not supported by the twitter streaming API yet: https://dev.twitter.com/streaming/overview/request-parameters#track on the other side, it is supported by the search API

I was thinking something very similar to this original comment about multiple character searches. What's the probability that this is a feature used elsewhere? If the probability is high, how would you begin to build it out? Ballpark estimates are fine.
