@ndamulelonemakh
Last active October 8, 2021 10:30
Misc web scraping scripts
# i. Search for tweets containing the hashtag '#messi' posted in 2020
# Results will be saved to a CSV file named 'messitweets.csv'
# Other filters: --lang en keeps only English tweets, --stats shows the number of likes and retweets for each post, --count shows the number of posts returned
twint --search "#messi" --since 2020-01-01 --until 2020-12-31 --limit 50000 --popular --csv -o "messitweets.csv" --lang en --stats --hashtags --count
# ii. Collect all tweets from a user's timeline (here the user 'endeesa'); results are saved to 'out.csv'
twint --since 2020-01-01 --until 2020-12-31 --limit 50000 --popular --csv -o "out.csv" --lang en --stats --hashtags --count -u endeesa -tl
# Sample request URL sent to Twitter
# https://api.twitter.com/2/search/adaptive.json?include_can_media_tag=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweet=true&count=100&query_source=typed_query&cursor=-1&spelling_corrections=1&ext=mediaStats%252ChighlightedLabel&tweet_search_mode=live&l=en&lang=en&q=%24%20since%3A1293840000%20until%3A1325289600
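The `since:` and `until:` values in the `q` parameter of the sample URL appear to be Unix epoch seconds rather than the `YYYY-MM-DD` dates passed on the command line. As a minimal sketch for reading them, the snippet below converts the two values to calendar dates. It assumes GNU `date` from coreutils; the variable names are illustrative and are not part of twint.

```shell
# Assumption: GNU date (coreutils). On macOS/BSD, `date -u -r <seconds>` is the usual equivalent.
since_epoch=1293840000   # value following since%3A in the sample URL
until_epoch=1325289600   # value following until%3A in the sample URL

# Convert epoch seconds to UTC calendar dates
since_date=$(date -u -d "@${since_epoch}" +%Y-%m-%d)
until_date=$(date -u -d "@${until_epoch}" +%Y-%m-%d)

echo "since=${since_date} until=${until_date}"
# prints: since=2011-01-01 until=2011-12-31
```

The decoded range covers 2011, so the sample URL seems to come from a different run than the 2020 commands above.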