Last active
October 8, 2021 10:30
Misc web scraping scripts
# i. Search for tweets containing the hashtag '#messi' posted in 2020
# Results are saved to a CSV file named 'messitweets.csv'
# Other filters: --lang en returns only tweets in English, --stats shows the number of likes and retweets for each post, --count shows the number of posts returned
twint --search "#messi" --since 2020-01-01 --until 2020-12-31 --limit 50000 --popular --csv -o "messitweets.csv" --lang en --stats --hashtags --count
# ii. Collect all tweets from a user's timeline
twint --since 2020-01-01 --until 2020-12-31 --limit 50000 --popular --csv -o "out.csv" --lang en --stats --hashtags --count -u endeesa -tl
# Sample request URL sent to Twitter
# https://api.twitter.com/2/search/adaptive.json?include_can_media_tag=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweet=true&count=100&query_source=typed_query&cursor=-1&spelling_corrections=1&ext=mediaStats%252ChighlightedLabel&tweet_search_mode=live&l=en&lang=en&q=%24%20since%3A1293840000%20until%3A1325289600
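The `q` parameter in the sample URL above decodes to `$ since:1293840000 until:1325289600`, so the date bounds appear to travel as Unix epoch seconds. The sketch below reproduces those two values from calendar dates. It assumes GNU `date` (coreutils) is installed; the variable names `since_epoch` and `until_epoch` are illustrative and are not part of twint.

```shell
# Assumption: GNU date (coreutils). BSD/macOS date uses different flags.
# Convert calendar dates to Unix epoch seconds at midnight UTC,
# the form the since:/until: operators take in the sample URL.
since_epoch=$(date -u -d "2011-01-01" +%s)
until_epoch=$(date -u -d "2011-12-31" +%s)

# Print the operators as they would appear in the decoded query string.
echo "since:${since_epoch} until:${until_epoch}"
# -> since:1293840000 until:1325289600
```

The sample URL therefore covers 2011-01-01 to 2011-12-31, not the 2020 range used in the commands above.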