Created November 18, 2022 03:03
Archive a user's tweets
library(rtweet)
library(tidyverse)

# Initial authorization setup, only need to do once
# auth_setup_default() #nolint

# Authorize
auth_as("default")

# Set user name
user_name <- "PUT USER NAME HERE"

# Download all tweets from user (as many as possible anyways)
# The docs say it should work for up to 3200 tweets
# https://docs.ropensci.org/rtweet/reference/get_timeline.html
my_tweets <- get_timeline(
  user = user_name,
  n = Inf,
  retryonratelimit = TRUE)

# And save them forever!
saveRDS(my_tweets, "my_tweets.RDS")

# Extract URLs of media (images etc)
# We can map back to the tweet the image came from by `tweet_id`, which
# matches `id` of my_tweets
media_url <-
  my_tweets %>%
  rename(tweet_id = id) %>%
  mutate(media = map(entities, "media")) %>%
  select(tweet_id, media) %>%
  unnest(media) %>%
  filter(!is.na(id)) %>%
  mutate(
    filename = str_match(media_url, "\\/([^\\/]*)$") |>
      magrittr::extract(, 2)
  )
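
# Download media to "media" folder
# Create the folder first, since download.file() will not create it
dir.create("media", showWarnings = FALSE)
# mode = "wb" keeps binary files such as images intact on Windows
walk2(
  media_url$media_url,
  media_url$filename,
  ~download.file(.x, glue::glue("media/{.y}"), mode = "wb"))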
@darachm glad it worked for you!
It should be possible to wrap this into a function that could be run from the command line - basically, the only input you need is the user name. But as you note, they would still need to authorize using rtweet.
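A rough sketch of what such a wrapper might look like is below. The names `clean_user_name()` and `archive_tweets()` and the file name `archive_tweets.R` are made up for illustration and are not part of rtweet; stripping a leading "@" is just a convenience assumption. It also assumes `auth_setup_default()` has already been run once on the machine.

```r
# Hypothetical helper: validate the user name given on the command line.
# Stripping a leading "@" is a convenience assumption, not an rtweet rule.
clean_user_name <- function(x) {
  x <- sub("^@", "", x)
  if (!nzchar(x)) {
    stop("User name must not be empty", call. = FALSE)
  }
  x
}

# Hypothetical wrapper around the download-and-save steps of the script.
# Assumes auth_setup_default() was already run once on this machine.
archive_tweets <- function(user_name, out_file = "my_tweets.RDS") {
  rtweet::auth_as("default")
  tweets <- rtweet::get_timeline(
    user = user_name,
    n = Inf,
    retryonratelimit = TRUE)
  saveRDS(tweets, out_file)
  invisible(tweets)
}

# Run the archive only if a user name was passed as an argument;
# otherwise print a usage hint when called non-interactively
args <- commandArgs(trailingOnly = TRUE)
if (length(args) > 0) {
  archive_tweets(clean_user_name(args[[1]]))
} else if (!interactive()) {
  message("Usage: Rscript archive_tweets.R <user_name>")
}
```

Saved as `archive_tweets.R`, this would be called as `Rscript archive_tweets.R SOME_USER_NAME`.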
However, there's another reason the current script wouldn't work well for non-useRs: the data are nested, so you can't write them out to a flat CSV (they originally come from the twitter API as JSON). I save them as an RDS file, which you can only really work with in R. So if you wanted to make a "general purpose" script it might be better just to save the JSON.
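To illustrate the nesting problem, here is a small sketch using the jsonlite package with a made-up two-row data frame standing in for the rtweet output. The toy data and column values are invented, and whether every column of a real `my_tweets` object serializes cleanly is not something this checks. rtweet's `get_timeline()` also documents a `parse` argument for returning the raw list, which may be closer to "just save the JSON"; the sketch does not use it.

```r
library(jsonlite)

# Toy stand-in for the rtweet output (made-up values): one row per tweet,
# with a list-column holding a nested data frame of media per tweet
toy_tweets <- data.frame(id = c("1", "2"), stringsAsFactors = FALSE)
toy_tweets$media <- list(
  data.frame(media_url = "http://example.com/a.jpg", stringsAsFactors = FALSE),
  data.frame(media_url = character(0), stringsAsFactors = FALSE)
)

# A flat CSV has no natural way to hold the list-column, but JSON can nest it
json_file <- tempfile(fileext = ".json")
write_json(toy_tweets, json_file, pretty = TRUE)

# Reading it back without simplification gives plain nested lists,
# the same structure any other JSON-aware tool would see
restored <- read_json(json_file, simplifyVector = FALSE)
```

The resulting file can be opened outside of R, which is the main advantage over the RDS file.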
Hey works for me! Good job
Is there a way to allow non-R users to use it? Something so simple that someone could download R and run it with Rscript on Macs and ???? on windowz? I don't know how
FYI y'all new users of rtweet will need to run auth_setup_default() once to login. rtweet also wants httpuv so of course to set it up you'd need to install that (rtweet prompts but if you're in a script well)
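One way around the missing prompt in a script might be to check for the packages up front. This is only a sketch: `missing_packages()` is a made-up helper, and the package list is taken from the script and the note about httpuv above.

```r
# Packages the archive script relies on; httpuv is the one rtweet
# would otherwise prompt for during login
required <- c("rtweet", "tidyverse", "httpuv")

# Hypothetical helper: return the subset of `pkgs` that is not installed
missing_packages <- function(pkgs) {
  pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
}

# Tell the user what to install instead of failing partway through
to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message(
    "Install first: install.packages(c(",
    paste0('"', to_install, '"', collapse = ", "),
    "))")
}
```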