Twitter: How to archive your following/followers data (usernames, etc)

Twitter allows users to download parts of their data; see How to download your Twitter archive.

But what's not included in that data dump is the usernames/handles of the accounts you follow or that follow you. All you get is account IDs, which are internal numbers and therefore not much use for archival.

Here's a way to get that data (you'll need to be comfortable running commands in a terminal):

  1. Go to your Twitter profile in a desktop browser (Firefox or Chrome)
  2. Right click on page → Inspect → Network tab
  3. Click on the Following link (e.g. https://twitter.com/{yourusername}/following)
  4. Scroll through the entire list of following users (e.g. keep pressing page down, space or scroll)
  5. In the filter/search bar of the Network tab: type Following? to only get the requests we're interested in
  6. Right click on requests → Save All As HAR or Save all as HAR with content; Name it following.har

Then, in a terminal, with jq installed, run:

jq -r '.log.entries[] | .response.content.text | fromjson? | .data.user.result.timeline.timeline.instructions[-1].entries[] | .content.itemContent.user_results.result | values | [.rest_id, .legacy.screen_name, .legacy.name, .legacy.description] | @csv' following.har > following.csv

That will give you the ID, username, display name and description of each account in a CSV file.

If you want to archive all the data that Twitter returns, run this to get a JSON stream:

jq -r '.log.entries[] | .response.content.text | fromjson? | .data.user.result.timeline.timeline.instructions[-1].entries[] | .content.itemContent.user_results.result | values' following.har > following.json

If you want to export your followers too, clear the Network requests and then repeat the steps but use the Followers link, filter requests by Followers? and use different filenames. The rest is the same.
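With both CSVs in hand you can cross-reference them, for example to list mutuals (accounts present in both). A sketch using made-up stand-ins for following.csv and followers.csv — the sample rows below are invented, and the first CSV column is assumed to be the account ID as produced by the jq command above:

```shell
# Made-up stand-ins for the real following.csv / followers.csv
cat > following_sample.csv <<'EOF'
"1","alice","Alice","hi"
"2","bob","Bob","yo"
EOF
cat > followers_sample.csv <<'EOF'
"2","bob","Bob","yo"
"3","carol","Carol","hey"
EOF

# Take the first column (account ID), sort, and keep lines common to both.
# comm requires sorted input; -12 suppresses lines unique to either file.
cut -d, -f1 following_sample.csv | sort > following.ids
cut -d, -f1 followers_sample.csv | sort > followers.ids
mutuals=$(comm -12 following.ids followers.ids)
echo "$mutuals"   # "2"
```

Cutting on commas is safe here only because the ID is the first field; later fields like the description may themselves contain commas.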

(Tweet about it here: https://twitter.com/niborst/status/1589883662048579584)

If you want to do all of the above plus fetch the banner and avatar images of each account, you can run the following script after producing the HAR files. It will create a followers and a following directory in the directory where you run it, and fill each one with sub-directories named after the username and ID of each account, containing the banner (if available) and avatar images for that account.

#!/bin/sh

for t in followers following; do
    # extract all fields into JSON
    jq -r '.log.entries[] | .response.content.text | fromjson? | .data.user.result.timeline.timeline.instructions[-1].entries[] | .content.itemContent.user_results.result | values' \
        "$t.har" \
        > "$t.json"

    # extract essential fields into a TSV
    jq -r '.log.entries[] | .response.content.text | fromjson? | .data.user.result.timeline.timeline.instructions[-1].entries[] | .content.itemContent.user_results.result | values | [.rest_id, .legacy.screen_name, .legacy.name, .legacy.description, .legacy.profile_banner_url, .legacy.profile_image_url_https] | @tsv' \
        "$t.har" \
        > "$t.tsv"

    # make a directory to hold account images
    mkdir -p "$t"

    # fetch banner (if available) and avatar images; keep fields tab-separated
    # so an empty banner URL doesn't shift the avatar URL into its place
    awk -F"\t" -v OFS="\t" '{print $1, $2, $5, $6}' "$t.tsv" \
        | while IFS="$(printf '\t')" read -r id username banner_url avatar_url; do
            d=$t/${username}-${id}
            mkdir -p "$d"
            if [ -n "$banner_url" ]; then
                curl -fsS -o "$d/banner" "$banner_url"
            fi
            if [ -n "$avatar_url" ]; then
                curl -fsS -o "$d/avatar" "$avatar_url"
            fi
        done

    # add file extension based on detected type (typically "jpeg" or "png")
    find "$t" -type f | while read -r f; do
        ext=$(file -b "$f" | awk '{print $1}' | tr "[:upper:]" "[:lower:]")
        mv "$f" "$f.$ext"
    done
done
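The extension step at the end relies on file -b printing a type description whose first word works as an extension. A quick sketch of that mechanism on a throwaway text file (image files yield "jpeg"/"png" the same way):

```shell
# Create a throwaway file and derive an extension from file(1)'s type guess,
# exactly as the loop above does: first word of the output, lowercased
# ("ASCII text" -> "ascii"; a real PNG would give "png").
printf 'hello\n' > sample_file
ext=$(file -b sample_file | awk '{print $1}' | tr "[:upper:]" "[:lower:]")
echo "$ext"   # ascii
```

This is a heuristic: file's wording varies by type and version, so spot-check a few results before trusting the renames in bulk.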