@gnomon-
Last active January 6, 2017 04:32
#!/bin/bash

# Edit this to point at wherever you ran
# `git clone https://github.com/decklin/curlicue.git`.
curlicue_path="${HOME}/src/curlicue"
function print_usage {
	cat <<-_END_OF_HELP_
rethread-tweets

USAGE
  rethread-tweets [tweet_id] [other_tweet_id(s) ...]

  Given the tweet ID of the last tweet in a thread, recursively retrieve up to the head
  of the thread, then print out the text of the tweets in sequence (lowest-to-highest
  tweet ID, roughly equivalent to chronological order, in theory).

  If you pass in more than one tweet ID, the threads will be reconstructed one by one.

DEPENDENCIES
  - openssl
  - curl (with SSL support for HTTPS retrieval)
  - jq
    https://stedolan.github.io/jq/
  - curlicue
    https://github.com/decklin/curlicue
  - a Twitter API access token to pass to curlicue
    https://dev.twitter.com/oauth/overview/application-owner-access-tokens
_END_OF_HELP_
	return 0
}

if ((0 == $#)) ; then
	print_usage
	exit 0
fi
while (($# > 0)) ; do
	# twid='816631120196501504'
	twid=$1
	ids=( "$twid" )

	# Follow the reply chain upward until the parent ID is no longer numeric
	# (presumably `null` once we reach the head of the thread).
	while [[ $twid =~ ^[0-9]+$ ]] ; do
		# I use `in_reply_to_status_id_str` here rather than `in_reply_to_status_id`
		# because my version of `jq` apparently lacks bignum arithmetic and truncates
		# the numeric values to 53 bits of precision, which IS NOT THE SAME THING AT ALL
		# kthxbyeeeeee
		twid_next=$("${curlicue_path}"/contrib/twitpull -n -j '.in_reply_to_status_id_str' 'statuses/show' "id=${twid}")
		ids+=("$twid_next")
		twid=$twid_next
		printf . >&2
	done

	# The last element is the non-numeric value that ended the loop; drop it.
	unset "ids[${#ids[@]}-1]"
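	# statuses/lookup wants a comma-separated ID list (the API documents a limit of
	# 100 IDs per request, so longer threads would need batching).
	ids_str=${ids[*]}
	ids_str=${ids_str// /,}

	# Sort on [length, string] so the string IDs order numerically even if they
	# differ in digit count; a plain string sort is only right for equal lengths.
	"${curlicue_path}"/contrib/twitpull -n -j 'sort_by(.id_str | [length, .]) | .[].full_text' 'statuses/lookup' "id=${ids_str}" 'tweet_mode=extended'
	printf '\n'
	shift
done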
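
For anyone who wants to poke at the thread-walking logic without an API token, here is a rough offline sketch of the same idea. `fake_parent_of` is a made-up stand-in for the `twitpull` call and the IDs 101 to 103 are invented, so treat it as an illustration rather than a piece of the script. Unlike the loop above, it appends an ID only after checking that it is numeric, so there is no trailing element to drop afterwards.

```bash
#!/bin/bash

# Hypothetical stand-in for the twitpull lookup: maps a tweet ID to the ID it
# replies to, and prints "null" for a tweet that is not a reply.
fake_parent_of() {
	case "$1" in
		103) echo 102 ;;
		102) echo 101 ;;
		*)   echo null ;;
	esac
}

# Walk from the given tweet ID up to the head of its thread and print the
# collected IDs as one comma-separated line, newest first.
thread_ids() {
	local twid=$1
	local ids=()
	while [[ $twid =~ ^[0-9]+$ ]] ; do
		ids+=("$twid")
		twid=$(fake_parent_of "$twid")
	done
	# "${ids[*]}" joins on the first character of IFS.
	local IFS=,
	printf '%s\n' "${ids[*]}"
}

thread_ids 103
```

Run as-is, this prints `103,102,101`, which is the shape of string that gets handed to `statuses/lookup` as `id=...`.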
gnomon- commented Jan 6, 2017

This pairs well with a little dingus I whipped up to run Twitter images through tesseract; I've got it as a little scriptlet in ~/bin/ named tw_ocr:

#!/bin/bash

COLUMNS=$(tput cols)

while (($# > 0)) ; do
	url_tw=$1
	# Collect the og:image URLs advertised in the tweet page's <meta> tags.
	url_img=($( \
		curl -L -s "$url_tw" | \
		  awk '/property="og:image"/ {
			i=$0;
			sub(/.*content="/, "", i);
			sub(/".*/,         "", i);
			print i;
		}' \
	))
	for ((i=0; i<${#url_img[@]}; i++)) ; do
		fn="$(mktemp -p /dev/shm/ --suffix=.jpeg)" || continue
		curl -Ls -o "$fn" "${url_img[i]}" && \
		  tesseract "$fn" stdout -l eng
		rm -- "$fn"	# clean up even if the download or OCR failed
		(((${#url_img[@]}-i)>1)) && printf '\n\n----\n\n'
	done | fmt -w "$COLUMNS" | less
	shift
done
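
The awk bit is the only mildly fiddly part, so here is a small sketch of it on its own, fed from a string instead of `curl`. The `og_images` function name and the sample markup (with placeholder example.com URLs) are invented for illustration. It also assumes each `<meta>` tag sits on a single line with `property=` ahead of `content=`, which real pages may not guarantee.

```bash
#!/bin/bash

# Print the content="..." value of every og:image <meta> tag read from stdin.
og_images() {
	awk '/property="og:image"/ {
		i = $0
		sub(/.*content="/, "", i)   # drop everything up to the opening quote
		sub(/".*/, "", i)           # drop the closing quote and the rest
		print i
	}'
}

# Made-up sample markup; the URLs are placeholders.
sample='<meta property="og:title" content="a tweet">
<meta property="og:image" content="https://example.com/a.jpg">
<meta property="og:image" content="https://example.com/b.jpg">'

printf '%s\n' "$sample" | og_images
```

This prints the two image URLs, one per line, and skips the `og:title` tag.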
