@leondz
Created May 30, 2018 14:53
Extract unix time to nearest millisecond from Twitter tweet ID
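# use id_str instead of id to avoid overflows when accidentally casting to int
def twitter_id_to_epoch(id_str):
    # credit to "On the endogenesis of Twitter’s Spritzer and Gardenhose sample streams" Kergl et al., Proc ASONAM 2014 (IEEE)
    id_str = id_str.replace("'", "")  # strip any stray quote characters around the ID
    id_i = int(id_str)
    bitstring = "{:064b}".format(id_i)  # zero-padded 64-bit binary representation
    timestamp_b = bitstring[1:42]  # skip the sign bit, keep the 41 timestamp bits
    snowflake_epoch = int(timestamp_b, 2)  # milliseconds since the Twitter epoch
    epoch_ms = snowflake_epoch + 1288834974657  # shift to Unix epoch milliseconds
    return epoch_ms / 1000.0  # Unix time in seconds, with millisecond precision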

dkergl commented Jun 1, 2018

@leondz: Thanks for the credit! In my fork I've replaced some of the string operations, which are quite expensive in Python, with a faster bit-shifting operation. I think the underlying logic of the conversion is still visible, while this version runs in about half the time, which might be crucial for real-time environments.
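
For readers following along, here is a minimal sketch of what a bit-shifting variant could look like. It is not the code from the fork mentioned above, which is not shown here; the function name `twitter_id_to_epoch_shift` and the constant names are made up for illustration. It assumes the same layout as the gist: a 64-bit ID whose bits 1 to 41 hold the timestamp, so shifting right by 22 discards the same low-order bits that the string slice `[1:42]` drops.

```python
# Offset between the Twitter epoch and the Unix epoch, in milliseconds
# (same constant as in the gist).
TWITTER_EPOCH_MS = 1288834974657

# Number of low-order bits below the timestamp field (64 - 42 = 22).
# Assumption: IDs are non-negative and fit in 63 bits, as in the gist.
LOW_BITS = 22


def twitter_id_to_epoch_shift(id_str):
    """Return Unix time in seconds (millisecond precision) for a tweet ID string."""
    id_i = int(id_str.replace("'", ""))  # tolerate stray quotes, like the gist
    snowflake_ms = id_i >> LOW_BITS      # keep only the timestamp bits
    return (snowflake_ms + TWITTER_EPOCH_MS) / 1000.0


if __name__ == "__main__":
    # Synthetic ID: timestamp field set to 1000 ms, arbitrary low bits.
    example_id = str((1000 << LOW_BITS) | 5)
    print(twitter_id_to_epoch_shift(example_id))
```

The shift and the slice should agree for any ID in that range, since both simply drop the 22 low-order bits. Whether the shift is actually about twice as fast will depend on the interpreter and workload, so it is worth timing on your own data.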
