@s2t2
Created February 4, 2022 16:46
column_name datatype description
user_id INTEGER unique identifier for each user in our "impeachment 2020" dataset
created_on DATE date the user was created
screen_name_count INTEGER number of screen names used
screen_names STRING all screen names used
is_bot BOOLEAN whether or not we classified this user as a "bot" / automated account
bot_rt_network INTEGER for bots, which retweet network (0:anti-trump, 1:pro-trump)
is_q BOOLEAN whether or not this user tweeted Q-anon language / hashtags
q_status_count INTEGER the number of tweets with Q-anon language / hashtags
status_count INTEGER number of total tweets authored by this user (in our "impeachment 2020" dataset only)
rt_count INTEGER number of total retweets authored by this user (in our "impeachment 2020" dataset only)
avg_score_lr FLOAT average opinion score from our Logistic Regression model (0:anti-trump, 1:pro-trump)
avg_score_nb FLOAT average opinion score from our Naive Bayes model (0:anti-trump, 1:pro-trump)
avg_score_bert FLOAT average opinion score from our BERT Transformer model (0:anti-trump, 1:pro-trump)
opinion_community INTEGER binary classification of average opinion (0:anti-trump, 1:pro-trump)
follower_count INTEGER number of followers (in our "impeachment 2020" dataset only)
follower_count_b INTEGER number of followers who are bots
follower_count_h INTEGER number of followers who are humans
friend_count INTEGER number of friends (in our "impeachment 2020" dataset only)
friend_count_b INTEGER number of friends who are bots
friend_count_h INTEGER number of friends who are humans
avg_toxicity FLOAT average "toxicity" score from the Detoxify model
avg_severe_toxicity FLOAT average "severe toxicity" score from the Detoxify model
avg_insult FLOAT average "insult" score from the Detoxify model
avg_obscene FLOAT average "obscene" score from the Detoxify model
avg_threat FLOAT average "threat" score from the Detoxify model
avg_identity_hate FLOAT average "identity hate" score from the Detoxify model
urls_shared_count (TODO) INTEGER number of tweets with URLs in them (TODO)
fact_scored_count INTEGER number of tweets with URL domains that we have rankings for
avg_fact_score FLOAT average fact score of links shared (1: fake news, 5: mainstream media)
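
As a rough illustration of how the datatypes above might be applied when reading an exported row, here is a minimal Python sketch. Everything in it beyond the column names and datatype labels is an assumption: the `CONVERTERS` mapping, the `parse_bool` and `parse_row` helpers, the "true"/"false" boolean encoding, the ISO date format, the treatment of empty strings as missing values, and the example row (which is made up, not taken from the dataset).

```python
from datetime import date


def parse_bool(text):
    # Assumption: booleans are exported as "true"/"false" (case-insensitive).
    return text.strip().lower() == "true"


# Hypothetical mapping from the datatype labels in the table to Python converters.
CONVERTERS = {
    "INTEGER": int,
    "FLOAT": float,
    "STRING": str,
    "BOOLEAN": parse_bool,
    "DATE": date.fromisoformat,  # assumption: dates are in YYYY-MM-DD form
}

# A subset of the columns listed above, with their documented datatypes.
SCHEMA = {
    "user_id": "INTEGER",
    "created_on": "DATE",
    "is_bot": "BOOLEAN",
    "bot_rt_network": "INTEGER",
    "status_count": "INTEGER",
    "avg_score_bert": "FLOAT",
    "opinion_community": "INTEGER",
    "avg_fact_score": "FLOAT",
}


def parse_row(raw):
    """Convert a dict of strings into typed values; empty strings become None."""
    parsed = {}
    for column, datatype in SCHEMA.items():
        text = raw.get(column, "")
        parsed[column] = CONVERTERS[datatype](text) if text != "" else None
    return parsed


# Made-up example row for illustration only.
example = {
    "user_id": "12345",
    "created_on": "2019-06-01",
    "is_bot": "false",
    "bot_rt_network": "",  # assumed to be empty for users not classified as bots
    "status_count": "42",
    "avg_score_bert": "0.91",
    "opinion_community": "1",
    "avg_fact_score": "3.5",
}

row = parse_row(example)
print(row)
```

Keeping the schema as a plain column-to-datatype mapping mirrors the layout of the table, so extending the sketch to the remaining columns only means adding more entries to `SCHEMA`.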