DATA.md

column_name	datatype	description
user_id	INTEGER	unique identifier for each user in our "impeachment 2020" dataset
created_on	DATE	date the user was created
screen_name_count	INTEGER	number of screen names used
screen_names	STRING	all screen names used
is_bot	BOOLEAN	whether or not we classified this user as a "bot" / automated account
bot_rt_network	INTEGER	for bots, which retweet network (0:anti-trump, 1:pro-trump)
is_q	BOOLEAN	whether or not this user tweeted Q-anon language / hashtags
q_status_count	INTEGER	the number of tweets with Q-anon language / hashtags
status_count	INTEGER	number of total tweets authored by this user (in our "impeachment 2020" dataset only)
rt_count	INTEGER	number of total retweets authored by this user (in our "impeachment 2020" dataset only)
avg_score_lr	FLOAT	average opinion score from our Logistic Regression model (0:anti-trump, 1:pro-trump)
avg_score_nb	FLOAT	average opinion score from our Naive Bayes model (0:anti-trump, 1:pro-trump)
avg_score_bert	FLOAT	average opinion score from our BERT Transformer model (0:anti-trump, 1:pro-trump)
opinion_community	INTEGER	binary classification of average opinion (0:anti-trump, 1:pro-trump)
follower_count	INTEGER	number of followers (in our "impeachment 2020" dataset only)
follower_count_b	INTEGER	... who are bots
follower_count_h	INTEGER	... who are humans
friend_count	INTEGER	number of friends (in our "impeachment 2020" dataset only)
friend_count_b	INTEGER	... who are bots
friend_count_h	INTEGER	... who are humans
avg_toxicity	FLOAT	average "toxicity" score from the Detoxify model
avg_severe_toxicity	FLOAT	average "severe toxicity" score from the Detoxify model
avg_insult	FLOAT	average "insult" score from the Detoxify model
avg_obscene	FLOAT	average "obscene" score from the Detoxify model
avg_threat	FLOAT	average "threat" score from the Detoxify model
avg_identity_hate	FLOAT	average "identity hate" score from the Detoxify model
urls_shared_count (TODO)	INTEGER	number of tweets with URLs in them (TODO)
fact_scored_count	INTEGER	number of tweets with URL domains that we have rankings for
avg_fact_score	FLOAT	average fact score of links shared (1: fake news, 5: mainstream media)
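
As a rough illustration of how the datatypes above might be applied when reading the data, here is a minimal Python sketch. It assumes the table is exported as a CSV file with a header row, that booleans are written as "true" / "false", that dates are ISO 8601 (YYYY-MM-DD), and that empty cells mean "no value" (for example `bot_rt_network` for a non-bot user). None of these export details are stated in this document, and the sample rows are made up for illustration only.

```python
import csv
import io
from datetime import date

# Subset of the columns documented above, mapped to their declared datatypes.
SCHEMA = {
    "user_id": "INTEGER",
    "created_on": "DATE",
    "is_bot": "BOOLEAN",
    "bot_rt_network": "INTEGER",
    "status_count": "INTEGER",
    "avg_score_bert": "FLOAT",
    "opinion_community": "INTEGER",
    "avg_fact_score": "FLOAT",
}

def parse_value(raw, datatype):
    """Convert one raw CSV cell to a Python value; empty cells become None."""
    if raw == "":
        return None
    if datatype == "INTEGER":
        return int(raw)
    if datatype == "FLOAT":
        return float(raw)
    if datatype == "BOOLEAN":
        # Assumption: booleans are exported as "true"/"false" (case-insensitive).
        return raw.strip().lower() == "true"
    if datatype == "DATE":
        # Assumption: dates are exported as ISO 8601 (YYYY-MM-DD).
        return date.fromisoformat(raw)
    return raw  # STRING and any unrecognized datatype stay as text

def read_users(handle):
    """Yield one typed dict per user row from a CSV file-like object."""
    for row in csv.DictReader(handle):
        yield {
            name: parse_value(value, SCHEMA.get(name, "STRING"))
            for name, value in row.items()
        }

# Made-up sample rows, for illustration only (not real data).
SAMPLE = """user_id,created_on,is_bot,bot_rt_network,status_count,avg_score_bert,opinion_community,avg_fact_score
1,2015-03-02,true,1,40,0.91,1,
2,2019-11-20,false,,3,0.12,0,4.5
"""

users = list(read_users(io.StringIO(SAMPLE)))
```

With a real export, `io.StringIO(SAMPLE)` would be replaced by an opened file handle, and `SCHEMA` would be extended to cover the remaining columns.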
