import nltk
import sys

# Load the pre-trained Punkt sentence tokenizer for English.
sent_detector = nltk.data.load('tokenizers/punkt/english.pickle')
with open(sys.argv[1], "r") as infile:
    fulltext = infile.read()
# Print only the sentences of 140 characters or fewer.
for sentence in sent_detector.tokenize(fulltext.strip()):
    if len(sentence) <= 140:
        print(sentence)
MSM: ("bbc.com", "reuters.com", "nytimes.com", "washingtonpost.com", "cnn.com",
      "telegraph.co.uk", "latimes.com", "huffingtonpost.com", "theguardian.com", "forbes.com",
      "examiner.com", "usatoday.com", "wsj.com", "cbsnews.com", "cbc.ca", "time.com",
      "sfgate.com", "newsweek.com", "bostonglobe.com", "nydailynews.com", "msnbc.com",
      "foxnews.com", "aljazeera.com", "nbcnews.com", "npr.org", "bloomberg.com", "abcnews.com",
      "bigstory.ap.com")
TABLOIDS: ['dailymail.co.uk', 'express.co.uk', 'mirror.co.uk',
           'news.com.au', 'nypost.com', 'thesun.co.uk', 'dailystar.co.uk', 'metro.co.uk']
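The listing does not show how these domain lists are used. The following is a minimal sketch, assuming the goal is to label a URL by outlet type. The function name `classify_source` is hypothetical, and the lists are shortened subsets for illustration only.

```python
from urllib.parse import urlparse

# Shortened, illustrative subsets of the lists above (assumption: the full
# lists would be used in practice).
MSM = ("bbc.com", "reuters.com", "nytimes.com", "theguardian.com")
TABLOIDS = ["dailymail.co.uk", "nypost.com", "thesun.co.uk"]


def classify_source(url):
    """Return "msm", "tabloid", or "other" based on the URL's host name."""
    host = (urlparse(url).hostname or "").lower()

    # Match the domain itself or any subdomain of it (e.g. www.bbc.com).
    def matches(domain):
        return host == domain or host.endswith("." + domain)

    if any(matches(d) for d in MSM):
        return "msm"
    if any(matches(d) for d in TABLOIDS):
        return "tabloid"
    return "other"


if __name__ == "__main__":
    print(classify_source("https://www.bbc.com/news/world"))
```

Matching on the dot-prefixed suffix avoids treating an unrelated host such as `notbbc.com` as `bbc.com`.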
import codecs
import json

# users_info (objects exposing a _json attribute) and FILENAME are defined elsewhere.
all_users_info = [x._json for x in users_info]
with codecs.open(FILENAME, mode="w", encoding="utf-8") as f:
    f.write(json.dumps(all_users_info, ensure_ascii=False))
from PIL import Image
import sys

## Example via: https://stackoverflow.com/questions/28057722/algorithm-to-turn-a-large-blob-of-text-into-an-image-as-defined-by-the-image-e
def to_ascii(img, maxLen=250.0):
    # Resize so the longer side is at most maxLen
    width, height = img.size
    rate = maxLen / max(width, height)
import sys, os, io
import simplejson as json
import zstandard as zstd

subreddit = "futurology"
infile = sys.argv[1]
outfile = sys.argv[2]
print("infile: {0}".format(infile))
{
  "//ABOUT THIS FILE": "This is a configuration file for a repeating Twitter farewell message that uses cheapbotsdonequick.com. To set it up, I logged into CheapBotsDoneQuick with my Twitter account, pasted this configuration file into the box, and set the announcement for twice daily. The site will now auto-post in perpetuity until one of the systems goes down or my account is banned from Twitter.",
  "origin": [
    " I have left Twitter, due to the dismantling of the platform's safety & security capacity.\n\n Find me on Mastodon, LinkedIn, or sign up for email updates: https://natematias.com/updates/ \n\n Thanks for #noun#, #people#. \n\nThis message repeats."
  ],
  "noun": [
    "all the support and love",
    "great conversations",
    "all the inspiration",
    "so many great discussions",
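The configuration above is a Tracery-style grammar: each `#name#` tag in the `origin` text is replaced by one randomly chosen entry from the list of that name. The following is a rough sketch of that substitution in Python, not the site's actual implementation. The `expand` function is hypothetical, the grammar is a shortened stand-in, and the `people` values are assumed because that list is not shown in the excerpt.

```python
import random
import re

# Hypothetical, shortened grammar in the same shape as the configuration above.
grammar = {
    "origin": ["Thanks for #noun#, #people#."],
    "noun": ["great conversations", "all the inspiration"],
    "people": ["everyone", "friends"],  # assumed values; not shown in the source
}


def expand(symbol, rules, rng=random):
    """Pick one option for `symbol` and recursively expand any #name# tags in it."""
    text = rng.choice(rules[symbol])
    return re.sub(r"#(\w+)#", lambda m: expand(m.group(1), rules, rng), text)


if __name__ == "__main__":
    print(expand("origin", grammar))
```

Each run can produce a different combination, which is why the repeated farewell message varies slightly between posts.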