Jacob Harris harrisj

@harrisj
harrisj / ambiguous_pronunciations
Created April 26, 2013 18:43
Here are the words we have flagged as having possibly ambiguous pronunciations. The haiku dictionary is case-insensitive, which is why they are in all caps (sometimes there is an issue because an abbreviation looks like a word).
AB
ABELES
ABERLE
ABKHAZIAN
ABLER
ABS
ABT
ABTS
ACQUIRING
ACREAGE
harrisj / gist:5431899
Created April 22, 2013 01:38
Additions to the haiku syllable dictionary 4/21/2013
100th 3
1850s 4
1860 4
1861 5
1862 5
1863 5
1865 5
1885 5
1909 4
1911 5
harrisj / gist:5392675
Created April 16, 2013 01:33
Recent additions to the syllable counts 4/15/2013
11,000 5
12,000 3
12:30 3
15,000 4
1908 4
1910 3
1913 4
1919 4
1920s 4
1925 5
harrisj / gist:5356851
Created April 10, 2013 17:48
Syllable counts added 4/10/13
adverts 2
agitpop 3
apses 2
arcology 4
avoider 3
backlot 2
bartend 2
basketful 3
batsman 2
batsmen 2
harrisj / gist:5328374
Created April 7, 2013 00:59
Additions to the syllable dictionary 4/6/13
acrostic 3
actressy 3
aerate 2
anesthetized 4
artiste 2
B.U. 2
backdiving 3
backheel 2
baggier 3
bearskin 2
harrisj / gist:5320576
Created April 5, 2013 16:18
Here is a recent sample of syllable counts I added to the Haiku Bot
akept 2
alighted 3
apexes 3
atemporal 4
azolla 3
benshi 2
bewigged 2
blarney 2
brocaded 3
bunga 2
harrisj / meter.rb
Created April 3, 2013 14:47
Some sample code for the term logic in the haiku finder
So, basically, scanning a sentence becomes like this:
1. Split the sentence into words
2. For each word, create a cleaned version (basically strip off quotes and punctuation) and look it up in the dictionary.
3. If not found, try a few fallbacks based on stemming rules or such
4. If it is still not found, add the word to a term_misses table.
This code is really ugly (sorry!), and you might notice I am actually talking about meter in it. This is because the background image for the haikus is generated from the meter. But I don't want to work out the meter for new words (sorry, a syllable count is enough), so for those I just return - as the meter (otherwise it's a combination of 1s and 0s)
HAIKU (11000 1010011 11110)
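The scanning steps described above could be sketched roughly as follows. This is not the actual meter.rb: the `DICTIONARY` hash, the `clean_word`, `fallback_syllables` and `scan_sentence` names, and the naive plural-stripping fallback are all assumptions made for illustration, and an in-memory array stands in for the term_misses table.

```ruby
# Hypothetical stand-in for the syllable dictionary (keys are lowercase
# because the lookup is case-insensitive).
DICTIONARY = {
  "batsman"  => 2,
  "bearskin" => 2,
  "alighted" => 3,
  "blarney"  => 2
}.freeze

# Step 2: strip quotes and punctuation from both ends of the word.
def clean_word(word)
  word.downcase.gsub(/\A[^a-z0-9]+|[^a-z0-9]+\z/, "")
end

# Step 3: one illustrative fallback (an assumption, not the real rules):
# retry without a trailing "s". This is only right when the plural does
# not add a syllable, so a real rule set would need more care.
def fallback_syllables(cleaned)
  return nil unless cleaned.end_with?("s")
  DICTIONARY[cleaned.chomp("s")]
end

# Steps 1-4: split, clean, look up, fall back, record misses.
def scan_sentence(sentence)
  counts = []
  misses = [] # stands in for the term_misses table
  sentence.split.each do |word|
    cleaned = clean_word(word)
    next if cleaned.empty?
    syllables = DICTIONARY[cleaned] || fallback_syllables(cleaned)
    if syllables
      counts << syllables
    else
      misses << cleaned
    end
  end
  # A sentence with any unknown word has no usable total.
  { total: misses.empty? ? counts.sum : nil, misses: misses }
end
```

Calling `scan_sentence('"Batsman" alighted, bearskins!')` resolves every word (the last one through the fallback), while a sentence containing an unknown word returns a nil total and lists that word under `:misses`.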
harrisj / tweet_signatures.md
Last active November 24, 2020 11:03
Tweet Signatures: a simple solution for thwarting forged tweets

A Simple Solution for Faked Tweets

Recently, a somewhat large selection of my timeline was shocked by the discovery that it's simple to make a fake-looking tweet on the web. Some feared it would be only a matter of time before some news organization is suckered by a fake tweet that seems to come from a real source.

Luckily, the solution already exists, and it's something you already use constantly: GNU Privacy Guard signatures. Here is an approach for verifying that a tweet is authentic and hasn't been tampered with, one so simple that even @KimKardashian could figure it out. To get started, we just need to do a little setup first:

  1. Of course, you have already installed GnuPG for your own use, generated a keypair and uploaded it to a keyserver so that other people can look it up. Its email address must be publicly listed in your twitter profile.
  2. Then, you must collect the public keys of the people you fo
harrisj / twitter_moderation
Created March 29, 2013 00:58
Using Twitter as a Content Filtering Interface
One random idea I have been kicking around is using Twitter as a crowdsourced moderation interface. This is an interesting approach in the following cases:
1. We receive many more potential items than we want to appear on the official, moderated interface.
2. The moderation twitter account is an interesting behind-the-scenes stream for people who are really interested in the content we are moderating down for the final site.
The final site could be another twitter account that posts only the best or anything else really (blog, tumblr, facebook, pinterest, what have you).
This approach won't really work well if you want to filter out questionable content. It's not an obscenity filter but merely a fun way of separating out the most popular items from the stream. How does it work?
For each item you post to the firehose twitter account, you store the following information in your DB:
harrisj / gist:4532095
Created January 14, 2013 18:21
Rocky Stracke is a GitHubber. I kid because I love...
1.9.3p327 :005 > 100.times { puts "#{Faker::Name.name} is a GitHubber" }; 0
Sid Littel is a GitHubber
Henri Wuckert is a GitHubber
Annette Ledner is a GitHubber
Mercedes Murray MD is a GitHubber
Rocky Stracke is a GitHubber
Miss Noah Reichel is a GitHubber
Miss Ludie Schaefer is a GitHubber
Mr. Isom Hilll is a GitHubber
Elva Homenick DVM is a GitHubber