These are the Kickstarter Engineering and Data role definitions for both teams.
I hereby claim:
- I am jimfingal on github.
- I am jimfingal (https://keybase.io/jimfingal) on keybase.
- I have a public key ASAWKvqX_TJ5LA5qI8IlZ2T9LYzZIOOAQtAzvxzLqzLbUAo
To claim this, I am signing this object:
After reading Darius Kazemi's post, "Aphorism detection for fun but definitely not profit", I wanted in. I've built a number of text-focused bots, but none that did anything more advanced than tokenizing text and feeding n-grams into Markov chains. I have some experience with NLP in Python, so I thought it would be fun to port his approach to Python.
The essence of Darius's algorithm is:
- Read in the corpus
- Tokenize corpus into sentences
- Filter out sentences that match a few basic patterns
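The steps listed so far can be sketched in Python roughly as follows. This is a minimal illustration, not Darius's code or the actual port: the function names are hypothetical, the regex sentence splitter is a naive stand-in (NLTK's `sent_tokenize` would likely be a more robust choice), and the two exclusion patterns are invented placeholders, since the real patterns are not listed here.

```python
import re

# Hypothetical exclusion patterns, chosen only for illustration.
# The actual "basic patterns" are not specified in this write-up.
EXCLUDE_PATTERNS = [
    re.compile(r"\d"),       # sentences containing digits
    re.compile(r'["“”]'),    # sentences containing quotation marks
]


def read_corpus(path):
    """Read the whole corpus file into a single string."""
    with open(path, encoding="utf-8") as f:
        return f.read()


def split_sentences(text):
    """Naively split text into sentences.

    Collapses runs of whitespace, then breaks after '.', '!' or '?'
    when followed by a space. Abbreviations such as "Dr." will be
    split incorrectly; a trained tokenizer handles those better.
    """
    normalized = " ".join(text.split())
    return [s for s in re.split(r"(?<=[.!?])\s+", normalized) if s]


def filter_sentences(sentences, patterns=EXCLUDE_PATTERNS):
    """Keep only sentences that match none of the exclusion patterns."""
    return [s for s in sentences if not any(p.search(s) for p in patterns)]
```

With these helpers, `filter_sentences(split_sentences(read_corpus("corpus.txt")))` would yield the candidate sentences for later steps, where `corpus.txt` is a placeholder file name.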