Skip to content

Instantly share code, notes, and snippets.

@crawles
Last active September 16, 2016 18:48
Show Gist options
  • Save crawles/bd22ceb6410ec1c6580b408ea1cc4f5d to your computer and use it in GitHub Desktop.
Save crawles/bd22ceb6410ec1c6580b408ea1cc4f5d to your computer and use it in GitHub Desktop.
Example
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook implements a pre-trained sentiment analysis pipeline including a regex pre-processing step, tokenization, n-gram computation, and logistic regreression model as a RESTful API."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2016-08-24T11:16:17.753340",
"start_time": "2016-08-24T11:16:17.733692"
},
"collapsed": false
},
"outputs": [],
"source": [
"import cPickle\n",
"import json\n",
"\n",
"import pandas as pd\n",
"import sklearn\n",
"import requests"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Import the trained model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2016-08-24T14:20:40.129088",
"start_time": "2016-08-24T14:20:40.044663"
},
"collapsed": true
},
"outputs": [],
"source": [
"resp = requests.get(\"https://raw.githubusercontent.com/crawles/gpdb_sentiment_analysis_twitter_model/master/twitter_sentiment_model.pkl\")\n",
"resp.raise_for_status()\n",
"cl = cPickle.loads(resp.content)"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2016-09-16T11:57:34.885062",
"start_time": "2016-09-16T11:57:34.874853"
}
},
"source": [
"### Import data pre-processing function"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def regex_preprocess(raw_tweets):\n",
" pp_text = pd.Series(raw_tweets)\n",
" \n",
" user_pat = '(?<=^|(?<=[^a-zA-Z0-9-_\\.]))@([A-Za-z]+[A-Za-z0-9]+)'\n",
" http_pat = '(https?:\\/\\/(?:www\\.|(?!www))[^\\s\\.]+\\.[^\\s]{2,}|www\\.[^\\s]+\\.[^\\s]{2,})'\n",
" repeat_pat, repeat_repl = \"(.)\\\\1\\\\1+\",'\\\\1\\\\1'\n",
"\n",
" pp_text = pp_text.str.replace(pat = user_pat, repl = 'USERNAME')\n",
" pp_text = pp_text.str.replace(pat = http_pat, repl = 'URL')\n",
" pp_text.str.replace(pat = repeat_pat, repl = repeat_repl)\n",
" return pp_text"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Setup the API"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Jupyter Kernel Gateway utilizes a global REQUEST JSON string that will be replaced on each invocation of the API."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"REQUEST = json.dumps({\n",
" 'path' : {},\n",
" 'args' : {}\n",
"})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Compute sentiment using trained model and serve using POST"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using the kernel gateway, a cell is created as an HTTP handler using a single line comment. The handler supports common HTTP verbs (GET, POST, DELETE, etc). For more information, view the <a href=\"https://jupyter-kernel-gateway.readthedocs.io/en/latest/http-mode.html\">docs</a>."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2016-08-22T17:52:15.653020",
"start_time": "2016-08-22T17:52:15.636712"
},
"collapsed": false
},
"outputs": [],
"source": [
"# POST /polarity_compute\n",
"req = json.loads(REQUEST)\n",
"tweets = req['body']['data']\n",
"print(cl.predict_proba(regex_preprocess(tweets))[:][:,1])"
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.12"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment