@jonhurlock
Created June 20, 2012 15:07
Example Python code for POSTing data to Elasticsearch with urllib2 and pycurl, which fails when the message contains escaped speech marks
############# My cluster's health
curl -XGET 'http://127.0.0.1:9200/_cluster/health?pretty=true'
{
"cluster_name" : "TweetHadoop",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 15,
"active_shards" : 15,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 15
}
############# Works fine for posting data
import urllib
import urllib2
url = 'http://localhost:9200/twitter/tweet/22499999'
data = '{"user" : "helenax33","message" : "LIVE: http://www.justin.tv/xxhelenaxx i look like a can read+sound like a man.","pubDate" : "20090705T21:46:34","isAQuestion" : "0" }'
#data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
the_page = response.read()
##### This also works fine for posting data.
import pycurl
apiURL = 'http://localhost:9200/twitter/tweet/22499999'
c = pycurl.Curl()
c.setopt(c.URL, apiURL)
c.setopt(c.POSTFIELDS, '{ "user" : "helenax33", "message" : "LIVE: http://www.justin.tv/xxhelenaxx i look like a can read+sound like a man.", "pubDate" : "20090705T21:46:34", "isAQuestion" : "0" }')
c.setopt(c.VERBOSE, True)
c.perform()
# However, if I put a speech mark (") in the message part, it fails, e.g.
import pycurl
apiURL = 'http://localhost:9200/twitter/tweet/22499999'
c = pycurl.Curl()
c.setopt(c.URL, apiURL)
c.setopt(c.POSTFIELDS, '{ "user" : "helenax33", "message" : "LIVE: http://www.justin.tv/xxhelenaxx i " look like a can read+sound like a man.", "pubDate" : "20090705T21:46:34", "isAQuestion" : "0" }')
c.setopt(c.VERBOSE, True)
c.perform()
# So I tried to escape the speech mark, e.g.
import pycurl
apiURL = 'http://localhost:9200/twitter/tweet/22499999'
c = pycurl.Curl()
c.setopt(c.URL, apiURL)
c.setopt(c.POSTFIELDS, '{ "user" : "helenax33", "message" : "LIVE: http://www.justin.tv/xxhelenaxx i \" look like a can read+sound like a man.", "pubDate" : "20090705T21:46:34", "isAQuestion" : "0" }')
c.setopt(c.VERBOSE, True)
c.perform()
# However, this still fails. Please help :(
@seanhandley

Ah. From that, it looks like you're escaping the whole JSON string, and that's confusing it. The JSON string's own quote marks should remain unescaped; it's the quote marks INSIDE the message you want to sort out. So re.escape(message) and then interpolate that into your JSON string. Hopefully you'll have a winner :-)

@seanhandley

i.e.

{ "user" : "someusername", "message" : "something they are tweeting about that contains \"speech marks\".", "pubDate" : "20090705T21:46:34"}
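One likely reason the escaped attempt in the gist still fails: inside an ordinary Python string literal, \" is an escape sequence that Python itself resolves to a bare ", so the backslash never reaches the server. The sketch below illustrates this with the standard json module as a stand-in validator; the variable names and the is_valid_json helper are made up for illustration, and how Elasticsearch reports the error is not shown here.

```python
import json

# Ordinary literal: Python consumes the backslash, leaving a bare " in the text.
plain = '{"message": "say \"hi\""}'

# Raw literal: the backslash survives, so the JSON parser sees \" as intended.
raw = r'{"message": "say \"hi\""}'


def is_valid_json(text):
    """Hypothetical helper: True if text parses as JSON."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


print(is_valid_json(plain))  # the bare quote ends the JSON string early
print(is_valid_json(raw))    # the escaped quote is kept inside the string
```

Writing the literal as a raw string (r'...') or doubling the backslash (\\") should both keep the escape intact, although hand-writing JSON this way stays fragile.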

@jonhurlock
Author

jonhurlock commented Jun 21, 2012 via email

@seanhandley

No, not ideal. Surely there's a Python JSON lib that will parse/encode for you? In Ruby you can just say .to_json and it works (escaping all dodgy chars also).
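There is one: the json module in the Python standard library. json.dumps serialises a dict and escapes embedded double quotes, backslashes and control characters along the way, whereas re.escape is aimed at regular-expression metacharacters rather than JSON. Below is a minimal sketch, assuming the same fields as the gist; build_tweet_payload is a hypothetical helper name, and the pycurl and urllib2 calls are left as comments so the example runs without a server.

```python
import json


def build_tweet_payload(user, message, pub_date, is_a_question):
    """Hypothetical helper: serialise one tweet document to a JSON string."""
    doc = {
        "user": user,
        "message": message,
        "pubDate": pub_date,
        "isAQuestion": is_a_question,
    }
    # json.dumps takes care of quoting and escaping every value.
    return json.dumps(doc)


message = 'LIVE: http://www.justin.tv/xxhelenaxx i " look like a can read+sound like a man.'
payload = build_tweet_payload("helenax33", message, "20090705T21:46:34", "0")

# Round-trip check: decoding gives back the original message, quote included.
assert json.loads(payload)["message"] == message
print(payload)

# The payload would then be passed on unchanged, e.g. (not run here):
#   c.setopt(c.POSTFIELDS, payload)   # pycurl, as in the gist
#   urllib2.Request(url, payload)     # urllib2, as in the gist
```

Building the document as a dict first also avoids having to think about Python's own string escaping at all.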
