Created December 30, 2013 20:33
I have an Elasticsearch index with the following schema:
# create entry
body = {'targeturl': urldata['targeturl'],  # url of website being crawled
        'docurl': docurl,                   # the url of the document
        'docname': docname,                 # the name of the document
        'linktext': linktext,               # the text within the <a> tags
        'pdftext': pdftext,                 # the full text of the document
        'pdfhash': pdfhash,                 # the MD5 hash of the document
        'scrapedatetime': scrapedatetime,   # the datetime the document was found
        'textfilename': textfilename,       # the name of the text file in the file store
        'pdffilename': pdffilename,         # the name of the pdf file in the file store
        'misfit': misfit,                   # boolean for downstream use
        'orgname': org['name'],             # name of the organization the doc belongs to
        'orgid': org['orgid'],              # org id in the DB
        'bodyid': org['bodyid']             # body id in the DB of the body the org belongs to
        }
# send to indexer
es = elasticsearch.Elasticsearch()
es.index(
    index="monroeminutes",
    doc_type="pdfdoc",
    id=uuid.uuid4(),
    body=body,
)
The code above passes each found document to Elasticsearch to be indexed. I now need to run a search that returns a document only if it matches the orgid I want. I believe I am just getting the syntax of the query wrong.
Line in question:
https://github.com/thequbit/monroeminutes/blob/master/src/search.py#L38
Error:
(mmenv)administrator@anna:~/dev/monroeminutes/src$ python search.py
No handlers could be found for logger "elasticsearch"
Traceback (most recent call last):
  File "search.py", line 72, in <module>
    response = search.search('scottsville',orgid=1)
  File "search.py", line 45, in search
    body=body
  File "/home/administrator/.virtualenvs/mmenv/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 70, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/home/administrator/.virtualenvs/mmenv/local/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 388, in search
    params=params, body=body)
  File "/home/administrator/.virtualenvs/mmenv/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 223, in perform_request
    status, raw_data = connection.perform_request(method, url, params, body, ignore=ignore)
  File "/home/administrator/.virtualenvs/mmenv/local/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 53, in perform_request
    self._raise_error(response.status, raw_data)
  File "/home/administrator/.virtualenvs/mmenv/local/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 82, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.TransportError: TransportError(400, u'SearchPhaseExecutionException[Failed to execute phase [query], all shards failed; shardFailures {[JMs9LW9cR726AXz43g05-A][monroeminutes][1]: SearchParseException[[monroeminutes][1]: from[-1],size[-1]: Parse Failure [Failed to parse source [{"query": {"match": {"pdftext": "scottsville", "orgid": 1}}, "from": 0, "size": 10}]]]; nested: QueryParsingException[[monroeminutes] [match] query parsed in simplified form, with direct field name, but included more options than just the field name, possibly use its \'options\' form, with \'query\' element?]; }{[JMs9LW9cR726AXz43g05-A][monroeminutes][4]: SearchParseException[[monroeminutes][4]: from[-1],size[-1]: Parse Failure [Failed to parse source [{"query": {"match": {"pdftext": "scottsville", "orgid": 1}}, "from": 0, "size": 10}]]]; nested: QueryParsingException[[monroeminutes] [match] query parsed in simplified form, with direct field name, but included more options than just the field name, possibly use its \'options\' form, with \'query\' element?]; }{[JMs9LW9cR726AXz43g05-A][monroeminutes][2]: SearchParseException[[monroeminutes][2]: from[-1],size[-1]: Parse Failure [Failed to parse source [{"query": {"match": {"pdftext": "scottsville", "orgid": 1}}, "from": 0, "size": 10}]]]; nested: QueryParsingException[[monroeminutes] [match] query parsed in simplified form, with direct field name, but included more options than just the field name, possibly use its \'options\' form, with \'query\' element?]; }{[JMs9LW9cR726AXz43g05-A][monroeminutes][3]: SearchParseException[[monroeminutes][3]: from[-1],size[-1]: Parse Failure [Failed to parse source [{"query": {"match": {"pdftext": "scottsville", "orgid": 1}}, "from": 0, "size": 10}]]]; nested: QueryParsingException[[monroeminutes] [match] query parsed in simplified form, with direct field name, but included more options than just the field name, possibly use its \'options\' form, with \'query\' element?]; 
}{[JMs9LW9cR726AXz43g05-A][monroeminutes][0]: SearchParseException[[monroeminutes][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [{"query": {"match": {"pdftext": "scottsville", "orgid": 1}}, "from": 0, "size": 10}]]]; nested: QueryParsingException[[monroeminutes] [match] query parsed in simplified form, with direct field name, but included more options than just the field name, possibly use its \'options\' form, with \'query\' element?]; }]')
(mmenv)administrator@anna:~/dev/monroeminutes/src$
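The parse failure above points at the query body itself: a `match` clause accepts a single field, so putting both `pdftext` and `orgid` inside one `match` is rejected. One possible fix, sketched here under assumptions rather than taken from the repository, is to wrap two single-field clauses in a `bool` query: a `match` on `pdftext` and a `term` on `orgid`. The helper name `build_search_body` and its `start` and `size` parameters are hypothetical, and the sketch assumes `orgid` is indexed as a number.

```python
import json


def build_search_body(keyword, orgid, start=0, size=10):
    """Build a search body: full-text match on pdftext, restricted to one orgid.

    Hypothetical helper for illustration; it is not part of search.py.
    """
    return {
        "query": {
            "bool": {
                # Both clauses must hold; each clause names exactly one field.
                "must": [
                    {"match": {"pdftext": keyword}},
                    {"term": {"orgid": orgid}},
                ]
            }
        },
        "from": start,
        "size": size,
    }


body = build_search_body("scottsville", orgid=1)
print(json.dumps(body, indent=2))
```

The resulting dict could then be passed as `body=body` to the same `es.search(...)` call that appears in the traceback. Because the `term` clause sits under `must`, it takes part in scoring; newer Elasticsearch releases also allow it under a `filter` clause of the `bool` query, which skips scoring for that clause.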