Question: can parallel pre- and postprocessing speed up Gensim Doc2Vec?
- Spark: 349s
- Vanilla: 373s
(only one run each, so not a very scientific comparison; the Spark variant was 24s, roughly 6%, faster)
Run on a single machine with 16GB RAM and an Intel i7-8550U CPU @ 1.80GHz
You need to move from one Solr instance to another and can't be bothered with mismatching versions or whatever? These two scripts will help you :)
First you need to create a new core in the target instance. You may want to use the schema/configset from the originating instance though, as the default schema might not be ideal.
In my scenario I moved from Solr 5.5.5 to Solr 7.4.
Therefore I had to (at least) update the solrconfig.xml, where the Lucene version is specified.
The exact version you need can be found in the default configset ([solr_root]/server/solr/configsets/...)
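The two scripts themselves are not shown here, so the following is only a minimal sketch of how such a copy could work, not the original implementation. It assumes the unique key field is called `id`, that every field you care about is stored (Solr can only return stored fields), and that both cores are reachable over HTTP. It reads the source core with cursor-based paging (`cursorMark`), drops the internal `_version_` field, and posts the documents to the target core's `/update` handler. The function names and the `get`/`post` parameters are made up for illustration.

```python
import json
import urllib.parse
import urllib.request


def get_json(url):
    """GET a URL and return the decoded JSON response."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def post_json(url, payload):
    """POST a JSON payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_solr_docs(core_url, rows=500, unique_key="id", get=get_json):
    """Yield all documents of a core using cursorMark deep paging."""
    cursor = "*"
    while True:
        params = urllib.parse.urlencode({
            "q": "*:*",
            "sort": f"{unique_key} asc",  # cursorMark requires a sort on the unique key
            "rows": rows,
            "wt": "json",
            "cursorMark": cursor,
        })
        page = get(f"{core_url}/select?{params}")
        for doc in page["response"]["docs"]:
            # _version_ belongs to the source index and would trigger
            # optimistic-concurrency checks on the target.
            doc.pop("_version_", None)
            yield doc
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged cursor means the end was reached
            break
        cursor = next_cursor


def copy_core(source_core_url, target_core_url, batch_size=500,
              get=get_json, post=post_json):
    """Copy all documents in batches, commit once, return the number copied."""
    batch, total = [], 0
    for doc in iter_solr_docs(source_core_url, rows=batch_size, get=get):
        batch.append(doc)
        if len(batch) == batch_size:
            post(f"{target_core_url}/update", batch)
            total += len(batch)
            batch = []
    if batch:
        post(f"{target_core_url}/update", batch)
        total += len(batch)
    post(f"{target_core_url}/update", {"commit": {}})
    return total
```

A call could look like `copy_core("http://old-host:8983/solr/my_core", "http://new-host:8983/solr/my_core")`, where both host names and the core name are placeholders. If the schema uses copyField, you may need to drop the copy targets from each document before posting to avoid duplicated values.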
This script extracts all emails from an Outlook PST archive and saves them into some output folder as individual RFC822 compliant *.eml files.
Installing the external dependency pypff may not be straightforward (it wasn't for me). I forked the original repository to make it work in Python 3. If you get errors, check their wiki pages for help or try my fork. Below are the steps that worked for me:
Clone https://github.com/libyal/libpff/tree/master/pypff
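For orientation, here is a rough sketch of what the extraction itself could look like once pypff is installed. It is not the original script. It assumes pypff folder objects expose `sub_folders` and `sub_messages`, and that messages expose `transport_headers` and `plain_text_body`; check these against the pypff version you built. Attachments and HTML bodies are ignored, and the helper names are invented for this example.

```python
import os
from email import message_from_string
from email.policy import default


def build_eml(transport_headers, body):
    """Combine raw transport headers and a plain-text body into .eml bytes."""
    message = message_from_string(transport_headers or "", policy=default)
    # The original Content-Type usually describes a multipart message;
    # remove it because only the plain-text body is written here.
    del message["Content-Type"]
    if isinstance(body, bytes):
        body = body.decode("utf-8", errors="replace")
    message.set_content(body or "")
    return message.as_bytes()


def iter_messages(folder):
    """Yield the messages of a folder and of all its sub-folders."""
    for message in folder.sub_messages:
        yield message
    for sub_folder in folder.sub_folders:
        yield from iter_messages(sub_folder)


def export_folder(root_folder, out_dir):
    """Write every message below root_folder to its own numbered .eml file."""
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    for count, message in enumerate(iter_messages(root_folder), start=1):
        path = os.path.join(out_dir, f"{count:06d}.eml")
        with open(path, "wb") as out:
            out.write(build_eml(message.transport_headers,
                                message.plain_text_body))
    return count
```

With pypff available, the root folder would come from something like `pst = pypff.file()`, `pst.open("archive.pst")`, `export_folder(pst.get_root_folder(), "out")`, where `archive.pst` and `out` are placeholder paths.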
Very basic script for downloading a specific index from Elasticsearch into a file containing one document per line (JSON-formatted).
The argparse options should be pretty self-explanatory.
Call the script like this:
python download.py --url=http://my.elastic.search.eu --port=9200 --index=my_index --out=/path/to/output/
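The script body is not included above, so here is a minimal sketch of one way such a download can be done, using the scroll API and only the standard library. It is an illustration under assumptions, not the original `download.py`: the function names are invented, the base URL already includes the port, and each output line holds the complete hit (including `_id` and `_source`) rather than just the source document.

```python
import json
import urllib.request


def post_json(url, payload):
    """POST a JSON payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_documents(base_url, index, page_size=1000, post=post_json):
    """Yield every hit of an index, page by page, via the scroll API."""
    page = post(
        f"{base_url}/{index}/_search?scroll=1m",
        {"size": page_size, "query": {"match_all": {}}},
    )
    # An empty page of hits signals that the scroll is exhausted.
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        page = post(
            f"{base_url}/_search/scroll",
            {"scroll": "1m", "scroll_id": page["_scroll_id"]},
        )


def dump_index(base_url, index, out_path, post=post_json):
    """Write one JSON document per line and return the number written."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for hit in iter_documents(base_url, index, post=post):
            out.write(json.dumps(hit) + "\n")
            count += 1
    return count
```

For the call shown above, this would correspond to something like `dump_index("http://my.elastic.search.eu:9200", "my_index", "/path/to/output/my_index.jsonl")`; the output file name is a placeholder.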
tim@klapprechner ~/workspace/satnavpi/valhalla (git)-[2.1.8] % ./autogen.sh :(
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, '.'.
libtoolize: copying file './ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIRS, 'm4'.
libtoolize: copying file 'm4/libtool.m4'
libtoolize: copying file 'm4/ltoptions.m4'
libtoolize: copying file 'm4/ltsugar.m4'
libtoolize: copying file 'm4/ltversion.m4'
libtoolize: copying file 'm4/lt~obsolete.m4'
configure.ac:9: installing './compile'
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>RefMe Publications - Article viewer</title>
</head>
<body>
<div id="refme-cite-widget"></div>
<div itemscope itemtype="http://schema.org/ScholarlyArticle">
<strong>Title:</strong> <span itemprop="name">Reviewing the advantages of reference generators like RefME</span><br/>