Jörg Prante jprante

@jprante
jprante / mysql-boolean.sh
Created April 5, 2015 15:41
How to index the MySQL boolean type (which is TINYINT(1)) with the JDBC plugin for Elasticsearch
#!/bin/sh
/usr/local/mysql/bin/mysql -u root test <<EOT
drop table test;
create table test (
id integer,
b boolean
);
insert into test values (1, FALSE);
insert into test values (2, TRUE);
@jprante
jprante / mysql-dateformat.sh
Created April 3, 2015 21:32
JDBC plugin with custom date/time format
#!/bin/sh
/usr/local/mysql/bin/mysql -u root test <<EOT
drop table test;
create table test (
id integer,
d text
);
insert into test values (1, '2015-04-03 00:45:00');
EOT
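Because the sample row stores its date as text, whatever reads it has to parse the value with an explicit pattern. A minimal Python sketch of that parsing step follows; it is independent of the JDBC plugin, and both the helper name `parse_text_datetime` and the format string are assumptions inferred from the sample value '2015-04-03 00:45:00', not taken from the gist.

```python
from datetime import datetime

# Assumed pattern, inferred from the sample row '2015-04-03 00:45:00'.
SAMPLE_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_text_datetime(value, fmt=SAMPLE_FORMAT):
    """Parse a date/time string held in a text column into a datetime (hypothetical helper)."""
    return datetime.strptime(value, fmt)


parsed = parse_text_datetime("2015-04-03 00:45:00")
# Convert to ISO 8601, a form commonly accepted for date fields.
iso_value = parsed.isoformat()
print(iso_value)
```

A value that does not match the pattern raises `ValueError`, which makes format mismatches visible early.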
@jprante
jprante / mysql-timestamp-cron.sh
Last active August 29, 2015 14:18
Elasticsearch JDBC plugin fetching MySQL rows on a cron schedule
#!/bin/sh
/usr/local/mysql/bin/mysql -u root test <<EOT
drop table test;
create table test (
id integer,
t timestamp,
message text
);
insert into test values (1, now(), 'Hello, this is message 1');
@jprante
jprante / mysql-blob-river.sh
Last active August 29, 2015 14:18
Combining MySQL blob, PDF, Elasticsearch JDBC plugin, and attachment mapper type plugin
#!/bin/sh
# MySQL 5.1+ with http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_load-file
# ES 1.4.4+
# plugins required: jdbc, mapper attachment
# test.pdf is a PDF that can be parsed by Tika
/usr/local/mysql/bin/mysql -u root test <<EOT
drop table test;
create table test (
@jprante
jprante / binary-put-es.sh
Last active May 9, 2020 15:06
Put binary data to Elasticsearch
# first, set
#
# http.compression: true
#
# in config/elasticsearch.yml
curl -XDELETE 'localhost:9200/test'
rm -f content.json content.json.gz
echo '{"content":"Hello World"}' > content.json
@jprante
jprante / es.log
Created March 20, 2015 14:48
JDBC river log (no MySQL running)
[2015-03-20 15:46:35,378][INFO ][node ] [Daisy Johnson] version[1.4.4], pid[15796], build[c88f77f/2015-02-19T13:05:36Z]
[2015-03-20 15:46:35,379][INFO ][node ] [Daisy Johnson] initializing ...
[2015-03-20 15:46:35,388][INFO ][plugins ] [Daisy Johnson] loaded [jdbc-1.4.0.10-87c9ce0], sites []
[2015-03-20 15:46:37,028][INFO ][node ] [Daisy Johnson] initialized
[2015-03-20 15:46:37,029][INFO ][node ] [Daisy Johnson] starting ...
[2015-03-20 15:46:37,080][INFO ][transport ] [Daisy Johnson] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.1.1.57:9300]}
[2015-03-20 15:46:37,092][INFO ][discovery ] [Daisy Johnson] elasticsearch/QLgs41BVTNqHR69nC3hC0w
[2015-03-20 15:46:40,860][INFO ][cluster.service ] [Daisy Johnson] new_master [Daisy Johnson][QLgs41BVTNqHR69nC3hC0w][Jorg-Prantes-MacBook-Pro.local][inet[/10.1.1.57:9300]], reason: zen-disco-join (elected_as_maste
@jprante
jprante / dewey-crawler.py
Last active August 29, 2015 14:17
Crawling dewey.info
#!/usr/bin/env python
"""
crawls triples from dewey.info and writes them to an RDF N3 file
"""
import rdflib
def crawl(uri,file):
    g = rdflib.ConjunctiveGraph()
@jprante
jprante / docvalues-multifield.sh
Created March 16, 2015 08:51
Doc values with multi-field
curl -XDELETE 'localhost:9200/test'
curl -XPUT 'localhost:9200/test' -d '
{
  "mappings" : {
    "docs" : {
      "properties" : {
        "content" : {
          "type" : "string",
@jprante
jprante / char-filter.sh
Created March 13, 2015 09:36
Char filter demo
curl -XDELETE 'localhost:9200/test'
curl -XPUT 'localhost:9200/test' -d '
{
  "settings" : {
    "analysis": {
      "char_filter" : {
        "full_text_mapping" : {
          "type": "mapping",
@jprante
jprante / detect.groovy
Created February 11, 2015 13:54
Detect latin/greek characters
def file = new File('names.txt')
file.eachLine { line ->
    if (line.length() > 16) {
        line = line.substring(16)
        line.tokenize('$').each { word ->
            word = word.substring(1)
            print word
            def latin = false
            def greek = false
            word.replaceAll("\\p{C}","").replaceAll("\\p{Space}","").replaceAll("\\p{Punct}","").toCharArray().each { c ->