Skip to content

Instantly share code, notes, and snippets.

@aih
Last active August 9, 2016 20:09
Show Gist options
  • Save aih/f0370479669e3003b2a4 to your computer and use it in GitHub Desktop.
Save aih/f0370479669e3003b2a4 to your computer and use it in GitHub Desktop.

#Importing Data from MS SQL Server to Elasticsearch Adapted from the instructions for the jdbc importer here.

MS SQL Server

  1. Download Elasticsearch

  2. Install Elasticsearch

    Follow instructions on https://www.elastic.co/products/elasticsearch

3a) Install Cygwin Make sure to include the wget utility and an unzip utility. (not necessary if you run the script from Windows command prompt)

3b) Install JDBC importer

    `wget http://xbib.org/repository/org/xbib/elasticsearch/importer/elasticsearch-jdbc/<version>/elasticsearch-jdbc-<version>-dist.zip`
    
   (update version to the current stable version)
   Copy the zip file into your ```\Program Files```  or ```\Program Files (x86)``` folder. Unzip using and unzip utility (e.g. from Cygwin).
  1. Download SQL Server JDBC driver from Microsoft

  2. Copy the driver into lib folder of the jdbc importer (should be in \Program Files folder, from step 3 above).

     cp SQLJDBC4.jar $JDBC_IMPORTER_HOME/lib

6a) Set up the database you want to be indexed. This includes allowing TCP/IP connections

    1. Enable TCP/IP in the SQL SERVER CONFIGURATION MANAGER ( (Start Menu > Microsoft SQL Server > Configuration Tools > SQL Server Configuration Manager > SQL Server Network Configuration)
    2. Start the SQL SERVER BROWSER Service. To do this, open the CONFIGURATION MANAGER as an administrator (right click to open). Once in the Manager, right click the BROWSER SERVICE, click Properties and change the Start Mode to Enabled. (http://stackoverflow.com/a/21378235)
    3. Explicitly set the TCP to 1433, as explained in these [two](http://stackoverflow.com/a/18850073) [answers](http://stackoverflow.com/a/24299346).
    

6b) Change authentication on the SQL server to 'Mixed Mode'. Give permissions on the database to a new user, created for the purposes of importing to elasticsearch.

  1. Start Elasticsearch

    ./elasticsearch.bat
    
  2. [Update: jdbc importer can be run from Windows command prompt. To do this, save the json object with the settings in a file and append the filename at the end of your java command. See comments below.] Start JDBC importer by copying and pasting the following script into the Cygwin terminal. You can use the right-click button or Shift-insert to paste into the terminal. Also make sure to:

    * Substitute the correct path for ```$JDBC_IMPORTER_HOME```.
    * Substitute the correct databaseName, user and password
    * Substitute the correct table name (instead of 'ScoreCards').
```
bin=$JDBC_IMPORTER_HOME/bin
lib=$JDBC_IMPORTER_HOME/lib
echo '{
    "type" : "jdbc",
    "jdbc": {
        "url":"jdbc:sqlserver://localhost:1433;databaseName=ICFV",
        "user":"elasticsearch",
        "password":"elasticsearch",
        "sql":"select * from ScoreCards",
        "index" : "myindex",
        "type" : "mytype"
    }
}' | java \
       -cp "$lib/*" \
       -Dlog4j.configurationFile="$bin"/log4j2.xml \
       org.xbib.tools.Runner \
       org.xbib.tools.JDBCImporter
```

Note these changes to the variables in the script at jdbc-importer, in keeping with Windows variable conventions:

-cp "${lib}/*" \
-Dlog4j.configurationFile=${bin}/log4j2.xml \

to

-cp "$lib/*" \
-Dlog4j.configurationFile="$bin"/log4j2.xml \
  1. You should see messages from the importer in the logfile. You can also check Marvel and Sense to confirm that the data has been entered into an index and is searchable using the Elasticsearch search API.
@vaniaravinda
Copy link

Hi,

I have gone through your article and followed the steps. Started working on it from 2 months finally succeeded in indexing the data from SQL Server with your article. TanQ so much for such a great article.
Can you please let me know how to do re-indexing and how to run this command through .Net code.

Thanks,
Vani Aravinda

@aih
Copy link
Author

aih commented Aug 20, 2015

Hi @vaniaravinda,
Thanks for the comment-- I did not see it until now.

If I understand your question, you can reindex by deleting the previous index and then running the code above again from Cygwin. I use the free Sense plugin for this, installed and running at : http://localhost:9200/_plugin/marvel/sense/index.html.

I have not yet gotten to the point of running the command through .Net, though we will have to do this for our project soon. Do be in touch if you have any other questions or suggestions.

Ari

@vaniaravinda
Copy link

Hi,

TanQ for ur reply. Can u plz let me know how to run the below command through windows command prompt? I want to schedule this for auto run. Can u plz let me know if u have any suggestions.

bin=$JDBC_IMPORTER_HOME/bin
lib=$JDBC_IMPORTER_HOME/lib
echo '{
"type" : "jdbc",
"jdbc": {
"url":"jdbc:sqlserver://localhost:1433;databaseName=ICFV",
"user":"elasticsearch",
"password":"elasticsearch",
"sql":"select * from ScoreCards",
"index" : "myindex",
"type" : "mytype"
}
}' | java
-cp "$lib/*"
-Dlog4j.configurationFile="$bin"/log4j2.xml
org.xbib.tools.Runner
org.xbib.tools.JDBCImporter

Thanks,
Vani Aravinda

@aih
Copy link
Author

aih commented Sep 14, 2015

I hope this is not too late for your use, but I just worked out how to run the command from the Windows prompt. The key is to save the json object in a file, without the single quotes. For example, you can call it 'settings.json'. Then run from the Windows command prompt, using the full paths:

java -cp "C:\path\to\jdbc_importer_home\lib\*" -Dlog4j.configurationFile="C:\path\to\jdbc_importer_home\bin\log4j2.xml" org.xbib.tools.Runner org.xbib.tools.jdbcImporter settings.json

@polomint82
Copy link

aih - can you please supply a sample of the json file that you're talking about? I would really appreciate it. This article is the closest thing I've come to that actually works.

thanks in advance

@polomint82
Copy link

Ok figured it out. For future users, make sure you configure TCP/IP IPAll to port 1433 prior to running. Fantastic article... Great job

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment