Skip to content

Instantly share code, notes, and snippets.

@aih
Last active August 9, 2016 20:09
Show Gist options
  • Save aih/f0370479669e3003b2a4 to your computer and use it in GitHub Desktop.
Save aih/f0370479669e3003b2a4 to your computer and use it in GitHub Desktop.

#Importing Data from MS SQL Server to Elasticsearch Adapted from the instructions for the jdbc importer here.

MS SQL Server

  1. Download Elasticsearch

  2. Install Elasticsearch

    Follow instructions on https://www.elastic.co/products/elasticsearch

3a) Install Cygwin Make sure to include the wget utility and an unzip utility. (not necessary if you run the script from Windows command prompt)

3b) Install JDBC importer

    `wget http://xbib.org/repository/org/xbib/elasticsearch/importer/elasticsearch-jdbc/<version>/elasticsearch-jdbc-<version>-dist.zip`
    
   (update version to the current stable version)
   Copy the zip file into your ```\Program Files```  or ```\Program Files (x86)``` folder. Unzip using and unzip utility (e.g. from Cygwin).
  1. Download SQL Server JDBC driver from Microsoft

  2. Copy the driver into lib folder of the jdbc importer (should be in \Program Files folder, from step 3 above).

     cp SQLJDBC4.jar $JDBC_IMPORTER_HOME/lib

6a) Set up the database you want to be indexed. This includes allowing TCP/IP connections

    1. Enable TCP/IP in the SQL SERVER CONFIGURATION MANAGER ( (Start Menu > Microsoft SQL Server > Configuration Tools > SQL Server Configuration Manager > SQL Server Network Configuration)
    2. Start the SQL SERVER BROWSER Service. To do this, open the CONFIGURATION MANAGER as an administrator (right click to open). Once in the Manager, right click the BROWSER SERVICE, click Properties and change the Start Mode to Enabled. (http://stackoverflow.com/a/21378235)
    3. Explicitly set the TCP to 1433, as explained in these [two](http://stackoverflow.com/a/18850073) [answers](http://stackoverflow.com/a/24299346).
    

6b) Change authentication on the SQL server to 'Mixed Mode'. Give permissions on the database to a new user, created for the purposes of importing to elasticsearch.

  1. Start Elasticsearch

    ./elasticsearch.bat
    
  2. [Update: jdbc importer can be run from Windows command prompt. To do this, save the json object with the settings in a file and append the filename at the end of your java command. See comments below.] Start JDBC importer by copying and pasting the following script into the Cygwin terminal. You can use the right-click button or Shift-insert to paste into the terminal. Also make sure to:

    * Substitute the correct path for ```$JDBC_IMPORTER_HOME```.
    * Substitute the correct databaseName, user and password
    * Substitute the correct table name (instead of 'ScoreCards').
```
bin=$JDBC_IMPORTER_HOME/bin
lib=$JDBC_IMPORTER_HOME/lib
echo '{
    "type" : "jdbc",
    "jdbc": {
        "url":"jdbc:sqlserver://localhost:1433;databaseName=ICFV",
        "user":"elasticsearch",
        "password":"elasticsearch",
        "sql":"select * from ScoreCards",
        "index" : "myindex",
        "type" : "mytype"
    }
}' | java \
       -cp "$lib/*" \
       -Dlog4j.configurationFile="$bin"/log4j2.xml \
       org.xbib.tools.Runner \
       org.xbib.tools.JDBCImporter
```

Note these changes to the variables in the script at jdbc-importer, in keeping with Windows variable conventions:

-cp "${lib}/*" \
-Dlog4j.configurationFile=${bin}/log4j2.xml \

to

-cp "$lib/*" \
-Dlog4j.configurationFile="$bin"/log4j2.xml \
  1. You should see messages from the importer in the logfile. You can also check Marvel and Sense to confirm that the data has been entered into an index and is searchable using the Elasticsearch search API.
@polomint82
Copy link

Ok figured it out. For future users, make sure you configure TCP/IP IPAll to port 1433 prior to running. Fantastic article... Great job

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment