@billdueber
Created November 10, 2010 18:05
Example of how to use jruby_streaming_update_solr_server
require 'rubygems'
require 'threach'
require 'jruby_streaming_update_solr_server'
solrURL = 'your solr url'
sussQueueSize = 128 # number of docs to queue up
sussThreads = 1 # number of threads to use to send stuff to solr
threads = 3 # number of threads to use to process the data
suss = StreamingUpdateSolrServer.new(solrURL, sussQueueSize, sussThreads)
# Use javabin, which is faster; it requires the right handler
# to be set up in Solr.
suss.useJavabin!
# The suss will send batches of documents to Solr automatically;
# you don't have to do anything special
# Get a reader for your kind of data; it should have a working #each
reader = SomethingThatGetsEachDocumentInTurn.new(whatever)
# Read the data one item at a time using #each, but spin up a few
# threads to do the actual processing
reader.threach(threads) do |myData|
  # Create a hash with solr field names as the keys and
  # the field values as the values. Multiple values can be represented
  # by arrays, e.g., h[key] = [val1, val2, ...]
  h = turnDocIntoHash(myData)
  suss << h
end
suss.commit
# NOTE
#
# For efficiency, you should actually turn myData into a SolrInputDocument
# instead of a hash, so the call to suss.<< doesn't have to
# do it for you.
#
# doc = SolrInputDocument.new
# doc['id'] = 3334 # set the id
# doc.add('author', 'Bill Dueber') # alternate syntax; set an author
# doc.add('author', "Naomi Dushay") # this syntax allows you to add multiple values
# doc.add('isbn', ['123456789X', '9783346758394']) # can add multiple values at once
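
The example above leaves SomethingThatGetsEachDocumentInTurn and turnDocIntoHash undefined, since they depend on your data. As a rough sketch only, here is one way those two pieces might look in plain Ruby. Everything in it is an assumption made for illustration: the class name TabDelimitedReader, the method name turn_doc_into_hash, the tab-separated input layout (id, title, authors separated by '|'), and the Solr field names 'id', 'title', and 'author'. None of it comes from jruby_streaming_update_solr_server itself.

```ruby
require 'stringio'

# Hypothetical stand-in for SomethingThatGetsEachDocumentInTurn.
# Assumes one record per line, columns separated by tabs.
class TabDelimitedReader
  include Enumerable

  def initialize(io)
    @io = io
  end

  # Yield one record (an array of raw column strings) per non-blank line.
  def each
    return enum_for(:each) unless block_given?
    @io.each_line do |line|
      line = line.chomp
      next if line.strip.empty?
      # The -1 limit keeps trailing empty columns instead of dropping them
      yield line.split("\t", -1)
    end
  end
end

# Hypothetical stand-in for turnDocIntoHash: keys are (made-up) Solr field
# names; a multi-valued field is represented as an array.
def turn_doc_into_hash(record)
  id, title, authors = record
  h = { 'id' => id, 'title' => title }
  names = authors.to_s.split('|').map(&:strip).reject(&:empty?)
  h['author'] = names unless names.empty?
  h
end

# Try it on in-memory sample data; no Solr connection is involved here.
sample = StringIO.new("1\tFirst Title\tAuthor One|Author Two\n\n2\tSecond Title\t\n")
docs = TabDelimitedReader.new(sample).map { |rec| turn_doc_into_hash(rec) }
```

In the indexing loop you would presumably pass such a reader to threach instead of calling map, and push each resulting hash to the suss with <<, as in the example above.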