Scott Smerchek smerchek

Scott graduated from Kansas State University in 2010 and has since worked at Softek Solutions, Inc. He is considered a jack-of-all-trades at Softek, working on anything from SQL on the back-end to HTML/JS on the front-end and everything in between. He has recently specialized in using Puppet to deploy infrastructure and in customizing Lucene and Solr. He is very interested in search, analytics, dealing with big data, and data visualization. He is always interested in the latest technologies and trends in the industry and, of course, learning! Outside software and technology, Scott enjoys reading, cycling, running, and spending time with his wife.

###The Lightning Version

Solr is an open source search platform built on top of the Apache Lucene project. Solr wraps Lucene with a nice RESTful API and adds other features such as faceted search, grouping, field types, caching, XML configuration, an administration interface, and the ability to scale with distributed search. This session will breeze through the basics of Solr and Lucene. Then we'll touch on each of the advanced/awesome features of Solr to demonstrate its power and ease of use. This session will, of course, be supplemented with working examples and demos.
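
To give a feel for how little is needed to query Solr over HTTP, here is a minimal sketch that only builds a request URL for the standard `/select` handler with one facet field. The host, the core name `talks`, the field names, and the helper `build_solr_query` are all assumptions made up for illustration; the query parameters (`q`, `rows`, `wt`, `facet`, `facet.field`) are standard Solr parameters, but the exact URL path depends on how a given Solr instance is set up.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, core, text, facet_field=None, rows=10):
    """Build a URL for Solr's /select handler (hypothetical helper)."""
    # Core query parameters: the query string, page size, and response format.
    params = [("q", text), ("rows", rows), ("wt", "json")]
    if facet_field:
        # Ask Solr to return value counts for one field alongside the results.
        params += [("facet", "true"), ("facet.field", facet_field)]
    return f"{base_url}/{core}/select?{urlencode(params)}"


# Assumed local instance and an invented core and schema.
url = build_solr_query(
    "http://localhost:8983/solr", "talks", "title:lucene", facet_field="speaker"
)
print(url)
```

Because the request is just a URL, the same query can be issued from a browser, `curl`, or any language with an HTTP client.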

####Reviewer Comments

I gave the longer version of this talk for the first time this year, and I hope to give it a few more times between now and September. That talk also happened to be my first technical presentation, and I got a very good response. I think Solr is a great fit for a 20-minute presentation because it is so easy to set up and query that the power of Solr is immediately apparent. Because of Solr's RESTful API, this talk is language-agn

So that I have a good idea of who will be attending, please #rsvp to @nodekc before each meeting. Our meetings will be held at Snow and Company (just off I-35 in downtown KC) monthly or every other week, depending on interest. If you have a laptop, please bring it (wireless internet will be provided).

The NodeKC user group will focus on building open source node projects. We'll work collaboratively on projects that the group determines to be interesting and challenging. Our first project, however, has been pre-determined for the sake of a quick start. This project idea will be announced and developed further at the first meetup.

We'll break into two groups and work toward a similar goal on the project after the first meeting. These groups will have a set of user stories to accomplish before the next meetup. At the following meeting we'll review each group's code and determine which codebase to merge into the main GitHub branch. We can also selectively choose parts of each of the grou

####Beyond the Basics: Lucene and Solr

If you are already using Lucene and/or Solr (or even ElasticSearch), then this is the talk for you. We will go beyond the basics of these brilliant open source search platforms. Not only are there many ways to customize Solr through the standard configuration file, but there is so much more. Payloads offer up many possibilities for customization, including the ability to tag words with part-of-speech information. There are also many ways to extend Lucene and Solr by creating your own filters, query parsers, tokenizers, token filters, and even highlighters with some simple Java code. If search is a core feature of your application, then you need to be using these advanced features to set yourself apart.
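
As a rough illustration of the payload idea, the sketch below splits `term|payload` tokens into pairs, which is the same input convention Lucene's delimited payload token filter accepts. This is a toy in Python and not Lucene's API: the function `split_payloads`, the sample sentence, and the part-of-speech tags are assumptions chosen for the example.

```python
def split_payloads(text, delimiter="|"):
    """Split whitespace-separated 'term|payload' tokens into (term, payload) pairs."""
    tokens = []
    for raw in text.split():
        term, sep, payload = raw.partition(delimiter)
        # Tokens without a delimiter simply carry no payload.
        tokens.append((term, payload if sep else None))
    return tokens


# Hypothetical input where each word carries a part-of-speech tag as its payload.
tagged = split_payloads("the|DT quick|JJ fox|NN jumps")

# A scorer or filter can then use the payload, e.g. to keep only nouns.
nouns = [term for term, tag in tagged if tag == "NN"]
print(tagged)
print(nouns)
```

In Lucene itself the payload is stored as bytes alongside each term position in the index, so it can influence scoring at query time rather than being filtered in application code as done here.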

####DevOps: Automating Your Infrastructure with Puppet

Puppet is an open source project built by PuppetLabs (http://puppetlabs.com) to automate the management of your IT infrastructure. Whether you manage a hosted environment or run your own servers in-house, Puppet can help alleviate management headaches. Puppet lets you declaratively describe what a machine should look like, and then makes it happen (and makes sure it stays that way). This talk will go over the basics of Puppet, including how to get started, the essentials of Puppet modules, using existing modules on the Puppet Forge, and running Puppet on Windows. It will also touch on how to write a basic module.

##Breaking Down the Lucene Analysis Process

The Lucene analysis process is very powerful, but most of us only know enough of the basics to put together a simple analyzer chain. Search isn't always plug-and-play, and the ability to manipulate and compose tokenizers and token filters will be the differentiator in developing your search product.

Using visualizations of the analysis chain, I will break down the Lucene analysis process to its most basic parts: char filters, tokenizers, and token filters. I'll show how differences in the composition of the token filters affect the final output. We'll see that tokens are more than just a stream: they can become a token graph through synonyms and generated word parts.
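
The three stages can be sketched as a toy pipeline. This is plain Python rather than Lucene's analysis API, and every name in it (`strip_html`, `whitespace_tokenize`, `add_synonyms`, the `SYNONYMS` map) is a stand-in invented for illustration. Tokens are modeled as `(position, term)` pairs so that a synonym can share a position with the original term, which is the simplest form of a token graph.

```python
import re


def strip_html(text):
    # Char filter: rewrites the raw text before any tokens exist.
    return re.sub(r"<[^>]+>", " ", text)


def whitespace_tokenize(text):
    # Tokenizer: turns text into (position, term) pairs.
    return list(enumerate(text.split()))


def lowercase(tokens):
    # Token filter: rewrites each term, leaving positions untouched.
    return [(pos, term.lower()) for pos, term in tokens]


def add_synonyms(tokens, synonyms):
    # Token filter: emits synonyms at the SAME position as the original term.
    out = []
    for pos, term in tokens:
        out.append((pos, term))
        out.extend((pos, alt) for alt in synonyms.get(term, []))
    return out


SYNONYMS = {"mri": ["scan"]}  # hypothetical synonym map
tokens = whitespace_tokenize(strip_html("<b>MRI</b> Report"))

# Filter order matters: the synonym map is lowercase, so lowercase first.
good_order = add_synonyms(lowercase(tokens), SYNONYMS)
bad_order = lowercase(add_synonyms(tokens, SYNONYMS))
print(good_order)
print(bad_order)
```

With the lowercase filter first, `scan` is added at position 0 next to `mri`. With the filters swapped, the synonym lookup sees `MRI`, finds nothing, and the synonym is silently lost.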

##Reviewer Comments

I've been working directly with Lucene for the past year, implementing Softek's proprietary ranking algorithm for searching radiology documents. In the process, I've submitted patches or extended core Lucene and Solr code. I've implemented our own query parser extension and

	<!DOCTYPE html>
	<meta charset="utf-8">
	<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"></script>
	<script src="http://d3js.org/d3.v2.js?2.9.1"></script>
	<script src="https://raw.github.com/timrwood/moment/1.6.2/min/moment.min.js"></script>
	<style>
	html {
	font-family: Arial, Helvetica, sans-serif;
	font-size: 10pt;
	}

	#source: http://johnleach.co.uk/words/771/puppet-dependencies-and-run-stages

	exec { "apt-update":
	command => "/usr/bin/apt-get update"
	}

	Exec['apt-update'] -> Package <| |>

	#source: http://projects.puppetlabs.com/projects/1/wiki/debian_preseed_patterns

	file { "/tmp/file.preseed":
	source => 'puppet:///modules/modulename/file.preseed',
	mode => '0600',
	backup => false,
	}

	package { 'packagename':
	responsefile => '/tmp/file.preseed',

	#each line in answers.txt represents a single answer.

	sh pkg.sh < answers.txt