A backup of http://sites.google.com/site/redcodenl/creating-shazam-in-java-1 just in case
Why is this necessary? Read http://sites.google.com/site/redcodenl/patent-infringement
Please fork, tweet about, etc.
----
Creating Shazam in Java
A couple of days ago I encountered this article: How Shazam Works
This got me interested in how a program like Shazam works… And more importantly, how hard is it to program something similar in Java?
# Implementation inspired by "Simple simhashing" by Ryan Moulton
# http://knol.google.com/k/simple-simhashing
# For simhashing we take ngrams and calculate their hash
# It could be interesting to change the ngram size, but '2'
# is a good value according to my tests
def simhashing sentence, ngram_size = 2
  terms = sentence.downcase.split " "
  hashes = []
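The preview above stops after `hashes = []`, so the rest of the gist is not shown. As a rough illustration of the idea in the comments (hash word n-grams, then combine the hashes into one fingerprint), here is a minimal sketch. It assumes the bit-voting form of simhash, which may differ from what the gist or the referenced article actually does. The names `simhash_sketch`, `hamming_distance` and `SIMHASH_BITS`, and the choice of MD5, are illustrative assumptions and not part of the original.

```ruby
require 'digest/md5'

# Assumption: 64-bit fingerprints, taken from the first 16 hex digits of MD5.
SIMHASH_BITS = 64

# Hypothetical sketch, not the gist's own code.
def simhash_sketch(sentence, ngram_size = 2)
  terms = sentence.downcase.split(" ")
  # Sliding window of word n-grams; a short sentence becomes a single n-gram.
  ngrams = terms.length < ngram_size ? [terms] : terms.each_cons(ngram_size).to_a
  votes = Array.new(SIMHASH_BITS, 0)
  ngrams.each do |ngram|
    hash = Digest::MD5.hexdigest(ngram.join(" "))[0, SIMHASH_BITS / 4].to_i(16)
    # Each n-gram votes +1 or -1 on every bit position of the fingerprint.
    SIMHASH_BITS.times { |i| votes[i] += (hash[i] == 1 ? 1 : -1) }
  end
  # A bit is set when the n-grams voting 1 outnumber those voting 0.
  votes.each_with_index.reduce(0) { |acc, (v, i)| v > 0 ? acc | (1 << i) : acc }
end

# Number of differing bits between two fingerprints.
def hamming_distance(a, b)
  (a ^ b).to_s(2).count("1")
end
```

Two sentences can then be compared with `hamming_distance(simhash_sketch(a), simhash_sketch(b))`; a smaller distance suggests more shared n-grams.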
Instructions for installing FlockDB on Ubuntu
------- Dependencies
- java 1.6: The "easy" one: apt-get install sun-java6-*
In Ubuntu 10.04 the Java packages have been moved to the partner repository (https://wiki.ubuntu.com/LucidLynx/ReleaseNotes#Sun Java moved to the Partner repository), so we must add the partner repository first:
sudo add-apt-repository "deb http://archive.canonical.com/ lucid partner"
Dado que se han suscrito los siguientes usuarios:
| email | pais | interests |
| [email protected] | España | Arquitectos |
| [email protected] | España | Artistas y fotógrafos |
---
Dado /^que se han suscrito los siguientes usuarios:$/ do |subscriptors|
  subscriptors.hashes.each do |s|
    subscriber = Factory.create :subscriber, :email => s[:email], :interests => Factory.create(:interests, :name => s[:interests])
# Only some of our pages are cached, and we need to check whether a large concentration of cache misses explains our poor performance
# This simple script parses a Rails log file to find the cache hits/misses and the response times
# Key points:
# - The cache expires with time, so it's easy to parse the logs and detect misses
# The output is CSV with four columns: minute, misses, hits and average request time
# CHANGES:
# 2010-11-30 16:11 : Extracted the regexp so we can use it for other log files
# 2010-11-30 16:11 : Included URL regexps so we can define which URLs should be analyzed and which ones ignored
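The script body is not included in this preview. The aggregation the comments describe (per-minute misses, hits and average request time, limited to selected URLs) could look roughly like the sketch below. The log line format, the `LOG_LINE` regexp and the `cache_stats_csv` helper are all hypothetical: a real Rails log has a different layout and would need its own regexp, which is presumably why the original extracted it.

```ruby
# Assumed, simplified log line (NOT the real Rails format):
#   "2010-11-30 16:11:05 MISS 120ms /some/url"
LOG_LINE = /\A(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2} (HIT|MISS) (\d+)ms (\S+)\z/

# Hypothetical helper: returns CSV rows "minute,misses,hits,average_ms".
# Only URLs matching include_url are counted; unparseable lines are skipped.
def cache_stats_csv(lines, include_url = //)
  stats = Hash.new { |h, k| h[k] = { :misses => 0, :hits => 0, :total_ms => 0 } }
  lines.each do |line|
    m = LOG_LINE.match(line.chomp)
    next unless m && m[4] =~ include_url
    entry = stats[m[1]] # keyed by timestamp truncated to the minute
    if m[2] == 'HIT'
      entry[:hits] += 1
    else
      entry[:misses] += 1
    end
    entry[:total_ms] += m[3].to_i
  end
  stats.keys.sort.map do |minute|
    entry = stats[minute]
    average = entry[:total_ms] / (entry[:hits] + entry[:misses]).to_f
    [minute, entry[:misses], entry[:hits], average.round].join(',')
  end
end
```

Calling `puts cache_stats_csv(File.readlines('production.log'), %r{\A/articles})` would print one row per minute for URLs under `/articles`; both the file name and the URL pattern are placeholders.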
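require 'rubygems'
require 'faster_csv'

# Collect every ID_CENTRO_TRABAJO => INTRANET replacement first, so the XML
# file is read and printed once rather than once per CSV row.
replacements = {}
FasterCSV.foreach('areas.csv', :headers => true, :col_sep => ';') do |row|
  from = "<ID_CENTRO_TRABAJO>#{row['ID_CENTRO_TRABAJO']}</ID_CENTRO_TRABAJO>"
  replacements[from] = "<ID_CENTRO_TRABAJO>#{row['INTRANET']}</ID_CENTRO_TRABAJO>"
end

# A single pass with one combined pattern avoids chained replacements.
pattern = Regexp.union(replacements.keys)
File.foreach('fixed.xml') { |line| puts line.gsub(pattern) { |match| replacements[match] } }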
<ROW>
<C0>1</C0>
<ID_EMPLEADO>01365</ID_EMPLEADO>
<NAME>MARIA PAZ MARTIN GONZALEZ</NAME>
<EMAIL>[email protected]</EMAIL>
<POSITION>ADMINISTRATIVO I</POSITION>
<EXTENSION></EXTENSION>
<N_FICHERO>01365.jpg</N_FICHERO>
<ID_CENTRO_TRABAJO>MAD</ID_CENTRO_TRABAJO>
<N_UNIDAD>ADMINISTRACION</N_UNIDAD>
ID_CENTRO_TRABAJO;INTRANET
MAD;MOD
brenes / letsdo.thor
Created January 13, 2011 09:07 — forked from mort/letsdo.thor
# My first Thor script is a helper for the productivity strategy of having a
# specific /etc/hosts file that blocks Twitter, Facebook, porn sites, and other
# unwanted distractions during certain times of the day.
# Copy your regular hosts file to /etc/hosts.play
# Create an /etc/hosts.work file with all the blocked sites in it
# Use 'thor lets:do [work|play]' to switch to the desired mental context.
# 'work' and 'play' are only suggestions; use whichever keywords suit you best.
# Nothing keeps you from keeping n different 'contexts' around.
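The Thor task itself is cut off in this preview. The switching step the comments describe, activating `/etc/hosts.<context>` by copying it over `/etc/hosts`, could be sketched in plain Ruby as follows. `switch_hosts` and its `dir` parameter are hypothetical names introduced here so the sketch can be tried outside `/etc`; the original task may work differently.

```ruby
require 'fileutils'

# Hypothetical helper, not the gist's Thor task: copies <dir>/hosts.<context>
# over <dir>/hosts. Writing to the real /etc/hosts requires root privileges.
def switch_hosts(context, dir = '/etc')
  # Accept only simple keywords, so a context cannot point outside dir.
  raise ArgumentError, "invalid context '#{context}'" unless context =~ /\A\w+\z/
  source = File.join(dir, "hosts.#{context}")
  raise ArgumentError, "no hosts file for context '#{context}'" unless File.file?(source)
  FileUtils.cp(source, File.join(dir, 'hosts'))
  context
end
```

Because the context is only a file suffix, any number of contexts works as long as a matching `hosts.<context>` file exists.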
brenes / vaciar_campo_definition.rb
Created January 27, 2011 10:17
Mundo Pepino step definition for the action of emptying a form field
Cuando /^vacio el campo ["'](.+)["']$/ do |field|
  find_field_and_do :fill_in, field, :with => ""
end