fforbeck / configure_queues.sh
Created February 25, 2016 14:19
RabbitMQ - Command Line Setup. Create queue, bindings, exchanges with rabbitmqadmin and rabbitmqctl
#!/usr/bin/env bash
# The RabbitMQ management plugin serves the rabbitmqadmin CLI at this URL.
URL="http://localhost:15672/cli/rabbitmqadmin"
VHOST="<>"
USER="<>"
PWD="<>" # note: this shadows the shell's built-in PWD (current directory) variable
QUEUE="<>"
FAILED_QUEUE="<>"
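The preview above stops at the variable declarations. A minimal sketch of how they are typically used — download the CLI from the management plugin, then declare the queues, an exchange, and a binding — might look like the following; the exchange name `my.exchange` and the routing key are illustrative placeholders of mine, not from the original gist:

```shell
# Fetch the rabbitmqadmin CLI served by the management plugin.
wget -q "$URL" -O rabbitmqadmin && chmod +x rabbitmqadmin

# Declare a durable main queue and a failed-message queue, a direct
# exchange, and bind the main queue to the exchange.
# ("my.exchange" and the routing key are illustrative placeholders.)
./rabbitmqadmin -u "$USER" -p "$PWD" -V "$VHOST" declare queue name="$QUEUE" durable=true
./rabbitmqadmin -u "$USER" -p "$PWD" -V "$VHOST" declare queue name="$FAILED_QUEUE" durable=true
./rabbitmqadmin -u "$USER" -p "$PWD" -V "$VHOST" declare exchange name=my.exchange type=direct
./rabbitmqadmin -u "$USER" -p "$PWD" -V "$VHOST" declare binding source=my.exchange destination="$QUEUE" routing_key="$QUEUE"
```

These commands require a running broker with the management plugin enabled, so they are shown as a fragment rather than a runnable test.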
fforbeck / move_messages.sh
Created February 25, 2016 14:10
RabbitMQ - Shovel Plugin: Move messages from one queue to another via command line
# Moves all messages from the origin queue to the target queue; with
# "delete-after": "queue-length", the shovel (and its parameter) is removed
# automatically once the current queue length has been transferred.
# Note: the destination is a queue, so the parameter is "dest-queue",
# not "dest-exchange".
rabbitmqctl set_parameter -p <vhost> shovel "<origin.queue.name>" '{"src-uri":"amqp://<user>:<pwd>@/<vhost_name>","src-queue":"<origin.queue.name>","dest-uri":"amqp://<user>:<pwd>@/<vhost_name>","dest-queue":"<target.queue.name>","prefetch-count":1,"reconnect-delay":5,"add-forward-headers":false,"ack-mode":"on-confirm","delete-after":"queue-length"}'
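Once set, the dynamic shovel can be inspected — and seen disappearing after `delete-after` fires — with, for example:

```shell
# List dynamic parameters on the vhost; the shovel entry vanishes once
# all messages have been moved and the parameter self-deletes.
rabbitmqctl list_parameters -p <vhost>
```

This is a fragment to run against a live broker, with `<vhost>` substituted.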
@fforbeck
fforbeck / allocate_unassigned_shard.sh
Last active March 10, 2016 19:54
Allocate unassigned shard from Elasticsearch node
##
# Reassign every UNASSIGNED shard to a chosen node; the reroute body is
# completed per the linked answer.
# http://stackoverflow.com/questions/19967472/elasticsearch-unassigned-shards-how-to-fix
##
NODE="YOUR NODE NAME"
IFS=$'\n'
for line in $(curl -s 'localhost:9200/_cat/shards' | fgrep UNASSIGNED); do
  INDEX=$(echo "$line" | awk '{print $1}')
  SHARD=$(echo "$line" | awk '{print $2}')
  curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
    "commands": [{"allocate": {"index": "'"$INDEX"'", "shard": '"$SHARD"', "node": "'"$NODE"'", "allow_primary": true}}]
  }'
done
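The INDEX/SHARD extraction in the loop can be exercised without a cluster; the sample line below is made up but mimics the `_cat/shards` column order (index, shard, prirep, state):

```shell
# Stand-alone check of the awk extraction used in the loop above;
# the sample line imitates one row of `_cat/shards` output.
line="logs-2016.03.10 2 p UNASSIGNED"
INDEX=$(echo "$line" | awk '{print $1}')
SHARD=$(echo "$line" | awk '{print $2}')
echo "index=$INDEX shard=$SHARD"
```

This prints `index=logs-2016.03.10 shard=2`, confirming which columns feed the reroute call.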
fforbeck / gist:43f3e946357a045a6cf3
Created January 20, 2016 16:49 — forked from duydo/elasticsearch_best_practices.txt
ElasticSearch - Index best practices from Shay Banon
If you want, I can try and help with pointers as to how to improve the indexing speed you get. It's quite easy to really increase it by using some simple guidelines, for example:
- Use create in the index API (assuming you can).
- Relax the real time aspect from 1 second to something a bit higher (index.engine.robin.refresh_interval).
- Increase the indexing buffer size (indices.memory.index_buffer_size); it defaults to 10% of the heap.
- Increase the number of dirty operations that trigger automatic flush (so the translog won't get really big, even though it's FS based) by setting index.translog.flush_threshold (defaults to 5000).
- Increase the memory allocated to the elasticsearch node. By default it's 1g.
- Start with a lower replica count (even 0), and then once the bulk loading is done, increase it to the value you want using the update_settings API. This will improve things as fewer shards will potentially be allocated to each machine.
- Increase the number of machines you have so
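The replica-count tip above maps to a single call against the index settings API; a sketch, where the index name `myindex` is a placeholder of mine:

```shell
# After bulk loading finishes, raise the replica count back up
# ("myindex" is an illustrative placeholder for your index name).
curl -XPUT 'localhost:9200/myindex/_settings' -d '{
  "index": {"number_of_replicas": 1}
}'
```

This needs a running Elasticsearch node, so it is shown as a fragment.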
// installed Clojure packages:
//
// * BracketHighlighter
// * lispindent
// * SublimeREPL
// * sublime-paredit
{
"word_separators": "/\\()\"',;!@$%^&|+=[]{}`~?",
"paredit_enabled": true,
{:user {:dependencies [[org.clojure/tools.namespace "0.2.3"]
[spyscope "0.1.3"]
[criterium "0.4.1"]]
:injections [(require '(clojure.tools.namespace repl find))
; try/catch to workaround an issue where `lein repl` outside a project dir
; will not load reader literal definitions correctly:
(try (require 'spyscope.core)
(catch RuntimeException e))]
:plugins [[lein-pprint "1.1.1"]
[lein-beanstalk "0.2.6"]
import static org.apache.commons.lang.StringUtils.isBlank;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.log4j.Logger;
import net.tanesha.recaptcha.ReCaptchaImpl;
import net.tanesha.recaptcha.ReCaptchaResponse;

fforbeck / Graph Gist Sample
Last active December 23, 2015 17:59
Neo4j and Cypher (tests) Finding the users that liked some product a few days ago.
#New graph
START root=node(0)
CREATE
(User1 { name:'User1' }),
(User2 { name: 'User2' }),
(User3 { name: 'User3' }),
(Mac { name: 'Mac' }),
(Samsung { name: 'Samsung' }),
(Brastemp { name: 'Brastemp' }),
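The CREATE preview is cut off before any relationships appear. A hypothetical follow-up query matching the gist's title — finding users who viewed a product within a date window — could look like this; the `VIEWED` relationship type and its `date` property are assumptions of mine, since the truncated snippet shows only the nodes:

```cypher
// Hypothetical query: users who VIEWED a product in a given date range.
// VIEWED and its `date` property are assumed; they do not appear in the
// truncated CREATE above.
MATCH (u)-[v:VIEWED]->(p)
WHERE v.date >= '2013-12-10' AND v.date <= '2013-12-17'
RETURN u.name AS user, p.name AS product, v.date AS viewed_on
```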

Interest of users

Motivation

For companies that work with online advertising — more precisely, DSPs and their Real-time Bidding platforms — it is very important to collect and analyze information about the behavior and interests of users while they surf the internet. Therefore, in this graph gist I describe a basic approach that can be used to analyze such data, considering a certain period of time and the products viewed by each user. Let's suppose some characters from Breaking Bad surfed the internet a few days ago, found some interesting products (chemical elements), and are thinking about buying them. Such information is extremely important when placing a bid in an advertising auction, since it tells us the profile and interests of a given user. We therefore store these users, products, and view dates so we can extract this information later.