@hivefans
hivefans / hbase.rest.scanner.filters.md
Last active March 17, 2020 02:03 — forked from stelcheck/hbase.rest.scanner.filters.md
HBase Stargate REST API Scanner Filter Examples|-|{"files":{"hbase.rest.scanner.filters.md":{"env":"plain"}},"tag":"bigdata"}

Stargate Scanner Filter Examples

Introduction

So yeah... there is no documentation for the HBase REST API on what a filter should look like.

So I installed Eclipse, got the library, and took some time to find some of the (seemingly) most useful filters you could use. I'm very green at anything regarding HBase, and I hope this will help anyone trying to get started with it.

What I discovered is that, basically, the attributes of the filter object follow the same naming as in the documentation. For this reason, I have made the filter names clickable links that point to the corresponding HBase class documentation; check the constructor argument names, and you will have your attribute list (more or less).
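As a rough illustration of that naming rule, here is a minimal Python sketch that builds a scanner request body for a `PrefixFilter`. It assumes the commonly seen Stargate layout, where the filter is a JSON object with a `type` key plus the constructor argument names (here a single base64-encoded `value`), embedded in a `<Scanner>` XML element. The helper names `prefix_filter` and `scanner_body` are made up for this example, and the exact attribute set should be checked against your HBase version.

```python
import base64
import json
from xml.sax.saxutils import escape


def prefix_filter(prefix: bytes) -> dict:
    """Build a filter spec; binary values are sent base64-encoded (assumed layout)."""
    return {
        "type": "PrefixFilter",
        "value": base64.b64encode(prefix).decode("ascii"),
    }


def scanner_body(filter_spec: dict, batch: int = 100) -> str:
    """Wrap the JSON filter in a <Scanner> element, escaping XML special characters."""
    filter_json = json.dumps(filter_spec)
    return '<Scanner batch="%d"><filter>%s</filter></Scanner>' % (
        batch,
        escape(filter_json),
    )


spec = prefix_filter(b"row-1")
body = scanner_body(spec)
print(body)
```

The resulting string would typically be sent with `Content-Type: text/xml` to the table's scanner endpoint; that request is left out here so the sketch stays runnable without a cluster.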

hivefans / hbase-rest-examples.sh
Last active March 17, 2020 02:03 — forked from karmi/hbase-rest-examples.sh
Experiments with the HBase REST API|-|{"files":{"hbase-rest-examples.sh":{"env":"plain"}},"tag":"bigdata"}
#!/usr/bin/env bash
#
# ===================================
# Experiments with the HBase REST API
# ===================================
#
# <http://hbase.apache.org/docs/r0.20.4/api/org/apache/hadoop/hbase/rest/package-summary.html>
#
# Usage:
#
hivefans / pyrdd_access_javardd.md
Last active March 17, 2020 02:03 — forked from yu-iskw/testing.md
PySpark serializer and deserializer testing with a nested and complicated value|-|{"files":{"pyrdd_access_javardd.md":{"env":"plain"}},"tag":"bigdata"}

Python =(parallelize)=> RDD =(collect)=> Python

It works well.

>>> sc = SparkContext('local', 'test', batchSize=2)
>>> data = [([1, 0], [0.5, 0.499]), ([0, 1], [0.5, 0.499])]
>>> rdd = sc.parallelize(data)
>>> rdd.collect()
[([1, 0], [0.5, 0.499]), ([0, 1], [0.5, 0.499])]
hivefans / watch_log.py
Last active March 17, 2020 02:03 — forked from albsen/watch_log.py
python log file watcher|-|{"files":{"watch_log.py":{"env":"plain"}},"tag":"bigdata"}
#!/usr/bin/env python
"""
Real time log files watcher supporting log rotation.
Author: Giampaolo Rodola' <g.rodola [AT] gmail [DOT] com>
License: MIT
"""
import os
hivefans / es-index-rate.py
Last active March 17, 2020 02:04 — forked from slackorama/es-index-rate.py
|-|{"files":{"es-index-rate.py":{"env":"plain"}},"tag":"bigdata"}
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import argparse
import time
import requests
SLEEP_TIME = 5
hivefans / SimpleHTTPServerWithUpload.py
Last active March 17, 2020 02:04 — forked from UniIsland/SimpleHTTPServerWithUpload.py
Simple Python Http Server with Upload|-|{"files":{"SimpleHTTPServerWithUpload.py":{"env":"plain"}},"tag":"bigdata"}
#!/usr/bin/env python
# coding=utf-8
# http://my.oschina.net/leejun2005/blog/71444
"""
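Overview: this is a lightweight file-sharing server written in Python (based on the built-in SimpleHTTPServer module).
It supports file upload and download. As long as Python is installed (version 2.6 to 2.7 recommended; 3.x is not supported),
go to the directory you want to share and run:
python SimpleHTTPServerWithUpload.py
or python SimpleHTTPServerWithUpload.py filename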
"""
hivefans / golang_job_queue.md
Last active March 17, 2020 02:03 — forked from harlow/golang_job_queue.md
Job queues in Golang|-|{"files":{"worker_standalone.go":{"env":"plain"},"golang_job_queue.md":{"env":"plain"},"worker_refactored.go":{"env":"plain"},"worker_original.go":{"env":"plain"}},"tag":"Uncategorized"}
hivefans / config.go
Last active March 17, 2020 02:03 — forked from nl5887/config.go
|-|{"files":{"main.go":{"env":"plain"},"config.yaml":{"env":"plain"},"config.go":{"env":"plain"}},"tag":"Uncategorized"}
package honeycast
import (
"io/ioutil"
"regexp"
"github.com/imdario/mergo"
"gopkg.in/yaml.v2"
)
hivefans / kafka_quick_hack.go
Last active March 17, 2020 02:04 — forked from hackintoshrao/kafka_quick_hack.go
A quick hack to get the Golang kafka driver to publish to "test" topic created from the quick start guide here http://kafka.apache.org/documentation.html#quickstart|-|{"files":{"kafka_quick_hack.go":{"env":"plain"}},"tag":"bigdata"}
package main
import (
"github.com/Shopify/sarama"
"crypto/tls"
"crypto/x509"
"encoding/json"
"flag"
"fmt"
hivefans / sarama_produce_test_with_timestamp.go
Last active March 17, 2020 02:03 — forked from niels-s/sarama_produce_test_with_timestamp.go
|-|{"files":{"sarama_produce_test_with_timestamp.go":{"env":"plain"}},"tag":"bigdata"}
package main
import (
"log"
"os"
"time"
"github.com/Shopify/sarama"
)