@coderplay
coderplay / DataPack.java
Created October 17, 2012 13:17
InfoBright
import java.util.concurrent.locks.ReentrantLock;

public abstract class DataPack<T> {
    private int decomposerId;
    private int outliers;
    private ReentrantLock dataPackLock;

    public void setDecomposerId(int decomposerId) {
        this.decomposerId = decomposerId;
    }
@coderplay
coderplay / CMakeLists.txt
Created December 9, 2012 14:22
supersonic with cmake
cmake_minimum_required(VERSION 2.8)
project(supersonic)
MESSAGE(STATUS "This is BINARY dir " ${supersonic_BINARY_DIR})
MESSAGE(STATUS "This is SOURCE dir "${supersonic_SOURCE_DIR})
#============================================================================
# HEADERS
@coderplay
coderplay / BuildingImpala.md
Created December 11, 2012 07:56
Building Impala

RHEL 5u4

Upgrade Python

The default version of Python installed on RHEL 5u4 is 2.4.3, which doesn't support the with statement. Here is the error message:

******************************
 Building Impala backend 
******************************

File "/home/zhouchen.zm/impala/bin/gen_build_version.py", line 43

@coderplay
coderplay / Lucene.md
Last active December 10, 2015 00:28
Search-related knowledge

The difference between Grouping and Faceting

  • Grouping was first released with Lucene 3.2; its related JIRA issue is LUCENE-1421. It lets you group search results by a specified field. For example, if you group by the author field, all documents with the same value in the author field fall into a single group, and you get a kind of tree as output. If you want to go deeper into this Lucene feature, this blog post should be useful.
  • Faceting was first released with Lucene 3.4; its related JIRA issue is LUCENE-3079. This feature doesn't group documents; it just tells you how many documents fall under each value of a facet. For example, if you have a facet based on the author field, you will receive a list of all your authors, and for each author you will know how many documents belong to that specific author. … (a conceptual sketch of the difference follows below)
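
To make the distinction concrete, here is a minimal, Lucene-free sketch in plain Java. It is not the Lucene grouping or faceting API; the doc IDs and author values are made up for illustration. Grouping keeps the matching documents under each author, while faceting only counts how many documents carry each author value.

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupingVsFaceting {
    public static void main(String[] args) {
        // Pretend these are the "author" field values of the documents in a result set.
        String[] docIds  = {"doc1", "doc2", "doc3", "doc4"};
        String[] authors = {"alice", "bob", "alice", "carol"};

        // Grouping: each group keeps the documents that fall into it (a tree-like result).
        Map<String, List<String>> groups = new LinkedHashMap<String, List<String>>();
        for (int i = 0; i < docIds.length; i++) {
            List<String> members = groups.get(authors[i]);
            if (members == null) {
                members = new ArrayList<String>();
                groups.put(authors[i], members);
            }
            members.add(docIds[i]);
        }
        System.out.println("groups = " + groups);      // {alice=[doc1, doc3], bob=[doc2], carol=[doc4]}

        // Faceting: only a count per distinct value, no documents attached.
        Map<String, Integer> facetCounts = new LinkedHashMap<String, Integer>();
        for (String author : authors) {
            Integer count = facetCounts.get(author);
            facetCounts.put(author, count == null ? 1 : count + 1);
        }
        System.out.println("facets = " + facetCounts); // {alice=2, bob=1, carol=1}
    }
}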
@coderplay
coderplay / MappingTest.java
Created December 26, 2012 02:12
Java Map
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.RandomAccessFile;
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
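
Only the imports survive in this preview. As a rough, hedged sketch of the kind of comparison they suggest (buffered stream writes followed by a bulk ByteBuffer read), with an arbitrary file name and size; the original gist's actual test may differ:

import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

public class StreamVsBufferRead {
    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("mappingtest", ".bin");
        f.deleteOnExit();

        // Write 1M longs through a buffered stream.
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(f)))) {
            for (long i = 0; i < 1_000_000; i++) {
                out.writeLong(i);
            }
        }

        // Read them back in one bulk transfer into a ByteBuffer.
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            ByteBuffer buf = ByteBuffer.allocate((int) raf.length());
            raf.getChannel().read(buf);
            buf.flip();
            long sum = 0;
            while (buf.remaining() >= 8) {
                sum += buf.getLong();
            }
            System.out.println("sum = " + sum);
        }
    }
}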
@coderplay
coderplay / DirectMemoryTricky.java
Created December 30, 2012 07:38
Direct Memory Tricks
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;
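
Again, only the imports are visible. A minimal sketch of the kind of JMX inspection they point at, assuming the gist pokes at direct-memory and GC counters; the java.nio:type=BufferPool,name=direct MBean is standard since Java 7, but the gist's actual trick may be different:

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class DirectMemoryInspect {
    public static void main(String[] args) throws Exception {
        // Allocate some off-heap memory so the direct pool has something to show.
        ByteBuffer.allocateDirect(64 * 1024 * 1024);

        // Direct buffer pool statistics are exposed via JMX since Java 7.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName directPool = new ObjectName("java.nio:type=BufferPool,name=direct");
        System.out.println("direct buffers : " + server.getAttribute(directPool, "Count"));
        System.out.println("direct memory  : " + server.getAttribute(directPool, "MemoryUsed") + " bytes");

        // GC activity, per collector.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}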
@coderplay
coderplay / TestMemoryAccessPatterns.java
Last active December 10, 2015 15:18
on-heap array vs off-heap array
import java.io.File;
import java.io.RandomAccessFile;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;

import sun.nio.ch.FileChannelImpl;

public class TestMemoryAccessPatterns {
    private static final int LONG_SIZE = 8;
    private static final int PAGE_SIZE = 2 * 1024 * 1024;
    private static final int ONE_GIG = 1024 * 1024 * 1024;
    private static final int ARRAY_SIZE = (int) (ONE_GIG / LONG_SIZE);
    private static final int WORDS_PER_PAGE = PAGE_SIZE / LONG_SIZE;
    private static final int ARRAY_MASK = ARRAY_SIZE - 1;
    private static final int PAGE_MASK = WORDS_PER_PAGE - 1;
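
The preview stops at the constants. As a hedged illustration of what "on-heap vs off-heap" means here, a small sketch that walks a long[] and then the same amount of memory allocated through sun.misc.Unsafe; this assumes the benchmark's general shape, is not the gist's code, and needs a JDK where sun.misc.Unsafe is accessible:

import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class HeapVsOffHeapWalk {
    private static final int LONG_SIZE = 8;
    private static final int ARRAY_SIZE = 1 << 22; // 4M longs = 32 MB, small enough for default heap

    public static void main(String[] args) throws Exception {
        // On-heap: a plain Java array, managed (and moved) by the GC.
        long[] onHeap = new long[ARRAY_SIZE];
        long sum = 0;
        for (int i = 0; i < ARRAY_SIZE; i++) {
            onHeap[i] = i;
            sum += onHeap[i];
        }

        // Off-heap: raw memory from Unsafe, addressed by byte offset, freed manually.
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        long base = unsafe.allocateMemory((long) ARRAY_SIZE * LONG_SIZE);
        long offSum = 0;
        for (int i = 0; i < ARRAY_SIZE; i++) {
            long addr = base + (long) i * LONG_SIZE;
            unsafe.putLong(addr, i);
            offSum += unsafe.getLong(addr);
        }
        unsafe.freeMemory(base);

        System.out.println(sum == offSum);
    }
}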
@coderplay
coderplay / drop_cache.cpp
Last active December 10, 2015 19:28
drop page cache
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <assert.h>
@coderplay
coderplay / powerdrill.md
Last active December 11, 2015 21:18
Translation of the Google PowerDrill paper

Processing a Trillion Cells per Mouse Click

Abstract

Column-oriented data stores have become a game changer in the industry. Highly customized and tuned systems allow …

Introduction

Background

Contributions

Related Work

Basic Approach

The Power of Full Scans vs. Skipping Data

Partitioning the Data