# check
lucene $ ./gradlew check
BUILD SUCCESSFUL in 3m 59s
# packaging
lucene $ ./gradlew clean
lucene $ ./gradlew assembleRelease
BUILD SUCCESSFUL in 49s
# luke
@State(Scope.Benchmark)
public class SearchBenchmark {
    private static final String dirPath = System.getProperty("index.dir");
    private static final String[] terms1 = new String[]{"電車", "列車", "鉄道"};
    private Directory dir;
    private IndexReader reader;
    private Query query1;
use rand::prelude::*;
fn main() {
    let mut rng = thread_rng();
    let p: f32 = 0.00001;
    let max_doc: usize = 1_000_000;
    let mut postings: Vec<usize> = vec![rng.gen_range(1, 1000) as usize];
    loop {
        let next = postings.last().unwrap() + geo_random(p);
        if next > max_doc {
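The preview above is cut off before `geo_random` is defined, so the following is only a sketch of one common way to draw geometric gaps: inverting the CDF of a uniform sample. The names `geometric_from_uniform` and `postings_from_uniforms` are hypothetical, the uniform samples are passed in explicitly so that no `rand` call is assumed, and stopping once `max_doc` is exceeded is a guess at what the truncated loop does.

```rust
/// Map a uniform sample `u` in [0, 1) to a geometric variate on {1, 2, ...}
/// with success probability `p`, by inverting the geometric CDF.
/// Hypothetical stand-in for the `geo_random` helper that is not shown.
pub fn geometric_from_uniform(u: f64, p: f64) -> usize {
    assert!((0.0..1.0).contains(&u), "u must lie in [0, 1)");
    assert!(p > 0.0 && p < 1.0, "p must lie in (0, 1)");
    ((1.0 - u).ln() / (1.0 - p).ln()).floor() as usize + 1
}

/// Build an increasing postings list: start at `first`, add one geometric gap
/// per uniform sample, and stop once the next doc id would exceed `max_doc`
/// (assumed behaviour of the truncated loop).
pub fn postings_from_uniforms(first: usize, us: &[f64], p: f64, max_doc: usize) -> Vec<usize> {
    let mut postings = vec![first];
    for &u in us {
        let next = postings[postings.len() - 1] + geometric_from_uniform(u, p);
        if next > max_doc {
            break;
        }
        postings.push(next);
    }
    postings
}
```

In real use the uniform samples would come from a random number generator; taking them as a slice keeps the sketch deterministic and easy to check.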
pub fn encode_vbyte(li: &[usize]) -> Vec<u8> {
    fn encode(k: usize) -> Vec<u8> {
        let mut vbytes = Vec::new();
        let mut tmp = k;
        while tmp >= 128 {
            vbytes.push(128 + (tmp & 127) as u8);
            tmp >>= 7;
        }
        vbytes.push(tmp as u8);
        vbytes
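The inner `encode` above emits 7 payload bits per byte, low-order group first, and sets the high bit (128) on every byte except the last. A minimal round-trip sketch under that same byte layout follows. The function names are hypothetical, and because the rest of `encode_vbyte` is not visible, this version simply concatenates per-integer codes without assuming any gap encoding.

```rust
/// Encode one integer: 7 payload bits per byte, low-order group first.
/// The high bit (128) means "more bytes follow", as in the snippet above.
pub fn vbyte_encode_one(k: usize) -> Vec<u8> {
    let mut out = Vec::new();
    let mut tmp = k;
    while tmp >= 128 {
        out.push(128 + (tmp & 127) as u8);
        tmp >>= 7;
    }
    out.push(tmp as u8);
    out
}

/// Encode a slice by concatenating the per-integer codes.
/// Assumption: no gap (delta) transform is applied here.
pub fn vbyte_encode_all(values: &[usize]) -> Vec<u8> {
    values.iter().flat_map(|&k| vbyte_encode_one(k)).collect()
}

/// Decode a concatenated stream back into integers.
pub fn vbyte_decode_all(bytes: &[u8]) -> Vec<usize> {
    let mut out = Vec::new();
    let mut cur: usize = 0;
    let mut shift = 0;
    for &b in bytes {
        // Accumulate the 7 payload bits at the current position.
        cur |= ((b & 127) as usize) << shift;
        if b & 128 != 0 {
            // Continuation bit set: the next byte holds higher-order bits.
            shift += 7;
        } else {
            // Last byte of this integer.
            out.push(cur);
            cur = 0;
            shift = 0;
        }
    }
    out
}
```

For example, 300 becomes the two bytes 172 and 2: the low seven bits (44) plus the continuation flag, then the remaining high bits.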
// https://crates.io/crates/bit-vec
use bit_vec::BitVec;
pub fn encode_rice(li: &[usize], m: u32) -> BitVec {
    fn encode_quotient(k: usize, m: u32) -> BitVec {
        let q: usize = (((k - 1) / m as usize) as f64).floor() as usize;
        // encode (quotient + 1) in unary code
        let mut bv = BitVec::from_elem(q + 1, false);
        bv.set(q, true);
        bv
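The visible part of `encode_rice` writes the quotient `(k - 1) / m` in unary as `q` zero bits followed by a single one bit. The remainder step is cut off in the preview, so the sketch below fills it in with the textbook Rice convention: `m` is a power of two and the remainder is written in `log2(m)` bits, most significant first. That convention, the name `rice_encode_one`, and the use of `Vec<bool>` in place of `bit_vec::BitVec` (to keep the example dependency-free) are all assumptions.

```rust
/// Rice-code one positive integer `k` with parameter `m` (a power of two).
/// Layout: unary quotient (q zeros, then a one), then the remainder in
/// log2(m) bits, most significant bit first.
pub fn rice_encode_one(k: usize, m: u32) -> Vec<bool> {
    assert!(k >= 1, "this scheme codes positive integers only");
    assert!(m.is_power_of_two(), "Rice coding assumes m is a power of two");
    let m_us = m as usize;
    // Integer division already floors, so no float round trip is needed.
    let q = (k - 1) / m_us;
    let r = (k - 1) % m_us;
    // Unary part: q zeros terminated by a one, as in the snippet above.
    let mut bits = vec![false; q];
    bits.push(true);
    // Binary part: remainder in exactly log2(m) bits.
    let width = m.trailing_zeros();
    for i in (0..width).rev() {
        bits.push((r >> i) & 1 == 1);
    }
    bits
}
```

With `m = 4`, the value 7 has quotient 1 and remainder 2, giving the bits `0 1` followed by `1 0`.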
// https://crates.io/crates/bit-vec
use bit_vec::BitVec;
pub fn encode_golomb(li: &[usize], m: u32) -> BitVec {
    fn encode_quotient(k: usize, m: u32) -> BitVec {
        let q: usize = (((k - 1) / m as usize) as f64).floor() as usize;
        // encode (quotient + 1) in unary code
        let mut bv = BitVec::from_elem(q + 1, false);
        bv.set(q, true);
        bv
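The quotient step of `encode_golomb` matches the Rice version, and the preview ends before the remainder is handled. For a general `m`, the usual choice is truncated binary coding of the remainder, which the sketch below assumes. The original may do something different. `golomb_encode_one` is a hypothetical name and `Vec<bool>` again stands in for `bit_vec::BitVec`.

```rust
/// Golomb-code one positive integer `k` with parameter `m >= 1`.
/// Layout: unary quotient (q zeros, then a one), then the remainder in
/// truncated binary (an assumption; the source preview does not show it).
pub fn golomb_encode_one(k: usize, m: u32) -> Vec<bool> {
    assert!(k >= 1 && m >= 1, "k and m must be positive");
    let m_us = m as usize;
    let q = (k - 1) / m_us;
    let r = (k - 1) % m_us;
    let mut bits = vec![false; q];
    bits.push(true);
    // b = ceil(log2(m)); the first `cutoff` remainders use b - 1 bits,
    // the rest are shifted by `cutoff` and use b bits.
    let b = 32 - (m - 1).leading_zeros();
    let cutoff = (1usize << b) - m_us;
    let (value, width) = if r < cutoff { (r, b - 1) } else { (r + cutoff, b) };
    for i in (0..width).rev() {
        bits.push((value >> i) & 1 == 1);
    }
    bits
}
```

When `m` is a power of two the cutoff is zero and every remainder uses `log2(m)` bits, so this reduces to the Rice layout.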
// https://crates.io/crates/bit-vec
use bit_vec::BitVec;
pub fn encode_gamma(li: &[usize]) -> BitVec {
    fn encode(k: usize) -> BitVec {
        let body_len: usize = ((k as f64).log2().floor() as usize) + 1;
        let body = BitVec::from_bytes(&k.to_be_bytes());
        let mut bv = BitVec::from_elem(body_len * 2 - 1, false);
        // set selector bit
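The snippet allocates `2 * body_len - 1` bits, which is the length of a standard Elias gamma code: `body_len - 1` zeros followed by the `body_len` binary digits of `k`. The sketch below assumes that standard layout, since the rest of the function is cut off. It takes the bit length from `leading_zeros` rather than a floating-point `log2`, which sidesteps rounding for integers too large to be represented exactly as `f64`. The function names are hypothetical and `Vec<bool>` replaces `bit_vec::BitVec`.

```rust
/// Elias gamma code for one positive integer:
/// (body_len - 1) zeros, then k in binary using body_len bits.
pub fn gamma_encode_one(k: usize) -> Vec<bool> {
    assert!(k >= 1, "gamma codes are defined for positive integers");
    // Number of significant bits in k, computed without floating point.
    let body_len = (usize::BITS - k.leading_zeros()) as usize;
    let mut bits = vec![false; body_len - 1];
    for i in (0..body_len).rev() {
        bits.push((k >> i) & 1 == 1);
    }
    bits
}

/// Decode a concatenation of gamma codes; stops quietly on a truncated tail.
pub fn gamma_decode_all(bits: &[bool]) -> Vec<usize> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < bits.len() {
        // Count the leading zeros: they give body_len - 1.
        let mut zeros = 0;
        while pos < bits.len() && !bits[pos] {
            zeros += 1;
            pos += 1;
        }
        if pos + zeros + 1 > bits.len() {
            break; // incomplete code word at the end of the stream
        }
        // Read body_len = zeros + 1 bits, most significant first.
        let mut k = 0usize;
        for _ in 0..=zeros {
            k = (k << 1) | bits[pos] as usize;
            pos += 1;
        }
        out.push(k);
    }
    out
}
```

For instance, 5 is `101` in binary, so its code is two zeros followed by `101`.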
// https://crates.io/crates/bit-vec
use bit_vec::BitVec;
pub fn encode_delta(li: &[usize]) -> BitVec {
    fn encode(k: usize) -> BitVec {
        let body_len: usize = ((k as f64).log2().floor() as usize) + 1;
        let body = BitVec::from_bytes(&k.to_be_bytes());
        // set gamma encoded selector
        let mut bv = encode_gamma(&[body_len]);
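The snippet starts the code word with the gamma code of `body_len`, which is how Elias delta coding begins. What follows in the original is not visible, so the sketch below assumes the standard continuation: the binary digits of `k` with the leading one bit dropped. Helper names are hypothetical, the gamma helper is repeated so the block stands alone, and `Vec<bool>` replaces `bit_vec::BitVec`.

```rust
/// Number of significant bits in k (k >= 1).
fn bit_len(k: usize) -> usize {
    (usize::BITS - k.leading_zeros()) as usize
}

/// Elias gamma code: (len - 1) zeros, then k in binary.
fn gamma_bits(k: usize) -> Vec<bool> {
    let len = bit_len(k);
    let mut bits = vec![false; len - 1];
    for i in (0..len).rev() {
        bits.push((k >> i) & 1 == 1);
    }
    bits
}

/// Elias delta code for one positive integer:
/// gamma(body_len), then the body of k without its leading one bit.
pub fn delta_encode_one(k: usize) -> Vec<bool> {
    assert!(k >= 1, "delta codes are defined for positive integers");
    let body_len = bit_len(k);
    // Selector: the body length itself, gamma coded.
    let mut bits = gamma_bits(body_len);
    // Body: the low body_len - 1 bits of k, most significant first.
    for i in (0..body_len - 1).rev() {
        bits.push((k >> i) & 1 == 1);
    }
    bits
}
```

As a check, 10 is `1010` in binary with length 4: the selector is gamma(4) = `00100` and the body is `010`, giving `00100010`.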
FROM docker.elastic.co/elasticsearch/elasticsearch:7.7.0
ENV PATH /usr/share/elasticsearch/bin:$PATH
# switch user to elasticsearch
USER elasticsearch
# install plugins
RUN elasticsearch-plugin install analysis-kuromoji
# base image
FROM docker.elastic.co/elasticsearch/elasticsearch:7.7.0
# PATH
ENV PATH /usr/share/elasticsearch/bin:$PATH
USER elasticsearch
# copy configuration file
COPY elasticsearch.yml /usr/share/elasticsearch/config/