Some benchmark results, for:
- Setting up a `MemReader`.
- Setting up a `MemReader` and reading 130 tokens as sentences (with lots of `String` and `Vector`).
- Reading 1 token, with no copies or allocations.
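The "1 token, no copies or allocations" case can be sketched as parsing a token whose fields borrow directly from the input line. The `Token` struct and `token_from_str` function below are illustrative assumptions, not the actual types from this codebase:

```rust
// Hypothetical sketch of zero-copy token parsing: the fields are &str
// slices borrowing from the input line, so no String is allocated.
#[derive(Debug, PartialEq)]
struct Token<'a> {
    form: &'a str, // word form, borrowed from the line
    pos: &'a str,  // part-of-speech tag, borrowed from the line
}

fn token_from_str(line: &str) -> Option<Token<'_>> {
    // CoNLL-style columns are tab-separated; borrow each field in place.
    let mut cols = line.split('\t');
    let form = cols.next()?;
    let pos = cols.next()?;
    Some(Token { form, pos })
}

fn main() {
    let line = "dog\tNN";
    let tok = token_from_str(line).unwrap();
    assert_eq!(tok, Token { form: "dog", pos: "NN" });
}
```

Because `Token` only holds borrowed slices, constructing it is just pointer arithmetic over the line, which is what makes a ~99 ns/iter figure plausible.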
```
test conll::memreader_overhead       ... bench:     195 ns/iter (+/- 22)    = 22394 MB/s
test conll::sentence_reader_iter_bench ... bench:  269884 ns/iter (+/- 40269) =    16 MB/s
test conll::token_from_str_bench     ... bench:      99 ns/iter (+/- 24)    =   313 MB/s
```
We can see that the `MemReader` incurs negligible overhead, and `Token::from_str` is fast enough that it won't be a bottleneck: for comparison, Snappy, one of the fastest decompression algorithms in common use, runs at around 500 MB/s on a single i7 core.
Next up: Modifying the sentence reader to require zero amortized allocations.
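One common way to get zero amortized allocations is to reuse a single buffer across reads instead of allocating a fresh `String` per line. This is a sketch of that technique under my own assumptions about the design (the `count_tokens` helper is hypothetical, not the actual sentence reader):

```rust
use std::io::{BufRead, Cursor};

// Sketch of amortized-zero-allocation reading: one String buffer is
// cleared and reused for every line, so after warm-up the loop performs
// no per-token heap allocation.
fn count_tokens<R: BufRead>(mut reader: R) -> usize {
    let mut buf = String::new();
    let mut count = 0;
    loop {
        buf.clear(); // keep the capacity, drop the old contents
        match reader.read_line(&mut buf) {
            Ok(0) => break, // EOF
            Ok(_) => {
                // Blank lines separate sentences in CoNLL; skip them.
                if !buf.trim().is_empty() {
                    count += 1;
                }
            }
            Err(_) => break,
        }
    }
    count
}

fn main() {
    let data = "dog\tNN\ncat\tNN\n\nran\tVBD\n";
    assert_eq!(count_tokens(Cursor::new(data)), 3);
}
```

The catch is lifetimes: tokens that borrow from the reused buffer can't outlive the next `clear()`, which is presumably why the reader's iterator API needs modifying rather than just its internals.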