
@veer66
Last active August 3, 2021 03:38

Benchmark

nlpo3-cli vs newmm

Setup

  • Computer: Scaleway's Mac mini M1
  • Rustc: rustc 1.54.0 (a178d0322 2021-07-26)
  • Python: Python 3.8.2
  • OS: Darwin 506124d8-4acf-4595-9d46-8ca4b44b8110 20.6.0 Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:27 PDT 2021; root:xnu-7195.141.2~5/RELEASE_ARM64_T8101 arm64
  • Script:
#!/bin/bash

set -x

INPUT=thwik-head1m.txt

for i in {1..10}
do
  { time python3 newmm.py < $INPUT > newmm.out ; } 2>> bench_newmm.txt
  { time nlpo3 segment < $INPUT > cham.out ; } 2>> bench_o3.txt
done
  • A command line interface for newmm:
from pythainlp import word_tokenize
import sys

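for line in sys.stdin:
    # word_tokenize uses the newmm engine by default.
    # Strip only the trailing newline so a final line without one is not truncated.
    print("|".join(word_tokenize(line.rstrip("\n"))))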

Result

nlpo3

% grep real bench_o3.txt
real    2m10.923s
real    2m12.014s
real    2m10.931s
real    2m9.448s
real    2m9.055s
real    2m10.570s
real    2m10.672s
real    2m10.140s
real    2m11.220s
real    2m9.941s

newmm

% grep real bench_newmm.txt 
real    7m52.180s
real    7m58.090s
real    7m57.071s
real    8m9.779s
real    7m54.576s
real    7m52.807s
real    7m59.109s
real    7m58.489s
real    7m59.604s
real    7m57.844s

Average

  • nlpo3
% grep real bench_o3.txt | ruby -lane 'BEGIN { all = 0.0; cnt = 0 }; cols = $F[1].split(/[ms]/).map {|x| x.to_f }; v = cols[0]*60 + cols[1]; all += v; cnt += 1; END { p all/cnt}' 
130.49140000000003
  • newmm
% grep real bench_newmm.txt | ruby -lane 'BEGIN { all = 0.0; cnt = 0 }; cols = $F[1].split(/[ms]/).map {|x| x.to_f }; v = cols[0]*60 + cols[1]; all += v; cnt += 1; END { p all/cnt}'
477.9549
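
For readers without Ruby, the same averaging can be sketched in Python. This is only an illustration, not part of the original benchmark: the names `parse_real` and `mean_real` are made up here, and the sketch assumes every timing line has the `real    XmY.YYYs` shape shown above.

```python
import re
import statistics

# Matches lines such as "real    2m10.923s" written by bash's `time` keyword.
REAL_LINE = re.compile(r"^real\s+(\d+)m([\d.]+)s$")


def parse_real(line):
    """Return the wall-clock time in seconds, or None for non-`real` lines."""
    m = REAL_LINE.match(line.strip())
    if m is None:
        return None
    return int(m.group(1)) * 60 + float(m.group(2))


def mean_real(lines):
    """Average the `real` timings found in an iterable of lines."""
    times = [t for t in map(parse_real, lines) if t is not None]
    return statistics.mean(times)
```

Calling `mean_real(open("bench_o3.txt"))` should give the same kind of figure as the Ruby one-liner, up to floating-point rounding.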

Performance ratio

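3.66 (newmm average / nlpo3 average = 477.9549 s / 130.4914 s, so nlpo3 is about 3.66 times faster on this input)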
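
The ratio follows directly from the two averages reported above. A minimal sketch of the arithmetic, where `speedup` is a hypothetical helper name used only for illustration:

```python
def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# Averages taken from the "Average" section: newmm first, then nlpo3.
ratio = speedup(477.9549, 130.49140000000003)
print(f"{ratio:.2f}")  # prints 3.66
```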
