Skip to content

Instantly share code, notes, and snippets.

@rblaze
Created March 27, 2012 07:10
Show Gist options
  • Save rblaze/2213565 to your computer and use it in GitHub Desktop.
Save rblaze/2213565 to your computer and use it in GitHub Desktop.
module Main where
import Codec.Digest.SHA
import Control.Monad
import Data.Bits
import Data.ByteString as BS(ByteString, last, init, pack, unpack)
import Data.HashTable as H
import Data.Int
import Data.Maybe
import Data.Word
getTail :: Int -> ByteString -> [Word8]
getTail 0 _ = []
getTail n str = BS.last str : getTail (n - 1) (BS.init str)
mkWord :: [Word8] -> Word64
mkWord xs@(_:_:_:_:_:_:_:[]) = mergeWord xs .&. 0xFFFFFFFFFFF --0x3FFFFFFFFFFFF
mkWord _ = error "Wrong list"
mergeWord :: [Word8] -> Word64
mergeWord [] = 0
mergeWord (x:xs) = fromIntegral x .|. shift (mergeWord xs) 8
mkByteString :: Integer -> ByteString
mkByteString i = pack (mksplit i)
where
mksplit :: Integer -> [Word8]
mksplit 0 = []
mksplit n = fromIntegral r : mksplit q
where
(q, r) = quotRem n 256
hashWord :: Word64 -> Int32
hashWord w = if testBit w 40 then v else negate v
where
v :: Int32
v = fromIntegral (w .&. 0x7FFFFFFF)
main::IO()
main = do
ht <- H.new (==) hashWord
forM_ [1..] $ \i -> do
let str = mkByteString i
let hv = mkWord (getTail 7 (hash SHA256 str))
found <- liftM isJust (H.lookup ht hv)
when found $ do
iold <- liftM fromJust (H.lookup ht hv)
error (show (unpack $ mkByteString i) ++ " " ++ show (unpack $ mkByteString iold))
H.update ht hv i
@rblaze
Copy link
Author

rblaze commented Mar 27, 2012

LinearHashTable

11,580,869,996 bytes allocated in the heap
31,126,814,616 bytes copied during GC
243,847,716 bytes maximum residency (7 sample(s))
2,184,568 bytes maximum slop
652 MB total memory in use (120 MB lost due to fragmentation)

Generation 0: 3300 collections, 0 parallel, 165.13s, 165.52s elapsed
Generation 1: 7 collections, 0 parallel, 1.57s, 1.77s elapsed

INIT time 0.00s ( 0.00s elapsed)
MUT time 20.33s ( 20.67s elapsed)
GC time 166.71s (167.29s elapsed)
EXIT time 0.00s ( 0.08s elapsed)
Total time 187.03s (187.96s elapsed)

%GC time 89.1% (89.0% elapsed)

Alloc rate 569,776,540 bytes per MUT second

Productivity 10.9% of total user, 10.8% of total elapsed

@rblaze
Copy link
Author

rblaze commented Mar 27, 2012

Linear, если не заказывать сразу большую таблицу, а использовать H.new

12,547,294,380 bytes allocated in the heap
17,997,812,764 bytes copied during GC
226,684,152 bytes maximum residency (16 sample(s))
3,446,416 bytes maximum slop
701 MB total memory in use (171 MB lost due to fragmentation)

Generation 0: 3300 collections, 0 parallel, 121.88s, 122.42s elapsed
Generation 1: 16 collections, 0 parallel, 1.84s, 1.88s elapsed

INIT time 0.00s ( 0.00s elapsed)
MUT time 26.65s ( 27.16s elapsed)
GC time 123.72s (124.30s elapsed)
EXIT time 0.00s ( 0.09s elapsed)
Total time 150.37s (151.46s elapsed)

%GC time 82.3% (82.1% elapsed)

Alloc rate 470,824,054 bytes per MUT second

Productivity 17.7% of total user, 17.6% of total elapsed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment