Last active January 1, 2016 21:39
Space inefficient program
ghc -O2 -rtsopts -threaded -prof -fprof-auto -fforce-recomp reader.hs
time ./reader +RTS -K1G -sstderr -pa -A3M
Mon Dec 30 23:37 2013 Time and Allocation Profiling Report  (Final)

   reader +RTS -K1G -sstderr -pa -A3M -RTS

total time  = 5.73 secs (5726 ticks @ 1000 us, 1 processor)
total alloc = 1,920,697,176 bytes (excludes profiling overheads)

COST CENTRE            MODULE                                %time %alloc   ticks       bytes
GC                     GC                                     52.4    0.0    3001           0
popLink                Main                                   15.3   29.5     877   566607296
generalIndexer         Main                                   12.8   15.7     735   301188920
resourceName           Main                                    8.9   27.7     511   531200856
indexLinks.indexLoop   Main                                    3.2    5.0     185    96016120
link                   Main                                    2.7    7.3     157   140799648
comment                Main                                    2.0    6.3     117   121599960
indexLinks.insertLink  Main                                    0.9    4.2      51    79999800
linkLine               Main                                    0.6    1.7      34    31999920
linkLineParser         Main                                    0.5    2.7      30    51200000
OVERHEAD_of            PROFILING                               0.3    0.0      16           0
SYSTEM                 SYSTEM                                  0.2    0.0      11       16352
MAIN                   MAIN                                    0.0    0.0       1        6768
IDLE                   IDLE                                    0.0    0.0       0           0
PINNED                 SYSTEM                                  0.0    0.0       0           0
DONT_CARE              MAIN                                    0.0    0.0       0           0
CAF                    GHC.Integer.Type                        0.0    0.0       0           0
CAF                    GHC.Integer.Logarithms.Internals        0.0    0.0       0         320
CAF                    GHC.IO.Encoding.Failure                 0.0    0.0       0           0
CAF                    GHC.Real                                0.0    0.0       0           0
CAF                    GHC.Float                               0.0    0.0       0           0
CAF                    GHC.Event.PSQ                           0.0    0.0       0           0
CAF                    GHC.IO.Handle.Types                     0.0    0.0       0           0
CAF                    GHC.IO.Encoding.UTF8                    0.0    0.0       0           0
CAF                    GHC.IO.Encoding.UTF32                   0.0    0.0       0           0
CAF                    GHC.IO.Encoding.UTF16                   0.0    0.0       0           0
CAF                    GHC.Enum                                0.0    0.0       0           0
CAF                    GHC.Event.Manager                       0.0    0.0       0           0
CAF                    GHC.Event.Clock                         0.0    0.0       0           0
CAF                    Foreign.Marshal.Alloc                   0.0    0.0       0           0
CAF                    Data.Typeable.Internal                  0.0    0.0       0           0
CAF                    GHC.Event.Internal                      0.0    0.0       0          32
CAF                    GHC.Event.EPoll                         0.0    0.0       0           0
CAF                    GHC.Event.Control                       0.0    0.0       0           0
CAF                    GHC.Int                                 0.0    0.0       0           0
CAF                    GHC.IO.Encoding.Iconv                   0.0    0.0       0         248
CAF                    GHC.IO.FD                               0.0    0.0       0          32
CAF                    GHC.Conc.Sync                           0.0    0.0       0           0
CAF                    System.Posix.Internals                  0.0    0.0       0           0
CAF                    Data.Maybe                              0.0    0.0       0           0
CAF                    GHC.Show                                0.0    0.0       0           0
CAF                    GHC.IO.Encoding                         0.0    0.0       0        3376
CAF                    GHC.Exception                           0.0    0.0       0           0
CAF                    GHC.Conc.Signal                         0.0    0.0       0         672
CAF                    GHC.Arr                                 0.0    0.0       0           0
CAF                    GHC.Event.Thread                        0.0    0.0       0         904
CAF                    GHC.TopHandler                          0.0    0.0       0           0
CAF                    GHC.List                                0.0    0.0       0           0
CAF                    GHC.IO.Handle.Text                      0.0    0.0       0           0
CAF                    GHC.IO.Exception                        0.0    0.0       0           0
CAF                    Control.Exception.Base                  0.0    0.0       0           0
CAF                    GHC.IO.Handle.Internals                 0.0    0.0       0           0
CAF                    GHC.IO.Handle.FD                        0.0    0.0       0       34672
CAF                    GHC.IO.Handle                           0.0    0.0       0           0
CAF                    GHC.ForeignPtr                          0.0    0.0       0           0
CAF                    GHC.Err                                 0.0    0.0       0           0
CAF                    Data.ByteString                         0.0    0.0       0           0
CAF                    Data.Map                                0.0    0.0       0           0
CAF                    Data.Attoparsec.ByteString.FastSet      0.0    0.0       0           0
CAF                    Data.Attoparsec.Internal.Types          0.0    0.0       0           0
CAF                    Data.Attoparsec.ByteString.Internal     0.0    0.0       0           0
rnf                    Main                                    0.0    0.0       0           0
rnf                    Main                                    0.0    0.0       0           0
showList               Main                                    0.0    0.0       0           0
showsPrec              Main                                    0.0    0.0       0           0
showList               Main                                    0.0    0.0       0           0
showsPrec              Main                                    0.0    0.0       0           0
showList               Main                                    0.0    0.0       0           0
showsPrec              Main                                    0.0    0.0       0           0
main                   Main                                    0.0    0.0       0       19496
indexLinks             Main                                    0.0    0.0       0         224
linkspath              Main                                    0.0    0.0       0        1256
popLink.parsed         Main                                    0.0    0.0       0           0
CAF                    Main                                    0.0    0.0       0         304

                                                                              individual     inherited
COST CENTRE                MODULE                               no.  entries  %time %alloc  %time %alloc   ticks       bytes
MAIN                       MAIN                                  53        0    0.0    0.0  100.0  100.0       1        6768
 main                      Main                                 107        0    0.0    0.0   47.1  100.0       0       19496
  indexLinks               Main                                 109        1    0.0    0.0   47.1  100.0       0         224
   indexLinks.indexLoop    Main                                 110   400001    3.2    5.0   47.1  100.0     185    96016120
    indexLinks.insertLink  Main                                 116   400000    0.9    4.2   13.7   19.8      51    79999800
     generalIndexer        Main                                 123   799998   12.8   15.7   12.8   15.7     735   301188920
    popLink                Main                                 111   400000   15.3   29.5   30.1   75.2     877   566607296
     linkLineParser        Main                                 113        0    0.5    2.7   14.8   45.6      30    51200000
      comment              Main                                 115        0    2.0    6.3   14.3   43.0     117   121599960
       linkLine            Main                                 118        0    0.6    1.7   12.3   36.7      34    31999920
        resourceName       Main                                 120        0    8.9   27.7   11.7   35.0     511   531198672
         link              Main                                 122        0    2.7    7.3    2.7    7.3     157   140799648
 CAF                       Main                                 105        0    0.0    0.0    0.0    0.0       0         304
  link                     Main                                 121        1    0.0    0.0    0.0    0.0       0           0
  resourceName             Main                                 119        1    0.0    0.0    0.0    0.0       0        2184
  linkLine                 Main                                 117        1    0.0    0.0    0.0    0.0       0           0
  comment                  Main                                 114        1    0.0    0.0    0.0    0.0       0           0
  linkLineParser           Main                                 112        1    0.0    0.0    0.0    0.0       0           0
  linkspath                Main                                 108        1    0.0    0.0    0.0    0.0       0        1256
  main                     Main                                 106        1    0.0    0.0    0.0    0.0       0           0
 CAF                       Data.Attoparsec.ByteString.Internal  104        0    0.0    0.0    0.0    0.0       0           0
 CAF                       Data.Attoparsec.Internal.Types       103        0    0.0    0.0    0.0    0.0       0           0
 CAF                       Data.Attoparsec.ByteString.FastSet   102        0    0.0    0.0    0.0    0.0       0           0
 CAF                       Data.Map                             101        0    0.0    0.0    0.0    0.0       0           0
 CAF                       Data.ByteString                      100        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.Err                               99        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.ForeignPtr                        98        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.IO.Handle                         97        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.IO.Handle.FD                      96        0    0.0    0.0    0.0    0.0       0       34672
 CAF                       GHC.IO.Handle.Internals               95        0    0.0    0.0    0.0    0.0       0           0
 CAF                       Control.Exception.Base                94        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.IO.Exception                      93        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.IO.Handle.Text                    92        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.List                              91        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.TopHandler                        90        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.Event.Thread                      89        0    0.0    0.0    0.0    0.0       0         904
 CAF                       GHC.Arr                               88        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.Conc.Signal                       87        0    0.0    0.0    0.0    0.0       0         672
 CAF                       GHC.Exception                         86        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.IO.Encoding                       85        0    0.0    0.0    0.0    0.0       0        3376
 CAF                       GHC.Show                              84        0    0.0    0.0    0.0    0.0       0           0
 CAF                       Data.Maybe                            83        0    0.0    0.0    0.0    0.0       0           0
 CAF                       System.Posix.Internals                82        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.Conc.Sync                         81        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.IO.FD                             80        0    0.0    0.0    0.0    0.0       0          32
 CAF                       GHC.IO.Encoding.Iconv                 79        0    0.0    0.0    0.0    0.0       0         248
 CAF                       GHC.Int                               78        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.Event.Control                     77        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.Event.EPoll                       76        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.Event.Internal                    75        0    0.0    0.0    0.0    0.0       0          32
 CAF                       Data.Typeable.Internal                74        0    0.0    0.0    0.0    0.0       0           0
 CAF                       Foreign.Marshal.Alloc                 73        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.Event.Clock                       72        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.Event.Manager                     71        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.Enum                              70        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.IO.Encoding.UTF16                 69        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.IO.Encoding.UTF32                 68        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.IO.Encoding.UTF8                  67        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.IO.Handle.Types                   66        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.Event.PSQ                         65        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.Float                             64        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.Real                              63        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.IO.Encoding.Failure               62        0    0.0    0.0    0.0    0.0       0           0
 CAF                       GHC.Integer.Logarithms.Internals      61        0    0.0    0.0    0.0    0.0       0         320
 CAF                       GHC.Integer.Type                      60        0    0.0    0.0    0.0    0.0       0           0
 SYSTEM                    SYSTEM                                59        0    0.2    0.0    0.2    0.0      11       16352
 GC                        GC                                    58        0   52.4    0.0   52.4    0.0    3001           0
 OVERHEAD_of               PROFILING                             57        0    0.3    0.0    0.3    0.0      16           0
 DONT_CARE                 MAIN                                  56        0    0.0    0.0    0.0    0.0       0           0
 PINNED                    SYSTEM                                55        0    0.0    0.0    0.0    0.0       0           0
 IDLE                      IDLE                                  54        0    0.0    0.0    0.0    0.0       0           0
{-# LANGUAGE OverloadedStrings #-}
import Prelude hiding (takeWhile, take)
import Data.Attoparsec.Char8
import Control.Applicative
import qualified Data.ByteString.Char8 as BS
import System.IO hiding (hGetLine)
import Control.Monad.Loops
import Control.Monad
import qualified Data.Map as M
import Data.Maybe
import Data.Hashable
import Control.Parallel
import System.Random
import Debug.Trace
import qualified Data.Foldable (sum)
import Data.List (foldl')
import Data.IORef
import Data.HashTable as HT
import Control.DeepSeq
import Data.Int

data NTEntry = NTEntry Resource Schema Content deriving Show
data NTLink = NTLink From To deriving Show
type From = BS.ByteString
type To = BS.ByteString
type Resource = BS.ByteString
type Schema = BS.ByteString
type Content = BS.ByteString

instance NFData NTEntry where
  rnf (NTEntry r s c) = (rnf r) `seq` (rnf s) `seq` (rnf c) `seq` ()

instance NFData BS.ByteString where
  rnf bs = bs `seq` ()

type Handlemap = M.Map Int [Integer]
data TypedHandleMap a = TypedHandleMap (M.Map Int [Integer]) deriving Show

--instance Hashable BS.ByteString where
--  hash = hash . BS.unpack

type IXTable = HT.HashTable Resource [Integer]

main = do
  f <- openFile linkspath ReadMode
  indexLinks f

-- Build a table from resource name to the byte offsets of the lines mentioning it.
indexLinks :: Handle -> IO IXTable
indexLinks f = do
    table <- (HT.new (==) (fromIntegral . hash)) :: IO (IXTable)
    indexLoop f table
    return table
  where
    indexLoop ::
         Handle ->
         IXTable ->
         IO ()
    indexLoop f table = do
      ended <- hIsEOF f
      if ended
        then return ()
        else do
          v <- popLink f
          insertLink table v
          indexLoop f table
    insertLink ::
         IXTable
      -> (Maybe (Integer, Maybe NTLink))
      -> IO ()
    insertLink _ Nothing = return ()
    insertLink _ (Just (_, Nothing)) = return ()
    insertLink table (Just (pos, Just (NTLink f t))) = do
      generalIndexer table f pos
      generalIndexer table t pos

-- Prepend pos to the offset list stored for resource b.
-- NOTE: per the old base documentation, Data.HashTable's insert does not
-- remove an existing mapping for the key (HT.update does), so the older,
-- shorter lists may stay in the table.
generalIndexer ::
     IXTable
  -> Resource -> Integer
  -> IO ()
generalIndexer index b pos = do
  l <- HT.lookup index b
  case l of
    Nothing -> insert index b [pos]
    Just ls -> insert index b (pos : ls)

linkspath = "page_links_en.nt_"
abstractspath = "short_abstracts_en.nt"

-- Read one line; return its start offset and the parsed entry, if any.
popAbstract :: Handle -> IO (Maybe (Integer, Maybe NTEntry))
popAbstract h = do
  end <- hIsEOF h
  if end
    then return Nothing
    else do
      lineStart <- hTell h
      line <- BS.hGetLine h
      let parsed = force (parseOnly abstractParser line) in
        parsed `par` case parsed of
          Left error -> do
            hPutStrLn stderr $ show error
            return $ Just (lineStart, Nothing)
          Right v ->
            return $ Just (lineStart, v)

-- Read one line; return its start offset and the parsed link, if any.
popLink :: Handle -> IO (Maybe (Integer, Maybe NTLink))
popLink h = do
  end <- hIsEOF h
  if end
    then return Nothing
    else do
      lineStart <- hTell h
      line <- BS.hGetLine h
      -- NOTE: parsed is only sparked; the case below runs the parser again
      -- instead of reusing it (unlike popAbstract, which scrutinises parsed).
      let parsed = parseOnly linkLineParser line in
        parsed `par`
          case (parseOnly linkLineParser line) of
            Left error -> do
              hPutStrLn stderr $ show error
              return $ Just (lineStart, Nothing)
            Right v ->
              return $ Just (lineStart, v)

lazy_untilM :: Monad m => m a -> m Bool -> m [a]
lazy_untilM action test = do
  a <- action
  t <- test
  if t
    then return [a]
    else do
      r <- lazy_untilM action test
      return $ a:r

linkLineParser :: Parser (Maybe NTLink)
linkLineParser =
      (comment >> return Nothing)
  <|> linkLine

linkLine = do
  char '<'
  from <- resourceName
  char '>'
  skipWhile isSpace
  char '<'
  link
  char '>'
  skipWhile isSpace
  char '<'
  to <- resourceName
  char '>'
  skipWhile ((/=) '\n')
  return . Just $ NTLink from to

resourceName = do
  string "http://dbpedia.org/resource/"
  takeWhile (/= '>')

abstractParser :: Parser (Maybe NTEntry)
abstractParser =
      (comment >> return Nothing)
  <|> entry

comment :: Parser ()
comment = do
  skipWhile isSpace
  char '#'
  skipWhile ((/=) '\n')

entry :: Parser (Maybe NTEntry)
entry = do
  skipWhile isSpace
  char '<'
  resource <- resourceName
  char '>'
  skipWhile isSpace
  char '<'
  schema <- link
  char '>'
  skipWhile isSpace
  char '"'
  content <- quotedString
  char '"'
  char '@'
  lang <- language
  skipWhile ((/=) '\n')
  case lang of
    "en" -> return $ Just $ NTEntry resource schema content
    _ -> return Nothing

link = takeWhile ((/=) '>')
quotedString = takeWhile ((/=) '"')
language = take 2
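
A possible alternative to the `HT.lookup`/`insert` pair in `generalIndexer`, sketched with `Data.Map.Strict` from containers instead of `Data.HashTable`. This is an illustration under assumptions, not the method used above: the names `Index`, `addLink` and `buildIndex` are hypothetical, offsets are `Int` rather than `Integer`, and the input is a list of already-parsed `(offset, from, to)` triples rather than a `Handle`. `insertWith` replaces the value stored under a key, so each resource keeps exactly one offset list. Whether this changes the GC share in the profile has not been measured.

```haskell
import qualified Data.ByteString.Char8 as BS
import qualified Data.Map.Strict as M
import Data.List (foldl')

-- Hypothetical names, for illustration only.
type Resource = BS.ByteString
type Index = M.Map Resource [Int]

-- Record that the line starting at byte offset pos mentions both resources.
-- insertWith f key new computes f new old, so pos is prepended to the old list.
addLink :: Index -> (Int, Resource, Resource) -> Index
addLink ix (pos, from, to) = bump to (bump from ix)
  where
    bump key = M.insertWith (++) key [pos]

-- Strict left fold, so the map is built step by step instead of as a thunk chain.
buildIndex :: [(Int, Resource, Resource)] -> Index
buildIndex = foldl' addLink M.empty
```

For example, `buildIndex [(0, "A", "B"), (10, "A", "C")]` (with `OverloadedStrings`) maps `"A"` to `[10,0]`, with the most recent offset first, as in `generalIndexer`.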
# started 2013-08-04T11:34:31Z
<http://dbpedia.org/resource/AccessibleComputing> <http://dbpedia.org/ontology/wikiPageWikiLink> <http://dbpedia.org/resource/Computer_accessibility> .
<http://dbpedia.org/resource/AfghanistanGeography> <http://dbpedia.org/ontology/wikiPageWikiLink> <http://dbpedia.org/resource/Geography_of_Afghanistan> .
<http://dbpedia.org/resource/AfghanistanHistory> <http://dbpedia.org/ontology/wikiPageWikiLink> <http://dbpedia.org/resource/History_of_Afghanistan> .
<http://dbpedia.org/resource/AfghanistanPeople> <http://dbpedia.org/ontology/wikiPageWikiLink> <http://dbpedia.org/resource/Demography_of_Afghanistan> .
<http://dbpedia.org/resource/AfghanistanCommunications> <http://dbpedia.org/ontology/wikiPageWikiLink> <http://dbpedia.org/resource/Communications_in_Afghanistan> .
<http://dbpedia.org/resource/AfghanistanMilitary> <http://dbpedia.org/ontology/wikiPageWikiLink> <http://dbpedia.org/resource/Military_of_Afghanistan> .
<http://dbpedia.org/resource/AfghanistanTransportations> <http://dbpedia.org/ontology/wikiPageWikiLink> <http://dbpedia.org/resource/Transport_in_Afghanistan> .
<http://dbpedia.org/resource/AfghanistanTransnationalIssues> <http://dbpedia.org/ontology/wikiPageWikiLink> <http://dbpedia.org/resource/Foreign_relations_of_Afghanistan> .
<http://dbpedia.org/resource/AmoeboidTaxa> <http://dbpedia.org/ontology/wikiPageWikiLink> <http://dbpedia.org/resource/Amoeboid> .
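
The line format above can also be split with plain `Data.ByteString.Char8` functions, without a parser library. A minimal sketch, assuming every link line has the shape `<resource> <predicate> <resource> .` as in the sample. The names `resourcePrefix`, `takeResource`, `skipTerm` and `parseLinkLine` are hypothetical, and `BS.stripPrefix` needs bytestring 0.10.8 or later. It is not a full N-Triples parser.

```haskell
import qualified Data.ByteString.Char8 as BS

-- Prefix taken from the sample lines, including the opening bracket.
resourcePrefix :: BS.ByteString
resourcePrefix = BS.pack "<http://dbpedia.org/resource/"

-- Split "<prefix NAME>rest" into (NAME, rest).
-- Nothing if the prefix or the closing bracket is missing.
takeResource :: BS.ByteString -> Maybe (BS.ByteString, BS.ByteString)
takeResource s = do
  body <- BS.stripPrefix resourcePrefix (BS.dropWhile (== ' ') s)
  let (name, rest) = BS.break (== '>') body
  if BS.null rest then Nothing else Just (name, BS.drop 1 rest)

-- Skip one bracketed term (here, the predicate) and return what follows it.
skipTerm :: BS.ByteString -> Maybe BS.ByteString
skipTerm s = case BS.uncons (BS.dropWhile (== ' ') s) of
  Just ('<', body) ->
    let rest = BS.dropWhile (/= '>') body
    in if BS.null rest then Nothing else Just (BS.drop 1 rest)
  _ -> Nothing

-- Extract (from, to) from one link line; comment lines yield Nothing.
parseLinkLine :: BS.ByteString -> Maybe (BS.ByteString, BS.ByteString)
parseLinkLine line = do
  (from, afterFrom) <- takeResource line
  afterPredicate <- skipTerm afterFrom
  (to, _) <- takeResource afterPredicate
  return (from, to)
```

Applied to the first data line of the sample, `parseLinkLine` gives `Just ("AccessibleComputing","Computer_accessibility")`. The `# started` line gives `Nothing`.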