-
-
Save copumpkin/6647111 to your computer and use it in GitHub Desktop.
{-# LANGUAGE RankNTypes, BangPatterns #-} | |
module Mutable where | |
import Control.Monad | |
import Control.Monad.ST | |
import Data.STRef | |
import Data.Vector.Mutable (MVector) | |
import qualified Data.Vector.Mutable as MV | |
-- A read-only view into an MVector or similar | |
data Chunk s a = Chunk { size :: Int, get :: Int -> ST s a } | |
data Iteratee i o | |
= Done o | |
| Moar (forall s. Chunk s i -> ST s (Iteratee i o)) | |
foo :: Int -> Iteratee Int Double | |
foo n = Moar (f n 0) | |
where | |
f :: Int -> Int -> Chunk s Int -> ST s (Iteratee Int Double) | |
f 0 v _ = return $ Done (fromIntegral v) | |
f c v chunk = do | |
let go n !c | n < 0 = return c | |
go n !c = do | |
x <- get chunk n | |
go (n - 1) (c + x) | |
x <- go (size chunk - 1) 0 | |
return $ Moar (f (c - 1) (v + x)) | |
bufSize = 16 | |
run :: Iteratee i o -> [i] -> o | |
run i xs = runST $ do | |
ref <- newSTRef xs | |
buf <- MV.new bufSize | |
let chunk = Chunk bufSize (MV.read buf) | |
go (Done x) = return x | |
go (Moar f) = do | |
cur <- readSTRef ref | |
modifySTRef ref (drop bufSize) | |
forM_ (zip [0..bufSize - 1] cur) (\(i, x) -> MV.write buf i x) | |
next <- f chunk | |
go next | |
go i |
The only thing you can do to an MVector
outside of ST
is ask for its length, since that never changes. You could imagine a fancy implementation that gives you chunks of different sizes, so you'd be able to retain pointers to them but not be able to do much else. In particular, no assumptions would be violated. If during your chunk the length was x
, you could hang onto the chunk in an existential and use it way later, and the length would still be x
. Other chunk processors might see different MVector
s, though.
Also to clarify for any readers: the goal here isn't to have effects in ST
; it's to maintain the same old pure Iteratee
model people are used to (substitute for pipes or machines or whatever) while allowing the runner to safely use mutable chunks.
The MVector
in the iteratee itself is merely a marker. I would only expect the producer to ever write to it, and if I were less lazy I'd write a read-only state-annotated vector for something like this.
Very nice! You could enter ST
again to manipulate the existential vector, but you wouldn't be able to run the resulting ST
action.
I may steal this for scalaz-stream. :) I had been playing with idea of recycling buffers, but I hadn't figured out way to make it safe.
Yeah, exactly. runST
is useless if you have an ST
action for a particular state type. I'm trying to think now whether something like stream fusion could benefit from the idea. Feel free to do whatever you want with it! Chances are you'll do more than I will :P
I worked out another scheme that might be nicer for this - basic idea is to have the driver maintain a weak reference to the Vector
it returns. When it goes to read the next value, it checks whether the weak reference points to anything. If not, it can mutate the underlying buffer. If the weak reference is not null, that (conservatively) implies that something is still holding on to the vector, which is fine, it can create a fresh buffer. It's 'conservative', in the sense that the weak reference might not get nulled out in time, but that is perfectly okay here since it's just an optimization. I also suspect with generational GC that it would end up working quite well in practice.
I kind of prefer this to hitting it with the ST
sledgehammer. Using ST
totally restricts aliasing / sharing. With this approach, you are free to alias / share the vector you get back, it's just that if you keep it around, the driver will be forced to make a copy.
If I have
data EMVector i = forall s . EMVector (MVector s i)
, and return that aso
, is there literally nothing I can do with the existentialMVector s i
? If I can get any information at all out of that existentialMVector
, then that would be unsafe...