shmookey · January 18, 2020 19:24
diff --git a/translate.lhs b/translate.lhs
 External storage on the EVM: a static recompilation approach        7 May 2017
 Luke Williams <[email protected]>                                   Rev. 3


 This document describes a way of converting ordinary compiled EVM contracts to
 a form suitable for use with an off-chain storage backend, such as the system
 described by Smolenski for storing contract state on IPFS. This technique is
 completely invisible to the contract developer and requires no modification to
 contract logic. It is also generic to any EVM contract, regardless of the high
 level language it was written in, and compatible with the public blockchain.

 This is a self-implementing specification written in literate Haskell. It may
 be compiled and run unmodified with any recent version of Haskell to yield a
 command-line tool for converting contracts.

 I hereby release this document under an MIT-style license. It is free for all
 commercial, non-commercial, quasi-commercial and infomercial use - but please
 don't sue me if you use this and lose all your money. I don't have any either.

 ------------------------------------------------------------------------------

  BACKGROUND

 Smolenski [1] describes an integrated approach to off-blockchain data storage
 for EVM smart contracts, targeting an IPFS storage backend. A wide variety of
 use cases are supported by a range of features, including private data storage
 with encryption, access control and revocation, user authentication and more.
 A core idea of the proposal is the notion of contracts as state transformers -
 rather than modeling contracts as objects with a hidden internal structure, we
 can regard them as pure code, as functions of a state value. This is certainly
 a useful way of looking at contracts, if we wish to work with data stored off-
 chain.

 In Smolenski's proposal this view is given form at the method level - stateful
 methods are written with an extra argument and return value for the state. I
 argue that there are many advantages and few disadvantages if instead the view
 of contracts as state transformers is applied at the contract level - outside
 of Solidity, in a hidden layer of “plumbing” beneath the high-level logic that
 is expressed by the contract's source code - essentially, abstracting from the
 idea of contract methods as state transformer functions to view the contract
 itself as a state monad. This allows us to unify our view of internal/external
 storage into a single model, where the options are distinguished only by the
 compile-time choice of which monad's “plumbing” to use. Then if we know what
 each option looks like at the EVM opcode level, we can change our minds later:
 we can “re-compile” the contract by replacing all the plumbing, reliably and
 automatically, without access to the source code.

 The advantages to this approach are not limited to a unity of abstractions: we
 can surely save much developer time and effort by avoiding the need to rewrite
 parts of contracts, as well as cumulative experience in writing contracts the
 traditional way. In a strategic sense, more developers are likely to try using
 it if the barriers to entry are low. Finally there is a practical benefit too:
 explicitly threading around a state object ties up at least one “reachable”
 stack slot (sometimes more) anywhere that state is available, further reducing
 the already constraining limit of 16 arguments + function locals. The hidden
 state approach does not affect a contract's use of the stack.

 We should be careful to consider potential risks. Hiding the technical details
 is counterproductive (and frustrating) if those details turn out to have high-
 level consequences, which may not be clear to the developer. In this case, the
 only such detail I have been able to identify is that the gas costs of storage
 operations may be difficult to predict - but almost certain to be smaller than
 the equivalent native storage operation, usually much smaller. A more detailed
 comparison will require more information about the proposed implementation.

 [1] Permissioned Blocks White Paper, Michael Smolenski, 2017
 https://github.com/autocontracts/permissioned-blocks/blob/master/whitepaper.md

  AIM

 Our goal is to transform an ordinary contract into one that takes an external
 state structure as a hidden extra parameter and returns an updated copy of the
 structure in a similar way, without any expensive EVM storage operations.

 The resulting contract should be ABI-compatible with Smolenski's proposal.

 This process should work for all existing contracts without modification and
 be suitable for implementation within a smart contract. A Solidity version is
 certain to be tedious and verbose; for this demonstration we will construct a
 simple but powerful static recompiler in Haskell, reading compiled contracts
 and operating on the EVM bytecode level to produce a new binary, suitable for
 immediate deployment to the blockchain.


  DESIGN

 EVM contract storage is essentially a key-value store allowing arbitrary 256-
 bit keys and values. Solidity addresses storage using two conventions: scalar
 values are packed as tightly as possible (without changing their ordering) in
 sequence starting from slot 0. For lists and mappings, a key prefix defined by
 the value's position in the list of members is concatenated with a key and
 hashed in order to determine the storage slot to use. Implementing a key-value
 mapping in memory is left as a future exercise. For now, we will only consider
 member variables with scalar types.

 When a contract reads or writes storage data, it operates on one full storage
 slot at a time using the EVM opcodes `SLOAD` and `SSTORE`. When these opcodes
 are encountered, the value at the top of the stack is taken as the referenced
 storage key. In the case of `SSTORE`, the next value on the stack is used for
 the value and the operation leaves the stack with 2 fewer items. For `SLOAD`,
 the value located at the given storage key is pushed onto the stack, leaving
 the total number of items unchanged.

 Contracts may also access cheaper volatile working memory using the operations
 `MLOAD` and `MSTORE`, which are analogous to the storage opcodes we just saw,
 except that they access a byte-addressed and contiguous range of memory. The
 stack arguments consumed by these operations mirror those for storage, but we
 clearly will need to do some kind of address translation to map between byte-
 and word-addressed layouts, and also to avoid clashes with the contract's own
 use of working memory. Solidity implements a (very) simple memory management
 scheme which may be used to achieve a seamless integration in the general case
 but for now we will simply choose a high starting offset to avoid overlapping
 regions.

 Before we dive into writing our translation functions, we need to enable some
 handy language extensions and import a few things from the core libraries. Our
 code will otherwise be entirely self-contained. A basic installation of GHC or
 Haskell Platform is all you need.

 > {-# LANGUAGE ViewPatterns #-}
 > import Control.Monad (forM_)
 > import Data.Word (Word8)
 > import Data.Bits (Bits, shiftL, shiftR, (.|.))
 > import System.Environment (getArgs)
 > import qualified Data.ByteString as B

 To turn a storage key into a memory address, we simply multiply by 32 (bytes,
 for a 256 bit key size) and add the offset. 64k ought to be enough for anyone,
 so we shall define our storage region to begin on the next byte after the 16-
 bit range ends:

 > offset :: Integer
 > offset = 0x010000      -- 2^16 = 65536 (3 bytes)

 Given a replacement instruction, i.e. `MSTORE` or `MLOAD`, we can generate the
 translated code:

 > relocate :: Op -> [Op]
 > relocate x =
 >   [ Push1 0x20         -- 32 byte slots (1 byte)
 >   , Mul
 >   , Push3 offset
 >   , Add
 >   , x
 >   ]

 Note that the length of this operation is 9 bytes: 5 opcodes (1 byte each) and
 2 constants of 1 byte and 3 bytes repsectively. We will replace each instance
 of `SSTORE` and `SLOAD` with the output of this function called with `MSTORE`
 and `MLOAD` respectively.

 Execution on the EVM may halt for a number of reasons, either gracefully or in
 case of error. In the latter case the entire transaction is “rolled back”, so
 we will only concern ourselves with graceful termination. There are two ways
 that a contract may terminate gracefully, those are by the `STOP` and `RETURN`
 instructions. Both have the effect of halting execution, but `RETURN` copies
 a region of memory and “returns” it to the caller.

 We need to return the updated state data along with any data returned by the
 contract, which means we must replace all `STOP`s with `RETURN`s, and insert
 new instructions to place the storage memory region at the end of the returned
 data. At this point we will make another simplifying assumption and constrain
 the storage size to a rather small 1kb, for the sole purpose of avoiding loops
 in our demonstration code:

 > storageSize :: Integer
 > storageSize = 0x400     -- 2^10 = 1024 (2 bytes)

 We could trivially use a larger value, but there is little point calculating
 an accurate one. A fully realized implementation will already be maintaining
 an exact value for the mapping structure which can be used instead.

 When there is no return data, i.e. we encounter a `STOP`, we simply push the
 offset and size of the storage memory region to the stack, then `RETURN`:

 > returnState :: [Op]
 > returnState =
 >   [ Push2 storageSize -- Push storage region length to stack
 >   , Push3 offset      -- Push storage region offset to stack
 >   , Return
 >   ]

 The length of this operation is 8 bytes. This code will replace the `STOP`
 instruction.

 When there is other return data, we will copy the storage region to the end of
 the return data region and return with the length value increased by the size
 of the storage data, i.e. 1kb. We could do it the other way around, but there
 is no EVM instruction to copy a range of data, and it will be simpler for our
 purposes to be working with a whole number of 32-byte words than working with
 individual bytes. Knowing the first and second stack values contain the offset
 and length of the main return data respectively, we move each word back. Our
 function will append this code to the end of a contract:

 > returnData :: [Op] -> [Op]
 > returnData contract =
 >   contract ++ prepare ++ concatMap copy [0..31] ++ complete
 >   where
 >     prepare =
 >       [ JumpLabel 0                 -- Allow jumping to this code by label
 >       , Dup1                        -- Duplicate first stack value (offset)
 >       , Dup3                        -- Duplicate third stack value (length)
 >       , Add                         -- Add to get offset for storage data
 >       ]
 >     copy x =
 >       [ Push3 (x*32 + storageSize)  -- Pointer to requested word
 >       , MLoad                       -- Load the word onto the stack
 >       , Dup1                        -- Duplicate the current offset
 >       , MStore                      -- Store in new position
 >       , Push1 0x20                  -- Amount to increment offset
 >       , Add                         -- Increment current offset
 >       ]
 >     complete =
 >       [ Pop                         -- Discard final offset
 >       , Swap1                       -- Swap offset and length on stack
 >       , Push2 storageSize           -- Push storage size to stack
 >       , Add                         -- Add storage size to return size
 >       , Swap1                       -- Put return args back in order
 >       , Return                      -- Finally, we can now return
 >       ]

 The length of this operation is 332 bytes. This is too long to insert anywhere
 we find a `RETURN`, so instead we will append this code as a subroutine at the
 end of the contract. We can then replace existing `RETURN`s by jumping to the
 new routine using a special meta-instruction `PushLabel`, which is translated
 to the appropriate `PUSH` opcode and constant during the final assembly stage:

 > returnJump :: [Op]
 > returnJump =
 >   [ PushLabel 0
 >   , Jump
 >   ]

 When an EVM contract is called, the message data must be read from a separate
 address space. The EVM provides opcodes for loading a single word to the stack
 (`CALLDATALOAD`) and copying a range into memory (`CALLDATACOPY`). We will use
 the latter to copy the storage data at the end of the call data to the storage
 region in memory, using `CALLDATASIZE` to determine the length. Our function
 will prepend this code to the beginning of a contract:

 > initStorage :: [Op] -> [Op]
 > initStorage contract =
 >   [ Push2 storageSize
 >   , CallDataSize
 >   , Push2 storageSize
 >   , Sub
 >   , Push3 offset
 >   , CallDataCopy
 >   ] ++ contract

 The length of this operation is 13 bytes. We prepend this code to the contract
 to ensure that the storage memory region is populated throughout the contract
 execution.

 With all of our translations now defined, we can write our opcode translation
 function:

 > translate :: Op -> [Op]
 > translate x =
 >   case x of SLoad  -> relocate MLoad
 >             SStore -> relocate MStore
 >             Stop   -> returnState
 >             Return -> returnJump
 >             _      -> [x]

 This function takes an opcode and returns a list of opcodes to replace it. The
 replacement list for a non-storage opcode is simply the singleton list of that
 opcode.

 Inserting new code presents us with a problem. Control flow on the EVM is by
 the use of 2 jump opcodes: `JUMP` and `JUMPI`, where the latter is conditional
 on the second stack value being nonzero, and both use the first stack value as
 the target bytecode position - and these targets may no longer be valid.

 Addressing this issue, or rather re-addressing it, will be a little trickier.
 Jump targets may not always be constants pushed onto the stack in the opcode
 immediately prior, in fact it is quite common for these values to be kept on
 the stack for later use. It is however almost universal for them to originate
 as constants in `PUSH` opcodes, and we can determine which constants likely
 correspond to jump targets by correlating them against the positions of every
 `JUMPDEST` opcode, which all jump operations must land on - at the risk of
 mistakenly identifying ordinary constants as jump labels. There are several
 approaches to re-addressing which avoid these mistakes. In general the only
 perfect solution is a runtime lookup table, in practice compile-time solutions
 can be just as viable with no runtime overhead. (I got flawless results from
 a dynamic dependency analysis approach, but that would have made this write-up
 an order of magnitude longer.)

 After reading a contract but before doing any translation of it we locate all
 `JUMPDEST` opcodes, convert them to position-tagged `JumpLabel` meta-opcodes,
 then replace any constants that occur in the set of “labels” to the equivalent
 `PushLabel` meta-opcode. This effectively gives us position-independent jumps,
 enabling us to safely insert new opcodes. The process entails two passses. The
 first pass creates the labels and collects them into a list:

 > label :: [(Integer, Op)] -> ([Integer], [Op])
 > label =
 >   foldr f ([], [])
 >   where
 >     f (k, op) (ks, ops) = case op of
 >       JumpDest -> (k : ks, JumpLabel k : ops)
 >       _        -> (ks    , op          : ops)

 The second pass takes the output of the first and translates `PUSH` opcodes
 with constants that occur in the label list into `PushLabel` meta-opcodes.

 > labelPush :: ([Integer], [Op]) -> [Op]
 > labelPush (ks, ops) =
 >   map f ops
 >   where
 >     f op = case op of
 >       (push -> P k _) | k `elem` ks -> PushLabel k
 >       _                             -> op

 In order to reassemble the contract back into EVM bytecode, we need to perform
 the same process in reverse, replacing label meta-opcodes with real opcodes
 based on their new positions. This is also a two pass process. First we remove
 the `JumpLabel`s, whose actual positions will now be different to their labels
 and build a list of mappings. Note that we must keep track of the current byte
 position by the length of the the opcodes.

 > unlabel :: [Op] -> ([(Integer, Integer)], [Op])
 > unlabel =
 >   dropFst . foldr f (0, [], [])
 >   where
 >     dropFst (_, ks, ops) = (ks, ops)
 >     f op (k, ks, ops) = case op of
 >       JumpLabel k'    -> (k+1,   (k',k):ks, JumpDest : ops)
 >       PushLabel k'    -> (k+3,   ks,        op       : ops)
 >       (push -> P x n) -> (k+n+1, ks,        op       : ops)
 >       _               -> (k+1,   ks,        op       : ops)

 The second pass takes the output of the first and translates `PushLabel` meta-
 opcodes to `PUSH2` opcodes with the mapped offset. This is the first time that
 we introduce the possibility of error into out program, since the mapping may
 fail if we have made a mistake somewhere. We won't handle that possibility for
 this demonstration, but we will at least acknowledge it with an error message
 and abort.

 > unlabelPush :: ([(Integer, Integer)], [Op]) -> [Op]
 > unlabelPush (kvs, ops) =
 >   map f ops
 >   where
 >     f op = case op of
 >       PushLabel k -> Push2 (remap k)
 >       _           -> op
 >     remap k = case lookup k kvs of
 >       Just k' -> k'
 >       Nothing -> error $ "no such label: " ++ show k

 We now have all the machinery we need to translate contracts, and can finally
 connect them all up in a neat pipeline. This function will take the list of
 position-tagged opcodes from a disassembler function, and return a final list
 of opcodes to provide the assembler.

 > convert :: [(Integer, Op)] -> [Op]
 > convert =
 >     unlabelPush
 >   . unlabel
 >   . returnData
 >   . initStorage
 >   . concatMap translate
 >   . labelPush
 >   . label

 That's basically it. All that remains is the main IO to load the contract and
 save the translated version. A basic assembler/disassembler is provided in the
 appendix in order to make this document a self-contained program, but we need
 not concern ourselves with the details. Tying it all together, our program
 will accept any number of input files as command line arguments, and for each
 input write the translated output to a new file with the extension `.out`:

 > main :: IO ()
 > main = do
 >   inputs <- tail <$> getArgs
 >   forM_ inputs $ \file -> do
 >     contract <- disassemble <$> B.readFile file
 >     let file' = file ++ ".out"
 >     B.writeFile file' . assemble $ convert contract

 The translated contracts are ABI-compatible with Smolenski's proposal, in that
 every method effectively has an extra argument and return value for the state
 data, but this demonstration does not include a mechanism for automatically
 updating the `solc`-generated ABI file. For now, these changes will have to be
 added manually.

 One final caveat is that contracts compiled by `solc` begin with a preamble or
 “installer” section which initialises the contract and performs the vital task
 of returning the actual contract bytecode, which is then stored with the newly
 created account. As a simple workaround, the `--bin-runtime` option to `solc`
 will cause it to emit just the part that should be translated, which can then
 be spliced back in to the full bytecode. Note that therefore the initial state
 of the contract will be empty; contracts that require initialisation should do
 so by exposing an “initialise” method in their public API.


  CONCLUSION

 We have built a static recompiler for converting smart contracts which use the
 native EVM storage facility into contracts which transform state values passed
 via transaction inputs and outputs. It is a general solution applicable to all
 EVM contracts, though the version we have built in this document is subject to
 several limitations which are described in the DESIGN section with a suggested
 solution in each case - these are “engineering problems”, in the sense that an
 appropriate implementation is known.

 There is one feature absent from this demonstration which is an important part
 of Smolenski's proposal, namely support for endorser-oracle state validation.
 This involves the contract using its native EVM storage to store a hash of the
 last known valid state, which can only be updated if the new hash is endorsed
 by the oracle by signing it. There are two ways we could approach this: either
 as a completely transparent process (a “hidden method”), or by a mechanism to
 allow contracts to validate their own “state update” requests, where the lower
 level code provides a way for the high level code to store the new hash. This
 question is left as a topic for future discussion.

 The high and ever increasing prices of native EVM storage for smart contracts
 is an issue which is only likely to worsen over time as Ethereum's popularity
 grows along with the volume of contract data that peers are required to store
 on their machines. On the other hand, a more popular network is a more useful
 one, and if contract developers are able to take advantage of cheaper storage
 while still leveraging the reach of the Ethereum blockchain, many will likely
 opt to do so. Interest in privacy and access control is also likely to become
 stronger as distributed applications become more sophisticated. This proposal
 is a meant as a small contribution to that conversation - in the hope that it
 might help us to answer: what on earth we are going to do about this problem?


 -----------------------------------------------------------------------------


                  APPENDIX A - ASSEMBLER AND DISASSEMBLER

 These are the basic functions for disassembly and assembly. They will cause an
 error if the operation fails.

 > disassemble :: B.ByteString -> [(Integer, Op)]
 > disassemble bs = f 0
 >   where f i | i >= B.length bs = []
 >             | otherwise        = let (n, x) = getOp i bs
 >                                  in (toInteger i, x) : f (n+i)

 > assemble :: [Op] -> B.ByteString
 > assemble = mconcat . map toSym

 > getOp :: Int -> B.ByteString -> (Int, Op)
 > getOp p bs =
 >   if p >= 0 && p < B.length bs
 >   then case sym $ B.index bs p of
 >     Right op     -> (1,   op)
 >     Left (n, op) -> (n+1, op $ getData (p+1) n bs)
 >   else
 >     error $ "out of bounds " ++ show p

 > getData :: Int -> Int -> B.ByteString -> Integer
 > getData p n bs =
 >   if p >= 0 && p + n <= B.length bs
 >   then roll . B.take n $ B.drop p bs
 >   else error "unexpected end of input"



                      APPENDIX B - PUSH VIEW PATTERN

 This code defines the `pushN` view pattern.

 > data PushN = P Integer Integer | NotPush
 > push :: Op -> PushN
 > push op = case op of
 >   Push1  x -> P x 1  ; Push17 x -> P x 17
 >   Push2  x -> P x 2  ; Push18 x -> P x 18
 >   Push3  x -> P x 3  ; Push19 x -> P x 19
 >   Push4  x -> P x 4  ; Push20 x -> P x 20
 >   Push5  x -> P x 5  ; Push21 x -> P x 21
 >   Push6  x -> P x 6  ; Push22 x -> P x 22
 >   Push7  x -> P x 7  ; Push23 x -> P x 23
 >   Push8  x -> P x 8  ; Push24 x -> P x 24
 >   Push9  x -> P x 9  ; Push25 x -> P x 25
 >   Push10 x -> P x 10 ; Push26 x -> P x 26
 >   Push11 x -> P x 11 ; Push27 x -> P x 27
 >   Push12 x -> P x 12 ; Push28 x -> P x 28
 >   Push13 x -> P x 13 ; Push29 x -> P x 29
 >   Push14 x -> P x 14 ; Push30 x -> P x 30
 >   Push15 x -> P x 15 ; Push31 x -> P x 31
 >   Push16 x -> P x 16 ; Push32 x -> P x 32
 >   _ -> NotPush



                         APPENDIX C - SYMBOL TABLES

 This code declares the `Op` type and bytecode conversions.

 > type I = Integer
 > data Op
 >   = Stop       | Lt       | Pop      | BlockHash    | Address
 >   | Add        | Gt       | MLoad    | Coinbase     | Balance
 >   | Mul        | SLT      | MStore   | Timestamp    | Origin
 >   | Sub        | SGT      | MStore8  | Number       | Caller
 >   | Div        | Eq       | SLoad    | Difficulty   | CallValue
 >   | SDiv       | IsZero   | SStore   | GasLimit     | CallDataLoad
 >   | Mod        | And      | Jump     | Create       | CallDataSize
 >   | SMod       | Or       | JumpI    | Call         | CallDataCopy
 >   | AddMod     | Xor      | PC       | CallCode     | CodeSize
 >   | MulMod     | Not      | MSize    | Return       | CodeCopy
 >   | Exp        | Byte     | Gas      | DelegateCall | GasPrice
 >   | SignExtend | SHA3     | JumpDest | Suicide      | ExtCodeSize
 >   | Push1  I   | Push17 I | Dup1     | Swap1        | ExtCodeCopy
 >   | Push2  I   | Push18 I | Dup2     | Swap2        | Log0
 >   | Push3  I   | Push19 I | Dup3     | Swap3        | Log1
 >   | Push4  I   | Push20 I | Dup4     | Swap4        | Log2
 >   | Push5  I   | Push21 I | Dup5     | Swap5        | Log3
 >   | Push6  I   | Push22 I | Dup6     | Swap6        | Log4
 >   | Push7  I   | Push23 I | Dup7     | Swap7        -- Meta:
 >   | Push8  I   | Push24 I | Dup8     | Swap8        | Invalid Word8
 >   | Push9  I   | Push25 I | Dup9     | Swap9        | JumpLabel I
 >   | Push10 I   | Push26 I | Dup10    | Swap10       | PushLabel I
 >   | Push11 I   | Push27 I | Dup11    | Swap11
 >   | Push12 I   | Push28 I | Dup12    | Swap12
 >   | Push13 I   | Push29 I | Dup13    | Swap13
 >   | Push14 I   | Push30 I | Dup14    | Swap14
 >   | Push15 I   | Push31 I | Dup15    | Swap15
 >   | Push16 I   | Push32 I | Dup16    | Swap16

 > sym :: Word8 -> Either (Int, Integer -> Op) Op
 > sym x = case x of
 >   0x00 -> r Stop         ; 0x33 -> r Caller       ; 0x5b -> r JumpDest
 >   0x01 -> r Add          ; 0x34 -> r CallValue    ; 0xa0 -> r Log0
 >   0x02 -> r Mul          ; 0x35 -> r CallDataLoad ; 0xa1 -> r Log1
 >   0x03 -> r Sub          ; 0x36 -> r CallDataSize ; 0xa2 -> r Log2
 >   0x04 -> r Div          ; 0x37 -> r CallDataCopy ; 0xa3 -> r Log3
 >   0x05 -> r SDiv         ; 0x38 -> r CodeSize     ; 0xa4 -> r Log4
 >   0x06 -> r Mod          ; 0x39 -> r CodeCopy     ; 0xf0 -> r Create
 >   0x07 -> r SMod         ; 0x3a -> r GasPrice     ; 0xf1 -> r Call
 >   0x08 -> r AddMod       ; 0x3b -> r ExtCodeSize  ; 0xf2 -> r CallCode
 >   0x09 -> r MulMod       ; 0x3c -> r ExtCodeCopy  ; 0xf3 -> r Return
 >   0x0A -> r Exp          ; 0x40 -> r BlockHash    ; 0xf4 -> r DelegateCall
 >   0x0b -> r SignExtend   ; 0x41 -> r Coinbase     ; 0xf5 -> r Suicide
 >   0x10 -> r Lt           ; 0x42 -> r Timestamp    ; 0x90 -> r Swap1
 >   0x11 -> r Gt           ; 0x43 -> r Number       ; 0x91 -> r Swap2
 >   0x12 -> r SLT          ; 0x44 -> r Difficulty   ; 0x92 -> r Swap3
 >   0x13 -> r SGT          ; 0x45 -> r GasLimit     ; 0x93 -> r Swap4
 >   0x14 -> r Eq           ; 0x50 -> r Pop          ; 0x94 -> r Swap5
 >   0x15 -> r IsZero       ; 0x51 -> r MLoad        ; 0x95 -> r Swap6
 >   0x16 -> r And          ; 0x52 -> r MStore       ; 0x96 -> r Swap7
 >   0x17 -> r Or           ; 0x53 -> r MStore8      ; 0x97 -> r Swap8
 >   0x18 -> r Xor          ; 0x54 -> r SLoad        ; 0x98 -> r Swap9
 >   0x19 -> r Not          ; 0x55 -> r SStore       ; 0x99 -> r Swap10
 >   0x1a -> r Byte         ; 0x56 -> r Jump         ; 0x9a -> r Swap11
 >   0x20 -> r SHA3         ; 0x57 -> r JumpI        ; 0x9b -> r Swap12
 >   0x30 -> r Address      ; 0x58 -> r PC           ; 0x9c -> r Swap13
 >   0x31 -> r Balance      ; 0x59 -> r MSize        ; 0x9d -> r Swap14
 >   0x32 -> r Origin       ; 0x5a -> r Gas          ; 0x9e -> r Swap15
 >   0x60 -> l (1,  Push1)  ; 0x70 -> l (17, Push17) ; 0x80 -> r Dup1
 >   0x61 -> l (2,  Push2)  ; 0x71 -> l (18, Push18) ; 0x81 -> r Dup2
 >   0x62 -> l (3,  Push3)  ; 0x72 -> l (19, Push19) ; 0x82 -> r Dup3
 >   0x63 -> l (4,  Push4)  ; 0x73 -> l (20, Push20) ; 0x83 -> r Dup4
 >   0x64 -> l (5,  Push5)  ; 0x74 -> l (21, Push21) ; 0x84 -> r Dup5
 >   0x65 -> l (6,  Push6)  ; 0x75 -> l (22, Push22) ; 0x85 -> r Dup6
 >   0x66 -> l (7,  Push7)  ; 0x76 -> l (23, Push23) ; 0x86 -> r Dup7
 >   0x67 -> l (8,  Push8)  ; 0x77 -> l (24, Push24) ; 0x87 -> r Dup8
 >   0x68 -> l (9,  Push9)  ; 0x78 -> l (25, Push25) ; 0x88 -> r Dup9
 >   0x69 -> l (10, Push10) ; 0x79 -> l (26, Push26) ; 0x89 -> r Dup10
 >   0x6a -> l (11, Push11) ; 0x7a -> l (27, Push27) ; 0x8a -> r Dup11
 >   0x6b -> l (12, Push12) ; 0x7b -> l (28, Push28) ; 0x8b -> r Dup12
 >   0x6c -> l (13, Push13) ; 0x7c -> l (29, Push29) ; 0x8c -> r Dup13
 >   0x6d -> l (14, Push14) ; 0x7d -> l (30, Push30) ; 0x8d -> r Dup14
 >   0x6e -> l (15, Push15) ; 0x7e -> l (31, Push31) ; 0x8e -> r Dup15
 >   0x6f -> l (16, Push16) ; 0x7f -> l (32, Push32) ; 0x8f -> r Dup16
 >   0x9f -> r Swap16       ; _    -> r (Invalid x)
 >  where l = Left ; r = Right

 > toSym :: Op -> B.ByteString
 > toSym op = case op of
 >   Stop       -> b 0x00 ; Caller       -> b 0x33 ; JumpDest     -> b 0x5b
 >   Add        -> b 0x01 ; CallValue    -> b 0x34 ; Log0         -> b 0xa0
 >   Mul        -> b 0x02 ; CallDataLoad -> b 0x35 ; Log1         -> b 0xa1
 >   Sub        -> b 0x03 ; CallDataSize -> b 0x36 ; Log2         -> b 0xa2
 >   Div        -> b 0x04 ; CallDataCopy -> b 0x37 ; Log3         -> b 0xa3
 >   SDiv       -> b 0x05 ; CodeSize     -> b 0x38 ; Log4         -> b 0xa4
 >   Mod        -> b 0x06 ; CodeCopy     -> b 0x39 ; Create       -> b 0xf0
 >   SMod       -> b 0x07 ; GasPrice     -> b 0x3a ; Call         -> b 0xf1
 >   AddMod     -> b 0x08 ; ExtCodeSize  -> b 0x3b ; CallCode     -> b 0xf2
 >   MulMod     -> b 0x09 ; ExtCodeCopy  -> b 0x3c ; Return       -> b 0xf3
 >   Exp        -> b 0x0A ; BlockHash    -> b 0x40 ; DelegateCall -> b 0xf4
 >   SignExtend -> b 0x0b ; Coinbase     -> b 0x41 ; Suicide      -> b 0xf5
 >   Lt         -> b 0x10 ; Timestamp    -> b 0x42 ; Swap1        -> b 0x90
 >   Gt         -> b 0x11 ; Number       -> b 0x43 ; Swap2        -> b 0x91
 >   SLT        -> b 0x12 ; Difficulty   -> b 0x44 ; Swap3        -> b 0x92
 >   SGT        -> b 0x13 ; GasLimit     -> b 0x45 ; Swap4        -> b 0x93
 >   Eq         -> b 0x14 ; Pop          -> b 0x50 ; Swap5        -> b 0x94
 >   IsZero     -> b 0x15 ; MLoad        -> b 0x51 ; Swap6        -> b 0x95
 >   And        -> b 0x16 ; MStore       -> b 0x52 ; Swap7        -> b 0x96
 >   Or         -> b 0x17 ; MStore8      -> b 0x53 ; Swap8        -> b 0x97
 >   Xor        -> b 0x18 ; SLoad        -> b 0x54 ; Swap9        -> b 0x98
 >   Not        -> b 0x19 ; SStore       -> b 0x55 ; Swap10       -> b 0x99
 >   Byte       -> b 0x1a ; Jump         -> b 0x56 ; Swap11       -> b 0x9a
 >   SHA3       -> b 0x20 ; JumpI        -> b 0x57 ; Swap12       -> b 0x9b
 >   Address    -> b 0x30 ; PC           -> b 0x58 ; Swap13       -> b 0x9c
 >   Balance    -> b 0x31 ; MSize        -> b 0x59 ; Swap14       -> b 0x9d
 >   Origin     -> b 0x32 ; Gas          -> b 0x5a ; Swap15       -> b 0x9e
 >   Dup1       -> b 0x80 ; Dup7         -> b 0x86 ; Dup13        -> b 0x8c
 >   Dup2       -> b 0x81 ; Dup8         -> b 0x87 ; Dup14        -> b 0x8d
 >   Dup3       -> b 0x82 ; Dup9         -> b 0x88 ; Dup15        -> b 0x8e
 >   Dup4       -> b 0x83 ; Dup10        -> b 0x89 ; Dup16        -> b 0x8f
 >   Dup5       -> b 0x84 ; Dup11        -> b 0x8a ; Swap16       -> b 0x9f
 >   Dup6       -> b 0x85 ; Dup12        -> b 0x8b ; Invalid x    -> unroll x
 >   (push -> P x n) -> p x (fromIntegral n)
 >  where p x n = (n - 1 + 0x60) `B.cons` word n (unroll x)
 >        b     = B.singleton



                        APPENDIX D: HELPER FUNCTIONS

 > roll :: B.ByteString -> Integer
 > roll = B.foldl' unstep 0
 >   where unstep a b = a `shiftL` 8 .|. fromIntegral b

 > unroll :: (Integral a, Bits a) => a -> B.ByteString
 > unroll = B.reverse . B.unfoldr step
 >   where
 >     step 0 = Nothing
 >     step i = Just (fromIntegral i, i `shiftR` 8)

 > word :: Integral a => a -> B.ByteString -> B.ByteString
 > word x bs =
 >   if B.length bs >= n
 >   then
 >     B.drop (max 0 $ B.length bs - n) bs
 >   else
 >     let len = B.length bs
 >         z   = B.pack $ take (n - len) (repeat 0)
 >     in z `B.append` bs
 >  where n = fromIntegral x


 ------------------------------------------------------------------------------
 End of transmission.
No results found