-
-
Save drouillard/01e10fbb42850d5a5b0cada9ea4762f7 to your computer and use it in GitHub Desktop.
off-chain storage for ethereum smart contracts
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
External storage on the EVM: a static recompilation approach 7 May 2017 | |
Luke Williams <[email protected]> Rev. 3 | |
This document describes a way of converting ordinary compiled EVM contracts to | |
a form suitable for use with an off-chain storage backend, such as the system | |
described by Smolenski for storing contract state on IPFS. This technique is | |
completely invisible to the contract developer and requires no modification to | |
contract logic. It is also generic to any EVM contract, regardless of the high | |
level language it was written in, and compatible with the public blockchain. | |
This is a self-implementing specification written in literate Haskell. It may | |
be compiled and run unmodified with any recent version of Haskell to yield a | |
command-line tool for converting contracts. | |
I hereby release this document under an MIT-style license. It is free for all | |
commercial, non-commercial, quasi-commercial and infomercial use - but please | |
don't sue me if you use this and lose all your money. I don't have any either. | |
------------------------------------------------------------------------------ | |
BACKGROUND | |
Smolenski [1] describes an integrated approach to off-blockchain data storage | |
for EVM smart contracts, targeting an IPFS storage backend. A wide variety of | |
use cases are supported by a range of features, including private data storage | |
with encryption, access control and revocation, user authentication and more. | |
A core idea of the proposal is the notion of contracts as state transformers - | |
rather than modeling contracts as objects with a hidden internal structure, we | |
can regard them as pure code, as functions of a state value. This is certainly | |
a useful way of looking at contracts, if we wish to work with data stored off- | |
chain. | |
In Smolenski's proposal this view is given form at the method level - stateful | |
methods are written with an extra argument and return value for the state. I | |
argue that there are many advantages and few disadvantages if instead the view | |
of contracts as state transformers is applied at the contract level - outside | |
of Solidity, in a hidden layer of “plumbing” beneath the high-level logic that | |
is expressed by the contract's source code - essentially, abstracting from the | |
idea of contract methods as state transformer functions to view the contract | |
itself as a state monad. This allows us to unify our view of internal/external | |
storage into a single model, where the options are distinguished only by the | |
compile-time choice of which monad's “plumbing” to use. Then if we know what | |
each option looks like at the EVM opcode level, we can change our minds later: | |
we can “re-compile” the contract by replacing all the plumbing, reliably and | |
automatically, without access to the source code. | |
The advantages to this approach are not limited to a unity of abstractions: we | |
can surely save much developer time and effort by avoiding the need to rewrite | |
parts of contracts, as well as cumulative experience in writing contracts the | |
traditional way. In a strategic sense, more developers are likely to try using | |
it if the barriers to entry are low. Finally there is a practical benefit too: | |
explicitly threading around a state object ties up at least one “reachable” | |
stack slot (sometimes more) anywhere that state is available, further reducing | |
the already constraining limit of 16 arguments + function locals. The hidden | |
state approach does not affect a contract's use of the stack. | |
We should be careful to consider potential risks. Hiding the technical details | |
is counterproductive (and frustrating) if those details turn out to have high- | |
level consequences, which may not be clear to the developer. In this case, the | |
only such detail I have been able to identify is that the gas costs of storage | |
operations may be difficult to predict - but almost certain to be smaller than | |
the equivalent native storage operation, usually much smaller. A more detailed | |
comparison will require more information about the proposed implementation. | |
[1] Permissioned Blocks White Paper, Michael Smolenski, 2017 | |
https://github.com/autocontracts/permissioned-blocks/blob/master/whitepaper.md | |
AIM | |
Our goal is to transform an ordinary contract into one that takes an external | |
state structure as a hidden extra parameter and returns an updated copy of the | |
structure in a similar way, without any expensive EVM storage operations. | |
The resulting contract should be ABI-compatible with Smolenski's proposal. | |
This process should work for all existing contracts without modification and | |
be suitable for implementation within a smart contract. A Solidity version is | |
certain to be tedious and verbose; for this demonstration we will construct a | |
simple but powerful static recompiler in Haskell, reading compiled contracts | |
and operating on the EVM bytecode level to produce a new binary, suitable for | |
immediate deployment to the blockchain. | |
DESIGN | |
EVM contract storage is essentially a key-value store allowing arbitrary 256- | |
bit keys and values. Solidity addresses storage using two conventions: scalar | |
values are packed as tightly as possible (without changing their ordering) in | |
sequence starting from slot 0. For lists and mappings, a key prefix defined by | |
the value's position in the list of members is concatenated with a key and | |
hashed in order to determine the storage slot to use. Implementing a key-value | |
mapping in memory is left as a future exercise. For now, we will only consider | |
member variables with scalar types. | |
When a contract reads or writes storage data, it operates on one full storage | |
slot at a time using the EVM opcodes `SLOAD` and `SSTORE`. When these opcodes | |
are encountered, the value at the top of the stack is taken as the referenced | |
storage key. In the case of `SSTORE`, the next value on the stack is used for | |
the value and the operation leaves the stack with 2 fewer items. For `SLOAD`, | |
the value located at the given storage key is pushed onto the stack, leaving | |
the total number of items unchanged. | |
Contracts may also access cheaper volatile working memory using the operations | |
`MLOAD` and `MSTORE`, which are analogous to the storage opcodes we just saw, | |
except that they access a byte-addressed and contiguous range of memory. The | |
stack arguments consumed by these operations mirror those for storage, but we | |
clearly will need to do some kind of address translation to map between byte- | |
and word-addressed layouts, and also to avoid clashes with the contract's own | |
use of working memory. Solidity implements a (very) simple memory management | |
scheme which may be used to achieve a seamless integration in the general case | |
but for now we will simply choose a high starting offset to avoid overlapping | |
regions. | |
Before we dive into writing our translation functions, we need to enable some | |
handy language extensions and import a few things from the core libraries. Our | |
code will otherwise be entirely self-contained. A basic installation of GHC or | |
Haskell Platform is all you need. | |
> {-# LANGUAGE ViewPatterns #-} | |
> import Control.Monad (forM_) | |
> import Data.Word (Word8) | |
> import Data.Bits (Bits, shiftL, shiftR, (.|.)) | |
> import System.Environment (getArgs) | |
> import qualified Data.ByteString as B | |
To turn a storage key into a memory address, we simply multiply by 32 (bytes, | |
for a 256 bit key size) and add the offset. 64k ought to be enough for anyone, | |
so we shall define our storage region to begin on the next byte after the 16- | |
bit range ends: | |
> offset :: Integer | |
> offset = 0x010000 -- 2^16 = 65536 (3 bytes) | |
Given a replacement instruction, i.e. `MSTORE` or `MLOAD`, we can generate the | |
translated code: | |
> relocate :: Op -> [Op] | |
> relocate x = | |
> [ Push1 0x20 -- 32 byte slots (1 byte) | |
> , Mul | |
> , Push3 offset | |
> , Add | |
> , x | |
> ] | |
Note that the length of this operation is 9 bytes: 5 opcodes (1 byte each) and | |
2 constants of 1 byte and 3 bytes repsectively. We will replace each instance | |
of `SSTORE` and `SLOAD` with the output of this function called with `MSTORE` | |
and `MLOAD` respectively. | |
Execution on the EVM may halt for a number of reasons, either gracefully or in | |
case of error. In the latter case the entire transaction is “rolled back”, so | |
we will only concern ourselves with graceful termination. There are two ways | |
that a contract may terminate gracefully, those are by the `STOP` and `RETURN` | |
instructions. Both have the effect of halting execution, but `RETURN` copies | |
a region of memory and “returns” it to the caller. | |
We need to return the updated state data along with any data returned by the | |
contract, which means we must replace all `STOP`s with `RETURN`s, and insert | |
new instructions to place the storage memory region at the end of the returned | |
data. At this point we will make another simplifying assumption and constrain | |
the storage size to a rather small 1kb, for the sole purpose of avoiding loops | |
in our demonstration code: | |
> storageSize :: Integer | |
> storageSize = 0x400 -- 2^10 = 1024 (2 bytes) | |
We could trivially use a larger value, but there is little point calculating | |
an accurate one. A fully realized implementation will already be maintaining | |
an exact value for the mapping structure which can be used instead. | |
When there is no return data, i.e. we encounter a `STOP`, we simply push the | |
offset and size of the storage memory region to the stack, then `RETURN`: | |
> returnState :: [Op] | |
> returnState = | |
> [ Push2 storageSize -- Push storage region length to stack | |
> , Push3 offset -- Push storage region offset to stack | |
> , Return | |
> ] | |
The length of this operation is 8 bytes. This code will replace the `STOP` | |
instruction. | |
When there is other return data, we will copy the storage region to the end of | |
the return data region and return with the length value increased by the size | |
of the storage data, i.e. 1kb. We could do it the other way around, but there | |
is no EVM instruction to copy a range of data, and it will be simpler for our | |
purposes to be working with a whole number of 32-byte words than working with | |
individual bytes. Knowing the first and second stack values contain the offset | |
and length of the main return data respectively, we move each word back. Our | |
function will append this code to the end of a contract: | |
> returnData :: [Op] -> [Op] | |
> returnData contract = | |
> contract ++ prepare ++ concatMap copy [0..31] ++ complete | |
> where | |
> prepare = | |
> [ JumpLabel 0 -- Allow jumping to this code by label | |
> , Dup1 -- Duplicate first stack value (offset) | |
> , Dup3 -- Duplicate third stack value (length) | |
> , Add -- Add to get offset for storage data | |
> ] | |
> copy x = | |
> [ Push3 (x*32 + storageSize) -- Pointer to requested word | |
> , MLoad -- Load the word onto the stack | |
> , Dup1 -- Duplicate the current offset | |
> , MStore -- Store in new position | |
> , Push1 0x20 -- Amount to increment offset | |
> , Add -- Increment current offset | |
> ] | |
> complete = | |
> [ Pop -- Discard final offset | |
> , Swap1 -- Swap offset and length on stack | |
> , Push2 storageSize -- Push storage size to stack | |
> , Add -- Add storage size to return size | |
> , Swap1 -- Put return args back in order | |
> , Return -- Finally, we can now return | |
> ] | |
The length of this operation is 332 bytes. This is too long to insert anywhere | |
we find a `RETURN`, so instead we will append this code as a subroutine at the | |
end of the contract. We can then replace existing `RETURN`s by jumping to the | |
new routine using a special meta-instruction `PushLabel`, which is translated | |
to the appropriate `PUSH` opcode and constant during the final assembly stage: | |
> returnJump :: [Op] | |
> returnJump = | |
> [ PushLabel 0 | |
> , Jump | |
> ] | |
When an EVM contract is called, the message data must be read from a separate | |
address space. The EVM provides opcodes for loading a single word to the stack | |
(`CALLDATALOAD`) and copying a range into memory (`CALLDATACOPY`). We will use | |
the latter to copy the storage data at the end of the call data to the storage | |
region in memory, using `CALLDATASIZE` to determine the length. Our function | |
will prepend this code to the beginning of a contract: | |
> initStorage :: [Op] -> [Op] | |
> initStorage contract = | |
> [ Push2 storageSize | |
> , CallDataSize | |
> , Push2 storageSize | |
> , Sub | |
> , Push3 offset | |
> , CallDataCopy | |
> ] ++ contract | |
The length of this operation is 13 bytes. We prepend this code to the contract | |
to ensure that the storage memory region is populated throughout the contract | |
execution. | |
With all of our translations now defined, we can write our opcode translation | |
function: | |
> translate :: Op -> [Op] | |
> translate x = | |
> case x of SLoad -> relocate MLoad | |
> SStore -> relocate MStore | |
> Stop -> returnState | |
> Return -> returnJump | |
> _ -> [x] | |
This function takes an opcode and returns a list of opcodes to replace it. The | |
replacement list for a non-storage opcode is simply the singleton list of that | |
opcode. | |
Inserting new code presents us with a problem. Control flow on the EVM is by | |
the use of 2 jump opcodes: `JUMP` and `JUMPI`, where the latter is conditional | |
on the second stack value being nonzero, and both use the first stack value as | |
the target bytecode position - and these targets may no longer be valid. | |
Addressing this issue, or rather re-addressing it, will be a little trickier. | |
Jump targets may not always be constants pushed onto the stack in the opcode | |
immediately prior, in fact it is quite common for these values to be kept on | |
the stack for later use. It is however almost universal for them to originate | |
as constants in `PUSH` opcodes, and we can determine which constants likely | |
correspond to jump targets by correlating them against the positions of every | |
`JUMPDEST` opcode, which all jump operations must land on - at the risk of | |
mistakenly identifying ordinary constants as jump labels. There are several | |
approaches to re-addressing which avoid these mistakes. In general the only | |
perfect solution is a runtime lookup table, in practice compile-time solutions | |
can be just as viable with no runtime overhead. (I got flawless results from | |
a dynamic dependency analysis approach, but that would have made this write-up | |
an order of magnitude longer.) | |
After reading a contract but before doing any translation of it we locate all | |
`JUMPDEST` opcodes, convert them to position-tagged `JumpLabel` meta-opcodes, | |
then replace any constants that occur in the set of “labels” to the equivalent | |
`PushLabel` meta-opcode. This effectively gives us position-independent jumps, | |
enabling us to safely insert new opcodes. The process entails two passses. The | |
first pass creates the labels and collects them into a list: | |
> label :: [(Integer, Op)] -> ([Integer], [Op]) | |
> label = | |
> foldr f ([], []) | |
> where | |
> f (k, op) (ks, ops) = case op of | |
> JumpDest -> (k : ks, JumpLabel k : ops) | |
> _ -> (ks , op : ops) | |
The second pass takes the output of the first and translates `PUSH` opcodes | |
with constants that occur in the label list into `PushLabel` meta-opcodes. | |
> labelPush :: ([Integer], [Op]) -> [Op] | |
> labelPush (ks, ops) = | |
> map f ops | |
> where | |
> f op = case op of | |
> (push -> P k _) | k `elem` ks -> PushLabel k | |
> _ -> op | |
In order to reassemble the contract back into EVM bytecode, we need to perform | |
the same process in reverse, replacing label meta-opcodes with real opcodes | |
based on their new positions. This is also a two pass process. First we remove | |
the `JumpLabel`s, whose actual positions will now be different to their labels | |
and build a list of mappings. Note that we must keep track of the current byte | |
position by the length of the the opcodes. | |
> unlabel :: [Op] -> ([(Integer, Integer)], [Op]) | |
> unlabel = | |
> dropFst . foldr f (0, [], []) | |
> where | |
> dropFst (_, ks, ops) = (ks, ops) | |
> f op (k, ks, ops) = case op of | |
> JumpLabel k' -> (k+1, (k',k):ks, JumpDest : ops) | |
> PushLabel k' -> (k+3, ks, op : ops) | |
> (push -> P x n) -> (k+n+1, ks, op : ops) | |
> _ -> (k+1, ks, op : ops) | |
The second pass takes the output of the first and translates `PushLabel` meta- | |
opcodes to `PUSH2` opcodes with the mapped offset. This is the first time that | |
we introduce the possibility of error into out program, since the mapping may | |
fail if we have made a mistake somewhere. We won't handle that possibility for | |
this demonstration, but we will at least acknowledge it with an error message | |
and abort. | |
> unlabelPush :: ([(Integer, Integer)], [Op]) -> [Op] | |
> unlabelPush (kvs, ops) = | |
> map f ops | |
> where | |
> f op = case op of | |
> PushLabel k -> Push2 (remap k) | |
> _ -> op | |
> remap k = case lookup k kvs of | |
> Just k' -> k' | |
> Nothing -> error $ "no such label: " ++ show k | |
We now have all the machinery we need to translate contracts, and can finally | |
connect them all up in a neat pipeline. This function will take the list of | |
position-tagged opcodes from a disassembler function, and return a final list | |
of opcodes to provide the assembler. | |
> convert :: [(Integer, Op)] -> [Op] | |
> convert = | |
> unlabelPush | |
> . unlabel | |
> . returnData | |
> . initStorage | |
> . concatMap translate | |
> . labelPush | |
> . label | |
That's basically it. All that remains is the main IO to load the contract and | |
save the translated version. A basic assembler/disassembler is provided in the | |
appendix in order to make this document a self-contained program, but we need | |
not concern ourselves with the details. Tying it all together, our program | |
will accept any number of input files as command line arguments, and for each | |
input write the translated output to a new file with the extension `.out`: | |
> main :: IO () | |
> main = do | |
> inputs <- tail <$> getArgs | |
> forM_ inputs $ \file -> do | |
> contract <- disassemble <$> B.readFile file | |
> let file' = file ++ ".out" | |
> B.writeFile file' . assemble $ convert contract | |
The translated contracts are ABI-compatible with Smolenski's proposal, in that | |
every method effectively has an extra argument and return value for the state | |
data, but this demonstration does not include a mechanism for automatically | |
updating the `solc`-generated ABI file. For now, these changes will have to be | |
added manually. | |
One final caveat is that contracts compiled by `solc` begin with a preamble or | |
“installer” section which initialises the contract and performs the vital task | |
of returning the actual contract bytecode, which is then stored with the newly | |
created account. As a simple workaround, the `--bin-runtime` option to `solc` | |
will cause it to emit just the part that should be translated, which can then | |
be spliced back in to the full bytecode. Note that therefore the initial state | |
of the contract will be empty; contracts that require initialisation should do | |
so by exposing an “initialise” method in their public API. | |
CONCLUSION | |
We have built a static recompiler for converting smart contracts which use the | |
native EVM storage facility into contracts which transform state values passed | |
via transaction inputs and outputs. It is a general solution applicable to all | |
EVM contracts, though the version we have built in this document is subject to | |
several limitations which are described in the DESIGN section with a suggested | |
solution in each case - these are “engineering problems”, in the sense that an | |
appropriate implementation is known. | |
There is one feature absent from this demonstration which is an important part | |
of Smolenski's proposal, namely support for endorser-oracle state validation. | |
This involves the contract using its native EVM storage to store a hash of the | |
last known valid state, which can only be updated if the new hash is endorsed | |
by the oracle by signing it. There are two ways we could approach this: either | |
as a completely transparent process (a “hidden method”), or by a mechanism to | |
allow contracts to validate their own “state update” requests, where the lower | |
level code provides a way for the high level code to store the new hash. This | |
question is left as a topic for future discussion. | |
The high and ever increasing prices of native EVM storage for smart contracts | |
is an issue which is only likely to worsen over time as Ethereum's popularity | |
grows along with the volume of contract data that peers are required to store | |
on their machines. On the other hand, a more popular network is a more useful | |
one, and if contract developers are able to take advantage of cheaper storage | |
while still leveraging the reach of the Ethereum blockchain, many will likely | |
opt to do so. Interest in privacy and access control is also likely to become | |
stronger as distributed applications become more sophisticated. This proposal | |
is a meant as a small contribution to that conversation - in the hope that it | |
might help us to answer: what on earth we are going to do about this problem? | |
----------------------------------------------------------------------------- | |
APPENDIX A - ASSEMBLER AND DISASSEMBLER | |
These are the basic functions for disassembly and assembly. They will cause an | |
error if the operation fails. | |
> disassemble :: B.ByteString -> [(Integer, Op)] | |
> disassemble bs = f 0 | |
> where f i | i >= B.length bs = [] | |
> | otherwise = let (n, x) = getOp i bs | |
> in (toInteger i, x) : f (n+i) | |
> assemble :: [Op] -> B.ByteString | |
> assemble = mconcat . map toSym | |
> getOp :: Int -> B.ByteString -> (Int, Op) | |
> getOp p bs = | |
> if p >= 0 && p < B.length bs | |
> then case sym $ B.index bs p of | |
> Right op -> (1, op) | |
> Left (n, op) -> (n+1, op $ getData (p+1) n bs) | |
> else | |
> error $ "out of bounds " ++ show p | |
> getData :: Int -> Int -> B.ByteString -> Integer | |
> getData p n bs = | |
> if p >= 0 && p + n <= B.length bs | |
> then roll . B.take n $ B.drop p bs | |
> else error "unexpected end of input" | |
APPENDIX B - PUSH VIEW PATTERN | |
This code defines the `pushN` view pattern. | |
> data PushN = P Integer Integer | NotPush | |
> push :: Op -> PushN | |
> push op = case op of | |
> Push1 x -> P x 1 ; Push17 x -> P x 17 | |
> Push2 x -> P x 2 ; Push18 x -> P x 18 | |
> Push3 x -> P x 3 ; Push19 x -> P x 19 | |
> Push4 x -> P x 4 ; Push20 x -> P x 20 | |
> Push5 x -> P x 5 ; Push21 x -> P x 21 | |
> Push6 x -> P x 6 ; Push22 x -> P x 22 | |
> Push7 x -> P x 7 ; Push23 x -> P x 23 | |
> Push8 x -> P x 8 ; Push24 x -> P x 24 | |
> Push9 x -> P x 9 ; Push25 x -> P x 25 | |
> Push10 x -> P x 10 ; Push26 x -> P x 26 | |
> Push11 x -> P x 11 ; Push27 x -> P x 27 | |
> Push12 x -> P x 12 ; Push28 x -> P x 28 | |
> Push13 x -> P x 13 ; Push29 x -> P x 29 | |
> Push14 x -> P x 14 ; Push30 x -> P x 30 | |
> Push15 x -> P x 15 ; Push31 x -> P x 31 | |
> Push16 x -> P x 16 ; Push32 x -> P x 32 | |
> _ -> NotPush | |
APPENDIX C - SYMBOL TABLES | |
This code declares the `Op` type and bytecode conversions. | |
> type I = Integer | |
> data Op | |
> = Stop | Lt | Pop | BlockHash | Address | |
> | Add | Gt | MLoad | Coinbase | Balance | |
> | Mul | SLT | MStore | Timestamp | Origin | |
> | Sub | SGT | MStore8 | Number | Caller | |
> | Div | Eq | SLoad | Difficulty | CallValue | |
> | SDiv | IsZero | SStore | GasLimit | CallDataLoad | |
> | Mod | And | Jump | Create | CallDataSize | |
> | SMod | Or | JumpI | Call | CallDataCopy | |
> | AddMod | Xor | PC | CallCode | CodeSize | |
> | MulMod | Not | MSize | Return | CodeCopy | |
> | Exp | Byte | Gas | DelegateCall | GasPrice | |
> | SignExtend | SHA3 | JumpDest | Suicide | ExtCodeSize | |
> | Push1 I | Push17 I | Dup1 | Swap1 | ExtCodeCopy | |
> | Push2 I | Push18 I | Dup2 | Swap2 | Log0 | |
> | Push3 I | Push19 I | Dup3 | Swap3 | Log1 | |
> | Push4 I | Push20 I | Dup4 | Swap4 | Log2 | |
> | Push5 I | Push21 I | Dup5 | Swap5 | Log3 | |
> | Push6 I | Push22 I | Dup6 | Swap6 | Log4 | |
> | Push7 I | Push23 I | Dup7 | Swap7 -- Meta: | |
> | Push8 I | Push24 I | Dup8 | Swap8 | Invalid Word8 | |
> | Push9 I | Push25 I | Dup9 | Swap9 | JumpLabel I | |
> | Push10 I | Push26 I | Dup10 | Swap10 | PushLabel I | |
> | Push11 I | Push27 I | Dup11 | Swap11 | |
> | Push12 I | Push28 I | Dup12 | Swap12 | |
> | Push13 I | Push29 I | Dup13 | Swap13 | |
> | Push14 I | Push30 I | Dup14 | Swap14 | |
> | Push15 I | Push31 I | Dup15 | Swap15 | |
> | Push16 I | Push32 I | Dup16 | Swap16 | |
> sym :: Word8 -> Either (Int, Integer -> Op) Op | |
> sym x = case x of | |
> 0x00 -> r Stop ; 0x33 -> r Caller ; 0x5b -> r JumpDest | |
> 0x01 -> r Add ; 0x34 -> r CallValue ; 0xa0 -> r Log0 | |
> 0x02 -> r Mul ; 0x35 -> r CallDataLoad ; 0xa1 -> r Log1 | |
> 0x03 -> r Sub ; 0x36 -> r CallDataSize ; 0xa2 -> r Log2 | |
> 0x04 -> r Div ; 0x37 -> r CallDataCopy ; 0xa3 -> r Log3 | |
> 0x05 -> r SDiv ; 0x38 -> r CodeSize ; 0xa4 -> r Log4 | |
> 0x06 -> r Mod ; 0x39 -> r CodeCopy ; 0xf0 -> r Create | |
> 0x07 -> r SMod ; 0x3a -> r GasPrice ; 0xf1 -> r Call | |
> 0x08 -> r AddMod ; 0x3b -> r ExtCodeSize ; 0xf2 -> r CallCode | |
> 0x09 -> r MulMod ; 0x3c -> r ExtCodeCopy ; 0xf3 -> r Return | |
> 0x0A -> r Exp ; 0x40 -> r BlockHash ; 0xf4 -> r DelegateCall | |
> 0x0b -> r SignExtend ; 0x41 -> r Coinbase ; 0xf5 -> r Suicide | |
> 0x10 -> r Lt ; 0x42 -> r Timestamp ; 0x90 -> r Swap1 | |
> 0x11 -> r Gt ; 0x43 -> r Number ; 0x91 -> r Swap2 | |
> 0x12 -> r SLT ; 0x44 -> r Difficulty ; 0x92 -> r Swap3 | |
> 0x13 -> r SGT ; 0x45 -> r GasLimit ; 0x93 -> r Swap4 | |
> 0x14 -> r Eq ; 0x50 -> r Pop ; 0x94 -> r Swap5 | |
> 0x15 -> r IsZero ; 0x51 -> r MLoad ; 0x95 -> r Swap6 | |
> 0x16 -> r And ; 0x52 -> r MStore ; 0x96 -> r Swap7 | |
> 0x17 -> r Or ; 0x53 -> r MStore8 ; 0x97 -> r Swap8 | |
> 0x18 -> r Xor ; 0x54 -> r SLoad ; 0x98 -> r Swap9 | |
> 0x19 -> r Not ; 0x55 -> r SStore ; 0x99 -> r Swap10 | |
> 0x1a -> r Byte ; 0x56 -> r Jump ; 0x9a -> r Swap11 | |
> 0x20 -> r SHA3 ; 0x57 -> r JumpI ; 0x9b -> r Swap12 | |
> 0x30 -> r Address ; 0x58 -> r PC ; 0x9c -> r Swap13 | |
> 0x31 -> r Balance ; 0x59 -> r MSize ; 0x9d -> r Swap14 | |
> 0x32 -> r Origin ; 0x5a -> r Gas ; 0x9e -> r Swap15 | |
> 0x60 -> l (1, Push1) ; 0x70 -> l (17, Push17) ; 0x80 -> r Dup1 | |
> 0x61 -> l (2, Push2) ; 0x71 -> l (18, Push18) ; 0x81 -> r Dup2 | |
> 0x62 -> l (3, Push3) ; 0x72 -> l (19, Push19) ; 0x82 -> r Dup3 | |
> 0x63 -> l (4, Push4) ; 0x73 -> l (20, Push20) ; 0x83 -> r Dup4 | |
> 0x64 -> l (5, Push5) ; 0x74 -> l (21, Push21) ; 0x84 -> r Dup5 | |
> 0x65 -> l (6, Push6) ; 0x75 -> l (22, Push22) ; 0x85 -> r Dup6 | |
> 0x66 -> l (7, Push7) ; 0x76 -> l (23, Push23) ; 0x86 -> r Dup7 | |
> 0x67 -> l (8, Push8) ; 0x77 -> l (24, Push24) ; 0x87 -> r Dup8 | |
> 0x68 -> l (9, Push9) ; 0x78 -> l (25, Push25) ; 0x88 -> r Dup9 | |
> 0x69 -> l (10, Push10) ; 0x79 -> l (26, Push26) ; 0x89 -> r Dup10 | |
> 0x6a -> l (11, Push11) ; 0x7a -> l (27, Push27) ; 0x8a -> r Dup11 | |
> 0x6b -> l (12, Push12) ; 0x7b -> l (28, Push28) ; 0x8b -> r Dup12 | |
> 0x6c -> l (13, Push13) ; 0x7c -> l (29, Push29) ; 0x8c -> r Dup13 | |
> 0x6d -> l (14, Push14) ; 0x7d -> l (30, Push30) ; 0x8d -> r Dup14 | |
> 0x6e -> l (15, Push15) ; 0x7e -> l (31, Push31) ; 0x8e -> r Dup15 | |
> 0x6f -> l (16, Push16) ; 0x7f -> l (32, Push32) ; 0x8f -> r Dup16 | |
> 0x9f -> r Swap16 ; _ -> r (Invalid x) | |
> where l = Left ; r = Right | |
> toSym :: Op -> B.ByteString | |
> toSym op = case op of | |
> Stop -> b 0x00 ; Caller -> b 0x33 ; JumpDest -> b 0x5b | |
> Add -> b 0x01 ; CallValue -> b 0x34 ; Log0 -> b 0xa0 | |
> Mul -> b 0x02 ; CallDataLoad -> b 0x35 ; Log1 -> b 0xa1 | |
> Sub -> b 0x03 ; CallDataSize -> b 0x36 ; Log2 -> b 0xa2 | |
> Div -> b 0x04 ; CallDataCopy -> b 0x37 ; Log3 -> b 0xa3 | |
> SDiv -> b 0x05 ; CodeSize -> b 0x38 ; Log4 -> b 0xa4 | |
> Mod -> b 0x06 ; CodeCopy -> b 0x39 ; Create -> b 0xf0 | |
> SMod -> b 0x07 ; GasPrice -> b 0x3a ; Call -> b 0xf1 | |
> AddMod -> b 0x08 ; ExtCodeSize -> b 0x3b ; CallCode -> b 0xf2 | |
> MulMod -> b 0x09 ; ExtCodeCopy -> b 0x3c ; Return -> b 0xf3 | |
> Exp -> b 0x0A ; BlockHash -> b 0x40 ; DelegateCall -> b 0xf4 | |
> SignExtend -> b 0x0b ; Coinbase -> b 0x41 ; Suicide -> b 0xf5 | |
> Lt -> b 0x10 ; Timestamp -> b 0x42 ; Swap1 -> b 0x90 | |
> Gt -> b 0x11 ; Number -> b 0x43 ; Swap2 -> b 0x91 | |
> SLT -> b 0x12 ; Difficulty -> b 0x44 ; Swap3 -> b 0x92 | |
> SGT -> b 0x13 ; GasLimit -> b 0x45 ; Swap4 -> b 0x93 | |
> Eq -> b 0x14 ; Pop -> b 0x50 ; Swap5 -> b 0x94 | |
> IsZero -> b 0x15 ; MLoad -> b 0x51 ; Swap6 -> b 0x95 | |
> And -> b 0x16 ; MStore -> b 0x52 ; Swap7 -> b 0x96 | |
> Or -> b 0x17 ; MStore8 -> b 0x53 ; Swap8 -> b 0x97 | |
> Xor -> b 0x18 ; SLoad -> b 0x54 ; Swap9 -> b 0x98 | |
> Not -> b 0x19 ; SStore -> b 0x55 ; Swap10 -> b 0x99 | |
> Byte -> b 0x1a ; Jump -> b 0x56 ; Swap11 -> b 0x9a | |
> SHA3 -> b 0x20 ; JumpI -> b 0x57 ; Swap12 -> b 0x9b | |
> Address -> b 0x30 ; PC -> b 0x58 ; Swap13 -> b 0x9c | |
> Balance -> b 0x31 ; MSize -> b 0x59 ; Swap14 -> b 0x9d | |
> Origin -> b 0x32 ; Gas -> b 0x5a ; Swap15 -> b 0x9e | |
> Dup1 -> b 0x80 ; Dup7 -> b 0x86 ; Dup13 -> b 0x8c | |
> Dup2 -> b 0x81 ; Dup8 -> b 0x87 ; Dup14 -> b 0x8d | |
> Dup3 -> b 0x82 ; Dup9 -> b 0x88 ; Dup15 -> b 0x8e | |
> Dup4 -> b 0x83 ; Dup10 -> b 0x89 ; Dup16 -> b 0x8f | |
> Dup5 -> b 0x84 ; Dup11 -> b 0x8a ; Swap16 -> b 0x9f | |
> Dup6 -> b 0x85 ; Dup12 -> b 0x8b ; Invalid x -> unroll x | |
> (push -> P x n) -> p x (fromIntegral n) | |
> where p x n = (n - 1 + 0x60) `B.cons` word n (unroll x) | |
> b = B.singleton | |
APPENDIX D: HELPER FUNCTIONS | |
> roll :: B.ByteString -> Integer | |
> roll = B.foldl' unstep 0 | |
> where unstep a b = a `shiftL` 8 .|. fromIntegral b | |
> unroll :: (Integral a, Bits a) => a -> B.ByteString | |
> unroll = B.reverse . B.unfoldr step | |
> where | |
> step 0 = Nothing | |
> step i = Just (fromIntegral i, i `shiftR` 8) | |
> word :: Integral a => a -> B.ByteString -> B.ByteString | |
> word x bs = | |
> if B.length bs >= n | |
> then | |
> B.drop (max 0 $ B.length bs - n) bs | |
> else | |
> let len = B.length bs | |
> z = B.pack $ take (n - len) (repeat 0) | |
> in z `B.append` bs | |
> where n = fromIntegral x | |
------------------------------------------------------------------------------ | |
End of transmission. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment