@chriseth
Last active July 8, 2024 10:55

Answers to Deep Questions about Solidity

The following list of questions was taken from https://www.reddit.com/r/ethereum/comments/72reba/do_you_have_deep_questions_about_solidity_or_the/

An updated summary on the different ways one could have two contracts interact (DELEGATECALL, STATICCALL, libraries, all that stuff) with clear pros/cons for each (gas cost, whether it requires EVM assembly directives, etc)

Question by /u/drcode

To keep this answer brief, I won't talk about low-level opcodes here. In general, there are four ways functions can be called in Solidity:

  1. f() or L.f() for L being a library and f being an internal function
  2. x.f() for x being a contract instance and f being a payable or "regular" function
  3. x.f() for x being a contract instance and f being a view (constant) or pure function
  4. L.f() for L being a library and f being a public function

The first is the cheapest: it is just a "jump" within the same bytecode. The drawback is that you can only call functions of the same contract or internal functions of a library. Both are essentially the same thing; putting the function in a library just allows you to do some modularisation. Another advantage apart from the cost is that memory structs and arrays are not copied when you pass them; they are passed by reference instead. This again means that it is cheaper, but it also allows the called function to modify the caller's data.

The second is the "normal" cross-contract function call which costs at least 700 gas. Anything passed as argument will be copied and the target will have a completely new execution context.

The third is a restriction of the second that will use STATICCALL in the future. The only difference is that the called contract cannot make any modifications to the state, but the costs and implications about copies are the same.

The fourth uses the DELEGATECALL opcode which basically executes the library's code in the context of the calling contract. You can pass storage data by reference but memory data will be passed involving a copy.
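The four kinds of calls can be put side by side in a minimal sketch. It assumes Solidity 0.8, and all names (MathLib, Target, Caller) are made up for illustration. In compilers from 0.5.0 on, the third kind is already compiled to STATICCALL.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// All names here are hypothetical and only serve the illustration.
library MathLib {
    // Kind 1: internal library function. Its code is included in the
    // calling contract and reached with a plain jump.
    function double(uint256 x) internal pure returns (uint256) {
        return 2 * x;
    }

    // Kind 4: public library function, reached via DELEGATECALL.
    // The storage array is passed by reference.
    function append(uint256[] storage arr, uint256 x) public {
        arr.push(x);
    }
}

contract Target {
    uint256 public counter;

    // Kind 2: "regular" state-changing function.
    function increment() external {
        counter += 1;
    }

    // Kind 3: view function, cannot modify state.
    function peek() external view returns (uint256) {
        return counter;
    }
}

contract Caller {
    uint256[] public values;

    function demo(Target t) external returns (uint256) {
        uint256 a = MathLib.double(21); // 1: jump inside Caller's own bytecode
        t.increment();                  // 2: external call, new execution context
        uint256 b = t.peek();           // 3: external read-only call
        MathLib.append(values, a + b);  // 4: library code runs in Caller's context
        return values.length;
    }
}
```

Because `append` is public, the library has to be deployed and linked before `Caller` can be deployed. `double` needs no separate deployment.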

The whole use of memory/storage designators, when they are necessary and when they are implied, is still unclear to me (especially in cases where one is dealing with more complex variable types, like arrays of structs or something). The way to solve this would be to have a bunch of code snippet examples and then say "In this case, X is in memory and Y is a state variable", ideally with some discussion in each case of what is happening at a low level, in terms of costly opcodes.

Question by /u/drcode

I hope we have this in the documentation. If not, please add it :-). The general idea is that you always have to specify the storage location in the future. A complex construct like a struct or array cannot change the storage location in itself, i.e. the storage location is just one keyword at the end of the type name and applies to everything in the type. This means you cannot store storage pointers in memory structs or memory pointers in storage structs.
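A small sketch of the kind of snippet the question asks for, assuming Solidity 0.8 (where the location keyword is mandatory for local variables of reference type). The contract and struct names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract Locations {
    struct Item {
        uint256 id;
        uint256[] tags;
    }

    Item[] items; // state variable: always lives in storage

    function add(uint256 id) external {
        items.push();                    // append a zero-initialised Item
        items[items.length - 1].id = id; // write directly to storage
    }

    function example(uint256 i) external returns (uint256) {
        Item storage s = items[i]; // s is a pointer into storage, nothing is copied
        s.id = 1;                  // storage write: changes the state variable

        Item memory m = items[i];  // the whole struct, including tags, is read
                                   // from storage and copied to memory
        m.id = 2;                  // memory write: changes only the copy
        m.tags = new uint256[](3); // the nested array is a memory array as well

        return items[i].id;        // still 1
    }
}
```

The keyword after the type applies to the struct and everything nested in it: `m.tags` is a memory array because `m` is a memory struct.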

It would be awesome for beginners if there was just a visualization that is a bar chart of different common code examples (writing to a state struct, creating an event, calculating a SHA) and how much gas they consume. I think most newcomers are completely unaware of how bonkers such a bar chart would look (or at least haven't fully internalized it)

Question by /u/drcode

We used to color the code depending on its cost in Mix. This was not very accurate, though. Remix tells you the cost of each function, so please feel free to create such a visualisation!

Writing a word to storage: 20k gas, updating it: 5k gas. Reading from storage: 200 gas. Computing sha3: 30 gas plus 6 per data word. Event: 375 plus 8 per byte of data and 375 per indexed argument.

These are just the EVM values, of course Solidity adds some overhead.
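To see that overhead, one could deploy a contract like the following in Remix and compare the cost it reports for each function. This is only a sketch with hypothetical names, assuming Solidity 0.8. It makes no claim about the numbers you will get, which also depend on the gas schedule of the network or VM in use.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract: one function per operation to be compared.
contract GasDemo {
    uint256 public word; // the generated getter word() is a plain storage read

    event Stored(address indexed who, uint256 value);

    // Storage write: the first call sets a fresh slot, later calls update it.
    function write(uint256 v) external {
        word = v;
    }

    // Hashing: cost grows with the number of words in `data`.
    function hash(bytes calldata data) external pure returns (bytes32) {
        return keccak256(data);
    }

    // Event with one indexed argument and one word of data.
    function emitStored(uint256 v) external {
        emit Stored(msg.sender, v);
    }
}
```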

A straightforward example of using ecrecover()

Question by /u/drcode

https://ethereum.stackexchange.com/a/15911/222
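For readers who want something inline, here is a minimal sketch that is independent of the linked answer. It assumes Solidity 0.8 and a 65-byte signature laid out as r, s, v, and it assumes the caller passes the exact hash that was signed (for example, already including the "\x19Ethereum Signed Message:\n32" prefix if the signature came from a wallet's message-signing function). The contract and function names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract SigCheck {
    // Returns the address that signed `hash`, or address(0) if the
    // signature is malformed or recovery fails.
    function recoverSigner(bytes32 hash, bytes memory sig)
        public
        pure
        returns (address)
    {
        if (sig.length != 65) return address(0);

        bytes32 r;
        bytes32 s;
        uint8 v;
        assembly {
            // The first 32 bytes of `sig` in memory hold its length.
            r := mload(add(sig, 32))
            s := mload(add(sig, 64))
            v := byte(0, mload(add(sig, 96)))
        }

        // Some signers return 0/1 instead of 27/28.
        if (v < 27) v += 27;
        if (v != 27 && v != 28) return address(0);

        return ecrecover(hash, v, r, s);
    }
}
```

A caller would compare the result against the expected signer, e.g. `require(recoverSigner(h, sig) == owner)`, and should treat `address(0)` as a failure.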

A good explanation of how awesomely useful indexed log fields are, what you can do with them, and why newcomers should use them more.

Question by /u/drcode

I think there should be texts out there that can explain it much better than me.
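As a small illustration of what is being asked about (a sketch with a hypothetical contract, assuming Solidity 0.8): arguments marked `indexed` are stored as log topics instead of in the log data, and nodes can filter logs by topic. A non-anonymous event can have up to three indexed arguments.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract Token {
    // `from` and `to` become topics and can be filtered on;
    // `value` goes into the unindexed data part of the log.
    event Transfer(address indexed from, address indexed to, uint256 value);

    function transfer(address to, uint256 value) external {
        // Balance bookkeeping omitted; only the event is of interest here.
        emit Transfer(msg.sender, to, value);
    }
}
```

A client can then ask a node for all `Transfer` logs with a given `to` address without scanning every log of the contract.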

Also when variables are passed by reference or value. You have to nitpick the documents to find the answer.

Question by /u/SyncMyShip

An independent copy is always created when the storage/memory boundary is crossed, i.e. if you convert a storage type to a memory type or vice-versa. Value types (basically anything apart from structs, arrays and mappings) are always passed by value. If the called function can receive a storage reference, storage data is passed by reference, otherwise a memory copy is created. Memory types are always passed by reference if possible. It is possible if and only if an internal function of a library is called or a function on the same contract instance is called directly.
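These rules can be checked with a short sketch, assuming Solidity 0.8. The names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract PassBy {
    uint256[] data;

    // Internal call with a memory parameter: receives a reference.
    function setFirst(uint256[] memory a) internal pure {
        a[0] = 42;
    }

    // Internal call with a storage parameter: receives a storage reference.
    function setFirstStorage(uint256[] storage a) internal {
        a[0] = 42;
    }

    // External call: the argument arrives as an independent copy.
    function setFirstExternal(uint256[] memory a)
        external
        pure
        returns (uint256)
    {
        a[0] = 42;
        return a[0];
    }

    function demo() external returns (uint256, uint256, uint256) {
        uint256[] memory m = new uint256[](1);

        setFirst(m);                   // modifies m itself
        uint256 afterInternal = m[0];  // 42

        m[0] = 0;
        this.setFirstExternal(m);      // m is encoded and copied for the call
        uint256 afterExternal = m[0];  // still 0

        data.push(0);
        setFirstStorage(data);         // modifies the state variable

        return (afterInternal, afterExternal, data[0]); // (42, 0, 42)
    }
}
```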

What problems do you feel are hard to tackle with static analysis, that it would be nice to see dynamic analyzers for in Solidity? Where do current compiler warnings fall short?

Question by /u/AlLnAtuRalX

I think the distinction between static analyzers and dynamic analyzers blurs for smart contracts. Probably everything that can be done with dynamic analysis can also be done by static analysis using symbolic execution. We already have a quite impressive set of static analyzers, including but not limited to Mythril, oyente, porosity, Securify and the AST-based static analyzer that is part of remix. I don't think there are problems that are particularly hard to tackle; it is just a lot of work to anticipate all of them. Furthermore, a frontend that integrates all these tools would be nice.

What bugs me most about the compiler errors is that they are often quite hard to understand. Ideally, every error should suggest a way to fix it, in the sense of "Did you mean Y?". The problem here is again that this is a lot of work and can get pretty messy quickly.

My Deep Question lies in the current state of testing, and how we are currently testing smart contracts. Do we feel that testing contracts via web3 and calling a node via RPC is the best way to be testing contracts? What if we could have tools that would allow for easier testing and not via a layer of indirection? Would you test a class in another language by using a web interface to interact with it? Surely testing via web3 etc. is at the integration test level?

Question by /u/hookercookerman

You can unit-test smart contracts by writing tests in Solidity itself. I would say this works pretty well and there are already frameworks for it. Testing from javascript via web3 is indeed rather an integration test, but I don't think that the indirection is a problem. What is a problem is that slow backend nodes are used. For ethereum-js-vm, the slowdown comes from javascript itself being slow, especially on cryptographic operations. If you use a real ethereum node you often also have a slowdown due to mining. For example geth does have a development mode, but as far as I know it still takes a second to mine a block because it does not allow blocks to be in the future. cpp-ethereum has a mode that runs the tests quite fast and can include future blocks (we use that for testing the compiler itself).
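A framework-agnostic sketch of what a unit test written in Solidity can look like, assuming Solidity 0.8. Real test frameworks have their own conventions for discovering tests and their own assertion helpers, so this uses only a plain `require`. Both contracts are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Contract under test.
contract Counter {
    uint256 public count;

    function increment() external {
        count += 1;
    }
}

// Test contract: each test deploys a fresh instance and reverts on failure.
contract CounterTest {
    function testIncrement() external {
        Counter c = new Counter();
        c.increment();
        require(c.count() == 1, "count should be 1 after one increment");
    }
}
```

The test runs entirely inside the EVM: a single call to `testIncrement` either succeeds or reverts with the message.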

How to give developers confidence that the Solidity optimizer will definitely not change behaviors, even if the EVM is upgraded making some tricks obsolete? C has given me an inherent distrust in optimizers, and it's nice that Solidity has no undefined behaviors, but do you think there are cases today where the optimizer breaks down?

Question by /u/AlLnAtuRalX

The optimizer did have some issues in the past, but we disabled many stages where we had some doubts about their correctness. Of course, bugs can be anywhere in the code of the compiler and disabling code always disables potential bugs. Formally proving optimizer rules correct would help there (any volunteers?).

Future changes in the semantics of the EVM should not be a problem because incompatible changes there would not only affect code generated by the optimizer. Also I don't think the comparison with C optimizers is fair here, because both the C optimizers themselves and the target machine (x86) are orders of magnitude more complex.

How much more performant can the EVM possibly get and how? Is there any optimization margin at the Solidity compilation phase as well?

Question by /u/nnn4

I don't think this question can be answered reliably at this point in time.

How do you set and prioritize features for solidity? Is there a public roadmap? There are various rumors and a general sense of distrust towards use of the optimizer. Would you recommend using it for a security critical application?

Question by /u/maurelian

We use github issues and discuss them at our public weekly meeting. Concerning the optimizer, please take a look at the question above.

Hi, what are the next steps in evolving Solidity as a language? What is coming soon? More interested in syntax changes like improvements on types, strings. Any form of garbage collection or manual memory management? Thank you

Question by /u/biserdi

Garbage collection is out of scope, because memory is "garbage collected" by the EVM with each call. Please prove me wrong, but I think adding the overhead of a garbage collector is not worth it. Manual memory management might be worthwhile, though, but there are no plans for implementing it. One big change is the switch to iulia as an intermediate language, but that is not too visible to the users. We are planning to remove some legacies like javascript scoping and in general we plan to introduce new keywords to make the semantics clearer. Since we are a tiny team, we cannot focus on anything that can be implemented outside the compiler. String manipulation functions are one example: there are already libraries for them.

Are there any thoughts on concurrent design? For example multiple Ethereum blockchains interacting or another solution. I imagine there could be great stride to gain from running the EVM on GPU for example.

Question by /u/SyncMyShip

A blockchain is not a high performance computation platform, so Solidity will probably not receive any parallel execution features (SIMD might be an exception). Running code on a GPU is only advantageous if it is parallelizable and I don't see that coming for smart contracts.

Why is SHA-3 an opcode but SHA-256 a precompiled contract?

Question by /u/niklas_buschmann

That's a very good question! Unfortunately, this was decided before I joined. In general, we now add precompiled contracts for anything that is more complicated. I guess keccak256 is so elementary that it was considered worthwhile to make it an opcode.

Why can't we do vector operations :(? It seems to be ridiculous to add 100 to a value associated with 5000 accounts using a for loop. There are a lot of situations where vector operations would be necessary to keep the gas consumed low.

Question by /u/pcastonguay

As mentioned by others, there is a proposal to add vector operations to the EVM. In your use-case, though, it would not help anything because, if I understand you correctly, the most expensive operation is writing to storage and you cannot parallelize modifying the trie.

Furthermore, I'm not sure if we actually need such functionality. A blockchain is an accounting database. Why do you need to add 100 to 5000 accounts? Wouldn't it be better to just store the fact that each account receives 100 and then only do the calculation if someone wants to know the balance of a specific account? In blockchains you (currently) cannot separate thinking about what to do and how to do it due to the scalability problem. Forcing nodes to have parallel computing power does not necessarily make it cheaper; it just reduces the cost for some execution platforms.
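One way to "store the fact" instead of looping can be sketched as follows, assuming Solidity 0.8. The contract, its names and the simplification that every address takes part in every bonus are assumptions made for the illustration. Access control is left out.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract LazyBonus {
    // Cumulative bonus granted to every account so far.
    uint256 public bonusPerAccount;

    // Balance as of the last settlement, and the value of
    // bonusPerAccount at that moment.
    mapping(address => uint256) private settled;
    mapping(address => uint256) private bonusAtSettlement;

    // One storage update, no matter how many accounts exist.
    function grantToAll(uint256 amount) external {
        bonusPerAccount += amount;
    }

    // The per-account calculation happens only when someone asks.
    function balanceOf(address who) public view returns (uint256) {
        return settled[who] + (bonusPerAccount - bonusAtSettlement[who]);
    }

    // Before changing an account, fold its pending bonus into the
    // stored balance so that nothing is counted twice.
    function credit(address who, uint256 amount) external {
        settled[who] = balanceOf(who) + amount;
        bonusAtSettlement[who] = bonusPerAccount;
    }
}
```

Calling `grantToAll(100)` costs the same for 5 accounts as for 5000. The cost of applying the bonus is paid per account, and only when that account is next touched.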

What prompted the decision to model Solidity, Viper et al on high level languages such as Javascript and Python? I understand the angle of ease of use, but will this really hold water as the technology becomes more widely adopted and with it, the safety concerns grow? Are there any plans to develop programming languages targeting the EVM based on safer paradigms? (strong static typing, pure functional programming... I'm thinking semantics closer to Haskell, Rust, Ada, to name a few)

Question by /u/Steel_Neuron

A safe language is only safe if it is used. That said, I think that in the beginning, we did focus a little too much on allowing people to write readable code and neglected the fact that we should also prevent people from writing bad code. While we are working on putting more and more restrictions on Solidity, I'm a little sceptical that lessons learnt from traditional computers can be so easily transferred to smart contracts. The big problems in the last months were not related to type systems, syntax or pureness, but instead with how Solidity exposes the EVM's concept of a contract and calls between contracts. I'm not sure a change in the high level language alone will be able to solve that. There are some examples of how such interactions can be modeled, but they come with their own drawbacks which are often related to a break of atomicity and complexity of use.

Is there a way to iterate through the entire stack from a contract call?

Here's why I'm asking. There was a question on StackOverflow about whether writing to storage is indispensable to implementing an anti-reentrancy semaphore.

An idea I had was that while afaik the EVM or Solidity don't support global in-memory variables, the non-reentrant function could iterate through the stack as the first thing it does, looking for a specific 256-bit uid characteristic of the contract instance and the function.

If the value is not found, the function pushes the value on the stack, and goes on to do its thing. If the value is found, we assume the function has been called once in this transaction, and so we abort due to reentrancy.

Question by /u/s1gmoid

There is an EIP that could solve the problem that writing such a semaphore to the stack is very expensive. There are many use-cases that require some information being visible for the duration of a transaction (and not just a call). Since this information is deleted at the end of the transaction, it should be much cheaper than storage.
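For reference, the storage-based semaphore that the question wants to avoid looks roughly like this. It is a sketch assuming Solidity 0.8, with hypothetical names. It does not implement the EIP mentioned above.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract Guarded {
    bool private locked; // the semaphore lives in storage
    mapping(address => uint256) public balances;

    modifier nonReentrant() {
        require(!locked, "reentrant call");
        locked = true;  // storage write on entry
        _;
        locked = false; // storage write on exit
    }

    function deposit() external payable {
        balances[msg.sender] += msg.value;
    }

    function withdraw() external nonReentrant {
        uint256 amount = balances[msg.sender];
        balances[msg.sender] = 0;
        // If the recipient calls back into withdraw(), the modifier reverts.
        (bool ok, ) = msg.sender.call{value: amount}("");
        require(ok, "transfer failed");
    }
}
```

The flag has to survive across nested calls within one transaction, which is why it cannot live in memory or on the stack: both are local to a single call.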

Why does the solidity documentation still say "If you want to implement access restrictions in library functions using msg.sender, you have to manually supply the value of msg.sender as an argument." This is no longer true since the change to DELEGATECALL, right?

Question by /u/accape

Indeed, thanks for the hint! - ethereum/solidity#3207

What happens to a contract's storage if you update the contract and remove a member variable marked as storage, or if you rename it, or make some other change?

Question by /u/briandilley

There is no way to update the code of a contract in Solidity while keeping storage variables. Libraries allow a certain kind of upgradability but this is also not really supported by Solidity.

Having said that, you can use inline assembly to arrive at such a solution. Renaming a state variable will not change the storage layout, but removing it will remove its slot, so the state variables declared after it shift to earlier positions. Please take a look at the "storage layout" section in the documentation for details.
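The effect on slot numbers can be made visible with a sketch like the following, assuming Solidity 0.8 (the `.slot` syntax in inline assembly needs 0.7 or later). The three contracts are hypothetical "versions" of the same contract and use only full-width `uint256` variables, so each variable occupies exactly one slot.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Original layout: a -> slot 0, b -> slot 1, c -> slot 2.
contract V1 {
    uint256 a;
    uint256 b;
    uint256 c;

    function slotOfC() external pure returns (uint256 s) {
        assembly { s := c.slot } // 2
    }
}

// b renamed: the layout is unchanged, c is still in slot 2.
contract V2Renamed {
    uint256 a;
    uint256 renamed;
    uint256 c;

    function slotOfC() external pure returns (uint256 s) {
        assembly { s := c.slot } // 2
    }
}

// b removed: c moves to slot 1. Code with this layout running on
// V1's storage would read the old value of b when it reads c.
contract V2Removed {
    uint256 a;
    uint256 c;

    function slotOfC() external pure returns (uint256 s) {
        assembly { s := c.slot } // 1
    }
}
```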

I would like to see a clear listing of default behaviours, and modification of those defaults if they encourage unsafe contract design (e.g. all functions being public).

Question by /u/_dredge

We are actually moving away from defaults and will require explicitly stating any "options" in the future.

Is there any talk about replacing or rewriting the Yellow Paper? I know I'm not the only one who finds it hard to read. It's also contained bugs and is very likely to have more, since it's not executable or testable or verifiable in any formal way.

Question by /u/meekale

Indeed, in the way it is written, it is neither human- nor machine-readable. Ideally, it should be both, or there should be an automated way to compile one into the other. We now have a specification of the EVM which is machine-readable and to a limited extent also human-readable - https://github.com/kframework/evm-semantics/blob/master/evm.md - but this does not yet exist for the other parts of the yellowpaper.

What exactly makes pre-compiled contracts cost less gas? Some of the common cryptographic functions have been implemented as "pre-compiled contracts". How exactly does this result in their requiring less gas to execute than if they were just coded into a normally deployed smart contract?

Question by /u/miyayes

In principle, we could reduce the gas costs for any specific contract. Miners could analyze their behaviour and implement them manually in a language that can more efficiently be compiled to native code. The difference with precompiled contracts is just that there was consensus among the community about how much they can be optimized.

Note that this only works for one contract at a time, so the more contracts you include, the messier it gets and the likelihood of consensus failures increases. This is why we usually shy away from precompiled contracts unless absolutely necessary.

Are there any plans to bring more op codes into the language?

Question by /u/_dredge

Any opcode that exists for the EVM also exists for Solidity (in inline assembly). If you are asking about new opcodes for the EVM then yes, there are plans (in the form of EIPs), but I cannot say anything about the likelihood of any of them being accepted.
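As a small example of reaching an opcode through inline assembly (a sketch assuming Solidity 0.8, with a hypothetical contract name), this reads the code size of an address with EXTCODESIZE:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract Opcodes {
    // Returns the size in bytes of the code stored at address `a`.
    function codeSize(address a) external view returns (uint256 size) {
        assembly {
            size := extcodesize(a)
        }
    }
}
```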

@guest2345

Hi, please help me out with this warning: Warning: Visibility for constructor is ignored. If you want the contract to be non-deployable, making it "abstract" is sufficient.
--> contracts/Rick Coin.sol:352:3:
|
352 | constructor() public {
| ^ (Relevant source part starts here and spans across multiple lines).
