- simplify cycle counting so it's easier to understand
- make the time it takes to emulate 100k cycles more regular
Memory access currently has inconsistent costs. Reading extra words while decoding an instruction costs cycles, but an indirect access like [A] doesn't. For both a "real" DCPU and the emulator (assuming typical cache behavior), memory accesses are more expensive than register accesses.
- memory accesses cost 1 cycle
- some operations take extra cycles:
- +1: MUL, MLI, STI, STD, IF*
- +2: DIV, DVI, MOD, MDI
- ... (HW*, interrupts)
- SET A, B
- 1 cycle -- read opcode
- SET [A], [B]
- 3 cycles -- read opcode, read a, write b
- MUL A, 2
- 2 cycles -- read opcode, extra execute
- MUL A, 200
- 3 cycles -- read opcode, read a, extra execute
- MUL [A], 200
- 5 cycles -- read opcode, read a, read b, extra execute, write b
- ADD/SUB should preferably be 1 cycle -- they're no more expensive than shifting.
- Instructions generally cost +1 cycle if there's any branching or extra complexity in the emulator executing them.
- Makes the DCPU16 more realistic
- Discourages weird optimizations based on DCPU weirdness
In the examples, "write b" means writing to b, the first operand, not to register B.
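
The proposed rules and the worked examples can be checked with a small calculator. This is only a sketch under stated assumptions, not emulator code: the names `cycles`, `read_a_cost` and `b_access_cost` are hypothetical, operands are limited to plain registers, `[register]` and decimal literals, and it assumes (as in the 1.7 spec) that literals from -1 to 30 in the `a` position are encoded in the opcode word while larger ones need a next-word read. It also assumes SET/STI/STD write b without reading it, and that IF* reads b without writing it.

```python
REGISTERS = {"A", "B", "C", "X", "Y", "Z", "I", "J"}

# Extra execute cycles from the proposal (IF* is handled by prefix below).
EXTRA_CYCLES = {
    "MUL": 1, "MLI": 1, "STI": 1, "STD": 1,
    "DIV": 2, "DVI": 2, "MOD": 2, "MDI": 2,
}

# Assumption: these overwrite b without reading its old value.
WRITE_ONLY_B = {"SET", "STI", "STD"}


def is_indirect(operand):
    """True for operands of the form [register], e.g. [A]."""
    return (operand.startswith("[") and operand.endswith("]")
            and operand[1:-1] in REGISTERS)


def read_a_cost(a):
    """Memory accesses needed to read the second operand (a)."""
    if a in REGISTERS:
        return 0
    if is_indirect(a):
        return 1
    value = int(a)  # ValueError for operand forms this sketch doesn't model
    # Assumption: short literals (-1..30) live in the opcode word itself.
    return 0 if -1 <= value <= 30 else 1


def b_access_cost(b):
    """Memory accesses for one read or one write of the first operand (b)."""
    if b in REGISTERS:
        return 0
    if is_indirect(b):
        return 1
    raise ValueError(f"operand form not modeled in this sketch: {b}")


def cycles(mnemonic, b, a):
    """Cycle count for `mnemonic b, a` under the proposed rules."""
    is_test = mnemonic.startswith("IF")
    total = 1                              # read opcode
    total += read_a_cost(a)                # read a
    if mnemonic not in WRITE_ONLY_B:
        total += b_access_cost(b)          # read b
    total += 1 if is_test else EXTRA_CYCLES.get(mnemonic, 0)  # extra execute
    if not is_test:
        total += b_access_cost(b)          # write b
    return total


for op, b, a in [("SET", "A", "B"), ("SET", "[A]", "[B]"), ("MUL", "A", "2"),
                 ("MUL", "A", "200"), ("MUL", "[A]", "200")]:
    print(f"{op} {b}, {a}: {cycles(op, b, a)} cycles")
```

Run as-is, the loop prints the same counts as the five examples listed above (1, 3, 2, 3 and 5 cycles). Under the same rules `cycles("ADD", "A", "B")` comes out as 1, which matches the preference for single-cycle ADD/SUB.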