Medium Risk
https://github.com/vyperlang/vyper/blob/52dc413c684532d5c4d6cdd91e3b058957cfcba0/vyper/compiler/output.py#L294-L312
When the compiler is run in -f opcodes
or -f opcodes_runtime
mode, it translates the final bytecode into opcodes. However, due to incorrect padding placed on PUSH
values, the return values will be incorrect for any bytes with leading zeros.
When the compiler is run with a target output of opcodes, we run the final bytecode through the following function:
def _build_opcodes(bytecode: bytes) -> str:
bytecode_sequence = deque(bytecode)
opcode_map = dict((v[0], k) for k, v in opcodes.get_opcodes().items())
opcode_output = []
while bytecode_sequence:
op = bytecode_sequence.popleft()
opcode_output.append(opcode_map.get(op, f"VERBATIM_{hex(op)}"))
if "PUSH" in opcode_output[-1] and opcode_output[-1] != "PUSH0":
push_len = int(opcode_map[op][4:])
# we can have push_len > len(bytecode_sequence) when there is data
# (instead of code) at end of contract
# CMC 2023-07-13 maybe just strip known data segments?
push_len = min(push_len, len(bytecode_sequence))
push_values = [hex(bytecode_sequence.popleft())[2:] for i in range(push_len)]
opcode_output.append(f"0x{''.join(push_values).upper()}")
print(opcode_output)
return " ".join(opcode_output)
This function iterates through each instruction in the bytecode and translates it to the corresponding opcode. In the case of PUSH
instructions, it parses the number of bytes to include (let's call it x
), and then assumes the following x
instructions are the value passed to PUSH
.
For each of these two byte chunks, it parses the bytes with hex(bytecode_sequence.popleft())[2:]
and joins them together.
The problem is that for two bytes that begin with a leading 0
(such as 0x05
), this simply appends the non-zero digit to the sequence. The result is a sequence that is not as long as expected by the PUSH instruction, and therefore is prepended (or appended, depending on the type) with 0s in order to reach the expected length.
Consider the following Vyper contract, with a single function that returns a bytes4 value of 0x350f872d
:
@external
def f1() -> bytes4:
return 0x350f872d
Because the second byte of the return value starts with a 0, the translation will return 0xf
instead of 0x0f
.
The result is are these incorrect opcodes returned from the compiler (see the PUSH32
instruction in the middle):
PUSH0 CALLDATALOAD PUSH1 0xE0 SHR PUSH4 0xC27FC35 DUP2 XOR PUSH2 0x03E JUMPI CALLVALUE PUSH2 0x042 JUMPI PUSH32 0x35F872D0000000000000000000000000000 PUSH1 0x40 MSTORE PUSH1 0x20 PUSH1 0x40 RETURN JUMPDEST PUSH0 PUSH0 REVERT JUMPDEST PUSH0 DUP1 REVERT
The compiler will return incorrect values when run in opcode mode and there is any PUSH instruction that includes bytes with leading zeros.
Manual Review
Ensure that the push_values
value is padded to be two digits before being joined into a bytestring.