The impl_instructions! macro
The heart of this crate’s implementation is the private impl_instructions! macro. This macro is used to generate the Instruction and Opcode types along with their implementations.
The intention is to provide a single source of truth from which each of the instruction-related types and implementations is derived.
Its usage looks like this:
impl_instructions! {
    "Adds two registers."
    0x10 ADD add [RegId RegId RegId]
    "Bitwise ANDs two registers."
    0x11 AND and [RegId RegId RegId]
    // ...
}
Each instruction’s row includes:
- A short docstring.
- The Opcode byte value.
- An uppercase identifier (for generating variants and types).
- A lowercase identifier (for generating the shorthand instruction constructor).
- The instruction layout (for the new and unpack functions).
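To illustrate the single-source-of-truth idea, here is a minimal sketch of a table-driven macro. It is not the crate's private macro: the name impl_opcodes! and the mnemonic and arity methods are hypothetical, and the sketch only derives an Opcode enum plus two lookups from rows of the same shape as above.

```rust
// Hypothetical, simplified stand-in for the private macro. It accepts rows of
// the form: docstring, opcode byte, uppercase ident, lowercase ident, layout.
macro_rules! impl_opcodes {
    ($($doc:literal $byte:literal $Op:ident $op:ident [$($field:ident)*])*) => {
        /// The opcode byte of each instruction in the table.
        #[derive(Clone, Copy, Debug, Eq, PartialEq)]
        #[repr(u8)]
        pub enum Opcode {
            $(
                #[doc = $doc]
                $Op = $byte,
            )*
        }

        impl Opcode {
            /// The lowercase shorthand name taken from the same table row.
            pub fn mnemonic(self) -> &'static str {
                match self {
                    $(Opcode::$Op => stringify!($op),)*
                }
            }

            /// The number of fields listed in the row's layout.
            pub fn arity(self) -> usize {
                match self {
                    $(Opcode::$Op => <[&str]>::len(&[$(stringify!($field)),*]),)*
                }
            }
        }
    };
}

impl_opcodes! {
    "Adds two registers."
    0x10 ADD add [RegId RegId RegId]
    "Bitwise ANDs two registers."
    0x11 AND and [RegId RegId RegId]
}
```

Because every generated item is expanded from the same rows, adding an instruction only requires adding one row to the table.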
The following sections describe each of the items that are derived from the impl_instructions! table in more detail.
The Opcode enum
Represents the opcode portion of an instruction.
/// Solely the opcode portion of an instruction represented as a single byte.
#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
#[repr(u8)]
pub enum Opcode {
    /// Adds two registers.
    ADD = 0x10,
    /// Bitwise ANDs two registers.
    AND = 0x11,
    // ...
}
A TryFrom<u8> implementation is also provided. It returns Err(InvalidOpcode) when the byte represents a reserved or undefined value.
assert_eq!(Opcode::try_from(0x10), Ok(Opcode::ADD));
assert_eq!(Opcode::try_from(0x11), Ok(Opcode::AND));
assert_eq!(Opcode::try_from(0), Err(InvalidOpcode));
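A conversion with this behaviour could be written roughly as follows. This is a sketch covering only the two opcodes shown above, with InvalidOpcode assumed to be a plain unit struct. The generated implementation may differ.

```rust
use std::convert::TryFrom;

/// Error returned for a reserved or undefined opcode byte
/// (assumed here to be a unit struct).
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct InvalidOpcode;

/// A two-variant stand-in for the generated enum.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
#[repr(u8)]
pub enum Opcode {
    ADD = 0x10,
    AND = 0x11,
}

impl TryFrom<u8> for Opcode {
    type Error = InvalidOpcode;

    fn try_from(byte: u8) -> Result<Self, Self::Error> {
        // One arm per table row; every other byte is rejected.
        match byte {
            0x10 => Ok(Opcode::ADD),
            0x11 => Ok(Opcode::AND),
            _ => Err(InvalidOpcode),
        }
    }
}
```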
The Instruction enum
Represents a single, full instruction, discriminated by its Opcode.
/// Representation of a single instruction for the interpreter.
///
/// The opcode is represented in the tag (variant), or may be retrieved in the form of an
/// `Opcode` byte using the `opcode` method.
///
/// The register and immediate data associated with the instruction is represented within
/// an inner unit type wrapper around the 3 remaining bytes.
#[derive(Clone, Copy, Eq, Hash, PartialEq)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub enum Instruction {
    /// Adds two registers.
    ADD(op::ADD),
    /// Bitwise ANDs two registers.
    AND(op::AND),
    // ...
}
The From<Instruction> for u32 (aka RawInstruction) and TryFrom<u32> for Instruction implementations can be found in the crate root.
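The round trip between an instruction and its raw u32 form can be sketched as below. The sketch assumes the opcode occupies the most significant byte and the three data bytes follow in big-endian order. That byte order, and the simplified two-variant types, are assumptions made for illustration and are not taken from the crate root.

```rust
use std::convert::TryFrom;

// Stand-ins for the generated per-operation types.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct ADD([u8; 3]);
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct AND([u8; 3]);

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Instruction {
    ADD(ADD),
    AND(AND),
}

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct InvalidOpcode;

impl From<Instruction> for u32 {
    fn from(inst: Instruction) -> u32 {
        // Pair each variant with its opcode byte and its 3 data bytes.
        let (op, [a, b, c]) = match inst {
            Instruction::ADD(ADD(bytes)) => (0x10, bytes),
            Instruction::AND(AND(bytes)) => (0x11, bytes),
        };
        // Assumption: opcode first, big-endian.
        u32::from_be_bytes([op, a, b, c])
    }
}

impl TryFrom<u32> for Instruction {
    type Error = InvalidOpcode;

    fn try_from(raw: u32) -> Result<Self, Self::Error> {
        let [op, a, b, c] = raw.to_be_bytes();
        match op {
            0x10 => Ok(Instruction::ADD(ADD([a, b, c]))),
            0x11 => Ok(Instruction::AND(AND([a, b, c]))),
            _ => Err(InvalidOpcode),
        }
    }
}
```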
A unique unit type per operation
In order to reduce the likelihood of misusing unrelated register IDs or immediate values, we generate a unique unit type for each type of operation (i.e. instruction variant) and guard access to the relevant register IDs and immediate values behind each type’s unique methods.
These unique operation types are generated as follows within a dedicated op module:
pub mod op {
    //! Definitions and implementations for each unique instruction type, one for each
    //! unique `Opcode` variant.
    // A unique type for each operation.
    /// Adds two registers.
    pub struct ADD([u8; 3]);
    /// Bitwise ANDs two registers.
    pub struct AND([u8; 3]);
    // ...
    // An implementation for each unique type.
    impl ADD {
        pub const OPCODE: Opcode = Opcode::ADD;
        /// Construct the instruction from its parts.
        pub fn new(ra: RegId, rb: RegId, rc: RegId) -> Self {
            Self(pack::bytes_from_ra_rb_rc(ra, rb, rc))
        }
        /// Convert the instruction into its parts.
        pub fn unpack(self) -> (RegId, RegId, RegId) {
            unpack::ra_rb_rc_from_bytes(self.0)
        }
    }
    impl AND {
        // ...
    }
    // ...
    // A short-hand `Instruction` constructor for each operation to make it easier to
    // hand-write assembly for tests and benchmarking. As these constructors are public and
    // accept literal values, we check that the values are within range.
    /// Adds two registers.
    pub fn add(ra: u8, rb: u8, rc: u8) -> Instruction {
        ADD::new(check_reg_id(ra), check_reg_id(rb), check_reg_id(rc)).into()
    }
    /// Bitwise ANDs two registers.
    pub fn and(ra: u8, rb: u8, rc: u8) -> Instruction {
        AND::new(check_reg_id(ra), check_reg_id(rb), check_reg_id(rc)).into()
    }
    // ...
}
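The pack and unpack helpers referenced by new and unpack are not shown in this section. A minimal sketch of what such helpers might do is given below, assuming each register ID is 6 bits wide and that ra, rb and rc occupy the top 18 bits of the 24-bit payload in that order. Both the bit widths and the field positions are assumptions of this sketch, as is the RegId::new constructor.

```rust
/// A register ID, assumed here to be 6 bits wide.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct RegId(u8);

impl RegId {
    /// Hypothetical constructor that keeps only the low 6 bits.
    pub fn new(id: u8) -> Self {
        Self(id & 0x3f)
    }
}

/// Pack three register IDs into the 3 data bytes of an instruction.
pub fn bytes_from_ra_rb_rc(ra: RegId, rb: RegId, rc: RegId) -> [u8; 3] {
    // Assumed layout of the 24-bit payload: ra | rb | rc | 6 unused bits.
    let bits = (u32::from(ra.0) << 18) | (u32::from(rb.0) << 12) | (u32::from(rc.0) << 6);
    let [_, b0, b1, b2] = bits.to_be_bytes();
    [b0, b1, b2]
}

/// Recover the three register IDs from the 3 data bytes.
pub fn ra_rb_rc_from_bytes([b0, b1, b2]: [u8; 3]) -> (RegId, RegId, RegId) {
    let bits = u32::from_be_bytes([0, b0, b1, b2]);
    // Extract one 6-bit field starting at the given bit offset.
    let reg = |shift: u32| RegId(((bits >> shift) & 0x3f) as u8);
    (reg(18), reg(12), reg(6))
}
```

Keeping the byte array private to each operation type means callers can only reach the fields through helpers like these, which is what prevents a register ID from being read as an immediate value.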
Instruction Layout
The function signatures of the new and unpack functions are derived from the instruction’s data layout described in the impl_instructions! table.
For example, the unpack method for ADD looks like this:
// 0x10 ADD add [RegId RegId RegId]
pub fn unpack(self) -> (RegId, RegId, RegId)
While the unpack method for ADDI looks like this:
// 0x50 ADDI addi [RegId RegId Imm12]
pub fn unpack(self) -> (RegId, RegId, Imm12)
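To make the difference between the two layouts concrete, the sketch below packs and unpacks a [RegId RegId Imm12] payload using raw integers. It assumes two 6-bit register fields followed by a 12-bit immediate in the low bits. The function names and the exact bit positions are hypothetical.

```rust
/// Pack two register IDs and a 12-bit immediate into 3 data bytes.
/// Out-of-range inputs are masked down to their field width.
pub fn bytes_from_ra_rb_imm12(ra: u8, rb: u8, imm12: u16) -> [u8; 3] {
    // Assumed layout of the 24-bit payload: ra (6) | rb (6) | imm12 (12).
    let bits = (u32::from(ra & 0x3f) << 18)
        | (u32::from(rb & 0x3f) << 12)
        | u32::from(imm12 & 0x0fff);
    let [_, b0, b1, b2] = bits.to_be_bytes();
    [b0, b1, b2]
}

/// Unpack the same layout back into its three fields.
pub fn ra_rb_imm12_from_bytes(bytes: [u8; 3]) -> (u8, u8, u16) {
    let [b0, b1, b2] = bytes;
    let bits = u32::from_be_bytes([0, b0, b1, b2]);
    let ra = ((bits >> 18) & 0x3f) as u8;
    let rb = ((bits >> 12) & 0x3f) as u8;
    let imm12 = (bits & 0x0fff) as u16;
    (ra, rb, imm12)
}
```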
Shorthand Constructors
The shorthand instruction constructors (e.g. add, and, etc.) are specifically designed to make it easier to hand-write assembly for tests or benchmarking. Unlike the $OP::new constructors, which require typed register ID or immediate inputs, the shorthand constructors allow for constructing Instruction values from convenient literal inputs. For example:
use fuel_asm::{op, Instruction};
// A sample program to perform ecrecover
let program: Vec<Instruction> = vec![
    op::move_(0x10, 0x01), // set r[0x10] := $one
    op::slli(0x20, 0x10, 5), // set r[0x20] := `r[0x10] << 5 == 32`
    op::slli(0x21, 0x10, 6), // set r[0x21] := `r[0x10] << 6 == 64`
    op::aloc(0x21), // alloc `r[0x21] == 64` to the heap
    op::addi(0x10, 0x07, 1), // set r[0x10] := `$hp + 1` (allocated heap)
    op::move_(0x11, 0x04), // set r[0x11] := $ssp
    op::add(0x12, 0x04, 0x20), // set r[0x12] := `$ssp + r[0x20]`
    op::ecr(0x10, 0x11, 0x12), // recover public key in memory[r[0x10], 64]
    op::ret(0x01), // return `1`
];
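Since the shorthand constructors accept plain u8 literals, each literal has to be range-checked before it becomes a typed register ID. The check_reg_id helper used by the generated constructors is not shown here. The sketch below is one plausible shape for it, assuming 6-bit register IDs and assuming that an out-of-range literal causes a panic. The to_u8 accessor is hypothetical.

```rust
/// A register ID, assumed here to be 6 bits wide.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct RegId(u8);

impl RegId {
    /// Hypothetical accessor for the underlying value.
    pub fn to_u8(self) -> u8 {
        self.0
    }
}

/// Convert a literal into a typed register ID.
/// Assumption: literals of 64 or above are rejected with a panic.
pub fn check_reg_id(id: u8) -> RegId {
    assert!(id < 64, "register ID {} is out of range", id);
    RegId(id)
}
```

Panicking is a reasonable trade-off for constructors intended for tests and benchmarks, where a bad literal is a programming error and should fail loudly.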