Module fuel_asm::macros

The impl_instructions! macro

The heart of this crate’s implementation is the private impl_instructions! macro. This macro is used to generate the Instruction and Opcode types along with their implementations.

The intention is to provide a single source of truth from which all of the instruction-related types and implementations are derived.

Its usage looks like this:

impl_instructions! {
    "Adds two registers."
    0x10 ADD add [RegId RegId RegId]
    "Bitwise ANDs two registers."
    0x11 AND and [RegId RegId RegId]
    // ...
}

Each instruction’s row includes:

  • A short docstring.
  • The Opcode byte value.
  • An uppercase identifier (for generating variants and types).
  • A lowercase identifier (for generating the shorthand instruction constructor).
  • The instruction layout (for the new and unpack functions).

The following sections describe each of the items that are derived from the impl_instructions! table in more detail.
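To make the table-driven approach concrete, here is a minimal sketch of a macro that accepts rows in the same shape and derives a small enum from them. The macro name impl_opcodes! and the mnemonic and operand_count methods are hypothetical and exist only for this illustration; the real impl_instructions! macro generates considerably more.

```rust
// Hypothetical, heavily simplified stand-in for `impl_instructions!`.
// Each row: docstring, opcode byte, uppercase ident, lowercase ident, layout.
macro_rules! impl_opcodes {
    ($($doc:literal $byte:literal $Op:ident $op:ident [$($field:ident)*])*) => {
        /// Opcode enum derived from the table rows.
        #[derive(Clone, Copy, Debug, Eq, PartialEq)]
        #[repr(u8)]
        pub enum Opcode {
            $(
                #[doc = $doc]
                $Op = $byte,
            )*
        }

        impl Opcode {
            /// Lowercase identifier from the row (illustrative helper).
            pub fn mnemonic(self) -> &'static str {
                match self {
                    $(Opcode::$Op => stringify!($op),)*
                }
            }

            /// Number of entries in the row's layout (illustrative helper).
            pub fn operand_count(self) -> usize {
                match self {
                    $(Opcode::$Op => {
                        let fields: &[&str] = &[$(stringify!($field)),*];
                        fields.len()
                    })*
                }
            }
        }
    };
}

impl_opcodes! {
    "Adds two registers."
    0x10 ADD add [RegId RegId RegId]
    "Bitwise ANDs two registers."
    0x11 AND and [RegId RegId RegId]
}
```

With this sketch, Opcode::ADD as u8 evaluates to 0x10 and Opcode::AND.mnemonic() to "and". Adding a row to the table updates every derived item at once.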

The Opcode enum

Represents the opcode portion of an instruction: the single byte that identifies the operation.

/// Solely the opcode portion of an instruction represented as a single byte.
#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
#[repr(u8)]
pub enum Opcode {
    /// Adds two registers.
    ADD = 0x10,
    /// Bitwise ANDs two registers.
    AND = 0x11,
    // ...
}

A TryFrom<u8> implementation is also provided. It returns Err(InvalidOpcode) when the byte is a reserved or undefined value.

assert_eq!(Opcode::try_from(0x10), Ok(Opcode::ADD));
assert_eq!(Opcode::try_from(0x11), Ok(Opcode::AND));
assert_eq!(Opcode::try_from(0), Err(InvalidOpcode));
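A conversion of this kind can be sketched by hand for a two-variant enum. This is an illustrative stand-in rather than the generated code: the Opcode and InvalidOpcode types below are local re-declarations, and the real implementation covers every row of the table.

```rust
use std::convert::TryFrom;

/// Local stand-in for the error returned on an unknown opcode byte.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct InvalidOpcode;

/// Two-variant stand-in for the generated `Opcode` enum.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
#[repr(u8)]
pub enum Opcode {
    ADD = 0x10,
    AND = 0x11,
}

impl TryFrom<u8> for Opcode {
    type Error = InvalidOpcode;

    fn try_from(byte: u8) -> Result<Self, Self::Error> {
        match byte {
            0x10 => Ok(Opcode::ADD),
            0x11 => Ok(Opcode::AND),
            // Any byte without a matching row is rejected.
            _ => Err(InvalidOpcode),
        }
    }
}
```

Because the enum is #[repr(u8)], the reverse direction needs no extra code: Opcode::ADD as u8 yields the byte value directly.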

The Instruction enum

Represents a single, full instruction, discriminated by its Opcode.

/// Representation of a single instruction for the interpreter.
///
/// The opcode is represented in the tag (variant), or may be retrieved in the form of an
/// `Opcode` byte using the `opcode` method.
///
/// The register and immediate data associated with the instruction is represented within
/// an inner unit type wrapper around the 3 remaining bytes.
#[derive(Clone, Copy, Eq, Hash, PartialEq)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub enum Instruction {
    /// Adds two registers.
    ADD(op::ADD),
    /// Bitwise ANDs two registers.
    AND(op::AND),
    // ...
}

The From<Instruction> for u32 (aka RawInstruction) and TryFrom<u32> for Instruction implementations can be found in the crate root.
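As a rough picture of what such conversions involve, the sketch below splits a raw 32-bit value into an opcode byte and the three remaining bytes, and joins them again. It assumes big-endian order with the opcode in the most significant byte; that ordering and the helper names split_raw and join_raw are assumptions made for illustration, not taken from the crate.

```rust
/// Split a raw 32-bit instruction into its opcode byte and 3 operand bytes.
/// Assumption: the opcode is the most significant byte (big-endian order).
pub fn split_raw(raw: u32) -> (u8, [u8; 3]) {
    let [opcode, a, b, c] = raw.to_be_bytes();
    (opcode, [a, b, c])
}

/// Inverse of `split_raw`: rebuild the raw 32-bit value.
pub fn join_raw(opcode: u8, operands: [u8; 3]) -> u32 {
    let [a, b, c] = operands;
    u32::from_be_bytes([opcode, a, b, c])
}
```

A fallible TryFrom<u32> would then validate the opcode byte before wrapping the three operand bytes in the matching operation type.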

A unique unit type per operation

To reduce the likelihood of misusing unrelated register IDs or immediate values, we generate a unique unit type for each kind of operation (i.e. each instruction variant) and guard access to the relevant register IDs and immediate values behind that type’s own methods.

These unique operation types are generated as follows within a dedicated op module:

pub mod op {
    //! Definitions and implementations for each unique instruction type, one for each
    //! unique `Opcode` variant.

    // A unique type for each operation.

    /// Adds two registers.
    pub struct ADD([u8; 3]);

    /// Bitwise ANDs two registers.
    pub struct AND([u8; 3]);

    // ...

    // An implementation for each unique type.

    impl ADD {
        pub const OPCODE: Opcode = Opcode::ADD;

        /// Construct the instruction from its parts.
        pub fn new(ra: RegId, rb: RegId, rc: RegId) -> Self {
            Self(pack::bytes_from_ra_rb_rc(ra, rb, rc))
        }

        /// Convert the instruction into its parts.
        pub fn unpack(self) -> (RegId, RegId, RegId) {
            unpack::ra_rb_rc_from_bytes(self.0)
        }
    }

    impl AND {
        // ...
    }

    // ...

    // A short-hand `Instruction` constructor for each operation to make it easier to
    // hand-write assembly for tests and benchmarking. As these constructors are public and
    // accept literal values, we check that the values are within range.

    /// Adds two registers.
    pub fn add(ra: u8, rb: u8, rc: u8) -> Instruction {
        ADD::new(check_reg_id(ra), check_reg_id(rb), check_reg_id(rc)).into()
    }

    /// Bitwise ANDs two registers.
    pub fn and(ra: u8, rb: u8, rc: u8) -> Instruction {
        AND::new(check_reg_id(ra), check_reg_id(rb), check_reg_id(rc)).into()
    }

    // ...
}
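The pack and unpack helpers referenced above are not shown on this page. The following self-contained sketch illustrates one way three register IDs could be stored in, and recovered from, three bytes. It assumes 6-bit register IDs placed at bits 18..24, 12..18 and 6..12 of a 24-bit word; the field widths, the bit positions, and the names Add and new_masked are assumptions for illustration only.

```rust
/// Hypothetical 6-bit register ID newtype standing in for the crate's `RegId`.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct RegId(u8);

impl RegId {
    /// Keep only the low 6 bits so the value fits the assumed field width.
    pub fn new_masked(raw: u8) -> Self {
        RegId(raw & 0b0011_1111)
    }
}

/// Stand-in for `op::ADD`: three opaque bytes behind typed accessors.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct Add([u8; 3]);

impl Add {
    /// Pack three register IDs into the 24 operand bits (assumed layout).
    pub fn new(ra: RegId, rb: RegId, rc: RegId) -> Self {
        let word = (u32::from(ra.0) << 18) | (u32::from(rb.0) << 12) | (u32::from(rc.0) << 6);
        // The top byte of the u32 is always zero here; keep the lower three.
        let [_, b0, b1, b2] = word.to_be_bytes();
        Add([b0, b1, b2])
    }

    /// Recover the three register IDs from the packed bytes.
    pub fn unpack(self) -> (RegId, RegId, RegId) {
        let [b0, b1, b2] = self.0;
        let word = u32::from_be_bytes([0, b0, b1, b2]);
        let field = |shift: u32| RegId(((word >> shift) & 0x3F) as u8);
        (field(18), field(12), field(6))
    }
}
```

Since the inner bytes are private, callers can only reach the operands through new and unpack, which is the guarding effect the unit types are meant to provide.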

Instruction Layout

The function signatures of the new and unpack functions are derived from the instruction’s data layout described in the impl_instructions! table.

For example, the unpack method for ADD looks like this:

// 0x10 ADD add [RegId RegId RegId]
pub fn unpack(self) -> (RegId, RegId, RegId)

While the unpack method for ADDI looks like this:

// 0x50 ADDI addi [RegId RegId Imm12]
pub fn unpack(self) -> (RegId, RegId, Imm12)

Shorthand Constructors

The shorthand instruction constructors (e.g. add, and, etc.) are designed to make it easier to handwrite assembly for tests or benchmarking. Unlike the $OP::new constructors, which require typed register ID or immediate inputs, the shorthand constructors build an Instruction from convenient literal values. For example:

use fuel_asm::{op, Instruction};

// A sample program to perform ecrecover
let program: Vec<Instruction> = vec![
    op::move_(0x10, 0x01),     // set r[0x10] := $one
    op::slli(0x20, 0x10, 5),   // set r[0x20] := `r[0x10] << 5 == 32`
    op::slli(0x21, 0x10, 6),   // set r[0x21] := `r[0x10] << 6 == 64`
    op::aloc(0x21),            // alloc `r[0x21] == 64` to the heap
    op::addi(0x10, 0x07, 1),   // set r[0x10] := `$hp + 1` (allocated heap)
    op::move_(0x11, 0x04),     // set r[0x11] := $ssp
    op::add(0x12, 0x04, 0x20), // set r[0x12] := `$ssp + r[0x20]`
    op::ecr(0x10, 0x11, 0x12), // recover public key in memory[r[0x10], 64]
    op::ret(0x01),             // return `1`
];
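Because the shorthand constructors accept plain literals, each argument has to be range-checked before it becomes a typed value. The sketch below shows what such a check might look like. It assumes a 6-bit register ID; the RegId stand-in, the exact panic message, and the try_reg_id variant are hypothetical and not taken from the crate.

```rust
/// Hypothetical stand-in for the crate's `RegId`.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct RegId(u8);

/// Sketch of a range check in the spirit of `check_reg_id`:
/// panic when the literal does not fit in 6 bits (an assumed width).
pub fn check_reg_id(raw: u8) -> RegId {
    assert!(raw < 64, "register ID {:#04x} does not fit in 6 bits", raw);
    RegId(raw)
}

/// Non-panicking variant for callers that prefer a `Result`.
/// On failure, the rejected value is returned as the error.
pub fn try_reg_id(raw: u8) -> Result<RegId, u8> {
    if raw < 64 {
        Ok(RegId(raw))
    } else {
        Err(raw)
    }
}
```

Panicking suits handwritten test programs, where an out-of-range literal is a typo to be fixed on the spot. Code that builds instructions from untrusted input would more likely want the Result form.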