cranelift_codegen/machinst/abi.rs

//! Implementation of a vanilla ABI, shared between several machines. The
//! implementation here assumes that arguments will be passed in registers
//! first, then additional args on the stack; that the stack grows downward,
//! contains a standard frame (return address and frame pointer), and the
//! compiler is otherwise free to allocate space below that with its choice of
//! layout; and that the machine has some notion of caller- and callee-save
//! registers. Most modern machines, e.g. x86-64 and AArch64, should fit this
//! mold and thus both of these backends use this shared implementation.
//!
//! See the documentation in specific machine backends for the "instantiation"
//! of this generic ABI, i.e., which registers are caller/callee-save, arguments
//! and return values, and any other special requirements.
//!
//! For now the implementation here assumes a 64-bit machine, but we intend to
//! make this 32/64-bit-generic shortly.
//!
//! # Vanilla ABI
//!
//! First, arguments and return values are passed in registers up to a certain
//! fixed count, after which they overflow onto the stack. Multiple return
//! values either fit in registers, or are returned in a separate return-value
//! area on the stack, given by a hidden extra parameter.
//!
//! Note that the exact stack layout is up to us. We settled on the
//! below design based on several requirements. In particular, we need
//! to be able to generate instructions (or instruction sequences) to
//! access arguments, stack slots, and spill slots before we know how
//! many spill slots or clobber-saves there will be, because of our
//! pass structure. We also prefer positive offsets to negative
//! offsets because of an asymmetry in some machines' addressing modes
//! (e.g., on AArch64, positive offsets have a larger possible range
//! without a long-form sequence to synthesize an arbitrary
//! offset). We also need clobber-save registers to be "near" the
//! frame pointer: Windows unwind information requires the clobber-save
//! area to be within 240 bytes of RBP. Finally, it is not allowed to
//! access memory below the current SP value.
//!
//! We assume that a prologue first pushes the frame pointer (and
//! return address above that, if the machine does not do that in
//! hardware). We set FP to point to this two-word frame record. We
//! store all other frame slots below this two-word frame record, as
//! well as enough space for arguments to the largest possible
//! function call. The stack pointer then remains at this position
//! for the duration of the function, allowing us to address all
//! frame storage at positive offsets from SP.
//!
//! Note that if we ever support dynamic stack-space allocation (for
//! `alloca`), we will need a way to reference spill slots and stack
//! slots relative to a dynamic SP, because we will no longer be able
//! to know a static offset from SP to the slots at any particular
//! program point. Probably the best solution at that point will be to
//! revert to using the frame pointer as the reference for all slots,
//! to allow generating spill/reload and stackslot accesses before we
//! know how large the clobber-saves will be.
//!
//! # Stack Layout
//!
//! The stack looks like:
//!
//! ```plain
//!   (high address)
//!                              |          ...              |
//!                              | caller frames             |
//!                              |          ...              |
//!                              +===========================+
//!                              |          ...              |
//!                              | stack args                |
//! Canonical Frame Address -->  | (accessed via FP)         |
//!                              +---------------------------+
//! SP at function entry ----->  | return address            |
//!                              +---------------------------+
//! FP after prologue -------->  | FP (pushed by prologue)   |
//!                              +---------------------------+           -----
//!                              |          ...              |             |
//!                              | clobbered callee-saves    |             |
//! unwind-frame base -------->  | (pushed by prologue)      |             |
//!                              +---------------------------+   -----     |
//!                              |          ...              |     |       |
//!                              | spill slots               |     |       |
//!                              | (accessed via SP)         |   fixed   active
//!                              |          ...              |   frame    size
//!                              | stack slots               |  storage    |
//!                              | (accessed via SP)         |    size     |
//!                              | (alloc'd by prologue)     |     |       |
//!                              +---------------------------+   -----     |
//!                              | [alignment as needed]     |             |
//!                              |          ...              |             |
//!                              | args for largest call     |             |
//! SP ----------------------->  | (alloc'd by prologue)     |             |
//!                              +===========================+           -----
//!
//!   (low address)
//! ```
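//!
//! For example, while the frame is active, a sized stack slot at offset
//! `off` within the fixed-frame storage area is addressed at
//! `SP + outgoing_args_size + off`; `FrameLayout::sp_to_sized_stack_slots()`
//! below returns exactly that `outgoing_args_size` base.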
//!
//! # Multi-value Returns
//!
//! We support multi-value returns by using multiple return-value
//! registers. In some cases this is an extension of the base system
//! ABI. See each platform's `abi.rs` implementation for details.

use crate::entity::SecondaryMap;
use crate::ir::types::*;
use crate::ir::{ArgumentExtension, ArgumentPurpose, Signature};
use crate::isa::TargetIsa;
use crate::settings::ProbestackStrategy;
use crate::CodegenError;
use crate::{ir, isa};
use crate::{machinst::*, trace};
use regalloc2::{MachineEnv, PReg, PRegSet};
use rustc_hash::FxHashMap;
use smallvec::smallvec;
use std::collections::HashMap;
use std::marker::PhantomData;
use std::mem;

/// A small vector of instructions (with some reasonable size); appropriate for
/// a small fixed sequence implementing one operation.
pub type SmallInstVec<I> = SmallVec<[I; 4]>;

/// A type used by backends to track argument-binding info in the "args"
/// pseudoinst. The pseudoinst holds a vec of `ArgPair` structs.
#[derive(Clone, Debug)]
pub struct ArgPair {
    /// The vreg that is defined by this args pseudoinst.
    pub vreg: Writable<Reg>,
    /// The preg that the arg arrives in; this constrains the vreg's
    /// placement at the pseudoinst.
    pub preg: Reg,
}

/// A type used by backends to track return register binding info in the "ret"
/// pseudoinst. The pseudoinst holds a vec of `RetPair` structs.
#[derive(Clone, Debug)]
pub struct RetPair {
    /// The vreg that is returned by this pseudoinst.
    pub vreg: Reg,
    /// The preg that the value is returned through; this constrains the vreg's
    /// placement at the pseudoinst.
    pub preg: Reg,
}

/// A location for (part of) an argument or return value. These "storage slots"
/// are specified for each register-sized part of an argument.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ABIArgSlot {
    /// In a real register.
    Reg {
        /// Register that holds this arg.
        reg: RealReg,
        /// Value type of this arg.
        ty: ir::Type,
        /// Should this arg be zero- or sign-extended?
        extension: ir::ArgumentExtension,
    },
    /// Arguments only: on stack, at given offset from SP at entry.
    Stack {
        /// Offset of this arg relative to the base of stack args.
        offset: i64,
        /// Value type of this arg.
        ty: ir::Type,
        /// Should this arg be zero- or sign-extended?
        extension: ir::ArgumentExtension,
    },
}

impl ABIArgSlot {
    /// The type of the value that will be stored in this slot.
    pub fn get_type(&self) -> ir::Type {
        match self {
            ABIArgSlot::Reg { ty, .. } => *ty,
            ABIArgSlot::Stack { ty, .. } => *ty,
        }
    }
}

/// A vector of `ABIArgSlot`s. Inline capacity for one element because basically
/// 100% of values use one slot. Only `i128`s need multiple slots, and they are
/// super rare (and never happen with Wasm).
pub type ABIArgSlotVec = SmallVec<[ABIArgSlot; 1]>;

/// An ABIArg is composed of one or more parts. This allows for a CLIF-level
/// Value to be passed with its parts in more than one location at the ABI
/// level. For example, a 128-bit integer may be passed in two 64-bit registers,
/// or even a 64-bit register and a 64-bit stack slot, on a 64-bit machine. The
/// number of "parts" should correspond to the number of registers used to store
/// this type according to the machine backend.
///
/// As an invariant, the `purpose` for every part must match. As a further
/// invariant, a `StructArg` part cannot appear with any other part.
#[derive(Clone, Debug)]
pub enum ABIArg {
    /// Storage slots (registers or stack locations) for each part of the
    /// argument value. The number of slots must equal the number of register
    /// parts used to store a value of this type.
    Slots {
        /// Slots, one per register part.
        slots: ABIArgSlotVec,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
    /// Structure argument. We reserve stack space for it, but the CLIF-level
    /// semantics are a little weird: the value passed to the call instruction,
    /// and received in the corresponding block param, is a *pointer*. On the
    /// caller side, we memcpy the data from the passed-in pointer to the stack
    /// area; on the callee side, we compute a pointer to this stack area and
    /// provide that as the argument's value.
    StructArg {
        /// Offset of this arg relative to base of stack args.
        offset: i64,
        /// Size of this arg on the stack.
        size: u64,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
    /// Implicit argument. Similar to a StructArg, except that we have the
    /// target type, not a pointer type, at the CLIF level. This argument is
    /// still passed by reference implicitly.
    ImplicitPtrArg {
        /// Register or stack slot holding a pointer to the buffer.
        pointer: ABIArgSlot,
        /// Offset of the argument buffer.
        offset: i64,
        /// Type of the implicit argument.
        ty: Type,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
}

impl ABIArg {
    /// Create an ABIArg from one register.
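    ///
    /// A minimal sketch of a call site; `x0_real_reg` is a hypothetical
    /// value, since real callers obtain a `RealReg` from the backend's
    /// register definitions:
    ///
    /// ```ignore
    /// let arg = ABIArg::reg(
    ///     x0_real_reg, // hypothetical backend-provided `RealReg`
    ///     ir::types::I64,
    ///     ir::ArgumentExtension::None,
    ///     ir::ArgumentPurpose::Normal,
    /// );
    /// ```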
    pub fn reg(
        reg: RealReg,
        ty: ir::Type,
        extension: ir::ArgumentExtension,
        purpose: ir::ArgumentPurpose,
    ) -> ABIArg {
        ABIArg::Slots {
            slots: smallvec![ABIArgSlot::Reg { reg, ty, extension }],
            purpose,
        }
    }

    /// Create an ABIArg from one stack slot.
    pub fn stack(
        offset: i64,
        ty: ir::Type,
        extension: ir::ArgumentExtension,
        purpose: ir::ArgumentPurpose,
    ) -> ABIArg {
        ABIArg::Slots {
            slots: smallvec![ABIArgSlot::Stack {
                offset,
                ty,
                extension,
            }],
            purpose,
        }
    }
}

/// Are we computing information about arguments or return values? Much of the
/// handling is factored out into common routines; this enum allows us to
/// distinguish which case we're handling.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ArgsOrRets {
    /// Arguments.
    Args,
    /// Return values.
    Rets,
}

/// Abstract location for a machine-specific ABI impl to translate into the
/// appropriate addressing mode.
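///
/// These correspond to regions in the stack diagram at the top of this file:
/// `IncomingArg` addresses the caller-provided `stack args` area, `Slot` the
/// fixed-frame storage (stack and spill slots), and `OutgoingArg` the
/// `args for largest call` area at the bottom of the frame.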
#[derive(Clone, Copy, Debug)]
pub enum StackAMode {
    /// Offset into the current frame's argument area.
    IncomingArg(i64, u32),
    /// Offset within the stack slots in the current frame.
    Slot(i64),
    /// Offset into the callee frame's argument area.
    OutgoingArg(i64),
}

/// Trait implemented by machine-specific backend to represent ISA flags.
pub trait IsaFlags: Clone {
    /// Get a flag indicating whether forward-edge CFI is enabled.
    fn is_forward_edge_cfi_enabled(&self) -> bool {
        false
    }
}

/// Used as an out-parameter to accumulate a sequence of `ABIArg`s in
/// `ABIMachineSpec::compute_arg_locs`. Wraps the shared allocation for all
/// `ABIArg`s in `SigSet` and exposes just the args for the current
/// `compute_arg_locs` call.
pub struct ArgsAccumulator<'a> {
    sig_set_abi_args: &'a mut Vec<ABIArg>,
    start: usize,
    non_formal_flag: bool,
}

impl<'a> ArgsAccumulator<'a> {
    fn new(sig_set_abi_args: &'a mut Vec<ABIArg>) -> Self {
        let start = sig_set_abi_args.len();
        ArgsAccumulator {
            sig_set_abi_args,
            start,
            non_formal_flag: false,
        }
    }

    /// Push an `ABIArg` for the next formal parameter.
    #[inline]
    pub fn push(&mut self, arg: ABIArg) {
        debug_assert!(!self.non_formal_flag);
        self.sig_set_abi_args.push(arg)
    }

    /// Push an `ABIArg` for a non-formal (synthetic) argument, such as a
    /// return-area pointer. No formal arguments may be pushed after this.
    #[inline]
    pub fn push_non_formal(&mut self, arg: ABIArg) {
        self.non_formal_flag = true;
        self.sig_set_abi_args.push(arg)
    }

    /// Get the args pushed so far by the current `compute_arg_locs` call.
    #[inline]
    pub fn args(&self) -> &[ABIArg] {
        &self.sig_set_abi_args[self.start..]
    }

    /// Get mutable access to the args pushed so far.
    #[inline]
    pub fn args_mut(&mut self) -> &mut [ABIArg] {
        &mut self.sig_set_abi_args[self.start..]
    }
}

/// Trait implemented by machine-specific backend to provide information about
/// register assignments and to allow generating the specific instructions for
/// stack loads/saves, prologues/epilogues, etc.
pub trait ABIMachineSpec {
    /// The instruction type.
    type I: VCodeInst;

    /// The ISA flags type.
    type F: IsaFlags;

    /// This is the limit for the size of argument and return-value areas on the
    /// stack. We place a reasonable limit here to avoid integer overflow issues
    /// with 32-bit arithmetic.
    const STACK_ARG_RET_SIZE_LIMIT: u32;

    /// Returns the number of bits in a word, i.e., 32 or 64 for a 32- or
    /// 64-bit architecture.
    fn word_bits() -> u32;

    /// Returns the number of bytes in a word.
    fn word_bytes() -> u32 {
        Self::word_bits() / 8
    }

    /// Returns the word-size integer type.
    fn word_type() -> Type {
        match Self::word_bits() {
            32 => I32,
            64 => I64,
            _ => unreachable!(),
        }
    }

    /// Returns the word register class.
    fn word_reg_class() -> RegClass {
        RegClass::Int
    }

    /// Returns the required stack alignment in bytes.
    fn stack_align(call_conv: isa::CallConv) -> u32;

    /// Process a list of parameters or return values and allocate them to registers
    /// and stack slots.
    ///
    /// The argument locations should be pushed onto the given `ArgsAccumulator`
    /// in order. Any extra arguments added (such as return area pointers)
    /// should come at the end of the list so that the first N lowered
    /// parameters align with the N clif parameters.
    ///
    /// Returns the stack space used (rounded up as alignment requires), and,
    /// if `add_ret_area_ptr` was passed, the index of the extra synthetic arg
    /// that was added.
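    ///
    /// A schematic sketch of a backend implementation, for a hypothetical
    /// ABI that passes every parameter in a word-sized stack slot (a real
    /// backend assigns registers first and must handle `add_ret_area_ptr`;
    /// `checked_round_up` here stands in for the backend's own alignment
    /// helper):
    ///
    /// ```ignore
    /// fn compute_arg_locs(
    ///     call_conv: isa::CallConv,
    ///     _flags: &settings::Flags,
    ///     params: &[ir::AbiParam],
    ///     _args_or_rets: ArgsOrRets,
    ///     _add_ret_area_ptr: bool,
    ///     mut args: ArgsAccumulator,
    /// ) -> CodegenResult<(u32, Option<usize>)> {
    ///     let mut next_stack: u32 = 0;
    ///     for param in params {
    ///         // One word-sized, word-aligned stack slot per parameter.
    ///         args.push(ABIArg::stack(
    ///             next_stack as i64,
    ///             param.value_type,
    ///             param.extension,
    ///             param.purpose,
    ///         ));
    ///         next_stack += Self::word_bytes();
    ///     }
    ///     // Round the used space up to the ABI stack alignment.
    ///     let size = checked_round_up(next_stack, Self::stack_align(call_conv) - 1)
    ///         .ok_or(CodegenError::ImplLimitExceeded)?;
    ///     Ok((size, None))
    /// }
    /// ```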
    fn compute_arg_locs(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        params: &[ir::AbiParam],
        args_or_rets: ArgsOrRets,
        add_ret_area_ptr: bool,
        args: ArgsAccumulator,
    ) -> CodegenResult<(u32, Option<usize>)>;

    /// Generate a load from the stack.
    fn gen_load_stack(mem: StackAMode, into_reg: Writable<Reg>, ty: Type) -> Self::I;

    /// Generate a store to the stack.
    fn gen_store_stack(mem: StackAMode, from_reg: Reg, ty: Type) -> Self::I;

    /// Generate a move.
    fn gen_move(to_reg: Writable<Reg>, from_reg: Reg, ty: Type) -> Self::I;

    /// Generate an integer-extend operation.
    fn gen_extend(
        to_reg: Writable<Reg>,
        from_reg: Reg,
        is_signed: bool,
        from_bits: u8,
        to_bits: u8,
    ) -> Self::I;

    /// Generate an "args" pseudo-instruction to capture input args in
    /// registers.
    fn gen_args(args: Vec<ArgPair>) -> Self::I;

    /// Generate a "rets" pseudo-instruction that moves vregs to return
    /// registers.
    fn gen_rets(rets: Vec<RetPair>) -> Self::I;

    /// Generate an add-with-immediate. Note that even if this uses a scratch
    /// register, it must satisfy two requirements:
    ///
    /// - The add-imm sequence must only clobber caller-save registers that are
    ///   not used for arguments, because it will be placed in the prologue
    ///   before the clobbered callee-save registers are saved.
    ///
    /// - The add-imm sequence must work correctly when `from_reg` and/or
    ///   `into_reg` are the register returned by `get_stacklimit_reg()`.
    fn gen_add_imm(
        call_conv: isa::CallConv,
        into_reg: Writable<Reg>,
        from_reg: Reg,
        imm: u32,
    ) -> SmallInstVec<Self::I>;

    /// Generate a sequence that traps with a `TrapCode::StackOverflow` code if
    /// the stack pointer is less than the given limit register (assuming the
    /// stack grows downward).
    fn gen_stack_lower_bound_trap(limit_reg: Reg) -> SmallInstVec<Self::I>;

    /// Generate an instruction to compute an address of a stack slot (FP- or
    /// SP-based offset).
    fn gen_get_stack_addr(mem: StackAMode, into_reg: Writable<Reg>) -> Self::I;

    /// Get a fixed register to use to compute a stack limit. This is needed for
    /// certain sequences generated after the register allocator has already
    /// run. This must satisfy two requirements:
    ///
    /// - It must be a caller-save register that is not used for arguments,
    ///   because it will be clobbered in the prologue before the clobbered
    ///   callee-save registers are saved.
    ///
    /// - It must be safe to pass as an argument and/or destination to
    ///   `gen_add_imm()`. This is relevant when an addition with a large
    ///   immediate needs its own temporary; it cannot use the same fixed
    ///   temporary as this one.
    fn get_stacklimit_reg(call_conv: isa::CallConv) -> Reg;

    /// Generate a load from the given [base+offset] address.
    fn gen_load_base_offset(into_reg: Writable<Reg>, base: Reg, offset: i32, ty: Type) -> Self::I;

    /// Generate a store to the given [base+offset] address.
    fn gen_store_base_offset(base: Reg, offset: i32, from_reg: Reg, ty: Type) -> Self::I;

    /// Adjust the stack pointer up or down.
    fn gen_sp_reg_adjust(amount: i32) -> SmallInstVec<Self::I>;

    /// Compute a FrameLayout structure containing a sorted list of all clobbered
    /// registers that are callee-saved according to the ABI, as well as the sizes
    /// of all parts of the stack frame.  The result is used to emit the prologue
    /// and epilogue routines.
    fn compute_frame_layout(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        sig: &Signature,
        regs: &[Writable<RealReg>],
        is_leaf: bool,
        incoming_args_size: u32,
        tail_args_size: u32,
        fixed_frame_storage_size: u32,
        outgoing_args_size: u32,
    ) -> FrameLayout;

    /// Generate the usual frame-setup sequence for this architecture: e.g.,
    /// `push rbp / mov rbp, rsp` on x86-64, or `stp fp, lr, [sp, #-16]!` on
    /// AArch64.
    fn gen_prologue_frame_setup(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate the usual frame-restore sequence for this architecture.
    fn gen_epilogue_frame_restore(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate a return instruction.
    fn gen_return(
        call_conv: isa::CallConv,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate a probestack call.
    fn gen_probestack(insts: &mut SmallInstVec<Self::I>, frame_size: u32);

    /// Generate an inline stack probe.
    fn gen_inline_probestack(
        insts: &mut SmallInstVec<Self::I>,
        call_conv: isa::CallConv,
        frame_size: u32,
        guard_size: u32,
    );

    /// Generate a clobber-save sequence. The implementation here should return
    /// a sequence of instructions that "push" or otherwise save to the stack all
    /// registers written/modified by the function body that are callee-saved.
    /// The sequence of instructions should adjust the stack pointer downward,
    /// and should align as necessary according to ABI requirements.
    fn gen_clobber_save(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        frame_layout: &FrameLayout,
    ) -> SmallVec<[Self::I; 16]>;

    /// Generate a clobber-restore sequence. This sequence should perform the
    /// opposite of the clobber-save sequence generated above, assuming that SP
    /// going into the sequence is at the same point that it was left when the
    /// clobber-save sequence finished.
    fn gen_clobber_restore(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        frame_layout: &FrameLayout,
    ) -> SmallVec<[Self::I; 16]>;

    /// Generate a call instruction/sequence. This method is provided one
    /// temporary register to use to synthesize the called address, if needed.
    fn gen_call(dest: &CallDest, tmp: Writable<Reg>, info: CallInfo<()>) -> SmallVec<[Self::I; 2]>;

    /// Generate a memcpy invocation. Used to set up struct
    /// args. Takes `src` and `dst` as read-only inputs and a
    /// temporary-register allocator.
    fn gen_memcpy<F: FnMut(Type) -> Writable<Reg>>(
        call_conv: isa::CallConv,
        dst: Reg,
        src: Reg,
        size: usize,
        alloc_tmp: F,
    ) -> SmallVec<[Self::I; 8]>;

    /// Get the number of spillslots required for the given register-class.
    fn get_number_of_spillslots_for_value(
        rc: RegClass,
        target_vector_bytes: u32,
        isa_flags: &Self::F,
    ) -> u32;

    /// Get the ABI-dependent MachineEnv for managing register allocation.
    fn get_machine_env(flags: &settings::Flags, call_conv: isa::CallConv) -> &MachineEnv;

    /// Get all caller-save registers, that is, registers that we expect
    /// not to be saved across a call to a callee with the given ABI.
    fn get_regs_clobbered_by_call(call_conv_of_callee: isa::CallConv) -> PRegSet;

    /// Get the needed extension mode, given the mode attached to the argument
    /// in the signature and the calling convention. The input (the attribute in
    /// the signature) specifies what extension type should be done *if* the ABI
    /// requires extension to the full register; this method's return value
    /// indicates whether the extension actually *will* be done.
    fn get_ext_mode(
        call_conv: isa::CallConv,
        specified: ir::ArgumentExtension,
    ) -> ir::ArgumentExtension;
}

/// Out-of-line data for calls, to keep the size of `Inst` down.
#[derive(Clone, Debug)]
pub struct CallInfo<T> {
    /// Receiver of this call.
    pub dest: T,
    /// Register uses of this call.
    pub uses: CallArgList,
    /// Register defs of this call.
    pub defs: CallRetList,
    /// Registers clobbered by this call, as per its calling convention.
    pub clobbers: PRegSet,
    /// The calling convention of the callee.
    pub callee_conv: isa::CallConv,
    /// The calling convention of the caller.
    pub caller_conv: isa::CallConv,
    /// The number of bytes that the callee will pop from the stack for the
    /// caller, if any. (Used for popping stack arguments with the `tail`
    /// calling convention.)
    pub callee_pop_size: u32,
}

impl<T> CallInfo<T> {
    /// Creates an empty set of info with no clobbers/uses/etc. with the
    /// specified ABI.
    pub fn empty(dest: T, call_conv: isa::CallConv) -> CallInfo<T> {
        CallInfo {
            dest,
            uses: smallvec![],
            defs: smallvec![],
            clobbers: PRegSet::empty(),
            caller_conv: call_conv,
            callee_conv: call_conv,
            callee_pop_size: 0,
        }
    }

    /// Change the `T` payload on this info to `U`.
    pub fn map<U>(self, f: impl FnOnce(T) -> U) -> CallInfo<U> {
        CallInfo {
            dest: f(self.dest),
            uses: self.uses,
            defs: self.defs,
            clobbers: self.clobbers,
            caller_conv: self.caller_conv,
            callee_conv: self.callee_conv,
            callee_pop_size: self.callee_pop_size,
        }
    }
}

/// The id of an ABI signature within the `SigSet`.
#[derive(Copy, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Sig(u32);
cranelift_entity::entity_impl!(Sig);

impl Sig {
    fn prev(self) -> Option<Sig> {
        self.0.checked_sub(1).map(Sig)
    }
}

/// ABI information shared between body (callee) and caller.
#[derive(Clone, Debug)]
pub struct SigData {
    /// Currently both return values and arguments are stored contiguously in
    /// a single vector, `SigSet::abi_args`:
    ///
    /// ```plain
    ///                  +----------------------------------------------+
    ///                  | return values                                |
    ///                  | ...                                          |
    ///   rets_end   --> +----------------------------------------------+
    ///                  | arguments                                    |
    ///                  | ...                                          |
    ///   args_end   --> +----------------------------------------------+
    ///
    /// ```
    ///
    /// Note that we only store the two end offsets: `rets_end` doubles as the
    /// start of the args, and the rets start at the previous signature's
    /// `args_end` (or 0 for the first signature).
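    ///
    /// For example, if signature `A` is interned first with two rets and
    /// three args, then `A.rets_end == 2` and `A.args_end == 5`; a signature
    /// `B` interned next with one ret and one arg gets `B.rets_end == 6` and
    /// `B.args_end == 7`, with `B`'s rets starting at `A.args_end == 5`.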
    ///
    /// Argument location ending offset (regs or stack slots). Stack offsets are relative to
    /// SP on entry to function.
    ///
    /// This is an index into `SigSet::abi_args`.
    args_end: u32,

    /// Return-value location ending offset. Stack offsets are relative to the return-area
    /// pointer.
    ///
    /// This is an index into `SigSet::abi_args`.
    rets_end: u32,

    /// Space on stack used to store arguments. We're storing the size in u32 to
    /// reduce the size of the struct.
    sized_stack_arg_space: u32,

    /// Space on stack used to store return values. We're storing the size in u32 to
    /// reduce the size of the struct.
    sized_stack_ret_space: u32,

    /// Index in `args` of the stack-return-value-area argument.
    stack_ret_arg: Option<u16>,

    /// Calling convention used.
    call_conv: isa::CallConv,
}

impl SigData {
    /// Get total stack space required for arguments.
    pub fn sized_stack_arg_space(&self) -> i64 {
        self.sized_stack_arg_space.into()
    }

    /// Get total stack space required for return values.
    pub fn sized_stack_ret_space(&self) -> i64 {
        self.sized_stack_ret_space.into()
    }

    /// Get calling convention used.
    pub fn call_conv(&self) -> isa::CallConv {
        self.call_conv
    }

    /// The index of the stack-return-value-area argument, if any.
    pub fn stack_ret_arg(&self) -> Option<u16> {
        self.stack_ret_arg
    }
}

/// A (mostly) deduplicated set of ABI signatures.
///
/// We say "mostly" because we do not dedupe between signatures interned via
/// `ir::SigRef` (direct and indirect calls; the vast majority of signatures in
/// this set) vs via `ir::Signature` (the callee itself and libcalls). Doing
/// this final bit of deduplication would require filling out the
/// `ir_signature_to_abi_sig` map, which is a bunch of allocations (not just the
/// hash map itself but the params and returns vecs in each signature) that we
/// want to avoid.
///
/// In general, prefer using the `ir::SigRef`-taking methods to the
/// `ir::Signature`-taking methods when you can get away with it, as they don't
/// require cloning non-copy types that will trigger heap allocations.
///
/// This type can be indexed by `Sig` to access its associated `SigData`.
pub struct SigSet {
    /// Interned `ir::Signature`s that we already have an ABI signature for.
    ir_signature_to_abi_sig: FxHashMap<ir::Signature, Sig>,

    /// Interned `ir::SigRef`s that we already have an ABI signature for.
    ir_sig_ref_to_abi_sig: SecondaryMap<ir::SigRef, Option<Sig>>,

    /// A single, shared allocation for all `ABIArg`s used by all
    /// `SigData`s. Each `SigData` references its args/rets via indices into
    /// this allocation.
    abi_args: Vec<ABIArg>,

    /// The actual ABI signatures, keyed by `Sig`.
    sigs: PrimaryMap<Sig, SigData>,
}

impl SigSet {
    /// Construct a new `SigSet`, interning all of the signatures used by the
    /// given function.
    pub fn new<M>(func: &ir::Function, flags: &settings::Flags) -> CodegenResult<Self>
    where
        M: ABIMachineSpec,
    {
        // Rough capacity estimate: most signatures need only a handful of
        // arg/ret slots.
        let arg_estimate = func.dfg.signatures.len() * 6;

        let mut sigs = SigSet {
            ir_signature_to_abi_sig: FxHashMap::default(),
            ir_sig_ref_to_abi_sig: SecondaryMap::with_capacity(func.dfg.signatures.len()),
            abi_args: Vec::with_capacity(arg_estimate),
            sigs: PrimaryMap::with_capacity(1 + func.dfg.signatures.len()),
        };

        sigs.make_abi_sig_from_ir_signature::<M>(func.signature.clone(), flags)?;
        for sig_ref in func.dfg.signatures.keys() {
            sigs.make_abi_sig_from_ir_sig_ref::<M>(sig_ref, &func.dfg, flags)?;
        }

        Ok(sigs)
    }

    /// Have we already interned an ABI signature for the given `ir::Signature`?
    pub fn have_abi_sig_for_signature(&self, signature: &ir::Signature) -> bool {
        self.ir_signature_to_abi_sig.contains_key(signature)
    }

    /// Construct and intern an ABI signature for the given `ir::Signature`.
    pub fn make_abi_sig_from_ir_signature<M>(
        &mut self,
        signature: ir::Signature,
        flags: &settings::Flags,
    ) -> CodegenResult<Sig>
    where
        M: ABIMachineSpec,
    {
        // Because the `HashMap` entry API requires taking ownership of the
        // lookup key -- and we want to avoid unnecessary clones of
        // `ir::Signature`s, even at the cost of duplicate lookups -- we can't
        // have a single, get-or-create-style method for interning
        // `ir::Signature`s into ABI signatures. So at least (debug) assert that
        // we aren't creating duplicate ABI signatures for the same
        // `ir::Signature`.
        debug_assert!(!self.have_abi_sig_for_signature(&signature));

        let sig_data = self.from_func_sig::<M>(&signature, flags)?;
        let sig = self.sigs.push(sig_data);
        self.ir_signature_to_abi_sig.insert(signature, sig);
        Ok(sig)
    }

    fn make_abi_sig_from_ir_sig_ref<M>(
        &mut self,
        sig_ref: ir::SigRef,
        dfg: &ir::DataFlowGraph,
        flags: &settings::Flags,
    ) -> CodegenResult<Sig>
    where
        M: ABIMachineSpec,
    {
        if let Some(sig) = self.ir_sig_ref_to_abi_sig[sig_ref] {
            return Ok(sig);
        }
        let signature = &dfg.signatures[sig_ref];
        let sig_data = self.from_func_sig::<M>(signature, flags)?;
        let sig = self.sigs.push(sig_data);
        self.ir_sig_ref_to_abi_sig[sig_ref] = Some(sig);
        Ok(sig)
    }

    /// Get the already-interned ABI signature id for the given `ir::SigRef`.
    pub fn abi_sig_for_sig_ref(&self, sig_ref: ir::SigRef) -> Sig {
        self.ir_sig_ref_to_abi_sig[sig_ref]
            .expect("must call `make_abi_sig_from_ir_sig_ref` before `abi_sig_for_sig_ref`")
    }

    /// Get the already-interned ABI signature id for the given `ir::Signature`.
    pub fn abi_sig_for_signature(&self, signature: &ir::Signature) -> Sig {
        self.ir_signature_to_abi_sig
            .get(signature)
            .copied()
            .expect("must call `make_abi_sig_from_ir_signature` before `abi_sig_for_signature`")
    }

    /// Build a `SigData` for the given `ir::Signature`, pushing its argument
    /// and return-value locations into this `SigSet`'s shared `abi_args`
    /// allocation.
    pub fn from_func_sig<M: ABIMachineSpec>(
        &mut self,
        sig: &ir::Signature,
        flags: &settings::Flags,
    ) -> CodegenResult<SigData> {
        // Keep in sync with ensure_struct_return_ptr_is_returned
        if sig.uses_special_return(ArgumentPurpose::StructReturn) {
            panic!("Explicit StructReturn return value not allowed: {sig:?}")
        }
        let tmp;
        let returns = if let Some(struct_ret_index) =
            sig.special_param_index(ArgumentPurpose::StructReturn)
        {
            if !sig.returns.is_empty() {
                panic!("No return values are allowed when using StructReturn: {sig:?}");
            }
            tmp = [sig.params[struct_ret_index]];
            &tmp
        } else {
            sig.returns.as_slice()
        };

        // Compute args and retvals from signature. Handle retvals first,
        // because we may need to add a return-area arg to the args.

        // NOTE: We rely on the order of insertion (rets, then args) to compute
        // the offsets in `SigSet::args()` and `SigSet::rets()`. Therefore, the
        // two `compute_arg_locs` calls below cannot be reordered.
        let (sized_stack_ret_space, _) = M::compute_arg_locs(
            sig.call_conv,
            flags,
            &returns,
            ArgsOrRets::Rets,
            /* extra ret-area ptr = */ false,
            ArgsAccumulator::new(&mut self.abi_args),
        )?;
        if !flags.enable_multi_ret_implicit_sret() {
            assert_eq!(sized_stack_ret_space, 0);
        }
        let rets_end = u32::try_from(self.abi_args.len()).unwrap();

        // To avoid overflow issues, limit the return size to something reasonable.
        if sized_stack_ret_space > M::STACK_ARG_RET_SIZE_LIMIT {
            return Err(CodegenError::ImplLimitExceeded);
        }

        let need_stack_return_area = sized_stack_ret_space > 0;
        if need_stack_return_area {
            assert!(!sig.uses_special_param(ir::ArgumentPurpose::StructReturn));
        }

        let (sized_stack_arg_space, stack_ret_arg) = M::compute_arg_locs(
            sig.call_conv,
            flags,
            &sig.params,
            ArgsOrRets::Args,
            need_stack_return_area,
            ArgsAccumulator::new(&mut self.abi_args),
        )?;
        let args_end = u32::try_from(self.abi_args.len()).unwrap();

        // To avoid overflow issues, limit the arg size to something reasonable.
        if sized_stack_arg_space > M::STACK_ARG_RET_SIZE_LIMIT {
            return Err(CodegenError::ImplLimitExceeded);
        }

        let stack_ret_arg = stack_ret_arg.map(|s| u16::try_from(s).unwrap());

        trace!(
            "ABISig: sig {:?} => args end = {} rets end = {}
             arg stack = {} ret stack = {} stack_ret_arg = {:?}",
            sig,
            args_end,
            rets_end,
            sized_stack_arg_space,
            sized_stack_ret_space,
            stack_ret_arg,
        );

        Ok(SigData {
            args_end,
            rets_end,
            sized_stack_arg_space,
            sized_stack_ret_space,
            stack_ret_arg,
            call_conv: sig.call_conv,
        })
    }

    /// Get this signature's ABI arguments.
    pub fn args(&self, sig: Sig) -> &[ABIArg] {
        let sig_data = &self.sigs[sig];
        // See the comments on `SigData` for how the offsets are stored.
        let start = usize::try_from(sig_data.rets_end).unwrap();
        let end = usize::try_from(sig_data.args_end).unwrap();
        &self.abi_args[start..end]
    }

    /// Get information specifying how to pass the implicit pointer
    /// to the return-value area on the stack, if required.
    pub fn get_ret_arg(&self, sig: Sig) -> Option<ABIArg> {
        let sig_data = &self.sigs[sig];
        if let Some(i) = sig_data.stack_ret_arg {
            Some(self.args(sig)[usize::from(i)].clone())
        } else {
            None
        }
    }

    /// Get information specifying how to pass one argument.
    pub fn get_arg(&self, sig: Sig, idx: usize) -> ABIArg {
        self.args(sig)[idx].clone()
    }

    /// Get this signature's ABI returns.
    pub fn rets(&self, sig: Sig) -> &[ABIArg] {
        let sig_data = &self.sigs[sig];
        // See the comments on `SigData` for how the offsets are stored.
        let start = usize::try_from(sig.prev().map_or(0, |prev| self.sigs[prev].args_end)).unwrap();
        let end = usize::try_from(sig_data.rets_end).unwrap();
        &self.abi_args[start..end]
    }

    /// Get information specifying how to pass one return value.
    pub fn get_ret(&self, sig: Sig, idx: usize) -> ABIArg {
        self.rets(sig)[idx].clone()
    }

    /// Get the number of arguments expected.
    pub fn num_args(&self, sig: Sig) -> usize {
        let len = self.args(sig).len();
        if self.sigs[sig].stack_ret_arg.is_some() {
            len - 1
        } else {
            len
        }
    }

    /// Get the number of return values expected.
    pub fn num_rets(&self, sig: Sig) -> usize {
        self.rets(sig).len()
    }
}

// NB: we do _not_ implement `IndexMut` because these signatures are
// deduplicated and shared!
impl std::ops::Index<Sig> for SigSet {
    type Output = SigData;

    fn index(&self, sig: Sig) -> &Self::Output {
        &self.sigs[sig]
    }
}

/// Structure describing the layout of a function's stack frame.
#[derive(Clone, Debug, Default)]
pub struct FrameLayout {
    /// N.B. The areas whose sizes are given in this structure fully
    /// cover the current function's stack frame, from high to low
    /// stack addresses in the sequence below.  Each size contains
    /// any alignment padding that may be required by the ABI.

    /// Size of incoming arguments on the stack.  This is not technically
    /// part of this function's frame, but code in the function will still
    /// need to access it.  Depending on the ABI, we may need to set up a
    /// frame pointer to do so; we also may need to pop this area from the
    /// stack upon return.
    pub incoming_args_size: u32,

    /// The size of the incoming argument area, taking into account any
    /// potential increase in size required for tail calls present in the
    /// function. In the case that no tail calls are present, this value
    /// will be the same as [`Self::incoming_args_size`].
    pub tail_args_size: u32,

    /// Size of the "setup area", typically holding the return address
    /// and/or the saved frame pointer.  This may be written either during
    /// the call itself (e.g. a pushed return address) or by code emitted
    /// from gen_prologue_frame_setup.  In any case, after that code has
    /// completed execution, the stack pointer is expected to point to the
    /// bottom of this area.  The same holds at the start of code emitted
    /// by gen_epilogue_frame_restore.
    pub setup_area_size: u32,

    /// Size of the area used to save callee-saved clobbered registers.
    /// This area is accessed by code emitted from gen_clobber_save and
    /// gen_clobber_restore.
    pub clobber_size: u32,

    /// Storage allocated for the fixed part of the stack frame.
    /// This contains stack slots and spill slots.
    pub fixed_frame_storage_size: u32,

    /// Stack size to be reserved for outgoing arguments, if used by
    /// the current ABI, or 0 otherwise.  After gen_clobber_save and
    /// before gen_clobber_restore, the stack pointer points to the
    /// bottom of this area.
    pub outgoing_args_size: u32,

    /// Sorted list of callee-saved registers that are clobbered
    /// according to the ABI.  These registers will be saved and
    /// restored by gen_clobber_save and gen_clobber_restore.
    pub clobbered_callee_saves: Vec<Writable<RealReg>>,
}

impl FrameLayout {
    /// Split the clobbered callee-save registers into integer-class and
    /// float-class groups.
    ///
    /// This method does not currently support vector-class callee-save
    /// registers because no current backend has them.
    pub fn clobbered_callee_saves_by_class(&self) -> (&[Writable<RealReg>], &[Writable<RealReg>]) {
        let (ints, floats) = self.clobbered_callee_saves.split_at(
            self.clobbered_callee_saves
                .partition_point(|r| r.to_reg().class() == RegClass::Int),
        );
        debug_assert!(floats.iter().all(|r| r.to_reg().class() == RegClass::Float));
        (ints, floats)
    }

    /// The distance from FP down to SP while the frame is active (not during
    /// prologue setup or epilogue tear-down).
    pub fn active_size(&self) -> u32 {
        self.outgoing_args_size + self.fixed_frame_storage_size + self.clobber_size
    }

    /// Get the offset from the SP to the sized stack slots area.
    pub fn sp_to_sized_stack_slots(&self) -> u32 {
        self.outgoing_args_size
    }
}

/// ABI object for a function body.
pub struct Callee<M: ABIMachineSpec> {
    /// CLIF-level signature, possibly normalized.
    ir_sig: ir::Signature,
    /// Signature: arg and retval regs.
    sig: Sig,
    /// Defined dynamic types.
    dynamic_type_sizes: HashMap<Type, u32>,
    /// Offsets to each dynamic stackslot.
    dynamic_stackslots: PrimaryMap<DynamicStackSlot, u32>,
    /// Offsets to each sized stackslot.
    sized_stackslots: PrimaryMap<StackSlot, u32>,
    /// Total stack size of all stackslots.
    stackslots_size: u32,
    /// Stack size to be reserved for outgoing arguments.
    outgoing_args_size: u32,
    /// Initially the number of bytes originating in the caller's frame where
    /// stack arguments will live. After lowering, this number may be larger
    /// than the size expected by the function being compiled, as tail calls
    /// potentially require more space for stack arguments.
    tail_args_size: u32,
    /// Register-argument defs, to be provided to the `args`
    /// pseudo-inst, and pregs to constrain them to.
    reg_args: Vec<ArgPair>,
    /// Finalized frame layout for this function.
    frame_layout: Option<FrameLayout>,
    /// The register holding the return-area pointer, if needed.
    ret_area_ptr: Option<Reg>,
    /// Calling convention this function expects.
    call_conv: isa::CallConv,
    /// The settings controlling this function's compilation.
    flags: settings::Flags,
    /// The ISA-specific flag values controlling this function's compilation.
    isa_flags: M::F,
    /// Whether or not this function is a "leaf", meaning it calls no other
    /// functions.
    is_leaf: bool,
    /// If this function has a stack limit specified, then `Reg` is where the
    /// stack limit will be located after the instructions specified have been
    /// executed.
    ///
    /// Note that this is intended for insertion into the prologue, if
    /// present. Also note that because the instructions here execute in the
    /// prologue this happens after legalization/register allocation/etc so we
    /// need to be extremely careful with each instruction. The instructions are
    /// manually register-allocated and carefully only use caller-saved
    /// registers and keep nothing live after this sequence of instructions.
    stack_limit: Option<(Reg, SmallInstVec<M::I>)>,

    _mach: PhantomData<M>,
}

fn get_special_purpose_param_register(
    f: &ir::Function,
    sigs: &SigSet,
    sig: Sig,
    purpose: ir::ArgumentPurpose,
) -> Option<Reg> {
    let idx = f.signature.special_param_index(purpose)?;
    match &sigs.args(sig)[idx] {
        &ABIArg::Slots { ref slots, .. } => match &slots[0] {
            &ABIArgSlot::Reg { reg, .. } => Some(reg.into()),
            _ => None,
        },
        _ => None,
    }
}

/// Round `val` up to the next multiple of `mask + 1`, where `mask` is a power
/// of two minus one. Returns `None` on overflow. For example,
/// `checked_round_up(13, 7) == Some(16)`.
fn checked_round_up(val: u32, mask: u32) -> Option<u32> {
    Some(val.checked_add(mask)? & !mask)
}

impl<M: ABIMachineSpec> Callee<M> {
    /// Create a new body ABI instance.
    pub fn new(
        f: &ir::Function,
        isa: &dyn TargetIsa,
        isa_flags: &M::F,
        sigs: &SigSet,
    ) -> CodegenResult<Self> {
        trace!("ABI: func signature {:?}", f.signature);

        let flags = isa.flags().clone();
        let sig = sigs.abi_sig_for_signature(&f.signature);

        let call_conv = f.signature.call_conv;
        // Only these calling conventions are supported.
        debug_assert!(
            call_conv == isa::CallConv::SystemV
                || call_conv == isa::CallConv::Tail
                || call_conv == isa::CallConv::Fast
                || call_conv == isa::CallConv::Cold
                || call_conv == isa::CallConv::WindowsFastcall
                || call_conv == isa::CallConv::AppleAarch64
                || call_conv == isa::CallConv::Winch,
            "Unsupported calling convention: {call_conv:?}"
        );

        // Compute sized stackslot locations and total stackslot size.
        let mut end_offset: u32 = 0;
        let mut sized_stackslots = PrimaryMap::new();

        for (stackslot, data) in f.sized_stack_slots.iter() {
            // We start our computation possibly unaligned where the previous
            // stackslot left off.
            let unaligned_start_offset = end_offset;

            // The start of the stackslot must be aligned.
            //
            // We always at least machine-word-align slots, but also
            // satisfy the user's requested alignment.
            debug_assert!(data.align_shift < 32);
            let align = std::cmp::max(M::word_bytes(), 1u32 << data.align_shift);
            let mask = align - 1;
            let start_offset = checked_round_up(unaligned_start_offset, mask)
                .ok_or(CodegenError::ImplLimitExceeded)?;
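            // E.g., with an 8-byte word and `align_shift == 4`, `align` is 16
            // and `mask` is 15, so an unaligned start offset of 20 rounds up
            // to 32.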
1175
1176            // The end offset is the the start offset increased by the size
1177            end_offset = start_offset
1178                .checked_add(data.size)
1179                .ok_or(CodegenError::ImplLimitExceeded)?;
1180
1181            debug_assert_eq!(stackslot.as_u32() as usize, sized_stackslots.len());
1182            sized_stackslots.push(start_offset);
1183        }
1184
1185        // Compute dynamic stackslot locations and total stackslot size.
1186        let mut dynamic_stackslots = PrimaryMap::new();
1187        for (stackslot, data) in f.dynamic_stack_slots.iter() {
1188            debug_assert_eq!(stackslot.as_u32() as usize, dynamic_stackslots.len());
1189
1190            // This computation is similar to the stackslots above
1191            let unaligned_start_offset = end_offset;
1192
1193            let mask = M::word_bytes() - 1;
1194            let start_offset = checked_round_up(unaligned_start_offset, mask)
1195                .ok_or(CodegenError::ImplLimitExceeded)?;
1196
1197            let ty = f.get_concrete_dynamic_ty(data.dyn_ty).ok_or_else(|| {
1198                CodegenError::Unsupported(format!("invalid dynamic vector type: {}", data.dyn_ty))
1199            })?;
1200
1201            end_offset = start_offset
1202                .checked_add(isa.dynamic_vector_bytes(ty))
1203                .ok_or(CodegenError::ImplLimitExceeded)?;
1204
1205            dynamic_stackslots.push(start_offset);
1206        }
1207
1208        // The size of the stackslots needs to be word aligned
1209        let stackslots_size = checked_round_up(end_offset, M::word_bytes() - 1)
1210            .ok_or(CodegenError::ImplLimitExceeded)?;
1211
1212        let mut dynamic_type_sizes = HashMap::with_capacity(f.dfg.dynamic_types.len());
1213        for (dyn_ty, _data) in f.dfg.dynamic_types.iter() {
1214            let ty = f
1215                .get_concrete_dynamic_ty(dyn_ty)
1216                .unwrap_or_else(|| panic!("invalid dynamic vector type: {dyn_ty}"));
1217            let size = isa.dynamic_vector_bytes(ty);
1218            dynamic_type_sizes.insert(ty, size);
1219        }
1220
1221        // Figure out what instructions, if any, will be needed to check the
1222        // stack limit. This can either be specified as a special-purpose
1223        // argument or as a global value which often calculates the stack limit
1224        // from the arguments.
1225        let stack_limit = f
1226            .stack_limit
1227            .map(|gv| gen_stack_limit::<M>(f, sigs, sig, gv));
1228
1229        let tail_args_size = sigs[sig].sized_stack_arg_space;
1230
1231        Ok(Self {
1232            ir_sig: ensure_struct_return_ptr_is_returned(&f.signature),
1233            sig,
1234            dynamic_stackslots,
1235            dynamic_type_sizes,
1236            sized_stackslots,
1237            stackslots_size,
1238            outgoing_args_size: 0,
1239            tail_args_size,
1240            reg_args: vec![],
1241            frame_layout: None,
1242            ret_area_ptr: None,
1243            call_conv,
1244            flags,
1245            isa_flags: isa_flags.clone(),
1246            is_leaf: f.is_leaf(),
1247            stack_limit,
1248            _mach: PhantomData,
1249        })
1250    }
1251
1252    /// Inserts instructions necessary for checking the stack limit into the
1253    /// prologue.
1254    ///
1255    /// This function will generate instructions necessary for perform a stack
1256    /// check at the header of a function. The stack check is intended to trap
1257    /// if the stack pointer goes below a particular threshold, preventing stack
1258    /// overflow in wasm or other code. The `stack_limit` argument here is the
1259    /// register which holds the threshold below which we're supposed to trap.
1260    /// This function is known to allocate `stack_size` bytes and we'll push
1261    /// instructions onto `insts`.
1262    ///
1263    /// Note that the instructions generated here are special because this is
1264    /// happening so late in the pipeline (e.g. after register allocation). This
1265    /// means that we need to do manual register allocation here and also be
1266    /// careful to not clobber any callee-saved or argument registers. For now
1267    /// this routine makes do with the `spilltmp_reg` as one temporary
1268    /// register, and a second register of `tmp2` which is caller-saved. This
1269    /// should be fine for us since no spills should happen in this sequence of
1270    /// instructions, so our register won't get accidentally clobbered.
1271    ///
1272    /// No values can be live after the prologue, but in this case that's ok
1273    /// because we just need to perform a stack check before progressing with
1274    /// the rest of the function.
1275    fn insert_stack_check(
1276        &self,
1277        stack_limit: Reg,
1278        stack_size: u32,
1279        insts: &mut SmallInstVec<M::I>,
1280    ) {
1281        // With no explicit stack allocated we can just emit the simple check of
1282        // the stack registers against the stack limit register, and trap if
1283        // it's out of bounds.
1284        if stack_size == 0 {
1285            insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
1286            return;
1287        }
1288
1289        // Note that the 32k stack size here is pretty special. See the
1290        // documentation in x86/abi.rs for why this is here. The general idea is
1291        // that we're protecting against overflow in the addition that happens
1292        // below.
1293        if stack_size >= 32 * 1024 {
1294            insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
1295        }
1296
1297        // Add the `stack_size` to `stack_limit`, placing the result in
1298        // `scratch`.
1299        //
1300        // Note though that `stack_limit`'s register may be the same as
1301        // `scratch`. If our stack size doesn't fit into an immediate this
1302        // means we need a second scratch register for loading the stack size
1303        // into a register.
1304        let scratch = Writable::from_reg(M::get_stacklimit_reg(self.call_conv));
1305        insts.extend(M::gen_add_imm(self.call_conv, scratch, stack_limit, stack_size).into_iter());
1306        insts.extend(M::gen_stack_lower_bound_trap(scratch.to_reg()));
1307    }
1308}
1309
1310/// Generates the instructions necessary for the `gv` to be materialized into a
1311/// register.
1312///
1313/// This function will return a register that will contain the result of
1314/// evaluating `gv`. It will also return any instructions necessary to calculate
1315/// the value of the register.
1316///
1317/// Note that global values are typically lowered to instructions via the
1318/// standard legalization pass. Unfortunately though prologue generation happens
1319/// so late in the pipeline that we can't use these legalization passes to
1320/// generate the instructions for `gv`. As a result we duplicate some lowering
1321/// of `gv` here and support only some global values. This is similar to what
1322/// the x86 backend does for now, and hopefully this can be somewhat cleaned up
1323/// in the future too!
1324///
1325/// Also note that this function will make use of `writable_spilltmp_reg()` as a
1326/// temporary register to store values in if necessary. Currently, after we
1327/// write to this register, there are guaranteed to be no spills between the
1328/// write and its use, because we're not participating in register allocation anyway!
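///
/// As an illustrative example (the offset here is hypothetical), a stack limit
/// described by the global-value chain
///
/// ```plain
///   gv0 = vmctx
///   gv1 = load.i64 notrap aligned gv0+8
/// ```
///
/// materializes as a single load from `vmctx + 8` into the scratch register.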
1329fn gen_stack_limit<M: ABIMachineSpec>(
1330    f: &ir::Function,
1331    sigs: &SigSet,
1332    sig: Sig,
1333    gv: ir::GlobalValue,
1334) -> (Reg, SmallInstVec<M::I>) {
1335    let mut insts = smallvec![];
1336    let reg = generate_gv::<M>(f, sigs, sig, gv, &mut insts);
1337    (reg, insts)
1338}
1339
1340fn generate_gv<M: ABIMachineSpec>(
1341    f: &ir::Function,
1342    sigs: &SigSet,
1343    sig: Sig,
1344    gv: ir::GlobalValue,
1345    insts: &mut SmallInstVec<M::I>,
1346) -> Reg {
1347    match f.global_values[gv] {
1348        // Return the direct register the vmcontext is in
1349        ir::GlobalValueData::VMContext => {
1350            get_special_purpose_param_register(f, sigs, sig, ir::ArgumentPurpose::VMContext)
1351                .expect("no vmcontext parameter found")
1352        }
1353        // Load our base value into a register, then load from that register
1354        // into a temporary register.
1355        ir::GlobalValueData::Load {
1356            base,
1357            offset,
1358            global_type: _,
1359            flags: _,
1360        } => {
1361            let base = generate_gv::<M>(f, sigs, sig, base, insts);
1362            let into_reg = Writable::from_reg(M::get_stacklimit_reg(f.stencil.signature.call_conv));
1363            insts.push(M::gen_load_base_offset(
1364                into_reg,
1365                base,
1366                offset.into(),
1367                M::word_type(),
1368            ));
1369            into_reg.to_reg()
1370        }
1371        ref other => panic!("global value for stack limit not supported: {other}"),
1372    }
1373}
1374
1375/// Returns true if the signature needs to be legalized.
1376fn missing_struct_return(sig: &ir::Signature) -> bool {
1377    sig.uses_special_param(ArgumentPurpose::StructReturn)
1378        && !sig.uses_special_return(ArgumentPurpose::StructReturn)
1379}
1380
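/// Legalizes `sig` so that the `StructReturn` pointer argument is also
/// returned. As a sketch of the shape involved, a signature like
/// `(sret ptr, i32) -> ()` becomes `(sret ptr, i32) -> (sret ptr)`.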
1381fn ensure_struct_return_ptr_is_returned(sig: &ir::Signature) -> ir::Signature {
1382    // Keep in sync with Callee::new
1383    let mut sig = sig.clone();
1384    if sig.uses_special_return(ArgumentPurpose::StructReturn) {
1385        panic!("Explicit StructReturn return value not allowed: {sig:?}")
1386    }
1387    if let Some(struct_ret_index) = sig.special_param_index(ArgumentPurpose::StructReturn) {
1388        if !sig.returns.is_empty() {
1389            panic!("No return values are allowed when using StructReturn: {sig:?}");
1390        }
1391        sig.returns.insert(0, sig.params[struct_ret_index]);
1392    }
1393    sig
1394}
1395
1396/// ### Pre-Regalloc Functions
1397///
1398/// These methods of `Callee` may only be called before regalloc.
1399impl<M: ABIMachineSpec> Callee<M> {
1400    /// Access the (possibly legalized) signature.
1401    pub fn signature(&self) -> &ir::Signature {
1402        debug_assert!(
1403            !missing_struct_return(&self.ir_sig),
1404            "`Callee::ir_sig` is always legalized"
1405        );
1406        &self.ir_sig
1407    }
1408
1409    /// Initialize. This is called after the Callee is constructed because it
1410    /// may allocate a temp vreg, which can only be allocated once the lowering
1411    /// context exists.
1412    pub fn init_retval_area(
1413        &mut self,
1414        sigs: &SigSet,
1415        vregs: &mut VRegAllocator<M::I>,
1416    ) -> CodegenResult<()> {
1417        if sigs[self.sig].stack_ret_arg.is_some() {
1418            let ret_area_ptr = vregs.alloc(M::word_type())?;
1419            self.ret_area_ptr = Some(ret_area_ptr.only_reg().unwrap());
1420        }
1421        Ok(())
1422    }
1423
1424    /// Get the return area pointer register, if any.
1425    pub fn ret_area_ptr(&self) -> Option<Reg> {
1426        self.ret_area_ptr
1427    }
1428
1429    /// Accumulate outgoing arguments.
1430    ///
1431    /// This ensures that at least `size` bytes are allocated in the prologue to
1432    /// be available for use in function calls to hold arguments and/or return
1433    /// values. If this function is called multiple times, the maximum of all
1434    /// `size` values will be available.
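    ///
    /// For example (a sketch): calling `accumulate_outgoing_args_size(16)` and
    /// then `accumulate_outgoing_args_size(32)` reserves 32 bytes for outgoing
    /// arguments, not 48.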
1435    pub fn accumulate_outgoing_args_size(&mut self, size: u32) {
1436        if size > self.outgoing_args_size {
1437            self.outgoing_args_size = size;
1438        }
1439    }
1440
1441    /// Accumulate the incoming argument area size requirements for a tail call,
1442    /// as it could be larger than the incoming arguments of the function
1443    /// currently being compiled.
1444    pub fn accumulate_tail_args_size(&mut self, size: u32) {
1445        if size > self.tail_args_size {
1446            self.tail_args_size = size;
1447        }
1448    }
1449
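    /// Whether the ISA flags request forward-edge control-flow integrity (CFI)
    /// protection for this function.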
1450    pub fn is_forward_edge_cfi_enabled(&self) -> bool {
1451        self.isa_flags.is_forward_edge_cfi_enabled()
1452    }
1453
1454    /// Get the calling convention implemented by this ABI object.
1455    pub fn call_conv(&self, sigs: &SigSet) -> isa::CallConv {
1456        sigs[self.sig].call_conv
1457    }
1458
1459    /// Get the ABI-dependent MachineEnv for managing register allocation.
1460    pub fn machine_env(&self, sigs: &SigSet) -> &MachineEnv {
1461        M::get_machine_env(&self.flags, self.call_conv(sigs))
1462    }
1463
1464    /// The offsets of all sized stack slots (not spill slots) for debuginfo purposes.
1465    pub fn sized_stackslot_offsets(&self) -> &PrimaryMap<StackSlot, u32> {
1466        &self.sized_stackslots
1467    }
1468
1469    /// The offsets of all dynamic stack slots (not spill slots) for debuginfo purposes.
1470    pub fn dynamic_stackslot_offsets(&self) -> &PrimaryMap<DynamicStackSlot, u32> {
1471        &self.dynamic_stackslots
1472    }
1473
1474    /// Generate an instruction which copies an argument to a destination
1475    /// register.
1476    pub fn gen_copy_arg_to_regs(
1477        &mut self,
1478        sigs: &SigSet,
1479        idx: usize,
1480        into_regs: ValueRegs<Writable<Reg>>,
1481        vregs: &mut VRegAllocator<M::I>,
1482    ) -> SmallInstVec<M::I> {
1483        let mut insts = smallvec![];
1484        let mut copy_arg_slot_to_reg = |slot: &ABIArgSlot, into_reg: &Writable<Reg>| {
1485            match slot {
1486                &ABIArgSlot::Reg { reg, .. } => {
1487                    // Add a preg -> def pair to the eventual `args`
1488                    // instruction.  Extension mode doesn't matter
1489                    // (we're copying out, not in; we ignore high bits
1490                    // by convention).
1491                    let arg = ArgPair {
1492                        vreg: *into_reg,
1493                        preg: reg.into(),
1494                    };
1495                    self.reg_args.push(arg);
1496                }
1497                &ABIArgSlot::Stack {
1498                    offset,
1499                    ty,
1500                    extension,
1501                    ..
1502                } => {
1503                    // However, we have to respect the extension mode for stack
1504                    // slots, or else we grab the wrong bytes on big-endian.
1505                    let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
1506                    let ty =
1507                        if ext != ArgumentExtension::None && M::word_bits() > ty_bits(ty) as u32 {
1508                            M::word_type()
1509                        } else {
1510                            ty
1511                        };
1512                    insts.push(M::gen_load_stack(
1513                        StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
1514                        *into_reg,
1515                        ty,
1516                    ));
1517                }
1518            }
1519        };
1520
1521        match &sigs.args(self.sig)[idx] {
1522            &ABIArg::Slots { ref slots, .. } => {
1523                assert_eq!(into_regs.len(), slots.len());
1524                for (slot, into_reg) in slots.iter().zip(into_regs.regs().iter()) {
1525                    copy_arg_slot_to_reg(&slot, &into_reg);
1526                }
1527            }
1528            &ABIArg::StructArg { offset, .. } => {
1529                let into_reg = into_regs.only_reg().unwrap();
1530                // Buffer address is implicitly defined by the ABI.
1531                insts.push(M::gen_get_stack_addr(
1532                    StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
1533                    into_reg,
1534                ));
1535            }
1536            &ABIArg::ImplicitPtrArg { pointer, ty, .. } => {
1537                let into_reg = into_regs.only_reg().unwrap();
1538                // We need to dereference the pointer.
1539                let base = match &pointer {
1540                    &ABIArgSlot::Reg { reg, ty, .. } => {
1541                        let tmp = vregs.alloc_with_deferred_error(ty).only_reg().unwrap();
1542                        self.reg_args.push(ArgPair {
1543                            vreg: Writable::from_reg(tmp),
1544                            preg: reg.into(),
1545                        });
1546                        tmp
1547                    }
1548                    &ABIArgSlot::Stack { offset, ty, .. } => {
1549                        let addr_reg = writable_value_regs(vregs.alloc_with_deferred_error(ty))
1550                            .only_reg()
1551                            .unwrap();
1552                        insts.push(M::gen_load_stack(
1553                            StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
1554                            addr_reg,
1555                            ty,
1556                        ));
1557                        addr_reg.to_reg()
1558                    }
1559                };
1560                insts.push(M::gen_load_base_offset(into_reg, base, 0, ty));
1561            }
1562        }
1563        insts
1564    }
1565
1566    /// Generate an instruction which copies a source register to a return value slot.
1567    pub fn gen_copy_regs_to_retval(
1568        &self,
1569        sigs: &SigSet,
1570        idx: usize,
1571        from_regs: ValueRegs<Reg>,
1572        vregs: &mut VRegAllocator<M::I>,
1573    ) -> (SmallVec<[RetPair; 2]>, SmallInstVec<M::I>) {
1574        let mut reg_pairs = smallvec![];
1575        let mut ret = smallvec![];
1576        let word_bits = M::word_bits() as u8;
1577        match &sigs.rets(self.sig)[idx] {
1578            &ABIArg::Slots { ref slots, .. } => {
1579                assert_eq!(from_regs.len(), slots.len());
1580                for (slot, &from_reg) in slots.iter().zip(from_regs.regs().iter()) {
1581                    match slot {
1582                        &ABIArgSlot::Reg {
1583                            reg, ty, extension, ..
1584                        } => {
1585                            let from_bits = ty_bits(ty) as u8;
1586                            let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
1587                            let vreg = match (ext, from_bits) {
1588                                (ir::ArgumentExtension::Uext, n)
1589                                | (ir::ArgumentExtension::Sext, n)
1590                                    if n < word_bits =>
1591                                {
1592                                    let signed = ext == ir::ArgumentExtension::Sext;
1593                                    let dst =
1594                                        writable_value_regs(vregs.alloc_with_deferred_error(ty))
1595                                            .only_reg()
1596                                            .unwrap();
1597                                    ret.push(M::gen_extend(
1598                                        dst, from_reg, signed, from_bits,
1599                                        /* to_bits = */ word_bits,
1600                                    ));
1601                                    dst.to_reg()
1602                                }
1603                                _ => {
1604                                    // No move needed, regalloc2 will emit it using the constraint
1605                                    // added by the RetPair.
1606                                    from_reg
1607                                }
1608                            };
1609                            reg_pairs.push(RetPair {
1610                                vreg,
1611                                preg: Reg::from(reg),
1612                            });
1613                        }
1614                        &ABIArgSlot::Stack {
1615                            offset,
1616                            ty,
1617                            extension,
1618                            ..
1619                        } => {
1620                            let mut ty = ty;
1621                            let from_bits = ty_bits(ty) as u8;
1622                            // A machine ABI implementation should ensure that stack frames
1623                            // have "reasonable" size. All current ABIs for machinst
1624                            // backends (aarch64 and x64) enforce a 128MB limit.
1625                            let off = i32::try_from(offset).expect(
1626                                "Argument stack offset greater than 2GB; should hit impl limit first",
1627                            );
1628                            let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
1629                            // Extend into a fresh register if needed, and store that widened value.
1630                            let data = match (ext, from_bits) {
1631                                (ir::ArgumentExtension::Uext, n)
1632                                | (ir::ArgumentExtension::Sext, n)
1633                                    if n < word_bits =>
1634                                {
1635                                    assert_eq!(M::word_reg_class(), from_reg.class());
1636                                    let signed = ext == ir::ArgumentExtension::Sext;
1637                                    let dst = writable_value_regs(vregs.alloc_with_deferred_error(ty))
1638                                        .only_reg()
1639                                        .unwrap();
1640                                    ret.push(M::gen_extend(
1641                                        dst, from_reg, signed, from_bits,
1642                                        /* to_bits = */ word_bits,
1643                                    ));
1644                                    // Store the extended version.
1645                                    ty = M::word_type();
1646                                    dst.to_reg()
1647                                }
1648                                _ => from_reg,
1649                            };
1650                            ret.push(M::gen_store_base_offset(
1651                                self.ret_area_ptr.unwrap(),
1652                                off,
1653                                data,
1654                                ty,
1655                            ));
1656                        }
1657                    }
1658                }
1659            }
1660            ABIArg::StructArg { .. } => {
1661                panic!("StructArg in return position is unsupported");
1662            }
1663            ABIArg::ImplicitPtrArg { .. } => {
1664                panic!("ImplicitPtrArg in return position is unsupported");
1665            }
1666        }
1667        (reg_pairs, ret)
1668    }
1669
1670    /// Generate any setup instruction needed to save values to the
1671    /// return-value area. This is usually used when there are multiple return
1672    /// values or an otherwise large return value that must be passed on the
1673    /// stack; typically the ABI specifies an extra hidden argument that is a
1674    /// pointer to that memory.
1675    pub fn gen_retval_area_setup(
1676        &mut self,
1677        sigs: &SigSet,
1678        vregs: &mut VRegAllocator<M::I>,
1679    ) -> Option<M::I> {
1680        if let Some(i) = sigs[self.sig].stack_ret_arg {
1681            let ret_area_ptr = Writable::from_reg(self.ret_area_ptr.unwrap());
1682            let insts =
1683                self.gen_copy_arg_to_regs(sigs, i.into(), ValueRegs::one(ret_area_ptr), vregs);
1684            insts.into_iter().next().map(|inst| {
1685                trace!(
1686                    "gen_retval_area_setup: inst {:?}; ptr reg is {:?}",
1687                    inst,
1688                    ret_area_ptr.to_reg()
1689                );
1690                inst
1691            })
1692        } else {
1693            trace!("gen_retval_area_setup: not needed");
1694            None
1695        }
1696    }
1697
1698    /// Generate a return instruction.
1699    pub fn gen_rets(&self, rets: Vec<RetPair>) -> M::I {
1700        M::gen_rets(rets)
1701    }
1702
1703    /// Produce an instruction that computes a sized stackslot address.
1704    pub fn sized_stackslot_addr(
1705        &self,
1706        slot: StackSlot,
1707        offset: u32,
1708        into_reg: Writable<Reg>,
1709    ) -> M::I {
1710        // Offset from beginning of stackslot area.
1711        let stack_off = self.sized_stackslots[slot] as i64;
1712        let sp_off: i64 = stack_off + (offset as i64);
1713        M::gen_get_stack_addr(StackAMode::Slot(sp_off), into_reg)
1714    }
1715
1716    /// Produce an instruction that computes a dynamic stackslot address.
1717    pub fn dynamic_stackslot_addr(&self, slot: DynamicStackSlot, into_reg: Writable<Reg>) -> M::I {
1718        let stack_off = self.dynamic_stackslots[slot] as i64;
1719        M::gen_get_stack_addr(StackAMode::Slot(stack_off), into_reg)
1720    }
1721
1722    /// Get an `args` pseudo-inst, if any, that should appear at the
1723    /// very top of the function body prior to regalloc.
1724    pub fn take_args(&mut self) -> Option<M::I> {
1725        if !self.reg_args.is_empty() {
1726            // Very first instruction is an `args` pseudo-inst that
1727            // establishes live-ranges for in-register arguments and
1728            // constrains them at the start of the function to the
1729            // locations defined by the ABI.
1730            Some(M::gen_args(std::mem::take(&mut self.reg_args)))
1731        } else {
1732            None
1733        }
1734    }
1735}
1736
1737/// ### Post-Regalloc Functions
1738///
1739/// These methods of `Callee` may only be called after
1740/// regalloc.
1741impl<M: ABIMachineSpec> Callee<M> {
1742    /// Compute the final frame layout, post-regalloc.
1743    ///
1744    /// This must be called before gen_prologue or gen_epilogue.
1745    pub fn compute_frame_layout(
1746        &mut self,
1747        sigs: &SigSet,
1748        spillslots: usize,
1749        clobbered: Vec<Writable<RealReg>>,
1750    ) {
1751        let bytes = M::word_bytes();
1752        let total_stacksize = self.stackslots_size + bytes * spillslots as u32;
1753        let mask = M::stack_align(self.call_conv) - 1;
1754        let total_stacksize = (total_stacksize + mask) & !mask; // Round up to the ABI's stack alignment.
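        // (E.g., with a 16-byte alignment requirement, `mask` is 15 and a raw
        // size of 40 bytes rounds up to 48.)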
1755        self.frame_layout = Some(M::compute_frame_layout(
1756            self.call_conv,
1757            &self.flags,
1758            self.signature(),
1759            &clobbered,
1760            self.is_leaf,
1761            self.stack_args_size(sigs),
1762            self.tail_args_size,
1763            total_stacksize,
1764            self.outgoing_args_size,
1765        ));
1766    }
1767
1768    /// Generate a prologue, post-regalloc.
1769    ///
1770    /// This should include any stack frame or other setup necessary to use the
1771    /// other methods (`load_arg`, `store_retval`, and spillslot accesses).
1772    pub fn gen_prologue(&self) -> SmallInstVec<M::I> {
1773        let frame_layout = self.frame_layout();
1774        let mut insts = smallvec![];
1775
1776        // Set up frame.
1777        insts.extend(M::gen_prologue_frame_setup(
1778            self.call_conv,
1779            &self.flags,
1780            &self.isa_flags,
1781            &frame_layout,
1782        ));
1783
1784        // The stack limit check needs to cover all the stack adjustments we
1785        // might make, up to the next stack limit check in any function we
1786        // call. Since this happens after frame setup, the current function's
1787        // setup area needs to be accounted for in the caller's stack limit
1788        // check, but we need to account for any setup area that our callees
1789        // might need. Note that s390x may also use the outgoing args area for
1790        // backtrace support even in leaf functions, so that should be accounted
1791        // for unconditionally.
1792        let total_stacksize = (frame_layout.tail_args_size - frame_layout.incoming_args_size)
1793            + frame_layout.clobber_size
1794            + frame_layout.fixed_frame_storage_size
1795            + frame_layout.outgoing_args_size
1796            + if self.is_leaf {
1797                0
1798            } else {
1799                frame_layout.setup_area_size
1800            };
1801
1802        // Leaf functions with zero stack can skip the stack check even if one
1803        // is specified; otherwise, always insert the stack check.
1804        if total_stacksize > 0 || !self.is_leaf {
1805            if let Some((reg, stack_limit_load)) = &self.stack_limit {
1806                insts.extend(stack_limit_load.clone());
1807                self.insert_stack_check(*reg, total_stacksize, &mut insts);
1808            }
1809
1810            if self.flags.enable_probestack() {
1811                let guard_size = 1 << self.flags.probestack_size_log2();
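                // E.g., with a typical `probestack_size_log2` of 12, the guard
                // size is 4096 bytes, so the outline strategy below skips the
                // probe for frames smaller than one guard region.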
1812                match self.flags.probestack_strategy() {
1813                    ProbestackStrategy::Inline => M::gen_inline_probestack(
1814                        &mut insts,
1815                        self.call_conv,
1816                        total_stacksize,
1817                        guard_size,
1818                    ),
1819                    ProbestackStrategy::Outline => {
1820                        if total_stacksize >= guard_size {
1821                            M::gen_probestack(&mut insts, total_stacksize);
1822                        }
1823                    }
1824                }
1825            }
1826        }
1827
1828        // Save clobbered registers.
1829        insts.extend(M::gen_clobber_save(
1830            self.call_conv,
1831            &self.flags,
1832            &frame_layout,
1833        ));
1834
1835        insts
1836    }
1837
1838    /// Generate an epilogue, post-regalloc.
1839    ///
1840    /// Note that this must generate the actual return instruction (rather than
1841    /// emitting this in the lowering logic), because the epilogue code comes
1842    /// before the return and the two are likely closely related.
1843    pub fn gen_epilogue(&self) -> SmallInstVec<M::I> {
1844        let frame_layout = self.frame_layout();
1845        let mut insts = smallvec![];
1846
1847        // Restore clobbered registers.
1848        insts.extend(M::gen_clobber_restore(
1849            self.call_conv,
1850            &self.flags,
1851            &frame_layout,
1852        ));
1853
1854        // Tear down frame.
1855        insts.extend(M::gen_epilogue_frame_restore(
1856            self.call_conv,
1857            &self.flags,
1858            &self.isa_flags,
1859            &frame_layout,
1860        ));
1861
1862        // And return.
1863        insts.extend(M::gen_return(
1864            self.call_conv,
1865            &self.isa_flags,
1866            &frame_layout,
1867        ));
1868
1869        trace!("Epilogue: {:?}", insts);
1870        insts
1871    }
1872
1873    /// Return a reference to the computed frame layout information. This
1874    /// function will panic if it's called before [`Self::compute_frame_layout`].
1875    pub fn frame_layout(&self) -> &FrameLayout {
1876        self.frame_layout
1877            .as_ref()
1878            .expect("frame layout not computed before prologue generation")
1879    }
1880
1881    /// Returns the full frame size for the given function, after prologue
1882    /// emission has run. This comprises the spill slots and stack-storage
1883    /// slots as well as storage for clobbered callee-save registers, but
1884    /// not arguments pushed at callsites within this function,
1885    /// or other ephemeral pushes.
1886    pub fn frame_size(&self) -> u32 {
1887        let frame_layout = self.frame_layout();
1888        frame_layout.clobber_size + frame_layout.fixed_frame_storage_size
1889    }
1890
1891    /// Returns the offset from the slot base in the current frame to the caller's SP.
1892    pub fn slot_base_to_caller_sp_offset(&self) -> u32 {
1893        let frame_layout = self.frame_layout();
1894        frame_layout.clobber_size
1895            + frame_layout.fixed_frame_storage_size
1896            + frame_layout.setup_area_size
1897    }
1898
1899    /// Returns the size of arguments expected on the stack.
1900    pub fn stack_args_size(&self, sigs: &SigSet) -> u32 {
1901        sigs[self.sig].sized_stack_arg_space
1902    }
1903
1904    /// Get the spill-slot size.
1905    pub fn get_spillslot_size(&self, rc: RegClass) -> u32 {
1906        let max = if self.dynamic_type_sizes.is_empty() {
1907            16
1908        } else {
1909            *self
1910                .dynamic_type_sizes
1911                .iter()
1912                .max_by(|x, y| x.1.cmp(&y.1))
1913                .map(|(_k, v)| v)
1914                .unwrap()
1915        };
1916        M::get_number_of_spillslots_for_value(rc, max, &self.isa_flags)
1917    }
1918
1919    /// Get the spill slot offset relative to the fixed allocation area start.
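    ///
    /// For example (a sketch): with 8-byte words, 64 bytes of sized
    /// stackslots, and spill slot index 3, the offset is `64 + 3 * 8 = 88`.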
1920    pub fn get_spillslot_offset(&self, slot: SpillSlot) -> i64 {
1921        // Offset from beginning of spillslot area.
1922        let islot = slot.index() as i64;
1923        let spill_off = islot * M::word_bytes() as i64;
1924        let sp_off = self.stackslots_size as i64 + spill_off;
1925
1926        sp_off
1927    }
1928
1929    /// Generate a spill.
1930    pub fn gen_spill(&self, to_slot: SpillSlot, from_reg: RealReg) -> M::I {
1931        let ty = M::I::canonical_type_for_rc(from_reg.class());
1932        debug_assert_eq!(<M>::I::rc_for_type(ty).unwrap().1, &[ty]);
1933
1934        let sp_off = self.get_spillslot_offset(to_slot);
1935        trace!("gen_spill: {from_reg:?} into slot {to_slot:?} at offset {sp_off}");
1936
1937        let to = StackAMode::Slot(sp_off);
1938        <M>::gen_store_stack(to, Reg::from(from_reg), ty)
1939    }
1940
1941    /// Generate a reload (fill).
1942    pub fn gen_reload(&self, to_reg: Writable<RealReg>, from_slot: SpillSlot) -> M::I {
1943        let ty = M::I::canonical_type_for_rc(to_reg.to_reg().class());
1944        debug_assert_eq!(<M>::I::rc_for_type(ty).unwrap().1, &[ty]);
1945
1946        let sp_off = self.get_spillslot_offset(from_slot);
1947        trace!("gen_reload: {to_reg:?} from slot {from_slot:?} at offset {sp_off}");
1948
1949        let from = StackAMode::Slot(sp_off);
1950        <M>::gen_load_stack(from, to_reg.map(Reg::from), ty)
1951    }
1952}
1953
1954/// An input argument to a call instruction: the vreg that is used,
1955/// and the preg it is constrained to (per the ABI).
1956#[derive(Clone, Debug)]
1957pub struct CallArgPair {
1958    /// The virtual register to use for the argument.
1959    pub vreg: Reg,
1960    /// The real register into which the arg goes.
1961    pub preg: Reg,
1962}
1963
1964/// An output return value from a call instruction: the vreg that is
1965/// defined, and the preg it is constrained to (per the ABI).
1966#[derive(Clone, Debug)]
1967pub struct CallRetPair {
1968    /// The virtual register to define from this return value.
1969    pub vreg: Writable<Reg>,
1970    /// The real register from which the return value is read.
1971    pub preg: Reg,
1972}
1973
1974pub type CallArgList = SmallVec<[CallArgPair; 8]>;
1975pub type CallRetList = SmallVec<[CallRetPair; 8]>;
1976
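/// Whether a call is a tail call or an ordinary call.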
1977pub enum IsTailCall {
1978    Yes,
1979    No,
1980}
1981
1982/// ABI object for a callsite.
1983pub struct CallSite<M: ABIMachineSpec> {
1984    /// The called function's signature.
1985    sig: Sig,
1986    /// All register uses for the callsite, i.e., function args, with
1987    /// VReg and the physical register it is constrained to.
1988    uses: CallArgList,
1989    /// All defs for the callsite, i.e., return values.
1990    defs: CallRetList,
1991    /// Call destination.
1992    dest: CallDest,
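    /// Whether this call is a tail call.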
1993    is_tail_call: IsTailCall,
1994    /// Caller's calling convention.
1995    caller_conv: isa::CallConv,
1996    /// The settings controlling this compilation.
1997    flags: settings::Flags,
1998
1999    _mach: PhantomData<M>,
2000}
2001
2002/// Destination for a call.
2003#[derive(Debug, Clone)]
2004pub enum CallDest {
2005    /// Call to an ExtName (named function symbol).
2006    ExtName(ir::ExternalName, RelocDistance),
2007    /// Indirect call to a function pointer in a register.
2008    Reg(Reg),
2009}
2010
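// As a sketch of how a backend typically drives `CallSite` during lowering
// (`ctx` and `args` here stand in for the backend's own lowering state):
//
//     let mut site = CallSite::<M>::from_func(
//         sigs, sig_ref, &extname, IsTailCall::No, RelocDistance::Far,
//         caller_conv, flags,
//     );
//     site.emit_args(ctx, args); // buffer copies first, then arg constraints
//     site.emit_call(ctx);       // the call pseudo-inst itself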
2011impl<M: ABIMachineSpec> CallSite<M> {
2012    /// Create a callsite ABI object for a call directly to the specified function.
2013    pub fn from_func(
2014        sigs: &SigSet,
2015        sig_ref: ir::SigRef,
2016        extname: &ir::ExternalName,
2017        is_tail_call: IsTailCall,
2018        dist: RelocDistance,
2019        caller_conv: isa::CallConv,
2020        flags: settings::Flags,
2021    ) -> CallSite<M> {
2022        let sig = sigs.abi_sig_for_sig_ref(sig_ref);
2023        CallSite {
2024            sig,
2025            uses: smallvec![],
2026            defs: smallvec![],
2027            dest: CallDest::ExtName(extname.clone(), dist),
2028            is_tail_call,
2029            caller_conv,
2030            flags,
2031            _mach: PhantomData,
2032        }
2033    }
2034
2035    /// Create a callsite ABI object for a call directly to the specified
2036    /// libcall.
2037    pub fn from_libcall(
2038        sigs: &SigSet,
2039        sig: &ir::Signature,
2040        extname: &ir::ExternalName,
2041        dist: RelocDistance,
2042        caller_conv: isa::CallConv,
2043        flags: settings::Flags,
2044    ) -> CallSite<M> {
2045        let sig = sigs.abi_sig_for_signature(sig);
2046        CallSite {
2047            sig,
2048            uses: smallvec![],
2049            defs: smallvec![],
2050            dest: CallDest::ExtName(extname.clone(), dist),
2051            is_tail_call: IsTailCall::No,
2052            caller_conv,
2053            flags,
2054            _mach: PhantomData,
2055        }
2056    }
2057
2058    /// Create a callsite ABI object for a call to a function pointer with the
2059    /// given signature.
2060    pub fn from_ptr(
2061        sigs: &SigSet,
2062        sig_ref: ir::SigRef,
2063        ptr: Reg,
2064        is_tail_call: IsTailCall,
2065        caller_conv: isa::CallConv,
2066        flags: settings::Flags,
2067    ) -> CallSite<M> {
2068        let sig = sigs.abi_sig_for_sig_ref(sig_ref);
2069        CallSite {
2070            sig,
2071            uses: smallvec![],
2072            defs: smallvec![],
2073            dest: CallDest::Reg(ptr),
2074            is_tail_call,
2075            caller_conv,
2076            flags,
2077            _mach: PhantomData,
2078        }
2079    }
2080
2081    pub(crate) fn dest(&self) -> &CallDest {
2082        &self.dest
2083    }
2084
2085    pub(crate) fn take_uses(self) -> CallArgList {
2086        self.uses
2087    }
2088
2089    pub(crate) fn sig<'a>(&self, sigs: &'a SigSet) -> &'a SigData {
2090        &sigs[self.sig]
2091    }
2092
2093    pub(crate) fn is_tail_call(&self) -> bool {
2094        matches!(self.is_tail_call, IsTailCall::Yes)
2095    }
2096}
2097
2098impl<M: ABIMachineSpec> CallSite<M> {
2099    /// Get the number of arguments expected.
2100    pub fn num_args(&self, sigs: &SigSet) -> usize {
2101        sigs.num_args(self.sig)
2102    }
2103
2104    /// Get the number of return values expected.
2105    pub fn num_rets(&self, sigs: &SigSet) -> usize {
2106        sigs.num_rets(self.sig)
2107    }
2108
2109    /// Emit a copy of a large argument into its associated stack buffer, if
2110    /// any.  We must be careful to perform all these copies (as necessary)
2111    /// before setting up the argument registers, since we may have to invoke
2112    /// memcpy(), which could clobber any registers already set up.  The
2113    /// back-end should call this routine for all arguments before calling
2114    /// `gen_arg` for all arguments.
2115    pub fn emit_copy_regs_to_buffer(
2116        &self,
2117        ctx: &mut Lower<M::I>,
2118        idx: usize,
2119        from_regs: ValueRegs<Reg>,
2120    ) {
2121        match &ctx.sigs().args(self.sig)[idx] {
2122            &ABIArg::Slots { .. } | &ABIArg::ImplicitPtrArg { .. } => {}
2123            &ABIArg::StructArg { offset, size, .. } => {
2124                let src_ptr = from_regs.only_reg().unwrap();
2125                let dst_ptr = ctx.alloc_tmp(M::word_type()).only_reg().unwrap();
2126                ctx.emit(M::gen_get_stack_addr(
2127                    StackAMode::OutgoingArg(offset),
2128                    dst_ptr,
2129                ));
2130                // Emit a memcpy from `src_ptr` to `dst_ptr` of `size` bytes.
2131                // N.B.: because we process StructArg params *first*, this is
2132                // safe w.r.t. clobbers: we have not yet filled in any other
2133                // arg regs.
2134                let memcpy_call_conv =
2135                    isa::CallConv::for_libcall(&self.flags, ctx.sigs()[self.sig].call_conv);
2136                for insn in M::gen_memcpy(
2137                    memcpy_call_conv,
2138                    dst_ptr.to_reg(),
2139                    src_ptr,
2140                    size as usize,
2141                    |ty| ctx.alloc_tmp(ty).only_reg().unwrap(),
2142                )
2143                .into_iter()
2144                {
2145                    ctx.emit(insn);
2146                }
2147            }
2148        }
2149    }
2150
2151    /// Add a constraint for an argument value from a source register.
2152    /// For large arguments with associated stack buffer, this may
2153    /// load the address of the buffer into the argument register, if
2154    /// required by the ABI.
2155    pub fn gen_arg(&mut self, ctx: &mut Lower<M::I>, idx: usize, from_regs: ValueRegs<Reg>) {
2156        let stack_arg_space = ctx.sigs()[self.sig].sized_stack_arg_space;
2157        let stack_arg = if self.is_tail_call() {
2158            StackAMode::IncomingArg
2159        } else {
2160            |offset, _| StackAMode::OutgoingArg(offset)
2161        };
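        // For tail calls, stack arguments are written back into our own
        // incoming argument area (which the callee takes over); for ordinary
        // calls, they go into the outgoing argument area.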
2162        let word_rc = M::word_reg_class();
2163        let word_bits = M::word_bits() as usize;
2164
2165        match ctx.sigs().args(self.sig)[idx].clone() {
2166            ABIArg::Slots { ref slots, .. } => {
2167                assert_eq!(from_regs.len(), slots.len());
2168                for (slot, from_reg) in slots.iter().zip(from_regs.regs().iter()) {
2169                    match slot {
2170                        &ABIArgSlot::Reg {
2171                            reg, ty, extension, ..
2172                        } => {
2173                            let ext = M::get_ext_mode(ctx.sigs()[self.sig].call_conv, extension);
2174                            let vreg =
2175                                if ext != ir::ArgumentExtension::None && ty_bits(ty) < word_bits {
2176                                    assert_eq!(word_rc, reg.class());
2177                                    let signed = match ext {
2178                                        ir::ArgumentExtension::Uext => false,
2179                                        ir::ArgumentExtension::Sext => true,
2180                                        _ => unreachable!(),
2181                                    };
2182                                    let extend_result =
2183                                        ctx.alloc_tmp(M::word_type()).only_reg().unwrap();
2184                                    ctx.emit(M::gen_extend(
2185                                        extend_result,
2186                                        *from_reg,
2187                                        signed,
2188                                        ty_bits(ty) as u8,
2189                                        word_bits as u8,
2190                                    ));
2191                                    extend_result.to_reg()
2192                                } else {
2193                                    *from_reg
2194                                };
2195
2196                            let preg = reg.into();
2197                            self.uses.push(CallArgPair { vreg, preg });
2198                        }
2199                        &ABIArgSlot::Stack {
2200                            offset,
2201                            ty,
2202                            extension,
2203                            ..
2204                        } => {
2205                            let ext = M::get_ext_mode(ctx.sigs()[self.sig].call_conv, extension);
2206                            let (data, ty) =
2207                                if ext != ir::ArgumentExtension::None && ty_bits(ty) < word_bits {
2208                                    assert_eq!(word_rc, from_reg.class());
2209                                    let signed = match ext {
2210                                        ir::ArgumentExtension::Uext => false,
2211                                        ir::ArgumentExtension::Sext => true,
2212                                        _ => unreachable!(),
2213                                    };
2214                                    let extend_result =
2215                                        ctx.alloc_tmp(M::word_type()).only_reg().unwrap();
2216                                    ctx.emit(M::gen_extend(
2217                                        extend_result,
2218                                        *from_reg,
2219                                        signed,
2220                                        ty_bits(ty) as u8,
2221                                        word_bits as u8,
2222                                    ));
2223                                    // Store the extended version.
2224                                    (extend_result.to_reg(), M::word_type())
2225                                } else {
2226                                    (*from_reg, ty)
2227                                };
2228                            ctx.emit(M::gen_store_stack(
2229                                stack_arg(offset, stack_arg_space),
2230                                data,
2231                                ty,
2232                            ));
2233                        }
2234                    }
2235                }
2236            }
2237            ABIArg::StructArg { .. } => {
2238                // Only supported via ISLE.
2239            }
2240            ABIArg::ImplicitPtrArg {
2241                offset,
2242                pointer,
2243                ty,
2244                purpose: _,
2245            } => {
2246                assert_eq!(from_regs.len(), 1);
2247                let vreg = from_regs.regs()[0];
2248                let amode = StackAMode::OutgoingArg(offset);
2249                let tmp = ctx.alloc_tmp(M::word_type()).only_reg().unwrap();
2250                ctx.emit(M::gen_get_stack_addr(amode, tmp));
2251                let tmp = tmp.to_reg();
2252                ctx.emit(M::gen_store_base_offset(tmp, 0, vreg, ty));
2253                match pointer {
2254                    ABIArgSlot::Reg { reg, .. } => self.uses.push(CallArgPair {
2255                        vreg: tmp,
2256                        preg: reg.into(),
2257                    }),
2258                    ABIArgSlot::Stack { offset, .. } => ctx.emit(M::gen_store_stack(
2259                        stack_arg(offset, stack_arg_space),
2260                        tmp,
2261                        M::word_type(),
2262                    )),
2263                }
2264            }
2265        }
2266    }
2267
2268    /// Call `gen_arg` for each non-hidden argument and emit all instructions
2269    /// generated.
2270    pub fn emit_args(&mut self, ctx: &mut Lower<M::I>, (inputs, off): isle::ValueSlice) {
2271        let num_args = self.num_args(ctx.sigs());
2272        assert_eq!(inputs.len(&ctx.dfg().value_lists) - off, num_args);
2273
2274        let mut arg_value_regs: SmallVec<[_; 16]> = smallvec![];
2275        for i in 0..num_args {
2276            let input = inputs.get(off + i, &ctx.dfg().value_lists).unwrap();
2277            arg_value_regs.push(ctx.put_value_in_regs(input));
2278        }
2279        for (i, arg_regs) in arg_value_regs.iter().enumerate() {
2280            self.emit_copy_regs_to_buffer(ctx, i, *arg_regs);
2281        }
2282        for (i, value_regs) in arg_value_regs.iter().enumerate() {
2283            self.gen_arg(ctx, i, *value_regs);
2284        }
2285    }
2286
2287    /// Emit the code to forward a stack-return pointer argument through a tail
2288    /// call.
2289    pub fn emit_stack_ret_arg_for_tail_call(&mut self, ctx: &mut Lower<M::I>) {
2290        if let Some(i) = ctx.sigs()[self.sig].stack_ret_arg() {
2291            let ret_area_ptr = ctx.abi().ret_area_ptr.expect(
2292                "if the tail callee has a return pointer, then the tail caller \
2293                 must as well",
2294            );
2295            self.gen_arg(ctx, i.into(), ValueRegs::one(ret_area_ptr));
2296        }
2297    }
2298
2299    /// Define a return value after the call returns.
2300    pub fn gen_retval(
2301        &mut self,
2302        ctx: &mut Lower<M::I>,
2303        idx: usize,
2304    ) -> (SmallInstVec<M::I>, ValueRegs<Reg>) {
2305        let mut insts = smallvec![];
2306        let mut into_regs: SmallVec<[Reg; 2]> = smallvec![];
2307        let ret = ctx.sigs().rets(self.sig)[idx].clone();
2308        match ret {
2309            ABIArg::Slots { ref slots, .. } => {
2310                for slot in slots {
2311                    match slot {
2312                        // Extension mode doesn't matter because we're copying out, not in,
2313                        // and we ignore high bits in our own registers by convention.
2314                        &ABIArgSlot::Reg { reg, ty, .. } => {
2315                            let into_reg = ctx.alloc_tmp(ty).only_reg().unwrap();
2316                            self.defs.push(CallRetPair {
2317                                vreg: into_reg,
2318                                preg: reg.into(),
2319                            });
2320                            into_regs.push(into_reg.to_reg());
2321                        }
2322                        &ABIArgSlot::Stack { offset, ty, .. } => {
2323                            let into_reg = ctx.alloc_tmp(ty).only_reg().unwrap();
2324                            let sig_data = &ctx.sigs()[self.sig];
2325                            // The outgoing argument area must always be restored after a call,
2326                            // ensuring that the return values will be in a consistent place after
2327                            // any call.
2328                            let ret_area_base = sig_data.sized_stack_arg_space();
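                            // E.g. (sketch): with 32 bytes of stack args, a
                            // return slot at `offset` 8 is loaded from
                            // outgoing-arg offset 40.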
2329                            insts.push(M::gen_load_stack(
2330                                StackAMode::OutgoingArg(offset + ret_area_base),
2331                                into_reg,
2332                                ty,
2333                            ));
2334                            into_regs.push(into_reg.to_reg());
2335                        }
2336                    }
2337                }
2338            }
2339            ABIArg::StructArg { .. } => {
2340                panic!("StructArg not supported in return position");
2341            }
2342            ABIArg::ImplicitPtrArg { .. } => {
2343                panic!("ImplicitPtrArg not supported in return position");
2344            }
2345        }
2346
2347        let value_regs = match *into_regs {
2348            [a] => ValueRegs::one(a),
2349            [a, b] => ValueRegs::two(a, b),
2350            _ => panic!("Expected to see one or two slots only from {ret:?}"),
2351        };
2352        (insts, value_regs)
2353    }
2354
2355    /// Emit the call itself.
2356    ///
2357    /// The returned instruction should have proper use- and def-sets according
2358    /// to the argument registers, return-value registers, and clobbered
2359    /// registers for this function signature in this ABI.
2360    ///
2361    /// (Arg registers are uses, and retval registers are defs. Clobbered
2362    /// registers are also logically defs, but should never be read; their
2363    /// values are "defined" (to the regalloc) but "undefined" in every other
2364    /// sense.)
2365    ///
2366    /// This function should only be called once, as it is allowed to re-use
2367    /// parts of the `CallSite` object in emitting instructions.
2368    pub fn emit_call(&mut self, ctx: &mut Lower<M::I>) {
2369        let word_type = M::word_type();
2370        if let Some(i) = ctx.sigs()[self.sig].stack_ret_arg {
2371            let rd = ctx.alloc_tmp(word_type).only_reg().unwrap();
2372            let ret_area_base = ctx.sigs()[self.sig].sized_stack_arg_space();
2373            ctx.emit(M::gen_get_stack_addr(
2374                StackAMode::OutgoingArg(ret_area_base),
2375                rd,
2376            ));
2377            self.gen_arg(ctx, i.into(), ValueRegs::one(rd.to_reg()));
2378        }
2379
2380        let uses = mem::take(&mut self.uses);
2381        let defs = mem::take(&mut self.defs);
2382        let clobbers = {
2383            // Get clobbers: all caller-saves. These may include return value
2384            // regs, which we will remove from the clobber set below.
2385            let mut clobbers = <M>::get_regs_clobbered_by_call(ctx.sigs()[self.sig].call_conv);
2386
2387            // Remove retval regs from clobbers.
2388            for def in &defs {
2389                clobbers.remove(PReg::from(def.preg.to_real_reg().unwrap()));
2390            }
2391
2392            clobbers
2393        };
2394
2395        let sig = &ctx.sigs()[self.sig];
2396        let callee_pop_size = if sig.call_conv() == isa::CallConv::Tail {
2397            // The tail calling convention has callees pop stack arguments.
2398            sig.sized_stack_arg_space
2399        } else {
2400            0
2401        };
2402
2403        let call_conv = sig.call_conv;
2404        let ret_space = sig.sized_stack_ret_space;
2405        let arg_space = sig.sized_stack_arg_space;
2406
2407        ctx.abi_mut()
2408            .accumulate_outgoing_args_size(ret_space + arg_space);
2409
2410        let tmp = ctx.alloc_tmp(word_type).only_reg().unwrap();
2411
2412        // Any adjustment to SP to account for required outgoing arguments/stack return values must
2413        // be done inside of the call pseudo-op, to ensure that SP is always in a consistent
2414        // state for all other instructions. For example, if a tail-call ABI function is called
2415        // here, the reclamation of the outgoing argument area must be done inside of the call
2416        // pseudo-op's emission to ensure that SP is consistent at all other points in the lowered
2417        // function. (Except the prologue and epilogue, but those are fairly special parts of the
2418        // function that establish the SP invariants that are relied on elsewhere and are generated
2419        // after the register allocator has run and thus cannot have register allocator-inserted
2420        // references to SP offsets.)
2421        for inst in M::gen_call(
2422            &self.dest,
2423            tmp,
2424            CallInfo {
2425                dest: (),
2426                uses,
2427                defs,
2428                clobbers,
2429                callee_conv: call_conv,
2430                caller_conv: self.caller_conv,
2431                callee_pop_size,
2432            },
2433        )
2434        .into_iter()
2435        {
2436            ctx.emit(inst);
2437        }
2438    }
2439}
2440
2441#[cfg(test)]
2442mod tests {
2443    use super::SigData;
2444
2445    #[test]
2446    fn sig_data_size() {
2447        // The size of `SigData` is performance sensitive, so make sure
2448        // we don't regress it unintentionally.
2449        assert_eq!(std::mem::size_of::<SigData>(), 24);
2450    }
2451}