// This module exists only to provide a separate page for the implementation
// documentation.

//! Notes on `sharded-slab`'s implementation and design.
//!
//! # Design
//!
//! The sharded slab's design is strongly inspired by the ideas presented by
//! Leijen, Zorn, and de Moura in [Mimalloc: Free List Sharding in
//! Action][mimalloc]. In this report, the authors present a novel design for a
//! memory allocator based on a concept of _free list sharding_.
//!
//! Memory allocators must keep track of what memory regions are not currently
//! allocated ("free") in order to provide them to future allocation requests.
//! The term [_free list_][freelist] refers to a technique for performing this
//! bookkeeping, where each free block stores a pointer to the next free block,
//! forming a linked list. The memory allocator keeps a pointer to the most
//! recently freed block, the _head_ of the free list. To allocate more memory,
//! the allocator pops from the free list by setting the head pointer to the
//! next free block of the current head block, and returning the previous head.
//! To deallocate a block, the block is pushed to the free list by setting its
//! first word to the current head pointer, and the head pointer is set to point
//! to the deallocated block. Most implementations of slab allocators backed by
//! arrays or vectors use a similar technique, where pointers are replaced by
//! indices into the backing array.
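//!
//! As an illustration, a single-threaded, vector-backed slab with an
//! index-based free list might look something like the sketch below (a
//! simplified example for this documentation, not this crate's actual code):
//!
//! ```rust,ignore
//! enum Slot<T> {
//!     /// The slot holds a value.
//!     Occupied(T),
//!     /// The slot is free and stores the index of the next free slot, if any.
//!     Free { next: Option<usize> },
//! }
//!
//! struct Slab<T> {
//!     slots: Vec<Slot<T>>,
//!     /// Index of the most recently freed slot: the head of the free list.
//!     free_head: Option<usize>,
//! }
//!
//! impl<T> Slab<T> {
//!     /// Allocate: pop the head of the free list, or grow the vector.
//!     fn insert(&mut self, value: T) -> usize {
//!         match self.free_head {
//!             Some(idx) => {
//!                 // The popped slot's `next` becomes the new head.
//!                 if let Slot::Free { next } = self.slots[idx] {
//!                     self.free_head = next;
//!                 }
//!                 self.slots[idx] = Slot::Occupied(value);
//!                 idx
//!             }
//!             None => {
//!                 // The free list is empty: grow the backing vector instead.
//!                 self.slots.push(Slot::Occupied(value));
//!                 self.slots.len() - 1
//!             }
//!         }
//!     }
//!
//!     /// Deallocate: push the slot back onto the free list.
//!     fn remove(&mut self, idx: usize) {
//!         self.slots[idx] = Slot::Free { next: self.free_head };
//!         self.free_head = Some(idx);
//!     }
//! }
//! ```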
//!
//! When allocations and deallocations can occur concurrently across threads,
//! they must synchronize accesses to the free list; either by putting the
//! entire allocator state inside of a lock, or by using atomic operations to
//! treat the free list as a lock-free structure (such as a [Treiber stack]). In
//! both cases, there is a significant performance cost — even when the free
//! list is lock-free, it is likely that a noticeable amount of time will be
//! spent in compare-and-swap loops. Ideally, the global synchronization point
//! created by the single global free list could be avoided as much as possible.
//!
//! The approach presented by Leijen, Zorn, and de Moura is to introduce
//! sharding and thus increase the granularity of synchronization significantly.
//! In mimalloc, the heap is _sharded_ so that each thread has its own
//! thread-local heap. Objects are always allocated from the local heap of the
//! thread where the allocation is performed. Because allocations are always
//! done from a thread's local heap, they need not be synchronized.
//!
//! However, since objects can move between threads before being deallocated,
//! _deallocations_ may still occur concurrently. Therefore, Leijen et al.
//! introduce a concept of _local_ and _global_ free lists. When an object is
//! deallocated on the same thread it was originally allocated on, it is placed
//! on the local free list; if it is deallocated on another thread, it goes on
//! the global free list for the heap of the thread from which it originated. To
//! allocate, the local free list is used first; if it is empty, the entire
//! global free list is popped onto the local free list. Since the local free
//! list is only ever accessed by the thread it belongs to, it does not require
//! synchronization at all, and because the global free list is popped from
//! infrequently, the cost of synchronization has a reduced impact. A majority
//! of allocations can occur without any synchronization at all, and
//! deallocations only require synchronization when an object has left its
//! parent thread (a relatively uncommon case).
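//!
//! A minimal sketch of this scheme is shown below. It is not mimalloc's or
//! this crate's actual code: plain `Vec`s stand in for the linked free lists,
//! and a `Mutex` stands in for whatever synchronization the remote (global)
//! list really uses.
//!
//! ```rust,ignore
//! use std::sync::Mutex;
//!
//! /// State shared with other threads: only the remote ("global") free list.
//! struct SharedHeap {
//!     remote_free: Mutex<Vec<usize>>,
//! }
//!
//! /// State owned exclusively by the allocating thread.
//! struct LocalHeap {
//!     local_free: Vec<usize>,
//! }
//!
//! impl LocalHeap {
//!     /// Called only by the owning thread, so the local free list needs no
//!     /// synchronization at all.
//!     fn alloc(&mut self, shared: &SharedHeap) -> Option<usize> {
//!         // Fast path: pop from the unsynchronized local free list.
//!         if let Some(idx) = self.local_free.pop() {
//!             return Some(idx);
//!         }
//!         // Slow path: pop the *entire* remote free list onto the local
//!         // list, paying the synchronization cost only occasionally.
//!         self.local_free.append(&mut shared.remote_free.lock().unwrap());
//!         self.local_free.pop()
//!     }
//!
//!     /// Deallocation on the owning thread: push onto the local list.
//!     fn free_local(&mut self, idx: usize) {
//!         self.local_free.push(idx);
//!     }
//! }
//!
//! impl SharedHeap {
//!     /// Deallocation from any other thread: push onto the remote list.
//!     fn free_remote(&self, idx: usize) {
//!         self.remote_free.lock().unwrap().push(idx);
//!     }
//! }
//! ```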
//!
//! [mimalloc]: https://www.microsoft.com/en-us/research/uploads/prod/2019/06/mimalloc-tr-v1.pdf
//! [freelist]: https://en.wikipedia.org/wiki/Free_list
//! [Treiber stack]: https://en.wikipedia.org/wiki/Treiber_stack
//!
//! # Implementation
//!
//! A slab is represented as an array of [`MAX_THREADS`] _shards_. A shard
//! consists of a vector of one or more _pages_ plus associated metadata.
//! Finally, a page consists of an array of _slots_, plus the head indices of
//! the local and remote free lists.
//!
//! ```text
//! ┌─────────────┐
//! │ shard 1     │
//! │             │    ┌─────────────┐        ┌────────┐
//! │ pages───────┼───▶│ page 1      │        │        │
//! ├─────────────┤    ├─────────────┤  ┌────▶│  next──┼─┐
//! │ shard 2     │    │ page 2      │  │     ├────────┤ │
//! ├─────────────┤    │             │  │     │XXXXXXXX│ │
//! │ shard 3     │    │ local_head──┼──┘     ├────────┤ │
//! └─────────────┘    │ remote_head─┼──┐     │        │◀┘
//!       ...          ├─────────────┤  │     │  next──┼─┐
//! ┌─────────────┐    │ page 3      │  │     ├────────┤ │
//! │ shard n     │    └─────────────┘  │     │XXXXXXXX│ │
//! └─────────────┘          ...        │     ├────────┤ │
//!                    ┌─────────────┐  │     │XXXXXXXX│ │
//!                    │ page n      │  │     ├────────┤ │
//!                    └─────────────┘  │     │        │◀┘
//!                                     └────▶│  next──┼───▶  ...
//!                                           ├────────┤
//!                                           │XXXXXXXX│
//!                                           └────────┘
//! ```
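//!
//! In simplified Rust terms, the layout in the diagram corresponds to
//! something like the sketch below; the crate's real types carry additional
//! metadata and use interior mutability where needed:
//!
//! ```rust,ignore
//! use std::sync::atomic::AtomicUsize;
//!
//! /// One shard per thread, up to `MAX_THREADS`.
//! struct Shard {
//!     /// Pages are allocated lazily; each new page is twice the size of
//!     /// the previous one.
//!     pages: Vec<Page>,
//!     // ...plus shard metadata, such as the ID of the owning thread.
//! }
//!
//! struct Page {
//!     /// The slots themselves (the `XXXX` boxes in the diagram are
//!     /// occupied slots).
//!     slots: Box<[Slot]>,
//!     /// Head index of the local free list, touched only by the owning
//!     /// thread.
//!     local_head: usize,
//!     /// Head index of the remote free list, pushed to by other threads.
//!     remote_head: AtomicUsize,
//! }
//!
//! struct Slot {
//!     /// Index of the next slot in whichever free list this slot is on.
//!     next: usize,
//!     // ...plus the stored value and its lifecycle metadata.
//! }
//! ```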
//!
//!
//! The size of the first page in a shard is always a power of two, and every
//! subsequent page added after the first is twice as large as the page that
//! precedes it.
//!
//! ```text
//!
//! pg.
//! ┌───┐   ┌─┬─┐
//! │ 0 │───▶ │ │
//! ├───┤   ├─┼─┼─┬─┐
//! │ 1 │───▶ │ │ │ │
//! ├───┤   ├─┼─┼─┼─┼─┬─┬─┬─┐
//! │ 2 │───▶ │ │ │ │ │ │ │ │
//! ├───┤   ├─┼─┼─┼─┼─┼─┼─┼─┼─┬─┬─┬─┬─┬─┬─┬─┐
//! │ 3 │───▶ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │
//! └───┘   └─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┘
//! ```
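//!
//! For instance, assuming a hypothetical initial page size of 32 slots, the
//! individual and cumulative page sizes work out as in this sketch:
//!
//! ```rust,ignore
//! const INITIAL_PAGE_SIZE: usize = 32;
//!
//! /// Number of slots in page `n`: 32, 64, 128, 256, ...
//! fn page_size(n: usize) -> usize {
//!     INITIAL_PAGE_SIZE << n
//! }
//!
//! /// Total number of slots in pages `0..=n`: 32, 96, 224, 480, ...
//! /// Equivalently, the offset at which page `n + 1` begins.
//! fn page_capacity(n: usize) -> usize {
//!     (INITIAL_PAGE_SIZE << (n + 1)) - INITIAL_PAGE_SIZE
//! }
//! ```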
//!
//! When searching for a free slot, the smallest page is searched first, and if
//! it is full, the search proceeds to the next page until either a free slot is
//! found or all available pages have been searched. If all available pages have
//! been searched and the maximum number of pages has not yet been reached, a
//! new page is then allocated.
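//!
//! A simplified sketch of that search is shown below, using stand-in types
//! rather than the crate's actual pages (which also track occupied slots and
//! synchronize their free lists):
//!
//! ```rust,ignore
//! /// Stand-in for a page: just a list of currently free slot indices.
//! struct Page {
//!     free: Vec<usize>,
//! }
//!
//! struct Shard {
//!     pages: Vec<Page>,
//!     initial_page_size: usize,
//!     max_pages: usize,
//! }
//!
//! impl Shard {
//!     /// Returns `(page index, slot index within that page)`, or `None` if
//!     /// every page is full and no more pages may be allocated.
//!     fn alloc(&mut self) -> Option<(usize, usize)> {
//!         // Search the existing pages, smallest first.
//!         for (n, page) in self.pages.iter_mut().enumerate() {
//!             if let Some(slot) = page.free.pop() {
//!                 return Some((n, slot));
//!             }
//!         }
//!         // Every existing page is full: allocate a new page, twice the
//!         // size of the previous one, unless the page limit is reached.
//!         if self.pages.len() == self.max_pages {
//!             return None;
//!         }
//!         let n = self.pages.len();
//!         let size = self.initial_page_size << n;
//!         let mut page = Page {
//!             // Every slot in the new page starts out free.
//!             free: (0..size).rev().collect(),
//!         };
//!         let slot = page.free.pop();
//!         self.pages.push(page);
//!         slot.map(|slot| (n, slot))
//!     }
//! }
//! ```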
//!
//! Since every page is twice as large as the previous page, and all page sizes
//! are powers of two, we can determine the page index that contains a given
//! address by shifting the address down by the smallest page size and looking
//! at how many twos places are necessary to represent that number, which tells
//! us what power-of-two page size it fits inside of. We can determine the
//! number of twos places by counting the number of leading zeros (unused twos
//! places) in the number's binary representation, and subtracting that count
//! from the total number of bits in a word.
//!
//! The formula for determining the page number that contains an offset is thus:
//!
//! ```rust,ignore
//! WIDTH - ((offset + INITIAL_PAGE_SIZE) >> INDEX_SHIFT).leading_zeros()
//! ```
//!
//! where `WIDTH` is the number of bits in a `usize`, and `INDEX_SHIFT` is
//!
//! ```rust,ignore
//! INITIAL_PAGE_SIZE.trailing_zeros() + 1;
//! ```
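//!
//! As a concrete (hypothetical) example, assume an initial page size of 32
//! slots, so that the pages hold 32, 64, 128, ... slots and begin at offsets
//! 0, 32, 96, 224, and so on. The computation can then be written as a
//! standalone sketch:
//!
//! ```rust,ignore
//! // Constants as assumed for this sketch; in the crate they are derived
//! // from the slab's configuration.
//! const INITIAL_PAGE_SIZE: usize = 32;
//! const WIDTH: usize = usize::BITS as usize;
//! const INDEX_SHIFT: usize = INITIAL_PAGE_SIZE.trailing_zeros() as usize + 1;
//!
//! fn page_index(offset: usize) -> usize {
//!     // Adding `INITIAL_PAGE_SIZE` and shifting down by `INDEX_SHIFT` maps
//!     // every offset on page 0 to zero, and every offset on page `n > 0`
//!     // to a value whose highest set bit is bit `n - 1`; subtracting the
//!     // number of leading zeros from the word width then recovers `n`.
//!     let shifted = (offset + INITIAL_PAGE_SIZE) >> INDEX_SHIFT;
//!     WIDTH - shifted.leading_zeros() as usize
//! }
//!
//! // page_index(0)  == 0    (first slot of page 0)
//! // page_index(31) == 0    (last slot of page 0, which holds 32 slots)
//! // page_index(32) == 1    (first slot of page 1, which holds 64 slots)
//! // page_index(95) == 1    (last slot of page 1)
//! // page_index(96) == 2    (first slot of page 2, which holds 128 slots)
//! ```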
//!
//! [`MAX_THREADS`]: https://docs.rs/sharded-slab/latest/sharded_slab/trait.Config.html#associatedconstant.MAX_THREADS