Module measureme::stringtable
A string table implementation with a tree-like encoding.
Each entry in the table represents a string and is encoded as a list of components where each component can either be
- a string value that contains actual UTF-8 string content,
- a string ID that contains a reference to another entry, or
- a terminator tag which marks the end of a component list.
The string content of an entry is defined as the concatenation of the content of its components. The content of a string value is its actual UTF-8 bytes. The content of a string ID is the contents of the entry it references.
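The recursive definition above can be sketched with a small in-memory model. This is an illustration only, not measureme's implementation: the `Component` enum, the `HashMap`-based table, and the `resolve` function are hypothetical names introduced for this example.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory model of one component of a table entry.
/// (The terminator tag is implicit here: it is the end of the `Vec`.)
enum Component {
    /// Literal UTF-8 content.
    Value(String),
    /// Reference to another entry, identified by its ID.
    Ref(u32),
}

/// Returns the string content of entry `id`: the concatenation of the
/// content of its components, following references recursively.
/// Returns `None` if `id` or any referenced entry is missing.
fn resolve(table: &HashMap<u32, Vec<Component>>, id: u32) -> Option<String> {
    let mut out = String::new();
    for component in table.get(&id)? {
        match component {
            // The content of a string value is its own bytes.
            Component::Value(s) => out.push_str(s),
            // The content of a string ID is the content of the entry it references.
            Component::Ref(target) => out.push_str(&resolve(table, *target)?),
        }
    }
    Some(out)
}
```

If entry 42 holds the value "XYZ" and entry 1 holds ["abc", Ref(42), "def"], then `resolve(&table, 1)` yields `Some("abcXYZdef")`. The sketch assumes references never form a cycle; a cyclic table would recurse without end.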
The byte-level encoding of component lists uses the structure of UTF-8 in order to save space:
- A valid UTF-8 codepoint never starts with the byte 0xFE. We make use of this fact by letting all string ID components start with this 0xFE prefix. Thus when we parse the contents of a value we know to stop if we encounter this byte.
- A valid UTF-8 string cannot contain the 0xFF byte. Thus we can safely use 0xFF as our component list terminator.
The sample composite string [“abc”, ID(42), “def”, TERMINATOR] would thus be encoded as:
['a', 'b', 'c', 254, 42, 0, 0, 0, 'd', 'e', 'f', 255]
                ^^^^^^^^^^^^^^^^                 ^^^
           string ID with 0xFE prefix     terminator (0xFF)
As you can see, string IDs are encoded in little-endian format.
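The byte layout of the sample can be reproduced with a minimal encoder sketch. The `Part` enum, the two constants, and the `encode` function are hypothetical names chosen for this illustration, not measureme's API. The 4-byte ID width is an assumption taken from the sample encoding.

```rust
/// Tag byte that introduces a string ID component (never starts a UTF-8 codepoint).
const STRING_REF_TAG: u8 = 0xFE;
/// Byte that ends a component list (never occurs in valid UTF-8).
const TERMINATOR: u8 = 0xFF;

/// Hypothetical component type for this sketch.
enum Part<'a> {
    /// Literal UTF-8 content, written as-is.
    Value(&'a str),
    /// Reference to another entry, written as tag + 4 little-endian bytes.
    Ref(u32),
}

/// Encodes a component list and appends the terminator.
fn encode(parts: &[Part<'_>]) -> Vec<u8> {
    let mut bytes = Vec::new();
    for part in parts {
        match part {
            Part::Value(s) => bytes.extend_from_slice(s.as_bytes()),
            Part::Ref(id) => {
                bytes.push(STRING_REF_TAG);
                bytes.extend_from_slice(&id.to_le_bytes());
            }
        }
    }
    bytes.push(TERMINATOR);
    bytes
}
```

Calling `encode(&[Part::Value("abc"), Part::Ref(42), Part::Value("def")])` produces the twelve bytes of the sample: the three bytes of "abc", then 254, 42, 0, 0, 0, then the three bytes of "def", then 255.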
Each string in the table is referred to via a StringId. StringIds may be generated in two ways:
- Calling StringTableBuilder::alloc() which returns the StringId for the allocated string.
- Calling StringId::new_virtual() to create a “virtual” StringId that later can be mapped to an actual string via StringTableBuilder::map_virtual_to_concrete_string().
String IDs allow you to deduplicate strings by allocating a string once and then referring to it by id over and over. This is a useful trick for strings which are recorded many times and it can significantly reduce the size of profile trace files.
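A rough sketch of why this saves space, using a toy table rather than the real StringTableBuilder. `ToyTable`, `inline_size`, and `deduplicated_size` are hypothetical names, and the 4-byte cost per ID is an assumption made for the arithmetic.

```rust
/// Hypothetical toy table: stores each allocated string once and
/// hands back its index as an ID.
struct ToyTable {
    entries: Vec<String>,
}

impl ToyTable {
    fn new() -> Self {
        ToyTable { entries: Vec::new() }
    }

    /// Stores `s` and returns the ID under which it can be referenced later.
    fn alloc(&mut self, s: &str) -> u32 {
        self.entries.push(s.to_string());
        (self.entries.len() - 1) as u32
    }

    /// Looks up the string stored under `id`, if any.
    fn get(&self, id: u32) -> Option<&str> {
        self.entries.get(id as usize).map(|s| s.as_str())
    }
}

/// Bytes used when each of `count` records carries the full label inline.
fn inline_size(label: &str, count: usize) -> usize {
    label.len() * count
}

/// Bytes used when the label is stored once and each record carries a 4-byte ID.
fn deduplicated_size(label: &str, count: usize) -> usize {
    label.len() + 4 * count
}
```

Under these assumptions, a 20-byte label recorded 1,000 times costs 20,000 bytes inline but 4,020 bytes when allocated once and referenced by ID.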
StringIds are partitioned according to type:
[0 .. MAX_VIRTUAL_STRING_ID, METADATA_STRING_ID, .. ]
From 0 to MAX_VIRTUAL_STRING_ID are the allowed values for virtual strings. After MAX_VIRTUAL_STRING_ID, there is one string id (METADATA_STRING_ID) which is used internally by measureme to record additional metadata about the profiling session. After METADATA_STRING_ID are all other StringId values.
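The partitioning can be sketched as a simple classification function. The constant values below are placeholders chosen for illustration, not the values measureme defines. `IdKind` and `classify` are hypothetical names. The sketch also assumes that the virtual range is inclusive of MAX_VIRTUAL_STRING_ID and that METADATA_STRING_ID directly follows it, as the description above suggests.

```rust
/// Placeholder value for illustration only; the real constant is defined by measureme.
const MAX_VIRTUAL_STRING_ID: u32 = 1_000;
/// Assumed to be the single ID directly after the virtual range.
const METADATA_STRING_ID: u32 = MAX_VIRTUAL_STRING_ID + 1;

/// Hypothetical classification of a raw ID value.
#[derive(Debug, PartialEq)]
enum IdKind {
    /// Resolved via the string table index data.
    Virtual,
    /// Reserved for the profiling session metadata.
    Metadata,
    /// Every other ID.
    Regular,
}

/// Maps a raw ID to the partition it falls into.
fn classify(id: u32) -> IdKind {
    if id <= MAX_VIRTUAL_STRING_ID {
        IdKind::Virtual
    } else if id == METADATA_STRING_ID {
        IdKind::Metadata
    } else {
        IdKind::Regular
    }
}
```

With the placeholder constants, `classify(1_000)` is `IdKind::Virtual`, `classify(1_001)` is `IdKind::Metadata`, and `classify(1_002)` is `IdKind::Regular`.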
Structs
- A StringId is used to identify a string in the StringTable. It is either a regular StringId, meaning that it contains the absolute address of a string within the string table data, or a “virtual” one, meaning that the address it points to is resolved via the string table index data, which maps virtual StringIds to addresses.
- Write-only version of the string table
Enums
- A single component of a string. Used for building composite table entries.
Constants
- The id of the profile metadata string entry.
Traits
- Anything that implements SerializableString can be written to a StringTable.