Struct wasmtime_environ::wasmparser::Parser
pub struct Parser { /* private fields */ }
An incremental parser of a binary WebAssembly module or component.
This type is intended to be used to incrementally parse a WebAssembly module or component as bytes become available for the module. This can also be used to parse modules or components that are already entirely resident within memory.
The primary function for a parser is the Parser::parse function, which will incrementally consume input. You can also use the Parser::parse_all function to parse a module or component that is entirely resident in memory.
Implementations
impl Parser
pub fn new(offset: u64) -> Parser
Creates a new parser.
Reports errors and ranges relative to the offset provided, where offset is some logical offset within the input stream that we’re parsing.
pub fn is_core_wasm(bytes: &[u8]) -> bool
Tests whether bytes looks like a core WebAssembly module.
This will inspect the first 8 bytes of bytes and return true if it starts with the standard core WebAssembly header.
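As a rough illustration of what such a header check involves, the sketch below compares the start of the input against the 8-byte core module preamble: the magic bytes \0asm followed by version 1 as a little-endian 32-bit integer. The names CORE_WASM_HEADER and looks_like_core_wasm are hypothetical and this is not the library's implementation; real code should call Parser::is_core_wasm.

```rust
/// The 8-byte preamble of a core WebAssembly module:
/// the magic `\0asm` followed by version 1 as a little-endian u32.
const CORE_WASM_HEADER: [u8; 8] = [0x00, b'a', b's', b'm', 0x01, 0x00, 0x00, 0x00];

/// Hypothetical stand-in for illustration; not the wasmparser implementation.
/// Returns true if `bytes` begins with the core module preamble.
/// Inputs shorter than 8 bytes can never match.
fn looks_like_core_wasm(bytes: &[u8]) -> bool {
    bytes.starts_with(&CORE_WASM_HEADER)
}
```

Only the preamble is inspected, so a true result says nothing about whether the rest of the binary is well formed.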
pub fn is_component(bytes: &[u8]) -> bool
Tests whether bytes looks like a WebAssembly component.
This will inspect the first 8 bytes of bytes and return true if it starts with the standard WebAssembly component header.
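For intuition, a component preamble shares the \0asm magic with core modules but splits the following four bytes into a 16-bit version and a 16-bit layer, with layer 1 marking a component. The sketch below checks only the magic and the layer. It is a hypothetical helper (looks_like_component, WASM_MAGIC and COMPONENT_LAYER are invented names), and the library's own check may be stricter, for example by also requiring a specific version value.

```rust
/// Magic bytes shared by core modules and components.
const WASM_MAGIC: [u8; 4] = *b"\0asm";

/// Layer field (little-endian u16) stored in bytes 6..8 of the preamble:
/// 0 for a core module, 1 for a component.
const COMPONENT_LAYER: [u8; 2] = [0x01, 0x00];

/// Hypothetical stand-in for illustration; not the wasmparser implementation.
/// Assumption: the version field (bytes 4..6) is deliberately left unchecked.
fn looks_like_component(bytes: &[u8]) -> bool {
    bytes.len() >= 8 && bytes[..4] == WASM_MAGIC && bytes[6..8] == COMPONENT_LAYER
}
```

Prefer Parser::is_component in real code, since it tracks whatever header the library actually accepts.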
pub fn parse<'a>( &mut self, data: &'a [u8], eof: bool ) -> Result<Chunk<'a>, BinaryReaderError>
Attempts to parse a chunk of data.
This method will attempt to parse the next incremental portion of a WebAssembly binary. Data available for the module or component is provided as data, and the data can be incomplete if more data has yet to arrive. The eof flag indicates whether more data will ever be received.
There are two ways parsing can succeed with this method:
- Chunk::NeedMoreData - this indicates that there are not enough bytes in data to parse a payload. The caller needs to wait for more data to be available in this situation before calling this method again. It is guaranteed that this is only returned if eof is false.
- Chunk::Parsed - this indicates that a chunk of the input was successfully parsed. The parsed payload is available in this variant, along with how many bytes of data were consumed. It’s expected that the caller will not provide these bytes back to the Parser again.
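The contract between the two outcomes can be sketched with a toy parser that is independent of wasmparser. Everything here is hypothetical: ToyChunk only mirrors the shape of Chunk, and the record format (one length byte followed by that many bytes) is invented for the example. The point is the calling convention: ask for more data only when eof is false, and never hand consumed bytes back.

```rust
/// Toy stand-in mirroring the shape of `Chunk`; hypothetical, not the wasmparser type.
#[derive(Debug, PartialEq)]
enum ToyChunk<'a> {
    /// Not enough input yet; the value hints at how many more bytes are needed.
    NeedMoreData(u64),
    /// One record was parsed; `consumed` bytes must not be passed in again.
    Parsed { consumed: usize, payload: &'a [u8] },
}

/// Parses one record of an invented format: a 1-byte length, then that many bytes.
fn parse_record(data: &[u8], eof: bool) -> Result<ToyChunk<'_>, String> {
    // Bytes this record needs in total: the length byte plus the body if the
    // length byte has arrived, otherwise at least the length byte itself.
    let needed = match data.first() {
        Some(&len) => 1 + len as usize,
        None => 1,
    };
    if data.len() < needed {
        // With `eof` set no more data can arrive, so a short input is an
        // error rather than `NeedMoreData`.
        if eof {
            return Err(format!("unexpected end of input: {} more bytes needed", needed - data.len()));
        }
        return Ok(ToyChunk::NeedMoreData((needed - data.len()) as u64));
    }
    // The payload borrows from `data`, just as parsed chunks borrow the input buffer.
    Ok(ToyChunk::Parsed { consumed: needed, payload: &data[1..needed] })
}

/// Feeds `input` to `parse_record` in pieces of `step` bytes, as if it were
/// arriving from a stream, and collects the parsed record bodies.
fn collect_records(input: &[u8], step: usize) -> Result<Vec<Vec<u8>>, String> {
    let step = step.max(1); // a zero step would never make progress
    let mut records = Vec::new();
    let mut start = 0; // bytes already consumed; never handed back
    let mut end = 0; // bytes that have "arrived" so far
    while start < input.len() {
        let eof = end == input.len();
        match parse_record(&input[start..end], eof)? {
            ToyChunk::NeedMoreData(_) => end = (end + step).min(input.len()),
            ToyChunk::Parsed { consumed, payload } => {
                records.push(payload.to_vec());
                start += consumed;
            }
        }
    }
    Ok(records)
}
```

Unlike the real parser, this toy has no end-of-input payload; the driver simply stops once every byte has been consumed.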
Note that all Chunk return values are connected, with a lifetime, to the input buffer. Each successfully parsed chunk borrows the input buffer and is a view into it.
It is expected that you’ll call this method until Payload::End is reached, at which point you’re guaranteed that the parse has completed. Note that complete parsing, for the top-level module or component, implies that data is empty and eof is true.
Errors
Parse errors are returned as an Err. Errors can happen when the structure of the data is unexpected or when sections are too large, for example. Note that errors are not returned for malformed contents of sections here. Sections are generally not individually parsed, and each returned Payload needs to be iterated over further to detect all errors.
Examples
An example of reading a wasm file from a stream (std::io::Read) and incrementally parsing it.
use std::io::Read;
use anyhow::Result;
use wasmparser::{Parser, Chunk, Payload::*};
fn parse(mut reader: impl Read) -> Result<()> {
    let mut buf = Vec::new();
    let mut cur = Parser::new(0);
    let mut eof = false;
    let mut stack = Vec::new();
    loop {
        let (payload, consumed) = match cur.parse(&buf, eof)? {
            Chunk::NeedMoreData(hint) => {
                assert!(!eof); // otherwise an error would be returned
                // Use the hint to preallocate more space, then read
                // some more data into our buffer.
                //
                // Note that the buffer management here is not ideal,
                // but it's compact enough to fit in an example!
                let len = buf.len();
                buf.extend((0..hint).map(|_| 0u8));
                let n = reader.read(&mut buf[len..])?;
                buf.truncate(len + n);
                eof = n == 0;
                continue;
            }
            Chunk::Parsed { consumed, payload } => (payload, consumed),
        };
        match payload {
            // Sections for WebAssembly modules
            Version { .. } => { /* ... */ }
            TypeSection(_) => { /* ... */ }
            ImportSection(_) => { /* ... */ }
            FunctionSection(_) => { /* ... */ }
            TableSection(_) => { /* ... */ }
            MemorySection(_) => { /* ... */ }
            TagSection(_) => { /* ... */ }
            GlobalSection(_) => { /* ... */ }
            ExportSection(_) => { /* ... */ }
            StartSection { .. } => { /* ... */ }
            ElementSection(_) => { /* ... */ }
            DataCountSection { .. } => { /* ... */ }
            DataSection(_) => { /* ... */ }
            // Here we know how many functions we'll be receiving as
            // `CodeSectionEntry`, so we can prepare for that, and
            // afterwards we can parse and handle each function
            // individually.
            CodeSectionStart { .. } => { /* ... */ }
            CodeSectionEntry(body) => {
                // here we can iterate over `body` to parse the function
                // and its locals
            }
            // Sections for WebAssembly components
            InstanceSection(_) => { /* ... */ }
            CoreTypeSection(_) => { /* ... */ }
            ComponentInstanceSection(_) => { /* ... */ }
            ComponentAliasSection(_) => { /* ... */ }
            ComponentTypeSection(_) => { /* ... */ }
            ComponentCanonicalSection(_) => { /* ... */ }
            ComponentStartSection { .. } => { /* ... */ }
            ComponentImportSection(_) => { /* ... */ }
            ComponentExportSection(_) => { /* ... */ }
            ModuleSection { parser, .. }
            | ComponentSection { parser, .. } => {
                stack.push(cur.clone());
                cur = parser.clone();
            }
            CustomSection(_) => { /* ... */ }
            // most likely you'd return an error here
            UnknownSection { id, .. } => { /* ... */ }
            // Once we've reached the end of a parser we either resume
            // at the parent parser or we break out of the loop because
            // we're done.
            End(_) => {
                if let Some(parent_parser) = stack.pop() {
                    cur = parent_parser;
                } else {
                    break;
                }
            }
        }
        // once we're done processing the payload we can forget the
        // original.
        buf.drain(..consumed);
    }
    Ok(())
}
pub fn parse_all( self, data: &[u8] ) -> impl Iterator<Item = Result<Payload<'_>, BinaryReaderError>>
Convenience function that can be used to parse a module or component that is entirely resident in memory.
This function will parse the data provided as a WebAssembly module or component.
Note that when this function yields sections that provide parsers, no further action is required for those sections as payloads from those parsers will be automatically returned.
Examples
An example of reading a wasm file from a stream (std::io::Read) into a buffer and then parsing it.
use std::io::Read;
use anyhow::Result;
use wasmparser::{Parser, Chunk, Payload::*};
fn parse(mut reader: impl Read) -> Result<()> {
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf)?;
    let parser = Parser::new(0);
    for payload in parser.parse_all(&buf) {
        match payload? {
            // Sections for WebAssembly modules
            Version { .. } => { /* ... */ }
            TypeSection(_) => { /* ... */ }
            ImportSection(_) => { /* ... */ }
            FunctionSection(_) => { /* ... */ }
            TableSection(_) => { /* ... */ }
            MemorySection(_) => { /* ... */ }
            TagSection(_) => { /* ... */ }
            GlobalSection(_) => { /* ... */ }
            ExportSection(_) => { /* ... */ }
            StartSection { .. } => { /* ... */ }
            ElementSection(_) => { /* ... */ }
            DataCountSection { .. } => { /* ... */ }
            DataSection(_) => { /* ... */ }
            // Here we know how many functions we'll be receiving as
            // `CodeSectionEntry`, so we can prepare for that, and
            // afterwards we can parse and handle each function
            // individually.
            CodeSectionStart { .. } => { /* ... */ }
            CodeSectionEntry(body) => {
                // here we can iterate over `body` to parse the function
                // and its locals
            }
            // Sections for WebAssembly components
            ModuleSection { .. } => { /* ... */ }
            InstanceSection(_) => { /* ... */ }
            CoreTypeSection(_) => { /* ... */ }
            ComponentSection { .. } => { /* ... */ }
            ComponentInstanceSection(_) => { /* ... */ }
            ComponentAliasSection(_) => { /* ... */ }
            ComponentTypeSection(_) => { /* ... */ }
            ComponentCanonicalSection(_) => { /* ... */ }
            ComponentStartSection { .. } => { /* ... */ }
            ComponentImportSection(_) => { /* ... */ }
            ComponentExportSection(_) => { /* ... */ }
            CustomSection(_) => { /* ... */ }
            // most likely you'd return an error here
            UnknownSection { id, .. } => { /* ... */ }
            // Once we've reached the end of a parser we either resume
            // at the parent parser or the payload iterator is at its
            // end and we're done.
            End(_) => {}
        }
    }
    Ok(())
}
pub fn skip_section(&mut self)
Skip parsing the code section entirely.
This function can be used to indicate, after receiving CodeSectionStart, that the section will not be parsed.
The caller will be responsible for skipping size bytes (found in the CodeSectionStart payload). Bytes should only be fed into parse after the size bytes have been skipped.
Panics
This function will panic if the parser is not in a state where it’s parsing the code section.
Examples
use wasmparser::{Result, Parser, Chunk, Payload::*};
use std::ops::Range;
fn objdump_headers(mut wasm: &[u8]) -> Result<()> {
    let mut parser = Parser::new(0);
    loop {
        let payload = match parser.parse(wasm, true)? {
            Chunk::Parsed { consumed, payload } => {
                wasm = &wasm[consumed..];
                payload
            }
            // this state isn't possible with `eof = true`
            Chunk::NeedMoreData(_) => unreachable!(),
        };
        match payload {
            TypeSection(s) => print_range("type section", &s.range()),
            ImportSection(s) => print_range("import section", &s.range()),
            // .. other sections
            // Print the range of the code section we see, but don't
            // actually iterate over each individual function.
            CodeSectionStart { range, size, .. } => {
                print_range("code section", &range);
                parser.skip_section();
                wasm = &wasm[size as usize..];
            }
            End(_) => break,
            _ => {}
        }
    }
    Ok(())
}
fn print_range(section: &str, range: &Range<usize>) {
    println!("{:>40}: {:#010x} - {:#010x}", section, range.start, range.end);
}