High performance XML reader/writer.
§Description
quick-xml contains two modes of operation:
A streaming API based on the StAX model. This is suited for larger XML documents which cannot be completely read into memory at once.
The user has to explicitly ask for the next XML event, similar to a database cursor. This is achieved by the following two structs:
- Reader: A low-level XML pull-reader where buffer allocation and clearing are left to the user.
- Writer: An XML writer. Can be nested with readers if you want to transform XML documents.
Especially for nested XML elements, the user must keep track of where (how deep) in the XML document the current event is located.
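The depth bookkeeping mentioned above can be sketched as follows. This is a minimal, self-contained illustration that uses a hypothetical stand-in enum, SketchEvent, instead of quick-xml's own event type, so the names and shapes here are assumptions rather than the library's API:

```rust
/// Hypothetical stand-in for a pull-parser event; NOT quick-xml's `Event` type.
#[derive(Debug, Clone, PartialEq)]
pub enum SketchEvent<'a> {
    /// An opening tag such as `<a>`.
    Start(&'a str),
    /// A closing tag such as `</a>`.
    End(&'a str),
    /// A self-closing tag such as `<a/>`.
    Empty(&'a str),
    /// Character data between tags.
    Text(&'a str),
}

/// Pairs every element name with the depth at which it appears (root = 0).
pub fn element_depths<'a>(events: &[SketchEvent<'a>]) -> Vec<(&'a str, usize)> {
    let mut depth = 0usize;
    let mut out = Vec::new();
    for event in events {
        match event {
            SketchEvent::Start(name) => {
                // Record the element, then descend into it.
                out.push((*name, depth));
                depth += 1;
            }
            SketchEvent::End(_) => {
                // Leave the current element; never underflow on malformed input.
                depth = depth.saturating_sub(1);
            }
            // A self-closing element does not change the depth.
            SketchEvent::Empty(name) => out.push((*name, depth)),
            SketchEvent::Text(_) => {}
        }
    }
    out
}
```

The same counter pattern carries over to a real read loop: increment on a start event, decrement on an end event, and leave the counter unchanged for self-closing elements and text.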
quick-xml also offers optional support for asynchronous reading and writing using tokio. To use it, enable the async-tokio feature.
Furthermore, quick-xml contains optional Serde support to serialize and deserialize structs directly, without having to deal with XML events. To use it, enable the serialize feature. Read more about mapping Rust types to XML in the documentation of the de module. Also check the serde_helpers module.
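Both optional capabilities described above are switched on through Cargo features. A minimal manifest fragment might look like the one below; the version requirement is a placeholder assumption, so substitute the release you actually target:

```toml
[dependencies]
# The version requirement is an assumption for illustration only.
# "async-tokio" and "serialize" are the feature names given in this documentation.
quick-xml = { version = "0.37", features = ["async-tokio", "serialize"] }
```

Enable only the features you need, since each one pulls in additional dependencies.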
§Examples
§Features
quick-xml supports the following features:
- async-tokio: Enables support for asynchronous reading and writing from tokio's IO traits by enabling reading events from types implementing tokio::io::AsyncBufRead.
- encoding: Enables support of non-UTF-8 encoded documents. The encoding will be inferred from the XML declaration if it is found, otherwise UTF-8 is assumed.

  Currently, only ASCII-compatible encodings are supported. For example, UTF-16 will not work (therefore, quick-xml is not standard compliant).

  Thus, quick-xml supports all encodings of encoding_rs except those that are not ASCII compatible. You should stop processing a document when one of these encodings is detected, because generated events can be wrong and do not reflect a real document structure!

  Because ASCII compatibility is what distinguishes the unsupported encodings, you can check for them:

      use quick_xml::events::Event;
      use quick_xml::reader::Reader;

      // Helper assumed by this example (not part of quick-xml):
      // encodes a string as UTF-16LE, prefixed with a byte order mark.
      fn to_utf16le_with_bom(s: &str) -> Vec<u8> {
          let mut bytes = vec![0xFF, 0xFE];
          for unit in s.encode_utf16() {
              bytes.extend_from_slice(&unit.to_le_bytes());
          }
          bytes
      }

      let xml = to_utf16le_with_bom(r#"<?xml encoding='UTF-16'><element/>"#);
      let mut reader = Reader::from_reader(xml.as_ref());
      reader.config_mut().trim_text(true);

      let mut buf = Vec::new();
      let mut unsupported = false;
      loop {
          // Stop as soon as the detected encoding is not ASCII compatible.
          if !reader.decoder().encoding().is_ascii_compatible() {
              unsupported = true;
              break;
          }
          buf.clear();
          match reader.read_event_into(&mut buf).unwrap() {
              Event::Eof => break,
              _ => {}
          }
      }
      assert!(unsupported);

  This restriction will be eliminated once issue #158 is resolved.
- escape-html: Enables support for recognizing all HTML 5 entities in the unescape function. The full list of entities can also be found at https://html.spec.whatwg.org/entities.json.
- overlapped-lists: This feature is for the Serde deserializer. It enables support for deserializing lists where tags are overlapped with tags that do not correspond to the list.

  When this feature is enabled, the XML:

      <any-name>
        <item/>
        <another-item/>
        <item/>
        <item/>
      </any-name>

  could be deserialized to a struct:

      use serde::Deserialize;

      #[derive(Deserialize)]
      #[serde(rename_all = "kebab-case")]
      struct AnyName {
          item: Vec<()>,
          another_item: (),
      }

  When this feature is not enabled (default), only the first element will be associated with the field, and the deserialized type will report an error (duplicated field) when the deserializer encounters a second <item/>.

  Note that enabling this feature can lead to high and even unlimited memory consumption, because the deserializer needs to check all events up to the end of a container tag (</any-name> in this example) to figure out that there are no more items for a field. If </any-name> or even EOF is not encountered, the parsing will never end, which can lead to a denial-of-service (DoS) scenario.

  Having several lists and overlapped elements for them in XML could also lead to quadratic parsing time, because the deserializer must check the list of events as many times as the number of sequence fields present in the schema.

  To reduce negative consequences, always limit the maximum number of events that Deserializer will buffer.

  This feature works only with the serialize feature and has no effect if serialize is not enabled.
- serde-types: Enables serialization of some quick-xml types using serde. This feature is rarely needed.

  This feature does NOT provide an XML serializer or deserializer. You should use the serialize feature for that instead.
- serialize: Enables support for serde serialization and deserialization. When this feature is enabled, quick-xml provides a serializer and a deserializer for XML.

  This feature does NOT enable serialization of the types inside quick-xml. If you need that, use the serde-types feature.
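The buffering concern described for the overlapped-lists feature can be pictured with a small standalone model. The sketch below is not quick-xml's implementation: collect_overlapped and BufferLimitExceeded are hypothetical names, and child elements are reduced to plain tag names. It only illustrates why a cap on skipped events bounds memory use:

```rust
/// Hypothetical error: more non-matching children were skipped than allowed.
#[derive(Debug, PartialEq)]
pub struct BufferLimitExceeded;

/// Counts the children named `field`, buffering every other child so that it
/// could be replayed for other fields later. Fails once the buffer would grow
/// beyond `limit` entries.
pub fn collect_overlapped<'a>(
    children: &[&'a str],
    field: &str,
    limit: usize,
) -> Result<(usize, Vec<&'a str>), BufferLimitExceeded> {
    let mut matched = 0;
    let mut skipped = Vec::new();
    for &name in children {
        if name == field {
            matched += 1;
        } else {
            // Refuse to buffer without bound: this is the guard against
            // unlimited memory consumption on hostile input.
            if skipped.len() == limit {
                return Err(BufferLimitExceeded);
            }
            skipped.push(name);
        }
    }
    Ok((matched, skipped))
}
```

With the children from the example above and a generous limit, the model finds three item entries and buffers one another-item entry. With a limit that is too small, it stops with an error instead of growing the buffer.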
Re-exports§
pub use crate::encoding::Decoder;
pub use crate::errors::serialize::DeError; // feature: serialize
pub use crate::errors::serialize::SeError; // feature: serialize
pub use crate::errors::Error;
pub use crate::errors::Result;
pub use crate::reader::NsReader;
pub use crate::reader::Reader;
pub use crate::writer::ElementWriter;
pub use crate::writer::Writer;
Modules§
- de (feature: serialize): Serde Deserializer module.
- encoding: A module for wrappers that encode / decode data.
- errors: Error management module.
- escape: Manage XML character escapes.
- events: Defines zero-copy XML events used throughout this library.
- name: Module for handling names according to the W3C Namespaces in XML 1.1 (Second Edition) specification.
- parser: Contains low-level parsers of different XML pieces.
- reader: Contains high-level interface for a pull-based XML parser.
- se (feature: serialize): Module to handle custom serde Serializer.
- serde_helpers (feature: serde-types): Provides helper functions to glue an XML with a serde content model.
- writer: Contains high-level interface for an events-based XML emitter.
Macros§
- impl_deserialize_for_internally_tagged_enum (feature: serde-types): A helper to implement Deserialize for internally tagged enums which does not use Deserializer::deserialize_any, which produces wrong results with XML because of serde#1183.