buffered_reader/lib.rs
1//! A [`BufferedReader`] is a super-powered `Read`er.
2//!
3//! Like the [`BufRead`] trait, the [`BufferedReader`] trait has an
4//! internal buffer that is directly exposed to the user. This design
5//! enables two performance optimizations. First, the use of an
6//! internal buffer amortizes system calls. Second, exposing the
7//! internal buffer allows the user to work with data in place, which
8//! avoids another copy.
9//!
//! The [`BufRead`] trait, however, has a significant limitation for
//! parsers: the user of a [`BufRead`] object can't control the amount
//! of buffering. Such control is essential for conveniently working
//! with data in place, and for looking ahead without consuming data.
//! The result is that either the sizing has to be handled by the
//! instantiator of the [`BufRead`] object (assuming the [`BufRead`]
//! object provides such a mechanism), which is a layering violation,
//! or the parser has to fall back to buffering if the internal buffer
//! is too small, which eliminates most of the advantages of the
//! [`BufRead`] abstraction. The [`BufferedReader`] trait addresses
//! this shortcoming by allowing the user to control the size of the
//! internal buffer.
22//!
//! The [`BufferedReader`] trait also provides a generic interface for
//! working with a stack of [`BufferedReader`] objects, which
//! simplifies using multiple parsers simultaneously. This is helpful
//! when one parser deals with framing (e.g., something like
//! [HTTP's chunk transfer encoding]), and another decodes the actual
//! objects. It is also useful when objects are nested.
30//!
31//! # Details
32//!
33//! Because the [`BufRead`] trait doesn't provide a mechanism for the
34//! user to size the internal buffer, a parser can't generally be sure
35//! that the internal buffer will be large enough to allow it to work
36//! with all data in place.
37//!
38//! Using the standard [`BufRead`] implementation, [`BufReader`], the
39//! instantiator can set the size of the internal buffer at creation
40//! time. Unfortunately, this mechanism is ugly, and not always
41//! adequate. First, the parser is typically not the instantiator.
42//! Thus, the instantiator needs to know about the implementation
43//! details of all of the parsers, which turns an implementation
44//! detail into a cross-cutting concern. Second, when working with
45//! dynamically sized data, the maximum amount of the data that needs
//! to be worked with in place may not be known a priori, or the
47//! maximum amount may be significantly larger than the typical
48//! amount. This leads to poorly sized buffers.
49//!
//! Alternatively, the code that uses, but does not instantiate, a
//! [`BufRead`] object can be changed to stream the data, or to
//! fall back to reading the data into a local buffer if the internal
53//! buffer is too small. Both of these approaches increase code
54//! complexity, and the latter approach is contrary to the
55//! [`BufRead`]'s goal of reducing unnecessary copying.
56//!
57//! The [`BufferedReader`] trait solves this problem by allowing the
58//! user to dynamically (i.e., at read time, not open time) ensure
59//! that the internal buffer has a certain amount of data.
60//!
61//! The ability to control the size of the internal buffer is also
62//! essential to straightforward support for speculative lookahead.
63//! The reason that speculative lookahead with a [`BufRead`] object is
//! difficult is that speculative lookahead is *speculative*, i.e., if
65//! the parser backtracks, the data that was read must not be
66//! consumed. Using a [`BufRead`] object, this is not possible if the
67//! amount of lookahead is larger than the internal buffer. That is,
68//! if the amount of lookahead data is larger than the [`BufRead`]'s
69//! internal buffer, the parser first has to [`std::io::BufRead::consume`] some
70//! data to be able to examine more data. But, if the parser then
71//! decides to backtrack, it has no way to return the unused data to
72//! the [`BufRead`] object. This forces the parser to manage a buffer
73//! of read, but unconsumed data, which significantly complicates the
74//! code.
75//!
76//! The [`BufferedReader`] trait also simplifies working with a stack of
77//! [`BufferedReader`]s in two ways. First, the [`BufferedReader`] trait
78//! provides *generic* methods to access the underlying
79//! [`BufferedReader`]. Thus, even when dealing with a trait object, it
80//! is still possible to recover the underlying [`BufferedReader`].
81//! Second, the [`BufferedReader`] provides a mechanism to associate
82//! generic state with each [`BufferedReader`] via a cookie. Although
83//! it is possible to realize this functionality using a custom trait
84//! that extends the [`BufferedReader`] trait and wraps existing
85//! [`BufferedReader`] implementations, this approach eliminates a lot
86//! of error-prone, boilerplate code.
87//!
88//! # Examples
89//!
90//! The following examples show not only how to use a
91//! [`BufferedReader`], but also better illustrate the aforementioned
92//! limitations of a [`BufRead`]er.
93//!
94//! Consider a file consisting of a sequence of objects, which are
95//! laid out as follows. Each object has a two byte header that
96//! indicates the object's size in bytes. The object immediately
97//! follows the header. Thus, if we had two objects: "foobar" and
98//! "xyzzy", in that order, the file would look like this:
99//!
100//! ```text
101//! 0 6 f o o b a r 0 5 x y z z y
102//! ```
103//!
104//! Here's how we might parse this type of file using a
105//! [`BufferedReader`]:
106//!
107//! ```
108//! use buffered_reader;
109//! use buffered_reader::BufferedReader;
110//!
111//! fn parse_object(content: &[u8]) {
112//! // Parse the object.
113//! # let _ = content;
114//! }
115//!
116//! # f(); fn f() -> Result<(), std::io::Error> {
117//! # const FILENAME : &str = "/dev/null";
118//! let mut br = buffered_reader::File::open(FILENAME)?;
119//!
120//! // While we haven't reached EOF (i.e., we can read at
121//! // least one byte).
122//! while br.data(1)?.len() > 0 {
123//! // Get the object's length.
124//! let len = br.read_be_u16()? as usize;
125//! // Get the object's content.
126//! let content = br.data_consume_hard(len)?;
127//!
//!     // Parse the actual object using a real parser. Recall:
//!     // `data_consume_hard()` may return more than the requested
//!     // amount (but it will never return less).
131//! parse_object(&content[..len]);
132//! }
133//! # Ok(()) }
134//! ```
135//!
136//! Note that `content` is actually a pointer to the
137//! [`BufferedReader`]'s internal buffer. Thus, getting some data
138//! doesn't require copying the data into a local buffer, which is
139//! often discarded immediately after the data is parsed.
140//!
//! Further, [`BufferedReader::data`] (and the other related functions) are guaranteed
//! to return at least the requested amount of data. There are two
//! exceptions: if an error occurs, or the end of the file is reached.
//! Thus, only the cases that need to be handled by the user are
//! exposed; there is no need to call something like
//! [`std::io::Read::read`] in a loop to ensure the whole object is available.
147//!
148//! Because reading is separate from consuming data, it is possible to
149//! get a chunk of data, inspect it, and then consume only what is
150//! needed. As mentioned above, this is only possible with a
151//! [`BufRead`] object if the internal buffer happens to be large
152//! enough. Using a [`BufferedReader`], this is always possible,
153//! assuming the data fits in memory.
154//!
155//! In our example, we actually have two parsers: one that deals with
156//! the framing, and one for the actual objects. The above code
157//! buffers the objects in their entirety, and then passes a slice
158//! containing the object to the object parser. If the object parser
159//! also worked with a [`BufferedReader`] object, then less buffering
//! would usually be needed, and the two parsers could run
161//! simultaneously. This is particularly useful when the framing is
162//! more complicated like [HTTP's chunk transfer encoding]. Then,
163//! when the object parser reads data, the frame parser is invoked
164//! lazily. This is done by implementing the [`BufferedReader`] trait
165//! for the framing parser, and stacking the [`BufferedReader`]s.
166//!
167//! For our next example, we rewrite the previous code assuming that
168//! the object parser reads from a [`BufferedReader`] object. Since the
169//! framing parser is really just a limit on the object's size, we
170//! don't need to implement a special [`BufferedReader`], but can use a
171//! [`Limitor`] to impose an upper limit on the amount
172//! that it can read. After the object parser has finished, we drain
173//! the object reader. This pattern is particularly helpful when
174//! individual objects that contain errors should be skipped.
175//!
176//! ```
177//! use buffered_reader;
178//! use buffered_reader::BufferedReader;
179//!
180//! fn parse_object<R: BufferedReader<()>>(br: &mut R) {
181//! // Parse the object.
182//! # let _ = br;
183//! }
184//!
185//! # f(); fn f() -> Result<(), std::io::Error> {
186//! # const FILENAME : &str = "/dev/null";
187//! let mut br : Box<dyn BufferedReader<()>>
188//! = Box::new(buffered_reader::File::open(FILENAME)?);
189//!
190//! // While we haven't reached EOF (i.e., we can read at
191//! // least one byte).
192//! while br.data(1)?.len() > 0 {
193//! // Get the object's length.
194//! let len = br.read_be_u16()? as u64;
195//!
196//! // Set up a limit.
197//! br = Box::new(buffered_reader::Limitor::new(br, len));
198//!
199//! // Parse the actual object using a real parser.
200//! parse_object(&mut br);
201//!
202//! // If the parser didn't consume the whole object, e.g., due to
203//! // a parse error, drop the rest.
//!     br.drop_eof()?;
205//!
206//! // Recover the framing parser's `BufferedReader`.
207//! br = br.into_inner().unwrap();
208//! }
209//! # Ok(()) }
210//! ```
211//!
212//! Of particular note is the generic functionality for dealing with
213//! stacked [`BufferedReader`]s: the [`BufferedReader::into_inner`] method is not bound
//! to the implementation, which is often not available due to type
215//! erasure, but is provided by the trait.
216//!
217//! In addition to utility [`BufferedReader`]s like the
218//! [`Limitor`], this crate also includes a few
219//! general-purpose parsers, like the [`Zlib`]
220//! decompressor.
221//!
222//! [`BufRead`]: std::io::BufRead
223//! [`BufReader`]: std::io::BufReader
224//! [HTTP's chunk transfer encoding]: https://en.wikipedia.org/wiki/Chunked_transfer_encoding
225
226#![doc(html_favicon_url = "https://docs.sequoia-pgp.org/favicon.png")]
227#![doc(html_logo_url = "https://docs.sequoia-pgp.org/logo.svg")]
228#![warn(missing_docs)]
229
230use std::io;
231use std::io::{Error, ErrorKind};
232use std::cmp;
233use std::fmt;
234use std::convert::TryInto;
235
236#[macro_use]
237mod macros;
238
239mod generic;
240mod memory;
241mod limitor;
242mod reserve;
243mod dup;
244mod eof;
245mod adapter;
246#[cfg(feature = "compression-deflate")]
247mod decompress_deflate;
248#[cfg(feature = "compression-bzip2")]
249mod decompress_bzip2;
250
251pub use self::generic::Generic;
252pub use self::memory::Memory;
253pub use self::limitor::Limitor;
254pub use self::reserve::Reserve;
255pub use self::dup::Dup;
256pub use self::eof::EOF;
257pub use self::adapter::Adapter;
258#[cfg(feature = "compression-deflate")]
259pub use self::decompress_deflate::Deflate;
260#[cfg(feature = "compression-deflate")]
261pub use self::decompress_deflate::Zlib;
262#[cfg(feature = "compression-bzip2")]
263pub use self::decompress_bzip2::Bzip;
264
265// Common error type for file operations.
266mod file_error;
267
268// These are the different File implementations. We
269// include the modules unconditionally, so that we catch bitrot early.
270#[allow(dead_code)]
271mod file_generic;
272#[allow(dead_code)]
273#[cfg(unix)]
274mod file_unix;
275
276// Then, we select the appropriate version to re-export.
277#[cfg(not(unix))]
278pub use self::file_generic::File;
279#[cfg(unix)]
280pub use self::file_unix::File;
281
282/// The default buffer size.
283///
284/// This is configurable by the SEQUOIA_BUFFERED_READER_BUFFER
285/// environment variable.
286fn default_buf_size() -> usize {
287 use std::sync::OnceLock;
288
289 static DEFAULT_BUF_SIZE: OnceLock<usize> = OnceLock::new();
290 *DEFAULT_BUF_SIZE.get_or_init(|| {
291 use std::env::var_os;
292 use std::str::FromStr;
293
294 let default = 32 * 1024;
295
296 if let Some(size) = var_os("SEQUOIA_BUFFERED_READER_BUFFER") {
297 size.to_str()
298 .and_then(|s| {
299 match FromStr::from_str(s) {
300 Ok(s) => Some(s),
301 Err(err) => {
302 eprintln!("Unable to parse the value of \
303 'SEQUOIA_BUFFERED_READER_BUFFER'; \
304 falling back to the default buffer \
305 size ({}): {}",
                                   default, err);
307 None
308 }
309 }
310 })
311 .unwrap_or(default)
312 } else {
313 default
314 }
315 })
316}
317
318// On debug builds, Vec<u8>::truncate is very, very slow. For
319// instance, running the decrypt_test_stream test takes 51 seconds on
320// my (Neal's) computer using Vec<u8>::truncate and <0.1 seconds using
321// `unsafe { v.set_len(len); }`.
322//
323// The issue is that the compiler calls drop on every element that is
324// dropped, even though a u8 doesn't have a drop implementation. The
325// compiler optimizes this away at high optimization levels, but those
326// levels make debugging harder.
327fn vec_truncate(v: &mut Vec<u8>, len: usize) {
328 if cfg!(debug_assertions) {
329 if len < v.len() {
330 unsafe { v.set_len(len); }
331 }
332 } else {
333 v.truncate(len);
334 }
335}
336
337/// Like `Vec<u8>::resize`, but fast in debug builds.
338fn vec_resize(v: &mut Vec<u8>, new_size: usize) {
339 if v.len() < new_size {
340 v.resize(new_size, 0);
341 } else {
342 vec_truncate(v, new_size);
343 }
344}
345
/// The generic `BufferedReader` interface.
347pub trait BufferedReader<C> : io::Read + fmt::Debug + fmt::Display + Send + Sync
348 where C: fmt::Debug + Send + Sync
349{
350 /// Returns a reference to the internal buffer.
351 ///
352 /// Note: this returns the same data as `self.data(0)`, but it
353 /// does so without mutably borrowing self:
354 ///
355 /// ```
356 /// # f(); fn f() -> Result<(), std::io::Error> {
357 /// use buffered_reader;
358 /// use buffered_reader::BufferedReader;
359 ///
360 /// let mut br = buffered_reader::Memory::new(&b"0123456789"[..]);
361 ///
362 /// let first = br.data(10)?.len();
363 /// let second = br.buffer().len();
364 /// // `buffer` must return exactly what `data` returned.
365 /// assert_eq!(first, second);
366 /// # Ok(()) }
367 /// ```
368 fn buffer(&self) -> &[u8];
369
370 /// Ensures that the internal buffer has at least `amount` bytes
371 /// of data, and returns it.
372 ///
373 /// If the internal buffer contains less than `amount` bytes of
374 /// data, the internal buffer is first filled.
375 ///
376 /// The returned slice will have *at least* `amount` bytes unless
377 /// EOF has been reached or an error occurs, in which case the
378 /// returned slice will contain the rest of the file.
379 ///
380 /// Errors are returned only when the internal buffer is empty.
381 ///
382 /// This function does not advance the cursor. To advance the
383 /// cursor, use [`BufferedReader::consume`].
384 ///
385 /// Note: If the internal buffer already contains at least
386 /// `amount` bytes of data, then [`BufferedReader`]
387 /// implementations are guaranteed to simply return the internal
388 /// buffer. As such, multiple calls to [`BufferedReader::data`]
389 /// for the same `amount` will return the same slice.
390 ///
    /// If [`BufferedReader`] receives `EINTR` when `read`ing, it will
    /// automatically retry reading.
    ///
    /// Further, [`BufferedReader`] implementations are guaranteed to
    /// not shrink the internal buffer. Thus, once some data has been
    /// returned, it will always be returned until it is consumed.
    /// As such, the following must hold:
398 ///
399 /// ```
400 /// # f(); fn f() -> Result<(), std::io::Error> {
401 /// use buffered_reader;
402 /// use buffered_reader::BufferedReader;
403 ///
404 /// let mut br = buffered_reader::Memory::new(&b"0123456789"[..]);
405 ///
406 /// let first = br.data(10)?.len();
407 /// let second = br.data(5)?.len();
408 /// // Even though less data is requested, the second call must
409 /// // return the same slice as the first call.
410 /// assert_eq!(first, second);
411 /// # Ok(()) }
412 /// ```
413 fn data(&mut self, amount: usize) -> Result<&[u8], io::Error>;
414
415 /// Like [`BufferedReader::data`], but returns an error if there is not at least
416 /// `amount` bytes available.
417 ///
418 /// [`BufferedReader::data_hard`] is a variant of [`BufferedReader::data`] that returns at least
419 /// `amount` bytes of data or an error. Thus, unlike [`BufferedReader::data`],
420 /// which will return less than `amount` bytes of data if EOF is
421 /// encountered, [`BufferedReader::data_hard`] returns an error, specifically,
422 /// `io::ErrorKind::UnexpectedEof`.
423 ///
424 /// # Examples
425 ///
426 /// ```
427 /// # f(); fn f() -> Result<(), std::io::Error> {
428 /// use buffered_reader;
429 /// use buffered_reader::BufferedReader;
430 ///
431 /// let mut br = buffered_reader::Memory::new(&b"0123456789"[..]);
432 ///
433 /// // Trying to read more than there is available results in an error.
434 /// assert!(br.data_hard(20).is_err());
435 /// // Whereas with data(), everything through EOF is returned.
436 /// assert_eq!(br.data(20)?.len(), 10);
437 /// # Ok(()) }
438 /// ```
439 fn data_hard(&mut self, amount: usize) -> Result<&[u8], io::Error> {
440 let result = self.data(amount);
441 if let Ok(buffer) = result {
442 if buffer.len() < amount {
443 return Err(Error::new(ErrorKind::UnexpectedEof,
444 "unexpected EOF"));
445 }
446 }
447 result
448 }
449
450 /// Returns all of the data until EOF. Like [`BufferedReader::data`], this does not
451 /// actually consume the data that is read.
452 ///
453 /// In general, you shouldn't use this function as it can cause an
454 /// enormous amount of buffering. But, if you know that the
455 /// amount of data is limited, this is acceptable.
456 ///
457 /// # Examples
458 ///
459 /// ```
460 /// # f(); fn f() -> Result<(), std::io::Error> {
461 /// use buffered_reader;
462 /// use buffered_reader::BufferedReader;
463 ///
464 /// const AMOUNT : usize = 100 * 1024 * 1024;
465 /// let buffer = vec![0u8; AMOUNT];
466 /// let mut br = buffered_reader::Generic::new(&buffer[..], None);
467 ///
468 /// // Normally, only a small amount will be buffered.
469 /// assert!(br.data(10)?.len() <= AMOUNT);
470 ///
471 /// // `data_eof` buffers everything.
472 /// assert_eq!(br.data_eof()?.len(), AMOUNT);
473 ///
474 /// // Now that everything is buffered, buffer(), data(), and
475 /// // data_hard() will also return everything.
476 /// assert_eq!(br.buffer().len(), AMOUNT);
477 /// assert_eq!(br.data(10)?.len(), AMOUNT);
478 /// assert_eq!(br.data_hard(10)?.len(), AMOUNT);
479 /// # Ok(()) }
480 /// ```
481 fn data_eof(&mut self) -> Result<&[u8], io::Error> {
482 // Don't just read std::usize::MAX bytes at once. The
483 // implementation might try to actually allocate a buffer that
484 // large! Instead, try with increasingly larger buffers until
485 // the read is (strictly) shorter than the specified size.
486 let mut s = default_buf_size();
487 // We will break the loop eventually, because self.data(s)
488 // must return a slice shorter than std::usize::MAX.
489 loop {
490 match self.data(s) {
491 Ok(buffer) => {
492 if buffer.len() < s {
493 // We really want to do
494 //
495 // return Ok(buffer);
496 //
                        // But, the borrow checker won't let us:
498 //
499 // error[E0499]: cannot borrow `*self` as
500 // mutable more than once at a time.
501 //
502 // Instead, we break out of the loop, and then
503 // call self.buffer().
504 s = buffer.len();
505 break;
506 } else {
507 s *= 2;
508 }
509 }
510 Err(err) =>
511 return Err(err),
512 }
513 }
514
515 let buffer = self.buffer();
516 assert_eq!(buffer.len(), s);
517 Ok(buffer)
518 }
519
520 /// Consumes some of the data.
521 ///
522 /// This advances the internal cursor by `amount`. It is an error
523 /// to call this function to consume data that hasn't been
524 /// returned by [`BufferedReader::data`] or a related function.
525 ///
526 /// Note: It is safe to call this function to consume more data
527 /// than requested in a previous call to [`BufferedReader::data`], but only if
528 /// [`BufferedReader::data`] also returned that data.
529 ///
530 /// This function returns the internal buffer *including* the
531 /// consumed data. Thus, the [`BufferedReader`] implementation must
532 /// continue to buffer the consumed data until the reference goes
533 /// out of scope.
534 ///
535 /// # Examples
536 ///
537 /// ```
538 /// # f(); fn f() -> Result<(), std::io::Error> {
539 /// use buffered_reader;
540 /// use buffered_reader::BufferedReader;
541 ///
542 /// const AMOUNT : usize = 100 * 1024 * 1024;
543 /// let buffer = vec![0u8; AMOUNT];
544 /// let mut br = buffered_reader::Generic::new(&buffer[..], None);
545 ///
546 /// let amount = {
547 /// // We want at least 1024 bytes, but we'll be happy with
548 /// // more or less.
549 /// let buffer = br.data(1024)?;
550 /// // Parse the data or something.
551 /// let used = buffer.len();
552 /// used
553 /// };
554 /// let buffer = br.consume(amount);
555 /// # Ok(()) }
556 /// ```
557 fn consume(&mut self, amount: usize) -> &[u8];
558
559 /// A convenience function that combines [`BufferedReader::data`] and [`BufferedReader::consume`].
560 ///
561 /// If less than `amount` bytes are available, this function
562 /// consumes what is available.
563 ///
564 /// Note: Due to lifetime issues, it is not possible to call
565 /// [`BufferedReader::data`], work with the returned buffer, and then call
566 /// [`BufferedReader::consume`] in the same scope, because both [`BufferedReader::data`] and
567 /// [`BufferedReader::consume`] take a mutable reference to the [`BufferedReader`].
568 /// This function makes this common pattern easier.
569 ///
570 /// # Examples
571 ///
572 /// ```
573 /// # f(); fn f() -> Result<(), std::io::Error> {
574 /// use buffered_reader;
575 /// use buffered_reader::BufferedReader;
576 ///
577 /// let orig = b"0123456789";
578 /// let mut br = buffered_reader::Memory::new(&orig[..]);
579 ///
    /// // We need a new scope for each call to `data_consume`, because
    /// // the `buffer` reference locks `br`.
582 /// {
583 /// let buffer = br.data_consume(3)?;
584 /// assert_eq!(buffer, &orig[..buffer.len()]);
585 /// }
586 ///
587 /// // Note that the cursor has advanced.
588 /// {
589 /// let buffer = br.data_consume(3)?;
590 /// assert_eq!(buffer, &orig[3..3 + buffer.len()]);
591 /// }
592 ///
    /// // Like `data`, `data_consume` may return and consume less
    /// // than requested if there is no more data available.
595 /// {
596 /// let buffer = br.data_consume(10)?;
597 /// assert_eq!(buffer, &orig[6..6 + buffer.len()]);
598 /// }
599 ///
600 /// {
601 /// let buffer = br.data_consume(10)?;
602 /// assert_eq!(buffer.len(), 0);
603 /// }
604 /// # Ok(()) }
605 /// ```
606 fn data_consume(&mut self, amount: usize)
607 -> Result<&[u8], std::io::Error> {
608 let amount = cmp::min(amount, self.data(amount)?.len());
609
610 let buffer = self.consume(amount);
611 assert!(buffer.len() >= amount);
612 Ok(buffer)
613 }
614
615 /// A convenience function that effectively combines [`BufferedReader::data_hard`]
616 /// and [`BufferedReader::consume`].
617 ///
618 /// This function is identical to [`BufferedReader::data_consume`], but internally
619 /// uses [`BufferedReader::data_hard`] instead of [`BufferedReader::data`].
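    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming a [`Memory`] reader over a fixed
    /// byte string as in the other examples. It illustrates that a
    /// failed call is expected to leave the cursor where it was,
    /// because the default implementation only consumes after
    /// [`BufferedReader::data_hard`] succeeded.
    ///
    /// ```rust
    /// # f(); fn f() -> Result<(), std::io::Error> {
    /// use buffered_reader::{BufferedReader, Memory};
    ///
    /// let mut br = Memory::new(&b"0123456789"[..]);
    ///
    /// // The returned slice holds at least the requested amount.
    /// let buffer = br.data_consume_hard(4)?;
    /// assert_eq!(&buffer[..4], b"0123");
    ///
    /// // Only 6 bytes are left, so asking for 7 is an error ...
    /// assert!(br.data_consume_hard(7).is_err());
    /// // ... and nothing was consumed by the failed call.
    /// assert_eq!(br.buffer(), b"456789");
    /// # Ok(()) }
    /// ```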
620 fn data_consume_hard(&mut self, amount: usize)
621 -> Result<&[u8], io::Error>
622 {
623 let len = self.data_hard(amount)?.len();
624 assert!(len >= amount);
625
626 let buffer = self.consume(amount);
627 assert!(buffer.len() >= amount);
628 Ok(buffer)
629 }
630
631 /// Checks whether the end of the stream is reached.
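    ///
    /// Note: the default implementation reports EOF whenever
    /// [`BufferedReader::data_hard`] fails, so a read error is also
    /// treated as the end of the stream.
    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming a two-byte [`Memory`] reader chosen
    /// purely for illustration:
    ///
    /// ```rust
    /// # f(); fn f() -> Result<(), std::io::Error> {
    /// use buffered_reader::{BufferedReader, Memory};
    ///
    /// let mut br = Memory::new(&b"01"[..]);
    ///
    /// // There is still data to read.
    /// assert!(!br.eof());
    ///
    /// // Once everything is consumed, we are at EOF.
    /// br.data_consume_hard(2)?;
    /// assert!(br.eof());
    /// # Ok(()) }
    /// ```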
632 fn eof(&mut self) -> bool {
633 self.data_hard(1).is_err()
634 }
635
636 /// Checks whether this reader is consummated.
637 ///
638 /// For most readers, this function will return true once the end
639 /// of the stream is reached. However, some readers are concerned
640 /// with packet framing (e.g. the [`Limitor`]). Those readers
641 /// consider themselves consummated if the amount of data
642 /// indicated by the packet frame is consumed.
643 ///
    /// This allows us to detect truncation. A packet is truncated
    /// if and only if the end of the stream is reached, but the
    /// reader is not consummated.
648 fn consummated(&mut self) -> bool {
649 self.eof()
650 }
651
652 /// A convenience function for reading a 16-bit unsigned integer
653 /// in big endian format.
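    ///
    /// # Examples
    ///
    /// A minimal sketch that reads the two-byte length header from
    /// the framing format described in the crate documentation. The
    /// input bytes are made up for illustration.
    ///
    /// ```rust
    /// # f(); fn f() -> Result<(), std::io::Error> {
    /// use buffered_reader::{BufferedReader, Memory};
    ///
    /// let mut br = Memory::new(&b"\x00\x06foobar"[..]);
    ///
    /// // The header is consumed; the object follows.
    /// let len = br.read_be_u16()? as usize;
    /// assert_eq!(len, 6);
    /// assert_eq!(&br.data_hard(len)?[..len], b"foobar");
    ///
    /// // Fewer than two bytes available is an error.
    /// let mut short = Memory::new(&b"\x00"[..]);
    /// assert!(short.read_be_u16().is_err());
    /// # Ok(()) }
    /// ```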
654 fn read_be_u16(&mut self) -> Result<u16, std::io::Error> {
655 let input = self.data_consume_hard(2)?;
656 // input holds at least 2 bytes, so this cannot fail.
657 Ok(u16::from_be_bytes(input[..2].try_into().unwrap()))
658 }
659
660 /// A convenience function for reading a 32-bit unsigned integer
661 /// in big endian format.
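    ///
    /// # Examples
    ///
    /// A minimal sketch with made-up input, showing that exactly four
    /// bytes are consumed:
    ///
    /// ```rust
    /// # f(); fn f() -> Result<(), std::io::Error> {
    /// use buffered_reader::{BufferedReader, Memory};
    ///
    /// let mut br = Memory::new(&b"\x00\x00\x01\x00rest"[..]);
    ///
    /// // 0x00000100 is 256.
    /// assert_eq!(br.read_be_u32()?, 256);
    /// // The cursor now points just past the integer.
    /// assert_eq!(br.buffer(), b"rest");
    /// # Ok(()) }
    /// ```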
662 fn read_be_u32(&mut self) -> Result<u32, std::io::Error> {
663 let input = self.data_consume_hard(4)?;
664 // input holds at least 4 bytes, so this cannot fail.
665 Ok(u32::from_be_bytes(input[..4].try_into().unwrap()))
666 }
667
668 /// Reads until either `terminal` is encountered or EOF.
669 ///
670 /// Returns either a `&[u8]` terminating in `terminal` or the rest
671 /// of the data, if EOF was encountered.
672 ///
673 /// Note: this function does *not* consume the data.
674 ///
675 /// # Examples
676 ///
677 /// ```
678 /// # f(); fn f() -> Result<(), std::io::Error> {
679 /// use buffered_reader;
680 /// use buffered_reader::BufferedReader;
681 ///
682 /// let orig = b"0123456789";
683 /// let mut br = buffered_reader::Memory::new(&orig[..]);
684 ///
685 /// {
686 /// let s = br.read_to(b'3')?;
687 /// assert_eq!(s, b"0123");
688 /// }
689 ///
    /// // `read_to` doesn't consume the data.
691 /// {
692 /// let s = br.read_to(b'5')?;
693 /// assert_eq!(s, b"012345");
694 /// }
695 ///
696 /// // Even if there is more data in the internal buffer, only
697 /// // the data through the match is returned.
698 /// {
699 /// let s = br.read_to(b'1')?;
700 /// assert_eq!(s, b"01");
701 /// }
702 ///
703 /// // If the terminal is not found, everything is returned...
704 /// {
705 /// let s = br.read_to(b'A')?;
706 /// assert_eq!(s, orig);
707 /// }
708 ///
709 /// // If we consume some data, the search starts at the cursor,
710 /// // not the beginning of the file.
711 /// br.consume(3);
712 ///
713 /// {
714 /// let s = br.read_to(b'5')?;
715 /// assert_eq!(s, b"345");
716 /// }
717 /// # Ok(()) }
718 /// ```
719 fn read_to(&mut self, terminal: u8) -> Result<&[u8], std::io::Error> {
720 let mut n = 128;
721 let len;
722
723 loop {
724 let data = self.data(n)?;
725
726 if let Some(newline)
727 = data.iter().position(|c| *c == terminal)
728 {
729 len = newline + 1;
730 break;
731 } else if data.len() < n {
732 // EOF.
733 len = data.len();
734 break;
735 } else {
736 // Read more data.
737 n = cmp::max(2 * n, data.len() + 1024);
738 }
739 }
740
741 Ok(&self.buffer()[..len])
742 }
743
    /// Discards the input until one of the bytes in `terminals` is
745 /// encountered.
746 ///
747 /// The matching byte is not discarded.
748 ///
749 /// Returns the number of bytes discarded.
750 ///
751 /// The end of file is considered a match.
752 ///
753 /// `terminals` must be sorted.
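    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming a [`Memory`] reader over made-up
    /// input. The default implementation looks up each byte with a
    /// binary search and asserts that `terminals` is sorted, so the
    /// terminals are listed in ascending byte order here.
    ///
    /// ```rust
    /// # f(); fn f() -> Result<(), std::io::Error> {
    /// use buffered_reader::{BufferedReader, Memory};
    ///
    /// let mut br = Memory::new(&b"hello, world!"[..]);
    ///
    /// // b' ' (0x20) sorts before b',' (0x2c).
    /// let dropped = br.drop_until(&[b' ', b','])?;
    /// assert_eq!(dropped, 5);
    /// // The matching byte itself is not discarded.
    /// assert_eq!(br.data(1)?[0], b',');
    ///
    /// // If no terminal is found, everything up to EOF is dropped.
    /// let dropped = br.drop_until(&[b'#'])?;
    /// assert_eq!(dropped, 8);
    /// assert!(br.eof());
    /// # Ok(()) }
    /// ```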
754 fn drop_until(&mut self, terminals: &[u8])
755 -> Result<usize, std::io::Error>
756 {
757 // Make sure terminals is sorted.
758 for t in terminals.windows(2) {
759 assert!(t[0] <= t[1]);
760 }
761
762 let buf_size = default_buf_size();
763 let mut total = 0;
764 let position = 'outer: loop {
765 let len = {
766 // Try self.buffer. Only if it is empty, use
767 // self.data.
768 let buffer = if self.buffer().is_empty() {
769 self.data(buf_size)?
770 } else {
771 self.buffer()
772 };
773
774 if buffer.is_empty() {
775 break 'outer 0;
776 }
777
778 if let Some(position) = buffer.iter().position(
779 |c| terminals.binary_search(c).is_ok())
780 {
781 break 'outer position;
782 }
783
784 buffer.len()
785 };
786
787 self.consume(len);
788 total += len;
789 };
790
791 self.consume(position);
792 Ok(total + position)
793 }
794
795 /// Discards the input until one of the bytes in `terminals` is
796 /// encountered.
797 ///
798 /// The matching byte is also discarded.
799 ///
800 /// Returns the terminal byte and the number of bytes discarded.
801 ///
    /// If `match_eof` is true, then the end of file is considered a
803 /// match. Otherwise, if the end of file is encountered, an error
804 /// is returned.
805 ///
806 /// `terminals` must be sorted.
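    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming a [`Memory`] reader over a made-up
    /// `key=value` string. It shows the three outcomes: a match, EOF
    /// accepted as a match, and EOF rejected.
    ///
    /// ```rust
    /// # f(); fn f() -> Result<(), std::io::Error> {
    /// use buffered_reader::{BufferedReader, Memory};
    ///
    /// let mut br = Memory::new(&b"key=value"[..]);
    ///
    /// // "key" and the terminal "=" are discarded: 4 bytes.
    /// let (terminal, dropped) = br.drop_through(&[b'='], false)?;
    /// assert_eq!(terminal, Some(b'='));
    /// assert_eq!(dropped, 4);
    /// assert_eq!(br.buffer(), b"value");
    ///
    /// // No further "=": with `match_eof` set, EOF counts as a match
    /// // and no terminal byte is reported.
    /// let (terminal, dropped) = br.drop_through(&[b'='], true)?;
    /// assert_eq!(terminal, None);
    /// assert_eq!(dropped, 5);
    ///
    /// // Without `match_eof`, hitting EOF is an error.
    /// assert!(br.drop_through(&[b'='], false).is_err());
    /// # Ok(()) }
    /// ```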
807 fn drop_through(&mut self, terminals: &[u8], match_eof: bool)
808 -> Result<(Option<u8>, usize), std::io::Error>
809 {
810 let dropped = self.drop_until(terminals)?;
811 match self.data_consume(1) {
812 Ok([]) if match_eof => Ok((None, dropped)),
813 Ok([]) => Err(Error::new(ErrorKind::UnexpectedEof, "EOF")),
814 Ok(rest) => Ok((Some(rest[0]), dropped + 1)),
815 Err(err) => Err(err),
816 }
817 }
818
819 /// Like [`BufferedReader::data_consume_hard`], but returns the data in a
820 /// caller-owned buffer.
821 ///
822 /// [`BufferedReader`] implementations may optimize this to avoid a
823 /// copy by directly returning the internal buffer.
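    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming a [`Memory`] reader over made-up
    /// input. Unlike [`BufferedReader::data_consume_hard`], the
    /// returned buffer holds exactly `amount` bytes.
    ///
    /// ```rust
    /// # f(); fn f() -> Result<(), std::io::Error> {
    /// use buffered_reader::{BufferedReader, Memory};
    ///
    /// let mut br = Memory::new(&b"0123456789"[..]);
    ///
    /// // The result is an owned `Vec<u8>` of exactly 4 bytes.
    /// let head = br.steal(4)?;
    /// assert_eq!(head, b"0123");
    ///
    /// // Asking for more than is available is an error.
    /// assert!(br.steal(20).is_err());
    ///
    /// // `steal_eof` takes whatever is left.
    /// let tail = br.steal_eof()?;
    /// assert_eq!(tail, b"456789");
    /// # Ok(()) }
    /// ```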
824 fn steal(&mut self, amount: usize) -> Result<Vec<u8>, std::io::Error> {
825 let mut data = self.data_consume_hard(amount)?;
826 assert!(data.len() >= amount);
827 if data.len() > amount {
828 data = &data[..amount];
829 }
830 Ok(data.to_vec())
831 }
832
833 /// Like [`BufferedReader::steal`], but instead of stealing a fixed number of
834 /// bytes, steals all of the data until the end of file.
835 fn steal_eof(&mut self) -> Result<Vec<u8>, std::io::Error> {
836 let len = self.data_eof()?.len();
837 let data = self.steal(len)?;
838 Ok(data)
839 }
840
841 /// Like [`BufferedReader::steal_eof`], but instead of returning the data, the
842 /// data is discarded.
843 ///
844 /// On success, returns whether any data (i.e., at least one byte)
845 /// was discarded.
846 ///
847 /// Note: whereas [`BufferedReader::steal_eof`] needs to buffer all of the data,
848 /// this function reads the data a chunk at a time, and then
849 /// discards it. A consequence of this is that an error may occur
850 /// after we have consumed some of the data.
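    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming a [`Memory`] reader over made-up
    /// input, that shows how the return value distinguishes a reader
    /// that still had data from one that was already at EOF:
    ///
    /// ```rust
    /// # f(); fn f() -> Result<(), std::io::Error> {
    /// use buffered_reader::{BufferedReader, Memory};
    ///
    /// let mut br = Memory::new(&b"0123456789"[..]);
    ///
    /// // There was data to discard.
    /// assert_eq!(br.drop_eof()?, true);
    /// assert!(br.eof());
    ///
    /// // A second call finds nothing left.
    /// assert_eq!(br.drop_eof()?, false);
    /// # Ok(()) }
    /// ```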
851 fn drop_eof(&mut self) -> Result<bool, std::io::Error> {
852 let buf_size = default_buf_size();
853 let mut at_least_one_byte = false;
854 loop {
855 let n = self.data(buf_size)?.len();
856 at_least_one_byte |= n > 0;
857 self.consume(n);
858 if n < buf_size {
859 // EOF.
860 break;
861 }
862 }
863
864 Ok(at_least_one_byte)
865 }
866
867 /// Copies data to the given writer returning the copied amount.
868 ///
869 /// This is like using [`std::io::copy`], but more efficient as it
870 /// avoids an extra copy, and it will try to copy all the data the
871 /// reader has already buffered.
872 ///
873 /// On success, returns the amount of data (in bytes) that has
874 /// been copied.
875 ///
876 /// Note: this function reads and copies the data a chunk at a
877 /// time. A consequence of this is that an error may occur after
878 /// we have consumed some of the data.
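    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming a [`Memory`] reader as the source
    /// and a `Vec<u8>` as the sink (any [`std::io::Write`] should
    /// work). Copying starts at the cursor, not at the beginning of
    /// the underlying data.
    ///
    /// ```rust
    /// # f(); fn f() -> Result<(), std::io::Error> {
    /// use buffered_reader::{BufferedReader, Memory};
    ///
    /// let mut br = Memory::new(&b"0123456789"[..]);
    ///
    /// // Skip the first three bytes.
    /// br.data_consume(3)?;
    ///
    /// let mut sink: Vec<u8> = Vec::new();
    /// let copied = br.copy(&mut sink)?;
    /// assert_eq!(copied, 7);
    /// assert_eq!(sink, b"3456789");
    ///
    /// // The reader has been drained.
    /// assert!(br.eof());
    /// # Ok(()) }
    /// ```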
879 fn copy(&mut self, sink: &mut dyn io::Write) -> io::Result<u64> {
880 let buf_size = default_buf_size();
881 let mut total = 0;
882 loop {
883 let data = self.data(buf_size)?;
884 sink.write_all(data)?;
885
886 let n = data.len();
887 total += n as u64;
888 self.consume(n);
889 if n < buf_size {
890 // EOF.
891 break;
892 }
893 }
894
895 Ok(total)
896 }
897
    /// A helpful debugging aid to pretty print a Buffered Reader stack.
    ///
    /// Uses the Buffered Readers' `fmt::Display` implementations.
    fn dump(&self, sink: &mut dyn std::io::Write) -> std::io::Result<()>
        where Self: std::marker::Sized
    {
        let mut i = 1;
        let mut reader: Option<&dyn BufferedReader<C>> = Some(self);
        while let Some(r) = reader {
            {
                let cookie = r.cookie_ref();
                writeln!(sink, " {}. {}, {:?}", i, r, cookie)?;
            }
            reader = r.get_ref();
            i += 1;
        }
        Ok(())
    }

    /// Boxes the reader.
    fn into_boxed<'a>(self) -> Box<dyn BufferedReader<C> + 'a>
        where Self: 'a + Sized
    {
        Box::new(self)
    }

    /// Boxes the reader.
    #[deprecated(note = "Use into_boxed")]
    fn as_boxed<'a>(self) -> Box<dyn BufferedReader<C> + 'a>
        where Self: 'a + Sized
    {
        self.into_boxed()
    }

    /// Returns the underlying reader, if any.
    ///
    /// To allow this to work with [`BufferedReader`] trait objects, it
    /// is necessary for `Self` to be boxed.
    ///
    /// This can lead to the following unusual code:
    ///
    /// ```text
    /// let inner = Box::new(br).into_inner();
    /// ```
    ///
    /// Note: if `Self` is not actually owned, e.g., you passed a
    /// reference, then this returns `None` as it is not possible to
    /// consume the outer buffered reader. Consider:
    ///
    /// ```
    /// # use buffered_reader::BufferedReader;
    /// # use buffered_reader::Limitor;
    /// # use buffered_reader::Memory;
    /// #
    /// # const DATA : &[u8] = b"01234567890123456789suffix";
    /// #
    /// let mut mem = Memory::new(DATA);
    /// let mut limitor = Limitor::new(mem, 20);
    /// let mut br = Box::new(&mut limitor);
    /// // br doesn't own limitor, so it can't consume it.
    /// assert!(matches!(br.into_inner(), None));
    ///
    /// let mut mem = Memory::new(DATA);
    /// let mut limitor = Limitor::new(mem, 20);
    /// let mut br = Box::new(limitor);
    /// assert!(matches!(br.into_inner(), Some(_)));
    /// ```
    fn into_inner<'a>(self: Box<Self>) -> Option<Box<dyn BufferedReader<C> + 'a>>
        where Self: 'a;

    /// Returns a mutable reference to the inner [`BufferedReader`], if
    /// any.
    ///
    /// It is a very bad idea to read any data from the inner
    /// [`BufferedReader`], because this [`BufferedReader`] may have some
    /// data buffered. However, this function can be useful to get
    /// the cookie.
    fn get_mut(&mut self) -> Option<&mut dyn BufferedReader<C>>;

    /// Returns a reference to the inner [`BufferedReader`], if any.
    fn get_ref(&self) -> Option<&dyn BufferedReader<C>>;

    /// Sets the [`BufferedReader`]'s cookie and returns the old value.
    fn cookie_set(&mut self, cookie: C) -> C;

    /// Returns a reference to the [`BufferedReader`]'s cookie.
    fn cookie_ref(&self) -> &C;

    /// Returns a mutable reference to the [`BufferedReader`]'s cookie.
    fn cookie_mut(&mut self) -> &mut C;
}

/// A generic implementation of `std::io::Read::read` appropriate for
/// any [`BufferedReader`] implementation.
///
/// This function implements the `std::io::Read::read` method in terms
/// of the `data_consume` method. We can't use the `std::io::Read`
/// interface, because the [`BufferedReader`] may have buffered some
/// data internally (in which case a read will not return the buffered
/// data, but the following data).
///
/// This implementation is generic. When deriving a [`BufferedReader`],
/// you can include the following:
///
/// ```text
/// impl<'a, T: BufferedReader> std::io::Read for XXX<'a, T> {
///     fn read(&mut self, buf: &mut [u8]) -> Result<usize, std::io::Error> {
///         return buffered_reader_generic_read_impl(self, buf);
///     }
/// }
/// ```
///
/// It would be nice if we could do:
///
/// ```text
/// impl <T: BufferedReader> std::io::Read for T { ... }
/// ```
///
/// but, alas, Rust doesn't like that ("error\[E0119\]: conflicting
/// implementations of trait `std::io::Read` for type `&mut _`").
pub fn buffered_reader_generic_read_impl<T: BufferedReader<C>, C: fmt::Debug + Sync + Send>
        (bio: &mut T, buf: &mut [u8]) -> Result<usize, io::Error> {
    bio
        .data_consume(buf.len())
        .map(|inner| {
            let amount = cmp::min(buf.len(), inner.len());
            buf[0..amount].copy_from_slice(&inner[0..amount]);
            amount
        })
}

/// Make a `Box<BufferedReader>` look like a BufferedReader.
impl <'a, C: fmt::Debug + Sync + Send> BufferedReader<C> for Box<dyn BufferedReader<C> + 'a> {
    fn buffer(&self) -> &[u8] {
        self.as_ref().buffer()
    }

    fn data(&mut self, amount: usize) -> Result<&[u8], io::Error> {
        self.as_mut().data(amount)
    }

    fn data_hard(&mut self, amount: usize) -> Result<&[u8], io::Error> {
        self.as_mut().data_hard(amount)
    }

    fn data_eof(&mut self) -> Result<&[u8], io::Error> {
        self.as_mut().data_eof()
    }

    fn consume(&mut self, amount: usize) -> &[u8] {
        self.as_mut().consume(amount)
    }

    fn data_consume(&mut self, amount: usize)
                    -> Result<&[u8], std::io::Error> {
        self.as_mut().data_consume(amount)
    }

    fn data_consume_hard(&mut self, amount: usize) -> Result<&[u8], io::Error> {
        self.as_mut().data_consume_hard(amount)
    }

    fn consummated(&mut self) -> bool {
        self.as_mut().consummated()
    }

    fn read_be_u16(&mut self) -> Result<u16, std::io::Error> {
        self.as_mut().read_be_u16()
    }

    fn read_be_u32(&mut self) -> Result<u32, std::io::Error> {
        self.as_mut().read_be_u32()
    }

    fn read_to(&mut self, terminal: u8) -> Result<&[u8], std::io::Error>
    {
        self.as_mut().read_to(terminal)
    }

    fn steal(&mut self, amount: usize) -> Result<Vec<u8>, std::io::Error> {
        self.as_mut().steal(amount)
    }

    fn steal_eof(&mut self) -> Result<Vec<u8>, std::io::Error> {
        self.as_mut().steal_eof()
    }

    fn drop_eof(&mut self) -> Result<bool, std::io::Error> {
        self.as_mut().drop_eof()
    }

    fn get_mut(&mut self) -> Option<&mut dyn BufferedReader<C>> {
        // Strip the outer box.
        self.as_mut().get_mut()
    }

    fn get_ref(&self) -> Option<&dyn BufferedReader<C>> {
        // Strip the outer box.
        self.as_ref().get_ref()
    }

    fn into_boxed<'b>(self) -> Box<dyn BufferedReader<C> + 'b>
        where Self: 'b
    {
        self
    }

    fn as_boxed<'b>(self) -> Box<dyn BufferedReader<C> + 'b>
        where Self: 'b
    {
        self
    }

    fn into_inner<'b>(self: Box<Self>) -> Option<Box<dyn BufferedReader<C> + 'b>>
        where Self: 'b {
        // Strip the outer box.
        (*self).into_inner()
    }

    fn cookie_set(&mut self, cookie: C) -> C {
        self.as_mut().cookie_set(cookie)
    }

    fn cookie_ref(&self) -> &C {
        self.as_ref().cookie_ref()
    }

    fn cookie_mut(&mut self) -> &mut C {
        self.as_mut().cookie_mut()
    }
}

/// Make a `&mut T` where `T` implements `BufferedReader` look like a
/// BufferedReader.
impl <'a, T, C> BufferedReader<C> for &'a mut T
where
    T: BufferedReader<C>,
    C: fmt::Debug + Sync + Send + 'a
{
    fn buffer(&self) -> &[u8] {
        (**self).buffer()
    }

    fn data(&mut self, amount: usize) -> Result<&[u8], io::Error> {
        (**self).data(amount)
    }

    fn data_hard(&mut self, amount: usize) -> Result<&[u8], io::Error> {
        (**self).data_hard(amount)
    }

    fn data_eof(&mut self) -> Result<&[u8], io::Error> {
        (**self).data_eof()
    }

    fn consume(&mut self, amount: usize) -> &[u8] {
        (**self).consume(amount)
    }

    fn data_consume(&mut self, amount: usize)
                    -> Result<&[u8], std::io::Error> {
        (**self).data_consume(amount)
    }

    fn data_consume_hard(&mut self, amount: usize) -> Result<&[u8], io::Error> {
        (**self).data_consume_hard(amount)
    }

    fn consummated(&mut self) -> bool {
        (**self).consummated()
    }

    fn read_be_u16(&mut self) -> Result<u16, std::io::Error> {
        (**self).read_be_u16()
    }

    fn read_be_u32(&mut self) -> Result<u32, std::io::Error> {
        (**self).read_be_u32()
    }

    fn read_to(&mut self, terminal: u8) -> Result<&[u8], std::io::Error>
    {
        (**self).read_to(terminal)
    }

    fn steal(&mut self, amount: usize) -> Result<Vec<u8>, std::io::Error> {
        (**self).steal(amount)
    }

    fn steal_eof(&mut self) -> Result<Vec<u8>, std::io::Error> {
        (**self).steal_eof()
    }

    fn drop_eof(&mut self) -> Result<bool, std::io::Error> {
        (**self).drop_eof()
    }

    fn get_mut(&mut self) -> Option<&mut dyn BufferedReader<C>> {
        (**self).get_mut()
    }

    fn get_ref(&self) -> Option<&dyn BufferedReader<C>> {
        (**self).get_ref()
    }

    fn into_boxed<'b>(self) -> Box<dyn BufferedReader<C> + 'b>
        where Self: 'b
    {
        Box::new(self)
    }

    fn as_boxed<'b>(self) -> Box<dyn BufferedReader<C> + 'b>
        where Self: 'b
    {
        Box::new(self)
    }

    fn into_inner<'b>(self: Box<Self>) -> Option<Box<dyn BufferedReader<C> + 'b>>
        where Self: 'b
    {
        None
    }

    fn cookie_set(&mut self, cookie: C) -> C {
        (**self).cookie_set(cookie)
    }

    fn cookie_ref(&self) -> &C {
        (**self).cookie_ref()
    }

    fn cookie_mut(&mut self) -> &mut C {
        (**self).cookie_mut()
    }
}

// The file was created as follows:
//
//   for i in $(seq 0 9999); do printf "%04d\n" $i; done > buffered-reader-test.txt
#[cfg(test)]
fn buffered_reader_test_data_check<'a, T: BufferedReader<C> + 'a, C: fmt::Debug + Sync + Send>(bio: &mut T) {
    use std::str;

    for i in 0 .. 10000 {
        let consumed = {
            // Each number is 4 bytes plus a newline character.
            let d = bio.data_hard(5);
            if d.is_err() {
                println!("Error for i == {}: {:?}", i, d);
            }
            let d = d.unwrap();
            assert!(d.len() >= 5);
            assert_eq!(format!("{:04}\n", i), str::from_utf8(&d[0..5]).unwrap());

            5
        };

        bio.consume(consumed);
    }
}

#[cfg(test)]
const BUFFERED_READER_TEST_DATA: &[u8] =
    include_bytes!("buffered-reader-test.txt");

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn buffered_reader_eof_test() {
        let data = BUFFERED_READER_TEST_DATA;
        // Make sure data_eof works.
        {
            let mut bio = Memory::new(data);
            let amount = {
                bio.data_eof().unwrap().len()
            };
            bio.consume(amount);
            assert_eq!(bio.data(1).unwrap().len(), 0);
        }

        // Try it again with a limitor.
        {
            let bio = Memory::new(data);
            let mut bio2 = Limitor::new(
                bio, (data.len() / 2) as u64);
            let amount = {
                bio2.data_eof().unwrap().len()
            };
            assert_eq!(amount, data.len() / 2);
            bio2.consume(amount);
            assert_eq!(bio2.data(1).unwrap().len(), 0);
        }
    }

    #[cfg(test)]
    fn buffered_reader_read_test_aux<'a, T: BufferedReader<C> + 'a, C: fmt::Debug + Sync + Send>
        (mut bio: T, data: &[u8]) {
        let mut buffer = [0; 99];

        // Make sure the test file has more than buffer.len() bytes
        // worth of data.
        assert!(buffer.len() < data.len());

        // The number of reads we'll have to perform.
        let iters = (data.len() + buffer.len() - 1) / buffer.len();
        // Iterate more than the number of required reads to check
        // what happens when we try to read beyond the end of the
        // file.
        for i in 1..iters + 2 {
            let data_start = (i - 1) * buffer.len();

            // We don't want to just check that read works in
            // isolation. We want to be able to mix .read and .data
            // calls.
            {
                let result = bio.data(buffer.len());
                let buffer = result.unwrap();
                if !buffer.is_empty() {
                    assert_eq!(buffer,
                               &data[data_start..data_start + buffer.len()]);
                }
            }

            // Now do the actual read.
            let result = bio.read(&mut buffer[..]);
            let got = result.unwrap();
            if got > 0 {
                assert_eq!(&buffer[0..got],
                           &data[data_start..data_start + got]);
            }

            if i > iters {
                // We should have read everything.
                assert!(got == 0);
            } else if i == iters {
                // The last read. This may be less than buffer.len().
                // But it should include at least one byte.
                assert!(0 < got);
                assert!(got <= buffer.len());
            } else {
                assert_eq!(got, buffer.len());
            }
        }
    }

    #[test]
    fn buffered_reader_read_test() {
        let data = BUFFERED_READER_TEST_DATA;
        {
            let bio = Memory::new(data);
            buffered_reader_read_test_aux(bio, data);
        }

        {
            use std::path::PathBuf;
            use std::fs::File;

            let path : PathBuf = [env!("CARGO_MANIFEST_DIR"),
                                  "src",
                                  "buffered-reader-test.txt"]
                .iter().collect();

            let mut f = File::open(&path).expect(&path.to_string_lossy());
            let bio = Generic::new(&mut f, None);
            buffered_reader_read_test_aux(bio, data);
        }
    }

    #[test]
    fn drop_until() {
        let data : &[u8] = &b"abcd"[..];
        let mut reader = Memory::new(data);

        // Matches the 'a' at 0 and consumes 0 bytes.
        assert_eq!(reader.drop_until(b"ab").unwrap(), 0);
        // Matches the 'b' at 1 and consumes 1 byte.
        assert_eq!(reader.drop_until(b"bc").unwrap(), 1);
        // Matches the 'b' at 1 and consumes 0 bytes.
        assert_eq!(reader.drop_until(b"ab").unwrap(), 0);
        // Matches the 'd' at 3 and consumes 2 bytes.
        assert_eq!(reader.drop_until(b"de").unwrap(), 2);
        // Matches nothing, consuming the last byte.
        assert_eq!(reader.drop_until(b"e").unwrap(), 1);
        // Matches nothing, consuming nothing.
        assert_eq!(reader.drop_until(b"e").unwrap(), 0);
    }

    #[test]
    fn drop_through() {
        let data : &[u8] = &b"abcd"[..];
        let mut reader = Memory::new(data);

        // Matches the 'a' at 0 and consumes 1 byte.
        assert_eq!(reader.drop_through(b"ab", false).unwrap(),
                   (Some(b'a'), 1));
        // Matches the 'b' at 1 and consumes 1 byte.
        assert_eq!(reader.drop_through(b"ab", false).unwrap(),
                   (Some(b'b'), 1));
        // Matches the 'd' at 3 and consumes 2 bytes.
        assert_eq!(reader.drop_through(b"def", false).unwrap(),
                   (Some(b'd'), 2));
        // Doesn't match (eof).
        assert!(reader.drop_through(b"def", false).is_err());
        // Matches EOF.
        assert!(reader.drop_through(b"def", true).unwrap().0.is_none());
    }

    #[test]
    fn copy() -> io::Result<()> {
        // The memory reader has all the data buffered, so copying it
        // will issue a single write.
        let mut bio = Memory::new(BUFFERED_READER_TEST_DATA);
        let mut sink = Vec::new();
        let amount = bio.copy(&mut sink)?;
        assert_eq!(amount, 50_000);
        assert_eq!(&sink[..], BUFFERED_READER_TEST_DATA);

        // The generic reader uses buffers of the given chunk size, so
        // copying it will issue multiple writes.
        let mut bio = Generic::new(BUFFERED_READER_TEST_DATA, Some(64));
        let mut sink = Vec::new();
        let amount = bio.copy(&mut sink)?;
        assert_eq!(amount, 50_000);
        assert_eq!(&sink[..], BUFFERED_READER_TEST_DATA);
        Ok(())
    }

    #[test]
    fn mutable_reference() {
        use crate::Memory;
        const DATA : &[u8] = b"01234567890123456789suffix";

        /// API that consumes the memory reader.
        fn parse_ten_bytes<B: BufferedReader<()>>(mut r: B) {
            let d = r.data_consume_hard(10).unwrap();
            assert!(d.len() >= 10);
            assert_eq!(&d[..10], &DATA[..10]);
            drop(r); // We consumed the reader.
        }

        let mut mem = Memory::new(DATA);
        parse_ten_bytes(&mut mem);
        parse_ten_bytes(&mut mem);
        let suffix = mem.data_eof().unwrap();
        assert_eq!(suffix, b"suffix");

        let mut mem = Memory::new(DATA);
        let mut limitor = Limitor::new(&mut mem, 20);
        parse_ten_bytes(&mut limitor);
        parse_ten_bytes(&mut limitor);
        assert!(limitor.eof());
        drop(limitor);
        let suffix = mem.data_eof().unwrap();
        assert_eq!(suffix, b"suffix");
    }

    #[test]
    fn mutable_reference_with_cookie() {
        use crate::Memory;
        const DATA : &[u8] = b"01234567890123456789suffix";

        /// API that consumes the memory reader.
        fn parse_ten_bytes<B, C>(mut r: B)
        where B: BufferedReader<C>,
              C: std::fmt::Debug + Send + Sync
        {
            let d = r.data_consume_hard(10).unwrap();
            assert!(d.len() >= 10);
            assert_eq!(&d[..10], &DATA[..10]);
            drop(r); // We consumed the reader.
        }

        #[derive(Debug)]
        struct Cookie {
        }

        impl Default for Cookie {
            fn default() -> Self { Cookie {} }
        }

        let mut mem = Memory::with_cookie(DATA, Cookie::default());
        parse_ten_bytes(&mut mem);
        parse_ten_bytes(&mut mem);
        let suffix = mem.data_eof().unwrap();
        assert_eq!(suffix, b"suffix");

        let mut mem = Memory::with_cookie(DATA, Cookie::default());
        let mut limitor = Limitor::with_cookie(
            &mut mem, 20, Cookie::default());
        parse_ten_bytes(&mut limitor);
        parse_ten_bytes(&mut limitor);
        assert!(limitor.eof());
        drop(limitor);
        let suffix = mem.data_eof().unwrap();
        assert_eq!(suffix, b"suffix");

        let mut mem = Memory::with_cookie(DATA, Cookie::default());
        let mut mem = Box::new(&mut mem) as Box<dyn BufferedReader<Cookie>>;
        let mut limitor = Limitor::with_cookie(
            &mut mem, 20, Cookie::default());
        parse_ten_bytes(&mut limitor);
        parse_ten_bytes(&mut limitor);
        assert!(limitor.eof());
        drop(limitor);
        let suffix = mem.data_eof().unwrap();
        assert_eq!(suffix, b"suffix");
    }

    #[test]
    fn mutable_reference_inner() {
        use crate::Memory;
        const DATA : &[u8] = b"01234567890123456789suffix";

        /// API that consumes the memory reader.
        fn parse_ten_bytes<B: BufferedReader<()>>(mut r: B) {
            let d = r.data_consume_hard(10).unwrap();
            assert!(d.len() >= 10);
            assert_eq!(&d[..10], &DATA[..10]);
            drop(r); // We consumed the reader.
        }

        let mut mem = Memory::new(DATA);
        let mut limitor = Limitor::new(&mut mem, 20);
        parse_ten_bytes(&mut limitor);
        parse_ten_bytes(&mut limitor);
        assert!(limitor.eof());

        // Check that get_mut returns `mem` by reading from the inner
        // and checking that we get more data.
        let mem = limitor.get_mut().expect("have inner");
        let suffix = mem.data_eof().unwrap();
        assert_eq!(suffix, b"suffix");
    }
}