buffered_reader/
lib.rs

1//! A [`BufferedReader`] is a super-powered `Read`er.
2//!
3//! Like the [`BufRead`] trait, the [`BufferedReader`] trait has an
4//! internal buffer that is directly exposed to the user.  This design
5//! enables two performance optimizations.  First, the use of an
6//! internal buffer amortizes system calls.  Second, exposing the
7//! internal buffer allows the user to work with data in place, which
8//! avoids another copy.
9//!
10//! The [`BufRead`] trait, however, has a significant limitation for
11//! parsers: the user of a [`BufRead`] object can't control the amount
12//! of buffering.  Such control is essential for conveniently working
13//! with data in place, and for looking ahead without consuming
14//! data.  The result is that either the sizing has to be
15//! handled by the instantiator of the [`BufRead`] object---assuming
16//! the [`BufRead`] object provides such a mechanism---which is a
17//! layering violation, or the parser has to fall back to buffering if
18//! the internal buffer is too small, which eliminates most of the
19//! advantages of the [`BufRead`] abstraction.  The [`BufferedReader`]
20//! trait addresses this shortcoming by allowing the user to control
21//! the size of the internal buffer.
22//!
23//! The [`BufferedReader`] trait also provides a generic
24//! interface for working with a stack of [`BufferedReader`]
25//! objects, which simplifies using multiple parsers
26//! simultaneously.  This is helpful when one parser deals with
27//! framing (e.g., something like [HTTP's chunk transfer encoding]),
28//! and another decodes the actual objects.  It is also useful when
29//! objects are nested.
30//!
31//! # Details
32//!
33//! Because the [`BufRead`] trait doesn't provide a mechanism for the
34//! user to size the internal buffer, a parser can't generally be sure
35//! that the internal buffer will be large enough to allow it to work
36//! with all data in place.
37//!
38//! Using the standard [`BufRead`] implementation, [`BufReader`], the
39//! instantiator can set the size of the internal buffer at creation
40//! time.  Unfortunately, this mechanism is ugly, and not always
41//! adequate.  First, the parser is typically not the instantiator.
42//! Thus, the instantiator needs to know about the implementation
43//! details of all of the parsers, which turns an implementation
44//! detail into a cross-cutting concern.  Second, when working with
45//! dynamically sized data, the maximum amount of the data that needs
46//! to be worked with in place may not be known a priori, or the
47//! maximum amount may be significantly larger than the typical
48//! amount.  This leads to poorly sized buffers.
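//!
//! The capacity limit can be observed with the standard library alone.
//! The following is a minimal sketch; the helper name
//! `visible_without_consuming` is made up for this illustration and is
//! not part of any API:
//!
//! ```rust
//! use std::io::{BufRead, BufReader};
//!
//! // Hypothetical helper: returns how many bytes a `BufReader` with
//! // the given capacity exposes before anything is consumed.
//! fn visible_without_consuming(capacity: usize, input: &[u8])
//!     -> std::io::Result<usize>
//! {
//!     let mut reader = BufReader::with_capacity(capacity, input);
//!     let first = reader.fill_buf()?.len();
//!     // Asking again does not help: until `consume` is called,
//!     // `fill_buf` hands back the same bytes.
//!     let second = reader.fill_buf()?.len();
//!     assert_eq!(first, second);
//!     Ok(first)
//! }
//! ```
//!
//! With a capacity of 4 and ten bytes of input, this returns 4: the
//! parser cannot see the remaining six bytes without first consuming.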
49//!
50//! Alternatively, the code that uses, but does not instantiate, a
51//! [`BufRead`] object can be changed to stream the data, or to
52//! fall back to reading the data into a local buffer if the internal
53//! buffer is too small.  Both of these approaches increase code
54//! complexity, and the latter approach is contrary to the
55//! [`BufRead`]'s goal of reducing unnecessary copying.
56//!
57//! The [`BufferedReader`] trait solves this problem by allowing the
58//! user to dynamically (i.e., at read time, not open time) ensure
59//! that the internal buffer has a certain amount of data.
60//!
61//! The ability to control the size of the internal buffer is also
62//! essential to straightforward support for speculative lookahead.
63//! The reason that speculative lookahead with a [`BufRead`] object is
64//! difficult is that speculative lookahead is *speculative*, i.e., if
65//! the parser backtracks, the data that was read must not be
66//! consumed.  Using a [`BufRead`] object, this is not possible if the
67//! amount of lookahead is larger than the internal buffer.  That is,
68//! if the amount of lookahead data is larger than the [`BufRead`]'s
69//! internal buffer, the parser first has to [`std::io::BufRead::consume`] some
70//! data to be able to examine more data.  But, if the parser then
71//! decides to backtrack, it has no way to return the unused data to
72//! the [`BufRead`] object.  This forces the parser to manage a buffer
73//! of read but unconsumed data, which significantly complicates the
74//! code.
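//!
//! The effect can be sketched with the standard library's
//! [`BufReader`].  The helper `lookahead_windows` below is hypothetical
//! and only serves this illustration: each window becomes visible only
//! after the previous one has been consumed, so the caller has to copy
//! anything it may still need.
//!
//! ```rust
//! use std::io::{BufRead, BufReader};
//!
//! // Hypothetical helper: collects the successive windows that a
//! // `BufReader` with the given capacity exposes.
//! fn lookahead_windows(capacity: usize, input: &[u8])
//!     -> std::io::Result<Vec<Vec<u8>>>
//! {
//!     let mut reader = BufReader::with_capacity(capacity, input);
//!     let mut windows = Vec::new();
//!     loop {
//!         // Copy the window: once it is consumed, the reader will
//!         // not hand it out again.
//!         let window = reader.fill_buf()?.to_vec();
//!         if window.is_empty() {
//!             break;
//!         }
//!         let len = window.len();
//!         windows.push(window);
//!         // The only way to see further is to give these bytes up.
//!         reader.consume(len);
//!     }
//!     Ok(windows)
//! }
//! ```
//!
//! For a capacity of 4 and the input `0123456789`, the windows are
//! `0123`, `4567` and `89`; a parser that wants to backtrack to the
//! start must keep its own copy of the earlier windows.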
75//!
76//! The [`BufferedReader`] trait also simplifies working with a stack of
77//! [`BufferedReader`]s in two ways.  First, the [`BufferedReader`] trait
78//! provides *generic* methods to access the underlying
79//! [`BufferedReader`].  Thus, even when dealing with a trait object, it
80//! is still possible to recover the underlying [`BufferedReader`].
81//! Second, the [`BufferedReader`] provides a mechanism to associate
82//! generic state with each [`BufferedReader`] via a cookie.  Although
83//! it is possible to realize this functionality using a custom trait
84//! that extends the [`BufferedReader`] trait and wraps existing
85//! [`BufferedReader`] implementations, this approach eliminates a lot
86//! of error-prone, boilerplate code.
87//!
88//! # Examples
89//!
90//! The following examples show not only how to use a
91//! [`BufferedReader`], but also better illustrate the aforementioned
92//! limitations of a [`BufRead`]er.
93//!
94//! Consider a file consisting of a sequence of objects, which are
95//! laid out as follows.  Each object has a two byte header that
96//! indicates the object's size in bytes.  The object immediately
97//! follows the header.  Thus, if we had two objects: "foobar" and
98//! "xyzzy", in that order, the file would look like this:
99//!
100//! ```text
101//! 0 6 f o o b a r 0 5 x y z z y
102//! ```
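//!
//! For reference, a file in this layout could be produced as follows.
//! This is a minimal sketch using only the standard library; the
//! helper `frame` is hypothetical, and the big endian header is an
//! assumption that matches the `read_be_u16` call used by the parser
//! in the next example:
//!
//! ```rust
//! use std::convert::TryFrom;
//!
//! // Hypothetical helper: serializes `objects` as length-prefixed
//! // records.  Returns `None` if an object is too large for the two
//! // byte header.
//! fn frame(objects: &[&[u8]]) -> Option<Vec<u8>> {
//!     let mut out = Vec::new();
//!     for object in objects {
//!         let len = u16::try_from(object.len()).ok()?;
//!         // Two byte header, most significant byte first.
//!         out.extend_from_slice(&len.to_be_bytes());
//!         out.extend_from_slice(object);
//!     }
//!     Some(out)
//! }
//! ```
//!
//! Calling `frame` with the objects "foobar" and "xyzzy" yields exactly
//! the byte sequence shown above.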
103//!
104//! Here's how we might parse this type of file using a
105//! [`BufferedReader`]:
106//!
107//! ```
108//! use buffered_reader;
109//! use buffered_reader::BufferedReader;
110//!
111//! fn parse_object(content: &[u8]) {
112//!     // Parse the object.
113//!     # let _ = content;
114//! }
115//!
116//! # f(); fn f() -> Result<(), std::io::Error> {
117//! # const FILENAME : &str = "/dev/null";
118//! let mut br = buffered_reader::File::open(FILENAME)?;
119//!
120//! // While we haven't reached EOF (i.e., we can read at
121//! // least one byte).
122//! while br.data(1)?.len() > 0 {
123//!     // Get the object's length.
124//!     let len = br.read_be_u16()? as usize;
125//!     // Get the object's content.
126//!     let content = br.data_consume_hard(len)?;
127//!
128//!     // Parse the actual object using a real parser.  Recall:
129//!     // `data_consume_hard()` may return more than the requested
130//!     // amount (but it will never return less).
131//!     parse_object(&content[..len]);
132//! }
133//! # Ok(()) }
134//! ```
135//!
136//! Note that `content` is actually a pointer to the
137//! [`BufferedReader`]'s internal buffer.  Thus, getting some data
138//! doesn't require copying the data into a local buffer, which is
139//! often discarded immediately after the data is parsed.
140//!
141//! Further, [`BufferedReader::data`] (like the other related functions) is guaranteed
142//! to return at least the requested amount of data.  There are two
143//! exceptions: if an error occurs, or the end of the file is reached.
144//! Thus, only the cases that the user actually needs to handle are
145//! exposed; there is no need to call something like
146//! [`std::io::Read::read`] in a loop to ensure the whole object is available.
147//!
148//! Because reading is separate from consuming data, it is possible to
149//! get a chunk of data, inspect it, and then consume only what is
150//! needed.  As mentioned above, this is only possible with a
151//! [`BufRead`] object if the internal buffer happens to be large
152//! enough.  Using a [`BufferedReader`], this is always possible,
153//! assuming the data fits in memory.
154//!
155//! In our example, we actually have two parsers: one that deals with
156//! the framing, and one for the actual objects.  The above code
157//! buffers the objects in their entirety, and then passes a slice
158//! containing the object to the object parser.  If the object parser
159//! also worked with a [`BufferedReader`] object, then less buffering
160//! would usually be needed, and the two parsers could run
161//! simultaneously.  This is particularly useful when the framing is
162//! more complicated, like [HTTP's chunk transfer encoding].  Then,
163//! when the object parser reads data, the frame parser is invoked
164//! lazily.  This is done by implementing the [`BufferedReader`] trait
165//! for the framing parser, and stacking the [`BufferedReader`]s.
166//!
167//! For our next example, we rewrite the previous code assuming that
168//! the object parser reads from a [`BufferedReader`] object.  Since the
169//! framing parser is really just a limit on the object's size, we
170//! don't need to implement a special [`BufferedReader`], but can use a
171//! [`Limitor`] to impose an upper limit on the amount
172//! that it can read.  After the object parser has finished, we drain
173//! the object reader.  This pattern is particularly helpful when
174//! individual objects that contain errors should be skipped.
175//!
176//! ```
177//! use buffered_reader;
178//! use buffered_reader::BufferedReader;
179//!
180//! fn parse_object<R: BufferedReader<()>>(br: &mut R) {
181//!     // Parse the object.
182//!     # let _ = br;
183//! }
184//!
185//! # f(); fn f() -> Result<(), std::io::Error> {
186//! # const FILENAME : &str = "/dev/null";
187//! let mut br : Box<dyn BufferedReader<()>>
188//!     = Box::new(buffered_reader::File::open(FILENAME)?);
189//!
190//! // While we haven't reached EOF (i.e., we can read at
191//! // least one byte).
192//! while br.data(1)?.len() > 0 {
193//!     // Get the object's length.
194//!     let len = br.read_be_u16()? as u64;
195//!
196//!     // Set up a limit.
197//!     br = Box::new(buffered_reader::Limitor::new(br, len));
198//!
199//!     // Parse the actual object using a real parser.
200//!     parse_object(&mut br);
201//!
202//!     // If the parser didn't consume the whole object, e.g., due to
203//!     // a parse error, drop the rest.
204//!     br.drop_eof();
205//!
206//!     // Recover the framing parser's `BufferedReader`.
207//!     br = br.into_inner().unwrap();
208//! }
209//! # Ok(()) }
210//! ```
211//!
212//! Of particular note is the generic functionality for dealing with
213//! stacked [`BufferedReader`]s: the [`BufferedReader::into_inner`] method is not bound
214//! to the implementation, which is often not available due to type
215//! erasure, but is provided by the trait.
216//!
217//! In addition to utility [`BufferedReader`]s like the
218//! [`Limitor`], this crate also includes a few
219//! general-purpose parsers, like the [`Zlib`]
220//! decompressor.
221//!
222//! [`BufRead`]: std::io::BufRead
223//! [`BufReader`]: std::io::BufReader
224//! [HTTP's chunk transfer encoding]: https://en.wikipedia.org/wiki/Chunked_transfer_encoding
225
226#![doc(html_favicon_url = "https://docs.sequoia-pgp.org/favicon.png")]
227#![doc(html_logo_url = "https://docs.sequoia-pgp.org/logo.svg")]
228#![warn(missing_docs)]
229
230use std::io;
231use std::io::{Error, ErrorKind};
232use std::cmp;
233use std::fmt;
234use std::convert::TryInto;
235
236#[macro_use]
237mod macros;
238
239mod generic;
240mod memory;
241mod limitor;
242mod reserve;
243mod dup;
244mod eof;
245mod adapter;
246#[cfg(feature = "compression-deflate")]
247mod decompress_deflate;
248#[cfg(feature = "compression-bzip2")]
249mod decompress_bzip2;
250
251pub use self::generic::Generic;
252pub use self::memory::Memory;
253pub use self::limitor::Limitor;
254pub use self::reserve::Reserve;
255pub use self::dup::Dup;
256pub use self::eof::EOF;
257pub use self::adapter::Adapter;
258#[cfg(feature = "compression-deflate")]
259pub use self::decompress_deflate::Deflate;
260#[cfg(feature = "compression-deflate")]
261pub use self::decompress_deflate::Zlib;
262#[cfg(feature = "compression-bzip2")]
263pub use self::decompress_bzip2::Bzip;
264
265// Common error type for file operations.
266mod file_error;
267
268// These are the different File implementations.  We
269// include the modules unconditionally, so that we catch bitrot early.
270#[allow(dead_code)]
271mod file_generic;
272#[allow(dead_code)]
273#[cfg(unix)]
274mod file_unix;
275
276// Then, we select the appropriate version to re-export.
277#[cfg(not(unix))]
278pub use self::file_generic::File;
279#[cfg(unix)]
280pub use self::file_unix::File;
281
282/// The default buffer size.
283///
284/// This is configurable by the SEQUOIA_BUFFERED_READER_BUFFER
285/// environment variable.
286fn default_buf_size() -> usize {
287    use std::sync::OnceLock;
288
289    static DEFAULT_BUF_SIZE: OnceLock<usize> = OnceLock::new();
290    *DEFAULT_BUF_SIZE.get_or_init(|| {
291        use std::env::var_os;
292        use std::str::FromStr;
293
294        let default = 32 * 1024;
295
296        if let Some(size) = var_os("SEQUOIA_BUFFERED_READER_BUFFER") {
297            size.to_str()
298                .and_then(|s| {
299                    match FromStr::from_str(s) {
300                        Ok(s) => Some(s),
301                        Err(err) => {
302                            eprintln!("Unable to parse the value of \
303                                       'SEQUOIA_BUFFERED_READER_BUFFER'; \
304                                       falling back to the default buffer \
305                                       size ({}): {}",
306                                      default, err);
307                            None
308                        }
309                    }
310                })
311                .unwrap_or(default)
312        } else {
313            default
314        }
315    })
316}
317
318// On debug builds, Vec<u8>::truncate is very, very slow.  For
319// instance, running the decrypt_test_stream test takes 51 seconds on
320// my (Neal's) computer using Vec<u8>::truncate and <0.1 seconds using
321// `unsafe { v.set_len(len); }`.
322//
323// The issue is that the compiler calls drop on every element that is
324// dropped, even though a u8 doesn't have a drop implementation.  The
325// compiler optimizes this away at high optimization levels, but those
326// levels make debugging harder.
327fn vec_truncate(v: &mut Vec<u8>, len: usize) {
328    if cfg!(debug_assertions) {
329        if len < v.len() {
330            unsafe { v.set_len(len); }
331        }
332    } else {
333        v.truncate(len);
334    }
335}
336
337/// Like `Vec<u8>::resize`, but fast in debug builds.
338fn vec_resize(v: &mut Vec<u8>, new_size: usize) {
339    if v.len() < new_size {
340        v.resize(new_size, 0);
341    } else {
342        vec_truncate(v, new_size);
343    }
344}
345
346/// The generic `BufferedReader` interface.
347pub trait BufferedReader<C> : io::Read + fmt::Debug + fmt::Display + Send + Sync
348  where C: fmt::Debug + Send + Sync
349{
350    /// Returns a reference to the internal buffer.
351    ///
352    /// Note: this returns the same data as `self.data(0)`, but it
353    /// does so without mutably borrowing self:
354    ///
355    /// ```
356    /// # f(); fn f() -> Result<(), std::io::Error> {
357    /// use buffered_reader;
358    /// use buffered_reader::BufferedReader;
359    ///
360    /// let mut br = buffered_reader::Memory::new(&b"0123456789"[..]);
361    ///
362    /// let first = br.data(10)?.len();
363    /// let second = br.buffer().len();
364    /// // `buffer` must return exactly what `data` returned.
365    /// assert_eq!(first, second);
366    /// # Ok(()) }
367    /// ```
368    fn buffer(&self) -> &[u8];
369
370    /// Ensures that the internal buffer has at least `amount` bytes
371    /// of data, and returns it.
372    ///
373    /// If the internal buffer contains less than `amount` bytes of
374    /// data, the internal buffer is first filled.
375    ///
376    /// The returned slice will have *at least* `amount` bytes unless
377    /// EOF has been reached or an error occurs, in which case the
378    /// returned slice will contain the rest of the file.
379    ///
380    /// Errors are returned only when the internal buffer is empty.
381    ///
382    /// This function does not advance the cursor.  To advance the
383    /// cursor, use [`BufferedReader::consume`].
384    ///
385    /// Note: If the internal buffer already contains at least
386    /// `amount` bytes of data, then [`BufferedReader`]
387    /// implementations are guaranteed to simply return the internal
388    /// buffer.  As such, multiple calls to [`BufferedReader::data`]
389    /// for the same `amount` will return the same slice.
390    ///
391    /// If a [`BufferedReader`] receives `EINTR` when `read`ing, it will
392    /// automatically retry reading.
393    ///
394    /// Further, [`BufferedReader`] implementations are guaranteed to
395    /// not shrink the internal buffer.  Thus, once some data has been
396    /// returned, it will always be returned until it is consumed.
397    /// As such, the following must hold:
398    ///
399    /// ```
400    /// # f(); fn f() -> Result<(), std::io::Error> {
401    /// use buffered_reader;
402    /// use buffered_reader::BufferedReader;
403    ///
404    /// let mut br = buffered_reader::Memory::new(&b"0123456789"[..]);
405    ///
406    /// let first = br.data(10)?.len();
407    /// let second = br.data(5)?.len();
408    /// // Even though less data is requested, the second call must
409    /// // return the same slice as the first call.
410    /// assert_eq!(first, second);
411    /// # Ok(()) }
412    /// ```
413    fn data(&mut self, amount: usize) -> Result<&[u8], io::Error>;
414
415    /// Like [`BufferedReader::data`], but returns an error if there are not at least
416    /// `amount` bytes available.
417    ///
418    /// [`BufferedReader::data_hard`] is a variant of [`BufferedReader::data`] that returns at least
419    /// `amount` bytes of data or an error.  Thus, unlike [`BufferedReader::data`],
420    /// which will return less than `amount` bytes of data if EOF is
421    /// encountered, [`BufferedReader::data_hard`] returns an error, specifically,
422    /// `io::ErrorKind::UnexpectedEof`.
423    ///
424    /// # Examples
425    ///
426    /// ```
427    /// # f(); fn f() -> Result<(), std::io::Error> {
428    /// use buffered_reader;
429    /// use buffered_reader::BufferedReader;
430    ///
431    /// let mut br = buffered_reader::Memory::new(&b"0123456789"[..]);
432    ///
433    /// // Trying to read more than there is available results in an error.
434    /// assert!(br.data_hard(20).is_err());
435    /// // Whereas with data(), everything through EOF is returned.
436    /// assert_eq!(br.data(20)?.len(), 10);
437    /// # Ok(()) }
438    /// ```
439    fn data_hard(&mut self, amount: usize) -> Result<&[u8], io::Error> {
440        let result = self.data(amount);
441        if let Ok(buffer) = result {
442            if buffer.len() < amount {
443                return Err(Error::new(ErrorKind::UnexpectedEof,
444                                      "unexpected EOF"));
445            }
446        }
447        result
448    }
449
450    /// Returns all of the data until EOF.  Like [`BufferedReader::data`], this does not
451    /// actually consume the data that is read.
452    ///
453    /// In general, you shouldn't use this function as it can cause an
454    /// enormous amount of buffering.  But, if you know that the
455    /// amount of data is limited, this is acceptable.
456    ///
457    /// # Examples
458    ///
459    /// ```
460    /// # f(); fn f() -> Result<(), std::io::Error> {
461    /// use buffered_reader;
462    /// use buffered_reader::BufferedReader;
463    ///
464    /// const AMOUNT : usize = 100 * 1024 * 1024;
465    /// let buffer = vec![0u8; AMOUNT];
466    /// let mut br = buffered_reader::Generic::new(&buffer[..], None);
467    ///
468    /// // Normally, only a small amount will be buffered.
469    /// assert!(br.data(10)?.len() <= AMOUNT);
470    ///
471    /// // `data_eof` buffers everything.
472    /// assert_eq!(br.data_eof()?.len(), AMOUNT);
473    ///
474    /// // Now that everything is buffered, buffer(), data(), and
475    /// // data_hard() will also return everything.
476    /// assert_eq!(br.buffer().len(), AMOUNT);
477    /// assert_eq!(br.data(10)?.len(), AMOUNT);
478    /// assert_eq!(br.data_hard(10)?.len(), AMOUNT);
479    /// # Ok(()) }
480    /// ```
481    fn data_eof(&mut self) -> Result<&[u8], io::Error> {
482        // Don't just read std::usize::MAX bytes at once.  The
483        // implementation might try to actually allocate a buffer that
484        // large!  Instead, try with increasingly larger buffers until
485        // the read is (strictly) shorter than the specified size.
486        let mut s = default_buf_size();
487        // We will break the loop eventually, because self.data(s)
488        // must return a slice shorter than std::usize::MAX.
489        loop {
490            match self.data(s) {
491                Ok(buffer) => {
492                    if buffer.len() < s {
493                        // We really want to do
494                        //
495                        //   return Ok(buffer);
496                        //
497                        // But, the borrow checker won't let us:
498                        //
499                        //  error[E0499]: cannot borrow `*self` as
500                        //  mutable more than once at a time.
501                        //
502                        // Instead, we break out of the loop, and then
503                        // call self.buffer().
504                        s = buffer.len();
505                        break;
506                    } else {
507                        s *= 2;
508                    }
509                }
510                Err(err) =>
511                    return Err(err),
512            }
513        }
514
515        let buffer = self.buffer();
516        assert_eq!(buffer.len(), s);
517        Ok(buffer)
518    }
519
520    /// Consumes some of the data.
521    ///
522    /// This advances the internal cursor by `amount`.  It is an error
523    /// to call this function to consume data that hasn't been
524    /// returned by [`BufferedReader::data`] or a related function.
525    ///
526    /// Note: It is safe to call this function to consume more data
527    /// than requested in a previous call to [`BufferedReader::data`], but only if
528    /// [`BufferedReader::data`] also returned that data.
529    ///
530    /// This function returns the internal buffer *including* the
531    /// consumed data.  Thus, the [`BufferedReader`] implementation must
532    /// continue to buffer the consumed data until the reference goes
533    /// out of scope.
534    ///
535    /// # Examples
536    ///
537    /// ```
538    /// # f(); fn f() -> Result<(), std::io::Error> {
539    /// use buffered_reader;
540    /// use buffered_reader::BufferedReader;
541    ///
542    /// const AMOUNT : usize = 100 * 1024 * 1024;
543    /// let buffer = vec![0u8; AMOUNT];
544    /// let mut br = buffered_reader::Generic::new(&buffer[..], None);
545    ///
546    /// let amount = {
547    ///     // We want at least 1024 bytes, but we'll be happy with
548    ///     // more or less.
549    ///     let buffer = br.data(1024)?;
550    ///     // Parse the data or something.
551    ///     let used = buffer.len();
552    ///     used
553    /// };
554    /// let buffer = br.consume(amount);
555    /// # Ok(()) }
556    /// ```
557    fn consume(&mut self, amount: usize) -> &[u8];
558
559    /// A convenience function that combines [`BufferedReader::data`] and [`BufferedReader::consume`].
560    ///
561    /// If less than `amount` bytes are available, this function
562    /// consumes what is available.
563    ///
564    /// Note: Due to lifetime issues, it is not possible to call
565    /// [`BufferedReader::data`], work with the returned buffer, and then call
566    /// [`BufferedReader::consume`] in the same scope, because both [`BufferedReader::data`] and
567    /// [`BufferedReader::consume`] take a mutable reference to the [`BufferedReader`].
568    /// This function makes this common pattern easier.
569    ///
570    /// # Examples
571    ///
572    /// ```
573    /// # f(); fn f() -> Result<(), std::io::Error> {
574    /// use buffered_reader;
575    /// use buffered_reader::BufferedReader;
576    ///
577    /// let orig = b"0123456789";
578    /// let mut br = buffered_reader::Memory::new(&orig[..]);
579    ///
580    /// // We need a new scope for each call to `data_consume()`, because
581    /// // the `buffer` reference locks `br`.
582    /// {
583    ///     let buffer = br.data_consume(3)?;
584    ///     assert_eq!(buffer, &orig[..buffer.len()]);
585    /// }
586    ///
587    /// // Note that the cursor has advanced.
588    /// {
589    ///     let buffer = br.data_consume(3)?;
590    ///     assert_eq!(buffer, &orig[3..3 + buffer.len()]);
591    /// }
592    ///
593    /// // Like `data()`, `data_consume()` may return and consume less
594    /// // than requested if there is no more data available.
595    /// {
596    ///     let buffer = br.data_consume(10)?;
597    ///     assert_eq!(buffer, &orig[6..6 + buffer.len()]);
598    /// }
599    ///
600    /// {
601    ///     let buffer = br.data_consume(10)?;
602    ///     assert_eq!(buffer.len(), 0);
603    /// }
604    /// # Ok(()) }
605    /// ```
606    fn data_consume(&mut self, amount: usize)
607                    -> Result<&[u8], std::io::Error> {
608        let amount = cmp::min(amount, self.data(amount)?.len());
609
610        let buffer = self.consume(amount);
611        assert!(buffer.len() >= amount);
612        Ok(buffer)
613    }
614
615    /// A convenience function that effectively combines [`BufferedReader::data_hard`]
616    /// and [`BufferedReader::consume`].
617    ///
618    /// This function is identical to [`BufferedReader::data_consume`], but internally
619    /// uses [`BufferedReader::data_hard`] instead of [`BufferedReader::data`].
620    fn data_consume_hard(&mut self, amount: usize)
621        -> Result<&[u8], io::Error>
622    {
623        let len = self.data_hard(amount)?.len();
624        assert!(len >= amount);
625
626        let buffer = self.consume(amount);
627        assert!(buffer.len() >= amount);
628        Ok(buffer)
629    }
630
631    /// Checks whether the end of the stream is reached.
632    fn eof(&mut self) -> bool {
633        self.data_hard(1).is_err()
634    }
635
636    /// Checks whether this reader is consummated.
637    ///
638    /// For most readers, this function will return true once the end
639    /// of the stream is reached.  However, some readers are concerned
640    /// with packet framing (e.g. the [`Limitor`]).  Those readers
641    /// consider themselves consummated if the amount of data
642    /// indicated by the packet frame is consumed.
643    ///
644    /// This allows us to detect truncation.  A packet is truncated,
645    /// iff the end of the stream is reached, but the reader is not
646    /// consummated.
647    ///
648    fn consummated(&mut self) -> bool {
649        self.eof()
650    }
651
652    /// A convenience function for reading a 16-bit unsigned integer
653    /// in big endian format.
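    ///
    /// Big endian means that the most significant byte comes first,
    /// so the bytes `0x12 0x34` decode to `0x1234`.  The decoding
    /// step can be sketched on a plain slice; the helper `be_u16` is
    /// hypothetical and not part of this crate:
    ///
    /// ```rust
    /// use std::convert::TryInto;
    ///
    /// // Hypothetical helper: decodes the first two bytes of `input`
    /// // as a big endian `u16`, or returns `None` if there are
    /// // fewer than two bytes.
    /// fn be_u16(input: &[u8]) -> Option<u16> {
    ///     let bytes: [u8; 2] = input.get(..2)?.try_into().ok()?;
    ///     Some(u16::from_be_bytes(bytes))
    /// }
    /// ```
    ///
    /// Unlike this sketch, `read_be_u16` also consumes the two bytes
    /// and reports a short read as an error.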
654    fn read_be_u16(&mut self) -> Result<u16, std::io::Error> {
655        let input = self.data_consume_hard(2)?;
656        // input holds at least 2 bytes, so this cannot fail.
657        Ok(u16::from_be_bytes(input[..2].try_into().unwrap()))
658    }
659
660    /// A convenience function for reading a 32-bit unsigned integer
661    /// in big endian format.
662    fn read_be_u32(&mut self) -> Result<u32, std::io::Error> {
663        let input = self.data_consume_hard(4)?;
664        // input holds at least 4 bytes, so this cannot fail.
665        Ok(u32::from_be_bytes(input[..4].try_into().unwrap()))
666    }
667
668    /// Reads until either `terminal` is encountered or EOF.
669    ///
670    /// Returns either a `&[u8]` terminating in `terminal` or the rest
671    /// of the data, if EOF was encountered.
672    ///
673    /// Note: this function does *not* consume the data.
674    ///
675    /// # Examples
676    ///
677    /// ```
678    /// # f(); fn f() -> Result<(), std::io::Error> {
679    /// use buffered_reader;
680    /// use buffered_reader::BufferedReader;
681    ///
682    /// let orig = b"0123456789";
683    /// let mut br = buffered_reader::Memory::new(&orig[..]);
684    ///
685    /// {
686    ///     let s = br.read_to(b'3')?;
687    ///     assert_eq!(s, b"0123");
688    /// }
689    ///
690    /// // `read_to()` doesn't consume the data.
691    /// {
692    ///     let s = br.read_to(b'5')?;
693    ///     assert_eq!(s, b"012345");
694    /// }
695    ///
696    /// // Even if there is more data in the internal buffer, only
697    /// // the data through the match is returned.
698    /// {
699    ///     let s = br.read_to(b'1')?;
700    ///     assert_eq!(s, b"01");
701    /// }
702    ///
703    /// // If the terminal is not found, everything is returned...
704    /// {
705    ///     let s = br.read_to(b'A')?;
706    ///     assert_eq!(s, orig);
707    /// }
708    ///
709    /// // If we consume some data, the search starts at the cursor,
710    /// // not the beginning of the file.
711    /// br.consume(3);
712    ///
713    /// {
714    ///     let s = br.read_to(b'5')?;
715    ///     assert_eq!(s, b"345");
716    /// }
717    /// # Ok(()) }
718    /// ```
719    fn read_to(&mut self, terminal: u8) -> Result<&[u8], std::io::Error> {
720        let mut n = 128;
721        let len;
722
723        loop {
724            let data = self.data(n)?;
725
726            if let Some(position)
727                = data.iter().position(|c| *c == terminal)
728            {
729                len = position + 1;
730                break;
731            } else if data.len() < n {
732                // EOF.
733                len = data.len();
734                break;
735            } else {
736                // Read more data.
737                n = cmp::max(2 * n, data.len() + 1024);
738            }
739        }
740
741        Ok(&self.buffer()[..len])
742    }
743
744    /// Discards the input until one of the bytes in `terminals` is
745    /// encountered.
746    ///
747    /// The matching byte is not discarded.
748    ///
749    /// Returns the number of bytes discarded.
750    ///
751    /// The end of file is considered a match.
752    ///
753    /// `terminals` must be sorted.
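    ///
    /// The sorting requirement exists because the default
    /// implementation looks up each input byte in `terminals` with a
    /// binary search.  The search can be sketched on a plain slice;
    /// the helper `skip_len` is hypothetical and not part of this
    /// crate:
    ///
    /// ```rust
    /// // Hypothetical helper: returns the number of leading bytes of
    /// // `input` that precede the first byte found in `terminals`,
    /// // which must be sorted.  If nothing matches, the whole input
    /// // counts, mirroring the rule that EOF is a match.
    /// fn skip_len(input: &[u8], terminals: &[u8]) -> usize {
    ///     input.iter()
    ///         .position(|c| terminals.binary_search(c).is_ok())
    ///         .unwrap_or(input.len())
    /// }
    /// ```
    ///
    /// For the input `hello, world` and the terminals space and
    /// comma, five bytes are skipped and the comma is left in place.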
    fn drop_until(&mut self, terminals: &[u8])
        -> Result<usize, std::io::Error>
    {
        // Make sure terminals is sorted.
        for t in terminals.windows(2) {
            assert!(t[0] <= t[1]);
        }

        let buf_size = default_buf_size();
        let mut total = 0;
        let position = 'outer: loop {
            let len = {
                // Try self.buffer.  Only if it is empty, use
                // self.data.
                let buffer = if self.buffer().is_empty() {
                    self.data(buf_size)?
                } else {
                    self.buffer()
                };

                if buffer.is_empty() {
                    break 'outer 0;
                }

                if let Some(position) = buffer.iter().position(
                    |c| terminals.binary_search(c).is_ok())
                {
                    break 'outer position;
                }

                buffer.len()
            };

            self.consume(len);
            total += len;
        };

        self.consume(position);
        Ok(total + position)
    }

    /// Discards the input until one of the bytes in `terminals` is
    /// encountered.
    ///
    /// The matching byte is also discarded.
    ///
    /// Returns the terminal byte (or `None` if the end of file was
    /// matched) and the number of bytes discarded.
    ///
    /// If `match_eof` is true, then the end of file is considered a
    /// match.  Otherwise, if the end of file is encountered, an error
    /// is returned.
    ///
    /// `terminals` must be sorted.
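    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming the crate's [`Memory`] reader; the
    /// `key=value` input is made up for illustration:
    ///
    /// ```
    /// use buffered_reader::{BufferedReader, Memory};
    ///
    /// let mut reader = Memory::new(&b"key=value"[..]);
    ///
    /// // Discards "key" and the '=' itself: four bytes in total.
    /// assert_eq!(reader.drop_through(b"=", false).unwrap(),
    ///            (Some(b'='), 4));
    /// assert_eq!(reader.buffer(), b"value");
    ///
    /// // No further '=' exists.  With `match_eof` set, reaching the
    /// // end of file counts as a match and no terminal is returned.
    /// assert_eq!(reader.drop_through(b"=", true).unwrap(), (None, 5));
    /// ```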
    fn drop_through(&mut self, terminals: &[u8], match_eof: bool)
        -> Result<(Option<u8>, usize), std::io::Error>
    {
        let dropped = self.drop_until(terminals)?;
        match self.data_consume(1) {
            Ok([]) if match_eof => Ok((None, dropped)),
            Ok([]) => Err(Error::new(ErrorKind::UnexpectedEof, "EOF")),
            Ok(rest) => Ok((Some(rest[0]), dropped + 1)),
            Err(err) => Err(err),
        }
    }

    /// Like [`BufferedReader::data_consume_hard`], but returns the data in a
    /// caller-owned buffer.
    ///
    /// [`BufferedReader`] implementations may optimize this to avoid a
    /// copy by directly returning the internal buffer.
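    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming the crate's [`Memory`] reader; the
    /// input bytes are made up for illustration:
    ///
    /// ```
    /// use buffered_reader::{BufferedReader, Memory};
    ///
    /// let mut reader = Memory::new(&b"0123456789"[..]);
    ///
    /// // The returned `Vec` is owned by the caller, so it does not
    /// // borrow from the reader.
    /// let head: Vec<u8> = reader.steal(4).unwrap();
    /// assert_eq!(head, b"0123");
    ///
    /// // The stolen bytes have been consumed.
    /// assert_eq!(reader.buffer(), b"456789");
    /// ```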
    fn steal(&mut self, amount: usize) -> Result<Vec<u8>, std::io::Error> {
        let mut data = self.data_consume_hard(amount)?;
        assert!(data.len() >= amount);
        if data.len() > amount {
            data = &data[..amount];
        }
        Ok(data.to_vec())
    }

    /// Like [`BufferedReader::steal`], but instead of stealing a fixed number of
    /// bytes, steals all of the data until the end of file.
    fn steal_eof(&mut self) -> Result<Vec<u8>, std::io::Error> {
        let len = self.data_eof()?.len();
        let data = self.steal(len)?;
        Ok(data)
    }

    /// Like [`BufferedReader::steal_eof`], but instead of returning the data, the
    /// data is discarded.
    ///
    /// On success, returns whether any data (i.e., at least one byte)
    /// was discarded.
    ///
    /// Note: whereas [`BufferedReader::steal_eof`] needs to buffer all of the data,
    /// this function reads the data a chunk at a time, and then
    /// discards it.  A consequence of this is that an error may occur
    /// after we have consumed some of the data.
    fn drop_eof(&mut self) -> Result<bool, std::io::Error> {
        let buf_size = default_buf_size();
        let mut at_least_one_byte = false;
        loop {
            let n = self.data(buf_size)?.len();
            at_least_one_byte |= n > 0;
            self.consume(n);
            if n < buf_size {
                // EOF.
                break;
            }
        }

        Ok(at_least_one_byte)
    }

    /// Copies data to the given writer returning the copied amount.
    ///
    /// This is like using [`std::io::copy`], but more efficient as it
    /// avoids an extra copy, and it will try to copy all the data the
    /// reader has already buffered.
    ///
    /// On success, returns the amount of data (in bytes) that has
    /// been copied.
    ///
    /// Note: this function reads and copies the data a chunk at a
    /// time.  A consequence of this is that an error may occur after
    /// we have consumed some of the data.
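    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming the crate's [`Memory`] reader and a
    /// `Vec<u8>` as the sink; the input bytes are made up for
    /// illustration:
    ///
    /// ```
    /// use buffered_reader::{BufferedReader, Memory};
    ///
    /// let mut reader = Memory::new(&b"hello, world"[..]);
    /// let mut sink: Vec<u8> = Vec::new();
    ///
    /// let copied = reader.copy(&mut sink).unwrap();
    /// assert_eq!(copied, 12);
    /// assert_eq!(sink, b"hello, world");
    ///
    /// // Everything has been consumed from the reader.
    /// assert!(reader.data(1).unwrap().is_empty());
    /// ```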
    fn copy(&mut self, sink: &mut dyn io::Write) -> io::Result<u64> {
        let buf_size = default_buf_size();
        let mut total = 0;
        loop {
            let data = self.data(buf_size)?;
            sink.write_all(data)?;

            let n = data.len();
            total += n as u64;
            self.consume(n);
            if n < buf_size {
                // EOF.
                break;
            }
        }

        Ok(total)
    }

    /// A helpful debugging aid to pretty print a Buffered Reader stack.
    ///
    /// Uses the Buffered Readers' `fmt::Display` implementations.
    fn dump(&self, sink: &mut dyn std::io::Write) -> std::io::Result<()>
        where Self: std::marker::Sized
    {
        let mut i = 1;
        let mut reader: Option<&dyn BufferedReader<C>> = Some(self);
        while let Some(r) = reader {
            {
                let cookie = r.cookie_ref();
                writeln!(sink, "  {}. {}, {:?}", i, r, cookie)?;
            }
            reader = r.get_ref();
            i += 1;
        }
        Ok(())
    }

    /// Boxes the reader.
    fn into_boxed<'a>(self) -> Box<dyn BufferedReader<C> + 'a>
        where Self: 'a + Sized
    {
        Box::new(self)
    }

    /// Boxes the reader.
    #[deprecated(note = "Use into_boxed")]
    fn as_boxed<'a>(self) -> Box<dyn BufferedReader<C> + 'a>
        where Self: 'a + Sized
    {
        self.into_boxed()
    }

    /// Returns the underlying reader, if any.
    ///
    /// To allow this to work with [`BufferedReader`] traits, it is
    /// necessary for `Self` to be boxed.
    ///
    /// This can lead to the following unusual code:
    ///
    /// ```text
    /// let inner = Box::new(br).into_inner();
    /// ```
    ///
    /// Note: if `Self` is not actually owned, e.g., you passed a
    /// reference, then this returns `None` as it is not possible to
    /// consume the outer buffered reader.  Consider:
    ///
    /// ```
    /// # use buffered_reader::BufferedReader;
    /// # use buffered_reader::Limitor;
    /// # use buffered_reader::Memory;
    /// #
    /// # const DATA : &[u8] = b"01234567890123456789suffix";
    /// #
    /// let mut mem = Memory::new(DATA);
    /// let mut limitor = Limitor::new(mem, 20);
    /// let mut br = Box::new(&mut limitor);
    /// // br doesn't own limitor, so it can't consume it.
    /// assert!(matches!(br.into_inner(), None));
    ///
    /// let mut mem = Memory::new(DATA);
    /// let mut limitor = Limitor::new(mem, 20);
    /// let mut br = Box::new(limitor);
    /// assert!(matches!(br.into_inner(), Some(_)));
    /// ```
    fn into_inner<'a>(self: Box<Self>) -> Option<Box<dyn BufferedReader<C> + 'a>>
        where Self: 'a;

    /// Returns a mutable reference to the inner [`BufferedReader`], if
    /// any.
    ///
    /// It is a very bad idea to read any data from the inner
    /// [`BufferedReader`], because this [`BufferedReader`] may have some
    /// data buffered.  However, this function can be useful to get
    /// the cookie.
    fn get_mut(&mut self) -> Option<&mut dyn BufferedReader<C>>;

    /// Returns a reference to the inner [`BufferedReader`], if any.
    fn get_ref(&self) -> Option<&dyn BufferedReader<C>>;

    /// Sets the [`BufferedReader`]'s cookie and returns the old value.
    fn cookie_set(&mut self, cookie: C) -> C;

    /// Returns a reference to the [`BufferedReader`]'s cookie.
    fn cookie_ref(&self) -> &C;

    /// Returns a mutable reference to the [`BufferedReader`]'s cookie.
    fn cookie_mut(&mut self) -> &mut C;
}

/// A generic implementation of `std::io::Read::read` appropriate for
/// any [`BufferedReader`] implementation.
///
/// This function implements the `std::io::Read::read` method in terms
/// of the `data_consume` method.  We can't use the `std::io::Read`
/// interface, because the [`BufferedReader`] may have buffered some
/// data internally (in which case a read will not return the buffered
/// data, but the following data).
///
/// This implementation is generic.  When deriving a [`BufferedReader`],
/// you can include the following:
///
/// ```text
/// impl<'a, T: BufferedReader> std::io::Read for XXX<'a, T> {
///     fn read(&mut self, buf: &mut [u8]) -> Result<usize, std::io::Error> {
///         return buffered_reader_generic_read_impl(self, buf);
///     }
/// }
/// ```
///
/// It would be nice if we could do:
///
/// ```text
/// impl <T: BufferedReader> std::io::Read for T { ... }
/// ```
///
/// but, alas, Rust doesn't like that ("error\[E0119\]: conflicting
/// implementations of trait `std::io::Read` for type `&mut _`").
pub fn buffered_reader_generic_read_impl<T: BufferedReader<C>, C: fmt::Debug + Sync + Send>
        (bio: &mut T, buf: &mut [u8]) -> Result<usize, io::Error> {
    bio
        .data_consume(buf.len())
        .map(|inner| {
            let amount = cmp::min(buf.len(), inner.len());
            buf[0..amount].copy_from_slice(&inner[0..amount]);
            amount
        })
}

/// Make a `Box<BufferedReader>` look like a BufferedReader.
impl <'a, C: fmt::Debug + Sync + Send> BufferedReader<C> for Box<dyn BufferedReader<C> + 'a> {
    fn buffer(&self) -> &[u8] {
        return self.as_ref().buffer();
    }

    fn data(&mut self, amount: usize) -> Result<&[u8], io::Error> {
        return self.as_mut().data(amount);
    }

    fn data_hard(&mut self, amount: usize) -> Result<&[u8], io::Error> {
        return self.as_mut().data_hard(amount);
    }

    fn data_eof(&mut self) -> Result<&[u8], io::Error> {
        return self.as_mut().data_eof();
    }

    fn consume(&mut self, amount: usize) -> &[u8] {
        return self.as_mut().consume(amount);
    }

    fn data_consume(&mut self, amount: usize)
                    -> Result<&[u8], std::io::Error> {
        return self.as_mut().data_consume(amount);
    }

    fn data_consume_hard(&mut self, amount: usize) -> Result<&[u8], io::Error> {
        return self.as_mut().data_consume_hard(amount);
    }

    fn consummated(&mut self) -> bool {
        self.as_mut().consummated()
    }

    fn read_be_u16(&mut self) -> Result<u16, std::io::Error> {
        return self.as_mut().read_be_u16();
    }

    fn read_be_u32(&mut self) -> Result<u32, std::io::Error> {
        return self.as_mut().read_be_u32();
    }

    fn read_to(&mut self, terminal: u8) -> Result<&[u8], std::io::Error>
    {
        return self.as_mut().read_to(terminal);
    }

    fn steal(&mut self, amount: usize) -> Result<Vec<u8>, std::io::Error> {
        return self.as_mut().steal(amount);
    }

    fn steal_eof(&mut self) -> Result<Vec<u8>, std::io::Error> {
        return self.as_mut().steal_eof();
    }

    fn drop_eof(&mut self) -> Result<bool, std::io::Error> {
        return self.as_mut().drop_eof();
    }

    fn get_mut(&mut self) -> Option<&mut dyn BufferedReader<C>> {
        // Strip the outer box.
        self.as_mut().get_mut()
    }

    fn get_ref(&self) -> Option<&dyn BufferedReader<C>> {
        // Strip the outer box.
        self.as_ref().get_ref()
    }

    fn into_boxed<'b>(self) -> Box<dyn BufferedReader<C> + 'b>
        where Self: 'b
    {
        self
    }

    fn as_boxed<'b>(self) -> Box<dyn BufferedReader<C> + 'b>
        where Self: 'b
    {
        self
    }

    fn into_inner<'b>(self: Box<Self>) -> Option<Box<dyn BufferedReader<C> + 'b>>
            where Self: 'b {
        // Strip the outer box.
        (*self).into_inner()
    }

    fn cookie_set(&mut self, cookie: C) -> C {
        self.as_mut().cookie_set(cookie)
    }

    fn cookie_ref(&self) -> &C {
        self.as_ref().cookie_ref()
    }

    fn cookie_mut(&mut self) -> &mut C {
        self.as_mut().cookie_mut()
    }
}

/// Make a `&mut T` where `T` implements `BufferedReader` look like a
/// BufferedReader.
impl <'a, T, C> BufferedReader<C> for &'a mut T
where
    T: BufferedReader<C>,
    C: fmt::Debug + Sync + Send + 'a
{
    fn buffer(&self) -> &[u8] {
        (**self).buffer()
    }

    fn data(&mut self, amount: usize) -> Result<&[u8], io::Error> {
        (**self).data(amount)
    }

    fn data_hard(&mut self, amount: usize) -> Result<&[u8], io::Error> {
        (**self).data_hard(amount)
    }

    fn data_eof(&mut self) -> Result<&[u8], io::Error> {
        (**self).data_eof()
    }

    fn consume(&mut self, amount: usize) -> &[u8] {
        (**self).consume(amount)
    }

    fn data_consume(&mut self, amount: usize)
                    -> Result<&[u8], std::io::Error> {
        (**self).data_consume(amount)
    }

    fn data_consume_hard(&mut self, amount: usize) -> Result<&[u8], io::Error> {
        (**self).data_consume_hard(amount)
    }

    fn consummated(&mut self) -> bool {
        (**self).consummated()
    }

    fn read_be_u16(&mut self) -> Result<u16, std::io::Error> {
        (**self).read_be_u16()
    }

    fn read_be_u32(&mut self) -> Result<u32, std::io::Error> {
        (**self).read_be_u32()
    }

    fn read_to(&mut self, terminal: u8) -> Result<&[u8], std::io::Error>
    {
        (**self).read_to(terminal)
    }

    fn steal(&mut self, amount: usize) -> Result<Vec<u8>, std::io::Error> {
        (**self).steal(amount)
    }

    fn steal_eof(&mut self) -> Result<Vec<u8>, std::io::Error> {
        (**self).steal_eof()
    }

    fn drop_eof(&mut self) -> Result<bool, std::io::Error> {
        (**self).drop_eof()
    }

    fn get_mut(&mut self) -> Option<&mut dyn BufferedReader<C>> {
        (**self).get_mut()
    }

    fn get_ref(&self) -> Option<&dyn BufferedReader<C>> {
        (**self).get_ref()
    }

    fn into_boxed<'b>(self) -> Box<dyn BufferedReader<C> + 'b>
        where Self: 'b
    {
        Box::new(self)
    }

    fn as_boxed<'b>(self) -> Box<dyn BufferedReader<C> + 'b>
        where Self: 'b
    {
        Box::new(self)
    }

    fn into_inner<'b>(self: Box<Self>) -> Option<Box<dyn BufferedReader<C> + 'b>>
            where Self: 'b
    {
        None
    }

    fn cookie_set(&mut self, cookie: C) -> C {
        (**self).cookie_set(cookie)
    }

    fn cookie_ref(&self) -> &C {
        (**self).cookie_ref()
    }

    fn cookie_mut(&mut self) -> &mut C {
        (**self).cookie_mut()
    }
}

// The file was created as follows:
//
//   for i in $(seq 0 9999); do printf "%04d\n" $i; done > buffered-reader-test.txt
#[cfg(test)]
fn buffered_reader_test_data_check<'a, T: BufferedReader<C> + 'a, C: fmt::Debug + Sync + Send>(bio: &mut T) {
    use std::str;

    for i in 0 .. 10000 {
        let consumed = {
            // Each number is 4 bytes plus a newline character.
            let d = bio.data_hard(5);
            if d.is_err() {
                println!("Error for i == {}: {:?}", i, d);
            }
            let d = d.unwrap();
            assert!(d.len() >= 5);
            assert_eq!(format!("{:04}\n", i), str::from_utf8(&d[0..5]).unwrap());

            5
        };

        bio.consume(consumed);
    }
}

#[cfg(test)]
const BUFFERED_READER_TEST_DATA: &[u8] =
    include_bytes!("buffered-reader-test.txt");

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn buffered_reader_eof_test() {
        let data = BUFFERED_READER_TEST_DATA;
        // Make sure data_eof works.
        {
            let mut bio = Memory::new(data);
            let amount = {
                bio.data_eof().unwrap().len()
            };
            bio.consume(amount);
            assert_eq!(bio.data(1).unwrap().len(), 0);
        }

        // Try it again with a limitor.
        {
            let bio = Memory::new(data);
            let mut bio2 = Limitor::new(
                bio, (data.len() / 2) as u64);
            let amount = {
                bio2.data_eof().unwrap().len()
            };
            assert_eq!(amount, data.len() / 2);
            bio2.consume(amount);
            assert_eq!(bio2.data(1).unwrap().len(), 0);
        }
    }

    #[cfg(test)]
    fn buffered_reader_read_test_aux<'a, T: BufferedReader<C> + 'a, C: fmt::Debug + Sync + Send>
        (mut bio: T, data: &[u8]) {
        let mut buffer = [0; 99];

        // Make sure the test file has more than buffer.len() bytes
        // worth of data.
        assert!(buffer.len() < data.len());

        // The number of reads we'll have to perform.
        let iters = (data.len() + buffer.len() - 1) / buffer.len();
        // Iterate more than the number of required reads to check
        // what happens when we try to read beyond the end of the
        // file.
        for i in 1..iters + 2 {
            let data_start = (i - 1) * buffer.len();

            // We don't want to just check that read works in
            // isolation.  We want to be able to mix .read and .data
            // calls.
            {
                let result = bio.data(buffer.len());
                let buffer = result.unwrap();
                if !buffer.is_empty() {
                    assert_eq!(buffer,
                               &data[data_start..data_start + buffer.len()]);
                }
            }

            // Now do the actual read.
            let result = bio.read(&mut buffer[..]);
            let got = result.unwrap();
            if got > 0 {
                assert_eq!(&buffer[0..got],
                           &data[data_start..data_start + got]);
            }

            if i > iters {
                // We should have read everything.
                assert!(got == 0);
            } else if i == iters {
                // The last read.  This may be less than buffer.len().
                // But it should include at least one byte.
                assert!(0 < got);
                assert!(got <= buffer.len());
            } else {
                assert_eq!(got, buffer.len());
            }
        }
    }

    #[test]
    fn buffered_reader_read_test() {
        let data = BUFFERED_READER_TEST_DATA;
        {
            let bio = Memory::new(data);
            buffered_reader_read_test_aux (bio, data);
        }

        {
            use std::path::PathBuf;
            use std::fs::File;

            let path : PathBuf = [env!("CARGO_MANIFEST_DIR"),
                                  "src",
                                  "buffered-reader-test.txt"]
                .iter().collect();

            let mut f = File::open(&path).expect(&path.to_string_lossy());
            let bio = Generic::new(&mut f, None);
            buffered_reader_read_test_aux (bio, data);
        }
    }

    #[test]
    fn drop_until() {
        let data : &[u8] = &b"abcd"[..];
        let mut reader = Memory::new(data);

        // Matches the 'a' at 0 and consumes 0 bytes.
        assert_eq!(reader.drop_until(b"ab").unwrap(), 0);
        // Matches the 'b' at 1 and consumes 1 byte.
        assert_eq!(reader.drop_until(b"bc").unwrap(), 1);
        // Matches the 'b' at 1 and consumes 0 bytes.
        assert_eq!(reader.drop_until(b"ab").unwrap(), 0);
        // Matches the 'd' at 3 and consumes 2 bytes.
        assert_eq!(reader.drop_until(b"de").unwrap(), 2);
        // Matches nothing, consuming the last byte.
        assert_eq!(reader.drop_until(b"e").unwrap(), 1);
        // Matches nothing, consuming nothing.
        assert_eq!(reader.drop_until(b"e").unwrap(), 0);
    }

    #[test]
    fn drop_through() {
        let data : &[u8] = &b"abcd"[..];
        let mut reader = Memory::new(data);

        // Matches the 'a' at 0 and consumes 1 byte.
        assert_eq!(reader.drop_through(b"ab", false).unwrap(),
                   (Some(b'a'), 1));
        // Matches the 'b' at 1 and consumes 1 byte.
        assert_eq!(reader.drop_through(b"ab", false).unwrap(),
                   (Some(b'b'), 1));
        // Matches the 'd' at 3 and consumes 2 bytes.
        assert_eq!(reader.drop_through(b"def", false).unwrap(),
                   (Some(b'd'), 2));
        // Doesn't match (eof).
        assert!(reader.drop_through(b"def", false).is_err());
        // Matches EOF.
        assert!(reader.drop_through(b"def", true).unwrap().0.is_none());
    }

    #[test]
    fn copy() -> io::Result<()> {
        // The memory reader has all the data buffered, copying it
        // will issue a single write.
        let mut bio = Memory::new(BUFFERED_READER_TEST_DATA);
        let mut sink = Vec::new();
        let amount = bio.copy(&mut sink)?;
        assert_eq!(amount, 50_000);
        assert_eq!(&sink[..], BUFFERED_READER_TEST_DATA);

        // The generic reader uses buffers of the given chunk size,
        // copying it will issue multiple writes.
        let mut bio = Generic::new(BUFFERED_READER_TEST_DATA, Some(64));
        let mut sink = Vec::new();
        let amount = bio.copy(&mut sink)?;
        assert_eq!(amount, 50_000);
        assert_eq!(&sink[..], BUFFERED_READER_TEST_DATA);
        Ok(())
    }

    #[test]
    fn mutable_reference() {
        use crate::Memory;
        const DATA : &[u8] = b"01234567890123456789suffix";

        /// API that consumes the memory reader.
        fn parse_ten_bytes<B: BufferedReader<()>>(mut r: B) {
            let d = r.data_consume_hard(10).unwrap();
            assert!(d.len() >= 10);
            assert_eq!(&d[..10], &DATA[..10]);
            drop(r); // We consumed the reader.
        }

        let mut mem = Memory::new(DATA);
        parse_ten_bytes(&mut mem);
        parse_ten_bytes(&mut mem);
        let suffix = mem.data_eof().unwrap();
        assert_eq!(suffix, b"suffix");

        let mut mem = Memory::new(DATA);
        let mut limitor = Limitor::new(&mut mem, 20);
        parse_ten_bytes(&mut limitor);
        parse_ten_bytes(&mut limitor);
        assert!(limitor.eof());
        drop(limitor);
        let suffix = mem.data_eof().unwrap();
        assert_eq!(suffix, b"suffix");
    }

    #[test]
    fn mutable_reference_with_cookie() {
        use crate::Memory;
        const DATA : &[u8] = b"01234567890123456789suffix";

        /// API that consumes the memory reader.
        fn parse_ten_bytes<B, C>(mut r: B)
        where B: BufferedReader<C>,
              C: std::fmt::Debug + Send + Sync
        {
            let d = r.data_consume_hard(10).unwrap();
            assert!(d.len() >= 10);
            assert_eq!(&d[..10], &DATA[..10]);
            drop(r); // We consumed the reader.
        }

        #[derive(Debug)]
        struct Cookie {
        }

        impl Default for Cookie {
            fn default() -> Self { Cookie {} }
        }

        let mut mem = Memory::with_cookie(DATA, Cookie::default());
        parse_ten_bytes(&mut mem);
        parse_ten_bytes(&mut mem);
        let suffix = mem.data_eof().unwrap();
        assert_eq!(suffix, b"suffix");

        let mut mem = Memory::with_cookie(DATA, Cookie::default());
        let mut limitor = Limitor::with_cookie(
            &mut mem, 20, Cookie::default());
        parse_ten_bytes(&mut limitor);
        parse_ten_bytes(&mut limitor);
        assert!(limitor.eof());
        drop(limitor);
        let suffix = mem.data_eof().unwrap();
        assert_eq!(suffix, b"suffix");

        let mut mem = Memory::with_cookie(DATA, Cookie::default());
        let mut mem = Box::new(&mut mem) as Box<dyn BufferedReader<Cookie>>;
        let mut limitor = Limitor::with_cookie(
            &mut mem, 20, Cookie::default());
        parse_ten_bytes(&mut limitor);
        parse_ten_bytes(&mut limitor);
        assert!(limitor.eof());
        drop(limitor);
        let suffix = mem.data_eof().unwrap();
        assert_eq!(suffix, b"suffix");
    }

    #[test]
    fn mutable_reference_inner() {
        use crate::Memory;
        const DATA : &[u8] = b"01234567890123456789suffix";

        /// API that consumes the memory reader.
        fn parse_ten_bytes<B: BufferedReader<()>>(mut r: B) {
            let d = r.data_consume_hard(10).unwrap();
            assert!(d.len() >= 10);
            assert_eq!(&d[..10], &DATA[..10]);
            drop(r); // We consumed the reader.
        }

        let mut mem = Memory::new(DATA);
        let mut limitor = Limitor::new(&mut mem, 20);
        parse_ten_bytes(&mut limitor);
        parse_ten_bytes(&mut limitor);
        assert!(limitor.eof());

        // Check that get_mut returns `mem` by reading from the inner
        // and checking that we get more data.
        let mem = limitor.get_mut().expect("have inner");
        let suffix = mem.data_eof().unwrap();
        assert_eq!(suffix, b"suffix");
    }
}