pub fn iter_from_counts<Find>(
counts: Vec<Count>,
db: Find,
progress: Box<dyn DynNestedProgress + 'static>,
_: Options
) -> impl Iterator<Item = Result<(SequenceId, Vec<Entry>), Error>> + Finalize<Reduce = Statistics<Error>>
Available on crate feature generate only.
Given a known list of object counts, calculate entries ready to be put into a data pack.
This allows objects to be written quite soon without having to wait for the entire pack to be built in memory. A chunk of objects is held in memory and compressed using DEFLATE, and serves as the output of this iterator. That way slow writers will naturally apply back pressure, and communicate to the implementation that more time can be spent compressing objects.
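The pull-based behaviour described above can be illustrated with a minimal, standard-library-only sketch. It is not the crate's implementation: `FakeEntry`, `ChunkedEntries`, and `chunks_produced` are hypothetical names invented for this example, compression is omitted, and the real iterator yields `Result` items rather than bare tuples. The point is only that a chunk is built when the consumer asks for it, so a slow writer delays further work.

```rust
/// Hypothetical stand-in for a pack entry; NOT the crate's `Entry` type.
#[derive(Debug, PartialEq)]
pub struct FakeEntry {
    pub id: u32,
}

/// Hypothetical iterator that lazily groups object ids into chunks of entries.
pub struct ChunkedEntries {
    ids: std::vec::IntoIter<u32>,
    chunk_size: usize,
    next_sequence_id: usize,
    chunks_produced: usize,
}

impl ChunkedEntries {
    pub fn new(ids: Vec<u32>, chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk_size must be positive");
        ChunkedEntries {
            ids: ids.into_iter(),
            chunk_size,
            next_sequence_id: 0,
            chunks_produced: 0,
        }
    }

    /// Number of chunks built so far; only grows when `next()` is called.
    pub fn chunks_produced(&self) -> usize {
        self.chunks_produced
    }
}

impl Iterator for ChunkedEntries {
    /// (sequence id, entries of one chunk); the real item type is a `Result`.
    type Item = (usize, Vec<FakeEntry>);

    fn next(&mut self) -> Option<Self::Item> {
        // Work happens here, on demand: no chunk exists until it is requested.
        let chunk: Vec<FakeEntry> = self
            .ids
            .by_ref()
            .take(self.chunk_size)
            .map(|id| FakeEntry { id })
            .collect();
        if chunk.is_empty() {
            return None;
        }
        let sequence_id = self.next_sequence_id;
        self.next_sequence_id += 1;
        self.chunks_produced += 1;
        Some((sequence_id, chunk))
    }
}
```

A consumer that writes each chunk before calling `next()` again never has more than one chunk in flight in this sketch, which is the back-pressure effect the description refers to.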
- counts: A list of previously counted objects to add to the pack. Duplication checks are not performed; no object is expected to be duplicated.
- progress: A way to obtain progress information.
- options: More configuration.
Returns the checksum of the pack
Discussion
Advantages
- Begins writing immediately and supports back-pressure.
- Abstracts over object databases and how input is provided.
Disadvantages
- Currently there is no way to easily write the pack index, even though the state here is uniquely positioned to do so with minimal overhead (especially compared to gix index-from-pack). Probably works now by chaining Iterators or keeping enough state to write a pack and then generate an index with recorded data.