Struct File

pub struct File { /* private fields */ }

A representation of a pack index file

Implementations§

impl File

Instantiation

pub fn at(path: impl AsRef<Path>, object_hash: Kind) -> Result<File, Error>

Open the pack index file at the given path.

The object_hash is a way to read (and write) the same file format with different hashes, as the hash kind isn’t stored within the file format itself.
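Because the hash kind is not recorded in the file, the hash length must be known before any table in it can be located. The following is a minimal sketch, assuming the version 2 layout described in Git's pack-format documentation (8-byte header, 256-entry fan-out table, object ids, CRC32 values, 32-bit offsets, two trailing checksums). `min_v2_index_size` is a hypothetical helper for illustration and not part of this crate.

```rust
/// Smallest possible size in bytes of a version 2 pack index holding
/// `num_objects` entries, for a hash that is `hash_len` bytes long.
/// Hypothetical helper: it ignores the optional table of 64-bit offsets,
/// which is only needed for offsets that do not fit into 31 bits.
fn min_v2_index_size(num_objects: u64, hash_len: u64) -> u64 {
    const HEADER: u64 = 8; // magic bytes + version number
    const FAN_OUT: u64 = 256 * 4; // one u32 per possible first hash byte
    let per_entry = hash_len + 4 + 4; // object id + CRC32 + 32-bit offset
    let trailer = 2 * hash_len; // pack checksum + index checksum
    HEADER + FAN_OUT + num_objects * per_entry + trailer
}
```

Under these assumptions, three entries need 1156 bytes with 20-byte hashes and 1216 bytes with 32-byte hashes, so the same bytes cannot be interpreted without knowing the hash kind.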

impl File

Iteration and access

pub fn oid_at_index(&self, index: EntryIndex) -> &oid

Returns the object hash at the given index in our sorted list of hashes. Valid indices range from 0 up to, but not including, self.num_objects().

§Panics

If index is out of bounds.
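A minimal sketch of this kind of access, assuming the hashes are stored back to back in one flat, sorted table. `oid_at` and its parameters are hypothetical and only illustrate the bounds rule; they are not the crate's implementation.

```rust
/// Return the hash at `index` from a flat table of fixed-width hashes.
/// Panics if `index` is out of bounds, i.e. `index >= num_objects`.
fn oid_at(table: &[u8], hash_len: usize, index: usize) -> &[u8] {
    let num_objects = table.len() / hash_len;
    assert!(
        index < num_objects,
        "index {} is out of bounds for {} objects",
        index,
        num_objects
    );
    let start = index * hash_len;
    &table[start..start + hash_len]
}
```

With a 2-byte toy hash and three entries, indices 0, 1 and 2 are valid, while index 3 panics.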

pub fn pack_offset_at_index(&self, index: EntryIndex) -> Offset

Returns the offset into our pack data file at which to start reading the object at index.

§Panics

If index is out of bounds.
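As an illustration of how such an offset can be obtained, here is a sketch that assumes the version 2 encoding from Git's pack-format documentation: one big-endian u32 per entry, where a set most significant bit means the remaining 31 bits select a slot in a separate table of big-endian u64 offsets. The function and parameter names are hypothetical.

```rust
/// Decode the pack offset of entry `index`.
/// `offsets32` holds one big-endian u32 per entry, `offsets64` holds one
/// big-endian u64 per offset too large for 31 bits (assumed v2 layout).
/// Panics if `index` or the referenced 64-bit slot is out of bounds.
fn pack_offset_at(offsets32: &[u8], offsets64: &[u8], index: usize) -> u64 {
    const LARGE_OFFSET_FLAG: u32 = 0x8000_0000;

    let mut small = [0u8; 4];
    small.copy_from_slice(&offsets32[index * 4..index * 4 + 4]);
    let value = u32::from_be_bytes(small);
    if (value & LARGE_OFFSET_FLAG) == 0 {
        // The common case: the offset fits into 31 bits.
        return u64::from(value);
    }

    // Otherwise the low 31 bits are a slot number in the 64-bit table.
    let slot = (value & !LARGE_OFFSET_FLAG) as usize;
    let mut large = [0u8; 8];
    large.copy_from_slice(&offsets64[slot * 8..slot * 8 + 8]);
    u64::from_be_bytes(large)
}
```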

pub fn crc32_at_index(&self, index: EntryIndex) -> Option<u32>

Returns the CRC32 of the object at the given index.

Note: These are always present for index version 2 or higher.

§Panics

If index is out of bounds.
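The stored value can be used to check an entry without decoding it. The sketch below assumes, as in Git's pack-format documentation, that the CRC32 covers the entry's packed bytes and uses the common CRC-32 polynomial. Both functions are hypothetical helpers; a real program would likely use a table-driven or hardware-accelerated CRC32.

```rust
/// Bitwise CRC-32 (reflected polynomial 0xEDB88320). Slow but dependency-free.
fn crc32(bytes: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in bytes {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // `mask` is all ones if the lowest bit is set, otherwise zero.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Compare the packed bytes of an entry against the value from the index.
/// Returns `None` if the index has no CRC32 for the entry.
fn entry_matches_crc32(packed_entry: &[u8], stored: Option<u32>) -> Option<bool> {
    stored.map(|expected| crc32(packed_entry) == expected)
}
```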

pub fn lookup(&self, id: impl AsRef<oid>) -> Option<EntryIndex>

Returns the index of the given hash for use with oid_at_index(), pack_offset_at_index() or crc32_at_index(), or None if the hash is not contained in this index.
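Since the hashes are sorted, a lookup can be done with a binary search. The sketch below uses a hypothetical `ToyIndex` type to show the idea; the actual implementation may differ, for instance by first narrowing the search range with the fan-out table.

```rust
use std::cmp::Ordering;

/// Toy stand-in for an index: sorted, fixed-width hashes stored back to back.
struct ToyIndex {
    hash_len: usize,
    hashes: Vec<u8>,
}

impl ToyIndex {
    fn num_objects(&self) -> usize {
        self.hashes.len() / self.hash_len
    }

    fn oid_at_index(&self, index: usize) -> &[u8] {
        let start = index * self.hash_len;
        &self.hashes[start..start + self.hash_len]
    }

    /// Binary search for `id`, returning its entry index if present.
    fn lookup(&self, id: &[u8]) -> Option<usize> {
        let (mut lo, mut hi) = (0, self.num_objects());
        while lo < hi {
            let mid = lo + (hi - lo) / 2;
            match self.oid_at_index(mid).cmp(id) {
                Ordering::Less => lo = mid + 1,
                Ordering::Greater => hi = mid,
                Ordering::Equal => return Some(mid),
            }
        }
        None
    }
}
```

The returned position can be passed straight back to `oid_at_index()`, mirroring how the entry index returned by lookup() is meant to be used.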

pub fn lookup_prefix( &self, prefix: Prefix, candidates: Option<&mut Range<EntryIndex>>, ) -> Option<PrefixLookupResult>

Given a prefix, find an object that matches it uniquely within this index and return Some(Ok(entry_index)). If more than one object matches the prefix, Some(Err(())) is returned.

Finally, if no object in the index matches the prefix, the return value is None.

Pass candidates to obtain the range of entry indices matching prefix; the return value is the same as if candidates had remained None. The range will be empty if no object matched the prefix.
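The three possible outcomes can be sketched as follows. This is a simplified illustration under stated assumptions: the prefix is given in whole bytes rather than hex digits, the ids live in a plain sorted slice, and the function is a hypothetical free function rather than the method documented here.

```rust
use std::ops::Range;

/// Find the entries whose id starts with `prefix` in a sorted list of ids.
/// Mirrors the documented return values: `Some(Ok(i))` for a unique match,
/// `Some(Err(()))` for an ambiguous prefix and `None` for no match.
fn lookup_prefix(
    sorted_ids: &[Vec<u8>],
    prefix: &[u8],
    candidates: Option<&mut Range<usize>>,
) -> Option<Result<usize, ()>> {
    // All ids that sort before the prefix cannot match it.
    let start = sorted_ids.partition_point(|id| id.as_slice() < prefix);
    // Because the ids are sorted, all matches are adjacent.
    let len = sorted_ids[start..]
        .iter()
        .take_while(|id| id.starts_with(prefix))
        .count();

    if let Some(out) = candidates {
        *out = start..start + len; // empty if nothing matched
    }
    match len {
        0 => None,
        1 => Some(Ok(start)),
        _ => Some(Err(())),
    }
}
```

Passing `Some(&mut range)` does not change the return value; it only reports which entries matched.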

pub fn iter<'a>(&'a self) -> Box<dyn Iterator<Item = Entry> + 'a>

An iterator over all Entries of this index file.

pub fn sorted_offsets(&self) -> Vec<Offset>

Return a vector of ascending offsets into our respective pack data file.

Useful to control an iteration over all pack entries in a cache-friendly way.
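Entries in the index are ordered by hash, so their pack offsets appear in no particular order. The sketch below shows why sorting helps, assuming that pack entries are stored back to back so that each entry ends where the next one begins. Both functions and the `end_of_entries` parameter are hypothetical.

```rust
use std::ops::Range;

/// Sort offsets that were collected in index (hash) order.
fn sorted_offsets(offsets_in_index_order: &[u64]) -> Vec<u64> {
    let mut offsets = offsets_in_index_order.to_vec();
    offsets.sort_unstable();
    offsets
}

/// Turn ascending offsets into the byte range of each entry.
/// `end_of_entries` is the offset just past the last entry.
fn entry_ranges(sorted: &[u64], end_of_entries: u64) -> Vec<Range<u64>> {
    sorted
        .iter()
        .enumerate()
        .map(|(i, &start)| {
            let end = sorted.get(i + 1).copied().unwrap_or(end_of_entries);
            start..end
        })
        .collect()
}
```

Visiting the resulting ranges in order reads the pack front to back instead of jumping around in it.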

impl File

Traversal with index

pub fn traverse_with_index<Processor, E>( &self, pack: &File, processor: Processor, progress: &mut dyn DynNestedProgress, should_interrupt: &AtomicBool, _: Options, ) -> Result<Outcome, Error<E>>
where Processor: FnMut(Kind, &[u8], &Entry, &dyn Progress) -> Result<(), E> + Send + Clone, E: Error + Send + Sync + 'static,

Iterate through all decoded objects in the given pack and handle them with a Processor, using an index to reduce waste at the cost of memory.

For more details, see the documentation on the traverse() method.

impl File

Traversal with lookup

pub fn traverse_with_lookup<C, Processor, E, F>( &self, processor: Processor, pack: &File, progress: &mut dyn DynNestedProgress, should_interrupt: &AtomicBool, _: Options<F>, ) -> Result<Outcome, Error<E>>
where C: DecodeEntry, E: Error + Send + Sync + 'static, Processor: FnMut(Kind, &[u8], &Entry, &dyn Progress) -> Result<(), E> + Send + Clone, F: Fn() -> C + Send + Clone,

Iterate through all decoded objects in the given pack and handle them with a Processor using a cache to reduce the amount of waste while decoding objects.

For more details, see the documentation on the traverse() method.

impl File

Traversal of pack data files using an index file

pub fn traverse<C, Processor, E, F>( &self, pack: &File, progress: &mut dyn DynNestedProgress, should_interrupt: &AtomicBool, processor: Processor, _: Options<F>, ) -> Result<Outcome, Error<E>>
where C: DecodeEntry, E: Error + Send + Sync + 'static, Processor: FnMut(Kind, &[u8], &Entry, &dyn Progress) -> Result<(), E> + Send + Clone, F: Fn() -> C + Send + Clone,

Iterate through all decoded objects in the given pack and handle them with a Processor. The return value is (pack-checksum, Outcome, progress), thus the pack traversal will always verify the whole pack's checksum to ensure it is correct. In case of bit-rot, the operation will abort early via the interrupt mechanism, without verifying all objects.
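The general shape of a traversal that can be aborted through a shared flag can be sketched with the standard library alone. Everything here (`traverse_sketch`, `TraverseError`, the simplified processor signature) is hypothetical and single-threaded; it only illustrates how a should_interrupt flag and a fallible processor interact.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

#[derive(Debug, PartialEq)]
enum TraverseError {
    Interrupted,
    Processor(String),
}

/// Call `processor` for each object, stopping early if the flag is set
/// or if the processor reports an error. Returns the number of objects seen.
fn traverse_sketch<P>(
    objects: &[&[u8]],
    should_interrupt: &AtomicBool,
    mut processor: P,
) -> Result<usize, TraverseError>
where
    P: FnMut(&[u8]) -> Result<(), String>,
{
    let mut seen = 0;
    for &object in objects {
        // Check the flag before each object so an abort takes effect quickly.
        if should_interrupt.load(Ordering::Relaxed) {
            return Err(TraverseError::Interrupted);
        }
        processor(object).map_err(TraverseError::Processor)?;
        seen += 1;
    }
    Ok(seen)
}
```

Any other thread holding a reference to the flag can store `true` into it to stop the loop before all objects were handled.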

§Algorithms

Using the Options::traversal field one can choose between two algorithms providing different tradeoffs. Both invoke new_processor() to create functions receiving decoded objects, their object kind, index entry and a progress instance to provide progress information.

Algorithm::DeltaTreeLookup builds an index to avoid any unnecessary computation while resolving objects, avoiding the need for a cache entirely, rendering new_cache() unused. One could also call traverse_with_index() directly.
Algorithm::Lookup uses a cache created by new_cache() to avoid having to re-compute all bases of a delta-chain while decoding objects. One could also call traverse_with_lookup() directly.

Use thread_limit to further control parallelism and check to define how much the passed objects shall be verified beforehand.
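To make the idea behind the cache-based algorithm concrete, here is a toy sketch of resolving delta chains with a cache of decoded objects. The delta format (a base followed by a suffix), `ToyEntry`, `decode` and the `work` counter are all invented for illustration and unrelated to the real pack format or to this crate's cache types.

```rust
use std::collections::HashMap;

enum ToyEntry {
    Base(Vec<u8>),
    /// Toy delta: the decoded entry at `base`, followed by `suffix`.
    Delta { base: usize, suffix: Vec<u8> },
}

/// Decode the entry at `index`, reusing previously decoded entries from
/// `cache`. `work` counts how many entries actually had to be decoded.
fn decode(
    entries: &[ToyEntry],
    index: usize,
    cache: &mut HashMap<usize, Vec<u8>>,
    work: &mut usize,
) -> Vec<u8> {
    if let Some(hit) = cache.get(&index) {
        return hit.clone();
    }
    *work += 1;
    let decoded = match &entries[index] {
        ToyEntry::Base(data) => data.clone(),
        ToyEntry::Delta { base, suffix } => {
            // Resolve the base first; this walks down the delta chain.
            let mut out = decode(entries, *base, cache, work);
            out.extend_from_slice(suffix);
            out
        }
    };
    cache.insert(index, decoded.clone());
    decoded
}
```

Decoding the tip of a three-entry chain does three units of work, after which decoding any entry of that chain again is served from the cache.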

impl File

Verify and validate the content of the index file

pub fn index_checksum(&self) -> ObjectId

Returns the trailing hash stored at the end of this index file.

It’s a hash over all bytes of the index.

pub fn pack_checksum(&self) -> ObjectId

Returns the hash of the pack data file that this index file corresponds to.

It should match crate::data::File::checksum() of the corresponding pack data file.
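Both checksums sit at the very end of the file. The following sketch assumes the trailer layout from Git's pack-format documentation, where the pack checksum comes first and the index checksum last; `trailing_checksums` is a hypothetical helper operating on the raw bytes of an index file.

```rust
/// Split the two trailing checksums off the raw bytes of an index file.
/// Returns `(pack_checksum, index_checksum)`, or `None` if the input is
/// too short to contain both.
fn trailing_checksums(index_bytes: &[u8], hash_len: usize) -> Option<(&[u8], &[u8])> {
    let trailer_start = index_bytes.len().checked_sub(2 * hash_len)?;
    let (pack_checksum, index_checksum) = index_bytes[trailer_start..].split_at(hash_len);
    Some((pack_checksum, index_checksum))
}
```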

pub fn verify_checksum( &self, progress: &mut dyn Progress, should_interrupt: &AtomicBool, ) -> Result<ObjectId, Error>

Validate that our index_checksum() matches the actual contents of this index file, and return it if it does.
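The check itself amounts to hashing the file contents and comparing the result with the stored trailer. The sketch below assumes the hash covers every byte before the trailing hash, and it takes the hash function as a parameter. `toy_hash` is NOT a real object hash; it is a stand-in so that the example runs without dependencies, and `Mismatch` is a hypothetical error type.

```rust
#[derive(Debug, PartialEq)]
struct Mismatch {
    expected: Vec<u8>,
    actual: Vec<u8>,
}

/// Stand-in "hash": the wrapping sum of all bytes, one byte long.
/// Only for illustration; a real index uses a cryptographic hash.
fn toy_hash(bytes: &[u8]) -> Vec<u8> {
    vec![bytes.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))]
}

/// Hash everything before the trailing checksum and compare the result
/// with that checksum, returning it on success.
/// Panics if `index_bytes` is shorter than `hash_len`.
fn verify_checksum<H>(index_bytes: &[u8], hash_len: usize, hash: H) -> Result<Vec<u8>, Mismatch>
where
    H: Fn(&[u8]) -> Vec<u8>,
{
    let (content, stored) = index_bytes.split_at(index_bytes.len() - hash_len);
    let actual = hash(content);
    if actual.as_slice() == stored {
        Ok(actual)
    } else {
        Err(Mismatch { expected: stored.to_vec(), actual })
    }
}
```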

pub fn verify_integrity<C, F>( &self, pack: Option<PackContext<'_, F>>, progress: &mut dyn DynNestedProgress, should_interrupt: &AtomicBool, ) -> Result<Outcome, Error<Error>>
where C: DecodeEntry, F: Fn() -> C + Send + Clone,

The most thorough validation of integrity of both index file and the corresponding pack data file, if provided. Returns the checksum of the index file, the traversal outcome and the given progress if the integrity check is successful.

If pack is provided, it is expected (and validated) to be the pack belonging to this index. It will be used to validate the internal integrity of the pack before checking that each object's integrity is indeed as advertised via its SHA1 as stored in this index, as well as its CRC32 hash. The last member of the Option is a function returning an implementation of crate::cache::DecodeEntry to be used if the index::traverse::Algorithm is Lookup. To set this to None, use None::<(_, _, _, fn() -> crate::cache::Never)>.

The thread_limit optionally specifies the number of threads to be used for the pack traversal. make_cache is only used if a pack is specified; use existing implementations in the crate::cache module.

§Tradeoffs

The given progress is inevitably consumed if there is an error, which is a tradeoff chosen to easily allow using ? in the error case.

impl File

Various ways of writing an index file from pack entries

pub fn write_data_iter_to_stream<F, F2, R>( version: Version, make_resolver: F, entries: &mut dyn Iterator<Item = Result<Entry, Error>>, thread_limit: Option<usize>, root_progress: &mut dyn DynNestedProgress, out: &mut dyn Write, should_interrupt: &AtomicBool, object_hash: Kind, pack_version: Version, ) -> Result<Outcome, Error>
where F: FnOnce() -> Result<(F2, R)>, R: Send + Sync, F2: for<'r> Fn(EntryRange, &'r R) -> Option<&'r [u8]> + Send + Clone,

Available on crate feature streaming-input only.

Write information about entries as obtained from a pack data file into a pack index file via the out stream. The resolver produced by make_resolver must resolve pack entries from the same pack data file that produced the entries iterator.

version is the version of the pack index to produce; use crate::index::Version::default() if in doubt.
thread_limit is used for a parallel tree traversal for obtaining object hashes with optimal performance.
root_progress is the top-level progress to stay informed about the progress of this potentially long-running computation.
object_hash defines what kind of object hash we write into the index file.
pack_version is the version of the underlying pack for which entries are read. It’s used to compute a pack hash in case no entries are provided.

§Remarks

Neither in-pack nor out-of-pack Ref Deltas are supported here; these must have been resolved beforehand.
make_resolver() will only be called after the iterator stopped returning elements. It produces a function that provides all bytes belonging to a pack entry, writing them to the given mutable output Vec. It should return None if the entry cannot be resolved from the pack that produced the entries iterator, causing the write operation to fail.
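A resolver with the shape given in the signature above can be sketched for a pack held entirely in memory. The names below are hypothetical, `EntryRange` is assumed to be a byte range into the pack, and the `Result` in the signature is assumed to be `std::io::Result`; none of this is taken from the crate's own code.

```rust
use std::convert::TryFrom;
use std::io;
use std::ops::Range;

/// Assumed stand-in for the crate's `EntryRange`: a byte range in the pack.
type EntryRange = Range<u64>;

/// The resolver's type: it borrows the entry's bytes from the pack data.
type Resolver = for<'r> fn(EntryRange, &'r Vec<u8>) -> Option<&'r [u8]>;

/// Return the bytes of one pack entry, or `None` if the range does not
/// lie within the pack.
fn resolve<'r>(range: EntryRange, pack: &'r Vec<u8>) -> Option<&'r [u8]> {
    let start = usize::try_from(range.start).ok()?;
    let end = usize::try_from(range.end).ok()?;
    pack.get(start..end)
}

/// Produce the resolver together with the data it reads from.
/// A real version would typically open or map the pack file here.
fn make_resolver(pack_bytes: Vec<u8>) -> io::Result<(Resolver, Vec<u8>)> {
    let resolver: Resolver = resolve;
    Ok((resolver, pack_bytes))
}
```

To pass this as the argument-less make_resolver closure, one could wrap the call as `move || make_resolver(bytes)`.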
