pub fn walk(
worktree_root: &Path,
ctx: Context<'_>,
options: Options,
delegate: &mut dyn Delegate
) -> Result<(Outcome, PathBuf), Error>
Expand description
A function to perform a git-style, unsorted, directory walk.
worktree_root
- the top-most root of the worktree, which must be a prefix toroot
.- If
Options::precompose_unicode
is enabled, this path must be precomposed. - The starting point of the traversal (traversal root) is calculated from by doing
worktree_root + pathspec.common_prefix()
. - Note that if the traversal root leading to this directory or it itself is excluded, it will be provided to
Delegate::emit()
without further traversal. - If
Options::precompose_unicode
is enabled, all involved paths must be precomposed. - Must be contained in
worktree_root
.
- If
ctx
- everything needed to classify the paths seen during the traversal.delegate
- an implementation ofDelegate
to control details of the traversal and receive its results.
Returns (outcome, traversal_root)
, with the traversal_root
actually being used for the traversal,
useful to transform the paths returned for the user. It’s always within the worktree_root
, or the same,
but is hard to guess due to additional logic affecting it.
§Performance Notes
In theory, parallel directory traversal can be significantly faster, and what’s possible for our current
gix_features::fs::WalkDir
implementation is to abstract a filter_entry()
method so it works both for
the iterator from the walkdir
crate as well as from jwalk
. However, doing so as initial version
has the risk of not being significantly harder if not impossible to implement as flow-control is very
limited.
Thus the decision was made to start out with something akin to the Git implementation, get all tests and
baseline comparison to pass, and see if an iterator with just filter_entry
would be capable of dealing with
it. Note that filter_entry
is the only just-in-time traversal control that walkdir
offers, even though
one could consider switching to jwalk
and just use its single-threaded implementation if a unified interface
is necessary to make this work - jwalk
has a more powerful API for this to work.
If that was the case, we are talking about 0.5s for single-threaded traversal (without doing any extra work) or 0.25s for optimal multi-threaded performance, all in the WebKit directory with 388k items to traverse. Thus, the speedup could easily be 2x or more and thus worth investigating in due time.