pub enum CUDAExecutionProviderCuDNNConvAlgoSearch {
    Exhaustive,
    Heuristic,
    Default,
}
The type of search done for cuDNN convolution algorithms.
Variants§
Exhaustive
Expensive exhaustive benchmarking using cudnnFindConvolutionForwardAlgorithmEx.
This function will attempt all possible algorithms for cudnnConvolutionForward to find the fastest algorithm.
Exhaustive search trades memory usage for speed. The first execution of a graph will be slow while the candidate convolution algorithms are benchmarked.
Heuristic
Lightweight heuristic-based search using cudnnGetConvolutionForwardAlgorithm_v7.
Heuristic search sorts the available convolution algorithms by expected relative performance, as estimated by an internal heuristic.
Default
Uses the default convolution algorithm, CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM.
The default algorithm may not have the best performance, depending on the specific hardware used. It’s recommended to use Exhaustive or Heuristic to search for a faster algorithm instead. However, Default does have its uses, such as when available memory is tight.
NOTE: This name may be confusing, as this is not the default search algorithm for the CUDA EP. The default search algorithm is actually Exhaustive.
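The guidance above can be turned into a small selection rule. The following is a minimal sketch, not this crate's API: ConvAlgoSearch is a local mirror of the enum so the snippet compiles on its own, and Constraints, pick_search and option_value are hypothetical names introduced for illustration. The strings returned by option_value are the values ONNX Runtime documents for the CUDA EP's cudnn_conv_algo_search provider option; how this crate forwards them is not shown here.

```rust
/// Local mirror of the documented enum, so the sketch compiles without the crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConvAlgoSearch {
    Exhaustive,
    Heuristic,
    Default,
}

/// Hypothetical description of what the caller cares about.
pub struct Constraints {
    /// Available GPU memory is tight.
    pub memory_tight: bool,
    /// A slow first execution of the graph is acceptable.
    pub slow_first_run_ok: bool,
}

/// Hypothetical helper applying the guidance from the variant docs:
/// Default when memory is tight, Exhaustive when a slow first run is
/// acceptable, Heuristic otherwise.
pub fn pick_search(c: &Constraints) -> ConvAlgoSearch {
    if c.memory_tight {
        ConvAlgoSearch::Default
    } else if c.slow_first_run_ok {
        ConvAlgoSearch::Exhaustive
    } else {
        ConvAlgoSearch::Heuristic
    }
}

/// Value for ONNX Runtime's `cudnn_conv_algo_search` CUDA provider option.
pub fn option_value(search: ConvAlgoSearch) -> &'static str {
    match search {
        ConvAlgoSearch::Exhaustive => "EXHAUSTIVE",
        ConvAlgoSearch::Heuristic => "HEURISTIC",
        ConvAlgoSearch::Default => "DEFAULT",
    }
}

fn main() {
    let constraints = Constraints { memory_tight: false, slow_first_run_ok: false };
    let search = pick_search(&constraints);
    println!("{:?} -> cudnn_conv_algo_search={}", search, option_value(search));
}
```

The ordering of the checks is a design choice in this sketch: memory pressure is treated as the overriding concern because it is the one case the documentation names where Default is preferable.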
Trait Implementations§
impl Clone for CUDAExecutionProviderCuDNNConvAlgoSearch
fn clone(&self) -> CUDAExecutionProviderCuDNNConvAlgoSearch
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
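As a brief illustration of the two Clone methods listed above, here is a sketch using a local stand-in enum (ConvAlgoSearch is a hypothetical mirror, declared here only so the snippet is self-contained). It shows clone producing a new value and clone_from overwriting an existing one.

```rust
/// Local stand-in for the documented enum; only `Clone` behaviour is of interest here.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ConvAlgoSearch {
    Exhaustive,
    Heuristic,
    Default,
}

fn main() {
    let preferred = ConvAlgoSearch::Heuristic;

    // `clone` returns a new value equal to the original.
    let copy = preferred.clone();
    assert_eq!(copy, preferred);

    // `clone_from` overwrites an existing value in place.
    let mut current = ConvAlgoSearch::Default;
    current.clone_from(&preferred);
    assert_eq!(current, ConvAlgoSearch::Heuristic);

    println!("copy = {:?}, current = {:?}", copy, current);
}
```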