iai_callgrind

Struct OutputFormat

pub struct OutputFormat(/* private fields */);
Available on crate feature default only.

Configure the default format of Iai-Callgrind's terminal output

This configuration applies only to the default output format (--output-format=default) and not to any of the JSON output formats such as --output-format=json.

§Examples

For example, configure the truncation length of the description to 200 for all library benchmarks in the same file with OutputFormat::truncate_description:

use iai_callgrind::{main, LibraryBenchmarkConfig, OutputFormat};
main!(
    config = LibraryBenchmarkConfig::default()
        .output_format(OutputFormat::default()
            .truncate_description(Some(200))
        );
    library_benchmark_groups = some_group
);

Implementations§

impl OutputFormat

pub fn truncate_description(&mut self, value: Option<usize>) -> &mut Self

Adjust, enable or disable the truncation of the description in the Iai-Callgrind output

The default is to truncate the description to 50 ASCII characters. A None value disables the truncation entirely, and a Some value truncates the description to the given number of characters, excluding the ellipsis.

To clarify which part of the output is meant by DESCRIPTION:
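The truncation rule can be pictured with a small standalone sketch. The helper `truncate` below is hypothetical and is not Iai-Callgrind's implementation; the three-dot ASCII ellipsis is also an assumption. It only illustrates that the limit counts the characters kept and that the ellipsis is appended on top of it.

```rust
/// Hypothetical helper (not part of iai_callgrind) that mimics the
/// documented rule: `None` keeps the description as is, `Some(n)` keeps
/// at most `n` characters and appends an ellipsis when something was cut.
fn truncate(description: &str, limit: Option<usize>) -> String {
    match limit {
        // No limit: the description is never shortened.
        None => description.to_owned(),
        // Short enough already: nothing to cut, no ellipsis.
        Some(max) if description.chars().count() <= max => description.to_owned(),
        // Too long: keep `max` characters; the ellipsis is not counted.
        Some(max) => {
            let mut kept: String = description.chars().take(max).collect();
            kept.push_str("..."); // assumed ellipsis form
            kept
        }
    }
}

fn main() {
    let description = "a rather long benchmark description";
    println!("{}", truncate(description, Some(8))); // prints "a rather..."
    println!("{}", truncate(description, None));
}
```

With this reading, Some(200) shortens only descriptions longer than 200 characters, and None never shortens anything.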

benchmark_file::group_name::function_name id:DESCRIPTION
  Instructions:              352135|352135          (No change)
  L1 Hits:                   470117|470117          (No change)
  L2 Hits:                      748|748             (No change)
  RAM Hits:                    4112|4112            (No change)
  Total read+write:          474977|474977          (No change)
  Estimated Cycles:          617777|617777          (No change)
§Examples

For example, specifying this option with a None value in the main! macro disables the truncation of the description for all benchmarks.

use iai_callgrind::{main, LibraryBenchmarkConfig, OutputFormat};
main!(
    config = LibraryBenchmarkConfig::default()
        .output_format(OutputFormat::default()
            .truncate_description(None)
        );
    library_benchmark_groups = some_group
);
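The setters on this page take &mut self and return &mut Self. A minimal sketch with a stand-in type shows the two ways such setters are typically used. `Format` and `snapshot` are hypothetical names for illustration, not part of iai_callgrind; only the method signatures and the default of 50 are taken from this page.

```rust
// Hypothetical stand-in that mirrors the setter signatures on this page.
#[derive(Debug, Clone, PartialEq)]
struct Format {
    truncate_description: Option<usize>,
    show_grid: bool,
}

impl Default for Format {
    fn default() -> Self {
        // The documented default truncation length is 50.
        Format { truncate_description: Some(50), show_grid: false }
    }
}

impl Format {
    fn truncate_description(&mut self, value: Option<usize>) -> &mut Self {
        self.truncate_description = value;
        self
    }

    fn show_grid(&mut self, value: bool) -> &mut Self {
        self.show_grid = value;
        self
    }
}

// Hypothetical consumer that copies the settings out of a reference.
fn snapshot(format: &Format) -> Format {
    format.clone()
}

fn main() {
    // Chained inside one expression: the temporary created by `default()`
    // lives until the end of the statement, long enough to be consumed.
    let chained = snapshot(Format::default().truncate_description(None).show_grid(true));

    // On a named binding: the value outlives the statement and can be reused.
    let mut named = Format::default();
    named.truncate_description(None).show_grid(true);

    assert_eq!(chained, named);
    println!("{named:?}");
}
```

Binding the result of a chain that starts at a temporary directly to a variable yields a reference whose target is dropped at the end of the statement, so a named binding is the safer choice when the configuration is reused.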

pub fn show_intermediate(&mut self, value: bool) -> &mut Self

Show intermediate metrics from parts, subprocesses, threads, … (Default: false)

In callgrind, threads are treated as separate units (similar to subprocesses) and their metrics are dumped into separate files. Other valgrind tools usually separate the output files only by subprocess. To show the metrics of the intermediate fragments as well, and not just the total over all of them, set the value of this method to true.

Temporarily setting show_intermediate to true can help to find misconfigurations in multi-thread/multi-process benchmarks.

§Examples

In contrast to plain valgrind/callgrind, Iai-Callgrind uses --trace-children=yes, --separate-threads=yes and --fair-sched=try by default, so the following example does not need to specify --separate-threads to track the metrics of the spawned thread. It does need an additional toggle, however, or else the metrics of the thread are all zero. The example also sets EntryPoint to None to disable the default entry point (toggle), which is the benchmark function. With this setup, only the metrics of my_lib::heavy_calculation in the spawned thread are collected and nothing else.

use iai_callgrind::{
    main, LibraryBenchmarkConfig, OutputFormat, EntryPoint, library_benchmark,
    library_benchmark_group
};

#[library_benchmark(
    config = LibraryBenchmarkConfig::default()
        .entry_point(EntryPoint::None)
        .callgrind_args(["--toggle-collect=my_lib::heavy_calculation"])
        .output_format(OutputFormat::default().show_intermediate(true))
)]
fn bench_thread() -> u64 {
    let handle = std::thread::spawn(|| my_lib::heavy_calculation());
    handle.join().unwrap()
}

library_benchmark_group!(name = some_group; benchmarks = bench_thread);
main!(library_benchmark_groups = some_group);

Running the above benchmark for the first time prints something like the following (the exact metric counts are made up for demonstration purposes):

my_benchmark::some_group::bench_thread
  ## pid: 633247 part: 1 thread: 1   |N/A
  Command:            target/release/deps/my_benchmark-08fe8356975cd1af
  Instructions:                     0|N/A             (*********)
  L1 Hits:                          0|N/A             (*********)
  L2 Hits:                          0|N/A             (*********)
  RAM Hits:                         0|N/A             (*********)
  Total read+write:                 0|N/A             (*********)
  Estimated Cycles:                 0|N/A             (*********)
  ## pid: 633247 part: 1 thread: 2   |N/A
  Command:            target/release/deps/my_benchmark-08fe8356975cd1af
  Instructions:                  3905|N/A             (*********)
  L1 Hits:                       4992|N/A             (*********)
  L2 Hits:                          0|N/A             (*********)
  RAM Hits:                       464|N/A             (*********)
  Total read+write:              5456|N/A             (*********)
  Estimated Cycles:             21232|N/A             (*********)
  ## Total
  Instructions:                  3905|N/A             (*********)
  L1 Hits:                       4992|N/A             (*********)
  L2 Hits:                          0|N/A             (*********)
  RAM Hits:                       464|N/A             (*********)
  Total read+write:              5456|N/A             (*********)
  Estimated Cycles:             21232|N/A             (*********)

With show_intermediate set to false (the default), only the total is shown:

my_benchmark::some_group::bench_thread
  Instructions:                  3905|N/A             (*********)
  L1 Hits:                       4992|N/A             (*********)
  L2 Hits:                          0|N/A             (*********)
  RAM Hits:                       464|N/A             (*********)
  Total read+write:              5456|N/A             (*********)
  Estimated Cycles:             21232|N/A             (*********)
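The relation between the fragments and the ## Total block can be checked by hand. The sketch below is a standalone illustration, not Iai-Callgrind code: `Metrics` and `total` are hypothetical names, and treating "Total read+write" as the sum of the three hit counters is an assumption that happens to agree with the sample numbers above.

```rust
// Hypothetical container for a few of the metrics shown in the output.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct Metrics {
    instructions: u64,
    l1_hits: u64,
    l2_hits: u64,
    ram_hits: u64,
}

impl Metrics {
    // Assumption: "Total read+write" is the sum of all hit counters.
    fn total_read_write(&self) -> u64 {
        self.l1_hits + self.l2_hits + self.ram_hits
    }
}

// Sum the metrics of all fragments (parts, threads, subprocesses).
fn total(fragments: &[Metrics]) -> Metrics {
    fragments.iter().fold(Metrics::default(), |acc, m| Metrics {
        instructions: acc.instructions + m.instructions,
        l1_hits: acc.l1_hits + m.l1_hits,
        l2_hits: acc.l2_hits + m.l2_hits,
        ram_hits: acc.ram_hits + m.ram_hits,
    })
}

fn main() {
    // Sample values from the output above: thread 1 collected nothing,
    // thread 2 ran the toggled function.
    let thread_1 = Metrics::default();
    let thread_2 = Metrics { instructions: 3905, l1_hits: 4992, l2_hits: 0, ram_hits: 464 };

    let sum = total(&[thread_1, thread_2]);
    println!("{sum:?}, read+write: {}", sum.total_read_write());
}
```

Because thread 1 contributes only zeros here, the total equals the metrics of thread 2, which is why the non-intermediate output shows the same numbers.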

pub fn show_grid(&mut self, value: bool) -> &mut Self

Show an ASCII grid in the benchmark terminal output

This option adds guiding lines, which can make the benchmark output easier to read when running multiple tools with multiple threads/subprocesses.

§Examples
use iai_callgrind::OutputFormat;

// show_grid takes &mut self and returns &mut Self, so bind the value first
let mut output_format = OutputFormat::default();
output_format.show_grid(true);

Below is the output of an Iai-Callgrind run with DHAT as an additional tool, benchmarking a function that executes a subprocess which itself starts multiple threads. For this run, OutputFormat::show_intermediate was also active to show the threads and subprocesses.

test_lib_bench_threads::bench_group::bench_thread_in_subprocess three:3
|======== CALLGRIND ===================================================================
|-## pid: 3186352 part: 1 thread: 1       |pid: 2721318 part: 1 thread: 1
| Command:            target/release/deps/test_lib_bench_threads-b0b85adec9a45de1
| Instructions:                       4697|4697                 (No change)
| L1 Hits:                            6420|6420                 (No change)
| L2 Hits:                              17|17                   (No change)
| RAM Hits:                            202|202                  (No change)
| Total read+write:                   6639|6639                 (No change)
| Estimated Cycles:                  13575|13575                (No change)
|-## pid: 3186468 part: 1 thread: 1       |pid: 2721319 part: 1 thread: 1
| Command:            target/release/thread 3
| Instructions:                      35452|35452                (No change)
| L1 Hits:                           77367|77367                (No change)
| L2 Hits:                             610|610                  (No change)
| RAM Hits:                            784|784                  (No change)
| Total read+write:                  78761|78761                (No change)
| Estimated Cycles:                 107857|107857               (No change)
|-## pid: 3186468 part: 1 thread: 2       |pid: 2721319 part: 1 thread: 2
| Command:            target/release/thread 3
| Instructions:                    2460507|2460507              (No change)
| L1 Hits:                         2534939|2534939              (No change)
| L2 Hits:                              17|17                   (No change)
| RAM Hits:                            186|186                  (No change)
| Total read+write:                2535142|2535142              (No change)
| Estimated Cycles:                2541534|2541534              (No change)
|-## pid: 3186468 part: 1 thread: 3       |pid: 2721319 part: 1 thread: 3
| Command:            target/release/thread 3
| Instructions:                    3650414|3650414              (No change)
| L1 Hits:                         3724275|3724275              (No change)
| L2 Hits:                              21|21                   (No change)
| RAM Hits:                            130|130                  (No change)
| Total read+write:                3724426|3724426              (No change)
| Estimated Cycles:                3728930|3728930              (No change)
|-## pid: 3186468 part: 1 thread: 4       |pid: 2721319 part: 1 thread: 4
| Command:            target/release/thread 3
| Instructions:                    4349846|4349846              (No change)
| L1 Hits:                         4423438|4423438              (No change)
| L2 Hits:                              24|24                   (No change)
| RAM Hits:                            125|125                  (No change)
| Total read+write:                4423587|4423587              (No change)
| Estimated Cycles:                4427933|4427933              (No change)
|-## Total
| Instructions:                   10500916|10500916             (No change)
| L1 Hits:                        10766439|10766439             (No change)
| L2 Hits:                             689|689                  (No change)
| RAM Hits:                           1427|1427                 (No change)
| Total read+write:               10768555|10768555             (No change)
| Estimated Cycles:               10819829|10819829             (No change)
|======== DHAT ========================================================================
|-## pid: 3186472 ppid: 3185288           |pid: 2721323 ppid: 2720196
| Command:            target/release/deps/test_lib_bench_threads-b0b85adec9a45de1
| Total bytes:                        2774|2774                 (No change)
| Total blocks:                         24|24                   (No change)
| At t-gmax bytes:                    1736|1736                 (No change)
| At t-gmax blocks:                      3|3                    (No change)
| At t-end bytes:                        0|0                    (No change)
| At t-end blocks:                       0|0                    (No change)
| Reads bytes:                       21054|21054                (No change)
| Writes bytes:                      13165|13165                (No change)
|-## pid: 3186473 ppid: 3186472           |pid: 2721324 ppid: 2721323
| Command:            target/release/thread 3
| Total bytes:                      156158|156158               (No change)
| Total blocks:                         73|73                   (No change)
| At t-gmax bytes:                   52225|52225                (No change)
| At t-gmax blocks:                     19|19                   (No change)
| At t-end bytes:                        0|0                    (No change)
| At t-end blocks:                       0|0                    (No change)
| Reads bytes:                      118403|118403               (No change)
| Writes bytes:                     135926|135926               (No change)
|-## Total
| Total bytes:                      158932|158932               (No change)
| Total blocks:                         97|97                   (No change)
| At t-gmax bytes:                   53961|53961                (No change)
| At t-gmax blocks:                     22|22                   (No change)
| At t-end bytes:                        0|0                    (No change)
| At t-end blocks:                       0|0                    (No change)
| Reads bytes:                      139457|139457               (No change)
| Writes bytes:                     149091|149091               (No change)
|-Comparison with bench_find_primes_multi_thread three:3
| Instructions:                   10494117|10500916             (-0.06475%) [-1.00065x]
| L1 Hits:                        10757259|10766439             (-0.08526%) [-1.00085x]
| L2 Hits:                             601|689                  (-12.7721%) [-1.14642x]
| RAM Hits:                           1189|1427                 (-16.6783%) [-1.20017x]
| Total read+write:               10759049|10768555             (-0.08828%) [-1.00088x]
| Estimated Cycles:               10801879|10819829             (-0.16590%) [-1.00166x]
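The difference columns in the comparison block above can be reproduced with ordinary arithmetic. The sketch below is a standalone illustration with hypothetical function names, not Iai-Callgrind code. It assumes the left number is the new value and the right number the old one, and that the factor is reported as a negated old/new ratio when the value decreased. Both assumptions agree with the sample rows; the branch for increases is not shown in the sample and is a guess.

```rust
// Relative change of `new` against `old` in percent.
// Assumes `old` is non-zero.
fn percent_change(new: u64, old: u64) -> f64 {
    (new as f64 - old as f64) / old as f64 * 100.0
}

// Signed factor: positive new/old when the value grew (assumed),
// negated old/new when it shrank (matches the sample rows).
// Assumes both values are non-zero.
fn factor(new: u64, old: u64) -> f64 {
    if new >= old {
        new as f64 / old as f64
    } else {
        -(old as f64 / new as f64)
    }
}

fn main() {
    // "L2 Hits" row of the comparison above: 601 against 689.
    let (new, old) = (601, 689);
    println!("({:+.4}%) [{:+.5}x]", percent_change(new, old), factor(new, old));
}
```

For the L2 Hits row this gives roughly -12.7721% and -1.14642x, the values printed in the sample.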

Trait Implementations§

impl AsRef<InternalOutputFormat> for OutputFormat

fn as_ref(&self) -> &InternalOutputFormat

Converts this type into a shared reference of the (usually inferred) input type.

impl Clone for OutputFormat

fn clone(&self) -> OutputFormat

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for OutputFormat

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for OutputFormat

fn default() -> OutputFormat

Returns the “default value” for a type.

impl From<&OutputFormat> for InternalOutputFormat

fn from(value: &OutputFormat) -> Self

Converts to this type from the input type.

impl From<&mut OutputFormat> for InternalOutputFormat

fn from(value: &mut OutputFormat) -> Self

Converts to this type from the input type.

impl From<OutputFormat> for InternalOutputFormat

fn from(value: OutputFormat) -> Self

Converts to this type from the input type.
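
The three From implementations above cover an owned value, a shared reference and a mutable reference. A minimal sketch with stand-in types shows why that set is convenient for setters that return &mut Self. `Public`, `Internal` and `accept` are hypothetical names, and the assumption that a configuration method accepts any `T: Into<InternalOutputFormat>` is not confirmed by this page.

```rust
// Hypothetical stand-ins for OutputFormat and InternalOutputFormat.
#[derive(Debug, Clone, Default, PartialEq)]
struct Internal {
    show_grid: bool,
}

#[derive(Debug, Clone, Default)]
struct Public(Internal);

impl Public {
    fn show_grid(&mut self, value: bool) -> &mut Self {
        self.0.show_grid = value;
        self
    }
}

// The same three conversions as listed above.
impl From<Public> for Internal {
    fn from(value: Public) -> Self {
        value.0
    }
}

impl From<&Public> for Internal {
    fn from(value: &Public) -> Self {
        value.0.clone()
    }
}

impl From<&mut Public> for Internal {
    fn from(value: &mut Public) -> Self {
        value.0.clone()
    }
}

// Assumed shape of a consumer: generic over anything convertible.
fn accept<T: Into<Internal>>(value: T) -> Internal {
    value.into()
}

fn main() {
    let from_owned = accept(Public::default());

    let mut public = Public::default();
    // A setter chain ends in `&mut Public`, which converts directly.
    let from_mut = accept(public.show_grid(true));
    let from_ref = accept(&public);

    assert_eq!(from_mut, from_ref);
    assert_ne!(from_owned, from_mut);
}
```

Under that assumption, the result of a setter chain can be passed on without an explicit clone or dereference.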

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dst: *mut T)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.