iai-callgrind 0.14.0

High-precision and consistent benchmarking framework/harness for Rust

Iai-Callgrind is a benchmarking framework/harness which uses Valgrind's Callgrind and other Valgrind tools such as DHAT and Massif to provide extremely accurate and consistent measurements of Rust code, making it well suited to running in CI environments. Iai-Callgrind is integrated in Bencher.

Iai-Callgrind is:

  • Precise: High-precision measurements of Instruction counts and many other metrics allow you to reliably detect very small optimizations and regressions of your code.
  • Consistent: Iai-Callgrind can take accurate measurements even in virtualized CI environments and makes them comparable between different systems, largely negating the noise of the environment.
  • Fast: Each benchmark is only run once, which is usually much faster than benchmarks which measure execution and wall-clock time. Benchmarks measuring the wall-clock time have to be run many times to increase their accuracy, detect outliers, filter out noise, etc.
  • Visualizable: Iai-Callgrind generates a Callgrind (DHAT, ...) profile of the benchmarked code and can be configured to create flamegraph-like charts from Callgrind metrics. In general, all Valgrind-compatible tools for analyzing the results in detail, such as callgrind_annotate, kcachegrind or dh_view.html, are fully supported.
  • Easy: The API for setting up benchmarks is easy to use and allows you to quickly create concise and clear benchmarks. Focus more on profiling and your code than on the framework.

See the Guide and the API documentation at docs.rs for all the details.

Design philosophy and goals

Iai-Callgrind benchmarks are designed to be runnable with cargo bench. The benchmark files are expanded to a benchmarking harness which replaces the native benchmark harness of Rust. Iai-Callgrind is a profiling framework that can quickly and reliably detect performance regressions and optimizations even in noisy environments, with a precision that is impossible to achieve with wall-clock time based benchmarks. At the same time, we want to abstract away the complicated parts and repetitive tasks and provide an easy-to-use and intuitive API. Iai-Callgrind tries to stay out of your way and applies sensible default settings so you can focus more on profiling and your code!

How far are we?

Iai-Callgrind is in a mature development stage and is already in use. Nevertheless, you may experience big changes between minor versions. With the release of 0.14.0, almost all Callgrind capabilities are implemented, including benchmarking of multi-threaded and multi-process applications. Profiling of heap usage with DHAT or Massif is possible, but can be further improved. Creating Callgrind flamegraphs for multi-process/multi-threaded benchmarks is considered experimental. Please read our Vision to learn more about the ideas and the direction the future path might take.

When not to use Iai-Callgrind

Although Iai-Callgrind is useful in many projects, there are cases where Iai-Callgrind is not a good fit.

  • If you need wall-clock times, Iai-Callgrind cannot help you much. The estimation of CPU cycles merely correlates with wall-clock times but is not a replacement for them. The cycle estimation is primarily designed as a relative metric for comparison.
  • Iai-Callgrind cannot be run on Windows or on other platforms not supported by Valgrind.
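To illustrate why the cycle estimation is a relative metric rather than a time measurement, here is a rough sketch of how such an estimate can be derived from cache-simulation counters. It assumes the commonly cited weighting of 1 cycle per L1 hit, 5 per last-level cache hit and 35 per RAM access. The struct, the function names and the sample numbers are hypothetical and not part of the Iai-Callgrind API.

```rust
/// Raw counters from a cache simulation (hypothetical struct for illustration).
/// Field names follow the Callgrind event names: instruction reads, data
/// reads/writes, and the corresponding L1 and last-level (LL) cache misses.
#[derive(Debug, Clone, Copy)]
struct CacheCounters {
    ir: u64,
    dr: u64,
    dw: u64,
    i1mr: u64,
    d1mr: u64,
    d1mw: u64,
    ilmr: u64,
    dlmr: u64,
    dlmw: u64,
}

/// Estimate cycles with the assumed weights: 1 per L1 hit, 5 per LL hit,
/// 35 per RAM access.
fn estimated_cycles(c: &CacheCounters) -> u64 {
    let total_accesses = c.ir + c.dr + c.dw;
    let l1_misses = c.i1mr + c.d1mr + c.d1mw;
    let ll_misses = c.ilmr + c.dlmr + c.dlmw;

    let l1_hits = total_accesses - l1_misses; // served by the L1 cache
    let ll_hits = l1_misses - ll_misses; // missed L1, served by the LL cache
    let ram_hits = ll_misses; // missed both caches

    l1_hits + 5 * ll_hits + 35 * ram_hits
}

/// Relative change of `new` compared to `old`, in percent.
fn relative_change_percent(old: u64, new: u64) -> f64 {
    (new as f64 - old as f64) / old as f64 * 100.0
}

fn main() {
    // Made-up counters for a baseline run and a run after an optimization
    // that removed 100 instruction reads.
    let baseline = CacheCounters {
        ir: 1000, dr: 300, dw: 200,
        i1mr: 10, d1mr: 20, d1mw: 10,
        ilmr: 2, dlmr: 5, dlmw: 3,
    };
    let candidate = CacheCounters { ir: 900, ..baseline };

    let old = estimated_cycles(&baseline);
    let new = estimated_cycles(&candidate);
    println!("baseline: {old}, candidate: {new}");
    println!("change: {:.2}%", relative_change_percent(old, new));
}
```

The absolute numbers say nothing about how long the code runs on real hardware. Only the difference between two runs of the same benchmark is meaningful.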

Quickstart

Missing the old README? To get started, read the Guide.

The Guide only covers versions 0.12.3 and upwards. For older versions, check out the README of this repo at a specific tagged version, for example https://github.com/iai-callgrind/iai-callgrind/tree/v0.12.2, or browse the tags in the GitHub UI.

Here's just a small introductory example, assuming you have everything installed and a benchmark with the following content in benches/library_benchmark.rs ready:

use iai_callgrind::{main, library_benchmark_group, library_benchmark};
use std::hint::black_box;

fn fibonacci(n: u64) -> u64 {
    match n {
        0 => 1,
        1 => 1,
        n => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

#[library_benchmark]
#[bench::short(10)]
#[bench::long(30)]
fn bench_fibonacci(value: u64) -> u64 {
    black_box(fibonacci(value))
}

library_benchmark_group!(name = bench_fibonacci_group; benchmarks = bench_fibonacci);
main!(library_benchmark_groups = bench_fibonacci_group);

Now run

cargo bench
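
For cargo bench to pick up the benchmark file, the crate's Cargo.toml usually needs entries along the following lines. This is a minimal sketch, assuming the file name benches/library_benchmark.rs from above and the 0.14.0 release. The Guide remains the authoritative source for the installation steps, which also cover Valgrind itself and the matching iai-callgrind-runner binary.

```toml
[dev-dependencies]
# Assumed version; keep it in sync with the installed iai-callgrind-runner.
iai-callgrind = "0.14.0"

[[bench]]
# Must match the file name benches/library_benchmark.rs without the extension.
name = "library_benchmark"
# Disable Rust's native benchmark harness so Iai-Callgrind's main! takes over.
harness = false
```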

Contributing

Thanks for helping to improve this project! A guideline about contributing to Iai-Callgrind can be found in the CONTRIBUTING.md file.

Do you have an idea for a new feature, are you missing some functionality, or have you found a bug?

Please don't hesitate to open an issue.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you shall be dual licensed as in License, without any additional terms or conditions.

Links

Related Projects

  • Iai: The repository from which Iai-Callgrind is forked. Iai uses Cachegrind instead of Callgrind under the hood.
  • Criterion-rs: A statistics-driven benchmarking library for Rust. Wall-clock time based benchmarks.
  • hyperfine: A command-line benchmarking tool. Wall-clock time based benchmarks.
  • divan: Statistically-comfy benchmarking library. Wall-clock time based benchmarks.
  • dhat-rs: Provides heap profiling and ad hoc profiling capabilities to Rust programs, similar to those provided by DHAT.
  • cargo-valgrind: A cargo subcommand, that runs valgrind and collects its output in a helpful manner.
  • crabgrind: Valgrind Client Request interface for Rust programs. A small library that enables Rust programs to tap into Valgrind's tools and virtualized environment.

Credits

Iai-Callgrind is forked from https://github.com/bheisler/iai and was originally written by Brook Heisler (@bheisler).

Iai-Callgrind wouldn't be possible without Valgrind.

License

Like Iai, Iai-Callgrind is dual-licensed under the Apache 2.0 license and the MIT license, at your option.

According to Valgrind's documentation:

The Valgrind headers, unlike most of the rest of the code, are under a BSD-style license, so you may include them without worrying about license incompatibility.

We have included the original license where we made use of the original header files.