Struct nvml_wrapper::Nvml
pub struct Nvml { /* private fields */ }
The main struct that this library revolves around.
According to NVIDIA’s documentation, “It is the user’s responsibility to call nvmlInit()
before calling any other methods, and nvmlShutdown()
once NVML is no longer being used.”
This struct is used to enforce those rules.
Also according to NVIDIA’s documentation, “NVML is thread-safe so it is safe to make
simultaneous NVML calls from multiple threads.” In the Rust world, this translates to NVML being Send + Sync. You can .clone() an Arc-wrapped NVML and enjoy using it on any thread.
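The sharing pattern described above can be sketched without a GPU. In this minimal sketch, `SharedHandle` and `query_from_threads` are hypothetical stand-ins written for illustration (they are not part of this crate); any `Send + Sync` value, which is what NVML is said to be here, behaves the same way once wrapped in an `Arc`.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for `Nvml`; it is `Send + Sync` because its only
/// field is a plain integer.
struct SharedHandle {
    device_count: u32,
}

impl SharedHandle {
    /// Stand-in for a read-only query made through a shared reference.
    fn device_count(&self) -> u32 {
        self.device_count
    }
}

/// Spawns `workers` threads that each query the same shared handle and
/// returns the values they observed, in spawn order.
fn query_from_threads(handle: Arc<SharedHandle>, workers: usize) -> Vec<u32> {
    let joins: Vec<_> = (0..workers)
        .map(|_| {
            // Cloning the `Arc` only bumps a reference count; the handle
            // itself is not copied or re-initialized.
            let handle = Arc::clone(&handle);
            thread::spawn(move || handle.device_count())
        })
        .collect();

    joins
        .into_iter()
        .map(|j| j.join().expect("worker thread panicked"))
        .collect()
}
```

With the real struct, the same shape would presumably apply: initialize once, wrap the result in an `Arc`, and hand a clone to each thread.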
NOTE: If you care about possible errors returned from nvmlShutdown(), use the .shutdown() method on this struct. The Drop implementation ignores errors.
When reading documentation on this struct and its members, remember that a lot of it, especially with regard to errors returned, is copied from NVIDIA’s docs. While they can be found online, the hosted docs are sometimes outdated and may not accurately reflect the version of NVML that this library is written for; beware. You should ideally read the doc comments on an up-to-date NVML API header. Such a header can be downloaded as part of the CUDA toolkit.
Implementations§
impl Nvml
pub fn init() -> Result<Self, NvmlError>
Handles NVML initialization and must be called before doing anything else.
While it is possible to initialize NVML multiple times (NVIDIA’s docs state that reference counting is used internally), you should strive to initialize NVML once at the start of your program’s execution; the constructors handle dynamically loading function symbols from the NVML lib and are therefore somewhat expensive.
Note that this will initialize NVML but not any GPUs. This means that NVML can communicate with a GPU even when other GPUs in a system are bad or unstable.
By default, initialization looks for “libnvidia-ml.so” on Linux and “nvml.dll” on Windows. These default names should work for default installs on those platforms; if further specification is required, use Nvml::builder.
§Errors
- DriverNotLoaded, if the NVIDIA driver is not running
- NoPermission, if NVML does not have permission to talk to the driver
- Unknown, on any unexpected error
pub fn init_with_flags(flags: InitFlags) -> Result<Self, NvmlError>
An initialization function that allows you to pass flags to control certain behaviors.
This is the same as init() except for the addition of flags.
§Errors
- DriverNotLoaded, if the NVIDIA driver is not running
- NoPermission, if NVML does not have permission to talk to the driver
- Unknown, on any unexpected error
§Examples
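use nvml_wrapper::bitmasks::InitFlags;
use nvml_wrapper::error::NvmlError;
use nvml_wrapper::Nvml;

fn main() -> Result<(), NvmlError> {
    // Don't fail if the system doesn't have any NVIDIA GPUs.
    // Also, don't attach any GPUs during initialization.
    let _nvml = Nvml::init_with_flags(InitFlags::NO_GPUS | InitFlags::NO_ATTACH)?;
    Ok(())
}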
pub fn builder<'a>() -> NvmlBuilder<'a>
Create an NvmlBuilder for further flexibility in how NVML is initialized.
pub fn shutdown(self) -> Result<(), NvmlError>
Use this to shut down NVML and release allocated resources if you care about handling potential errors (the Drop implementation ignores errors!).
§Errors
- Uninitialized, if the library has not been successfully initialized
- Unknown, on any unexpected error
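The split between `.shutdown()` and `Drop` follows a common Rust pattern: a consuming method can return the teardown error, while `drop` has no return value and can only discard it. The following is a minimal sketch of that pattern, not this crate's actual implementation; `Session`, `ShutdownError`, and `raw_shutdown` are hypothetical names standing in for the struct, its error type, and the underlying C call.

```rust
/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
struct ShutdownError;

/// Hypothetical stand-in for a handle that must be torn down exactly once.
struct Session {
    fail_on_shutdown: bool,
}

impl Session {
    /// Stand-in for the underlying teardown call, which may fail.
    fn raw_shutdown(&self) -> Result<(), ShutdownError> {
        if self.fail_on_shutdown {
            Err(ShutdownError)
        } else {
            Ok(())
        }
    }

    /// Explicit teardown: consumes the handle and reports the error.
    fn shutdown(self) -> Result<(), ShutdownError> {
        let result = self.raw_shutdown();
        // Skip `Drop` so the teardown does not run a second time.
        std::mem::forget(self);
        result
    }
}

impl Drop for Session {
    fn drop(&mut self) {
        // `drop` cannot return a value, so any error is discarded here.
        let _ = self.raw_shutdown();
    }
}
```

Calling `shutdown()` is therefore only necessary when the caller wants to log or react to a failed teardown; letting the value go out of scope performs the same teardown silently.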
pub fn device_count(&self) -> Result<u32, NvmlError>
Get the number of compute devices in the system (compute device == one GPU).
Note that this count can include devices you do not have permission to access.
§Errors
- Uninitialized, if the library has not been successfully initialized
- Unknown, on any unexpected error
pub fn sys_driver_version(&self) -> Result<String, NvmlError>
Gets the version of the system’s graphics driver and returns it as an alphanumeric string.
§Errors
- Uninitialized, if the library has not been successfully initialized
- Utf8Error, if the string obtained from the C function is not valid UTF-8
pub fn sys_nvml_version(&self) -> Result<String, NvmlError>
Gets the version of the system’s NVML library and returns it as an alphanumeric string.
§Errors
- Utf8Error, if the string obtained from the C function is not valid UTF-8
pub fn sys_cuda_driver_version(&self) -> Result<i32, NvmlError>
Gets the version of the system’s CUDA driver.
Calls into the CUDA library (cuDriverGetVersion()).
You can use cuda_driver_version_major and cuda_driver_version_minor to get the major and minor driver versions from this number.
§Errors
- FunctionNotFound, if cuDriverGetVersion() is not found in the shared library
- LibraryNotFound, if libcuda.so.1 or libcuda.dll cannot be found
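CUDA reports its version as a single integer of the form `1000 * major + 10 * minor` (for example, `11040` for CUDA 11.4). The sketch below shows that arithmetic in isolation. `split_cuda_version` is a hypothetical helper written for illustration, and it is an assumption that the crate's own major/minor helpers compute the same values.

```rust
/// Splits a CUDA version integer (`1000 * major + 10 * minor`) into
/// `(major, minor)`. Hypothetical helper, not part of this crate.
fn split_cuda_version(version: i32) -> (i32, i32) {
    let major = version / 1000;
    let minor = (version % 1000) / 10;
    (major, minor)
}
```

For instance, `split_cuda_version(12020)` yields `(12, 2)`.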
pub fn sys_process_name( &self, pid: u32, length: usize ) -> Result<String, NvmlError>
Gets the name of the process for the given process ID, cropped to the provided length.
§Errors
- Uninitialized, if the library has not been successfully initialized
- InvalidArg, if the length is 0 (if this is returned without length being 0, file an issue)
- NotFound, if the process does not exist
- NoPermission, if the user doesn’t have permission to perform the operation
- Utf8Error, if the string obtained from the C function is not valid UTF-8. NVIDIA’s docs say that the string encoding is ANSI, so this may very well happen.
- Unknown, on any unexpected error
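To see why `Utf8Error` is plausible here, consider a process name that contains a non-ASCII character stored in a single-byte ANSI code page. The sketch below uses only the standard library; `decode_process_name` is a hypothetical helper, the byte values are illustrative, and it is an assumption that the crate performs a comparable strict UTF-8 conversion internally.

```rust
use std::str::Utf8Error;

/// Interprets raw bytes from a C string as UTF-8, failing on any byte
/// sequence that is not valid UTF-8. Hypothetical helper for illustration.
fn decode_process_name(raw: &[u8]) -> Result<String, Utf8Error> {
    std::str::from_utf8(raw).map(str::to_owned)
}
```

A pure ASCII name such as `b"python3"` decodes cleanly, whereas the bytes `[b'c', b'a', b'f', 0xE9]` ("café" in Windows-1252) are rejected because `0xE9` on its own is an incomplete UTF-8 sequence.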
pub fn device_by_index(&self, index: u32) -> Result<Device<'_>, NvmlError>
Acquire the handle for a particular device based on its index (starts at 0).
Usage of this function causes NVML to initialize the target GPU. Additional GPUs may be initialized if the target GPU is an SLI slave.
You can determine valid indices by using .device_count(). This function doesn’t call that for you, but the actual C function to get the device handle will return an error in the case of an invalid index. This means that the InvalidArg error will be returned if you pass in an invalid index.
NVIDIA’s docs state that “The order in which NVML enumerates devices has no guarantees of consistency between reboots. For that reason it is recommended that devices be looked up by their PCI ids or UUID.” In this library, that translates into usage of .device_by_uuid() and .device_by_pci_bus_id().
The NVML index may not correlate with other APIs such as the CUDA device index.
§Errors
- Uninitialized, if the library has not been successfully initialized
- InvalidArg, if index is invalid
- InsufficientPower, if any attached devices have improperly attached external power cables
- NoPermission, if the user doesn’t have permission to talk to this device
- IrqIssue, if the NVIDIA kernel detected an interrupt issue with the attached GPUs
- GpuLost, if the target GPU has fallen off the bus or is otherwise inaccessible
- Unknown, on any unexpected error
pub fn device_by_pci_bus_id<S: AsRef<str>>( &self, pci_bus_id: S ) -> Result<Device<'_>, NvmlError>
Acquire the handle for a particular device based on its PCI bus ID.
Usage of this function causes NVML to initialize the target GPU. Additional GPUs may be initialized if the target GPU is an SLI slave.
The bus ID corresponds to the bus_id returned by Device.pci_info().
§Errors
- Uninitialized, if the library has not been successfully initialized
- InvalidArg, if pci_bus_id is invalid
- NotFound, if pci_bus_id does not match a valid device on the system
- InsufficientPower, if any attached devices have improperly attached external power cables
- NoPermission, if the user doesn’t have permission to talk to this device
- IrqIssue, if the NVIDIA kernel detected an interrupt issue with the attached GPUs
- GpuLost, if the target GPU has fallen off the bus or is otherwise inaccessible
- NulError, for which you can read the docs on std::ffi::NulError
- Unknown, on any unexpected error
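`NulError` comes from the standard library: it is returned when a Rust string containing an interior NUL byte is converted to a C string. A minimal sketch of that failure mode follows; `to_c_bus_id` is a hypothetical helper, and it is an assumption that the crate performs an equivalent `CString::new` conversion before calling into NVML.

```rust
use std::ffi::{CString, NulError};

/// Converts a bus ID to a NUL-terminated C string, failing if the input
/// already contains a NUL byte. Hypothetical helper for illustration.
fn to_c_bus_id(pci_bus_id: &str) -> Result<CString, NulError> {
    CString::new(pci_bus_id)
}
```

An ordinary ID such as `"0000:01:00.0"` converts without issue; an input with an embedded `\0` is rejected, and `NulError::nul_position()` reports where the offending byte was found.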
pub fn device_by_serial<S: AsRef<str>>(&self, board_serial: S) -> Result<Device<'_>, NvmlError>
Deprecated: use .device_by_uuid(), this errors on dual GPU boards
Not documenting this because it’s deprecated and does not seem to work anymore.
pub fn device_by_uuid<S: AsRef<str>>( &self, uuid: S ) -> Result<Device<'_>, NvmlError>
Acquire the handle for a particular device based on its globally unique immutable UUID.
Usage of this function causes NVML to initialize the target GPU. Additional GPUs may be initialized as the function called within searches for the target GPU.
§Errors
Uninitialized
, if the library has not been successfully initializedInvalidArg
, ifuuid
is invalidNotFound
, ifuuid
does not match a valid device on the systemInsufficientPower
, if any attached devices have improperly attached external power cablesIrqIssue
, if the NVIDIA kernel detected an interrupt issue with the attached GPUsGpuLost
, if the target GPU has fallen off the bus or is otherwise inaccessibleNulError
, for which you can read the docs onstd::ffi::NulError
Unknown
, on any unexpected error
NVIDIA doesn’t mention NoPermission
for this one. Strange!
sourcepub fn topology_common_ancestor(
&self,
device1: &Device<'_>,
device2: &Device<'_>
) -> Result<TopologyLevel, NvmlError>
pub fn topology_common_ancestor( &self, device1: &Device<'_>, device2: &Device<'_> ) -> Result<TopologyLevel, NvmlError>
Gets the common ancestor for two devices.
Note: this is the same as Device.topology_common_ancestor().
§Errors
- InvalidArg, if the device is invalid
- NotSupported, if this Device or the OS does not support this feature
- UnexpectedVariant, for which you can read the docs
- Unknown, on any unexpected error
§Platform Support
Only supports Linux.
pub fn unit_by_index(&self, index: u32) -> Result<Unit<'_>, NvmlError>
Acquire the handle for a particular Unit based on its index.
Valid indices are derived from the count returned by .unit_count(). For example, if unit_count is 2 the valid indices are 0 and 1, corresponding to UNIT 0 and UNIT 1.
Note that the order in which NVML enumerates units has no guarantees of consistency between reboots.
§Errors
- Uninitialized, if the library has not been successfully initialized
- InvalidArg, if index is invalid
- Unknown, on any unexpected error
§Device Support
For S-class products.
pub fn are_devices_on_same_board( &self, device1: &Device<'_>, device2: &Device<'_> ) -> Result<bool, NvmlError>
Checks if the passed-in Devices are on the same physical board.
Note: this is the same as Device.is_on_same_board_as().
§Errors
- Uninitialized, if the library has not been successfully initialized
- InvalidArg, if either Device is invalid
- NotSupported, if this check is not supported by this Device
- GpuLost, if this Device has fallen off the bus or is otherwise inaccessible
- Unknown, on any unexpected error
pub fn unit_count(&self) -> Result<u32, NvmlError>
pub fn create_event_set(&self) -> Result<EventSet<'_>, NvmlError>
pub fn discover_gpus(&self, pci_info: PciInfo) -> Result<(), NvmlError>
Request the OS and the NVIDIA kernel driver to rediscover a portion of the PCI subsystem in search of GPUs that were previously removed.
The portion of the PCI tree can be narrowed by specifying a domain, bus, and device in the passed-in pci_info. If all of these fields are zeroes, the entire PCI tree will be searched. Note that for long-running NVML processes, the enumeration of devices will change based on how many GPUs are discovered and where they are inserted in bus order.
All newly discovered GPUs will be initialized and have their ECC scrubbed, which may take several seconds per GPU. All device handles are no longer guaranteed to be valid post discovery. I am not sure if this means all device handles, literally, or if NVIDIA is referring to handles that had previously been obtained to devices that were then removed and have now been re-discovered.
Must be run as administrator.
§Errors
- Uninitialized, if the library has not been successfully initialized
- OperatingSystem, if the operating system is denying this feature
- NoPermission, if the calling process has insufficient permissions to perform this operation
- NulError, if an issue is encountered when trying to convert a Rust String into a CString
- Unknown, on any unexpected error
§Device Support
Supports Pascal and newer fully supported devices.
Some Kepler devices are also supported (that’s all NVIDIA says, no specifics).
§Platform Support
Only supports Linux.
pub fn excluded_device_count(&self) -> Result<u32, NvmlError>
pub fn excluded_device_info( &self, index: u32 ) -> Result<ExcludedDeviceInfo, NvmlError>
Trait Implementations§
impl Drop for Nvml
This Drop implementation ignores errors! Use the .shutdown() method on the Nvml struct if you care about handling them.
impl EventLoopProvider for Nvml
fn create_event_loop<'nvml>( &'nvml self, devices: Vec<&Device<'nvml>> ) -> Result<EventLoop<'_>, NvmlErrorWithSource>
Create an event loop that will register itself to receive events for the given Devices.
This function creates an event set and registers each device’s supported event types for it. The returned EventLoop struct then has methods that you can call to actually utilize it.
§Errors
- Uninitialized, if the library has not been successfully initialized
- GpuLost, if any of the given Devices have fallen off the bus or are otherwise inaccessible
- Unknown, on any unexpected error
§Platform Support
Only supports Linux.