Constant gmp_mpfr_sys::C::MPFR::MPFR_Basics

source ·
pub const MPFR_Basics: ();
Expand description

This constant is a place-holder for documentation; do not use it in code.


4 MPFR Basics


4.1 Headers and Libraries

All declarations needed to use MPFR are collected in the include file mpfr.h. It is designed to work with both C and C++ compilers. You should include that file in any program using the MPFR library:

#include <mpfr.h>

Note, however, that prototypes for MPFR functions with FILE * parameters are provided only if <stdio.h> is included too (before mpfr.h):

#include <stdio.h>
#include <mpfr.h>

Likewise <stdarg.h> (or <varargs.h>) is required for prototypes with va_list parameters, such as mpfr_vprintf.

And for any functions using intmax_t, you must include <stdint.h> or <inttypes.h> before mpfr.h, to allow mpfr.h to define prototypes for these functions. Moreover, under some platforms (in particular with C++ compilers), users may need to define MPFR_USE_INTMAX_T (and should do it for portability) before mpfr.h has been included; of course, it is possible to do that on the command line, e.g., with -DMPFR_USE_INTMAX_T.

Note: If mpfr.h and/or gmp.h (used by mpfr.h) are included several times (possibly from another header file), <stdio.h> and/or <stdarg.h> (or <varargs.h>) should be included before the first inclusion of mpfr.h or gmp.h. Alternatively, you can define MPFR_USE_FILE (for MPFR I/O functions) and/or MPFR_USE_VA_LIST (for MPFR functions with va_list parameters) anywhere before the last inclusion of mpfr.h. As a consequence, if your file is a public header that includes mpfr.h, you need to use the latter method.

When calling a MPFR macro, it is not allowed to have previously defined a macro with the same name as some keywords (currently do, while and sizeof).

You can avoid the use of MPFR macros encapsulating functions by defining the MPFR_USE_NO_MACRO macro before mpfr.h is included. In general this should not be necessary, but this can be useful when debugging user code: with some macros, the compiler may emit spurious warnings with some warning options, and macros can prevent some prototype checking.

All programs using MPFR must link against both libmpfr and libgmp libraries. On a typical Unix-like system this can be done with ‘-lmpfr -lgmp’ (in that order), for example:

gcc myprogram.c -lmpfr -lgmp

MPFR is built using Libtool and an application can use that to link if desired, see GNU Libtool.

If MPFR has been installed to a non-standard location, then it may be necessary to set up environment variables such as ‘C_INCLUDE_PATH’ and ‘LIBRARY_PATH’, or use ‘-I’ and ‘-L’ compiler options, in order to point to the right directories. For a shared library, it may also be necessary to set up some sort of run-time library path (e.g., ‘LD_LIBRARY_PATH’) on some systems. Please read the INSTALL file for additional information.

Alternatively, it is possible to use ‘pkg-config’ (a file ‘mpfr.pc’ is provided as of MPFR 4.0):

cc myprogram.c $(pkg-config --cflags --libs mpfr)

Note that the ‘MPFR_’ and ‘mpfr_’ prefixes are reserved for MPFR. As a general rule, in order to avoid clashes, software using MPFR (directly or indirectly) and system headers/libraries should not define macros and symbols using these prefixes.


4.2 Nomenclature and Types

A floating-point number, or float for short, is an object representing a radix-2 floating-point number consisting of a sign, an arbitrary-precision normalized significand (also called mantissa), and an exponent (an integer in some given range); these are called regular numbers. By convention, the radix point of the significand is just before the first digit (which is always 1 due to normalization), like in the C language, but unlike in IEEE 754 (thus, for a given number, the exponent values in MPFR and in IEEE 754 differ by 1).

Like in the IEEE 754 standard, a floating-point number can also have three kinds of special values: a signed zero (+0 or −0), a signed infinity (+Inf or −Inf), and Not-a-Number (NaN). NaN can represent the default value of a floating-point object and the result of some operations for which no other results would make sense, such as 0 divided by 0 or +Inf minus +Inf; unless documented otherwise, the sign bit of a NaN is unspecified. Note that contrary to IEEE 754, MPFR has a single kind of NaN and does not have subnormals. Other than that, the behavior is very similar to IEEE 754, but there are some minor differences.

The C data type for such objects is mpfr_t, internally defined as a one-element array of a structure (so that when passed as an argument to a function, it is the pointer that is actually passed), and mpfr_ptr is the C data type representing a pointer to this structure; mpfr_srcptr is like mpfr_ptr, but the structure is read-only (i.e., const qualified).

The precision is the number of bits used to represent the significand of a floating-point number; the corresponding C data type is mpfr_prec_t. The precision can be any integer between MPFR_PREC_MIN and MPFR_PREC_MAX. In the current implementation, MPFR_PREC_MIN is equal to 1.

Warning! MPFR needs to increase the precision internally, in order to provide accurate results (and in particular, correct rounding). Do not attempt to set the precision to any value near MPFR_PREC_MAX, otherwise MPFR will abort due to an assertion failure. However, in practice, the real limitation will probably be the available memory on your platform, and in case of lack of memory, the program may abort, crash or have undefined behavior (depending on your C implementation).

An exponent is a component of a regular floating-point number. Its C data type is mpfr_exp_t. Valid exponents are restricted to a subset of this type, and the exponent range can be changed globally as described in Exception Related Functions. Special values do not have an exponent.

The rounding mode specifies the way to round the result of a floating-point operation, in case the exact result cannot be represented exactly in the destination (see Rounding). The corresponding C data type is mpfr_rnd_t.

MPFR has a global (or per-thread) flag for each supported exception and provides operations on flags (Exceptions). This C data type is used to represent a group of flags (or a mask).


4.3 MPFR Variable Conventions

Before you can assign to a MPFR variable, you need to initialize it by calling one of the special initialization functions. When you are done with a variable, you need to clear it out, using one of the functions for that purpose. A variable should only be initialized once, or at least cleared out between each initialization. After a variable has been initialized, it may be assigned to any number of times. For efficiency reasons, avoid to initialize and clear out a variable in loops. Instead, initialize it before entering the loop, and clear it out after the loop has exited. You do not need to be concerned about allocating additional space for MPFR variables, since any variable has a significand of fixed size. Hence unless you change its precision, or clear and reinitialize it, a floating-point variable will have the same allocated space during all its life.

As a general rule, all MPFR functions expect output arguments before input arguments. This notation is based on an analogy with the assignment operator. MPFR allows you to use the same variable for both input and output in the same expression. For example, the main function for floating-point multiplication, mpfr_mul, can be used like this: mpfr_mul (x, x, x, rnd). This computes the square of x with rounding mode rnd and puts the result back in x.


4.4 Rounding

The following rounding modes are supported:

  • MPFR_RNDN: round to nearest, with the even rounding rule (roundTiesToEven in IEEE 754); see details below.
  • MPFR_RNDD: round toward negative infinity (roundTowardNegative in IEEE 754).
  • MPFR_RNDU: round toward positive infinity (roundTowardPositive in IEEE 754).
  • MPFR_RNDZ: round toward zero (roundTowardZero in IEEE 754).
  • MPFR_RNDA: round away from zero.
  • MPFR_RNDF: faithful rounding. This feature is currently experimental. Specific support for this rounding mode has been added to some functions, such as the basic operations (addition, subtraction, multiplication, square, division, square root) or when explicitly documented. It might also work with other functions, as it is possible that they do not need modification in their code; even though a correct behavior is not guaranteed yet (corrections were done when failures occurred in the test suite, but almost nothing has been checked manually), failures should be regarded as bugs and reported, so that they can be fixed.

Note that, in particular for a result equal to zero, the sign is preserved by the rounding operation.

The MPFR_RNDN mode works like roundTiesToEven from the IEEE 754 standard: in case the number to be rounded lies exactly in the middle between two consecutive representable numbers, it is rounded to the one with an even significand; in radix 2, this means that the least significant bit is 0. For example, the number 2.5, which is represented by (10.1) in binary, is rounded to (10.0) = 2 with a precision of two bits, and not to (11.0) = 3. This rule avoids the drift phenomenon mentioned by Knuth in volume 2 of The Art of Computer Programming (Section 4.2.2).

Note: In particular for a 1-digit precision (in radix 2 or other radices, as in conversions to a string of digits), one considers the significands associated with the exponent of the number to be rounded. For instance, to round the number 95 in radix 10 with a 1-digit precision, one considers its truncated 1-digit integer significand 9 and the following integer 10 (since these are consecutive integers, exactly one of them is even). 10 is the even significand, so that 95 will be rounded to 100, not to 90.

For the directed rounding modes, a number x is rounded to the number y that is the closest to x such that

  • MPFR_RNDD: y is less than or equal to x;
  • MPFR_RNDU: y is greater than or equal to x;
  • MPFR_RNDZ: abs(y) is less than or equal to abs(x);
  • MPFR_RNDA: abs(y) is greater than or equal to abs(x).

The MPFR_RNDF mode works as follows: the computed value is either that corresponding to MPFR_RNDD or that corresponding to MPFR_RNDU. In particular when those values are identical, i.e., when the result of the corresponding operation is exactly representable, that exact result is returned. Thus, the computed result can take at most two possible values, and in absence of underflow/overflow, the corresponding error is strictly less than one ulp (unit in the last place) of that result and of the exact result. For MPFR_RNDF, the ternary value (defined below) and the inexact flag (defined later, as with the other flags) are unspecified, the divide-by-zero flag is as with other roundings, and the underflow and overflow flags match what would be obtained in the case the computed value is the same as with MPFR_RNDD or MPFR_RNDU. The results may not be reproducible.

Most MPFR functions take as first argument the destination variable, as second and following arguments the input variables, as last argument a rounding mode, and have a return value of type int, called the ternary value. The value stored in the destination variable is correctly rounded, i.e., MPFR behaves as if it computed the result with an infinite precision, then rounded it to the precision of this variable. The input variables are regarded as exact (in particular, their precision does not affect the result).

As a consequence, in case of a non-zero real rounded result, the error on the result is less than or equal to 1/2 ulp (unit in the last place) of that result in the rounding to nearest mode, and less than 1 ulp of that result in the directed rounding modes (a ulp is the weight of the least significant represented bit of the result after rounding).

Unless documented otherwise, functions returning an int return a ternary value. If the ternary value is zero, it means that the value stored in the destination variable is the exact result of the corresponding mathematical function. If the ternary value is positive (resp. negative), it means the value stored in the destination variable is greater (resp. lower) than the exact result. For example with the MPFR_RNDU rounding mode, the ternary value is usually positive, except when the result is exact, in which case it is zero. In the case of an infinite result, it is considered as inexact when it was obtained by overflow, and exact otherwise. A NaN result (Not-a-Number) always corresponds to an exact return value. The opposite of a returned ternary value is guaranteed to be representable in an int.

Unless documented otherwise, functions returning as result the value 1 (or any other value specified in this manual) for special cases (like acos(0)) yield an overflow or an underflow if that value is not representable in the current exponent range.


4.5 Floating-Point Values on Special Numbers

This section specifies the floating-point values (of type mpfr_t) returned by MPFR functions (where by “returned” we mean here the modified value of the destination object, which should not be mixed with the ternary return value of type int of those functions). For functions returning several values (like mpfr_sin_cos), the rules apply to each result separately.

Functions can have one or several input arguments. An input point is a mapping from these input arguments to the set of the MPFR numbers. When none of its components are NaN, an input point can also be seen as a tuple in the extended real numbers (the set of the real numbers with both infinities).

When the input point is in the domain of the mathematical function, the result is rounded as described in Rounding (but see below for the specification of the sign of an exact zero). Otherwise the general rules from this section apply unless stated otherwise in the description of the MPFR function (MPFR Interface).

When the input point is not in the domain of the mathematical function but is in its closure in the extended real numbers and the function can be extended by continuity, the result is the obtained limit. Examples: mpfr_hypot on (+Inf,0) gives +Inf. But mpfr_pow cannot be defined on (1,+Inf) using this rule, as one can find sequences (x_n,y_n) such that x_n goes to 1, y_n goes to +Inf and x_n to the y_n goes to any positive value when n goes to the infinity.

When the input point is in the closure of the domain of the mathematical function and an input argument is +0 (resp. −0), one considers the limit when the corresponding argument approaches 0 from above (resp. below), if possible. If the limit is not defined (e.g., mpfr_sqrt and mpfr_log on −0), the behavior is specified in the description of the MPFR function, but must be consistent with the rule from the above paragraph (e.g., mpfr_log on ±0 gives −Inf).

When the result is equal to 0, its sign is determined by considering the limit as if the input point were not in the domain: If one approaches 0 from above (resp. below), the result is +0 (resp. −0); for example, mpfr_sin on −0 gives −0 and mpfr_acos on 1 gives +0 (in all rounding modes). In the other cases, the sign is specified in the description of the MPFR function; for example mpfr_max on −0 and +0 gives +0.

When the input point is not in the closure of the domain of the function, the result is NaN. Example: mpfr_sqrt on −17 gives NaN.

When an input argument is NaN, the result is NaN, possibly except when a partial function is constant on the finite floating-point numbers; such a case is always explicitly specified in MPFR Interface. Example: mpfr_hypot on (NaN,0) gives NaN, but mpfr_hypot on (NaN,+Inf) gives +Inf (as specified in Transcendental Functions), since for any finite or infinite input x, mpfr_hypot on (x,+Inf) gives +Inf.

MPFR also tries to follow the specifications of the IEEE 754 standard on special values (IEEE 754 agree with the above rules in most cases). Any difference with IEEE 754 that is not explicitly mentioned, other than those due to the single NaN, is unintended and might be regarded as a bug. See also MPFR and the IEEE 754 Standard.


4.6 Exceptions

MPFR defines a global (or per-thread) flag for each supported exception. A macro evaluating to a power of two is associated with each flag and exception, in order to be able to specify a group of flags (or a mask) by OR’ing such macros.

Flags can be cleared (lowered), set (raised), and tested by functions described in Exception Related Functions.

The supported exceptions are listed below. The macro associated with each exception is in parentheses.

  • Underflow (MPFR_FLAGS_UNDERFLOW): An underflow occurs when the exact result of a function is a non-zero real number and the result obtained after the rounding, assuming an unbounded exponent range (for the rounding), has an exponent smaller than the minimum value of the current exponent range. (In the round-to-nearest mode, the halfway case is rounded toward zero.)

    Note: This is not the single possible definition of the underflow. MPFR chooses to consider the underflow after rounding. The underflow before rounding can also be defined. For instance, consider a function that has the exact result 7 multiplied by two to the power e − 4, where e is the smallest exponent (for a significand between 1/2 and 1), with a 2-bit target precision and rounding toward positive infinity. The exact result has the exponent e − 1. With the underflow before rounding, such a function call would yield an underflow, as e − 1 is outside the current exponent range. However, MPFR first considers the rounded result assuming an unbounded exponent range. The exact result cannot be represented exactly in precision 2, and here, it is rounded to 0.5 times 2 to e, which is representable in the current exponent range. As a consequence, this will not yield an underflow in MPFR.

  • Overflow (MPFR_FLAGS_OVERFLOW): An overflow occurs when the exact result of a function is a non-zero real number and the result obtained after the rounding, assuming an unbounded exponent range (for the rounding), has an exponent larger than the maximum value of the current exponent range. In the round-to-nearest mode, the result is infinite. Note: unlike the underflow case, there is only one possible definition of overflow here.
  • Divide-by-zero (MPFR_FLAGS_DIVBY0): An exact infinite result is obtained from finite inputs.
  • NaN (MPFR_FLAGS_NAN): A NaN exception occurs when the result of a function is NaN.
  • Inexact (MPFR_FLAGS_INEXACT): An inexact exception occurs when the result of a function cannot be represented exactly and must be rounded.
  • Range error (MPFR_FLAGS_ERANGE): A range exception occurs when a function that does not return a MPFR number (such as comparisons and conversions to an integer) has an invalid result (e.g., an argument is NaN in mpfr_cmp, or a conversion to an integer cannot be represented in the target type).

Moreover, the group consisting of all the flags is represented by the MPFR_FLAGS_ALL macro (if new flags are added in future MPFR versions, they will be added to this macro too).

Differences with the ISO C99 standard:

  • In C, only quiet NaNs are specified, and a NaN propagation does not raise an invalid exception. Unless explicitly stated otherwise, MPFR sets the NaN flag whenever a NaN is generated, even when a NaN is propagated (e.g., in NaN + NaN), as if all NaNs were signaling.
  • An invalid exception in C corresponds to either a NaN exception or a range error in MPFR.

4.7 Memory Handling

MPFR functions may create caches, e.g., when computing constants such as Pi, either because the user has called a function like mpfr_const_pi directly or because such a function was called internally by the MPFR library itself to compute some other function. When more precision is needed, the value is automatically recomputed; a minimum of 10% increase of the precision is guaranteed to avoid too many recomputations.

MPFR functions may also create thread-local pools for internal use to avoid the cost of memory allocation. The pools can be freed with mpfr_free_pool (but with a default MPFR build, they should not take much memory, as the allocation size is limited).

At any time, the user can free various caches and pools with mpfr_free_cache and mpfr_free_cache2. It is strongly advised to free thread-local caches before terminating a thread, and all caches before exiting when using tools like ‘valgrind’ (to avoid memory leaks being reported).

MPFR allocates its memory either on the stack (for temporary memory only) or with the same allocator as the one configured for GMP: see Section “Custom Allocation” in GNU MP. This means that the application must make sure that data allocated with the current allocator will not be reallocated or freed with a new allocator. So, in practice, if an application needs to change the allocator with mp_set_memory_functions, it should first free all data allocated with the current allocator: for its own data, with mpfr_clear, etc.; for the caches and pools, with mpfr_mp_memory_cleanup in all threads where MPFR is potentially used. This function is currently equivalent to mpfr_free_cache, but mpfr_mp_memory_cleanup is the recommended way in case the allocation method changes in the future (for instance, one may choose to allocate the caches for floating-point constants with malloc to avoid freeing them if the allocator changes). Developers should also be aware that MPFR may also be used indirectly by libraries, so that libraries based on MPFR should provide a clean-up function calling mpfr_mp_memory_cleanup and/or warn their users about this issue.

Note: For multithreaded applications, the allocator must be valid in all threads where MPFR may be used; data allocated in one thread may be reallocated and/or freed in some other thread.

MPFR internal data such as flags, the exponent range, the default precision, and the default rounding mode are either global (if MPFR has not been compiled as thread safe) or per-thread (thread-local storage, TLS). The initial values of TLS data after a thread is created entirely depend on the compiler and thread implementation (MPFR simply does a conventional variable initialization, the variables being declared with an implementation-defined TLS specifier).

Writers of libraries using MPFR should be aware that the application and/or another library used by the application may also use MPFR, so that changing the exponent range, the default precision, or the default rounding mode may have an effect on this other use of MPFR since these data are not duplicated (unless they are in a different thread). Therefore any such value changed in a library function should be restored before the function returns (unless the purpose of the function is to do such a change). Writers of software using MPFR should also be careful when changing such a value if they use a library using MPFR (directly or indirectly), in order to make sure that such a change is compatible with the library.


4.8 Getting the Best Efficiency Out of MPFR

Here are a few hints to get the best efficiency out of MPFR:

  • you should avoid allocating and clearing variables. Reuse variables whenever possible, allocate or clear outside of loops, pass temporary variables to subroutines instead of allocating them inside the subroutines;
  • use mpfr_swap instead of mpfr_set whenever possible. This will avoid copying the significands;
  • avoid using MPFR from C++, or make sure your C++ interface does not perform unnecessary allocations or copies. Slowdowns of up to a factor 15 have been observed on some applications with a C++ interface;
  • MPFR functions work in-place: to compute a = a + b you don’t need an auxiliary variable, you can directly write mpfr_add (a, a, b, ...).