Constant gmp_mpfr_sys::C::GMP::Floating_point_Functions

pub const Floating_point_Functions: ();

This constant is a place-holder for documentation; do not use it in code.


7 Floating-point Functions

GMP floating point numbers are stored in objects of type mpf_t and functions operating on them have an mpf_ prefix.

The mantissa of each float has a user-selectable precision, in practice only limited by available memory. Each variable has its own precision, and that can be increased or decreased at any time. This selectable precision is a minimum value; GMP rounds it up to a whole limb.

The accuracy of a calculation is determined by the previously set precision of the destination variable and the numeric values of the input variables. Input variables’ set precisions do not affect calculations (except indirectly as their values might have been affected when they were assigned).

The exponent of each float has fixed precision, one machine word on most systems. In the current implementation the exponent is a count of limbs, so for example on a 32-bit system this means a range of roughly 2^-68719476768 to 2^68719476736, or on a 64-bit system this will be much greater. Note however that mpf_get_str can only return an exponent which fits an mp_exp_t and currently mpf_set_str doesn’t accept exponents bigger than a long.

Each variable keeps track of the mantissa data actually in use. This means that if a float is exactly represented in only a few bits then only those bits will be used in a calculation, even if the variable’s selected precision is high. This is a performance optimization; it does not affect the numeric results.

Internally, GMP sometimes calculates with higher precision than that of the destination variable in order to limit errors. Final results are always truncated to the destination variable’s precision.

The mantissa is stored in binary. One consequence of this is that decimal fractions like 0.1 cannot be represented exactly. The same is true of plain IEEE double floats. This makes both highly unsuitable for calculations involving money or other values that should be exact decimal fractions. (Suitably scaled integers, or perhaps rationals, are better choices.)

The mpf functions and variables have no special notion of infinity or not-a-number, and applications must take care not to overflow the exponent or results will be unpredictable.

Note that the mpf functions are not intended as a smooth extension to IEEE P754 arithmetic. In particular results obtained on one computer often differ from the results on a computer with a different word size.

New projects should consider using the GMP extension library MPFR (https://www.mpfr.org/) instead. MPFR provides well-defined precision and accurate rounding, and thereby naturally extends IEEE P754.


7.1 Initialization Functions

Function: void mpf_set_default_prec (mp_bitcnt_t prec)

Set the default precision to be at least prec bits. All subsequent calls to mpf_init will use this precision, but previously initialized variables are unaffected.

Function: mp_bitcnt_t mpf_get_default_prec (void)

Return the default precision actually used.

An mpf_t object must be initialized before storing the first value in it. The functions mpf_init and mpf_init2 are used for that purpose.

Function: void mpf_init (mpf_t x)

Initialize x to 0. Normally, a variable should be initialized once only or at least be cleared, using mpf_clear, between initializations. The precision of x is undefined unless a default precision has already been established by a call to mpf_set_default_prec.

Function: void mpf_init2 (mpf_t x, mp_bitcnt_t prec)

Initialize x to 0 and set its precision to be at least prec bits. Normally, a variable should be initialized once only or at least be cleared, using mpf_clear, between initializations.

Function: void mpf_inits (mpf_t x, ...)

Initialize a NULL-terminated list of mpf_t variables, and set their values to 0. The precision of the initialized variables is undefined unless a default precision has already been established by a call to mpf_set_default_prec.

Function: void mpf_clear (mpf_t x)

Free the space occupied by x. Make sure to call this function for all mpf_t variables when you are done with them.

Function: void mpf_clears (mpf_t x, ...)

Free the space occupied by a NULL-terminated list of mpf_t variables.

Here is an example of how to initialize floating-point variables:

{
  mpf_t x, y;
  mpf_init (x);           /* use default precision */
  mpf_init2 (y, 256);     /* precision at least 256 bits */
  ...
  /* Unless the program is about to exit, do ... */
  mpf_clear (x);
  mpf_clear (y);
}

The following three functions are useful for changing the precision during a calculation. A typical use would be for adjusting the precision gradually in iterative algorithms like Newton-Raphson, making the computation precision closely match the actual accurate part of the numbers.

Function: mp_bitcnt_t mpf_get_prec (const mpf_t op)

Return the current precision of op, in bits.

Function: void mpf_set_prec (mpf_t rop, mp_bitcnt_t prec)

Set the precision of rop to be at least prec bits. The value in rop will be truncated to the new precision.

This function requires a call to realloc, and so should not be used in a tight loop.

Function: void mpf_set_prec_raw (mpf_t rop, mp_bitcnt_t prec)

Set the precision of rop to be at least prec bits, without changing the memory allocated.

prec must be no more than the allocated precision for rop, that being the precision when rop was initialized, or in the most recent mpf_set_prec.

The value in rop is unchanged, and in particular if it had a higher precision than prec it will retain that higher precision. New values written to rop will use the new prec.

Before calling mpf_clear or the full mpf_set_prec, another mpf_set_prec_raw call must be made to restore rop to its original allocated precision. Failing to do so will have unpredictable results.

mpf_get_prec can be used before mpf_set_prec_raw to get the original allocated precision. After mpf_set_prec_raw it reflects the prec value set.

mpf_set_prec_raw is an efficient way to use an mpf_t variable at different precisions during a calculation, perhaps to gradually increase precision in an iteration, or just to use various different precisions for different purposes during a calculation.


7.2 Assignment Functions

These functions assign new values to already initialized floats (see Initialization Functions).

Function: void mpf_set (mpf_t rop, const mpf_t op)
Function: void mpf_set_ui (mpf_t rop, unsigned long int op)
Function: void mpf_set_si (mpf_t rop, signed long int op)
Function: void mpf_set_d (mpf_t rop, double op)
Function: void mpf_set_z (mpf_t rop, const mpz_t op)
Function: void mpf_set_q (mpf_t rop, const mpq_t op)

Set the value of rop from op.

Function: int mpf_set_str (mpf_t rop, const char *str, int base)

Set the value of rop from the string in str. The string is of the form ‘M@N’ or, if the base is 10 or less, alternatively ‘MeN’. ‘M’ is the mantissa and ‘N’ is the exponent. The mantissa is always in the specified base. The exponent is either in the specified base or, if base is negative, in decimal. The decimal point expected is taken from the current locale, on systems providing localeconv.

The argument base may be in the ranges 2 to 62, or −62 to −2. Negative values are used to specify that the exponent is in decimal.

For bases up to 36, case is ignored; upper-case and lower-case letters have the same value; for bases 37 to 62, upper-case letters represent the usual 10..35 while lower-case letters represent 36..61.

Unlike the corresponding mpz function, the base will not be determined from the leading characters of the string if base is 0. This is so that numbers like ‘0.23’ are not interpreted as octal.

White space is allowed in the string, and is simply ignored. [This is not really true; white-space is ignored in the beginning of the string and within the mantissa, but not in other places, such as after a minus sign or in the exponent. We are considering changing the definition of this function, making it fail when there is any white-space in the input, since that makes a lot of sense. Please tell us your opinion about this change. Do you really want it to accept "3 14" as meaning 314 as it does now?]

This function returns 0 if the entire string is a valid number in base base. Otherwise it returns −1.

Function: void mpf_swap (mpf_t rop1, mpf_t rop2)

Swap rop1 and rop2 efficiently. Both the values and the precisions of the two variables are swapped.


7.3 Combined Initialization and Assignment Functions

For convenience, GMP provides a parallel series of initialize-and-set functions which initialize the output and then store the value there. These functions’ names have the form mpf_init_set…

Once the float has been initialized by any of the mpf_init_set… functions, it can be used as the source or destination operand for the ordinary float functions. Don’t use an initialize-and-set function on a variable already initialized!

Function: void mpf_init_set (mpf_t rop, const mpf_t op)
Function: void mpf_init_set_ui (mpf_t rop, unsigned long int op)
Function: void mpf_init_set_si (mpf_t rop, signed long int op)
Function: void mpf_init_set_d (mpf_t rop, double op)

Initialize rop and set its value from op.

The precision of rop will be taken from the active default precision, as set by mpf_set_default_prec.

Function: int mpf_init_set_str (mpf_t rop, const char *str, int base)

Initialize rop and set its value from the string in str. See mpf_set_str above for details on the assignment operation.

Note that rop is initialized even if an error occurs. (I.e., you have to call mpf_clear for it.)

The precision of rop will be taken from the active default precision, as set by mpf_set_default_prec.


7.4 Conversion Functions

Function: double mpf_get_d (const mpf_t op)

Convert op to a double, truncating if necessary (i.e. rounding towards zero).

If the exponent in op is too big or too small to fit a double then the result is system dependent. If it is too big, an infinity is returned when available. If it is too small, 0.0 is normally returned. Hardware overflow, underflow and denorm traps may or may not occur.

Function: double mpf_get_d_2exp (signed long int *exp, const mpf_t op)

Convert op to a double, truncating if necessary (i.e. rounding towards zero), and with an exponent returned separately.

The return value is in the range 0.5<=abs(d)<1 and the exponent is stored to *exp. d * 2^exp is the (truncated) op value. If op is zero, the return is 0.0 and 0 is stored to *exp.

This is similar to the standard C frexp function (see Normalization Functions in The GNU C Library Reference Manual).

Function: long mpf_get_si (const mpf_t op)
Function: unsigned long mpf_get_ui (const mpf_t op)

Convert op to a long or unsigned long, truncating any fraction part. If op is too big for the return type, the result is undefined.

See also mpf_fits_slong_p and mpf_fits_ulong_p (see Miscellaneous Functions).

Function: char * mpf_get_str (char *str, mp_exp_t *expptr, int base, size_t n_digits, const mpf_t op)

Convert op to a string of digits in base base. The base argument may vary from 2 to 62 or from −2 to −36. Up to n_digits digits will be generated. Trailing zeros are not returned. No more digits than can be accurately represented by op are ever generated. If n_digits is 0 then that accurate maximum number of digits is generated.

For base in the range 2..36, digits and lower-case letters are used; for −2..−36, digits and upper-case letters are used; for 37..62, digits, upper-case letters, and lower-case letters (in that significance order) are used.

If str is NULL, the result string is allocated using the current allocation function (see Custom Allocation). The block will be strlen(str)+1 bytes, that being exactly enough for the string and null-terminator.

If str is not NULL, it should point to a block of n_digits + 2 bytes, that being enough for the mantissa, a possible minus sign, and a null-terminator. When n_digits is 0 to get all significant digits, an application won’t be able to know the space required, and str should be NULL in that case.

The generated string is a fraction, with an implicit radix point immediately to the left of the first digit. The applicable exponent is written through the expptr pointer. For example, the number 3.1416 would be returned as string "31416" and exponent 1.

When op is zero, an empty string is produced and the exponent returned is 0.

A pointer to the result string is returned, being either the allocated block or the given str.


7.5 Arithmetic Functions

Function: void mpf_add (mpf_t rop, const mpf_t op1, const mpf_t op2)
Function: void mpf_add_ui (mpf_t rop, const mpf_t op1, unsigned long int op2)

Set rop to op1 + op2.

Function: void mpf_sub (mpf_t rop, const mpf_t op1, const mpf_t op2)
Function: void mpf_ui_sub (mpf_t rop, unsigned long int op1, const mpf_t op2)
Function: void mpf_sub_ui (mpf_t rop, const mpf_t op1, unsigned long int op2)

Set rop to op1 − op2.

Function: void mpf_mul (mpf_t rop, const mpf_t op1, const mpf_t op2)
Function: void mpf_mul_ui (mpf_t rop, const mpf_t op1, unsigned long int op2)

Set rop to op1 times op2.

Division is undefined if the divisor is zero, and passing a zero divisor to the divide functions will make these functions intentionally divide by zero. This lets the user handle arithmetic exceptions in these functions in the same manner as other arithmetic exceptions.

Function: void mpf_div (mpf_t rop, const mpf_t op1, const mpf_t op2)
Function: void mpf_ui_div (mpf_t rop, unsigned long int op1, const mpf_t op2)
Function: void mpf_div_ui (mpf_t rop, const mpf_t op1, unsigned long int op2)

Set rop to op1/op2.

Function: void mpf_sqrt (mpf_t rop, const mpf_t op)
Function: void mpf_sqrt_ui (mpf_t rop, unsigned long int op)

Set rop to the square root of op.

Function: void mpf_pow_ui (mpf_t rop, const mpf_t op1, unsigned long int op2)

Set rop to op1 raised to the power op2.

Function: void mpf_neg (mpf_t rop, const mpf_t op)

Set rop to −op.

Function: void mpf_abs (mpf_t rop, const mpf_t op)

Set rop to the absolute value of op.

Function: void mpf_mul_2exp (mpf_t rop, const mpf_t op1, mp_bitcnt_t op2)

Set rop to op1 times 2 raised to op2.

Function: void mpf_div_2exp (mpf_t rop, const mpf_t op1, mp_bitcnt_t op2)

Set rop to op1 divided by 2 raised to op2.


7.6 Comparison Functions

Function: int mpf_cmp (const mpf_t op1, const mpf_t op2)
Function: int mpf_cmp_z (const mpf_t op1, const mpz_t op2)
Function: int mpf_cmp_d (const mpf_t op1, double op2)
Function: int mpf_cmp_ui (const mpf_t op1, unsigned long int op2)
Function: int mpf_cmp_si (const mpf_t op1, signed long int op2)

Compare op1 and op2. Return a positive value if op1 > op2, zero if op1 = op2, and a negative value if op1 < op2.

mpf_cmp_d can be called with an infinity, but results are undefined for a NaN.

Function: int mpf_eq (const mpf_t op1, const mpf_t op2, mp_bitcnt_t op3)

This function is mathematically ill-defined and should not be used.

Return non-zero if the first op3 bits of op1 and op2 are equal, zero otherwise. Note that numbers such as 256 (binary 100000000) and 255 (binary 11111111) will never be equal by this function’s measure, and furthermore that 0 will only be equal to itself.

Function: void mpf_reldiff (mpf_t rop, const mpf_t op1, const mpf_t op2)

Compute the relative difference between op1 and op2 and store the result in rop. This is abs(op1-op2)/op1.

Macro: int mpf_sgn (const mpf_t op)

Return +1 if op > 0, 0 if op = 0, and -1 if op < 0.

This function is actually implemented as a macro. It evaluates its argument multiple times.


7.7 Input and Output Functions

These functions read mpf numbers from a stdio stream or write them to one. Passing a NULL pointer for the stream argument makes the input function read from stdin and the output function write to stdout.

When using any of these functions, it is a good idea to include stdio.h before gmp.h, since that will allow gmp.h to define prototypes for these functions.

See also Formatted Output and Formatted Input.

Function: size_t mpf_out_str (FILE *stream, int base, size_t n_digits, const mpf_t op)

Print op to stream, as a string of digits. Return the number of bytes written, or if an error occurred, return 0.

The mantissa is prefixed with a ‘0.’ and is in the given base, which may vary from 2 to 62 or from −2 to −36. An exponent is then printed, separated by an ‘e’, or if the base is greater than 10 then by an ‘@’. The exponent is always in decimal. The decimal point follows the current locale, on systems providing localeconv.

For base in the range 2..36, digits and lower-case letters are used; for −2..−36, digits and upper-case letters are used; for 37..62, digits, upper-case letters, and lower-case letters (in that significance order) are used.

Up to n_digits will be printed from the mantissa, except that no more digits than are accurately representable by op will be printed. n_digits can be 0 to select that accurate maximum.

Function: size_t mpf_inp_str (mpf_t rop, FILE *stream, int base)

Read a string in base base from stream, and put the read float in rop. The string is of the form ‘M@N’ or, if the base is 10 or less, alternatively ‘MeN’. ‘M’ is the mantissa and ‘N’ is the exponent. The mantissa is always in the specified base. The exponent is either in the specified base or, if base is negative, in decimal. The decimal point expected is taken from the current locale, on systems providing localeconv.

The argument base may be in the ranges 2 to 36, or −36 to −2. Negative values are used to specify that the exponent is in decimal.

Unlike the corresponding mpz function, the base will not be determined from the leading characters of the string if base is 0. This is so that numbers like ‘0.23’ are not interpreted as octal.

Return the number of bytes read, or if an error occurred, return 0.


7.8 Miscellaneous Functions

Function: void mpf_ceil (mpf_t rop, const mpf_t op)
Function: void mpf_floor (mpf_t rop, const mpf_t op)
Function: void mpf_trunc (mpf_t rop, const mpf_t op)

Set rop to op rounded to an integer. mpf_ceil rounds to the next higher integer, mpf_floor to the next lower, and mpf_trunc to the integer towards zero.

Function: int mpf_integer_p (const mpf_t op)

Return non-zero if op is an integer.

Function: int mpf_fits_ulong_p (const mpf_t op)
Function: int mpf_fits_slong_p (const mpf_t op)
Function: int mpf_fits_uint_p (const mpf_t op)
Function: int mpf_fits_sint_p (const mpf_t op)
Function: int mpf_fits_ushort_p (const mpf_t op)
Function: int mpf_fits_sshort_p (const mpf_t op)

Return non-zero if op would fit in the respective C data type, when truncated to an integer.

Function: void mpf_urandomb (mpf_t rop, gmp_randstate_t state, mp_bitcnt_t nbits)

Generate a uniformly distributed random float in rop, such that 0 <= rop < 1, with nbits significant bits in the mantissa or less if the precision of rop is smaller.

The variable state must be initialized by calling one of the gmp_randinit functions (Random State Initialization) before invoking this function.

Function: void mpf_random2 (mpf_t rop, mp_size_t max_size, mp_exp_t exp)

Generate a random float of at most max_size limbs, with long strings of zeros and ones in the binary representation. The exponent of the number is in the interval −exp to exp (in limbs). This function is useful for testing functions and algorithms, since this kind of random number has proven to be more likely to trigger corner-case bugs. Negative random numbers are generated when max_size is negative.