Constant gmp_mpfr_sys::C::GMP::Floating_point_Functions
pub const Floating_point_Functions: ();
This constant is a place-holder for documentation; do not use it in code.
7 Floating-point Functions
GMP floating point numbers are stored in objects of type mpf_t and functions operating on them have an mpf_ prefix.
The mantissa of each float has a user-selectable precision, in practice only limited by available memory. Each variable has its own precision, and that can be increased or decreased at any time. This selectable precision is a minimum value; GMP rounds it up to a whole limb.
The accuracy of a calculation is determined by the previously set precision of the destination variable and the numeric values of the input variables. Input variables’ set precisions do not affect calculations (except indirectly, as their values might have been affected when they were assigned).
The exponent of each float has fixed precision, one machine word on most systems. In the current implementation the exponent is a count of limbs, so for example on a 32-bit system this means a range of roughly 2^-68719476768 to 2^68719476736, or on a 64-bit system this will be much greater. Note however that mpf_get_str can only return an exponent which fits an mp_exp_t and currently mpf_set_str doesn’t accept exponents bigger than a long.
Each variable keeps track of the mantissa data actually in use. This means that if a float is exactly represented in only a few bits then only those bits will be used in a calculation, even if the variable’s selected precision is high. This is a performance optimization; it does not affect the numeric results.
Internally, GMP sometimes calculates with higher precision than that of the destination variable in order to limit errors. Final results are always truncated to the destination variable’s precision.
The mantissa is stored in binary. One consequence of this is that decimal fractions like 0.1 cannot be represented exactly. The same is true of plain IEEE double floats. This makes both highly unsuitable for calculations involving money or other values that should be exact decimal fractions. (Suitably scaled integers, or perhaps rationals, are better choices.)
The mpf functions and variables have no special notion of infinity or not-a-number, and applications must take care not to overflow the exponent or results will be unpredictable.
Note that the mpf functions are not intended as a smooth extension to IEEE P754 arithmetic. In particular results obtained on one computer often differ from the results on a computer with a different word size.
New projects should consider using the GMP extension library MPFR (http://mpfr.org) instead. MPFR provides well-defined precision and accurate rounding, and thereby naturally extends IEEE P754.
• Initializing Floats
• Assigning Floats
• Simultaneous Float Init & Assign
• Converting Floats
• Float Arithmetic
• Float Comparison
• I/O of Floats
• Miscellaneous Float Functions
7.1 Initialization Functions
- Function: void mpf_set_default_prec (mp_bitcnt_t prec)
Set the default precision to be at least prec bits. All subsequent calls to mpf_init will use this precision, but previously initialized variables are unaffected.
- Function: mp_bitcnt_t mpf_get_default_prec (void)
Return the default precision actually used.
An mpf_t object must be initialized before storing the first value in it. The functions mpf_init and mpf_init2 are used for that purpose.
- Function: void mpf_init (mpf_t x)
Initialize x to 0. Normally, a variable should be initialized once only or at least be cleared, using mpf_clear, between initializations. The precision of x is undefined unless a default precision has already been established by a call to mpf_set_default_prec.
- Function: void mpf_init2 (mpf_t x, mp_bitcnt_t prec)
Initialize x to 0 and set its precision to be at least prec bits. Normally, a variable should be initialized once only or at least be cleared, using mpf_clear, between initializations.
- Function: void mpf_inits (mpf_t x, ...)
Initialize a NULL-terminated list of mpf_t variables, and set their values to 0. The precision of the initialized variables is undefined unless a default precision has already been established by a call to mpf_set_default_prec.
- Function: void mpf_clear (mpf_t x)
Free the space occupied by x. Make sure to call this function for all mpf_t variables when you are done with them.
- Function: void mpf_clears (mpf_t x, ...)
Free the space occupied by a NULL-terminated list of mpf_t variables.
Here is an example of how to initialize floating-point variables:
    {
      mpf_t x, y;
      mpf_init (x);           /* use default precision */
      mpf_init2 (y, 256);     /* precision at least 256 bits */
      …
      /* Unless the program is about to exit, do ... */
      mpf_clear (x);
      mpf_clear (y);
    }
The following three functions are useful for changing the precision during a calculation. A typical use would be for adjusting the precision gradually in iterative algorithms like Newton-Raphson, making the computation precision closely match the actual accurate part of the numbers.
- Function: mp_bitcnt_t mpf_get_prec (const mpf_t op)
Return the current precision of op, in bits.
- Function: void mpf_set_prec (mpf_t rop, mp_bitcnt_t prec)
Set the precision of rop to be at least prec bits. The value in rop will be truncated to the new precision.
This function requires a call to realloc, and so should not be used in a tight loop.
- Function: void mpf_set_prec_raw (mpf_t rop, mp_bitcnt_t prec)
Set the precision of rop to be at least prec bits, without changing the memory allocated.
prec must be no more than the allocated precision for rop, that being the precision when rop was initialized, or in the most recent mpf_set_prec.
The value in rop is unchanged, and in particular if it had a higher precision than prec it will retain that higher precision. New values written to rop will use the new prec.
Before calling mpf_clear or the full mpf_set_prec, another mpf_set_prec_raw call must be made to restore rop to its original allocated precision. Failing to do so will have unpredictable results.
mpf_get_prec can be used before mpf_set_prec_raw to get the original allocated precision. After mpf_set_prec_raw it reflects the prec value set.
mpf_set_prec_raw is an efficient way to use an mpf_t variable at different precisions during a calculation, perhaps to gradually increase precision in an iteration, or just to use various different precisions for different purposes during a calculation.
7.2 Assignment Functions
These functions assign new values to already initialized floats (see Initializing Floats).
- Function: void mpf_set (mpf_t rop, const mpf_t op)
- Function: void mpf_set_ui (mpf_t rop, unsigned long int op)
- Function: void mpf_set_si (mpf_t rop, signed long int op)
- Function: void mpf_set_d (mpf_t rop, double op)
- Function: void mpf_set_z (mpf_t rop, const mpz_t op)
- Function: void mpf_set_q (mpf_t rop, const mpq_t op)
Set the value of rop from op.
- Function: int mpf_set_str (mpf_t rop, const char *str, int base)
Set the value of rop from the string in str. The string is of the form ‘M@N’ or, if the base is 10 or less, alternatively ‘MeN’. ‘M’ is the mantissa and ‘N’ is the exponent. The mantissa is always in the specified base. The exponent is either in the specified base or, if base is negative, in decimal. The decimal point expected is taken from the current locale, on systems providing localeconv.
The argument base may be in the ranges 2 to 62, or -62 to -2. Negative values are used to specify that the exponent is in decimal.
For bases up to 36, case is ignored; upper-case and lower-case letters have the same value; for bases 37 to 62, upper-case letters represent the usual 10..35 while lower-case letters represent 36..61.
Unlike the corresponding mpz function, the base will not be determined from the leading characters of the string if base is 0. This is so that numbers like ‘0.23’ are not interpreted as octal.
White space is allowed in the string, and is simply ignored. [This is not really true; white-space is ignored in the beginning of the string and within the mantissa, but not in other places, such as after a minus sign or in the exponent. We are considering changing the definition of this function, making it fail when there is any white-space in the input, since that makes a lot of sense. Please tell us your opinion about this change. Do you really want it to accept "3 14" as meaning 314 as it does now?]
This function returns 0 if the entire string is a valid number in base base. Otherwise it returns -1.
- Function: void mpf_swap (mpf_t rop1, mpf_t rop2)
Swap rop1 and rop2 efficiently. Both the values and the precisions of the two variables are swapped.
7.3 Combined Initialization and Assignment Functions
For convenience, GMP provides a parallel series of initialize-and-set functions which initialize the output and then store the value there. These functions’ names have the form mpf_init_set…
Once the float has been initialized by any of the mpf_init_set… functions, it can be used as the source or destination operand for the ordinary float functions. Don’t use an initialize-and-set function on a variable already initialized!
- Function: void mpf_init_set (mpf_t rop, const mpf_t op)
- Function: void mpf_init_set_ui (mpf_t rop, unsigned long int op)
- Function: void mpf_init_set_si (mpf_t rop, signed long int op)
- Function: void mpf_init_set_d (mpf_t rop, double op)
Initialize rop and set its value from op.
The precision of rop will be taken from the active default precision, as set by mpf_set_default_prec.
- Function: int mpf_init_set_str (mpf_t rop, const char *str, int base)
Initialize rop and set its value from the string in str. See mpf_set_str above for details on the assignment operation.
Note that rop is initialized even if an error occurs. (I.e., you have to call mpf_clear for it.)
The precision of rop will be taken from the active default precision, as set by mpf_set_default_prec.
7.4 Conversion Functions
- Function: double mpf_get_d (const mpf_t op)
Convert op to a double, truncating if necessary (i.e. rounding towards zero).
If the exponent in op is too big or too small to fit a double then the result is system dependent. For too big an infinity is returned when available. For too small 0.0 is normally returned. Hardware overflow, underflow and denorm traps may or may not occur.
- Function: double mpf_get_d_2exp (signed long int *exp, const mpf_t op)
Convert op to a double, truncating if necessary (i.e. rounding towards zero), and with an exponent returned separately.
The return value is in the range 0.5<=abs(d)<1 and the exponent is stored to *exp. d * 2^exp is the (truncated) op value. If op is zero, the return is 0.0 and 0 is stored to *exp.
This is similar to the standard C frexp function (see Normalization Functions in The GNU C Library Reference Manual).
- Function: long mpf_get_si (const mpf_t op)
- Function: unsigned long mpf_get_ui (const mpf_t op)
Convert op to a long or unsigned long, truncating any fraction part. If op is too big for the return type, the result is undefined.
See also mpf_fits_slong_p and mpf_fits_ulong_p (see Miscellaneous Float Functions).
- Function: char * mpf_get_str (char *str, mp_exp_t *expptr, int base, size_t n_digits, const mpf_t op)
Convert op to a string of digits in base base. The base argument may vary from 2 to 62 or from -2 to -36. Up to n_digits digits will be generated. Trailing zeros are not returned. No more digits than can be accurately represented by op are ever generated. If n_digits is 0 then that accurate maximum number of digits are generated.
For base in the range 2..36, digits and lower-case letters are used; for -2..-36, digits and upper-case letters are used; for 37..62, digits, upper-case letters, and lower-case letters (in that significance order) are used.
If str is NULL, the result string is allocated using the current allocation function (see Custom Allocation). The block will be strlen(str)+1 bytes, that being exactly enough for the string and null-terminator.
If str is not NULL, it should point to a block of n_digits + 2 bytes, that being enough for the mantissa, a possible minus sign, and a null-terminator. When n_digits is 0 to get all significant digits, an application won’t be able to know the space required, and str should be NULL in that case.
The generated string is a fraction, with an implicit radix point immediately to the left of the first digit. The applicable exponent is written through the expptr pointer. For example, the number 3.1416 would be returned as string "31416" and exponent 1.
When op is zero, an empty string is produced and the exponent returned is 0.
A pointer to the result string is returned, being either the allocated block or the given str.
7.5 Arithmetic Functions
- Function: void mpf_add (mpf_t rop, const mpf_t op1, const mpf_t op2)
- Function: void mpf_add_ui (mpf_t rop, const mpf_t op1, unsigned long int op2)
Set rop to op1 + op2.
- Function: void mpf_sub (mpf_t rop, const mpf_t op1, const mpf_t op2)
- Function: void mpf_ui_sub (mpf_t rop, unsigned long int op1, const mpf_t op2)
- Function: void mpf_sub_ui (mpf_t rop, const mpf_t op1, unsigned long int op2)
Set rop to op1 - op2.
- Function: void mpf_mul (mpf_t rop, const mpf_t op1, const mpf_t op2)
- Function: void mpf_mul_ui (mpf_t rop, const mpf_t op1, unsigned long int op2)
Set rop to op1 times op2.
Division is undefined if the divisor is zero, and passing a zero divisor to the divide functions will make these functions intentionally divide by zero. This lets the user handle arithmetic exceptions in these functions in the same manner as other arithmetic exceptions.
- Function: void mpf_div (mpf_t rop, const mpf_t op1, const mpf_t op2)
- Function: void mpf_ui_div (mpf_t rop, unsigned long int op1, const mpf_t op2)
- Function: void mpf_div_ui (mpf_t rop, const mpf_t op1, unsigned long int op2)
Set rop to op1/op2.
- Function: void mpf_sqrt (mpf_t rop, const mpf_t op)
- Function: void mpf_sqrt_ui (mpf_t rop, unsigned long int op)
Set rop to the square root of op.
- Function: void mpf_pow_ui (mpf_t rop, const mpf_t op1, unsigned long int op2)
Set rop to op1 raised to the power op2.
- Function: void mpf_neg (mpf_t rop, const mpf_t op)
Set rop to -op.
- Function: void mpf_abs (mpf_t rop, const mpf_t op)
Set rop to the absolute value of op.
- Function: void mpf_mul_2exp (mpf_t rop, const mpf_t op1, mp_bitcnt_t op2)
Set rop to op1 times 2 raised to op2.
- Function: void mpf_div_2exp (mpf_t rop, const mpf_t op1, mp_bitcnt_t op2)
Set rop to op1 divided by 2 raised to op2.
7.6 Comparison Functions
- Function: int mpf_cmp (const mpf_t op1, const mpf_t op2)
- Function: int mpf_cmp_z (const mpf_t op1, const mpz_t op2)
- Function: int mpf_cmp_d (const mpf_t op1, double op2)
- Function: int mpf_cmp_ui (const mpf_t op1, unsigned long int op2)
- Function: int mpf_cmp_si (const mpf_t op1, signed long int op2)
Compare op1 and op2. Return a positive value if op1 > op2, zero if op1 = op2, and a negative value if op1 < op2.
mpf_cmp_d can be called with an infinity, but results are undefined for a NaN.
- Function: int mpf_eq (const mpf_t op1, const mpf_t op2, mp_bitcnt_t op3)
This function is mathematically ill-defined and should not be used.
Return non-zero if the first op3 bits of op1 and op2 are equal, zero otherwise. Note that numbers like e.g., 256 (binary 100000000) and 255 (binary 11111111) will never be equal by this function’s measure, and furthermore that 0 will only be equal to itself.
- Function: void mpf_reldiff (mpf_t rop, const mpf_t op1, const mpf_t op2)
Compute the relative difference between op1 and op2 and store the result in rop. This is abs(op1-op2)/op1.
- Macro: int mpf_sgn (const mpf_t op)
Return +1 if op > 0, 0 if op = 0, and -1 if op < 0.
This function is actually implemented as a macro. It evaluates its argument multiple times.
7.7 Input and Output Functions
These functions read mpf numbers from a stdio stream and write them to a stdio stream. Passing a NULL pointer for a stream argument to any of these functions will make them read from stdin and write to stdout, respectively.
When using any of these functions, it is a good idea to include stdio.h before gmp.h, since that will allow gmp.h to define prototypes for these functions.
See also Formatted Output and Formatted Input.
- Function: size_t mpf_out_str (FILE *stream, int base, size_t n_digits, const mpf_t op)
Print op to stream, as a string of digits. Return the number of bytes written, or if an error occurred, return 0.
The mantissa is prefixed with a ‘0.’ and is in the given base, which may vary from 2 to 62 or from -2 to -36. An exponent is then printed, separated by an ‘e’, or if the base is greater than 10 then by an ‘@’. The exponent is always in decimal. The decimal point follows the current locale, on systems providing localeconv.
For base in the range 2..36, digits and lower-case letters are used; for -2..-36, digits and upper-case letters are used; for 37..62, digits, upper-case letters, and lower-case letters (in that significance order) are used.
Up to n_digits will be printed from the mantissa, except that no more digits than are accurately representable by op will be printed. n_digits can be 0 to select that accurate maximum.
- Function: size_t mpf_inp_str (mpf_t rop, FILE *stream, int base)
Read a string in base base from stream, and put the read float in rop. The string is of the form ‘M@N’ or, if the base is 10 or less, alternatively ‘MeN’. ‘M’ is the mantissa and ‘N’ is the exponent. The mantissa is always in the specified base. The exponent is either in the specified base or, if base is negative, in decimal. The decimal point expected is taken from the current locale, on systems providing localeconv.
The argument base may be in the ranges 2 to 36, or -36 to -2. Negative values are used to specify that the exponent is in decimal.
Unlike the corresponding mpz function, the base will not be determined from the leading characters of the string if base is 0. This is so that numbers like ‘0.23’ are not interpreted as octal.
Return the number of bytes read, or if an error occurred, return 0.
7.8 Miscellaneous Functions
- Function: void mpf_ceil (mpf_t rop, const mpf_t op)
- Function: void mpf_floor (mpf_t rop, const mpf_t op)
- Function: void mpf_trunc (mpf_t rop, const mpf_t op)
Set rop to op rounded to an integer. mpf_ceil rounds to the next higher integer, mpf_floor to the next lower, and mpf_trunc to the integer towards zero.
- Function: int mpf_integer_p (const mpf_t op)
Return non-zero if op is an integer.
- Function: int mpf_fits_ulong_p (const mpf_t op)
- Function: int mpf_fits_slong_p (const mpf_t op)
- Function: int mpf_fits_uint_p (const mpf_t op)
- Function: int mpf_fits_sint_p (const mpf_t op)
- Function: int mpf_fits_ushort_p (const mpf_t op)
- Function: int mpf_fits_sshort_p (const mpf_t op)
Return non-zero if op would fit in the respective C data type, when truncated to an integer.
- Function: void mpf_urandomb (mpf_t rop, gmp_randstate_t state, mp_bitcnt_t nbits)
Generate a uniformly distributed random float in rop, such that 0 <= rop < 1, with nbits significant bits in the mantissa or less if the precision of rop is smaller.
The variable state must be initialized by calling one of the gmp_randinit functions (Random State Initialization) before invoking this function.
- Function: void mpf_random2 (mpf_t rop, mp_size_t max_size, mp_exp_t exp)
Generate a random float of at most max_size limbs, with long strings of zeros and ones in the binary representation. The exponent of the number is in the interval -exp to exp (in limbs). This function is useful for testing functions and algorithms, since this kind of random number has proven to be more likely to trigger corner-case bugs. Negative random numbers are generated when max_size is negative.