long double


In C and related programming languages, long double refers to a floating-point data type that is often more precise than double precision. As with C's other floating-point types, it does not necessarily map to an IEEE format.

`long double` in C

History

The long double type was present in the original 1989 C standard,^[1] but support was improved by the 1999 revision of the C standard, or C99, which extended the standard library to include functions operating on long double such as sinl() and strtold().

Long double constants are floating-point constants suffixed with "L" or "l" (lower-case L), e.g., 0.333333333333333333L. Without a suffix, a floating-point constant has type double, and the precision to which it is evaluated depends on FLT_EVAL_METHOD.

Implementations

On the x86 architecture, most C compilers implement long double as the 80-bit extended precision type supported by x86 hardware (sometimes stored as 12 or 16 bytes to maintain data structure alignment), as specified in Annex F (IEC 60559 floating-point arithmetic) of the C99 and C11 standards. An exception is Microsoft Visual C++ for x86, which makes long double a synonym for double.^[2] The Intel C++ compiler on Microsoft Windows supports extended precision, but requires the /Qlong-double switch for long double to correspond to the hardware's extended precision format.^[3]

Compilers may also use long double for a 128-bit quadruple precision format. This is the case on HP-UX^[4] and on Solaris/SPARC^[5] machines. This format is currently implemented in software due to lack of hardware support.

On some PowerPC and SPARCv9 machines, long double is implemented as double-double arithmetic, where a long double value is regarded as the exact sum of two double-precision values, giving at least 106 bits of precision; with such a format, the long double type does not conform to the IEEE floating-point standard. On platforms that use none of these formats, long double is simply a synonym for double (double precision).

With the GNU C Compiler, long double is 80-bit extended precision on x86 processors regardless of the physical storage used for the type (which can be either 96 or 128 bits).^[6] On some other architectures, long double can be double-double (e.g. on PowerPC^[7]^[8]^[9]) or 128-bit quadruple precision (e.g. on SPARC^[10]). As of gcc 4.3, quadruple precision is also supported on x86, but as the nonstandard type __float128 rather than long double.^[11]

Although the x87 floating-point instructions of the x86 architecture support 80-bit extended-precision operations, the processor can be configured to round operations automatically to double (or even single) precision. Conversely, in extended-precision mode, extended precision may be used for intermediate compiler-generated calculations even when the final results are stored at a lower precision (i.e. FLT_EVAL_METHOD == 2).

With gcc on Linux, 80-bit extended precision is the default; on several BSD operating systems (FreeBSD and OpenBSD), double-precision mode is the default, and long double operations are effectively reduced to double precision.^[12] (NetBSD 7.0 and later, however, default to 80-bit extended precision.^[13]) This default can be overridden within an individual program via the FLDCW "floating-point load control-word" instruction.^[12] On x86_64, the BSDs default to 80-bit extended precision. Microsoft Windows with Visual C++ also sets the processor to double-precision mode by default, but this can again be overridden within an individual program (e.g. by the _controlfp_s function in Visual C++^[14]). The Intel C++ Compiler for x86, on the other hand, enables extended-precision mode by default.^[15] On OS X, long double is 80-bit extended precision.^[16]

Other specifications

In CORBA (as of the 3.0 specification, which uses "ANSI/IEEE Standard 754-1985" as its reference), "the long double data type represents an IEEE double-extended floating-point number, which has an exponent of at least 15 bits in length and a signed fraction of at least 64 bits". GIOP/IIOP CDR, whose floating-point types "exactly follow the IEEE standard formats for floating point numbers", marshals this type in what appears to be the IEEE 754-2008 binary128 (quadruple-precision) format, without using that name.

References

↑ ANSI/ISO 9899-1990 American National Standard for Programming Languages - C, section 6.1.2.5
↑ MSDN homepage, about Visual C++ compiler
↑ Intel Developer Site
↑ Hewlett Packard (1992). "Porting C Programs". HP-UX Portability Guide - HP 9000 Computers (PDF) (2nd ed.). pp. 5–3 and 5–37.
↑ Sun Numerical Computation Guide, Chapter 2: IEEE Arithmetic
↑ Using the GNU Compiler Collection, i386 and x86-64 Options.
↑ Using the GNU Compiler Collection, RS/6000 and PowerPC Options
↑ Inside Macintosh - PowerPC Numerics
↑ 128-bit long double support routines for Darwin
↑ SPARC Options
↑ GCC 4.3 Release Notes
↑ Brian J. Gough and Richard M. Stallman, An Introduction to GCC, section 8.6 Floating-point issues (Network Theory Ltd., 2004).
↑ "Significant changes from NetBSD 6.0 to 7.0".
↑ _controlfp_s, Microsoft Developer Network (2/25/2011).
↑ Intel C++ Compiler Documentation, Using the -fp-model (/fp) Option.
↑ https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/LowLevelABI/130-IA-32_Function_Calling_Conventions/IA32.html

This article is issued from Wikipedia - version of the Thursday, August 13, 2015. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.