IEEE Notes - Sect. 5

Double precision (64-bit) representation

The 64-bit (double precision) IEEE floating-point representation is laid out as

`s eeeeeeeeeee ffff ffffffff....ffffffff`

in the IEEE 754 standard. There are Ne = **11** bits in the exponent
field and Nf = **52** bits for the fractional part of the mantissa.

We store *bias*+**p** in the exponent field, where **p** is the power of 2;
the bias is 01111111111 (binary) = 3FF (hex) = 1023 (decimal).
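To make the layout concrete, here is a minimal sketch that splits a double into its three fields and recovers the power **p** by subtracting the bias. The helper name `decompose` is hypothetical (it is not part of these notes or of any library), and the sketch assumes that a Python `float` is an IEEE 754 double, which is true on essentially all current platforms.

```python
import struct

BIAS = 0x3FF  # 1023

def decompose(x: float):
    """Return (sign, exponent field, fraction field) of an IEEE 754 double."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                      # 1 bit
    exponent = (bits >> 52) & 0x7FF        # 11 bits, holds bias + p
    fraction = bits & ((1 << 52) - 1)      # 52 bits after the hidden bit
    return sign, exponent, fraction

# 6.5 = 1.101 (binary) x 2^2, so we expect p = 2.
sign, exponent, fraction = decompose(6.5)
p = exponent - BIAS
print(sign, exponent, p, hex(fraction))
```

For 6.5 the exponent field holds 1025 = 1023 + 2, and the fraction field starts with the bits 101 that follow the hidden leading 1.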

To allow for the representation of special values (0, Inf, NaN) as
described in Section 4, two bit patterns of the exponent field
(all zeros and all ones) are reserved, which limits the power **p** to the range [-1022, 1023].
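The reserved exponent patterns can be inspected directly. The sketch below assumes a Python `float` is an IEEE 754 double; the helper `exponent_field` is a hypothetical name introduced only for this illustration.

```python
import struct
import sys

def exponent_field(x: float) -> int:
    """Return the raw 11-bit exponent field of an IEEE 754 double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return (bits >> 52) & 0x7FF

# Reserved patterns: all zeros (0) and all ones (2047).
print(exponent_field(0.0))             # all zeros
print(exponent_field(float("inf")))    # all ones
print(exponent_field(float("nan")))    # all ones

# Ordinary numbers use fields 1..2046, that is p = -1022..1023.
print(exponent_field(2.0 ** -1022) - 1023)        # smallest p
print(exponent_field(sys.float_info.max) - 1023)  # largest p
```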

Since the mantissa has a total of 53 bits (when you count the
hidden bit) and is rounded, the magnitude of the relative
error in a number is bounded by 2^{-53} = 1.11... x 10^{-16}.

This means we get almost 16 decimal digits of precision.
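The error bound can be checked numerically. This is a small sketch, assuming a Python `float` is an IEEE 754 double with the default round-to-nearest mode; it uses exact rational arithmetic from the standard `fractions` module to measure the rounding error made when the decimal 0.1 is stored.

```python
from fractions import Fraction

u = 2.0 ** -53                 # bound on the relative rounding error
print(u)                       # 1.1102230246251565e-16

# Exact relative error made when 1/10 is rounded to the nearest double.
stored = Fraction(0.1)         # exact value of the stored double
exact = Fraction(1, 10)
rel_err = abs(stored - exact) / exact
print(float(rel_err), rel_err <= Fraction(1, 2 ** 53))

# Adding u to 1 is lost in rounding; adding 2u is not.
print(1.0 + u == 1.0)          # True
print(1.0 + 2 * u > 1.0)       # True
```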

(The largest possible mantissa, read as a 53-bit integer, is M = 2^{53} - 1 = 9.007... x 10^{15},
which has 15+ digits of precision.)
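One way to see the 53-bit limit is to look at integers near 2^{53}. The sketch below assumes a Python `float` is an IEEE 754 double; Python integers are exact, so comparing an integer with its converted value shows whether the conversion lost anything.

```python
M = 2 ** 53
print(M)                         # 9007199254740992, a 16-digit integer

# Every integer up to 2^53 fits in the 53-bit mantissa exactly.
print(float(M - 1) == M - 1)     # True

# 2^53 + 1 would need 54 bits, so it is rounded to a neighbour.
print(float(M + 1) == M + 1)     # False
```

So some, but not all, 16-digit integers survive the conversion, which is what "15+ digits" means in practice.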

The largest positive number that can be stored is

1.11111....11111 x 2^{1023} = 1.797693... x 10^{308}.

Notice that 1.11111....11111 = 2 - 2^{-52}.

Also note that log_{10}(largest) = 308.2547...
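These values are easy to reproduce. A minimal sketch, assuming a Python `float` is an IEEE 754 double, builds the largest number from the formula above and compares it with the value Python reports in `sys.float_info.max`.

```python
import math
import sys

# (2 - 2^-52) x 2^1023; both factors and the product are exact doubles.
largest = (2.0 - 2.0 ** -52) * 2.0 ** 1023
print(largest)                          # 1.7976931348623157e+308
print(largest == sys.float_info.max)    # True
print(math.log10(largest))              # about 308.2547

# One more doubling overflows to Inf.
print(largest * 2.0)                    # inf
```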

The smallest positive normalized number is

1.00000...00000 x 2^{-1022} = 2.225074... x 10^{-308}.

Note that log_{10}(smallest) = -307.6526...
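The same check works at the small end of the range. This sketch assumes a Python `float` is an IEEE 754 double; `sys.float_info.min` is Python's name for the smallest positive normalized double.

```python
import math
import sys

smallest = 2.0 ** -1022                  # mantissa 1.000...0, power p = -1022
print(smallest)                          # 2.2250738585072014e-308
print(smallest == sys.float_info.min)    # True
print(math.log10(smallest))              # about -307.6526
```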

**Notice:**

When we go to double precision using the IEEE 754 floating-point
standard, we gain more than a factor of 2 in the precision of
the mantissa *and* we gain a huge factor in the size of
numbers we can work with before encountering an overflow condition.
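Both gains can be seen by forcing a value through single precision and back. This is only a sketch: it relies on two facts about single precision that are not restated in this section (a 24-bit mantissa including the hidden bit, and a largest value near 3.4 x 10^{38}), and the helper `to_single` is a hypothetical name. It uses the standard `struct` module, whose `"f"` format stores a 32-bit IEEE single.

```python
import struct

def to_single(x: float) -> float:
    """Round a double to single precision and convert it back."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

# Precision: the single-precision copy of 0.1 is off by about 1.5e-8,
# far more than the double-precision bound 2^-53.
x = 0.1
rel_err_single = abs(to_single(x) - x) / x
print(rel_err_single, rel_err_single <= 2.0 ** -24)

# Range: 1e39 is an ordinary double but does not fit in single precision.
try:
    to_single(1e39)
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)
```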

This material is © Copyright 1996, by James Carr. FSU students enrolled in MAD-3401 have permission to make personal copies of this document for use when studying. Other academic users may link to this page but may not copy or redistribute the material without the author's permission.
