Floating-Point Numbers
A floating-point value is stored as two components:
A number
The location of the radix point (called the decimal point in base 10) within the number.
Floating-point arithmetic lets you handle the very large and very small numbers found in scientific applications.
The term mantissa has been replaced by significand to indicate the number of significant bits in a floating-point number.
Example:
Significand:
Exponent:
Because a floating-point number is defined as the product of two values, a floating-point representation is not unique;
e.g.,
An IEEE-754 floating-point significand is always normalized (unless it is equal to zero).
The significand always begins with a leading 1.
Normalization allows the highest available precision by using all significant bits.
e.g.,
Examples:
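As a rough illustration of normalization, the Python sketch below rescales a nonzero value so that its significand lies in [1, 2), which in binary means it begins with a leading 1. The helper name `normalize` is made up for these notes; it builds on the standard `math.frexp`, which returns a significand in [0.5, 1).

```python
import math

def normalize(x):
    """Return (significand, exponent) with x == significand * 2**exponent
    and 1 <= |significand| < 2. Hypothetical helper; x must be nonzero and finite."""
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("only nonzero finite values can be normalized")
    m, e = math.frexp(x)   # x == m * 2**e with 0.5 <= |m| < 1
    return m * 2, e - 1    # shift one bit left so the significand starts with 1

print(normalize(253.75))   # (1.982421875, 7)
```

The same value could also be written as 0.9912109375 x 2^8 or 3.96484375 x 2^6; normalization simply picks the one form whose significand starts with 1.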
The significand of an IEEE format floating-point number is represented in sign and magnitude form. (A sign-bit indicates positive or negative)
The exponent is represented in a biased form, by adding a constant called a bias to the true exponent.
This is to make the range of the exponent nonnegative.
Examples:
Suppose an 8-bit exponent is used and all exponents are biased by 127.
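A small sketch of biasing, assuming an 8-bit exponent field and a bias of 127 (the value used by the 32-bit format). The names `BIAS`, `to_biased`, and `from_biased` are illustrative only.

```python
BIAS = 127  # assumed: bias for an 8-bit exponent field

def to_biased(true_exponent):
    """Add the bias so the stored exponent is a nonnegative 8-bit value."""
    stored = true_exponent + BIAS
    if not 0 <= stored <= 255:
        raise ValueError("biased exponent does not fit in 8 bits")
    return stored

def from_biased(stored):
    """Recover the true exponent from the stored (biased) value."""
    return stored - BIAS

print(to_biased(7), format(to_biased(7), "08b"))  # 134 10000110
print(from_biased(134))                           # 7
```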
IEEE-754 floating-point in 32-bit uses the following format:
xxxxxxxxxx
11S EEEEEEEE 1.MMMMMMMMMMMMMMMMMMMMMMM
S: Sign bit (1-bit)
E: 8-bit biased exponent (tells you where to put the binary point)
The stored exponent value
Binary range:
Decimal range:
M: 23-bit fractional significand.
The leading "1." is implied and is not stored.
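To see the three fields in practice, the following sketch uses Python's standard `struct` module to reinterpret a 32-bit float as an unsigned integer and then masks out S, E, and M. The function name `float32_fields` is an assumption made for this example.

```python
import struct

def float32_fields(x):
    """Split x (rounded to 32-bit) into (sign, biased exponent, 23-bit fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31               # S: 1 bit
    exponent = (bits >> 23) & 0xFF  # E: 8-bit biased exponent
    fraction = bits & 0x7FFFFF      # M: 23 stored bits, without the implied "1."
    return sign, exponent, fraction

s, e, m = float32_fields(-0.15625)  # -0.15625 = -1.01 (binary) x 2^-3
print(s, format(e, "08b"), format(m, "023b"))
# 1 01111100 01000000000000000000000
```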
IEEE-754 floating-point in 32-bit can represent 2^32 = 4,294,967,296 different bit patterns. These include:
The number 0.0
sign: 0, exponent: 0, mantissa: 0
±∞
Very small denormalized numbers
Various other special conditions
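A few of these special bit patterns can be inspected with a short sketch like the one below. It assumes the standard IEEE-754 single-precision encodings; NaN ("not a number") is one of the other special conditions, and `bits_to_float32` is a helper name invented for this example.

```python
import math
import struct

def bits_to_float32(bits):
    """Interpret a 32-bit unsigned integer as an IEEE-754 single-precision value."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits))
    return value

print(bits_to_float32(0x00000000))              # 0.0  (sign 0, exponent 0, mantissa 0)
print(bits_to_float32(0x7F800000))              # inf  (exponent all ones, mantissa 0)
print(bits_to_float32(0xFF800000))              # -inf
print(bits_to_float32(0x00000001))              # smallest denormalized value, about 1.4e-45
print(math.isnan(bits_to_float32(0x7FC00000)))  # True (exponent all ones, mantissa nonzero)
```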
Overall, the standard allows approximately seven significant decimal digits and an approximate value range of 10^-45 to 10^38.
The 32-bit format is still too limited to express the infinite set of real numbers. For this reason, the following wider formats are defined as well.
For 64-bit
16 significant decimal digits
Decimal range: 10^-300 to 10^300
For 128-bit
Decimal range: 10^-4900 to 10^4900
34 decimal digits
[!] Note: There is a 16-bit format which is extremely limited in both range and precision, but is useful for simple graphics applications.
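The difference in precision between the formats can be observed by rounding the same value to each width. The sketch below relies on the `struct` format codes `e` (16-bit), `f` (32-bit), and `d` (64-bit); the helper `round_trip` is a name chosen for this example, and 128-bit is left out because `struct` has no code for it.

```python
import struct

def round_trip(x, fmt):
    """Round x to the width selected by fmt and return the result as a Python float."""
    (y,) = struct.unpack(fmt, struct.pack(fmt, x))
    return y

print(round_trip(0.1, ">e"))         # 0.0999755859375       (16-bit)
print(round_trip(0.1, ">f"))         # 0.10000000149011612   (32-bit)
print(round_trip(0.1, ">d"))         # 0.1                   (64-bit)
print(round_trip(16777217.0, ">f"))  # 16777216.0, since 2^24 + 1 needs more than 24 significant bits
```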
Converting 253.75(10) to binary floating-point form.
Step 1. Convert 253 and 0.75 to hex first, then to binary.

    253 = FD (hex), 1111 1101 (binary)
    16 * 0.75 = 12.0, 0.C (hex), 0.1100 (binary)
    ∴ 253.75 (decimal) = FD.C (hex) = 1111 1101 . 1100 (binary)

Step 2. Normalize.

    1.111 1101 1100 x 2^7

Step 3. Get the exponent.

    7 + 127 = 134 = 1000 0110 (binary)

Step 4. Put the parts together to form the floating-point format.

    S EEEE EEEE 1.MMMMMMMMMMMMMMMMMMMMMMM
    0 1000 0110   11111011100000000000000   (Do not copy '1.')

Step 5. Regroup this by 4's and convert into hex. (For easy grading!)

    0100 0011 0111 1101 1100 0000 0000 0000

    ∴ 437DC000 (hex)
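The steps of this worked example can be mirrored in a few lines of Python. This is a simplified sketch, not a full encoder: `encode_float32` is a hypothetical helper that handles only nonzero finite values in the normalized range, and it truncates extra fraction bits instead of rounding them.

```python
import math
import struct

def encode_float32(x):
    """Hand-encode a nonzero finite x as a 32-bit pattern.
    Simplified: truncates instead of rounding; no denormalized or overflow cases."""
    sign = 1 if x < 0 else 0
    m, e = math.frexp(abs(x))                  # abs(x) == m * 2**e, 0.5 <= m < 1
    significand, exponent = m * 2, e - 1       # Step 2: normalize to 1.xxx * 2**exponent
    biased = exponent + 127                    # Step 3: add the bias
    fraction = int((significand - 1) * 2**23)  # Step 4: drop the "1.", keep 23 bits
    return (sign << 31) | (biased << 23) | fraction

bits = encode_float32(253.75)
print(format(bits, "08X"))  # 437DC000

# Cross-check against the encoding produced by the struct module.
(reference,) = struct.unpack(">I", struct.pack(">f", 253.75))
print(bits == reference)    # True
```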
To convert a binary floating-point value back to decimal, do the previous example in reverse order!
Just don't forget:
The implied "1." in the mantissa
The exponent is in biased form
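A minimal sketch of the reverse direction, covering both reminders: it restores the implied "1." and removes the bias of 127. The function name `decode_float32` is an assumption for this example, and only normalized patterns are handled (zero, infinities, NaN, and denormalized values are not).

```python
def decode_float32(bits):
    """Hand-decode a normalized 32-bit pattern (special cases are not handled)."""
    sign = bits >> 31
    biased = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    significand = 1 + fraction / 2**23  # restore the implied "1."
    exponent = biased - 127             # remove the bias
    return (-1) ** sign * significand * 2.0 ** exponent

print(decode_float32(0x437DC000))  # 253.75
```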