Collectives on Stack Overflow. Learn more. How are floating point numbers stored in memory? Ask Question. Asked 10 years, 1 month ago. Active 1 year, 9 months ago. Viewed 51k times. I've read that they're stored in the form of mantissa and exponent I've read this document but I could not understand anything. Improve this question. Sid S 5, 2 2 gold badges 13 13 silver badges 23 23 bronze badges. Jitendra Jitendra 4, 4 4 gold badges 42 42 silver badges 63 63 bronze badges.
The document you linked to explains it rather clearly. What specifically do you find hard to understand? It explains how the exponent is stored AFTER introducing a problem that needs this explanation But what if the number is zero?
Oh dear. It's like those crime stories where the trick is that they didn't show you all the informations but the main character in the story knows them all. FWIW, this is pretty helpful with understanding when paired with the accepted answer: h-schmidt. Add a comment. Active Oldest Votes. Clearly, using only 32 bits, it's not possible to store every digit in such numbers.
So: 1. So, what is needed to encode this, as efficiently as possible? What do we really need? The sign of the expression. The exponent The value in the range 1. The floating-point package never uses a denormalized form unless the exponent becomes less than the minimum that can be represented in a normalized form.
The following table shows the minimum and maximum values you can store in variables of each floating-point type. The values listed in this table apply only to normalized floating-point numbers; denormalized floating-point numbers have a smaller minimum value. Note that numbers retained in 80 x 87 registers are always represented in bit normalized form; numbers can only be represented in denormalized form when stored in bit or bit floating-point variables variables of type float and type long.
If precision is less of a concern than storage, consider using type float for floating-point variables. Conversely, if precision is the most important criterion, use type double.
Floating-point variables can be promoted to a type of greater significance from type float to type double. Promotion often occurs when you perform arithmetic on floating-point variables.
This arithmetic is always done in as high a degree of precision as the variable with the highest degree of precision. For example, consider the following type declarations:. In the following example which uses the declarations from the preceding example , the arithmetic is done in float bit precision on the variables; the result is then promoted to type double:. Adding 1 and the binary point to the beginning of the mantissa gives the following value:.
To adjust the mantissa for the exponent, move the decimal point to the left for negative exponent values or right for positive exponent values. Since the exponent is three, the mantissa is adjusted as follows:. The result is a binary floating-point number. Binary digits to the left of the decimal point represent the power of two corresponding to their position. Binary digits to the right of the decimal point also represent the power of two corresponding to their position.
However, the powers are negative. For example,. But be aware that there are values that cannot be represented exactly no matter how many bits you use. Since values are approximate, calculations with them are also approximate, and rounding errors accumulate.
The number of bits in the exponent determines the range the minimum and maximum values you can represent. But as you move towards your minimum and maximum values, the size of the gap between representable values increases. That is, if you can't exactly represent values between 0.
Be careful when multiplying very large in terms of magnitude numbers by very small numbers. First, you should understand engineering notation, which has a fixed-precision factor and an integer exponent: 1 is 1. This allows for very short notation of large numbers. One billion is 1. The factor before the E is usually notated as a fixed-precision number: 1. Also note that the factor never needs a leading zero. Instead, the exponent can be decremented until that is no longer the case.
Significant differences to base engineering notation is that of course now the exponent has base 2. The exact size of each part depends on the exact floating-point standard you are using. Sign up to join this community.
The best answers are voted up and rise to the top. Stack Overflow for Teams — Collaborate and share knowledge with a private group. Create a free Team What is Teams?
0コメント