Question 1

Why can't 0.1 be represented exactly in floating point?

Accepted Answer

In binary, the decimal 0.1 is an infinitely repeating fraction (0.000110011001100...), just as 1/3 is a repeating decimal (0.333...). Because a float has a fixed number of bits, the fraction must be rounded to the nearest representable value, which introduces a tiny error. This is why 0.1 + 0.2 in most languages evaluates to 0.30000000000000004 rather than exactly 0.3.
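As a quick check of the explanation above, the following Python sketch prints the value that is actually stored for 0.1. It assumes Python's float is an IEEE 754 double, which is the case on mainstream platforms; the variable names are only illustrative.

```python
from decimal import Decimal
from fractions import Fraction

# Decimal(float) converts the stored binary value exactly, with no extra rounding.
stored_tenth = Decimal(0.1)
print(stored_tenth)
# 0.1000000000000000055511151231257827021181583404541015625

# The stored value is a fraction whose denominator is a power of two (2**55),
# so it can only approximate 1/10.
tenth_as_fraction = Fraction(0.1)
print(tenth_as_fraction.denominator == 2**55)  # True

# The rounding errors in 0.1 and 0.2 do not cancel when the values are added.
total = 0.1 + 0.2
print(total)         # 0.30000000000000004
print(total == 0.3)  # False
```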

Question 2

What is the bias in the exponent field?

Accepted Answer

The bias is a fixed offset added to the actual exponent before it is stored. Single precision uses a bias of 127 and double precision uses 1023. For example, if the actual exponent is 3, the stored value is 3 + 127 = 130 in single precision. This biased representation lets normalized exponents range from −126 to +127 (single) or −1022 to +1023 (double) while storing only an unsigned integer. The all-zeros and all-ones stored values are reserved: the first for zero and denormalized numbers, the second for infinity and NaN.
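The stored exponent can be read straight out of the bit pattern. This is a minimal sketch using Python's standard struct module; the helper names stored_exponent_single and stored_exponent_double are made up for this example.

```python
import struct

def stored_exponent_single(x):
    """Return the 8-bit biased exponent field of x encoded as single precision."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return (bits >> 23) & 0xFF  # skip the 23 fraction bits, keep 8 exponent bits

def stored_exponent_double(x):
    """Return the 11-bit biased exponent field of x encoded as double precision."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return (bits >> 52) & 0x7FF  # skip the 52 fraction bits, keep 11 exponent bits

# 8.0 = 1.0 x 2**3, so the actual exponent is 3.
print(stored_exponent_single(8.0))         # 130  (3 + 127)
print(stored_exponent_single(8.0) - 127)   # 3
print(stored_exponent_double(8.0))         # 1026 (3 + 1023)
print(stored_exponent_double(8.0) - 1023)  # 3
```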

Question 3

What is the hidden bit?

Accepted Answer

For normalized floating-point numbers, the leading bit of the mantissa is always 1, so it is not stored; it is implied. This 'hidden' or 'implicit' bit effectively gives single precision 24 bits of mantissa precision (23 stored + 1 hidden) and double precision 53 bits (52 stored + 1 hidden). Denormalized (subnormal) numbers near zero, which have a stored exponent of 0, have an implicit leading 0 instead.
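To see the hidden bit at work, here is a rough sketch that rebuilds a single-precision value from its stored fields, adding the implicit 1 for normalized numbers and the implicit 0 for denormalized ones. The function name decode_single is hypothetical, and the sketch handles finite values only.

```python
import struct

def decode_single(x):
    """Rebuild a finite single-precision value from its sign, exponent and fraction fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = -1.0 if bits >> 31 else 1.0
    stored_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF  # the 23 stored mantissa bits

    if stored_exponent == 0xFF:
        raise ValueError("infinity and NaN are not handled in this sketch")
    if stored_exponent == 0:
        # Denormalized: implicit leading 0, exponent fixed at -126.
        significand = fraction / 2**23
        exponent = -126
    else:
        # Normalized: implicit leading 1 is added back here.
        significand = 1 + fraction / 2**23
        exponent = stored_exponent - 127
    return sign * significand * 2.0**exponent

print(decode_single(0.15625))     # 0.15625 (binary 1.01 x 2**-3)
print(decode_single(-6.5))        # -6.5
print(decode_single(2.0**-149) == 2.0**-149)  # True: smallest positive denormalized single
```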

Question 4

What are special IEEE 754 values?

Accepted Answer

IEEE 754 defines several special values: positive and negative zero (distinguished by the sign bit), positive and negative infinity (all exponent bits set, all mantissa bits zero), and NaN — Not a Number (all exponent bits set, at least one mantissa bit set). These allow graceful handling of overflow, division by zero, and undefined operations without crashing programs.
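The bit patterns described above can be inspected with a short sketch like the one below. The helper single_bit_string is an illustrative name, and the behaviour shown is Python's; other languages may differ in detail (for instance, Python raises ZeroDivisionError for 1.0 / 0.0 instead of returning infinity).

```python
import math
import struct

def single_bit_string(x):
    """Format x as single precision: 'sign exponent fraction'."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = format(bits, "032b")
    return f"{s[0]} {s[1:9]} {s[9:]}"

print(single_bit_string(0.0))        # 0 00000000 00000000000000000000000
print(single_bit_string(-0.0))       # 1 00000000 00000000000000000000000
print(single_bit_string(math.inf))   # 0 11111111 00000000000000000000000
print(single_bit_string(-math.inf))  # 1 11111111 00000000000000000000000

# NaN: exponent all ones, at least one fraction bit set (exact bits may vary).
nan_sign, nan_exponent, nan_fraction = single_bit_string(math.nan).split()
print(nan_exponent == "11111111" and "1" in nan_fraction)  # True

overflowed = 1e308 * 10            # too large for a double: becomes infinity
undefined = math.inf - math.inf    # no meaningful result: becomes NaN
print(overflowed == math.inf)      # True
print(math.isnan(undefined))       # True
print(undefined == undefined)      # False: NaN is not equal to anything, even itself
print(0.0 == -0.0)                 # True: the two zeros compare equal
print(math.copysign(1.0, -0.0))    # -1.0: but the sign bit is still there
```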

Question 5

When should I use single versus double precision?

Accepted Answer

Use double precision (64-bit) for scientific computing, financial calculations, and any application requiring more than about 7 significant decimal digits of accuracy; double precision provides roughly 15 to 16. Use single precision (32-bit) when memory or performance is constrained: many GPUs process single-precision floats much faster, and mobile/embedded systems often prefer 32-bit for efficiency. Keep in mind that rounding errors in single precision can accumulate significantly in iterative algorithms.
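The difference in accuracy is easy to observe. The sketch below simulates single precision in plain Python by rounding through the struct module after every operation; the helper to_single is an assumption of this example, not a standard function, and a real program would normally use a native 32-bit float type.

```python
import struct

def to_single(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

# Single precision has 24 bits of mantissa, so 2**24 + 1 cannot be stored.
print(to_single(16777217.0))  # 16777216.0

# Add 0.1 ten thousand times; the exact answer would be 1000.
single_step = to_single(0.1)
single_total = 0.0
double_total = 0.0
for _ in range(10_000):
    single_total = to_single(single_total + single_step)  # round after each add
    double_total += 0.1

single_error = abs(single_total - 1000.0)
double_error = abs(double_total - 1000.0)
print(single_total, single_error)
print(double_total, double_error)
print(single_error > double_error)  # True
```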

Question 6

How do I avoid floating-point precision errors in my code?

Accepted Answer

For equality comparisons, use a tolerance (epsilon) rather than exact equality: |a − b| < 1e-9 instead of a === b. Choose the tolerance to match the magnitude of your values; a relative tolerance is more robust than a fixed one when values are very large or very small. For financial calculations, consider using integer arithmetic (e.g., store amounts in cents) or a dedicated decimal library. For scientific computing, use compensated summation algorithms like Kahan summation to reduce accumulated rounding errors in large sums.
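The three techniques above might look like this in Python. This is a sketch rather than a definitive implementation: math.isclose, math.fsum and decimal.Decimal are standard library features, while kahan_sum and naive_sum are illustrative helpers written for this example.

```python
import math
from decimal import Decimal

# 1. Compare with a tolerance instead of exact equality.
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True (default relative tolerance is 1e-9)

# 2. Money: use integer cents or a decimal type built from strings.
total_cents = 10 + 20
print(total_cents == 30)                                     # True
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))  # True

# 3. Kahan summation: carry the rounding error of each addition forward.
def kahan_sum(values):
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for value in values:
        adjusted = value - compensation
        new_total = total + adjusted
        compensation = (new_total - total) - adjusted
        total = new_total
    return total

def naive_sum(values):
    total = 0.0
    for value in values:
        total += value
    return total

values = [0.1] * 100_000
reference = math.fsum(values)  # correctly rounded sum, used as the baseline
naive_error = abs(naive_sum(values) - reference)
kahan_error = abs(kahan_sum(values) - reference)
print(kahan_error < naive_error)  # True
```

The explicit naive_sum loop is used instead of the built-in sum() because recent Python versions already apply a compensated algorithm inside sum() for floats, which would hide the effect.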

Decimal Input	Precision & Notes	Significance
3.141592653589793 (Double)	Sign: 0 · Exp: 1 · Exact digits: ~15	π — irrational, stored with tiny rounding error
0.1 (Single)	Sign: 0 · Stored: 0.100000001490116 · Error: ~1.49e-9	Classic rounding error example
2.718281828459045 (Double)	Sign: 0 · Exp: 1 · Exact digits: ~15	Euler's number e
1.23e-10 (Single)	Sign: 0 · Normalized · Small positive value	Tests small-number precision in single format
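
The table entries can be reproduced with a few lines of Python. This is only a sketch: to_single and single_exponent_field are hypothetical helpers that round through the struct module, and math.frexp returns a mantissa in [0.5, 1), so 1 is subtracted to get the IEEE-style exponent.

```python
import math
import struct
from decimal import Decimal

def to_single(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

def single_exponent_field(x):
    """Return the stored (biased) exponent of x in single precision."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return (bits >> 23) & 0xFF

# 0.1 (Single): exact stored value and its distance from the double 0.1
stored_tenth = to_single(0.1)
print(Decimal(stored_tenth))  # 0.100000001490116119384765625
tenth_error = stored_tenth - 0.1
print(tenth_error)            # about 1.49e-9

# pi and e (Double): both lie in [2, 4), so the exponent is 1
pi_exponent = math.frexp(math.pi)[1] - 1
e_exponent = math.frexp(math.e)[1] - 1
print(pi_exponent, e_exponent)  # 1 1

# 1.23e-10 (Single): normalized, because the exponent field is neither 0 nor 255
small_field = single_exponent_field(1.23e-10)
print(0 < small_field < 255)  # True
```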
