Summarize NaN issues
Define NaN
IEEE 754 - Special values (NaN, INF, subnormal)
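NaN (Not a Number) is a floating-point encoding for a result that has no meaningful numeric value, such as 0/0 or inf - inf. Specifically, a NaN has every exponent bit set to 1 (the maximum exponent field for the datatype) and a nonzero mantissa. The same exponent field with a zero mantissa encodes +-inf.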
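The bit-level definition above can be checked directly. The following is a minimal sketch for the 32-bit format (8 exponent bits, 23 mantissa bits); the helper names float32_fields and is_nan_by_bits are made up for illustration and are not part of any library.

```python
import math
import struct


def float32_fields(x: float):
    """Return the (sign, exponent, mantissa) bit fields of x encoded as IEEE 754 binary32.

    Hypothetical helper for illustration. Note that struct raises OverflowError
    for finite values too large for binary32; it is used here only with
    NaN, inf, and small finite values.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # 8 exponent bits
    mantissa = bits & 0x7FFFFF       # 23 mantissa bits
    return sign, exponent, mantissa


def is_nan_by_bits(x: float) -> bool:
    """NaN: exponent field is all ones and the mantissa is nonzero."""
    _, exponent, mantissa = float32_fields(x)
    return exponent == 0xFF and mantissa != 0


print(float32_fields(math.inf))    # exponent 255, mantissa 0 -> inf, not NaN
print(is_nan_by_bits(math.nan))    # True
print(is_nan_by_bits(1.0))         # False
```

The same check applies to other formats by changing the field widths (for example 5 exponent bits and 10 mantissa bits for 16-bit half precision).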
How NaN is generated
Example and analysis
- Which NaN and INF cases matter
- pytorch -- nan
- More mixed-precision issues related to NaN in APEX
- Search for NaN in deepstability
- sru
- cumf
Analyze NaN in matrix multiplication
Matrix multiplication (MM) is implemented essentially as a blocked sequence of FMA (fused multiply-add, fma(x, y, z) = x*y + z) operations, so every case that produces NaN in FMA also produces NaN in MM:
fma(+-0, +-inf, z) = NaN
fma(+-inf, +-0, z) = NaN
fma(x, y, -inf) = NaN if x*y = +inf
fma(x, y, +inf) = NaN if x*y = -inf
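The cases listed above can be reproduced with a small sketch. It assumes an unfused x*y + z in place of a real hardware FMA (the rounding differs, but the NaN behaviour for these cases is the same) and a naive, unblocked matmul. The names fma_like and matmul are illustrative only.

```python
import math

INF = math.inf


def fma_like(x: float, y: float, z: float) -> float:
    """Unfused stand-in for fma(x, y, z); assumption: not a true fused operation."""
    return x * y + z


def matmul(a, b):
    """Naive matrix multiply that accumulates each output element with fma_like."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc = fma_like(a[i][p], b[p][j], acc)
            out[i][j] = acc
    return out


# The four FMA cases: 0 * inf, inf * 0, and inf - inf in the accumulation.
print(math.isnan(fma_like(0.0, INF, 1.0)))     # True
print(math.isnan(fma_like(-INF, 0.0, 1.0)))    # True
print(math.isnan(fma_like(INF, 1.0, -INF)))    # True
print(math.isnan(fma_like(-INF, 1.0, INF)))    # True

# A single 0 * inf term poisons the whole output element.
print(matmul([[0.0, 1.0]], [[INF], [2.0]]))    # [[nan]]
```

Once one term of the accumulation is NaN, every later FMA in that dot product stays NaN, so a single bad input pair is enough to corrupt an output element.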
This is why converting the inputs to a lower precision tends to produce NaN: the conversion can easily overflow to +-inf, and that +-inf then hits one of the FMA cases above in the following operations.
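A minimal sketch of this failure mode, assuming the conversion is to 16-bit half precision (largest finite value 65504). The helper to_fp16 is hypothetical: it emulates a round-to-nearest cast with the standard struct module, mapping overflow to +-inf the way a typical hardware or framework cast does.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Hypothetical helper: round x to IEEE 754 binary16 and return it as a Python float."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values that overflow binary16;
        # a round-to-nearest cast would return +-inf instead.
        return math.copysign(math.inf, x)


def dot(row, col):
    """One output element of MM: a chain of multiply-adds."""
    acc = 0.0
    for x, y in zip(row, col):
        acc = x * y + acc
    return acc


row = [70000.0, 1.0]   # 70000 is finite in higher precision but above the fp16 maximum
col = [0.0, 2.0]

full = dot(row, col)
half = dot([to_fp16(v) for v in row], [to_fp16(v) for v in col])

print(full)            # 2.0
print(to_fp16(row[0])) # inf
print(half)            # nan, because inf * 0 appears after the conversion
```

The inputs contain no inf or NaN before the cast. The NaN comes only from the overflow introduced by the conversion, which is then multiplied by zero inside the accumulation.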