While importing a CSV dataset of Covid data for France, I noticed that some numeric values are displayed in the data editor as "1.#QNAN". This is a quiet NaN, a "not a number" value defined by the IEEE 754 standard for floating-point numbers. If I remember correctly, NaNs are encoded in double precision with a bit pattern (all exponent bits set, nonzero fraction) that cannot represent a "normal" number (unlike Stata's missing values, which are encoded as "regular" large values).
Here is an example CSV that reproduces the problem:
Code:
id,n
1,10
2,.
3,NaN
Code:
import delim nan.csv, clear
list
di n[3]
sca x=n[3]
di x
- -list- prints 1.#QNAN as expected.
- -di n[3]- prints . (that is, a missing).
- The scalar is also a missing.
So when it is displayed or copied to a scalar, the NaN is converted to a missing. But if I run -list if mi(n)-, I get only the second row. So the NaN stored in the variable is not considered a missing!
How, then, am I supposed to filter on NaN values in the dataset (for instance, to replace them, maybe by... a missing)?
In R, there is the -is.nan- function, and in C there is -isnan-, but in Stata I don't know of any equivalent. Is there a way to check whether a value is a NaN in an -if- clause?
Afterthought
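There is a way: -list if mi(n+0) & !mi(n)-. It works because n+0 evaluates to a missing for a NaN, while n itself does not test as missing. But it's not clean. Is there anything better?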
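For reference, here is a minimal sketch of how that workaround could be used to flag the NaN rows and overwrite them with an ordinary missing. It assumes the behavior described above (the NaN survives -import delimited- and n+0 evaluates to a missing), which may differ in other Stata versions. The variable name isnan is only an illustration, and the first lines just recreate the example file.

```stata
* Recreate the example file (same contents as the CSV above)
tempname fh
file open `fh' using nan.csv, write text replace
file write `fh' "id,n" _n "1,10" _n "2,." _n "3,NaN" _n
file close `fh'

import delimited nan.csv, clear

* Flag NaN rows: n itself does not test as missing, but n+0 does
* (assumption: this relies on the behavior observed above)
generate byte isnan = mi(n+0) & !mi(n)
list

* Overwrite the NaN values with an ordinary system missing
replace n = . if isnan

* The former NaN rows should now be caught by mi()
list if mi(n)
```

Keeping the flag in its own variable makes it possible to check which rows were affected before anything is overwritten.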