NaN_payload: storage of tagged objects in the payload bits of NaN IEEE-754 floating point values.
A.D. Corlan
June 6, 2014.
Archived by WebCite® at
http://www.webcitation.org/
Introduction
Most statistical applications need to store a mixture of floating point
and categorical data, at least to represent the non-availability
of some data and perhaps to record the reasons for such non-availability.
This could be achieved via normal tagged types, as done in the
corlpack package
where 128-bit records are used. Alternatively, one can use the fact
that IEEE-754 floating point objects, which are implemented in most modern
processors, are already tagged types: a single value can hold a floating point
number, +/- infinity, or a NaN (not-a-number) value with an otherwise
unused payload.
Thus, one alternative is to use the NaN payload to store other types, as
presumably intended by the IEEE-754 designers. In a 'double' (64-bit float)
the payload provides 51 bits: the 52 fraction bits minus the one that marks
a quiet NaN.
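As a minimal sketch of the idea (written in Python, not the Ada of the package described further down), the following packs an integer into the 51 payload bits of a quiet NaN and reads it back. The names `nan_with_payload` and `payload_of` are hypothetical, and the sketch assumes that the platform copies doubles without altering NaN bit patterns, which is common but not guaranteed.

```python
import math
import struct

EXP_MASK = 0x7FF0_0000_0000_0000      # 11 exponent bits, all ones for NaN/infinity
QUIET_BIT = 0x0008_0000_0000_0000     # top fraction bit: marks a quiet NaN
PAYLOAD_MASK = QUIET_BIT - 1          # the remaining 51 fraction bits


def nan_with_payload(payload: int) -> float:
    """Return a quiet NaN whose low 51 fraction bits hold `payload`."""
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload must fit in 51 bits")
    bits = EXP_MASK | QUIET_BIT | payload
    # Reinterpret the 64-bit integer as a double without changing any bit.
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def payload_of(x: float) -> int:
    """Return the 51 payload bits of a NaN."""
    if not math.isnan(x):
        raise ValueError("not a NaN")
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits & PAYLOAD_MASK


x = nan_with_payload(12345)
print(math.isnan(x), payload_of(x))
```

The value still behaves as an ordinary NaN in arithmetic and comparisons; only code that inspects the bit pattern sees the payload.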
The advantages of using the NaN payload are:
- They are compatible with existing libraries that deal with floating point
number arrays, as well as data formats (such as netcdf). Even if changes are necessary
to such libraries, they are likely to be localised as the general data structures
need not be changed.
- Memory/storage/bandwidth is used more efficiently.
- They allow for simpler data structures. For example, I can use a vector
of floats even if one or two elements need to contain some non-float information.
I don't have to design some record structure that contains the vector only because of
these values.
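The last point can be illustrated with a small Python sketch: a plain array of doubles in which two entries are NaNs carrying a reason code. The reason codes and the helper names `missing` and `reason` are invented for this example, and the sketch assumes that NaN payloads survive being stored in and read from the array.

```python
import math
import struct
from array import array

QNAN_BITS = 0x7FF8_0000_0000_0000     # exponent all ones plus the quiet bit
PAYLOAD_MASK = (1 << 51) - 1          # the 51 payload bits

# Hypothetical reason codes, chosen only for this example.
REASONS = {1: "not measured", 2: "below detection limit"}


def missing(code: int) -> float:
    """Build a quiet NaN carrying a reason code in its payload."""
    return struct.unpack("<d", struct.pack("<Q", QNAN_BITS | code))[0]


def reason(x: float) -> str:
    """Look up the reason code stored in a NaN's payload."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return REASONS[bits & PAYLOAD_MASK]


# An ordinary array of C doubles; two entries are tagged NaNs.
data = array("d", [1.5, missing(1), 2.5, missing(2)])

present = [v for v in data if not math.isnan(v)]
mean = sum(present) / len(present)
why_missing = [reason(v) for v in data if math.isnan(v)]
print(mean, why_missing)
```

No separate record or mask array is needed: the same eight bytes per element hold either a measurement or the reason it is absent.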
The disadvantages are the limited number of bits, which makes them
unusable for some applications (for example, I can't store more than
about 9 letters from a selected set), and the fact that the behaviour of
machines and libraries that process NaN values is poorly documented and
may vary, which limits portability.
nan_payload_64
This is a first, experimental implementation, as an Ada package, of selectors,
predicates and in/out functions for the payload of IEEE-754 64-bit floating
point types. The 51 bits are treated as a 3-bit tag and a 48-bit
data field. Two alternative types are provided so far: symbols
of up to 9 characters from a limited alphanumeric set and calendar
dates with millisecond resolution. For more details see the code.
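The layout can be sketched in Python as below. This is an illustration under stated assumptions, not the code of nan_payload_64: the position of the tag (the top 3 payload bits), the tag values, the 40-character set, the Unix epoch and all function names are hypothetical. A 40-character set is used because 40**9 is just below 2**48, so 9 such characters fit in the data field.

```python
import struct
from datetime import datetime, timedelta, timezone

QNAN_BITS = 0x7FF8_0000_0000_0000   # exponent all ones plus the quiet bit
DATA_BITS = 48
DATA_MASK = (1 << DATA_BITS) - 1
TAG_SYMBOL, TAG_DATE = 1, 2         # hypothetical tag values (3 bits: 0..7)

# Hypothetical 40-character set; 40**9 < 2**48, so 9 characters fit.
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-."
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)   # assumed epoch


def pack(tag: int, data: int) -> float:
    """Store a 3-bit tag and a 48-bit data field in a quiet NaN."""
    if not (0 <= tag < 8 and 0 <= data <= DATA_MASK):
        raise ValueError("tag or data out of range")
    bits = QNAN_BITS | (tag << DATA_BITS) | data
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def unpack(x: float):
    """Return (tag, data) from a NaN built by pack()."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return (bits >> DATA_BITS) & 0b111, bits & DATA_MASK


def symbol_to_float(name: str) -> float:
    """Encode up to 9 characters as a base-40 number in the data field."""
    if len(name) > 9:
        raise ValueError("at most 9 characters")
    value = 0
    for ch in name.ljust(9):                 # pad with spaces (digit 0)
        value = value * len(ALPHABET) + ALPHABET.index(ch)
    return pack(TAG_SYMBOL, value)


def float_to_symbol(x: float) -> str:
    """Decode a symbol, dropping the trailing padding spaces."""
    tag, value = unpack(x)
    if tag != TAG_SYMBOL:
        raise ValueError("not a symbol")
    chars = []
    for _ in range(9):                       # least significant digit first
        value, digit = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars)).rstrip()


def date_to_float(t: datetime) -> float:
    """Encode a timezone-aware datetime as whole milliseconds since EPOCH."""
    ms = (t - EPOCH) // timedelta(milliseconds=1)
    return pack(TAG_DATE, ms)


def float_to_date(x: float) -> datetime:
    """Decode a date stored by date_to_float()."""
    tag, ms = unpack(x)
    if tag != TAG_DATE:
        raise ValueError("not a date")
    return EPOCH + timedelta(milliseconds=ms)


s = symbol_to_float("MISSING")
d = date_to_float(datetime(2014, 6, 6, 12, 0, 0, 250000, tzinfo=timezone.utc))
print(float_to_symbol(s), float_to_date(d).isoformat())
```

With this choice of epoch, 48 bits of milliseconds cover roughly 8900 years, so the data field is not the limiting factor for calendar dates; the character set is what constrains symbols.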
Use gnatchop and gnatmake to compile the package.
Download nan_payload_64 version 0.1