Commit 7c4e409

Issue #11734: Add support for IEEE 754 half-precision floats to the struct module. Original patch by Eli Stevens.
1 parent 2500c98 commit 7c4e409

7 files changed

Lines changed: 393 additions & 11 deletions

Doc/library/struct.rst

Lines changed: 20 additions & 3 deletions
@@ -216,6 +216,8 @@ platform-dependent.
 +--------+--------------------------+--------------------+----------------+------------+
 | ``N``  | :c:type:`size_t`         | integer            |                | \(4)       |
 +--------+--------------------------+--------------------+----------------+------------+
+| ``e``  | \(7)                     | float              | 2              | \(5)       |
++--------+--------------------------+--------------------+----------------+------------+
 | ``f``  | :c:type:`float`          | float              | 4              | \(5)       |
 +--------+--------------------------+--------------------+----------------+------------+
 | ``d``  | :c:type:`double`         | float              | 8              | \(5)       |
@@ -257,9 +259,10 @@ Notes:
    fits your application.
 
 (5)
-   For the ``'f'`` and ``'d'`` conversion codes, the packed representation uses
-   the IEEE 754 binary32 (for ``'f'``) or binary64 (for ``'d'``) format,
-   regardless of the floating-point format used by the platform.
+   For the ``'f'``, ``'d'`` and ``'e'`` conversion codes, the packed
+   representation uses the IEEE 754 binary32, binary64 or binary16 format (for
+   ``'f'``, ``'d'`` or ``'e'`` respectively), regardless of the floating-point
+   format used by the platform.
 
 (6)
    The ``'P'`` format character is only available for the native byte ordering
@@ -268,6 +271,16 @@ Notes:
    on the host system. The struct module does not interpret this as native
    ordering, so the ``'P'`` format is not available.
 
+(7)
+   The IEEE 754 binary16 "half precision" type was introduced in the 2008
+   revision of the `IEEE 754 standard <ieee 754 standard_>`_. It has a sign
+   bit, a 5-bit exponent and 11-bit precision (with 10 bits explicitly stored),
+   and can represent numbers between approximately ``6.1e-05`` and ``6.5e+04``
+   at full precision. This type is not widely supported by C compilers: on a
+   typical machine, an unsigned short can be used for storage, but not for math
+   operations. See the Wikipedia page on the `half-precision floating-point
+   format <half precision format_>`_ for more information.
+
 
 A format character may be preceded by an integral repeat count. For example,
 the format string ``'4h'`` means exactly the same as ``'hhhh'``.
@@ -430,3 +443,7 @@ The :mod:`struct` module also defines the following type:
       The calculated size of the struct (and hence of the bytes object produced
       by the :meth:`pack` method) corresponding to :attr:`format`.
 
+
+.. _half precision format: https://en.wikipedia.org/wiki/Half-precision_floating-point_format
+
+.. _ieee 754 standard: https://en.wikipedia.org/wiki/IEEE_floating_point#IEEE_754-2008
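
As a quick check of note (7) in the documentation change above, the following sketch packs and unpacks a few values with the ``'e'`` code. It assumes Python 3.6 or later, where this format code is available; the variable names are illustrative and not part of the commit.

```python
import struct

# In standard (non-native) modes, 'e' always occupies two bytes.
size = struct.calcsize('<e')

# Largest finite value and smallest positive normal value of binary16,
# using the little-endian byte strings from the commit's tests.
largest = struct.unpack('<e', b'\xff\x7b')[0]
smallest_normal = struct.unpack('<e', b'\x00\x04')[0]

# Only 11 bits of precision are kept, so 1/3 is stored approximately.
third = struct.unpack('<e', struct.pack('<e', 1 / 3))[0]

print(size, largest, smallest_normal, third)
```

The two limits correspond to the "approximately ``6.1e-05`` and ``6.5e+04``" range quoted in the note.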

Include/floatobject.h

Lines changed: 6 additions & 4 deletions
@@ -74,16 +74,17 @@ PyAPI_FUNC(double) PyFloat_AsDouble(PyObject *);
  * happens in such cases is partly accidental (alas).
  */
 
-/* The pack routines write 4 or 8 bytes, starting at p. le is a bool
+/* The pack routines write 2, 4 or 8 bytes, starting at p. le is a bool
  * argument, true if you want the string in little-endian format (exponent
- * last, at p+3 or p+7), false if you want big-endian format (exponent
+ * last, at p+1, p+3 or p+7), false if you want big-endian format (exponent
  * first, at p).
  * Return value: 0 if all is OK, -1 if error (and an exception is
  * set, most likely OverflowError).
  * There are two problems on non-IEEE platforms:
  * 1): What this does is undefined if x is a NaN or infinity.
  * 2): -0.0 and +0.0 produce the same string.
  */
+PyAPI_FUNC(int) _PyFloat_Pack2(double x, unsigned char *p, int le);
 PyAPI_FUNC(int) _PyFloat_Pack4(double x, unsigned char *p, int le);
 PyAPI_FUNC(int) _PyFloat_Pack8(double x, unsigned char *p, int le);
 
@@ -96,14 +97,15 @@ PyAPI_FUNC(int) _PyFloat_Repr(double x, char *p, size_t len);
 PyAPI_FUNC(int) _PyFloat_Digits(char *buf, double v, int *signum);
 PyAPI_FUNC(void) _PyFloat_DigitsInit(void);
 
-/* The unpack routines read 4 or 8 bytes, starting at p. le is a bool
+/* The unpack routines read 2, 4 or 8 bytes, starting at p. le is a bool
  * argument, true if the string is in little-endian format (exponent
- * last, at p+3 or p+7), false if big-endian (exponent first, at p).
+ * last, at p+1, p+3 or p+7), false if big-endian (exponent first, at p).
  * Return value: The unpacked double. On error, this is -1.0 and
  * PyErr_Occurred() is true (and an exception is set, most likely
  * OverflowError). Note that on a non-IEEE platform this will refuse
  * to unpack a string that represents a NaN or infinity.
  */
+PyAPI_FUNC(double) _PyFloat_Unpack2(const unsigned char *p, int le);
 PyAPI_FUNC(double) _PyFloat_Unpack4(const unsigned char *p, int le);
 PyAPI_FUNC(double) _PyFloat_Unpack8(const unsigned char *p, int le);
 
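
The header comments above state that in little-endian format the exponent comes last (at p+1 for the 2-byte case) and in big-endian format it comes first (at p). A minimal Python sketch of that layout follows; it goes through the ``struct`` module rather than calling the C routines directly, and ``split_binary16`` is a hypothetical helper written for illustration.

```python
import struct

def split_binary16(raw, little_endian):
    """Split a 2-byte binary16 string into (sign, biased exponent, fraction)."""
    bits = int.from_bytes(raw, 'little' if little_endian else 'big')
    return bits >> 15, (bits >> 10) & 0x1F, bits & 0x3FF

le = struct.pack('<e', -2.0)  # sign and exponent bits end up in the last byte
be = struct.pack('>e', -2.0)  # sign and exponent bits end up in the first byte

fields_le = split_binary16(le, little_endian=True)
fields_be = split_binary16(be, little_endian=False)
print(le, be, fields_le, fields_be)
```

Both byte orders decode to the same fields: sign 1, biased exponent 16 (that is, 2**1 with a bias of 15) and a zero fraction.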

Lib/test/test_struct.py

Lines changed: 105 additions & 2 deletions
@@ -1,5 +1,6 @@
 from collections import abc
 import array
+import math
 import operator
 import unittest
 import struct
@@ -366,8 +367,6 @@ def test_705836(self):
         # SF bug 705836. "<f" and ">f" had a severe rounding bug, where a carry
         # from the low-order discarded bits could propagate into the exponent
         # field, causing the result to be wrong by a factor of 2.
-        import math
-
         for base in range(1, 33):
             # smaller <- largest representable float less than base.
             delta = 0.5
@@ -659,6 +658,110 @@ def test_module_func(self):
         self.assertRaises(StopIteration, next, it)
         self.assertRaises(StopIteration, next, it)
 
+    def test_half_float(self):
+        # Little-endian examples from:
+        # http://en.wikipedia.org/wiki/Half_precision_floating-point_format
+        format_bits_float__cleanRoundtrip_list = [
+            (b'\x00\x3c', 1.0),
+            (b'\x00\xc0', -2.0),
+            (b'\xff\x7b', 65504.0), # (max half precision)
+            (b'\x00\x04', 2**-14), # ~= 6.10352 * 10**-5 (min pos normal)
+            (b'\x01\x00', 2**-24), # ~= 5.96046 * 10**-8 (min pos subnormal)
+            (b'\x00\x00', 0.0),
+            (b'\x00\x80', -0.0),
+            (b'\x00\x7c', float('+inf')),
+            (b'\x00\xfc', float('-inf')),
+            (b'\x55\x35', 0.333251953125), # ~= 1/3
+        ]
+
+        for le_bits, f in format_bits_float__cleanRoundtrip_list:
+            be_bits = le_bits[::-1]
+            self.assertEqual(f, struct.unpack('<e', le_bits)[0])
+            self.assertEqual(le_bits, struct.pack('<e', f))
+            self.assertEqual(f, struct.unpack('>e', be_bits)[0])
+            self.assertEqual(be_bits, struct.pack('>e', f))
+            if sys.byteorder == 'little':
+                self.assertEqual(f, struct.unpack('e', le_bits)[0])
+                self.assertEqual(le_bits, struct.pack('e', f))
+            else:
+                self.assertEqual(f, struct.unpack('e', be_bits)[0])
+                self.assertEqual(be_bits, struct.pack('e', f))
+
+        # Check for NaN handling:
+        format_bits__nan_list = [
+            ('<e', b'\x01\xfc'),
+            ('<e', b'\x00\xfe'),
+            ('<e', b'\xff\xff'),
+            ('<e', b'\x01\x7c'),
+            ('<e', b'\x00\x7e'),
+            ('<e', b'\xff\x7f'),
+        ]
+
+        for formatcode, bits in format_bits__nan_list:
+            self.assertTrue(math.isnan(struct.unpack('<e', bits)[0]))
+            self.assertTrue(math.isnan(struct.unpack('>e', bits[::-1])[0]))
+
+        # Check that packing produces a bit pattern representing a quiet NaN:
+        # all exponent bits and the msb of the fraction should all be 1.
+        packed = struct.pack('<e', math.nan)
+        self.assertEqual(packed[1] & 0x7e, 0x7e)
+        packed = struct.pack('<e', -math.nan)
+        self.assertEqual(packed[1] & 0x7e, 0x7e)
+
+        # Checks for round-to-even behavior
+        format_bits_float__rounding_list = [
+            ('>e', b'\x00\x01', 2.0**-25 + 2.0**-35), # Rounds to minimum subnormal
+            ('>e', b'\x00\x00', 2.0**-25), # Underflows to zero (nearest even mode)
+            ('>e', b'\x00\x00', 2.0**-26), # Underflows to zero
+            ('>e', b'\x03\xff', 2.0**-14 - 2.0**-24), # Largest subnormal.
+            ('>e', b'\x03\xff', 2.0**-14 - 2.0**-25 - 2.0**-65),
+            ('>e', b'\x04\x00', 2.0**-14 - 2.0**-25),
+            ('>e', b'\x04\x00', 2.0**-14), # Smallest normal.
+            ('>e', b'\x3c\x01', 1.0+2.0**-11 + 2.0**-16), # rounds to 1.0+2**(-10)
+            ('>e', b'\x3c\x00', 1.0+2.0**-11), # rounds to 1.0 (nearest even mode)
+            ('>e', b'\x3c\x00', 1.0+2.0**-12), # rounds to 1.0
+            ('>e', b'\x7b\xff', 65504), # largest normal
+            ('>e', b'\x7b\xff', 65519), # rounds to 65504
+            ('>e', b'\x80\x01', -2.0**-25 - 2.0**-35), # Rounds to minimum subnormal
+            ('>e', b'\x80\x00', -2.0**-25), # Underflows to zero (nearest even mode)
+            ('>e', b'\x80\x00', -2.0**-26), # Underflows to zero
+            ('>e', b'\xbc\x01', -1.0-2.0**-11 - 2.0**-16), # rounds to 1.0+2**(-10)
+            ('>e', b'\xbc\x00', -1.0-2.0**-11), # rounds to 1.0 (nearest even mode)
+            ('>e', b'\xbc\x00', -1.0-2.0**-12), # rounds to 1.0
+            ('>e', b'\xfb\xff', -65519), # rounds to 65504
+        ]
+
+        for formatcode, bits, f in format_bits_float__rounding_list:
+            self.assertEqual(bits, struct.pack(formatcode, f))
+
+        # This overflows, and so raises an error
+        format_bits_float__roundingError_list = [
+            # Values that round to infinity.
+            ('>e', 65520.0),
+            ('>e', 65536.0),
+            ('>e', 1e300),
+            ('>e', -65520.0),
+            ('>e', -65536.0),
+            ('>e', -1e300),
+            ('<e', 65520.0),
+            ('<e', 65536.0),
+            ('<e', 1e300),
+            ('<e', -65520.0),
+            ('<e', -65536.0),
+            ('<e', -1e300),
+        ]
+
+        for formatcode, f in format_bits_float__roundingError_list:
+            self.assertRaises(OverflowError, struct.pack, formatcode, f)
+
+        # Double rounding
+        format_bits_float__doubleRoundingError_list = [
+            ('>e', b'\x67\xff', 0x1ffdffffff * 2**-26), # should be 2047, if double-rounded 64>32>16, becomes 2048
+        ]
+
+        for formatcode, bits, f in format_bits_float__doubleRoundingError_list:
+            self.assertEqual(bits, struct.pack(formatcode, f))
+
 
 if __name__ == '__main__':
     unittest.main()
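
The expectations in ``test_half_float`` can be cross-checked against an independent decoder. The sketch below is not part of the commit: ``decode_binary16`` is a hypothetical pure-Python reference written from the binary16 layout (1 sign bit, 5 exponent bits with bias 15, 10 stored fraction bits), and it assumes Python 3.6 or later for the ``'e'`` code.

```python
import math
import struct

def decode_binary16(raw_le):
    """Decode a little-endian binary16 byte string by hand."""
    bits = int.from_bytes(raw_le, 'little')
    sign = -1.0 if bits >> 15 else 1.0
    exponent = (bits >> 10) & 0x1F
    fraction = bits & 0x3FF
    if exponent == 0x1F:                    # all ones: infinity or NaN
        return sign * math.inf if fraction == 0 else math.nan
    if exponent == 0:                       # zero or subnormal
        return sign * fraction * 2.0 ** -24
    return sign * (1 + fraction / 1024) * 2.0 ** (exponent - 15)

# Byte strings taken from the clean round-trip list in the test.
samples = [b'\x00\x3c', b'\xff\x7b', b'\x00\x04', b'\x01\x00',
           b'\x00\x80', b'\x00\xfc', b'\x55\x35']
matches = all(decode_binary16(s) == struct.unpack('<e', s)[0] for s in samples)

# An exact tie rounds to the even neighbour; anything above the tie rounds up.
tie = struct.pack('>e', 1.0 + 2.0 ** -11)
above_tie = struct.pack('>e', 1.0 + 2.0 ** -11 + 2.0 ** -16)

# Values that would round to infinity are rejected instead.
try:
    struct.pack('>e', 65520.0)
    overflowed = False
except OverflowError:
    overflowed = True
```

The tie case mirrors the "nearest even mode" entries in the rounding list, and the last check mirrors the ``OverflowError`` list.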

Misc/ACKS

Lines changed: 1 addition & 0 deletions
@@ -1435,6 +1435,7 @@ Greg Stein
 Marek Stepniowski
 Baruch Sterin
 Chris Stern
+Eli Stevens
 Alex Stewart
 Victor Stinner
 Richard Stoakley

Misc/NEWS

Lines changed: 3 additions & 0 deletions
@@ -69,6 +69,9 @@ Core and Builtins
 Library
 -------
 
+- Issue #11734: Add support for IEEE 754 half-precision floats to the
+  struct module. Based on a patch by Eli Stevens.
+
 - Issue #27919: Deprecated ``extra_path`` distribution option in distutils
   packaging.
 

Modules/_struct.c

Lines changed: 75 additions & 1 deletion
@@ -266,6 +266,33 @@ get_size_t(PyObject *v, size_t *p)
 
 /* Floating point helpers */
 
+static PyObject *
+unpack_halffloat(const char *p,  /* start of 2-byte string */
+                 int le)         /* true for little-endian, false for big-endian */
+{
+    double x;
+
+    x = _PyFloat_Unpack2((unsigned char *)p, le);
+    if (x == -1.0 && PyErr_Occurred()) {
+        return NULL;
+    }
+    return PyFloat_FromDouble(x);
+}
+
+static int
+pack_halffloat(char *p,      /* start of 2-byte string */
+               PyObject *v,  /* value to pack */
+               int le)       /* true for little-endian, false for big-endian */
+{
+    double x = PyFloat_AsDouble(v);
+    if (x == -1.0 && PyErr_Occurred()) {
+        PyErr_SetString(StructError,
+                        "required argument is not a float");
+        return -1;
+    }
+    return _PyFloat_Pack2(x, (unsigned char *)p, le);
+}
+
 static PyObject *
 unpack_float(const char *p,  /* start of 4-byte string */
              int le)         /* true for little-endian, false for big-endian */
@@ -469,6 +496,16 @@ nu_bool(const char *p, const formatdef *f)
 }
 
 
+static PyObject *
+nu_halffloat(const char *p, const formatdef *f)
+{
+#if PY_LITTLE_ENDIAN
+    return unpack_halffloat(p, 1);
+#else
+    return unpack_halffloat(p, 0);
+#endif
+}
+
 static PyObject *
 nu_float(const char *p, const formatdef *f)
 {
@@ -680,6 +717,16 @@ np_bool(char *p, PyObject *v, const formatdef *f)
     return 0;
 }
 
+static int
+np_halffloat(char *p, PyObject *v, const formatdef *f)
+{
+#if PY_LITTLE_ENDIAN
+    return pack_halffloat(p, v, 1);
+#else
+    return pack_halffloat(p, v, 0);
+#endif
+}
+
 static int
 np_float(char *p, PyObject *v, const formatdef *f)
 {
@@ -743,6 +790,7 @@ static const formatdef native_table[] = {
     {'Q', sizeof(PY_LONG_LONG), LONG_LONG_ALIGN, nu_ulonglong,np_ulonglong},
 #endif
     {'?', sizeof(BOOL_TYPE), BOOL_ALIGN, nu_bool, np_bool},
+    {'e', sizeof(short), SHORT_ALIGN, nu_halffloat, np_halffloat},
     {'f', sizeof(float), FLOAT_ALIGN, nu_float, np_float},
     {'d', sizeof(double), DOUBLE_ALIGN, nu_double, np_double},
     {'P', sizeof(void *), VOID_P_ALIGN, nu_void_p, np_void_p},
@@ -825,6 +873,12 @@ bu_ulonglong(const char *p, const formatdef *f)
 #endif
 }
 
+static PyObject *
+bu_halffloat(const char *p, const formatdef *f)
+{
+    return unpack_halffloat(p, 0);
+}
+
 static PyObject *
 bu_float(const char *p, const formatdef *f)
 {
@@ -921,6 +975,12 @@ bp_ulonglong(char *p, PyObject *v, const formatdef *f)
     return res;
 }
 
+static int
+bp_halffloat(char *p, PyObject *v, const formatdef *f)
+{
+    return pack_halffloat(p, v, 0);
+}
+
 static int
 bp_float(char *p, PyObject *v, const formatdef *f)
 {
@@ -972,6 +1032,7 @@ static formatdef bigendian_table[] = {
     {'q', 8, 0, bu_longlong, bp_longlong},
     {'Q', 8, 0, bu_ulonglong, bp_ulonglong},
     {'?', 1, 0, bu_bool, bp_bool},
+    {'e', 2, 0, bu_halffloat, bp_halffloat},
     {'f', 4, 0, bu_float, bp_float},
     {'d', 8, 0, bu_double, bp_double},
     {0}
@@ -1053,6 +1114,12 @@ lu_ulonglong(const char *p, const formatdef *f)
 #endif
 }
 
+static PyObject *
+lu_halffloat(const char *p, const formatdef *f)
+{
+    return unpack_halffloat(p, 1);
+}
+
 static PyObject *
 lu_float(const char *p, const formatdef *f)
 {
@@ -1141,6 +1208,12 @@ lp_ulonglong(char *p, PyObject *v, const formatdef *f)
     return res;
 }
 
+static int
+lp_halffloat(char *p, PyObject *v, const formatdef *f)
+{
+    return pack_halffloat(p, v, 1);
+}
+
 static int
 lp_float(char *p, PyObject *v, const formatdef *f)
 {
@@ -1182,6 +1255,7 @@ static formatdef lilendian_table[] = {
     {'Q', 8, 0, lu_ulonglong, lp_ulonglong},
     {'?', 1, 0, bu_bool, bp_bool}, /* Std rep not endian dep,
         but potentially different from native rep -- reuse bx_bool funcs. */
+    {'e', 2, 0, lu_halffloat, lp_halffloat},
     {'f', 4, 0, lu_float, lp_float},
     {'d', 8, 0, lu_double, lp_double},
     {0}
@@ -2239,7 +2313,7 @@ these can be preceded by a decimal repeat count:\n\
   x: pad byte (no data); c:char; b:signed byte; B:unsigned byte;\n\
   ?: _Bool (requires C99; if not available, char is used instead)\n\
   h:short; H:unsigned short; i:int; I:unsigned int;\n\
-  l:long; L:unsigned long; f:float; d:double.\n\
+  l:long; L:unsigned long; f:float; d:double; e:half-float.\n\
 Special cases (preceding decimal count indicates length):\n\
   s:string (array of char); p: pascal string (with count byte).\n\
 Special cases (only available in native format):\n\
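
The three format tables above give ``'e'`` a fixed size of 2 with no alignment in the standard big- and little-endian modes, while the native table reuses the size and alignment of a C ``short``. The sketch below observes this from Python, along with the ``struct.error`` that ``pack_halffloat`` sets for non-float arguments. It assumes Python 3.6 or later; the variable names are illustrative.

```python
import struct

# Standard modes: no padding, so a byte followed by a half float is 1 + 2 bytes.
standard_size = struct.calcsize('<be')

# Native mode: 'e' is laid out like a C short, so any alignment padding
# matches what 'h' would get on the same platform.
native_size = struct.calcsize('@be')
native_short_size = struct.calcsize('@bh')

# A value that cannot be converted to a C double is reported as struct.error.
try:
    struct.pack('<e', 'not a float')
    rejected = False
except struct.error:
    rejected = True

print(standard_size, native_size, native_short_size, rejected)
```

The native size depends on the platform, which is why the sketch compares it with ``'@bh'`` rather than with a fixed number.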
