Subscribe to this thread
Home - General / All posts - RoundDecs() on float32 data type
320 post(s)
#19-Nov-21 08:25

It seems to me that RoundDecs() on a float32 number type is not working appropriately in M9 (it is not rounding - isthis normal behaviour ? ) whereas on a float64 number type the expected behaviour is correct.


6,852 post(s)
#19-Nov-21 08:29

Could you provide a specific example of how you are using the function, and what you get as compared to what you expect?

Seems to work for me...

> ? RoundDecs(CAST(3.1415926 AS FLOAT64), 4)

float64: 3.1416

> ? RoundDecs(CAST(3.1415926 AS FLOAT32), 4)

float64: 3.1416

474 post(s)
#19-Nov-21 09:07

This effect:

> ? CAST(RoundDecs(CAST(3.1415926 AS FLOAT32), 4) AS FLOAT32)

float32: 3.1415998935699463

> ? CAST(RoundDecs(CAST(3.1415926 AS FLOAT32), 4) AS FLOAT64)

float64: 3.1416

It comes from "mismatch" between binary and decimal representation.



9,773 post(s)
#19-Nov-21 09:34

If you mean that after you round a specific value to, say, 2 decimal digits, then print it to full precision, the output shows more than 2 decimal digits - this is quite likely expected, yes.

Short: use a style.

Long: (apologies for a wall of text)

Floating-point numbers are only precise for binary fractions and their sums (1/2, 1/4, 1/2+1/8, etc). When you use a floating-point number to represent a decimal fraction (eg, 9/10), you get an approximation. So, if you use RoundDecs to round, say, 3.14159 to 2 decimal digits, you get the value that is closest to 3.14, not 3.14 exactly. 3.14 cannot be represented exactly as a binary fraction, when you translate it to binary, you get an infinite number of digits (similarly to 1/3 being an infinite 0.3333... in decimal), both FLOAT32 and FLOAT64 store the first digits up to some point. If you then try to print the result using the default style, which preserves all digits that make a difference, you get 3.14 for FLOAT64 but 3.140000104904175 for FLOAT32.

In the first case, we start with a FLOAT64 value that is closest to 3.14, start printing it, scan through the infinite stream of decimal digits and stop at the earliest point we can so that the result parses back to the FLOAT64 value that is closest to 3.14 - the result happens to be '3.14'.

In the second case, we start with a FLOAT32 value that is closest to 3.14 (which differs from the value in the first case because FLOAT32 stores less data than FLOAT64), convert that to FLOAT64 (which produces a different value than in the first case), print that and get a different result.

Essentially, we get a strange-looking result for FLOAT32 because we print FLOAT32 as FLOAT64. Why are we not printing FLOAT32 as FLOAT32 = stop at the first digit which did not make a difference for FLOAT32, and instead convert to FLOAT64? Because we do not know what the resulting string will be eventually parsed into and if it will have to be parsed into a FLOAT64, the result will be a loss of precision. Now, I get that it is tempting to think that having a FLOAT32 which is a closest approximation of 3.14 in decimal print as '3.14' and having that convert to a FLOAT64 which is a closest approximation of the same 3.14, but with more precision, is fine, but there are strong arguments against: eg, if we do this, then the results of converting a FLOAT32 to FLOAT64 will differ from the results of converting a FLOAT32 to NVARCHAR and then converting NVARCHAR to FLOAT64. Converting a FLOAT32 or FLOAT64 to NVARCHAR and back is supposed to be non-lossy. Converting a FLOAT32 to FLOAT64 is supposed to be non-lossy as well. So if converting a FLOAT32 to NVARCHAR to FLOAT64 will suddenly become lossy (right now it is not), this will be quite a surprise.

RoundDecs is just a wrong tool for limiting the number of displayed decimal digits. Do that using a style.

320 post(s)
#19-Nov-21 12:06

Thank you all.

Ok styling ... ahuum; I was still using M9 175.4.

Will upgrade to M9 175.5.

Manifold User Community Use Agreement Copyright (C) 2007-2021 Manifold Software Limited. All rights reserved.