Insufficient test cases and incorrect implementations
The test cases for this problem are insufficient to ensure that solutions handle numbers differently at different scales in many common languages, even though the problem description says that they should. Specifically, using a 64-bit IEEE floating-point number (the largest type many languages offer; some offer only smaller floating-point types), the test-case numbers 100000000000000.01 and 100000000000000.011, which the problem statement implies "should test whether larger numbers are allowed larger errors", are actually equal, both having the representation 100000000000000.015625. At present, this oversight is exploited by many (if not most) of the implementations, which simply test whether the absolute difference of the inputs is less than 1e-18. As the test cases stand, it would technically be correct to simply test whether a == b.
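For anyone who wants to verify this, a quick sketch in Python (whose float is a 64-bit IEEE double) shows both literals rounding to the same representable value; the variable names are just for illustration:

```python
from decimal import Decimal

a = 100000000000000.01
b = 100000000000000.011

# Both decimal literals round to the same 64-bit double...
print(a == b)              # True
# ...whose exact binary value Decimal can display:
print(Decimal(a))          # 100000000000000.015625
# So the common "absolute difference below 1e-18" test passes trivially:
print(abs(a - b) < 1e-18)  # True
```

The spacing between adjacent doubles near 1e14 is 2^-6 = 0.015625, so both literals land on the same representable value.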
Specifically, the incorrect solutions are any which are based (directly or indirectly) on the C# solution, plus the Delphi solution (its standard comparison function uses only an absolute difference), the FreeBasic solution, the Mathematica solution, the Scala solution, and the Wren solution. Notably, many of the more popular languages have incorrect solutions, as many are based on the C#, C, or Java solution.
I would like to change this test case to use numbers a few binary orders of magnitude smaller (to give room for languages with slightly smaller floats), and perhaps also to include test cases for 32-bit and/or 16-bit floats, in order to force implementations in any language to allow for differences in magnitude; but the fact that this would invalidate many existing solutions may be an issue. Perhaps the existing solutions which operate only on an absolute difference between numbers could be moved to a different page? Goose121 (talk) 00:04, 20 November 2021 (UTC)
- I made a minor grammatical adjustment to your post which made it clearer to me: says "x" ==> implies "x".
- It (now) strikes me that the third case should really have had two 1s on the rhs; then it would be a blatant error if tests 1 and 3 gave different results.
- (Clearly the Java/Lua entries should be given some sort of special prize for getting cases 3, 4, 6, and 7 wrong!)
- The tests should have their expected results explicitly stated; beyond 1 and 2, "Otherwise answers may vary and still be correct." is an absolute cop-out if ever I saw one.
- A consistent output should be specified (maybe like R's); I quite like the idea of "==" and "~=" columns. I could go on.
- You should write a list here of any further tests you'd like to see.
- Remember all this is really much more "compare languages" as opposed to "write a useful function", I guess.
- Marking existing answers as needing review is not unprecedented, but needs 4 or 5 "ayes" here --Pete Lomax (talk) 13:51, 20 November 2021 (UTC)
Can of worms
uh-oh, you've opened a can of worms here. I can remember, in the far distant past, being lectured about floating point/equality/closeness, and I resolved to run and hide from it in the future if I could :-)
Firstly, you can't leave people to find out for themselves how Python does it; if you want them to use a specific method then you need to add its description to the task.
Secondly, approximation depends on circumstances. If the compared values vary exponentially or non-linearly, or ..., then one may end up with different, but more usable, definitions of approximately equal.
--Paddy3118 (talk) 12:08, 2 September 2019 (UTC)
- Agreed. You need to be very precise about your imprecision. Admittedly my gut instinct was that 100.01 and 100.011 are not approximately equal, as you said, but in fact they are better than 99.999% approximately equal and less than 0.001% different! I just wrote down "what I usually do", and on reflection that is not really likely to meet any task requirements very well. Perhaps explicitly specifying the accuracy (as a fraction, percentage, number of significant digits, decimal places, or whatever), along with all the explicitly required answers for each of the various precision settings, might help. Also, the test cases should probably be numbered rather than bullet-pointed if you're going to refer to them by number. --Pete Lomax (talk) 01:25, 3 September 2019 (UTC)
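To put numbers to that remark: a rough sketch of the relative-difference arithmetic for the 100.01 / 100.011 pair (plain Python; the 1e-9 tolerance is purely illustrative, not part of the task):

```python
a, b = 100.01, 100.011

# Relative difference, scaled by the larger magnitude:
rel = abs(a - b) / max(abs(a), abs(b))
print(rel)           # ~1e-5, i.e. about 0.001% different / 99.999% "equal"
print(rel < 1e-9)    # False: a relative tolerance of 1e-9 calls them unequal
```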
- I've just updated the task's preamble (as far as using numbers instead of bullets). However, some programming-language entries have added some of their own (or others') pairs of numbers to be compared, so their outputs don't match the examples given in the task's preamble. -- Gerard Schildberger (talk) 01:41, 3 September 2019 (UTC)
The desired use case for the task is a method that allows the result of a floating-point calculation to be tested against a non-integer decimal constant, to verify the calculation's correctness even when code changes alter the floating-point result in its non-significant digits.
Beyond that, the "can of worms" probably surrounds a) whether there should be an absolute difference that matters, versus just a relative difference, and b) whether 0.0 is different from all other floating-point values, since relative comparison against zero breaks down (0.0/0.0 is NaN). Those "wormy" issues should not matter here.--Wherrera (talk) 02:25, 3 September 2019 (UTC)
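One conventional way to sidestep both worms at once is to combine a relative and an absolute tolerance, as Python's math.isclose does. A minimal hand-rolled sketch (the function name and tolerance values are illustrative, not anything the task prescribes):

```python
def approx_equal(a, b, rel_tol=1e-9, abs_tol=0.0):
    # True when a and b are within rel_tol of the larger magnitude,
    # or within abs_tol absolutely (which handles comparisons with 0.0,
    # where any relative tolerance would demand exact equality).
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

print(approx_equal(100000000000000.01, 100000000000000.011))  # True
print(approx_equal(100.01, 100.011))             # False at rel_tol=1e-9
print(approx_equal(0.0, 1e-12, abs_tol=1e-9))    # True only via abs_tol
```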
Clarify the task
It is not clear to me whether the task is asking for a function which compares two floating-point values to within a given tolerance, or looking at languages' implementations of floating-point values. If the latter, maybe "does 0.1+0.2=0.3" might be a useful ref.--Nigel Galloway (talk) 15:40, 3 September 2019 (UTC)
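For reference, the classic demonstration that reply alludes to, sketched in Python (math.isclose with its default tolerance is just one illustrative tolerance-based comparison, not the task's prescribed method):

```python
import math

print(0.1 + 0.2 == 0.3)               # False: neither side is exactly 0.3
print(0.1 + 0.2)                      # 0.30000000000000004
# A tolerance-based comparison gives the intended answer:
print(math.isclose(0.1 + 0.2, 0.3))   # True (default rel_tol is 1e-09)
```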