Talk:Kahan summation
 
:But what about the issues we have already uncovered in several languages? What about languages that have much better control of number representation and calculations? There is a chance for languages that have these extra capabilities to shine and I would not want to lose that by fixing on one representation. --[[User:Paddy3118|Paddy3118]] ([[User talk:Paddy3118|talk]]) 16:03, 20 December 2014 (UTC)
 
::There is no chance of solving this task unless the representation is known. Kahan summation is only meaningful for fixed-precision floating-point formats. (I know this issue hasn't come up, but I expect some people to hear "decimal" and try to use a fixed-point decimal type, which I think may be more common than floating-point decimal.) For an appropriate type then, observing a difference from simple summation is only possible when results are displayed with sufficient precision. For the epsilon example in R, for instance, you must know that you have nearly 16 significant decimal digits; then, since epsilon is a difference in the 16th decimal place, you must display 16 significant figures. R can do this with sprintf. I suspect most languages will have a way of displaying numbers at either full or specified precision. Even in the case where full resolution was not displayed, a difference between Kahan and simple summation could still be shown if you at least knew how much precision is displayed (or equivalently, how much precision is suppressed). Then you could contrive data so that the difference would ultimately be apparent at whatever (known) precision is displayed. Not knowing whether you are doing fixed-precision floating-point math, not knowing the precision you are working with, or not knowing what precision is displayed are showstoppers to illustrating Kahan summation. Some task requirements to demonstrate these capabilities would help avoid much floundering. For example,
::*State the precision and base used, ex, 6 digit decimal, or 53 bit binary.
::*Determine the unit of least precision, or unit of last place, or ULP. In general this is base**(-precision).
::*Compute and show 1, ULP, and 1-ULP in sufficient precision to show all three different.
::*Now compute and show 1+ULP at full precision. It must show 1.000... with zeros covering the full precision. Examples,
<pre>
6 digit decimal
1: 1.00000
ULP: .000001
1-ULP: .999999
1+ULP: 1.00000
</pre>
<pre>
IEEE 754 binary64, base 2, precision 53.
1: 1.000000000000000e+00
ULP: 1.110223024625157e-16
1-ULP: 9.999999999999999e-01
1+ULP: 1.000000000000000e+00
</pre>
:If you can't reproduce 1-ULP displaying as something less than one and 1+ULP displaying as exactly one, you can't do the task: it means you don't have a suitable representation, you don't have the ULP, you can't display full precision, or something else is wrong.
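:For what it's worth, here's a minimal Go sketch of that checklist for IEEE 754 binary64, assuming the same base and precision constants as the code below; it just prints 1, ULP, 1-ULP, and 1+ULP at full precision (the output should match the binary64 example above):
<lang go>package main

import (
    "fmt"
    "math"
)

// IEEE 754 binary64: base 2, 53 bits of precision.
const (
    base = 2
    prec = 53
)

func main() {
    ulp := math.Pow(base, -prec) // ULP as defined above: base**(-precision)
    fmt.Printf("1:     %.15e\n", 1.0)
    fmt.Printf("ULP:   %.15e\n", ulp)
    fmt.Printf("1-ULP: %.15e\n", 1-ulp) // strictly less than one
    fmt.Printf("1+ULP: %.15e\n", 1+ulp) // rounds back to exactly one
}</lang>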
 
:The epsilon technique should find ULP if the divisor is the base, and if equality can be tested without fuzzing. If you count iterations, it will tell you the precision as well. Really, though, I suggest the epsilon technique not be required; you should know your precision and base without computing them.
<lang go>package main

import (
    "fmt"
    "math"
)

// Constants defining IEEE 754 binary64: base 2, 53 bits of precision.
const (
    base = 2
    prec = 53
)

// ulp finds the "epsilon" by repeated halving and counts the
// iterations, which gives the precision.
func ulp() (u float64, p int) {
    u = 1.
    for 1+u > 1 {
        p++
        u /= 2
    }
    return
}

func main() {
    u, p := ulp()
    fmt.Println("ulp by definition: ", math.Pow(base, -prec))
    fmt.Println("ulp computed:      ", u)
    fmt.Println("computed precision:", p)
}</lang>
{{out}}
<pre>
ulp by definition:  1.1102230246251565e-16
ulp computed:       1.1102230246251565e-16
computed precision: 53
</pre>
::Kahan summation of 1, +ULP, -ULP does technically show that it works, but I think the example is a little abstract and doesn't illustrate the practical value of Kahan summation well. My examples adding 10 copies of pi or adding a triangle of numbers were attempts at doing this and showing accumulated discrepancies greater than just 1 ULP. Here's one more attempt, this time summing a bunch of random numbers and accumulating a reference result.
<lang go>package main

import (
    "fmt"
    "math"
    "math/rand"
    "time"
)

// kahan sums the slice using Kahan compensated summation.
func kahan(s []float64) float64 {
    var tot, c float64
    for _, x := range s {
        y := x - c
        t := tot + y
        c = (t - tot) - y
        tot = t
    }
    return tot
}

// seq sums the slice with plain sequential addition.
func seq(s []float64) float64 {
    tot := 0.
    for _, x := range s {
        tot += x
    }
    return tot
}

// Constants defining IEEE 754 binary64.
const (
    base = 2
    prec = 53
)

var ulp = math.Pow(base, -prec)

func main() {
    rand.Seed(time.Now().UnixNano())
    n := make([]float64, 10001)
    n[0] = 1
    refSum := 0.
    for i := 1; i < len(n); i++ {
        r := rand.Float64() * 1.1 * ulp
        n[i] = r
        refSum += r
    }

    fmt.Printf("Sequential: %.15f\n", seq(n))
    fmt.Printf("Kahan:      %.15f\n", kahan(n))
    fmt.Printf("Reference:  %.18f\n", refSum)
}</lang>
{{out}}
<pre>
Sequential: 1.000000000000215
Kahan:      1.000000000000614
Reference:  0.000000000000613562
</pre>
:It's a little strange because that 1.1 is a fudge factor needed to get satisfying results. A multiple of a little more than half the base works well because of the way the additions round (see the short binary64 check after the decimal example below). Anyway, it's nice to see the Kahan sum contain a rounding of the reference sum and to see the sequential sum lose a few decimal places. Here's similar code for six decimal digits,
<lang python>from decimal import *
from random import random

getcontext().prec = 6

def kahan(vals, start=0):
    tot = start
    c = 0
    for x in vals:
        y = x - c
        t = tot + y
        c = (t - tot) - y
        tot = t
    return tot

# find ULP and precision with the epsilon technique, dividing by the base (10)
ulp = Decimal('1')
prec = 0
while 1 + ulp > 1:
    prec += 1
    ulp /= 10
print(ulp, prec)

n = []
refSum = Decimal('0')
for i in range(10000):
    r = Decimal(random() * 6) * ulp
    n.append(Decimal(r))
    refSum += r

print("Sequential: ", sum(n, Decimal('1')))
print("Kahan ", kahan(n, Decimal('1')))
print("Reference: ", refSum)</lang>
{{out}}
<pre>
0.000001 6
Sequential: 1.01615
Kahan 1.02979
Reference: 0.0297927
</pre>
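:To make that rounding remark concrete, here's a quick binary64 check (a sketch of my own, using the same ULP value as the Go code above): a single addend must exceed ULP to survive one rounded addition to 1; at or below ULP it rounds straight back to 1, and above it the sum jumps by the full spacing of 2 ULP.
<lang go>package main

import (
    "fmt"
    "math"
)

func main() {
    ulp := math.Pow(2, -53) // same "ULP" as above: half the spacing between 1 and the next double

    // An addend at or below ulp rounds back to exactly 1 (the tie goes to even);
    // above ulp the sum jumps by the full spacing 2*ulp.
    fmt.Println("1 + 0.6*ULP > 1:", 1+0.6*ulp > 1) // false
    fmt.Println("1 + 1.0*ULP > 1:", 1+1.0*ulp > 1) // false (exact tie)
    fmt.Println("1 + 1.2*ULP > 1:", 1+1.2*ulp > 1) // true
    fmt.Printf("1 + 1.2*ULP = %.16e\n", 1+1.2*ulp) // one full step of 2*ulp above 1
}</lang>
:With the 1.1 factor the random addends land on both sides of that threshold, so the plain sum moves only occasionally (and always by 2 ULP at a time), while the compensation in Kahan summation keeps the parts that would otherwise be lost.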
::&mdash;[[User:Sonia|Sonia]] ([[User talk:Sonia|talk]]) 01:53, 21 December 2014 (UTC)