Kahan summation: Difference between revisions

Content added Content deleted

Inline

Revision as of 17:25, 12 December 2014

The Kahan summation algorithm is a method of summing a series of numbers represented in a limited precision in a way that minimises the loss of precision in the result.

The task is to follow the previously linked Wikipedia articles algorithm and its worked example by:

Do all arithmetic in decimal to a precision of six digits.
Write a function/method/procedure/subroutine/... to perform Kahan summation on an ordered collection of numbers, (such as a list of numbers).
Create the three numbers a, b, c equal to 10000.0, 3.14159, 2.71828 respectively.
Show that the simple left-to-right summation, equivalent to (a + b) + c gives an answer of 10005.8
Show that the Kahan function applied to the sequence of values a, b, c results in the more precise answer of 10005.9

If your language does not have six digit decimal point, but does have some fixed precision floating point number system then state this and try this alternative task using floating point numbers:

Follow the same steps as for the main task description above but for the numbers a, b, c use the floating-point values 1.0, epsilon, -epsilon respectively where epsilon is determined by its final value after the following:

<lang python>epsilon = 1.0 while 1.0 + epsilon != 1.0:

   epsilon = epsilon / 2.0</lang>

The above should ensure that (a + b) + c != 1.0 whilst their Kahan sum does equal 1.0 . Only if this is not the case are you then free to find three values with this property of your own.

If your language has decimals, but not six digits of precision then you may follow the outline given in Kahan_summation#Python:_Arbitrary_precision_Decimal which uses ideas from the floating point case above.

In general

Show all output on this page.
If the floating point calculations are used then answers may depend on the hardware platform or compiler/interpreter tool-chain used for a language.
Slight deviations from the task description should be explained and the the subsequent difference in output explained.
All examples should have constants chosen to clearly show the benefit of Kahan summing!

C#

This example is incomplete. If decimal precision cannot be set then this should be noted as well as the deviation in expected output explained. Please ensure that it meets all task requirements and remove this message.

<lang csharp>using System.Collections.Generic; using System.Console; using System.Linq;

namespace KahanSummation {

   static class EnumerableDecimalExtensions
   {
       internal static decimal KahanSum(this IEnumerable<decimal> numbers)
       {
           var sum = 0m;
           var c = 0m;

           foreach (var number in numbers)
           {
               var y = number + c;
               var t = sum + y;
               c = (t - sum) - y;
               sum = t;
           }

           return sum;
       }
   }

   sealed class Program
   {
       static void Main()
       {
           var input = new[]
           {
               10000.0m,
               3.14159m,
               2.71828m
           };

           WriteLine("Left associative:\t" + input.Sum());
           WriteLine("Kahan summation: \t" + input.KahanSum());
       }
   }

} </lang>

Output:

Left associative:       10005.85987
Kahan summation:        10005.85987

F#

This example is incomplete. If decimal precision cannot be set then this should be noted as well as the deviation in expected output explained. Please ensure that it meets all task requirements and remove this message.

<lang fsharp>open System

// loop version let kahanSum numbers =

   let mutable sum = 0m
   let mutable c = 0m

   for number in numbers do
       let y = number + c
       let t = sum + y
       c <- (t - sum) - y
       sum <- t
   sum

// recursive version let rec innerKahanSum numbers sum c =

   match numbers with
   | [] -> sum
   | number :: rest -> 
       let y = number + c
       let t = sum + y
       let z = ((t - sum) - y)
       innerKahanSum rest t z

let tailRecursiveKahanSum numbers =

   innerKahanSum numbers 0m 0m

// main [<EntryPoint>] let main argv =

   let input = [ 10000.0m; 3.14159m; 2.71828m ]
   printfn "Left associative: \t%M" (input |> List.sum)
   printfn "Kahan summation: \t%M" (input |> kahanSum)
   printfn "Tail recursive: \t%M" (input |> tailRecursiveKahanSum)
   0

</lang>

Output:

Left associative:       10005.85987
Kahan summation:        10005.85987
Tail recursive:         10005.85987

Go

Go has no floating point decimal type. Its floating point types are float64 and float32, implementations of IEEE 754 binary64 and binary32 formats. Alternative task: <lang go>package main

import "fmt"

type float float64

func kahan(s ...float) float {

   var tot, c float
   for _, x := range s {
       y := x - c
       t := tot + y
       c = (t - tot) - y
       tot = t
   }
   return tot

}

func epsilon() float {

   ε := float(1)
   for 1+ε != 1 {
       ε /= 2
   }
   return ε

}

func main() {

   a := float(1)
   b := epsilon()
   c := -b
   fmt.Println("Left associative:", a+b+c)
   fmt.Println("Kahan summation: ", kahan(a, b, c))
   fmt.Println("Epsilon:         ", b)

}</lang>

Output:

Left associative: 0.9999999999999999
Kahan summation:  1
Epsilon:          1.1102230246251565e-16

With float defined as float32,

Left associative: 0.99999994
Kahan summation:  1
Epsilon:          5.9604645e-08

PHP

This example is incorrect. Please fix the code and remove this message.

Details: This does not follow the original or the revised task description

Interactive mode

<lang php>$ php -a Interactive mode enabled

php > function kahansum($input) { php { $sum = $c = 0; php { foreach($input as $item) { php { $y = $item + $c; php { $t = $sum + $y; php { $c = ($t - $sum) - $y; php { $sum = $t; php { } php { return $sum; php { } php > $input = array(10000.0, 3.14159, 2.71828); php > list($a, $b, $c) = $input; php > echo $sumabc."\n"; 10005.85987 php > echo kahansum($input)."\n"; 10005.85987 php > echo array_sum($input)."\n"; 10005.85987 php > // does not seem to be any difference in PHP 5.5 </lang>

Script

<lang php><?php function kahansum($input) {

   $sum = $c = 0;                                                              
   foreach($input as $item) {                                                  
       $y = $item + $c;                                                        
       $t = $sum + $y;                                                         
       $c = ($t - $sum) - $y;                                                  
       $sum = $t;                                                              
   }                                                                           
   return $sum;

}

$input = array(10000.0, 3.14159, 2.71828); list($a, $b, $c) = $input; $sumabc = $a + $b + $c; echo $sumabc."\n"; echo kahansum($input)."\n"; echo array_sum($input)."\n"; </lang>

Script output: <lang bash> $ php kahansum.php 10005.85987 10005.85987 10005.85987 </lang>

Python

Python: Decimal

<lang python>>>> from decimal import * >>> >>> getcontext().prec = 6 >>> >>> def kahansum(input):

   summ = c = 0
   for num in input:
       y = num - c
       t = summ + y
       c = (t - summ) - y
       summ = t
   return summ

>>> a, b, c = [Decimal(n) for n in '10000.0 3.14159 2.71828'.split()] >>> a, b, c (Decimal('10000.0'), Decimal('3.14159'), Decimal('2.71828')) >>> >>> (a + b) + c Decimal('10005.8') >>> kahansum([a, b, c]) Decimal('10005.9') >>> >>> >>> sum([a, b, c]) Decimal('10005.8') >>> # it seems Python's sum() doesn't make use of this technique. >>> >>> # More info on the current Decimal context: >>> getcontext() Context(prec=6, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[Inexact, Rounded], traps=[InvalidOperation, DivisionByZero, Overflow]) >>> >>> >>> ## Lets try the simple summation with more precision for comparison >>> getcontext().prec = 20 >>> (a + b) + c Decimal('10005.85987') >>> </lang>

Python: Floats

This was written as proof that the floating-point sub-task could work and is better off displayed, so...

<lang python>>>> eps = 1.0 >>> while 1.0 + eps != 1.0: eps = eps / 2.0

>>> eps 1.1102230246251565e-16 >>> (1.0 + eps) - eps 0.9999999999999999 >>> kahansum([1, eps, -eps]) 1.0 >>> >>> >>> # Info on this implementation of floats >>> import sys >>> sys.float_info sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1) >>> </lang> Note that the kahansum function from the decimal example is duck typed and adjusts to work with the number type used in its argument list.

Note also that the math.fsum function employs an even more precise algorithm for the summation of floating point numbers.

Python: Arbitrary precision Decimal

Some languages have a decimal type, but cannot alter its precision to six digits. This example mimmicks this case byusing the default Python decimal precision and a variant of epsilon finding that divides by ten instead of two. <lang python>>>> from decimal import localcontext, Decimal >>> >>> with localcontext() as ctx: one, ten = Decimal('1.0'), Decimal('10') eps = one while one + eps != one: eps = eps / ten print('eps is:', eps) print('Simple sum is:', (one + eps) - eps) print('Kahan sum is:', kahansum([one, eps, -eps]))

eps is: 1E-28 Simple sum is: 0.9999999999999999999999999999 Kahan sum is: 1.000000000000000000000000000 >>> </lang>

R

Try using the task rules

1. Do all arithmetic in decimal to a precision of six digits.

options(digits=6)

2. Write a function/method/procedure/subroutine/... to perform Kahan summation
on an ordered collection of numbers, (such as a list of numbers).

kahansum <- function(x) {

   ks <- 0
   c <- 0
   for(i in 1:length(x)) {
       y <- x[i] - c
       kt <- ks + y
       c = (kt - ks) - y
       ks = kt
   }
   ks

}

3. Create the three numbers a, b, c equal to 10000.0, 3.14159, 2.71828 respectively.

a <- 10000.0 b <- 3.14159 c <- 2.71828

4. Show that the simple left-to-right summation, equivalent to (a + b) + c gives an answer of 10005.8

a + b + c

[1] 10005.9

5. Show that the Kahan function applied to the sequence of values a, b, c results in the more precise answer of 10005.9

input <- c(a, b, c) kahansum(input)

[1] 10005.9

options(digits=6)

as there is no difference let's try the subtask

epsilon = 1.0 while ((1.0 + epsilon) != 1.0) {

   epsilon = epsilon / 2.0

} epsilon

[1] 1.11022e-16

a <- 1.0000000000000011 b <- epsilon c <- -epsilon

left-to-right summation

(a + b) + c

[1] 1

kahan summation

kahansum(c(a, b, c))

[1] 1

but, are they really the same or is an artifact?

(a + b) + c == kahansum(c(a, b, c))

FALSE

ok, then what is the difference?

((a + b) + c) - kahansum(c(a, b, c))

[1] 2.22045e-16

</lang>

@@ Line 314: / Line 314: @@
 <lang r>
+# Try using the task rules
-> # 1. Do all arithmetic in decimal to a precision of six digits.
-> options(digits=6)
+# 1. Do all arithmetic in decimal to a precision of six digits.
->
+options(digits=6)
-> # 2. Write a function/method/procedure/subroutine/... to perform Kahan summation
-> # on an ordered collection of numbers, (such as a list of numbers).
+# 2. Write a function/method/procedure/subroutine/... to perform Kahan summation
-> kahansum <- function(x) {
+# on an ordered collection of numbers, (such as a list of numbers).
-+     ks <- 0
+kahansum <- function(x) {
-+     c <- 0
+    ks <- 0
-+     for(i in 1:length(x)) {
-+         y <- x[i] - c
+    c <- 0
+    for(i in 1:length(x)) {
-+         kt <- ks + y
-+         c = (kt - ks) - y
+        y <- x[i] - c
-+         ks = kt
+        kt <- ks + y
+        c = (kt - ks) - y
-+     }
-+     ks
+        ks = kt
-+ }
+    }
+    ks
->
+}
-> # 3. Create the three numbers a, b, c equal to 10000.0, 3.14159, 2.71828 respectively.
-> a <- 10000.0
+# 3. Create the three numbers a, b, c equal to 10000.0, 3.14159, 2.71828 respectively.
-> b <- 3.14159
-> c <- 2.71828
+a <- 10000.0
+b <- 3.14159
->
+c <- 2.71828
-> # 4. Show that the simple left-to-right summation, equivalent to (a + b) + c gives an answer of 10005.8
-> # this is not true for R
+# 4. Show that the simple left-to-right summation, equivalent to (a + b) + c gives an answer of 10005.8
-> a + b + c
+a + b + c
-[1] 10005.9
+# [1] 10005.9
->
-> # 5. Show that the Kahan function applied to the sequence of values a, b, c results in the more precise answer of 10005.9
+# 5. Show that the Kahan function applied to the sequence of values a, b, c results in the more precise answer of 10005.9
-> input <- c(a, b, c)
+input <- c(a, b, c)
-> kahansum(input)
+kahansum(input)
-[1] 10005.9
+# [1] 10005.9
+options(digits=6)
+# as there is no difference let's try the subtask
+epsilon = 1.0
+while ((1.0 + epsilon) != 1.0) {
+    epsilon = epsilon / 2.0
+}
+epsilon
+# [1] 1.11022e-16
+a <- 1.0000000000000011
+b <- epsilon
+c <- -epsilon
+# left-to-right summation
+(a + b) + c
+# [1] 1
+# kahan summation
+kahansum(c(a, b, c))
+# [1] 1
+# but, are they really the same or is an artifact?
+(a + b) + c == kahansum(c(a, b, c))
+# FALSE
+# ok, then what is the difference?
+((a + b) + c) - kahansum(c(a, b, c))
+# [1] 2.22045e-16
 </lang>