Kahan summation

From Rosetta Code
Revision as of 18:06, 12 December 2014 by rosettacode>Jmcastagnetto (modified example to comply with revised task and subtask)
Kahan summation is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

The Kahan summation algorithm is a method of summing a series of numbers represented in a limited precision in a way that minimises the loss of precision in the result.

The task is to follow the previously linked Wikipedia articles algorithm and its worked example by:

  1. Do all arithmetic in decimal to a precision of six digits.
  2. Write a function/method/procedure/subroutine/... to perform Kahan summation on an ordered collection of numbers, (such as a list of numbers).
  3. Create the three numbers a, b, c equal to 10000.0, 3.14159, 2.71828 respectively.
  4. Show that the simple left-to-right summation, equivalent to (a + b) + c gives an answer of 10005.8
  5. Show that the Kahan function applied to the sequence of values a, b, c results in the more precise answer of 10005.9

If your language does not have six digit decimal point, but does have some fixed precision floating point number system then state this and try this alternative task using floating point numbers:

  • Follow the same steps as for the main task description above but for the numbers a, b, c use the floating-point values 1.0, epsilon, -epsilon respectively where epsilon is determined by its final value after the following:

<lang python>epsilon = 1.0 while 1.0 + epsilon != 1.0:

   epsilon = epsilon / 2.0</lang>

The above should ensure that (a + b) + c != 1.0 whilst their Kahan sum does equal 1.0 . Only if this is not the case are you then free to find three values with this property of your own.

If your language has decimals, but not six digits of precision then you may follow the outline given in Kahan_summation#Python:_Arbitrary_precision_Decimal which uses ideas from the floating point case above.

In general
  • Show all output on this page.
  • If the floating point calculations are used then answers may depend on the hardware platform or compiler/interpreter tool-chain used for a language.
  • Slight deviations from the task description should be explained and the the subsequent difference in output explained.
  • All examples should have constants chosen to clearly show the benefit of Kahan summing!


C#

This example is incomplete. If decimal precision cannot be set then this should be noted as well as the deviation in expected output explained. Please ensure that it meets all task requirements and remove this message.

<lang csharp>using System.Collections.Generic; using System.Console; using System.Linq;

namespace KahanSummation {

   static class EnumerableDecimalExtensions
   {
       internal static decimal KahanSum(this IEnumerable<decimal> numbers)
       {
           var sum = 0m;
           var c = 0m;
           foreach (var number in numbers)
           {
               var y = number + c;
               var t = sum + y;
               c = (t - sum) - y;
               sum = t;
           }
           return sum;
       }
   }
   sealed class Program
   {
       static void Main()
       {
           var input = new[]
           {
               10000.0m,
               3.14159m,
               2.71828m
           };
           WriteLine("Left associative:\t" + input.Sum());
           WriteLine("Kahan summation: \t" + input.KahanSum());
       }
   }

} </lang>

Output:
Left associative:       10005.85987
Kahan summation:        10005.85987

F#

This example is incomplete. If decimal precision cannot be set then this should be noted as well as the deviation in expected output explained. Please ensure that it meets all task requirements and remove this message.

<lang fsharp>open System

// loop version let kahanSum numbers =

   let mutable sum = 0m
   let mutable c = 0m
   for number in numbers do
       let y = number + c
       let t = sum + y
       c <- (t - sum) - y
       sum <- t
   sum

// recursive version let rec innerKahanSum numbers sum c =

   match numbers with
   | [] -> sum
   | number :: rest -> 
       let y = number + c
       let t = sum + y
       let z = ((t - sum) - y)
       innerKahanSum rest t z
     

let tailRecursiveKahanSum numbers =

   innerKahanSum numbers 0m 0m

// main [<EntryPoint>] let main argv =

   let input = [ 10000.0m; 3.14159m; 2.71828m ]
   printfn "Left associative: \t%M" (input |> List.sum)
   printfn "Kahan summation: \t%M" (input |> kahanSum)
   printfn "Tail recursive: \t%M" (input |> tailRecursiveKahanSum)
   0

</lang>

Output:
Left associative:       10005.85987
Kahan summation:        10005.85987
Tail recursive:         10005.85987

Go

Go has no floating point decimal type. Its floating point types are float64 and float32, implementations of IEEE 754 binary64 and binary32 formats. Alternative task: <lang go>package main

import "fmt"

type float float64

func kahan(s ...float) float {

   var tot, c float
   for _, x := range s {
       y := x - c
       t := tot + y
       c = (t - tot) - y
       tot = t
   }
   return tot

}

func epsilon() float {

   ε := float(1)
   for 1+ε != 1 {
       ε /= 2
   }
   return ε

}

func main() {

   a := float(1)
   b := epsilon()
   c := -b
   fmt.Println("Left associative:", a+b+c)
   fmt.Println("Kahan summation: ", kahan(a, b, c))
   fmt.Println("Epsilon:         ", b)

}</lang>

Output:
Left associative: 0.9999999999999999
Kahan summation:  1
Epsilon:          1.1102230246251565e-16

With float defined as float32,

Left associative: 0.99999994
Kahan summation:  1
Epsilon:          5.9604645e-08

PHP

Script

<lang php><?php // Main task

// 1. Do all arithmetic in decimal to a precision of six digits. // PHP does not have a way of restricting overall precision

// 2. Write a function/method/procedure/subroutine/... to perform Kahan // summation on an ordered collection of numbers, (such as a list of numbers). ) function kahansum($input) {

   $sum = $c = 0;
   foreach($input as $item) {
       $y = $item + $c;
       $t = $sum + $y;
       $c = ($t - $sum) - $y;
       $sum = $t;
   }
   return $sum;

}

// 3. Create the three numbers a, b, c equal to 10000.0, 3.14159, 2.71828 // respectively. $input = array(10000.0, 3.14159, 2.71828); list($a, $b, $c) = $input;

// 4. show that the simple left-to-right summation, equivalent to (a + b) + c // gives an answer of 10005.8 ) $sumabc = ($a + $b) + $c; echo "Main task - Left to right summation: "; echo sprintf("%6.1f", $sumabc).PHP_EOL; // 10005.9

// 5. Show that the Kahan function applied to the sequence of values a, b, c // results in the more precise answer of 10005.9 echo "Main task - Kahan summation: "; echo sprintf("%6.1f", kahansum($input)).PHP_EOL; // 10005.9

// Let's use the substask $epsilon = 1.0; while ((1.0 + $epsilon) != 1.0) {

   $epsilon /= 2.0;

} echo "Trying the subtask".PHP_EOL."epsilon: "; echo $epsilon.PHP_EOL; // 1.1102230246252E-16

$a = 1.0; $b = $epsilon; $c = -$epsilon;

$input = array($a, $b, $c);

// left-to-right summation $sumabc = ($a + $b) + $c; echo "Sub task - Left to right summation: "; echo sprintf("%.1f", $sumabc).PHP_EOL;

// kahan summation echo "Sub task - Kahan summation: "; echo sprintf("%.1f", kahansum($input)).PHP_EOL;

// but, are they really the same or is an artifact? echo "Are the results the same?".PHP_EOL; var_dump((($a + $b) + $c) === kahansum($input));

// ok, then what is the difference? echo "Difference between the operations: "; $diff = (($a + $b) + $c) - kahansum($input); echo $diff.PHP_EOL; </lang>

Script output <lang bash> $ php kahansum.php Main task - Left to right summation: 10005.9 Main task - Kahan summation: 10005.9 Trying the subtask epsilon: 1.1102230246252E-16 Sub task - Left to right summation: 1.0 Sub task - Kahan summation: 1.0 Are the results the same? bool(false) Difference between the operations: 1.1102230246252E-16 </lang>

Python

Python: Decimal

<lang python>>>> from decimal import * >>> >>> getcontext().prec = 6 >>> >>> def kahansum(input):

   summ = c = 0
   for num in input:
       y = num - c
       t = summ + y
       c = (t - summ) - y
       summ = t
   return summ

>>> a, b, c = [Decimal(n) for n in '10000.0 3.14159 2.71828'.split()] >>> a, b, c (Decimal('10000.0'), Decimal('3.14159'), Decimal('2.71828')) >>> >>> (a + b) + c Decimal('10005.8') >>> kahansum([a, b, c]) Decimal('10005.9') >>> >>> >>> sum([a, b, c]) Decimal('10005.8') >>> # it seems Python's sum() doesn't make use of this technique. >>> >>> # More info on the current Decimal context: >>> getcontext() Context(prec=6, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[Inexact, Rounded], traps=[InvalidOperation, DivisionByZero, Overflow]) >>> >>> >>> ## Lets try the simple summation with more precision for comparison >>> getcontext().prec = 20 >>> (a + b) + c Decimal('10005.85987') >>> </lang>

Python: Floats

This was written as proof that the floating-point sub-task could work and is better off displayed, so...

<lang python>>>> eps = 1.0 >>> while 1.0 + eps != 1.0: eps = eps / 2.0


>>> eps 1.1102230246251565e-16 >>> (1.0 + eps) - eps 0.9999999999999999 >>> kahansum([1, eps, -eps]) 1.0 >>> >>> >>> # Info on this implementation of floats >>> import sys >>> sys.float_info sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1) >>> </lang> Note that the kahansum function from the decimal example is duck typed and adjusts to work with the number type used in its argument list.

Note also that the math.fsum function employs an even more precise algorithm for the summation of floating point numbers.

Python: Arbitrary precision Decimal

Some languages have a decimal type, but cannot alter its precision to six digits. This example mimmicks this case byusing the default Python decimal precision and a variant of epsilon finding that divides by ten instead of two. <lang python>>>> from decimal import localcontext, Decimal >>> >>> with localcontext() as ctx: one, ten = Decimal('1.0'), Decimal('10') eps = one while one + eps != one: eps = eps / ten print('eps is:', eps) print('Simple sum is:', (one + eps) - eps) print('Kahan sum is:', kahansum([one, eps, -eps]))


eps is: 1E-28 Simple sum is: 0.9999999999999999999999999999 Kahan sum is: 1.000000000000000000000000000 >>> </lang>

R

<lang r>

  1. Try using the task rules
  1. 1. Do all arithmetic in decimal to a precision of six digits.

options(digits=6)

  1. 2. Write a function/method/procedure/subroutine/... to perform Kahan summation
  2. on an ordered collection of numbers, (such as a list of numbers).

kahansum <- function(x) {

   ks <- 0
   c <- 0
   for(i in 1:length(x)) {
       y <- x[i] - c
       kt <- ks + y
       c = (kt - ks) - y
       ks = kt
   }
   ks

}

  1. 3. Create the three numbers a, b, c equal to 10000.0, 3.14159, 2.71828 respectively.

a <- 10000.0 b <- 3.14159 c <- 2.71828

  1. 4. Show that the simple left-to-right summation, equivalent to (a + b) + c gives an answer of 10005.8

a + b + c

  1. [1] 10005.9
  1. 5. Show that the Kahan function applied to the sequence of values a, b, c results in the more precise answer of 10005.9

input <- c(a, b, c) kahansum(input)

  1. [1] 10005.9

options(digits=6)

  1. as there is no difference let's try the subtask

epsilon = 1.0 while ((1.0 + epsilon) != 1.0) {

   epsilon = epsilon / 2.0

} epsilon

  1. [1] 1.11022e-16

a <- 1.0000000000000011 b <- epsilon c <- -epsilon

  1. left-to-right summation

(a + b) + c

  1. [1] 1
  1. kahan summation

kahansum(c(a, b, c))

  1. [1] 1
  1. but, are they really the same or is an artifact?

(a + b) + c == kahansum(c(a, b, c))

  1. FALSE
  1. ok, then what is the difference?

((a + b) + c) - kahansum(c(a, b, c))

  1. [1] 2.22045e-16

</lang>