Steffensen's method

Steffensen's method is a numerical method to find the roots of functions. It is similar to Newton's method, but, unlike Newton's, does not require derivatives. Like Newton's, however, it is prone towards not finding a solution at all.

In this task we will do a root-finding problem that illustrates both the advantage—that no derivatives are required—and the disadvantage—that the method often gives no answer. We will try to find intersection points of two Bézier curves.

Steffensen's method

We will be using the variant of Steffensen's method that employs Aitken's extrapolation. Aitken's extrapolation is illustrated by the following ATS function:

fun aitken
     (f  : double -> double,   (* function double to double *)
      p0 : double)             (* initial fixed point estimate *)
   : double =
 let
   val p1 = f(p0)
   val p2 = f(p1)
   val p1m0 = p1 - p0
 in
   p0 - (p1m0 * p1m0) / (p2 - (2.0 * p1) + p0)
 end

The return value is a function of $p0$ , $p1=f(p0)$ , and $p2=f(f(p0))$ . What one is trying to find is a so-called fixed point of $f$ : a value of $p$ such that $f(p)=p$ . Our implementation of Steffensen's method will be to repeat Aitken's extrapolation until either a tolerance is satisfied or too many iterations have been executed:

fun steffensen_aitken     (* finds fixed point p such that f(p) = p *)
     (f       : double -> double, (* function double to double *)
      pinit   : double,           (* initial estimate *)
      tol     : double,           (* tolerance *)
      maxiter : int)              (* maximum number of iterations *)
   : Option (double) =     (* return a double, IF tolerance is met *)
 let
   var p0   : double = pinit
   var p    : double = aitken (f, p0)
   var iter : int = 1          (* iteration counter *)
 in
   while (abs (p - p0) > tol andalso iter < maxiter)
     begin
       p0 := p;
       p := aitken (f, p0);
       iter := iter + 1
     end;
   if abs (p - p0) > tol then None () else Some (p)
 end

The function steffensen_aitken will find a fixed point for us, but how can we use that to find a root? In particular, suppose one wants to find $t$ such that $g(t)=0$ . Then what one can do (and what we will try to do) is find a fixed point $p$ of $f(t)=g(t)+t$ . In that case, $p=f(p)=g(p)+p$ , and therefore $g(p)=0$ .

The problem we wish to solve

Suppose we have two quadratic planar Bézier curves, with control points $(-1,0),(0,10),(1,0)$ and $(2,1),(-8,2),(2,3)$ , respectively. These two parabolas are shown in the following figure. As you can see, they intersect at four points.

Two parabolas in the plane that intersect at four points, plus their Bézier control polygons.

We want to try to find the points of intersection.

The method we will use is implicitization. In this method, one first rewrites one of the curves as an implicit equation in $x$ and $y$ . For this we will use the parabola that is convex upwards: it has implicit equation $5x^{2}+y-5=0$ . Then what one does is plug the parametric equations of the other curve into the implicit equation. The resulting equation is $5(x(t))^{2}+y(t)-5=0$ , where $t$ is the independent parameter for the curve that is convex leftwards. After expansion, this will be a degree-4 equation in $t$ . Find its four roots and you have found the points of intersection of the two curves.

That is easier said than done.

There are excellent ways to solve this problem, but we will not be using those. Our purpose, after all, is to illustrate Steffensen's method: both its advantages and its drawbacks. Let us look at an advantage: to use Steffensen's method (which requires only function values, not derivatives), we do not actually have to expand that polynomial in $t$ . Instead, we can evaluate $x(t)$ and $y(t)$ and plug the resulting numbers into the implicit equation. What is more, we do not even need to write $x(t)$ and $y(t)$ as polynomials, but instead can evaluate them directly from their control points, using de Casteljau's algorithm:

fun de_casteljau
     (c0 : double,         (* control point coordinates (one axis) *)
      c1 : double,
      c2 : double,
      t  : double)             (* the independent parameter *)
   : double =                  (* value of x(t) or y(t) *)
 let
   val s = 1.0 - t
   val c01 = (s * c0) + (t * c1)
   val c12 = (s * c1) + (t * c2)
   val c012 = (s * c01) + (t * c12)
 in
   c012
 end

fun x_convex_left_parabola (t : double) : double =
 de_casteljau (2.0, ~8.0, 2.0, t)

fun y_convex_left_parabola (t : double) : double =
 de_casteljau (1.0, 2.0, 3.0, t)

Plugging $x(t)$ and $y(t)$ into the implicit equation, and writing $f(t)$ as the function whose fixed points are to be found:

fun implicit_equation (x : double, y : double) : double =
 (5.0 * x * x) + y - 5.0

fun f (t : double) : double = (* find fixed points of this function *)
 let
   val x = x_convex_left_parabola (t)
   val y = y_convex_left_parabola (t)
 in
   implicit_equation (x, y) + t
 end

What is left to be done is simple, but tricky: use the steffensen_aitken function to find those fixed points. This is where a huge disadvantage of Steffensen's method (shared with Newton's method) will be illustrated. What is likely to happen is that only some of the intersection points will be found. Furthermore, for most of our initial estimates, Steffensen's method will not give us an answer at all.

The task

Use the methods described above to try to find intersection points of the two parabolas. For initial estimates, use ${\it {pinit}}=t_{0}=0.0,0.1,0.2,\cdots ,0.9,1.0$ . Choose reasonable settings for tolerance and maximum iterations. (The ATS example has 0.00000001 and 1000, respectively.) For each initial estimate, if tolerance was not met before maximum iterations was reached, print that there was no answer. On the other hand, if there was an answer, plug the resulting value of $t$ into $x(t)$ and $y(t)$ to find the intersection point $(x,y)$ . Check that that this answer is correct, by plugging $(x,y)$ into the implicit equation. If it is not correct, print that Steffensen's method gave a spurious answer. Otherwise print the value of $(x,y)$ .

(The ATS example has purposely been written in a fashion to be readable as if it were pseudocode. You may use it as a reference implementation.)

ATS

#include "share/atspre_staload.hats"

fun aitken                      (* Aitken's extrapolation *)
      (f  : double -> double,   (* function double to double *)
       p0 : double)             (* initial fixed point estimate *)
    : double =
  let
    val p1 = f(p0)
    val p2 = f(p1)
    val p1m0 = p1 - p0
  in
    p0 - (p1m0 * p1m0) / (p2 - (2.0 * p1) + p0)
  end

fun steffensen_aitken     (* finds fixed point p such that f(p) = p *)
      (f       : double -> double, (* function double to double *)
       pinit   : double,           (* initial estimate *)
       tol     : double,           (* tolerance *)
       maxiter : int)              (* maximum number of iterations *)
    : Option (double) =     (* return a double, IF tolerance is met *)
  let
    var p0   : double = pinit
    var p    : double = aitken (f, p0)
    var iter : int = 1          (* iteration counter *)
  in
    while (abs (p - p0) > tol andalso iter < maxiter)
      begin
        p0 := p;
        p := aitken (f, p0);
        iter := iter + 1
      end;
    if abs (p - p0) > tol then None () else Some (p)
  end

fun de_casteljau
      (c0 : double,         (* control point coordinates (one axis) *)
       c1 : double,
       c2 : double,
       t  : double)             (* the independent parameter *)
    : double =                  (* value of x(t) or y(t) *)
  let
    val s = 1.0 - t
    val c01 = (s * c0) + (t * c1)
    val c12 = (s * c1) + (t * c2)
    val c012 = (s * c01) + (t * c12)
  in
    c012
  end

fun x_convex_left_parabola (t : double) : double =
  de_casteljau (2.0, ~8.0, 2.0, t)

fun y_convex_left_parabola (t : double) : double =
  de_casteljau (1.0, 2.0, 3.0, t)

fun implicit_equation (x : double, y : double) : double =
  (5.0 * x * x) + y - 5.0

fun f (t : double) : double = (* find fixed points of this function *)
  let
    val x = x_convex_left_parabola (t)
    val y = y_convex_left_parabola (t)
  in
    implicit_equation (x, y) + t
  end

implement main0 () =
  let
    var i : int = 0
    var t0 : double = 0.0
  in
    while (not (i = 11))
      begin
        begin
          print! ("t0 = ", t0, " : ");
          case steffensen_aitken (f, t0, 0.00000001, 1000) of
          | None () => println! ("no answer")
          | Some (t) =>
            let
              val x = x_convex_left_parabola (t)
              val y = y_convex_left_parabola (t)
            in
              if abs (implicit_equation (x, y)) <= 0.000001 then
                println! ("intersection at (", x, ", ", y, ")")
              else
                (* In my experience, it is possible for the algorithm
                   to achieve tolerance and yet give a spurious
                   answer. Exploration of this phenomenon is beyond
                   the scope of this task. Such spurious answers are
                   easy to discard. *)
                println! ("spurious solution")
            end
        end;
        i := i + 1;
        t0 := t0 + 0.1
      end
  end

Output:

t0 = 0.000000 : no answer
t0 = 0.100000 : intersection at (0.881025, 1.118975)
t0 = 0.200000 : no answer
t0 = 0.300000 : no answer
t0 = 0.400000 : no answer
t0 = 0.500000 : no answer
t0 = 0.600000 : no answer
t0 = 0.700000 : no answer
t0 = 0.800000 : no answer
t0 = 0.900000 : intersection at (-0.681025, 2.681025)
t0 = 1.000000 : no answer

C

Translation of: ATS

#include <stdio.h>
#include <math.h>

double aitken(double (*f)(double), double p0) {
    double p1 = f(p0);
    double p2 = f(p1);
    double p1m0 = p1 - p0;
    return p0 - p1m0 * p1m0 / (p2 - 2.0 * p1 + p0);
}

double steffensenAitken(double (*f)(double), double pinit, double tol, int maxiter) {
    double p0 = pinit;
    double p = aitken(f, p0);
    int iter = 1;
    while (fabs(p - p0) > tol && iter < maxiter) {
        p0 = p;
        p = aitken(f, p0);
        ++iter;
    }
    if (fabs(p - p0) > tol) return nan("");
    return p;
}

double deCasteljau(double c0, double c1, double c2, double t) {
    double s = 1.0 - t;
    double c01 = s * c0 + t * c1;
    double c12 = s * c1 + t * c2;
    return s * c01 + t * c12;
}

double xConvexLeftParabola(double t) {
    return deCasteljau(2.0, -8.0, 2.0, t);
}

double yConvexRightParabola(double t) {
    return deCasteljau(1.0, 2.0, 3.0, t);
}

double implicitEquation(double x, double y) {
    return 5.0 * x * x + y - 5.0;
}

double f(double t) {
    double x = xConvexLeftParabola(t);
    double y = yConvexRightParabola(t);
    return implicitEquation(x, y) + t;
}

int main() {
    double t0 = 0.0, t, x, y;
    int i;
    for (i = 0; i < 11; ++i) {
        printf("t0 = %0.1f : ", t0);
        t = steffensenAitken(f, t0, 0.00000001, 1000);
        if (isnan(t)) {
            printf("no answer\n");
        } else {
            x = xConvexLeftParabola(t);
            y = yConvexRightParabola(t);
            if (fabs(implicitEquation(x, y)) <= 0.000001) {
                printf("intersection at (%f, %f)\n", x, y);
            } else {
                printf("spurious solution\n");
            }
        }
        t0 += 0.1;
    }
    return 0;
}

Output:

t0 = 0.0 : no answer
t0 = 0.1 : intersection at (0.881025, 1.118975)
t0 = 0.2 : no answer
t0 = 0.3 : no answer
t0 = 0.4 : no answer
t0 = 0.5 : no answer
t0 = 0.6 : no answer
t0 = 0.7 : no answer
t0 = 0.8 : no answer
t0 = 0.9 : intersection at (-0.681025, 2.681025)
t0 = 1.0 : no answer

C++

Translation of: C

#include <iostream>
#include <cmath>
#include <iomanip>

using namespace std;

double aitken(double (*f)(double), double p0) {
    double p1 = f(p0);
    double p2 = f(p1);
    double p1m0 = p1 - p0;
    return p0 - p1m0 * p1m0 / (p2 - 2.0 * p1 + p0);
}

double steffensenAitken(double (*f)(double), double pinit, double tol, int maxiter) {
    double p0 = pinit;
    double p = aitken(f, p0);
    int iter = 1;
    while (fabs(p - p0) > tol && iter < maxiter) {
        p0 = p;
        p = aitken(f, p0);
        ++iter;
    }
    if (fabs(p - p0) > tol) return nan("");
    return p;
}

double deCasteljau(double c0, double c1, double c2, double t) {
    double s = 1.0 - t;
    double c01 = s * c0 + t * c1;
    double c12 = s * c1 + t * c2;
    return s * c01 + t * c12;
}

double xConvexLeftParabola(double t) {
    return deCasteljau(2.0, -8.0, 2.0, t);
}

double yConvexRightParabola(double t) {
    return deCasteljau(1.0, 2.0, 3.0, t);
}

double implicitEquation(double x, double y) {
    return 5.0 * x * x + y - 5.0;
}

double f(double t) {
    double x = xConvexLeftParabola(t);
    double y = yConvexRightParabola(t);
    return implicitEquation(x, y) + t;
}

int main() {
    double t0 = 0.0, t, x, y;
    int i;
    for (i = 0; i < 11; ++i) {
        cout << "t0 = " << setprecision(1) << t0 << " : ";
        t = steffensenAitken(f, t0, 0.00000001, 1000);
        if (isnan(t)) {
            cout << "no answer << endl;
        } else {
            x = xConvexLeftParabola(t);
            y = yConvexRightParabola(t);
            if (fabs(implicitEquation(x, y)) <= 0.000001) {
                cout << "intersection at (" << x << ", " << y << ")" << endl;
            } else {
                cout << "spurious solution" << endl;
            }
        }
        t0 += 0.1;
    }
    return 0;
}

Output:

Same as C example.

RATFOR

Translation of: ATS

function aitken (f, p0)
  implicit none

  double precision f, p0, aitken
  external f

  double precision p1, p2

  p1 = f(p0)
  p2 = f(p1)
  aitken = p0 - (p1 - p0)**2 / (p2 - (2.0d0 * p1) + p0)
end

subroutine steff (f, pinit, tol, mxiter, ierr, pfinal)
  implicit none

  double precision f, pinit, tol, pfinal
  integer mxiter, ierr
  external f

  double precision p0, p, aitken
  integer iter
  external aitken

  p0 = pinit
  p = aitken (f, p0)
  iter = 1
  while (abs (p - p0) > tol && iter < mxiter)
    {
      p0 = p
      p = aitken (f, p0)
      iter = iter + 1
    }
  if (abs (p - p0) > tol)
    ierr = -1
  else
    {
      ierr = 0
      pfinal = p
    }
end

function dcstlj (c0, c1, c2, t)
  implicit none

  double precision c0, c1, c2, t, dcstlj

  double precision s, c01, c12, c012

  s = 1.0d0 - t
  c01 = (s * c0) + (t * c1)
  c12 = (s * c1) + (t * c2)
  c012 = (s * c01) + (t * c12)
  dcstlj = c012
end

function x (t)
  implicit none
  double precision x, t, dcstlj
  external dcstlj
  x = dcstlj (2.0d0, -8.0d0, 2.0d0, t)
end

function y (t)
  implicit none
  double precision y, t, dcstlj
  external dcstlj
  y = dcstlj (1.0d0, 2.0d0, 3.0d0, t)
end

function impleq (x, y)
  implicit none
  double precision x, y, impleq
  impleq = (5.0d0 * x * x) + y - 5.0d0
end

function f (t)
  implicit none
  double precision f, t, x, y, impleq
  external x, y, impleq
  f = impleq (x(t), y(t)) + t
end

program RCstef
  implicit none

  double precision x, y, impleq, f
  external x, y, impleq, f, steff

  double precision t0, t
  integer i, ierr

  t0 = 0.0d0
  for (i = 0; i != 11; i = i + 1)
    {
      call steff (f, t0, 0.00000001d0, 1000, ierr, t)
      if (ierr < 0)
        write (*,*) "t0 = ", t0, " : no answer"
      else if (abs (impleq (x(t), y(t))) <= 0.000001)
        write (*,*) "t0 = ", t0, " : intersection at (", _
                    x(t), ", ", y(t), ")"
      else
        write (*,*) "t0 = ", t0, " : spurious solution"
      t0 = t0 + 0.1d0
    }
end

Output:

Compiled with f2c (which accounts for the ink-saving numerals):

 t0 =   0. : no answer
 t0 =   .1 : intersection at (  .881024968,   1.11897503)
 t0 =   .2 : no answer
 t0 =   .3 : no answer
 t0 =   .4 : no answer
 t0 =   .5 : no answer
 t0 =   .6 : no answer
 t0 =   .7 : no answer
 t0 =   .8 : no answer
 t0 =   .9 : intersection at ( -.681024968,   2.68102497)
 t0 =   1. : no answer

Wren

Translation of: ATS

Library: Wren-fmt

Wren's only built-in numeric type is a 'double' (in C) so no surprise that we're getting the same answers as ATS when using the same tolerances and maximum number of iterations.

import "./fmt" for Fmt

var aitken = Fn.new { |f, p0|
    var p1 = f.call(p0)
    var p2 = f.call(p1)
    var p1m0 = p1 - p0
    return p0 - p1m0 * p1m0 / (p2 - 2 * p1 + p0)
}

var steffensenAitken = Fn.new { |f, pinit, tol, maxiter|
    var p0 = pinit
    var p = aitken.call(f, p0)
    var iter = 1
    while ((p - p0).abs > tol && iter < maxiter) {
        p0 = p
        p = aitken.call(f, p0)
        iter = iter + 1
    }
    if ((p - p0).abs > tol) return null
    return p
}

var deCasteljau = Fn.new { |c0, c1, c2, t|
    var s = 1 - t
    var c01 = s * c0 + t * c1
    var c12 = s * c1 + t * c2
    return s * c01 + t * c12
}

var xConvexLeftParabola  = Fn.new { |t| deCasteljau.call(2, -8, 2, t) }
var yConvexRightParabola = Fn.new { |t| deCasteljau.call(1,  2, 3, t) }

var implicitEquation = Fn.new { |x, y|  5 * x * x + y - 5 }

var f = Fn.new { |t|
    var x = xConvexLeftParabola.call(t)
    var y = yConvexRightParabola.call(t)
    return implicitEquation.call(x, y) + t
}

var t0 = 0
for (i in 0..10) {
    Fmt.write("t0 = $0.1f : ", t0)
    var t = steffensenAitken.call(f, t0, 0.00000001, 1000)
    if (!t) {
        Fmt.print("no answer")
    } else {
        var x = xConvexLeftParabola.call(t)
        var y = yConvexRightParabola.call(t)
        if (implicitEquation.call(x, y).abs <= 0.000001) {
            Fmt.print("intersection at ($f, $f)", x, y)
        } else {
            Fmt.print("spurious solution")
        }
    }
    t0 = t0 + 0.1
}

Output:

t0 = 0.0 : no answer
t0 = 0.1 : intersection at (0.881025, 1.118975)
t0 = 0.2 : no answer
t0 = 0.3 : no answer
t0 = 0.4 : no answer
t0 = 0.5 : no answer
t0 = 0.6 : no answer
t0 = 0.7 : no answer
t0 = 0.8 : no answer
t0 = 0.9 : intersection at (-0.681025, 2.681025)
t0 = 1.0 : no answer