Addition-chain exponentiation

From Rosetta Code
Addition-chain exponentiation is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.
This page uses content from Wikipedia. The original article was at Addition-chain exponentiation. The list of authors can be seen in the page history. As with Rosetta Code, the text of Wikipedia is available under the GNU FDL. (See links for details on variance)

In cases of special objects (such as with matrices) the operation of multiplication can be excessively expensive. In these cases the operation of multiplication should be avoided or reduced to a minimum.

In mathematics and computer science, optimal addition-chain exponentiation is a method of exponentiation by positive integer powers that requires a minimal number of multiplications. It works by creating a shortest addition chain that generates the desired exponent. Each exponentiation in the chain can be evaluated by multiplying two of the earlier exponentiation results. More generally, addition-chain exponentiation may also refer to exponentiation by non-minimal addition chains constructed by a variety of algorithms (since a shortest addition chain is very difficult to find).

The shortest addition-chain algorithm requires no more multiplications than binary exponentiation and usually less. The first example of where it does better is for , where the binary method needs six multiplies but a shortest addition chain requires only five:

(binary, 6 multiplications)
(shortest addition chain, 5 multiplications)

On the other hand, the addition-chain method is much more complicated, since the determination of a shortest addition chain seems quite difficult: no efficient optimal methods are currently known for arbitrary exponents, and the related problem of finding a shortest addition chain for a given set of exponents has been proven NP-complete.

Table demonstrating how to do Exponentiation using Addition Chains
Number of
Multiplications
Actual
Exponentiation
Specific implementation of
Addition Chains to do Exponentiation
0 a1 a
1 a2 a × a
2 a3 a × a × a
2 a4 (a × a→b) × b
3 a5 (a × a→b) × b × a
3 a6 (a × a→b) × b × b
4 a7 (a × a→b) × b × b × a
3 a8 ((a × a→b) × b→d) × d
4 a9 (a × a × a→c) × c × c
4 a10 ((a × a→b) × b→d) × d × b
5 a11 ((a × a→b) × b→d) × d × b × a
4 a12 ((a × a→b) × b→d) × d × d
5 a13 ((a × a→b) × b→d) × d × d × a
5 a14 ((a × a→b) × b→d) × d × d × b
5 a15 ((a × a→b) × b × a→e) × e × e
4 a16 (((a × a→b) × b→d) × d→h) × h

The number of multiplications required follows this sequence: 0, 1, 2, 2, 3, 3, 4, 3, 4, 4, 5, 4, 5, 5, 5, 4, 5, 5, 6, 5, 6, 6, 6, 5, 6, 6, 6, 6, 7, 6, 7, 5, 6, 6, 7, 6, 7, 7, 7, 6, 7, 7, 7, 7, 7, 7, 8, 6, 7, 7, 7, 7, 8, 7, 8, 7, 8, 8, 8, 7, 8, 8, 8, 6, 7, 7, 8, 7, 8, 8, 9, 7, 8, 8, 8, 8, 8, 8, 9, 7, 8, 8, 8, 8, 8, 8, 9, 8, 9, 8, 9, 8, 9, 9, 9, 7, 8, 8, 8, 8...

This sequence can be found at: http://oeis.org/A003313

Task requirements:

Using the following values: and

Repeat task Matrix-exponentiation operator, except use addition-chain exponentiation to better calculate:

, and .

As an easier alternative to doing the matrix manipulation above, generate the addition-chains for 12509, 31415 and 27182 and use addition-chain exponentiation to calculate these two equations:

  • 1.0000220644541631415
  • 1.0000255005525127182

Also: Display a count of how many multiplications were done in each case.

Note: There are two ways to approach this task:

  • Brute force - try every permutation possible and pick one with the least number of multiplications. If the brute force is a simpler algorithm, then present it as a subtask under the subtitle "Brute force", eg ===Brute Force===.
  • Some clever algorithm - the wikipedia page has some hints, subtitle the code with the name of algorithm.

Note: Binary exponentiation does not usually produce the best solution. Provide only optimal solutions.

Kudos (κῦδος) for providing a routine that generate sequence A003313 in the output.


Also, see the Rosetta Code task:   [http://rosettacode.org/wiki/Knuth%27s_power_tree Knuth's power tree].

C

Using complex instead of matrix. Requires Achain.c. It takes a long while to compute the shortest addition chains, such that if you don't have the chain lengths precomputed and stored somewhere, you are probably better off with a binary chain (normally not shortest but very simple to calculate) whatever you intend to use the chains for.

#include <stdio.h>

#include "achain.c" /* not common practice */

/* don't have a C99 compiler atm */
typedef struct {double u, v;} cplx;

inline cplx c_mul(cplx a, cplx b)
{
	cplx c;
	c.u = a.u * b.u - a.v * b.v;
	c.v = a.u * b.v + a.v * b.u;
	return c;
}

cplx chain_expo(cplx x, int n)
{
	int i, j, k, l, e[32];
	cplx v[32];

	l = seq(n, 0, e);

	puts("Exponents:");
	for (i = 0; i <= l; i++)
		printf("%d%c", e[i], i == l ? '\n' : ' ');

	v[0] = x; v[1] = c_mul(x, x);
	for (i = 2; i <= l; i++) {
		for (j = i - 1; j; j--) {
			for (k = j; k >= 0; k--) {
				if (e[k] + e[j] < e[i]) break;
				if (e[k] + e[j] > e[i]) continue;
				v[i] = c_mul(v[j], v[k]);
				j = 1;
				break;
			}
		}
	}
	printf("(%f + i%f)^%d = %f + i%f\n",
		x.u, x.v, n, v[l].u, v[l].v);

	return x;
}

int bin_len(int n)
{
	int r, o;
	for (r = o = -1; n; n >>= 1, r++)
		if (n & 1) o++;
	return r + o;
}

int main()
{
	cplx	r1 = {1.0000254989, 0.0000577896},
		r2 = {1.0000220632, 0.0000500026};
	int n1 = 27182, n2 = 31415, i;

	init();
	puts("Precompute chain lengths");
	seq_len(n2);

	chain_expo(r1, n1);
	chain_expo(r2, n2);
	puts("\nchain lengths: shortest binary");
	printf("%14d %7d %7d\n", n1, seq_len(n1), bin_len(n1));
	printf("%14d %7d %7d\n", n2, seq_len(n2), bin_len(n2));
	for (i = 1; i < 100; i++)
		printf("%14d %7d %7d\n", i, seq_len(i), bin_len(i));
	return 0;
}

output

...
Exponents:
1 2 4 8 10 18 28 46 92 184 212 424 848 1696 3392 6784 13568 27136 27182
(1.000025 + i0.000058)^27182 = -0.000001 + i2.000001
Exponents:
1 2 4 8 16 17 33 49 98 196 392 784 1568 3136 6272 6289 12561 25122 31411 31415
(1.000022 + i0.000050)^31415 = -0.000001 + i2.000000

chain lengths: shortest binary
         27182      18      21
         31415      19      24
             1       0       0
             2       1       1
             3       2       2
             4       2       2
      ...
            89       9       9
            90       8       9
            91       9      10
            92       8       9
            93       9      10
      ...

Go

Translation of: C

Though adjusted to deal with matrices rather than complex numbers.

Calculating A ^ (m * n) from scratch using this method would take 'for ever' so I've calculated it instead as (A ^ m) ^ n.

package main

import (
    "fmt"
    "math"
)

const (
    N    = 32
    NMAX = 40000
)

var (
    u     = [N]int{0: 1, 1: 2} // upper bounds
    l     = [N]int{0: 1, 1: 2} // lower bounds
    out   = [N]int{}
    sum   = [N]int{}
    tail  = [N]int{}
    cache = [NMAX + 1]int{2: 1}
    known = 2
    stack = 0
    undo  = [N * N]save{}
)

type save struct {
    p *int
    v int
}

func replace(x *[N]int, i, n int) {
    undo[stack].p = &x[i]
    undo[stack].v = x[i]
    x[i] = n
    stack++
}

func restore(n int) {
    for stack > n {
        stack--
        *undo[stack].p = undo[stack].v
    }
}

/* lower and upper bounds */
func lower(n int, up *int) int {
    if n <= 2 || (n <= NMAX && cache[n] != 0) {
        if up != nil {
            *up = cache[n]
        }
        return cache[n]
    }
    i, o := -1, 0
    for ; n != 0; n, i = n>>1, i+1 {
        if n&1 != 0 {
            o++
        }
    }
    if up != nil {
        i--
        *up = o + i
    }
    for {
        i++
        o >>= 1
        if o == 0 {
            break
        }
    }
    if up == nil {
        return i
    }
    for o = 2; o*o < n; o++ {
        if n%o != 0 {
            continue
        }
        q := cache[o] + cache[n/o]
        if q < *up {
            *up = q
            if q == i {
                break
            }
        }
    }
    if n > 2 {
        if *up > cache[n-2]+1 {
            *up = cache[n-1] + 1
        }
        if *up > cache[n-2]+1 {
            *up = cache[n-2] + 1
        }
    }
    return i
}

func insert(x, pos int) bool {
    save := stack
    if l[pos] > x || u[pos] < x {
        return false
    }
    if l[pos] == x {
        goto replU
    }
    replace(&l, pos, x)
    for i := pos - 1; u[i]*2 < u[i+1]; i-- {
        t := l[i+1] + 1
        if t*2 > u[i] {
            goto bail
        }
        replace(&l, i, t)
    }
    for i := pos + 1; l[i] <= l[i-1]; i++ {
        t := l[i-1] + 1
        if t > u[i] {
            goto bail
        }
        replace(&l, i, t)
    }
replU:
    if u[pos] == x {
        return true
    }
    replace(&u, pos, x)
    for i := pos - 1; u[i] >= u[i+1]; i-- {
        t := u[i+1] - 1
        if t < l[i] {
            goto bail
        }
        replace(&u, i, t)
    }
    for i := pos + 1; u[i] > u[i-1]*2; i++ {
        t := u[i-1] * 2
        if t < l[i] {
            goto bail
        }
        replace(&u, i, t)
    }
    return true
bail:
    restore(save)
    return false
}

func try(p, q, le int) bool {
    pl := cache[p]
    if pl >= le {
        return false
    }
    ql := cache[q]
    if ql >= le {
        return false
    }
    var pu, qu int
    for pl < le && u[pl] < p {
        pl++
    }
    for pu = pl - 1; pu < le-1 && u[pu+1] >= p; pu++ {

    }
    for ql < le && u[ql] < q {
        ql++
    }
    for qu = ql - 1; qu < le-1 && u[qu+1] >= q; qu++ {

    }
    if p != q && pl <= ql {
        pl = ql + 1
    }
    if pl > pu || ql > qu || ql > pu {
        return false
    }
    if out[le] == 0 {
        pu = le - 1
        pl = pu
    }
    ps := stack
    for ; pu >= pl; pu-- {
        if !insert(p, pu) {
            continue
        }
        out[pu]++
        sum[pu] += le
        if p != q {
            qs := stack
            j := qu
            if j >= pu {
                j = pu - 1
            }
            for ; j >= ql; j-- {
                if !insert(q, j) {
                    continue
                }
                out[j]++
                sum[j] += le
                tail[le] = q
                if seqRecur(le - 1) {
                    return true
                }
                restore(qs)
                out[j]--
                sum[j] -= le
            }
        } else {
            out[pu]++
            sum[pu] += le
            tail[le] = p
            if seqRecur(le - 1) {
                return true
            }
            out[pu]--
            sum[pu] -= le
        }
        out[pu]--
        sum[pu] -= le
        restore(ps)
    }
    return false
}

func seqRecur(le int) bool {
    n := l[le]
    if le < 2 {
        return true
    }
    limit := n - 1
    if out[le] == 1 {
        limit = n - tail[sum[le]]
    }
    if limit > u[le-1] {
        limit = u[le-1]
    }
    // Try to break n into p + q, and see if we can insert p, q into
    // list while satisfying bounds.
    p := limit
    for q := n - p; q <= p; q, p = q+1, p-1 {
        if try(p, q, le) {
            return true
        }
    }
    return false
}

func seq(n, le int, buf []int) int {
    if le == 0 {
        le = seqLen(n)
    }
    stack = 0
    l[le], u[le] = n, n
    for i := 0; i <= le; i++ {
        out[i], sum[i] = 0, 0
    }
    for i := 2; i < le; i++ {
        l[i] = l[i-1] + 1
        u[i] = u[i-1] * 2
    }
    for i := le - 1; i > 2; i-- {
        if l[i]*2 < l[i+1] {
            l[i] = (1 + l[i+1]) / 2
        }
        if u[i] >= u[i+1] {
            u[i] = u[i+1] - 1
        }
    }
    if !seqRecur(le) {
        return 0
    }
    if buf != nil {
        for i := 0; i <= le; i++ {
            buf[i] = u[i]
        }
    }
    return le
}

func seqLen(n int) int {
    if n <= known {
        return cache[n]
    }
    // Need all lower n to compute sequence.
    for known+1 < n {
        seqLen(known + 1)
    }
    var ub int
    lb := lower(n, &ub)
    for lb < ub && seq(n, lb, nil) == 0 {
        lb++
    }
    known = n
    if n&1023 == 0 {
        fmt.Printf("Cached %d\n", known)
    }
    cache[n] = lb
    return lb
}

func binLen(n int) int {
    r, o := -1, -1
    for ; n != 0; n, r = n>>1, r+1 {
        if n&1 != 0 {
            o++
        }
    }
    return r + o
}

type(
    vector = []float64
    matrix []vector
)

func (m1 matrix) mul(m2 matrix) matrix {
    rows1, cols1 := len(m1), len(m1[0])
    rows2, cols2 := len(m2), len(m2[0])
    if cols1 != rows2 {
        panic("Matrices cannot be multiplied.")
    }
    result := make(matrix, rows1)
    for i := 0; i < rows1; i++ {
        result[i] = make(vector, cols2)
        for j := 0; j < cols2; j++ {
            for k := 0; k < rows2; k++ {
                result[i][j] += m1[i][k] * m2[k][j]
            }
        }
    }
    return result
}

func (m matrix) pow(n int, printout bool) matrix {
    e := make([]int, N)
    var v [N]matrix
    le := seq(n, 0, e)
    if printout {
        fmt.Println("Addition chain:")
        for i := 0; i <= le; i++ {
            c := ' '
            if i == le {
                c = '\n'
            }
            fmt.Printf("%d%c", e[i], c)
        }
    }
    v[0] = m
    v[1] = m.mul(m)
    for i := 2; i <= le; i++ {
        for j := i - 1; j != 0; j-- {
            for k := j; k >= 0; k-- {
                if e[k]+e[j] < e[i] {
                    break
                }
                if e[k]+e[j] > e[i] {
                    continue
                }
                v[i] = v[j].mul(v[k])
                j = 1
                break
            }
        }
    }
    return v[le]
}

func (m matrix) print() {
    for _, v := range m {
        fmt.Printf("% f\n", v)
    }
    fmt.Println()
}

func main() {
    m := 27182
    n := 31415
    fmt.Println("Precompute chain lengths:")
    seqLen(n)
    rh := math.Sqrt(0.5)
    mx := matrix{
        {rh, 0, rh, 0, 0, 0},
        {0, rh, 0, rh, 0, 0},
        {0, rh, 0, -rh, 0, 0},
        {-rh, 0, rh, 0, 0, 0},
        {0, 0, 0, 0, 0, 1},
        {0, 0, 0, 0, 1, 0},
    }
    fmt.Println("\nThe first 100 terms of A003313 are:")
    for i := 1; i <= 100; i++ {
        fmt.Printf("%d ", seqLen(i))
        if i%10 == 0 {
            fmt.Println()
        }
    }
    exs := [2]int{m, n}
    mxs := [2]matrix{}
    for i, ex := range exs {
        fmt.Println("\nExponent:", ex)
        mxs[i] = mx.pow(ex, true)
        fmt.Printf("A ^ %d:-\n\n", ex)
        mxs[i].print()
        fmt.Println("Number of A/C multiplies:", seqLen(ex))
        fmt.Println("  c.f. Binary multiplies:", binLen(ex))
    }
    fmt.Printf("\nExponent: %d x %d = %d\n", m, n, m*n)
    fmt.Printf("A ^ %d = (A ^ %d) ^ %d:-\n\n", m*n, m, n)   
    mx2 := mxs[0].pow(n, false)
    mx2.print()
}
Output:
Precompute chain lengths:
Cached 1024
Cached 2048
Cached 3072
....
Cached 28672
Cached 29696
Cached 30720

The first 100 terms of A003313 are:
0 1 2 2 3 3 4 3 4 4 
5 4 5 5 5 4 5 5 6 5 
6 6 6 5 6 6 6 6 7 6 
7 5 6 6 7 6 7 7 7 6 
7 7 7 7 7 7 8 6 7 7 
7 7 8 7 8 7 8 8 8 7 
8 8 8 6 7 7 8 7 8 8 
9 7 8 8 8 8 8 8 9 7 
8 8 8 8 8 8 9 8 9 8 
9 8 9 9 9 7 8 8 8 8 

Exponent: 27182
Addition chain:
1 2 4 8 10 18 28 46 92 184 212 424 848 1696 3392 6784 13568 27136 27182
A ^ 27182:-

[-0.500000 -0.500000 -0.500000  0.500000  0.000000  0.000000]
[ 0.500000 -0.500000 -0.500000 -0.500000  0.000000  0.000000]
[-0.500000 -0.500000  0.500000 -0.500000  0.000000  0.000000]
[ 0.500000 -0.500000  0.500000  0.500000  0.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  1.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  0.000000  1.000000]

Number of A/C multiplies: 18
  c.f. Binary multiplies: 21

Exponent: 31415
Addition chain:
1 2 4 8 16 17 33 49 98 196 392 784 1568 3136 6272 6289 12561 25122 31411 31415
A ^ 31415:-

[ 0.707107  0.000000  0.000000 -0.707107  0.000000  0.000000]
[ 0.000000  0.707107  0.707107  0.000000  0.000000  0.000000]
[ 0.707107  0.000000  0.000000  0.707107  0.000000  0.000000]
[ 0.000000  0.707107 -0.707107  0.000000  0.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  0.000000  1.000000]
[ 0.000000  0.000000  0.000000  0.000000  1.000000  0.000000]

Number of A/C multiplies: 19
  c.f. Binary multiplies: 24

Exponent: 27182 x 31415 = 853922530
A ^ 853922530 = (A ^ 27182) ^ 31415:-

[-0.500000  0.500000 -0.500000  0.500000  0.000000  0.000000]
[-0.500000 -0.500000 -0.500000 -0.500000  0.000000  0.000000]
[-0.500000 -0.500000  0.500000  0.500000  0.000000  0.000000]
[ 0.500000 -0.500000 -0.500000  0.500000  0.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  1.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  0.000000  1.000000]

Below is the original solution which was ruled inadmissible because it uses 'star chains' and is therefore non-optimal.

I think it should nevertheless be retained as it is an interesting approach and there are other solutions to this task which are based on it.

/*
Continued fraction addition chains, as described in "Efficient computation
of addition chains" by F. Bergeron, J. Berstel, and S. Brlek, published in
Journal de théorie des nombres de Bordeaux, 6 no. 1 (1994), p. 21-38,
accessed at http://www.numdam.org/item?id=JTNB_1994__6_1_21_0.
*/
package main

import (
    "fmt"
    "math"
)

// Representation of addition chains.
// Notes:
// 1. While an []int might represent addition chains in general, the
// techniques here work only with "star" chains, as described in the paper.
// Knowledge that the chains are star chains allows certain optimizations.
// 2. The paper descibes a linked list representation which encodes both
// addends of numbers in the chain.  This allows additional optimizations, but
// for the purposes of the RC task, this simpler representation is adequate.
type starChain []int

// ⊗= operator.  modifies receiver.
func (s1 *starChain) cMul(s2 starChain) {
    p := *s1
    i := len(p)
    n := p[i-1]
    p = append(p, s2[1:]...)
    for ; i < len(p); i++ {
        p[i] *= n
    }
    *s1 = p
}

// ⊕= operator.  modifies receiver.
func (p *starChain) cAdd(j int) {
    c := *p
    *p = append(c, c[len(c)-1]+j)
}

// The γ function described in the paper returns a set of numbers in general,
// but certain γ functions return only singletons.  The dichotomic strategy
// is one of these and gives good results so it is the one used for the
// RC task.  Defining the package variable γ to be a singleton allows some
// simplifications in the code.
var γ singleton

type singleton func(int) int

func dichotomic(n int) int {
    return n / (1 << uint((λ(n)+1)/2))
}

// integer log base 2
func λ(n int) (a int) {
    for n != 1 {
        a++
        n >>= 1
    }
    return
}

// minChain as described in the paper.
func minChain(n int) starChain {
    switch a := λ(n); {
    case n == 1<<uint(a):
        r := make(starChain, a+1)
        for i := range r {
            r[i] = 1 << uint(i)
        }
        return r
    case n == 3:
        return starChain{1, 2, 3}
    }
    return chain(n, γ(n))
}

// chain as described in the paper.
func chain(n1, n2 int) starChain {
    q, r := n1/n2, n1%n2
    if r == 0 {
        c := minChain(n2)
        c.cMul(minChain(q))
        return c
    }
    c := chain(n2, r)
    c.cMul(minChain(q))
    c.cAdd(r)
    return c
}

func main() {
    m := 31415
    n := 27182
    show(m)
    show(n)
    show(m * n)
    showEasier(m, 1.00002206445416)
    showEasier(n, 1.00002550055251)
}

func show(e int) {
    fmt.Println("exponent:", e)
    s := math.Sqrt(.5)
    a := matrixFromRows([][]float64{
        {s, 0, s, 0, 0, 0},
        {0, s, 0, s, 0, 0},
        {0, s, 0, -s, 0, 0},
        {-s, 0, s, 0, 0, 0},
        {0, 0, 0, 0, 0, 1},
        {0, 0, 0, 0, 1, 0},
    })
    γ = dichotomic
    sc := minChain(e)
    fmt.Println("addition chain:", sc)
    a.scExp(sc).print("a^e")
    fmt.Println("count of multiplies:", mCount)
    fmt.Println()
}   

var mCount int

func showEasier(e int, a float64) {
    fmt.Println("exponent:", e)
    γ = dichotomic
    sc := minChain(e)
    fmt.Printf("%.14f^%d: %.14f\n", a, sc[len(sc)-1], scExp64(a, sc))
    fmt.Println("count of multiplies:", mCount)
    fmt.Println()
}

func scExp64(a float64, sc starChain) float64 {
    mCount = 0
    p := make([]float64, len(sc))
    p[0] = a 
    for i := 1; i < len(p); i++ {
        d := sc[i] - sc[i-1]
        j := i - 1
        for sc[j] != d {
            j--
        }
        p[i] = p[i-1] * p[j]
        mCount++
    }
    return p[len(p)-1]
}   
    
func (m *matrix) scExp(sc starChain) *matrix {
    mCount = 0
    p := make([]*matrix, len(sc))
    p[0] = m.copy()
    for i := 1; i < len(p); i++ {
        d := sc[i] - sc[i-1]
        j := i - 1
        for sc[j] != d {
            j--
        }
        p[i] = p[i-1].multiply(p[j])
        mCount++
    }
    return p[len(p)-1]
}

func (m *matrix) copy() *matrix {
    return &matrix{append([]float64{}, m.ele...), m.stride}
}

// code below copied from matrix multiplication task
type matrix struct {
    ele    []float64
    stride int
}

func matrixFromRows(rows [][]float64) *matrix {
    if len(rows) == 0 {
        return &matrix{nil, 0}
    }
    m := &matrix{make([]float64, len(rows)*len(rows[0])), len(rows[0])}
    for rx, row := range rows {
        copy(m.ele[rx*m.stride:(rx+1)*m.stride], row)
    }
    return m
}

func (m *matrix) print(heading string) {
    if heading > "" {
        fmt.Print(heading, "\n")
    }
    for e := 0; e < len(m.ele); e += m.stride {
        fmt.Printf("%6.3f ", m.ele[e:e+m.stride])
        fmt.Println()
    }
}

func (m1 *matrix) multiply(m2 *matrix) (m3 *matrix) {
    m3 = &matrix{make([]float64, (len(m1.ele)/m1.stride)*m2.stride), m2.stride}
    for m1c0, m3x := 0, 0; m1c0 < len(m1.ele); m1c0 += m1.stride {
        for m2r0 := 0; m2r0 < m2.stride; m2r0++ {
            for m1x, m2x := m1c0, m2r0; m2x < len(m2.ele); m2x += m2.stride {
                m3.ele[m3x] += m1.ele[m1x] * m2.ele[m2x]
                m1x++
            }
            m3x++
        }
    }
    return m3
}

Output (manually wrapped at 80 columns.)

exponent: 31415
addition chain: [1 2 4 5 10 20 25 50 55 110 220 245 490 980 1960 3920 7840
15680 31360 31415]
a^e
[ 0.707  0.000  0.000 -0.707  0.000  0.000] 
[ 0.000  0.707  0.707  0.000  0.000  0.000] 
[ 0.707  0.000  0.000  0.707  0.000  0.000] 
[ 0.000  0.707 -0.707  0.000  0.000  0.000] 
[ 0.000  0.000  0.000  0.000  0.000  1.000] 
[ 0.000  0.000  0.000  0.000  1.000  0.000] 
count of multiplies: 19

exponent: 27182
addition chain: [1 2 4 8 10 18 28 46 92 184 212 424 848 1696 3392 6784 13568
27136 27182]
a^e
[-0.500 -0.500 -0.500  0.500  0.000  0.000] 
[ 0.500 -0.500 -0.500 -0.500  0.000  0.000] 
[-0.500 -0.500  0.500 -0.500  0.000  0.000] 
[ 0.500 -0.500  0.500  0.500  0.000  0.000] 
[ 0.000  0.000  0.000  0.000  1.000  0.000] 
[ 0.000  0.000  0.000  0.000  0.000  1.000] 
count of multiplies: 18

exponent: 853922530
addition chain: [1 2 4 5 7 12 24 48 96 103 206 309 412 721 1133 1854 3708 4841
9682 19364 21218 26059 52118 104236 208472 416944 833888 1667776 3335552 6671104
13342208 26684416 53368832 106737664 213475328 426950656 853901312 853922530]
a^e
[-0.500  0.500 -0.500  0.500  0.000  0.000] 
[-0.500 -0.500 -0.500 -0.500  0.000  0.000] 
[-0.500 -0.500  0.500  0.500  0.000  0.000] 
[ 0.500 -0.500 -0.500  0.500  0.000  0.000] 
[ 0.000  0.000  0.000  0.000  1.000  0.000] 
[ 0.000  0.000  0.000  0.000  0.000  1.000] 
count of multiplies: 37

exponent: 31415
1.00002206445416^31415: 1.99999999989447
count of multiplies: 19

exponent: 27182
1.00002550055251^27182: 1.99999999997876
count of multiplies: 18

Haskell

Generators of a nearly-optimal addition chains for a given number.

dichotomicChain :: Integral a => a -> [a]
dichotomicChain n
  | n == 3  = [3, 2, 1]
  | n == 2 ^ log2 n = takeWhile (> 0) $ iterate (`div` 2) n
  | otherwise = let k = n `div` (2 ^ ((log2 n + 1) `div` 2))
                in chain n k
  where
    chain n1 n2 
      | n2 <= 1 = dichotomicChain n1
      | otherwise = case n1 `divMod` n2 of
          (q, 0) -> dichotomicChain q `mul` dichotomicChain n2
          (q, r) -> [r] `add` (dichotomicChain q `mul` chain n2 r)

    c1 `mul` c2 = map (head c2 *) c1 ++ tail c2
    c1 `add` c2 = map (head c2 +) c1 ++ c2
 
    log2 = floor . logBase 2 . fromIntegral

binaryChain :: Integral a => a -> [a]
binaryChain 1 = [1]
binaryChain n | even n = n : binaryChain (n `div` 2)
              | odd n = n : binaryChain (n - 1)
λ> dichotomicChain 31415 
[31415,31360,15680,7840,3920,1960,980,490,245,220,110,55,50,25,20,10,5,4,2,1]

λ> length $ dichotomicChain 31415
20

λ> length $ binaryChain 31415
25

λ> length $ dichotomicChain (31415*27182)
38

λ> length $ binaryChain (31415*27182)
45

Universal monoid multiplication that uses additive chain

times :: (Monoid p, Integral a) => a -> p -> p
0 `times` _ = mempty
n `times` x = res
  where
    (res:_, _, _) = foldl f ([x], 1, 0) $ tail ch
    ch = reverse $ dichotomicChain n
    f (p:ps, c1, i) c2 = let Just j = elemIndex (c2-c1) ch
                         in ((p <> ((p:ps) !! (i-j))):p:ps, c2, i+1)
λ> 31 `times` "a"
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

λ> getSum $ 31 `times` 2
62

λ> getProduct $ 31 `times` 2
2147483648

Implementation of matrix multiplication as monoidal operation.

data M a = M [[a]] | I deriving Show

instance Num a => Semigroup (M a) where
  I <> m = m
  m <> I = m
  M m1 <> M m2 = M $ map (\r -> map (\c -> r `dot` c) (transpose m2)) m1
    where dot a b = sum $ zipWith (*) a b

instance Num a => Monoid (M a) where
  mempty = I
-- matrix multiplication
λ> M [[2,4],[5,7]] <> M [[4,1],[6,2]]
M [[32,10],[62,19]]

-- matrix exponentiation
λ> 2 `times` M [[2,4],[5,7]]
M [[24,36],[45,69]]

-- calculation of large Fibonacci numbers
λ> 2 `times` M [[1,1],[1,0]]
M [[2,1],[1,1]]

λ> 3 `times` M [[1,1],[1,0]]
M [[3,2],[2,1]]

λ> 4 `times` M [[1,1],[1,0]]
M [[5,3],[3,2]]

λ> 20 `times` M [[1,1],[1,0]]
M [[10946,6765],[6765,4181]]

λ> 200 `times` M [[1,1],[1,0]]
M [[453973694165307953197296969697410619233826,280571172992510140037611932413038677189525],[280571172992510140037611932413038677189525,173402521172797813159685037284371942044301]]

The task

λ> :{
Main:> | let a = a = let s = sqrt (1/2) in M [[s,0,s,0,0,0]
Main:> |                                     ,[0,s,0,s,0,0]
Main:> |                                     ,[0,s,0,-s,0,0]
Main:> |                                     ,[-s,0,s,0,0,0]
Main:> |                                     ,[0,0,0,0,0,1]
Main:> |                                     ,[0,0,0,0,1,0]] 
Main:> | :}

λ> 31415 `times` a
M [[0.7071067811883202,0.0,0.0,-0.7071067811883202,0.0,0.0],
[0.0,0.7071067811883202,0.7071067811883202,0.0,0.0,0.0],
[0.7071067811883202,0.0,0.0,0.7071067811883202,0.0,0.0],
[0.0,0.7071067811883202,-0.7071067811883202,0.0,0.0,0.0],
[0.0,0.0,0.0,0.0,0.0,1.0],
[0.0,0.0,0.0,0.0,1.0,0.0]]

λ> 27182 `times` a
M [[-0.5000000000010233,-0.5000000000010233,-0.5000000000010233,0.5000000000010233,0.0,0.0],
[0.5000000000010233,-0.5000000000010233,-0.5000000000010233,-0.5000000000010233,0.0,0.0],
[-0.5000000000010233,-0.5000000000010233,0.5000000000010233,-0.5000000000010233,0.0,0.0],
[0.5000000000010233,-0.5000000000010233,0.5000000000010233,0.5000000000010233,0.0,0.0],
[0.0,0.0,0.0,0.0,1.0,0.0],
[0.0,0.0,0.0,0.0,0.0,1.0]]

λ> (31415*27182) `times` a
M [[-0.500000030038797,0.5000000300387971,-0.5000000300387971,0.5000000300387973,0.0,0.0],
[-0.5000000300387971,-0.5000000300387972,-0.5000000300387969,-0.5000000300387969,0.0,0.0],
[-0.5000000300387972,-0.5000000300387971,0.5000000300387969,0.5000000300387969,0.0,0.0],
[0.5000000300387971,-0.500000030038797,-0.5000000300387973,0.5000000300387971,0.0,0.0],
[0.0,0.0,0.0,0.0,1.0,0.0],
[0.0,0.0,0.0,0.0,0.0,1.0]]


Julia

Uses an iterator to generate A003313 (the first 100 values). Uses the Knuth path method for the larger integers. This gives a chain length of 18 for both 27182 and 31415.

import Base.iterate

mutable struct AdditionChains{T}
    chains::Vector{Vector{T}}
    work_chain::Int
    work_element::Int
    AdditionChains{T}() where T = new{T}([[one(T)]], 1, 1)
end

function Base.iterate(acs::AdditionChains, state = 1)
    i, j = acs.work_chain, acs.work_element
    newchain = [acs.chains[i]; acs.chains[i][end] + acs.chains[i][j]]
    push!(acs.chains, newchain)
    if j == length(acs.chains[i])
        acs.work_chain += 1
        acs.work_element = 1
    else
        acs.work_element += 1
    end
    return newchain, state + 1
end

function findchain!(acs::AdditionChains, n)
    @assert n > 0
    n == 1 && return [one(eltype(first(acs.chains)))]
    idx = findfirst(a -> a[end] == n, acs.chains)
    if idx == nothing
        for (i, chain) in enumerate(acs)
            chain[end] == n && return chain
        end
    end
    return acs.chains[idx]
end

""" memoization for knuth_path """
const knuth_path_p, knuth_path_lvl = Dict(1 => 0), [[1]]

""" knuth path method for addition chains """
function knuth_path(n)::Vector{Int}
    iszero(n) && return Int[]
    while !haskey(knuth_path_p, n)
        q = Int[]
        for x in first(knuth_path_lvl), y in knuth_path(x)
            if !haskey(knuth_path_p, x + y)
                knuth_path_p[x + y] = x
                push!(q, x + y)
            end
        end
        knuth_path_lvl[begin] = q
    end
    return push!(knuth_path(knuth_path_p[n]), n)
end

function pow(x, chain)
    p, products = 0, Dict{Int, typeof(x)}(0 => one(x), 1 => x)
    for i in chain
        products[i] = products[p] * products[i - p]
        p = i
    end
    return products[chain[end]]
end

function test_addition_chains()
    additionchain = AdditionChains{Int}()
    println("First one hundred addition chain lengths:")
    for i in 1:100
        print(rpad(length(findchain!(additionchain, i)) -1, 3), i % 10 == 0 ? "\n" : "")
    end
    println("\nKnuth chains for addition chains of 31415 and 27182:")
    expchains = Dict(i => knuth_path(i) for i in [31415, 27182])
    for (n, chn) in expchains
        println("Exponent: ", rpad(n, 10), "\n  Addition Chain: $(chn[begin:end-1]))")
    end
    println("\n1.00002206445416^31415 = ", pow(1.00002206445416, expchains[31415]))
    println("1.00002550055251^27182 = ", pow(1.00002550055251, expchains[27182]))
    println("1.00002550055251^(27182 * 31415) = ", pow(BigFloat(pow(1.00002550055251, expchains[27182])), expchains[31415]))
    println("(1.000025 + 0.000058i)^27182 = ", pow(Complex(1.000025, 0.000058), expchains[27182]))
    println("(1.000022 + 0.000050i)^31415 = ", pow(Complex(1.000022, 0.000050), expchains[31415]))
    x = sqrt(1/2)
    matrixA = [x 0 x 0 0 0; 0 x 0 x 0 0; 0 x 0 -x 0 0; -x 0 x 0 0 0; 0 0 0 0 0 1; 0 0 0 0 1 0]
    println("matrix A ^ 27182 = ")
    display(pow(matrixA, expchains[27182]))
    println("matrix A ^ 31415 = ")
    display(round.(pow(matrixA, expchains[31415]), digits=6))
    println("(matrix A ^ 27182) ^ 31415 = ")
    display(pow(pow(matrixA, expchains[27182]), expchains[31415]))
end

test_addition_chains()
Output:
First one hundred addition chain lengths:
0  1  2  2  3  3  4  3  4  4
5  4  5  5  5  4  5  5  6  5
6  6  6  5  6  6  6  6  7  6
7  5  6  6  7  6  7  7  7  6
7  7  7  7  7  7  8  6  7  7
7  7  8  7  8  7  8  8  8  7
8  8  8  6  7  7  8  7  8  8
9  7  8  8  8  8  8  8  9  7
8  8  8  8  8  8  9  8  9  8
9  8  9  9  9  7  8  8  8  8

Knuth chains for addition chains of 31415 and 27182:
Exponent: 27182
  Addition Chain: [1, 2, 3, 5, 7, 14, 21, 35, 70, 140, 143, 283, 566, 849, 1698, 3396, 6792, 6799, 13591])
Exponent: 31415
  Addition Chain: [1, 2, 3, 5, 7, 14, 28, 56, 61, 122, 244, 488, 976, 1952, 3904, 7808, 15616, 15677, 31293])

1.00002206445416^31415 = 1.9999999998924638
1.00002550055251^27182 = 1.9999999999792688
1.00002550055251^(27182 * 31415) = 7.199687435551025768365237017391630520475805934721292725697031724530209692195819e+9456
(1.000025 + 0.000058i)^27182 = -0.01128636963542673 + 1.9730308496660347im
(1.000022 + 0.000050i)^31415 = 0.00016144681325535107 + 1.9960329014194498im
matrix A ^ 27182 =
6×6 Matrix{Float64}:
 -0.5  -0.5  -0.5   0.5  0.0  0.0
  0.5  -0.5  -0.5  -0.5  0.0  0.0
 -0.5  -0.5   0.5  -0.5  0.0  0.0
  0.5  -0.5   0.5   0.5  0.0  0.0
  0.0   0.0   0.0   0.0  1.0  0.0
  0.0   0.0   0.0   0.0  0.0  1.0
matrix A ^ 31415 =
6×6 Matrix{Float64}:
  0.707107   0.0       -0.0       -0.707107  0.0  0.0
 -0.0        0.707107   0.707107  -0.0       0.0  0.0
  0.707107  -0.0       -0.0        0.707107  0.0  0.0
  0.0        0.707107  -0.707107  -0.0       0.0  0.0
  0.0        0.0        0.0        0.0       0.0  1.0
  0.0        0.0        0.0        0.0       1.0  0.0
(matrix A ^ 27182) ^ 31415 =
6×6 Matrix{Float64}:
 -0.5   0.5  -0.5   0.5  0.0  0.0
 -0.5  -0.5  -0.5  -0.5  0.0  0.0
 -0.5  -0.5   0.5   0.5  0.0  0.0
  0.5  -0.5  -0.5   0.5  0.0  0.0
  0.0   0.0   0.0   0.0  1.0  0.0
  0.0   0.0   0.0   0.0  0.0  1.0

Nim

Translation of: Go
import math, sequtils, strutils

const
  N = 32
  NMax = 40_000

type Save = tuple[p: ptr int; v: int]

var
  u: array[N, int]  # Upper bounds.
  l: array[N, int]  # Lower bounds.
u[0] = 1; u[1] = 2
l[0] = 1; l[1] = 2

var outp, sum, tail: array[N, int]

var cache: array[NMax + 1, int]
cache[2] = 1

var
  known = 2
  stack = 0
  undo: array[N * N, Save]


proc replace(x: var openArray[int]; i, n: int) =
  undo[stack] = (x[i].addr, x[i])
  x[i] = n
  inc stack


proc restore(n: int) =
  while stack > n:
    dec stack
    undo[stack].p[] = undo[stack].v


proc bounds(n: int): tuple[lower, upper: int] =
  # Return lower and upper bounds.
  if n <= 2 or n <= NMax and cache[n] != 0:
    return (cache[n], cache[n])
  var
    i = -1
    o = 0
    n = n
  while n != 0:
    if (n and 1) != 0: inc o
    n = n shr 1
    inc i
  dec i
  result.upper = o + i
  while true:
    inc i
    o = o shr 1
    if o == 0: break
  o = 2
  while o * o < n:
    if n mod o == 0:
      let q = cache[o] + cache[n div o]
      if q < result.upper:
        result.upper = q
        if q == i: break
    inc o
  if n > 2:
    if result.upper > cache[n - 2] + 1:
      result.upper = cache[n - 1] + 1
    if result.upper > cache[n - 2] + 1:
      result.upper = cache[n - 2] + 1
  result.lower = i


proc insert(x, pos: int): bool =
  let save = stack

  if l[pos] > x or u[pos] < x:
    return false

  if l[pos] != x:
    l.replace(pos, x)
    var i = pos - 1
    while u[i] * 2 < u[i+1]:
      let t = l[i+1] + 1
      if t * 2 > u[i]:
        restore(save)
        return false
      l.replace(i, t)
      dec i
    i = pos + 1
    while l[i] <= l[i-1]:
      let t = l[i-1] + 1
      if t > u[i]:
        restore(save)
        return false
      l.replace(i, t)
      inc i

  if u[pos] == x:
    return true

  u.replace(pos, x)
  var i = pos - 1
  while u[i] >= u[i+1]:
    let t = u[i+1] - 1
    if t < l[i]:
      restore(save)
      return false
    u.replace(i, t)
    dec i
  i = pos + 1
  while u[i] > u[i-1] * 2:
    let t = u[i-1] * 2
    if t < l[i]:
      restore(save)
      return false
    u.replace(i, t)
    inc i

  result = true


# Forward reference.
proc seqRecur(le: int): bool


proc `try`(p, q, le: int): bool =

  var pl = cache[p]
  if pl >= le: return false
  var ql = cache[q]
  if ql >= le: return false

  while pl < le and u[pl] < p: inc pl
  var pu = pl - 1
  while pu < le - 1 and u[pu+1] >= p: inc pu

  while ql < le and u[ql] < q: inc ql
  var qu = ql - 1
  while qu < le - 1 and u[qu+1] >= q: inc qu

  if p != q and pl <= ql: pl = ql + 1
  if pl > pu or ql > qu or ql > pu: return false

  if outp[le] == 0:
    pu = le - 1
    pl = pu

  let ps = stack
  while pu >= pl:
    if insert(p, pu):
      inc outp[pu]
      inc sum[pu], le
      if p != q:
        let qs= stack
        var j = qu
        if j >= pu: j = pu - 1
        while j >= ql:
          if insert(q, j):
            inc outp[j]
            inc sum[j], le
            tail[le] = q
            if seqRecur(le - 1): return true
            restore(qs)
            dec outp[j]
            dec sum[j], le
          dec j
      else:
        inc outp[pu]
        inc sum[pu], le
        tail[le] = p
        if seqRecur(le - 1): return true
        dec outp[pu]
        dec sum[pu], le
      dec outp[pu]
      dec sum[pu], le
      restore(ps)
    dec pu


proc seqRecur(le: int): bool =
  let n = l[le]
  if le < 2: return true
  var limit = n - 1
  if outp[le] == 1: limit = n - tail[sum[le]]
  if limit > u[le-1]: limit = u[le-1]
  # Try to break n into p + q, and see if we can insert p, q into
  # list while satisfying bounds.
  var p = limit
  var q = n - p
  while q <= p:
    if `try`(p, q, le): return true
    dec p
    inc q


# Forward reference.
proc sequence(n, le: int): int


proc seqLen(n: int): int =

  if n <= known: return cache[n]

  # Need all lower n to compute sequence.
  while known + 1 < n:
    discard seqLen(known + 1)

  var (lb, ub) = bounds(n)
  while lb < ub and sequence(n, lb) == 0: inc lb

  known = n
  if (n and 1023) == 0: echo "Cached: ", known
  cache[n] = lb
  result = lb


proc sequence(n, le: int): int =
  let le = if le != 0: le else: seqLen(n)
  stack = 0
  l[le] = n
  u[le] = n
  for i in 0..le:
    outp[i] = 0
    sum[i] = 0
  for i in 2..<le:
    l[i] = l[i-1] + 1
    u[i] = u[i-1] * 2
  for i in countdown(le - 1, 3):
    if l[i] * 2 < l[i+1]:
      l[i] = (1 + l[i+1]) div 2
    if u[i] >= u[i+1]:
      u[i] = u[i+1] - 1

  if not seqRecur(le): return 0
  result = le


proc sequence(n, le: int; buf: var openArray[int]): int =
  let le = sequence(n, le)
  for i in 0..le: buf[i] = u[i]
  result = le


proc binLen(n: int): int =
  var r, o = -1
  var n = n
  while n != 0:
    if (n and 1) != 0: inc o
    n = n shr 1
    inc r
  result = r + o


type
  Vector = seq[float]
  Matrix = seq[Vector]


func `*`(m1, m2: Matrix): Matrix =
  let
    rows1 = m1.len
    cols1 = m1[0].len
    rows2 = m2.len
    cols2 = m2[0].len
  if cols1 != rows2:
    raise newException(ValueError, "Matrices cannot be multiplied.")

  result = newSeqWith(rows1, newSeq[float](cols2))
  for i in 0..<rows1:
    for j in 0..<cols2:
      for k in 0..<rows2:
        result[i][j] += m1[i][k] * m2[k][j]


proc pow(m: Matrix; n: int; printout: bool): Matrix =
  var e = newSeq[int](N)
  var v: array[N, Matrix]
  let le = sequence(n, 0, e)
  if printout:
    echo "Addition chain:"
    echo e[0..le].join(" ")
  v[0] = m
  v[1] = m * m
  for i in 2..le:
    block loop2:
      for j in countdown(i - 1, 0):
        for k in countdown(j, 0):
          if e[k] + e[j] < e[i]: break
          if e[k] + e[j] > e[i]: continue
          v[i] = v[j] * v[k]
          break loop2
  result = v[le]


func `$`(m: Matrix): string =
  for v in m:
    result.add '['
    let start = result.len
    for x in v:
      result.addSep(" ", start)
      result.add x.formatFloat(ffDecimal, precision = 6).align(9)
    result.add "]\n"


var m = 27182
var n = 31415
echo "Precompute chain lengths:"
discard seqlen(n)
let rh = sqrt(0.5)
let mx: Matrix = @[@[ rh, 0.0,  rh, 0.0, 0.0, 0.0],
                   @[0.0,  rh, 0.0,  rh, 0.0, 0.0],
                   @[0.0,  rh, 0.0, -rh, 0.0, 0.0],
                   @[-rh, 0.0,  rh, 0.0, 0.0, 0.0],
                   @[0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
                   @[0.0, 0.0, 0.0, 0.0, 1.0, 0.0]]

echo "\nThe first 100 terms of A003313 are:"
for i in 1..100:
  stdout.write seqLen(i), ' '
  if i mod 10 == 0: echo()

let exs = [m, n]
var mxs: array[2, Matrix]
for i, ex in exs:
  echo()
  echo "Exponent: ", ex
  mxs[i] = mx.pow(ex, true)
  echo "A ^ $#:\n".format(ex)
  echo mxs[i]
  echo "Number of A/C multiplies: ", seqLen(ex)
  echo "  c.f. Binary multiplies: ", binLen(ex)
echo()
echo "Exponent: $1 x $2 = $3".format(m, n, m * n)
echo "A ^ $1 = (A ^ $2) ^ $3:\n".format(m * n, m, n)
echo mxs[0].pow(n, false)
Output:
Precompute chain lengths:
Cached: 1024
Cached: 2048
Cached: 3072
Cached: 4096
Cached: 5120
Cached: 6144
Cached: 7168
Cached: 8192
Cached: 9216
Cached: 10240
Cached: 11264
Cached: 12288
Cached: 13312
Cached: 14336
Cached: 15360
Cached: 16384
Cached: 17408
Cached: 18432
Cached: 19456
Cached: 20480
Cached: 21504
Cached: 22528
Cached: 23552
Cached: 24576
Cached: 25600
Cached: 26624
Cached: 27648
Cached: 28672
Cached: 29696
Cached: 30720

The first 100 terms of A003313 are:
0 1 2 2 3 3 4 3 4 4 
5 4 5 5 5 4 5 5 6 5 
6 6 6 5 6 6 6 6 7 6 
7 5 6 6 7 6 7 7 7 6 
7 7 7 7 7 7 8 6 7 7 
7 7 8 7 8 7 8 8 8 7 
8 8 8 6 7 7 8 7 8 8 
9 7 8 8 8 8 8 8 9 7 
8 8 8 8 8 8 9 8 9 8 
9 8 9 9 9 7 8 8 8 8 

Exponent: 27182
Addition chain:
1 2 4 8 10 18 28 46 92 184 212 424 848 1696 3392 6784 13568 27136 27182
A ^ 27182:

[-0.500000 -0.500000 -0.500000  0.500000  0.000000  0.000000]
[ 0.500000 -0.500000 -0.500000 -0.500000  0.000000  0.000000]
[-0.500000 -0.500000  0.500000 -0.500000  0.000000  0.000000]
[ 0.500000 -0.500000  0.500000  0.500000  0.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  1.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  0.000000  1.000000]

Number of A/C multiplies: 18
  c.f. Binary multiplies: 21

Exponent: 31415
Addition chain:
1 2 4 8 16 17 33 49 98 196 392 784 1568 3136 6272 6289 12561 25122 31411 31415
A ^ 31415:

[ 0.707107  0.000000  0.000000 -0.707107  0.000000  0.000000]
[ 0.000000  0.707107  0.707107  0.000000  0.000000  0.000000]
[ 0.707107  0.000000  0.000000  0.707107  0.000000  0.000000]
[ 0.000000  0.707107 -0.707107  0.000000  0.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  0.000000  1.000000]
[ 0.000000  0.000000  0.000000  0.000000  1.000000  0.000000]

Number of A/C multiplies: 19
  c.f. Binary multiplies: 24

Exponent: 27182 x 31415 = 853922530
A ^ 853922530 = (A ^ 27182) ^ 31415:

[-0.500000  0.500000 -0.500000  0.500000  0.000000  0.000000]
[-0.500000 -0.500000 -0.500000 -0.500000  0.000000  0.000000]
[-0.500000 -0.500000  0.500000  0.500000  0.000000  0.000000]
[ 0.500000 -0.500000 -0.500000  0.500000  0.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  1.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  0.000000  1.000000]

Phix

Brute force

Naieve brute force search, no attempt to optimise, manages about 4 million checks/s.
Replacing the recursion with an internal stack and chosen with a fixed length array might help, but otherwise I got no good ideas at all for trimming the search space.
Giving it the same length, I think, yields the same result as Knuth's_power_tree#Phix, and at least in the cases that I tried, somewhat faster than the method on that page.
If you know the A003313 number, you can throw that at it and wait (for several billion years) or get the power tree length and loop trying to find a path one shorter (and wait several trillion years). The path() routine is a direct copy from the link above.
Note that "tries" overflows (crashes) at 1073741824, which I kept in as a deliberate limiter.

with javascript_semantics

constant p = new_dict({{1,0}})
sequence lvl = {1}
 
function path(integer n)
    if n=0 then return {} end if
    while getd_index(n,p)=NULL do
        sequence q = {}
        for i=1 to length(lvl) do
            integer x = lvl[i]
            sequence px = path(x)
            for j=1 to length(px) do
                integer y = x+px[j]
                if getd_index(y,p)!=NULL then exit end if
                setd(y,x,p)
                q &= y
            end for
        end for
        lvl = q
    end while
    return path(getd(n,p))&n
end function

atom t1 = time()+1
integer tries = 0
function addition_chain(integer target, len, sequence chosen={1})
-- target and len must be >=2
    tries += 1
    integer l = length(chosen),
            last = chosen[$]
    if last=target then return chosen end if
    if l=len then
        if platform()!=JS and time()>t1 then
            ?{"addition_chain",chosen,tries}
            t1 = time()+1
        end if
    else
        for i=l to 1 by -1 do
            integer next = last+chosen[i]
            if next<=target then
                sequence res = addition_chain(target,len,deep_copy(chosen)&next)
                if length(res) then return res end if
            end if
        end for
    end if
    return {}
end function
 
-- first, some proof of correctness at the lower/trivial end:
 
sequence res = repeat(0,120)
res[2] = 1
for n=3 to length(res) do
    integer l = length(path(n))
    while true do
        sequence ac = addition_chain(n,l)
        if length(ac)=0 then exit end if
        l = length(ac)-1
    end while
    res[n] = l
end for
puts(1,"The first 120 members of A003313:\n")
pp(res)
 
printf(1,"addition_chain(31,8):%s\n",{sprint(addition_chain(31,8))})
Output:
The first 120 members of A003313:
{0,1,2,2,3,3,4,3,4,4,5,4,5,5,5,4,5,5,6,5,6,6,6,5,6,6,6,6,7,6,7,5,6,6,7,6,7,
 7,7,6,7,7,7,7,7,7,8,6,7,7,7,7,8,7,8,7,8,8,8,7,8,8,8,6,7,7,8,7,8,8,9,7,8,8,
 8,8,8,8,9,7,8,8,8,8,8,8,9,8,9,8,9,8,9,9,9,7,8,8,8,8,9,8,9,8,9,9,9,8,9,9,9,
 8,9,9,9,9,9,9,9,8}
addition_chain(31,8):{1,2,4,8,10,20,30,31}

On the task numbers, however, as to be expected, it struggles but probably would eventually get there if the overflow on tries were removed:

--?path(12509)                       -- {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,12288,12416,12480,12496,12504,12508,12509}
--?addition_chain(12509,21) -- 0s       {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,12288,12416,12480,12496,12504,12508,12509}
--?addition_chain(12509,20) -- 12.3s    {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,4160,8320,12480,12496,12504,12508,12509}
--?addition_chain(12509,19) -- 1.1s     {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,4160,4168,8336,12504,12508,12509}
--?addition_chain(12509,18) -- bust
--?addition_chain(12509,17) -- bust

--?path(31415)                       -- {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,16384,24576,28672,30720,31232,31360,31392,31408,31412,31414,31415}
--?addition_chain(31415,25) -- 0s       {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,16384,24576,28672,30720,31232,31360,31392,31408,31412,31414,31415}
--?addition_chain(31415,24) -- bust
--?addition_chain(31415,23) -- bust
--?addition_chain(31415,22) -- 137s     {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,10240,10368,10384,10386,20772,31158,31414,31415}
--?addition_chain(31415,21) -- 116s     {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,6144,6272,6288,6290,12562,25124,31414,31415}
--?addition_chain(31415,20) -- bust     
--?addition_chain(31415,19) -- bust

--?path(27182)                       -- {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,16384,24576,26624,27136,27168,27176,27180,27182}
--?addition_chain(27182,22) -- 0s       {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,16384,24576,26624,27136,27168,27176,27180,27182}
--?addition_chain(27182,21) -- 19.4s    {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,8704,8712,8714,17428,26142,27166,27182}
--?addition_chain(27182,20) -- 14.2s    {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,5120,5128,5640,5642,10770,21540,27182}
--?addition_chain(27182,19) -- bust
--?addition_chain(27182,18) -- bust

Once you have an addition chain, of course, apply it as per Knuth's_power_tree#Phix, and of course length(pn) is the number of multiplies that will perform.

--sequence pn = {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,16384,24576,28672,30720,31232,31360,31392,31408,31412,31414,31415}
--sequence pn = {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,10240,10368,10384,10386,20772,31158,31414,31415}
--sequence pn = {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,6144,6272,6288,6290,12562,25124,31414,31415}
--sequence pn = {1,2,4,8,16,17,33,49,98,196,392,784,1568,3136,6272,6289,12561,25122,31411,31415}    -- (from C)
--printf(1,"%3g ^ %d (%d) = %s\n", {1.00002206445416,31415,length(pn),treepow(1.00002206445416,31415,pn)})
--1.00002 ^ 31415 (25) = 1.9999999998949602994638558
--1.00002 ^ 31415 (22) = 1.9999999998949602994638556
--1.00002 ^ 31415 (21) = 1.9999999998949602994638552
--1.00002 ^ 31415 (20) = 1.9999999998949602994638291

--sequence pn = {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,16384,24576,26624,27136,27168,27176,27180,27182}
--sequence pn = {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,8704,8712,8714,17428,26142,27166,27182}
--sequence pn = {1,2,4,8,16,32,64,128,256,512,1024,2048,4096,5120,5128,5640,5642,10770,21540,27182}
--sequence pn = {1,2,4,8,10,18,28,46,92,184,212,424,848,1696,3392,6784,13568,27136,27182}       -- (from C)
--printf(1,"%3g ^ %d (%d) = %s\n", {1.00002550055251,27182,length(pn),treepow(1.00002550055251,27182,pn)})
--1.00003 ^ 27182 (22) = 1.9999999999727968238298861
--1.00003 ^ 27182 (21) = 1.9999999999727968238298859
--1.00003 ^ 27182 (20) = 1.9999999999727968238298855
--1.00003 ^ 27182 (19) = 1.9999999999727968238298527

Python

'''  Rosetta Code task Addition-chain_exponentiation  '''

from math import sqrt
from numpy import array
from mpmath import mpf


class AdditionChains:
    ''' two methods of calculating addition chains '''

    def __init__(self):
        ''' memoization for knuth_path '''
        self.chains, self.idx, self.pos = [[1]], 0, 0
        self.pat, self.lvl = {1: 0}, [[1]]

    def add_chain(self):
        ''' method 1: add chains depth then breadth first until done '''
        newchain = self.chains[self.idx].copy()
        newchain.append(self.chains[self.idx][-1] +
                        self.chains[self.idx][self.pos])
        self.chains.append(newchain)
        if self.pos == len(self.chains[self.idx])-1:
            self.idx += 1
            self.pos = 0
        else:
            self.pos += 1
        return newchain

    def find_chain(self, nexp):
        ''' method 1 interface: search for chain ending with n, adding more as needed '''
        assert nexp > 0
        if nexp == 1:
            return [1]
        chn = next((a for a in self.chains if a[-1] == nexp), None)
        if chn is None:
            while True:
                chn = self.add_chain()
                if chn[-1] == nexp:
                    break

        return chn

    def knuth_path(self, ngoal):
        ''' method 2: knuth method, uses memoization to search for a shorter chain '''
        if ngoal < 1:
            return []
        while not ngoal in self.pat:
            new_lvl = []
            for i in self.lvl[0]:
                for j in self.knuth_path(i):
                    if not i + j in self.pat:
                        self.pat[i + j] = i
                        new_lvl.append(i + j)

            self.lvl[0] = new_lvl

        returnpath = self.knuth_path(self.pat[ngoal])
        returnpath.append(ngoal)
        return returnpath


def cpow(xbase, chain):
    ''' raise xbase by an addition exponentiation chain for what becomes x**chain[-1] '''
    pows, products = 0, {0: 1, 1: xbase}
    for i in chain:
        products[i] = products[pows] * products[i - pows]
        pows = i
    return products[chain[-1]]


if __name__ == '__main__':
    # test both addition chain methods
    acs = AdditionChains()
    print('First one hundred addition chain lengths:')
    for k in range(1, 101):
        print(f'{len(acs.find_chain(k))-1:3}', end='\n'if k % 10 == 0 else '')

    print('\nKnuth chains for addition chains of 31415 and 27182:')
    chns = {m: acs.knuth_path(m) for m in [31415, 27182]}
    for (num, cha) in chns.items():
        print(f'Exponent: {num:10}\n  Addition Chain: {cha[:-1]}')
    print('\n1.00002206445416^31415 =', cpow(1.00002206445416, chns[31415]))
    print('1.00002550055251^27182 =', cpow(1.00002550055251, chns[27182]))
    print('1.000025 + 0.000058i)^27182 =',
          cpow(complex(1.000025, 0.000058), chns[27182]))
    print('1.000022 + 0.000050i)^31415 =',
          cpow(complex(1.000022, 0.000050), chns[31415]))
    sq05 = mpf(sqrt(0.5))
    mat = array([[sq05, 0, sq05, 0, 0, 0], [0, sq05, 0, sq05, 0, 0], [0, sq05, 0, -sq05, 0, 0],
                 [-sq05, 0, sq05, 0, 0, 0], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 1, 0]])
    print('matrix A ^ 27182 =')
    print(cpow(mat, chns[27182]))
    print('matrix A ^ 31415 =')
    print(cpow(mat, chns[31415]))
    print('(matrix A ** 27182) ** 31415 =')
    print(cpow(cpow(mat, chns[27182]), chns[31415]))
Output:
First one hundred addition chain lengths:
  0  1  2  2  3  3  4  3  4  4
  5  4  5  5  5  4  5  5  6  5
  6  6  6  5  6  6  6  6  7  6
  7  5  6  6  7  6  7  7  7  6
  7  7  7  7  7  7  8  6  7  7
  7  7  8  7  8  7  8  8  8  7
  8  8  8  6  7  7  8  7  8  8
  9  7  8  8  8  8  8  8  9  7
  8  8  8  8  8  8  9  8  9  8
  9  8  9  9  9  7  8  8  8  8

Knuth chains for addition chains of 31415 and 27182:
Exponent:      31415
  Addition Chain: [1, 2, 3, 5, 7, 14, 28, 56, 61, 122, 244, 488, 976, 1952, 3904, 7808, 15616, 15677, 31293]
Exponent:      27182
  Addition Chain: [1, 2, 3, 5, 7, 14, 21, 35, 70, 140, 143, 283, 566, 849, 1698, 3396, 6792, 6799, 13591]

1.00002206445416^31415 = 1.9999999998924638
1.00002550055251^27182 = 1.9999999999792688
1.000025 + 0.000058i)^27182 = (-0.01128636963542673+1.9730308496660347j)
1.000022 + 0.000050i)^31415 = (0.00016144681325535107+1.9960329014194498j)
matrix A ^ 27182 =
[[mpf('5.0272320351359433e-4092') 0 mpf('5.0272320351359433e-4092') 0 0 0]
 [0 mpf('5.0272320351359433e-4092') 0 mpf('5.0272320351359433e-4092') 0 0]
 [0 mpf('5.0272320351359433e-4092') 0 mpf('5.0272320351359433e-4092') 0 0]
 [mpf('5.0272320351359433e-4092') 0 mpf('5.0272320351359433e-4092') 0 0 0]
 [0 0 0 0 0 1]
 [0 0 0 0 1 0]]
matrix A ^ 31415 =
[[mpf('3.7268602513250562e-4729') 0 mpf('3.7268602513250562e-4729') 0 0 0]
 [0 mpf('3.7268602513250562e-4729') 0 mpf('3.7268602513250562e-4729') 0 0]
 [0 mpf('3.7268602513250562e-4729') 0 mpf('-3.7268602513250562e-4729') 0
  0]
 [mpf('-3.7268602513250562e-4729') 0 mpf('3.7268602513250562e-4729') 0 0
  0]
 [0 0 0 0 0 1]
 [0 0 0 0 1 0]]
(matrix A ** 27182) ** 31415 =
[[mpf('1.77158544475749e-128528148') 0 mpf('1.77158544475749e-128528148')
  0 0 0]
 [0 mpf('1.77158544475749e-128528148') 0
  mpf('1.77158544475749e-128528148') 0 0]
 [0 mpf('1.77158544475749e-128528148') 0
  mpf('1.77158544475749e-128528148') 0 0]
 [mpf('1.77158544475749e-128528148') 0 mpf('1.77158544475749e-128528148')
  0 0 0]
 [0 0 0 0 0 1]
 [0 0 0 0 1 0]]

Racket

This example is incorrect. Please fix the code and remove this message.
Details: The task states: Binary exponentiation does not usually produce the best solution. Provide only optimal solutions. This answer only considers binary exponentiation, which is not enough to give an optimal solution.


The addition chains correspond to binary exponentiation.

#lang racket
(define (chain n)
  ; computes a simple addition chain for n
  (cond [(= n 1)   '()]
        [(even? n) (define n/2 (/ n 2))
                   (cons (list n n/2 n/2) (chain n/2))]
        [(odd? n)  (define n-1 (- n 1))
                   (cons (list n n-1 1) (chain (- n 1)))]))

(define mult
  (let ([n 0])
    (λ xs
      (cond [(equal? xs (list 'count)) n]
            [(equal? xs (list 'reset)) (set! n 0)]
            [else (set! n (+ n 1))
                  (apply * xs)]))))
    
(define (expt/chain x n chain)
    ; computes x^n using the addition chain  
  (define ht (make-hash))
  (hash-set! ht 1 x)
  (define (expt1 n)
    (or (hash-ref ht n #f)
        (let ()
          (define x^n
            (match (assoc n chain)
              [(list _ s t) (mult (expt1 s) (expt1 t))]))
          (hash-set! ht n x^n)
          x^n)))
  (expt1 n))
     
(define (test x n)  
  (displayln (~a "Chain for " n "\n" (chain n)))
  (mult 'reset)
  (displayln (~a x " ^ " n " = " (expt/chain x n (chain n))))
  (displayln (~a "Multiplications: " (mult 'count)))
  (newline))

(test 1.00002206445416 31415)
(test 1.00002550055251 27182)

Output:

Chain for 31415
((31415 31414 1) (31414 15707 15707) (15707 15706 1) (15706 7853 7853) (7853 7852 1) (7852 3926 3926) (3926 1963 1963) (1963 1962 1) (1962 981 981) (981 980 1) (980 490 490) (490 245 245) (245 244 1) (244 122 122) (122 61 61) (61 60 1) (60 30 30) (30 15 15) (15 14 1) (14 7 7) (7 6 1) (6 3 3) (3 2 1) (2 1 1))
1.00002206445416 ^ 31415 = 1.9999999998913485
Multiplications: 24

Chain for 27182
((27182 13591 13591) (13591 13590 1) (13590 6795 6795) (6795 6794 1) (6794 3397 3397) (3397 3396 1) (3396 1698 1698) (1698 849 849) (849 848 1) (848 424 424) (424 212 212) (212 106 106) (106 53 53) (53 52 1) (52 26 26) (26 13 13) (13 12 1) (12 6 6) (6 3 3) (3 2 1) (2 1 1))
1.00002550055251 ^ 27182 = 1.9999999999774538
Multiplications: 21

Raku

Translation of: Python
# 20230327 Raku programming solution

use Math::Matrix;

class AdditionChains { has ( $.idx, $.pos, @.chains, @.lvl, %.pat ) is rw; 

   method add_chain { 
      # method 1: add chains depth then breadth first until done 
      return gather given self { 
         take my $newchain = .chains[.idx].clone.append: 
            .chains[.idx][*-1] + .chains[.idx][.pos];
	     .chains.append: $newchain;
         .pos == +.chains[.idx]-1 ?? ( .idx += 1 && .pos = 0 ) !! .pos += 1;
      }
   }

   method find_chain(\nexp where nexp > 0) {
   # method 1 interface: search for chain ending with n, adding more as needed
      return ([1],) if nexp == 1;
      my @chn = self.chains.grep: *.[*-1] == nexp // [];
      unless @chn.Bool { 
         repeat { @chn = self.add_chain } until @chn[*-1][*-1] == nexp 
      }
      return @chn
   }

   method knuth_path(\ngoal) {
      # method 2: knuth method, uses memoization to search for a shorter chain
      return [] if ngoal < 1; 
      until self.pat{ngoal}:exists {
         self.lvl[0] = [ gather for self.lvl[0].Array -> \i {
            for self.knuth_path(i).Array -> \j {
               unless self.pat{i + j}:exists {
                  self.pat{i + j} = i and take i + j
               }
            }
         } ] 
      }
      return self.knuth_path(self.pat{ngoal}).append: ngoal
   }

   multi method cpow(\xbase, \chain) {
#  raise xbase by an addition exponentiation chain for what becomes x**chain[-1]
      my ($pows, %products) = 0, 1 => xbase;

      %products{0} = xbase ~~ Math::Matrix 
         ?? Math::Matrix.new([ [ 1 xx xbase.size[1] ] xx xbase.size[0] ]) !! 1;

      for chain.Array -> \i {
         %products{i} = %products{$pows} * %products{i - $pows};
         $pows = i
      }
      return %products{ chain[*-1] }
   }
}

my $chn =  AdditionChains.new( idx    =>      0, pos =>      0, 
                               chains => ([1],), lvl => ([1],), pat => {1=>0} );

say 'First one hundred addition chain lengths:';
.Str.say for ( (1..100).map: { +$chn.find_chain($_)[0] - 1 } ).rotor: 10;

my %chns = (31415, 27182).map: { $_ => $chn.knuth_path: $_ };

say "\nKnuth chains for addition chains of 31415 and 27182:";
say "Exponent: $_\n  Addition Chain: %chns{$_}[0..*-2]" for %chns.keys;
say '1.00002206445416^31415 = ', $chn.cpow(1.00002206445416, %chns{31415});
say '1.00002550055251^27182 = ', $chn.cpow(1.00002550055251, %chns{27182});
say '(1.000025 + 0.000058i)^27182 = ', $chn.cpow: 1.000025+0.000058i, %chns{27182};
say '(1.000022 + 0.000050i)^31415 = ', $chn.cpow: 1.000022+0.000050i, %chns{31415};

my \sq05 = 0.5.sqrt;
my \mat  = Math::Matrix.new( [[sq05,    0, sq05,     0, 0, 0], 
                             [    0, sq05,    0,  sq05, 0, 0], 
                             [    0, sq05,    0, -sq05, 0, 0],
                             [-sq05,    0, sq05,     0, 0, 0], 
                             [    0,    0,    0,     0, 0, 1], 
                             [    0,    0,    0,     0, 1, 0]] );

say 'matrix A ^ 27182 =';
say my $res27182 = $chn.cpow(mat, %chns{27182});
say 'matrix A ^ 31415 =';
say $chn.cpow(mat, %chns{31415});
say '(matrix A ** 27182) ** 31415 =';
say $chn.cpow($res27182, %chns{31415});
Output:
First one hundred addition chain lengths:
0 1 2 2 3 3 4 3 4 4
5 4 5 5 5 4 5 5 6 5
6 6 6 5 6 6 6 6 7 6
7 5 6 6 7 6 7 7 7 6
7 7 7 7 7 7 8 6 7 7
7 7 8 7 8 7 8 8 8 7
8 8 8 6 7 7 8 7 8 8
9 7 8 8 8 8 8 8 9 7
8 8 8 8 8 8 9 8 9 8
9 8 9 9 9 7 8 8 8 8

Knuth chains for addition chains of 31415 and 27182:
Exponent: 27182
  Addition Chain: 1 2 3 5 7 14 21 35 70 140 143 283 566 849 1698 3396 6792 6799 13591
Exponent: 31415
  Addition Chain: 1 2 3 5 7 14 28 56 61 122 244 488 976 1952 3904 7808 15616 15677 31293
1.00002206445416^31415 = 1.9999999998924638
1.00002550055251^27182 = 1.999999999974053
(1.000025 + 0.000058i)^27182 = -0.01128636963542673+1.9730308496660347i
(1.000022 + 0.000050i)^31415 = 0.00016144681325535107+1.9960329014194498i
matrix A ^ 27182 =
  0  0  0  0  0  0
  0  0  0  0  0  0
  0  0  0  0  0  0
  0  0  0  0  0  0
  0  0  0  0  0  1
  0  0  0  0  1  0
matrix A ^ 31415 =
   0  0  0   0  0  0
   0  0  0   0  0  0
   0  0  0  -0  0  0
  -0  0  0   0  0  0
   0  0  0   0  0  1
   0  0  0   0  1  0
(matrix A ** 27182) ** 31415 =
  0  0  0  0  0  0
  0  0  0  0  0  0
  0  0  0  0  0  0
  0  0  0  0  0  0
  0  0  0  0  0  1
  0  0  0  0  1  0

Tcl

This example is incorrect. Please fix the code and remove this message.
Details: The tasks explicitly asks to give only optimal solutions, star chains are not enough.

Using code at Matrix multiplication#Tcl and Matrix Transpose#Tcl (not shown here).

Translation of: Go
# Continued fraction addition chains, as described in "Efficient computation
# of addition chains" by F. Bergeron, J. Berstel, and S. Brlek, published in
# Journal de théorie des nombres de Bordeaux, 6 no. 1 (1994), p. 21-38,
# accessed at http://www.numdam.org/item?id=JTNB_1994__6_1_21_0.
#
# Uses the dichotomic strategy, which produces good results with simpler
# coding than for a pluggable non-deterministic strategy.

package require Tcl 8.5
namespace path {::tcl::mathop ::tcl::mathfunc}

proc minchain {n} {
    if {!($n & ($n-1))} {
	for {set i 1} {$i <= $n} {incr i $i} {lappend c $i}
	return $c
    } elseif {$n == 3} {
	return {1 2 3}
    }
    return [chain $n [expr {$n >> int(ceil(floor(log($n)/log(2))/2))}]]
}
proc chain {n1 n2} {
    set q [expr {$n1 / $n2}]
    set r [expr {$n1 % $n2}]
    if {$r == 0} {
	return [chain.* [minchain $n2] [minchain $q]]
    } else {
	return [chain.+ [chain.* [chain $n2 $r] [minchain $q]] $r]
    }
}
proc chain.+ {ns k} {
    return [lappend ns [expr {[lindex $ns end] + $k}]]
}
proc chain.* {ns ms} {
    set n_k [lindex $ns end]
    foreach m_i $ms {
	if {$m_i==1} continue
	lappend ns [expr {$n_k * $m_i}]
    }
    return $ns
}

# Generate a lambda term to do exponentiation with a given multiplier command.
# Works by extracting information from the addition chain; the lambda term
# generated is minimal
proc makeExponentiationLambda {n mulfunc} {
    set chain [minchain $n]
    set cmd {set a0}
    set idxes 0
    foreach c0 [lrange $chain 0 end-1] c1 [lrange $chain 1 end] {
	lappend idxes [lsearch $chain [expr {$c1 - $c0}]]
    }
    for {set i 1} {$i<[llength $chain]} {incr i} {
	set cmd "$mulfunc \[$cmd\] \$a[lindex $idxes $i]"
	if {$i in $idxes} {
	    set cmd "set a$i \[$cmd\]"
	}
    }
    list a0 $cmd
}

# Demonstrating application of problem to matrix exponentiation
proc count_mult {a b} {incr ::countMult;matrix_multiply $a $b}
set m 31415
set n 27182
set mn [expr {$m*$n}]
set pow_m [makeExponentiationLambda $m count_mult]
set pow_n [makeExponentiationLambda $n count_mult]
set pow_mn [makeExponentiationLambda $mn count_mult]

set rh [expr {sqrt(0.5)}]
set mrh [expr {-$rh}]
set A [subst {
    {$rh 0 $rh 0 0 0}
    {0 $rh 0 $rh 0 0}
    {0 $rh 0 $mrh 0 0}
    {$mrh 0 $rh 0 0 0}
    {0 0 0 0 0 1}
    {0 0 0 0 1 0}
}]
puts "A**$m"; set countMult 0
print_matrix [apply $pow_m $A] %6.3f
puts "$countMult matrix multiplies"
puts "A**$n"; set countMult 0
print_matrix [apply $pow_n $A] %6.3f
puts "$countMult matrix multiplies"
puts "A**$mn"; set countMult 0
print_matrix [apply $pow_mn $A] %6.3f
puts "$countMult matrix multiplies"
Output:
A**31415
 0.707  0.000  0.000 -0.707  0.000  0.000 
 0.000  0.707  0.707  0.000  0.000  0.000 
 0.707  0.000  0.000  0.707  0.000  0.000 
 0.000  0.707 -0.707  0.000  0.000  0.000 
 0.000  0.000  0.000  0.000  0.000  1.000 
 0.000  0.000  0.000  0.000  1.000  0.000 
19 matrix multiplies
A**27182
-0.500 -0.500 -0.500  0.500  0.000  0.000 
 0.500 -0.500 -0.500 -0.500  0.000  0.000 
-0.500 -0.500  0.500 -0.500  0.000  0.000 
 0.500 -0.500  0.500  0.500  0.000  0.000 
 0.000  0.000  0.000  0.000  1.000  0.000 
 0.000  0.000  0.000  0.000  0.000  1.000 
18 matrix multiplies
A**853922530
-0.500  0.500 -0.500  0.500  0.000  0.000 
-0.500 -0.500 -0.500 -0.500  0.000  0.000 
-0.500 -0.500  0.500  0.500  0.000  0.000 
 0.500 -0.500 -0.500  0.500  0.000  0.000 
 0.000  0.000  0.000  0.000  1.000  0.000 
 0.000  0.000  0.000  0.000  0.000  1.000 
37 matrix multiplies

Wren

Translation of: Go
Library: Wren-dynamic
Library: Wren-fmt

This is very slow (12 mins 50 secs) compared to Go (25 seconds) but, given the amount of calculation involved, not too bad for the Wren interpreter.

import "./dynamic" for Struct
import "./fmt" for Fmt

var Save = Struct.create("Save", ["p", "i", "v"])

var N    = 32
var NMAX = 40000

var u     = List.filled(N, 0)  // upper bounds
var l     = List.filled(N, 0)  // lower bounds
var out   = List.filled(N, 0)
var sum   = List.filled(N, 0)
var tail  = List.filled(N, 0)
var cache = List.filled(NMAX + 1, 0)
var known = 2
var stack = 0
var undo  = List.filled(N * N, null)

for (i in 0..1) l[i] = u[i] = i + 1
for (i in 0...N*N) undo[i] = Save.new(null, 0, 0)
cache[2] = 1

var replace = Fn.new { |x, i, n|
    undo[stack].p = x
    undo[stack].i = i
    undo[stack].v = x[i]
    x[i] = n
    stack = stack + 1
}

var restore = Fn.new { |n|
    while (stack > n) {
        stack = stack - 1
        undo[stack].p[undo[stack].i] = undo[stack].v
    }
}

/* lower and upper bounds */
var lower = Fn.new { |n, up|
    if (n <= 2 || (n <= NMAX && cache[n] != 0)) {
        if (up.count > 0) up[0] = cache[n]
        return cache[n]
    }
    var i = -1
    var o =  0
    while (n != 0) {
        if ((n&1) != 0) o = o + 1
        n = n >> 1
        i = i + 1
    }
    if (up.count > 0) {
        i = i - 1
        up[0] = o + i
    }
    while (true) {
        i = i + 1
        o = o >> 1
        if (o == 0) break
    }
    o = 2
    while (o * o < n) {
        if (n%o != 0) {
            o = o + 1
            continue
        }
        var q = cache[o] + cache[(n/o).floor]
        if (q < up[0]) {
            up[0] = q
            if (q == i) break
        }
        o = o + 1
    }
    if (n > 2) {
        if (up[0] > cache[n-2] + 1) up[0] = cache[n-1] + 1
        if (up[0] > cache[n-2] + 1) up[0] = cache[n-2] + 1
    }
    return i
}

var insert = Fn.new { |x, pos|
    var save = stack
    if (l[pos] > x || u[pos] < x) return false
    if (l[pos] != x) {
        replace.call(l, pos, x)
        var i = pos - 1
        while (u[i]*2 < u[i+1]) {
            var t = l[i+1] + 1
            if (t * 2 > u[i]) {
                restore.call(save)
                return false
            }
            replace.call(l, i, t)
            i = i - 1
        }
        i = pos + 1
        while (l[i] <= l[i-1]) {
            var t = l[i-1] + 1
            if (t > u[i]) {
                restore.call(save)
                return false
            }
            replace.call(l, i, t)
            i = i + 1
        }
    }
    if (u[pos] == x) return true
    replace.call(u, pos, x)
    var i = pos - 1
    while (u[i] >= u[i+1]) {
        var t = u[i+1] - 1
        if (t < l[i]) {
            restore.call(save)
            return false
        }
        replace.call(u, i, t)
        i = i - 1
    }
    i = pos + 1
    while (u[i] > u[i-1]*2) {
        var t = u[i-1] * 2
        if (t < l[i]) {
            restore.call(save)
            return false
        }
        replace.call(u, i, t)
        i = i + 1
    }
    return true
}

var try  // forward declaration

var seqRecur = Fn.new { |le|
    var n = l[le]
    if (le < 2) return true
    var limit = n - 1
    if (out[le] == 1) limit = n - tail[sum[le]]
    if (limit > u[le-1]) limit = u[le-1]

    // Try to break n into p + q, and see if we can insert p, q into
    // list while satisfying bounds.
    var p = limit
    var q = n - p
    while (q <= p) {
        if (try.call(p, q, le)) return true
        q = q + 1
        p = p -1
    }
    return false
}

try = Fn.new { |p, q, le|
    var pl = cache[p]
    if (pl >= le) return false
    var ql = cache[q]
    if (ql >= le) return false
    while (pl < le && u[pl] < p) pl = pl + 1
    var pu = pl - 1
    while (pu < le-1 && u[pu+1] >= p) pu = pu + 1
    while (ql < le && u[ql] < q) ql = ql + 1
    var qu = ql - 1
    while (qu < le-1 && u[qu+1] >= q) qu = qu + 1
    if (p != q && pl <= ql) pl = ql + 1
    if (pl > pu || ql > qu || ql > pu) return false
    if (out[le] == 0) {
        pu = le - 1
        pl = pu
    }
    var ps = stack
    while (pu >= pl) {
        if (!insert.call(p, pu)) {
            pu = pu - 1
            continue
        }
        out[pu] = out[pu] + 1
        sum[pu] = sum[pu] + le
        if (p != q) {
            var qs = stack
            var j = qu
            if (j >= pu) j = pu - 1
            while (j >= ql) {
                if (!insert.call(q, j)) {
                    j = j - 1
                    continue
                }
                out[j] = out[j] + 1
                sum[j] = sum[j] + le
                tail[le] = q
                if (seqRecur.call(le - 1)) return true
                restore.call(qs)
                out[j] = out[j] - 1
                sum[j] = sum[j] - le
                j = j - 1
            }
        } else {
            out[pu] = out[pu] + 1
            sum[pu] = sum[pu] + le
            tail[le] = p
            if (seqRecur.call(le - 1)) return true
            out[pu] = out[pu] - 1
            sum[pu] = sum[pu] - le
        }
        out[pu] = out[pu] - 1
        sum[pu] = sum[pu] - le
        restore.call(ps)
        pu = pu - 1
    }
    return false
}

var seq  // forward declaration

var seqLen // recursive function
seqLen = Fn.new { |n|
    if (n <= known) return cache[n]
    // Need all lower n to compute sequence.
    while (known+1 < n) seqLen.call(known + 1)
    var ub = 0
    var pub = [ub]
    var lb = lower.call(n, pub)
    ub = pub[0]
    while (lb < ub && seq.call(n, lb, []) == 0) {
        lb = lb + 1
    }
    known = n
    if (n&1023 == 0) System.print("Cached %(known)")
    cache[n] = lb
    return lb
}

seq = Fn.new { |n, le, buf|
    if (le == 0) le = seqLen.call(n)
    stack = 0
    l[le] = u[le] = n
    for (i in 0..le) out[i] = sum[i] = 0
    var i = 2
    while (i < le) {
        l[i] = l[i-1] + 1
        u[i] = u[i-1] * 2
        i = i + 1
    }
    i = le - 1
    while (i > 2) {
        if (l[i]*2 < l[i+1]) {
            l[i] = ((1 + l[i+1]) / 2).floor
        }
        if (u[i] >= u[i+1]) {
            u[i] = u[i+1] - 1
        }
        i = i - 1
    }
    if (!seqRecur.call(le)) return 0
    if (buf.count > 0) {
        for (i in 0..le) buf[i] = u[i]
    }
    return le
}

var binLen = Fn.new { |n|
    var r = -1
    var o = -1
    while (n != 0) {
        if (n&1 != 0) o = o + 1
        n = n >> 1
        r = r + 1
    }
    return r + o
}

var mul = Fn.new { |m1, m2|
    var rows1 = m1.count
    var rows2 = m2.count
    var cols1 = m1[0].count
    var cols2 = m2[0].count
    if (cols1 != rows2) Fiber.abort("Matrices cannot be multiplied.")
    var result = List.filled(rows1, null)
    for (i in 0...rows1) {
        result[i] = List.filled(cols2, 0)
        for (j in 0...cols2) {
            for (k in 0...rows2) {
                result[i][j] = result[i][j] + m1[i][k] * m2[k][j]
            }
        }
    }
    return result
}

var pow = Fn.new { |m, n, printout|
    var e = List.filled(N, 0)
    var v = List.filled(N, null)
    for (i in 0...N) v[i] = List.filled(N, 0)
    var le = seq.call(n, 0, e)
    if (printout) {
        System.print("Addition chain:")
        for (i in 0..le) {
            var c = (i == le) ? "\n" : " "
            System.write("%(e[i])%(c)")
        }
    }
    v[0] = m
    v[1] = mul.call(m, m)
    var i = 2
    while (i <= le) {
        var j = i - 1
        while (j != 0) {
            for (k in j..0) {
                if (e[k]+e[j] < e[i]) break
                if (e[k]+e[j] > e[i]) continue
                v[i] = mul.call(v[j], v[k])
                j = 1
                break
            }
            j = j - 1
        }
        i = i + 1
    }
    return v[le]
}

var print = Fn.new { |m|
    for (v in m) Fmt.print("$9.6f", v)
    System.print()
}

var m = 27182
var n = 31415
System.print("Precompute chain lengths:")
seqLen.call(n)
var rh = (0.5).sqrt
var mx = [
    [rh,  0, rh,   0, 0, 0],
    [0,  rh,  0,  rh, 0, 0],
    [0,  rh,  0, -rh, 0, 0],
    [-rh, 0, rh,   0, 0, 0],
    [0,   0,  0,   0, 0, 1],
    [0,   0,  0,   0, 1, 0]
]
System.print("\nThe first 100 terms of A003313 are:")
for (i in 1..100) {
    Fmt.write("$d ", seqLen.call(i))
    if (i%10 == 0) System.print()
}
var exs = [m, n]
var mxs = List.filled(2, null)
var i = 0
for (ex in exs) {
    System.print("\nExponent: %(ex)")
    mxs[i] = pow.call(mx, ex, true)
    Fmt.print("A ^ $d:-\n", ex)
    print.call(mxs[i])
    System.print("Number of A/C multiplies: %(seqLen.call(ex))")
    System.print("  c.f. Binary multiplies: %(binLen.call(ex))")
    i = i + 1
}
Fmt.print("\nExponent: $d x $d = $d", m, n, m*n)
Fmt.print("A ^ $d = (A ^ $d) ^ $d:-\n", m*n, m, n)
var mx2 = pow.call(mxs[0], n, false)
print.call(mx2)
Output:
Precompute chain lengths:
Cached 1024
Cached 2048
Cached 3072
....
Cached 28672
Cached 29696
Cached 30720

The first 100 terms of A003313 are:
0 1 2 2 3 3 4 3 4 4 
5 4 5 5 5 4 5 5 6 5 
6 6 6 5 6 6 6 6 7 6 
7 5 6 6 7 6 7 7 7 6 
7 7 7 7 7 7 8 6 7 7 
7 7 8 7 8 7 8 8 8 7 
8 8 8 6 7 7 8 7 8 8 
9 7 8 8 8 8 8 8 9 7 
8 8 8 8 8 8 9 8 9 8 
9 8 9 9 9 7 8 8 8 8 

Exponent: 27182
Addition chain:
1 2 4 8 10 18 28 46 92 184 212 424 848 1696 3392 6784 13568 27136 27182
A ^ 27182:-

[-0.500000 -0.500000 -0.500000  0.500000  0.000000  0.000000]
[ 0.500000 -0.500000 -0.500000 -0.500000  0.000000  0.000000]
[-0.500000 -0.500000  0.500000 -0.500000  0.000000  0.000000]
[ 0.500000 -0.500000  0.500000  0.500000  0.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  1.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  0.000000  1.000000]

Number of A/C multiplies: 18
  c.f. Binary multiplies: 21

Exponent: 31415
Addition chain:
1 2 4 8 16 17 33 49 98 196 392 784 1568 3136 6272 6289 12561 25122 31411 31415
A ^ 31415:-

[ 0.707107  0.000000  0.000000 -0.707107  0.000000  0.000000]
[ 0.000000  0.707107  0.707107  0.000000  0.000000  0.000000]
[ 0.707107  0.000000  0.000000  0.707107  0.000000  0.000000]
[ 0.000000  0.707107 -0.707107  0.000000  0.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  0.000000  1.000000]
[ 0.000000  0.000000  0.000000  0.000000  1.000000  0.000000]

Number of A/C multiplies: 19
  c.f. Binary multiplies: 24

Exponent: 27182 x 31415 = 853922530
A ^ 853922530 = (A ^ 27182) ^ 31415:-

[-0.500000  0.500000 -0.500000  0.500000  0.000000  0.000000]
[-0.500000 -0.500000 -0.500000 -0.500000  0.000000  0.000000]
[-0.500000 -0.500000  0.500000  0.500000  0.000000  0.000000]
[ 0.500000 -0.500000 -0.500000  0.500000  0.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  1.000000  0.000000]
[ 0.000000  0.000000  0.000000  0.000000  0.000000  1.000000]