Gradient descent
=={{header|ALGOL 68}}==
{{Trans|Fortran}}
The results agree with the Fortran and Julia samples to 6 places.
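Like the Go sample from which it ultimately derives, the routine iterates the steepest-descent update

:<math>\mathbf{x} \leftarrow \mathbf{x} - \frac{\alpha}{\lVert\nabla g(\mathbf{x})\rVert}\,\nabla g(\mathbf{x})</math>

halving <math>\alpha</math> whenever <math>g</math> increases, and stops once the gradient norm falls below the tolerance.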
<syntaxhighlight lang="algol68">PROC steepest descent = ( REF[]LONG REAL x, LONG REAL alphap, tolerance )VOID:
BEGIN
LONG REAL alpha := alphap;
print( ( "Testing steepest descent method:", newline ) );
print( ( "The minimum is at x[0] = ", fixed( x[ 0 ], -10, 6 ), ", x[1] = ", fixed( x[ 1 ], -10, 6 ), newline ) )
END</syntaxhighlight>
{{out}}
<pre>
Testing steepest descent method:
The minimum is at x[0] =   0.107627, x[1] =  -1.223260
</pre>

=={{header|ALGOL W}}==
{{Trans|ALGOL 68}} which is in turn a {{Trans|Go}} with the gradient function from {{Trans|Fortran}}.
The results agree (to 6 places) with the Fortran and Julia samples.
<syntaxhighlight lang="algolw">begin
procedure steepestDescent ( long real array x ( * ); long real value alphap, tolerance ) ;
begin
write( "The minimum is at x(0) = ", x( 0 ), ", x(1) = ", x( 1 ) )
end
end.</syntaxhighlight>
{{out}}
<pre>
The minimum is at x(0) = 0.1076268, x(1) = -1.2232596
</pre>
 
=={{header|Common Lisp}}==
{{works with|SBCL}}
 
<syntaxhighlight lang="lisp">
(defparameter *epsilon* (sqrt 0.0000001d0)) ; relative step for the central differences

;; Function for which the minimum is to be found.
(defun myfun (x y)
  (+ (* (- x 1)
        (- x 1)
        (exp (- (expt y 2))))
     (* y (+ 2 y)
        (exp (- (* 2 (expt x 2)))))))

;; Partial derivative with respect to x, by central difference.
(defun xd (x y)
  (/ (- (myfun (* x (+ 1 *epsilon*)) y)
        (myfun (* x (- 1 *epsilon*)) y))
     (* 2 x *epsilon*)))

;; Partial derivative with respect to y, by central difference.
(defun yd (x y)
  (/ (- (myfun x (* y (+ 1 *epsilon*)))
        (myfun x (* y (- 1 *epsilon*))))
     (* 2 y *epsilon*)))

;; Gradient descent from (0.1, -1.0) with a fixed step of 0.3.
(defun gd ()
  (let ((x 0.1d0)
        (y -1.0d0))
    (dotimes (i 31)
      (setf x (- x (* 0.3d0 (xd x y))))
      (setf y (- y (* 0.3d0 (yd x y)))))
    (list x y)))
</syntaxhighlight>
{{out}}
<pre>
(0.10762684380416455 -1.2232596637047382)
</pre>
 
 
=={{header|Forth}}==
{{works with|SwiftForth}}
{{libheader|fpmath}}
 
 
Approach using the analytic partial derivatives GD1 and GD2:
 
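For reference, GD1 and GD2 implement the analytic partial derivatives of the task function

:<math>f(x,y) = (x-1)^2 e^{-y^2} + y(y+2)\,e^{-2x^2},</math>

namely

:<math>\frac{\partial f}{\partial x} = 2(x-1)e^{-y^2} - 4xy(y+2)\,e^{-2x^2}, \qquad \frac{\partial f}{\partial y} = -2(x-1)^2 y\,e^{-y^2} + (2y+2)\,e^{-2x^2}.</math>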
<syntaxhighlight lang="forth">
 
0e fvalue px
0e fvalue py
fvariable x 0.1e x F!
fvariable y -1e y F!
 
: ^2 FDUP F* ;
 
\ gradient for x, analytical function
 
: GD1 ( F: x y -- df/dx )
to py to px
2e px 1e F- F* py ^2 FNEGATE FEXP F*
-4e px F* py F* py 2e F+ F* px ^2 -2e F* FEXP F*
F+
;
 
\ gradient for y, analytical function
 
: GD2 ( F: x y -- df/dy )
to py to px
px ^2 -2e F* FEXP FDUP
py F* FSWAP
py 2e F+ F* F+
py ^2 FNEGATE FEXP py F*
px 1e F- ^2 F*
-2e F*
F+
;
 
\ gradient descent
: GD ( x y n -- ) \ gradient descent, initial guess and number of iterations
FOVER FOVER y F! x F!
0 do
FOVER FOVER
GD1 0.3e F* x F@ FSWAP F- x F!
GD2 0.3e F* y F@ FSWAP F- y F!
x F@ y F@ FOVER FOVER FSWAP F. F. CR
loop
;
 
</syntaxhighlight>
 
Approach using MYFUN and the numerical gradients xd and xy, which approximate the partial derivatives by the central differences
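:<math>\frac{\partial f}{\partial x} \approx \frac{f\bigl(x(1+h),\,y\bigr)-f\bigl(x(1-h),\,y\bigr)}{2xh}, \qquad \frac{\partial f}{\partial y} \approx \frac{f\bigl(x,\,y(1+h)\bigr)-f\bigl(x,\,y(1-h)\bigr)}{2yh}, \qquad h=\sqrt{\mathrm{EPSILON}}</math>

(the step is relative to the argument's magnitude, so this form cannot be used at <math>x = 0</math> or <math>y = 0</math>):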
 
<syntaxhighlight lang="forth">
 
0e fvalue px
0e fvalue py
fvariable x 0.1e x F!
fvariable y -1e y F!
 
1e-6 fvalue EPSILON
 
: fep EPSILON FSQRT ;
 
 
: MYFUN ( x y -- fxy )
FDUP FROT FDUP ( y y x x )
( x ) 1e F- FDUP F* ( y y x a )
FROT ( y x a y )
( y ) FDUP F* FNEGATE FEXP F* ( y x b )
FROT ( x b y )
( y ) FDUP 2e F+ F* ( x b c )
FROT ( b c x )
( x ) FDUP F* -2e F* FEXP F*
F+
;
 
 
: xd ( F: x y -- df/dx )
to py to px
px 1e fep F+ F* py myfun px 1e fep F- F* py myfun F- 2e px F* fep F* F/
;
 
: xy ( F: x y -- df/dy )
to py to px
py 1e fep F+ F* px FSWAP myfun py 1e fep F- F* px FSWAP myfun F- 2e py F* fep F* F/
;
 
 
: GDB ( x y n -- ) \ gradient descent, initial guess and number of iterations
CR FOVER FOVER y F! x F!
0 do
FOVER FOVER
xd 0.3e F* x F@ FSWAP F- x F!
xy 0.3e F* y F@ FSWAP F- y F!
x F@ y F@ FOVER FOVER FSWAP F. F. CR
loop
;
</syntaxhighlight>
 
{{out}}
<pre>
 
0.1e -1e 20 GD
0.1810310 -1.1787894
0.1065262 -1.1965306
0.1144636 -1.2181779
0.1075002 -1.2206140
0.1082816 -1.2227594
0.1076166 -1.2230041
0.1076894 -1.2232108
0.1076261 -1.2232350
0.1076328 -1.2232549
0.1076267 -1.2232573
0.1076274 -1.2232592
0.1076268 -1.2232594
0.1076269 -1.2232596
0.1076268 -1.2232596
0.1076268 -1.2232597
0.1076268 -1.2232597
0.1076268 -1.2232597
0.1076268 -1.2232597
0.1076268 -1.2232597
0.1076268 -1.2232597
 
0.1e -1e 20 GDB
0.1810310 -1.1787893
0.1065262 -1.1965305
0.1144636 -1.2181779
0.1075002 -1.2206140
0.1082816 -1.2227594
0.1076166 -1.2230041
0.1076894 -1.2232108
0.1076261 -1.2232350
0.1076328 -1.2232549
0.1076268 -1.2232573
0.1076274 -1.2232592
0.1076268 -1.2232594
0.1076269 -1.2232596
0.1076268 -1.2232596
0.1076268 -1.2232597
0.1076268 -1.2232597
0.1076268 -1.2232597
0.1076268 -1.2232597
0.1076268 -1.2232597
0.1076268 -1.2232597
</pre>
 
=={{header|Fortran}}==
(%o7) done
The optimization subroutine GD sets the reverse communication variable IFLAG. This allows the evaluation of the gradient to be done separately.
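For readers unfamiliar with the pattern, here is a minimal reverse-communication sketch in Go: the optimizer returns a request code instead of calling the gradient routine itself, and the caller's driver loop answers each request. The two-code protocol, the names, and the fixed 31-step/0.3 schedule below are illustrative assumptions, not the Fortran routine's actual IFLAG interface. The Fortran source follows.

<syntaxhighlight lang="go">package main

import (
    "fmt"
    "math"
)

// Request codes: a reverse-communication optimizer asks its caller to do
// the evaluations rather than calling user code itself. (Illustrative
// protocol only; the Fortran GD routine's IFLAG values may differ.)
const (
    needGrad = iota // caller must refresh the gradient g at the current x
    finished        // iteration limit reached
)

// gd takes one fixed-size steepest-descent step using the caller-supplied
// gradient, then reports what it needs next.
func gd(x, g []float64, alpha float64, iter *int, maxIter int) int {
    if *iter >= maxIter {
        return finished
    }
    for i := range x {
        x[i] -= alpha * g[i]
    }
    *iter++
    return needGrad
}

func main() {
    x := []float64{0.1, -1.0} // initial guess
    g := make([]float64, 2)
    evalG := func() { // analytic gradient of the task function
        xx, y := x[0], x[1]
        g[0] = 2*(xx-1)*math.Exp(-y*y) - 4*xx*y*(y+2)*math.Exp(-2*xx*xx)
        g[1] = -2*(xx-1)*(xx-1)*y*math.Exp(-y*y) + (2*y+2)*math.Exp(-2*xx*xx)
    }
    iter := 0
    evalG() // answer the implicit initial request
    for gd(x, g, 0.3, &iter, 31) == needGrad {
        evalG() // the driver loop performs each requested evaluation
    }
    fmt.Printf("The minimum is near x = %f, y = %f\n", x[0], x[1])
}</syntaxhighlight>

This inversion of control lets the caller own the evaluation context, so the gradient can be computed however is convenient without passing callbacks into the optimizer.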
<syntaxhighlight lang="fortran"> SUBROUTINE EVALFG (N, X, F, G)
IMPLICIT NONE
INTEGER N
STOP 'program complete'
END
</syntaxhighlight>
{{out}}
<pre>After 31 steps, found minimum at x= 0.107626843548372 y= -1.223259663839920 of f= -0.750063420551493
</pre>

=={{header|Go}}==
 
However, since it was originally written, I've substituted the Fortran version's gradient function for the original one (see Talk page), which now gives results that agree (to 6 decimal places) with those of the Fortran, Julia, ALGOL 68 and ALGOL W solutions. As a number of other solutions are based on this one, I suggest their authors update them accordingly.
<syntaxhighlight lang="go">package main
 
import (
fmt.Println("Testing steepest descent method:")
fmt.Printf("The minimum is at x = %f, y = %f for which f(x, y) = %f.\n", x[0], x[1], g(x))
}</syntaxhighlight>
 
{{out}}
<pre>
Testing steepest descent method:
The minimum is at x = 0.107627, y = -1.223260 for which f(x, y) = -0.750063.
</pre>
 
=={{header|J}}==
{{works with|Jsoftware|>= 9.0.1}}
{{libheader|math/calculus}}
 
In earlier versions without the calculus library, <code>D.</code> can be used instead of <code>pderiv</code>.
 
<syntaxhighlight lang="J">
 
 
load 'math/calculus'
coinsert 'jcalculus'
 
NB. initial guess
 
go=: 0.1 _1
 
NB. function
 
func=: monad define
'xp yp' =. y
((1-xp)*(1-xp) * ] ^-(yp)^2) + yp*(yp+2)* ] ^ _2 * xp^2 NB. evaluates from right to left without precedence
)
 
NB. gradient descent
 
shortygd =: monad define
go=.y
go=. go - 0.03 * func pderiv 1 ] go NB. use D. instead of pderiv for earlier versions
)
 
NB. use:
 
shortygd ^:_ go
 
</syntaxhighlight>
{{out}}
<pre>
0.107627 _1.22326
</pre>
 
=={{header|jq}}==
'''Adapted from [[#Go|Go]]'''
 
'''Works with jq and gojq, the C and Go implementations of jq'''
 
The function, g, and its gradient, gradG, are defined as 0-arity filters so that they
can be passed as formal parameters to the <code>steepestDescent</code> function.
<syntaxhighlight lang="jq">
def sq: .*.;
 
def ss(stream): reduce stream as $x (0; . + ($x|sq));
 
# Function for which minimum is to be found: input is a 2-array
def g:
. as $x
| (($x[0]-1)|sq) *
(( - ($x[1]|sq))|exp) + $x[1]*($x[1]+2) *
((-2*($x[0]|sq))|exp);
 
# Provide a rough calculation of gradient g(p):
def gradG:
. as [$x, $y]
| ($x*$x) as $x2
| [2*($x-1)*(-($y|sq)|exp) - 4 * $x * (-2*($x|sq)|exp) * $y*($y+2),
- 2*($x-1)*($x-1)*$y*((-$y*$y)|exp)
+ ((-2*$x2)|exp)*($y+2) + ((-2*$x2)|exp)*$y] ;
 
def steepestDescent(g; gradG; $x; $alpha; $tolerance):
($x|length) as $n
| {g0 : ($x|g),        # Initial estimate of result
alpha: $alpha,
fi: ($x|gradG), # Calculate initial gradient
x: $x,
iterations: 0
}
 
# Calculate initial norm:
| .delG = (ss( .fi[] ) | sqrt )
| .b = .alpha/.delG
 
# Iterate until value is <= $tolerance:
| until (.delG <= $tolerance;
.iterations += 1
# Calculate next value:
| reduce range(0; $n) as $i(.;
.x[$i] += (- .b * .fi[$i]) )
 
# Calculate next gradient:
| .fi = (.x|gradG)
 
# Calculate next norm:
| .delG = (ss( .fi[] ) | sqrt)
| .b = .alpha/.delG
 
# Calculate next value:
| .g1 = (.x|g)
 
# Adjust parameter:
| if .g1 > .g0
then .alpha /= 2
else .g0 = .g1
end
) ;
 
def tolerance: 0.0000006;
def alpha: 0.1;
def x: [0.1, -1]; # Initial guess of location of minimum.
 
def task:
steepestDescent(g; gradG; x; alpha; tolerance)
| "Testing the steepest descent method:",
"After \(.iterations) iterations,",
"the minimum is at x = \(.x[0]), y = \(.x[1]) for which f(x, y) = \(.x|g)." ;
 
task
</syntaxhighlight>
{{out}}
<pre>
Testing the steepest descent method:
After 24 iterations,
the minimum is at x = 0.10762682432948058, y = -1.2232596548816101 for which f(x, y) = -0.7500634205514924.
</pre>
 
=={{header|Julia}}==
<syntaxhighlight lang="julia">using Optim, Base.MathConstants
 
f(x) = (x[1] - 1) * (x[1] - 1) * e^(-x[2]^2) + x[2] * (x[2] + 2) * e^(-2 * x[1]^2)
 
println(optimize(f, [0.1, -1.0], GradientDescent()))
</syntaxhighlight>{{out}}
<pre>
Results of Optimization Algorithm
* Gradient Calls: 35
</pre>
 
=={{header|Nim}}==
{{trans|Go}}
The gradient function has been rewritten to remove a term in the partial derivative with respect to y (two terms instead of three). This doesn't change the result, which agrees with those of Go, Fortran, Julia, etc.
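The simplification follows because the second and third terms of the original expression share the factor <math>e^{-2x^2}</math>:

:<math>e^{-2x^2}(y+2) + e^{-2x^2}y = 2(y+1)\,e^{-2x^2},</math>

so the partial derivative becomes <math>\frac{\partial f}{\partial y} = -2(x-1)^2 y\,e^{-y^2} + 2(y+1)\,e^{-2x^2}</math>.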
 
<syntaxhighlight lang="nim">import math, strformat
 
 
func steepDescent(g: proc(x: openArray[float]): float;
gradG: proc(p: openArray[float]): seq[float];
x: var openArray[float]; alpha, tolerance: float) =
let n = x.len
var alpha = alpha
var g0 = g(x) # Initial estimate of result.
 
# Calculate initial gradient.
var fi = gradG(x)
 
# Calculate initial norm.
var delG = 0.0
for i in 0..<n:
delG += fi[i] * fi[i]
delG = sqrt(delG)
var b = alpha / delG
 
# Iterate until value is <= tolerance.
while delG > tolerance:
# Calculate next value.
for i in 0..<n:
x[i] -= b * fi[i]
 
# Calculate next gradient.
fi = gradG(x)
 
# Calculate next norm.
delG = 0
for i in 0..<n:
delG += fi[i] * fi[i]
delG = sqrt(delG)
b = alpha / delG
 
# Calculate next value.
let g1 = g(x)
 
# Adjust parameter.
if g1 > g0: alpha *= 0.5
else: g0 = g1
 
 
when isMainModule:
 
func g(x: openArray[float]): float =
## Function for which minimum is to be found.
(x[0]-1) * (x[0]-1) * exp(-x[1]*x[1]) + x[1] * (x[1]+2) * exp(-2*x[0]*x[0])
 
func gradG(p: openArray[float]): seq[float] =
## Provides a rough calculation of gradient g(p).
result = newSeq[float](p.len)
let x = p[0]
let y = p[1]
result[0] = 2 * (x-1) * exp(-y*y) - 4 * x * exp(-2*x*x) * y * (y+2)
result[1] = -2 * (x-1) * (x-1) * y * exp(-y*y) + 2 * (y+1) * exp(-2*x*x)
 
const
Tolerance = 0.0000006
Alpha = 0.1
 
var x = [0.1, -1.0] # Initial guess of location of minimum.
 
steepDescent(g, gradG, x, Alpha, Tolerance)
echo "Testing steepest descent method:"
echo &"The minimum is at x = {x[0]:.12f}, y = {x[1]:.12f} for which f(x, y) = {g(x):.12f}"</syntaxhighlight>
 
{{out}}
<pre>Testing steepest descent method:
The minimum is at x = 0.107626824329, y = -1.223259654882 for which f(x, y) = -0.750063420551</pre>
 
=={{header|Perl}}==
Calculate with <code>bignum</code> for numerical stability.
{{trans|Raku}}
<syntaxhighlight lang="perl">use strict;
use warnings;
use bignum;
my @x = <0.1 -1>; # Initial guess of location of minimum.
 
printf "The minimum is at x[0] = %.6f, x[1] = %.6f", steepestDescent($alpha, $tolerance, @x);</langsyntaxhighlight>
{{out}}
<pre>The minimum is at x[0] = 0.107653, x[1] = -1.223370</pre>
=={{header|Phix}}==
{{trans|Go}}
<syntaxhighlight lang="phix">
with javascript_semantics
-- Function for which minimum is to be found.
function g(sequence x)
    atom {x0,x1} = x
    return (x0-1)*(x0-1)*exp(-x1*x1) +
           x1*(x1+2)*exp(-2*x0*x0)
end function

-- Provides a rough calculation of gradient g(x).
function gradG(sequence p)
    atom {x,y} = p,
         xm1 = x-1,
         emyy = exp(-y*y),
         em2xx = exp(-2*x*x),
         xm12emyy = xm1*2*emyy
    p = {xm12emyy - 4*x*em2xx*y*(y+2),
         -xm12emyy*xm1*y + em2xx*(2*y+2)}
    return p
end function

function steepestDescent(sequence x, atom alpha, tolerance)
    sequence fi = gradG(x)                  -- Calculate initial gradient
    atom g0 = g(x),                         -- Initial estimate of result
         delG = sqrt(sum(sq_mul(fi,fi))),   -- & norm
         b = alpha / delG
    integer icount = 0                      -- iteration limiter/sanity check
    while delG>tolerance do                 -- Iterate until <= tolerance
        x = sq_sub(x,sq_mul(b,fi))          -- Calculate next value
        fi = gradG(x)                       -- next gradient
        delG = sqrt(sum(sq_mul(fi,fi)))     -- next norm
        b = alpha / delG
        atom g1 = g(x)                      -- next value
        if g1>g0 then
            alpha /= 2                      -- Adjust parameter
        else
            g0 = g1
        end if
        icount += 1
        assert(icount<100)                  -- (increase if/when necessary)
    end while
    return x
end function

constant alpha = 0.1, tolerance = 0.00000000000001
sequence x = steepestDescent({0.1,-1}, alpha, tolerance)
printf(1,"Testing steepest descent method:\n")
printf(1,"The minimum is at x = %.13f, y = %.13f for which f(x, y) = %.15f\n", {x[1], x[2], g(x)})
</syntaxhighlight>
{{out}}
Results now match (to at least 6dp) Algol 68/W, Fortran, Go, Julia, Raku, REXX, and Wren.<br>
Some slightly unexpected/unusual (but I think acceptable) variance was noted when playing with different tolerances; be warned.<br>
Results on 32/64 bit Phix agree to 13dp, which I therefore choose to show in full here (but otherwise would not really trust).
<pre>
Testing steepest descent method:
The minimum is at x = 0.1076268243295, y = -1.2232596548816 for which f(x, y) = -0.750063420551492
</pre>
 
=={{header|Racket}}==
I could have used ∇ and Δ in the variable names, but it looked too confusing, so I've gone with <var>grad-</var> and <var>del-</var>
 
<syntaxhighlight lang="racket">#lang racket
 
(define (apply-vector f v)
(module+ main
(Gradient-descent))
</syntaxhighlight>
 
{{out}}
=={{header|Raku}}==
(formerly Perl 6)
{{trans|Go}}
<syntaxhighlight lang="raku" line># 20200904 Updated Raku programming solution
 
sub steepestDescent(@x, $alpha is copy, $h) {
say "Testing steepest descent method:";
say "The minimum is at x[0] = ", @x[0], ", x[1] = ", @x[1];
</syntaxhighlight>
{{out}}
<pre>Testing steepest descent method:
</pre>
=={{header|REXX}}==
The &nbsp; ''tolerance'' &nbsp; can be much smaller; &nbsp; a tolerance of &nbsp; '''1e-200''' &nbsp; was tested. &nbsp; It works, but causes the program to execute a bit slower, though still with sub-second execution time.
<syntaxhighlight lang="rexx">/*REXX pgm searches for minimum values of the bi─variate function (AKA steepest descent)*/
numeric digits (length( e() ) - length(.) ) % 2 /*use half of number decimal digs in E.*/
tolerance= 1e-30 /*use a much smaller tolerance for REXX*/
numeric form; m.=9; parse value format(x,2,1,,0) 'E0' with g "E" _ .; g=g *.5'e'_ %2
do j=0 while h>9; m.j=h; h= h % 2 + 1; end /*j*/
do k=j+5  to 0  by -1; numeric digits m.k; g=(g+x/g)*.5; end /*k*/; return g</syntaxhighlight>
{{out|output|text=&nbsp; when using the internal default inputs:}}
<pre>
</pre>
=={{header|Scala}}==
{{trans|Go}}
<syntaxhighlight lang="scala">object GradientDescent {
 
/** Steepest descent method modifying input values*/
}
}
</syntaxhighlight>
{{out}}
<pre>
</pre>

=={{header|TypeScript}}==
<br><br>
 
<syntaxhighlight lang="typescript">
// Using the steepest-descent method to search
// for minimum values of a multi-variable function
gradientDescentMain();
 
</syntaxhighlight>
{{out}}
<pre>
</pre>
* &nbsp; [Linear Regression using Gradient Descent by Adarsh Menon]
<br><br>
<syntaxhighlight lang="typescript">
let data: number[][] =
[[32.5023452694530, 31.70700584656990],
 
gradientDescentMain();
</syntaxhighlight>
 
=={{header|Wren}}==
{{trans|Go}}
{{libheader|Wren-fmt}}
<syntaxhighlight lang="wren">import "./fmt" for Fmt
 
// Function for which minimum is to be found.
var g = Fn.new { |x|
return (x[0]-1)*(x[0]-1)*
        (-x[1]*x[1]).exp + x[1]*(x[1]+2)*
        (-2*x[0]*x[0]).exp
}
 
// Provides a rough calculation of gradient g(p).
var gradG = Fn.new { |p|
var x = p[0]
var y = p[1]
    return [2*(x-1)*(-y*y).exp - 4*x*(-2*x*x).exp*y*(y+2),
        -2*(x-1)*(x-1)*y*(-y*y).exp + (-2*x*x).exp*(y+2) + (-2*x*x).exp*y]
}
 
steepestDescent.call(x, alpha, tolerance)
System.print("Testing steepest descent method:")
Fmt.print("The minimum is at x = $f, y = $f for which f(x, y) = $f.", x[0], x[1], g.call(x))</langsyntaxhighlight>
 
{{out}}
=={{header|zkl}}==
{{trans|Go}} with tweaked gradG
<syntaxhighlight lang="zkl">fcn steepestDescent(f, x,y, alpha, h){
g0:=f(x,y); # Initial estimate of result.
fix,fiy := gradG(f,x,y,h); # Calculate initial gradient
g0:=f(x,y);
return((f(x + h, y) - g0)/h, (f(x, y + h) - g0)/h)
}</syntaxhighlight>
<syntaxhighlight lang="zkl">fcn f(x,y){ # Function for which minimum is to be found.
(x - 1).pow(2)*(-y.pow(2)).exp() +
y*(y + 2)*(-2.0*x.pow(2)).exp()
println("Testing steepest descent method:");
println("The minimum is at (x,y) = (%f,%f). f(x,y) = %f".fmt(x,y,f(x,y)));</langsyntaxhighlight>
{{out}}
<pre>