Multiple regression

From Rosetta Code
Revision as of 13:41, 9 August 2009 by rosettacode>Dkf (→‎{{header|Tcl}}: whitespace)

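Given a set of data vectors in the following format:

<math>y = \{ y_1, y_2, \ldots, y_n \}</math>

<math>X_i = \{ x_{i1}, x_{i2}, \ldots, x_{in} \}, \quad i \in 1..k</math>

Compute the vector <math>\beta = \{ \beta_1, \beta_2, \ldots, \beta_k \}</math> by ordinary least squares regression, based on the following equation:

<math>y_j = \sum_i \beta_i \cdot x_{ij}, \quad j \in 1..n</math>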

You can assume y is given to you as an array, and x is given to you as a two-dimensional array.
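
In matrix form, the least squares solution is usually written via the normal equations. The following is a sketch only, assuming notation that is not in the task text: X is the n × k matrix whose columns are the data vectors, and XᵀX is assumed to be invertible.

```latex
\hat{\beta} = \arg\min_{\beta} \lVert y - X\beta \rVert_2^2
\quad\Longrightarrow\quad
X^{\mathsf T} X \, \hat{\beta} = X^{\mathsf T} y
\quad\Longrightarrow\quad
\hat{\beta} = (X^{\mathsf T} X)^{-1} X^{\mathsf T} y
```

The solutions on this page compute this last expression directly, with an explicit matrix inverse.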

Note: This is more general than Polynomial Fitting, which handles only two datasets and only polynomial equations. Ordinary least squares can handle an arbitrary number of datasets (limited by the processing power of the machine) and can fit more advanced equations such as:

Ruby

Using the standard library Matrix class:

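<lang ruby>require 'matrix'

def regression_coefficients y, x
  # y becomes an n x 1 column vector of floats
  y = Matrix.column_vector y.map { |i| i.to_f }
  # each data vector in x becomes one column of an n x k matrix
  x = Matrix.columns x.map { |xi| xi.map { |i| i.to_f }}
  # ordinary least squares: (X^T X)^-1 X^T y
  (x.t * x).inverse * x.t * y
end</lang>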

Testing:
<lang ruby>puts regression_coefficients([1, 2, 3, 4, 5], [ [2, 1, 3, 4, 5] ])</lang>
Output:

Matrix[[0.981818181818182]]
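
The test above uses a single data vector. As a sketch of the multi-vector case, the same function can be applied to the height and weight data used in the Tcl section below, fitting weight against the powers 0, 1 and 2 of height. The helper `polynomial_regressors` and the constants `HEIGHTS` and `WEIGHTS` are names made up for this illustration and are not part of the original solution.

```ruby
require 'matrix'

# Same approach as the solution above: (X^T X)^-1 X^T y by explicit inversion.
def regression_coefficients(y, x)
  y = Matrix.column_vector(y.map(&:to_f))
  x = Matrix.columns(x.map { |xi| xi.map(&:to_f) })
  (x.t * x).inverse * x.t * y
end

# Height (m) and weight (kg) data, copied from the Tcl example on this page.
HEIGHTS = [1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
           1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83]
WEIGHTS = [52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
           63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46]

# Hypothetical helper: builds one data vector per power of the input,
# i.e. v**0, v**1, ..., v**degree, so the fitted model is
# w = b0 + b1*h + b2*h**2 when degree is 2.
def polynomial_regressors(values, degree)
  (0..degree).map { |power| values.map { |v| v**power } }
end

coefficients = regression_coefficients(WEIGHTS, polynomial_regressors(HEIGHTS, 2))
puts coefficients.to_a.flatten.inspect
```

The three coefficients should agree closely with the Tcl output shown further down (roughly 128.81, -143.16 and 61.96), although the last few printed digits may differ between implementations.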

Tcl

Uses the tcllib linear algebra package (math::linearalgebra).

<lang tcl>package require math::linearalgebra
namespace eval multipleRegression {
    namespace export regressionCoefficients
    namespace import ::math::linearalgebra::*
    # Matrix inversion is defined in terms of Gaussian elimination
    # Note that we assume (correctly) that we have a square matrix
    proc invert {matrix} {
        solveGauss $matrix [mkIdentity [lindex [shape $matrix] 0]]
    }
    # Implement the Ordinary Least Squares method
    proc regressionCoefficients {y x} {
        matmul [matmul [invert [matmul $x [transpose $x]]] $x] $y
    }
}
namespace import multipleRegression::regressionCoefficients</lang>
Using an example from the Wikipedia page on the correlation of height and weight:
<lang tcl># Simple helper just for this example
proc map {n exp list} {
    upvar 1 $n v
    set r {}; foreach v $list {lappend r [uplevel 1 $exp]}; return $r
}

# Data from Wikipedia
set x {
    1.47 1.50 1.52 1.55 1.57 1.60 1.63 1.65 1.68 1.70 1.73 1.75 1.78 1.80 1.83
}
set y {
    52.21 53.12 54.48 55.84 57.20 58.57 59.93 61.29 63.11 64.47 66.28 68.10
    69.92 72.19 74.46
}
# Wikipedia states that fitting up to the square of x[i] is worth it
puts [regressionCoefficients $y [map n {map v {expr {$v**$n}} $x} {0 1 2}]]</lang>
Produces this output (a 3-vector of coefficients):

128.81280358170625 -143.16202286630732 61.96032544293041