LU decomposition

From Rosetta Code
Task
LU decomposition
You are encouraged to solve this task according to the task description, using any language you may know.

Every square matrix can be decomposed into a product of a lower triangular matrix and a upper triangular matrix , as described in LU decomposition.

It is a modified form of Gaussian elimination. While the Cholesky decomposition only works for symmetric, positive definite matrices, the more general LU decomposition works for any square matrix.

There are several algorithms for calculating L and U. To derive Crout's algorithm for a 3x3 example, we have to solve the following system:

We now would have to solve 9 equations with 12 unknowns. To make the system uniquely solvable, usually the diagonal elements of are set to 1

so we get a solvable system of 9 unknowns and 9 equations.

Solving for the other and , we get the following equations:

and for :

We see that there is a calculation pattern, which can be expressed as the following formulas, first for

and then for

We see in the second formula that to get the below the diagonal, we have to divide by the diagonal element (pivot) , so we get problems when is either 0 or very small, which leads to numerical instability.

The solution to this problem is pivoting , which means rearranging the rows of , prior to the decomposition, in a way that the largest element of each column gets onto the diagonal of . Rearranging the rows means to multiply by a permutation matrix :

Example:

The decomposition algorithm is then applied on the rearranged matrix so that


Task description

The task is to implement a routine which will take a square nxn matrix and return a lower triangular matrix , a upper triangular matrix and a permutation matrix , so that the above equation is fulfilled.

You should then test it on the following two examples and include your output.


Example 1
A

1   3   5
2   4   7
1   1   0

L

1.00000   0.00000   0.00000
0.50000   1.00000   0.00000
0.50000  -1.00000   1.00000

U

2.00000   4.00000   7.00000
0.00000   1.00000   1.50000
0.00000   0.00000  -2.00000

P

0   1   0
1   0   0
0   0   1
Example 2
A

11    9   24    2
 1    5    2    6
 3   17   18    1
 2    5    7    1

L

1.00000   0.00000   0.00000   0.00000
0.27273   1.00000   0.00000   0.00000
0.09091   0.28750   1.00000   0.00000
0.18182   0.23125   0.00360   1.00000

U

11.00000    9.00000   24.00000    2.00000
 0.00000   14.54545   11.45455    0.45455
 0.00000    0.00000   -3.47500    5.68750
 0.00000    0.00000    0.00000    0.51079

P

1   0   0   0
0   0   1   0
0   1   0   0
0   0   0   1



11l

Translation of: Python
F pprint(m)
   L(row) m
      print(row)

F matrix_mul(a, b)
   V result = [[0.0] * a.len] * a.len
   L(j) 0 .< a.len
      L(i) 0 .< a.len
         V r = 0.0
         L(k) 0 .< a.len
            r += a[i][k] * b[k][j]
         result[i][j] = r
   R result

F pivotize(m)
   ‘Creates the pivoting matrix for m.’
   V n = m.len
   V ID = (0 .< n).map(j -> (0 .< @n).map(i -> Float(i == @j)))
   L(j) 0 .< n
      V row = max(j .< n, key' i -> abs(@m[i][@j]))
      I j != row
         swap(&ID[j], &ID[row])
   R ID

F lu(A)
   ‘Decomposes a nxn matrix A by PA=lU and returns l, U and P.’
   V n = A.len
   V l = [[0.0] * n] * n
   V U = [[0.0] * n] * n
   V P = pivotize(A)
   V A2 = matrix_mul(P, A)
   L(j) 0 .< n
      l[j][j] = 1.0
      L(i) 0 .. j
         V s1 = sum((0 .< i).map(k -> @U[k][@j] * @l[@i][k]))
         U[i][j] = A2[i][j] - s1
      L(i) j .< n
         V s2 = sum((0 .< j).map(k -> @U[k][@j] * @l[@i][k]))
         l[i][j] = (A2[i][j] - s2) / U[j][j]
   R (l, U, P)

V a = [[1, 3, 5], [2, 4, 7], [1, 1, 0]]
L(part) lu(a)
   pprint(part)
   print()
print()
V b = [[11, 9, 24, 2], [1, 5, 2, 6], [3, 17, 18, 1], [2, 5, 7, 1]]
L(part) lu(b)
   pprint(part)
   print()
Output:
[1, 0, 0]
[0.5, 1, 0]
[0.5, -1, 1]

[2, 4, 7]
[0, 1, 1.5]
[0, 0, -2]

[0, 1, 0]
[1, 0, 0]
[0, 0, 1]


[1, 0, 0, 0]
[0.272727, 1, 0, 0]
[0.0909091, 0.2875, 1, 0]
[0.181818, 0.23125, 0.00359712, 1]

[11, 9, 24, 2]
[0, 14.5455, 11.4545, 0.454545]
[0, 0, -3.475, 5.6875]
[0, 0, 0, 0.510791]

[1, 0, 0, 0]
[0, 0, 1, 0]
[0, 1, 0, 0]
[0, 0, 0, 1]

Ada

Works with: Ada 2005

decomposition.ads:

with Ada.Numerics.Generic_Real_Arrays;
generic
   with package Matrix is new Ada.Numerics.Generic_Real_Arrays (<>);
package Decomposition is

   -- decompose a square matrix A by PA = LU
   procedure Decompose (A : Matrix.Real_Matrix; P, L, U : out Matrix.Real_Matrix);

end Decomposition;

decomposition.adb:

package body Decomposition is

   procedure Swap_Rows (M : in out Matrix.Real_Matrix; From, To : Natural) is
      Temporary : Matrix.Real;
   begin
      if From = To then
         return;
      end if;
      for I in M'Range (2) loop
         Temporary := M (M'First (1) + From, I);
         M (M'First (1) + From, I) := M (M'First (1) + To, I);
         M (M'First (1) + To, I) := Temporary;
      end loop;
   end Swap_Rows;

   function Pivoting_Matrix
     (M : Matrix.Real_Matrix)
      return Matrix.Real_Matrix
   is
      use type Matrix.Real;
      Order     : constant Positive := M'Length (1);
      Result    : Matrix.Real_Matrix := Matrix.Unit_Matrix (Order);
      Max       : Matrix.Real;
      Row       : Natural;
   begin
      for J in 0 .. Order - 1 loop
         Max := M (M'First (1) + J, M'First (2) + J);
         Row := J;
         for I in J .. Order - 1 loop
            if M (M'First (1) + I, M'First (2) + J) > Max then
               Max := M (M'First (1) + I, M'First (2) + J);
               Row := I;
            end if;
         end loop;
         if J /= Row then
            -- swap rows J and Row
            Swap_Rows (Result, J, Row);
         end if;
      end loop;
      return Result;
   end Pivoting_Matrix;

   procedure Decompose (A : Matrix.Real_Matrix; P, L, U : out Matrix.Real_Matrix) is
      use type Matrix.Real_Matrix, Matrix.Real;
      Order : constant Positive := A'Length (1);
      A2 : Matrix.Real_Matrix (A'Range (1), A'Range (2));
      S : Matrix.Real;
   begin
      L := (others => (others => 0.0));
      U := (others => (others => 0.0));
      P := Pivoting_Matrix (A);
      A2 := P * A;
      for J in 0 .. Order - 1 loop
         L (L'First (1) + J, L'First (2) + J) := 1.0;
         for I in 0 .. J loop
            S := 0.0;
            for K in 0 .. I - 1 loop
               S := S + U (U'First (1) + K, U'First (2) + J) *
                 L (L'First (1) + I, L'First (2) + K);
            end loop;
            U (U'First (1) + I, U'First (2) + J) :=
              A2 (A2'First (1) + I, A2'First (2) + J) - S;
         end loop;
         for I in J + 1 .. Order - 1 loop
            S := 0.0;
            for K in 0 .. J loop
               S := S + U (U'First (1) + K, U'First (2) + J) *
                 L (L'First (1) + I, L'First (2) + K);
            end loop;
            L (L'First (1) + I, L'First (2) + J) :=
              (A2 (A2'First (1) + I, A2'First (2) + J) - S) /
              U (U'First (1) + J, U'First (2) + J);
         end loop;
      end loop;
   end Decompose;

end Decomposition;

Example usage:

with Ada.Numerics.Real_Arrays;
with Ada.Text_IO;
with Decomposition;
procedure Decompose_Example is
   package Real_Decomposition is new Decomposition
     (Matrix => Ada.Numerics.Real_Arrays);

   package Real_IO is new Ada.Text_IO.Float_IO (Float);
   
   procedure Print (M : Ada.Numerics.Real_Arrays.Real_Matrix) is
   begin
      for Row in M'Range (1) loop
         for Col in M'Range (2) loop
            Real_IO.Put (M (Row, Col), 3, 2, 0);
         end loop;
         Ada.Text_IO.New_Line;
      end loop;
   end Print;

   Example_1 : constant Ada.Numerics.Real_Arrays.Real_Matrix :=
     ((1.0, 3.0, 5.0),
      (2.0, 4.0, 7.0),
      (1.0, 1.0, 0.0));
   P_1, L_1, U_1 : Ada.Numerics.Real_Arrays.Real_Matrix (Example_1'Range (1),
                                                         Example_1'Range (2));
   Example_2 : constant Ada.Numerics.Real_Arrays.Real_Matrix :=
     ((11.0, 9.0, 24.0, 2.0),
      (1.0, 5.0, 2.0, 6.0),
      (3.0, 17.0, 18.0, 1.0),
      (2.0, 5.0, 7.0, 1.0));
   P_2, L_2, U_2 : Ada.Numerics.Real_Arrays.Real_Matrix (Example_2'Range (1),
                                                         Example_2'Range (2));
begin
   Real_Decomposition.Decompose (A => Example_1,
                                 P => P_1,
                                 L => L_1,
                                 U => U_1);
   Real_Decomposition.Decompose (A => Example_2,
                                 P => P_2,
                                 L => L_2,
                                 U => U_2);
   Ada.Text_IO.Put_Line ("Example 1:");
   Ada.Text_IO.Put_Line ("A:"); Print (Example_1);
   Ada.Text_IO.Put_Line ("L:"); Print (L_1);
   Ada.Text_IO.Put_Line ("U:"); Print (U_1);
   Ada.Text_IO.Put_Line ("P:"); Print (P_1);
   Ada.Text_IO.New_Line;
   Ada.Text_IO.Put_Line ("Example 2:");
   Ada.Text_IO.Put_Line ("A:"); Print (Example_2);
   Ada.Text_IO.Put_Line ("L:"); Print (L_2);
   Ada.Text_IO.Put_Line ("U:"); Print (U_2);
   Ada.Text_IO.Put_Line ("P:"); Print (P_2);
end Decompose_Example;
Output:
Example 1:
A:
  1.00  3.00  5.00
  2.00  4.00  7.00
  1.00  1.00  0.00
L:
  1.00  0.00  0.00
  0.50  1.00  0.00
  0.50 -1.00  1.00
U:
  2.00  4.00  7.00
  0.00  1.00  1.50
  0.00  0.00 -2.00
P:
  0.00  1.00  0.00
  1.00  0.00  0.00
  0.00  0.00  1.00

Example 2:
A:
 11.00  9.00 24.00  2.00
  1.00  5.00  2.00  6.00
  3.00 17.00 18.00  1.00
  2.00  5.00  7.00  1.00
L:
  1.00  0.00  0.00  0.00
  0.27  1.00  0.00  0.00
  0.09  0.29  1.00  0.00
  0.18  0.23  0.00  1.00
U:
 11.00  9.00 24.00  2.00
  0.00 14.55 11.45  0.45
  0.00  0.00 -3.47  5.69
  0.00  0.00  0.00  0.51
P:
  1.00  0.00  0.00  0.00
  0.00  0.00  1.00  0.00
  0.00  1.00  0.00  0.00
  0.00  0.00  0.00  1.00

ATS

(* There is a "little matrix library" included below. Not all of it is
   used, though unused parts may prove useful for playing with the
   code.

   One might, by the way, find interesting how I get the P matrix from
   a permutation vector. *)

%{^
#include <math.h>
#include <float.h>
%}

#include "share/atspre_staload.hats"

macdef NAN = g0f2f ($extval (float, "NAN"))
macdef Zero = g0i2f 0
macdef One = g0i2f 1

(* You can substitute an "fma" function for this definition: *)
macdef multiply_and_add (x, y, z) = (,(x) * ,(y)) + ,(z)

exception Exc_degenerate_problem of string

(*------------------------------------------------------------------*)
(* A "little matrix library"                                        *)

typedef Matrix_Index_Map (m1 : int, n1 : int, m0 : int, n0 : int) =
  {i1, j1 : pos | i1 <= m1; j1 <= n1}
  (int i1, int j1) -<cloref0>
  [i0, j0 : pos | i0 <= m0; j0 <= n0]
    @(int i0, int j0)

datatype Real_Matrix (tk : tkind,
                      m1 : int, n1 : int,
                      m0 : int, n0 : int) =
| Real_Matrix of (matrixref (g0float tk, m0, n0),
                  int m1, int n1, int m0, int n0,
                  Matrix_Index_Map (m1, n1, m0, n0))
typedef Real_Matrix (tk : tkind, m1 : int, n1 : int) =
  [m0, n0 : pos] Real_Matrix (tk, m1, n1, m0, n0)
typedef Real_Vector (tk : tkind, m1 : int, n1 : int) =
  [m1 == 1 || n1 == 1] Real_Matrix (tk, m1, n1)
typedef Real_Row (tk : tkind, n1 : int) = Real_Vector (tk, 1, n1)
typedef Real_Column (tk : tkind, m1 : int) = Real_Vector (tk, m1, 1)

extern fn {tk : tkind}
Real_Matrix_make_elt :
  {m0, n0 : pos}
  (int m0, int n0, g0float tk) -< !wrt >
    Real_Matrix (tk, m0, n0, m0, n0)

extern fn {tk : tkind}
Real_Matrix_copy :
  {m1, n1 : pos}
  Real_Matrix (tk, m1, n1) -< !refwrt > Real_Matrix (tk, m1, n1)

extern fn {tk : tkind}
Real_Matrix_copy_to :
  {m1, n1 : pos}
  (Real_Matrix (tk, m1, n1),    (* destination *)
   Real_Matrix (tk, m1, n1)) -< !refwrt >
    void

extern fn {tk : tkind}
Real_Matrix_fill_with_elt :
  {m1, n1 : pos}
  (Real_Matrix (tk, m1, n1), g0float tk) -< !refwrt > void

extern fn {}
Real_Matrix_dimension :
  {tk : tkind}
  {m1, n1 : pos}
  Real_Matrix (tk, m1, n1) -<> @(int m1, int n1)

extern fn {tk : tkind}
Real_Matrix_get_at :
  {m1, n1 : pos}
  {i1, j1 : pos | i1 <= m1; j1 <= n1}
  (Real_Matrix (tk, m1, n1), int i1, int j1) -< !ref > g0float tk

extern fn {tk : tkind}
Real_Matrix_set_at :
  {m1, n1 : pos}
  {i1, j1 : pos | i1 <= m1; j1 <= n1}
  (Real_Matrix (tk, m1, n1), int i1, int j1, g0float tk) -< !refwrt >
    void

extern fn {}
Real_Matrix_apply_index_map :
  {tk : tkind}
  {m1, n1 : pos}
  {m0, n0 : pos}
  (Real_Matrix (tk, m0, n0), int m1, int n1,
   Matrix_Index_Map (m1, n1, m0, n0)) -<>
    Real_Matrix (tk, m1, n1)

extern fn {}
Real_Matrix_transpose :
  (* This is transposed INDEXING. It does NOT copy the data. *)
  {tk : tkind}
  {m1, n1 : pos}
  {m0, n0 : pos}
  Real_Matrix (tk, m1, n1, m0, n0) -<>
    Real_Matrix (tk, n1, m1, m0, n0)

extern fn {}
Real_Matrix_block :
  (* This is block (submatrix) INDEXING. It does NOT copy the data. *)
  {tk : tkind}
  {p0, p1 : pos | p0 <= p1}
  {q0, q1 : pos | q0 <= q1}
  {m1, n1 : pos | p1 <= m1; q1 <= n1}
  {m0, n0 : pos}
  (Real_Matrix (tk, m1, n1, m0, n0),
   int p0, int p1, int q0, int q1) -<>
    Real_Matrix (tk, p1 - p0 + 1, q1 - q0 + 1, m0, n0)

extern fn {tk : tkind}
Real_Matrix_unit_matrix :
  {m : pos}
  int m -< !refwrt > Real_Matrix (tk, m, m)

extern fn {tk : tkind}
Real_Matrix_unit_matrix_to :
  {m : pos}
  Real_Matrix (tk, m, m) -< !refwrt > void

extern fn {tk : tkind}
Real_Matrix_matrix_sum :
  {m, n : pos}
  (Real_Matrix (tk, m, n), Real_Matrix (tk, m, n)) -< !refwrt >
    Real_Matrix (tk, m, n)

extern fn {tk : tkind}
Real_Matrix_matrix_sum_to :
  {m, n : pos}
  (Real_Matrix (tk, m, n),      (* destination*)
   Real_Matrix (tk, m, n),
   Real_Matrix (tk, m, n)) -< !refwrt >
    void

extern fn {tk : tkind}
Real_Matrix_matrix_difference :
  {m, n : pos}
  (Real_Matrix (tk, m, n), Real_Matrix (tk, m, n)) -< !refwrt >
    Real_Matrix (tk, m, n)

extern fn {tk : tkind}
Real_Matrix_matrix_difference_to :
  {m, n : pos}
  (Real_Matrix (tk, m, n),      (* destination*)
   Real_Matrix (tk, m, n),
   Real_Matrix (tk, m, n)) -< !refwrt >
    void

extern fn {tk : tkind}
Real_Matrix_matrix_product :
  {m, n, p : pos}
  (Real_Matrix (tk, m, n), Real_Matrix (tk, n, p)) -< !refwrt >
    Real_Matrix (tk, m, p)

extern fn {tk : tkind}
Real_Matrix_matrix_product_to :
  (* For the matrix product, the destination should not be the same as
     either of the other matrices. *)
  {m, n, p : pos}
  (Real_Matrix (tk, m, p),      (* destination*)
   Real_Matrix (tk, m, n),
   Real_Matrix (tk, n, p)) -< !refwrt >
    void

extern fn {tk : tkind}
Real_Matrix_scalar_product :
  {m, n : pos}
  (Real_Matrix (tk, m, n), g0float tk) -< !refwrt >
    Real_Matrix (tk, m, n)

extern fn {tk : tkind}
Real_Matrix_scalar_product_2 :
  {m, n : pos}
  (g0float tk, Real_Matrix (tk, m, n)) -< !refwrt >
    Real_Matrix (tk, m, n)

extern fn {tk : tkind}
Real_Matrix_scalar_product :
  {m, n : pos}
  (Real_Matrix (tk, m, n), g0float tk) -< !refwrt >
    Real_Matrix (tk, m, n)

extern fn {tk : tkind}
Real_Matrix_scalar_product_2 :
  {m, n : pos}
  (g0float tk, Real_Matrix (tk, m, n)) -< !refwrt >
    Real_Matrix (tk, m, n)

extern fn {tk : tkind}
Real_Matrix_scalar_product_to :
  {m, n : pos}
  (Real_Matrix (tk, m, n),      (* destination*)
   Real_Matrix (tk, m, n),
   g0float tk) -< !refwrt >
    void

extern fn {tk : tkind}          (* Useful for debugging. *)
Real_Matrix_fprint :
  {m, n : pos}
  (FILEref, Real_Matrix (tk, m, n)) -<1> void

overload copy with Real_Matrix_copy
overload copy_to with Real_Matrix_copy_to
overload fill_with_elt with Real_Matrix_fill_with_elt
overload dimension with Real_Matrix_dimension
overload [] with Real_Matrix_get_at
overload [] with Real_Matrix_set_at
overload apply_index_map with Real_Matrix_apply_index_map
overload transpose with Real_Matrix_transpose
overload block with Real_Matrix_block
overload unit_matrix with Real_Matrix_unit_matrix
overload unit_matrix_to with Real_Matrix_unit_matrix_to
overload matrix_sum with Real_Matrix_matrix_sum
overload matrix_sum_to with Real_Matrix_matrix_sum_to
overload matrix_difference with Real_Matrix_matrix_difference
overload matrix_difference_to with Real_Matrix_matrix_difference_to
overload matrix_product with Real_Matrix_matrix_product
overload matrix_product_to with Real_Matrix_matrix_product_to
overload scalar_product with Real_Matrix_scalar_product
overload scalar_product with Real_Matrix_scalar_product_2
overload scalar_product_to with Real_Matrix_scalar_product_to
overload + with matrix_sum
overload - with matrix_difference
overload * with matrix_product
overload * with scalar_product

(*------------------------------------------------------------------*)
(* Implementation of the "little matrix library"                    *)

implement {tk}
Real_Matrix_make_elt (m0, n0, elt) =
  Real_Matrix (matrixref_make_elt<g0float tk> (i2sz m0, i2sz n0, elt),
               m0, n0, m0, n0, lam (i1, j1) => @(i1, j1))

implement {}
Real_Matrix_dimension A =
  case+ A of Real_Matrix (_, m1, n1, _, _, _) => @(m1, n1)

implement {tk}
Real_Matrix_get_at (A, i1, j1) =
  let
    val+ Real_Matrix (storage, _, _, _, n0, index_map) = A
    val @(i0, j0) = index_map (i1, j1)
  in
    matrixref_get_at<g0float tk> (storage, pred i0, n0, pred j0)
  end

implement {tk}
Real_Matrix_set_at (A, i1, j1, x) =
  let
    val+ Real_Matrix (storage, _, _, _, n0, index_map) = A
    val @(i0, j0) = index_map (i1, j1)
  in
    matrixref_set_at<g0float tk> (storage, pred i0, n0, pred j0, x)
  end

implement {}
Real_Matrix_apply_index_map (A, m1, n1, index_map) =
  (* This is not the most efficient way to acquire new indexing, but
     it will work. It requires three closures, instead of the two
     needed by our implementations of "transpose" and "block". *)
  let
    val+ Real_Matrix (storage, m1a, n1a, m0, n0, index_map_1a) = A
  in
    Real_Matrix (storage, m1, n1, m0, n0,
                 lam (i1, j1) =>
                   index_map_1a (i1a, j1a) where
                     { val @(i1a, j1a) = index_map (i1, j1) })
  end

implement {}
Real_Matrix_transpose A =
  let
    val+ Real_Matrix (storage, m1, n1, m0, n0, index_map) = A
  in
    Real_Matrix (storage, n1, m1, m0, n0,
                 lam (i1, j1) => index_map (j1, i1))
  end

implement {}
Real_Matrix_block (A, p0, p1, q0, q1) =
  let
    val+ Real_Matrix (storage, m1, n1, m0, n0, index_map) = A
  in
    Real_Matrix (storage, succ (p1 - p0), succ (q1 - q0), m0, n0,
                 lam (i1, j1) =>
                  index_map (p0 + pred i1, q0 + pred j1))
  end

implement {tk}
Real_Matrix_copy A =
  let
    val @(m1, n1) = dimension A
    val C = Real_Matrix_make_elt<tk> (m1, n1, A[1, 1])
    val () = copy_to<tk> (C, A)
  in
    C
  end

implement {tk}
Real_Matrix_copy_to (Dst, Src) =
  let
    val @(m1, n1) = dimension Src
    prval [m1 : int] EQINT () = eqint_make_gint m1
    prval [n1 : int] EQINT () = eqint_make_gint n1

    var i : intGte 1
  in
    for* {i : pos | i <= m1 + 1} .<(m1 + 1) - i>.
         (i : int i) =>
      (i := 1; i <> succ m1; i := succ i)
        let
          var j : intGte 1
        in
          for* {j : pos | j <= n1 + 1} .<(n1 + 1) - j>.
               (j : int j) =>
            (j := 1; j <> succ n1; j := succ j)
              Dst[i, j] := Src[i, j]
        end
  end

implement {tk}
Real_Matrix_fill_with_elt (A, elt) =
  let
    val @(m1, n1) = dimension A
    prval [m1 : int] EQINT () = eqint_make_gint m1
    prval [n1 : int] EQINT () = eqint_make_gint n1

    var i : intGte 1
  in
    for* {i : pos | i <= m1 + 1} .<(m1 + 1) - i>.
         (i : int i) =>
      (i := 1; i <> succ m1; i := succ i)
        let
          var j : intGte 1
        in
          for* {j : pos | j <= n1 + 1} .<(n1 + 1) - j>.
               (j : int j) =>
            (j := 1; j <> succ n1; j := succ j)
              A[i, j] := elt
        end
  end

implement {tk}
Real_Matrix_unit_matrix {m} m =
  let
    val A = Real_Matrix_make_elt<tk> (m, m, Zero)
    var i : intGte 1
  in
    for* {i : pos | i <= m + 1} .<(m + 1) - i>.
         (i : int i) =>
      (i := 1; i <> succ m; i := succ i)
        A[i, i] := One;
    A
  end

implement {tk}
Real_Matrix_unit_matrix_to A =
  let
    val @(m, _) = dimension A
    prval [m : int] EQINT () = eqint_make_gint m

    var i : intGte 1
  in
    for* {i : pos | i <= m + 1} .<(m + 1) - i>.
         (i : int i) =>
      (i := 1; i <> succ m; i := succ i)
        let
          var j : intGte 1
        in
          for* {j : pos | j <= m + 1} .<(m + 1) - j>.
               (j : int j) =>
               (j := 1; j <> succ m; j := succ j)
            A[i, j] := (if i = j then One else Zero)
        end
  end

implement {tk}
Real_Matrix_matrix_sum (A, B) =
  let
    val @(m, n) = dimension A
    val C = Real_Matrix_make_elt<tk> (m, n, NAN)
    val () = matrix_sum_to<tk> (C, A, B)
  in
    C
  end

implement {tk}
Real_Matrix_matrix_sum_to (C, A, B) =
  let
    val @(m, n) = dimension A
    prval [m : int] EQINT () = eqint_make_gint m
    prval [n : int] EQINT () = eqint_make_gint n

    var i : intGte 1
  in
    for* {i : pos | i <= m + 1} .<(m + 1) - i>.
         (i : int i) =>
      (i := 1; i <> succ m; i := succ i)
        let
          var j : intGte 1
        in
          for* {j : pos | j <= n + 1} .<(n + 1) - j>.
               (j : int j) =>
            (j := 1; j <> succ n; j := succ j)
              C[i, j] := A[i, j] + B[i, j]
        end
  end

implement {tk}
Real_Matrix_matrix_difference (A, B) =
  let
    val @(m, n) = dimension A
    val C = Real_Matrix_make_elt<tk> (m, n, NAN)
    val () = matrix_difference_to<tk> (C, A, B)
  in
    C
  end

implement {tk}
Real_Matrix_matrix_difference_to (C, A, B) =
  let
    val @(m, n) = dimension A
    prval [m : int] EQINT () = eqint_make_gint m
    prval [n : int] EQINT () = eqint_make_gint n

    var i : intGte 1
  in
    for* {i : pos | i <= m + 1} .<(m + 1) - i>.
         (i : int i) =>
      (i := 1; i <> succ m; i := succ i)
        let
          var j : intGte 1
        in
          for* {j : pos | j <= n + 1} .<(n + 1) - j>.
               (j : int j) =>
            (j := 1; j <> succ n; j := succ j)
              C[i, j] := A[i, j] - B[i, j]
        end
  end

implement {tk}
Real_Matrix_matrix_product (A, B) =
  let
    val @(m, n) = dimension A and @(_, p) = dimension B
    val C = Real_Matrix_make_elt<tk> (m, p, NAN)
    val () = matrix_product_to<tk> (C, A, B)
  in
    C
  end

implement {tk}
Real_Matrix_matrix_product_to (C, A, B) =
  let
    val @(m, n) = dimension A and @(_, p) = dimension B
    prval [m : int] EQINT () = eqint_make_gint m
    prval [n : int] EQINT () = eqint_make_gint n
    prval [p : int] EQINT () = eqint_make_gint p

    var i : intGte 1
  in
    for* {i : pos | i <= m + 1} .<(m + 1) - i>.
         (i : int i) =>
      (i := 1; i <> succ m; i := succ i)
        let
          var k : intGte 1
        in
          for* {k : pos | k <= p + 1} .<(p + 1) - k>.
               (k : int k) =>
            (k := 1; k <> succ p; k := succ k)
              let
                var j : intGte 1
              in
                C[i, k] := A[i, 1] * B[1, k];
                for* {j : pos | j <= n + 1} .<(n + 1) - j>.
                     (j : int j) =>
                  (j := 2; j <> succ n; j := succ j)
                    C[i, k] :=
                      multiply_and_add (A[i, j], B[j, k], C[i, k])
              end
        end
  end

implement {tk}
Real_Matrix_scalar_product (A, r) =
  let
    val @(m, n) = dimension A
    val C = Real_Matrix_make_elt<tk> (m, n, NAN)
    val () = scalar_product_to<tk> (C, A, r)
  in
    C
  end

implement {tk}
Real_Matrix_scalar_product_2 (r, A) =
  Real_Matrix_scalar_product<tk> (A, r)

implement {tk}
Real_Matrix_scalar_product_to (C, A, r) =
  let
    val @(m, n) = dimension A
    prval [m : int] EQINT () = eqint_make_gint m
    prval [n : int] EQINT () = eqint_make_gint n

    var i : intGte 1
  in
    for* {i : pos | i <= m + 1} .<(m + 1) - i>.
         (i : int i) =>
      (i := 1; i <> succ m; i := succ i)
        let
          var j : intGte 1
        in
          for* {j : pos | j <= n + 1} .<(n + 1) - j>.
               (j : int j) =>
            (j := 1; j <> succ n; j := succ j)
              C[i, j] := A[i, j] * r
        end
  end

implement {tk}
Real_Matrix_fprint {m, n} (outf, A) =
  let
    val @(m, n) = dimension A
    var i : intGte 1
  in
    for* {i : pos | i <= m + 1} .<(m + 1) - i>.
         (i : int i) =>
      (i := 1; i <> succ m; i := succ i)
        let
          var j : intGte 1
        in
          for* {j : pos | j <= n + 1} .<(n + 1) - j>.
               (j : int j) =>
            (j := 1; j <> succ n; j := succ j)
              let
                typedef FILEstar = $extype"FILE *"
                extern castfn FILEref2star : FILEref -<> FILEstar
                val _ = $extfcall (int, "fprintf", FILEref2star outf,
                                   "%12.6lf", A[i, j])
              in
              end;
          fprintln! (outf)
        end
  end

(*------------------------------------------------------------------*)
(* LUP decomposition. Based on
   https://en.wikipedia.org/w/index.php?title=LU_decomposition&oldid=1146366204#C_code_example
   *)

extern fn {tk : tkind}
Real_Matrix_LUP_decomposition :
  {n : pos}
  (Real_Matrix (tk, n, n),
   g0float tk (* tolerance *) ) -< !exnrefwrt >
    @(Real_Matrix (tk, n, n),
      Real_Matrix (tk, n, n),
      Real_Matrix (tk, n, n))

overload LUP_decomposition with Real_Matrix_LUP_decomposition

implement {tk}
Real_Matrix_LUP_decomposition {n} (A, tol) =
  let
    val @(n, _) = dimension A
    typedef one_to_n = intBtwe (1, n)

    (* The initial permutation is [1,2,3,...,n]. *)
    implement
    array_tabulate$fopr<one_to_n> i =
      let
        val i = g1ofg0 (sz2i (succ i))
        val () = assertloc ((1 <= i) * (i <= n))
      in
        i
      end
    val permutation =
      $effmask_all arrayref_tabulate<one_to_n> (i2sz n)
    fn
    index_map : Matrix_Index_Map (n, n, n, n) =
      lam (i1, j1) => $effmask_ref
        (@(i0, j1) where { val i0 = permutation[i1 - 1] })

    val A = apply_index_map (copy<tk> A, n, n, index_map)

    fun
    select_pivot {i, k : pos | i <= k; k <= n + 1}
                 .<(n + 1) - k>.
                 (i         : int i,
                  k         : int k,
                  max_abs   : g0float tk,
                  k_max_abs : intBtwe (i, n))
        :<!ref> @(g0float tk, intBtwe (i, n)) =
      if k = succ n then
        @(max_abs, k_max_abs)
      else
        let
          val absval = A[k, i]
        in
          if absval > max_abs then
            select_pivot (i, succ k, absval, k)
          else
            select_pivot (i, succ k, max_abs, k_max_abs)
        end                  

    fn {}
    exchange_rows (i1 : one_to_n,
                   i2 : one_to_n) :<!refwrt> void =
      if i1 <> i2 then
        let
          val k1 = permutation[pred i1]
          and k2 = permutation[pred i2]
        in
          permutation[pred i1] := k2;
          permutation[pred i2] := k1
        end

    val () =
      let
        var i : Int
      in
        for* {i : pos | i <= n + 1} .<(n + 1) - i>.
             (i : int i) =>
          (i := 1; i <> succ n; i := succ i)
            let
              val @(maxabs, i_pivot) = select_pivot (i, i, Zero, i)
              prval [i_pivot : int] EQINT () = eqint_make_gint i_pivot
              var j : Int
            in
              if maxabs < tol then
                $raise Exc_degenerate_problem
                  ("Real_Matrix_LUP_decomposition");
              exchange_rows (i_pivot, i);
              for* {j : int | i + 1 <= j; j <= n + 1}
                   .<(n + 1) - j>.
                   (j : int j) =>
                (j := succ i; j <> succ n; j := succ j)
                  let
                    var k : Int
                  in
                    A[j, i] := A[j, i] / A[i, i];
                    for* {k : int | i + 1 <= k; k <= n + 1}
                         .<(n + 1) - k>.
                         (k : int k) =>
                      (k := succ i; k <> succ n; k := succ k)
                        A[j, k] :=
                          multiply_and_add
                            (~A[j, i],  A[i, k], A[j, k])
                  end
            end
      end

    val U = A
    val L = Real_Matrix_unit_matrix<tk> n
    val () =
      let
        var i : Int
      in
        for* {i : int | 2 <= i; i <= n + 1} .<(n + 1) - i>.
             (i : int i) =>
          (i := 2; i <> succ n; i := succ i)
            let
              var j : Int
            in
              for* {j : pos | j <= i} .<i - j>.
                   (j : int j) =>
                (j := 1; j <> i; j := succ j)
                  begin
                    L[i, j] := U[i, j];
                    U[i, j] := Zero
                  end
            end
      end
    val P = apply_index_map (Real_Matrix_unit_matrix<tk> n,
                             n, n, index_map)
  in
    @(L, U, P)
  end

(*------------------------------------------------------------------*)

implement
main0 () =
  (* I use tolerances of zero, secure in the knowledge that IEEE
     floating point will not crash the program just because a matrix
     was singular. :) *)
  let
    val A = Real_Matrix_make_elt<dblknd> (3, 3, NAN)
    val () =
      (A[1, 1] := 1.0;  A[1, 2] := 3.0;  A[1, 3] := 5.0;
       A[2, 1] := 2.0;  A[2, 2] := 4.0;  A[2, 3] := 7.0;
       A[3, 1] := 1.0;  A[3, 2] := 1.0;  A[3, 3] := 0.0)
    val @(L, U, P) = LUP_decomposition (A, 0.0)
    val () = println! "A"
    val () = Real_Matrix_fprint (stdout_ref, A)
    val () = println! "L"
    val () = Real_Matrix_fprint (stdout_ref, L)
    val () = println! "U"
    val () = Real_Matrix_fprint (stdout_ref, U)
    val () = println! "P"
    val () = Real_Matrix_fprint (stdout_ref, P)
    val () = println! "PA - LU"
    val () = Real_Matrix_fprint (stdout_ref, P * A - L * U)

    val () = println! "\n------------------------------------------\n"

    val A = Real_Matrix_make_elt<dblknd> (4, 4, NAN)
    val () =
      (A[1, 1] := 11.0;  A[1, 2] := 9.0;  A[1, 3] := 24.0;  A[1, 4] := 2.0;
       A[2, 1] := 1.0;  A[2, 2] := 5.0;  A[2, 3] := 2.0;  A[2, 4] := 6.0;
       A[3, 1] := 3.0;  A[3, 2] := 17.0;  A[3, 3] := 18.0;  A[3, 4] := 1.0;
       A[4, 1] := 2.0;  A[4, 2] := 5.0;  A[4, 3] := 7.0;  A[4, 4] := 1.0)
    val @(L, U, P) = LUP_decomposition (A, 0.0)
    val () = println! "A"
    val () = Real_Matrix_fprint (stdout_ref, A)
    val () = println! "L"
    val () = Real_Matrix_fprint (stdout_ref, L)
    val () = println! "U"
    val () = Real_Matrix_fprint (stdout_ref, U)
    val () = println! "P"
    val () = Real_Matrix_fprint (stdout_ref, P)
    val () = println! "PA - LU"
    val () = Real_Matrix_fprint (stdout_ref, P * A - L * U)

    val () = println! "\n------------------------------------------\n"

    val () = println! ("I have added an example having an asymmetric",
                       " permutation\nmatrix, ",
                       "because I needed one for testing:")
    val () = println! ()

    val A = Real_Matrix_make_elt<dblknd> (4, 4, NAN)
    val () =
      (A[1, 1] := 11.0;  A[1, 2] := 9.0;  A[1, 3] := 24.0;  A[1, 4] := 2.0;
       A[2, 1] := 1.0;  A[2, 2] := 5.0;  A[2, 3] := 2.0;  A[2, 4] := 6.0;
       A[3, 1] := 3.0;  A[3, 2] := 175.0;  A[3, 3] := 18.0;  A[3, 4] := 1.0;
       A[4, 1] := 2.0;  A[4, 2] := 5.0;  A[4, 3] := 7.0;  A[4, 4] := 1.0)
    val @(L, U, P) = LUP_decomposition (A, 0.0)
    val () = println! "A"
    val () = Real_Matrix_fprint (stdout_ref, A)
    val () = println! "L"
    val () = Real_Matrix_fprint (stdout_ref, L)
    val () = println! "U"
    val () = Real_Matrix_fprint (stdout_ref, U)
    val () = println! "P"
    val () = Real_Matrix_fprint (stdout_ref, P)
    val () = println! "PA - LU"
    val () = Real_Matrix_fprint (stdout_ref, P * A - L * U)
  in
  end

(*------------------------------------------------------------------*)
Output:
$ patscc -std=gnu2x -g -O2 -march=native -DATS_MEMALLOC_GCBDW lu_decomposition_task.dats -lgc && ./a.out
A
    1.000000    3.000000    5.000000
    2.000000    4.000000    7.000000
    1.000000    1.000000    0.000000
L
    1.000000    0.000000    0.000000
    0.500000    1.000000    0.000000
    0.500000   -1.000000    1.000000
U
    2.000000    4.000000    7.000000
    0.000000    1.000000    1.500000
    0.000000    0.000000   -2.000000
P
    0.000000    1.000000    0.000000
    1.000000    0.000000    0.000000
    0.000000    0.000000    1.000000
PA - LU
    0.000000    0.000000    0.000000
    0.000000    0.000000    0.000000
    0.000000    0.000000    0.000000

------------------------------------------

A
   11.000000    9.000000   24.000000    2.000000
    1.000000    5.000000    2.000000    6.000000
    3.000000   17.000000   18.000000    1.000000
    2.000000    5.000000    7.000000    1.000000
L
    1.000000    0.000000    0.000000    0.000000
    0.272727    1.000000    0.000000    0.000000
    0.090909    0.287500    1.000000    0.000000
    0.181818    0.231250    0.003597    1.000000
U
   11.000000    9.000000   24.000000    2.000000
    0.000000   14.545455   11.454545    0.454545
    0.000000    0.000000   -3.475000    5.687500
    0.000000    0.000000    0.000000    0.510791
P
    1.000000    0.000000    0.000000    0.000000
    0.000000    0.000000    1.000000    0.000000
    0.000000    1.000000    0.000000    0.000000
    0.000000    0.000000    0.000000    1.000000
PA - LU
    0.000000    0.000000    0.000000    0.000000
    0.000000    0.000000    0.000000    0.000000
    0.000000    0.000000    0.000000    0.000000
    0.000000    0.000000    0.000000    0.000000

------------------------------------------

I have added an example having an asymmetric permutation
matrix, because I needed one for testing:

A
   11.000000    9.000000   24.000000    2.000000
    1.000000    5.000000    2.000000    6.000000
    3.000000  175.000000   18.000000    1.000000
    2.000000    5.000000    7.000000    1.000000
L
    1.000000    0.000000    0.000000    0.000000
    0.272727    1.000000    0.000000    0.000000
    0.181818    0.019494    1.000000    0.000000
    0.090909    0.024236   -0.190393    1.000000
U
   11.000000    9.000000   24.000000    2.000000
    0.000000  172.545455   11.454545    0.454545
    0.000000    0.000000    2.413066    0.627503
    0.000000    0.000000    0.000000    5.926638
P
    1.000000    0.000000    0.000000    0.000000
    0.000000    0.000000    1.000000    0.000000
    0.000000    0.000000    0.000000    1.000000
    0.000000    1.000000    0.000000    0.000000
PA - LU
    0.000000    0.000000    0.000000    0.000000
    0.000000    0.000000    0.000000    0.000000
    0.000000    0.000000    0.000000    0.000000
    0.000000    0.000000    0.000000    0.000000

AutoHotkey

;--------------------------
LU_decomposition(A){
    P := Pivot(A)
    A_ := Multiply_Matrix(P, A)

    U := [], L := [], n := A_.Count()
    loop % n {
        i := A_Index
        loop % n {
            j := A_Index
            
            Sigma := 0, k := 1
            while (k <= i-1)
                Sigma += (U[k, j] * L[i, k]), k++
            U[i, j] := A_[i, j] - Sigma
            
            Sigma := 0, k := 1
            while (k <= j-1)
                Sigma += (U[k, j] * L[i, k]), k++
            L[i, j] := (A_[i, j] - Sigma) / U[j, j]
        }
    }
    return [L, U, P]
}
;--------------------------
Pivot(M){
    n := M.Count(), P := [], i := 0
    while (i++ < n){
        P.push([]) 
        j := 0
        while (j++ < n)
            P[i].push(i=j ? 1 : 0)
    }
    i := 0  
    while (i++ < n){
        maxm := M[i, i], row := i, j := i
        while (j++ < n)
            if (M[j, i] > maxm)
                maxm := M[j, i], row := j 
        if (i != row)
            tmp := P[i], P[i] := P[row], P[row] := tmp 
    }
    return P
}
;--------------------------
Multiply_Matrix(A,B){
    if (A[1].Count() <> B.Count())
        return
    RCols := A[1].Count()>B[1].Count()?A[1].Count():B[1].Count()
    RRows := A.Count()>B.Count()?A.Count():B.Count(),     R := []
    Loop, % RRows {
        RRow:=A_Index
        loop, % RCols {
            RCol:=A_Index,            v := 0
            loop % A[1].Count()
                col := A_Index,        v += A[RRow, col] * B[col,RCol]
            R[RRow,RCol] := v
        }
    }
    return R
}
;--------------------------
ShowMatrix(L, f:=3){
    for r, obj in L{
        row := ""
        for c, v in obj
            row .= Format("{:." f "f}", v) ", "
        output .= "[" trim(row, ", ") "]`n,"
    }
    return "[" Trim(output, "`n,") "]"
}
;--------------------------
Examples:
A1 := [[1, 3, 5]
    , [2, 4, 7]
    , [1, 1, 0]]

A2 := [[11, 9, 24, 2]
    ,[1, 5, 2, 6]
    ,[3, 17, 18, 1]
    ,[2, 5, 7, 1]]

loop 2 {
    L := LU_Decomposition(A%A_Index%)
    result .= ""
    . "A:=`n" ShowMatrix(A%A_Index%, 4)
    . "`n`nL:=`n" ShowMatrix(L.1)
    . "`n`nU:=`n" ShowMatrix(L.2)
    . "`n`nP:=`n" ShowMatrix(L.3)
    . "`n--------------------------------`n"
}
MsgBox, 262144, , % result
return
Output:
A:=
[[1.0000, 3.0000, 5.0000]
,[2.0000, 4.0000, 7.0000]
,[1.0000, 1.0000, 0.0000]]

L:=
[[1.000, 0.000, 0.000]
,[0.500, 1.000, 0.000]
,[0.500, -1.000, 1.000]]

U:=
[[2.000, 4.000, 7.000]
,[0.000, 1.000, 1.500]
,[0.000, 0.000, -2.000]]

P:=
[[0.000, 1.000, 0.000]
,[1.000, 0.000, 0.000]
,[0.000, 0.000, 1.000]]
--------------------------------
A:=
[[11.0000, 9.0000, 24.0000, 2.0000]
,[1.0000, 5.0000, 2.0000, 6.0000]
,[3.0000, 17.0000, 18.0000, 1.0000]
,[2.0000, 5.0000, 7.0000, 1.0000]]

L:=
[[1.000, 0.000, 0.000, 0.000]
,[0.273, 1.000, 0.000, 0.000]
,[0.091, 0.287, 1.000, 0.000]
,[0.182, 0.231, 0.004, 1.000]]

U:=
[[11.000, 9.000, 24.000, 2.000]
,[0.000, 14.545, 11.455, 0.455]
,[0.000, 0.000, -3.475, 5.688]
,[0.000, 0.000, 0.000, 0.511]]

P:=
[[1.000, 0.000, 0.000, 0.000]
,[0.000, 0.000, 1.000, 0.000]
,[0.000, 1.000, 0.000, 0.000]
,[0.000, 0.000, 0.000, 1.000]]
--------------------------------

BBC BASIC

      DIM A1(2,2)
      A1() = 1, 3, 5, 2, 4, 7, 1, 1, 0
      PROCLUdecomposition(A1(), L1(), U1(), P1())
      PRINT "L1:" ' FNshowmatrix(L1())
      PRINT "U1:" ' FNshowmatrix(U1())
      PRINT "P1:" ' FNshowmatrix(P1())
      
      DIM A2(3,3)
      A2() = 11, 9, 24, 2, 1, 5, 2, 6, 3, 17, 18, 1, 2, 5, 7, 1
      PROCLUdecomposition(A2(), L2(), U2(), P2())
      PRINT "L2:" ' FNshowmatrix(L2())
      PRINT "U2:" ' FNshowmatrix(U2())
      PRINT "P2:" ' FNshowmatrix(P2())
      END
      
      DEF PROCLUdecomposition(a(), RETURN l(), RETURN u(), RETURN p())
      LOCAL i%, j%, k%, n%, s, b() : n% = DIM(a(),2)
      DIM l(n%,n%), u(n%,n%), b(n%,n%)
      PROCpivot(a(), p())
      b() = p() . a()
      FOR j% = 0 TO n%
        l(j%,j%) = 1
        FOR i% = 0 TO j%
          s = 0
          FOR k% = 0 TO i% : s += u(k%,j%) * l(i%,k%) : NEXT
          u(i%,j%) = b(i%,j%) - s
        NEXT
        FOR i% = j% TO n%
          s = 0
          FOR k% = 0 TO j% : s += u(k%,j%) * l(i%,k%) : NEXT
          IF i%<>j% l(i%,j%) = (b(i%,j%) - s) / u(j%,j%)
        NEXT
      NEXT j%
      ENDPROC
      
      DEF PROCpivot(a(), RETURN p())
      LOCAL i%, j%, m%, n%, r% : n% = DIM(a(),2)
      DIM p(n%,n%) : FOR i% = 0 TO n% : p(i%,i%) = 1 : NEXT
      FOR i% = 0 TO n%
        m% = a(i%,i%)
        r% = i%
        FOR j% = i% TO n%
          IF a(j%,i%) > m% m% = a(j%,i%) : r% = j%
        NEXT
        IF i%<>r% THEN
          FOR j% = 0 TO n% : SWAP p(i%,j%),p(r%,j%) : NEXT
        ENDIF
      NEXT i%
      ENDPROC
      
      DEF FNshowmatrix(a())
      LOCAL @%, i%, j%, a$
      @% = &102050A
      FOR i% = 0 TO DIM(a(),1)
        FOR j% = 0 TO DIM(a(),2)
          a$ += STR$(a(i%,j%)) + ", "
        NEXT
        a$ = LEFT$(LEFT$(a$)) + CHR$(13) + CHR$(10)
      NEXT i%
      = a$
Output:
L1:
1.00000, 0.00000, 0.00000
0.50000, 1.00000, 0.00000
0.50000, -1.00000, 1.00000

U1:
2.00000, 4.00000, 7.00000
0.00000, 1.00000, 1.50000
0.00000, 0.00000, -2.00000

P1:
0.00000, 1.00000, 0.00000
1.00000, 0.00000, 0.00000
0.00000, 0.00000, 1.00000

L2:
1.00000, 0.00000, 0.00000, 0.00000
0.27273, 1.00000, 0.00000, 0.00000
0.09091, 0.28750, 1.00000, 0.00000
0.18182, 0.23125, 0.00360, 1.00000

U2:
11.00000, 9.00000, 24.00000, 2.00000
0.00000, 14.54545, 11.45455, 0.45455
0.00000, 0.00000, -3.47500, 5.68750
0.00000, 0.00000, 0.00000, 0.51079

P2:
1.00000, 0.00000, 0.00000, 0.00000
0.00000, 0.00000, 1.00000, 0.00000
0.00000, 1.00000, 0.00000, 0.00000
0.00000, 0.00000, 0.00000, 1.00000

C

Compiled with gcc -std=gnu99 -Wall -lm -pedantic. Demonstrating how to do LU decomposition, and how (not) to use macros.
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define foreach(a, b, c) for (int a = b; a < c; a++)
#define for_i foreach(i, 0, n)
#define for_j foreach(j, 0, n)
#define for_k foreach(k, 0, n)
#define for_ij for_i for_j
#define for_ijk for_ij for_k
#define _dim int n
#define _swap(x, y) { typeof(x) tmp = x; x = y; y = tmp; }
#define _sum_k(a, b, c, s) { s = 0; foreach(k, a, b) s+= c; }

typedef double **mat;

#define _zero(a) mat_zero(a, n)
void mat_zero(mat x, int n) { for_ij x[i][j] = 0; }

#define _new(a) a = mat_new(n)
mat mat_new(_dim)
{
	mat x = malloc(sizeof(double*) * n);
	x[0]  = malloc(sizeof(double) * n * n);

	for_i x[i] = x[0] + n * i;
	_zero(x);

	return x;
}

#define _copy(a) mat_copy(a, n)
mat mat_copy(void *s, _dim)
{
	mat x = mat_new(n);
	for_ij x[i][j] = ((double (*)[n])s)[i][j];
	return x;
}

#define _del(x) mat_del(x)
void mat_del(mat x) { free(x[0]); free(x); }

#define _QUOT(x) #x
#define QUOTE(x) _QUOT(x)
#define _show(a) printf(QUOTE(a)" =");mat_show(a, 0, n)
void mat_show(mat x, char *fmt, _dim)
{
	if (!fmt) fmt = "%8.4g";
	for_i {
		printf(i ? "      " : " [ ");
		for_j {
			printf(fmt, x[i][j]);
			printf(j < n - 1 ? "  " : i == n - 1 ? " ]\n" : "\n");
		}
	}
}

#define _mul(a, b) mat_mul(a, b, n)
mat mat_mul(mat a, mat b, _dim)
{
	mat c = _new(c);
	for_ijk c[i][j] += a[i][k] * b[k][j];
	return c;
}

#define _pivot(a, b) mat_pivot(a, b, n)
void mat_pivot(mat a, mat p, _dim)
{
	for_ij { p[i][j] = (i == j); }
	for_i  {
		int max_j = i;
		foreach(j, i, n)
			if (fabs(a[j][i]) > fabs(a[max_j][i])) max_j = j;

		if (max_j != i)
			for_k { _swap(p[i][k], p[max_j][k]); }
	}
}

#define _LU(a, l, u, p) mat_LU(a, l, u, p, n)
void mat_LU(mat A, mat L, mat U, mat P, _dim)
{
	_zero(L); _zero(U);
	_pivot(A, P);

	mat Aprime = _mul(P, A);

	for_i  { L[i][i] = 1; }
	for_ij {
		double s;
		if (j <= i) {
			_sum_k(0, j, L[j][k] * U[k][i], s)
			U[j][i] = Aprime[j][i] - s;
		}
		if (j >= i) {
			_sum_k(0, i, L[j][k] * U[k][i], s);
			L[j][i] = (Aprime[j][i] - s) / U[i][i];
		}
	}

	_del(Aprime);
}

double A3[][3] = {{ 1, 3, 5 }, { 2, 4, 7 }, { 1, 1, 0 }};
double A4[][4] = {{11, 9, 24, 2}, {1, 5, 2, 6}, {3, 17, 18, 1}, {2, 5, 7, 1}};

int main()
{
	int n = 3;
	mat A, L, P, U;

	_new(L); _new(P); _new(U);
	A = _copy(A3);
	_LU(A, L, U, P);
	_show(A); _show(L); _show(U); _show(P);
	_del(A);  _del(L);  _del(U);  _del(P);

	printf("\n");

	n = 4;

	_new(L); _new(P); _new(U);
	A = _copy(A4);
	_LU(A, L, U, P);
	_show(A); _show(L); _show(U); _show(P);
	_del(A);  _del(L);  _del(U);  _del(P);

	return 0;
}

C++

#include <cassert>
#include <cmath>
#include <iomanip>
#include <iostream>
#include <limits>
#include <numeric>
#include <sstream>
#include <vector>

template <typename scalar_type> class matrix {
public:
    matrix(size_t rows, size_t columns)
        : rows_(rows), columns_(columns), elements_(rows * columns) {}

    matrix(size_t rows, size_t columns, scalar_type value)
        : rows_(rows), columns_(columns), elements_(rows * columns, value) {}

    matrix(size_t rows, size_t columns,
        const std::initializer_list<std::initializer_list<scalar_type>>& values)
        : rows_(rows), columns_(columns), elements_(rows * columns) {
        assert(values.size() <= rows_);
        size_t i = 0;
        for (const auto& row : values) {
            assert(row.size() <= columns_);
            std::copy(begin(row), end(row), &elements_[i]);
            i += columns_;
        }
    }

    size_t rows() const { return rows_; }
    size_t columns() const { return columns_; }

    const scalar_type& operator()(size_t row, size_t column) const {
        assert(row < rows_);
        assert(column < columns_);
        return elements_[row * columns_ + column];
    }
    scalar_type& operator()(size_t row, size_t column) {
        assert(row < rows_);
        assert(column < columns_);
        return elements_[row * columns_ + column];
    }
private:
    size_t rows_;
    size_t columns_;
    std::vector<scalar_type> elements_;
};

template <typename scalar_type>
void print(std::wostream& out, const matrix<scalar_type>& a) {
    const wchar_t* box_top_left = L"\x23a1";
    const wchar_t* box_top_right = L"\x23a4";
    const wchar_t* box_left = L"\x23a2";
    const wchar_t* box_right = L"\x23a5";
    const wchar_t* box_bottom_left = L"\x23a3";
    const wchar_t* box_bottom_right = L"\x23a6";

    const int precision = 5;
    size_t rows = a.rows(), columns = a.columns();
    std::vector<size_t> width(columns);
    for (size_t column = 0; column < columns; ++column) {
        size_t max_width = 0;
        for (size_t row = 0; row < rows; ++row) {
            std::ostringstream str;
            str << std::fixed << std::setprecision(precision) << a(row, column);
            max_width = std::max(max_width, str.str().length());
        }
        width[column] = max_width;
    }
    out << std::fixed << std::setprecision(precision);
    for (size_t row = 0; row < rows; ++row) {
        const bool top(row == 0), bottom(row + 1 == rows);
        out << (top ? box_top_left : (bottom ? box_bottom_left : box_left));
        for (size_t column = 0; column < columns; ++column) {
            if (column > 0)
                out << L' ';
            out << std::setw(width[column]) << a(row, column);
        }
        out << (top ? box_top_right : (bottom ? box_bottom_right : box_right));
        out << L'\n';
    }
}

// Return value is a tuple with elements (lower, upper, pivot)
template <typename scalar_type>
auto lu_decompose(const matrix<scalar_type>& input) {
    assert(input.rows() == input.columns());
    size_t n = input.rows();
    std::vector<size_t> perm(n);
    std::iota(perm.begin(), perm.end(), 0);
    matrix<scalar_type> lower(n, n);
    matrix<scalar_type> upper(n, n);
    matrix<scalar_type> input1(input);
    for (size_t j = 0; j < n; ++j) {
        size_t max_index = j;
        scalar_type max_value = 0;
        for (size_t i = j; i < n; ++i) {
            scalar_type value = std::abs(input1(perm[i], j));
            if (value > max_value) {
                max_index = i;
                max_value = value;
            }
        }
        if (max_value <= std::numeric_limits<scalar_type>::epsilon())
            throw std::runtime_error("matrix is singular");
        if (j != max_index)
            std::swap(perm[j], perm[max_index]);
        size_t jj = perm[j];
        for (size_t i = j + 1; i < n; ++i) {
            size_t ii = perm[i];
            input1(ii, j) /= input1(jj, j);
            for (size_t k = j + 1; k < n; ++k)
                input1(ii, k) -= input1(ii, j) * input1(jj, k);
        }
    }
    
    for (size_t j = 0; j < n; ++j) {
        lower(j, j) = 1;
        for (size_t i = j + 1; i < n; ++i)
            lower(i, j) = input1(perm[i], j);
        for (size_t i = 0; i <= j; ++i)
            upper(i, j) = input1(perm[i], j);
    }
    
    matrix<scalar_type> pivot(n, n);
    for (size_t i = 0; i < n; ++i)
        pivot(i, perm[i]) = 1;

    return std::make_tuple(lower, upper, pivot);
}

template <typename scalar_type>
void show_lu_decomposition(const matrix<scalar_type>& input) {
    try {
        std::wcout << L"A\n";
        print(std::wcout, input);
        auto result(lu_decompose(input));
        std::wcout << L"\nL\n";
        print(std::wcout, std::get<0>(result));
        std::wcout << L"\nU\n";
        print(std::wcout, std::get<1>(result));
        std::wcout << L"\nP\n";
        print(std::wcout, std::get<2>(result));
    } catch (const std::exception& ex) {
        std::cerr << ex.what() << '\n';
    }
}

int main() {
    std::wcout.imbue(std::locale(""));
    std::wcout << L"Example 1:\n";
    matrix<double> matrix1(3, 3,
       {{1, 3, 5},
        {2, 4, 7},
        {1, 1, 0}});
    show_lu_decomposition(matrix1);
    std::wcout << '\n';

    std::wcout << L"Example 2:\n";
    matrix<double> matrix2(4, 4,
      {{11, 9, 24, 2},
        {1, 5, 2, 6},
        {3, 17, 18, 1},
        {2, 5, 7, 1}});
    show_lu_decomposition(matrix2);
    std::wcout << '\n';
    
    std::wcout << L"Example 3:\n";
    matrix<double> matrix3(3, 3,
      {{-5, -6, -3},
       {-1,  0, -2},
       {-3, -4, -7}});
    show_lu_decomposition(matrix3);
    std::wcout << '\n';
    
    std::wcout << L"Example 4:\n";
    matrix<double> matrix4(3, 3,
      {{1, 2, 3},
       {4, 5, 6},
       {7, 8, 9}});
    show_lu_decomposition(matrix4);

    return 0;
}
Output:
Example 1:
A
⎡1.00000 3.00000 5.00000⎤
⎢2.00000 4.00000 7.00000⎥
⎣1.00000 1.00000 0.00000⎦

L
⎡1.00000  0.00000 0.00000⎤
⎢0.50000  1.00000 0.00000⎥
⎣0.50000 -1.00000 1.00000⎦

U
⎡2.00000 4.00000  7.00000⎤
⎢0.00000 1.00000  1.50000⎥
⎣0.00000 0.00000 -2.00000⎦

P
⎡0.00000 1.00000 0.00000⎤
⎢1.00000 0.00000 0.00000⎥
⎣0.00000 0.00000 1.00000⎦

Example 2:
A
⎡11.00000  9.00000 24.00000 2.00000⎤
⎢ 1.00000  5.00000  2.00000 6.00000⎥
⎢ 3.00000 17.00000 18.00000 1.00000⎥
⎣ 2.00000  5.00000  7.00000 1.00000⎦

L
⎡1.00000 0.00000 0.00000 0.00000⎤
⎢0.27273 1.00000 0.00000 0.00000⎥
⎢0.09091 0.28750 1.00000 0.00000⎥
⎣0.18182 0.23125 0.00360 1.00000⎦

U
⎡11.00000  9.00000 24.00000 2.00000⎤
⎢ 0.00000 14.54545 11.45455 0.45455⎥
⎢ 0.00000  0.00000 -3.47500 5.68750⎥
⎣ 0.00000  0.00000  0.00000 0.51079⎦

P
⎡1.00000 0.00000 0.00000 0.00000⎤
⎢0.00000 0.00000 1.00000 0.00000⎥
⎢0.00000 1.00000 0.00000 0.00000⎥
⎣0.00000 0.00000 0.00000 1.00000⎦

Example 3:
A
⎡-5.00000 -6.00000 -3.00000⎤
⎢-1.00000  0.00000 -2.00000⎥
⎣-3.00000 -4.00000 -7.00000⎦

L
⎡1.00000  0.00000 0.00000⎤
⎢0.20000  1.00000 0.00000⎥
⎣0.60000 -0.33333 1.00000⎦

U
⎡-5.00000 -6.00000 -3.00000⎤
⎢ 0.00000  1.20000 -1.40000⎥
⎣ 0.00000  0.00000 -5.66667⎦

P
⎡1.00000 0.00000 0.00000⎤
⎢0.00000 1.00000 0.00000⎥
⎣0.00000 0.00000 1.00000⎦

Example 4:
A
⎡1.00000 2.00000 3.00000⎤
⎢4.00000 5.00000 6.00000⎥
⎣7.00000 8.00000 9.00000⎦
matrix is singular

Common Lisp

Uses the routine (mmul A B) from Matrix multiplication.

;; Creates a nxn identity matrix.
(defun eye (n)
  (let ((I (make-array `(,n ,n) :initial-element 0)))
    (loop for j from 0 to (- n 1) do
          (setf (aref I j j) 1))
    I))

;; Swap two rows l and k of a mxn matrix A, which is a 2D array.
(defun swap-rows (A l k)
  (let* ((n (cadr (array-dimensions A)))
         (row (make-array n :initial-element 0)))
    (loop for j from 0 to (- n 1) do
          (setf (aref row j) (aref A l j))
          (setf (aref A l j) (aref A k j))
          (setf (aref A k j) (aref row j)))))

;; Creates the pivoting matrix for A.
(defun pivotize (A)
  (let* ((n (car (array-dimensions A)))
         (P (eye n)))
    (loop for j from 0 to (- n 1) do
          (let ((max (aref A j j))
                (row j))
            (loop for i from j to (- n 1) do
                  (if (> (aref A i j) max)
                      (setq max (aref A i j)
                            row i)))
            (if (not (= j row))
                (swap-rows P j row))))

  ;; Return P.
  P))

;; Decomposes a square matrix A by PA=LU and returns L, U and P.
(defun lu (A)
  (let* ((n (car (array-dimensions A)))
         (L (make-array `(,n ,n) :initial-element 0))
         (U (make-array `(,n ,n) :initial-element 0))
         (P (pivotize A))
         (A (mmul P A)))

    (loop for j from 0 to (- n 1) do
          (setf (aref L j j) 1)
          (loop for i from 0 to j do
                (setf (aref U i j)
                      (- (aref A i j)
                         (loop for k from 0 to (- i 1)
                               sum (* (aref U k j)
                                      (aref L i k))))))
          (loop for i from j to (- n 1) do
                (setf (aref L i j)
                      (/ (- (aref A i j)
                            (loop for k from 0 to (- j 1)
                                  sum (* (aref U k j)
                                         (aref L i k))))
                         (aref U j j)))))

  ;; Return L, U and P.
  (values L U P)))

Example 1:

(setf g (make-array '(3 3) :initial-contents '((1 3 5) (2 4 7)(1 1 0))))
#2A((1 3 5) (2 4 7) (1 1 0))

(lu g)
#2A((1 0 0) (1/2 1 0) (1/2 -1 1))
#2A((2 4 7) (0 1 3/2) (0 0 -2))
#2A((0 1 0) (1 0 0) (0 0 1))

Example 2:

(setf h (make-array '(4 4) :initial-contents '((11 9 24 2)(1 5 2 6)(3 17 18 1)(2 5 7 1))))
#2A((11 9 24 2) (1 5 2 6) (3 17 18 1) (2 5 7 1))

(lup h)
#2A((1 0 0 0) (3/11 1 0 0) (1/11 23/80 1 0) (2/11 37/160 1/278 1))
#2A((11 9 24 2) (0 160/11 126/11 5/11) (0 0 -139/40 91/16) (0 0 0 71/139))
#2A((1 0 0 0) (0 0 1 0) (0 1 0 0) (0 0 0 1))

D

Translation of: Common Lisp
import std.stdio, std.algorithm, std.typecons, std.numeric,
       std.array, std.conv, std.string, std.range;

bool isRectangular(T)(in T[][] m) pure nothrow @nogc {
    return m.all!(r => r.length == m[0].length);
}

bool isSquare(T)(in T[][] m) pure nothrow @nogc {
    return m.isRectangular && m[0].length == m.length;
}

T[][] matrixMul(T)(in T[][] A, in T[][] B) pure nothrow
in {
    assert(A.isRectangular && B.isRectangular &&
           !A.empty && !B.empty && A[0].length == B.length);
} body {
    auto result = new T[][](A.length, B[0].length);
    auto aux = new T[B.length];

    foreach (immutable j; 0 .. B[0].length) {
        foreach (immutable k, const row; B)
            aux[k] = row[j];
        foreach (immutable i, const ai; A)
            result[i][j] = dotProduct(ai, aux);
    }

    return result;
}

/// Creates the pivoting matrix for m.
T[][] pivotize(T)(immutable T[][] m) pure nothrow
in {
    assert(m.isSquare);
} body {
    immutable n = m.length;
    auto id = iota(n)
              .map!((in j) => n.iota.map!(i => T(i == j)).array)
              .array;

    foreach (immutable i; 0 .. n) {
        // immutable row = iota(i, n).reduce!(max!(j => m[j][i]));
        T maxm = m[i][i];
        size_t row = i;
        foreach (immutable j; i .. n)
            if (m[j][i] > maxm) {
                maxm = m[j][i];
                row = j;
            }

        if (i != row)
            swap(id[i], id[row]);
    }

    return id;
}

/// Decomposes a square matrix A by PA=LU and returns L, U and P.
Tuple!(T[][],"L", T[][],"U", const T[][],"P")
lu(T)(immutable T[][] A) pure nothrow
in {
    assert(A.isSquare);
} body {
    immutable n = A.length;
    auto L = new T[][](n, n);
    auto U = new T[][](n, n);
    foreach (immutable i; 0 .. n) {
        L[i][i .. $] = 0;
        U[i][0 .. i] = 0;
    }

    immutable P = A.pivotize!T;
    immutable A2 = matrixMul!T(P, A);

    foreach (immutable j; 0 .. n) {
        L[j][j] = 1;
        foreach (immutable i; 0 .. j+1) {
            T s1 = 0;
            foreach (immutable k; 0 .. i)
                s1 += U[k][j] * L[i][k];
            U[i][j] = A2[i][j] - s1;
        }
        foreach (immutable i; j .. n) {
            T s2 = 0;
            foreach (immutable k; 0 .. j)
                s2 += U[k][j] * L[i][k];
            L[i][j] = (A2[i][j] - s2) / U[j][j];
        }
    }

    return typeof(return)(L, U, P);
}

void main() {
    immutable a = [[1.0, 3, 5],
                   [2.0, 4, 7],
                   [1.0, 1, 0]];
    immutable b = [[11.0, 9, 24, 2],
                   [1.0,  5,  2, 6],
                   [3.0, 17, 18, 1],
                   [2.0,  5,  7, 1]];

    auto f = "[%([%(%.1f, %)],\n %)]]\n\n".replicate(3);
    foreach (immutable m; [a, b])
        writefln(f, lu(m).tupleof);
}
Output:
[[1.0, 0.0, 0.0],
 [0.5, 1.0, 0.0],
 [0.5, -1.0, 1.0]]

[[2.0, 4.0, 7.0],
 [0.0, 1.0, 1.5],
 [0.0, 0.0, -2.0]]

[[0.0, 1.0, 0.0],
 [1.0, 0.0, 0.0],
 [0.0, 0.0, 1.0]]


[[1.0, 0.0, 0.0, 0.0],
 [0.3, 1.0, 0.0, 0.0],
 [0.0, 0.3, 1.0, 0.0],
 [0.2, 0.2, 0.0, 1.0]]

[[11.0, 9.0, 24.0, 2.0],
 [0.0, 14.5, 11.5, 0.5],
 [0.0, 0.0, -3.5, 5.7],
 [0.0, 0.0, 0.0, 0.5]]

[[1.0, 0.0, 0.0, 0.0],
 [0.0, 0.0, 1.0, 0.0],
 [0.0, 1.0, 0.0, 0.0],
 [0.0, 0.0, 0.0, 1.0]]

EchoLisp

(lib 'matrix) ;; the matrix library provides LU-decomposition
(decimals 5)

(define A (list->array' (1 3 5 2 4 7 1 1 0 ) 3 3))
(define PLU (matrix-lu-decompose A)) ;; -> list of three matrices, P, Lower, Upper

(array-print (first PLU))
0  1  0 
1  0  0 
0  0  1 
(array-print (second PLU))
1    0   0 
0.5  1   0 
0.5  -1  1 
(array-print (caddr PLU))
2  4  7   
0  1  1.5 
0  0  -2  

(define A (list->array '(11 9 24 2 1 5 2 6 3 17 18 1 2 5 7 1 ) 4 4))
(define PLU (matrix-lu-decompose A)) ;; -> list of three matrices, P, Lower, Upper
(array-print (first PLU))
1  0  0  0 
0  0  1  0 
0  1  0  0 
0  0  0  1 

(array-print (second PLU))
1        0        0       0 
0.27273  1        0       0 
0.09091  0.2875   1       0 
0.18182  0.23125  0.0036  1 

(array-print (caddr PLU))
11  9         24        2       
0   14.54545  11.45455  0.45455 
0   0         -3.475    5.6875  
0   0         0         0.51079

Fortran

program lu1
  implicit none
  call check( reshape([real(8)::1,2,1,3,4,1,5,7,0                  ],[3,3]) )
  call check( reshape([real(8)::11,1,3,2,9,5,17,5,24,2,18,7,2,6,1,1],[4,4]) )
 
contains
 
  subroutine check(a)
    real(8), intent(in)   :: a(:,:)
    integer               :: i,j,n
    real(8), allocatable  :: aa(:,:),l(:,:),u(:,:)
    integer, allocatable  :: p(:,:)
    integer, allocatable  :: ipiv(:)
    n = size(a,1)
    allocate(aa(n,n),l(n,n),u(n,n),p(n,n),ipiv(n))
    forall (j=1:n,i=1:n)
      aa(i,j) = a(i,j)
      u (i,j) = 0d0
      p (i,j) = merge(1  ,0  ,i.eq.j)
      l (i,j) = merge(1d0,0d0,i.eq.j)
    end forall
    call lu(aa, ipiv)
    do i = 1,n
       l(i, :i-1) = aa(i, :i-1)
       u(i,i:   ) = aa(i,i:   )
    end do
    p(ipiv,:) = p
    call mat_print('a',a)
    call mat_print('p',p)
    call mat_print('l',l)
    call mat_print('u',u)
    print *, "residual"
    print *, "|| P.A - L.U || =  ", maxval(abs(matmul(p,a)-matmul(l,u)))
  end subroutine
 
  subroutine lu(a,p)
!   in situ decomposition, corresponds to LAPACK's dgebtrf
    real(8), intent(inout) :: a(:,:)
    integer, intent(out  ) :: p(:)
    integer                :: n, i,j,k,kmax
    n = size(a,1)
    p = [ ( i, i=1,n ) ]
    do k = 1,n-1
        kmax = maxloc(abs(a(p(k:),k)),1) + k-1
        if (kmax /= k ) then
            p([k, kmax]) = p([kmax, k])
            a([k, kmax],:) = a([kmax, k],:)
        end if
        a(k+1:,k) = a(k+1:,k) / a(k,k)
        forall (j=k+1:n) a(k+1:,j) = a(k+1:,j) - a(k,j)*a(k+1:,k)
    end do
  end subroutine

  subroutine mat_print(amsg,a)
    character(*), intent(in) :: amsg
    class    (*), intent(in) :: a(:,:)
    integer                  :: i
    print*,' '
    print*,amsg
    do i=1,size(a,1)
      select type (a)
        type is (real(8)) ; print'(100f8.2)',a(i,:)
        type is (integer) ; print'(100i8  )',a(i,:)
      end select
    end do
    print*,' '
  end subroutine

end program
Output:
 a
    1.00     3.00     5.00
    2.00     4.00     7.00
    1.00     1.00     0.00
 p
    0.00     1.00     0.00
    1.00     0.00     0.00
    0.00     0.00     1.00
 l
    1.00     0.00     0.00
    0.50     1.00     0.00
    0.50    -1.00     1.00
 u
    2.00     4.00     7.00
    0.00     1.00     1.50
    0.00     0.00    -2.00
 residual
 || P.A - L.U || =     0.0000000000000000     
 a
   11.00     9.00    24.00     2.00
    1.00     5.00     2.00     6.00
    3.00    17.00    18.00     1.00
    2.00     5.00     7.00     1.00
 p
    1.00     0.00     0.00     0.00
    0.00     0.00     1.00     0.00
    0.00     1.00     0.00     0.00
    0.00     0.00     0.00     1.00
 l
    1.00     0.00     0.00     0.00
    0.27     1.00     0.00     0.00
    0.09     0.29     1.00     0.00
    0.18     0.23     0.00     1.00
 u
   11.00     9.00    24.00     2.00
    0.00    14.55    11.45     0.45
    0.00     0.00    -3.47     5.69
    0.00     0.00     0.00     0.51
 residual
 || P.A - L.U || =     0.0000000000000000 

Go

2D representation

Translation of: Common Lisp
package main

import "fmt"
    
type matrix [][]float64

func zero(n int) matrix {
    r := make([][]float64, n)
    a := make([]float64, n*n)
    for i := range r {
        r[i] = a[n*i : n*(i+1)]
    } 
    return r 
}
    
func eye(n int) matrix {
    r := zero(n)
    for i := range r {
        r[i][i] = 1
    }
    return r
}   
    
func (m matrix) print(label string) {
    if label > "" {
        fmt.Printf("%s:\n", label)
    }
    for _, r := range m {
        for _, e := range r {
            fmt.Printf(" %9.5f", e)
        }
        fmt.Println()
    }
}

func (a matrix) pivotize() matrix { 
    p := eye(len(a))
    for j, r := range a {
        max := r[j] 
        row := j
        for i := j; i < len(a); i++ {
            if a[i][j] > max {
                max = a[i][j]
                row = i
            }
        }
        if j != row {
            // swap rows
            p[j], p[row] = p[row], p[j]
        }
    } 
    return p
}

func (m1 matrix) mul(m2 matrix) matrix {
    r := zero(len(m1))
    for i, r1 := range m1 {
        for j := range m2 {
            for k := range m1 {
                r[i][j] += r1[k] * m2[k][j]
            }
        }
    }
    return r
}

func (a matrix) lu() (l, u, p matrix) {
    l = zero(len(a))
    u = zero(len(a))
    p = a.pivotize()
    a = p.mul(a)
    for j := range a {
        l[j][j] = 1
        for i := 0; i <= j; i++ {
            sum := 0.
            for k := 0; k < i; k++ {
                sum += u[k][j] * l[i][k]
            }
            u[i][j] = a[i][j] - sum
        }
        for i := j; i < len(a); i++ {
            sum := 0.
            for k := 0; k < j; k++ {
                sum += u[k][j] * l[i][k]
            }
            l[i][j] = (a[i][j] - sum) / u[j][j]
        }
    }
    return
}

func main() {
    showLU(matrix{
        {1, 3, 5},
        {2, 4, 7},
        {1, 1, 0}})
    showLU(matrix{
        {11, 9, 24, 2},
        {1, 5, 2, 6},
        {3, 17, 18, 1},
        {2, 5, 7, 1}})
}

func showLU(a matrix) {
    a.print("\na")
    l, u, p := a.lu()
    l.print("l")
    u.print("u") 
    p.print("p") 
}
Output:
a:
   1.00000   3.00000   5.00000
   2.00000   4.00000   7.00000
   1.00000   1.00000   0.00000
l:
   1.00000   0.00000   0.00000
   0.50000   1.00000   0.00000
   0.50000  -1.00000   1.00000
u:
   2.00000   4.00000   7.00000
   0.00000   1.00000   1.50000
   0.00000   0.00000  -2.00000
p:
   0.00000   1.00000   0.00000
   1.00000   0.00000   0.00000
   0.00000   0.00000   1.00000

a:
  11.00000   9.00000  24.00000   2.00000
   1.00000   5.00000   2.00000   6.00000
   3.00000  17.00000  18.00000   1.00000
   2.00000   5.00000   7.00000   1.00000
l:
   1.00000   0.00000   0.00000   0.00000
   0.27273   1.00000   0.00000   0.00000
   0.09091   0.28750   1.00000   0.00000
   0.18182   0.23125   0.00360   1.00000
u:
  11.00000   9.00000  24.00000   2.00000
   0.00000  14.54545  11.45455   0.45455
   0.00000   0.00000  -3.47500   5.68750
   0.00000   0.00000   0.00000   0.51079
p:
   1.00000   0.00000   0.00000   0.00000
   0.00000   0.00000   1.00000   0.00000
   0.00000   1.00000   0.00000   0.00000
   0.00000   0.00000   0.00000   1.00000

Flat representation

package main

import "fmt"

type matrix struct {
    stride int
    ele    []float64
}

func (m *matrix) print(heading string) {
    if heading > "" {
        fmt.Print("\n", heading, "\n")
    }
    for e := 0; e < len(m.ele); e += m.stride {
        fmt.Printf("%8.5f ", m.ele[e:e+m.stride])
        fmt.Println()
    }
}

func (m1 *matrix) mul(m2 *matrix) (m3 *matrix, ok bool) {
    if m1.stride*m2.stride != len(m2.ele) {
        return nil, false
    }
    m3 = &matrix{m2.stride, make([]float64, (len(m1.ele)/m1.stride)*m2.stride)}
    for m1c0, m3x := 0, 0; m1c0 < len(m1.ele); m1c0 += m1.stride {
        for m2r0 := 0; m2r0 < m2.stride; m2r0++ {
            for m1x, m2x := m1c0, m2r0; m2x < len(m2.ele); m2x += m2.stride {
                m3.ele[m3x] += m1.ele[m1x] * m2.ele[m2x]
                m1x++
            }
            m3x++
        }
    }
    return m3, true
}

func zero(rows, cols int) *matrix {
    return &matrix{cols, make([]float64, rows*cols)}
}

func eye(n int) *matrix {
    m := zero(n, n)
    for ix := 0; ix < len(m.ele); ix += n + 1 {
        m.ele[ix] = 1
    }
    return m
}

func (a *matrix) pivotize() *matrix {
    pv := make([]int, a.stride)
    for i := range pv {
        pv[i] = i
    }
    for j, dx := 0, 0; j < a.stride; j++ {
        row := j
        max := a.ele[dx]
        for i, ixcj := j, dx; i < a.stride; i++ {
            if a.ele[ixcj] > max {
                max = a.ele[ixcj]
                row = i
            }
            ixcj += a.stride
        }
        if j != row {
            pv[row], pv[j] = pv[j], pv[row]
        }
        dx += a.stride + 1
    }
    p := zero(a.stride, a.stride)
    for r, c := range pv {
        p.ele[r*a.stride+c] = 1
    }
    return p
}

func (a *matrix) lu() (l, u, p *matrix) {
    l = zero(a.stride, a.stride)
    u = zero(a.stride, a.stride)
    p = a.pivotize()
    a, _ = p.mul(a)
    for j, jxc0 := 0, 0; j < a.stride; j++ {
        l.ele[jxc0+j] = 1
        for i, ixc0 := 0, 0; ixc0 <= jxc0; i++ {
            sum := 0.
            for k, kxcj := 0, j; k < i; k++ {
                sum += u.ele[kxcj] * l.ele[ixc0+k]
                kxcj += a.stride
            }
            u.ele[ixc0+j] = a.ele[ixc0+j] - sum
            ixc0 += a.stride
        }
        for ixc0 := jxc0; ixc0 < len(a.ele); ixc0 += a.stride {
            sum := 0.
            for k, kxcj := 0, j; k < j; k++ {
                sum += u.ele[kxcj] * l.ele[ixc0+k]
                kxcj += a.stride
            }
            l.ele[ixc0+j] = (a.ele[ixc0+j] - sum) / u.ele[jxc0+j]
        }
        jxc0 += a.stride
    }
    return
}

func main() {
    showLU(&matrix{3, []float64{
        1, 3, 5,
        2, 4, 7,
        1, 1, 0}})
    showLU(&matrix{4, []float64{
        11, 9, 24, 2,
        1, 5, 2, 6,
        3, 17, 18, 1,
        2, 5, 7, 1}})
}

func showLU(a *matrix) {
    a.print("\na")
    l, u, p := a.lu()
    l.print("l")
    u.print("u")
    p.print("p")
}

Output is same as from 2D solution.

Library gonum/mat

package main

import (
    "fmt"

    "gonum.org/v1/gonum/mat"
)

func main() {
    showLU(mat.NewDense(3, 3, []float64{
        1, 3, 5,
        2, 4, 7,
        1, 1, 0,
    }))
    fmt.Println()
    showLU(mat.NewDense(4, 4, []float64{
        11, 9, 24, 2,
        1, 5, 2, 6,
        3, 17, 18, 1,
        2, 5, 7, 1,
    }))
}

func showLU(a *mat.Dense) {
    fmt.Printf("a: %v\n\n", mat.Formatted(a, mat.Prefix("   ")))
    var lu mat.LU
    lu.Factorize(a)
    l := lu.LTo(nil)
    u := lu.UTo(nil)
    fmt.Printf("l: %.5f\n\n", mat.Formatted(l, mat.Prefix("   ")))
    fmt.Printf("u: %.5f\n\n", mat.Formatted(u, mat.Prefix("   ")))
    fmt.Println("p:", lu.Pivot(nil))
}
Output:

Pivot format is a little different here. (But library solutions don't really meet task requirements anyway.)

a: ⎡1  3  5⎤
   ⎢2  4  7⎥
   ⎣1  1  0⎦

l: ⎡ 1.00000   0.00000   0.00000⎤
   ⎢ 0.50000   1.00000   0.00000⎥
   ⎣ 0.50000  -1.00000   1.00000⎦

u: ⎡ 2.00000   4.00000   7.00000⎤
   ⎢ 0.00000   1.00000   1.50000⎥
   ⎣ 0.00000   0.00000  -2.00000⎦

p: [1 0 2]

a: ⎡11   9  24   2⎤
   ⎢ 1   5   2   6⎥
   ⎢ 3  17  18   1⎥
   ⎣ 2   5   7   1⎦

l: ⎡1.00000  0.00000  0.00000  0.00000⎤
   ⎢0.27273  1.00000  0.00000  0.00000⎥
   ⎢0.09091  0.28750  1.00000  0.00000⎥
   ⎣0.18182  0.23125  0.00360  1.00000⎦

u: ⎡11.00000   9.00000  24.00000   2.00000⎤
   ⎢ 0.00000  14.54545  11.45455   0.45455⎥
   ⎢ 0.00000   0.00000  -3.47500   5.68750⎥
   ⎣ 0.00000   0.00000   0.00000   0.51079⎦

p: [0 2 1 3]

Library go.matrix

package main

import (
    "fmt"

    mat "github.com/skelterjohn/go.matrix"
)

func main() {
    showLU(mat.MakeDenseMatrixStacked([][]float64{
        {1, 3, 5},
        {2, 4, 7},
        {1, 1, 0}}))
    showLU(mat.MakeDenseMatrixStacked([][]float64{
        {11, 9, 24, 2},
        {1, 5, 2, 6},
        {3, 17, 18, 1},
        {2, 5, 7, 1}}))
}

func showLU(a *mat.DenseMatrix) {
    fmt.Printf("\na:\n%v\n", a)
    l, u, p := a.LU()
    fmt.Printf("l:\n%v\n", l)
    fmt.Printf("u:\n%v\n", u)
    fmt.Printf("p:\n%v\n", p)
}
Output:
a:
{1, 3, 5,
 2, 4, 7,
 1, 1, 0}
l:
{  1,   0,   0,
 0.5,   1,   0,
 0.5,  -1,   1}
u:
{  2,   4,   7,
   0,   1, 1.5,
   0,   0,  -2}
p:
{0, 1, 0,
 1, 0, 0,
 0, 0, 1}

a:
{11,  9, 24,  2,
  1,  5,  2,  6,
  3, 17, 18,  1,
  2,  5,  7,  1}
l:
{       1,        0,        0,        0,
 0.272727,        1,        0,        0,
 0.090909,   0.2875,        1,        0,
 0.181818,  0.23125, 0.003597,        1}
u:
{       11,         9,        24,         2,
         0, 14.545455, 11.454545,  0.454545,
         0,         0,    -3.475,    5.6875,
         0,         0,         0,  0.510791}
p:
{1, 0, 0, 0,
 0, 0, 1, 0,
 0, 1, 0, 0,
 0, 0, 0, 1}

Haskell

Without elem-at-index modifications; doesn't find maximum but any non-zero element

import Data.List
import Data.Maybe
import Text.Printf

-- a matrix is represented as a list of columns
mmult :: Num a => [[a]] -> [[a]] -> [[a]] 
mmult a b = [ [ sum $ zipWith (*) ak bj | ak <- (transpose a) ] | bj <- b ]

nth mA i j = (mA !! j) !! i

idMatrixPart n m k = [ [if (i==j) then 1 else 0 | i <- [1..n]] | j <- [k..m]]
idMatrix n = idMatrixPart n n 1

permMatrix n ix1 ix2 =
    [ [ if ((i==ix1 && j==ix2) || (i==ix2 && j==ix1) || (i==j && j /= ix1 && i /= ix2))
        then 1 else 0| i <- [0..n-1]] | j <- [0..n-1]]
permMatrix_inv n ix1 ix2 = permMatrix n ix2 ix1
        
-- count k from zero
elimColumn :: Int -> [[Rational]] -> Int -> [Rational]
elimMatrix :: Int -> [[Rational]] -> Int -> [[Rational]]
elimMatrix_inv :: Int -> [[Rational]] -> Int -> [[Rational]]

elimColumn n mA k = [(let mAkk = (nth mA k k) in  if (i>k) then (-(nth mA i k)/mAkk)
    else if (i==k) then 1 else 0) | i <- [0..n-1]]
elimMatrix n mA k = (idMatrixPart n k 1) ++ [elimColumn n mA k] ++ (idMatrixPart n n (k+2))
elimMatrix_inv n mA k = (idMatrixPart n k 1) ++ --mA is elimMatrix there
    [let c = (mA!!k) in [if (i==k) then 1 else if (i<k) then 0 else (-(c!!i)) | i <- [0..n-1]]]
     ++ (idMatrixPart n n (k+2))

swapIndx :: [[Rational]] -> Int -> Int
swapIndx mA k = fromMaybe k (findIndex (>0) (drop k (mA!!k)))

-- LUP; lupStep returns [L:U:P]
paStep_recP :: Int -> [[Rational]] -> [[Rational]] -> [[Rational]] -> Int -> [[[Rational]]]
paStep_recM :: Int -> [[Rational]] -> [[Rational]] -> [[Rational]] -> Int -> [[[Rational]]]
lupStep :: Int -> [[Rational]] -> [[[Rational]]]

paStep_recP n mP mA mL cnt = 
    let mPt = permMatrix n cnt (swapIndx mA cnt) in 
        let mPtInv = permMatrix_inv n cnt (swapIndx mA cnt) in
    if (cnt >= n) then [(mmult mP mL),mA,mP] else
        (paStep_recM n (mmult mPt mP) (mmult mPt mA) (mmult mL mPtInv) cnt)

paStep_recM n mP mA mL cnt =
    let mMt = elimMatrix n mA cnt in
        let mMtInv = elimMatrix_inv n mMt cnt in
    paStep_recP n mP (mmult mMt mA) (mmult mL mMtInv) (cnt + 1)

lupStep n mA = paStep_recP n (idMatrix n) mA (idMatrix n) 0

--IO
matrixFromRationalToString m = concat $ intersperse "\n"
    (map (\x -> unwords $ printf "%8.4f" <$> (x::[Double])) 
        (transpose (matrixFromRational m))) where 
        matrixFromRational m = map (\x -> map fromRational x) m

solveTask mY = let mLUP = lupStep (length mY) mY in
    putStrLn ("A: \n" ++ matrixFromRationalToString mY) >>
    putStrLn ("L: \n" ++ matrixFromRationalToString (mLUP!!0)) >>
    putStrLn ("U: \n" ++ matrixFromRationalToString (mLUP!!1)) >>
    putStrLn ("P: \n" ++ matrixFromRationalToString (mLUP!!2)) >>
    putStrLn ("Verify: PA\n" ++ matrixFromRationalToString (mmult (mLUP!!2) mY)) >>
    putStrLn ("Verify: LU\n" ++ matrixFromRationalToString (mmult (mLUP!!0) (mLUP!!1)))

mY1 = [[1, 2, 1], [3, 4, 7], [5, 7, 0]] :: [[Rational]]
mY2 = [[11, 1, 3, 2], [9, 5, 17, 5], [24, 2, 18, 7], [2, 6, 1, 1]] :: [[Rational]]
main = putStrLn "Task1: \n" >> solveTask mY1 >>
    putStrLn "Task2: \n" >> solveTask mY2
Output:
Task1: 

A: 
  1.0000   3.0000   5.0000
  2.0000   4.0000   7.0000
  1.0000   7.0000   0.0000
L: 
  1.0000   0.0000   0.0000
  2.0000   1.0000   0.0000
  1.0000  -2.0000   1.0000
U: 
  1.0000   3.0000   5.0000
  0.0000  -2.0000  -3.0000
  0.0000   0.0000 -11.0000
P: 
  1.0000   0.0000   0.0000
  0.0000   1.0000   0.0000
  0.0000   0.0000   1.0000
Verify: PA
  1.0000   3.0000   5.0000
  2.0000   4.0000   7.0000
  1.0000   7.0000   0.0000
Verify: LU
  1.0000   3.0000   5.0000
  2.0000   4.0000   7.0000
  1.0000   7.0000   0.0000
Task2: 

A: 
 11.0000   9.0000  24.0000   2.0000
  1.0000   5.0000   2.0000   6.0000
  3.0000  17.0000  18.0000   1.0000
  2.0000   5.0000   7.0000   1.0000
L: 
  1.0000   0.5556   0.2317   0.0000
  0.0000   1.0000   0.0000   0.0000
  0.0000   1.8889   1.0000   0.0000
  0.0000   0.0909   0.0000   1.0000
U: 
  0.0081   0.0000   0.0000   0.5325
 11.0000   9.0000  24.0000   2.0000
-17.7778   0.0000 -27.3333  -2.7778
  0.0000   4.1818  -0.1818   5.8182
P: 
  0.0000   0.0000   0.0000   1.0000
  1.0000   0.0000   0.0000   0.0000
  0.0000   0.0000   1.0000   0.0000
  0.0000   1.0000   0.0000   0.0000
Verify: PA
  2.0000   5.0000   7.0000   1.0000
 11.0000   9.0000  24.0000   2.0000
  3.0000  17.0000  18.0000   1.0000
  1.0000   5.0000   2.0000   6.0000
Verify: LU
  2.0000   5.0000   7.0000   1.0000
 11.0000   9.0000  24.0000   2.0000
  3.0000  17.0000  18.0000   1.0000
  1.0000   5.0000   2.0000   6.0000

With Numeric.LinearAlgebra

import Numeric.LinearAlgebra

a1, a2 :: Matrix R
a1 = (3><3)
  [1,3,5
  ,2,4,7
  ,1,1,0]

a2 = (4><4)
  [11,  9, 24,  2
  , 1,  5,  2,  6
  , 3, 17, 18,  1
  , 2,  5,  7,  1]

main = do
  print $ lu a1
  print $ lu a2
Output:
((3><3)
 [ 1.0,  0.0, 0.0
 , 0.5,  1.0, 0.0
 , 0.5, -1.0, 1.0 ],(3><3)
 [ 2.0,  4.0,  7.0
 , 0.0,  1.0,  1.5
 , 0.0, -0.0, -2.0 ],(3><3)
 [ 0.0, 1.0, 0.0
 , 1.0, 0.0, 0.0
 , 0.0, 0.0, 1.0 ],-1.0)
((4><4)
 [                  1.0,                 0.0,                   0.0, 0.0
 ,   0.2727272727272727,                 1.0,                   0.0, 0.0
 , 9.090909090909091e-2,              0.2875,                   1.0, 0.0
 ,  0.18181818181818182, 0.23124999999999996, 3.5971223021580693e-3, 1.0 ],(4><4)
 [ 11.0,                9.0,                24.0,                2.0
 ,  0.0, 14.545454545454547,  11.454545454545455, 0.4545454545454546
 ,  0.0,                0.0, -3.4749999999999996,             5.6875
 ,  0.0,                0.0,                 0.0,  0.510791366906476 ],(4><4)
 [ 1.0, 0.0, 0.0, 0.0
 , 0.0, 0.0, 1.0, 0.0
 , 0.0, 1.0, 0.0, 0.0
 , 0.0, 0.0, 0.0, 1.0 ],-1.0)

Idris

works with Idris 0.10

Uses The Method Of Partial Pivoting

Solution:

module Main 

import Data.Vect
 
Matrix : Nat -> Nat -> Type -> Type
Matrix m n t = Vect m (Vect n t)

-- Creates list from 0 to n (not including n) 
upTo : (m : Nat) -> Vect m (Fin m)
upTo Z = []
upTo (S n) = 0 :: (map FS (upTo n))

-- Creates list from 0 to n-1  (not including n-1)
upToM1 : (m : Nat) -> (sz ** Vect sz (Fin m))
upToM1 m = case (upTo m) of
                  (y::ys) => (_ ** init(y::ys)) 
                  [] => (_ ** [])

-- Creates list from i to n (not including n)
fromUpTo : {n : Nat} -> Fin n -> (sz ** Vect sz (Fin n))
fromUpTo {n} m = filter (>= m) (upTo n)

-- Creates list from i+1 to n (not including n)
fromUpTo1 : {n : Nat} -> Fin n -> (sz ** Vect sz (Fin n))
fromUpTo1 {n} m with (fromUpTo m)
  | (_ ** xs) = case xs of 
                  (y::ys) => (_ ** ys)
                  [] => (_ ** [])


-- Create Zero Matrix of size m by n 
zeros : (m : Nat) -> (n : Nat) -> Matrix m n Double
zeros m n = replicate m (replicate n 0.0)

replaceAtM : (Fin m, Fin n) -> t -> Matrix m n t -> Matrix m n t
replaceAtM (i, j) e a = replaceAt i (replaceAt j e (index i a)) a

-- Create Identity Matrix of size m by m
eye : (m : Nat) -> Matrix m m Double
eye m = map create1Vec (upTo m)
  where
    set1 : Vect m Double -> Fin m -> Vect m Double
    set1 a n = replaceAt n 1.0 a

    create1Vec : Fin m -> Vect m Double
    create1Vec n = set1 (replicate m 0.0) n


indexM : (Fin m, Fin n) -> Matrix m n t -> t
indexM (i, j) a = index j (index i a)
 

-- Obtain index for the row containing the
-- largest absolute value for the given column
colAbsMaxIndex : Fin m -> Fin m -> Matrix m m Double -> Fin m
colAbsMaxIndex startRow col a {m} with (fromUpTo startRow)
  | (_ ** xs) = 
    snd $ foldl (\(absMax, idx), curIdx => 
          let curAbsVal = abs(indexM (curIdx, col) a) in
            if (curAbsVal > absMax) 
              then (curAbsVal, curIdx)
              else (absMax, idx)
        ) (0.0, startRow) xs


-- Swaps two rows in a given matrix
swapRows : Fin m -> Fin m -> Matrix m n t -> Matrix m n t
swapRows r1 r2 a = replaceAt r2 tempRow $ replaceAt r1 (index r2 a) a
  where tempRow = index r1 a


-- Swaps two individual values in a matrix
swapValues : (Fin m, Fin m) -> (Fin m, Fin m) -> Matrix m m Double -> Matrix m m Double
swapValues (i1, j1) (i2, j2) m = replaceAtM (i2, j2) v1 $ replaceAtM (i1, j1) v2 m
  where
      v1 = indexM (i1, j1) m
      v2 = indexM (i2, j2) m

-- Perform row Swap on Lower Triangular Matrix
lSwapRow : Fin m -> Fin m -> Matrix m m Double -> Matrix m m Double
lSwapRow row1 row2 l {m} with (filter (< row1) (upTo m))
  | (_ ** xs) =  foldl (\l',col => swapValues (row1, col) (row2, col) l') l xs


rowSwap : Fin m -> (Matrix m m Double,  Matrix m m Double, Matrix m m Double) -> 
                        (Matrix m m Double, Matrix m m Double, Matrix m m Double)
rowSwap col (l,u,p) = (lSwapRow col row l, swapRows col row u, swapRows col row p) 
      where row = colAbsMaxIndex col col u

 
calc : (Fin m) -> (Fin m) -> (Matrix m m Double, Matrix m m Double) -> 
                                (Matrix m m Double, Matrix m m Double)
calc i j (l, u) {m} = (l', u')
   where 
         l' : Matrix m m Double
         l' = replaceAtM (j, i) ((indexM (j, i) u) / indexM (i, i) u) l
         
         u'' : (Fin m) -> (Matrix m m Double) -> (Matrix m m Double)
         u'' k u = replaceAtM (j, k) ((indexM (j, k) u) - 
                  ((indexM (j, i) l') * (indexM (i, k) u))) u
         
         u' : (Matrix m m Double)
         u' with (fromUpTo i) | (_ ** xs) = foldl (\curU, idx => u'' idx curU) u xs


-- Perform a single iteration of the algorithm for the given column
iteration : Fin m -> (Matrix m m Double, Matrix m m Double, Matrix m m Double) ->
                        (Matrix m m Double, Matrix m m Double, Matrix m m Double) 
iteration i lup {m} = iterate' (rowSwap i lup) 
 
          where 
                modify : (Matrix m m Double, Matrix m m Double) -> 
                            (Matrix m m Double, Matrix m m Double)
                modify lu with (fromUpTo1 i) | (_ ** xs) = 
                                            foldl (\lu',j => calc i j lu') lu xs 
                                            
                iterate' : (Matrix m m Double, Matrix m m Double, Matrix m m Double) ->
                              (Matrix m m Double, Matrix m m Double, Matrix m m Double)
                iterate' (l, u, p) with (modify (l, u)) | (l', u') = (l', u', p) 


-- Generate L, U, P matricies from a given square matrix.
-- Where L * U = A, and P is the permutation matrix
luDecompose : Matrix m m Double -> (Matrix m m Double, Matrix m m Double, Matrix m m Double) 
luDecompose a {m} with (upToM1 m) 
  | (_ ** xs) = foldl (\lup,idx => iteration idx lup) (eye m,a,eye m) xs 
      
      
      
ex1 : (Matrix 3 3 Double, Matrix 3 3 Double, Matrix 3 3 Double)
ex1 = luDecompose [[1, 3, 5], [2, 4, 7], [1, 1, 0]] 

ex2 : (Matrix 4 4 Double, Matrix 4 4 Double, Matrix 4 4 Double)
ex2 = luDecompose [[11, 9, 24, 2], [1, 5, 2, 6], [3, 17, 18, 1], [2, 5, 7, 1]]

printEx : (Matrix n n Double, Matrix n n Double, Matrix n n Double) -> IO ()
printEx (l, u, p) = do
  putStr "l:"
  print l
  putStrLn "\n"
  
  putStr "u:"
  print u
  putStrLn "\n"

  putStr "p:"
  print p
  putStrLn "\n"

main : IO()
main = do 
  putStrLn "Solution 1:"
  printEx ex1
  putStrLn "Solution 2:"
  printEx ex2
Output:
Solution 1:
l:[[1, 0, 0], [0.5, 1, 0], [0.5, -1, 1]]

u:[[2, 4, 7], [0, 1, 1.5], [0, 0, -2]]

p:[[0, 1, 0], [1, 0, 0], [0, 0, 1]]

Solution 2:
l:[[1, 0, 0, 0], [0.2727272727272727, 1, 0, 0], [0.09090909090909091, 0.2875, 1, 0], [0.1818181818181818, 0.23125, 0.003597122302158069, 1]]

u:[[11, 9, 24, 2], [0, 14.54545454545455, 11.45454545454546, 0.4545454545454546], [0, 0, -3.475, 5.6875], [0, 0, 0, 0.510791366906476]]

p:[[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

J

Taken with slight modification from [1].

Solution:

mp=: +/ .*

LU=: 3 : 0
  'm n'=. $ A=. y
  if. 1=m do.
    p ; (=1) ; p{"1 A [ p=. C. (n-1);~.0,(0~:,A)i.1
  else.
    m2=. >.m%2
    'p1 L1 U1'=. LU m2{.A
    D=. (/:p1) {"1 m2}.A
    F=. m2 {."1 D
    E=. m2 {."1 U1
    FE1=. F mp %. E
    G=. m2}."1 D - FE1 mp U1
    'p2 L2 U2'=. LU G
    p3=. (i.m2),m2+p2
    H=. (/:p3) {"1 U1
    (p1{p3) ; (L1,FE1,.L2) ; H,(-n){."1 U2
  end.
)

permtomat=: 1 {.~"0 -@>:@:/:
LUdecompose=: (permtomat&.>@{. , }.)@:LU

Example use:

   A=:3 3$1 3 5 2 4 7 1 1 0
   LUdecompose A
┌─────┬─────┬───────┐
1 0 01 0 01  3  5
0 1 02 1 00 _2 _3
0 0 11 1 10  0 _2
└─────┴─────┴───────┘
   mp/> LUdecompose A
1 3 5
2 4 7
1 1 0

   A=:4 4$11 9 24 2  1 5 2 6  3 17 18 1  2 5 7 1
   LUdecompose A
┌───────┬─────────────────────────────┬─────────────────────────────┐
1 0 0 0        1        0        0 011       9        24        2
0 1 0 00.0909091        1        0 0 0 4.18182 _0.181818  5.81818
0 0 1 0 0.272727  3.47826        1 0 0       0    12.087 _19.7826
0 0 0 1 0.181818 0.804348 0.230216 1 0       0         0 0.510791
└───────┴─────────────────────────────┴─────────────────────────────┘
   mp/> LUdecompose A
11  9 24 2
 1  5  2 6
 3 17 18 1
 2  5  7 1

Java

Translation of Common Lisp via D

Works with: Java version 8
import static java.util.Arrays.stream;
import java.util.Locale;
import static java.util.stream.IntStream.range;

public class Test {

    static double dotProduct(double[] a, double[] b) {
        return range(0, a.length).mapToDouble(i -> a[i] * b[i]).sum();
    }

    static double[][] matrixMul(double[][] A, double[][] B) {
        double[][] result = new double[A.length][B[0].length];
        double[] aux = new double[B.length];

        for (int j = 0; j < B[0].length; j++) {

            for (int k = 0; k < B.length; k++)
                aux[k] = B[k][j];

            for (int i = 0; i < A.length; i++)
                result[i][j] = dotProduct(A[i], aux);
        }
        return result;
    }

    static double[][] pivotize(double[][] m) {
        int n = m.length;
        double[][] id = range(0, n).mapToObj(j -> range(0, n)
                .mapToDouble(i -> i == j ? 1 : 0).toArray())
                .toArray(double[][]::new);

        for (int i = 0; i < n; i++) {
            double maxm = m[i][i];
            int row = i;
            for (int j = i; j < n; j++)
                if (m[j][i] > maxm) {
                    maxm = m[j][i];
                    row = j;
                }

            if (i != row) {
                double[] tmp = id[i];
                id[i] = id[row];
                id[row] = tmp;
            }
        }
        return id;
    }

    static double[][][] lu(double[][] A) {
        int n = A.length;
        double[][] L = new double[n][n];
        double[][] U = new double[n][n];
        double[][] P = pivotize(A);
        double[][] A2 = matrixMul(P, A);

        for (int j = 0; j < n; j++) {
            L[j][j] = 1;
            for (int i = 0; i < j + 1; i++) {
                double s1 = 0;
                for (int k = 0; k < i; k++)
                    s1 += U[k][j] * L[i][k];
                U[i][j] = A2[i][j] - s1;
            }
            for (int i = j; i < n; i++) {
                double s2 = 0;
                for (int k = 0; k < j; k++)
                    s2 += U[k][j] * L[i][k];
                L[i][j] = (A2[i][j] - s2) / U[j][j];
            }
        }
        return new double[][][]{L, U, P};
    }

    static void print(double[][] m) {
        stream(m).forEach(a -> {
            stream(a).forEach(n -> System.out.printf(Locale.US, "%5.1f ", n));
            System.out.println();
        });
        System.out.println();
    }

    public static void main(String[] args) {
        double[][] a = {{1.0, 3, 5}, {2.0, 4, 7}, {1.0, 1, 0}};

        double[][] b = {{11.0, 9, 24, 2}, {1.0, 5, 2, 6}, {3.0, 17, 18, 1},
        {2.0, 5, 7, 1}};

        for (double[][] m : lu(a))
            print(m);

        System.out.println();

        for (double[][] m : lu(b))
            print(m);
    }
}
  1.0   0.0   0.0 
  0.5   1.0   0.0 
  0.5  -1.0   1.0 

  2.0   4.0   7.0 
  0.0   1.0   1.5 
  0.0   0.0  -2.0 

  0.0   1.0   0.0 
  1.0   0.0   0.0 
  0.0   0.0   1.0 


  1.0   0.0   0.0   0.0 
  0.3   1.0   0.0   0.0 
  0.1   0.3   1.0   0.0 
  0.2   0.2   0.0   1.0 

 11.0   9.0  24.0   2.0 
  0.0  14.5  11.5   0.5 
  0.0   0.0  -3.5   5.7 
  0.0   0.0   0.0   0.5 

  1.0   0.0   0.0   0.0 
  0.0   0.0   1.0   0.0 
  0.0   1.0   0.0   0.0 
  0.0   0.0   0.0   1.0 

JavaScript

Works with: ES5 ES6
const mult=(a, b)=>{
  let res = new Array(a.length);
  for (let r = 0; r < a.length; ++r) {
    res[r] = new Array(b[0].length);
    for (let c = 0; c < b[0].length; ++c) {
      res[r][c] = 0;
      for (let i = 0; i < a[0].length; ++i)
        res[r][c] += a[r][i] * b[i][c];
    }
  }
  return res;
}

const lu = (mat) => {
    let lower = [],upper = [],n=mat.length;;
    for(let i=0;i<n;i++){
        lower.push([]);
        upper.push([]);
        for(let j=0;j<n;j++){
            lower[i].push(0);
            upper[i].push(0);
        }
    }
    for (let i = 0; i < n; i++) {
        for (let k = i; k < n; k++){
            let sum = 0;
            for (let j = 0; j < i; j++)
                sum += (lower[i][j] * upper[j][k]);
            upper[i][k] = mat[i][k] - sum;
        }
        for (let k = i; k < n; k++) {
            if (i == k)
                lower[i][i] = 1;
            else{
                let sum = 0;
                for (let j = 0; j < i; j++)
                    sum += (lower[k][j] * upper[j][i]);
                lower[k][i] = (mat[k][i] - sum) / upper[i][i];
            }
        }
    }
    return [lower,upper];
}

const pivot = (m) =>{
    let n = m.length;
    let id = [];
    for(let i=0;i<n;i++){
        id.push([]);
        for(let j=0;j<n;j++){
            if(i===j)
                id[i].push(1);
            else
                id[i].push(0);
        }
    }
    for (let i = 0; i < n; i++) {
        let maxm = m[i][i];
        let row = i;
        for (let j = i; j < n; j++)
            if (m[j][i] > maxm) {
                maxm = m[j][i];
                row = j;
            }
        if (i != row) {
            let tmp = id[i];
            id[i] = id[row];
            id[row] = tmp;
        }
    }
    return id;
}

const luDecomposition=(A)=>{
    const P = pivot(A);
    A = mult(P,A);
    return [...lu(A),P];
}

jq

Works with: jq version 1.4

jq currently does not have builtin support for matrices and therefore some infrastructure is needed to make the following self-contained. Matrices here are represented as arrays of arrays in the usual way.

Infrastructure

# Create an m x n matrix
def matrix(m; n; init):
  if m == 0 then []
  elif m == 1 then [range(0;n)] | map(init)
  elif m > 0 then
    matrix(1;n;init) as $row
    | [range(0;m)] | map( $row )
  else error("matrix\(m);_;_) invalid")
  end ;

def I(n): matrix(n;n;0) as $m
  | reduce range(0;n) as $i ($m; . | setpath( [$i,$i]; 1));

def dot_product(a; b):
  reduce range(0;a|length) as $i (0; . + (a[$i] * b[$i]) );

# transpose/0 expects its input to be a rectangular matrix
def transpose:
  if (.[0] | length) == 0 then []
  else [map(.[0])] + (map(.[1:]) | transpose)
  end ;

# A and B should both be numeric matrices, A being m by n, and B being n by p.
def multiply(A; B):
  (B[0]|length) as $p
  | (B|transpose) as $BT
  | reduce range(0; A|length) as $i
       ([];
       reduce range(0; $p) as $j 
         (.;
          .[$i][$j] = dot_product( A[$i]; $BT[$j] ) ));

def swap_rows(i;j):
  if i == j then .
  else .[i] as $i | .[i] = .[j] | .[j] = $i
  end ;

# Print a matrix neatly, each cell occupying n spaces, but without truncation
def neatly(n):
  def right: tostring | ( " " * (n-length) + .);
  . as $in
  | length as $length
  | reduce range (0;$length) as $i
      (""; . + reduce range(0;$length) as $j
      (""; "\(.) \($in[$i][$j] | right )" ) + "\n" ) ;

LU decomposition

# Create the pivot matrix for the input matrix.
# Use "range(0;$n) as $i" to handle ill-conditioned cases.
def pivotize:
  def abs: if .<0 then -. else . end;
  length as $n
  | . as $m
  | reduce range(0;$n) as $j
      (I($n);
       # state: [row; max]
       (reduce range(0; $n) as $i
          ([$j, $m[$j][$j]|abs ];
           ($m[$i][$j]|abs) as $a
           | if $a > .[1] then [ $i, $a ] else . end) | .[0]) as $row
        | swap_rows( $j; $row)
      ) ;

# Decompose the input nxn matrix A by PA=LU and return [L, U, P].
def lup:
  def div(i;j):
    if j == 0 then if i==0 then 0 else error("\(i)/0") end
    else i/j
    end;
  . as $A
  | length as $n
  | I($n) as $L         #  matrix($n; $n; 0.0) as $L
  | matrix($n; $n; 0.0) as $U
  | ($A|pivotize) as $P
  | multiply($P;$A) as $A2
  # state: [L, U]
  | reduce range(0; $n) as $i ( [$L, $U];
      reduce range(0; $n) as $j (.;
          .[0] as $L
        | .[1] as $U
        | if ($j >= $i) then
            (reduce range(0;$i) as $k (0; . + ($U[$k][$j] * $L[$i][$k] ))) as $s1
            | [$L, ($U| setpath([$i,$j]; ($A2[$i][$j] - $s1))) ]
          else
            (reduce range(0;$j) as $k (0; . + ($U[$k][$j] * $L[$i][$k]))) as $s2
            | [ ($L | setpath([$i,$j]; div(($A2[$i][$j] - $s2) ; $U[$j][$j] ))), $U ]
          end ))
    | . + [ $P ]
;

Example 1:

def a: [[1, 3, 5], [2, 4, 7], [1, 1, 0]];
a | lup[] | neatly(4)
Output:
 $ /usr/local/bin/jq -M -n -r -f LU.jq
    1    0    0
  0.5    1    0
  0.5   -1    1

    2    4    7
    0    1  1.5
    0    0   -2

    0    1    0
    1    0    0
    0    0    1

Example 2:

def b: [[11,9,24,2],[1,5,2,6],[3,17,18,1],[2,5,7,1]];
b | lup[] | neatly(21)
Output:
$ /usr/local/bin/jq -M -n -r -f LU.jq
                     1                     0                     0                     0
    0.2727272727272727                     1                     0                     0
   0.09090909090909091                0.2875                     1                     0
   0.18181818181818182   0.23124999999999996 0.0035971223021580693                     1

                    11                     9                    24                     2
                     0    14.545454545454547    11.454545454545455    0.4545454545454546
                     0                     0   -3.4749999999999996                5.6875
                     0                     0                     0     0.510791366906476

                     1                     0                     0                     0
                     0                     0                     1                     0
                     0                     1                     0                     0
                     0                     0                     0                     1

Example 3:

# A|lup|verify(A) should be true
def verify(A):
  .[0] as $L | .[1] as $U | .[2] as $P
  | multiply($P; A) == multiply($L; $U);

def A:
  [[1,  1,  1,  1],
   [1,  1, -1, -1],
   [1, -1,  0,  0],
   [0,  0,  1, -1]];

A|lup|verify(A)
Output:
true

Julia

Julia has the predefined functions `lu`, `lufact` and `lufact!` in the standard library to compute the lu decomposition of a matrix.

Output:
julia> using LinearAlgebra; lu([1 3 5 ; 2 4 7 ; 1 1 0])
(
3x3 Array{Float64,2}:
 1.0   0.0  0.0
 0.5   1.0  0.0
 0.5  -1.0  1.0,

3x3 Array{Float64,2}:
 2.0  4.0   7.0
 0.0  1.0   1.5
 0.0  0.0  -2.0,

[2,1,3])

Kotlin

// version 1.1.4-3

typealias Vector = DoubleArray
typealias Matrix = Array<Vector>
 
operator fun Matrix.times(other: Matrix): Matrix {
    val rows1 = this.size
    val cols1 = this[0].size
    val rows2 = other.size
    val cols2 = other[0].size
    require(cols1 == rows2)
    val result = Matrix(rows1) { Vector(cols2) }
    for (i in 0 until rows1) {
        for (j in 0 until cols2) {
            for (k in 0 until rows2) {
                result[i][j] += this[i][k] * other[k][j]
            }
        }
    }
    return result
}

fun pivotize(m: Matrix): Matrix {
    val n = m.size
    val im = Array(n) { Vector(n) }
    for (i in 0 until n) im[i][i] = 1.0
    for (i in 0 until n) {
        var max = m[i][i]
        var row = i
        for (j in i until n) {
            if (m[j][i] > max) {
                max = m[j][i]
                row = j
            }
        }
        if (i != row) {
            val t = im[i]
            im[i] = im[row]
            im[row] = t
        }
    }
    return im
} 
                
fun lu(a: Matrix): Array<Matrix> {
    val n = a.size
    val l = Array(n) { Vector(n) }
    val u = Array(n) { Vector(n) }
    val p = pivotize(a)
    val a2 = p * a
 
    for (j in 0 until n) {
        l[j][j] = 1.0
        for (i in 0 until j + 1) {
            var sum = 0.0
            for (k in 0 until i) sum += u[k][j] * l[i][k]
            u[i][j] = a2[i][j] - sum
        }
        for (i in j until n) {
            var sum2 = 0.0   
            for(k in 0 until j) sum2 += u[k][j] * l[i][k]
            l[i][j] = (a2[i][j] - sum2) / u[j][j]
        }
    } 
    return arrayOf(l, u, p)
}
   
fun printMatrix(title: String, m: Matrix, f: String) {
    val n = m.size
    println("\n$title\n")
    for (i in 0 until n) {
        for (j in 0 until n) print("${f.format(m[i][j])}  ")
        println()
    }
}

fun main(args: Array<String>) {
    val a1 = arrayOf(
        doubleArrayOf( 1.0,  3.0,  5.0),
        doubleArrayOf( 2.0,  4.0,  7.0),
        doubleArrayOf( 1.0,  1.0,  0.0)
    )
    val (l1, u1, p1) = lu(a1)
    println("EXAMPLE 1:-")
    printMatrix("A:", a1, "%1.0f")
    printMatrix("L:", l1, "% 7.5f")
    printMatrix("U:", u1, "% 8.5f")
    printMatrix("P:", p1, "%1.0f")

    val a2 = arrayOf(
        doubleArrayOf(11.0,  9.0, 24.0,  2.0),
        doubleArrayOf( 1.0,  5.0,  2.0,  6.0),
        doubleArrayOf( 3.0, 17.0, 18.0,  1.0),
        doubleArrayOf( 2.0,  5.0,  7.0,  1.0)
    )
    val (l2, u2, p2) = lu(a2)
    println("\nEXAMPLE 2:-")
    printMatrix("A:", a2, "%2.0f")
    printMatrix("L:", l2, "%7.5f")
    printMatrix("U:", u2, "%8.5f")
    printMatrix("P:", p2, "%1.0f")
}
Output:
EXAMPLE 1:-

A:

1  3  5  
2  4  7  
1  1  0  

L:

 1.00000   0.00000   0.00000  
 0.50000   1.00000   0.00000  
 0.50000  -1.00000   1.00000  

U:

 2.00000   4.00000   7.00000  
 0.00000   1.00000   1.50000  
 0.00000   0.00000  -2.00000  

P:

0  1  0  
1  0  0  
0  0  1  

EXAMPLE 2:-

A:

11   9  24   2  
 1   5   2   6  
 3  17  18   1  
 2   5   7   1  

L:

1.00000  0.00000  0.00000  0.00000  
0.27273  1.00000  0.00000  0.00000  
0.09091  0.28750  1.00000  0.00000  
0.18182  0.23125  0.00360  1.00000  

U:

11.00000   9.00000  24.00000   2.00000  
 0.00000  14.54545  11.45455   0.45455  
 0.00000   0.00000  -3.47500   5.68750  
 0.00000   0.00000   0.00000   0.51079  

P:

1  0  0  0  
0  0  1  0  
0  1  0  0  
0  0  0  1  

Lobster

import std

// derived from JAMA v1.03

// rectangular input array A is transformed in place to LU form

def LUDecomposition(LU):
   // Use a "left-looking", dot-product, Crout/Doolittle algorithm.
   let m = LU.length
   let n = LU[0].length
   let piv = map(m): _
   var pivsign = 1
   let LUcolj = map(m): 0.0
   // Outer loop.
   for(n) j:
      // Make a copy of the j-th column to localize references
      for(m) i:
         LUcolj[i] = LU[i][j]
      // Apply previous transformations
      for(m) i:
         let LUrowi = LU[i]
         // Most of the time is spent in the following dot product
         let kmax = min(i,j)
         var s = 0.0
         for(kmax) k:
            s += LUrowi[k] * LUcolj[k]
         s = LUcolj[i] - s
         LUcolj[i] = s
         LUrowi[j] = s
      // Find pivot and exchange if necessary.
      var p = j
      var i = j+1
      while i < m:
         if abs(LUcolj[i]) > abs(LUcolj[p]):
            p = i
         i += 1
      if p != j:
         for(n) k:
            let t = LU[p][k]
            LU[p][k] = LU[j][k]
            LU[j][k] = t
         let k = piv[p]
         piv[p] = piv[j]
         piv[j] = k
         pivsign = -pivsign
      // Compute multipliers.
      if j < m and LU[j][j] != 0.0:
         i = j+1
         while i < m:
            LU[i][j] /= LU[j][j]
            i += 1
   return piv

def print_A(A):
   print "A:"
   for(A) row:
      print row

def print_L(LU):
   print "L:"
   for(LU) lurow, i:
      let row = map(lurow.length): 0.0
      for(lurow) x, j:
         if i > j:
            row[j] = x
         else: if i == j:
            row[j] = 1.0
      print row

def print_U(LU):
   print "U:"
   for(LU) lurow, i:
      let row = map(lurow.length): 0.0
      for(lurow) x, j:
         if i <= j:
            row[j] = x
      print row

def print_P(piv):
   print "P:"
   for(piv) j:
      let row = map(piv.length): 0
      row[j] = 1
      print row

var A = [[1., 3., 5.],
         [2., 4., 7.],
         [1., 1., 0.]]

print_A A
var piv = LUDecomposition(A)
print_L A
print_U A
print_P piv

A = [[11.,  9., 24., 2.],
     [ 1.,  5.,  2., 6.],
     [ 3., 17., 18., 1.],
     [ 2.,  5.,  7., 1.]]

print_A A
piv = LUDecomposition(A)
print_L A
print_U A
print_P piv
Output:
A:
[1.0, 3.0, 5.0]
[2.0, 4.0, 7.0]
[1.0, 1.0, 0.0]
L:
[1.0, 0.0, 0.0]
[0.5, 1.0, 0.0]
[0.5, -1.0, 1.0]
U:
[2.0, 4.0, 7.0]
[0.0, 1.0, 1.5]
[0.0, 0.0, -2.0]
P:
[0, 1, 0]
[1, 0, 0]
[0, 0, 1]
A:
[11.0, 9.0, 24.0, 2.0]
[1.0, 5.0, 2.0, 6.0]
[3.0, 17.0, 18.0, 1.0]
[2.0, 5.0, 7.0, 1.0]
L:
[1.0, 0.0, 0.0, 0.0]
[0.272727272727, 1.0, 0.0, 0.0]
[0.090909090909, 0.2875, 1.0, 0.0]
[0.181818181818, 0.23125, 0.003597122302, 1.0]
U:
[11.0, 9.0, 24.0, 2.0]
[0.0, 14.54545454545, 11.45454545454, 0.454545454545]
[0.0, 0.0, -3.475, 5.6875]
[0.0, 0.0, 0.0, 0.510791366906]
P:
[1, 0, 0, 0]
[0, 0, 1, 0]
[0, 1, 0, 0]
[0, 0, 0, 1]

Maple

A:=<<1.0|3.0|5.0>,<2.0|4.0|7.0>,<1.0|1.0|0.0>>:

LinearAlgebra:-LUDecomposition(A);
Output:
    [0  1  0]  [              1.0   0.   0.]  [2.  4.                7.]
    [       ]  [                           ]  [                        ]
    [1  0  0], [0.500000000000000  1.0   0.], [0.  1.  1.50000000000000]
    [       ]  [                           ]  [                        ]
    [0  0  1]  [0.500000000000000  -1.  1.0]  [0.  0.               -2.]
A:=<<11.0|9.0|24.0|2.0>,<1.0|5.0|2.0|6.0>,
    <3.0|17.0|18.0|1.0>,<2.0|5.0|7.0|1.0>>:

with(LinearAlgebra):

LUDecomposition(A);
Output:
    [1  0  0  0]  
    [          ]  
    [0  0  1  0]  
    [          ], 
    [0  1  0  0]  
    [          ]  
    [0  0  0  1]  

      [               1.0                 0.                   0.   0.]  
      [                                                               ]  
      [ 0.272727272727273                1.0                   0.   0.]  
      [                                                               ], 
      [0.0909090909090909  0.287500000000000                  1.0   0.]  
      [                                                               ]  
      [ 0.181818181818182  0.231250000000000  0.00359712230215807  1.0]  

      [11.                9.                24.                 2.]
      [                                                           ]
      [ 0.  14.5454545454545   11.4545454545455  0.454545454545455]
      [                                                           ]
      [ 0.                0.  -3.47500000000000   5.68750000000000]
      [                                                           ]
      [ 0.                0.                 0.  0.510791366906476]

Mathematica/Wolfram Language

(*Ex1*)a = {{1, 3, 5}, {2, 4, 7}, {1, 1, 0}};
{lu, p, c} = LUDecomposition[a];
l = LowerTriangularize[lu, -1] + IdentityMatrix[Length[p]];
u = UpperTriangularize[lu];
P = Part[IdentityMatrix[Length[p]], p] ;
MatrixForm /@ {P.a , P, l, u, l.u}

(*Ex2*)a = {{11, 9, 24, 2}, {1, 5, 2, 6}, {3, 17, 18, 1}, {2, 5, 7, 1}};
{lu, p, c} = LUDecomposition[a];
l = LowerTriangularize[lu, -1] + IdentityMatrix[Length[p]];
u = UpperTriangularize[lu];
P = Part[IdentityMatrix[Length[p]], p] ;
MatrixForm /@ {P.a , P, l, u, l.u}
Output:

MATLAB / Octave

LU decomposition is part of language

  A = [
  1   3   5
  2   4   7
  1   1   0];

  [L,U,P] = lu(A)
Output:
  L =

   1.00000   0.00000   0.00000
   0.50000   1.00000   0.00000
   0.50000  -1.00000   1.00000

  U =

   2.00000   4.00000   7.00000
   0.00000   1.00000   1.50000
   0.00000   0.00000  -2.00000

  P =

   0   1   0
   1   0   0
   0   0   1

2nd example:

  A = [
   11    9   24    2
    1    5    2    6
    3   17   18    1
    2    5    7    1 ];

  [L,U,P] = lu(A)
Output:
  L =

   1.00000   0.00000   0.00000   0.00000
   0.27273   1.00000   0.00000   0.00000
   0.09091   0.28750   1.00000   0.00000
   0.18182   0.23125   0.00360   1.00000

  U =

   11.00000    9.00000   24.00000    2.00000
    0.00000   14.54545   11.45455    0.45455
    0.00000    0.00000   -3.47500    5.68750
    0.00000    0.00000    0.00000    0.51079

  P =

   1   0   0   0
   0   0   1   0
   0   1   0   0
   0   0   0   1 

Creating a MATLAB function

function [ P, L, U ] = LUdecomposition(A)

% Ensures A is n by n
sz = size(A);
if sz(1)~=sz(2)
    fprintf('A is not n by n\n');
    clear x;
    return;
end

n = sz(1);
L = eye(n);
P = eye(n);
U = A;

for i=1:sz(1)

    % Row reducing    
    if U(i,i)==0
        maximum = max(abs(U(i:end,1)));
        for k=1:n
           if maximum == abs(U(k,i))
               temp = U(1,:);
               U(1,:) = U(k,:);
               U(k,:) = temp;
               
               temp = P(:,1);
               P(1,:) = P(k,:);
               P(k,:) = temp;
           end
        end
        
    end
    
    if U(i,i)~=1
        temp = eye(n);
        temp(i,i)=U(i,i);
        L = L * temp;
        U(i,:) = U(i,:)/U(i,i); %Ensures the pivots are 1.
    end
    
    if i~=sz(1)
        
        for j=i+1:length(U)
            temp = eye(n);
            temp(j,i) = U(j,i);
            L = L * temp;
            U(j,:) = U(j,:)-U(j,i)*U(i,:);
            
        end
    end
    
    
end
P = P';
end

Maxima

/* LU decomposition is built-in */

a: hilbert_matrix(4)$

/* LU in "packed" form */

lup: lu_factor(a);
/* [matrix([1,   1/2,  1/3,   1/4   ],
           [1/2, 1/12, 1/12,  3/40  ],
           [1/3, 1,    1/180, 1/120 ],
           [1/4, 9/10, 3/2,   1/2800]),
    [1, 2, 3, 4], generalring] */

/* extract actual factors */

get_lu_factors(lup);
/* [matrix([1, 0, 0, 0],
           [0, 1, 0, 0],
           [0, 0, 1, 0],
           [0, 0, 0, 1]),
           
    matrix([1,   0,    0,   0],
           [1/2, 1,    0,   0],
           [1/3, 1,    1,   0],
           [1/4, 9/10, 3/2, 1]),
           
    matrix([1, 1/2,  1/3,   1/4   ],
           [0, 1/12, 1/12,  3/40  ],
           [0, 0,    1/180, 1/120 ],
           [0, 0,    0,     1/2800])
   ] */

/* solve for a given right-hand side */

lu_backsub(lup, transpose([1, 1, -1, -1]));
/* matrix([-204], [2100], [-4740], [2940]) */

Nim

Translation of: Kotlin
Library: strfmt

The matrices are represented by static arrays rather than sequences. This allows to use 1-based indexes which, in this case, is somewhat more natural than O-based indexes. Of course, as all is static, we have to rely heavily on generics.

For display, we use the third party module "strfmt" which allows to specify dynamically the format.

import macros, strutils
import strfmt

type

  Matrix[M, N: static int] = array[1..M, array[1..N, float]]
  SquareMatrix[N: static int] = Matrix[N, N]


# Templates to allow to use more natural notation for indexing.
template `[]`(m: Matrix; i, j: int): float = m[i][j]
template `[]=`(m: Matrix; i, j: int; val: float) = m[i][j] = val


func `*`[M, N, P: static int](a: Matrix[M, N]; b: Matrix[N, P]): Matrix[M, P] =
  ## Matrix multiplication.
  for i in 1..M:
    for j in 1..P:
      for k in 1..N:
        result[i, j] += a[i, k] * b[k, j]


func pivotize[N: static int](m: SquareMatrix[N]): SquareMatrix[N] =

  for i in 1..N: result[i, i] = 1

  for i in 1..N:
    var max = m[i, i]
    var row = i
    for j in i..N:
      if m[j, i] > max:
        max = m[j, i]
        row = j
    if i != row:
      swap result[i], result[row]


func lu[N: static int](m: SquareMatrix[N]): tuple[l, u, p: SquareMatrix[N]] =

  result.p = m.pivotize()
  let m2 = result.p * m

  for j in 1..N:
    result.l[j, j] = 1
    for i in 1..j:
      var sum = 0.0
      for k in 1..<i: sum += result.u[k, j] * result.l[i, k]
      result.u[i, j] = m2[i, j] - sum
    for i in j..N:
      var sum = 0.0
      for k in 1..<j: sum += result.u[k, j] * result.l[i, k]
      result.l[i, j] = (m2[i, j] - sum) / result.u[j, j]


proc print(m: Matrix; title, f: string) =
  echo '\n', title
  for i in 1..m.N:
    for j in 1..m.N:
      stdout.write m[i, j].format(f), "  "
    stdout.write '\n'


when isMainModule:

  const A1: SquareMatrix[3] = [[1.0, 3.0, 5.0],
                               [2.0, 4.0, 7.0],
                               [1.0, 1.0, 0.0]]

  let (l1, u1, p1) = A1.lu()
  echo "\nExample 2:"
  A1.print("A:", "1.0f")
  l1.print("L:", "8.5f")
  u1.print("U:", "8.5f")
  p1.print("P:", "1.0f")


  const A2: SquareMatrix[4] = [[11.0,  9.0, 24.0,  2.0],
                               [ 1.0,  5.0,  2.0,  6.0],
                               [ 3.0, 17.0, 18.0,  1.0],
                               [ 2.0,  5.0,  7.0,  1.0]]

  let (l2, u2, p2) = A2.lu()
  echo "Example 1:"
  A2.print("A:", "2.0f")
  l2.print("L:", "8.5f")
  u2.print("U:", "8.5f")
  p2.print("P:", "1.0f")
Output:
Example 1:

A:
1  3  5  
2  4  7  
1  1  0  

L:
 1.00000   0.00000   0.00000  
 0.50000   1.00000   0.00000  
 0.50000  -1.00000   1.00000  

U:
 2.00000   4.00000   7.00000  
 0.00000   1.00000   1.50000  
 0.00000   0.00000  -2.00000  

P:
0  1  0  
1  0  0  
0  0  1  

Example 2:

A:
11   9  24   2  
 1   5   2   6  
 3  17  18   1  
 2   5   7   1  

L:
 1.00000   0.00000   0.00000   0.00000  
 0.27273   1.00000   0.00000   0.00000  
 0.09091   0.28750   1.00000   0.00000  
 0.18182   0.23125   0.00360   1.00000  

U:
11.00000   9.00000  24.00000   2.00000  
 0.00000  14.54545  11.45455   0.45455  
 0.00000   0.00000  -3.47500   5.68750  
 0.00000   0.00000   0.00000   0.51079  

P:
1  0  0  0  
0  0  1  0  
0  1  0  0  
0  0  0  1  

PARI/GP

matlup(M) =
{
  my (L = matid(#M), U = M, P = L);

  for (i = 1, #M-1,     \\ pivoting
    p = M[z=i,i];
    for (k = i, #M, if (M[k,i] > p, p = M[z=k,i]));

    if (i != z,         \\ swap rows
      k = U[i,]; U[i,] = U[z,]; U[z,] = k;
      k = P[i,]; P[i,] = P[z,]; P[z,] = k;
    );
  );

  for (i = 1, #M-1,     \\ decompose
    for (k = i+1, #M,
      L[k,i] = U[k,i] / U[i,i];
      for (j = i, #M, U[k,j] -= L[k,i] * U[i,j])
    )
  );

  [L,U,P]       \\ return L,U,P triple matrix
}

Output:

gp > [L,U,P] = matlup([1,3,5;2,4,7;1,1,0]);

gp > L

[  1  0 0]

[1/2  1 0]

[1/2 -1 1]

gp > U
 
[2 4   7]

[0 1 3/2]

[0 0  -2]

gp > P

[0 1 0]

[1 0 0]

[0 0 1]

gp > [L,U,P] = matlup([11,9,24,2;1,5,2,6;3,17,18,1;2,5,7,1]);

gp > L

[   1      0     0 0]

[3/11      1     0 0]

[1/11  23/80     1 0]

[2/11 37/160 1/278 1]

gp > U

[11      9      24      2]

[ 0 160/11  126/11   5/11]

[ 0      0 -139/40  91/16]

[ 0      0       0 71/139]

gp > P

[1 0 0 0]

[0 0 1 0]

[0 1 0 0]

[0 0 0 1]

Perl

Translation of: Raku
use List::Util qw(sum);

for $test (
    [[1, 3, 5],
     [2, 4, 7],
     [1, 1, 0]],

    [[11,  9, 24,  2],
     [ 1,  5,  2,  6],
     [ 3, 17, 18,  1],
     [ 2,  5,  7,  1]]
) {
    my($P, $AP, $L, $U) = lu(@$test);
    say_it('A matrix', @$test);
    say_it('P matrix',  @$P);
    say_it('AP matrix', @$AP);
    say_it('L matrix',  @$L);
    say_it('U matrix',  @$U);

}

sub lu {
    my (@a) = @_;
    my $n = +@a;
    my @P  = pivotize(@a);
    my $AP = mmult(\@P, \@a);
    my @L  = matrix_ident($n);
    my @U  = matrix_zero($n);
    for $i (0..$n-1) {
        for $j (0..$n-1) {
            if ($j >= $i) {
                $U[$i][$j] =  $$AP[$i][$j] - sum map { $U[$_][$j] * $L[$i][$_] } 0..$i-1;
            } else {
                $L[$i][$j] = ($$AP[$i][$j] - sum map { $U[$_][$j] * $L[$i][$_] } 0..$j-1) / $U[$j][$j];
            }
        }
    }
    return \@P, $AP, \@L, \@U;
}

sub pivotize {
    my(@m) = @_;
    my $size = +@m;
    my @id = matrix_ident($size);
    for $i (0..$size-1) {
        my $max = $m[$i][$i];
        my $row = $i;
        for $j ($i .. $size-2) {
            if ($m[$j][$i] > $max) {
                $max = $m[$j][$i];
                $row = $j;
            }
        }
        ($id[$row],$id[$i]) = ($id[$i],$id[$row]) if $row != $i;
    }
    @id
}

sub matrix_zero  { my($n) = @_; map { [ (0) x $n ] } 0..$n-1 }
sub matrix_ident { my($n) = @_; map { [ (0) x $_, 1, (0) x ($n-1 - $_) ] } 0..$n-1 }

sub mmult {
  local *a = shift;
  local *b = shift;
  my @p = [];
  my $rows = @a;
  my $cols = @{ $b[0] };
  my $n = @b - 1;
  for (my $r = 0 ; $r < $rows ; ++$r) {
      for (my $c = 0 ; $c < $cols ; ++$c) {
          $p[$r][$c] += $a[$r][$_] * $b[$_][$c] foreach 0 .. $n;
      }
  }
  return [@p];
}

sub say_it {
    my($message, @array) = @_;
    print "$message\n";
    $line = sprintf join("\n" => map join(" " => map(sprintf("%8.5f", $_), @$_)), @{+\@array})."\n";
    $line =~ s/\.00000/      /g;
    $line =~ s/0000\b/    /g;
    print "$line\n";
}
Output:
A matrix
 1        3        5
 2        4        7
 1        1        0

P matrix
 0        1        0
 1        0        0
 0        0        1

AP matrix
 2        4        7
 1        3        5
 1        1        0

L matrix
 1        0        0
 0.5      1        0
 0.5     -1        1

U matrix
 2        4        7
 0        1        1.5
 0        0       -2

A matrix
11        9       24        2
 1        5        2        6
 3       17       18        1
 2        5        7        1

P matrix
 1        0        0        0
 0        0        1        0
 0        1        0        0
 0        0        0        1

AP matrix
11        9       24        2
 3       17       18        1
 1        5        2        6
 2        5        7        1

L matrix
 1        0        0        0
 0.27273  1        0        0
 0.09091  0.28750  1        0
 0.18182  0.23125  0.00360  1

U matrix
11        9       24        2
 0       14.54545 11.45455  0.45455
 0        0       -3.47500  5.68750
 0        0        0        0.51079

Phix

Translation of: Kotlin
with javascript_semantics
function matrix_mul(sequence a, sequence b)
    if length(a[1]) != length(b) then
        return 0
    end if
    sequence c = repeat(repeat(0,length(b[1])),length(a))
    for i=1 to length(a) do
        for j=1 to length(b[1]) do
            for k=1 to length(a[1]) do
                c[i][j] += a[i][k]*b[k][j]
            end for
        end for
    end for
    return c
end function
 
function pivotize(sequence m)
    integer n = length(m)
    sequence im = repeat(repeat(0,n),n)
    for i=1 to n do
        im[i][i] = 1
    end for
    for i=1 to n do
        atom mx = m[i][i]
        integer row = i
        for j=i to n do
            if m[j][i]>mx then
                mx = m[j][i]
                row = j
            end if
        end for
        if i!=row then
            {im[i],im[row]} = {im[row],im[i]}
        end if
    end for
    return im
end function
 
function lu(sequence a)
    integer n = length(a)
    sequence l = repeat(repeat(0,n),n),
             u = repeat(repeat(0,n),n),
             p = pivotize(a),
             a2 = matrix_mul(p,a)
 
    for j=1 to n do
        l[j][j] = 1.0
        for i=1 to j do
            atom sum1 = 0.0
            for k=1 to i do
                sum1 += u[k][j] * l[i][k]
            end for
            u[i][j] = a2[i][j] - sum1
        end for
        for i=j+1 to n do
            atom sum2 = 0.0  
            for k=1 to j do
                sum2 += u[k][j] * l[i][k]
            end for
            l[i][j] = (a2[i][j] - sum2) / u[j][j]
        end for
    end for
    return {a, l, u, p}
end function
 
constant a = {{{1, 3, 5},
               {2, 4, 7},
               {1, 1, 0}},
              {{11, 9,24, 2},
               { 1, 5, 2, 6},
               { 3,17,18, 1},
               { 2, 5, 7, 1}}}
for i=1 to length(a) do
    ?"== a,l,u,p: =="
    pp(lu(a[i]),{pp_Nest,2,pp_Pause,0})
end for
Output:
"== a,l,u,p: =="
{{{1,3,5},
  {2,4,7},
  {1,1,0}},
 {{1,0,0},
  {0.5,1,0},
  {0.5,-1,1}},
 {{2,4,7},
  {0,1,1.5},
  {0,0,-2}},
 {{0,1,0},
  {1,0,0},
  {0,0,1}}}
"== a,l,u,p: =="
{{{11,9,24,2},
  {1,5,2,6},
  {3,17,18,1},
  {2,5,7,1}},
 {{1,0,0,0},
  {0.2727272727,1,0,0},
  {0.0909090909,0.2875,1,0},
  {0.1818181818,0.23125,0.0035971223,1}},
 {{11,9,24,2},
  {0,14.54545455,11.45454545,0.4545454545},
  {0,0,-3.475,5.6875},
  {0,0,0,0.5107913669}},
 {{1,0,0,0},
  {0,0,1,0},
  {0,1,0,0},
  {0,0,0,1}}}

PL/I

(subscriptrange, fofl, size):                         /* 2 Nov. 2013 */
LU_Decomposition: procedure options (main);
   declare a1(3,3) float (18) initial ( 1,  3,  5,
                                        2,  4,  7,
                                        1,  1,  0);
   declare a2(4,4) float (18) initial (11,  9, 24, 2,
                                        1,  5,  2, 6,
                                        3, 17, 18, 1,
                                        2,  5,  7, 1);
   call check(a1);
   call check(a2);


/* In-situ decomposition */
LU: procedure(a, p);
   declare a(*,*) float (18);
   declare p(*)   fixed binary;
   declare (maximum, rtemp) float (18);
   declare (n, i, j, k, ii, temp) fixed binary;

   n = hbound(a,1);
   do i = 1 to n; p(i) = i; end;

   do k = 1 to n-1;

      maximum = 0; ii = k;
      do i = k to n;
         if maximum < abs(a(p(i),k)) then
            do; maximum = abs(a(p(i),k)); ii = i; end;
      end;
      if ii ^= k then do; temp = p(k); p(k) = p(ii); p(ii) = temp; end;

      do i = k+1 to n; a(p(i),k) = a(p(i),k) / a(p(k),k); end;

      do j = k+1 to n;
         do i = k+1 to n;
            a(p(i),j) = a(p(i),j) - a(p(i),k) * a(p(k),j);
         end;
      end;

   end;
end LU;

CHECK: procedure(a);
   declare a(*,*) float (18) nonassignable;

   declare aa(hbound(a,1), hbound(a,2)) float (18);
   declare  L(hbound(a,1), hbound(a,2)) float (18);
   declare  U(hbound(a,1), hbound(a,2)) float (18);
   declare (p(hbound(a,1), hbound(a,2)), ipiv(hbound(a,1)) ) fixed binary;
   declare pp(hbound(a,1), hbound(a,2)) fixed binary;
   declare (i, j, n, temp(hbound(a,1))) fixed binary;

   n = hbound(a,1);
   aa = A; /* work with a copy */
   P = 0; L = 0; U = 0;
   do i = 1 to n;
      p(i,i) = 1; L(i,i) = 1; /* convert permutation vector to a matrix */
   end;

   call LU(aa, ipiv);

   do i = 1 to n;
      do j = 1 to i-1; L(i,j) = aa(ipiv(i),j); end;
      do j = i to n;   U(i,j) = aa(ipiv(i),j); end;
   end;

   pp = p;
   do i = 1 to n;
      p(ipiv(i), *) = pp(i,*);
   end;

   put skip list ('A');
   put edit (A) (skip, (n) f(10,5));

   put skip list ('P');
   put edit (P) (skip, (n) f(11));

   put skip list ('L');
   put edit (L) (skip, (n) f(10,5));

   put skip list ('U');
   put edit (U) (skip, (n) f(10,5));

end CHECK;

end LU_Decomposition;

Derived from Fortran version above. Results:

A 
   1.00000   3.00000   5.00000
   2.00000   4.00000   7.00000
   1.00000   1.00000   0.00000
P 
          0          1          0
          1          0          0
          0          0          1
L 
   1.00000   0.00000   0.00000
   0.50000   1.00000   0.00000
   0.50000  -1.00000   1.00000
U 
   2.00000   4.00000   7.00000
   0.00000   1.00000   1.50000
   0.00000   0.00000  -2.00000
A 
  11.00000   9.00000  24.00000   2.00000
   1.00000   5.00000   2.00000   6.00000
   3.00000  17.00000  18.00000   1.00000
   2.00000   5.00000   7.00000   1.00000
P 
          1          0          0          0
          0          0          1          0
          0          1          0          0
          0          0          0          1
L 
   1.00000   0.00000   0.00000   0.00000
   0.27273   1.00000   0.00000   0.00000
   0.09091   0.28750   1.00000   0.00000
   0.18182   0.23125   0.00360   1.00000
U 
  11.00000   9.00000  24.00000   2.00000
   0.00000  14.54545  11.45455   0.45455
   0.00000   0.00000  -3.47500   5.68750
   0.00000   0.00000   0.00000   0.51079

Python

Translation of: D
from pprint import pprint

def matrixMul(A, B):
    TB = zip(*B)
    return [[sum(ea*eb for ea,eb in zip(a,b)) for b in TB] for a in A]

def pivotize(m):
    """Creates the pivoting matrix for m."""
    n = len(m)
    ID = [[float(i == j) for i in xrange(n)] for j in xrange(n)]
    for j in xrange(n):
        row = max(xrange(j, n), key=lambda i: abs(m[i][j]))
        if j != row:
            ID[j], ID[row] = ID[row], ID[j]
    return ID

def lu(A):
    """Decomposes a nxn matrix A by PA=LU and returns L, U and P."""
    n = len(A)
    L = [[0.0] * n for i in xrange(n)]
    U = [[0.0] * n for i in xrange(n)]
    P = pivotize(A)
    A2 = matrixMul(P, A)
    for j in xrange(n):
        L[j][j] = 1.0
        for i in xrange(j+1):
            s1 = sum(U[k][j] * L[i][k] for k in xrange(i))
            U[i][j] = A2[i][j] - s1
        for i in xrange(j, n):
            s2 = sum(U[k][j] * L[i][k] for k in xrange(j))
            L[i][j] = (A2[i][j] - s2) / U[j][j]
    return (L, U, P)

a = [[1, 3, 5], [2, 4, 7], [1, 1, 0]]
for part in lu(a):
    pprint(part, width=19)
    print
print
b = [[11,9,24,2],[1,5,2,6],[3,17,18,1],[2,5,7,1]]
for part in lu(b):
    pprint(part)
    print
Output:
[[1.0, 0.0, 0.0],
 [0.5, 1.0, 0.0],
 [0.5, -1.0, 1.0]]

[[2.0, 4.0, 7.0],
 [0.0, 1.0, 1.5],
 [0.0, 0.0, -2.0]]

[[0.0, 1.0, 0.0],
 [1.0, 0.0, 0.0],
 [0.0, 0.0, 1.0]]


[[1.0, 0.0, 0.0, 0.0],
 [0.27272727272727271, 1.0, 0.0, 0.0],
 [0.090909090909090912, 0.28749999999999998, 1.0, 0.0],
 [0.18181818181818182, 0.23124999999999996, 0.0035971223021580693, 1.0]]

[[11.0, 9.0, 24.0, 2.0],
 [0.0, 14.545454545454547, 11.454545454545455, 0.45454545454545459],
 [0.0, 0.0, -3.4749999999999996, 5.6875],
 [0.0, 0.0, 0.0, 0.51079136690647597]]

[[1.0, 0.0, 0.0, 0.0],
 [0.0, 0.0, 1.0, 0.0],
 [0.0, 1.0, 0.0, 0.0],
 [0.0, 0.0, 0.0, 1.0]]

R

library(Matrix)
A <- matrix(c(1, 3, 5, 2, 4, 7, 1, 1, 0), 3, 3, byrow=T)
dim(A) <- c(3, 3)
expand(lu(A))
Output:
$L
3 x 3 Matrix of class "dtrMatrix" (unitriangular)
     [,1] [,2] [,3]
[1,]  1.0    .    .
[2,]  0.5  1.0    .
[3,]  0.5 -1.0  1.0

$U
3 x 3 Matrix of class "dtrMatrix"
     [,1] [,2] [,3]
[1,]  2.0  4.0  7.0
[2,]    .  1.0  1.5
[3,]    .    . -2.0

$P
3 x 3 sparse Matrix of class "pMatrix"
          
[1,] . | .
[2,] | . .
[3,] . . |

Racket

#lang racket
(require math)
(define A (matrix
           [[1   3   5]
            [2   4   7]
            [1   1   0]]))

(matrix-lu A)
; result:
; (mutable-array #[#[1 0 0] 
;                  #[2 1 0] 
;                  #[1 1 1]])
; (mutable-array #[#[1 3 5] 
;                  #[0 -2 -3] 
;                  #[0 0 -2]])

Raku

(formerly Perl 6)

Translation of Ruby.

for (  [1, 3, 5], # Test Matrices
       [2, 4, 7],
       [1, 1, 0]
    ),
    (  [11,  9, 24,  2],
       [ 1,  5,  2,  6],
       [ 3, 17, 18,  1],
       [ 2,  5,  7,  1]
    )
    -> @test {
    say-it 'A Matrix', @test;
    say-it( .[0], @(.[1]) ) for 'P Matrix', 'Aʼ Matrix', 'L Matrix', 'U Matrix' Z, lu @test;
}

sub lu (@a) {
    die unless @a.&is-square;
    my $n  = @a;
    my @P  = pivotize @a;
    my @Aʼ = mmult @P, @a;
    my @L  = matrix-ident $n;
    my @U  = matrix-zero  $n;
    for ^$n X ^$n -> ($i,$j) {
        if $j$i { @U[$i;$j] =  @Aʼ[$i;$j] - [+] map { @U[$_;$j] × @L[$i;$_] }, ^$i              }
        else       { @L[$i;$j] = (@Aʼ[$i;$j] - [+] map { @U[$_;$j] × @L[$i;$_] }, ^$j) / @U[$j;$j] }
    }
    @P, @Aʼ, @L, @U;
}

sub pivotize (@m) {
    my $size = @m;
    my @id   = matrix-ident $size;
    for ^$size -> $i {
        my $max = @m[$i;$i];
        my $row = $i;
        for $i ..^ $size -> $j {
            if @m[$j;$i] > $max {
                $max = @m[$j;$i];
                $row = $j;
            }
        }
        @id[$row, $i] = @id[$i, $row] if $row != $i;
    }
    @id
}

sub is-square (@m) { so @m == all @m }

sub matrix-zero ($n, $m = $n) { map { [ flat 0 xx $n ] }, ^$m }

sub matrix-ident ($n) { map { [ flat 0 xx $_, 1, 0 xx $n - 1 - $_ ] }, ^$n }

sub mmult(@a,@b) {
    my @p;
    for ^@a X ^@b[0] -> ($r, $c) {
        @p[$r;$c] += @a[$r;$_] × @b[$_;$c] for ^@b;
    }
    @p
}

sub rat-int ($num) {
    return $num unless $num ~~ Rat;
    return $num.narrow if $num.narrow ~~ Int;
    $num.nude.join: '/';

}

sub say-it ($message, @array) {
    say "\n$message";
    $_».&rat-int.fmt("%7s").say for @array;
}
Output:
A Matrix
      1       3       5
      2       4       7
      1       1       0

P Matrix
      0       1       0
      1       0       0
      0       0       1

Aʼ Matrix
      2       4       7
      1       3       5
      1       1       0

L Matrix
      1       0       0
    1/2       1       0
    1/2      -1       1

U Matrix
      2       4       7
      0       1     3/2
      0       0      -2

A Matrix
     11       9      24       2
      1       5       2       6
      3      17      18       1
      2       5       7       1

P Matrix
      1       0       0       0
      0       0       1       0
      0       1       0       0
      0       0       0       1

Aʼ Matrix
     11       9      24       2
      3      17      18       1
      1       5       2       6
      2       5       7       1

L Matrix
      1       0       0       0
   3/11       1       0       0
   1/11   23/80       1       0
   2/11  37/160   1/278       1

U Matrix
     11       9      24       2
      0  160/11  126/11    5/11
      0       0 -139/40   91/16
      0       0       0  71/139

REXX

/*REXX program creates a  matrix  from console input, performs/shows  LU  decomposition.*/
#= 0;    P.= 0;    PA.= 0;     L.= 0;     U.= 0  /*initialize some variables to zero.   */
parse arg x                                      /*obtain matrix elements from the C.L. */
                  call bldAMat;       call showMat 'A'    /*build and display A  matrix.*/
                  call bldPmat;       call showMat 'P'    /*  "    "     "    P     "   */
                  call multMat;       call showMat 'PA'   /*  "    "     "    PA    "   */
  do y=1  for N;  call bldUmat;       call bldLmat        /*build     U  and  L     "   */
  end   /*y*/
                  call showMat 'L';   call showMat 'U'    /*display   L  and  U     "   */
exit                                             /*stick a fork in it,  we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
bldAMat: ?= words(x);  do N=1  for ?  until N**2>=?                /*find matrix size.  */
                       end  /*N*/
         if N**2\==?  then do;  say '***error*** wrong # of elements entered:'  ?;  exit 9
                           end
                  do    r=1  for N                                 /*build   A   matrix.*/
                     do c=1  for N;        #= # + 1;     _= word(x, #);        A.r.c= _
                     if \datatype(_, 'N')  then call er "element isn't numeric: "  _
                     end   /*c*/
                  end      /*r*/;          return
/*──────────────────────────────────────────────────────────────────────────────────────*/
bldLmat:          do    r=1  for N                                 /*build lower matrix.*/
                     do c=1  for N;        if r==c  then do;   L.r.c= 1;   iterate;    end
                     if c\==y | r==c | c>r  then iterate
                     _= PA.r.c
                                           do k=1  for c-1;    _= _   -   U.k.c * L.r.k
                                           end  /*k*/
                     L.r.c= _ / U.c.c
                     end   /*c*/
                  end      /*r*/;          return
/*──────────────────────────────────────────────────────────────────────────────────────*/
bldPmat: c= N;    do r=N  by -1  for N;    P.r.c= 1;     c= c + 1  /*build perm. matrix.*/
                  if c>N  then c= N%2;     if c==N  then c= 1
                  end   /*r*/;             return
/*──────────────────────────────────────────────────────────────────────────────────────*/
bldUmat:          do    r=1  for N;        if r\==y  then iterate  /*build upper matrix.*/
                     do c=1  for N;        if c<r    then iterate
                     _= PA.r.c
                                           do k=1  for r-1;     _= _   -   U.k.c * L.r.k
                                           end   /*k*/
                     U.r.c= _ / 1
                     end   /*c*/
                  end      /*r*/;          return
/*──────────────────────────────────────────────────────────────────────────────────────*/
multMat:          do      i=1  for N                 /*multiply matrix  P and A  ──► PA */
                     do   j=1  for N
                       do k=1  for N;      pa.i.j=  (pa.i.j   +   p.i.k * a.k.j)    /    1
                       end   /*k*/
                     end     /*j*/                   /*÷ by one does normalization [↑]. */
                  end        /*i*/;        return
/*──────────────────────────────────────────────────────────────────────────────────────*/
showMat: parse arg mat,rows,cols;   say;   rows= word(rows N,1);   cols= word(cols rows,1)
         w= 0;    do    r=1  for rows
                     do c=1  for cols;     w= max(w,  length( value( mat'.'r"."c ) ) )
                     end  /*c*/
                  end     /*r*/
         say center(mat  'matrix',  cols * (w + 1) + 7,  "─")      /*display the header.*/
                  do    r=1  for rows;     _=
                     do c=1  for cols;     _= _ right( value(mat'.'r"."c),   w + 1)
                     end   /*c*/
                  say _
                  end      /*r*/;       return
output   when using the input of:     1 3 5     2 4 7     1 1 0
──A matrix───
  1  3  5
  2  4  7
  1  1  0

──P matrix───
  0  1  0
  1  0  0
  0  0  1

──PA matrix──
  2  4  7
  1  3  5
  1  1  0

─────L matrix──────
    1    0    0
  0.5    1    0
  0.5   -1    1

─────U matrix──────
    2    4    7
    0    1  1.5
    0    0   -2
output   when using the input of:     11 9 24 2     1 5 2 6     3 17 18 1     2 5 7 1
─────A matrix──────
  11   9  24   2
   1   5   2   6
   3  17  18   1
   2   5   7   1

───P matrix────
  1  0  0  0
  0  0  1  0
  0  1  0  0
  0  0  0  1

─────PA matrix─────
  11   9  24   2
   3  17  18   1
   1   5   2   6
   2   5   7   1

───────────────────────────L matrix────────────────────────────
              1              0              0              0
    0.272727273              1              0              0
   0.0909090909    0.287500001              1              0
    0.181818182        0.23125  0.00359712804              1

───────────────────────U matrix────────────────────────
           11            9           24            2
            0   14.5454545   11.4545455   0.45454545
            0            0  -3.47500002       5.6875
            0            0            0  0.510791339

Ruby

require 'matrix'

class Matrix
  def lu_decomposition
    p = get_pivot
    tmp = p * self
    u = Matrix.zero(row_size).to_a
    l = Matrix.identity(row_size).to_a
    (0 ... row_size).each do |i|
      (0 ... row_size).each do |j|
        if j >= i
          # upper
          u[i][j] = tmp[i,j] - (0 ... i).inject(0.0) {|sum, k| sum + u[k][j] * l[i][k]}
        else
          # lower
          l[i][j] = (tmp[i,j] - (0 ... j).inject(0.0) {|sum, k| sum + u[k][j] * l[i][k]}) / u[j][j]
        end
      end
    end
    [ Matrix[*l], Matrix[*u], p ]
  end
  
  def get_pivot
    raise ArgumentError, "must be square" unless square?
    id = Matrix.identity(row_size).to_a
    (0 ... row_size).each do |i|
      max = self[i,i]
      row = i
      (i ... row_size).each do |j|
        if self[j,i] > max
          max = self[j,i]
          row = j
        end
      end
      id[i], id[row] = id[row], id[i]
    end
    Matrix[*id]
  end
  
  def pretty_print(format, head=nil)
    puts head if head
    puts each_slice(column_size).map{|row| format*row_size % row}
  end
end

puts "Example 1:"
a = Matrix[[1,  3,  5],
           [2,  4,  7],
           [1,  1,  0]]
a.pretty_print(" %2d", "A")
l, u, p = a.lu_decomposition
l.pretty_print(" %8.5f", "L")
u.pretty_print(" %8.5f", "U")
p.pretty_print(" %d",    "P")

puts "\nExample 2:"
a = Matrix[[11, 9,24,2], 
           [ 1, 5, 2,6], 
           [ 3,17,18,1], 
           [ 2, 5, 7,1]]
a.pretty_print(" %2d", "A")
l, u, p = a.lu_decomposition
l.pretty_print(" %8.5f", "L")
u.pretty_print(" %8.5f", "U")
p.pretty_print(" %d",    "P")
Output:
Example 1:
A
  1  3  5
  2  4  7
  1  1  0
L
  1.00000  0.00000  0.00000
  0.50000  1.00000  0.00000
  0.50000 -1.00000  1.00000
U
  2.00000  4.00000  7.00000
  0.00000  1.00000  1.50000
  0.00000  0.00000 -2.00000
P
 0 1 0
 1 0 0
 0 0 1

Example 2:
A
 11  9 24  2
  1  5  2  6
  3 17 18  1
  2  5  7  1
L
  1.00000  0.00000  0.00000  0.00000
  0.27273  1.00000  0.00000  0.00000
  0.09091  0.28750  1.00000  0.00000
  0.18182  0.23125  0.00360  1.00000
U
 11.00000  9.00000 24.00000  2.00000
  0.00000 14.54545 11.45455  0.45455
  0.00000  0.00000 -3.47500  5.68750
  0.00000  0.00000  0.00000  0.51079
P
 1 0 0 0
 0 0 1 0
 0 1 0 0
 0 0 0 1

Matrix has a lup_decomposition built-in method.

l, u, p = a.lup_decomposition
l.pretty_print(" %8.5f", "L")
u.pretty_print(" %8.5f", "U")
p.pretty_print(" %d",    "P")

Output is the same.

Rust

Library: ndarray
#![allow(non_snake_case)]
use ndarray::{Array, Axis, Array2, arr2, Zip, NdFloat, s};

fn main() {
    println!("Example 1:");
    let A: Array2<f64> = arr2(&[
        [1.0, 3.0, 5.0],
        [2.0, 4.0, 7.0],
        [1.0, 1.0, 0.0],
    ]);
    println!("A \n {}", A);
    let (L, U, P) = lu_decomp(A);
    println!("L \n {}", L);
    println!("U \n {}", U);
    println!("P \n {}", P);

    println!("\nExample 2:");
    let A: Array2<f64> = arr2(&[
        [11.0, 9.0, 24.0, 2.0],
        [1.0, 5.0, 2.0, 6.0],
        [3.0, 17.0, 18.0, 1.0],
        [2.0, 5.0, 7.0, 1.0],
    ]);
    println!("A \n {}", A);
    let (L, U, P) = lu_decomp(A);
    println!("L \n {}", L);
    println!("U \n {}", U);
    println!("P \n {}", P);
}

fn pivot<T>(A: &Array2<T>) -> Array2<T>
where T: NdFloat {
    let matrix_dimension = A.rows();
    let mut P: Array2<T> = Array::eye(matrix_dimension);
    for (i, column) in A.axis_iter(Axis(1)).enumerate() {
        // find idx of maximum value in column i
        let mut max_pos = i;
        for j in i..matrix_dimension {
            if column[max_pos].abs() < column[j].abs() {
                max_pos = j;
            }
        }
        // swap rows of P if necessary
        if max_pos != i {
            swap_rows(&mut P, i, max_pos);
        }
    }
    P
}

fn swap_rows<T>(A: &mut Array2<T>, idx_row1: usize, idx_row2: usize)
where T: NdFloat {
    // to swap rows, get two ArrayViewMuts for the corresponding rows
    // and apply swap elementwise using ndarray::Zip
    let (.., mut matrix_rest) = A.view_mut().split_at(Axis(0), idx_row1);
    let (row0, mut matrix_rest) = matrix_rest.view_mut().split_at(Axis(0), 1);
    let (_matrix_helper, mut matrix_rest) = matrix_rest.view_mut().split_at(Axis(0), idx_row2 - idx_row1 - 1);
    let (row1, ..) = matrix_rest.view_mut().split_at(Axis(0), 1);
    Zip::from(row0).and(row1).apply(std::mem::swap);
}

fn lu_decomp<T>(A: Array2<T>) -> (Array2<T>, Array2<T>, Array2<T>)
where T: NdFloat {

    let matrix_dimension = A.rows();
    assert_eq!(matrix_dimension, A.cols(), "Tried LU decomposition with a non-square matrix.");
    let P = pivot(&A);
    let pivotized_A = P.dot(&A);

    let mut L: Array2<T> = Array::eye(matrix_dimension);
    let mut U: Array2<T> = Array::zeros((matrix_dimension, matrix_dimension));
    for idx_col in 0..matrix_dimension {
        // fill U
        for idx_row in 0..idx_col+1 {
            U[[idx_row, idx_col]] = pivotized_A[[idx_row, idx_col]] -
                U.slice(s![0..idx_row,idx_col]).dot(&L.slice(s![idx_row,0..idx_row]));
        }
        // fill L
        for idx_row in idx_col+1..matrix_dimension {
            L[[idx_row, idx_col]] = (pivotized_A[[idx_row, idx_col]] -
                U.slice(s![0..idx_col,idx_col]).dot(&L.slice(s![idx_row,0..idx_col]))) /
                U[[idx_col, idx_col]];
        }
    }
    (L, U, P)
}


Output:
Example 1:
A 
 [[1, 3, 5],
 [2, 4, 7],
 [1, 1, 0]]
L 
 [[1, 0, 0],
 [0.5, 1, 0],
 [0.5, -1, 1]]
U 
 [[2, 4, 7],
 [0, 1, 1.5],
 [0, 0, -2]]
P 
 [[0, 1, 0],
 [1, 0, 0],
 [0, 0, 1]]

Example 2:
A 
 [[11, 9, 24, 2],
 [1, 5, 2, 6],
 [3, 17, 18, 1],
 [2, 5, 7, 1]]
L 
 [[1, 0, 0, 0],
 [0.2727272727272727, 1, 0, 0],
 [0.09090909090909091, 0.2875, 1, 0],
 [0.18181818181818182, 0.23124999999999996, 0.0035971223021580693, 1]]
U 
 [[11, 9, 24, 2],
 [0, 14.545454545454547, 11.454545454545455, 0.4545454545454546],
 [0, 0, -3.4749999999999996, 5.6875],
 [0, 0, 0, 0.510791366906476]]
P 
 [[1, 0, 0, 0],
 [0, 0, 1, 0],
 [0, 1, 0, 0],
 [0, 0, 0, 1]]

Alternative with abstalg=

Library: abstalg

This one implements a naive LU decomposition with arbitrary fields, so it works over Rational Numbers as well as floats, (or any field)

use abstalg::*;
pub struct Matrix2D<'a, F>
where
    F: Field,
{
    field: MatrixRing<F>,
    data: &'a mut Vec<<F as Domain>::Elem>,
    rows: usize,
    cols: usize,
}

impl<'a, F: Field + Clone> Matrix2D<'a, F> {
    pub fn new(field: F, data: &'a mut Vec<<F as Domain>::Elem>, rows: usize, cols: usize) -> Self {
        assert_eq!(rows * cols, data.len(), "Data does not match dimensions");
        Matrix2D {
            field: MatrixRing::<F>::new(field, rows),
            data,
            rows,
            cols,
        }
    }

    pub fn get(&self, row: usize, col: usize) -> &<F as Domain>::Elem {
        assert!(row < self.rows && col < self.cols, "Index out of bounds");
        &self.data[row * self.cols + col]
    }

    pub fn get_mut(&mut self, row: usize, col: usize) -> &mut <F as Domain>::Elem {
        assert!(row < self.rows && col < self.cols, "Index out of bounds");
        &mut self.data[row * self.cols + col]
    }

    pub fn get_row(&self, row: usize) -> Vec<<F as Domain>::Elem> {
        assert!(row < self.rows, "Row index out of bounds");
        let mut result = Vec::new();
        for col in 0..self.cols {
            result.push(self.get(row, col).clone());
        }
        result
    }

    pub fn get_col(&self, col: usize) -> Vec<<F as Domain>::Elem> {
        assert!(col < self.cols, "Column index out of bounds");
        let mut result = Vec::new();
        for row in 0..self.rows {
            result.push(self.get(row, col).clone());
        }
        result
    }

    pub fn set_row(&mut self, row: usize, new_row: Vec<<F as Domain>::Elem>) {
        assert!(row < self.rows, "Row index out of bounds");
        assert_eq!(new_row.len(), self.cols, "New row has wrong length");
        for col in 0..self.cols {
            *self.get_mut(row, col) = new_row[col].clone();
        }
    }

    pub fn set_col(&mut self, col: usize, new_col: Vec<<F as Domain>::Elem>) {
        assert!(col < self.cols, "Column index out of bounds");
        assert_eq!(new_col.len(), self.rows, "New column has wrong length");
        for row in 0..self.rows {
            *self.get_mut(row, col) = new_col[row].clone();
        }
    }

    pub fn swap_rows(&mut self, row1: usize, row2: usize) {
        assert!(
            row1 < self.rows && row2 < self.rows,
            "Row index out of bounds"
        );
        if row1 != row2 {
            for col in 0..self.cols {
                let temp = self.get(row1, col).clone();
                *self.get_mut(row1, col) = self.get(row2, col).clone();
                *self.get_mut(row2, col) = temp;
            }
        }
    }
    pub fn l_u_decomposition(&mut self) -> Result<Vec<<F as Domain>::Elem>, String>
    where
        F: Clone,
    {
        // Let base = field.base()
        let base = self.field.base().clone();
        // Let v_a = VectorAlgebra(base, cols)
        let v_a = VectorAlgebra::new(base.clone(), self.cols);
        // Let the_l_matrix = I (creates an identity matrix)
        let mut the_l_matrix: Vec<_> = self.field.int(1);
        // Let l_matrix = Matrix2D(base, the_l_matrix, rows, cols)
        let mut l_matrix = Matrix2D::new(base.clone(), &mut the_l_matrix, self.rows, self.cols);

        // For each pivot in min(rows, cols)
        for pivot in 0..std::cmp::min(self.rows, self.cols) {
            // Let pivot_row = self.get_row(pivot)
            let pivot_row = self.get_row(pivot);
            // If pivot element (pivot_row[pivot]) is zero, LU decomposition is not possible
            if base.is_zero(&pivot_row[pivot]) {
                return Err(
                    "LU decomposition without pivoting is not possible for this matrix".into(),
                );
            }
            // Let pivot_entry_inv = 1 / pivot_row[pivot]
            let pivot_entry_inv = base.inv(&pivot_row[pivot]);

            // For each row_idx in (pivot + 1) to rows
            for row_idx in (pivot + 1)..self.rows {
                // Let row = self.get_row(row_idx)
                let mut row = self.get_row(row_idx);
                // Let scale = row[pivot] * pivot_entry_inv
                let scale = base.mul(&row[pivot], &pivot_entry_inv);

                // row += -scale * pivot_row (Vector addition and scalar multiplication)
                v_a.add_assign(
                    &mut row,
                    &v_a.neg(&mul_vector(&v_a, scale.clone(), pivot_row.clone())),
                );

                // l_matrix[row_idx][pivot] = scale
                *l_matrix.get_mut(row_idx, pivot) = scale;
                // self.set_row(row_idx, row) (Sets the modified row back into the matrix)
                self.set_row(row_idx, row);
            }
        }
        // Returns the L matrix
        Ok(the_l_matrix)
    }

    pub fn p_l_u_decomposition(
        &self,
    ) -> Result<
        (
            Vec<<F as Domain>::Elem>,
            Vec<<F as Domain>::Elem>,
            Vec<<F as Domain>::Elem>,
        ),
        String,
    >
    where
        F: Clone,
    {
        let base = self.field.base().clone();
        let mut self2 = (*self.data).clone();
        let mut cloned_vector = Matrix2D::new(base.clone(), &mut self2, self.rows, self.cols);
        let mut pivot_row = 0;

        let mut the_p_matrix: Vec<_> = self.field.zero();
        let mut p_matrix = Matrix2D::new(base.clone(), &mut the_p_matrix, self.rows, self.cols);
        //let mut u_matrix = self.clone(); //Initializes the U matrix as a copy of the original matrix

        for pivot_col in 0..self.cols {
            // Find a non-zero entry in the pivot column
            let swap_row = (pivot_row..self.rows)
                .find(|&row| !base.equals(cloned_vector.get(row, pivot_col), &base.zero()));
            match swap_row {
                Some(swap_row) => {
                    // Swap rows in U and P matrices to bring the non-zero entry to the pivot position
                    cloned_vector.swap_rows(pivot_row, swap_row);
                    p_matrix.swap_rows(pivot_row, swap_row);
                    pivot_row += 1;
                }
                None => {
                    // If there are no non-zero entries in the pivot column, just proceed to the next column
                }
            }
        }

        // Set the diagonals of P to 1
        for i in 0..self.rows {
            *p_matrix.get_mut(i, i) = base.one();
        }

        // Run the LU decomposition on the permuted U matrix
        let l_u_result = cloned_vector.l_u_decomposition();

        match l_u_result {
            Ok(the_l_matrix) => Ok((the_p_matrix, the_l_matrix, cloned_vector.data.clone())),
            Err(e) => Err(e),
        }
    }
}

use std::{error::Error, fmt};
impl<'a, T> fmt::Display for Matrix2D<'a, T>
where
    <T as Domain>::Elem: fmt::Display,
    T: Field,
{
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for row in 0..self.rows {
            for col in 0..self.cols {
                write!(f, "{} ", self.get(row, col))?;
            }
            writeln!(f)?;
        }
        Ok(())
    }
}

fn mul_vector<T>(
    f: &VectorAlgebra<T>,
    a: <T as Domain>::Elem,
    d: Vec<<T as Domain>::Elem>,
) -> <VectorAlgebra<T> as Domain>::Elem
where
    T: Field,
{
    f.mul(&d, &f.diagonal(a))
}

#[cfg(test)]
mod tests {
    use super::*; // bring into scope everything from the parent module
    use abstalg::ReducedFractions;
    use abstalg::I32;

    fn test_p_l_u_decomposition(matrix: Vec<isize>, size: usize) {
        // This test assumes that the base field is rational numbers.
        // Create a test 4x4 matrix
        let (matrix, size): (Vec<isize>, usize) =
            (vec![11, 9, 24, 2, 1, 5, 2, 6, 3, 17, 18, 1, 2, 5, 7, 1], 4);
        let field = abstalg::ReducedFractions::new(abstalg::I32);
        let matrix_ring = MatrixRing::new(field.clone(), size);
        let mut matrix: Vec<_> = matrix.clone().into_iter().map(|i| field.int(i)).collect();
        let mut matrix2d = Matrix2D::new(field.clone(), &mut matrix, size, size);

        // Decompose the matrix using the p_l_u_decomposition function
        let p_l_u_decomposition_result = matrix2d.p_l_u_decomposition().unwrap();
        let (mut p_matrix, mut l_matrix, mut u_matrix) = p_l_u_decomposition_result;

        // Convert the matrices back to Matrix2D form for printing
        let p_matrix2d = Matrix2D::new(field.clone(), &mut p_matrix, size, size);
        let l_matrix2d = Matrix2D::new(field.clone(), &mut l_matrix, size, size);
        let u_matrix2d = Matrix2D::new(field.clone(), &mut u_matrix, size, size);

        println!("P={} L={} U={}", p_matrix2d, l_matrix2d, u_matrix2d,);

        // Multiply the resulting P, L, and U matrices
        let p_l = matrix_ring.mul(&p_matrix, &l_matrix);
        let mut p_l_u = matrix_ring.mul(&p_l, &u_matrix);

        //let p_l_u_2d = Matrix2D::new(field.clone(), &mut p_l_u, 4, 4);
        // Check that the product of P, L, and U is equal to the original matrix
        assert_eq!(matrix, p_l_u);
    }

    #[test]
    fn test_p_l_u_decomposition_example() {
        test_p_l_u_decomposition(vec![
            11, 9, 24, 2,
            1, 5, 2, 6,
            3, 17, 18, 1,
            2, 5, 7, 1,
        ], 4);
        test_p_l_u_decomposition(vec![
            1, 3, 5,
            2, 4, 7,
            1, 1, 0,
        ], 3);
    }
}

Sidef

Translation of: Raku
func is_square(m) { m.all { .len == m.len } }
func matrix_zero(n, m=n) { m.of { n.of(0) } }
func matrix_ident(n) { n.of {|i| [i.of(0)..., 1, (n - i - 1).of(0)...] } }

func pivotize(m) {
    var size = m.len
    var id = matrix_ident(size)
    for i in (^size) {
        var max = m[i][i]
        var row = i
        for j in (i ..^ size) {
            if (m[j][i] > max) {
                max = m[j][i]
                row = j
            }
        }
        if (row != i) {
            id.swap(row, i)
        }
    }
    return id
}

func mmult(a, b) {
    var p = []
    for r in ^a, c in ^b[0], i in ^b {
        p[r][c] := 0 += (a[r][i] * b[i][c])
    }
    return p
}

func lu(a) {
    is_square(a) || die "Defined only for square matrices!";
    var n = a.len
    var P = pivotize(a)
    var  = mmult(P, a)
    var L = matrix_ident(n)
    var U = matrix_zero(n)
    for i in ^n, j in ^n {
        if (j >= i) {
            U[i][j] = ([i][j] - sum(^i, { U[_][j] * L[i][_] }))
        } else {
            L[i][j] = (([i][j] - sum(^j, { U[_][j] * L[i][_] })) / U[j][j])
        }
    }
    return [P, , L, U]
}

func say_it(message, array) {
    say "\n#{message}"
    array.each { |row|
        say row.map{"%7s" % .as_rat}.join(' ')
    }
}

var t = [[
   %n(1 3 5),
   %n(2 4 7),
   %n(1 1 0),
],[
   %n(11  9 24  2),
   %n( 1  5  2  6),
   %n( 3 17 18  1),
   %n( 2  5  7  1),
]]

t.each { |test|
    say_it('A Matrix', test)
    for a,b in (['P Matrix', 'Aʼ Matrix', 'L Matrix', 'U Matrix'] ~Z lu(test)) {
        say_it(a, b)
    }
}
A Matrix
      1       3       5
      2       4       7
      1       1       0

P Matrix
      0       1       0
      1       0       0
      0       0       1

Aʼ Matrix
      2       4       7
      1       3       5
      1       1       0

L Matrix
      1       0       0
    1/2       1       0
    1/2      -1       1

U Matrix
      2       4       7
      0       1     3/2
      0       0      -2

A Matrix
     11       9      24       2
      1       5       2       6
      3      17      18       1
      2       5       7       1

P Matrix
      1       0       0       0
      0       0       1       0
      0       1       0       0
      0       0       0       1

Aʼ Matrix
     11       9      24       2
      3      17      18       1
      1       5       2       6
      2       5       7       1

L Matrix
      1       0       0       0
   3/11       1       0       0
   1/11   23/80       1       0
   2/11  37/160   1/278       1

U Matrix
     11       9      24       2
      0  160/11  126/11    5/11
      0       0 -139/40   91/16
      0       0       0  71/139

Stata

Builtin LU decoposition

See LU decomposition in Stata help.

mata
: lud(a=(1,3,5\2,4,7\1,1,0),l=.,u=.,p=.)

: a
       1   2   3
    +-------------+
  1 |  1   3   5  |
  2 |  2   4   7  |
  3 |  1   1   0  |
    +-------------+

: l
        1    2    3
    +----------------+
  1 |   1    0    0  |
  2 |  .5    1    0  |
  3 |  .5   -1    1  |
    +----------------+

: u
         1     2     3
    +-------------------+
  1 |    2     4     7  |
  2 |    0     1   1.5  |
  3 |    0     0    -2  |
    +-------------------+

: p
       1
    +-----+
  1 |  2  |
  2 |  1  |
  3 |  3  |
    +-----+

Implementation

void ludec(real matrix a, real matrix l, real matrix u, real vector p) {
	real scalar i,j,n,s
	real vector js
	
	l = a
	n = rows(a)
	p = 1::n
	for (i=1; i<n; i++) {
		maxindex(abs(l[i::n,i]), 1, js=., .)
		j = js[1]+i-1
		if (j!=i) {
			l[(i\j),.] = l[(j\i),.]
			p[(i\j)] = p[(j\i)]
		}
		for (j=i+1; j<=n; j++) {
			l[j,i] = s = l[j,i]/l[i,i]
			l[j,i+1..n] = l[j,i+1..n]-s*l[i,i+1..n]
		}
	}
	
	u = uppertriangle(l)
	l = lowertriangle(l, 1)
}

Example:

: ludec(a=(1,3,5\2,4,7\1,1,0),l=.,u=.,p=.)

: a
       1   2   3
    +-------------+
  1 |  1   3   5  |
  2 |  2   4   7  |
  3 |  1   1   0  |
    +-------------+

: l
        1    2    3
    +----------------+
  1 |   1    0    0  |
  2 |  .5    1    0  |
  3 |  .5   -1    1  |
    +----------------+

: u
         1     2     3
    +-------------------+
  1 |    2     4     7  |
  2 |    0     1   1.5  |
  3 |    0     0    -2  |
    +-------------------+

: p
       1
    +-----+
  1 |  2  |
  2 |  1  |
  3 |  3  |
    +-----+

Tcl

package require Tcl 8.5
namespace eval matrix {
    namespace path {::tcl::mathfunc ::tcl::mathop}

    # Construct an identity matrix of the given size
    proc identity {order} {
	set m [lrepeat $order [lrepeat $order 0]]
	for {set i 0} {$i < $order} {incr i} {
	    lset m $i $i 1
	}
	return $m
    }

    # Produce the pivot matrix for a given matrix
    proc pivotize {matrix} {
	set n [llength $matrix]
	set p [identity $n]
	for {set j 0} {$j < $n} {incr j} {
	    set max [lindex $matrix $j $j]
	    set row $j
	    for {set i $j} {$i < $n} {incr i} {
		if {[lindex $matrix $i $j] > $max} {
		    set max [lindex $matrix $i $j]
		    set row $i
		}
	    }
	    if {$j != $row} {
		# Row swap inlined; too trivial to have separate procedure
		set tmp [lindex $p $j]
		lset p $j [lindex $p $row]
		lset p $row $tmp
	    }
	}
	return $p
    }

    # Decompose a square matrix A by PA=LU and return L, U and P
    proc luDecompose {A} {
	set n [llength $A]
	set L [lrepeat $n [lrepeat $n 0]]
	set U $L
	set P [pivotize $A]
	set A [multiply $P $A]

	for {set j 0} {$j < $n} {incr j} {
	    lset L $j $j 1
	    for {set i 0} {$i <= $j} {incr i} {
		lset U $i $j [- [lindex $A $i $j] [SumMul $L $U $i $j $i]]
	    }
	    for {set i $j} {$i < $n} {incr i} {
		set sum [SumMul $L $U $i $j $j]
		lset L $i $j [/ [- [lindex $A $i $j] $sum] [lindex $U $j $j]]
	    }
	}

	return [list $L $U $P]
    }

    # Helper that makes inner loop nicer; multiplies column and row,
    # possibly partially...
    proc SumMul {A B i j kmax} {
	set s 0.0
	for {set k 0} {$k < $kmax} {incr k} {
	    set s [+ $s [* [lindex $A $i $k] [lindex $B $k $j]]]
	}
	return $s
    }
}

Support code:

# Code adapted from Matrix_multiplication and Matrix_transposition tasks
namespace eval matrix {
    # Get the size of a matrix; assumes that all rows are the same length, which
    # is a basic well-formed-ness condition...
    proc size {m} {
	set rows [llength $m]
	set cols [llength [lindex $m 0]]
	return [list $rows $cols]
    }

    # Matrix multiplication implementation
    proc multiply {a b} {
	lassign [size $a] a_rows a_cols
	lassign [size $b] b_rows b_cols
	if {$a_cols != $b_rows} {
	    error "incompatible sizes: a($a_rows, $a_cols), b($b_rows, $b_cols)"
	}
	set temp [lrepeat $a_rows [lrepeat $b_cols 0]]
	for {set i 0} {$i < $a_rows} {incr i} {
	    for {set j 0} {$j < $b_cols} {incr j} {
		lset temp $i $j [SumMul $a $b $i $j $a_cols]
	    }
	}
	return $temp
    }

    # Pretty printer for matrices
    proc print {matrix {fmt "%g"}} {
	set max [Widest $matrix $fmt]
	lassign [size $matrix] rows cols
	foreach row $matrix {
	    foreach val $row width $max {
		puts -nonewline [format "%*s " $width [format $fmt $val]]
	    }
	    puts ""
	}
    }
    proc Widest {m fmt} {
	lassign [size $m] rows cols 
	set max [lrepeat $cols 0]
	foreach row $m {
	    for {set j 0} {$j < $cols} {incr j} {
		set s [format $fmt [lindex $row $j]]
		lset max $j [max [lindex $max $j] [string length $s]]
	    }
	}
	return $max
    }
}

Demonstrating:

# This does the decomposition and prints it out nicely
proc demo {A} {
    lassign [matrix::luDecompose $A] L U P
    foreach v {A L U P} {
	upvar 0 $v matrix
	puts "${v}:"
	matrix::print $matrix %.5g
	if {$v ne "P"} {puts "---------------------------------"}
    }
}
demo {{1 3 5} {2 4 7} {1 1 0}}
puts "================================="
demo {{11 9 24 2} {1 5 2 6} {3 17 18 1} {2 5 7 1}}
Output:
A:
1 3 5 
2 4 7 
1 1 0 
---------------------------------
L:
  1  0 0 
0.5  1 0 
0.5 -1 1 
---------------------------------
U:
2 4   7 
0 1 1.5 
0 0  -2 
---------------------------------
P:
0 1 0 
1 0 0 
0 0 1 
=================================
A:
11  9 24 2 
 1  5  2 6 
 3 17 18 1 
 2  5  7 1 
---------------------------------
L:
       1       0         0 0 
 0.27273       1         0 0 
0.090909  0.2875         1 0 
 0.18182 0.23125 0.0035971 1 
---------------------------------
U:
11      9     24       2 
 0 14.545 11.455 0.45455 
 0      0 -3.475  5.6875 
 0      0      0 0.51079 
---------------------------------
P:
1 0 0 0 
0 0 1 0 
0 1 0 0 
0 0 0 1 

VBA

Translation of: Phix
Option Base 1
Private Function pivotize(m As Variant) As Variant
    Dim n As Integer: n = UBound(m)
    Dim im() As Double
    ReDim im(n, n)
    For i = 1 To n
        For j = 1 To n
            im(i, j) = 0
        Next j
        im(i, i) = 1
    Next i
    For i = 1 To n
        mx = Abs(m(i, i))
        row_ = i
        For j = i To n
            If Abs(m(j, i)) > mx Then
                mx = Abs(m(j, i))
                row_ = j
            End If
        Next j
        If i <> Row Then
            For j = 1 To n
                tmp = im(i, j)
                im(i, j) = im(row_, j)
                im(row_, j) = tmp
            Next j
        End If
    Next i
    pivotize = im
End Function
 
Private Function lu(a As Variant) As Variant
    Dim n As Integer: n = UBound(a)
    Dim l() As Double
    ReDim l(n, n)
    For i = 1 To n
        For j = 1 To n
            l(i, j) = 0
        Next j
    Next i
    u = l
    p = pivotize(a)
    a2 = WorksheetFunction.MMult(p, a)
    For j = 1 To n
        l(j, j) = 1#
        For i = 1 To j
            sum1 = 0#
            For k = 1 To i
                sum1 = sum1 + u(k, j) * l(i, k)
            Next k
            u(i, j) = a2(i, j) - sum1
        Next i
        For i = j + 1 To n
            sum2 = 0#
            For k = 1 To j
                sum2 = sum2 + u(k, j) * l(i, k)
            Next k
            l(i, j) = (a2(i, j) - sum2) / u(j, j)
        Next i
    Next j
    Dim res(4) As Variant
    res(1) = a
    res(2) = l
    res(3) = u
    res(4) = p
    lu = res
End Function
 
Public Sub main()
    
    a = [{1, 3, 5; 2, 4, 7; 1, 1, 0}]
    Debug.Print "== a,l,u,p: =="
    result = lu(a)
    For i = 1 To 4
        For j = 1 To UBound(result(1))
            For k = 1 To UBound(result(1), 2)
                Debug.Print result(i)(j, k),
            Next k
            Debug.Print
        Next j
        Debug.Print
    Next i
    a = [{11, 9,24, 2; 1, 5, 2, 6; 3,17,18, 1; 2, 5, 7, 1}]
    Debug.Print "== a,l,u,p: =="
    result = lu(a)
    For i = 1 To 4
        For j = 1 To UBound(result(1))
            For k = 1 To UBound(result(1), 2)
                Debug.Print Format(result(i)(j, k), "0.#####"),
            Next k
            Debug.Print
        Next j
        Debug.Print
    Next i
End Sub
Output:
== a,l,u,p: ==
 1             3             5            
 2             4             7            
 1             1             0            

 1             0             0            
 0,5           1             0            
 0,5          -1             1            

 2             4             7            
 0             1             1,5          
 0             0            -2            

 0             1             0            
 1             0             0            
 0             0             1            

== a,l,u,p: ==
11,           9,            24,           2,            
1,            5,            2,            6,            
3,            17,           18,           1,            
2,            5,            7,            1,            

1,            0,            0,            0,            
0,27273       1,            0,            0,            
0,09091       0,2875        1,            0,            
0,18182       0,23125       0,0036        1,            

11,           9,            24,           2,            
0,            14,54545      11,45455      0,45455       
0,            0,            -3,475        5,6875        
0,            0,            0,            0,51079       

1,            0,            0,            0,            
0,            0,            1,            0,            
0,            1,            0,            0,            
0,            0,            0,            1,   

Wren

Library: Wren-matrix
Library: Wren-fmt
import "./matrix" for Matrix
import "./fmt" for Fmt

var arrays = [
    [ [1, 3, 5],
      [2, 4, 7],
      [1, 1, 0] ],

    [ [11,  9, 24, 2],
      [ 1,  5,  2, 6],
      [ 3, 17, 18, 1],
      [ 2,  5,  7, 1] ]
]

for (array in arrays) {
    var m = Matrix.new(array)
    System.print("A\n")
    Fmt.mprint(m, 2, 0)
    System.print("\nL\n")
    var lup = m.lup
    Fmt.mprint(lup[0], 8, 5)
    System.print("\nU\n")
    Fmt.mprint(lup[1], 8, 5)
    System.print("\nP\n")
    Fmt.mprint(lup[2], 2, 0)
    System.print()
}
Output:
A

| 1  3  5|
| 2  4  7|
| 1  1  0|

L

| 1.00000  0.00000  0.00000|
| 0.50000  1.00000  0.00000|
| 0.50000 -1.00000  1.00000|

U

| 2.00000  4.00000  7.00000|
| 0.00000  1.00000  1.50000|
| 0.00000  0.00000 -2.00000|

P

| 0  1  0|
| 1  0  0|
| 0  0  1|

A

|11  9 24  2|
| 1  5  2  6|
| 3 17 18  1|
| 2  5  7  1|

L

| 1.00000  0.00000  0.00000  0.00000|
| 0.27273  1.00000  0.00000  0.00000|
| 0.09091  0.28750  1.00000  0.00000|
| 0.18182  0.23125  0.00360  1.00000|

U

|11.00000  9.00000 24.00000  2.00000|
| 0.00000 14.54545 11.45455  0.45455|
| 0.00000  0.00000 -3.47500  5.68750|
| 0.00000  0.00000  0.00000  0.51079|

P

| 1  0  0  0|
| 0  0  1  0|
| 0  1  0  0|
| 0  0  0  1|

zkl

Using the GNU Scientific Library, which does the decomposition without returning the permutations:

var [const] GSL=Import("zklGSL");	// libGSL (GNU Scientific Library)
fcn luTask(A){
   A.LUDecompose();	//  in place, contains L & U
   L:=A.copy().lowerTriangle().setDiagonal(0,0,1);
   U:=A.copy().upperTriangle();
   return(L,U);
}

A:=GSL.Matrix(3,3).set(1,3,5,  2,4,7,   1,1,0);  // example 1
L,U:=luTask(A);
println("L:\n",L.format(),"\nU:\n",U.format());

A:=GSL.Matrix(4,4).set(11.0,  9.0, 24.0, 2.0,	// example 2
			1.0,  5.0,  2.0, 6.0,
			3.0, 17.0, 18.0, 1.0,
			2.0,  5.0,  7.0, 1.0);
L,U:=luTask(A);
println("L:\n",L.format(8,4),"\nU:\n",U.format(8,4));
Output:
L:
      1.00,      0.00,      0.00
      0.50,      1.00,      0.00
      0.50,     -1.00,      1.00
U:
      2.00,      4.00,      7.00
      0.00,      1.00,      1.50
      0.00,      0.00,     -2.00
L:
  1.0000,  0.0000,  0.0000,  0.0000
  0.2727,  1.0000,  0.0000,  0.0000
  0.0909,  0.2875,  1.0000,  0.0000
  0.1818,  0.2312,  0.0036,  1.0000
U:
 11.0000,  9.0000, 24.0000,  2.0000
  0.0000, 14.5455, 11.4545,  0.4545
  0.0000,  0.0000, -3.4750,  5.6875
  0.0000,  0.0000,  0.0000,  0.5108

Or, using lists:

Translation of: Common Lisp
Translation of: D

A matrix is a list of lists, ie list of rows in row major order.

fcn make_array(n,m,v){ (m).pump(List.createLong(m).write,v)*n }
fcn eye(n){ // Creates a nxn identity matrix.
   I:=make_array(n,n,0.0);
   foreach j in (n){ I[j][j]=1.0 }
   I
}
 
// Creates the pivoting matrix for A.
fcn pivotize(A){
   n:=A.len();	// rows
   P:=eye(n);
   foreach i in (n){
      max,row:=A[i][i],i;
      foreach j in ([i..n-1]){
         if(A[j][i]>max) max,row=A[j][i],j;
      }
      if(i!=row) P.swap(i,row);
   }
   // Return P.
   P
}

// Decomposes a square matrix A by PA=LU and returns L, U and P.
fcn lu(A){
   n:=A.len();
   L:=eye(n);
   U:=make_array(n,n,0.0);
   P:=pivotize(A);
   A=matMult(P,A);

   foreach j in (n){
      foreach i in (j+1){
         U[i][j]=A[i][j] - (i).reduce('wrap(s,k){ s + U[k][j]*L[i][k] },0.0);
      }
      foreach i in ([j..n-1]){
         L[i][j]=( A[i][j] - 
		   (j).reduce('wrap(s,k){ s + U[k][j]*L[i][k] },0.0) ) /
		 U[j][j];
      }
   }
   // Return L, U and P.
   return(L,U,P);
}

fcn matMult(a,b){
   n,m,p:=a[0].len(),a.len(),b[0].len();
   ans:=make_array(n,m,0.0);
   foreach i,j,k in (m,p,n){ ans[i][j]+=a[i][k]*b[k][j]; }
   ans
}

Example 1

g:=L(L(1.0,3.0,5.0),L(2.0,4.0,7.0),L(1.0,1.0,0.0));
lu(g).apply2("println");
Output:
L(L(1,0,0),L(0.5,1,0),L(0.5,-1,1))
L(L(2,4,7),L(0,1,1.5),L(0,0,-2))
L(L(0,1,0),L(1,0,0),L(0,0,1))

Example 2

lu(L( L(11.0,  9.0, 24.0, 2.0), 
      L( 1.0,  5.0,  2.0, 6.0),
      L( 3.0, 17.0, 18.0, 1.0),
      L( 2.0,  5.0,  7.0, 1.0) )).apply2(T(printM,Console.writeln.fpM("-")));

fcn printM(m)  { m.pump(Console.println,rowFmt) }
fcn rowFmt(row){ ("%9.5f "*row.len()).fmt(row.xplode()) }

The list apply2 method is side effects only, it doesn't aggregate results. When given a list of actions, it applies the action and passes the result to the next action. The fpM method is partial application with a mask, "-" truncates the parameters at that point (in this case, no parameters, ie just print a blank line, not the result of printM).

Output:
  1.00000   0.00000   0.00000   0.00000 
  0.27273   1.00000   0.00000   0.00000 
  0.09091   0.28750   1.00000   0.00000 
  0.18182   0.23125   0.00360   1.00000 

 11.00000   9.00000  24.00000   2.00000 
  0.00000  14.54545  11.45455   0.45455 
  0.00000   0.00000  -3.47500   5.68750 
  0.00000   0.00000   0.00000   0.51079 

  1.00000   0.00000   0.00000   0.00000 
  0.00000   0.00000   1.00000   0.00000 
  0.00000   1.00000   0.00000   0.00000 
  0.00000   0.00000   0.00000   1.00000