Sorting algorithms/Merge sort: Difference between revisions

Content added Content deleted

Inline

Revision as of 01:54, 12 December 2009

The merge sort is a recursive sort of order n*log(n). It is notable for having a worst case and average complexity of O(n*log(n)), and a best case complexity of O(n) (for pre-sorted input). The basic idea is to split the collection into smaller groups by halving it until the groups only have one element or no elements (which are both entirely sorted groups). Then merge the groups back together so that their elements are in order. This is how the algorithm gets its "divide and conquer" description.

Write a function to sort a collection of integers using the merge sort. The merge sort algorithm comes in two parts: a sort function and a merge function. The functions in pseudocode look like this:

function mergesort(m)
   var list left, right, result
   if length(m) ≤ 1
       return m
   else
       var middle = length(m) / 2
       for each x in m up to middle - 1
           add x to left
       for each x in m at and after middle
           add x to right
       left = mergesort(left)
       right = mergesort(right)
       if last(left) ≤ first(right) 
          append right to left
          return left
       result = merge(left, right)
       return result

function merge(left,right)
   var list result
   while length(left) > 0 and length(right) > 0
       if first(left) ≤ first(right)
           append first(left) to result
           left = rest(left)
       else
           append first(right) to result
           right = rest(right)
   if length(left) > 0 
       append rest(left) to result
   if length(right) > 0 
       append rest(right) to result
   return result

Ada

This example creates a generic package for sorting arrays of any type. Ada allows array indices to be any discrete type, including enumerated types which are non-numeric. Furthermore, numeric array indices can start at any value, positive, negative, or zero. The following code handles all the possible variations in index types. <lang ada>generic

  type Element_Type is private;
  type Index_Type is (<>);
  type Collection_Type is array(Index_Type range <>) of Element_Type;
  with function "<"(Left, Right : Element_Type) return Boolean is <>;

package Mergesort is

  function Sort(Item : Collection_Type) return Collection_Type;

end MergeSort;</lang>

<lang ada>package body Mergesort is

  -----------
  -- Merge --
  -----------
  
  function Merge(Left, Right : Collection_Type) return Collection_Type is
     Result : Collection_Type(Left'First..Right'Last);
     Left_Index : Index_Type := Left'First;
     Right_Index : Index_Type := Right'First;
     Result_Index : Index_Type := Result'First;
  begin
     while Left_Index <= Left'Last and Right_Index <= Right'Last loop
        if Left(Left_Index) <= Right(Right_Index) then
           Result(Result_Index) := Left(Left_Index);
           Left_Index := Index_Type'Succ(Left_Index); -- increment Left_Index
        else
           Result(Result_Index) := Right(Right_Index);
           Right_Index := Index_Type'Succ(Right_Index); -- increment Right_Index
        end if;
        Result_Index := Index_Type'Succ(Result_Index); -- increment Result_Index
     end loop;
     if Left_Index <= Left'Last then
        Result(Result_Index..Result'Last) := Left(Left_Index..Left'Last);
     end if;
     if Right_Index <= Right'Last then
        Result(Result_Index..Result'Last) := Right(Right_Index..Right'Last);
     end if;
     return Result;
  end Merge;
  
  ----------
  -- Sort --
  ----------

  function Sort (Item : Collection_Type) return Collection_Type is
     Result : Collection_Type(Item'range);
     Middle : Index_Type;
  begin
     if Item'Length <= 1 then
        return Item;
     else
        Middle := Index_Type'Val((Item'Length / 2) + Index_Type'Pos(Item'First));
        declare
           Left : Collection_Type(Item'First..Index_Type'Pred(Middle));
           Right : Collection_Type(Middle..Item'Last);
        begin
           for I in Left'range loop
              Left(I) := Item(I);
           end loop;
           for I in Right'range loop
              Right(I) := Item(I);
           end loop;
           Left := Sort(Left);
           Right := Sort(Right);
           Result := Merge(Left, Right);
        end;
        return Result;
     end if;
  end Sort;

end Mergesort;</lang> The following code provides an usage example for the generic package defined above. <lang ada>with Ada.Text_Io; use Ada.Text_Io; with Mergesort;

procedure Mergesort_Test is

  type List_Type is array(Positive range <>) of Integer;
  package List_Sort is new Mergesort(Integer, Positive, List_Type);
  procedure Print(Item : List_Type) is
  begin
     for I in Item'range loop
        Put(Integer'Image(Item(I)));
     end loop;
     New_Line;
  end Print;
  
  List : List_Type := (1, 5, 2, 7, 3, 9, 4, 6);

begin

  Print(List);
  Print(List_Sort.Sort(List));

end Mergesort_Test;</lang> The output of this example is:

 1 5 2 7 3 9 4 6
 1 2 3 4 5 6 7 9

ALGOL 68

Translation of: python

Below are two variants of the same routine. If copying the DATA type to a different memory location is expensive, then the optimised version should be used as the DATA elements are handled indirectly. <lang algol68>MODE DATA = CHAR;

PROC merge sort = ([]DATA m)[]DATA: (

   IF LWB m >= UPB m THEN
       m
   ELSE
       INT middle = ( UPB m + LWB m ) OVER 2;
       []DATA left = merge sort(m[:middle]);
       []DATA right = merge sort(m[middle+1:]);
       flex merge(left, right)[AT LWB m]
   FI

);

FLEX version: A demonstration of FLEX for manipulating arrays #

PROC flex merge = ([]DATA in left, in right)[]DATA:(

   [UPB in left + UPB in right]DATA result;
   FLEX[0]DATA left := in left;
   FLEX[0]DATA right := in right;

   FOR index TO UPB result DO
       # change the direction of this comparison to change the direction of the sort #
       IF LWB right > UPB right THEN
           result[index:] := left; 
           stop iteration
       ELIF LWB left > UPB left THEN
           result[index:] := right;
           stop iteration
       ELIF left[1] <= right[1] THEN
           result[index] := left[1];
           left := left[2:]
       ELSE
           result[index] := right[1];
           right := right[2:]
       FI
   OD;

stop iteration:

   result

);

[32]CHAR char array data := "big fjords vex quick waltz nymph"; print((merge sort(char array data), new line));</lang> Output:

    abcdefghiijklmnopqrstuvwxyz

Optimised version:

avoids FLEX array copies and manipulations
avoids type DATA memory copies, useful in cases where DATA is a large STRUCT

<lang algol68>PROC opt merge sort = ([]REF DATA m)[]REF DATA: (

   IF LWB m >= UPB m THEN
       m
   ELSE
       INT middle = ( UPB m + LWB m ) OVER 2;
       []REF DATA left = opt merge sort(m[:middle]);
       []REF DATA right = opt merge sort(m[middle+1:]);
       opt merge(left, right)[AT LWB m]
   FI

);

PROC opt merge = ([]REF DATA left, right)[]REF DATA:(

   [UPB left - LWB left + 1 + UPB right - LWB right + 1]REF DATA result;
   INT index left:=LWB left, index right:=LWB right;

   FOR index TO UPB result DO
       # change the direction of this comparison to change the direction of the sort #
       IF index right > UPB right THEN
           result[index:] := left[index left:]; 
           stop iteration
       ELIF index left > UPB left THEN
           result[index:] := right[index right:];
           stop iteration
       ELIF left[index left] <= right[index right] THEN
           result[index] := left[index left]; index left +:= 1
       ELSE
           result[index] := right[index right]; index right +:= 1
       FI
   OD;

stop iteration:

   result

);

create an array of pointers to the data being sorted #

[UPB char array data]REF DATA data; FOR i TO UPB char array data DO data[i] := char array data[i] OD;

[]REF CHAR result = opt merge sort(data); FOR i TO UPB result DO print((result[i])) OD; print(new line)</lang> Output:

    abcdefghiijklmnopqrstuvwxyz

AutoHotkey

contributed by Laszlo on the ahk forum <lang AutoHotkey>MsgBox % MSort("") MsgBox % MSort("xxx") MsgBox % MSort("3,2,1") MsgBox % MSort("dog,000000,cat,pile,abcde,1,zz,xx,z")

MSort(x) { ; Merge-sort of a comma separated list

  If (2 > L:=Len(x))
      Return x                                             ; empty or single item lists are sorted
  StringGetPos p, x, `,, % "L" L//2                        ; Find middle comma
  Return Merge(MSort(SubStr(x,1,p)), MSort(SubStr(x,p+2))) ; Split, Sort, Merge

}

Len(list) {

  StringReplace t, list,`,,,UseErrorLevel                  ; #commas -> ErrorLevel
  Return list="" ? 0 : ErrorLevel+1

}

Item(list,ByRef p) { ; item at position p, p <- next position

  Return (p := InStr(list,",",0,i:=p+1)) ? SubStr(list,i,p-i) : SubStr(list,i)

}

Merge(list0,list1) { ; Merge 2 sorted lists

  IfEqual list0,, Return list1
  IfEqual list1,, Return list0
  i0 := Item(list0,p0:=0)
  i1 := Item(list1,p1:=0)
  Loop  {
     i := i0>i1
     list .= "," i%i%                                      ; output smaller
     If (p%i%)
        i%i% := Item(list%i%,p%i%)                         ; get next item from processed list
     Else {
        i ^= 1                                             ; list is exhausted: attach rest of other
        Return SubStr(list "," i%i% (p%i% ? "," SubStr(list%i%,p%i%+1) : ""), 2)
     }
  }

}</lang>

C

<lang c>#include <stdio.h>

include <stdlib.h>
include <string.h>

int *locmerge(int *l, unsigned int ll, int *r, unsigned int lr) {

 int *result;
 unsigned int il, ir, ires;
 
 il = ir = ires = 0;

 result = malloc(sizeof(int)*(ll+lr));
 while ( (ll > 0) && ( lr > 0 ) ) {
   if ( l[il] <= r[ir] ) {
     result[ires++] = l[il++]; ll--;
   } else {
     result[ires++] = r[ir++]; lr--;
   }
 }
 if ( ll > 0 ) {
   memcpy(&result[ires], &l[il], sizeof(int)*ll);
 }
 if ( lr > 0 ) {
   memcpy(&result[ires], &r[ir], sizeof(int)*lr);
 }
 return result;

}

int *mergesort(int *list, unsigned int l) {

 unsigned int middle;
 int *tleft, *tright, *result, *left, *right;

 if ( l < 2 ) return list;
 middle = l / 2;
 tleft = malloc(sizeof(int)*(l - middle));
 tright = malloc(sizeof(int)*middle);
 memcpy(tleft, list, sizeof(int)*(l-middle));
 memcpy(tright, list+(l-middle), sizeof(int)*middle);
 left = mergesort(tleft, l-middle);
 right = mergesort(tright, middle);
 if (list[l-middle-1] <= list[l-middle])
   result = list;
 else
   result = locmerge(left, l-middle, right, middle);
 if ( tleft == left ) {
   free(left);
 } else {
   free(left); free(tleft);
 }
 if ( tright == right ) {
   free(right);
 } else {
   free(right); free(tright);
 }
 return result;

}</lang>

Usage:

<lang c>int main() {

 int *tobesorted;
 int i;

 tobesorted = malloc(sizeof(int)*50);
 for(i=0; i < 50; i++) {
   tobesorted[i] = 50 - i;
   printf("%d\n", tobesorted[i]);
 }
 printf("----\n");

 int *r = mergesort(tobesorted, 50);
 for(i=0; i < 50; i++) printf("%d\n", r[i]);
 free(r);
 return 0;

}</lang>

C++

<lang cpp>#include <iterator>

include <algorithm> // for std::inplace_merge
include <functional> // for std::less

template<typename RandomAccessIterator, typename Order>

void mergesort(RandomAccessIterator first, RandomAccessIterator last, Order order)

{

 if (last - first > 1)
 {
   RandomAccessIterator middle = first + (last - first) / 2;
   mergesort(first, middle, order);
   mergesort(middle, last, order);
   std::inplace_merge(first, middle, last, order);
 }

}

template<typename RandomAccessIterator>

void mergesort(RandomAccessIterator first, RandomAccessIterator last)

{

 mergesort(first, last, std::less<typename std::iterator_traits<RandomAccessIterator>::value_type>());

}</lang>

C#

Works with: C# version 2.0+

<lang csharp>using System; using System.Collections.Generic;

namespace RosettaCode.MergeSort {

   public static class MergeSorter
   {
       public static List<T> Sort<T>(List<T> list) where T : IComparable
       {
           if (list.Count <= 1) return list;

           List<T> left = list.GetRange(0, list.Count / 2);
           List<T> right = list.GetRange(left.Count, list.Count - left.Count);
           return Merge(Sort(left), Sort(right));
       }

       public static List<T> Merge<T>(List<T> left, List<T> right) where T : IComparable
       {
           List<T> result = new List<T>();
           while (left.Count > 0 && right.Count > 0)
           {
               if (left[0].CompareTo(right[0]) <= 0)
               {
                   result.Add(left[0]);
                   left.RemoveAt(0);
               }
               else
               {
                   result.Add(right[0]);
                   right.RemoveAt(0);
               }
           }
           result.AddRange(left);
           result.AddRange(right);
           return result;
       }
   }

}</lang>

As in the Ada example above, the following code provides a usage example:

<lang csharp>using System; using System.Collections.Generic;

namespace RosettaCode.MergeSort {

   class Program
   {
       static void Main(string[] args)
       {
           List<int> testList = new List<int> { 1, 5, 2, 7, 3, 9, 4, 6 };
           printList(testList);
           printList(MergeSorter.Sort(testList));
       }

       private static void printList<T>(List<T> list)
       {
           foreach (var t in list)
           {
               Console.Write(t + " ");
           }
           Console.WriteLine();
       }
   }

}</lang>

Again, as in the Ada example the output is:

1 5 2 7 3 9 4 6
1 2 3 4 5 6 7 9

Clojure

Translation of: Haskell

<lang lisp>(defn merge* [left right]

 (cond (nil? left) right
       (nil? right) left
       true (let [[l & *left] left
                  [r & *right] right]
              (if (<= l r) (cons l (merge* *left right))
                           (cons r (merge* left *right))))))

(defn merge-sort [L]

 (let [[l & *L] L]
   (if (nil? *L) 
     L
     (let [[left right] (split-at (/ (count L) 2) L)]
       (merge* (merge-sort left) (merge-sort right))))))</lang>

Common Lisp

<lang lisp>(defun merge-sort (result-type sequence predicate)

  (let ((split (floor (length sequence) 2)))
    (if (zerop split)
      (copy-seq sequence)
      (merge result-type (merge-sort result-type (subseq sequence 0 split) predicate)
                         (merge-sort result-type (subseq sequence split)   predicate)
                         predicate))))</lang>

merge is a standard Common Lisp function.

> (merge-sort 'list (list 1 3 5 7 9 8 6 4 2) #'<)
(1 2 3 4 5 6 7 8 9)

D

Works with: Tango

<lang d>module mergesort ;

version(Tango) {

 import tango.io.Stdout ;
 import tango.util.collection.LinkSeq ;
 alias LinkSeq!(int) LNK ;</lang>

<lang d>// Tango LinkSeq version

 void mergesort1(T)(T m) {
   if (m.length <= 1)
     return m ;
   int mid = m.length  / 2 ;   
   T left  = m.subset(0, mid) ;
   T right = m.subset(mid, m.length - mid) ;
   mergesort1(left) ;
   mergesort1(right) ;
   merge1(m, left, right) ;
 }
 void merge1(T)(T m, T left, T right) {
   m.clear ;
   while(left.length > 0 && right.length > 0)
     if (left.head <= right.head)
       m.append(left.take()) ;
     else 
       m.append(right.take()) ;      
   while(left.length > 0)
       m.append(left.take()) ;
   while(right.length > 0)
       m.append(right.take()) ;
 }</lang>

<lang d>alias Stdout print ; } else { // not Version Tango

 import std.stdio ;
 alias writef print ;

}</lang> <lang d>// D array version T[] mergesort2(T)(inout T[] m) {

 if (m.length <= 1)
   return m ;
 int mid = m.length / 2 ;
 T[] left, right;
 left = m[0..mid] ;
 right = m[mid..$] ;
 left.mergesort2() ;
 right.mergesort2() ;
 m.merge2(left, right) ;
 return m ;

} void merge2(T)(inout T[] merged, inout T[] left, inout T[] right) {

 T[] m = new T[left.length + right.length];
 int headL = 0 ;
 int headR = 0 ;
 int tailM = 0 ;
 while (headL < left.length && headR < right.length)
   if(left[headL] <= right[headR])
     m[tailM++] = left[headL++] ;
   else
     m[tailM++] = right[headR++] ;
 if (headL < left.length)
   m[tailM..$] = left[headL..$] ;
 else if (headR < right.length)
   m[tailM..$] = right[headR..$] ;
 merged = m ;

}</lang> <lang d>void dump(T)(T l) {

 foreach(e ; l)
   print(e," ") ;
 print("\n") ;

} void main() {

 int[] arr = [8,6,4,2,1,3,5,7,9] ;

 version(Tango) {  
   LNK lnk = new LNK ; 
   foreach(e;arr)
     lnk.append(e);  
   dump(lnk) ;
   mergesort1(lnk) ;
   dump(lnk) ;
 }
 dump(arr) ;
 mergesort2(arr) ;
 dump(arr) ;

}</lang>

A template version using D version 2.0. Also includes a couble basic unit tests and some benchmark code.

<lang d>import std.math; import std.conv; import std.stdio; import std.array; import std.random; import std.date; import std.algorithm;

template TSort(T) {

   T[] mergeSort(T[] unsorted)
   {
       if (unsorted.length > 1)
       {
           T[] merged;
           size_t mid = to!(size_t)(floor(unsorted.length / 2));
           auto set = setUnion(mergeSort(unsorted[0 ..  mid]), mergeSort(unsorted[mid .. unsorted.length]));
           foreach(e; set)
           {
               merged ~= e;
           }
           return merged;
       }
       else
       {
           return unsorted;
       }
   }

   T[] mergeLists(T[] s1, T[] s2)
   {
       T[] resList;
       auto appResult = appender(&resList);

       if (s1[s1.length - 1] <= s2[0])
       {
           return s1 ~ s2;
       }
       else if (s2[s2.length - 1] <= s1[0])
       {
           return s2 ~ s1;
       }
       else
       {
           size_t s1count = 0;
           size_t s2count = 0;
           while (s1.length > s1count && s2.length > s2count)
           {
               if (s1[s1count] <= s2[s2count])
               {
                   appResult.put(s1[s1count]);
                   s1count++;
               }
               else
               {
                   appResult.put(s2[s2count]);
                   s2count++;
               }
           }

           if (s1.length > s1count)
           {
               appResult.put(s1[s1count .. s1.length]);
           }
           else if (s2.length > s2count)
           {
               appResult.put(s2[s2count .. s2.length]);
           }
       }

       return resList;
   }

}

unittest {

   int[] evenArray = [4, 2, 7, 9];
   int[] oddArray = [4, 2, 7, 9, 3];
   int[] evenSortedArray = [2, 4, 7, 9];
   int[] oddSortedArray = [2, 3, 4, 7, 9];

   writeln("Testing the sort.");
   auto t1 = TSort!(int).mergeSort(evenArray);
   assert(t1.length == evenArray.length);
   for (uint i = 0; i < t1.length; i++)
   {
       assert(t1[i] == evenSortedArray[i]);
   }
   auto t2 = TSort!(int).mergeSort(oddArray);
   assert(t2.length == oddArray.length);
   debug(mergesort) writeln(__FILE__, ": \"Sorted\" list: ", t2);
   for (uint i = 0; i < t1.length; i++)
   {
       assert(t2[i] == oddSortedArray[i]);
   }

}

int[] data;

auto benchmarkMergeSort() {

   auto res = TSort!(int).mergeSort(data);

}

void main(string[] args) {

   version(benchmark)
   {
       writeln("Initializing the data.");
       Mt19937 gen;
       gen.seed(unpredictableSeed);
       for (size_t i = 0; i < 10000000; i++)
       {
           data ~= uniform(0, 1024, gen);
       }
       writeln("Starting benchmark.");
       writeln("Sort time: ", benchmark!(benchmarkMergeSort)(1));
   }

}</lang>

E

<lang e>def merge(var xs :List, var ys :List) {

   var result := []
   while (xs =~ [x] + xr && ys =~ [y] + yr) {
       if (x <= y) {
           result with= x
           xs := xr
       } else {
           result with= y
           ys := yr
       }
   }
   return result + xs + ys

}

def sort(list :List) {

   if (list.size() <= 1) { return list }
   def split := list.size() // 2
   return merge(sort(list.run(0, split)),
                sort(list.run(split)))

}</lang>

Erlang

Below are two versions. Both take advantage of built-in Erlang functions, lists:split and list:merge. The multi-process version spawns a new process each time it splits. This was slightly faster on a test system with only two cores, so it may not be the best implementation, however it does illustrate how easy it can be to add multi-threaded/process capabilities to a program.

Single-threaded version: <lang erlang>mergeSort(L) when length(L) == 1 -> L; mergeSort(L) when length(L) > 1 ->

   {L1, L2} = lists:split(length(L) div 2, L),
   lists:merge(mergeSort(L1), mergeSort(L2)).</lang>

Multi-process version: <lang erlang>pMergeSort(L) when length(L) == 1 -> L; pMergeSort(L) when length(L) > 1 ->

   {L1, L2} = lists:split(length(L) div 2, L),
   spawn(mergesort, pMergeSort2, [L1, self()]),
   spawn(mergesort, pMergeSort2, [L2, self()]),
   mergeResults([]).

pMergeSort2(L, Parent) when length(L) == 1 -> Parent ! L; pMergeSort2(L, Parent) when length(L) > 1 ->

   {L1, L2} = lists:split(length(L) div 2, L),
   spawn(mergesort, pMergeSort2, [L1, self()]),
   spawn(mergesort, pMergeSort2, [L2, self()]),
   Parent ! mergeResults([]).</lang>

Forth

This is an in-place mergesort which works on arrays of integers. <lang forth>: merge-step ( right mid left -- right mid+ left+ )

 over @ over @ < if
   over @ >r
   2dup - over dup cell+ rot move
   r> over !
   >r cell+ 2dup = if rdrop dup else r> then
 then cell+ ;

merge ( right mid left -- right left )

 dup >r begin 2dup > while merge-step repeat 2drop r> ;

mid ( l r -- mid ) over - 2/ cell negate and + ;

mergesort ( right left -- right left )

 2dup cell+ <= if exit then
 swap 2dup mid recurse rot recurse merge ;

sort ( addr len -- ) cells over + swap mergesort 2drop ;

create test 8 , 1 , 5 , 3 , 9 , 0 , 2 , 7 , 6 , 4 ,

.array ( addr len -- ) 0 do dup i cells + @ . loop drop ;

test 10 2dup sort .array \ 0 1 2 3 4 5 6 7 8 9</lang>

Fortran

Works with: Fortran version 90 and later

<lang fortran>subroutine Merge(A,NA,B,NB,C,NC)

  integer, intent(in) :: NA,NB,NC         ! Normal usage: NA+NB = NC
  integer, intent(in out) :: A(NA)        ! B overlays C(NA+1:NC)
  integer, intent(in)     :: B(NB)
  integer, intent(in out) :: C(NC)
  
  integer :: I,J,K
  
  I = 1; J = 1; K = 1;
  do while(I <= NA .and. J <= NB)
     if (A(I) <= B(J)) then
        C(K) = A(I)
        I = I+1
     else
        C(K) = B(J)
        J = J+1
     endif
     K = K + 1
  enddo
  do while (I <= NA)
     C(K) = A(I)
     I = I + 1
     K = K + 1
  enddo
  return

end subroutine merge

recursive subroutine MergeSort(A,N,T)

  integer, intent(in) :: N
  integer, dimension(N), intent(in out) :: A
  integer, dimension((N+1)/2), intent (out) :: T
  
  integer :: NA,NB,V
  
  if (N < 2) return
  if (N == 2) then
     if (A(1) > A(2)) then
        V = A(1)
        A(1) = A(2)
        A(2) = V
     endif
     return
  endif      
  NA=(N+1)/2
  NB=N-NA

  call MergeSort(A,NA,T)
  call MergeSort(A(NA+1),NB,T)

  if (A(NA) > A(NA+1)) then
     T(1:NA)=A(1:NA)
     call Merge(T,NA,A(NA+1),NB,A,N)
  endif
  return

end subroutine MergeSort

program TestMergeSort

  integer, parameter :: N = 8
  integer, dimension(N) :: A = (/ 1, 5, 2, 7, 3, 9, 4, 6 /)
  integer, dimension ((N+1)/2) :: T
  call MergeSort(A,N,T)
  write(*,'(A,/,10I3)')'Sorted array :',A

end program TestMergeSort</lang>

Haskell

Splitting in half in the middle like the normal merge sort does would be inefficient on the singly-linked lists used in Haskell (since you would have to do one pass just to determine the length, and then another half-pass to do the splitting). Instead, the algorithm here splits the list in half in a different way -- by alternately assigning elements to one list and the other. That way we (lazily) construct both sublists in parallel as we traverse the original list. Unfortunately, under this way of splitting we cannot do a stable sort. <lang haskell>merge [] ys = ys merge xs [] = xs merge xs@(x:xs') ys@(y:ys') | x <= y = x : merge xs' ys

                           | otherwise = y : merge xs  ys'

split (x:y:zs) = let (xs,ys) = split zs in (x:xs,y:ys) split [x] = ([x],[]) split [] = ([],[])

mergeSort [] = [] mergeSort [x] = [x] mergeSort xs = let (as,bs) = split xs

               in merge (mergeSort as) (mergeSort bs)</lang>

Java

Works with: Java version 1.5+

<lang java>import java.util.LinkedList; public class Merge<E extends Comparable<E>> { public LinkedList<E> mergeSort(LinkedList<E> m){ if(m.size() <= 1) return m;

int middle= m.size() / 2; LinkedList<E> left= new LinkedList<E>(); for(int i= 0;i < middle;i++) left.add(m.get(i)); LinkedList<E> right= new LinkedList<E>(); for(int i= middle;i < m.size();i++) right.add(m.get(i));

right= mergeSort(right); left= mergeSort(left); LinkedList<E> result= merge(left, right);

return result; }

public LinkedList<E> merge(LinkedList<E> left, LinkedList<E> right){ LinkedList<E> result= new LinkedList<E>();

while(left.size() > 0 && right.size() > 0){ //change the direction of this comparison to change the direction of the sort if(left.peek().compareTo(right.peek()) <= 0) result.add(left.remove()); else result.add(right.remove()); }

if(left.size() > 0) result.addAll(left); if(right.size() > 0) result.addAll(right); return result; } }</lang>

JavaScript

<lang javascript>function sort(a) {

 var mid = a.length>>1;
 if (mid==0) return a;
 var less = sort(a.slice(0,mid));
 var more = sort(a.slice(mid));
 var merged = [];
 do {
   if (more[0] < less[0]) { var t=less; less=more; more=t; }
   merged.push(less.shift());
 } while (less.length > 0);
 return merged.concat(more);

}</lang>

Logo

Works with: UCB Logo

<lang logo>to split :size :front :list

 if :size < 1 [output list :front :list]
 output split :size-1 (lput first :list :front) (butfirst :list)

end

to merge :small :large

 if empty? :small [output :large]
 ifelse lessequal? first :small first :large ~
   [output fput first :small merge butfirst :small :large] ~
   [output fput first :large merge butfirst :large :small]

end

to mergesort :list

 localmake "half split (count :list) / 2 [] :list
 if empty? first :half [output :list]
 output merge mergesort first :half mergesort last :half

end</lang>

Lucid

[1] <lang lucid>msort(a) = if iseod(first next a) then a else merge(msort(b0),msort(b1)) fi

 where
  p = false fby not p;
  b0 = a whenever p;
  b1 = a whenever not p;
  just(a) = ja
     where
        ja = a fby if iseod ja then eod else next a fi;
     end;
  merge(x,y) = if takexx then xx else yy fi
    where
     xx = (x) upon takexx;
     yy = (y) upon not takexx;
     takexx = if iseod(yy) then true elseif
                 iseod(xx) then false else xx <= yy fi;
    end;
 end;</lang>

OCaml

<lang ocaml>let rec split_at n xs =

 match n, xs with
     0, xs ->
       [], xs
   | _, [] ->
       failwith "index too large"
   | n, x::xs when n > 0 ->
       let xs', xs = split_at (pred n) xs in
         x::xs', xs
   | _, _ ->
       invalid_arg "negative argument"

let rec merge_sort cmp = function

   [] -> []
 | [x] -> [x]
 | xs ->
     let xs, ys = split_at (List.length xs / 2) xs in
       List.merge cmp (merge_sort cmp xs) (merge_sort cmp ys)

let _ =

 merge_sort compare [8;6;4;2;1;3;5;7;9]</lang>

PL/I

<lang pli>MERGE: PROCEDURE (A,LA,B,LB,C);

/* Merge A(1:LA) with B(1:LB), putting the result in C

  B and C may share the same memory, but not with A.

/

  DECLARE (A(*),B(*),C(*)) BYADDR POINTER;
  DECLARE (LA,LB) BYVALUE NONASGN FIXED BIN(31);
  DECLARE (I,J,K) FIXED BIN(31);
  DECLARE (SX) CHAR(58) VAR BASED (PX);
  DECLARE (SY) CHAR(58) VAR BASED (PY);
  DECLARE (PX,PY) POINTER;

  I=1; J=1; K=1;
  DO WHILE ((I <= LA) & (J <= LB));
     PX=A(I); PY=B(J);
     IF(SX <= SY) THEN
        DO; C(K)=A(I); K=K+1; I=I+1; END;
     ELSE
        DO; C(K)=B(J); K=K+1; J=J+1; END;
  END;
  DO WHILE (I <= LA);
     C(K)=A(I); I=I+1; K=K+1;
  END;
  RETURN;

END MERGE;

MERGESORT: PROCEDURE (AP,N) RECURSIVE ;

/* Sort the array AP containing N pointers to strings */

    DECLARE (AP(*))              BYADDR POINTER;
    DECLARE (N)                  BYVALUE NONASGN FIXED BINARY(31);
    DECLARE (M,I)                FIXED BINARY;
    DECLARE AMP1(1)              POINTER BASED(PAM);
    DECLARE (pX,pY,PAM) POINTER;
    DECLARE SX CHAR(58) VAR BASED(pX);
    DECLARE SY CHAR(58) VAR BASED(pY);

  IF (N=1) THEN RETURN;
  M = trunc((N+1)/2);
  IF (M>1) THEN CALL MERGESORT(AP,M);
  PAM=ADDR(AP(M+1));
  IF (N-M > 1) THEN CALL MERGESORT(AMP1,N-M);
  pX=AP(M); pY=AP(M+1);
  IF SX <= SY then return;     /* Skip Merge */
  DO I=1 to M; TP(I)=AP(I); END;
  CALL MERGE(TP,M,AMP1,N-M,AP);
  RETURN;

END MERGESORT;</lang>

Prolog

<lang prolog>% msort( L, S ) % True if S is a sorted copy of L, using merge sort msort( [], [] ). msort( [X], [X] ). msort( U, S ) :- split(U, L, R), msort(L, SL), msort(R, SR), merge(SL, SR, S).

% split( LIST, L, R ) % Alternate elements of LIST in L and R split( [], [], [] ). split( [X], [X], [] ). split( [L,R|T], [L|LT], [R|RT] ) :- split( T, LT, RT ).

% merge( LS, RS, M ) % Assuming LS and RS are sorted, True if M is the sorted merge of the two merge( [], RS, RS ). merge( LS, [], LS ). merge( [L|LS], [R|RS], [L|T] ) :- L =< R, merge( LS, [R|RS], T). merge( [L|LS], [R|RS], [R|T] ) :- L > R, merge( [L|LS], RS, T).</lang>

Perl

<lang perl>sub mergesort {

       my @i = @_;
       return @i unless scalar @i > 1;

       my $m = int $#i / 2;
       my @left = &mergesort(@i[0..$m]);
       my @right = &mergesort(@i[$m+1..$#i]);

       return &merge(\@left,\@right);

}

sub merge {

       my($left,$right) = @_;
       my @r = ();

       push @r, shift @{(@$left[0] < @$right[0]?$left:$right)} while(@$left and @$right);
       push @r, shift @$left while(@$left);
       push @r, shift @$right while(@$right);

       return @r;

}</lang> Also note, the build-in function sort uses mergesort.

Python

Works with: Python version 2.6+

<lang python>from heapq import merge

def merge_sort(m):

   if len(m) <= 1:
       return m

   middle = len(m) / 2
   left = m[:middle]
   right = m[middle:]

   left = merge_sort(left)
   right = merge_sort(right)
   return list(merge(left, right))</lang>

Pre-2.6, merge() could be implemented like this: <lang python>def merge(left, right):

   result = []

   while left and right:
       # change the direction of this comparison to change the direction of the sort
       if left[0] <= right[0]:
           result.append(left.pop(0))
       else:
           result.append(right.pop(0))

   if left:
       result.extend(left)
   if right:
       result.extend(right)
   return result</lang>

R

<lang r>mergesort <- function(m) {

  merge_ <- function(left, right)
  {
     result <- c()
     while(length(left) > 0 && length(right) > 0)
     {
        if(left[1] <= right[1])
        {
           result <- c(result, left[1])
           left <- left[-1]
        } else
        {
           result <- c(result, right[1])
           right <- right[-1]
        }         
     }
     if(length(left) > 0) result <- c(result, left)
     if(length(right) > 0) result <- c(result, right)
     result
  }
  
  len <- length(m)
  if(len <= 1) m else
  {
     middle <- length(m) / 2
     left <- m[1:floor(middle)]
     right <- m[floor(middle+1):len]
     left <- mergesort(left)
     right <- mergesort(right)
     if(left[length(left)] <= right[1])
     {
        c(left, right)
     } else
     {
        merge_(left, right)
     } 
  }

} mergesort(c(4, 65, 2, -31, 0, 99, 83, 782, 1)) # -31 0 1 2 4 65 83 99 782</lang>

Ruby

<lang ruby>def merge_sort(m)

   if m.length <= 1
       return m
   end

   middle = m.length / 2
   left = m[0,middle]
   right = m[middle..-1]

   left = merge_sort(left)
   right = merge_sort(right)
   merge(left, right)

end

def merge(left, right)

   result = []

   until left.empty? || right.empty?
       # change the direction of this comparison to change the direction of the sort
       if left.first <= right.first
           result << left.shift
       else
           result << right.shift
       end
   end

   unless left.empty?
       result += left
   end
   unless right.empty?
       result += right
   end
   result

end</lang>

Here's a version that monkey patches the Array class, with an example that demonstrates it's a stable sort <lang ruby>class Array

 def mergesort(&comparitor)
   if length <= 1
     self
   else
     unless comparitor
       comparitor = lambda {|a, b| a <=> b}
     end
     middle = length / 2
     left  = self[0,  middle].mergesort(&comparitor)
     right = self[middle..-1].mergesort(&comparitor)
     merge(left, right, comparitor) 
   end
 end
 
 protected 
 def merge(left, right, comparitor)
   if left.empty?
     right
   elsif right.empty?
     left
   elsif comparitor.call(left.first, right.first) <= 0
     [left.first] + merge(left[1..-1], right, comparitor)
   else
     [right.first] + merge(left, right[1..-1], comparitor)
   end
 end

end

ary = [7,6,5,9,8,4,3,1,2,0] ary.mergesort # => [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] ary.mergesort {|a, b| b <=> a} # => [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

ary = [["UK", "London"], ["US", "New York"], ["US", "Birmingham"], ["UK", "Birmingham"]] ary.mergesort {|a, b| a[1] <=> b[1]}

=> [["US", "Birmingham"], ["UK", "Birmingham"], ["UK", "London"], ["US", "New York"]]</lang>

Scheme

<lang scheme>(define (merge-sort l gt?)

    (letrec
        (
         (merge
          (lambda (left right)
            (cond
              ((null? left) right)
              ((null? right) left)
              ((gt? (car left) (car right))
               (cons (car right) (merge left (cdr right))))
              (else
               (cons (car left) (merge (cdr left) right))))))

         (take
          (lambda (l num)
            (if (zero? num)
                (list)
                (cons (car l) (take (cdr l) (- num 1))))))

         (half (quotient (length l) 2)))

      (if (zero? half)
          l
          (merge
           (merge-sort (take      l half) gt?)
           (merge-sort (list-tail l half) gt?)))))</lang>

(merge-sort '(1 3 5 7 9 8 6 4 2) >)

Seed7

<lang seed7>const proc: mergeSort2 (inout array elemType: arr, in integer: lo, in integer: hi, inout array elemType: scratch) is func

 local
   var integer: mid is 0;
   var integer: k is 0;
   var integer: t_lo is 0;
   var integer: t_hi is 0;
 begin
   if lo < hi then
     mid := (lo + hi) div 2;
     mergeSort2(arr, lo, mid, scratch);
     mergeSort2(arr, succ(mid), hi, scratch);
     t_lo := lo;
     t_hi := succ(mid);
     for k range lo to hi do
       if t_lo <= mid and (t_hi > hi or arr[t_lo] < arr[t_hi]) then
         scratch[k] := arr[t_lo];
         incr(t_lo);
       else
         scratch[k] := arr[t_hi];
         incr(t_hi);
       end if;
     end for;
     for k range lo to hi do
       arr[k] := scratch[k];
     end for;
   end if;
 end func;

const proc: mergeSort2 (inout array elemType: arr) is func

 local
   var array elemType: scratch is 0 times elemType.value;
 begin
   scratch := length(arr) times elemType.value;
   mergeSort2(arr, 1, length(arr), scratch);
 end func;</lang>

Original source: [2]

Standard ML

<lang sml>fun merge cmp ([], ys) = ys

 | merge cmp (xs, []) = xs
 | merge cmp (xs as x::xs', ys as y::ys') =
     case cmp (x, y) of GREATER => y :: merge cmp (xs, ys')
                      | _       => x :: merge cmp (xs', ys)

fun merge_sort cmp [] = []

 | merge_sort cmp [x] = [x]
 | merge_sort cmp xs = let
     val ys = List.take (xs, length xs div 2)
     val zs = List.drop (xs, length xs div 2)
   in
     merge cmp (merge_sort cmp ys, merge_sort cmp zs)
   end

merge_sort Int.compare [8,6,4,2,1,3,5,7,9]</lang>

Tcl

<lang tcl>package require Tcl 8.5

proc mergesort m {

   set len [llength $m]
   if {$len <= 1} {
       return $m
   }
   set middle [expr {$len / 2}]
   set left [lrange $m 0 [expr {$middle - 1}]]
   set right [lrange $m $middle end]
   return [merge [mergesort $left] [mergesort $right]]

}

proc merge {left right} {

   set result [list]
   while {[set lleft [llength $left]] > 0 && [set lright [llength $right]] > 0} {
       if {[lindex $left 0] <= [lindex $right 0]} {
           set left [lassign $left value]
       } else {
           set right [lassign $right value]
       }
       lappend result $value
   }
   if {$lleft > 0} {
       lappend result {*}$left
   }
   if {$lright > 0} {
       set result [concat $result $right] ;# another way append elements
   }
   return $result

}

puts [mergesort {8 6 4 2 1 3 5 7 9}] ;# => 1 2 3 4 5 6 7 8 9</lang> Also note that Tcl's built-in lsort command uses the mergesort algorithm.

UnixPipes

Works with: Zsh

<lang bash>split() {

  (while read a b ; do
      echo $a > $1 ; echo $b > $2
  done)

}

mergesort() {

xargs -n 2 | (read a b; test -n "$b" && (
    lc="1.$1" ; gc="2.$1"
    (echo $a $b;cat)|split >(mergesort $lc >$lc) >( mergesort $gc >$gc)
    sort -m $lc $gc
    rm -f $lc $gc;
) || echo $a)

}

cat to.sort | mergesort</lang>

Ursala

<lang Ursala>#import std

mergesort "p" = @iNCS :-0 ~&B^?a\~&YaO "p"?abh/~&alh2faltPrXPRC ~&arh2falrtPXPRC

show+

example = mergesort(lleq) <'zoh','zpb','hhh','egi','bff','cii','yid'></lang>

output:

bff
cii
egi
hhh
yid
zoh
zpb

The mergesort function could also have been defined using the built in sorting operator, -<, because the same algorithm is used. <lang Ursala>mergesort "p" = "p"-<</lang>

V

merge uses the helper mergei to merge two lists. The mergei takes a stack of the form [mergedlist] [list1] [list2] it then extracts one element from list2, splits the list1 with it, joins the older merged list, first part of list1 and the element that was used for splitting (taken from list2) into the new merged list. the new list1 is the second part of the split on older list1. new list2 is the list remaining after the element e2 was extracted from it.

<lang v>[merge

  [mergei
      uncons [swap [>] split] dip
      [[*m] e2 [*a1] b1 a2 : [*m *a1 e2] b1 a2] view].
   
  [a b : [] a b] view
  [size zero?] [pop concat]
      [mergei]
  tailrec].

[msort

 [splitat [arr a : [arr a take arr a drop]] view i].
 [splitarr dup size 2 / >int splitat].

 [small?] []
   [splitarr]
   [merge]
 binrec].</lang>

[8 7 6 5 4 2 1 3 9] msort puts