Compare sorting algorithms' performance

From Rosetta Code
Task
Compare sorting algorithms' performance
You are encouraged to solve this task according to the task description, using any language you may know.

Measure a relative performance of sorting algorithms implementations.

Plot execution time vs. input sequence length dependencies for various implementation of sorting algorithm and different input sequence types (example figures).

Consider three type of input sequences:

  • ones: sequence of all 1's. Example: {1, 1, 1, 1, 1}
  • range: ascending sequence, i.e. already sorted. Example: {1, 2, 3, 10, 15}
  • shuffled range: sequence with elements randomly distributed. Example: {5, 3, 9, 6, 8}

Consider at least two different sorting functions (different algorithms or/and different implementation of the same algorithm). For example, consider Bubble Sort, Insertion sort, Quicksort or/and implementations of Quicksort with different pivot selection mechanisms. Where possible, use existing implementations.

Preliminary subtask:

General steps:

  1. Define sorting routines to be considered.
  2. Define appropriate sequence generators and write timings.
  3. Plot timings.
  4. What conclusions about relative performance of the sorting routines could be made based on the plots?

AutoHotkey

This example is under development. It was marked thus on 16/1/2010. Please help complete the example.

<lang ahk>; BUGGY - FIX

  1. Persistent
  2. SingleInstance OFF

SetBatchLines, -1 SortMethods := "Bogo,Bubble,Cocktail,Counting,Gnome,Insertion,Merge,Permutation,Quick,Selection,Shell,BuiltIn" Gui, Add, Edit, vInput, numbers,separated,by,commas,without,spaces,afterwards Loop, PARSE, SortMethods, `, Gui, Add, CheckBox, v%A_LoopField%, %A_LoopField% Sort Gui, Add, Button, gTest, Test! Gui, Show,, SortTest! Return Test: SplashTextOn,,, Test Commencing Sleep 2500 SplashTextOff Gui, +OwnDialogs Gui, Submit, NoHide Loop, PARSE, SortMethods, `, { If (%A_LoopField%) { DllCall("QueryPerformanceCounter", "Int64 *", %A_LoopField%Begin) %A_LoopField%Out := %A_LoopField%Sort(Input) DllCall("QueryPerformanceCounter", "Int64 *", %A_LoopField%Time) %A_LoopField%End := %A_LoopField%Begin + %A_LoopField%Time %A_LoopField%Time -= %A_LoopField%Begin } } Time := "" Loop, PARSE, SortMethods, `, If (%A_LoopField%) Time .= A_LoopField . " Sort: " . %A_LoopField%Time . "`t`t" . %A_LoopField%Out . "`r`n" MsgBox,, Results!, %Time% Return


Sorting funtions (Bogo, Bubble, Cocktail, Counting, Gnome, Insertion, Merge, Permutation, Quick, Selection, Shell, BuiltIn)

BogoSort(var) { sorted := 1 Loop, Parse, var { current := A_LoopField rest := SubStr(var, A_Index) Loop, Parse, rest { If (current > A_LoopField) sorted := 0 } } While !sorted { sorted := 1 Loop, Parse, var, `, { current := A_LoopField rest := SubStr(var, A_Index) Loop, Parse, rest, `, { If (current > A_LoopField) sorted := 0 } }

Sort, var, D`, Random } Return var }

BubbleSort(var) { StringSplit, array, var, `, hasChanged = 1 size := array0 While hasChanged { hasChanged = 0 Loop, % (size - 1) { i := array%A_Index% aj := A_Index + 1 j := array%aj% If (j < i) { temp := array%A_Index% array%A_Index% := array%aj% array%aj% := temp hasChanged = 1 } } } Loop, % size sorted .= "," . array%A_Index% Return substr(sorted,2) }

CocktailSort(var) { StringSplit array, var, `, i0 := 1, i1 := array0 Loop { Changed = Loop % i1-- -i0 { j := i0+A_Index, i := j-1 If (array%j% < array%i%) t := array%i%, array%i% := array%j%, array%j% := t ,Changed = 1 } IfEqual Changed,, Break Loop % i1-i0++ { i := i1-A_Index, j := i+1 If (array%j% < array%i%) t := array%i%, array%i% := array%j%, array%j% := t ,Changed = 1 } IfEqual Changed,, Break } Loop % array0 sorted .= "," . array%A_Index% Return SubStr(sorted,2) }

CountingSort(var) { max := min := substr(var, 1, instr(var, ",")) Loop, parse, var, `, { If (A_LoopField > max) max := A_LoopField

Else If (A_LoopField < min) min := A_LoopField } Loop % max-min+1 i := A_Index-1, a%i% := 0 Loop, Parse, var, `, i := A_LoopField-min, a%i%++ Loop % max-min+1 { i := A_Index-1, v := i+min Loop % a%i% t .= "," v } Return SubStr(t,2) }

GnomeSort(var) { StringSplit, a, var, `, i := 2, j := 3 While i <= a0 { u := i-1 If (a%u% < a%i%) i := j, j := j+1 Else { t := a%u%, a%u% := a%i%, a%i% := t If (--i = 1) i := j, j++ } } Loop % a0 sorted .= "," . a%A_Index% Return SubStr(sorted,2) }

InsertionSort(var) { StringSplit, a, var, `, Loop % a0-1 { i := A_Index+1, v := a%i%, j := i-1 While j>0 and a%j%>v u := j+1, a%u% := a%j%, j-- u := j+1, a%u% := v } Loop % a0 sorted .= "," . a%A_Index% Return SubStr(sorted,2) }


MergeSort(var) { StringReplace, t, var, `,,, UseErrorLevel L := ((t = "") ? 0 : ErrorLevel+1) If (2 > L) Return var StringGetPos, p, var, `,, % "L" L//2 list0 := MergeSort(SubStr(var,1,p)) list1 := MergeSort(SubStr(var,p+2)) If (list0 = "") Return list1 Else If (list1 = "") Return list0 list := list0 i0 := (p0 := InStr(list,",",0,i:=p0+1)) ? SubStr(list,i,p0-i) : SubStr(list,i) list := list1 i1 := (p1 := InStr(list,",",0,i:=p1+1)) ? SubStr(list,i,p1-i) : SubStr(list,i) Loop { i := i0>i1 list .= "," i%i% If (p%i%) { list := list%i% i%i% := (p%i% := InStr(list,",",0,i:=p%i%+1)) ? SubStr(list,i,p%i%-i) : SubStr(list,i) } Else { i ^= 1 rtv := SubStr(list "," i%i% (p%i% ? "," SubStr(list%i%,p%i%+1) : ""), 2) } } Return rtv }

PermutationSort(var) { static a:="a",v:="v" StringSplit, a, var, `, v0 := a0 Loop %v0% v%A_Index% := A_Index unsorted := 0 Loop % %a%0-1 { i := %v%%A_Index%, j := A_Index+1, j := %v%%j% If (%a%%i% > %a%%j%) unSorted := 1 } While unSorted { i := %v%0, i1 := i-1 While %v%%i1% >= %v%%i% { --i, --i1 IfLess i1,1, Return 1 } j := %v%0 While %v%%j% <= %v%%i1% --j t := %v%%i1%, %v%%i1% := %v%%j%, %v%%j% := t, j := %v%0 While i < j t := %v%%i%, %v%%i% := %v%%j%, %v%%j% := t, ++i, --j unsorted := 0 Loop % %a%0-1 { i := %v%%A_Index%, j := A_Index+1, j := %v%%j% If (%a%%i% > %a%%j%) unSorted := 1 } } Loop % a0 i := v%A_Index%, sorted .= "," . a%i% Return SubStr(sorted,2) }

QuickSort(var) { StringSplit, list, var, `, If (list0 <= 1) Return list pivot := list1 Loop, Parse, var, `, { If (A_LoopField < pivot) less .= "," . A_LoopField Else If (A_LoopField > pivot) more .= "," . A_LoopField Else pivotlist .= "," . A_LoopField } less := QuickSort(substr(less,2)) more := QuickSort(substr(more,2)) Return substr(less,2) . pivotList . more }

SelectionSort(var) { StringSplit, a, var, `, Loop % a0-1 { i := A_Index, mn := a%i%, j := m := i Loop % a0-i { j++ If (a%j% < mn) mn := a%j%, m := j } t := a%i%, a%i% := a%m%, a%m% := t } Loop % a0 sorted .= "," . a%A_Index% Return SubStr(sorted,2) }

ShellSort(var) { StringSplit, a, var, `, inc := a0 While inc:=round(inc/2.2) Loop % a0-inc { i := A_Index+inc, t := a%i%, j := i, k := j-inc While j > inc && a%k% > t a%j% := a%k%, j := k, k -= inc a%j% := t } Loop % a0 s .= "," . a%A_Index% Return SubStr(s,2) }

BuiltInSort(var) { Sort, var, N D`, Return var }</lang>

BBC BASIC

<lang bbcbasic> HIMEM = PAGE + 2000000

     INSTALL @lib$+"SORTLIB"
     INSTALL @lib$+"TIMERLIB"
     Sort% = FN_sortinit(0,0)
     Timer% = FN_ontimer(1000, PROCtimer, 1)
     
     PRINT "Array size:", 1000, 10000, 100000
     @% = &2020A
     
     FOR patt% = 1 TO 4
       CASE patt% OF
         WHEN 1: PRINT '"Data set to all ones:"
         WHEN 2: PRINT '"Data ascending sequence:"
         WHEN 3: PRINT '"Data randomly shuffled:"
         WHEN 4: PRINT '"Data descending sequence:"
       ENDCASE
       
       FOR type% = 1 TO 6
         CASE type% OF
           WHEN 1: PRINT "Internal (lib)";
           WHEN 2: PRINT "Quicksort   ";
           WHEN 3: PRINT "Radix sort  ";
           WHEN 4: PRINT "Shellsort   ";
           WHEN 5: PRINT "Bubblesort  ";
           WHEN 6: PRINT "Insertion sort";
         ENDCASE
         
         FOR power% = 3 TO 5
           PROCsorttest(patt%, type%, 10^power%)
         NEXT
         PRINT
         
       NEXT type%
     NEXT patt%
     END
     
     DEF PROCsorttest(patt%, type%, size%)
     LOCAL a%(), C%, I%
     DIM a%(size%-1)
     
     CASE patt% OF
       WHEN 1: a%() = 1 : a%() = 1
       WHEN 2: FOR I% = 0 TO size%-1 : a%(I%) = I% : NEXT
       WHEN 3: FOR I% = 0 TO size%-1 : a%(I%) = I% : NEXT
         C% = RND(-123456) : REM Seed
         FOR I% = size% TO 2 STEP -1 : SWAP a%(I%-1),a%(RND(I%)-1) : NEXT
       WHEN 4: FOR I% = 0 TO size%-1 : a%(I%) = size%-1-I% : NEXT
     ENDCASE
     
     Start% = TIME
     ON ERROR LOCAL PRINT , "   >100.00" ; : ENDPROC
     CASE type% OF
       WHEN 1: C% = size% : CALL Sort%, a%(0)
       WHEN 2: PROCquicksort(a%(), 0, size%)
       WHEN 3: PROCradixsort(a%(), size%, 10)
       WHEN 4: PROCshellsort(a%(), size%)
       WHEN 5: PROCbubblesort(a%(), size%)
       WHEN 6: PROCinsertionsort(a%(), size%)
     ENDCASE
     PRINT , (TIME - Start%)/100;
     
     FOR I% = 0 TO size%-2
       IF a%(I%) > a%(I%+1) ERROR 100, "Sort failed!"
     NEXT
     ENDPROC
     
     DEF PROCtimer
     Start% += 0
     IF (TIME - Start%) > 10000 ERROR 111, ""
     ENDPROC
     
     DEF PROCbubblesort(a%(), n%)
     LOCAL i%, l%
     REPEAT
       l% = 0
       FOR i% = 1 TO n%-1
         IF a%(i%-1) > a%(i%) THEN
           SWAP a%(i%-1),a%(i%)
           l% = i%
         ENDIF
       NEXT
       n% = l%
     UNTIL l% = 0
     ENDPROC
     
     DEF PROCinsertionsort(a%(), n%)
     LOCAL i%, j%, t%
     FOR i% = 1 TO n%-1
       t% = a%(i%)
       j% = i%
       WHILE j%>0 AND t%<a%(ABS(j%-1))
         a%(j%) = a%(j%-1)
         j% -= 1
       ENDWHILE
       a%(j%) = t%
     NEXT
     ENDPROC
     
     DEF PROCquicksort(a%(), s%, n%)
     LOCAL l%, p%, r%, t%
     IF n% < 2 THEN ENDPROC
     t% = s% + n% - 1
     l% = s%
     r% = t%
     p% = a%((l% + r%) DIV 2)
     REPEAT
       WHILE a%(l%) < p% l% += 1 : ENDWHILE
       WHILE a%(r%) > p% r% -= 1 : ENDWHILE
       IF l% <= r% THEN
         SWAP a%(l%), a%(r%)
         l% += 1
         r% -= 1
       ENDIF
     UNTIL l% > r%
     IF s% < r% PROCquicksort(a%(), s%, r% - s% + 1)
     IF l% < t% PROCquicksort(a%(), l%, t% - l% + 1 )
     ENDPROC
     
     DEF PROCshellsort(a%(), n%)
     LOCAL h%, i%, j%, k%
     h% = n%
     WHILE h%
       IF h% = 2 h% = 1 ELSE h% DIV= 2.2
       FOR i% = h% TO n% - 1
         k% = a%(i%)
         j% = i%
         WHILE j% >= h% AND k% < a%(ABS(j% - h%))
           a%(j%) = a%(j% - h%)
           j% -= h%
         ENDWHILE
         a%(j%) = k%
       NEXT
     ENDWHILE
     ENDPROC
     
     DEF PROCradixsort(a%(), n%, r%)
     LOCAL d%, e%, i%, l%, m%, b%(), bucket%()
     DIM b%(DIM(a%(),1)), bucket%(r%-1)
     FOR i% = 0 TO n%-1
       IF a%(i%) < l% l% = a%(i%)
       IF a%(i%) > m% m% = a%(i%)
     NEXT
     a%() -= l%
     m% -= l%
     e% = 1
     WHILE m% DIV e%
       bucket%() = 0
       FOR i% = 0 TO n%-1
         bucket%(a%(i%) DIV e% MOD r%) += 1
       NEXT
       FOR i% = 1 TO r%-1
         bucket%(i%) += bucket%(i% - 1)
       NEXT
       FOR i% = n%-1 TO 0 STEP -1
         d% = a%(i%) DIV e% MOD r%
         bucket%(d%) -= 1
         b%(bucket%(d%)) = a%(i%)
       NEXT
       a%() = b%()
       e% *= r%
     ENDWHILE
     a%() += l%
     ENDPROC</lang>

Output:

Array size:               1000     10000    100000

Data set to all ones:
Internal (lib)            0.00      0.01      0.03
Quicksort                 0.02      0.27      3.18
Radix sort                0.01      0.05      0.53
Shellsort                 0.03      0.36      4.44
Bubblesort                0.00      0.01      0.09
Insertion sort            0.00      0.02      0.26

Data ascending sequence:
Internal (lib)            0.00      0.00      0.02
Quicksort                 0.02      0.15      1.82
Radix sort                0.02      0.18      2.10
Shellsort                 0.03      0.37      4.44
Bubblesort                0.00      0.01      0.09
Insertion sort            0.01      0.03      0.27

Data randomly shuffled:
Internal (lib)            0.00      0.02      0.44
Quicksort                 0.02      0.26      3.17
Radix sort                0.02      0.17      2.08
Shellsort                 0.04      0.73     11.57
Bubblesort                0.69     69.70   >100.00
Insertion sort            0.55     55.54   >100.00

Data descending sequence:
Internal (lib)            0.00      0.01      0.10
Quicksort                 0.01      0.15      1.90
Radix sort                0.02      0.17      2.06
Shellsort                 0.03      0.50      6.39
Bubblesort                0.95     94.77   >100.00
Insertion sort            1.11   >100.00   >100.00

C

(The reference example is Python)

Examples of sorting routines

We can use the codes in the category Sorting Algorithms; since these codes deal with integer arrays, we should change them a little. To accomplish this task I've also renamed them more consistently algorithm_sort; so we have e.g. bubble_sort, quick_sort and so on.

Sequence generators

csequence.h <lang c>#ifndef _CSEQUENCE_H

  1. define _CSEQUENCE_H
  2. include <stdlib.h>

void setfillconst(double c); void fillwithconst(double *v, int n); void fillwithrrange(double *v, int n); void shuffledrange(double *v, int n);

  1. endif</lang>

csequence.c <lang c>#include "csequence.h"

static double fill_constant;

void setfillconst(double c) {

 fill_constant = c;

}

void fillwithconst(double *v, int n) {

 while( --n >= 0 ) v[n] = fill_constant;

}

void fillwithrrange(double *v, int n) {

 int on = n;
 while( --on >= 0 ) v[on] = n - on;

}

void shuffledrange(double *v, int n) {

 int on = n;
 fillwithrrange(v, n);
 while( --n >= 0 ) {
   int r = rand() % on;
   double t = v[n];
   v[n] = v[r];
   v[r] = t;
 }

}</lang>

Write timings

We shall use the code from Query Performance. Since the action is a generic function with a single argument, we need wrappers which encapsule each sorting algorithms we want to test.

writetimings.h <lang c>#ifndef _WRITETIMINGS_H

  1. define _WRITETIMINGS_H
  2. include "sorts.h"
  3. include "csequence.h"
  4. include "timeit.h"

/* we repeat the same MEANREPEAT times, and get the mean; this *should*

  give "better" results ... */
  1. define MEANREPEAT 10.0
  2. define BUFLEN 128
  3. define MAKEACTION(ALGO) \
 int action_ ## ALGO (int size) {				\
   ALGO ## _sort(tobesorted, size);				\
   return 0; }
  1. define MAKEPIECE(N) { #N , action_ ## N }

int action_bubble(int size); int action_shell(int size); int action_quick(int size); int action_insertion(int size); int action_merge(int size); int doublecompare( const void *a, const void *b ); int action_qsort(int size); int get_the_longest(int *a);

struct testpiece {

 const char *name;
 int (*action)(int);

}; typedef struct testpiece testpiece_t;

struct seqdef {

 const char *name;
 void (*seqcreator)(double *, int);

}; typedef struct seqdef seqdef_t;

  1. endif</lang>

writetimings.c <lang c>#include <stdio.h>

  1. include <stdlib.h>
  1. include "writetimings.h"

double *tobesorted = NULL; const char *bname = "data_"; const char *filetempl = "%s%s_%s.dat"; int datlengths[] = {100, 200, 300, 500, 1000, 5000, 10000, 50000, 100000};

testpiece_t testpieces[] = { // MAKEPIECE(bubble),

 MAKEPIECE(shell),
 MAKEPIECE(merge),
 MAKEPIECE(insertion),
 MAKEPIECE(quick),
 MAKEPIECE(qsort),
 { NULL, NULL }

};

seqdef_t seqdefs[] = {

 { "c1", fillwithconst },
 { "rr", fillwithrrange },
 { "sr", shuffledrange },
 { NULL, NULL }

};


MAKEACTION(bubble) MAKEACTION(insertion) MAKEACTION(quick) MAKEACTION(shell)

int action_merge(int size) {

 double *res = merge_sort(tobesorted, size);
 free(res); /* unluckly this affects performance */
 return 0;

}

int doublecompare( const void *a, const void *b ) {

 if ( *(double *)a == *(double *)b ) return 0;
 if ( *(double *)a < *(double *)b ) return -1; else return 1;

} int action_qsort(int size) {

 qsort(tobesorted, size, sizeof(double), doublecompare);
 return 0;

}

int get_the_longest(int *a) {

 int r = *a;
 while( *a > 0 ) {
   if ( *a > r ) r = *a;
   a++;
 }
 return r;

}


int main() {

 int i, j, k, z, lenmax;
 char buf[BUFLEN];
 FILE *out;
 double thetime;
 lenmax = get_the_longest(datlengths);
 printf("Bigger data set has %d elements\n", lenmax);
 tobesorted = malloc(sizeof(double)*lenmax);
 if ( tobesorted == NULL ) return 1;
 setfillconst(1.0);
 for(i=0; testpieces[i].name != NULL; i++) {
   for(j=0; seqdefs[j].name != NULL; j++) {
     snprintf(buf, BUFLEN, filetempl, bname, testpieces[i].name,

seqdefs[j].name);

     out = fopen(buf, "w");
     if ( out == NULL ) goto severe;
     printf("Producing data for sort '%s', created data type '%s'\n", 

testpieces[i].name, seqdefs[j].name);

     for(k=0; datlengths[k] > 0; k++) {

printf("\tNumber of elements: %d\n", datlengths[k]); thetime = 0.0; seqdefs[j].seqcreator(tobesorted, datlengths[k]); fprintf(out, "%d ", datlengths[k]); for(z=0; z < MEANREPEAT; z++) { thetime += time_it(testpieces[i].action, datlengths[k]); } thetime /= MEANREPEAT; fprintf(out, "%.8lf\n", thetime);

     }
     fclose(out);
   }
 }

severe:

 free(tobesorted);
 return 0;

}</lang> This code produce several files with the following naming convention:

  • data_algorithm_sequence.dat

Where algorithm is one of the following: insertion, merge, shell, quick, qsort (the quicksort in the libc library) (bubble sort became too slow for longest sequences). Sequence is c1 (constant value 1), rr (reverse range), sr (shufled range). These data can be easily plotted by Gnuplot, which can also do fitting. Instead we do our fitting using Polynomial Fitting. <lang c>#include <stdio.h>

  1. include <stdlib.h>
  2. include <math.h>
  1. include "polifitgsl.h"
  1. define MAXNUMOFDATA 100

double x[MAXNUMOFDATA], y[MAXNUMOFDATA]; double cf[2];

int main() {

 int i, nod;
 int r;
 for(i=0; i < MAXNUMOFDATA; i++)
 {
   r = scanf("%lf %lf\n", &x[i], &y[i]);
   if ( (r == EOF) || (r < 2) ) break;
   x[i] = log10(x[i]);
   y[i] = log10(y[i]);
 }
 nod = i;
 polynomialfit(nod, 2, x, y, cf);
 printf("C0 = %lf\nC1 = %lf\n", cf[0], cf[1]);
 return 0;

}</lang> Here we search for a fit with C0+C1x "in the log scale", since we supposed the data, once plotted on a logscale graph, can be fitted by a line. We can use e.g. a shell one-liner to produce the parameters for the line for each data file previously output. In particular I've used the following

for el in *.dat ; do fitdata <$el >${el%.dat}.fd ; done

Plot timings and Figures

Once we have all the ".dat" files and associated ".fd", we can use Gnuplot to draw our data and think about conclusions (we could also use the idea in Plot x, y arrays, but it needs too much enhancements to be usable for this analysis). Here an example of such a draw for a single file (using Gnuplot)

gnuplot> f(x) = C0 + C1*x
gnuplot> set logscale xy
gnuplot> load 'data_quick_sr_u.fd'
gnuplot> set xrange [100:100000]
gnuplot> set key left
gnuplot> plot 10**f(log10(x)), 'data_quick_sr_u.dat'

(The _u.dat are produced by a modified version of the code in order to write timings in microseconds instead of seconds) We can easily write another shell script/one-liner to produce a single file driver for Gnuplot in order to produce all the graph we can be interested in. These graphs show that the linear (in log scale) fit do not always fit the data... I haven't repeated the tests; the problems are when the sequence length becomes huge; for some algorithm that uses extra memory (like implementation of the merge sort), this could depend on the allocation of the needed memory. Another extraneous factor could be system load (the CLOCK_MONOTONIC used by the timing function is system wide rather than per process, so counting time spent in other processes too?). The "most stable" algorithms seem to be quick sort (but not qsort, which indeed is just the libc quick sort, here not plotted!) and shell sort (except for reversed range).

Conclusion: we should repeat the tests...


D

<lang d> import std.algorithm; import std.container; import std.datetime; import std.random; import std.stdio;

immutable size_t arrSize = 10000; immutable int[arrSize] allOnes = 1; immutable int[arrSize] sortedData = genSorted(); static int[arrSize] randomData;

int[arrSize] genSorted() {

   int[arrSize] data;
   for (int i = 0; i < data.length; i++)
   {
       data[i] = i+1;
   }
   return data;

}

int[arrSize] genRandom() {

   int[arrSize] data;
   Random gen = Random(unpredictableSeed);
   for (int i = 0; i < data.length; i++)
   {
       data[i] = gen.front;
       gen.popFront;
   }
   return data;

}

void main() {

   randomData = genRandom();
   writeln("Testing against all ones.");
   auto r = benchmark!(allOnesBubble, allOnesInsertion, allOnesBuiltIn,
                       allOnesHeap)(100);
   writefln("Bubble Sort:\t%10d", r[0].to!("usecs", int));
   writefln("Insertion Sort:\t%10d", r[1].to!("usecs", int));
   writefln("Built-in Sort:\t%10d", r[2].to!("usecs", int));
   writefln("Heap Sort:\t%10d", r[3].to!("usecs", int));
   writeln("\nTesting against sorted data.");
   r = benchmark!(sortedBubble, sortedInsertion, sortedBuiltIn, sortedHeap)(100);
   writefln("Bubble Sort:\t%10d", r[0].to!("usecs", int));
   writefln("Insertion Sort:\t%10d", r[1].to!("usecs", int));
   writefln("Built-in Sort:\t%10d", r[2].to!("usecs", int));
   writefln("Heap Sort:\t%10d", r[3].to!("usecs", int));
   writeln("\nTesting against random data.");
   r = benchmark!(randomBubble, randomInsertion, randomBuiltIn, randomHeap)(100);
   writefln("Bubble Sort:\t%10d", r[0].to!("usecs", int));
   writefln("Insertion Sort:\t%10d", r[1].to!("usecs", int));
   writefln("Built-in Sort:\t%10d", r[2].to!("usecs", int));
   writefln("Heap Sort:\t%10d", r[3].to!("usecs", int));

}

void bubbleSort(T)(T[] list) {

   for (int i = list.length-1; i > 0; i--)
   {
       for (int j = i -1; j >= 0; j--)
       {
           if (list[i] < list[j])
           {
               T temp = list[j];
               list[j] = list[i];
               list[i] = temp;
           }
       }
   }

}

void allOnesBubble() {

   int[] data = allOnes.dup;
   bubbleSort(data);

}

void sortedBubble() {

   int[] data = sortedData.dup;
   bubbleSort(data);

}

void randomBubble() {

   int[] data = randomData.dup;
   bubbleSort(data);

}

unittest {

   int[5] list = [2,5,8,3,1];
   bubbleSort(list);
   assert(isSorted(list.dup));

}

void insertionSort(T)(T[] list) {

   foreach (i; 1..list.length)
   {
       T currElem = list[i];
       size_t j = i;
       for (; j > 0 && currElem < list[j-1]; j--)
       {
           list[j] = list[j-1];
       }
       list[j] = currElem;
   }

}

void allOnesInsertion() {

   int[] data = allOnes.dup;
   insertionSort(data);

}

void sortedInsertion() {

   int[] data = sortedData.dup;
   insertionSort(data);

}

void randomInsertion() {

   int[] data = randomData.dup;
   insertionSort(data);

}

unittest {

   int[5] list = [2,5,8,3,1];
   insertionSort(list);
   assert(isSorted(list.dup));

}

void allOnesBuiltIn() {

   int[] data = allOnes.dup;
   sort!("a < b")(data);

}

void sortedBuiltIn() {

   int[] data = sortedData.dup;
   sort!("a < b")(data);

}

void randomBuiltIn() {

   int[] data = randomData.dup;
   sort!("a < b")(data);

}

void allOnesHeap() {

   int[] data = allOnes.dup;
   auto heap = BinaryHeap!(int[], "a > b")(data);

}

void sortedHeap() {

   int[] data = sortedData.dup;
   auto heap = BinaryHeap!(int[], "a > b")(data);

}

void randomHeap() {

   int[] data = randomData.dup;
   auto heap = BinaryHeap!(int[], "a > b")(data);

} </lang>

Timings from my VM with 10,000 elements in each array:

Testing against all ones.
Bubble Sort:      15746446
Insertion Sort:       6415
Built-in Sort:      110079
Heap Sort:           23997

Testing against sorted data.
Bubble Sort:      14770093
Insertion Sort:       6889
Built-in Sort:       55631
Heap Sort:           21959

Testing against random data.
Bubble Sort:      40575526
Insertion Sort:    9022029
Built-in Sort:      177563
Heap Sort:           48902

J

<lang j> NB. extracts from other rosetta code projects ts=: 6!:2, 7!:2@] radix =: 3 : 0 256 radix y

a=. #{. z =. x #.^:_1 y e=. (-a) {."0 b =. i.x x#.1{::(<:@[;([: ; (b, {"1) <@}./. e,]))&>/^:a [ z;~a-1 NB. , ([: ; (b, {:"1) <@(}:"1@:}.)/. e,])^:(#{.z) y,.z ) bubble=: (([ (<. , >.) {.@]) , }.@])/^:_ insertion=:((>: # ]) , [ , < #])/ sel=: 1 : 'x # [' quick=: 3 : 0

if.  1 >: #y do.  y
else.
 e=. y{~?#y
 (quick y <sel e),(y =sel e),quick y >sel e
end.

) gaps =: [: }: 1 (1+3*])^:(> {:)^:a:~ # insert =: (I.~ {. ]) , [ , ] }.~ I.~ gapinss =: #@] {. ,@|:@(] insert//.~ #@] $ i.@[) shell =: [: ; gapinss &.>/@(< ,~ ]&.>@gaps) builtin =: /:~


NB. characterization of the sorting algorithms.

sorts =: bubble`insertion`shell`quick`radix`builtin generators =: #&1`(i.@-)`(?.~) NB. data generators

round =: [: <. 1r2&+

ll =: (<_1 0)&{ NB. verb to extract lower left which holds ln data length lc =: (<_1 1)&{ NB. verb to fetch lower center which holds most recent time

NB. maximum_time characterize ln_start_size NB. characterize returns a rank 4 matrix with successive indexes for NB. algorithm, input arrangement, max number of tests in group, length time space characterize =: 4 : 0

 max_time =. x
 start =. 1 3{.<:y
 for_sort. sorts do.
   for_generator. generators do.                                           NB. limit time  and  paging prevention
     t =: }. (, (, [: ts 'sort@.0 (generator@.0)' , ":@round@^)@>:@ll) ^: ((lc < max_time"_) *. ll < 17"_) ^:_ start
     if. generator -: {.generators do.
       g =. ,:t
     else.
       g =. g,t
     end.
   end.
   if. sort -: {.sorts do.
     s =. ,:g
   else.
     s =. s,g
   end.
 end.

)

NB. character cell graphics

NB. From j phrases 10E. Approximation d3=: 1&,.@[ %.~ ] NB. a and b such that y is approx. a + b*x

NB. domain and range 0 to 14. D=:14

plot =: 1 : '(=/ round@(u&.(*&(D%<:y))))i.y' NB. function plot size points =: 4 : '1(<"1|:|.round y*D%~<:x)}0$~2#x' NB. size points x,:y

show =: [: |. [: '0'&~:@{:} ' ' ,: ":

plt =: 3 : 0 30 plt y NB. default size 30

n =. >:i.-# experiments =. <@(#~"1 (0&<)@{.)"2 y pts =. n +./ .*x&points@>experiments coef =. d3/@>experiments (_*pts) + n +./ .*1 0 2|:coef&(p."1) plot x ) </lang>

   a =: 1 characterize 5
   $a  NB. a has rank 4
6 3 13 3

   'l t s' =: |:a   NB. transpose moves length time space to leading dimension
   l =: |: <: l     NB. transpose restores order
   t =:	|: 12 +^. t NB. choose arbitrary time units so that ^. time is positive
   s =:	|: ^. s     NB. ln space

   NB. 6 groups of sort methods   with 3 arrangements of data    ---> exponentially increasing data lengths,
   6j2":t   NB. ln time         negative infinity indicates avoided experiment
  3.83  4.80  5.89  7.15  8.65 10.18 11.90 13.75    __    __    __    __    __
  8.77 10.90 13.13    __    __    __    __    __    __    __    __    __    __
  8.70 10.87 13.11    __    __    __    __    __    __    __    __    __    __

  3.91  5.43  7.13  8.90 10.76 12.73    __    __    __    __    __    __    __
  3.99  5.63  7.42  9.21 11.13 13.06    __    __    __    __    __    __    __
  4.14  5.72  7.58  9.37 11.30 13.26    __    __    __    __    __    __    __

  5.56  6.75  7.90  9.05 10.27 11.60 13.13    __    __    __    __    __    __
  5.61  6.88  8.05  9.40 10.84 12.48    __    __    __    __    __    __    __
  5.67  6.93  8.09  9.43 10.89 12.48    __    __    __    __    __    __    __

  1.95  1.99  2.19  2.46  2.98  3.66  4.63  5.54  6.60  7.61  8.70  9.81 10.99
  6.06  7.13  8.09  9.09 10.10 11.10 12.12    __    __    __    __    __    __
  6.09  7.17  8.10  9.11 10.11 11.12 12.14    __    __    __    __    __    __

  3.25  3.33  3.55  4.06  4.77  5.60  6.72  7.75  8.75 10.11 11.21 12.22    __
  3.37  3.87  4.19  4.72  5.53  6.49  7.84  9.29 10.71 11.78 12.88    __    __
  3.42  4.00  4.43  5.07  5.93  7.07  8.06  9.76 10.96 12.10    __    __    __

  0.38  0.38  0.58  1.02  1.68  2.68  3.45  4.42  5.42  6.51  7.64  8.86  9.87
  0.49  0.58  0.96  1.55  2.34  3.11  4.31  5.82  7.43  8.70  9.75 10.75 11.75
  1.40  2.01  2.88  3.87  4.92  5.92  7.05  8.18  9.45 10.91 12.01    __    __

This display is no less than a bar chart and, frankly, suffices but for the curve fit requirement.


   NB. algorithms: bubble 6,  insertion 5,  shell 4,  quick 3,  radix 2,  builtin 1
   NB. rows:  log time
   NB. cols:  log size

   NB. data is all 1
   show 30 plt l ,: & (0&{)"2 t
                        5   4 _                       2 2  
                          4                         2      
                      5 _   6                     2        
                    _   4 6                     2          
                      4 _                     _           3
                  5 _   6                   2           3  
                      6                   _       _   3   1
                _ 4                     2         3 3   1  
                _   _                 _         3     1    
              1   6                 2         _   _ 1      
            _                     2         3   1          
            _   _               2 _       _   _            
          4                   2         3   1              
        _ 5   6             2 _     3 _   _                
      4 _   _             2       3     1                  
    _                   _       3 _   _                    
      5   6           2       3     1                      
  4     _           _       3 _   1                        
4   _             2       3     1 _                        
    _ 6       2 _     3 3   1 1                            
  5 6       2       3   _ 1   _                            
          2 _     3 _   1                                  
5 6 _   _       3     1 _                                  
6     2       3 _   1                                      
    2   _   _     1 _                                      
  2 _   3 3     1                                          
2     3       1 _                                          
    3       _                                              
  3 _   _ 1                                                
3     1 1                                                  

   NB. data is  reversed integer list
   show 30 plt l ,: & (1&{)"2 t
          6               4     3                         1
                      5 4     3               2         1  
        _           _ 4     3             _ 2         1    
                    _     3               2         1      
      6             4   _               2         _        
                  1   3               _                    
    _           _   _               2           1          
    6           4 3               _           _            
              1 _               2           1              
  6         _ 3               2           _                
            _               2 _         1                  
6         1               2           _                    
        _ 5             2           1                      
      3               2 _         1                        
    _ 4 _           2             _                        
  3 _                           1                          
3     5           2 _         1                            
  4 _           2           1 _                            
4   5         2 _         1                                
            _           1                                  
  5     _ 2           1 _                                  
5   _   2           1                                      
      2           1 _                                      
    2                                                      
  2             _                                          
2             1                                            
            _                                              
        _ 1                                                
    _   1                                                  
      1                                                    

   NB. data is  random
   show 30 plt l ,: & (2&{)"2 t
          6           5   4     3             2   1        
                    _   4     3             2   1          
        _           5 4     3             2   1            
                    _     3             2   1              
      6           5 4   _             _   _                
                  4   3             2     1                
    _           _   _             _   _ 1                  
    6           4 3                   1                    
              1 _               2   1                      
  6         _ 3               _   _                        
            _               2   1                          
6         1               2   1                            
        _ 5             _   1 _                            
      3 _             2   1                                
    _ 4 5           2   1                                  
  3 _ 5           2 _ 1 _                                  
3 4             2                                          
    _         2 _   _                                      
4           _     1                                        
  5       2     _                                          
        _     1                                            
5   _ 2     _                                              
    2     1                                                
  2     _                                                  
2   _ 1                                                    
    1                                                      
  1                                                        
1                                                          
                                                           
                                                           

The first version of the code set only a time limit. The builtin sort violated this only when the data overflowed RAM into virtual space, causing a large jump in time affecting also the next data set as the OS restored itself. The timing might be interesting for some other exercise. Here, a maximum data size test was inserted. Arbitrary time is the reasonable choice without details of the J interpreter nor of specific hardware. The radix sort involves putting data directly into the right spot. It is quick!

The data fit curves of the character cell graph were combined with GCD +. function. This explains "1"s or other strange values where these curves intersect. Finally the scatter plots were multiplied by infinity and added to the best fit curves. The points didn't show up well using the same values as the curves.

Python

Works with: Python version 2.5

Examples of sorting routines

<lang python>def builtinsort(x):

   x.sort()

def partition(seq, pivot):

  low, middle, up = [], [], []
  for x in seq:
      if x < pivot:
          low.append(x)
      elif x == pivot:
          middle.append(x)
      else:
          up.append(x)
  return low, middle, up

import random def qsortranpart(seq):

  size = len(seq)
  if size < 2: return seq
  low, middle, up = partition(seq, random.choice(seq))
  return qsortranpart(low) + middle + qsortranpart(up)</lang>

Sequence generators

<lang python>def ones(n):

   return [1]*n

def reversedrange(n):

   return reversed(range(n))

def shuffledrange(n):

   x = range(n)
   random.shuffle(x)
   return x</lang>    

Write timings

<lang python>def write_timings(npoints=10, maxN=10**4, sort_functions=(builtinsort,insertion_sort, qsort),

                 sequence_creators = (ones, range, shuffledrange)):    
  Ns = range(2, maxN, maxN//npoints)
  for sort in sort_functions:
      for make_seq in sequence_creators:
          Ts = [usec(sort, (make_seq(n),)) for n in Ns]
          writedat('%s-%s-%d-%d.xy' % (sort.__name__,  make_seq.__name__, len(Ns), max(Ns)), Ns, Ts)</lang>

Where writedat() is defined in the Write float arrays to a text file, usec() - Query Performance, insertion_sort() - Insertion sort, qsort - Quicksort subtasks, correspondingly.

Plot timings

Library: matplotlib
Library: numpy

<lang python>import operator import numpy, pylab def plotdd(dictplotdict):

  """See ``plot_timings()`` below."""
  symbols = ('o', '^', 'v', '<', '>', 's', '+', 'x', 'D', 'd',
             '1', '2', '3', '4', 'h', 'H', 'p', '|', '_')
  colors = list('bgrcmyk') # split string on distinct characters
  for npoints, plotdict in dictplotdict.iteritems():
      for ttle, lst in plotdict.iteritems():            
          pylab.hold(False)                                
          for i, (label, polynom, x, y) in enumerate(sorted(lst,key=operator.itemgetter(0))):
              pylab.plot(x, y, colors[i % len(colors)] + symbols[i % len(symbols)], label='%s %s' % (polynom, label))
              pylab.hold(True)
              y = numpy.polyval(polynom, x)
              pylab.plot(x, y, colors[i % len(colors)], label= '_nolegend_')                
          pylab.legend(loc='upper left')
          pylab.xlabel(polynom.variable)
          pylab.ylabel('log2( time in microseconds )')                
          pylab.title(ttle, verticalalignment='bottom')
          figname = '_%(npoints)03d%(ttle)s' % vars()
          pylab.savefig(figname+'.png')
          pylab.savefig(figname+'.pdf')
          print figname</lang>

See Plot x, y arrays and Polynomial Fitting subtasks for a basic usage of pylab.plot() and numpy.polyfit().

<lang python>import collections, itertools, glob, re import numpy def plot_timings():

  makedict = lambda: collections.defaultdict(lambda: collections.defaultdict(list))
  df = makedict()
  ds = makedict()
  # populate plot dictionaries
  for filename in glob.glob('*.xy'):
      m = re.match(r'([^-]+)-([^-]+)-(\d+)-(\d+)\.xy', filename)
      print filename
      assert m, filename
      funcname, seqname, npoints, maxN = m.groups()
      npoints, maxN = int(npoints), int(maxN)        
      a = numpy.fromiter(itertools.imap(float, open(filename).read().split()), dtype='f')
      Ns = a[::2]  # sequences lengths
      Ts = a[1::2] # corresponding times 
      assert len(Ns) == len(Ts) == npoints
      assert max(Ns) <= maxN
      #
      logsafe = numpy.logical_and(Ns>0, Ts>0)
      Ts = numpy.log2(Ts[logsafe])
      Ns = numpy.log2(Ns[logsafe])
      coeffs = numpy.polyfit(Ns, Ts, deg=1)
      poly = numpy.poly1d(coeffs, variable='log2(N)')
      #
      df[npoints][funcname].append((seqname, poly, Ns, Ts))
      ds[npoints][seqname].append((funcname, poly, Ns, Ts))
  # actual plotting
  plotdd(df)
  plotdd(ds) # see ``plotdd()`` above</lang>

Figures: log2( time in microseconds ) vs. log2( sequence length )

log(Time) vs. log(N): Relative performance on [1]*N as an input
log(Time) vs. log(N): Relative performance on range(N) as an input
log(Time) vs. log(N): Relative performance on random permutation of range(N) as an input

<lang python>sort_functions = [

   builtinsort,         # see implementation above
   insertion_sort,      # see Insertion sort
   insertion_sort_lowb, # insertion_sort, where sequential search is replaced
                        #     by lower_bound() function
   qsort,               # see Quicksort
   qsortranlc,          # qsort with randomly choosen pivot
                        #     and the filtering via list comprehension
   qsortranpart,        # qsortranlc with filtering via partition function
   qsortranpartis,      # qsortranpart, where for a small input sequence lengths
   ]                    #     insertion_sort is called

if __name__=="__main__":

  import sys
  sys.setrecursionlimit(10000)
  write_timings(npoints=100, maxN=1024, # 1 <= N <= 2**10 an input sequence length
                sort_functions=sort_functions,
                sequence_creators = (ones, range, shuffledrange))
  plot_timings()</lang>

Executing above script we get belowed figures.

ones

ones.png (143KiB)

builtinsort     - O(N)
insertion_sort  - O(N)
qsort           - O(N**2)
qsortranpart    - O(N)

range

range.png (145KiB)

builtinsort     - O(N)
insertion_sort  - O(N)
qsort           - O(N**2)
qsortranpart    - O(N*log(N))

shuffled range

shuffledrange.png (152KiB)

builtinsort     - O(N)  
insertion_sort  - O(N**4) ???
qsort           - O(N*log(N))
qsortranpart    - O(N) ???

Tcl

Background

The lsort command is Tcl's built-in sorting engine. It is implemented in C as a mergesort, so while it is theoretically slower than quicksort, it is a stable sorting algorithm too, which produces results that tend to be less surprising in practice. This task will be matching it against multiple manually-implemented sorting procedures.

Observations

Obviously, the built-in compiled sort command will be much faster than any Tcl-coded implementation. The Tcl-coded mergesort is up to 3 orders of magnitude slower.

The shellsort implementation suffers, relative to other algorithms, in the case where the list is already sorted.

Code

Library: Tk
Library: Tcllib (Package: struct::list)

<lang tcl>###############################################################################

  1. measure and plot times

package require Tk package require struct::list namespace path ::tcl::mathfunc

proc create_log10_plot {title xlabel ylabel xs ys labels shapes colours} {

   set w [toplevel .[clock clicks]]
   wm title $w $title
   pack [canvas $w.c -background white]
   pack [canvas $w.legend -background white]
   update
   plot_log10 $w.c $w.legend $title $xlabel $ylabel $xs $ys $labels $shapes $colours
   $w.c config -scrollregion [$w.c bbox all]
   update

}

proc plot_log10 {canvas legend title xlabel ylabel xs ys labels shapes colours} {

   global xfac yfac
   set log10_xs [map {_ {log10 $_}} $xs]
   foreach series $ys {
       lappend log10_ys [map {_ {log10 $_}} $series]
   }
   set maxx [max {*}$log10_xs]
   set yvalues [lsort -real [struct::list flatten $log10_ys]]
   set firstInf [lsearch $yvalues Inf]
   set maxy [lindex $yvalues [expr {$firstInf == -1 ? [llength $yvalues] - 1 : $firstInf - 1}]]
   set xfac [expr {[winfo width $canvas] * 0.8/$maxx}]
   set yfac [expr {[winfo height $canvas] * 0.8/$maxy}]
   scale $canvas x 0 $maxx $xfac "log10($xlabel)"
   scale $canvas y 0 $maxy $yfac "log10($ylabel)" $maxx $xfac
   $legend create text 30 0 -text $title -anchor nw
   set count 1
   foreach series $log10_ys shape $shapes colour $colours label $labels {
       plotxy $canvas $log10_xs $series $shape $colour
       legenditem $legend [incr count] $shape $colour $label
   }

}

proc map {lambda list} {

   set res [list]
   foreach item $list {lappend res [apply $lambda $item]}
   return $res

}

proc legenditem {legend pos shape colour label} {

   set y [expr {$pos * 15}]
   $shape $legend 20 $y -fill $colour
   $legend create text 30 $y -text $label -anchor w

}

  1. The actual plotting engine

proc plotxy {canvas _xs _ys shape colour} {

   global xfac yfac
   foreach x $_xs y $_ys {
       if {$y < Inf} {
           lappend xs $x
           lappend ys $y
       }
   }
   set coords [list]
   foreach x $xs y $ys {
       set coord_x [expr {$x*$xfac}]
       set coord_y [expr {-$y*$yfac}]
       $shape $canvas $coord_x $coord_y -fill $colour
       lappend coords $coord_x $coord_y
   }
   $canvas create line $coords -smooth true

}

  1. Rescales the contents of the given canvas

proc scale {canvas direction from to fac label {other_to 0} {other_fac 0}} {

   set f [expr {$from*$fac}]
   set t [expr {$to*$fac}]
   switch -- $direction {
       x {
           set f [expr {$from * $fac}]
           set t [expr {$to * $fac}]
           # create x-axis
           $canvas create line $f 0 $t 0
           $canvas create text $f 0 -anchor nw -text $from
           $canvas create text $t 0 -anchor n -text [format "%.1f" $to]
           $canvas create text [expr {($f+$t)/2}] 0 -anchor n -text $label
       }
       y {
           set f [expr {$from * -$fac}]
           set t [expr {$to * -$fac}]
           # create y-axis
           $canvas create line 0 $f 0 $t
           $canvas create text 0 $f -anchor se -text $from
           $canvas create text 0 $t -anchor e -text [format "%.1f" $to]
           $canvas create text 0 [expr {($f+$t)/2}] -anchor e -text $label
           # create gridlines
           set xmax [expr {$other_to * $other_fac}]
           for {set i 1} {$i < $to} {incr i} {
               set y [expr {$i * -$fac}]
               $canvas create line 0 $y $xmax $y -dash .
           }
       }
   }

}

  1. Helper to make points, which are otherwise not a native item type

proc dot {canvas x y args} {

   set id [$canvas create oval [expr {$x-3}] [expr {$y-3}] \
               [expr {$x+3}] [expr {$y+3}]]
   $canvas itemconfigure $id {*}$args

} proc square {canvas x y args} {

   set id [$canvas create rectangle [expr {$x-3}] [expr {$y-3}] \
               [expr {$x+3}] [expr {$y+3}]]
   $canvas itemconfigure $id {*}$args

} proc cross {canvas x y args} {

   set l1 [$canvas create line [expr {$x-3}] $y [expr {$x+3}] $y]
   set l2 [$canvas create line $x [expr {$y-3}] $x [expr {$y+3}]]
   $canvas itemconfigure $l1 {*}$args
   $canvas itemconfigure $l2 {*}$args

} proc x {canvas x y args} {

   set l1 [$canvas create line [expr {$x-3}] [expr {$y-3}] [expr {$x+3}] [expr {$y+3}]]
   set l2 [$canvas create line [expr {$x+3}] [expr {$y-3}] [expr {$x-3}] [expr {$y+3}]]
   $canvas itemconfigure $l1 {*}$args
   $canvas itemconfigure $l2 {*}$args

} proc triangleup {canvas x y args} {

   set id [$canvas create polygon $x [expr {$y-4}] \
               [expr {$x+4}] [expr {$y+4}] \
               [expr {$x-4}] [expr {$y+4}]]
   $canvas itemconfigure $id {*}$args

} proc triangledown {canvas x y args} {

   set id [$canvas create polygon $x [expr {$y+4}] \
               [expr {$x+4}] [expr {$y-4}] \
               [expr {$x-4}] [expr {$y-4}]]
   $canvas itemconfigure $id {*}$args

}

wm withdraw .

  1. list creation procedures

proc ones n {

   lrepeat $n 1

} proc reversed n {

   while {[incr n -1] >= 0} {
       lappend result $n
   }
   return $result

} proc random n {

   for {set i 0} {$i < $n} {incr i} {
       lappend result [expr {int($n * rand())}]
   }
   return $result

}

set algorithms {lsort quicksort shellsort insertionsort bubblesort mergesort} set sizes {1 10 100 1000 10000 100000} set types {ones reversed random} set shapes {dot square cross triangleup triangledown x} set colours {red blue black brown yellow black}

  1. create some lists to be used by all sorting algorithms

array set lists {} foreach size $sizes {

   foreach type $types {
       set lists($type,$size) [$type $size]
   }

}

set runs 10

  1. header

fconfigure stdout -buffering none puts -nonewline [format "%-16s" "list length:"] foreach size $sizes {

   puts -nonewline [format " %10d" $size]

} puts ""

  1. perform the sort timings and output results

foreach type $types {

   puts "\nlist type: $type"
   set times [list]
   foreach algo $algorithms {
       set errs [list]
       set thesetimes [list]
       $algo {} ;# call it once to ensure it's compiled
       
       puts -nonewline [format "   %-13s" $algo]
       foreach size $sizes {
           # some implementations are just too slow
           if {$type ne "ones" && (
               ($algo eq "insertionsort" && $size > 10000) ||
               ($algo eq "bubblesort" && $size > 1000))
           } {
               set time Inf
           } else {
               # OK, do it
               if {[catch {time [list $algo $lists($type,$size)] $runs} result] != 0} {
                   set time Inf
                   lappend errs $result
               } else {
                   set time [lindex [split $result] 0]
               }
           }
           lappend thesetimes $time
           puts -nonewline [format " %10s" $time]
       }
       puts ""
       if {[llength $errs] > 0} {
           puts [format "      %s" [join $errs "\n      "]]
       }
       lappend times $thesetimes
   }
   create_log10_plot "Sorting a '$type' list" size time $sizes $times $algorithms $shapes $colours

} puts "\ntimes in microseconds, average of $runs runs"</lang>

Output

list length:              1         10        100       1000      10000     100000

list type: ones
   lsort                0.8        1.2        7.2       71.9     1042.7    11428.9
   quicksort            1.1        9.0       40.6      369.5     3696.4    37478.4
   shellsort            1.4       26.0      249.1     4003.4    56278.7   717790.6
   insertionsort        1.1        6.4       59.0      528.1     5338.9    54913.0
   bubblesort           1.9        5.1       31.9      308.8     3259.1    31991.2
   mergesort            1.3       61.1      704.2     9275.4   224784.4 14599414.6

list type: reversed
   lsort                1.0        1.6        9.9      112.1     1434.9    20181.0
   quicksort            1.5       55.3      495.6     6705.9    79984.0   963975.0
   shellsort            1.5       25.9      457.0     7118.6    92497.5  1210143.9
   insertionsort        1.2       21.0     1645.0   159262.2 15859610.8        Inf
   bubblesort           1.9      445.0    46526.6  4665550.4        Inf        Inf
   mergesort            1.4       61.7      842.8     9572.1   215536.6 16938651.0

list type: random
   lsort                1.0        1.7       15.7      300.9     3275.0    58779.5
   quicksort            1.2       28.0      429.1     5609.5    71743.3   923630.4
   shellsort            1.6       26.7      571.0     9031.1   140526.9  2244152.7
   insertionsort        1.3       15.4      832.6    79018.0  7893722.6        Inf
   bubblesort           1.8      256.2    23753.1  2422926.0        Inf        Inf
   mergesort            1.9       60.2      883.5    12505.6   399672.6 49225509.8

times in microseconds, average of 10 runs