Move-to-front algorithm: Difference between revisions

From Rosetta Code
Content added Content deleted
(+ D)
m (→‎{{header|Python}}: Don't use name of module as argument name too.)
Line 157: Line 157:
SYMBOLTABLE = list(ascii_lowercase)
SYMBOLTABLE = list(ascii_lowercase)


def move2front_encode(string, symboltable):
def move2front_encode(strng, symboltable):
sequence, pad = [], symboltable[::]
sequence, pad = [], symboltable[::]
for char in string:
for char in strng:
indx = pad.index(char)
indx = pad.index(char)
sequence.append(indx)
sequence.append(indx)

Revision as of 09:02, 21 May 2014

Move-to-front algorithm is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Given a symboltable of a zero-indexed array of all possible input symbols this algorithm reversibly transforms a sequence of input symbols into an array of output numbers (indices).
The transform in many cases acts to give frequently repeated input symbols lower indices which is useful in some compression algorithms.

Encoding algorithm
    for each symbol of the input sequence:
        output the index of the symbol in the symboltable
        move that symbol to the front of the symboltable
Decoding algorithm
    # Using the same starting symboltable
    for each index of the input sequence:
        output the symbol at that index of the symboltable
        move that symbol to the front of the symboltable
Example

Encoding the string of character symbols 'broood' using a symboltable of the characters 'a'-to-'z'

Input Output SymbolTable
broood 1 'abcdefghijklmnopqrstuvwxyz'
broood 1 17 'bacdefghijklmnopqrstuvwxyz'
broood 1 17 15 'rbacdefghijklmnopqstuvwxyz'
broood 1 17 15 0 'orbacdefghijklmnpqstuvwxyz'
broood 1 17 15 0 0 'orbacdefghijklmnpqstuvwxyz'
broood 1 17 15 0 0 5 'orbacdefghijklmnpqstuvwxyz'

Decoding the indices back to the original symbol order:

Input Output SymbolTable
1 17 15 0 0 5 b 'abcdefghijklmnopqrstuvwxyz'
1 17 15 0 0 5 br 'bacdefghijklmnopqrstuvwxyz'
1 17 15 0 0 5 bro 'rbacdefghijklmnopqstuvwxyz'
1 17 15 0 0 5 broo 'orbacdefghijklmnpqstuvwxyz'
1 17 15 0 0 5 brooo 'orbacdefghijklmnpqstuvwxyz'
1 17 15 0 0 5 broood 'orbacdefghijklmnpqstuvwxyz'
Task
  • Encode and decode the following three strings of characters using the symboltable of the

characters 'a'-to-'z' as above.

  • Show the strings and their encoding here.
  • Add a check to ensure that the decoded string is the same as the original.

The strings are: 'broood', 'bananaaa', and 'hiphophiphop'. <(Note the spellings).

D

<lang d>import std.stdio, std.string, std.ascii, std.algorithm;

ptrdiff_t[] mtfEncoder(in string data) pure nothrow @safe in {

   assert(data.countchars("a-z") == data.length);

} out(result) {

   assert(result.length == data.length);
   assert(result.all!(e => e >= 0 && e < lowercase.length));

} body {

   ubyte[lowercase.length] order = lowercase.representation;
   auto encoded = new typeof(return)(data.length);
   size_t i = 0;
   foreach (immutable b; data) {
       immutable j = encoded[i++] = order[].countUntil(b);
       bringToFront(order[0 .. j], order[j .. j + 1]);
   }
   return encoded;

}

string mtfDecoder(in ptrdiff_t[] encoded) pure nothrow @safe in {

   assert(encoded.all!(e => e >= 0 && e < lowercase.length));

} out(result) {

   assert(result.length == encoded.length);
   assert(result.countchars("a-z") == result.length);

} body {

   ubyte[lowercase.length] order = lowercase.representation;
   auto decoded = new char[encoded.length];
   size_t i = 0;
   foreach (immutable code; encoded) {
       decoded[i++] = order[code];
       bringToFront(order[0 .. code], order[code .. code + 1]);
   }
   return decoded;

}

void main() {

   foreach (immutable word; ["broood", "bananaaa", "hiphophiphop"]) {
       immutable encoded = word.mtfEncoder;
       immutable decoded = encoded.mtfDecoder;
       writefln("'%s' encodes to %s, which decodes back to '%s'",
                word, encoded, decoded);
       assert(word == decoded);
   }

}</lang>

Output:
'broood' encodes to [1, 17, 15, 0, 0, 5], which decodes back to 'broood'
'bananaaa' encodes to [1, 1, 13, 1, 1, 1, 0, 0], which decodes back to 'bananaaa'
'hiphophiphop' encodes to [7, 8, 15, 2, 15, 2, 2, 3, 2, 2, 3, 2], which decodes back to 'hiphophiphop'

Python

<lang python>from __future__ import print_function from string import ascii_lowercase

SYMBOLTABLE = list(ascii_lowercase)

def move2front_encode(strng, symboltable):

   sequence, pad = [], symboltable[::]
   for char in strng:
       indx = pad.index(char)
       sequence.append(indx)
       pad = [pad.pop(indx)] + pad
   return sequence

def move2front_decode(sequence, symboltable):

   chars, pad = [], symboltable[::]
   for indx in sequence:
       char = pad[indx]
       chars.append(char)
       pad = [pad.pop(indx)] + pad
   return .join(chars)

if __name__ == '__main__':

   for s in ['broood', 'bananaaa', 'hiphophiphop']:
       encode = move2front_encode(s, SYMBOLTABLE)
       print('%14r encodes to %r' % (s, encode), end=', ')
       decode = move2front_decode(encode, SYMBOLTABLE)
       print('which decodes back to %r' % decode)
       assert s == decode, 'Whoops!'</lang>
Output:
      'broood' encodes to [1, 17, 15, 0, 0, 5], which decodes back to 'broood'
    'bananaaa' encodes to [1, 1, 13, 1, 1, 1, 0, 0], which decodes back to 'bananaaa'
'hiphophiphop' encodes to [7, 8, 15, 2, 15, 2, 2, 3, 2, 2, 3, 2], which decodes back to 'hiphophiphop'