MD5/Implementation

From Rosetta Code
< MD5
Revision as of 02:52, 29 September 2010 by rosettacode>Dgamey (new draft task and icon code)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
MD5/Implementation is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

The purpose of this task to code and validate an implementation of the MD5 Message Digest Algorithm using native facilities within your language. Refer to MD5 on Wikipedia or the MD5 definition in IETF RFC (1321).

  • The implementation needs to implement the key functionality namely producing a correct message digest for an input string. It is not necessary to mimic all of the calling modes such as adding to a digest one block at a time over subsequent calls.
  • In addition to coding and verifying your implementation, note any challenges your language presented implementing the solution, implementation choices made, or limitations of your solution.
  • Solutions on this page should NOT use built in functions, call outs to operating system calls or library routines written in other languages. For these see MD5.
  • An implementation taken from a native source library and shown on this site is acceptable.

The solutions shown here will provide practical illustrations of bit manipulation, unsigned integers, working with little-endian data.

The following verification strings and hashes come from RFC 1321:

   0xd41d8cd98f00b204e9800998ecf8427e ==> ""  
   0x0cc175b9c0f1b6a831c399e269772661 ==> "a"
   0x900150983cd24fb0d6963f7d28e17f72 ==> "abc"
   0xf96b697d7cb7938d525a2f31aaf161d0 ==> "message digest"
   0xc3fcd3d76192e4007dfb496cca67e13b ==> "abcdefghijklmnopqrstuvwxyz"
   0xd174ab98d277d9f5a5611c2c9f419d9f ==> "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
   0x57edf4a22be3c955ac49da2e2107b67a ==> "12345678901234567890123456789012345678901234567890123456789012345678901234567890"

Icon and Unicon

The following program is based on part on the Wikipedia pseudo-code and in part on the reference implementation in RFC 1321. The implementation uses large integers. The solution works in both Icon and Unicon. One limitation of this implementation is that will not handle arbitrary (bit) length messages - all are byte aligned. <lang Icon>procedure main() # validate against the RFC test strings and more

  testMD5("The quick brown fox jumps over the lazy dog", 16r9e107d9d372bb6826bd81d3542a419d6)
  testMD5("The quick brown fox jumps over the lazy dog.", 16re4d909c290d0fb1ca068ffaddf22cbd0)
  testMD5("", 16rd41d8cd98f00b204e9800998ecf8427e)    #R = MD5 test suite from RFC
  testMD5("a", 16r0cc175b9c0f1b6a831c399e269772661)   #R
  testMD5("abc", 16r900150983cd24fb0d6963f7d28e17f72) #R
  testMD5("message digest", 16rf96b697d7cb7938d525a2f31aaf161d0) #R 
  testMD5("abcdefghijklmnopqrstuvwxyz", 16rc3fcd3d76192e4007dfb496cca67e13b) #R 
  testMD5("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789", 16rd174ab98d277d9f5a5611c2c9f419d9f) #R        
  testMD5("12345678901234567890123456789012345678901234567890123456789012345678901234567890", 16r57edf4a22be3c955ac49da2e2107b67a) #R

end

procedure testMD5(s,rh) # compute the MD5 hash and compare it to reference value

  write("Message(length=",*s,") = ",image(s))
  write("Digest = ",hexstring(h := MD5(s)),if h = rh then " matches reference hash" else (" does not match reference hash = " || hexstring(rh)),"\n")

end

link hexcvt # for testMD5

$define B32 4 # 32 bits $define B64 8 # 64 bits in bytes $define B512 64 # 512 bits in bytes $define M32 16r100000000 # 2^32 $define M64 16r10000000000000000 # 2^64

procedure MD5(s) #: return MD5 hash of message s local w,a,b,c,d,i,t,m local mlength,message,hash static rs,ks,istate static maxpad

initial {

  every (rs := []) |||:= 
     (t := [ 7, 12, 17, 22] | [ 5,  9, 14, 20] | [ 4, 11, 16, 23] | [ 6, 10, 15, 21]) ||| t ||| t ||| t
  every put(ks := [],integer(M32 * abs(sin(i := 1 to 64))))
  istate := [ 16r67452301, 16rEFCDAB89, 16r98BADCFE, 16r10325476 ]  # "Magic" IV
  maxpad := left(char(16r80),B512+B64,char(16r00)) # maximum possible padding
  if not (*rs = *ks = 64) then runerr(500,"MD5 setup error") 
  }
                                                   # 1. Construct prefix
  t := (*s*8)%M64                                  # original message length
  s ||:= maxpad                                    # append maximum padding 
  s[0-:*s%B512] := ""                              # trim to final length 
  s[0-:B64] := map("01234567","76543210",unsigned2string(t,B64) )  # as little endian length
  
  message := []                                    # 2. Subdivide message
  s ? while put(message,move(B512))                #  into 512 bit blocks
                                                   # 3. Transform message ...
  state := copy(istate)                            # Initialize hashes 
  every m := !message do {                         # For each message block
     w := []
     m ? while put(w,unsigned(reverse(move(B32)))) # break into little-endian words 
     a := state[1]                                 # pick up hashes
     b := state[2]
     c := state[3]
     d := state[4]
     every i := 0 to 63 do  {                      # Process 4 rounds of hashes	
        case round := i/16 of {

0 : { # 0..15 f := ixor(d, iand(b,ixor(c,d))) # alternate F

              g := i

}

           1 : {                                   # 16..31
              f := ixor(c,iand(d,ixor(b,c)))       # alternate G	
              g := (5*i+1) % 16

}

           2 : {                                   # 32..47
              f := ixor(b,ixor(c,d))               # H
              g := (3*i+5) % 16
              }		 
           3 : {                                   # 48..64
              f := ixor(c,ior(b,ixor(d,16rffffffff)))  # I
              g := (7*i) % 16 
              }

}

        a +:= (f + ks[i+1] + w[g+1])               # Core of FF, GG, HH, II
        a %:= M32

a := ior( ishift(a,rs[i+1]), ishift(a,-(32-rs[i+1]))) # 32bit rotate a +:= b a %:= M32

        a :=: b :=: c :=: d                        # rotate variables
     }

     state[1] +:= a                                # Add back new hashes 
     state[2] +:= b
     state[3] +:= c
     state[4] +:= d
     every !state %:= M32                          # mod 2^32
  }
  every (hash := "") ||:= reverse(unsigned2string(!state,4)) # little-endian digest
  return unsigned(hash)
  end       

procedure unsigned2string(i,w) # uint to string pad to w bytes local s

  if i < 0 then runerr(500,i)
  s := ""
  while (0 < i) | (*s < \w) do {
     s ||:= char(i % 256)
     i /:= 256
     }
  return reverse(s)

end

link unsigned # string to unsigned integer</lang>

The

provides unsigned and hexcvt Sample Output (abridged):

Message(length=43) = "The quick brown fox jumps over the lazy dog"
Digest = 9E107D9D372BB6826BD81D3542A419D6 matches reference hash

Message(length=44) = "The quick brown fox jumps over the lazy dog."
Digest = E4D909C290D0FB1CA068FFADDF22CBD0 matches reference hash

Message(length=0) = ""
Digest = D41D8CD98F00B204E9800998ECF8427E matches reference hash

Message(length=1) = "a"
Digest = CC175B9C0F1B6A831C399E269772661 matches reference hash
...