Soundex: Difference between revisions
(→{{header|Python}}: re-fix & merge) |
|||
Line 106: | Line 106: | ||
def soundex(word): |
def soundex(word): |
||
codes = ("bfpv","cgjkqsxz", "dt", "l", "mn", "r") |
codes = ("bfpv","cgjkqsxz", "dt", "l", "mn", "r") |
||
codes2 = ''.join(codes) |
|||
cmap = lambda kar: ''.join( str(ix+1) for ix,cod in enumerate(codes) if kar in cod ) |
|||
cmap2 = lambda kar: cmap(kar) if kar in codes2 else '9' |
|||
sdx = ''.join(cmap2(kar) for kar in word.lower()) |
|||
sdx2 = word[0].upper() + ''.join(k for k,g in groupby(sdx) if k!='9')[ |
|||
1 if cmap(word[0].lower()) else 0:] |
|||
sdx3 = sdx2[0:4].ljust(4,'0') |
sdx3 = sdx2[0:4].ljust(4,'0') |
||
return sdx3 |
return sdx3 |
Revision as of 03:11, 13 November 2009
You are encouraged to solve this task according to the task description, using any language you may know.
Soundex is an algorithm for creating indices for words based on their pronunciation. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling (from the WP article).
Haskell
<lang haskell>import Text.PhoneticCode.Soundex import Control.Arrow</lang> Example: <lang haskell>*Main> mapM_ print $ map (id &&& soundexSimple) ["Soundex", "Example", "Sownteks", "Ekzampul"] ("Soundex","S532") ("Example","E251") ("Sownteks","S532") ("Ekzampul","E251")</lang>
Java
<lang java>public static void main(String[] args){
System.out.println(soundex("Soundex")); System.out.println(soundex("Example")); System.out.println(soundex("Sownteks")); System.out.println(soundex("Ekzampul")); }
private static String getCode(char c){
switch(c){ case 'B': case 'F': case 'P': case 'V': return "1"; case 'C': case 'G': case 'J': case 'K': case 'Q': case 'S': case 'X': case 'Z': return "2"; case 'D': case 'T': return "3"; case 'L': return "4"; case 'M': case 'N': return "5"; case 'R': return "6"; default: return ""; }
}
public static String soundex(String s){
String code, previous, soundex; code = s.toUpperCase().charAt(0) + ""; previous = "7"; for(int i = 1;i < s.length();i++){ String current = getCode(s.toUpperCase().charAt(i)); if(current.length() > 0 && !current.equals(previous)){ code = code + current; } previous = current; } soundex = (code + "0000").substring(0, 4); return soundex;
}</lang> Output:
S532 E251 S532 E251
JavaScript
<lang javascript>var soundex = function (s) {
var a = s .substring(1, s.length) .split(), r = , codes = { a: , e: , i: , o: , u: , b: 1, f: 1, p: 1, v: 1, c: 2, g: 2, j: 2, k: 2, q: 2, s: 2, x: 2, z: 2, d: 3, t: 3, l: 4, m: 5, n: 5, r: 6 };
r = s[0].toUpperCase() + a .filter(function (v, i, a) { return v !== a[i + 1]; }) .map(function (v, i, a) { return codes[v] }).join();
return (r + '000').slice(0, 4);
};</lang>
Perl
The Text::Soundex core module supports various soundex algorithms. <lang perl>use Text::Soundex; print soundex("Soundex"), "\n"; # S532 print soundex("Example"), "\n"; # E251 print soundex("Sownteks"), "\n"; # S532 print soundex("Ekzampul"), "\n"; # E251</lang>
PHP
PHP already has a built-in soundex() function: <lang php><?php echo soundex("Soundex"), "\n"; // S532 echo soundex("Example"), "\n"; // E251 echo soundex("Sownteks"), "\n"; // S532 echo soundex("Ekzampul"), "\n"; // E251 ?></lang>
Python
<lang python>from itertools import groupby
def soundex(word):
codes = ("bfpv","cgjkqsxz", "dt", "l", "mn", "r") codes2 = .join(codes) cmap = lambda kar: .join( str(ix+1) for ix,cod in enumerate(codes) if kar in cod ) cmap2 = lambda kar: cmap(kar) if kar in codes2 else '9' sdx = .join(cmap2(kar) for kar in word.lower()) sdx2 = word[0].upper() + .join(k for k,g in groupby(sdx) if k!='9')[ 1 if cmap(word[0].lower()) else 0:] sdx3 = sdx2[0:4].ljust(4,'0') return sdx3
</lang> Example Output <lang Python>>>>print soundex("soundex") S532 >>>print soundex("example") E251 >>>print soundex("ciondecks") C532 >>>print soundex("ekzampul") E251</lang>
Tcl
contains an implementation of Knuth's soundex algorithm in the soundex
package.
<lang tcl>package require soundex
foreach string {"Soundex" "Example" "Sownteks" "Ekzampul"} {
set soundexCode [soundex::knuth $string] puts "\"$string\" has code $soundexCode"
}</lang> Which produces this output:
"Soundex" has code S532 "Example" has code E251 "Sownteks" has code S532 "Ekzampul" has code E251
VBScript
<lang vbscript>Function getCode(c)
Select Case c Case "B", "F", "P", "V" getCode = "1" Case "C", "G", "J", "K", "Q", "S", "X", "Z" getCode = "2" Case "D", "T" getCode = "3" Case "L" getCode = "4" Case "M", "N" getCode = "5" Case "R" getCode = "6" End Select
End Function
Function soundex(s)
Dim code, previous code = UCase(Mid(s, 1, 1)) previous = 7 For i = 2 to (Len(s) + 1) current = getCode(UCase(Mid(s, i, 1))) If Len(current) > 0 And current <> previous Then code = code & current End If previous = current Next soundex = Mid(code, 1, 4) If Len(code) < 4 Then soundex = soundex & String(4 - Len(code), "0") End If
End Function</lang>