NYSIIS: Difference between revisions

47,826 bytes added, 4 months ago
m (→‎{{header|Perl}}: Fix link: Perl 6 --> Raku)
m (→‎{{header|Wren}}: Minor tidy)
 
(9 intermediate revisions by 7 users not shown)
Line 1:
{{draft task|text processing}}
{{wikipedia}}
The [[wp:New York State Identification and Intelligence System|New York State Identification and Intelligence System phonetic code]], commonly known as NYSIIS, is a phonetic algorithm for creating indices for words based on their pronunciation.
The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
 
 
;Task:
Implement the original NYSIIS algorithm, shown in Wikipedia, rather than any other subsequent modification.
 
Also, before the algorithm is applied the input string should be converted to upper case with all white space removed.
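
As a rough illustration of the required steps, the following Python sketch mirrors the rule tables used by the Nim and Kotlin entries further down this page. It is a sketch of that variant, not a reference implementation of the original algorithm; the name <code>nysiis_sketch</code> and the table names are hypothetical, and it performs only the required preprocessing (upper case, white space removed), so it expects names without punctuation or suffixes.

```python
# Rule tables copied from the Nim/Kotlin entries on this page (an assumption
# about which variant to follow, not the Wikipedia text itself).
FIRST = [("MAC", "MCC"), ("KN", "N"), ("K", "C"),
         ("PH", "FF"), ("PF", "FF"), ("SCH", "SSS")]
LAST = [("EE", "Y"), ("IE", "Y"), ("DT", "D"),
        ("RT", "D"), ("RD", "D"), ("NT", "D"), ("ND", "D")]
MIDDLE = [("EV", "AF"), ("KN", "N"), ("SCH", "SSS"), ("PH", "FF")]
SINGLE = {"E": "A", "I": "A", "O": "A", "U": "A",
          "Q": "G", "Z": "S", "M": "N", "K": "C"}
VOWELS = "AEIOU"


def nysiis_sketch(name: str) -> str:
    # Required preprocessing: upper case, all white space removed.
    word = "".join(name.upper().split())
    if not word:
        return ""
    # Rewrite a matching prefix, then a matching suffix.
    for old, new in FIRST:
        if word.startswith(old):
            word = new + word[len(old):]
    for old, new in LAST:
        if word.endswith(old):
            word = word[:-len(old)] + new
    # Multi-letter rules apply to everything after the first letter.
    rest = word[1:]
    for old, new in MIDDLE:
        rest = rest.replace(old, new)
    s = list(word[0] + rest)
    # Single-letter rules, again skipping the first letter.
    for i in range(1, len(s)):
        c = s[i]
        if c in SINGLE:
            s[i] = SINGLE[c]
        elif c == "H":
            # H is kept only between two vowels; otherwise copy the previous letter.
            if s[i - 1] not in VOWELS or (i + 1 < len(s) and s[i + 1] not in VOWELS):
                s[i] = s[i - 1]
        elif c == "W" and s[i - 1] in VOWELS:
            s[i] = "A"
    key = "".join(s)
    # Trim a trailing S, turn a trailing AY into Y, then drop a trailing A.
    if key.endswith("S"):
        key = key[:-1]
    if key.endswith("AY"):
        key = key[:-2] + "Y"
    if key.endswith("A"):
        key = key[:-1]
    # Collapse runs of repeated letters, always keeping the first letter.
    out = word[0]
    for c in key[1:]:
        if c != out[-1]:
            out += c
    return out
```

Under these rules <code>nysiis_sketch("brown")</code> and <code>nysiis_sketch("browne")</code> both give <code>BRAN</code>, which is the kind of homophone match the code is meant to produce.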
 
An optional step is to handle multiple names, including double-barrelled names or double surnames (e.g. 'Hoyle-Johnson' or 'Vaughan Williams'), and suffixes or honours that are not needed for indexing (e.g. 'Jnr', 'Sr', 'III'); a small selection will suffice.
 
The original implementation also limits the code to six characters, but this is not a requirement.
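
The optional steps could be handled before encoding, along the lines of the Python sketch below. The helper names <code>normalise</code> and <code>show_key</code> are hypothetical, the suffix list is only a small illustrative selection, and the parenthesised display of characters beyond the sixth simply copies the convention seen in some of the sample outputs on this page.

```python
import re

# A small, illustrative selection of suffixes (an assumption, not a full list).
_SUFFIXES = ("JNR", "JR", "SNR", "SR")
_ROMAN = re.compile(r"[IVX]+")


def normalise(name: str) -> str:
    """Upper-case a name, drop one trailing suffix or numeral, keep letters only."""
    parts = [p for p in re.split(r"[\s,]+", name.upper()) if p]
    # Drop a trailing honorific or Roman numeral, but never the only word.
    if len(parts) > 1 and (parts[-1] in _SUFFIXES or _ROMAN.fullmatch(parts[-1])):
        parts.pop()
    # Joining the parts merges double surnames; hyphens and apostrophes are removed.
    return re.sub(r"[^A-Z]", "", "".join(parts))


def show_key(key: str, limit: int = 6) -> str:
    """Show a key with anything beyond `limit` characters in parentheses."""
    return key if len(key) <= limit else f"{key[:limit]}({key[limit:]})"
```

For example, <code>normalise("browne, III")</code> returns <code>BROWNE</code> and <code>show_key("CARLSAN")</code> returns <code>CARLSA(N)</code>; the cleaned string can then be passed to whichever encoder is in use.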
 
 
;See also
* [[Soundex]]
<br><br>
 
=={{header|11l}}==
{{trans|Python}}
 
<syntaxhighlight lang="11l">V _vowels = ‘AEIOU’

F replace_at(String text; position, fromlist, tolist)
   L(f, t) zip(fromlist, tolist)
      I text[position .+ f.len] == f
         R text[0 .< position]‘’t‘’text[position + f.len ..]
   R text

F replace_end(String text; fromlist, tolist)
   L(f, t) zip(fromlist, tolist)
      I text.ends_with(f)
         R text[0 .< (len)-f.len]‘’t
   R text

F nysiis(String =name)
   name = name.replace(re:‘\W’, ‘’).uppercase()
   name = replace_at(name, 0, [‘MAC’, ‘KN’, ‘K’, ‘PH’, ‘PF’, ‘SCH’],
                              [‘MCC’, ‘N’, ‘C’, ‘FF’, ‘FF’, ‘SSS’])
   name = replace_end(name, [‘EE’, ‘IE’, ‘DT’, ‘RT’, ‘RD’, ‘NT’, ‘ND’],
                            [‘Y’, ‘Y’, ‘D’, ‘D’, ‘D’, ‘D’, ‘D’])
   V key = String(name[0])
   V key1 = ‘’
   V i = 1
   L i < name.len
      V (n_1, n) = (name[i - 1], name[i])
      V n1_ = I i + 1 < name.len {name[i + 1]} E ‘’
      name = replace_at(name, i, [‘EV’] [+] Array(:_vowels).map(String), [‘AF’] [+] [String(‘A’)] * 5)
      name = replace_at(name, i, [String(‘Q’), ‘Z’, ‘M’], [String(‘G’), ‘S’, ‘N’])
      name = replace_at(name, i, [‘KN’, ‘K’], [String(‘N’), ‘C’])
      name = replace_at(name, i, [‘SCH’, ‘PH’], [‘SSS’, ‘FF’])
      I n == ‘H’ & (n_1 !C :_vowels | n1_ !C :_vowels)
         name = name[0 .< i]‘’n_1‘’name[i + 1 ..]
      I n == ‘W’ & n_1 C :_vowels
         name = name[0 .< i]‘A’name[i + 1 ..]
      I key != ‘’ & key.last != name[i]
         key ‘’= name[i]
      i++
   key = replace_end(key, [‘S’, ‘AY’, ‘A’], [‘’, ‘Y’, ‘’])
   R key1‘’key

V names = [‘Bishop’, ‘Carlson’, ‘Carr’, ‘Chapman’, ‘Franklin’,
           ‘Greene’, ‘Harper’, ‘Jacobs’, ‘Larson’, ‘Lawrence’,
           ‘Lawson’, ‘Louis, XVI’, ‘Lynch’, ‘Mackenzie’, ‘Matthews’,
           ‘McCormack’, ‘McDaniel’, ‘McDonald’, ‘Mclaughlin’, ‘Morrison’,
           ‘O'Banion’, ‘O'Brien’, ‘Richards’, ‘Silva’, ‘Watkins’,
           ‘Wheeler’, ‘Willis’, ‘brown, sr’, ‘browne, III’, ‘browne, IV’,
           ‘knight’, ‘mitchell’, ‘o'daniel’]
L(name) names
   print(‘#15: #.’.format(name, nysiis(name)))</syntaxhighlight>
 
{{out}}
<pre>
Bishop: BASAP
Carlson: CARLSAN
Carr: CAR
Chapman: CAPNAN
Franklin: FRANCLAN
Greene: GRAN
Harper: HARPAR
Jacobs: JACAB
Larson: LARSAN
Lawrence: LARANC
Lawson: LASAN
Louis, XVI: LASXV
Lynch: LYNC
Mackenzie: MCANSY
Matthews: MATA
McCormack: MCARNAC
McDaniel: MCDANAL
McDonald: MCDANALD
Mclaughlin: MCLAGLAN
Morrison: MARASAN
O'Banion: OBANAN
O'Brien: OBRAN
Richards: RACARD
Silva: SALV
Watkins: WATCAN
Wheeler: WALAR
Willis: WALA
brown, sr: BRANSR
browne, III: BRAN
browne, IV: BRANAV
knight: NAGT
mitchell: MATCAL
o'daniel: ODANAL
</pre>
 
=={{header|C++}}==
Implementation based on the Wikipedia description of the algorithm.
 
<syntaxhighlight lang="c">
<lang c>
#include <iostream> // required for debug code in main() only
#include <iomanip> // required for debug code in main() only
Line 158 ⟶ 259:
}
 
</syntaxhighlight>
</lang>
{{out|Example output}}
<pre>
Line 199 ⟶ 300:
Refactored code based on other examples to reduce footprint.
 
<syntaxhighlight lang="cos">
Class Utils.Phonetic [ Abstract ]
{
Line 312 ⟶ 413:
 
}
</syntaxhighlight>
</lang>
{{out|Examples}}
<pre>
Line 358 ⟶ 459:
=={{header|D}}==
{{trans|Python}}
<syntaxhighlight lang="d">import std.stdio, std.regex, std.algorithm, std.range, std.string;
 
string replaceAt(in string text, in uint pos, in string[] fromList,
Line 417 ⟶ 518:
foreach (immutable name; names)
writefln("%11s: %s", name, name.nysiis);
}</syntaxhighlight>
{{out}}
<pre> Bishop: BASAP
Line 455 ⟶ 556:
=={{header|Go}}==
{{trans|Kotlin}}
<syntaxhighlight lang="go">package main
 
import (
Line 595 ⟶ 696:
fmt.Printf("%-16s : %s\n", name, name2)
}
}</syntaxhighlight>
 
{{out}}
Line 645 ⟶ 746:
=={{header|Java}}==
{{works with|Java|8}}
<syntaxhighlight lang="java">import static java.util.Arrays.*;
import static java.lang.System.out;
 
Line 756 ⟶ 857:
return Vowels.indexOf(c) != -1;
}
}</syntaxhighlight>
 
<pre>Carr -> CAR
Line 796 ⟶ 897:
{{trans|Python}}
 
<syntaxhighlight lang="julia">function replaceat(text::AbstractString, position::Int, fromlist, tolist)
for (f, t) in zip(fromlist, tolist)
if startswith(text[position:end], f)
Line 853 ⟶ 954:
"knight", "mitchell", "o'daniel"]
@printf("%15s: %s\n", name, nysiis(name))
end</syntaxhighlight>
 
{{out}}
Line 891 ⟶ 992:
 
=={{header|Kotlin}}==
<syntaxhighlight lang="scala">// version 1.1.2
 
val fStrs = listOf("MAC" to "MCC", "KN" to "N", "K" to "C", "PH" to "FF",
Line 980 ⟶ 1,081:
println("${name.padEnd(16)} : $name2")
}
}</syntaxhighlight>
 
{{out}}
Line 1,027 ⟶ 1,128:
de la Mare II : DALANA(R)
</pre>
 
=={{header|Nim}}==
{{trans|Kotlin}}
<syntaxhighlight lang="nim">import strutils

const
  FStrs = [("MAC", "MCC"), ("KN", "N"), ("K", "C"),
           ("PH", "FF"), ("PF", "FF"), ("SCH", "SSS")]
  LStrs = [("EE", "Y"), ("IE", "Y"), ("DT", "D"),
           ("RT", "D"), ("RD", "D"), ("NT", "D"), ("ND", "D")]
  MStrs = [("EV", "AF"), ("KN", "N"), ("SCH", "SSS"), ("PH", "FF")]
  EStrs = ["JR", "JNR", "SR", "SNR"]


func isVowel(c: char): bool = c in {'A', 'E', 'I', 'O', 'U'}

func isRoman(s: string): bool = s.allCharsInSet({'I', 'V', 'X'})


func nysiis(word: string): string =

  if word.len == 0: return
  var word = word.toUpperAscii()
  let fields = word.split({' ', ','})
  if fields.len > 1:
    let last = fields[^1]
    if last.isRoman: word.setLen(word.len - last.len)
  word = word.multiReplace((" ", ""), (",", ""), ("'", ""), ("-", ""))
  for eStr in EStrs:
    if word.endsWith(eStr): word.setLen(word.len - eStr.len)
  for fStr in FStrs:
    if word.startsWith(fStr[0]): word[0..fStr[0].high] = fStr[1]
  for lStr in LStrs:
    if word.endsWith(lStr[0]): word[^2..^1] = lStr[1]

  result.add word[0]
  word.delete(0..0)
  for mStr in MStrs:
    word = word.replace(mStr[0], mStr[1])
  var s = result[0] & word
  var len = s.len
  for i in 1..<len:
    case s[i]
    of 'E', 'I', 'O', 'U': s[i] = 'A'
    of 'Q': s[i] = 'G'
    of 'Z': s[i] = 'S'
    of 'M': s[i] = 'N'
    of 'K': s[i] = 'C'
    of 'H': (if not s[i-1].isVowel or i < len - 1 and not s[i+1].isVowel: s[i] = s[i-1])
    of 'W': (if s[i-1].isVowel: s[i] = 'A')
    else: discard

  if s[len-1] == 'S':
    s.setLen(len-1)
    dec len
  if len > 1 and s[len-2..len-1] == "AY":
    s.delete(len-2..len-2)
    dec len
  if len > 0 and s[len-1] == 'A':
    s.setLen(len-1)
    dec len

  var prev = result[0]
  for i in 1..<len:
    let c = s[i]
    if prev != c:
      result.add c
      prev = c


const Names = ["Bishop", "Carlson", "Carr", "Chapman",
               "Franklin", "Greene", "Harper", "Jacobs", "Larson", "Lawrence",
               "Lawson", "Louis, XVI", "Lynch", "Mackenzie", "Matthews", "May jnr",
               "McCormack", "McDaniel", "McDonald", "Mclaughlin", "Morrison",
               "O'Banion", "O'Brien", "Richards", "Silva", "Watkins", "Xi",
               "Wheeler", "Willis", "brown, sr", "browne, III", "browne, IV",
               "knight", "mitchell", "o'daniel", "bevan", "evans", "D'Souza",
               "Hoyle-Johnson", "Vaughan Williams", "de Sousa", "de la Mare II"]

for name1 in Names:
  var name2 = nysiis(name1)
  if name2.len > 6:
    name2 = "$1($2)".format(name2[0..5], name2[6..^1])
  echo name1.alignLeft(16), ": ", name2</syntaxhighlight>
 
{{out}}
<pre>Bishop : BASAP
Carlson : CARLSA(N)
Carr : CAR
Chapman : CAPNAN
Franklin : FRANCL(AN)
Greene : GRAN
Harper : HARPAR
Jacobs : JACAB
Larson : LARSAN
Lawrence : LARANC
Lawson : LASAN
Louis, XVI : LA
Lynch : LYNC
Mackenzie : MCANSY
Matthews : MATA
May jnr : MY
McCormack : MCARNA(C)
McDaniel : MCDANA(L)
McDonald : MCDANA(LD)
Mclaughlin : MCLAGL(AN)
Morrison : MARASA(N)
O'Banion : OBANAN
O'Brien : OBRAN
Richards : RACARD
Silva : SALV
Watkins : WATCAN
Xi : X
Wheeler : WALAR
Willis : WAL
brown, sr : BRAN
browne, III : BRAN
browne, IV : BRAN
knight : NAGT
mitchell : MATCAL
o'daniel : ODANAL
bevan : BAFAN
evans : EVAN
D'Souza : DSAS
Hoyle-Johnson : HAYLAJ(ANSAN)
Vaughan Williams: VAGANW(ALAN)
de Sousa : DASAS
de la Mare II : DALANA(R)</pre>
 
=={{header|Perl}}==
{{trans|Raku}}
<syntaxhighlight lang="perl">sub no_suffix {
my($name) = @_;
$name =~ s/\h([JS]R)|([IVX]+)$//i;
Line 1,081 ⟶ 1,310:
printf "%10s, %s\n", $_, $nysiis;
}
</syntaxhighlight>
</lang>
{{out}}
<pre style="height:35ex"> knight, NAGT
Line 1,119 ⟶ 1,348:
=={{header|Phix}}==
{{trans|Go}}
<!--<syntaxhighlight lang="phix">(phixonline)-->
<lang Phix>with javascript_semantics
function isVowel(integer byte)
    return find(byte,"AEIOU")!=0
end function

function isRoman(string s)
    if s == "" then
        return false
    end if
    for i=1 to length(s) do
        if not find(s[i],"IVX") then
            return false
        end if
    end for
    return true
end function

function nysiis(string word)
    if word == "" then return "" end if
    word = upper(word)
    sequence ww = split_any(word, ", ")
    if length(ww)>1 then
        string last = ww[$]
        if isRoman(last) then
            word = word[1..-length(last)-1]
        end if
    end if
    word = substitute_all(word, " ,'-", repeat("",4))
    sequence eStrs = {"JR", "JNR", "SR", "SNR"}
    for i=1 to length(eStrs) do
        string ei = eStrs[i]
        integer lei = length(ei)
        if length(word)>lei
        and word[-lei..$]=ei then
            word = word[1..-lei-1]
        end if
    end for
    sequence fStrs = {{"MAC","MCC"}, {"KN","N"}, {"K","C"},
                      {"PH","FF"}, {"PF","FF"}, {"SCH","SSS"}}
    for i=1 to length(fStrs) do
        string {fi,rfi} = fStrs[i]
        integer lfi = length(fi)
        if length(word)>lfi
        and word[1..lfi]=fi then
            word[1..lfi] = rfi
        end if
    end for
    if length(word)>=2 then
        string l2 = word[-2..-1]
        if find(l2,{"EE","IE"}) then
            word[-2..-1] = "Y"
        elsif find(l2,{"DT","RT","RD","NT","ND"}) then
            word[-2..-1] = "D"
        end if
    end if
    integer initial = word[1]
    string key = word[1..1]
    word = word[2..$]
    word = substitute_all(word,{"EV","KN","SCH","PH"},
                               {"AF","N", "SSS","FF"})
    string sb = key&word
    integer le := length(sb)
    for i=2 to le do
        switch sb[i] do
            case 'E', 'I', 'O', 'U': sb[i] = 'A'
            case 'Q': sb[i] = 'G'
            case 'Z': sb[i] = 'S'
            case 'M': sb[i] = 'N'
            case 'K': sb[i] = 'C'
            case 'H': if (i> 1 and not isVowel(sb[i-1]))
                      or (i<le and not isVowel(sb[i+1])) then
                          sb[i] = sb[i-1]
                      end if
            case 'W': if isVowel(sb[i-1]) then
                          sb[i] = sb[i-1]
                      end if
        end switch
    end for
    integer prev := initial
    for j=2 to le do
        integer c := sb[j]
<span style="color: #008080;">for</span> <span style="color: #000000;">j</span><span style="color: #0000FF;">=</span><span style="color: #000000;">2</span> <span style="color: #008080;">to</span> <span style="color: #000000;">le</span> <span style="color: #008080;">do</span>
if prev != c then
<span style="color: #004080;">integer</span> <span style="color: #000000;">c</span> <span style="color: #0000FF;">:=</span> <span style="color: #000000;">sb</span><span style="color: #0000FF;">[</span><span style="color: #000000;">j</span><span style="color: #0000FF;">]</span>
key &= c
<span style="color: #008080;">if</span> <span style="color: #000000;">prev</span> <span style="color: #0000FF;">!=</span> <span style="color: #000000;">c</span> <span style="color: #008080;">then</span>
prev = c
<span style="color: #000000;">key</span> <span style="color: #0000FF;">&=</span> <span style="color: #000000;">c</span>
end if
<span style="color: #000000;">prev</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">c</span>
end for
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
if length(key)>=1 and key[$] == 'S' then key[$ ..$] = "" end if
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
if length(key)>=2 and key[-2..-1] == "AY" then key[$-1..$] = "Y" end if
<span style="color: #008080;">if</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">key</span><span style="color: #0000FF;">)>=</span><span style="color: #000000;">1</span> <span style="color: #008080;">and</span> <span style="color: #000000;">key</span><span style="color: #0000FF;">[$]</span> <span style="color: #0000FF;">==</span> <span style="color: #008000;">'S'</span> <span style="color: #008080;">then</span> <span style="color: #000000;">key</span><span style="color: #0000FF;">[$</span> <span style="color: #0000FF;">..$]</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">""</span> <span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
if length(key)>=1 and key[$] == 'A' then key[$ ..$] = "" end if
<span style="color: #008080;">if</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">key</span><span style="color: #0000FF;">)>=</span><span style="color: #000000;">2</span> <span style="color: #008080;">and</span> <span style="color: #000000;">key</span><span style="color: #0000FF;">[-</span><span style="color: #000000;">2</span><span style="color: #0000FF;">..-</span><span style="color: #000000;">1</span><span style="color: #0000FF;">]</span> <span style="color: #0000FF;">==</span> <span style="color: #008000;">"AY"</span> <span style="color: #008080;">then</span> <span style="color: #000000;">key</span><span style="color: #0000FF;">[$-</span><span style="color: #000000;">1</span><span style="color: #0000FF;">..$]</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">"Y"</span> <span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
return key
<span style="color: #008080;">if</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">key</span><span style="color: #0000FF;">)>=</span><span style="color: #000000;">1</span> <span style="color: #008080;">and</span> <span style="color: #000000;">key</span><span style="color: #0000FF;">[$]</span> <span style="color: #0000FF;">==</span> <span style="color: #008000;">'A'</span> <span style="color: #008080;">then</span> <span style="color: #000000;">key</span><span style="color: #0000FF;">[$</span> <span style="color: #0000FF;">..$]</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">""</span> <span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
end function
<span style="color: #008080;">return</span> <span style="color: #000000;">key</span>
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span>
constant tests = {
{ "Bishop", "BASAP" },
<span style="color: #008080;">constant</span> <span style="color: #000000;">tests</span> <span style="color: #0000FF;">=</span> <span style="color: #0000FF;">{</span>
{ "Carlson", "CARLSAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Bishop"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"BASAP"</span> <span style="color: #0000FF;">},</span>
{ "Carr", "CAR" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Carlson"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"CARLSAN"</span> <span style="color: #0000FF;">},</span>
{ "Chapman", "CAPNAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Carr"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"CAR"</span> <span style="color: #0000FF;">},</span>
{ "Franklin", "FRANCLAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Chapman"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"CAPNAN"</span> <span style="color: #0000FF;">},</span>
{ "Greene", "GRAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Franklin"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"FRANCLAN"</span> <span style="color: #0000FF;">},</span>
{ "Harper", "HARPAR" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Greene"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"GRAN"</span> <span style="color: #0000FF;">},</span>
{ "Jacobs", "JACAB" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Harper"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"HARPAR"</span> <span style="color: #0000FF;">},</span>
{ "Larson", "LARSAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Jacobs"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"JACAB"</span> <span style="color: #0000FF;">},</span>
{ "Lawrence", "LARANC" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Larson"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"LARSAN"</span> <span style="color: #0000FF;">},</span>
{ "Lawson", "LASAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Lawrence"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"LARANC"</span> <span style="color: #0000FF;">},</span>
{ "Louis, XVI", "L" }, -- (see note)
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Lawson"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"LASAN"</span> <span style="color: #0000FF;">},</span>
{ "Lynch", "LYNC" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Louis, XVI"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"L"</span> <span style="color: #0000FF;">},</span> <span style="color: #000080;font-style:italic;">-- (see note)</span>
{ "Mackenzie", "MCANSY" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Lynch"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"LYNC"</span> <span style="color: #0000FF;">},</span>
{ "Matthews", "MAT" }, -- (see note)
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Mackenzie"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"MCANSY"</span> <span style="color: #0000FF;">},</span>
{ "May jnr", "MY" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Matthews"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"MAT"</span> <span style="color: #0000FF;">},</span> <span style="color: #000080;font-style:italic;">-- (see note)</span>
{ "McCormack", "MCARNAC" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"May jnr"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"MY"</span> <span style="color: #0000FF;">},</span>
{ "McDaniel", "MCDANAL" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"McCormack"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"MCARNAC"</span> <span style="color: #0000FF;">},</span>
{ "McDonald", "MCDANALD" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"McDaniel"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"MCDANAL"</span> <span style="color: #0000FF;">},</span>
{ "Mclaughlin", "MCLAGLAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"McDonald"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"MCDANALD"</span> <span style="color: #0000FF;">},</span>
{ "Morrison", "MARASAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Mclaughlin"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"MCLAGLAN"</span> <span style="color: #0000FF;">},</span>
{ "O'Banion", "OBANAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Morrison"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"MARASAN"</span> <span style="color: #0000FF;">},</span>
{ "O'Brien", "OBRAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"O'Banion"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"OBANAN"</span> <span style="color: #0000FF;">},</span>
{ "Richards", "RACARD" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"O'Brien"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"OBRAN"</span> <span style="color: #0000FF;">},</span>
{ "Silva", "SALV" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Richards"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"RACARD"</span> <span style="color: #0000FF;">},</span>
{ "Watkins", "WATCAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Silva"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"SALV"</span> <span style="color: #0000FF;">},</span>
{ "Wheeler", "WALAR" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Watkins"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"WATCAN"</span> <span style="color: #0000FF;">},</span>
{ "Willis", "WAL" }, -- (see note)
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Wheeler"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"WALAR"</span> <span style="color: #0000FF;">},</span>
{ "Xi", "X" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Willis"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"WAL"</span> <span style="color: #0000FF;">},</span> <span style="color: #000080;font-style:italic;">-- (see note)</span>
{ "bevan", "BAFAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Xi"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"X"</span> <span style="color: #0000FF;">},</span>
{ "brown, sr", "BRAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"bevan"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"BAFAN"</span> <span style="color: #0000FF;">},</span>
{ "brown sr", "BRAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"brown, sr"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"BRAN"</span> <span style="color: #0000FF;">},</span>
{ "browne, III", "BRAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"brown sr"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"BRAN"</span> <span style="color: #0000FF;">},</span>
{ "browne, IV", "BRAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"browne, III"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"BRAN"</span> <span style="color: #0000FF;">},</span>
{ "evans", "EVAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"browne, IV"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"BRAN"</span> <span style="color: #0000FF;">},</span>
{ "knight", "NAGT" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"evans"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"EVAN"</span> <span style="color: #0000FF;">},</span>
{ "mitchell", "MATCAL" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"knight"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"NAGT"</span> <span style="color: #0000FF;">},</span>
{ "o'daniel", "ODANAL" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"mitchell"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"MATCAL"</span> <span style="color: #0000FF;">},</span>
{ "D'Souza", "DSAS" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"o'daniel"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"ODANAL"</span> <span style="color: #0000FF;">},</span>
{ "de Sousa", "DASAS" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"D'Souza"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"DSAS"</span> <span style="color: #0000FF;">},</span>
{ "Hoyle-Johnson", "HAYLAJANSAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"de Sousa"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"DASAS"</span> <span style="color: #0000FF;">},</span>
{ "Vaughan Williams", "VAGANWALAN" },
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Hoyle-Johnson"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"HAYLAJANSAN"</span> <span style="color: #0000FF;">},</span>
{ "de la Mare II", "DALANAR" } }
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"Vaughan Williams"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"VAGANWALAN"</span> <span style="color: #0000FF;">},</span>
 
<span style="color: #0000FF;">{</span> <span style="color: #008000;">"de la Mare II"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"DALANAR"</span> <span style="color: #0000FF;">}</span> <span style="color: #0000FF;">}</span>
integer errors = 0
for i=1 to length(tests) do
<span style="color: #004080;">integer</span> <span style="color: #000000;">errors</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">0</span>
string {name,expected} = tests[i],
<span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">tests</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span>
name2 := nysiis(name)
<span style="color: #004080;">string</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">name</span><span style="color: #0000FF;">,</span><span style="color: #000000;">expected</span><span style="color: #0000FF;">}</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">tests</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">],</span>
if name2!=expected then
<span style="color: #000000;">name2</span> <span style="color: #0000FF;">:=</span> <span style="color: #000000;">nysiis</span><span style="color: #0000FF;">(</span><span style="color: #000000;">name</span><span style="color: #0000FF;">)</span>
errors += 1
<span style="color: #008080;">if</span> <span style="color: #000000;">name2</span><span style="color: #0000FF;">!=</span><span style="color: #000000;">expected</span> <span style="color: #008080;">then</span>
if length(name2) > 6 then
<span style="color: #000000;">errors</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">1</span>
name2 = sprintf("%s(%s)", {name2[1..6], name2[7..$]})
<span style="color: #008080;">if</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">name2</span><span style="color: #0000FF;">)</span> <span style="color: #0000FF;">></span> <span style="color: #000000;">6</span> <span style="color: #008080;">then</span>
end if
<span style="color: #000000;">name2</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">sprintf</span><span style="color: #0000FF;">(</span><span style="color: #008000;">"%s(%s)"</span><span style="color: #0000FF;">,</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">name2</span><span style="color: #0000FF;">[</span><span style="color: #000000;">1</span><span style="color: #0000FF;">..</span><span style="color: #000000;">6</span><span style="color: #0000FF;">],</span> <span style="color: #000000;">name2</span><span style="color: #0000FF;">[</span><span style="color: #000000;">7</span><span style="color: #0000FF;">..$]})</span>
printf(1,"%-16s : %s\n", {name, name2})
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
end if
<span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"%-16s : %s\n"</span><span style="color: #0000FF;">,</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">name</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">name2</span><span style="color: #0000FF;">})</span>
end for
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span>
printf(1,"All tests completed, %d errors\n",errors)</lang>
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span>
<span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"All tests completed, %d errors\n"</span><span style="color: #0000FF;">,</span><span style="color: #000000;">errors</span><span style="color: #0000FF;">)</span>
<!--</syntaxhighlight>-->
Note: After some careful consideration, I have decided that all three tests marked (see note) <i>are</i> in fact correct, or at least follow the Wikipedia description, specifically in applying step 6 <i>before</i> step 7.
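
The effect of that ordering can be sketched in Python, the language of another entry on this page. This is only an illustration, not part of the Phix entry: the function name <code>trim_key_ending</code> is hypothetical, and the untrimmed keys LAS, MATAS and WALAS (for Louis, Matthews and Willis) are hand-derived assumptions rather than values taken from the source.

```python
def trim_key_ending(key: str) -> str:
    """Apply the three end-of-key rules in the same order as the Phix code."""
    # Rule 1: drop a trailing S.
    if key.endswith("S"):
        key = key[:-1]
    # Rule 2: replace a trailing AY with Y.
    if key.endswith("AY"):
        key = key[:-2] + "Y"
    # Rule 3: drop a trailing A (which may only have been exposed by rule 1).
    if key.endswith("A"):
        key = key[:-1]
    return key


# Assumed untrimmed keys for Louis, Matthews and Willis (hand-derived).
for raw in ("LAS", "MATAS", "WALAS"):
    print(raw, "->", trim_key_ending(raw))
```

Because the S is removed first, the A it was hiding becomes the last character and is removed as well, which is how the three (see note) names end up with such short keys.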
{{out}}
Line 1,276 ⟶ 1,508:
=={{header|Python}}==
A literal translation of the algorithm from the [[wp:New York State Identification and Intelligence System|Wikipedia article]].
<syntaxhighlight lang="python">import re
 
_vowels = 'AEIOU'
Line 1,329 ⟶ 1,561:
'knight', 'mitchell', "o'daniel"]
for name in names:
    print('%15s: %s' % (name, nysiis(name)))</syntaxhighlight>
{{out}}
<pre> Bishop: BASAP
Line 1,377 ⟶ 1,609:
verbose.
 
<syntaxhighlight lang="racket">#lang racket/base
(require racket/string racket/match)
(define (str-rplc-at str replacement start (end (string-length str)))
Line 1,456 ⟶ 1,688:
 
(for ((n names) (p py-nysiis-names))
(check-equal? (nysiis n) p (format "(nysiis ~s) = ~s" n p))))</syntaxhighlight>
 
=={{header|Raku}}==
Line 1,463 ⟶ 1,695:
This implementation removes common name suffixes similar to the reference implementation, even though it is not specified in the task description or on the linked [[wp:New York State Identification and Intelligence System|NYSIIS]] page. This algorithm isn't too friendly to certain French kings. :)
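
As a rough illustration of this kind of suffix removal, here is a minimal Python sketch. It is not a translation of the Raku code: the function name <code>strip_suffix</code>, the pattern, and the small suffix list (JNR, JR, SNR, SR and Roman numerals built from I, V and X) are all assumptions chosen for the example.

```python
import re

# Hypothetical pattern: a separator (comma and/or whitespace) followed by
# a generational suffix or a Roman numeral at the very end of the name.
_SUFFIX = re.compile(r"[,\s]+(?:JNR|JR|SNR|SR|[IVX]+)$")


def strip_suffix(name: str) -> str:
    """Upper-case the name and drop one trailing suffix, if present."""
    return _SUFFIX.sub("", name.strip().upper())


for name in ("brown sr", "browne, III", "Louis XVI", "Xi"):
    print(name, "->", strip_suffix(name))
```

A regnal number is indistinguishable from a generational suffix under this rule, so "Louis XVI" is reduced to plain "LOUIS" before the phonetic key is even computed. A single-word name such as "Xi" is left alone because the pattern requires a separator before the suffix.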
 
<syntaxhighlight lang="raku" line>sub no_suffix ($name) {
    $name.uc.subst: /\h (<[JS]>R) | (<[IVX]>+) $/, '';
}
Line 1,513 ⟶ 1,745:
}
printf "%10s, %s\n", $_, $nysiis;
}</syntaxhighlight>
 
Output:
Line 1,562 ⟶ 1,794:
If the rule of only returning (up to) six characters is to be enforced, then the last REXX statement should be
replaced with:
<syntaxhighlight lang="rexx">return strip( left(key, 6) ) /*return the leftmost six characters. */</syntaxhighlight>
<syntaxhighlight lang="rexx">/*REXX program implements the NYSIIS phonetic algorithm (for various test names). */
@@= "Bishop brown_sr browne_III browne_IV Carlson Carr Chapman D'Souza de_Sousa Franklin",
"Greene Harper Hoyle-Johnson Jacobs knight Larson Lawrence Lawson Louis_XVI Lynch",
"Mackenzie Marshall,ESQ Matthews McCormack McDaniel McDonald Mclaughlin mitchell Morrison",
"O'Banion O'Brien o'daniel Richards Silva Vaughan_Williams Watkins Wheeler Willis Xavier,MD."
parse upper arg z; if z='' then z= @@ /*obtain optional name list from the CL*/
 
do i=1 for words(z) /*process each name (word) in the list.*/
xx= translate( word(z, i), , '_') /*reconstitute any blanks using TRANS. */
say right(xx, 35) ' ──► ' nysiis(xx) /*display some stuff to the terminal. */
end /*i*/
exit 0 /*stick a fork in it, we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
$: p= substr(x,j-1,1) /*prev*/; n= substr(x,j+1,1) /*next*/; return substr(x,j,arg(1))
vowel: return pos(arg(1), 'AEIOUaeiou') \== 0 /*returns 1 if the argument has a vowel*/
/*──────────────────────────────────────────────────────────────────────────────────────*/
nysiis: procedure; arg x; x= space( translate(x, , ',')) /*elide commas, excess blanks*/
w= words(x); Lw= word(x, w) /*pick off the last word in name list. */
@titles= 'ESQ JNR JR SNR SR' /* [↓] last word post─nominal letters?*/
if w\==1 then if pos('IL', lw)==0 then /*disallow IL as Roman #. */
if pos(., x)\==0 |, /*Sr. Jr. Esq. ...··· ? */
datatype( left(Lw, 1), 'W') |, /*2nd 3rd 4th ...··· ? */
verify(Lw, 'IVXL') ==0 |, /*Roman numeral suffix? */
wordpos(x, @titles)\==0 then x= subword(x, 1, w-1)
x= space(x, 0) /*remove all whitespace from the name. */
if left(x, 3)=='MAC' then x= "MCC"substr(x, 4) /*start with MAC ? */
if left(x, 2)=='KN' then x= "N"substr(x, 3) /* " " KN ? */
if left(x, 1)=='K' then x= "C"substr(x, 2) /* " " K ? */
if left(x, 2)=='PH' | left(x,2)=="PF" then x= 'FF'substr(x, 3) /* " " PH,PF?*/
if left(x, 3)=='SCH' then x= "SSS"substr(x, 4) /* " " SCH ? */
r2= right(x, 2)
if wordpos(r2, 'EE IE') \==0 then x= left(x, length(x)-2)"Y" /*ends with ··· ?*/
if wordpos(r2, 'DT RT RD NT ND')\==0 then x= left(x, length(x)-2)"D" /* " " " "*/
key= left(x, 1) /*use first char.*/
 
do j=2 to length(x); if \datatype($(1), 'U') then iterate /*¬ Latin letter? Skip it*/
if $(2)=='EV' then x= overlay("F", x, j+1) /*have an EV ? Use F */
else x= overlay( translate( $(1), 'AAAAGSN', "EIOUQZM"), x, j)
if $(2)=='KN' then x= left(x, j-1)"N"substr(x, j+1) /*have a KN ? Use N */
else if $(1)=="K" then x= overlay('C',x,j) /* " " K ? Use C */
if $(3)=='SCH' then x= overlay("SSS", x, j) /* " " SCH? Use SSS*/
if $(2)=='PH' then x= overlay("FF", x, j) /* " " PH ? Use FF */
if $(1)=='H' then if \vowel(p) | \vowel(n) then x= overlay( p , x, j)
if $(1)=='W' then if vowel(p) then x= overlay("A", x, j)
if $(1)\== right(key, 1) then key= key || $(1) /*append to KEY.*/
end /*j*/
/* [↓] elide: */
if right(key, 1)=='S' then key= left(key, max(1, length(key) -1)) /*ending S */
if right(key, 2)=='AY' then key= left(key, length(key) -2)"Y" /* " A in AY*/
if right(key, 1)=='A' then key= left(key, max(1, length(key) -1)) /* " A */
return strip(key) /*return the whole key (all of it). */</syntaxhighlight>
{{out|output|text=&nbsp; when using the default input(s):}}
<pre>
Bishop ──► BASAP
Line 1,658 ⟶ 1,890:
 
=={{header|Tcl}}==
<syntaxhighlight lang="tcl">proc nysiis {name {truncate false}} {
# Normalize to first word, uppercased, without non-letters
set name [regsub -all {[^A-Z]+} [string toupper [regexp -inline {\S+} $name]] ""]
Line 1,685 ⟶ 1,917:
}
return $name
}</syntaxhighlight>
Demonstrating:
<syntaxhighlight lang="tcl">foreach name {
knight mitchell "o'daniel" "brown sr" "browne III"
"browne IV" "O'Banion" Mclaughlin McCormack Chapman
Line 1,697 ⟶ 1,929:
} {
puts "$name -> [nysiis $name]"
}</syntaxhighlight>
{{out}}
<pre>
Line 1,733 ⟶ 1,965:
Wheeler -> WHALAR
Louis XVI -> L
</pre>
 
=={{header|Wren}}==
{{trans|Kotlin}}
{{libheader|Wren-str}}
{{libheader|Wren-pattern}}
{{libheader|Wren-fmt}}
<syntaxhighlight lang="wren">import "./str" for Str
import "./pattern" for Pattern
import "./fmt" for Fmt
 
var fStrs = [["MAC", "MCC"], ["KN", "N"], ["K", "C"], ["PH", "FF"], ["PF", "FF"], ["SCH", "SSS"]]
var lStrs = [["EE", "Y"], ["IE", "Y"], ["DT", "D"], ["RT", "D"], ["RD", "D"], ["NT", "D"], ["ND", "D"]]
var mStrs = [["EV", "AF"], ["KN", "N"], ["SCH", "SSS"], ["PH", "FF"]]
var eStrs = ["JR", "JNR", "SR", "SNR"]
 
var isVowel = Fn.new { |c| "AEIOU".contains(c) }
 
var isRoman = Fn.new { |s| s.all { |c| "IVX".contains(c) } }
 
var splitter = Pattern.new("[ |,]")
 
var nysiis = Fn.new { |word|
    if (word == "") return word
    var w = Str.upper(word)
    var ww = splitter.splitAll(w)
    if (ww.count > 1 && isRoman.call(ww[-1])) w = w[0...w.count-ww[-1].count]
    for (c in " ,'-") w = w.replace(c, "")
    for (eStr in eStrs) {
        if (w.endsWith(eStr)) w = w[0...w.count-eStr.count]
    }
    for (fStr in fStrs) {
        if (w.startsWith(fStr[0])) w = fStr[1] + w[fStr[0].count..-1]
    }
    for (lStr in lStrs) {
        if (w.endsWith(lStr[0])) w = w[0..-3] + lStr[1]
    }
    var key = w[0]
    w = w[1..-1]
    for (mStr in mStrs) w = w.replace(mStr[0], mStr[1])
    var sb = (key[0] + w).toList
    var i = 1
    var len = sb.count
    while (i < len) {
        if ("EIOU".contains(sb[i])) {
            sb[i] = "A"
        } else if (sb[i] == "Q") {
            sb[i] = "G"
        } else if (sb[i] == "Z") {
            sb[i] = "S"
        } else if (sb[i] == "M") {
            sb[i] = "N"
        } else if (sb[i] == "K") {
            sb[i] = "C"
        } else if (sb[i] == "H") {
            if (!isVowel.call(sb[i-1]) || (i < len-1 && !isVowel.call(sb[i+1]))) sb[i] = sb[i-1]
        } else if (sb[i] == "W") {
            if (isVowel.call(sb[i-1])) sb[i] = "A"
        }
        i = i + 1
    }
    if (sb[len-1] == "S") {
        sb = sb[0...len-1]
        len = len - 1
    }
    if (len > 1 && Str.sub(sb.join(""), len-2..-1) == "AY") {
        sb = sb[0...len-2]
        sb = sb + ["Y"]
        len = len -1
    }
    if (len > 0 && sb[len-1] == "A") {
        sb = sb[0...len-1]
        len = len - 1
    }
    var prev = key[0]
    for (j in 1...len) {
        var c = sb[j]
        if (prev != c) {
            key = key + c
            prev = c
        }
    }
    return key
}

var names = [
    "Bishop", "Carlson", "Carr", "Chapman",
    "Franklin", "Greene", "Harper", "Jacobs", "Larson", "Lawrence",
    "Lawson", "Louis, XVI", "Lynch", "Mackenzie", "Matthews", "May jnr",
    "McCormack", "McDaniel", "McDonald", "Mclaughlin", "Morrison",
    "O'Banion", "O'Brien", "Richards", "Silva", "Watkins", "Xi",
    "Wheeler", "Willis", "brown, sr", "browne, III", "browne, IV",
    "knight", "mitchell", "o'daniel", "bevan", "evans", "D'Souza",
    "Hoyle-Johnson", "Vaughan Williams", "de Sousa", "de la Mare II"
]

for (name in names) {
    var name2 = nysiis.call(name)
    if (name2.count > 6) name2 = Fmt.swrite("$s($s)", name2[0..5], name2[6..-1])
    Fmt.print("$-16s : $s", name, name2)
}</syntaxhighlight>
 
{{out}}
<pre>
Bishop : BASAP
Carlson : CARLSA(N)
Carr : CAR
Chapman : CAPNAN
Franklin : FRANCL(AN)
Greene : GRAN
Harper : HARPAR
Jacobs : JACAB
Larson : LARSAN
Lawrence : LARANC
Lawson : LASAN
Louis, XVI : LA
Lynch : LYNC
Mackenzie : MCANSY
Matthews : MATA
May jnr : MY
McCormack : MCARNA(C)
McDaniel : MCDANA(L)
McDonald : MCDANA(LD)
Mclaughlin : MCLAGL(AN)
Morrison : MARASA(N)
O'Banion : OBANAN
O'Brien : OBRAN
Richards : RACARD
Silva : SALV
Watkins : WATCAN
Xi : X
Wheeler : WALAR
Willis : WAL
brown, sr : BRAN
browne, III : BRAN
browne, IV : BRAN
knight : NAGT
mitchell : MATCAL
o'daniel : ODANAL
bevan : BAFAN
evans : EVAN
D'Souza : DSAS
Hoyle-Johnson : HAYLAJ(ANSAN)
Vaughan Williams : VAGANW(ALAN)
de Sousa : DASAS
de la Mare II : DALANA(R)
</pre>
 
=={{header|zkl}}==
{{trans|Python}}
<syntaxhighlight lang="zkl">fcn replaceAt(text,pos,fromList,toList){
foreach f,t in (fromList.zip(toList)){
if(0==text[pos,*].find(f)) return(text.set(pos,f.len(),t));
Line 1,772 ⟶ 2,150:
}
replaceEnd(key, T("S", "AY", "A"), T("", "Y", ""));
}</syntaxhighlight>
<syntaxhighlight lang="zkl">names := T("Bishop", "Carlson", "Carr", "Chapman",
"Franklin", "Greene", "Harper", "Jacobs", "Larson", "Lawrence",
"Lawson", "Louis, XVI", "Lynch", "Mackenzie", "Matthews",
Line 1,781 ⟶ 2,159:
"knight", "mitchell", "o'daniel");
foreach name in (names){ println("%11s: %s".fmt(name, nysiis(name))) }</syntaxhighlight>
{{out}}
<pre>