Text completion: Difference between revisions
Process finished with exit code 0
</pre>

=={{header|jq}}==
{{works with|jq}}

This entry uses unixdict.txt, the dictionary used frequently on other RosettaCode pages.

Since the title of this particular page is "Text completion", the solution offered here includes both a straightforward lookup of the dictionary to check for possible completions, and a check for equally good matches based on the Levenshtein distance.

The Levenshtein module used is the collection of function definitions on the RC page https://rosettacode.org/wiki/Levenshtein_distance and is thus not repeated here.

The "percentage" reported is computed as described in the comments.

The "debug" statement that shows the number of words under consideration for the best Levenshtein matches is retained.
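
As a worked illustration of the "percentage" formula, here is a minimal sketch. The helper name similarity is hypothetical and is not part of the program below; it only restates the formula given in the program's comments. For "complition" (length 10) at distance 1 it gives 100 * 9/10 = 90, and for "compxxxxtion" (length 12) at distance 4 it gives 100 * 8/12, which rounds to 67.

<lang jq>
# Hypothetical helper, for illustration only.
# $length: the length of the word being looked up
# $d:      the Levenshtein distance to a candidate word
# Output:  100 * max(0, $length - $d) / $length, rounded to the nearest integer
def similarity($length; $d):
  (100 * ([0, $length - $d] | max / $length)) | round;

similarity(10; 1),   # 90
similarity(12; 4),   # 67
similarity(5; 9)     # 0, because the numerator is clamped at 0
</lang>

The clamp at 0 matters only when the distance exceeds the length of the word, which keeps the indicator from going negative.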
<lang jq>
include "levenshtein-distance" {search: "."}; # https://rosettacode.org/wiki/Levenshtein_distance#jq

# Input: a dictionary
# Output: an array of {word, d, p} objects showing the best matches as measured
# by the Levenshtein distance, d; p is an indicator of similarity expressed as
# 100 * (max(0, ($word|length) - d) / ($word|length) ) rounded to the nearest integer
def closest($word):
  # An exact match yields a single object rather than an array
  if .[$word] then {$word, d:0, percentage: 100}
  else
    "Levenshtein-closest words to \($word):",
    (($word|length) as $length
     # Only consider words whose length is within 2 of the length of $word
     | (keys_unsorted | map(select(length | (. > $length-3) and (. < $length + 3)))) as $candidates
     | $candidates
     | (length|debug) as $debug
     | map( {word: ., d: levenshteinDistance($word; .)} )
     # Keep only the entries with the smallest d (minima/1 is not defined in this listing)
     | minima(.d)
     | map( .p = (100 * ([0, ($length - .d) ] | max / $length) | round )))
  end ;

# Input: a dictionary
# Output: an array of possible completions of $word
def completion($word):
  "Possible completions of \($word):",
  (keys_unsorted | map(select(startswith($word))));

def task:
  INDEX(inputs; .)   # the dictionary
  | completion("compli"),
    closest("complition"),
    closest("compxxxxtion")
;

task
</lang>
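
The program above calls minima(.d), which does not appear in the listing. If it is not supplied by the included module, a definition along the following lines would have to precede closest. This is a sketch under that assumption and not necessarily the definition the author used.

<lang jq>
# Assumed definition, for illustration only.
# Input:  an array
# Output: the array of those items for which f is minimal
def minima(f):
  (map(f) | min) as $min
  | map(select(f == $min));

# Example: both entries at distance 1 are kept, in their original order
[{word: "a", d: 2}, {word: "b", d: 1}, {word: "c", d: 1}] | minima(.d)
</lang>

Keeping every item that attains the minimum, rather than just the first, is what lets closest report several equally good matches.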
{{out}}
Invocation: jq -nR -f text-completion.jq unixdict.txt

This assumes levenshtein-distance.jq is in the current working directory.

For the "DEBUG" lines in the output, see the remarks above.
<pre>
Possible completions of compli:
[
  "compliant",
  "complicate",
  "complicity",
  "compliment",
  "complimentary",
  "compline"
]
Levenshtein-closest words to complition:
["DEBUG:",10396]
[
  {
    "word": "completion",
    "d": 1,
    "p": 90
  }
]
"Levenshtein-closest words to compxxxxtion:"
["DEBUG:",4082]
[
  {
    "word": "competition",
    "d": 4,
    "p": 67
  },
  {
    "word": "compilation",
    "d": 4,
    "p": 67
  },
  {
    "word": "completion",
    "d": 4,
    "p": 67
  },
  {
    "word": "complexion",
    "d": 4,
    "p": 67
  },
  {
    "word": "composition",
    "d": 4,
    "p": 67
  },
  {
    "word": "compunction",
    "d": 4,
    "p": 67
  },
  {
    "word": "computation",
    "d": 4,
    "p": 67
  }
]
</pre>