Text completion: Difference between revisions



Process finished with exit code 0
</pre>

=={{header|jq}}==
{{works with|jq}}
This entry uses unixdict.txt, the dictionary used frequently on other RosettaCode pages.

Since the title of this particular page is "Text completion", the
solution offered here includes both a straightforward lookup of the dictionary to
check for possible completions, and a check for equally good matches
based on the Levenshtein distance.

The Levenshtein module used is the collection of function definitions on
the RC page https://rosettacode.org/wiki/Levenshtein_distance
and is thus not repeated here.

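The "percentage" reported (the p field) is 100 * max(0, length - d) / length, rounded to the nearest integer,
where length is the length of the given word and d is the Levenshtein distance; see also the comments in the code.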
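
As a worked illustration of the percentage computation, the following sketch evaluates it for the two misspelled words used in this entry. The function name similarity is hypothetical and is not part of the program; the distances 1 and 4 are taken from the output shown further down.
<lang jq>
# Similarity percentage for a word of length $length
# at Levenshtein distance $d from a candidate (illustrative helper only)
def similarity($length; $d):
  (100 * ([0, $length - $d] | max) / $length) | round;

similarity(10; 1),   # "complition" vs "completion"
similarity(12; 4)    # "compxxxxtion" vs "competition"
</lang>
Run with jq -n, this emits 90 and 67, which agrees with the p values in the output.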

The "debug" statement that shows the number of words under consideration for the best Levenshtein matches has been retained.
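
The program calls minima(.d), which is not defined in the listing. If it is not supplied by the included module, a minimal sketch is the following, under the assumption that minima(f) is meant to return all the items of the input array for which f is smallest:
<lang jq>
# Input: an array
# Output: the array of those items for which f takes its minimal value
# (an assumed definition, not taken from the original entry)
def minima(f):
  if length == 0 then []
  else (map(f) | min) as $min
  | map(select(f == $min))
  end;
</lang>
For example, [{"word":"a","d":2},{"word":"b","d":1},{"word":"c","d":1}] | minima(.d) keeps only the two objects with d equal to 1, which is how ties such as the seven matches for "compxxxxtion" can all be reported.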
<lang jq>
include "levenshtein-distance" {search: "."}; # https://rosettacode.org/wiki/Levenshtein_distance#jq

# Input: a dictionary
# Output: an array of {word, d, p} objects showing the best matches as measured
# by the Levenshtein distance, d; p is an indicator of similarity expressed as
# 100 * (max(0, ($word|length) - d) / ($word|length) ) rounded to the nearest integer.
# If $word is itself in the dictionary, a single {word, d, p} object is emitted.
def closest($word):
  if .[$word] then {$word, d: 0, p: 100}
  else
    "Levenshtein-closest words to \($word):",
    (($word|length) as $length
     # only consider words whose length differs from $length by at most 2
     | (keys_unsorted | map(select(length | (. > $length-3) and (. < $length + 3)))) as $candidates
     | $candidates
     | (length|debug) as $debug   # report the number of candidates on stderr
     | map( {word: ., d: levenshteinDistance($word; .)} )
     # minima/1 is not defined in this listing; it is assumed to be available
     # and to select the items for which .d is smallest
     | minima(.d)
     | map( .p = (100 * ([0, ($length - .d) ] | max / $length) | round )))
  end ;

# Input: a dictionary
# Output: an array of possible completions of $word
def completion($word):
  "Possible completions of \($word):",
  (keys_unsorted | map(select(startswith($word))));

def task:
  INDEX(inputs; .)  # the dictionary
  | completion("compli"),
    closest("complition"),
    closest("compxxxxtion")
  ;

task
</lang>
{{out}}
Invocation: jq -nR -f text-completion.jq unixdict.txt

This assumes that levenshtein-distance.jq is in the current working directory,
as specified by the search path in the include directive at the top of the program.
<pre>
Possible completions of compli:
[
  "compliant",
  "complicate",
  "complicity",
  "compliment",
  "complimentary",
  "compline"
]
Levenshtein-closest words to complition:
["DEBUG:",10396]
[
  {
    "word": "completion",
    "d": 1,
    "p": 90
  }
]
"Levenshtein-closest words to compxxxxtion:"
["DEBUG:",4082]
[
  {
    "word": "competition",
    "d": 4,
    "p": 67
  },
  {
    "word": "compilation",
    "d": 4,
    "p": 67
  },
  {
    "word": "completion",
    "d": 4,
    "p": 67
  },
  {
    "word": "complexion",
    "d": 4,
    "p": 67
  },
  {
    "word": "composition",
    "d": 4,
    "p": 67
  },
  {
    "word": "compunction",
    "d": 4,
    "p": 67
  },
  {
    "word": "computation",
    "d": 4,
    "p": 67
  }
]
</pre>