Most frequent k chars distance: Difference between revisions

Most frequent k chars distance (view source)

Revision as of 23:58, 9 October 2017

38 bytes removed , 6 years ago

m

Reduce some of the bogus markup in the task description so the contents actually contains the entries

Thundergnat

10,327

edits

Revision as of 23:49, 9 October 2017 (view source) Thundergnat (talk \| contribs) (→‎{{header\|Perl6}}: Add Perl6 example) ← Older edit		Revision as of 23:58, 9 October 2017 (view source) Thundergnat (talk \| contribs) m (Reduce some of the bogus markup in the task description so the contents actually contains the entries) Newer edit →
Line 39: The string distance ==;Definition== Method has two steps. * [[wp:Hash function\|Hash]] input strings str1 and str2 separately using MostFreqKHashing and output hstr1 and hstr2 respectively * Calculate string distance (or string similarity coefficient) of two hash outputs, hstr1 and hstr2 and output an integer value ~~===~~;Most Frequent K Hashing~~===~~ The first step of algorithm is calculating the hashing based on the most frequent k characters. The hashing algorithm has below steps: '''string function''' ''MostFreqKHashing'' ('''string''' inputString, '''int''' K) Line 64: Similar to the most of [[wp:hashing function\|hashing functions]], ''Most Frequent K Hashing'' is also a [[wp:one way function]]. ~~===~~;Most Frequent K Distance~~===~~ Distance calculation between two strings is based on the hash outputs of two strings. '''int function''' ''MostFreqKSimilarity'' ('''string''' inputStr1, '''string''' inputStr2) Line 77: In some implementations, the distance metric is required instead of similarity coefficient. In order to convert the output of above similarity coefficient to distance metric, the output can be subtracted from any constant value (like the maximum possible output value). For the case, it is also possible to implement a [[wp:wrapper function]] over above two functions. ~~===~~;;String Distance Wrapper Function~~===~~ In order to calculate the distance between two strings, below function can be implemented '''int function''' MostFreqKSDF ('''string''' inputStr1, '''string''' inputStr2, '''int''' K, '''int''' maxDistance) Line 84: Any call to above string distance function will supply two input strings and a maximum distance value. The function will calculate the similarity and subtract that value from the maximum possible distance. It can be considered as a simple [[wp:additive inverse]] of similarity. ==;Examples== Let's consider maximum 2 frequent hashing over two strings ‘research’ and ‘seeking’. <lang javascript>MostFreqKHashing('research',2) = 'r2e2'</lang> Line 142: MostFreqKSDF(str1,str2,2,100) = 83 </pre> ~~=Implementations=~~ =={{header\|C++}}== Line 918: It computes the identical value on the identical inputs. =;References= {{reflist}}