WiktionaryDumps to words: Difference between revisions

Make a file that can be used with [https://en.wikipedia.org/wiki/Spell_checker spell checkers] such as [https://fr.wikipedia.org/wiki/Ispell Ispell] and [https://en.wikipedia.org/wiki/GNU_Aspell Aspell].
 
Use the [https://dumps.wikimedia.org/enwiktionary/latest/enwiktionary-latest-pages-articles.xml.bz2 Wiktionary dump] (input) to create a file equivalent to [https://manpages.ubuntu.com/manpages/bionic/man5/spanish.5.html "/usr/share/dict/spanish"] (output). The input is an XML dump of Wiktionary, distributed as a bz2-compressed file of about 800 MB. The output should be a plain text file similar to "/usr/share/dict/spanish", with one word of the given language per line. An example of such a file is available in Ubuntu in the package '''wspanish'''.
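One possible approach, given here as a minimal sketch rather than a reference solution, is to stream the compressed dump and keep the title of every main-namespace page whose wikitext contains a level-2 heading for the target language (for example <code>==Spanish==</code>). The function names <code>iter_words</code> and <code>write_word_list</code> are hypothetical, and treating a page title as "a word in the language" whenever that heading is present is an assumption about how entries are laid out in the English Wiktionary, not something the task statement specifies.

```python
import bz2
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_words(stream, language="Spanish"):
    """Yield titles of main-namespace pages that have a ==language== heading.

    Assumption: a language section starts with a level-2 heading on its
    own line, e.g. '==Spanish=='.
    """
    heading = re.compile(
        r"^==\s*" + re.escape(language) + r"\s*==\s*$", re.MULTILINE
    )
    context = ET.iterparse(stream, events=("start", "end"))
    _event, root = next(context)  # first event is the start of the root element
    title = ns = text = None
    for event, elem in context:
        if event != "end":
            continue
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "ns":
            ns = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            # Namespace "0" holds the dictionary entries themselves.
            if ns == "0" and title and text and heading.search(text):
                yield title
            title = ns = text = None
            root.clear()  # discard processed pages so memory stays bounded


def write_word_list(dump_path, out_path, language="Spanish"):
    """Write one word per line, sorted and de-duplicated."""
    with bz2.open(dump_path, "rb") as dump:
        words = sorted(set(iter_words(dump, language)))
    with open(out_path, "w", encoding="utf-8") as out:
        for word in words:
            out.write(word + "\n")
```

Calling <code>write_word_list("enwiktionary-latest-pages-articles.xml.bz2", "spanish.txt")</code> would then produce the word list. The sketch does not filter out multi-word titles or inflected forms; whether those belong in the output is left open by the task.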
 
 