WiktionaryDumps to words: Difference between revisions

useful for spell checkers
(Added C)
(useful for spell checkers)
Line 1:
{{draft task}}
 
;NOTE: Please help address the issues with this task on the discussion page. If you add another language, be aware that this task may change in the future, and that you will need to update your example.
 
;Task:
Use the [https://dumps.wikimedia.org/enwiktionary/latest/enwiktionary-latest-pages-articles.xml.bz2 wiktionary dump] (input) to create a file equivalent to [http://manpages.ubuntu.com/manpages/bionic/man5/french.5.html "/usr/share/dict/french"] (output). This dump is a large bzip2-compressed XML file of about 800MB. The "/usr/share/dict/french" file is a text file that contains one French word per line. This file is available in Ubuntu in the package '''wfrench'''.
Make a file that can be useful with [https://en.wikipedia.org/wiki/Spell_checker spell checkers] like [https://fr.wikipedia.org/wiki/Ispell Ispell] and [https://en.wikipedia.org/wiki/GNU_Aspell Aspell].
 
Use the [https://dumps.wikimedia.org/enwiktionary/latest/enwiktionary-latest-pages-articles.xml.bz2 wiktionary dump] (input) to create a file equivalent to [https://manpages.ubuntu.com/manpages/bionic/man5/spanish.5.html "/usr/share/dict/spanish"] (output). The input file is an XML dump of the Wiktionary, a large bzip2-compressed XML file of about 800MB. The output file should be similar to "/usr/share/dict/spanish", which contains one word of a given language per line in a simple text file. An example of such a file is available in Ubuntu in the package '''wspanish'''.
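
The task can be sketched in Python as follows. This is a minimal illustration, not a reference solution: it assumes that each entry in the dump is a <code>page</code> element with <code>title</code>, <code>ns</code> and <code>text</code> children, that dictionary entries live in namespace 0, and that a word belongs to a language when its wikitext contains a level-2 heading such as <code>==Spanish==</code>. The function names <code>extract_words</code> and <code>write_word_list</code> are hypothetical.

```python
import bz2
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that the export format puts on tags."""
    return tag.rsplit("}", 1)[-1]


def extract_words(dump_path, language):
    """Yield titles of namespace-0 pages that have a '==language==' heading.

    Assumption: a level-2 heading with the language name marks an entry
    for that language; this is a heuristic, not a guarantee.
    """
    heading = re.compile(
        r"^==\s*" + re.escape(language) + r"\s*==\s*$", re.MULTILINE
    )
    title, ns, text = None, None, ""
    # bz2.open decompresses on the fly, so the dump is never fully unpacked.
    with bz2.open(dump_path, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            name = local_name(elem.tag)
            if name == "title":
                title = elem.text
            elif name == "ns":
                ns = elem.text
            elif name == "text":
                text = elem.text or ""
            elif name == "page":
                if ns == "0" and title and heading.search(text):
                    yield title
                title, ns, text = None, None, ""
                elem.clear()  # drop the finished page's children


def write_word_list(dump_path, language, out_path):
    """Write one word per line, sorted and de-duplicated; return the count."""
    words = sorted(set(extract_words(dump_path, language)))
    with open(out_path, "w", encoding="utf-8") as out:
        for word in words:
            out.write(word + "\n")
    return len(words)
```

A call such as <code>write_word_list("enwiktionary-latest-pages-articles.xml.bz2", "Spanish", "spanish.txt")</code> would then produce the output file. Titles are written as they appear, so multi-word entries and affixes are included; a spell-checker word list may need further filtering.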
 
 