Talk:Word frequency: Difference between revisions
m (→task clarification: corrected a misspelling.)
m (→why entered as a task instead of draft task?: added an observation.)
Revision as of 08:35, 16 August 2017
why entered as a task instead of draft task?
Why was this entry entered as a task instead of a draft task? -- Gerard Schildberger (talk) 03:08, 16 August 2017 (UTC)
... ahhh ... I see that this task was demoted to a draft task by Paddy3118. -- Gerard Schildberger (talk) 08:34, 16 August 2017 (UTC)
task clarification
I assume we are to code programs to handle the general case, not just the file to be used as a test case.
What is a "word"?
What letters can be included in a word?
What other characters can be included in a word?
Are words that are hyphenated one word or two?
What about words like: jack-o'-lantern
What about words split across lines (such as possi-
ble), if there are any?
Are words that contain an apostrophe to be included (such as let's)?
What about words that contain non-Latin (Roman) letters?
As it happens, those non-Latin letters don't show up in the top ten.
Exactly which text on the web page is to be used (where does it start and stop)?
Should we also include the Project Gutenberg prologue and epilogue along with the book's text?
Wouldn't it be a lot simpler to have a simple (and complete) text file to download [with no (de-)assembly or editing required]?
-- Gerard Schildberger (talk) 03:08, 16 August 2017 (UTC)
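To make the questions above concrete, here is a minimal Python sketch showing how each possible answer changes what gets counted. It is not the task's specification. The three patterns, the function names, and the assumption that the Project Gutenberg file marks its body with lines beginning "*** START OF" and "*** END OF" are all illustrative choices.

```python
import re
from collections import Counter

# Hypothetical pattern choices; each answers "what is a word?" differently.
LETTERS_ONLY = r"[a-z]+"                         # "let's" becomes "let" and "s"
WITH_INNER_MARKS = r"[a-z]+(?:(?:'-|-|')[a-z]+)*"  # keeps "let's", "jack-o'-lantern"
ANY_ALPHABET = r"[^\W\d_]+"                      # any Unicode letter run


def strip_gutenberg_wrapper(text):
    """Keep only the lines between the '*** START OF' and '*** END OF' markers.

    Assumption: the file has such marker lines. If a marker is missing,
    that side of the text is left untouched.
    """
    lines = text.splitlines(keepends=True)
    start = next((i + 1 for i, line in enumerate(lines)
                  if line.startswith("*** START OF")), 0)
    end = next((i for i, line in enumerate(lines)
                if line.startswith("*** END OF")), len(lines))
    return "".join(lines[start:end])


def rejoin_split_words(text):
    """Glue a word broken by a hyphen at a line end back together.

    Caveat: a genuinely hyphenated word that happens to break at the
    end of a line loses its hyphen too.
    """
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)


def word_frequency(text, pattern=LETTERS_ONLY, top=10):
    """Return the `top` most common (word, count) pairs, ignoring case."""
    words = re.findall(pattern, rejoin_split_words(text).lower())
    return Counter(words).most_common(top)


if __name__ == "__main__":
    sample = "Let's carve a jack-o'-lantern, if that is possi-\nble. Let's!"
    print(word_frequency(sample, LETTERS_ONLY))
    print(word_frequency(sample, WITH_INNER_MARKS))
```

With the letters-only pattern the sample counts "let" and "s" separately, while the second pattern counts "let's" and "jack-o'-lantern" as single words, so the choice of definition can change the reported top ten.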