Talk:Word frequency: Difference between revisions

From Rosetta Code
m (changed font in the 1st section name.)
m (→‎task clarification: corrected a misspelling.)
Line 23: Line 23:
<br>As it happens, those non-Latin letters don't show up in the &nbsp; ''top ten''.


What '''exacty''' is the ''text'' &nbsp; (start and stop) &nbsp; that is contained in the web-page to be used?
What '''exactly''' is the ''text'' &nbsp; (start and stop) &nbsp; that is contained in the web-page to be used?


Should we also use the prologue and epilogue of the &nbsp; ''Project Gutenberg'' &nbsp; along with the book's text?

Wouldn't it be a lot simpler to have a simple (and complete) text file to download &nbsp; [with no (de-)assembly or editing required]?


-- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 03:08, 16 August 2017 (UTC)

Revision as of 08:31, 16 August 2017

why entered as a task instead of draft task?

Why was this entry entered as a   task   instead of a   draft task?   -- Gerard Schildberger (talk) 03:08, 16 August 2017 (UTC)

task clarification

I assume we are to code programs to handle the general case, not just the file to be used as a test case.

What is a "word"?

What letters can be included in a word?

What other characters can be included in a word?

Are words that are hyphenated one word or two?

What about words like:     jack-o'-lantern

What about split words across lines   (if there are possi-
ble if any)?

Are words that contain an apostrophe to be included   (such as let's)?

What about words that contain non-Latin (Roman) letters?
As it happens, those non-Latin letters don't show up in the   top ten.
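
To make the questions above concrete, here is a rough sketch of how much the counts can move depending on which definition of a "word" is picked. The three rules, the names `WORD_RULES` and `count_words`, and the sample sentence are all made up for illustration; none of them comes from the task description, and the `[a-z]` classes deliberately ignore non-Latin letters.

```python
import re
from collections import Counter

# Three hypothetical definitions of a "word"; the task prescribes none of them.
# All of them use [a-z] only, so non-Latin letters act as word separators here.
WORD_RULES = {
    "letters only": r"[a-z]+",
    "letters + apostrophe": r"[a-z]+(?:'[a-z]+)*",
    "letters + apostrophe + hyphen": r"[a-z]+(?:['-]+[a-z]+)*",
}

def count_words(text, pattern):
    """Lower-case the text and count every non-overlapping match of pattern."""
    return Counter(re.findall(pattern, text.lower()))

# Sample containing an apostrophe word, a hyphenated word, and a word split
# across a line break.
sample = "Let's carve a jack-o'-lantern; let's see if it is possi-\nble."

for name, pattern in WORD_RULES.items():
    print(name, dict(count_words(sample, pattern)))
```

Under the first rule `let's` falls apart into `let` and `s`; under the third, `jack-o'-lantern` survives as one word. None of the three rules rejoins the word that is split across the line break, so that case would need separate handling if it is supposed to count as one word.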

What exactly is the text   (start and stop)   that is contained in the web-page to be used?

Should we also use the prologue and epilogue of the   Project Gutenberg   along with the book's text?

Wouldn't it be a lot simpler to have a simple (and complete) text file to download   [with no (de-)assembly or editing required]?
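
For what it's worth, if the answer turns out to be "use only the book's text", the trimming could be sketched roughly as below. The marker patterns are an assumption about how the prologue and epilogue are delimited (a line beginning with `*** START OF` and one beginning with `*** END OF`); the actual file would have to be checked by hand, and `book_body` is a hypothetical helper name.

```python
import re

# Assumed marker shape; the real file may use different wording.
START = re.compile(r"^\*\*\* ?START OF .*$", re.MULTILINE)
END = re.compile(r"^\*\*\* ?END OF .*$", re.MULTILINE)

def book_body(text):
    """Return the text between the two markers, or all of it if either is missing."""
    start = START.search(text)
    end = END.search(text)
    if not start or not end or end.start() < start.end():
        return text  # markers not found (or out of order): leave the text alone
    return text[start.end():end.start()].strip()

# Tiny made-up stand-in for a downloaded file.
demo = (
    "License preamble\n"
    "*** START OF THIS EBOOK ***\n"
    "The actual story.\n"
    "*** END OF THIS EBOOK ***\n"
    "License epilogue\n"
)
print(book_body(demo))
```

Falling back to the whole text when the markers are missing is just one possible choice; a stricter program might prefer to stop with an error instead.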

-- Gerard Schildberger (talk) 03:08, 16 August 2017 (UTC)