Talk:Teacup rim text: Difference between revisions

→‎Limiting output: added Raku (footnote).
(No longer clear that only one permutation of each set of qualifying words should be printed. Suggested a wording to deal with this.)
:: I shall specify the wordlist and be specific about the result set. [[User:Axtens|Axtens]] ([[User talk:Axtens|talk]]) 01:10, 5 August 2019 (UTC)
 
:: Okay that's done. How do I tell the Perl6<sup>*</sup> contributor to abbreviate his output? [[User:Axtens|Axtens]] ([[User talk:Axtens|talk]]) 03:40, 5 August 2019 (UTC)
 
 
:: <sup>*</sup> (which has been subsequently changed to '''Raku'''.) &nbsp; &nbsp; &nbsp; -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 20:55, 8 July 2020 (UTC)
 
==A good task description specifies a problem rather than a procedure==
The real task/problem here is to identify and display a subset of words (in a given lexicon) that are 'circular' in the sense which you describe.
 
 
The current formulation (para 3) is the narration of a '''procedure''', rather than the statement of a problem or task, and is perhaps not yet quite consistent with the Rosetta Code goal (see the landing page) of aiding ''a person with a grounding in one approach to a problem in learning another''.
 
:: --[[User:PureFox|PureFox]] ([[User talk:PureFox|talk]]) 09:10, 6 August 2019 (UTC)
 
:"Rotations" would be clearer than "permutations" as well.--LambertDW 04:54, 30 June 2020 (UTC)
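
For readers following this thread, the "problem rather than procedure" reading can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the task's reference solution: the helper names <code>rotations</code> and <code>circular_sets</code> and the <code>min_len</code> parameter are hypothetical, and the word list is passed in rather than read from any particular dictionary file. It reports each qualifying group once, as a list of rotations, so no other ordering of the same set is printed.

```python
def rotations(word):
    """Return the distinct rotations of word, starting with the word itself."""
    # dict.fromkeys drops repeats (e.g. 'abab' has only two distinct rotations)
    # while keeping the rotation order.
    return list(dict.fromkeys(word[i:] + word[:i] for i in range(len(word))))


def circular_sets(words, min_len=3):
    """Yield one list of rotations per circular group found in words.

    A group qualifies when every rotation of a word is in the lexicon and
    the word has more than one distinct rotation (so 'aaa' is skipped).
    min_len is an assumed cut-off, not something fixed by this discussion.
    """
    lexicon = set(words)
    seen = set()
    for word in sorted(lexicon):
        if len(word) < min_len or word in seen:
            continue
        rots = rotations(word)
        if len(rots) > 1 and all(r in lexicon for r in rots):
            # Mark every member so the group is reported only once.
            seen.update(rots)
            yield rots


if __name__ == "__main__":
    sample = ["tea", "eat", "ate", "arc", "car", "aaa", "teacup"]
    for group in circular_sets(sample):
        print(group)
```

Because the lexicon is walked in sorted order, the rotation list for each group starts from its alphabetically first member, which gives a stable answer to the "which one do I print?" question.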
 
== should programming solutions be assuming caseless words? ==
Line 36 ⟶ 41:
 
:::: Phew. I thought I was the horse for a while :-)<br> --[[User:Paddy3118|Paddy3118]] ([[User talk:Paddy3118|talk]]) 08:41, 6 August 2019 (UTC)
 
::::: ∙∙∙ &nbsp; <big> ''straight from the horse's mouth''</big>, &nbsp; not to be confused with the other end of the horse. &nbsp; The phrase means &nbsp; (one version):
 
Getting information from the highest authority. &nbsp; Origin: &nbsp; when you get a tip for a horse race,
the tip is better the nearer its source is to the horse, e.g. the jockey or the trainer. &nbsp; When you get it
''straight from the horse's mouth'', you have it directly from the source.
 
::::: (Horses don't talk, but there're some exceptions: &nbsp; Mr. Ed and Horatio, for two, &nbsp; and Francis was a mule.) &nbsp; So the connection of Rosetta Code to horses is obvious. &nbsp; &nbsp; -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 18:14, 6 August 2019 (UTC)
 
:::::: Urdu language blurs the distinction a little. White (man) as in foreigner: گورا. Horse: گھوڑا. There's aspiration after the initial 'g' and the 'r' is retroflexed. Otherwise it's just 'gora'. Gotta watch those points of articulation. &nbsp; [[User:Axtens|Axtens]] ([[User talk:Axtens|talk]]) 20:37, 7 August 2019 (UTC)
 
==Dictionary swap?==
We seem to use dictionary [http://wiki.puzzlers.org/pub/wordlists/unixdict.txt unixdict.txt] [https://www.google.com/search?client=firefox-b-d&q=site%3Arosettacode.org+unixdict.txt a lot], and I just found it again as they moved it on their site. --[[User:Paddy3118|Paddy3118]] ([[User talk:Paddy3118|talk]]) 16:48, 6 August 2019 (UTC)
 
: For the sake of continuity with the history of RC, I will specify that. Good find. Thank you! [[User:Axtens|Axtens]] ([[User talk:Axtens|talk]]) 01:32, 8 August 2019 (UTC)
 
: Re ''"we seem to use ... a lot"'' (above), perhaps worth pausing to consider the goals and qualities?
:The MIT dictionary proves more rewarding for this exercise (it contains more circular groups). It is also half the size, which makes scripts quicker to iteratively test, and contributes less to atmospheric warming. Now that we are flip-flopping a bit here, I think some will understandably just future-proof themselves by using both, which is fine, but it may be that the MIT dictionary simply has better qualities as a vehicle for testing scripts. [[User:Hout|Hout]] ([[User talk:Hout|talk]]) 08:07, 8 August 2019 (UTC)
 
:: Hadn't thought of dictionary quality - good point. I just thought that if there's a dict we use a lot, then why not use it again.
:: I tend to stay clear of the scrabble type dictionaries when actually playing word games, as they tend to include really odd letter combinations as words (such as qi and qat, which I now know), though that does expand my vocabulary in odd directions. --[[User:Paddy3118|Paddy3118]] ([[User talk:Paddy3118|talk]]) 10:25, 8 August 2019 (UTC)
 
::: A good (quality) dictionary would have uppercase words, so &nbsp; '''god''' &nbsp; ''and'' &nbsp; '''God''' &nbsp; would both be present; they are different words, and including them would preclude the shortcut that most computer programs seem to have taken of assuming that all words are lowercase &nbsp; (unfortunately, this is the case for at least two dictionaries used for this Rosetta Code task). &nbsp; Also, the commonly known abbreviations such as &nbsp; '''PTA''' &nbsp; and others should be included; &nbsp; it is ''not'' &nbsp; '''pta'''. &nbsp; Not to mention the inclusion of hyphenated words, &nbsp; not the least of which would be such words as &nbsp; '''twenty-one''', &nbsp; '''21''', &nbsp; and &nbsp; '''pow-wow''' &nbsp; (as well as &nbsp; '''powwow'''). &nbsp; Also, abbreviations such as &nbsp; '''A.S.P.C.A.,''' &nbsp; '''U.S.''', &nbsp; '''US''', and &nbsp; '''us'''. &nbsp; Also, words such as &nbsp; '''C/N''' &nbsp; and some common apostrophised (apostrophized) words &nbsp; ('''it's''' &nbsp; and &nbsp; '''ain't'''). &nbsp; There also would be words that have decimal digits in them. &nbsp; This would test the ability of computer programming solutions to handle mixed case and other characters. &nbsp; And I would like a few words like &nbsp; '''antidisestablishmentarianism''' &nbsp; sprinkled around to stretch the (computer program) limits (if any) for the sizes of the words (strings). &nbsp; As for the size, I have a much larger dictionary that I use, and it doesn't hinder the testing. &nbsp; But if the size (for testing purposes) is a problem, insert a circuit-breaker in the program that facilitates testing. &nbsp; (By the way, is there any limit on soap-box time?) &nbsp; In short, a good quality dictionary would mimic a "real" dictionary, if not in size, then at least in scope. &nbsp; &nbsp; -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 18:45, 8 August 2019 (UTC)
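
The two practical points raised above (keeping '''god''' and '''God''' distinct, and a circuit-breaker for testing against a large dictionary) might look something like the following Python sketch. It is an illustration only: the function name <code>load_words</code> and its <code>fold_case</code> and <code>limit</code> parameters are hypothetical, and it takes any iterable of lines rather than assuming a particular dictionary file.

```python
from itertools import islice


def load_words(lines, fold_case=False, limit=None):
    """Return the unique words from an iterable of lines, in first-seen order.

    fold_case=False keeps 'god' and 'God' as different words;
    fold_case=True lowercases everything, the shortcut criticised above.
    limit caps how many input lines are read (a testing circuit-breaker);
    None means read them all.
    """
    words = {}
    for line in islice(lines, limit):
        word = line.strip()
        if not word:
            continue  # ignore blank lines
        if fold_case:
            word = word.lower()
        # A dict is used as an ordered set: the first occurrence wins.
        words.setdefault(word, None)
    return list(words)


if __name__ == "__main__":
    sample = ["god\n", "God\n", "PTA\n", "it's\n", "twenty-one\n"]
    print(load_words(sample))
    print(load_words(sample, fold_case=True))
    print(load_words(sample, limit=2))
```

Hyphens, apostrophes and digits pass through untouched here; whether such entries should count as words for the task is a separate decision.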
 
:::: Gerard, if you have a big dictionary I would prefer to use that rather than the unixdict one. I had only opted for that because, well, "we've (mostly) always used that one." Oh, and if you have any others in other languages, that'd be good too. But getting back to the casing issue: the original illustration used uppercase. I'm very much inclined to say "uppercase only." I may yet actually be so bold. [[User:Axtens|Axtens]] ([[User talk:Axtens|talk]]) 10:42, 9 August 2019 (UTC)
 
:::: https://github.com/dwyl/english-words looks like a serious contender. [[User:Axtens|Axtens]] ([[User talk:Axtens|talk]]) 13:48, 9 August 2019 (UTC)
:::: Hah! So I downloaded words.txt and tried to get my QB64 submission to load it. It couldn't. [[User:Axtens|Axtens]] ([[User talk:Axtens|talk]]) 13:56, 9 August 2019 (UTC)
 
::::: I've just downloaded that file and there are 466,551 words in it altogether. This produces 374 sets which match the task requirements, of which 12 consist of 4-letter words such as:
 
:::::[ADAR DARA ARAD RADA]
 
::::: However, I think there will be some other 4-letter words uttered by the other contributors if you change the task requirements again :) --[[User:PureFox|PureFox]] ([[User talk:PureFox|talk]]) 14:38, 9 August 2019 (UTC)
:::::: On top of which, 400+K words would be just too many for the purpose of script testing – a waste of time, heat and fuel. The brevity of the MIT10000 is a quality in itself. [[User:Hout|Hout]] ([[User talk:Hout|talk]]) 14:48, 9 August 2019 (UTC)
 
::::::: Which is a good point and brings me also to another frustration: I was after words that can be pronounced. The "ARC RCA CAR" triple has "RCA". Yeah, it's a word. Sort of. RCA cables and all that. But going back to the coaster and the illustration, "TEA EAT ATE" are speakable words. RCA involves saying the name of each letter. But then what do you do with LASER? It's an acronym but almost no one knows what it stands for -- it's been assimilated into regular parlance. The huge DWYL list is chock full of initialisms and acronyms and assorted unspeakable "words". And yes, it's a WOMBAT (a Waste Of Money, Brains And Time) -- at least for this task. [[User:Axtens|Axtens]] ([[User talk:Axtens|talk]]) 07:13, 12 August 2019 (UTC)