Talk:Rosetta Code/Run examples

From Rosetta Code
Revision as of 20:56, 15 February 2020 by PureFox (talk | contribs) (→‎Task status: Further thoughts.)

Output extra credit

Since it's just extra credit I don't think this should keep it from being a full task, but I'm concerned about the output checking EC. It might just end up being too problematic because of the different ways languages show certain things. The first example I thought of was lists. Here's how the same list might look in different languages:

Scheme/LISP(s):

(1 2 3 4 5)

Java:

[1, 2, 3, 4, 5]

J:

1 2 3 4 5

Prolog:

[1,2,3,4,5]

All of those are the same list, but they look pretty different. I'm also a little worried about someone taking a solution that does this output checking and using it to mark examples incorrect, but that doesn't seem very likely at all so I'm not that worried--only a little bit. Like I said, it shouldn't keep this from becoming a full task, but people should make sure they think ahead a lot if they try to do this EC. --Mwn3d 17:51, 22 November 2011 (UTC)

yes, that is very true, and the more reason why this is extra credit.--eMBee 18:02, 22 November 2011 (UTC)
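As a rough illustration of the concern above, the sketch below compares the four sample outputs after stripping brackets, commas and spacing. The function name `normalize_list_output` is hypothetical, and the approach assumes a flat list of simple tokens; nested lists or strings containing commas would be misjudged, which is part of why this extra credit is hard in general.

```python
import re

def normalize_list_output(text):
    """Reduce a printed flat list to a tuple of its items.

    Brackets, parentheses, commas and whitespace are all treated as
    separators, so only the item tokens themselves are compared.
    """
    tokens = re.split(r"[\s,()\[\]]+", text.strip())
    return tuple(t for t in tokens if t)

# The four renderings quoted in the comment above.
samples = {
    "Scheme": "(1 2 3 4 5)",
    "Java": "[1, 2, 3, 4, 5]",
    "J": "1 2 3 4 5",
    "Prolog": "[1,2,3,4,5]",
}

normalized = {lang: normalize_list_output(out) for lang, out in samples.items()}
all_equal = len(set(normalized.values())) == 1
print(all_equal)
```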

Some points

I do have some points for consideration:

  1. Layout of examples: examples have not been created to be auto-run and massaging a wide selection of examples to run could make for too long a program. You might consider something like "Assume that code inside the first <lang> tag below the languages {{header|}} tag is all that need run".
    that sounds like a good idea. in my mind i expected each <lang> tag to be a separate solution, so the downloading code would have to pick one or run all of them one by one.--eMBee 18:51, 22 November 2011 (UTC)
    That's not universally true. Sometimes answers are structured as two parts: the core of the solution, and a driver/testbed that applies the solution to a particular case. Other languages sometimes have multiple <lang> sections, one per file needed to create the solution (I've seen this quite a bit with the Ada solutions). –Donal Fellows 07:59, 16 April 2012 (UTC)
  2. Would this lead to high server load when developing and testing a solution - especially for the extra credit part of the task?
    Running every example's going to be impractical anyway. Some require extra command line arguments. Some require user interaction. Running a single example for a single language would be enough (where they are nominated at runtime; no hard-coding!) and wouldn't cause problems with excessive server load. –Donal Fellows 15:38, 23 November 2011 (UTC)
  3. You might want to just have language A download and run examples from language A. Rosetta Code normally allows a task to be fulfilled in one language without necessarily knowing another to any degree, (if at all).

--Paddy3118 18:36, 22 November 2011 (UTC)
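The rule suggested in point 1 ("the first <lang> tag below the language's header") could be sketched roughly as follows. This is only an illustration: the function name `first_lang_block` and the sample wikitext are made up, and, as noted in the replies, entries that split a solution across several <lang> sections would not be handled correctly.

```python
import re

def first_lang_block(wikitext, language):
    """Return the body of the first <lang> block under {{header|language}}, or None."""
    header = re.search(r"\{\{header\|" + re.escape(language) + r"\}\}", wikitext)
    if header is None:
        return None
    rest = wikitext[header.end():]
    # Stop at the next language header so another language's code is never picked up.
    next_header = re.search(r"\{\{header\|", rest)
    if next_header:
        rest = rest[:next_header.start()]
    block = re.search(r"<lang[^>]*>(.*?)</lang>", rest, re.DOTALL | re.IGNORECASE)
    return block.group(1).strip() if block else None

# Made-up page fragment used only to exercise the function.
sample = """=={{header|Python}}==
Some notes.
<lang python>print("hello")</lang>
<lang python>print("driver")</lang>
=={{header|Ruby}}==
<lang ruby>puts "hi"</lang>
"""

print(first_lang_block(sample, "Python"))
```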

well, the knowledge needed for some of the languages is minimal, eg: to run a Python solution, only python task.py is needed. surely anyone can implement support for that without knowing python. the main point here is to write the program in a way so that support for additional languages can be added easily because the framework is already there. and once more solutions appear, taking them from one language and porting them to another should be easy.--eMBee 18:51, 22 November 2011 (UTC)
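One way to keep such a framework easy to extend is a small table that maps each language to a file extension and a command line. The sketch below is an assumption about how that might look; the table name `RUNNERS`, the function `build_command` and the Ruby and Go entries are illustrative and not taken from any existing solution.

```python
# Language name -> (file extension, command template).
# Supporting another language means adding one entry here.
RUNNERS = {
    "Python": (".py", ["python", "{file}"]),
    "Ruby": (".rb", ["ruby", "{file}"]),
    "Go": (".go", ["go", "run", "{file}"]),
}

def build_command(language, filename_stem="task"):
    """Return (filename, argv) for running a downloaded solution."""
    ext, template = RUNNERS[language]
    filename = filename_stem + ext
    return filename, [part.format(file=filename) for part in template]

print(build_command("Python"))
```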

Overly ambitious

The task currently says: "The program should verify that the tools needed to compile or run the solution are present before running it." But this is not possible in the general case. For example, one task may need OpenGL installed, with the proper drivers for the hardware (and some platforms may have multiple sets of drivers installed, only some of which are useful with the installed hardware), while another needs the user to be running a browser, yet another may require that the user be running a tty window, another may require a certain version of a certain library, and another may require a certain virtual machine to be installed and configured properly... and these requirements will change not only from task to task but also from implementation language to implementation language. --Rdm 19:33, 5 December 2011 (UTC)

quite right. good catch. when i wrote this i was only thinking of the executable needed to run the code or the compiler. that is, before the program attempts to run any command it should check if the command is available, or catch the error in some way with a message to the user. but come to think of it, it should probably catch any error and present it to the user, so the error can be fixed.--eMBee 13:47, 8 December 2011 (UTC)
If you want to do something like this, you should require sandboxing, or at least remind users of risks. What if I snuck in stuff to the effect of "rm -rf /" right before some hapless one blindly pulls the code and runs it? It's an open wiki, and should by no means be considered trusted. --Ledrug 14:03, 8 December 2011 (UTC)
also true. this should come with sufficient warnings. and the more reason to present the source to the user and wait for confirmation before running, as the task states.--eMBee 16:08, 8 December 2011 (UTC)
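A minimal sketch of the narrower check discussed here: confirm that the executable exists, show the source, wait for confirmation, and report any error instead of crashing. The function names are hypothetical, and this provides no sandboxing at all; it only makes the risk visible to the user before anything runs.

```python
import shutil
import subprocess
import sys

def tool_available(command):
    """True if the executable named by command[0] can be found."""
    return shutil.which(command[0]) is not None

def confirm_and_run(source, command, ask=input, timeout=30):
    """Show the source, ask for confirmation, then run and report the outcome."""
    if not tool_available(command):
        return f"cannot run: '{command[0]}' was not found on PATH"
    print(source)
    answer = ask("Run this code? It comes from an open wiki and is untrusted. [y/N] ")
    if answer.strip().lower() != "y":
        return "skipped by user"
    try:
        result = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"failed to run: {exc}"
    if result.returncode != 0:
        return f"exited with status {result.returncode}: {result.stderr.strip()}"
    return result.stdout

# Demo with the answer supplied programmatically, so nothing is executed.
code = 'print("hi")'
print(confirm_and_run(code, [sys.executable, "-c", code], ask=lambda prompt: "n"))
```

Passing `ask` as a parameter keeps the confirmation step testable; an interactive program would simply use the default `input`.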

Self reference

If this task runs itself, where will it stop? André van Delft 00:30, 15 December 2011 (UTC)

- It asks for user input, so if the user does not enter anything, it stops.--Zorro1024 (talk) 20:46, 24 March 2015 (UTC)

Task status

Of course, this task should not come out of draft status until these ages-old questions are addressed, the task is modified, etc. RC entries are supposed to be comparable, not constrained to be automatically runnable. To add another question to the list: languages with read-eval-print loops might (or might not) show output from that environment with prompts, for example. --Paddy3118 (talk) 18:59, 13 February 2020 (UTC)

Perhaps target-able entries should need to be marked somehow, eg {{libheader|runnable|0.8.1}}, which currently appears as
Library: runnable version 0.8.1
and (only) then we can assume the next <lang>..</lang> is a valid stand-alone? A {{libheader|runnable|none}} could serve as an explicit do not run warning.
It might be sensible to make a local cache part of the task, so it does not re-download files less than (say) 7 days old.
I might also suggest a decent [x-plat] gui front-end and a common .ini file that all languages can share. --Pete Lomax (talk) 08:33, 14 February 2020 (UTC)
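The local cache idea could be sketched along these lines, using the file's modification time to decide whether to download again. The names `is_fresh` and `cached_fetch` are hypothetical, the 7-day limit is the figure suggested above, and `download` stands in for whatever function actually fetches the page.

```python
import os
import time

MAX_AGE_SECONDS = 7 * 24 * 60 * 60  # the "(say) 7 days" suggested above

def is_fresh(path, max_age=MAX_AGE_SECONDS, now=None):
    """True if the cached file exists and is younger than max_age seconds."""
    if not os.path.exists(path):
        return False
    now = time.time() if now is None else now
    return now - os.path.getmtime(path) < max_age

def cached_fetch(path, download, max_age=MAX_AGE_SECONDS):
    """Return the cached text if it is fresh; otherwise call download() and store the result."""
    if is_fresh(path, max_age):
        with open(path, encoding="utf-8") as f:
            return f.read()
    text = download()
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return text
```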
Hi Pete, you seem to be proposing that RC tasks be changed to suit one draft task. I think this task itself - for such reasons - isn't tenable. --Paddy3118 (talk) 13:20, 14 February 2020 (UTC)
Although this task has a number of issues, I don't think it's untenable as there are six existing solutions including a very extensive Perl 6 one from Thundergnat. Whilst I consider the EC to be impractical and the MC unrealistic, you can just ignore these as I did for the Go entry. --PureFox (talk) 15:22, 14 February 2020 (UTC)
The biggest difficulty with coming up with a general purpose, all (most) language task runner is the inconsistent markup and layout used by different entry authors. When I first wrote the Perl 6 task runner, I spent over a week going through and regularizing / standardizing markup on the various Perl 6 task entry authors and ensuring that the vast majority are complete runnable programs. It's pretty good now, but that was a real barrier to entry when I first wrote the code. The next several, not easily prioritizable difficulties for other languages: many task entries are not complete, stand-alone runnable code, they leave out standard libraries, or more commonly, call for including other task entry code... but don't actually include it; many task authors chose to split up entries into separate "task function" and "task demo" code blocks with no easy algorithmic way to tell which is which; some languages are just complicated to install / compile. None of it is insurmountable, but it requires more time and energy than I am willing to put into it. The extra credit (report on task entries that fail) is actually pretty easy, though determining why they fail may be hard. The Perl 6 entry already does report failures, though it just reports on the command line, not a compiled list. Just a matter of teeing it into a file though if I wanted to add that. The more credit (compare output to a standard) is somewhat infeasible for a Rosettacode task unless it was severely constrained to a select group of languages and tasks; and even then would be a large undertaking. Just smoking the Perl 6 entries takes several hundred / thousand lines of code, a few hours of runtime and extensive (one time) manual modification to the tasks to properly capture and compare the outputs. --Thundergnat (talk) 19:36, 15 February 2020 (UTC)
Yeah, distinguishing snippets and programs split up into bits from actual runnable programs is a significant difficulty with this task. In the case of Go I could have looked for a main() function to determine whether it might be runnable but it might still not be if it were using imports which were not currently installed on the machine. Trying to check this and, if necessary, download the necessary resources would have been a hopeless task as a lot of Go libraries use C stuff which needs to be installed first.
I therefore decided to just extract the first potentially runnable code and leave it to the user to determine whether it was actually runnable or not. If they said run it anyway, it would just fail to build or error out, hopefully without bringing down the host executable.
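For illustration, the "look for a main() function" heuristic might be sketched as below (in Python, scanning Go source as text). The function name is hypothetical, and the extra check for `package main` is an added assumption beyond what is described above. A positive result only means the block might be runnable; missing imports would still make the build fail.

```python
import re

def go_source_might_run(source):
    """Heuristic: a Go block is a candidate program if it declares package main and func main()."""
    has_package_main = re.search(r"^\s*package\s+main\b", source, re.MULTILINE) is not None
    has_main_func = re.search(r"^\s*func\s+main\s*\(\s*\)", source, re.MULTILINE) is not None
    return has_package_main and has_main_func

program = 'package main\n\nimport "fmt"\n\nfunc main() {\n\tfmt.Println("hi")\n}\n'
snippet = "package main\n\nfunc helper() int { return 1 }\n"
print(go_source_might_run(program), go_source_might_run(snippet))
```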
Another potential difficulty is running programs for languages which have had version changes which are not backwards compatible, a notable example here being Python. I wondered here about looking for 'print' statements to see whether they had brackets or not for the arguments but, of course, not all programs will use 'print' and there may be other changes as well which are less easy to detect.
I thought Pete made a good point about having some way of determining whether code was runnable or not and ideally you should include the version number and platform(s) as well. But, whilst that might be good practice for the future, it's difficult to see how one could enforce it and, even if you could, trying to deal with what's happened in the past for all languages would be impractical in any case.--PureFox (talk) 20:56, 15 February 2020 (UTC)
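The 'print' check mentioned above could look something like the following sketch. It flags only the statement form (`print x`), since `print(x)` is accepted by both Python 2 and Python 3 and so proves nothing either way. The name `looks_like_python2` is hypothetical, and the pattern can be fooled by a matching line inside a multi-line string; as noted, programs that never print are not detected at all.

```python
import re

# "print" followed by spaces or tabs and then something other than "(" or "=",
# e.g.  print "x"   print x   print >>sys.stderr, "x"
PRINT_STATEMENT = re.compile(r"^[ \t]*print[ \t]+[^(\s=]", re.MULTILINE)

def looks_like_python2(source):
    """True if the source contains a Python 2 style print statement."""
    return PRINT_STATEMENT.search(source) is not None

print(looks_like_python2("print 'hello'"), looks_like_python2("print('hello')"))
```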