Talk:Text processing/1: Difference between revisions

Line 35:
:::::Data munging is a loose term and ''does'' apply to the task.
:::::There was a very real requirement to find the longest time that things were broke, and, compared to some requests, it is quite reasonable. The newsgroup reference I added was to show that this was a real-world problem. The data already in the article should be enough to complete the task. The task may be more challenging than some, but maybe trying to solve it will impart useful (and marketable) skills, as well as allowing solutions in different languages to be contrasted. Unfortunately there is no way to see the development process used in different programming languages, because that might show whether such a task is easier done in a scripting language. Some of the questions asked above, for example, might not occur if the language chosen made it easy to just assume correctly formed data and quickly write a parser that could be quickly rewritten if your assumptions were wrong. This task is quite straightforward for a data munging task; the full file follows the syntax of the excerpt. There are no hand-editing errors, no funny escape characters, the needed results can be calculated from the data shown, ... --[[User:Paddy3118|Paddy3118]] 20:15, 11 November 2008 (UTC)
::::::Just because it's a real-world problem doesn't mean that it's a good instructional problem. People are coming here to learn, and they don't need to filter through a complex task to see how to take in pre-formatted input and do a few little calculations with it. It's fine to mention that it's real, but I don't think that makes the task any more valid for RC. I believe that the "broke" time calculation is simple; it just seems a little weird for learning purposes (once again, being real doesn't imply people will easily learn from it). I don't think the task as a whole is very difficult; I just think its complexity is overriding its purpose. As for the errors, in this particular case you may be able to assume perfect input, but in general it's good practice to be thinking "what if", and with data munging jobs in general, you may not be able to assume clean input. The people who ask about errors are just following their good programming instincts. So basically: it doesn't matter if it's real when people need to learn from it, simpler is better when creating tasks here, and it would be nice if we could just agree on what to do for some errors. --[[User:Mwn3d|Mwn3d]] 20:48, 11 November 2008 (UTC)
 
: I think we should assume that the format is as in the example input file readings.txt. That is:
:* Field separator = single Tab character
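: To make the separator assumption concrete, here is a minimal Python sketch of splitting one record on single Tab characters. The function name <code>split_fields</code> and the sample line are hypothetical, invented only for illustration; they are not taken from readings.txt, and the sketch assumes nothing about the file beyond the Tab separator.

```python
def split_fields(line):
    """Split one record into fields, assuming a single Tab separates fields."""
    # Strip only the line terminator, so that empty trailing fields survive.
    # Splitting on "\t" (rather than on any whitespace) means two consecutive
    # Tabs yield an empty field instead of being silently merged.
    return line.rstrip("\r\n").split("\t")


if __name__ == "__main__":
    # Hypothetical sample record, made up for this sketch.
    sample = "2008-11-11\t1.5\t1\n"
    print(split_fields(sample))
```

: Splitting on the exact Tab character, rather than on arbitrary whitespace, is one way to let malformed lines show up as unexpected field counts instead of being absorbed unnoticed.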