Talk:Fivenum

From Rosetta Code
Revision as of 20:06, 26 February 2018 by Eoraptor

The task needs clarification

  • What the task actually asks for is the five numbers used to draw a boxplot: minimum, lower hinge, median, upper hinge, and maximum. This has nothing to do with "big data", or with producing a "smaller array". Requiring that the five numbers yield the same boxplot if they are treated as data is pretty useless: the task emphasizes space reduction, but it will not save space.
  • There are several conventions for boxplot statistics, and not all statistical programs draw boxplots the way R does.
  • Also, five numbers are obviously not enough to reproduce a boxplot, since outliers are usually drawn as well. This is not a problem if we accept that we don't need the outliers.
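
To illustrate the point about conventions, here is a minimal Python sketch, not taken from the task page. The helper name `five_number_summary` is hypothetical. It uses Tukey's hinge definition (medians of the lower and upper halves, with the median included in both halves when the count is odd), which is assumed here to be the convention behind R's fivenum. It then compares the result with the two quartile methods offered by Python's standard `statistics.quantiles`.

```python
import statistics


def five_number_summary(values):
    """Return (min, lower hinge, median, upper hinge, max) using Tukey's hinges.

    Assumption: the hinges are the medians of the lower and upper halves of
    the sorted data, and the middle value belongs to both halves when the
    number of values is odd.
    """
    xs = sorted(values)
    if not xs:
        raise ValueError("need at least one value")
    n = len(xs)
    lower_half = xs[: (n + 1) // 2]  # includes the middle value when n is odd
    upper_half = xs[n // 2:]         # includes the middle value when n is odd
    return (
        xs[0],
        statistics.median(lower_half),
        statistics.median(xs),
        statistics.median(upper_half),
        xs[-1],
    )


data = [1, 2, 3, 4, 5, 6]  # made-up sample data

hinges = five_number_summary(data)
# Two other quartile conventions from the standard library (Python 3.8+).
inclusive = statistics.quantiles(data, n=4, method="inclusive")
exclusive = statistics.quantiles(data, n=4, method="exclusive")

print("Tukey hinges:       ", hinges)
print("inclusive quartiles:", inclusive)
print("exclusive quartiles:", exclusive)
```

On this sample the lower quartile comes out as 2 with the hinges, 2.25 with the inclusive method and 1.75 with the exclusive method, so the box of the boxplot depends on which convention the task intends.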

All in all, the task seems to be "rewrite R's fivenum in your language". That can certainly be a task, but it's rather short-sighted: there are other languages out there besides R.

Eoraptor (talk) 20:35, 25 February 2018 (UTC)

License

The R function is part of the R source, and is therefore licensed under the GPL. Any translation of it is a derivative work. For such a simple function, I doubt it would be a problem, but please be careful next time: copy-pasting is not fine. Eoraptor (talk) 23:29, 25 February 2018 (UTC)


"This has nothing to do with 'big data', or with producing a 'smaller array'. Requiring that the five numbers yield the same boxplot if they are treated as data is pretty useless: the task emphasizes space reduction, but it will not save space."

Everything in this quote is objectively false. I had a task in my job to make boxplots from huge datasets (> 120 GB of data), but all I needed were these five data points. It made no sense whatsoever to save every data point. I was doing this in Perl, since I don't like using R unless I have to. That was the purpose of the page. This task was very useful for me, though maybe not for you. "Pretty useless"? On the contrary, it was essential: I wouldn't have been able to make these plots without this task, specifically the Perl translation. In the spirit of generosity, I decided to make my work in translating R's fivenum function available to others in case they had the same problem I did. --Hailholyghost (talk) 14:27, 26 February 2018 (UTC)

You missed the point, again. That YOU had to do this in your job does not mean that computing five numbers is related to big data. Maybe you also needed a mean; that does not imply that computing a mean is related to big data: you can take the mean of 10 values. Your case, your job, your own particular situation do not make a general task. The general task would say: compute these numbers, period. You can do it with small data, with big data, with gigantic data, no one cares. Besides, the task you asked for is not exactly the same: you asked for data that would produce the same numbers. And THAT is useless, as it's obviously enough to store the numbers. But maybe the sentence was not clear enough? Oh, and you did not describe clearly how these five numbers are to be computed: as I already said, there is no universal convention on boxplots. All in all, you didn't address any of my questions. So much for your generosity. Eoraptor (talk) 20:00, 26 February 2018 (UTC)
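
For readers following the "store the numbers" argument, a short Python sketch may help. It is not from the task page, the helper name `five_number_summary` is hypothetical, and it assumes Tukey's hinge convention (medians of the lower and upper halves, sharing the middle value when the count is odd). Under that assumption, feeding the five numbers back in as data returns the same five numbers, so a "smaller array" with the same summary is just the summary itself.

```python
import statistics


def five_number_summary(values):
    """Hypothetical helper: (min, lower hinge, median, upper hinge, max).

    Assumes Tukey's hinges: medians of the lower and upper halves of the
    sorted data, with the middle value in both halves when n is odd.
    """
    xs = sorted(values)
    if not xs:
        raise ValueError("need at least one value")
    n = len(xs)
    return (
        xs[0],
        statistics.median(xs[: (n + 1) // 2]),
        statistics.median(xs),
        statistics.median(xs[n // 2:]),
        xs[-1],
    )


data = [15, 6, 42, 41, 7, 36, 49, 40, 39, 47, 43]  # made-up sample data

summary = five_number_summary(data)
resummary = five_number_summary(summary)  # treat the five numbers as data

print(summary)
print(resummary == summary)
```

This only holds for the hinge convention assumed here; other quartile conventions need not reproduce themselves in this way.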