Talk:Fivenum: Difference between revisions

I would be happy with both possibilities, but these are entirely different tasks, and if we have to manage large data, please state how large, and adapt the current solutions accordingly. All current solutions imply the dataset lies entirely in memory. For "usual" machines, that means the dataset is actually rather small.
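To make "entirely in memory" concrete, here is a minimal Python sketch of the kind of approach I mean. It is not taken from any particular solution on the task page: the function name `fivenum` and the hinge positions follow the convention of R's `fivenum`, which I am assuming is the intended definition. The point is only that the whole dataset is loaded and sorted at once, so it has to fit in RAM.

```python
import math

def fivenum(data):
    """Tukey's five-number summary (min, lower hinge, median, upper hinge, max).

    Assumption: the whole dataset fits in memory, since it is copied and
    sorted in full before any value is read off.
    """
    x = sorted(data)              # full in-memory sort: O(n) extra space
    n = len(x)
    if n == 0:
        raise ValueError("fivenum needs at least one value")
    # Depth of the hinges, using the same rule as R's fivenum (1-based).
    n4 = math.floor((n + 3) / 2) / 2
    positions = [1, n4, (n + 1) / 2, n + 1 - n4, n]
    # A fractional position means "average the two neighbouring values".
    return [0.5 * (x[math.floor(p) - 1] + x[math.ceil(p) - 1]) for p in positions]

print(fivenum([15, 6, 42, 41, 7, 36, 49, 40, 39, 47, 43]))
```

Anything that does not fit in memory would need a different algorithm (external sorting or approximate quantiles), which is why I consider it a different task.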
 
Hailholyghost gave his example above; here is another one. Most of my work is done on a business PC with 8 GB RAM and SAS/Stata/R (and I suspect most professional statisticians work daily on that kind of machine, with that kind of software). Some of my work is done on a SAS VA server with 256 GB RAM. Still another part of my job is done on SAS EG in a Citrix environment connecting to large Oracle servers (health data for the entire French population). Different machines, different tasks, obviously. Work on the server is usually meant to build summary data that will then be transferred to a PC for modelling or further computations. That could mean computing boxplot data on the server to be used on a PC. As an aside, I would not compute this to save space, but because it's statistically useful for some task, and there is an extra reason to download summary data apart from network bandwidth: it's simply forbidden to extract raw data from these servers, to protect privacy (although even on the server the data is anonymized).
 
While both tasks described above are acceptable, I personally would be reluctant to ask for something on Rosetta Code that requires more than a PC, as it's by far the most widespread kind of machine, especially among students, who will likely benefit most from such a site.