Talk:Fivenum: Difference between revisions



Revision as of 08:05, 15 February 2019

1,291 bytes added, 5 years ago

→‎R vs Wikipedia: Ref.


rosettacode>Paddy3118

Compared revisions: 22:13, 27 February 2018 by Eoraptor (minor edit, →‎Large vs not large) and the latest revision, 08:05, 15 February 2019 by rosettacode>Paddy3118 (→‎R vs Wikipedia: Ref.)
(7 intermediate revisions by 3 users not shown)
I would be happy with both possibilities, but these are entirely different tasks, and if we have to manage large data, please state how large, and adapt the current solutions accordingly. All current solutions assume the dataset lies entirely in memory. For "usual" machines, that means the dataset is actually rather small. Hailholyghost gave his example above; here is another one: most of my work is done on a business PC with 8 GB RAM and SAS/Stata/R/Python (and I suspect most professional statisticians work on a daily basis on that kind of machine, with that kind of software). Some of my work is done on a SAS VA server with 256 GB RAM. Still another part of my job is done on SAS EG in a Citrix environment connecting to large Oracle servers (health data for the entire French population). Different machines, different tasks, obviously. Work on the server is usually done to build summary data that is then transferred to a PC for modelling or further computations. That could mean computing boxplot data on the server to be used on a PC. As an aside, I would not compute this to save space, but because it's statistically useful for some task. Also, there is an extra reason to download summary data, apart from network bandwidth limitations: it's simply forbidden to extract raw data from these servers, to protect privacy (although even on the server the datasets are anonymized). While both tasks described above are acceptable, I personally would be reluctant to ask for something on Rosetta Code that requires more than a PC, as it's by far the most widespread kind of machine, especially among students, who will likely benefit most from such a site.
But we could also have a category of "heavy" tasks, requiring specialized hardware and/or software, if there are enough users willing to contribute to such tasks (I won't, as my access to such machines does not allow any form of "entertainment", though to be honest I did once run the Fortran 77 N-Queens program on an IBM z9). [[User:Eoraptor|Eoraptor]] ([[User talk:Eoraptor|talk]]) 20:12, 27 February 2018 (UTC)

==Phix test result glitch ?==

I noticed, in passing, that the first of the three test results in the Phix example shows the value 43 where other code (and a quick test just now with the built-in R function) returns 42.5. Perhaps some kind of edge case that might be worth checking? Or just a variant interpretation? [[User:Hout|Hout]] ([[User talk:Hout|talk]]) 08:53, 13 February 2019 (UTC)

==R vs Wikipedia==

[[wp:Fivenum]] uses quartiles defined as members of the set of input values. The R definition differs. The task could do with a definitive definition, or an explanation of the variants, because, as others have stated, "do what R does" highlights issues. --[[User:Paddy3118|Paddy3118]] ([[User talk:Paddy3118|talk]]) 13:56, 13 February 2019 (UTC)

:[[wp:Percentile#Definitions]] shows common methods of computing percentiles. --[[User:Paddy3118|Paddy3118]] ([[User talk:Paddy3118|talk]]) 08:04, 15 February 2019 (UTC)
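
To make the difference between the two definitions concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the task's reference solution: <code>fivenum</code> follows the hinge rule that R's built-in function is understood to use (averaging the two neighbouring order statistics when the hinge position is fractional), while <code>nearest_rank_summary</code> is a hypothetical helper using the nearest-rank percentile method, one of the methods listed at [[wp:Percentile#Definitions]], which always returns members of the input set. The 11-value sample is chosen for illustration only; it is not confirmed by this page to be the data behind the Phix result discussed above.

```python
import math


def fivenum(values):
    """Five-number summary using Tukey hinges (the rule R's fivenum is understood to use)."""
    x = sorted(values)
    n = len(x)
    if n == 0:
        raise ValueError("fivenum needs at least one value")
    # 1-based (possibly fractional) positions of min, lower hinge, median, upper hinge, max.
    n4 = math.floor((n + 3) / 2) / 2
    positions = [1, n4, (n + 1) / 2, n + 1 - n4, n]
    # A fractional position k + 0.5 averages the k-th and (k+1)-th order statistics.
    return [0.5 * (x[math.floor(p) - 1] + x[math.ceil(p) - 1]) for p in positions]


def nearest_rank_summary(values):
    """Hypothetical helper: five-number summary whose quartiles are always input members."""
    x = sorted(values)
    n = len(x)
    if n == 0:
        raise ValueError("nearest_rank_summary needs at least one value")

    def percentile(p):
        # Nearest-rank method: rank = ceil(p/100 * n), done in integer arithmetic.
        rank = max(1, (p * n + 99) // 100)
        return x[rank - 1]

    return [x[0], percentile(25), percentile(50), percentile(75), x[-1]]


# Illustrative sample (an assumption, see the note above).
sample = [15, 6, 42, 41, 7, 36, 49, 40, 39, 47, 43]

print(fivenum(sample))               # [6.0, 25.5, 40.0, 42.5, 49.0]
print(nearest_rank_summary(sample))  # [6, 15, 40, 43, 49]
```

On this sample the hinge rule gives an upper quartile of 42.5, whereas the member-of-set rule gives 43. That shows how a 42.5 versus 43 discrepancy can arise purely from the choice of definition; whether this explains the Phix output would need checking against the Phix code itself.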