User talk:Hailholyghost: Difference between revisions

Content added Content deleted

Inline

Revision as of 13:56, 8 December 2017

Welch's t-test

Hello, you wrote: "Welch's t-test is only part of the calculation, it isn't the purpose of the page."

What's the purpose of the page, then?

By the way, "Given two lists of data, calculate the p-value used for Calculation of P-value." is totally meaningless. Which p-value? There are zillions of p-value, associated with zillions of statistical tests. Basically all you have to know is the probability distribution of some statistic, but there are infinitely many of them: you are not going to ask for every possible distribution, so you have to choose one. Here, all the task in its current state is about Welch's t-test and how to compute the corresponding p-value, and yet you pretend it's not the purpose of the page. Puzzling.

Eoraptor (talk) 11:37, 8 December 2017 (UTC)

Welch's t-test is easy, the reason I created this page is because I had no idea how to calculate the integration, which is the point of the page. This page is meant to calculate this the same as R's "t.test(x,y,paired=FALSE)" If this page were about the t-test, as you say it is, the C code would have 1 line, and is completely trivial. The point of the page is to show how to do the non-trivial things: integration. I spent weeks figuring out how to do this, and your title change obfuscates my hard work.--Hailholyghost (talk) 12:51, 8 December 2017 (UTC)

Then this is a very badly designed task. There are already tasks about integration. Anyway, nobody would compute the p-value this way: there are good implementations of special functions, here the incomplete beta function, and usually they make use of specific properties of the functions, like series, continued fraction, optimal rational (Pade like) or polynomial (Chebyshev equioscillation) approximations, etc. If you want the p-value, the "standard" way is to find a special function library. Integration should be a last resort for Rosetta Code, for languages that do not have readily available special functions libraries. Common languages such as C, Fortran or Python all have something (I implemented the task in Python using numpy/scipy, and there is something in Fortran using IMSL and SLATEC, both well-known libraries), statistical packages even have builtin function for the whole test (see Stata and R here).

If you want to see how to use a generalist integration routine to compute the p-value, you are doing it wrong. If you want a general task on integration, you are doing it wrong too, and there is already a task about this.

You write the C code is trivial: really? How? You will need at least the incomplete beta function from GSL (or IMSL, NAG or any library with it, even a Fortran-based library), and this will require more than one line (you may do more or less what is done here in Fortran or Python). Even getting a working GSL on Windows is not trivial at all. And what you call the non-trivial thing is really not about integration at all.

Eoraptor (talk) 13:13, 8 December 2017 (UTC)

For instance, the R implementation uses a port of TOMS 708 from the ACM. There are reasons not to reimplement this from scratch: it requires much research work to prove a given algorithm is correct, to find necessary coefficients with enough precision, and to get a correct and efficient implementation. Here you are not even trying to investigate the convergence of the integral, and it's an inefficient way to find an accurate answer. Eoraptor (talk) 13:30, 8 December 2017 (UTC)

@@ Line 11: / Line 11: @@
 ::Then this is a very badly designed task. There are already tasks about integration. Anyway, nobody would compute the p-value this way: there are good implementations of special functions, here the incomplete beta function, and usually they make use of specific properties of the functions, like series, continued fraction, optimal rational (Pade like) or polynomial (Chebyshev equioscillation) approximations, etc. If you want the p-value, the "standard" way is to find a special function library. Integration should be a last resort for Rosetta Code, for languages that do not have readily available special functions libraries. Common languages such as C, Fortran or Python all have something (I implemented the task in Python using numpy/scipy, and there is something in Fortran using IMSL and SLATEC, both well-known libraries), statistical packages even have builtin function for the whole test (see Stata and R here).
 ::If you want to see how to use a generalist integration routine to compute the p-value, you are doing it wrong. If you want a general task on integration, you are doing it wrong too, and there is already a task about this.
-::You write the C code is trivial: really? How? You will need at least the incomplete beta function from [https://www.gnu.org/software/gsl/manual/html_node/Incomplete-Beta-Function.html GSL] (or IMSL, NAG or any library with it, even a Fortran-based library), and this will require more than one line (you may do more or less what is done here in Fortran or Python). Even getting a working GSL on Windows is not trivial at all.
+::You write the C code is trivial: really? How? You will need at least the incomplete beta function from [https://www.gnu.org/software/gsl/manual/html_node/Incomplete-Beta-Function.html GSL] (or IMSL, NAG or any library with it, even a Fortran-based library), and this will require more than one line (you may do more or less what is done here in Fortran or Python). Even getting a working GSL on Windows is not trivial at all. And what you call the non-trivial thing is really not about integration at all.
 ::[[User:Eoraptor|Eoraptor]] ([[User talk:Eoraptor|talk]]) 13:13, 8 December 2017 (UTC)
 :::For instance, the R implementation uses a port of [https://dl.acm.org/citation.cfm?id=131776 TOMS 708] from the ACM. There are reasons not to reimplement this from scratch: it requires much research work to prove a given algorithm is correct, to find necessary coefficients with enough precision, and to get a correct and efficient implementation. Here you are not even trying to investigate the convergence of the integral, and it's an inefficient way to find an accurate answer. [[User:Eoraptor|Eoraptor]] ([[User talk:Eoraptor|talk]]) 13:30, 8 December 2017 (UTC)