Talk:Permutation test
 
Unfortunately there is still an issue of roundoff error, as [[User:Rdm|Rdm]] also notes. Scaling the experimental data up by a factor of 100 seems to be an adequate workaround in the Ursala solution (and possibly others using standard IEEE double precision math). Can any numerical analysis experts reading this please suggest a better one? I've tried using sums as a proxy for means, and also calculating means and sums
the careful way by first sorting the numbers in order of absolute value, which didn't help.
To help head off any further arguments about the results, here are my statistics, which agree with those of [[User:Rdm|Rdm]].
 
 
--[[User:Sluggo|Sluggo]] 02:06, 4 February 2011 (UTC)
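
To make the failure mode concrete, here is a small interactive demonstration. The scores are arbitrary two-decimal values chosen only for illustration, not taken from the task data: sums that are mathematically equal can still compare unequal in doubles, so relabelings whose mean difference should exactly equal the observed one get pushed into the "greater" or "lesser" bucket instead.

<lang python>>>> a = [0.1, 0.2, 0.3]   # illustrative two-decimal scores
>>> b = [0.3, 0.2, 0.1]   # the same multiset, accumulated in a different order
>>> sum(a) == sum(b)      # mathematically a tie ...
False
>>> sum(a), sum(b)        # ... but the rounding errors differ
(0.6000000000000001, 0.6)</lang>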
 
===Using rational arithmetic===
I wondered about the accuracy of the results after it was queried above, so I re-ran the Python solution with rational numbers as input to see what effect this would have.
 
I should explain that the first result is produced by the Python program from the task page, which works in double-precision floating point, run in the IDLE Python shell. I then show the calculation of the result using rationals (the fractions module).
<lang python>>>> ================================ RESTART ================================
>>>
under=86.90%, over=13.10%
>>> from fractions import Fraction as F
>>> t = [F(tr) for tr in treatmentGroup]
>>> c = [F(cn) for cn in controlGroup]
>>> u = permutationTest(t, c)
>>> print("under=%.2f%%, over=%.2f%%" % (u, 100. - u))
under=86.89%, over=13.11%</lang>
 
--[[User:Paddy3118|Paddy3118]] 03:22, 4 February 2011 (UTC)
 
:Both sets of results above appear to be incorrect. Do you believe the latter is correct, and if so, can you support it by listing statistics comparable to those above? I maintain that if you investigate it, you'll find that many of the large group of alternative means that should be exactly equal to the empirical mean are off by a little and therefore are counted among those that are greater or lesser. --[[User:Sluggo|Sluggo]] 23:43, 4 February 2011 (UTC)
 
:Using rational arithmetic, I get:
 
:<lang>under: 87.1972%
over: 12.8028%</lang> when I subtract mean control effects from mean treatment effects (experimental result is 1379 divided by 90), and I get
:<lang>under: 13.1417%
over: 86.8583%</lang> when I subtract mean treatment effects from mean control effects (experimental result is -1379 divided by 90).
 
:Have I made a logical error? --[[User:Rdm|Rdm]] 15:33, 4 February 2011 (UTC)
 
:: No. The problem is that it seems to be an ill-conditioned problem. IEEE arithmetic is getting in the way and causing all sorts of trouble. That's really nasty. The only good way to deal with this is to change the task so it doesn't have the problem, since the magic needed to fix it is evil and problem-specific (multiplying through by 100 pushes the figures into the stable range). Ugh. –[[User:Dkf|Donal Fellows]] 15:53, 4 February 2011 (UTC)
 
:::Actually, I was asking about how the large and small value reversed (between the under and over categories) in those two cases. --[[User:Rdm|Rdm]] 17:14, 4 February 2011 (UTC)
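
As for why the factor-of-100 fix works: a two-decimal score generally has no exact double representation, while a small integer (and any sum of them) is stored exactly, so equal sums really do compare equal once everything is scaled. A quick check, using an arbitrary two-decimal value rather than one of the task scores; the scaling is safest done on the input data itself rather than by multiplying the floats afterwards:

<lang python>>>> from decimal import Decimal
>>> Decimal(9.68) == Decimal('9.68')   # the double stored for 9.68 is not exactly 9.68
False
>>> Decimal(968.0) == Decimal('968')   # a small integer is stored exactly
True</lang>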
 
===Using integer arithmetic===
Rationals are a bit of overkill when the denominator is always the same. I just posted a version using all integer arithmetic. The problem with floats is those 313 cases where the differences are equal: anything that throws the difference off by the tiniest bit can cause them to be mis-categorized. I don't know enough statistics to know how this is typically handled. For this task, it might be enough to change the task description to provide the experimental results as integer scores from 0 to 100. While we're at it, we could also specify that the difference is treatment minus control, which would eliminate the two mirror-image solutions.&mdash;[[User:Sonia|Sonia]] 20:12, 4 February 2011 (UTC)
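
For reference, here is a minimal Python sketch of the all-integer idea (not the version just posted; the function and variable names are my own). Because the group sizes never change, nc*sum(t) - nt*sum(c) is an exact integer with the same ordering as mean(t) - mean(c), so tied relabelings are detected exactly; which bucket the ties belong in is still the convention being debated above.

<lang python>from itertools import combinations

def permutation_counts(treatment, control):
    """Count relabelings of the pooled scores whose difference in means is
    less than or equal to (under) or greater than (over) the observed one."""
    nt, nc = len(treatment), len(control)
    pooled = treatment + control
    total = sum(pooled)
    # nc*sum(t) - nt*sum(c) == nt*nc*(mean(t) - mean(c)), an exact integer
    observed = nc * sum(treatment) - nt * (total - sum(treatment))
    under = over = 0
    for relabeled_treatment in combinations(pooled, nt):
        s = sum(relabeled_treatment)
        stat = nc * s - nt * (total - s)
        if stat > observed:
            over += 1
        else:
            under += 1          # exact ties are grouped with "under" here
    return under, over</lang>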
 
:I'm in favor of amending the task as you recommend, and I'll do it if no one objects in the next few days or does it first. --[[User:Sluggo|Sluggo]] 23:46, 4 February 2011 (UTC)
 
== Name of task ==
 
: "Permutation test" is the conventional name for this test according to the Wikipedia, even though it may be a misnomer. --[[User:Sluggo|Sluggo]] 02:24, 4 February 2011 (UTC)
 
== Change in specification ==
 
In keeping with the developing consensus above, I have changed the task specification to use integer test data and specified that the difference in means is to be calculated by subtracting the control group mean from the treatment group mean. The correct results should be <math>12.80\%</math> and <math>87.20\%</math>. --[[User:Sluggo|Sluggo]] 21:05, 10 February 2011 (UTC)