Talk:Constrained random points on a circle: Difference between revisions
(→Uniform distribution: some math to demonstrate the bias) |
|||
Line 103: | Line 103: | ||
</pre> |
</pre> |
||
Here you can see that the number of points near top/bottom is greater than that for left/right sides of the plot. For this particular case, the simplest way to check for the bias is to count the number of points where abs(y) > abs(x) (this effectively partitions the plot using 45 degree lines) -- for my "bad" code I see a ratio (over multiple runs) of |
Here you can see that the number of points near top/bottom is greater than that for left/right sides of the plot. For this particular case, the simplest way to check for the bias is to count the number of points where abs(y) > abs(x) (this effectively partitions the plot using 45 degree lines) -- for my "bad" code I see a ratio (over multiple runs) of 54:46 in favor of the top/bottom quadrants over left/right. Counted another way, 79% of random seeds result in a plot with more top/bottom points than left/right. |
Revision as of 14:18, 3 September 2010
Not 100 points
There are only 89 points in the circle shown in the verilog example output. This is no surprise, because AFAICS the algorithm doesn't make sure that the same point isn't chosen twice. Now given that it's the first example, I guess it's what was meant by the task description, but then the task description probably should be changed to reflect the fact that less points are OK. --Ce 10:55, 3 September 2010 (UTC)
I've fixed the description: it would be a poor RNG that doesn't produce duplicates --Dave
How to check the code
If you increase the number of points produced to 10k, you should get output rather like this (generated with Tcl version; your version may differ). This lets you check that the spread of points produces the expected annulus. –Donal Fellows 11:00, 3 September 2010 (UTC)
X XXXXXXXXXXX XXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXX XXXXXXXX XXXXXXXX XXXXXXX XXXXXXX XXXXXX XXXXXX XXXXXX XXXXXX XXXXXX XXXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXXX XXXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXXX XXXXXX XXXXXX XXXXXX XXXXXX XXXXXX XXXXXXX XXXXXXX XXXXXXXX XXXXXXXX XXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXX XXXXXXXXXXX X
Uniform distribution
I'm not very good with stats, but I've seen discussions pop up before about different distributions of points. Is uniform vs normal vs (something?) a significant component of the task? How may it be verified with at most 100 points? --Michael Mol 12:27, 3 September 2010 (UTC)
I guess that if you count the number of points that lie on a line that passes through the center, then the count should not depend on the angle of that line. So it would be wrong to simply pick a uniform-random value of x and then a uniform-random of y at that location, because that would tend to lead to a higher density of points on the left and right sides (look at the maximally dense version above: there are 12 possible values of y at x==0; but only one at x == +/- 15) -- and thus a greater number of points away from the x-axis. A common mistake when generating a set of linked random variables is that the distribution depends on the order in which they are generated.
So, take this Perl code:
my @bitmap = map { " " x 32 } 0 .. 33; for (1 .. 100) { my $x = int rand(31) - 15; my $max = sqrt( 225-($x*$x) ); my $min = 100-($x*$x); $min = $min > 0 ? sqrt $min : 0; my $y = int rand( 1+$max-$min ) + $min; $y = -$y if rand() < .5; $x += 16; $y += 16; #print "$x $y\n"; substr( $bitmap[$y], $x, 1, "#" ); } print "$_\n" for @bitmap; # # # ## # # ## ## ## # # ## # # # # # # # # # # # # ## # # # # # # # ## # # # # # ## # # # # ## # # # # # # # # # ## # # # # # # # # # ## # # # # ## # # # #
Here you can see that the number of points near top/bottom is greater than that for left/right sides of the plot. For this particular case, the simplest way to check for the bias is to count the number of points where abs(y) > abs(x) (this effectively partitions the plot using 45 degree lines) -- for my "bad" code I see a ratio (over multiple runs) of 54:46 in favor of the top/bottom quadrants over left/right. Counted another way, 79% of random seeds result in a plot with more top/bottom points than left/right.