Statistics/Basic: Difference between revisions
(→{{header|jq}}: move to proper place (first step))
<pre>prompt$ jsish -u statisticsBasic.jsi

[PASS] statisticsBasic.jsi</pre>

=={{header|jq}}==
{{works with|jq}}
'''Works with gojq, the Go implementation of jq'''

The following jq program uses a streaming approach, so that only one PRN (pseudo-random number) need be in memory at a time. For PRNs in [0,1], as here, the program is thus essentially limited only by the CPU time available. The example section below includes a run with N = 10 million.

Since jq does not currently have a built-in PRNG, we will use an external source of entropy; there are, however, RC entries giving PRN generators written in jq that could be used, e.g. https://rosettacode.org/wiki/Subtractive_generator#jq

For the sake of illustration, we will use /dev/urandom encapsulated in a shell function:
<lang sh># Usage: prng N width
# Print N lines, each consisting of `width` random decimal digits.
function prng {
  LC_ALL=C tr -cd '0-9' < /dev/urandom | fold -w "$2" | head -n "$1"
}</lang>

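As a quick illustration of what the pipeline above produces, the following sketch draws three 10-digit strings and prefixes each with "0." to obtain a decimal in [0,1), which mirrors the <code>"0." + inputs | tonumber</code> conversion performed by the jq program. The function is restated in POSIX form (without the <code>function</code> keyword) so that the snippet is self-contained, and the helper name <code>to_unit_interval</code> is an assumption introduced purely for this example.

```shell
# Restatement of prng in POSIX sh syntax: N lines of `width` random digits.
prng() {
  LC_ALL=C tr -cd '0-9' < /dev/urandom | fold -w "$2" | head -n "$1"
}

# Hypothetical helper: prefix each digit string with "0." so that it
# reads as a decimal fraction in [0,1).
to_unit_interval() {
  sed 's/^/0./'
}

# Three pseudo-random decimals with 10 digits after the point:
prng 3 10 | to_unit_interval
```

Since every value is of the form 0.dddddddddd, the largest possible value is strictly less than 1.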
'''basicStats.jq'''
<lang jq># $histogram should be a JSON object, with buckets as keys and frequencies as values.
# $keys should be an array of all the potential bucket names (possibly integers)
# in the order to be used for display:
def pp($histogram; $keys):
  ([$histogram[]] | add) as $n     # for scaling
  | ($keys|length) as $length
  | $keys[]
  | "\(.) : \("*" * (($histogram[tostring] // 0) * 20 * $length / $n) // "" )" ;

# `basic_stats` computes the unadjusted (population) standard deviation
# and assumes the sum of squares (ss) can be computed without concern for overflow.
# The histogram is based on allocation to a bucket, which is made
# using `bucketize`, e.g. `.*10|floor`
def basic_stats(stream; bucketize):
  # Keep running totals only, so that the stream is never held in memory:
  reduce stream as $x ({histogram: {}};
      .count += 1
      | .sum += $x
      | .ss += $x * $x
      | .mean = (.sum / .count)
      # the standard deviation is the square root of the variance:
      | .stddev = (((.ss/.count) - .mean*.mean) | sqrt)
      | ($x | bucketize | tostring) as $bucket
      | .histogram[$bucket] += 1) ;

basic_stats( "0." + inputs | tonumber; .*10|floor)
| "
Basic statistics for \(.count) PRNs in [0,1]:
mean: \(.mean)
stddev: \(.stddev)
Histogram dividing [0,1] into 10 equal intervals:",
  pp(.histogram; [range(0;10)] )</lang>

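The reduction above relies on the identity that the population variance equals the mean of the squares minus the square of the mean, so three running totals (count, sum, and sum of squares) suffice. As an independent cross-check, here is a minimal sketch of the same single-pass computation in awk, wrapped in a shell function. The name <code>one_pass_stats</code> and the output layout are assumptions made for this illustration; they are not part of the jq program.

```shell
# Hypothetical helper: read one number per line on stdin and print
# "count mean variance stddev", keeping only three running totals.
one_pass_stats() {
  awk '
    { n += 1; sum += $1; ss += $1 * $1 }
    END {
      if (n == 0) exit                # nothing to report for empty input
      mean = sum / n
      var = ss / n - mean * mean      # population (unadjusted) variance
      if (var < 0) var = 0            # guard against rounding error
      printf "%d %.5f %.5f %.5f\n", n, mean, var, sqrt(var)
    }'
}

printf '%s\n' 0.25 0.5 0.75 0.5 | one_pass_stats
```

For the four inputs shown, the mean is 0.5 and the variance is 0.03125, so the sketch prints <code>4 0.50000 0.03125 0.17678</code>.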
'''Driver Script''' (e.g. bash)
<lang sh>for n in 100 1000 1000000 10000000 ; do
  prng "$n" 10 | jq -nrR -f basicStats.jq
  echo
done</lang>

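{{out}}
The values labelled <code>stddev</code> in the listing below are in fact variances: they are close to 1/12 ≈ 0.0833, the variance of the uniform distribution on [0,1], for which the corresponding standard deviation is √(1/12) ≈ 0.2887.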
<pre>
Basic statistics for 100 PRNs in [0,1]:
mean: 0.49030669246300007
stddev: 0.08727384816799114
Histogram dividing [0,1] into 10 equal intervals:
0 : **************************
1 : **************
2 : ******************************
3 : ******************
4 : **********
5 : **************
6 : **********************
7 : ****************************
8 : **************************
9 : ************

Basic statistics for 1000 PRNs in [0,1]:
mean: 0.47791345211799985
stddev: 0.08221887691336871
Histogram dividing [0,1] into 10 equal intervals:
0 : ********************
1 : **********************
2 : ************************
3 : **********************
4 : ********************
5 : ********************
6 : ****************
7 : *****************
8 : ********************
9 : ******************

Basic statistics for 1000000 PRNs in [0,1]:
mean: 0.5003777569065345
stddev: 0.08319450650142385
Histogram dividing [0,1] into 10 equal intervals:
0 : *******************
1 : ********************
2 : *******************
3 : *******************
4 : ********************
5 : ********************
6 : *******************
7 : ********************
8 : *******************
9 : ********************

Basic statistics for 10000000 PRNs in [0,1]:
mean: 0.500018892075542
stddev: 0.08335213806183339
Histogram dividing [0,1] into 10 equal intervals:
0 : ********************
1 : *******************
2 : ********************
3 : ********************
4 : ********************
5 : *******************
6 : *******************
7 : *******************
8 : *******************
9 : ********************
</pre>

=={{header|Julia}}==