Stem-and-leaf plot: Difference between revisions
m (→{{header|Perl}} generating {{header|LaTeX}}: Position banner at head of section.) |
(→{{header|Python}}: Don't skip intervals!) |
Revision as of 16:29, 14 December 2009
You are encouraged to solve this task according to the task description, using any language you may know.
Create a well-formatted stem-and-leaf plot from the following data set, where the leaves are the last digits:
110 436 124 109 440 330 53 352 315 452 54 49 334 102 432 123 442 125 97 104 11 446 123 360 324 427 451 329 139 42 324 320 450 100 87 414 305 21 375 324 360 123 33 378 37 66 41 321 68 356 407 448 5 128 81 361 419 134 147 146
The primary intent of this task is the presentation of information. It is acceptable to hardcode the data set or characteristics of it (such as what the stems are) in the example, insofar as it is impractical to make the example generic to any data set. For example, in a computation-less language like HTML the data set may be entirely prearranged within the example; the interesting characteristics are how the proper visual formatting is arranged.
If possible, the output should not be a bitmap image. Monospaced plain text is acceptable, but do better if you can. It may be a window, i.e. not a file.
Perl generating LaTeX
<lang perl>#!/usr/bin/perl -w

my @data = sort {$a <=> $b} qw(110 436 124 109 440 330 53 352 315 452 54 49 334 102 432 123 442 125 97 104 11 446 123 360 324 427 451 329 139 42 324 320 450 100 87 414 305 21 375 324 360 123 33 378 37 66 41 321 68 356 407 448 5 128 81 361 419 134 147 146);

# FIXME: This should count the maximum number of leaves in any one stem;
# instead it takes the total number of data items, which is usually
# a massive overestimate.
my $columns = @data;

print <<"EOT";
\\documentclass{report}
\\usepackage{fullpage}
\\begin{document}
  \\begin{tabular}{ r | *{$columns}{c} }
EOT

my $laststem = undef;

for my $value (@data) {
  my $stem = int($value / 10);
  my $leaf = $value % 10;
  while (not defined $laststem or $stem > $laststem) {
    if (not defined $laststem) {
      $laststem = $stem - 1;
    } else {
      print " \\\\\n";
    }
    $laststem++;
    print "    $laststem";
  }
  printf " & %3d", $leaf;
}
print <<'EOT';
  \end{tabular}
\end{document}
EOT
</lang>
LaTeX output of the Perl program:
<lang latex>\documentclass{report}
\usepackage{fullpage}
\begin{document}
  \begin{tabular}{ r | *{60}{c} }
    0 & 5 \\
    1 & 1 \\
    2 & 1 \\
    3 & 3 & 7 \\
...
    44 & 0 & 2 & 6 & 8 \\
    45 & 0 & 1 & 2
  \end{tabular}
\end{document}</lang>
The parameter to the tabular environment defines the columns of the table. “r” and “c” are right- and center-aligned columns, “|” is a vertical rule, and “*{count}{cols}” repeats a column definition count times.
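The FIXME in the Perl program says the column count should be the largest number of leaves on any single stem, not the total number of data items. The following is only a sketch of that calculation in Python, not a patch to the Perl code. It assumes the same split as the Perl example (stem is the value divided by 10, discarding the remainder), and <code>max_leaves_per_stem</code> is a hypothetical helper name.

```python
from collections import Counter

def max_leaves_per_stem(values, interval=10):
    """Return the largest number of leaves that share a single stem.

    Hypothetical helper: the stem of a value is taken to be
    value // interval, matching int($value / 10) in the Perl example
    for non-negative data.
    """
    counts = Counter(int(v) // interval for v in values)
    return max(counts.values()) if counts else 0

data = [110, 436, 124, 109, 440, 330, 53, 352, 315, 452, 54, 49, 334, 102,
        432, 123, 442, 125, 97, 104, 11, 446, 123, 360, 324, 427, 451, 329,
        139, 42, 324, 320, 450, 100, 87, 414, 305, 21, 375, 324, 360, 123,
        33, 378, 37, 66, 41, 321, 68, 356, 407, 448, 5, 128, 81, 361, 419,
        134, 147, 146]

columns = max_leaves_per_stem(data)
print(columns)  # 6 for the task data (stems 12 and 32 each hold six leaves)
```

With this count the tabular specification would need 6 leaf columns rather than 60 for the task data.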
Python
Adjusting <code>Stem.leafdigits</code> allows you to modify how many digits of a value are used in the leaf, with the stem intervals adjusted accordingly.
<lang python>from collections import namedtuple
from pprint import pprint as pp
from math import floor

Stem = namedtuple('Stem', 'data, leafdigits')

data0 = Stem((110, 436, 124, 109, 440, 330, 53, 352, 315, 452,
              54, 49, 334, 102, 432, 123, 442, 125, 97, 104,
              11, 446, 123, 360, 324, 427, 451, 329, 139, 42,
              324, 320, 450, 100, 87, 414, 305, 21, 375, 324,
              360, 123, 33, 378, 37, 66, 41, 321, 68, 356,
              407, 448, 5, 128, 81, 361, 419, 134, 147, 146),
             2.0)

def stemplot(stem):
    d = []
    # The stem interval is 10 to the power of the number of leaf digits.
    interval = int(10**int(stem.leafdigits))
    for data in sorted(stem.data):
        data = int(floor(data))
        stm, lf = divmod(data,interval)
        d.append( (int(stm * interval), int(lf)) )
    stems, leafs = list(zip(*d))
    stemwidth = max(len(str(x)) for x in stems)
    leafwidth = max(len(str(x)) for x in leafs)
    # Start one interval below the first stem so the loop emits it too.
    laststem, out = min(stems) - interval, []
    for s,l in d:
        # Emit every stem up to s, including empty ones (no skipped intervals).
        while laststem < s:
            laststem += interval
            out.append('\n%*i |' % ( stemwidth, laststem))
        out.append(' %*i' % (leafwidth, l))
    out.append('\n\nKey\n Stem interval: %f\n\n' % interval)
    return ''.join(out)

if __name__ == '__main__':
    print( stemplot(data0) )
    print( stemplot(Stem(data0.data, 1.0)) )
</lang>
Sample Output
>>> 
  0 |  5 11 21 33 37 41 42 49 53 54 66 68 81 87 97
100 |  0  2  4  9 10 23 23 23 24 25 28 34 39 46 47
200 |
300 |  5 15 20 21 24 24 24 29 30 34 52 56 60 60 61 75 78
400 |  7 14 19 27 32 36 40 42 46 48 50 51 52

Key
 Stem interval: 100.000000


  0 | 5
 10 | 1
 20 | 1
 30 | 3 7
 40 | 1 2 9
 50 | 3 4
 60 | 6 8
 70 |
 80 | 1 7
 90 | 7
100 | 0 2 4 9
110 | 0
120 | 3 3 3 4 5 8
130 | 4 9
140 | 6 7
150 |
160 |
170 |
180 |
190 |
200 |
210 |
220 |
230 |
240 |
250 |
260 |
270 |
280 |
290 |
300 | 5
310 | 5
320 | 0 1 4 4 4 9
330 | 0 4
340 |
350 | 2 6
360 | 0 0 1
370 | 5 8
380 |
390 |
400 | 7
410 | 4 9
420 | 7
430 | 2 6
440 | 0 2 6 8
450 | 0 1 2

Key
 Stem interval: 10.000000

>>> 
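The two plots differ only in the value of <code>leafdigits</code>. As a minimal sketch of the underlying arithmetic, the interval is 10 raised to the number of leaf digits, and <code>divmod</code> by that interval separates stem from leaf. The helper name <code>split_value</code> is hypothetical and not part of the example above.

```python
def split_value(value, leafdigits):
    """Split a value into (stem, leaf), keeping `leafdigits` digits in the leaf.

    Hypothetical helper mirroring the arithmetic of the Python example:
    the stem is reported in the units of the data (430, not 43).
    """
    interval = 10 ** int(leafdigits)
    stem, leaf = divmod(int(value), interval)
    return stem * interval, leaf

print(split_value(436, 1))  # (430, 6)
print(split_value(436, 2))  # (400, 36)
```

So 436 appears as leaf 6 on stem 430 in the second plot and as leaf 36 on stem 400 in the first.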
Ruby
<lang ruby>def generate_stem_and_leaf(data)
  sorted = data.sort
  minimum, maximum = sorted.minmax
  multiplier = 10 ** Math.log10(maximum - minimum).floor
  stem_data = Hash.new {|h,k| h[k] = []}
  sorted.each do |value|
    stem, leaf = value.divmod(multiplier)
    stem_data[stem] << leaf
  end
  [multiplier, stem_data]
end

def print_stem_and_leaf(data)
  multiplier, stem_data = generate_stem_and_leaf(data)
  stem_width = Math.log10(stem_data.keys.max).ceil
  leaf_width = Math.log10(multiplier)
  min_stem, max_stem = stem_data.keys.minmax
  min_stem.upto(max_stem) do |stem|
    leaves = stem_data[stem].inject("") {|str,leaf| str << "%*d " % [leaf_width, leaf]}
    puts "%*d | %s" % [stem_width, stem, leaves]
  end
  puts
  puts "key: 5|4=#{5*multiplier+4}"
  puts "leaf unit: 1"
  puts "stem unit: #{multiplier}"
end

data = DATA.read.split.map {|s| s.to_i}
print_stem_and_leaf(data)

__END__
110 436 124 109 440 330 53 352 315 452 54 49 334 102 432 123 442 125 97 104 11 446 123 360 324 427 451 329 139 42 324 320 450 100 87 414 305 21 375 324 360 123 33 378 37 66 41 321 68 356 407 448 5 128 81 361 419 134 147 146</lang>
outputs
0 |  5 11 21 33 37 41 42 49 53 54 66 68 81 87 97
1 |  0  2  4  9 10 23 23 23 24 25 28 34 39 46 47
2 |
3 |  5 15 20 21 24 24 24 29 30 34 52 56 60 60 61 75 78
4 |  7 14 19 27 32 36 40 42 46 48 50 51 52

key: 5|4=504
leaf unit: 1
stem unit: 100
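The Ruby example derives its stem unit from the spread of the data: 10 raised to the floor of the base-10 logarithm of (maximum − minimum). A small Python sketch of that rule follows. The name <code>stem_multiplier</code> is hypothetical, integer data is assumed, and the fallback to 1 for a zero spread is an added assumption that the Ruby code does not make.

```python
import math

def stem_multiplier(values):
    """Pick a power of ten for the stem unit from the spread of the data.

    Hypothetical helper following the rule used in the Ruby example:
    10 ** floor(log10(max - min)).
    """
    spread = max(values) - min(values)
    if spread <= 0:
        # Assumption: with no spread, log10 is undefined, so use unit stems.
        return 1
    return 10 ** math.floor(math.log10(spread))

print(stem_multiplier([5, 452]))  # 100, as in "stem unit: 100" above
```

For the task data the spread is 452 − 5 = 447, which gives a stem unit of 100 and therefore only five stems.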
Tcl
Note that this algorithm does not sort the data first. It collects the values into a hash table keyed by stem, then sorts the stems, and finally sorts the leaves of each stem as it prints them.
<lang tcl>package require Tcl 8.5

# How to process a single value, adding it to the table mapping stems to
# leaves.
proc addSLValue {arrayName value} {
    upvar 1 $arrayName ary
    # Extract the sign and clean up the value
    set s [expr {$value < 0 ? "-" : " "}]
    set value [expr {round(abs($value))}]
    # Split the value into stem and leaf
    set leaf [expr {$value % 10}]
    set stem [expr {$value / 10}]
    # Store
    lappend ary($s$stem) $leaf
}

# How to do the actual output of the stem-and-leaf table, given that we have
# already done the splitting into stems and leaves.
proc printSLTable {arrayName} {
    upvar 1 $arrayName ary
    # Sort the stems by number
    set names [lsort -integer [array names ary]]
    # Work out how much width the stems take so everything lines up
    set len [expr {
        max([string length [lindex $names 0]], [string length [lindex $names end]])
    }]
    # Print out the table, sorting the leaves as we go
    foreach n $names {
        puts [format " %*s | %s" $len $n [lsort -integer $ary($n)]]
    }
}

# Assemble the parts into a full stem-and-leaf table printer.
proc printStemLeaf dataList {
    array set tbl {}
    foreach value $dataList {
        addSLValue tbl $value
    }
    printSLTable tbl
}

# Demo code
set data {
    110 436 124 109 440 330 53 352 315 452 54 49 334 102 432 123 442 125 97 104 11 446 123 360 324 427 451 329 139 42 324 320 450 100 87 414 305 21 375 324 360 123 33 378 37 66 41 321 68 356 407 448 5 128 81 361 419 134 147 146
}
printStemLeaf $data</lang>
Abbreviated output:
0 | 5
1 | 1
2 | 1
3 | 3 7
…
44 | 0 2 6 8
45 | 0 1 2
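The group-first, sort-later approach described for the Tcl version can be sketched in Python as follows. This is an illustration under stated assumptions, not a translation of the Tcl code: <code>stem_leaf_rows</code> is a hypothetical name, the stems are not padded to a common width, and Python's <code>round</code> rounds halves to the nearest even integer, which may differ from Tcl's <code>round()</code> for non-integer input.

```python
from collections import defaultdict

def stem_leaf_rows(values):
    """Group leaves by signed stem first, then sort stems and leaves.

    Hypothetical helper illustrating the strategy of the Tcl example.
    """
    table = defaultdict(list)
    for value in values:
        # Keep the sign separately so that -0 and 0 are distinct stems.
        sign = "-" if value < 0 else " "
        stem, leaf = divmod(round(abs(value)), 10)
        table[(sign, stem)].append(leaf)  # inserted in input order, unsorted

    def order(key):
        # Negative stems sort before positive ones, and -0 before 0.
        sign, stem = key
        return (-stem, 0) if sign == "-" else (stem, 1)

    rows = []
    for sign, stem in sorted(table, key=order):
        leaves = " ".join(str(leaf) for leaf in sorted(table[(sign, stem)]))
        rows.append("%s%d | %s" % (sign, stem, leaves))
    return rows

for row in stem_leaf_rows([37, 5, 33, 11, 21]):
    print(row)
```

Because only stems that actually occur are stored, this approach prints no rows for empty stems unless the gaps are filled in separately.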