Text processing/2: Difference between revisions
Thundergnat (talk | contribs) (Add mirror of data file. (Link is 404)) |
Thundergnat (talk | contribs) m (→{{header|Perl 6}}: Remove less idiomatic version, adjust verbiage to match, update output) |
||
Line 1,877: | Line 1,877: | ||
=={{header|Perl 6}}== |
=={{header|Perl 6}}== |
||
{{trans|Perl}} |
{{trans|Perl}} |
||
{{works with|Rakudo| |
{{works with|Rakudo|2018.03}} |
||
<lang perl6>my $fields = 49;

my ($good-records, %dates) = 0;
for 1 .. * Z $*IN.lines -> $line, $s {
    my @fs = split /\s+/, $s;
    @fs == $fields or die "$line: Bad number of fields";
    given shift @fs {
        m/\d**4 \- \d**2 \- \d**2/ or die "$line: Bad date format";
        ++%dates{$_};
    }
    my $all-flags-okay = True;
    for @fs -> $val, $flag {
        $val ~~ /\d+ \. \d+/ or die "$line: Bad value format";
        $flag ~~ /^ \-? \d+/ or die "$line: Bad flag format";
        $flag < 1 and $all-flags-okay = False;
    }
    $all-flags-okay and ++$good-records;
}

say 'Good records: ', $good-records;
say 'Repeated timestamps:';
say ' ', $_ for grep { %dates{$_} > 1 }, sort keys %dates;</lang>
Output:
<pre>Good records: 5017
Repeated timestamps:
1990-03-25
1991-03-31
1992-03-29
1993-03-28
1995-03-26</pre>
The first version demonstrates that you can program Perl 6 almost like Perl 5. Here's a more idiomatic Perl 6 version that runs several times faster: |
|||
<lang perl6>my $good-records; |
<lang perl6>my $good-records; |
||
my $line; |
my $line; |
||
Line 1,933: | Line 1,913: | ||
<pre>5017 good records out of 5471 total |
<pre>5017 good records out of 5471 total |
||
Repeated timestamps (with line numbers): |
Repeated timestamps (with line numbers): |
||
1990-03-25 |
1990-03-25 => [84 85] |
||
1991-03-31 |
1991-03-31 => [455 456] |
||
1992-03-29 |
1992-03-29 => [819 820] |
||
1993-03-28 |
1993-03-28 => [1183 1184] |
||
1995-03-26 |
1995-03-26 => [1910 1911]</pre> |
||
The final line simply greps out the pairs from the hash whose value is an array with more than one element. (Values that are not arrays nevertheless have a <tt>.elems</tt> method that always reports <tt>1</tt>.) The <tt>.pairs</tt> call is there only for clarity; grepping a hash directly has the same effect.
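As a minimal sketch of the grep described above, the snippet below builds a small hash with <tt>Hash.push</tt> and then filters its pairs. The <tt>@sample</tt> data and its line numbers are hypothetical and are not taken from the task's data file. The sketch assumes the dates are collected with <tt>%dates.push</tt>, which is what leaves a single value as a plain scalar and turns repeated keys into arrays.

<lang perl6># Hypothetical sample input: one date per "line"; 1990-03-25 appears twice.
my @sample = <1990-03-24 1990-03-25 1990-03-25 1990-03-26>;

my %dates;
for @sample.kv -> $index, $date {
    # Hash.push stores a lone value as-is and promotes it to an Array
    # once the same key is pushed again.
    %dates.push: $date => $index + 1;
}

# Keep only the pairs whose value holds more than one line number.
.say for sort %dates.pairs.grep: *.value.elems > 1;</lang>

Only the repeated date should survive the filter, printed as a pair of the date and its list of line numbers.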
||
=={{header|PHP}}== |
=={{header|PHP}}== |