Kernighans large earthquake problem

From Rosetta Code
Revision as of 16:44, 22 September 2018

Task
Kernighans large earthquake problem
You are encouraged to solve this task according to the task description, using any language you may know.

Brian Kernighan, in a lecture at the University of Nottingham, described a problem on which this task is based.

Problem

You are given a data file of thousands of lines; each line has three whitespace-separated fields: a date, a one-word name, and the magnitude of the event.

Example lines from the file would be lines like:

8/27/1883    Krakatoa            8.8
5/18/1980    MountStHelens       7.6
3/13/2009    CostaRica           5.1
Task
  • Create a program or script invocation to find all the events with magnitude greater than 6
  • Assuming an appropriate name e.g. "data.txt" for the file:
  1. Either: Show how your program is invoked to process a data file of that name.
  2. Or: Incorporate the file name into the program, (as it is assumed that the program is single use).



ALGOL 68

<lang algol68>IF FILE input file;

   STRING file name = "data.txt";
   open( input file, file name, stand in channel ) /= 0

THEN

   # failed to open the file #
   print( ( "Unable to open """ + file name + """", newline ) )

ELSE

   # file opened OK #
   BOOL at eof := FALSE;
   # set the EOF handler for the file #
   on logical file end( input file, ( REF FILE f )BOOL:
                                    BEGIN
                                        # note that we reached EOF on the latest read #
                                        at eof := TRUE;
                                        # return TRUE so processing can continue #
                                        TRUE
                                    END
                      );
   # return the real value of the specified field on the line #
   PROC real field = ( STRING line, INT field )REAL:
        BEGIN
           REAL result  := 0;
           INT  c pos   := LWB line;
           INT  max pos := UPB line;
           STRING f     := "";
           FOR field number TO field WHILE c pos <= max pos DO
               # skip leading spaces #
               WHILE IF c pos > max pos THEN FALSE ELSE line[ c pos ] = " " FI DO
                   c pos +:= 1
               OD;
               IF c pos <= max pos THEN
                   # have a field #
                   INT start pos = c pos;
                   WHILE IF c pos > max pos THEN FALSE ELSE line[ c pos ] /= " " FI DO
                       c pos +:= 1
                   OD;
                   IF field number = field THEN
                       # have the required field #
                       f := line[ start pos : c pos - 1 ]
                   FI
               FI
           OD;
           IF f /= "" THEN
               # have the field - assume it a real value and convert it #
               FILE real value;
               associate( real value, f );
               on value error( real value
                             , ( REF FILE f )BOOL:
                                    BEGIN
                                        # "handle" invalid data #
                                        result := 0;
                                        # return TRUE so processing can continue #
                                        TRUE
                                    END
                             );
               get( real value, ( result ) )
           FI;
           result
        END # real field # ;
   # show the lines where the third field is > 6 #
   WHILE NOT at eof
   DO
       STRING line;
       get( input file, ( line, newline ) );
       IF real field( line, 3 ) > 6 THEN
           print( ( line, newline ) )
       FI
   OD;
   # close the file #
   close( input file )

FI</lang>

AWK

<lang awk> awk '$3 > 6' data.txt</lang>

C++

<lang cpp>// Randizo was here!

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main() {

   ifstream file("../include/earthquake.txt");
   int count_quake = 0;
   int column = 1;
   string value;
   double size_quake;
   string row = "";


   while(file >> value)
   {
       if(column == 3)
       {
           size_quake = stod(value);
           if(size_quake>6.0)
           {
               count_quake++;
               row += value + "\t";
               cout << row << endl;
           }
           column = 1;
           row = "";
       }
       else
       {
           column++;
           row+=value + "\t";
       }
   }
   cout << "\nNumber of quakes greater than 6 is " << count_quake << endl;
   return 0;

}</lang>

New version: <lang cpp>// Jolkdarr was also here!

#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>

int main() {

   using namespace std;
   ifstream file("data.txt");
   int count_quake = 0;
   string s1, s2;
   double rate;
   while (file >> s1 >> s2 >> rate) {  // stops at end of file or on a malformed record
       if (rate > 6.0) {
       	cout << s1 << setw(20) << s2 << " " << rate << endl;
       	count_quake++;
       }
   }
   cout << endl << "Number of quakes greater than 6 is " << count_quake << endl;
   return 0;

}</lang>

C

<lang c>#include <stdio.h>

#include <string.h>
#include <stdlib.h>

int main() {

   FILE *fp;
   char *line = NULL;
   size_t len = 0;
   ssize_t read;
   char *lw, *lt;
   fp = fopen("data.txt", "r");
   if (fp == NULL) {
       printf("Unable to open file\n");
       exit(1);
   }
   printf("Those earthquakes with a magnitude > 6.0 are:\n\n");
   while ((read = getline(&line, &len, fp)) != EOF) {
       if (read < 2) continue;   /* ignore blank lines */
       lw = strrchr(line, ' ');  /* look for last space */
       lt = strrchr(line, '\t'); /* look for last tab */
       if (!lw && !lt) continue; /* ignore lines with no whitespace */
       if (lt > lw) lw = lt;     /* lw points to last space or tab */
       if (atof(lw + 1) > 6.0) printf("%s", line);
   }
   fclose(fp);
   if (line) free(line);
   return 0;

}</lang>

Output:

Using the given file:

Those earthquakes with a magnitude > 6.0 are:

8/27/1883    Krakatoa            8.8
5/18/1980    MountStHelens       7.6


Cixl

<lang cixl> use: cx;

'data.txt' `r fopen lines {

 let: (time place mag) @@s split ..;
 let: (m1 m2) $mag @. split &int map ..;
 $m1 6 >= $m2 0 > and {[$time @@s $place @@s $mag] say} if

} for </lang>

Output:
8/27/1883 Krakatoa 8.8
5/18/1980 MountStHelens 7.6

Factor

lines is a convenience word that reads lines from standard input. If you don't want to type them all in yourself, it is suggested that you give the program a file to read. For example, on the Windows command line:

factor kernighan.factor < earthquakes.txt

<lang factor>USING: io math math.parser prettyprint sequences splitting ;
IN: rosetta-code.kernighan

lines [ "\s" split last string>number 6 > ] filter .</lang>

Go

<lang go>package main

import (

   "bufio"
   "fmt"
   "os"
   "strconv"
   "strings"

)

func main() {

   f, err := os.Open("data.txt")
   if err != nil {
       fmt.Println("Unable to open the file")
       return
   }
   defer f.Close()
   fmt.Println("Those earthquakes with a magnitude > 6.0 are:\n")
   input := bufio.NewScanner(f)
   for input.Scan() {
       line := input.Text()
       fields := strings.Fields(line)
       mag, err := strconv.ParseFloat(fields[2], 64)
       if err != nil {
           fmt.Println("Unable to parse magnitude of an earthquake")
           return
       }
       if mag > 6.0 {
           fmt.Println(line)
       }
   }

}</lang>

Output:
Those earthquakes with a magnitude > 6.0 are:

8/27/1883    Krakatoa            8.8
5/18/1980    MountStHelens       7.6

Kotlin

<lang scala>// Version 1.2.40

import java.io.File

fun main(args: Array<String>) {

   val r = Regex("""\s+""")
   println("Those earthquakes with a magnitude > 6.0 are:\n")
   File("data.txt").forEachLine {
       if (it.split(r)[2].toDouble() > 6.0) println(it)
   }    

}</lang>

Output:

Using the given file:

Those earthquakes with a magnitude > 6.0 are:

8/27/1883    Krakatoa            8.8
5/18/1980    MountStHelens       7.6


Haskell

<lang haskell>import qualified Data.ByteString.Lazy.Char8 as C

main :: IO ()
main = do

 cs <- C.readFile "data.txt"
 mapM_ print $
   concatMap
     (\x ->
         [ x
         | 6 < (read (last (C.unpack <$> C.words x)) :: Float) ]) $
   C.lines cs</lang>
Output:
"8/27/1883    Krakatoa            8.8"
"5/18/1980    MountStHelens       7.6"

Lua

For each line, the Lua pattern "%S+$" is used to capture between the final space character and the end of the line.
<lang lua>-- arg[1] is the first argument provided at the command line
for line in io.lines(arg[1] or "data.txt") do -- use data.txt if arg[1] is nil

 magnitude = line:match("%S+$")
 if tonumber(magnitude) > 6 then print(line) end

end</lang>

Perl

<lang perl>perl -n -e '/(\S+)\s*$/ and $1 > 6 and print' data.txt</lang>

Perl 6

Works with: Rakudo version 2018.03

Pass in a file name, or use default for demonstration purposes. <lang perl6>$_ = @*ARGS[0] ?? @*ARGS[0].IO !! q:to/END/;

   8/27/1883    Krakatoa            8.8
   5/18/1980    MountStHelens       7.6
   3/13/2009    CostaRica           5.1
   END

map { .say if .words[2] > 6 }, .lines;</lang>

Phix

<lang Phix>sequence cl = command_line()
string filename = iff(length(cl)>=3?cl[3]:"e02.txt")
integer fn = open(filename,"r")
if fn=-1 then crash("cannot open filename") end if
while 1 do

   object line = gets(fn)
   if line=-1 then exit end if
   line = substitute(trim(line),"\t"," ")
   sequence r = scanf(line,"%s %f")
   if length(r)=1 and r[1][2]>6 then ?line end if

end while
close(fn)</lang>

Output:
"8/27/1883    Krakatoa            8.8"
"5/18/1980    MountStHelens       7.6"

Python

Typed into a bash shell or similar:
<lang python>python -c '
with open("data.txt") as f:

   for ln in f:
       if float(ln.strip().split()[2]) > 6:
           print(ln.strip())'</lang>


Or, if scale permits a file slurp and a parse retained for further processing, we can combine the parse and filter with a concatMap abstraction:

<lang python>from os.path import expanduser
from functools import (reduce)


# main :: IO ()

def main():

   xs = concatMap(
       lambda x: (
           lambda ws=words(x): (
               lambda n=float(ws[2]):
               [ws] if 6 < n else []
           )()
       )()
   )(
       lines(readFile('~/data.txt'))
   )
   print (xs)


# GENERIC ABSTRACTIONS ----------------------------


# concatMap :: (a -> [b]) -> [a] -> [b]

def concatMap(f):

   return lambda xs: (
       reduce(lambda a, b: a + b, map(f, xs), [])
   )


# lines :: String -> [String]

def lines(s):

   return s.splitlines()


# readFile :: FilePath -> IO String

def readFile(fp):

   return open(expanduser(fp)).read()


# words :: String -> [String]

def words(s):

   return s.split()


# MAIN ---

main()</lang>

Output:
[['8/27/1883', 'Krakatoa', '8.8'], ['5/18/1980', 'MountStHelens', '7.6']]
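
Both versions above raise an exception on a blank or malformed line. A more defensive variant is sketched below, assuming that such lines should simply be skipped; the task description does not specify a policy, and the function name large_quakes and the sample data are illustrative only.

```python
def large_quakes(lines, threshold=6.0):
    """Yield the stripped lines whose third whitespace-separated field exceeds threshold.

    Lines with fewer than three fields, or with a non-numeric third field,
    are skipped. This skip policy is an assumption, not part of the task.
    """
    for ln in lines:
        fields = ln.split()
        if len(fields) < 3:
            continue  # blank or short line
        try:
            magnitude = float(fields[2])
        except ValueError:
            continue  # third field is not a number
        if magnitude > threshold:
            yield ln.strip()


# Illustrative input: the task's three example lines plus two bad lines.
sample = [
    "8/27/1883    Krakatoa            8.8",
    "",
    "5/18/1980    MountStHelens       7.6",
    "3/13/2009    CostaRica           5.1",
    "1/1/2000     Unknown             n/a",
]

for event in large_quakes(sample):
    print(event)
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the sample list.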

REXX

A little extra coding was added to provide an output title (with centering and better alignment), and an error message for input file not found.
<lang rexx>/*REXX program to read a file containing a list of earthquakes: date, site, magnitude.*/
parse arg iFID mMag .                           /*obtain optional arguments from the CL*/
if iFID=='' | iFID==","  then iFID= 'earthquakes.dat' /*Not specified?  Then use default*/
if mMag=='' | mMag==","  then mMag= 6                 /* "      "         "   "     "   */
#=0                                             /*# of earthquakes that meet criteria. */
  do j=0  while lines(iFID)\==0                 /*read all lines in the input file.    */
  parse value linein(iFID) with date site mag . /*parse three words from an input line.*/
  if mag<=mMag  then iterate                    /*Is the quake too small?  Then skip it*/
  #= # + 1                                      /*bump the number of qualifying quakes.*/
  if #==1  then say center('date', 20, "═")     '=magnitude='     center("site", 20, '═')
  say center(date, 20)      center(mag/1, 11)        '  '        site
  end   /*j*/                                   /*stick a fork in it,  we're all done. */

say
if j==0  then say er 'file ' iFID " is empty or not found."
         else say #  ' earthquakes listed.'</lang>
output   when using the default inputs:
════════date════════ =magnitude= ════════site════════
     08/27/1883          8.8        Krakatoa
     05/18/1980          7.6        MountStHelens

2  earthquakes listed.

Ring

<lang ring>

# Project  : Kernighans large earthquake problem

load "stdlib.ring"
nr = 0
equake = list(3)
fn = "equake.txt"
fp = fopen(fn,"r")

while not feof(fp)

        nr = nr + 1 
        equake[nr] = readline(fp)

end
fclose(fp)
for n = 1 to len(equake)

    for m = 1 to len(equake[n])
         if equake[n][m] = " "
            sp = m
         ok
    next
    sptemp = right(equake[n],len(equake[n])-sp)
    sptemp = number(sptemp)
    if sptemp > 6
       see equake[n] + nl
    ok

next</lang>
Output:

8/27/1883    Krakatoa            8.8
5/18/1980    MountStHelens   7.6

Ruby

ruby -nae "$F[2].to_f > 6 && print" data.txt

A more interesting problem. Print only the events whose magnitude is above average.

Contents of the file:

8/27/1883    Krakatoa            8.8
5/18/1980    MountStHelens       7.6
3/13/2009    CostaRica           5.1
2000-02-02  Foo               7.7
1959-08-08   Bar             6.2
1849-09-09  Pym                9.0

The command:

ruby -e"m=$<.to_a;f=->s{s.split[2].to_f};a=m.reduce(0){|t,s|t+f[s]}/m.size;puts m.select{|s|f[s]>a}" e.txt

Output:

8/27/1883    Krakatoa            8.8
5/18/1980    MountStHelens       7.6
2000-02-02  Foo               7.7
1849-09-09  Pym                9.0
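
For comparison, the same two-pass idea (compute the mean magnitude first, then select) can be sketched in Python. This is an illustration rather than a translation of the Ruby one-liner; the function name above_average is hypothetical, and ignoring blank lines is an added assumption.

```python
def above_average(lines):
    """Return the lines whose third field exceeds the mean of all third fields."""
    records = [ln for ln in lines if ln.strip()]  # ignore blank lines (assumption)
    magnitudes = [float(ln.split()[2]) for ln in records]
    if not magnitudes:
        return []
    mean = sum(magnitudes) / len(magnitudes)
    return [ln for ln, m in zip(records, magnitudes) if m > mean]


# The six records listed under "Contents of the file" above.
sample = [
    "8/27/1883    Krakatoa            8.8",
    "5/18/1980    MountStHelens       7.6",
    "3/13/2009    CostaRica           5.1",
    "2000-02-02  Foo               7.7",
    "1959-08-08   Bar             6.2",
    "1849-09-09  Pym                9.0",
]

for event in above_average(sample):
    print(event)
```

The mean of the six magnitudes is about 7.4, so the four events at 7.6 and above are selected, matching the Ruby output.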

Scala

<lang Scala>object Equakes extends App {

 val HeavyMag = 6.0
 val regex = """^.*?((?:[0-9]{1,4}[.\-\/]){2}[0-9]{2,4})(?:\s*)([\sa-zA-Z]+)((?:0|(?:[1-9][0-9]*))(?:\.[0-9]+))?.*?$""".r
 val heavyOnes: Seq[String] = io.Source.fromFile("equake.txt").getLines().map {
   case s@regex(_, _, mag)
     if mag.toDouble > HeavyMag => s
   case _ => ""
 }.filter(_.nonEmpty).toSeq
 println(s"Events with a magnitude greater than $HeavyMag are:\n")
 heavyOnes.foreach(println(_))
 println(s"End of list of ${heavyOnes.length} events.")

}</lang>

Output:
Events with a magnitude greater than 6.0 are:

8/27/1883 Krakatoa 8.8
5/18/1980 MountStHelens 7.6

End of list of 2 events.

Tcl

Inspired by awk.
<lang tcl>catch {console show} 	;## show console when running from tclwish
catch {wm withdraw .}

set filename "data.txt"
set fh [open $filename]
set NR 0 	;# number-of-record, means linenumber

while {[gets $fh line]>=0} { ;# gets returns length of line, -1 means eof

   incr NR
   set  line2 [regexp -all -inline {\S+} $line]  ;# reduce multiple whitespace
   set  fld   [split $line2]  	;# split line into fields, at whitespace
   set  f3    [lindex $fld 2] 	;# zero-based
  #set  NF    [llength $fld]   	;# number-of-fields
   if {$f3 > 6} { puts "$line" }

}
close $fh</lang>

zkl

While lexical comparisons [of numeric data] are fine for this problem, it is bad practice so I don't do it (written so text is automatically converted to float).
<lang zkl>fcn equake(data,out=Console){

  data.pump(out,fcn(line){ 6.0<line.split()[-1] },Void.Filter)

}</lang>
<lang zkl>equake(Data(Void,
#<<<
"8/27/1883 Krakatoa 8.8\n"
"5/18/1980 MountStHelens 7.6\n"
"3/13/2009 CostaRica 5.1\n"
#<<<
));</lang>
or
<lang zkl>equake(File("equake.txt"));</lang>
or
<lang zkl>$ zkl --eval 'File.stdin.pump(Console,fcn(line){ 6.0<line.split()[-1] },Void.Filter)' < equake.txt</lang>

Output:
8/27/1883    Krakatoa            8.8
5/18/1980    MountStHelens       7.6
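
The remark about lexical comparison can be illustrated with a short Python sketch. Comparing the magnitude as text happens to work for the sample data because every magnitude has a single digit before the decimal point. The value "10.1" below is a hypothetical magnitude, chosen only to show where string comparison goes wrong.

```python
magnitudes = ["8.8", "10.1", "5.1"]

# String comparison proceeds character by character,
# so "10.1" sorts below "6" because "1" < "6".
lexical = [m for m in magnitudes if m > "6"]

# Converting to float first compares the numeric values.
numeric = [m for m in magnitudes if float(m) > 6]

print(lexical)  # ['8.8']
print(numeric)  # ['8.8', '10.1']
```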