User defined pipe and redirection operators

If the language supports operator definition, then:

Create "user defined" equivalents of the Unix shell "<", "|", ">", "<<", ">>" and $(cmd) operators.
Provide simple equivalents of cat, tee, grep and uniq, but as filters/procedures native to the specific language.
Replicate the sample shell script below in the specific language.
Specifically, do not cache the entire stream before the subsequent filter/procedure starts. Pass each record on, as soon as it is available, through each of the filters/procedures in the chain.
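The requirements above can be sketched in Python with operator overloading and generators. This is only an illustration under stated assumptions, not a reference solution: the class names `Stream` and `Filter`, the use of in-memory lists as stand-ins for files, and substring matching in `grep` are all choices made for this sketch.

```python
from typing import Callable, Iterable, Iterator, List


class Stream:
    """A lazy sequence of records that can be redirected into a list."""

    def __init__(self, records: Iterable[str]):
        self._records = iter(records)

    def __iter__(self) -> Iterator[str]:
        return self._records

    def __gt__(self, sink: List[str]) -> None:
        # stream > sink : replace the sink's contents
        sink[:] = list(self)

    def __rshift__(self, sink: List[str]) -> None:
        # stream >> sink : append to the sink
        sink.extend(self)


class Filter:
    """Wraps a function that maps an iterator of records to an iterator."""

    def __init__(self, fn: Callable[[Iterator[str]], Iterator[str]]):
        self._fn = fn

    def __ror__(self, source: Iterable[str]) -> Stream:
        # source | filter : only connects the stages, nothing is read yet
        return Stream(self._fn(iter(source)))


def grep(pattern: str) -> Filter:
    # Assumption: plain substring match rather than a regular expression.
    return Filter(lambda records: (r for r in records if pattern in r))


def uniq() -> Filter:
    def run(records: Iterator[str]) -> Iterator[str]:
        previous = object()  # sentinel that equals no record
        for record in records:
            if record != previous:
                yield record
            previous = record
    return Filter(run)


def tee(copy: List[str]) -> Filter:
    def run(records: Iterator[str]) -> Iterator[str]:
        for record in records:
            copy.append(record)  # side copy, then pass the record on
            yield record
    return Filter(run)


records = [
    "Kees Koster - ALGOL 68",
    "Kees Koster - ALGOL 68",
    "William Wulf - compilers",
    "Peter Naur - BNF, ALGOL 60",
]
seen: List[str] = []
result: List[str] = []
records | uniq() | tee(seen) | grep("ALGOL") > result
print(result)
```

Because every stage is a generator, a record travels through the whole chain before the next one is read. Note that in Python `>>` binds more tightly than `|`, so an appending redirection needs parentheses around the pipeline: `(records | grep("ALGOL")) >> result`.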

Alternatively, if the language does not support operator definition, replace the operators with procedures:

Define the procedures: input(cmd,stream), pipe(stream,cmd), output(stream, stream), whereis(array), append(stream)
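A procedure-based variant might look like the following Python sketch. The task text only names the procedures, so the exact signatures here are assumptions: `input_from` stands in for input (renamed to avoid shadowing Python's built-in `input`), sinks are plain lists, and `append` takes an explicit sink argument. `whereis` is left out because the text does not say what it should do.

```python
import itertools
from typing import Callable, Iterable, Iterator, List

# A command consumes an iterator of records and produces another one.
Command = Callable[[Iterator[str]], Iterator[str]]


def input_from(cmd: Command, source: Iterable[str]) -> Iterator[str]:
    """Equivalent of `cmd < source`."""
    return cmd(iter(source))


def pipe(stream: Iterable[str], cmd: Command) -> Iterator[str]:
    """Equivalent of `stream | cmd`; lazy, nothing is read until consumed."""
    return cmd(iter(stream))


def output(stream: Iterable[str], sink: List[str]) -> None:
    """Equivalent of `stream > sink`: replace the sink's contents."""
    sink[:] = list(stream)


def append(stream: Iterable[str], sink: List[str]) -> None:
    """Equivalent of `stream >> sink`: add to the end of the sink."""
    sink.extend(stream)


def head(count: int) -> Command:
    return lambda records: itertools.islice(records, count)


def grep(pattern: str) -> Command:
    # Assumption: plain substring match rather than a regular expression.
    return lambda records: (r for r in records if pattern in r)


names = ["Alfred Aho", "Kees Koster", "Peter Naur", "Konrad Zuse"]
picked: List[str] = []
output(pipe(input_from(head(3), names), grep("K")), picked)
append(pipe(names, grep("Z")), picked)
print(picked)
```

The calls nest inside out, so the left-to-right reading order of a shell pipeline is lost, but records still flow one at a time because each command returns an iterator.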

For bonus Kudos: Implement the shell "&" concept as a dyadic operator in the specific language. e.g.: <lang bash>( head x & tail x & wait ) | grep test</lang>
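One possible reading of the "&" idea, sketched in Python: producers joined with `&` run concurrently and their records are merged as they arrive. The `Job` class, the use of threads, and the queue-based merge are assumptions made for this sketch. The order in which records from different producers interleave is not defined, just as with background jobs in the shell.

```python
import queue
import threading
from typing import Iterable, List

_DONE = object()  # marks the end of one producer's output


class Job:
    """One or more record producers that run concurrently when iterated."""

    def __init__(self, *sources: Iterable[str]):
        self.sources: List[Iterable[str]] = list(sources)

    def __and__(self, other: "Job") -> "Job":
        # job & job : combine the producers; nothing starts running yet
        return Job(*self.sources, *other.sources)

    def __iter__(self):
        merged = queue.Queue()

        def run(source: Iterable[str]) -> None:
            try:
                for record in source:
                    merged.put(record)
            finally:
                merged.put(_DONE)  # always signal completion

        threads = [threading.Thread(target=run, args=(s,)) for s in self.sources]
        for thread in threads:
            thread.start()
        running = len(threads)
        while running:  # plays the role of the shell's `wait`
            item = merged.get()
            if item is _DONE:
                running -= 1
            else:
                yield item
        for thread in threads:
            thread.join()


lines = ["test one", "two", "three", "test four"]
# Roughly: ( head -2 x & tail -2 x & wait ) | grep test
found = sorted(r for r in Job(lines[:2]) & Job(lines[-2:]) if "test" in r)
print(found)
```

The result is sorted here only to make the output repeatable; without it the two matching records may appear in either order.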

Sample shell script: ¢ draft - pending a better (more interesting) suggestion ¢
<lang bash>aa="$(
  (
    head -4 < List_of_computer_scientists.lst;
    cat List_of_computer_scientists.lst | grep ALGOL | tee ALGOL_pioneers.lst;
    tail -4 List_of_computer_scientists.lst
  ) | sort | uniq | tee the_important_scientists.lst | grep aa
);
echo "Pioneer: $aa"</lang>
Input Records:

A test sample of scientists from wikipedia's "List of computer scientists"
Name	Areas of interest
Wil van der Aalst	business process management, process mining, Petri nets
Hal Abelson	intersection of computing and teaching
Serge Abiteboul	database theory
Samson Abramsky	game semantics
Leonard Adleman	RSA, DNA computing
Manindra Agrawal	polynomial-time primality testing
Luis von Ahn	human-based computation
Alfred Aho	compilers book, the 'a' in AWK
Stephen R. Bourne	Bourne shell, portable ALGOL 68C compiler
Kees Koster	ALGOL 68
Lambert Meertens	ALGOL 68, ABC (programming language)
Peter Naur	BNF, ALGOL 60
Guido van Rossum	Python (programming language)
Adriaan van Wijngaarden	Dutch pioneer; ARRA, ALGOL
Dennis E. Wisnosky	Integrated Computer-Aided Manufacturing (ICAM), IDEF
Stephen Wolfram	Mathematica
William Wulf	compilers
Edward Yourdon	Structured Systems Analysis and Design Method
Lotfi Zadeh	fuzzy logic
Arif Zaman	Pseudo-random number generator
Albert Zomaya	Australian pioneer of scheduling in parallel and distributed systems
Konrad Zuse	German pioneer of hardware and software

These records can be declared in any format appropriate to the specific language, e.g. table, array, list or text file.

Output:

Pioneer: Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL

ALGOL 68

See User defined pipe and redirection operators/ALGOL 68

Works with: ALGOL 68 version Revision 1; one minor extension - PRAGMA READ; one major extension - Algol68G's Currying.

Works with: ALGOL 68G version tested with release 1.18.0-9h.tiny.

File: Iterator_pipe_operators.a68
<lang algol68>MODE
  PAGEIN =         PAGE,
  PAGEAPPEND = REF PAGE,
  PAGEOUT =    REF PAGE;

MODE
  MOID = VOID,
  YIELDLINE = PROC(LINE)VOID,
  GENLINE = PROC(YIELDLINE)VOID,
  FILTER = PROC(GENLINE)GENLINE, # the classic shell filter #
  MANYTOONE = PROC([]GENLINE)GENLINE; # eg cat, as in con[cat]enate #

PRIO =: = 5, << = 5, >> = 5;

OP <  = (FILTER filter, PAGEIN page)GENLINE: filter(READ page),
   <  = (MANYTOONE cmd, PAGEIN page)GENLINE: cmd(READ page),
   << = (FILTER filter, PAGEIN page)GENLINE: filter(READ page),
   >  = (GENLINE gen, PAGEOUT page)VOID: gen(WRITE page),
   >> = (GENLINE gen, PAGEAPPEND page)VOID: gen(APPEND page),
   =: = (GENLINE gen, FILTER filter)GENLINE: filter(gen),
   =: = (GENLINE gen, MANYTOONE cmd)GENLINE: cmd(gen);</lang>
File: Iterator_pipe_utilities.a68

See User defined pipe and redirection operators/ALGOL 68

File: Iterator_pipe_page.a68

See User defined pipe and redirection operators/ALGOL 68

File: test_Iterator_pipe_page.a68
<lang algol68>#!/usr/local/bin/a68g --script #

# First define what kind of record (aka LINE) we are piping and filtering #
FORMAT line fmt = $xg$;
MODE
  LINE = STRING,
  PAGE = FLEX[0]LINE,
  BOOK = FLEX[0]PAGE;

PR READ "Iterator_pipe_page.a68" PR
PR READ "Iterator_pipe_operators.a68" PR
PR READ "Iterator_pipe_utilities.a68" PR

PAGE list of computer scientists = (

 "Wil van der Aalst - business process management, process mining, Petri nets",
 "Hal Abelson - intersection of computing and teaching",
 "Serge Abiteboul - database theory",
 "Samson Abramsky - game semantics",
 "Leonard Adleman - RSA, DNA computing",
 "Manindra Agrawal - polynomial-time primality testing",
 "Luis von Ahn - human-based computation",
 "Alfred Aho - compilers book, the 'a' in AWK",
 "Stephen R. Bourne - Bourne shell, portable ALGOL 68C compiler",
 "Kees Koster - ALGOL 68",
 "Lambert Meertens - ALGOL 68, ABC (programming language)",
 "Peter Naur - BNF, ALGOL 60",
 "Guido van Rossum - Python (programming language)",
 "Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL",
 "Dennis E. Wisnosky - Integrated Computer-Aided Manufacturing (ICAM), IDEF",
 "Stephen Wolfram - Mathematica",
 "William Wulf - compilers",
 "Edward Yourdon - Structured Systems Analysis and Design Method",
 "Lotfi Zadeh - fuzzy logic",
 "Arif Zaman - Pseudo-random number generator",
 "Albert Zomaya - Australian pioneer of scheduling in parallel and distributed systems",
 "Konrad Zuse - German pioneer of hardware and software"

);

PAGE algol pioneers list, the scientists list;
PAGE aa;

# Now do a bit of plumbing: #

cat((

   head(4, ) <  list of computer scientists,
   cat(READ list of computer scientists) =: grep("ALGOL", ) =: tee(WRITE algol pioneers list),
   tail(4, READ list of computer scientists)
 )) =: sort =: uniq =: tee(WRITE the scientists list) =: grep("aa", ) >> aa;

# Finally check the result: #

printf((

 $"Pioneer: "$, line fmt, aa, $l$,
 $"Number of Algol pioneers: "g(-0)$, UPB algol pioneers list, $l$,
 $"Number of scientists: "g(-0)$, UPB the scientists list, $l$

))</lang>
Output:

Pioneer:  Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL
Number of Algol pioneers: 6
Number of scientists: 15

J

If we ignore the gratuitous complexity requirements of this task, it boils down to this:

Step 0: get the data. The task does not specify how to get the data, so here I use lynx, which is readily available on most unix-like systems, including cygwin. Note that lynx needs to be in the OS PATH when running J.

<lang j>require 'task'
data=:<;._2 shell 'lynx -dump -nolist -width=999 http://en.wikipedia.org/wiki/List_of_computer_scientists'</lang>

Step 1: define task core algorithms:

Step 2: select and display the required data:

<lang j>   ;'aa' grep 'ALGOL' grep data
    * Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL
</lang>

As for the concept of a pipe that presents data one record at a time to a downstream function, that corresponds to the J operator @, and we could achieve the "left to right" syntax by explicitly ordering its arguments, as in 2 :'v@u', but it's not clear how to demonstrate that usefully in this task. (I could also write a lot of code to accomplish what the two successive greps accomplish here, but I find that concept distasteful and tedious.)

However, note also that J's sort (/:~) and uniq (~.) operations would work just fine on this kind of data. For example:

<lang j>   ;'aa' grep 'ALGOL' grep data,data
    * Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL
    * Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL
   ;'aa' grep ~. 'ALGOL' grep data,data
    * Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL
</lang>

That said, this implements most (perhaps all) of the required complexities:

<lang j>declare=: erase@boxopen
tee=: 4 :0
  if._1=nc boxopen x do.(x)=: '' end.
  (x)=: (do x),y
  y
)
grep=: 4 :'x (+./@E.S:0 # ]) y'
pipe=:2 :'v@(u"0)' NB. small pipe -- spoon feed one record at a time
PIPE=:2 :0         NB. big pipe -- feed everything all together
  v u y
:
  v (,x)"_ y        NB. syntactic sugar, beware of tooth decay
)
head=: {.
tail=: -@[ {. ]
sort=: /:~
uniq=: ~.
cat=: ]
echo=: smoutput@;

declare;:'ALGOL_pioneers the_important_scientists'
aa=: ;do TXT=:0 :0 -.LF
  (
    (
      4 head data
    ),(
      cat pipe
      ('ALGOL'&grep) pipe
      ('ALGOL_pioneers'&tee)
        data
    ),(
      4 tail data
  )) PIPE
  sort PIPE
  uniq PIPE
  ('the_important_scientists'&tee) PIPE
  ('aa'&grep)
)

echo 'Pioneer:';aa</lang>

This produces the result:

<lang>Pioneer: * Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL</lang>

Perl

Implementing only stream chaining, cat, grep and tee. Oddly enough, I don't feel the urge to implement all of the more-or-less-the-same features asked for by the task.
<lang perl>use strict;
use 5.10.0;

package IO::File;
sub readline { CORE::readline(shift) } # icing, not essential

package Stream;
use Exporter 'import';

# Only overload one operator. "file | stream" and "stream | stream"
# are not ambiguous like with shell commands.
use overload '|' => \&chain;
sub new {
	my $cls = shift;
	bless { args => [@_] }, ref $cls || $cls;
}

sub chain {
	my ($left, $right, $swap) = @_;
	($left, $right) = ($right, $left) if $swap;

	if (!ref $left) {
		my $h;
		open $h, $left and $left = $h or die $left
	}

	if (!ref $right) {
		# output file not implemented: don't know where I'd ever use it
		my $h;
		open $h, '>', $right and $right = $h or die $right
	}

	if (ref $left and $left->isa(__PACKAGE__)) {
		$left->{output} = $right;
	}

	if (ref $right and $right->isa(__PACKAGE__)) {
		$right->{input} = $left;
	}
	$right;
}

# Read a line and do something to it. By default it's this dummy
# pass-through function. Overriding it defines a subclass' behavior
sub transform { shift; shift }

sub readline {
	my $obj = shift;
	my $line;
	return $line = <STDIN> unless defined $obj->{input};

	while (1) {
		$line = $obj->{input}->readline or return;
		return $line if $line = $obj->transform($line);
	}
}

package Cat;
use parent -norequire, 'Stream';

# Dummy, exactly the same as Stream. Except now we can invoke
# as Cat::ter, instead of Stream::ter, which is not even a word
sub ter { Cat->new(@_) }

package Grep;
use parent -norequire, 'Stream';

sub transform {
	my ($obj, $line) = @_;
	for (@{$obj->{args}}) {
		return $line if ($line =~ $_)
	}
	return;
}

sub per { Grep->new(@_) }

package Tee;
use parent -norequire, 'Stream';
sub er{
	my $obj = Tee->new(@_);
	@{$obj->{tees}} = map { open my $h, '>', $_ or die $_; $h } @{$obj->{args}};
	delete $obj->{args};
	$obj
}

sub transform {
	my ($obj, $line) = @_;
	print $_ $line for @{$obj->{tees}};
	$line;
}

# $chain is a pipeline built with the overloaded '|';
# its construction is not shown in this listing.
print while $_ = $chain->readline;</lang>

Tcl

The syntax of redirections is slightly out, as they're inserted as explicit pipeline elements, and standard Tcl syntax is used to pull in results from sub-pipelines (because it is vastly simpler):
<lang tcl>package require Tcl 8.6

# Helpers
proc aspipe {input cmd args} {
    tailcall coroutine pipe[incr ::pipes] eval {yield [info coroutine];} \
        [list $cmd $input {*}$args] {;break}
}
proc forpipe {input var body} {
    upvar 1 $var v
    while {[llength [info commands $input]]} {
        set v [$input]
        uplevel 1 $body
    }
}

# Pipeline framework; parses, collects results as newline-separated lines

proc pipeline args {
    if {![llength $args]} {error "no pipeline components"}
    set p [aspipe {} eval {while {[gets stdin line]>=0} {yield $line}}]
    set oi -1
    foreach ni [lsearch -all [lappend args "|"] "|"] {
        set cmd [lrange $args [expr {$oi+1}] [expr {$ni-1}]]
        set p [aspipe $p {*}$cmd]
        set oi $ni
    }
    set accum {}
    forpipe $p line {
        lappend accum $line
    }
    return [join $accum \n]
}

# Pipeline implementations - redirections
proc << {in args} {
    foreach string $args {
        foreach line [split $string "\n"] {
            yield $line
        }
    }
}
proc < {in filename} {
    set f [open $filename]
    while {[gets $f line] >= 0} {
        yield $line
    }
    close $f
}
proc > {in filename} {
    set f [open $filename w]
    forpipe $in line {
        puts $f $line
    }
    close $f
}
proc >> {in filename} {
    set f [open $filename a]
    forpipe $in line {
        puts $f $line
    }
    close $f
}

# Pipeline implementations - "commands"
proc cat {in args} {
    foreach filename $args {
        if {$filename eq "-"} {
            forpipe $in line {
                yield $line
            }
        } else {
            set f [open $filename]
            while {[gets $f line] >= 0} {
                yield $line
            }
            close $f
        }
    }
}
proc head {in count} {
    forpipe $in line {
        if {[incr i] <= $count} {
            yield $line
        }
    }
}
proc tail {in count} {
    incr count -1
    set accum {}
    forpipe $in line {
        set accum [lrange [lappend accum $line] end-$count end]
    }
    foreach item $accum {yield $item}
}
proc grep {in RE} {
    forpipe $in line {
        if {[regexp $RE $line]} {yield $line}
    }
}
proc sort {in} {
    set accum {}
    forpipe $in line {
        lappend accum $line
    }
    foreach line [lsort $accum] {yield $line}
}
proc uniq {in} {
    forpipe $in line {
        if {![info exists prev] || $prev ne $line} {
            yield $line
        }
        set prev $line
    }
}
proc wc {in {type "words"}} {
    set count 0
    switch $type {
        words { set RE {\S+} }
        lines { set RE {.*} }
    }
    forpipe $in line {
        incr count [regexp -all $RE $line]
    }
    yield $count
}
proc tee {in filename} {
    set f [open $filename w]
    forpipe $in line {
        puts $f $line
        yield $line
    }
    close $f
}</lang>
Sample pipeline:
<lang tcl>set file "List_of_computer_scientists.lst"
set aa [pipeline \
    << [pipeline < $file | head 4] [pipeline < $file | grep ALGOL | tee "ALGOL_pioneers.txt"] [pipeline < $file | tail 4] \
    | sort | uniq | tee "the_important_scientists.lst" | grep aa]
puts "Pioneer: $aa"</lang>