User defined pipe and redirection operators
If the language supports operator definition, then:
- create "user defined" equivalents of the Unix shell "<", "|", ">", "<<", ">>" and $(cmd) operators.
- Provide simple equivalents of cat, tee, grep and uniq, but as filters/procedures native to the specific language.
- Replicate the sample shell script below in the specific language.
- Specifically do not cache the entire stream before the subsequent filter/procedure starts. Pass each record on as soon as available through each of the filters/procedures in the chain.
Alternately: if the language does not support operator definition then replace with:
- define the procedures: input(cmd,stream), pipe(stream,cmd), output(stream, stream), whereis(array), append(stream)
For bonus Kudos: Implement the shell "&" concept as a dyadic operator in the specific language. e.g.: <lang bash>( head x & tail x & wait ) | grep test</lang>
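The streaming requirement above (pass each record on as soon as it is available) can be sketched with Python generators. This is only an illustration under stated assumptions, not a reference solution: the `Filter` class and the `traced` helper are hypothetical names introduced here, and `grep` does plain substring matching rather than regular expressions.

```python
class Filter:
    """Wraps a function from an iterator of records to an iterator of records."""
    def __init__(self, fn):
        self.fn = fn

    def __ror__(self, source):
        # Called for `source | filter` when `source` (a list or a generator)
        # does not itself handle `|`. Nothing is consumed until iteration.
        return self.fn(iter(source))

def grep(pattern):
    # Substring match only (an assumption made to keep the sketch short).
    return Filter(lambda records: (r for r in records if pattern in r))

def uniq():
    def run(records):
        previous = object()  # sentinel that equals no record
        for r in records:
            if r != previous:
                yield r
            previous = r
    return Filter(run)

def tee(sink):
    def run(records):
        for r in records:
            sink.append(r)  # copy to the side list, then pass the record along
            yield r
    return Filter(run)

def traced(name, trace):
    # Hypothetical helper: records when each line passes this point.
    def run(records):
        for r in records:
            trace.append((name, r))
            yield r
    return Filter(run)

records = [
    "Kees Koster - ALGOL 68",
    "Kees Koster - ALGOL 68",
    "Stephen Wolfram - Mathematica",
    "Peter Naur - BNF, ALGOL 60",
]
trace, pioneers = [], []
result = list(records | traced("in", trace) | uniq() | grep("ALGOL")
              | tee(pioneers) | traced("out", trace))
```

In `trace`, the "out" entry for the first record appears before the "in" entry for the second, which shows that no stage buffers the whole stream before the next one starts.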
Sample shell script: ¢ draft - pending a better (more interesting) suggestion ¢
<lang bash>aa="$(
  (
    head -4 < List_of_computer_scientists.lst;
    cat List_of_computer_scientists.lst | grep ALGOL | tee ALGOL_pioneers.lst;
    tail -4 List_of_computer_scientists.lst
  ) | sort | uniq | tee the_important_scientists.lst | grep aa
); echo "Pioneer: $aa"</lang>
Input Records:
Name | Areas of interest |
---|---|
Wil van der Aalst | business process management, process mining, Petri nets |
Hal Abelson | intersection of computing and teaching |
Serge Abiteboul | database theory |
Samson Abramsky | game semantics |
Leonard Adleman | RSA, DNA computing |
Manindra Agrawal | polynomial-time primality testing |
Luis von Ahn | human-based computation |
Alfred Aho | compilers book, the 'a' in AWK |
Stephen R. Bourne | Bourne shell, portable ALGOL 68C compiler |
Kees Koster | ALGOL 68 |
Lambert Meertens | ALGOL 68, ABC (programming language) |
Peter Naur | BNF, ALGOL 60 |
Guido van Rossum | Python (programming language) |
Adriaan van Wijngaarden | Dutch pioneer; ARRA, ALGOL |
Dennis E. Wisnosky | Integrated Computer-Aided Manufacturing (ICAM), IDEF |
Stephen Wolfram | Mathematica |
William Wulf | compilers |
Edward Yourdon | Structured Systems Analysis and Design Method |
Lotfi Zadeh | fuzzy logic |
Arif Zaman | Pseudo-random number generator |
Albert Zomaya | Australian pioneer of scheduling in parallel and distributed systems |
Konrad Zuse | German pioneer of hardware and software |
These records can be declared in any format appropriate to the specific language, e.g. table, array, list or text file.
Output:
Pioneer: Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL
ALGOL 68
See User defined pipe and redirection operators/ALGOL 68
File: Iterator_pipe_operators.a68
<lang algol68>MODE
  PAGEIN = PAGE,
  PAGEAPPEND = REF PAGE,
  PAGEOUT = REF PAGE;

MODE
  MOID = VOID,
  YIELDLINE = PROC(LINE)VOID,
  GENLINE = PROC(YIELDLINE)VOID,
  FILTER = PROC(GENLINE)GENLINE, # the classic shell filter #
  MANYTOONE = PROC([]GENLINE)GENLINE; # eg cat, as in con[cat]enate #

PRIO =: = 5, << = 5, >> = 5;

OP < = (FILTER filter, PAGEIN page)GENLINE: filter(READ page),
   < = (MANYTOONE cmd, PAGEIN page)GENLINE: cmd(READ page),
   << = (FILTER filter, PAGEIN page)GENLINE: filter(READ page),
   > = (GENLINE gen, PAGEOUT page)VOID: gen(WRITE page),
   >> = (GENLINE gen, PAGEAPPEND page)VOID: gen(APPEND page),
   =: = (GENLINE gen, FILTER filter)GENLINE: filter(gen),
   =: = (GENLINE gen, MANYTOONE cmd)GENLINE: cmd(gen);</lang>
File: Iterator_pipe_utilities.a68
File: Iterator_pipe_page.a68
File: test_Iterator_pipe_page.a68
<lang algol68>#!/usr/local/bin/a68g --script #

# First define what kind of record (aka LINE) we are piping and filtering #
FORMAT line fmt = $xg$;
MODE
  LINE = STRING,
  PAGE = FLEX[0]LINE,
  BOOK = FLEX[0]PAGE;

PR READ "Iterator_pipe_page.a68" PR
PR READ "Iterator_pipe_operators.a68" PR
PR READ "Iterator_pipe_utilities.a68" PR

PAGE list of computer scientists = (
  "Wil van der Aalst - business process management, process mining, Petri nets",
  "Hal Abelson - intersection of computing and teaching",
  "Serge Abiteboul - database theory",
  "Samson Abramsky - game semantics",
  "Leonard Adleman - RSA, DNA computing",
  "Manindra Agrawal - polynomial-time primality testing",
  "Luis von Ahn - human-based computation",
  "Alfred Aho - compilers book, the 'a' in AWK",
  "Stephen R. Bourne - Bourne shell, portable ALGOL 68C compiler",
  "Kees Koster - ALGOL 68",
  "Lambert Meertens - ALGOL 68, ABC (programming language)",
  "Peter Naur - BNF, ALGOL 60",
  "Guido van Rossum - Python (programming language)",
  "Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL",
  "Dennis E. Wisnosky - Integrated Computer-Aided Manufacturing (ICAM), IDEF",
  "Stephen Wolfram - Mathematica",
  "William Wulf - compilers",
  "Edward Yourdon - Structured Systems Analysis and Design Method",
  "Lotfi Zadeh - fuzzy logic",
  "Arif Zaman - Pseudo-random number generator",
  "Albert Zomaya - Australian pioneer of scheduling in parallel and distributed systems",
  "Konrad Zuse - German pioneer of hardware and software"
);

PAGE algol pioneers list, the scientists list;
PAGE aa;

# Now do a bit of plumbing: #
cat((
    head(4, ) < list of computer scientists,
    cat(READ list of computer scientists) =: grep("ALGOL", ) =: tee(WRITE algol pioneers list),
    tail(4, READ list of computer scientists)
  )) =: sort =: uniq =: tee(WRITE the scientists list) =: grep("aa", ) >> aa;

# Finally check the result: #
printf((
  $"Pioneer: "$, line fmt, aa, $l$,
  $"Number of Algol pioneers: "g(-0)$, UPB algol pioneers list, $l$,
  $"Number of scientists: "g(-0)$, UPB the scientists list, $l$
))</lang>
Output:
Pioneer: Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL
Number of Algol pioneers: 6
Number of scientists: 15
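The MODE declarations in Iterator_pipe_operators.a68 describe a generator (GENLINE) as a procedure that is handed a yield callback (YIELDLINE), and a FILTER as a procedure from generator to generator. The same callback style can be sketched in Python as below. The utilities file is not shown on this page, so the `read`, `write` and `grep` functions here are assumptions about what such helpers might look like, not a translation of the actual ALGOL 68 code.

```python
from typing import Callable, List

YieldLine = Callable[[str], None]      # cf. YIELDLINE = PROC(LINE)VOID
GenLine = Callable[[YieldLine], None]  # cf. GENLINE = PROC(YIELDLINE)VOID
Filter = Callable[[GenLine], GenLine]  # cf. FILTER = PROC(GENLINE)GENLINE

def read(page: List[str]) -> GenLine:
    """Turn a page into a generator that pushes each line to a callback."""
    def gen(yield_line: YieldLine) -> None:
        for line in page:
            yield_line(line)
    return gen

def write(page: List[str]) -> YieldLine:
    """Assumed behaviour: empty the target page, then collect lines into it."""
    page.clear()
    return page.append

def grep(pattern: str) -> Filter:
    """Substring filter; each matching line is pushed downstream immediately."""
    def filt(upstream: GenLine) -> GenLine:
        def gen(yield_line: YieldLine) -> None:
            def on_line(line: str) -> None:
                if pattern in line:
                    yield_line(line)
            upstream(on_line)
        return gen
    return filt

page = [
    "Kees Koster - ALGOL 68",
    "Stephen Wolfram - Mathematica",
    "Peter Naur - BNF, ALGOL 60",
]
out: List[str] = []
# Roughly: READ page =: grep("ALGOL", ) > out
grep("ALGOL")(read(page))(write(out))
```

Because every stage is driven by a callback, a line reaches `out` before the next line of `page` is read; no stage holds the full stream.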
J
If we ignore the gratuitous complexity requirements of this task, it boils down to this:
Step 0: get the data. The task does not specify how to get the data, so here I use lynx, which is readily available on most Unix-like systems, including Cygwin. Note that lynx needs to be in the OS PATH when running J.
<lang j>require 'task'
data=:<;._2 shell 'lynx -dump -nolist -width=999 http://en.wikipedia.org/wiki/List_of_computer_scientists'</lang>
Step 1: define task core algorithms:
<lang j>grep=: +./@E.S:0 # ]</lang>
Step 2: select and display the required data:
<lang j> ;'aa' grep 'ALGOL' grep data
* Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL
</lang>
As for the concept of a pipe that presents data one record at a time to a downstream function, that corresponds to the J operator @, and we could achieve the "left to right" syntax mechanism by explicitly ordering its arguments, 2 :'v@u', but it's not clear how to demonstrate that usefully in this task. (And I could write a lot of code to accomplish what's being accomplished here with the two successive greps, but I find that concept distasteful and tedious.)
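For readers who do not know J, the left-to-right ordering mentioned above can be sketched in Python as ordinary function composition with the arguments swapped. This is a loose analogue only: the names `then` and `chain` are hypothetical, and each stage here takes the whole list at once rather than one record at a time.

```python
from functools import reduce

def then(u, v):
    """Left-to-right composition: apply u first, then v."""
    return lambda y: v(u(y))

def chain(*stages):
    """Compose any number of stages in reading order."""
    return reduce(then, stages)

def grep(pattern):
    # Substring match over a whole list of records.
    return lambda records: [r for r in records if pattern in r]

data = [
    "Kees Koster - ALGOL 68",
    "Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL",
    "Stephen Wolfram - Mathematica",
]
# Same selection as the two successive greps, written left to right.
select = chain(grep("ALGOL"), grep("aa"))
selected = select(data)
```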
However, note also that J's sort (/:~) and uniq (~.) operations would work just fine on this kind of data. For example:
<lang j>   ;'aa' grep 'ALGOL' grep data,data
 * Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL * Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL
   ;'aa' grep ~. 'ALGOL' grep data,data
 * Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL
</lang>
That said, this implements most (perhaps all) of the required complexities:
<lang j>declare=: erase@boxopen
tee=: 4 :0
  if._1=nc boxopen x do.(x)=: '' end.
  (x)=: (do x),y
  y
)
grep=: 4 :'x (+./@E.S:0 # ]) y'
pipe=:2 :'v@(u"0)'  NB. small pipe -- spoon feed one record at a time
PIPE=:2 :0          NB. big pipe -- feed everything all together
  v u y
:
  v (,x)"_ y  NB. syntactic sugar, beware of tooth decay
)
head=: {.
tail=: -@[ {. ]
sort=: /:~
uniq=: ~.
cat=: ]
echo=: smoutput@;

declare;:'ALGOL_pioneers the_important_scientists'
aa=: ;do TXT=:0 :0 -.LF
 (
  ( 4 head data ),
  ( cat pipe ('ALGOL'&grep) pipe ('ALGOL_pioneers'&tee) data ),
  ( 4 tail data )
 ) PIPE sort PIPE uniq PIPE ('the_important_scientists'&tee) PIPE ('aa'&grep)
)

echo 'Pioneer:';aa</lang>
This produces the result:
<lang>Pioneer: * Adriaan van Wijngaarden - Dutch pioneer; ARRA, ALGOL</lang>
Perl
Implementing only stream chaining, cat, grep and tee. Oddly enough, I don't feel the urge to implement all of the more-or-less-the-same features asked for by the task.
<lang perl>use strict;
use 5.10.0;

package IO::File;
sub readline { CORE::readline(shift) } # icing, not essential

package Stream;
use Exporter 'import';

# Only overload one operator.  "file | stream" and "stream | stream"
# are not ambiguous like with shell commands.
use overload '|' => \&chain;
sub new {
	my $cls = shift;
	bless { args => [@_] }, ref $cls || $cls;
}

sub chain {
	my ($left, $right, $swap) = @_;
	($left, $right) = ($right, $left) if $swap;

	if (!ref $left) {
		my $h;
		open $h, $left and $left = $h or die $left
	}

	if (!ref $right) {
		# output file not implemented: don't know where I'd ever use it
		my $h;
		open $h, '>', $right and $right = $h or die $right
	}

	if (ref $left and $left->isa(__PACKAGE__)) {
		$left->{output} = $right;
	}

	if (ref $right and $right->isa(__PACKAGE__)) {
		$right->{input} = $left;
	}
	$right;
}

# Read a line and do something to it.  By default it's this dummy
# pass-through function.  Overriding it defines a subclass' behavior
sub transform { shift; shift }

sub readline {
	my $obj = shift;
	my $line;
	return $line = <STDIN> unless defined $obj->{input};

	while (1) {
		$line = $obj->{input}->readline or return;
		return $line if $line = $obj->transform($line);
	}
}

package Cat;
use parent -norequire, 'Stream';

# Dummy, exactly the same as Stream.  Except now we can invoke
# as Cat::ter, instead of Stream::ter, which is not even a word
sub ter { Cat->new(@_) }

package Grep;
use parent -norequire, 'Stream';

sub transform {
	my ($obj, $line) = @_;
	for (@{$obj->{args}}) {
		return $line if ($line =~ $_)
	}
	return;
}

sub per { Grep->new(@_) }

package Tee;
use parent -norequire, 'Stream';
sub er{
	my $obj = Tee->new(@_);
	@{$obj->{tees}} = map {
		open my $h, '>', $_ or die $_;
		$h
	} @{$obj->{args}};
	delete $obj->{args};
	$obj
}

sub transform {
	my ($obj, $line) = @_;
	print $_ $line for @{$obj->{tees}};
	$line;
}

package main;
my $chain = '/etc/services'	# head of chain; omit to use STDIN
	| Cat::ter		# don't really need this line
	| Grep::per(qr/tcp/)
	| Tee::er('/tmp/t1', '/tmp/t2')
	| Grep::per(qr/170/)
	| Tee::er('/tmp/t3')
	;

print while $_ = $chain->readline;</lang>
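The design above (each stage pulls one line from its upstream stage, and a single overloaded "|" links the stages) can be sketched in Python for comparison. This is an illustrative sketch, not a port: it uses in-memory `io.StringIO` objects instead of files, the sample input lines are made up, and the reflected method `__ror__` stands in for the `$swap` handling of the Perl `chain` routine.

```python
import io
import re

class Stream:
    """Pull-based stage: readline() asks the upstream stage for one line."""
    def __init__(self):
        self.input = None

    def __or__(self, right):
        # stream | stream: link the right-hand stage to this one.
        # (A non-Stream right operand, e.g. an output file, is not handled.)
        right.input = self
        return right

    def __ror__(self, left):
        # file | stream: Python calls the reflected method because the
        # file-like object on the left does not implement `|`.
        self.input = left
        return self

    def transform(self, line):
        return line  # pass-through; subclasses override this

    def readline(self):
        while True:
            line = self.input.readline()
            if not line:          # empty string signals end of input
                return ""
            out = self.transform(line)
            if out:
                return out

class Grep(Stream):
    def __init__(self, pattern):
        super().__init__()
        self.pattern = re.compile(pattern)

    def transform(self, line):
        return line if self.pattern.search(line) else None

class Tee(Stream):
    def __init__(self, *sinks):
        super().__init__()
        self.sinks = sinks

    def transform(self, line):
        for sink in self.sinks:
            sink.write(line)
        return line

# Made-up sample input, standing in for a services-style file.
source = io.StringIO("ftp 21/tcp\ndomain 53/udp\nprint-srv 170/tcp\n")
copy = io.StringIO()
chain = source | Grep(r"tcp") | Tee(copy) | Grep(r"170")

lines = []
while (line := chain.readline()):
    lines.append(line)
```

Defining both `__or__` and `__ror__` matters here: Python does not try the reflected method when both operands have the same type, so a chain such as `Grep(...) | Grep(...)` relies on `__or__`.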
Tcl
The syntax of redirections differs slightly from the shell's, as they are inserted as explicit pipeline elements, and standard Tcl syntax is used to pull in results from sub-pipelines (because it is vastly simpler):
<lang tcl>package require Tcl 8.6

# Helpers
proc aspipe {input cmd args} {
    tailcall coroutine pipe[incr ::pipes] eval {yield [info coroutine];} \
	[list $cmd $input {*}$args] {;break}
}
proc forpipe {input var body} {
    upvar 1 $var v
    while {[llength [info commands $input]]} {
	set v [$input]
	uplevel 1 $body
    }
}

# Pipeline framework; parses, collects results as newline-separated lines
proc pipeline args {
    if {![llength $args]} {error "no pipeline components"}
    set p [aspipe {} eval {while {[gets stdin line]>=0} {yield $line}}]
    set oi -1
    foreach ni [lsearch -all [lappend args "|"] "|"] {
	set cmd [lrange $args [expr {$oi+1}] [expr {$ni-1}]]
	set p [aspipe $p {*}$cmd]
	set oi $ni
    }
    set accum {}
    forpipe $p line {
	lappend accum $line
    }
    return [join $accum \n]
}

# Pipeline implementations - redirections
proc << {in args} {
    foreach string $args {
	foreach line [split $string "\n"] {
	    yield $line
	}
    }
}
proc < {in filename} {
    set f [open $filename]
    while {[gets $f line] >= 0} {
	yield $line
    }
    close $f
}
proc > {in filename} {
    set f [open $filename w]
    forpipe $in line {
	puts $f $line
    }
    close $f
}
proc >> {in filename} {
    set f [open $filename a]
    forpipe $in line {
	puts $f $line
    }
    close $f
}

# Pipeline implementations - "commands"
proc cat {in args} {
    foreach filename $args {
	if {$filename eq "-"} {
	    forpipe $in line {
		yield $line
	    }
	} else {
	    set f [open $filename]
	    while {[gets $f line] >= 0} {
		yield $line
	    }
	    close $f
	}
    }
}
proc head {in count} {
    forpipe $in line {
	if {[incr i] <= $count} {
	    yield $line
	}
    }
}
proc tail {in count} {
    incr count -1
    set accum {}
    forpipe $in line {
	set accum [lrange [lappend accum $line] end-$count end]
    }
    foreach item $accum {yield $item}
}
proc grep {in RE} {
    forpipe $in line {
	if {[regexp $RE $line]} {yield $line}
    }
}
proc sort {in} {
    set accum {}
    forpipe $in line {
	lappend accum $line
    }
    foreach line [lsort $accum] {yield $line}
}
proc uniq {in} {
    forpipe $in line {
	if {![info exists prev] || $prev ne $line} {
	    yield $line
	}
	set prev $line
    }
}
proc wc {in {type "words"}} {
    set count 0
    switch $type {
	words { set RE {\S+} }
	lines { set RE {.*} }
    }
    forpipe $in line {
	incr count [regexp -all $RE $line]
    }
    yield $count
}
proc tee {in filename} {
    set f [open $filename w]
    forpipe $in line {
	puts $f $line
	yield $line
    }
    close $f
}</lang>
Sample pipeline:
<lang tcl>set file "List_of_computer_scientists.lst"
set aa [pipeline \
    << [pipeline < $file | head 4] [pipeline < $file | grep ALGOL | tee "ALGOL_pioneers.txt"] [pipeline < $file | tail 4] \
    | sort | uniq | tee "the_important_scientists.lst" | grep aa]
puts "Pioneer: $aa"</lang>
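The parsing step of the `pipeline` procedure (split the argument list at every "|" token, then feed each stage the output of the previous one) can be sketched in Python as follows. It is a simplified illustration under stated assumptions: `COMMANDS` and `split_stages` are hypothetical names, only two commands are provided, `grep` matches substrings rather than regular expressions, and the input comes from a list instead of stdin.

```python
from itertools import islice

def head(records, count):
    # Lazily pass on at most `count` records.
    return islice(records, int(count))

def grep(records, pattern):
    # Substring match (the Tcl version uses regexp).
    return (r for r in records if pattern in r)

# Hypothetical registry mapping command words to generator functions.
COMMANDS = {"head": head, "grep": grep}

def split_stages(args):
    """Split a flat argument list into stages at each "|" token."""
    stages, current = [], []
    for arg in args:
        if arg == "|":
            stages.append(current)
            current = []
        else:
            current.append(arg)
    stages.append(current)
    return stages

def pipeline(source, *args):
    """Chain the stages lazily and join the results with newlines."""
    if not args:
        raise ValueError("no pipeline components")
    stream = iter(source)
    for name, *params in split_stages(args):
        stream = COMMANDS[name](stream, *params)
    return "\n".join(stream)

data = [
    "Kees Koster - ALGOL 68",
    "Peter Naur - BNF, ALGOL 60",
    "Stephen Wolfram - Mathematica",
]
first_pioneer = pipeline(data, "grep", "ALGOL", "|", "head", "1")
```

As in the Tcl version, the stages are only connected during parsing; records start flowing when the final result is collected.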