Native shebang

From Rosetta Code
Revision as of 00:59, 14 July 2015 by Peak



Native shebang is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

In short: Use the specimen language (native) for "scripting".

Example
If your language is "foo", then the test case of "echo.foo" runs in a terminal as "./echo.foo Hello, world!".


In long: Create a program (in the specimen language) that will automatically compile a test case (of the same specimen language) to a native binary executable and then transparently load and run this test case executable.

Make it so that all that is required is a custom shebang at the start of the test case, e.g. "#!/usr/local/bin/script_foo".

Importantly: This task must be coded strictly in the specimen language, neither using a shell script nor any other 3rd language.

Optimise this process so that the test program binary executable is only created if the original test program source code has been touched/edited.

Note: If the language (or a specific implementation) handles this automatically, then simply provide an example of "echo.foo".


Background:

Simple shebangs can help with scripting, e.g. "#!/usr/bin/env python" at the top of a Python script will allow it to be run in a terminal as "./script.py".
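
To see what a shebang line actually does, it helps to look at the argument vector that ends up being executed. The sketch below models Linux behaviour as an assumption: everything after the interpreter path is passed on as one single argument, followed by the script path and then the user's arguments. Other systems may split the line differently, and `shebang_argv` is a hypothetical helper name used only for illustration.

```python
def shebang_argv(first_line, script_path, args):
    """Model (Linux-style, assumed) of the argv built from a shebang line.

    first_line  -- the first line of the script, e.g. "#!/usr/bin/env python"
    script_path -- the path the user typed, e.g. "./script.py"
    args        -- the remaining command-line arguments
    """
    if not first_line.startswith("#!"):
        raise ValueError("not a shebang line")
    rest = first_line[2:].strip()
    if not rest:
        raise ValueError("no interpreter given")
    # Split once: the interpreter path, then (optionally) ONE argument
    # holding the whole remainder of the line.
    parts = rest.split(None, 1)
    interpreter = parts[0]
    optional = [parts[1]] if len(parts) > 1 else []
    return [interpreter] + optional + [script_path] + list(args)
```

Under this model, running "./script.py Hello, world!" with the shebang above executes /usr/bin/env with the arguments "python", "./script.py", "Hello," and "world!".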

The task Multiline shebang largely demonstrates how to use "shell" code in the shebang to compile and/or run source-code from a 3rd language, typically "#!/bin/bash" or "#!/bin/sh".


This task:

However, in this Native shebang task we go native. In the shebang, instead of running a shell, we call a binary executable generated from the original native language, e.g. when using C with gcc, "#!/usr/local/bin/script_gcc" to extract, compile and run the native "script" source code.

Other small innovations required of this Native shebang task:

  • Cache the executable in some appropriate place in a path, dependent on available write permissions.
  • Generate a new cached executable only when the source has been touched.
  • If a cached executable is available, then run it instead of generating a new one.
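
The caching rules above can be sketched as two small checks. This is only an illustration in Python, not part of any solution on this page; the helper names `first_writable` and `is_fresh` are hypothetical, and "touched" is assumed to mean that the source's modification time is newer than the cached executable's.

```python
import os

def first_writable(dirs):
    """Return the first directory in dirs that exists and is writable, else None."""
    for d in dirs:
        if os.path.isdir(d) and os.access(d, os.W_OK):
            return d
    return None

def is_fresh(src, exe):
    """True if exe exists, is executable, and is at least as new as src."""
    return (os.path.isfile(exe)
            and os.access(exe, os.X_OK)
            and os.path.getmtime(exe) >= os.path.getmtime(src))
```

A launcher would run the cached executable when `is_fresh` holds for some candidate directory, and otherwise compile into the directory returned by `first_writable`.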

Difficulties:

  • Naturally, some languages are not compiled. These languages are forced to use shebang executables from another language, e.g. "#!/usr/bin/env python" uses the C binaries /usr/bin/env and /usr/bin/python. If this is the case, then simply document the details of the case.
  • In a perfect world, the test file (e.g. echo.c) would still be a valid program, and would compile without error using the native compiler (e.g. gcc for echo.c). The problem is that "#!" is syntactically incorrect in many languages, while in others it can be parsed as a comment.
  • The "test binary" should be exec-ed and hence retain the original Process identifier.
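
For languages where "#!" is not valid syntax, the launcher has to hide the shebang line from the compiler. The solutions on this page drop the first line; the variant sketched below (an assumption for illustration, with the hypothetical name `strip_shebang`) blanks the line instead, so that line numbers in compiler diagnostics still match the original file.

```python
def strip_shebang(source):
    """Return source with a leading shebang line blanked out.

    The newline that ended the shebang line is kept, so the number of
    lines (and hence compiler line numbers) stays the same.
    """
    if source.startswith("#!"):
        _, newline, rest = source.partition("\n")
        return newline + rest
    return source
```

Text that does not start with "#!" is returned unchanged, so the same function can be applied to sources with or without a shebang.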

Test case:

  • Create a simple "script file" (in the same native language) called "echo" then use the "script" to output "Hello, world!"

ALGOL 68

Using ALGOL 68G to script ALGOL 68G

Works with: ALGOL 68 version Revision 1.
Works with: ALGOL 68G version Any - Tested with release algol68g-2.7.

Note: With Algol68G the option "-O3" will compile the script file to a ".so" file, this ".so" file is a binary executable and dynamically loaded library. Also note that this ".so" will only be generated if the ".a68" source file has been touched.

File: echo.a68
<lang algol68>#!/usr/bin/a68g --script #
# -*- coding: utf-8 -*- #

STRING ofs := "";
FOR i FROM 4 TO argc DO print((ofs, argv(i))); ofs:=" " OD</lang>

Test Execution:

$ ./echo.a68 Hello, world!
Output:
Hello, world!

C

Using gcc to script C

This example is incorrect. Please fix the code and remove this message.

Details: Talk:Native shebang#Problems: "The C example doesn't work for me (unless a segmentation fault from script_gcc.sh can be described as "working" or a bad interpreter error from echo.c can be described as "working"). --Rdm"

File: script_gcc.c
<lang c>#!/usr/local/bin/script_gcc.sh
/* Optional: this C code initially is-being/can-be boot strapped (compiled) using bash script_gcc.sh */
#include <errno.h>
#include <libgen.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* the actual shebang for C target scripts is:
#!/usr/local/bin/script_gcc.c
*/

/* general readability constants */
typedef char /* const */ *STRING;
typedef enum{FALSE=0, TRUE=1} BOOL;
const STRING ENDCAT = NULL;

/* script_gcc.c specific constants */
#define DIALECT "c" /* or cpp */
const STRING
 CC="gcc",
 COPTS="-lm -x "DIALECT,
 IEXT="."DIALECT,
 OEXT=".out";
const BOOL OPT_CACHE = TRUE;

/* general utility procedures */
char strcat_out[BUFSIZ];

/* concatenate all arguments up to ENDCAT into a freshly allocated string */
STRING STRCAT(STRING argv, ... ){
 va_list ap;
 va_start(ap, argv);
 STRING arg;
 strcat_out[0]='\0';
 for(arg=argv; arg != ENDCAT; arg=va_arg(ap, STRING)){
    /* the limit for strncat is the space still free, not the whole buffer */
    strncat(strcat_out, arg, sizeof strcat_out - strlen(strcat_out) - 1);
 }
 va_end(ap);
 return strndup(strcat_out, sizeof strcat_out);
}

char itoa_out[BUFSIZ];

STRING itoa(int i){
 sprintf(itoa_out, "%d", i);
 return itoa_out;
}

time_t modtime(STRING filename){
 struct stat buf;
 if(stat(filename, &buf) != EXIT_SUCCESS){
   perror(filename);
   return 0; /* buf is not filled in when stat fails */
 }
 return buf.st_mtime;
}

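/* script_gcc specific procedure */
BOOL compile(STRING srcpath, STRING binpath){
 int out;
 STRING compiler_command=STRCAT(CC, " ", COPTS, " -o ", binpath, " -", ENDCAT);
 FILE *src=fopen(srcpath, "r"),
      *compiler;
 char buf[BUFSIZ];
 BOOL shebang;
 if(src == NULL){ perror(srcpath); return EXIT_FAILURE; }
 compiler=popen(compiler_command, "w");
 if(compiler == NULL){ perror(compiler_command); fclose(src); return EXIT_FAILURE; }
/* skip the first (shebang) line and pipe the rest to the compiler */
 for(shebang=TRUE; fgets(buf, sizeof buf, src); shebang=FALSE)
   if(!shebang)fwrite(buf, strlen(buf), 1, compiler);
 fclose(src);
 out=pclose(compiler);
 return out;
}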

int main(int argc, STRING *argv, STRING *envp){
 if(argc < 2){ /* argv[1] must name the script source file */
   fprintf(stderr, "usage: %s source-file [args...]\n", argv[0]);
   return EXIT_FAILURE;
 }
 STRING binpath,
        srcpath=argv[1],
        argv0_basename=STRCAT(basename((char*)srcpath /*, .DIALECT */), ENDCAT),
        *dirnamew, *dirnamex;
 argv++; /* shift */

/* Warning: current dir "." is in path, AND * /tmp directories are common/shared */
 STRING paths[] = {
   dirname(strdup(srcpath)), /* dirname may modify its argument, so pass a copy */
   STRCAT(getenv("HOME"), "/bin", ENDCAT),
   "/usr/local/bin",
   ".",
   STRCAT(getenv("HOME"), "/tmp", ENDCAT),
   getenv("HOME"),
   STRCAT(getenv("HOME"), "/Desktop", ENDCAT),
/* "/tmp" ... a bit of a security hole */
   ENDCAT
 };
/* find the first writable directory for the executable */
 for(dirnamew = paths; *dirnamew; dirnamew++){
   if(access(*dirnamew, W_OK) == EXIT_SUCCESS) break;
 }

/* if a CACHEd copy is not to be kept, then fork a sub-process to unlink the .out file */
 if(OPT_CACHE == FALSE){
   binpath=STRCAT(*dirnamew, "/", argv0_basename, itoa(getpid()), OEXT, ENDCAT);
   if(compile(srcpath, binpath) == EXIT_SUCCESS){
     if(fork() == 0){
       /* child: give the parent time to exec, then remove the temporary binary */
       usleep(100000); unlink(binpath); _exit(EXIT_SUCCESS);
     } else {
       /* parent: exec, so the test binary keeps the original process id */
       execvp(binpath, argv);
     }
   }
 } else {
/* else a CACHEd copy is kept, so find it */
   time_t modtime_srcpath = modtime(srcpath);
   for(dirnamex = paths; *dirnamex; dirnamex++){
     binpath=STRCAT(*dirnamex, "/", argv0_basename, OEXT, ENDCAT);
     if((access(binpath, X_OK) == EXIT_SUCCESS) && (modtime(binpath) >= modtime_srcpath))
       execvp(binpath, argv);
   }
 }
/* no usable cached copy: compile one and run it */
 binpath=STRCAT(*dirnamew, "/", argv0_basename, OEXT, ENDCAT);
 if(compile(srcpath, binpath) == EXIT_SUCCESS)
   execvp(binpath, argv);
 perror(STRCAT(binpath, ": executable not available", ENDCAT));
 exit(errno);
}</lang>

Test Source File: echo.c
<lang c>#!/usr/local/bin/script_gcc.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char **argv, char **envp){
 char ofs = '\0';
 for(argv++; *argv; argv++){
   if(ofs)putchar(ofs); else ofs=' ';
   fwrite(*argv, strlen(*argv), 1, stdout);
 }
 putchar('\n');
 exit(EXIT_SUCCESS);
}</lang>

Test Execution:

$ ./echo.c Hello, world!
Output:
Hello, world!

UNIX Shell

Using sh to script sh

In strictly shell this is natural, native and easy:

File: echo.sh
<lang sh>#!/bin/sh
echo "$@"</lang>

Usage:

./echo.sh Hello, world!
Output:
Hello, world!

Using bash to script C

Works with: Bourne Again SHell

Note: this Native shebang task does not exactly apply to bash because bash is interpreted, but as a skeleton template the following script is an example of how compiled languages can implement the shebang. Also: this bash code can be used to automatically compile the C code in /usr/local/bin/script_gcc.c above.

File: script_gcc.sh
<lang bash>#!/bin/bash

# Actual shebang when using bash:
#!/usr/local/bin/script_gcc.sh

# Alternative shebang when using bash:
#!/bin/bash /usr/local/bin/script_gcc.sh

# CACHE=No # to turn off caching...

# Note: this shell should be re-written in actual C! :-)

DIALECT=c # or cpp
CC="gcc"
COPTS="-lm -x $DIALECT"
IEXT=.$DIALECT
OEXT=.out

ENOENT=2

srcpath="$1"; shift # => "$@"
#basename="$(basename "$srcpath" ."$DIALECT")"
basename="$(basename "$srcpath")"

# Warning: current dir "." is in path, AND */tmp directories are common/shared
paths="$(dirname "$srcpath")
$HOME/bin
/usr/local/bin
.
$HOME/tmp
$HOME
$HOME/Desktop"
#/tmp

while read dirnamew; do
  [ -w "$dirnamew" ] && break
done << end_here_is
$paths
end_here_is

compile(){
  sed -n '2,$p' "$srcpath" | "$CC" $COPTS -o "$binpath" -
}

if [ "'$CACHE'" = "'No'" ]; then
  binpath="$dirnamew/$basename-v$$$OEXT"
  if compile; then
    ( sleep 0.1; exec rm "$binpath" ) & exec "$binpath" "$@"
  fi
else
  while read dirnamex; do
    binpath="$dirnamex/$basename$OEXT"
    if [ -x "$binpath" -a "$binpath" -nt "$srcpath" ];
      then exec "$binpath" "$@"; fi
  done << end_here_is
$paths
end_here_is

  binpath="$dirnamew/$basename$OEXT"
  if compile; then exec "$binpath" "$@"; fi
  echo "$binpath: executable not available" 1>&2
  exit $ENOENT

fi</lang>

Test Source File: echo.c
<lang c>#!/usr/local/bin/script_gcc.sh
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char **argv, char **envp){
 char ofs = '\0';
 for(argv++; *argv; argv++){
   if(ofs)putchar(ofs); else ofs=' ';
   fwrite(*argv, strlen(*argv), 1, stdout);
 }
 putchar('\n');
 exit(EXIT_SUCCESS);
}</lang>

Test Execution:

$ ./echo.c Hello, world!
Output:
Hello, world!


jq

Works with: jq version 1.4

jq can be invoked on the shebang line, e.g. as

#!/usr/local/bin/jq -M -n -f

or

#!/usr/bin/env/jq -M -n -f

Example 1:
<lang sh>$ cat echo.foo
#!/usr/bin/env/jq -M -n -r -f
"Klaatu barada nikto!"</lang>


$ ./echo.foo 
Klaatu barada nikto!

Command-line parameters of a script created with a shebang line in this manner are processed as jq command-line parameters. Thus, instead of being able to invoke the script along the lines of

$ ./echo.foo "Hello world!"   # nope

one would have to introduce a named variable to hold the command-line parameter, as illustrated in the next example:

Example 2:
<lang sh>$ cat echo.foo
#!/usr/bin/env/jq -M -n -r -f
$x</lang>

Output:

<lang sh>$ ./echo.foo --arg x "Hello, world!"
Hello, world!</lang>

Python

Extract: "If you need to create a .pyc file for a module that is not imported, you can use the py_compile and compileall modules. The py_compile module can manually compile any module. One way is to use the py_compile.compile function in that module interactively:[1]:"

>>> import py_compile
>>> py_compile.compile('echo.py')

File: echo.py
<lang python>#!/path/to/python
# Although `#!/usr/bin/env python` may be better if the path to python can change

import sys
print(" ".join(sys.argv[1:]))</lang>

Usage:

./echo.py Hello, world!
Output:
Hello, world!
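
The py_compile call quoted above can also be driven from a script rather than interactively. The following is a minimal sketch, assuming Python 3 and a hypothetical helper named `compile_if_stale`; it recompiles only when the bytecode file is missing or older than the source, in the spirit of the caching requirement of this task.

```python
import os
import py_compile

def compile_if_stale(source, target):
    """Compile source to bytecode at target unless target is up to date.

    Returns True if a compilation happened, False if target was reused.
    """
    if (os.path.exists(target)
            and os.path.getmtime(target) >= os.path.getmtime(source)):
        return False
    # doraise=True turns compile errors into exceptions instead of
    # messages on stderr.
    py_compile.compile(source, cfile=target, doraise=True)
    return True
```

Note that this only produces bytecode for the interpreter, not a native binary, so the interpreter named in the shebang is still required.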

Racket

Racket has raco: Racket Command Line Tools, which can be used to compile to bytecode or to standalone executables (along with a whole load of other fun stuff to do with packages, unit testing and the like).

To properly compile a file/program, one needs to invoke raco or go through invocations of racket to see what needs to be done. Compilation is expensive. Dependency management is expensive and difficult to do. The only one who can probably be trusted to do this is raco. So (as with Python), if you need to compile the program, do so with the compiler.

Once you have done this, however, racket (in the shebang) will use the 'compiled' version, not the source.

In this example:

File native-shebang.rkt contains the following:
<lang racket>#! /usr/local/racket-6.1/bin/racket
#lang racket
(displayln "hello")</lang>

My directory contains only this:

-bash-3.2$ ls
native-shebang.rkt

Which runs:

-bash-3.2$ ./native-shebang.rkt 
hello

But has not self-compiled or anything like that:

-bash-3.2$ ls
native-shebang.rkt

I run raco to compile it:

-bash-3.2$ raco make native-shebang.rkt 
-bash-3.2$ ls -R
.:
compiled  native-shebang.rkt

./compiled:
native-shebang_rkt.dep  native-shebang_rkt.zo

The dependency file and byte-code -- .zo -- file are in a compiled directory.

I still run native-shebang.rkt from the script (with the racket shebang). Racket will use the compiled code instead of the source in the script:

-bash-3.2$ ./native-shebang.rkt 
hello

(although it's hard to prove)

Ruby

Ruby does not compile to a binary, thankfully.