Machine code: Difference between revisions

Revision as of 16:01, 25 October 2015

Task
Machine code
You are encouraged to solve this task according to the task description, using any language you may know.

The task requires poking machine code directly into memory and executing it. The machine code is specific to the 32-bit x86 architecture and consists of the opcodes of the following simple program:

<lang asm>mov EAX, [ESP+4]
add EAX, [ESP+8]
ret</lang>

which translates into the following opcodes: (139 68 36 4 3 68 36 8 195), or in hexadecimal: ("8B" "44" "24" "04" "03" "44" "24" "08" "C3")
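As a quick cross-check of the two listings above, the following Python sketch converts the decimal opcodes to zero-padded hexadecimal and back. The names `opcodes_dec` and `opcodes_hex` are illustrative only and are not part of the task; nothing is executed as machine code here.

```python
# Decimal opcodes as given in the task description.
opcodes_dec = [139, 68, 36, 4, 3, 68, 36, 8, 195]

# Render each byte as two upper-case hex digits.
opcodes_hex = [format(b, "02X") for b in opcodes_dec]
print(" ".join(opcodes_hex))   # 8B 44 24 04 03 44 24 08 C3

# Round trip: parsing the hex digits gives back the same nine bytes.
code = bytes.fromhex("".join(opcodes_hex))
assert list(code) == opcodes_dec
print(len(code), "bytes")      # 9 bytes
```

This also confirms the size of the buffer that most entries allocate: nine bytes.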

Implement the following in your favorite programming language (take the Common Lisp code as an example if you wish):

  1. Poke the above opcodes into a memory pointer
  2. Execute it with the following arguments: [ESP+4] => unsigned-byte argument of value 7; [ESP+8] => unsigned-byte argument of value 12. The result should be 19.
  3. Free the pointer
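To see why the result is 19 without touching real memory, the three instructions can be modelled in plain Python. This is only an illustrative sketch of the stack layout that a 32-bit cdecl call presents to the routine; the function `emulate_add`, the stack pointer value and the dictionary standing in for the stack are all hypothetical, and no machine code is run.

```python
def emulate_add(arg1, arg2):
    """Model the stack seen by the routine under the 32-bit cdecl convention."""
    esp = 0x1000                      # arbitrary illustrative stack pointer
    stack = {
        esp: 0xDEADBEEF,              # [ESP]   return address pushed by CALL (placeholder)
        esp + 4: arg1,                # [ESP+4] first argument
        esp + 8: arg2,                # [ESP+8] second argument
    }
    eax = stack[esp + 4]                        # mov EAX, [ESP+4]
    eax = (eax + stack[esp + 8]) & 0xFFFFFFFF   # add EAX, [ESP+8] (wraps at 32 bits)
    return eax                                  # ret: the caller reads the result from EAX

print(emulate_add(7, 12))  # 19
```

The return address occupies [ESP], which is why the arguments are found at offsets 4 and 8 rather than 0 and 4.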

AutoHotkey

MCode Tutorial (Forum Thread)

MCode4GCC (Forum Thread | GitHub) - An MCode generator using the GCC Compiler.
<lang AutoHotkey>MCode(Var, "8B44240403442408C3")
MsgBox, % DllCall(&Var, "Char",7, "Char",12)
Var := ""
return

; http://www.autohotkey.com/board/topic/19483-machine-code-functions-bit-wizardry/
MCode(ByRef code, hex) { ; allocate memory and write Machine Code there
  VarSetCapacity(code, StrLen(hex) // 2)
  Loop % StrLen(hex) // 2
     NumPut("0x" . SubStr(hex, 2 * A_Index - 1, 2), code, A_Index - 1, "Char")
}</lang>

BBC BASIC

Note that BBC BASIC for Windows includes an 80386/80486 assembler as standard!

<lang bbcbasic>      REM Claim 9 bytes of memory
     SYS "GlobalAlloc",0,9 TO code%
     REM Poke machine code into it
     P%=code%
     [OPT 0
     mov EAX, [ESP+4]
     add EAX, [ESP+8]
     ret
     ]
     REM Run code
     SYS code%,7,12 TO result%
     PRINT result%
     REM Free memory
     SYS "GlobalFree",code%
     END</lang>

C

<lang C>#include <stdio.h>
#include <sys/mman.h>
#include <string.h>

int test (int a, int b) {
 /*
      mov EAX, [ESP+4]
      add EAX, [ESP+8]
      ret
 */
 char code[] = {0x8B, 0x44, 0x24, 0x4, 0x3, 0x44, 0x24, 0x8, 0xC3};
 void *buf;
 int c;
 /* copy code to executable buffer */
 buf = mmap (0,sizeof(code),PROT_READ|PROT_WRITE|PROT_EXEC,
            MAP_PRIVATE|MAP_ANON,-1,0);
 memcpy (buf, code, sizeof(code));
 /* run code */
 c = ((int (*) (int, int))buf)(a, b);
 /* free buffer */
 munmap (buf, sizeof(code));
 return c;

}

int main () {

 printf("%d\n", test(7,12));
 return 0;

}</lang>

Common Lisp

<lang lisp>;; Note that by using the 'CFFI' library, one can apply this procedure portably in any Lisp implementation;
;; in this code, however, I chose to demonstrate only the implementation-dependent programs.

;; CCL
;; Allocate a memory pointer and poke the opcodes into it
(defparameter ptr (ccl::malloc 9))

(loop for i in '(139 68 36 4 3 68 36 8 195)
   for j from 0 do
   (setf (ccl::%get-unsigned-byte ptr j) i))

;; Execute with the required arguments and return the result as an unsigned-byte
(ccl::ff-call ptr :UNSIGNED-BYTE 7 :UNSIGNED-BYTE 12 :UNSIGNED-BYTE)

;; Output = 19

;; Free the pointer
(ccl::free ptr)

;; SBCL
(defparameter mmap (list 139 68 36 4 3 68 36 8 195))

(defparameter pointer (sb-alien:make-alien sb-alien:unsigned-char (length mmap)))

;; Poke the opcodes into the buffer, then cast the buffer to a function pointer
(defparameter callp
  (loop for byte in mmap
        for i from 0
        do (setf (sb-alien:deref pointer i) byte)
        finally (return (sb-alien:cast pointer (function integer integer integer)))))

(sb-alien:alien-funcall callp 7 12)

;; Inspect the bytes that were poked into memory
(loop for i from 0 below (length mmap) collect (sb-alien:deref pointer i))

(sb-alien:free-alien pointer)

;; CLISP
(defparameter mmap (list 139 68 36 4 3 68 36 8 195))

(defparameter POINTER (FFI:FOREIGN-ADDRESS (FFI:FOREIGN-ALLOCATE 'FFI:UINT8 :COUNT 9)))

;; Write one byte per offset so that nothing is stored past the 9 allocated bytes
(loop for i in mmap
   for j from 0 do
   (FUNCALL #'(SETF FFI:MEMORY-AS) i POINTER 'FFI:UINT8 j))

(FUNCALL
 (FFI:FOREIGN-FUNCTION POINTER
  (LOAD-TIME-VALUE (FFI:PARSE-C-TYPE '(FFI:C-FUNCTION (:ARGUMENTS 'FFI:INT 'FFI:INT) (:RETURN-TYPE FFI:INT) (:LANGUAGE :STDC)))))
 7 12)

(FFI:FOREIGN-FREE POINTER)</lang>

D

In D you usually use a nicer asm {} statement for similar purposes.

Newer operating systems generally forbid executing memory unless it is marked as executable. This is a basic version that, unlike the C entry, executes directly from array memory, so it may crash on some operating systems.
<lang d>int test(in int a, in int b) pure nothrow @nogc {
   /*
   mov EAX, [ESP+4]
   add EAX, [ESP+8]
   ret
   */
   immutable ubyte[9] code = [0x8B, 0x44, 0x24, 0x4, 0x3, 0x44, 0x24, 0x8, 0xC3];
   alias F = extern(C) int function(int, int) pure nothrow @nogc;
   immutable f = cast(F)code.ptr;
   return f(a, b); // Run code.

}

void main() {

   import std.stdio;
   test(7, 12).writeln;

}</lang>

Output:
 19

Nim

Translation of: C

<lang nim>import posix

when defined(macosx) or defined(bsd):

 const MAP_ANONYMOUS = 0x1000

elif defined(solaris):

 const MAP_ANONYMOUS = 0x100

else:

 var
   MAP_ANONYMOUS {.importc: "MAP_ANONYMOUS", header: "<sys/mman.h>".}: cint

proc test(a, b: cint): cint =
 # mov EAX, [ESP+4]
 # add EAX, [ESP+8]
 # ret
 var code = [0x8B'u8, 0x44, 0x24, 0x4, 0x3, 0x44, 0x24, 0x8, 0xC3]
 # create executable buffer
 var buf = mmap(nil, sizeof(code), PROT_READ or PROT_WRITE or PROT_EXEC,
   MAP_PRIVATE or MAP_ANONYMOUS, -1, 0)
 # copy code to buffer (buf already holds the destination address)
 copyMem(buf, addr code[0], sizeof(code))
 # run code from the mapped buffer
 {.emit: "`result` = ((int (*) (int, int))`buf`)(`a`,`b`);".}
 # free buffer
 discard munmap(buf, sizeof(code))

echo test(7, 12)</lang>

PARI/GP

GP can't peek and poke into memory, but PARI can add in those capabilities via C.

Translation of: C

<lang c>#include <stdio.h>
#include <sys/mman.h>
#include <string.h>
#include <pari/pari.h>

int test(int a, int b) {
 char code[] = {0x8B, 0x44, 0x24, 0x4, 0x3, 0x44, 0x24, 0x8, 0xC3};
 void *buf;
 int c;
 /* copy code to executable buffer */
 buf = mmap (0,sizeof(code),PROT_READ|PROT_WRITE|PROT_EXEC,
            MAP_PRIVATE|MAP_ANON,-1,0);

 memcpy (buf, code, sizeof(code));
 /* run code */
 c = ((int (*) (int, int))buf)(a, b);
 /* free buffer */
 munmap (buf, sizeof(code));
 return c;

}

void init_auto(void) {
 pari_printf("%d\n", test(7,12));
}</lang>

Pascal

Tested under Linux with Free Pascal 2.6.4, 32-bit (like the code used). cdecl doesn't work in Free Pascal under 64-bit Linux.
<lang pascal>Program Example66;
{Inspired... program to demonstrate the MMap function. Freepascal docs }
Uses
 BaseUnix,Unix;

const

 code : array[0..9] of byte = ($8B, $44, $24, $4, $3, $44, $24, $8, $C3, $00);
 a :longInt= 12; 
 b :longInt=  7;  

type

 tDummyFunc = function(a,b:LongInt):LongInt;cdecl;

Var

   Len,k  : cint;
   P    : Pointer;

begin

 len := sizeof(code);
 P:= fpmmap(nil,
            len+1 ,
            PROT_READ OR PROT_WRITE OR PROT_EXEC,
            MAP_ANONYMOUS OR MAP_PRIVATE,
            -1, // for MAP_ANONYMOUS
            0);
 If P =  Pointer(-1) then
   Halt(4);                  
 for k := 0 to len-1 do
   pChar(p)[k] := char(code[k]);
 k := tDummyFunc(P)(a,b);
 Writeln(a,'+',b,' = ',k);
 if fpMUnMap(P,Len)<>0 Then
   Halt(fpgeterrno);

end.</lang>

output
12+7 = 19

Phix

<lang Phix>atom mem = allocate(9)
poke(mem,{#8B,#44,#24,#04,#03,#44,#24,#08,#C3})
constant mfunc = define_c_func({},mem,{C_INT,C_INT},C_INT)
?c_func(mfunc,{12,7})
free(mem)</lang>
In Phix the #ilASM statement (which has guards to allow 32/64/WIN/LNX variants) is usually used for inline assembly, for example (but sticking to the task):
<lang Phix>atom mem = allocate(9)
poke(mem,{#8B,#44,#24,#04,#03,#44,#24,#08,#C3})
integer res
#ilASM{ mov eax,[mem]
        call :%pLoadMint -- (in case mem>#3FFFFFFF)
        push 12
        push 7
        call eax
        add esp,8
        mov [res],eax }
?res
free(mem)</lang>

Python

Works with: CPython version 3.x

The ctypes module is meant for calling existing native code from Python, but you can get it to execute your own bytes with some tricks. The bulk of the code is spent establishing an executable memory area - once that's done, the actual execution takes just a few lines.

<lang Python>import ctypes
import os
from ctypes import c_ubyte, c_int

code = bytes([0x8b, 0x44, 0x24, 0x04, 0x03, 0x44, 0x24, 0x08, 0xc3])

code_size = len(code)
# copy code into an executable buffer

if (os.name == 'posix'):

   import mmap
   executable_map = mmap.mmap(-1, code_size, mmap.MAP_PRIVATE | mmap.MAP_ANON, mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
   # we must keep a reference to executable_map until the call, to avoid freeing the mapped memory
   executable_map.write(code)
   # the mmap object won't tell us the actual address of the mapping, but we can fish it out by allocating
   # some ctypes object over its buffer, then asking the address of that
   func_address = ctypes.addressof(c_ubyte.from_buffer(executable_map))

elif (os.name == 'nt'):

   # the mmap module doesn't support protection flags on Windows, so execute VirtualAlloc instead
   code_buffer = ctypes.create_string_buffer(code)
   PAGE_EXECUTE_READWRITE = 0x40  # Windows constants that would usually come from header files
   MEM_COMMIT = 0x1000
   executable_buffer_address = ctypes.windll.kernel32.VirtualAlloc(0, code_size, MEM_COMMIT, PAGE_EXECUTE_READWRITE)
   if (executable_buffer_address == 0):
       print('Warning: Failed to enable code execution, call will likely cause a protection fault.')
       func_address = ctypes.addressof(code_buffer)
   else:
       ctypes.memmove(executable_buffer_address, code_buffer, code_size)
       func_address = executable_buffer_address

else:

   # for other platforms, we just hope DEP isn't enabled
   code_buffer = ctypes.create_string_buffer(code)
   func_address = ctypes.addressof(code_buffer)

prototype = ctypes.CFUNCTYPE(c_int, c_ubyte, c_ubyte) # build a function prototype from return type and argument types
func = prototype(func_address)                        # build an actual function from the prototype by specifying the address
res = func(7,12)
print(res)
</lang>
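The address-fishing trick used in the POSIX branch can be tried in isolation. The sketch below is an assumption-laden illustration, not part of the entry: it overlays a `c_ubyte` on an ordinary `bytearray` (named `buf` here for illustration) instead of an executable mapping, so it only demonstrates how the raw address is obtained and must not be used to call the code.

```python
import ctypes
from ctypes import c_ubyte

code = bytes([0x8b, 0x44, 0x24, 0x04, 0x03, 0x44, 0x24, 0x08, 0xc3])

# A bytearray is a writable buffer, so ctypes can overlay an object on it.
# Its memory is NOT executable; this only shows how to obtain the address.
buf = bytearray(code)

# Overlay a single c_ubyte on the start of the buffer and ask for its address.
func_address = ctypes.addressof(c_ubyte.from_buffer(buf))

# Reading nine bytes back from that raw address returns what was written.
readback = ctypes.string_at(func_address, len(code))
print(readback == code)  # True
```

As in the entry itself, the buffer (`buf` here) has to stay referenced for as long as the address is in use.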

PureBasic

Machine code

Using the Windows API:

<lang PureBasic>Procedure MachineCodeVirtualAlloc(a,b)
    *vm = VirtualAlloc_(#Null,?ecode-?scode,#MEM_COMMIT,#PAGE_EXECUTE_READWRITE)
    If(*vm)
        CopyMemory_(*vm,?scode,?ecode-?scode)
        eax_result=CallFunctionFast(*vm,a,b)
        VirtualFree_(*vm,0,#MEM_RELEASE)
        ProcedureReturn eax_result
    EndIf
EndProcedure

rv=MachineCodeVirtualAlloc(7,12)
MessageRequester("MachineCodeVirtualAlloc",str(rv)+space(50),#PB_MessageRequester_Ok)

#HEAP_CREATE_ENABLE_EXECUTE=$00040000

Procedure MachineCodeHeapCreate(a,b)
    hHeap=HeapCreate_(#HEAP_CREATE_ENABLE_EXECUTE,?ecode-?scode,?ecode-?scode)
    If(hHeap)
        CopyMemory_(hHeap,?scode,?ecode-?scode)
        eax_result=CallFunctionFast(hHeap,a,b)
        HeapDestroy_(hHeap)
        ProcedureReturn eax_result
    EndIf
EndProcedure

rv=MachineCodeHeapCreate(7,12)
MessageRequester("MachineCodeHeapCreate",str(rv)+space(50),#PB_MessageRequester_Ok)
End

; 8B442404 mov eax,[esp+4]
; 03442408 add eax,[esp+8]
; C20800   ret 8

DataSection
scode:
Data.c $8B,$44,$24,$04,$03,$44,$24,$08,$C2,$08,$00
ecode:
EndDataSection</lang>

Racket

<lang racket>#lang racket/base

(require ffi/unsafe)

;; set up access to racket internals

(define scheme-malloc-code

 (get-ffi-obj 'scheme_malloc_code #f (_fun (len : _intptr) -> _pointer)))

(define scheme-free-code

 (get-ffi-obj 'scheme_free_code #f (_fun _pointer -> _void)))

(define opcodes '(139 68 36 4 3 68 36 8 195))

(define code (scheme-malloc-code 64))

(for ([byte opcodes]

     [i (in-naturals)])
 (ptr-set! code _ubyte i byte))

(define function (cast code _pointer (_fun _ubyte _ubyte -> _ubyte)))

(function 7 12)

(scheme-free-code code)</lang>

Tcl

Translation of: C
Library: Critcl

<lang tcl>package require critcl

critcl::ccode {

   #include <sys/mman.h>

}

# Define a command using C. The C is embedded in Tcl, and will be
# built into a shared library at runtime. Note that Tcl does not
# provide a native way of doing this sort of thing; this thunk is
# mandatory.

critcl::cproc runMachineCode {Tcl_Obj* codeObj int a int b} int {

   int size, result;
   unsigned char *code = Tcl_GetByteArrayFromObj(codeObj, &size);
   void *buf;
   /* copy code to executable buffer */
   buf = mmap(0, (size_t) size, PROT_READ|PROT_WRITE|PROT_EXEC,
           MAP_PRIVATE|MAP_ANON, -1, 0); 
   memcpy(buf, code, (size_t) size);
   /* run code */
   result = ((int (*) (int, int)) buf)(a, b);
   /* dispose buffer */
   munmap(buf, (size_t) size);
   return result;

}

# But now we have our thunk, we can execute arbitrary binary blobs
set code [binary format c* {0x8B 0x44 0x24 0x4 0x3 0x44 0x24 0x8 0xC3}]
puts [runMachineCode $code 7 12]</lang>
Note that it would be more common to put that thunk in its own package (e.g., machineCodeThunk) and then just do something like this:
<lang tcl>package require machineCodeThunk 1.0

set code [binary format c* {0x8B 0x44 0x24 0x4 0x3 0x44 0x24 0x8 0xC3}]
puts [runMachineCode $code 7 12]</lang>