Use another language to call a function: Difference between revisions

(→‎Using cptr and memcpy: Show C side-by-side.)
(→‎{{header|TXR}}: Remove first solution, improve carray solution, put callback discussion at end.)
Line 1,067: Line 1,067:


=={{header|TXR}}==
=={{header|TXR}}==

=== Using character array ===


This is really two tasks: how to accept foreign callbacks, and how to link code to a C program which controls the <code>main</code> startup function.
This is really two tasks: how to accept foreign callbacks, and how to link code to a C program which controls the <code>main</code> startup function.
Line 1,097: Line 1,095:
gcc -g --shared query.c -o query.so</lang>
gcc -g --shared query.c -o query.so</lang>


===Using <code>carray</code>===
Now an interactive TXR session.

In this situation, the most appropriate FFI type to use for the foreign buffer is the <code>carray</code> type. This type allows TXR Lisp code to manipulate a foreign array while retaining its identity, so that it is able to pass the same pointer to the foreign code that it received from that code. <code>carray</code> also solves the problem posed by the common C representation in which arrays are represented by pointers and do not include their size as part of their type information. A <code>carray</code> object can be constructed with a zero size, which can be adjusted when the size is known, using <code>carray-set-length</code>.
Like the <code>array</code> type, <code>carray</code> has specialized behaviors when its element type is <code>char</code>, <code>bchar</code> or <code>wchar</code>. The <code>carray-get</code> function will decode a string from the underlying array, and <code>carray-put</code> will encode a string into the array. In the case of the <code>char</code> type, this involves UTF-8 coding.


Callbacks are modeled as "FFI closures". The macro <code>deffi-cb</code> defines a function which itself isn't a callback, but is rather a combinator which converts a Lisp function into an FFI callback.
Callbacks are modeled as "FFI closures". The macro <code>deffi-cb</code> defines a function which itself isn't a callback, but is rather a combinator which converts a Lisp function into an FFI callback.


<lang txrlisp>(with-dyn-lib "./query.so"
<lang txrlisp>(with-dyn-lib "./query.so"
(deffi query "query" void (closure)))
(deffi query "query" void (closure)))

(deffi-cb query-cb int ((ptr (array 1024 char)) (ptr (array 1 size-t))))
(deffi-cb query-cb int ((carray char) (ptr (array 1 size-t))))
(query (query-cb (lambda (data sizeptr)
(query (query-cb (lambda (buf sizeptr)
(symacrolet ((size [sizeptr 0]))
(symacrolet ((size [sizeptr 0]))
(let* ((s "Here am I")
(let* ((s "Here am I")
Line 1,112: Line 1,114:
(cond
(cond
((> l size) 0)
((> l size) 0)
(t (set [data :..:] s)
(t (carray-set-length buf size)
(carray-put buf s)
(set size l))))))))</lang>
(set size l))))))))</lang>


Line 1,121: Line 1,124:
Note that the obvious way of passing a <code>size_t</code> value by pointer, namely <code>(ptr size-t)</code>, doesn't work. While the callback will receive the size (FFI will decode the pointer type's semantics and get the size value), updating the size will not propagate back to the caller, because it becomes, effectively, a by-value parameter. A <code>(ptr size-t)</code> object has to be embedded in an aggregate that is passed by reference, in order to have two-way semantics. Here we use the trick of treating the <code>size_t *</code> as an array of 1, which it ''de facto'' is. In the callback, we establish a local symbol macro which lets us refer to <code>[sizeptr 0]</code> simply as <code>size</code>.
Note that the obvious way of passing a <code>size_t</code> value by pointer, namely <code>(ptr size-t)</code>, doesn't work. While the callback will receive the size (FFI will decode the pointer type's semantics and get the size value), updating the size will not propagate back to the caller, because it becomes, effectively, a by-value parameter. A <code>(ptr size-t)</code> object has to be embedded in an aggregate that is passed by reference, in order to have two-way semantics. Here we use the trick of treating the <code>size_t *</code> as an array of 1, which it ''de facto'' is. In the callback, we establish a local symbol macro which lets us refer to <code>[sizeptr 0]</code> simply as <code>size</code>.
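The distinction can be illustrated by analogy in plain C. The following is only a sketch: the function names <code>update_copy</code> and <code>update_in_place</code> are hypothetical and belong neither to TXR nor to <code>query.c</code>. The first function models a callback that works on a decoded copy of the size, which is roughly the situation with <code>(ptr size-t)</code>; the second models the array-of-1 treatment, where element 0 is the caller's own object.

```c
#include <stddef.h>

/* Hypothetical analogy for (ptr size-t): the pointed-to value is decoded
   into a local copy, so assigning to it changes only the copy. */
int update_copy(const size_t *sizeptr, size_t new_size)
{
  size_t size = *sizeptr;   /* decoded value, detached from the caller */
  size = new_size;          /* the caller's object is not modified */
  return size == new_size;
}

/* Hypothetical analogy for (ptr (array 1 size-t)): element 0 of the
   one-element array is the caller's object, so the store is visible. */
int update_in_place(size_t *sizeptr, size_t new_size)
{
  sizeptr[0] = new_size;
  return 1;
}
```

Calling <code>update_copy</code> on a variable holding 1024 leaves it at 1024, whereas <code>update_in_place</code> changes it.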


Note also how the data is prepared. As a special case, FFI creates a correspondence between the <code>char</code> array and a character string. The callback must mutate the character string to the desired value; FFI will then propagate the mutation to the original array. If the callback mistakenly performed <code>(set data s)</code>, it wouldn't work, because the original string object is untouched. Only the lexical <code>data</code> variable is replaced with a pointer to <code>s</code>. The expression <code>(set [data :..:] s)</code> replaces a subrange of <code>data</code> with <code>s</code>, where the subrange is all of <code>data</code>.


=== Using <code>cptr</code> and <code>memcpy</code> ===
Finally, note that TXR Lisp strings are Unicode, stored as arrays of wide characters (C type <code>wchar_t</code>). FFI is doing automatic conversion between that representation and UTF-8. That's a specialized behavior of the <code>(array ... char)</code> type. If UTF-8 encoding is undesirable, then the <code>bchar</code> type can be used (byte char). Then there is a one-to-one correspondence between the Unicode characters and array elements. However, out-of-range Unicode characters (values above U+007F) trigger an exception.


An alternative approach is possible if we avail ourselves of the <code>memcpy</code> function via FFI. We can receive the data as an opaque foreign pointer represented by the <code>cptr</code> type. We can set up <code>memcpy</code> so that its destination argument and return value is a <code>cptr</code>, but the source argument is a string:
=== Using <code>carray</code> ===


<lang txrlisp>(with-dyn-lib "./query.so"
Our above approach has a problem: it uses FFI in a way that relies on knowing the size of the C object, which is incorrect. The C buffer could be of any size; the only indicator we can trust is the run-time value we are given.
(deffi query "query" void (closure)))
(with-dyn-lib nil
(deffi memcpy "memcpy" cptr (cptr str size-t)))
(deffi-cb query-cb int (cptr (ptr (array 1 size-t))))
(query (query-cb (lambda (buf sizeptr) ; int lambda(void *buf, size_t *sizeptr)
(symacrolet ((size [sizeptr 0])) ; { #define size sizeptr[0]
(let* ((s "Here am I") ; char *s = "Here am I";
(l (length s))) ; size_t l = strlen(s);
(cond ; if (l > size)
((> l size) 0) ; { return 0; } else
(t (memcpy buf s l) ; { memcpy(buf, s, l);
(set size l)))))))) ; return size = l; } }</lang>


Here, the use of the <code>str</code> type in the <code>memcpy</code> interface means that FFI automatically produces a UTF-8 encoding of the string in a temporary buffer. The pointer to that temporary buffer is what is passed into <code>memcpy</code>. The temporary buffer is released after <code>memcpy</code> returns.
To deal accurately with this kind of situation, the lower level <code>carray</code> FFI type can be used:


To reveal the similarity between the Lisp logic and how a C function might be written, the corresponding C code is shown.
<lang txrlisp>;; callback signature is altered to take "carray of char":
However, that C code's semantics is, of course, devoid of any hidden UTF-8 conversion.
(with-dyn-lib "./query.so"
(deffi query "query" void (closure)))


===Exceptions from Callback===
(deffi-cb query-cb int ((carray char) (ptr (array 1 size-t))))


If the callback throws an exception or performs any other non-local return, it will return a default return value of all zero bits in the given return type. This value can be specified, but the zero default suits our particular situation, because the problem task defines the return value of zero as an error indicator.
(query (query-cb (lambda (buf sizeptr)
(symacrolet ((size [sizeptr 0]))
(carray-set-length buf size)
(let* ((s "Here am I")
(l (length s)))
(cond
((> l size) 0)
(t (each ((i (range* 0 l)))
(carray-refset buf i [s i]))
(set size l))))))))</lang>


We can explore this interactively:
If the callback throws an exception or performs any other non-local return, it will return a default return value of all zero bits in the given return type. This value can be specified, but the zero default suits our particular situation:


<pre>$ txr
<pre>$ txr
Line 1,165: Line 1,172:
Here we can see that when the callback throws the <code>error</code> exception, the C code prints <code>query: callback failed</code>, due to receiving the default abort return value of zero. Then, the exception continues up to the interactive prompt.
Here we can see that when the callback throws the <code>error</code> exception, the C code prints <code>query: callback failed</code>, due to receiving the default abort return value of zero. Then, the exception continues up to the interactive prompt.
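To make the failure path concrete, here is a minimal sketch of a C caller in the spirit of <code>query.c</code>. It is an assumption for illustration, not the actual source: the names <code>query_sketch</code>, <code>aborted_cb</code> and <code>tiny_cb</code> are hypothetical, and the function returns a status only so that the outcome can be checked. A callback that is aborted by a non-local exit looks, from the C side, like one that simply returned zero.

```c
#include <stddef.h>
#include <stdio.h>

typedef int (*query_cb_t)(char *buf, size_t *size);

/* Hypothetical caller: offers a 1024-byte buffer to the callback and
   treats a zero return as failure. Returns 1 on success, 0 on failure. */
int query_sketch(query_cb_t cb)
{
  char buffer[1024];
  size_t size = sizeof buffer;

  if (cb(buffer, &size) == 0) {
    puts("query: callback failed");
    return 0;
  }

  fwrite(buffer, 1, size, stdout);   /* print exactly the bytes stored */
  putchar('\n');
  return 1;
}

/* Stands in for a callback whose Lisp function threw an exception:
   the caller only observes the default return value of zero. */
int aborted_cb(char *buf, size_t *size)
{
  (void) buf;
  (void) size;
  return 0;
}

/* Trivial successful callback: stores one byte and reports its length. */
int tiny_cb(char *buf, size_t *size)
{
  if (*size < 1)
    return 0;
  buf[0] = 'x';
  *size = 1;
  return 1;
}
```

Passing <code>aborted_cb</code> takes the failure branch, while <code>tiny_cb</code> takes the success branch.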


If a return value other than zero indicates that the callback failed, that can be arranged with an additional argument in <code>deffi-cb</code>:
=== Using <code>cptr</code> and <code>memcpy</code> ===


<lang txrlisp>(deffi-cb query-cb int (cptr (ptr (array 1 size-t))) -1)</lang>
A more succinct approach is possible if we avail ourselves of the <code>memcpy</code> function via FFI. We can receive the data as an opaque foreign pointer represented by the <code>cptr</code> type. We can set up <code>memcpy</code> so that its destination argument and return value is a <code>cptr</code>, but the source argument is a string:


Now the <code>query-cb</code> function generates callbacks that return -1 to the caller, rather than zero, if aborted by a non-local control transfer such as an exception.
<lang txrlisp>(with-dyn-lib "./query.so"
(deffi query "query" void (closure)))
(with-dyn-lib nil
(deffi memcpy "memcpy" cptr (cptr str size-t)))
(deffi-cb query-cb int (cptr (ptr (array 1 size-t))))
(query (query-cb (lambda (buf sizeptr) ; int lambda(void *buf, size_t *sizeptr)
(symacrolet ((size [sizeptr 0])) ; { #define size sizeptr[0]
(let* ((s "Here am I") ; char *s = "Here am I";
(l (length s))) ; size_t l = strlen(s);
(cond ; if (l > size)
((> l size) 0) ; { return 0; } else
(t (memcpy buf s l) ; { memcpy(buf, s, l);
(set size l)))))))) ; return size = l; } }</lang>

Here, the use of the <code>str</code> type in the <code>memcpy</code> interface means that FFI automatically produces a UTF-8 encoding of the string in a temporary buffer. The pointer to that temporary buffer is what is passed into <code>memcpy</code>. The temporary buffer is released after <code>memcpy</code> returns.

To reveal the similarity between the Lisp logic and how a C function might be written, the corresponding C code is shown.
However, that C code's semantics is, of course, devoid of any hidden UTF-8 conversion.
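For reference, the C fragments in the comments can be assembled into a standalone function. This is a sketch under the assumption that the callback has the signature shown in the comments; the name <code>query_callback</code> is hypothetical. Unlike the Lisp version, it copies the bytes of the string literal directly, with no encoding step.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical plain C counterpart of the Lisp callback: copies the
   message into buf if it fits, and stores its length through sizeptr. */
int query_callback(void *buf, size_t *sizeptr)
{
  const char *s = "Here am I";
  size_t l = strlen(s);

  if (l > *sizeptr)
    return 0;                    /* buffer too small: report failure */

  memcpy(buf, s, l);             /* no terminating null byte is copied */
  return (int) (*sizeptr = l);   /* non-zero length signals success */
}
```

When the buffer is too small, the function returns zero and leaves both the buffer and the size untouched.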


=={{header|zkl}}==
=={{header|zkl}}==