The XSUB mechanism is a simple way for Perl programs to access C subroutines. An XSUB
routine will have a stack that contains the arguments from the Perl program, and a way to map
from the Perl data structures to a C equivalent.
The stack arguments are accessible through the ST(n) macro, which returns the n'th
stack argument. Argument 0 is the first argument passed in the Perl subroutine call. These
arguments are SV*, and can be used anywhere an SV* is used.
Most of the time, output from the C routine can be handled through use of the RETVAL and
OUTPUT directives. However, there are some cases where the argument stack is not already long
enough to handle all the return values. An example is the POSIX tzname() call, which takes no
arguments, but returns two, the local time zone's standard and summer time abbreviations.
To handle this situation, the PPCODE directive is used and the stack is extended using the
macro:
EXTEND(SP, num);
where SP is the macro that represents the local copy of the stack pointer, and
num is the number of elements the stack should be extended by.
Now that there is room on the stack, values can be pushed on it using the PUSHs
macro. The values pushed will often need to be "mortal" (see "Reference Counts and Mortality").
PUSHs(sv_2mortal(newSViv(an_integer)))
PUSHs(sv_2mortal(newSVpv("Some String",0)))
PUSHs(sv_2mortal(newSVnv(3.141592)))
In the Perl program calling tzname, the two values are then assigned as
in:
($standard_abbrev, $summer_abbrev) = POSIX::tzname;
An alternate (and possibly simpler) method of pushing values on the stack is to use the
macro:
XPUSHs(SV*)
This macro automatically adjusts the stack for you, if needed. Thus, you do not need to call
EXTEND to extend the stack.
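The difference between reserving room up front and pushing with automatic growth can be sketched with a toy stack in plain C. This is only a model of the idea, not perl's real stack: ToyStack, toy_extend, toy_push and toy_xpush are hypothetical names invented for this illustration, and the values are plain ints rather than SV pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy value stack. NOT perl's argument stack; names are hypothetical. */
typedef struct {
    int   *base;   /* start of the storage */
    size_t len;    /* number of values currently on the stack */
    size_t cap;    /* number of slots allocated */
} ToyStack;

/* Models EXTEND(SP, n): guarantee room for n more values. */
static void toy_extend(ToyStack *s, size_t n)
{
    if (s->cap - s->len >= n)
        return;                              /* already enough room */
    size_t new_cap = s->len + n;
    int *p = (int *)realloc(s->base, new_cap * sizeof *p);
    if (p == NULL)
        abort();
    s->base = p;
    s->cap  = new_cap;
}

/* Models PUSHs: the caller must already have reserved room. */
static void toy_push(ToyStack *s, int v)
{
    assert(s->len < s->cap);
    s->base[s->len++] = v;
}

/* Models XPUSHs: grow by one slot if needed, then push. */
static void toy_xpush(ToyStack *s, int v)
{
    toy_extend(s, 1);
    toy_push(s, v);
}
```

In this model, a routine returning two values would call toy_extend(&s, 2) once and then toy_push twice, or simply call toy_xpush twice and let each call check for room.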
Despite their suggestions in earlier versions of this document, the macros PUSHi,
PUSHn and PUSHp are not suited to XSUBs which return multiple
results; see "Putting a C value on Perl stack".
For more information, consult perlxs and perlxstut.
There are four routines that can be used to call a Perl subroutine from within a C program.
These four are:
I32 call_sv(SV*, I32);
I32 call_pv(const char*, I32);
I32 call_method(const char*, I32);
I32 call_argv(const char*, I32, register char**);
The routine most often used is call_sv. The SV* argument contains
either the name of the Perl subroutine to be called, or a reference to the subroutine. The
second argument consists of flags that control the context in which the subroutine is called,
whether or not the subroutine is being passed arguments, how errors should be trapped, and how
to treat return values.
All four routines return the number of values that the subroutine returned on the Perl
stack.
These routines used to be called perl_call_sv, etc., before Perl v5.6.0, but
those names are now deprecated; macros of the same name are provided for compatibility.
When using any of these routines (except call_argv), the programmer must
manipulate the Perl stack, using the following macros and functions:
dSP
SP
PUSHMARK()
PUTBACK
SPAGAIN
ENTER
SAVETMPS
FREETMPS
LEAVE
XPUSH*()
POP*()
For a detailed description of calling conventions from C to Perl, consult perlcall.
All memory meant to be used with the Perl API functions should be manipulated using the
macros described in this section. The macros provide the necessary transparency between
differences in the actual malloc implementation that is used within perl.
It is suggested that you enable the version of malloc that is distributed with Perl. It
keeps pools of various sizes of unallocated memory in order to satisfy allocation requests
more quickly. However, on some platforms, it may cause spurious malloc or free errors.
New(x, pointer, number, type);
Newc(x, pointer, number, type, cast);
Newz(x, pointer, number, type);
These three macros are used to initially allocate memory.
The first argument x was a "magic cookie" that was used to keep
track of who called the macro, to help when debugging memory problems. However, the current
code makes no use of this feature (most Perl developers now use run-time memory checkers), so
this argument can be any number.
The second argument pointer should be the name of a variable that will point
to the newly allocated memory.
The third and fourth arguments number and type specify how many
of the specified type of data structure should be allocated. The argument type is
passed to sizeof. The final argument to Newc, cast,
should be used if the pointer argument is different from the type
argument.
Unlike the New and Newc macros, the Newz macro calls
memzero to zero out all the newly allocated memory.
Renew(pointer, number, type);
Renewc(pointer, number, type, cast);
Safefree(pointer);
These three macros are used to change a memory buffer size or to free a piece of memory no
longer needed. The arguments to Renew and Renewc match those of New
and Newc with the exception of not needing the "magic cookie" argument.
Move(source, dest, number, type);
Copy(source, dest, number, type);
Zero(dest, number, type);
These three macros are used to move, copy, or zero out previously allocated memory. The source
and dest arguments point to the source and destination starting points. Perl will
move, copy, or zero out number instances of the size of the type
data structure (using the sizeof operator).
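As a rough illustration of the argument order and the byte-count computation, the sketch below defines stand-in macros on top of the standard C library. It assumes, as perlapi describes, that Move behaves like memmove (overlapping regions are allowed) and Copy like memcpy (they are not). TOY_MOVE, TOY_COPY, TOY_ZERO and shift_right are hypothetical names for this example, not part of the Perl API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins with the same argument order as Move, Copy and Zero:
 * (source, dest, number, type). Hypothetical names, not perl's macros. */
#define TOY_MOVE(src, dst, n, type) memmove((dst), (src), (n) * sizeof(type))
#define TOY_COPY(src, dst, n, type) memcpy((dst), (src), (n) * sizeof(type))
#define TOY_ZERO(dst, n, type)      memset((dst), 0, (n) * sizeof(type))

/* Shift the first len - 1 ints one slot to the right inside one buffer.
 * Source and destination overlap, so the memmove-style macro is needed. */
static void shift_right(int *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 2)
        return;
    TOY_MOVE(buf, buf + 1, len - 1, int);
}
```

Note that the count is a number of elements, not bytes; the macro multiplies it by sizeof(type), which is why the type argument is required.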
The most recent development releases of Perl have been experimenting with removing Perl's
dependency on the "normal" standard I/O suite and allowing other stdio
implementations to be used. This involves creating a new abstraction layer that then calls
whichever implementation of stdio Perl was compiled with. All XSUBs should now use the
functions in the PerlIO abstraction layer and not make any assumptions about what kind of
stdio is being used.
For a complete description of the PerlIO abstraction, consult perlapio.
A lot of opcodes (an opcode is an elementary operation in the internal perl stack machine) put
an SV* on the stack. However, as an optimization the corresponding SV is (usually) not
recreated each time. The opcodes reuse specially assigned SVs (targets) which are (as a
corollary) not constantly freed/created.
Each of the targets is created only once (but see Scratchpads
and recursion below), and when an opcode needs to put an integer, a double, or a string on
the stack, it just sets the corresponding parts of its target and puts the target on
the stack.
The macro to put this target on the stack is PUSHTARG, and it is directly used in
some opcodes, as well as indirectly in zillions of others, which use it via (X)PUSH[pni].
Because the target is reused, you must be careful when pushing multiple values on the
stack. The following code will not do what you think:
XPUSHi(10);
XPUSHi(20);
This translates as "set TARG to 10, push a pointer to TARG
onto the stack; set TARG to 20, push a pointer to TARG onto the
stack". At the end of the operation, the stack does not contain the values 10 and 20, but
actually contains two pointers to TARG, which we have set to 20. If you need to
push multiple different values, use XPUSHs, which bypasses TARG.
On a related note, if you do use (X)PUSH[npi], then you're going to need a dTARG
in your variable declarations so that the *PUSH* macros can make use of the local
variable TARG.
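The aliasing problem can be reproduced with a toy model in plain C. This is a sketch under simplifying assumptions, not perl's implementation: ToySV, ToyFrame and the toy_* functions are hypothetical names, the "stack" holds pointers to tiny structs, and a fixed pool stands in for freshly created SVs.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX 8

typedef struct { int iv; } ToySV;      /* toy scalar holding one integer */

typedef struct {
    ToySV *slot[TOY_MAX];   /* the stack stores pointers, not values */
    size_t len;
    ToySV  targ;            /* the single reusable target */
    ToySV  fresh[TOY_MAX];  /* pool standing in for newly created SVs */
    size_t nfresh;
} ToyFrame;

static void toy_frame_init(ToyFrame *f)
{
    f->len = 0;
    f->nfresh = 0;
    f->targ.iv = 0;
}

/* Models a TARG-based push: set the target, then push a pointer to it. */
static void toy_push_targ(ToyFrame *f, int v)
{
    assert(f->len < TOY_MAX);
    f->targ.iv = v;
    f->slot[f->len++] = &f->targ;
}

/* Models pushing a distinct SV for every value, bypassing the target. */
static void toy_push_fresh(ToyFrame *f, int v)
{
    assert(f->len < TOY_MAX && f->nfresh < TOY_MAX);
    ToySV *sv = &f->fresh[f->nfresh++];
    sv->iv = v;
    f->slot[f->len++] = sv;
}
```

Calling toy_push_targ with 10 and then 20 leaves two identical pointers on the toy stack, both reading 20, whereas two calls to toy_push_fresh leave 10 and 20 as intended.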
The question remains of when the SVs which are targets for opcodes are created. The
answer is that they are created when the current unit -- a subroutine or a file (for opcodes
for statements outside of subroutines) -- is compiled. During this time a special anonymous
Perl array is created, which is called a scratchpad for the current unit.
A scratchpad keeps SVs which are lexicals for the current unit and are targets for opcodes.
One can deduce that an SV lives on a scratchpad by looking at its flags: lexicals have SVs_PADMY
set, and targets have SVs_PADTMP set.
The correspondence between OPs and targets is not 1-to-1. Different OPs in the
compile tree of the unit can use the same target, if this would not conflict with the expected
life of the temporary.
Strictly speaking, it is not 100% true that a compiled unit contains a pointer to the scratchpad AV.
Instead, it contains a pointer to an AV of (initially) one element, and this element is the
scratchpad AV. Why do we need an extra level of indirection?
The answer is recursion, and maybe threads. Both of these can create several
execution pointers going into the same subroutine. For the subroutine-child not to overwrite the
temporaries for the subroutine-parent (the lifespan of which covers the call to the child), the
parent and the child should have different scratchpads. (And the lexicals should be
separate anyway!)
So each subroutine is born with an array of scratchpads (of length 1). On each entry to the
subroutine, the current depth of the recursion is checked against the length
of this array, and if the depth is greater, a new scratchpad is created and pushed into the array.
The targets on this scratchpad are undefs, but they are already marked
with correct flags.
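The depth check described above can be sketched as a toy model in plain C. It is only an illustration of the bookkeeping, not perl's actual data structures: ToyPad, ToySub and the toy_sub_* functions are hypothetical names, and a zero-filled struct stands in for a scratchpad whose targets start out undefined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAD_SLOTS 4

typedef struct { int slot[PAD_SLOTS]; } ToyPad;   /* one toy scratchpad */

typedef struct {
    ToyPad **pads;    /* array of scratchpads; index is depth - 1 */
    size_t   npads;   /* current length of that array */
    size_t   depth;   /* current recursion depth, 0 when not running */
} ToySub;

static ToyPad *new_pad(void)
{
    ToyPad *p = (ToyPad *)calloc(1, sizeof *p);   /* zeroed slots */
    if (p == NULL)
        abort();
    return p;
}

/* A subroutine is born with an array of scratchpads of length 1. */
static void toy_sub_init(ToySub *s)
{
    s->pads = (ToyPad **)malloc(sizeof *s->pads);
    if (s->pads == NULL)
        abort();
    s->pads[0] = new_pad();
    s->npads = 1;
    s->depth = 0;
}

/* On entry: if the depth exceeds the array length, create and push
 * a new scratchpad. Returns the scratchpad for the current depth. */
static ToyPad *toy_sub_enter(ToySub *s)
{
    s->depth++;
    if (s->depth > s->npads) {
        ToyPad **grown =
            (ToyPad **)realloc(s->pads, s->depth * sizeof *grown);
        if (grown == NULL)
            abort();
        s->pads = grown;
        s->pads[s->npads++] = new_pad();
    }
    return s->pads[s->depth - 1];
}

static void toy_sub_leave(ToySub *s)
{
    assert(s->depth > 0);
    s->depth--;
}

static void toy_sub_free(ToySub *s)
{
    for (size_t i = 0; i < s->npads; i++)
        free(s->pads[i]);
    free(s->pads);
}
```

In this model a recursive call receives a different pad from its caller, so the child cannot overwrite the parent's temporaries, and a pad created for a given depth is reused the next time that depth is reached.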