|
In the previous section we saw a network message that was constructed by prefixing the
binary message length to the actual message. Packing a length followed by that many bytes of
data is a frequently used recipe, since appending a terminating null byte won't work if a null
byte may be part of the data. Here is an example where both techniques are used: after two
null-terminated strings with the source and destination addresses, a Short Message (to a mobile
phone) is sent after a length byte:
my $msg = pack( 'Z*Z*CA*', $src, $dst, length( $sm ), $sm );
|
|
Unpacking this message can be done with the same template:
( $src, $dst, $len, $sm ) = unpack( 'Z*Z*CA*', $msg );
|
|
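To see the layout in action, here is a minimal round trip using the same template. The addresses and message text are made-up sample values, and the `2`-suffixed variable names are only there to keep the unpacked copies apart from the inputs.

```perl
use strict;
use warnings;

# Hypothetical sample values, not taken from any real message.
my ( $src, $dst, $sm ) = ( '4917012345', '4917067890', 'Hello, world' );

# Two null-terminated strings, one length byte, then the message text.
my $msg = pack( 'Z*Z*CA*', $src, $dst, length($sm), $sm );

# Dump the buffer as hex to make the null bytes and the length byte visible.
print unpack( 'H*', $msg ), "\n";

# Unpack with the same template; A* takes everything after the length byte.
my ( $src2, $dst2, $len, $sm2 ) = unpack( 'Z*Z*CA*', $msg );
print "$src2 -> $dst2 ($len bytes): $sm2\n";
```

Note that the length byte is not actually needed for unpacking here, because A* simply takes the rest of the buffer.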
There's a subtle trap lurking here: adding another field after the Short Message (in
variable $sm) is all right when packing, but the result cannot be unpacked naively:
# pack a message
my $msg = pack( 'Z*Z*CA*C', $src, $dst, length( $sm ), $sm, $prio );
# unpack fails - $prio remains undefined!
( $src, $dst, $len, $sm, $prio ) = unpack( 'Z*Z*CA*C', $msg );
|
|
The pack code A* gobbles up all remaining bytes, and $prio remains
undefined! Before we let disappointment dampen the morale: Perl has a trump card for
this trick too, just a little further up the sleeve. Watch this:
# pack a message: ASCIIZ, ASCIIZ, length/string, byte
my $msg = pack( 'Z* Z* C/A* C', $src, $dst, $sm, $prio );
# unpack
( $src, $dst, $sm, $prio ) = unpack( 'Z* Z* C/A* C', $msg );
|
|
Combining two pack codes with a slash (/) associates them with a single value
from the argument list. In pack, the length of the argument is taken and packed
according to the first code while the argument itself is added after being converted with the
template code after the slash. This saves us the trouble of inserting the length
call, but it is in unpack where we really score: The value of the length byte marks
the end of the string to be taken from the buffer. Since this combination only makes sense
when the second pack code is a*, A* or Z*, Perl won't let you use it
otherwise.
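The difference between the naive template and the slash form can be checked directly. The following is a small sketch with made-up values for the addresses, the text and the priority; it packs the same data both ways and then unpacks each buffer with its own template.

```perl
use strict;
use warnings;

# Hypothetical sample values.
my ( $src, $dst, $sm, $prio ) = ( 'alice', 'bob', 'Hi there', 3 );

# Naive layout: explicit length byte, then A*, then the priority byte.
my $naive = pack( 'Z* Z* C A* C', $src, $dst, length($sm), $sm, $prio );
my @naive = unpack( 'Z* Z* C A* C', $naive );

# A* has swallowed the priority byte, so there is no fifth element.
print defined( $naive[4] ) ? "prio = $naive[4]\n" : "prio is undefined\n";

# Slash form: C/A* packs the length and the string from one argument,
# and on unpack the length byte limits how much A* may take.
my $msg = pack( 'Z* Z* C/A* C', $src, $dst, $sm, $prio );
my ( $s, $d, $text, $p ) = unpack( 'Z* Z* C/A* C', $msg );
print "$s -> $d: '$text' (prio $p)\n";

# Both pack calls produce the same bytes; only the unpack side differs.
print "identical buffers\n" if $naive eq $msg;
```

The buffers being identical shows that the slash changes nothing about the wire format; it only tells unpack where the string ends.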
The pack code preceding / may be anything that's fit to represent a number: all
the numeric binary pack codes, and even text codes such as A4 or Z*:
# pack/unpack a string preceded by its length in ASCII
my $buf = pack( 'A4/A*', "Humpty-Dumpty" );
# unpack $buf: '13  Humpty-Dumpty' (A4 pads the length with spaces to 4 characters)
my $txt = unpack( 'A4/A*', $buf );
|
|
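As a quick check of what the length prefix looks like with different first codes, the sketch below packs the same string with a text length (A4) and with a binary one (n, a 16-bit big-endian integer, chosen here purely for illustration).

```perl
use strict;
use warnings;

# Text length: "13" padded with spaces to four characters, then the string.
my $buf = pack( 'A4/A*', 'Humpty-Dumpty' );
print "[$buf]\n";                        # prints [13  Humpty-Dumpty]

# Unpacking reads the 4-character count, then that many characters.
my $txt = unpack( 'A4/A*', $buf );
print "$txt\n";                          # prints Humpty-Dumpty

# Binary length: two bytes, big-endian, followed by the string.
my $bin = pack( 'n/a*', 'Humpty-Dumpty' );
print unpack( 'H4', $bin ), "\n";        # prints 000d (13 in hex)
my $txt2 = unpack( 'n/a*', $bin );
```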
/ is not implemented in Perls before 5.6, so if your code is required to work on
older Perls you'll need to unpack( 'Z* Z* C' ) to get the length, then use it to
make a new unpack string. For example:
# pack a message: ASCIIZ, ASCIIZ, length, string, byte (5.005 compatible)
my $msg = pack( 'Z* Z* C A* C', $src, $dst, length $sm, $sm, $prio );
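# unpack: first fetch the length byte, then build a template from it
( undef, undef, $len ) = unpack( 'Z* Z* C', $msg );
# x skips over the length byte; A$len takes exactly $len bytes
( $src, $dst, $sm, $prio ) = unpack( "Z* Z* x A$len C", $msg );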
|
|
But that second unpack is rushing ahead. It isn't using a simple literal string
for the template. So maybe we should introduce...
So far, we've seen literals used as templates. If the list of pack items doesn't have a fixed
length, an expression constructing the template is required. Here's an example: to store
named string values in a way that can be conveniently parsed by a C program, we create a
sequence of names and null-terminated ASCII strings, with = between the name and
the value, followed by an additional delimiting null byte. Here's how:
my $env = pack( 'A*A*Z*' x keys( %Env ) . 'C',
map( { ( $_, '=', $Env{$_} ) } keys( %Env ) ), 0 );
|
|
Let's examine the cogs of this byte mill, one by one. There's the map call,
creating the items we intend to stuff into the $env buffer: to each key (in $_)
it adds the = separator and the hash entry value. Each triplet is packed with the
template code sequence A*A*Z*, which is repeated once for each key. (Yes, the number
of keys is what the keys function returns in scalar context.) To get the very last null
byte, we add a 0 at the end of the pack list, to be packed with C.
(Attentive readers may have noticed that we could have omitted the 0.)
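The pieces described above can be tried out on a small hash. This is only a sketch: the contents of %Env are made up, and the keys are sorted (which the original code does not do) purely so that the resulting buffer is reproducible.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my %Env = ( HOME => '/home/me', SHELL => '/bin/sh' );

# Sorted only to get a predictable order for this demonstration.
my @names = sort keys %Env;

# One A*A*Z* group per name, plus a final C for the delimiting null byte.
my $template = 'A*A*Z*' x @names . 'C';
print "$template\n";                 # prints A*A*Z*A*A*Z*C

my $env = pack( $template, map( { ( $_, '=', $Env{$_} ) } @names ), 0 );

# Make the null bytes visible before printing.
( my $shown = $env ) =~ s/\0/\\0/g;
print "$shown\n";                    # prints HOME=/home/me\0SHELL=/bin/sh\0\0
```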
For the reverse operation, we'll have to determine the number of items in the buffer before
we can let unpack rip it apart:
my $n = $env =~ tr/\0// - 1;
# the limit of 2 keeps any = inside a value intact
my %env = map( split( /=/, $_, 2 ), unpack( 'Z*' x $n, $env ) );
|
|
The tr counts the null bytes; subtracting one accounts for the extra delimiting null at
the end. The unpack call returns a list of name=value strings, each of which is split into a
name and a value by the map expression.
|
|
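Putting both directions together gives a complete round trip. The sketch below uses made-up hash contents, including one value that itself contains an = sign; split is given a limit of 2 so that such a value stays in one piece.

```perl
use strict;
use warnings;

# Hypothetical sample data; OPTS has an = inside its value.
my %Env = ( HOME => '/home/me', OPTS => 'a=b' );

# Pack: name, =, null-terminated value for each key, then a final null byte.
my $env = pack( 'A*A*Z*' x keys(%Env) . 'C',
    map( { ( $_, '=', $Env{$_} ) } keys(%Env) ), 0 );

# Count the null bytes; one of them is the trailing delimiter.
my $n = $env =~ tr/\0// - 1;

# Unpack $n null-terminated strings and split each at the first = only.
my %env = map( split( /=/, $_, 2 ), unpack( 'Z*' x $n, $env ) );

print "$_ => $env{$_}\n" for sort keys %env;
```

Because hash order is not predictable, the byte order inside the buffer may differ between runs, but the recovered hash is the same either way.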