
Lengths and Widths

String Lengths

In the previous section we've seen a network message that was constructed by prefixing the binary message length to the actual message. You'll find that packing a length followed by so many bytes of data is a frequently used recipe, since appending a null byte won't work if a null byte may be part of the data. Here is an example where both techniques are used: after two null-terminated strings with the source and destination addresses, a Short Message (to a mobile phone) is sent after a length byte:

 
   my $msg = pack( 'Z*Z*CA*', $src, $dst, length( $sm ), $sm );  

Unpacking this message can be done with the same template:

 
   ( $src, $dst, $len, $sm ) = unpack( 'Z*Z*CA*', $msg );  

There's a subtle trap lurking in the offing: Adding another field after the Short Message (in variable $sm) is all right when packing, but this cannot be unpacked naively:

 
   # pack a message
   my $msg = pack( 'Z*Z*CA*C', $src, $dst, length( $sm ), $sm, $prio );

   # unpack fails - $prio remains undefined!
   ( $src, $dst, $len, $sm, $prio ) = unpack( 'Z*Z*CA*C', $msg );  
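To see the effect with concrete data, here is the same round trip as a small self-contained script. The sample values are hypothetical and chosen only for illustration; the priority is deliberately not a space or null byte, since A would strip those from the end of the string on unpacking.

```perl
use strict;
use warnings;

# Hypothetical sample values, chosen only for illustration.
my ( $src, $dst, $sm, $prio ) = ( '12345', '67890', 'Hello', 7 );

my $msg = pack( 'Z*Z*CA*C', $src, $dst, length( $sm ), $sm, $prio );

# Unpack with the same template and see what each field receives.
my ( $s, $d, $len, $text, $p ) = unpack( 'Z*Z*CA*C', $msg );

# The length byte says 5, but the text field holds one byte more.
printf "length byte: %d, text has %d bytes\n", $len, length( $text );
print defined( $p ) ? "prio: $p\n" : "prio is undefined\n";
```

The length byte is unpacked correctly, yet nothing in the template tells unpack to use it.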

The pack code A* gobbles up all remaining bytes, so nothing is left for the final C and $prio remains undefined! Before we let disappointment dampen morale: Perl has a trump card for this trick too, just a little further up the sleeve. Watch this:

 
   # pack a message: ASCIIZ, ASCIIZ, length/string, byte
   my $msg = pack( 'Z* Z* C/A* C', $src, $dst, $sm, $prio );

   # unpack
   ( $src, $dst, $sm, $prio ) = unpack( 'Z* Z* C/A* C', $msg );  

Combining two pack codes with a slash (/) associates them with a single value from the argument list. In pack, the length of the argument is taken and packed according to the first code, while the argument itself is added after being converted with the template code after the slash. This saves us the trouble of inserting the length call, but it is in unpack where we really score: the value of the length byte marks the end of the string to be taken from the buffer. Since this combination makes sense only when the second pack code is a*, A* or Z*, Perl won't let you use anything else.
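As a rough illustration of what the slash does to the buffer, the following sketch packs a message and dumps it as hex so that the length byte is visible. The sample values are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample values, chosen only for illustration.
my ( $src, $dst, $sm, $prio ) = ( '12345', '67890', 'Hello', 7 );

# No length() call needed: C/A* packs the length byte for us.
my $msg = pack( 'Z* Z* C/A* C', $src, $dst, $sm, $prio );

# Hex dump: two null-terminated strings, 05, "Hello", 07.
my $hex = unpack( 'H*', $msg );
print "$hex\n";

# The length byte is consumed by unpack and not returned as a value.
my ( $s, $d, $text, $p ) = unpack( 'Z* Z* C/A* C', $msg );
print "$s $d $text $p\n";
```

The length byte 05 sits between the second terminating null and the text, which is how unpack knows to stop before the priority byte.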

The pack code preceding / may be anything that's fit to represent a number: All the numeric binary pack codes, and even text codes such as A4 or Z*:

 
   # pack/unpack a string preceded by its length in ASCII
   my $buf = pack( 'A4/A*', "Humpty-Dumpty" );
   # unpack $buf: '13  Humpty-Dumpty'
   my $txt = unpack( 'A4/A*', $buf );  
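For comparison, this sketch packs the same string once with a binary length code and once with a text length code, and prints the resulting buffer sizes. The choice of n as the binary code is just one possibility picked for the example.

```perl
use strict;
use warnings;

my $text = 'Humpty-Dumpty';

# Binary length: 16-bit big-endian count, then the raw bytes.
my $bin = pack( 'n/a*', $text );
# Text length: the count as a 4-character, space-padded field.
my $asc = pack( 'A4/A*', $text );

printf "n/a*  : %d bytes\n", length( $bin );
printf "A4/A* : %d bytes, starts with '%s'\n",
    length( $asc ), substr( $asc, 0, 4 );

# Both forms give back the original string.
my ( $from_bin ) = unpack( 'n/a*',  $bin );
my ( $from_asc ) = unpack( 'A4/A*', $asc );
print "round trip ok\n" if $from_bin eq $text && $from_asc eq $text;
```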

/ is not implemented in Perls before 5.6, so if your code is required to work on older Perls you'll need to unpack( 'Z* Z* C' ) to get the length, then use it to make a new unpack string. For example:

 
   # pack a message: ASCIIZ, ASCIIZ, length, string, byte (5.005 compatible)
   my $msg = pack( 'Z* Z* C A* C', $src, $dst, length $sm, $sm, $prio );

   # unpack
   ( undef, undef, $len) = unpack( 'Z* Z* C', $msg );
   ($src, $dst, $sm, $prio) = unpack ( "Z* Z* x A$len C", $msg );  

But that second unpack is rushing ahead. It isn't using a simple literal string for the template. So maybe we should introduce...

Dynamic Templates

So far, we've seen literals used as templates. If the list of pack items doesn't have a fixed length, an expression constructing the template has to be used. Here's an example: To store named string values in a way that can be conveniently parsed by a C program, we create a sequence of names and null-terminated ASCII strings, with = between the name and the value, followed by an additional delimiting null byte. Here's how:

 
   my $env = pack( 'A*A*Z*' x keys( %Env ) . 'C',
                   map( { ( $_, '=', $Env{$_} ) } keys( %Env ) ), 0 );  

Let's examine the cogs of this byte mill, one by one. There's the map call, creating the items we intend to stuff into the $env buffer: to each key (in $_) it adds the = separator and the hash entry value. Each triplet is packed with the template code sequence A*A*Z* that is multiplied with the number of keys. (Yes, that's what the keys function returns in scalar context.) To get the very last null byte, we add a 0 at the end of the pack list, to be packed with C. (Attentive readers may have noticed that we could have omitted the 0.)
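To make the pieces visible, the sketch below builds the template separately and prints both the template and the packed buffer. The hash contents are hypothetical, and the keys are sorted only so that the output is repeatable.

```perl
use strict;
use warnings;

# Hypothetical sample data; keys are sorted only for repeatable output.
my %Env  = ( HOME => '/home/me', TERM => 'vt100' );
my @keys = sort keys %Env;

# x evaluates @keys in scalar context, giving the number of keys.
my $template = 'A*A*Z*' x @keys . 'C';
print "$template\n";

my $env = pack( $template, map( { ( $_, '=', $Env{$_} ) } @keys ), 0 );

# Replace null bytes with a literal \0 so the layout can be printed.
( my $shown = $env ) =~ s/\0/\\0/g;
print "$shown\n";
```

With two keys the template comes out as A*A*Z*A*A*Z*C, and the buffer ends in two null bytes: one terminating the last value and one delimiting the whole sequence.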

For the reverse operation, we'll have to determine the number of items in the buffer before we can let unpack rip it apart:

 
   my $n = $env =~ tr/\0// - 1;
   my %env = map( split( /=/, $_ ), unpack( 'Z*' x $n, $env ) );  

The tr counts the null bytes; subtracting one discounts the final delimiter. The unpack call returns a list of name=value strings, each of which is split into a name and a value by the map expression.
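Here is a complete round trip as a sketch, with hypothetical hash contents. It differs from the code above in one respect, which is an addition and not part of the original: split is given a limit of 2, so that a value which itself contains = is not cut into further pieces.

```perl
use strict;
use warnings;

# Hypothetical sample data; note the '=' inside the second value.
my %Env = ( HOME => '/home/me', OPTS => 'a=1' );
my $env = pack( 'A*A*Z*' x keys( %Env ) . 'C',
                map( { ( $_, '=', $Env{$_} ) } keys( %Env ) ), 0 );

# One null byte per entry, plus the extra delimiter at the end.
my $n = $env =~ tr/\0// - 1;

# Limit of 2 (an addition here): split only at the first '='.
my %env = map( split( /=/, $_, 2 ), unpack( 'Z*' x $n, $env ) );

print "$_ => $env{$_}\n" for sort keys %env;
```

Without the limit, the OPTS entry would contribute three list elements instead of two and the resulting hash would be scrambled.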

 

 

 


Disclaimer: This documentation is provided only for the benefits of our web hosting customers.
For authoritative source of the documentation, please refer to http://www.perldoc.com