perlpacktut - tutorial on pack and unpack


pack and unpack are two functions for transforming data according to a user-defined template, between the guarded way Perl stores values and some well-defined representation as might be required in the environment of a Perl program. Unfortunately, they're also two of the most misunderstood and most often overlooked functions that Perl provides. This tutorial will demystify them for you.

The Basic Principle

Most programming languages don't shelter the memory where variables are stored. In C, for instance, you can take the address of some variable, and the sizeof operator tells you how many bytes are allocated to the variable. Using the address and the size, you may access the storage to your heart's content.

In Perl, you just can't access memory at random, but the structural and representational conversion provided by pack and unpack is an excellent alternative. The pack function converts values to a byte sequence containing representations according to a given specification, the so-called "template" argument. unpack is the reverse process, deriving some values from the contents of a string of bytes. (Be cautioned, however, that not all that has been packed together can be neatly unpacked - a very common experience as seasoned travellers are likely to confirm.)
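As a first illustration of this round trip, here is a minimal sketch. The template codes n (an unsigned 16-bit integer in big-endian order) and A (a space-padded string) and the sample values are chosen here purely for demonstration.

```perl
use strict;
use warnings;

# Pack two values into one byte string: an unsigned 16-bit big-endian
# integer ('n') followed by a 3-byte, space-padded string ('A3').
my $bytes = pack( 'n A3', 258, 'ab' );

# 258 is 0x0102, so the string holds the bytes 01 02 'a' 'b' ' '.
printf "%d bytes: %s\n", length( $bytes ), unpack( 'H*', $bytes );
# prints: 5 bytes: 0102616220

# unpack reverses the process; 'A' strips the trailing padding again.
my ( $num, $str ) = unpack( 'n A3', $bytes );
print "$num '$str'\n";
# prints: 258 'ab'
```

The same template is used in both directions here, which is the usual way to keep packing and unpacking in step.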

Why, you may ask, would you need a chunk of memory containing some values in binary representation? One good reason is input and output accessing some file, a device, or a network connection, whereby this binary representation is either forced on you or will give you some benefit in processing. Another cause is passing data to some system call that is not available as a Perl function: syscall requires you to provide parameters stored in the way it happens in a C program. Even text processing (as shown in the next section) may be simplified with judicious usage of these two functions.

To see how (un)packing works, we'll start with a simple template code where the conversion is in low gear: between the contents of a byte sequence and a string of hexadecimal digits. Let's use unpack, since this is likely to remind you of a dump program, or some desperate last message unfortunate programs are wont to throw at you before they expire into the wild blue yonder. Assuming that the variable $mem holds a sequence of bytes that we'd like to inspect without assuming anything about its meaning, we can write

   my( $hex ) = unpack( 'H*', $mem );
   print "$hex\n";  

whereupon we might see something like this, with each pair of hex digits corresponding to a byte:

   41204d414e204120504c414e20412043414e414c2050414e414d41

What was in this chunk of memory? Numbers, characters, or a mixture of both? Assuming that we're on a computer where ASCII (or some similar) encoding is used: hexadecimal values in the range 0x41 - 0x5A indicate an uppercase letter, and 0x20 encodes a space. So we might assume it is a piece of text, which some are able to read like a tabloid; but others will have to get hold of an ASCII table and relive that first-grader feeling. Not caring too much about which way to read this, we note that unpack with the template code H converts the contents of a sequence of bytes into the customary hexadecimal notation. Since "a sequence of" is a pretty vague indication of quantity, H has been defined to convert just a single hexadecimal digit unless it is followed by a repeat count. An asterisk for the repeat count means to use whatever remains.
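The effect of the repeat count is easiest to see side by side. The following is a small sketch; the contents of $mem are a sample chosen here for illustration, and the output shown assumes ASCII.

```perl
use strict;
use warnings;

my $mem = "A MAN";   # sample bytes, assumed for this illustration

my ( $one )  = unpack( 'H',  $mem );   # no count: a single hex digit
my ( $two )  = unpack( 'H2', $mem );   # two digits, i.e. the first byte
my ( $rest ) = unpack( 'H*', $mem );   # asterisk: everything that remains

print "$one $two $rest\n";             # on ASCII: 4 41 41204d414e
```

Note that the count refers to hex digits, not bytes, so H2 is what covers one complete byte.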

The inverse operation - packing byte contents from a string of hexadecimal digits - is just as easily written. For instance:

   my $s = pack( 'H2' x 10, map { "3$_" } ( 0..9 ) );
   print "$s\n";  

Since we feed a list of ten 2-digit hexadecimal strings to pack, the pack template should contain ten pack codes. If this is run on a computer with ASCII character coding, it will print 0123456789.
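To make the relation between list items and pack codes explicit, here is a short sketch comparing two ways of arriving at the same string. The variable names are made up for this example.

```perl
use strict;
use warnings;

my @digits = map { "3$_" } ( 0 .. 9 );    # '30', '31', ... '39'

# Ten list items need ten codes: 'H2' x 10 is 'H2H2H2H2H2H2H2H2H2H2'.
my $template  = 'H2' x 10;
my $from_list = pack( $template, @digits );

# One long string of hex digits needs only a single code.
my $from_string = pack( 'H*', join( '', @digits ) );

print "$from_list\n";                     # 0123456789 on ASCII
print $from_list eq $from_string ? "same\n" : "different\n";   # same
```

Each H code consumes exactly one item from the list, however many digits that item contains.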

Pack Recipes

Here is a collection of (possibly) useful canned recipes for pack and unpack:

    # Convert IP address for socket functions
    pack( "C4", split /\./, "123.4.5.6" );

    # Count the bits in a chunk of memory (e.g. a select vector)
    unpack( '%32b*', $mask );

    # Determine the endianness of your system
    $is_little_endian = unpack( 'c', pack( 's', 1 ) );
    $is_big_endian = unpack( 'xc', pack( 's', 1 ) );

    # Determine the number of bits in a native integer
    $bits = unpack( '%32b*', pack( 'I!', ~0 ) );

    # Prepare argument for the nanosleep system call
    my $timespec = pack( 'L!L!', $secs, $nanosecs );  
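To see what some of these recipes produce, the following sketch runs three of them on known input. The address and the bit positions are sample values chosen here, not taken from any real configuration.

```perl
use strict;
use warnings;

# IP address: four unsigned bytes, one per dotted component.
my $addr = pack( 'C4', split /\./, '192.0.2.1' );    # sample address
print length( $addr ), " bytes: ", unpack( 'H*', $addr ), "\n";
# prints: 4 bytes: c0000201

# Bit count: '%32b*' adds up the 0/1 values that 'b*' would return.
my $mask = '';
vec( $mask, $_, 1 ) = 1 for 0, 3, 9;     # set three bits, select-vector style
my $count = unpack( '%32b*', $mask );
print "bits set: $count\n";              # bits set: 3

# Endianness: inspect the first byte of a packed native short.
my $first = unpack( 'c', pack( 's', 1 ) );
print $first ? "little-endian\n" : "big-endian\n";
```

The last line depends on the machine the script runs on, so no output is shown for it.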

For a simple memory dump we unpack some bytes into just as many pairs of hex digits, and use map to handle the traditional spacing - 16 bytes to a line:

    my $i;
    print map { ++$i % 16 ? "$_ " : "$_\n" }
          unpack( 'H2' x length( $mem ), $mem ),
          length( $mem ) % 16 ? "\n" : '';
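If the dump is needed in more than one place, the same idea can be wrapped in a subroutine. This is only a sketch: hexdump is a hypothetical name, and the grouped template '(H2)*' assumes Perl 5.8 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: return a hex dump of $mem, 16 bytes to a line.
sub hexdump {
    my ( $mem ) = @_;
    my @pairs = unpack( '(H2)*', $mem );     # one 2-digit string per byte
    my $out = '';
    # Take up to 16 pairs at a time until none are left.
    while ( my @row = splice( @pairs, 0, 16 ) ) {
        $out .= join( ' ', @row ) . "\n";
    }
    return $out;
}

print hexdump( "A MAN A PLAN A CANAL PANAMA" );
```

Building the string row by row leaves no trailing space on any line and ends every line, including a short last one, with a newline.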

Funnies Section

    # Pulling digits out of nowhere...
    print unpack( 'C', pack( 'x' ) ),
          unpack( '%B*', pack( 'A' ) ),
          unpack( 'H', pack( 'A' ) ),
          unpack( 'A', unpack( 'C', pack( 'A' ) ) ), "\n";

    # One for the road ;-)
    my $advice = pack( 'all u can in a van' );


Simon Cozens and Wolfgang Laun.
