
Data: Hashes (Associative Arrays)

How do I process an entire hash?

Use the each() function (see perlfunc/each) if you don't care whether it's sorted:

 
    while ( ($key, $value) = each %hash ) {
        print "$key = $value\n";
    }

If you want it sorted, you'll have to use foreach() on the result of sorting the keys as shown in an earlier question.
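As a minimal sketch of the sorted variant, the loop below walks the keys in string order and looks each value up in turn. The hash contents are made up for illustration.

```perl
use strict;
use warnings;

# Sample data, invented for this example.
my %hash = (pear => 3, apple => 7, fig => 1);

# sort keys %hash returns the keys in string order;
# each value is then fetched by key.
my @lines;
foreach my $key (sort keys %hash) {
    push @lines, "$key = $hash{$key}";
}
print "$_\n" for @lines;
```

The lines come out as apple, fig, pear, regardless of the hash's internal order.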

What happens if I add or remove keys from a hash while iterating over it?

Don't do that. :-)

[lwall] In Perl 4, you were not allowed to modify a hash at all while iterating over it. In Perl 5 you can delete from it, but you still can't add to it, because that might cause a doubling of the hash table, in which half the entries get copied up to the new top half of the table, at which point you've totally bamboozled the iterator code. Even if the table doesn't double, there's no telling whether your new entry will be inserted before or after the current iterator position.

Either treasure up your changes and make them after the iterator finishes or use keys to fetch all the old keys at once, and iterate over the list of keys.
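Both strategies can be sketched as follows. The hash contents and the names %pending, "_doubled" and the uppercase renaming are assumptions chosen only to make the example concrete.

```perl
use strict;
use warnings;

my %hash = (a => 1, b => 2, c => 3);

# Strategy 1: collect the additions while each() runs,
# then apply them once the iterator has finished.
my %pending;
while (my ($key, $value) = each %hash) {
    $pending{"${key}_doubled"} = $value * 2;
}
@hash{ keys %pending } = values %pending;

# Strategy 2: keys() returns a complete list up front, so the loop
# walks that list rather than the hash's own iterator. Adding and
# deleting entries inside the loop is then safe.
foreach my $key (keys %hash) {
    $hash{ uc $key } = delete $hash{$key};
}

print join(",", sort keys %hash), "\n";
```

After both passes the hash holds only the uppercase keys, six in total.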

How do I look up a hash element by value?

Create a reverse hash:

 
    %by_value = reverse %by_key;
    $key = $by_value{$value};  

That's not particularly efficient. It would be more space-efficient to use:

 
    while (($key, $value) = each %by_key) {
        $by_value{$value} = $key;
    }

If your hash could have repeated values, the methods above will only find one of the associated keys. This may or may not worry you. If it does worry you, you can always reverse the hash into a hash of arrays instead:

 
    while (($key, $value) = each %by_key) {
        push @{$key_list_by_value{$value}}, $key;
    }
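To show how such a hash of arrays is read back, here is a small sketch. The colour data is invented for the example, and the result is sorted only because each() returns entries in no predictable order.

```perl
use strict;
use warnings;

# Sample data: two keys share the value 'warm'.
my %by_key = (red => 'warm', orange => 'warm', blue => 'cool');

# Reverse into a hash of arrays: each value maps to a list of keys.
my %key_list_by_value;
while (my ($key, $value) = each %by_key) {
    push @{ $key_list_by_value{$value} }, $key;
}

# Dereference the array to get every key that had this value.
my @warm = sort @{ $key_list_by_value{warm} };
print "warm: @warm\n";
```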

How can I know how many entries are in a hash?

If you mean how many keys, then all you have to do is use the keys() function in a scalar context:

 
    $num_keys = keys %hash;  

The keys() function also resets the iterator, which means that you may see strange results if you use this between uses of other hash operators such as each().

How do I sort a hash (optionally by value instead of key)?

Internally, hashes are stored in a way that prevents you from imposing an order on key-value pairs. Instead, you have to sort a list of the keys or values:

 
    @keys = sort keys %hash;                               # sorted by key
    @keys = sort { $hash{$a} cmp $hash{$b} } keys %hash;   # and by value

Here we'll do a reverse numeric sort by value, and if two values are identical, sort by length of key (longest first), or if that fails, by straight ASCII comparison of the keys (well, possibly modified by your locale; see perllocale).

 
    @keys = sort {
        $hash{$b} <=> $hash{$a}
                  ||
        length($b) <=> length($a)
                  ||
        $a cmp $b
    } keys %hash;
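A worked run of that three-level comparison may make the tie-breaking easier to follow. The data below is made up so that each level of the comparison gets exercised.

```perl
use strict;
use warnings;

# Three keys tie on value 5; two of those also tie on length.
my %hash = (bb => 5, a => 5, cc => 5, zebra => 9, q => 1);

my @keys = sort {
    $hash{$b} <=> $hash{$a}        # highest value first
              ||
    length($b) <=> length($a)      # then longest key first
              ||
    $a cmp $b                      # then plain string order
} keys %hash;

print "@keys\n";
```

The result is zebra, bb, cc, a, q: zebra has the largest value, bb and cc beat a on length and are ordered by string comparison, and q has the smallest value.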

How can I always keep my hash sorted?

You can look into using the DB_File module and tie() using the $DB_BTREE hash bindings as documented in DB_File/"In Memory Databases". The Tie::IxHash module from CPAN might also be instructive.

What's the difference between "delete" and "undef" with hashes?

Hashes contain pairs of scalars: the first is the key, the second is the value. The key will be coerced to a string, although the value can be any kind of scalar: string, number, or reference. If a key $key is present in %hash, exists($hash{$key}) will return true. The value for a given key can be undef, in which case $hash{$key} will be undef while exists $hash{$key} will return true. This corresponds to ($key, undef) being in the hash.

Pictures help... here's the %hash table:

 
	  keys  values
	+------+------+
	|  a   |  3   |
	|  x   |  7   |
	|  d   |  0   |
	|  e   |  2   |
	+------+------+  

And these conditions hold:

 
	$hash{'a'}                       is true
	$hash{'d'}                       is false
	defined $hash{'d'}               is true
	defined $hash{'a'}               is true
	exists $hash{'a'}                is true (Perl5 only)
	grep ($_ eq 'a', keys %hash)     is true  

If you now say

 
	undef $hash{'a'}  

your table now reads:

 
	  keys  values
	+------+------+
	|  a   | undef|
	|  x   |  7   |
	|  d   |  0   |
	|  e   |  2   |
	+------+------+  

and these conditions now hold; changes in caps:

 
	$hash{'a'}                       is FALSE
	$hash{'d'}                       is false
	defined $hash{'d'}               is true
	defined $hash{'a'}               is FALSE
	exists $hash{'a'}                is true (Perl5 only)
	grep ($_ eq 'a', keys %hash)     is true  

Notice the last two: you have an undef value, but a defined key!

Now, consider this:

 
	delete $hash{'a'}  

your table now reads:

 
	  keys  values
	+------+------+
	|  x   |  7   |
	|  d   |  0   |
	|  e   |  2   |
	+------+------+  

and these conditions now hold; changes in caps:

 
	$hash{'a'}                       is false
	$hash{'d'}                       is false
	defined $hash{'d'}               is true
	defined $hash{'a'}               is false
	exists $hash{'a'}                is FALSE (Perl5 only)
	grep ($_ eq 'a', keys %hash)     is FALSE  

See, the whole entry is gone!
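The tables above can be checked with a short script. This is only a sketch; the variable names that record the results are invented for the example.

```perl
use strict;
use warnings;

my %hash = (a => 3, x => 7, d => 0, e => 2);

# undef clears the value but leaves the key in place.
undef $hash{a};
my $exists_after_undef  = exists  $hash{a} ? 1 : 0;
my $defined_after_undef = defined $hash{a} ? 1 : 0;
my $count_after_undef   = keys %hash;

# delete removes the whole entry, key and value.
delete $hash{a};
my $exists_after_delete = exists $hash{a} ? 1 : 0;
my $count_after_delete  = keys %hash;

print "after undef:  exists=$exists_after_undef defined=$defined_after_undef keys=$count_after_undef\n";
print "after delete: exists=$exists_after_delete keys=$count_after_delete\n";
```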

Why don't my tied hashes make the defined/exists distinction?

This depends on the tied hash's implementation of EXISTS(). For example, there isn't the concept of undef with hashes that are tied to DBM* files. It also means that exists() and defined() do the same thing with a DBM* file, and what they end up doing is not what they do with ordinary hashes.

How do I reset an each() operation part-way through?

Using keys %hash in scalar context returns the number of keys in the hash and resets the iterator associated with the hash. You may need to do this if you use last to exit a loop early so that when you re-enter it, the hash iterator has been reset.
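A small sketch of that situation: the first loop exits early, leaving the iterator part-way through, and the call to keys puts it back at the start. The hash contents are assumptions for the example.

```perl
use strict;
use warnings;

my %hash = (a => 1, b => 2, c => 3);

# Leave the loop after the first entry; the iterator stays where it is.
while (my ($key, $value) = each %hash) {
    last;
}

# keys in scalar context returns the count and resets the iterator.
my $num_keys = keys %hash;

# This pass now starts from the beginning and sees every entry.
my $visited = 0;
while (my ($key, $value) = each %hash) {
    $visited++;
}

print "keys=$num_keys visited=$visited\n";
```

Without the reset, the second loop would pick up where the first one stopped and visit only the remaining entries.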

How can I get the unique keys from two hashes?

First you extract the keys from the hashes into lists, then solve the "removing duplicates" problem described above. For example:

 
    %seen = ();
    for $element (keys(%foo), keys(%bar)) {
	$seen{$element}++;
    }
    @uniq = keys %seen;  

Or more succinctly:

 
    @uniq = keys %{{%foo,%bar}};  

Or if you really want to save space:

 
    %seen = ();
    while (defined ($key = each %foo)) {
        $seen{$key}++;
    }
    while (defined ($key = each %bar)) {
        $seen{$key}++;
    }
    @uniq = keys %seen;  

How can I store a multidimensional array in a DBM file?

Either stringify the structure yourself (no fun), or else get the MLDBM module (which uses Data::Dumper) from CPAN and layer it on top of either DB_File or GDBM_File.

How can I make my hash remember the order I put elements into it?

Use the Tie::IxHash module from CPAN.

 
    use Tie::IxHash;
    tie(%myhash, 'Tie::IxHash');
    for ($i=0; $i<20; $i++) {
        $myhash{$i} = 2*$i;
    }
    @keys = keys %myhash;
    # @keys = (0,1,2,3,...)  

Why does passing a subroutine an undefined element in a hash create it?

If you say something like:

 
    somefunc($hash{"nonesuch key here"});  

Then that element "autovivifies"; that is, it springs into existence whether you store something there or not. That's because functions get scalars passed in by reference. If somefunc() modifies $_[0], it has to be ready to write it back into the caller's version.

This has been fixed as of Perl5.004.

Normally, merely accessing a key's value for a nonexistent key does not cause that key to be forever there. This is different from awk's behavior.
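The behavior can be observed with a sketch like the one below, assuming Perl 5.004 or later. The subroutine names reads_only() and writes() and the key names are hypothetical, chosen for illustration.

```perl
use strict;
use warnings;

my %hash;

# Hypothetical callee that only looks at its argument.
sub reads_only { my ($value) = @_; return defined $value ? 1 : 0 }

# Hypothetical callee that assigns through $_[0], the alias
# to the caller's hash element.
sub writes { $_[0] = 'set by callee'; return }

# A plain read of a missing key.
my $copy = $hash{'plain read'};
my $after_plain_read = exists $hash{'plain read'} ? 1 : 0;

# Passing a missing element to a sub that does not modify it.
reads_only($hash{'nonesuch key here'});
my $after_call = exists $hash{'nonesuch key here'} ? 1 : 0;

# Passing a missing element to a sub that writes to $_[0].
writes($hash{'another key'});
my $after_write = exists $hash{'another key'} ? 1 : 0;

print "plain read: $after_plain_read, call: $after_call, write: $after_write\n";
```

Only the last case leaves a key behind, because the callee actually stored something through the alias.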

How can I make the Perl equivalent of a C structure/C++ class/hash or array of hashes or arrays?

Usually a hash ref, perhaps like this:

 
    $record = {
        NAME   => "Jason",
        EMPNO  => 132,
        TITLE  => "deputy peon",
        AGE    => 23,
        SALARY => 37_000,
        PALS   => [ "Norbert", "Rhys", "Phineas"],
    };  

References are documented in perlref and perlreftut. Examples of complex data structures are given in perldsc and perllol. Examples of structures and object-oriented classes are in perltoot.
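Reading and updating such a record goes through the arrow operator. The sketch below reuses the record from the example above; the variable names holding the results are invented for illustration.

```perl
use strict;
use warnings;

my $record = {
    NAME   => "Jason",
    EMPNO  => 132,
    TITLE  => "deputy peon",
    AGE    => 23,
    SALARY => 37_000,
    PALS   => [ "Norbert", "Rhys", "Phineas" ],
};

my $name      = $record->{NAME};          # a scalar field
my $first_pal = $record->{PALS}[0];       # element of the nested array
my $pal_count = @{ $record->{PALS} };     # array in scalar context: its length

$record->{SALARY} += 1_000;               # fields can be updated in place

print "$name has $pal_count pals, the first is $first_pal\n";
```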

How can I use a reference as a hash key?

You can't do this directly, but you could use the standard Tie::RefHash module distributed with Perl.
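A short sketch of the difference: in an ordinary hash the reference is turned into a string when it becomes a key, while a hash tied to Tie::RefHash hands the reference back intact. The variable names and data are assumptions for the example.

```perl
use strict;
use warnings;
use Tie::RefHash;

my $point = [3, 4];

# Ordinary hash: the key is the stringified reference, not the reference.
my %plain;
$plain{$point} = 'a point';
my ($plain_key) = keys %plain;
my $plain_is_ref = ref($plain_key) ? 1 : 0;

# Tied hash: keys come back as usable references.
tie my %notes, 'Tie::RefHash';
$notes{$point} = 'a point';
my ($tied_key) = keys %notes;
my $tied_is_ref = ref($tied_key) eq 'ARRAY' ? 1 : 0;
my $sum = $tied_key->[0] + $tied_key->[1];

print "plain key is ref: $plain_is_ref, tied key is ref: $tied_is_ref, sum: $sum\n";
```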

 

 

 


 

 
 
 

Disclaimer: This documentation is provided only for the benefits of our web hosting customers.
For authoritative source of the documentation, please refer to http://www.perldoc.com