
Core Enhancements

Interpreter cloning, threads, and concurrency

Perl 5.6.0 introduces the beginnings of support for running multiple interpreters concurrently in different threads. In conjunction with the perl_clone() API call, which can be used to selectively duplicate the state of any given interpreter, it is possible to compile a piece of code once in an interpreter, clone that interpreter one or more times, and run all the resulting interpreters in distinct threads.

On the Windows platform, this feature is used to emulate fork() at the interpreter level. See perlfork for details about that.

This feature is still in evolution. It is eventually meant to be used to selectively clone a subroutine and data reachable from that subroutine in a separate interpreter and run the cloned subroutine in a separate thread. Since there is no shared data between the interpreters, little or no locking will be needed (unless parts of the symbol table are explicitly shared). This is obviously intended to be an easy-to-use replacement for the existing threads support.

Support for cloning interpreters and interpreter concurrency can be enabled using the -Dusethreads Configure option (see win32/Makefile for how to enable it on Windows). The resulting perl executable will be functionally identical to one that was built with -Dusemultiplicity, but the perl_clone() API call will only be available in the former.

-Dusethreads enables the cpp macro USE_ITHREADS by default, which in turn enables Perl source code changes that provide a clear separation between the op tree and the data it operates with. The former is immutable, and can therefore be shared between an interpreter and all of its clones, while the latter is considered local to each interpreter, and is therefore copied for each clone.

Note that building Perl with the -Dusemultiplicity Configure option is adequate if you wish to run multiple independent interpreters concurrently in different threads. -Dusethreads only provides the additional functionality of the perl_clone() API call and other support for running cloned interpreters concurrently.

 
    NOTE: This is an experimental feature.  Implementation details are
    subject to change.  

Lexically scoped warning categories

You can now control the granularity of warnings emitted by perl at a finer level using the use warnings pragma. warnings and perllexwarn have copious documentation on this feature.
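
As a minimal sketch of what lexical scoping means here, the following script enables all warnings and then switches off a single category inside one block. The __WARN__ handler and the variable names are illustrative assumptions, used only to count how many warnings were emitted.

```perl
use strict;
use warnings;

my @captured;                                   # collects warning messages
local $SIG{__WARN__} = sub { push @captured, $_[0] };

my $undef;
my $loud = "value: " . $undef;                  # warns: uninitialized value

{
    no warnings 'uninitialized';                # silence one category only
    my $quiet = "value: " . $undef;             # no warning inside this block
}

print scalar(@captured), " warning(s) captured\n";
```

Only the concatenation outside the inner block should be reported, because the no warnings declaration stops applying at the closing brace.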

Unicode and UTF-8 support

Perl now uses UTF-8 as its internal representation for character strings. The utf8 and bytes pragmas are used to control this support in the current lexical scope. See perlunicode, utf8 and bytes for more information.

This feature is expected to evolve quickly to support some form of I/O disciplines that can be used to specify the kind of input and output data (bytes or characters). Until that happens, additional modules from CPAN will be needed to complete the toolkit for dealing with Unicode.

 
    NOTE: This should be considered an experimental feature.  Implementation
    details are subject to change.  
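
The difference between character and byte semantics can be sketched as follows. The variable names are illustrative; the example assumes only the bytes pragma described above.

```perl
use strict;
use warnings;

my $smiley = "\x{263A}";            # a single character, U+263A

my $chars = length($smiley);        # counted in characters
my $bytes;
{
    use bytes;                      # byte semantics, in this block only
    $bytes = length($smiley);       # counted in UTF-8 encoded bytes
}

print "$chars character, $bytes bytes\n";
```

U+263A is encoded as three bytes in UTF-8, so the two lengths differ even though the string itself is unchanged.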

Support for interpolating named characters

The new \N escape interpolates named characters within strings. For example, "Hi! \N{WHITE SMILING FACE}" evaluates to a string with a Unicode smiley face at the end.
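
A short sketch of the escape in use. It loads the charnames pragma explicitly, which is assumed to be needed on perls of this vintage for \N{...} to resolve full character names.

```perl
use strict;
use warnings;
use charnames ':full';              # makes full Unicode names available to \N{...}

my $greeting = "Hi! \N{WHITE SMILING FACE}";
my $last     = substr($greeting, -1);

printf "last character is U+%04X\n", ord($last);
```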

"our" declarations

An "our" declaration introduces a value that can be best understood as a lexically scoped symbolic alias to a global variable in the package that was current where the variable was declared. This is mostly useful as an alternative to the vars pragma, but also provides the opportunity to introduce typing and other attributes for such variables. See perlfunc/our.
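
The following sketch shows "our" standing in for the vars pragma. The package name Counter and the sub bump are hypothetical names chosen for the illustration.

```perl
use strict;
use warnings;

package Counter;

our $count = 0;                 # lexically scoped alias for $Counter::count

sub bump { return ++$count }    # no package qualifier needed under strict

package main;

Counter::bump() for 1 .. 3;
print "count is $Counter::count\n";
```

The variable remains an ordinary package global, so it is still reachable from other packages under its fully qualified name.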

Support for strings represented as a vector of ordinals

Literals of the form v1.2.3.4 are now parsed as a string composed of characters with the specified ordinals. This is an alternative, more readable way to construct (possibly Unicode) strings instead of interpolating characters, as in "\x{1}\x{2}\x{3}\x{4}". The leading v may be omitted if there are more than two ordinals, so 1.2.3 is parsed the same as v1.2.3.

Strings written in this form are also useful to represent version "numbers". It is easy to compare such version "numbers" (which are really just plain strings) using any of the usual string comparison operators eq, ne, lt, gt, etc., or perform bitwise string operations on them using |, &, etc.

In conjunction with the new $^V magic variable (which contains the perl version as a string), such literals can be used as a readable way to check if you're running a particular version of Perl:

 
    # this will parse in older versions of Perl also
    if ($^V and $^V gt v5.6.0) {
        # new features supported
    }  

require and use also have some special magic to support such literals, but this particular usage should be avoided because it leads to misleading error messages under versions of Perl which don't support vector strings. Using a true version number will ensure correct behavior in all versions of Perl:

 
    require 5.006;    # run time check for v5.6
    use 5.006_001;    # compile time check for v5.6.1  

Also, sprintf and printf support the Perl-specific format flag %v to print ordinals of characters in arbitrary strings:

 
    printf "v%vd", $^V;         # prints current version, such as "v5.5.650"
    printf "%*vX", ":", $addr;  # formats IPv6 address
    printf "%*vb", " ", $bits;  # displays bitstring  

See perldata/"Scalar value constructors" for additional information.
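
To illustrate why such literals suit version comparisons, here is a small sketch that contrasts them with ordinary text and prints them back with the %vd flag. The version values and variable names are arbitrary examples.

```perl
use strict;
use warnings;

my $old = v5.6.1;
my $new = v5.10.0;

# Compared character by character, so ordinal 10 sorts after ordinal 6.
my $vstring_newer = ($new gt $old) ? 1 : 0;

# The same digits written as ordinary text compare the other way round.
my $text_newer = ("5.10.0" gt "5.6.1") ? 1 : 0;

my $dotted = sprintf("%vd", $new);              # ordinals joined with "."
my $hex    = sprintf("%*vX", ":", v255.16.1);   # ordinals in hex, joined with ":"

print "$vstring_newer $text_newer $dotted $hex\n";
```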

Improved Perl version numbering system

Beginning with Perl version 5.6.0, the version number convention has been changed to a "dotted integer" scheme that is more commonly found in open source projects.

Maintenance versions of v5.6.0 will be released as v5.6.1, v5.6.2 etc. The next development series following v5.6.0 will be numbered v5.7.x, beginning with v5.7.0, and the next major production release following v5.6.0 will be v5.8.0.

The English module now sets $PERL_VERSION to $^V (a string value) rather than $] (a numeric value). (This is a potential incompatibility. Send us a report via perlbug if you are affected by this.)

The v1.2.3 syntax is also now legal in Perl. See Support for strings represented as a vector of ordinals for more on that.

To cope with the new versioning system's use of at least three significant digits for each version component, the method used for incrementing the subversion number has also changed slightly. We assume that versions older than v5.6.0 have been incrementing the subversion component in multiples of 10. Versions after v5.6.0 will increment them by 1. Thus, using the new notation, 5.005_03 is the "same" as v5.5.30, and the first maintenance version following v5.6.0 will be v5.6.1 (which should be read as being equivalent to a floating point value of 5.006_001 in the older format, stored in $]).

New syntax for declaring subroutine attributes

Formerly, if you wanted to mark a subroutine as being a method call or as requiring an automatic lock() when it is entered, you had to declare that with a use attrs pragma in the body of the subroutine. That can now be accomplished with declaration syntax, like this:

 
    sub mymethod : locked method ;
    ...
    sub mymethod : locked method {
        ...
    }

    sub othermethod :locked :method ;
    ...
    sub othermethod :locked :method {
        ...
    }
  

(Note how only the first : is mandatory, and whitespace surrounding the : is optional.)

AutoSplit.pm and SelfLoader.pm have been updated to keep the attributes with the stubs they provide. See attributes.

File and directory handles can be autovivified

Similar to how constructs such as $x->[0] autovivify a reference, handle constructors (open(), opendir(), pipe(), socketpair(), sysopen(), socket(), and accept()) now autovivify a file or directory handle if the handle passed to them is an uninitialized scalar variable. This allows constructs such as open(my $fh, ...) and open(local $fh,...) to be used to create filehandles that will conveniently be closed automatically when the scope ends, provided there are no other references to them. This largely eliminates the need for typeglobs when opening filehandles that must be passed around, as in the following example:

 
    sub myopen {
        open my $fh, "@_"
            or die "Can't open '@_': $!";
        return $fh;
    }

    {
        my $f = myopen("</etc/motd");
        print <$f>;
        # $f implicitly closed here
    }  

open() with more than two arguments

If open() is passed three arguments instead of two, the second argument is used as the mode and the third argument is taken to be the file name. This is primarily useful for protecting against unintended magic behavior of the traditional two-argument form. See perlfunc/open.
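
The three-argument form combines naturally with autovivified handles. The sketch below assumes the File::Temp module is available to create a scratch file; the sub name open_for_reading is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file to read back (removed when the program exits).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "first line\n";
close($tmp_fh) or die "Can't close '$path': $!";

sub open_for_reading {
    my ($name) = @_;
    # Mode and file name are separate arguments, so $name is never
    # scanned for mode characters such as "<", ">" or "|".
    open(my $fh, '<', $name) or die "Can't open '$name': $!";
    return $fh;                     # autovivified handle
}

my $first;
{
    my $in = open_for_reading($path);
    $first = <$in>;
    # $in is closed implicitly when this block ends
}

print $first;
```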

64-bit support

Any platform that has 64-bit integers either

 
	(1) natively as longs or ints
	(2) via special compiler flags
	(3) using long long or int64_t  

is able to use "quads" (64-bit integers) as follows:

  • constants (decimal, hexadecimal, octal, binary) in the code
  • arguments to oct() and hex()
  • arguments to print(), printf() and sprintf() (flag prefixes ll, L, q)
  • printed as such
  • pack() and unpack() "q" and "Q" formats
  • in basic arithmetic: + - * / % (NOTE: operating close to the limits of the integer values may produce surprising results)
  • in bit arithmetic: & | ^ ~ << >> (NOTE: these used to be forced to be 32 bits wide but now operate on the full native width.)
  • vec()

Note that unless you have case (1), you will have to configure and compile Perl using the -Duse64bitint Configure flag.

 
    NOTE: The Configure flags -Duselonglong and -Duse64bits have been
    deprecated.  Use -Duse64bitint instead.  

There are actually two modes of 64-bitness: the first one is achieved using Configure -Duse64bitint and the second one using Configure -Duse64bitall. The difference is that the first one is minimal and the second one maximal. The first works in more places than the second.

The use64bitint option does only as much as is required to get 64-bit integers into Perl (this may mean, for example, using "long longs") while your memory may still be limited to 2 gigabytes (because your pointers could still be 32-bit). Note that the name 64bitint does not imply that your C compiler will be using 64-bit ints (it might, but it doesn't have to): use64bitint means that you will be able to have 64-bit-wide scalar values.

The use64bitall option goes all the way by also attempting to switch integers (if it can), longs, and pointers to 64 bits. This may create an even more binary-incompatible Perl than -Duse64bitint: the resulting executable may not run at all on a 32-bit box, or you may have to reboot/reconfigure/rebuild your operating system to be 64-bit aware.

Natively 64-bit systems like Alpha and Cray need neither -Duse64bitint nor -Duse64bitall.

Last but not least: note that due to Perl's habit of always using floating point numbers, the quads are still not true integers. When quads overflow their limits (0...18_446_744_073_709_551_615 unsigned, -9_223_372_036_854_775_808...9_223_372_036_854_775_807 signed), they are silently promoted to floating point numbers, after which they will start losing precision (in their lower digits).

 
    NOTE: 64-bit support is still experimental on most platforms.
    Existing support only covers the LP64 data model.  In particular, the
    LLP64 data model is not yet supported.  64-bit libraries and system
    APIs on many platforms have not stabilized--your mileage may vary.  

Large file support

If you have filesystems that support "large files" (files larger than 2 gigabytes), you may now also be able to create and access them from Perl.

 
    NOTE: The default action is to enable large file support, if
    available on the platform.  

If the large file support is on, and you have a Fcntl constant O_LARGEFILE, the O_LARGEFILE is automatically added to the flags of sysopen().

Beware that unless your filesystem also supports "sparse files", seeking to umpteen petabytes may be inadvisable.

Note that in addition to requiring a proper file system to do large files you may also need to adjust your per-process (or your per-system, or per-process-group, or per-user-group) maximum filesize limits before running Perl scripts that try to handle large files, especially if you intend to write such files.

Finally, in addition to your process/process group maximum filesize limits, you may have quota limits on your filesystems that stop you (your user id or your user group id) from using large files.

Adjusting your process/user/group/file system/operating system limits is outside the scope of the Perl core language. For process limits, you may try increasing the limits using your shell's limits/limit/ulimit command before running Perl. The BSD::Resource extension (not included with the standard Perl distribution) may also be of use: it offers the getrlimit/setrlimit interface that can be used to adjust process resource usage limits, including the maximum filesize limit.

Long doubles

In some systems you may be able to use long doubles to enhance the range and precision of your double precision floating point numbers (that is, Perl's numbers). Use Configure -Duselongdouble to enable this support (if it is available).

"more bits"

You can "Configure -Dusemorebits" to turn on both the 64-bit support and the long double support.

Enhanced support for sort() subroutines

Perl subroutines with a prototype of ($$), and XSUBs in general, can now be used as sort subroutines. In either case, the two elements to be compared are passed as normal parameters in @_. See perlfunc/sort.

For unprototyped sort subroutines, the historical behavior of passing the elements to be compared as the global variables $a and $b remains unchanged.
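
A sketch of a prototyped comparison routine, which receives its operands in @_ rather than in $a and $b. The name by_length and the word list are illustrative.

```perl
use strict;
use warnings;

# With a ($$) prototype the two elements arrive as ordinary parameters.
sub by_length($$) {
    my ($left, $right) = @_;
    return length($left) <=> length($right)
        || $left cmp $right;        # tie-break alphabetically
}

my @sorted = sort by_length qw(pear fig banana kiwi);
print "@sorted\n";
```

Because the routine does not depend on the package variables $a and $b, it can be defined in one package and used as a sort subroutine from another.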

sort $coderef @foo allowed

sort() did not accept a subroutine reference as the comparison function in earlier versions. This is now permitted.
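
For instance, a comparison function held in a scalar can be passed directly to sort(). The variable name $descending is an illustrative choice.

```perl
use strict;
use warnings;

# An anonymous comparison routine; it still reads $a and $b as usual.
my $descending = sub { $b <=> $a };

my @sorted = sort $descending 3, 11, 2;
print "@sorted\n";
```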

File globbing implemented internally

Perl now uses the File::Glob implementation of the glob() operator automatically. This avoids using an external csh process and the problems associated with it.

 
    NOTE: This is currently an experimental feature.  Interfaces and
    implementation are subject to change.  

Support for CHECK blocks

In addition to BEGIN, INIT, END, DESTROY and AUTOLOAD, subroutines named CHECK are now special. These are queued up during compilation and behave similarly to END blocks, except they are called at the end of compilation rather than at the end of execution. They cannot be called directly.
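
The relative timing of these blocks can be observed with a sketch like the one below, assuming it is run as a standalone script rather than loaded at run time with eval or require. The @order array is a hypothetical helper for recording the sequence.

```perl
use strict;
use warnings;

our @order;

# Deliberately written out of order: each kind of block runs at its
# own phase, regardless of where it appears in the file.
INIT  { push @order, 'INIT'  }
CHECK { push @order, 'CHECK' }
BEGIN { push @order, 'BEGIN' }

push @order, 'run time';
print join(' -> ', @order), "\n";
```

BEGIN runs as soon as it has been parsed, CHECK once compilation of the main program has finished, and INIT just before execution starts.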

POSIX character class syntax [: :] supported

For example, to match alphabetic characters use /[[:alpha:]]/. See perlre for details.
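
A brief sketch that keeps only the words made up entirely of alphabetic characters; the word list is arbitrary.

```perl
use strict;
use warnings;

my @words = qw(abc a1 XYZ 42);

# The POSIX class goes inside an ordinary bracketed character class.
my @alpha = grep { /^[[:alpha:]]+$/ } @words;

print "@alpha\n";
```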

Better pseudo-random number generator

In 5.005_0x and earlier, perl's rand() function used the C library rand(3) function. As of 5.005_52, Configure tests for drand48(), random(), and rand() (in that order) and picks the first one it finds.

These changes should result in better random numbers from rand().

Improved qw// operator

The qw// operator is now evaluated at compile time into a true list instead of being replaced with a run time call to split(). This removes the confusing misbehaviour of qw// in scalar context, which had inherited that behaviour from split().

Thus:

 
    $foo = ($bar) = qw(a b c); print "$foo|$bar\n";  

now correctly prints "3|a", instead of "2|a".

Better worst-case behavior of hashes

Small changes in the hashing algorithm have been implemented in order to improve the distribution of lower order bits in the hashed value. This is expected to yield better performance on keys that are repeated sequences.

pack() format 'Z' supported

The new format type 'Z' is useful for packing and unpacking null-terminated strings. See perlfunc/"pack".

pack() format modifier '!' supported

The new format type modifier '!' is useful for packing and unpacking native shorts, ints, and longs. See perlfunc/"pack".

pack() and unpack() support counted strings

The template character '/' can be used to specify a counted string type to be packed or unpacked. See perlfunc/"pack".

Comments in pack() templates

The '#' character in a template introduces a comment up to end of the line. This facilitates documentation of pack() templates.
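
The pack() additions described above can be sketched together as follows. The template layout and the variable names are illustrative assumptions, not a recommended record format.

```perl
use strict;
use warnings;
use Config;

my $terminated = pack("Z*", "hi");              # "hi" plus a trailing null
my $name       = unpack("Z*", "hi\0ignored");   # stops at the first null

# A commented template: a counted string followed by a terminated one.
my $template = "C/a*    # one length byte, then that many characters\n"
             . "Z*      # then a null-terminated string\n";

my $record = pack($template, "hello", "hi");
my ($counted, $tail) = unpack($template, $record);

# With '!', the size follows the platform's C type instead of a fixed width.
my $native_long = length(pack("l!", 1));

print "$name $counted $tail $native_long\n";
```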

Weak references

In previous versions of Perl, you couldn't cache objects so as to allow them to be deleted if the last reference from outside the cache is deleted. The reference in the cache would hold a reference count on the object and the objects would never be destroyed.

Another familiar problem is with circular references. When an object references itself, its reference count would never go down to zero, and it would not get destroyed until the program is about to exit.

Weak references solve this by allowing you to "weaken" any reference, that is, make it not count towards the reference count. When the last non-weak reference to an object is deleted, the object is destroyed and all the weak references to the object are automatically undef-ed.

To use this feature, you need the Devel::WeakRef package from CPAN, which contains additional documentation.

 
    NOTE: This is an experimental feature.  Details are subject to change.    
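
The sketch below illustrates the cache scenario. Note the substitution: instead of Devel::WeakRef it uses the weaken() function from Scalar::Util, which later perls ship as a core module, so it assumes a perl where that module is installed.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);    # assumption: stands in for Devel::WeakRef

my $cache_entry;
my $alive_inside;
{
    my $object = { name => 'example' };
    $cache_entry = $object;
    weaken($cache_entry);                   # no longer counts as a reference
    $alive_inside = defined $cache_entry;   # still alive: $object holds it
}

# The last strong reference went out of scope, so the weak one is now undef.
print defined($cache_entry) ? "still cached\n" : "entry released\n";
```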

Binary numbers supported

Binary numbers are now supported as literals, in printf and sprintf formats, and in oct():

 
    $answer = 0b101010;
    printf "The answer is: %b\n", oct("0b101010");  

Lvalue subroutines

Subroutines can now return modifiable lvalues. See perlsub/"Lvalue subroutines".

 
    NOTE: This is an experimental feature.  Details are subject to change.  
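
A minimal sketch of an lvalue subroutine used as an accessor. The %settings hash and the sub name depth are hypothetical.

```perl
use strict;
use warnings;

my %settings = (depth => 1);

# The last expression evaluated is the lvalue handed back to the caller.
sub depth :lvalue { $settings{depth} }

depth() = 42;       # assigns through the subroutine
depth() += 1;       # modifying operators work as well

print "depth is $settings{depth}\n";
```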

Some arrows may be omitted in calls through references

Perl now allows the arrow to be omitted in many constructs involving subroutine calls through references. For example, $foo[10]->('foo') may now be written $foo[10]('foo'). This is rather similar to how the arrow may be omitted from $foo[10]->{'foo'}. Note however, that the arrow is still required for foo(10)->('bar').
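
For example, both spellings below call the same code reference. The @handlers array and %dispatch table are illustrative names.

```perl
use strict;
use warnings;

my @handlers = (sub { "handled $_[0]" });
my %dispatch = (greet => sub { "hello $_[0]" });

my $with_arrow    = $handlers[0]->('foo');      # explicit arrow
my $without_arrow = $handlers[0]('foo');        # arrow omitted between subscripts
my $greeting      = $dispatch{greet}('world');  # works after a hash subscript too

print "$with_arrow / $without_arrow / $greeting\n";
```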

Boolean assignment operators are legal lvalues

Constructs such as ($a ||= 2) += 1 are now allowed.

exists() is supported on subroutine names

The exists() builtin now works on subroutine names. A subroutine is considered to exist if it has been declared (even if implicitly). See perlfunc/exists for examples.
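
The distinction between a declared and a defined subroutine can be sketched like this; all three subroutine names are hypothetical.

```perl
use strict;
use warnings;

sub declared_only;              # forward declaration, no body
sub with_body { return 1 }

my $declared_exists  = exists  &declared_only   ? 1 : 0;
my $declared_defined = defined &declared_only   ? 1 : 0;
my $body_defined     = defined &with_body       ? 1 : 0;
my $unknown_exists   = exists  &never_mentioned ? 1 : 0;

print "$declared_exists $declared_defined $body_defined $unknown_exists\n";
```

A forward declaration is enough for exists() to report true, whereas defined() additionally requires a body.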

exists() and delete() are supported on array elements

The exists() and delete() builtins now work on simple arrays as well. The behavior is similar to that on hash elements.

exists() can be used to check whether an array element has been initialized. This avoids autovivifying array elements that don't exist. If the array is tied, the EXISTS() method in the corresponding tied package will be invoked.

delete() may be used to remove an element from the array and return it. The array element at that position returns to its uninitialized state, so that testing for the same element with exists() will return false. If the element happens to be the one at the end, the size of the array also shrinks to the highest element that tests true for exists(), or to 0 if none such is found. If the array is tied, the DELETE() method in the corresponding tied package will be invoked.

See perlfunc/exists and perlfunc/delete for examples.
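
The behaviour described above can be sketched on a sparse array; the array and variable names are illustrative.

```perl
use strict;
use warnings;

my @sparse;
$sparse[3] = 'x';                           # elements 0..2 were never initialized

my $gap     = exists $sparse[1]  ? 1 : 0;   # uninitialized slot
my $present = exists $sparse[3]  ? 1 : 0;   # assigned slot
my $beyond  = exists $sparse[10] ? 1 : 0;   # past the end, nothing autovivified

my $size_before = @sparse;
my $removed     = delete $sparse[3];        # last element: the array shrinks
my $size_after  = @sparse;

print "$gap $present $beyond $size_before $removed $size_after\n";
```

Deleting the final element leaves no element that tests true for exists(), so the array shrinks all the way to size 0.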

Pseudo-hashes work better

Dereferencing some types of reference values in a pseudo-hash, such as $ph->{foo}[1], was accidentally disallowed. This has been corrected.

When applied to a pseudo-hash element, exists() now reports whether the specified value exists, not merely if the key is valid.

delete() now works on pseudo-hashes. When given a pseudo-hash element or slice it deletes the values corresponding to the keys (but not the keys themselves). See perlref/"Pseudo-hashes: Using an array as a hash".

Pseudo-hash slices with constant keys are now optimized to array lookups at compile-time.

List assignments to pseudo-hash slices are now supported.

The fields pragma now provides ways to create pseudo-hashes, via fields::new() and fields::phash(). See fields.

 
    NOTE: The pseudo-hash data type continues to be experimental.
    Limiting oneself to the interface elements provided by the
    fields pragma will provide protection from any future changes.  

Automatic flushing of output buffers

fork(), exec(), system(), qx//, and pipe open()s now flush buffers of all files opened for output when the operation was attempted. This mostly eliminates confusing buffering mishaps suffered by users unaware of how Perl internally handles I/O.

This is not supported on some platforms like Solaris where a suitably correct implementation of fflush(NULL) isn't available.

Better diagnostics on meaningless filehandle operations

Constructs such as open(<FH>) and close(<FH>) are compile time errors. Attempting to read from filehandles that were opened only for writing will now produce warnings (just as writing to read-only filehandles does).

Where possible, buffered data discarded from duped input filehandle

open(NEW, "<&OLD") now attempts to discard any data that was previously read and buffered in OLD before duping the handle. On platforms where doing this is allowed, the next read operation on NEW will return the same data as the corresponding operation on OLD. Formerly, it would have returned the data from the start of the following disk block instead.

eof() has the same old magic as <>

eof() would return true if no attempt to read from <> had yet been made. eof() has been changed to have a little magic of its own: it now opens the <> files.

binmode() can be used to set :crlf and :raw modes

binmode() now accepts a second argument that specifies a discipline for the handle in question. The two pseudo-disciplines ":raw" and ":crlf" are currently supported on DOS-derivative platforms. See perlfunc/"binmode" and open.

-T filetest recognizes UTF-8 encoded files as "text"

The algorithm used for the -T filetest has been enhanced to correctly identify UTF-8 content as "text".

system(), backticks and pipe open now reflect exec() failure

On Unix and similar platforms, system(), qx() and open(FOO, "cmd |") etc., are implemented via fork() and exec(). When the underlying exec() fails, earlier versions did not report the error properly, since the exec() happened to be in a different process.

The child process now communicates with the parent about the error in launching the external command, which allows these constructs to return with their usual error value and set $!.
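
On a Unix-like platform the effect can be observed with a sketch such as the following. The command path is a hypothetical name assumed not to exist on the system.

```perl
use strict;
use warnings;
no warnings 'exec';             # keep the "Can't exec" warning off STDERR

# Hypothetical path, assumed not to exist.
my $status = system('/no/such/command-hypothetical');
my $error  = "$!";              # copy $! before anything can overwrite it

if ($status == -1) {
    print "failed to launch: $error\n";
}
```

A return value of -1 means the command could not be started at all, as opposed to a command that ran and exited with a non-zero status.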

Improved diagnostics

Line numbers are no longer suppressed (under most likely circumstances) during the global destruction phase.

Diagnostics emitted from code running in threads other than the main thread are now accompanied by the thread ID.

Embedded null characters in diagnostics now actually show up. They used to truncate the message in prior versions.

$foo::a and $foo::b are now exempt from "possible typo" warnings only if sort() is encountered in package foo.

Unrecognized alphabetic escapes encountered when parsing quote constructs now generate a warning, since they may take on new semantics in later versions of Perl.

Many diagnostics now report the internal operation in which the warning was provoked, like so:

 
    Use of uninitialized value in concatenation (.) at (eval 1) line 1.
    Use of uninitialized value in print at (eval 1) line 1.  

Diagnostics that occur within eval may also report the file and line number where the eval is located, in addition to the eval sequence number and the line number within the evaluated text itself. For example:

 
    Not enough arguments for scalar at (eval 4)[newlib/perl5db.pl:1411] line 2, at EOF  

Diagnostics follow STDERR

Diagnostic output now goes to whichever file the STDERR handle is pointing at, instead of always going to the underlying C runtime library's stderr.

More consistent close-on-exec behavior

On systems that support a close-on-exec flag on filehandles, the flag is now set for any handles created by pipe(), socketpair(), socket(), and accept(), if that is warranted by the value of $^F that may be in effect. Earlier versions neglected to set the flag for handles created with these operators. See perlfunc/pipe, perlfunc/socketpair, perlfunc/socket, perlfunc/accept, and perlvar/$^F.

syswrite() ease-of-use

The length argument of syswrite() has become optional.

Better syntax checks on parenthesized unary operators

Expressions such as:

 
    print defined(&foo,&bar,&baz);
    print uc("foo","bar","baz");
    undef($foo,&bar);  

used to be accidentally allowed in earlier versions, and produced unpredictable behaviour. Some produced ancillary warnings when used in this way; others silently did the wrong thing.

The parenthesized forms of most unary operators that expect a single argument now ensure that they are not called with more than one argument, making the cases shown above syntax errors. The usual behaviour of:

 
    print defined &foo, &bar, &baz;
    print uc "foo", "bar", "baz";
    undef $foo, &bar;  

remains unchanged. See perlop.

Bit operators support full native integer width

The bit operators (& | ^ ~ << >>) now operate on the full native integral width (the exact size of which is available in $Config{ivsize}). For example, if your platform is either natively 64-bit or if Perl has been configured to use 64-bit integers, these operations apply to 8 bytes (as opposed to 4 bytes on 32-bit platforms). For portability, be sure to mask off the excess bits in the result of unary ~, e.g., ~$x & 0xffffffff.
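
The masking advice can be sketched as follows; the variable names are illustrative.

```perl
use strict;
use warnings;
use Config;

my $x      = 0;
my $full   = ~$x;                   # all bits set, across the native width
my $masked = ~$x & 0xffffffff;      # portable: only the low 32 bits survive

printf "%d-byte integers: full=%x masked=%x\n",
    $Config{ivsize}, $full, $masked;
```

The masked result is the same on every platform, whereas the unmasked one depends on the integer width perl was built with.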

Improved security features

More potentially unsafe operations taint their results for improved security.

The passwd and shell fields returned by the getpwent(), getpwnam(), and getpwuid() are now tainted, because the user can affect their own encrypted password and login shell.

The variable modified by shmread(), and messages returned by msgrcv() (and its object-oriented interface IPC::SysV::Msg::rcv) are also tainted, because other untrusted processes can modify messages and shared memory segments for their own nefarious purposes.

More functional bareword prototype (*)

Bareword prototypes have been rationalized to enable them to be used to override builtins that accept barewords and interpret them in a special way, such as require or do.

Arguments prototyped as * will now be visible within the subroutine as either a simple scalar or as a reference to a typeglob. See perlsub/Prototypes.

require and do may be overridden

require and do 'file' operations may be overridden locally by importing subroutines of the same name into the current package (or globally by importing them into the CORE::GLOBAL:: namespace). Overriding require will also affect use, provided the override is visible at compile-time. See perlsub/"Overriding Built-in Functions".

$^X variables may now have names longer than one character

Formerly, $^X was synonymous with ${"\cX"}, but $^XY was a syntax error. Now variable names that begin with a control character may be arbitrarily long. However, for compatibility reasons, these variables must be written with explicit braces, as ${^XY} for example. ${^XYZ} is synonymous with ${"\cXYZ"}. Variable names with more than one control character, such as ${^XY^Z}, are illegal.

The old syntax has not changed. As before, `^X' may be either a literal control-X character or the two-character sequence `caret' plus `X'. When braces are omitted, the variable name stops after the control character. Thus "$^XYZ" continues to be synonymous with $^X . "YZ" as before.

As before, lexical variables may not have names beginning with control characters. As before, variables whose names begin with a control character are always forced to be in package `main'. All such variables are reserved for future extensions, except those that begin with ^_, which may be used by user programs and are guaranteed not to acquire special meaning in any future version of Perl.

New variable $^C reflects -c switch

$^C has a boolean value that reflects whether perl is being run in compile-only mode (i.e. via the -c switch). Since BEGIN blocks are executed under such conditions, this variable enables perl code to determine whether actions that make sense only during normal running are warranted. See perlvar.

New variable $^V contains Perl version as a string

$^V contains the Perl version number as a string composed of characters whose ordinals match the version numbers, i.e. v5.6.0. This may be used in string comparisons.

See Support for strings represented as a vector of ordinals for an example.

Optional Y2K warnings

If Perl is built with the cpp macro PERL_Y2KWARN defined, it emits optional warnings when concatenating the number 19 with another number.

This behavior must be specifically enabled when running Configure. See INSTALL and README.Y2K.

Arrays now always interpolate into double-quoted strings

In double-quoted strings, arrays now interpolate, no matter what. The behavior in earlier versions of perl 5 was that arrays would interpolate into strings if the array had been mentioned before the string was compiled, and otherwise Perl would raise a fatal compile-time error. In versions 5.000 through 5.003, the error was

 
        Literal @example now requires backslash  

In versions 5.004_01 through 5.6.0, the error was

 
        In string, @example now must be written as \@example  

The idea here was to get people into the habit of writing "fred\@example.com" when they wanted a literal @ sign, just as they have always written "Give me back my \$5" when they wanted a literal $ sign.

Starting with 5.6.1, when Perl sees an @ sign in a double-quoted string, it always attempts to interpolate an array, regardless of whether or not the array has been used or declared already. The fatal error has been downgraded to an optional warning:

 
        Possible unintended interpolation of @example in string  

This warns you that "fred@example.com" is going to turn into fred.com if you don't backslash the @. See http://www.plover.com/~mjd/perl/at-error.html for more details about the history here.
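
The three ways of writing such a string can be compared with a short sketch. The array is declared explicitly here, as an assumption, so that the script also compiles under the strict pragma.

```perl
use strict;
use warnings;

my @example = ();                           # empty, declared to satisfy strict

my $interpolated = "fred@example.com";      # @example interpolates (as nothing)
my $escaped      = "fred\@example.com";     # backslash keeps the literal @
my $single       = 'fred@example.com';      # single quotes never interpolate

print "$interpolated\n$escaped\n$single\n";
```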

 

  

 
