Splits a string into a list of strings and returns that list. By default, empty leading
fields are preserved, and empty trailing ones are deleted.
In scalar context, returns the number of fields found and splits into the @_
array. Use of split in scalar context is deprecated, however, because it clobbers your
subroutine arguments.
If EXPR is omitted, splits the $_ string. If PATTERN is also omitted,
splits on whitespace (after skipping any leading whitespace). Anything matching PATTERN is
taken to be a delimiter separating the fields. (Note that the delimiter may be longer than
one character.)
If LIMIT is specified and positive, it represents the maximum number of fields the EXPR
will be split into, though the actual number of fields returned depends on the number of
times PATTERN matches within EXPR. If LIMIT is unspecified or zero, trailing null fields
are stripped (which potential users of pop would do well to remember). If
LIMIT is negative, it is treated as if an arbitrarily large LIMIT had been specified. Note
that splitting an EXPR that evaluates to the empty string always returns the empty list,
regardless of the LIMIT specified.
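The effect of LIMIT is easiest to see on one string split three ways. The following is a minimal sketch; the sample data in $record is made up for illustration.

```perl
use strict;
use warnings;

my $record = 'a,b,c,,,';

# LIMIT omitted (same as zero): trailing empty fields are stripped.
my @default = split(/,/, $record);

# Negative LIMIT: every field is kept, including trailing empty ones.
my @all = split(/,/, $record, -1);

# Positive LIMIT: at most that many fields; the last holds the unsplit rest.
my @two = split(/,/, $record, 2);

# An empty EXPR yields the empty list whatever the LIMIT.
my @empty = split(/,/, '', -1);

print scalar(@default), "\n";    # 3
print scalar(@all), "\n";        # 6
print join('|', @two), "\n";     # a|b,c,,,
print scalar(@empty), "\n";      # 0
```

If the trailing empty fields matter, for instance when parsing fixed-column records, passing -1 as LIMIT is the usual way to keep them.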
A pattern matching the null string (not to be confused with a null pattern //,
which is just one member of the set of patterns matching a null string) will split the
value of EXPR into separate characters at each point it matches that way. For example:
print join(':', split(/ */, 'hi there'));
produces the output 'h:i:t:h:e:r:e'.
Using the empty pattern // specifically matches the null string, and is
not to be confused with the use of // to mean "the last successful pattern
match".
Empty leading (or trailing) fields are produced when there are positive width matches
at the beginning (or end) of the string; a zero-width match at the beginning (or end) of
the string does not produce an empty field. For example:
print join(':', split(/(?=\w)/, 'hi there!'));
produces the output 'h:i :t:h:e:r:e!'.
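The difference between positive-width and zero-width matches at the ends of the string can be checked directly. This is a small sketch with made-up input strings, not an exhaustive test.

```perl
use strict;
use warnings;

# A positive-width match at the start yields an empty leading field.
my @leading = split(/,/, ',a,b');          # ('', 'a', 'b')

# A zero-width match at the start yields no empty field.
my @lookahead = split(/(?=a)/, 'abab');    # ('ab', 'ab')

# A positive-width match at the end yields an empty trailing field,
# which survives here only because LIMIT is negative.
my @trailing = split(/,/, 'a,b,', -1);     # ('a', 'b', '')

print scalar(@leading), ' ', scalar(@lookahead), ' ', scalar(@trailing), "\n";   # 3 2 3
```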
The LIMIT parameter can be used to split a line partially:
($login, $passwd, $remainder) = split(/:/, $_, 3);
When assigning to a list, if LIMIT is omitted, Perl supplies a LIMIT one larger than
the number of variables in the list, to avoid unnecessary work. For the list above, LIMIT
would have been 4 by default. In time-critical applications, it behooves you not to split
into more fields than you really need.
If the PATTERN contains parentheses, additional list elements are created from each
matching substring in the delimiter.
split(/([,-])/, "1-10,20", 3);
produces the list value
(1, '-', 10, ',', 20)
If you had the entire header of a normal Unix email message in $header, you could split
it up into fields and their values this way:
$header =~ s/\n\s+/ /g; # fix continuation lines
%hdrs = (UNIX_FROM => split /^(\S*?):\s*/m, $header);
The pattern /PATTERN/ may be replaced with an expression to specify
patterns that vary at runtime. (To do runtime compilation only once, use /$variable/o.)
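As an illustration of a pattern that varies at runtime, the sketch below builds the delimiter from a variable. The variable names and sample data are hypothetical, and qr// is used here as one way to compile the pattern once, in place of the /o modifier mentioned above; \Q...\E keeps metacharacters in the delimiter literal.

```perl
use strict;
use warnings;

# Hypothetical delimiter chosen at runtime; '|' is a regex metacharacter,
# so it is quoted with \Q...\E before being compiled.
my $delimiter = '|';
my $pattern   = qr/\Q$delimiter\E/;

my @fields = split($pattern, 'x|y|z');

print join(',', @fields), "\n";    # x,y,z
```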
As a special case, specifying a PATTERN of space (' ') will split on white
space just as split with no arguments does. Thus, split(' ') can
be used to emulate awk's default behavior, whereas split(/ /) will
give you as many null initial fields as there are leading spaces. A split on /\s+/
is like a split(' ') except that any leading whitespace produces a null first
field. A split with no arguments really does a split(' ', $_)
internally.
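The three whitespace cases can be compared side by side. The following is a small sketch; the sample string in $line is made up for illustration.

```perl
use strict;
use warnings;

my $line = "  alpha  beta ";

# ' ' skips leading whitespace and splits on runs of whitespace.
my @awk_style = split(' ', $line);      # ('alpha', 'beta')

# / / splits on every single space, so each leading space
# contributes an empty leading field.
my @per_space = split(/ /, $line);      # ('', '', 'alpha', '', 'beta')

# /\s+/ treats the leading run as one delimiter: one empty first field.
my @ws_runs = split(/\s+/, $line);      # ('', 'alpha', 'beta')

print scalar(@awk_style), ' ', scalar(@per_space), ' ', scalar(@ws_runs), "\n";   # 2 5 3
```

In all three cases the trailing empty field is dropped, because LIMIT is omitted.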
A PATTERN of /^/ is treated as if it were /^/m, since it
isn't much use otherwise.
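A quick way to see the /^/ special case is to break a multi-line string into lines that keep their newlines. This is a minimal sketch with made-up text.

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\n";

# /^/ behaves like /^/m here, so the string is split at the start of
# each line and every piece keeps its trailing newline.
my @lines = split(/^/, $text);

print scalar(@lines), "\n";    # 3
print $lines[1];               # two
```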
Example:
open(PASSWD, '/etc/passwd') or die "Can't open /etc/passwd: $!";
while (<PASSWD>) {
    chomp;    # remove the trailing newline before splitting
    ($login, $passwd, $uid, $gid,
     $gcos, $home, $shell) = split(/:/);
    #...
}
close(PASSWD);
As with regular pattern matching, any capturing parentheses that are not matched in a split()
will be set to undef when returned:
@fields = split /(A)|B/, "1A2B3";
# @fields is (1, 'A', 2, undef, 3)