SECURITY
Although the main discussion of Perl security issues can be found in perlsec, a discussion of Perl's
locale handling would be incomplete if it did not draw your attention to locale-dependent
security issues. Locales--particularly on systems that allow unprivileged users to build their
own locales--are untrustworthy. A malicious (or just plain broken) locale can make a
locale-aware application give unexpected results. Here are a few possibilities:
- Regular expression checks for safe file names or mail addresses using \w may be spoofed by an
LC_CTYPE locale that claims that characters such as ">" and "|" are alphanumeric.
- String interpolation with case-mapping, as in, say, $dest = "C:\U$name.$ext", may produce
dangerous results if a bogus LC_CTYPE case-mapping table is in effect.
- A sneaky LC_COLLATE locale could result in the names of students with "D" grades appearing
ahead of those with "A"s.
- An application that takes the trouble to use information in LC_MONETARY may format debits as
if they were credits and vice versa if that locale has been subverted. Or it might make
payments in US dollars instead of Hong Kong dollars.
- The date and day names in dates formatted by strftime() could be manipulated to advantage by
a malicious user able to subvert the LC_TIME locale. ("Look--it says I wasn't in the building
on Sunday.")
Such dangers are not peculiar to the locale system: any aspect of an application's
environment which may be modified maliciously presents similar challenges. Similarly, they are
not specific to Perl: any programming language that allows you to write programs that take
account of their environment exposes you to these issues.
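The first possibility above can be blunted by validating against an explicit character range rather than \w, since a range such as A-Z is matched by character code and does not consult the LC_CTYPE tables. What follows is a minimal sketch, not a complete validator: the sub name is_safe_name and the set of permitted characters are assumptions chosen purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: accept only names built from an explicit ASCII set.
# Ranges like A-Z are compared by character code, so a locale that claims
# "|" or ">" is alphanumeric cannot widen what this pattern accepts.
# The first character may not be a dot, which also rejects "." and "..".
sub is_safe_name {
    my ($name) = @_;
    return 0 unless defined $name;
    return ($name =~ /\A[A-Za-z0-9_][A-Za-z0-9_.]*\z/) ? 1 : 0;
}

print is_safe_name("report_2024.txt") ? "accepted\n" : "rejected\n";
print is_safe_name("out|mail root")   ? "accepted\n" : "rejected\n";
```

Spelling out the permitted set makes the check stricter than \w in every locale, at the cost of rejecting legitimate non-ASCII names; whether that trade is acceptable depends on the application.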
Perl cannot protect you from all possibilities shown in the examples--there is no
substitute for your own vigilance--but, when use locale is in effect, Perl uses
the tainting mechanism (see perlsec)
to mark string results that become locale-dependent, and which may be untrustworthy in
consequence. Here is a summary of the tainting behavior of operators and functions that may be
affected by the locale:
- Comparison operators (lt, le, ge, gt and cmp):
  Scalar true/false (or less/equal/greater) result is never tainted.
- Case-mapping interpolation (with \l, \L, \u or \U):
  Result string containing interpolated material is tainted if use locale is in effect.
- Matching operator (m//):
  Scalar true/false result is never tainted.
  Subpatterns, whether delivered as a list-context result or as $1 etc., are tainted if use
  locale is in effect and the subpattern regular expression contains \w (to match an
  alphanumeric character), \W (non-alphanumeric character), \s (white-space character), or \S
  (non-white-space character). The matched-pattern variables $& (match), $` (pre-match),
  $' (post-match), and $+ (last match) are also tainted if use locale is in effect and the
  regular expression contains \w, \W, \s, or \S.
- Substitution operator (s///):
  Has the same behavior as the match operator. Also, the left operand of =~ becomes tainted
  when use locale is in effect if it is modified as a result of a substitution based on a
  regular expression match involving \w, \W, \s, or \S; or of case-mapping with \l, \L, \u
  or \U.
- Output formatting functions (printf() and write()):
  Results are never tainted, because otherwise even output from print, for example
  print(1/7) (whose decimal point is locale-dependent), would have to be tainted when use
  locale is in effect.
- Case-mapping functions (lc(), lcfirst(), uc(), ucfirst()):
  Results are tainted if use locale is in effect.
- POSIX locale-dependent functions (localeconv(), strcoll(), strftime(), strxfrm()):
  Results are never tainted.
- POSIX character class tests (isalnum(), isalpha(), isdigit(), isgraph(), islower(),
  isprint(), ispunct(), isspace(), isupper(), isxdigit()):
  True/false results are never tainted.
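The rules summarized above can be checked on a given installation with the tainted() function from the core Scalar::Util module. The following is only a sketch: the helper name taint_status and the sample values are assumptions, and tainted() reports false for everything unless perl itself was started with -T, so run it as perl -T to see any difference between the two results.

```perl
use strict;
use warnings;
use locale;
use Scalar::Util qw(tainted);

# Hypothetical helper: describe whether a value carries the taint flag.
# Without taint mode, tainted() is false for every value.
sub taint_status {
    my ($value) = @_;
    return tainted($value) ? "tainted" : "untainted";
}

my $word  = "abc";
my $upper = uc $word;          # case-mapping function
my $order = $word lt "abd";    # comparison operator

printf "taint mode is %s\n", ${^TAINT} ? "on" : "off";
printf "uc() result: %s\n", taint_status($upper);
printf "lt result:   %s\n", taint_status($order);
```

No expected output is shown, because it depends on whether taint mode is on and on the Perl version in use.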
Three examples illustrate locale-dependent tainting. The first program, which ignores its
locale, won't run: a value taken directly from the command line may not be used to name an
output file when taint checks are enabled.
    #!/usr/local/bin/perl -T
    # Run with taint checking

    # Command line sanity check omitted...
    $tainted_output_file = shift;

    open(F, ">$tainted_output_file")
        or warn "Open of $tainted_output_file failed: $!\n";
The program can be made to run by "laundering" the tainted value through a
regular expression: the second example--which still ignores locale information--runs, creating
the file named on its command line if it can.
    #!/usr/local/bin/perl -T

    $tainted_output_file = shift;
    $tainted_output_file =~ m%[\w/]+%;
    $untainted_output_file = $&;

    open(F, ">$untainted_output_file")
        or warn "Open of $untainted_output_file failed: $!\n";
Compare this with a similar but locale-aware program:
    #!/usr/local/bin/perl -T

    $tainted_output_file = shift;
    use locale;
    $tainted_output_file =~ m%[\w/]+%;
    $localized_output_file = $&;

    open(F, ">$localized_output_file")
        or warn "Open of $localized_output_file failed: $!\n";
This third program fails to run because $& is tainted: it is the result of a match
involving \w while use locale is in effect.
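A locale-aware program that still needs to launder a value can confine the match to a scope where the pragma is switched off, because use locale and no locale are lexically scoped. The following is a sketch under that assumption, not a recommended recipe: the sub name launder_path and the placeholder default path are hypothetical, and the pattern is anchored so that the whole value must match, unlike the unanchored match in the programs above.

```perl
use strict;
use warnings;
use locale;    # the rest of the program stays locale-aware

# Hypothetical helper: return a laundered copy of $raw, or undef if it
# contains anything other than word characters and slashes.
sub launder_path {
    my ($raw) = @_;
    no locale;    # lexically scoped: applies only inside this sub
    return undef unless defined $raw;
    return ($raw =~ m%\A([\w/]+)\z%) ? $1 : undef;
}

# Placeholder default so the sketch runs without arguments.
my $raw   = @ARGV ? $ARGV[0] : "example/output_file";
my $clean = launder_path($raw);
print defined($clean) ? "would open: $clean\n" : "rejected\n";
```

The pattern still accepts a slash anywhere, including a leading one, so a real program would have to decide separately which directories it is willing to write to.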