Using open() for IPC
Perl's basic open() statement can also be used for unidirectional interprocess communication
by either appending or prepending a pipe symbol to the second argument to open(). Here's how to
start something up in a child process you intend to write to:
open(SPOOLER, "| cat -v | lpr -h 2>/dev/null")
    || die "can't fork: $!";
local $SIG{PIPE} = sub { die "spooler pipe broke" };
print SPOOLER "stuff\n";
# parenthesize the handle: "close SPOOLER || die" would never call die
close(SPOOLER) || die "bad spool: $! $?";
And here's how to start up a child process you intend to read from:
open(STATUS, "netstat -an 2>&1 |")
    || die "can't fork: $!";
while (<STATUS>) {
    next if /^(tcp|udp)/;
    print;
}
# parenthesize the handle: "close STATUS || die" would never call die
close(STATUS) || die "bad netstat: $! $?";
If one can be sure that a particular program is a Perl script that is expecting filenames in
@ARGV, the clever programmer can write something like this:
% program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
and irrespective of which shell it's called from, the Perl program will read from the file f1,
the process cmd1, standard input (tmpfile in this case), the f2 file, the cmd2
command, and finally the f3 file. Pretty nifty, eh?
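Nothing special is needed on the program's side: the diamond operator opens each element of @ARGV with the two-argument form of open(), so a trailing pipe symbol is honored. Here is a minimal sketch, assuming a Unix shell with an echo command; the argument list is set by hand purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical argument list, as if the script had been run as:
#   % program "echo from-cmd |"
@ARGV = ('echo from-cmd |');

my @seen;
while (my $line = <>) {
    chomp $line;
    # $ARGV holds the name of the "file" currently being read
    push @seen, "$ARGV => $line";
}
print "$_\n" for @seen;
```

The same magic means that any argument ending in a pipe symbol gets executed, so a script like this should not be handed arguments from an untrusted source.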
You might notice that you could use backticks for much the same effect as opening a pipe for
reading:
print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
die "bad netstat" if $?;
While this is true on the surface, it's much more efficient to process the file one line or
record at a time because then you don't have to read the whole thing into memory at once. It
also gives you finer control of the whole process, letting you kill off the child process
early if you'd like.
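As a rough sketch of that finer control, the following reads only the first few lines from a child and then stops it. The child is a stand-in (a second perl started through $^X) rather than a real command, and the list form of open() used here assumes Perl 5.8.0 or later.

```perl
use strict;
use warnings;

# Stand-in for a long-running command: a child perl that prints many lines.
# $^X is the path of the perl binary running this script.
my $pid = open(my $kid, "-|", $^X, "-e", 'print "line $_\n" for 1 .. 100_000')
    or die "can't fork: $!";

my @wanted;
while (my $line = <$kid>) {
    push @wanted, $line;
    last if @wanted == 3;      # we have seen enough
}

kill 'TERM', $pid;             # stop the child early
close($kid);                   # reaps the child; normally false, since it was killed
print "read ", scalar(@wanted), " lines\n";
```

With backticks, the parent would have had to wait for all of the child's output before looking at any of it.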
Be careful to check both the open() and the close() return values. If you're writing
to a pipe, you should also trap SIGPIPE. Otherwise, think of what happens when you start up a
pipe to a command that doesn't exist: the open() will in all likelihood succeed (it only
reflects the fork()'s success), but then your output will fail--spectacularly. Perl can't know
whether the command worked because your command is actually running in a separate process whose
exec() might have failed. Therefore, while readers of bogus commands return just a quick end of
file, writers to bogus commands will trigger a signal they'd better be prepared to handle.
Consider:
open(FH, "|bogus") or die "can't fork: $!";
print FH "bang\n" or die "can't write: $!";
close FH or die "can't close: $!";
That won't blow up until the close, and it will blow up with a SIGPIPE. To catch it, you
could use this:
$SIG{PIPE} = 'IGNORE';
open(FH, "|bogus") or die "can't fork: $!";
print FH "bang\n" or die "can't write: $!";
close FH or die "can't close: status=$?";
Both the main process and any child processes it forks share the same STDIN, STDOUT, and
STDERR filehandles. If both processes try to access them at once, strange things can happen. You
may also want to close or reopen the filehandles for the child. You can get around this by
opening your pipe with open(), but on some systems this means that the child process cannot
outlive the parent.
You can run a command in the background with:
system("cmd &");
The command's STDOUT and STDERR (and possibly STDIN, depending on your shell) will be the
same as the parent's. You won't need to catch SIGCHLD because of the double-fork taking place
(see below for more details).
In some cases (starting server processes, for instance) you'll want to completely dissociate
the child process from the parent. This is often called daemonization. A well behaved daemon
will also chdir() to the root directory (so it doesn't prevent unmounting the filesystem
containing the directory from which it was launched) and redirect its standard file descriptors
from and to /dev/null (so that random output doesn't wind up on the user's terminal).
use POSIX 'setsid';
sub daemonize {
    chdir '/' or die "Can't chdir to /: $!";
    open STDIN, '/dev/null' or die "Can't read /dev/null: $!";
    open STDOUT, '>/dev/null'
        or die "Can't write to /dev/null: $!";
    defined(my $pid = fork) or die "Can't fork: $!";
    exit if $pid;
    setsid or die "Can't start a new session: $!";
    open STDERR, '>&STDOUT' or die "Can't dup stdout: $!";
}
The fork() has to come before the setsid() to ensure that you aren't a process group leader
(the setsid() will fail if you are). If your system doesn't have the setsid() function, open /dev/tty
and use the TIOCNOTTY ioctl() on it instead. See tty(4) for details.
Non-Unix users should check their Your_OS::Process module for other solutions.
Another interesting approach to IPC is making your single program go multiprocess and
communicate between (or even amongst) yourselves. The open() function will accept a file
argument of either "-|" or "|-" to do a very
interesting thing: it forks a child connected to the filehandle you've opened. The child is
running the same program as the parent. This is useful for safely opening a file when running
under an assumed UID or GID, for example. If you open a pipe to minus, you can write to
the filehandle you opened and your kid will find it in his STDIN. If you open a pipe from
minus, you can read from the filehandle you opened whatever your kid writes to his STDOUT.
use English '-no_match_vars';
my $sleep_count = 0;
do {
    $pid = open(KID_TO_WRITE, "|-");
    unless (defined $pid) {
        warn "cannot fork: $!";
        die "bailing out" if $sleep_count++ > 6;
        sleep 10;
    }
} until defined $pid;
if ($pid) { # parent
    print KID_TO_WRITE @some_data;
    close(KID_TO_WRITE) || warn "kid exited $?";
} else { # child
    ($EUID, $EGID) = ($UID, $GID); # suid progs only
    open (FILE, "> /safe/file")
        || die "can't open /safe/file: $!";
    while (<STDIN>) {
        print FILE; # child's STDIN is parent's KID_TO_WRITE
    }
    exit; # don't forget this
}
Another common use for this construct is when you need to execute something without the
shell's interference. With system(), it's straightforward, but you can't use a pipe open or
backticks safely. That's because there's no way to stop the shell from getting its hands on your
arguments. Instead, use lower-level control to call exec() directly.
Here's a safe backtick or pipe open for read:
# add error processing as above
$pid = open(KID_TO_READ, "-|");
if ($pid) { # parent
    while (<KID_TO_READ>) {
        # do something interesting
    }
    close(KID_TO_READ) || warn "kid exited $?";
} else { # child
    ($EUID, $EGID) = ($UID, $GID); # suid only
    exec($program, @options, @args)
        || die "can't exec program: $!";
    # NOTREACHED
}
And here's a safe pipe open for writing:
# add error processing as above
$pid = open(KID_TO_WRITE, "|-");
# a writer whose reader goes away gets SIGPIPE, not SIGALRM
$SIG{PIPE} = sub { die "whoops, $program pipe broke" };
if ($pid) { # parent
    for (@data) {
        print KID_TO_WRITE;
    }
    close(KID_TO_WRITE) || warn "kid exited $?";
} else { # child
    ($EUID, $EGID) = ($UID, $GID);
    exec($program, @options, @args)
        || die "can't exec program: $!";
    # NOTREACHED
}
Since Perl 5.8.0, you can also use the list form of open for pipes: the syntax
open KID_PS, "-|", "ps", "aux" or die $!;
forks the ps(1) command (without spawning a shell, as there are more than three arguments to
open()), and reads its standard output via the KID_PS filehandle. The corresponding
syntax to write to command pipes (with "|-" in place of "-|")
is also implemented.
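For the writing direction, a minimal sketch might look like the following. The child is a stand-in (a second perl started through $^X that reads its input and exits with status 3), chosen only so that close() has a non-zero status to report; the variable names are illustrative.

```perl
use strict;
use warnings;

# Stand-in child: reads all of its STDIN, then exits with status 3.
my @cmd = ($^X, "-e", 'my @in = <STDIN>; exit 3');

open(my $kid, "|-", @cmd) or die "can't fork: $!";
print $kid "bang\n"       or die "can't write: $!";

my ($exit, $signal) = (0, 0);
unless (close $kid) {
    # close() on a pipe waits for the child and leaves its wait status in $?
    $exit   = $? >> 8;     # exit code of the child
    $signal = $? & 127;    # signal that killed it, if any
}
print "exit=$exit signal=$signal\n";
```

Because no shell is involved, the elements of @cmd reach the child exactly as written, with no quoting or metacharacter interpretation.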
Note that these operations are full Unix forks, which means they may not be correctly
implemented on alien systems. Additionally, this is not true multithreading. If you'd like to
learn more about threading, see the modules file mentioned below in the SEE ALSO section.
While this works reasonably well for unidirectional communication, what about bidirectional
communication? The obvious thing you'd like to do doesn't actually work:
open(PROG_FOR_READING_AND_WRITING, "| some program |")
and if you forget to enable the use warnings pragma or the -w flag, then
you'll miss out entirely on the diagnostic message:
Can't do bidirectional pipe at -e line 1.
If you really want to, you can use the standard open2() library function to catch both ends.
There's also an open3() for tridirectional I/O so you can also catch your child's STDERR, but
doing so would then require an awkward select() loop and wouldn't allow you to use normal Perl
input operations.
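To give a feel for that select() loop, here is a sketch that collects a child's STDOUT and STDERR separately using IPC::Open3 and IO::Select, both of which ship with Perl. The child command is a stand-in (a second perl started through $^X), and the %buf hash is just an illustrative way to keep the two streams apart.

```perl
use strict;
use warnings;
use IPC::Open3;
use IO::Select;
use Symbol 'gensym';

my ($in, $out);
my $err = gensym;    # without a real glob here, STDERR is merged into STDOUT

# Stand-in child that writes one line to each output stream.
my $pid = open3($in, $out, $err,
                $^X, '-e', 'print "to stdout\n"; warn "to stderr\n"');
close $in;           # we have nothing to send

my %buf = (out => '', err => '');
my $sel = IO::Select->new($out, $err);
while ($sel->count) {
    for my $fh ($sel->can_read) {
        # sysread, not <$fh>: buffered reads do not mix well with select()
        my $n = sysread($fh, my $chunk, 4096);
        if (!$n) {               # 0 means EOF, undef means error
            $sel->remove($fh);
            close $fh;
            next;
        }
        $buf{ $fh == $out ? 'out' : 'err' } .= $chunk;
    }
}
waitpid($pid, 0);
print "stdout: $buf{out}stderr: $buf{err}";
```

Reading the two handles one after the other instead risks deadlock if the child fills the pipe that is not currently being read.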
If you look at its source, you'll see that open2() uses low-level primitives like Unix pipe()
and exec() calls to create all the connections. While it might have been slightly more efficient
to use socketpair(), doing so would have made it even less portable than it already is. The
open2() and open3() functions are unlikely to work anywhere except on a Unix system or some
other one purporting to be POSIX compliant.
Here's an example of using open2():
use FileHandle;
use IPC::Open2;
$pid = open2(*Reader, *Writer, "cat -u -n" );
print Writer "stuff\n";
$got = <Reader>;
The problem with this is that Unix buffering is really going to ruin your day. Even though
your Writer filehandle is auto-flushed, and the process on the other end will get
your data in a timely manner, you can't usually do anything to force it to give it back to you
in a similarly quick fashion. In this case, we could, because we gave cat a -u
flag to make it unbuffered. But very few Unix commands are designed to operate over pipes, so
this seldom works unless you yourself wrote the program on the other end of the double-ended
pipe.
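When you do control the program on the other end, the fix is simply to have it flush its own output. The sketch below assumes such a cooperative child: a second perl (started through $^X) that sets $| and echoes each input line in upper case. The handle names are illustrative.

```perl
use strict;
use warnings;
use IPC::Open2;

# The child turns on autoflush ($| = 1), so each reply comes back at once.
my $pid = open2(my $reader, my $writer,
                $^X, '-e', '$| = 1; print uc($_) while <STDIN>');

print $writer "stuff\n";     # open2() has already made $writer autoflushed
my $got = <$reader>;         # would hang here if the child buffered its output

close $writer;               # child sees EOF and exits
close $reader;
waitpid($pid, 0);
print "got back: $got";
```

Without the `$| = 1` in the child, its reply would sit in the child's output buffer and the read in the parent would block.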
A solution to this is the nonstandard Comm.pl library. It uses pseudo-ttys to make
your program behave more reasonably:
require 'Comm.pl';
$ph = open_proc('cat -n');
for (1..10) {
    print $ph "a line\n";
    print "got back ", scalar <$ph>;
}
This way you don't have to have control over the source code of the program you're using. The
Comm library also has expect() and interact() functions. Find the library (and we hope
its successor IPC::Chat) at your nearest CPAN archive as detailed in the SEE ALSO section
below.
The newer Expect.pm module from CPAN also addresses this kind of thing. This module requires
two other modules from CPAN: IO::Pty and IO::Stty. It sets up a pseudo-terminal to interact with
programs that insist on talking to the terminal device driver. If your system is amongst
those supported, this may be your best bet.
If you want, you may make low-level pipe() and fork() calls to stitch this together by hand. This
example only talks to itself, but you could reopen the appropriate handles to STDIN and STDOUT
and call other processes.
#!/usr/bin/perl -w
# pipe1 - bidirectional communication using two pipe pairs
# designed for the socketpair-challenged
use IO::Handle; # thousands of lines just for autoflush :-(
pipe(PARENT_RDR, CHILD_WTR) || die "pipe failed: $!";
pipe(CHILD_RDR, PARENT_WTR) || die "pipe failed: $!";
CHILD_WTR->autoflush(1);
PARENT_WTR->autoflush(1);
if ($pid = fork) {
    close PARENT_RDR; close PARENT_WTR;
    print CHILD_WTR "Parent Pid $$ is sending this\n";
    chomp($line = <CHILD_RDR>);
    print "Parent Pid $$ just read this: `$line'\n";
    close CHILD_RDR; close CHILD_WTR;
    waitpid($pid,0);
} else {
    die "cannot fork: $!" unless defined $pid;
    close CHILD_RDR; close CHILD_WTR;
    chomp($line = <PARENT_RDR>);
    print "Child Pid $$ just read this: `$line'\n";
    print PARENT_WTR "Child Pid $$ is sending this\n";
    close PARENT_RDR; close PARENT_WTR;
    exit;
}
But you don't actually have to make two pipe calls. If you have the socketpair() system call,
it will do this all for you.
#!/usr/bin/perl -w
# pipe2 - bidirectional communication using socketpair
# "the best ones always go both ways"
use Socket;
use IO::Handle; # thousands of lines just for autoflush :-(
# We say AF_UNIX because although *_LOCAL is the
# POSIX 1003.1g form of the constant, many machines
# still don't have it.
socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
CHILD->autoflush(1);
PARENT->autoflush(1);
if ($pid = fork) {
    close PARENT;
    print CHILD "Parent Pid $$ is sending this\n";
    chomp($line = <CHILD>);
    print "Parent Pid $$ just read this: `$line'\n";
    close CHILD;
    waitpid($pid,0);
} else {
    die "cannot fork: $!" unless defined $pid;
    close CHILD;
    chomp($line = <PARENT>);
    print "Child Pid $$ just read this: `$line'\n";
    print PARENT "Child Pid $$ is sending this\n";
    close PARENT;
    exit;
}