

perlebcdic - Considerations for running Perl on EBCDIC platforms


An exploration of some of the issues facing Perl programmers on EBCDIC-based computers. We do not cover localization, internationalization, or multi-byte character set issues, other than some discussion of UTF-8 and UTF-EBCDIC.

Portions that are still incomplete are marked with XXX.


Most socket programming assumes ASCII character encodings in network byte order. Exceptions can include CGI script writing under a host web server where the server may take care of translation for you. Most host web servers convert EBCDIC data to ISO-8859-1 or Unicode on output.
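Where no server performs the translation for you, it has to be done before data is written to, or after it is read from, the socket. A minimal sketch, assuming the local data is in CCSID 0037 (which the core Encode module calls cp37) and that the peer expects ISO-8859-1. The helper names ebcdic_to_wire and wire_to_ebcdic are hypothetical, and the code page is only an assumption; cp1047 or posix-bc may be the right choice on your system.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: local data is in CCSID 0037, known to Encode as "cp37".
my $LOCAL_CODE_PAGE = 'cp37';

# Hypothetical helper: EBCDIC octets -> ISO-8859-1 octets for sending.
sub ebcdic_to_wire {
    my ($octets) = @_;
    my $chars = decode($LOCAL_CODE_PAGE, $octets);   # octets -> characters
    return encode('iso-8859-1', $chars);             # characters -> wire octets
}

# Hypothetical helper: ISO-8859-1 octets received -> EBCDIC octets.
sub wire_to_ebcdic {
    my ($octets) = @_;
    my $chars = decode('iso-8859-1', $octets);
    return encode($LOCAL_CODE_PAGE, $chars);
}

# The octets for "Hello" in CCSID 0037.
my $ebcdic = "\xC8\x85\x93\x93\x96";
my $wire   = ebcdic_to_wire($ebcdic);

# Show the translated octets in hex: 48 65 6C 6C 6F
print join(' ', map { sprintf '%02X', ord } split //, $wire), "\n";
```

Translating at the socket boundary, rather than throughout the program, keeps the rest of the code working in the platform's native character set.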

Hashing order and checksums

To the extent that it is possible to write code that depends on hashing order, there may be differences between hashes as stored on an ASCII-based machine and hashes stored on an EBCDIC-based machine. XXX
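The usual defence is not to depend on the order of keys at all, and to remember that both an explicit sort and a byte-wise checksum follow the underlying code points. The following is only an illustrative sketch run from an ASCII platform's point of view: it uses the core Encode module with cp37 (an assumed code page) to imitate what an EBCDIC machine would see, and the helper by_cp37 is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(encode);

my %stock = (apple => 3, Banana => 5, '42' => 1);

# Never rely on the order returned by keys(%stock); impose one explicitly.
# This is the platform's native string order.
my @native_order = sort keys %stock;

# Hypothetical comparator: order strings by their CCSID 0037 octets,
# imitating the native order on an EBCDIC machine using that code page.
sub by_cp37 {
    my ($x, $y) = (encode('cp37', "$a"), encode('cp37', "$b"));
    return $x cmp $y;
}
my @ebcdic_order = sort by_cp37 keys %stock;

print "native: @native_order\n";
print "cp37:   @ebcdic_order\n";

# A simple additive checksum over the octets of the same text
# gives different results in the two encodings.
my $text       = 'Hello';
my $sum_latin1 = unpack '%32C*', encode('iso-8859-1', $text);
my $sum_cp37   = unpack '%32C*', encode('cp37', $text);

print "checksum ISO-8859-1: $sum_latin1\n";
print "checksum CCSID 0037: $sum_cp37\n";
```

On an ASCII platform the native order is 42, Banana, apple, while the cp37 order is apple, Banana, 42, because EBCDIC places lowercase letters before uppercase letters and both before digits. The two checksums are 500 and 777 respectively.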


Internationalization (I18N) and localization (L10N) are supported, at least in principle, even on EBCDIC machines. The details are system dependent and are discussed in the "OS ISSUES" section of this document.


Perl may work with an internal UTF-EBCDIC encoding form for wide characters on EBCDIC platforms in a manner analogous to the way that it works with the UTF-8 internal encoding form on ASCII based platforms.
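A short sketch of how a program might tell which kind of platform it is on and move between native and Unicode code points. It relies only on ord and on the built-in utf8::native_to_unicode, utf8::unicode_to_native and utf8::encode functions; the variable names are illustrative, and the octets printed at the end depend on the platform's internal encoding form.

```perl
use strict;
use warnings;

# On EBCDIC platforms "A" is code point 193; on ASCII platforms it is 65.
my $is_ebcdic = ord('A') == 193 ? 1 : 0;

# Map a native code point to Unicode and back again.
my $native  = ord('A');
my $unicode = utf8::native_to_unicode($native);    # 65 on every platform
my $back    = utf8::unicode_to_native($unicode);   # equals $native

# Expose the internal encoding form of a wide character:
# UTF-8 octets on ASCII platforms, UTF-EBCDIC octets on EBCDIC ones.
my $internal = chr(0x263A);
utf8::encode($internal);                           # in-place, characters -> octets

print "EBCDIC platform: ", ($is_ebcdic ? "yes" : "no"), "\n";
print "native $native <-> Unicode $unicode\n";
print "internal octets: ",
    join(' ', map { sprintf '%02X', ord } split //, $internal), "\n";
```

Testing ord('A') keeps the check independent of any particular operating system name, and routing code points through the utf8:: mapping functions avoids hard-coding ASCII values.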

Legacy multi byte EBCDIC code pages XXX.


This pod document contains literal Latin 1 characters and may encounter translation difficulties. In particular, one popular nroff implementation was known to strip accented characters to their unaccented counterparts when this document was viewed through the pod2man program (for example, you may see a plain y rather than one with a diaeresis as in ÿ). Another nroff truncated the resultant manpage at the first occurrence of 8-bit characters.

Not all shells will allow multiple -e string arguments to perl to be concatenated together properly as recipes 0, 2, 4, 5, and 6 might seem to imply.


perllocale, perlfunc, perlunicode, utf8.

REFERENCES

ASCII: American Standard Code for Information Infiltration, Tom Jennings, September 1999.

The Unicode Standard, Version 3.0, The Unicode Consortium, Lisa Moore ed., ISBN 0-201-61633-5, Addison Wesley Developers Press, February 2000.

CDRA: IBM - Character Data Representation Architecture - Reference and Registry, IBM SC09-2190-00, December 1996.

"Demystifying Character Sets", Andrea Vine, Multilingual Computing & Technology, #26 Vol. 10 Issue 4, August/September 1999; ISSN 1523-0309; Multilingual Computing Inc. Sandpoint ID, USA.

Codes, Ciphers, and Other Cryptic and Clandestine Communication, Fred B. Wrixon, ISBN 1-57912-040-7, Black Dog & Leventhal Publishers, 1998.

IBM - EBCDIC and the P-bit; The biggest Computer Goof Ever, Robert Bemer.


15 April 2001: added UTF-8 and UTF-EBCDIC to main table, pvhp.


Peter Prymmer wrote this in 1999 and 2000 with CCSID 0819 and 0037 help from Chris Leach and André Pirard, as well as POSIX-BC help from Thomas Dorner. Thanks also to Vickie Cooper, Philip Newton, William Raffloer, and Joe Smith.

Trademarks, registered trademarks, service marks and registered service marks used in this document are the property of their respective owners.



