NAME

perlfaq - frequently asked questions about Perl ($Date: 1997/03/17 22:17:56 $)


DESCRIPTION

This document is structured into the following sections:

perlfaq: Structural overview of the FAQ.
This document.

perlfaq1: General Questions About Perl
Very general, high-level information about Perl.

perlfaq2: Obtaining and Learning about Perl
Where to find source and documentation to Perl, support and training, and related matters.

perlfaq3: Programming Tools
Programmer tools and programming support.

perlfaq4: Data Manipulation
Manipulating numbers, dates, strings, arrays, hashes, and miscellaneous data issues.

perlfaq5: Files and Formats
I/O and the ``f'' issues: filehandles, flushing, formats and footers.

perlfaq6: Regexps
Pattern matching and regular expressions.

perlfaq7: General Perl Language Issues
General Perl language issues that don't clearly fit into any of the other sections.

perlfaq8: System Interaction
Interprocess communication (IPC), control over the user-interface (keyboard, screen and pointing devices).

perlfaq9: Networking
Networking, the Internet, and a few questions on the web.


Where to get this document

This document is posted regularly to comp.lang.perl.announce and several other related newsgroups. It is available in a variety of formats from CPAN in the /CPAN/doc/FAQs/FAQ/ directory, or on the web at http://www.perl.com/perl/faq/ .


How to contribute to this document

You may mail corrections, additions, and suggestions to perlfaq-suggestions@perl.com. Mail sent to the old perlfaq alias will merely cause the FAQ to be sent to you.


What will happen if you mail your Perl programming problems to the authors

Your questions will probably go unread, unless they're suggestions of new questions to add to the FAQ, in which case they should have gone to perlfaq-suggestions@perl.com instead.

You should have read section 2 of this FAQ. There you would have learned that comp.lang.perl.misc is the appropriate place to go for free advice. If your question is really important and you require a prompt and correct answer, you should hire a consultant.


Credits

When I first began the Perl FAQ in the late 80s, I never imagined it would grow to over a hundred pages, nor that Perl would ever become so popular and widespread. This document could not have been written without the tremendous help provided by Larry Wall and the rest of the Perl Porters.


Author and Copyright Information

Copyright (c) 1997 Tom Christiansen and Nathan Torkington. All rights reserved.


Non-commercial Reproduction

Permission is granted to distribute this document, in part or in full, via electronic means or printed copy, provided that (1) all credits and copyright notices be retained, (2) no charges beyond reproduction be involved, and (3) a reasonable attempt be made to use the most current version available.

Furthermore, you may include this document in any distribution of the full Perl source or binaries, in its verbatim documentation, or on a complete dump of the CPAN archive, providing that the three stipulations given above continue to be met.


Commercial Reproduction

Requests for all other distribution rights, including the incorporation in part or in full of this text or its code into commercial products such as but not limited to books, magazine articles, or CD-ROMs, must be made to perlfaq-legal@perl.com. Any commercial use of any portion of this document without prior written authorization by its authors will be subject to appropriate action.


Disclaimer

This information is offered in good faith and in the hope that it may be of use, but is not guaranteed to be correct, up to date, or suitable for any particular purpose whatsoever. The authors accept no liability in respect of this information or its use.


Changes

  1. /March/97 Version
    Various typos fixed throughout.

    Added new question on Perl BNF on the perlfaq7 manpage.

    Initial Release: 11/March/97
    This is the initial release of version 3 of the FAQ; consequently there have been no changes since its initial release.


perlfaq1 - General Questions About Perl ($Revision: 1.10 $)


What is Perl?

Perl is a high-level programming language with an eclectic heritage written by Larry Wall and a cast of thousands. It derives from the ubiquitous C programming language and to a lesser extent from sed, awk, the Unix shell, and at least a dozen other tools and languages. Perl's process, file, and text manipulation facilities make it particularly well-suited for tasks involving quick prototyping, system utilities, software tools, system management tasks, database access, graphical programming, networking, and world wide web programming. These strengths make it especially popular with system administrators and CGI script authors, but mathematicians, geneticists, journalists, and even managers also use Perl. Maybe you should, too.


Who supports Perl? Who develops it? Why is it free?

The original culture of the pre-populist Internet and the deeply-held beliefs of Perl's author, Larry Wall, gave rise to the free and open distribution policy of perl. Perl is supported by its users. The core, the standard Perl library, the optional modules, and the documentation you're reading now were all written by volunteers. See the personal note at the end of the README file in the perl source distribution for more details.

In particular, the core development team (known as the Perl Porters) are a rag-tag band of highly altruistic individuals committed to producing better software for free than you could hope to purchase for money. You may snoop on pending developments via news://genetics.upenn.edu/perl.porters-gw/ and http://www.frii.com/~gnat/perl/porters/summary.html.

While the GNU project includes Perl in its distributions, there's no such thing as ``GNU Perl''. Perl is neither produced nor maintained by the Free Software Foundation. Perl's licensing terms are also more open than GNU software's tend to be.

You can get commercial support of Perl if you wish, although for most users the informal support will more than suffice. See the answer to ``Where can I buy a commercial version of perl?'' for more information.


Which version of Perl should I use?

You should definitely use version 5. Version 4 is old, limited, and no longer maintained. Its last patch (4.036) was in 1992. The last production release was 5.003, and the current experimental release for those at the bleeding edge (as of 27/03/97) is 5.003_92, considered a beta for production release 5.004, which will probably be out by the time you read this. Further references to the Perl language in this document refer to the current production release unless otherwise specified.


What are perl4 and perl5?

Perl4 and perl5 are informal names for different versions of the Perl programming language. It's easier to say ``perl5'' than it is to say ``the 5 release of Perl'', but some people have interpreted this to mean there's a language called ``perl5'', which isn't the case. Perl5 is merely the popular name for the fifth major release (October 1994), while perl4 was the fourth major release (March 1991). There was also a perl1 (in January 1988), a perl2 (June 1988), and a perl3 (October 1989).

The 5.0 release is, essentially, a complete rewrite of the perl source code from the ground up. It has been modularized, object-oriented, tweaked, trimmed, and optimized until it almost doesn't look like the old code. However, the interface is mostly the same, and compatibility with previous releases is very high.

To avoid the ``what language is perl5?'' confusion, some people prefer to simply use ``perl'' to refer to the latest version of perl and avoid using ``perl5'' altogether. It's not really that big a deal, though.


How stable is Perl?

Production releases, which incorporate bug fixes and new functionality, are widely tested before release. Since the 5.000 release, we have averaged only about one production release per year.

Larry and the Perl development team occasionally make changes to the internal core of the language, but all possible efforts are made toward backward compatibility. While not quite all perl4 scripts run flawlessly under perl5, an update to perl should nearly never invalidate a program written for an earlier version of perl (barring accidental bug fixes and the rare new keyword).


Is Perl difficult to learn?

Perl is easy to start learning -- and easy to keep learning. It looks like most programming languages you're likely to have had experience with, so if you've ever written a C program, an awk script, a shell script, or even an Excel macro, you're already part way there.

Most tasks only require a small subset of the Perl language. One of the guiding mottos for Perl development is ``there's more than one way to do it'' (TMTOWTDI, sometimes pronounced ``tim toady''). Perl's learning curve is therefore shallow (easy to learn) and long (there's a whole lot you can do if you really want).

Finally, Perl is (frequently) an interpreted language. This means that you can write your programs and test them without an intermediate compilation step, allowing you to experiment and test/debug quickly and easily. This ease of experimentation flattens the learning curve even more.

Things that make Perl easier to learn: Unix experience, almost any kind of programming experience, an understanding of regular expressions, and the ability to understand other people's code. If there's something you need to do, then it's probably already been done, and a working example is usually available for free. Don't forget the new perl modules, either. They're discussed in Part 3 of this FAQ, along with the CPAN, which is discussed in Part 2.


How does Perl compare with other languages like Java, Python, REXX, Scheme, or Tcl?

Favorably in some areas, unfavorably in others. Precisely which areas are good and bad is often a personal choice, so asking this question on Usenet runs a strong risk of starting an unproductive Holy War.

Probably the best thing to do is try to write equivalent code to do a set of tasks. These languages have their own newsgroups in which you can learn about (but hopefully not argue about) them.


Can I do [task] in Perl?

Perl is flexible and extensible enough for you to use on almost any task, from one-line file-processing tasks to complex systems. For many people, Perl serves as a great replacement for shell scripting. For others, it serves as a convenient, high-level replacement for most of what they'd program in low-level languages like C or C++. It's ultimately up to you (and possibly your management ...) which tasks you'll use Perl for and which you won't.

If you have a library that provides an API, you can make any component of it available as just another Perl function or variable using a Perl extension written in C or C++ and dynamically linked into your main perl interpreter. You can also go the other direction, and write your main program in C or C++, and then link in some Perl code on the fly, to create a powerful application.

That said, there will always be small, focused, special-purpose languages dedicated to a specific problem domain that are simply more convenient for certain kinds of problems. Perl tries to be all things to all people, but nothing special to anyone. Examples of specialized languages that come to mind include prolog and matlab.


When shouldn't I program in Perl?

When your manager forbids it -- but do consider replacing them :-).

Actually, one good reason is when you already have an existing application written in another language that's all done (and done well), or you have an application language specifically designed for a certain task (e.g. prolog, make).

For various reasons, Perl is probably not well-suited for real-time embedded systems, low-level operating systems development work like device drivers or context-switching code, complex multithreaded shared-memory applications, or extremely large applications. You'll notice that perl is not itself written in Perl.

The new native-code compiler for Perl may reduce the limitations given in the previous statement to some degree, but understand that Perl remains fundamentally a dynamically typed language, and not a statically typed one. You certainly won't be chastised if you don't trust nuclear-plant or brain-surgery monitoring code to it. And Larry will sleep easier, too -- Wall Street programs notwithstanding. :-)


What's the difference between "perl" and "Perl"?

One bit. Oh, you weren't talking ASCII? :-) Larry now uses ``Perl'' to signify the language proper and ``perl'' the implementation of it, i.e. the current interpreter. Hence Tom's quip that ``Nothing but perl can parse Perl.'' You may or may not choose to follow this usage. For example, parallelism means ``awk and perl'' and ``Python and Perl'' look ok, while ``awk and Perl'' and ``Python and perl'' do not.
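The ``one bit'' quip is literal: in ASCII, the codes for ``p'' and ``P'' differ only in the 0x20 bit. A minimal sketch that checks this, using only core Perl; the helper name differing_bits is made up for this illustration and is not part of Perl:

```perl
use strict;
use warnings;

# Count the bit positions in which the codes of two single characters differ.
# differing_bits is a hypothetical helper, not part of Perl.
sub differing_bits {
    my ($first, $second) = @_;
    my $xor   = ord($first) ^ ord($second);   # set bits mark the differences
    my $count = 0;
    while ($xor) {
        $count += $xor & 1;                   # add the lowest bit
        $xor >>= 1;                           # then shift it away
    }
    return $count;
}

printf "%08b\n", ord("p") ^ ord("P");         # the pattern of differing bits
print differing_bits("p", "P"), "\n";         # how many bits differ
```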


Is it a Perl program or a Perl script?

It doesn't matter.

In ``standard terminology'' a program has been compiled to physical machine code once, and can then be run multiple times, whereas a script must be translated by a program each time it's used. Perl programs, however, are usually neither strictly compiled nor strictly interpreted. They can be compiled to a bytecode form (something of a Perl virtual machine) or to completely different languages, like C or assembly language. You can't tell just by looking whether the source is destined for a pure interpreter, a parse-tree interpreter, a byte-code interpreter, or a native-code compiler, so it's hard to give a definitive answer here.


What is a JAPH?

These are the ``just another perl hacker'' signatures that some people sign their postings with. About 100 of the earlier ones are available from http://www.perl.com/CPAN/misc/japh .


Where can I get a list of Larry Wall witticisms?

Over a hundred quips by Larry, from postings of his or source code, can be found at http://www.perl.com/CPAN/misc/lwall-quotes .


How can I convince my sysadmin/supervisor/employees to use version (5/5.004/Perl instead of some other language)?

If your manager or employees are wary of unsupported software, or software which doesn't officially ship with your Operating System, you might try to appeal to their self-interest. If programmers can be more productive using and utilizing Perl constructs, functionality, simplicity, and power, then the typical manager/supervisor/employee may be persuaded. Regarding using Perl in general, it's also sometimes helpful to point out that delivery times may be reduced using Perl, as compared to other languages.

If you have a project which has a bottleneck, especially in terms of translation, or testing, Perl almost certainly will provide a viable, and quick solution. In conjunction with any persuasion effort, you should not fail to point out that Perl is used, quite extensively, and with extremely reliable and valuable results, at many large computer software and/or hardware companies throughout the world. In fact, many Unix vendors now ship Perl by default, and support is usually just a news-posting away, if you can't find the answer in the comprehensive documentation, including this FAQ.

If you face reluctance to upgrading from an older version of perl, then point out that version 4 is utterly unmaintained and unsupported by the Perl Development Team. Another big sell for Perl5 is the large number of modules and extensions which greatly reduce development time for any given task. Also mention that the difference between version 4 and version 5 of Perl is like the difference between awk and C++. (Well, ok, maybe not quite that distinct, but you get the idea.) If you want support and a reasonable guarantee that what you're developing will continue to work in the future, then you have to run the supported version. That probably means running the 5.004 release, although 5.003 isn't that bad (it's just one year and one release behind). Several important bugs were fixed from the 5.000 through 5.002 versions, though, so try upgrading past them if possible.


perlfaq2 - Obtaining and Learning about Perl ($Revision: 1.13 $)

This section of the FAQ answers questions about where to find source and documentation for Perl, support and training, and related matters.


What machines support Perl? Where do I get it?

The standard release of Perl (the one maintained by the perl development team) is distributed only in source code form. You can find this at http://www.perl.com/CPAN/src/latest.tar.gz, which is a gzipped archive in POSIX tar format. This source builds with no porting whatsoever on most Unix systems (Perl's native environment), as well as Plan 9, VMS, QNX, OS/2, and the Amiga.

Although it's rumored that the (imminent) 5.004 release may build on Windows NT, this is yet to be proven. Binary distributions for 32-bit Microsoft systems and for Apple systems can be found in the http://www.perl.com/CPAN/ports/ directory. Because these are not part of the standard distribution, they may and in fact do differ from the base Perl port in a variety of ways. You'll have to check their respective release notes to see just what the differences are. These differences can be either positive (e.g. extensions for the features of the particular platform that are not supported in the source release of perl) or negative (e.g. might be based upon a less current source release of perl).

A useful FAQ for Win32 Perl users is http://www.endcontsw.com/people/evangelo/Perl_for_Win32_FAQ.html


How can I get a binary version of Perl?

If you don't have a C compiler because for whatever reasons your vendor did not include one with your system, the best thing to do is grab a binary version of gcc from the net and use that to compile perl with. CPAN only has binaries for systems that are terribly hard to get free compilers for, not for Unix systems.


I copied the Perl binary from one machine to another, but scripts don't work.

That's probably because you forgot libraries, or library paths differ. You really should build the whole distribution on the machine it will eventually live on, and then type make install. Most other approaches are doomed to failure.

One simple way to check that things are in the right place is to print out the hard-coded @INC which perl is looking for.

	perl -e 'print join("\n",@INC)'

If this command lists any paths which don't exist on your system, then you may need to move the appropriate libraries to these locations, or create symlinks, aliases, or shortcuts appropriately.
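The same check can be scripted so that missing directories stand out. What follows is only a sketch using core Perl; the helper name check_inc is made up for this example, and the ``ok''/``MISSING'' labels are arbitrary:

```perl
use strict;
use warnings;

# For each directory given, return a pair [directory, 1 or 0] saying
# whether it exists on this machine. check_inc is a hypothetical helper.
sub check_inc {
    my @dirs = @_;
    return map { [ $_, (-d $_ ? 1 : 0) ] } @dirs;
}

# Skip any non-string entries (hooks) that @INC may contain.
for my $entry (check_inc(grep { !ref } @INC)) {
    my ($dir, $exists) = @$entry;
    printf "%-8s %s\n", ($exists ? "ok" : "MISSING"), $dir;
}
```

Any line marked MISSING names a path that perl will search but that does not exist here.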


I grabbed the sources and tried to compile but gdbm/dynamic loading/malloc/linking/... failed. How do I make it work?

Read the INSTALL file, which is part of the source distribution. It describes in detail how to cope with most idiosyncrasies that the Configure script can't work around for any given system or architecture.


What modules and extensions are available for Perl? What is CPAN? What does CPAN/src/... mean?

CPAN stands for Comprehensive Perl Archive Network, a huge archive replicated on dozens of machines all over the world. CPAN contains source code, non-native ports, documentation, scripts, and many third-party modules and extensions, designed for everything from commercial database interfaces to keyboard/screen control to web walking and CGI scripts. The master machine for CPAN is ftp://ftp.funet.fi/pub/languages/perl/CPAN/, but you can use the address http://www.perl.com/CPAN/CPAN.html to fetch a copy from a ``site near you''. See http://www.perl.com/CPAN (without a slash at the end) for how this process works.

CPAN/path/... is a naming convention for files available on CPAN sites. CPAN indicates the base directory of a CPAN mirror, and the rest of the path is the path from that directory to the file. For instance, if you're using ftp://ftp.funet.fi/pub/languages/perl/CPAN as your CPAN site, the file CPAN/misc/japh is downloadable as ftp://ftp.funet.fi/pub/languages/perl/CPAN/misc/japh .
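The convention amounts to a simple string substitution. Here is a rough sketch of it in Perl; the function name cpan_url is an assumption made for this example and does not come from any CPAN module:

```perl
use strict;
use warnings;

# Expand a "CPAN/..." style name against the base URL of a chosen mirror.
# cpan_url is a hypothetical helper written for this illustration.
sub cpan_url {
    my ($mirror, $name) = @_;
    $mirror =~ s{/+$}{};                   # drop trailing slashes from the base
    (my $path = $name) =~ s{^CPAN/}{};     # strip the symbolic CPAN/ prefix
    return "$mirror/$path";
}

print cpan_url("ftp://ftp.funet.fi/pub/languages/perl/CPAN", "CPAN/misc/japh"), "\n";
```

Substituting a different mirror's base directory for the first argument yields the same file from that mirror.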

Considering that there are hundreds of existing modules in the archive, one probably exists to do nearly anything you can think of. Current categories under CPAN/modules/by-category/ include perl core modules; development support; operating system interfaces; networking, devices, and interprocess communication; data type utilities; database interfaces; user interfaces; interfaces to other languages; filenames, file systems, and file locking; internationalization and locale; world wide web support; server and daemon utilities; archiving and compression; image manipulation; mail and news; control flow utilities; filehandle and I/O; Microsoft Windows modules; and miscellaneous modules.


Is there an ISO or ANSI certified version of Perl?

Certainly not. Larry expects that he'll be certified before Perl is.


Where can I get information on Perl?

The complete Perl documentation is available with the perl distribution. If you have perl installed locally, you probably have the documentation installed as well: type man perl if you're on a system resembling Unix. This will lead you to other important man pages. If you're not on a Unix system, access to the documentation will be different; for example, it might be only in HTML format. But all proper perl installations have fully-accessible documentation.

You might also try perldoc perl in case your system doesn't have a proper man command, or it's been misinstalled. If that doesn't work, try looking in /usr/local/lib/perl5/pod for documentation.

If all else fails, consult the CPAN/doc directory, which contains the complete documentation in various formats, including native pod, troff, html, and plain text. There's also a web page at http://www.perl.com/perl/info/documentation.html that might help.

It's also worth noting that there's a PDF version of the complete documentation for perl available in the CPAN/authors/id/BMIDD directory.

Many good books have been written about Perl -- see the section below for more details.


What are the Perl newsgroups on USENET? Where do I post questions?

The now defunct comp.lang.perl newsgroup has been superseded by the following groups:

    comp.lang.perl.announce 		Moderated announcement group
    comp.lang.perl.misc     		Very busy group about Perl in general
    comp.lang.perl.modules  		Use and development of Perl modules
    comp.lang.perl.tk           	Using Tk (and X) from Perl

    comp.infosystems.www.authoring.cgi 	Writing CGI scripts for the Web.

There is also a USENET gateway to the mailing list used by the crack Perl development team (perl5-porters) at news://genetics.upenn.edu/perl.porters-gw/ .


Where should I post source code?

You should post source code to whichever group is most appropriate, but feel free to cross-post to comp.lang.perl.misc. If you want to cross-post to alt.sources, please make sure it follows their posting standards, including setting the Followup-To header line to NOT include alt.sources; see their FAQ for details.


Perl Books

A number of books on Perl and/or CGI programming are available. A few of these are good, some are ok, but many aren't worth your money. Tom Christiansen maintains a list of these books, some with extensive reviews, at http://www.perl.com/perl/critiques/index.html.

The incontestably definitive reference book on Perl, written by the creator of Perl and his apostles, is now in its second edition and fourth printing.

    Programming Perl (the "Camel Book"):
	Authors: Larry Wall, Tom Christiansen, and Randal Schwartz
        ISBN 1-56592-149-6      (English)
        ISBN 4-89052-384-7      (Japanese)
	(French and German translations in progress)

Note that O'Reilly books are color-coded: turquoise (some would call it teal) covers indicate perl5 coverage, while magenta (some would call it pink) covers indicate perl4 only. Check the cover color before you buy!

What follows is a list of the books that the FAQ authors found personally useful. Your mileage may (but, we hope, probably won't) vary.

If you're already a hard-core systems programmer, then the Camel Book just might suffice for you to learn Perl from. But if you're not, check out the ``Llama Book''. It currently doesn't cover perl5, but the 2nd edition is nearly done and should be out by summer 97:

    Learning Perl (the Llama Book):
	Author: Randal Schwartz, with intro by Larry Wall
        ISBN 1-56592-042-2      (English)
        ISBN 4-89502-678-1      (Japanese)
        ISBN 2-84177-005-2      (French)
        ISBN 3-930673-08-8      (German)

Another stand-out book in the turquoise O'Reilly Perl line is the ``Hip Owls'' book. It covers regular expressions inside and out, with quite a bit devoted exclusively to Perl:

    Mastering Regular Expressions (the Cute Owls Book):
	Author: Jeffrey Friedl
	ISBN 1-56592-257-3

You can order any of these books from O'Reilly & Associates, 1-800-998-9938. Local/overseas is 1-707-829-0515. If you can locate an O'Reilly order form, you can also fax to 1-707-829-0104. See http://www.ora.com/ on the Web.

Recommended Perl books that are not from O'Reilly are the following:

   Cross-Platform Perl, (for Unix and Windows NT)
       Author: Eric F. Johnson
       ISBN: 1-55851-483-X

   How to Set up and Maintain a World Wide Web Site, (2nd edition)
	Author: Lincoln Stein, M.D., Ph.D.
	ISBN: 0-201-63462-7

   CGI Programming in C & Perl,
	Author: Thomas Boutell
	ISBN: 0-201-42219-0

Note that some of these address specific application areas (e.g. the Web) and are not general-purpose programming books.


Perl in Magazines

The Perl Journal is the first and only magazine dedicated to Perl. It is published (on paper, not online) quarterly by Jon Orwant (orwant@tpj.com), editor. Subscription information is at http://tpj.com or via email to subscriptions@tpj.com.

Beyond this, two other magazines that frequently carry high-quality articles on Perl are Web Techniques (see http://www.webtechniques.com/) and Unix Review (http://www.unixreview.com/).


Perl on the Net: FTP and WWW Access

To get the best (and possibly cheapest) performance, pick a site from the list below and use it to grab the complete list of mirror sites. From there you can find the quickest site for you. Remember, the following list is not the complete list of CPAN mirrors.

  http://www.perl.com/CPAN	(redirects to another mirror)
  http://www.perl.org/CPAN
  ftp://ftp.funet.fi/pub/languages/perl/CPAN/
  http://www.cs.ruu.nl/pub/PERL/CPAN/
  ftp://ftp.cs.colorado.edu/pub/perl/CPAN/


What mailing lists are there for perl?

Most of the major modules (tk, CGI, libwww-perl) have their own mailing lists. Consult the documentation that came with the module for subscription information. The following is a list of mailing lists related to perl itself.

If you subscribe to a mailing list, it behooves you to know how to unsubscribe from it. Strident pleas to the list itself to get you off will not be favorably received.

MacPerl
There is a mailing list for discussing Macintosh Perl. Contact ``mac-perl-request@iis.ee.ethz.ch''.

Also see Matthias Neeracher's (the creator and maintainer of MacPerl) webpage at http://www.iis.ee.ethz.ch/~neeri/macintosh/perl.html for many links to interesting MacPerl sites, and the applications/MPW tools, precompiled.

Perl5-Porters
The core development team have a mailing list for discussing fixes and changes to the language. Send mail to ``perl5-porters-request@perl.org'' with help in the body of the message for information on subscribing.

NTPerl
This list is used to discuss issues involving Win32 Perl 5 (Windows NT and Win95). Subscribe by emailing ListManager@ActiveWare.com with the message body:

    subscribe Perl-Win32-Users

The list software, also written in perl, will automatically determine your address, and subscribe you automatically. To unsubscribe, email the following in the message body to the same address like so:

    unsubscribe Perl-Win32-Users

You can also check http://www.activeware.com/ and select ``Mailing Lists'' to join or leave this list.

Perl-Packrats
Discussion related to archiving of perl materials, particularly the Comprehensive Perl Archive Network (CPAN). Subscribe by emailing majordomo@cis.ufl.edu:

    subscribe perl-packrats

The list software, also written in perl, will automatically determine your address, and subscribe you automatically. To unsubscribe, simply prepend the same command with an ``un'', and mail to the same address like so:

    unsubscribe perl-packrats


Archives of comp.lang.perl.misc

Have you tried Deja News or Alta Vista?

ftp.cis.ufl.edu:/pub/perl/comp.lang.perl.*/monthly has an almost complete collection dating back to 12/89 (missing 08/91 through 12/93). They are kept as one large file for each month.

You'll probably want a more sophisticated query and retrieval mechanism than a file listing, preferably one that allows you to retrieve articles using fast-access indices, keyed on at least author, date, subject, thread (as in ``trn'') and probably keywords. The best solution the FAQ authors know of is the MH pick command, but it is very slow to select on 18000 articles.

If you have, or know where can be found, the missing sections, please let perlfaq-suggestions@perl.com know.


Perl Training

While some large training companies offer their own courses on Perl, you may prefer to contact individuals near and dear to the heart of Perl development. Two well-known members of the Perl development team who offer such things are Tom Christiansen and Randal Schwartz, plus their respective minions, who offer a variety of professional tutorials and seminars on Perl. These courses include large public seminars, private corporate training, and fly-ins to Colorado and Oregon. See http://www.perl.com/perl/info/training.html for more details.


Where can I buy a commercial version of Perl?

In a sense, Perl already is commercial software: It has a licence that you can grab and carefully read to your manager. It is distributed in releases and comes in well-defined packages. There is a very large user community and an extensive literature. The comp.lang.perl.* newsgroups and several of the mailing lists provide free answers to your questions in near real-time. Perl has traditionally been supported by Larry, dozens of software designers and developers, and thousands of programmers, all working for free to create a useful thing to make life better for everyone.

However, these answers may not suffice for managers who require a purchase order from a company whom they can sue should anything go wrong. Or maybe they need very serious hand-holding and contractual obligations. Shrink-wrapped CDs with perl on them are available from several sources if that will help.

Or you can purchase a real support contract. Although Cygnus historically provided this service, they no longer sell support contracts for Perl. Instead, the Paul Ingram Group will be taking up the slack through The Perl Clinic. The following is a commercial from them:

``Do you need professional support for Perl and/or Oraperl? Do you need a support contract with defined levels of service? Do you want to pay only for what you need?

``The Paul Ingram Group has provided quality software development and support services to some of the world's largest corporations for ten years. We are now offering the same quality support services for Perl at The Perl Clinic. This service is led by Tim Bunce, an active perl porter since 1994 and well known as the author and maintainer of the DBI, DBD::Oracle, and Oraperl modules and author/co-maintainer of The Perl 5 Module List. We also offer Oracle users support for Perl5 Oraperl and related modules (which Oracle is planning to ship as part of Oracle Web Server 3). 20% of the profit from our Perl support work will be donated to The Perl Institute.''

For more information, contact The Perl Clinic:

    Tel:    +44 1483 424424
    Fax:    +44 1483 419419
    Web:    http://www.perl.co.uk/
    Email:  perl-support-info@perl.co.uk or Tim.Bunce@ig.co.uk


Where do I send bug reports?

If you are reporting a bug in the perl interpreter or the modules shipped with perl, use the perlbug program in the perl distribution or email your report to perlbug@perl.com.

If you are posting a bug with a non-standard port (see the answer to ``What platforms is Perl available for?''), a binary distribution, or a non-standard module (such as Tk, CGI, etc), then please see the documentation that came with it to determine the correct place to post bugs.

Read the perlbug man page (perl5.004 or later) for more information.


What is perl.com? perl.org? The Perl Institute?

perl.org is the official vehicle for The Perl Institute. The motto of TPI is ``helping people help Perl help people'' (or something like that). It's a non-profit organization supporting development, documentation, and dissemination of perl. Current directors of TPI include Larry Wall, Tom Christiansen, and Randal Schwartz, whom you may have heard of somewhere else around here.

The perl.com domain is Tom Christiansen's domain. He created it as a public service long before perl.org came about. It's the original PBS of the Perl world, a clearinghouse for information about all things Perlian, accepting no paid advertisements, glossy gifs, or (gasp!) java applets on its pages.


How do I learn about object-oriented Perl programming?

The perltoot manpage (distributed with 5.004 or later) is a good place to start. Also, the perlobj manpage, the perlref manpage, and the perlmod manpage are useful references, while the perlbot manpage has some excellent tips and tricks.


perlfaq3 - Programming Tools ($Revision: 1.19 $)

This section of the FAQ answers questions related to programmer tools and programming support.


How do I do (anything)?

Have you looked at CPAN (see the perlfaq2 manpage)? The chances are that someone has already written a module that can solve your problem. Have you read the appropriate man pages? Here's a brief index:

	Objects		perlref, perlmod, perlobj, perltie
	Data Structures	perlref, perllol, perldsc
	Modules		perlmod, perlsub
	Regexps		perlre, perlfunc, perlop
	Moving to perl5	perltrap, perl
	Linking w/C	perlxstut, perlxs, perlcall, perlguts, perlembed
	Various 	http://www.perl.com/CPAN/doc/FMTEYEWTK/index.html
			(not a man-page but still useful)

The perltoc manpage provides a crude table of contents for the perl man page set.


How can I use Perl interactively?

The typical approach uses the Perl debugger, described in the perldebug man page, on an ``empty'' program, like this:

    perl -de 42

Now just type in any legal Perl code, and it will be immediately evaluated. You can also examine the symbol table, get stack backtraces, check variable values, set breakpoints, and perform other operations typically found in symbolic debuggers.


Is there a Perl shell?

In general, no. The Shell.pm module (distributed with perl) makes perl try commands which aren't part of the Perl language as shell commands. perlsh from the source distribution is simplistic and uninteresting, but may still be what you want.


How do I debug my Perl programs?

Have you used -w?

Have you tried use strict?

Did you check the returns of each and every system call?

Did you read the perltrap manpage?

Have you tried the Perl debugger, described in the perldebug manpage?
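
The first three questions above can be answered "yes" at the top of almost any script. The following is only a minimal sketch: count_lines is a hypothetical helper, the path is a placeholder chosen because it should not exist, and the three-argument open with a lexical filehandle assumes a perl newer than the ones this FAQ was written for.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: count the lines in a file, checking the return
# value of every system call instead of assuming success.
sub count_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!\n";
    my $n = 0;
    $n++ while <$fh>;
    close($fh) or die "can't close $path: $!\n";
    return $n;
}

# A path that should not exist, to show the failure being reported.
my $ok = eval { count_lines("/no/such/file"); 1 };
print "caught: $@" unless $ok;
```

Because the open is checked, the failure surfaces with the system's own error text in $! rather than as a mysterious empty result later on.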


How do I profile my Perl programs?

You should get the Devel::DProf module from CPAN, and also use Benchmark.pm from the standard distribution. Benchmark lets you time specific portions of your code, while Devel::DProf gives detailed breakdowns of where your code spends its time.
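
As a rough illustration of the Benchmark half of that advice, the sketch below times two hypothetical subroutines (with_join and with_concat are made-up names) that build the same string. The iteration count is arbitrary, and the timings it prints will vary from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

# Two hypothetical ways of building the same 200-character string.
sub with_join   { return join('', map { chr(65 + $_ % 26) } 1 .. 200) }
sub with_concat { my $s = ''; $s .= chr(65 + $_ % 26) for 1 .. 200; return $s }

# Time a fixed number of iterations of each candidate.
my $t_join   = timeit(2000, \&with_join);
my $t_concat = timeit(2000, \&with_concat);

print "join:   ", timestr($t_join),   "\n";
print "concat: ", timestr($t_concat), "\n";
```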


How do I cross-reference my Perl programs?

The B::Xref module, shipped with the new, alpha-release Perl compiler (not the general distribution), can be used to generate cross-reference reports for Perl programs.

    perl -MO=Xref[,OPTIONS] foo.pl


Is there a pretty-printer (formatter) for Perl?

There is no program that will reformat Perl as much as indent will do for C. The complex feedback between the scanner and the parser (this feedback is what confuses the vgrind and emacs programs) makes it challenging at best to write a stand-alone Perl parser.

Of course, if you simply follow the guidelines in the perlstyle manpage, you shouldn't need to reformat.

Your editor can and should help you with source formatting. The perl-mode for emacs can provide a remarkable amount of help with most (but not all) code, and even less programmable editors can provide significant assistance.

If you are used to using the vgrind program for printing out nice code to a laser printer, you can take a stab at this using http://www.perl.com/CPAN/doc/misc/tips/working.vgrind.entry, but the results are not particularly satisfying for sophisticated code.


Is there a ctags for Perl?

There's a simple one at http://www.perl.com/CPAN/authors/id/TOMC/scripts/ptags.gz which may do the trick.


Where can I get Perl macros for vi?

For a complete version of Tom Christiansen's vi configuration file, see ftp://ftp.perl.com/pub/vi/toms.exrc, the standard benchmark file for vi emulators. This runs best with nvi, the current version of vi out of Berkeley, which incidentally can be built with an embedded Perl interpreter -- see http://www.perl.com/CPAN/src/misc .


Where can I get perl-mode for emacs?

Since Emacs version 19 patchlevel 22 or so, there have been both a perl-mode.el and support for the perl debugger built in. These should come with the standard Emacs 19 distribution.

In the perl source directory, you'll find a directory called ``emacs'', which contains a cperl-mode that color-codes keywords, provides context-sensitive help, and other nifty things.

Note that the perl-mode of emacs will have fits with ``main'foo'' (single quote), and mess up the indentation and highlighting. You should be using ``main::foo'', anyway.


How can I use curses with Perl?

The Curses module from CPAN provides a dynamically loadable object module interface to a curses library.


How can I use X or Tk with Perl?

Tk is a completely Perl-based, object-oriented interface to the Tk toolkit that doesn't force you to use Tcl just to get at Tk. Sx is an interface to the Athena Widget set. Both are available from CPAN.


How can I generate simple menus without using CGI or Tk?

The http://www.perl.com/CPAN/authors/id/SKUNZ/perlmenu.v4.0.tar.gz module, which is curses-based, can help with this.


Can I dynamically load C routines into Perl?

If your system architecture supports it, then the standard perl on your system should also provide you with this via the DynaLoader module. Read the perlxstut manpage for details.


What is undump?

See the next question.


How can I make my Perl program run faster?

The best way to do this is to come up with a better algorithm. This can often make a dramatic difference. Chapter 8 in the Camel has some efficiency tips in it you might want to look at.

Other approaches include autoloading seldom-used Perl code. See the AutoSplit and AutoLoader modules in the standard distribution for that. Or you could locate the bottleneck and think about writing just that part in C, the way we used to take bottlenecks in C code and write them in assembler. Similar to rewriting in C is the use of modules that have critical sections written in C (for instance, the PDL module from CPAN).

In some cases, it may be worth it to use the backend compiler to produce byte code (saving compilation time) or compile into C, which will certainly save compilation time and sometimes a small amount (but not much) execution time. See the question about compiling your Perl programs.

If you're currently linking your perl executable to a shared libc.so, you can often gain a 10-25% performance benefit by rebuilding it to link with a static libc.a instead. This will make a bigger perl executable, but your Perl programs (and programmers) may thank you for it. See the INSTALL file in the source distribution for more information.

Unsubstantiated reports allege that Perl interpreters that use sfio outperform those that don't (for IO intensive applications). To try this, see the INSTALL file in the source distribution, especially the ``Selecting File IO mechanisms'' section.

The undump program was an old attempt to speed up your Perl program by storing the already-compiled form to disk. This is no longer a viable option, as it only worked on a few architectures, and wasn't a good solution anyway.


How can I make my Perl program take less memory?

When it comes to time-space tradeoffs, Perl nearly always prefers to throw memory at a problem. Scalars in Perl use more memory than strings in C, arrays take more than that, and hashes use even more. While there's still a lot to be done, recent releases have been addressing these issues. For example, as of 5.004, duplicate hash keys are shared amongst all hashes using them, so require no reallocation.

In some cases, using substr or vec to simulate arrays can be highly beneficial. For example, an array of a thousand booleans will take at least 20,000 bytes of space, but it can be turned into one 125-byte bit vector for a considerable memory savings. The standard Tie::SubstrHash module can also help for certain types of data structure. If you're working with specialist data structures (matrices, for instance) modules that implement these in C may use less memory than equivalent Perl modules.
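
The bit-vector idea can be sketched with vec as follows. The particular flag numbers are arbitrary and chosen only for illustration.

```perl
use strict;
use warnings;

my $bits = '';                 # the bit vector lives in an ordinary string
vec($bits, 999, 1) = 0;        # writing the last bit extends it to 1000 bits

# Set a few arbitrary flags.
vec($bits, $_, 1) = 1 for (3, 42, 999);

print "bit 42 is ", vec($bits, 42, 1), "\n";
print "bit 43 is ", vec($bits, 43, 1), "\n";
print "storage:  ", length($bits), " bytes\n";
```

A thousand one-bit flags fit in 125 bytes this way, at the cost of going through vec for every access.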

Another thing to try is learning whether your Perl was compiled with the system malloc or with Perl's built-in malloc. Whichever one it is, try using the other one and see whether this makes a difference. Information about malloc is in the INSTALL file in the source distribution. You can find out whether you are using perl's malloc by typing perl -V:usemymalloc.


Is it unsafe to return a pointer to local data?

No, Perl's garbage collection system takes care of this.

    sub makeone {
	my @a = ( 1 .. 10 );
	return \@a;
    }

    for $i ( 1 .. 10 ) {
        push @many, makeone();
    }

    print $many[4][5], "\n";

    print "@many\n";


How can I free an array or hash so my program shrinks?

You can't. Memory the system allocates to a program will never be returned to the system. That's why long-running programs sometimes re-exec themselves.

However, judicious use of my on your variables will help make sure that they go out of scope so that Perl can free up their storage for use in other parts of your program. (NB: my variables also execute about 10% faster than globals.) A global variable, of course, never goes out of scope, so you can't get its space automatically reclaimed, although using undef and/or delete on it will achieve the same effect. In general, memory allocation and de-allocation isn't something you can or should be worrying about much in Perl, but even this capability (preallocation of data types) is in the works.


How can I make my CGI script more efficient?

Beyond the normal measures described to make general Perl programs faster or smaller, a CGI program has additional issues. It may be run several times per second. Given that each time it runs it will need to be re-compiled and will often allocate a megabyte or more of system memory, this can be a killer. Compiling into C isn't going to help you because the process start-up overhead is where the bottleneck is.

There are at least two popular ways to avoid this overhead. One solution involves running the Apache HTTP server (available from http://www.apache.org/) with either of the mod_perl or mod_fastcgi plugin modules. With mod_perl and the Apache::* modules (from CPAN), httpd will run with an embedded Perl interpreter which pre-compiles your script and then executes it within the same address space without forking. The Apache extension also gives Perl access to the internal server API, so modules written in Perl can do just about anything a module written in C can. With the FCGI module (from CPAN), a Perl executable compiled with sfio (see the INSTALL file in the distribution) and the mod_fastcgi module (available from http://www.fastcgi.com/), each of your perl scripts becomes a permanent CGI daemon process.

Both of these solutions can have far-reaching effects on your system and on the way you write your CGI scripts, so investigate them with care.


How can I hide the source for my Perl program?

Delete it. :-) Seriously, there are a number of (mostly unsatisfactory) solutions with varying levels of ``security''.

First of all, however, you can't take away read permission, because the source code has to be readable in order to be compiled and interpreted. (That doesn't mean that a CGI script's source is readable by people on the web, though.) So you have to leave the permissions at the socially friendly 0755 level.

Some people regard this as a security problem. If your program does insecure things, and relies on people not knowing how to exploit those insecurities, it is not secure. It is often possible for someone to determine the insecure things and exploit them without viewing the source. Security through obscurity, the name for hiding your bugs instead of fixing them, is little security indeed.

You can try using encryption via source filters (Filter::* from CPAN). But crackers might be able to decrypt it. You can try using the byte-code compiler and interpreter described below, but crackers might be able to de-compile it. You can try using the native-code compiler described below, but crackers might be able to disassemble it. These pose varying degrees of difficulty to people wanting to get at your code, but none can definitively conceal it (this is true of every language, not just Perl).

If you're concerned about people profiting from your code, then the bottom line is that nothing but a restrictive licence will give you legal security. License your software and pepper it with threatening statements like ``This is unpublished proprietary software of XYZ Corp. Your access to it does not give you permission to use it blah blah blah.'' We are not lawyers, of course, so you should see a lawyer if you want to be sure your licence's wording will stand up in court.


How can I compile my Perl program into byte-code or C?

Malcolm Beattie has written a multifunction backend compiler, available from CPAN, that can do both these things. It is as of Feb-1997 in late alpha release, which means it's fun to play with if you're a programmer but not really for people looking for turn-key solutions.

Please understand that merely compiling into C does not in and of itself guarantee that your code will run very much faster. That's because except for lucky cases where a lot of native type inferencing is possible, the normal Perl run time system is still present and thus will still take just as long to run and be just as big. Most programs save little more than compilation time, leaving execution no more than 10-30% faster. A few rare programs actually benefit significantly (like several times faster), but this takes some tweaking of your code.

Malcolm will be in charge of the 5.005 release of Perl itself to try to unify and merge his compiler and multithreading work into the main release.

You'll probably be astonished to learn that the current version of the compiler generates a compiled form of your script whose executable is just as big as the original perl executable, and then some. That's because as currently written, all programs are prepared for a full eval statement. You can tremendously reduce this cost by building a shared libperl.so library and linking against that. See the INSTALL podfile in the perl source distribution for details. If you link your main perl binary with this, it will make it minuscule. For example, on one author's system, /usr/bin/perl is only 11k in size!


How can I get '#!perl' to work on [MSDOS,NT,...]?

For OS/2 just use

    extproc perl -S -your_switches

as the first line in a *.cmd file (-S due to a bug in cmd.exe's `extproc' handling). For DOS one should first invent a corresponding batch file, and codify it in ALTERNATIVE_SHEBANG (see the INSTALL file in the source distribution for more information).

The Win95/NT installation, when using the Activeware port of Perl, will modify the Registry to associate the .pl extension with the perl interpreter. If you install another port, or (eventually) build your own Win95/NT Perl using WinGCC, then you'll have to modify the Registry yourself.

Macintosh perl scripts will have the appropriate Creator and Type, so that double-clicking them will invoke the perl application.

IMPORTANT!: Whatever you do, PLEASE don't get frustrated, and just throw the perl interpreter into your cgi-bin directory, in order to get your scripts working for a web server. This is an EXTREMELY big security risk. Take the time to figure out how to do it correctly.


Can I write useful perl programs on the command line?

Yes. Read the perlrun manpage for more information. Some examples follow. (These assume standard Unix shell quoting rules.)

    # sum first and last fields
    perl -lane 'print $F[0] + $F[-1]'

    # identify text files
    perl -le 'for(@ARGV) {print if -f && -T _}' *

    # remove comments from C program
    perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c

    # make file a month younger than today, defeating reaper daemons
    perl -e '$X=24*60*60; utime(time(),time() + 30 * $X,@ARGV)' *

    # find first unused uid
    perl -le '$i++ while getpwuid($i); print $i'

    # display reasonable manpath
    echo $PATH | perl -nl -072 -e '
	s![^/+]*$!man!&&-d&&!$s{$_}++&&push@m,$_;END{print"@m"}'

Ok, the last one was actually an obfuscated perl entry. :-)


Why don't perl one-liners work on my DOS/Mac/VMS system?

The problem is usually that the command interpreters on those systems have rather different ideas about quoting than the Unix shells under which the one-liners were created. On some systems, you may have to change single-quotes to double ones, which you must NOT do on Unix or Plan9 systems. You might also have to change a single % to a %%.

For example:

    # Unix
    perl -e 'print "Hello world\n"'

    # DOS, etc.
    perl -e "print \"Hello world\n\""

    # Mac
    print "Hello world\n"
     (then Run "Myscript" or Shift-Command-R)

    # VMS
    perl -e "print ""Hello world\n"""

The problem is that none of this is reliable: it depends on the command interpreter. Under Unix, the first two often work. Under DOS, it's entirely possible neither works. If 4DOS was the command shell, I'd probably have better luck like this:

  perl -e "print <Ctrl-x>"Hello world\n<Ctrl-x>""

Under the Mac, it depends which environment you are using. The MacPerl shell, or MPW, is much like Unix shells in its support for several quoting variants, except that it makes free use of the Mac's non-ASCII characters as control characters.

I'm afraid that there is no general solution to all of this. It is a mess, pure and simple.

[Some of this answer was contributed by Kenneth Albanowski.]


Where can I learn about CGI or Web programming in Perl?

For modules, get the CGI or LWP modules from CPAN. For textbooks, see the two especially dedicated to web stuff in the question on books. For problems and questions related to the web, like ``Why do I get 500 Errors'' or ``Why doesn't it run from the browser right when it runs fine on the command line'', see these sources:

    The Idiot's Guide to Solving Perl/CGI Problems, by Tom Christiansen
	http://www.perl.com/perl/faq/idiots-guide.html

    Frequently Asked Questions about CGI Programming, by Nick Kew
	ftp://rtfm.mit.edu/pub/usenet/news.answers/www/cgi-faq
	http://www3.pair.com/webthing/docs/cgi/faqs/cgifaq.shtml

    Perl/CGI programming FAQ, by Shishir Gundavaram and Tom Christiansen
	http://www.perl.com/perl/faq/perl-cgi-faq.html

    The WWW Security FAQ, by Lincoln Stein
	http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html

    World Wide Web FAQ, by Thomas Boutell
	http://www.boutell.com/faq/


Where can I learn about object-oriented Perl programming?

The perltoot manpage is a good place to start, and you can use the perlobj manpage and the perlbot manpage for reference. Perltoot didn't come out until the 5.004 release, but you can get a copy (in pod, html, or postscript) from http://www.perl.com/CPAN/doc/FMTEYEWTK/ .


Where can I learn about linking C with Perl? [h2xs, xsubpp]

If you want to call C from Perl, start with the perlxstut manpage, moving on to the perlxs manpage, the xsubpp manpage, and the perlguts manpage. If you want to call Perl from C, then read the perlembed manpage, the perlcall manpage, and the perlguts manpage. Don't forget that you can learn a lot from looking at how the authors of existing extension modules wrote their code and solved their problems.


I've read perlembed, perlguts, etc., but I can't embed perl in my C program, what am I doing wrong?

Download the ExtUtils::Embed kit from CPAN and run `make test'. If the tests pass, read the pods again and again and again. If they fail, see the perlbug manpage and send a bugreport with the output of make test TEST_VERBOSE=1 along with perl -V.


When I tried to run my script, I got this message. What does it mean?

The perldiag manpage has a complete list of perl's error messages and warnings, with explanatory text. You can also use the splain program (distributed with perl) to explain the error messages:

    perl program 2>diag.out
    splain [-v] [-p] diag.out

or change your program to explain the messages for you:

    use diagnostics;

or

    use diagnostics -verbose;


What's MakeMaker?

This module (part of the standard perl distribution) is designed to write a Makefile for an extension module from a Makefile.PL. For more information, see the ExtUtils::MakeMaker manpage.


perlfaq4 - Data Manipulation ($Revision: 1.15 $)

This section of the FAQ answers questions related to the manipulation of data as numbers, dates, strings, arrays, hashes, and miscellaneous data issues.


Data: Numbers


Why isn't my octal data interpreted correctly?

Perl only understands octal and hex numbers as such when they occur as literals in your program. If they are read in from somewhere and assigned, no automatic conversion takes place. You must explicitly use oct or hex if you want the values converted. oct interprets both hex (``0x350'') numbers and octal ones (``0350'' or even without the leading ``0'', like ``377''), while hex only converts hexadecimal ones, with or without a leading ``0x'', like ``0x255'', ``3A'', ``ff'', or ``deadbeef''.

This problem shows up most often when people try using chmod, mkdir, umask, or sysopen, which all want permissions in octal.

    chmod(644,  $file);	# WRONG -- perl -w catches this
    chmod(0644, $file);	# right
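
When the number arrives as a string (from a config file or the command line, say), the explicit conversions look roughly like this. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $from_hex_string = oct("0x350");   # 848: oct understands a leading 0x
my $from_octal      = oct("0350");    # 232
my $bare_octal      = oct("377");     # 255: the leading 0 is optional
my $hex_prefixed    = hex("0x255");   # 597
my $hex_bare        = hex("ff");      # 255

# A permission string as it might be read from user input.
my $perm_string = "644";
my $mode = oct($perm_string);         # 420 decimal, the same value as 0644
printf "%s -> %d decimal -> %04o octal\n", $perm_string, $mode, $mode;
```

The resulting $mode can then be handed to chmod, mkdir, umask, or sysopen as if it had been written as the literal 0644.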


Does perl have a round function? What about ceil() and floor()? Trig functions?

For rounding to a certain number of digits, sprintf or printf is usually the easiest route.

The POSIX module (part of the standard perl distribution) implements ceil, floor, and a number of other mathematical and trigonometric functions.

The Math::Complex module (part of the standard perl distribution) defines a number of mathematical functions that can also work on real numbers. It's not as efficient as the POSIX library, but the POSIX library can't work with complex numbers.

Rounding in financial applications can have serious implications, and the rounding method used should be specified precisely. In these cases, it probably pays not to trust whichever system rounding is being used by Perl, but to instead implement the rounding function you need yourself.
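
For everyday (non-financial) use, the sprintf and POSIX routes can be sketched as follows. The sample values are arbitrary.

```perl
use strict;
use warnings;
use POSIX qw(ceil floor);

my $two_places = sprintf("%.2f", 3.14159);   # the string "3.14"
my $nearest    = sprintf("%.0f", 2.7);       # the string "3"
my $up         = ceil(3.2);                  # 4
my $down       = floor(-3.2);                # -4, not -3

print "$two_places $nearest $up $down\n";
```

Note that floor rounds toward negative infinity, which is not the same as truncating with int when the value is negative.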


How do I convert bits into ints?

To turn a string of 1s and 0s like '10110110' into a scalar containing its numeric value, use the pack function (documented in the perlfunc manpage) to build the byte and ord to read it back as a number:

    $decimal = ord(pack('B8', '10110110'));

Here's an example of going the other way:

    $binary_string = join('', unpack('B*', "\x29"));
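
A round trip may make the byte-versus-number distinction clearer. This is only a sketch: pack with a 'B' template produces packed bytes, not a number, so ord (for one byte) or unpack with 'N' (for up to 32 bits) is assumed as the final step.

```perl
use strict;
use warnings;

my $bit_string = '10110110';

# Eight bits: pack yields a one-character string, ord yields its number.
my $byte   = pack('B8', $bit_string);
my $number = ord($byte);                         # 182

# And back again.
my $again  = unpack('B8', chr($number));         # '10110110'

# Up to 32 bits: left-pad with zeros, pack, then read a big-endian long.
my $padded = substr('0' x 32 . $bit_string, -32);
my $wide   = unpack('N', pack('B32', $padded));  # also 182

print "$number $again $wide\n";
```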


How do I multiply matrices?

Use the Math::Matrix or Math::MatrixReal modules (available from CPAN) or the PDL extension (also available from CPAN).


How do I perform an operation on a series of integers?

To call a function on each element in an array, and collect the results, use:

    @results = map { my_func($_) } @array;

For example:

    @triple = map { 3 * $_ } @single;

To call a function on each element of an array, but ignore the results:

    foreach $iterator (@array) {
        &my_func($iterator);
    }

To call a function on each integer in a (small) range, you can use:

    @results = map { &my_func($_) } (5 .. 25);

but you should be aware that the .. operator creates an array of all integers in the range. This can take a lot of memory for large ranges. Instead use:

    @results = ();
    for ($i=5; $i < 500_005; $i++) {
        push(@results, &my_func($i));
    }


How can I output Roman numerals?

Get the http://www.perl.com/CPAN/modules/by-module/Roman module.


Why aren't my random numbers random?

The short explanation is that you're getting pseudorandom numbers, not random ones, because that's how these things work. A longer explanation is available on http://www.perl.com/CPAN/doc/FMTEYEWTK/random, courtesy of Tom Phoenix.

You should also check out the Math::TrulyRandom module from CPAN.


Data: Dates


How do I find the week-of-the-year/day-of-the-year?

The day of the year is in the array returned by localtime (see localtime):

    $day_of_year = (localtime(time()))[7];

or more legibly (in 5.004 or higher):

    use Time::localtime;
    $day_of_year = localtime(time())->yday;

You can find the week of the year by dividing this by 7:

    $week_of_year = int($day_of_year / 7);

Of course, this believes that weeks start at zero.


How can I compare two date strings?

Use the Date::Manip or Date::DateCalc modules from CPAN.


How can I take a string and turn it into epoch seconds?

If it's a regular enough string that it always has the same format, you can split it up and pass the parts to timelocal in the standard Time::Local module. Otherwise, you should look into one of the Date modules from CPAN.
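
For a fixed format, the split-and-convert step might look like the sketch below. The timestamp layout is an assumed example, and timegm (the UTC sibling of timelocal in the same module) is shown alongside so the result does not depend on the local time zone.

```perl
use strict;
use warnings;
use Time::Local qw(timegm timelocal);

# Assumed input format: YYYY-MM-DD HH:MM:SS
my $stamp = "2000-01-01 00:00:00";

my ($year, $mon, $mday, $hour, $min, $sec) =
    $stamp =~ /^(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)$/
    or die "unrecognized timestamp: $stamp\n";

# Both functions take fields in localtime order; months count from 0.
my $epoch_utc   = timegm($sec, $min, $hour, $mday, $mon - 1, $year);
my $epoch_local = timelocal($sec, $min, $hour, $mday, $mon - 1, $year);

print "UTC:   $epoch_utc\n";
print "local: $epoch_local\n";
```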


How can I find the Julian Day?

Neither Date::Manip nor Date::DateCalc deals with Julian days. Instead, there is an example of Julian date calculation in http://www.perl.com/CPAN/authors/David_Muir_Sharnoff/modules/Time/JulianDay.pm.gz, which should help.


Does Perl have a year 2000 problem?

Not unless you use Perl to create one. The date and time functions supplied with perl (gmtime and localtime) supply adequate information to determine the year well beyond 2000 (2038 is when trouble strikes). The year returned by these functions when used in an array context is the year minus 1900. For years between 1910 and 1999 this happens to be a 2-digit decimal number. To avoid the year 2000 problem simply do not treat the year as a 2-digit number. It isn't.

When gmtime and localtime are used in a scalar context they return a timestamp string that contains a fully-expanded year. For example, $timestamp = gmtime sets $timestamp to ``Tue Nov 13 01:00:00 2001''. There's no year 2000 problem here.
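
The difference between the right and wrong treatment of the year field can be sketched with a fixed timestamp. The epoch value below was picked to correspond to the date in the example above.

```perl
use strict;
use warnings;

my $when = 1005613200;            # 13 Nov 2001, 01:00:00 UTC

my @fields = gmtime($when);
my $year   = $fields[5] + 1900;   # right: 2001
my $broken = "19$fields[5]";      # wrong: treats 101 as a 2-digit year

my $stamp  = gmtime($when);       # scalar context: fully expanded year

print "right:  $year\n";
print "wrong:  $broken\n";
print "scalar: $stamp\n";
```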


Data: Strings


How do I validate input?

The answer to this question is usually a regular expression, perhaps with auxiliary logic. See the more specific questions (numbers, email addresses, etc.) for details.


How do I unescape a string?

It depends just what you mean by ``escape''. URL escapes are dealt with in the perlfaq9 manpage. Shell escapes with the backslash (\) character are removed with:

    s/\\(.)/$1/g;

Note that this won't expand \n or \t or any other special escapes.
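
A short sketch of that substitution at work, including the caveat about special escapes (the sample strings are made up):

```perl
use strict;
use warnings;

# Single quotes keep the backslashes in these literals intact.
my $escaped = 'hello\ world\!';
(my $plain = $escaped) =~ s/\\(.)/$1/g;

# A backslash-n pair is NOT turned into a newline; the backslash
# is simply dropped and the letter n is kept.
my $with_n = 'line\nbreak';
(my $stripped = $with_n) =~ s/\\(.)/$1/g;

print "$plain\n$stripped\n";
```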


How do I remove consecutive pairs of characters?

To turn ``abbcccd'' into ``abccd'':

    s/(.)\1/$1/g;


How do I expand function calls in a string?

This is documented in the perlref manpage. In general, this is fraught with quoting and readability problems, but it is possible. To interpolate a subroutine call (in a list context) into a string:

    print "My sub returned @{[mysub(1,2,3)]} that time.\n";

If you prefer scalar context, similar chicanery is also useful for arbitrary expressions:

    print "That yields ${\($n + 5)} widgets\n";


How do I find matching/nesting anything?

This isn't something that can be tackled in one regular expression, no matter how complicated. To find something between two single characters, a pattern like /x([^x]*)x/ will get the intervening bits in $1. For multiple ones, then something more like /alpha(.*?)omega/ would be needed. But none of these deals with nested patterns, nor can they. For that you'll have to write a parser.
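
A minimal sketch of the non-nested cases, using a negated character class for single-character delimiters and a non-greedy match for multi-character ones, followed by a nested input that shows where this approach breaks down. All the sample strings are invented for illustration.

```perl
use strict;
use warnings;

my $text = "say xhellox and alpha middle omega";

# Between two single characters: match anything that isn't the delimiter.
my ($single) = $text =~ /x([^x]*)x/;          # "hello"

# Between two multi-character markers: match as little as possible.
my ($multi)  = $text =~ /alpha(.*?)omega/;    # " middle "

# Nested delimiters defeat the same trick: the match stops at the
# first closing parenthesis, not the one that balances the opener.
my ($nested) = "f((a)b)" =~ /\(([^)]*)\)/;    # "(a", not "(a)b"

print "[$single] [$multi] [$nested]\n";
```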


How do I reverse a string?

Use reverse in a scalar context, as documented in the perlfunc manpage.

    $reversed = reverse $string;
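
The context matters more than it might appear. A small sketch of the difference, with an arbitrary sample word:

```perl
use strict;
use warnings;

my $string = "stressed";

# Scalar context: the characters are reversed.
my $reversed = reverse $string;            # "desserts"

# List context: reverse flips the order of the list's elements, and a
# one-element list looks unchanged.
my @list = reverse($string);               # ("stressed")

# Inside print's argument list, force scalar context explicitly.
print scalar reverse($string), "\n";
```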


How do I expand tabs in a string?

You can do it the old-fashioned way:

    1 while $string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;

Or you can just use the Text::Tabs module (part of the standard perl distribution).

    use Text::Tabs;
    @expanded_lines = expand(@lines_with_tabs);
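
To see that the two approaches agree, here is a small sketch that runs both on the same made-up line, assuming the default tab stop of 8 columns:

```perl
use strict;
use warnings;
use Text::Tabs;

my $line = "a\tbc\tdef";

# The old-fashioned way, applied to a copy of the line.
my $by_hand = $line;
1 while $by_hand =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;

# The module way; expand returns a list, one element per argument.
my ($by_module) = expand($line);

print "[$by_hand]\n[$by_module]\n";
```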


How do I reformat a paragraph?

Use Text::Wrap (part of the standard perl distribution):

    use Text::Wrap;
    print wrap("\t", '  ', @paragraphs);


How can I access/change the first N letters of a string?

There are many ways. If you just want to grab a copy, use substr:

    $first_byte = substr($a, 0, 1);

If you want to modify part of a string, the simplest way is often to use substr as an lvalue:

    substr($a, 0, 3) = "Tom";

Although those with a regexp kind of thought process will likely prefer

    $a =~ s/^.../Tom/;


How do I change the Nth occurrence of something?

You have to keep track. For example, let's say you want to change the fifth occurrence of ``whoever'' or ``whomever'' into ``whosoever'', case insensitively.

    $count = 0;
    s{((whom?)ever)}{
	++$count == 5   	# is it the 5th?
	    ? "${2}soever"	# yes, swap
	    : $1		# renege and leave it there
    }igex;
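
Run against a concrete line, the same counting substitution behaves as sketched below. The input string is invented so that the fifth match is easy to spot.

```perl
use strict;
use warnings;

$_ = "whoever whomever whoever whomever whoever whomever";

my $count = 0;
s{((whom?)ever)}{
    ++$count == 5       # is it the 5th?
        ? "${2}soever"  # yes, swap
        : $1            # no, put back what was matched
}igex;

print "$_\n";
print "matches seen: $count\n";
```

Every match is visited because of /g, so $count ends up holding the total number of occurrences, not just five.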


How can I count the number of occurrences of a substring within a string?

There are a number of ways, with varying efficiency: If you want a count of a certain single character (X) within a string, you can use the tr/// function like so:

    $string = "ThisXlineXhasXsomeXx'sXinXit";
    $count = ($string =~ tr/X//);
    print "There are $count X characters in the string";

This is fine if you are just looking for a single character. However, if you are trying to count multiple character substrings within a larger string, tr/// won't work. What you can do is wrap a while loop around a global pattern match. For example, let's count negative integers:

    $string = "-9 55 48 -2 23 -76 4 14 -44";
    $count = 0;
    while ($string =~ /-\d+/g) { $count++ }
    print "There are $count negative numbers in the string";
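
Both counting techniques, plus a common list-context shortcut that is not part of the answer above (assigning the matches to an empty list and taking the count), can be sketched together:

```perl
use strict;
use warnings;

# Single characters: tr/// returns how many it found.
my $string  = "ThisXlineXhasXsomeXx'sXinXit";
my $x_count = ($string =~ tr/X//);

# Substrings: loop over a global match, starting the counter at zero.
my $numbers   = "-9 55 48 -2 23 -76 4 14 -44";
my $neg_count = 0;
$neg_count++ while $numbers =~ /-\d+/g;

# Shortcut (an addition, not from the text above): count the list
# of matches directly.
my $neg_again = () = $numbers =~ /-\d+/g;

print "$x_count X characters, $neg_count negative numbers\n";
```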


How do I capitalize all the words on one line?

To make the first letter of each word upper case: $line =~ s/\b(\w)/\U$1/g;

To make the whole line upper case: $line = uc($line);

To force each word to be lower case, with the first letter upper case: $line =~ s/(\w+)/\u\L$1/g;
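
Applied to one mixed-case sample line (made up for the purpose), the three variants give different results:

```perl
use strict;
use warnings;

my $line = "the qUICK brown fOX";

# First letter of each word upper case, the rest left alone.
(my $initials = $line) =~ s/\b(\w)/\U$1/g;   # "The QUICK Brown FOX"

# Whole line upper case.
my $upper = uc($line);                       # "THE QUICK BROWN FOX"

# Each word lower case with a leading capital.
(my $title = $line) =~ s/(\w+)/\u\L$1/g;     # "The Quick Brown Fox"

print "$initials\n$upper\n$title\n";
```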


How can I split a [character] delimited string except when inside [character]? (Comma-separated files)

Take the example case of trying to split a string that is comma-separated into its different fields. (We'll pretend you said comma-separated, not comma-delimited, which is different and almost never what you mean.) You can't use split because you shouldn't split if the comma is inside quotes. For example, take a data line like this:

    SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"

Due to the restriction of the quotes, this is a fairly complex problem. Thankfully, we have Jeffrey Friedl, author of a highly recommended book on regular expressions, to handle these for us. He suggests (assuming your string is contained in $text):

     @new = ();
     push(@new, $+) while $text =~ m{
         "([^\"\\]*(?:\\.[^\"\\]*)*)",?  # groups the phrase inside the quotes
       | ([^,]+),?
       | ,
     }gx;
     push(@new, undef) if substr($text,-1,1) eq ',';

Alternatively, the Text::ParseWords module (part of the standard perl distribution) lets you say:

    use Text::ParseWords;
    @new = quotewords(",", 0, $text);
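
As a quick check, here is a sketch that feeds the sample data line from above to quotewords. One thing to keep in mind is that Text::ParseWords treats single quotes as quoting characters too, so data with stray apostrophes may need different handling.

```perl
use strict;
use warnings;
use Text::ParseWords;

my $text = 'SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"';

# A second argument of 0 means: drop the quotes from the returned fields.
my @new = quotewords(",", 0, $text);

printf "%d fields\n", scalar @new;
print "third field: $new[2]\n";
```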


How do I strip blank space from the beginning/end of a string?

The simplest approach, albeit not the fastest, is probably like this:

    $string =~ s/^\s*(.*?)\s*$/$1/;

It would be faster to do this in two steps:

    $string =~ s/^\s+//;
    $string =~ s/\s+$//;

Or more nicely written as:

    for ($string) {
	s/^\s+//;
	s/\s+$//;
    }
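
If you do this often, you might wrap the two-step version in a subroutine. The following is only a sketch; trim is a hypothetical helper name, not a built-in.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a trimmed copy and leaves its argument alone.
sub trim {
    my ($string) = @_;
    for ($string) {
        s/^\s+//;
        s/\s+$//;
    }
    return $string;
}

my $clean = trim("  \t padded text \n");
print "[$clean]\n";
```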


How do I extract selected columns from a string?

Use substr or unpack, both documented in the perlfunc manpage.
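
Here is a short sketch of both on a made-up fixed-width record (a 10-column field, a 5-column field, then the rest of the line):

```perl
use strict;
use warnings;

# Build a fixed-width record: 10 columns, 5 columns, then the rest.
my $record = sprintf("%-10s%-5s%s", "root", "1234", "/sbin/init");

# substr takes an offset and a length; the padding stays in the result.
my $pid_field = substr($record, 10, 5);

# unpack's "A" format takes a width and strips trailing spaces.
my ($user, $pid, $command) = unpack("A10 A5 A*", $record);

print "$user|$pid|$command\n";
```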


How do I find the soundex value of a string?

Use the standard Text::Soundex module distributed with perl.


How can I expand variables in text strings?

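Let's assume that you have a string like the one below, with variable names embedded in it. The substitution that follows expands them through symbolic references, so it sees only global (package) variables: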

    $text = 'this has a $foo in it and a $bar';
    $text =~ s/\$(\w+)/${$1}/g;

Before version 5 of perl, this had to be done with a double-eval substitution:

    $text =~ s/(\$\w+)/$1/eeg;

Which is bizarre enough that you'll probably actually need an EEG afterwards. :-)
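
Symbolic references cannot see my variables and are rejected under use strict 'refs', so a common alternative is to keep the values in a hash and look them up there. A minimal sketch, in which %vars and its contents are made up for the example:

```perl
use strict;
use warnings;

# Hypothetical lookup table standing in for the variables to expand.
my %vars = (foo => 'apple', bar => 'banana');

my $text = 'this has a $foo in it and a $bar, but no $baz';

# /e evaluates the replacement as code; unknown names are left as they are.
$text =~ s/\$(\w+)/exists $vars{$1} ? $vars{$1} : "\$$1"/ge;

print "$text\n";
```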


What's wrong with always quoting "$vars"?

The problem is that those double-quotes force stringification, coercing numbers and references into strings, even when you don't want them to be.

If you get used to writing odd things like these:

    print "$var";   	# BAD
    $new = "$old";   	# BAD
    somefunc("$var");	# BAD

You'll be in trouble. Those should (in 99.8% of the cases) be the simpler and more direct:

    print $var;
    $new = $old;
    somefunc($var);

Otherwise, besides slowing you down, you're going to break code when the thing in the scalar is actually neither a string nor a number, but a reference:

    func(\@array);
    sub func {
	my $aref = shift;
	my $oref = "$aref";  # WRONG
    }

You can also get into subtle problems on those few operations in Perl that actually do care about the difference between a string and a number, such as the magical ++ autoincrement operator or the syscall function.
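
To see what the quotes do to a reference, here is a small sketch that compares a plain copy with a quoted one:

```perl
use strict;
use warnings;

my @array = (1, 2, 3);
my $aref  = \@array;
my $copy  = $aref;      # still a reference
my $str   = "$aref";    # now just a string such as "ARRAY(0x...)"

print "copy is a ", (ref($copy) || "plain string"), "\n";
print "str  is a ", (ref($str)  || "plain string"), "\n";
print "elements via copy: @$copy\n";
# @$str would die under "use strict 'refs'".
```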


Why don't my <<HERE documents work?

Check for these three things:

  1. There must be no space after the << part.
  2. There (probably) should be a semicolon at the end.
  3. You can't (easily) have any space in front of the tag.
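
For comparison, here is a sketch of a here document that satisfies all three points. The tag name END_TEXT is arbitrary.

```perl
use strict;
use warnings;

my $name = "world";

# No space between << and the quoted tag, and a semicolon ends the statement.
my $text = <<"END_TEXT";
Hello, $name.
This line is part of the here document.
END_TEXT

print $text;
```

The closing tag sits at the very start of its line, with nothing before or after it.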


Data: Arrays


What is the difference between $array[1] and @array[1]?

The former is a scalar value, the latter an array slice, which makes it a list with one (scalar) value. You should use $ when you want a scalar value (most of the time) and @ when you want a list with one scalar value in it (very, very rarely; nearly never, in fact).

Sometimes it doesn't make a difference, but sometimes it does. For example, compare:

    $good[0] = `some program that outputs several lines`;

with

    @bad[0]  = `same program that outputs several lines`;

The -w flag will warn you about these matters.
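
The difference comes from context: assigning to $array[0] evaluates the right side in scalar context, while assigning to the slice @array[1] evaluates it in list context. With backticks, that means one string holding all the output versus a list of lines of which only the first is kept. The sketch below uses a hypothetical helper named context in place of an external program:

```perl
use strict;

# Hypothetical helper that reports the context it was called in.
sub context { return wantarray ? "list" : "scalar" }

my @array;
$array[0] = context();   # scalar assignment: called in scalar context
@array[1] = context();   # slice assignment: called in list context

print "$array[0] $array[1]\n";
```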


How can I extract just the unique elements of an array?

There are several possible ways, depending on whether the array is ordered and whether you wish to preserve the ordering.

a) If @in is sorted, and you want @out to be sorted:
    $prev = 'nonesuch';
    @out = grep($_ ne $prev && ($prev = $_, 1), @in);

This is nice in that it doesn't use much extra memory, simulating uniq's behavior of removing only adjacent duplicates.

b) If you don't know whether @in is sorted:
    undef %saw;
    @out = grep(!$saw{$_}++, @in);

c) Like (b), but @in contains only small non-negative integers:
    undef @saw;
    @out = grep(!$saw[$_]++, @in);

d) A way to do (b) without any loops or greps:
    undef %saw;
    @saw{@in} = ();
    @out = sort keys %saw;  # remove sort if undesired

e) Like (d), but @in contains only small positive integers:
    undef @ary;
    @ary[@in] = @in;
    @out = grep { defined } @ary;  # skip the slots that were never filled
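
Approach (b) is the one most people end up using, since it keeps the first occurrence of each element in its original position. Here it is sketched as a subroutine; uniq is a hypothetical name for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the first occurrence of each element.
sub uniq {
    my %saw;
    return grep { !$saw{$_}++ } @_;
}

my @in  = qw(pear apple pear 0 fig apple 0);
my @out = uniq(@in);

print "@out\n";
```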


How can I tell whether an array contains a certain element?

There are several ways to approach this. If you are going to make this query many times and the values are arbitrary strings, the fastest way is probably to invert the original array and keep an associative array lying about whose keys are the first array's values.

    @blues = qw/azure cerulean teal turquoise lapis-lazuli/;
    undef %is_blue;
    for (@blues) { $is_blue{$_} = 1 }

Now you can check whether $is_blue{$some_color}. It might have been a good idea to keep the blues all in a hash in the first place.

If the values are all small integers, you could use a simple indexed array. This kind of an array will take up less space:

    @primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31);
    undef @is_tiny_prime;
    for (@primes) { $is_tiny_prime[$_] = 1; }

Now you check whether $is_tiny_prime[$some_number].

If the values in question are integers instead of strings, you can save quite a lot of space by using bit strings instead:

    @articles = ( 1..10, 150..2000, 2017 );
    undef $read;
    for (@articles) { vec($read,$_,1) = 1 }

Now check whether vec($read,$n,1) is true for some $n.

Please do not use

    $is_there = grep $_ eq $whatever, @array;

or worse yet

    $is_there = grep /$whatever/, @array;

These are slow (checks every element even if the first matches), inefficient (same reason), and potentially buggy (what if there are regexp characters in $whatever?).
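
If you only need to ask once, a loop that leaves on the first hit avoids both problems. For repeated queries, the lookup hash can be built in one statement with a hash slice. A sketch of both, where contains is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: true if $wanted is string-equal to any element.
sub contains {
    my ($wanted, @list) = @_;
    for my $element (@list) {
        return 1 if $element eq $wanted;   # leave on the first hit
    }
    return 0;
}

my @blues = qw(azure cerulean teal turquoise lapis-lazuli);

# For repeated queries, build the lookup hash once with a hash slice.
my %is_blue;
@is_blue{@blues} = (1) x @blues;

my $hit  = contains("teal", @blues);
my $miss = exists $is_blue{"a.ure"};   # no regexp surprises with a hash
print "teal: ", ($hit ? "yes" : "no"), ", a.ure: ", ($miss ? "yes" : "no"), "\n";
```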


How do I compute the difference of two arrays? How do I compute the intersection of two arrays?

Use a hash. Here's code to do both and more. It assumes that each element is unique in a given array:

    @union = @intersection = @difference = ();
    %count = ();
    foreach $element (@array1, @array2) { $count{$element}++ }
    foreach $element (keys %count) {
	push @union, $element;
	push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element;
    }
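
Note that the "difference" computed above is the symmetric difference: the elements that occur in exactly one of the two arrays. The sketch below runs the same idea on two small made-up arrays and also shows one way to get the one-sided difference (elements of the first array that are missing from the second).

```perl
use strict;
use warnings;

my @array1 = qw(a b c d);
my @array2 = qw(c d e f);

my (@union, @intersection, @difference);
my %count;
$count{$_}++ for (@array1, @array2);
for my $element (sort keys %count) {
    push @union, $element;
    push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element;
}

# Elements of @array1 that are absent from @array2 (one-sided difference).
my %in_second = map { $_ => 1 } @array2;
my @only_first = grep { !$in_second{$_} } @array1;

print "union: @union\nintersection: @intersection\n";
print "difference: @difference\nonly in first: @only_first\n";
```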


How do I find the first array element for which a condition is true?

You can use this if you care about the index:

    for ($i=0; $i < @array; $i++) {
        if ($array[$i] eq "Waldo") {
	    $found_index = $i;
            last;
        }
    }

Now $found_index has what you want.
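
The same loop can be wrapped up so that the condition is passed in as a code reference. This is only a sketch; first_index is a hypothetical helper, and it returns -1 when nothing matches.

```perl
use strict;
use warnings;

# Hypothetical helper: index of the first element for which the code
# reference returns true (with the element in $_), or -1 if there is none.
sub first_index {
    my ($test, @list) = @_;
    for my $i (0 .. $#list) {
        local $_ = $list[$i];
        return $i if $test->();
    }
    return -1;
}

my @array = qw(Fred Wilma Waldo Barney Waldo);
my $found_index = first_index(sub { $_ eq "Waldo" }, @array);

print "found at index $found_index\n";
```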


How do I handle linked lists?

In general, you usually don't need a linked list in Perl, since with regular arrays, you can push and pop or shift and unshift at either end, or you can use splice to add and/or remove an arbitrary number of elements at arbitrary points.

If you really, really wanted, you could use structures as described in the perldsc manpage or the perltoot manpage and do just what the algorithm book tells you to do.
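
For the curious, here is a minimal sketch of a singly linked list built from anonymous hashes. The field names value and next are just a convention chosen for this example.

```perl
use strict;
use warnings;

# Each node is an anonymous hash; "next" holds a reference to the
# following node, or undef at the end of the list.
my $head;
for my $value (reverse qw(alpha beta gamma)) {
    $head = { value => $value, next => $head };
}

# Insert a new node after the first one.
$head->{next} = { value => "inserted", next => $head->{next} };

# Walk the list from the head.
my @seen;
for (my $node = $head; $node; $node = $node->{next}) {
    push @seen, $node->{value};
}
print "@seen\n";
```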


How do I handle circular lists?

Circular lists could be handled in the traditional fashion with linked lists, or you could just do something like this with an array:

    unshift(@array, pop(@array));  # the last shall be first
    push(@array, shift(@array));   # and vice versa


How do I shuffle an array randomly?

Here's a shuffling algorithm which repeatedly removes a randomly chosen element from the old list and appends it to the new one:

    srand;
    @new = ();
    @old = 1 .. 10;  # just a demo
    while (@old) {
	push(@new, splice(@old, rand @old, 1));
    }

For large arrays, the following avoids the repeated splicing. It adds each element to the new list by swapping it with a randomly chosen slot:

    srand;
    @new = ();
    @old = 1 .. 10000;  # just a demo
    for( @old ){
        my $r = rand @new+1;
        push(@new,$new[$r]);
        $new[$r] = $_;
    }
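
Another well-known approach, not taken from the code above, is the Fisher-Yates shuffle, which permutes an array in place by swapping each element with a randomly chosen one at or before it. A sketch, with a hypothetical subroutine name:

```perl
use strict;
use warnings;

# In-place Fisher-Yates shuffle of the array behind the reference.
sub fisher_yates_shuffle {
    my ($array) = @_;
    for (my $i = $#$array; $i > 0; $i--) {
        my $j = int rand($i + 1);            # 0 <= $j <= $i
        @$array[$i, $j] = @$array[$j, $i];   # swap, possibly with itself
    }
    return;
}

my @deck = 1 .. 10;
fisher_yates_shuffle(\@deck);
print "@deck\n";
```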


How do I process/modify each element of an array?

Use for/foreach:

    for (@lines) {
	s/foo/bar/;
	tr[a-z][A-Z];
    }

Here's another; let's compute spherical volumes:

    for (@radii) {
	$_ **= 3;
	$_ *= (4/3) * 3.14159;  # this will be constant folded
    }
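
These loops work because for aliases $_ to each element in turn, so assigning to $_ changes the array itself. If you want the results in a new list and the original left alone, map is the usual tool. A sketch contrasting the two:

```perl
use strict;
use warnings;

my @radii = (1, 2, 3);

# map builds a new list and leaves @radii untouched.
my @volumes = map { (4 / 3) * 3.14159 * $_ ** 3 } @radii;

# for aliases $_ to each element, so this changes @radii itself.
$_ *= 2 for @radii;

printf "%.2f ", $_ for @volumes;
print "\n@radii\n";
```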


How do I select a random element from an array?

Use the rand function (see rand in the perlfunc manpage):

    srand;			# not needed for 5.004 and later
    $index   = rand @array;
    $element = $array[$index];
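
The two steps can be folded into one, since an array subscript truncates a fractional index. A sketch with made-up data:

```perl
use strict;
use warnings;

my @array = qw(red green blue);

# rand @array yields a fraction in [0, 3); the subscript truncates it.
my $element = $array[rand @array];

print "picked $element\n";
```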


How do I permute N elements of a list?

Here's a little program that generates all permutations of all the words on each line of input. The algorithm embodied in the permut function should work on any list:

    #!/usr/bin/perl -n
    # permute - tchrist@perl.com
    permut([split], []);
    sub permut {
	my @head = @{ $_[0] };
	my @tail = @{ $_[1] };
	unless (@head) {
	    # stop recursing when there are no elements in the head
	    print "@tail\n";
	} else {
	    # for all elements in @head, move one from @head to @tail
	    # and call permut() on the new @head and @tail
	    my(@newhead,@newtail,$i);
	    foreach $i (0 .. $#head) {
		@newhead = @head;
		@newtail = @tail;
		unshift(@newtail, splice(@newhead, $i, 1));
		permut([@newhead], [@newtail]);
	    }
	}
    }
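
If you would rather collect the permutations than print them, the recursion can return them as a list of array references. The following sketch is a variation written for this example, not the program above; permutations is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical variant: return every permutation as an array reference.
sub permutations {
    my @items = @_;
    return ([]) unless @items;      # one permutation of nothing: the empty one
    my @result;
    for my $i (0 .. $#items) {
        my @rest = @items;
        my ($picked) = splice(@rest, $i, 1);
        push @result, [ $picked, @$_ ] for permutations(@rest);
    }
    return @result;
}

my @all = permutations(qw(a b c));
print "@$_\n" for @all;
```

Keep in mind that N elements have N! permutations, so this gets out of hand quickly.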


How do I sort an array by (anything)?

Supply a comparison function to sort (described under sort in the perlfunc manpage):

    @list = sort { $a <=> $b } @list;

The default sort function is cmp, string comparison, which would sort (1, 2, 10) into (1, 10, 2). <=>, used above, is the numerical comparison operator.

If you have a complicated function needed to pull out the part you want to sort on, then don't do it inside the sort function. Pull it out first, because the sort BLOCK can be called many times for the same element. Here's an example of how to pull out the first word after the first number on each item, and then sort those words case-insensitively.

    @idx = ();
    for (@data) {
	($item) = /\d+\s*(\S+)/;
	push @idx, uc($item);
    }
    @sorted = @data[ sort { $idx[$a] cmp $idx[$b] } 0 .. $#idx ];

Which could also be written this way, using a trick that's come to be known as the Schwartzian Transform:

    @sorted = map  { $_->[0] }
	      sort { $a->[1] cmp $b->[1] }
	      map  { [ $_, uc( (/\d+\s*(\S+)/)[0] ) ] } @data;

If you need to sort on several fields, the following paradigm is useful.

    @sorted = sort { field1($a) <=> field1($b) ||
                     field2($a) cmp field2($b) ||
                     field3($a) cmp field3($b)
                   }     @data;

This can be conveniently combined with precalculation of keys as given above.

See http://www.perl.com/CPAN/doc/FMTEYEWTK/sort.html for more about this approach.

See also the question below on sorting hashes.
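
To make the combination of precalculated keys and several sort fields concrete, here is a sketch on made-up "name age" records. It sorts by age, highest first, and then by name.

```perl
use strict;
use warnings;

# Hypothetical records: "name age".
my @data = ("carol 30", "alice 25", "bob 30", "dave 25");

# Precompute [record, name, age] once per element, sort on age
# (numeric, descending) and then name, and finally pull the record out.
my @sorted = map  { $_->[0] }
             sort { $b->[2] <=> $a->[2] || $a->[1] cmp $b->[1] }
             map  { [ $_, split(' ', $_) ] } @data;

print "$_\n" for @sorted;
```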


How do I manipulate arrays of bits?

Use pack and unpack, or else vec and the bitwise operations.

For example, this sets $vec to have bit N set if $ints[N] was set:

    $vec = '';
    foreach(@ints) { vec($vec,$_,1) = 1 }

And here's how, given a vector in $vec, you can get those bits into your @ints array:

    sub bitvec_to_list {
	my $vec = shift;
	my @ints;
	# Find null-byte density then select best algorithm
	if ($vec =~ tr/\0// / length $vec > 0.95) {
	    use integer;
	    my $i;
	    # This method is faster with mostly null-bytes
	    while($vec =~ /[^\0]/g ) {
		$i = -9 + 8 * pos $vec;
		push @ints, $i if vec($vec, ++$i, 1);
		push @ints, $i if vec($vec, ++$i, 1);
		push @ints, $i if vec($vec, ++$i, 1);
		push @ints, $i if vec($vec, ++$i, 1);
		push @ints, $i if vec($vec, ++$i, 1);
		push @ints, $i if vec($vec, ++$i, 1);
		push @ints, $i if vec($vec, ++$i, 1);
		push @ints, $i if vec($vec, ++$i, 1);
	    }
	} else {
	    # This method is a fast general algorithm
	    use integer;
	    my $bits = unpack "b*", $vec;
	    push @ints, 0 if $bits =~ s/^(\d)// && $1;
	    push @ints, pos $bits while($bits =~ /1/g);
	}
	return \@ints;
    }

This method gets faster the more sparse the bit vector is. (Courtesy of Tim Bunce and Winfried Koenig.)
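
When speed doesn't matter, a much simpler (and slower) way to get the integers back is to test every bit position in turn. A sketch of the round trip:

```perl
use strict;
use warnings;

my @ints = (1, 3, 8, 21);

# Set one bit per integer.
my $vec = '';
vec($vec, $_, 1) = 1 for @ints;

# Recover the integers by testing every bit position in turn.
my @back = grep { vec($vec, $_, 1) } 0 .. 8 * length($vec) - 1;

print "@back\n";
```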


Why does defined() return true on empty arrays and hashes?

See the entry for defined in the perlfunc manpage, in the 5.004 release or later of Perl.


Data: Hashes (Associative Arrays)


How do I process an entire hash?

Use the each function (see each in the perlfunc manpage) if you don't care whether it's sorted:

    while (($key,$value) = each %hash) {
	print "$key = $value\n";
    }

If you want it sorted, you'll have to use foreach on the result of sorting the keys, as shown in the question below on sorting hashes.
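
A sketch of the sorted version, using a small made-up hash:

```perl
use strict;
use warnings;

my %hash = (pear => 3, apple => 7, fig => 1);

my @lines;
for my $key (sort keys %hash) {
    push @lines, "$key = $hash{$key}";
}
print "$_\n" for @lines;
```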


What happens if I add or remove keys from a hash while iterating over it?

Don't do that.
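
If you need to prune a hash, one way to stay out of trouble is to take the complete list of keys first and then loop over that list rather than over the hash's iterator. A sketch with made-up data:

```perl
use strict;
use warnings;

my %hash = (a => 1, b => 0, c => 3, d => 0);

# keys returns a complete list before the loop starts, so deleting
# entries inside the loop does not disturb any iterator.
for my $key (keys %hash) {
    delete $hash{$key} unless $hash{$key};
}

print join(" ", sort keys %hash), "\n";
```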


How do I look up a hash element by value?

Create a reverse hash:

    %by_value = reverse %by_key;
    $key = $by_value{$value};

That's not particularly efficient. It would be more space-efficient to use:

    while (($key, $value) = each %by_key) {
	$by_value{$value} = $key;
    }

If your hash could have repeated values, the methods above will only find one of the associated keys. This may or may not worry you.
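
If it does worry you, the reverse hash can hold a list of keys for each value instead of a single key. A sketch with made-up data:

```perl
use strict;
use warnings;

my %by_key = (apple => 'red', cherry => 'red', banana => 'yellow');

# Map each value to a list of all keys that carry it.
my %by_value;
while (my ($key, $value) = each %by_key) {
    push @{ $by_value{$value} }, $key;
}

print "red: @{[ sort @{ $by_value{red} } ]}\n";
```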


How can I know how many entries are in a hash?

If you mean how many keys, then all you have to do is take the scalar sense of the keys function:

	$num_keys = scalar keys %hash;

In void context, keys just resets the iterator, which is faster for tied hashes.
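
A sketch showing that an assignment to a scalar supplies the scalar context by itself, so the explicit scalar is optional there:

```perl
use strict;
use warnings;

my %hash = (a => 1, b => 2, c => 3);

my $num_keys = scalar keys %hash;   # explicit scalar context
my $also     = keys %hash;          # scalar assignment implies it

print "$num_keys keys\n";
```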


How do I sort a hash (optionally by value instead of key)?

Internally, hashes are stored in a way that prevents you from imposing an order on key-value pairs. Instead, you have to sort a list of the keys or values:

    @keys = sort keys %hash;	# sorted by key
    @keys = sort {
		    $hash{$a} cmp $hash{$b}
	    } keys %hash; 	# and by value

Here we'll do a reverse numeric sort by value, and if two values are identical, sort by length of key (longest first), and if that fails, by straight ASCII comparison of the keys (well, possibly modified by your locale -- see the perllocale manpage).

    @keys = sort {
		$hash{$b} <=> $hash{$a}
			  ||
		length($b) <=> length($a)
			  ||
		      $a cmp $b
    } keys %hash;
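
Here is the same three-level sort run on a small made-up hash, so you can see each tie-breaker at work:

```perl
use strict;
use warnings;

my %hash = (kiwi => 2, fig => 5, plum => 2, apple => 5, date => 1);

my @keys = sort {
    $hash{$b} <=> $hash{$a}        # larger values first
        ||
    length($b) <=> length($a)      # then longer keys first
        ||
    $a cmp $b                      # then plain string order
} keys %hash;

print "$_ => $hash{$_}\n" for @keys;
```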


How can I always keep my hash sorted?

You can look into using the DB_File module and tie using the $DB_BTREE hash bindings as documented in the "In Memory Databases" section of the DB_File manpage.


What's the difference between "delete" and "undef" with hashes?

Hashes are pairs of scalars: the first is the key, the second is the value. The key will be coerced to a string, although the value can be any kind of scalar: string, number, or reference. If a key $key is present in the hash, exists($hash{$key}) will return true. The value for a given key can be undef, in which case $hash{$key} will be undef while exists($hash{$key}) will still return true. This corresponds to ($key, undef) being in the hash.

Pictures help... here's the %ary table:

	  keys  values
	+------+------+
	|  a   |  3   |
	|  x   |  7   |
	|  d   |  0   |
	|  e   |  2   |
	+------+------+

And these conditions hold

	$ary{'a'}                       is true
	$ary{'d'}                       is false
	defined $ary{'d'}               is true
	defined $ary{'a'}               is true
	exists $ary{'a'}                is true (perl5 only)
	grep ($_ eq 'a', keys %ary)     is true

If you now say

	undef $ary{'a'}

your table now reads:

	  keys  values
	+------+------+
	|  a   | undef|
	|  x   |  7   |
	|  d   |  0   |
	|  e   |  2   |
	+------+------+

and these conditions now hold; changes in caps:

	$ary{'a'}                       is FALSE
	$ary{'d'}                       is false
	defined $ary{'d'}               is true
	defined $ary{'a'}               is FALSE
	exists $ary{'a'}                is true (perl5 only)
	grep ($_ eq 'a', keys %ary)     is true

Notice the last two: you have an undef value, but a defined key!

Now, consider this:

	delete $ary{'a'}

your table now reads:

	  keys  values
	+------+------+
	|  x   |  7   |
	|  d   |  0   |
	|  e   |  2   |
	+------+------+

and these conditions now hold; changes in caps:

	$ary{'a'}                       is false
	$ary{'d'}                       is false
	defined $ary{'d'}               is true
	defined $ary{'a'}               is false
	exists $ary{'a'}                is FALSE (perl5 only)
	grep ($_ eq 'a', keys %ary)     is FALSE

See, the whole entry is gone!
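
The tables above can be checked with a few lines of code. This sketch records what exists and defined report after each step:

```perl
use strict;
use warnings;

my %ary = (a => 3, x => 7, d => 0, e => 2);

undef $ary{'a'};
my $exists_after_undef  = exists  $ary{'a'} ? 1 : 0;   # key is still there
my $defined_after_undef = defined $ary{'a'} ? 1 : 0;   # value is undef

delete $ary{'a'};
my $exists_after_delete = exists $ary{'a'} ? 1 : 0;    # whole entry is gone

print "$exists_after_undef $defined_after_undef $exists_after_delete\n";
```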


Why don't my tied hashes make the defined/exists distinction?

They may or may not implement exists and defined differently: a tied hash answers exists through its EXISTS method and defined through the value its FETCH method returns. For example, there isn't the concept of undef with hashes that are tied to DBM* files. This means the true/false tables above will give different results when used on such a hash. It also means that exists and defined do the same thing with a DBM* file, and what they end up doing is not what they do with ordinary hashes.


How do I reset an each() operation