Perl Basics

This short guide will cover the basics of the Perl language. We will cover certain features in greater depth later on, but feel free to ask for clarification now. Also, please refer to the following resources for better introductions to Perl:

What is Perl?

Perl is a high-level programming language widely used by the bioinformatics community. Check out the following links at your leisure:

In this class, you will instruct your computer via statements in Perl. You might need a simple program to do this (in fact, you will for Project 1):

Take the reverse complement of a DNA sequence

Perl can do this, but it needs some help: you need to give it instructions in something it understands (Perl code). Much like English, the Perl language is made up of a number of constructs and rules, but unlike English, it is consistent.

How do I speak Perl?

Perl programs are made up of statements, which in turn are composed of

  • comments,
  • variables,
  • data types,
  • functions (subroutines in Perl),
  • and conditionals.


Statements

Statements in Perl are like sentences in English, but less complicated. The important part is the semicolon at the end of each one. This tells Perl your instruction has ended. Here are some examples:

my $var = 'Blah';       # set a variable
print 'wah-wah-wahhhh'; # call a print() function
delete_my_files();      # call a function

Where do you omit semicolons?

  • after the closing brace of an if, elsif, or else block
  • after the closing brace of a subroutine definition
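
As a small illustration of the two cases above, the sketch below puts a semicolon after every ordinary statement but none after the closing brace of a block. The names $count and describe are made up for this example.

```perl
use strict;
use warnings;

my $count = 3;                 # ordinary statement: semicolon required

# subroutine definition: no semicolon after its closing brace
sub describe {
    my $n = shift;             # statements inside a block still end with ;
    if ($n > 1) {
        return "many";
    } else {
        return "one";
    }                          # no semicolon after if/else blocks
}

print describe($count) . "\n";
```

Statements inside the braces still need their semicolons; only the blocks themselves go without.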


Comments

Leave a note for yourself explaining the purpose of a code block. Place a pound/hash symbol (#) on a line, and everything after it on that line will not be executed by Perl.

# parse Vienna notation into something useful
# ...


Variables

Variables are symbols representing a value, like in math. In Perl, they are mutable, so you can have \(x = 2\) and then change it to \(x = 4\) later on.

In Perl, there are three kinds of variables you'll be dealing with:

scalar  Holds a single value
array   Holds a list of values
hash    Holds a table of data

my $cat = 'Martin';                    # a scalar
my @dogs = ('Gretel', 'Shep', 'Lola'); # an array
my %groceries =  (                     # a hash
    'eggs'  => 6,
    'bread' => '1 loaf',
    'milk'  => '1 gal',
);

Variable restrictions

A variable name must start with a letter (a-z or A-Z) or an underscore. The remaining characters may be letters, digits, or underscores.


Data Types

  • A data type is a classification of data, such as words, letters, or integers
  • Perl has its own data types:
    string: sequence of characters
    array: a list
    hash: table of data
    object: user-defined data type (not covered in this class)


Strings

my $str = "The quick brown fox jumped over the lazy dog.";

my $borat = "Wow-wow-wowee!111!!\n";

# how do you put 2+ strings together? with the concatenation operator
# (`.' operator)

print $str . ' ' . $borat;
#=> The quick brown fox jumped over the lazy dog. Wow-wow-wowee!111!!\n


Integers

my $num_sequences = 30;


Floating-point numbers

my $pizza_radius    = 8.0;
my $area_of_pizza   = 3.14159 * $pizza_radius**2;


Arrays

  • Arrays can contain other data types
  • Important: Arrays are indexed, but the first element is at position 0, not 1!

my @list = ('eggs', 'bread', 'milk');
# positions:  0        1        2

# ...or use the shorthand trick for string lists
my @list = qw(eggs bread milk);

my @useless_list = (1, 2, 3, 4);

You can access individual elements like so:

my @list = qw(eggs bread milk);
print "I need some ";
print $list[0]; # we use a $ to access a scalar inside the array

print "\n\n";          # some space
print "There are ";
print scalar(@list);   # print number of items ($#list would give the last index, 2)
print " items on my list\n";
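
The difference between the number of items and the last index is easy to mix up, so here is a minimal sketch of both. The variable names $count and $last_index are chosen for this example only.

```perl
use strict;
use warnings;

my @list = qw(eggs bread milk);

my $count      = scalar(@list); # number of items: 3
my $last_index = $#list;        # index of the last item: 2

print "Items: " . $count . "\n";
print "First item: " . $list[0] . "\n";
print "Last item: " . $list[$last_index] . "\n";
print "Also the last item: " . $list[-1] . "\n"; # negative indexes count from the end
```

Because positions start at 0, the last index is always one less than the number of items.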


Hashes

my %itemized_list = (
    'eggs'  => 6,
    'bread' => '1 loaf',
    'milk'  => '1 gal',
);
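
Looking up and changing hash values works much like array access, but with curly braces and a key instead of a position. The sketch below assumes the same grocery hash; the 'butter' entry is added purely for illustration, and the loop uses the for syntax covered later in this guide.

```perl
use strict;
use warnings;

my %itemized_list = (
    'eggs'  => 6,
    'bread' => '1 loaf',
    'milk'  => '1 gal',
);

# we use a $ and curly braces to access a single value by its key
print "Eggs: " . $itemized_list{'eggs'} . "\n";

# change an existing value, or add a new key
$itemized_list{'eggs'}   = 12;
$itemized_list{'butter'} = '1 stick';

# keys() returns the keys in no particular order, so sort them for stable output
for my $item (sort keys %itemized_list) {
    print $item . ": " . $itemized_list{$item} . "\n";
}
```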


Functions

In mathematics, you would write:

\[ f(x) = x^2 + 2x - 1 \] \[ f(3) = 14 \]

In Perl, you would write:

sub f {
    my $x = shift;
    return $x**2 + 2*$x - 1;
}

print f(3); #=> 14

Built-in functions

Perl comes with a ton of built-in functions. Here are a few of them:

Function  Description                          Example
print     Print a string to screen or a file   print "Hi Mom!";
uc        Uppercase a string                   print uc("Hi Mom!");
sqrt      Return the square root of a number   print sqrt(4);

User-defined functions

  • In \(f(x)\), \(x\) is the parameter
  • In \(f(2)\), 2 is the argument
  • In Perl, \(x\) is passed in as an array of arguments to \(f()\), so it's like writing \(f((x))\)
  • The array is available as @_, or you can shift each argument
  • It's a quirk of the language

sub fetch_dog {
    my $name = shift;
    return uc($name) . "!";
}

print fetch_dog('Gretel'); # prints "GRETEL!"

  • sub fetch_dog: names the function fetch_dog
  • shift: takes the first argument from the list of input ("Gretel")
  • return: Return a value, in this case, an uppercased string with an exclamation point on the end
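
When a function takes more than one argument, the two ways of reading @_ mentioned above look like this. This is only a sketch; greet_dog and greet_dog_again are hypothetical names invented for the comparison.

```perl
use strict;
use warnings;

# style 1: shift the arguments off @_ one at a time
sub greet_dog {
    my $greeting = shift;
    my $name     = shift;
    return $greeting . ', ' . $name . '!';
}

# style 2: copy all of @_ into named variables at once
sub greet_dog_again {
    my ($greeting, $name) = @_;
    return $greeting . ', ' . $name . '!';
}

print greet_dog('Fetch', 'Gretel') . "\n";   # Fetch, Gretel!
print greet_dog_again('Sit', 'Shep') . "\n"; # Sit, Shep!
```

Both styles behave the same here; the second tends to be easier to read once there are several arguments.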


Conditionals

Sometimes you need to do things based on the value of a variable or output of a function:

if ($on_phone) {
    print 'Corporate accounts payable Nina speaking. Just a moment!';
} elsif ($day eq 'Monday') {
    print "Sounds like somebody's got a case of the Mondays";
} else {
    print '...';
}
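
To try the conditional yourself, the variables need values first. The sketch below wraps the same logic in a hypothetical helper, pick_greeting, and assumes sample values for $on_phone, $day, and $hour. It also shows that strings are compared with eq while numbers are compared with ==.

```perl
use strict;
use warnings;

my $on_phone = 0;          # 0 counts as false in Perl
my $day      = 'Monday';
my $hour     = 9;

# hypothetical helper that returns the message instead of printing it
sub pick_greeting {
    my ($on_phone, $day) = @_;
    if ($on_phone) {
        return 'Just a moment!';
    } elsif ($day eq 'Monday') {   # eq compares strings
        return "Sounds like somebody's got a case of the Mondays";
    } else {
        return '...';
    }
}

print pick_greeting($on_phone, $day) . "\n";

if ($hour == 9) {                  # == compares numbers
    print "Time to start work\n";
}
```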


Loops

If you have a list of something, you can iterate (loop) over the items in a few ways:

# the handy way, if you don't need to know the index
my @list = qw(eggs bread milk);

for my $item (@list) {
    print $item;
    print "\n";
}

# another way that involves keeping track of the index
# ($#list is the last index, so the comparison must be <=)
for (my $i = 0; $i <= $#list; $i++) {
    print $list[$i];
    print "\n";
}

# another way:
my $i = 0;
while ($i <= $#list) {
    print $list[$i];
    print "\n";
    $i += 1;
}
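
If you need the index but want to avoid getting the loop bounds wrong, one more option is to loop over a range of indexes. This is a minimal sketch; the @lines array is only there to collect the output before printing.

```perl
use strict;
use warnings;

my @list = qw(eggs bread milk);

my @lines;
# 0 .. $#list expands to every valid index: 0, 1, 2
for my $i (0 .. $#list) {
    push @lines, $i . ': ' . $list[$i];
}

print join("\n", @lines) . "\n";
```

The range always stops at the last index, so there is no comparison to get wrong.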

Running Perl

As we saw in the last tutorial, you can run Perl scripts from the command-line like so:

perl my_file.pl

Always save your Perl scripts with the .pl extension.

A silly example

  • Print a list of languages, "shouting" the first one

sub format_languages {
    my @langs = @_;

    for (my $i = 0; $i <= $#langs; $i++) {
        if ($i == 0) {
            print uc($langs[$i]) . '!';
        } else {
            print $langs[$i];
        }
        print "\n";
    }
}

format_languages(qw(Ruby Shell PHP MySQL Perl));

Alternate definitions

sub format_languages {
    my ($favorite, @langs) = @_;

    print uc($favorite) . "!\n";
    for (my $i = 0; $i <= $#langs; $i++) {
        print $langs[$i] . "\n";
    }
}

format_languages(qw(Ruby Shell PHP MySQL Perl));


sub format_languages {
    my @langs = @_;

    $langs[0] = uc($langs[0]) . '!';
    for (my $i = 0; $i <= $#langs; $i++) {
        print $langs[$i] . "\n";
    }
}

format_languages(qw(Ruby Shell PHP MySQL Perl));


sub format_languages {
    my @langs = @_;
    $langs[0] = uc($langs[0]) . '!';

    print join("\n", @langs) . "\n";
}

format_languages(qw(Ruby Shell PHP MySQL Perl));

Important boilerplate code to include

Perl can allow some really sloppy coding that can cause obnoxious, subtle errors. Make sure you put this at the top of every Perl file:

use diagnostics;
use strict;
use warnings;

# code goes here
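
To see the kind of mistake this boilerplate catches, consider a misspelled variable name. The sketch below is an illustration only: it compiles the faulty line inside a string eval so that the rest of the script keeps running, and $totl is a deliberate typo for $total. In a normal script, strict would simply stop the program with the same message.

```perl
use strict;
use warnings;

my $total = 10;

# The string below is compiled separately by eval, so a compile error
# inside it sets $@ instead of stopping this script.
my $ok = eval q{
    use strict;
    $totl = $total + 1;   # typo: $totl was never declared with my
    1;
};

if (!$ok) {
    print "strict caught the typo:\n" . $@;
}
```

Without strict, Perl would quietly create a new variable named $totl and $total would never change.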

Debugging Perl

If you want to know what a variable holds, print it out with Data::Dumper. Put this at the top of your file:

use Data::Dumper;

Then use it like so:

print Dumper(\%my_really_big_hash_table); # the backslash passes a reference, so keys and values stay paired
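
Here is a minimal, self-contained sketch of dumping a hash. The %groceries hash is sample data, and setting $Data::Dumper::Sortkeys is an optional choice made here so the keys print in a predictable order.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1; # print hash keys in sorted order

my %groceries = (
    'eggs'  => 6,
    'bread' => '1 loaf',
);

# the backslash passes a reference to the hash, so Dumper shows it as
# one structure with each key next to its value
my $dump = Dumper(\%groceries);
print $dump;
```

Passing the hash without the backslash flattens it into a plain list, and Dumper then prints each key and value as a separate variable, which is much harder to read.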


Go forth and write some Perl!

Date: 2011-11-13 15:30:08 MST

Author: Jon-Michael Deldin
