Chapter 8 References

CONTENTS

Reference Types
Summary
Review Questions
Review Exercises

A reference is a scalar value that points to a memory location that holds some type of data. Everything in your Perl program is stored inside your computer's memory. Therefore, all of your variables and functions are located at some memory location. References are used to hold the memory addresses. When a reference is dereferenced, you retrieve the information referred to by the reference.

Reference Types

There are six types of references. A reference can point to a scalar, an array, a hash, a glob, a function, or another reference. Table 8.1 shows how each type of reference is created with the assignment operator and how to dereference it using curly braces.

Note

I briefly mentioned hashes in Chapter 3 "Variables." Just to refresh your memory, hashes are another name for associative arrays. Because "hash" is shorter than "associative array," I'll be using both terms in this chapter.

Table 8.1 The Six Types of References

Reference Assignment	How to Dereference
`$refScalar = \$scalar;`	`${$refScalar}` is a scalar value.
`$refArray = \@array;`	`@{$refArray}` is an array value.
`$refHash = \%hash;`	`%{$refHash}` is a hash value.
`$refGlob = \*file;`	Glob references are beyond the scope of this book, but a short example can be found at http://www.mtolive.com/pbc/ch08.htm#Josh Purinton.
`$refFunction = \&function;`	`&{$refFunction}` is a function location.
`$refRef = \$refScalar;`	`${${$refRef}}` is a scalar value.

Essentially, all you need to do in order to create a reference is to add the backslash to the front of a value or variable.
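To make Table 8.1 a little more concrete, here is a minimal sketch that creates one reference of each common type and then dereferences it. The variable names and the greet() function are my own placeholders for illustration; they are not used elsewhere in this chapter.

```perl
use strict;
use warnings;

# All names below are hypothetical sample data, not part of the chapter's listings.
my $scalar = 10;
my @array  = (1, 2, 3);
my %hash   = ( "Name" => "Jane Hathaway" );

sub greet {
    return("Hello, $_[0]");
}

# Create the references by adding a backslash.
my $refScalar   = \$scalar;
my $refArray    = \@array;
my $refHash     = \%hash;
my $refFunction = \&greet;
my $refRef      = \$refScalar;

# Dereference them with the curly-brace notation from Table 8.1.
print("Scalar:   ${$refScalar}\n");
print("Array:    @{$refArray}\n");
print("Hash:     ", join(" ", %{$refHash}), "\n");
print("Function: ", &{$refFunction}("Perl"), "\n");
print("Ref:      ${${$refRef}}\n");

# A reference points at the original variable, so later changes show through.
$scalar = 20;
print("Scalar:   ${$refScalar}\n");
```

The last print statement displays 20 rather than 10, which shows that the reference holds the location of $scalar and not a copy of its value.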

Example: Passing Parameters to Functions

Back in Chapter 5 "Functions," we talked about passing parameters to functions. At the time, we were not able to pass more than one array to a function. This was because functions only see one array (the @_ array) when looking for parameters. References can be used to overcome this limitation.

Let's start off by passing two arrays into a function to show that the function only sees one array.

Call firstSub() with two arrays as parameters.
Define the firstSub() function.
Create local variables and assign elements from the parameter array to them.
Print the local arrays.


firstSub( (1..5), ("A".."E") );

sub firstSub {
    my(@firstArray, @secondArray) = @_ ;

    print("The first array is  @firstArray.\n");
    print("The second array is @secondArray.\n");
}

This program displays:

The first array is  1 2 3 4 5 A B C D E.
The second array is .

Inside the firstSub() function, the @firstArray variable was assigned the entire parameter array, leaving nothing for the @secondArray variable. By passing references to @arrayOne and @arrayTwo, we can preserve the arrays for use inside the function. Very few changes are needed to enable the above example to use references. Take a look.

Call firstSub() using the backslash operator to pass a reference to each array.
Define the firstSub() function.
Create two local scalar variables to hold the array references.
Print the local variables, dereferencing them to look like arrays. This is done using the @{} notation.


my @arrayOne = (1..5);
my @arrayTwo = ("A".."E");

firstSub( \@arrayOne, \@arrayTwo );                       # One

sub firstSub {
    my($ref_firstArray, $ref_secondArray) = @_ ;          # Two

    print("The first array is  @{$ref_firstArray}.\n");   # Three
    print("The second array is @{$ref_secondArray}.\n");  # Three
}

This program displays:

The first array is  1 2 3 4 5.
The second array is A B C D E.

Three things were done to make this example use references:

In the line marked "One," backslashes were added to indicate that a reference to the array should be passed.
In the line marked "Two," the references were taken from the parameter array and assigned to scalar variables.
In the lines marked "Three," the scalar values were dereferenced. Dereferencing means that Perl will use the reference as if it were a normal data type (in this case, an array variable).
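Passing a reference has one more consequence worth seeing: the function works on the caller's array, not on a copy. The following is only a sketch; the appendTo() function and the @letters array are hypothetical names that I made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: append a value to the array behind a reference.
sub appendTo {
    my($refArray, $value) = @_;

    push(@{$refArray}, $value);      # modifies the caller's array
    return(scalar(@{$refArray}));    # the new number of elements
}

my @letters = ("A".."C");
my $count   = appendTo(\@letters, "D");

print("The array now holds $count elements: @letters.\n");
```

This should display "The array now holds 4 elements: A B C D." because the push() inside the function changed @letters itself.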

Example: The ref() Function

Using references to pass arrays into a function worked well and it was easy, wasn't it? However, what happens if you pass a scalar reference to the firstSub() function instead of an array reference? Listing 8.1 shows how passing a scalar reference when the function demands an array reference causes problems.

Call firstSub() and pass a reference to a scalar and a reference to an array.
Define the firstSub() function.
Create two local scalar variables to hold the array references.
Print the local variables, dereferencing them to look like arrays.

Listing 8.1 08LST01.PL - Passing a Scalar Reference When the Function Demands an Array Reference Causes Problems


my @arrayTwo = ("A".."E");

firstSub( \10, \@arrayTwo );

sub firstSub {
    my($ref_firstArray, $ref_secondArray) = @_ ;

    print("The first array is  @{$ref_firstArray}.\n");
    print("The second array is @{$ref_secondArray}.\n");
}

This program displays:

Not an ARRAY reference at 08lst01.pl line 8.

Perl provides the ref() function so that you can check the reference type before dereferencing a reference. The next example shows how to trap the mistake of passing a scalar reference instead of an array reference.

Call firstSub() and pass a reference to each variable.
Define the firstSub() function.
Create two local scalar variables to hold the array references.
Print the local variables if each variable is a reference to an array. Otherwise, print nothing.

Listing 8.2 shows how to test for an array reference passed as a parameter.

Listing 8.2 08LST02.PL - How to Test for an Array Reference Passed as a Parameter


my @arrayTwo = ("A".."E");

firstSub( \10, \@arrayTwo );

sub firstSub {
    my($ref_firstArray, $ref_secondArray) = @_ ;

    print("The first array is  @{$ref_firstArray}.\n")
        if (ref($ref_firstArray) eq "ARRAY");             # One

    print("The second array is @{$ref_secondArray}.\n")
        if (ref($ref_secondArray) eq "ARRAY");            # Two
}

This program displays:

The second array is A B C D E.

Only the second parameter is printed because the first parameter (the scalar reference) failed the test on the line marked "One." The statement modifiers on the lines marked "One" and "Two" ensure that we are dereferencing an array reference. This prevents the error message that appeared earlier. Of course, in your own programs you might want to set an error flag or print a warning.

For more information about statement modifiers, see Chapter 6 "Statements."
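If you would rather report the problem than skip it silently, one possible approach is sketched below. It assumes a hypothetical printArrayRef() function of my own that prints a warning with warn() and returns 0 when it is handed the wrong type of reference; the calling code then counts the failures.

```perl
use strict;
use warnings;

# Hypothetical variant of firstSub(): returns 1 on success and 0 when
# the parameter is not an array reference.
sub printArrayRef {
    my($label, $ref) = @_;

    if (ref($ref) ne "ARRAY") {
        warn("$label: expected an ARRAY reference\n");
        return(0);
    }

    print("$label is @{$ref}.\n");
    return(1);
}

my @arrayTwo = ("A".."E");
my $errors   = 0;

$errors++ unless printArrayRef("The first array",  \10);
$errors++ unless printArrayRef("The second array", \@arrayTwo);

print("$errors parameter(s) rejected.\n");
```

The warning goes to STDERR, so the normal output of the program stays clean while the caller still learns that one parameter was rejected.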

Table 8.2 shows some values that the ref() function can return.

Table 8.2 Using the ref() Function

Function Call	Return Value
ref( 10 );	empty string (false)
ref( \10 );	SCALAR
ref( {1 => "Joe"} );	HASH
ref( \&firstSub );	CODE
ref( \\10 );	REF

Listing 8.3 shows another example of the ref() function in action.

Initialize scalar, array, and hash variables.
Pass the variables to the printRef() function. These are non-references so the empty string should be returned.
Pass variable references to the printRef() function. This is accomplished by prefixing the variable names with a backslash.
Pass a function reference and a reference to a reference to the printRef() function.
Define the printRef() function.
Iterate over the parameter array.
Assign the reference type to $refType.
If the current parameter is a reference, then print its reference type, otherwise, print that it's a non-reference.

Listing 8.3 08LST03.PL - Using the ref() Function to Determine the Reference Type of a Parameter

$scalar = 10;
@array  = (1, 2);
%hash   = ( "1" => "Davy Jones" );

# I added extra spaces around the parameter list
# so that the backslashes are easier to see.
printRef( $scalar, @array, %hash );
printRef( \$scalar, \@array, \%hash );
printRef( \&printRef, \\$scalar );

# print the reference type of every parameter.
sub printRef {
    foreach (@_) {
        $refType = ref($_);
        print($refType ne "" ? "$refType " : "Non-reference ");
    }
    print("\n");
}

This program displays:

Non-reference Non-reference Non-reference Non-reference Non-reference
SCALAR ARRAY HASH
CODE REF

The first line has five entries instead of three because Perl flattens @array and %hash into the parameter array: the function receives one scalar, two array elements, and one key-value pair.

By using the ref() function you can protect program code that dereferences variables from producing errors when the wrong type of reference is used.
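The same idea can be taken one step further by choosing a different dereferencing notation for each reference type. The describe() function below is a hypothetical helper that I wrote as a sketch; it only handles the reference types that appear in Table 8.2, plus array references.

```perl
use strict;
use warnings;

# Hypothetical helper: return a short description of any parameter.
sub describe {
    my($item) = @_;
    my $type  = ref($item);

    return("scalar ${$item}")                               if ($type eq "SCALAR");
    return("array of " . scalar(@{$item}) . " elements")    if ($type eq "ARRAY");
    return("hash with keys " . join(",", sort(keys(%{$item})))) if ($type eq "HASH");
    return("function")                                      if ($type eq "CODE");
    return("reference to " . describe(${$item}))            if ($type eq "REF");
    return("plain value $item");
}

foreach my $item (10, \10, [1, 2], { "a" => 1 }, \&describe, \\10) {
    print(describe($item), "\n");
}
```

Each branch dereferences the parameter only after ref() has confirmed its type, so the function should never trigger a "Not an ARRAY reference" style of error.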

Example: Creating a Data Record

Perl's associative arrays (hashes) are extremely useful when it comes to storing information in a way that facilitates easy retrieval. For example, you could store customer information like this:


%record = ( "Name"    => "Jane Hathaway",
            "Address" => "123 Anylane Rd.",
            "Town"    => "AnyTown",
            "State"   => "AnyState",
            "Zip"     => "12345-1234"
);

The %record associative array also can be considered a data record with five members. Each member is a single item of information. The data record is a group of members that relates to a single topic. In this case, that topic is a customer address. And, a database is one or more data records.

Each member is accessed in the record by using its name as the key. For example, you can access the state member by saying $record{"State"}. In a similar manner, all of the members can be accessed.

Of course, a database with only one record is not very useful. By using references, you can build a multiple record array. Listing 8.4 shows two records and how to initialize a database array.

Declare a data record called %recordOne as an associative array.
Declare a data record called %recordTwo as an associative array.
Declare an array called @database with references to the associative arrays as elements.

Listing 8.4 08LST04.PL - A Database with Two Records


%recordOne = ( "Name"    => "Jane Hathaway",
               "Address" => "123 Anylane Rd.",
               "Town"    => "AnyTown",
               "State"   => "AnyState",
               "Zip"     => "12345-1234"
);

%recordTwo = ( "Name"    => "Kevin Hughes",
               "Address" => "123 Allways Dr.",
               "Town"    => "AnyTown",
               "State"   => "AnyState",
               "Zip"     => "12345-1234"
);

@database = ( \%recordOne, \%recordTwo );

You can print the address member of the first record like this:

print($database[0]->{"Address"} . "\n");

which displays:

123 Anylane Rd.

Let's dissect the dereferencing expression in this print statement. Remember to work left to right and always evaluate brackets and parentheses first. Ignoring the print() function and the newline, you can evaluate this line of code in the following way:

The innermost bracket is [0], which means that we'll be looking at the first element of an array.
The square bracket operators have a left to right associativity, so we look left for the name of the array. The name of the array is database, and the leading $ shows that a single element is being fetched. That element is a reference to the %recordOne associative array.
The -> is the infix dereference operator. It tells Perl that the thing being dereferenced on the left (the reference stored in the database array in this case) is connected to something on the right.
The 'thing' on the right is the key value or "Address." Notice that it is inside curly braces exactly as if a regular hash key were being used. The curly braces also tell Perl to treat the reference on the left as a reference to an associative array.
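Perl accepts more than one way of writing this kind of dereference. The sketch below uses a hypothetical addressForms() function of my own to fetch the same member with three notations that current versions of Perl treat as equivalent; the record is a cut-down copy of the one in Listing 8.4.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch the "Address" member of the first record
# using three equivalent notations.
sub addressForms {
    my(@database) = @_;

    return(
        ${$database[0]}{"Address"},     # curly braces around the reference
        $database[0]->{"Address"},      # infix dereference operator
        $database[0]{"Address"},        # arrow omitted between subscripts
    );
}

my %recordOne = ( "Name" => "Jane Hathaway", "Address" => "123 Anylane Rd." );

print(join("\n", addressForms(\%recordOne)), "\n");
```

All three lines of output should show the same address. Which notation you use is mostly a matter of taste, although many programmers find the arrow easier to read when structures are nested deeply.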

The variable declaration in the above example uses three variables to define the data's structure. We can condense the declaration down to one variable as shown in Listing 8.5.

Declare an array called @database with two associative arrays as elements. Because the associative arrays are not being assigned directly to a variable, they are considered anonymous.
Print the value associated with the "Name" key for the first element of the @database array.
Print the value associated with the "Name" key for the second element of the @database array.

Listing 8.5 08LST05.PL - Declaring the Database Structure in One Shot


@database = (
    { "Name"    => "Jane Hathaway",
      "Address" => "123 Anylane Rd.",
      "Town"    => "AnyTown",
      "State"   => "AnyState",
      "Zip"     => "12345-1234"
    },
    { "Name"    => "Kevin Hughes",
      "Address" => "123 Allways Dr.",
      "Town"    => "AnyTown",
      "State"   => "AnyState",
      "Zip"     => "12345-1234"
    }
);



print($database[0]->{"Name"} . "\n");
print($database[1]->{"Name"} . "\n");

This program displays:

Jane Hathaway
Kevin Hughes

Let's analyze the dereferencing code in the first print line.

The innermost bracket is [0], which means that we'll be looking at the first element of an array.
The square bracket operators have a left to right associativity, so we look left for the name of the array. The name of the array is database, and the leading $ shows that a single element is being fetched. That element is a reference to an anonymous associative array.
The -> is the infix dereference operator. It tells Perl that the thing being dereferenced on the left (the reference stored in the database array in this case) is connected to something on the right.
The 'thing' on the right is the key value or "Name." Notice that it is inside curly braces exactly as if a regular hash key were being used. The curly braces also tell Perl to treat the reference on the left as a reference to an associative array.

Even though the structure declarations in the last two examples look different, they are equivalent. You can confirm this because the structures are dereferenced the same way. What's happening here? Perl is creating anonymous associative array references that become elements of the @database array.

In the previous example, each hash had a name: %recordOne and %recordTwo. In the current example, there is no variable name directly associated with the hashes. If you use an anonymous variable in your programs, Perl automatically will provide a reference to it.

We can explore the concepts of data records a bit further using this basic example. So far, we've used hash references as elements of an array. When one data type is stored inside of another data type, this is called nesting data types. You can nest data types as often and as deeply as you would like.
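As a small illustration of deeper nesting, the sketch below stores an anonymous array inside an anonymous hash that is itself an element of an array. The "Phones" member and the phoneCount() function are hypothetical additions of mine and are not part of the chapter's listings.

```perl
use strict;
use warnings;

# Hypothetical record: the "Phones" member is not part of the chapter's listings.
my @database = (
    { "Name"   => "Jane Hathaway",
      "Phones" => [ "(111) 511-1322", "(111) 513-4556" ]
    }
);

# Hypothetical helper: count the entries in a record's nested array.
sub phoneCount {
    my($refRecord) = @_;
    return(scalar(@{ $refRecord->{"Phones"} }));
}

print($database[0]->{"Name"} . " has " . phoneCount($database[0]) . " phone numbers.\n");
print("The first one is " . $database[0]->{"Phones"}->[0] . "\n");
```

Each level of nesting adds one more dereferencing step: the array subscript selects the record, the hash key selects the member, and the final subscript selects an element of the nested array.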

At this stage of the example, an array subscript was used to dereference the "Name" member of the first record. The subscript tells Perl which record to look at. However, you could use an associative array to hold the records. With an associative array, you could look at the records using a customer number or other id value. Listing 8.6 shows how this can be done.

Declare a hash called %database with two keys, MRD-100 and MRD-250. Each key has a reference to an anonymous hash as its value.
Find the reference to the hash associated with the key "MRD-100." Then print the value associated with the key "Name" inside the first hash.
Find the reference to the hash associated with the key "MRD-250." Then print the value associated with the key "Name" inside the second hash.

Listing 8.6 08LST06.PL - Using an Associative Array to Hold the Records


%database = (
    "MRD-100" => { "Name"    => "Jane Hathaway",
                   "Address" => "123 Anylane Rd.",
                   "Town"    => "AnyTown",
                   "State"   => "AnyState",
                   "Zip"     => "12345-1234"
                 },
    "MRD-250" => { "Name"    => "Kevin Hughes",
                   "Address" => "123 Allways Dr.",
                   "Town"    => "AnyTown",
                   "State"   => "AnyState",
                   "Zip"     => "12345-1234"
                 }
);



print($database{"MRD-100"}->{"Name"} . "\n");
print($database{"MRD-250"}->{"Name"} . "\n");

This program displays:

Jane Hathaway
Kevin Hughes

You should be able to follow the same steps that we used previously to decipher the print statement in this listing. The key is that the associative array index is surrounded by the curly brackets instead of the square brackets used previously.
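Because the records are now stored under keys, you can also step through all of them without knowing the customer numbers in advance. Here is one way this might look; the customerNames() function is a hypothetical helper, and the sample database is a shortened version of Listing 8.6.

```perl
use strict;
use warnings;

# Hypothetical helper: build one "id: name" line per record, sorted by id.
sub customerNames {
    my($refDatabase) = @_;
    my @lines;

    foreach my $id (sort(keys(%{$refDatabase}))) {
        push(@lines, $id . ": " . $refDatabase->{$id}->{"Name"});
    }
    return(@lines);
}

my %database = (
    "MRD-250" => { "Name" => "Kevin Hughes" },
    "MRD-100" => { "Name" => "Jane Hathaway" }
);

print(join("\n", customerNames(\%database)), "\n");
```

The keys are sorted first because a hash does not keep its keys in any particular order.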

There is one more twist that I would like to show you using this data structure. Let's see how to dynamically add information. First, we'll look at adding an entire data record, and then we'll look at adding new members to an existing data record. Listing 8.7 shows how you can use a standard hash assignment to dynamically create a data record.

Assign a reference to a hash to the "MRD-300" key in the %database associative array.
Assign the reference to the hash associated with the key "MRD-300" to the $refCustomer variable.
Print the value associated with the key "Name" inside the hash referenced by $refCustomer.
Print the value associated with the key "Address" inside the hash referenced by $refCustomer.

Listing 8.7 08LST07.PL - Creating a Record Using Hash Assignment


$database{"MRD-300"} = {
    "Name"    => "Nathan Hale",
    "Address" => "999 Centennial Ave.",
    "Town"    => "AnyTown",
    "State"   => "AnyState",
    "Zip"     => "12345-1234"
};

$refCustomer = $database{"MRD-300"};
print($refCustomer->{"Name"} . "\n");
print($refCustomer->{"Address"} . "\n");

This program displays:

Nathan Hale
999 Centennial Ave.

Notice that by using a temporary variable ($refCustomer), the program code is more readable. The alternative would be this:


print($database{"MRD-300"}->{"Name"} . "\n");

Most programmers would agree that using the temporary variable aids in the understanding of the program.

Our last data structure example will show how to add members to an existing customer record. Listing 8.8 shows how to add two phone number members to customer record MRD-300.

Assign a reference to an anonymous function to $codeRef. This function will print the elements of the %database hash. Because each value in the %database hash is a reference to another hash, the function has an inner loop to dereference the sub-hash.
Assign a reference to a hash to the "MRD-300" key in the %database associative array.
Call the anonymous routine by dereferencing $codeRef to print the contents of %database. This is done by surrounding the code reference variable with curly braces and prefixing it with a & to indicate that it should be dereferenced as a function.
Assign the reference to the hash associated with the key "MRD-300" to the $refCustomer variable.
Add "Home Phone" as a key to the hash associated with the "MRD-300" key.
Add "Business Phone" as a key to the hash associated with the "MRD-300" key.
Call the anonymous routine by dereferencing $codeRef to print the contents of %database.

Listing 8.8 08LST08.PL - How to Dynamically Add Members to a Data Structure


$codeRef = sub {
    while (($key, $value) = each(%database)) {
        print("$key = {\n");
        while (($innerKey, $innerValue) = each(%{$value})) {
            print("\t$innerKey => $innerValue\n");
        }
        print("};\n\n");
    }
};

$database{"MRD-300"} = {
    "Name"    => "Nathan Hale",
    "Address" => "999 Centennial Ave.",
    "Town"    => "AnyTown",
    "State"   => "AnyState",
    "Zip"     => "12345-1234"
};

# print database before dynamic changes.
&{$codeRef};

$refCustomer = $database{"MRD-300"};
$refCustomer->{"Home Phone"}     = "(111) 511-1322";
$refCustomer->{"Business Phone"} = "(111) 513-4556";

# print database after dynamic changes.
&{$codeRef};

This program displays:


MRD-300 = {
        Town => AnyTown
        State => AnyState
        Name => Nathan Hale
        Zip => 12345-1234
        Address => 999 Centennial Ave.
};

MRD-300 = {
        Town => AnyTown
        State => AnyState
        Name => Nathan Hale
        Home Phone => (111) 511-1322
        Zip => 12345-1234
        Business Phone => (111) 513-4556
        Address => 999 Centennial Ave.
};

Because a hash stores its members in no particular order, the members may appear in a different order when you run the program.

This example does two new things. The first thing is that it uses an anonymous function referenced by $codeRef. This is done for illustration purposes. There is no reason to use an anonymous function. There are actually good reasons for you not to do so in normal programs. I think that anonymous functions make programs much harder to understand.

Note

When helper functions are small and easily understood, I like to place them at the beginning of code files. This helps me to quickly refresh my memory when coming back to view program code after time spent doing other things.

The second thing is that a regular hash assignment statement was used to add values. You can use any of the array functions with these nested data structures.
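The hash functions work on nested records as well. As a rough sketch, the following code uses keys(), exists(), and delete() on a record reached through a reference; the memberNames() function is a hypothetical helper, and the record is a shortened copy of the one in Listing 8.8.

```perl
use strict;
use warnings;

# Hypothetical helper: list the member names of one record in sorted order.
sub memberNames {
    my($refRecord) = @_;
    return(sort(keys(%{$refRecord})));
}

my %database = (
    "MRD-300" => { "Name" => "Nathan Hale", "Zip" => "12345-1234" }
);
my $refCustomer = $database{"MRD-300"};

# Add a member, then list the member names.
$refCustomer->{"Home Phone"} = "(111) 511-1322";
print(join(", ", memberNames($refCustomer)), "\n");

# exists() and delete() accept a dereferenced element, too.
delete($refCustomer->{"Home Phone"}) if (exists($refCustomer->{"Home Phone"}));
print(join(", ", memberNames($refCustomer)), "\n");
```

The first line of output should list three members and the second line only two, because the "Home Phone" member was removed again.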

Example: Interpolating Functions Inside Double-Quoted Strings

You can use references to force Perl to interpolate the return value of a function call inside double-quoted strings. This helps to reduce the number of temporary variables needed by your program.

Call the makeLine() function from inside a double-quoted string.
Define the makeLine() function.
Return the dash character repeated a specified number of times. The first element in the parameter array is the number of times to repeat the dash.


print("Here are  5 dashes ${\makeLine(5)}.\n");
print("Here are 10 dashes ${\makeLine(10)}.\n");

sub makeLine {
    return("-" x $_[0]);
}

This program displays:

Here are  5 dashes -----.
Here are 10 dashes ----------.

The trick in this example is that the backslash turns the scalar return value into a reference, and then the dollar sign and curly braces turn the reference back into a scalar value that the print() function can interpret correctly. If the backslash character is not used to create the reference to the scalar return value, then the ${} dereferencing operation does not have a reference to dereference, and you will get an "uninitialized value" error.
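A related trick works for functions that return a list: wrap the call in an anonymous array with square brackets and dereference it with @{}. The sketch below repeats makeLine() so that it is self-contained and adds a hypothetical makeList() function of my own.

```perl
use strict;
use warnings;

sub makeLine {
    return("-" x $_[0]);
}

# Hypothetical helper that returns a list instead of a single value.
sub makeList {
    return(1 .. $_[0]);
}

print("A line: ${\makeLine(5)}\n");
print("A list: @{[ makeList(4) ]}\n");
```

The second print statement should display "A list: 1 2 3 4" because Perl joins the elements of the dereferenced array with spaces, just as it does for a named array inside a double-quoted string.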

Summary

In this chapter you learned about references. References are scalar variables used to hold memory locations. When references are dereferenced, the actual value is returned. For example, if the value of the reference is assigned like this: $refScalar = \10, then, dereferencing $refScalar would be equal to 10 and would look like this ${$refScalar}. You always can create a reference to a value or variable by preceding it with a backslash. Dereferencing is accomplished by surrounding the reference variable in curly braces and preceding the left curly brace with a character denoting what type of reference it is. For example, use @ for arrays and & for functions.

There are six types of references that you can use in Perl. You can have a reference to scalars, arrays, hashes, globs, functions, and other references. If you need to determine what type of reference is passed to a function, use the ref() function.

The ref() function returns a string that indicates which type of reference was passed to it. If the parameter was not a reference, the empty string is returned. You discovered that it is always a good idea to check reference types to prevent errors caused by passing the wrong type of reference. An example was given that caused an error by passing a scalar reference when the function expected an array reference.

A lot of time was spent discussing data records and how to access information stored in them. You learned how to step through dissecting a dereferencing expression, how to dynamically add new data records to an associative array, and how to add new data members to an existing record.

The last thing covered in this chapter was how to interpolate function calls inside double-quoted strings. You'll use this technique, at times, to avoid using temporary variables when printing or concatenating the output of functions to other strings.

Chapter 9 "Using Files," introduces you to opening, reading, and writing files. You find out how to store the data records you've constructed in this chapter to a file for long-term storage.

Review Questions

Answers to Review Questions are in Appendix A.

1. What is a reference?
2. How many types of references are there?
3. What does the ref() function return if passed a non-reference as a parameter?
4. What notation is used to dereference a reference value?
5. What is an anonymous array?
6. What is a nested data structure?
7. What will the following line of code display?

print("${\ref(\(1..5))}");

8. Using the %database array in Listing 8.6, what will the following line of code display?

print($database{"MRD-100"}->{"Zip"} . "\n");

Review Exercises

1. Write a program that will print the dereferenced value of $ref in the following line of code:

$ref = \\\45;

2. Write a function that removes the first element from each array passed to it. The return value of the function should be the number of elements removed from all arrays.
3. Add error-checking to the function written in Exercise 2 so the undef value is returned if one of the parameters is not an array.
4. Write a program based on Listing 8.7 that adds a data member indicating which weekdays a salesman may call the customer with an id of MRD-300. Use the following as an example:

"Best days to call" => ["Monday", "Thursday" ]