PHP Tutorials

Before We Begin

Being able to use string manipulation functions in your programs is similar to being able to make your program know how to read. For example, I already mentioned that by the end of this chapter you will be able to make your programs differentiate between valid and invalid e-mail addresses.

Although this chapter covers some of the string manipulation functions, PHP offers many more refined functions; they are too numerous to cover here and are seldom needed in practical, day-to-day use.

However, some situations do lend themselves to these more refined functions. So, if you find yourself wishing PHP had a function to do something, take a look at the manual—there’s a function for just about anything you could need. And, as you might expect, the more you use PHP, the more of these functions will become familiar to you. For now, though, the functions this chapter teaches will give you a strong start.

Filed under: Chapter 5 @ 6:57 pm

The String Concatenation Operator

Like the numeric variable types, strings can be used with operators. However, operators don’t treat strings the same way they treat numeric variables. For example, multiplying two strings together wouldn’t make much sense. Neither would dividing, adding, or subtracting them.

TIP

You may be thinking, “Hey, wait. Wouldn’t adding strings ‘glue’ them together to form a composite?” This operation of “adding” strings, though, is known among programmers as concatenation, which has a separate operator. We’ll discuss that a little later in this section.

In fact, if you perform a mathematical operation on strings, the strings are typecast to the most appropriate numeric type before the expression evaluates them.

To refresh your memory on string-to-numeric typecasting, see “Type Casting.”
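As a small illustration of that typecasting, here is a sketch that uses only numeric strings (the variable names are made up for the example). Strings that don't look like numbers behave differently from one PHP version to the next, so they are left out here.

```php
<?php
// Numeric strings are converted to numbers before the arithmetic is applied.
$sum     = '3' + '4';     // both operands become integers
$product = '1.5' * '2';   // '1.5' becomes a float, so the result is a float

var_dump($sum);       // int(7)
var_dump($product);   // float(3)
```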

The string concatenation operator, however, is made to work with strings. Concatenation is the joining together of two or more strings to form a single string. The string concatenation operator is the period; so, to join two strings together, we place a string on either side of a period. The resulting string can either be stored in a third string by assignment, or echoed directly to output.

NOTE

This works the same way as adding 1 + 2 + 3. You can have any number of concatenation operators in an expression, as long as each has a string on either side.
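Here is a minimal sketch of both uses, storing the result and echoing it directly. The names $first, $last, and $full are placeholders chosen for this illustration.

```php
<?php
$first = 'John';
$last  = 'Smith';

// Store the joined string in a third variable...
$full = $first . ' ' . $last;

// ...or echo the result of several concatenations directly.
echo 'Hello, ' . $full . '!';   // Hello, John Smith!
```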

For example, if you ask a user for his first and last names as two separate entries, then you can use concatenation to display his full name in whatever format you prefer, or even switch back and forth within the same script as necessary.

The following program asks the user for his first and last names, then prints his full name in two different formats:



<?php
/* File: ch05ex01.php - demonstrates string concatenation */

?>
<html>
<head><title>PHP By Example :: Chapter 5 :: Example 1</title></head>

<body>

<?php

// $_GET takes the place of the old $HTTP_GET_VARS array, which current PHP no longer provides.
if (!empty($_GET['fname']) && !empty($_GET['lname']))
{
    // Escape the user's input before echoing it back into the page.
    $fname = htmlspecialchars($_GET['fname']);
    $lname = htmlspecialchars($_GET['lname']);

    echo $fname . ' ' . $lname . '<br>';
    echo $lname . ', ' . $fname . '<br>';
}

?>

<form action="ch05ex01.php" method="GET">
First Name: <input type="text" name="fname"><br>
Last Name: <input type="text" name="lname"><br>
<input type="submit">
</form>

</body>

</html>

Being able to join two strings in this manner expands the flexibility of your program. Because you know that this is possible, you can ask for first and last names separately, and, in turn, have the ability to manipulate that name however you want.

For example, let’s say you’re organizing a convention (such as the annual ApacheCon) for which participants will sign up through your Web site. By gathering people’s first and last names independently of each other (instead of gathering one inflexible full name), you have the ability to format and sort the names however you want.

In some instances—creating a roster of attendees, for example—you might want to list the names alphabetically by last name. To do so, you simply concatenate the last name, a string containing a comma and a space, and the first name, like so:

$name_full = $name_last . ', ' . $name_first;

The resulting string might look like this:

Smith, John
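To show how this format helps with a roster, here is a rough sketch. The attendee data and the $attendees and $roster names are invented for the example; in practice the names would come from your sign-up form.

```php
<?php
// Hypothetical attendee data, used only for this illustration.
$attendees = array(
    array('first' => 'John', 'last' => 'Smith'),
    array('first' => 'Ann',  'last' => 'Jones'),
    array('first' => 'Bob',  'last' => 'Adams'),
);

// Build one "Last, First" string per attendee.
$roster = array();
foreach ($attendees as $person) {
    $roster[] = $person['last'] . ', ' . $person['first'];
}

// A plain string sort now orders the list by last name.
sort($roster);

echo implode("\n", $roster);
```

Because each string starts with the last name, an ordinary sort is all that's needed.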

Later in this chapter, you will learn the skills needed to create a string containing a person’s initials using the strings containing his first and last names. (For example, given that a person’s first name is Joe and his last name is Smith, you could find that his initials are J.S.) For now, you should have a clear idea of how string concatenation works and how to do it.


String Functions

As the concatenation operator joins strings, the various string functions allow you to divide strings and manipulate what’s already in a string. This will allow you to

Separate a string of data into more workable pieces

Retrieve only a particular part of a string

Find the location of a substring you want to extract

Replace a substring with a different string

Extracting Substrings
Extracting substrings is simply a matter of knowing where within a string the information (another string) you want is located. Specifically, you have to know the index of the first character and the length of the string you want to extract.

For example, let’s assume you have a person’s Social Security number stored in a string and you want to use the last four digits of the number as the default PIN code.

Let’s assume your program (or a person) has already formatted the string so that it is simply a sequence of nine digits, without hyphens, spaces, or other separating characters. Let’s say the Social Security number is 012-34-5678. The sequence is stored in a variable as follows:

$SSN = '012345678';

TIP

Notice that the string above must be within quotes or it will lose the initial zero. Although most numeric values may be easily converted back and forth between numbers and strings, this one would lose the zero as soon as it became a numeric type. (PHP also reads an unquoted number with a leading zero as octal, so the result can be stranger than a missing zero.)

It is a good practice to enclose all numbers intended to be used as strings in quotes to denote them as strings and not numbers. Not doing so can not only yield strange results if the number begins with a zero, but it also makes your code somewhat obscure. Variables intended for use only as strings should be coded only as strings.

Now, you want to retrieve the last four characters (in this case, digits) of a nine-character string. To do this, use the substr (substring) function. The syntax for substr follows:

string substr(string str, int start [, int length])

INTERPRETING SYNTAX GUIDES

The monospaced text you see just before this block is called a syntax guide. It’s a brief way of showing how a function is intended to be used that tells two important things about the function: what value is returned and what parameters it takes.

The function’s return value is given before the function name. In this case, it’s the first occurrence of string on that line.

After the function name, the parameters are given in parentheses, similarly to actually calling the function. However, the parameter types are given in addition to the typical parameter itself. Also, the parameters given here are italicized because they are symbolic names for what should be passed as that parameter.

Syntax guides can also tell you which parameters are optional. Optional parameters are enclosed in brackets so you’re aware of which parameters are optional and which aren’t.

str is the string you want to extract a substring from, start is the index of the first character to be extracted, and the optional parameter length is the length of the substring you wish to extract. If you leave length out, the substring returned will go all the way to the end of the string.

So, to get the last four characters of the Social Security number, use

$SSN_lastFour = substr($SSN, 5, 4);

The 5 here means the substring you get will start at the index position 5, which is the fourth character from the end of the string. The last parameter, 4, tells substr() to give us four characters, which in this case are the last four. Figure 5.1 illustrates the extraction of the last four digits from the rest of the string.

Figure 5.1. The substring here is the last four characters of the nine-character string, starting at the character index 5 and continuing to the end.

NOTE

When a substring is extracted from a string, it is not removed, but rather only retrieved. For example, in the demonstration involving a Social Security number, $SSN will still be a nine-character string, and it will still be the same as it was before. You are not changing the string in any way; instead, you’re merely “taking a look” at what’s inside the string.
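A quick way to check both points, the extracted digits and the untouched original, is to echo the two variables side by side:

```php
<?php
$SSN = '012345678';
$SSN_lastFour = substr($SSN, 5, 4);

echo $SSN_lastFour . "\n";   // 5678
echo $SSN . "\n";            // 012345678 (the original string is unchanged)
```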

The substr function is much more flexible than that, however. Let’s assume for a moment that you’re not sure if the Social Security number has its number groups separated by some character or not. Any of the following assignments could be true:



$SSN = '012345678';

or

$SSN = '012-34-5678';

or even

$SSN = '012.34.5678';

Independent of the rest of the string, if you know that the last four characters of the string are the last four digits of the number, you can retrieve the last four characters from the end.

Counting from the end is especially important in this case because you can’t be sure whether the string’s length will be 9 (just the nine digits) or 11 (the nine digits plus two separating characters). If you counted from the beginning of the string, you would then have the problem of figuring out what the starting position of the substring would be; it could be either 5 or 7. However, if you count from the end, the substring will always start 4 characters from the end.

To express this to substr, use a negative starting position. Doing so tells substr to count from the end of the string instead of from the beginning. However, unlike counting from the beginning of the string, counting from the end starts at -1 (not 0 or -0), so the last character of the string is at position -1.

The following statement retrieves the last four digits, regardless of the format of the string:

$SSN_lastFour = substr($SSN, -4);

Notice that the length parameter is omitted. Because you’re trying to retrieve everything up to the end of the string, it’s not necessary.
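To see that the format really doesn't matter, the same call can be run against all three versions of the string. The $formats array is just a convenience for this sketch.

```php
<?php
$formats = array('012345678', '012-34-5678', '012.34.5678');

foreach ($formats as $SSN) {
    // Start four characters from the end and run to the end of the string.
    echo substr($SSN, -4) . "\n";   // 5678 each time
}
```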

Now, try using the length parameter. The length parameter determines how long the substring returned will be. For example, if length is specified as 2, the substring returned will be two characters long. The following example demonstrates this principle.



$str = 'abcdef';
echo substr($str, 0, 2); // outputs 'ab'

In this example, the substring begins at the very first position in the string, 0, and it’s 2 characters long. Thus, the substring returned is the first two characters of the string, ab.

The length parameter can also be negative. Like the start parameter, a negative length means count from the end of the string: the substring stops that many characters before the end. If length is -2, for example, the last two characters are left out. Again, -1 is the first character when counting from the end of the string, and the character at the ending position itself is not included in the substring.

Here’s an example:

$str = 'abcdef';
echo substr($str, 0, -2); // outputs 'abcd'

Now, instead of the substring being 2 characters long, it runs from the beginning (0) up to, but not including, the character at position -2. The last two characters, ef, are dropped, and abcd is returned as the substring.
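A few more calls make the behavior of a negative length easier to see. In PHP, a negative length leaves that many characters off the end of the string, as this short sketch shows:

```php
<?php
$str = 'abcdef';

echo substr($str, 0, -1) . "\n";   // abcde (drops the last character)
echo substr($str, 0, -2) . "\n";   // abcd  (drops the last two)
echo substr($str, 2, -1) . "\n";   // cde   (starts at index 2, drops the last one)
```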

CAUTION

Because string index positions can be confusing, it’s a good idea to check the result of substr calls with several different strings to make sure it is doing what you want it to do. If it’s not, you can adjust the parameters you’re passing to it without too much of a hassle; if you continue without testing, you may later find that you have a hard time even figuring out where the problem is.

You may find that sometimes you need the length of a string. This is helpful if you want to get the last character of a string or check to make sure a string isn’t too long to fit somewhere (such as a particularly limited place on a Web page or in a size-limited database field).

To find the length of a string, use the strlen function, which has the following syntax:

int strlen(string str)

To find the length of a string $str, then, you would use

echo strlen($str);

The complete number of characters (including whitespace characters such as spaces and \n) is returned. Here’s an example:



$str   =  'This is a string.';
$str2  =  "Newlines!\nOne\nTwo";
echo strlen($str) . ', ' . strlen($str2);

The output of this code would be 17, 17. Remember that even though the second string appears longer, the \n sequence inside of double quotes is interpreted as only one character. Thus, the two strings are of equal length.
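The two uses of strlen mentioned earlier, getting the last character and checking a size limit, might look something like this. The $maxLength value is a hypothetical limit chosen for the illustration.

```php
<?php
$str = 'This is a string.';

// The last character sits at index strlen($str) - 1.
$lastChar = substr($str, strlen($str) - 1, 1);
echo $lastChar . "\n";   // .

// Hypothetical limit, such as the width of a database field.
$maxLength = 10;
if (strlen($str) > $maxLength) {
    $excess = strlen($str) - $maxLength;
    echo 'Too long by ' . $excess . " characters\n";   // Too long by 7 characters
}
```

Of course, substr($str, -1) would fetch the last character too; the strlen version is shown only to connect the two functions.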

Finding Substrings
If you already know where to find a substring within a string, things aren’t too difficult. However, it’s not always so easy; sometimes you only know where a substring is in relation to another string.

Let’s take a string representation of a number raised to a power as an example. To interpret such a string, you would have to break the string into two parts: the number and the power it’s supposed to be raised to.

Here’s an example string:

$numToPower = '20^2';

Keep in mind that this example should be allowed to change; although our example is 20^2, it could be 2^2, 3^5, or 5^10. Therefore, you have no idea where the caret is going to be and where either number will begin or end.

So, in order to extract the numbers as substrings, you first must determine the positions at which they start and end. You know that the first number will always start at 0, and you know that the last number will always go to the end. All you really have to figure out is where the first number ends and the second begins.

If you knew the position of the caret, you could determine the positions you needed: The first number would end 1 before the caret's position, and the second number would begin 1 after the caret's position. Now you need to find the caret's position.

To do this, you'll use the strpos function. The syntax for the strpos function is as follows:

int strpos(string str, string find [, int start])

str is the string to be searched, and find is the string to find. The optional start parameter is used to limit where strpos starts searching for find within str; for example, if you know there are three periods within a string, but want to find the second one, you can rule out the first one by specifying a start that is past it.
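Here is a short sketch of the start parameter at work, using a made-up string with three periods. It also shows one thing to watch for: strpos returns false when find isn't in the string at all, and a match at index 0 looks false as well, so the result should be compared with ===.

```php
<?php
$str = 'one.two.three.four';

$first  = strpos($str, '.');                // 3
$second = strpos($str, '.', $first + 1);    // 7, because the search starts past the first period

// false means "not found"; use === so a match at index 0 isn't mistaken for it.
$caret = strpos($str, '^');
if ($caret === false) {
    echo "No caret in the string\n";
}
```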

Here's how strpos is used to find the caret in the preceding expression:

$caretPos = strpos($numToPower, '^');

Supposing $numToPower was 20^2, $caretPos would now be 2 (the caret's index within the string). See Figure 5.2 for a visual depiction of how strpos() arrives at this value.

Figure 5.2. The index position of the caret is what's returned by strpos('20^2', '^').

Now, to get the two numbers, it's only necessary to use $caretPos with the preceding assertions describing where you will find the beginnings and ends of the numbers in relation to the caret.

There is one complication, however. The substr function doesn't take a start and an end position, but rather a start and a length. To overcome this, you have to calculate the length for the first number. (The second number's length can be unspecified because it will end at the end of the string.)

The caret position is 2; this tells us that there are 2 characters before the caret: those at positions 0 and 1. If the caret were at 3, there would be 3 characters before it (those at 0, 1, and 2). At 4, there would be 4, and so on. Therefore, you can simply use the caret's position as the length of the substring for the first number.

The last number starts at whatever position immediately follows the caret, $caretPos + 1.

The following code extracts the two numbers from the string $numToPower:
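A minimal sketch of that extraction, built only from the steps described above. The variable names $num and $power are assumptions made for this illustration.

```php
<?php
$numToPower = '20^2';

$caretPos = strpos($numToPower, '^');

// The first number starts at index 0 and is $caretPos characters long.
$num = substr($numToPower, 0, $caretPos);

// The second number starts just past the caret and runs to the end.
$power = substr($numToPower, $caretPos + 1);

echo $num . ' raised to the power ' . $power;   // 20 raised to the power 2
```

The same three lines work unchanged for 2^2, 3^5, or 5^10, because nothing in them depends on where the caret happens to be.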

Performing Basic String Replacements
Another type of string manipulation is comparable to a word processor's Find and Replace utility. A string replacement occurs when a particular string is replaced with another string within a larger string. This is commonly used for

Removing possible occurrences of obscene words from publicly submitted text

Changing plain-text characters into HTML characters (such as regular newlines into <br> tags)

Changing Windows return plus newline (\r\n) into Unix-formatted newlines (\n)

The function to perform simple string replacements with is str_replace. Here's the syntax:

string str_replace(string find, string replace, string str)

Where find is the string that should be found, replace is the string to replace all occurrences of find with, and str is the string to perform the replacements in.

NOTE

Notice that str_replace returns a string. The only way to get the result of the replacement is to store this return value (either to a new variable or even back to the original variable passed as str). The str_replace function does not modify str on its own.
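As a quick check of this, the following sketch converts Windows line endings to Unix ones and then compares the two strings. The variable names are placeholders.

```php
<?php
$original = "line one\r\nline two\r\n";

// str_replace returns the new string; $original itself is not changed.
$unix = str_replace("\r\n", "\n", $original);

var_dump($original === $unix);   // bool(false)
var_dump(strlen($original));     // int(20)
var_dump(strlen($unix));         // int(18)
```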

The use of str_replace is pretty straightforward. Let's assume the string $text contains some text a user submitted that's going to be displayed on a Web site. If the user pressed Enter anytime he was typing the text, he would have inserted \n or \r\n into the text. However, these characters are ignored when a browser is interpreting HTML. (You can break a line wherever you want in HTML and the file will be processed exactly the same way.) To get these line breaks to show up, you must replace the \n sequences with a <br> tag. Here's how this could be done:

$text = str_replace("\n", '<br>', $text);

CAUTION

The difference between double quotes and single quotes is extremely important in this example. The newline (\n) passed to str_replace must be the same as the one in $text. Therefore, you must be sure to enclose the newline in double quotes. As with all other strings, enclosing it in single quotes keeps PHP from interpreting it as a newline and instead forces PHP to interpret it as a backslash and an 'n'.
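The difference is easy to see with strlen, and with a replacement that quietly does nothing. This is only a sketch; $text is a made-up sample.

```php
<?php
echo strlen("\n") . "\n";   // 1: a single newline character
echo strlen('\n') . "\n";   // 2: a backslash followed by the letter n

$text = "first\nsecond";

// Single quotes: nothing matches, so the text comes back unchanged.
$unchanged = str_replace('\n', '<br>', $text);
var_dump($unchanged === $text);   // bool(true)

// Double quotes: the newline is found and replaced.
$replaced = str_replace("\n", '<br>', $text);
echo $replaced;                   // first<br>second
```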

TIP

There is also a function that has been specifically created to handle this task called nl2br. For more information, check out the PHP manual, as specified in Appendix A, "Debugging and Error Handling."
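For comparison, here is roughly what nl2br does with a short sample string. Note that it keeps the original newline and inserts the tag in front of it.

```php
<?php
$text = "first line\nsecond line";

// nl2br inserts an HTML line break before each newline and keeps the newline.
$html = nl2br($text);
echo $html;
// first line<br />
// second line
```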

NOTE

The str_replace function has one drawback: It's case-sensitive. If you want to find only the capitalized word Fred then this is fine; however, if you want to find Fred, fred, and FRED, you'll need a case-insensitive alternative such as the str_ireplace function or a regular expression; see "Replacements with Regular Expressions" later in this chapter.
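The following sketch contrasts the two behaviors using PHP's built-in str_ireplace, which takes the same parameters as str_replace but ignores case. The replacement name Bob is arbitrary.

```php
<?php
$text = 'Fred, fred, and FRED';

$sensitive   = str_replace('Fred', 'Bob', $text);
$insensitive = str_ireplace('fred', 'Bob', $text);

echo $sensitive . "\n";     // Bob, fred, and FRED
echo $insensitive . "\n";   // Bob, Bob, and Bob
```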

The str_replace function can also perform multiple replacements at the same time. Any of the three parameters may be specified as arrays. The first parameter may be an array of several different substrings to find within the string. Each one found is replaced with the corresponding element of the array passed as the second parameter. If the second parameter is a single string, that string is used for all of the replacements. The third parameter may also be an array, in which case the replacements are performed on every string in it and an array of results is returned.

CAUTION

If the array for the second parameter has fewer elements than the one in the first parameter, empty strings will be used as the replacement strings for the missing elements. If you're replacing multiple strings with multiple values, be sure you have a value for each string you're replacing or you'll end up simply removing the strings without replacing them with anything.
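A small sketch of that pitfall, with made-up strings: two words to find, but only one replacement supplied.

```php
<?php
$text = 'dog and cat';

// Only one replacement for two search strings, so 'cat' is replaced with ''.
$result = str_replace(array('dog', 'cat'), array('wolf'), $text);

var_dump($result);   // string(9) "wolf and "
```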

The following example demonstrates the replacement of the strings "dog", "cat", and "ferret" with the single word "animal":
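A minimal sketch of such a replacement. The input sentence is an assumption (it is not given in the text) and was chosen so that the result agrees with the output quoted next.

```php
<?php
// Assumed input sentence; it is not given in the text.
$sentence = 'My dog knows a mammal that knows the ferret that stole my keys.';

// An array of strings to find, and a single string to replace them all with.
$result = str_replace(array('dog', 'cat', 'ferret'), 'animal', $sentence);

echo $result;
```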

Here's the output from this program:

My animal knows a mammal that knows the animal that stole my keys.

Now that you've replaced several words with one, try replacing them so each word is replaced with a different word:
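Again, only a sketch. The input sentence and the list of words to find are assumptions, chosen so that the result agrees with the output quoted next.

```php
<?php
// Assumed input sentence and search list; neither is given in the text.
$sentence = 'My dog knows a mammal that knows the ferret that stole my keys.';

$find    = array('dog', 'mammal', 'ferret');
$replace = array('wife', 'guy', 'thief');

// Each element of $find is replaced by the element of $replace at the same index.
$story = str_replace($find, $replace, $sentence);

echo $story;
```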

And here's the output:

My wife knows a guy that knows the thief that stole my keys.
