PHP Regular Expressions (Regex)
Regular expressions (regex) in PHP are used for pattern matching and text manipulation. PHP provides a set of functions to work with regular expressions, allowing you to search, replace, and validate strings based on specific patterns. These functions are powerful tools for text processing.
Basic Concepts
- Pattern: A regex pattern defines the sequence of characters you want to match.
- Modifiers: Optional flags that alter the behavior of the regex engine (e.g., case-insensitive matching).
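As a minimal sketch of how a pattern and a modifier fit together: the pattern is written between delimiters (a forward slash here), and any modifiers follow the closing delimiter. The subject string and variable names below are illustrative only.

```php
<?php
$subject = "Hello, World!";

// No modifier: matching is case-sensitive, so "world" does not match "World".
$caseSensitive = preg_match('/world/', $subject);

// The "i" modifier after the closing delimiter makes the match case-insensitive.
$caseInsensitive = preg_match('/world/i', $subject);

var_dump($caseSensitive);   // int(0)
var_dump($caseInsensitive); // int(1)
```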
Regular Expression Functions in PHP
PHP has offered two regular expression libraries:
- POSIX (the ereg family of functions; deprecated in PHP 5.3 and removed in PHP 7.0)
- PCRE (Perl Compatible Regular Expressions)
PCRE offers more features and is the only option in current PHP versions, so all of the functions covered here come from the PCRE library (the preg_ family).
Common PCRE Functions
preg_match()
Searches a string for a match to a regex pattern.
Syntax:
int preg_match ( string $pattern , string $subject [, array &$matches = null [, int $flags = 0 [, int $offset = 0 ]]] )
Example:
<?php
$pattern = "/world/";
$subject = "Hello, world!";

if (preg_match($pattern, $subject)) {
    echo "Match found!";
} else {
    echo "No match found.";
}
?>
Explanation:
- Searches for the pattern "world" in the string "Hello, world!".
- Returns 1 if a match is found, 0 if not, and FALSE if an error occurred.
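Because preg_match() can return 0 as well as FALSE, a strict comparison is the safer way to tell "no match" apart from an error. The sketch below also passes the optional $matches argument to read captured groups; the subject string is made up for illustration.

```php
<?php
$subject = "Order #4521 shipped on 2024-03-15";

// Groups 1-3 capture the year, month, and day.
$result = preg_match('/(\d{4})-(\d{2})-(\d{2})/', $subject, $matches);

if ($result === false) {
    // false signals an error (for example, an invalid pattern), not "no match".
    echo "Regex error code: " . preg_last_error();
} elseif ($result === 1) {
    // $matches[0] is the full match; $matches[1] to $matches[3] are the groups.
    echo "Date: {$matches[0]}, year: {$matches[1]}";
} else {
    echo "No match found.";
}
```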
preg_match_all()
Searches a string for all matches to a regex pattern.
Syntax:
int preg_match_all ( string $pattern , string $subject [, array &$matches = null [, int $flags = 0 [, int $offset = 0 ]]] )
Example:
<?php
$pattern = '/\d+/';
$subject = "There are 12 apples and 34 oranges.";

preg_match_all($pattern, $subject, $matches);
print_r($matches[0]); // Outputs: Array ( [0] => 12 [1] => 34 )
?>
Explanation:
- Finds all matches of one or more digits in the string.
- $matches[0] contains all the matched substrings.
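When the pattern contains capture groups, the PREG_SET_ORDER flag can make the result easier to loop over, since each element of $matches then holds one complete match. A small sketch, reusing the sentence from the example with an assumed two-group pattern:

```php
<?php
$subject = "There are 12 apples and 34 oranges.";

// Group 1 captures the number, group 2 the word that follows it.
$count = preg_match_all('/(\d+) (\w+)/', $subject, $matches, PREG_SET_ORDER);

// With PREG_SET_ORDER, each $match is [full match, group 1, group 2].
foreach ($matches as $match) {
    echo "{$match[2]}: {$match[1]}\n";
}
// Prints:
// apples: 12
// oranges: 34
```

The return value ($count here) is the number of full matches found.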
preg_replace()
Replaces matches of a regex pattern with a replacement string.
Syntax:
mixed preg_replace ( string $pattern , string $replacement , string $subject [, int $limit = -1 [, int &$count = null ]] )
Example:
<?php
$pattern = "/world/";
$replacement = "universe";
$subject = "Hello, world!";

$result = preg_replace($pattern, $replacement, $subject);
echo $result; // Outputs: Hello, universe!
?>
Explanation:
- Replaces every occurrence of "world" with "universe" (this string contains only one). To cap the number of replacements, pass a limit as the fourth argument.
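Two features of preg_replace() are worth a quick sketch: the replacement string can refer to captured groups as $1, $2, and so on, and the optional fourth and fifth arguments limit and count the replacements. The date strings below are invented for illustration.

```php
<?php
$subject = "2024-03-15 and 2025-01-02";

// $1, $2, $3 in the replacement refer to the captured year, month, and day.
$reordered = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '$3/$2/$1', $subject);
echo $reordered, "\n"; // 15/03/2024 and 02/01/2025

// A limit of 1 replaces only the first match; $count reports how many were made.
$firstOnly = preg_replace('/\d{4}/', 'YYYY', $subject, 1, $count);
echo $firstOnly, "\n"; // YYYY-03-15 and 2025-01-02
```

The replacement is written in single quotes so that PHP does not try to interpret $1 itself.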
preg_split()
Splits a string by a regex pattern.
Syntax:
array preg_split ( string $pattern , string $subject [, int $limit = -1 [, int $flags = 0 ]] )
Example:
<?php
$pattern = '/[\s,]+/';
$subject = "apple, orange banana";

$result = preg_split($pattern, $subject);
print_r($result); // Outputs: Array ( [0] => apple [1] => orange [2] => banana )
?>
Explanation:
- Splits the string on any run of whitespace characters and/or commas.
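If the input starts or ends with a delimiter, preg_split() returns empty strings at those positions. One way to avoid that, sketched here with an assumed messy input, is the PREG_SPLIT_NO_EMPTY flag:

```php
<?php
$subject = ",apple, orange  banana,";

// Without a flag, the leading and trailing commas produce empty pieces.
$withEmpty = preg_split('/[\s,]+/', $subject);

// A limit of -1 means "no limit"; PREG_SPLIT_NO_EMPTY drops the empty pieces.
$clean = preg_split('/[\s,]+/', $subject, -1, PREG_SPLIT_NO_EMPTY);

print_r($clean); // apple, orange, banana
```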
preg_replace_callback()
Performs a regex search and replace using a callback function.
Syntax:
mixed preg_replace_callback ( string $pattern , callable $callback , string $subject [, int $limit = -1 [, int &$count = null ]] )
Example:
<?php
$pattern = '/\d+/';
$subject = "There are 12 apples and 34 oranges.";

$result = preg_replace_callback($pattern, function ($matches) {
    // $matches[0] is the matched number as a string; cast before doing arithmetic.
    return (string) ((int) $matches[0] * 2);
}, $subject);

echo $result; // Outputs: There are 24 apples and 68 oranges.
?>
Explanation:
- Multiplies all numbers in the string by 2 using a callback function.
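The callback receives the same array that preg_match() would fill, so captured groups are available too. As a hedged illustration (the input string is made up), this sketch converts snake_case names to camelCase by upper-casing the letter captured after each underscore:

```php
<?php
$subject = 'user_first_name and user_last_name';

$result = preg_replace_callback(
    '/_([a-z])/',
    function (array $m): string {
        // $m[0] is "_x"; $m[1] is just the captured letter.
        return strtoupper($m[1]);
    },
    $subject
);

echo $result; // userFirstName and userLastName
```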
Regular Expression Patterns
Literal Characters: Match exact characters.
- Example: /cat/ matches "cat".
Metacharacters: Special characters that have specific meanings.
- . (dot): Matches any single character except a newline.
- \d: Matches any digit (0-9).
- \D: Matches any non-digit.
- \w: Matches any word character (alphanumeric and underscore).
- \W: Matches any non-word character.
- \s: Matches any whitespace character.
- \S: Matches any non-whitespace character.
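A short sketch of several metacharacters applied to one sample string (the string is invented for illustration):

```php
<?php
$subject = "Room 42, Floor_3!";

// \d+ : runs of digits
preg_match_all('/\d+/', $subject, $digits);

// \w+ : runs of word characters (letters, digits, underscore)
preg_match_all('/\w+/', $subject, $words);

// \s : each whitespace character; the return value is the number of matches
$spaces = preg_match_all('/\s/', $subject);

// . : any character except a newline, so this counts every character here
$chars = preg_match_all('/./', $subject);

print_r($digits[0]); // 42, 3
print_r($words[0]);  // Room, 42, Floor_3
```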
Anchors: Define positions in the string.
- ^: Matches the start of a string.
- $: Matches the end of a string (or the position just before a final newline).
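Anchors are what turn a "contains" check into a "whole string" check, which matters for validation. A minimal sketch follows; isAllDigits() is a hypothetical helper written for this example, not a PHP built-in.

```php
<?php
// Hypothetical helper: true only if $input consists entirely of digits.
function isAllDigits(string $input): bool
{
    // ^ and $ pin the pattern to both ends, so the whole string must be digits.
    return preg_match('/^\d+$/', $input) === 1;
}

var_dump(isAllDigits("12345"));  // bool(true)
var_dump(isAllDigits("123abc")); // bool(false)

// Without anchors, the same pattern matches anywhere in the string.
$unanchored = preg_match('/\d+/', "123abc");
var_dump($unanchored); // int(1)
```

By default $ also matches just before a trailing newline, so "12345\n" would pass this check; adding the D modifier (/^\d+$/D) is one way to rule that out.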
Quantifiers: Specify the number of times a character or group should be matched.
- *: Matches 0 or more times.
- +: Matches 1 or more times.
- ?: Matches 0 or 1 time.
- {n}: Matches exactly n times.
- {n,}: Matches n or more times.
- {n,m}: Matches between n and m times.
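Each quantifier can be tried out in isolation. The patterns and test strings below are illustrative choices, anchored with ^ and $ so that the quantifier alone decides the outcome:

```php
<?php
// ? : the "u" may appear zero or one time
$optional = preg_match('/^colou?r$/', 'color');    // 1

// * : zero or more "o", so no "o" at all is fine
$star = preg_match('/^go*gle$/', 'ggle');          // 1

// + : at least one "o" is required
$plus = preg_match('/^go+gle$/', 'ggle');          // 0

// {n} : exactly 5 digits
$exact = preg_match('/^\d{5}$/', '90210');         // 1

// {n,} : at least 8 word characters
$atLeast = preg_match('/^\w{8,}$/', 'short');      // 0

// {n,m} : between 3 and 5 lowercase letters
$between = preg_match('/^[a-z]{3,5}$/', 'regex');  // 1
```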
Groups and Ranges:
- (abc): Matches the exact sequence "abc" and captures it as a group.
- [a-z]: Matches any single character in the range from a to z.
- |: Acts as an OR operator.
  - Example: /cat|dog/ matches "cat" or "dog".
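The three constructs above can be compared side by side in a short sketch; the sample strings are made up for illustration.

```php
<?php
$subject = "I have a cat, a dog, and a bat.";

// Alternation: match either whole word.
preg_match_all('/cat|dog/', $subject, $pets);
print_r($pets[0]);   // cat, dog

// Range: [b-d] is one character from b to d, followed by "at".
preg_match_all('/[b-d]at/', $subject, $rhymes);
print_r($rhymes[0]); // cat, bat

// Group: the quantifier applies to the whole group "ha", not just the "a".
preg_match('/(ha)+/', 'hahaha!', $laugh);
echo $laugh[0];      // hahaha
```

When a group repeats, $laugh[1] holds only the text captured by its last repetition ("ha" here).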