Regular Expressions in Python
A regular expression (regex or regexp) is a pattern that describes a set of strings. Python's re module provides functions to search, match, and manipulate text based on such patterns, so that complex searches and substitutions can be expressed concisely.
Key Features of Regular Expressions
Pattern Matching: Regular expressions use a combination of literal characters and special symbols (called metacharacters) to define search patterns.
Flexible Search: They allow searching for specific patterns within text, such as words, numbers, or particular sequences of characters, wherever they occur.
Text Manipulation: Beyond searching, regular expressions can be used to replace, extract, or split parts of text.
Validation: Regular expressions can be used to validate formats, such as email addresses, phone numbers, or postal codes.
Commonly Used Metacharacters in Regular Expressions
.: Matches any character except a newline.
^: Matches the start of a string.
$: Matches the end of a string (or just before a trailing newline).
*: Matches 0 or more repetitions of the preceding element.
+: Matches 1 or more repetitions of the preceding element.
?: Matches 0 or 1 occurrence of the preceding element.
[]: Denotes a set of characters to match.
|: Acts as an OR operator between alternatives.
\d: Matches any digit (0-9).
\D: Matches any non-digit character.
\w: Matches any word character (alphanumeric + underscore).
\W: Matches any non-word character.
\s: Matches any whitespace (space, tab, newline).
\S: Matches any non-whitespace character.
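The metacharacters listed above can be tried out with a few short calls. This is a minimal sketch; the sample strings and variable names are made up for illustration.

```python
import re

text = "Order 66 shipped to cat_lover on 2024-01-15"

# \d+ matches runs of one or more digits
digits = re.findall(r"\d+", text)
print(digits)          # ['66', '2024', '01', '15']

# . matches any single character except a newline
dot = re.findall(r"c.t", "cat cot c\nt cut")
print(dot)             # ['cat', 'cot', 'cut']

# ^ and $ anchor the pattern to the start and end of the string
starts = bool(re.search(r"^Order", text))
ends = bool(re.search(r"15$", text))
print(starts, ends)    # True True

# ? makes the preceding element optional
optional = re.findall(r"colou?r", "color colour")
print(optional)        # ['color', 'colour']

# [] defines a character set; | separates alternatives
vowels = re.findall(r"[aeiou]", "regex")
pets = re.findall(r"cat|dog", "cat, dog, bird")
print(vowels, pets)    # ['e', 'e'] ['cat', 'dog']

# \w+ matches runs of word characters; \s+ matches runs of whitespace
words = re.findall(r"\w+", "cat_lover 42!")
parts = re.split(r"\s+", "a  b\tc")
print(words, parts)    # ['cat_lover', '42'] ['a', 'b', 'c']
```

Writing patterns as raw strings (the r"..." prefix) keeps Python from interpreting backslashes before the re module sees them.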
Basic Functions in Python's re Module
re.search(): Searches the string and returns a match object for the first occurrence of the pattern, or None if there is none.
re.match(): Checks whether the pattern matches at the beginning of the string.
re.findall(): Returns a list of all non-overlapping occurrences of the pattern in the string.
re.sub(): Replaces occurrences of the pattern with a specified replacement string.
re.split(): Splits the string at each occurrence of the pattern.
re.compile(): Compiles a regular expression pattern into a pattern object for reuse.
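The difference between re.match() and re.search(), and the role of re.compile(), is easiest to see side by side. The following is a small sketch; the sample text and variable names are illustrative assumptions.

```python
import re

text = "Invoice 2024: total 150 USD"

# re.match() only succeeds if the pattern matches at position 0
match_result = re.match(r"\d+", text)      # None: the text starts with "Invoice"

# re.search() scans the whole string and returns the first match
search_result = re.search(r"\d+", text)    # match object for "2024"

# re.compile() builds a pattern object that offers the same operations as methods
number = re.compile(r"\d+")
all_numbers = number.findall(text)

print(match_result)             # None
print(search_result.group())    # 2024
print(all_numbers)              # ['2024', '150']
```

Compiling is mainly a convenience when the same pattern is used many times, since the pattern object can be named once and passed around.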
Example of Regular Expressions in Python
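A minimal sketch of such an example, covering the four functions discussed in the explanation that follows. The sample sentence and variable names are assumptions chosen for illustration.

```python
import re

# Sample text (an illustrative choice)
text = "The rain in Spain stays mainly in the plain"

# re.search(): find the first occurrence of "rain"
match = re.search(r"rain", text)
if match:
    print("Found:", match.group())    # Found: rain

# re.findall(): all words starting with "S" or "s"
s_words = re.findall(r"\b[Ss]\w+", text)
print(s_words)                        # ['Spain', 'stays']

# re.sub(): replace "rain" with "snow"
replaced = re.sub(r"rain", "snow", text)
print(replaced)                       # The snow in Spain stays mainly in the plain

# re.split(): split the string at each whitespace character
words = re.split(r"\s", text)
print(words)
```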
Explanation of the Example
re.search(): This searches for the first occurrence of the word "rain" in the string. If found, it prints the match.
re.findall(): This finds all words that start with "S" or "s" in the text, where \b represents a word boundary and \w+ matches a sequence of word characters.
re.sub(): This replaces all occurrences of the word "rain" with "snow" in the string.
re.split(): This splits the string at each space, returning a list of words.
Common Use Cases of Regular Expressions
Validating User Input:
- Example: Ensuring an email address has the correct format.
Finding Specific Patterns:
- Example: Extracting all phone numbers from a document.
Replacing Text:
- Example: Replacing all dates in a specific format (MM/DD/YYYY) with a new format.
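The three use cases above could be sketched as follows. The patterns are deliberately simple assumptions: the email check accepts only a basic name@domain.tld shape, the phone pattern assumes the NNN-NNN-NNNN layout, and the target date format (YYYY-MM-DD) is an arbitrary choice. The function name and sample strings are hypothetical.

```python
import re

# Validating user input: a simplified email shape (illustrative, not a full validator)
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def is_valid_email(address):
    # fullmatch() requires the whole string to fit the pattern
    return EMAIL.fullmatch(address) is not None

print(is_valid_email("user@example.com"))   # True
print(is_valid_email("user@example"))       # False

# Finding specific patterns: phone numbers, assuming the NNN-NNN-NNNN layout
# {3} and {4} mean "exactly 3" and "exactly 4" repetitions
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")
document = "Call 555-123-4567 or 555-987-6543 before 5pm."
phones = PHONE.findall(document)
print(phones)                               # ['555-123-4567', '555-987-6543']

# Replacing text: rewrite MM/DD/YYYY as YYYY-MM-DD using captured groups
DATE = re.compile(r"\b(\d{2})/(\d{2})/(\d{4})\b")
converted = DATE.sub(r"\3-\1-\2", "Due 03/15/2024, renewed 12/01/2025.")
print(converted)                            # Due 2024-03-15, renewed 2025-12-01.
```

In the replacement string, \1, \2, and \3 refer to the month, day, and year captured by the parenthesized groups in the pattern.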
Summary
- Regular expressions are used for pattern matching and text manipulation.
- Python's re module provides functions like search(), match(), findall(), sub(), and split() to work with regular expressions.
- Metacharacters such as ., *, +, and [] are used to define search patterns.
- Regular expressions are widely used in tasks like data validation, string searching, text extraction, and text manipulation.