Introduction

Regular expressions are a powerful tool for extracting and manipulating text, and they are available in most programming languages and many applications. In Python, they are provided by the built-in re module, which offers a concise and flexible syntax for pattern matching and string manipulation. This tutorial covers the basic syntax and concepts of regular expressions in Python, with practical examples of the core matching functions. Whether you are new to programming or a seasoned developer, it will give you the foundation you need to work with regular expressions in Python.

Introduction to Python Regular Expressions

  • A regular expression is a pattern that can be used to match text.
  • It can be used to find, replace, or extract specific text in a string.
  • The re module provides functions for working with regular expressions in Python.

Python Regular Expression Functions

  • re.search(): searches for the first occurrence of a pattern anywhere in a string and returns a match object, or None if there is no match.
  • re.match(): matches a pattern at the beginning of a string and returns a match object, or None if there is no match.
  • re.fullmatch(): matches the entire string against a pattern and returns a match object, or None if there is no match.
  • re.sub(): replaces all occurrences of a pattern with a replacement string and returns the new string.
  • re.compile(): compiles a regular expression pattern into a pattern object, which can be reused for many matches.
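
The last two functions in this list, re.sub() and re.compile(), can be illustrated with a minimal sketch. The sample text and the variable names are made up for illustration only.

```python
import re

# Hypothetical sample text used only for this illustration.
text = "Order 66 shipped on 2024-01-15, order 67 on 2024-02-03."

# re.sub() returns a new string with every match replaced.
masked = re.sub(r"\d{4}-\d{2}-\d{2}", "<date>", text)
print(masked)

# re.compile() builds a reusable pattern object that offers the same
# methods (search, match, fullmatch, sub) as the re module functions.
date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")
first_date = date_pattern.search(text)
print(first_date.group())
```

Compiling is mostly a convenience when the same pattern is applied many times; for a one-off match, the module-level functions are usually enough.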

search() Function

  • re.search() searches for the first occurrence of a pattern in a given string.
  • It returns a match object if the pattern is found, or None otherwise.
  • Here's an example of using re.search():
  • Code Sample:

import re

string = "The quick brown fox jumps over the lazy dog."
pattern = "fox"

result = re.search(pattern, string)

# A match object is truthy; None means the pattern was not found.
if result:
    print("Pattern found!")
else:
    print("Pattern not found.")

  • Explanation:
  • In the above example, we define a string, string, and a pattern, pattern.
  • We use re.search() to search for the pattern in the string.
  • If the pattern is found, we print "Pattern found!"; otherwise, we print "Pattern not found."

match() Function

  • re.match() works like re.search(), but only matches a pattern at the beginning of a string.
  • Here's an example of using re.match():
  • Code Sample:

import re

string = "foo bar baz"
pattern = "foo"
result = re.match(pattern, string)

# "foo" is at the very start of the string, so re.match() succeeds.
if result:
    print("Pattern found!")
else:
    print("Pattern not found.")

  • Explanation:
  • In the above example, we define a string, string, and a pattern, pattern.
  • We use re.match() to try to match the pattern at the beginning of the string.
  • If the pattern is found there, we print "Pattern found!"; otherwise, we print "Pattern not found."
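
To make the difference from re.search() concrete, here is a small sketch that looks for a word that is present in the string but not at its start. The variable names are illustrative.

```python
import re

text = "foo bar baz"

# "bar" occurs in the string, but not at the beginning.
found_by_search = re.search("bar", text)
found_by_match = re.match("bar", text)

print(found_by_search is not None)  # True: search() scans the whole string
print(found_by_match is not None)   # False: match() only tries position 0
```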

fullmatch() Function

  • re.fullmatch() succeeds only if the entire string matches the pattern.
  • Here's an example of using re.fullmatch():
  • Code Sample:

import re

string = "123456"
pattern = r"\d+"
result = re.fullmatch(pattern, string)

# The match succeeds only if every character is consumed by the pattern.
if result:
    print("Pattern found!")
else:
    print("Pattern not found.")

  • Explanation:
  • In the above example, we define a string, string, and a pattern, pattern.
  • We use re.fullmatch() to check whether the entire string matches the pattern.
  • Because "123456" consists only of digits, the match succeeds and we print "Pattern found!"; otherwise, we would print "Pattern not found."
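
As a quick contrast with re.match(), the sketch below applies the same digit pattern to a string that starts with digits but continues with letters. The input "123abc" is a made-up example.

```python
import re

pattern = r"\d+"

# match() is satisfied by a matching prefix.
prefix_match = re.match(pattern, "123abc")
# fullmatch() requires the whole string to match, so this is None.
whole_match = re.fullmatch(pattern, "123abc")

print(prefix_match.group())  # 123
print(whole_match)           # None
```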

Match Object

  • The search(), match(), and fullmatch() functions all return a match object if the pattern is found, and None otherwise.
  • The match object contains information about the match, such as the start and end indices of the matched text.
  • Here's an example of using a match object:
  • Code Sample:

import re

string = "The quick brown fox jumps over the lazy dog."
pattern = "brown"
result = re.search(pattern, string)

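if result:
    match_start = result.start()  # index of the first matched character
    match_end = result.end()      # index just past the last matched character
    print(f"Pattern found from index {match_start} to index {match_end}.")
else:
    print("Pattern not found.")

  • Explanation:
  • In the above example, we search for the pattern "brown" in the string.
  • If the pattern is found, we extract the start and end indices of the matched text using the start() and end() methods of the match object.
  • Note that end() returns the index just past the last matched character, so string[match_start:match_end] is exactly the matched text.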
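
Besides start() and end(), a match object also exposes the matched text itself through group() and both indices at once through span(). A minimal sketch, reusing the same sentence:

```python
import re

text = "The quick brown fox jumps over the lazy dog."
result = re.search("brown", text)

if result:
    matched_text = result.group()  # the text that matched
    indices = result.span()        # (start, end) as a tuple
    print(matched_text)
    print(indices)
    # Slicing with the indices gives back the matched text.
    print(text[indices[0]:indices[1]])
```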

Regular Expressions and Raw Strings

  • Regular expressions often contain backslashes, which have a special meaning in Python string literals.
  • To avoid having to escape backslashes, we can use a raw string by prefixing it with the letter "r".
  • Here's an example of using raw strings with regular expressions:
  • Code Sample:

import re

string = "123-45-6789"
pattern = r"\d{3}-\d{2}-\d{4}"
result = re.fullmatch(pattern, string)

if result:
    print("Pattern found!")
else:
    print("Pattern not found.")

  • Explanation:
  • In the above example, we define the pattern as a raw string, so each \d reaches the regular expression engine unchanged and the backslashes do not need to be doubled.
  • The string being tested is an ordinary string of digits and hyphens in the form 123-45-6789.
  • We use re.fullmatch() to check whether the entire string matches the pattern.
  • If it does, we print "Pattern found!"; otherwise, we print "Pattern not found."
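
To see what the r prefix actually changes, the following sketch compares a raw string with its escaped equivalent, and shows one case where forgetting the prefix silently changes the pattern. The sample sentence is made up for illustration.

```python
import re

# Without the r prefix, each backslash has to be doubled.
escaped = "\\d{3}-\\d{2}-\\d{4}"
raw = r"\d{3}-\d{2}-\d{4}"
print(escaped == raw)  # True: both are the same sequence of characters

# In a normal string "\b" is a single backspace character;
# in a raw string it stays as backslash + b, the regex word boundary.
print(len("\b"), len(r"\b"))

boundary_hit = re.search(r"\bfox\b", "the fox ran")
backspace_hit = re.search("\bfox\b", "the fox ran")
print(boundary_hit is not None)   # True: matches the word "fox"
print(backspace_hit is not None)  # False: looks for literal backspaces
```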
