Introduction

Regular expressions (regex) are a powerful tool for matching and manipulating strings in Python. We often need to extract data from strings or perform complex manipulations on text, and this is where capturing groups come into play. A capturing group lets you group part of a pattern and capture the text that part matched, which makes it easy to extract specific pieces of information. In this tutorial, you will learn how to use capturing groups with Python's re module to extract and manipulate data from strings.

Table of Contents :

  • Python Regex Capturing Groups
  • Named Capturing Groups 

Python Regex Capturing Groups :

  • Capturing Groups are a feature of Python's regular expression syntax.
  • They are used to group together parts of a regular expression pattern and capture the matched characters.
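  • Capturing groups are defined using parentheses () around the pattern to be captured.
  • When a pattern matches, the match object exposes each captured group through group(n) and all of them at once through groups(), which returns a tuple. re.findall() returns a list of the captured groups instead of the overall match: a list of strings for a single group, or a list of tuples for several groups.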
  • Capturing groups are numbered in the order that their opening parentheses appear in the regular expression pattern, with group 0 being the overall match.
  • Code Sample :

import re

# Capturing group example
text = "The quick brown fox jumps over the lazy dog."
pattern = r"The quick (.*?) fox"
result = re.findall(pattern, text)
print(result)  


# Output: 
['brown']
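To make the effect of the parentheses visible, the sketch below runs the same pattern with and without a capturing group. The variable names no_group and one_group are illustrative only and do not come from the original sample.

```python
import re

text = "The quick brown fox jumps over the lazy dog."

# Without a capturing group, findall returns the full matched text.
no_group = re.findall(r"The quick .*? fox", text)

# With one capturing group, findall returns only the captured part.
one_group = re.findall(r"The quick (.*?) fox", text)

print(no_group)   # ['The quick brown fox']
print(one_group)  # ['brown']
```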


  • Using multiple capturing groups
  • Code Sample :

import re

# Two capturing groups: group 1 is the full name, group 2 is the email address
text = "John Smith: john@smith.com"
pattern = r"(\w+ \w+):\s(\b\w+@\w+\.\w{2,3}\b)"
result = re.findall(pattern, text)
print(result)  


# Output: 
[('John Smith', 'john@smith.com')]
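Group numbering is easier to see on a match object than in the findall result. The following is a minimal sketch that reuses the pattern above with re.search; the nested date pattern at the end is a made-up example, added only to show that groups are numbered by the position of their opening parenthesis.

```python
import re

text = "John Smith: john@smith.com"
pattern = r"(\w+ \w+):\s(\b\w+@\w+\.\w{2,3}\b)"
match = re.search(pattern, text)

whole = match.group(0)       # group 0 is the overall match
name = match.group(1)        # first opening parenthesis
email = match.group(2)       # second opening parenthesis
all_groups = match.groups()  # tuple of groups 1..n (group 0 is not included)

print(whole)       # John Smith: john@smith.com
print(name)        # John Smith
print(email)       # john@smith.com
print(all_groups)  # ('John Smith', 'john@smith.com')

# Hypothetical nested example: the outer group opens first, so it is group 1.
nested = re.search(r"((\d{4})-(\d{2}))", "2024-05")
print(nested.groups())  # ('2024-05', '2024', '05')
```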


Named Capturing Groups :

  • In addition to numbered capturing groups, Python's regular expression syntax also supports named capturing groups.
  • Named capturing groups are defined using the syntax (?P<name>pattern).
  • Named capturing groups are accessed using the groupdict() method of the match object, which returns a dictionary mapping each group name to the text it captured.
  • Named Capturing Groups Example
  • Code Sample :

import re

# Named capturing group example
text = "John Smith: john@smith.com"
pattern = r"(?P<name>\w+ \w+):\s(?P<email>\b\w+@\w+\.\w{2,3}\b)"
match = re.search(pattern, text)
print(match.groupdict())  


# Output: 
{'name': 'John Smith', 'email': 'john@smith.com'}
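Besides groupdict(), a single named group can be read on its own. The sketch below assumes the group names name and email, as in the output shown above, and uses only standard match object methods. Subscript access such as match["email"] requires Python 3.6 or later.

```python
import re

text = "John Smith: john@smith.com"
pattern = r"(?P<name>\w+ \w+):\s(?P<email>\b\w+@\w+\.\w{2,3}\b)"
match = re.search(pattern, text)

name = match.group("name")  # look up a group by its name
email = match["email"]      # subscript form, Python 3.6+
first = match.group(1)      # named groups still have a number

print(name)   # John Smith
print(email)  # john@smith.com
print(first)  # John Smith
```

Named groups keep their positional numbers, so existing code that uses group(1) keeps working after names are added to a pattern.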



  • In the syntax (?P<name>pattern), "name" is a valid Python identifier and "pattern" is the regular expression pattern to be captured.
