Python Split Regex [With Examples]

In this tutorial, I will explain how to split strings in Python using regular expressions, with detailed examples and methods.

To split a string in Python using regex, you can utilize the re.split() function from the re module. This function allows you to divide a string into a list based on a specified regex pattern. For instance, to split a string by spaces or commas, you can use re.split(r'[,\s]+', '123 Main St, Chicago, IL 60601'), which will output ['123', 'Main', 'St', 'Chicago', 'IL', '60601'].

Recently, I faced a scenario where I needed to clean and preprocess a large dataset containing addresses in the USA. The addresses were in a single string format, and I had to split them into street, city, state, and zip code. This tutorial will help you understand how to use Python’s re.split() function to handle such real-world tasks efficiently.

Python’s re module provides powerful tools for working with regular expressions. One of the most useful functions in this module is re.split(), which allows you to split a string based on a pattern.

Basic Usage of re.split()

The re.split() function divides a string into a list using a specified pattern. Its signature is re.split(pattern, string, maxsplit=0, flags=0). Here’s a basic example:

import re

pattern = r'\s+'  # Split by one or more whitespace characters
text = "123 Main St Chicago IL 60601"
result = re.split(pattern, text)
print(result)

This will output:

['123', 'Main', 'St', 'Chicago', 'IL', '60601']
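The optional maxsplit argument limits how many splits are performed. The short sketch below reuses the sample address from above; the variable names (first_split, house_number, rest) are only illustrative.

```python
import re

text = "123 Main St Chicago IL 60601"

# maxsplit=1 stops after the first split; the remainder stays in one piece.
first_split = re.split(r'\s+', text, maxsplit=1)
print(first_split)  # ['123', 'Main St Chicago IL 60601']

# Handy for peeling off a leading field and keeping the rest intact.
house_number, rest = first_split
print(house_number)  # 123
print(rest)          # Main St Chicago IL 60601
```

With the default maxsplit=0, re.split() performs every possible split, which is what the basic example does.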

Python Split Regex Examples

Now, let me show you a few examples of the re.split() function in Python.

Splitting by Multiple Delimiters

Sometimes, you might need to split a string by multiple delimiters in Python. For example, consider splitting a string by both commas and spaces. You can achieve this using a regex pattern that includes both delimiters.

Let me show you an example.

import re
pattern = r'[,\s]+'  # Split by runs of commas and/or whitespace
text = "123 Main St, Chicago, IL 60601"
result = re.split(pattern, text)
print(result)

This will output:

['123', 'Main', 'St', 'Chicago', 'IL', '60601']
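One thing to watch for: if the string starts or ends with a delimiter, re.split() returns empty strings at those positions. The sketch below uses a hypothetical messy input to show this, along with one common way to drop the empty items.

```python
import re

pattern = r'[,\s]+'
text = ", 123 Main St, Chicago, "  # hypothetical input with stray delimiters

raw = re.split(pattern, text)
print(raw)  # ['', '123', 'Main', 'St', 'Chicago', '']

# Keep only the non-empty pieces.
cleaned = [part for part in raw if part]
print(cleaned)  # ['123', 'Main', 'St', 'Chicago']
```

Whether to filter depends on the data: in some formats an empty field is meaningful and should be kept.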

Keeping the Delimiters

Sometimes, you might want to keep the delimiters in the resulting list. You can use capturing groups in your regex pattern to include the delimiters in the output.

Here is an example.

import re
pattern = r'(\s+)'  # Capture the whitespace characters
text = "123 Main St Chicago IL 60601"
result = re.split(pattern, text)
print(result)

This will output:

['123', ' ', 'Main', ' ', 'St', ' ', 'Chicago', ' ', 'IL', ' ', '60601']
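The same idea works with more than one kind of delimiter. As a small sketch, the pattern below captures either a comma with optional spaces or a run of whitespace, and a non-capturing group (?:...) is shown for comparison. The variable names are illustrative.

```python
import re

text = "123 Main St, Chicago, IL 60601"

# Capturing group: each matched delimiter is kept as its own list item.
with_delims = re.split(r'(,\s*|\s+)', text)
print(with_delims)
# ['123', ' ', 'Main', ' ', 'St', ', ', 'Chicago', ', ', 'IL', ' ', '60601']

# Because nothing was discarded, joining the pieces rebuilds the original string.
rebuilt = ''.join(with_delims)
print(rebuilt == text)  # True

# Non-capturing group: same grouping, but the delimiters are dropped.
without_delims = re.split(r'(?:,\s*|\s+)', text)
print(without_delims)  # ['123', 'Main', 'St', 'Chicago', 'IL', '60601']
```

Use the non-capturing form when you need parentheses for alternation but do not want the delimiters in the result.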

Splitting Addresses

Let me show you a real-world scenario of splitting addresses. Suppose you have a list of addresses, and you want to split each address into its components.

Here is an example.

import re

addresses = [
    "123 Main St, Chicago, IL 60601",
    "456 Elm St, Los Angeles, CA 90001",
    "789 Pine St, New York, NY 10001"
]

pattern = r',\s*|\s+'  # Split by commas followed by optional spaces or by spaces

for address in addresses:
    components = re.split(pattern, address)
    print(components)

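This will output:

['123', 'Main', 'St', 'Chicago', 'IL', '60601']
['456', 'Elm', 'St', 'Los', 'Angeles', 'CA', '90001']
['789', 'Pine', 'St', 'New', 'York', 'NY', '10001']

Because the pattern also splits on whitespace, multi-word city names such as "Los Angeles" and "New York" are split into separate items.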
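If the goal is one field each for street, city, state, and zip code, one option is to split on commas first and then split the last piece on whitespace. This is a minimal sketch that assumes every address follows the "street, city, state zip" layout of the samples; parse_address is a hypothetical helper, not part of the re module.

```python
import re

addresses = [
    "123 Main St, Chicago, IL 60601",
    "456 Elm St, Los Angeles, CA 90001",
    "789 Pine St, New York, NY 10001",
]

def parse_address(address):
    """Hypothetical helper: assumes the layout 'street, city, state zip'."""
    # Split on commas only, so spaces inside the street and city are preserved.
    street, city, state_zip = re.split(r',\s*', address)
    # The last piece holds the state and the zip code, separated by whitespace.
    state, zip_code = re.split(r'\s+', state_zip.strip())
    return {"street": street, "city": city, "state": state, "zip": zip_code}

parsed = [parse_address(address) for address in addresses]
for row in parsed:
    print(row)
```

An address with a different number of commas would raise a ValueError during unpacking, so real data would likely need validation before this step.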

Handling Complex Patterns

You can use more advanced regex features, such as lookarounds, to handle more complex inputs. For example, if the addresses include apartment numbers, you can adjust your pattern accordingly.

import re
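# Split by commas or whitespace. The lookaround branch (whitespace between two digits)
# never fires here, because the earlier \s+ alternative already matches any whitespace.
pattern = r',\s*|\s+|(?<=\d)\s+(?=\d)'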

text = "123 Main St Apt 4B, Chicago, IL 60601"
result = re.split(pattern, text)
print(result)

This will output:

['123', 'Main', 'St', 'Apt', '4B', 'Chicago', 'IL', '60601']
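In the pattern above, \s+ already matches every run of whitespace, so a lookaround only changes the result when it restricts a match. As a sketch, the pattern below uses a negative lookbehind so that the space right after "Apt" is not treated as a delimiter, which keeps the unit together. It assumes "Apt" is followed by exactly one space and is spelled exactly that way.

```python
import re

# Split on commas (with optional spaces), or on whitespace that is
# NOT immediately preceded by the literal text "Apt".
pattern = r',\s*|(?<!Apt)\s+'

text = "123 Main St Apt 4B, Chicago, IL 60601"
result = re.split(pattern, text)
print(result)
# ['123', 'Main', 'St', 'Apt 4B', 'Chicago', 'IL', '60601']
```

Python's lookbehind must have a fixed width, so covering variants such as "Unit" or "Suite" would require a separate lookbehind for each one.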

Conclusion

In this tutorial, we explored various techniques for splitting strings using Python’s re.split() function. I hope you now understand how to use regex to split strings in Python. Whether you are processing addresses, cleaning data, or parsing text, these techniques will help you manage your data more effectively.
