0

I have a string like this: "32H74312". I want to extract some parts and put them in different variables.

first_part = 32 # always 2 digits
second_part = H # always 1 character
third_part = 743 # always 3 digits
fourth_part = 12 # always 2 digits

Is there a Pythonic way to do this?

6 Answers

2

There's no reason to use a regex for such a simple task. The Pythonic way could be something like:

string = "32H74312"
part1 = string[:2]
part2 = string[2:3]
part3 = string[3:6]
part4 = string[6:]
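
If the layout might change later, the same slicing can be driven by a list of field widths instead of hard-coded indexes. This is only a minimal sketch: `split_fixed` is a hypothetical helper name, not something from the standard library or from the answer above.

```python
from itertools import accumulate

def split_fixed(text, widths):
    """Split text into consecutive fields of the given widths (hypothetical helper)."""
    if len(text) != sum(widths):
        raise ValueError(f"expected {sum(widths)} characters, got {len(text)}")
    ends = list(accumulate(widths))   # running totals: end index of each field
    starts = [0] + ends[:-1]          # each field starts where the previous one ended
    return [text[a:b] for a, b in zip(starts, ends)]

part1, part2, part3, part4 = split_fixed("32H74312", (2, 1, 3, 2))
print(part1, part2, part3, part4)  # 32 H 743 12
```

The length check is an assumption about what you want; drop it if shorter or longer strings should be accepted.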

1

If the string always has the same length, you can do this:

string = "32H74312"
first_part = string[:2]    # always 2 digits
second_part = string[2:-5] # always 1 character
third_part = string[3:-2]  # always 3 digits
fourth_part = string[6:]   # always 2 digits

1

Since you have a fixed number of characters to capture, you can use this pattern:

(\d\d)(\w)(\d{3})(\d\d)

You can then use re.match:

import re

pattern = r"(\d\d)(\w)(\d{3})(\d\d)"
string = "32H74312"

first_part, second_part, third_part, fourth_part = re.match(pattern, string).groups()

print(first_part, second_part, third_part, fourth_part)

Which outputs:

32 H 743 12

That said, unless you want an easy way to enforce that each part consists of digits or word characters, this isn't really something you need a regex for.
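
One caveat if you do go the regex route for validation: re.match returns None when the string does not match, so calling .groups() on it directly raises AttributeError for bad input. It also only anchors at the start, so trailing characters are silently accepted. A hedged sketch of a stricter variant follows. The function name `parse_code` is made up for illustration, and `[A-Za-z]` is my assumption about what "1 char" means, since `\w` also matches digits and underscores.

```python
import re

# Compiled once; fullmatch() requires the whole string to fit the pattern.
PATTERN = re.compile(r"(\d{2})([A-Za-z])(\d{3})(\d{2})")

def parse_code(text):
    """Return the four parts as a tuple, or raise ValueError on malformed input."""
    match = PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unexpected format: {text!r}")
    return match.groups()

print(parse_code("32H74312"))  # ('32', 'H', '743', '12')
```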

1

This is also quite Pythonic:

string = "32H74312"
parts = {0: 2, 2: 3, 3: 6, 6: 8}  # start index -> end index
string_parts = [string[p:parts[p]] for p in parts]

1 Comment

Another way to go: parts = [string[a:b] for a, b in {0: 2, 2: 3, 3: 6, 6: 8}.items()]

1

Expanding on Pedro's excellent answer, string slicing syntax is the best way to go.

However, having variables like first_part, second_part, ..., nth_part is typically considered an anti-pattern; you are probably looking for a tuple instead:

string = "32H74312"  # named string rather than str, which would shadow the built-in type
parts = (string[:2], string[2], string[3:6], string[6:])

print(parts)
print(parts[0], parts[1], parts[2], parts[3])
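
If positional indexes like parts[2] feel too anonymous, a named tuple keeps everything in one value while giving each field a name. This is just a sketch: the field names below are placeholders I made up, since the question does not say what the parts mean.

```python
from typing import NamedTuple

class Parts(NamedTuple):
    # Placeholder field names; rename them to whatever the parts represent.
    first: str
    second: str
    third: str
    fourth: str

string = "32H74312"
parts = Parts(string[:2], string[2], string[3:6], string[6:])

print(parts.third)  # 743
print(parts[0])     # 32 (still indexable like a plain tuple)
```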

1

You can use this method:

import re

line = '32H74312'

d2p = r'(\d\d)' # two digits pattern
ocp = r'(\w)' # one char pattern
d3p = r'(\d{3})' # three digits pattern

lst = re.match(d2p + ocp + d3p + d2p, line).groups()
for item in lst:
    print(item)

The parentheses are necessary to create the capture groups that .groups() returns. Also, to make testing your regexes easier, you can use an online tool such as regex101.
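
The groups can also be given names with the (?P<name>...) syntax, so the results are looked up by name rather than by position. A small sketch, with group names invented purely for illustration:

```python
import re

line = '32H74312'

# Same pattern as above, but each group carries a (made-up) name.
pattern = re.compile(
    r'(?P<first>\d{2})(?P<second>\w)(?P<third>\d{3})(?P<fourth>\d{2})'
)

match = pattern.match(line)
if match:  # match is None when the line does not fit the pattern
    fields = match.groupdict()
    print(fields['third'])  # 743
```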
