I'm trying to read a binary file.

My objective is to find all matches of "10, 10, [any hex value exactly one time], [either EE or DD]".

Thought I could do it like this:

pattern = (b"\x10\x10\[0-9a-fA-F]?\[xDD|xEE]")

Clearly it's not working. It seems to fail at the third part. I tried dissecting the pattern: the x10 parts work, but the rest just won't.

My understanding of "[0-9a-fA-F]?" is that it matches the range in the brackets 0 or 1 times, and that the third part matches "xDD or xEE". Am I wrong?

Any ideas?

2 Comments
  • Could you please add two examples of input? Commented Jan 3, 2017 at 13:59
  • Seen this? Commented Jan 3, 2017 at 14:04

1 Answer

Use the regex

b'\x10\x10.[\xdd\xee]'

A single . matches any one byte exactly once (except the newline byte \x0a, unless the re.DOTALL flag is passed), and [ab] matches either a or b exactly once.


>>> import re
>>> re.match(b'\x10\x10.[\xdd\xee]', b'\x10\x10\x00\xee')
<_sre.SRE_Match object; span=(0, 4), match=b'\x10\x10\x00\xee'>
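
Note that re.match only tests the start of the input. To collect every occurrence in a file, something along these lines may work. This is a sketch, not a definitive implementation: the `data` bytes below are made up and stand in for the contents of a file opened in "rb" mode, and re.DOTALL is passed so that . also matches the newline byte \x0a.

```python
import re

# Hypothetical sample data; in practice this would come from
# open(path, "rb").read().
data = b"\x00\x10\x10\x0a\xee\xff\x10\x10\x41\xdd\x10\x10\x42\x01"

# \x10 \x10, then any single byte, then either 0xDD or 0xEE.
# re.DOTALL lets "." match the newline byte (0x0A) as well.
pattern = re.compile(b"\x10\x10.[\xdd\xee]", re.DOTALL)

# finditer scans the whole buffer and yields non-overlapping matches.
matches = [(m.start(), m.group()) for m in pattern.finditer(data)]
for offset, chunk in matches:
    print(offset, chunk.hex())
```

Without re.DOTALL, the first occurrence in this sample would be skipped, because its third byte happens to be \x0a.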
4 Comments

Why isn't the "." written like this: "\x10\x10\."? Why is there no delimiter between the x10 and the dot?
Because an unescaped . is a wildcard. \. matches only a literal dot character, since the backslash escapes it (and you want the wildcard).
Many thanks! So in my last statement everything was correct except for the |, which should have been turned into a \, hence separating the two hex values inside the brackets?
Inside brackets, | is a literal character: with it, the last byte would match \xee, \xdd or | (the pipe symbol). A character class [abc] already means a|b|c.
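
Both points from these comments can be checked directly. The snippet below is only a small sketch with made-up input bytes. It uses raw bytes literals (rb"...") so that the backslashes reach the regex engine unchanged, which is a choice made here and not something the answer above uses.

```python
import re

# "." is a wildcard for one byte; "\." matches only a literal dot (0x2E).
wildcard = re.compile(rb"\x10\x10.[\xdd\xee]")
literal_dot = re.compile(rb"\x10\x10\.[\xdd\xee]")

sample = b"\x10\x10\x00\xee"
wildcard_hit = wildcard.match(sample) is not None        # third byte can be anything
literal_dot_hit = literal_dot.match(sample) is not None  # third byte is not "."

# Inside [...], "|" is an ordinary character, not alternation,
# so this class accepts 0xDD, 0xEE, or the pipe byte itself.
with_pipe = re.compile(rb"\x10\x10.[\xdd|\xee]")
pipe_hit = with_pipe.match(b"\x10\x10\x00|") is not None

print(wildcard_hit, literal_dot_hit, pipe_hit)
```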
