3

I need to parse output like this:

commit e397a6e988c05d6fd87ae904303ec0e17f4d79a2
Author: Name <[email protected]>
Date:   Sat Jul 9 21:29:10 2011 +0400

    commit message

 1 files changed, 21 insertions(+), 11 deletions(-)

and get the author name and the number of insertions and deletions.

For the name I have this:

re.findall(r"Author: (.+) <",gitLog)

For the numbers I have this:

re.findall(r" (\d+) insertions\S+, (\d+) deletions",gitLog)

But I want to get a list of tuples of (name, insertions, deletions) with a single regular expression.

I tried something like

re.findall(r"Author: (.+) <.+ (\d+) insertions\S+, (\d+) deletions",gitLog,re.DOTALL)

but it returns nothing.

So what is my mistake? What should the regular expression look like?

UPDATE: wRAR is right, but when I read a file and try to parse it, I get the whole file as the name, followed by the last insertion and deletion counts. So it matches the whole file instead of a single commit: the .+ consumes nearly the whole file rather than just part of one commit.

4 Answers

4

If you have access to the repo and not just a text dump of git log, save yourself the parsing trouble and generate different log output:

git log --pretty="%an" --numstat

This produces output of the form:

Author Name

lines_inserted lines_deleted modified_file

You don't even need a regex for that. If you want to stick with regex, the pattern has to account for the (+) after insertions; otherwise it will not match at all and will not capture the numbers.
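As a rough sketch of the regex-free route, assuming the output of the command above has already been captured into a string: numstat lines are tab-separated (added, deleted, path), and binary files are reported with - in place of the counts. The function name parse_numstat_log is made up for illustration, and the code assumes author names contain no tabs.

```python
def parse_numstat_log(text):
    """Return one (author, insertions, deletions) tuple per commit."""
    commits = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank separator lines carry no information
        parts = line.split("\t")
        if len(parts) == 3 and commits:
            # numstat line: added<TAB>deleted<TAB>path
            added, deleted, _path = parts
            # binary files show "-" instead of numbers, so skip those counts
            if added.isdigit() and deleted.isdigit():
                commits[-1][1] += int(added)
                commits[-1][2] += int(deleted)
        else:
            # anything else is treated as the author line of a new commit
            commits.append([line.strip(), 0, 0])
    return [tuple(c) for c in commits]
```

The per-file counts are summed, so each commit ends up with the same totals that the "files changed" summary line would show.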


1 Comment

Then you would have to run the log command once for each type of information you need.
3

You should use an existing package such as GitPython (directly or by borrowing its code). As for your regex question: the regex you provided, run on the text you provided, returns [('Name', '21', '11')], so I suppose it is right.


1

There is a module that I have used for parsing Git logs with Python. It looks reasonably active:

https://github.com/gaborantal/git-log-parser


0

So the answer to my question is:

re.findall(r"Author: (\S+) <.+\n.+\n\n.+\n\n.+ (\d+) insertions\S+, (\d+) deletions",gitLog)

But thanks for your answers anyway.
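For anyone hitting the same "whole file as a name" problem, here is a hedged alternative sketch. It is not a single regular expression: it first splits the log into one chunk per commit and then runs small patterns on each chunk, so a greedy .+ can never run into the next commit. It also tolerates author names with spaces and a summary line that lists only insertions or only deletions. The function name parse_git_log and the pattern names are made up for illustration.

```python
import re

# Author name: everything between "Author: " and the " <" before the e-mail
AUTHOR_RE = re.compile(r"^Author: (.+) <", re.MULTILINE)
# Counts from the summary line; singular and plural forms both accepted
INSERT_RE = re.compile(r"(\d+) insertions?\(\+\)")
DELETE_RE = re.compile(r"(\d+) deletions?\(-\)")


def parse_git_log(git_log):
    """Return a list of (name, insertions, deletions), one per commit."""
    results = []
    # Commit headers start at column 0; message lines are indented,
    # so splitting on "^commit <hash>" separates the commits.
    chunks = re.split(r"^commit [0-9a-f]+.*$", git_log, flags=re.MULTILINE)
    for chunk in chunks:
        author = AUTHOR_RE.search(chunk)
        if author is None:
            continue  # text before the first commit header
        inserted = INSERT_RE.search(chunk)
        deleted = DELETE_RE.search(chunk)
        results.append((
            author.group(1),
            int(inserted.group(1)) if inserted else 0,
            int(deleted.group(1)) if deleted else 0,
        ))
    return results
```

One caveat: a commit message that itself contains text like "3 insertions(+)" would be picked up first, so this is only a sketch for ordinary logs.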
