What is the best way to split a column of delimited strings in a dataframe into multiple rows, one per substring, while repeating the other column's value on each new row?

input:

genres                   revenue
action|comedy|drama       5000
action|romance            10000

output:

genres      revenue
action      5000
comedy      5000
drama       5000
action      10000
romance     10000

2 Answers

Split the column with Series.str.split, assign the resulting lists back with DataFrame.assign, then call DataFrame.explode. To get a default index at the end, add DataFrame.reset_index with drop=True:

df1 = (df.assign(genres=df['genres'].str.split('|'))
         .explode('genres')
         .reset_index(drop=True))
print(df1)
    genres  revenue
0   action     5000
1   comedy     5000
2    drama     5000
3   action    10000
4  romance    10000
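
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the input frame looks like the one in the question; the construction of `df` is my own reconstruction, not something the asker posted.

```python
import pandas as pd

# Assumed reconstruction of the question's input data.
df = pd.DataFrame({
    "genres": ["action|comedy|drama", "action|romance"],
    "revenue": [5000, 10000],
})

df1 = (
    df.assign(genres=df["genres"].str.split("|"))  # each cell becomes a list of genres
      .explode("genres")                           # one row per list element
      .reset_index(drop=True)                      # replace the repeated index with 0..n-1
)
print(df1)
```

Using assign returns a new frame, so the original `df` still holds the unsplit strings.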

1 Comment

this solution is very helpful and I strongly recommend it!

You can use Series.str.split with df.explode:

Note: df.explode requires pandas version >= 0.25

In [2240]: df.genres = df.genres.str.split('|')

In [2242]: df = df.explode('genres')

In [2243]: df
Out[2243]: 
    genres  revenue
0   action     5000
0   comedy     5000
0    drama     5000
1   action    10000
1  romance    10000
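
Note that the index values repeat in the output above, because explode keeps the original row label for every piece. If a fresh 0..n-1 index is wanted, one option is the ignore_index parameter of explode, which as far as I know was added in pandas 1.1. A short sketch, again with a reconstructed input frame as an assumption:

```python
import pandas as pd

# Assumed reconstruction of the question's input data.
df = pd.DataFrame({
    "genres": ["action|comedy|drama", "action|romance"],
    "revenue": [5000, 10000],
})

df["genres"] = df["genres"].str.split("|")

# ignore_index=True relabels the rows 0..n-1 (assumes pandas >= 1.1).
out = df.explode("genres", ignore_index=True)
print(out)
```

On older versions, chaining .reset_index(drop=True) after explode gives the same result.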
