18

How can I replace the data 'Beer','Alcohol','Beverage','Drink' with only 'Drink'.

df.replace(['Beer','Alcohol','Beverage','Drink'],'Drink')

doesn't work

2
  • 3
    If your question was answered, please reward the efforts of those who helped you. Accept the most helpful answer. You can also upvote all other useful answers. Please make it a habit, it helps the community. Commented Oct 26, 2017 at 16:58
  • This code does work in pandas now, see example 3 on pandas.pydata.org/pandas-docs/stable/reference/api/… Commented Apr 23, 2020 at 8:58

6 Answers 6

27

You almost had it. You need to pass a dictionary to df.replace.

df

       Col1
0      Beer
1   Alcohol
2  Beverage
3     Drink

df.replace(dict.fromkeys(['Beer','Alcohol','Beverage','Drink'], 'Drink'))

    Col1
0  Drink
1  Drink
2  Drink
3  Drink

This works for exact matches and replacements. For partial matches and substring matching, use

df.replace(
    dict.fromkeys(['Beer','Alcohol','Beverage','Drink'], 'Drink'), 
    regex=True
)

This is not an in-place operation so don't forget to assign the result back.

Sign up to request clarification or add additional context in comments.

3 Comments

Ha! I was going to df.replace({k: 'Drink' for k in ['Beer','Alcohol','Beverage','Drink']}) But instead, I learned about dict.fromkeys
Won't passing inplace=True to df.replace make it an in-place operations, if desired?
@2Toad it would but I don't recommend it: stackoverflow.com/a/60020384/4909087
6

Try the following approach:

lst = ['Beer','Alcohol','Beverage','Drink']
pat = r"\b(?:{})\b".format('|'.join(lst))

df = df.replace(pat, 'Drink', regexp=True)

1 Comment

Nice one , just do not know why yours and mine are different but can get the same result ...Sorry I am not quit work with strings...can you refer some link for future studying
5

Looks like different from MaxU's solution :)

df.replace({'|'.join(['Beer','Alcohol','Beverage','Drink']):'Drink'},regex=True)

1 Comment

which python version is it ?
3

It seems that your initial method of doing it works in the the latest iteration of Python.

df.replace(['Beer','Alcohol','Beverage','Drink'],'Drink', inplace=True)

Should work

Comments

2

Slight change in earlier answers: Following code Replacing values of specific column/Columns

df[['Col1']] = df[['Col1']].replace(dict.fromkeys(['Beer','Alcohol','Beverage','Drink'], 'Drink'))

Comments

0

Not relevant to your exact case using strings, but relevant to the question as stated in the title:

If you have a range of numbers you would like to replace you can provide range() as a dict key.

# Dymmy dataframe with two columns
df = pd.DataFrame(dict(a=[1, 2, 3, 4, 5], b=[66, 44, 33, 22, 77]))

# Mapping that will replace 1 with 'x', 2, 3, 4 with 'y' and 5 with 'z' in column 'a'
mapping = {"a": {1: "x", range(2, 5): "y", 5: "z"}}
df.replace(mapping)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.