I have a pandas DataFrame and want to reverse the binary encoding (i.e. get_dummies()) of three columns. The encoding is left-to-right, with a as the least significant bit, so this input:

    a   b   c
0   0   1   1
1   0   0   1
2   1   1   1
3   1   0   0

would result in a new category column C taking values 0-7:

    C
0   6
1   4
2   7
3   1

I am not sure why this line is giving me a syntax error, near axis=1:

df['C'] = df.apply(lambda x: (x['a']==1 ? 1:0)+(x['b']==1 ? 2:0)+(x['c']==1 ? 4:0), axis=1)

2 Answers

Use NumPy if performance is important: first convert the DataFrame to a NumPy array, then take the dot product with powers of two built with a bitwise shift:

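import numpy as np

# assumes df holds only the 0/1 columns a, b, c
a = df.values
# pandas 0.24+
# a = df.to_numpy()
# 1 << np.arange(n) gives the bit weights [1, 2, 4, ...]
df['C'] = a.dot(1 << np.arange(a.shape[-1]))
print(df)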
   a  b  c  C
0  0  1  1  6
1  0  0  1  4
2  1  1  1  7
3  1  0  0  1
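
To make the weighting step easier to follow, here is a minimal self-contained sketch of the same idea on the sample data from the question. The names bit_columns and weights are hypothetical, introduced only for illustration. Selecting the columns explicitly is an assumption on my part; it keeps the result stable if the frame has other columns or the line is run twice.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"a": [0, 0, 1, 1], "b": [1, 0, 1, 0], "c": [1, 1, 1, 0]})

# Hypothetical name: the columns to decode, least significant bit first.
bit_columns = ["a", "b", "c"]

# Shifting 1 left by 0, 1, 2 gives the weights 1, 2, 4.
weights = 1 << np.arange(len(bit_columns))

# Each row's code is the dot product of its bits with the weights.
df["C"] = df[bit_columns].to_numpy().dot(weights)
print(df)
```

Because a is weighted by 1 and c by 4, the first row (0, 1, 1) decodes to 2 + 4 = 6, matching the output above.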

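Your approach is right; only the syntax needs fixing. Python has no C-style cond ? x : y operator, which is what causes the syntax error. Its conditional expression is written x if cond else y.

I have modified your code: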

>>> df['C'] = df.apply(lambda x: (1 if x['a']==1 else 0)+(2 if x['b']==1 else 0)+(4 if x['c']==1 else 0), axis=1)
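
As a runnable check of the corrected line, the sketch below rebuilds the sample frame from the question and applies it. The second assignment is an alternative I am adding, not part of the original answer: assuming the columns hold only 0 and 1, plain column arithmetic gives the same values without a row-wise apply. The column name C_vectorized is hypothetical.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"a": [0, 0, 1, 1], "b": [1, 0, 1, 0], "c": [1, 1, 1, 0]})

# Row-wise version using Python's conditional expression (x if cond else y).
df["C"] = df.apply(
    lambda x: (1 if x["a"] == 1 else 0)
    + (2 if x["b"] == 1 else 0)
    + (4 if x["c"] == 1 else 0),
    axis=1,
)

# Assumption: columns contain only 0 and 1, so the bits can be weighted directly.
df["C_vectorized"] = df["a"] + 2 * df["b"] + 4 * df["c"]
print(df)
```

The apply version calls the lambda once per row, so the arithmetic form is likely to be faster on large frames.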
