I've been stuck on a problem for a while now and cannot find a solution.

I have a pandas DataFrame that is already filled with values, say of shape (4, 3):

df=
  A    B    C
0 valX valX valX
1 valY valY valY
2 valZ valZ valZ
3 valW valW valW

I want to append ten additional columns, where every cell holds a NumPy array of 38 zeros.

My current approach works only if I first cast the array to a string and then add it to the original df.

However, pandas doesn't accept a plain NumPy array when I assign it directly. I need the value in each cell to be a NumPy array, as I will later do some sklearn computations on them.

Later in my code, I substitute certain columns with a one-hot encoding of certain characters. The remaining columns act as a zero-padding.

Example of my code (which works for adding 10 columns):

import numpy as np

# create an array of 38 zeros
x = np.zeros(38)
for i in range(10):
    col_name = "char_" + str(i)
    # str(x) stores the array's text representation, not the array itself
    df[col_name] = str(x)

The problem here is that I need to cast x to a string. If I keep it as a numpy array, it throws me this error:

ValueError: Length of values does not match length of index
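
A minimal reproduction of that error, assuming a stand-in 4x3 frame with placeholder strings in place of my real data (the `error` variable is only there to capture the exception for inspection):

```python
import numpy as np
import pandas as pd

# Stand-in 4x3 frame; the placeholder strings are not the real data.
df = pd.DataFrame({c: ["valX", "valY", "valZ", "valW"] for c in "ABC"})

x = np.zeros(38)
error = None
try:
    # A 1-D array is read as one value per row: 38 values for 4 rows.
    df["char_0"] = x
except ValueError as exc:
    error = exc
    print(exc)
```

The assignment fails because the 38 elements are matched against the 4 rows, rather than the whole array being placed in each cell.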
  • Do you need df[col_name] = x.astype(str) ? Commented Nov 3, 2017 at 11:16
  • Hi jezrael, thank you for your answer. The example above works; the only problem is that it adds strings to my df instead of arrays. Commented Nov 3, 2017 at 11:47

1 Answer

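Wrap the array in a list and assign it as a Series aligned to df.index:

import numpy as np
import pandas as pd

x = np.zeros(38)
for i in range(10):
    col_name = "char_" + str(i)
    # [x] makes the whole array a single element instead of 38 row values
    df[col_name] = pd.Series([x], index=df.index)

print(type(df.loc[0, 'char_9']))
<class 'numpy.ndarray'>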
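
Broadcasting a one-element list across a longer index may not be accepted by every pandas version, and where it is accepted the cells may end up sharing the same array object. A variant that builds one array per row avoids both concerns. This is only a sketch: the stand-in frame and the names `n_cols`, `arr_len`, and `cells` are illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

# Stand-in for the 4x3 frame from the question (placeholder values).
df = pd.DataFrame({c: ["valX", "valY", "valZ", "valW"] for c in "ABC"})

n_cols = 10   # number of padding columns to add (assumed from the question)
arr_len = 38  # length of each zero array (assumed from the question)

for i in range(n_cols):
    col_name = "char_" + str(i)
    # One separate array per row, so the list length matches len(df)
    # and editing one cell in place does not touch the others.
    cells = [np.zeros(arr_len) for _ in range(len(df))]
    df[col_name] = pd.Series(cells, index=df.index)

print(df.shape)
print(type(df.loc[0, "char_9"]))
```

Each new column has dtype `object`, with an independent `numpy.ndarray` in every cell, so a column can later be replaced by a one-hot encoding without affecting the zero-padding in the others.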

2 Comments

Jezrael, can you check this stackoverflow.com/questions/47095122/…
This is exactly what I needed. Thank you so much!
