I would like to slice a numpy structured array. I have an array

>>b
>>array([([11.0, 21.0, 31.0, 0.01], [1.0, 2.0, 3.0, 0.0]),
       ([41.0, 51.0, 61.0, 0.11], [4.0, 5.0, 6.0, 0.1]),
       ([71.0, 81.0, 91.0, 0.21], [7.0, 8.0, 9.0, 0.2])], 
       dtype=[('fd', '<f8', (4,)), ('av', '<f8', (4,))])

I want to access elements of it to create a new array, using something like

>>b[:][:,0]

to get an array similar to the following (all rows, element [0] of every field). Please don't mind the parentheses, brackets, and dimensions below, as this is not real output.

>>array([([11.0],[1.0]),
  ([41.0],[4.0]),
  ([71.0],[7.0])],
  dtype=[('fd', '<f8', (1,)), ('av', '<f8', (1,))])

but I get this error.

>>b[:][:,0]
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  IndexError: too many indices for array

I would like to do this without looping over the field names in the dtype. Thank you very much for the help.

1 Answer

You access the fields of a structured array by field name. There isn't a way around this, unless the dtypes let you view the array in a different way.
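To make the point concrete, here is a minimal sketch that rebuilds b from the question and shows why b[:][:,0] fails: b itself is 1-d with shape (3,), and the second dimension only appears once a field is selected by name. The variable names `raised` and `first` are illustrative only.

```python
import numpy as np

# Rebuild the array from the question.
b = np.array(
    [([11.0, 21.0, 31.0, 0.01], [1.0, 2.0, 3.0, 0.0]),
     ([41.0, 51.0, 61.0, 0.11], [4.0, 5.0, 6.0, 0.1]),
     ([71.0, 81.0, 91.0, 0.21], [7.0, 8.0, 9.0, 0.2])],
    dtype=[('fd', '<f8', (4,)), ('av', '<f8', (4,))])

print(b.shape)        # (3,)   -> one axis only, so a second index is invalid
print(b['fd'].shape)  # (3, 4) -> the subarray axis shows up per field

# Indexing the structured array itself with two indices raises IndexError.
raised = False
try:
    b[:, 0]
except IndexError as exc:
    raised = True
    print("IndexError:", exc)

# Selecting the field first makes 2-d indexing legal.
first = b['fd'][:, 0]
print(first)
```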

Let's call your desired output c.

In [1061]: b['fd']
Out[1061]: 
array([[  1.10000000e+01,   2.10000000e+01,   3.10000000e+01,
          1.00000000e-02],
       [  4.10000000e+01,   5.10000000e+01,   6.10000000e+01,
          1.10000000e-01],
       [  7.10000000e+01,   8.10000000e+01,   9.10000000e+01,
          2.10000000e-01]])

What I think you are trying to do is collect these values for both fields:

In [1062]: b['fd'][:,0]
Out[1062]: array([ 11.,  41.,  71.])

In [1064]: c['fd']
Out[1064]: 
array([[ 11.],
       [ 41.],
       [ 71.]])

As I just explained in https://stackoverflow.com/a/38090370/901925, the recfunctions generally allocate a target array and copy values field by field.

So the field iteration solution would be something like:

In [1066]: c.dtype
Out[1066]: dtype([('fd', '<f8', (1,)), ('av', '<f8', (1,))])

In [1067]: b.dtype
Out[1067]: dtype([('fd', '<f8', (4,)), ('av', '<f8', (4,))])

In [1068]: d=np.zeros((b.shape), dtype=c.dtype)


In [1070]: for n in b.dtype.names:
    d[n][:] = b[n][:,[0]]

In [1071]: d
Out[1071]: 
array([([11.0], [1.0]), ([41.0], [4.0]), ([71.0], [7.0])], 
      dtype=[('fd', '<f8', (1,)), ('av', '<f8', (1,))])
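The field iteration above relies on c already existing (it was pasted from the question). As a self-contained sketch, the target dtype can instead be derived from b. The helper name `take_subcolumn` is hypothetical, and it assumes every field of the input is a 1-d subarray field long enough to hold the requested column.

```python
import numpy as np

b = np.array(
    [([11.0, 21.0, 31.0, 0.01], [1.0, 2.0, 3.0, 0.0]),
     ([41.0, 51.0, 61.0, 0.11], [4.0, 5.0, 6.0, 0.1]),
     ([71.0, 81.0, 91.0, 0.21], [7.0, 8.0, 9.0, 0.2])],
    dtype=[('fd', '<f8', (4,)), ('av', '<f8', (4,))])


def take_subcolumn(arr, col):
    """Hypothetical helper: keep only element `col` of every subarray field."""
    # Same field names and base dtype as the input, but subarray length 1.
    new_dtype = np.dtype(
        [(name, arr.dtype[name].base, (1,)) for name in arr.dtype.names])
    out = np.zeros(arr.shape, dtype=new_dtype)
    for name in arr.dtype.names:
        # arr[name] is 2-d; indexing with [col] keeps the trailing axis.
        out[name][:] = arr[name][:, [col]]
    return out


d = take_subcolumn(b, 0)
print(d.dtype)
print(d['fd'].ravel(), d['av'].ravel())
```

This still loops over the field names, but only once per field, not once per row.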

================

Since both fields are floats, I can view b as a 2d array and select the 2 subcolumns with 2d array indexing:

In [1083]: b.view((float,8)).shape
Out[1083]: (3, 8)

In [1084]: b.view((float,8))[:,[0,4]]
Out[1084]: 
array([[ 11.,   1.],
       [ 41.,   4.],
       [ 71.,   7.]])

Similarly, c can be viewed as 2d:

In [1085]: c.view((float,2))
Out[1085]: 
array([[ 11.,   1.],
       [ 41.,   4.],
       [ 71.,   7.]])

And I can then port the values to a blank d with:

In [1090]: d=np.zeros((b.shape), dtype=c.dtype)

In [1091]: d.view((float,2))[:]=b.view((float,8))[:,[0,4]]

In [1092]: d
Out[1092]: 
array([([11.0], [1.0]), ([41.0], [4.0]), ([71.0], [7.0])], 
      dtype=[('fd', '<f8', (1,)), ('av', '<f8', (1,))])

So, at least in this case, we don't have to do a field-by-field copy. But I can't say, without testing, which is faster. In my previous answer I found that field-by-field copy was relatively fast when dealing with many rows.
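For reference, the view-based copy can be run end to end as the sketch below. It defines the target dtype directly, since c is not otherwise available. It assumes, as in this question, that every field has the same float64 base type and that the dtype has no padding, so each record is just 8 (or 2) consecutive floats; the names `c_dtype` and `flat` are illustrative.

```python
import numpy as np

b = np.array(
    [([11.0, 21.0, 31.0, 0.01], [1.0, 2.0, 3.0, 0.0]),
     ([41.0, 51.0, 61.0, 0.11], [4.0, 5.0, 6.0, 0.1]),
     ([71.0, 81.0, 91.0, 0.21], [7.0, 8.0, 9.0, 0.2])],
    dtype=[('fd', '<f8', (4,)), ('av', '<f8', (4,))])

# Target dtype: same fields, one element each (the dtype of c above).
c_dtype = np.dtype([('fd', '<f8', (1,)), ('av', '<f8', (1,))])

# Each record of b is 8 consecutive float64 values: fd[0..3], then av[0..3].
flat = b.view((np.float64, 8))
print(flat.shape)  # (3, 8)

# Columns 0 and 4 are fd[0] and av[0]; write them through a 2-d view of d.
d = np.zeros(b.shape, dtype=c_dtype)
d.view((np.float64, 2))[:] = flat[:, [0, 4]]
print(d['fd'].ravel(), d['av'].ravel())
```

The column indices [0, 4] are specific to this layout; a different subarray length or field order would need different offsets.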

4 Comments

Can you explain how you got c? I cannot seem to find it. Thanks
c is just a copy/paste from your post, your >>array([([11.0],[1.0]),... ('av', '<f8', (1,))]).
I figured out how to do the copy via views.
Thank you. This helps a lot.
