Index a NumPy array row-wise [duplicate]

Question

Say I have a NumPy array:

>>> X = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
>>> X
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

and an array of indexes that I want to select for each row:

>>> ixs = np.array([[1, 3], [0, 1], [1, 2]])
>>> ixs
array([[1, 3],
       [0, 1],
       [1, 2]])

How do I index the array X so that for every row in X I select the two indices specified in ixs?

So for this case, I want to select element 1 and 3 for the first row, element 0 and 1 for the second row, and so on. The output should be:

array([[2, 4],
       [5, 6],
       [10, 11]])

A slow solution would be something like this:

output = np.array([row[ix] for row, ix in zip(X, ixs)])

however this can get kinda slow for extremely long arrays. Is there a faster way to do this without a loop using NumPy?

EDIT: Some very approximate speed tests on a 2.5K * 1M array with 2K wide ixs (10GB):

np.array([row[ix] for row, ix in zip(X, ixs)]) 0.16s

X[np.arange(len(ixs)), ixs.T].T 0.175s

X.take(idx+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None]) 33s

np.fromiter((X[i, j] for i, row in enumerate(ixs) for j in row), dtype=X.dtype).reshape(ixs.shape) 2.4s

This produces an array that is close to what you want: X[:,ixs]. Can anyone build on this? — Bill
– Bill, Commented Feb 26, 2018 at 22:04
Can you reconstruct ixs like this instead: ixs2 = np.array(((0, 1), (0, 3), (1, 0), ...))? If so, then X[ixs2[:,0], ixs2[:,1]] will get what you need I think. — Bill
– Bill, Commented Feb 26, 2018 at 22:18
No @MohsinBukhari I am already parallelising on a higher loop. I also don't think parallelisation will help here as passing around info between processes is slow. — mxbi
– mxbi, Commented Feb 26, 2018 at 22:24
Hm, have you tried a straight-forward for-loop implementation with numba JIT? — juanpa.arrivillaga
– juanpa.arrivillaga, Commented Feb 26, 2018 at 23:01

llllllllll · Accepted Answer · 2018-02-26 22:15:20Z

7

You can use this:

X[np.arange(len(ixs)), ixs.T].T

Here is the reference for complex indexing.

edited Feb 26, 2018 at 22:15

answered Feb 26, 2018 at 22:05

llllllllll

16.5k4 gold badges35 silver badges56 bronze badges

Sign up to request clarification or add additional context in comments.

19 Comments

mxbi Over a year ago

This is a neater solution but actually a bit slower in my (approximate) testing. I measured 1.75s vs 1.15s for the original version

juanpa.arrivillaga Over a year ago

@mxbi how big of an array are you testing this on? Unless you are using large arrays, I would expect list-comprehensions to be faster for these small sizes...

mxbi Over a year ago

My array is 10K * 1M @juanpa.arrivillaga - although my initial test was on a much smaller array. On the large array this version is only very slightly slower than the list comprehension.

juanpa.arrivillaga Over a year ago

@mxbi mmm if I had to guess the double-transpose is just taking too long to make up for the efficient indexing... can you try timing my .take approach, maybe add the results as an edit to the question?

llllllllll Over a year ago

@mxbi 10K * 1M is too much memory(10G long ???), maybe you need to try 10K * 10K level.

|

juanpa.arrivillaga · Accepted Answer · 2018-02-26 23:26:15Z

3

I believe you can use .take thusly:

In [185]: X
Out[185]:
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [186]: idx
Out[186]:
array([[1, 3],
       [0, 1],
       [1, 2]])

In [187]: X.take(idx + (np.arange(X.shape[0]) * X.shape[1]).reshape(-1, 1))
Out[187]:
array([[ 2,  4],
       [ 5,  6],
       [10, 11]])

If your array dimensions are massive, it might be faster, albeit uglier, to do:

idx+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None]

Just for fun, see how the following performs:

np.fromiter((X[i, j] for i, row in enumerate(ixs) for j in row), dtype=X.dtype, count=ixs.size).reshape(ixs.shape)

Edit to add timings

In [15]: X = np.arange(1000*10000, dtype=np.int32).reshape(1000,-1)

In [16]: ixs = np.random.randint(0, 10000, (1000, 2))

In [17]: ixs.sort(axis=1)

In [18]: ixs
Out[18]:
array([[2738, 3511],
       [3600, 7414],
       [7426, 9851],
       ...,
       [1654, 8252],
       [2194, 8200],
       [5497, 8900]])

In [19]: %timeit  np.array([row[ix] for row, ix in zip(X, ixs)])
928 µs ± 23.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [20]: %timeit X[np.arange(len(ixs)), ixs.T].T
23.6 µs ± 491 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [21]: %timeit X.take(idx+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None])
20.6 µs ± 530 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [22]: %timeit np.fromiter((X[i, j] for i, row in enumerate(ixs) for j in row), dtype=X.dtype, count=ixs.size).reshape(ixs.shape)
1.42 ms ± 9.94 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

@mxbi I've added some timings and my results aren't really consistent with yours, you should check it out

Here's a larger array:

In [33]: X = np.arange(10000*100000, dtype=np.int32).reshape(10000,-1)

In [34]: ixs = np.random.randint(0, 100000, (10000, 2))

In [35]: ixs.sort(axis=1)

In [36]: X.shape
Out[36]: (10000, 100000)

In [37]: ixs.shape
Out[37]: (10000, 2)

With some results:

In [42]: %timeit  np.array([row[ix] for row, ix in zip(X, ixs)])
11.4 ms ± 177 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [43]: %timeit X[np.arange(len(ixs)), ixs.T].T
596 µs ± 17.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [44]: %timeit X.take(ixs+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None])
540 µs ± 16.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Now, we are using column 500 indices instead of two, and we see the list-comprehension start winning out:

In [45]: ixs = np.random.randint(0, 100000, (10000, 500))

In [46]: ixs.sort(axis=1)

In [47]: %timeit  np.array([row[ix] for row, ix in zip(X, ixs)])
93 ms ± 1.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [48]: %timeit X[np.arange(len(ixs)), ixs.T].T
133 ms ± 638 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [49]: %timeit X.take(ixs+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None])
87.5 ms ± 1.13 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

edited Feb 26, 2018 at 23:26

answered Feb 26, 2018 at 22:20

juanpa.arrivillaga

97.6k14 gold badges141 silver badges190 bronze badges

10 Comments

mxbi Over a year ago

This one unfortunately seems kinda slow. On a 10K x 1M matrix, the faster version takes 33 seconds to complete, versus 170ms for the list comprehension.

juanpa.arrivillaga Over a year ago

@mxbi yeaaah if your dimensions are massive, the there's a lot of overhead to the idx + ... stuff, although, try with idx+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None] I wonder if it is faster... even if not fast enough.

llllllllll Over a year ago

@mxbi 10G long is indeed too much, the bottleneck is perhaps OS paging.

mxbi Over a year ago

@liliscent This machine has 128GB ram and no swap so I am not paging. Yeah @ juanpa I guess this works better on smaller arrays.

juanpa.arrivillaga Over a year ago

@mxbi did you try this version: idx+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None], I don't think it will be fast enough, but I'm curious if it is faster than the other...

|

hpaulj · Accepted Answer · 2018-02-26 23:49:29Z

The usual suggestion for indexing items from rows is:

X[np.arange(X.shape[0])[:,None], ixs]

That is, make a row index of shape (n,1) (column vector), which will broadcast with the (n,m) shape of ixs to give a (n,m) solution.

This basically the same as:

X[np.arange(len(ixs)), ixs.T].T

which broadcasts a (n,) index against a (m,n), and transposes.

Timings are essentially the same:

In [299]: X = np.ones((1000,2000))
In [300]: ixs = np.random.randint(0,2000,(1000,200))
In [301]: timeit X[np.arange(len(ixs)), ixs.T].T
6.58 ms ± 71.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [302]: timeit X[np.arange(X.shape[0])[:,None], ixs]
6.57 ms ± 129 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

and for comparison:

In [307]: timeit np.array([row[ix] for row, ix in zip(X, ixs)])
6.63 ms ± 229 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

I'm a little surprised that this list comprehension does so well. I wonder how the relative advantages compare when the dimensions change, particularly in the relative shape of X and ixs (long, wide etc).

The first solution is the style of indexing produced by ix_:

In [303]: np.ix_(np.arange(3), np.arange(2))
Out[303]: 
(array([[0],
        [1],
        [2]]), array([[0, 1]]))

MPękalski · Accepted Answer · 2018-02-26 22:09:41Z

0

This should work

[X[i][[y]] for i, y in enumerate(ixs)]

EDIT: I just noticed you wanted no loop solution.

answered Feb 26, 2018 at 22:09

MPękalski

7,1045 gold badges29 silver badges37 bronze badges

1 Comment

mxbi Over a year ago

This one is also pretty neck and neck with my original solution (if I wrap it in a call to np.array)

Collectives™ on Stack Overflow

Index a NumPy array row-wise [duplicate]

4 Answers 4

19 Comments

Edit to add timings

10 Comments

Comments

1 Comment

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

19 Comments

Edit to add timings

10 Comments

Comments

1 Comment

Linked

Related