I'm looking for a way to sum each column of a 2D array ("a" in the example below), starting in each column from the row position given by a separate 1D array ("ref" in the example below).

I have tried the following:

import numpy as np

a = np.arange(20).reshape(5, 4)
print(a)                         # representing an original large 2D array
ref = np.array([0, 2, 4, 1])     # reference array for defining start of sum
s = a.sum(axis=0)
print(s)    # Works: sums all elements per column
s = a[2:].sum(axis=0)
print(s)    # Works as well: sum from the third element till end per column

# This is what I look for: sum per column starting at element defined by ref[]
s = np.zeros(4, dtype=int)       # makes a zero-filled 1D integer array
for i in np.arange(4):           # for each column
    for j in np.arange(ref[i], 5):
        s[i] += a[j, i]          # sums all elements from ref till end (i.e. 5)

print(s)    # This is the desired outcome

for i in np.arange(4):
    s = a[ref[i]:].sum(axis=0)

print(s)    # No good; same as a[ref[3]:].sum(axis=0) and here ref[3] = 1

s = np.zeros(4, dtype=int)       # makes a zero-filled 1D integer array
for i in np.arange(4):
    s[i] = np.sum(a[ref[i]:, i])

print(s)    # Yes; this is also the desired outcome

Is it possible to do this without a for loop? Does numpy have a function that does this in a single step?

s = a[ref:].sum(axis=0)

This would be nice, but it does not work.

Thank you for your time!

  • Thank you for including a piece of code with what you've tried, it is also appreciated around here. What might also help is a simple example of a, ref, and desired output. Commented May 3, 2018 at 19:54

1 Answer

A basic solution based on np.cumsum:

In [1]: a = np.arange(15).reshape(5, 3)

In [2]: res = np.array([0, 2, 3])

In [3]: b = np.cumsum(a, axis=0)

In [4]: b
Out[4]: 
array([[ 0,  1,  2],
       [ 3,  5,  7],
       [ 9, 12, 15],
       [18, 22, 26],
       [30, 35, 40]])

In [5]: a
Out[5]: 
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])


In [6]: b[res, np.arange(a.shape[1])]
Out[6]: array([ 0, 12, 26])

In [7]: b[-1, :] - b[res, np.arange(a.shape[1])]
Out[7]: array([30, 23, 14])

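This is not yet the result we want: b[res[i], i] already includes the element at row res[i], so the subtraction drops the starting element from each sum. To fix this, we prepend a row of zeros to b: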

In [13]: b = np.vstack([np.zeros((1, a.shape[1])), b])

In [14]: b
Out[14]: 
array([[  0.,   0.,   0.],
       [  0.,   1.,   2.],
       [  3.,   5.,   7.],
       [  9.,  12.,  15.],
       [ 18.,  22.,  26.],
       [ 30.,  35.,  40.]])

In [17]: b[-1, :] - b[res, np.arange(a.shape[1])]
Out[17]: array([ 30.,  30.,  25.])

which is, I believe, the desired output.
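For reference, here is a minimal sketch of the same cumulative-sum idea applied to the arrays from the question (a of shape (5, 4) and ref = [0, 2, 4, 1]). The names s_cumsum, s_mask and mask are only illustrative, and the boolean-mask variant at the end is an alternative technique that is not part of the steps above; it is included only as a cross-check.

```python
import numpy as np

a = np.arange(20).reshape(5, 4)
ref = np.array([0, 2, 4, 1])          # start row for each column
cols = np.arange(a.shape[1])

# Prepend a zero row so that b[k, i] holds the sum of a[:k, i],
# i.e. the sum of the rows strictly before row k.
b = np.vstack([np.zeros((1, a.shape[1]), dtype=a.dtype),
               np.cumsum(a, axis=0)])

# Total column sum minus the sum of the rows before ref[i].
s_cumsum = b[-1] - b[ref, cols]
print(s_cumsum)                       # [40 39 18 52]

# Alternative (not from the answer): mask out the rows above ref[i]
# via broadcasting, then sum each column.
mask = np.arange(a.shape[0])[:, None] >= ref
s_mask = (a * mask).sum(axis=0)
print(s_mask)                         # [40 39 18 52]
```

Both variants avoid the Python loop. The mask version builds a temporary array the same size as a, while the cumulative-sum version can reuse b for many different ref arrays.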

2 Comments

Dear P. Camilleri, thank you for your fast and useful solution; it is exactly what I was looking for. I am new to Python, but I understand that you subtract the partial sum (of the elements we don't want) from the total column sum. This also makes it possible to define a second 1D array (marking where the sum should stop), index with it as in line [6], and subtract the two as in line [7]. Elegant!
@wilmert yes, you got that right! The part to remember is indeed [6]: how to extract the elements at given row and column indices. b[res, :] would not work: it creates a 2D array (try it).
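The two points raised in these comments can be sketched as follows, under some assumptions: b is the zero-padded cumulative sum from the answer (built here with an integer dtype), and stop is a hypothetical array of exclusive end rows that does not appear in the original post.

```python
import numpy as np

a = np.arange(15).reshape(5, 3)
res = np.array([0, 2, 3])             # start row for each column
cols = np.arange(a.shape[1])

# Zero-padded cumulative sum: b[k, i] is the sum of a[:k, i].
b = np.vstack([np.zeros((1, a.shape[1]), dtype=a.dtype),
               np.cumsum(a, axis=0)])

# b[res, :] selects whole rows, so the result is 2D.
rows_2d = b[res, :]
print(rows_2d.shape)                  # (3, 3)

# b[res, cols] pairs res[i] with cols[i], giving one element per column.
picked = b[res, cols]
print(picked)                         # [ 0  5 15]

# Hypothetical upper bounds: sum rows res[i] .. stop[i]-1 in column i.
stop = np.array([2, 5, 4])
window = b[stop, cols] - b[res, cols]
print(window)                         # [ 3 30 11]
```

With the zero row in place, both bounds use the same indexing pattern, and the earlier "sum to the end" case is simply stop[i] = a.shape[0] for every column.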
