Variable Partial Array Summation in Python

Question

I'm looking for a way to sum each column of a 2D array ("a" in the example below), starting from a row position defined per column in a separate 1D array ("ref" in the example below).

I have tried the following:

import numpy as np

a = np.arange(20).reshape(5, 4)
print(a)                         # representing an original large 2D array
ref = np.array([0, 2, 4, 1])     # reference array for defining start of sum
s = a.sum(axis=0)
print(s)    # Works: sums all elements per column
s = a[2:].sum(axis=0)
print(s)    # Works as well: sum from the third element till end per column

# This is what I look for: sum per column starting at element defined by ref[]
s = np.zeros(4, dtype=int)       # zero-initialized 1D integer array
for i in np.arange(4):           # for each column
    for j in np.arange(ref[i], 5):
        s[i] += a[j, i]          # sums all elements from ref till end (i.e. 5)

print(s)    # This is the desired outcome

for i in np.arange(4):
    s = a[ref[i]:].sum(axis=0)

print(s)    # No good; s is overwritten on every pass, so this is a[ref[3]:].sum(axis=0) and here ref[3] = 1

s = np.zeros(4, dtype=int)       # zero-initialized 1D integer array
for i in np.arange(4):
    s[i] = np.sum(a[ref[i]:, i])

print(s)    # Yes; this is also the desired outcome

Is it possible to realize this without using a for loop? Does numpy have functions for doing this in a single step?

s = a[ref:].sum(axis=0)

Something like this would be nice, but it does not work.

Thank you for your time!

Thank you for including a piece of code with what you've tried; it is appreciated around here. What might also help is a simple example of a, ref, and the desired output.
– P. Camilleri, Commented May 3, 2018 at 19:54

P. Camilleri · Accepted Answer · 2018-05-03 20:05:31Z

A basic solution based on np.cumsum:

In [1]: a = np.arange(15).reshape(5, 3)

In [2]: res = np.array([0, 2, 3])

In [3]: b = np.cumsum(a, axis=0)

In [4]: b
Out[4]: 
array([[ 0,  1,  2],
       [ 3,  5,  7],
       [ 9, 12, 15],
       [18, 22, 26],
       [30, 35, 40]])

In [5]: a
Out[5]: 
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])


In [6]: b[res, np.arange(a.shape[1])]
Out[6]: array([ 0, 12, 26])

In [7]: b[-1, :] - b[res, np.arange(a.shape[1])]
Out[7]: array([30, 23, 14])

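This is not the result we want: b[res, np.arange(a.shape[1])] already includes the element at row res[j] of each column, so subtracting it drops that element from the sum. We need to add a first row of zeros to b: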

In [13]: b = np.vstack([np.zeros((1, a.shape[1])), b])

In [14]: b
Out[14]: 
array([[  0.,   0.,   0.],
       [  0.,   1.,   2.],
       [  3.,   5.,   7.],
       [  9.,  12.,  15.],
       [ 18.,  22.,  26.],
       [ 30.,  35.,  40.]])

In [17]: b[-1, :] - b[res, np.arange(a.shape[1])]
Out[17]: array([ 30.,  30.,  25.])

which is, I believe, the desired output.
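
The steps above can be collected into one function. The following is a minimal sketch rather than part of the original answer: the name `sum_from_rows` is hypothetical, and it pads with zeros of the input's dtype (an assumption made here so that the result stays integer instead of float). It is applied to the `a` and `ref` from the question.

```python
import numpy as np

def sum_from_rows(a, start):
    """Sum each column of `a` from row start[j] (inclusive) to the last row."""
    # Cumulative sums down each column, with a leading row of zeros so that
    # c[k, j] equals the sum of the first k rows of column j.
    zeros = np.zeros((1, a.shape[1]), dtype=a.dtype)
    c = np.vstack([zeros, np.cumsum(a, axis=0)])
    cols = np.arange(a.shape[1])
    # Column total minus the sum of the rows that come before start[j].
    return c[-1] - c[start, cols]

a = np.arange(20).reshape(5, 4)
ref = np.array([0, 2, 4, 1])
result = sum_from_rows(a, ref)
print(result)  # [40 39 18 52]
```

This should match the output of the explicit loop in the question, without any Python-level loop.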

edited May 3, 2018 at 20:05

answered May 3, 2018 at 19:58


2 Comments

wilmert Over a year ago

Dear P. Camilleri, thank you for your fast and useful solution; it is exactly what I was looking for. I am new to Python, but I understand that you subtract the sub-sum (of the elements we don't want) from the total column sum. This also makes it possible to define a second 1D array for the lower end (up to where to sum), index it as in line [6], and subtract the two as in line [7]. Elegant!

P. Camilleri Over a year ago

@wilmert yes, you got that right! The part to remember is indeed [6]: how to extract the elements at given row and column indices. b[res, :] would not work: it returns a 2D array of whole rows (try it).
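
The two comments above can be illustrated together. This is a sketch under stated assumptions: `start` and `stop` are hypothetical names for the per-column row bounds, `stop` is treated as exclusive, and the padded cumulative sum is built as in the answer (here with integer zeros).

```python
import numpy as np

a = np.arange(15).reshape(5, 3)
start = np.array([0, 2, 3])   # first row to include, per column
stop = np.array([2, 5, 4])    # hypothetical exclusive upper bound, per column

# Padded cumulative sum: c[k, j] is the sum of the first k rows of column j.
c = np.vstack([np.zeros((1, a.shape[1]), dtype=a.dtype), np.cumsum(a, axis=0)])
cols = np.arange(a.shape[1])

# Pairing a row index array with a column index array picks one element per column.
picked = c[start, cols]
print(picked.shape)        # (3,)

# Indexing with rows only returns whole rows, hence a 2D array.
whole_rows = c[start, :]
print(whole_rows.shape)    # (3, 3)

# Sum of rows start[j] .. stop[j]-1 in each column.
partial = c[stop, cols] - c[start, cols]
print(partial)             # [ 3 30 11]
```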
