I'm working with a nested list. I need to apply a rather simple but costly arithmetic operation to each of its elements.

Below is the MWE, where the "second block" is the one that takes up most of the time (it will run thousands of times).

The first and second blocks need to stay separate: the first one is only processed once, because the real way to obtain a, b, c is very expensive time-wise.

I'm not sure how to improve the performance of this operation, which is applied to each element of the nested list. Surely numpy could do this much faster, but I'm not very familiar with broadcasting operations on arrays.

import numpy as np
import time


# Generate some random data.
N = 100
x, y, z = [np.random.uniform(0., 10., N) for _ in range(3)]

# Grid of values in 2 dimensions.
M = 200
p_lst, q_lst = np.linspace(0., 50., M), np.linspace(0., 25., M)

# Define empty nested list to be filled below.
# It is indexed as abc_lst[i][j], with i running over p_lst and j over q_lst.
abc_lst = [[[] for _ in q_lst] for _ in p_lst]

# First block. This needs to be separated from the block below.
# Fill nested list with values.
for i, p in enumerate(p_lst):
    for j, q in enumerate(q_lst):

        # a,b,c are obtained via some complicated function of p,q.
        # This is just for the purpose of this example.
        a, b, c = 1.*p, 1.*q, p+q

        # Store in nested list.
        abc_lst[i][j] = [a, b, c]

# Second block <-- THIS IS THE BOTTLENECK
tik = time.time()
# Apply operation on nested list.
lst = []
for i in range(len(p_lst)):
    for j in range(len(q_lst)):

        # Extract a,b,c values from nested list.
        a, b, c = abc_lst[i][j]

        # Apply operation. This is the *actual* operation
        # I need to apply.
        d = sum(abs(a*x + y*b + c*z))

        # Store value.
        lst.append(d)

print(time.time() - tik)
    Avoid loops inside loops by flattening your 2D array into a 1D array and reshaping at the end. It will already be MUCH faster. (Commented Jan 20, 2016 at 15:05)
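One way to read this suggestion, as a rough sketch: convert the nested list to an array, merge the two grid axes into one, loop once, and reshape the result at the end. The names `abc_flat`, `d_flat` and `d_grid` are hypothetical, and the grid is smaller than in the question so the example runs quickly. Note that this only removes the nesting; the per-element work still happens in a Python loop.

```python
import numpy as np

np.random.seed(0)
N, M = 100, 20  # smaller grid than in the question, for a quick run
x, y, z = np.random.uniform(0., 10., (3, N))
p_lst, q_lst = np.linspace(0., 50., M), np.linspace(0., 25., M)

# First block (runs once): nested list of [a, b, c], as in the question.
abc_lst = [[[1.*p, 1.*q, p + q] for q in q_lst] for p in p_lst]

# Flatten the (M, M, 3) structure into (M*M, 3): one row per grid point.
abc_flat = np.asarray(abc_lst).reshape(-1, 3)

# Second block: a single loop over the flattened rows.
d_flat = np.array([np.sum(np.abs(a*x + b*y + c*z)) for a, b, c in abc_flat])

# Reshape so that d_grid[i, j] corresponds to p_lst[i], q_lst[j].
d_grid = d_flat.reshape(len(p_lst), len(q_lst))
```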

2 Answers

I've found an answer in this question, using the np.outer() function.

It only takes a little rearranging of the first block, and the second block then runs many times faster.

# First block. Store a,b,c separately.
a_lst, b_lst, c_lst = [], [], []
for i, p in enumerate(p_lst):
    for j, q in enumerate(q_lst):

        # a,b,c are obtained via some complicated function of p,q.
        # This is just for the purpose of this example.
        a_lst.append(1.*p)
        b_lst.append(1.*q)
        c_lst.append(p+q)

# As arrays.
a_lst, b_lst, c_lst = np.asarray(a_lst), np.asarray(b_lst), np.asarray(c_lst)

# Second block.
# Apply operation on the arrays using np.outer. Each outer product has
# shape (M*M, N); summing over axis 1 gives one value per (p, q) pair.
lst = np.sum(abs(np.outer(a_lst, x) + np.outer(b_lst, y) + np.outer(c_lst, z)), axis=1)
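
The three outer products can also be folded into a single matrix product, which should need fewer large temporary arrays. This is a minimal sketch rather than a benchmarked replacement: `abc` and `xyz` are hypothetical names, the first block still uses the placeholder formulas for a, b, c, and the grid is reduced so it runs quickly.

```python
import numpy as np

np.random.seed(0)
N, M = 100, 20  # smaller grid than in the question, for a quick run
x, y, z = np.random.uniform(0., 10., (3, N))
p_lst, q_lst = np.linspace(0., 50., M), np.linspace(0., 25., M)

# First block (runs once): one [a, b, c] row per (p, q) pair.
abc = np.array([[1.*p, 1.*q, p + q] for p in p_lst for q in q_lst])  # (M*M, 3)

# Stack the data once as well.
xyz = np.vstack([x, y, z])  # (3, N)

# Second block: row k of the product is a_k*x + b_k*y + c_k*z.
lst = np.abs(np.dot(abc, xyz)).sum(axis=1)  # (M*M,)
```

If the grid layout is needed afterwards, `lst.reshape(M, M)` gives an array indexed by (p, q).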

I don't think two sets of loops are necessary. Just collapse into one:

## Always pre-allocate with zeros if possible...not just empty lists
lst = np.zeros(M*M)

# Single combined block.
tik = time.time()
# Compute each value directly instead of filling a nested list.
for i, p in enumerate(p_lst):
    for j, q in enumerate(q_lst):

        # a,b,c are obtained via some complicated function of p,q.
        # This is just for the purpose of this example.
        a, b, c = 1.*p, 1.*q, p+q

        # Don't store in nested list, just calculate
        ##abc_lst[i][j] = [a, b, c]
        lst[i*M+j] = sum(abs(a*x + y*b + c*z))

1 Comment

Thank you gariepy! Unfortunately the blocks need to be separated because the first one is only processed once. I've edited my question to reflect this.
