
I was wondering if someone could advise me whether there is a better/faster approach to read data from my C program that outputs two lists of size n. I am using ctypes to call the C program.

The loop below iterates over a number of scans. For each scan, two lists are produced (msX, msY). The c_float data is extracted with a list comprehension. Is there a better/faster way to convert the c_float arrays mzP and mzI to the lists msX and msY?

for scan in xrange(nScans):
    mzP = (c_float * nPoints)()  # ctypes array (c_float_Array) that receives list 1
    mzI = (c_float * nPoints)()  # ctypes array (c_float_Array) that receives list 2
    mlLib.readData(filePointer, 1, scan, byref(mzP), byref(mzI))
    # The slow part: element-by-element conversion to Python lists
    msX = [mzP[i] for i in xrange(nPoints)]  # list with mzP data
    msY = [mzI[i] for i in xrange(nPoints)]  # list with mzI data

Let me know if my question is not clear. Thanks for your help in advance.

  • Try PyPy (pypy.org); with CPython there is nothing faster than a list comprehension. Commented May 25, 2017 at 12:16
  • msX = mzP[:] would be faster than a list comprehension, but why do you need a list instead of directly using the ctypes array? If a ctypes array is missing some method that you need, maybe an array.array will suffice? Starting with msX = array.array('f', [0]) * nPoints, you can get a ctypes array that shares its memory via mzP = (c_float * nPoints).from_buffer(msX). Commented May 25, 2017 at 19:17
  • Thanks for the comments. I tried your first suggestion and this reduces the runtime to approx. 45% which is great. I will have a look at your other suggestion to see whether it will be suitable. Thanks! Commented May 25, 2017 at 22:06
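
The array.array suggestion from the comments can be sketched as follows. This is a minimal illustration, not the asker's actual code: n_points is a hypothetical stand-in for nPoints, and the loop that writes through the ctypes view stands in for the call to mlLib.readData, which is not available here.

```python
import array
from ctypes import c_float

n_points = 5  # hypothetical size, stands in for nPoints

# Allocate a single-precision array.array once; it offers list-like methods.
ms_x = array.array('f', [0]) * n_points

# Create a ctypes array that shares the same memory (no copy is made).
mz_p = (c_float * n_points).from_buffer(ms_x)

# Stand-in for the C call: anything written through mz_p shows up in ms_x.
for i in range(n_points):
    mz_p[i] = i * 0.5

as_list = ms_x.tolist()  # convert only if a real list is required
sliced = mz_p[:]         # slicing a ctypes array also returns a plain list
```

Because both objects share one buffer, the same pair can be reused for every scan, which avoids reallocating inside the loop. Note that the array.array cannot be resized while a ctypes view created with from_buffer is still alive.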

3 Answers


I may be missing something, but this works for me:

from ctypes import c_float

arr = (c_float * 3)(1, 2, 3)
arr[:]
# Result: [1.0, 2.0, 3.0]


If you prefer, you can convert to an array with np.ndarray:

import numpy as np
msX = np.ndarray((nPoints,), 'f', mzP, order='C')
msY = np.ndarray((nPoints,), 'f', mzI, order='C')
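
One point worth knowing about this approach is that np.ndarray built on a buffer is a view, not a copy. The sketch below illustrates that with a small, hand-filled ctypes array; n_points and the sample values are assumptions made for the example, and no C library is called.

```python
import ctypes
import numpy as np

n_points = 4  # hypothetical size, stands in for nPoints
mz_p = (ctypes.c_float * n_points)(1.0, 2.0, 3.0, 4.0)

# Wrap the ctypes buffer without copying; 'f' / np.float32 matches c_float.
view = np.ndarray((n_points,), dtype=np.float32, buffer=mz_p)

# The view shares memory with mz_p, so later writes are visible through it.
mz_p[0] = 10.0

# Take an explicit copy if the data must survive the buffer being reused.
snapshot = view.copy()
mz_p[1] = 20.0  # changes view[1], but not snapshot[1]
```

If the ctypes arrays are reallocated on every scan, as in the question, the view stays valid for that scan. If the buffers are reused across scans, copy the view before the next call overwrites it.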

1 Comment

It does increase the speed by about 20 % on top of what @eryksun suggested. Thanks!

The answer is to use NumPy. You can use NumPy to allocate an array, pass a pointer to its data to your C API which will populate it, and then at the end if you are desperate for a list you can call tolist() on the NumPy array. However, you will likely find that keeping the data stored in a NumPy array instead of a list allows you to accelerate downstream processing.
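
A rough sketch of that workflow is shown below. Since the real C library is not available here, fake_read_data is a hypothetical pure-Python stand-in that fills the buffer through the pointer, and n_points is an assumed size; with the real library the pointer would be passed to the C function instead.

```python
import ctypes
import numpy as np

n_points = 4  # hypothetical size, stands in for nPoints

# Let NumPy own the memory; float32 matches the C float type.
ms_x = np.empty(n_points, dtype=np.float32)

# Pointer to the array's data, suitable for a C parameter of type float*.
ptr = ms_x.ctypes.data_as(ctypes.POINTER(ctypes.c_float))


def fake_read_data(out, n):
    """Hypothetical stand-in for the C function: fills n floats via the pointer."""
    for i in range(n):
        out[i] = float(i)


fake_read_data(ptr, n_points)

# Convert to a list only if downstream code really needs one.
result = ms_x.tolist()
```

The array must be C-contiguous and stay alive for as long as the C code uses the pointer; a freshly allocated one-dimensional array such as the one above satisfies the first condition.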

1 Comment

What if the array passed as input is not empty?
