Python: Nested For Loop Is Extremely Slow - Reading RLE Compressed 3D Data

I need to read 3D data compressed with run-length encoding (RLE) into a 3D numpy array in python. In Matlab this takes around a second using a nested loop. However in python this t

Solution 1:

Speed up loops

First, don't use np.arange to create an iterator: it allocates a full array that you then iterate over. Use range (Python 3) or xrange (Python 2) instead. This should improve performance by a few percent, but it isn't the real bottleneck here.
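To make the difference concrete, here is a small sketch (the variable names are only for illustration). Both forms visit the same indices, but only np.arange allocates an array up front.

```python
import numpy as np

n = 6

# np.arange builds the whole index array in memory before the loop starts,
# and iterating over it yields NumPy scalar objects.
idx_array = np.arange(0, n, 2)

# range produces plain Python ints lazily, without allocating an array.
idx_range = range(0, n, 2)

# Both describe the same sequence of indices.
visited = [i for i in idx_range]
print(visited, idx_array.tolist())
```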

Matlab has a just-in-time compiler, so it performs relatively well in loops; CPython doesn't have one by default. However, there is a just-in-time compiler for Python called numba (http://numba.pydata.org/). Its documentation lists the supported functions that can be compiled to native machine code. When using numba I would also recommend writing explicit loops instead of vectorised code, because loops are easier for the compiler to handle.

I have modified your code a bit.

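import numpy as np

def decompress_RLE(labelsRleCompressed, vox_size):
    # Flat output buffer with one entry per voxel
    res = np.empty(vox_size[0]*vox_size[1]*vox_size[2], np.uint32)

    ii = 0  # next write position in res
    # The input is stored as consecutive (value, repeat count) pairs
    for i in range(0, labelsRleCompressed.size, 2):
        value = labelsRleCompressed[i]
        rep = labelsRleCompressed[i+1]
        for j in range(0, rep):
            res[ii] = value
            ii = ii + 1

    res = res.reshape((vox_size[0], vox_size[1], vox_size[2]))

    return res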
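For comparison, and not part of the original answer: when numba is not available, NumPy's np.repeat can express the same decoding without an explicit Python loop. The following is a minimal sketch, assuming the input holds (value, repeat count) pairs as in the function above; the name decompress_rle_repeat is made up for illustration.

```python
import numpy as np

def decompress_rle_repeat(rle, shape):
    """Decode (value, count) pairs with np.repeat and reshape to `shape`."""
    rle = np.asarray(rle)
    values = rle[0::2]                    # entries at even positions are the values
    counts = rle[1::2].astype(np.intp)    # entries at odd positions are repeat counts
    flat = np.repeat(values, counts)      # each value repeated `count` times
    return flat.reshape(shape)

# Tiny example: 7 three times, 2 once, 5 four times -> 8 voxels in a 2x2x2 volume
demo = decompress_rle_repeat(np.array([7, 3, 2, 1, 5, 4], dtype=np.uint32), (2, 2, 2))
print(demo.ravel().tolist())
```

This variant raises an error in reshape when the counts do not add up to the number of voxels, which can be a useful side effect while debugging.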

Create data for benchmarking

vox_size = np.array((300, 300, 300), dtype=np.int32)
# create some data: random values, each with a repeat count of 4
labelsRleCompressed = np.random.randint(0, 500,
    size=vox_size[0]*vox_size[1]*vox_size[2]//2, dtype=np.uint32)
labelsRleCompressed[1::2] = 4
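The decoding loop writes into a preallocated buffer, so the repeat counts must add up to exactly the number of voxels; numba's nopython mode does not check array bounds by default, so a mismatch may go unnoticed. A small sanity check could look like this (the helper name rle_length_matches is hypothetical, not from the original answer):

```python
import numpy as np

def rle_length_matches(rle, shape):
    """Return True if the repeat counts sum to the number of voxels in `shape`."""
    counts = np.asarray(rle)[1::2].astype(np.int64)   # odd positions hold the counts
    n_voxels = int(np.prod(shape, dtype=np.int64))
    return int(counts.sum()) == n_voxels

ok = rle_length_matches([7, 3, 2, 1, 5, 4], (2, 2, 2))   # 3 + 1 + 4 == 8 voxels
too_short = rle_length_matches([7, 3], (2, 2, 2))        # only 3 of 8 voxels covered
print(ok, too_short)
```

For the benchmark data above, every second entry is 4 and there are half as many pairs as voxels divided by 2, so the counts sum to 300*300*300 as required.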

Simply calling the pure Python function with the generated data results in a runtime of 7.5 seconds, which is rather poor performance.

Now let's use numba.

import numba
# The signature fixes the argument and return types, so pass arrays with exactly these dtypes
nb_decompress_RLE = numba.jit("uint32[:,:,:](uint32[:],int32[:])", nopython=True)(decompress_RLE)

Calling the compiled nb_decompress_RLE with the test data results in a runtime of 0.0617 seconds. A nice speed-up by a factor of 119! Simply copying an array with np.copy is only 3 times faster.
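The answer does not show how the runtimes were measured. One possible way, sketched here with only the standard library, is to take the best of a few runs with time.perf_counter; the helper name best_of and the repeat count are assumptions, not the author's setup.

```python
import time

def best_of(func, *args, repeats=3):
    """Call func(*args) `repeats` times; return (best seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best, result

# Stand-in workload so the sketch runs on its own
seconds, total = best_of(sum, range(1000))
print(f"best of 3: {seconds:.6f} s, result {total}")
```

With the names used above, the two calls to compare would be best_of(decompress_RLE, labelsRleCompressed, vox_size) and best_of(nb_decompress_RLE, labelsRleCompressed, vox_size). Because a signature is passed to numba.jit, compilation happens when nb_decompress_RLE is created, so it should not be included in the measured time.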
