How To Separate Time Ranges/Intervals Into Bins If Intervals Occur Over Multiple Bins

July 13, 2023

I have a dataset which consists of pairs of start-end times (say seconds) of something happening across a recorded period of time. For example: #each tuple includes (start, stop) o

Solution 1:

If you don't mind using numpy, here is a strategy:

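import numpy as np

def bin_times(data, bin_size, total_length):
    # One flag per unit of time. Use the built-in bool: the np.bool alias
    # is not available in every NumPy release.
    times = np.zeros(total_length, dtype=bool)
    for start, stop in data:
        # Mark every unit in [start, stop) as occupied.
        times[start:stop] = True
    # One row per bin; the row average is the occupied fraction of that bin.
    binned = 100 * np.average(times.reshape(-1, bin_size), axis=1)
    return binned.tolist()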

data = [(0, 1), (5,8), (15,21), (29,30)]
bin_times(data, 5, 40)
# => [20.0, 60.0, 0.0, 100.0, 20.0, 20.0, 0.0, 0.0]

To explain the logic of bin_times(), let me use a smaller example:

data = [(0, 1), (3, 8)]
bin_times(data, 3, 9)
# => [33.3, 100.0, 66.6] (truncated to one decimal place)

The times array encodes whether your event is happening in each unit time interval. You start by setting every entry to False:
```
[False, False, False, False, False, False, False, False, False]
```

Read the incoming data and turn the appropriate entries to True:

[True, False, False, True, True, True, True, True, False]

Reshape it into a two-dimensional matrix in which the length of the rows is bin_size:
```
[[True, False, False],
 [True,  True,  True],
 [True,  True, False]]
```
Take the average in each row:
```
[0.333, 1.000, 0.666]
```
Multiply by 100 to turn those numbers into percentages:
```
[33.3, 100.0, 66.6]
```
To hide the use of numpy from the consumer of the function, use the .tolist() method to turn the resulting numpy array into a plain Python list.
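The walkthrough above can be reproduced step by step. This is only a sketch of the same small example, written inline rather than as a function, so the intermediate arrays can be inspected; the variable names rows and percentages are chosen here for illustration.

```python
import numpy as np

# Small example from the walkthrough: 9 time units, bins of 3.
data = [(0, 1), (3, 8)]
bin_size, total_length = 3, 9

# Step 1: one boolean flag per unit of time, all False to begin with.
times = np.zeros(total_length, dtype=bool)

# Step 2: mark every unit covered by an interval (stop is exclusive).
for start, stop in data:
    times[start:stop] = True

# Step 3: reshape so that each row holds the flags of one bin.
rows = times.reshape(-1, bin_size)

# Step 4: the mean of a boolean row is the fraction of True entries;
# multiply by 100 for a percentage and convert to a plain list.
percentages = (100 * rows.mean(axis=1)).tolist()

print(times)
print(rows)
print(percentages)
```

Printing each intermediate value makes it easy to check where an interval that spans several bins ends up.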

One caveat: bin_size needs to evenly divide total_length; otherwise the reshape raises a ValueError.
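If that restriction is a problem, the reshape can be avoided. The following is a minimal sketch, not part of the original answer: bin_times_uneven is a hypothetical name, and it assumes that a shorter final bin should be measured against its own width rather than against bin_size. It uses np.add.reduceat to sum the flags between consecutive bin edges.

```python
import numpy as np

def bin_times_uneven(data, bin_size, total_length):
    """Hypothetical variant of bin_times() that allows a shorter last bin."""
    times = np.zeros(total_length, dtype=bool)
    for start, stop in data:
        times[start:stop] = True

    # Start index of every bin: 0, bin_size, 2 * bin_size, ...
    edges = np.arange(0, total_length, bin_size)

    # Count the occupied units in each bin. The flags are cast to int
    # because adding booleans would only give True/False, not a count.
    covered = np.add.reduceat(times.astype(int), edges)

    # Width of each bin; the last one may be shorter than bin_size.
    widths = np.diff(np.append(edges, total_length))

    return (100 * covered / widths).tolist()

print(bin_times_uneven([(0, 1), (3, 8)], 4, 10))
```

With a bin size of 4 and a total length of 10, the last bin covers only two time units, so its percentage is computed out of 2 rather than 4. If a short last bin should instead count against the full bin_size, divide by bin_size in the final line.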
