Hi there,

I have a time-series dataset with a binary variable (X) that is 0 most of the time and 1 otherwise. I need to generate a new variable (X_new) based on X that flags only the first observation with X = 1 within an interval of [-10,+10] around it (so any other observations with X = 1 in that interval are ignored). I also need a second variable (T) that gives the “distance” in [-10,+10] to the event (i.e., to the observation where X_new is 1). The intervals in T must not overlap (i.e., there need to be at least 20 observations between the 1s in X_new).

Example of the data (the last two columns, X_new and T, are the ones I need to generate):
date X X_new T
01.01.2019 0 0 .
02.01.2019 0 0 .
03.01.2019 0 0 .
04.01.2019 0 0 -10
05.01.2019 0 0 -9
06.01.2019 0 0 -8
07.01.2019 0 0 -7
08.01.2019 0 0 -6
09.01.2019 0 0 -5
10.01.2019 0 0 -4
11.01.2019 0 0 -3
12.01.2019 0 0 -2
13.01.2019 0 0 -1
14.01.2019 1 1 0
15.01.2019 0 0 1
16.01.2019 0 0 2
17.01.2019 0 0 3
18.01.2019 1 0 4
19.01.2019 0 0 5
20.01.2019 0 0 6
21.01.2019 1 0 7
22.01.2019 0 0 8
23.01.2019 0 0 9
24.01.2019 0 0 10
25.01.2019 0 0 .
26.01.2019 0 0 -10
27.01.2019 1 0 -9
28.01.2019 0 0 -8
29.01.2019 0 0 -7
30.01.2019 0 0 -6
31.01.2019 0 0 -5
01.02.2019 0 0 -4
02.02.2019 0 0 -3
03.02.2019 0 0 -2
04.02.2019 0 0 -1
05.02.2019 1 1 0
06.02.2019 0 0 1
07.02.2019 0 0 2
08.02.2019 0 0 3
09.02.2019 0 0 4
10.02.2019 0 0 5
11.02.2019 0 0 6
12.02.2019 0 0 7
13.02.2019 0 0 8
14.02.2019 0 0 9
15.02.2019 0 0 10

Does anybody have an idea how to solve this in a fast way?
I managed to do this in a loop, but since I need to run this on several GB of data, this will take weeks. :/
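
To make the intended logic concrete, here is a minimal sketch in Python with NumPy. It is not the loop mentioned above, only one possible reading of the rule that reproduces the example table. It assumes a single series already sorted by date, with distance counted in observations rather than calendar days, and missing T (".") represented as NaN. The names `mark_events` and `half_width` are made up for illustration. The only sequential part runs over the positions where X is 1, which are rare, and T is then filled with array indexing.

```python
import numpy as np

def mark_events(x, half_width=10):
    """Return (x_new, t) for a 0/1 array x (hypothetical helper).

    A 1 in x is kept as an event only if it lies at least
    2 * half_width + 1 positions after the previously kept event,
    so the [-half_width, +half_width] windows never overlap.
    """
    x = np.asarray(x)
    n = len(x)
    min_gap = 2 * half_width + 1

    # Sequential pass over the positions of the 1s only (few compared to n).
    kept = []
    last = -min_gap  # guarantees that the first 1 is always kept
    for i in np.flatnonzero(x == 1):
        if i - last >= min_gap:
            kept.append(i)
            last = i
    kept = np.array(kept, dtype=np.int64)

    x_new = np.zeros(n, dtype=np.int8)
    x_new[kept] = 1

    # Build every window at once: one row per event, one column per offset.
    offsets = np.arange(-half_width, half_width + 1)
    idx = kept[:, None] + offsets[None, :]
    off = np.broadcast_to(offsets, idx.shape)
    valid = (idx >= 0) & (idx < n)  # clip windows at the edges of the series

    t = np.full(n, np.nan)  # NaN stands in for "." (assumption)
    t[idx[valid]] = off[valid]
    return x_new, t
```

With the example above (1s on 14.01, 18.01, 21.01, 27.01 and 05.02), this keeps only 14.01 and 05.02 and leaves T missing on 01.01 to 03.01 and on 25.01. If the data contains several series, the function would presumably have to be applied to each one separately.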

Thanks a lot!