I want to remove repeat blood tests from my dataset when they are taken within 60 minutes of an initial blood test, with two caveats. If the initial test is too high or too low but the repeat within 10 minutes is normal, I would like to drop the initial test and all subsequent tests (observations) for 60 minutes, keeping only the 10-minute repeat. If the repeat within 10 minutes is also too high or too low, I would like to keep the initial test and drop the repeat and all subsequent tests (observations) within 60 minutes of the initial test.
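To make the rule concrete, here is a small made-up example of the intended outcome. Everything in it is hypothetical and not taken from the real data: the variable names `abnormal` (1 if the result is too high or too low) and `want` (1 for rows the rule should keep, filled in by hand), and all values. It assumes `datetime` is in milliseconds, and that the 60 minutes are always counted from the initial test.

```stata
* Hypothetical toy data illustrating the rule; want is filled in by hand.
* Patient A: abnormal initial test, normal repeat at 5 min   -> keep the repeat;
*            the test at 70 min starts a new window          -> keep it too.
* Patient B: abnormal initial test, abnormal repeat at 8 min -> keep the initial test.
* Patient C: no repeat within 10 min                         -> keep the initial test.
clear
input str2 ptid double datetime byte abnormal byte want
"A"       0 1 0
"A"  300000 0 1
"A" 1800000 0 0
"A" 4200000 0 1
"B"       0 1 1
"B"  480000 1 0
"B" 2400000 0 0
"C"       0 0 1
"C" 1200000 0 0
end

list, sepby(ptid)
```

Only the rows with `want == 1` should remain after the clean-up.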

I have used the code below to identify patients with a repeat test within 10 minutes of an initial test. Unfortunately, the repeat60 variable also flags every test taken within 60 minutes of an initial test, whereas I only want to flag those that follow a pair of tests taken within 10 minutes of each other.

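* Start from a clean slate; capture ignores the error if a variable does not exist
capture drop repeat10
capture drop repeat60
gen repeat10 = "."
gen repeat60 = "."

*** identify tests 10 min apart ***
* datetime is assumed to be in milliseconds (Stata %tc), hence the /1000 to get seconds
sort ptid datetime
levelsof ptid, local(ptlist)
foreach id of local ptlist {
    levelsof datetime if ptid == "`id'", local(dt)
    * d1 is the current initial test; the starting value is well below the datetimes in the data
    local d1 = 19200
    foreach d of local dt {
        local d2 = `d'
        * test within 10 minutes (600 seconds) of the initial test
        if (`d2' - `d1')/1000 < 600 {
            replace repeat10 = "repeat test" if datetime == `d2' & ptid == "`id'"
            replace repeat10 = "initial test" if datetime == `d1' & ptid == "`id'"
            replace repeat60 = "initial test" if datetime == `d1' & ptid == "`id'"
        }
        * test within 60 minutes (3600 seconds) of the initial test
        if (`d2' - `d1')/1000 < 3600 {
            replace repeat60 = "repeat test" if datetime == `d2' & ptid == "`id'"
            replace repeat60 = "initial test 1" if datetime == `d1' & ptid == "`id'"
        }
        else {
            * more than 60 minutes have passed: this test becomes the new initial test
            local d1 = `d'
        }
    }
}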

This produces the following data:

ptid      testdatetime      datetime_meas  repeat10      repeat60
P1000001  7/3/2020 6:30     1.909e+12      .             .
P1000001  7/4/2020 6:53     1.909e+12      .             .
P1000001  7/5/2020 5:31     1.910e+12      .             .
P1000001  7/5/2020 6:51     1.910e+12      .             .
P1000001  7/6/2020 5:21     1.910e+12      .             .
P1000001  7/6/2020 8:10     1.910e+12      .             .
P1000001  7/7/2020 7:11     1.910e+12      .             .
P1000001  7/7/2020 8:32     1.910e+12      .             initial test 1
P1000001  7/7/2020 8:58     1.910e+12      .             repeat test
P1000001  7/8/2020 1:41     1.910e+12      .             .
P1000001  7/16/2020 4:54    1.910e+12      .             .
P1000001  7/16/2020 11:19   1.911e+12      .             .
P1000001  10/12/2020 4:56   1.918e+12      .             .
P1000001  10/12/2020 7:08   1.918e+12      .             .
P1000001  10/13/2020 7:01   1.918e+12      .             .
P1000002  7/14/2020 6:53    1.910e+12      .             initial test 1
P1000002  7/14/2020 7:09    1.910e+12      .             repeat test
P1000002  7/14/2020 7:40    1.910e+12      .             repeat test
P1000002  7/14/2020 8:23    1.910e+12      .             initial test 1
P1000002  7/14/2020 9:11    1.910e+12      .             repeat test
P1000002  7/15/2020 4:46    1.910e+12      .             .
P1000002  7/15/2020 5:51    1.910e+12      .             .
P1000002  7/15/2020 9:38    1.910e+12      .             .
P1000002  7/15/2020 12:04   1.910e+12      .             .
P1000002  7/16/2020 4:25    1.910e+12      .             .
P1000002  7/16/2020 6:53    1.911e+12      .             .
P1000002  7/16/2020 11:05   1.911e+12      .             .
P1000002  7/17/2020 8:00    1.911e+12      .             .
P1000002  7/18/2020 4:34    1.911e+12      .             .
P1000002  7/18/2020 6:24    1.911e+12      .             .
P1000002  7/18/2020 10:07   1.911e+12      initial test  initial test
P1000002  7/18/2020 10:09   1.911e+12      repeat test   repeat test

From here I can create one or two binary variables indicating whether the initial test and the repeat test are out of range, but I am still unable to flag all tests within 60 minutes of those specific 10-minute repeat tests.
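One possible way to flag every test in the 60-minute window that follows an initial test is to carry the start time of the window forward within each patient, and then check whether the second test in the window falls within 10 minutes of the first. The following is only a minimal sketch on made-up data, not a tested solution for the real dataset. The names `abnormal`, `winstart`, `has10`, and `tokeep` are hypothetical, and it assumes that `datetime` is a double in milliseconds, that the 60 minutes are counted from the initial test, and that only the first repeat in a window matters.

```stata
* Hypothetical toy data: abnormal = 1 if the result is too high / too low
clear
input str2 ptid double datetime byte abnormal
"A"       0 1
"A"  300000 0
"A" 1800000 0
"A" 4200000 0
"B"       0 1
"B"  480000 1
"B" 2400000 0
end

* Start time of the 60-minute window each test belongs to.
* replace works through the observations in order, so winstart[_n-1]
* is already filled in when the current row is evaluated.
bysort ptid (datetime): gen double winstart = datetime if _n == 1
by ptid: replace winstart = cond(datetime - winstart[_n-1] < 3600000, winstart[_n-1], datetime) if _n > 1

* Within a window, row 1 is the initial test and row 2 the first repeat (if any).
* has10 flags every row of a window whose first repeat came within 10 minutes.
bysort ptid winstart (datetime): gen byte has10 = (_N > 1) & ((datetime[2] - datetime[1]) < 600000)

* Keep the initial test, unless it is abnormal and the 10-minute repeat is normal
by ptid winstart: gen byte tokeep = (_n == 1)
by ptid winstart: replace tokeep = (_n == 2) if has10 & abnormal[1] == 1 & abnormal[2] == 0

list, sepby(ptid)
```

After checking the flags, `keep if tokeep` would drop the unwanted rows. Because everything is done with `by` groups rather than an explicit loop over patients, this kind of approach should also scale better to many rows.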

As an aside, the above loop takes ages to run, as it cycles through tens of thousands of rows of data.
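Much of the run time comes from issuing several `replace` commands over the whole dataset for every single datetime. A common alternative in Stata is to compute time gaps with `by` groups and subscripts. The sketch below uses made-up data and hypothetical variable names (`gap_prev`, `rep10`, `init10`), and assumes `datetime` is in milliseconds. Note that it compares each test with the test immediately before it, which is not quite the same as the loop above, where every test is compared with the current initial test.

```stata
* Hypothetical toy data, datetime in milliseconds
clear
input str2 ptid double datetime
"P1"       0
"P1"  120000
"P1" 1800000
"P1" 7200000
"P2"       0
"P2"  960000
end

* Gap in seconds to the previous test of the same patient
* (missing for each patient's first test)
bysort ptid (datetime): gen double gap_prev = (datetime - datetime[_n-1]) / 1000

* rep10: test taken less than 10 minutes after the previous one.
* Missing gaps count as "not less than 600", so first tests get 0.
by ptid: gen byte rep10 = gap_prev < 600

* init10: test that is followed by such a repeat
by ptid: gen byte init10 = rep10[_n+1] == 1

list, sepby(ptid)
```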

I have tried looking at the data as a time series, but I am not confident enough with the commands, or even sure whether that approach would work.

Any help would be hugely appreciated.