Dear Statalist,

I am working to count a range of medication-related problems using an administrative claims database. One of the clinical indicators that I am trying to count has me stuck and I am looking for advice on the most efficient way to calculate. The dateset I am currently using is quite small, but I am also planning to do the same thing in a large dataset.

The data is in long form, each observation represents one supply of a particular medication "drugname" for an individual (identified by "study_id") on a particular date (date of supply "dos"). The dos is in %td format. I can classify the medications with indicators if they belong to a particular class of medication eg. "sedating".

I am using Stata 13.

Example data is shown below:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str5 study_id str80 drugname float(dos sedate)
"BI301" "paracetamol + codeine" 21271 1
"BI301" "paracetamol + codeine" 21285 1
"BI301" "paracetamol + codeine" 21378 1
"BI301" "paracetamol + codeine" 21384 1
"BI301" "paracetamol + codeine" 21488 1
"BI301" "paracetamol + codeine" 21554 1
"BI302" "fluoxetine"            21106 .
"BI302" "fluoxetine"            21145 .
"BI302" "fluoxetine"            21188 .
"BI302" "fluoxetine"            21263 .
"BI302" "fluoxetine"            21351 .
"BI302" "fluoxetine"            21416 .
"BI302" "fluoxetine"            21460 .
"BI302" "fluoxetine"            21525 .
"BI302" "fluoxetine"            21613 .
"BI302" "buprenorphine"         21074 1
"BI302" "buprenorphine"         21092 1
"BI302" "buprenorphine"         21106 1
"BI302" "fluoxetine"            21106 .
"BI302" "fluoxetine"            21145 .
end
format %td dos
I am trying to count the number of times the following occurs: "Concurrent use of three or more sedating medications". I am defining "concurrent use" as at least one supply of three different sedating medications (ie. three distinct drugnames) within a 90 day time period (this takes into consideration the pack sizes of the medications of interest).

I have managed to calculate concurrent use of two or more different medications within a category by exploiting the time difference operator in Stata to create a date difference:

gen dd=dos-dos[_n-1]
by study_id sedate (dos), sort: gen time=dd + dd[_n+1] if _n>=2 & sedate==1 & drugname[_n]!=drugname[_n+1]

I can then generate an indicator to specify if the time difference is less than or equal to 90 days.

However, I haven't managed to figure out how to expand this to three medications in the same class (sedating).

One option could be a loop counting events in intervals eg.: a variation on the documentation
Stata tip 51: Events in intervals by Nick Cox
The Stata Journal (2007)
7, Number 3, pp. 440-443


quietly forval i = 1/‘= _N’ {
count if inrange(sys_bp, 120, .) & ///
id == id[‘i’] & ///
inrange(date[‘i’] - date, 1, 30)
gen n_high_bp = r(N) in ‘i’
}

However, if at all possible, I would like to avoid a loop as it will be inefficient for use in the large dataset.

I am quite sure there is a relatively simple solution to my problem, any advice on how to get there faster would be greatly appreciated!

Many thanks,

Jean.