I have a dataset on legal cases and individual hearings for each case, across multiple courts. I have set it up as a survival time dataset in order to estimate the factors that contribute to case disposition time, using a Cox PH Model. I want to control for the workload of each court, since a higher workload will undoubtedly affect the speed of disposal. I have a dummy, dispvar, as the failure variable, and the time in years, duration_year, as the time variable.

Each individual observation is a a single hearing, identified by a date stored in datetime format:


Code:
* Example generated by -dataex-.    To install: ssc install dataex
clear
input float(unidcode judge_code)    double(businessondate_sif datefiled_sif)    float(lasthearingdate_sif    newcourt    oldtreatmentcourt    afternc    casetype_code    dispvar    duration_year)
351040 1 16469 14845 16469 0 0 0    42 1  4.449315
350965 1 17616 17429 17616 0 0 0    42 1 .51232874
342692 1 17723 17555 17723 0 0 0    36 1   .460274
350915 1 17835 17695 17835 0 0 0    42 1  .3835616
269514 1 17927 17739 21348 0 0 0    28 0  .5150685
342498 1 18592 17468 18592 0 0 0    36 1  3.079452
247489 1 18913 18637 18913 0 0 0    25 1  .7561644
344091 1 19170 18253 19170 0 0 0    36 1  2.512329
265946 1 19227 18744 19227 0 0 0    28 1 1.3232877
265459 1 19419 18331 19419 0 0 0    28 1  2.980822
end
format %td businessondate_sif
format %td datefiled_sif
format %td lasthearingdate_sif

How do I create a variable that will enable me to to control for the number of legal cases pending (which have not yet been disposed of) on a given hearing date?

Thank you in advance to anybody who can help me with this.