Hi there,

I am using Stata/MP 15.1.

I would like to estimate the sample attrition in my panel dataset.

I have observations for a large pool of individuals across four years: 2015, 2016, 2017, and 2018.

I have seen that xtdescribe presents this nicely. However, because I declared my data as panel data without specifying a time variable, the xtdescribe command does not work here.

I could not specify a time variable because I have multiple observations per individual within each year, and since I do not intend to use time-series operators such as lags and leads, I was advised that xtset id without the timevar would be fine.
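For reference, this is roughly how I have set things up (illustrative only, not a working solution):

Code:
* declare the panel dimension only; no time variable, since each id has several records per year
xtset id

* xtdescribe then does not run, because no time variable has been set
xtdescribe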

An example of my data is below:


Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long id float(date diary_day tran tran_freq) double amnt int pi
100001 2016 0 2 2               1000  .
100001 2016 0 . 2                  .  .
100001 2016 0 1 2                  5  .
100001 2016 1 . .                  .  .
100001 2016 2 3 4                127  2
100001 2016 2 2 4                820  3
100001 2016 2 1 4               30.5  2
100001 2016 2 . 4                  .  .
100001 2016 2 4 4 30.400000000000002  2
100001 2016 3 1 1                820  3
100001 2016 3 . 1                  .  .
100001 2017 0 . .                  .  .
100001 2017 1 1 1                127  3
100001 2017 2 1 1                 40  3
100001 2017 3 1 1                 35  1
100001 2018 0 . .                  .  .
100001 2018 1 . .                  .  .
100001 2018 2 1 2                 10  1
100001 2018 2 2 2              89.23  3
100001 2018 3 . .                  .  .
100002 2017 0 . .                  .  .
100002 2017 1 1 1              25.35  4
100002 2017 2 1 1                120  4
100002 2017 3 1 1  6.140000000000001  1
100003 2016 0 1 1               1623  .
100003 2016 0 . 1                  .  .
100003 2016 1 . 8                  .  .
100003 2016 1 3 8                500 10
100003 2016 1 1 8                 20 11
100003 2016 1 6 8                 20  .
100003 2016 1 5 8                150  .
100003 2016 1 2 8                  2  1
100003 2016 1 4 8              34.15  4
100003 2016 1 7 8                 25 10
100003 2016 1 8 8                 20  .
100003 2016 2 . 2                  .  .
100003 2016 2 2 2               17.5 10
100003 2016 2 1 2                 61  4
100003 2016 3 . 1                  .  .
100003 2016 3 1 1                  2  1
100003 2017 0 . .                  .  .
100003 2017 1 2 3               7.45  4
100003 2017 1 1 3              12.99  4
100003 2017 1 3 3                 15  4
100003 2017 2 . .                  .  .
100003 2017 3 2 4              19.72  4
100003 2017 3 1 4              93.97  3
100003 2017 3 4 4            1376.33  .
100003 2017 3 3 4              23.89  4
100003 2018 0 . .                  .  .
100003 2018 1 5 8                  3  .
100003 2018 1 2 8 19.150000000000002  4
100003 2018 1 6 8                 40  .
100003 2018 1 8 8                 20  1
100003 2018 1 3 8             107.92  4
100003 2018 1 1 8              13.71  4
100003 2018 1 7 8                  6  1
100003 2018 1 4 8                 28  4
100003 2018 2 1 3               94.2  6
100003 2018 2 3 3              41.51  3
100003 2018 2 2 3              22.87  3
100003 2018 3 2 2 1696.1000000000001  .
100004 2017 0 . .                  .  .
100004 2017 1 1 1               3.48  4
100004 2017 2 3 4                579  6
100004 2017 2 4 4                505  .
100004 2017 2 1 4                597  2
100004 2017 3 2 5              74.84  4
100004 2017 3 4 5             389.74  .
100004 2017 3 3 5              92.01  6
100004 2017 3 1 5              92.01  2
100004 2017 3 5 5             389.73  .
100004 2018 0 . .                  .  .
100004 2018 1 1 2                123  2
100004 2018 1 2 2                123  2
100004 2018 2 1 2                 12  4
100004 2018 2 2 2                  7  4
100004 2018 3 1 2                  5  4
100004 2018 3 2 2                 40  4
100005 2015 0 . .                  .  .
100005 2015 1 1 1             100.41  3
100005 2015 1 . 1                  .  .
100005 2015 2 1 1              35.81  3
100005 2015 2 . 1                  .  .
100005 2015 3 1 4               6.53  3
100005 2015 3 3 4                 14  .
100005 2015 3 2 4                  3  1
100005 2015 3 4 4                 37  3
100005 2015 3 . 4                  .  .
100005 2016 0 . .                  .  .
100005 2016 1 1 1              516.5  3
100005 2016 1 . 1                  .  .
100005 2016 2 . .                  .  .
100005 2016 3 . .                  .  .
100007 2015 0 . .                  .  .
100007 2015 1 1 1                  5  1
100007 2015 1 . 1                  .  .
100007 2015 2 . 1                  .  .
100007 2015 2 1 1                 30  1
100007 2015 3 . 2                  .  .
end
label values pi pi_l
label def pi_l 1 "1 Cash", modify
label def pi_l 2 "2 Check", modify
label def pi_l 3 "3 Credit card", modify
label def pi_l 4 "4 Debit card", modify
label def pi_l 6 "6 Bank account number payment", modify
label def pi_l 10 "10 PayPal", modify
label def pi_l 11 "11 Account-to-account transfer", modify
This is transaction-level data: each observation represents a payment reported by an individual on a specific diary_day (0 to 3) within a given year.

Each individual has an identifier "id" and the year variable is "date".

Is there a way to estimate the attrition rate, perhaps via dummy variables?

Even if I cannot see it across the whole panel as xtdescribe would present it, perhaps there is a way to estimate the attrition rate from 2015 to 2016, then from 2016 to 2017, and so on. Presumably, though, this would not capture individuals who dropped out of the survey in a 'middle' year but came back in at a later date.
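To make the dummy-variable idea concrete, something like the rough sketch below is what I had in mind for a single pair of years. I am not sure it is the right approach, and the variable names in2016, in2017, and firstid are just placeholders I made up:

Code:
* flag whether an individual appears at all in 2016 and in 2017
bysort id: egen byte in2016 = max(date == 2016)
bysort id: egen byte in2017 = max(date == 2017)

* tag one row per individual, since the data are transaction level
egen byte firstid = tag(id)

* share of 2016 respondents who do not appear in 2017
tab in2017 if firstid & in2016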

Thank you in advance for any help.

Jack