Hi there,
I am using Stata/MP 15.1.
I would like to estimate the sample attrition in my panel dataset.
I have observations for a large pool of individuals across four years: 2015, 2016, 2017 and 2018.
I have seen that xtdescribe presents this nicely. However, because I declared the data as a panel without specifying a time variable, xtdescribe does not work here.
I could not specify a time variable because I have multiple observations per individual in each year. As I do not intend to use time-series operators such as lags and leads, I was advised that xtset id without a timevar would be fine.
I will attach an example of my data below:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long id float(date diary_day tran tran_freq) double amnt int pi
100001 2016 0 2 2 1000 .
100001 2016 0 . 2 . .
100001 2016 0 1 2 5 .
100001 2016 1 . . . .
100001 2016 2 3 4 127 2
100001 2016 2 2 4 820 3
100001 2016 2 1 4 30.5 2
100001 2016 2 . 4 . .
100001 2016 2 4 4 30.400000000000002 2
100001 2016 3 1 1 820 3
100001 2016 3 . 1 . .
100001 2017 0 . . . .
100001 2017 1 1 1 127 3
100001 2017 2 1 1 40 3
100001 2017 3 1 1 35 1
100001 2018 0 . . . .
100001 2018 1 . . . .
100001 2018 2 1 2 10 1
100001 2018 2 2 2 89.23 3
100001 2018 3 . . . .
100002 2017 0 . . . .
100002 2017 1 1 1 25.35 4
100002 2017 2 1 1 120 4
100002 2017 3 1 1 6.140000000000001 1
100003 2016 0 1 1 1623 .
100003 2016 0 . 1 . .
100003 2016 1 . 8 . .
100003 2016 1 3 8 500 10
100003 2016 1 1 8 20 11
100003 2016 1 6 8 20 .
100003 2016 1 5 8 150 .
100003 2016 1 2 8 2 1
100003 2016 1 4 8 34.15 4
100003 2016 1 7 8 25 10
100003 2016 1 8 8 20 .
100003 2016 2 . 2 . .
100003 2016 2 2 2 17.5 10
100003 2016 2 1 2 61 4
100003 2016 3 . 1 . .
100003 2016 3 1 1 2 1
100003 2017 0 . . . .
100003 2017 1 2 3 7.45 4
100003 2017 1 1 3 12.99 4
100003 2017 1 3 3 15 4
100003 2017 2 . . . .
100003 2017 3 2 4 19.72 4
100003 2017 3 1 4 93.97 3
100003 2017 3 4 4 1376.33 .
100003 2017 3 3 4 23.89 4
100003 2018 0 . . . .
100003 2018 1 5 8 3 .
100003 2018 1 2 8 19.150000000000002 4
100003 2018 1 6 8 40 .
100003 2018 1 8 8 20 1
100003 2018 1 3 8 107.92 4
100003 2018 1 1 8 13.71 4
100003 2018 1 7 8 6 1
100003 2018 1 4 8 28 4
100003 2018 2 1 3 94.2 6
100003 2018 2 3 3 41.51 3
100003 2018 2 2 3 22.87 3
100003 2018 3 2 2 1696.1000000000001 .
100004 2017 0 . . . .
100004 2017 1 1 1 3.48 4
100004 2017 2 3 4 579 6
100004 2017 2 4 4 505 .
100004 2017 2 1 4 597 2
100004 2017 3 2 5 74.84 4
100004 2017 3 4 5 389.74 .
100004 2017 3 3 5 92.01 6
100004 2017 3 1 5 92.01 2
100004 2017 3 5 5 389.73 .
100004 2018 0 . . . .
100004 2018 1 1 2 123 2
100004 2018 1 2 2 123 2
100004 2018 2 1 2 12 4
100004 2018 2 2 2 7 4
100004 2018 3 1 2 5 4
100004 2018 3 2 2 40 4
100005 2015 0 . . . .
100005 2015 1 1 1 100.41 3
100005 2015 1 . 1 . .
100005 2015 2 1 1 35.81 3
100005 2015 2 . 1 . .
100005 2015 3 1 4 6.53 3
100005 2015 3 3 4 14 .
100005 2015 3 2 4 3 1
100005 2015 3 4 4 37 3
100005 2015 3 . 4 . .
100005 2016 0 . . . .
100005 2016 1 1 1 516.5 3
100005 2016 1 . 1 . .
100005 2016 2 . . . .
100005 2016 3 . . . .
100007 2015 0 . . . .
100007 2015 1 1 1 5 1
100007 2015 1 . 1 . .
100007 2015 2 . 1 . .
100007 2015 2 1 1 30 1
100007 2015 3 . 2 . .
end
label values pi pi_l
label def pi_l 1 "1 Cash", modify
label def pi_l 2 "2 Check", modify
label def pi_l 3 "3 Credit card", modify
label def pi_l 4 "4 Debit card", modify
label def pi_l 6 "6 Bank account number payment", modify
label def pi_l 10 "10 PayPal", modify
label def pi_l 11 "11 Account-to-account transfer", modify
This is transaction-level data: each observation represents an individual reporting a payment on a specific 'diary_day' (from 0 to 3) in a given year.
Each individual has an identifier "id" and the year variable is "date".
Is there a way to estimate the attrition rate, perhaps via dummy variables?
Even if I cannot see it across the whole panel as in xtdescribe, perhaps there is a way to estimate the attrition rate from 2015 to 2016, then from 2016 to 2017, and so on. Presumably, however, this will not be helpful if some individuals dropped out of the survey in, say, one 'middle' year and then came back at a later date.
Thank you in advance for any help.
Jack
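One possible approach, offered as a sketch rather than a definitive answer: temporarily reduce the data to one row per person-year, so that id and date together identify observations uniquely and a time variable can be declared. The variable name gone_next is made up for illustration, and the code assumes the latest year in the data is the final wave of the survey.

```stata
* Sketch only: work on a temporary person-year copy of the data
preserve

* One observation per id-date pair (contract also adds _freq, the row count)
contract id date

* id and date now identify observations uniquely, so a timevar is allowed
xtset id date
xtdescribe                 // participation patterns, including gaps and re-entry

* Assumption: the maximum of date is the last wave, so nobody can attrit after it
summarize date, meanonly
local lastwave = r(max)

* gone_next (hypothetical name): 1 if the person has no record in the following year
bysort id (date): generate byte gone_next = date[_n+1] != date + 1
replace gone_next = . if date == `lastwave'

* Row percentages give the share of each year's participants absent the next year
tabulate date gone_next, row

* Bring back the original transaction-level data and its xtset id setting
restore
```

Note that gone_next equals 1 both for permanent dropouts and for people who skip a year and return later; the pattern column of xtdescribe is what distinguishes the two cases.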