Hi there,
I am using StataMP 15.1.
I would like to estimate the sample attrition in my panel dataset.
I have observations for a large pool of individuals across four years: 2015, 2016, 2017 and 2018.
I have seen that xtdescribe presents this nicely. However, because I declared my data to Stata as a panel without specifying a time variable, the xtdescribe command does not work here.
I could not specify a time variable because I have multiple observations per individual for each year. As I do not intend to use time-series operators such as lags and leads, I was advised that xtset id would be fine without a timevar.
I will attach an example of my data below:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long id float(date diary_day tran tran_freq) double amnt int pi
100001 2016 0 2 2 1000 .
100001 2016 0 . 2 . .
100001 2016 0 1 2 5 .
100001 2016 1 . . . .
100001 2016 2 3 4 127 2
100001 2016 2 2 4 820 3
100001 2016 2 1 4 30.5 2
100001 2016 2 . 4 . .
100001 2016 2 4 4 30.400000000000002 2
100001 2016 3 1 1 820 3
100001 2016 3 . 1 . .
100001 2017 0 . . . .
100001 2017 1 1 1 127 3
100001 2017 2 1 1 40 3
100001 2017 3 1 1 35 1
100001 2018 0 . . . .
100001 2018 1 . . . .
100001 2018 2 1 2 10 1
100001 2018 2 2 2 89.23 3
100001 2018 3 . . . .
100002 2017 0 . . . .
100002 2017 1 1 1 25.35 4
100002 2017 2 1 1 120 4
100002 2017 3 1 1 6.140000000000001 1
100003 2016 0 1 1 1623 .
100003 2016 0 . 1 . .
100003 2016 1 . 8 . .
100003 2016 1 3 8 500 10
100003 2016 1 1 8 20 11
100003 2016 1 6 8 20 .
100003 2016 1 5 8 150 .
100003 2016 1 2 8 2 1
100003 2016 1 4 8 34.15 4
100003 2016 1 7 8 25 10
100003 2016 1 8 8 20 .
100003 2016 2 . 2 . .
100003 2016 2 2 2 17.5 10
100003 2016 2 1 2 61 4
100003 2016 3 . 1 . .
100003 2016 3 1 1 2 1
100003 2017 0 . . . .
100003 2017 1 2 3 7.45 4
100003 2017 1 1 3 12.99 4
100003 2017 1 3 3 15 4
100003 2017 2 . . . .
100003 2017 3 2 4 19.72 4
100003 2017 3 1 4 93.97 3
100003 2017 3 4 4 1376.33 .
100003 2017 3 3 4 23.89 4
100003 2018 0 . . . .
100003 2018 1 5 8 3 .
100003 2018 1 2 8 19.150000000000002 4
100003 2018 1 6 8 40 .
100003 2018 1 8 8 20 1
100003 2018 1 3 8 107.92 4
100003 2018 1 1 8 13.71 4
100003 2018 1 7 8 6 1
100003 2018 1 4 8 28 4
100003 2018 2 1 3 94.2 6
100003 2018 2 3 3 41.51 3
100003 2018 2 2 3 22.87 3
100003 2018 3 2 2 1696.1000000000001 .
100004 2017 0 . . . .
100004 2017 1 1 1 3.48 4
100004 2017 2 3 4 579 6
100004 2017 2 4 4 505 .
100004 2017 2 1 4 597 2
100004 2017 3 2 5 74.84 4
100004 2017 3 4 5 389.74 .
100004 2017 3 3 5 92.01 6
100004 2017 3 1 5 92.01 2
100004 2017 3 5 5 389.73 .
100004 2018 0 . . . .
100004 2018 1 1 2 123 2
100004 2018 1 2 2 123 2
100004 2018 2 1 2 12 4
100004 2018 2 2 2 7 4
100004 2018 3 1 2 5 4
100004 2018 3 2 2 40 4
100005 2015 0 . . . .
100005 2015 1 1 1 100.41 3
100005 2015 1 . 1 . .
100005 2015 2 1 1 35.81 3
100005 2015 2 . 1 . .
100005 2015 3 1 4 6.53 3
100005 2015 3 3 4 14 .
100005 2015 3 2 4 3 1
100005 2015 3 4 4 37 3
100005 2015 3 . 4 . .
100005 2016 0 . . . .
100005 2016 1 1 1 516.5 3
100005 2016 1 . 1 . .
100005 2016 2 . . . .
100005 2016 3 . . . .
100007 2015 0 . . . .
100007 2015 1 1 1 5 1
100007 2015 1 . 1 . .
100007 2015 2 . 1 . .
100007 2015 2 1 1 30 1
100007 2015 3 . 2 . .
end
label values pi pi_l
label def pi_l 1 "1 Cash", modify
label def pi_l 2 "2 Check", modify
label def pi_l 3 "3 Credit card", modify
label def pi_l 4 "4 Debit card", modify
label def pi_l 6 "6 Bank account number payment", modify
label def pi_l 10 "10 PayPal", modify
label def pi_l 11 "11 Account-to-account transfer", modify
This is transaction-level data: each observation represents an individual reporting a payment on a specific 'diary_day' (from 0 to 3) in a given year.
Each individual has an identifier, "id", and the year variable is "date".
Is there a way to estimate the attrition rate, perhaps via dummy variables?
Even if I cannot see it across the whole panel as in xtdescribe, perhaps there is a way to estimate the attrition rate from 2015 to 2016, then from 2016 to 2017, and so on. Presumably, however, this would not be helpful if some individuals dropped out of the survey in one 'middle' year and then came back in a later year.
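To make the idea concrete, here is a minimal sketch of one possible approach. It assumes that any row for an id in a given year counts as participation in that year, and the variable name stay is hypothetical (it is not in the data). The sketch temporarily reduces the data to one row per id and year, so that a time variable can be declared and xtdescribe can show the participation patterns, including people who skip a year and return.

```stata
* Sketch only: work on a temporary copy reduced to one row per id-year
preserve
keep id date
duplicates drop                 // id-date pairs are now unique

* With unique id-date pairs, a time variable can be declared
xtset id date
xtdescribe                      // participation patterns over 2015-2018

* stay (hypothetical name) = 1 if the same id is also observed in the next year
bysort id (date): generate byte stay = (date[_n+1] == date + 1)

* Row percentages: the stay == 0 share is the year-to-year dropout rate
tabulate date stay if date < 2018, row
restore
```

Note that with this definition a person who misses one year but returns later is counted as a dropout for that transition; the xtdescribe pattern table is what would reveal such gaps.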
Thank you in advance for any help.
Jack