Hi there,
I am using Stata/MP 15.1.
I would like to estimate the sample attrition in my panel dataset.
I have observations for a large pool of individuals across four years: 2015, 2016, 2017 and 2018.
I have seen that xtdescribe presents this nicely. However, because I declared my data as a panel without specifying a time variable, xtdescribe does not work here.
I could not specify a time variable because I have multiple observations per individual in each year. As I do not intend to use time-series operators such as lags and leads, I was advised that xtset id without a timevar would be fine.
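One possible workaround, sketched here under the assumption that only participation by year matters for this purpose: temporarily reduce the data to one row per id and date, so that a time variable can be declared and xtdescribe can run. The variable names id and date are those in my data; nothing here is a confirmed or recommended solution.

```stata
* Sketch only: work on a temporary copy with one row per person-year.
preserve
keep id date
duplicates drop            // leaves a single observation per id-date pair
xtset id date              // a timevar is now allowed (no repeated time values)
xtdescribe                 // shows participation patterns across 2015-2018
restore                    // return to the full transaction-level data
```

The preserve/restore pair means the transaction-level data are left untouched once the summary has been displayed.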
I will attach an example of my data below:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long id float(date diary_day tran tran_freq) double amnt int pi
100001 2016 0 2 2 1000 .
100001 2016 0 . 2 . .
100001 2016 0 1 2 5 .
100001 2016 1 . . . .
100001 2016 2 3 4 127 2
100001 2016 2 2 4 820 3
100001 2016 2 1 4 30.5 2
100001 2016 2 . 4 . .
100001 2016 2 4 4 30.400000000000002 2
100001 2016 3 1 1 820 3
100001 2016 3 . 1 . .
100001 2017 0 . . . .
100001 2017 1 1 1 127 3
100001 2017 2 1 1 40 3
100001 2017 3 1 1 35 1
100001 2018 0 . . . .
100001 2018 1 . . . .
100001 2018 2 1 2 10 1
100001 2018 2 2 2 89.23 3
100001 2018 3 . . . .
100002 2017 0 . . . .
100002 2017 1 1 1 25.35 4
100002 2017 2 1 1 120 4
100002 2017 3 1 1 6.140000000000001 1
100003 2016 0 1 1 1623 .
100003 2016 0 . 1 . .
100003 2016 1 . 8 . .
100003 2016 1 3 8 500 10
100003 2016 1 1 8 20 11
100003 2016 1 6 8 20 .
100003 2016 1 5 8 150 .
100003 2016 1 2 8 2 1
100003 2016 1 4 8 34.15 4
100003 2016 1 7 8 25 10
100003 2016 1 8 8 20 .
100003 2016 2 . 2 . .
100003 2016 2 2 2 17.5 10
100003 2016 2 1 2 61 4
100003 2016 3 . 1 . .
100003 2016 3 1 1 2 1
100003 2017 0 . . . .
100003 2017 1 2 3 7.45 4
100003 2017 1 1 3 12.99 4
100003 2017 1 3 3 15 4
100003 2017 2 . . . .
100003 2017 3 2 4 19.72 4
100003 2017 3 1 4 93.97 3
100003 2017 3 4 4 1376.33 .
100003 2017 3 3 4 23.89 4
100003 2018 0 . . . .
100003 2018 1 5 8 3 .
100003 2018 1 2 8 19.150000000000002 4
100003 2018 1 6 8 40 .
100003 2018 1 8 8 20 1
100003 2018 1 3 8 107.92 4
100003 2018 1 1 8 13.71 4
100003 2018 1 7 8 6 1
100003 2018 1 4 8 28 4
100003 2018 2 1 3 94.2 6
100003 2018 2 3 3 41.51 3
100003 2018 2 2 3 22.87 3
100003 2018 3 2 2 1696.1000000000001 .
100004 2017 0 . . . .
100004 2017 1 1 1 3.48 4
100004 2017 2 3 4 579 6
100004 2017 2 4 4 505 .
100004 2017 2 1 4 597 2
100004 2017 3 2 5 74.84 4
100004 2017 3 4 5 389.74 .
100004 2017 3 3 5 92.01 6
100004 2017 3 1 5 92.01 2
100004 2017 3 5 5 389.73 .
100004 2018 0 . . . .
100004 2018 1 1 2 123 2
100004 2018 1 2 2 123 2
100004 2018 2 1 2 12 4
100004 2018 2 2 2 7 4
100004 2018 3 1 2 5 4
100004 2018 3 2 2 40 4
100005 2015 0 . . . .
100005 2015 1 1 1 100.41 3
100005 2015 1 . 1 . .
100005 2015 2 1 1 35.81 3
100005 2015 2 . 1 . .
100005 2015 3 1 4 6.53 3
100005 2015 3 3 4 14 .
100005 2015 3 2 4 3 1
100005 2015 3 4 4 37 3
100005 2015 3 . 4 . .
100005 2016 0 . . . .
100005 2016 1 1 1 516.5 3
100005 2016 1 . 1 . .
100005 2016 2 . . . .
100005 2016 3 . . . .
100007 2015 0 . . . .
100007 2015 1 1 1 5 1
100007 2015 1 . 1 . .
100007 2015 2 . 1 . .
100007 2015 2 1 1 30 1
100007 2015 3 . 2 . .
end
label values pi pi_l
label def pi_l 1 "1 Cash", modify
label def pi_l 2 "2 Check", modify
label def pi_l 3 "3 Credit card", modify
label def pi_l 4 "4 Debit card", modify
label def pi_l 6 "6 Bank account number payment", modify
label def pi_l 10 "10 PayPal", modify
label def pi_l 11 "11 Account-to-account transfer", modify
This is transaction-level data: each observation represents an individual reporting a payment on a specific 'diary_day' (from 0 to 3) in a given year.
Each individual has an identifier "id" and the year variable is "date".
Is there a way to estimate the attrition rate, perhaps via dummy variables?
Even if I cannot see it across the whole panel as in xtdescribe, perhaps there is a way to estimate the attrition rate from 2015 to 2016, then from 2016 to 2017, and so on. This, however, will presumably not be helpful if some individuals dropped out of the survey in, say, one 'middle' year but then came back at a later date.
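To make the dummy-variable idea concrete, here is a rough sketch of what I have in mind. It assumes 2018 is the last wave, and the variable names dropped and reentry are hypothetical ones I made up for illustration; id and date are from my data. It is only an illustration of the idea, not a tested solution.

```stata
* Sketch only: one row per person-year, then flag year-to-year transitions.
preserve
keep id date
duplicates drop

* dropped (hypothetical name): 1 if the person has no record in the next year.
* Left missing for 2018, since there is no later wave to compare against.
bysort id (date): gen byte dropped = (date[_n+1] != date + 1) if date < 2018

* reentry (hypothetical name): 1 if the person reappears after skipping
* one or more years. Left missing for each person's first observed year.
bysort id (date): gen byte reentry = (date - date[_n-1] > 1) if _n > 1

tabulate date dropped, row   // row percentages give the attrition rate by year
tabulate date reentry        // counts of people coming back after a gap
restore
```

With this definition, someone who skips a 'middle' year counts as dropped in the year before the gap and is then flagged by reentry when they return, so the two tables would need to be read together.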
Thank you in advance for any help.
Jack