Hi, I'm trying to convert my daily panel data to monthly panel data using collapse function.
I'm sorry if I'm asking a similar question that have been posted somewhere.
Before I post this question, I searched for some posts and answers but couldn't find exact one that I wanted.
If someone knows similar Q&A, I would appreciate if you could provide me with a link.
First, here's the example data below.
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float(hotel_id time_daily rating num_bedroom) str8 cancel_policy float cancel_numeric 100 18263 5 2 "flexible" 0 100 18264 4 2 "flexible" 0 100 18265 5 2 "flexible" 0 100 18266 1 2 "flexible" 0 100 18267 1 2 "flexible" 0 100 18294 1 2 "flexible" 0 100 18295 2 2 "flexible" 0 100 18298 1 2 "flexible" 0 100 18299 2 2 "flexible" 0 102 18264 4 3 "flexible" 0 102 18265 5 3 "flexible" 0 102 18266 3 3 "flexible" 0 102 18267 4 3 "flexible" 0 102 18268 2 3 "flexible" 0 102 18269 3 3 "flexible" 0 102 18294 4 3 "flexible" 0 102 18295 4 3 "flexible" 0 102 18297 3 3 "flexible" 0 102 18300 2 3 "flexible" 0 102 18322 5 3 "flexible" 0 102 18323 4 3 "strict" 1 102 18324 5 3 "strict" 1 end format %tdMon_DD,_CCYY time_daily
I don't know why time_daily and time_monthly shows in that way but it will show up properly (e.g. Jan 01, 2010 or Jan 2010) when you copy and paste the code in Stata.
I created a monthly time variable based on daily time variable:
Code:
gen time_monthly=mofd(time_daily) format %tmMon_CCYY time_monthly
1.time-invariant variable (number of bedrooms): since this never changes over the time, I thought I could just use first value of id when collapsing the data to monthly.
2.time varying continuous variable (rating): since this changes over the time, I would calculate the average value of each variable by id and month.
3.time varying binary variable (cancellation policy;My question): cancellation policy variable corresponds to this type.
Let's say there are only two cases, whether it is flexible or strict. In addition, it may change from flexible to strict over time but it will never become flexible again once it becomes from flexible to strict.
How do I convert this one-time changing variable when collapsing from daily to monthly panel?
My plan is to have a new monthly cancellation policy variable which is equal to 1 if it has been changed within that month or 0 for the past months.
For example, for hotel_id 102, cancellation policy had been changed from flexible to strict on Mar 02, 2010. So, it will be 1 after Mar 2010 (Apr 2010, May 2010, ...) but 0 before Feb 2010 (Jan 2010, Dec 2009,...).
It looks like if I use "if condition", I can implement it but not clear...Can anyone help me out how to do this when using collapse command?
Code:
collapse (mean) avg_rating=rating (firstnm) num_bedroom_monthly=num_bedroom , by(hotel_id time_monthly)
The desired output will look like below:
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float(hotel_id time_monthly avg_rating num_bedroom_monthly month_cancel_policy) 100 600 3.2 2 0 100 601 1.5 2 0 102 600 3.5 3 0 102 601 3.25 3 0 102 602 4.6666665 3 1 end format %tmMon_CCYY time_monthly
0 Response to re-generate time varying variable when collapsing from daily panel data to monthly panel data
Post a Comment