I am relatively new to Stata and to statistics in general, so please excuse the rather basic question.
Currently, I am working on a difference-in-differences (DiD) estimation with panel data and I am struggling to set up the regression. There have been different posts on Statalist about this:
1) some suggest setting up the DiD with the "xtset" and then the "xtreg" command,
2) while others use the "diff" command.
Personally, I feel more comfortable with the xtset and xtreg commands because the set-up seems easier for me to understand, so I tried to apply it to my data. However, I am unsure how to declare my data as time-series data, and especially which variables to define as the "x" and the "t"/time variable, because the time variable in my dataset is in a weird format, with date and time in the same cell (03jan2013 20:46:16).
I have been trying to find a solution to this for a couple of days and would be really grateful to anyone who can help out. - Tim
Here is some description of the data:
It is about taxi trips picking up and dropping off people at exact GPS coordinates in New York. The DiD estimation is used to see if attending a concert (treatment) at the Philharmonic will lead to an increase in tipping percentage.
The first dummy variable "after_concert" is a before/after variable (30 min before the concert = 0, 30 min after the concert = 1) and the second dummy variable "near_concert" is the geographical location (more than 100 meters and up to 500 meters from the entrance of the concert hall = 0, within a 100-meter radius of the entrance of the concert hall = 1). "interaction" is the dummy variable for the interaction between "after_concert" and "near_concert".
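For illustration, the set-up described above can be sketched as a two-group, two-period DiD with plain regress and factor-variable notation, rather than xtreg or diff. This is only a minimal sketch: it assumes each row is one trip and uses the variable names from the description (tip_percentage, after_concert, near_concert, hack_license). The helper variable driver_id is hypothetical, and clustering the standard errors on the driver is an assumption, not something the data description requires.

```stata
* Numeric driver identifier built from the string hack_license
* (driver_id is a hypothetical helper name, not in the original data)
egen long driver_id = group(hack_license)

* The ## operator adds both main effects and their product, so the
* coefficient on 1.after_concert#1.near_concert is the DiD estimate.
* Clustering on the driver is an assumption made for this sketch.
regress tip_percentage i.after_concert##i.near_concert, vce(cluster driver_id)
```

With the hand-made dummy, `regress tip_percentage after_concert near_concert interaction, vce(cluster driver_id)` should give the same point estimates, provided "interaction" really is the product of the two dummies.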
----------------------- copy starting from the next line -----------------------
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str32(medallion hack_license) float tip_percentage double datetime_pickup float(date_pickup time_pickup) double datetime_dropoff float(date_dropoff time_dropoff after_concert near_concert interaction)
"00005007A9F30E289E760362F69E4EAD" "A9AE329EA1138052DAC8FDFD8BA86603" 25 1674683596000 19382 21.88778 1674684197000 19382 22.05472 1 1 1
"0009986BDBAB2F9A125FEF49D0BFCCDD" "44CED38841518B1FB5E25E3624610A38" 0 1673991833000 19374 21.73139 1673992330000 19374 21.869444 1 0 0
"001C8EC421C9BE57D08576617465401A" "107B02D25D1F14E6A8C806F96A9CDE62" 0 1672868671000 19361 21.741945 1672869589000 19361 21.996944 1 0 0
"0030AD2648D81EE87796445DB61FCF20" "C87F7E83E6032DB2B254EF24734FC4ED" 19.285713 1672868340000 19361 21.65 1672869180000 19361 21.883333 1 0 0
"003889E315BFDD985664FE5A4BCC0EC4" "18295C622647C5F8291F47CEE94A517A" 19.047619 1672869060000 19361 21.85 1672869540000 19361 21.983334 1 1 1
"003889E315BFDD985664FE5A4BCC0EC4" "18295C622647C5F8291F47CEE94A517A" 0 1673981940000 19374 18.983334 1673982360000 19374 19.1 0 1 0
"0038EF45118925A510975FD0CCD67192" "6AC19922A9FDA1FCF904B907DF8C553A" 0 1674683580000 19382 21.883333 1674683760000 19382 21.93333 1 0 0
"0053334C798EC6C8E637657962030F99" "9EF690115D60940E7A8039A67542642E" 0 1672859220000 19361 19.116667 1.6728594e+12 19361 19.166666 0 0 0
"0055B7428059CBA7F6176AA898075962" "6397F7B8CD9FDF005AB0B34B5541A524" 0 1673992729000 19374 21.98028 1673993783000 19374 22.273056 1 1 1
"005DED7D6E6C45441C26981DCFBED992" "4170B87A9B9EE6A9DC1B5515D8302AE1" 0 1672858860000 19361 19.016666 1672859520000 19361 19.2 0 0 0
end
format %tc datetime_pickup
format %td date_pickup
format %tc datetime_dropoff
format %td date_dropoff
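Regarding the combined date-and-time cell: in the listing above, datetime_pickup is a double with a %tc format, i.e. milliseconds since 01jan1960 00:00:00, so it can be split with Stata's datetime functions. The following is a minimal sketch of that step and of the xtset/xtreg route mentioned in the question. The names pickup_day, pickup_hour and cab_id are hypothetical, and treating the taxi (medallion) as the panel unit with fixed effects is an assumption made only for this example.

```stata
* Split the %tc datetime into a daily date and a time of day in hours
* (pickup_day and pickup_hour are hypothetical names; they should
*  correspond to the existing date_pickup and time_pickup)
generate long pickup_day = dofc(datetime_pickup)
format %td pickup_day
generate double pickup_hour = hh(datetime_pickup) + mm(datetime_pickup)/60 + ss(datetime_pickup)/3600

* xtset needs a numeric panel identifier; medallion is a string
* (cab_id is a hypothetical helper name)
egen long cab_id = group(medallion)

* Declaring only the panel variable is enough for xtreg; a time
* variable is needed only for time-series operators such as L. and D.
xtset cab_id

* DiD with taxi fixed effects (an assumption for this sketch)
xtreg tip_percentage i.after_concert##i.near_concert, fe vce(cluster cab_id)
```

Leaving the time variable out of xtset sidesteps the requirement that each panel unit has unique time values, which trip-level data may not satisfy.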