Using reghdfe command with if-statements

Hello, bit of a complex one here:

I’m currently working as a research assistant with my supervisor’s code. It uses employee-level data from a firm that “de-trashes” stock coming into its warehouse, i.e., removes transit packaging.
The code is designed to estimate productivity, measured in units [de-trashed] per minute (upm); the dependent variable in the code is named uph. It uses the reghdfe command, which fits a linear regression while absorbing multiple sets of fixed effects. It also includes an independent variable called PLANNED_UPH, a target that earns workers a bonus if they reach it.
The fixed effects used in the regression equation are:

fe3_j (SKU code i.e., product fixed effects)
fe3_i (worker fixed effects)
fe3_t (date fixed effects)
fe3_dow (day of week fixed effects)
fe3_shift (shift type fixed effects i.e., day, early or late shift)
fe3_h (hour of the day fixed effects)
fe3_handle (handling class fixed effects)
fe3_station (warehouse workstation fixed effects)
fe3_group (group of workers fixed effects)
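
Written out, the specification described above would look roughly like the following. This is only a sketch: the observation index o and the symbols are my own notation (not from the supervisor's code), with each superscript matching the suffix of the corresponding fe3_ variable.

```latex
\begin{equation}
\text{uph}_{o} = \beta\,\text{PLANNED\_UPH}_{o}
  + \phi^{j}_{j(o)} + \phi^{i}_{i(o)} + \phi^{t}_{t(o)}
  + \phi^{\text{dow}}_{\text{dow}(o)} + \phi^{\text{shift}}_{\text{shift}(o)}
  + \phi^{h}_{h(o)} + \phi^{\text{handle}}_{\text{handle}(o)}
  + \phi^{\text{station}}_{\text{station}(o)} + \phi^{\text{group}}_{\text{group}(o)}
  + \varepsilon_{o}
\end{equation}
```

Here j(o), i(o), t(o) and so on denote the SKU, worker, date, etc. that observation o belongs to.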

The code is as follows:

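* Regress uph on the bonus target, absorbing nine sets of fixed effects;
* the newvar=absvar syntax saves each estimated fixed effect as a variable
reghdfe uph PLANNED_UPH, ///
    absorb(fe3_j=SKU_ID fe3_i=user_code fe3_t=date_code fe3_dow=dow fe3_shift=shift_type fe3_h=HourDay1 ///
    fe3_handle=HANDLING_CLASS fe3_station=STATION_ID fe3_group=GROUP_ID)
* Attach "Yes" labels to the stored results for use in the results table
quietly estadd local controls "Yes"
quietly estadd local FE_t "Yes"
quietly estadd local FE_i "Yes"
quietly estadd local FE_j "Yes"
* Store the estimates under the name H3
est store H3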

The output (H3) is as follows:

HDFE Linear regression                  Number of obs   =  2,480,900
Absorbing 9 HDFE groups                 F(1, 2454358)   =       1.66
                                        Prob > F        =     0.1971
                                        R-squared       =     0.5447
                                        Adj R-squared   =     0.5398
                                        Within R-sq.    =     0.0000
                                        Root MSE        =     0.2292

        uph |      Coef.   Std. Err.        t    P>|t|    [95% Conf. Interval]
------------+------------------------------------------------------------------
PLANNED_UPH |  -2.25e-06    1.75e-06    -1.29    0.197    -5.68e-06    1.17e-06
      _cons |   .4962852     .002311   214.75    0.000     .4917558    .5008146

Absorbed degrees of freedom:

    Absorbed FE | Categories   Redundant   Num. Coefs
----------------+-------------------------------------
         SKU_ID |      25692           0        25692
      user_code |        567           1          566
      date_code |        232           1          231
            dow |          7           7            0
     shift_type |          3           1            2
       HourDay1 |          9           1            8
 HANDLING_CLASS |          2           2            0
     STATION_ID |         38           1           37
       GROUP_ID |          7           2            5

What I have been asked to do is, first, to split the data in half by date. I did this by creating two binary dummies, split1 and split2, which flag observations from the first and second halves of the year, respectively. I then have to run the same regression on the first half only and copy the estimated fixed-effect coefficients into the second-half subset of the data.
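
For reference, a minimal sketch of one way such split dummies could be built, assuming date_code is numeric and increases with calendar time (the ten-row dataset here is made up purely so the snippet runs on its own). It splits at the median date across observations; splitting at a fixed calendar date would work the same way with a different cutoff.

```stata
* Toy data: ten hypothetical date codes, 1 to 10
clear
set obs 10
generate date_code = _n

* Median date across observations, returned in r(p50)
summarize date_code, detail

* split1 flags the first half, split2 the second half
generate byte split1 = date_code <= r(p50)
generate byte split2 = 1 - split1
tabulate split1 split2
```

Because the cutoff is the median over rows, the two halves hold roughly equal numbers of observations rather than equal numbers of calendar days.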

To run the regression on the first half of the data, I thought of adding if-statements so that the regression would only run if split1==1. Then, for each user ID (worker), I could somehow copy the coefficients from split1 to split2 and run the code for split2 only. However, wherever I place the if-statement in the code, it returns errors. I’m grateful for any ideas, thanks.
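
A possible sketch of the two steps described above, not a definitive solution. In Stata syntax the if qualifier goes after the variable list and before the comma that starts the options, so it has to sit on the first line, ahead of absorb(). The data below are simulated and use only three of the nine fixed effects to keep the example short; the name fe3_i_h1 is hypothetical. reghdfe is community-contributed and needs to be installed first (ssc install reghdfe).

```stata
* Simulated stand-in for the real data (all values are made up)
clear
set seed 12345
set obs 2000
generate user_code   = ceil(20 * runiform())    // 20 hypothetical workers
generate SKU_ID      = ceil(50 * runiform())    // 50 hypothetical products
generate date_code   = ceil(100 * runiform())   // 100 hypothetical dates
generate PLANNED_UPH = 100 + 20 * rnormal()
generate uph         = 0.5 + 0.01 * user_code + rnormal()

* Split dummies: first half = dates 1-50, second half = dates 51-100
generate byte split1 = date_code <= 50
generate byte split2 = 1 - split1

* The if qualifier comes before the comma, not inside or after absorb()
reghdfe uph PLANNED_UPH if split1 == 1, ///
    absorb(fe3_j=SKU_ID fe3_i=user_code fe3_t=date_code)

* fe3_i is filled only for the estimation sample (first half).
* It is constant within worker there, so the within-worker mean
* spreads each worker's estimate to all of that worker's rows.
egen double fe3_i_h1 = mean(fe3_i), by(user_code)

* Check how many second-half rows received a first-half worker effect
count if split2 == 1 & !missing(fe3_i_h1)
```

Workers who appear only in the second half (or who were dropped from the first-half estimation sample) end up with a missing fe3_i_h1. The same egen line can be repeated for the other saved effects, with by(SKU_ID), by(STATION_ID) and so on. One caveat worth checking against the reghdfe documentation: with several sets of fixed effects, the individual estimates are only identified up to a normalization.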
