Hello everyone, we are trying to use propensity score matching (PSM) to find a suitable control group. Our data set is at the exporter (China)-product (HS6)-destination (EU)-year level and covers 2000-2013. It contains over 650,000 observations. The treatment group consists of the products that are subject to EU antidumping (AD) measures. Between 2000 and 2013, 160 HS6 products exported from China to the EU faced EU AD measures. Because the products targeted by EU AD measures are not chosen at random, we want to use PSM to construct the control group and mitigate this selection bias. We use a logit model to estimate the probability that a product is subject to an AD measure, based on a set of observable characteristics. The estimation equation is as follows:
Pr(AD=1)_p = beta_0 + beta_1 IP(China)_p,t-1 + beta_2 GDP_t + beta_3 RER_t + year FE + error term
IP(China)_p,t-1 is lagged import penetration, defined as imports from China as a share of total EU imports of the product at the HS6 level;
GDP_t is the GDP growth rate of the EU in year t;
RER_t is the log real exchange rate, in euros per Chinese RMB;
I also include year fixed effects.
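For reference, here is how I would write the logit out in full. This is only a sketch: it assumes the time-varying indicator AD_pt (one of the two candidate definitions I ask about), and the symbols \Lambda (the logistic function), X_{pt} (the linear index), and \delta_t (the year fixed effects) are my own notation. In this form the error term does not appear separately, because it is absorbed into the logistic link.

```latex
\Pr(AD_{pt} = 1 \mid X_{pt})
  = \Lambda\!\left(\beta_0 + \beta_1\, IP(China)_{p,t-1}
      + \beta_2\, GDP_t + \beta_3\, RER_t + \delta_t\right),
\qquad
\Lambda(z) = \frac{1}{1 + e^{-z}}
```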
My question is how we should define the dependent variable (DV). More specifically, should we define it as AD_p=1 if the product is ever subject to an EU AD measure, and 0 otherwise? With this definition, the DV does not change over time and varies only across products. Alternatively, we can define it as AD_pt=1 if the product is subject to an EU AD measure in year t, with the value staying at 1 for as long as the measure is in force, and 0 otherwise. For instance, if an AD measure was imposed on a product in 2005 and stayed in force until 2010, then AD_pt=1 from 2005 to 2010, AD_pt=0 for the years before the treatment, and AD_pt=0 for the years after the measure is revoked (i.e., after 2010). If the DV does need to vary over time, my question is how PSM can find a control for the treated product in the years when it is not treated. Specifically, if the product was treated between 2005 and 2010 and the DV is 0 in all other years, how can PSM find a control for the treated product in, say, 2004 or 2011?
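To make the two candidate definitions concrete, here is a minimal sketch in plain Python. The HS6 codes, the function names, and the measure dates are all hypothetical and made up for illustration; they are not taken from our data. The sketch assumes each treated product has a single measure with a known start and end year.

```python
def ad_time_invariant(measures, products):
    """AD_p: 1 if product p is ever subject to a measure in the sample, else 0."""
    return {p: int(p in measures) for p in products}


def ad_time_varying(measures, products, years):
    """AD_pt: 1 only in the years when the measure on product p is in force."""
    panel = {}
    for p in products:
        span = measures.get(p)  # (start_year, end_year) or None
        for t in years:
            in_force = span is not None and span[0] <= t <= span[1]
            panel[(p, t)] = int(in_force)
    return panel


# Hypothetical example: one treated product (measure in force 2005-2010)
# and one never-treated product, over the 2000-2013 sample period.
measures = {"850440": (2005, 2010)}
products = ["850440", "610910"]
years = range(2000, 2014)

ad_p = ad_time_invariant(measures, products)
ad_pt = ad_time_varying(measures, products, years)

for t in (2004, 2005, 2010, 2011):
    print(t, ad_p["850440"], ad_pt[("850440", t)])
```

Under the first definition the treated product has AD_p=1 in every year, including 2004 and 2011. Under the second, the same product contributes zeros in 2000-2004 and 2011-2013, which is exactly the situation I am unsure how PSM should handle.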
I am very confused about the right definition of the DV, and I would really appreciate your help and suggestions.