Hello everyone, we are trying to use propensity score matching (PSM) to find a suitable control group. Our data set is at the exporter (China)-product (HS6)-destination (EU)-year level and covers 2000-2013. It contains over 650,000 observations. The treatment group consists of the products that are subject to EU antidumping (AD) measures. Between 2000 and 2013, 160 HS6-level products exported from China to the EU faced EU AD measures. Because the products subject to EU AD measures are not randomly selected, we want to use PSM to construct the control group and mitigate this selection bias. We use a logit model to estimate the probability that an AD measure is imposed on a product, based on a set of observable characteristics. The estimation equation is as follows:
Pr(AD=1)_p = beta_0 + beta_1 IP(China)_pt-1 + beta_2 GDP_t + beta_3 RER_t + year FE + error term
IP(China)_pt-1 is lagged import penetration, defined as the share of imports from China in total EU imports at the HS6 level;
GDP_t is the EU GDP growth rate in year t;
RER_t is the log real exchange rate in euros per Chinese RMB;
I also include year fixed effects.
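As a rough illustration only, the lagged import penetration and the logit transformation of the linear index could be sketched in Python as follows. The import figures are made-up placeholders, and the names `imports`, `import_penetration`, `lagged_import_penetration` and `logit_probability` are hypothetical, not part of our actual estimation code.

```python
import math

# Made-up import values for one HS6 product (placeholders, not real data):
# year -> (imports from China, total EU imports)
imports = {2003: (20.0, 100.0), 2004: (30.0, 120.0), 2005: (45.0, 150.0)}


def import_penetration(year):
    """Share of imports from China in total EU imports for the given year."""
    china, total = imports[year]
    return china / total


def lagged_import_penetration(year):
    """IP(China)_pt-1: previous year's share, or None if that year is not observed."""
    if (year - 1) not in imports:
        return None
    return import_penetration(year - 1)


def logit_probability(linear_index):
    """Logistic function mapping the linear index to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-linear_index))


for year in sorted(imports):
    print(year, lagged_import_penetration(year))
```

The first sample year has no lagged value, so that product-year would drop out of the logit estimation.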
My question is how we should define the dependent variable (DV). More specifically, should we define it as AD_p=1 if the product is subject to an EU AD measure, and 0 otherwise? With this definition, the DV does not change over time and varies only across products. Alternatively, we can define it as AD_pt=1 if the product is subject to an EU AD measure in year t, staying at 1 for as long as the measure is in force, and 0 otherwise. For instance, if an AD measure was imposed on a product in 2005 and stayed in force until 2010, then AD_pt=1 from 2005 to 2010, AD_pt=0 for the years before the treatment, and AD_pt=0 for the years after the measure is revoked (i.e., after 2010). If the DV does need to vary over time, how can PSM find a control for the treated product in the years before the treatment? Specifically, if the product was treated between 2005 and 2010 and the DV takes the value 0 in all other years, how could PSM find a control group for the treated product in, say, 2004 or 2011?
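To make the two candidate definitions concrete, here is a minimal sketch in Python that codes both indicators on a toy panel. The product codes `HS_A` and `HS_B` are hypothetical placeholders, the 2005-2010 spell is the example from the paragraph above, and the helper names `ad_p` and `ad_pt` are only illustrative.

```python
# Toy panel: one treated product (measure in force 2005-2010) and one never-treated product.
# Product codes are hypothetical placeholders, not real HS6 codes.
YEARS = range(2000, 2014)
AD_SPELLS = {"HS_A": (2005, 2010)}  # product -> (first year in force, last year in force)
PRODUCTS = ["HS_A", "HS_B"]


def ad_p(product):
    """Time-invariant definition: 1 if the product has an AD spell in the sample."""
    return int(product in AD_SPELLS)


def ad_pt(product, year):
    """Time-varying definition: 1 only in years when the measure is in force."""
    if product not in AD_SPELLS:
        return 0
    start, end = AD_SPELLS[product]
    return int(start <= year <= end)


panel = [
    {"product": p, "year": y, "AD_p": ad_p(p), "AD_pt": ad_pt(p, y)}
    for p in PRODUCTS
    for y in YEARS
]

# Show the treated product around the start and end of its spell.
for row in panel:
    if row["product"] == "HS_A" and row["year"] in (2004, 2005, 2010, 2011):
        print(row)
```

Under the second coding, the treated product's 2004 and 2011 rows carry AD_pt=0, exactly like the rows of a never-treated product, which is the source of my confusion.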
I am very confused about the right definition of the DV, and I would really appreciate your help and suggestions.