Dear all,

I am having some trouble estimating a difference-in-differences (DiD) model.


In particular, I am trying to see whether the prices of some products increased after a merger in the sector. As the treatment group I use the markets in which these products are sold, and as the control group the markets in which they are not sold (where the merger should not have had any effect).

I defined three variables: a time indicator, a treatment-group indicator, and their interaction.

Code:
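* Treatment indicator: 1 for treated markets, 0 otherwise
gen byte treated = (group == "T")

* Time indicator: 0 if pre-merger, 1 if post-merger (year >= 2015).
* The if-qualifier leaves time missing when year is missing; a bare
* "year >= 2015" would count missing years as post-merger.
gen byte time = (year >= 2015) if !missing(year)

* Interaction term
gen byte time_treated = time*treated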
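
As far as I understand, the same three terms could also be produced by Stata's factor-variable notation instead of a hand-made interaction. A minimal sketch, assuming time and treated are the 0/1 indicators defined above (the coefficient on 1.time#1.treated should then correspond to time_treated):

```stata
* Equivalent DiD specification using factor variables:
* i.time##i.treated expands to time, treated and their interaction
regress price i.time##i.treated [fweight=purchasers]
```
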

The dependent variable is price. To estimate the effect, I run the following regression:

Code:
*DiD estimation
reg price time treated time_treated [fweight=purchasers]

I include [fweight=purchasers] because the data are grouped by number of purchasers: each observation records a price together with how many purchasers paid it, since the same product sells at different prices.

The result is the following:
Code:
. reg price time treated time_treated [fweight=purchasers]

      Source |       SS           df       MS      Number of obs   = 222396351
-------------+----------------------------------   F(3, 222396347) >  99999.00
       Model |  6.9099e+10         3  2.3033e+10   Prob > F        =    0.0000
    Residual |  5.7580e+12 222396347  25890.8053   R-squared       =    0.0119
-------------+----------------------------------   Adj R-squared   =    0.0119
       Total |  5.8271e+12 222396350   26201.508   Root MSE        =    160.91

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        time |  -3.216122     .07007   -45.90   0.000    -3.353457   -3.078787
     treated |   56.90151   .0426874  1332.98   0.000     56.81784    56.98517
time_treated |  -.5140701    .074221    -6.93   0.000    -.6595407   -.3685996
       _cons |   176.1494   .0403794  4362.36   0.000     176.0702    176.2285
------------------------------------------------------------------------------

However, I would like to see the percentage change. I therefore log-transform the dependent variable and get:

Code:
. *Log transformation
. gen ln_price = ln(price)

. 
. reg ln_price time treated time_treated [fweight=purchasers]

      Source |       SS           df       MS      Number of obs   = 222396351
-------------+----------------------------------   F(3, 222396347) >  99999.00
       Model |  1543037.17         3  514345.723   Prob > F        =    0.0000
    Residual |  62534956.6 222396347  .281186978   R-squared       =    0.0241
-------------+----------------------------------   Adj R-squared   =    0.0241
       Total |  64077993.8 222396350  .288125204   Root MSE        =    .53027

------------------------------------------------------------------------------
    ln_price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        time |  -.0339575   .0002309  -147.05   0.000    -.0344101   -.0335049
     treated |   .2640624   .0001407  1877.08   0.000     .2637867    .2643381
time_treated |    .002573   .0002446    10.52   0.000     .0020936    .0030524
       _cons |    5.04016   .0001331  3.8e+04   0.000     5.039899    5.040421
------------------------------------------------------------------------------
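
Since a log coefficient is only approximately a percentage change, the exact figure implied by time_treated could be obtained along these lines. This is a sketch that assumes the log regression above is the active estimation; for a coefficient this small, 100 times the coefficient is nearly identical.

```stata
* Re-run the log specification quietly so its estimates are active
quietly regress ln_price time treated time_treated [fweight=purchasers]

* Exact percentage effect implied by the interaction: 100*(exp(b) - 1)
nlcom (pct_effect: 100 * (exp(_b[time_treated]) - 1))
```
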

I do not understand why the log transformation changes the sign of the time_treated coefficient (from -0.514 in levels to 0.0026 in logs); I expected it to give me the same effect expressed as a percentage change.
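
One check that might help here is to compute, from the levels regression, the percentage change in mean price for each group separately. The treated group starts from a higher price level (the treated coefficient is about 57 on a base of about 176), so a larger absolute fall there need not be a larger relative fall. The sketch below assumes the levels regression is re-run first. It is only a rough comparison, because the log regression compares means of log prices rather than logs of mean prices.

```stata
* Re-run the levels specification quietly so its estimates are active
quietly regress price time treated time_treated [fweight=purchasers]

* Percentage change in the control group's mean price, pre to post
display 100 * _b[time] / _b[_cons]

* Percentage change in the treated group's mean price, pre to post
display 100 * (_b[time] + _b[time_treated]) / (_b[_cons] + _b[treated])
```

If the second number is smaller in absolute value than the first, the negative coefficient in levels and the positive one in logs would not necessarily contradict each other.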

Thanks for your help!