Dear Statalist community,

I would like to clarify a few things about using ppml with gravity data.

1. I am estimating a gravity model using patent data. I have a panel gravity dataset with 100,000 observations. My dependent variable is the bilateral patent count. The independent variables are GDP, economic globalization, and education for both the origin and destination countries. I have included year_*, origin_*, and dest_* dummies as fixed effects and clustered the standard errors by (dist). The R-squared is close to 0.92, and it increases to 0.97 when I add further variables. This seems extremely high. Should I be worried about it?
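
For concreteness, the specification is along the following lines. This is only a sketch: patents, gdp_o, gdp_d, glob_o, glob_d, edu_o and edu_d are placeholder names rather than my actual variable names, and it assumes the user-written ppml command is installed.

```stata
* Sketch only: the dependent variable and regressor names are placeholders.
* year_*, origin_* and dest_* are the fixed-effect dummies described above;
* dist is the variable used for clustering the standard errors.
ppml patents gdp_o gdp_d glob_o glob_d edu_o edu_d year_* origin_* dest_*, cluster(dist)
```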

2. I have not declared the dataset as a panel using xtset countrypair year. Is this needed for ppml estimation?


3. Is there a way to get an adjusted R-squared after ppml?

4. Instead of using year, origin, and destination fixed effects, I have also tried year_* and origin_destination_* fixed effects (which created 5,049 dummies), as well as origin_year_* and destination_year_* fixed effects. However, ppml takes an extremely long time to estimate these specifications. Is this expected?

5. I am interested in doing mediation and moderation analysis on the gravity data, but I am not sure whether ppml can be adapted for this. Is there a method for mediation and moderation analysis on gravity data when the dependent variable has a large share of zeros?

I would appreciate your advice.

Thanks.
Jaya