Hey everyone,

I really appreciate the support here in this forum.
The more I learn about statistics and Stata, the more I question the model I am trying to estimate.

I have unbalanced panel data.
I want to analyze whether Corporate Venture Capital (CVC) investment influences the financial performance of a company.
I have decided to add zeros to my panel data whenever a company has not invested in a given year of the sample period (2009-2019).
The qualitative meaning of the zeros is consistent with the variable itself: cvc measures the amount invested, so no investment means an amount of zero.
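
A minimal sketch of the zero-filling step described above. It assumes that cvc is stored as missing in firm-years without an investment and that gvkey is numeric; neither detail is confirmed in this post, so adjust as needed.

```stata
* Sketch only: assumes cvc is missing (.) in firm-years with no investment
* and that gvkey is a numeric firm identifier
xtset gvkey fyear

* Treat "no investment" as an invested amount of zero within 2009-2019
replace cvc = 0 if missing(cvc) & inrange(fyear, 2009, 2019)

* Check how many firm-years are zeros versus positive amounts
count if cvc == 0
count if cvc > 0 & !missing(cvc)
```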

Adding zeros makes my plot look non-linear.

Code:
. 
. plot tq cvc     

  5.1325 +  
         | * *
         |  
         | *                                                         *
         | *                          *
         | *
    T    | *
    o    | *   *
    b    | **                  *            *
    i    | *   *
    n    | *                                       *
    '    | *  *                *
    s    | ** *                         *                       *
         | ** * *
    Q    | ***                                                *         *
         | ***  * *                    *                                  *
         | ** *   *                                            *
         | **   **      *         *      *    *                       *
         | *  *  ** * * *   ** *  *  *    *  *                            *
         | * **  **   **     *    *     * * *                             *
 .222395 + * *     *                         *                 *
          +----------------------------------------------------------------+
                0    Fund Total Estimated Equity Invested in      76.6452


As far as I am aware, I can use

Code:
. local controls "fs lev itq rdi growth cap_exp"

. 
. xtreg tq cvc `controls' i.fyear, fe vce(cluster gvkey)  

Fixed-effects (within) regression               Number of obs     =        353
Group variable: gvkey                           Number of groups  =         34

R-squared:                                      Obs per group:
     Within  = 0.5024                                         min =          2
     Between = 0.4515                                         avg =       10.4
     Overall = 0.4733                                         max =         11

                                                F(17,33)          =      31.15
corr(u_i, Xb) = 0.0740                          Prob > F          =     0.0000

                                 (Std. err. adjusted for 34 clusters in gvkey)
------------------------------------------------------------------------------
             |               Robust
          tq | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         cvc |  -.0004955   .0020686    -0.24   0.812    -.0047041     .003713
          fs |  -.1565618   .1077493    -1.45   0.156    -.3757794    .0626558
         lev |   .0371103   .0150445     2.47   0.019     .0065021    .0677186
         itq |    .685569   .1433776     4.78   0.000     .3938652    .9772729
         rdi |  -6.298757   5.390539    -1.17   0.251    -17.26589    4.668377
      growth |   .0939487   .1110024     0.85   0.403    -.1318875    .3197848
     cap_exp |   1.192223   .7782992     1.53   0.135    -.3912389    2.775684
             |
       fyear |
       2010  |  -.1237752   .0610782    -2.03   0.051    -.2480398    .0004894
       2011  |  -.1557566   .0697817    -2.23   0.033    -.2977285   -.0137846
       2012  |  -.1343674    .075972    -1.77   0.086    -.2889337    .0201988
       2013  |  -.0314902   .0584656    -0.54   0.594    -.1504393    .0874589
       2014  |   .0033938   .0763937     0.04   0.965    -.1520302    .1588179
       2015  |   .0466549   .0763477     0.61   0.545    -.1086757    .2019855
       2016  |   .0999757   .0848034     1.18   0.247    -.0725581    .2725095
       2017  |   .0848351      .0936     0.91   0.371    -.1055954    .2752657
       2018  |   .0135201   .0985238     0.14   0.892    -.1869281    .2139683
       2019  |   .0607907    .094598     0.64   0.525    -.1316703    .2532517
             |
       _cons |   2.321139   1.365598     1.70   0.099    -.4571896    5.099468
-------------+----------------------------------------------------------------
     sigma_u |  .74063035
     sigma_e |  .29752864
         rho |  .86104329   (fraction of variance due to u_i)
------------------------------------------------------------------------------
to run this model.

My diagnostics indicate heteroscedasticity and autocorrelation but no multicollinearity (the VIFs are small), and they suggest that -fe- is appropriate.
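
For anyone who wants to reproduce checks of this kind, a minimal sketch follows. The commands are an assumption about how such diagnostics are commonly run, not necessarily the ones used for the results above. xttest3 and xtserial are community-contributed and are assumed to be installed already.

```stata
* Sketch only; xttest3 and xtserial are community-contributed commands
local controls "fs lev itq rdi growth cap_exp"
xtset gvkey fyear

* Groupwise heteroskedasticity in the FE residuals (modified Wald test)
quietly xtreg tq cvc `controls' i.fyear, fe
xttest3

* Wooldridge test for first-order serial correlation in panel data
* (year dummies left out here because xtserial may not accept factor variables)
xtserial tq cvc `controls'

* Multicollinearity: variance inflation factors from a pooled OLS fit
quietly regress tq cvc `controls' i.fyear
estat vif

* Fixed vs. random effects: Hausman test on non-robust estimates
quietly xtreg tq cvc `controls' i.fyear, fe
estimates store fe
quietly xtreg tq cvc `controls' i.fyear, re
estimates store re
hausman fe re
```

The Hausman comparison is run without vce(cluster gvkey) because the classical test is not valid with clustered standard errors.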


Is my model correct, or can someone recommend a better model or command for my case?
I appreciate your support!

Thank you
Kind regards,
Jana