Hi,

I am running a linear regression in Stata 17 using the regress command. Based on theory, I suspect that the relationship between my independent and dependent variables might be non-linear. However, my sample is a bit different from previous samples, so this is not a given. I therefore decided to check whether I should include a quadratic term in my model, but I am having trouble interpreting the results and deciding whether to leave the quadratic term in.

This is a cross-sectional data set; my dependent variable is function and my independent variable is strength.

Code:
 regress function c.strength##c.strength i.sex

---------------------------------------------------------------------------------------
             function | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
----------------------+----------------------------------------------------------------
             strength |   .2311146   .0905668     2.55   0.012     .0513168    .4109124
                      |
c.strength#c.strength |  -.0205282    .014225    -1.44   0.152    -.0487683     .007712
                      |
                  sex |
              female  |   .1430361   .0463664     3.08   0.003     .0509871     .235085
                _cons |   .2134218   .1359994     1.57   0.120    -.0565712    .4834148
---------------------------------------------------------------------------------------
As you can see, the quadratic term is not significant here, so my first thought was that I should drop it. However, when I graph the data, the relationship still looks non-linear to me, so I ran a margins plot. The range of my strength variable is 0.85 to 6.3.

Code:
margins, at(strength=(0.85(0.2)6.3))
marginsplot
The result was the attached graph, and I would say that this looks quadratic/non-linear?

I also looked at the axis of symmetry using nlcom, which returned a value of 5.63. This is within the range of strength (0.85 to 6.3), but very close to the upper boundary.

Code:
nlcom -_b[strength]/(2*_b[c.strength#c.strength])

       _nl_1: -_b[strength]/(2*_b[c.strength#c.strength])

------------------------------------------------------------------------------
    function | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _nl_1 |   5.629207   1.861501     3.02   0.002     1.980733    9.277682
------------------------------------------------------------------------------
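
For reference, the nlcom expression is the turning point of the fitted quadratic, found by setting the derivative with respect to strength to zero. In the sketch below, b_0 to b_3 are shorthand introduced here (not Stata names) for the _cons, strength, c.strength#c.strength, and female coefficients in the regression table above:

```latex
\[
\widehat{\text{function}} = b_0 + b_1\,\text{strength} + b_2\,\text{strength}^2 + b_3\,\text{female}
\]
\[
\frac{\partial\,\widehat{\text{function}}}{\partial\,\text{strength}}
  = b_1 + 2\,b_2\,\text{strength} = 0
\;\Longrightarrow\;
\text{strength}^{*} = -\frac{b_1}{2\,b_2}
  = -\frac{0.2311146}{2 \times (-0.0205282)} \approx 5.63
\]
```

Because the estimated b_2 is negative, this turning point is a maximum: the fitted curve rises up to about 5.63 and flattens as it approaches that value.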


All of this confuses me. How do I interpret this? Should I leave the quadratic term in the model and, if so, how do I interpret and explain the non-significant term? Could the non-significance be due to a lack of statistical power when I include the quadratic term? I may also have done something wrong here, as this is my first time running a regression with a quadratic term. I apologize in advance if this post is difficult to read or understand; it is also my first time posting on the forum.

Best,
Hilde