Hello fellow Stata users,

I'm encountering some problems with potential non-linearity in a random-effects panel regression estimated with xtreg Y X1 X2 X3, re vce(robust) (N ~ 7000, T = 5). A scatter plot suggests that the relationship between X1 and Y might be non-linear (inverted U-shaped), so I added the squared term of X1 to the model. However, both the linear term (c.X1) and the squared term (c.X1#c.X1) are statistically insignificant; c.X1 is also insignificant without the squared term. I tested for joint significance using
Code:
test c.X1 c.X1#c.X1
which turned out significant at the 5% level (thus the two terms jointly make a significant contribution to the model):
( 1) c.X1 = 0
( 2) c.X1#c.X1 = 0

chi2( 2) = 7,10
Prob > chi2 = 0,0287
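
For reference, a minimal sketch of the specification and test described above. The panel identifiers id and year are hypothetical placeholders (the actual variable names are not given here), and the ## operator is simply shorthand for including both c.X1 and c.X1#c.X1:
Code:
* id and year are hypothetical panel and time identifiers
xtset id year
* c.X1##c.X1 expands to c.X1 and c.X1#c.X1
xtreg Y c.X1##c.X1 X2 X3, re vce(robust)
* Wald test that the linear and squared terms are jointly zero
test c.X1 c.X1#c.X1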

I've read (for example, here, but also in a few papers) that an insignificant squared term can be dropped from the model (because it suggests that the relationship is linear) - but in this case the test of joint significance suggests that both variables contribute significantly to the model. So my questions are:
1. From a statistical point of view, should I keep the squared term (let's say, a non-linear/inverted u-shaped relationship would theoretically make sense)?
2. If I keep the squared term (because it makes theoretical sense and the test of joint significance "tells me to"), what does the individual insignificance of the squared and the linear term tell me? Or better: how do I report/interpret that result, if at all? (I acknowledge that statistical significance, and in particular the reliance on it for decision-making purposes, is currently a subject of much debate.) I checked the vertex and it lies within the range of values, although close to one end of the range where there's very little evidence.
3. What if a non-linear/U-shaped relationship doesn't make theoretical sense: would it be statistically justified to drop the squared term because it is insignificant, even though the test of joint significance says that it contributes significantly to the model in combination with the linear term?
4. Should I even add the squared term when the linear term itself is insignificant?
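
Regarding the vertex mentioned in question 2: for a quadratic b1*X1 + b2*X1^2 the turning point is at X1 = -b1/(2*b2). A sketch of one way to check it after estimation, assuming the model was fit with c.X1 and c.X1#c.X1 so that the coefficients carry these names (not necessarily the only or best way to do it):
Code:
* Turning point of the quadratic with a delta-method standard error
nlcom vertex: -_b[X1] / (2 * _b[c.X1#c.X1])
* Observed range of X1 in the estimation sample, for comparison with the vertex
summarize X1 if e(sample)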


Thanks a lot in advance and best regards
F.