Hi everyone.

I'm currently writing my thesis and was advised to compare in-sample vs out-of-sample prediction, where the out-of-sample R2 is defined as:
[Image: formula for the out-of-sample R2; not preserved]
My data set has 1,764 monthly observations.

Code:
* Example generated by -dataex-. To install: ssc install dataex
* Listing produced with: dataex return t, count(30)
clear
input double return float t
 .01242389  1
 .02019115  2
 .02964644  3
 .03405454  4
 .02848361  5
-.01542547  6
-.00863808  7
 .01915629  8
 .01136751  9
-.05886838 10
 .03742423 11
  .0257288 12
 .02951672 13
 .00101633 14
 .04072984 15
 .03142682 16
-.00213872 17
-.00795212 18
-.00034236 19
-.00761387 20
-.01335902 21
 .01832136 22
-.00212152 23
 .03346823 24
 .01082923 25
 .01575765 26
-.01307993 27
-.00878622 28
 .01496919 29
-.01535528 30
end
This is the code that I tried to run, splitting the data set 50/50 into a training period and a forecasting period:
Code:
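* Time index (skip this line if t already exists, as in the dataex example)
gen t = _n
tsset t
* segment = 0 for the training period (t < 883), 1 for the forecasting period
gen byte segment = (t >= 883)
* Fit an AR(1) on the training period only
regress return L.return if segment == 0
* Forecasts for the forecasting period, using the training-period coefficients
predict forecast if segment == 1
* Realized returns in the forecasting period
gen returns = return if segment == 1
* Regress realized returns on the forecasts (forecast is missing outside the forecasting period)
regress return forecast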
However, I am very new to Stata and would appreciate it if someone could give me feedback or suggestions for improvement.
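
For reference, here is a minimal sketch of how the out-of-sample R2 might be computed directly, rather than read off a second regression. It assumes the intended definition is the common one, 1 - SSE(model)/SSE(benchmark), with the training-period mean return as the benchmark forecast; that definition is an assumption and may differ from the formula I was given. The names fc, rbar, se_model, se_bench, sse_model and sse_bench are made up for illustration, and the data are assumed to be tsset with segment defined as in the code above.
Code:
* Sketch only: R2_OOS = 1 - SSE(model)/SSE(benchmark)  (assumed definition)
* Fit the AR(1) on the training period and forecast the second half
quietly regress return L.return if segment == 0
predict double fc if segment == 1
* Benchmark forecast: mean return over the training period (an assumption)
quietly summarize return if segment == 0
scalar rbar = r(mean)
* Squared forecast errors in the forecasting period only
gen double se_model = (return - fc)^2 if segment == 1 & !missing(fc)
gen double se_bench = (return - scalar(rbar))^2 if segment == 1 & !missing(fc)
* Sum the squared errors and form the ratio
quietly summarize se_model
scalar sse_model = r(sum)
quietly summarize se_bench
scalar sse_bench = r(sum)
display "Out-of-sample R2 = " 1 - scalar(sse_model)/scalar(sse_bench)

Note that the split at t = 883 only makes sense on the full 1,764-month data set; on the 30-observation dataex extract every observation falls in the training period, so there is nothing to forecast. The sketch also keeps the coefficients fixed at their training-period values, as in my attempt above, rather than re-estimating them as the forecast window moves.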