Dear All,

I downloaded some data from Kenneth French's Data Library (http://mba.tuck.dartmouth.edu/pages/...a_library.html). The date is stored as a number in the format yyyymmdd, so I tried to split it into year, month, and day. A sample of the data is below:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long date float(mktrf smb hml rf)
19260701   .1 -.24  -.28 .009
19260702  .45 -.32  -.08 .009
19260706  .17  .27  -.35 .009
19260707  .09 -.59   .03 .009
19260708  .21 -.36   .15 .009
19260709 -.71  .44   .56 .009
19260710  .62  -.5  -.15 .009
19260712  .04  .03   .54 .009
19260713  .48 -.26  -.23 .009
19260714  .04  .09  -.48 .009
19260715 -.43  .54   -.3 .009
19260716  .53  .01  -.57 .009
19260717  .34  .43  -.63 .009
19260719 -.01  .01  -.49 .009
19260720 -.57 -.23   .16 .009
19260721  -.6  .21   .31 .009
19260722 -.73  -.3  -.17 .009
19260723 -.02  .08   .06 .009
19260724 -.14  .44  -.06 .009
19260726  .53 -.38  -.25 .009
19260727  .43 -.45   .26 .009
19260728 1.09  .04  -.28 .009
19260729  .36 -.61 -1.01 .009
19260730  .14 -.47   .55 .009
19260731  .46 -.12  -.17 .009
19260802  .84 -.22  -.03  .01
19260803  .47 -.27   -.6  .01
19260804 -.36  .16   .25  .01
19260805 -.09  .01   .73  .01
19260806  .68  .08   .16  .01
19260807  .46  .08  -.24  .01
19260809  .21 -.04  -.26  .01
19260810 -1.4  .34   .21  .01
19260811 -.57  .21   .34  .01
19260812  .84 -.77   .42  .01
19260813  .58 -.85   .25  .01
19260814  .69  .15  -.36  .01
19260816  .07  .23    .7  .01
19260817 -.95  .34   .16  .01
19260818  .26  .08   .04  .01
19260819  -.6  .12    .1  .01
19260820  -.2 -.36   .42  .01
19260821  .32 -.06   .49  .01
19260823  .49 -.48   .62  .01
19260824 -.84   .5  -.09  .01
19260825 -.32 -.43   .28  .01
19260826  .53  .09    .5  .01
19260827  .35  .13  -.39  .01
19260828  .34  .15  -.06  .01
19260830   .3    0   .42  .01
end
I used the following code:

Code:
tostring date, replace
gen year=substr(date, 1, 4)
gen month=substr(date, 5, 6)
gen day=substr(date, 7,8)
When I generate month, I expected Stata to extract only two digits, for instance 07. Instead, it extracts four digits, for instance 0701. To get only the first two digits, I need a further step:

Code:
gen m2=substr(month, 1, 2)
Do you have any explanation for that?
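
One thing I may have misread: as far as I can tell from the help file, substr(s, n1, n2) takes a starting position n1 and a length n2, not an end position. If so, substr(date, 5, 6) asks for up to six characters starting at position 5, which would explain 0701. A minimal sketch under that reading, on two of the dates above (date_str and ddate are names I made up for illustration):

Code:
* Sketch: treat the third argument of substr() as a length, not an end position
clear
input long date
19260701
19260830
end
* Keep the numeric date and put the string version in a new variable
tostring date, generate(date_str)
gen year  = substr(date_str, 1, 4)
gen month = substr(date_str, 5, 2)
gen day   = substr(date_str, 7, 2)
* Alternative: convert straight to a Stata daily date
gen ddate = date(date_str, "YMD")
format ddate %td
list

If the end goal is a usable date variable rather than three separate strings, the date() route is probably the shorter one.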

Thanks in advance,

Dario