Hello everyone,

I'm using three rounds of household-level cross-sectional NSSO data and would like to construct a pseudo panel. I want to aggregate the household-level data into group-level averages based on the household head's birth year.


. tab birthyear

  birthyear |      Freq.     Percent        Cum.
------------+-----------------------------------
       1897 |          1        0.00        0.00
       1898 |          1        0.00        0.00
       1900 |          1        0.00        0.00
       1903 |          1        0.00        0.00
       1904 |          2        0.00        0.00
       1905 |          5        0.00        0.00
       1906 |          6        0.00        0.01
       1907 |          6        0.00        0.01
       1908 |          2        0.00        0.01
       1909 |         15        0.00        0.01
       1910 |         15        0.00        0.02
       1911 |         14        0.00        0.02
       1912 |         15        0.00        0.03
       1913 |         14        0.00        0.03
       1914 |         20        0.01        0.04
       1915 |        109        0.03        0.07
       1916 |         34        0.01        0.08
       1917 |         37        0.01        0.09
       1918 |         29        0.01        0.10
       1919 |        130        0.04        0.14
       1920 |        246        0.08        0.22
       1921 |        139        0.04        0.26
       1922 |         90        0.03        0.29
       1923 |        213        0.07        0.35
       1924 |        286        0.09        0.44
       1925 |        739        0.23        0.67
       1926 |        340        0.10        0.77
       1927 |        429        0.13        0.90
       1928 |        248        0.08        0.98
       1929 |        864        0.27        1.24
       1930 |      1,259        0.39        1.63
       1931 |        992        0.30        1.93
       1932 |        452        0.14        2.07
       1933 |      1,106        0.34        2.41
       1934 |      1,174        0.36        2.77
       1935 |      3,107        0.95        3.72
       1936 |      1,422        0.44        4.16
       1937 |      1,827        0.56        4.72
       1938 |      1,003        0.31        5.03
       1939 |      3,338        1.02        6.05
       1940 |      4,335        1.33        7.38
       1941 |      3,765        1.15        8.54
       1942 |      1,637        0.50        9.04
       1943 |      3,453        1.06       10.10
       1944 |      4,442        1.36       11.46
       1945 |      5,776        1.77       13.23
       1946 |      4,866        1.49       14.73
       1947 |      4,619        1.42       16.14
       1948 |      2,431        0.75       16.89
       1949 |      7,584        2.33       19.22
       1950 |      6,514        2.00       21.21
       1951 |      7,446        2.28       23.50
       1952 |      3,004        0.92       24.42
       1953 |      6,645        2.04       26.46
       1954 |      6,264        1.92       28.38
       1955 |      9,721        2.98       31.36
       1956 |      6,513        2.00       33.36
       1957 |      7,944        2.44       35.80
       1958 |      3,759        1.15       36.95
       1959 |     10,637        3.26       40.21
       1960 |     10,215        3.13       43.35
       1961 |     10,302        3.16       46.51
       1962 |      4,444        1.36       47.87
       1963 |      9,853        3.02       50.89
       1964 |      9,434        2.89       53.79
       1965 |     11,403        3.50       57.28
       1966 |      9,522        2.92       60.20
       1967 |      9,305        2.85       63.06
       1968 |      4,368        1.34       64.40
       1969 |     13,543        4.15       68.55
       1970 |      9,190        2.82       71.37
       1971 |     11,612        3.56       74.93
       1972 |      4,478        1.37       76.31
       1973 |      9,221        2.83       79.14
       1974 |      7,637        2.34       81.48
       1975 |      8,195        2.51       83.99
       1976 |      7,152        2.19       86.19
       1977 |      6,466        1.98       88.17
       1978 |      3,172        0.97       89.14
       1979 |      7,799        2.39       91.54
       1980 |      3,649        1.12       92.65
       1981 |      6,309        1.94       94.59
       1982 |      2,405        0.74       95.33
       1983 |      3,763        1.15       96.48
       1984 |      2,608        0.80       97.28
       1985 |      2,185        0.67       97.95
       1986 |      1,899        0.58       98.54
       1987 |      1,384        0.42       98.96
       1988 |        787        0.24       99.20
       1989 |        893        0.27       99.48
       1990 |        453        0.14       99.61
       1991 |        525        0.16       99.78
       1992 |        272        0.08       99.86
       1993 |        278        0.09       99.94
       1994 |        100        0.03       99.97
       1995 |         83        0.03      100.00
------------+-----------------------------------
      Total |    325,990      100.00

As you can see, some birth years have a very large number of observations. I'm hoping someone can advise me on what strategy I should use to create, say, 10 birth cohorts. Since the data start in 1897, I could form one cohort for 1897-1927 and another for 1987-1995, but I am not sure about the ones in between.
I am aware that the number of cohorts must be sufficiently large for the within estimator to be consistent. Any advice would be very useful.
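
To make the question concrete, here is a minimal sketch of the kind of grouping and aggregation described above. It is only an illustration under assumptions: `round` (survey round) and `consumption` (an outcome to average) are hypothetical variable names, since only `birthyear` appears in the output, and grouping by deciles is just one possible rule, not a recommendation.

```stata
* Sketch only: round and consumption are hypothetical variable names.

* One possible rule: 10 cohorts with roughly equal numbers of households.
* Ties in birthyear mean the groups will not be exactly equal in size.
xtile cohort = birthyear, nquantiles(10)

* Check how many households fall in each cohort-by-round cell
tabulate cohort round

* Collapse to cohort-by-round means, keeping the cell size for reference
collapse (mean) consumption (count) n_households = consumption, by(cohort round)

* Declare the resulting pseudo panel
xtset cohort round
```

With a rule like this, the sparse early and late birth years get pooled automatically, whereas hand-picked bands such as 1897-1927 would need their cell sizes checked with the `tabulate` step.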

Thank you so much for your help,
Samira.