Finding the maximum value for a subset of observations

Hi everyone,

I have a panel dataset covering 148 countries over 70 years, and I am trying to identify episodes of rapid growth. Part of my dataset looks as follows:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str3 countrycode int year float(pot_episode fstat)
"CHN" 1959 0 .
"CHN" 1960 1 2.644415
"CHN" 1961 0 .
"CHN" 1962 0 .
"CHN" 1963 0 .
"CHN" 1964 0 .
"CHN" 1965 0 .
"CHN" 1966 0 .
"CHN" 1967 0 .
"CHN" 1968 0 .
"CHN" 1969 0 .
"CHN" 1970 0 .
"CHN" 1971 0 .
"CHN" 1972 0 .
"CHN" 1973 0 .
"CHN" 1974 0 .
"CHN" 1975 0 .
"CHN" 1976 1 9.337345
"CHN" 1977 1 40.30368
"CHN" 1978 1 27.767164
"CHN" 1979 1 18.049032
"CHN" 1980 1 15.10914
"CHN" 1981 1 7.487884
"CHN" 1982 0 .
"CHN" 1983 0 .
end

The variable pot_episode indicates whether a given year meets the conditions for a rapid growth episode. In some cases several consecutive years satisfy the conditions, and in those cases I want to select the year in which fstat is highest. I would like to create a new variable, acceleration, that captures this. If only a single year meets the conditions (pot_episode equals 1 for that year alone), acceleration should also equal 1. If pot_episode equals 1 for several consecutive years, acceleration should equal 1 only for the year with the highest value of fstat and 0 for the other years. Wherever pot_episode is 0, acceleration should be 0.
I would therefore like my dataset to look as follows:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str3 countrycode float(year pot_episode fstat) byte acceleration
"CHN" 1959 0 . 0
"CHN" 1960 1 2.644415 1
"CHN" 1961 0 . 0
"CHN" 1962 0 . 0
"CHN" 1963 0 . 0
"CHN" 1964 0 . 0
"CHN" 1965 0 . 0
"CHN" 1966 0 . 0
"CHN" 1967 0 . 0
"CHN" 1968 0 . 0
"CHN" 1969 0 . 0
"CHN" 1970 0 . 0
"CHN" 1971 0 . 0
"CHN" 1972 0 . 0
"CHN" 1973 0 . 0
"CHN" 1974 0 . 0
"CHN" 1975 0 . 0
"CHN" 1976 1 9.337345 0
"CHN" 1977 1 40.30368 1
"CHN" 1978 1 27.767164 0
"CHN" 1979 1 18.049032 0
"CHN" 1980 1 15.10914 0
"CHN" 1981 1 7.487884 0
"CHN" 1982 0 . 0
"CHN" 1983 0 . 0
end

I am struggling to find a way to do this in Stata. Does anyone have ideas on how to tackle this?
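
One possible approach, offered as a sketch rather than a definitive answer: number each run of consecutive years with pot_episode equal to 1 within a country, take the maximum of fstat within each run, and flag the year that attains it. The helper variables spell and max_fstat are hypothetical names introduced here for illustration. The sketch assumes that a break in the years (or a new country) should start a new run, and that ties in fstat within a run would flag every tied year. It is shown on a subset of the example data above.

```stata
* Subset of the example data (CHN only, selected years)
clear
input str3 countrycode int year float(pot_episode fstat)
"CHN" 1959 0 .
"CHN" 1960 1 2.644415
"CHN" 1961 0 .
"CHN" 1975 0 .
"CHN" 1976 1 9.337345
"CHN" 1977 1 40.30368
"CHN" 1978 1 27.767164
"CHN" 1979 1 18.049032
"CHN" 1980 1 15.10914
"CHN" 1981 1 7.487884
"CHN" 1982 0 .
"CHN" 1983 0 .
end

* Number the runs of consecutive candidate years within each country.
* A run starts when pot_episode is 1 and the previous observation is
* either not a candidate or not the immediately preceding year.
bysort countrycode (year): gen long spell = ///
    sum(pot_episode == 1 & (pot_episode[_n-1] != 1 | year != year[_n-1] + 1))
replace spell = 0 if pot_episode != 1

* Highest fstat within each run (missing for the non-candidate group)
bysort countrycode spell: egen double max_fstat = max(fstat)

* Flag the year that attains the maximum; all other years get 0
gen byte acceleration = pot_episode == 1 & fstat == max_fstat & !missing(fstat)

drop spell max_fstat
sort countrycode year
list, sepby(countrycode)
```

On this subset, acceleration is 1 in 1960 and 1977 and 0 elsewhere, which matches the desired output in the post. The same lines should carry over to the full panel, because every step is done within countrycode.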

BJ Data Tech Solution
