Hi everyone,
I have a panel dataset covering 148 countries and 70 years, and I am trying to identify episodes of rapid growth. Part of my dataset looks as follows:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str3 countrycode int year float(pot_episode fstat)
"CHN" 1959 0 .
"CHN" 1960 1 2.644415
"CHN" 1961 0 .
"CHN" 1962 0 .
"CHN" 1963 0 .
"CHN" 1964 0 .
"CHN" 1965 0 .
"CHN" 1966 0 .
"CHN" 1967 0 .
"CHN" 1968 0 .
"CHN" 1969 0 .
"CHN" 1970 0 .
"CHN" 1971 0 .
"CHN" 1972 0 .
"CHN" 1973 0 .
"CHN" 1974 0 .
"CHN" 1975 0 .
"CHN" 1976 1 9.337345
"CHN" 1977 1 40.30368
"CHN" 1978 1 27.767164
"CHN" 1979 1 18.049032
"CHN" 1980 1 15.10914
"CHN" 1981 1 7.487884
"CHN" 1982 0 .
"CHN" 1983 0 .
end
The variable 'pot_episode' indicates whether a given year meets the conditions for a rapid growth episode. In some cases several consecutive years satisfy the conditions, and then I want to select the year in which 'fstat' is largest. I would like to create a separate variable, 'acceleration', that captures this. If only a single year meets the conditions (pot_episode equals 1), acceleration should also equal 1. If several consecutive years have pot_episode equal to 1, acceleration should equal 1 only for the year with the highest value of 'fstat' and 0 for the other years in that run.
I would therefore want my dataset to look as follows:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str3 countrycode float(year pot_episode fstat) byte acceleration
"CHN" 1959 0 . 0
"CHN" 1960 1 2.644415 1
"CHN" 1961 0 . 0
"CHN" 1962 0 . 0
"CHN" 1963 0 . 0
"CHN" 1964 0 . 0
"CHN" 1965 0 . 0
"CHN" 1966 0 . 0
"CHN" 1967 0 . 0
"CHN" 1968 0 . 0
"CHN" 1969 0 . 0
"CHN" 1970 0 . 0
"CHN" 1971 0 . 0
"CHN" 1972 0 . 0
"CHN" 1973 0 . 0
"CHN" 1974 0 . 0
"CHN" 1975 0 . 0
"CHN" 1976 1 9.337345 0
"CHN" 1977 1 40.30368 1
"CHN" 1978 1 27.767164 0
"CHN" 1979 1 18.049032 0
"CHN" 1980 1 15.10914 0
"CHN" 1981 1 7.487884 0
"CHN" 1982 0 . 0
"CHN" 1983 0 . 0
end
I am struggling to find a way to do this in Stata. Does anyone have ideas on how to tackle this?
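One possible way to approach this, offered as a minimal sketch rather than a definitive answer: number each run of consecutive years with pot_episode equal to 1 within a country, then flag the year whose fstat equals the maximum within that run. The helper variables 'spell' and 'max_fstat' are names made up for illustration, and the input below is just a subset of the rows shown above. The sketch assumes one observation per country and year, and that fstat is non-missing whenever pot_episode is 1; if two years in a run tie for the maximum, both would be flagged.

```stata
* Subset of the example data from the post (years 1962-1974 omitted for brevity)
clear
input str3 countrycode int year float(pot_episode fstat)
"CHN" 1959 0 .
"CHN" 1960 1 2.644415
"CHN" 1961 0 .
"CHN" 1975 0 .
"CHN" 1976 1 9.337345
"CHN" 1977 1 40.30368
"CHN" 1978 1 27.767164
"CHN" 1979 1 18.049032
"CHN" 1980 1 15.10914
"CHN" 1981 1 7.487884
"CHN" 1982 0 .
end

* Running count of spell starts within each country. A spell starts when
* pot_episode is 1 and the previous observation is not a flagged year,
* or is not the immediately preceding calendar year.
bysort countrycode (year): gen long spell = ///
    sum(pot_episode == 1 & (_n == 1 | pot_episode[_n-1] != 1 | year != year[_n-1] + 1))

* Years outside any episode all go into a common spell 0
replace spell = 0 if pot_episode != 1

* Largest fstat within each country-spell
bysort countrycode spell: egen max_fstat = max(fstat)

* Flag only the year that attains the maximum within its spell
gen byte acceleration = pot_episode == 1 & fstat == max_fstat

drop spell max_fstat
sort countrycode year
list, sepby(countrycode)
```

On the example data this should set acceleration to 1 for 1960 (a one-year run) and for 1977 (the peak of the 1976-1981 run), and to 0 everywhere else.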