Hi everyone,
I have a panel dataset with 148 countries and 70 years and I am trying to identify episodes of rapid growth. Part of my dataset looks as follows:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str3 countrycode int year float(pot_episode fstat)
"CHN" 1959 0 .
"CHN" 1960 1 2.644415
"CHN" 1961 0 .
"CHN" 1962 0 .
"CHN" 1963 0 .
"CHN" 1964 0 .
"CHN" 1965 0 .
"CHN" 1966 0 .
"CHN" 1967 0 .
"CHN" 1968 0 .
"CHN" 1969 0 .
"CHN" 1970 0 .
"CHN" 1971 0 .
"CHN" 1972 0 .
"CHN" 1973 0 .
"CHN" 1974 0 .
"CHN" 1975 0 .
"CHN" 1976 1 9.337345
"CHN" 1977 1 40.30368
"CHN" 1978 1 27.767164
"CHN" 1979 1 18.049032
"CHN" 1980 1 15.10914
"CHN" 1981 1 7.487884
"CHN" 1982 0 .
"CHN" 1983 0 .
end
The variable pot_episode indicates whether a year meets the conditions for a rapid growth episode. In some cases several consecutive years satisfy the conditions, and then I wish to select the year in which fstat is largest. I would like to create a separate variable, acceleration, that captures this. If a single isolated year has pot_episode equal to 1, acceleration should also equal 1 for that year. If several consecutive years have pot_episode equal to 1, acceleration should equal 1 only for the year with the highest fstat and 0 for the other years in that run. In every year where pot_episode is 0, acceleration should be 0.
I would therefore want my dataset to look as follows:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str3 countrycode float(year pot_episode fstat) byte acceleration
"CHN" 1959 0 . 0
"CHN" 1960 1 2.644415 1
"CHN" 1961 0 . 0
"CHN" 1962 0 . 0
"CHN" 1963 0 . 0
"CHN" 1964 0 . 0
"CHN" 1965 0 . 0
"CHN" 1966 0 . 0
"CHN" 1967 0 . 0
"CHN" 1968 0 . 0
"CHN" 1969 0 . 0
"CHN" 1970 0 . 0
"CHN" 1971 0 . 0
"CHN" 1972 0 . 0
"CHN" 1973 0 . 0
"CHN" 1974 0 . 0
"CHN" 1975 0 . 0
"CHN" 1976 1 9.337345 0
"CHN" 1977 1 40.30368 1
"CHN" 1978 1 27.767164 0
"CHN" 1979 1 18.049032 0
"CHN" 1980 1 15.10914 0
"CHN" 1981 1 7.487884 0
"CHN" 1982 0 . 0
"CHN" 1983 0 . 0
end
I am struggling to find a way to do this in Stata. Does anyone have ideas on how to tackle this?
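One possible approach, offered as a sketch rather than a definitive answer: number each run of consecutive pot_episode == 1 years within a country, take the largest fstat within each run, and flag the year that attains it. The helper variables start, spell and spell_max are names made up for this illustration. The sketch assumes one observation per country and year, and it treats a gap in year as the end of a run. If two years in the same run tie for the largest fstat, both would be flagged.

```stata
* Sketch only: start, spell and spell_max are hypothetical helper names.
clear
input str3 countrycode int year float(pot_episode fstat)
"CHN" 1959 0 .
"CHN" 1960 1 2.644415
"CHN" 1961 0 .
"CHN" 1962 0 .
"CHN" 1963 0 .
"CHN" 1964 0 .
"CHN" 1965 0 .
"CHN" 1966 0 .
"CHN" 1967 0 .
"CHN" 1968 0 .
"CHN" 1969 0 .
"CHN" 1970 0 .
"CHN" 1971 0 .
"CHN" 1972 0 .
"CHN" 1973 0 .
"CHN" 1974 0 .
"CHN" 1975 0 .
"CHN" 1976 1 9.337345
"CHN" 1977 1 40.30368
"CHN" 1978 1 27.767164
"CHN" 1979 1 18.049032
"CHN" 1980 1 15.10914
"CHN" 1981 1 7.487884
"CHN" 1982 0 .
"CHN" 1983 0 .
end

* A run starts where pot_episode is 1 and the previous year of the same
* country is absent, not adjacent, or has pot_episode different from 1.
bysort countrycode (year): gen byte start = pot_episode == 1 & ///
    (_n == 1 | pot_episode[_n-1] != 1 | year != year[_n-1] + 1)

* Running count of starts gives each run its own number within a country;
* spell is left missing where pot_episode is not 1.
by countrycode (year): gen long spell = sum(start) if pot_episode == 1

* Largest fstat within each run. egen's max() ignores missing values;
* double storage keeps the equality test below exact even if fstat is double.
bysort countrycode spell: egen double spell_max = max(fstat) if pot_episode == 1

* Flag the year that attains the maximum. The !missing() guard stops a run
* whose fstat values are all missing from being flagged (. == . is true).
gen byte acceleration = pot_episode == 1 & fstat == spell_max & !missing(fstat)

sort countrycode year
list countrycode year pot_episode fstat acceleration, sepby(countrycode)
```

The helper variables can be removed afterwards with drop start spell spell_max. Because every step runs under by countrycode, the same commands should carry over unchanged to the full panel, provided the assumptions above hold there.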