Dear Statalist community,

My question is about how to generate a new identifier variable in a panel data set containing information on intrastate conflicts. The data are structured as follows:

Code:
clear
input str5 conflict_id int year
"11342" 2012
"11342" 2014
"11343" 1967
"11343" 1969
"11343" 1970
"11343" 1973
"11344" 2011
"11345" 2011
"11345" 2012
"11345" 2013
end
Each row represents one year of a given conflict (identified by the variable "conflict_id"). In the example above, the first two rows therefore represent the same conflict in 2012 and 2014. I am trying to generate a new ID variable that builds on the old ID values but also accounts for gaps in the variable "year". More specifically, if two successive observations of the same conflict are more than two years apart, the new ID variable should take a new value from that observation onward. If successive observations are at most two years apart, the new ID value should stay the same. In the example above, the new ID variable should consequently take the following values:

Code:
clear
input str5 conflict_id int year float new_id
"11342" 2012 1
"11342" 2014 1
"11343" 1967 2
"11343" 1969 2
"11343" 1970 2
"11343" 1973 3
"11344" 2011 4
"11345" 2011 5
"11345" 2012 5
"11345" 2013 5
end

Note that the new ID variable takes two distinct values for observations with conflict_id 11343 because 1970 and 1973 are more than two years apart. By contrast, 2012 and 2014 (conflict_id 11342) are exactly two years apart and therefore share one ID.
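
To make the rule concrete, here is a minimal sketch of one way it might be implemented. It assumes the rule is "start a new ID whenever year minus the previous year within the same conflict is greater than 2". The helper variable name new_spell is only illustrative and is not part of my data.

```stata
clear
input str5 conflict_id int year
"11342" 2012
"11342" 2014
"11343" 1967
"11343" 1969
"11343" 1970
"11343" 1973
"11344" 2011
"11345" 2011
"11345" 2012
"11345" 2013
end

* Within each conflict (sorted by year), flag the first row of each spell:
* either the first row of the conflict, or a row that lies more than
* 2 years after the previous row of the same conflict.
* new_spell is a hypothetical helper variable used only for this sketch.
bysort conflict_id (year): gen byte new_spell = (_n == 1) | (year - year[_n-1] > 2)

* The running sum of the flags numbers the spells consecutively across
* the whole data set (which is still sorted by conflict_id and year).
gen long new_id = sum(new_spell)

list conflict_id year new_id, sepby(conflict_id)
```

If the threshold should be different (for example, splitting already at a difference of exactly two years), only the comparison `> 2` would need to change.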

I am thankful for any advice on how to generate the new identifier variable.


Best regards
Carlo
Stata Version 17.0 on Windows