Hi, Statalist.
I'm trying to clean a dataset. Below is a small snippet; I removed some variables to keep this post clear.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int conflict_id float year str10 start_date byte start_prec str10 start_date2 byte(start_prec2 ep_end) str10 ep_end_date byte ep_end_prec
200 1946 "1946-07-18" 1 "1946-07-21" 2 1 "1946-07-21" .
200 1947 ""           . ""           . . ""           .
200 1948 ""           . ""           . . ""           .
200 1949 ""           . ""           . . ""           .
200 1950 ""           . ""           . . ""           .
200 1951 ""           . ""           . . ""           .
200 1952 "1946-07-18" 1 "1952-04-09" 1 1 "1952-04-12" .
200 1953 ""           . ""           . . ""           .
200 1954 ""           . ""           . . ""           .
200 1955 ""           . ""           . . ""           .
200 1956 ""           . ""           . . ""           .
200 1957 ""           . ""           . . ""           .
200 1958 ""           . ""           . . ""           .
200 1959 ""           . ""           . . ""           .
200 1960 ""           . ""           . . ""           .
200 1961 ""           . ""           . . ""           .
200 1962 ""           . ""           . . ""           .
200 1963 ""           . ""           . . ""           .
200 1964 ""           . ""           . . ""           .
200 1965 ""           . ""           . . ""           .
200 1966 ""           . ""           . . ""           .
200 1967 "1946-07-18" 1 "1967-03-31" 3 1 "1967-10-16" .
200 1968 ""           . ""           . . ""           .
200 1969 ""           . ""           . . ""           .
200 1970 ""           . ""           . . ""           .
200 1971 ""           . ""           . . ""           .
200 1972 ""           . ""           . . ""           .
200 1973 ""           . ""           . . ""           .
200 1974 ""           . ""           . . ""           .
200 1975 ""           . ""           . . ""           .
200 1976 ""           . ""           . . ""           .
200 1977 ""           . ""           . . ""           .
200 1978 ""           . ""           . . ""           .
200 1979 ""           . ""           . . ""           .
200 1980 ""           . ""           . . ""           .
200 1981 ""           . ""           . . ""           .
200 1982 ""           . ""           . . ""           .
200 1983 ""           . ""           . . ""           .
200 1984 ""           . ""           . . ""           .
200 1985 ""           . ""           . . ""           .
200 1986 ""           . ""           . . ""           .
200 1987 ""           . ""           . . ""           .
200 1988 ""           . ""           . . ""           .
200 1989 ""           . ""           . . ""           .
200 1990 ""           . ""           . . ""           .
200 1991 ""           . ""           . . ""           .
200 1992 ""           . ""           . . ""           .
200 1993 ""           . ""           . . ""           .
200 1994 ""           . ""           . . ""           .
200 1995 ""           . ""           . . ""           .
200 1996 ""           . ""           . . ""           .
200 1997 ""           . ""           . . ""           .
200 1998 ""           . ""           . . ""           .
200 1999 ""           . ""           . . ""           .
200 2000 ""           . ""           . . ""           .
200 2001 ""           . ""           . . ""           .
200 2002 ""           . ""           . . ""           .
200 2003 ""           . ""           . . ""           .
200 2004 ""           . ""           . . ""           .
200 2005 ""           . ""           . . ""           .
200 2006 ""           . ""           . . ""           .
200 2007 ""           . ""           . . ""           .
200 2008 ""           . ""           . . ""           .
200 2009 ""           . ""           . . ""           .
200 2010 ""           . ""           . . ""           .
200 2011 ""           . ""           . . ""           .
200 2012 ""           . ""           . . ""           .
200 2013 ""           . ""           . . ""           .
200 2014 ""           . ""           . . ""           .
200 2015 ""           . ""           . . ""           .
200 2016 ""           . ""           . . ""           .
200 2017 ""           . ""           . . ""           .
200 2018 ""           . ""           . . ""           .
end
First, I need to restrict the dataset to conflict_ids that have conflict at some point in 2000-2018. In other words, if a conflict_id has conflict ONLY before 2000, all of its observations should be eliminated. I then have to merge this dataset with another dataset in which the variables of interest begin in 2000. Second, I would like to keep a record of which conflict_ids were deleted.

Originally, I was dropping one conflict_id at a time with the following code, but that takes too long and is prone to human error:
Code:
drop if conflict_id==200

Then I tried the following code, but it dropped only the individual observations that met the condition, not every observation for the conflict_ids concerned:
Code:
bysort conflict_id: drop if year<2000 & ep_end==1
The variable ep_end designates when the conflict was over. When it begins, the variable is coded as zero. When it ends, it is coded as 1.

I also tried using egen to create a new variable and then drop the cases, but what I put together was incorrect and did not give me what I was looking for:
Code:
bysort conflict_id (year): egen good = total(year<2000 & ep_end==1)
The data are a little tricky, because a conflict can start well before 2000 and continue into 2000-2018. Those cases should be kept, not dropped.
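
One way to express this rule, offered as a sketch rather than a definitive solution: flag each conflict_id by whether it has any active conflict-year in 2000-2018, record the ids that fail, and then drop them as whole groups. It assumes a year counts as active when ep_end is non-missing (in the snippet above, the padded years have ep_end missing); the names active_2000s and dropped_ids are hypothetical.

Code:
```stata
* Assumes the example data above (or the full dataset) is in memory.
* Assumption: a conflict-year is "active" when ep_end is non-missing.
* active_2000s is constant within conflict_id:
* 1 if any active year falls in 2000-2018, 0 otherwise.
bysort conflict_id: egen byte active_2000s = max(inrange(year, 2000, 2018) & !missing(ep_end))

* Keep a record of the ids about to be dropped.
levelsof conflict_id if active_2000s == 0, local(dropped_ids)
display "Dropped conflict_ids: `dropped_ids'"

* Drop every observation of the flagged conflicts, not just single rows.
drop if active_2000s == 0
```

Because the flag is a group-level maximum, a conflict that started before 2000 but still has an active year in 2000-2018 is kept. If the full data mark active years differently, the condition inside max() would need to change accordingly.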