I am constructing panel data from two surveys, each administered monthly for 7 years. The first is called "basic", and the second is called "special".
The sampling rotation is such that in any 2-year period, the same individual answers the basic survey in 2 consecutive months, then again in the same 2 consecutive months the following year.
When answering the final basic survey, they are also administered the special survey.
Their PIDs will match across the surveys, so a PID should appear at most twice in any year-month combination (once in basic and once in special) and at most 5 times overall (4 basic plus 1 special).
However, some individuals must have duplicates, because xtdescribe reports a maximum of 8 observations per pid, not 5.

Observations from the special survey have special = 1, and observations from the basic survey have special = 0.
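The expected maxima above can be checked directly. This is only a sketch: n_pid and n_pid_ym are hypothetical helper variable names, and it assumes pid, yearmonth, and special are already in memory.

Code:
```stata
* observations per pid overall (expected maximum: 5)
bysort pid: generate n_pid = _N

* observations per pid within each yearmonth (expected maximum: 2)
bysort pid yearmonth: generate n_pid_ym = _N

summarize n_pid n_pid_ym

* observations belonging to a pid or pid-yearmonth group that exceeds the expected maximum
count if n_pid > 5
count if n_pid_ym > 2
```

Any nonzero count points to groups that contain at least one unexpected extra record.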

The following code shows me how many duplicates are in each yearmonth:
Code:
* tag observations that share pid and yearmonth with another observation
duplicates tag pid yearmonth, generate(temp)
tab yearmonth temp

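And the following code would drop all duplicates on pid and yearmonth, which would also remove one observation from each legitimate basic/special pair: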
Code:
duplicates drop pid yearmonth, force

I want to do the following:
  1. List the number of duplicates separately for special and for basic in each yearmonth
  2. Delete any duplicates within special and within basic
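One possible approach to both steps is sketched here. It assumes a true duplicate is an observation that shares pid, yearmonth, and special with another observation, so that a basic/special pair in the same month is not treated as a duplicate. The variable name surplus is hypothetical.

Code:
```stata
* summary of how many copies exist per pid-yearmonth-special group
duplicates report pid yearmonth special

* flag every copy after the first within each pid-yearmonth-special group
bysort pid yearmonth special: generate byte surplus = _n > 1

* 1. surplus copies by yearmonth, split into basic (0) and special (1)
tabulate yearmonth special if surplus

* 2. keep one observation per pid, yearmonth, and survey type
duplicates drop pid yearmonth special, force
```

Note that with the force option, which copy is kept is arbitrary if the duplicates differ on other variables, so it may be worth inspecting them first (for example with duplicates list pid yearmonth special).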
Any advice would be much appreciated.