Hello everyone!
I have been removing duplicate observations for a long time, using various approaches I found by googling.
However, I realized that my approach is not efficient: it normally takes me a very long time to detect and remove the duplicates.
Sometimes I have to inspect the duplicate observations and choose which one to keep and which ones to delete. This is time-consuming when there are thousands of duplicates.
So I was wondering whether there is a solution that, for each id-year, keeps only the observation with the most data available and removes the others.
For example, in the attached picture, I want to build a panel dataset based on "cusipid" and "fyear".
For each cusipid-fyear, there is more than one observation.
The code I used is as follows:
Code:
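* Tag every observation whose cusipid-fyear pair occurs more than once
duplicates tag cusipid fyear, gen(isdup)
* Move the tag to the first column and group duplicates before browsing
order isdup
sort cusipid fyear
edit if isdup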
Then I have to select by hand which observation has more data available and should be kept.
I cannot simply keep the first observation and drop the rest because, in the case of cusipid = 267 and fyear = 2007, I actually need the second observation, in which the other variables equal 0.
So, is there any code that can select the observation with the most data available and keep it as the only observation for each cusipid-fyear?
Thank you very much!
Kind regards
Shengze
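
One possible approach, shown here only as a minimal sketch: count the non-missing values in each row with egen's rownonmiss() function, then keep the row with the highest count within each cusipid-fyear. The variables x1-x3 and all values below are made up for illustration, since the real variable names are only visible in the attached picture. Here a value of 0 counts as data and only missing values count as "no data"; that is an assumption about what "most data available" means.

```stata
* Toy data: x1-x3 are hypothetical stand-ins for the real data variables
clear
input cusipid fyear x1 x2 x3
267 2007 . . 5
267 2007 0 0 5
267 2008 1 2 3
300 2007 4 . .
300 2007 . . .
end

* Count the non-missing values across the data variables in each row
egen nfilled = rownonmiss(x1 x2 x3)

* Within each cusipid-fyear, sort by that count and keep the fullest row
bysort cusipid fyear (nfilled): keep if _n == _N

* Stop with an error unless cusipid-fyear now uniquely identifies rows
isid cusipid fyear
drop nfilled
list, sepby(cusipid)
```

With real data, the list x1 x2 x3 would be replaced by the actual variables, and the strok option of rownonmiss() would be needed if any of them are strings. When two duplicates have the same count, which one survives is arbitrary, so a further tie-breaking variable may need to be added inside the parentheses of the bysort prefix.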