Hello everyone!
I have been removing duplicated observations for a long time, using various methods I found by searching online.
However, I have realized that my approach is inefficient: it usually takes me a very long time to detect and remove the duplicates.
Sometimes I have to inspect the duplicated observations and choose which one to keep and which ones to delete. This is time-consuming when there are thousands of duplicated observations.
So I was wondering whether there is a solution that, for each id-year, keeps only the observation with the most data available and removes the others.
For example, in the attached picture, I want to make a panel dataset based on "cusipid" and "fyear".
For each cusipid-fyear, there is more than one observation.
The code I used is as follows:
Code:
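* isdup = number of other observations sharing the same cusipid-fyear (0 if unique)
duplicates tag cusipid fyear, gen(isdup)
* open the Data Editor on the duplicated observations only
edit if isdup
* move isdup to the first column
order isdup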
Then I have to select by hand which observation has more data available and should be kept.
I cannot simply keep the first observation and remove the rest because, in the case cusipid=267 and fyear=2007, I actually need the second one, where the other data equal 0.
So, is there any code that can select the observation with the most data available and keep it as the only observation for each cusipid-fyear?
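One possible approach, offered as an untested sketch rather than a definitive answer: count the non-missing values in each observation with the `rownonmiss()` function of `egen`, then keep, within each cusipid-fyear, the observation with the highest count. The helper variables `origorder`, `nonmiss`, and `negnonmiss` are hypothetical names introduced only for illustration. The sketch assumes that "most data available" means "fewest missing values" and that every variable carries equal weight.

```stata
* Run as a do-file so the local macro survives between lines.

* Drop the tag variable from the earlier step, if it exists
capture drop isdup

* Remember the original order so that ties are broken reproducibly
gen long origorder = _n

* Collect every variable except the panel keys and the helper
ds cusipid fyear origorder, not
local datavars `r(varlist)'

* Count non-missing values per observation; strok allows string variables
egen nonmiss = rownonmiss(`datavars'), strok

* Sort so the observation with the most non-missing values comes first
* within each cusipid-fyear (ties: earliest original position), keep it
gen negnonmiss = -nonmiss
bysort cusipid fyear (negnonmiss origorder): keep if _n == 1

* Clean up and confirm that no duplicates remain
drop nonmiss negnonmiss origorder
duplicates report cusipid fyear
```

Zeros count as non-missing here, so in a case like cusipid=267 and fyear=2007 the observation holding zeros would be preferred over one holding missing values in the same variables. When two observations have the same count, the sketch keeps whichever came first in the original order, so tied cases may still need a manual check.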
Thank you very much!
Kind regards
Shengze