Hi all,

I have a panel dataset of firms and stock returns between 2005 and 2015. However, for some firm-date pairs (the data are sorted by FirmID and Date) I have duplicate observations with differing closing prices.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(DailyObservation FirmID) long Date double ClosingPrice float dup_obs
675 23 16439 7.46 1
  1 23 16439 2.95 2
  2 23 16440 2.96 1
676 23 16440 7.59 2
  3 23 16441 2.95 1
677 23 16441 7.58 2
  4 23 16442 2.96 1
678 23 16442 7.75 2
679 23 16443 7.76 1
  5 23 16443 3.25 2
680 23 16446 7.84 0
681 23 16447    8 1
  6 23 16447 2.95 2
682 23 16448  7.8 0
683 23 16449 7.72 1
  7 23 16449 2.95 2
684 23 16450 7.84 0
685 23 16453 7.93 1
  8 23 16453 2.95 2
  9 23 16454 2.95 1
686 23 16454 7.83 2
 10 23 16455 2.95 1
687 23 16455 7.85 2
 11 23 16456 2.95 1
688 23 16456 7.79 2
 12 23 16457 2.95 1
689 23 16457 7.78 2
 13 23 16460 3.02 1
690 23 16460 7.68 2
691 23 16461 7.68 1
 14 23 16461 2.95 2
692 23 16462 7.77 1
 15 23 16462 2.95 2
 16 23 16463 2.95 1
693 23 16463 7.85 2
694 23 16464 7.84 1
 17 23 16464 2.95 2
695 23 16467  7.9 1
 18 23 16467 2.95 2
696 23 16468 8.09 1
 19 23 16468 2.93 2
697 23 16469 8.07 1
 20 23 16469  3.1 2
 21 23 16470    3 1
698 23 16470 7.96 2
 22 23 16471 3.01 1
699 23 16471 8.02 2
700 23 16474 8.28 0
701 23 16475 8.36 1
 23 23 16475 2.94 2
 24 23 16476 3.31 1
702 23 16476 8.85 2
 25 23 16477 3.35 1
703 23 16477 8.76 2
704 23 16478 8.78 1
 26 23 16478  3.3 2
 27 23 16481 3.03 1
705 23 16481 8.59 2
 28 23 16482 3.29 1
706 23 16482 8.71 2
 29 23 16483 3.02 1
707 23 16483 8.74 2
 30 23 16484 3.03 1
708 23 16484  8.8 2
709 23 16485 8.57 1
 31 23 16485  3.2 2
710 23 16488 8.51 1
 32 23 16488 3.02 2
 33 23 16489 3.02 1
711 23 16489 8.36 2
 34 23 16490 3.02 1
712 23 16490 8.26 2
713 23 16491 8.23 0
 35 23 16492 3.02 1
714 23 16492 8.38 2
715 23 16495 8.46 0
716 23 16496 8.27 1
 36 23 16496 3.02 2
717 23 16497 8.39 1
 37 23 16497 3.03 2
718 23 16498 8.23 1
 38 23 16498 3.02 2
 39 23 16499 3.05 1
719 23 16499 8.31 2
 40 23 16502 3.02 1
720 23 16502 8.41 2
 41 23 16503 3.02 1
721 23 16503 8.51 2
 42 23 16504 2.93 1
722 23 16504 8.52 2
 43 23 16505 2.95 1
723 23 16505 8.42 2
 44 23 16506  3.2 1
724 23 16506 8.62 2
725 23 16509  8.4 1
 45 23 16509  3.2 2
726 23 16510 8.04 1
 46 23 16510  3.3 2
727 23 16511 7.96 0
 47 23 16512 3.35 1
end
format %d Date
dup_obs was generated with a user-written ado-file, using the code:
Code:
dup FirmID Date
I would like to drop one of the two duplicate time series, either the series with the lower closing prices or the series that does not start with DailyObservation 1. I am having trouble coming up with the code to drop one of the series. I have tried:
Code:
dup FirmID Date, drop
However, this drops all observations where dup_obs is not equal to 0, which causes parts of both duplicate series to be dropped.
Using code such as
Code:
drop if dup_obs!=0 & DailyObservation>2865
also does not help: because the panel is unbalanced, not all duplicate series reach such a high observation number.
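To make the goal concrete, here is an untested sketch of the result I am after. It assumes that the series to keep always has the higher ClosingPrice on a duplicated FirmID-Date pair (true in the example above, but only an assumption for the full data) and that ClosingPrice is never missing:
Code:
* Assumption: within each duplicated FirmID-Date pair, the row to keep
* is the one with the higher ClosingPrice.
* Caveat: missing values sort last, so a missing ClosingPrice would be kept.
bysort FirmID Date (ClosingPrice): keep if _n == _N
* Check that FirmID and Date now uniquely identify the observations
isid FirmID Date
Note that rows without a duplicate are kept as they are, regardless of which series they belong to, so this would not be enough if the lower-priced series also has dates on which it appears alone.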