Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int year byte qtr str1 disclosure_code str41 area_title str43 agglvl_title long(qtrly_estabs_count month1 month2 month3) float countyid str5 time float dup
2009 1 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 31 1868 1721 1641 1 "20091" 0
2009 2 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 31 1531 1530 1496 1 "20092" 0
2009 3 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 32 1495 1480 1433 1 "20093" 0
2009 4 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 33 1445 1444 1447 1 "20094" 0
2010 1 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 31 1428 1457 1516 1 "20101" 0
2010 2 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 31 1540 1570 1572 1 "20102" 0
2010 3 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 30 1552 1578 1590 1 "20103" 0
2010 4 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 30 1601 1594 1584 1 "20104" 0
2011 1 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 30 1525 1567 1576 1 "20111" 0
2011 2 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 31 1566 1581 1580 1 "20112" 0
2011 3 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 31 1587 1579 1639 1 "20113" 0
2011 4 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 30 1658 1668 1636 1 "20114" 0
2012 1 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 30 1649 1703 1722 1 "20121" 0
2012 2 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 30 1719 1733 1712 1 "20122" 0
2012 3 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 30 1723 1727 1734 1 "20123" 0
2012 4 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 30 1701 1592 1547 1 "20124" 0
2013 1 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 29 1538 1561 1563 1 "20131" 0
2013 2 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 29 1572 1577 1584 1 "20132" 0
2013 3 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 29 1582 1592 1602 1 "20133" 0
2013 4 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 30 1657 1655 1637 1 "20134" 0
2014 1 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 31 1740 1741 1760 1 "20141" 0
2014 2 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 32 1782 1798 1812 1 "20142" 0
2014 3 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 32 1806 1813 1848 1 "20143" 0
2014 4 "" "Abbeville County, South Carolina" "County, NAICS Sector -- by ownership sector" 32 1789 1789 1783 1 "20144" 0
end
However, the issue is that there are duplicates for the same year, quarter, and county, with one duplicate listing zeros for employment and the other listing the actual employment. I sorted the data with sort countyid time month1 month2 month3 so that the duplicate with zeros would appear before the one with employment, giving it a dup value of 1. My intention was to run drop if dup == 1. However, I discovered that some of the duplicates come in triplets: two lines of zeros, with dup values of 1 and 2, and then the actual employment with a dup value of 3. I thought I could use gsort to sort the duplicates in descending order instead, so that I could run drop if dup > 1, but I can't get it to work.
I would appreciate any help!
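One possible way to get the result described above without depending on the value of dup is sketched here. It swaps gsort for bysort with the month variables in parentheses, so that within each county and quarter the all-zero rows sort first and the row with actual employment sorts last, and then keeps only that last row. The toy values below are hypothetical and only mimic the pattern in the question (a triplet, a pair, and a single row); the sketch also assumes each county-quarter group has at most one row with real employment.

```stata
* Toy data with hypothetical values: one triplet, one pair, one unique row
clear
input float countyid str5 time long(month1 month2 month3)
1 "20091"    0    0    0
1 "20091"    0    0    0
1 "20091" 1868 1721 1641
1 "20092"    0    0    0
1 "20092" 1531 1530 1496
1 "20093" 1495 1480 1433
end

* Within each county-quarter group, sort ascending by employment,
* so rows of zeros come first and the real figures come last
bysort countyid time (month1 month2 month3): keep if _n == _N

* One row per county-quarter should remain
list, sepby(countyid)
```

Because the rule is "keep the last row in the group" rather than "drop a particular dup value", it behaves the same whether a group has one, two, or three rows.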