Dear Statalists,

I have a dataset with a (to me, at least) complex form of duplicates. The variables PRODUCTSEGMENT`i'SICCODE (i = 1, 2, 3, 4, 5) indicate if a firm has several different product segments by giving the SIC codes of the industries the firm operates in. The variables PRODUCTSEGMENT`i'SALES (i = 1, 2, 3, 4, 5) give the sales revenue the firm obtains from the corresponding segment.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float NUM int(PRODUCTSEGMENT1SICCODE PRODUCTSEGMENT2SICCODE PRODUCTSEGMENT3SICCODE PRODUCTSEGMENT4SICCODE PRODUCTSEGMENT5SICCODE) long(PRODUCTSEGMENT1SALES PRODUCTSEGMENT2SALES PRODUCTSEGMENT3SALES PRODUCTSEGMENT4SALES PRODUCTSEGMENT5SALES)
 1 5942 5961 5722 7375    . 1688752  661374  547190  225117      .
 2 3728 3728 3728    .    .  419313  357788   97154       .      .
 3 2834 3841 3845 2834    . 8177000 2929000 2778000 2088000      .
 4 4911 4911 4911 4911    . 2729000 2466000 2444000 1688000      .
 5 3577 5045    .    .    . 2464386  760004       .       .      .
 6 3841 8731    .    .    .   22017    2879       .       .      .
 7 5651    .    .    .    . 1364853       .       .       .      .
 8 7363    .    .    .    . 4373244       .       .       .      .
 9 3674 3674    .    .    . 3793962   97792       .       .      .
10 2835 8731 2834    .    .    4640     700     633       .      .
11 7372 7379 7372 7372    .  543419  291886  289624  104791      .
12 7372    .    .    .    .   12887       .       .       .      .
13 3523 3523 3523 3523 3523 1470300  472200  250300  195300 153400
14 2813 2869 3569    .    . 3944000 1522800  250400       .      .
15 3841 8731 2834    .    .   17443   14946    9859       .      .
16 4581 4513 4522    .    . 1973000   86300   81600       .      .
17 2297 3442 2297    .    .  698187  102273   36236       .      .
18    .    .    .    .    .       .       .       .       .      .
19 8399 8742 7379    .    .  272785  146826   67649       .      .
20 3724 3724 3724 3724    . 9653000 7185000 3457000 3313000      .
end
However, there are duplicates, as the example below shows. Look at row 2. The same SIC code (3728) appear three times, and the corresponding sales variables have different values. Thus, I would need a code that combines the sales into one value if the same SIC code appears multiple times. For example, for row 2, I would like it to look something like this (or with new variables combining the duplicates):
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float NUM int(PRODUCTSEGMENT1SICCODE PRODUCTSEGMENT2SICCODE PRODUCTSEGMENT3SICCODE PRODUCTSEGMENT4SICCODE PRODUCTSEGMENT5SICCODE) long(PRODUCTSEGMENT1SALES PRODUCTSEGMENT2SALES PRODUCTSEGMENT3SALES PRODUCTSEGMENT4SALES PRODUCTSEGMENT5SALES)
1 5942 5961 5722 7375 . 1688752 661374 547190 225117 .
2 3728    .    .    . .  874255      .      .      . .
end
The challenge is that it might be several values that I need to combine. For example, there might be two different SIC codes that both appear multiple times for a single firm.

Thank you in advance!