Dear Stata users,


I am trying to merge two datasets: a firm-level panel (master) and a NAICS-to-BEA code crosswalk.
The NAICS codes in my master data are quite heterogeneous. That is, they vary from 2 to 6 digits.
However, the NAICS-to-BEA crosswalk only contains 2- to 4-digit codes and looks like the following:
naics beacode
111 1100
112 1100
113 1130
114 1130
115 1130
211 2110
212 2120
213 2130
22 2200
23 2300
311 3110
312 3110
313 3130
314 3130
315 3150
316 3150
321 3210
322 3220
323 3230
324 3240
325 3250
326 3260
327 3270
331 3310
332 3320
333 3330
334 3340
335 3350
336 3360
337 3370
339 3390
42 4200
44 4400
45 4400
481 4810
482 4820
483 4830
484 4840
485 4850
486 4860
487 4870
488 4870
492 4870
493 4930
511 5110
512 5120
513 5130
514 5140
515 5130
516 5140
517 5130
518 5140
519 5140
521 5210
522 5210
523 5230
524 5240
525 5250
531 5310
532 5320
533 5320
5411 5411
5412 5412
5413 5412
5414 5412
5415 5415
5416 5412
5417 5412
5418 5412
5419 5412
55 5500
561 5610
562 5620
61 6100
621 6210
622 6220
623 6220
624 6240
711 7110
712 7110
713 7130
721 7210
722 7220
81 8100


Here is what I want to do. Suppose firm A's NAICS code is 817890. The crosswalk does not include this code, so I would like to assign it to 81, the closest code (the longest crosswalk code that the firm's code starts with). The BEA code for firm A would then be 8100.
As another example, firm B's NAICS code is 722909. In this case, the closest code in the crosswalk is 722, so the matched BEA code would be 7220.
I hope these examples are clear.
In sum, I would like to match each NAICS code in my master data to the closest NAICS code in the crosswalk, and then pick up the corresponding BEA code.
Is there a way to implement this in Stata?
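
For what it is worth, below is a rough, untested sketch of the kind of thing I have in mind: try the first 4 digits, then the first 3, then the first 2, and keep the first BEA code found. The file names master.dta and crosswalk.dta are placeholders, and I am assuming that naics is stored as a numeric variable in both files and that the master data does not already contain variables called beacode, bea_match, naics_str or naics_key.

```stata
* Sketch only: file names are placeholders, naics assumed numeric in both files
use crosswalk, clear
tostring naics, generate(naics_key)      // string key, e.g. "81", "722", "5411"
keep naics_key beacode
rename beacode bea_match
tempfile xw
save `xw'

use master, clear
tostring naics, generate(naics_str)      // e.g. 722909 becomes "722909"
generate beacode = .

* Longest prefix first: 4 digits, then 3, then 2
forvalues len = 4(-1)2 {
    generate naics_key = substr(naics_str, 1, `len')
    merge m:1 naics_key using `xw', keep(master match) nogenerate
    * Fill in only where a longer prefix has not already matched
    replace beacode = bea_match if missing(beacode)
    drop naics_key bea_match
}

* Firms whose code matches no crosswalk prefix are left with missing beacode
count if missing(beacode)
```

With this logic, 722909 should fail at "7229", match at "722" and get 7220, while 817890 should only match at "81" and get 8100. I am not sure whether this is the best way, or whether there is a more direct command for it.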


Thank you very much in advance,
AC