I want to merge two data sets. The first (master) data set is the MORG data from the Current Population Survey, which has the basic demographic variables (age, sex, race, marital status, etc.). It has 99 variables and 315,717 observations (size: 56,513,343). The second data set is fatal occupational injury data from the BLS; it has 2 variables (injury rate and occupation code) and 162 observations (size: 1,296). I sorted both data sets by occ2012, the occupation code variable, before merging them.
I am not sure which merge command to use: merge 1:1, 1:m, or m:1.
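One way to decide between these is to check whether occ2012 uniquely identifies observations in each file. The lines below are only a sketch using the file paths from this post. If the key is unique in the injury file but repeated in the MORG file (many workers per occupation), that would point to m:1 with MORG as the master.

```stata
* Sketch: check whether occ2012 is a unique key in each file.
* Paths are the ones used in this post.
use "C:\Users\hmridha\Documents\Fall2018\paper 4\data ETC\foic_new\sortedfoic13.dta", clear
* isid stops with an error if occ2012 does not uniquely identify observations
isid occ2012

use "C:\Users\hmridha\Documents\Fall2018\paper 4\data ETC\sortedmorg13.dta", clear
* duplicates report shows how many observations share each occ2012 value
duplicates report occ2012
```

If isid fails on the injury file, the duplicate occupation codes there would need to be resolved before an m:1 merge can run.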
I have used m:1 and got this:
. use "C:\Users\hmridha\Documents\Fall2018\paper 4\data ETC\sortedmorg13.dta"
. merge m:1 occ2012 using "C:\Users\hmridha\Documents\Fall2018\paper 4\data ETC\foic_new\sortedfoic13.dta"
(note: variable occ2012 was int, now float to accommodate using data's values)

    Result                           # of obs.
    -----------------------------------------
    not matched                       312,407
        from master                   312,255  (_merge==1)
        from using                        152  (_merge==2)

    matched                             3,462  (_merge==3)
    -----------------------------------------
Does this merging look all right? I would appreciate any help.
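In the output above, only 10 of the 162 occupation codes in the injury file found a match (152 are from using only), and the note about occ2012 changing from int to float may mean the two files code the variable differently. The sketch below is one way to look at this, not a definitive diagnosis; it reruns the same merge and lists the codes that failed to match.

```stata
* Sketch: rerun the merge from this post and inspect the unmatched codes.
use "C:\Users\hmridha\Documents\Fall2018\paper 4\data ETC\sortedmorg13.dta", clear
merge m:1 occ2012 using "C:\Users\hmridha\Documents\Fall2018\paper 4\data ETC\foic_new\sortedfoic13.dta"

* Overall breakdown of master-only, using-only, and matched observations
tabulate _merge

* Occupation codes that appear only in the injury (using) file
tabulate occ2012 if _merge == 2

* Occupation codes that appear only in the MORG (master) file
tabulate occ2012 if _merge == 1

* Using-side codes that are not whole numbers, which an int variable could not hold
list occ2012 if _merge == 2 & occ2012 != floor(occ2012)
```

Comparing the two tabulations side by side should show whether the codes differ in scale, format, or classification between the files.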