Dear All,
I am working on two survey datasets and have run into the same problem in the smaller one. I am trying to merge 4 data files from the smaller survey and am having a problem with duplicate data. The files cover wheat, corn, barley and demography. The demography and wheat files that I am trying to merge have a member ID variable (memID), while the corn and barley files only have cluster and hh. This is a step-by-step explanation of what I did:
* qid = household-level key, qid2 = member-level key
use demography.dta, clear
gen qid = string(cluster) + string(hh)
gen qid2 = string(cluster) + string(hh) + string(memID)
save demo.dta, replace
use wheat.dta, clear
gen qid = string(cluster) + string(hh)
gen qid2 = string(cluster) + string(hh) + string(memID)
save wheatO.dta, replace
* barley and corn have no memID, so only a household-level key
use barley.dta, clear
gen qid = string(cluster) + string(hh)
save barleyO.dta, replace
use corn.dta, clear
gen Eid = string(cluster) + string(hh)
save cornO.dta, replace
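Before merging, it may help to check whether these keys identify observations uniquely in each file. Note also that concatenating numbers without a separator can make different households collide: cluster 1 with hh 12 and cluster 11 with hh 2 both give "112". The lines below are only a sketch, assuming the file and variable names used above; `duplicates report` is a standard Stata command that lists how many observations share each key value.

```stata
* Sketch only: file and variable names are taken from the steps above.
* Check uniqueness on the numeric variables rather than the concatenated strings.
use demo.dta, clear
duplicates report cluster hh memID   // copies > 1 means repeated member rows

use wheatO.dta, clear
duplicates report cluster hh memID   // copies > 1 means several wheat rows per member

use barleyO.dta, clear
duplicates report cluster hh         // copies > 1 means several barley rows per household
```

If any file shows copies greater than 1 on its key, that key cannot serve as the "1" side of a 1:1 or m:1 merge.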
When I tab the crop variable in each file, I get:
wheat = 320
barley = 663
corn = 422
Then I proceed to merge as follows:
use demo.dta, clear // memID
merge m:m qid2 using wheatO.dta
rename _merge MERGE
sort cluster hh memID
drop if MERGE != 3
tab crop
I get wheat = 320 (the same as before the merge, which is great).
save whdemo.dta
merge m:m qid using barleyO.dta
sort cluster hh memID
order MERGE, after(_merge)
drop if _merge !=3
tab crop
I get 320 for wheat (great), but for barley I get 780, which is well above the 663 I had before the merge.
What am I doing wrong?
Many thanks in advance for your help.
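One possible explanation for the inflated barley count is the way `merge m:m` pairs observations. Within each key value it matches rows in order, and when one side has fewer rows it repeats that side's last row. Merging a household-level barley file onto member-level data can therefore copy a barley record onto several household members. The sketch below is not a definitive fix. It assumes that (cluster, hh, memID) is unique in both the demography and wheat files and that (cluster, hh) is unique in the barley file, neither of which is confirmed in the post.

```stata
* Sketch only, under the uniqueness assumptions stated above.
* Merge on the numeric identifiers directly instead of concatenated strings.
use demo.dta, clear
merge 1:1 cluster hh memID using wheatO.dta, keep(match) nogenerate
tab crop

* m:1: many members per household in memory, one barley row per household on disk.
* Stata stops with an error here if (cluster, hh) is not unique in barleyO.dta.
merge m:1 cluster hh using barleyO.dta, keep(match) nogenerate
```

After an m:1 merge the household's barley values appear on every member's row, so a household-level count needs one row per household, not one per member. If a crop file has several rows per household, the `merge` error makes that visible instead of silently duplicating records, and the data would first need reshaping or collapsing to one row per key.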