Hi,
I have one dataset with 33 observations and three geographical variables:
state, the largest unit,
region, the second largest,
district, the smallest unit.
For example, the dataset looks like this:
state              region          district
Arunachal Pradesh  Arunachal East  East Slag
Arunachal Pradesh  Arunachal East  lohit
Arunachal Pradesh  Arunachal East  trap
Arunachal Pradesh  Arunachal West  East Kameng
Arunachal Pradesh  Arunachal West  Lower Subansiri
The second dataset is at the individual level and has 321 observations, with the names of people from each state and region. A given name rarely appears more than once in the second dataset, i.e. it is rarely found in more than one region.
state              region          name
Arunachal Pradesh  Arunachal East  KASHME LINGI
Arunachal Pradesh  Arunachal East  WANGCHA RAJKUMAR
Arunachal Pradesh  Arunachal West  PREM KHANDU THUNGON
What I want is for each name in the second dataset to be matched to every district of its state in the first dataset, regardless of whether that name appears for all regions in the second dataset. For example, the name KASHME LINGI appears only in the state Arunachal Pradesh and the region Arunachal East, but I want it matched to all the districts of the state Arunachal Pradesh, so that my data look like this:
state              region          district         name
Arunachal Pradesh  Arunachal East  East Slag        KASHME LINGI
Arunachal Pradesh  Arunachal East  lohit            KASHME LINGI
Arunachal Pradesh  Arunachal East  trap             KASHME LINGI
Arunachal Pradesh  Arunachal West  East Kameng      KASHME LINGI
Arunachal Pradesh  Arunachal West  Lower Subansiri  KASHME LINGI
Similarly for the rest of the names. The merge command doesn't work. Can you please let me know how to proceed?
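To make the desired result concrete, here is a minimal sketch of the logic in Python rather than Stata, using only the example rows shown above. The function name expand_names_to_districts and the tuple layout are illustrative assumptions, not part of either dataset. The idea is to pair every name with every district of the same state, ignoring the region the name was recorded under. Stata's joinby command is documented as forming all pairwise combinations within groups, so it may be the analogous tool, but this sketch does not use it.

```python
# Illustrative rows only, copied from the example tables in the question.
# Each district row: (state, region, district)
districts = [
    ("Arunachal Pradesh", "Arunachal East", "East Slag"),
    ("Arunachal Pradesh", "Arunachal East", "lohit"),
    ("Arunachal Pradesh", "Arunachal East", "trap"),
    ("Arunachal Pradesh", "Arunachal West", "East Kameng"),
    ("Arunachal Pradesh", "Arunachal West", "Lower Subansiri"),
]

# Each person row: (state, region, name)
people = [
    ("Arunachal Pradesh", "Arunachal East", "KASHME LINGI"),
    ("Arunachal Pradesh", "Arunachal East", "WANGCHA RAJKUMAR"),
    ("Arunachal Pradesh", "Arunachal West", "PREM KHANDU THUNGON"),
]


def expand_names_to_districts(districts, people):
    """Pair every name with every district of the same state.

    Hypothetical helper: the person's own region is ignored, and the
    region in the output comes from the district row.
    """
    # Keep each (state, name) pair once, so a name recorded under
    # several regions of one state is not duplicated in the output.
    seen = set()
    state_names = []
    for state, _region, name in people:
        if (state, name) not in seen:
            seen.add((state, name))
            state_names.append((state, name))

    # Form all name-by-district combinations within each state.
    rows = []
    for state, name in state_names:
        for d_state, d_region, district in districts:
            if d_state == state:
                rows.append((d_state, d_region, district, name))
    return rows


result = expand_names_to_districts(districts, people)
for row in result:
    print(*row, sep=" | ")
```

With the three example names and five example districts, every name is repeated once per district of Arunachal Pradesh, which matches the layout of the table above.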
Many thanks,
Ciara