Hello everyone!
I'm a very beginner using STATA so please I'd like to get some help with merge command,
My purpose is to add variables to the master dataset and keep only what is matched related to a particular observation, so I've been told I must use merge command (I'm using merge in STATA 17). My master dataset has more observations than my using dataset, but they both have in common two variables householdnumber and household_number_member. I was suggested to use all common variables when merging, not just one in common.
However, in the master dataset the householdnumber variable has repeated observations. In other words, appears multiple times the same household number referring to each member of the household, who will be identified by household_number_member variable. Thus, household_number_member is identifed as 1= marriage immigrant, 2= national spouse, 3= children, etc. On the other hand, the using dataset doesn't have repeated observations in the variable householdnumber, because this dataset only uses in the household_number_member variable the observation 1 (which means, 1= marriage immigrant) that's why it has less observations.
By using merge, I'm only interested in keeping all related to 1 (which means 1=marriage immigrant) and adding variables that appear in the using dataset.
To make myself clear, and to get proper help. I'm using dataex command as follows (the following code corresponds to my master dataset)
* Example generated by -dataex-. For more info, type help dataex
clear
input long householdnumber byte household_number_member
10000028 1
10000028 2
10000031 1
10000031 2
10000062 1
10000068 1
10000068 2
10000124 1
10000124 2
10000140 1
10000140 2
10000186 1
10000187 1
10000187 2
10000213 1
10000232 1
10000232 3
10000270 1
10000270 2
10000292 1
10000293 1
10000293 2
10000315 1
10000315 2
10000328 1
10000328 2
10000344 1
10000393 1
10000404 1
10000404 2
10000415 1
10000415 2
10000419 1
10000419 2
10000422 1
10000422 2
10000455 1
10000455 2
10000492 1
10000492 2
10000497 1
10000497 2
10000533 1
10000533 2
10000549 1
10000556 1
10000556 2
10000556 3
10000556 4
10000556 21
10000592 1
10000592 2
10000592 3
10000592 21
10000601 1
10000601 21
10000608 1
10000608 2
10000615 1
10000658 1
10000658 2
10000700 1
10000700 2
10000700 3
10000712 1
10000712 2
10000712 3
10000713 1
10000713 2
10000764 1
10000771 1
10000771 2
10000774 1
10000774 2
10000787 1
10000855 1
10000855 2
10000894 1
10000894 2
10000894 3
10000902 1
10000902 2
10000952 1
10000952 2
10000952 21
10000970 1
10000970 2
10001003 1
10001003 2
10001004 1
10001004 2
10001069 1
10001069 2
10001142 1
10001193 1
10001193 2
10001193 3
10001193 4
10001200 1
10001200 2
end
The following code corresponds to my using dataset:
* Example generated by -dataex-. For more info, type help dataex
clear
input long householdnumber byte household_number_member
10000028 1
10000031 1
10000062 1
10000068 1
10000124 1
10000140 1
10000186 1
10000187 1
10000213 1
10000232 1
10000270 1
10000292 1
10000293 1
10000315 1
10000328 1
10000344 1
10000393 1
10000404 1
10000415 1
10000419 1
10000422 1
10000455 1
10000492 1
10000497 1
10000533 1
10000549 1
10000556 1
10000592 1
10000601 1
10000608 1
10000658 1
10000700 1
10000712 1
10000713 1
10000764 1
10000771 1
10000774 1
10000787 1
10000855 1
10000894 1
10000902 1
10000952 1
10000970 1
10001003 1
10001004 1
10001069 1
10001142 1
10001193 1
10001200 1
10001214 1
10001292 1
10001332 1
10001334 1
10001353 1
10001425 1
10001426 1
10001434 1
10001445 1
10001482 1
10001523 1
10001609 1
10001690 1
10001720 1
10001775 1
10001782 1
10001827 1
10001865 1
10001883 1
10001924 1
10001959 1
10001969 1
10002006 1
10002084 1
10002150 1
10002174 1
10002193 1
10002227 1
10002311 1
10002325 1
10002403 1
10002409 1
10002437 1
10002498 1
10002511 1
10002568 1
10002580 1
10002592 1
10002595 1
10002600 1
10002620 1
10002714 1
10002730 1
10002732 1
10002749 1
10002816 1
10002840 1
10002870 1
10002941 1
10002970 1
10003085 1
end
I've been using the command as follows, but STATA issued an error message saying: factor-variable and time-series operators not allowed
merge 1:1 householdnumber household_number_member "filename", keep if household_member_number==1
Any help would be very much appreciated.
Related Posts with How to merge two datasets and How to use keep
Error in loop codeHi I ran the following code. Code: foreach y in 2003 2004 2005 { foreach s in 3 6 8 15 27 3…
Reshape by groupHello! I need some help with collapsing some data into groups. The variables are country1 and countr…
Gsem and Logistic regressionHi dears please help me I want to run this model But when I run the model, the effect of variable "C…
Extracting label for a specific value of a variableHi I have a very basic question. I am working on a loop and inside the loop I want to extract the l…
Probiti have run a probit model and want to use it to identify which of my observations i can use as contr…
Subscribe to:
Post Comments (Atom)
0 Response to How to merge two datasets and How to use keep
Post a Comment