Hi,
I have three datasets: master data (userid, names, email addresses, salary, etc.), data1 (email and salary), and data2 (email, salary, and demographic information).
My master data has some gaps in salary, and I am trying to use data1 and data2 to fill those gaps.
My aim is to match the userid to salary, but since people used multiple email addresses in data1 and data2, I am finding it hard to match them. This is easily done manually, but I would like to be able to replicate the logic in future too, hence the use of Stata.
This is what I want: a loop, or some other way, that carries out the following steps:
(1) When salary is missing, match the email id in the master data to the email id in data1, and return the salary information if it exists.
(2) Where salary is still missing, match the master file email to the email ids in data2 to fill in the remaining salary information.
I have done this in Excel already, and the steps above bring my 100 missing salary values down to 10, which I am happy with.
I know how to do this the long way in Stata, i.e. create new files each time and match the missing ones, but I am hoping there is a loop in which I can tell Stata to fill in a missing salary by matching the master file email first against the data1 file and then against the data2 file.
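To make the idea concrete, here is a rough sketch of the kind of loop I have in mind. The file names (master.dta, data1.dta, data2.dta) and the variable names (email, salary) are placeholders I made up for illustration, and it assumes each email appears only once in data1 and in data2. I am not sure this is the best way to do it.

```stata
* Assumed names (placeholders, not my real ones): master.dta, data1.dta,
* data2.dta, with the key variable email and the variable salary in each file.
* Assumes email is unique within data1 and within data2 (needed for m:1).
use master, clear

foreach f in data1 data2 {
    * update fills in salary only where it is missing in the data in memory;
    * non-missing salaries are left as they are.
    * keep(1 3 4 5) drops observations that appear only in the using file.
    merge m:1 email using `f', update keepusing(salary) keep(1 3 4 5) nogenerate
}

* How many salaries are still missing after both passes
count if missing(salary)
```

Because data1 comes first in the loop, a salary found there would be kept and data2 would only be used for whatever is still missing afterwards.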
I would appreciate your input on this or a good way of implementing this.
Thank you
Anna