Question on merging datasets with duplicates

Hello all, I have a question regarding merging files with duplicates. I suspect the solution is quite simple, but I have not been able to come up with one (nor have any of my coworkers).

I have two datasets I’d like to merge. Both of them contain the numeric variables stckcd and year, along with other variables. The observations in dataset 1 are uniquely identified by stckcd and year. The observations in dataset 2 are not.

I want to merge the two datasets by stckcd and year so that, whenever a stckcd-year pair appears more than once in dataset 2, the values of the other variables from the matching observation in dataset 1 are repeated on each of those rows.

Here’s a simple example.

Dataset 1:

stckcd	year	A
1	2000	1
1	2001	1
2	2000	2

Dataset 2:

stckcd	year	B
1	2000	w
1	2000	x
1	2001	y
2	2000	z

Here's what I'd like the merged dataset to look like:

stckcd	year	A	B
1	2000	1	w
1	2000	1	x
1	2001	1	y
2	2000	2	z
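
For readers who want to see the intended logic spelled out, the result above is what is usually called a many-to-one merge: every row of dataset 2 looks up its single matching row in dataset 1 by (stckcd, year). In Stata this would presumably be a `merge m:1 stckcd year using ...` run with dataset 2 in memory, but the block below is only a minimal sketch in plain Python. The function name `merge_many_to_one` and the list-of-dicts layout are assumptions made for illustration, not anything from the original post.

```python
# Example data copied from the tables above.
# Dataset 1 is unique on (stckcd, year); dataset 2 may repeat a key.
dataset1 = [
    {"stckcd": 1, "year": 2000, "A": 1},
    {"stckcd": 1, "year": 2001, "A": 1},
    {"stckcd": 2, "year": 2000, "A": 2},
]
dataset2 = [
    {"stckcd": 1, "year": 2000, "B": "w"},
    {"stckcd": 1, "year": 2000, "B": "x"},
    {"stckcd": 1, "year": 2001, "B": "y"},
    {"stckcd": 2, "year": 2000, "B": "z"},
]


def merge_many_to_one(many, one, keys):
    """Hypothetical helper: attach to each row of `many` the columns of
    the single row of `one` that shares the same key values."""
    lookup = {}
    for row in one:
        key = tuple(row[k] for k in keys)
        if key in lookup:
            # The "one" side must be uniquely identified by the keys.
            raise ValueError(f"duplicate key {key} on the unique side")
        lookup[key] = row

    merged = []
    for row in many:
        key = tuple(row[k] for k in keys)
        # Rows of `many` with no match keep only their own columns.
        match = lookup.get(key, {})
        merged.append({**match, **row})
    return merged


merged = merge_many_to_one(dataset2, dataset1, keys=("stckcd", "year"))
for row in merged:
    print(row["stckcd"], row["year"], row["A"], row["B"], sep="\t")
```

One design choice to be aware of: this sketch keeps only the rows of dataset 2, so an observation in dataset 1 with no counterpart in dataset 2 is dropped. That does not matter for the example data, where every key in dataset 1 is matched.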

My problem seems similar to the one described here: https://www.statalist.org/forums/for...the-duplicates, but I’m not entirely sure what that user wanted the final dataset to look like.

Apologies in advance if this question is not phrased clearly enough. I am new to StataList.

BJ Data Tech Solution
