Hello Stata users,

In my dataset, each individual has two household identifiers that come from different sources, and the two variables number households differently. My goal is to generate a new household ID that groups together all individuals linked to the same household through either identifier. The data look like this:

ID  Household_ID1  Household_ID2  Sex
1   1              85             1
2   1              .              2
3   2              85             1
4   3              62             1
5   4              85             2
6   5              64             1
7   5              .              2

For example, individuals 1, 2, 3 and 5 belong to the same household here: 1 and 2 share Household_ID1 = 1, while 1, 3 and 5 share Household_ID2 = 85. The variable looks easy to create at first sight, but I cannot find a way to do it. I have tried collapse and bysort in several ways, but there are always problems with observations that have a missing value for one of the household IDs.
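
To make the expected result concrete, here is a minimal sketch of the grouping logic, written in Python rather than Stata purely for illustration. It assumes that two individuals belong to the same household whenever they are linked, directly or through a chain of other individuals, by a shared non-missing value of either identifier, and that a missing value links nobody. The names rows, find, union and new_id are made up for this sketch.

```python
# Sample data from the post: (ID, Household_ID1, Household_ID2, Sex).
# None stands in for Stata's missing value ".".
rows = [
    (1, 1, 85, 1),
    (2, 1, None, 2),
    (3, 2, 85, 1),
    (4, 3, 62, 1),
    (5, 4, 85, 2),
    (6, 5, 64, 1),
    (7, 5, None, 2),
]

# Union-find (disjoint-set) structure over persons and household codes.
parent = {}

def find(x):
    """Return the representative of x's set, registering x if it is new."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving keeps trees shallow
        x = parent[x]
    return x

def union(a, b):
    """Merge the sets that contain a and b."""
    root_a, root_b = find(a), find(b)
    if root_a != root_b:
        parent[root_b] = root_a

# Link each person to every non-missing household code they carry.
# Codes are tagged by source so that ID1 = 5 and ID2 = 5 never collide.
for pid, hh1, hh2, _sex in rows:
    person = ("person", pid)
    find(person)
    if hh1 is not None:
        union(person, ("hh1", hh1))
    if hh2 is not None:
        union(person, ("hh2", hh2))

# Number the resulting groups 1, 2, 3, ... in order of first appearance.
labels = {}
new_id = {}
for pid, *_ in rows:
    root = find(("person", pid))
    labels.setdefault(root, len(labels) + 1)
    new_id[pid] = labels[root]

print(new_id)
```

On the sample data this assigns household 1 to individuals 1, 2, 3 and 5, household 2 to individual 4, and household 3 to individuals 6 and 7, which is the variable I would like to obtain in Stata.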

Thank you in advance for any help!