Aloha,
I am using the NLSY97 for a project in which I need to link siblings within the data. My data are in long (person-year) format, with observations as follows:

ID age sibid year older hhid
1 x_1 2 1997 0 1
1 x_2 2 1998 0 1
2 y_1 1 1997 1 1
2 y_2 1 1998 1 1
etc.

The "older" variable simply indicates whether that sibling is the oldest within the household.

I would like to add another variable, "age_older", so that each of the younger sibling's person-year observations also carries the older sibling's age in that same year. I imagine the data looking as follows:

ID age sibid age_older year older hhid
1 x_1 2 y_1 1997 0 1
1 x_2 2 y_2 1998 0 1
2 y_1 1 0__ 1997 1 1
2 y_2 1 0__ 1998 1 1
etc.
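
To make the target concrete, here is a minimal sketch of the lookup I have in mind, written in Python purely for illustration and not in the software I am actually using. It assumes that sibid always points to the other sibling's ID and that (ID, year) uniquely identifies a row. The numeric ages are made up, the function name add_age_older is hypothetical, and it uses None (missing) for the older sibling's own rows, which is only one of the two options I ask about below.

```python
# Toy person-year rows mirroring the layout above; the ages are made-up numbers.
rows = [
    {"ID": 1, "age": 12, "sibid": 2, "year": 1997, "older": 0, "hhid": 1},
    {"ID": 1, "age": 13, "sibid": 2, "year": 1998, "older": 0, "hhid": 1},
    {"ID": 2, "age": 15, "sibid": 1, "year": 1997, "older": 1, "hhid": 1},
    {"ID": 2, "age": 16, "sibid": 1, "year": 1998, "older": 1, "hhid": 1},
]


def add_age_older(rows):
    """Return copies of the rows with an age_older field attached (hypothetical helper)."""
    # Index ages by (ID, year) so a sibling's age in the same year can be looked up.
    age_by_person_year = {(r["ID"], r["year"]): r["age"] for r in rows}
    out = []
    for r in rows:
        new = dict(r)
        if r["older"] == 1:
            # Assumption: the older sibling's own rows are coded as missing, not 0.
            new["age_older"] = None
        else:
            # Same-year age of the linked sibling; None if that person-year is absent.
            new["age_older"] = age_by_person_year.get((r["sibid"], r["year"]))
        out.append(new)
    return out


linked = add_age_older(rows)
for r in linked:
    print(r["ID"], r["year"], r["age"], r["age_older"])
```

The lookup is keyed on both ID and year, so a year in which the older sibling was not interviewed simply yields a missing value for the younger sibling as well.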

Two questions. First, how do I write a command to do this? Second, should "age_older" for the older sibling's own observations (the last two person-year rows above) be 0 or "." for missing? It really is an "N/A".

I will be using this in a regression discontinuity design, with the cutoff based on the age of the older sibling.

Thank you!