Hi there,

I'm working on a hierarchical dataset in which individuals are nested within households. I would like to count the number of brothers/sisters each individual has in the household. For each person, I have their ID, the household ID, their father's and mother's IDs within the household, and their gender.

Code:
clear
input double(pid fid code gender codef codem)
110001101 110001 101 1 201 202
110001102 110001 102 0 204 205
110002101 110002 101 1 201 202
110002102 110002 102 0 203   .
110002103 110002 103 0 101 102
110003101 110003 101 1 201 202
110003102 110003 102 0 203 204
110003103 110003 103 0 101 102
110005101 110005 101 0 102 103
110005102 110005 102 1 201 104
110005103 110005 103 0 202 203
110005104 110005 104 0 204 205
110006101 110006 101 1 201 202
110006102 110006 102 0 203 204
110006103 110006 103 0 101 102
110007101 110007 101 0 102 104
110007102 110007 102 1 201   .
110007103 110007 103 1 102 104
110007104 110007 104 0 202   .
110009101 110009 101 0 201 202
end

label var pid "Individual ID"
label var fid "household ID"
label var code "ID within household"
label var gender "gender"
label var codef "father ID within household"
label var codem "mother ID within household"


To count brothers, for example, I'd first treat as a brother any other male in the same household who shares the person's father ID or mother ID, and then count how many such brothers each person has in the household.
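
To make the target concrete, here is a rough loop-based sketch of the count I have in mind. It assumes the example data above is in memory and that gender == 1 means male (as in my attempt below); n_brothers is just a name made up for illustration.

```stata
* Loop-based reference count (illustrative only; n_brothers is a made-up name)
gen long n_brothers = 0
forvalues i = 1/`=_N' {
    * count other males in the same household who share a nonmissing
    * father ID or a nonmissing mother ID with observation `i'
    quietly count if fid == fid[`i'] & pid != pid[`i'] & gender == 1 & ///
        ((codef == codef[`i'] & codef < .) | (codem == codem[`i'] & codem < .))
    quietly replace n_brothers = r(N) in `i'
}
```

In the example data the only pair sharing parents is 110007101 and 110007103, so 110007101 should get 1 and everyone else 0.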

I'd like to do this without looping through the data. Here's my attempt:

Code:
bysort fid  : g sib_male=sum((gender[_n+1]==1)  &((codef==codef[_n+1] & codef<. ) |  ///
                                                  (codem==codem[_n+1] & codem <. ) ) )
bysort fid : replace sib_male=sib_male[_N]
However, the code seems to be incorrect: it assigns everyone in a household the same number of brothers.

Can anyone help me figure out the correct way to do this? Many thanks!