I am working with a dataset that has both household-level and individual-level variables. We have a total of 1,217 households in the sample. I'm trying to figure out how many households in the sample have had at least one individual diagnosed with malaria in the last twelve months. Each household survey asks questions about the individuals in the household. So the variable inm1_malaria is an individual-level variable and represents the question "Has NAME been diagnosed with malaria in the last 12 months?"
When I tab inm1_malaria (the individual-level variable), I of course get more observations than the total number of households in the sample, because there is more than one individual per household. When I run: tab hh_no if inm1_malaria==1, I still get more observations than we have households. It counts each hh_no every time that hh_no is assigned to an individual in the dataset. So if there are 4 individuals with hh_no CHP/02/0004, it counts that as 4 separate households.
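For concreteness, here is a minimal sketch of one common way to collapse an individual-level indicator to the household level in Stata. It is not a confirmed solution for this dataset. It assumes inm1_malaria is coded 1 for "yes" and that hh_no uniquely identifies a household. The names hh_any_malaria and hh_tag are made up for illustration.

```stata
* Flag every member of a household in which at least one person reports malaria.
* Assumption: inm1_malaria == 1 means "yes"; missing answers are treated as "not diagnosed".
egen byte hh_any_malaria = max(inm1_malaria == 1), by(hh_no)

* Tag exactly one observation per household so each household is counted once.
egen byte hh_tag = tag(hh_no)

* Tabulate households rather than individuals.
tabulate hh_any_malaria if hh_tag

* Or report only the number of households with at least one case.
count if hh_tag & hh_any_malaria
```

If the sketch's assumptions hold, the tabulate total should match the number of distinct hh_no values in the data, which is a quick check that households rather than individuals are being counted.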
Any advice on how to get household level data from the individual observations nested within the household would be much appreciated.
Best,
Nancy