Hello,

I am planning to fit an HLM model predicting the odds of HIV risk behavior among people who are aware of their HIV status, using the demographic surveys. I am considering recent surveys from 30 countries.

I would fit either a two-level model (individuals nested in countries) or a three-level model (individuals nested in regions, which are nested in countries).
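
To make the two options concrete, here is a rough sketch of the models I have in mind, assuming random intercepts only. The notation is my own: i indexes individuals, j indexes the grouping level (countries in the first equation, regions in the second), k indexes countries, and x stands for the covariates.

```latex
% Two-level model: individual i in country j
\operatorname{logit} \Pr(y_{ij} = 1) = \beta_0 + \boldsymbol{\beta}^{\top} \mathbf{x}_{ij} + u_j,
\qquad u_j \sim N(0, \sigma_u^2)

% Three-level model: individual i in region j of country k
\operatorname{logit} \Pr(y_{ijk} = 1) = \beta_0 + \boldsymbol{\beta}^{\top} \mathbf{x}_{ijk} + v_{jk} + u_k,
\qquad v_{jk} \sim N(0, \sigma_v^2), \quad u_k \sim N(0, \sigma_u^2)
```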

Here are my sample sizes:

Level 3: Max of 30 countries
Level 2 (regions): varies from 3 to 15 regions per country
Level 1: Individuals (men/women): 2 to 1,000 who are aware of their positive status.

Now, my question relates to the sample size. I've read some of the literature in this area and am aware of the 30-30 rule (30 groups with 30 observations per group).

Since my focus is on those who are presumably aware of their seropositivity, I will restrict the analysis to countries that have sufficiently large samples of that focal population. The total number varies from 2 to 1,000, but the counts differ by gender.

So, should I take the gender differences into account (i.e., the number of men and women) when deciding whether to include a country? CountryA may have 30 observations in total, but considerably more women (25) than men (5). In that case, should I include the country or drop it? In other words, are the sample sizes in the subgroups of interest important when determining the sample?
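
To show the two inclusion rules I am weighing, here is a minimal Python sketch. Everything in it is made up for illustration: the records are not survey data, CountryB is a hypothetical comparison case, the per-gender threshold of 10 is an arbitrary assumption, and `country_flags` is just a helper name I chose.

```python
from collections import Counter

# Hypothetical respondent records as (country, gender) pairs.
# CountryA mirrors the example above: 25 women and 5 men.
records = (
    [("CountryA", "F")] * 25 + [("CountryA", "M")] * 5
    + [("CountryB", "F")] * 40 + [("CountryB", "M")] * 35
)

MIN_TOTAL = 30       # total-size threshold taken from the 30-30 rule of thumb
MIN_PER_GENDER = 10  # assumed subgroup threshold, chosen only for illustration


def country_flags(records, min_total=MIN_TOTAL, min_per_gender=MIN_PER_GENDER):
    """Count respondents per country and gender and apply both rules."""
    counts = Counter(records)
    flags = {}
    for country in sorted({c for c, _ in records}):
        women = counts[(country, "F")]
        men = counts[(country, "M")]
        total = women + men
        flags[country] = {
            "women": women,
            "men": men,
            "total": total,
            # Rule 1: only the overall count matters.
            "meets_total": total >= min_total,
            # Rule 2: the smaller gender subgroup must also be large enough.
            "meets_subgroup": min(women, men) >= min_per_gender,
        }
    return flags


flags = country_flags(records)
for country, result in flags.items():
    print(country, result)
```

Under these assumed thresholds, CountryA passes the total-size rule but fails the subgroup rule, which is exactly the case I am unsure about.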

Thanks - Yy