Hello Everyone,

I have a dataset with 100 households from each of 3 cities, and 5 members listed for each household, so 1500 (3*100*5) observations in total. Each household member is assigned to one of 20 groups (numbered 1 to 20) based on certain characteristics.

Let's call the variables city, household, member and group.
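In case it helps anyone reproduce this, here is a minimal sketch of a toy dataset with the same layout. It is only an illustration: the seed is arbitrary, I am assuming household numbers restart at 1 within each city, and groups are assigned at random here, whereas in my real data they come from member characteristics.

```stata
* Hypothetical toy data: 3 cities x 100 households x 5 members = 1500 rows
clear
set seed 12345                      // arbitrary seed, for reproducibility only
set obs 3
gen city = _n                       // cities 1..3
expand 100                          // 100 households per city
bysort city: gen household = _n     // household id restarts within each city (assumption)
expand 5                            // 5 members per household
bysort city household: gen member = _n
* Assumption: random group assignment stands in for the real characteristics
gen group = ceil(20 * runiform())   // group number between 1 and 20
count                               // should report 1500 observations
```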

I want to select 20 members from each city, one from each group, using the sample command or any other efficient method. My condition is that at most one member can be selected from any household.

When I run the following command:

bysort city group: sample 1, count

I get one member sampled from each group within each city, but in some cases this command selects more than one member from the same household.
What I want is this: if a member of household1 in a given city is selected for some group, then no other member of household1 in that city should be sampled, and each city should still end up with exactly one member from each group.
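To show the problem concretely, the sketch below runs the same command and then counts how many sampled members each household contributed. The variable name n_selected is just something I made up for this check, and I am assuming household numbers identify a household only within a city.

```stata
* Draw one member per group within each city (the command from above)
bysort city group: sample 1, count

* Count how many sampled members come from each household within a city
bysort city household: gen n_selected = _N

* Any value above 1 means a household was picked more than once
tabulate n_selected
list city household member group if n_selected > 1, sepby(city household)
```

Whenever the list is not empty, the draw violates my one-member-per-household condition.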

Kindly advise how I can achieve this.

Thank you!

Amit