Hello,

I would like to check whether there is a relationship between mothers' fertility and grandparental childcare. I am using a cross-national survey dataset. I restricted the sample to 6 countries of interest and kept only female respondents of fertile age who lived with a partner and had at least one child under 14 years old at time t.

Dependent variable: dummy taking value 1 if the respondent had an additional birth between time t and t+1

Main explanatory variable: dummy taking value 1 if the respondent received help with childcare from her parents at time t

Control variables for respondent, all measured at time t:
- age (numeric variable, ranging from 18 to 45)
- education (two dummy variables: the first takes value 1 if highest completed schooling is upper secondary; the second takes value 1 if highest completed schooling is tertiary; the reference category is therefore lower secondary schooling)
- employment status (dummy taking value 1 if respondent is employed)
- financial stability (dummy taking value 1 if respondent reports getting by financially with ease)
- conservative views (dummy taking value 1 if respondent agrees with a given statement)
- number of biological children (numeric, between 1 and 3)
- age of youngest child (numeric, between 1 and 14)

I intend to use a logit model and I would like to report separate coefficients for each country.

The number of observations per country ranges from n=180 to n=1100. The proportion of respondents who exhibit the outcome of interest ranges from 6.5% to 12% per country (for one of the countries, this means only 15 respondents had an additional birth).

1. Are these enough observations per country to run a logit model and obtain meaningful estimates?

2. If the answer to 1 is yes, would it be correct to run the following command and interpret the resulting coefficients:
by country, sort: logit additionalchild grandparenthelp age upsec_edu tertiary_edu employed fin_stable conservative nrbiokids ageyoungest

In other words, does running the regressions by country solve possible issues with clustering?

3. If the answer to 1 is no, and I instead run a single pooled regression (the command above without the by country prefix), would I then have to use the option vce(cluster country)?
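
To be concrete, the pooled specification I have in mind for question 3 is sketched below. It reuses the variable names from the command in question 2; the only addition is the vce(cluster country) option, and I am assuming that country is the variable identifying the six countries.

```stata
* Pooled logit over all six countries (no by-country prefix),
* with standard errors clustered on country
logit additionalchild grandparenthelp age upsec_edu tertiary_edu employed fin_stable conservative nrbiokids ageyoungest, vce(cluster country)
```

As written, this yields one set of coefficients for the pooled sample rather than one set per country.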


I would really appreciate any help on this, thank you.