Hi all,
I have multilevel data: individuals from different countries, observed over a period of 8 years. The individuals are not the same from year to year (so these are repeated cross-sections rather than a panel). The dataset is also unbalanced: not every country participates in every year.

I am running a probit, and I intend to control for country and year fixed effects.

So my Stata command is:

probit preference age i.gender i.educlevel i.YEAR i.COUNTRY, vce(robust)

Since my data are two-level, I have read that using probit should be fine. Is that right?

But I have also read that, to account for the clustering, I should specify vce(cluster COUNTRY) for my standard errors.

So I alternatively tried:

probit preference age i.gender i.educlevel i.YEAR i.COUNTRY, vce(cluster COUNTRY)

However, when I do this, explanatory variables that were significant with the robust standard errors become insignificant with the clustered ones.
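
To see the two sets of standard errors side by side, something like the following should work. This is only a sketch: the stored-estimate names rob and clu are arbitrary labels chosen for illustration, and the table will also list all the year and country dummies.

```stata
* Fit the model with heteroskedasticity-robust standard errors and store it
probit preference age i.gender i.educlevel i.YEAR i.COUNTRY, vce(robust)
estimates store rob

* Refit with standard errors clustered on COUNTRY and store it
probit preference age i.gender i.educlevel i.YEAR i.COUNTRY, vce(cluster COUNTRY)
estimates store clu

* Coefficients should be identical across columns; only the standard errors differ
estimates table rob clu, b(%9.4f) se(%9.4f)
```

The point estimates are the same in both columns, since the vce() option only changes how the standard errors are computed.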

Can someone help explain what is happening in these commands?

Another thing to note is that the number of countries in my data is around 40.

In similar literature, I have come across authors saying that they included country and time dummies (which I am doing using i.) and also clustered their robust standard errors at the country level.

Alternatively, I wanted to ask what happens when, instead of the first command, the following commands are used:

meprobit preference age i.gender i.educlevel i.YEAR i.COUNTRY, vce(robust)

meprobit preference age i.gender i.educlevel i.YEAR || COUNTRY:, vce(robust)


Is using -probit- with clustered standard errors doing the same thing as -meprobit- with vce(robust)? I ask because the Stata manual says that in meprobit, vce(robust) is similar to clustering at the highest level, which in this case is COUNTRY.