Dear Statalisters!

I am currently working with a panel dataset from China. I use fixed effects when the dependent variable is continuous and ordered logit when it is ordinal.
However, I am unable to wrap my head around clustering.

I am trying to examine the effect of various regressors on individual spending. I have a large number of variables for each individual for the years 2008 and 2011. The individuals were interviewed in their cities, i.e. the researchers selected 100 cities in China and surveyed people there.

I now basically do the following:

Code:
xtset ID year
xtreg y x1 x2 x3 x4 x5, fe
xtlogit y x1 x2 x3 x4 x5

Now I am thinking about clustering. As far as I understand clustering, I should cluster at the city level, so this would become:

Code:
xtreg y x1 x2 x3 x4 x5, fe vce(cluster city)
xtlogit y x1 x2 x3 x4 x5, vce(cluster city)

However, when I do this I get an error saying that the panels are not nested within clusters. This happens because some people moved to a different city between the two survey rounds, so city is not constant within ID. As it stands, I therefore cannot cluster at the city level. One option would be to drop everyone who moved between the two survey rounds and then cluster at the city level. That removes only 1% of the observations, but I do not think it is statistically right.
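
To make that option concrete, this is roughly what I have in mind. It is only a sketch: mover is a variable name I made up, and it assumes city records where each person lived in each wave.

```stata
* Flag individuals whose city differs between their first and last observed wave
bysort ID (year): gen byte mover = city[1] != city[_N]

* Check how many observations belong to movers
tabulate mover

* Re-estimate on non-movers only, so that each ID sits in exactly one city cluster
xtreg y x1 x2 x3 x4 x5 if !mover, fe vce(cluster city)
```

I use an if condition rather than drop so that the movers stay in the dataset and the two samples can be compared.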

Someone else told me that I should instead cluster at the individual level and include city dummies in the equation, which would give:

Code:
xtreg y x1 x2 x3 x4 x5 i.city, fe vce(cluster ID)
xtlogit y x1 x2 x3 x4 x5 i.city, vce(cluster ID)

However, I do not understand the rationale for clustering at the individual level while including dummies for the cities.

Is anyone able to help me with this?

Many thanks in advance!
Andreas