Dear all,

I have pooled cross-sectional data (weakly balanced; 9 different years of wealth and income data for individuals belonging to 4 races, 2 genders, and 3 categories of education). I have been referring to the Stata documentation on this; there is a whole PDF manual about svy.
What I gathered is that when we use svy, we don't need to calibrate the variables ourselves using the weights given in the data. For instance, we don't need to calculate a weighted mean by hand, like (X1*W1 + X2*W2 + ...)/(W1 + W2 + ...). We just specify the survey design using svyset and put "svy:" before any estimation command. Each year has multiple individuals belonging to each race category, and there are probably different individuals in each year rather than the same individuals sampled every year. I think this is sampling with replacement (because there is a new set of individuals each year? Or not? How can I tell?), but I am not sure. The data look roughly like this:

weight     Year   Race    Gender   Age   Income
1678.4     1990   Black   M        55
.2         1990   White   M        25
6546       1990   Black   M        44
151.55     1990   White   F        56
564.55     1991   White   F        60
54.66      1991   White   M        30
1483.08    1991   Black   M        29
452.6      1991   Black   F        48
111.56     1992   White   M        65

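To be concrete about the manual calculation I mentioned, this is a minimal sketch of what I mean by a hand-computed weighted mean, next to the built-in weighted estimator. It assumes the variables are named weight and income as in the excerpt; wx, num, and den are just names I made up for illustration.

```stata
* Sketch only: assumes variables named weight and income, as in the excerpt.
* wx, num, and den are hypothetical names used for illustration.

* Weighted mean by hand: sum(w*x) / sum(w)
generate double wx = weight * income
quietly summarize wx
scalar num = r(sum)
quietly summarize weight if !missing(income)
scalar den = r(sum)
display "weighted mean by hand = " num/den

* Point estimate from the weighted estimator, for comparison
mean income [pweight = weight]
```

As far as I understand, the two point estimates should agree; it is the standard errors that depend on how the design is declared.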
My questions are:
1. How can I tell whether the data were sampled with replacement or without replacement?
2. How can I tell whether it is a one-stage or a two-stage design? (This needs to be known when specifying the survey design.)
3. Which approach is better for plotting trends in income: using svyset, running regressions, and then using marginsplot; OR skipping svyset, calculating weighted means, and drawing a line plot; OR collapse (p50) income [pw=weight], by(year race)?
4. If we use svyset, how do we know the type of weights (aweight, pweight, etc.)?
5. svyset PSU [pweight=pw], strata(strata)
Is it alright if I use year as strata?
6. What is the primary sampling unit here? I think each individual?
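
To make the questions above concrete, here is a rough sketch of the options I am comparing. It assumes the variable names from the excerpt (weight, year, race, income) and that race is stored as a string; race_n is a name I made up. Declaring individuals as the PSUs (_n) and year as the strata is only my guess at the design, which is exactly what I am unsure about.

```stata
* Sketch only. Assumes variables weight, year, race, income as in the excerpt,
* with race stored as a string. race_n is a hypothetical name.
encode race, generate(race_n)     // factor-variable notation needs a numeric variable

* Guess at the design: each observation is its own PSU, year as strata
svyset _n [pweight = weight], strata(year)

* Option A: survey regression, then predicted means by year and race
svy: regress income i.year##i.race_n
margins year#race_n
marginsplot

* Option B: weighted means without svyset, then a line plot
preserve
collapse (mean) income [pw = weight], by(year race_n)
twoway line income year, by(race_n) sort
restore

* Option C: weighted medians instead of means
preserve
collapse (p50) income [pw = weight], by(year race_n)
twoway line income year, by(race_n) sort
restore
```

Options B and C give me the trend lines but no design-based standard errors, which is why I am wondering whether Option A is the better route.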

Any help would be greatly appreciated.
Thank you.