Question about using one seed to draw samples from multiple data sets

Can anyone help me understand how the seed works in Stata when drawing samples from multiple data sets?

I have two data sets, A and B. Each contains 400 observations divided into 4 groups (100 observations per group). I would like to draw 30 observations from each group in each data set and make sure the draw is reproducible.

I tried two methods:
1. I load data set A, run set seed 12345, then run sample 30, count by(group). Then I load data set B and run sample 30, count by(group) again. After sampling data set A and before sampling data set B, I run display c(seed) to confirm that the seed displayed is the same at both points.
2. I append data set B to data set A (with a variable, data, marking each observation as A or B), then run set seed 12345 followed by sample 30, by(group data).
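
For reference, the two methods above can be sketched as a do-file. This is only an outline of the steps as described: the file names A.dta and B.dta are assumptions, and the marker variable data is created here with append's generate() option (0 for A, 1 for B), which may differ from how the labelling was actually done.

```stata
* Method 1: set the seed once, then sample each data set in turn
use A, clear                      // A.dta is an assumed file name
set seed 12345
sample 30, count by(group)        // keep 30 observations per group
display c(seed)                   // seed reported after sampling A

use B, clear                      // B.dta is an assumed file name
display c(seed)                   // seed reported before sampling B
sample 30, count by(group)

* Method 2: append B to A, then sample once within group and data set
use A, clear
append using B, generate(data)    // data = 0 for A, 1 for B (assumed marker)
set seed 12345
sample 30, by(group data)         // 30 percent of each 100-observation cell
```

Note that without the count option, sample 30 keeps 30 percent of each by-group, which here also comes to 30 observations because every cell holds 100.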

My question is: why do the two methods produce different samples? More precisely, the sample for data set A is identical under both methods, but the sample for data set B differs. Can anyone please explain how the seed works in Stata sampling?

Thank you very much for your help!

BJ Data Tech Solution
