Code:
clear all
version 16.1
set seed 872387
set sortseed 636445
set obs 1000
gen id = _n
gen x = 0
replace x = 1 in 200/600
sort x
list id in 1/5
In Stata 16.1 SE, I get: 998, 112, 992, 943, 690.
Here is another example using merge:
Code:
clear all
version 16.1
set seed 872387
set sortseed 636445
set obs 1
gen x = 1
tempfile s
save `s'
clear
set obs 1000
gen id = _n
gen x = 0
replace x = 1 in 200/600
merge m:1 x using `s'
list id in 1/10
In Stata 16.1 SE, I get: 998, 112, 992, 943, 690.
I understand that I can work around this problem by doing a stable sort (prior to the merge, in the second example). But I'm surprised that setting the version, seed, and sortseed does not appear to be sufficient to ensure reproducibility across editions of the same Stata version. Is that true, or am I missing something? Is this due to how sorting is parallelized?
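For reference, here is a minimal sketch of the stable-sort workaround mentioned above, applied to the first example. It assumes that id uniquely identifies observations, which holds here because id = _n. It shows two options: the stable option of sort, and adding id as a tie-breaker so that the sort order is fully determined by the data rather than by sortseed.

```stata
clear all
version 16.1
set obs 1000
gen id = _n
gen x = 0
replace x = 1 in 200/600

* Option 1: a stable sort keeps tied observations in their existing order
sort x, stable
list id in 1/5

* Option 2: break ties explicitly with a unique key
isid id          // assumption check: errors out if id is not unique
sort x id
list id in 1/5
```

With either option, the order of observations within each value of x no longer depends on how ties are broken, so it should not differ between SE and MP. For the merge example, the same stable sort on x would go immediately before the merge command.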
Note that I did not observe this problem with fewer observations. The following produced the same output in both Stata 16.1 SE and Stata 16.1 MP (2-core):
Code:
clear all
version 16.1
set sortseed 636445
set obs 100
gen id = _n
gen x = 0
replace x = 1 in 20/60
sort x
list id in 1/5