I think I have encountered a bug in Stata where using a temporary frame may result in data loss in the main data frame.
The following is the minimal code that reproduces the problem:
Code:
clear all
version 16.0
sysuse nlsw88
tempname tmp
frame create `tmp'
frame `tmp' : {
    generate int a=.
    generate double b=.
}
frame change `tmp'
describe
frame default: describe
// end of file
When the do-file terminates, the temporary data frame is disposed of, and the frame named default becomes the active frame (as discussed earlier with William Lisowski here).
But here is where the bug comes in: the default frame retains only as many variables as were present in the temporary frame at the time it was disposed of (only 2 in this example). You can convince yourself of this by running another describe command from the command line after the do-file completes.
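The check described above can be sketched as follows. This is only an illustration of what to type at the command line once the do-file has finished; it uses * comments because // comments are not accepted interactively. If the bug is present, the variable count reported for the default frame would match the temporary frame (2 in this example) rather than the full variable list of nlsw88.

```stata
* Show the name of the current frame (should be default)
pwf
* List the frames in memory together with their dimensions
frame dir
* Short summary: number of observations and variables
describe, short
* Number of variables in the current frame
display c(k)
```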
This means that if you are unlucky enough to run code like the above, then continue modifying your dataset without inspecting it and save the result, you will overwrite your source file and lose data (unless you kept a backup of the source). The fact that some variables are retained aggravates the issue: visually the data is still there, and you really have to check the inventory of variables to find out what is going on.
I have further found that:
1) the flag indicating that the dataset has changed is not set in this case, as shown by: display c(changed)
2) workaround: manually switching the current frame back to the default before the do-file ends makes sure that the variables in it are not affected: frame change default
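Applied to the minimal example, the workaround from point 2 might look like the sketch below. It is the same do-file with one extra frame change default placed before the end, on the assumption that the switch has to happen while the temporary frame still exists. I have only observed this on the two builds listed in this post, so treat it as a mitigation rather than a verified fix.

```stata
clear all
version 16.0
sysuse nlsw88
tempname tmp
frame create `tmp'
frame `tmp' : {
    generate int a=.
    generate double b=.
}
frame change `tmp'
describe
* Workaround: return to the default frame explicitly, so that the
* temporary frame is not the current frame when the do-file ends.
frame change default
describe
// end of file
```

Where the temporary frame only needs to be inspected or filled, the frame prefix (as in frame default: describe) runs a command in another frame without making it current, which presumably avoids the situation altogether; I have not tested that variant separately.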
I have not checked what happens if the temporary frame contains more variables than the default frame.
This has been observed in
- Stata MP 16.0(797) on Windows (64-bit x86-64), and further re-confirmed in
- Stata MP 16.1(839) on Windows (64-bit x86-64).
Thank you, and best regards,
Sergiy