Hi,
I am merging two datasets with 240 variables in common, specifying the update option so that missing values in the master file are replaced with nonmissing values from the using file. I have discovered that if the master data contain the extended missing value .a and the using data have system missing (.) in the same place, the value is "updated" to system missing (that is, . replaces .a), and the resulting _merge value is 4 ("missing updated"), even though the value is still missing (now . instead of .a). I gather this is because system missing < extended missing in Stata's ordering of missing values, but I have to say, this behavior does not seem intuitive and is not what I want: I want the original .a to remain and NOT be replaced by sysmiss.
(In this particular case, I use .a to signify a survey response of "not applicable"; maybe that is not an appropriate or conventional use of extended missing values, but I sure find it handy! It is a missing value for most calculation purposes, but with more information embedded in it than a system missing value.)
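To make the situation concrete, here is a minimal sketch that should reproduce it. The variable names id and q1, the tempfile names, and the values are made up for illustration; they are not from my real data.

```stata
* Hypothetical example data: id and q1 are illustrative names only.
clear
input id q1
1 .a
2 5
3 .
end
tempfile master
save `master'

clear
input id q1
1 .
2 .
3 7
end
tempfile usingdata
save `usingdata'

* Merge with update and inspect the result observation by observation.
use `master', clear
merge 1:1 id using `usingdata', update
list id q1 _merge, nolabel
```

If the behavior is as I describe above, id 1 should come out with q1 equal to . (not .a) and _merge equal to 4, while id 3 is a genuine update from . to 7.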
I have searched the help files and the web and have not been able to find anywhere that this behavior (sysmiss replacing extended missing in merge, update and being counted as "missing updated") is documented or mentioned. I am still trying to figure out what to do about it (suggestions welcome), but at the very least I thought I would put in a plea for this to be documented somewhere.
Incidentally, because I wanted information on what was happening with each individual variable in the merge, I decided to run the merge separately for each variable, i.e., 240 successive merges. I don't think that makes a difference, but thought I'd mention it. It definitely helped me track and diagnose the issue.
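One possible workaround, sketched here under assumptions rather than offered as a tested solution: keep a copy of each master variable before a single merge, then put back any extended missing value that ended up as system missing. The names id, q1, q2, the _orig suffix, and the example values are all hypothetical. With long variable names the suffix could exceed Stata's 32-character limit, so temporary variables might be a better fit for real data.

```stata
* Sketch only: id, q1, q2 and the tempfile names are hypothetical.
clear
input id q1 q2
1 .a .b
2 5  .a
3 .  4
end
tempfile master
save `master'

clear
input id q1 q2
1 .  .
2 .  9
3 7  .
end
tempfile usingdata
save `usingdata'

use `master', clear
local vars q1 q2

* Keep a copy of each master variable before the merge.
foreach v of local vars {
    clonevar `v'_orig = `v'
}

merge 1:1 id using `usingdata', update

* Restore extended missing values that are now system missing.
* Stata orders values as nonmissing < . < .a < ... < .z, so the
* condition "copy > ." is true only for extended missing values.
foreach v of local vars {
    quietly count if `v' == . & `v'_orig > .
    display "`v': " r(N) " extended missing value(s) restored"
    replace `v' = `v'_orig if `v' == . & `v'_orig > .
    drop `v'_orig
}

list id q1 q2 _merge, nolabel
```

The count inside the loop gives a per-variable tally without running 240 separate merges. Note that this sketch does not touch _merge, so an observation whose only "update" was .a becoming . would still be coded 4.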
I am using Stata 15.
Thanks so much,
Deb Holtzman