Hi,
I have a matched dataset of ~150k patients, with each matched pair consisting of one case and one control. I used ccmatch; cases and controls were matched on patient_id, a string variable unique to each patient. Each matched pair has a unique value for the variable match.
Now I have appended a new file to this dataset. The new file contains many (but not all) of the patient_ids in the existing dataset, plus some additional patient_ids (which I will want to discard). Since matching was not done on the new file, match is missing for every observation that came from it. I want to populate match for the newly appended observations whose patient_id is in the existing dataset, i.e., for the matched cases and controls. After that, I will drop the excess patients who came from the new file (i.e., those patients not matched in the original dataset). The excess patients should be easy to identify, since at that point they should be the only patients with missing match values.
My question: how can I populate a unique value for one variable based on the unique value of another variable? Specifically, how can I populate missing match values, based on existing match and patient_id values? I am thinking this may start with a replacement of match if match==., based on patient_id, but am unsure exactly how to write this out.
An example, using match value 5565, is below. First is the case, then the control. In the newly-appended data, patient_ids 12345 and 67890 may be present, but match (and match_id) would be missing.
[Attached image: example rows for match value 5565]
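One possible approach, sketched here on a toy dataset: it assumes match is numeric (as the match==. test suggests) and that each patient_id has at most one non-missing match value. The patient_ids 12345 and 67890 and the match value 5565 come from the example above; patient_id 99999 is a hypothetical excess patient added for illustration.

```stata
clear
* Toy data: two original matched rows, then three rows from the appended file
input str5 patient_id match
"12345" 5565
"67890" 5565
"12345" .
"67890" .
"99999" .
end

* Safety check: no patient_id carries two different non-missing match values
bysort patient_id (match): assert match == match[1] if !missing(match)

* Numeric missing values sort last, so within each patient_id the first
* observation holds the known match value whenever one exists
bysort patient_id (match): replace match = match[1] if missing(match)

* Anything still missing belongs to patients absent from the original matched data
drop if missing(match)

list
```

If match were a string variable instead, empty strings would sort first rather than last, so the fill would need match[_N] in place of match[1].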