Hi,
I have a matched dataset of ~150k patients, with each matched pair consisting of one case and one control. I used ccmatch; cases and controls were matched on patient_id, a string variable unique to each patient. Each matched pair has a unique value for the variable match.
Now I have appended a new file to this dataset. The new file contains many (but not all) of the patient_ids in the existing dataset, plus some additional patient_ids (whom I will want to discard). Since matching was not done on the new file, match is missing for every observation that came from it. I want to populate match for the newly appended observations whose patient_ids are in the existing dataset, i.e., for the matched cases and controls. After that, I will drop the excess patients who came from the new file (i.e., those patients not matched in the original dataset). The excess patients should be easy to identify, since at that point they should be the only patients with missing match values.
My question: how can I populate a unique value for one variable based on the unique value of another variable? Specifically, how can I populate missing match values, based on existing match and patient_id values? I am thinking this may start with a replacement of match if match==., based on patient_id, but am unsure exactly how to write this out.
An example, using match value 5565, is below. First is the case, then the control. In the newly-appended data, patient_ids 12345 and 67890 may be present, but match (and match_id) would be missing.
[Attachment: listing of the example case and control for match value 5565]
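One possible approach, sketched here under stated assumptions rather than as a definitive answer: within each patient_id, sort so that the nonmissing match value comes first, then copy it into the rows where match is missing. The sketch assumes match is numeric (numeric missing values sort last in Stata) and that each patient_id has at most one distinct nonmissing match value. The toy data, the patient_id 99999, and the variable newfile are hypothetical and only there to make the example self-contained.

```stata
clear
* Toy data: two matched patients from the original file (newfile == 0),
* the same two patients appended from the new file with match missing,
* and one extra patient (99999) who was never matched.
input str5 patient_id long match byte newfile
"12345" 5565 0
"67890" 5565 0
"12345" .    1
"67890" .    1
"99999" .    1
end

* Within each patient, sort by match so the nonmissing value is in row 1
* (numeric missing sorts last), then fill it into the missing rows.
bysort patient_id (match): replace match = match[1] if missing(match)

* Patients who still have missing match were never matched: drop them.
drop if missing(match)

list, sepby(patient_id)
```

If the variable to be filled is a string (match_id may be one), empty strings sort first rather than last, so the same idea would use the last observation in each group instead, e.g. `bysort patient_id (match_id): replace match_id = match_id[_N] if missing(match_id)`.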