Data cleaning question: I have a longitudinal data set from 2009 to 2020 involving children and adolescents with type 1 diabetes. In my data file, each row holds data for one study participant, identified by person_number (a string variable). The columns include the date an HbA1c lab test was measured (HbA1c_date), the central HbA1c result (central_HbA1c) and the local HbA1c result (local_HbA1c).
My question: how do I write code that first identifies the dates on which a study participant had both a central HbA1c value and a local HbA1c value reported? On those dates I then need to exclude the local HbA1c value, because the central HbA1c value is presumed to be more trustworthy.
*If central_HbA1c and local_HbA1c were measured on the exact same date (HbA1c_date), then drop the local_HbA1c value and keep only the central_HbA1c value.
*E.g. on 18 Feb 12, central_HbA1c = 61 mmol/mol and local_HbA1c = 63 mmol/mol, so drop the local_HbA1c value of 63 and keep the central_HbA1c value of 61.
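The post does not say which statistics package is in use, so here is one possible sketch of the rule in plain Python. It assumes the file is in long format (one row per participant per HbA1c_date, with a missing value stored as None), which may not match the real layout. The sample rows, the helper name has_both and the reading of "18 Feb 12" as 18 February 2012 are all made up for illustration.

```python
from datetime import date

# Hypothetical sample data: one dict per participant per test date.
# None marks a lab value that was not reported on that date.
records = [
    {"person_number": "A001", "HbA1c_date": date(2012, 2, 18),
     "central_HbA1c": 61, "local_HbA1c": 63},
    {"person_number": "A001", "HbA1c_date": date(2012, 5, 20),
     "central_HbA1c": None, "local_HbA1c": 58},
    {"person_number": "B002", "HbA1c_date": date(2012, 2, 18),
     "central_HbA1c": 70, "local_HbA1c": None},
]


def has_both(row):
    """True when a central AND a local value were reported on this row's date."""
    return row["central_HbA1c"] is not None and row["local_HbA1c"] is not None


# Step 1: list the (participant, date) pairs that have both values.
same_day = [(r["person_number"], r["HbA1c_date"]) for r in records if has_both(r)]

# Step 2: build a cleaned copy in which the local value is blanked on those
# dates only. Rows with just a local value keep it; the input is not modified.
cleaned = [
    {**r, "local_HbA1c": None} if has_both(r) else dict(r)
    for r in records
]

for person, test_date in same_day:
    print(person, test_date.isoformat())
```

Keeping the step 1 list separate from the step 2 cleaning makes it possible to review which dates were affected before any local value is discarded. If the real file is in wide format (several date and value columns per participant row), it would need to be restructured to one row per date first.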