I've got a vexing data wrangling question that I need help figuring our. I've got a medium-sized set of survey data (about 40k rows) with two kinds of duplicate records that I'm trying to filter out. After my wrangling is done, my goal is to have only the most recent partial responses for which there is no complete survey record.
Thus I'm trying to:
- Drop any partial response that is associated with a complete response (same respondent ID) and then drop the complete response
- Drop any duplicate partial responses so that I only have the most recent partials left
Thank you in advance for any help!
-Nick
0 Response to Data Wrangling Question
Post a Comment