Hi Stata Forum,
I am relatively new to Stata. I am working on a project that uses data scraped from a website where people entered the information manually. I have a string variable that is supposed to contain a simple one-phrase description such as "آخر جلسة" ("last session").
Quite a few entries contain information that should not be there, e.g. a date, or several copies of the entry in one cell:
المتابعة (إنشاء الملف) : [12008/2102/2020 آخر جلسة 2021-03-31 09:00:00] [42/2201/2020 آخر جلسة 2020-11-04 12:00:00]
المتابعة (إنشاء الملف) : [13/2114/2020 آخر جلسة 2021-02-02 13:00:00]
Most of the data has been entered correctly and the mistakes are not consistent, so I can't simply delete the first set of unneeded digits.
One of my ideas is to split the variable on spaces, then drop the pieces that are incorrect, and then work the correct values back into a single column using if conditions and replace. Does this sound reasonable, and are there any commands that could make this easier?
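To make the idea concrete, here is a rough sketch of one alternative to splitting on spaces: flag the entries that contain digits and build a cleaned copy with a regular expression. It assumes the variable is called descr (a hypothetical name), that the unwanted material is limited to digits and the punctuation seen in the two examples above, and that Stata 14 or later is available for the Unicode regex functions. I have not run it on the real data.

```stata
* Sketch only. "descr" is a hypothetical name for the scraped string variable.
* ustrregexm() and ustrregexra() need Stata 14 or later.

* 1. Flag entries that contain digits, i.e. case numbers, dates, or times.
generate byte has_digits = ustrregexm(descr, "[0-9]")

* 2. Build a cleaned copy: remove digits and the punctuation used in the
*    case numbers, dates, and times, then collapse the leftover blanks.
generate clean = ustrregexra(descr, "[0-9/:\[\]-]", "")
replace clean = ustrtrim(stritrim(clean))

* 3. Inspect the result for the flagged entries before overwriting anything.
tabulate clean if has_digits, sort
```

This leaves repeated phrases in place where an entry was entered more than once, so those rows would still need a second pass. If splitting on spaces turns out to be the better route, the split command with the parse(" ") option creates one new variable per word.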
Kind regards,
Mathew Toll