Hi Stata Forum,
I am a relatively new Stata user. I am working on a project that uses data scraped from a website where people entered the information manually. I have a string variable that is supposed to contain a simple one-phrase description, e.g. "آخر جلسة" ("last session").
Quite a few entries contain information that should not be there, such as a date or repeated copies of the entry:
المتابعة (إنشاء الملف) : [12008/2102/2020 آخر جلسة 2021-03-31 09:00:00] [42/2201/2020 آخر جلسة 2020-11-04 12:00:00]
المتابعة (إنشاء الملف) : [13/2114/2020 آخر جلسة 2021-02-02 13:00:00]
Most of the data has been entered correctly and the mistakes are not consistent, so I can't simply delete the first set of unneeded digits.
One of my ideas is to split the variable on spaces, drop the pieces that are incorrect, and then combine the correct values into a single column using if conditions and replace. Does this sound reasonable, and are there any commands that could make this easier?
Kind regards,
Mathew Toll
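
As a rough illustration of an alternative to splitting on spaces, the unwanted pieces could be removed with regular expressions. The sketch below is in Python rather than Stata, purely to show the idea. The patterns are assumptions inferred only from the two sample entries above, and `clean_description` is a hypothetical helper name. Stata's Unicode regular-expression functions (for example `ustrregexra()`) may allow a similar substitution, though that is not tested here.

```python
import re

# Assumed noise patterns, based only on the two sample entries in the post:
#   dates like 2021-03-31, times like 09:00:00, slash-separated numbers like 13/2114/2020
NOISE = re.compile(r"\d{4}-\d{2}-\d{2}|\d{1,2}:\d{2}:\d{2}|\d+(?:/\d+)+")

# Text inside square brackets, e.g. "[13/2114/2020 ... 13:00:00]"
BRACKETED = re.compile(r"\[([^\]]*)\]")


def clean_description(raw):
    """Return the description phrase(s) with dates, times and slash-numbers removed.

    If the entry contains bracketed segments, only those are inspected;
    otherwise the whole string is cleaned. Repeated phrases are kept once.
    """
    segments = BRACKETED.findall(raw) or [raw]
    phrases = []
    for segment in segments:
        # Replace noise with a space, then collapse runs of whitespace
        phrase = " ".join(NOISE.sub(" ", segment).split())
        if phrase and phrase not in phrases:
            phrases.append(phrase)
    # Join distinct phrases so unexpected leftovers stay visible for manual review
    return " | ".join(phrases)


messy = "المتابعة (إنشاء الملف) : [13/2114/2020 آخر جلسة 2021-02-02 13:00:00]"
print(clean_description(messy))
```

Because the mistakes are described as inconsistent, it would probably be worth tabulating the cleaned values afterwards and handling by hand any entries that still do not match one of the expected phrases.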