Hi Stata Forum,
I am relatively new to Stata. I am working on a project that uses data scraped from a website where people entered the information manually. I have a string variable that is supposed to contain a simple one-phrase description such as "آخر جلسة" (roughly, "last session").
Quite a few entries contain information that should not be there, e.g. a date and time, or several copies of the entry in a single cell:
المتابعة (إنشاء الملف) : [12008/2102/2020 آخر جلسة 2021-03-31 09:00:00] [42/2201/2020 آخر جلسة 2020-11-04 12:00:00]
المتابعة (إنشاء الملف) : [13/2114/2020 آخر جلسة 2021-02-02 13:00:00]
Most of the data has been entered correctly and the mistakes are not consistent, so I can't simply delete the first set of unneeded digits.
One of my ideas is to split the variable on spaces, then drop the pieces that are incorrect, and then combine the correct pieces into a single column using if conditions and replace. Does this sound reasonable, and are there any commands that could make this easier?
Kind regards,
Mathew Toll
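
One possible alternative to splitting on spaces is to strip the unwanted patterns directly with Stata's Unicode regular-expression functions. The lines below are only a sketch under some assumptions: the variable name raw_desc is hypothetical, Stata 14 or later is assumed (for ustrregexra()), and the two patterns are guessed from the two example rows above, so other kinds of entry errors would need their own patterns.

```stata
* raw_desc is a hypothetical name for the scraped string variable
generate clean_desc = raw_desc

* Remove timestamps shaped like 2021-03-31 09:00:00
replace clean_desc = ustrregexra(clean_desc, "[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}", "")

* Remove slash-separated numbers shaped like 12008/2102/2020
replace clean_desc = ustrregexra(clean_desc, "[0-9]+/[0-9]+/[0-9]+", "")

* Remove the square brackets left behind
replace clean_desc = subinstr(clean_desc, "[", "", .)
replace clean_desc = subinstr(clean_desc, "]", "", .)

* Collapse repeated blanks and trim leading/trailing blanks
replace clean_desc = strtrim(stritrim(clean_desc))

* Inspect what is left; repeated phrases or other errors will show up here
tabulate clean_desc, missing
```

Working on a copy (clean_desc) keeps the original values available for comparison. Rows that held several bracketed entries will still contain the phrase more than once after this step, so the tabulate output is a reasonable place to decide which remaining values need a further replace. If splitting is still preferred, split raw_desc, parse(" ") generate(part) would produce one new variable per space-separated piece.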