Hi Stata Forum,
I am relatively new to Stata. I am working on a project that uses data scraped from a website where people entered information manually. I have a string variable that is supposed to contain a simple one-phrase description, such as "آخر جلسة" ("last session").
Quite a few entries contain information that should not be there, e.g. dates or several repetitions of the entry:
المتابعة (إنشاء الملف) : [12008/2102/2020 آخر جلسة 2021-03-31 09:00:00] [42/2201/2020 آخر جلسة 2020-11-04 12:00:00]
المتابعة (إنشاء الملف) : [13/2114/2020 آخر جلسة 2021-02-02 13:00:00]
Most of the data has been entered correctly and the mistakes are not consistent, so I can't simply delete the first set of unneeded digits.
One of my ideas is to split the variable on spaces, then drop the pieces that are incorrect, and then work the correct values back into a single column using if conditions and replace. Does this sound reasonable, and are there any commands that could make this easier?
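To make the question concrete, here is a rough sketch of a simpler variant I have been considering: instead of splitting on spaces, check each row for a known valid phrase with ustrpos() and copy that phrase into a new variable. The variable names raw and clean, the three example rows, and the one-item phrase list are placeholders I made up for illustration, and I have not tried this on the full data.

```stata
* Sketch only: raw, clean and the rows below are made-up placeholders
clear
set obs 3
generate strL raw = ""
replace raw = "آخر جلسة" in 1
replace raw = "المتابعة (إنشاء الملف) : [13/2114/2020 آخر جلسة 2021-02-02 13:00:00]" in 2
replace raw = "المتابعة (إنشاء الملف) : [12008/2102/2020 آخر جلسة 2021-03-31 09:00:00] [42/2201/2020 آخر جلسة 2020-11-04 12:00:00]" in 3

* Start with an empty cleaned variable
generate strL clean = ""

* For each phrase known to be valid, copy it into clean when it appears
* anywhere in raw. Only one phrase is listed here; further valid phrases
* would be added to the list, each in its own double quotes.
foreach p in "آخر جلسة" {
    replace clean = "`p'" if ustrpos(raw, "`p'") > 0 & clean == ""
}

* Rows where no known phrase was found stay empty and can be checked by hand
list raw if clean == ""
```

This assumes the valid phrases can be listed in advance. Rows that match more than one phrase would keep whichever phrase comes first in the list, so the order of the list matters.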
Kind regards,
Mathew Toll