Hello Statalist --
Hope everyone is doing well. I am currently cleaning a dataset and have run into a small issue. When I tried to destring one of my variables with the code below, I received this error:
Code:
destring under_20_count, replace
under_20_count contains nonnumeric characters; no replace
After tabulating the variable I did not spot any nonnumeric characters, so I forced the destring. Instead of replacing the variable, I generated a new one to avoid data loss. My intention was that this would help me figure out what is going on:
Code:
destring under_20_count, generate(newvar) force
under_20_count contains nonnumeric characters; newvar generated as int
(1817 missing values generated)
After browsing my dataset, I found that Stata was treating the "," in some entries as a nonnumeric character. For example, an observation with a number greater than 1,000 was classified as nonnumeric (because of the comma) and therefore came out as a missing value when I forced the destring. I know screenshots are not ideal since forum helpers cannot work with them, but my dataset is too big to share and I wanted to include one to illustrate the problem.
As you can see from the screenshot, the original entries without commas were converted successfully, while those with a comma were not.
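Since a screenshot is hard to work with, here is a rough sketch of how the same pattern could be checked in code. It assumes under_20_count is still a string variable and that the comma is the only character causing trouble, which I have not fully confirmed:

```stata
* Count nonempty entries that real() cannot convert to a number
count if missing(real(under_20_count)) & !missing(under_20_count)

* Count entries that contain a comma
count if strpos(under_20_count, ",") > 0

* Show the distinct comma-containing values (may be long in a big dataset)
tabulate under_20_count if strpos(under_20_count, ",") > 0
```

If the first two counts agree, the commas would account for every nonempty entry that fails to convert.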
Any ideas or suggestions on how I can go about overcoming this?
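One approach I am considering is the ignore() option of destring, which drops the listed characters before converting. This is only a sketch: newvar2 and newvar3 are placeholder names, and it assumes every comma is a thousands separator rather than a decimal mark:

```stata
* Strip commas during conversion; newvar2 is a placeholder name
destring under_20_count, generate(newvar2) ignore(",")

* Check that no nonempty original entry ended up missing
count if missing(newvar2) & !missing(under_20_count)

* Alternative: remove the commas explicitly, then convert with real()
generate double newvar3 = real(subinstr(under_20_count, ",", "", .))
```

If the count in the middle step is 0, every nonempty entry converted, and the original variable could then be dropped or replaced.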