I thought about this many times, since I did not expect to find anything surprising in a string function as basic as trim(), but hear me out on this one.
While it is not necessary for understanding what I find to be unexpected behavior, here is some background on what I am working on. I am working on a command that is used to test ODK-based questionnaire forms. Read more about it here as well if you want. The command is still in development, so the documentation is not yet at the level it will be when we publish it, but I always want to give some context when I ask a question.
Since the questionnaires are written in Excel files, I use import excel to import several columns from an Excel sheet, many of which consist of string values. Some of the tests my command runs are sensitive to leading and trailing spaces, so I must remove them so that "ABC " becomes "ABC". I use the function trim() for that. However, in the data set that can be accessed here, I have one string variable with 10 observations, 3 of which have values for which trim() does not work as I expected.
When I run
Code:
replace name = trim(name)
the trailing spaces in those three values are not removed. If you load the data set linked above and run this code, you can see what I see:
Code:
*Open data and show string values
use trim_example.dta
tab name

*Replace leading and trailing spaces
replace name = trim(name)
tab name //check that spaces were not removed

*Keep only one of these values for a less cluttered charlist result
keep in 7

*Install charlist and return all char codes used in the string value
ssc install charlist
charlist name
return list //See that regular space 32 and non-breaking space 160 are both used.
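If the character that trim() leaves behind really is the non-breaking space that charlist reports as code 160, one possible workaround is to turn it into a regular space before trimming. This is only a minimal sketch under assumptions, not a confirmed fix: it assumes the value is stored as the single byte 160, and name_clean is a hypothetical variable name used so that the original variable is kept. If the strings are UTF-8 encoded, byte 160 can also be part of other characters, in which case uchar(160) would probably be the safer thing to replace.

```stata
* Sketch: replace byte 160 (non-breaking space) with a regular space, then trim.
* name_clean is a hypothetical variable; the original name is left untouched.
generate name_clean = trim(subinstr(name, char(160), " ", .))

* List the observations where the cleaned value differs from the original.
list name name_clean if name != name_clean
```

Writing to a new variable rather than replacing name makes it easy to check which observations were affected before committing to the change.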