Hello all,

Using Stata 15.1/IC

I need to submit a bulk file with a string variable ("NAME" variable in this example) that is required to have no special characters besides ampersand and dash. I am able to accomplish this using the following series of commands:


charlist NAME //shows which characters are in my string var NAME
"&',-./01234689ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnop qrstuvwxyz



egen NEWNAME= sieve(NAME), omit(,./`"""'`"'"') // generates new variable with the special characters omitted but retains & and -

Results:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str86(NAME NEWNAME)
"Single-Benefits, Inc."                                       "Single-Benefits Inc"                                              
"Superstar, LLC"                                              "Superstar LLC"                                                  
"RML Agency, Inc."                                            "RML Agency Inc"                                        
"A & M Company, Inc."                                         "A & M Company Inc"
end
While this approach works as intended, I would prefer a command that does not depend on listing the specific characters to omit, since those could change between datasets (e.g. a character like "+" or "@" would not be removed by my code if a string variable contained it; I'd have to update the command manually). Plus, the way you have to set off double and single quotation marks makes the command hard to read in the log file.

I thought I could generalize the command with the char() function, looping over the integer codes of the ASCII characters with forvalues (on the assumption that I will not run into any non-ASCII special characters), but I get the following error:

. forvalues i = 33/37 39/44 46/47 58/64 91/96 123/126 {
2. replace NAME = subinstr(NAME, char(`i'), "", .)
3. }
invalid syntax
r(198);


I am, however, able to use the foreach command without error:
. foreach i in 33 34 35 36 37 39 40 41 42 43 44 46 47 58 59 60 61 62 63 64 91 92 93 94 95 96 123 124 125 126 {
2. replace NAME = subinstr(NAME, char(`i'), "", .)
3. }
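
To double-check the list of codes, here is a small sketch (a separate check, not part of the run above) confirming that the two codes skipped in the loop, 38 and 45, are the ampersand and dash I want to keep, and that 44 is the comma removed in the example data:

```stata
* char(n) returns the character with ASCII code n
assert char(38) == "&"
assert char(45) == "-"
assert char(44) == ","

* Show the two characters that the loop leaves alone
display char(38) + " " + char(45)
```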


My question is why the forvalues command doesn't work. My guess is that I simply got the syntax wrong, but I also wondered whether Stata treats values passed to char() differently than I thought when they come from forvalues.

Of course, if there is an even better way to accomplish the elimination of all special characters besides ampersands and dashes, I am all ears. Thanks for any advice.
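
One possible direction, sketched under assumptions rather than tested on the full data: a regular-expression replace with ustrregexra() that keeps a whitelist of characters instead of listing the ones to drop. The variable name CLEAN and the whitelist itself (letters, digits, spaces, ampersand, dash) are illustrative assumptions; the two rows are taken from the example listing above.

```stata
* Two rows of example data from the listing above
clear
input str40 NAME
"Single-Benefits, Inc."
"A & M Company, Inc."
end

* Delete every character that is not a letter, digit, space, & or -
* (this whitelist is an assumption about what counts as "special")
generate str40 CLEAN = ustrregexra(NAME, "[^A-Za-z0-9 \&\-]", "")
list NAME CLEAN
```

Because the pattern names what to keep, characters such as "+" or "@" would be removed without editing the command.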