Dear Statalist,


I have a string variable "comment", stored as "strL", that contains a mix of numbers, characters and spaces.

              storage   display    value
variable name   type    format     label      variable label
--------------------------------------------------------------
comment         strL    %9s                   comment



I need to extract from it the data source that each record comes from. One source can be "SoWMy", another "NRI", and so on, with potentially 10 different data sources. I have about 28,000 records.

I am a bit lost on which substring command I should use.

I tried

    gen source1 = regexs(1) if(regexm(comment, "SoWMY"))

and got the error "invalid number, outside of allowed range".

Any advice would be most appreciated.

Kind regards,

Amani