Dear Stata users,
I have a question about the accuracy of merging on a string key. In dataset A, firm_id is stored as a string. Most values contain only digits, but some also contain letters, like the following:

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str8 patent_id
"RE43814"
"RE43864"
"RE43868"
"RE43956"
"RE43986"
"RE43997"
"RE44164"
"RE44215"
"RE44861"
"RE44874"
"RE44924"
"RE44930"
"RE44958"
"RE44993"
"RE45248"
"RE45348"
"RE45418"
"RE45473"
"RE45539"
"RE45733"
"RE45782"
"RE45804"
"RE45956"
"RE45962"
"RE45990"
"RE46020"
"RE46089"
"RE46096"
"RE46176"
"RE46193"
"RE46351"
"RE46409"
"RE46436"
"RE46436"
"RE46436"
"RE46473"
"RE46488"
"RE46518"
"RE46558"
"RE46564"
"RE46630"
"RE46686"
"RE46703"
"RE46746"
"RE46850"
"RE46891"
"RE47055"
"RE47257"
"RE47341"
"RE47342"
"RE47351"
"RE47425"
"RE47487"
"RE47553"
"RE47663"
"RE47698"
"RE47715"
"RE47736"
"RE47737"
"RE47761"
"RE47763"
"RE47813"
"RE47857"
"RE47949"
"RE48267"
"RE48274"
"RE48308"
"RE48359"
"RE48378"
"RE48446"
"RE48524"
"RE48532"
"RE48599"
"RE48641"
"RE48695"
"RE48702"
"T100501"
"T958006"
"T962010"
"T964006"
"T965001"
"T988005"
end
I also have dataset B, in which firm_id contains only numbers and is stored as long.
I want to merge the two datasets using firm_id as the key, and I see two options:
1. convert the string variable to long
2. convert the long variable to string
With the first option, I think merging on a long key would be more accurate, but I would have to drop the firms whose IDs contain letters, and I do not know the right way to identify and drop those observations.
With the second option, I would not have to give up any observations, but I am unsure how accurate a merge on a string key is. Will the match be exact when most of the IDs contain only digits?
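To make the two options concrete, here is a rough sketch of what I have in mind, run on toy data rather than my real files. Everything in it is an assumption for illustration: the toy IDs, the indicator variables has_char and in_B, the use of patent_id as the key name (as in the example above), and the use of m:1 on the guess that the key is unique in dataset B. Using regexm() to flag IDs that are not all digits is only my tentative attempt, so please correct me if there is a better way.

```stata
* Sketch only: toy data stand in for datasets A and B
clear
input str8 patent_id
"1234567"
"7654321"
"RE43814"
"T100501"
end
* Flag IDs that are not made up entirely of digits
generate byte has_char = !regexm(patent_id, "^[0-9]+$")
tempfile A
save `A'

* Toy dataset B with a long key (assumed unique)
clear
input long patent_id
1234567
7654321
9999999
end
generate byte in_B = 1
tempfile B
save `B'

* Option 1: drop IDs containing characters, convert to numeric, merge
use `A', clear
drop if has_char
destring patent_id, replace
merge m:1 patent_id using `B'
tabulate _merge

* Option 2: convert the key in B to string and keep all observations of A
use `B', clear
tostring patent_id, replace
tempfile B_str
save `B_str'
use `A', clear
merge m:1 patent_id using `B_str'
tabulate _merge
```

In option 2 the IDs with letters stay in the data and simply show up as unmatched master observations. My remaining worry with the string key is whether things like leading zeros or stray spaces in dataset A could stop otherwise identical IDs from matching.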

Thanks!