Dear community,

I want to clean a string variable that contains the name of a bank. Sometimes, it also includes some numeric content (a percentage indication) as the example from dataex shows below:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str79 Lender str20 LenderCountry_clean
"ABN AMRO BIBF"                 ""          
"ABN AMRO BIBF"                 ""          
"ABN AMRO BIBF"                 ""          
"ABN AMRO BIBF"                 ""          
"ABN AMRO BIBF"                 "Thailand"  
"ABN AMRO BIBF 10.00%"          ""          
"ABN AMRO BIBF 5.40%"           ""          
"ABN AMRO BIBF 6.00%"           ""          
"ABN AMRO BIBF 6.13%"           ""          
"ABN AMRO BIBF 6.29%"           ""          
"ABN AMRO BIBF 7.42%"           ""          
"ABN AMRO Bank Bhd"             ""          
"ABN AMRO Bank Bhd"             ""          
"ABN AMRO Bank Bhd"             ""          
"ABN AMRO Bank Bhd"             ""          
"ABN AMRO Bank Bhd"             ""          
"ABN AMRO Bank Bhd"             ""          
"ABN AMRO Bank Bhd"             ""          
"ABN AMRO Bank Bhd 1.50%"       ""          
"ABN AMRO Bank Bhd 1.50%"       ""          
"ABN AMRO Bank NV"              ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""            
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        "Netherlands"
"ABN AMRO Bank NV [RBS]"        "Netherlands"
"ABN AMRO Bank NV [RBS]"        "Netherlands"
"ABN AMRO Bank NV [RBS] 1.10%"  ""          
"ABN AMRO Bank NV [RBS] 20.00%" ""          
"ABN AMRO Bank NV [RBS] 32.62%" ""          
"ABN AMRO Bank NV [RBS] 5.00%"  ""          
"ABN AMRO Bank NV [RBS] 6.67%"  ""          
"ABN AMRO Inc"                  ""          
"ABN AMRO Inc"                  ""          
"ANZ Grindlays Bank Plc"        ""          
end

My objective is to complete the observations of the variable LenderCountry_clean if the bank name is the same. Now, what I did so far was to use the following command to fill up the observations:

Code:
bysort Lender (LenderCountry_clean) : replace LenderCountry_clean = LenderCountry_clean[_N] if missing(LenderCountry_clean) & _n < _N
However, the problem are now the observations that contain some values (percentages), i.e. 1.10%, 20.00%. In this case, the bank name is not exactly the same and I cannot fill up the observations as I would like to. For example, for the lender "ABN AMRO Bank NV [RBS] 1.10%" I would like to have the information "Netherlands" for the variable LenderCountry_clean.

Hence, my question is if there is a way to "clean" the variable Lender to get rid of the percentages. Note that these percentage indications, if there are any, are always at the end of the string. Is there a way to clean the variable Lender such that I can fill up the observations with the information from LenderCountry_clean ?

I greatly appreciate any help or comments and thank you in advance.

Best wishes,
Elio