Dear all,

I am working with string data and would like to flag observations whose string variable contains exactly the following two elements:

(i) two letters from the alphabet (A-Z)
(ii) special characters ("/" "." ";" and others)

In particular, the dataset looks like this:

obs string_var
1 GX105
2 9F978
3 (H753)
4 K?174P

I would like the dataset to look like this:

obs string_var dummy_var
1 GX105 1
2 9F978 0
3 (H753) 0
4 K?174P 1
where 'dummy_var' is the dummy variable for whether the observation satisfies the proposed criteria.

Can you help me find a solution?

Thank you very much!


Below is the code for importing the example dataset into Stata:

input byte obs str10 string_var
1 "GX105"
2 "9F978"
3 "(H753)"
4 "K?174P"
end

Note: I tried to use 'dataex', but in this case I found it easier to provide the input code directly.
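
To make the intended rule concrete, here is a minimal sketch. It assumes the criterion amounts to "exactly two uppercase letters A-Z, with digits and special characters ignored", which is what the expected dummy_var values in the example suggest. The variable name n_letters is only illustrative, and ustrregexra() and ustrlen() require Stata 14 or later. This may not be the best approach:

```stata
* Assumption: the flag should be 1 when string_var holds exactly two
* uppercase letters A-Z; digits and special characters are ignored.

* Count the letters by removing every A-Z character and comparing lengths.
* n_letters is a hypothetical helper variable, not part of the original data.
generate n_letters = ustrlen(string_var) - ustrlen(ustrregexra(string_var, "[A-Z]", ""))

* Flag observations with exactly two letters.
generate byte dummy_var = (n_letters == 2)

list obs string_var n_letters dummy_var
```

On the four example observations the letter counts are 2, 1, 1 and 2, so dummy_var comes out as 1, 0, 0, 1. If lowercase letters should also count, the pattern would need to be "[A-Za-z]" instead.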