How to create an iterative procedure to check variable by variable whether a given observation satisfies some criteria - string variables

Dear all,

I am working with string data and I would like to retrieve the most important word from a string variable that reflects a text-input field. I have already split the original variable into five different variables.

The dataset looks like this:

obs	word1	word2	word3	word4	word5
1	2.45X90			VASSOURA
2	N9020X	PIVO	SUPERIOR
3	(S	1063T)	LANTERNA
4	15W4020	L		OLEO
5	E V A	GLITER

I would like to create a new variable that retrieves the most important word out of these five different string variables.

In particular, I would like the dataset to look as follows:

obs	word1	word2	word3	word4	word5	final_word
1	2.45X90			VASSOURA		VASSOURA
2	N9020X	PIVO	SUPERIOR			PIVO
3	(S	1063T)	LANTERNA			LANTERNA
4	15W4020	L		OLEO		OLEO
5	E V A	GLITER				GLITER

where 'final_word' is the variable that holds the most important word out of 'word1', 'word2', 'word3', 'word4' and 'word5'.

The criteria are the following: if 'word?' has

(i) no special characters ("." "(" "*" and others)
(ii) no numbers
(iii) no whitespace among letters (see obs == 5, where word1 is "E V A", for a case of whitespace among letters)
(iv) length > 1 (considering the length of 'word?')

then 'final_word' == 'word?'.

I would like to check word1 first, then word2, then word3, and so forth, keeping the first word that meets the criteria.

Could you help me to find a solution for that?

Thank you very much!

Below I provide the code for importing the example dataset into Stata:

clear
input byte obs str20 word1 str20 word2 str20 word3 str20 word4 str20 word5
1 "2.45X90" "" "" "VASSOURA" ""
2 "N9020X" "PIVO" "SUPERIOR" "" ""
3 "(S" "1063T)" "LANTERNA" "" ""
4 "15W4020" "L" "" "OLEO" ""
5 "E V A" "GLITER" "" "" ""
end

Note: I tried to use 'dataex', but in this case I found it easier to provide the import code.
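
For reference, the criteria above could be sketched roughly as follows. This is only one possible approach, not a definitive solution. It assumes the example data above are in memory, that word1 to word5 are stored next to each other in the dataset, that Stata 14 or later is available (for ustrregexm()), and that "letters" means unaccented A-Z or a-z. Under those assumptions, criteria (i) to (iv) collapse into a single pattern: the whole string must consist of two or more letters.

```stata
* Sketch only: keep the first of word1-word5 that is made up
* entirely of letters and is at least two characters long.
generate str20 final_word = ""

foreach v of varlist word1-word5 {
    * Fill final_word only where no earlier word has qualified yet.
    * "^[A-Za-z]{2,}$" rejects digits, punctuation, embedded blanks,
    * empty strings and single letters in one test.
    replace final_word = `v' if final_word == "" & ustrregexm(`v', "^[A-Za-z]{2,}$")
}

list obs word1-word5 final_word, noobs
```

Because the loop runs through the variables in order and only fills 'final_word' while it is still empty, the first qualifying word is kept. On the five example observations this should give VASSOURA, PIVO, LANTERNA, OLEO and GLITER. If the real data contain accented letters, the character class would need to be widened, for example with a Unicode letter class.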
