Dear Stata community,

Please help!
I have messy data containing over thirty variations of different words (e.g., leadership) in four columns for over 800 observations. A screenshot of the data is attached. I could not use the dataex command due to the large size of the file.

How do I quickly count how many times each word appears across all four columns?

I also need to cross-tabulate these words two at a time. How would I do that?

Do I need to recode each word into a numeric value?

I would appreciate your help!
Olena