Dear Stata community,
Please help!
I have messy data containing over thirty variations of different words (e.g., leadership) in four columns for over 800 observations. A screenshot of the data is attached. I could not use the dataex command due to the large size of the file.
How do I quickly count how many times each word appears across all four columns?
I also need to cross-tabulate these words, two at a time. How would I do that?
Do I need to recode each word into a numeric value?
I would appreciate your help!
Olena