Dear Stata community,
Please help!
I have messy data containing over thirty variations of different words (e.g., leadership) in four columns for over 800 observations. A screenshot of the data is attached. I could not use the dataex command due to the large size of the file.
How do I quickly count how many times each word appears across all four columns?
I also need to cross-tabulate these words, two at a time. How would I do that?
Do I need to recode each word into a numeric value?
I would appreciate your help!
Olena