Hi,
I'm using Stata for my short paper.
I'm working on a merged long dataset with 656 observations.
I have 12 string variables called "microvoce1", "microvoce2", ..., "microvoce12".
Each variable reports the needs expressed by the interviewees (656). Specifically:
- microvoce1 reports the first need expressed by a person (in total, 656 people reported a first need)
- microvoce2 reports the second need expressed by a person (in total, 295 people reported a second need, which means that microvoce2 has 361 missing values)
- microvoce3 reports the third need, and so on (in total, 162 people reported a third need, so microvoce3 has 494 missing values)
Each need has a specific labelled code ("POV", "POV01", "POV02", ...). The codes are the same for all 12 variables, but not all variables contain all codes.
For example, the first variable (microvoce1) contains "POV", "POV01", "POV02" with associated frequencies;
the second variable (microvoce2) contains "POV", "POV01", "POV99";
the third variable (microvoce3) contains "POV", "POV99", "CAS".
I want to obtain a new variable (called microvoceTOTAL) that contains all the labelled codes contained in the individual 12 variables and the corresponding frequency sums.
If microvoce1 contains
"POV" = 2
"POV01" = 1
"POV02" = 3
and microvoce2 contains
"POV" = 4
"POV01"= 2
"POV99" = 5
and microvoce3 contains
"POV" = 1
"POV99" = 12
"CAS" = 8
the new variable (microvoceTOTAL) should contain
"POV" = 7 (2+4+1)
"POV01" = 3 (1+2)
"POV02" = 3
"POV99" = 17
"CAS" = 8
I tried to use "stack", but it is not the solution, since I need to keep all the other variables in the dataset, while "stack" makes me lose all of them.
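One possible way around this, as a minimal sketch rather than a tested solution for the real data: reshape the variables to long inside preserve/restore, tabulate the pooled codes, and get the original dataset back untouched afterwards. The toy data, the identifier id, and the scalar n_needs are hypothetical and only there to make the example self-contained.

```stata
* Toy data (hypothetical): 4 people, 3 need variables instead of 12
clear
input str5(microvoce1 microvoce2 microvoce3)
"POV"   "POV"   ""
"POV01" "POV99" "CAS"
"POV"   ""      ""
"POV02" "POV"   "POV99"
end

* Assumption: there is no person identifier yet, so one is created
gen long id = _n

* Work on a temporary copy so every other variable survives
preserve
keep id microvoce*
* One row per person and need position (order = 1, 2, 3)
reshape long microvoce, i(id) j(order)
* Needs that were never reported are empty strings
drop if missing(microvoce)
* Pooled frequency of each code across all need variables
tabulate microvoce, sort
* Keep the pooled total for later checks
quietly count
scalar n_needs = r(N)
restore
```

On the real data the same steps would apply to microvoce1 through microvoce12, provided no other variable name starts with "microvoce" (an earlier microvoceTOTAL attempt would have to be dropped first). The pooled total should then match the expected 1185.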
I also tried to use
gen microvoceTOTAL = microvoce1+microvoce2 (for example)
but it just concatenates the strings, so the result in microvoceTOTAL is "POVPOV" = 1 rather than "POV" = 2
Lastly, I tried to use
gen microvoceTOTAL=.
replace microvoceTOTAL=1 if microvoce1=="POV" | microvoce2=="POV" | microvoce5=="POV"
replace microvoceTOTAL=2 if microvoce1=="POV01" | microvoce2=="POV01"
and so on for all the codes, but the result is still a variable with 656 observations, while the sum of the frequencies of all the labelled codes should be 1185.
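The count stays at 656 because a single variable holds only one value per person, so a person with needs in several variables is still one row. As a sketch of an alternative that leaves the data in its current shape, the occurrences of one code can be counted across the variables with a loop. The toy data and the local n_pov are hypothetical and only illustrate the idea.

```stata
* Toy data (hypothetical): 4 people, 3 need variables instead of 12
clear
input str5(microvoce1 microvoce2 microvoce3)
"POV"   "POV"   ""
"POV01" "POV99" "CAS"
"POV"   ""      ""
"POV02" "POV"   "POV99"
end

* Add up how often "POV" appears in each need variable
local n_pov = 0
foreach v of varlist microvoce1 microvoce2 microvoce3 {
    quietly count if `v' == "POV"
    local n_pov = `n_pov' + r(N)
}
display "POV appears `n_pov' times across the need variables"
```

On the real data the varlist would name all 12 string variables, and the loop would be repeated (or nested) for each code.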
How can I merge these 12 variables into a new one?
I hope I made myself clear. Forgive the lexicon, but I'm new at this.
I remain available for any clarification.
Thanks in advance,
Massimiliano