I am working on a simple data set transformation that I have not found a solution to after searching and testing for quite some time.
I have a data set with multiple values in some cells for three variables.
Aim: I want to transform the data set to only have one value in each cell so that I can tabulate data and create summary statistics.
Problem: When I reshape the data (following the procedure in #4 here), the frequency of observations on other variables increases, making any statistics useless. The problem seems to be that the same values are counted multiple times within the id variable. I tried -xtset- on id but it did not help. Here are my steps:
1. Data on a variable not included in the reshape command, before reshaping. (The frequency is correct.)
Code:
. fre A1

A1 -- 18. (1/0/NA)
-----------------------------------------------------------
              |      Freq.    Percent      Valid       Cum.
--------------+--------------------------------------------
Valid   0     |          4       2.12       2.13       2.13
        1     |        184      97.35      97.87     100.00
        Total |        188      99.47     100.00
Missing .     |          1       0.53
Total         |        189     100.00
-----------------------------------------------------------
Code:
. split m_cat, p(",")
variables created as string:
m_cat1 m_cat2
. drop m_cat
. reshape long m_cat, i(id) j(m_cat_no)
(note: j = 1 2)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                      189   ->     378
Number of variables                  43   ->      43
j variable (2 values)                     ->   m_cat_no
xij variables:
                          m_cat1 m_cat2   ->   m_cat
-----------------------------------------------------------------------------
. drop if m_cat == ""
(180 observations deleted)
Code:
. fre A1

A1 -- 18. (1/0/NA)
-----------------------------------------------------------
              |      Freq.    Percent      Valid       Cum.
--------------+--------------------------------------------
Valid   0     |          4       2.02       2.03       2.03
        1     |        193      97.47      97.97     100.00
        Total |        197      99.49     100.00
Missing .     |          1       0.51
Total         |        198     100.00
-----------------------------------------------------------
Hopefully some of you can provide input on this issue.
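For reference, the pattern can be reproduced on toy data with the minimal sketch below. All values are made up (3 ids instead of 189), and -tabulate- stands in for the user-written -fre- so that nothing needs to be installed. The last two commands are only one possible check, not a confirmed fix: they assume A1 is constant within id and use -egen-'s tag() function to flag a single row per id (the variable name one_per_id is hypothetical).

```stata
* Toy data (hypothetical values): 3 ids, one of which has two categories
clear
input id str10 m_cat A1
1 "a,b" 1
2 "a"   1
3 "c"   0
end

* Before reshaping: one row per id, so A1 is counted once per id
tabulate A1

* Same steps as above: split the multi-valued cell, then reshape to long
split m_cat, p(",")
drop m_cat
reshape long m_cat, i(id) j(m_cat_no)
drop if m_cat == ""

* After reshaping: id 1 has two rows, so its A1 value is counted twice
tabulate A1

* Possible check (assumes A1 is constant within id):
* flag exactly one row per id and tabulate only the flagged rows
egen byte one_per_id = tag(id)
tabulate A1 if one_per_id
```

In the toy data the count of A1 == 1 rises from 2 to 3 after reshaping and returns to 2 when restricted to the flagged rows, which mirrors the 184 versus 193 difference in the real output.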