Hi

I am working on a simple data set transformation that I have not found a solution to after searching and testing for quite some time.

I have a data set with multiple values in some cells for three variables.

Aim: I want to transform the data set to only have one value in each cell so that I can tabulate data and create summary statistics.

Problem: When I reshape the data (following the procedure in #4 here), the frequencies of the other variables increase, which makes any statistics on them useless. The problem seems to be that the same values are counted multiple times within the id variable. I tried -xtset- on id, but it did not help. Here are my steps:

1. Data on a variable not included in the reshape command, before reshaping. (The frequency is correct.)
Code:
. fre A1

A1 -- 18. (1/0/NA)
-----------------------------------------------------------
              |      Freq.    Percent      Valid       Cum.
--------------+--------------------------------------------
Valid   0     |          4       2.12       2.13       2.13
        1     |        184      97.35      97.87     100.00
        Total |        188      99.47     100.00           
Missing .     |          1       0.53                      
Total         |        189     100.00                      
-----------------------------------------------------------
2. Reshape, because -m_cat- has multiple values in the same cell:
Code:
. split m_cat, p(",")
variables created as string: 
m_cat1  m_cat2

. drop m_cat

. reshape long m_cat, i(id) j(m_cat_no)
(note: j = 1 2)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                      189   ->     378
Number of variables                  43   ->      43
j variable (2 values)                     ->   m_cat_no
xij variables:
                          m_cat1 m_cat2   ->   m_cat
-----------------------------------------------------------------------------

. drop if m_cat ==""
(180 observations deleted)
3. Data on the same variable as in #1, after the reshape:
Code:
. fre A1

A1 -- 18. (1/0/NA)
-----------------------------------------------------------
              |      Freq.    Percent      Valid       Cum.
--------------+--------------------------------------------
Valid   0     |          4       2.02       2.03       2.03
        1     |        193      97.47      97.97     100.00
        Total |        197      99.49     100.00           
Missing .     |          1       0.51                      
Total         |        198     100.00                      
-----------------------------------------------------------
The frequencies for the variable I reshaped are correct, but those for the other variables are not. This is also only a suboptimal solution for reshaping one variable, and I have two more variables that need to be reshaped.
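
For reference, the numbers above fit together as follows: 378 - 180 = 198 rows remain after the drop, which is 9 more than the original 189, so 9 ids have two categories and each of them now carries its A1 value twice (184 + 9 = 193). Below is a minimal sketch of one possible workaround, not a definitive solution. It assumes that id uniquely identifies the rows of the original data; the toy data and the variable name one_per_id are made up for illustration. The idea is to flag a single row per id with -egen, tag()- and to tabulate id-level variables on the flagged rows only.

```stata
* Toy data (hypothetical): 3 ids, an id-level variable A1,
* and m_cat holding up to two comma-separated values per cell
clear
input id A1 str10 m_cat
1 1 "a,b"
2 1 "a"
3 0 "c"
end

* Same steps as in #2 above
split m_cat, p(",")
drop m_cat
reshape long m_cat, i(id) j(m_cat_no)
drop if m_cat == ""

* Flag exactly one row per id (one_per_id is a made-up name)
egen byte one_per_id = tag(id)

* id-level variables: use flagged rows only, so each id counts once
tabulate A1 if one_per_id

* The reshaped variable: use all rows, one per id-category pair
tabulate m_cat
```

In this sketch the first table counts each id once, while the second counts every id-category pair. Whether the same flag still works once the two other multi-value variables are reshaped as well is something I have not checked.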

Hopefully some of you can provide input on this issue.