Hello,
I am a PhD student and have been working in Stata for only two years.
I need to compare proportions of sexual outcomes (ever had sex, condom use at first sex...) between two independent datasets, one of which is a survey dataset. I am having trouble combining the two datasets and using commands that take into account the fact that one of the samples is weighted.
The first dataset (coverte) contains data from participants with HIV. It is not a survey (N = 284).
The second dataset (BS2010) contains survey data with its own weight (N = 2,899).
I opened the BS2010 dataset first and then appended the other one: "append using coverte07022023.dta, gen (base)"
Typing "svydescribe" shows the following:
svydescribe
Survey: Describing stage 1 sampling units
pweight: RD2TOT
VCE: linearized
Single unit: missing
Strata 1: <one>
SU 1: <observations>
FPC 1: <zero>
                                   #Obs per Unit
                            ----------------------------
 Stratum   #Units     #Obs      min     mean      max
-------- -------- -------- -------- -------- --------
       1    2,899    2,899        1      1.0        1
-------- -------- -------- -------- -------- --------
       1    2,899    2,899        1      1.0        1
                       284 = #Obs with missing values in the
                  --------   survey characteristics
                     3,183
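The 284 observations flagged here match the size of the coverte sample, which suggests that they simply have no value for the weight RD2TOT after the append. A quick check could look like the sketch below. It assumes that base equals 0 for the BS2010 rows and 1 for the rows coming from coverte07022023.dta, which is how the gen() option of append codes the source.

```stata
* Count the rows that have no survey weight at all.
count if missing(RD2TOT)

* Show which source dataset those rows come from.
* Assumption: base == 0 is BS2010 (master), base == 1 is coverte (appended).
tabulate base if missing(RD2TOT)
```

If all 284 rows fall under base == 1, the svy commands are dropping the coverte sample only because its weight is missing.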
For example, I want to compare the proportions of participants reporting ever having had sex, which is the binary variable RS_ni (RS for "rapport sexuel" in French).
I first used the tabulate command, but the result does not take into account the weight in the BS2010 survey, so my understanding is that it is not correct:
. tab RS_ni base, col chi2
+-------------------+
| Key               |
|-------------------|
|     frequency     |
| column percentage |
+-------------------+

           |         base
     RS_ni |         0          1 |     Total
-----------+----------------------+----------
         0 |       302         47 |       349
           |     10.42      16.91 |     10.99
-----------+----------------------+----------
         1 |     2,592        227 |     2,819
           |     89.41      81.65 |     88.73
-----------+----------------------+----------
         3 |         5          4 |         9
           |      0.17       1.44 |      0.28
-----------+----------------------+----------
     Total |     2,899        278 |     3,177
           |    100.00     100.00 |    100.00

          Pearson chi2(2) =  25.8040   Pr = 0.000
If I use the svy prefix to estimate proportions instead, it ignores the coverte sample (number of observations: 2,899). I tried svy: logit and ran into the same issue.
svy: proportion RS_ni, over (base)
(running proportion on estimation sample)
Survey: Proportion estimation
Number of strata =       1       Number of obs   =      2,899
Number of PSUs   =   2,899       Population size = 3,311.8753
                                 Design df       =      2,898

--------------------------------------------------------------
             |             Linearized            Logit
             | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
RS_ni@base   |
         0 0 |   .1227894    .007692       .108486    .1386854
         1 0 |   .8750966   .0077625      .8590646    .8895394
         3 0 |    .002114   .0012226      .0006795    .0065567
--------------------------------------------------------------
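A possible workaround, given only as a sketch and not as a validated analysis: give every coverte observation a weight of 1 and declare the two sources as separate strata, so that the svy commands no longer drop the coverte rows for having a missing weight. The variable name wt_all is hypothetical. Whether a weight of 1 and strata(base) are appropriate for these two samples is an assumption that would need to be checked against the BS2010 design.

```stata
* Sketch only. wt_all is a hypothetical new variable name.
* Assumption: base == 1 marks the coverte rows, which carry no survey weight.
generate double wt_all = RD2TOT
replace wt_all = 1 if base == 1

* Assumption: treat the two independent samples as two strata, with
* individual observations as sampling units (as in the current svyset).
svyset _n [pweight = wt_all], strata(base)

* Both samples should now enter the estimation sample.
svy: proportion RS_ni, over(base)

* Design-based test of association between RS_ni and the source dataset.
svy: tabulate RS_ni base, column
```

With this setup the BS2010 estimates stay weighted by RD2TOT, while the coverte proportions are the plain sample proportions.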
I would be very grateful if someone could help me. I have checked the Stata Survey Data Reference Manual, but the only section on combining samples from multiple surveys did not help me.