I'm conducting a study in which I have 3 controls per case, matched on age, cancer stage and cancer grade. I have a few categorical and a few continuous variables I want to compare between the groups. Most of the continuous variables are not normally distributed. As I understand it, the matching makes my data non-independent and the non-normality calls for a non-parametric test, so I should use the Wilcoxon signed-rank test. You can see an example below: the indicator variable is cohort and the variable I want to compare is weight_merge. My patients are overweight, so weight is not normally distributed.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(cohort weight_merge)
1 54
0 54
1 120
1 50
1 101
1 158
0 74
0 53
1 77
1 80
1 58
1 98
1 129
0 106
0 138
1 70
1 76
1 72
1 72
0 59
0 81
1 64
1 90
0 63
1 62
1 49
1 64
1 74
0 79
1 53
1 58
1 73
1 66
1 110
0 103
0 70
1 75
1 80
1 68
1 96
1 116
0 95
1 113
1 91
1 89
1 72
1 80
1 100
1 86
1 86
1 50
1 86
1 87
1 95
1 77
1 109
1 60
1 63
1 92
0 69
1 67
0 72
1 62
1 75
0 47
1 92
0 85
0 104
1 75
1 80
0 75
1 58
1 61
1 85
1 91
0 96
1 130
1 116
1 73
0 62
1 150
1 53
0 56
0 73
1 69
1 114
1 50
1 81
0 60
1 77
1 119
1 68
1 62
1 150
1 95
1 72
1 98
1 98
1 66
1 77
end
label values cohort cohort
label def cohort 0 "Study", modify
label def cohort 1 "Control", modify
With the other "comparison of means" tests, Stata allows the option by(groupvar), but not with -signrank-. In any case, even with -ttest-, specifying by(groupvar) assumes unpaired data, according to the help section.
It seems that Stata needs paired data to be arranged as two variables, measurement_pre and measurement_post, in order to do paired comparisons (ttest measurement_pre == measurement_post) for all these types of test. In my case that is not attainable: I have 3 controls per case, I cannot tie a particular control to a particular case, and I have about 5 variables that I want to compare.
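To make the two layouts concrete, here is a minimal sketch. The values are made up purely for illustration and are not from my study. The names id, group and measurement are ones I introduce only for this example. The block starts with clear, so it should be run in a separate session from the real data.

```stata
clear
* Made-up toy values: one row per pair, two variables (wide layout)
input float(measurement_pre measurement_post)
82 79
95 99
70 64
110 103
88 90
76 71
end

* Paired tests take the two variables directly
signrank measurement_pre = measurement_post
ttest measurement_pre == measurement_post

* Unpaired tests want one outcome variable plus a grouping variable (long layout)
gen id = _n
rename (measurement_pre measurement_post) (measurement0 measurement1)
reshape long measurement, i(id) j(group)

* by() treats the two groups as independent samples
ranksum measurement, by(group)
ttest measurement, by(group)
```

My data only exist in the second (long) layout, with cohort playing the role of group, and I have no pair identifier that would let me build the first.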
Is there a user-written command for this? Or, how large an error would I introduce by instead running unpaired comparisons with ranksum weight_merge, by(cohort), even though my data are dependent due to the matching? Before matching, the groups were independent; the controls were taken from a large, completely separate set of patients.
This question was also posted on ResearchGate, and I will make sure to post any good answer there as well.
https://www.researchgate.net/post/Ho...d23584d80a347d
Best regards!
//Rasmus W Green, PhD student, Karolinska Institute, Dept. of Women's and Children's Health, Stockholm, Sweden.