Assessing differences between ordinal ranks based on frequencies

Hi Stata users!

I am using Stata 14.1. I am trying to assess whether the ordinal ranking of the most frequently diagnosed diseases differs between men and women, and across seasons. In my dataex example below, there is an id variable (srno), a binary sex variable (sex2: 0 = men; 1 = women), a categorical season variable (seas: 1 = winter; 2 = pre-monsoon; 3 = southwest monsoon; 4 = post-monsoon), and a string diagnosis variable (diag), which records the condition diagnosed for each individual. I hand-picked the dataex observations so that all four seasons are represented.

[CODE]
* Example generated by -dataex-. To install: ssc install dataex
clear
input int srno byte sex2 float seas str40 diag
4734 0 1 "Gastritis"
4735 1 1 "Upper Respiratory Tract Infection (URTI)"
4736 0 1 "Cough"
4737 1 1 "Cold"
4738 1 1 "Diarrhea"
6177 0 2 "SOB"
6178 0 2 "Upper Respiratory Tract Infection (URTI)"
6179 0 2 "Upper Respiratory Tract Infection (URTI)"
6180 0 2 "Cold"
6181 0 2 "Gastritis"
8089 1 3 "Gastritis"
8090 1 3 "Tinea"
8091 0 3 "Gastritis"
8092 0 3 "Fever"
8093 0 3 "Low Back Ache"
10669 0 4 "Tinea"
10670 0 4 "Icterus"
10671 0 4 "Cough"
10672 0 4 "Tinea"
end
[/CODE]

I can quickly see which diagnoses are made most frequently, by sex and by season, using the tabsort command (from the tab_chi package: ssc install tab_chi). For example, here are the five most frequent diagnoses for each sex:

tabsort diag if sex2==0

RevisedDiagnosis          Freq.   Percent     Cum.

Cough                     1,477     12.05    12.05
Musculoskeletal Pain      1,327     10.83    22.88
Road Traffic Accident     1,023      8.35    31.23
Tinea                       981      8.00    39.23
Cold                        795      6.49    45.72

tabsort diag if sex2==1

RevisedDiagnosis          Freq.   Percent     Cum.

Musculoskeletal Pain        530     12.10    12.10
Cough                       490     11.19    23.29
Cold                        349      7.97    31.26
Fever                       322      7.35    38.62
Gastritis                   279      6.37    44.99

Now, this is both a statistics question and a Stata question (I apologise for the former). I believe I should use the Wilcoxon-Mann-Whitney test (for sex) or the Kruskal-Wallis test (for season) to see whether the rankings differ between groups, but I don't understand what form the data should take to make these tests possible. Following other Statalist posts, I have created two new variables that hold the rank of each diagnosis for males and for females, but I don't see how a single variable could contain the information these tests need. I may well be making an error in my choice of test, my choice of dependent variable, or something else. I appreciate any help!
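For concreteness, here is a minimal sketch of how the two per-sex rank variables can be built, using only the sex2 and diag columns from the dataex example. The names n, n0, n1, rank_men and rank_women are illustrative choices for this sketch, and this is just one way to do it, not necessarily how the posts I followed did it.

```stata
* Sketch only: per-sex frequency ranks of each diagnosis, one row per diagnosis.
* n, n0, n1, rank_men and rank_women are illustrative names.
clear
input byte sex2 str40 diag
0 "Gastritis"
1 "Upper Respiratory Tract Infection (URTI)"
0 "Cough"
1 "Cold"
1 "Diarrhea"
0 "SOB"
0 "Upper Respiratory Tract Infection (URTI)"
0 "Upper Respiratory Tract Infection (URTI)"
0 "Cold"
0 "Gastritis"
1 "Gastritis"
1 "Tinea"
0 "Gastritis"
0 "Fever"
0 "Low Back Ache"
0 "Tinea"
0 "Icterus"
0 "Cough"
0 "Tinea"
end

contract diag sex2, freq(n)          // one row per diagnosis-sex pair, n = count
reshape wide n, i(diag) j(sex2)      // n0 = count in men, n1 = count in women
foreach v of varlist n0 n1 {
    replace `v' = 0 if missing(`v')  // diagnosis not observed for that sex
}
egen rank_men   = rank(-n0)          // 1 = most frequent; ties share the mean rank
egen rank_women = rank(-n1)
list diag n0 n1 rank_men rank_women, noobs
```

This leaves one row per diagnosis with two rank columns. By contrast, ranksum and kwallis both use the syntax varname, by(groupvar), that is, a single numeric response with one row per observation plus a grouping variable. That mismatch is exactly what I am unsure how to resolve.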

Kind regards,

Harry
