Hello,

I am trying to reproduce a result from the literature that uses Fisher's exact test to compare the distributions of two independent samples.

Here is a description of the data:

Let's call the data from one paper "sample one" and the data from another paper "sample two".
Both papers measure the same thing.
In both samples, subjects are classified into 6 types: level 0, level 1, level 2, level 3, level 4, and unidentified.
In sample one, there are 116 subjects; the proportions of the six types are 5.17%, 23.28%, 26.72%, 21.55%, 22.41%, and 0.86%, respectively.
In sample two, there are 179 subjects; the proportions of the six types are 3.91%, 14.53%, 27.93%, 21.23%, 17.32%, and 15.08%, respectively.
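
To make the data entry easier to check, the cell counts can be recovered from the reported percentages. This is a minimal sketch, assuming each percentage is a cell count divided by the sample size and rounded to two decimals; the local macro names total1 and total2 are just illustrative.

```stata
* Implied count = round(n * proportion); order: level 0 to 4, then unidentified
local total1 = 0
foreach p in 0.0517 0.2328 0.2672 0.2155 0.2241 0.0086 {
    display round(116 * `p')
    local total1 = `total1' + round(116 * `p')
}
local total2 = 0
foreach p in 0.0391 0.1453 0.2793 0.2123 0.1732 0.1508 {
    display round(179 * `p')
    local total2 = `total2' + round(179 * `p')
}
display "sample one total: `total1'"
display "sample two total: `total2'"
```

Under that assumption the counts are 6, 27, 31, 25, 26, 1 for sample one and 7, 26, 50, 38, 31, 27 for sample two, which sum to 116 and 179.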

The paper itself says "If the unidentified subjects are excluded, the Fisher's exact test comparing these two categorical distributions yields a p-value of 0.926, suggesting that they are statistically not different."

I therefore assume that Fisher's exact test rejects the null when the unidentified subjects are included, and that is what I get. However, when I exclude the unidentified subjects I still reject the null instead of obtaining the reported p-value of 0.926, so I suspect the command I am using is not right.

Here is the code I am using:
Code:
clear
set obs 179

* Sample two (n = 179), stored in jin
gen jin = 0 in 1/7
replace jin = 1 in 8/33
replace jin = 2 in 34/83
replace jin = 3 in 84/121
replace jin = 4 in 122/152
replace jin = -1 in 153/179 // unidentified
proportion jin

* Sample one (n = 116), stored in k; rows 117/179 stay missing
gen k = -1 in 1 // unidentified
replace k = 0 in 2/7
replace k = 1 in 8/34
replace k = 2 in 35/65
replace k = 3 in 66/90
replace k = 4 in 91/116
proportion k

tabulate jin k, all exact // rejects the null
tabulate jin k if jin != -1 & k != -1, all exact // also rejects

My question is: what is the right way to reproduce this result? I am also wondering whether the sample sizes matter. If I don't use the -missing- option, the table looks as if the larger sample has been truncated, and if I shuffle the data, for example, the larger sample would be truncated in a different way. So should we account for missing values when the two samples are of unequal size?
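
For context on the truncation: a two-way -tabulate- drops every observation in which either variable is missing unless -missing- is specified, so with the two samples held side by side in jin and k only rows 1/116 are cross-tabulated, and row i of one sample is paired with row i of the other. One possible alternative setup, sketched here under the assumption that the counts implied by the percentages above are 6, 27, 31, 25, 26, 1 and 7, 26, 50, 38, 31, 27, is to enter a samples-by-levels table directly with -tabi-. Whether this matches what the paper did is not known.

```stata
* Rows: sample one, sample two. Columns: level 0, 1, 2, 3, 4.
* Unidentified subjects excluded (115 and 152 subjects remain).
tabi 6 27 31 25 26 \ 7 26 50 38 31, exact

* Same layout with the unidentified column appended (116 and 179 subjects).
tabi 6 27 31 25 26 1 \ 7 26 50 38 31 27, exact
```

With this layout the row order inside each sample does not affect the table, because only the cell counts are used.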

I also tried other tests to compare the two samples, which give different results:

Code:
set obs 295
gen group = 1 in 1/179 // sample two
replace group = 0 in 180/295 // sample one
* Stack both samples into one variable: jin first, then k
gen jin_k = jin in 1/179
replace jin_k = k[_n - 179] in 180/295
ranksum jin_k, by(group) // does not reject at 5%
median jin_k, by(group) exact // does not reject
ksmirnov jin_k, by(group) exact // does not reject

Further, I just realised from this topic that level 0, level 1, level 2, level 3, and level 4 are likely to be ordered categories. (I am not actually sure; the categories in the paper are like education taking the values high school, undergraduate, and postgraduate.) If they are indeed ordered categories, is Fisher's exact test then inappropriate, and what about the other tests I have used?

Finally, I have my own data measuring the same thing, with 157 subjects. When comparing my sample to either sample one or sample two, I cannot reject the null using Fisher's exact test, but I can reject it using all the other tests (-ranksum-, -median-, -ksmirnov-, and -ttest-). It seems that these tests all give results different from Fisher's exact test or the chi-squared test, whether I compare sample one with sample two or my sample with either of them. I am really confused by these different results.

Thanks for any help!!