Hi. I just joined Statalist, and this is my first question as a beginner.
I'm worried that this duplicates earlier questions, but my situation is specific, so I would appreciate your patience.
I'm analyzing risk factors for the development of proteinuria.
Sample size is 319,457.
To tabulate and compare variables across categories (BMI category or presence of proteinuria),
I need to know whether each variable follows a normal distribution.
(I plan to use an independent t-test or ANOVA for normally distributed continuous variables, and the Kruskal-Wallis test for continuous variables that are not normally distributed.)
So I ran sktest to test for normality.
Skewness/Kurtosis tests for Normality
                                                        ------ joint ------
    Variable |      Obs  Pr(Skewness)  Pr(Kurtosis)  adj chi2(2)  Prob>chi2
-------------+-------------------------------------------------------------
         wbc |  317,927        0.0000        0.0000            .          .
         bmi |  317,927        0.0000        0.0000            .          .
         sbp |  317,927        0.0000        0.0000            .          .
      height |  317,927        0.0000        0.0000            .     0.0000
Q1) Is it right to interpret this result as meaning that these variables do not follow a normal distribution?
Q2) Why is Prob>chi2 missing for wbc, bmi, and sbp, while height reports 0.0000?
When I ran sum, detail for height, its skewness was 0.05 and its kurtosis was 3.21.
And when I drew a histogram, it looked like the one below.
[Image: histogram of height]
Q3) Speaking loosely, can I treat this variable as normally distributed?
I suspect the sample size is so large that sktest rejects normality even for very small departures from it.
Q4) Is there a more appropriate normality test for data with such a large sample size?
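To make the sample-size concern behind Q3 and Q4 concrete, here is a rough back-of-the-envelope sketch in Python. It uses the textbook large-sample approximations SE(skewness) ≈ sqrt(6/n) and SE(kurtosis) ≈ sqrt(24/n), which hold under normality. These are an assumption chosen for illustration and are not the transformations sktest itself applies. The function name `approx_z_scores` and the comparison sample size of 300 are hypothetical.

```python
import math

def approx_z_scores(n, skewness, kurtosis):
    """Approximate z-scores for skewness and excess kurtosis.

    Uses the large-sample approximations SE(skew) ~ sqrt(6/n) and
    SE(kurt) ~ sqrt(24/n); this is NOT the exact procedure used by sktest.
    """
    se_skew = math.sqrt(6 / n)
    se_kurt = math.sqrt(24 / n)
    z_skew = skewness / se_skew
    z_kurt = (kurtosis - 3) / se_kurt  # a normal distribution has kurtosis 3
    return z_skew, z_kurt

# Values reported above for height: skewness 0.05, kurtosis 3.21, Obs 317,927.
z_skew_big, z_kurt_big = approx_z_scores(317_927, 0.05, 3.21)

# Same shape statistics, but with a hypothetical sample of only 300.
z_skew_small, z_kurt_small = approx_z_scores(300, 0.05, 3.21)

print(f"n = 317,927: z_skew = {z_skew_big:.2f}, z_kurt = {z_kurt_big:.2f}")
print(f"n = 300:     z_skew = {z_skew_small:.2f}, z_kurt = {z_kurt_small:.2f}")
```

Under these approximations, the same skewness and kurtosis give z-scores of roughly 11.5 and 24.2 at n = 317,927, far beyond 1.96, but z-scores below 1 at n = 300. So the rejection for height appears to reflect the power of the test at this sample size rather than a large departure from normality.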
And lastly:
I learned about the Central Limit Theorem. As I understand it, when the sample size is large enough, we can assume the data follow a normal distribution.
Q5) Can I apply this theorem to my analysis? That is, can I report means instead of medians and use ANOVA instead of the Kruskal-Wallis test?
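As a hedged illustration for Q5: the Central Limit Theorem concerns the distribution of the sample mean, not of the raw observations. The simulation below is a minimal sketch, assuming an exponential distribution as a hypothetical stand-in for a right-skewed measurement (it is not the data described in this post). The helper `skewness`, the group size of 400, and the number of replications are likewise assumptions made for the example.

```python
import random

def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(12345)  # fixed seed so the run is reproducible

# Hypothetical right-skewed raw data (exponential); NOT the poster's data.
raw = [rng.expovariate(1.0) for _ in range(20_000)]

# Means of many repeated samples drawn from the same skewed distribution.
group_size = 400
replications = 2_000
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(group_size)) / group_size
    for _ in range(replications)
]

skew_raw = skewness(raw)
skew_means = skewness(sample_means)

print(f"skewness of raw values:   {skew_raw:.2f}")
print(f"skewness of sample means: {skew_means:.2f}")
```

In this sketch the raw values remain clearly skewed however many are drawn, while the means of repeated samples are close to symmetric. Whether that is enough to justify reporting means and using ANOVA for these particular variables is exactly what Q5 asks.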