Hi. I just joined Statalist, and this is my first question as a beginner.

I'm worried that this may duplicate an earlier question, but my question is specific to my own data, so I would appreciate your understanding.

I'm analyzing risk factors for the development of proteinuria.

The sample size is 319,457.

To summarize and compare variables across categories (BMI categories or presence of proteinuria), I need to know whether each variable follows a normal distribution.

(This is because I plan to use the independent t-test or ANOVA for normally distributed continuous variables, and the Kruskal-Wallis test for non-normally distributed continuous variables.)

So I ran sktest to test for normality.

                    Skewness/Kurtosis tests for Normality
                                                          ------ joint ------
    Variable |        Obs  Pr(Skewness)  Pr(Kurtosis) adj chi2(2)   Prob>chi2
-------------+---------------------------------------------------------------
         wbc |    317,927     0.0000        0.0000            .           .
         bmi |    317,927     0.0000        0.0000            .           .
         sbp |    317,927     0.0000        0.0000            .           .
      height |    317,927     0.0000        0.0000            .      0.0000


Q1) Is it right to interpret this result as showing that these variables do not follow a normal distribution?

Q2) Why is Prob>chi2 not reported for wbc, bmi, and sbp, while 0.0000 is reported for height?



When I ran summarize height, detail, the skewness was 0.05 and the kurtosis was 3.21.

And when I draw a histogram of height, it looks like this:

[Histogram of height; image not available]
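
For reference, here is a minimal sketch of the kind of commands behind the checks described above. It assumes the analysis dataset is already in memory and that the variable is named height, as in the sktest output. The normal-density overlay and the quantile-normal plot are additions for illustration, not something reported above.

```stata
* Assumes the analysis dataset is already loaded and contains height

* Summary statistics, including skewness and kurtosis
summarize height, detail

* Histogram with a normal density curve overlaid for visual comparison
histogram height, normal

* Quantile-normal plot: points close to the diagonal line suggest
* that the distribution is approximately normal
qnorm height
```

With a sample this large, the plots may be more informative than a p-value about how far the distribution departs from normal.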

Q3) Even if not strictly, can I treat this variable as normally distributed?

I suspect the sample size is so large that sktest rejects normality even for small departures from it.

Q4) Is there a more appropriate normality test for data with such a large sample?
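
To illustrate the suspicion above, here is a small simulation sketch. It is hypothetical and does not use my data: the variable x, the seed, and the choice of a t distribution with 30 degrees of freedom are all assumptions made for illustration. That distribution has a theoretical kurtosis of about 3.23 and looks very close to normal in a histogram.

```stata
* Simulated data only; nothing here comes from the real dataset
clear
set seed 12345
set obs 300000

* Hypothetical variable: Student's t with 30 df
* (theoretical skewness 0, kurtosis 3 + 6/(30-4), about 3.23)
generate double x = rt(30)

* Skewness and kurtosis of the simulated variable
summarize x, detail

* Skewness/kurtosis normality test on the simulated variable
sktest x
```

If sktest rejects normality here as well, that would be consistent with the idea that at this sample size the test detects departures too small to matter in practice.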

Lastly, I learned about the Central Limit Theorem (as I understand it, when the sample size is large enough, we can treat the data as following a normal distribution).

Q5) Can I apply this theorem to my analysis? That is, can I report means instead of medians and use ANOVA instead of the Kruskal-Wallis test?
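
To make the choice concrete, here is a sketch of the candidate commands. The grouping variable names bmicat (BMI category) and proteinuria (0/1) are hypothetical, since I have not given the real names above, and height stands in for any of the continuous variables.

```stata
* Assumes the analysis dataset is in memory.
* bmicat and proteinuria are hypothetical names for the grouping variables.

* Means, SDs, medians, and interquartile ranges by BMI category
tabstat height, by(bmicat) statistics(mean sd median iqr)

* Two groups: independent-samples t test
ttest height, by(proteinuria)

* More than two groups: one-way ANOVA
oneway height bmicat, tabulate

* More than two groups: Kruskal-Wallis rank test
kwallis height, by(bmicat)
```

Running both the parametric and the rank-based version on the same variable would show whether the choice changes the conclusion.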