Hi. I just joined Statalist, and this is my first question as a beginner.
I'm worried that this duplicates earlier questions, but I'd like to ask about my specific situation. I would appreciate your understanding.
I'm analyzing risk factors for the development of proteinuria.
Sample size is 319,457.
To tabulate and compare variables across categories (BMI categories or presence of proteinuria),
I need to know whether each variable follows a normal distribution.
(This is because I plan to use the independent t-test or ANOVA for normally distributed continuous variables, and the Kruskal-Wallis test for non-normally distributed continuous variables.)
So I ran sktest to test for normality.
                    Skewness/Kurtosis tests for Normality
                                                          ------ joint ------
    Variable |        Obs  Pr(Skewness)  Pr(Kurtosis) adj chi2(2)   Prob>chi2
-------------+---------------------------------------------------------------
         wbc |    317,927     0.0000        0.0000         .             .
         bmi |    317,927     0.0000        0.0000         .             .
         sbp |    317,927     0.0000        0.0000         .             .
      height |    317,927     0.0000        0.0000         .        0.0000
Q1) Is it correct to interpret this result as meaning that these variables do not follow a normal distribution?
Q2) Why is Prob>chi2 missing for wbc, bmi, and sbp, but reported as 0.0000 for height?
When I ran sum height, detail, the skewness was 0.05 and the kurtosis was 3.21.
I also drew a histogram of height.
[Histogram of height; image not available]
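A minimal sketch of commands that produce this kind of summary and plot, assuming the variable is named height as in the sktest output above. The normal-density overlay and the qnorm plot are additions for illustration, not something reported in this post.

```stata
* Summary statistics, including skewness and kurtosis
summarize height, detail

* Histogram with a normal density curve overlaid for comparison
histogram height, normal

* Quantile-normal plot as a further visual check (an assumed extra step)
qnorm height
```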
Q3) Even if not strictly, can I treat this variable as normally distributed?
I suspect the sample is so large that sktest rejects normality even though the departure from normality is small.
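A rough back-of-the-envelope check of this suspicion, as a sketch only. It uses the Jarque-Bera statistic, n/6 * (S^2 + (K - 3)^2/4), which is a simpler relative of what sktest computes and is not the sktest formula. Under normality it is approximately chi-squared with 2 degrees of freedom. The skewness (0.05) and kurtosis (3.21) are the values reported for height; the comparison sample size of 500 is an arbitrary assumption for illustration.

```stata
* Jarque-Bera statistic as a stand-in for sktest (an assumption, not the sktest formula)
* S = skewness and K = kurtosis, as reported by summarize, detail
local S 0.05
local K 3.21

* With the actual number of observations
scalar jb_big = 317927/6 * (`S'^2 + (`K' - 3)^2/4)
display "JB, n = 317,927: " scalar(jb_big) "   p = " chi2tail(2, scalar(jb_big))

* With a hypothetical sample of 500 having the same skewness and kurtosis
scalar jb_small = 500/6 * (`S'^2 + (`K' - 3)^2/4)
display "JB, n = 500:     " scalar(jb_small) "   p = " chi2tail(2, scalar(jb_small))
```

Worked by hand, the statistic is about 717 at the actual sample size but only about 1.13 at n = 500 (p roughly 0.57), so the same distribution shape would be rejected in one case and not in the other.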
Q4) Is there a normality test that is appropriate for such a large sample?
And finally:
I learned about the Central Limit Theorem (as I understand it, when the sample size is large enough, we can assume the data follow a normal distribution).
Q5) Can I apply this theorem to my analysis? That is, can I report means instead of medians and use ANOVA instead of the Kruskal-Wallis test?
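For concreteness, a sketch of the two approaches being compared in Q5. The grouping variable names bmicat and proteinuria are hypothetical placeholders for the BMI-category and proteinuria indicators, not names taken from the data shown; sbp is from the sktest output above.

```stata
* Parametric: compare means
* Two groups (hypothetical 0/1 variable proteinuria)
ttest sbp, by(proteinuria)
* One-way ANOVA across BMI categories (hypothetical variable bmicat)
oneway sbp bmicat, tabulate

* Nonparametric: Kruskal-Wallis across BMI categories
kwallis sbp, by(bmicat)

* Means and medians side by side, for the descriptive table
tabstat sbp, by(bmicat) statistics(mean sd median iqr)
```

Running both versions on the same variable shows whether the choice between mean-based and rank-based tests changes the conclusion in practice.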