I want to estimate my statistical power for two groups with clustered data. When I use the power command I get a very different answer then when I run my own simulations (that try to mimic the same analysis). Obviously I'm doing something wrong (or making a conceptual mistake), and I'm trying to figure out why.

Here's the basic set up. I'm assuming a total sample of 100 clusters (50 in each group), with 25 observations per cluster. The degree of clustering (intraclass correlation) is 0.16. The sample std deviation is 1 and the difference between groups is 0.2.

Here are the results when I use Stata's power twomeans command:

Code:
. power twomeans 0 .2, cluster k1(50) k2(50) m1(25) m2(25) rho(0.16) sd(1)

Estimated power for a two-sample means test
Cluster randomized design, z test assuming sd1 = sd2 = sd
Ho: m2 = m1  versus  Ha: m2 != m1

Study parameters:

        alpha =    0.0500
        delta =    0.2000
           m1 =    0.0000
           m2 =    0.2000
           sd =    1.0000

Cluster design:

           K1 =        50
           K2 =        50
           M1 =        25
           M2 =        25
           N1 =     1,250
           N2 =     1,250
          rho =    0.1600

Estimated power:

        power =    0.6228
So my power (for an alpha of .05) is just over 62%. Now here's me trying to simulate the same analysis:

Code:
. program define myprog, rclass
  1.         clear
  2.         set obs 100 // number of clusters
  3.         gen cluster_num = _n
  4.         expand 25 // number of observations per cluster
  5.         local rho = 0.16 // intraclass correlation = .16
  6.         local sd_u = sqrt(`rho')
  7.         local sd_e = sqrt(1-`rho')
  8.         by cluster_num, sort: gen trial_num = _n
  9.         gen x = runiformint(0,1) // assigning clusters to one of two groups
 10.         by cluster_num (trial_num), sort: gen u = rnormal(0, `sd_u') if _n == 1
 11.         by cluster_num (trial_num): replace u = u[1]
 12.         gen e = rnormal(0, `sd_e')
 13.         gen y = (.2 * x) + u + e // generating outcome for each grou (group1: M = 0) (group2: M = 0.2)
 14.         sum y
 15.         return scalar sd = r(sd) // recording SD
 16.         regress y x, cluster(cluster_num)
 17.         lincom x
 18.         return scalar p = r(p) // recording p-value
 19. end

. simulate sd=r(sd) p=r(p), reps(10000) nodots: myprog // running simulation

      command:  myprog
           sd:  r(sd)
            p:  r(p)


. gen power = (p <= .05) // tabulating percentage of times p <= .05

.
. // checking that SD of y is approx 1
. sum sd

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
          sd |     10,000    1.004122    .0179245   .9414742   1.073392

.
. // estimating power (alpha = .05)
. sum power

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       power |     10,000       .9984      .03997          0          1
So when I run a set of simulations it suggests that my power (for an alpha of .05) is over 99%.

In sum, when I use Stata's power command I get an estimated power of about 62%, but when I try to simulate the same results I get an estimated power north of 99%. I'm trying to figure out where I'm going wrong here.