I have income data for two populations (id = 1 and id = 2). I have collapsed the data into a cross-tabulation because my research assumes that everyone in the same cell has exactly the same income. The data are pasted at the end of this post. I want to calculate the income share of each decile separately for each of the two populations. I have tried pshare (by Ben Jann) and sumdist (by Stephen Jenkins); both can be installed with "ssc install ...". However, their estimates differ slightly. Is this because I am using the commands incorrectly?
For example, the income share of the top 10% in the population with id = 1 is 25.35% according to pshare but 25.31% according to sumdist. I also do not understand why the sumdist output contains an additional row with "." in the first column.
Thank you in advance!
Code:
. pshare estimate income [iw=freq], over(id) n(10) gini
(variance estimation not supported with iweights)
Percentile shares (proportion)    Number of obs   =         40
            1: id = 1
            2: id = 2
--------------------------------------------------------------
      income |      Coef.   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
1            |
        0-10 |   .0193008          .             .           .
       10-20 |    .034384          .             .           .
       20-30 |   .0535935          .             .           .
       30-40 |   .0679086          .             .           .
       40-50 |   .0808105          .             .           .
       50-60 |   .0935073          .             .           .
       60-70 |   .1067746          .             .           .
       70-80 |     .12858          .             .           .
       80-90 |   .1615961          .             .           .
      90-100 |   .2535447          .             .           .
-------------+------------------------------------------------
2            |
        0-10 |   .0844804          .             .           .
       10-20 |    .087861          .             .           .
       20-30 |   .0886734          .             .           .
       30-40 |   .0904544          .             .           .
       40-50 |   .0927363          .             .           .
       50-60 |   .0940319          .             .           .
       60-70 |   .0971719          .             .           .
       70-80 |   .1013188          .             .           .
       80-90 |   .1245144          .             .           .
      90-100 |   .1387574          .             .           .
--------------------------------------------------------------
-------------------------
             |      Gini
-------------+-----------
           1 |  .3525345
           2 |  .0837606
-------------------------
. sumdist income [aw=freq] if id==1, ngps(10)
Distributional summary statistics, 10 quantile groups
---------------------------------------------------------------------------
Quantile  |
group     |    Quantile  % of median     Share, %      L(p), %        GL(p)
----------+----------------------------------------------------------------
        1 |    15090.00        22.30         1.93         1.93      1510.64
        2 |    31181.00        46.07         3.44         5.38      4204.04
        3 |    44526.00        65.79         5.37        10.75      8401.91
        4 |    53645.00        79.27         6.78        17.52     13700.59
        5 |    67677.00       100.00         8.09        25.62     20027.01
        6 |    76289.00       112.73         9.36        34.98     27347.58
        7 |    85703.00       126.64        10.65        45.63     35676.55
        8 |   104176.00       153.93        12.87        58.51     45741.38
        9 |   131401.00       194.16        16.18        74.69     58393.76
       10 |                                 25.31       100.00     78183.37
        . |                                  0.00       100.00     78183.37
---------------------------------------------------------------------------
. sumdist income [aw=freq] if id==2, ngps(10)
Distributional summary statistics, 10 quantile groups
---------------------------------------------------------------------------
Quantile  |
group     |    Quantile  % of median     Share, %      L(p), %        GL(p)
----------+----------------------------------------------------------------
        1 |    66284.00        92.84        11.99        11.99      9144.18
        2 |    67528.00        94.59         7.39        19.37     14777.63
        3 |    67677.00        94.80         7.08        26.45     20178.05
        4 |    70314.00        98.49         8.77        35.22     26866.54
        5 |    71393.00       100.00        13.36        48.58     37056.61
        6 |    74037.00       103.70        10.45        59.03     45026.36
        7 |    74224.00       103.97         5.40        64.43     49145.43
        8 |    82858.00       116.06        11.07        75.50     57592.88
        9 |    97434.00       136.48        10.66        86.16     65721.22
       10 |                                 13.84       100.00     76281.04
        . |                                  0.00       100.00     76281.04
---------------------------------------------------------------------------
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(income freq id)
58340 46 2
63079 35 2
66056 132 2
66284 175 2
67528 230 2
67677 220 2
67773 88 2
68921 79 2
70314 100 2
70686 196 2
70696 56 2
71393 144 2
71651 123 2
72303 11 2
74037 167 2
74224 153 2
75841 188 2
82858 109 2
97434 230 2
105867 275 2
15090 276 1
17279 0 1
24071 166 1
29035 0 1
31181 110 1
32539 0 1
34466 0 1
38506 0 1
38660 122 1
39778 0 1
39786 0 1
40480 0 1
44162 0 1
44190 0 1
44421 0 1
44526 154 1
45027 0 1
47497 0 1
47751 0 1
48988 0 1
50961 0 1
51042 0 1
51800 78 1
51840 0 1
51950 0 1
52751 0 1
53645 197 1
53694 0 1
55029 0 1
57285 0 1
57428 0 1
57832 0 1
58306 0 1
58340 0 1
59029 0 1
59230 0 1
59411 34 1
59767 0 1
62126 0 1
62644 0 1
63079 0 1
63122 0 1
63521 230 1
66056 0 1
66284 0 1
67528 0 1
67677 12 1
67773 0 1
68921 0 1
70314 0 1
70686 0 1
70696 0 1
71393 0 1
71651 0 1
72303 219 1
74037 0 1
74161 0 1
74224 0 1
75841 0 1
76289 57 1
76468 0 1
78953 0 1
79146 0 1
81072 0 1
81112 0 1
82224 174 1
82370 0 1
82858 0 1
83180 0 1
83960 0 1
85703 101 1
86941 0 1
88456 0 1
88713 0 1
91106 0 1
91921 0 1
93882 0 1
96068 0 1
96173 0 1
96513 131 1
97434 0 1
98657 0 1
98719 0 1
102264 0 1
104176 145 1
104184 0 1
104435 0 1
105847 0 1
105867 0 1
105867 0 1
111956 0 1
112972 0 1
115492 87 1
116767 0 1
118917 0 1
120198 0 1
121313 0 1
121679 0 1
122757 0 1
130627 0 1
131401 189 1
133087 0 1
134584 0 1
135674 0 1
144051 0 1
145649 0 1
147470 0 1
150625 0 1
159560 45 1
205999 230 1
15090 0 2
17279 0 2
24071 0 2
29035 0 2
31181 0 2
32539 0 2
34466 0 2
38506 0 2
38660 0 2
39778 0 2
39786 0 2
40480 0 2
44162 0 2
44190 0 2
44421 0 2
44526 0 2
45027 0 2
47497 0 2
47751 0 2
48988 0 2
50961 0 2
51042 0 2
51800 0 2
51840 0 2
51950 0 2
52751 0 2
53645 0 2
53694 0 2
55029 0 2
57285 0 2
57428 0 2
57832 0 2
58306 0 2
59029 0 2
59230 0 2
59411 0 2
59767 0 2
62126 0 2
62644 0 2
63122 0 2
63521 0 2
74161 0 2
76289 0 2
76468 0 2
78953 0 2
79146 0 2
81072 0 2
81112 0 2
82224 0 2
82370 0 2
83180 0 2
83960 0 2
85703 0 2
86941 0 2
88456 0 2
88713 0 2
91106 0 2
91921 0 2
93882 0 2
96068 0 2
96173 0 2
96513 0 2
98657 0 2
98719 0 2
102264 0 2
104176 0 2
104184 0 2
104435 0 2
105847 0 2
105867 0 2
111956 0 2
112972 0 2
115492 0 2
116767 0 2
118917 0 2
120198 0 2
121313 0 2
121679 0 2
122757 0 2
130627 0 2
131401 0 2
133087 0 2
134584 0 2
135674 0 2
144051 0 2
145649 0 2
147470 0 2
150625 0 2
159560 0 2
205999 0 2
end
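For what it is worth, a hand calculation on the nonzero id = 1 cells reproduces both top-decile figures quoted above, which may hint at where the gap comes from. The sketch below is only a cross-check under my own assumptions, not a description of how either command is implemented. Variant (a) counts only the cells that fit entirely inside the top 10% of people. Variant (b) additionally takes a fraction of the boundary cell so that exactly 10% of people are counted. The scalar and variable names (totpop, totinc, cum, whole, split, and so on) are made up for illustration.

```stata
* Hand check of the top-decile income share for id == 1 (illustrative sketch).
* Note: -clear- drops whatever data are currently in memory.
clear
input double(income freq)
15090 276
24071 166
31181 110
38660 122
44526 154
51800 78
53645 197
59411 34
63521 230
67677 12
72303 219
76289 57
82224 174
85703 101
96513 131
104176 145
115492 87
131401 189
159560 45
205999 230
end

* Population size and total income
quietly summarize freq
scalar totpop = r(sum)
gen double inc_tot = income*freq
quietly summarize inc_tot
scalar totinc = r(sum)

* Walk down from the richest cell, accumulating people
gsort -income
gen double cum = sum(freq)

* (a) assumption: only cells lying wholly inside the top 10% are counted
gen double whole = freq*(cum <= 0.1*totpop)
* (b) assumption: the boundary cell is split so exactly 10% of people count
gen double split = max(0, min(freq, 0.1*totpop - (cum - freq)))

gen double y_whole = whole*income
gen double y_split = split*income
quietly summarize y_whole
scalar s_whole = r(sum)/totinc
quietly summarize y_split
scalar s_split = r(sum)/totinc

display "whole cells: " %6.4f s_whole "   split cell: " %6.4f s_split
```

With these data, (a) comes out at about 25.31% and (b) at about 25.35%. So the gap looks like a difference in how the cell straddling the 90th percentile is treated (here, the 189 people with income 131401), rather than a mistake in the weights. I would still appreciate confirmation that this is what the two commands actually do.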