Hello,
I am trying to produce a bivariate density contour plot of fathers' and their sons' incomes. I want to display contour lines that enclose the innermost 95, 90, 75, 50, and 25 percent of the marginal distributions. An example of my data is shown below:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int cage float(chour_wage cincom_a) int age float(hour_wage incom_a)
 6         .  7.212492 11         .  8.800091
 4         .         . 10         .  7.767715
 4         .  7.096818  8  4.247268  8.301566
 4         .  6.221349  8  4.662684   8.35622
 6  5.064219  8.718682 11         .  8.904714
 5         .         . 11         .  9.126666
 5         .  8.295078 11         .  8.868837
 6  5.506705  8.652475 10         .  8.225165
 6         .         . 13         .  8.651724
 4         .         .  9 4.5149555   8.82954
 5         .         . 12         .  8.760176
 4         .         . 11 4.6266828  9.257171
 4         .         .  9  5.899766  9.591916
 4         .         . 10         .  8.358793
 4 2.1330256  5.939688 10         .  8.462613
 5         .         . 11         .  8.740336
 6  4.769462  8.620869 12         . 8.6872635
 4  3.443087  7.537786  8  3.828916  9.174337
 4         .         . 11 4.6119113  8.796979
 4         .         . 12         .         .
 6 4.1700788  8.051225  9 4.0522957  7.879379
 5 4.2714095  8.519905 12         .  9.397124
 4         .         . 11  4.350053  9.331104
 7         .  7.406225 14         .  8.708834
 4         .         .  9  4.480106  8.538177
 4         .  5.903953 10  5.299745  8.855093
 4         .  7.192451 12         .  9.076223
 9  5.844003   9.49366 14         .  6.615362
 4         .         . 11      4.85  8.838325
 6         .         . 11         .   8.72442
 4         .         .  9  4.883366 8.6722355
 4         .         . 10         .  9.284665
 6         .  8.402228 11         .   8.71677
 6         .         .  9  5.358942  9.384294
 5         .  7.656807 11         .  8.825084
 4         .         . 10         .  8.843168
 6         .   6.83704 12  4.831026  8.485221
 7  4.805626  7.901927 12         .  8.381459
 8         .  2.904088 14 4.1092334  8.096209
 9         .         . 13         .         .
 4         .         .  9  5.691413  9.402679
 4         .  7.549126 10   5.69889  9.436436
 5         .         . 12         .  8.956568
 4         .         . 11         . 8.6546545
 5         .         . 10         .  8.685721
 4         .         .  9  5.848364  9.711116
 4         .         .  9  5.524533  9.720165
 6         .         . 12         .  8.857048
 4         .         . 10   5.50678  9.239161
 4         .         . 11  5.623512  9.302464
 4 3.9960136  7.908037 10         .  9.298968
 4         .         .  8  5.209748  8.729355
 7 4.3242292  8.236253 13         .  9.090649
 5         .         .  9  5.301348  9.136825
 4         .         . 11         .  8.591593
 5         .         . 10  4.512572  8.920359
 4  3.903019  7.232786  8         .   8.60528
 5  3.952956  7.508304 12  3.963319  7.887486
 5 4.1558967   8.06792  9 4.6119113  9.315055
 5         .         . 10  4.889866  8.459979
 4         .         . 10         .  8.583259
 5         .         . 11  6.492935  10.16804
 4         .  6.409692  9         .  8.111072
 5  3.364879  7.830787 13         .  8.675574
 4         .         . 13         . 8.6175995
 4         .  5.010635 10 4.3231745  8.798184
 7         .  6.152733 14         .  8.339564
 4         .         . 10 4.5135283  8.639608
 5 3.4301834  7.203483 10  4.020402  8.484205
 6         .   6.59297 12 4.0716953  8.889424
 6         .         . 12  5.473197  9.283535
 5         .         . 10  2.567618  6.998435
 6 4.6941495  8.304455 11         .  9.062745
 4         .         .  9  4.926992  8.887602
10         .  8.016944 14         .  6.620882
 6 4.1419077  8.019788 13  4.789593  8.829519
 7  4.069587  8.310491 14         .         .
 4         .         . 11  5.058198  8.830907
 5         .         .  9  5.234525  9.146548
 5         .         . 11   4.78181  9.088917
 5         .         . 10  4.290385   8.43055
 7  4.582521  9.248757 14  4.495179 8.7043705
 6         .  7.963516 10  5.206676  8.832919
 9         .         . 14         .  8.598786
 4         .         . 14         .    9.0645
 4         .         .  9  5.835284  9.541452
 8  5.377824  9.080057 13         .  8.650215
 4  2.538491  6.815157  9         .  9.120295
 7         .  8.165836 13         .  8.934477
 6  5.985106 10.340588 11         .  8.657932
 4         .         . 10 4.2352524  8.363701
 5         .   8.93542 13 3.4547815  6.549454
 7  5.914371  9.425598 12         .  8.942617
10         . 9.0603895 14         .  8.649444
 4         .         .  9   6.27616 10.223534
 4         .         . 11         .  9.738289
 5  3.753618  8.309423 10  5.419769   9.14382
 6  4.325815  8.351167 10  5.268691  9.334165
 4         .         .  9  5.161271    9.3525
 6 3.4793975  8.119007 12         .  8.789471
end
I came across -bidensity- (from http://fmwww.bc.edu/RePEc/bocode/b), which seems to perform exactly this task. However, when I run it, the command saves the density estimates to a separate file, and it appears to assign zero density to at least half of the rows in that file.
Code:
. drop if (incom_a==. | cincom_a==.)
(6,266 observations deleted)

. display _N
9190

. bidensity cincom_a incom_a, levels(7) saving(bidensity) replace
file bidensity.dta saved

. use bidensity.dta, clear


. sum _d,detail

                             _d
-------------------------------------------------------------
      Percentiles      Smallest
 1%            0              0
 5%            0              0
10%            0              0       Obs               2,500
25%            0              0       Sum of Wgt.       2,500

50%            0                      Mean           .0045984
                        Largest       Std. Dev.      .0158344
75%     .0010251       .1549294
90%     .0103232       .1617224       Variance       .0002507
95%     .0266937       .1621767       Skewness       5.911845
99%     .0865262       .2124513       Kurtosis       46.87279
My intent was to find the bivariate density values that correspond to these percentages and pass them as cutpoints (ccuts()) for the contour lines, but I am left with too many zeros. In addition, the command produces density estimates for only 2,500 rows, although I have 9,190 observations.
My confusion may stem from my limited understanding of kernel density estimation. At this point I am lost in the documentation, and even the slightest guidance would be immensely helpful.
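
To make the intended calculation concrete, here is a minimal sketch of how such cutpoints might be derived. It assumes the saved file holds one row per point of an evaluation grid, with the density in _d; the 2,500 rows would be consistent with a 50 x 50 grid, but that is an assumption, not something confirmed from the documentation. A synthetic bivariate normal grid stands in for bidensity.dta so that the example runs on its own, and the names x, y, and cum are hypothetical.

```stata
* Stand-in for -use bidensity.dta, clear-: a 50 x 50 grid (assumed layout)
clear
set obs 2500
generate double x  = -4 + 8 * mod(_n - 1, 50) / 49
generate double y  = -4 + 8 * floor((_n - 1) / 50) / 49
generate double _d = normalden(x) * normalden(y)

* Sort grid points from highest to lowest density and accumulate
* their share of the total density over the grid
gsort -_d
generate double cum = sum(_d)
replace cum = cum / cum[_N]

* For each target share, the cutoff is the smallest density among the
* highest-density grid points whose cumulative share stays within it
foreach p of numlist 25 50 75 90 95 {
    quietly summarize _d if cum <= `p' / 100
    display "density cutoff enclosing `p'% of the mass: " r(min)
}
```

On an evenly spaced grid, each point represents the same area, so the share of summed density approximates the share of probability mass. The displayed values, listed in ascending order, are the kind of numbers that could then be supplied to ccuts() in twoway contour or twoway contourline.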


The graph I want to produce looks something like this:
[figure not shown]
Source: Björklund, Anders & Jäntti, Markus. (2012). Intergenerational Income Mobility and the Role of Family Background. Oxford Handbook of Economic Inequality. 10.1093/oxfordhb/9780199606061.013.0020.

Many thanks for your help!