Dear Statalisters,

I have several issues and questions concerning the use of the Heckman correction.

First things first: I am trying to estimate a labor supply curve using SOEP data from DIW Berlin.

My first step was to estimate the wage by regressing net income on the experience variables and on all the other variables that I later use in my heckman command.

Code:
reg NET_INCOME logjobbtenure EXPERIENCE_FULLTIME EXPERIENCE_PARTTIME YEARS_EDUCATION MATERNITY_LEAVE male age agesq west whitecollardummy intdivmonth NETINCOTHERS rentleasmonth unemploymentbenefitmonth DEGREE_HANDICAP

predict wageestimate
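
As far as I understand, predict without options returns the linear prediction for every observation with nonmissing regressors, not only for the observations used in the regression. A small sketch to compare the two samples, assuming it is run directly after the regress command above:

```stata
* Observations actually used to fit the wage regression
count if e(sample)

* Observations that received a predicted wage (can include people
* with missing NET_INCOME, as long as all regressors are nonmissing)
count if !missing(wageestimate)

* Compare observed and predicted values
summarize NET_INCOME wageestimate
```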

This is where the issues come in. For some reason the data record a labor supply (measured in monthly hours times ten) even for unemployed people.
Is this a problem for the results? Normally the outcome would be observed only for the employed, which is not strictly the case here.
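
To get a feeling for how large this problem is, something like the following could be run first. It is only a sketch and assumes that D_EMPLOYMENTSTAT is coded 0 = not employed and 1 = employed, which I have not verified against the SOEP codebook:

```stata
* Distribution of the selection indicator, including missing values
tabulate D_EMPLOYMENTSTAT, missing

* Non-employed observations that nevertheless report positive hours
count if D_EMPLOYMENTSTAT == 0 & laborsupply > 0 & !missing(laborsupply)

* What the reported hours look like for the non-employed
summarize laborsupply if D_EMPLOYMENTSTAT == 0, detail
```

As I read the documentation, when select() is given a dependent variable, heckman treats observations with that variable equal to 0 as not selected, even when laborsupply is nonmissing for them.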

For now I went ahead anyway, just to see whether the command would run, using the following code:

Code:
. heckman laborsupply wageestimate intdivmonth NETINCOTHERS rentleasmonth age agesq youngchildren,
>  select(D_EMPLOYMENTSTAT = wageestimate intdivmonth NETINCOTHERS rentleasmonth
>  unemploymentbenefitmonth age agesq DEGREE_HANDICAP youngchildren) twostep
note: two-step estimate of rho = 1.8198941 is being truncated to 1

Heckman selection model -- two-step estimates   Number of obs     =     48,293
(regression model with sample selection)              Selected    =     47,449
                                                      Nonselected =        844

                                                Wald chi2(7)      =    3366.08
                                                Prob > chi2       =     0.0000

------------------------------------------------------------------------------------------
                         |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
laborsupply              |
            wageestimate |    .313529    .006188    50.67   0.000     .3014007    .3256574
             intdivmonth |  -.0011759   .0037377    -0.31   0.753    -.0085016    .0061498
            NETINCOTHERS |  -.0371831   .0023763   -15.65   0.000    -.0418406   -.0325256
           rentleasmonth |  -.0167795   .0060393    -2.78   0.005    -.0286163   -.0049428
                     age |    9.35933   3.492967     2.68   0.007     2.513241    16.20542
                   agesq |  -.1707992   .0394581    -4.33   0.000    -.2481357   -.0934626
           youngchildren |  -169.0883   14.80611   -11.42   0.000    -198.1078   -140.0689
                   _cons |   943.5435   78.96384    11.95   0.000     788.7772     1098.31
-------------------------+----------------------------------------------------------------
D_EMPLOYMENTSTAT         |
            wageestimate |   .0004075   .0000221    18.46   0.000     .0003642    .0004507
             intdivmonth |  -2.58e-06   .0000116    -0.22   0.823    -.0000253    .0000201
            NETINCOTHERS |  -1.66e-06   4.72e-06    -0.35   0.725    -.0000109    7.59e-06
           rentleasmonth |  -.0000624   .0000157    -3.97   0.000    -.0000933   -.0000316
unemploymentbenefitmonth |  -.0015518    .000115   -13.50   0.000    -.0017771   -.0013265
                     age |    .143686   .0074721    19.23   0.000      .129041     .158331
                   agesq |  -.0015521   .0000881   -17.61   0.000    -.0017248   -.0013793
         DEGREE_HANDICAP |  -.0014723   .0011021    -1.34   0.182    -.0036323    .0006877
           youngchildren |  -.4121195   .0430961    -9.56   0.000    -.4965863   -.3276526
                   _cons |  -1.367453   .1416507    -9.65   0.000    -1.645083   -1.089823
-------------------------+----------------------------------------------------------------
/mills                   |
                  lambda |   898.6195   115.8124     7.76   0.000     671.6313    1125.608
-------------------------+----------------------------------------------------------------
                     rho |    1.00000
                   sigma |  898.61947
------------------------------------------------------------------------------------------

As you can see, my selection equation estimates the probability of participating in the labor market, while the second step estimates the actual labor supply.
My rho, which as I understand it is the correlation between the error terms of the two equations, is reported as 1 because Stata truncated the two-step estimate of 1.82. That does not seem right, but I could not find any clue as to why this happens.
I have also seen that researchers often estimate these models for subgroups, for example for women only, but I could not figure out why.
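
To see where a rho above 1 can come from, the two-step estimator can be rebuilt by hand. The sketch below follows my reading of the Methods and formulas section of the heckman manual entry, so please treat the formula for sigma as an assumption. The names zg, mills, delta, resid, resid2, ee, nsel, sumdelta and s_hat are made up for illustration, and D_EMPLOYMENTSTAT is assumed to be coded 0/1:

```stata
* Step 1: probit for selection, then the inverse Mills ratio
probit D_EMPLOYMENTSTAT wageestimate intdivmonth NETINCOTHERS rentleasmonth ///
    unemploymentbenefitmonth age agesq DEGREE_HANDICAP youngchildren
predict double zg, xb
generate double mills = normalden(zg) / normal(zg)
generate double delta = mills * (mills + zg)

* Step 2: outcome regression on the selected sample, augmented with mills
regress laborsupply wageestimate intdivmonth NETINCOTHERS rentleasmonth ///
    age agesq youngchildren mills if D_EMPLOYMENTSTAT == 1

* Sum of squared residuals and number of selected observations
predict double resid if e(sample), residuals
generate double resid2 = resid^2
quietly summarize resid2
scalar ee   = r(sum)
scalar nsel = r(N)

* Sum of delta over the same observations
quietly summarize delta if e(sample)
scalar sumdelta = r(sum)

* Assumed formulas: sigma^2 = (e'e + b_mills^2 * sum(delta)) / N,
*                   rho     = b_mills / sigma
scalar s_hat = sqrt((ee + _b[mills]^2 * sumdelta) / nsel)
display "sigma = " s_hat "   rho = " _b[mills] / s_hat
```

Nothing in this calculation forces the ratio to stay within [-1, 1], so a large coefficient on mills relative to the residual variance gives a rho above 1. That would be consistent with sigma being reported as equal to lambda in my output after the truncation.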

I hope you can help me and that my description of the data is sufficient.

Kindest regards,
R. Gerlitzky