Hello! I am trying to fit an interval regression because my outcome variable (wage) is recorded in 9 bins, so each observation is left-censored, right-censored, or interval-censored, and the intervals differ in width. In addition, I want to instrument one independent variable. I have been advised to use the "cmp" command to run an interval regression with IV in Stata.

This is how I generated the variables for the lower and upper bounds of the intervals (values are the $ amounts of the wage interval bounds):

Code:
. recode Outcome (1=.)(2=500)(3=600)(4=700)(5=800)(6=900)(7=1000)(8=1200)(9=15
> 00), gen(Outcome1)
(2742 differences between Outcome and Outcome1)

. 
. recode Outcome (1=500)(2=600)(3=700)(4=800)(5=900)(6=1000)(7=1200)(8=1500)(9
> =.), gen(Outcome2)
(2742 differences between Outcome and Outcome2)

Is this correct? In particular, is it right to code the lower bound of the lowest bin as missing, (1=.), or should it be (1=0), since negative wages are not possible?
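
For anyone who wants to inspect the coding, here is a small sketch of how the two bound variables could be checked. It is a do-file fragment that uses only the variables defined above, and it does not settle the (1=.) versus (1=0) question; it only shows which observations end up in which censoring category.

Code:
* Cross-tabulate the bin codes against each generated bound, including missings
tabulate Outcome Outcome1, missing
tabulate Outcome Outcome2, missing

* Wherever both bounds are present, the lower bound should be below the upper
assert Outcome1 < Outcome2 if !missing(Outcome1, Outcome2)

* Counts by censoring type, for comparison with the intreg header
count if missing(Outcome1) & !missing(Outcome2)    // lower bound missing
count if !missing(Outcome1) & missing(Outcome2)    // upper bound missing
count if !missing(Outcome1, Outcome2)              // both bounds present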

Ignoring covariates, my main variables are Outcome, Treatment and Instrument. Treatment and Instrument are both binary (0/1) variables in this specification.

My cmp code is shown below. To keep the comparison simple, I specify the first stage as continuous ("$cmp_cont") and later compare it with "reg"; the same issue arises when I specify a probit first stage in both cases:

Code:
. cmp (Treatment = Instrument) (Outcome1 Outcome2 = Treatment), indicators($cm
> p_cont $cmp_int) 

Fitting individual models as starting point for full model fit.
Note: For programming reasons, these initial estimates may deviate from your s
> pecification.
      For exact fits of each equation alone, run cmp separately on each.

      Source |       SS           df       MS      Number of obs   =     3,480
-------------+----------------------------------   F(1, 3478)      =   4295.42
       Model |  253.611604         1  253.611604   Prob > F        =    0.0000
    Residual |  205.349316     3,478  .059042356   R-squared       =    0.5526
-------------+----------------------------------   Adj R-squared   =    0.5524
       Total |   458.96092     3,479  .131923231   Root MSE        =    .24299

------------------------------------------------------------------------------
   Treatment |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  Instrument |   .6617225   .0100966    65.54   0.000     .6419267    .6815182
       _cons |   .3215259   .0089688    35.85   0.000     .3039413    .3391105
------------------------------------------------------------------------------

Interval regression                             Number of obs     =      2,691
                                                   Uncensored     =          0
                                                   Left-censored  =        352
                                                   Right-censored =         12
                                                   Interval-cens. =      2,327

                                                LR chi2(1)        =       1.39
Log likelihood = -5365.2625                     Prob > chi2       =     0.2384

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   Treatment |  -13.57253    11.5116    -1.18   0.238    -36.13485    8.989787
       _cons |   749.6888    10.5316    71.18   0.000     729.0472    770.3303
-------------+----------------------------------------------------------------
    /lnsigma |   5.373939   .0157219   341.81   0.000     5.343125    5.404753
-------------+----------------------------------------------------------------
       sigma |   215.7109   3.391395                      209.1652    222.4614
------------------------------------------------------------------------------

Fitting constant-only model for LR test of overall model fit.

Fitting full model.

Iteration 0:   log likelihood = -5381.3669  
Iteration 1:   log likelihood = -5377.7781  
Iteration 2:   log likelihood = -5377.7629  
Iteration 3:   log likelihood = -5377.7629  

Mixed-process regression                        Number of obs     =      3,480
                                                LR chi2(2)        =    2800.95
Log likelihood = -5377.7629                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Treatment    |
  Instrument |   .6617225   .0100937    65.56   0.000     .6419393    .6815057
       _cons |   .3215259   .0089662    35.86   0.000     .3039524    .3390993
-------------+----------------------------------------------------------------
Outcome1     |
   Treatment |  -29.47945   15.82494    -1.86   0.062    -60.49576    1.536867
       _cons |   763.2793   14.03154    54.40   0.000      735.778    790.7806
-------------+----------------------------------------------------------------
    /lnsig_1 |  -1.415038   .0119866  -118.05   0.000    -1.438531   -1.391544
    /lnsig_2 |   5.374463    .015747   341.30   0.000     5.343599    5.405327
/atanhrho_12 |   .0408718   .0278674     1.47   0.142    -.0137473    .0954909
-------------+----------------------------------------------------------------
       sig_1 |   .2429165   .0029117                      .2372761     .248691
       sig_2 |   215.8239   3.398576                      209.2646    222.5889
      rho_12 |   .0408491   .0278209                     -.0137464    .0952017
------------------------------------------------------------------------------

Now compare those results with the following, where I "manually" first regressed Treatment on Instrument and then ran an interval regression with "intreg" of the outcome bounds on the fitted Treatment values:

Code:
. reg Treatment Instrument, vce(robust)

Linear regression                               Number of obs     =      3,480
                                                F(1, 3478)        =    1443.35
                                                Prob > F          =     0.0000
                                                R-squared         =     0.5526
                                                Root MSE          =     .24299

------------------------------------------------------------------------------
             |               Robust
   Treatment |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  Instrument |   .6617225   .0174177    37.99   0.000     .6275726    .6958724
       _cons |   .3215259   .0172445    18.65   0.000     .2877155    .3553363
------------------------------------------------------------------------------

. predict fitted
(option xb assumed; fitted values)

. intreg Outcome1 Outcome2 fitted, vce(robust)

Fitting constant-only model:

Iteration 0:   log pseudolikelihood = -5491.1709  
Iteration 1:   log pseudolikelihood = -5464.7364  
Iteration 2:   log pseudolikelihood =  -5464.683  
Iteration 3:   log pseudolikelihood =  -5464.683  

Fitting full model:

Iteration 0:   log pseudolikelihood = -5489.4157  
Iteration 1:   log pseudolikelihood = -5462.9742  
Iteration 2:   log pseudolikelihood = -5462.9213  
Iteration 3:   log pseudolikelihood = -5462.9213  

Interval regression                             Number of obs     =      2,742
                                                   Uncensored     =          0
                                                   Left-censored  =        357
                                                   Right-censored =         12
                                                   Interval-cens. =      2,373

                                                Wald chi2(1)      =       3.53
Log pseudolikelihood = -5462.9213               Prob > chi2       =     0.0603

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      fitted |  -28.91668   15.39157    -1.88   0.060     -59.0836    1.250254
       _cons |   762.8507   13.55528    56.28   0.000     736.2829    789.4186
-------------+----------------------------------------------------------------
    /lnsigma |   5.371954   .0190345   282.22   0.000     5.334647    5.409261
-------------+----------------------------------------------------------------
       sigma |   215.2831   4.097817                      207.3995    223.4664
------------------------------------------------------------------------------
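
As you can see, the first-stage regression of Treatment on Instrument gives the same coefficients in both approaches (the standard errors differ, presumably because I used vce(robust) in the manual version). The second stage does not match, however: the full cmp model reports a Treatment coefficient of -29.48 (its initial single-equation interval regression reports -13.57), whereas intreg on the fitted values gives -28.92. The sample sizes also differ: cmp reports 3,480 observations for the full model and 2,691 for its initial interval regression, while my intreg uses 2,742. Did I make a mistake here? What exactly is the difference between using intreg and using cmp with the indicator "$cmp_int"?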
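
In case the differing estimation samples play a role, here is a sketch of how both approaches could be run on one common sample. The variable names common and fitted_c are made up for illustration, and I have not verified that this removes the difference; it only makes the two sets of output easier to compare.

Code:
* Sketch only: -common- and -fitted_c- are hypothetical names for illustration
cmp setup    // defines $cmp_cont, $cmp_int and the other indicator macros

* Keep observations with Treatment, Instrument and at least one wage bound
generate byte common = !missing(Treatment, Instrument) & !(missing(Outcome1) & missing(Outcome2))

* Joint model on the common sample
cmp (Treatment = Instrument) (Outcome1 Outcome2 = Treatment) if common, indicators($cmp_cont $cmp_int)

* Manual two-step on the same sample
regress Treatment Instrument if common
predict double fitted_c if common, xb
intreg Outcome1 Outcome2 fitted_c if common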