[This is picking up on the thread I began here: https://www.statalist.org/forums/for...normal-density]

I'm trying to fit by Maximum Likelihood measurement error models using the specifications of Kapteyn and Ypma (Journal of Labor Economics, 2007: https://www.jstor.org/stable/10.1086...o_tab_contents). They are essentially mixture models involving evaluation of multivariate normal distributions: see their Appendix B for likelihood functions for their general model. I am trying, at least initially, to fit their 'basic model'.

My likelihood evaluator function for the basic model is currently as follows (in a do-file called basic_ll6.do):

Code:
capture program drop basic_ll6

prog def basic_ll6

    version 8.2
    
    args lnf mu_xi atrho lsig_xi lsig_eta
    
    tempname m v11 v12 v22 M V X r d
    
    scalar `m' = `mu_xi'
    
    scalar `r' = (exp(2*`atrho') - 1) / (exp(2*`atrho') + 1) // rho
    
    matrix `M' = ( `m', `m' )
    
    scalar `v11' =  (exp(`lsig_xi'))^2
    
    scalar `v12' = ((1+`r')*(exp(`lsig_xi'))^2)  ///
        /( exp(`lsig_xi')*sqrt((1+`r')^2)*((exp(`lsig_xi'))^2) + ((exp(`lsig_eta'))^2) )
    
    scalar `v22' = ((1+`r')^2)*((exp(`lsig_xi'))^2) + (exp(`lsig_eta'))^2
    
    matrix `V' = (  `v11' , `v12' \ `v12', `v22'   )
    
    capture {
                // check that `V' is pos semi definite
        scalar `d' = det(`V')
        if scalar(`d') < 0 error 506  
    }
    
    putmata X1 = ($R_i $S_i)
    mata:      
        M1 = st_matrix("`M'")
        V1 = st_matrix("`V'")
        myeval = __lnmvnormalden(M1,V1,X1)    
    end
    
    getmata `result' = myeval
    
    qui replace `lnf' = `result' if $L_i == 0
    
    qui replace `lnf' = lnnormalden($R_i, `mu_xi', exp(`lsig_xi')) ///
            if $L_i == 1
    
end
Observe the call to Mata to evaluate a bivariate normal density. (Evaluation can only be done in Mata; not in Stata -- see the Statalist thread cited at the beginning of this post.)

My immediate problem is that if I run basic_ll6.do, Stata throws an error. Here is an extract from the log-file after a set trace on:

Code:
   --------------------------------------------------------- begin getmata ---
    - version 11
    - syntax [anything(name=getlist id="getlist" equalok)] [, DOUBLE FORCE ID(s
> tring) REPLACE UPdate]
    - mata: get()
= invalid vector or matrix name
    ----------------------------------------------------------- end getmata ---
r(198);
Thus Mata treats the matrix with temporary name `M' as undefined in st_matrix("`M'"). The same error is thrown if I place the run command within a do-file surrounded by code reading in the data and setting up the ml command.

I have not found anything on Statalist or in the manuals that directly addresses my problem; perhaps the closest is this thread: https://www.statalist.org/forums/for...e_lf-not-found (with useful contributions by @Matthew J. Baker and @Jeff Pitblado).

I'm guessing from that thread that I need to write a Mata function to do the calculation, perhaps also without putmata and getmata calls -- along the lines suggested by Matt or by Jeff.

I'd be grateful for comments on my conjectures and suggestions about how to proceed. Thank you. (I am a Mata newbie.)

Once I've managed to fit the Basic Model, I wish to fit the other, more complex, specifications in the Kapteyn-Ypma paper. Their data and my data are not able to be shared, but you can simulate data from their full model with the following code:

Code:
* simulate data from Kapteyn-Ypma so that can develop code 
* away from secure server
*
* Model: page 527. Parameter values: Table C2
*************************************************
set seed 123456789

set obs 10000

scalar mu_xi = 12.283
scalar sig_xi = 0.717
scalar mu_zeta = 9.187    
scalar sig_zeta = 1.807
scalar mu_eta = -0.048    
scalar sig_eta = 0.99
scalar mu_omega = -0.304
scalar sig_omega = 1.239
scalar rho = -0.013
scalar pi_r = 0.959
scalar pi_s = 0.152
scalar pi_omega = 0.156

scalar list

/*
rnormal(m, s)
       Description:  normal(m,s) (Gaussian) random variates, where m is the
                     mean and s is the standard deviation
*/

ge xi_i = rnormal(mu_xi, sig_xi)
ge zeta_i = rnormal(mu_zeta, sig_zeta)
ge eta_i = rnormal(mu_eta, sig_eta)
ge omega_i = rnormal(mu_omega, sig_omega)
ge r_i = pi_r * xi_i ///
        + (1 - pi_r) * zeta_i
ge s_i = pi_s * xi_i            ///
        + (1 - pi_s)*(1 - pi_omega)*( xi_i + rho*(xi_i - mu_xi) + eta_i )   ///
        + (1 - pi_s)*pi_omega*( xi_i + rho*(xi_i - mu_xi) + eta_i + omega_i)
ge m_i = s_i - r_i
        
lab var r_i "register log earnings"
lab var s_i "survey log earnings"
lab var m_i "difference: s_i - r_i"

* Compare K-Y Table 4, page 524
su r_i, de
local Vr = r(Var)
su s_i, de
su m_i, de
local Vm = r(Var)
corr m_i r_i
di "Reliability = "  `Vr' / (`Vr' + `Vm')

* Compare K-Y Table C2 (sample counterparts to parameters)
su xi_i zeta_i eta_i omega_i, de
de

save simdata01.dta, replace