Dear all,

I am currently performing a logistic regression analysis on my dataset. The dataset contains loan parts observed over 14 quarters; each observation contains the loan part with its characteristics (age of the applicant, loan to value, loan to income, property value, outstanding debt, etc.). I would like to model the probability of non-performance of the mortgage as a binary outcome. (Non-performance is a binary variable in the dataset indicating mortgage arrears of more than 3 months.)

I have some questions regarding my data and the method:
First, I have data for 14 quarters, but not all loan parts are present in all 14 quarters. I addressed this by deleting all loan parts with fewer than 14 observations, but this removes almost 50% of the dataset. Is there any way to include these observations while still controlling for over- or under-representation?

To create a numeric loan-part identifier I used:

Code:
egen long Leningdeelnummer = group(Leningdeelnr)

Secondly, I ran the following commands on the dataset:

Code:
tsset Leningdeelnummer Datumrapportage
xtlogit nonperforming NHG age1 tweedeaanvrager LTIBruto Rentevastperiodemnd Hoofdsomoorspronkelijk1000 Bedragoorsprtaxatiewaarde1000 
With the following outcome:

Code:
Random-effects logistic regression              Number of obs     =          x
Group variable: Leningdeelnu~r                  Number of groups  =          x

Random effects u_i ~ Gaussian                   Obs per group:
                                                              min =          1
                                                              avg =       12.9
                                                              max =         13

Integration method: mvaghermite                 Integration pts.  =         12

                                                Wald chi2(7)      =     131.07
Log likelihood  = -2137.3967                    Prob > chi2       =     0.0000

-----------------------------------------------------------------------------------------------
                nonperforming |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------------------+----------------------------------------------------------------
                          NHG |          x   .6617874     1.57   0.117    -.2602901    2.333869
                         age1 |          x   .0167961    -5.35   0.000    -.1227112   -.0568715
              tweedeaanvrager |          x   .3292693    -3.39   0.001    -1.760786   -.4700739
                     LTIBruto |          x    .000996     0.01   0.994    -.0019447    .0019596
          Rentevastperiodemnd |          x   .0035796    -4.67   0.000    -.0237263   -.0096946
   Hoofdsomoorspronkelijk1000 |          x   .0028072     3.03   0.002     .0030151    .0140191
Bedragoorsprtaxatiewaarde1000 |          x    .003063    -4.55   0.000    -.0199539   -.0079474
                        _cons |          x   1.316575    -9.14   0.000    -14.60831   -9.447426
------------------------------+----------------------------------------------------------------
                     /lnsig2u |   3.703917   .0436273                      3.618409    3.789425
------------------------------+----------------------------------------------------------------
                      sigma_u |   6.372286   .1390029                      6.105587    6.650635
                          rho |   .9250529   .0030247                       .918905    .9307699
-----------------------------------------------------------------------------------------------
LR test of rho=0: chibar2(01) = 3692.30                Prob >= chibar2 = 0.000

But when I add the loan-to-value ratio (LTVpercentage, the ratio of the mortgage loan to the value of the related property) to the model, the estimation does not converge:

Code:
xtlogit nonperforming NHG age1 tweedeaanvrager LTIBruto Rentevastperiodemnd LTVpercentage

Fitting comparison model:

Iteration 0:   log likelihood = -4405.1234
Iteration 1:   log likelihood = -4209.8719
Iteration 2:   log likelihood = -4208.5684  (backed up)
Iteration 3:   log likelihood = -3974.4999
Iteration 4:   log likelihood = -3973.2986
Iteration 5:   log likelihood = -3973.2986  (backed up)
Iteration 6:   log likelihood = -3973.2986  (backed up)
Iteration 7:   log likelihood = -3973.2986  (backed up)
Iteration 8:   log likelihood = -3973.2986  (backed up)
Iteration 9:   log likelihood = -3973.2986  (backed up)
Iteration 10:  log likelihood = -3973.2986  (backed up)
Iteration 11:  log likelihood = -3973.2986  (backed up)
Iteration 12:  log likelihood = -3973.2986  (backed up)
Iteration 13:  log likelihood = -3973.2986  (backed up)
Iteration 14:  log likelihood = -3973.2986  (backed up)

Can someone help me work out what the problem might be with adding this variable to the model? The log likelihood stays the same even after more than 300 iterations, which suggests that the model is not converging and will not converge.

Kind regards,

Django