Hello!

I am studying the association between product quality as measured by an expert intermediary (overallScore) and as measured by online user ratings (rating). In particular, I hypothesize that this association depends on the longevity of product use. For instance, I would expect a positive correlation if the user rating is given right after the purchase, and a negative correlation if it is given after considerable use of the product. The longevity of use (moder) is captured at four levels (1 = < 1 month, 2 = 1-3 months, 3 = 3-6 months, and 4 = > 6 months). The descriptive statistics of the three variables are given below:

Code:
 sum overallScore rating moder

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
overallScore |     11,716    71.03175    14.30822          4        100
      rating |     11,716    2.804797    1.704218          1          5
       moder |     11,716    2.992147    1.234491          1          4
Further, I include a screenshot of a binned scatterplot of the relationship between overallScore and rating by levels of moder, which provides some support for my initial assumption.
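
The same pattern can also be checked numerically. This is a minimal sketch, assuming the variable names shown in the summary table. The first command reports the correlation between overallScore and rating separately for each level of moder. The second redraws a binned scatterplot by moder; it assumes the community-contributed -binscatter- command from SSC is installed, which may or may not be the command behind the screenshot.

Code:
* correlation between expert score and user rating within each level of moder
bysort moder: pwcorr overallScore rating, obs sig

* binned scatterplot by level of moder (assumption: ssc install binscatter)
binscatter overallScore rating, by(moder)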


Now, my dataset has the following structure:

– User rating is captured at the individual level for a given product identified with a name_id (the total number of products is 109). The minimum number of ratings per product is 3, and the maximum is 583.
– Expert intermediary overallScore is captured at the name_id level.
– And finally, each name_id is associated with a product category_id.

At the end of this post, I provide an example of the structure of the data I am using.
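
The structure described above can be verified before fitting any model. This is a minimal sketch, assuming the variable names from the example data. n_ratings, one_product and one_category are hypothetical helper variables introduced only for illustration. Each -assert- stops with an error if the stated condition does not hold.

Code:
* each product should belong to exactly one category
bysort name_id (category_id): assert category_id[1] == category_id[_N]

* the expert score should be constant within a product
bysort name_id (overallScore): assert overallScore[1] == overallScore[_N]

* number of ratings per product (hypothetical helper variables)
bysort name_id: generate n_ratings = _N
egen byte one_product = tag(name_id)
summarize n_ratings if one_product

* number of distinct products and distinct categories
egen byte one_category = tag(category_id)
count if one_product
count if one_category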

As far as I understand, given this data structure, each rating is nested within a name_id and each name_id is nested within a category_id. Therefore, to formally test my hypothesis on the moderating role of longevity of use, I use a hierarchical linear model fitted with the -mixed- command. While I have no issues running the following model:

Code:
mixed overallScore rating || category_id:

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:   log likelihood = -45318.885  
Iteration 1:   log likelihood = -45318.885  

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =     11,716
Group variable: category_id                     Number of groups  =        109

                                                Obs per group:
                                                              min =          3
                                                              avg =      107.5
                                                              max =        583

                                                Wald chi2(1)      =      20.86
Log likelihood = -45318.885                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
overallScore |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      rating |   .3020026   .0661245     4.57   0.000     .1724009    .4316043
       _cons |   69.16763   .9746409    70.97   0.000     67.25737    71.07789
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
category_id: Identity        |
                  var(_cons) |   94.82431   13.48987      71.75073    125.3179
-----------------------------+------------------------------------------------
               var(Residual) |   129.4131   1.698858      126.1258     132.786
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 4949.73       Prob >= chibar2 = 0.0000

estat icc

Residual intraclass correlation

------------------------------------------------------------------------------
                       Level |        ICC   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
                 category_id |   .4228747   .0348833      .3563758    .4922911
------------------------------------------------------------------------------
I am unable to estimate another model in which I also include name_id effects:

Code:
mixed overallScore rating || name_id: || category_id:

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:   log likelihood =  105812.11  (not concave)
Iteration 1:   log likelihood =  272460.46  (not concave)
Iteration 2:   log likelihood =  281690.62  (not concave)
Iteration 3:   log likelihood =  285747.12  (not concave)
Iteration 4:   log likelihood =  286066.56  (not concave)
Iteration 5:   log likelihood =  286093.57  (not concave)
Iteration 6:   log likelihood =  286098.91  (not concave)
Iteration 7:   log likelihood =   286099.8  (not concave)
Iteration 8:   log likelihood =     286100  (not concave)
Iteration 9:   log likelihood =  286100.05  (not concave)
Iteration 10:  log likelihood =  286100.05  (not concave)
Iteration 11:  log likelihood =  286100.05  (not concave)
Iteration 12:  log likelihood =  286100.05  (not concave)
Iteration 13:  log likelihood =  286100.05  (not concave)
Iteration 14:  log likelihood =  286100.05  (not concave)
Iteration 15:  log likelihood =  286100.05  (not concave)
Iteration 16:  log likelihood =  286100.05  (not concave)
Iteration 17:  log likelihood =  286100.05  (not concave)
--Break--
I would appreciate your comments on what I am doing wrong, or on where my understanding of HLM falls short in this case. Thank you!
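
For reference, here is a sketch of one way the moderation hypothesis and the three-level nesting could be written in -mixed-. It is an untested illustration resting on two assumptions that are not in the models above. First, the grouping levels are listed from highest to lowest (category_id, then name_id), which is the order -mixed- expects for nested levels. Second, rating is used as the outcome, because in the example data overallScore does not vary within name_id, so a name_id random intercept would absorb it entirely. Treating the 1-5 rating as continuous is a further simplification.

Code:
* assumption: rating as outcome, overallScore as product-level predictor,
* moderation by longevity of use entered as an interaction
mixed rating c.overallScore##i.moder || category_id: || name_id:

* joint test of the interaction terms (the moderation hypothesis)
testparm c.overallScore#i.moder

* slope of overallScore at each level of moder
margins moder, dydx(overallScore)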

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long(category_id name_id) byte(overallScore rating) float moder
1 2955 85 1 4
1 1015 89 1 4
1 1015 89 1 1
1 2085 80 3 4
1 1120 68 1 1
1 2085 80 1 4
1 2085 80 1 3
1 2085 80 1 4
1 2955 85 5 4
1 2085 80 1 4
1 2085 80 2 4
1 2085 80 1 4
1 2085 80 2 4
1 2085 80 2 4
1 2085 80 5 4
1 2085 80 1 4
1 2085 80 2 4
1 1013 80 3 4
1 1015 89 2 4
1 1015 89 1 3
1 1013 80 1 4
1 1015 89 5 2
1 1015 89 5 4
1 2085 80 1 4
1 1015 89 1 1
1 2955 85 5 4
1 1013 80 1 3
1 2085 80 2 4
1 1015 89 1 2
1 1120 68 1 4
1 1119 89 5 2
1  127 71 1 3
1 2085 80 1 4
1 2085 80 5 4
1 2085 80 4 4
1 1119 89 5 4
1 2085 80 1 4
1 2085 80 1 4
1 1015 89 3 3
1 1015 89 1 1
1 2085 80 1 2
1 2085 80 1 4
1 2085 80 4 4
1 2085 80 4 4
1 2085 80 1 4
1 1015 89 1 1
1 1015 89 1 4
1 2085 80 1 4
1 2085 80 3 4
1 2085 80 1 3
end
label values category_id category_id
label def category_id 1 "aa_batteries", modify
label values name_id name_id
label def name_id 127 "AmazonBasics Performance AA Alkaline battery", modify
label def name_id 1013 "Duracell Coppertop Duralock AA Alkaline battery", modify
label def name_id 1015 "Duracell Quantum AA Alkaline battery", modify
label def name_id 1119 "Energizer Ultimate Lithium AA battery", modify
label def name_id 1120 "Energizer ecoAdvanced AA Alkaline battery", modify
label def name_id 2085 "Kirkland Signature (Costco) AA Alkaline battery", modify
label def name_id 2955 "Rayovac Fusion Advanced AA Alkaline battery", modify