Latent Profile Analysis: best practices in "starting values" selection?

Dear Statalist users,

I initially created a post here about my difficulties understanding the basics of latent profile analysis (LPA) syntax in Stata. Following the post here, I was able to replicate Masyn's (2013) LPA using startvalues(randompr, draws(5) seed(15)). I applied the same starting values uniformly across models with 1 to 6 classes under 4 different model restrictions. The resulting BIC statistics are as follows:

class	BIC class-invariant, diagonal	BIC class-varying, diagonal	BIC class-invariant, unrestricted	BIC class-varying, unrestricted
1	6536.391	6536.391	5726.355	5726.355
2	6044.384	5982.513	DRE	5648.8
3	5923.718	5917.452	5563.118	5620.018
4	5915.317	5820.818	5587.027	5741.81
5	5898.285	5838.543	5731.829	5756.148
6	5843.54	5817.436	5259.08	5927.461

Here, DRE stands for "discontinuous region encountered".
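
For readers following along, the four restrictions in the table above could be specified roughly as in the sketch below. This is an illustration under assumptions, not the exact commands used: the indicator names y1-y4 and the choice of 3 classes are hypothetical. It also relies on my understanding that gsem holds error variances equal across classes unless lcinvariant(none) is specified, and that covstructure(e._OEn, unstructured) frees the error covariances among the observed indicators.

```stata
* Sketch only: y1-y4 are hypothetical indicator names; 3 classes is arbitrary.
* Store the starting-value options once so every model uses the same ones.
local sv startvalues(randompr, draws(5) seed(15))

* Class-invariant, diagonal (assumed gsem default for error variances)
gsem (y1 y2 y3 y4 <- _cons), lclass(C 3) `sv'
estimates store inv_diag

* Class-varying, diagonal
gsem (y1 y2 y3 y4 <- _cons), lclass(C 3) lcinvariant(none) `sv'
estimates store var_diag

* Class-invariant, unrestricted
gsem (y1 y2 y3 y4 <- _cons), lclass(C 3) ///
    covstructure(e._OEn, unstructured) `sv'
estimates store inv_unres

* Class-varying, unrestricted
gsem (y1 y2 y3 y4 <- _cons), lclass(C 3) lcinvariant(none) ///
    covstructure(e._OEn, unstructured) `sv'
estimates store var_unres

* AIC and BIC for the four stored models, side by side
estimates stats inv_diag var_diag inv_unres var_unres
```

Because the local macro sv is defined once, the block should be run in a single do-file pass so that the macro is still in scope for every gsem call.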

I noticed what looks like erratic behaviour in the BIC values for the class-invariant, unrestricted model (column four): estimation fails with DRE at 2 classes, and the BIC drops sharply to 5259.08 at 6 classes. This is probably due to the stringent assumption of class invariance. I've also noticed that as I increase the number of classes, Stata struggles more and more to produce results for the same model specification.

Now, comment #7 in the same post recommends using startvalues(randompr, draws(50) seed(15)) emopts(iterate(10)) once one reaches 5+ latent classes. I applied these settings uniformly across my models with 1 to 6 classes and 4 different restrictions. My results look as follows:

class	BIC class-invariant, diagonal	BIC class-varying, diagonal	BIC class-invariant, unrestricted	BIC class-varying, unrestricted
1	6536.391	6536.391	5726.355	5726.355
2	6044.384	5982.513	5677.851	5648.8
3	5923.718	5932.882	5698.263	5620.018
4	5915.317	5820.818	5715.92	5773.844
5	5898.285	5846.176	5750.381	5780.88
6	5843.54	5818.873	5772.309	5924.519
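
Applying the settings from comment #7 uniformly across class counts could look something like the sketch below. It covers only one of the four specifications (class-invariant, diagonal) for 2 to 6 classes; the indicator names y1-y4 and the stored-estimate names are hypothetical.

```stata
* Sketch only: y1-y4 and the civ_diag_k* names are hypothetical.
* More random draws, each refined by 10 EM iterations.
local sv startvalues(randompr, draws(50) seed(15)) emopts(iterate(10))

forvalues k = 2/6 {
    * capture keeps the loop running if a model exits with an error
    capture gsem (y1 y2 y3 y4 <- _cons), lclass(C `k') `sv'
    if _rc == 0 {
        estimates store civ_diag_k`k'
    }
    else {
        display as error "`k'-class model returned error code " _rc
    }
}

* AIC and BIC for every model that was stored
estimates stats civ_diag_k*
```

Keeping the options in one macro makes it harder to change the starting values for one model and forget another.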

At this point, I am very confused about when to use one set of starting values rather than another. Every time I use a different set, my class profiles change markedly, and I am trying to avoid the trap of choosing the starting values that best fit my research expectations.
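
One way to make this sensitivity visible, offered as a minimal sketch rather than an established recipe, is to refit a single specification under several seeds and compare the final log likelihoods. A solution that reaches the same highest log likelihood from several seeds is less likely to be a local maximum. The indicator names y1-y4, the 5-class choice, and the seed values are all hypothetical.

```stata
* Sketch only: y1-y4, the 5-class model, and the seeds are hypothetical.
foreach s of numlist 15 101 2024 {
    capture noisily gsem (y1 y2 y3 y4 <- _cons), lclass(C 5) ///
        startvalues(randompr, draws(50) seed(`s')) emopts(iterate(10))
    if _rc == 0 {
        * e(ll) is the final log likelihood; e(converged) is 1 if converged
        display "seed `s': ll = " e(ll) "  converged = " e(converged)
    }
    else {
        display as error "seed `s': gsem returned error code " _rc
    }
}
```

If the log likelihoods differ across seeds, the BIC differences between the two tables above may reflect different local maxima rather than differences between the models themselves.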

I'm leaning towards using Masyn's starting values, as in my first table, simply because they give me a standard I can follow. But if anyone has insights on this topic, I would be very grateful for the discussion. Many thanks.

P.S. I am aware of the "gsem estimation options" document from the Stata manual. Unfortunately, I could not solve my problem after reading it.
