Hello Everyone,

I have conducted a discrete choice experiment in multiple countries. Each participant answered 12 choice sets (24 choice sets in total, split into two blocks). Each choice set had 3 alternatives (2 product options and a "none" option). There were 4 attributes: price (4 levels), production A (yes/no), production B (yes/no), and production type (categorical, 4 levels). I have around 3,000 participants in total. There was also a constraint in the design: whenever production A was yes (=1), production B also had to be yes (=1). Is there anything I need to do in the analysis to account for this constraint?
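For context, this is the quick check I used to confirm the constraint actually holds in my data (variable names as in the example dataset below):

```stata
* sanity check: ProdA = 1 should never occur with ProdB = 0 in the design
assert ProdB == 1 if ProdA == 1

* and see which attribute combinations actually occur
tab ProdA ProdB
```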

I have some questions about the best way to run the analysis, specifically regarding lclogit, which I think is the best way forward because there seems to be a lot of heterogeneity in the data. Pooling the data across countries also looks to be best.

Here is an example of the dataset and the code I have used:

Code:
 
ID country choiceset alternative choice price ProdA ProdB ProdType Level1 Level2 Level3 Level4 ASC_none identifier
1 1 1 1 0 1.55 0 0 1 1 0 0 0 0 101
1 1 1 2 1 1.15 1 0 2 0 1 0 0 0 101
1 1 1 3 0 0 0 0 0 0 0 0 0 1 101
1 1 2 1 1 1.15 1 1 4 0 0 0 1 0 102
1 1 2 2 0 0.95 1 0 3 0 0 1 0 0 102
1 1 2 3 0 0 0 0 0 0 0 0 0 1 102

lclogit choice price ASC_none ProdA ProdB Level2 Level3 Level4, group(identifier) id(ID) nclasses(2)
lclogitml, iterate(40)
wtp price ProdA ProdB Level2 Level3 Level4, equation(choice1) krinsky reps(10000)
1. What do the share constant estimates mean? See this example output:

Code:
-------------+----------------------------------------------------------------
share1       |
       _cons |  -.0617537   .1364026    -0.45   0.651     -.329098    .2055905
-------------+----------------------------------------------------------------
share2       |
       _cons |  -.7815727   .2272384    -3.44   0.001    -1.226952   -.3361935
------------------------------------------------------------------------------
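From what I understand (happy to be corrected), the share constants are the multinomial-logit parameters of the class-membership model, with one class normalized to zero, so the implied class shares can be recovered with a logit transform. Since two share equations are reported, the sketch below assumes a three-class run with class 3 as the base; with only two classes there would be a single share constant and the transform reduces to invlogit():

```stata
* implied class shares from the share constants
* (sketch; assumes 3 classes with class 3's constant normalized to 0)
nlcom (share1: exp([share1]_cons) / (1 + exp([share1]_cons) + exp([share2]_cons))) ///
      (share2: exp([share2]_cons) / (1 + exp([share1]_cons) + exp([share2]_cons))) ///
      (share3: 1 / (1 + exp([share1]_cons) + exp([share2]_cons)))
```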

2. After getting the class estimates I use the following code for probabilities:

Code:
* posterior (conditional) class-membership probabilities: cp1, cp2, ...
lclogitpr cp, cp
* choice probabilities: pr (unconditional) and pr1, pr2, ... (class-specific)
lclogitpr pr, pr
egen double cpmax = rowmax(cp1-cp2)
summarize cpmax, sep(0)

// create the class membership based on the highest posterior probability
gen byte class = .
forvalues c = 1/`e(nclasses)' {
    replace class = `c' if cpmax == cp`c'
}

// average choice probabilities for the chosen alternatives, by assigned class
forvalues c = 1/`e(nclasses)' {
    quietly summarize pr if class == `c' & choice == 1
    local n = r(N)
    local a = r(mean)
    quietly summarize pr`c' if class == `c' & choice == 1
    local b = r(mean)
    matrix pr = nullmat(pr) \ (`n', `c', `a', `b')
}
matrix colnames pr = "Obs" "Class" "Uncond_Pr" "Cond_Pr"
matlist pr, name(columns)
Is it then correct to report these posterior probabilities as estimates of who belongs to each class?
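In case it helps to judge the assignment, one diagnostic I have seen (building directly on the cp variables generated above) is the mean posterior probability within each assigned class; values close to 1 would suggest the classes are cleanly separated:

```stata
* mean posterior class probabilities within each assigned class
tabstat cp1 cp2, by(class) statistics(mean) nototal
```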

3. Is it correct to profile the classes using cross-tabulations (tab variable class, column)? I would like to know the best way to profile the classes. When I compare with output from other programs, the interpretation seems to be different (i.e., those programs include the profiling variables in the class-membership model itself).
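If the comparison programs are including respondent characteristics in the class model, I believe lclogit can do the same via its membership() option; age and income below are just placeholders for whatever profiling variables are available:

```stata
* hypothetical profiling covariates entering the class-membership model
lclogit choice price ASC_none ProdA ProdB Level2 Level3 Level4, ///
    group(identifier) id(ID) nclasses(2) membership(age income)
```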

4. How does the seed affect the results? Is it necessary to set one, and if so, what would be an appropriate seed value?
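For concreteness: my understanding is that lclogit's EM algorithm draws random starting values, so fixing a seed only pins down which starting values are used, e.g.:

```stata
* any fixed value works; the point is that reruns reproduce the same starting values
lclogit choice price ASC_none ProdA ProdB Level2 Level3 Level4, ///
    group(identifier) id(ID) nclasses(2) seed(12345)
```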

Also, for one of the countries I get different results. For example, the price variable is not significant, and the latent class analysis does not shed any insight. Could there be many outliers, or is it just that people did not care about any of the choices or are not price sensitive? Most of the other attributes are also not significant.

Is it better to use effects coding or dummy coding? I get slightly different results between the two, and I am a little confused about how exactly to interpret effects coding. Is each coefficient the difference from the mean utility across all levels rather than from the base level?
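For reference, this is how I understand effects coding for the 4-level production type attribute (base = Level 1), built from the dummies already in my data:

```stata
* effects coding: non-base level coded 1, base level (Level 1) coded -1, else 0
gen e_level2 = Level2 - Level1
gen e_level3 = Level3 - Level1
gen e_level4 = Level4 - Level1
```

With this coding, each coefficient would be the deviation of that level's utility from the mean utility across the four levels, and the base level's effect is minus the sum of the estimated coefficients.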

Thank you in advance! Any help on any of the above questions is much appreciated.

Best,
Megan