Dear all,

This is my first post on Statalist; I hope my request is in line with the forum's policy.

I am currently working with unbalanced panel data. More specifically, my data set consists of two components:
1) archival data for each year between 2008 and 2017 (let's call the variable of main interest VPerf)
2) survey data, but only for the years 2009, 2010, 2013, 2016, and 2017 (let's call the variable of main interest VSurvey)

Although this data set is not an ideal starting point, I would still like to show that VSurvey in t
1) affects VPerf positively in t+1 and negatively in t+2, and
2) itself depends on VPerf in t-1.

Because of the bidirectional relationship I am hypothesizing in 2), I thought of using SEM in Stata.
However, if I work in wide format, I have a very weak data set in terms of the number of observations, and I also find mixed evidence (e.g. the impact of VSurvey2009 on VPerf2010 is not the same as that of VSurvey2013 on VPerf2014).

I do not want to give up at this stage: the number of observations in wide format is small, and I could still imagine finding a significant relationship if I were able to work in long format.

The problem with the long format: without firm fixed effects, Stata treats several observations from the same company as independent of each other. Therefore, the coefficient is over-estimated. For that reason, I am looking for a way to include firm fixed effects in my SEM.
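
To make the long-format setup concrete, it could be sketched roughly as follows. This is only an illustration under assumptions: the identifiers firm_id and year and the generated variable names are hypothetical and not taken from the actual data set. Note that vce(cluster firm_id) only adjusts the standard errors for within-firm dependence; it is not a substitute for the firm fixed effects asked about here.

```stata
* Assumed layout: one row per firm-year; firm_id and year are hypothetical names
xtset firm_id year

* Lag and leads of the performance variable (missing where the panel has gaps)
generate VPerf_lag1  = L1.VPerf
generate VPerf_lead1 = F1.VPerf
generate VPerf_lead2 = F2.VPerf

* Paths for hypotheses 1) and 2), estimated with FIML and
* cluster-robust standard errors at the firm level
sem (VPerf_lead1 <- VSurvey)   ///
    (VPerf_lead2 <- VSurvey)   ///
    (VSurvey     <- VPerf_lag1), ///
    method(mlmv) vce(cluster firm_id)
```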

What might be also helpful to know:
  • Because of my weak data set I am using full information maximum likelihood (mlmv).
  • I have already read about the “xtdpdml” command. However, I do not think that it is suitable for my problem. First, it cannot account for the bidirectional relationship I am hypothesizing. Second, even if I only look at unidirectional relationships, Stata runs for hours if I use the “fiml” option (the equivalent of “mlmv”) without producing any results. Probably my data set is too weak.

Thank you very much for any kind of help!

Best,
Eric