Dear Stata-listers,

I hope you are all doing well.

Before I ask my question, let me give some brief background on my research, which may be useful to anyone who (kindly) writes a reply. I am a junior researcher in accounting and finance. I have been using panel datasets for a while, estimating pooled OLS and/or panel models with several identification strategies (difference-in-differences, regression discontinuity design, instrumental variables, etc.).

Recently, I started working on a project with a colleague who comes from a management/strategy background. Our project is in his field of research, where multilevel modelling is quite common, unlike in my field (finance). Before we start the main analysis, my colleague and I are trying to replicate a seminal paper. The idea of the paper is simple: Chief Executive Officers (CEOs) have become increasingly important in determining firm performance over the years. The dataset used in the paper is a panel in which, in a given year, a CEO manages a firm that operates in some industry. Firm performance, the dependent variable, is measured using return on assets (ROA). The paper finds that the percentage of the variance of ROA explained by CEOs has increased over time (it compares three intervals: 1950-1969, 1970-1989, and 1990-2009). I include below a sample of a similar dataset:

Firm_ID  Year  CEO     Industry  ROA
1        2003  Liang   K7         0.06019
1        2004  Liang   K7         0.069624
1        2005  Liang   K7         0.077258
1        2006  Liang   K7         0.069463
1        2007  Liang   K7         0.075686
1        2008  Liang   K7         0.048303
1        2009  Liang   K7         0.054536
1        2010  Liang   K7         0.052903
1        2011  Liang   K7         0.047317
1        2012  Liang   K7         0.048673
1        2013  Liang   K7         0.04473
1        2014  Liang   K7         0.040357
1        2015  Liang   K7         0.047204
1        2016  Liang   K7         0.04153
1        2017  Liang   K7         0.039362
2        2003  Kexin   C27        0.046562
2        2004  Kexin   C27       -0.00105
2        2005  Kexin   C27       -0.08607
2        2006  Kexin   C27        0.021265
2        2007  Kexin   C27       -0.04802
2        2008  Lufeng  C27       -0.06058
2        2009  Lufeng  C27        0.027213
2        2010  Xiao    C27        0.095465

The authors of the original paper state the following: "We use multilevel modeling (MLM), which has the advantage of explicitly accounting for the nested structure of the data. For the MLM analysis, we specified a four-level nested model: years, within CEOs, within firms, within industries. We used the Stata command xtmixed for the MLM analysis."
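
If I read this description correctly, the model could be written roughly as below. The notation is my own and not taken from the paper: i indexes industries, f firms, c CEOs, and t years, and the covariate term is a placeholder for whatever controls are included. This is only my understanding, so please correct me if the structure is wrong.

```latex
\begin{align*}
\mathrm{ROA}_{ifct} &= \mathbf{x}_{ifct}'\boldsymbol{\beta} + u_i + v_{if} + w_{ifc} + \varepsilon_{ifct}, \\
u_i &\sim N(0,\sigma_u^2), \quad
v_{if} \sim N(0,\sigma_v^2), \quad
w_{ifc} \sim N(0,\sigma_w^2), \quad
\varepsilon_{ifct} \sim N(0,\sigma_\varepsilon^2), \\
\text{CEO share} &= \frac{\sigma_w^2}{\sigma_u^2 + \sigma_v^2 + \sigma_w^2 + \sigma_\varepsilon^2}.
\end{align*}
```

Under this reading, the year level is the observation level, so its variance is the residual variance rather than a separate random intercept.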


Before writing this post, I spent a couple of days searching and reading several resources. I got the general idea of the analysis and how it works (Stata's videos and blogs are very helpful). Yet I am not sure whether the command I came up with does what the authors of the original paper describe. My suggested code is included below:

Code:
xtset Firm_ID Year
mixed ROA control_variables || Industry: || Firm_ID: || CEO: || Year:, mle variance
estat icc // to get the percentage of variance explained by Industry, Firm_ID, CEO, and Year.
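
An alternative I am also considering is sketched below. It assumes that years are the observation level and are therefore captured by the residual, so there is no separate "|| Year:" equation. As far as I understand, estat icc reports cumulative correlations for nested levels rather than each level's own share, so the sketch computes the shares by hand. The names control_variables, ceo_id, industry_id, and the v_* scalars are placeholders of mine, and the parameter names in _b[] are what I believe mixed stores, so they should be checked against the output of matrix list e(b).

```stata
* Sketch only: control_variables is a placeholder for the covariate list.
* Numeric group identifiers, in case CEO and Industry are string variables.
egen ceo_id = group(CEO)
egen industry_id = group(Industry)

* Years are the observation level, so they enter as the residual.
mixed ROA control_variables || industry_id: || Firm_ID: || ceo_id:, mle variance

* Inspect the stored parameter names before relying on the lines below.
matrix list e(b)

* mixed stores log standard deviations; exp(2*x) converts them to variances.
scalar v_ind  = exp(2*_b[lns1_1_1:_cons])
scalar v_firm = exp(2*_b[lns2_1_1:_cons])
scalar v_ceo  = exp(2*_b[lns3_1_1:_cons])
scalar v_res  = exp(2*_b[lnsig_e:_cons])
scalar v_tot  = v_ind + v_firm + v_ceo + v_res

* Share of the total variance attributed to each level.
display "Industry share: " v_ind/v_tot
display "Firm share:     " v_firm/v_tot
display "CEO share:      " v_ceo/v_tot
display "Residual share: " v_res/v_tot
```
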
Please let me know what you think. Any additional explanation about MLM or about the coding is welcome.

Thank you all.

Mostafa
(Stata 15.1 MP)