Hi, Statalist.
1. Our task/problem
I’m trying to predict whether a startup will grow continuously for the next 3 years. I have panel data with two levels: firm and year. First, I calculate a growthdummy variable, which is 1 in each year in which the firm grows. Next, I create a 3_year dummy, which is 1 if the firm grows for three consecutive years. I am not sure whether the 3_year dummy should be placed in the year before the three years of growth happen or in the year after; in the sample data I have placed it before they happen.
2. Which model fits our task?
I’m looking to find out which variables contribute to the 3-year continuous growth. In other words, which independent variables cause a firm to grow? Therefore, I want a model that fits my panel data. I’ve been looking into dynamic panel data models such as Arellano-Bond estimation, where a lag of my dependent variable is included as a regressor. I’ve also been looking into xtlogit, but I’m not sure whether I should use random effects or fixed effects (our variables change every year within every firm).
My dataset consists of 3 million rows and 450 variables. The variables are of different data types, such as categorical, numeric and binary.
Below is a fake dataset to see how my dependent variable is coded:
firm_id year growthdummy 3_year
1 2000 0 0
1 2001 0 0
1 2002 0 0
1 2003 0 0
1 2004 1 0
1 2005 1 0
2 2010 1 1
2 2011 1 1
2 2012 1 0
2 2013 1 0
2 2014 1 0
2 2015 0 0
3 2003 1 1
3 2004 1 1
3 2005 1 0
3 2006 1 0
3 2007 1 0
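To make the coding rule explicit, here is a minimal sketch in Python (not Stata, and not my actual code) of how the 3_year column above can be reproduced from growthdummy. It assumes that 3_year in year t is 1 only if growthdummy is 1 in each of t+1, t+2 and t+3, and that a year missing from the panel counts as no growth. The function name forward_growth_flag and the horizon parameter are made up for illustration.

```python
# (firm_id, year, growthdummy) rows copied from the fake dataset above
rows = [
    (1, 2000, 0), (1, 2001, 0), (1, 2002, 0), (1, 2003, 0), (1, 2004, 1), (1, 2005, 1),
    (2, 2010, 1), (2, 2011, 1), (2, 2012, 1), (2, 2013, 1), (2, 2014, 1), (2, 2015, 0),
    (3, 2003, 1), (3, 2004, 1), (3, 2005, 1), (3, 2006, 1), (3, 2007, 1),
]


def forward_growth_flag(rows, horizon=3):
    """Return {(firm_id, year): 0 or 1}.

    The flag is 1 when the firm grows in each of the next `horizon` years.
    Assumption: a firm-year that is not observed is treated as no growth.
    """
    growth = {(firm, year): g for firm, year, g in rows}
    flags = {}
    for firm, year, _ in rows:
        next_years = (growth.get((firm, year + k), 0) for k in range(1, horizon + 1))
        flags[(firm, year)] = int(all(g == 1 for g in next_years))
    return flags


flags = forward_growth_flag(rows)
for firm, year, g in rows:
    print(firm, year, g, flags[(firm, year)])
```

Under these assumptions the printed fourth column matches the 3_year column in the fake dataset, so the dummy is placed before the growth spell and uses only future values of growthdummy.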