I am relatively new to Stata; until now I have only used 'pre-made' datasets, which made things considerably easier. I am currently working on a dataset that I built from news articles to create four indexes of news tone, separated by political bias: left, center left, right, and center right. All four have daily data spanning 2016-01-08 to 2017-05-25. I also have data for the Consumer Sentiment Index, inflation, the PCE index, and the Industrial Production Index, all of which are collected at monthly intervals.
I initially treated the data as time series, as that made the most sense to me. However, after running into some issues, I was advised to treat it as panel data.
The issue I am facing is that all of my independent variables take the same values for every index, since they are measured at the national level and all four news indexes are created within the US. As a result they are perfectly correlated across the indexes, so when I use xtreg my independent variables are dropped because of collinearity.
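A minimal sketch of how one could confirm that the regressors do not vary across indexes on a given date, assuming the variable names from the data example (date2, CSI_agg, pce, infl). The sd_* variables are hypothetical helper names introduced only for this check.

```stata
* Assumes the variable names from the data example (date2, CSI_agg, pce, infl).
* For each date, compute the standard deviation of each regressor across ids.
foreach v of varlist CSI_agg pce infl {
    bysort date2: egen double sd_`v' = sd(`v')
}
* A maximum of 0 means the regressor never differs across ids on the same date.
* sd() is missing for dates with fewer than two non-missing values.
summarize sd_CSI_agg sd_pce sd_infl
```

If the maximum of each sd_* variable is 0, the regressors carry only time variation, which is consistent with the collinearity described above.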
I have a random sample generated with randomtag below:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float date2 long id float(diffm dem ind rep CSI_agg ind_prod pce infl)
20670 1 -.15003543 . . . . . . .
20845 1 -.076628 . . . . . . .
20935 1 -.0965408 . . . . . . .
20514 2 .1499021 . . . . . . .
20573 2 .010408889 . . . 89 . . .
20598 2 .13686685 . . . . . . .
20746 2 .01415004 . . . . . . .
20855 2 .21080305 . . . . . . .
20860 2 .16345693 . . . . . . .
20717 3 -.313705 . . . . . . .
20868 3 -.26616672 . . . . . . .
20877 3 -.2300913 . . . . . . .
20890 3 -.15893278 . . . . . . .
20455 4 . . . . . . . .
20544 4 -.3524964 . . . . . 12569.7 4.63875
20610 4 -.4287717 . . . . . . .
20624 4 -.4204125 . . . . . . .
20627 4 -.3915495 . . . . . . .
20698 4 -.3510803 . . . . . . .
20704 4 -.3544309 . . . . . . .
20808 4 -.29204187 . . . . . . .
20826 4 -.29282096 . . . . . . .
20833 4 -.26948172 . . . . . . .
20870 4 -.3366554 . . . . . . .
20888 4 -.24112114 . . . . . . .
end
format %td date2
label values id id
label def id 1 "cleft", modify
label def id 2 "cright", modify
label def id 3 "left", modify
label def id 4 "right", modify
The independent variables are dropped when I use xtreg as below:
Code:
xtreg dem priceindex pce CSI_agg diffm i.id
I then tried to use rangestat to get around this:
Code:
rangestat (reg) dem diffm infl, interval(date2 . .) by(id)
"no result for all obs: reg dem diffm infl
varlist required"
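One possible check, offered as a sketch rather than a diagnosis: count how many observations per id have all three regression variables non-missing. It assumes the variable names from the rangestat call above; `complete` is a hypothetical helper variable. In the data example, dem is missing in every row shown, so this count may be worth looking at before anything else.

```stata
* Assumes the variable names used in the rangestat call (dem, diffm, infl).
* missing() returns 1 if any of its arguments is missing.
generate byte complete = !missing(dem, diffm, infl)
* Cross-tabulate id against the complete-case flag.
* An id with no observations where complete == 1 cannot be used in a regression.
tabulate id complete
```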
I am most likely doing something obviously wrong here, and I can't help but think that time series was the way to go. In any case, my questions are as follows:
1.) Am I writing something syntactically wrong when using rangestat? It seems as though I did specify a variable list, but I must be missing something if I get this error...
2.) This question is less practical: the collinearity across my repeated variables suggests to me that I should be using time series rather than panel data. However, if we were to run an analysis by state, for example, the states would surely share some national-level variables, resulting in the same problem. Is there some way to overcome this?
I am sorry for the long question, and thank you for reading!