Hi all,

I am new to Stata and somewhat of a statistics novice. I am working on a project where I am replicating the methodology of a study that analyzed trends in the number of emergency department (ED) visits over time. I am using the same dataset, in which each observation is a unique patient visit record (i.e., 1 row = 1 visit). In the original study, linear regression was used to assess the statistical significance of trends in the number of ED visits over the study period (1990-2009). For my project, I need to do the same, but for the years 2010-2017.

I have been at a loss as to how to do this in Stata, as typically with linear regression you would specify both an independent and a dependent variable (e.g., height and weight). I know that my independent variable would be Year, but how would I specify my dependent variable? The dependent variable would be the number of cases per year, which is a frequency computed across rows and not a variable that exists in my dataset.

The only way I could think to do this was to run a frequency on year, and then create a new dataset with year as the independent variable and count as the dependent variable:

svy: tab Year
Year Count
2010 5269
2011 5902
2012 6793
2013 7212
2014 7094
2015 5070
2016 9186
2017 10586

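Then, with the new dataset, I ran a linear regression of Count on Year (the output was posted as an image and is not reproduced here).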
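
For reference, a minimal sketch of that step on the new dataset. It assumes the yearly counts from the table above were typed in by hand and that the model was a plain `regress` with no survey settings; the variable names `Year` and `Count` are just the column labels from the tabulation, not necessarily the names in my actual file.

```stata
* Sketch only: rebuild the small year-level dataset by hand
clear
input int Year long Count
2010 5269
2011 5902
2012 6793
2013 7212
2014 7094
2015 5070
2016 9186
2017 10586
end

* Linear trend: Count is the dependent variable, Year the independent variable
regress Count Year
```

With only 8 observations (one per year), the coefficient on `Year` is the estimated change in visits per year, and its p-value is the test of the trend.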

Is this totally the wrong way to do this? Is there a way to do this from within my original dataset? I feel like it should be simple, but I've researched extensively and can't figure it out.