Hi all,

I have created a categorical variable (values 1 through 5) in excel based on certain characteristics of US states in a 20 year time period.

I have a large data set with over 2 million rows and I am trying to assign the value of the categorical variable to each row in my data set based on the state and the year.

What I have tried so far is:

gen state_year = 1
replace state_year = 5 if (year==1962 & state==1)

And so on etc.


However, since there are a 1000 combinations of states and years (50*20), this is going to take a very long time to code in a do-file.

Is there a way to import my excel file so that it creates a variable and assigns the value of that variable depending on what is in the excel?
I have attached a screenshot of the first few rows of my excel document.

Array

Any help would be much appreciated. Thanks in advance!

Harry