Hi there,

I'm very new to Stata and I'm doing some research on regional inequalities in Europe. My analysis is focussed on NUTS level 1 regions; however, for some countries in some years the data are reported at the NUTS2 level. (A NUTS1 region is made up of several NUTS2 regions.)
Take the example of Belgium: the NUTS1 codes are BE1, BE2, and BE3. The NUTS2 codes are BE21, BE22, ... and BE31, BE32, ... etc.
I want to create a variable that groups the NUTS2 string values to form a NUTS1 variable, but I'm not sure how to write the code for this. I've been experimenting, but to no avail!

What I've tried so far is:

gen NUTS1=
replace NUTS1==BE2 if substr(NUTS2, 3) ==2

Basically I want a string variable called NUTS1, whose "value" will be BE2 if the third character in the string variable NUTS2 is equal to 2. Similarly, BE3 if the third character in the NUTS2 variable is equal to 3.
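
To make the goal concrete, here is a minimal sketch of the result described above. It assumes NUTS2 is a string variable and that each NUTS1 code is simply the first three characters of its NUTS2 code; the four toy observations are made up for illustration and are not real data.

```stata
* Sketch only: toy data, assuming NUTS2 is a string variable such as "BE21"
clear
input str4 NUTS2
"BE21"
"BE22"
"BE31"
"BE32"
end

* Option 1: take the first three characters of NUTS2 as the NUTS1 code
* substr(s, start, length) returns a substring of s
generate NUTS1 = substr(NUTS2, 1, 3)

* Option 2: the rule spelled out case by case, as described in the text
* (strings are quoted, and = assigns while == tests equality)
generate str3 NUTS1_alt = ""
replace NUTS1_alt = "BE2" if substr(NUTS2, 3, 1) == "2"
replace NUTS1_alt = "BE3" if substr(NUTS2, 3, 1) == "3"

list NUTS2 NUTS1 NUTS1_alt
```

Option 1 should extend to other countries without extra lines, provided their NUTS2 codes follow the same pattern; Option 2 needs one replace per region. NUTS1_alt is a hypothetical name used only to show both approaches side by side.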

This is the first time I'm using Stata so apologies for my ignorance!
Any help will be hugely appreciated.

Nina