I have put together a data-set with information about military alliances between states in the post-WW2 era. The data-set is organized by country-year and covers 1946-2016.
The data-set also includes information about which countries a country shares a land border with in any given year (variables border1, border2, etc., up to border14).
For each alliance a country is a member of in any given year, I have created variables with country-codes for the members of that alliance (this means that a country may appear more than once as an ally in a given year).
(Example: The variable a1mem1 shows the country-code for the first member for alliance 1, the variable a16mem8 shows the 8th member for alliance 16, and so on.)
The maximum number of alliances a country has been a member of in the post-WW2 era is 53. The maximum number of members for any alliance in the post-WW2 era is 59. This means that I have many variables where, in most instances, the value is missing (very few states are members of more than a couple of alliances, and most alliances involve only a relatively small number of states).
To be precise, the data-set consists of 9720 observations (country-years) and 3196 variables. Of these, 3127 (53*59) are the variables with information about the members of each alliance (again, variables a1mem1 – a53mem59).
Now, what I would like is to create new variables named alliance_b1, alliance_b2, alliance_b3, etc., one for each border variable.
The value of the variable alliance_b1 should be “1” if the country in this particular year has an alliance with the country that appears in the variable border1 (and 0 otherwise). Similarly, the value of the variable alliance_b2 should be “1” if the country has an alliance with the country that appears in the variable border2, etc.
Put differently, I would like to run code that goes through all 3127 variables (a1mem1 to a1mem59, a2mem1 to a2mem59, … up to a53mem59), checks whether any of the country-codes there also appears in the variables border1-border14, and if so, codes the new variables alliance_b1, alliance_b2, etc. accordingly.
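One way the check described above could be sketched in Stata is a brute-force nested loop over borders, alliances, and members. This is only a sketch under stated assumptions: the full data-set is in memory, all border and member variables are numeric country-codes with "." for missing, and the local macro names (nborders, nalliance, nmembers) are hypothetical. It is meant to be run from a do-file, since it uses `//` comments and `///` line continuation.

```stata
* Sketch only. Assumes the full data set is in memory and that
* border1-border14 and a1mem1-a53mem59 are numeric country codes.
local nborders  14   // border1-border14
local nalliance 53   // a1... to a53...
local nmembers  59   // ...mem1 to ...mem59

forvalues b = 1/`nborders' {
    * start from 0, meaning no alliance with this neighbour
    generate byte alliance_b`b' = 0
    forvalues a = 1/`nalliance' {
        forvalues m = 1/`nmembers' {
            * flag a match; the missing() guard is needed because
            * Stata treats two missing values as equal
            quietly replace alliance_b`b' = 1 ///
                if a`a'mem`m' == border`b' & !missing(border`b')
        }
    }
}
```

For the reduced example, the three locals would be set to 3, 3, and 3. Note that with this sketch alliance_b stays 0 (rather than missing) when the corresponding border variable is missing, which follows the "0 otherwise" rule stated above.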
Below I have included a heavily reduced version of my data-set for illustrative purposes (both table and code).
All advice and recommendations are deeply appreciated. Also, since I am new to the forum, any suggestions for how I can improve my questions are most welcome.
Best,
Magnus Åsblad
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double cowcode float(border1 border2 border3) double year int(a1mem1 a1mem2 a1mem3 a2mem1 a2mem2 a2mem3 a3mem1 a3mem2 a3mem3)
2 20 70 . 1946 . 235 . . 40 41 . . .
2 20 70 . 1947 . 40 41 . 31 40 . . .
2 20 70 . 1948 . 31 40 . 31 40 . . .
2 20 70 . 1949 . 31 40 . 31 40 . 20 200
20 2 . . 1946 . . . . . . . . .
20 2 . . 1947 . . . . . . . . .
20 2 . . 1948 . . . . . . . . .
20 2 . . 1949 2 . 200 . . . . . .
70 80 90 2 1946 2 40 41 . . . . . .
70 80 90 2 1947 2 40 41 2 31 40 . . .
70 80 90 2 1948 2 31 40 2 31 40 . . .
70 80 90 2 1949 2 31 40 2 31 40 . . .
220 221 211 232 1946 . 365 . . . . . . .
220 221 211 232 1947 . 365 . 200 . . . . .
220 221 211 232 1948 . 365 . 200 . . 200 210 211
220 221 211 232 1949 . 365 . 200 . . 200 210 211
end
cowcode | border1 | border2 | border3 | year | a1mem1 | a1mem2 | a1mem3 | a2mem1 | a2mem2 | a2mem3 | a3mem1 | a3mem2 | a3mem3 |
2 | 20 | 70 | . | 1946 | . | 235 | . | . | 40 | 41 | . | . | . |
2 | 20 | 70 | . | 1947 | . | 40 | 41 | . | 31 | 40 | . | . | . |
2 | 20 | 70 | . | 1948 | . | 31 | 40 | . | 31 | 40 | . | . | . |
2 | 20 | 70 | . | 1949 | . | 31 | 40 | . | 31 | 40 | . | 20 | 200 |
20 | 2 | . | . | 1946 | . | . | . | . | . | . | . | . | . |
20 | 2 | . | . | 1947 | . | . | . | . | . | . | . | . | . |
20 | 2 | . | . | 1948 | . | . | . | . | . | . | . | . | . |
20 | 2 | . | . | 1949 | 2 | . | 200 | . | . | . | . | . | . |
70 | 80 | 90 | 2 | 1946 | 2 | 40 | 41 | . | . | . | . | . | . |
70 | 80 | 90 | 2 | 1947 | 2 | 40 | 41 | 2 | 31 | 40 | . | . | . |
70 | 80 | 90 | 2 | 1948 | 2 | 31 | 40 | 2 | 31 | 40 | . | . | . |
70 | 80 | 90 | 2 | 1949 | 2 | 31 | 40 | 2 | 31 | 40 | . | . | . |
220 | 221 | 211 | 232 | 1946 | . | 365 | . | . | . | . | . | . | . |
220 | 221 | 211 | 232 | 1947 | . | 365 | . | 200 | . | . | . | . | . |
220 | 221 | 211 | 232 | 1948 | . | 365 | . | 200 | . | . | 200 | 210 | 211 |
220 | 221 | 211 | 232 | 1949 | . | 365 | . | 200 | . | . | 200 | 210 | 211 |
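For anyone checking a proposed solution against the example, the intended result can also be written out by hand. The sketch below assumes the example data above are in memory; the variable names expect_b1, expect_b2, and expect_b3 are hypothetical and only serve as a reference to compare against.

```stata
* Hand-coded target values for the example data (hypothetical helper variables).
* Countries 2 and 20 border each other and list each other as allies in 1949 only.
generate byte expect_b1 = (inlist(cowcode, 2, 20) & year == 1949)
* Country 220's border2 (211) appears among its allies in 1948 and 1949.
generate byte expect_b2 = (cowcode == 220 & inlist(year, 1948, 1949))
* Country 70's border3 (2) appears among its allies in every example year.
generate byte expect_b3 = (cowcode == 70)
```

A candidate solution could then be verified with, for instance, `assert alliance_b1 == expect_b1`, and likewise for the other two variables.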