Gen. new variables based on values for multiple existing variables, TSCS data

Dear members,

I have put together a data-set with information about military alliances between states in the post-WW2 era. The data-set is organized by country-year and covers 1946 to 2016.

The data-set also includes information about which countries a country shares a land border with in any given year (variables border1, border2, etc., up to border14).

For each alliance a country is a member of in any given year, I have created variables with the country-codes of the members of that alliance (this means that a country may appear more than once as an ally in a given year).

(Example: The variable a1mem1 shows the country-code for the first member of alliance 1, the variable a16mem8 shows the 8th member of alliance 16, and so on.)

The maximum number of alliances a country has been a member of in the post-WW2 era is 53. The maximum number of members of any alliance in the post-WW2 era is 59. This means that I have many variables whose value is missing in most instances (very few states are members of more than a couple of alliances, and most alliances involve only a relatively small number of states).

To be precise, the data-set consists of 9720 observations (country-years) and 3196 variables. Of these, 3127 (53*59) are the variables with information about the members of each alliance (again, variables a1mem1 – a53mem59).

Now, what I would like is to create new variables named alliance_b1, alliance_b2, alliance_b3, etc., one for each border variable.

The value of the variable alliance_b1 should be “1” if the country in this particular year has an alliance with the country that appears in the variable border1 (and 0 otherwise). Similarly, the value of the variable alliance_b2 should be “1” if the country has an alliance with the country that appears in the variable border2, etc.

Put differently, I would like to run code that goes through all 3127 variables (a1mem1 to a1mem59, a2mem1 to a2mem59, and so on up to a53mem59), checks whether any of the country-codes there also appears in the variables border1-border14, and if so, codes the new variables alliance_b1, alliance_b2, etc. accordingly.
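To make the intended logic concrete, here is a rough sketch of what I have in mind, written for the reduced example further down (3 border variables, 3 alliances, 3 members each). It is only my naive attempt, not a tested solution for the full data-set, and there may well be a more efficient approach. The loop indices b, a and m are just names I picked for illustration.

```stata
* Sketch only: sized for the reduced example (3 borders, 3 alliances, 3 members).
* For the full data-set the upper limits would be 14, 53 and 59 respectively.
forvalues b = 1/3 {
    * Start at 0, i.e. "no alliance with the country in border`b'"
    generate byte alliance_b`b' = 0
    forvalues a = 1/3 {
        forvalues m = 1/3 {
            * Flag a match between this member slot and the border country.
            * The !missing() guard matters because Stata treats . == . as true,
            * so two missing values would otherwise count as a match.
            replace alliance_b`b' = 1 if a`a'mem`m' == border`b' & !missing(border`b')
        }
    }
}
```

In the example data this would, for instance, set alliance_b1 to 1 for cowcode 2 in 1949, since a3mem2 equals border1 (20) in that row. Note that the sketch codes alliance_b as 0 rather than missing when the corresponding border variable is empty, which is what "0 otherwise" above implies.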

Below I have included a heavily reduced version of my data-set for illustrative purposes (both table and code).

All advice and recommendations are deeply appreciated. Also, since I am new to the forum, any suggestions for how I can improve my questions are most welcome.

Best,
Magnus Åsblad

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double cowcode float(border1 border2 border3) double year int(a1mem1 a1mem2 a1mem3 a2mem1 a2mem2 a2mem3 a3mem1 a3mem2 a3mem3)
  2  20  70   . 1946 . 235   .   . 40 41   .   .   .
  2  20  70   . 1947 .  40  41   . 31 40   .   .   .
  2  20  70   . 1948 .  31  40   . 31 40   .   .   .
  2  20  70   . 1949 .  31  40   . 31 40   .  20 200
 20   2   .   . 1946 .   .   .   .  .  .   .   .   .
 20   2   .   . 1947 .   .   .   .  .  .   .   .   .
 20   2   .   . 1948 .   .   .   .  .  .   .   .   .
 20   2   .   . 1949 2   . 200   .  .  .   .   .   .
 70  80  90   2 1946 2  40  41   .  .  .   .   .   .
 70  80  90   2 1947 2  40  41   2 31 40   .   .   .
 70  80  90   2 1948 2  31  40   2 31 40   .   .   .
 70  80  90   2 1949 2  31  40   2 31 40   .   .   .
220 221 211 232 1946 . 365   .   .  .  .   .   .   .
220 221 211 232 1947 . 365   . 200  .  .   .   .   .
220 221 211 232 1948 . 365   . 200  .  . 200 210 211
220 221 211 232 1949 . 365   . 200  .  . 200 210 211
end

cowcode	border1	border2	border3	year	a1mem1	a1mem2	a1mem3	a2mem1	a2mem2	a2mem3	a3mem1	a3mem2	a3mem3
2	20	70		1946		235			40	41
2	20	70		1947		40	41		31	40
2	20	70		1948		31	40		31	40
2	20	70		1949		31	40		31	40		20	200
20	2			1946
20	2			1947
20	2			1948
20	2			1949	2		200
70	80	90	2	1946	2	40	41
70	80	90	2	1947	2	40	41	2	31	40
70	80	90	2	1948	2	31	40	2	31	40
70	80	90	2	1949	2	31	40	2	31	40
220	221	211	232	1946		365
220	221	211	232	1947		365		200
220	221	211	232	1948		365		200			200	210	211
220	221	211	232	1949		365		200			200	210	211
