Dear all,

I wanted your help with the following. I am working on a patent project and in the following link (https://1drv.ms/f/s!APdCpn4GgD8tfg) you can find the Excel dataset with the patents of one of the companies, under ‘Sheet1’ Excel sheet.

Columns:
column A: pn (patent number. two first characters of each patent number refer to the country the patent was registered in. eg. For AU8425677-A the country code is AU. However, ‘EP’ as country code refers to European patents, so this refers to all country codes under ‘EP_codes’ Excel sheet)
column B: year
column C: environ_patents (1 for environmental patents, 0 for non-environmental patents)

The variables I want to create are the following
1. Which countries it has registered patents in a year.
2. The number of countries all patents in a year have been registered in.
3. Which countries it has registered non-environmental patents in a year
4.The number of countries non-environmental patents have been registered in a year
5. Which countries it has registered environmental patents in a year
6. The number of countries environmental patents have been registered in a year


The most important variables are the first and second one, since once created, the rest variables will be much easier to be created.
For example, for the below observations:
pn year environ_patents
WO2000077147-A1; BR9001712-U2; BR9001712-Y1 2000 0
AU9050094-A; ZA9001404-A; ZA9001404-B; BR9003529-T1 2000 0
AU8425677-A; EP126524-A; JP59231039-A; ZA8401939-A; EP126524-B; CA1215991-A; DE3462027-G; EP126524-B2 2000 1
WO2000096237-A1; BE766843-A; FR2135376-A 2000 1
IT766843-A; FR2135376-A 2001 0
BE766843-A; WO2001102655-A1; FR2135376-A 2001 1
I would like the result as below:
  1. Which countries it has registered patents in a year.
2000: BR AU AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR JP ZA CA DE
2001: IT FR BE
  1. The number of countries all patents in a year have been registered in.
2000: 44
2001: 3
  1. Which countries it has registered non-environmental patents in a year.
2000: AU ZA BR
2001: IT FR
4.The number of countries non-environmental patents have been registered in a year
2000: 3
2001: 2
  1. Which countries it has registered environmental patents in a year
2000: AU AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR JP ZA CA DE
2001: BE FR
  1. The number of countries environmental patents have been registered in a year
2000: 43
2001: 2


The above variables will ignore ‘WOxxxx’ codes(meaning World Organization), show only identical countries(not duplicates) per row and per year, and in case of ‘EPxxxx’ (stands for environmental patents, which includes European countries the company chose the patent to be registered in), to show the country codes under the ‘EP_codes’ Excel sheet.
For the first variable, I would like to generate new columns(columns named: country_1, country_2, etc) for each identical country identified in the ‘pn’ column as in the results below -the below are columns that should appear on the right side of the previous table's columns:
country_1 country_2 country_3 country_4 country_5 country_6 country_7 country_8 country_9 country_10 country_11 country_12 country_13 country_14 country_15 country_16 country_17 country_18 country_19 country_20 country_21 country_22 country_23 country_24 country_25 country_26 country_27 country_28 country_29 country_30 country_31 country_32 country_33 country_34 country_35 country_36 country_37 country_38 country_39 country_40 country_41 country_42 country_43
BR
AU ZA BR
AU AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR JP ZA CA DE
BE FR
IT FR
BE FR
.Therefore, could you advise on the Stata commands required the get the results above?

Thank you very much,
C.