I am completely new to Stata and am trying to figure out how to aggregate a large dataset by several conditions. A fellow student recommended that I post my problem here in the forum, because you are a strong community. So I will try to describe it.
I would like to compute the sum of the variable "totalpollutantquantityTonne" for each combination of the following conditions ("countrycode_c", "pollutantcode_c", "industry_c", "medium_c") and show the result in a table.
Each of these conditions takes several values, e.g. for "countrycode_c" there is "DE", "AT", "GB", ... etc.
For "pollutantcode_c" there is "CO2", "CH4", "NOX", ... etc.
For "industry_c" there is "Energy Industry", "Chemical Industry", ... etc.
For "medium_c" there are "Air", "Land" and "Water".
The conditions "countrycode_c", "pollutantcode_c", "industry_c" and "medium_c" are already encoded.
I know how to create such tables in Excel (via pivot tables), but I would like to produce them in Stata.
Below I have also included the dataex output for more information.
I hope one of you can help. Many thanks in advance.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input int reportingyear long(countrycode_c pollutantcode_c industry_c medium_c) float totalpollutantquantityTonne
2007 1 19 1 1 725
2007 1 20 1 1 159000
2007 1 23 1 3 .51
2007 1 24 1 3 5.48
2007 1 57 1 1 206
2007 1 76 1 1 179
2007 1 81 1 3 699
2007 1 83 1 3 490
2007 1 93 1 3 1.02
2007 1 11 2 1 418
2007 1 19 2 1 4600
2007 1 20 2 1 172000
2007 1 20 2 1 949000
2007 1 20 2 1 112000
2007 1 20 2 1 541000
2007 1 24 2 3 .186
2007 1 25 2 1 3.3
2007 1 26 2 1 13
2007 1 36 2 3 11.4
2007 1 42 2 1 .685
2007 1 42 2 1 .484
2007 1 48 2 3 .22
2007 1 53 2 1 871
2007 1 55 2 1 75.9
2007 1 57 2 1 380
2007 1 57 2 1 130
2007 1 58 2 1 870
2007 1 58 2 1 390
2007 1 72 2 1 52.4
2007 1 76 2 1 637
2007 1 81 2 3 75
2007 1 81 2 3 484
2007 1 83 2 3 427
2007 1 83 2 3 185
2007 1 84 2 3 8.77
2007 1 84 2 3 6
2007 1 84 2 3 8.05
2007 1 93 2 1 3.22
2007 1 4 3 3 .0057
2007 1 11 3 1 655
2007 1 11 3 1 1190
2007 1 11 3 1 813
2007 1 20 3 1 166000
2007 1 20 3 1 141000
2007 1 20 3 1 586000
2007 1 20 3 1 895000
2007 1 20 3 1 1050000
2007 1 20 3 1 246000
2007 1 20 3 1 810000
2007 1 20 3 1 234000
2007 1 20 3 1 302000
2007 1 20 3 1 103000
2007 1 20 3 1 147000
2007 1 20 3 1 1920000
2007 1 20 3 1 235000
2007 1 20 3 1 1190000
2007 1 20 3 1 484000
2007 1 20 3 1 349000
2007 1 20 3 1 2870000
2007 1 20 3 1 161000
2007 1 20 3 1 106000
2007 1 20 3 1 201000
2007 1 23 3 3 .618
2007 1 23 3 3 1.03
2007 1 23 3 3 .137
2007 1 23 3 3 .214
2007 1 23 3 3 .214
2007 1 23 3 3 .0899
2007 1 23 3 3 .0931
2007 1 53 3 1 24.1
2007 1 53 3 1 12.3
2007 1 58 3 1 472
2007 1 58 3 1 153
2007 1 58 3 1 145
2007 1 58 3 1 831
2007 1 58 3 1 232
2007 1 58 3 1 268
2007 1 58 3 1 602
2007 1 58 3 1 260
2007 1 58 3 1 694
2007 1 58 3 1 358
2007 1 58 3 1 611
2007 1 58 3 1 101
2007 1 58 3 1 3050
2007 1 58 3 1 171
2007 1 58 3 1 827
2007 1 58 3 1 649
2007 1 71 3 3 .322
2007 1 72 3 1 64.2
2007 1 72 3 1 98.2
2007 1 72 3 1 92.5
2007 1 76 3 1 263
2007 1 76 3 1 392
2007 1 76 3 1 288
2007 1 76 3 1 304
2007 1 76 3 1 3230
2007 1 93 3 3 .424
2007 1 93 3 3 .495
2007 1 93 3 3 .244
2007 1 7 5 1 4.38
end
label values countrycode_c countrycode_c
label def countrycode_c 1 "AT", modify
label values pollutantcode_c pollutantcode_c
label def pollutantcode_c 4 "ASANDCOMPOUNDS", modify
label def pollutantcode_c 7 "BENZENE", modify
label def pollutantcode_c 11 "CH4", modify
label def pollutantcode_c 19 "CO", modify
label def pollutantcode_c 20 "CO2", modify
label def pollutantcode_c 23 "CUANDCOMPOUNDS", modify
label def pollutantcode_c 24 "CYANIDES", modify
label def pollutantcode_c 25 "DCE-1,2", modify
label def pollutantcode_c 26 "DCM", modify
label def pollutantcode_c 36 "FLUORIDES", modify
label def pollutantcode_c 42 "HCFCS", modify
label def pollutantcode_c 48 "HGANDCOMPOUNDS", modify
label def pollutantcode_c 53 "N2O", modify
label def pollutantcode_c 55 "NH3", modify
label def pollutantcode_c 57 "NMVOC", modify
label def pollutantcode_c 58 "NOX", modify
label def pollutantcode_c 71 "PHENOLS", modify
label def pollutantcode_c 72 "PM10", modify
label def pollutantcode_c 76 "SOX", modify
label def pollutantcode_c 81 "TOC", modify
label def pollutantcode_c 83 "TOTALNITROGEN", modify
label def pollutantcode_c 84 "TOTALPHOSPHORUS", modify
label def pollutantcode_c 93 "ZNANDCOMPOUNDS", modify
label values industry_c Industry_c
label def Industry_c 1 "Animal and vegetable products from the food and beverage sector", modify
label def Industry_c 2 "Chemical Industry", modify
label def Industry_c 3 "Energy Industry", modify
label def Industry_c 5 "Metals Industry", modify
label values medium_c medium_c
label def medium_c 1 "AIR", modify
label def medium_c 3 "WATER", modify
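For reference, the kind of result I am after could perhaps be sketched as follows. This is only a minimal sketch under my own assumptions, not a tested solution: the name total_tonnes is one I made up for illustration, and the table syntax shown is the one from Stata 17 or newer.

```stata
* Minimal sketch, assuming the example data above are in memory.
* total_tonnes is a hypothetical name for the aggregated variable.

* Option 1: reduce the data to one row per combination of the four conditions
preserve
collapse (sum) total_tonnes = totalpollutantquantityTonne, ///
    by(countrycode_c pollutantcode_c industry_c medium_c)
list, sepby(countrycode_c industry_c) noobs abbreviate(20)
restore

* Option 2: a pivot-style display that leaves the data unchanged
* (Stata 17+ syntax; older versions use -table rowvar colvar, contents(sum varname)-)
table (countrycode_c industry_c pollutantcode_c) (medium_c), ///
    statistic(sum totalpollutantquantityTonne)
```

The first option changes the dataset itself (hence preserve and restore), which would be convenient if the sums are needed for further work. The second only prints a table, with the media as columns, which is closer to an Excel pivot table.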