DAuid tot_pop tot_male 0_4male 5_9male 10_14male 15_19male 20_24male 25_29male 30_34male 35_39male 40_44male 45_49male 50_54male 55_59male 60_64male 60_69male 70_74male 75_79male 80_84male 85_overmale total_female 0-4female 5_9female 10_14female 15_19female 20-24female 25_29female 30_34female 35_39female 40_44female 45_49female 50_54female 55_59female 60_64female 65_69female 70_74female 75_79female 80_84female 85_over_female
10010165 540 245 5 10 15 20 20 15 10 15 15 25 35 25 20 5 5 5 0 0 295 15 15 10 25 35 20 15 10 30 30 30 25 15 5 5 0 0 5
10010166 374 175 5 10 5 20 20 10 10 5 15 15 25 15 10 5 0 0 0 0 200 5 10 15 15 20 10 5 10 15 25 25 15 15 0 5 0 5 0
10010167 511 250 15 15 15 15 40 25 10 10 15 25 25 15 20 5 0 0 5 0 260 10 5 10 25 30 25 15 15 15 30 30 25 15 0 5 5 0 0
10010168 595 285 5 10 10 20 25 20 10 25 20 20 20 25 25 20 15 5 5 5 315 10 15 10 25 25 30 15 25 15 20 30 30 30 10 15 10 0 5
10010169 326 160 5 10 15 15 10 10 10 15 15 20 10 10 5 5 0 0 0 0 170 5 5 10 20 15 10 10 10 20 15 15 10 5 5 5 0 5 0
10010170 453 215 10 10 15 20 25 20 5 15 15 20 20 20 15 0 0 5 5 0 235 5 10 15 20 20 15 5 20 25 15 25 20 15 5 5 5 5 5
10010171 563 260 10 15 20 30 20 20 15 10 25 30 25 10 15 5 5 5 5 0 300 5 20 25 20 35 20 10 15 35 30 20 15 20 10 5 5 0 0
10010172 246 120 5 5 5 5 15 15 10 5 10 10 15 10 5 0 5 0 0 0 125 5 10 10 10 5 10 10 10 15 5 10 10 0 10 5 5 0 0
10010173 984 465 20 40 45 40 40 30 35 35 50 50 35 15 15 10 10 5 0 0 515 30 35 40 40 45 50 35 35 60 60 30 20 15 10 10 10 0 0
Hi All
I have aggregate data in the above format. DAuid identifies a census dissemination area. Each cell gives the number of individuals in that category: tot_pop = total population, tot_male = total male population, 0_4male = number of males aged 0 to 4, and so on.
I want to combine these data with data on people who have a certain disease (confidential, so I cannot share them here) to look at the effect of age and sex (among other variables) on the occurrence of the disease. The confidential data contain information (age, sex) only on people with the disease.

I am trying to build a dataset like the one below from the dataset above, with one row per individual, where male_female is 0 for male and 1 for female, and age_gp codes the age group (0-4 = 1, 5-9 = 2, and so on).
I am hoping that once I combine the disease dataset with these controls (the dataset below), linking on DAUID (dissemination area), I can see the effect of age and sex on disease occurrence.
Can you please let me know the Stata code to do this? I am using Stata 15.1.
Many thanks
Yuba
DAUID male_female age_gp
10010165 0 1
10010165 1 2
10010165 1 6
10010165 1 8
10010165 0 3
10010165 1 2
10010165 1 5
10010165 1 6
10010165 1 6
10010165 1 7
10010165 1 8
10010165 1 9
10010165 1 1
10010165 0 4
10010165 0 5
10010165 1 6
10010165 1 7
10010166 1 5
10010166 1 3
10010166 1 3
10010166 1 3
10010166 0 3
10010166 0 3
10010166 0 3
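
For reference, one possible way to get from the wide counts to this one-row-per-person layout is to reshape the 36 count columns to long form and then expand each cell by its count. This is only a sketch under several assumptions, not a tested solution: the wide table is already in memory, the identifier is named DAuid and is unique per row, the three totals are named tot_pop, tot_male and total_female, and the 36 age-by-sex columns are stored in the order shown in the header (18 male groups, then 18 female groups). Stata variable names cannot start with a digit, so names such as 0_4male will have been changed on import; the sketch therefore renames the count columns by position rather than by name. The names n, code and cells are placeholders introduced here.

```stata
* Assumes: wide table in memory, DAuid unique per row, and the 36 count
* variables stored in header order (18 male groups, then 18 female groups).
ds DAuid tot_pop tot_male total_female, not
local cells `r(varlist)'
local k : word count `cells'
assert `k' == 36

* Rename by position: n1-n18 = male groups, n101-n118 = female groups.
local i = 0
foreach v of local cells {
    local code = 100 * floor(`i' / 18) + mod(`i', 18) + 1
    rename `v' n`code'
    local ++i
}

* One row per dissemination area x sex x age group, with the count in n.
reshape long n, i(DAuid) j(code)
generate byte male_female = floor(code / 100)   // 0 = male, 1 = female
generate byte age_gp      = mod(code, 100)      // 1 = 0-4, 2 = 5-9, ...

* Drop empty cells first, because expand leaves rows with n <= 1 unchanged.
drop if missing(n) | n <= 0
expand n
keep DAuid male_female age_gp
sort DAuid male_female age_gp
```

Note that the cell counts need not add up to the published totals. For 10010165, for example, the female cells above sum to 290 while total_female is 295, so the expanded data will have as many rows per area as the cells imply, not necessarily tot_pop.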