I have a dataset where every row represents a study. Each row contains a column describing the region and country where the study was conducted, and several columns containing data on 5-year age-groups (from 0-4 to 99+). For each study, age-group variables are coded as 0 (the study sample does not contain that age group), 1 (the study sample contains that age group), or . (missing). The number of studies per country varies in the dataset (eg There are 8 studies done in China, 22 in the US, etc.) I am interested in determining what is the count of valid data (ie coded=1) for each age-group by country and region. This is an example of my dataset:
Code:
clear input str71 sid_final str28 regionsimplified str22 country byte(age_20_24 age_25_29 age_30_34) "HBSC 2009-2010" "Asia" "Armenia" 0 0 0 "World Health Survey" "Asia" "Bangladesh" 1 1 1 "Haq et al., 2005" "Asia" "Bangladesh" 1 1 1 "Cambodia Elderly survey 2004" "Asia" "Cambodia" 0 0 0 "Koyanagi et al., 2018" "Asia" "China" 0 0 0 "Liao et al., 2009" "Asia" "China" 1 1 1 "World Health Survey" "Asia" "China" 1 1 1 end
Bangladesh: 20-24: 2 studies; 25-29: 2 studies; 30-34: 2 studies
China: 20-24: 2 studies; 25-29: 2 studies; 30-34: 2 studies
I have tried to use the following command:
Code:
preserve collapse (first) regionsimplified (count) age_20_24 age_25_29 age_30_34, by(country)
Thanks!
0 Response to Collapsing multiple variables using the count option
Post a Comment