I am using Stata 16 and I need some help in automating the process of extracting data. I am trying to do charts of unemployment data for OECD countries. I am basically trying to find the difference between unemployment data at a given quarter and the unemployment data of a recession. Below is a sample of my dataset:
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float id str3 location str7 time str6 recession_date byte pt_output float(spell quarters_to_start unemptime new_variable recession1 unemp_data) 1 "AUS" "1966-Q3" "1971Q3" . 0 20 -29 . . 1.8 2 "AUS" "1966-Q4" "1975Q2" . 0 20 -28 . . 1.7 3 "AUS" "1967-Q1" "1977Q2" . 0 20 -27 . . 1.8 4 "AUS" "1967-Q2" "1981Q3" . 0 20 -26 . . 2 5 "AUS" "1967-Q3" "1990Q4" . 0 20 -25 . . 1.9 6 "AUS" "1967-Q4" "" . 0 20 -24 . . 1.8 7 "AUS" "1968-Q1" "" . 0 20 -23 . . 1.9 8 "AUS" "1968-Q2" "" . 0 20 -22 . . 1.9 9 "AUS" "1968-Q3" "" . 0 20 -21 . . 1.8 10 "AUS" "1968-Q4" "" . 0 20 -20 1.8 -.10000005 1.7 11 "AUS" "1969-Q1" "" . 0 20 -19 1.7 -.19999996 1.8 12 "AUS" "1969-Q2" "" . 0 20 -18 1.8 -.10000005 1.8 13 "AUS" "1969-Q3" "" . 0 20 -17 2 .1 1.7 14 "AUS" "1969-Q4" "" . 0 20 -16 1.9 -2.384186e-08 1.9 15 "AUS" "1970-Q1" "" . 0 20 -15 1.8 -.10000005 1.6 16 "AUS" "1970-Q2" "" . 0 20 -14 1.9 -2.384186e-08 1.7 17 "AUS" "1970-Q3" "" . 0 20 -13 1.9 -2.384186e-08 1.7 18 "AUS" "1970-Q4" "" . 0 20 -12 1.8 -.10000005 1.6 19 "AUS" "1971-Q1" "" . 0 20 -11 1.7 -.19999996 1.7 20 "AUS" "1971-Q2" "" . 0 20 -10 1.8 -.10000005 1.9 21 "AUS" "1971-Q3" "" 1 1 15 -9 1.8 -.10000005 1.9 22 "AUS" "1971-Q4" "" . 1 15 -8 1.7 -.19999996 2.2 23 "AUS" "1972-Q1" "" 2 1 15 -7 1.9 -2.384186e-08 2.4 24 "AUS" "1972-Q2" "" . 1 15 -6 1.6 -.3 2.4 25 "AUS" "1972-Q3" "" . 1 15 -5 1.7 -.19999996 2.9 26 "AUS" "1972-Q4" "" . 1 15 -4 1.7 -.19999996 2.8 27 "AUS" "1973-Q1" "" . 1 15 -3 1.6 -.3 2.6 28 "AUS" "1973-Q2" "" . 1 15 -2 1.7 -.19999996 2.3 29 "AUS" "1973-Q3" "" . 1 15 -1 1.9 -2.384186e-08 2.1 30 "AUS" "1973-Q4" "" . 1 15 0 1.9 -2.384186e-08 2.1 31 "AUS" "1974-Q1" "" . 1 15 1 2.2 .3 2.1 32 "AUS" "1974-Q2" "" . 1 15 2 2.4 .5000001 2.1 33 "AUS" "1974-Q3" "" . 1 15 3 2.4 .5000001 2.7 34 "AUS" "1974-Q4" "" . 1 15 4 2.9 1.0000001 4 35 "AUS" "1975-Q1" "" . 1 15 5 2.8 .9 4.8 36 "AUS" "1975-Q2" "" 1 2 8 6 2.6 .6999999 4.8 37 "AUS" "1975-Q3" "" . 2 8 7 2.3 .39999995 4.7 38 "AUS" "1975-Q4" "" 2 2 8 8 2.1 .1999999 5.4 39 "AUS" "1976-Q1" "" . 2 8 9 2.1 .1999999 4.9 40 "AUS" "1976-Q2" "" . 2 8 10 2.1 .1999999 4.5 41 "AUS" "1976-Q3" "" . 2 8 11 2.1 .1999999 4.8 42 "AUS" "1976-Q4" "" . 2 8 12 2.7 .8000001 4.9 end
The main interest variables are unemptime, recession1. What I intend to create is a time series chart that shows the differences in unemployment rates between a given quarter and a recession. So when unemptime = 0, this means that the quarter is the recession quarter and the difference would be zero.
I need to extract unemployment data of the recession dates. For example, extract 1.9% from 1971-Q3 and 4.8% from 1975-Q2. My intention is to create columns of such data. For example, a column called "recession1" that contains 1.9% on all rows and "recession2" that contains 4.8% on all rows. The purpose of such columns is to find the difference between unemployment rates.
I am unsure of any such commands to automate the process. Any help in this area would be appreciated.
As for the extraction of data in the first recession, I used the following code:
Code:
gen new_variable = unemp_data[_n-9] in 1/42
For the second recession date, I am not sure which command to use in order to extract data from row 21 to 42; so data would start from row 15 in new column "recession2" but gets unemployment data from row 21(1971Q3 onwards).
I would appreciate greatly any help given. Thanks!
0 Response to Help needed: Automate extracting of data
Post a Comment