Dear Statalist,
I have a dataset with more than 850,000 observations on individuals and 132 dummy variables, named di1, di2, …, di132 (for "dummy income"), each marking a month in which the individual had income.
I want to establish their eligibility for a child leave benefit. Individuals are eligible if they had income in at least 9 of the 24 months before the month of birth of their child; the months with income do not have to be consecutive. The 24-month window therefore differs across observations, because it depends on the month of birth. The month of birth is stored in a separate variable with values ranging from 1 to 132 (called b_ren in the data example below).
So for each observation, I need to identify the di variable that corresponds to the month of birth, sum the 24 di variables preceding it into a new variable, and check whether the sum is >= 9. The first month of birth I am interested in (for further eligibility reasons) is 25, i.e. I am not interested in births in the first two years. So, for example, if the month of birth is 25, the new variable will be the sum of di1 to di24.
I have considered reshaping the dataset, but I believe it is too large.
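The steps above can be sketched directly on the wide data, without reshaping, by looping once over the month variables and adding each di variable only for observations whose 24-month window contains that month. This is a minimal sketch rather than a tested solution for the full file: the toy data at the top are made up so that the block runs on its own, the names n_inc24 and eligible are hypothetical, and it assumes that all 132 di variables exist and contain no missing values.

```stata
* --- Toy data for illustration only (made-up values, not the real file) ---
clear
set obs 3
gen int b_ren = 25 in 1
replace b_ren = 40 in 2
replace b_ren = 20 in 3
forvalues m = 1/132 {
    gen byte di`m' = 0
}
* obs 1: income in months 1-9, all inside the window 1-24
forvalues m = 1/9 {
    replace di`m' = 1 in 1
}
* obs 2: income in months 16-23 (inside the window 16-39),
* plus months 10 and 40, which fall outside it
forvalues m = 16/23 {
    replace di`m' = 1 in 2
}
replace di10 = 1 in 2
replace di40 = 1 in 2

* --- Sketch of the moving sum on wide data ---
* Start the counter at 0 only for births in months 25-132;
* it stays missing for earlier (out-of-scope) births
gen byte n_inc24 = 0 if inrange(b_ren, 25, 132)

* Month m belongs to the window when it lies 1 to 24 months before b_ren
forvalues m = 1/131 {
    quietly replace n_inc24 = n_inc24 + di`m' if inrange(b_ren - `m', 1, 24)
}

* Eligible if at least 9 of the 24 preceding months had income
gen byte eligible = n_inc24 >= 9 if !missing(n_inc24)
list b_ren n_inc24 eligible
```

On the real data only the part after the toy data would be needed. The loop makes 131 passes over the observations, but each pass is a single conditional replace, and the dataset keeps its wide layout throughout.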
Any help would be much appreciated.
Zuzana
[CODE]
* Example generated by -dataex-. To install: ssc install dataex
clear
input str10 rcm float(b_ren di1 di25 di130 di131 di132)
"R002813464" 73 0 0 1 1 1
"R002813464" 34 0 0 1 1 1
"R002813466" 38 1 0 1 1 1
"R002813466" 59 1 0 1 1 1
"R002813467" 30 0 1 0 1 1
"R002813467" 92 0 1 0 1 1
end
[/CODE]