I am using Stata 15.1 for Windows.
I would like to assign each student to the one program in which he took the most classes. For example, if a student took 1 class in the English program, 2 classes in Government, 4 classes in Health Science, and 1 class in Finance, I would consider him a Health Science concentrator only, since he took the most classes in that program (example below).
I have millions of rows of data, and many students took courses in several different programs or in programs that overlap. In terms of compliance, some students take a level 2 course and a level 3 course but not a level 1 course, and some take more than 4 courses.
Example:
This is just an example of 1 program (Finance):
// Assign levels 1 to 3 directly from the course code
gen level_finplan = 1 if inlist(course_code, 5905, 3709, 3638, 3721, 5891)
replace level_finplan = 2 if inlist(course_code, 3496, 3767, 5901, 3749, 3751, 5898)
replace level_finplan = 3 if inlist(course_code, 3701, 5910)
// Highest level each student has reached so far
bysort studentid : egen mx_finplan = max(level_finplan)
// Course 3713 counts as level 4
replace level_finplan = 4 if inlist(course_code, 3713) & level_finplan == . & mx_finplan != 4
// Recompute the per-student maximum now that level 4 may have been assigned
drop mx_finplan
bysort studentid : egen mx_finplan = max(level_finplan)
// Course 5890 counts as level 4, or as level 5 if the student already has a level 4 course
replace level_finplan = 4 if inlist(course_code, 5890) & (mx_finplan == 3 | mx_finplan == 2 | mx_finplan == 1 | mx_finplan == .) & level_finplan == .
replace level_finplan = 5 if inlist(course_code, 5890) & mx_finplan == 4
//Note: level 1 course = 5905, 3709, 3638, 3721, 5891
//level 2 course = 3496 ...
//and so on..
Student     Course1  Course2  Course3  Course4  Program
Student A   2050     3590     1309     2549     Health Science
Student A   2040                                English
Student A   2890     4030                       Government
Student A   3767                                Finance
How would I start? I only want 1 row per student.
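One possible starting point, as a minimal sketch only: it assumes the data look like the example table, with one row per student and program, numeric variables named Course1, Course2, and so on, a string variable Program, and the identifier studentid. Those names are assumptions taken from the example and the code above, not confirmed variable names. The idea is to count the non-missing course columns in each row and keep the row with the highest count for each student.

```stata
// Assumed layout: one row per student-program, as in the example table.
// studentid, Course*, and Program are assumed variable names.

// Number of courses taken in this program (non-missing course columns)
egen n_classes = rownonmiss(Course*)

// Within each student, sort by class count and keep the last (largest) row.
// Ties are broken by Program name, which is an arbitrary choice.
bysort studentid (n_classes Program): keep if _n == _N
```

If the data are instead long, with one row per course taken, `bysort studentid Program: gen n_classes = _N` would give the same count before the final `keep` step. How ties should be resolved would need a rule of its own.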