Can I please ask you a question about selecting multiple variables using the asterisk and/or question mark (or perhaps another option I have not considered yet)?
I am programming a missing data report (see the table at the bottom of the post to get an idea of what I would like to achieve). For each variable (see variable _02fudb_visdat as an example) I have created a new variable with the prefix Q_, to which I assign the value 99 (data not required), 1 (data present) or 2 (data missing). In this example, _02fudb_visdat (the visit date of the 2nd patient follow-up visit) only counts as missing if _02fudb_visyn == 1 (meaning that the visit took place).
So I have initially written this (which works):
replace Q__02fudb_visdat = 99 /* Set to Not Required by default */
replace Q__02fudb_visdat = 2 if _02fudb_visyn == 1 & missing(_02fudb_visdat) /* Set to Missing */
replace Q__02fudb_visdat = 1 if !missing(_02fudb_visdat) /* Set to Present */
The problem is that I have 13 visits (and each visit has many variables like the example variable _02fudb_visdat), so I would like to rewrite this code using foreach to save myself a lot of work. The variables I want to refer to differ only in the prefix (01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12, 13), so I thought * and ? would help me out here (I have already consulted the manual on this).
The first line of code I have successfully replaced with the following:
foreach x of varlist Q_*fudb_fup_date {
replace `x' = 99 /* Set to Not Required by default */
}
But the second line of code I cannot seem to get working (specifically the text in red):
Idea 1:
foreach x of varlist Q_*fudb_fup_date {
replace `x' = 2 if _02fudb_visdat == 1 & missing(_??fudb_visdat) /* missing */
}
--> this code gives an error: _??fudb_fup_date invalid name
Based on the posts I found on the forum and on the Stata documentation, I thought this would work. Can someone explain why Stata considers this invalid?
Idea 2:
foreach x of varlist Q_*fudb_fup_date {
replace `x' = 2 if _02fudb_visdat == 1 & missing(_*fudb_visdat) /* missing */
}
--> this also gives an error: _ ambiguous abbreviation
Can someone explain why this is not allowed?
After some thinking, I realized this probably would not have worked anyway, because I believe both _02fudb_visdat and Q__02fudb_visdat can be abbreviated with _*fudb_visdat, while I only want to refer to _02fudb_visdat (and the fudb_visdat for the other follow-up visits) in the red section.
Idea 3:
One option is to generate variables that do not start with Q_ but rather end with _Q, so that I don't have two variables (_02 and Q__02) that can be abbreviated in the same way. Unfortunately, I have already written the code for the missing data report for most of my non-repeating forms, so it would be quite some work to go back and change all of it.
I hope you have an idea how I can refer to _01fudb_visdat-_13fudb_visdat with an abbreviation without this abbreviation applying to Q__01fudb_visdat-Q__13fudb_visdat as well.
Or perhaps you have an entirely different suggestion how to tackle this?
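One alternative I have sketched, but not yet run on the full data, is to loop over the visit numbers with forvalues and build the two-digit prefix myself, so that no wildcard is needed inside the if condition. This assumes that all 13 visits follow the same naming pattern (_NNfudb_visdat, _NNfudb_visyn and Q__NNfudb_visdat); the local macro name v is just my own choice.

```stata
* Sketch only: assumes the three variables exist for every visit 01-13
forvalues i = 1/13 {
    * Build the zero-padded visit number: 01, 02, ..., 13
    local v : display %02.0f `i'

    * Set to Not Required by default
    replace Q__`v'fudb_visdat = 99
    * Set to Missing if the visit took place but the date is empty
    replace Q__`v'fudb_visdat = 2 if _`v'fudb_visyn == 1 & missing(_`v'fudb_visdat)
    * Set to Present
    replace Q__`v'fudb_visdat = 1 if !missing(_`v'fudb_visdat)
}
```

I am not sure whether this is the idiomatic way to do it, and it would still need an outer loop (or a repeat of the block) for the other variables within each visit.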
Thank you for your ideas!
Kind regards,
Moniek
Example table of what I would like to achieve:
subjectid | _02fudb_visdat | Q__02fudb_visdat | _02fudb_visyn |
001 | | 2 | 1 |
002 | | 99 | |
003 | 18Jan2019 | 1 | 1 |