Hi again Statalist,

I have a slightly complicated question, so I hope this all makes sense.

I have a dataset of participants covering 19 years, with around 50 different variables. Over the 19 years, individuals joined and left at various points.

I need to identify the first time a hip fracture was reported for a person and then identify the age they were when this fracture was reported. Please see this example of the age and hip fracture variables from my dataset:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(firsthipfracture fractureage entryage) byte(hipfracture_w1 hipfracture_w2 hipfracture_w3 hipfracture_w4 hipfracture_w5 hipfracture_w6 hipfracture_w7 hipfracture_w8 hipfracture_w9) float anyhipfracture
0 64 64 . . . . 0 . . . 0 0
0 66 66 . . 0 0 0 0 0 . 0 0
0 70 70 . . . 0 . . . . . 0
0 87 87 . . . . . . 0 . . 0
. 72 72 . . . . . . . . . 0
0 63 63 . . . . 0 0 . . . 0
0 78 78 . 0 . . . . . . . 0
0 64 64 . . . 0 . 0 0 0 0 0
0 62 62 . . 0 0 0 0 0 0 0 0
0 76 76 . . . 0 0 0 0 0 1 1
0 68 68 . . . 0 . . . . . 0
end
The first variable, "firsthipfracture", is a variable I created with -egen- and its rowfirst() function to identify whether each person had a fracture reported at their entry wave. It doesn't help with this problem, because some people had a hip fracture in later waves, after entering the dataset.

The next two variables, "fractureage" and "entryage", are currently duplicates. fractureage is the variable I need to match up with the age at the first reported fracture, and entryage records how old each person was when they entered the dataset.

hipfracture_w1-hipfracture_w9 are the variables in which hip fractures were recorded at each wave.

And "anyhipfracture" is a variable I created to identify whether someone had at least one fracture at any wave; it is also not helpful for this question.

I also have the age variables for each wave over the 19 years:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(indager_w1 indager_w2 indager_w3 indager_w4 indager_w5 indager_w6 indager_w7 indager_w8 indager_w9)
63 65 66 68 70 72 74 76 78
63 66  .  .  .  .  .  .  .
66 68 70 71 73 75 77 79  .
61  .  .  .  .  .  .  .  .
60  .  .  .  .  .  .  .  .
74 76 78 80 82 83  .  .  .
64 66 68 70 72 74 76 78 80
71 72 74 76 78 80 82 84 86
 .  . 71 73 75  . 79 81  .
 .  . 61 63 65 67 69 71 73
 .  . 63 65 67 69 71 73  .
 .  .  . 65 67  . 70 72 74
 .  .  . 69 71  .  .  .  .
 .  . 63  .  .  .  .  .  .
 .  .  .  .  . 77 79 81 83
end
Ideally I would end up with something that looks like this (using the second-to-last observation from the first example above):

Code:
input float(firsthipfracture fractureage entryage) byte(hipfracture_w1 hipfracture_w2 hipfracture_w3 hipfracture_w4 hipfracture_w5 hipfracture_w6 hipfracture_w7 hipfracture_w8 hipfracture_w9) float anyhipfracture
0 88 76 . . . 0 0 0 0 0 1 1
end
Here, this participant was 76 when they entered the dataset and suffered a hip fracture at the age of 88.

I wrote the following code to attempt this; however, it didn't work:

Code:
gen fractureage2 = (indager_w1 | indager_w2 | indager_w3 | indager_w4 | indager_w5 | indager_w6 | indager_w7 | indager_w8 | indager_w9) if hipfracture_w1 | hipfracture_w2 | hipfracture_w3 | hipfracture_w4| hipfracture_w5 | hipfracture_w6 | hipfracture_w7 | hipfracture_w8 | hipfracture_w9==1
end
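As far as I understand it, part of the problem is that | is a logical "or" in Stata, so the right-hand side evaluates to 0 or 1 rather than to an age. In the if condition, only hipfracture_w9 is compared with 1; the other variables are just tested for being nonzero, and missing counts as nonzero. To make the logic I am after concrete, here is a rough sketch of a loop over waves that takes the age at the earliest wave with a reported fracture. It assumes the hipfracture_w# and indager_w# variables are in the same dataset with one row per person and that fractures are coded 0/1/missing. The toy values, the local nwaves, and the new variables fractureage2 and fracturewave are made up for illustration only.

```stata
clear
* Toy data for illustration only (made-up values, 3 waves instead of 9)
input byte(hipfracture_w1 hipfracture_w2 hipfracture_w3 indager_w1 indager_w2 indager_w3)
. 0 1  . 70 72
0 0 0 60 62 64
1 . 1 81  . 85
end

* Number of waves: 3 for the toy data, 9 for the real data
local nwaves 3

* fractureage2 = age at first reported fracture
* fracturewave = wave of first reported fracture
gen fractureage2 = .
gen byte fracturewave = .

forvalues w = 1/`nwaves' {
    * Only fill in people who have no earlier fracture wave recorded
    replace fractureage2 = indager_w`w' if missing(fracturewave) & hipfracture_w`w' == 1
    replace fracturewave = `w' if missing(fracturewave) & hipfracture_w`w' == 1
}

list
```

People with no reported fracture at any wave are left with fractureage2 missing. I have not been able to check this against my full data, so I would be grateful for corrections or a neater approach.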
Is anyone able to help with this, please?

Many thanks in advance,

Best,
Rhian