I would like to create a variable that gives the ratio of new names to the total number of names per ID per year. In other words, for each name I need to know whether it appears for the first time in the respective year or whether it has already appeared in an earlier year (within the same ID).
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int(ID Year) str31 Names
10499 2005 "PETER ; RON ; JULIA"
10499 2006 "SOPHIA ; EMMA ; LILY ; JULIA"
10499 2008 "GRACE ; PETER ; LILY"
10499 2010 "SAMANTHA ; JULIA"
10499 2017 "RON"
10655 2007 "GRACE ; EMMA"
10655 2010 "EMMA ; EVELYN"
end
Solution for ID 10499:
In 2005: PETER, RON and JULIA are new -> 3/3
In 2006: SOPHIA, EMMA and LILY are new (JULIA already appeared in 2005) -> 3/4
In 2008: GRACE is new (PETER and LILY already appeared in 2005 and 2006, respectively) -> 1/3
In 2010: SAMANTHA is new (JULIA already appeared in 2005 and in 2006) -> 1/2
In 2017: No one is new (RON already appeared in 2005) -> 0/1
Solution for ID 10655:
In 2007: GRACE and EMMA are new -> 2/2
In 2010: EVELYN is new (EMMA already appeared in 2007) -> 1/2
I would highly appreciate any support.
Best wishes,
Jan
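The counting rule described in the question could be sketched in Stata roughly as follows. This is only one possible approach, not a definitive solution. It assumes the example data above are in memory, that ID and Year together uniquely identify each row, and that names are separated by ";". The variable names name, n, is_new, n_new, n_total and ratio_new are made up for illustration.

```stata
* Split the name list into one variable per name: name1, name2, ...
split Names, parse(";") generate(name)

* One row per ID-year-name (assumes ID and Year uniquely identify rows)
reshape long name, i(ID Year) j(n)
replace name = strtrim(name)
drop if name == ""

* Within each ID and name, sort by year; a name counts as new
* in the earliest year in which it occurs for that ID
bysort ID name (Year): generate byte is_new = (Year == Year[1])

* Back to one row per ID-year: number of new names and of all names
collapse (sum) n_new = is_new (count) n_total = is_new, by(ID Year)
generate ratio_new = n_new / n_total

list ID Year n_new n_total ratio_new, sepby(ID)
```

Note that collapse replaces the data in memory with one row per ID and Year, so the original Names variable is dropped. If it is still needed, the result could be saved and merged back onto the original file by ID and Year. With the example data, ID 10655 would get n_new = 2 and n_total = 2 in 2007, and n_new = 1 and n_total = 2 in 2010.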