Hi there,
I have a dataset contain the patent history and job career history for inventors and firms with patent information, firm information and inventor information. Sample as foolows:
firm_id year inventor_id move patent_id class sourcef_id
10036 2001 4144893-1 0 6197294 424093 .
10036 2001 4144893-1 0 6225448 424130 .
10036 2001 4212742-2 0 6322804 424422 .
10036 2001 4495402-2 0 6322804 424422 .
10036 2001 4861627-1 1 6262034 424468 100390
10036 2001 4868121-1 0 6322804 424422 .
firm_id: the focal firm who hires inventor from source firm (sourcef_id) at year t.
inventor_id: inventors hwo is now working at focal firm (firm_id).
move: dummy variable, 1 if inventor move to focal firm at year t, otherwise 0.
class: classification number which indicates certain class for each patent (patent_id)
sourcef_id: id for source firm, that is the inventors' employer before moving to focal firm. this variable only exists when inventor move (move =1).
My issue is: I would like to have a panel as follows:
firm_id year inventor_id move patent_id class sourcef_id patent_ord
10036 2001 4861627-1 1 6262034 424468 100390 1
10036 2001 4861627-1 0 ... ... 100390 2
10036 2001 4861627-1 0 ... ... 100390 3
10036 2002 4861627-1 0 ... ... 100390 4
10036 2003 4861627-1 0 ... ... 100390 5
10036 2004 4861627-1 0 ... ... 100390 6
1. at firm-year-inventor level, list all the patents the mobile inventor patented after move to the focal firm.
2. for all the patents of the mobile inventor at hiring firm, generate a new variable patent_ord to indicate the order of the patents, e.g.in the first row, 1 indicates that the patent 4861627-1 is the first patent inventor inventoed after mobile.
3. fill the source firm values. Now source firm id (sourcef_id) is present at the year of inventor mobile, and is missing afterwards. As the table shows above, I would like to identify source firm at each inventor_patent level.
NOTE: the following notes complicate the case:
1. it is likely that inventor move to focal firm at year t and then move back to source firm at year t+5. For example as the following table shows, inventor 4861627-1 move to firm 10036 from 100390 at year 2001 and patent 4 patents at firm 10036. At year 2003, the inventor move back to 100390 (which is the sorouce firm) and patent 2 patents afterwards. In such a case, patent_ord should recount after each mobility events.
firm_id year inventor_id move patent_id class sourcef_id patent_ord
10036 2001 4861627-1 1 6262034 424468 100390 1
10036 2001 4861627-1 0 ... ... 100390 2
10036 2001 4861627-1 0 ... ... 100390 3
10036 2002 4861627-1 0 ... ... 100390 4
100390 2003 4861627-1 1 ... ... 10036 1
100390 2004 4861627-1 0 ... ... 10036 2
2. all the variables are not uniquely identified. For patent_id, one patent might be assigned to multiple inventors. For example, patent 6262034 might be assigned to 4861627-1 with other inventors at other firms. This is the case when this patent is produced by two firms' cooperation.
3. it is likely that focal firm hire more than one inventor and they cooperate after move to focal firm. For example, in the following table, both inventors 4861627-1 and 4861627-2 was hired at 2001 from the same firm 100390. and they cooperate after move to 10036 at 2002 and patent 6262035. This case might also make tihng complicated.
firm_id year inventor_id move patent_id class sourcef_id patent_ord
10036 2001 4861627-1 1 6262034 424468 100390 1
10036 2002 4861627-1 0 6262035 ... . 2
10036 2001 4861627-2 1 ... ... 100390 1
10036 2002 4861627-2 0 6262035 ... . 2
10036 2004 4861627-2 0 ... ... . 3
Any suggestions will be appreciated!
0 Response to Calculate the number of occurences for certain variable in panel data
Post a Comment