Hello everyone

I have an extensive dataset about individual patents (patnum), belonging to a technology class (class) and subclass (subclass) that a firm (permno) filed. For each patent, I have the exact date (fileddate) the patent was filed and the corresponding year (year_filed). For each patent-firm-year observation, I want to determine whether the technology is new to the firm, i.e. whether it is the first filed application in a specific class (or subclass), or whether it is an application the firm has already knowledge in, i.e. existing patents of the same technology class (or sublass). Thus, for each patent, I aim to have an indicator variable whether the patent is "exploitative" or "explorative" to the firm. In addition, it would be great to have a new variable - proximity - capturing the aggregated number of a firm's patents in class (or subclass) X in relation to the total aggregated number of patents of this firm (up to the year under analysis); in other words, the technological proximity between the patents filed in year t and the existing patent portfolio held by the same firm up to year t−1.

I hope I made it clear what I want to achieve. Any recommendation on how to achieve it is highly appreciated - I really struggle to tackle this issue.

Here is an example of my data

Code:
Example generated by -dataex-. To install: ssc install dataex
clear
input long(patnum permno) str3 class str7 subclass float(green fileddate year_filed)
1568124 12503 "056" "006000O" . -14447 1920
1568134 12503 "239" "584000O" 1 -13628 1922
1568142 10401 "333" "168000O" . -14385 1920
1568153 16029 ""    ""        . -13433 1923
1568163 12503 "056" "011500O" . -14487 1920
1568338 14875 "400" "689000O" . -13209 1923
1568578 18227 "192" "003620O" . -12658 1925
1568609 10620 "324" "502000O" . -12838 1924
1568615 12079 "123" "058100O" 1 -14561 1920
1568623 11754 "396" "440000O" . -13955 1921
1568625 11447 "220" "798000O" . -14129 1921
1568648 12079 "072" "085000O" . -14275 1920
1568656 11754 "396" "341000O" . -12713 1925
1568658 11754 "430" "631000O" . -12722 1925
1568667 11754 "546" "101000O" . -13973 1921
1568670 11754 "396" "341000O" . -13045 1924
1568671 11754 "396" "343000O" . -13045 1924
1568682 12060 "310" "219000O" . -13233 1923
1568692 12060 "310" "203000O" . -12852 1924
1568694 12060 "313" "538000O" . -14139 1921
1568702 12060 "346" "035000O" . -13876 1922
1568710 12060 "363" "084000O" . -13426 1923
1568731 12060 "384" "151000O" . -13100 1924
1568735 12060 "318" "741000O" . -13598 1922
1568748 12060 "361" "643000O" . -12590 1925
Best regards and many thanks
Pascal