I want to calculate the radicalness of patents, so I am working with backward citations. I am using the OECD radicalness index, which is shown in the figure below. As I understand it, the radicalness of a patent is the number of IPC codes of its cited patents that are not assigned to the patent itself, divided by the total number of IPC codes across all of its backward citations.
[Figure: formula of the OECD radicalness index]

I am working with the IPC classification data from PatentsView, and I want to merge it with the PatentsView US patent citation data to calculate the radicalness. The problem is that the IPC dataset lists multiple IPC codes for a single patent_id, so I cannot merge it with the US patent citation dataset. I need to have one IPC code for every patent. How can I achieve this? Attached is part of my IPC dataset; I have already dropped the variables that are not needed for the analysis. Once I can merge it with the US patent citation data, I can calculate the radicalness. See the picture below for a part of the data that shows the problem.
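To make the setup concrete, here is a minimal sketch of one possible way around the one-to-many merge: instead of forcing a single IPC code per patent, collapse the IPC rows into one set of codes per patent_id and then look those sets up for each citation pair. Everything in it is an assumption for illustration: the function names, the sample IDs and codes, the truncation to 4-character IPC subclasses, and the reading of the index as "IPC codes of cited patents not shared with the citing patent, divided by all IPC codes of the cited patents". It uses plain Python rather than the actual PatentsView column layout.

```python
from collections import defaultdict


def ipc4(code):
    """Truncate an IPC symbol to its 4-character subclass (assumed level of detail)."""
    return code.replace(" ", "")[:4]


def collapse_ipc(ipc_rows):
    """Turn (patent_id, ipc_code) rows into {patent_id: set of IPC subclasses}.

    This gives exactly one entry per patent, so the result can be joined
    to the citation pairs without duplicating rows.
    """
    classes = defaultdict(set)
    for patent_id, code in ipc_rows:
        classes[patent_id].add(ipc4(code))
    return dict(classes)


def radicalness(citations, classes):
    """Compute the index per citing patent from (citing_id, cited_id) pairs."""
    cited_by = defaultdict(list)
    for citing, cited in citations:
        cited_by[citing].append(cited)

    result = {}
    for citing, cited_list in cited_by.items():
        own = classes.get(citing, set())
        total = 0  # all IPC codes found in the backward citations
        new = 0    # those codes that the citing patent does not carry itself
        for cited in cited_list:
            codes = classes.get(cited, set())
            total += len(codes)
            new += len(codes - own)
        if total:  # skip patents whose cited patents have no IPC data
            result[citing] = new / total
    return result


# Hypothetical sample data, not taken from PatentsView.
ipc_rows = [
    ("P1", "H04L 9/32"), ("P1", "G06F 21/00"),
    ("P2", "H04L 12/28"), ("P2", "A61K 31/00"),
    ("P3", "G06F 17/30"),
    ("P4", "B01D 53/00"), ("P4", "C07C 1/00"),
]
citations = [("P1", "P2"), ("P1", "P3"), ("P1", "P4")]

classes = collapse_ipc(ipc_rows)
scores = radicalness(citations, classes)
print(scores)
```

In this toy example, P1 cites three patents carrying five IPC subclasses in total, three of which P1 does not have itself. Whether repeated codes across cited patents should be counted once or several times depends on how the index definition is read, so that choice should be checked against the OECD documentation.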

Thanks in advance.

[Image: excerpt of the IPC dataset, showing several IPC codes for the same patent_id]