I need to analyse duplicates. I have a set of newspaper articles, each with a story_id. The articles mention various EU and US companies. First I need to find out how many companies are mentioned in each article. For that I used:
Code:
duplicates tag rp_story_id, gen(dup_storyid)
Example:
company  country_code  story_id     headline                     year
VW       DE            NDJHAODUW    Earnings announcement 3. Qu  2003
BMW      DE            NDJHAODUW    Earnings announcement 3. QU  2003
GM       US            NDJHAODUW    Earnings announcement 3. Qu  2003
VW       DE            SODOEIKDIDI  Earnings announcement 1. Qu  2004
GM       US            SODOEIKDIDI  Earnings announcement 1. Qu  2004
Code:
duplicates tag rp_story_id, gen(dup_storyid)
gen continent=0
replace continent=1 if country_code!="US"
tab dup_storyid continent
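One point that may help when reading the tab output: duplicates tag stores the number of other observations that share the same id, so an article mentioning three companies gets dup_storyid = 2, not 3. The sketch below rebuilds the example data and turns that into a company count per article, plus separate non-US and US counts. It uses story_id as in the example (the real data apparently calls it rp_story_id), assumes each company appears at most once per story, and the names n_companies, n_check, non_us, n_non_us, n_us and first are made up for illustration.

```stata
* Sketch only: rebuild the example data from the post
clear
input str3 company str2 country_code str11 story_id int year
"VW"  "DE" "NDJHAODUW"   2003
"BMW" "DE" "NDJHAODUW"   2003
"GM"  "US" "NDJHAODUW"   2003
"VW"  "DE" "SODOEIKDIDI" 2004
"GM"  "US" "SODOEIKDIDI" 2004
end

* duplicates tag counts the OTHER observations sharing the id,
* so companies per article = dup_storyid + 1
duplicates tag story_id, gen(dup_storyid)
gen n_companies = dup_storyid + 1

* Same count taken directly from the group size, as a cross-check
bysort story_id: gen n_check = _N
assert n_companies == n_check

* Hypothetical helper variables: non-US and US companies per article
gen byte non_us = country_code != "US"
bysort story_id: egen n_non_us = total(non_us)
gen n_us = n_companies - n_non_us

* Keep one row per article when listing or tabulating
egen byte first = tag(story_id)
list story_id n_companies n_non_us n_us if first
```

Restricting to one row per article (if first) matters for any tabulation at article level; otherwise an article with three companies is counted three times.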