Hi guys,

I need to analyse duplicates. I've got different newpaper articles. They all have a story_id. These articles mention different EU and US companies. First I need to analyse how many companies are mentioned in one article. For that I used:
Code:
duplicates tag rp_story_id, gen(dup_storyid)
Second I need to analyse how many US and Non-US companies (country_code=="US") are mentioned each year.

Example:

company country_code story_id headline year
VW DE NDJHAODUW Earnings announcement 3. Qu 2003
BMW DE NDJHAODUW Earnings announcement 3. QU 2003
GM US NDJHAODUW Earnings announcement 3. Qu 2003
VW DE SODOEIKDIDI Earnings announcement 1. Qu 2004
GM US SODOEIKDIDI Earnings announcement 1. Qu 2004

Code:
duplicates tag rp_story_id, gen(dup_storyid)
gen continent=0
replace continent=1 if country_code!="US"
tab dup_storyid continent
Any suggestions how I could continue?