Hey everybody,

I am Till from Germany and new with Stata and using it for my master thesis. Currently, I am facing an issue that I could not solve, neither with the FAQ's nor other help documents.
I would like to analyze media coverage for specific announcements. Specifically, I want to count the media articles published in various time windows, e.g. the day before the announcement, 3 days before, 1 week before, 2 weeks before, 30 days before and so on until one year before to have an understanding of the companies general media coverage. Additionally, I have several characteristics that I would like to use to distinguish, such as the type of news (see variable news_type) or the source (see variable product_key).


Following a small extract from my data for one company: Given are the publishing dates of the articles, an entity identifier, the before mentioned characteristics as well as a story identifier.
Code:
* Example generated by -dataex-. To install: ssc install dataex
dataex rpna_date_utc rp_entity_id news_type product_key rp_story_id Announcement_Day
clear
input long rpna_date_utc str6 rp_entity_id str50 news_type str6 product_key str32 rp_story_id float Announcement_Day
14612 "00067A" "FULL-ARTICLE"   "DJ-EQ" "91AD179C46FBEB60357F0818D74EF44E"     .
14612 "00067A" "HOT-NEWS-FLASH" "DJ-EQ" "39802459186C26C8863CD5321685260B"     .
14612 "00067A" "FULL-ARTICLE"   "DJ-EQ" "C4D721A26CD46F9D2B2A7E218D908088"     .
14612 "00067A" "FULL-ARTICLE"   "DJ-EQ" "DB0D7CB34E03C883D161F0831A3AB8AE"     .
14612 "00067A" "FULL-ARTICLE"   "DJ-EQ" "EEB4AC362D5B0D36F0ECC5623D4F1DDF"     .
14612 "00067A" "FULL-ARTICLE"   "DJ-EQ" "1DC30A3516E32E5BE16639E029EBCEF1"     .
14613 "00067A" "FULL-ARTICLE"   "DJ-EQ" "0F7423D041E242D01570930DB20A1A9D"     .
14614 "00067A" "FULL-ARTICLE"   "DJ-EQ" "719A71DFE4CAFB364B1A2FC5AEE72EDE"     .
14642 "00067A" "PRESS-RELEASE"  "DJ-EQ" "2A72AB666C82E0890F74BB7BD3B677B9"     .
14643 "00067A" "HOT-NEWS-FLASH" "DJ-EQ" "A738A4E6520A3D048F12C14BCB505177"     .
14643 "00067A" "NEWS-FLASH"     "DJ-EQ" "6579DBEC72577FE78BACF4761BCF5883"     .
14643 "00067A" "FULL-ARTICLE"   "DJ-EQ" "48E2365C2479599784640F5A800061EB"     .
14644 "00067A" "FULL-ARTICLE"   "DJ-EQ" "4186DA028E7290444BBB16B584BFEABA" 14644
14650 "00067A" "FULL-ARTICLE"   "DJ-EQ" "558EC7C430A403AF9D886BD3F433FF23"     .
14650 "00067A" "FULL-ARTICLE"   "DJ-EQ" "49A10F744AADF0C5532B33C8C1654F3C"     .
end
format %td rpna_date_utc
format %td Announcement_Day
Up to now, I created dummy variables for all three characteristics in product_key (those are the different sources that are just not fully represented in this data extract) as well as for the news_type Full-Article etc. The dummy dj-eq for instance takes the value 1 if the observation belongs to the source DJ-EQ and so on. I did that to be able to loop over my observations. Therefore, I used the following Code for PR-EQ, WE-EQ, PR-EQ as well as Full-Article, Press-Relase and News-Flash.:

Code:
 
gen pr_eq = 1 if product_key == "PR-EQ"
label define labpr_eq 1 "PR-EQ"
label values pr_eq labpr_eq
label variable pr_eq "Respondend product_key"

Let's say my announcement day is the 4th of February 2000. My goal is to create, at first, new variables which return on the day of announcement the number of articles published in the 3 days prior to the announcement, given that the various criteria are met. The other time periods will follow as soon as the method is clear to me.
Therefore, I used the following Code:
gen prior_3_days = rpna_date_utc - 3
format prior_3_days %td
Code:
local N = _N
forval i = 1/`N’ {
                foreach key in pr_eq we_eq dj_eq {
                                foreach type in full_article news_flash press_release {
                                                quietly by rp_entity_id, sort: egen countnews_`key'_`type' = count(rp_story_id) if  `key' == 1 & `type' == 1 & inrange(rpna_date_utc, prior_3_days[`i'], Announcement_Day)      
}
}
}

Here is abstract of the results after running the above described codes:
Code:
* Example generated by -dataex-. To install: ssc install dataex
dataex rpna_date_utc rpna_date_utc Announcement_Day prior_3_days pr_eq dj_eq we_eq full_article news_flash press_release countnews_dj_eq_full_article countnews_dj_eq_news_flash countnews_dj_eq_press_release
clear
input long rpna_date_utc float(Announcement_Day prior_3_days pr_eq dj_eq we_eq full_article news_flash press_release countnews_dj_eq_full_article countnews_dj_eq_news_flash countnews_dj_eq_press_release)
14612     . 14609 . 1 . 1 . . 11 . .
14612     . 14609 . 1 . . 1 .  . 3 .
14612     . 14609 . 1 . 1 . . 11 . .
14612     . 14609 . 1 . 1 . . 11 . .
14612     . 14609 . 1 . 1 . . 11 . .
14612     . 14609 . 1 . 1 . . 11 . .
14613     . 14610 . 1 . 1 . . 11 . .
14614     . 14611 . 1 . 1 . . 11 . .
14642     . 14639 . 1 . . . 1  . . 1
14643     . 14640 . 1 . . 1 .  . 3 .
14643     . 14640 . 1 . . 1 .  . 3 .
14643     . 14640 . 1 . 1 . . 11 . .
14644 14644 14641 . 1 . 1 . . 11 . .
14650     . 14647 . 1 . 1 . . 11 . .
14650     . 14647 . 1 . 1 . . 11 . .
end
format %td rpna_date_utc
format %td Announcement_Day
format %td prior_3_days
label values pr_eq labpr_eq
label values dj_eq labdj_eq
label def labdj_eq 1 "DJ-EQ", modify
label values we_eq labwe_eq
label values full_article labfull_article
label def labfull_article 1 "FULL-ARTICLE", modify
label values news_flash labnews_flash
label def labnews_flash 1 "NEWS-FLASH", modify
label values press_release labpress_release
label def labpress_release 1 "PRESS-RELEASE", modify
I understand that the given results represent the frequency of matched characteristics, so we have in our sample from DJ-EQ 11x Full-Articles, 3x News-Flash and 1x Press Release. But these are not matching the required time window.
I have tried other options like:
Code:
count if prior_3_days[`i'] >= calc_date[`i'] | calc_date[`i'] <= prior_3_days[`i']
but nothing leads me to the desired output.

I would highly appreciate any help of you.

Thank you very much in advance.