I am Till from Germany and new with Stata and using it for my master thesis. Currently, I am facing an issue that I could not solve, neither with the FAQ's nor other help documents.
I would like to analyze media coverage for specific announcements. Specifically, I want to count the media articles published in various time windows, e.g. the day before the announcement, 3 days before, 1 week before, 2 weeks before, 30 days before and so on until one year before to have an understanding of the companies general media coverage. Additionally, I have several characteristics that I would like to use to distinguish, such as the type of news (see variable news_type) or the source (see variable product_key).
Following a small extract from my data for one company: Given are the publishing dates of the articles, an entity identifier, the before mentioned characteristics as well as a story identifier.
Code:
* Example generated by -dataex-. To install: ssc install dataex dataex rpna_date_utc rp_entity_id news_type product_key rp_story_id Announcement_Day clear input long rpna_date_utc str6 rp_entity_id str50 news_type str6 product_key str32 rp_story_id float Announcement_Day 14612 "00067A" "FULL-ARTICLE" "DJ-EQ" "91AD179C46FBEB60357F0818D74EF44E" . 14612 "00067A" "HOT-NEWS-FLASH" "DJ-EQ" "39802459186C26C8863CD5321685260B" . 14612 "00067A" "FULL-ARTICLE" "DJ-EQ" "C4D721A26CD46F9D2B2A7E218D908088" . 14612 "00067A" "FULL-ARTICLE" "DJ-EQ" "DB0D7CB34E03C883D161F0831A3AB8AE" . 14612 "00067A" "FULL-ARTICLE" "DJ-EQ" "EEB4AC362D5B0D36F0ECC5623D4F1DDF" . 14612 "00067A" "FULL-ARTICLE" "DJ-EQ" "1DC30A3516E32E5BE16639E029EBCEF1" . 14613 "00067A" "FULL-ARTICLE" "DJ-EQ" "0F7423D041E242D01570930DB20A1A9D" . 14614 "00067A" "FULL-ARTICLE" "DJ-EQ" "719A71DFE4CAFB364B1A2FC5AEE72EDE" . 14642 "00067A" "PRESS-RELEASE" "DJ-EQ" "2A72AB666C82E0890F74BB7BD3B677B9" . 14643 "00067A" "HOT-NEWS-FLASH" "DJ-EQ" "A738A4E6520A3D048F12C14BCB505177" . 14643 "00067A" "NEWS-FLASH" "DJ-EQ" "6579DBEC72577FE78BACF4761BCF5883" . 14643 "00067A" "FULL-ARTICLE" "DJ-EQ" "48E2365C2479599784640F5A800061EB" . 14644 "00067A" "FULL-ARTICLE" "DJ-EQ" "4186DA028E7290444BBB16B584BFEABA" 14644 14650 "00067A" "FULL-ARTICLE" "DJ-EQ" "558EC7C430A403AF9D886BD3F433FF23" . 14650 "00067A" "FULL-ARTICLE" "DJ-EQ" "49A10F744AADF0C5532B33C8C1654F3C" . end format %td rpna_date_utc format %td Announcement_Day
Code:
gen pr_eq = 1 if product_key == "PR-EQ" label define labpr_eq 1 "PR-EQ" label values pr_eq labpr_eq label variable pr_eq "Respondend product_key"
Let's say my announcement day is the 4th of February 2000. My goal is to create, at first, new variables which return on the day of announcement the number of articles published in the 3 days prior to the announcement, given that the various criteria are met. The other time periods will follow as soon as the method is clear to me.
Therefore, I used the following Code:
gen prior_3_days = rpna_date_utc - 3
format prior_3_days %td
Code:
local N = _N forval i = 1/`N’ { foreach key in pr_eq we_eq dj_eq { foreach type in full_article news_flash press_release { quietly by rp_entity_id, sort: egen countnews_`key'_`type' = count(rp_story_id) if `key' == 1 & `type' == 1 & inrange(rpna_date_utc, prior_3_days[`i'], Announcement_Day) } } }
Here is abstract of the results after running the above described codes:
Code:
* Example generated by -dataex-. To install: ssc install dataex dataex rpna_date_utc rpna_date_utc Announcement_Day prior_3_days pr_eq dj_eq we_eq full_article news_flash press_release countnews_dj_eq_full_article countnews_dj_eq_news_flash countnews_dj_eq_press_release clear input long rpna_date_utc float(Announcement_Day prior_3_days pr_eq dj_eq we_eq full_article news_flash press_release countnews_dj_eq_full_article countnews_dj_eq_news_flash countnews_dj_eq_press_release) 14612 . 14609 . 1 . 1 . . 11 . . 14612 . 14609 . 1 . . 1 . . 3 . 14612 . 14609 . 1 . 1 . . 11 . . 14612 . 14609 . 1 . 1 . . 11 . . 14612 . 14609 . 1 . 1 . . 11 . . 14612 . 14609 . 1 . 1 . . 11 . . 14613 . 14610 . 1 . 1 . . 11 . . 14614 . 14611 . 1 . 1 . . 11 . . 14642 . 14639 . 1 . . . 1 . . 1 14643 . 14640 . 1 . . 1 . . 3 . 14643 . 14640 . 1 . . 1 . . 3 . 14643 . 14640 . 1 . 1 . . 11 . . 14644 14644 14641 . 1 . 1 . . 11 . . 14650 . 14647 . 1 . 1 . . 11 . . 14650 . 14647 . 1 . 1 . . 11 . . end format %td rpna_date_utc format %td Announcement_Day format %td prior_3_days label values pr_eq labpr_eq label values dj_eq labdj_eq label def labdj_eq 1 "DJ-EQ", modify label values we_eq labwe_eq label values full_article labfull_article label def labfull_article 1 "FULL-ARTICLE", modify label values news_flash labnews_flash label def labnews_flash 1 "NEWS-FLASH", modify label values press_release labpress_release label def labpress_release 1 "PRESS-RELEASE", modify
I have tried other options like:
Code:
count if prior_3_days[`i'] >= calc_date[`i'] | calc_date[`i'] <= prior_3_days[`i']
I would highly appreciate any help of you.
Thank you very much in advance.
0 Response to Creation of count variables that relate to a period of time
Post a Comment