Dear Statalist Users,

I have a problem with merging the two datasets. I work in Stata 15. The first dataset_1 is below. It is a panel dataset with more than 30 companies from 20 countries with daily observations for the last 50 years. In total I have more than 480 000 obs in it. Below is the sample of it with just one country and company.
trading_date = trading date
country = country name
company ticker = company id (Bloomberg ticker)
start_date3 = event date
diff = the difference in days between the trading date and event date (as sometimes events take place on non-trading dates, and in this case an event date is merged with the next closest trading date)
conflict_episode_id = an id of a conflict episode associated with the event date (it is needed here because company_ticker and trading_date do not uniquely identify the panel. This is because there could be multiple event dates on the the same trading date, so I have created separate observations for these cases - company 1 will then have two records for the same trading date, one for conflict_episode_id=1 and another one for conflict_episode_id=2. The panel then is uniquely identified by company_ticker, trading_date and conflict_episode_id).

* Example generated by -dataex-. To install: ssc install dataex
clear
input float trading_date str14 country str16 company_ticker int start_date3 float diff str14 conflict_episode_id
13493 "Israel" "ESLT IT Equity" . . ""
13494 "Israel" "ESLT IT Equity" . . ""
13495 "Israel" "ESLT IT Equity" . . ""
13496 "Israel" "ESLT IT Equity" . . ""
13499 "Israel" "ESLT IT Equity" . . ""
13500 "Israel" "ESLT IT Equity" . . ""
13501 "Israel" "ESLT IT Equity" . . ""
13502 "Israel" "ESLT IT Equity" . . ""
13503 "Israel" "ESLT IT Equity" . . ""
13506 "Israel" "ESLT IT Equity" . . ""
13507 "Israel" "ESLT IT Equity" . . ""
13508 "Israel" "ESLT IT Equity" . . ""
13509 "Israel" "ESLT IT Equity" . . ""
13510 "Israel" "ESLT IT Equity" . . ""
13513 "Israel" "ESLT IT Equity" . . ""
13514 "Israel" "ESLT IT Equity" . . ""
13515 "Israel" "ESLT IT Equity" . . ""
13516 "Israel" "ESLT IT Equity" . . ""
13517 "Israel" "ESLT IT Equity" . . ""
13520 "Israel" "ESLT IT Equity" . . ""
13521 "Israel" "ESLT IT Equity" . . ""
13522 "Israel" "ESLT IT Equity" . . ""
13523 "Israel" "ESLT IT Equity" . . ""
13524 "Israel" "ESLT IT Equity" . . ""
13527 "Israel" "ESLT IT Equity" . . ""
13528 "Israel" "ESLT IT Equity" . . ""
13529 "Israel" "ESLT IT Equity" . . ""
13530 "Israel" "ESLT IT Equity" . . ""
13531 "Israel" "ESLT IT Equity" . . ""
13534 "Israel" "ESLT IT Equity" . . ""
13535 "Israel" "ESLT IT Equity" . . ""
13536 "Israel" "ESLT IT Equity" . . ""
13537 "Israel" "ESLT IT Equity" . . ""
13538 "Israel" "ESLT IT Equity" . . ""
13541 "Israel" "ESLT IT Equity" . . ""
13542 "Israel" "ESLT IT Equity" . . ""
13543 "Israel" "ESLT IT Equity" . . ""
13544 "Israel" "ESLT IT Equity" . . ""
13545 "Israel" "ESLT IT Equity" . . ""
13548 "Israel" "ESLT IT Equity" . . ""
13549 "Israel" "ESLT IT Equity" . . ""
13550 "Israel" "ESLT IT Equity" . . ""
13551 "Israel" "ESLT IT Equity" . . ""
13552 "Israel" "ESLT IT Equity" . . ""
13555 "Israel" "ESLT IT Equity" . . ""
13556 "Israel" "ESLT IT Equity" . . ""
13557 "Israel" "ESLT IT Equity" . . ""
13558 "Israel" "ESLT IT Equity" . . ""
13559 "Israel" "ESLT IT Equity" . . ""
13562 "Israel" "ESLT IT Equity" . . ""
13563 "Israel" "ESLT IT Equity" 13563 0 "338_19970218"
13564 "Israel" "ESLT IT Equity" . . ""
13565 "Israel" "ESLT IT Equity" . . ""
13566 "Israel" "ESLT IT Equity" . . ""
13569 "Israel" "ESLT IT Equity" . . ""
13570 "Israel" "ESLT IT Equity" . . ""
13571 "Israel" "ESLT IT Equity" . . ""
13572 "Israel" "ESLT IT Equity" . . ""
13573 "Israel" "ESLT IT Equity" . . ""
13576 "Israel" "ESLT IT Equity" . . ""
13577 "Israel" "ESLT IT Equity" . . ""
13578 "Israel" "ESLT IT Equity" . . ""
13579 "Israel" "ESLT IT Equity" . . ""
13580 "Israel" "ESLT IT Equity" . . ""
13583 "Israel" "ESLT IT Equity" . . ""
13584 "Israel" "ESLT IT Equity" . . ""
13585 "Israel" "ESLT IT Equity" . . ""
13586 "Israel" "ESLT IT Equity" . . ""
13587 "Israel" "ESLT IT Equity" . . ""
13590 "Israel" "ESLT IT Equity" . . ""
13591 "Israel" "ESLT IT Equity" 13591 0 "375_19970318"
13592 "Israel" "ESLT IT Equity" . . ""
13593 "Israel" "ESLT IT Equity" . . ""
13594 "Israel" "ESLT IT Equity" . . ""
13597 "Israel" "ESLT IT Equity" . . ""
13598 "Israel" "ESLT IT Equity" . . ""
13599 "Israel" "ESLT IT Equity" . . ""
13600 "Israel" "ESLT IT Equity" . . ""
13601 "Israel" "ESLT IT Equity" . . ""
13604 "Israel" "ESLT IT Equity" . . ""
13605 "Israel" "ESLT IT Equity" . . ""
13606 "Israel" "ESLT IT Equity" . . ""
13607 "Israel" "ESLT IT Equity" . . ""
13608 "Israel" "ESLT IT Equity" . . ""
13611 "Israel" "ESLT IT Equity" . . ""
13612 "Israel" "ESLT IT Equity" . . ""
13613 "Israel" "ESLT IT Equity" . . ""
13614 "Israel" "ESLT IT Equity" . . ""
13615 "Israel" "ESLT IT Equity" . . ""
13618 "Israel" "ESLT IT Equity" . . ""
13619 "Israel" "ESLT IT Equity" . . ""
13620 "Israel" "ESLT IT Equity" . . ""
13621 "Israel" "ESLT IT Equity" . . ""
13622 "Israel" "ESLT IT Equity" . . ""
13625 "Israel" "ESLT IT Equity" 13624 1 "388_19970420"
13626 "Israel" "ESLT IT Equity" . . ""
13627 "Israel" "ESLT IT Equity" . . ""
13628 "Israel" "ESLT IT Equity" . . ""
13629 "Israel" "ESLT IT Equity" . . ""
13632 "Israel" "ESLT IT Equity" . . ""
end
format %td trading_date
format %td start_date3

The second dataset_2 is below. The variables are:
trading date
conflict_episode_id
start_date3_news = same as above - event date, I renamed it to differentiate between them in the two datasets and also to keep this one in the merged dataset later on.
media_coverage = the variable that shows how intense the media coverage of the event is during (-30days;+30days) window around the event.
This dataset_2 is uniquely identifies by trading_date and conflict_episode_id.


* Example generated by -dataex-. To install: ssc install dataex
clear
input int trading_date str14 conflict_episode_id int(start_date3_news media_coverage)
14824 "338_19970218" 14854 35
14825 "338_19970218" 14854 32
14826 "338_19970218" 14854 30
14827 "338_19970218" 14854 36
14828 "338_19970218" 14854 29
14829 "338_19970218" 14854 41
14830 "338_19970218" 14854 21
14831 "338_19970218" 14854 35
14832 "338_19970218" 14854 26
14833 "338_19970218" 14854 40
14834 "338_19970218" 14854 34
14835 "338_19970218" 14854 25
14836 "338_19970218" 14854 37
14837 "338_19970218" 14854 30
14838 "338_19970218" 14854 28
14839 "338_19970218" 14854 29
14840 "338_19970218" 14854 39
14841 "338_19970218" 14854 27
14842 "338_19970218" 14854 33
14843 "338_19970218" 14854 28
14844 "338_19970218" 14854 37
14845 "338_19970218" 14854 41
14846 "338_19970218" 14854 47
14847 "338_19970218" 14854 33
14848 "338_19970218" 14854 35
14849 "338_19970218" 14854 44
14850 "338_19970218" 14854 34
14851 "338_19970218" 14854 29
14852 "338_19970218" 14854 25
14853 "338_19970218" 14854 37
14854 "338_19970218" 14854 96
14855 "338_19970218" 14854 31
14856 "338_19970218" 14854 41
14857 "338_19970218" 14854 36
14858 "338_19970218" 14854 38
14859 "338_19970218" 14854 41
14860 "338_19970218" 14854 56
14861 "338_19970218" 14854 46
14862 "338_19970218" 14854 57
14863 "338_19970218" 14854 46
14864 "338_19970218" 14854 55
14865 "338_19970218" 14854 45
14866 "338_19970218" 14854 39
14867 "338_19970218" 14854 36
14868 "338_19970218" 14854 54
14869 "338_19970218" 14854 35
14870 "338_19970218" 14854 45
14871 "338_19970218" 14854 79
14872 "338_19970218" 14854 73
14873 "338_19970218" 14854 74
14874 "338_19970218" 14854 52
14875 "338_19970218" 14854 74
14876 "338_19970218" 14854 52
14877 "338_19970218" 14854 55
14878 "338_19970218" 14854 57
14879 "338_19970218" 14854 62
14880 "338_19970218" 14854 46
14881 "338_19970218" 14854 46
14882 "338_19970218" 14854 53
14883 "338_19970218" 14854 47
14884 "338_19970218" 14854 108
12373 "375_19970318" 12403 4
12374 "375_19970318" 12403 1
12375 "375_19970318" 12403 0
12376 "375_19970318" 12403 1
12377 "375_19970318" 12403 2
12378 "375_19970318" 12403 1
12379 "375_19970318" 12403 1
12380 "375_19970318" 12403 0
12381 "375_19970318" 12403 0
12382 "375_19970318" 12403 1
12383 "375_19970318" 12403 1
12384 "375_19970318" 12403 2
12385 "375_19970318" 12403 1
12386 "375_19970318" 12403 0
12387 "375_19970318" 12403 1
12388 "375_19970318" 12403 3
12389 "375_19970318" 12403 1
12390 "375_19970318" 12403 2
12391 "375_19970318" 12403 6
12392 "375_19970318" 12403 1
12393 "375_19970318" 12403 1
12394 "375_19970318" 12403 1
12395 "375_19970318" 12403 1
12396 "375_19970318" 12403 1
12397 "375_19970318" 12403 3
12398 "375_19970318" 12403 3
12399 "375_19970318" 12403 0
12400 "375_19970318" 12403 1
12401 "375_19970318" 12403 1
12402 "375_19970318" 12403 3
12403 "375_19970318" 12403 9
12404 "375_19970318" 12403 5
12405 "375_19970318" 12403 0
12406 "375_19970318" 12403 2
12407 "375_19970318" 12403 2
12408 "375_19970318" 12403 4
12409 "375_19970318" 12403 6
12410 "375_19970318" 12403 3
12411 "375_19970318" 12403 1
end
format %td trading_date
format %td start_date3_news
[/CODE]

I would like to merge the two datasets on trading_date and conflict_episode_id. However, I need to create additional observations in dataset_1 because I have overlapping events within (-30;+30) event windows. For example, records [13563 "Israel" "ESLT IT Equity" 13563 0 "338_19970218"] and [13591 "Israel" "ESLT IT Equity" 13591 0 "375_19970318"]. So I was thinking that first I need to:
1. Identify if I have overlapping events within (-30;30) window for each start_date3 which is non-missing
2. Then count the number of records/obs to be duplicated. And duplicate them.
3. Then replace the missing values for the conflict_episode_id within the (-30;30) window with the conflict_episode_id.
4. Finally, merge the two datasets by trading date and conflict_episode_id.

I would highly appreciate any help with achieving the above mentioned points1-3. I have looked at these threads (https://www.statalist.org/forums/for...rlapping-dates, and https://www.statalist.org/forums/for...-of-dates_loop), but they do no exactly help me achieve what I need with minimal memory usage (if I expand my dataset_1 using the number of events per company, Stata returns an error saying that there is not enough memory, although I set memory to max available). Thank you!