Essentially, I have data that looks like this:
| rowid | date | earnings_date | stock_ticker |
| 1 | 18001 | 18002 | AAPL |
| 2 | 18002 | . | AAPL |
| 3 | 18003 | . | AAPL |
| 4 | 18001 | 18003 | MSFT |
| 5 | 18002 | . | MSFT |
| 6 | 18003 | . | MSFT |
| 7 | 18001 | . | TSLA |
| 8 | 18002 | 18001 | TSLA |
| 9 | 18003 | . | TSLA |
The "date" variable covers every working day of the year, sequentially, in Stata format (number of days since January 1, 1960).
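As a side note, a numeric daily date like this can be shown as a calendar date by attaching a %td display format. A minimal sketch, assuming both variables are stored as numeric daily dates as described:

```stata
* Show the numeric daily dates as calendar dates (stored values are unchanged)
format date earnings_date %td
list date earnings_date stock_ticker in 1/3
```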
The earnings_date column is messy: it contains a list of dates when various companies announced earnings. It has some duplicates and many missing values, since there are far fewer earnings dates than days in a year. Every earnings_date, however, sits on a row with the correct company ticker, something like this:
| date | earnings_date | stock_ticker |
| 18001 | 18002 | AAPL |
| 18002 | 18032 | AAPL |
| 18003 | 18097 | AAPL |
| 18004 | . | AAPL |
| 18005 | 18097 | AAPL |
| 18006 | . | AAPL |
| 18007 | . | AAPL |
| 18008 | . | AAPL |
| 18009 | . | AAPL |
I would just like to add a variable ("categ") that's 1 when a date is an earnings date and 0 otherwise:
| rowid | date | earnings_date | stock_ticker | categ |
| 1 | 18001 | 18002 | AAPL | 0 |
| 2 | 18002 | . | AAPL | 1 |
| 3 | 18003 | . | AAPL | 0 |
| 4 | 18001 | 18003 | MSFT | 0 |
| 5 | 18002 | . | MSFT | 0 |
| 6 | 18003 | . | MSFT | 1 |
| 7 | 18001 | . | TSLA | 1 |
| 8 | 18002 | 18001 | TSLA | 0 |
| 9 | 18003 | . | TSLA | 0 |
In other words, my code needs to compare each value of "date" with all the values of "earnings_date" for a particular stock ticker.
I've looked for inspiration in various examples but I only got to the point where I can compare each value of "date" with the inline value of "earnings_date":
Code:
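gen categ = .
forvalues f = 1/`=_N' {
    * store the value of date on the current row
    quietly sum date if rowid == `f', meanonly
    local testvalue = r(min)
    * flag rows whose earnings_date equals the stored value
    quietly egen testvariable = anymatch(earnings_date), values(`testvalue')
    * copy the flag back, but only for the current row
    quietly replace categ = testvariable if rowid == `f'
    drop testvariable
}
I do understand what this does. It takes the value of "date", stores it temporarily, then uses egen's anymatch to compare it to "earnings_date". This works, but only against the earnings_date on the same row.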
I just don't know how to compare the stored "testvalue" to ALL values of earnings_date corresponding to EACH ticker (not just with the inline value of earnings_date). I've tinkered for hours and everything failed.
If you have some ideas, I would greatly appreciate hearing from you. Many thanks in advance!
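One approach that might work for the comparison described above is to collect each ticker's distinct earnings dates into a temporary lookup file and merge it back on ticker and date, instead of looping over rows. This is only a sketch, not tested on the full data. It assumes earnings_date is numeric and that rowid uniquely identifies rows; the tempfile name edates is arbitrary. It should be run as one block from a do-file, since it relies on a local macro.

```stata
* Build a lookup of distinct (ticker, earnings date) pairs
preserve
keep stock_ticker earnings_date
drop if missing(earnings_date)
duplicates drop
rename earnings_date date          // so the key matches "date" in the main data
tempfile edates                    // arbitrary name for the temporary file
save `edates'
restore

* Match every row's (ticker, date) against the lookup
merge m:1 stock_ticker date using `edates', keep(master match)
gen byte categ = (_merge == 3)     // 1 if date is an earnings date for that ticker
drop _merge
sort rowid                         // merge reorders rows; restore the original order
```

On the nine-row example above, this should give categ = 1 for rowid 2, 6, and 7, and 0 elsewhere.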