I have a dataset that contains both physical and online sales of books. I am interested in examining the effect of e-book sales on physical book sales. I adopt a staggered diff-in-diff approach with PPML estimation. For simplicity, let's assume that the availability of e-books always cannibalises physical sales.
This is the baseline regression, where digital is 0 and switches to 1 after the release of the e-book:

Code:
xtpoisson monthly_sales i.digital i.month, fe vce(cluster id)
My question is twofold:
First, I observe a clear decaying time trend in the sales of books after their release. If a book's digital version becomes available early, say three months after release, I would be overestimating the negative effect of e-book sales on physical sales for that particular book. On the other hand, if it becomes available later, I would be underestimating the effect. The magnitude of the coefficient is then likely to depend on the distribution of treatment timing. Should I also add an age fixed effect? In other words:


Code:
xtpoisson monthly_sales i.digital i.month i.age, fe vce(cluster id)
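To be concrete about what I mean by age, here is a minimal sketch of how I would construct it. It assumes that month is an integer monthly date (e.g. %tm format) and that each book's first observed month is its release month; both are assumptions about the data layout, not something guaranteed by the regression above.
Code:
* age = months since the book's first observed month (assumed to be its release month)
bysort id (month): gen age = month - month[1]
* age equals month minus release month, and release month is constant within a book,
* so with book fixed effects Stata will likely drop one additional age or month dummy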

Second, the physical versions of the books also enter the market at different times. For example, the book Wonder was released in 2019 and was not available earlier. So my sample composition changes every month with new book releases. This also means that while some books are in my sample for ten years, others are there for only a couple of months. Should I drop the treated observations after some time? Does this cause any bias in my regression?
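One version of this that I have in mind is to keep treated books only for a fixed window after the e-book release. Below is a rough sketch; the 24-month cutoff and the variable names first_digital and rel are hypothetical, month is assumed to be an integer monthly date, and the panel is assumed to be already xtset on id.
Code:
* first month in which each book's e-book is available (missing if never treated)
bysort id: egen first_digital = min(cond(digital == 1, month, .))
* months relative to the e-book release (missing for never-treated books)
gen rel = month - first_digital
* keep never-treated books, pre-release months, and treated months up to 24 months after release
xtpoisson monthly_sales i.digital i.month if missing(rel) | rel <= 24, fe vce(cluster id)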