Hello world of stats and analysis,
I have dabbled a bit in Stata over my university years, but my work now requires me to become much more engaged with the software.
I am currently working on a project (using Stata 13.0) to determine seasonal sales patterns so that we can implement a sales sampling strategy across multiple provinces in Cambodia.
I have a large dataset with 24 variables and 120,000+ observations. My first step is to rework the current dataset so that sales years follow seasonal patterns in Cambodia: a sales year makes much more sense running from November 1st to October 31st of the following year. The dataset holds one record for every sale made by any supplier, by date (SUPPLIER_ID / DAY / MONTH / YEAR / SALE_AMOUNT).
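To make the November to October sales year concrete, here is a minimal sketch. It assumes month and year are numeric variables (the lowercase names come from my code attempt further down, not from the column list above), and sales_year is a hypothetical new variable name. It labels each sales year by the calendar year in which it ends, which is only one possible convention.

```stata
* Assumption: month and year are numeric; sales_year is a made-up name.
* A sale in November or December belongs to the sales year that ends
* the following October, so add 1 to the calendar year for those months.
* Missing values count as larger than any number in Stata, so guard
* against them explicitly.
generate sales_year = year + (month >= 11) if !missing(month, year)

* Quick check: each sales year should span months 11-12 and 1-10
tabulate sales_year month
```

With this convention, a sale on 15 December 2014 and a sale on 3 March 2015 both fall in sales year 2015.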
I am having problems writing code that would systematically eliminate any sales before November of the year in which a particular supplier joined our records. For example, supplier 43 joined on 01/04/2014 and has sales until 01/09/2017. I would like to delete all sales before 01/11/2014 for supplier 43, and then do the same for every supplier in my dataset.
I have been fiddling with the bysort prefix: bysort SPID year month: generate y=1, then replace y=. if SPID==SPID[_n-1] & month>=11. I feel like I'm close, but I am missing the code that tells Stata to flag only the year each supplier joined, so that I can then drop every observation where y is missing.
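For reference, the result I am after could be sketched roughly as follows. This is not a tested solution, only an illustration under stated assumptions: SPID, day, month and year are numeric variables (day is an assumed name), a supplier's join date is taken to be the date of its earliest recorded sale, and sale_date, first_date and cutoff are hypothetical helper names.

```stata
* Build a daily date from the three numeric components (assumed names)
generate sale_date = mdy(month, day, year)
format sale_date %td

* Earliest recorded sale per supplier, treated here as the join date
bysort SPID: egen first_date = min(sale_date)

* 1 November of the calendar year in which the supplier joined
generate cutoff = mdy(11, 1, year(first_date))
format first_date cutoff %td

* Drop every sale before that supplier's cutoff.
* Rows with a missing sale_date are kept, because missing is treated
* as larger than any date in the comparison.
drop if sale_date < cutoff

* Remove the helper variables once the rows are dropped
drop first_date cutoff
```

Under these assumptions, supplier 43 (first sale 01/04/2014) gets a cutoff of 01/11/2014. A supplier whose first sale falls in November or December would keep all of its records, so its first sales year would be incomplete; whether that is acceptable depends on the sampling design.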
I hope all that makes sense,
Thank you all for your help!