Hello generous people on this forum,
I am trying to merge the two datasets found here: https://drive.google.com/drive/folde...0C?usp=sharing
Just some info: when I say house I mean more like a plot of land, also called a lot. House just seems like an easier word to understand during explanations, so I went with that below.
specific_lots.dta has houses as observations, but each house has about 10 observations, one for each year its state was recorded. So observations 1-10 are for house 1, observations 11-20 are for house 2, and so on. There are 3 streets' worth of houses.
streets_and_dist.dta covers the same 3 streets, but there each observation is a different house (whereas in the other file every 10 observations is a new house). It also contains all the houses on each of those three streets, whereas specific_lots.dta is only a subset of them, and its other variables are a bit different.
My issue when I tried to merge before is that I am a novice: I don't know how to get past having 10 observations per house in specific_lots.dta but 1 observation per house in streets_and_dist.dta, and I don't know how to drop the unmatchable observations. I want to merge streets_and_dist.dta onto specific_lots.dta, in other words merge a dataset with all the houses but only one row per house onto a dataset with 10 rows per house. How I envision this happening is that streets_and_dist.dta needs to become 10 rows per house to then match up with specific_lots.dta.
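For what it's worth, Stata's many-to-one merge (`merge m:1`) is built for exactly this 10-rows-versus-1-row situation, so the one-row-per-house file should not need to be expanded first. The following is only a minimal sketch, assuming that both files sit in the current working directory, that lot_id identifies the same house in both files and is stored with the same type, and that lot_id is unique in streets_and_dist.dta. The output name lots_merged is hypothetical.

```stata
* Sketch only: file locations and the output name are assumptions.
* Start from the file with about 10 yearly rows per house (the "master").
use specific_lots, clear

* Many-to-one merge: every yearly row of a house picks up the single
* matching row from streets_and_dist. Stata stops with an error here
* if lot_id is not unique in streets_and_dist.
merge m:1 lot_id using streets_and_dist

* _merge == 1: only in specific_lots; 2: only in streets_and_dist; 3: in both
tabulate _merge

* Drop the unmatchable observations and the bookkeeping variable
keep if _merge == 3
drop _merge

* Hypothetical output file name
save lots_merged, replace
```

If the two files share variable names other than lot_id, `merge` keeps the values from specific_lots by default, so it may be worth comparing the variable lists before merging.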
Another important fact that I just thought of is that no single variable in specific_lots.dta is unique to each observation. What I was thinking is that if I can append the year to 'lot_id' (whatever value lot_id already has plus the value of the variable 'year'), it can become a unique identifier to serve as the basis for merging. streets_and_dist.dta would then need the same identifier, which can happen if each of its observations is duplicated and the same years are added there. streets_and_dist.dta also has a variable called lot_id, which is unique to each observation before the duplication and will become unique again after the duplication once the years are added.
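Before building a combined identifier, it may help to check whether the identifiers behave as described. This is a small sketch, assuming both files are in the current working directory and that the variables are named lot_id and year as stated above. Note that `merge` accepts more than one key variable, so lot_id and year could be used together as a key without gluing them into a single variable.

```stata
* Sketch only: assumes the variable names lot_id and year given in the post.

* One row per house: this stops with an error unless lot_id
* uniquely identifies every observation.
use streets_and_dist, clear
isid lot_id

* About 10 rows per house: this reports how many observations share
* the same lot_id-year pair. If there are no surplus copies, the pair
* lot_id + year already identifies each row uniquely.
use specific_lots, clear
duplicates report lot_id year
```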
If you are so nice as to make a video of you explaining your steps as you do them, post it on YouTube (as storage space wouldn't be an issue), and send me a link and the merged file, that would be the most perfect help, as I would get the complete instructions on how to do this and the completed file to compare against when I follow the video. A good and free screen and audio recorder is OBS Studio, if you do want to be so kind and fully help me in this way.
Of course that is asking a lot so any help you are able to provide would be appreciated. Even just typing the instructions would help. I am desperate for any advice as I could not find how to do all of this during my searches.
Thank you all for reading this and helping or wishing me good luck. I really need it. I wish you all a good day and happy learning in your Stata endeavors.