I have a few thousand text files, and I use a nested foreach loop to import them and save them as .dta files. Here is that code:
foreach f of local list_of_folders {        // local macro holding all the folder names
    foreach x of local list_of_filenames {  // local macro holding all the varying text file names
        capture import delimited using "`f'/`x'", clear delimiter(tab)
        if c(rc) == 0 { // SUCCESSFUL IMPORT
            save "`f'/`x'.dta", replace
        }
        else if c(rc) != 601 { // UNANTICIPATED ERROR
            display as error "import delimited failed for unexpected reasons"
            exit c(rc)
        }
        else {
            display as text "file `f'/`x' not found -- skipped"
        }
    }
}
The data that I have is broken up quarterly across 12 years. Within each folder, I need to merge all of the files together. They all share a common identifying variable labelled idrssd. Then, once each quarter has merged successfully, I need to append the quarters together into one master dataset. The primary problem I'm having is that I'm not sure how, in one loop, to indicate which file the others should merge onto, because the file names are different for every quarter. It doesn't matter which file serves as the base for the merge, but for each quarter the file lists look like this:
POR_03312006.txt.dta CI_03312006.txt.dta etc....
POR_06302006.txt.dta CI_06302006.txt.dta etc....
and so on, all the way to quarter 4 of 2017:
POR_12312017.txt.dta CI_12312017.txt.dta etc....
I imagined some string command might help overcome this, but this is my first semester using Stata, and I am only a sophomore undergraduate, so I feel a bit in over my head. Any help would be appreciated, and I'm happy to provide any additional information I can. Thank you!
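One possible way to approach the merge-then-append step is sketched below. It is only a sketch under several assumptions: that each folder holds exactly one quarter, that idrssd uniquely identifies observations in every file (which merge 1:1 requires), and that the folder names and the output file name shown are hypothetical placeholders. Rather than building the date suffix with string functions, it asks Stata for whatever .dta files exist in each folder, so the loop never needs to know the file names in advance.

```stata
* Sketch only. Folder names and master_dataset.dta are hypothetical.
local list_of_folders q1_2006 q2_2006   // replace with the real folder names

tempfile master
local first_quarter 1

foreach f of local list_of_folders {
    * Every .dta file in this folder, whatever its date suffix is
    local files : dir "`f'" files "*.dta"

    local first_file 1
    foreach x of local files {
        if `first_file' {
            use "`f'/`x'", clear      // first file found becomes the base
            local first_file 0
        }
        else {
            * Assumes idrssd is unique within each file
            merge 1:1 idrssd using "`f'/`x'", nogenerate
        }
    }
    if `first_file' continue          // no .dta files here, skip the folder

    generate quarter = "`f'"          // record which folder each row came from

    * Stack this quarter onto the quarters processed so far
    if !`first_quarter' append using `master'
    save `master', replace
    local first_quarter 0
}

save master_dataset.dta, replace
```

Two caveats with this sketch: when files within a quarter share variable names other than idrssd, merge keeps the values from the data already in memory, and append will stop with an error if a variable is a string in one quarter and numeric in another, so variable types may need to be harmonized first.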