Dear Statalist,

First-time poster, so please be kind.

Background. I am currently building a panel database from several years of monthly excel regulatory data returns submitted by financial institutions to a regulator. Each monthly excel workbook/return contains several sheets, and I am building the database using data points from within each excel sheet (I am building the dataset sheet by sheet, hence all my coding is at the "sheet/institution/date level". My approach so far has been to merge the full data from each sheet once it's completed (i.e. I'll extract all the data for one sheet (for all institutions for all periods), before then merging them with other sheets' data).

Problem. My issue is that excel sheet names from my raw institution excel workbooks/returns are consistent only 95-ish% of the time. The issue arises because financnial institutions have occasionally edited the sheet name which I need to be consistent when I import the data within my loop (see code below). For example, my code stops as soon as the sheet name is "Sheet 1" i.e. with extra space, instead of "Sheet 1". "Sheet 1 " has also been a key contender for breaking my loop.

In essence, I somehow need to add in the flexibility to run my "import excel" and cleaning code for several possible sheet name variations. There are probably only 10 sheet name errors in the raw data files for each sheet (imagine the variations of "Sheet 1"), so I hope to be able to pick up each error and build it into my code as I loop through my raw excel files. I've got about 3000 files, so, after wasting a few days fixing the sheet names manually, I'm looking for a solution.

My failed attempts. Now I've tried to different combinations of "capture" , "if _rc != 0" , and "else" around the "import excel" code line that I had hoped would build in the flexibility I'm looking for. But my failures to date have been a reminder that I am, indeed, still a Stata amateur!

Any thoughts / comments would be greatly appreciated!

Cheers

Tom

Code:
global path "C:\Users..."
    local t = 1         \\ Ignore this line - I use it for the appending after the data has been imported and cleaned
    forvalues year = 2005/2016 {
        foreach month in January February March April May June July August September October November December {
            local file1 : dir "$path\\`year'\\`month'" files "*.xlsx"
            local file2 : dir "$path\\`year'\\`month'" files "*.xls"
            local files "`file1' `file2'"

            cd "$path\\`year'\\`month'"
            
            foreach file in `files' {
                tempfile file`t'
                display in red "`file'"

                import excel using "`file'", sheet(Sheet 1) clear           // Here is the culprit - I need this to work for different variations of "Sheet 1" as I come across the different flavors of error
        
                keep C-I        // I start my cleaning etc. here

                [Lots and lots of data cleaning]
                [and then some appending]