Checking merges over a very large number of files

My apologies in advance if this has been covered - I have looked but haven't found anything quite the same. I'd generally be using R and I'm finding I often come at problems in Stata in quite the wrong way.

I am using Stata 15.

I am trying to merge many (over 100) files for three years of survey data (over 300 separate files in total). Some variables (for example, name, id number, age) are listed in multiple files, but are inconsistent. This is not always due to missing values, but sometimes due to mistranslation of handwriting etc, so I'd like to flag when this occurs, rather update and replace.

As I understand, when Stata merges two files and I don't specify that two identical variable names should be matched, the 'using' data is dumped (or I can specify that it be updated or replaced, but I can't keep both in separate variables). I'd like to keep the variable from the 'using' file with a new name (or the old name with a suffix). For example, if I ran the following:

Code:

 
 set obs 100  forval j = 1/5 {          gen v`j' = uniform() } replace v1 = _n

save myfile1, replace
  
 set obs 100  forval j = 5/10 {         gen v`j' = uniform() } gen v1 = _n

save myfile2, replace

use myfile1, clear

merge 1:1 v1 using myfile2

I would like to generate a dataset with variable names something like:

v1 v2 v3 v4 v5 v5_suffix v6 v7 v8 v9 v10

Alternatively, I'd be happy if I could export list of variable names as a single line to the same excel file, where I could order them and rename/compare manually.

So far I have tried:

Code:

use "$survey/A", clear

global mylist B C D E F G H I J K 

foreach filename of global mylist {
use `"$survey/`filename'"' 
keep if _n==1

export excel * using "$constructedData/checks.xlsx",  firstrow(varlabels) append
}

Of course, this doesn't work because of the 'append' option. But is there a way I could do something that would give me a single file with all the variable names in each of the 100 files for each year?

I'd be very grateful for any advice.

BJ Data Tech Solution

Home / Data Cleaning / Data management / Data Processing / Checking merges over a very large number of files
Checking merges over a very large number of files

0 Response to Checking merges over a very large number of files

Post a Comment

Home / Data Cleaning / Data management / Data Processing / Checking merges over a very large number of files Checking merges over a very large number of files

Related Posts with Checking merges over a very large number of files

0 Response to Checking merges over a very large number of files

Post a Comment

Home / Data Cleaning / Data management / Data Processing / Checking merges over a very large number of files
Checking merges over a very large number of files