Hello,

I am working on a project linking mortality to NHANES survey data and would like some feedback on approaching the matter. I want to merge these files but I'm having some trouble. Here is a copy of my research log for reference.


ON SCREEN

https://www.cdc.gov/nchs/data-linkage/mortality.htm

NCHS has linked National Health and Nutrition Examination Surveys (NHANES): 1999-2014 with death certificate records from the National Death Index (NDI). Linkage of the survey participants with the NDI mortality data provides the opportunity to conduct a vast array of outcome studies designed to investigate the association of a wide variety of health factors with mortality. As of July, 2021, the Linked Mortality File (LMF) has been updated with mortality follow-up data through December 31, 2015.

ON SCREEN

https://www.cdc.gov/nchs/data-linkag...ity-public.htm

Public-use Linked Mortality Files (LMF) are available for 1999-2014 NHANES, and NHANES III. The files include a limited set of mortality variables for adult participants only. The public-use versions of the NCHS Linked Mortality Files were subjected to data perturbation techniques to reduce the risk of participant re-identification. For select records, synthetic data were substituted for follow-up time or underlying cause of death. Information regarding vital status was not perturbed.

The CDC default statistical software is SAS. Although a sample Stata program for reading in the LMF data is provided at the website, we experienced considerable difficulty reading in the LMF data with Stata Version 17 on a MacBook. Therefore, as a service to other researchers, we created this tutorial.

STEP 1

After reading all the information about linked mortality files at the website of CDC's National Center for Health Statistics, download the following from this webpage:

ON SCREEN

FILE EXPLORER SHOWING FILES IN DOC, DO AND DATA FOLDERS

Data files, data dictionaries, and the sample Stata program. We were able to do this without difficulty. Place the data files, dictionaries, and Stata Do file in separate folders.

There is generally one LMF data file for each two-year NHANES cycle from 1999-2000 through 2013-2014.

ON SCREEN

STATA DO FILE WITH SAMPLE PROGRAM FROM WEBSITE

STEP 2

Copy and paste the sample Stata Do file into a new Do file in Stata and save it with an appropriate file name.

STEP 3

Carefully read the instructions at the beginning of the Do file and delete the section of the Do file regarding the National Health Interview Survey. Save the altered Do file with a new name.

STEP 4

Open and save a Stata log file.
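The log can also be opened from the Do file itself. The lines below are a minimal sketch; the log file name is a hypothetical example, and the log is written to whatever folder is currently active.

```stata
* Close any log left open from an earlier session (no error if none is open)
capture log close

* Start a plain-text log; "nhanes_lmf_2007_2008.log" is a hypothetical name
log using "nhanes_lmf_2007_2008.log", text replace
```

The text option keeps the log readable in any editor, and replace overwrites an earlier log with the same name.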

ON SCREEN

STATA OUTPUT/LOG WINDOW

STEP 5

Using the Stata File dropdown menu, change the active directory to the folder containing the LMF data file. Then copy the CD statement that Stata displays into the Do file, replacing the generic statement in the sample Do file. This ensures that the path in the CD statement is correct. An incorrect path is a common error that causes the LMF file not to be found later.
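As an illustration, the pasted statement and two quick checks might look like the sketch below. The folder path is a hypothetical placeholder, not a path from our setup; use the one Stata echoes after you change the directory through the menu.

```stata
* Hypothetical path: replace it with the CD statement copied from the Stata output
cd "/Users/yourname/nhanes/data"

* Show the active folder and list the LMF data files found there
pwd
dir *.dat
```

If dir does not list the LMF file, the INFIX statement will not find it either.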

STEP 6

Note that, in our experience, Stata Version 17 did not recognize text/ASCII files with the extension ".dat". Create a copy of the LMF file and change the extension of the copy to ".txt".
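The copy can be made in the file explorer or, as sketched here, with Stata's copy command. This assumes the active directory is already the data folder and that the downloaded file name ends in lowercase ".dat"; adjust the name if your download differs.

```stata
* Copy the original file to a new file with a .txt extension.
* The original .dat file is left unchanged.
copy "NHANES_2007_2008_MORT_2015_PUBLIC.dat" "NHANES_2007_2008_MORT_2015_PUBLIC.txt", replace
```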

ON SCREEN

STATA DO FILE WINDOW

STEP 7

Also update the file name at the end of the INFIX statement to match the ".txt" copy of the first LMF data file you are reading in. In this example, the first file is for 2007-2008.

STEP 8

Run the Do file statements one section at a time, including the section with the INFIX statement. Check the output window for error messages after running each section. Choose an appropriate name for the Stata ".dta" file that is saved.

ON SCREEN

STATA OUTPUT/LOG WINDOW


STEP 8 (CONTINUED)

If you receive an Error 601 (file not found), check the following.

ON SCREEN

STATA DO FILE WINDOW

First, please make sure you remove the asterisks from the file name in your INFIX statement. According to your Stata log, your INFIX statement currently references the file as follows:

**NHANES_2007_2008**_MORT_2015_PUBLIC.txt

It should be referenced as:

NHANES_2007_2008_MORT_2015_PUBLIC.txt

Make sure you use the statements listed under the "NHANES VERSION" portion of the program.

Verify that your CD statement is directing Stata to the appropriate folder, and that both the original .DAT file and the copy with the .txt extension are saved in that folder.
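One way to run this check before the INFIX statement is sketched below. It uses the 2007-2008 file name from this example and assumes the CD statement has already been run.

```stata
* Show the active folder, then test whether the .txt copy is there
pwd
capture confirm file "NHANES_2007_2008_MORT_2015_PUBLIC.txt"
if _rc {
    display as error "File not found in the current folder; check the CD statement and the file name."
}
else {
    display as text "File found; the INFIX statement should be able to open it."
}
```

Run the whole block from the Do file editor so that the if/else lines are executed together.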

STEP 9

After saving the Stata .dta file, run the indicated descriptive statistics and check the results against the documentation for the number of records, permissible values for each variable, etc.
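In addition to the descriptive statistics in the sample program, a few general checks can be sketched as follows. The .dta file name is hypothetical (use the name you chose in Step 8), and the results still need to be compared by hand with the data dictionary.

```stata
* Hypothetical file name: use the .dta name chosen in Step 8
use "lmf_2007_2008.dta", clear

* Variable names, storage types and labels
describe

* Number of records, to compare with the documentation
count

* Stops with an error if seqn does not uniquely identify records
isid seqn

* One line per variable: non-missing count, unique values, minimum, maximum
codebook, compact
```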

STEP 10

If no errors are found, save the Do file. Repeat this process for each two-year LMF file, from the earliest cycle needed through 2013-2014. Once all are saved as Stata .dta files, the Stata append command can be used to combine the files for all years needed. Then the Stata merge command can be used to merge the LMF file with the NHANES survey files, matching on seqn.

STEP 11

STATA COMMAND WINDOW SHOWING MERGE

Each two-year LMF file contains complete follow-up information through 2015 for those surveyed in the indicated years. For example, NHANES_2007_2008_MORT_2015_PUBLIC.txt contains the results of record linkage with the National Death Index through 2015 for eligible persons aged 18 and older interviewed in NHANES 2007-2008. Therefore, this file is to be merged with the 2007-2008 demographic and other relevant files using the Stata merge command with seqn as the key variable.
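A minimal sketch of this merge for one cycle is shown below. Both .dta file names are hypothetical, and the sketch assumes seqn has the same name and the same storage type (numeric or string) in both files; merge stops with an error if the types differ.

```stata
* Hypothetical file names: the 2007-2008 demographic file and the LMF file from Step 8
use "demo_2007_2008.dta", clear
merge 1:1 seqn using "lmf_2007_2008.dta"

* Show how many records matched and how many appear in only one file
tabulate _merge

save "nhanes_lmf_2007_2008.dta", replace
```

The 1:1 form is used because each participant should appear once in each file. Stata reports an error if seqn is duplicated in either one.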

STEP 12

STATA COMMAND WINDOW SHOWING APPEND

To combine multiple two-year NHANES cycles, append the merged survey and LMF files, e.g., to 2007-2008 append 2009-2010 and 2011-2012.
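The append for the three cycles in this example might look like the sketch below. The file names are hypothetical and assume each cycle was already merged and saved as in Step 11.

```stata
* Hypothetical file names for the merged files of each cycle
use "nhanes_lmf_2007_2008.dta", clear
append using "nhanes_lmf_2009_2010.dta" "nhanes_lmf_2011_2012.dta"

* Stops with an error if any seqn appears more than once in the combined file
isid seqn

save "nhanes_lmf_2007_2012.dta", replace
```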


DRAFT 08/05/21