This repository contains instructions and example code for loading and analyzing data from the Agency for Healthcare Research and Quality's Medical Expenditure Panel Survey (MEPS) Household Component (HC). Quick reference guides are also provided for convenience.
- MEPS Workshops
- Survey Background
- Accessing MEPS-HC data
- Analyzing MEPS-HC data
- Additional Survey Components
- Contact MEPS
Example code for loading and analyzing MEPS data is available in the R, SAS, and Stata folders. These folders also include example exercises from recent MEPS workshops. In addition, the SAS folder contains exercises from older workshops (1996-2006).
Note to User: All code provided in this repository is intended as an example for loading and analyzing MEPS data. AHRQ cannot certify the quality of your analysis. It is the user's responsibility to verify the accuracy of the results.
The Agency for Healthcare Research and Quality (AHRQ) conducts several workshops throughout the year. These workshops provide extended knowledge about MEPS data, practical information about usage of MEPS public use data files and an opportunity to construct analytic files with the assistance of AHRQ staff. The workshops are designed for health services researchers who have a background or interest in using national health surveys. For questions regarding MEPS Workshops, please contact Anita Soni at [email protected]. Information on upcoming workshops is posted on the MEPS website. Check back regularly for updates.
The agenda, presentation slides, and programming exercises for the most recent workshop are available in the MEPS-workshop repository.
The Medical Expenditure Panel Survey, which began in 1996, is a set of large-scale surveys of families and individuals, their medical providers (doctors, hospitals, pharmacies, etc.), and employers across the United States. The MEPS Household Component (MEPS-HC) survey collects information from families and individuals pertaining to medical expenditures, conditions, and events; demographics (e.g., age, ethnicity, and income); health insurance coverage; access to care; health status; and jobs held. Historically, each surveyed household was interviewed five times (rounds) over a two-year period. However, in the spring of 2020, the COVID-19 pandemic created significant challenges to in-person data collection. To offset the decrease in the number of cases for 2020 data, Panels 23 and 24 were extended to nine rounds (four years) of data collection, so that data year 2020 includes Rounds 6-7 from Panel 23, Rounds 3-5 from Panel 24, and Rounds 1-3 from Panel 25.
The MEPS-HC is designed to produce national and regional estimates of the health care use, expenditures, sources of payment, and insurance coverage of the U.S. civilian noninstitutionalized population. The sample design of the survey includes weighting, stratification, clustering, multiple stages of selection, and disproportionate sampling.
Data from the Household Component of MEPS are available for download as public use files. For data years 2017 and later (and also for the 2016 Medical Conditions file), .zip files for multiple file formats are available, including ASCII (.dat), SAS V9 (.sas7bdat), Stata (.dta), and Excel (.xlsx). Prior to 2017, ASCII (.dat) and SAS transport (.ssp) files are provided for all datasets. The following table summarizes the various file formats available by data year:
Data Years | ASCII<sup>1</sup> (.dat) | SAS XPORT (.ssp) | SAS V9 (.sas7bdat), Stata (.dta), Excel (.xlsx) |
---|---|---|---|
2017 and later | X | (not recommended)<sup>2</sup> | X |
2016 Medical Conditions file | X | (not recommended)<sup>2</sup> | X |
All other 2016 files | X | X | |
1996-2015 | X | X | |
1 Additional programming statements with column widths, types, and names are required to read ASCII files. SAS and Stata programming statements are available for all data files. R programming statements are available for most files from data years 2018 and later.
2 Starting with 2017 data files (and also for the 2016 Medical Conditions file), SAS Transport formats for most of the MEPS Public Use Files were converted from the SAS XPORT to the SAS CPORT engine. Importing XPORT and CPORT files into SAS requires different procedures. In addition, CPORT data files cannot be read directly into R or Stata; alternative file formats must be used. More details are available in the sub-folders for each programming language.
Zip files of each data format can be downloaded from the web page for each MEPS public use file.
The steps for loading MEPS files into R, SAS, and Stata depend on the file type being used. Details for loading MEPS data in these languages are available in the corresponding folders.
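As a rough illustration of why ASCII (.dat) files need programming statements with column positions, types, and names, the following Python sketch parses fixed-width records using the standard library only. The layout shown (variable names, positions, and sample records) is hypothetical and is not taken from any MEPS file; the actual positions come from the programming statements supplied with each public use file.

```python
# Hypothetical layout: (name, start, end, type). Positions are 1-based and
# inclusive. These names and positions are illustrative only and do NOT
# describe a real MEPS file.
LAYOUT = [
    ("DUPERSID", 1, 10, str),   # IDs stay as text so leading zeros survive
    ("AGE", 11, 12, int),
    ("TOTEXP", 13, 19, float),
]


def parse_line(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict of typed values."""
    record = {}
    for name, start, end, cast in layout:
        raw = line[start - 1:end].strip()
        record[name] = raw if cast is str else cast(raw)
    return record


def parse_fixed_width(lines, layout=LAYOUT):
    """Parse an iterable of fixed-width lines, skipping blank lines."""
    return [parse_line(line.rstrip("\n"), layout) for line in lines if line.strip()]


# Made-up sample records that match the hypothetical layout above.
sample = [
    "232000110134   1250\n",
    "2320001102 7    310\n",
]
records = parse_fixed_width(sample)
```

With a real file, the same functions could be applied to an open file handle instead of the `sample` list, once `LAYOUT` is filled in from the file's own programming statements.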
The complex survey design of MEPS requires special methods for analyzing MEPS data. These tools are available in many common programming languages. Failure to account for the survey design can result in biased estimates. Details and examples of using the appropriate survey methods are provided in the R, SAS, and Stata folders. Additional examples comparing these three languages can be found in the quick reference guide meps_programming_statements.md.
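To make the role of the survey design concrete, here is a minimal Python sketch of a weighted mean and its Taylor-series (linearization) standard error, treating PSUs as sampled with replacement within strata. The data are made up, and the field names `stratum`, `psu`, and `weight` are stand-ins for the corresponding design variables in a MEPS file (for example, VARPSU and the person weight). This is an illustration of the idea only; for actual analyses, use the survey procedures demonstrated in the R, SAS, and Stata folders.

```python
from collections import defaultdict
from math import sqrt

# Made-up records: (stratum, psu, weight, expenditure). Not MEPS data.
DATA = [
    (1, 1, 100.0, 200.0),
    (1, 1, 150.0, 0.0),
    (1, 2, 120.0, 500.0),
    (2, 1, 300.0, 50.0),
    (2, 2, 250.0, 1000.0),
    (2, 2, 80.0, 20.0),
]


def weighted_mean(data):
    """Survey-weighted mean: sum(w * y) / sum(w)."""
    total_w = sum(w for _, _, w, _ in data)
    return sum(w * y for _, _, w, y in data) / total_w


def linearized_se(data):
    """Linearization standard error of the weighted mean.

    Assumes PSUs are sampled with replacement within strata and that
    every stratum contains at least two PSUs.
    """
    total_w = sum(w for _, _, w, _ in data)
    mean = weighted_mean(data)

    # Sum the linearized values w * (y - mean) / sum(w) within each PSU.
    psu_totals = defaultdict(float)
    for stratum, psu, w, y in data:
        psu_totals[(stratum, psu)] += w * (y - mean) / total_w

    # Group the PSU totals by stratum.
    by_stratum = defaultdict(list)
    for (stratum, _), z in psu_totals.items():
        by_stratum[stratum].append(z)

    # Between-PSU variance within each stratum, summed over strata.
    variance = 0.0
    for zs in by_stratum.values():
        n_h = len(zs)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        z_bar = sum(zs) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in zs)
    return sqrt(variance)


estimate = weighted_mean(DATA)
std_error = linearized_se(DATA)
```

In this toy data set the weighted mean (about 346.6) differs noticeably from the unweighted mean of the same six values (295.0), which is the kind of bias that ignoring the weights can introduce. The standard error depends on the strata and PSUs, not only on the weights.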
When analyzing MEPS data, it is the user's responsibility to ensure that sample sizes and precision are adequate for the user's purposes. Please refer to AHRQ's guidelines for specific recommendations.
When analyzing multiple years of MEPS data, it is important to note that MEPS variable names may differ across years. Here are just a few examples:
- 1996: Round-specific variables have only one round number (e.g. AGE2X instead of AGE42X)
- 1996-1998: PERWT variable is WTDPERyy (yy = '96', '97', '98')
- 1996-2001: VARPSU variable has 2-digit year at end (e.g. VARPSU96)
- 2018 and later: Panel number is appended to the beginning of Person ID variable (DUPERSID)
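When pooling years, naming differences like those above can be handled with small lookup helpers. The Python sketch below encodes only the weight and PSU rules from the list. The function names are hypothetical, and the `PERWTyyF` pattern used for years after 1998 is an assumption about the full-year person weight name; confirm both against the codebook for each file.

```python
def _two_digit_year(year):
    """Return the 2-digit year suffix, e.g. 1996 -> '96', 2005 -> '05'."""
    if year < 1996:
        raise ValueError("MEPS data begin in 1996")
    return f"{year % 100:02d}"


def weight_variable(year):
    """Name of the person-level weight variable for a data year."""
    yy = _two_digit_year(year)
    if year <= 1998:
        return f"WTDPER{yy}"
    # Assumption: later years follow the PERWTyyF pattern; check the codebook.
    return f"PERWT{yy}F"


def psu_variable(year):
    """Name of the PSU variable for a data year."""
    yy = _two_digit_year(year)
    if year <= 2001:
        return f"VARPSU{yy}"
    return "VARPSU"


# Build a per-year lookup of design variable names for a pooled analysis.
design_vars = {
    year: {"weight": weight_variable(year), "psu": psu_variable(year)}
    for year in (1997, 2001, 2019)
}
```

A lookup like `design_vars` can then drive a renaming step so that each year's file exposes the same column names before the files are stacked.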
The MEPS-HC Variable Explorer Tool can be used to search variables and labels across the years of the most commonly used MEPS public-use files. For detailed information on variables in a specific file, refer to the documentation and codebook that is provided with each data file.
In addition to the Household Component (MEPS-HC), MEPS includes two other components: the MEPS Medical Provider Component (MEPS-MPC) and the MEPS Insurance Component (MEPS-IC). The MEPS-MPC survey collects information from providers of medical care that supplements the information collected from persons in the MEPS-HC sample in order to provide the most accurate cost data possible. The MEPS-IC survey collects information from employers in the private sector and state and local governments on the health insurance coverage offered to their employees. It also includes information on the number and types of private health insurance plans offered, benefits associated with these plans, annual premiums and contributions to premiums by employers and employees, and copayments and coinsurance, broken down by various employer characteristics (for example, state, industry, and firm size). Interactive summary tables, as well as complete PDF links of the table series, are available at the MEPS-IC Data Tools page. Additional information about the MEPS-IC can be found on the MEPS website.
Special permissions are required to access microdata from the MPC and IC components. Access to the MEPS-MPC data can be requested from the AHRQ data center. For MEPS-IC data, researchers with approved projects can access the data at one of the Federal Statistical Research Data Centers. If you would like to use MEPS-IC data, please contact AHRQ so that we can assist you with your request.
Please review the Frequently Asked Questions on our website before contacting us to see if we already have an answer to your question. Also read our Privacy Policy for answers to any questions you may have about the use of your e-mail address.
You can contact us by e-mail, mail, or telephone:
MEPS Project Director
Medical Expenditure Panel Survey
Agency for Healthcare Research and Quality
5600 Fishers Lane
Rockville, MD 20857
[email protected]
(301) 427-1406