About EPDC Household Survey Data

Household-Based Education Data

Household-based education data are derived from information gathered from households through a survey or a census. They are a valuable complement to administrative (school-based) data because they can be linked to household characteristics such as income level, and because they provide information on out-of-school children and other populations that are not measured through school-based instruments.

The EPDC collection of household-based education data consists primarily of indicators calculated by EPDC using microdata made public through the Demographic and Health Surveys (DHS) and Multiple Indicator Cluster Survey (MICS) programs. It also includes data calculated using survey microdata from other sources, as well as samples of census microdata made public through the Integrated Public Use Microdata Series, International (IPUMS) program. By calculating education indicators directly rather than using those published in survey program reports, EPDC is able to improve the consistency and reliability of indicators in this data collection.

What follows is a general collection of notes that may be of interest to researchers using the EPDC collection of household survey data. A more detailed collection of notes specific to each household survey dataset can be found here.

General Notes

Data Source: Most household survey data in the EPDC database are indicators calculated by EPDC using microdata obtained through DHS, MICS, IPUMS, or other programs. For reasons described below, indicator values calculated by EPDC may differ from indicator values published in survey reports.

Indicator values calculated by EPDC have a data source name like:

“EPDC extraction of DHS dataset”

Indicator values taken from a survey report have a data source name like:

“DHS Report”

Indicator values calculated by a third party have a data source name like:

“Household survey data compiled by Sistema de Información de Tendencias Educativas en América Latina (SITEAL)”


Year: For EPDC extractions of household-based data, education indicators from a single survey or census dataset may not all be recorded as corresponding to a single year. This is because indicators from the same survey often represent different periods of time. For example, a typical DHS or MICS questionnaire names one school year in the question used to gather information on school participation, and the preceding school year in the question used to calculate pupil flow rates. Attainment and literacy rates may be recorded as corresponding to a third year if the bulk of the survey was carried out in a year different from the school years named in the education module questions. Survey questionnaires can be accessed through the website of the implementing organization. Additional notes on specific dataset extractions can be found here.

Duration and Entry Age of Formal School Programs: Whenever possible, EPDC extractions of household-based data use a country’s own definition of the official entry age of primary school and the number of grades in each level of the school system. These definitions are taken from the edition of the UNESCO IBE World Data on Education (WDE) Report that is appropriate for the date of the survey. In cases where the school system parameters described in IBE WDE reports differ from those of the ISCED classification system and/or the default coding of variables in the dataset itself, EPDC adjusts the dataset to fit the IBE WDE parameters.

In cases where the national definition of the ages and grades associated with primary school differs from the parameters of ISCED-1 for the same country, EPDC calculates a subset of attendance/non-attendance indicators twice: once using the ISCED-1 definition of primary and once using the national definition of primary for that country. Both sets of indicators are labeled in the database as ‘Primary School,’ but they have different age ranges displayed in the ‘Age’ column.
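As a rough illustration of why the two definitions yield different values, the sketch below computes a simple weighted attendance share for the same group of children under two age ranges. It is not EPDC's actual procedure: the function name, the record layout (age, attending primary, sample weight), the sample records, and both age ranges are hypothetical and chosen only for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical record layout: (age in years, attends primary school, sample weight)
Child = Tuple[int, bool, float]

def attendance_rate(children: List[Child], age_min: int, age_max: int) -> Optional[float]:
    """Weighted share of children aged age_min..age_max who attend primary school."""
    in_range = [c for c in children if age_min <= c[0] <= age_max]
    total_weight = sum(weight for _, _, weight in in_range)
    if total_weight == 0:
        return None  # no children in the age range
    attending_weight = sum(weight for _, attends, weight in in_range if attends)
    return attending_weight / total_weight

# Made-up sample: six children with equal weights.
children = [
    (6, True, 1.0),
    (7, True, 1.0),
    (10, True, 1.0),
    (11, False, 1.0),
    (12, True, 1.0),
    (13, False, 1.0),
]

# Assumed ISCED-1 ages 6-11 versus an assumed national definition of ages 6-13.
isced_rate = attendance_rate(children, 6, 11)
national_rate = attendance_rate(children, 6, 13)
print("Primary School, ages 6-11:", isced_rate)
print("Primary School, ages 6-13:", national_rate)
```

Both results would carry the same ‘Primary School’ label, so the age range is the only way to tell them apart.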

There are exceptional cases where EPDC elected not to, or was unable to, adjust a dataset to meet the country definition of school system parameters. For a full explanation of the parameters used by EPDC for a particular dataset, users may consult the EPDC extraction notes.

Age Adjustment: For education indicators that are age-sensitive (out-of-school, attendance, intake, and completion rates; percentage of children overage, underage, or on time for grade), EPDC adjusts ages from the month in which data were collected to the month in which the school year began.
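The idea behind this adjustment can be sketched as follows. This is a minimal illustration, not EPDC's actual code: it assumes that the child's birth year and month are known, and the function name and all dates are hypothetical.

```python
def age_in_completed_years(birth_year: int, birth_month: int,
                           ref_year: int, ref_month: int) -> int:
    """Age in completed years at a reference month, using year and month only."""
    months = (ref_year - birth_year) * 12 + (ref_month - birth_month)
    if months < 0:
        raise ValueError("reference month is before the birth month")
    return months // 12

# Hypothetical child born in November 2003, interviewed in March 2010,
# in a country where the school year began in September 2009.
age_at_interview = age_in_completed_years(2003, 11, 2010, 3)
age_at_school_start = age_in_completed_years(2003, 11, 2009, 9)
print(age_at_interview, age_at_school_start)  # prints: 6 5
```

In this made-up case the child is 6 at the time of the interview but was 5 when the school year began, so an age-sensitive indicator would treat the child as 5.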