Data Collection

Data collection methodology


The target population for HES is the usually resident population of New Zealand living in private dwellings, aged 15 years and over (15+). This population does not include:

  • overseas visitors who expect to be resident in New Zealand for less than 12 months
  • people living in non-private dwellings (eg hotels, motels, boarding houses, hostels, and homes for the elderly)
  • patients in hospitals, or residents of psychiatric or penal institutions
  • members of the permanent armed forces in group living facilities (eg barracks)
  • people living on offshore islands (excluding Waiheke Island)
  • members of the non-New Zealand armed forces
  • non-New Zealand diplomats and their families.

Children at boarding schools are also not surveyed, but housing costs on behalf of those children are included in the record-keeping of the parent or guardian. The survey population is therefore marginally different from the target population.

For survey purposes, a ‘household’ comprises a group of people who share a private dwelling and normally spend four or more nights a week in the household. They must share consumption of food or contribute some portion of income towards the provision of essentials for living as a group.

Further information about the methodology used to prepare child poverty statistics can be found in the Child poverty statistics: Technical Appendix 2018/19.

HES components

HES (Income) has four survey components:

  • a household questionnaire
  • a housing expenditure questionnaire
  • an income questionnaire for each household member aged 15+
  • a material well-being questionnaire for one member per household who is aged 18+ (chosen randomly).

Sample design information

We select the sample for the HES using a two-stage stratified cluster design. Households are sampled on a statistically representative random basis, from rural and urban areas throughout the North and South islands.
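As a rough illustration of a two-stage stratified cluster design, the sketch below first samples clusters (primary sampling units) within each stratum and then samples dwellings within each selected cluster. The frame layout, the function name `two_stage_sample`, and the use of simple random sampling at both stages are assumptions made for this example; the selection method actually used at each stage is not described here.

```python
import random

def two_stage_sample(frame, psus_per_stratum, dwellings_per_psu, seed=0):
    """Draw a two-stage stratified cluster sample.

    frame: {stratum: {psu_id: [dwelling_id, ...]}} (hypothetical layout).
    Returns a list of (stratum, psu_id, dwelling_id, selection_probability).
    """
    rng = random.Random(seed)
    selected = []
    for stratum, psus in frame.items():
        # Stage 1: simple random sample of clusters (PSUs) within the stratum.
        n_psu = min(psus_per_stratum, len(psus))
        p_stage1 = n_psu / len(psus)
        for psu_id in rng.sample(sorted(psus), n_psu):
            dwellings = psus[psu_id]
            # Stage 2: simple random sample of dwellings within the cluster.
            n_dw = min(dwellings_per_psu, len(dwellings))
            p_stage2 = n_dw / len(dwellings)
            for dwelling_id in rng.sample(dwellings, n_dw):
                selected.append((stratum, psu_id, dwelling_id, p_stage1 * p_stage2))
    return selected

# Made-up frame: 10 urban clusters of 20 dwellings, 6 rural clusters of 10.
frame = {
    "urban": {f"U{i}": [f"U{i}-{j}" for j in range(20)] for i in range(10)},
    "rural": {f"R{i}": [f"R{i}-{j}" for j in range(10)] for i in range(6)},
}
sample = two_stage_sample(frame, psus_per_stratum=3, dwellings_per_psu=5)
```

Each selected dwelling carries its overall selection probability (the product of the two stage probabilities), which is what a design weight would later be built from.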

In 2019, the HES sample was increased to 28,500 dwellings for HES (Income), with a subsample of 5,500 dwellings for HES (Expenditure), to provide a more accurate picture of child poverty in New Zealand.

Reliability of survey estimates

Two types of error are possible in estimates based on a sample survey – sampling error and non-sampling error.

###Sampling error:

Sampling error is a measure of the variability that occurs by chance because a sample rather than an entire population is surveyed.

We calculate sampling errors using the jackknife method. It is based on the variation between estimates of different subsamples taken from the whole sample.
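The idea behind the jackknife can be sketched as follows. This is a minimal delete-a-group version for a weighted mean, assuming each sampled unit has already been assigned to a replicate group; the group assignment, the function names, and the choice of the delete-a-group variant are assumptions for illustration rather than a description of the production method.

```python
from math import sqrt

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def jackknife_se(values, weights, groups):
    """Delete-a-group jackknife standard error of a weighted mean.

    groups[i] is the replicate group label of unit i (hypothetical assignment).
    Returns (full-sample estimate, jackknife standard error).
    """
    labels = sorted(set(groups))
    g = len(labels)
    full = weighted_mean(values, weights)
    replicate_estimates = []
    for label in labels:
        # Re-estimate with one group of units removed from the sample.
        keep = [i for i, grp in enumerate(groups) if grp != label]
        replicate_estimates.append(
            weighted_mean([values[i] for i in keep], [weights[i] for i in keep])
        )
    # Variation of the replicate estimates around the full-sample estimate.
    variance = (g - 1) / g * sum((r - full) ** 2 for r in replicate_estimates)
    return full, sqrt(variance)

# Made-up data: five units with equal weights, one unit per replicate group.
estimate, se = jackknife_se([1, 2, 3, 4, 5], [1, 1, 1, 1, 1], [0, 1, 2, 3, 4])
```

With equal weights and one unit per group, this reduces to the familiar standard error of a simple mean, which is a convenient sanity check.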

Given a certain sample size, the level of sampling error for any given estimate depends on the number of sampled households/individuals in the category of interest and the variability of the estimate due to the random nature of the sample selection.

As the size of the sampled group decreases, the relative sampling errors (RSEs – sampling error as a percentage of the estimate) will generally increase. For example, the estimated average annual household income from self-employment would have a larger RSE than the estimated average annual household income for households receiving income from wages and salaries, because fewer sampled households receive self-employment income.

In the tables provided with our HES releases, only income or expenditure estimates with RSEs less than or equal to 20 percent are considered sufficiently reliable for most purposes. Estimates with RSEs over 21 percent are also included, but should be used with caution. Estimates with RSEs over 100 percent are also provided, but they are not deemed very useful.
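The RSE calculation and the reliability bands described above can be sketched as follows. The function names and flag labels are hypothetical, and treating every RSE above 20 percent and up to 100 percent as "use with caution" is an assumption about how the boundaries are applied.

```python
def relative_sampling_error(estimate, standard_error):
    """Sampling error expressed as a percentage of the estimate."""
    # Note: undefined for an estimate of zero (raises ZeroDivisionError).
    return 100.0 * standard_error / abs(estimate)

def reliability_flag(rse_percent):
    """Map an RSE (in percent) to an illustrative reliability label."""
    if rse_percent <= 20:
        return "reliable"
    if rse_percent <= 100:
        return "use with caution"
    return "not very useful"

# Made-up estimate of 50,000 with a standard error of 2,500.
rse = relative_sampling_error(50000, 2500)
flag = reliability_flag(rse)
```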

###Non-sampling errors:

Non-sampling errors arise from biases in the patterns of response and non-response, questionnaire design, inaccuracies in reporting by respondents, and errors in recording and coding data. We endeavour to minimise the impact of these errors by applying best-practice survey methods and monitoring known indicators (eg non-response).


A proxy may provide information in ‘family type’ households where:

  • the whole household is informed about the survey. All agree to participate, but are not able to be present when the questionnaires are administered
  • children are away at boarding school
  • people don't work and have no source of income
  • people are elderly, sick, or mentally incapacitated.

In all proxy interviews, the interviewer must be convinced the proxy is totally familiar with the other respondent’s information.


Imputation replaces missing values with actual values from similar respondents.

Two imputation methods are used in HES – nearest neighbour donor imputation and median imputation (the latter for expenditure only).

The nearest neighbour donor imputation method replaces missing values by data values from another record called a donor. A donor is selected by finding a respondent with matching characteristics to the recipient.
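A minimal sketch of nearest neighbour donor imputation is shown below. It assumes records are dictionaries, that a missing value is stored as `None`, and that closeness is measured by the number of matching variables on which two records differ; the variable names, the distance measure, and the example values are all made up for illustration.

```python
def impute_from_nearest_donor(records, target, match_vars):
    """Fill missing `target` values (None) from the closest complete record."""
    donors = [r for r in records if r[target] is not None]
    completed = []
    for rec in records:
        if rec[target] is not None:
            completed.append(dict(rec))
            continue
        # Distance = number of matching variables on which the records differ.
        # Ties are broken by taking the first donor found.
        donor = min(donors, key=lambda d: sum(d[v] != rec[v] for v in match_vars))
        completed.append({**rec, target: donor[target], "imputed": True})
    return completed

# Made-up records: the third respondent did not report income.
records = [
    {"id": 1, "region": "Auckland", "age_group": "30-34", "employed": True, "income": 62000},
    {"id": 2, "region": "Otago", "age_group": "65-69", "employed": False, "income": 24000},
    {"id": 3, "region": "Auckland", "age_group": "30-34", "employed": True, "income": None},
]
completed = impute_from_nearest_donor(
    records, target="income", match_vars=["region", "age_group", "employed"]
)
```

Here the third record receives the income of the first, which matches it on all three characteristics, and is flagged as imputed so the value can be identified later.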

We introduced donor imputation into HES in 2009/10, and now use it in all HES releases. We also applied imputation to every previous HES cycle and revised the data accordingly.

The donor imputation is applied to a household where the household does not supply all the required income or expenditure information but supplies sufficient information to be retained in the sample.

We also impute local and regional council rates for respondents who have not provided enough information for us to calculate their rates. A form of manual imputation is used to impute interest rates.

For more details on the imputation methodology adopted in each survey, please refer to the respective survey information.

##Population rebase

HES is a sample survey that uses statistical weights to calculate income, housing costs and material well-being estimates for the total New Zealand population. We revise the weights following each census, based on the latest population counts (called a population rebase).

For HES 2018/19, we have used the weights based on the 2013 Census population. We propose to rebase these estimates using 2018 Census data in due course.

We applied the last rebase to HES in the 2014/15 HES (Income) year. The revised data applied to the income, housing costs and material well-being data from 2006/07 to 2013/14, and to the expenditure data for 2006/07, 2009/10, and 2012/13.

See Household Economic Survey population rebase: year ended June 2007–15 for more information about the revisions.

##Population weighting adjustments

To enable us to infer from the sample to the target population we must weight the sample data. This entails assigning each responding or imputed unit in the sample a weight that indicates the number of people it represents in the final population estimate.
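To make the role of a weight concrete, the sketch below treats each respondent's weight as the number of people they represent and forms population estimates as weighted sums. The figures and the function name are invented for illustration.

```python
def weighted_total(values, weights):
    """Estimate a population total as the weighted sum of sample values."""
    return sum(v * w for v, w in zip(values, weights))

# Made-up sample: three respondents, each representing `weight` people.
incomes = [40000, 55000, 70000]
weights = [150, 200, 100]

population = sum(weights)                       # people represented in total
total_income = weighted_total(incomes, weights)
mean_income = total_income / population
```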

The population weighting process takes account of under-coverage in the survey for specific population groups, such as young males and Māori.

Weighting ensures that estimates reflect the sample design, adjusts for non-response, and aligns estimates with the current population estimates. For household economic surveys, deriving the weight is a multi-phase process.

The first stage of weighting involves calculating a unit’s initial weight. The initial weight depends on the sample design and equals the inverse of the selection probability.
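As a small worked example, the initial weight is just the reciprocal of the unit's overall selection probability. The function name and the probabilities below are hypothetical.

```python
def initial_weight(selection_probability):
    """Initial (design) weight: the inverse of the selection probability."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability

# Made-up two-stage example: the cluster is selected with probability 0.5
# and the dwelling within it with probability 0.25.
weight = initial_weight(0.5 * 0.25)
```

A dwelling selected with an overall probability of 0.125 therefore starts with a weight of 8, meaning it stands in for eight dwellings before any further adjustment.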

The second stage involves adjusting the initial weights to account for unit non-response. Unit non-response refers to a household that provides no information, or where the amount and/or quality of the information provided is insufficient to count as a response. The initial weight of a non-responding unit is reduced to zero, while the initial weights of responding units are scaled up by combining factors within the weighting cells (eg region, ethnic densities, New Zealand Deprivation Index, and interview quarter).
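One common way to carry out this kind of adjustment is sketched below: within each weighting cell, respondents' weights are multiplied by the ratio of the total initial weight to the responding initial weight, so the cell's weighted total is preserved. The exact adjustment factor is not specified here, so this ratio, the record layout, and the cell labels are assumptions for illustration.

```python
from collections import defaultdict

def nonresponse_adjust(units):
    """Scale up respondents' weights within each weighting cell.

    units: list of dicts with 'cell', 'weight' and 'responded' keys
    (hypothetical layout). Returns copies with an 'adjusted_weight' key.
    """
    total = defaultdict(float)       # sum of initial weights per cell
    responding = defaultdict(float)  # sum of respondents' weights per cell
    for u in units:
        total[u["cell"]] += u["weight"]
        if u["responded"]:
            responding[u["cell"]] += u["weight"]

    adjusted = []
    for u in units:
        if u["responded"]:
            factor = total[u["cell"]] / responding[u["cell"]]
            adjusted.append({**u, "adjusted_weight": u["weight"] * factor})
        else:
            # Non-responding units drop out with a weight of zero.
            adjusted.append({**u, "adjusted_weight": 0.0})
    return adjusted

# Made-up cell with three sampled households, one of which did not respond.
units = [
    {"cell": "Auckland-Q1", "weight": 100.0, "responded": True},
    {"cell": "Auckland-Q1", "weight": 100.0, "responded": True},
    {"cell": "Auckland-Q1", "weight": 100.0, "responded": False},
]
adjusted = nonresponse_adjust(units)
```

In this example the two responding households each move from a weight of 100 to 150, so the cell still represents 300 households in total.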

The final stage in the weighting process is calibration to benchmarks (auxiliary information). Calibration adjusts for under-coverage of the target population. We use a form of calibration called integrated weighting to ensure that all individuals in the same household have the same weight and that household statistics derived from person-level data match the same statistics calculated directly from household-level data.
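The principle of integrated weighting can be illustrated with a simple linear calibration carried out at the household level: each household's auxiliary vector counts its members in each benchmark category, one weight is solved for per household, and every member inherits that weight. The sketch below assumes linear (GREG-style) calibration and made-up benchmarks; the calibration method actually used is not specified here, and linear calibration can in general produce negative weights, which a production system would need to guard against.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def integrated_weights(households, benchmarks):
    """Calibrate one weight per household to person-level benchmarks.

    households: list of (input_weight, counts), where counts[k] is the number
    of household members in benchmark category k (hypothetical layout).
    benchmarks[k] is the population total for category k.
    """
    k = len(benchmarks)
    # Normal equations for w_h = d_h * (1 + x_h . lam).
    a = [[sum(d * x[i] * x[j] for d, x in households) for j in range(k)]
         for i in range(k)]
    gap = [benchmarks[i] - sum(d * x[i] for d, x in households) for i in range(k)]
    lam = solve_linear(a, gap)
    return [d * (1 + sum(l * xi for l, xi in zip(lam, x))) for d, x in households]

# Made-up households: (input weight, [children, adults]).
households = [(100.0, [2, 2]), (100.0, [0, 1]), (100.0, [1, 2]), (100.0, [0, 2])]
benchmarks = [330.0, 720.0]  # invented population totals: children, adults
calibrated = integrated_weights(households, benchmarks)
```

Because the weight is attached to the household and shared by its members, the weighted count of children or adults is the same whether it is computed from person records or from household records.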

We have made changes to our weighting methodology for HES 2018/19. Please refer to the document Changes to the Household Economic Survey year ended June 2019 for details.

##HES benchmarks

The benchmark variables used for the HES were obtained from two sources: benchmarks based on the estimated resident population (ERP), and benchmarks from admin data available in the IDI.

The person benchmark variables/categories used for the income and housing cost statistics are:

  • regional population estimates (by 12 regions including Northland, Auckland, Waikato, Bay of Plenty, Gisborne and Hawke’s Bay, Taranaki, Manawatu-Wanganui, Wellington, West Coast-Tasman-Nelson/Marlborough, Canterbury, Otago, and Southland);

  • children sub-population estimates by three age groups;

  • adult sub-population estimates by sex and 14 age groups (15-17 years, 18-19 years, 20-24 years, 25-29 years, 30-34 years, 35-39 years, 40-44 years, 45-49 years, 50-54 years, 55-59 years, 60-64 years, 65-69 years, 70-74 years, 75+ years); and

  • adult Māori sub-population estimates by two age groups (15-29 years and 30 years and over) from ERP.

Two further benchmarks were obtained from the admin data: the number of people who receive any type of benefit, and the income distribution of individuals aged 15+ years.

The household benchmarks obtained from ERP are two categories of household composition (two-adult households and non-two-adult households), and these categories split further by 12 regions.

The same benchmark variables were used for the household expenditure statistics, with the following differences:

  • Using the original 5 broad regions (Auckland, Wellington, Canterbury, Rest of North Island, Rest of South Island) instead of 12 regions.

  • Using the age group 15-19 instead of splitting that into 15-17 and 18-19 age groups.

Resident Population estimates are based on the 2013 Census.

##Consistency with other periods

Although we adjust survey results for various demographic variables (age, sex, and region), there can be variability in survey estimates from one survey collection period to the next. This variability is because a different group of households is selected for each survey.

##Using material well-being data

The material well-being questionnaire asks about things people may or may not have or do, and the extent to which people economise. We also ask respondents how they rate their life satisfaction and whether their income meets everyday needs. We use separate questionnaires to assess child material well-being and household well-being.

From the material well-being questionnaire we publish selected results for satisfaction levels, and for adequacy of income to meet everyday needs.

##Suppressed estimates

For confidentiality purposes, we suppress data in the released tables if a cell is based on fewer than five people or households. Data is no longer suppressed if the relative sampling error is 51 percent or higher (21 percent or higher for cross-tabulated data).
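The confidentiality rule can be expressed as a simple check on the number of people or households behind each cell. The function name and the use of `None` to mark a suppressed cell are assumptions for illustration.

```python
def publishable_value(estimate, n_contributors, min_contributors=5):
    """Return the estimate, or None if the cell must be suppressed."""
    if n_contributors < min_contributors:
        return None  # fewer than five people or households: suppress
    return estimate

# Made-up table cells: (estimate, number of contributing households).
cells = [(1250.0, 4), (980.0, 5), (1430.0, 37)]
published = [publishable_value(est, n) for est, n in cells]
```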



