Data Collection
Household Labour Force Survey
Design
##Data source
The target population is the entire group from which you would ideally like to get information. The target population for the HLFS is the working-age population of New Zealand. We define this as "the non-institutionalised population 15 years and over, who usually live in New Zealand." Specifically the target population excludes:
- people who have been living in New Zealand for less than 12 months, and who do not propose to stay in New Zealand for a total of 12 months or more
- long-term residents of homes for older people, hospitals, and psychiatric institutions (long-term is defined as six weeks or more)
- people in prison
The survey population consists of the members of the target population who have a chance of being selected as part of the sample (ie they can be identified through the sampling frame). For the HLFS, we apply further exclusions to the target population to create the survey population (often for cost and practical reasons), from which we then select the HLFS sample. These exclusions cover a small percentage of the population, and the bias introduced is minimal. The survey population is the target population excluding people:
- residing in non-private dwellings (eg people in hotels, motels, hostels, military camps)
- residing in non-permanent dwellings (eg people in tents or caravans not permanently sited)
- residing at a wharf or landing place (eg people in ships, boats)
- residing on islands other than the North, South, and Waiheke islands (eg people on Great Barrier, Kawau, Chatham, and Stewart islands)
##Sample design
The HLFS sample has a stratified design with two stages of clustering. First, we select a random sample of primary sampling units (PSUs) from each stratum (first stage of clustering), then we select a systematic sample of households from each PSU (second stage of clustering). Every person in a selected household aged 15 years and over is eligible for the survey. PSUs are aggregations of one or more meshblocks, where meshblocks are the smallest geographical area unit in New Zealand. PSUs constructed from the 2013 Census have an average of 70 occupied and under-construction dwellings.
Stratification is the process of dividing the population (or survey frame) into homogeneous subgroups before sampling. Stratification is used to 1) reduce sampling errors for survey estimates and ensure that strata achieve their expected sample sizes and 2) target subgroups by disproportionately sampling (or over-sampling) certain strata.
Stratification for the new HLFS sample design includes five dimensions. PSUs are stratified by region, urban/rural status, a high-NILF (not in the labour force) status, groups based on New Zealand Deprivation Index values, and territorial authority (in that order). The first four dimensions are explicit, or primary, strata (ie the sample is split by these groups and a random sample selected from each group), while the final dimension is implicit (PSUs are sorted by territorial authority within the primary strata and selected from the ordered list).
##Sample size
The HLFS aims to achieve interviews with 15,000 households, which equates to roughly 30,000 individuals. Households stay in the survey for two years. Each quarter, one-eighth of the households in the sample are rotated out and replaced by a new set of households. Therefore, up to seven-eighths of the same people are surveyed in adjacent quarters. This overlap improves the reliability of quarterly change estimates.
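The rotation pattern described above can be sketched as follows. This is a minimal illustration only: the function names, and the choice to label each rotation group by the quarter in which it enters the sample, are assumptions made for the example.

```python
def rotation_groups(quarter: int, n_groups: int = 8) -> set[int]:
    """Return the rotation groups that are in the sample in a given quarter.

    Each group is labelled by the quarter in which it entered the sample and
    stays for n_groups consecutive quarters (eight quarters = two years).
    """
    return set(range(quarter - n_groups + 1, quarter + 1))


def overlap_fraction(q1: int, q2: int, n_groups: int = 8) -> float:
    """Fraction of rotation groups that two quarters have in common."""
    common = rotation_groups(q1, n_groups) & rotation_groups(q2, n_groups)
    return len(common) / n_groups


print(overlap_fraction(20, 21))  # adjacent quarters
print(overlap_fraction(20, 24))  # quarters a year apart
```

Adjacent quarters share seven of the eight rotation groups. The overlap among responding households can be lower in practice, which is why the text says "up to" seven-eighths.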
Collection
##Interviews
The period of surveying/interviewing is 13 weeks (eg from 3 April to 2 July 2016). The information obtained relates to the week before the interview (referred to as the ‘survey reference week’). We first interview respondents face-to-face at their home. Subsequent interviews are by telephone wherever possible.
##Proxies
The HLFS allows interviewers to take responses from proxies if a respondent is unavailable or unable to answer the questions themselves. Although the evidence regarding the quality of proxy responses is mixed, we expect proxy responses may not be as accurate as self-responses. Therefore, the HLFS monitors the rate of proxy responses to gauge the quality of responses. The proxy rate is the number of respondents who had someone else respond on their behalf divided by the total number of respondents, expressed as a percentage.
##Response rate and achieved sample rate
The achieved sample size measure is the number of eligible households and individuals that responded to the HLFS in the quarter. The achieved sample size typically increases over time as the population grows and more dwellings are added to the survey sample.
We calculate the response rate by determining the number of eligible households that responded to the survey as a proportion of the estimated number of total eligible households in the sample.
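Both the response rate and the proxy rate are simple ratios. The sketch below shows the arithmetic. The function names and the figures in the example are hypothetical and are not taken from any HLFS release.

```python
def response_rate(responding_households: int, estimated_eligible_households: float) -> float:
    """Responding eligible households as a percentage of estimated eligible households."""
    if estimated_eligible_households <= 0:
        raise ValueError("estimated_eligible_households must be positive")
    return 100.0 * responding_households / estimated_eligible_households


def proxy_rate(proxy_respondents: int, total_respondents: int) -> float:
    """Respondents answered for by someone else, as a percentage of all respondents."""
    if total_respondents <= 0:
        raise ValueError("total_respondents must be positive")
    return 100.0 * proxy_respondents / total_respondents


# Hypothetical quarter: 12,600 responding households out of 15,000 eligible,
# and 9,000 proxy responses among 30,000 respondents.
print(f"response rate: {response_rate(12_600, 15_000):.1f}%")
print(f"proxy rate:    {proxy_rate(9_000, 30_000):.1f}%")
```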
Processing
##Weighting
To enable us to infer from the sample to the target population we must weight the sample data. This entails assigning each responding or imputed individual a weight, which can be thought of as the number of people in the population that each individual represents.
###Selection weight
The first stage of the weighting is the selection weight (also called a design weight). The overall selection weight for a household is made up of the PSU selection weight and the household selection weight. We calculate the selection weight for each PSU as the inverse of the probability of selection, so PSUs with a lower probability of selection receive a higher selection weight. Within strata, PSUs are selected with probability proportional to size. This means that larger PSUs have a higher probability of being selected.
We next multiply the PSU selection weight by a household selection weight to give the overall selection weight. The household selection weight accounts for the sampling of households within PSUs – we calculate it as the inverse of the selection probability, where the selection probability is the number of selected addresses in the PSU divided by the total number of addresses in the PSU.
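The two-stage selection weight can be sketched as below. The household part follows the definition in the text. The PSU selection probability is an assumption: the sketch uses the common probability-proportional-to-size approximation (number of PSUs selected multiplied by PSU size, divided by stratum size), which the source does not spell out. All names and figures are hypothetical.

```python
def psu_selection_weight(psu_size: float, stratum_size: float, psus_selected: int) -> float:
    """Inverse of an assumed PPS selection probability for one PSU."""
    probability = psus_selected * psu_size / stratum_size
    if not 0 < probability <= 1:
        raise ValueError("selection probability must lie in (0, 1]")
    return 1.0 / probability


def household_selection_weight(selected_addresses: int, total_addresses: int) -> float:
    """Inverse of (selected addresses in the PSU / total addresses in the PSU)."""
    if not 0 < selected_addresses <= total_addresses:
        raise ValueError("selected_addresses must be between 1 and total_addresses")
    return total_addresses / selected_addresses


def overall_selection_weight(psu_weight: float, household_weight: float) -> float:
    """Overall selection weight is the product of the two stage weights."""
    return psu_weight * household_weight


# Hypothetical stratum of 7,000 dwellings, 10 PSUs selected, a PSU of 70
# dwellings, and 10 addresses selected within that PSU.
psu_w = psu_selection_weight(psu_size=70, stratum_size=7_000, psus_selected=10)
hh_w = household_selection_weight(selected_addresses=10, total_addresses=70)
print(psu_w, hh_w, overall_selection_weight(psu_w, hh_w))
```

Under these assumptions, a larger PSU gets a smaller PSU weight but a larger household weight when the number of selected addresses is fixed, so the overall weight is the same for every household in the stratum.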
###Calibration
The final stage of weighting for the HLFS is the calibration to benchmarks (auxiliary information), which are the expected counts of people in the total target population. This adjusts for undercoverage of the target population and undercounting of some groups in the population due to differential response rates. We set the calibration weights to sum to a set of benchmarks. The benchmarks we use for the HLFS are five-year age groups by sex, the number of Māori adults by sex by two age groups (age 15–29, 30+), and 12 regions. Integrated weighting is used in the calibration to assign a weight to each individual in the sample. Each individual in a sampled household is given the same weight, which is also the same as the household weight. This allows the production of household estimates which are consistent with person estimates.
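The idea of making weights sum to benchmarks can be illustrated with raking (iterative proportional fitting). This is a simpler technique swapped in for illustration: it is not the integrated weighting method the HLFS uses, and it does not give every person in a household the same weight. The function name, the two calibration dimensions, and all figures are hypothetical.

```python
import numpy as np


def rake(weights, margins, benchmarks, max_iter=100, tol=1e-9):
    """Scale weights until weighted counts match each set of benchmarks.

    weights    : initial (selection) weights, one per person
    margins    : list of label sequences, one per calibration dimension
    benchmarks : list of dicts mapping each label to its population count
    """
    w = np.asarray(weights, dtype=float).copy()
    margins = [np.asarray(m) for m in margins]
    for _ in range(max_iter):
        largest_change = 0.0
        for labels, targets in zip(margins, benchmarks):
            for label, target in targets.items():
                mask = labels == label  # each benchmark group must appear in the sample
                factor = target / w[mask].sum()
                w[mask] *= factor
                largest_change = max(largest_change, abs(factor - 1.0))
        if largest_change < tol:  # all margins already match their benchmarks
            break
    return w


sex = ["M", "M", "M", "F", "F", "F"]
region = ["North", "North", "South", "North", "South", "South"]
calibrated = rake(
    weights=[100.0] * 6,
    margins=[sex, region],
    benchmarks=[{"M": 320, "F": 330}, {"North": 400, "South": 250}],
)
print(calibrated.round(1))
```

After raking, the weights of males sum to 320, the weights of people in the North sum to 400, and so on, which is the sense in which calibrated weights "sum to a set of benchmarks".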
##Imputation
Imputation is the process where missing values are substituted with an estimate of what the respondent might have provided for a particular variable. This process aims to minimise the loss of data and improve the accuracy of estimates. Imputation is applied to individuals who belong to eligible responding dwellings and have missing values for sex, age, ethnicity, looking for full-time employment, and usual and actual hours.
All variables are imputed using donor imputation, where the donor is a respondent with similar characteristics (nearest-neighbour imputation).
Usual hours are imputed where respondents have provided their actual number of hours worked, and not their number of usual hours.
For those respondents who have not provided either total usual or total actual hours, donor imputation is applied (imputation of usual hours based on the usual hours from respondents with similar characteristics).
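Nearest-neighbour donor imputation can be sketched as follows. The matching characteristics, the use of Euclidean distance, and the function name are assumptions for illustration. The source does not state which variables or distance measure define "similar characteristics".

```python
import numpy as np


def donor_impute(features, values):
    """Fill missing values (NaN) with the value of the nearest complete record.

    features : (n, k) array of matching characteristics, known for everyone
    values   : (n,) array of the variable to impute, NaN where missing
    """
    features = np.asarray(features, dtype=float)
    values = np.asarray(values, dtype=float)
    imputed = values.copy()
    donors = np.flatnonzero(~np.isnan(values))
    if donors.size == 0:
        raise ValueError("at least one complete record is needed as a donor")
    for i in np.flatnonzero(np.isnan(values)):
        # Distance from recipient i to every potential donor.
        distance = np.linalg.norm(features[donors] - features[i], axis=1)
        imputed[i] = values[donors[np.argmin(distance)]]
    return imputed


# Hypothetical records: columns are age and actual hours; the variable
# being imputed is usual hours.
age_and_actual_hours = [[25, 40], [60, 20], [27, 38], [58, 22]]
usual_hours = [40, 20, np.nan, np.nan]
print(donor_impute(age_and_actual_hours, usual_hours))
```

In a real application the matching variables would normally be scaled or grouped into classes first, so that no single variable dominates the distance.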
We use another form of imputation for people aged 75 and over (75+). If a household has only people aged 75+ when interviewed in its first quarter of participation, then we do not interview respondents in subsequent quarters. Instead, their labour force status for subsequent quarters is imputed, based on answers to the first and only interview. The exception to this is quarters where we also run the income module (in the June quarter each year). In this quarter, 75+ households are again interviewed, and we use this data for imputation in subsequent quarters. Doing this introduces cost savings and reduces respondent burden while estimates remain largely unchanged – the labour force status of people aged 75+ tends to be relatively stable. Such households make up approximately 9 percent of the first-time-in rotation group.
##Modelling
The actual reference period captured by the HLFS collection for a quarter does not always line up with the ideal reference period. The reference period for individual interviews in the HLFS is always one week, starting on a Monday and ending on a Sunday. The survey quarters in the HLFS are always exactly 13 weeks (91 days). Four quarters of 13 weeks total 364 days, so an 'HLFS year' is one day shorter than a calendar year in most years, and two days shorter in leap years. Consequently, the last day of the HLFS December quarter occurs successively earlier and earlier through the years.
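The calendar arithmetic behind this drift can be checked with a short sketch. The function names are hypothetical. The example start date is the 3 April 2016 interviewing period quoted in the Interviews section.

```python
import calendar
from datetime import date, timedelta

QUARTER_DAYS = 13 * 7  # each HLFS quarter is exactly 13 weeks (91 days)


def quarter_end(start: date) -> date:
    """Last day of a 13-week quarter that begins on the given date."""
    return start + timedelta(days=QUARTER_DAYS - 1)


def hlfs_year_shortfall(year: int) -> int:
    """Days by which four 13-week quarters fall short of the calendar year."""
    days_in_year = 366 if calendar.isleap(year) else 365
    return days_in_year - 4 * QUARTER_DAYS


print(quarter_end(date(2016, 4, 3)))
for year in (2015, 2016):
    print(year, hlfs_year_shortfall(year))
```

The shortfall accumulates from year to year, which is why the end of the December quarter moves steadily earlier.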
There is evidence that differences between the ideal and actual reference periods have an impact on the national total actual hours estimate. This impact is especially pronounced for the March and December quarters, where public holidays that would be observed in the December quarter are instead accounted for in the March quarter. To improve the relevance and accuracy of the national total actual hours estimate, a linear model was applied to estimate how much the figure should be adjusted to account for differences between the ideal and actual reference periods.
Previously, the estimates of total actual hours worked were adjusted by including appropriately reweighted records from the last week of the previous quarter, and the first week of the following quarter. This method required the assembly of an entire dataset for a single variable. Therefore, a more parsimonious method was sought out in the form of a linear model.
##Sampling errors
Sampling errors quantify the variability that occurs by chance because a sample rather than an entire population is surveyed.
We calculate sampling errors using the jackknife method. It is based on the variation between estimates of different subsamples taken from the whole sample. This is an attempt to see how estimates would vary if we were to repeat the survey with new samples of individuals.
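The general idea of the jackknife can be sketched as a delete-one-group procedure: drop each group of records in turn, re-estimate, and measure how much the replicate estimates vary. This is a simplified illustration that ignores stratification. The grouping, the rescaling of weights, and the function names are assumptions, not the HLFS production method.

```python
import numpy as np


def weighted_mean(values, weights):
    """Example estimator: weighted mean of a variable."""
    return float(np.sum(values * weights) / np.sum(weights))


def jackknife_se(values, weights, groups, estimator):
    """Delete-one-group jackknife standard error for a weighted estimator."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n_groups = len(labels)
    full_estimate = estimator(values, weights)
    replicates = np.empty(n_groups)
    for i, label in enumerate(labels):
        keep = groups != label
        # Scale up the remaining weights so they still represent the population.
        scaled = weights[keep] * n_groups / (n_groups - 1)
        replicates[i] = estimator(values[keep], scaled)
    variance = (n_groups - 1) / n_groups * np.sum((replicates - full_estimate) ** 2)
    return full_estimate, float(np.sqrt(variance))


# Hypothetical data: hours worked, person weights, and three jackknife groups.
hours = [38, 40, 0, 25, 42, 36]
person_weights = [120, 95, 110, 130, 100, 105]
jk_groups = [0, 0, 1, 1, 2, 2]
estimate, standard_error = jackknife_se(hours, person_weights, jk_groups, weighted_mean)
print(estimate, standard_error)
```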
We produce sampling errors and confidence intervals for most point and change estimates. Confidence intervals demonstrate the amount of uncertainty associated with a sample estimate by presenting an upper and lower limit for that estimate. The HLFS calculates confidence intervals at the 95 percent confidence level, which means that if multiple samples were drawn, about 95 percent of the resulting confidence intervals would be expected to contain the true figure.
As the size of the sampled group decreases, the relative sampling errors will generally increase. For example, the estimated number of Pacific peoples employed would have a larger relative sampling error than the estimated total number of people employed. Likewise, the estimated number of people unemployed would have a larger relative sampling error than the estimated number of people employed.
In general, the sampling errors associated with subnational estimates (eg breakdowns by regional council area or ethnic group) are larger than those associated with national estimates.
A change in an estimate, either from one adjacent quarter to the next, or between quarters a year apart, is said to be statistically significant if it is larger than the associated sampling error.
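The significance rule above amounts to a single comparison, sketched below. The sketch assumes that the published sampling error is the half-width of the 95 percent confidence interval, which the source does not state explicitly. The function names and figures are hypothetical.

```python
def confidence_interval(estimate: float, sampling_error: float) -> tuple[float, float]:
    """Lower and upper limits, assuming the sampling error is the interval half-width."""
    return estimate - sampling_error, estimate + sampling_error


def is_significant(change: float, sampling_error: float) -> bool:
    """A change is significant if its size exceeds the associated sampling error."""
    return abs(change) > sampling_error


# Hypothetical quarterly changes in thousands, each with a sampling error of 15.0.
for change in (12.0, -20.0):
    print(change, confidence_interval(change, 15.0), is_significant(change, 15.0))
```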
##Non-sampling error
A non-sampling error is very difficult to measure, and if present can lead to biased estimates. We aim to minimise the effect of these errors by applying best survey practices and monitoring known indicators.
##Classifications
The labour market statistics release includes specific statistics about industry, occupation, study, ethnicity, and region. This section lists the classifications we use for these statistics.
- Industry statistics (NZSIOC, based on ANZSIC06): see Industrial classification for more information
- Occupation statistics (ANZSCO): see occupation for more information
- Skill level (ANZSCO): see skill levels of New Zealand jobs for more information
- Region: see regional council for more information
- Total response ethnicity: see Statistical Standard for Ethnicity – 2005 for more information
Email info@stats.govt.nz for further information about the classifications we use.
Analysis
##Seasonal adjustment
In the labour market, cyclical events that affect labour supply and demand occur around the same time each year. For example, in the summertime a large pool of student labour is both available for, and actively seeking, work. Demand for labour in the retail sector and in many primary production industries also increases.
For any series, we can break the estimates down into three components: trend, seasonal, and irregular. Seasonally adjusted series have the seasonal component removed. Trend series have both the seasonal and irregular components removed, and reveal the underlying direction of movement in a series.
Seasonal adjustment makes data for adjacent quarters more comparable by smoothing out the effect on the time series of any regular seasonal events. This ensures that the underlying movements in the time series are more visible.
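The split into trend, seasonal, and irregular components can be illustrated with a toy additive decomposition of a quarterly series. This is a classical moving-average sketch, far simpler than a production seasonal adjustment package. The additive form, the function name, and the data are assumptions for illustration.

```python
import numpy as np


def decompose_quarterly(series):
    """Split a quarterly series into trend, seasonal, and irregular parts.

    Needs at least eight observations. The trend is undefined (NaN) for the
    first and last two quarters because the moving average is centred.
    """
    y = np.asarray(series, dtype=float)
    kernel = np.array([1, 2, 2, 2, 1]) / 8.0  # centred 2x4 moving average
    trend = np.full(y.shape, np.nan)
    trend[2:-2] = np.convolve(y, kernel, mode="valid")
    detrended = y - trend
    quarter = np.arange(len(y)) % 4
    # Average detrended value for each quarter of the year, centred on zero.
    effects = np.array([np.nanmean(detrended[quarter == q]) for q in range(4)])
    effects -= effects.mean()
    seasonal = effects[quarter]
    irregular = y - trend - seasonal
    return trend, seasonal, irregular


# Hypothetical series: a steady upward trend plus a fixed quarterly pattern.
t = np.arange(12)
pattern = np.array([5.0, -3.0, -4.0, 2.0])
y = 100 + 2 * t + pattern[t % 4]
trend, seasonal, irregular = decompose_quarterly(y)
seasonally_adjusted = y - seasonal
print(seasonally_adjusted)
```

Subtracting the seasonal component leaves the seasonally adjusted series. Removing the irregular component as well leaves the trend.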
See the period specific information section in the latest labour market statistics releases for information on the change in estimates between the current and previous publication for the seasonally adjusted and trend data.
All seasonally adjusted and trend series are produced using the X-13ARIMA-SEATS Version 1.1 package developed by the U.S. Census Bureau.
###Quality of seasonal adjustment
We monitor our data to make sure that our seasonal adjustment is robust.
The X-13ARIMA-SEATS programme is highly customisable and can produce a wide variety of possible adjustments for any particular input series. Consequently, X-13ARIMA-SEATS produces a number of diagnostics that are useful in assessing the quality of our chosen adjustment.
###Outliers
During the seasonal adjustment process, X-13ARIMA-SEATS gives less weight to the irregular component. Specifically, if the estimated irregular component at a point in time is sufficiently large compared with the standard deviation of the irregular component as a whole, then the irregular component at that point can be downweighted or removed completely and re-estimated. Such observations are referred to as partial and zero-outliers, respectively. In practice, downweighting outliers does little to seasonally adjusted data, but the effect of the outliers on the trend series is generally reduced. However, if an outlier ceases to be an outlier as more data becomes available, then significant revisions to the trend series become possible. The outliers for each quarter for the HLFS are reported in the studies section.
###Prior adjustment
In the December 2018 quarter, Stats NZ observed a larger than expected number of people moving out of ‘employment’ to ‘not in the labour force’ (NILF). A similar effect was seen in previous quarters – the March 2008, March 2009, and December 2012 quarters. The March 2008, December 2012, and December 2018 quarters coincide with the Survey of Working Life (SoWL) supplement.
We made prior adjustments to the previous two quarters in which SoWL was run – the March 2008 and December 2012 quarters. These adjustments were implemented in the September 2013 quarter.
The seasonal adjustment package used by Stats NZ has an automatic procedure for dealing with outliers (observations which are far removed from the others in the series), which works well in most cases. However, in certain circumstances outliers need to be dealt with explicitly. This is done via a prior adjustment.
In the December 2018 quarter, we have made a prior adjustment, in addition to seasonal adjustment, to the following high-level data series to improve the accuracy of, and coherence between, the trend series and seasonally adjusted series:
- Male employed
- Female employed
- Male not in the labour force (NILF)
- Female not in the labour force (NILF)
- Full-time employed
- Part-time employed
- Employed 15 to 64 years
- NILF 15 to 64 years
- Usual hours worked
- Actual hours worked.
We used the adjustments from the March 2008 and December 2012 quarters to inform this adjustment. We will monitor these series over the next few quarters and may make future revisions.
Some seasonally adjusted employed and NILF series were not adjusted further this quarter. For example, the number of people employed, broken down by age; underemployment; and youth not in employment, education, and training series may show unrealistic movements this quarter. This data should be used with caution. In addition, all actual (unadjusted) employed and NILF series, including all age, ethnicity, industry, occupation, and regional breakdowns, should be used with caution.
Seasonally adjusting unadjusted data generated from the unit record files (Data Lab, IDI, and other unit-record files) will yield different results from those published because of Stats NZ’s seasonal adjustment settings and prior adjustments.
##Rounding
We round figures presented in this release. Figures are rounded to the nearest hundred or the nearest thousand for seasonally adjusted and trend estimates. This may result in a total disagreeing slightly with the sum of the individual items as shown in the table. Where figures are rounded the unit is shown as (000) if it is thousands.
We calculate quarterly and annual changes for figures using unrounded numbers. However, quarterly and annual percentage point changes for rates are calculated using rounded rates.
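The effect of rounding on totals and changes can be seen in a small sketch. The helper name, the convention of rounding halves up, and all figures are assumptions for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to(value, unit):
    """Round value to the nearest multiple of unit, with halves rounded up."""
    steps = (Decimal(str(value)) / Decimal(unit)).quantize(
        Decimal("1"), rounding=ROUND_HALF_UP
    )
    return int(steps) * unit


# Rounded components need not add up to the rounded total.
components = [1_240, 1_240, 1_240]  # hypothetical unrounded estimates
total = sum(components)
print(sum(round_to(c, 100) for c in components), round_to(total, 100))

# Changes in figures are calculated on unrounded numbers, then rounded.
previous, current = 2_449, 2_551
print(round_to(current - previous, 100))                 # change from unrounded figures
print(round_to(current, 100) - round_to(previous, 100))  # difference of rounded figures
```

The first line prints 3600 and 3700. The last two lines print 100 and 200, showing why a change worked out from published rounded figures can differ from the published change.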
##Suppression of data
We suppress cells with estimates of less than 1,000. They appear as ‘S’ in the tables. These estimates are subject to sampling errors too great for most practical purposes.
##Comparing with other datasets
Comparing our labour market statistics has more information on how the HLFS compares with the other labour market statistics we produce. This web page explains which measures of employment are included in each of our employment releases, and the timings and coverage of each release.
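The suppression rule can be expressed as a one-line check. The function name and the output formatting are assumptions for illustration.

```python
def suppress(estimate: float, threshold: float = 1_000) -> str:
    """Return 'S' for estimates below the threshold, otherwise the formatted estimate."""
    return "S" if estimate < threshold else f"{estimate:,.0f}"


# Hypothetical table cells.
print([suppress(cell) for cell in (850, 1_000, 23_400)])
```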
A Guide to Unemployment Statistics has more information on comparing the HLFS with other datasets on unemployment. This web page explains which measures of unemployment are included in the HLFS, jobseeker support – work ready, and the job seekers register. It also includes information on the timings, coverage, and different purposes of each of these measures.
##Household labour force status
The household's labour force status is derived by looking at the labour force status of household members aged 18–64 years. Households are classified as ‘All employed’, ‘Mixed work’, or ‘None employed’. For example, if a couple is living by themselves and one is aged 64 years and the other is aged 65 years, this couple will be assigned to the 'All employed' or 'None employed' category, depending on the labour force status of the 64-year-old.
Households that have no members aged 18–64 years are excluded from this analysis. The household categories incorporate the concept of dependent children rather than just children. A child is a person of any age who usually resides with at least one parent (natural, step, adopted, or foster) and who does not usually reside with a partner or children of his or her own. A dependent child is defined as one under the age of 18 years and not in full-time employment.
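The classification rule can be sketched as follows. The function name and input format are hypothetical, and the reading of 'Mixed work' as "some but not all members aged 18–64 are employed" is an assumption, since the source does not define that category.

```python
from typing import Optional


def household_status(members: list[tuple[int, bool]]) -> Optional[str]:
    """Classify a household from (age, is_employed) pairs for its members.

    Only members aged 18 to 64 count. Households with no such members are
    excluded from the analysis, which is signalled here by returning None.
    """
    employed_flags = [employed for age, employed in members if 18 <= age <= 64]
    if not employed_flags:
        return None
    if all(employed_flags):
        return "All employed"
    if not any(employed_flags):
        return "None employed"
    return "Mixed work"


# The couple from the example: only the 64-year-old determines the category.
print(household_status([(64, True), (65, False)]))
print(household_status([(64, False), (65, True)]))
```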
Dissemination
##Timing of published data
The HLFS is published within six weeks after the end of the quarter's reference period.
##Confidentiality
Only people authorised by the Statistics Act 1975 are allowed to see your individual information, and they must use it only for statistical purposes. Your information is combined with similar information from other people or households to prepare summary statistics.
##Timing
Our information releases are delivered electronically by third parties. Delivery may be delayed by circumstances outside our control. Statistics NZ accepts no responsibility for any such delay.
##Liability
While all care and diligence has been used in processing, analysing, and extracting data and information in this publication, Statistics NZ gives no warranty it is error-free and will not be liable for any loss or damage suffered by the use directly, or indirectly, of the information in this publication.
##More information
Statistics in this release have been produced in accordance with the Official Statistics System principles and protocols for producers of Tier 1 statistics for quality. They conform to the Statistics NZ Methodological Standard for Reporting of Data Quality.
##Crown copyright
This work is licensed under the Creative Commons Attribution 3.0 New Zealand licence. You are free to copy, distribute, and adapt the work, as long as you attribute the work to Statistics NZ and abide by the other licence terms. Please note you may not use any departmental or governmental emblem, logo, or coat of arms in any way that infringes any provision of the Flags, Emblems, and Names Protection Act 1981. Use the wording 'Statistics New Zealand' in your attribution, not the Statistics NZ logo.
Methodology
Type of data:
Survey
Data collector:
Statistics NZ’s business unit Integrated Data Collection (IDC) performs the data capture.
Mode of data collection:
The HLFS uses three modes of collection: a computer-assisted personal interview (CAPI), a computer-assisted telephone interview (CATI), and a self-complete questionnaire. The first interview is always a CAPI interview.
Frequency of data collection:
Quarterly
Quality information
Editing:
none.
Missing data:
We impute for sex, age, and full-time/part-time status if these are missing when we finalise the weekly data into the table of imputation and post-stratification (TBLIMPPS).
Other quality issues:
We adjust for non-response and then calibrate to age and sex benchmarks from demography. The benchmarks used are five-year age groups by sex, and the number of Māori adults by two age groups (aged 15–29, 30+).
Privacy issues exist around the name and location of individuals.
The HLFS table that is accessible to researchers does not contain any name or address information that could identify an individual. All researchers who have access to HLFS data have had their research proposals assessed using Statistics NZ’s microdata access protocols, and only approved researchers who have been granted access by Statistics NZ may view the HLFS data. Read Statistics NZ’s microdata access protocols. All outputs produced from HLFS data must be aggregated, and counts must be suppressed if the underlying unrounded count is fewer than 6.