Quality Statement

Label
Industry - 2023 Census: Information by concept
Definition

Industry is the type of activity undertaken by the organisation, enterprise, business, or unit of economic activity in which a person works in their main job.

Overall quality rating

High quality
Data quality processes section below has more detail on the rating.

Priority level

Priority level 3
A priority level is assigned to all census concepts: priority 1, 2, or 3 (with 1 being highest and 3 being the lowest priority).
Industry is a priority 3 concept. Priority 3 concepts are given third priority in terms of quality, time, and resources across all phases of the census. Priority 3 concepts are those that are:

  • data that census would not be solely run for, and information about population groups that could not be captured without being in a census
  • data that is important to certain groups
  • data that can be used to create sampling frames for other surveys.

The census priority level for industry remains the same as 2018.
The 2023 Census: Final content report has more information on priority ratings for census concepts.

Subject population

Employed census usually resident population count aged 15 years and over
‘Subject population’ means the people, families, households, or dwellings that the variable applies to.

How this data is classified

Industry is classified into the following categories:

Australian and New Zealand Standard Industrial Classification (ANZSIC) 2006 V1.0.0 – level 1 of 4

Code Category
A Agriculture, Forestry and Fishing
B Mining
C Manufacturing
D Electricity, Gas, Water and Waste Services
E Construction
F Wholesale Trade
G Retail Trade
H Accommodation and Food Services
I Transport, Postal and Warehousing
J Information Media and Telecommunications
K Financial and Insurance Services
L Rental, Hiring and Real Estate Services
M Professional, Scientific and Technical Services
N Administrative and Support Services
O Public Administration and Safety
P Education and Training
Q Health Care and Social Assistance
R Arts and Recreation Services
S Other Services
T Not Elsewhere Included

Industry uses a 4-level hierarchical classification with level 1 presented in the table above.

Residual categories include ‘Don’t know’, ‘Refused to answer’, ‘Response unidentifiable’, ‘Response outside of scope’ and ‘Not stated’.

The 2023 Census classification for industry is consistent with that used in the 2018 Census.

Follow the link above the table to examine the classification in more detail.

Standards and classifications has information on what classifications are, how they are reviewed, where they are stored, and how to provide feedback on them.

Question format

Industry data is derived from the business name, main business activity, and workplace address questions on the individual form (questions 44 to 46 on the paper form). The information provided in these questions is used to identify the business or employer of the respondent, which is matched to the Stats NZ Business Register to obtain the industry in which it is classified. If the business or employer cannot be matched to the Stats NZ Business Register, the main business activity response is matched to the ANZSIC classification directly.

There were differences in the way a person could respond between the modes of collection (online and paper forms).

On the online form:

  • respondents could use an as-you-type list for the business name, main business activity and workplace address questions, and could also enter a free text response.

On the paper form:

  • respondents could only submit a free text response
  • it was possible for respondents outside the subject population (people who were unemployed or not in the labour force) to respond to the industry questions.

Use of the as-you-type list for the business name question has improved data quality, as the options link directly to the Stats NZ Business Register.

Data from the online forms may therefore be of higher overall quality than data from paper forms. However, processing checks and edits were in place to improve the quality of data from the paper forms.

Stats NZ Store House has samples for both the individual and dwelling paper forms.

Examples of how this data is used

Data-use outside Stats NZ:

  • by central and local government agencies, and social and economic researchers in the monitoring of trends, rates of change, and community outcomes
  • to evaluate qualification levels and skills shortages in different industry sectors. This includes analysis of age distribution, to identify where shortages may be exacerbated by a declining older workforce, and to identify opportunities for skills transmission to younger generations.

Data-use by Stats NZ:

  • alongside sector of ownership, status in employment, and occupation to reweight the Labour cost index. This index provides a measure of wage inflation and is used in wage negotiations, contract escalation clauses, economic research, and policy-making
  • to verify national accounts production statistics.
Data sources

Alternative data sources were used for missing census responses and responses that could not be classified or did not provide the type of information asked for. The table below shows the distribution of data sources for industry data.

Data sources for industry data, as a percentage of the employed census usually resident population count aged 15 years and over, 2023 Census
Source of industry data Percent
2023 Census response 70.0
Historical census 0.0
Admin data 23.3
Deterministic derivation 0.0
Statistical imputation 6.7
 CANCEIS(1) donor’s response sourced from 2023 Census form 6.7
No information 0.0
Total 100.0
1. CANCEIS = imputation based on CANadian Census Edit and Imputation System
Note: Due to rounding, individual figures may not always sum to the stated total(s) or score contributions.

Where appropriate, admin data was used to replace data from census responses where the response required manual coding, or where there was less confidence in the accuracy of the match to the Stats NZ Business Register. This has resulted in an improvement in data accuracy from the 2018 Census.

The individual's data sourced from Inland Revenue was used to identify the business or employer they worked for in their main job. The name of the business or employer was then linked with the Stats NZ Business Register to source industry data.

Statistical imputation was used for records that remained coded to a residual category. Industry data sourced from statistical imputation was entirely from donor responses sourced from the 2023 Census form, whereas 2018 imputed data used both donor responses sourced from the 2018 Census form and admin data. This is due to a change in data processing and has not impacted the quality or consistency of data sourced from statistical imputation.

Editing, data sources, and imputation in the 2023 Census describes how data quality is improved by editing and how missing and residual responses are filled with alternative data sources (admin data and historical census responses) or statistical imputation. The paper also describes the use of CANCEIS (the CANadian Census Edit and Imputation System), which is used to perform imputation. The webpage also contains a spreadsheet that provides additional detail on the admin data sources.

Missing and residual responses

Missing and residual responses represent data gaps where respondents either did not provide answers (missing responses) or provided answers that were not valid (residual responses).

Where possible, alternative data sources have been used to fill missing and residual responses in the 2023 and 2018 Censuses.

Percentage of ‘Not stated’ for the employed census usually resident population count aged 15 years and over:

  • 2023: 0.0
  • 2018: 0.0
  • 2013: 3.6

For output purposes, the residual category responses are grouped with ‘Not stated’ and are classified as ‘Not elsewhere included’.

Percentage of ‘Not elsewhere included’ for the employed census usually resident population count aged 15 years and over:

  • 2023: 0.0
  • 2018: 0.0
  • 2013: 4.0
Data quality processes

Overall quality rating: High
Data has been evaluated to assess whether it meets quality standards and is suitable for use.

Three quality metrics contribute to the overall quality rating:

  • data sources and coverage
  • consistency and coherence
  • accuracy of responses.

The lowest rated metric determines the overall quality rating.
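
The "lowest rated metric" rule can be illustrated with a minimal Python sketch. The scale order follows the five-point range listed under Data sources and coverage, and the three metric ratings are the ones reported in this statement. The names `SCALE` and `overall_rating` are hypothetical and are not part of any Stats NZ tooling.

```python
# Rating scale ordered from lowest to highest (as listed in this statement).
SCALE = ["very poor", "poor", "moderate", "high", "very high"]


def overall_rating(metric_ratings):
    """Return the lowest rating among the contributing metrics."""
    return min(metric_ratings.values(), key=SCALE.index)


# Metric ratings for industry, as reported in this quality statement.
industry = {
    "data sources and coverage": "high",
    "consistency and coherence": "high",
    "accuracy of responses": "high",
}

print(overall_rating(industry))
```

Because all three metrics are rated high for industry, the overall rating is also high. A single moderate metric would pull the overall rating down to moderate.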

Data quality assurance in the 2023 Census provides more information on the quality rating scale.

Data sources and coverage: High quality
The quality of all the data sources that contribute to the output for the variable was assessed. To calculate the data sources and coverage quality score for a variable, each data source is rated and its rating is multiplied by the proportion it contributes to the total output.

The rating for a valid census response is defined as 1.00. Ratings for other sources are the best estimates available of their quality relative to a census response. The weighted ratings of all sources that contribute to the output for the variable are then summed. The total score determines the metric rating according to the following range:

  • 0.98–1.00 = very high
  • 0.95–<0.98 = high
  • 0.90–<0.95 = moderate
  • 0.75–<0.90 = poor
  • <0.75 = very poor.

The proportion of industry data received from 2023 Census forms, along with the proportion of industry data sourced from admin data and statistical imputation, resulted in a score of 0.96, leading to a quality rating of high.

Data sources and coverage rating calculation for industry data, employed census usually resident population count aged 15 years and over, 2023 Census
Source of industry Rating Percent Score contribution
2023 Census response 1.00 70.00 0.70
Admin data 1.00 23.31 0.23
CANCEIS(1) nearest neighbour imputation 0.40 6.68 0.03
No information 0.00 0.00 0.00
Total 100.00 0.96
1. CANCEIS = imputation based on CANadian Census Edit and Imputation System
Note: Due to rounding, individual figures may not always sum to stated total(s) or score contributions.
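
As a rough check of the calculation above, the following Python sketch recomputes the score from the ratings and percentages in the table and maps it to the published rating range. The function names are hypothetical, and applying the bands to the unrounded score is an assumption, since the statement does not say whether scores are rounded before banding.

```python
# (source, rating, percent of output), copied from the table above.
SOURCES = [
    ("2023 Census response", 1.00, 70.00),
    ("Admin data", 1.00, 23.31),
    ("CANCEIS nearest neighbour imputation", 0.40, 6.68),
    ("No information", 0.00, 0.00),
]

# Lower bound of each rating band, highest band first.
BANDS = [
    (0.98, "very high"),
    (0.95, "high"),
    (0.90, "moderate"),
    (0.75, "poor"),
]


def coverage_score(sources):
    """Sum each source's rating weighted by its share of total output."""
    return sum(rating * percent / 100 for _, rating, percent in sources)


def rating_for(score):
    """Map a score to its band; anything below 0.75 is 'very poor'."""
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    return "very poor"


score = coverage_score(SOURCES)
print(f"{score:.2f} -> {rating_for(score)}")  # prints "0.96 -> high"
```

The imputed records carry a rating of 0.40, so their 6.68 percent share adds only about 0.03 to the score. This is why the total is 0.96 rather than 1.00.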

The rating for admin data is 1.00, equal to the rating for a census response, as admin data for industry is highly accurate. In many cases, admin data is more accurate than a census response, as any inaccuracy introduced from matching of free-text responses to the Stats NZ Business Register is removed.

While a similar proportion of industry data is sourced from census responses in the 2023 Census compared with the 2018 Census, a greater proportion of data this census is from the Stats NZ Business Register, due to improved coding processes.

Consistency and coherence rating: High quality
Industry data is consistent with expectations across nearly all consistency checks, with some minor variation from expectations or benchmarks that makes sense due to real-world change, incorporation of other sources of data, or a change in how the variable has been collected.

Variation from historical trends is minor, occurs only at lower levels of the classification, and can be explained by:

  • real-world change that has impacted employment in some industries
  • changes to the coding of industry since the 2018 Census that have improved data accuracy.

Accuracy of responses rating: High quality
Industry data has only minor data quality issues. The quality of coding and responses within classification categories is high. Any issues with the variable appear in a low number of cases (typically in the low hundreds).

Minor data issues include respondent error and a small proportion of data coded in lower accuracy rounds of free text coding.

Recommendations for use and further information

It is recommended that industry data be used in a manner comparable to the 2013 and 2018 Censuses.

When using this data, users should be aware of the following:

  • Industry data has been assessed to be largely consistent with previous trends down to a statistical area 2 level of geography.
  • More extensive use of admin data for 2023 has improved both coverage and data accuracy.
  • The 85 years and over age group has higher rates of statistical imputation due to item non-response and a small level of respondent error remaining in the data.

Comparisons to other data sources
Although surveys and sources other than the census collect industry data, data users are advised to become familiar with the strengths and limitations of the sources before use. Industry data that has been sourced completely from admin data is available through Linked Employer Employee Data (LEED).

Key considerations when comparing industry information from the 2023 Census with other sources include:

  • Census is a key source of information on industry for small areas and small populations; many other sources do not provide detail at this level.
  • Census captures information only on the respondent's main job and may therefore under-count employment in industries in which many workers may work additional jobs.
  • Census aims to be a national count of all individuals in a population, while other surveys measuring this variable (such as the Household Labour Force Survey and the General Social Survey) are based only on a sample of the population.

When comparing data from different surveys, data users should be aware that:

  • The Household Labour Force Survey (HLFS) sources industry information entirely from a main business activity question.
  • The Quarterly Employment Survey (QES) displays data by industry (ANZSIC) obtained from the Statistical Business Register.
Information by variables from previous censuses

To assess how this concept aligns with the variables from the previous census, use the links:

Contact our Information centre for further information about using this concept.


Information

History

Revision Date Responsibility Rationale
27 26/09/2024 11:33:54 AM
26 26/09/2024 10:00:57 AM