Industry (information about this variable and its quality)

Variable Description

Name
Industry (information about this variable and its quality)
Label
Industry (information about this variable and its quality)
Description

Industry is the type of activity undertaken by the organisation, enterprise, business, or unit of economic activity that employs one or more people aged 15 years and over.

Other Variable Information

The census data on industry relates to the industry for the main job held by an individual.

This is the job in which a person worked the most hours.

Priority level

Priority level 3

We assign a priority level to all census variables: Priority 1, 2, or 3 (with 1 being highest and 3 being the lowest priority).

Industry is a priority 3 variable. Priority 3 variables do not fit in directly with the main purpose of a census but are still important to certain groups. These variables are given third priority in terms of quality, time, and resources across all phases of a census.

The census priority level for industry remains the same as in 2013.

Quality Management Strategy and the Information by variable for Industry (2013) have more information on the priority rating.

Overall quality rating for 2018 Census

High quality

The Data quality processes section below has more detail on the rating for this variable.

The External Data Quality Panel has provided an independent assessment of the quality of this variable and has rated it as high quality. 2018 Census External Data Quality Panel: Assessment of Variables has more information.

Caution is advised when using this variable at small geographies. Please see Recommendations for use and further information section below.

Subject population

Employed census usually resident population aged 15 years and over

‘Subject population’ means the people, families, households, or dwellings to whom the variable applies.

How this data is classified

Australian and New Zealand Industrial Classification 2006 (ANZSIC06) V1.0.0

Industry is a hierarchical classification with four levels. Level one (division) contains 20 categories:

A Agriculture, Forestry and Fishing

B Mining

C Manufacturing

D Electricity, Gas, Water and Waste Services

E Construction

F Wholesale Trade

G Retail Trade

H Accommodation and Food Services

I Transport, Postal and Warehousing

J Information Media and Telecommunications

K Financial and Insurance Services

L Rental, Hiring and Real Estate Services

M Professional, Scientific and Technical Services

N Administrative and Support Services

O Public Administration and Safety

P Education and Training

Q Health Care and Social Assistance

R Arts and Recreation Services

S Other Services

T Not Elsewhere Included

‘Not elsewhere included’ contains the residual categories of ‘don’t know’, ‘refused to answer’, ‘response unidentifiable’ and ‘response outside of scope’, alongside ‘not stated’.

  • Level 2 contains 87 Subdivisions.
  • Level 3 contains 219 Groups.
  • Level 4 contains 511 Classes.

An example output of the ANZSIC06 V1.0 classification is:

A – Agriculture, Forestry and Fishing

A01 – Agriculture

A011 – Nursery and Floriculture Production

A011100 – Nursery Production (Under Cover)
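
The example above suggests that each lower-level code extends the code of its parent. A minimal sketch of how a class-level code could be split into its four levels follows. The prefix lengths (1, 3, 4, and 7 characters) are inferred from this single example and are an assumption, not an official specification; the function name `split_anzsic06_code` is hypothetical.

```python
from typing import Dict

# Prefix lengths inferred from the example above (A, A01, A011, A011100).
# Assumption for illustration only, not an official Stats NZ specification.
LEVEL_PREFIX_LENGTHS = {
    "division": 1,
    "subdivision": 3,
    "group": 4,
    "class": 7,
}


def split_anzsic06_code(class_code: str) -> Dict[str, str]:
    """Return the code at each of the four levels for a class-level code."""
    code = class_code.strip().upper()
    if len(code) != LEVEL_PREFIX_LENGTHS["class"]:
        raise ValueError(f"expected a 7-character class code, got {class_code!r}")
    if not (code[0].isalpha() and code[1:].isdigit()):
        raise ValueError(f"expected one letter followed by six digits, got {class_code!r}")
    # Each level is simply a prefix of the class code.
    return {level: code[:length] for level, length in LEVEL_PREFIX_LENGTHS.items()}


levels = split_anzsic06_code("A011100")
print(levels)
# {'division': 'A', 'subdivision': 'A01', 'group': 'A011', 'class': 'A011100'}
```

Aggregating class-level counts to a higher level then amounts to grouping records by the relevant prefix.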

The classification of industry in the 2018 Census is consistent with the classification used in the 2013 and 2006 Censuses, although data was previously also dual coded to the earlier 1996 ANZSIC classification.

The Standards and Classifications page provides background information on classifications and standards.

Question format

Industry data is derived from ‘name of business/employer’, ‘main activity of the business/employer’ and ‘address of the place where worked’ on the individual form (questions 41–43 on the paper form).

Stats NZ Store House has samples for both the individual and dwelling paper forms.

The information provided in these questions allowed us to identify the details of the respondent’s employer and therefore find the industry in the Stats NZ Business Register.

There were no differences between the wording or question format in the online and paper versions of these questions. However, there were differences in the way a person could respond between the modes of collection (online and paper forms):

On the online form:

  • respondents were provided with an as-you-type list for the ‘main activity of business’ question
  • built-in routing functionality directed individuals who were usually resident and employed on census night to the industry input questions.

On the paper form:

  • it was possible for unemployed people or those not in the labour force to respond to the industry questions.

Data from the online forms may therefore be of higher overall quality than data from paper forms.

How this data is used

Outside Stats NZ

  • Central and local government agencies and social and economic researchers use industry in the monitoring of trends and rates of change.
  • Industry is used to evaluate qualification levels in different industry sectors.

Within Stats NZ

  • Industry is used, alongside sector of ownership and occupation, to reweight the Labour Cost Index. This index provides a measure of wage inflation and is used in wage negotiations, contract escalation clauses, economic research and policy-making.
  • Industry is used to verify national accounts production statistics.

2018 data sources

We used alternative data sources for missing census responses and responses that could not be classified or did not provide the type of information asked for. Where possible, we used responses from the 2013 Census, administrative data from the Integrated Data Infrastructure (IDI), or imputation.

The table below shows the breakdown of the various data sources used for this variable.

2018 Industry – employed census usually resident population aged 15 years and over

Source                       Percent
Response from 2018 Census       71.6
2013 Census data                 0.0
Administrative data             20.8
Statistical imputation           7.7
No information                   0.0
Total                          100

Due to rounding, individual figures may not always sum to the stated total(s).

The ‘no information’ percentage is where we were not able to source industry data for a person in the subject population.

We also used admin data or statistical imputation where a response could not be matched to a synonym in the classification.

Administrative data sources

Using the Individual Tax Return (IR3) and the Employer Monthly Schedule (EMS) data from Inland Revenue, we found the employer and linked the record to the Stats NZ Business Register to find the industry.

Please note that when examining industry data for specific population groups within the subject population, the percentage that is from administrative data and statistical imputation may differ from that for the overall subject population.

Missing and residual responses

‘No information’ in the 2018 data sources table is the percentage of the subject population coded to ‘not stated’. In previous censuses, non-response was the percentage of the subject population coded to ‘not stated’.

In 2018, the percentage of ‘not stated’ is zero due to the use of the additional data sources described above.

Percentage of ‘not stated’ for the employed census usually resident population aged 15 years and over:

  • 2018: 0.0 percent
  • 2013: 3.6 percent
  • 2006: 3.7 percent.

In 2018, admin data and statistical imputation were used to replace any responses coded to the residual categories. In output for the 2013 and 2006 censuses, responses that could not be classified or did not provide the type of information asked for were grouped with ‘not stated’ and classified as ‘not elsewhere included’.

Percentage of ‘not elsewhere included’ for the employed census usually resident population aged 15 years and over:

  • 2018: 0.0 percent
  • 2013: 4.0 percent
  • 2006: 5.6 percent.

2013 Census data user guide provides more information about non-response in the 2013 Census.

Data quality processes

Overall quality rating: High quality

Data was evaluated to assess whether it meets quality standards and is suitable for use.

Three quality metrics contributed to the overall quality rating:

  • data sources and coverage
  • consistency and coherence
  • data quality.

The lowest rated metric determines the overall quality rating.

Data quality assurance for 2018 Census provides more information on the quality rating scale.

Data sources and coverage: High quality

We have assessed the quality of all the data sources that contribute to the output for the variable. To calculate a data sources and coverage quality score for a variable, each data source is rated and multiplied by the proportion it contributes to the total output.

The rating for a valid census response is defined as 1.00. Ratings for other sources are the best estimates available of their quality relative to a census response. The total score, expressed as a percentage, then determines the metric rating according to the following range:

  • 98–100 = very high
  • 95–<98 = high
  • 90–<95 = moderate
  • 75–<90 = poor
  • <75 = very poor.

Admin data was highly comparable to census forms, and data sourced through statistical imputation was moderately comparable to census forms. The high proportion of data from received forms and admin sources, compared with the low proportion sourced from statistical imputation, contributed to the score of 0.96 (96 percent), which determines the high quality rating.

Quality rating calculation table for the sources of industry – 2018 employed census usually resident population aged 15 years and over

Source                                        Rating   Percent of total   Score contribution
2018 Census form                                1.00              71.58                 0.72
Admin data                                      1.00              20.75                 0.21
Imputation
  Donor’s 2018 Census form                      0.50               6.68                 0.03
  Donor’s response sourced from admin data      0.50               0.98                 0.00
No information                                  0.00               0.00                 0.00
Total                                                            100.00                 0.96

Due to rounding, individual figures may not always sum to the stated total(s) or score contributions.
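
The calculation in the table above can be reproduced as a weighted sum. The following is a minimal sketch using the ratings and percentages from the table; the function names `coverage_score` and `rating_band` are hypothetical, and treating the score on a 0–100 scale (so that 0.96 corresponds to 96) is an assumption made to match the rating ranges listed earlier.

```python
# (source, rating, percent of total), copied from the table above.
SOURCES = [
    ("2018 Census form", 1.00, 71.58),
    ("Admin data", 1.00, 20.75),
    ("Imputation: donor's 2018 Census form", 0.50, 6.68),
    ("Imputation: donor's response sourced from admin data", 0.50, 0.98),
    ("No information", 0.00, 0.00),
]

# Lower bounds of the rating ranges, highest first; anything below 75 is "very poor".
BANDS = [(98, "very high"), (95, "high"), (90, "moderate"), (75, "poor")]


def coverage_score(sources):
    """Sum of rating x percent of total, on a 0-100 scale."""
    return sum(rating * percent for _, rating, percent in sources)


def rating_band(score):
    """Map a 0-100 score to its rating label."""
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    return "very poor"


score = coverage_score(SOURCES)
print(f"{score:.2f} -> {rating_band(score)}")
```

With these inputs the score is about 96.16, which falls in the 95–<98 range and agrees with the 0.96 total and the high quality rating reported in the table.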

Data sources, editing, and imputation in the 2018 Census has more information on the Canadian census edit and imputation system (CANCEIS) that was used to derive donor responses.

Consistency and coherence: High quality

Industry data is consistent with expectations across nearly all consistency checks, at the national and regional council levels of geography and lowest level of classification, with some minor variation from expectations or benchmarks that makes sense due to real-world change and the incorporation of other sources of data.

  • For some industries at the lower levels of the classification, there is a substantial increase in the number of respondents, which may be partly due to admin data replacing missing, illegible, or vague responses.

Data quality: High quality

Industry data has only minor data quality issues. The quality of coding and responses within classification categories is high. Any impact of other data sources used is minor. Any issues with the variable appear in a low number of cases (typically in the low hundreds).

  • Online as-you-type functionality helped respondents provide more detailed and accurate business names and addresses.

Recommendations for use and further information

We recommend that this data can be used in a similar way to the data produced in 2013.

However, when using this data you should be aware that:

  • data has been assessed to be consistent at the national and regional council level of geography and at level 1 of the classification. Some variation is possible at classifications below this level.
  • at small geographies, there will be variability in the percentage of administrative data or imputation for a given area. This means some small geography areas will have poorer quality data than the overall quality rating.
  • the inclusion of admin data and statistical imputation means there is no non-response category for 2018. Care should therefore be taken if comparing absolute figures to previous years. We recommend using proportions.
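
As an illustration of the last point, the sketch below compares category shares rather than absolute counts across two censuses. The counts are made up for illustration and are NOT census figures; the function name `proportions` is hypothetical, and leaving the residual ‘not elsewhere included’ category out of the denominator is an assumption about how a user might make the years more comparable, not a documented Stats NZ method.

```python
def proportions(counts, exclude=()):
    """Share of each category among the categories that are not excluded."""
    kept = {name: n for name, n in counts.items() if name not in exclude}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no counts left after exclusions")
    return {name: n / total for name, n in kept.items()}


# Made-up counts for illustration only; NOT census figures.
counts_2013 = {
    "Construction": 150,
    "Retail Trade": 250,
    "All other divisions": 560,
    "Not Elsewhere Included": 40,
}
counts_2018 = {
    "Construction": 220,
    "Retail Trade": 260,
    "All other divisions": 620,
}

# Assumption: drop the residual category so both years share the same base.
RESIDUAL = ("Not Elsewhere Included",)
share_2013 = proportions(counts_2013, exclude=RESIDUAL)
share_2018 = proportions(counts_2018, exclude=RESIDUAL)

for division in counts_2018:
    print(f"{division}: {share_2013[division]:.1%} -> {share_2018[division]:.1%}")
```

Comparing shares in this way avoids attributing to real growth an increase in counts that arises only because 2018 has no non-response category.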

Comparisons with other data sources

Although surveys and sources other than the census collect industry data, data users are advised to familiarise themselves with the strengths and limitations of the sources before use.

Key considerations when comparing industry information from the 2018 Census with other sources include:

  • census is a key source of information on industry for small areas and small populations. Many other sources do not provide detail at this level
  • census captures information only on the respondent's main job and may therefore under-count employment in industries in which many workers may work additional jobs
  • to compare census data with other surveys, some classifications would need to be aggregated, for example professional and administrative support.

Contact our Information Centre for further information about using this variable.

This variable is not part of a dataset.

Representation

Aggregation Method
Unspecified
Temporal
False
Geographic
False

Concept

Conceptual Variable
Industry
Concept
Work

Information

History

Revision Date Responsibility Rationale
13 30/11/2021 2:59:19 PM