Ethnicity (information about this variable and its quality)

Variable Description

Ethnicity is the ethnic group or groups a person identifies with or has a sense of belonging to. It is a measure of cultural affiliation (in contrast to race, ancestry, nationality, or citizenship). Ethnicity is self-perceived and a person can belong to more than one ethnic group.

An ethnic group is made up of people who have some or all of the following characteristics:

  • a common proper name
  • one or more elements of common culture that need not be specified, but may include religion, customs, or language
  • a unique community of interests, feelings, and actions
  • a shared sense of common origins or ancestry
  • a common geographic origin.

The key output variables derived from ethnicity are:

  • Total response
  • Grouped total response
  • Number of ethnic groups.
Other Variable Information

Priority level

Priority level 1

We assign a priority level to all census variables: Priority 1, 2, or 3 (with 1 being highest and 3 being the lowest priority).

Ethnicity is a priority 1 variable. Priority 1 variables are core census variables that have the highest priority in terms of quality, time, and resources across all phases of a census.

The census priority level for ethnicity remains the same as in 2013.

Quality Management Strategy and the Information by variable for Ethnicity (2013) have more information on the priority rating.

Overall quality rating for 2018 Census

High quality

The ‘Data quality processes’ section below has more detail on the rating for this variable.

The External Data Quality Panel has provided an independent assessment of the quality of this variable and has rated it as moderate quality. Initial Report of the 2018 Census External Data Quality Panel and Final report of the 2018 Census External Data Quality Panel have more information.

Subject population

Census night population

This question applies to all people in New Zealand on census night. However, data on ethnicity is usually output for the census usually resident subject population.

‘Subject population’ means the people, families, households, or dwellings to whom the variable applies.

How this data is classified

Census Ethnic05 V2 2018 CENSV2.0.0

Ethnicity is a hierarchical classification with four levels. Detailed ethnic group information is collected so that responses can be coded to specific ethnic group categories at the most detailed level of the classification, level four. Where this is not possible, information is coded to level two or to level three. Level one is used solely for output and contains six categories and one residual category:

1 European

2 Māori

3 Pacific Peoples

4 Asian

5 MELAA (Middle Eastern / Latin American / African)

6 Other ethnicity

9 Not elsewhere included

For the census night population count, ‘not elsewhere included’ contains the residual categories of ‘response unidentifiable’, ‘response outside of scope’, ‘don’t know’ and ‘refused to answer’, alongside ‘not stated’.

As in the classification for the 2006 and 2013 Censuses, ‘New Zealander’ or ‘Kiwi’ responses were coded to ‘other’, whilst ‘Pākehā’ was coded to ‘New Zealand European’ at level two of the classification.

  • Level two has 21 categories and five residual categories.
  • Level three has 36 categories and five residual categories.
  • Level four has 180 categories and five residual categories.

Ethnic group indicators are derived at level one of the ethnicity classification, such as the Māori ethnic group indicator and Pacific Peoples ethnic group indicator.

Ethnicity is a multiple response variable. Therefore, the number of total responses will be greater than the number of respondents. However, ethnicity is also output as a single response variable, for people who reported only one ethnicity, and as a combination response variable, for people who reported more than one ethnicity.

  • For the ethnicity total response output variable, an individual can be counted in as many ethnic groups as they choose. However, for output and time series continuity, the responses were reduced to a maximum of six per person.
  • For the grouped ethnicity total response output variable, if two or more of an individual's ethnic groups fall into the same broad level one ethnic group category, then the person is only counted once in that category.
  • For the ethnicity single and combination response variable, individuals are counted once in the single or combination category that applies to them.

For example, a person of Samoan, Tongan, and German ethnicity would be counted:

  • once in the category of Pacific Peoples and once as European, at level one of the classification for the total response variable and the grouped total response output
  • twice at level four of the classification for the ethnicities within the Pacific grouping, and once in the level four German classification
  • once in the Pacific Peoples ethnic group indicator as ‘Pacific Peoples and at least one other ethnic group’
  • once in the European ethnic group indicator as ‘European and at least one other ethnic group’
  • once in every other ethnic group indicator, for example in the Māori ethnic group indicator as ‘0 – Non-Māori’. This is because they are a member of the census night subject population but are coded to zero for ethnicities that they don't identify with
  • once in the number of ethnic groups specified variable as ‘3 – Three’.
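
The counting rules in this example can be sketched in Python. This is an illustrative sketch only, not Stats NZ's processing code: the LEVEL_ONE lookup and the function names are hypothetical, the lookup covers only the three ethnicities in the example, and keeping the first six responses is an assumption because the text does not say how responses were reduced to six.

```python
from collections import Counter

# Hypothetical lookup from a level-four ethnicity to its level-one group.
# Only the three ethnicities used in the example are included.
LEVEL_ONE = {
    "Samoan": "Pacific Peoples",
    "Tongan": "Pacific Peoples",
    "German": "European",
}

MAX_RESPONSES = 6  # responses were reduced to a maximum of six per person


def distinct_responses(ethnicities):
    """Drop repeats, keep order, and cap at six responses.

    Keeping the first six is an assumption made for this sketch.
    """
    return list(dict.fromkeys(ethnicities))[:MAX_RESPONSES]


def total_response(ethnicities):
    """Level-four total response: one count per ethnicity reported."""
    return Counter(distinct_responses(ethnicities))


def grouped_total_response(ethnicities):
    """Level-one grouped total response: one count per broad group."""
    groups = {LEVEL_ONE[e] for e in distinct_responses(ethnicities)}
    return Counter(groups)


def number_of_ethnic_groups(ethnicities):
    """Number of ethnic groups the person specified."""
    return len(distinct_responses(ethnicities))


person = ["Samoan", "Tongan", "German"]
print(total_response(person))           # one count each at level four
print(grouped_total_response(person))   # Pacific Peoples once, European once
print(number_of_ethnic_groups(person))  # 3
```

Summing these counters over all people gives totals that exceed the number of respondents, which is why ethnicity total response counts do not add to the subject population.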

The classification of ethnicity in the 2018 Census is consistent with the 2013 Census, with minor changes at level four of the classification. The 2018 classification now includes categories previously grouped under level four ‘not elsewhere classified’ (nec), for example:

  • Asian ethnicities such as Mongolian and Bhutanese.

Examples of other classification changes at level four of the classification are:

  • all 12 Cook Islands Maori classifications in the 2013 Census, such as Aitutaki Islander and Rarotongan, are now grouped as ‘Cook Islands Maori’ in the 2018 Census
  • 26 ‘Other Pacific Peoples’ classifications in the 2013 Census, such as Admiralty Islander and Easter Islander, are now grouped as ‘Other Pacific Peoples nec’.

The Standards and Classifications page provides background information on classifications and standards.

Question format

Ethnicity is collected on the individual form (question 7 on the paper form). The ethnic groups listed on the form with tick boxes are the same as those used in the 2013 Census, with a free text field option for those who marked ‘other ethnicity’.

Stats NZ Store House has samples for both the individual and dwelling paper forms.

There were differences in the ethnicity question format and the way a person could respond to the question between the modes of collection (online and paper forms):

On the online form:

  • respondents who selected the ‘show help’ option were shown guidance for the individual form, with a definition of ethnicity and a link for further assistance
  • the form included as-you-type functionality which helped respondents provide valid, detailed responses
  • if there was no match to the as-you-type list, respondents could provide their free-text response, up to a maximum of 50 characters
  • respondents could add up to six other ethnicities in separate free text fields
  • a person had to select one of the ethnicity options in order to proceed. However, it was possible to select ‘other’ without providing an answer in the free-text box.

On the paper form:

  • respondents were provided with a definition of ethnicity in the Guide Notes document
  • non-response and responses outside the valid range were possible
  • multiple responses were possible, as with the online form, but respondents were limited to 39 characters in the supporting free text field.

Data from the online forms may therefore be of higher overall quality than data from paper forms. However, processing checks and edits were in place to improve the quality of data from paper forms.

How this data is used

Outside Stats NZ

Ethnicity is a core variable used by government, local authorities and research organisations to:

  • monitor the demographic, social and economic progress of ethnic groups
  • evaluate the impact of government policies on the economic and social well-being of ethnic groups
  • allocate funds and plan services directed at the special needs of ethnic groups in areas such as education, housing, health and social welfare
  • measure, monitor and evaluate programme uptake and effectiveness by ethnicity
  • analyse the relationship between ethnicity and other variables for population sub-groups such as women, young people and immigrants.

Within Stats NZ

  • Produce population estimates and projections for level 1 ethnic groups (European, Māori, Pacific Peoples, Asian, MELAA and other).
  • Provide descriptive and analytical profiles of the demographic, social and economic characteristics of different ethnic groups and produce statistics on the changing ethnic diversity of New Zealand's population.
  • Provide denominator data for the calculation of ethnic-specific rates for a wide range of topics, for example fertility, mortality, morbidity, and inequality.
  • Provide a sampling base, alongside the 2018 Māori descent population, for Te Kupenga (the Māori Social Survey).

2018 data sources

We used alternative data sources for missing census responses and responses that could not be classified or did not provide the type of information asked for.

The table below shows the breakdown of the various data sources used for this variable.

Ethnicity – 2018 census night population
Source Percent
Response from 2018 Census 84.4 percent
2013 Census data 8.2 percent
Administrative data 6.2 percent
Statistical imputation 1.2 percent
No information <0.1 percent
Total 100 percent
Due to rounding, individual figures may not always sum to the stated total(s)

Where appropriate, we used responses from the 2013 Census to replace missing responses at level four of the classification. When this was not possible, the following admin data sources were used:

  • Births register, Department of Internal Affairs
  • Qualification enrolments and courses, Ministry of Education
  • Cohort demographics, Ministry of Health
  • Department of Corrections data
  • Ministry of Defence data.

It is important to note the following caveats relating to the use of admin data for the ethnicity variable:

  • the percentage of ethnicity data which is from 2013 Census data, admin data, and imputation for specific population groups will differ from that for the overall subject population
  • data from admin sources and the 2013 Census has been collected at various points in time and may not necessarily reflect the ethnicities a person identified with on census night (for example, a person is assigned an ethnicity at birth but may now be old enough to provide it themselves)
  • the Births Register can only be used to source ethnicity for individuals born after 01 September 1995, when ethnicity was added
  • Ministry of Health data can only be used to source ethnicity for individuals who have interacted with the healthcare system since 2004.

Addition of admin records to the NZ Census dataset: an overview of statistical methods provides more information on the use of Department of Corrections and the Ministry of Defence data, as well as general information on the timeliness of administrative data.

When it was not possible to use 2013 Census responses or admin data, we used within household donor imputation, finding the person closest in age in the usual residence and copying their ethnicity. Nearest neighbour statistical imputation was performed if no information about the person from the household was available. Both methods are included in the ‘statistical imputation’ percentage in the data sources table above.
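
As a rough illustration of the within household donor step, the sketch below picks the household member nearest in age who has an ethnicity and copies it. It is a minimal sketch under stated assumptions, not the production method: the Person class and function names are hypothetical, ties in age go to whoever is listed first, and the nearest neighbour fallback is only signalled by returning None.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    """Hypothetical record: age plus any ethnicities already available."""
    age: int
    ethnicities: Optional[List[str]] = None  # None means nothing is available


def within_household_donor(recipient, household):
    """Return the household member closest in age who has an ethnicity.

    Returns None when no one in the household can act as a donor.
    Ties are broken by list order, which is an assumption of this sketch.
    """
    donors = [p for p in household if p is not recipient and p.ethnicities]
    if not donors:
        return None
    return min(donors, key=lambda p: abs(p.age - recipient.age))


def impute_ethnicity(recipient, household):
    """Copy the donor's ethnicities, or return None if there is no donor.

    A None result is where nearest neighbour imputation would take over
    (not shown here).
    """
    donor = within_household_donor(recipient, household)
    if donor is None:
        return None
    return list(donor.ethnicities)


child = Person(age=4)
parent = Person(age=35, ethnicities=["Samoan"])
sibling = Person(age=7, ethnicities=["Samoan", "German"])
print(impute_ethnicity(child, [child, parent, sibling]))  # sibling is the donor
```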

The ‘no information’ percentage in the data sources table is where we were not able to source data for a person in the subject population.

Missing and residual responses

‘No information’ in the data sources table is the percentage of the subject population coded to ‘not stated’. In recent censuses, non-response was the percentage of the subject population coded to ‘not stated’.

In 2018, the percentage of ‘not stated’ for the usually resident population is zero due to the use of the additional data sources described above. However, ethnicity data on the census night population includes overseas visitors whose ethnicity information we couldn’t complete using other data sources.

Percentage of ‘not stated’ for the census night population:

  • 2018: <0.1 percent
  • 2013: 5.3 percent
  • 2006: 4.0 percent.

Responses that could not be classified or did not provide the type of information asked for, such as ‘response unidentifiable’ and ‘response outside of scope’, remain in the data where we were unable to find information from another source. In the 2018 data sources table, these residuals are included within the ‘response from 2018 Census’ percentage.

For output purposes, as with the 2013 and 2006 Censuses, these residual category responses are grouped with ‘not stated’ and are classified as ‘not elsewhere included’.

Percentage of ‘not elsewhere included’ for the census night population:

  • 2018: 0.1 percent
  • 2013: 5.5 percent
  • 2006: 4.3 percent.

2013 Census data user guide provides more information about non-response in the 2013 Census.

Data quality processes

Overall quality rating: High quality

Data was evaluated to assess whether it meets quality standards and is suitable for use.

Three quality metrics contributed to the overall quality rating:

  • data sources and coverage
  • consistency and coherence
  • data quality.

The lowest rated metric determines the overall quality rating.

Data quality assurance for 2018 Census provides more information on the quality rating scale.

Data sources and coverage: High quality

We have assessed the quality of all the data sources that contribute to the output for the variable. To calculate a data sources and coverage quality score for a variable, each data source is given a rating, which is multiplied by the proportion that source contributes to the total output.

The rating for a valid census response is defined as 1.00. Ratings for other sources are the best estimates available of their quality relative to a census response. The score contributions of all sources are then summed. The total score, expressed as a percentage, determines the metric rating according to the following ranges:

  • 98–100 = very high
  • 95–<98 = high
  • 90–<95 = moderate
  • 75–<90 = poor
  • <75 = very poor.

At level two of the ethnicity classification, 2013 Census data was highly comparable to census responses, while admin data was broadly comparable to census forms. Statistical imputation was moderately comparable to census forms. The high proportion of data from received forms, 2013 Census and admin sources in comparison to the low proportion derived from statistical imputation contributed to the score of 0.97, determining the high quality rating.

Quality rating calculation table for the sources of ethnicity data – 2018 census night population
Source                                           Rating   Percent of total   Score contribution
2018 Census form                                 1.00     84.41              0.84
2013 Census                                      0.91     8.18               0.07
Admin data                                       0.76     6.17               0.05
Within household donor                           0.80     0.44               0.00
Donor’s 2018 Census form                         0.60     0.64               0.00
Donor’s response sourced from 2013 Census        0.55     0.08               0.00
Donor’s response sourced from admin data         0.46     0.04               0.00
Donor’s response sourced from within household   0.48     0.01               0.00
No Information                                   0.00     0.05               0.00
Total                                                     100.00             0.97
Due to rounding, individual figures may not always sum to the stated total(s) or score contributions.
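
The score in the table can be reproduced with a short calculation. The sketch below is illustrative: the function names are hypothetical, the ratings and percentages are copied from the table, and it assumes the rating bands apply to the score expressed on a 0 to 100 scale (so 0.97 is read as 97).

```python
# (source, rating, percent of total) copied from the table above
SOURCES = [
    ("2018 Census form", 1.00, 84.41),
    ("2013 Census", 0.91, 8.18),
    ("Admin data", 0.76, 6.17),
    ("Within household donor", 0.80, 0.44),
    ("Donor's 2018 Census form", 0.60, 0.64),
    ("Donor's response sourced from 2013 Census", 0.55, 0.08),
    ("Donor's response sourced from admin data", 0.46, 0.04),
    ("Donor's response sourced from within household", 0.48, 0.01),
    ("No information", 0.00, 0.05),
]


def coverage_score(sources):
    """Sum of rating multiplied by the proportion each source contributes."""
    return sum(rating * percent / 100 for _, rating, percent in sources)


def metric_rating(score):
    """Map a score to a rating band.

    Assumption: the bands are defined on a 0 to 100 scale, so the score
    is multiplied by 100 before comparison.
    """
    pct = score * 100
    if pct >= 98:
        return "very high"
    if pct >= 95:
        return "high"
    if pct >= 90:
        return "moderate"
    if pct >= 75:
        return "poor"
    return "very poor"


score = coverage_score(SOURCES)
print(round(score, 2), metric_rating(score))  # 0.97 high
```

Working from unrounded contributions explains why the rounded score contributions in the table (0.84, 0.07, 0.05) sum to 0.96 while the total is 0.97.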

Data sources, editing, and imputation in the 2018 Census has more information on the Canadian census edit and imputation system (CANCEIS) that was used to derive donor responses.

Consistency and coherence: High quality

Ethnicity data was assessed for consistency with expectations and time series for the usually resident subject population. Data is consistent with expectations across nearly all consistency checks, with some minor variation from expectations or benchmarks that makes sense due to real-world change, incorporation of other sources of data, or a change in how the variable has been collected.

Population changes, primarily migration, births, and deaths, drive the majority of change for this variable. However, the introduction of administrative enumeration and the use of admin data sources have also affected the total distributions for this variable compared with previous censuses.

  • The use of admin enumeration means ethnicity data was available for people who may have previously been missed from the census.
  • Missing responses from paper forms, responses outside of scope and other residuals have been replaced with 2013 Census data, admin data or statistical imputation in 2018.

Other changes to the method of collection led to expected variation at the lower levels of the classification. For example:

  • the as-you-type online functionality allowed for more detailed responses – for example reducing counts in the ‘British nfd’ category whilst increasing counts for the ‘English’, ‘Scottish’, ‘Irish’ and ‘Welsh’ categories
  • conversely, admin data was not always available to level four of the classification.

Data quality: High quality

The data quality checks for ethnicity included cross-variable checks to the SA2 level of geography, for the census usually resident population.

At the highest levels of geography, ethnicity data has only minor data quality issues. The quality of coding and responses within classification categories is high. Any issues with the variable appear in a low number of cases (typically in the low hundreds).

Edits were run to improve the quality of data, in particular for responses received from paper forms. However, it is important to note that the impact of the mode (online or paper) on data quality may vary across ethnic groups.

Recommendations for use and further information

We recommend using this data in a similar way to the 2013 Census data.

When using this data you should be aware that:

  • due to the use of administrative enumeration to replace missing responses in the 2018 Census, the ethnicity counts in the 2018 Census are generally higher than in the 2013 Census, including for Māori and Pacific ethnic groups. As a result of this change in methodology for the 2018 Census, time series data should be interpreted with care.
  • data has been assessed to be consistent at the SA2 level of geography and at the higher levels of the classification. Some variation is possible, primarily due to the introduction of administrative data in 2018.
  • some level four categories could be combined for a more consistent time series.

Comparisons with other data sources

Although surveys and sources other than the census collect ethnicity data, data users are advised to familiarise themselves with the strengths and limitations of the sources before use.

Key considerations when comparing ethnicity information from the 2018 Census with other sources include:

  • a person may vary their reporting of their ethnicity, including the number of ethnicities they identify with, according to the context in which they are asked
  • census is a key source of information on ethnicity for small areas and small populations. Many other sources do not provide detail at this level.
  • census aims to be a national count of all individuals in a population while other sources measuring this variable are only based upon a subset of the population.

Contact our Information Centre for further information about using this variable.

This variable is not part of a dataset.


Conceptual Variable
Ethnicity

Revision Date Responsibility Rationale
16 30/11/2021 2:59:19 PM