Quality Statement
Ethnicity is the ethnic group or groups that people identify with or feel they belong to. Ethnicity is a measure of cultural affiliation, as opposed to race, ancestry, nationality, or citizenship. Ethnicity is self-perceived, and people can affiliate with more than one ethnic group.
An ethnic group is made up of people who have some, or all, of the following characteristics:
- a common proper name
- one or more elements of common culture, which need not be specified but may include religion, customs, or language
- unique community of interests, feelings, and actions
- a shared sense of common origins or ancestry and/or
- a common geographic origin.
Ethnicity’s key variables are:
- total response ethnicity
- ethnic group indicators for Asian, European, Māori, Middle Eastern/Latin American/African (MELAA), Pacific peoples and other.
Ethnicity data is also output as the following variables:
- grouped total response
- ethnic groups single and combination ethnicity
- number of ethnic groups.
Total response ethnicity includes all people who stated each ethnic group, whether as their only ethnic group or as one of several ethnic groups. Where a person reported more than one ethnic group, they have been counted in each applicable group.
High quality
When examining or using 2023 Census data specific to an ethnic group, consider the data source and coverage quality rating and recommendations for use for that ethnic group. When examining or using ethnicity data for the whole census usually resident population count, consider the overall quality rating and recommendations for use.
Data quality processes section below has more detail on the rating.
Priority level 1
A priority level is assigned to all census concepts: priority 1, 2, or 3 (with 1 being highest and 3 being the lowest priority).
Ethnicity is a priority 1 concept. Priority 1 concepts are core census concepts that have the highest priority in terms of quality, time, and resources across all phases of a census.
The census priority level for ethnicity remains the same as 2018.
The 2023 Census: Final content report has more information on priority ratings for census concepts.
Census usually resident population count
This question applies to all people in New Zealand on census night. However, ethnicity data is usually output for the census usually resident subject population.
‘Subject population’ means the people, families, households, or dwellings that the variable applies to.
Total response ethnicity uses a 4-level hierarchical classification. Level 1 categories are presented in the table below:
Census Ethnic05 V2.1 2023 CENS V1.0.0 - level 1 of 4
Code | Category |
---|---|
1 | European |
2 | Māori |
3 | Pacific Peoples |
4 | Asian |
5 | Middle Eastern/Latin American/African |
6 | Other Ethnicity |
9 | Not elsewhere included |
Subsequent classification levels provide more detailed categories. Detailed ethnicity information is collected so that responses can be coded to specific ethnicity categories at level 4 of the classification. Where this is not possible, information is coded to level 2 or to level 3. Level 1 is used solely for output and contains six categories and one residual category. Follow the link above the table to examine the classification.
Since the 2018 Census, the names of three ethnicities have changed at level 4 of the classification. These changes were:
- Gypsy changed to Romani
- Afghani changed to Afghan
- Nepalese changed to Nepali.
Total response ethnicity is a multiple response variable so the number of responses will be greater than the number of respondents. Note that respondents are limited to six ethnicity codes for output and time series continuity. When ethnicity is output at higher levels of the classification, if two or more of an individual’s ethnicities are in the same higher level category, then the person is only counted once in that category. For example, an individual who identified as both Samoan and Tongan would only be counted once in the level 1 category Pacific peoples.
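The counting rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not Stats NZ's processing code: the `LEVEL1_GROUP` lookup, the function names, and the three example people are hypothetical, and the lookup holds only a handful of ethnicities rather than the full 4-level classification.

```python
from collections import Counter

# Hypothetical lookup from a detailed ethnicity to its level 1 group.
# Only a few illustrative entries; the real classification is far larger.
LEVEL1_GROUP = {
    "New Zealand European": "European",
    "Māori": "Māori",
    "Samoan": "Pacific Peoples",
    "Tongan": "Pacific Peoples",
    "New Zealander": "Other Ethnicity",
}


def level1_groups(ethnicities):
    """Return the set of level 1 groups for one person's ethnicities."""
    # A set removes duplicates, so two ethnicities in the same
    # level 1 group contribute that group only once.
    return {LEVEL1_GROUP[e] for e in ethnicities}


def total_response_counts(people):
    """Count each person once in every level 1 group they belong to."""
    counts = Counter()
    for ethnicities in people:
        counts.update(level1_groups(ethnicities))
    return counts


# Made-up example respondents.
people = [
    ["Samoan", "Tongan"],               # two ethnicities, one level 1 group
    ["Māori", "New Zealand European"],  # counted in two level 1 groups
    ["New Zealander"],
]
counts = total_response_counts(people)
```

In this sketch the Samoan and Tongan respondent adds one to Pacific Peoples, and the group counts sum to four even though there are only three respondents, which is why total response figures add to more than the population count.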
‘New Zealander’ or ‘Kiwi’ responses were coded to ‘New Zealander’ at level 4 of the classification (under ‘Other Ethnicity’ at level 1), while ‘Pākehā’ was coded to ‘New Zealand European’ at level 2 of the classification.
For each level 1 ethnic group in the total response ethnicity classification, an indicator is derived. Links to the indicator classifications are:
- Census European ethnic group indicator V3.0.0
- Census Māori ethnic group indicator V3.0.0
- Census Pacific peoples ethnic group indicator V3.0.0
- Census Asian ethnic group indicator V3.0.0
- Census MELAA ethnic group indicator V2.0.0
- Census Other ethnicity ethnic group indicator V2.0.0
Each indicator’s categories are similar, with the Māori ethnic group indicator presented below as an example:
Code | Category |
---|---|
00 | Non-Māori |
01 | Māori only |
02 | Māori and at least one other ethnic group |
99 | Not elsewhere included |
Ethnic group indicators use a 2-level hierarchical classification with level 1 categories presented in the table above. At level 2, ‘Not elsewhere included’ is split into five residual categories. Follow the links above the table to examine these classifications.
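As a rough illustration of how the level 1 indicator categories relate to a person's ethnic groups, the sketch below derives an indicator code from a set of level 1 groups. The function name is hypothetical, and treating an empty set as 'Not elsewhere included' is an assumption made for the example, not a documented Stats NZ rule.

```python
def ethnic_group_indicator(groups, target):
    """Return the level 1 indicator code for `target`.

    `groups` is the set of level 1 ethnic groups for one person.
    Assumption: an empty set stands for a response that could not
    be classified, and maps to 'Not elsewhere included'.
    """
    if not groups:
        return "99"  # Not elsewhere included
    if target not in groups:
        return "00"  # e.g. Non-Māori
    if groups == {target}:
        return "01"  # e.g. Māori only
    return "02"      # e.g. Māori and at least one other ethnic group


maori_only = ethnic_group_indicator({"Māori"}, "Māori")
maori_and_other = ethnic_group_indicator({"Māori", "European"}, "Māori")
non_maori = ethnic_group_indicator({"Asian"}, "Māori")
```

The same function covers the other five indicators by changing `target`, which reflects the statement that each indicator's categories are similar.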
Total response ethnicity is also output as a single-combination response variable. The links to the single-combination response classifications are:
- Census ethnic groups single/combination output (detailed) V1.0.0
- Census ethnic single/combination output (8 groups) V1.0.0
- Census ethnic groups single/combination output (15 groups) V1.0.0
As an example, the categories for the 15 group single-combination response variable are:
Code | Category | Code | Category |
---|---|---|---|
11 | European only | 23 | Pacific Peoples/European |
12 | Māori only | 24 | Asian/European |
13 | Pacific Peoples only | 29 | Two groups not elsewhere included |
14 | Asian only | 31 | Māori/Pacific Peoples/European |
15 | Middle Eastern/Latin American/African only | 39 | Three groups not elsewhere included |
16 | Other Ethnicity only | 41 | Four to six groups |
21 | Māori/European | 99 | Not elsewhere included |
22 | Māori/Pacific Peoples |
Single-combination response variables use a 1-level flat classification like that presented in the table above. Follow the links above the table to examine these classifications.
The single-combination response classifications distinguish between those who responded with a single ethnicity, and those that responded with multiple ethnicities. Respondents are counted once in the single or combination category that applies to them.
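The 15-group table can be read as a lookup from a person's set of ethnic groups to exactly one code. The sketch below is one possible reading, not the official derivation: it assumes that combinations are formed from level 1 groups (so two ethnicities within the same level 1 group count as a single group) and that an empty set maps to 'Not elsewhere included'. The names `SINGLE`, `NAMED_COMBINATIONS`, and `single_combination_code` are hypothetical.

```python
# Codes for people with exactly one level 1 group.
SINGLE = {
    "European": "11",
    "Māori": "12",
    "Pacific Peoples": "13",
    "Asian": "14",
    "Middle Eastern/Latin American/African": "15",
    "Other Ethnicity": "16",
}

# Combinations that have their own category in the 15-group table.
NAMED_COMBINATIONS = {
    frozenset({"Māori", "European"}): "21",
    frozenset({"Māori", "Pacific Peoples"}): "22",
    frozenset({"Pacific Peoples", "European"}): "23",
    frozenset({"Asian", "European"}): "24",
    frozenset({"Māori", "Pacific Peoples", "European"}): "31",
}


def single_combination_code(groups):
    """Map one person's level 1 groups to a single 15-group code."""
    groups = frozenset(groups)
    n = len(groups)
    if n == 0:
        return "99"  # Not elsewhere included (assumed for no valid group)
    if n == 1:
        return SINGLE[next(iter(groups))]
    if groups in NAMED_COMBINATIONS:
        return NAMED_COMBINATIONS[groups]
    if n == 2:
        return "29"  # Two groups not elsewhere included
    if n == 3:
        return "39"  # Three groups not elsewhere included
    return "41"      # Four to six groups


code = single_combination_code({"European", "Māori"})
```

Because every person receives exactly one code, counts from this variable sum to the number of respondents, unlike total response counts.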
Standards and classifications has more information on what classifications are, how they are reviewed, where they are stored, and how to provide feedback on them.
Ethnicity data is collected from the individual form (question 8 paper form).
The same question format has been used to collect ethnicity since the 2001 Census. The ethnicities listed on the form with tick boxes are the same as those used in the 2018 Census, with a free text field option for those who marked the ‘other’ option.
The modes of collection (online and paper forms) had differences in question wording, layout, and the way a person could respond.
On the online form:
- if respondents selected the ‘show help’ option, guidance was provided with a definition of ethnicity and a link for further information
- the form included as-you-type functionality that helped respondents provide valid, detailed responses
- if there was no match in the as-you-type list, respondents could provide their free-text response, up to a maximum of 50 characters
- respondents could add up to six other ethnicities in separate free text fields
- the question had to be answered in order to proceed; however, it was possible to select ‘other’ without providing an answer in the free-text box.
On the paper form:
- respondents were provided with a definition of ethnicity in the Guide Notes
- non-response and responses outside the valid range were possible
- multiple responses were possible, but respondents were limited to 39 characters in the supporting free text field.
Data from the online forms may therefore be of higher overall quality than data from paper forms. However, processing checks and edits were in place to improve the quality of the data.
Stats NZ Store House has samples for both the individual and dwelling paper forms.
Data-use outside Stats NZ:
Ethnicity is used by government, local authorities, and research organisations to:
- assess the demographic, social and economic progress of ethnic groups
- evaluate the impact of government policies on the economic and social well-being of ethnic groups
- allocate funds and plan services directed at the special needs of ethnic groups in areas such as education, housing, health, and social welfare
- measure, monitor, and evaluate programme uptake and effectiveness by ethnicity
- analyse the relationship between ethnicity and other variables for population sub-groups such as women, young people, and immigrants.
Data-use by Stats NZ:
- to input into ethnic population estimates and projections
- to provide descriptive and analytical profiles of demographic, social and economic characteristics of different ethnic groups and produce statistics on the changing ethnic diversity of New Zealand’s population
- to provide denominator data for the calculation of ethnic-specific rates for a wide range of topics, for example, fertility, mortality, morbidity, and inequality
- to help inform sampling frames for other collections.
Alternative data sources were used for missing census responses and responses that could not be classified or did not provide the type of information asked for. The table below shows the distribution of data sources for ethnicity data.
Data sources for ethnicity data, as a percentage of census usually resident population count, 2023 Census | ||
---|---|---|
Source of ethnicity data | Percent | |
2023 Census response | 86.0 | |
Historical census | 8.8 | |
2018 Census | 6.2 | |
2013 Census | 2.6 | |
Admin data | 4.4 | |
Statistical imputation | 0.8 | |
Probabilistic imputation | 0.5 | |
CANCEIS(1) donor's response sourced from 2023 Census form | 0.3 | |
CANCEIS donor's response sourced from 2018 Census | <0.1 | |
CANCEIS donor's response sourced from 2013 Census | <0.1 | |
CANCEIS donor’s response sourced from admin data | <0.1 | |
CANCEIS donor’s response sourced from probabilistic imputation | <0.1 | |
No information | 0.0 | |
Total | 100.0 | |
1. CANCEIS = imputation based on CANadian Census Edit and Imputation System. Note: Due to rounding, individual figures may not always sum to the stated total(s) or score contributions. |
The following admin data sources were used, listed in priority order of use:
- Department of Internal Affairs (DIA)
- Ministry of Education - tertiary and school enrolments
- Manatū Hauora – Ministry of Health - cohort demographics
- Ministry of Social Development - primary benefit recipients.
Editing, data sources, and imputation in the 2023 Census describes how data quality is improved by editing and how missing and residual responses are filled with alternative data sources (admin data and historical census responses) or statistical imputation. The paper also describes the use of CANCEIS (the CANadian Census Edit and Imputation System), which is used to perform imputation.
Missing and residual responses represent data gaps where respondents either did not provide answers (missing responses) or provided answers that were not valid (residual responses) in the 2023 Census.
For:
- 2018 and 2023 Census, ethnicity data contains no missing or residual responses as they were filled with data from alternative sources
- 2013 Census, 5.3 percent of responses were coded to ‘Not stated’, and 5.4 percent of respondents were coded to ‘Not elsewhere included’ (which includes ‘Not stated’ responses).
Note that in the 2018 and 2013 Censuses, missing and residual responses were reported for the census night population count so percentages for this subject population will be different to those above.
Overall quality rating: High quality
Data has been evaluated to assess whether it meets quality standards and is suitable for use.
Three quality metrics contribute to the overall quality rating:
- data sources and coverage
- consistency and coherence
- accuracy of responses.
The lowest rated metric determines the overall quality rating.
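The 'lowest rated metric' rule amounts to taking a minimum over an ordered scale. The sketch below assumes the five-step scale from very poor to very high applies to all three metrics; the names `RATING_ORDER` and `overall_rating` are hypothetical, and the metric ratings are those reported for ethnicity in this statement.

```python
# Rating scale ordered from lowest to highest (assumed to be shared
# by all three quality metrics).
RATING_ORDER = ["very poor", "poor", "moderate", "high", "very high"]


def overall_rating(metric_ratings):
    """Return the lowest rating among the metrics."""
    return min(metric_ratings.values(), key=RATING_ORDER.index)


# Metric ratings for ethnicity as reported in this statement.
ethnicity_metrics = {
    "data sources and coverage": "very high",
    "consistency and coherence": "high",
    "accuracy of responses": "very high",
}
overall = overall_rating(ethnicity_metrics)
```

With these inputs the result is 'high', matching the overall quality rating given for ethnicity.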
Data quality assurance in the 2023 Census provides more information on the quality rating scale.
Data sources and coverage: Very high quality
The quality of all the data sources that contribute to the output for the variable was assessed. To calculate the data sources and coverage quality score for a variable, each data source is given a rating, which is multiplied by the proportion that source contributes to the total output.
The rating for a valid census response is defined as 1.00. Ratings for other sources are the best estimates available of their quality relative to a census response. The contributions from all sources are summed to give the total score, which then determines the metric rating according to the following range:
- 0.98–1.00 = very high
- 0.95–<0.98 = high
- 0.90–<0.95 = moderate
- 0.75–<0.90 = poor
- <0.75 = very poor.
The high proportion of ethnicity data sourced from 2023 Census forms, alongside the high quality of alternative data sources, resulted in a score of 0.98 leading to a quality rating of very high.
Please see the Metric 1 data source and coverage quality ratings for Ethnic group indicator data section below for the data sources and coverage rating calculation tables for each ethnic group.
Data sources and coverage rating calculation for ethnicity data, census usually resident population count, 2023 Census | |||
---|---|---|---|
Source of ethnicity data | Rating | Percent | Score contribution |
2023 Census response | 1.00 | 85.97 | 0.86 |
2018 Census | 0.93 | 6.21 | 0.06 |
2013 Census | 0.91 | 2.60 | 0.02 |
Admin data | 0.79 | 4.43 | 0.04 |
Probabilistic imputation | 0.80 | 0.48 | <0.01 |
CANCEIS(1) nearest neighbour imputation | 0.80 | 0.31 | <0.01 |
No information | 0.00 | 0.00 | 0.00 |
Total | | 100.0 | 0.98 |
1. CANCEIS = imputation based on CANadian Census Edit and Imputation System. Note: Due to rounding, individual figures may not always sum to the stated total(s) or score contributions. |
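The calculation in the table above can be reproduced as a weighted sum. The sketch below uses the ratings and percentages from the table. The function names are hypothetical, and rounding the score to two decimal places before applying the rating ranges is an assumption made to match how scores are published, not a documented rule.

```python
# (source, rating, percent of output) taken from the table above.
SOURCES = [
    ("2023 Census response", 1.00, 85.97),
    ("2018 Census", 0.93, 6.21),
    ("2013 Census", 0.91, 2.60),
    ("Admin data", 0.79, 4.43),
    ("Probabilistic imputation", 0.80, 0.48),
    ("CANCEIS nearest neighbour imputation", 0.80, 0.31),
    ("No information", 0.00, 0.00),
]

# Lower bound of each rating range, highest first.
BANDS = [
    (0.98, "very high"),
    (0.95, "high"),
    (0.90, "moderate"),
    (0.75, "poor"),
]


def coverage_score(sources):
    """Sum of rating x proportion over all contributing sources."""
    return sum(rating * percent for _, rating, percent in sources) / 100


def metric_rating(score):
    """Map a score to its rating, rounding to 2 d.p. first (assumed)."""
    rounded = round(score, 2)
    for lower_bound, label in BANDS:
        if rounded >= lower_bound:
            return label
    return "very poor"


score = coverage_score(SOURCES)
rating = metric_rating(score)
```

For these inputs the score rounds to 0.98, which falls in the 'very high' range and agrees with the published rating. Small differences from the published score contributions can arise because the percentages in the table are themselves rounded.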
Consistency and coherence: High quality
Ethnicity data is highly consistent with expectations and benchmarks across most consistency checks. Data is aligned with historical trends on regional council and territorial authority and local boards (TALB) geographies.
Māori and Pacific Peoples ethnicity data has a relatively high proportion of historical census and admin data. An individual's ethnic identity can change over time, so these data sources may not represent their ethnic identity at the time of the 2023 Census collection. Māori and Pacific Peoples ethnicity data has a high proportion of alternative data sourcing in some areas (both TALB and statistical area 2). This is particularly evident in areas affected by Cyclone Gabrielle.
Accuracy of responses: Very high quality
Ethnicity data has no accuracy of response issues that have an observable effect on the data. The quality of coding is very high. Any issues with the variable appear in a very low number of cases (typically less than a hundred). Improvement in scanning repair for paper forms reduced the number of responses needing to be sourced from alternative sources. For the 2023 Census, changes to the online form as-you-type list were made to encourage more specific responses.
Ethnicity data can be used in a comparable manner to the 2018 Census. When using the data, users should be aware:
- of the proportion of alternatively sourced ethnicity data at low levels of geography
- that all residual responses have been imputed or sourced from admin data for this variable for the census usually resident population count. The use of the census combined model ensures that the data is of high quality and comparable to the 2018 Census. Note that the 2013 Census did not use additional data sources or imputation which reduces comparability with the 2018 and 2023 Censuses.
Comparison to other data sources
Although there are surveys and sources other than the census that collect ethnicity data, data users are advised to familiarise themselves with the strengths and limitations of the sources before use.
Key considerations when comparing ethnicity information from the 2023 Census with other sources include:
- A person may vary their reporting of their ethnicity, including the number of ethnicities they identify with, according to the context in which they are asked.
- Census is a key source of information on ethnicity for small areas and small populations. Many other sources do not provide detail at this level.
- Census aims to be a national count of all individuals in a population while other sources measuring this variable are only based upon a sample of the population.
To assess how this concept aligns with the variables from the previous census, please use the links below:
The data sources and coverage quality ratings for the ethnic group indicator variables are:
- European: Very high
- Māori: High
- Pacific peoples: High
- Asian: Very high
- Middle Eastern/Latin American/African (MELAA): Very high
- Other ethnicity: Very high
Metric 1 data sources and coverage rating calculation for European ethnicity, European ethnic group census usually resident population count, 2023 Census | |||
---|---|---|---|
Source of European ethnicity data | Rating | Percent | Score contribution |
2023 Census response | 1.00 | 88.58 | 0.89 |
2018 Census | 0.93 | 5.92 | 0.06 |
2013 Census | 0.91 | 2.03 | 0.02 |
Admin data | 0.79 | 2.86 | 0.02 |
Probabilistic imputation | 0.80 | 0.33 | <0.01 |
CANCEIS(1) nearest neighbour imputation | 0.80 | 0.27 | <0.01 |
No information | 0.00 | 0.00 | 0.00 |
Total | | 100.0 | 0.99 |
1. CANCEIS = imputation based on CANadian Census Edit and Imputation System. Note: Due to rounding, individual figures may not always sum to the stated total(s) or score contributions. |
Metric 1 data sources and coverage rating calculation for Māori ethnicity, Māori ethnic group census usually resident population count, 2023 Census | |||
---|---|---|---|
Source of Māori ethnicity data | Rating | Percent | Score contribution |
2023 Census response | 1.00 | 73.11 | 0.73 |
2018 Census | 0.93 | 9.93 | 0.09 |
2013 Census | 0.91 | 5.99 | 0.05 |
Admin data | 0.79 | 9.84 | 0.08 |
Probabilistic imputation | 0.80 | 0.84 | 0.01 |
CANCEIS(1) nearest neighbour imputation | 0.80 | 0.27 | <0.01 |
No information | 0.00 | 0.00 | 0.00 |
Total | | 100.0 | 0.96 |
1. CANCEIS = imputation based on CANadian Census Edit and Imputation System. Note: Due to rounding, individual figures may not always sum to the stated total(s) or score contributions. |
Metric 1 data sources and coverage rating calculation for Pacific peoples ethnicity, Pacific peoples ethnic group census usually resident population count, 2023 Census | |||
---|---|---|---|
Source of Pacific peoples ethnicity data | Rating | Percent | Score contribution |
2023 Census response | 1.00 | 75.19 | 0.75 |
2018 Census | 0.93 | 8.62 | 0.08 |
2013 Census | 0.91 | 5.15 | 0.05 |
Admin data | 0.79 | 9.83 | 0.08 |
Probabilistic imputation | 0.80 | 0.94 | 0.01 |
CANCEIS(1) nearest neighbour imputation | 0.80 | 0.28 | <0.01 |
No information | 0.00 | 0.00 | 0.00 |
Total | | 100.0 | 0.97 |
1. CANCEIS = imputation based on CANadian Census Edit and Imputation System. Note: Due to rounding, individual figures may not always sum to the stated total(s) or score contributions. |
Metric 1 data sources and coverage rating calculation for Asian ethnicity, Asian ethnic group census usually resident population count, 2023 Census | |||
---|---|---|---|
Source of Asian ethnicity data | Rating | Percent | Score contribution |
2023 Census response | 1.00 | 89.07 | 0.89 |
2018 Census | 0.93 | 4.13 | 0.04 |
2013 Census | 0.91 | 1.19 | 0.01 |
Admin data | 0.79 | 4.58 | 0.04 |
Probabilistic imputation | 0.80 | 0.60 | <0.01 |
CANCEIS(1) nearest neighbour imputation | 0.80 | 0.44 | <0.01 |
No information | 0.00 | 0.00 | 0.00 |
Total | | 100.0 | 0.98 |
1. CANCEIS = imputation based on CANadian Census Edit and Imputation System. Note: Due to rounding, individual figures may not always sum to the stated total(s) or score contributions. |
Metric 1 data sources and coverage rating calculation for MELAA ethnicity, MELAA ethnic group census usually resident population count, 2023 Census | |||
---|---|---|---|
Source of MELAA ethnicity data | Rating | Percent | Score contribution |
2023 Census response | 1.00 | 85.95 | 0.86 |
2018 Census | 0.93 | 5.01 | 0.05 |
2013 Census | 0.91 | 1.82 | 0.02 |
Admin data | 0.79 | 6.11 | 0.05 |
Probabilistic imputation | 0.80 | 0.60 | <0.01 |
CANCEIS(1) nearest neighbour imputation | 0.80 | 0.52 | <0.01 |
No information | 0.00 | 0.00 | 0.00 |
Total | | 100.0 | 0.98 |
1. CANCEIS = imputation based on CANadian Census Edit and Imputation System. Note: Due to rounding, individual figures may not always sum to the stated total(s) or score contributions. |
Metric 1 data sources and coverage rating calculation for Other ethnicity, Other ethnic group census usually resident population count, 2023 Census | |||
---|---|---|---|
Source of Other ethnicity data | Rating | Percent | Score contribution |
2023 Census response | 1.00 | 84.50 | 0.85 |
2018 Census | 0.93 | 7.23 | 0.07 |
2013 Census | 0.91 | 3.09 | 0.03 |
Admin data | 0.79 | 4.46 | 0.04 |
Probabilistic imputation | 0.80 | 0.47 | <0.01 |
CANCEIS(1) nearest neighbour imputation | 0.80 | 0.25 | <0.01 |
No information | 0.00 | 0.00 | 0.00 |
Total | | 100.0 | 0.98 |
1. CANCEIS = imputation based on CANadian Census Edit and Imputation System. Note: Due to rounding, individual figures may not always sum to the stated total(s) or score contributions. |
Contact our Information centre for further information about using this concept.