Sex (information about this variable and its quality)


Sex is the distinction between males and females based on the biological differences in sexual characteristics.



Variable Details

Other Variable Information

Priority level

Priority level 1

We assign a priority level to all census variables: Priority 1, 2, or 3 (with 1 being the highest priority and 3 the lowest).

Sex is a priority 1 variable. Priority 1 variables are core census variables that have the highest priority in terms of quality, time, and resources across all phases of a census.

The census priority rating for this variable remains the same as in 2013.

Quality Management Strategy and the Information by variable for sex (2013) have more information on the priority rating.

Overall quality rating for 2018 Census

Very high quality

Data quality processes section below has more detail on the rating for this variable.

The External Data Quality Panel has provided an independent assessment of the quality of this variable and has rated it as very high quality. Initial Report of the 2018 Census External Data Quality Panel has more information.

Subject population

Census night population

This question applies to all people in New Zealand on census night. However, data on sex is usually output for the census usually resident subject population.

‘Subject population’ means the people, families, households, or dwellings to whom the variable applies.

How this data is classified

Sex - New Zealand Standard Classification V1.0.0

Sex is a flat classification with two categories:

1 Male

2 Female

No provision is made for residual categories (for example ‘not stated’) as, in line with international practice, it is Stats NZ policy to account for the sex of every person in the census.
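As a minimal sketch, the two-category flat classification above could be represented as an enumeration with no residual member. The names `Sex` and `classify_sex` are illustrative assumptions and are not taken from any Stats NZ system.

```python
from enum import Enum


class Sex(Enum):
    """Flat two-category classification; codes follow the list above."""
    MALE = 1
    FEMALE = 2


def classify_sex(code: int) -> Sex:
    """Return the category for a code, rejecting anything else.

    There is deliberately no 'not stated' member, so an unrecognised code
    is raised as an error rather than stored as a residual value.
    """
    try:
        return Sex(code)
    except ValueError:
        raise ValueError(f"no residual category exists for code {code!r}") from None
```

Because the enumeration has no residual member, any record that reaches this step without a valid code has to be resolved from another source before it can be classified.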

Respondents who wished to identify as intersex were instructed to request paper forms and tick both male and female boxes. For official statistics these were classified as either male or female. Data Quality Processes section has more information.

The Standards and Classifications page provides background information on classifications and standards.

Question format

Sex is collected on the individual form (question 3 on the paper form).

Stats NZ Store House has samples for both the individual and dwelling paper forms.

If sex is not answered on the individual form, sex information from the following questions may be used:

  • people present in the dwelling on census night on the household set-up form (online) or dwelling form (question 17 paper)
  • absentees on the household set-up form (online) or the dwelling form (question 20 paper).

There were differences between the modes of collection (paper and online form).

On the online individual form:

  • sex had to be answered to submit the form
  • sex could only be answered with a single response.

On the paper individual and dwelling forms:

  • non-response to this question and multiple responses were possible
  • in many cases forms were mailed back and not subject to a check by field staff.

How this data is used

Sex is a demographic variable at the core of the census. Its various uses include:

Outside Stats NZ

  • By central and local government for planning, service provision, policy development and to understand the demographics of particular regions.
  • By various organisations to produce population estimates and projections.
  • For the NZ Deprivation Index.
  • By a range of researchers, organisations, and special interest groups that use sex information along with information on cultural identity (for example religion and ethnicity).

Within Stats NZ

  • Sex data provides the base population for many derived series, including fertility, mortality, morbidity, suicide, accident, and crime statistics.
  • Used by population statistics to produce population estimates and projections, which underpin policy and planning.
  • Used by population statistics for analysis of childlessness.
  • Used for analysis of population mobility.
  • Provides benchmark data for the design of household surveys.
  • In both standard and customised outputs, sex is commonly cross-tabulated with all the personal variables as well as geographic areas.
  • Crucial for linking with other census variables to get a comprehensive understanding of how they relate to sex.
  • Sex is a core variable for demographic data integration.

2018 data sources

We used alternative data sources for missing census responses and responses that could not be classified or did not provide the type of information asked for. Where possible, we used responses from the 2013 Census, administrative data from the Integrated Data Infrastructure (IDI), or imputation.

The table below shows the breakdown of the various data sources used for this variable.

2018 Sex – census night population

Source                                  Percent
Response from 2018 Census               84.6
Response from 2018 partial forms (1)    4.3
2013 Census data                        0.0
Administrative data                     10.9
Statistical imputation                  0.1
No information                          0.0
Total                                   100

Due to rounding, individual figures may not always sum to the stated total(s).
(1) Partial response is where the sex of an individual was provided on the household set-up form or the paper dwelling form, but we did not receive an individual form.

The ‘no information’ percentage is where we were not able to source sex data for a person in the subject population.

In 2018, this was zero because if a respondent did not complete the sex question on the individual form, and there was no sex information available on the household set-up form or on the dwelling form, we took the best estimate from a range of sources within the Integrated Data Infrastructure (IDI). If this was not possible, or the individual was an overseas visitor and therefore not in the IDI, a response was imputed.
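The fallback order described in the paragraph above can be sketched as a simple "first available source wins" function. This is only an illustration under stated assumptions: the function name, argument names, and source labels are hypothetical, and the real processing system is not described here in enough detail to reproduce.

```python
from typing import Callable, Optional, Tuple


def resolve_sex(
    individual_form: Optional[str],
    household_or_dwelling_form: Optional[str],
    admin_estimate: Optional[str],
    impute: Callable[[], str],
) -> Tuple[str, str]:
    """Return (sex, source) using the first source that has a value."""
    candidates = [
        ("2018 Census individual form", individual_form),
        ("2018 partial form", household_or_dwelling_form),
        ("Administrative data", admin_estimate),
    ]
    for source, value in candidates:
        if value is not None:
            return value, source
    # Last resort, for example overseas visitors who are not in the IDI.
    return impute(), "Statistical imputation"
```

Because imputation always returns a value, every person ends up with a sex and a recorded source, which is why the 'no information' share can be zero.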

Please note that when examining sex data for specific population groups within the subject population, the percentages sourced from administrative data and from imputation may differ from those for the overall subject population.

Addition of administrative records to the New Zealand 2018 Census Dataset: An overview of statistical methods provides information on the linking of census responses to the IDI.

Missing and residual responses

Sex does not have a non-response (‘not stated’) or any other residual category. Responses that could not be classified or did not provide the type of information asked for were replaced by data derived from admin sources or by statistical imputation.

Sex also did not have a non-response (‘not stated’) or any other residual category in recent previous censuses due to the use of imputation. This included imputation for substitute records.

2013 Census data user guide provides more information about non-response and imputation in the 2013 Census.

Data quality processes

Overall quality rating: Very high quality

Data was evaluated to assess whether it meets quality standards and is suitable for use.

Three quality metrics contributed to the overall quality rating:

  • data sources and coverage
  • consistency and coherence
  • data quality.

The lowest rated metric determines the overall quality rating.
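The "lowest rated metric" rule can be sketched as taking the minimum over an ordered scale. The five labels used here are the rating bands this page lists for the data sources and coverage metric; the names `SCALE` and `overall_rating` are assumptions made for illustration.

```python
from typing import Mapping

# Rating scale ordered from worst to best.
SCALE = ["very poor", "poor", "moderate", "high", "very high"]


def overall_rating(metric_ratings: Mapping[str, str]) -> str:
    """Return the lowest rating among the contributing metrics."""
    return min(metric_ratings.values(), key=SCALE.index)


# The three metric ratings reported for sex in the 2018 Census.
sex_2018 = {
    "data sources and coverage": "very high",
    "consistency and coherence": "very high",
    "data quality": "very high",
}
```

Calling `overall_rating(sex_2018)` returns "very high", matching the overall rating stated for this variable. A single lower-rated metric would pull the overall rating down to that level.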

Data quality assurance for 2018 Census provides more information on the quality rating scale.

Data sources and coverage: Very high quality

We have assessed the quality of all the data sources that contribute to the output for the variable. To calculate a data sources and coverage quality score for a variable, each data source is rated and multiplied by the proportion it contributes to the total output.

The rating for a valid census response is defined as 1.00. Ratings for other sources are the best available estimates of their quality relative to a census response. The rating for each source that contributes to the output for the variable is then multiplied by the proportion that source contributes to the total output, and the results are summed to give the total score. The total score, expressed as a percentage, then determines the metric rating according to the following ranges:

  • 98–100 = very high
  • 95–<98 = high
  • 90–<95 = moderate
  • 75–<90 = poor
  • <75 = very poor.

Admin data was highly comparable to census forms, while data sourced through statistical imputation was moderately comparable to census responses. The high proportion of data from received forms and admin sources in comparison to the low proportion sourced from statistical imputation contributed to the score of 1.00, determining the very high quality rating.

Quality rating calculation table for the sources of sex data – 2018 census night population

Source                                                Rating    Percent of total    Score contribution
2018 Census form                                      1.00      84.60               0.85
2018 Census form (missing from individual form)       1.00      4.31                0.04
Admin data                                            1.00      10.95               0.11
Within household donor                                0.70      0.03                0.00
Donor’s 2018 Census form                              0.70      0.10                0.00
Donor’s 2018 Census (missing from individual form)    0.70      0.01                0.00
Donor’s response sourced from admin data              0.70      <0.01               0.00
Donor’s response sourced from within household        0.49      <0.01               0.00
No Information                                        0.00      0.00                0.00
Total                                                           100.00              1.00

Due to rounding, individual figures may not always sum to the stated total(s) or score contributions.
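The score in the table can be reproduced, approximately, as a weighted sum of rating times proportion. The sketch below copies the ratings and percentages from the table and makes two assumptions that are not stated in the source: entries shown as "<0.01" are treated as zero, and the rating bands are applied to the score multiplied by 100. The names `SOURCES`, `BANDS`, `coverage_score`, and `metric_rating` are hypothetical.

```python
# (source, rating, percent of total) copied from the table above.
# Entries shown as "<0.01" are treated as 0.0 here, which is an assumption.
SOURCES = [
    ("2018 Census form", 1.00, 84.60),
    ("2018 Census form (missing from individual form)", 1.00, 4.31),
    ("Admin data", 1.00, 10.95),
    ("Within household donor", 0.70, 0.03),
    ("Donor's 2018 Census form", 0.70, 0.10),
    ("Donor's 2018 Census (missing from individual form)", 0.70, 0.01),
    ("Donor's response sourced from admin data", 0.70, 0.0),
    ("Donor's response sourced from within household", 0.49, 0.0),
    ("No Information", 0.00, 0.00),
]

# Lower bound of each band on a 0-100 scale, highest band first.
BANDS = [(98.0, "very high"), (95.0, "high"), (90.0, "moderate"), (75.0, "poor")]


def coverage_score(sources):
    """Sum of rating x proportion over all sources, on a 0-1 scale."""
    return sum(rating * percent / 100.0 for _, rating, percent in sources)


def metric_rating(score):
    """Map a 0-1 score to a band, assuming bands apply to score x 100."""
    scaled = score * 100.0
    for lower, label in BANDS:
        if scaled >= lower:
            return label
    return "very poor"


score = coverage_score(SOURCES)
rating = metric_rating(score)
```

Under these assumptions the score is about 0.9996, which rounds to the 1.00 shown in the table and falls in the "very high" band.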

Data sources, editing, and imputation in the 2018 Census has more information on the Canadian census edit and imputation system (CANCEIS) that was used to derive donor responses.

Consistency and coherence: Very high quality

Sex data was assessed for consistency with expectations and time series for the usually resident subject population. Sex data is highly consistent with expectations across all consistency checks at the SA2 level of geography.

There was a minor deviation to the time series:

  • the ratio of males to females has shifted slightly, with a small increase in the percentage of males and a small decrease in the percentage of females compared with the time series. This may reflect both real-world changes, such as changing survivorship patterns and migration, and the use of administrative data and imputation to count those traditionally missed by the census.

Data quality: Very high quality

The data quality checks for the sex variable included edits for consistency within the dataset and cross-tabulations to the SA2 level of geography for the overall census night subject population.

Sex data has no data quality issues that have an observable effect on the data. The quality of coding is very high. Other data sources used do not create any quality impacts for this variable. Any issues with the variable appear in a very low number of cases (typically fewer than a hundred).

Edits were done when both male and female boxes were ticked (possible on paper forms only). If available, the sex of a respondent was determined from elsewhere on the individual or dwelling form (refer to the Question format section for which questions contain information on sex). If this information was not available, a sex was assigned at random, with 49 percent assigned as male and 51 percent as female.
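The edit for a multiple response can be sketched as below. This is an illustrative reading of the rule only: the function name, argument names, and the use of Python's `random.Random` are assumptions, and the actual edit system may have applied the 49/51 split differently.

```python
import random
from typing import Optional


def resolve_multiple_response(
    sex_elsewhere_on_forms: Optional[str],
    rng: random.Random,
    p_male: float = 0.49,
) -> str:
    """Resolve a paper form on which both male and female were ticked."""
    # Prefer sex information found elsewhere on the individual or dwelling form.
    if sex_elsewhere_on_forms is not None:
        return sex_elsewhere_on_forms
    # Otherwise assign at random: 49 percent male, 51 percent female.
    return "Male" if rng.random() < p_male else "Female"
```

Passing in a seeded `random.Random` instance keeps the random assignment reproducible between runs, which is a design choice of this sketch rather than something the source describes.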

Recommendations for use and further information

We recommend that this data can be used in a similar way to the 2013 data.

When using this data, you should be aware that:

  • sex data has been assessed to be consistent at the SA2 level of geography.

Comparisons with other data sources

Although surveys and sources other than the census collect sex data, data users are advised to familiarise themselves with the strengths and limitations of the sources before use.

Key considerations when comparing sex information from the 2018 Census with other sources include:

  • the census aims to be a national count of all individuals in the population, while surveys that measure this variable (such as the Household labour force survey and the General social survey) are based only on a sample of the population.

Contact our Information Centre for further information about using this variable.

Revision Information

Currently viewing revision 8 on 11/03/2020 4:19:27 a.m.

Revision 8 *
11/03/2020 11:39:05 p.m.
Revision 7
19/02/2020 2:59:11 a.m.
Revision 6
27/11/2019 8:24:13 p.m.
Revision 5
3/10/2019 2:16:37 a.m.
Revision 4
22/09/2019 9:53:26 p.m.
