Quality Statement
Study participation measures those attending, studying, or enrolled in tertiary institutions, school, early childhood education, or any other place of education or training. It is grouped into full-time study (20 hours or more a week), part-time study (less than 20 hours a week), and those not studying.
High quality
The Data quality processes section below has more detail on the rating.
Priority level 2
A priority level is assigned to all census concepts: priority 1, 2, or 3 (with 1 being highest and 3 being the lowest priority).
Study participation is a priority 2 concept. Priority 2 concepts cover key subject populations that are important for policy development, evaluation, or monitoring. These concepts are given second priority in terms of quality, time, and resources across all phases of a census.
The census priority level for study participation remains the same as 2018.
The 2023 Census: Final content report has more information on priority ratings for census concepts.
Census usually resident population count
‘Subject population’ means the people, families, households, or dwellings that the variable applies to.
For the 2023 and 2018 Censuses, the subject population is the census usually resident population count, which includes children aged 0 to 14 years. In the 2013 Census, the subject population for study participation was the census usually resident population aged 15 years and over. The subject population was changed so that the study participation question could serve as a filter for the travel to education question.
Study participation data is classified into the following categories:
Census Study Classification 2 V3.0.0 – level 1 of 2
Code | Category |
---|---|
01 | Full-time study |
02 | Part-time study |
04 | Not studying |
99 | Not elsewhere included |
Study participation uses a 2-level hierarchical classification with the level 1 categories presented in the table above. Follow the link above the table to examine the classification in more detail.
The level 1 residual category ‘Not elsewhere included’ contains the residual categories ‘Response unidentifiable’ and ‘Not stated’.
The 2023 Census classification for study participation is consistent with that used in 2018 Census.
Standards and classifications has information on what classifications are, how they are reviewed, where they are stored, and how to provide feedback on them.
Study participation is collected on the individual form (question 18 on the paper form).
There were differences in the way a person could respond between the modes of collection (online and paper forms).
On the online form:
- inconsistent responses could not be marked. A response is inconsistent when ‘no – neither of these’ is selected along with a response indicating that the respondent was studying either full-time or part-time. If the ‘no – neither’ box was marked, any other responses to study participation disappeared.
- built-in routing functionality on the online form directed individuals in the subject population to the appropriate question.
On the paper form:
- inconsistent multiple responses to this question were possible when forms were completed on paper. If both full-time and part-time study were marked, these were coded to full-time study. If ‘no – neither of these’ and either full-time or part-time study was marked, these responses were coded to the residual response ‘Response unidentifiable’.
- respondents were able to answer the question even if they were not within the subject population. However, these responses were resolved by edits.
Data from the online forms may therefore be of higher overall quality than data from paper forms. However, processing checks and edits were in place to improve the quality of the paper forms.
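The paper-form coding rules described above can be sketched as a small function. This is an illustrative sketch only, not Stats NZ's processing code. The function name `code_paper_response` and its boolean inputs are hypothetical. Coding a lone ‘no – neither of these’ mark to ‘Not studying’ and a blank question to ‘Not stated’ are assumptions. Only the two rules for multiple marks come from the text.

```python
# Illustrative sketch of the paper-form coding rules (hypothetical names).
FULL_TIME = "Full-time study"
PART_TIME = "Part-time study"
NOT_STUDYING = "Not studying"
RESPONSE_UNIDENTIFIABLE = "Response unidentifiable"
NOT_STATED = "Not stated"


def code_paper_response(full_time: bool, part_time: bool, neither: bool) -> str:
    """Return the category for the boxes marked on a paper form."""
    # Rule from the text: 'no - neither of these' marked together with
    # full-time or part-time study is coded as a residual response.
    if neither and (full_time or part_time):
        return RESPONSE_UNIDENTIFIABLE
    # Rule from the text: full-time and part-time both marked is coded
    # to full-time study. This branch also covers full-time alone.
    if full_time:
        return FULL_TIME
    if part_time:
        return PART_TIME
    # Assumption: a lone 'no - neither of these' means not studying.
    if neither:
        return NOT_STUDYING
    # Assumption: no box marked is treated as 'Not stated'.
    return NOT_STATED
```

For example, `code_paper_response(True, True, False)` returns the full-time category, while `code_paper_response(False, True, True)` returns the residual category. On the online form the second combination could not be entered at all.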
Stats NZ Store House has samples for both the individual and dwelling paper forms.
Data-use outside Stats NZ:
- to monitor trends in participation in study, and how this varies for different demographic and socio-economic groups
- to develop programmes and scholarships for groups who are under-represented in the education system
- to inform analysis of work and labour force status and rates of young people who are not in employment, education, or training (NEET).
Data-use by Stats NZ:
- to derive not in employment, education, or training (NEET) rates
- to understand the distinctive characteristics of the student population, including their migration patterns.
Alternative data sources were used for missing and residual census responses and responses that could not be classified or did not provide the type of information asked for. The table below shows the distribution of data sources for study participation data.
Data sources for study participation data, as a percentage of census usually resident population count, 2023 Census
Source of study participation data | Percent |
---|---|
2023 Census response | 84.4 |
Historical census | 0.0 |
Admin data | 13.8 |
Deterministic derivation | 0.0 |
Statistical imputation | 1.8 |
CANCEIS(1) donor's response sourced from 2023 Census form | 1.7 |
CANCEIS donor's response sourced from admin data | 0.1 |
No information | 0.0 |
Total | 100.0 |
1. CANCEIS = imputation based on CANadian Census Edit and Imputation System
Note: Due to rounding, individual figures may not always sum to the stated total(s) or score contributions.
Where appropriate, admin data from the Ministry of Education was used to replace missing or residual responses. Statistical imputation was used for any remaining responses coded to ‘Not stated’ or ‘Response unidentifiable’.
Editing, data sources, and imputation in the 2023 Census describes how data quality is improved by editing and how missing and residual responses are filled with alternative data sources (admin data and historical census responses) or statistical imputation. The paper also describes the use of CANCEIS (the CANadian Census Edit and Imputation System), which is used to perform imputation. This webpage also contains a spreadsheet that provides additional detail on the admin data sources.
Missing and residual responses represent data gaps where respondents either did not provide answers (missing responses) or provided answers that were not valid (residual responses).
Where possible, alternative data sources have been used to fill missing and residual responses in the 2018 and 2023 Censuses.
Percentage of ‘Not stated’ for the census usually resident population count (2023, 2018), and census usually resident population count aged 15 years and over (2013)
- 2023: 0.0 percent
- 2018: 0.0 percent
- 2013: 10.4 percent
There was no ‘Not elsewhere included’ category for this variable in 2013. In 2018 and 2023, admin data or statistical imputation were used to replace all residual responses resulting in a ‘Not elsewhere included’ percentage of zero.
Overall quality rating: High
Data has been evaluated to assess whether it meets quality standards and is suitable for use.
Three quality metrics contributed to the overall quality rating:
- data sources and coverage
- consistency and coherence
- accuracy of response.
The lowest rated metric determines the overall quality rating.
Data quality assurance in the 2023 Census provides more information on the quality rating scale.
Data sources and coverage: Very high quality
The quality of all the data sources that contribute to the output for the variable has been assessed. To calculate the data sources and coverage quality score for a variable, the rating of each data source was multiplied by the proportion it contributes to the total output, and the results were summed.
The rating for a valid census response is defined as 1.00. Ratings for other sources are the best estimates available of their quality relative to a census response. The total score then determines the metric rating according to the following ranges:
- 0.98-1.00 = very high
- 0.95-<0.98 = high
- 0.90-<0.95 = moderate
- 0.75-<0.90 = poor
- <0.75 = very poor.
The high proportion of data received from 2023 Census forms, alongside the high quality of alternative data sources, resulted in a score of 0.99, leading to the quality rating of very high.
Data sources and coverage rating calculation for study participation data, census usually resident population count, 2023 Census
Source of study participation data | Rating | Percent | Score contribution |
---|---|---|---|
2023 Census response | 1.00 | 84.42 | 0.84 |
Admin data | 0.95 | 13.78 | 0.13 |
CANCEIS(1) nearest neighbour imputation | 0.60 | 1.80 | 0.01 |
No information | 0.00 | 0.00 | 0.00 |
Total | | 100.00 | 0.99 |
1. CANCEIS = imputation based on CANadian Census Edit and Imputation System
Note: Due to rounding, individual figures may not always sum to stated total(s) or score contributions.
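The calculation in the table can be reproduced with a few lines of code. The sketch below is illustrative only: the ratings and percentages are copied from the table, while the function names `coverage_score` and `metric_rating` are hypothetical. Applying the rating bands to the unrounded score is an assumption, although here the rounded score gives the same rating.

```python
# Ratings and percentages copied from the table above.
sources = [
    ("2023 Census response", 1.00, 84.42),
    ("Admin data", 0.95, 13.78),
    ("CANCEIS nearest neighbour imputation", 0.60, 1.80),
    ("No information", 0.00, 0.00),
]


def coverage_score(rows):
    """Sum of each source's rating times its share of the total output."""
    return sum(rating * percent / 100 for _, rating, percent in rows)


def metric_rating(score):
    """Map a score to the rating bands listed in the text."""
    if score >= 0.98:
        return "very high"
    if score >= 0.95:
        return "high"
    if score >= 0.90:
        return "moderate"
    if score >= 0.75:
        return "poor"
    return "very poor"


score = coverage_score(sources)
print(round(score, 2), metric_rating(score))  # prints: 0.99 very high
```

The unrounded score is 0.8442 + 0.1309 + 0.0108 = 0.9859, which rounds to the published 0.99 and falls in the very high band.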
Consistency and coherence: High quality
Study participation data is consistent with expectations across nearly all consistency checks, with some minor variation from expectations or benchmarks which makes sense due to real-world change, incorporation of other sources of data, or a change in how the variable has been collected.
Accuracy of responses: Very high quality
Study participation data has no data quality issues that have an observable effect on the data. The quality of coding is very high. Any issues with the variable appear in a very low number of cases (typically less than a hundred).
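With all three metric ratings stated, the overall rating follows from the rule that the lowest rated metric determines it. A minimal sketch, assuming the five-point scale used for the data sources and coverage metric applies to all three metrics (the names `SCALE` and `metric_ratings` are hypothetical):

```python
# Rating scale ordered from lowest to highest (assumed to be shared by all metrics).
SCALE = ["very poor", "poor", "moderate", "high", "very high"]

# Metric ratings as stated in this section.
metric_ratings = {
    "data sources and coverage": "very high",
    "consistency and coherence": "high",
    "accuracy of response": "very high",
}

# The overall rating is the lowest of the metric ratings.
overall = min(metric_ratings.values(), key=SCALE.index)
print(overall)  # prints: high
```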
Study participation data can be used in a comparable manner to the 2013 and 2018 Censuses.
The methodology for the 2018 and 2023 Censuses is different from the 2013 Census. It is recommended that data users be aware of the following when making time series comparisons:
- The subject population for the 2018 and 2023 Censuses was the census usually resident population count. However, for the 2013 Census, the subject population was the census usually resident population count aged 15 years and over.
Data users should also be aware of the following:
- Study participation data for under 15-year-olds should be used with caution due to the undercount when compared with admin data, especially for early childhood (1 to 4-year-olds). For early childhood, this may be due to parents considering early childhood education as childcare rather than education. For children aged 5 to 14 years, it may be attributable to the parents’ interpretation of ‘attending’ in the question, which may be interrupted or irregular.
- Census does not distinguish between formal and informal study. An individual may consider themselves as ‘studying’ without a formal enrolment. For example, adults may consider themselves as studying when attending an informal adult community education class.
- Census study participation respondents aged 18 and over may include those undergoing on-the-job industry training, as well as those enrolled with industry training organisations.
- Census respondents enrolled in an educational institution, but who were not actively studying at the time of the census, may have indicated that they were not participating in study, whereas they would be included in admin data.
Comparisons to other data sources
Although surveys and sources other than the census collect study participation data, data users are advised to familiarise themselves with the strengths and limitations of the sources before use.
Key considerations when comparing study participation information from the 2023 Census with other sources include:
- Census is a key source of information on study participation for small areas and small populations. Many other sources do not provide detail at this level.
- Census aims to be a national count of all individuals in a population. Other sources measuring this variable, such as the Household Economic Survey (HES) and Household Labour Force Survey (HLFS), are based only on a sample of the population.
To assess how this concept aligns with the variables from the previous census, use the links:
- Study participation – 2018 Information by variable
- Study participation – 2013 Information by variable
Contact our Information centre for further information about using this concept.