Variable Description
Qualification
A qualification is a formally recognised award for educational or training attainment. Formal recognition means that the qualification is approved by the New Zealand Qualifications Authority or any formally recognised existing approval body in New Zealand or overseas, or their predecessors or any previous approval body.
A qualification is defined as requiring full-time equivalent study of three months or more. Study time is an estimate of the typical time it takes a learner to achieve the learning outcomes of the qualification. This includes direct contact time with teachers and trainers, as well as time spent in studying, assignments, and assessment.
Highest qualification
Highest qualification is derived for people aged 15 years and over and combines highest secondary school qualification and post-school qualification to obtain a single highest qualification by category of attainment.
Priority level
Priority level 2
We assign a priority level to all census variables: Priority 1, 2, or 3 (with 1 being highest and 3 being the lowest priority).
Highest qualification is a priority 2 variable. Priority 2 variables cover key subject populations that are important for policy development, evaluation, or monitoring. These variables are given second priority in terms of quality, time, and resources across all phases of a census.
The census priority level for highest qualification remains the same as 2013.
Quality Management Strategy and the Information by variable for qualifications (2013) have more information on the priority rating.
Overall quality rating for 2018 Census
Moderate quality
The Data quality processes section below has more detail on the rating for this variable.
The External Data Quality Panel has provided an independent assessment of the quality of this variable and has rated it as moderate/poor quality. 2018 Census External Data Quality Panel: Assessment of Variables has more information.
Subject population
Census usually resident population aged 15 years and over
‘Subject population’ means the people, families, households, or dwellings to whom the variable applies.
How this data is classified
Census highest qualification output 2V2.0.0
Highest qualification is a flat classification derived from secondary school qualification and post-school qualification level, and includes the following categories:
000 No Qualification
001 Level 1 Certificate
002 Level 2 Certificate
003 Level 3 Certificate
004 Level 4 Certificate
005 Level 5 Diploma
006 Level 6 Diploma
007 Bachelor Degree and Level 7 Qualification
008 Post-graduate and Honours Degrees
009 Masters Degree
010 Doctorate Degree
011 Overseas Secondary School Qualification
999 Not elsewhere included
‘Not elsewhere included’ contains the residual categories, including ‘response unidentifiable’ and ‘not stated’.
The classification of highest qualification in the 2018 Census is consistent with the classification used in the 2013 and 2006 Censuses.
The Standards and Classifications page provides background information on classifications and standards.
Question format
Highest qualification is derived from secondary school qualification and post-school qualification on the individual form. Secondary school qualification comes from question 29 on the paper form. Post-school qualification comes from questions 30, 31, 32, and 33 on the paper form.
Stats NZ Store House has samples for both the individual and dwelling paper forms.
There has been a minor change to this variable. The 2018 post-school qualification level (question 31) used as part of the derivation of highest qualification has changed since the 2013 Census:
- in 2018, post-school qualification level had a check list of NZQA levels of qualification and a free text ‘other qualification’ field. In 2013, respondents were asked to write the level of their qualification in a free text box.
There were differences in the way a person could respond to the qualification questions between the modes of collection (paper and online form).
On the online individual form:
- the qualification questions had as-you-type functionality which helped respondents provide valid responses in text fields
- built-in routing functionality directed individuals to the appropriate questions. Those under 15 and overseas visitors could not answer the qualifications questions.
On the paper individual form:
- responses outside the valid range were possible
- multiple responses to single answer questions were possible. Edits were applied to reduce multiple or inconsistent responses to a single valid response where possible. These were prioritised by the highest qualification stated. If a single, valid response was unable to be determined, these responses were coded to ‘response unidentifiable’.
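The paper-form edit described above can be illustrated with a minimal sketch. It assumes the check list holds NZQA levels 1 to 10 as integers, and the function name and the simple "take the highest valid level" rule are illustrative only. The actual edit rules are not given in this document and are likely to be more detailed. Non-response is handled separately and is outside this sketch.

```python
# Assumption: the check list offers NZQA levels 1 to 10, stored as integers.
VALID_LEVELS = range(1, 11)
UNIDENTIFIABLE = "response unidentifiable"


def resolve_paper_response(ticked):
    """Reduce the marks on one paper form to a single response.

    ticked: list of levels marked by the respondent (hypothetical input format).
    Returns the highest valid level, or 'response unidentifiable' when no
    valid level can be determined.
    """
    if not ticked:
        # Non-response ('not stated') is not part of this edit.
        raise ValueError("no response given; non-response is handled separately")
    valid = [level for level in ticked if level in VALID_LEVELS]
    if not valid:
        return UNIDENTIFIABLE
    # Prioritise by the highest qualification stated.
    return max(valid)


print(resolve_paper_response([4, 7]))   # two boxes ticked
print(resolve_paper_response([0, 12]))  # only out-of-range marks
```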
How this data is used
Outside Stats NZ
- To measure the impact of educational reforms on skill levels.
- By the Ministry of Education in determining decile rankings for schools receiving government funding.
- To identify potential skill gaps in the labour market and plan education and training programmes.
- To identify mismatches in the economy between people’s skills and occupations.
- To track long-term changes in the levels of qualification in the general population.
- Contributes to the measurement and analysis of human capital.
- Used with other variables to derive indices of socio-economic status and deprivation.
Within Stats NZ
- To examine the link between education and income, occupation, sex and various other census variables.
- Labour Market and Household Statistics use this data in both reference and analytical reports on various topics.
- Highest qualification data is used in analysing the different characteristics of those employed in the public and private sectors, along with sex, age, status in employment, industry, occupation, income, full-time/part-time status, hours of work, country of birth, and region.
2018 data sources
We used alternative data sources for missing census responses and responses that could not be classified or did not provide the type of information asked for. Where possible, we used responses from the 2013 Census, administrative data from the Integrated Data Infrastructure (IDI), or imputation.
The table below shows the breakdown of the various data sources used for this variable. As highest qualification is derived from two input variables (highest secondary school qualification and post-school qualification level), the 2013 Census data / Administrative data* row of the table below shows the percent of highest qualification data where either one or both of the input variables were sourced from these alternative sources.
2018 Highest qualification – census usually resident population aged 15 and over

Source | Percent |
---|---|
Response from 2018 Census | 80.4 percent |
2013 Census data / Administrative data* | 14.9 percent |
Statistical imputation | 0.0 percent |
No information | 4.7 percent |
Total | 100 percent |

Due to rounding, individual figures may not always sum to the stated total(s).
The ‘no information’ percentage is where we were not able to source relevant qualification data for a person in the subject population.
Administrative data sources
Data from the following administrative source was used:
- information on course completions, TEC IT learners, targeted training, student qualifications, qualification enrolments, and courses - Ministry of Education.
Please note that when examining highest qualification data for specific population groups within the subject population, the percentage that is from 2013 Census data and administrative data may differ from that for the overall subject population.
Missing and residual responses
‘No information’ in the data sources table above is the percentage of the subject population coded to ‘not stated’. In previous censuses, non-response was the percentage of the subject population coded to ‘not stated.’
In 2018, the percentage of ‘not stated’ is lower than previous censuses due to the use of the additional data sources described above.
Percentage of ‘not stated’ for the census usually resident population aged 15 years and over:
- 2018: 4.7 percent
- 2013: 7.2 percent
- 2006: 6.0 percent.
Responses that could not be classified or did not provide the type of information asked for (response unidentifiable) remain in the data, where we have been unable to find information from another source. In the table above, these residuals are included in the ‘Response from 2018 Census’ percentage.
For output purposes, these residual category responses are grouped with ‘not stated’ and are classified as ‘Not elsewhere included’.
Percentage of ‘not elsewhere included’ for the census usually resident population aged 15 years and over:
- 2018: 6.5 percent
- 2013: 11.1 percent
- 2006: 10.4 percent.
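Because ‘not elsewhere included’ groups ‘not stated’ with the other residual categories, the share held by those other residuals (such as ‘response unidentifiable’) can be approximated by subtraction. The sketch below uses only the rounded percentages listed above, so the results are approximate. The variable and function names are illustrative.

```python
# Rounded percentages copied from the two lists above.
not_stated = {2018: 4.7, 2013: 7.2, 2006: 6.0}
not_elsewhere_included = {2018: 6.5, 2013: 11.1, 2006: 10.4}


def other_residual_share(year):
    """Approximate percent in residual categories other than 'not stated'."""
    return round(not_elsewhere_included[year] - not_stated[year], 1)


for year in sorted(not_stated):
    print(year, other_residual_share(year))
```

On these rounded figures, the other residual categories fall from about 4.4 percent in 2006 and 3.9 percent in 2013 to about 1.8 percent in 2018.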
2013 Census data user guide provides more information about non-response in the 2013 Census.
Data quality processes
Overall quality rating: Moderate quality
Data was evaluated to assess whether it meets quality standards and is suitable for use.
Three quality metrics contributed to the overall quality rating:
- data sources and coverage
- consistency and coherence
- data quality.
The lowest rated metric determines the overall quality rating.
Data quality assurance for 2018 Census provides more information on the quality rating scale.
Data sources and coverage: Moderate quality
We have assessed the quality of all the data sources that contribute to the output for the variable. To calculate a data sources and coverage quality score for a variable, each data source is rated and multiplied by the proportion it contributes to the total output.
The rating for a valid census response is defined as 1.00. Ratings for other sources are the best estimates available of their quality relative to a census response. The weighted contributions from all sources are then summed. The total score, expressed on a scale of 0 to 100, determines the metric rating according to the following range:
- 98–100 = very high
- 95–<98 = high
- 90–<95 = moderate
- 75–<90 = poor
- <75 = very poor.
If a highest qualification response is sourced from the 2018 Census for both input variables, it is included in the 2018 Census form row in the table below. If one or both of the input variables are sourced from either the 2013 Census or admin data, the response is included in the 2013 Census / Admin data percentage and assigned a rating of 0.83. The proportion of data from alternative sources, along with the percentage of data remaining as ‘no information’, contributed to the score of 0.93, determining the moderate quality rating.
Quality rating calculation table for the sources of highest qualification data – 2018 Census usually resident population aged 15 years and over

Source | Rating | Percent of total | Score contribution |
---|---|---|---|
2018 Census form | 1.00 | 80.40 | 0.80 |
2013 Census / Admin data | 0.83 | 14.86 | 0.12 |
No information | 0.00 | 4.74 | 0.00 |
Total | | 100.00 | 0.93 |

Due to rounding, individual figures may not always sum to the stated total(s) or score contributions.
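The calculation in the table can be reproduced with a short sketch. The ratings, percentages, and rating bands are taken from this section. The function names are illustrative and are not part of any Stats NZ tool.

```python
# Source, rating, and percent of total, copied from the table above.
sources = [
    ("2018 Census form", 1.00, 80.40),
    ("2013 Census / Admin data", 0.83, 14.86),
    ("No information", 0.00, 4.74),
]


def coverage_score(rows):
    """Weighted score on a 0-100 scale: sum of rating x percent of total."""
    return sum(rating * percent for _, rating, percent in rows)


def metric_rating(score):
    """Map a 0-100 score to the rating bands listed in this section."""
    if score >= 98:
        return "very high"
    if score >= 95:
        return "high"
    if score >= 90:
        return "moderate"
    if score >= 75:
        return "poor"
    return "very poor"


score = coverage_score(sources)
print(round(score / 100, 2), metric_rating(score))
```

The unrounded total is about 92.7, which rounds to the 0.93 shown in the table and falls in the 90–<95 (moderate) band.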
Consistency and coherence: High quality
Highest qualification in the 2018 Census is highly comparable with the 2013 and 2006 Census data.
Data is consistent with expectations across nearly all consistency checks, with some minor variation from expectations or benchmarks that makes sense due to real-world change, incorporation of other sources of data, or a change in how the variable has been collected.
For more information on the consistency and coherence of the individual input variables please see Highest secondary school and Post school qualification level.
Data quality: High quality
The data quality checks for highest qualification included edits for consistency within the dataset and cross-tabulations at the national level of geography.
Data has only minor data quality issues. The quality of coding and responses within classification categories is high. Any impact of other data sources used is minor. Any issues with the variable appear in a low number of cases (typically in the low hundreds).
For more information on the data quality of the individual input variables, please see Highest secondary school and Post school qualification level.
Recommendations for use and further information
Administrative data and 2013 Census data have been used to produce the 2018 Census data. The overall quality of the data is moderate, and it can be compared with 2006 and 2013 data with caution.
When using this data you should be aware that:
- the 2013 question on highest qualification was a written-in response and some generic responses like 'Diploma' or 'Certificate' could not be coded easily. Directly comparing 2013 Census Level 5 and 6 highest qualification data with other census years is not recommended. The Level 5 and 6 categories should be aggregated prior to any comparison.
- data has been checked to a regional council level. Some variation is possible at geographies below this level
- due to data quality issues, caution is recommended when cross-tabulating highest qualification with age, particularly for the 15–24 age group as this group has a higher proportion of admin data
- the change in question format from text to tick box for the post-school qualification input variable has resulted in an increase in certain categories (such as post-graduate certificates and diplomas). This may mean these categories are not comparable with 2013 and 2006.
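The recommendation to aggregate the Level 5 and Level 6 categories before comparing census years can be sketched as follows. The category codes come from the classification above. The combined label `005-006`, the function name, and the counts are made up for illustration and are not census figures.

```python
from collections import Counter


def combine_level_5_6(counts):
    """Merge categories 005 and 006 into one group before comparison.

    counts: mapping of classification code to count (hypothetical input).
    """
    combined = Counter()
    for code, n in counts.items():
        key = "005-006" if code in ("005", "006") else code
        combined[key] += n
    return dict(combined)


# Made-up counts, for illustration only.
example_counts = {"004": 100, "005": 40, "006": 60, "007": 200}
print(combine_level_5_6(example_counts))
```

Applying the same aggregation to each census year keeps the comparison independent of how generic responses such as 'Diploma' were coded between Level 5 and Level 6.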
For further data quality issues and recommendations for the input variables, refer to Highest secondary school and Post school qualification level.
Comparisons with other data sources
Although surveys and sources other than the census collect qualification data, data users are advised to familiarise themselves with the strengths and limitations of the sources before use.
Key considerations when comparing highest qualification information from the 2018 Census with other sources include:
- census is a key source of information on qualifications for small areas and small populations. Many other sources do not provide detail at this level.
- census aims to be a national count of all individuals in a population, while other sources that measure this variable, such as the Household Economic Survey (HES), Household Labour Force Survey (HLFS), and General Social Survey (GSS), are based only on a sample of the population.
Contact our Information Centre for further information about using this variable.