Years since arrival in New Zealand (information about this variable and its quality)

Variable Description

Name
Years since arrival in New Zealand (information about this variable and its quality)
Label
Years since arrival in New Zealand (information about this variable and its quality)
Description

Years since arrival in New Zealand is the number of completed years up to census night, since a person born overseas first arrived in New Zealand to live, irrespective of any intervening absences, whether temporary or long term.

Other Variable Information

The Years since arrival in New Zealand variable has changed from moderate quality to high quality

The consistency and coherence quality rating for Years since arrival in New Zealand has been changed from moderate to high quality. This has resulted in an overall quality rating increase from moderate to high quality for the Years since arrival in New Zealand variable. The Data quality processes section (consistency and coherence subsection) has more information.

Related variables:

  • Month of arrival
  • Year of arrival

Priority level

Priority level 3

We assign a priority level to all census variables: Priority 1, 2, or 3 (with 1 being the highest and 3 the lowest priority).

Years since arrival is a priority 3 variable. Priority 3 variables do not fit in directly with the main purpose of a census but are still important to certain groups. These variables are given third priority in terms of quality, time, and resources across all phases of a census.

The census priority level for years since arrival remains the same as in 2013.

Quality Management Strategy and the Information by variable for years since arrival (2013) have more information on the priority rating.

Overall quality rating for 2018 Census

High quality

The Data quality processes section below has more detail on the rating for this variable.

The External Data Quality Panel has provided an independent assessment of the quality of this variable and has rated it as moderate quality. 2018 Census External Data Quality Panel: Assessment of Variables has more information.

Subject population

Overseas-born census usually resident population

‘Subject population’ means the people, families, households, or dwellings to whom the variable applies.

How this data is classified

Census years since arrival in New Zealand V1.0.0

Years since arrival is a flat classification with single year categories:

00 Less than one year

01 1 year

02 2 years

::

96 96 years

97 97 years or more

99 Not elsewhere included

‘Not elsewhere included’ contains the residual categories, including ‘response unidentifiable’ and ‘not stated’.
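The flat classification above can be sketched as a small lookup. This is only an illustration: the function names `classification_code` and `classification_label` are hypothetical, and treating a missing value (`None`) as code 99 is an assumption about how residual responses would be represented.

```python
def classification_code(years):
    """Return the two-digit classification code for completed years.

    Assumption: None stands for a residual response ('not stated' or
    'response unidentifiable') and maps to code 99.
    """
    if years is None:
        return "99"
    if years < 0:
        raise ValueError("years since arrival cannot be negative")
    # 97 is the top category: '97 years or more'.
    return f"{min(years, 97):02d}"


def classification_label(code):
    """Return the category label for a two-digit classification code."""
    valid_codes = {f"{n:02d}" for n in range(98)} | {"99"}
    if code not in valid_codes:
        raise ValueError(f"unknown classification code: {code!r}")
    if code == "00":
        return "Less than one year"
    if code == "01":
        return "1 year"
    if code == "97":
        return "97 years or more"
    if code == "99":
        return "Not elsewhere included"
    return f"{int(code)} years"
```

For example, `classification_label(classification_code(2))` gives `"2 years"`.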

For many outputs the variable is categorised as single years until 4 years, then mixed groups up to ‘20 years or more’:

Less than one year

1 year

2 years

3 years

4 years

5–9 years

10–19 years

20 years or more

The residual categories are grouped together as ‘not elsewhere included’.
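The output grouping can be expressed as a simple mapping from completed years to the eight output categories plus the residual group. This is a minimal sketch: `output_group` is a hypothetical name, and using `None` for residual responses is an assumption.

```python
def output_group(years):
    """Map completed years since arrival to the grouped output category.

    Assumption: None stands for any residual response.
    """
    if years is None:
        return "Not elsewhere included"
    if years < 0:
        raise ValueError("years since arrival cannot be negative")
    if years == 0:
        return "Less than one year"
    if years == 1:
        return "1 year"
    if years <= 4:
        return f"{years} years"
    if years <= 9:
        return "5–9 years"
    if years <= 19:
        return "10–19 years"
    return "20 years or more"
```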

The Standards and Classifications page provides background information on classifications and standards.

Question format

Years since arrival is derived from month of arrival and year of arrival on the individual form (question 9 on the paper form).
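As an illustration of this derivation, the sketch below counts completed years from a month and year of arrival up to census night (6 March 2018 for the 2018 Census). The function name and the month-level rule are assumptions: because no day of arrival is collected, the sketch treats a year as completed once the census month reaches the arrival month. The actual Stats NZ derivation rules may differ.

```python
from datetime import date

# 2018 Census night.
CENSUS_NIGHT = date(2018, 3, 6)


def years_since_arrival(arrival_year, arrival_month, census_night=CENSUS_NIGHT):
    """Completed years between month/year of arrival and census night.

    Assumption: with no day of arrival available, a year counts as
    completed once the census month reaches the arrival month.
    """
    if not 1 <= arrival_month <= 12:
        raise ValueError("arrival_month must be between 1 and 12")
    years = census_night.year - arrival_year
    if census_night.month < arrival_month:
        # The anniversary month has not yet been reached this year.
        years -= 1
    if years < 0:
        raise ValueError("arrival is after census night")
    return years
```

Under these assumptions, a person who arrived in April 2013 has 4 completed years at the 2018 Census, while a person who arrived in March 2013 has 5.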

Stats NZ Store House has samples for both the individual and dwelling paper forms.

There were differences in question format and wording between the modes of collection (online and paper forms):

  • the online form was worded succinctly for those who had been routed to the question after indicating they were born overseas
  • the paper form had extra wording to give context about who should answer this question, and its design was unchanged from the 2013 Census.

There were also differences in the way a person could respond.

On the online individual form:

  • built-in routing functionality directed individuals to the appropriate questions. If a respondent indicated they were born overseas, they were directed to the year of arrival question.

On the paper individual form:

  • responses outside the valid range were possible. These were coded to ‘response unidentifiable’.
  • those born in New Zealand could answer the year and month of arrival questions. These responses are filtered out by using the correct subject population of overseas-born usual residents.

How this data is used

Outside Stats NZ

  • Used together with birthplace to develop, monitor and evaluate settlement programmes for immigrants.
  • Used by government departments, regional and local bodies, special interest groups, and researchers to provide information on immigration trends.

Within Stats NZ

  • To capture changes in the social and economic status of immigrants.
  • For use in ethnic group profiles and cross-tabulated with birthplace.

2018 data sources

We used alternative data sources for missing census responses and responses that could not be classified or did not provide the type of information asked for. Where possible, we used responses from the 2013 Census, administrative data from the Integrated Data Infrastructure (IDI), or imputation. The table below shows the breakdown of the various data sources used for this variable.

2018 years since arrival in New Zealand – overseas-born census usually resident population

Source                       Percent
Response from 2018 Census    83.9 percent
2013 Census data             7.7 percent
Administrative data          7.1 percent
Statistical imputation       0.0 percent
No information               1.3 percent
Total                        100 percent

Due to rounding, individual figures may not always sum to the stated total(s).

The ‘no information’ percentage is where we were not able to source years since arrival data for a person in the subject population.

Administrative data sources

Data from the following administrative source was used:

  • Migration data, Ministry of Business, Innovation and Employment.

Note: administrative data only contains years since arrival data for the last 18 years. Addition of administrative records to the New Zealand 2018 Census Dataset: An overview of statistical methods and Potential for admin data to provide country of birth and years since arrival in New Zealand have more information.

Please note that when examining years since arrival data for specific population groups within the subject population, the percentage that is from 2013 Census data and administrative data may differ from that for the overall subject population.

Missing and residual responses

‘No information’ in the data sources table above is the percentage of the subject population coded to ‘not stated’. In previous censuses, non-response was the percentage of the subject population coded to ‘not stated’.

In 2018, the percentage of ‘not stated’ is lower than in previous censuses due to the use of the additional data sources described above.

Percentage of ‘not stated’ for the overseas-born census usually resident population:

  • 2018: 1.3 percent
  • 2013: 3.6 percent
  • 2006: 3.8 percent.

Responses that could not be classified or did not provide the type of information asked for, such as ‘response unidentifiable’, remain in the data where we have been unable to find information from another source. In the data sources table, these residuals are included in the ‘Response from 2018 Census’ percentage.

For output purposes, these residual category responses are grouped with ‘not stated’ and are classified as ‘not elsewhere included’.

Percentage of ‘not elsewhere included’ for the overseas-born census usually resident population:

  • 2018: 1.4 percent
  • 2013: 3.9 percent
  • 2006: 4.3 percent.

2013 Census data user guide provides more information about non-response in the 2013 Census.

Data quality processes

Overall quality rating: High quality

Data was evaluated to assess whether it meets quality standards and is suitable for use.

Three quality metrics contributed to the overall quality rating:

  • data sources and coverage
  • consistency and coherence
  • data quality.

The lowest rated metric determines the overall quality rating.
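The 'lowest rated metric' rule can be sketched as taking the minimum over an ordered rating scale. The names below are hypothetical, and the five-level scale is assumed to be the one used for the metric ratings in this document (very poor to very high).

```python
# Rating scale ordered from lowest to highest (assumed five-level scale).
RATING_ORDER = ["very poor", "poor", "moderate", "high", "very high"]


def overall_rating(metric_ratings):
    """Return the lowest rating among the metric ratings."""
    return min(metric_ratings.values(), key=RATING_ORDER.index)


# Ratings for years since arrival as stated in this document.
ratings_2018 = {
    "data sources and coverage": "high",
    "consistency and coherence": "high",
    "data quality": "high",
}
```

With all three metrics rated high, `overall_rating(ratings_2018)` is `"high"`. If consistency and coherence were rated moderate, as it was before the rating change, the overall rating would drop to `"moderate"`.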

Data quality assurance for 2018 Census provides more information on the quality rating scale.

Data sources and coverage: High quality

We have assessed the quality of all the data sources that contribute to the output for the variable. To calculate a data sources and coverage quality score for a variable, each data source is rated and multiplied by the proportion it contributes to the total output.

The rating for a valid census response is defined as 1.00. Ratings for other sources are the best available estimates of their quality relative to a census response. The contributions from all sources are summed to give the total score, which then determines the metric rating according to the following ranges:

  • 98–100 = very high
  • 95–<98 = high
  • 90–<95 = moderate
  • 75–<90 = poor
  • <75 = very poor.
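The ranges above can be written as a threshold lookup. This is a sketch only: `metric_rating` is a hypothetical name, and it assumes the ranges are on a 0 to 100 scale, so a total score reported as a proportion (for example 0.96) would be multiplied by 100 before the lookup.

```python
def metric_rating(score):
    """Map a total score on a 0-100 scale to a metric rating band."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 98:
        return "very high"
    if score >= 95:
        return "high"
    if score >= 90:
        return "moderate"
    if score >= 75:
        return "poor"
    return "very poor"
```

Under that assumption, `metric_rating(0.96 * 100)` returns `"high"`.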

Analysis of 2013 Census responses showed the data was highly comparable to 2018 Census responses, while admin data was moderately comparable to 2018 Census responses. The high proportion of responses from 2018 individual forms and 2013 Census forms, compared to the low proportion of missing responses, contributed to the score of 0.96 (96 on the scale above), which determines the high quality rating.

Quality rating calculation table for the sources of years since arrival data – 2018 overseas-born census usually resident population

Source              Rating    Percent of total    Score contribution
2018 Census form    1.00      83.94               0.84
2013 Census         0.92      7.70                0.07
Admin data          0.70      7.06                0.05
No information      0.00      1.29                0.00
Total                         100.00              0.96

Due to rounding, individual figures may not always sum to the stated total(s) or score contributions.
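The score in the table can be reproduced by multiplying each source's rating by its share of the total and summing the results. The sketch below uses the figures from the table; the variable and function names are illustrative only.

```python
# (source, rating, percent of total) taken from the table above.
SOURCES = [
    ("2018 Census form", 1.00, 83.94),
    ("2013 Census", 0.92, 7.70),
    ("Admin data", 0.70, 7.06),
    ("No information", 0.00, 1.29),
]


def score_contributions(sources):
    """Rating multiplied by the proportion each source contributes."""
    return {name: rating * percent / 100 for name, rating, percent in sources}


def total_score(sources):
    """Sum of the score contributions across all sources."""
    return sum(score_contributions(sources).values())


for name, contribution in score_contributions(SOURCES).items():
    print(f"{name}: {contribution:.2f}")
print(f"Total: {total_score(SOURCES):.2f}")
```

Rounded to two decimal places, the contributions are 0.84, 0.07, 0.05, and 0.00, and the total is 0.96, matching the table.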

Quality issue to note:

  • administrative data was only used for arrivals up to and including 18 years ago.

Consistency and coherence: High quality

Years since arrival data is consistent with expectations across nearly all consistency checks, with some minor variation from expectations or benchmarks that makes sense due to real-world change, incorporation of other sources of data, or a change in how the variable has been collected.

Quality issue to note:

  • comparisons with outcomes-based migration data show a clear consistency between census and migration years since arrival data. Although the migration data values are higher than the census values, the census data follows the migration patterns very closely.

Data quality: High quality

The data quality checks for years since arrival included edits for consistency within the dataset and cross-tabulations to the regional council level.

Years since arrival data has only minor data quality issues. The quality of coding and responses within classification categories is high. Any impact of other data sources used is minor. Any issues with the variable appear in a low number of cases (typically in the low hundreds).

A small number of consistency edits were needed, and cross-tabulations with other variables showed no significant issues.

Recommendations for use and further information

While 2013 Census and administrative data have been used to produce the 2018 Census data, the overall quality of the data is high and comparable with 2006 and 2013 data. We recommend using this data in a similar way to the 2013 data.

However, when using this data you should be aware that:

  • data has been assessed to be consistent at the regional council level of geography. Some variation is possible at geographies below this level.

Comparisons with other data sources

Although there are surveys and sources other than the census that collect years since arrival data, data users are advised to familiarise themselves with the strengths and limitations of the sources before use.

Key considerations when comparing years since arrival information from the 2018 Census with other sources include:

  • census is a key source of information on years since arrival for small areas and small populations. Many other sources do not provide detail at this level.
  • census aims to be a national count of all individuals in a population, while other sources that measure this variable, such as the Household Labour Force Survey (HLFS) and General Social Survey (GSS), are based only on a sample of the population.

Contact our Information Centre for further information about using this variable.

This variable is not part of a dataset.

Representation

Aggregation Method
Unspecified
Temporal
False
Geographic
False

History

Revision Date Responsibility Rationale
14 30/11/2021 2:59:21 PM