## Data Collection

## Methodology

General methodology is outlined in the National Population Projections data collection.

**Main changes to assumptions since the previous 2014-base projections**

Deriving the projections involved a review of all projection assumptions. The main changes from the previous 2014-base projections relate to the base population, migration, and fertility assumptions.

The **base population** at 30 June 2016 of 4.693 million is 38,000 (0.8 percent) higher than the 2014-base median projection for 2016. This is mainly because observed net migration for the two years ended 30 June 2016 (127,000) was 40,000 higher than the assumed median net migration (87,000) in the 2014-base projections.

The median **annual net migration gain** is assumed to be 15,000 in the long term, an increase from the long-term level of 12,000 assumed in the 2014-base projections. In the short term, the median net migration assumptions are also higher: 60,000 in the June year 2017, decreasing by 9,000 a year to 15,000 in 2022. Simulations of net migration are produced using an ARIMA(1,0,1) model, the same model used in the 2011-base projections, rather than the ARIMA(0,1,2) model used in the 2014-base projections.

The median **period total fertility rate** (TFR) is assumed to be 1.85 births per woman in the long term, a slight decrease from the long-term level of 1.90 births per woman assumed in the 2014-base projections. The change reflects recent decreases in fertility, to a TFR of 1.90 births per woman in the year ended June 2016.

**Stochastic (probabilistic) population projections**

Stochastic (probabilistic) population projections are produced to give estimates of uncertainty, although these estimates are themselves uncertain. The stochastic population projections are produced by combining 2,000 simulations of the assumptions. These simulations can be summarised by percentiles, which indicate the probability that the actual value is lower than the percentile. For example, the 25th percentile indicates an estimated 25 percent chance that the actual value will be lower, and a 75 percent chance that the actual value will be higher, than this percentile.

Nine alternative percentiles of the probability distribution (2.5th, 5th, 10th, 25th, 50th, 75th, 90th, 95th, and 97.5th percentiles) are available for the 2016-base projections.
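As an illustration of how a set of simulations can be summarised by these nine percentiles, the following is a minimal sketch only. The simulated totals are hypothetical numbers generated for the example, and the linear interpolation rule between ordered values is an assumption, not necessarily the rule used to produce the published percentiles.

```python
import random

# The nine percentiles available for the 2016-base projections.
PERCENTILES = [2.5, 5, 10, 25, 50, 75, 90, 95, 97.5]

def summarise(simulations, percentiles=PERCENTILES):
    """Return {percentile: value}, interpolating linearly between ordered values."""
    ordered = sorted(simulations)
    n = len(ordered)
    summary = {}
    for p in percentiles:
        pos = (n - 1) * p / 100          # position on a 0..n-1 scale
        lower = int(pos)
        upper = min(lower + 1, n - 1)
        frac = pos - lower
        summary[p] = ordered[lower] + frac * (ordered[upper] - ordered[lower])
    return summary

# Hypothetical input: 2,000 simulated population totals (millions) for one year.
rng = random.Random(1)
simulated_totals = [rng.gauss(5.5, 0.2) for _ in range(2000)]
summary = summarise(simulated_totals)
```

Here `summary[25]` is the value that about a quarter of the simulated totals fall below, and `summary[50]` is the median of the simulations.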

At the time of release, the median projection (50th percentile) indicates an estimated 50 percent chance that the actual value will be lower, and a 50 percent chance that the actual value will be higher, than this percentile.

**The median projection for 2016-base projections assumes:**

- the total fertility rate declines to 1.85 births per woman in 2036 and beyond
- period life expectancy at birth increases to 89.1 years for males and 91.3 years for females in 2068
- annual net migration of 60,000 in 2017, decreasing by 9,000 annually to 15,000 in 2022 and beyond.
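The median net migration path can be written out year by year. This is a small sketch of the arithmetic only; the function and parameter names are hypothetical.

```python
def net_migration_path(start_year, start_level, annual_step, floor_level, end_year):
    """Step down from start_level by annual_step each year, never below floor_level."""
    path = {}
    level = start_level
    for year in range(start_year, end_year + 1):
        path[year] = level
        level = max(level - annual_step, floor_level)
    return path

# Median assumption: 60,000 in 2017, decreasing by 9,000 a year to 15,000 in 2022 and beyond.
median_migration = net_migration_path(2017, 60_000, 9_000, 15_000, 2025)
for year, level in median_migration.items():
    print(year, level)
```

The path passes through 51,000, 42,000, 33,000, and 24,000 before reaching the long-term level of 15,000 in 2022.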

**'What if?' scenarios**

Five 'what if?' scenarios have been produced to illustrate what happens when different specific levels of fertility, mortality, and migration assumptions are combined.

**Very high fertility** assumes:

- the total fertility rate increases to 2.5 births per woman in 2036 and beyond
- period life expectancy at birth increases to 89.1 years for males and 91.3 years for females in 2068
- annual net migration of 60,000 in 2017, decreasing by 9,000 annually to 15,000 in 2022 and beyond.

**Very low mortality** assumes:

- the total fertility rate declines to 1.85 births per woman in 2036 and beyond
- period life expectancy at birth increases to 96.0 years for both males and females in 2068
- annual net migration of 60,000 in 2017, decreasing by 9,000 annually to 15,000 in 2022 and beyond.

**No migration** assumes:

- the total fertility rate declines to 1.85 births per woman in 2036 and beyond
- period life expectancy at birth increases to 89.1 years for males and 91.3 years for females in 2068
- no external migration from 2017 onwards (ie a 'closed' population).

**Cyclic migration** assumes:

- the total fertility rate declines to 1.85 births per woman in 2036 and beyond
- period life expectancy at birth increases to 89.1 years for males and 91.3 years for females in 2068
- annual net migration of 60,000 in 2017, decreasing by 9,000 annually to 15,000 in 2022; net migration then fluctuates between -5,000 and 45,000 over a 10-year cycle. The net migration gain between 2016 and years ending in 2 and 8 (eg 2022, 2028) is the same as the median assumption.

**Very high migration** assumes:

- the total fertility rate declines to 1.85 births per woman in 2036 and beyond
- period life expectancy at birth increases to 89.1 years for males and 91.3 years for females in 2068
- annual net migration of 60,000 in 2017, decreasing by 9,000 annually to 33,000 in 2020, and 30,000 in 2021 and beyond.

**Projection assumptions**

Projection assumptions are formulated after analysing short-term and long-term historical trends, recent trends and patterns observed in other countries, and government policy.

**Base population**

These projections have as a base the provisional estimated resident population (ERP) of New Zealand at 30 June 2016. This population (4.693 million) was derived from the ERP at 30 June 2013 (4.442 million), updated for births, deaths, and net migration between 30 June 2013 and 30 June 2016 (+251,000). The ERP at 30 June 2013 was derived from the census usually resident population count at 5 March 2013 (4.242 million) with adjustments for:

- net census undercount (+104,000)
- residents temporarily overseas on census night (+82,000)
- births, deaths, and net migration between census night and 30 June 2013 (+9,000)
- reconciliation with demographic estimates at ages 0–9 years (+5,000).
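The adjustments above can be checked against the stated totals. The sketch below uses only the rounded figures quoted in the text, so it reconciles the rounded values rather than the underlying unrounded estimates; the variable names are illustrative.

```python
# Census usually resident population count at 5 March 2013 (rounded, from the text).
census_count_2013 = 4_242_000

# Adjustments used to derive the ERP at 30 June 2013 (rounded, from the text).
adjustments = {
    "net census undercount": 104_000,
    "residents temporarily overseas on census night": 82_000,
    "births, deaths, and net migration to 30 June 2013": 9_000,
    "reconciliation at ages 0-9 years": 5_000,
}

erp_2013 = census_count_2013 + sum(adjustments.values())

# Births, deaths, and net migration between 30 June 2013 and 30 June 2016.
erp_2016 = erp_2013 + 251_000

print(erp_2013)  # 4442000, ie 4.442 million
print(erp_2016)  # 4693000, ie 4.693 million
```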

The ERP is the best available measure of the number of people usually living in New Zealand. However, for projection purposes, some uncertainty in the base population has been assumed. This uncertainty is assumed to vary by age and sex, and arises from two broad sources.

- Census enumeration and processing. Coverage errors may arise from non-enumeration and mis-enumeration (eg residents counted as visitors from overseas, and vice versa), either because of deliberate or inadvertent respondent or collector error. Errors may also arise during census processing (eg scanning, numeric and character recognition, imputation, coding, editing, creation of substitute forms).
- Adjustments in deriving population estimates. This includes the adjustments applied in deriving the ERP at 30 June of the census year (eg net census undercount). It also includes uncertainty associated with the post-censal components of population change (eg estimates of births occurring in each time period based on birth registrations; changes in classification of external migrants between 'permanent and long-term' and 'short-term').

Simulations of the base population are produced by drawing a random number from a normal distribution with a mean of zero. For each simulation, a random number is multiplied by the assumed standard error for each age-sex group, then added to the base ERP for that group.
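The simulation step can be sketched as follows. Several points here are assumptions made for illustration: the normal distribution is taken to have unit variance, a single draw is shared across all age-sex groups within one simulation (one possible reading of the description), and the ERP values and standard errors are hypothetical.

```python
import random

# Hypothetical base ERP and standard errors for three age-sex groups (sex, age).
base_erp = {("male", 0): 31_000, ("female", 0): 29_500, ("male", 1): 31_500}
standard_error = {("male", 0): 400, ("female", 0): 380, ("male", 1): 350}

def simulate_base_population(base_erp, standard_error, rng):
    """One simulated base population: ERP plus a scaled standard normal draw."""
    z = rng.gauss(0.0, 1.0)  # assumption: mean zero, standard deviation one
    return {group: base_erp[group] + z * standard_error[group] for group in base_erp}

rng = random.Random(2016)
simulations = [simulate_base_population(base_erp, standard_error, rng)
               for _ in range(2000)]
```

Because the draw has a mean of zero, the simulated values for each group are centred on the base ERP, with a spread set by that group's assumed standard error.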

**Fertility**

Fertility rates are assumed to vary throughout the projection period. The median period TFR declines gradually from 1.90 births per woman in 2016 to 1.87 in 2025, and to 1.85 in 2036 and beyond.

- The period TFR decreased from 1.99 in 2014 to 1.90 in 2016.
- In the 40 years from 1977 to 2016, the period TFR was generally in the range of 1.9–2.2 births per woman.
- The cohort TFR indicates a progressive decline in completed family size. Women born in the early 1970s averaged 2.2 births each, compared with 2.5 for those born in the early 1950s.
- Census data (1981, 1996, 2006, 2013) on the number of children ever born also indicate progressive declines in completed family size and progressive increases in childlessness.
- Internationally, TFRs are generally declining, or are already lower than in New Zealand. New Zealand's TFR is one of the highest among Organisation for Economic Co-operation and Development countries.

Age-specific fertility rates (ASFRs) are assumed to vary throughout the projection period. The median ASFRs decline for women aged under 35 years, and increase for women aged 35 years and over.

Simulations of TFR are produced using a simple random walk with drift model. Random errors are sampled from a normal distribution with a mean of zero and a standard deviation of 0.0553. The standard deviation is derived by fitting an autoregressive integrated moving average (ARIMA) model of order (0,1,0) to annual TFR for June years 1977–2016. The drift function shifts the median of the TFR simulations to follow the assumed median TFR. Median ASFRs are scaled to sum to the simulated TFR.
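A random walk with drift of this kind can be sketched as below. The standard deviation (0.0553) and the median TFR values for 2016, 2025, and 2036 come from the text; the linear interpolation between those years, the function names, and the ASFR values are assumptions for illustration.

```python
import random

SD = 0.0553  # standard deviation of the annual random errors, from the text

def median_tfr(year):
    """Assumed median TFR; linear interpolation between stated years is an assumption."""
    points = [(2016, 1.90), (2025, 1.87), (2036, 1.85)]
    if year >= 2036:
        return 1.85
    for (y0, v0), (y1, v1) in zip(points, points[1:]):
        if y0 <= year <= y1:
            return v0 + (v1 - v0) * (year - y0) / (y1 - y0)
    raise ValueError("year must be 2016 or later")

def simulate_tfr(last_year, rng):
    """One simulated TFR path: previous value + drift + normal error."""
    path = {2016: median_tfr(2016)}
    for year in range(2017, last_year + 1):
        drift = median_tfr(year) - median_tfr(year - 1)  # keeps the median on track
        path[year] = path[year - 1] + drift + rng.gauss(0.0, SD)
    return path

def scale_asfrs(median_asfrs, simulated_tfr):
    """Scale median ASFRs proportionally so they sum to the simulated TFR."""
    factor = simulated_tfr / sum(median_asfrs.values())
    return {age: rate * factor for age, rate in median_asfrs.items()}

rng = random.Random(42)
paths = [simulate_tfr(2036, rng) for _ in range(2000)]

# Hypothetical median ASFRs (births per woman) by broad age group.
example_asfrs = {"15-24": 0.35, "25-34": 1.05, "35-49": 0.45}
scaled = scale_asfrs(example_asfrs, paths[0][2036])
```

Because the errors accumulate from year to year, the spread of the simulated paths widens over the projection period, while their median stays close to the assumed median TFR.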

Simulations of the sex ratio at birth for each year are produced by drawing a random number sampled from a normal distribution with a mean of 105.5 males per 100 females and a standard deviation of 1.0. The mean and standard deviation are calculated from historical data for December years 1900–2015.
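A simulated sex ratio can be converted to the share of births that are male, which is then used to split total births. This is a sketch under stated assumptions: the mean and standard deviation come from the text, while the function names and the number of births are hypothetical.

```python
import random

def male_share(ratio):
    """Convert a sex ratio (males per 100 females) to the male share of births."""
    return ratio / (ratio + 100.0)

def simulate_sex_ratio(rng, mean_ratio=105.5, sd_ratio=1.0):
    """Draw one year's sex ratio at birth from a normal distribution."""
    return rng.gauss(mean_ratio, sd_ratio)

rng = random.Random(7)
ratio = simulate_sex_ratio(rng)

births = 60_000  # hypothetical number of births in one year
male_births = births * male_share(ratio)
female_births = births - male_births
```

At the mean ratio of 105.5 males per 100 females, about 51.3 percent of births are male.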

**Mortality**

Mortality/survival assumptions are formulated using death registrations, period and cohort mortality rates, and international comparisons. Death rates are assumed to vary throughout the projection period, with the assumptions driven by trends in age-sex death rates. Life expectancy assumptions are not explicitly formulated but are derived from the assumed death rates.
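To illustrate how a period life expectancy at birth can be derived from a set of age-specific death rates, the sketch below uses a standard textbook life table calculation. It is not necessarily the calculation used for these projections: it assumes single-year ages, deaths occurring on average halfway through each year of age, and an open-ended final age group, and the death rates in the example are hypothetical.

```python
def life_expectancy_at_birth(death_rates):
    """Period life expectancy at birth from death rates for ages 0, 1, ..., last.

    The last rate applies to an open-ended final age group.
    """
    survivors = 1.0       # proportion of a birth cohort still alive
    person_years = 0.0    # total years lived by the cohort
    for m in death_rates[:-1]:
        q = m / (1.0 + 0.5 * m)            # probability of dying within the year
        deaths = survivors * q
        person_years += survivors - 0.5 * deaths
        survivors -= deaths
    # Open-ended group: remaining years lived are survivors divided by the rate.
    person_years += survivors / death_rates[-1]
    return person_years

# Hypothetical check: a constant death rate of 0.02 at every age gives 1 / 0.02 = 50 years.
flat = life_expectancy_at_birth([0.02] * 101)
```

Lower assumed death rates give a higher derived life expectancy, which is why the life expectancy figures follow from the death rate assumptions rather than being set directly.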

Male and female age-specific death rate assumptions are formulated using a coherent functional demographic method (FDM) developed by Hyndman, Booth, and Yasmeen (Coherent mortality forecasting: the product-ratio method with functional time series models, 2012). This method builds on the FDM of Hyndman and Ullah (Robust forecasting of mortality and fertility rates: A functional data approach, 2007), which is itself an extension of the Lee-Carter method widely used in mortality forecasting. Research by these authors and by Booth, Hyndman, Tickle, and de Jong (Lee-Carter mortality forecasting: a multi-country comparison of variants and extensions, 2006) indicates that FDM forecasts are more accurate than the original Lee-Carter method and at least as accurate as several other Lee-Carter variants. The advantage of the coherent FDM is that it ensures male and female assumptions do not diverge over time.

The coherent FDM uses smoothed historical data to fit the model, which is then forecast using ARIMA and autoregressive fractionally integrated moving average (ARFIMA) time-series models. The historical data is derived from Statistics NZ's cohort mortality series, transposed to give period death rates for each age for June years 1977–2015. Simulations of death rates are produced using an ARIMA(0,2,2) model to give plausible uncertainty bounds.

The median assumption has a male period life expectancy at birth of 85.6 years in 2043 and 89.1 years in 2068. The corresponding female period life expectancy at birth is 88.5 years in 2043 and 91.3 years in 2068.

The median assumption has a male cohort life expectancy at birth of 78.9 years for those born in 1956 and 90.7 years for those born in 2016. The corresponding female cohort life expectancy at birth is 83.5 years for those born in 1956 and 92.9 years for those born in 2016.

Despite differences in methods, the New Zealand life expectancy assumptions are broadly consistent with those in other countries.

Although mortality reductions are expected to continue in the future, the extent of the trends is uncertain and depends on many factors:

- changes in population composition and different trends in population subgroups (including ethnic groups)
- changes in biomedical technology, regenerative medicine, and preventative methods including monitoring, treatment, and early intervention
- changes in health care systems including effectiveness of public health
- changes in behaviour and lifestyle (eg smoking, exercise, diet)
- changes in infectious diseases and resistance to antibiotics
- environmental change, disasters, and wars.

**Migration**

Migration assumptions are formulated using international travel and migration data (including arrivals and departures by country of citizenship and age), immigration applications and approvals, census data on people born overseas (including years since arrival in New Zealand), and consideration of immigration policies (in New Zealand and other countries).

Migration is assumed to vary throughout the projection period. The median net migration (arrivals less departures) is 60,000 in 2017 and decreases by 9,000 annually to 15,000 in 2022 and beyond. The assumed long-run median annual net migration of 15,000 reflects the average annual gain of 10,000–20,000 since the late 1980s. The short-term net migration levels are broadly in line with recent forecasts available from Treasury, Reserve Bank of New Zealand, and the Ministry of Business, Innovation and Employment.

Net migration is assumed to gradually decrease from a June year record of just over 69,000 in 2016, due to a combination of factors:

- more New Zealand citizens departing to Australia and fewer returning from Australia, as economic conditions in Australia gradually improve
- fewer arrivals of non-New Zealand citizens, as immigration approvals ease
- more departures of non-New Zealand citizens, reflecting those who have been in New Zealand on short-term/temporary student and work visas.

Net migration by age-sex reflects recent observed trends, with the largest movements at ages 15–38 years.

Future migration trends are uncertain and depend on a range of factors in source and destination countries:

- changes in immigration policy (in New Zealand and other countries)
- changes in the main motives for migration (eg work, family reunification, education, asylum, retirement)
- changes in migration pressure in source countries (eg population growth, economic growth)
- changes in the attractiveness of New Zealand as a place to live (eg work opportunities, economic conditions, wages relative to costs and other countries, settlement and integration practices)
- costs of migration, including cost of travel and existence of networks and pathways that facilitate migration
- environmental change, disasters, and wars.

Simulations of net migration are produced using an ARIMA(1,0,1) with drift model. The autoregressive and moving average parameters are derived by fitting an ARIMA(1,0,1) model to annual 'permanent and long-term' migration for June years 1988–2016. The drift function shifts the median of the net migration simulations to follow the assumed median net migration. Net migration by age-sex is interpolated between a high and low pattern, to sum to the simulated net migration level.
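The sketch below shows one way an ARIMA(1,0,1) process can be simulated around the assumed median and then distributed by age-sex. The fitted parameters are not given in the text, so `phi`, `theta`, and `sigma` are hypothetical values; the age-sex patterns are also hypothetical, and linear interpolation between the low and high patterns is an assumption.

```python
import random

def median_net_migration(year):
    """Median assumption: 60,000 in 2017, down 9,000 a year to 15,000 from 2022."""
    return max(60_000 - 9_000 * (year - 2017), 15_000)

def simulate_net_migration(last_year, rng, phi=0.6, theta=0.3, sigma=12_000):
    """One simulated path: median plus an ARMA(1,1) deviation (hypothetical parameters)."""
    deviation, prev_error = 0.0, 0.0
    path = {}
    for year in range(2017, last_year + 1):
        error = rng.gauss(0.0, sigma)
        deviation = phi * deviation + error + theta * prev_error
        prev_error = error
        path[year] = median_net_migration(year) + deviation
    return path

def interpolate_age_sex(low_pattern, high_pattern, target_total):
    """Interpolate linearly between two patterns so the result sums to target_total."""
    low_total = sum(low_pattern.values())
    high_total = sum(high_pattern.values())
    weight = (target_total - low_total) / (high_total - low_total)
    return {group: low_pattern[group] + weight * (high_pattern[group] - low_pattern[group])
            for group in low_pattern}

rng = random.Random(2016)
paths = [simulate_net_migration(2030, rng) for _ in range(2000)]

# Hypothetical low and high net migration patterns by broad age group.
low = {"0-14": 0, "15-39": -8_000, "40+": -2_000}
high = {"0-14": 6_000, "15-39": 28_000, "40+": 6_000}
by_age = interpolate_age_sex(low, high, paths[0][2030])
```

Because the deviations have a median of zero, the median of the simulated paths follows the assumed median net migration, which is the role of the drift function described above.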

**Accuracy of projections**

The accuracy of these projections is unknown at the time of release. While the assumptions are formulated from an assessment of short-term and long-term demographic trends, there is no certainty that any of the assumptions will be realised. The projections do not take into account non-demographic factors (eg war, catastrophes, major government and business decisions) which may invalidate the projections.

See *How accurate are population estimates and projections? An evaluation of Statistics New Zealand population estimates and projections, 1996–2013* for an evaluation of previous Statistics NZ national and subnational population estimates and projections.
