Variable Description
Cigarette smoking refers to the active smoking of one or more manufactured or hand-rolled tobacco cigarettes, from purchased or home-grown tobacco, per day, by a person aged 15 years and over.
The term 'smoking' refers to active smoking behaviour, that is, the intentional inhalation of tobacco smoke. Smoking does not refer to, or include, passive smoking (the unintentional inhalation of tobacco smoke).
Cigarette smoking does not include:
- the smoking of tobacco in cigars, pipes, and cigarillos
- the smoking of e-cigarettes
- the smoking of any other substances such as herbal cigarettes or marijuana
- the consumption of tobacco products by other means, such as chewing.
The input variables that derive cigarette smoking behaviour are:
- regular smoker indicator
- ever smoked indicator.
Priority level
Priority level 3
We assign a priority level to all census variables: Priority 1, 2, or 3 (with 1 being highest and 3 being the lowest priority).
Cigarette smoking behaviour is a priority 3 variable. Priority 3 variables do not fit in directly with the main purpose of a census but are still important to certain groups. These variables are given third priority in terms of quality, time, and resources across all phases of a census.
The Census priority level for cigarette smoking behaviour remains the same as 2013.
Quality Management Strategy and the Information by variable for cigarette smoking behaviour (2013) have more information on the priority rating.
Overall quality rating for 2018 Census
Moderate quality
Data quality processes section below has more detail on the rating for this variable.
The External Data Quality Panel has provided an independent assessment of the quality of this variable and has rated it as moderate/poor quality. 2018 Census External Data Quality Panel: Assessment of Variables has more information.
Subject population
Census usually resident population aged 15 years and over
The subject population means the people, families, households, or dwellings to whom the variable applies.
How this data is classified
Cigarette smoking behaviour classification V3.0.0
Cigarette smoking behaviour is a flat classification with the following categories.
01 Regular smoker
02 Ex-smoker
03 Never smoked regularly
99 Not elsewhere included
‘Not elsewhere included’ contains the residual categories of ‘response unidentifiable’ and ‘not stated’.
The classification of cigarette smoking behaviour in the 2018 Census is consistent with the classification used in the 2013 and 2006 Censuses.
The Standards and Classifications page provides background information on classifications and standards.
Question format
Cigarette smoking behaviour is derived from regular smoker indicator and ever smoked indicator on the individual form (questions 24 and 25 on the paper form).
Stats NZ Store House has samples for both the individual and dwelling paper forms.
There were differences between the modes of collection (paper and online forms) in the way a person could respond.
On the online individual form:
- only single yes or no responses were possible
- if a respondent indicated they were a current smoker, they were not able to answer the ever smoked question
- built-in routing functionality directed individuals in the subject population to the appropriate questions.
On the paper individual and dwelling forms:
- multiple responses were possible. If a valid response was unable to be determined, a response was imputed.
- respondents who indicated they were a current smoker were able to answer the ever smoked question. No change was made to these as they were filtered out by using the correct subject population.
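The derivation described in this section can be sketched in code. This is a minimal illustration only, not the production logic used for the census: the function name `derive_smoking_behaviour`, the "yes"/"no" input coding, and the exact mapping of indicator combinations to categories are assumptions inferred from the category labels. Only the category codes themselves come from the classification above.

```python
from typing import Optional

# Category codes from the cigarette smoking behaviour classification.
REGULAR_SMOKER = "01"
EX_SMOKER = "02"
NEVER_SMOKED_REGULARLY = "03"
NOT_ELSEWHERE_INCLUDED = "99"


def derive_smoking_behaviour(regular_smoker: Optional[str],
                             ever_smoked: Optional[str]) -> str:
    """Derive the output category from the two input indicators.

    Inputs are assumed (hypothetically) to be "yes", "no", or None
    when no usable value is available.
    """
    if regular_smoker == "yes":
        # The ever smoked indicator applies only to people who are not
        # regular smokers, so a paper-form answer to it is ignored here.
        return REGULAR_SMOKER
    if regular_smoker == "no":
        if ever_smoked == "yes":
            return EX_SMOKER
        if ever_smoked == "no":
            return NEVER_SMOKED_REGULARLY
    # Anything that cannot be resolved falls into the residual category.
    return NOT_ELSEWHERE_INCLUDED
```

Under these assumptions, a paper-form respondent who ticked "yes" to both questions is still classified as 01 Regular smoker, which mirrors the filtering by subject population described above.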
How this data is used
Outside Stats NZ
- Cigarette smoking behaviour is used to monitor changes in smoking prevalence among the population of New Zealand.
- To understand the profile of smokers in order to develop health education programmes that target at-risk groups in the community.
- To target and evaluate the success of health education programmes that monitor changes in smoking prevalence among high-risk groups in New Zealand.
- For examining the inter-relationship between smoking and other socioeconomic variables and how these change over time.
Within Stats NZ
- Cigarette smoking is included in analyses alongside variables such as ethnic group, iwi, age, sex, personal income, highest qualification, and work and labour force status.
2018 data sources
We used alternative data sources for missing census responses and responses that could not be classified or did not provide the type of information asked for. Where possible, we used responses from the 2013 Census, administrative data from the Integrated Data Infrastructure (IDI), or imputation. The tables below show the breakdown of the various data sources used for the two input variables, regular smoker indicator and ever smoked indicator.
2018 Regular smoker indicator – census usually resident population aged 15 years and over

Source | Percent |
---|---|
Response from 2018 Census | 84.0 percent |
2013 Census data | 7.8 percent |
Administrative data | 0.0 percent |
Statistical imputation | 8.1 percent |
No information | 0.0 percent |
Total | 100 percent |

Due to rounding, individual figures may not always sum to the stated total(s).
2018 Ever smoked indicator – census usually resident population aged 15 years and over who are not regular smokers

Source | Percent |
---|---|
Response from 2018 Census | 85.1 percent |
2013 Census data | 6.8 percent |
Administrative data | 0.0 percent |
Statistical imputation | 8.1 percent |
No information | 0.0 percent |
Total | 100 percent |

Due to rounding, individual figures may not always sum to the stated total(s).
The ‘no information’ percentage is where we were not able to source cigarette smoking behaviour data for a person in the subject population.
Please note that when examining cigarette smoking behaviour data for specific population groups within the subject population, the percentage that is from 2013 Census data and statistical imputation may differ from that for the overall subject population.
Missing and residual responses
‘No information’ in the data sources table is the percentage of the subject population coded to ‘not stated’. In previous censuses, non-response was the percentage of the subject population coded to ‘not stated.’ In 2018, the percentage of ‘not stated’ is zero due to the use of the additional data sources described above.
Percentage of ‘not stated’ for the census usually resident population aged 15 years and over:
- 2018: 0.0 percent
- 2013: 6.7 percent
- 2006: 5.2 percent.
In 2018, there were no residual responses remaining in the data due to the use of 2013 Census data and imputation to replace these responses. In output for previous censuses, responses that could not be classified or did not provide the type of information asked for were grouped with ‘not stated’ and classified as ‘not elsewhere included’.
Percentage of ‘not elsewhere included’ for the census usually resident population aged 15 years and over:
- 2018: 0.0 percent
- 2013: 9.2 percent
- 2006: 8.6 percent.
2013 Census data user guide provides more information about non-response in the 2013 Census.
Data quality processes
Overall quality rating: Moderate quality
Data was evaluated to assess whether it meets quality standards and is suitable for use.
Three quality metrics contributed to the overall quality rating:
- data sources and coverage
- consistency and coherence
- data quality.
The lowest rated metric determines the overall quality rating.
Data quality assurance for 2018 Census provides more information on the quality rating scale.
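The "lowest rated metric" rule can be illustrated with a short sketch. It assumes that all three metrics share the five-level scale listed for data sources and coverage in this section (very poor to very high); the names `RATING_ORDER` and `overall_rating` are hypothetical and used only for illustration.

```python
# Rating labels ordered from lowest to highest quality (assumed scale).
RATING_ORDER = ["very poor", "poor", "moderate", "high", "very high"]


def overall_rating(metric_ratings: dict) -> str:
    """Return the lowest rating found among the contributing metrics."""
    return min(metric_ratings.values(), key=RATING_ORDER.index)


# Metric ratings reported for cigarette smoking behaviour in this section.
smoking_metrics = {
    "data sources and coverage": "high",
    "consistency and coherence": "moderate",
    "data quality": "moderate",
}

print(overall_rating(smoking_metrics))  # moderate
```

With one high and two moderate metrics, the overall result is moderate, matching the overall quality rating stated for this variable.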
Data sources and coverage: High quality
We have assessed the quality of all the data sources that contribute to the output for the variable. To calculate a data sources and coverage quality score for a variable, each data source is rated and multiplied by the proportion it contributes to the total output.
The rating for a valid census response is defined as 1.00. Ratings for other sources are the best estimates available of their quality relative to a census response. The contributions from all sources are summed to give a total score, which then determines the metric rating according to the following range:
- 98–100 = very high
- 95–<98 = high
- 90–<95 = moderate
- 75–<90 = poor
- <75 = very poor.
Regular smoker indicator
2013 Census data was highly comparable to 2018 Census responses while data sourced through statistical imputation was moderately comparable to census forms. The high proportion of data from received forms and 2013 Census responses in comparison to the low proportion sourced from statistical imputation contributed to the score of 0.97, determining the high quality rating.
Quality rating calculation table for the sources of regular smoker indicator data – census usually resident population aged 15 years and over

Source | Rating | Percent of total | Score contribution |
---|---|---|---|
2018 Census form | 1.00 | 84.02 | 0.84 |
2013 Census | 0.93 | 7.81 | 0.07 |
Imputation | | | |
Donor’s 2018 Census form | 0.70 | 8.15 | 0.06 |
Donor’s response sourced from 2013 Census | 0.65 | 0.02 | 0.00 |
No information | 0.00 | 0.00 | 0.00 |
Total | | 100.00 | 0.97 |

Due to rounding, individual figures may not always sum to the stated total(s) or score contributions.
Ever smoked indicator
2013 Census data was highly comparable to 2018 Census responses while data sourced through statistical imputation was moderately comparable to census forms. The high proportion of data from received forms and 2013 Census responses in comparison to the low proportion sourced from statistical imputation contributed to the score of 0.97, determining the high quality rating.
Quality rating calculation table for the sources of ever smoked indicator data – census usually resident population aged 15 years and over who are not regular smokers

Source | Rating | Percent of total | Score contribution |
---|---|---|---|
2018 Census form | 1.00 | 85.09 | 0.85 |
2013 Census | 0.93 | 6.83 | 0.06 |
Imputation | | | |
Donor’s 2018 Census form | 0.70 | 8.03 | 0.06 |
Donor’s response sourced from 2013 Census | 0.65 | 0.05 | 0.00 |
No information | 0.00 | 0.00 | 0.00 |
Total | | 100.00 | 0.97 |

Due to rounding, individual figures may not always sum to the stated total(s) or score contributions.
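As a worked check of the two tables above, the sketch below recomputes each score as the sum of rating multiplied by percent of total, then looks up the rating band. It assumes the published score of 0.97 corresponds to 97 on the 0–100 range used for the rating bands; the function and variable names are hypothetical.

```python
# (rating, percent of total) pairs copied from the two tables above.
REGULAR_SMOKER_SOURCES = [
    (1.00, 84.02),  # 2018 Census form
    (0.93, 7.81),   # 2013 Census
    (0.70, 8.15),   # imputation, donor's 2018 Census form
    (0.65, 0.02),   # imputation, donor's response sourced from 2013 Census
    (0.00, 0.00),   # no information
]
EVER_SMOKED_SOURCES = [
    (1.00, 85.09),
    (0.93, 6.83),
    (0.70, 8.03),
    (0.65, 0.05),
    (0.00, 0.00),
]


def coverage_score(sources):
    """Sum of rating x percent of total, on an assumed 0-100 scale."""
    return sum(rating * percent for rating, percent in sources)


def metric_rating(score):
    """Map a 0-100 score to the rating bands listed in this section."""
    if score >= 98:
        return "very high"
    if score >= 95:
        return "high"
    if score >= 90:
        return "moderate"
    if score >= 75:
        return "poor"
    return "very poor"


for name, sources in [("regular smoker indicator", REGULAR_SMOKER_SOURCES),
                      ("ever smoked indicator", EVER_SMOKED_SOURCES)]:
    score = coverage_score(sources)
    print(f"{name}: {score:.2f} -> {metric_rating(score)}")
# regular smoker indicator: 97.00 -> high
# ever smoked indicator: 97.10 -> high
```

Both indicators fall in the 95–<98 band, consistent with the high quality rating reported for data sources and coverage.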
Data sources, editing, and imputation in the 2018 Census has more information on the Canadian census edit and imputation system (CANCEIS) that was used to derive donor responses.
Consistency and coherence: Moderate quality
Cigarette smoking data is mostly consistent with expectations across consistency checks. There is an overall difference in the data compared with expectations and benchmarks that can be explained through a combination of real-world change and the incorporation of other sources of data.
Quality issues to note:
- cigarette smoking behaviour is consistent with expectations at a national level of geography. There is some variation in trends for ‘regular smokers’ at regional council level of geography which may be due to different proportions of alternative data sources used across different geographies.
- cigarette smoking behaviour in the 2018 Census is moderately comparable with the 2013 and 2006 Census data at the national level of geography
- while 2018 Census smoking data is largely comparable with expectations, the introduction of imputation and historical data for this variable has impacted on the nature of the output population for this variable, compared with previous censuses. This change will need to be factored into comparison with previous census data, and other sources of smoking data.
Data quality: Moderate quality
The data quality checks for cigarette smoking behaviour included edits for consistency within the dataset and cross-tabulations at the national level of geography.
Moderate quality data has various data quality issues involving several categories or aspects of the data, or an entire level of a hierarchical classification. The data quality issues could include problems with the classification or coding of data, such as vague responses resulting in coding issues, or responses that cannot be coded to a specific (non-residual) category, thereby reducing the amount of useful, meaningful data available for analysis. The use of other data sources may be contributing to these issues.
Data quality issues to note:
- the use of 2013 Census data has limitations for a variable like cigarette smoking that may change over time. People who indicated they were regular smokers in 2013 may have since stopped smoking, and those who indicated they had never smoked or were ex-smokers may have become regular smokers by 2018.
- analysis indicated that the use of 2013 Census data has introduced some bias towards ‘regular smokers’ into the dataset, but that this bias is small
- there are higher proportions of 2013 data and imputation for certain ethnicities such as Māori, Pacific Peoples, and the level 1 MELAA ethnicity category.
Recommendations for use and further information
While new imputation methods and 2013 Census data have been used to produce the 2018 Census data, the overall quality of the data is moderate and comparable with 2006 and 2013 data.
However, when using this data you should be aware that:
- due to the use of administrative enumeration to replace missing responses in the 2018 Census, the ethnicity counts in the 2018 Census are generally higher than in the 2013 Census, including for the Māori ethnic group. As a result of this change in methodology for the 2018 Census, cigarette smoking time series data for the Māori ethnic group should be interpreted with care.
- data has been assessed to be consistent at the national level of geography. Some variation is expected at geographies below this level due to both real-world change and the use of 2013 data and imputation.
- the use of 2013 Census data to replace missing responses has introduced a small amount of bias towards ‘regular smokers’ into the dataset.
- the 2018 Census smoking data is largely comparable with expectations. The introduction of imputation and 2013 Census data, and its effect on the nature of the output population, will need to be factored into comparisons with previous census data and other sources of smoking data.
Comparisons with other data sources
Although surveys and sources other than the census collect cigarette smoking behaviour data, data users are advised to familiarise themselves with the strengths and limitations of the sources before use.
Key considerations when comparing cigarette smoking behaviour from the 2018 Census with other sources include:
- census is a key source of information on cigarette smoking behaviour for small areas and small populations. Many other sources do not provide detail at this level.
- census aims to be a national count of all individuals in a population, while other sources measuring this variable, such as the New Zealand Health Survey and the New Zealand Tobacco Use Survey, are based only on a sample of the population.
Contact our Information Centre for further information about using this variable.