Date Published: March 14, 2019
Publisher: Public Library of Science
Author(s): Jong-Min Kim, Namgil Lee, Xingyao Xiao, Baogui Xin.
Air pollution is well known as a major risk to public health, causing various diseases including pulmonary and cardiovascular diseases. As social concern increases, the amount of air pollution data is increasing rapidly. The purpose of this study is to statistically characterize dependence between major cities in China based on a measure of directional dependence estimated from PM2.5 measurements. As a measure of the directional dependence, we propose the so-called copula directional dependence (CDD) using beta regression models. An advantage of the CDD is that it does not rely on strict assumptions of specific probability distributions or linearity. We used hourly PM2.5 measurement data collected in four major cities in China (Beijing, Chengdu, Guangzhou, and Shanghai) from 2013 to 2017. After accounting for autocorrelation in the PM2.5 time series via nonlinear autoregressive models, CDDs between the four cities were estimated to produce directed network structures of statistical dependence. In addition, a statistical method was proposed to test the directionality of dependence between each pair of cities. From the PM2.5 data, we found that Chengdu and Guangzhou are the most closely related cities and that the directionality between them changed once between 2013 and 2017, which implies a major economic or environmental change in these Chinese regions.
Recently, air pollution has become a significant environmental and social problem in China. PM2.5 refers to atmospheric fine particulate matter (PM) whose diameter is ≤ 2.5 μm. Exposure to PM2.5 is associated with increased mortality rates caused by lung cancer and cardiopulmonary diseases [1–4]. It is generally accepted that PM2.5 is more harmful to human health than PM with diameter > 2.5 μm and ≤ 10 μm (PM10) [5, 6]. Sources of particulate matter include residential wood burning, coal-fired thermal power generation, agricultural burning, diesel fuel combustion, and natural/industrial dust. PM2.5 can also be generated indirectly when gases and particles interact in the air. China’s air pollution situation is so extreme that ambient air pollution ranked the second highest and household air pollution ranked the third highest in the G20 in terms of disability-adjusted life-years (DALYs) in 2010. It has been estimated that air pollution in China contributed to 1.6 million deaths/year in 2014. In addition to the harmful effects of air pollutants on human health and mortality, pollutants can also have diverse effects on climate and weather due to their complex composition and sources [9–12].
A copula is a multivariate joint distribution function with uniform marginal distributions, and it provides an effective approach for describing statistical dependencies within a set of non-normal random variables [52, 53]. Statistical dependence structures can be modeled by choosing a copula function independently of the marginal distributions of each random variable. Since neither a normal distribution assumption nor a linearity assumption is required, copulas can be applied to a wide variety of statistical dependence modeling tasks.
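The separation between dependence structure and marginal distributions can be illustrated with a minimal Python sketch. It is not the authors' code (their implementation is in R), and the function name `pseudo_observations`, the simulated data, and the correlation value 0.7 are assumptions chosen only for illustration. The sketch maps each sample to the unit interval through scaled ranks and shows that strictly increasing transformations of the marginals leave these pseudo-observations, and hence any rank-based dependence summary, unchanged.

```python
import numpy as np

def pseudo_observations(x):
    """Map a sample to (0, 1) via scaled ranks: rank / (n + 1)."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x)) + 1  # ranks 1..n (assumes no ties)
    return ranks / (len(x) + 1)

# Simulated correlated pair (illustrative data, not PM2.5 measurements).
rng = np.random.default_rng(0)
n = 2000
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.7], [0.7, 1.0]], size=n)

# Change both marginals with strictly increasing transformations.
x = np.exp(z[:, 0])   # log-normal marginal
y = z[:, 1] ** 3      # heavy-tailed marginal

# Pseudo-observations before and after the marginal transformations.
u0, v0 = pseudo_observations(z[:, 0]), pseudo_observations(z[:, 1])
u, v = pseudo_observations(x), pseudo_observations(y)

# Correlation of the pseudo-observations (Spearman-type rank correlation).
rank_corr_original = np.corrcoef(u0, v0)[0, 1]
rank_corr_transformed = np.corrcoef(u, v)[0, 1]

print("rank correlation, original marginals:   ", rank_corr_original)
print("rank correlation, transformed marginals:", rank_corr_transformed)
```

Because only the ranks enter the pseudo-observations, the two printed values coincide even though the marginal distributions differ markedly. A copula model would then be fitted to pairs such as `(u, v)` rather than to the raw values.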
We applied the neural network approach for autocorrelation estimation described in Section 3.3. Details on the feedforward neural networks and their hyper-parameters used in this paper are presented in Table 3.
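As a rough illustration of how a feedforward network with lagged inputs can absorb autocorrelation, the following Python sketch fits a small one-hidden-layer network to a simulated nonlinear autoregressive series and compares the lag-1 autocorrelation before and after. It is a sketch under stated assumptions, not the configuration reported in Table 3: the lag order, hidden-layer size, learning rate, number of epochs, the simulated series, and the helper names `lagged_matrix`, `fit_nnar`, and `acf1` are all hypothetical.

```python
import numpy as np

def lagged_matrix(x, p):
    """Return (X, y): each row of X holds x[t-1], ..., x[t-p]; y holds x[t]."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([x[p - k:len(x) - k] for k in range(1, p + 1)])
    return X, x[p:]

def fit_nnar(X, y, hidden=8, lr=0.05, epochs=3000, seed=0):
    """Fit a one-hidden-layer tanh network by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    W1 = rng.normal(scale=0.5, size=(p, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)               # hidden activations, (n, hidden)
        err = h @ w2 + b2 - y                  # prediction error, (n,)
        # Gradients of half the mean squared error.
        g_w2 = h.T @ err / n
        g_b2 = err.mean()
        g_h = np.outer(err, w2) * (1.0 - h ** 2)
        g_W1 = X.T @ g_h / n
        g_b1 = g_h.mean(axis=0)
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2
    return lambda Z: np.tanh(Z @ W1 + b1) @ w2 + b2

def acf1(x):
    """Lag-1 sample autocorrelation."""
    x = x - x.mean()
    return float(x[1:] @ x[:-1] / (x @ x))

# Simulated nonlinear autoregressive series (illustrative, not PM2.5 data).
rng = np.random.default_rng(1)
n = 1500
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.6 * x[t - 1] + 0.3 * np.tanh(2.0 * x[t - 2]) + 0.5 * rng.normal()
x = (x - x.mean()) / x.std()                   # standardize before training

X, y = lagged_matrix(x, p=2)
predict = fit_nnar(X, y)
resid = y - predict(X)                         # residuals passed to later steps

print("lag-1 autocorrelation of the series:   ", acf1(x))
print("lag-1 autocorrelation of the residuals:", acf1(resid))
```

The residuals, rather than the raw series, would then serve as input to a dependence analysis, since serial correlation left in the data can distort estimates of dependence between series.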
In this work, we presented a statistical measure of directional dependence called copula directional dependence (CDD). We obtained PM2.5 concentration data for four major Chinese cities (Beijing, Chengdu, Guangzhou, and Shanghai) from January 1, 2013 to June 30, 2017. We preprocessed the data using feedforward neural networks with lagged inputs, and we statistically analyzed directional dependence between the cities based on the proposed CDDs. The R source code for preprocessing and analyzing the PM2.5 data in this study is available in S1 Appendix.