Research Outputs

Zheng, X., Guo, B., 何婧, & Chen, S. X. (2021). Effects of corona virus disease‐19 control measures on air quality in North China. Environmetrics, 32, e2673.

Corona virus disease-19 (COVID-19) has substantially reduced human activities and the associated anthropogenic emissions. This study quantifies the effects of COVID-19 control measures on six major air pollutants over 68 cities in North China by a Difference in Relative-Difference method that allows estimation of the COVID-19 effects while taking account of the general annual air quality trends, temporal and meteorological variations, and the spring festival effects. Significant COVID-19 effects on all six major air pollutants are found, with NO2 having the largest decline (−39.6%), followed by PM2.5 (−30.9%), O3 (−16.3%), PM10 (−14.3%), CO (−13.9%), and the least in SO2 (−10.0%), which shows the achievability of air quality improvement by a large reduction in anthropogenic emissions. The heterogeneity of effects among the six pollutants and different regions can be partly explained by coal consumption and industrial output data.

Zhong, W., Gao, Y., 周玮, & Fan, Q. (2021). Endogenous treatment effect estimation using high-dimensional instruments and double selection. Statistics & Probability Letters, 169, 108967.

We propose a double selection instrumental variable estimator for the endogenous treatment effects using both high-dimensional control variables and instrumental variables. It deals with the endogeneity of the treatment variable and reduces omitted variable bias due to imperfect model selection.
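
As a rough illustration of why an instrument helps with an endogenous treatment, the sketch below compares ordinary least squares with plain two-stage least squares on simulated data. It is not the double selection estimator of the paper: there is a single instrument, no high-dimensional controls, and no selection step. The data-generating process, the true effect of 2.0, and the function names are all assumptions made for the example.

```python
import numpy as np

def ols_slope(x, y):
    """Slope of a simple regression of y on x (with intercept)."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (xc @ xc))

def tsls_slope(z, d, y):
    """Two-stage least squares with one instrument z for treatment d."""
    first_stage = ols_slope(z, d)                       # stage 1: d on z
    d_hat = d.mean() + first_stage * (z - z.mean())     # fitted treatment
    return ols_slope(d_hat, y)                          # stage 2: y on fitted d

# Simulated data (hypothetical): u is an unobserved confounder that
# drives both the treatment d and the outcome y, so OLS is biased.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)                 # instrument, independent of u
u = rng.normal(size=n)                 # unobserved confounder
d = z + u + rng.normal(size=n)         # endogenous treatment
y = 2.0 * d + 3.0 * u + rng.normal(size=n)

ols_estimate = ols_slope(d, y)
iv_estimate = tsls_slope(z, d, y)
print(f"OLS: {ols_estimate:.2f}, 2SLS: {iv_estimate:.2f}")
```

In this setup the OLS slope absorbs part of the confounder's effect, while the instrumented estimate should land near the assumed true effect.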

常晋源, Kolaczyk, E. D. & Yao, Q. (2020). Discussion of ‘Network cross-validation by edge sampling’. Biometrika, 107, 277-280.

We thank the authors for their new contribution to network modelling. Data reuse, encompassing methods such as bootstrapping and cross-validation, is an area that to date has largely resisted obvious and rapid development in the network context. One of the major reasons is that mimicking the original sampling mechanisms is challenging if not impossible. To avoid deleting edges and destroying some of the network structure, the resampling strategy proposed in Li et al. (2020) based on splitting node pairs rather than nodes is therefore insightful and effective. Matrix completion is the key technique involved, with its use here providing a new perspective for network analysis.

Li, Q., 余关元, & Liu, Y. (2020). A deep multimodal generative and fusion framework for class-imbalanced multimodal data. Multimedia Tools and Applications, 79, 25023-25050.

The purpose of multimodal classification is to integrate features from diverse information sources to make decisions. The interactions between different modalities are crucial to this task. However, common strategies in previous studies have been to either concatenate features from various sources into a single compound vector or input them separately into several different classifiers that are then assembled into a single robust classifier to generate the final prediction. Both of these approaches weaken or even ignore the interactions among different feature modalities. In addition, in the case of class-imbalanced data, multimodal classification becomes troublesome. In this study, we propose a deep multimodal generative and fusion framework for multimodal classification with class-imbalanced data. This framework consists of two modules: a deep multimodal generative adversarial network (DMGAN) and a deep multimodal hybrid fusion network (DMHFN). The DMGAN is used to handle the class imbalance problem. The DMHFN identifies fine-grained interactions and integrates different information sources for multimodal classification. Experiments on a faculty homepage dataset show the superiority of our framework compared to several state-of-the-art methods.

余关元, Li, Q., Wang, J., Zhang, D., & Liu, Y. (2020). A multimodal generative and fusion framework for recognizing faculty homepages. Information Sciences, 525, 205-220.

Multimodal data consist of several data modes, where each mode is a group of similar data sharing the same attributes. Recognizing faculty homepages is essentially a multimodal classification problem in which a target faculty homepage is determined from three different information sources, including text, images, and layout. Conventional strategies in previous studies have been either to concatenate features from various information sources into a compound vector or to input them separately into several different classifiers that are then assembled into a stronger classifier for the final prediction. However, both approaches ignore the connections among different feature sets. We argue that such relations are essential to enhance multimodal classification. Besides, recognizing faculty homepages is a class imbalance problem in which the total number of samples of a minority class is far smaller than the sample numbers of other classes. In this study, we propose a multimodal generative and fusion framework for multimodal learning with the problems of imbalanced data and mutually dependent feature modes. Specifically, a multimodal generative adversarial network is first introduced to rebalance the dataset by generating pseudo features based on each mode and combining them to describe a fake sample. Then, a gated fusion network with the gate and fusion mechanisms is presented to reduce the noise to improve the generalization ability and capture the links among the different feature modes. Experiments on a faculty homepage dataset show the superiority of the proposed framework.

张佳, & Chen, X. (2020). Principal envelope model. Journal of Statistical Planning and Inference, 206, 249-262.

Principal component analysis (PCA) is widely used in various fields to reduce high dimensional data sets to lower dimensions. Traditionally, the first few principal components that capture most of the variance in the data are thought to be important. Tipping and Bishop (1999) introduced probabilistic principal component analysis (PPCA) in which they assumed an isotropic error in a latent variable model. Motivated by a general error structure and incorporating the novel idea of “envelope” proposed by Cook et al. (2010), we construct principal envelope models (PEM) which demonstrate the possibility that any subset of the principal components could retain most of the sample’s information. The useful principal components can be found through maximum likelihood approaches. We also embed the PEM in a factor model setting to illustrate its reasonableness and validity. Numerical results indicate the potential of the proposed method.
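
For readers unfamiliar with the baseline that this abstract contrasts against, the following is a minimal sketch of ordinary PCA and the share of variance captured by each component. It is standard PCA via the singular value decomposition, not the principal envelope model itself. The simulated data, the column scales, and the function name are assumptions made for the example.

```python
import numpy as np

def pca_explained_variance(X):
    """Return principal directions (rows) and the variance share of each."""
    Xc = X - X.mean(axis=0)                       # center each column
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    variances = s**2 / (X.shape[0] - 1)           # eigenvalues of sample covariance
    return vt, variances / variances.sum()

# Hypothetical data: five independent columns with very different scales,
# so the leading component should account for most of the variance.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([5.0, 2.0, 1.0, 0.5, 0.1])

components, ratio = pca_explained_variance(X)
print(np.round(ratio, 3))
```

Under the traditional view, one would keep the leading components with the largest shares; the abstract's point is that a different subset may carry the useful information.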

张佳, Shi, H., Tian, L., & Xiao, F. (2019). Penalized generalized empirical likelihood in high-dimensional weakly dependent data. Journal of Multivariate Analysis, 171, 270-283.

In this paper, we propose a penalized generalized empirical likelihood (PGEL) approach based on the smoothed moment functions (Anatolyev, 2005; Smith, 1997, 2004) for parameter estimation and variable selection in the growing (high) dimensional weakly dependent time series setting. The dimensions of the parameters and moment restrictions are both allowed to grow with the sample size at some moderate rates. The asymptotic properties of the estimators of the smoothed generalized empirical likelihood (SGEL) and its penalized version (SPGEL) are then obtained by properly restricting the degree of data dependence. It is shown that the SPGEL estimator maintains the oracle property despite the existence of data dependence and growing (high) dimensionality. We finally present simulation results and a real data analysis to illustrate the finite-sample performance and applicability of our proposed method.

张佳, & Chen, X. (2019). Robust sufficient dimension reduction via ball covariance. Computational Statistics & Data Analysis, 140, 144-154.

Sufficient dimension reduction is an important branch of dimension reduction, which includes variable selection and projection methods. Most of the sufficient dimension reduction methods are sensitive to outliers and heavy-tailed predictors, and require strict restrictions on the predictors and the response. In order to widen the applicability of sufficient dimension reduction, we propose BCov-SDR, a novel sufficient dimension reduction approach that is based on a recently developed dependence measure: ball covariance. Compared with other popular sufficient dimension reduction methods, our approach requires rather mild conditions on the predictors and the response, and is robust to outliers or heavy-tailed distributions. BCov-SDR does not require the specification of a forward regression model and allows for discrete or categorical predictors and multivariate response. The consistency of the BCov-SDR estimator of the central subspace is obtained without imposing any moment conditions on the predictors. Simulations and real data studies illustrate the applicability and versatility of our proposed method.
