University of Oulu

Y. Wang et al., "Hyperspectral Estimation of Soil Copper Concentration Based on Improved TabNet Model in the Eastern Junggar Coalfield," in IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1-20, 2022, Art no. 5534020, doi: 10.1109/TGRS.2022.3190310

Hyperspectral estimation of soil copper concentration based on improved TabNet model in the Eastern Junggar coalfield

Author: Wang, Yuan1,2; Abliz, Abdugheni3; Ma, Hongbing4;
Organizations: 1College of Information Science and Engineering, Xinjiang University, Ürümqi 830046, China
2Department of Electronic Engineering and the Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
3College of Geography and Remote Sensing Science, Xinjiang University, Ürümqi 830046, China
4Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
5College of System Engineering, National University of Defense Technology, Changsha 410073, China
6Center for Machine Vision and Signal Analysis, University of Oulu, 90014 Oulu, Finland
7Sino-Belgian Joint Laboratory for Geo-Information and the Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Ürümqi 830011, China
8School of Resources and Environmental Sciences, Xinjiang University, Ürümqi 830046, China
9Department of Applied Ecology, Saint Petersburg State University, 199178 Saint Petersburg, Russia
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 2.4 MB)
Persistent link: http://urn.fi/urn:nbn:fi-fe2023040535152
Language: English
Published: Institute of Electrical and Electronics Engineers, 2022
Publish Date: 2023-04-05
Description:

Abstract

China is the largest coal consumer in the world. The massive exploitation and utilization of coal resources have resulted in serious heavy metal pollution and environmental contamination, including soil degradation, water pollution, crop damage, and even threats to human life. Rapid, real-time monitoring of soil heavy metal pollution is therefore an urgent task. This research formulated a new preprocessing method for soil hyperspectral data, inspired by few-shot learning, and combined it with other soil-related auxiliary information to extract effective information from the soil hyperspectrum; different regression methods were then adopted to predict soil heavy metal contamination. The approach was verified on 168 actual soil samples from the Eastern Junggar coalfield in Xinjiang. Because copper in the soil is a trace element and its spectral characteristics are affected by other impurities, improper use of hyperspectral preprocessing methods may introduce interference or delete useful information, which degrades model performance. To address these problems, the preprocessing method in this experiment combined second-order differential derivation and data enhancement (DA) with the addition of auxiliary information, allowing more effective features to enter the model. Next, the attentive interpretable tabular learning (TabNet) model was improved in three different ways, and regression models were built with the original TabNet model and the three improved TabNet models. One of the improved TabNet models performed best and also provided a list of the top 30 features ranked by importance.
Meanwhile, regression prediction of Cu content using four different convolutional neural networks (CNNs) revealed that the model with the residual block was the strongest and slightly outperformed the improved TabNet model, but it lacked interpretability with respect to the input data. In addition, this experiment applied different preprocessing methods to various regression models and found that the traditional preprocessing methods performed best in traditional regression models [e.g., partial least squares regression (PLSR)] and underperformed in deep learning models. The selected optimal model was compared with PLSR and CNN models. The results indicated that both the improved TabNet model and the improved CNN model performed better with the new preprocessing approach proposed in this article, with improved TabNet yielding a coefficient of determination (R²), root-mean-square error (RMSE), and ratio of performance to interquartile range (RPIQ) of 0.94, 1.341, and 4.474, respectively. The improved CNN model had an R² of 0.942, an RMSE of 1.324, and an RPIQ of 4.531 on the test dataset.
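
For readers unfamiliar with the three metrics reported in the abstract, the following is a minimal Python sketch of how R², RMSE, and RPIQ are commonly computed. It is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, and the quartile convention (inclusive, with linear interpolation) is an assumption, since RPIQ shifts slightly depending on how quartiles are estimated.

```python
import math
import statistics


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, and RPIQ for paired observed/predicted values.

    Hypothetical helper for illustration only. RPIQ is taken here as the
    interquartile range of the observed values divided by the RMSE.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)          # residual sum of squares
    mean_true = statistics.fmean(y_true)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    rmse = math.sqrt(ss_res / len(y_true))
    r2 = 1.0 - ss_res / ss_tot

    # Assumed quartile convention: inclusive method with linear interpolation.
    q1, _, q3 = statistics.quantiles(y_true, n=4, method="inclusive")
    rpiq = (q3 - q1) / rmse

    return {"r2": r2, "rmse": rmse, "rpiq": rpiq}


# Made-up values, not data from the article.
metrics = regression_metrics([1, 2, 3, 4, 5], [1, 2, 3, 4, 6])
print(metrics)
```

A higher RPIQ means the prediction error is small relative to the spread of the observed values, which is why it is often reported alongside R² and RMSE for soil property models.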

Series: IEEE transactions on geoscience and remote sensing
ISSN: 0196-2892
ISSN-E: 1558-0644
ISSN-L: 0196-2892
Volume: 60
Pages: 1 - 20
DOI: 10.1109/tgrs.2022.3190310
OADOI: https://oadoi.org/10.1109/tgrs.2022.3190310
Type of Publication: A1 Journal article – refereed
Field of Science: 113 Computer and information sciences
Funding: This work was supported in part by the Autonomous Region Postgraduate Research Innovation under Project XJ2021G064, in part by the National Natural Science Foundation of China under Grant 51704259, and in part by the Shanghai Aerospace Science and Technology Innovation Fund under Grant SAST2019-048.
Copyright information: © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.