Dual Regularized Unsupervised Feature Selection Based on Matrix Factorization and Minimum Redundancy with application in gene selection
Saberi-Movahed, Farid; Rostami, Mehrdad; Berahmand, Kamal; Karami, Saeed; Tiwari, Prayag; Oussalah, Mourad; Band, Shahab S. (2022-09-11)
Saberi-Movahed, F., Rostami, M., Berahmand, K., Karami, S., Tiwari, P., Oussalah, M., & Band, S. S. (2022). Dual regularized unsupervised feature selection based on matrix factorization and minimum redundancy with application in gene selection. Knowledge-Based Systems, 256, 109884. https://doi.org/10.1016/j.knosys.2022.109884
© 2022 The Author(s). This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
https://urn.fi/URN:NBN:fi-fe2022112566970
Abstract
Gene expression data have become increasingly important in machine learning and computational biology over the past few years. Several matrix factorization-based dimensionality reduction methods have been developed for gene expression analysis, but these methods can still be improved in terms of efficiency and reliability. This paper introduces a feature selection approach called Dual Regularized Unsupervised Feature Selection Based on Matrix Factorization and Minimum Redundancy (DR-FS-MFMR). The main goal of DR-FS-MFMR is to discard redundant features from the original feature set. To this end, the feature selection problem is defined in terms of two aspects: (1) the matrix factorization of the data matrix into a feature weight matrix and a representation matrix, and (2) the correlation information of the selected feature set. The objective function is then enriched with two data representation characteristics and an inner product regularization criterion, so that both redundancy minimization and sparsity are handled more precisely. To demonstrate the performance of DR-FS-MFMR, extensive experiments are conducted on nine gene expression datasets. The computational results indicate the efficiency and productivity of DR-FS-MFMR for the gene selection task.
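The abstract names three ingredients without giving formulas: a factorization of the data matrix into a feature weight matrix and a representation matrix, a correlation-based redundancy term, and an inner product regularizer. The sketch below shows one way such terms could be combined and evaluated. It is an illustration under stated assumptions, not the objective from the paper: the form X ≈ XWH, the use of absolute Pearson correlation, the row-wise inner product penalty, the trade-off parameters `alpha` and `beta`, and the function names `toy_objective` and `rank_features` are all hypothetical.

```python
import numpy as np


def toy_objective(X, W, H, alpha=1.0, beta=1.0):
    """Evaluate an illustrative objective (NOT the paper's exact formulation).

    X: (n_samples, n_features) data matrix
    W: (n_features, k) feature weight matrix (assumed shape)
    H: (k, n_features) representation matrix (assumed shape)
    """
    # Assumed reconstruction term: X is approximated by X @ W @ H.
    reconstruction = np.linalg.norm(X - X @ W @ H, "fro") ** 2

    # Assumed redundancy term: absolute Pearson correlation between
    # features, weighted by each feature's score (row norm of W).
    C = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(C, 0.0)
    scores = np.linalg.norm(W, axis=1)
    redundancy = float(scores @ C @ scores)

    # Assumed inner product regularizer: sum over i != j of |<w_i, w_j>|,
    # where w_i is the i-th row of W. It is zero when rows are orthogonal.
    G = np.abs(W @ W.T)
    inner_product = float(G.sum() - np.trace(G))

    total = reconstruction + alpha * redundancy + beta * inner_product
    return {
        "total": total,
        "reconstruction": reconstruction,
        "redundancy": redundancy,
        "inner_product": inner_product,
    }


def rank_features(W):
    """Rank features by the Euclidean norm of their row in W, largest first."""
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores, kind="stable")
```

In a setting like this, a feature whose row in `W` has a large norm contributes strongly to the reconstruction, so keeping the top-ranked rows is a common way to read a feature subset off a learned weight matrix. How DR-FS-MFMR actually optimizes and selects is described in the full paper, not here.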