
Unsupervised spectral feature selection algorithms for high dimensional data

11.12.23 | Higher Education Press



Detecting informative features in high-dimensional data is an important and challenging task for explainable analysis and for building interpretable AI systems, especially when the data contain very few samples and no label information. Unsupervised feature selection algorithms are a natural way to address this challenge, particularly in the big data era. However, existing unsupervised feature selection approaches usually cannot precisely identify the most discriminative features in high-dimensional data with a small number of samples.
To address these challenges, a research team led by Juanying XIE published new research on 15 Oct 2023 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.
The team proposes two novel unsupervised spectral feature selection algorithms. Both group features into clusters using an advanced Self-Tuning spectral clustering algorithm based on local standard deviation, so that the globally optimal feature clusters are detected as far as possible. Entropy-based and cosine-similarity-based feature ranking techniques are proposed, one for each algorithm, so that a representative feature can be selected from each cluster to form the feature subset on which an explainable classification system is built. This ensures that the selected features are representative and as independent of each other as possible. Extensive experiments and rigorous statistical tests demonstrate that these unsupervised spectral feature selection algorithms are superior to the peer algorithms they were compared with. The features they detect have strong discriminative capability in downstream classifiers for omics data, so an AI system built on them would be reliable and explainable, making it possible to build a transparent and trustworthy medical diagnostic system from an interpretable AI perspective.
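As a rough illustration of the second stage described above (picking one representative feature from each feature cluster), the following minimal sketch may help. It is not the team's implementation. It assumes the cluster labels have already been produced by a spectral clustering step that is not shown, and the scoring rule (highest mean absolute cosine similarity to the other features in the same cluster) is an assumption made for this example, not the formula from the paper. The function names `cosine_similarity` and `select_representatives` and the toy data are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero feature as unrelated to everything
    return dot / (norm_u * norm_v)


def select_representatives(features, labels):
    """Return one feature index per cluster, ordered by cluster id.

    features: list of feature vectors (one list of sample values per feature)
    labels:   cluster id of each feature, assumed to come from an earlier
              spectral clustering step (not shown here)

    Assumed scoring rule (illustrative only): within each cluster, keep the
    feature with the highest mean absolute cosine similarity to the others.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    chosen = []
    for label in sorted(clusters):
        members = clusters[label]
        if len(members) == 1:
            chosen.append(members[0])  # a singleton represents itself
            continue

        def score(i):
            others = [j for j in members if j != i]
            total = sum(abs(cosine_similarity(features[i], features[j]))
                        for j in others)
            return total / len(others)

        chosen.append(max(members, key=score))
    return chosen


# Toy data: 6 features measured on 4 samples, already split into 2 clusters.
features = [
    [1.0, 1.0, 0.0, 0.0],  # feature 0: overlaps with features 1 and 2
    [1.0, 0.0, 0.0, 0.0],  # feature 1
    [0.0, 1.0, 0.0, 0.0],  # feature 2
    [0.0, 0.0, 1.0, 0.0],  # feature 3
    [0.0, 0.0, 1.0, 1.0],  # feature 4: overlaps with features 3 and 5
    [0.0, 0.0, 0.0, 1.0],  # feature 5
]
labels = [0, 0, 0, 1, 1, 1]

print(select_representatives(features, labels))  # prints [0, 4]
```

Keeping a single feature per cluster is what makes the resulting subset small and limits redundancy between the selected features, under the assumption that features within a cluster carry similar information.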
Two directions remain for future work. One is to find a general way to choose an appropriate parameter for the advanced Self-Tuning spectral clustering based on local standard deviation. The other is to reduce the computing cost of detecting the optimal feature subset in data of extremely high dimensionality, such as SNP data.

Frontiers of Computer Science

10.1007/s11704-022-2135-0

Experimental study

Not applicable

Unsupervised spectral feature selection algorithms for high dimensional data

15-Oct-2023


Article Information

Contact Information

Rong Xie
Higher Education Press
xierong@hep.com.cn


How to Cite This Article

APA:
Higher Education Press. (2023, November 12). Unsupervised spectral feature selection algorithms for high dimensional data. Brightsurf News. https://www.brightsurf.com/news/86ZQ2VK8/unsupervised-spectral-feature-selection-algorithms-for-high-dimensional-data.html
MLA:
"Unsupervised spectral feature selection algorithms for high dimensional data." Brightsurf News, Nov. 12 2023, https://www.brightsurf.com/news/86ZQ2VK8/unsupervised-spectral-feature-selection-algorithms-for-high-dimensional-data.html.