Paper Title

Ensembling improves stability and power of feature selection for deep learning models

Authors

Prashnna K. Gyawali, Xiaoxia Liu, James Zou, Zihuai He

Abstract

With the growing adoption of deep learning models in different real-world domains, including computational biology, it is often necessary to understand which data features are essential for the model's decision. Despite extensive recent efforts to define different feature importance metrics for deep learning models, we identified that inherent stochasticity in the design and training of deep learning models makes commonly used feature importance scores unstable. This results in varied explanations or selections of different features across different runs of the model. We demonstrate how the signal strength of features and correlation among features directly contribute to this instability. To address this instability, we explore the ensembling of feature importance scores of models across different epochs and find that this simple approach can substantially address this issue. For example, we consider knockoff inference, as it allows feature selection with statistical guarantees. We discover considerable variability in selected features across different epochs of deep learning training, and the best selection of features does not necessarily occur at the lowest validation loss, the conventional approach to determining the best model. As such, we present a framework to combine the feature importance of models trained across different hyperparameter settings and epochs, and instead of selecting features from one best model, we perform an ensemble of feature importance scores from numerous good models. Across a range of experiments on simulated and various real-world datasets, we demonstrate that the proposed framework consistently improves the power of feature selection.
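The core idea described above, averaging per-epoch (or per-hyperparameter) feature importance scores rather than trusting the single checkpoint with the lowest validation loss, can be sketched in a few lines. This is a minimal toy illustration with synthetic scores, not the authors' implementation; the function names `ensemble_importance` and `select_top_k` are hypothetical, and real usage would plug in importance scores from an actual trained model (e.g. knockoff statistics):

```python
import numpy as np

def ensemble_importance(scores_per_epoch):
    """Average per-epoch feature importance scores into one ensemble score.

    scores_per_epoch: array of shape (n_epochs, n_features), one row of
    importance scores saved at each training epoch (or each run in a
    hyperparameter sweep).
    """
    scores = np.asarray(scores_per_epoch, dtype=float)
    return scores.mean(axis=0)

def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features."""
    return set(np.argsort(scores)[::-1][:k])

# Toy setup: features 0-2 carry real signal among 10 features, but
# epoch-to-epoch noise is large enough to scramble single-epoch rankings.
rng = np.random.default_rng(0)
true_signal = np.array([3.0, 3.0, 3.0] + [0.0] * 7)
per_epoch = true_signal + rng.normal(scale=2.0, size=(20, 10))

single_epoch_pick = select_top_k(per_epoch[-1], k=3)   # last epoch only
ensembled_pick = select_top_k(ensemble_importance(per_epoch), k=3)
```

Averaging over 20 epochs shrinks the noise on each score by roughly a factor of sqrt(20), which is why the ensembled ranking recovers the signal features far more reliably than any single epoch.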
