Paper Title

Trustworthiness of Laser-Induced Breakdown Spectroscopy Predictions via Simulation-based Synthetic Data Augmentation and Multitask Learning

Paper Authors

Finotello, Riccardo, L'Hermite, Daniel, Quéré, Celine, Rouge, Benjamin, Tamaazousti, Mohamed, Sirven, Jean-Baptiste

Paper Abstract

We consider quantitative analyses of spectral data using laser-induced breakdown spectroscopy. We address the small size of the available training data and the validation of predictions during inference on unknown data. For this purpose, we build robust calibration models using deep convolutional multitask learning architectures to predict the concentration of the analyte, alongside additional spectral information as auxiliary outputs. These secondary predictions can be used to validate the trustworthiness of the model by taking advantage of the mutual dependencies of the parameters of the multitask neural network. Due to the experimental lack of training samples, we introduce a simulation-based data augmentation process to synthesise an arbitrary number of spectra that are statistically representative of the experimental data. Given the nature of the deep learning model, no dimensionality reduction or data selection processes are required. The procedure is an end-to-end pipeline comprising synthetic data augmentation, the construction of a suitable, robust, homoscedastic deep learning model, and the validation of its predictions. In the article, we compare the performance of the multitask model with traditional univariate and multivariate analyses to highlight the separate contribution of each element introduced in the process.
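
To make the multitask architecture described in the abstract concrete, the following is a minimal sketch in PyTorch, assuming a 1-D convolutional trunk shared by a primary head that regresses the analyte concentration and an auxiliary head that predicts additional spectral quantities. It is not the authors' implementation; all layer sizes, the spectrum length, and the number of auxiliary outputs are hypothetical placeholders.

import torch
import torch.nn as nn

class MultitaskLIBSNet(nn.Module):
    # Hypothetical multitask model: one shared convolutional trunk,
    # one head for the analyte concentration, and one head for auxiliary
    # spectral outputs used to check the trustworthiness of the model.
    def __init__(self, n_aux_outputs: int = 4):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(64),
            nn.Flatten(),
        )
        self.concentration_head = nn.Sequential(
            nn.Linear(32 * 64, 128), nn.ReLU(), nn.Linear(128, 1)
        )
        self.aux_head = nn.Sequential(
            nn.Linear(32 * 64, 128), nn.ReLU(), nn.Linear(128, n_aux_outputs)
        )

    def forward(self, x):
        features = self.trunk(x)  # shared representation for all tasks
        return self.concentration_head(features), self.aux_head(features)

# Joint training on a batch of (placeholder) spectra: a weighted sum of the
# task losses couples the two heads through the shared trunk parameters.
model = MultitaskLIBSNet()
spectra = torch.randn(8, 1, 4096)   # batch of raw, full-resolution spectra
conc_true = torch.rand(8, 1)
aux_true = torch.rand(8, 4)
conc_pred, aux_pred = model(spectra)
loss = nn.functional.mse_loss(conc_pred, conc_true) \
       + 0.5 * nn.functional.mse_loss(aux_pred, aux_true)
loss.backward()

Because both heads read the same shared features and are trained with a joint loss, auxiliary predictions that disagree with the expected spectral information at inference time can flag an untrustworthy concentration estimate, which is the validation mechanism outlined in the abstract.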
