Paper Title

On transfer learning of neural networks using bi-fidelity data for uncertainty propagation

Paper Authors

Subhayan De, Jolene Britton, Matthew Reynolds, Ryan Skinner, Kenneth Jansen, Alireza Doostan

Paper Abstract

Due to their high degree of expressiveness, neural networks have recently been used as surrogate models for mapping inputs of an engineering system to outputs of interest. Once trained, neural networks are computationally inexpensive to evaluate and remove the need for repeated evaluations of computationally expensive models in uncertainty quantification applications. However, given the highly parameterized construction of neural networks, especially deep neural networks, accurate training often requires large amounts of simulation data that may not be available in the case of computationally expensive systems. In this paper, to alleviate this issue for uncertainty propagation, we explore the application of transfer learning techniques using training data generated from both high- and low-fidelity models. We explore two strategies for coupling these two datasets during the training procedure, namely, the standard transfer learning and the bi-fidelity weighted learning. In the former approach, a neural network model mapping the inputs to the outputs of interest is trained based on the low-fidelity data. The high-fidelity data is then used to adapt the parameters of the upper layer(s) of the low-fidelity network, or train a simpler neural network to map the output of the low-fidelity network to that of the high-fidelity model. In the latter approach, the entire low-fidelity network parameters are updated using data generated via a Gaussian process model trained with a small high-fidelity dataset. The parameter updates are performed via a variant of stochastic gradient descent with learning rates given by the Gaussian process model. Using three numerical examples, we illustrate the utility of these bi-fidelity transfer learning methods where we focus on accuracy improvement achieved by transfer learning over standard training approaches.
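The first strategy in the abstract, standard transfer learning, can be illustrated with a toy sketch: a small network is first trained on plentiful low-fidelity data, and then only its upper (output) layer is re-fit on a handful of high-fidelity samples while the lower layer stays frozen. This is not the authors' implementation; the `low_fidelity`/`high_fidelity` models, the network sizes, and the use of a ridge least-squares solve for the linear output layer (in place of the gradient-based update) are all assumptions made for this minimal example.

```python
# Minimal sketch of standard transfer learning with bi-fidelity data.
# NOT the paper's code; models, sizes, and hyperparameters are assumed.
import numpy as np

rng = np.random.default_rng(0)

def low_fidelity(x):
    # cheap, biased model of the quantity of interest (assumed)
    return np.sin(x)

def high_fidelity(x):
    # expensive "truth" model (assumed)
    return np.sin(x) + 0.2 * x

# --- Stage 1: train the full network on abundant low-fidelity data ---
x_lo = rng.uniform(-np.pi, np.pi, (200, 1))
y_lo = low_fidelity(x_lo)

d = 20                                    # hidden width
W1 = rng.normal(0.0, 1.0, (1, d))         # lower-layer weights
b1 = np.zeros(d)
W2 = rng.normal(0.0, 0.1, (d, 1))         # upper-layer weights
b2 = np.zeros(1)

def forward(x, W1, b1, W2, b2):
    h = np.tanh(x @ W1 + b1)              # hidden features
    return h, h @ W2 + b2

lr = 0.05
for _ in range(2000):                     # plain full-batch gradient descent
    h, pred = forward(x_lo, W1, b1, W2, b2)
    err = pred - y_lo
    gW2 = h.T @ err / len(x_lo)
    gb2 = err.mean(axis=0)
    gh = (err @ W2.T) * (1.0 - h**2)      # backprop through tanh
    gW1 = x_lo.T @ gh / len(x_lo)
    gb1 = gh.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

x_test = np.linspace(-np.pi, np.pi, 50)[:, None]
_, y_lo_only = forward(x_test, W1, b1, W2, b2)
rmse_before = np.sqrt(np.mean((y_lo_only - high_fidelity(x_test)) ** 2))

# --- Stage 2: freeze the lower layer and adapt only the upper layer on
#     a small high-fidelity dataset (ridge least squares stands in for
#     the gradient update, since the output layer is linear) ---
x_hi = rng.uniform(-np.pi, np.pi, (15, 1))
y_hi = high_fidelity(x_hi)
h_hi, _ = forward(x_hi, W1, b1, W2, b2)   # frozen lower-layer features
A = np.hstack([h_hi, np.ones((len(x_hi), 1))])
coef = np.linalg.solve(A.T @ A + 1e-2 * np.eye(d + 1), A.T @ y_hi)
W2, b2 = coef[:-1], coef[-1]

_, y_transfer = forward(x_test, W1, b1, W2, b2)
rmse_after = np.sqrt(np.mean((y_transfer - high_fidelity(x_test)) ** 2))
print(f"RMSE low-fidelity only: {rmse_before:.3f}, "
      f"after transfer: {rmse_after:.3f}")
```

The second strategy in the abstract (bi-fidelity weighted learning, which updates all network parameters using data generated by a Gaussian process fit to the high-fidelity samples) would replace Stage 2 above with a full-network update; its GP-derived learning rates are specific to the paper and are not sketched here.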
