Paper Title
Interpretable Neural Networks for Panel Data Analysis in Economics
Paper Authors
Abstract
The lack of interpretability and transparency is preventing economists from using advanced tools like neural networks in their empirical research. In this paper, we propose a class of interpretable neural network models that achieve both high prediction accuracy and interpretability. The model can be written as a simple function of a regularized number of interpretable features, which are the outcomes of interpretable functions encoded in the neural network. Researchers can design different forms of interpretable functions based on the nature of their tasks. In particular, we encode a class of interpretable functions named persistent change filters in the neural network to study time-series cross-sectional data. We apply the model to predicting individuals' monthly employment status using high-dimensional administrative data. We achieve an accuracy of 94.5% on the test set, which is comparable to that of the best-performing conventional machine learning methods. Furthermore, the interpretability of the model allows us to understand the mechanism underlying the predictions: an individual's employment status is closely related to whether she pays different types of insurance. Our work is a useful step towards overcoming the black-box problem of neural networks, and it provides economists with a new tool for studying administrative and proprietary big data.
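The abstract does not specify the functional form of the persistent change filters. As a rough, hypothetical illustration of the idea only (the function name, the `threshold` and `decay` parameters, and the exponential-smoothing form are our own assumptions, not the paper's specification), one could imagine a feature that responds strongly when a variable changes and then *stays* changed, while damping transient fluctuations:

```python
import numpy as np

def persistent_change_filter(x, threshold=0.5, decay=0.5):
    """Toy 'persistent change' feature (illustrative only).

    Produces an exponentially smoothed indicator that the series `x`
    has moved above `threshold` and remained there: the output rises
    toward 1 only if the change persists over consecutive periods,
    so transient spikes yield a lower final score than a sustained shift.
    """
    score = 0.0
    scores = []
    for xt in x:
        changed = 1.0 if xt > threshold else 0.0  # indicator of a change at time t
        # smooth the indicator: persistence accumulates, transients decay
        score = decay * score + (1.0 - decay) * changed
        scores.append(score)
    return np.array(scores)

# A sustained change scores higher than an on-off transient pattern:
persistent = persistent_change_filter([0, 0, 1, 1, 1])
transient = persistent_change_filter([1, 0, 1, 0, 1])
```

In a model of the kind the abstract describes, a regularized, interpretable output layer (e.g. a sparse logistic regression) could then be fit on top of such filter outputs, so each retained feature has a direct economic reading.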