Paper Title

Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning

Authors

Chan, Alex J.; van der Schaar, Mihaela

Abstract

Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data - instead given access to a set of expert models and their predictions alongside some limited information about the dataset used to train them. In scenarios from finance to the medical sciences, and even consumer practice, stakeholders have developed models on private data they either cannot, or do not want to, share. Given the value and legislation surrounding personal information, it is not surprising that only the models, and not the data, will be released - the pertinent question becoming: how best to use these models? Previous work has focused on global model selection or ensembling, with the result of a single final model across the feature space. Machine learning models perform notoriously poorly on data outside their training domain however, and so we argue that when ensembling models the weightings for individual instances must reflect their respective domains - in other words models that are more likely to have seen information on that instance should have more attention paid to them. We introduce a method for such an instance-wise ensembling of models, including a novel representation learning step for handling sparse high-dimensional domains. Finally, we demonstrate the need and generalisability of our method on classical machine learning tasks as well as highlighting a real world use case in the pharmacological setting of vancomycin precision dosing.
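
To make the idea of instance-wise weighting concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not specify how the weights are computed, and the paper's representation learning step is omitted entirely. The sketch assumes, purely for illustration, that the "limited information" released with each model is a per-feature mean and standard deviation of its training data, and that a diagonal Gaussian density is a reasonable proxy for how likely a model is to have seen data like a given instance. The names `instance_weights` and `combine` are hypothetical.

```python
import numpy as np

def instance_weights(x, means, stds):
    """Per-instance model weights from assumed diagonal-Gaussian domain summaries.

    x:     (n, d) test instances
    means: (m, d) per-model feature means (assumed to be the released summary)
    stds:  (m, d) per-model feature standard deviations (same assumption)
    Returns an (n, m) array whose rows sum to 1.
    """
    x = np.asarray(x, dtype=float)[:, None, :]          # (n, 1, d)
    means = np.asarray(means, dtype=float)[None, :, :]  # (1, m, d)
    stds = np.asarray(stds, dtype=float)[None, :, :]    # (1, m, d)
    # Log-density of every instance under every model's training-domain summary.
    log_dens = -0.5 * np.sum(
        ((x - means) / stds) ** 2 + np.log(2.0 * np.pi * stds ** 2), axis=-1
    )                                                   # (n, m)
    # Normalise across models with a numerically stable softmax.
    log_dens = log_dens - log_dens.max(axis=1, keepdims=True)
    w = np.exp(log_dens)
    return w / w.sum(axis=1, keepdims=True)

def combine(x, models, means, stds):
    """Weighted average of model predictions, with weights varying per instance."""
    x = np.asarray(x, dtype=float)
    w = instance_weights(x, means, stds)                # (n, m)
    preds = np.stack([f(x) for f in models], axis=1)    # (n, m)
    return np.sum(w * preds, axis=1)                    # (n,)
```

In contrast to global ensembling, where one weight vector is shared across the whole feature space, each row of the weight matrix here differs, so a model dominates only on instances that resemble its own training domain.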
