Paper title
Post-hoc regularisation of unfolded cross-section measurements
Paper authors
Paper abstract
Neutrino cross-section measurements are often presented as unfolded binned distributions in "true" variables. The ill-posedness of the unfolding problem can lead to results with strong anti-correlations and fluctuations between bins, which make it difficult to compare the results to theoretical models in plots. To alleviate this problem, one can introduce regularisation terms in the unfolding procedure. These suppress the anti-correlations in the result, at the cost of introducing some bias towards the expected shape of the data. This paper discusses a method using simple linear algebra, which makes it possible to regularise any result that is presented as a central value and a covariance matrix. This "post-hoc" regularisation is generally much faster than repeating the unfolding method with different regularisation terms. The method also yields a regularisation matrix which connects the regularised result to the unregularised one, and which can be used to retain the full statistical power of the unregularised result when publishing a nicer-looking regularised result. In addition to the regularisation method, this paper also presents some thoughts on the presentation of correlated data in general. When using the proposed method, the bias of the regularisation can be understood as a data visualisation problem rather than a statistical one. The strength of the regularisation can be chosen by minimising the difference between the implicitly uncorrelated distribution shown in the plots and the actual distribution described by the unregularised central value and covariance. Aside from minimising the difference between the shown and the actual result, additional information can be provided by showing the local log-likelihood gradient of the models shown in the plots. This adds more information about where the model is "pulled" by the data than just comparing the bin values to the data's central values.
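The abstract does not spell out the linear algebra, so the following is only a minimal sketch of how such a post-hoc regularisation might look. It assumes a Tikhonov-style second-difference (curvature) penalty, which is an illustrative choice and not necessarily the one used in the paper. The function names, the toy numbers and the penalty strength `tau` are all hypothetical. Under this assumption, minimising (x - m)^T V^-1 (x - m) + tau |C x|^2 has a closed-form solution m' = A m with A = (V^-1 + tau C^T C)^-1 V^-1, and the covariance propagates as V' = A V A^T.

```python
import numpy as np


def second_difference_matrix(n):
    """Return the (n-2) x n matrix C whose rows compute x[i] - 2*x[i+1] + x[i+2]."""
    C = np.zeros((n - 2, n))
    for i in range(n - 2):
        C[i, i:i + 3] = [1.0, -2.0, 1.0]
    return C


def posthoc_regularise(m, V, tau):
    """Regularise a published result (central value m, covariance V).

    Assumed penalty (illustrative): tau * |C x|^2 with C the second-difference
    matrix. Returns the regularised central value, its covariance, and the
    regularisation matrix A that maps unregularised to regularised space.
    """
    V_inv = np.linalg.inv(V)
    C = second_difference_matrix(len(m))
    Q = tau * C.T @ C
    A = np.linalg.solve(V_inv + Q, V_inv)  # A = (V^-1 + Q)^-1 V^-1
    return A @ m, A @ V @ A.T, A


def loglike_gradient(model, m, V):
    """Gradient of the Gaussian log-likelihood -0.5 (model-m)^T V^-1 (model-m)
    with respect to the model bin values, i.e. V^-1 (m - model)."""
    return np.linalg.solve(V, m - model)


# Toy unregularised result (made-up numbers): 5 bins, neighbours anti-correlated.
m = np.array([1.0, 1.6, 1.1, 1.7, 1.2])
V = 0.04 * np.eye(5) - 0.02 * (np.eye(5, k=1) + np.eye(5, k=-1))

m_reg, V_reg, A = posthoc_regularise(m, V, tau=10.0)

# A hypothetical flat model, compared in both spaces.
model = np.full(5, 1.3)
diff_raw = model - m
chi2_raw = diff_raw @ np.linalg.solve(V, diff_raw)
diff_reg = A @ model - m_reg  # fold the model through A before comparing
chi2_reg = diff_reg @ np.linalg.solve(V_reg, diff_reg)

# Direction in which the data "pull" each model bin.
pull = loglike_gradient(model, m, V)

print("regularised central values:", m_reg)
print("chi2 (unregularised):", chi2_raw)
print("chi2 (regularised, model folded through A):", chi2_reg)
print("log-likelihood gradient:", pull)
```

In this sketch `A` is invertible, so a model folded through `A` and compared to the regularised result gives the same chi-square as a direct comparison to the unregularised result. This is one way to read the statement that the regularisation matrix retains the full statistical power. Setting `tau` to zero returns the identity matrix, so the result is left unchanged.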