Paper Title
Statistical Modelling and Analysis of the Computer-Simulated Datasets
Authors
Abstract
Over the last two decades, science has come a long way from relying solely on physical experiments and observations to experimenting with computer simulators. This chapter focusses on the modelling and analysis of data arising from computer simulators. Traditional statistical metamodels are often not very useful for analyzing such datasets. For deterministic computer simulators, realizations of Gaussian process (GP) models are commonly used to fit a surrogate statistical metamodel of the simulator output. The chapter starts with a quick review of the standard GP-based statistical surrogate model. It also emphasizes the numerical instability caused by near-singularity of the spatial correlation structure in the GP model fitting process. The authors also present a few generalizations of the GP model, review methods and algorithms developed specifically for analyzing big data obtained from computer model runs, and review the popular analysis goals of such computer experiments. A few real-life computer simulators are also briefly outlined.
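As a rough illustration of the near-singularity issue mentioned in the abstract, the following minimal sketch fits a constant-mean GP surrogate to a toy deterministic function and compares the condition number of the correlation matrix with and without a small diagonal "nugget". The Gaussian correlation function, the toy `simulator`, the values of `theta` and `nugget`, and the helper names `gauss_corr`, `fit_gp` and `predict_gp` are all assumptions made for this example. They are not taken from the chapter, and the nugget is only one common remedy, not necessarily the one the authors recommend.

```python
import numpy as np

def gauss_corr(a, b, theta):
    """Gaussian correlation exp(-theta * (a_i - b_j)^2) for 1-D inputs (assumed form)."""
    d = a[:, None] - b[None, :]
    return np.exp(-theta * d ** 2)

def fit_gp(x, y, theta, nugget):
    """Fit a constant-mean GP surrogate; `nugget` is added to the diagonal of R."""
    n = len(x)
    R = gauss_corr(x, x, theta) + nugget * np.eye(n)
    ones = np.ones(n)
    Rinv_y = np.linalg.solve(R, y)
    Rinv_1 = np.linalg.solve(R, ones)
    mu = (ones @ Rinv_y) / (ones @ Rinv_1)  # generalized least-squares estimate of the mean
    alpha = Rinv_y - mu * Rinv_1            # equals R^{-1} (y - mu * 1)
    return {"x": x, "theta": theta, "mu": mu, "alpha": alpha}

def predict_gp(model, x_new):
    """Predictor: mu + r(x_new)^T R^{-1} (y - mu * 1)."""
    r = gauss_corr(x_new, model["x"], model["theta"])
    return model["mu"] + r @ model["alpha"]

def simulator(x):
    """Hypothetical deterministic 'simulator' standing in for a real computer model."""
    return np.sin(2.0 * np.pi * x)

x_train = np.linspace(0.0, 1.0, 20)
y_train = simulator(x_train)
theta, nugget = 1.0, 1e-6  # illustrative values only

R_raw = gauss_corr(x_train, x_train, theta)
cond_raw = np.linalg.cond(R_raw)
cond_nugget = np.linalg.cond(R_raw + nugget * np.eye(len(x_train)))

model = fit_gp(x_train, y_train, theta, nugget)
y_hat = predict_gp(model, x_train)

print(f"condition number without nugget: {cond_raw:.3e}")
print(f"condition number with nugget:    {cond_nugget:.3e}")
print(f"max abs error at design points:  {np.max(np.abs(y_hat - y_train)):.3e}")
```

With the nugget, the surrogate no longer interpolates the simulator output exactly, so the choice of nugget trades numerical stability against fidelity at the design points.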