Paper Title
Innovations in Integrating Machine Learning and Agent-Based Modeling of Biomedical Systems
Paper Authors
Paper Abstract
Agent-based modeling (ABM) is a well-established paradigm for simulating complex systems via interactions between constituent entities. Machine learning (ML) refers to approaches whereby statistical algorithms 'learn' from data on their own, without imposing a priori theories of system behavior. Biological systems -- from molecules, to cells, to entire organisms -- consist of vast numbers of entities, governed by complex webs of interactions that span many spatiotemporal scales and exhibit nonlinearity, stochasticity and intricate coupling between entities. The macroscopic properties and collective dynamics of such systems are difficult to capture via continuum modeling and mean-field formalisms. ABM takes a 'bottom-up' approach that obviates these difficulties by enabling one to easily propose and test a set of well-defined 'rules' to be applied to the individual entities (agents) in a system. Evaluating a system and propagating its state over discrete time-steps effectively simulates the system, allowing observables to be computed and system properties to be analyzed. Because the rules that govern an ABM can be difficult to abstract and formulate from experimental data, there is an opportunity to use ML to help infer optimal, system-specific ABM rules. Once such rule-sets are devised, ABM calculations can generate a wealth of data, and ML can be applied there too -- e.g., to probe statistical measures that meaningfully describe a system's stochastic properties. As an example of synergy in the other direction (from ABM to ML), ABM simulations can generate realistic datasets for training ML algorithms (e.g., for regularization, to mitigate overfitting). In these ways, one can envision various synergistic ABM$\rightleftharpoons$ML loops. This review summarizes how ABM and ML have been integrated in contexts that span spatiotemporal scales, from cellular to population-level epidemiology.
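To make the 'bottom-up' idea concrete, the following is a minimal sketch of an agent-based model in the population-level epidemiology setting the abstract mentions. It is not taken from the review: the function name `run_sir_abm`, the susceptible/infected/recovered (SIR) rule set, and all parameter values are illustrative assumptions. It shows only the general pattern of applying simple per-agent rules over discrete time-steps and recording an observable.

```python
import random


def run_sir_abm(n_agents=200, n_infected=5, contacts=4,
                p_transmit=0.1, p_recover=0.05, n_steps=100, seed=0):
    """Toy agent-based SIR epidemic (illustrative assumption, not the paper's model).

    Returns a list of (S, I, R) counts, one tuple per time-step,
    including the initial state.
    """
    rng = random.Random(seed)
    # Each agent is represented only by its state label: 'S', 'I', or 'R'.
    states = ['I'] * n_infected + ['S'] * (n_agents - n_infected)
    history = [(states.count('S'), states.count('I'), states.count('R'))]

    for _ in range(n_steps):
        # Synchronous update: rules read the old state and write a new one.
        new_states = list(states)
        for i, state in enumerate(states):
            if state != 'I':
                continue
            # Rule 1: an infected agent meets a few randomly chosen agents
            # and infects each susceptible one with probability p_transmit.
            for j in rng.sample(range(n_agents), contacts):
                if states[j] == 'S' and rng.random() < p_transmit:
                    new_states[j] = 'I'
            # Rule 2: an infected agent recovers with probability p_recover.
            if rng.random() < p_recover:
                new_states[i] = 'R'
        states = new_states
        # Observable: population-level counts computed from agent states.
        history.append((states.count('S'), states.count('I'), states.count('R')))

    return history
```

Calling `run_sir_abm()` with different seeds or parameter values yields an ensemble of stochastic trajectories. In the spirit of the ABM-to-ML direction described above, such trajectories could serve as synthetic training data, and in the ML-to-ABM direction, parameters such as `p_transmit` are the kind of rule-level quantity one might try to infer from data.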