Paper title
From Patterson Maps to Atomic Coordinates: Training a Deep Neural Network to Solve the Phase Problem for a Simplified Case
Paper authors
Paper abstract
This work demonstrates that, for a simple case of 10 randomly positioned atoms, a neural network can be trained to infer atomic coordinates from Patterson maps. The network was trained entirely on synthetic data. For the training set, the network outputs were 3D maps of randomly positioned atoms. From each output map, a Patterson map was generated and used as the network input. The network generalized to cases not seen during training, inferring atom positions from Patterson maps. A key finding of this work is that each Patterson map presented to the network input during training must uniquely describe the atomic coordinates it is paired with on the network output; otherwise the network will neither train nor generalize, because it cannot train on conflicting data. Conflicts are avoided in three ways: (1) Patterson maps are invariant to translation. To remove this degree of freedom, output maps are centered on the average of their atom positions. (2) Patterson maps are invariant to centrosymmetric inversion. This conflict is removed by presenting the network output with both the atoms used to make the Patterson map and their centrosymmetry-related counterparts simultaneously. (3) A Patterson map does not uniquely describe a set of coordinates because the origin of each vector in the Patterson map is ambiguous. This ambiguity is removed by adding empty space around the atoms in the output map: forcing the output atoms to lie closer to one another than half the output box edge means that the origin of each peak in the Patterson map must be the origin to which it is closest.
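
The data-preparation steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code. It assumes point atoms on a periodic 32-voxel cubic grid, computes the Patterson map as the periodic autocorrelation of the density (the inverse FFT of the squared structure-factor amplitudes), and uses hypothetical helper names (`random_atom_map`, `patterson`, `invert`, `symmetrized_target`) chosen only for illustration.

```python
import numpy as np

def random_atom_map(n_atoms=10, box=32, rng=None):
    """Place n_atoms point atoms on distinct voxels in the central half of the box.

    Keeping atoms in the central half leaves empty space around them, so any
    two atoms are separated by less than box / 2 along every axis (point 3).
    """
    rng = np.random.default_rng(rng)
    lo, hi = box // 4, box // 4 + box // 2
    coords = set()
    while len(coords) < n_atoms:
        coords.add(tuple(int(c) for c in rng.integers(lo, hi, size=3)))
    coords = np.array(sorted(coords))
    density = np.zeros((box, box, box))
    density[tuple(coords.T)] = 1.0
    return coords, density

def patterson(density):
    """Periodic autocorrelation of the density: inverse FFT of |F|^2."""
    structure_factors = np.fft.fftn(density)
    return np.fft.ifftn(np.abs(structure_factors) ** 2).real

def invert(density):
    """Centrosymmetric inversion x -> -x (mod box) on the periodic grid."""
    return np.roll(np.flip(density, axis=(0, 1, 2)), 1, axis=(0, 1, 2))

def symmetrized_target(coords, density):
    """Center the map on the mean atom position (point 1), then overlay the
    inverted copy (point 2). On a periodic grid, x -> -x (mod box) is the
    same operation as inversion through the box center at box / 2."""
    box = density.shape[0]
    shift = np.round(box / 2 - coords.mean(axis=0)).astype(int)
    centered = np.roll(density, tuple(int(s) for s in shift), axis=(0, 1, 2))
    return np.maximum(centered, invert(centered))

if __name__ == "__main__":
    coords, rho = random_atom_map(rng=0)
    p = patterson(rho)
    shifted = np.roll(rho, (3, -2, 5), axis=(0, 1, 2))
    print("translation invariant:", np.allclose(p, patterson(shifted)))
    print("inversion invariant:  ", np.allclose(p, patterson(invert(rho))))
    target = symmetrized_target(coords, rho)
    print("target is centrosymmetric:", np.array_equal(invert(target), target))
```

Because a shifted or inverted atom set produces the same Patterson map, pairing that map with the raw coordinates would give the network conflicting targets. Centering and overlaying the inverted copy yields a target map that is already centrosymmetric, so applying the inversion again leaves it unchanged.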