Paper title
Exploiting complex pattern features for interactive pattern mining
Authors
Abstract
Recent years have seen a shift from a pattern mining process in which users define constraints beforehand and sift through the results afterwards to an interactive one. This new framework depends on exploiting user feedback to learn a quality function for patterns. Existing approaches have a weakness in that they use static, pre-defined low-level features and attempt to learn independent weights representing their importance to the user. As an alternative, we propose to work with more complex features that are derived directly from the pattern ranking imposed by the user. Learned weights are then aggregated onto lower-level features and help to drive the quality function in the right direction. We explore the effect of different parameter choices experimentally and find that using higher-complexity features leads to the selection of patterns that are better aligned with a hidden quality function, without adding significantly to the run time of the method. Getting good user feedback requires quickly presenting diverse patterns, which we achieve by pushing an existing diversity constraint into the sampling component of the interactive mining system LetSip. In most cases, the resulting patterns allow the method to converge to a good solution more quickly. Finally, combining the two improvements leads to an algorithm that shows clear advantages over the existing state of the art.