Paper title
A Survey and Annotated Bibliography of Workflow Scheduling in Computing Infrastructures: Community, Keyword, and Article Reviews -- Extended Technical Report
Paper authors
Paper abstract
Workflows are prevalent in today's computing infrastructures. The workflow model supports a wide range of domains, from machine learning to finance and from astronomy to chemistry. The differing Quality-of-Service (QoS) requirements and other desires of both users and providers make workflow scheduling a tough problem, especially since resource providers need to use their resources as efficiently as possible to be competitive. To a newcomer or even an experienced researcher, sifting through the vast number of articles can be a daunting task, and questions arise regarding the different techniques, policies, emerging areas, and opportunities. Surveys are an excellent way to answer these questions, yet surveys rarely publish the tools and data on which they are based. Moreover, the communities behind these articles are rarely studied. We attempt to address these shortcomings in this work. We focus on four areas within workflow scheduling: 1) the workflow formalism, 2) workflow allocation, 3) resource provisioning, and 4) applications and services. Each part features one or more taxonomies, a view of the community, important and emerging keywords, and directions for future work. We introduce and make open-source an instrument we used to combine and store article metadata. Using this metadata, we 1) obtain, per community, important keywords overall and per year, 2) identify keywords growing in importance, 3) gain insight into the structure and relations within each community, and 4) perform a systematic literature survey per part to validate and complement our taxonomies.