Abstract

Node-Oriented Workflow (NOW): A Command Template Workflow Management Tool for High Throughput Data Analysis Pipelines

 Eric B. Lipsky, Brian R. King, Gerard Tromp

Next generation sequencing (NGS) systems produce vast quantities of data that require substantial computational resources for typical analysis tasks. In addition, data generated by different NGS systems are not homogeneous. Moreover, an overwhelming number of tools are available for performing typical tasks. Managing NGS workflows involves writing custom scripts that quickly grow in complexity, often resulting in unwieldy workflows that underutilize typical high performance compute resources and increase the demands on the staff managing these workflows. We present Node-Oriented Workflow (NOW), a dynamic command template workflow engine for high performance distributed computing (HPC) systems. Our system provides a simple-to-use, browser-based front end for designing and managing complex workflows. Workflows are configured in the browser interface and are managed by the integrated job engine, which initializes nodes, monitors node status, and processes the results of individual jobs across nodes in an HPC configuration. We reduce excessive messaging across nodes by placing the burden on the nodes themselves to start tasks in a workflow when their dependencies are met, hence the name node-oriented workflow. Our system was designed for NGS processing in the clinical research setting, emphasizing user simplicity, tool scalability, and minimization of redundancy in workflows, while maximizing throughput in an HPC environment. Furthermore, NOW is not restricted to NGS pipeline management, but can be used to manage any computational pipeline.
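
The idea of command templates whose tasks start once their dependencies are met can be illustrated with a minimal Python sketch. This is not NOW's implementation: the `Task` class, the `run_workflow` function, and the command names (`aligner`, `sorter`, `caller`) are hypothetical, and the sketch only renders commands in dependency order within a single process rather than dispatching them across HPC nodes.

```python
from string import Template


class Task:
    """A hypothetical workflow step: a command template plus its dependencies."""

    def __init__(self, name, template, deps=()):
        self.name = name
        self.template = Template(template)
        self.deps = tuple(deps)

    def ready(self, done):
        # A task decides for itself whether it may start: all dependencies done.
        return all(dep in done for dep in self.deps)

    def render(self, params):
        # Fill the command template with per-run parameters.
        return self.template.substitute(params)


def run_workflow(tasks, params):
    """Return rendered commands in an order that respects dependencies.

    A real engine would execute each command on a compute node; here the
    command is only rendered so the sketch stays self-contained.
    """
    done = set()
    commands = []
    pending = list(tasks)
    while pending:
        startable = [t for t in pending if t.ready(done)]
        if not startable:
            raise ValueError("unsatisfiable or cyclic dependencies")
        for task in startable:
            commands.append(task.render(params))
            done.add(task.name)
        pending = [t for t in pending if t.name not in done]
    return commands


# Hypothetical three-step pipeline, deliberately listed out of order.
tasks = [
    Task("call", "caller ${sample}.sorted.bam", deps=["sort"]),
    Task("align", "aligner --in ${sample}.fastq --out ${sample}.bam"),
    Task("sort", "sorter ${sample}.bam", deps=["align"]),
]
commands = run_workflow(tasks, {"sample": "s1"})
for command in commands:
    print(command)
```

Because each task checks its own readiness, no central scheduler has to push start messages to individual tasks; the sketch mirrors that idea only loosely, since all state here lives in one shared set.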