ASAP: an environment for automated preprocessing of sequencing data

DOI: 10.1186/1756-0500-6-5

Keywords: Next-generation sequencing, Data processing, Automation, Computer cluster


Abstract:

Advanced Sequence Automated Pipeline (ASAP) was developed to provide a framework for automating the translation of sequencing data into annotated variant calls with the goal of minimizing user involvement without the need for dedicated hardware or administrative rights.

ASAP works both on computer clusters and on standalone machines with minimal human involvement and maintains high data integrity, while allowing complete control over the configuration of its component programs. It offers an easy-to-use interface for submitting and tracking jobs as well as resuming failed jobs. It also provides tools for quality checking and for dividing jobs into pieces for maximum throughput.

ASAP provides an environment for building an automated pipeline for NGS data preprocessing. This environment is flexible for use and future development. It is freely available at http://biostat.mc.vanderbilt.edu/ASAP.

Modern sequencing technologies have greatly improved our capability of acquiring deep sequencing data on a large scale and in a timely fashion. However, the large amount of data presents many new challenges to researchers, including a significant amount of time and effort on preprocessing raw sequencing reads into variant calls that are ready for statistical analyses. This process involves multiple steps and several independent programs. For example, for species with a reference genome available, sequence reads are often initially aligned to the reference genome using a mapping program such as BWA (Li & Durbin [1]). Additionally, reads aligned to insertion-deletion regions may require local realignment to minimize false variant calls, and base quality scores may require recalibration to reflect empirical error rates; these can be achieved with GATK (McKenna et al. [2]). Moreover, variant calls may require filtering for false call removal and annotation for downstream analyses. The various steps require different programs and there may be multiple programs available for some
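The idea of running the preprocessing stages in order and resuming a failed job can be illustrated with a minimal Python sketch. This is not ASAP's implementation. The step names, the `run_pipeline` function, and the JSON state file are all hypothetical, and each step is a placeholder callable rather than a real call to BWA or GATK.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical step names loosely mirroring the stages named in the text.
# They are illustrative placeholders, not ASAP's actual configuration.
STEPS = ["align", "realign", "recalibrate", "call_variants", "filter", "annotate"]


def run_pipeline(steps, actions, state_file):
    """Run steps in order, skipping any that an earlier run already completed.

    `actions` maps a step name to a zero-argument callable that raises on failure.
    Returns the list of steps executed during this invocation.
    """
    state_file = Path(state_file)
    done = json.loads(state_file.read_text()) if state_file.exists() else []
    executed = []
    for name in steps:
        if name in done:
            continue  # completed in an earlier run, so skip it
        actions[name]()  # an exception here stops the run before the checkpoint
        done.append(name)
        state_file.write_text(json.dumps(done))  # checkpoint after each success
        executed.append(name)
    return executed


def demo():
    """Simulate one failure in 'recalibrate', then resume from the checkpoint."""
    with tempfile.TemporaryDirectory() as tmp:
        state = Path(tmp) / "state.json"
        log = []
        fail_once = {"recalibrate"}

        def make_action(name):
            def action():
                if name in fail_once:
                    fail_once.discard(name)  # fail only on the first attempt
                    raise RuntimeError(f"{name} failed")
                log.append(name)
            return action

        actions = {name: make_action(name) for name in STEPS}
        try:
            run_pipeline(STEPS, actions, state)
        except RuntimeError:
            pass  # the first run stops at the failing step
        resumed = run_pipeline(STEPS, actions, state)
        return log, resumed


if __name__ == "__main__":
    print(demo())
```

Recording a checkpoint after every successful step means a rerun repeats only the failed step and those after it. In a real pipeline each placeholder would wrap an external program invocation.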
