A Transcriptome Post-Scaffolding Method for Assembling High Quality Contigs

DOI: 10.1155/2014/961823


Abstract:

With the rapid development of high-throughput sequencing technologies, new transcriptomes can be sequenced at low cost and high coverage. Sequence assembly approaches have been adapted to meet the requirements of de novo transcriptome assembly, which faces complications not found in traditional genome assembly, such as variation in coverage across candidate mRNAs and alternative splicing. As a consequence, de novo assembly strategies tend to generate a large number of redundant contigs due to sequence variations, which adversely affects downstream analysis and experiments. In this work we propose TransPS, a transcriptome post-scaffolding method, to generate high-quality, nonredundant de novo transcriptomes. TransPS shows promising results on the test transcriptome datasets, where redundancy is reduced by more than 50% and, at the same time, coverage is improved considerably. The web server and source code are available.

1. Introduction

The rapid development of next-generation sequencing technologies has catalyzed the development of new genome assembly tools able to handle the volume and complexity of the resulting data. Despite the advantages of these technologies, the reads generated by modern instruments are short (~100–300 bp), which poses a challenge to sequence assembly algorithms. As with short-read genome assembly, transcriptome assembly needs to connect short and sometimes low-quality reads. However, transcriptome assembly is even more difficult than genome assembly because of complicating factors such as highly variable sequencing depth, strand specificity, and transcript variants [1]. There are three typical transcriptome assembly strategies: the reference-based strategy, the de novo strategy, and the hybrid strategy that combines the two.
Widely used transcriptome assembly tools include Cufflinks [2] for reference-based assembly and Trinity [3] and Oases [4] for de novo assembly. These de novo assemblers are very sensitive to sequencing errors and polymorphisms, which result in considerable redundancy in the output contigs. Paired-read sequencing technology can help reduce the number of contigs, as the expected distance between read pairs can be used to place contigs in their likely order and orientation. Some assembly tools, such as Trinity, do not include a scaffolding step, while most others provide scaffolding only as a built-in function that cannot be independently controlled or effectively used to reduce the number of

References

[1]  J. A. Martin and Z. Wang, “Next-generation transcriptome assembly,” Nature Reviews Genetics, vol. 12, no. 10, pp. 671–682, 2011.
[2]  A. Roberts, H. Pimentel, C. Trapnell, and L. Pachter, “Identification of novel transcripts in annotated genomes using RNA-seq,” Bioinformatics, vol. 27, no. 17, pp. 2325–2329, 2011.
[3]  M. G. Grabherr, B. J. Haas, M. Yassour et al., “Full-length transcriptome assembly from RNA-Seq data without a reference genome,” Nature Biotechnology, vol. 29, no. 7, pp. 644–652, 2011.
[4]  M. H. Schulz, D. R. Zerbino, M. Vingron, and E. Birney, “Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels,” Bioinformatics, vol. 28, no. 8, pp. 1086–1092, 2012.
[5]  M. Boetzer, C. V. Henkel, H. J. Jansen, D. Butler, and W. Pirovano, “Scaffolding pre-assembled contigs using SSPACE,” Bioinformatics, vol. 27, no. 4, pp. 578–579, 2011.
[6]  Y. Surget-Groba and J. I. Montoya-Burgos, “Optimization of de novo transcriptome assembly from next-generation sequencing data,” Genome Research, vol. 20, no. 10, pp. 1432–1440, 2010.
[7]  X. Huang, “A contig assembly program based on sensitive detection of fragment overlaps,” Genomics, vol. 14, no. 1, pp. 18–25, 1992.
[8]  X. Huang and J. Zhang, “Methods for comparing a DNA sequence with a protein sequence,” Bioinformatics, vol. 12, pp. 497–506, 1996.
[9]  C. I. Keeling, M. M. S. Yuen, N. Y. Liao et al., “Draft genome of the mountain pine beetle, Dendroctonus ponderosae Hopkins, a major forest pest,” Genome Biology, vol. 14, no. 3, p. R27, 2013.
[10]  C. I. Keeling, H. Henderson, M. Li et al., “Transcriptome and full-length cDNA resources for the mountain pine beetle, Dendroctonus ponderosae Hopkins, a major insect pest of pine forests,” Insect Biochemistry and Molecular Biology, vol. 42, no. 8, pp. 525–536, 2012.
[11]  J. Reese, S. L. Johnson, W. B. Hunter et al., “Characterization of the Asian citrus psyllid transcriptome,” Journal of Genomics, vol. 2, pp. 54–58, 2012.
[12]  A. Schwarz, B. M. von Reumont, J. Erhart et al., “De novo Ixodes ricinus salivary transcriptome analysis using two different next generation sequencing methodologies,” The FASEB Journal, vol. 27, no. 12, pp. 4745–4756, 2013.
