全部 标题 作者
关键词 摘要

OALib Journal期刊
ISSN: 2333-9721
费用:99美元

查看量下载量

相关文章

更多...

A Protein Sequence Analysis Hardware Accelerator Based on Divergences

DOI: 10.1155/2012/201378

Full-Text   Cite this paper   Add to My Lib

Abstract:

The Viterbi algorithm is one of the most used dynamic programming algorithms for protein comparison and identification, based on hidden markov Models (HMMs). Most of the works in the literature focus on the implementation of hardware accelerators that act as a prefilter stage in the comparison process. This stage discards poorly aligned sequences with a low similarity score and forwards sequences with good similarity scores to software, where they are reprocessed to generate the sequence alignment. In order to reduce the software reprocessing time, this work proposes a hardware accelerator for the Viterbi algorithm which includes the concept of divergence, in which the region of interest of the dynamic programming matrices is delimited. We obtained gains of up to 182x when compared to unaccelerated software. The performance measurement methodology adopted in this work takes into account not only the acceleration achieved by the hardware but also the reprocessing software stage required to generate the alignment. 1. Introduction Protein sequence comparison and analysis is a repetitive task in the field of molecular biology, as is needed by biologists to predict or determine the function, structure, and evolutional characteristics of newly discovered protein sequences. During the last decade, technological advances had made possible the identification of a vast number of new proteins that have been introduced to the existing protein databases [1, 2]. With the exponential growth of these databases, the execution times of the protein comparison algorithms also grew exponentially [3], and the necessity to accelerate the existing software rose in order to speed up research. The HMMER 2.3.2 program suite [4] is one of the most used programs for sequence comparison. HMMER takes multiple sequence alignments of similar protein sequences grouped into protein families and builds hidden Markov models (HMMs) [5] of them. This is done to estimate statistically the evolutionary relations that exist between different members of the protein family, and to ease the identification of new family members with a similar structure or function. HMMER then takes unclassified input sequences and compares them against the generated HMMs of protein families (profile HMM) via the Viterbi algorithm (see Section 2), to generate both a similarity score and an alignment for the input (query) sequences. As the Viterbi routine is the most time consuming part of the HMMER programs, multiple attempts to optimize and accelerate it have been made. MPI-HMMER [6] explores parallel execution in

References

[1]  The Universal Protein Resource—UniProt, June 2009, http://www.uniprot.org/.
[2]  Sanger's Institute PFAM Protein Sequence Database, May 2009, http://pfam.sanger.ac.uk/.
[3]  A. C. Jacob, J. M. Lancaster, J. D. Buhler, and R. D. Chamberlain, “Preliminary results in accelerating profile HMM search on FPGAs,” in Proceedings of the 21st International Parallel and Distributed Processing Symposium, (IPDPS '07), Long Beach, Calif, USA, March 2007.
[4]  “HMMER: biosequence analysis using profile hidden Markov models,” 2006, http://hmmer.janelia.org/.
[5]  R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological Sequence Analysis Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, New York, NY, USA, 2008.
[6]  R. Darole, J. P. Walters, and V. Chaudhary, “Improving MPI-HMMER’s scalability with parallel I/O,” Tech. Rep. 2008-11, Department of Computer Science and Engineering, University of Buffalo, Buffalo, NY, USA, 2008.
[7]  G. Chukkapalli, C. Guda, and S. Subramaniam, “SledgeHMMER: a web server for batch searching the Pfam database,” Nucleic Acids Research, vol. 32, supplement 2, pp. W542–W544, 2004.
[8]  “HMMER on the sun grid project,” July 2009, http://www.psc.edu/general/software/packages/hmmer/.
[9]  D. R. Horn, M. Houston, and P. Hanrahan, “ClawHMMER: a streaming HMMer-search implementation,” in Proceedings of the ACM/IEEE Supercomputing Conference, (SC '05), November 2005.
[10]  GPU-HMMER, July 2009, http://www.mpihmmer.org/userguideGPUHMMER.htm.
[11]  R. P. Maddimsetty, J. Buhler, R. D. Chamberlain, M. A. Franklin, and B. Harris, “Accelerator design for protein sequence HMM search,” in Proceedings of the 20th Annual International Conference on Supercomputing (ICS '06), pp. 288–296, New York, NY, USA, July 2006.
[12]  “BLAST: Basic Local Alignment Search Tool,” September 2009, http://blast.ncbi.nlm.nih.gov/.
[13]  K. Benkrid, P. Velentzas, and S. Kasap, “A high performance reconfigurable core for motif searching using profile HMM,” in Procedings of the NASA/ESA Conference on Adaptive Hardware and Systems (AHS '08), pp. 285–292, Noordwijk, The Netherlands, June 2008.
[14]  T. F. Oliver, B. Schmidt, Y. Jakop, and D. L. Maskell, “High speed biological sequence analysis with hidden Markov models on reconfigurable platforms,” IEEE Transactions on Information Technology in Biomedicine, vol. 13, no. 5, pp. 740–746, 2009.
[15]  J. P. Walters, X. Meng, V. Chaudhary et al., “MPI-HMMER-boost: distributed FPGA acceleration,” Journal of VLSI Signal Processing Systems, vol. 48, no. 3, pp. 223–238, 2007.
[16]  S. Derrien and P. Quinton, “Parallelizing HMMER for hardware acceleration on FPGAs,” in Proceedings of the International Conference on Application-specific Systems, Architectures and Processors (ASAP '07), pp. 10–17, Montreal, Canada, July 2007.
[17]  L. Hunter, Artificial Intelligence and Molecular Biology, MIT Press, 1st edition, 1993.
[18]  D. Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology, Cambridge University Press, 1997.
[19]  L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proceedings of the IEEE, vol. 77, no. 2, pp. 257–286, 1989.
[20]  Y. Sun, P. Li, G. Gu, et al., “HMMer acceleration using systolic array based reconfigurable architecture,” in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA '09), p. 282, New York, NY, USA, May 2009.
[21]  T. Takagi and T. Maruyama, “Accelerating hmmer search using FPGA,” in Proceedings of the19th International Conference on Field Programmable Logic and Applications (FPL '09), pp. 332–337, Prague Czech Republic, September 2009.
[22]  R. B. Batista, A. Boukerche, and A. C. M. A. de Melo, “A parallel strategy for biological sequence alignment in restricted memory space,” Journal of Parallel and Distributed Computing, vol. 68, no. 4, pp. 548–561, 2008.
[23]  XtremeData Inc., July 2009, http://www.xtremedata.com/.

Full-Text

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133