A P2P Framework for Developing Bioinformatics Applications in Dynamic Cloud Environments

DOI: 10.1155/2013/361327

Abstract:

Bioinformatics has advanced from in-house computing infrastructure to cloud computing in order to handle the vast quantity of biological data. This advance enables a large number of collaborating researchers around the world to share their work. At the same time, retrieving biological data over the internet is becoming increasingly difficult because of the data's explosive growth and frequent changes. Various efforts have been made to address data discovery and delivery in the cloud framework, but most of them are hindered by their reliance on a MapReduce master server to track all available data. In this paper, we propose an alternative approach, called PRKad, which exploits a Peer-to-Peer (P2P) model to achieve efficient data discovery and delivery. PRKad is a Kademlia-based implementation with Round-Trip-Time (RTT) as the associated key, and it locates data according to a Distributed Hash Table (DHT) and the XOR metric. The simulation results show that PRKad retrieves data with low link latency. As an interdisciplinary application of P2P computing to bioinformatics, PRKad also provides good scalability for serving a greater number of users in dynamic cloud environments.

1. Introduction

Today, new technologies in genomics and proteomics generate biological data at an exponential rate. Current Next Generation Sequencing (NGS) technologies can produce gigabase-scale DNA and RNA sequencing data within a day at a reasonable cost [1–3]. Cloud computing has been regarded as a key approach for processing data at such a planetary scale, and hence many bioinformatics applications have been migrated to cloud environments [4–7]. Bioinformatics clouds depend heavily on data, as data are fundamental to gaining biological insights. The analyses commonly rely on extensive and repeated comparative parallel processing via Data-as-a-Service (DaaS) on the web [8–10], most notably in gene expression analysis.
The data are likely to be updated constantly, and the sources and users of the data are connected by various devices over the internet. The effectiveness of locating this deluge of data in cloud computing is often overlooked, but it is a key problem. We address the existing problems in data discovery with the aim of retrieving up-to-date data with less complexity and delay. Along these lines, the high computing ability of the P2P framework is adopted as a dynamic cloud infrastructure to meet the challenge posed by massive datasets [11–13]. Bioinformatics usually requires the collection, organization, and analysis of large
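
The abstract states that PRKad locates data with a DHT and the XOR metric and associates RTT with peers. The following Python sketch illustrates only the generic Kademlia ideas behind that statement: hashing names to 160-bit identifiers, measuring closeness by XOR, and finding the bucket a peer falls into. It is not the authors' implementation. The names Peer, closest_peers, and pick_by_rtt are hypothetical, and choosing the lowest-RTT peer among the XOR-closest candidates is an assumption about how RTT might be used, since the text shown here does not specify it.

```python
import hashlib
from dataclasses import dataclass
from typing import List

ID_BITS = 160  # Kademlia identifiers are 160 bits wide (the size of a SHA-1 digest)


def make_id(name: str) -> int:
    """Hash a node address or data key to a 160-bit integer identifier."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance between two identifiers: their bitwise XOR as an integer."""
    return a ^ b


def bucket_index(own_id: int, other_id: int) -> int:
    """Index of the routing bucket for other_id: position of the highest differing bit."""
    distance = xor_distance(own_id, other_id)
    if distance == 0:
        raise ValueError("a node does not keep itself in its routing table")
    return distance.bit_length() - 1


@dataclass(frozen=True)
class Peer:
    """Hypothetical peer record; rtt_ms is a measured round-trip time in milliseconds."""
    name: str
    node_id: int
    rtt_ms: float


def closest_peers(key_id: int, peers: List[Peer], k: int = 3) -> List[Peer]:
    """Return the k peers whose identifiers are XOR-closest to key_id."""
    return sorted(peers, key=lambda p: xor_distance(p.node_id, key_id))[:k]


def pick_by_rtt(candidates: List[Peer]) -> Peer:
    """Assumed policy (not from the paper): prefer the candidate with the lowest RTT."""
    return min(candidates, key=lambda p: p.rtt_ms)
```

In use, a data key such as a dataset name would be hashed with make_id, the XOR-closest peers found with closest_peers, and one of them contacted. Because the lookup depends only on identifiers held by the peers themselves, no central master has to track where every item is stored.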

References

[1]  F. Luciani, R. A. Bull, and A. R. Lloyd, “Next generation deep sequencing and vaccine design: today and tomorrow,” Trends in Biotechnology, vol. 30, no. 9, pp. 443–452, 2012.
[2]  L. Liu, Y. Li, S. Li, et al., “Comparison of next-generation sequencing systems,” Journal of Biomedicine and Biotechnology, vol. 2012, Article ID 251364, 11 pages, 2012.
[3]  L. D. Stein, “The case for cloud computing in genome informatics,” Genome Biology, vol. 11, no. 5, article 207, 7 pages, 2010.
[4]  K. Krampis, T. Booth, B. Chapman, et al., “Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community,” BMC Bioinformatics, vol. 13, article 42, 8 pages, 2012.
[5]  L. Dai, X. Gao, Y. Guo, J. Xiao, and Z. Zhang, “Bioinformatics clouds for big data manipulation,” Biology Direct, vol. 7, article 43, 2012.
[6]  D. P. Wall, P. Kudtarkar, V. A. Fusaro, et al., “Cloud computing for comparative genomics,” BMC Bioinformatics, vol. 11, article 259, 2010.
[7]  M. C. Schatz, “CloudBurst: highly sensitive read mapping with MapReduce,” Bioinformatics, vol. 25, no. 11, pp. 1363–1369, 2009.
[8]  H.-L. Truong, M. Comerio, F. D. Paoli, et al., “Data contracts for cloud-based data marketplaces,” International Journal of Computational Science and Engineering, vol. 7, no. 4, pp. 280–295, 2012.
[9]  S. Dustdar, R. Pichler, V. Savenkov, and H.-L. Truong, “Quality-aware service-oriented data integration: requirements, state of the art and open challenges,” ACM SIGMOD Record, vol. 41, no. 1, pp. 11–19, 2012.
[10]  AWS Public Data Sets, http://aws.amazon.com/publicdatasets/.
[11]  F. Marozzo, D. Talia, and P. Trunfio, “P2P-MapReduce: parallel data processing in dynamic cloud environments,” Journal of Computer and System Sciences, vol. 78, no. 5, pp. 113–125, 2012.
[12]  A. Forestiero, E. Leonardi, C. Mastroianni, and M. Meo, “Self-Chord: a Bio-inspired P2P framework for self-organizing distributed systems,” IEEE/ACM Transactions on Networking, vol. 18, no. 5, pp. 1651–1664, 2010.
[13]  R. Buyya, C. S. Yeo, S. Venugopal, J. Broberg, and I. Brandic, “Cloud computing and emerging IT platforms: vision, hype, and reality for delivering computing as the 5th utility,” Future Generation Computer Systems, vol. 25, no. 6, pp. 599–616, 2009.
[14]  S. B. Montgomery, T. Fu, J. Guan, K. Lin, and S. J. M. Jones, “An application of peer-to-peer technology to the discovery, use and assessment of bioinformatics programs,” Nature Methods, vol. 2, no. 8, p. 563, 2005.
[15]  X. Quan, C. Walton, D. L. Gerloff, J. L. Sharman, and D. Robertson, “Peer-to-peer experimentation in protein structure prediction: an architecture, experiment and initial results,” in Distributed, High-Performance and Grid Computing in Computational Biology, vol. 4360 of Lecture Notes in Computer Science, pp. 75–98, 2007.
[16]  S. Ratnasamy, P. Francis, M. Handley, et al., “A scalable content-addressable network,” in Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, San Diego, Calif, USA, 2001.
[17]  I. Stoica, R. Morris, D. R. Karger, M. F. Kaashoek, and H. Balakrishnan, “Chord: a scalable peer-to-peer lookup protocol for Internet applications,” IEEE/ACM Transactions on Networking, vol. 11, no. 1, pp. 17–32, 2003.
[18]  Y. J. Joung and J. C. Wang, “Chord2: a two-layer Chord for reducing maintenance overhead via heterogeneity,” Computer Networks, vol. 51, no. 3, pp. 712–731, 2007.
[19]  R. Rodrigues and P. Druschel, “Peer-to-peer systems,” Communications of the ACM, vol. 53, no. 10, pp. 72–82, 2010.
[20]  B. Y. Zhao, L. Huang, J. Stribling, S. C. Rhea, A. D. Joseph, and J. D. Kubiatowicz, “Tapestry: a resilient global-scale overlay for service deployment,” IEEE Journal on Selected Areas in Communications, vol. 22, no. 1, pp. 41–53, 2004.
[21]  Z. Ou, E. Harjula, O. Kassinen, and M. Ylianttila, “Performance evaluation of a Kademlia-based communication-oriented P2P system under churn,” Computer Networks, vol. 54, no. 5, pp. 689–705, 2010.
[22]  E. K. Lua, J. Crowcroft, M. Pias, et al., “A survey and comparison of peer-to-peer overlay network schemes,” IEEE Communications Surveys & Tutorials, vol. 7, no. 2, pp. 72–93, 2005.
[23]  G. Urdaneta, G. Pierre, and M. van Steen, “A survey of DHT security techniques,” ACM Computing Surveys, vol. 43, no. 2, article 8, 2011.
[24]  M. Steiner, T. En-Najjary, and E. W. Biersack, “Long term study of peer behavior in the kad DHT,” IEEE/ACM Transactions on Networking, vol. 17, no. 5, pp. 1371–1384, 2009.
[25]  Y. Yamato, T. Ogawa, T. Moriya, et al., “Kademlia based routing on locator-ID separated networks for new generation networks,” Peer-to-Peer Networking and Applications, pp. 1–11, 2012.
[26]  S. Sioutas, P. Triantafillou, G. Papaloukopoulos, E. Sakkopoulos, K. Tsichlas, and Y. Manolopoulos, “ART: sub-logarithmic decentralized range query processing with probabilistic guarantees,” Distributed and Parallel Databases, vol. 31, no. 1, pp. 71–109, 2013.
[27]  K. C. Lai and Y. F. Yu, “A scalable multi-attribute hybrid overlay for range queries on the cloud,” Information Systems Frontiers, vol. 14, no. 4, pp. 895–908, 2012.
[28]  Y. Wang, L. Liu, and C. Pu, “Scaling group communication services with self-adaptive and utility-driven message routing,” Mobile Networks and Applications, vol. 17, no. 4, pp. 543–563, 2012.
[29]  A. Kovačević, S. Kaune, P. Mukherjee, et al., “Benchmarking platform for peer-to-peer systems,” it - Information Technology, vol. 49, no. 5, pp. 312–319, 2007.
[30]  KOM—Multimedia Communications Lab, “Documentation for PeerfactSim.KOM: all developers of PeerfactSim.KOM,” Tech. Rep., 2011, http://peerfact.kom.e-technik.tu-darmstadt.de/fileadmin/data/Manuals/2011_08_01_PeerfactSimDocumentation.pdf.
