Motif discovery is a fundamental problem with important applications
in identifying drug targets and regulatory sites. Regulatory sites on a DNA
sequence normally correspond to conserved sequence patterns shared among the
regulatory regions of correlated genes. These conserved sequence patterns are
called motifs. Identifying motifs and their instances is important
because it allows biologists to investigate the interactions between DNA and
proteins, gene regulation, cell development and cell reaction under
physiological and pathological conditions. In this work, we developed a motif
finding algorithm based on a multi-objective genetic algorithm technique and
incorporated the hypergeometric scoring function to enable it to discover gapped motifs
in organisms with challenging genomic structure, such as the malaria parasite.
The runtime performance of the resulting algorithm, EMOGAMOD (Extended Multi
Objective Genetic Algorithm MOtif Discovery), was compared with that of some
common motif discovery algorithms, and the result was remarkable.
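To make the idea of hypergeometric scoring of a gapped motif concrete, the following is a minimal sketch and not the EMOGAMOD implementation. It assumes a motif made of two fixed halves separated by a variable-length gap, and it scores the motif by the hypergeometric tail probability of seeing at least as many matching sequences in a target set as were observed, given the matches in the target and background sets combined. The function names (`contains_gapped_motif`, `hypergeom_tail`, `score_motif`), the gap representation and the toy sequences are all hypothetical; the abstract does not specify how the scoring function is formulated.

```python
import re
from math import comb


def contains_gapped_motif(seq, left, right, gap_min, gap_max):
    """Return True if seq contains left + (gap_min..gap_max arbitrary bases) + right.

    Assumption: a gapped motif is modelled as two exact halves with a flexible gap.
    """
    pattern = f"{left}[ACGT]{{{gap_min},{gap_max}}}{right}"
    return re.search(pattern, seq) is not None


def hypergeom_tail(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: total number of sequences (target + background)
    K: sequences among the N that contain the motif
    n: number of target sequences
    k: target sequences that contain the motif
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def score_motif(targets, background, left, right, gap_min, gap_max):
    """Hypergeometric enrichment p-value of a gapped motif (smaller = more enriched)."""
    k = sum(contains_gapped_motif(s, left, right, gap_min, gap_max) for s in targets)
    b = sum(contains_gapped_motif(s, left, right, gap_min, gap_max) for s in background)
    return hypergeom_tail(len(targets) + len(background), k + b, len(targets), k)


# Toy data, invented purely for illustration.
targets = ["TTACGAAATGCATT", "ACGCCTGCA", "GGGACGTTTTGCAGG"]
background = ["AAAAAAAATTTT", "ACGTGCA", "GGGGCCCC"]
p_value = score_motif(targets, background, "ACG", "TGCA", 1, 4)
print(p_value)
```

In a multi-objective genetic algorithm, a score of this kind could serve as one objective alongside others, such as motif length or support, with candidate motifs ranked by Pareto dominance. That pairing is an assumption made here for illustration only.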
Cite this paper
Makolo, U. A. and Suberu, S. O. (2016). Gapped Motif Discovery with Multi-Objective Genetic Algorithm. Open Access Library Journal, 3, e2293. doi: http://dx.doi.org/10.4236/oalib.1102293.