%0 Journal Article
%T Statistical analysis of exon lengths in various eukaryotes
%A Alexander Kaplunovsky
%A Anatoliy Ivashchenko
%A Alexander Bolshoy
%J Open Access Bioinformatics
%D 2011
%I Dove Medical Press
%R http://dx.doi.org/10.2147/OAB.S14448
%V 3
%P 1-15
%8 January 2011
%+ Department of Evolutionary and Environmental Biology, Genome Diversity Center, Institute of Evolution, University of Haifa, Israel (Kaplunovsky, Bolshoy); Department of Biotechnology, Biochemistry, Plant Physiology, Al-Farabi Kazakh National University, Kazakhstan (Ivashchenko)
%X Purpose: The principal goals of this research were to investigate correlations between certain properties of exons in a gene (ie, between exon density and the corresponding protein length) and to compare genomic trees obtained with different approaches of clustering based on exonic parameters. The aim was a better understanding of exon–intron structures and their origin and development. The exon–intron structures of eukaryote genes are quite different from each other, and the evolution of such structures raises many problematic questions. As a preliminary attempt to address some of these questions, we performed a statistical analysis of gene exon–intron structures. Methods: Taking whole genomes of eukaryotes, we went through all the protein-coding genes in each chromosome separately and calculated the portion of intron-containing genes and average values of the net length of all the exons in a gene, the number of the exons, and the average length of an exon. Comparing those chromosomal and genomic averages, we developed a technique of clustering based on characteristics of the exon–intron structure. This technique of clustering separates different species, grouping them according to eukaryote taxonomy.
Conclusion: Our conclusion is that the best approach is based on distances among four principal components obtained by factor analysis and followed by application of clustering algorithms, such as neighbor-joining, k-means, and partitioning around medoids.
%K comparative genomics
%K exon–intron structure
%K eukaryotic clustering
%K principal component analysis
%U https://www.dovepress.com/statistical-analysis-of-exon-lengths-in-various-eukaryotes-peer-reviewed-article-OAB