%0 Journal Article
%T Prediction of Enzyme Mutant Activity Using Computational Mutagenesis and Incremental Transduction
%A Nada Basit
%A Harry Wechsler
%J Advances in Bioinformatics
%D 2011
%I Hindawi Publishing Corporation
%R 10.1155/2011/958129
%X Wet laboratory mutagenesis to determine enzyme activity changes is expensive and time-consuming. This paper expands on standard one-shot learning by proposing an incremental transductive method (T2bRF) for the prediction of enzyme mutant activity during mutagenesis, using Delaunay tessellation and 4-body statistical potentials for representation. Incremental learning is in tune with both eScience and actual experimentation, as it accounts for cumulative annotation effects of enzyme mutant activity over time. The reported experimental results, obtained using cross-validation, show that overall the proposed incremental transductive method, using random forest as the base classifier, yields better results than one-shot learning methods. T2bRF is shown to yield 90% on T4 and LAC (and 86% on HIV-1). This is significantly better than state-of-the-art competing methods, whose performance is 80% or less on the same datasets.
1. Introduction
A chain of amino acids in a given sequence forms the primary structure that makes up a protein and determines its functions. Proteins are necessary for virtually every activity in the human body [1]. Twenty distinct amino acids, known as proteinogenic or standard amino acids, make up the polypeptides [1, 2]. The order of these amino acids in the chain, known as the primary sequence, is very important. A change in even one amino acid (e.g., substituting one kind of amino acid, at a given location, with a different one) can affect the way the protein functions, that is, its activity. Such a substitution is a mutation in the protein’s amino acid sequence and is characteristic of a single-site mutation. The interplay between mutations and their effect on protein function is the domain of bioinformatics, in general, and computational mutagenesis, in particular.
Mutagenesis can be described as creating a mutation in the protein (in the amino acid chain) by substituting an original (or wild-type) amino acid at a given position in the chain with one of the other 19 amino acid types; for example, substituting the amino acid tryptophan at position 10 with cysteine at that same location in a particular protein [3]. The resulting mutated protein’s activity may be different from its wild-type counterpart (remaining active or becoming inactive). Experiments using mutagenesis enable researchers to collect data about protein activity with respect to mutations. Since wet lab experimentation is very expensive, finding a less expensive method, by being able to predict a protein’s activity/function, is
%U http://www.hindawi.com/journals/abi/2011/958129/