Prediction of Membrane Protein Types Based on Fusion Feature Information and Voting Ensemble Learning

DOI: 10.12677/HJCB.2021.114006, PP. 49-58

Keywords: Classification of Membrane Protein, Protein Secondary Structure, Feature Fusion, Machine Learning, Voting Ensemble Learning


Abstract:

Membrane proteins are the main bearers of cellular function, and the function of a membrane protein is closely related to its type. Identifying membrane protein types is therefore an important task in bioinformatics. Existing membrane protein classification models extract features mainly from sequence information. This paper proposes a feature extraction method based on protein secondary structure information and fuses the resulting features with two existing sequence features. In comparative experiments, adding the secondary structure features improved prediction accuracy under several different machine learning classification algorithms, which supports the effectiveness of the fusion method. Finally, a membrane protein classification model is built on the Voting ensemble learning framework by combining three machine learning algorithms. The results show that this model predicts better than several existing machine learning models.
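The two ideas in the abstract, feature fusion and voting, can be illustrated with a minimal sketch. The abstract does not name the two sequence features, the exact secondary structure descriptor, or the three classifiers, so everything below is an assumption made for illustration: amino acid composition stands in for a sequence feature, a three-state (H/E/C) composition stands in for the secondary structure feature, fusion is plain concatenation, and the ensemble uses hard (majority) voting over labels that the base classifiers are assumed to have already produced. The function names are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
SS_STATES = "HEC"  # assumed three-state labels: helix, strand, coil


def composition(seq, alphabet):
    """Return the fraction of each alphabet symbol in seq."""
    counts = Counter(seq)
    n = len(seq) or 1  # avoid division by zero on empty input
    return [counts[symbol] / n for symbol in alphabet]


def fused_features(sequence, secondary_structure):
    """Fuse by concatenation: 20 sequence values followed by 3 structure values."""
    return composition(sequence, AMINO_ACIDS) + composition(secondary_structure, SS_STATES)


def hard_vote(predictions):
    """Return the most frequent label; ties go to the earliest-listed classifier."""
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Toy inputs, not real data: a 4-residue sequence and its per-residue structure string.
features = fused_features("ACCA", "HHEC")
# Labels assumed to come from three separately trained classifiers.
vote = hard_vote(["type1", "type2", "type1"])
```

In practice each base classifier would be trained on the fused vectors, and a library implementation of voting (hard or probability-weighted soft voting) would replace `hard_vote`; the sketch only shows how the pieces fit together.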

