%0 Journal Article
%T 基于机器学习算法识别疾病相关的蛋白与金属离子配体的结合残基
Identifying the Binding Residues between Disease-Associated Proteins and Metal-Ion Ligands Based on Machine Learning Algorithm
%A 邹向辉
%A 冯永娥
%J Hans Journal of Computational Biology
%P 23-31
%@ 2164-5434
%D 2022
%I Hans Publishing
%R 10.12677/HJCB.2022.123004
%X 在研究疾病发生机制中,蛋白质与配体相互作用扮演着重要的角色。因为许多蛋白质功能的实现需要结合特定的配体,而金属离子配体对蛋白质功能的实现起到重要作用。确定蛋白质中哪些残基与金属离子配体相互作用,可以帮助研究者理解蛋白质-金属离子相互作用的分子机制,也对人类健康和精准医学有重要意义。本文基于机器学习算法,研究疾病相关的蛋白质与三种金属离子配体的结合。我们分别提取3种序列特征:位置特异性打分矩阵、氨基酸组分信息、二肽组分,并使用随机森林算法和支持向量机算法建立了三种金属离子配体结合残基的分类模型。对于Zn2+结合残基在特征融合中最高准确率(Acc)达到了87%,Mg2+结合残基识别的最高准确率(Acc)达到70%,Ca2+结合残基识别的最高准确率(Acc)达到70%。可见我们的模型对三种金属离子配体的结合残基有一定的识别能力。
Protein-ligand interactions play an important role in disease pathogenesis. Many proteins must bind specific ligands to perform their functions, and metal-ion ligands play an important role in protein function. Identifying which residues in a protein interact with metal-ion ligands helps researchers understand the molecular mechanism of protein-metal-ion interactions and is relevant to human health and precision medicine. In this paper, we use machine learning algorithms to study the binding of disease-associated proteins to three metal-ion ligands. We extracted three sequence features: the position-specific scoring matrix (PSSM), amino acid composition, and dipeptide composition. We then used the random forest and support vector machine algorithms to build classification models for the binding residues of the three metal-ion ligands. For Zn2+ binding residues, the highest accuracy (Acc), obtained with feature fusion, was 87%; for Mg2+ and Ca2+ binding residues, the highest accuracy was 70% in each case. These results show that our models have some ability to identify the binding residues of the three metal-ion ligands.
%K 金属离子配体,5折交叉检验,位置特异性打分矩阵,随机森林算法
%K Metal-Ion Ligand
%K 5-Fold Cross Validation
%K Position-Specific Scoring Matrix (PSSM)
%K Random Forest (RF)
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=55324