Cite this article: LI Yin, TAN Ying. Binding Activity Prediction of the Low-data G-protein Coupled Receptors Targets by Deep Learning of Knowledge-based Molecular Representations [J]. Chin J Mod Appl Pharm (中国现代应用药学), 2022, 39(21): 2842-2849.
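Deep Learning Models Based on Molecular Physicochemical Property Features for Predicting Binding Activity Against Low-data G-protein Coupled Receptor Targets
LI Yin, TAN Ying
Tsinghua Shenzhen International Graduate School, Shenzhen, Guangdong 518055, China
Abstract:
OBJECTIVE To use MolMapNet to build deep learning (DL) models that predict the binding activity of compounds against 23 low-data G-protein coupled receptors (GPCRs; fewer than 250 known activity data points each), in support of discovering novel GPCR drugs. METHODS Activity datasets for the low-data GPCRs were collected from multiple databases and preprocessed, and DL models were built with MolMapNet; the resulting models were compared with published DL models and machine learning (ML) models; and the models were evaluated on patented compounds targeting the neuropeptide S receptor. RESULTS Individual regression models were built for the 23 low-data GPCR targets. Under 10-fold cross-validation, the test-set root mean square error (RMSE) ranged from 0.373 6 to 1.199 8 (below 1 for 20 models), the mean absolute error (MAE) ranged from 0.299 4 to 1.008 3 (below 1 for 21 models), and R² ranged from 0.136 9 to 0.810 7 (above 0.5 for 15 models and above 0.6 for 9). Performance was comparable to that of published DL models for high-data GPCRs (more than 250 known activity data points) and of ML models for low-data targets. The models also performed well when used to predict the activity of compounds from patents. CONCLUSION The 23 regression models can predict the biological activity of compounds against specific targets and have the potential to screen for structurally novel drugs; MolMapNet can be used for activity prediction on low-data GPCRs.
Key words:  binding activity  deep learning  GPCR  low-data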
DOI:10.13748/j.cnki.issn1007-7693.2022.21.021
Classification number: R914.2
Funding: National Key Research Program of China, Synthetic Biology Special Project (2019YFA0905901)
Binding Activity Prediction of the Low-data G-protein Coupled Receptors Targets by Deep Learning of Knowledge-based Molecular Representations
LI Yin, TAN Ying
Tsinghua Shenzhen International Graduate School, Shenzhen 518055, China
Abstract:
OBJECTIVE To construct new deep learning (DL) models for predicting binding activity against each of 23 low-data G-protein coupled receptors (GPCRs) (known binders <250) using MolMapNet, in support of novel GPCR drug discovery. METHODS Binding activity datasets of low-data GPCRs were collected from multiple databases and preprocessed, and DL models were constructed with MolMapNet; the established models were compared with published DL models and ML models; and the constructed models were evaluated using patented neuropeptide S receptor compounds. RESULTS Under 10-fold cross-validation, the MolMapNet DL models predicted the binding activity values of the test-set compounds for each GPCR with RMSE 0.373 6-1.199 8 (RMSE <1 for 20 models), MAE 0.299 4-1.008 3 (MAE <1 for 21 models), and R² 0.136 9-0.810 7 (R² >0.5 for 15 models and R² >0.6 for 9 models). Our low-data models performed comparably to published DL models trained on higher-data GPCRs (>250 known binders). Our models also performed well in predicting the activity of patented GPCR binders. CONCLUSION The 23 models constructed here can predict the biological activity of a compound against a specific target with good performance and have the potential to screen for drugs with novel structures; the MolMapNet architecture is useful for activity prediction against low-data GPCR targets.
Key words:  binding activity  deep learning  G-protein coupled receptors  low-data
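The abstract reports RMSE, MAE, and R² under 10-fold cross-validation. The sketch below shows one common way such metrics are computed. It is not the authors' code: it assumes R² means the coefficient of determination (the abstract does not say which definition was used), and the function names and the unshuffled fold-splitting scheme are hypothetical choices made for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted activities."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between measured and predicted activities."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination (assumed definition of R2)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kfold_indices(n, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous folds, without shuffling."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


# Toy, made-up activity values used only to exercise the functions.
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]
scores = (rmse(measured, predicted), mae(measured, predicted), r2(measured, predicted))
```

In a cross-validated evaluation, a model would be fitted on each `train_idx` subset and the three metrics computed on the matching `test_idx` subset. Whether per-fold scores are then averaged or predictions are pooled before scoring is a design choice the abstract does not specify.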