Deep Learning Models for Activity Prediction Against Low-data COVID-19 Targets

Liang Shuran; Huo Wenbo; Shen Wanxiang; Chen Yuzong; Jiang Yuyang; Tan Ying

doi:10.13748/j.cnki.issn1007-7693.2022.21.025

Abstract

OBJECTIVE To discover repurposable drugs and new drugs against low-data COVID-19 targets (those with fewer than 300 known inhibitors) in response to Corona Virus Disease 2019 (COVID-19). METHODS New deep learning models were developed with MolMapNet, a deep learning architecture that outperformed state-of-the-art deep learning models on pharmaceutical benchmark datasets and that predicts pharmaceutical properties from broadly learned, knowledge-based molecular representations. The models were used to predict activity against 6 low-data COVID-19 targets with 34, 51, 81, 155, 161, and 241 known inhibitors, respectively, and were compared with machine learning and deep learning models trained on higher-data targets (5,478 to 10,000 known inhibitors). RESULTS Tested under 10-fold cross-validation, the models predicted the activity values of the test-set inhibitors of these 6 targets with RMSE of 0.442 to 0.917, MAE of 0.358 to 0.749, and R² of 0.436 to 0.761. CONCLUSION Screening approved drugs for potential repurposing candidates against COVID-19 identified 3 drugs that are consistent with literature-reported experimental findings. These results indicate the potential of this deep learning approach for activity prediction on low-data targets for COVID-19 and other diseases.
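The evaluation protocol named in the abstract (10-fold cross-validation scored by RMSE, MAE, and R²) can be illustrated with a minimal sketch. This is not the authors' code: the function names `regression_metrics` and `k_fold_indices` are hypothetical, the shuffling and fold assignment are assumptions, and R² is computed here as the coefficient of determination, which may differ from the definition used in the paper (for example, a squared Pearson correlation).

```python
import math
import random


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R^2 for paired true and predicted activity values.

    Assumption: R^2 is the coefficient of determination, 1 - SS_res / SS_tot.
    It is undefined when all true values are identical (SS_tot == 0).
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Indices are shuffled once with a fixed seed, then dealt into k folds,
    so every sample lands in exactly one test fold.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx
```

For the smallest target mentioned in the abstract (34 known inhibitors), `k_fold_indices(34, k=10)` gives test folds of only 3 or 4 compounds each, so per-fold metrics would be noisy. Pooling the held-out predictions across folds before calling `regression_metrics` is one common way to handle this, though the abstract does not say which aggregation was used.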
