Construction of a Quetiapine Blood Concentration Prediction Model for Patients with Anxiety Disorders Based on Machine Learning

    • Abstract:
      OBJECTIVE To construct a machine learning-based model for predicting quetiapine plasma concentration in patients with anxiety disorders, providing a reference for primary healthcare institutions without access to therapeutic drug monitoring (TDM).
      METHODS Clinical data and laboratory indicators were collected from 337 patients with anxiety disorders who received quetiapine at Urumqi Fourth People's Hospital (Xinjiang Mental Health Center) between January 2020 and December 2022. Features were selected by univariate analysis followed by Lasso regression. The selected variables were entered into five machine learning models (Random Forest, Support Vector Machine, Decision Tree, Light Gradient Boosting Machine, and Extreme Gradient Boosting). Model hyperparameters were optimized by five-fold cross-validation, and model performance was evaluated on the test set. Generalization ability was assessed on an independent temporal validation set (January 2025 to February 2025). The model was interpreted with the Shapley additive explanations (SHAP) method to analyze the contribution of each feature to the predictions.
      RESULTS Univariate analysis showed that dose, triglycerides, alanine aminotransferase, aspartate aminotransferase, red blood cell count, white blood cell count, neutrophil count, and triiodothyronine were significantly associated with quetiapine plasma concentration (P<0.05). Lasso regression retained six variables. Among the five machine learning models, the Decision Tree model performed best, with a coefficient of determination (R²) of 0.746, a mean absolute percentage error (MAPE) of 50.81%, a mean absolute error (MAE) of 10.0, a root mean squared error (RMSE) of 16.1, and an accuracy of 55.78%. It retained good predictive performance on the temporal validation set (R²=0.694, MAE=11.05).
      CONCLUSION The Decision Tree model established in this study showed relatively good predictive performance and provides a reference for individualized dosing in clinical practice.
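The abstract reports R², MAE, RMSE and MAPE without giving their formulas. The sketch below computes them from their standard textbook definitions, which is an assumption about how the authors calculated them. The function name `regression_metrics` and the sample concentrations are hypothetical and are not study data. The "accuracy" figure is left out because the abstract does not define it for a continuous outcome.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, RMSE and MAPE (in percent) for paired values.

    Standard definitions are assumed; the abstract does not state its formulas.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when an observed value is zero")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n

    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # Each absolute error is scaled by its own observed value.
        "mape": 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n,
    }


# Hypothetical observed and predicted concentrations (illustration only).
observed = [5.0, 10.0, 100.0, 200.0]
predicted = [10.0, 15.0, 95.0, 195.0]
metrics = regression_metrics(observed, predicted)
```

In this toy data every prediction is off by 5 units, so MAE and RMSE are both 5 and R² is about 0.996, yet MAPE is 39.375% because the same absolute error is large relative to the two low observed values. A pattern of this kind is one possible reason a model can combine a moderate-to-high R² with a high MAPE, although the abstract does not say whether that applies here.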

       
