Abstract:
OBJECTIVE To construct a medication education knowledge base for patients with chronic kidney disease (CKD) by integrating the context-understanding capability of large language models (LLMs) with the dynamic knowledge retrieval mechanism of retrieval-augmented generation (RAG), thereby building a multimodal medication guidance system that improves the safety of and adherence to patients' medication regimens and assists medical professionals.
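The retrieve-then-generate flow that underpins such an LLM+RAG system can be sketched as follows. This is a minimal illustration only: the knowledge-base entries, the bag-of-words scoring function (standing in for a real embedding retriever), and the prompt format are all hypothetical and not taken from the study.

```python
# Minimal LLM+RAG sketch: retrieve relevant knowledge-base passages, then
# prepend them to the user question so the LLM answers grounded in the KB.
# All content below is illustrative, not the study's actual knowledge base.
from collections import Counter

KNOWLEDGE_BASE = [
    "Metformin should be avoided when eGFR falls below 30 mL/min/1.73 m2.",
    "NSAIDs can worsen renal function in CKD patients and are generally contraindicated.",
    "ACE inhibitors require potassium and creatinine monitoring in CKD.",
]

def tokenize(text):
    return [w.strip(".,?").lower() for w in text.split()]

def score(query, doc):
    # Simple bag-of-words overlap as a stand-in for an embedding retriever.
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    return sum((q & d).values())

def retrieve(query, k=2):
    # Rank all KB passages by overlap with the query; keep the top k.
    ranked = sorted(KNOWLEDGE_BASE, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]

def build_prompt(query):
    # Retrieved passages are prepended so the LLM's answer stays grounded.
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

prompt = build_prompt("Can a CKD patient take NSAIDs for pain?")
```

In a production system, `build_prompt`'s output would be sent to the chosen LLM (Kimi, Spark, or Zhipu); here the pipeline stops at prompt construction so the sketch stays self-contained.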
METHODS The model was built through data collection and preprocessing, model construction and training, technology integration, knowledge base construction and maintenance, and system evaluation and optimization. Thirty CKD-related questions were designed, and three major Chinese LLMs, namely Kimi, Xunfei Starfire (Spark), and Zhipu, were selected for comparative evaluation. Ten clinical pharmacists from the nephrology department rated the models on five dimensions: accuracy, completeness, relevance, logic, and professionalism, focusing on each model's clinical logical consistency, completeness of evidence traceability, and accuracy of contraindication identification in the CKD scenario. Each pharmacist rated the responses under each of the three processing methods (base model, with prompts added, with knowledge base added) separately, yielding 30 rating forms in total. In parallel, time investment data for four stages (requirement analysis, rule design, system training and testing, and deployment and optimization) were collected from 5 software development companies to compare the time consumed by the traditional development mode and the LLM+RAG mode. Two-way and one-way analysis of variance (ANOVA) were used to evaluate differences in model scores, and paired t-tests were used to analyze differences in development time (P < 0.05 was considered statistically significant).
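The paired t-test comparing per-stage development time between the two modes can be sketched as below. The stage durations are hypothetical placeholders chosen only to mirror the reported 2.125-week average saving; they are not the study's data, and the p-value lookup is omitted to keep the sketch dependency-free.

```python
# Hedged sketch of the paired t-test on matched per-stage development times.
# The numbers are illustrative placeholders, not the study's actual data.
import math

def paired_t(a, b):
    """Return the paired t statistic and degrees of freedom for matched samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Sample variance of the within-pair differences (n - 1 denominator).
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    t = mean / math.sqrt(var / n)
    return t, n - 1

# Hypothetical weeks per stage (requirement analysis, rule design,
# training/testing, deployment/optimization):
traditional = [4.0, 5.0, 6.0, 4.0]
llm_rag     = [2.5, 1.0, 4.0, 3.0]

t_stat, df = paired_t(traditional, llm_rag)
mean_saving = sum(a - b for a, b in zip(traditional, llm_rag)) / len(traditional)
```

With these placeholder values the mean per-stage saving is 2.125 weeks; the t statistic would then be compared against the t distribution with `df` degrees of freedom at the P < 0.05 threshold.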
RESULTS The interaction between processing method and model was significant (P<0.001). After prompts were added, the Kimi model scored significantly higher than the Xunfei and Zhipu models; after the knowledge base was added, Kimi scored highest, with no significant difference from Zhipu but a significantly higher score than Xunfei; among the base models, Kimi also scored highest. Within the same model, Kimi's score with the knowledge base was significantly higher than with prompts but did not differ from its base model; the scores of Xunfei and Zhipu improved significantly after the knowledge base was added. Compared with the traditional mode, the LLM+RAG mode significantly shortened development time (P=0.017), with an 80% efficiency gain in the rule design stage, an average saving of 2.125 weeks per stage, and an overall efficiency improvement of 45.9%.
CONCLUSION Combining LLM and RAG technology can significantly improve development efficiency and shorten the development cycle, and optimizing prompts and the knowledge base can maximize model performance; different models can be selected according to cost and speed requirements. This study verified the application potential of LLM+RAG in the medical field, but the coverage of the knowledge base, the model's generalization ability, and long-term maintenance still need optimization. In the future, the knowledge base will be expanded and the level of intelligence improved to provide more accurate medical assistance tools.