Chinese Journal of Practical Stomatology (中国实用口腔科杂志) ›› 2026, Vol. 19 ›› Issue (2): 169-176. DOI: 10.19538/j.kq.2026.02.007

• Original Article •

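Application of a three-dimensional convolutional neural network combined with a Transformer model in evaluating the efficacy of temporomandibular joint disc anchorage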

Dong Fanqiao (董凡侨), Wang Ai (王爱), Zhou Qing (周青), Xue Lei (薛雷)

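  1. Department of Oral and Maxillofacial Surgery, Hospital of Stomatology, China Medical University, Shenyang 110001, Liaoning, China
  • Published: 2026-03-30  Online: 2026-03-30
  • Corresponding author: Xue Lei (薛雷)
  • Funding:
    Joint Program of the Liaoning Provincial Science and Technology Plan (Natural Science Foundation, General Program) (2025-MSLH-800)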


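Abstract (translated from the Chinese): Objective    To construct a model combining a three-dimensional convolutional neural network (3D CNN) with a Transformer (hereafter the "hybrid 3D CNN-Transformer" model) and to analyze its performance in evaluating the efficacy of temporomandibular joint disc anchorage. Methods    Thirty-one patients (62 temporomandibular joints) with bilateral anterior disc displacement without reduction who presented to the Department of Oral and Maxillofacial Surgery, Hospital of Stomatology, China Medical University, between January 2024 and May 2025 and required temporomandibular joint disc anchorage were analyzed retrospectively. Preoperative and 3-month postoperative MRI data and clinical feature data [pain visual analog scale (VAS) score and maximum interincisal opening (MIO)] were collected for a comprehensive evaluation of postoperative efficacy. The hybrid 3D CNN-Transformer model comprised dual-branch feature extraction (MRI data and clinical feature data), multimodal fusion, and multitask output (an outcome grade of excellent, good, or poor, together with the changes in VAS score and MIO). The model was tested and evaluated using stratified 5-fold cross-validation with out-of-fold (OOF) predictions. Classification metrics were accuracy, F1-score, precision, recall, and area under the receiver operating characteristic curve (AUC). Regression metrics were mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²). Reliability metrics were prediction confidence, confidence distribution, and sample coverage; the relationships between the predicted probability of an "excellent" outcome [P(excellent)] and the decrease in VAS score, the increase in MIO, and the comprehensive clinical improvement index were also analyzed. Results    In the OOF samples (31 patients), overall accuracy was 0.936 (95% CI, Wilson: 0.786 to 0.982), overall precision was 0.944, and recall was 0.936; the weighted-average F1-score was 0.937 and the macro-average F1-score was 0.936. The AUCs for the excellent, good, and poor grades were 1.00, 0.97, and 0.96, respectively. For the change in VAS score, the model achieved MAE = 0.803 points, RMSE = 0.977 points, and R² = 0.755; for the change in MIO, MAE = 2.026 mm, RMSE = 2.412 mm, and R² = 0.665. Prediction confidence ranged from 0.441 to 0.980, with a median of 0.919. High-confidence cases (> 0.9) accounted for 54.8% (17/31). As the confidence threshold increased, sample coverage decreased, while the accuracy and macro-average F1-score of the retained cases increased. P(excellent) was positively correlated with the decrease in VAS score (r = 0.747, P < 0.001), the increase in MIO (r = 0.813, P < 0.001), and the comprehensive clinical improvement index (r = 0.773, P < 0.001). Conclusion    The hybrid 3D CNN-Transformer model evaluated treatment outcome 3 months after temporomandibular joint disc anchorage with high accuracy and reliability, and reflected changes in joint anatomy and the degree of improvement in pain and mouth opening fairly accurately. It has potential clinical application in assisting preoperative prediction of efficacy, but model performance still needs further improvement.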
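The Wilson score interval named for the accuracy estimate can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `wilson_interval` is hypothetical, and the count of 29 correct predictions out of 31 is an assumption inferred from the reported accuracy of 0.936 (the abstract does not give the raw count), so the bounds it prints need not match the published interval exactly.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.959964 corresponds to a two-sided 95% confidence level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    # The interval is centred on a value shrunk towards 0.5, not on p_hat itself.
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Assumed count (not stated in the abstract): 29 of 31 OOF cases correct.
low, high = wilson_interval(29, 31)
print(f"accuracy = {29 / 31:.3f}, 95% CI = {low:.3f} to {high:.3f}")
```

Unlike the simple normal-approximation interval, the Wilson interval stays inside [0, 1] and behaves sensibly for small samples with proportions near 1, which is the situation with 31 cases here.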

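Key words (translated from the Chinese): temporomandibular joint disc anchorage, anterior disc displacement without reduction, multimodal deep learning, three-dimensional convolutional neural network, Transformer, efficacy evaluation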

Abstract: Objective    To develop a model combining a 3D convolutional neural network (3D CNN) with a Transformer architecture (hybrid 3D CNN-Transformer model) and to investigate its effectiveness in evaluating postoperative outcomes of temporomandibular joint (TMJ) disc anchorage. Methods    A retrospective analysis was conducted on 31 patients (62 TMJs) with bilateral anterior disc displacement without reduction who underwent TMJ disc anchorage at the Department of Oral and Maxillofacial Surgery, Hospital of Stomatology, China Medical University, from January 2024 to May 2025. Preoperative and 3-month postoperative MRI data and clinical characteristics, including visual analog scale (VAS) scores for pain and maximum interincisal opening (MIO), were collected for comprehensive therapeutic evaluation. The hybrid 3D CNN-Transformer model comprised dual-branch feature extraction (MRI data and clinical features), multimodal fusion, and multitask output (an outcome category of excellent, good, or poor, and the changes in VAS score and MIO). Model performance was evaluated using stratified five-fold cross-validation combined with out-of-fold (OOF) predictions. Classification metrics included accuracy, F1-score, precision, recall, and area under the receiver operating characteristic curve (AUC). Regression metrics included mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²). Reliability metrics included prediction confidence, confidence distribution, and sample coverage. The correlations between the predicted probability of an "excellent" therapeutic outcome [P(excellent)] and the decrease in VAS score, the increase in MIO, and the comprehensive clinical improvement index were further analyzed. Results    In the OOF samples (31 patients), the overall accuracy was 0.936 (95% CI: 0.786 to 0.982), overall precision was 0.944, and recall was 0.936. The weighted-average and macro-average F1-scores were 0.937 and 0.936, respectively. The AUC values for the excellent, good, and poor categories were 1.00, 0.97, and 0.96, respectively. For the regression tasks, prediction of the change in VAS score achieved MAE = 0.803 points, RMSE = 0.977 points, and R² = 0.755; prediction of the change in MIO achieved MAE = 2.026 mm, RMSE = 2.412 mm, and R² = 0.665. Model confidence ranged from 0.441 to 0.980, with a median of 0.919; 54.8% (17/31) of cases had high confidence (> 0.9). As the confidence threshold increased, sample coverage decreased, while the accuracy and macro-average F1-score of the retained cases increased. P(excellent) was positively correlated with the decrease in VAS score (r = 0.747, P < 0.001), the increase in MIO (r = 0.813, P < 0.001), and the comprehensive clinical improvement index (r = 0.773, P < 0.001). Conclusion    The hybrid 3D CNN-Transformer model showed high accuracy and reliability in assessing outcomes 3 months after TMJ disc anchorage. It reflected changes in joint anatomy and improvements in pain and mouth opening with reasonable accuracy, and it has potential for assisting preoperative prediction of treatment effect. Nonetheless, the model's performance still needs further improvement.
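The trade-off between confidence threshold and sample coverage described in the Results can be illustrated with a short sketch. It is not the authors' implementation: the function name `coverage_accuracy` is hypothetical, the toy confidences and correctness flags are made up for illustration, and only the strict "> threshold" rule follows the abstract's definition of high confidence (> 0.9).

```python
def coverage_accuracy(confidences, correct, thresholds):
    """For each threshold, keep cases whose confidence exceeds it and report
    (threshold, coverage, accuracy among the kept cases).

    Accuracy is None when no case passes the threshold.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one case is required")
    n = len(confidences)
    rows = []
    for t in thresholds:
        kept = [ok for c, ok in zip(confidences, correct) if c > t]
        coverage = len(kept) / n
        accuracy = sum(kept) / len(kept) if kept else None
        rows.append((t, coverage, accuracy))
    return rows


# Made-up toy data: per-case top-class probability and whether the
# predicted grade matched the reference grade.
toy_conf = [0.45, 0.62, 0.71, 0.88, 0.93, 0.95, 0.97, 0.98]
toy_hit = [False, True, False, True, True, True, True, True]

for t, cov, acc in coverage_accuracy(toy_conf, toy_hit, [0.0, 0.7, 0.9]):
    acc_text = "n/a" if acc is None else f"{acc:.2f}"
    print(f"threshold > {t:.1f}: coverage = {cov:.2f}, accuracy = {acc_text}")
```

In the toy data, raising the threshold drops the low-confidence cases where the errors are concentrated, so coverage falls while accuracy on the retained cases rises, the same pattern the abstract reports.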

Key words: temporomandibular joint disc anchorage, anterior disc displacement without reduction, multimodal deep learning, 3D convolutional neural network, Transformer, efficacy evaluation