Development and evaluation of machine-learning methods in whole-body magnetic resonance imaging with diffusion weighted imaging for staging of patients with cancer: the MALIBO diagnostic test accuracy study

Loading...
Thumbnail Image

Embargo End Date

ICR Authors

Authors

Rockall, A
Li, X
Johnson, N
Lavdas, I
Santhakumaran, S
Prevost, AT
Koh, D-M
Punwani, S
Goh, V
Bharwani, N
Sandhu, A
Sidhu, H
Plumb, A
Burn, J
Fagan, A
Oliver, A
Wengert, GJ
Rueckert, D
Aboagye, E
Taylor, SA
Glocker, B
The MALIBO Investigators,

Document Type

Journal Article

Date

2024-10-01

Date Accepted

Abstract

<h4>Background</h4>Whole-body magnetic resonance imaging is accurate, efficient and cost-effective for cancer staging. Machine learning may support radiologists reading whole-body magnetic resonance imaging.<h4>Objectives</h4>1. To develop a machine-learning algorithm to detect normal organs and cancer lesions. 2. To compare diagnostic accuracy, time and agreement of radiology reads to detect metastases using whole-body magnetic resonance imaging with concurrent machine learning (whole-body magnetic resonance imaging + machine learning) against standard whole-body magnetic resonance imaging (whole-body magnetic resonance imaging + standard deviation).<h4>Design and participants</h4>Retrospective analysis of (1) prospective single-centre study in healthy volunteers > 18 years (n = 51) and (2) prospective multicentre STREAMLINE study patient data (n = 438).<h4>Tests</h4>Index: whole-body magnetic resonance imaging + machine learning. Comparator: whole-body magnetic resonance imaging + standard deviation.<h4>Reference standard</h4>Previously established expert panel consensus reference at 12 months from diagnosis.<h4>Outcome measures</h4>Primary: difference in per-patient specificity between whole-body magnetic resonance imaging + machine learning and whole-body magnetic resonance imaging + standard deviation. Secondary: per-patient sensitivity, per-lesion sensitivity and specificity, read time and agreement.<h4>Methods</h4>Phase 1: classification forests, convolutional neural networks, and a multi-atlas approaches for organ segmentation. Phase 2/3: whole-body magnetic resonance imaging scans were allocated to Phase 2 (training = 226, validation = 45) and Phase 3 (testing = 193). Disease sites were manually labelled. The final algorithm was applied to 193 Phase 3 cases, generating probability heatmaps. Twenty-five radiologists (18 experienced, 7 inexperienced in whole-body magnetic resonance imaging) were randomly allocated whole-body magnetic resonance imaging + machine learning or whole-body magnetic resonance imaging + standard deviation over two or three rounds in a National Health Service setting. Read time was independently recorded.<h4>Results</h4>Phases 1 and 2: convolutional neural network had best Dice similarity coefficient, recall and precision measurements for healthy organ segmentation. Final algorithm used a ‘two-stage’ initial organ identification followed by lesion detection. Phase 3: evaluable scans (188/193, of which 50 had metastases from 117 colon, 71 lung cancer cases) were read between November 2019 and March 2020. For experienced readers, per-patient specificity for detection of metastases was 86.2% (whole-body magnetic resonance imaging + machine learning) and 87.7% (whole-body magnetic resonance imaging + standard deviation), (difference −1.5%, 95% confidence interval −6.4% to 3.5%; p = 0.387); per-patient sensitivity was 66.0% (whole-body magnetic resonance imaging + machine learning) and 70.0% (whole-body magnetic resonance imaging + standard deviation) (difference −4.0%, 95% confidence interval −13.5% to 5.5%; p = 0.344). For inexperienced readers (53 reads, 15 with metastases), per-patient specificity was 76.3% in both groups with sensitivities of 73.3% (whole-body magnetic resonance imaging + machine learning) and 60.0% (whole-body magnetic resonance imaging + standard deviation). Per-site specificity remained high within all sites; above 95% (experienced) or 90% (inexperienced). Per-site sensitivity was highly variable due to low number of lesions in each site. Reading time lowered under machine learning by 6.2% (95% confidence interval −22.8% to 10.0%). Read time was primarily influenced by read round with round 2 read times reduced by 32% (95% confidence interval 20.8% to 42.8%) overall with subsequent regression analysis showing a significant effect (p = 0.0281) by using machine learning in round 2 estimated as 286 seconds (or 11%) quicker. Interobserver variance for experienced readers suggests moderate agreement, Cohen’s κ = 0.64, 95% confidence interval 0.47 to 0.81 (whole-body magnetic resonance imaging + machine learning) and Cohen’s κ = 0.66, 95% confidence interval 0.47 to 0.81 (whole-body magnetic resonance imaging + standard deviation).<h4>Limitations</h4>Patient whole-body magnetic resonance imaging data were heterogeneous with relatively few metastatic lesions in a wide variety of locations, making training and testing difficult and hampering evaluation of sensitivity.<h4>Conclusions</h4>There was no difference in diagnostic accuracy for whole-body magnetic resonance imaging radiology reads with or without machine-learning support, although radiology read time may be slightly shortened using whole-body magnetic resonance imaging + machine learning.<h4>Future work</h4>Failure-case analysis to improve model training, automate lesion segmentation and transfer of machine-learning techniques to other tumour types and imaging modalities.<h4>Study registration</h4>This study is registered as ISRCTN23068310.<h4>Funding</h4>This award was funded by the National Institute for Health and Care Research (NIHR) Efficacy and Mechanism Evaluation (EME) programme (NIHR award ref: 13/122/01) and is published in full in Efficacy and Mechanism Evaluation; Vol. 11, No. 15. See the NIHR Funding and Awards website for further award information.

Citation

Efficacy and Mechanism Evaluation, 2024, pp. 1 - 141

Source Title

Efficacy and Mechanism Evaluation

Publisher

National Institute for Health and Care Research

ISSN

eISSN

2050-4373

Research Team

RMH Honorary Faculty

Notes