Development and Evaluation of Machine Learning in Whole-Body Magnetic Resonance Imaging for Detecting Metastases in Patients With Lung or Colon Cancer: A Diagnostic Test Accuracy Study.
De Paepe, KN
OBJECTIVES: Whole-body magnetic resonance imaging (WB-MRI) has been demonstrated to be efficient and cost-effective for cancer staging. The study aim was to develop a machine learning (ML) algorithm to improve radiologists' sensitivity and specificity for metastasis detection and reduce reading times.
MATERIALS AND METHODS: A retrospective analysis of 438 prospectively collected WB-MRI scans from the multicenter Streamline studies (February 2013-September 2016) was undertaken. Disease sites were manually labeled using the Streamline reference standard. Whole-body MRI scans were randomly allocated to training and testing sets. A model for malignant lesion detection was developed based on convolutional neural networks and a 2-stage training strategy. The final algorithm generated lesion probability heat maps. Using a concurrent reader paradigm, 25 radiologists (18 experienced, 7 inexperienced in WB-MRI) were randomly allocated WB-MRI scans with or without ML support to detect malignant lesions over 2 or 3 reading rounds. Reads were undertaken in the setting of a diagnostic radiology reading room between November 2019 and March 2020. Reading times were recorded by a scribe. Prespecified analysis included sensitivity, specificity, interobserver agreement, and reading time of radiology readers to detect metastases with or without ML support. Reader performance for detection of the primary tumor was also evaluated.
RESULTS: Four hundred thirty-three evaluable WB-MRI scans were allocated to algorithm training (245) or radiology testing (50 patients with metastases, from primary colon [n = 117] or lung [n = 71] cancer). Among a total of 562 reads by experienced radiologists over 2 reading rounds, per-patient specificity was 86.2% (ML) and 87.7% (non-ML) (-1.5% difference; 95% confidence interval [CI], -6.4%, 3.5%; P = 0.39). Sensitivity was 66.0% (ML) and 70.0% (non-ML) (-4.0% difference; 95% CI, -13.5%, 5.5%; P = 0.344). Among 161 reads by inexperienced readers, per-patient specificity in both groups was 76.3% (0% difference; 95% CI, -15.0%, 15.0%; P = 0.613), with sensitivity of 73.3% (ML) and 60.0% (non-ML) (13.3% difference; 95% CI, -7.9%, 34.5%; P = 0.313). Per-site specificity was high (>90%) for all metastatic sites and experience levels. There was high sensitivity for the detection of primary tumors (lung cancer detection rate of 98.6% with and without ML [0.0% difference; 95% CI, -2.0%, 2.0%; P = 1.00], colon cancer detection rate of 89.0% with and 90.6% without ML [-1.7% difference; 95% CI, -5.6%, 2.2%; P = 0.65]). When combining all reads from rounds 1 and 2, reading times fell by 6.2% (95% CI, -22.8%, 10.0%) when using ML. Round 2 read-times fell by 32% (95% CI, 20.8%, 42.8%) compared with round 1. Within round 2, there was a significant decrease in read-time when using ML support, estimated as 286 seconds (or 11%) quicker (P = 0.0281), using regression analysis to account for reader experience, read round, and tumor type. Interobserver variance suggests moderate agreement, Cohen κ = 0.64; 95% CI, 0.47, 0.81 (with ML), and Cohen κ = 0.66; 95% CI, 0.47, 0.81 (without ML).
CONCLUSIONS: There was no evidence of a significant difference in per-patient sensitivity and specificity for detecting metastases or the primary tumor using concurrent ML compared with standard WB-MRI. Radiology read-times with or without ML support fell for round 2 reads compared with round 1, suggesting that readers familiarized themselves with the study reading method. During the second reading round, there was a significant reduction in reading time when using ML support.
RMH Honorary Faculty
Investigative Radiology, 2023, Publish Ahead of Print
Ovid Technologies (Wolters Kluwer Health)