Society For Pediatric Urology

The hydronephrosis severity index: a multi-site validation of a practical and reliable artificial intelligence tool for rapid follow-up decision-making
Lauren Erdman, MSc, PhD(c)1, Mandy Rickard, MSc1, Erik Drysdale, MSc1, Marta Skreta, MSc1, Stanley Bryan Hua, BSc1, Kunj Sheth, MD2, Daniel Alvarez, BSc2, Kyla N. Velear, BSc2, Michael Chua, MD1, Joana dos Santos, MD1, Daniel Keefe, MD, MSc1, Norman D. Rosenblum, MD1, Megan A. Bonnett, MD3, Christopher Cooper, MD3, Gregory E. Tasian, MD4, John Weaver, MD4, Alice Xiang, MD4, Yong Fan, PhD5, Bernarda Viteri, MD4, Anna Goldenberg, PhD1, Armando Lorenzo, MD, MSc1.
1SickKids, University of Toronto, Toronto, ON, Canada, 2Stanford Children's Health, Palo Alto, CA, USA, 3University of Iowa Stead Family Children's Hospital, Iowa City, IA, USA, 4Children's Hospital of Philadelphia, Philadelphia, PA, USA, 5University of Pennsylvania, Philadelphia, PA, USA.

Background: In infants with hydronephrosis (HN), risk stratification for obstruction using ultrasound alone has the potential to streamline care for low-risk patients, reduce the number of patients investigated with invasive tests, and help providers comply with the as low as reasonably achievable (ALARA) radiation principle. Herein, we assess how a previously developed artificial intelligence (AI) model that predicts obstructive HN from one sagittal and one transverse kidney image performs in a multicenter validation study.
Methods: We trained our model on the Source institution training set, a retrospectively collected dataset of 1938 ultrasound images from 403 patients with linked health records (Table 1). The ground truth labels of our obstructed cases were verified using recorded intraoperative findings, which most commonly documented intrinsic narrowing/stenosis, crossing vessels and high ureteric insertions, as well as the pathology findings of fibrosis, smooth muscle hypertrophy and chronic inflammation. We used our validation set to determine the score threshold that yields 90% sensitivity for detecting obstructed (vs non-obstructed) HN patients. We then evaluated our model for sensitivity, specificity, area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve (AUPRC) on an independent test set from our Source institution, along with test data from 3 additional high-volume pediatric tertiary care institutions in North America.
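The threshold step described above can be illustrated with a minimal sketch. It assumes the model outputs one obstruction score per patient and that a patient is called obstructed when the score is at or above the threshold; the function names and the scores below are hypothetical and are not taken from the study.

```python
import math

def threshold_for_sensitivity(scores, labels, target=0.90):
    """Return the highest observed score threshold with sensitivity >= target.

    Assumption: a case is called obstructed when score >= threshold,
    and labels use 1 for obstructed, 0 for non-obstructed.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one obstructed case")
    # Number of obstructed cases that must score at or above the threshold.
    # The small epsilon guards against floating-point round-up.
    needed = math.ceil(target * len(positives) - 1e-9)
    return positives[needed - 1]

def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity at a fixed threshold.

    Assumes both obstructed and non-obstructed cases are present.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical validation-set scores (illustrative only, not study data).
val_scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.55, 0.20,
              0.10, 0.30, 0.50, 0.60, 0.15]
val_labels = [1] * 10 + [0] * 5

threshold = threshold_for_sensitivity(val_scores, val_labels, target=0.90)
sens, spec = sensitivity_specificity(val_scores, val_labels, threshold)
print(threshold, sens, spec)
```

The threshold is fixed on the validation data and then applied unchanged to each test set, so the sensitivity observed at another institution can fall above or below the 90% target.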
Results: Our model achieved an AUROC >90% at all institutions. We found a lower AUPRC at our Source institution (where the model was trained) and at Institution 2 than at Institutions 3 and 4 (Figure 1). AUROC and AUPRC summarize our ability to detect obstructed cases across all possible thresholds and all samples, rather than at the single threshold that would be used in practice. We therefore also tested the transfer of a specific, conservative threshold (90% sensitivity) for streamlining non-obstructed cases. The minimum sensitivity was 85%, at Institution 4, and no institution showed sensitivity significantly lower than 90%. In addition, specificity was >50% in all cases. The lowest specificity was at Institution 3, where all samples were high-grade; the tool could therefore streamline half of even high-grade patients while still achieving >90% sensitivity.
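One way to check whether an observed sensitivity is significantly lower than the 90% target is a one-sided exact binomial test. The abstract does not state which test was used, so the sketch below is an assumption, and the counts (17 of 20 obstructed cases detected, i.e. 85%) are hypothetical rather than study data.

```python
from math import comb

def binom_lower_tail_p(detected, n_obstructed, p0=0.90):
    """One-sided exact binomial p-value: P(X <= detected) for X ~ Binomial(n, p0).

    A small p-value suggests the true sensitivity is below p0.
    """
    return sum(
        comb(n_obstructed, i) * p0**i * (1 - p0) ** (n_obstructed - i)
        for i in range(detected + 1)
    )

# Hypothetical counts for illustration only: 17 of 20 obstructed cases flagged.
p_value = binom_lower_tail_p(17, 20, p0=0.90)
print(f"observed sensitivity = {17 / 20:.2f}, one-sided p = {p_value:.3f}")
```

With counts this small, an observed sensitivity of 85% is compatible with a true sensitivity of 90%, which is the kind of situation the "not significantly lower" wording describes.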
Conclusions: It is possible to stratify and streamline patients who are unlikely to require surgery for obstructive HN using automatic determination from two postnatal ultrasound images at a single time-point. This multi-site study demonstrates that our HN model can be successfully and accurately deployed outside the institution of initial development.

Table 1. Demographics of the data used to develop the model, our source data test set, and test sets for Institutions 2, 3, and 4.

Figure 1. Model performance in Source institution test set and beyond.
