BACKGROUND: Urine culture images collected using bacteriology automation are currently interpreted by technologists during routine standard-of-care workflows. Machine learning may help harmonize and assist with these interpretations. METHODS: A deep learning model, BacterioSight, was developed, trained, and tested on standard BD-Kiestra images of routine blood agar urine cultures from 2 different medical centers. RESULTS: BacterioSight displayed performance on par with standard-of-care-trained technologist interpretations. BacterioSight accuracy was 97% when compared to standard-of-care interpretation (single technologist) and reached 100% when compared to a consensus reached by a group of technologists (the gold standard in this study). Variability in image interpretation by trained technologists was identified, and annotation "fuzziness" was quantified and found to correlate with reduced confidence in BacterioSight interpretation. Intra-testing (training and testing performed within the same institution) performed well, giving an area under the curve (AUC) ≥0.98 for negative and positive plates, whereas cross-testing (training on one institution's images and testing on images from another institution) showed decreased performance, with AUC ≥0.90 for negative and positive plates. CONCLUSIONS: Our study provides a roadmap for how BacterioSight or similar deep learning prototypes may be implemented to screen for microbial growth, flag difficult cases for multi-personnel review, or auto-verify a subset of cultures with high confidence. In addition, our results highlight variability in image interpretation by trained technologists, both within an institution and across institutions. We propose a model in which deep learning can enhance patient care by identifying inherent sample annotation variability and improving personnel training.
- convolutional neural network
- deep learning
- fuzzy labeling
- image analysis
- urine culture
- artificial intelligence
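
The workflow proposed in the abstract (auto-verify high-confidence cultures, flag difficult cases for multi-personnel review) could be sketched roughly as follows. This is an illustrative sketch only, not BacterioSight's implementation. The abstract does not say how annotation "fuzziness" was computed, so the disagreement measure here (the fraction of technologists who differ from the majority label) is an assumption. The function names, the label strings, and the 0.95 confidence threshold are likewise hypothetical.

```python
from collections import Counter


def annotation_fuzziness(labels):
    """Fraction of technologist labels that disagree with the majority label.

    Assumed definition for illustration: 0.0 means unanimous agreement,
    larger values mean more disagreement among annotators.
    """
    if not labels:
        raise ValueError("at least one label is required")
    majority_count = Counter(labels).most_common(1)[0][1]
    return 1 - majority_count / len(labels)


def triage(p_growth, auto_threshold=0.95):
    """Route one plate based on a model's predicted probability of growth.

    `p_growth` is a hypothetical model output in [0, 1]; `auto_threshold`
    is an assumed cut-off, not a value reported in the study.
    """
    if not 0.0 <= p_growth <= 1.0:
        raise ValueError("p_growth must be a probability in [0, 1]")
    # Confidence is the probability assigned to the predicted class.
    confidence = max(p_growth, 1 - p_growth)
    if confidence < auto_threshold:
        return "flag for multi-personnel review"
    if p_growth >= 0.5:
        return "auto-verify: growth"
    return "auto-verify: no growth"


if __name__ == "__main__":
    reads = ["growth", "growth", "growth", "no growth"]
    print(annotation_fuzziness(reads))
    print(triage(0.99), "|", triage(0.60), "|", triage(0.02))
```

In practice, any such threshold would need to be validated per institution, given the reduced cross-testing performance reported above.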