TY - GEN
T1 - Preprocessing of clinical neuro-oncology MRI studies for big data applications
AU - Chakrabarty, Satrajit
AU - Lamontagne, Pamela
AU - Marcus, Daniel S.
AU - Milchenko, Mikhail
N1 - Publisher Copyright:
© COPYRIGHT SPIE. Downloading of the abstract is permitted for personal use only.
PY - 2020
Y1 - 2020
N2 - Clinically acquired, multimodal and multi-site MRI datasets are widely used for neuro-oncology research. However, manual preprocessing of such data is extremely tedious and error-prone due to high intrinsic heterogeneity. Automatic standardization of such datasets is therefore important for data-hungry applications like deep learning. Despite rapid advances in MRI data acquisition and processing algorithms, only limited effort has been dedicated to automatic methodologies for standardization of such data. To address this challenge, we augment our previously developed Multimodal Glioma Analysis (MGA) pipeline with automation tools to achieve a processing scale suitable for big data applications. This new pipeline implements a natural language processing (NLP) based scan-type classifier, with features constructed from DICOM metadata based on a bag-of-words model. The classifier automatically assigns one of 18 pre-defined scan types to each scan in an MRI study. Using the described data model, we trained three types of classifiers: logistic regression, linear SVM, and multi-layer artificial neural network (ANN) on the same dataset. Their performance was validated on four datasets from multiple sources. The ANN implementation achieved the highest performance, yielding an average classification accuracy of over 99%. We also built a Jupyter notebook-based graphical user interface (GUI), which is used to run MGA in semi-automatic mode for progress tracking and quality control, to ensure reproducibility of the analyses based thereon. MGA has been implemented as a Docker container image to ensure portability and easy deployment. The application can run in single or batch study mode, using either local DICOM data or XNAT cloud storage.
AB - Clinically acquired, multimodal and multi-site MRI datasets are widely used for neuro-oncology research. However, manual preprocessing of such data is extremely tedious and error-prone due to high intrinsic heterogeneity. Automatic standardization of such datasets is therefore important for data-hungry applications like deep learning. Despite rapid advances in MRI data acquisition and processing algorithms, only limited effort has been dedicated to automatic methodologies for standardization of such data. To address this challenge, we augment our previously developed Multimodal Glioma Analysis (MGA) pipeline with automation tools to achieve a processing scale suitable for big data applications. This new pipeline implements a natural language processing (NLP) based scan-type classifier, with features constructed from DICOM metadata based on a bag-of-words model. The classifier automatically assigns one of 18 pre-defined scan types to each scan in an MRI study. Using the described data model, we trained three types of classifiers: logistic regression, linear SVM, and multi-layer artificial neural network (ANN) on the same dataset. Their performance was validated on four datasets from multiple sources. The ANN implementation achieved the highest performance, yielding an average classification accuracy of over 99%. We also built a Jupyter notebook-based graphical user interface (GUI), which is used to run MGA in semi-automatic mode for progress tracking and quality control, to ensure reproducibility of the analyses based thereon. MGA has been implemented as a Docker container image to ensure portability and easy deployment. The application can run in single or batch study mode, using either local DICOM data or XNAT cloud storage.
KW - Clinical MRI
KW - Docker
KW - Jupyter Notebook
KW - MRI scan classifier
KW - Natural Language Processing
KW - Neuro-oncology imaging
KW - Translational research
UR - http://www.scopus.com/inward/record.url?scp=85082623504&partnerID=8YFLogxK
U2 - 10.1117/12.2548371
DO - 10.1117/12.2548371
M3 - Conference contribution
AN - SCOPUS:85082623504
T3 - Progress in Biomedical Optics and Imaging - Proceedings of SPIE
BT - Medical Imaging 2020
A2 - Chen, Po-Hao
A2 - Deserno, Thomas M.
PB - SPIE
T2 - Medical Imaging 2020: Imaging Informatics for Healthcare, Research, and Applications
Y2 - 16 February 2020 through 17 February 2020
ER -