TY - JOUR
T1 - Visual Exploration of Neural Document Embedding in Information Retrieval
T2 - Semantics and Feature Selection
AU - Ji, Xiaonan
AU - Shen, Han-Wei
AU - Ritter, Alan
AU - Machiraju, Raghu
AU - Yen, Po-Yin
N1 - Funding Information:
This work was supported in part by the Agency for Healthcare Research and Quality (AHRQ), R03HS025047-01. The authors would like to thank Albert Lai, Junpeng Wang, and anonymous reviewers for their generous help.
Publisher Copyright:
© 1995-2012 IEEE.
PY - 2019/6/1
Y1 - 2019/6/1
N2 - Neural embeddings are widely used in language modeling and feature generation with superior computational power. In particular, neural document embedding (converting variable-length texts to semantic vector representations) has been shown to benefit a wide range of downstream applications, e.g., information retrieval (IR). However, its black-box nature makes it difficult to understand how the semantics are encoded and employed. We propose visual exploration of neural document embedding to gain insights into the underlying embedding space and promote its utilization in prevalent IR applications. In this study, we take an IR application-driven view, further motivated by biomedical IR in healthcare decision-making, and collaborate with domain experts to design and develop a visual analytics system. This system visualizes neural document embeddings as a configurable document map and enables guidance and reasoning; facilitates exploration of the neural embedding space and identification of salient neural dimensions (semantic features) per task and domain interest; and supports advisable feature selection (semantic analysis) along with instant visual feedback to promote IR performance. We demonstrate the usefulness and effectiveness of this system and present inspiring findings in use cases. This work will help designers/developers of downstream applications gain insights and confidence in neural document embedding, and exploit that to achieve more favorable performance in application domains.
AB - Neural embeddings are widely used in language modeling and feature generation with superior computational power. In particular, neural document embedding (converting variable-length texts to semantic vector representations) has been shown to benefit a wide range of downstream applications, e.g., information retrieval (IR). However, its black-box nature makes it difficult to understand how the semantics are encoded and employed. We propose visual exploration of neural document embedding to gain insights into the underlying embedding space and promote its utilization in prevalent IR applications. In this study, we take an IR application-driven view, further motivated by biomedical IR in healthcare decision-making, and collaborate with domain experts to design and develop a visual analytics system. This system visualizes neural document embeddings as a configurable document map and enables guidance and reasoning; facilitates exploration of the neural embedding space and identification of salient neural dimensions (semantic features) per task and domain interest; and supports advisable feature selection (semantic analysis) along with instant visual feedback to promote IR performance. We demonstrate the usefulness and effectiveness of this system and present inspiring findings in use cases. This work will help designers/developers of downstream applications gain insights and confidence in neural document embedding, and exploit that to achieve more favorable performance in application domains.
KW - Neural document embedding
KW - feature selection
KW - information retrieval
KW - semantic analysis
UR - http://www.scopus.com/inward/record.url?scp=85065410305&partnerID=8YFLogxK
U2 - 10.1109/TVCG.2019.2903946
DO - 10.1109/TVCG.2019.2903946
M3 - Article
C2 - 30892213
AN - SCOPUS:85065410305
SN - 1077-2626
VL - 25
SP - 2181
EP - 2192
JO - IEEE Transactions on Visualization and Computer Graphics
JF - IEEE Transactions on Visualization and Computer Graphics
IS - 6
M1 - 8667702
ER -