TY - GEN
T1 - Jointly Embedding Entities and Text with Distant Supervision
AU - Newman-Griffis, Denis
AU - Lai, Albert M.
AU - Fosler-Lussier, Eric
N1 - Publisher Copyright:
© 2018 Association for Computational Linguistics.
PY - 2018
Y1 - 2018
N2 - Learning representations for knowledge base entities and concepts is becoming increasingly important for NLP applications. However, recent entity embedding methods have relied on structured resources that are expensive to create for new domains and corpora. We present a distantly-supervised method for jointly learning embeddings of entities and text from an unannotated corpus, using only a list of mappings between entities and surface forms. We learn embeddings from open-domain and biomedical corpora, and compare against prior methods that rely on human-annotated text or large knowledge graph structure. Our embeddings capture entity similarity and relatedness better than prior work, both in existing biomedical datasets and a new Wikipedia-based dataset that we release to the community. Results on analogy completion and entity sense disambiguation indicate that entities and words capture complementary information that can be effectively combined for downstream use.
AB - Learning representations for knowledge base entities and concepts is becoming increasingly important for NLP applications. However, recent entity embedding methods have relied on structured resources that are expensive to create for new domains and corpora. We present a distantly-supervised method for jointly learning embeddings of entities and text from an unannotated corpus, using only a list of mappings between entities and surface forms. We learn embeddings from open-domain and biomedical corpora, and compare against prior methods that rely on human-annotated text or large knowledge graph structure. Our embeddings capture entity similarity and relatedness better than prior work, both in existing biomedical datasets and a new Wikipedia-based dataset that we release to the community. Results on analogy completion and entity sense disambiguation indicate that entities and words capture complementary information that can be effectively combined for downstream use.
UR - http://www.scopus.com/inward/record.url?scp=85122021773&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85122021773
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 195
EP - 206
BT - ACL 2018 - Representation Learning for NLP, Proceedings of the 3rd Workshop
PB - Association for Computational Linguistics (ACL)
T2 - 3rd Workshop on Representation Learning for NLP, RepL4NLP 2018 at the 56th Annual Meeting of the Association for Computational Linguistics ACL 2018
Y2 - 20 July 2018
ER -