Abstract
At Albert Einstein College of Medicine, a large part of the online lecture materials contains PostScript files. As the collection grows, it becomes essential to create a digital library that gives easy access to relevant sections of the lecture material through a full-text index. Creating this index requires extracting all the text from the document files that constitute the originals of the lectures. In this study we present a semi-automatic indexing method that uses a robust technique for extracting text from PostScript files and the National Library of Medicine's Medical Text Indexer (MTI) program for indexing the text. This model can be applied at other medical schools for indexing purposes.
Original language | English |
---|---|
Pages (from-to) | 1053 |
Number of pages | 1 |
Journal | AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium |
State | Published - 2007 |