Exploring the application of deep learning techniques on medical text corpora.

With the rapidly growing amount of biomedical literature, it is becoming increasingly difficult to find relevant information quickly and reliably. In this study, we applied the word2vec deep learning toolkit to medical corpora to test its potential for improving the accessibility of medical knowledge. We evaluated how well word2vec identifies properties of pharmaceuticals from mid-sized, unstructured medical text corpora without any additional background knowledge. These properties included relationships to diseases (‘may treat’) and to physiological processes (‘has physiological effect’). We evaluated the relationships identified by word2vec by comparing them with the National Drug File – Reference Terminology (NDF-RT) ontology. The results of this first evaluation were mixed, but they helped us identify further avenues for employing deep learning technologies in medical information retrieval and for using them to complement the curated knowledge captured in ontologies and taxonomies.
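
To make the kind of comparison described above concrete, here is a minimal sketch in Python of checking embedding-based neighbours against a reference set of ‘may treat’ pairs. Everything in it is an assumption for illustration: the three-dimensional vectors are hand-made stand-ins for trained word2vec embeddings, the reference pairs are a toy substitute for NDF-RT, and the helper names (`cosine`, `nearest`, `precision_at_k`) and the precision-at-k metric are hypothetical choices, not necessarily the procedure used in the study.

```python
import math

# Toy, hand-made vectors standing in for trained word2vec embeddings (assumption).
vectors = {
    "aspirin": [0.9, 0.1, 0.2],
    "insulin": [0.1, 0.9, 0.3],
    "headache": [0.85, 0.15, 0.25],
    "fever": [0.7, 0.3, 0.2],
    "diabetes": [0.15, 0.85, 0.35],
}

# Toy reference relations standing in for NDF-RT 'may treat' pairs (assumption).
reference_may_treat = {
    ("aspirin", "headache"),
    ("aspirin", "fever"),
    ("insulin", "diabetes"),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(term, candidates, k):
    """Return the k candidates most similar to `term`, best first."""
    ranked = sorted(
        candidates,
        key=lambda c: cosine(vectors[term], vectors[c]),
        reverse=True,
    )
    return ranked[:k]


def precision_at_k(drugs, diseases, reference, k):
    """Fraction of top-k (drug, disease) suggestions found in the reference set."""
    hits = 0
    total = 0
    for drug in drugs:
        for disease in nearest(drug, diseases, k):
            total += 1
            if (drug, disease) in reference:
                hits += 1
    return hits / total if total else 0.0


drugs = ["aspirin", "insulin"]
diseases = ["headache", "fever", "diabetes"]
score = precision_at_k(drugs, diseases, reference_may_treat, k=1)
print(f"precision@1 against the toy reference: {score:.2f}")
```

In a real setting, the vectors would come from a model trained on the corpus and the reference pairs from the ontology itself. A drug or disease name missing from the model's vocabulary would also need explicit handling, which this sketch omits.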
