BIOTECHNO 2015, The Seventh International Conference on Bioinformatics, Biocomputational Systems and Biotechnologies


Overcoming Ambiguous Gene Name Synonyms in MEDLINE Searches by Context Mining

Authors:
Modest von Korff
Thomas Sander

Keywords: Gene name disambiguation; classification; word-vectors; data mining; algorithm

Abstract:
Classification of ambiguous gene name synonyms is a necessity when mining PubMed Central records with gene-related queries. This work introduces the use of word-vectors for gene name disambiguation. PubMed Central was queried for gene names and their synonyms. The retrieved records were filtered and automatically separated into training and test data. A similarity threshold was derived from the similarity matrix of every training word-vector set. The classification performance of the word-vectors was compared to a gene name similarity classification. Both methods showed good results, but the word-vector classification was superior in terms of precision and recall.
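
The step of deriving a similarity threshold from the similarity matrix of a training word-vector set can be illustrated with a minimal Python sketch. The abstract does not state how the word-vectors are built, which similarity measure is used, or how the threshold is chosen, so everything here is an assumption made for illustration: bag-of-words count vectors, cosine similarity, a threshold equal to the lowest pairwise similarity in the training set, and a best-match acceptance rule. The function names and the toy records for the ambiguous symbol "CAT" (catalase) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def word_vector(text):
    """Bag-of-words count vector for one record (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_threshold(train_vectors):
    """Assumed rule: the lowest off-diagonal entry of the similarity matrix."""
    return min(cosine(a, b) for a, b in combinations(train_vectors, 2))


def is_about_gene(text, train_vectors, threshold):
    """Accept a record if its best match in the training set reaches the threshold."""
    vec = word_vector(text)
    return max(cosine(vec, t) for t in train_vectors) >= threshold


# Hypothetical training records in which "cat" refers to the catalase gene.
train_texts = [
    "cat catalase enzyme activity in liver cells",
    "catalase cat gene expression oxidative stress",
    "oxidative stress reduces cat enzyme activity",
]
train_vectors = [word_vector(t) for t in train_texts]
threshold = similarity_threshold(train_vectors)

# Two hypothetical test records: one on the gene, one on the animal.
gene_hit = is_about_gene(
    "cat enzyme expression under oxidative stress", train_vectors, threshold
)
animal_hit = is_about_gene(
    "the cat sat on the veterinary table", train_vectors, threshold
)
```

In this toy setup the record about enzyme expression shares enough context words with the training set to pass the threshold, while the record about the animal does not. A real pipeline would need a far larger training set and a less brittle threshold rule than the minimum pairwise similarity.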

Pages: 1 to 3

Copyright: Copyright (c) IARIA, 2015

Publication date: May 24, 2015

Published in: conference

ISSN: 2308-4383

ISBN: 978-1-61208-409-1

Location: Rome, Italy

Dates: from May 24, 2015 to May 29, 2015