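@Article{info:doi/10.2196/medinform.7059,
author="Segura Bedmar, Isabel and Mart{\'i}nez, Paloma and Carruana Mart{\'i}n, Adri{\'a}n",
title="Search and Graph Database Technologies for Biomedical Semantic Indexing: Experimental Analysis",
journal="JMIR Med Inform",
year="2017",
month="Dec",
day="01",
volume="5",
number="4",
pages="e48",
keywords="information storage and retrieval; semantic indexing",
abstract="Background: Biomedical semantic indexing is a very useful support tool for human curators in indexing and cataloging the biomedical literature. Objective: The aim of this study was to describe a system that automatically assigns Medical Subject Headings (MeSH) to biomedical articles from MEDLINE. Methods: Our approach relies on the assumption that similar documents should be classified by similar MeSH terms. Although previous work has already exploited document similarity by using a k-nearest neighbors algorithm, we represent documents as document vectors through search engine indexing and then compute the similarity between documents using cosine similarity. Once the documents most similar to a given input document are retrieved, we rank their MeSH terms to choose the most suitable set for the input document. To do this, we define a scoring function that takes into account the frequency of a term in the set of retrieved documents and the similarity between the input document and each retrieved document. In addition, we implement guidelines proposed by human curators to annotate MEDLINE articles; in particular, the heuristic that says if 3 MeSH terms are proposed to classify an article and they share the same ancestor, they should be replaced by this ancestor. The representation of the MeSH thesaurus as a graph database allows us to employ graph search algorithms to quickly and easily capture hierarchical relationships such as the lowest common ancestor between terms. Results: Our experiments show promising results with an F1 of 69{\%} on the test dataset. Conclusions: To the best of our knowledge, this is the first work that combines search and graph database technologies for the task of biomedical semantic indexing. Due to its horizontal scalability, ElasticSearch becomes a real solution to index large collections of documents (such as the bibliographic database MEDLINE). Moreover, the use of graph search algorithms for accessing MeSH information could provide a support tool for cataloging MEDLINE abstracts in real time.",
issn="2291-9694",
doi="10.2196/medinform.7059",
url="http://medinform.www.mybigtv.com/2017/4/e48/",
url="https://doi.org/10.2196/medinform.7059",
url="http://www.ncbi.nlm.nih.gov/pubmed/29196280"
}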