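TY  - JOUR
AU  - Shen, Feichen
AU  - Liu, Sijia
AU  - Wang, Yanshan
AU  - Wen, Andrew
AU  - Wang, Liwei
AU  - Liu, Hongfang
PY  - 2018
DA  - 2018/10/10
TI  - Utilization of Electronic Medical Records and Biomedical Literature to Support the Diagnosis of Rare Diseases Using Data Fusion and Collaborative Filtering Approaches
JO  - JMIR Med Inform
SP  - e11301
VL  - 6
IS  - 4
KW  - electronic medical records
KW  - literature
KW  - text mining
KW  - rare diseases
AB  - Background: In the United States, a rare disease is characterized as one that affects no more than 200,000 patients at a given time. Patients with rare diseases are often misdiagnosed or left undiagnosed, possibly because clinicians have insufficient knowledge of or experience with the rare disease. With the volume of electronic medical data growing exponentially, a large amount of information on thousands of rare diseases and their potentially associated diagnostic information is buried in electronic medical records (EMRs) and the medical literature. Objective: This study aimed to leverage information contained in heterogeneous datasets to assist rare disease diagnosis. Phenotypic information about patients that exists in EMRs and the biomedical literature could be fully leveraged to speed up the diagnosis of diseases. Methods: In our previous work, we advanced the use of a collaborative filtering recommendation system to support rare disease diagnostic decision making based on phenotypes derived solely from EMR data. However, the influence of using heterogeneous data with collaborative filtering was not discussed, although it is an essential question when facing large volumes of data from various resources. In this study, to further investigate the performance of collaborative filtering on heterogeneous datasets, we studied EMR data generated at Mayo Clinic as well as published article abstracts retrieved from the Semantic MEDLINE Database. Specifically, we designed different data fusion strategies for heterogeneous resources and integrated them with the collaborative filtering model. Results: We evaluated performance of the proposed system using characterizations derived from various combinations of EMR data and literature, as well as with sole EMR data. We extracted nearly 13 million EMRs from the patient cohort generated between 2010 and 2015 at Mayo Clinic and retrieved all article abstracts from the semistructured Semantic MEDLINE Database that were published through the end of 2016. We applied a collaborative filtering model and compared the performance generated by different metrics. Log likelihood ratio similarity combined with k-nearest neighbor on heterogeneous datasets showed the optimal performance in patient recommendation with area under the precision-recall curve (PRAUC) 0.475 (string match), 0.511 (systematized nomenclature of medicine [SNOMED] match), and 0.752 (Genetic and Rare Diseases Information Center [GARD] match). Log likelihood ratio similarity also performed the best with mean average precision 0.465 (string match), 0.5 (SNOMED match), and 0.749 (GARD match). Performance of rare disease prediction was also demonstrated by using the optimal algorithm. Macro-average F-measure for string, SNOMED, and GARD match were 0.32, 0.42, and 0.63, respectively. Conclusions: This study demonstrated potential utilization of heterogeneous datasets in a collaborative filtering model to support rare disease diagnosis. In addition to phenotypic-based analysis, in the future, we plan to further resolve the heterogeneity issue and reduce miscommunication between EMR and literature by mining genotypic information to establish a comprehensive disease-phenotype-gene network for rare disease diagnosis.
SN  - 2291-9694
UR  - http://medinform.www.mybigtv.com/2018/4/e11301/
UR  - https://doi.org/10.2196/11301
UR  - http://www.ncbi.nlm.nih.gov/pubmed/30305261
DO  - 10.2196/11301
ID  - info:doi/10.2196/11301
ER  -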