@Article{info:doi/10.2196/jmir.4305, author="Yin, Zhijun and Fabbri, Daniel and Rosenbloom, S Trent and Malin, Bradley", title="A Scalable Framework to Detect Personal Health Mentions on Twitter", journal="J Med Internet Res", year="2015", month="Jun", day="05", volume="17", number="6", pages="e138", keywords="consumer health; information retrieval; machine learning; social media; Twitter", abstract="Background: Biomedical research has traditionally been conducted via surveys and the analysis of medical records. However, these resources are limited in their content, such that non-traditional domains (eg, online forums and social media) have an opportunity to supplement the view of an individual's health. Objective: The objective of this study was to develop a scalable framework to detect personal health status mentions on Twitter and assess the extent to which such information is disclosed. Methods: We collected more than 250 million tweets via the Twitter streaming API over a 2-month period in 2014. The corpus was filtered down to approximately 250,000 tweets, stratified across 34 high-impact health issues, based on guidance from the Medical Expenditure Panel Survey. We created a labeled corpus of several thousand tweets via a survey, administered over Amazon Mechanical Turk, that documents when terms correspond to mentions of personal health issues or an alternative (eg, a metaphor). We engineered a scalable classifier for personal health mentions via feature selection and assessed its potential over the health issues. We further investigated the utility of the tweets by determining the extent to which Twitter users disclose personal health status. Results: Our investigation yielded several notable findings. First, we find that tweets from a small subset of the health issues can train a scalable classifier to detect health mentions. Specifically, training on 2000 tweets from four health issues (cancer, depression, hypertension, and leukemia) yielded a classifier with precision of 0.77 on all 34 health issues. Second, Twitter users disclosed personal health status for all health issues. Notably, personal health status was disclosed over 50{\%} of the time for 11 out of 34 (33{\%}) investigated health issues. Third, the disclosure rate was dependent on the health issue in a statistically significant manner (P<.001). For instance, more than 80{\%} of the tweets about migraines (83/100) and allergies (85/100) communicated personal health status, while only around 10{\%} of the tweets about obesity (13/100) and heart attack (12/100) did so. 
Fourth, the likelihood that people disclose their own versus other people's health status was dependent on health issue in a statistically significant manner as well (P<.001). For example, 69{\%} (69/100) of the insomnia tweets disclosed the author's status, while only 1{\%} (1/100) disclosed another person's status. By contrast, 1{\%} (1/100) of the Down syndrome tweets disclosed the author's status, while 21{\%} (21/100) disclosed another person's status. Conclusions: It is possible to automatically detect personal health status mentions on Twitter in a scalable manner. These mentions correspond to the health issues of the Twitter users themselves, but also other individuals. Though this study did not investigate the veracity of such statements, we anticipate such information may be useful in supplementing traditional health-related sources for research purposes. ", issn="1438-8871", doi="10.2196/jmir.4305", url="//www.mybigtv.com/2015/6/e138/", url="https://doi.org/10.2196/jmir.4305", url="http://www.ncbi.nlm.nih.gov/pubmed/26048075" }