Collecting and Analyzing Patient Experiences of Health Care From Social Media.

Rastegar-Mojarad M, Ye Z, Wall D, Murali N, Lin S - JMIR Res Protoc (2015)

Bottom Line: We found that patients wrote longer reviews when they rated the facility poorly (1 or 2 stars). We demonstrated that the computed sentiment scores correlated well with consumer-generated ratings. Such information can subsequently inform and provide opportunity to improve the quality of health care.


Affiliation: Marshfield Clinic Research Foundation, Biomedical Informatics Research Center, Marshfield, WI, United States.

ABSTRACT

Background: Social media, such as Yelp, provide rich information about consumer experience. Previous studies suggest that Yelp can serve as a new source for studying patient experience. However, the lack of a corpus of patient reviews is a major bottleneck for applying computational techniques.

Objective: The objective of this study is to create a corpus of patient experience (COPE) and report descriptive statistics to characterize COPE.

Methods: Yelp reviews about health care-related businesses were extracted from the Yelp Academic Dataset. Natural language processing (NLP) tools were used to split reviews into sentences, extract noun phrases and adjectives from each sentence, and generate parse trees and dependency trees for each sentence. Sentiment analysis techniques were used to calculate a sentiment score for each sentence, and Hadoop was used for parallel processing.

Results: COPE contains 79,173 sentences from 6914 patient reviews of 985 health care facilities near 30 universities in the United States. We found that patients wrote longer reviews when they rated the facility poorly (1 or 2 stars). We demonstrated that the computed sentiment scores correlated well with consumer-generated ratings. A consumer vocabulary to describe their health care experience was constructed by a statistical analysis of word counts and co-occurrences in COPE.

Conclusions: A corpus called COPE was built as an initial step to utilize social media to understand patient experiences at health care facilities. The corpus is available to download and COPE can be used in future studies to extract knowledge of patients' experiences from their perspectives. Such information can subsequently inform and provide opportunity to improve the quality of health care.
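
The first Methods steps (splitting a review into sentences, then scoring each sentence) can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the authors' pipeline: the regex splitter stands in for a real NLP sentence splitter, the four-word lexicon reuses the example words the article itself cites, the +1/-1 weights are assumed, and the function names are hypothetical.

```python
import re

# Toy lexicon (assumption): only the example words quoted in the article.
POSITIVE = {"pleasing", "perfect"}
NEGATIVE = {"unhappy", "disappointing"}


def split_sentences(review):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", review.strip())
    return [p for p in parts if p]


def sentence_score(sentence):
    """Count of positive lexicon words minus count of negative lexicon words."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)


def score_review(review):
    """Return (sentence, score) pairs for every sentence in a review."""
    return [(s, sentence_score(s)) for s in split_sentences(review)]


if __name__ == "__main__":
    # Hypothetical review text, invented for illustration.
    review = (
        "The staff was perfect and the visit was pleasing. "
        "I left unhappy about the bill! Parking was fine."
    )
    for sentence, score in score_review(review):
        print(score, sentence)
```

Because the score accumulates over lexicon hits, a sentence with more sentiment words gets a score of larger magnitude, which is the behavior the article describes for its own sentiment score.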




Figure 8: Distribution of the sentiment score per sentence.

Mentions: On a scale of 1-5 (with 5 being the best), 69.68% (4817/6914) of patients rated the facility favorably (≥4 out of 5) (Figure 6). A trend was identified between the length of patient reviews and the perception of a negative experience: lower-rated reviews tended to be longer (correlation=-.5829, P<.001) (Figure 7). Figure 8 illustrates the distribution of the sentiment score per sentence. The computed sentiment score was compared with the consumer-generated rating (P<.001, Pearson correlation test) (Figure 9). The sentiment score reflects the degree of accumulation of sentiment words in a sentence, which can be signified by positive words such as “pleasing” and “perfect,” and negative words such as “unhappy” and “disappointing.” Longer sentences tended to carry stronger sentiment scores (Figure 10).
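
For readers unfamiliar with the Pearson correlation coefficient reported above, the following is a minimal Python sketch of how it is computed. The function name `pearson_r` and the sample ratings and review lengths are hypothetical and invented for illustration; they are not data from COPE, and the sketch does not compute a P value.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical data: star ratings and review lengths in words.
    ratings = [1, 2, 3, 4, 5]
    lengths = [210, 180, 150, 120, 90]
    # Lengths fall linearly as ratings rise, so r is -1 (up to rounding).
    print(pearson_r(ratings, lengths))
```

A negative coefficient, as in the article's review-length result, means that one quantity tends to fall as the other rises.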

