Event detection using Twitter: a spatio-temporal approach.

Cheng T, Wicks T - PLoS ONE (2014)

Bottom Line: A spatio-temporally significant cluster is found relating to the London helicopter crash. Although the cluster only remains significant for a relatively short time, it is rich in information, such as important key words and photographs. These findings demonstrate that STSS is an effective approach to analysing Twitter data for event detection.


Affiliation: SpaceTimeLab, Department of Civil, Environmental and Geomatic Engineering, University College London, London, United Kingdom.

ABSTRACT

Background: Every day, around 400 million tweets are sent worldwide, making Twitter a rich source for detecting, monitoring and analysing news stories and special (disaster) events. Existing research within this field follows key words attributed to an event, monitoring temporal changes in word usage. However, this method requires prior knowledge of the event in order to know which words to follow, and does not guarantee that the words chosen will be the most appropriate to monitor.

Methods: This paper suggests an alternative methodology for event detection using space-time scan statistics (STSS). This technique looks for clusters within the dataset across both space and time, regardless of tweet content. It is expected that clusters of tweets will emerge during spatio-temporally relevant events, as people will tweet more than expected in order to describe the event and spread information. The special event used as a case study is the 2013 London helicopter crash.

Results and conclusion: A spatio-temporally significant cluster is found relating to the London helicopter crash. Although the cluster only remains significant for a relatively short time, it is rich in information, such as important key words and photographs. The method also detects other special events such as football matches, as well as train and flight delays from Twitter data. These findings demonstrate that STSS is an effective approach to analysing Twitter data for event detection.


pone-0097807-g002: The temporal distribution of tweets collected between 7th January and 18th January 2013.

Mentions: Furthermore, a variety of STSS models are available for analysis, such as the Bernoulli model, the Poisson model and the permutation model. The appropriate model depends on the context in which the data are used. For this analysis, a space-time permutation model (STPM) was used because of its flexibility compared with the other models. The model requires only that the data contain spatial and temporal attributes, which matches the attributes collected with each tweet. Additionally, the STPM automatically allows for purely spatial and purely temporal variations in a dataset. This is a critical feature for Twitter data analysis given the large temporal and spatial variations in the data, as illustrated by figures 1 and 2. A detailed methodology of the STPM goes beyond the scope of this paper but is provided in [19].
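
To make concrete how a permutation model can work with nothing but a location and a time per tweet, the following is a minimal sketch of the usual space-time permutation formulation, not the authors' implementation. It assumes tweets have already been binned into discrete zones and days; the function names, the `(zone, day)` tuple layout and the toy data are all hypothetical. The expected count for a candidate space-time cylinder is built from the zone totals and the day totals alone, which is why purely spatial and purely temporal variation is absorbed automatically.

```python
import math
from collections import Counter


def expected_count(events, zones, days):
    """Expected events in a cylinder (zones x days) if space and time were independent.

    events: list of (zone, day) pairs, one per tweet (hypothetical layout).
    """
    total = len(events)
    per_zone = Counter(zone for zone, _ in events)
    per_day = Counter(day for _, day in events)
    # Each (zone, day) cell is expected to hold zone_total * day_total / total events.
    return sum(per_zone[z] * per_day[d] for z in zones for d in days) / total


def observed_count(events, zones, days):
    """Number of events that actually fall inside the cylinder."""
    return sum(1 for zone, day in events if zone in zones and day in days)


def log_likelihood_ratio(observed, expected, total):
    """Log of the generalised likelihood ratio; 0.0 when there is no excess."""
    if observed <= expected:
        return 0.0
    value = observed * math.log(observed / expected)
    outside_observed = total - observed
    if outside_observed > 0:
        value += outside_observed * math.log(outside_observed / (total - expected))
    return value


# Toy data (invented for illustration): zone "A" is unusually busy on day 1.
events = [("A", 1)] * 6 + [("A", 2)] * 1 + [("B", 1)] * 1 + [("B", 2)] * 4
mu = expected_count(events, {"A"}, {1})
obs = observed_count(events, {"A"}, {1})
print(obs, round(mu, 3), round(log_likelihood_ratio(obs, mu, len(events)), 3))
```

In a full analysis the cylinder with the largest ratio would be tested for significance by repeatedly shuffling the time stamps across locations and recomputing the maximum; that Monte Carlo step is omitted from this sketch.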

