Event detection using Twitter: a spatio-temporal approach.

Cheng T, Wicks T - PLoS ONE (2014)

Bottom Line: A spatio-temporally significant cluster is found relating to the London helicopter crash. Although the cluster only remains significant for a relatively short time, it is rich in information, such as important key words and photographs. These findings demonstrate that STSS is an effective approach to analysing Twitter data for event detection.


Affiliation: SpaceTimeLab, Department of Civil, Environmental and Geomatic Engineering, University College London, London, United Kingdom.

ABSTRACT

Background: Every day, around 400 million tweets are sent worldwide, making Twitter a rich source for detecting, monitoring and analysing news stories and special (disaster) events. Existing research within this field follows key words attributed to an event, monitoring temporal changes in word usage. However, this method requires prior knowledge of the event in order to know which words to follow, and does not guarantee that the words chosen will be the most appropriate to monitor.

Methods: This paper suggests an alternative methodology for event detection using space-time scan statistics (STSS). This technique looks for clusters within the dataset across both space and time, regardless of tweet content. It is expected that clusters of tweets will emerge during spatio-temporally relevant events, as people will tweet more than expected in order to describe the event and spread information. The special event used as a case study is the 2013 London helicopter crash.
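
As an illustration of how a space-time scan can flag a burst of tweets without looking at their content, the following is a minimal sketch, not the authors' implementation. It assumes a Kulldorff-style space-time permutation model (the abstract does not say which STSS variant was used), tweets already binned into integer grid cells and time bins, and hypothetical function names (`log_glr`, `best_cylinder`, `monte_carlo_p`). Each candidate cylinder is scored by comparing its observed count with the count expected if space and time were independent.

```python
import math
import random
from collections import Counter
from itertools import product


def log_glr(c, mu, total):
    """Poisson log generalized likelihood ratio for one cylinder.

    c: observed count inside, mu: expected count inside, total: all events.
    Returns 0 when the cylinder holds no excess of events.
    """
    if mu <= 0 or c <= mu:
        return 0.0
    llr = c * math.log(c / mu)
    if total > c:
        llr += (total - c) * math.log((total - c) / (total - mu))
    return llr


def best_cylinder(events, max_radius=1, max_duration=3):
    """Scan cylinders over (x_cell, y_cell, t_bin) integer triples.

    Returns (score, ((cx, cy), radius, t_start, duration)) for the
    highest-scoring cylinder, or (0.0, None) if no cylinder shows an excess.
    """
    events = list(events)
    total = len(events)
    per_cell = Counter((x, y) for x, y, _ in events)
    per_time = Counter(t for _, _, t in events)
    per_cell_time = Counter(events)
    best = (0.0, None)
    for (cx, cy), t_start, r, d in product(
        per_cell, sorted(per_time), range(max_radius + 1), range(1, max_duration + 1)
    ):
        cells = [(x, y) for (x, y) in per_cell if (x - cx) ** 2 + (y - cy) ** 2 <= r * r]
        window = range(t_start, t_start + d)
        observed = sum(per_cell_time[(x, y, t)] for (x, y) in cells for t in window)
        # Expected count if space and time were independent (permutation model).
        expected = sum(per_cell[c] for c in cells) * sum(per_time[t] for t in window) / total
        score = log_glr(observed, expected, total)
        if score > best[0]:
            best = (score, ((cx, cy), r, t_start, d))
    return best


def monte_carlo_p(events, replicates=19, seed=0, **scan_kwargs):
    """Permutation p-value: shuffle time labels and rescan each replicate."""
    events = list(events)
    rng = random.Random(seed)
    observed_score, _ = best_cylinder(events, **scan_kwargs)
    times = [t for _, _, t in events]
    at_least_as_extreme = 0
    for _ in range(replicates):
        rng.shuffle(times)
        permuted = [(x, y, t) for (x, y, _), t in zip(events, times)]
        if best_cylinder(permuted, **scan_kwargs)[0] >= observed_score:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (replicates + 1)


# Synthetic data (made up for illustration): one tweet per cell per hour on a
# 4x4 grid over 6 hours, plus a burst of 20 tweets in cell (1, 1) at hour 3.
background = [(x, y, t) for x in range(4) for y in range(4) for t in range(6)]
burst = [(1, 1, 3)] * 20
score, cluster = best_cylinder(background + burst)
p_value = monte_carlo_p(background + burst)
```

The scan here is exhaustive over a tiny grid, which is only practical for toy data. A real analysis would use continuous coordinates, larger search windows and many more Monte Carlo replicates.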

Results and conclusion: A spatio-temporally significant cluster is found relating to the London helicopter crash. Although the cluster only remains significant for a relatively short time, it is rich in information, such as important key words and photographs. The method also detects other special events such as football matches, as well as train and flight delays from Twitter data. These findings demonstrate that STSS is an effective approach to analysing Twitter data for event detection.


pone-0097807-g003: Significant hourly clusters within London between 16th January and 17th January 2013.

Mentions: Figure 3 maps the cluster outputs generated in an oblique view when using hourly aggregations within a space-time cube. Figure 4 is a top view of Figure 3, which shows the precise location and the size of these clusters in a 2-dimensional space. As can be seen, clusters are spatio-temporally dispersed and display varying spatial and temporal ranges. However, these maps contain little useful information when trying to identify clusters related to the case study disaster event. To identify such clusters, the tweets in each cluster are classified into four descriptive topics using LDA. If at least half of the topics discovered are attributable to a space-time event, then it is assumed the tweet cluster relates to the real world event. If fewer than half of the topics can be attributed to an event, it is assumed the cluster does not represent a space-time event and is a spurious result. Table 3 provides the LDA topics generated for clusters which can be attributed to space-time events for daily aggregations, while Table 4 provides the LDA topics for hourly aggregations. Terms deemed pertinent to the event are highlighted in orange while non-pertinent terms are highlighted in blue.
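
The "at least half of the topics" rule described above can be sketched as a small function. This is an illustrative sketch only: the text does not say how a topic is judged attributable to an event, so the term-matching test, the `min_hits` parameter, the function name and the example words below are all assumptions made here.

```python
def is_event_cluster(topics, event_terms, min_hits=1):
    """Apply the majority rule to a cluster's LDA topics.

    topics: list of topics, each a list of top terms (four topics in the text).
    event_terms: set of terms treated as pertinent to a candidate event.
    min_hits: assumed threshold of pertinent terms for a topic to count as
              attributable to the event (not specified in the source).
    Returns True when at least half of the topics are attributable.
    """
    if not topics:
        return False
    attributable = [
        sum(term in event_terms for term in topic) >= min_hits for topic in topics
    ]
    return 2 * sum(attributable) >= len(topics)


# Hypothetical topics for one cluster; the words are invented for illustration.
example_topics = [
    ["helicopter", "crash", "crane"],
    ["vauxhall", "fire", "smoke"],
    ["coffee", "morning", "work"],
    ["train", "late", "again"],
]
pertinent = {"helicopter", "crash", "crane", "fire", "smoke"}
related = is_event_cluster(example_topics, pertinent)
```

With two of the four example topics containing pertinent terms, the cluster meets the half-or-more threshold. With only one it would be treated as spurious.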

